COMBINATORIAL OPTIMIZATION
MATHEMATICAL PROGRAMMING STUDIES

Editor-in-Chief
M.L. BALINSKI, Yale University, New Haven, CT, U.S.A.

Senior Editors
E.M.L. BEALE, Scicon Computer Services Ltd., Milton Keynes, Great Britain
GEORGE B. DANTZIG, Stanford University, Stanford, CA, U.S.A.
L. KANTOROVICH, National Academy of Sciences, Moscow, U.S.S.R.
TJALLING C. KOOPMANS, Yale University, New Haven, CT, U.S.A.
A.W. TUCKER, Princeton University, Princeton, NJ, U.S.A.
PHILIP WOLFE, IBM Research, Yorktown Heights, NY, U.S.A.

Associate Editors
R. BARTELS, The University of Waterloo, Waterloo, Ont., Canada
VÁCLAV CHVÁTAL, McGill University, Montreal, Quebec, Canada
RICHARD W. COTTLE, Stanford University, Stanford, CA, U.S.A.
J.E. DENNIS, Jr., Cornell University, Ithaca, NY, U.S.A.
B. CURTIS EAVES, Stanford University, Stanford, CA, U.S.A.
R. FLETCHER, The University, Dundee, Scotland
B. KORTE, Universität Bonn, Bonn, West Germany
MASAO IRI, University of Tokyo, Tokyo, Japan
C. LEMARECHAL, IRIA-Laboria, Le Chesnay, Yvelines, France
C.E. LEMKE, Rensselaer Polytechnic Institute, Troy, NY, U.S.A.
GEORGE L. NEMHAUSER, Cornell University, Ithaca, NY, U.S.A.
MANFRED W. PADBERG, New York University, New York, U.S.A.
M.J.D. POWELL, University of Cambridge, Cambridge, England
JEREMY F. SHAPIRO, Massachusetts Institute of Technology, Cambridge, MA, U.S.A.
L.S. SHAPLEY, The RAND Corporation, Santa Monica, CA, U.S.A.
K. SPIELBERG, IBM Scientific Computing, White Plains, NY, U.S.A.
HOANG TUY, Institute of Mathematics, Hanoi, Socialist Republic of Vietnam
D.W. WALKUP, Washington University, Saint Louis, MO, U.S.A.
ROGER WETS, University of Kentucky, Lexington, KY, U.S.A.
C. WITZGALL, National Bureau of Standards, Washington, DC, U.S.A.
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM · NEW YORK · OXFORD
MATHEMATICAL PROGRAMMING STUDY 12
Combinatorial Optimization

Edited by M.W. PADBERG

E. Balas
C. Berge
R.E. Burkard
P.M. Camerini
G. Gallo
M. Grötschel
M. Guignard
P.L. Hammer
D. Hausmann
A. Ho
S. Hong
H.-C. Huang
T.A. Jenkyns
B. Korte
F. Maffioli
M.W. Padberg
W. Pulleyblank
B. Simeone
Th.H.C. Smith
J. Tind
L.E. Trotter, Jr.
U. Zimmermann
1980
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM · NEW YORK · OXFORD
© THE MATHEMATICAL PROGRAMMING SOCIETY - 1980. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.
This book is also available in journal format on subscription.
ISBN for this series: 0 7204 8300 X; for this volume: 0 444 85489 4
Published by: NORTH-HOLLAND PUBLISHING COMPANY AMSTERDAM · NEW YORK · OXFORD
Sole distributors for the U.S.A. and Canada: Elsevier North-Holland, Inc. 52 Vanderbilt Avenue New York, N.Y. 10017
Library of Congress Cataloging in Publication Data
Main entry under title: Combinatorial optimization. (Mathematical programming study; 12) Bibliography: p. 1. Mathematical optimization--Addresses, essays, lectures. 2. Combinatorial analysis--Addresses, essays, lectures. I. Padberg, M.W. II. Series. QA402.5.C544 519.5 80-16548 ISBN 0-444-85489-4 (Elsevier North-Holland)
PRINTED IN THE NETHERLANDS
PREFACE

Mathematical programming is concerned with the optimization of functions in many variables subject to side constraints. Combinatorial optimization addresses the same problem with the additional complication brought about by the incorporation of indivisible activities, i.e. by variables that are required to assume integer values only. Since its inception shortly after World War II, mathematical programming ideas have been successfully utilized to formulate and to solve many complex decision problems, with applications covering the entire range from the quantitative social sciences to engineering science. Examples of some of the important decision problems are distribution network planning models, the modelling of budgetary processes, manpower planning models and many complex scheduling tasks faced today by business and government agencies. Notable success in problem-solving ability has been recorded to date in linear computation. The increased ability to numerically solve mathematical programming problems has brought about a greater demand to incorporate indivisible activities into the planning models.

In this volume a number of papers dealing with computational and theoretical aspects of combinatorial optimization are brought together. Emphasis is put on papers dealing with special structures such as the set covering problem, the travelling salesman problem, the knapsack problem, etc. In the computational studies the properties of these special structures are exploited and are seen to advance the state of the art considerably. Theoretical issues in this STUDY concern the generalization of such standard problems as the transportation problem and the knapsack problem to the case of nonlinear objective functions, the worst-case analysis of heuristics, as well as integrality questions in combinatorial optimization. Also two papers dealing with blocking and anti-blocking polyhedra are included.
A selected bibliography on combinatorial optimization with over 200 items is included in this STUDY. In order to get a representative bibliography, all authors of the papers in this STUDY were asked to contribute the papers and books they most frequently refer to in their research work. The request was met most enthusiastically by almost all contributors to this STUDY, and the bibliography that resulted should serve well as a guide to combinatorial optimization for researchers interested in this subject.

I wish to express my sincere gratitude to the contributors of this volume as well as to the referees who helped in getting this work done. The names of the referees will appear in due course in the pages of MATHEMATICAL PROGRAMMING.

Manfred W. Padberg
CONTENTS

Preface v
Contents vi
(1) Weakly admissible transformations for solving algebraic assignment and transportation problems, R.E. Burkard and U. Zimmermann 1
(2) Cutting planes from conditional bounds: A new approach to set covering, E. Balas 19
(3) Set covering algorithms using cutting planes, heuristics, and subgradient optimization: A computational study, E. Balas and A. Ho 37
(4) On the symmetric travelling salesman problem: Solution of a 120-city problem, M. Grötschel 61
(5) On the symmetric travelling salesman problem: A computational study, M.W. Padberg and S. Hong 78
(6) A LIFO implicit enumeration algorithm for the asymmetric travelling salesman problem using a one-arborescence relaxation, Th.H.C. Smith 108
(7) Polynomial bounding for NP-hard problems, P.M. Camerini and F. Maffioli 115
(8) Worst case analysis of greedy type algorithms for independence systems, D. Hausmann, T.A. Jenkyns and B. Korte 120
(9) Quadratic knapsack problems, G. Gallo, P.L. Hammer and B. Simeone 132
(10) Fractional vertices, cuts and facets of the simple plant location problem, M. Guignard 150
(11) Balanced matrices and property (G), C. Berge 163
(12) Dual integrality in b-matching problems, W.R. Pulleyblank 176
(13) A technique for determining blocking and antiblocking polyhedral descriptions, H.-C. Huang and L.E. Trotter, Jr. 197
(14) Certain kinds of polar sets and their relation to mathematical programming, J. Tind 206
Bibliography 214
Mathematical Programming Study 12 (1980) ms. 1-18. North-Holland Publishing Company
WEAKLY ADMISSIBLE TRANSFORMATIONS FOR SOLVING ALGEBRAIC ASSIGNMENT AND TRANSPORTATION PROBLEMS

R.E. BURKARD and U. ZIMMERMANN
Universität Köln, Köln, Federal Republic of Germany

Received 2 August 1977
Revised manuscript received 27 March 1978
Weakly admissible transformations are introduced for solving algebraic assignment and transportation problems, which cover such important classes as problems with sum objectives, bottleneck objectives, lexicographical objectives and others. A transformation of the cost matrix is called weakly admissible if there are two constants α and β in the underlying semigroup such that for all feasible solutions the composition of α and the objective value with respect to the original cost coefficients is equal to the composition of β and the objective value with respect to the transformed cost coefficients. The elements α and β can be determined by shortest path algorithms. An optimal solution for the algebraic assignment problem can be found after at most n weakly admissible transformations; therefore the proposed method yields an O(n³) algorithm for algebraic assignment problems.
Key words: Algebraic Assignment Problem, Algebraic Transportation Problem, Shortest Paths, Weakly Admissible Transformations, Semigroups, Polynomial Algorithms.
1. Introduction

Algebraic assignment and transportation problems were introduced to treat assignment and transportation problems with different kinds of objectives in a unified way. The algebraic transportation problem (ATP) can be stated in the following way. Let M = {1, 2, …, m} and N = {m + 1, m + 2, …, m + n} and let c_i be given elements of a naturally ordered (cf. (2.1), (2.2)) subsemigroup R of ℝ₊ for i ∈ M ∪ N. Without loss of generality we can assume

    ∑_{i∈M} c_i = ∑_{j∈N} c_j.

The set of feasible solutions P_T for a transportation problem is now defined by

    P_T := { x ≥ 0 : ∑_{j∈N} x_ij = c_i (i ∈ M), ∑_{i∈M} x_ij = c_j (j ∈ N) }.  (1.1)
In the case m = n and c_i = c_j = 1 for all i, j we get an assignment problem. The cost coefficients of algebraic assignment and transportation problems are elements of a naturally ordered semigroup H with internal composition * and order relation ≤. Since the real variables x_ij have to be composed with a cost coefficient a_ij, we define an external composition □ : R × H → H which fulfills distributive laws (cf. Section 2). If we denote the cost matrix of the considered problems by a = (a_ij), we therefore get the following expression for the objective function of an ATP:

    [x □ a] := (x_{1,m+1} □ a_{1,m+1}) * (x_{1,m+2} □ a_{1,m+2}) * ⋯ * (x_{m,m+n} □ a_{m,m+n}).  (1.2)
Therefore an algebraic transportation problem (ATP) has the form

    min_{x ∈ P_T} [x □ a].  (1.3)
It is possible to define algebraic assignment problems (AAP) without real variables. In this case we consider the set S_N of all permutations φ of the set N and solve

    min_{φ ∈ S_N} (a_{1,φ(1)} * a_{2,φ(2)} * ⋯ * a_{n,φ(n)}).  (1.4)
By specialising the algebraic compositions and the order relation we get well-known cases. For example, if H is the set of nonnegative reals with the natural order relation, * is the addition and □ is the multiplication, then (1.3) is the classical transportation problem with sum objective. The system (H, max, □) with H := ℝ ∪ {−∞} and

    x □ a := a    for x > 0,
             −∞   for x = 0

leads to time transportation resp. bottleneck assignment problems. Other examples are lexicographical problems, where the semigroup is a set of vectors with componentwise internal composition, and time-cost problems which play an important rôle in practice. In this case we have H = ℝ × ℝ₊ with

    (a₁, a₂) * (b₁, b₂) := (a₁, a₂)       if a₁ > b₁,
                           (b₁, b₂)       if a₁ < b₁,
                           (a₁, a₂ + b₂)  if a₁ = b₁

and

    x □ (a₁, a₂) := (−∞, 0)     if x = 0,
                    (a₁, x·a₂)  if x > 0.

An O(n⁴) solution procedure for algebraic assignment problems was given in [2]. This concept is extended in [1] to solve ATP. In both cases so-called admissible transformations play the central rôle. A transformation of the cost coefficients a to coefficients ā is called admissible, if there is a constant α ∈ H such that for all feasible solutions x the following equation holds

    [x □ a] = α * [x □ ā].
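The specializations above are easy to make concrete. As a small illustration (Python; the code and all data are ours, not the paper's), here are the sum and bottleneck objectives evaluated for one assignment:

```python
# Illustrative sketch (not from the paper): two specializations of the
# algebraic objective for an assignment given by a permutation phi.

def sum_objective(a, phi):
    # H = nonnegative reals, * = addition, x □ a = x·a (classical case)
    return sum(a[i][phi[i]] for i in range(len(phi)))

def bottleneck_objective(a, phi):
    # H = R ∪ {-∞}, * = max, x □ a = a for x > 0 (bottleneck case)
    return max(a[i][phi[i]] for i in range(len(phi)))

a = [[4, 2, 5],
     [3, 1, 6],
     [7, 2, 3]]
phi = (1, 0, 2)                      # assign row i to column phi[i]
print(sum_objective(a, phi))         # 2 + 3 + 3 = 8
print(bottleneck_objective(a, phi))  # max(2, 3, 3) = 3
```

Both are instances of the single formula [x □ a]; only the pair (*, □) changes.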
In this paper the concept of admissible transformations is weakened to develop a more efficient solution procedure for AAP and ATP. A transformation T : a_ij ↦ ā_ij is called weakly admissible, if there are two constants α and β in H such that for all feasible solutions the following equation holds

    α * [x □ a] = β * [x □ ā].

By applying weakly admissible transformations successively we get optimal solutions of our problems. The constants α and β are determined by a shortest path algorithm. This leads always to a situation where [x □ a] is uniquely determined. (Note that in general the equation α * ξ = β in a semigroup may have no solution or several ones.) By applying weakly admissible transformations to AAP it is possible to solve them in O(n³) steps. In the case of (linear) assignment problems with sum objective this method is a modification of an algorithm of Tomizawa [10] (cf. also Dorhout [7]). Recently weakly admissible transformations were applied to bottleneck problems in [4, 5]. The computational results reported in [4, 5] are very encouraging and show the efficiency of this newly proposed method.

In Section 2 the underlying algebraic systems are described. Thereupon we introduce weakly admissible transformations and describe the combinatorial background of the method in Section 4. Then the transformation algorithm is described and modifications for applying it to assignment problems resp. problems with bottleneck objectives are mentioned.
2. Algebraic properties

In Section 1 the naturally ordered commutative semigroups (H, *, ≤) and (R, +, ≤) were introduced. Their neutral elements are e resp. 0. If a_i ∈ H for i = 1, 2, …, k we denote a_1 * a_2 * ⋯ * a_k by *_{i=1}^{k} a_i. We call an ordered commutative semigroup (H, *, ≤) naturally ordered if

    a ≤ a * b  (2.1)

holds for all a, b ∈ H (positively ordered) and if for all a, b ∈ H with a < b there exists an element c ∈ H with

    a * c = b.  (2.2)

(R, +, ≤) is assumed to be a naturally ordered subsemigroup of (ℝ₊, +, ≤). Furthermore a weak cancellation rule is assumed in H, that is

    a * b = a * c ⟹ b = c ∨ a * b = a  (2.3)

holds for all a, b, c ∈ H. Then H can be partitioned in a family (H_i, i ∈ I) of naturally ordered commutative semigroups (H_i, *_i, ≤_i). In each semigroup H_i the cancellation law

    a * b = a * c ⟹ b = c  (2.4)
holds for all a, b, c ∈ H_i. For an element a ∈ H we denote the unique index of the semigroup which contains this element by i(a), that is a ∈ H_{i(a)}. Internal compositions and order relations fulfill

    a * b = a             if i(a) > i(b),
            b             if i(a) < i(b),
            a *_{i(a)} b  if i(a) = i(b)  (2.5)

and

    a < b ⟺ i(a) < i(b) ∨ (i(a) = i(b) ∧ a <_{i(a)} b)  (2.6)

for all a, b ∈ H. H is called an ordinal sum of the family (H_i, i ∈ I). Furtheron every semigroup H_i of the family is "irreducible" in the sense that it cannot be partitioned in two or more semigroups with properties (2.2), (2.5) and (2.6). These results are essentially known from the theory of ordered semigroups and can be found in Fuchs [8] or, with regard to combinatorial optimization problems, in [11]. Additionally we may assume w.l.o.g. that each H_i contains a neutral element e_i. If in the above family H_i has no such element, then we adjoin it. In this case H_i := H_i ∪ {e_i} is no longer "irreducible" but fulfills (2.2), (2.4), (2.5) and (2.6) again. In this way trivial decompositions as for example ℝ₊ = {0} ∪ (ℝ₊ \ {0}) can be avoided.

The theory in [8] yields a characterization of the semigroup H. In [11] this has been stated for combinatorial optimization problems in the following way. Due to the cancellation law and the natural order each H_i is the "positive cone" of an ordered commutative group G_i (a ∈ H_i iff a ∈ G_i and e_i ≤ a). Now let Λ be a well-ordered index set and consider the product G := Π_{λ∈Λ} ℝ together with componentwise addition + of real numbers and the lexicographical ordering ≤. Then (G, +, ≤) is called a lexicographical product of real groups. Obviously (G, +, ≤) is an ordered commutative group. Now by Hahn's theorem (cf. e.g. Fuchs [8], Chapter IV, Section 5) each G_i can be embedded in a lexicographical product of real groups. Therefore H can be described as an ordinal sum of "positive cones" of groups which can be embedded in lexicographical products of real groups. Unfortunately this characterization is not very transparent and therefore we prefer an axiomatic treatment of such semigroups H.

In the following we use the same sign "≤" for the order relations in H, I and R, but from the context it will always be obvious which relation is used specifically. At first we prove some properties which will be used in later sections.
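Before doing so, the ordinal-sum rules (2.5) and (2.6) can be made concrete in a few lines. In this illustration (Python; ours, not the authors') an element of H is a pair (index, value) and each H_i is taken to be (ℝ, +) for simplicity:

```python
# Sketch (not from the paper): ordinal-sum composition (2.5) and order (2.6)
# for elements represented as (index, value) pairs.

def star(a, b):
    ia, va = a
    ib, vb = b
    if ia > ib:
        return a                  # (2.5): the larger index absorbs
    if ia < ib:
        return b
    return (ia, va + vb)          # same index: compose inside H_i

def less(a, b):
    # (2.6): compare indices first, then inside the common H_i
    return a[0] < b[0] or (a[0] == b[0] and a[1] < b[1])

print(star((0, 2.0), (1, 5.0)))   # (1, 5.0): different components
print(star((1, 2.0), (1, 3.0)))   # (1, 5.0): composed inside H_1
print(less((0, 9.0), (1, 0.0)))   # True: the index dominates the value
```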
Proposition 2.1. Let a, b, c ∈ H with i(a) ≤ min(i(b), i(c)). Then the following implication holds:

    a * b = a * c ⟹ b = c.

Proof. Let a * b = a * c. Then i(a) ≤ i(b) and i(a) ≤ i(c) together with (2.5) yields i(b) = i(c). In the case i(a) < i(b) we find the result by (2.5). If i(a) = i(b) holds, the result follows from (2.4).

Proposition 2.2. Let a, b ∈ H with a < b. Then the equation

    a * c = b

has a unique solution c ∈ H. Furthermore c ∈ H_{i(b)} holds.

Proof. From (2.2) we know the existence of a solution c ∈ H. (2.5) implies i(a * c) = max(i(a), i(c)) = i(b); therefore the assumption i(c) < i(b) yields the contradiction a = a * c = b. Thus any solution has the index i(c) = i(b). Let c' ∈ H be another solution of the equation. In the case i(a) < i(b) we find c = a * c = b = a * c' = c'. Otherwise (2.4) implies c = c'.
Proposition 2.3. Let i ∈ I and a, b ∈ H_i. If H_i is irreducible, then

    a * b = a ⟹ H_i = {a}.

Proof. The assumption c ∈ H_i with c < b yields a ≤ a * c ≤ a * b = a and by (2.4) we find the contradiction b = c. Therefore b ≤ c holds for all c ∈ H_i. Furthermore we find

    b * c = c  for all c ∈ H_i.

At first we consider the case a < c. Then there exists c' ∈ H_i with a * c' = c and therefore b * c = b * a * c' = a * c' = c. If c < a, then there exists c' ∈ H_i with c * c' = a, and b * c * c' = c * c' together with (2.4) yields b * c = c.

Therefore H_i can be partitioned into two semigroups {b} and (H_i \ {b}) which fulfill (2.2), (2.5) and (2.6). We only prove (2.2) for H_i \ {b}. Let c, d ∈ H_i \ {b} with c > d and take a solution c' ∈ H_i with d * c' = c. Then d * b = d < c = d * c' together with (2.4) yields b < c' and therefore c' ∈ H_i \ {b}. The decomposition of H_i into such semigroups contradicts the irreducibility of H_i. Therefore H_i \ {b} is empty and H_i = {b} = {a}.

The two semigroups (R, +, ≤) and (H, *, ≤) are connected by an external composition □ : R × H → H.
We assume the distributivity laws

    (x + y) □ a = (x □ a) * (y □ a),  (2.7.1)
    x □ (a * b) = (x □ a) * (x □ b)  (2.7.2)

and

    0 □ a = e,  (2.8)
    i(x □ a) ≤ i(a)  (2.9)

for all x, y ∈ R and a, b ∈ H. In particular (2.9) is fulfilled if the external composition can be partitioned in external compositions □_i : R × H_i → H_i ∪ {e} for i ∈ I. In general the external composition may not have such a partitioning. An interesting implication of the axiomatic system is the monotonicity of the external composition which follows essentially from (2.2). We find

    x ≤ y ⟹ x □ a ≤ y □ a,  (2.10.1)
    a ≤ b ⟹ x □ a ≤ x □ b  (2.10.2)

for all x, y ∈ R and a, b ∈ H (cf. [3]). Therefore the following proposition holds.

Proposition 2.4. Let x ∈ R \ {0} and a, b ∈ H. Then i(x □ a) ≤ i(b) implies i(y □ a) ≤ i(b) for all y ∈ R.

Proof. If y ≤ x we find y □ a ≤ x □ a by (2.10.1). Then (2.6) yields the result. Otherwise choose a sufficiently large n ∈ ℕ with n · x > y. Then

    a' := (n · x) □ a = (x □ a) * (x □ a) * ⋯ * (x □ a) ≤ b * b * ⋯ * b =: b'

where both products have n elements. We get i(y □ a) ≤ i(a') ≤ i(b') = i(b).

Further properties of the algebraic structure and a discussion of the various examples including those of the introduction can be found in [3].
3. Weakly admissible transformations

The algorithm for solving the transportation problem (1.3) is based on transforming the cost coefficients a_ij (i ∈ M, j ∈ N).

Definition 3.1. A transformation T : a_ij ↦ ā_ij is called weakly admissible with value (α, β) if for all x ∈ P_T

    α * [x □ a] = β * [x □ ā]

holds.
Obviously the composition T ∘ T' of two weakly admissible transformations is a weakly admissible transformation.

Definition 3.2. A transformation T : a_ij ↦ ā_ij is called a (u, v)-transformation if

    u_i * a_ij = ā_ij * v_j

holds for all i ∈ M, j ∈ N. The finite composition of (u, v)-transformations is again a (u, v)-transformation.
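In the classical sum case (H the nonnegative reals, * = +, x □ a = x·a) Definition 3.2 reduces to the familiar reduced-cost update ā_ij = a_ij + u_i − v_j, and the value (α, β) given by Proposition 3.3 below becomes (∑ u_i, ∑ v_j). A small numerical check (Python; data and code are ours, not the paper's):

```python
# Sketch (not from the paper): a (u, v)-transformation in the classical
# sum case, u_i + a_ij = â_ij + v_j, i.e. â_ij = a_ij + u_i - v_j.
from itertools import permutations

a = [[4, 2, 5],
     [3, 1, 6],
     [7, 2, 3]]
u = [1, 0, 2]
v = [2, 1, 0]

a_hat = [[a[i][j] + u[i] - v[j] for j in range(3)] for i in range(3)]
alpha, beta = sum(u), sum(v)      # Proposition 3.3 with c_i = c_j = 1

# Weak admissibility: alpha + [x □ a] = beta + [x □ â] for every
# feasible solution, i.e. every permutation phi.
for phi in permutations(range(3)):
    z     = sum(a[i][phi[i]]     for i in range(3))
    z_hat = sum(a_hat[i][phi[i]] for i in range(3))
    assert alpha + z == beta + z_hat
print("weak admissibility verified for all permutations")
```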
Proposition 3.3. Every (u, v)-transformation is weakly admissible and has the value (α, β) with

    α := *_{i∈M} (c_i □ u_i),   β := *_{j∈N} (c_j □ v_j),

where c_i, i ∈ M ∪ N are the right-hand sides in (1.1).

Proof. Let x ∈ P_T. Then

    α * [x □ a] = *_{i∈M} ((c_i □ u_i) * *_{j∈N} (x_ij □ a_ij)).

With c_i = ∑_{j∈N} x_ij, i ∈ M, we find the right-hand side equal to

    *_{i∈M} *_{j∈N} (x_ij □ (u_i * a_ij)).

Due to c_j = ∑_{i∈M} x_ij, j ∈ N, and Definition 3.2 this is equal to

    *_{j∈N} ((c_j □ v_j) * *_{i∈M} (x_ij □ ā_ij)).

Therefore α * [x □ a] = β * [x □ ā].

A weakly admissible transformation with value (α, β) is called suitable if

    i(α) ≤ i(β) ≤ i(z*)  (3.1)

holds for the optimal value z* of the transportation problem (1.3). Again the composition of two suitable transformations is a suitable transformation.

Theorem 3.4. Let T : a_ij ↦ ā_ij be a suitable transformation with value (α, β). If x* ∈ P_T fulfills β * [x* □ ā] = β, then x* is an optimal solution of the algebraic transportation problem (1.3). Furthermore z* is the unique solution of the equation α * z* = β.
Proof. Let x ∈ P_T. Then

    α * [x* □ a] = β * [x* □ ā] = β ≤ β * [x □ ā] = α * [x □ a].

As z* ≤ min([x* □ a], [x □ a]) holds we find

    i(α) ≤ i(z*) ≤ min(i([x* □ a]), i([x □ a])).

Together with Proposition 2.1 this yields [x* □ a] ≤ [x □ a]. From α * z* = β follows α ≤ β. In the case α < β the uniqueness of the solution of the equation follows from Proposition 2.2. Otherwise β * z* = β implies i(z*) ≤ i(β) and therefore together with (3.1) i(z*) = i(β). Then Proposition 2.3 yields z* = β.

This theorem enables us to solve the transportation problem with the aid of suitable transformations. If we transform the cost coefficients a_ij ↦ ā_ij until it is possible to find an element x ∈ P_T with

    β * (x_ij □ ā_ij) = β  (3.2)

for all i ∈ M, j ∈ N, then x is a solution of the algebraic transportation problem (1.3). (3.2) is obviously fulfilled for those i ∈ M, j ∈ N with x_ij = 0.
4. Flows and shortest paths

The algebraic transportation problem (1.3) can be viewed as a special minimum cost network flow problem. Let G = (V, E) be the corresponding directed graph with node set V = {s, t} ∪ M ∪ N where s is the source and t the sink. The arcs, capacities and cost coefficients are given in Table 1 with i ∈ M and j ∈ N. A maximal flow f in G corresponds to a unique transportation vector x ∈ P_T by

    x_ij = f(i, j)  for all i ∈ M, j ∈ N  (4.1)

and vice versa. Flow values on other arcs of G are uniquely determined by (4.1) and the conservation law for flows.

Table 1

Arcs (·,·)   Capacities c(·,·)   Cost coefficients a(·,·)
(s, i)       c_i                 e
(j, t)       c_j                 e
(i, j)       ∞                   a_ij
According to Table 1 a transportation vector has the same objective value as its corresponding flow f:

    [x □ a] = [f □ a] := *_{(i,j)∈E} (f(i, j) □ a(i, j)).  (4.2)
In the following we make use of both representations of the algebraic transportation problem. In solving algebraic transportation problems we consider subproblems P(c̃_k; a) of the form

    min { [x □ a] : x ∈ P(c̃_k) }  (4.3)

where P(c̃_k) is the set of feasible solutions of a transportation problem with right-hand sides

    c̃_i := c_i  for i < k,
           c̃_k  for i = k with 0 ≤ c̃_k ≤ c_k,
           0    for k < i ≤ m.  (4.4)
Let f be the flow corresponding to a solution x ∈ P(c̃_k) with 0 ≤ c̃_k < c_k. We increase c̃_k to c_k and consider the corresponding incremental graph G(f) = (V, E(f)) with respect to the flow f. Thereby we define E(f) := EF(f) ∪ EB(f) with

    EF(f) := {(s, k)} ∪ (M × N) ∪ {(j, t) | f(j, t) < c(j, t), j ∈ N},
    EB(f) := {(j, i) | f(i, j) > 0, i ∈ M, j ∈ N}.  (4.5)

A sequence (l₁, l₂, …, l_r) of pairwise distinct nodes of V is called a path in G(f) if (l_i, l_{i+1}) ∈ E(f) for all 1 ≤ i ≤ r − 1. The arc set EW := {(l_i, l_{i+1}) | i = 1, …, r − 1} of a path can be partitioned according to (4.5) in EWF and EWB:

    EWF := EW ∩ EF(f),   EWB := EW ∩ EB(f).  (4.6)

An augmenting path is a path with l₁ = s and l_r = t. A sequence (l₁, l₂, …, l_r) is called a cycle in G(f) if (l₁, l₂, …, l_{r−1}) is a path, l₁ = l_r and (l_{r−1}, l₁) ∈ E(f). The capacity of an augmenting path w = (s, k, …, j, t) is given by

    δ(w) := min{δB, c(s, k) − f(s, k), c(j, t) − f(j, t)}  (4.7)

with

    δB := min{f(i, j) | (j, i) ∈ EWB}.  (4.8)

The capacity of a cycle w is given by

    δ(w) = δB.  (4.9)

Obviously δ(w) > 0 holds.
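The path capacity of (4.7)-(4.8) is a plain minimum over the path's arcs; a minimal sketch (Python; the container layout and all data are our assumptions, not the paper's):

```python
# Sketch (not the authors' code): capacity of an augmenting path
# w = (s, k, ..., j, t) from (4.7)-(4.8), given the current flow,
# the arc capacities, and the path's backward arcs EWB.

def path_capacity(flow, cap, ewb, s, k, j, t):
    # (4.8): minimum flow over the backward arcs (infinite if none)
    delta_b = min((flow[(i2, j2)] for (j2, i2) in ewb), default=float('inf'))
    # (4.7): also bounded by the residual capacities next to source and sink
    return min(delta_b,
               cap[(s, k)] - flow[(s, k)],
               cap[(j, t)] - flow[(j, t)])

flow = {('s', 1): 2, (1, 3): 2, (3, 't'): 2}
cap  = {('s', 1): 5, (3, 't'): 3}
# Path (s, 1, 3, t) with no backward arcs:
print(path_capacity(flow, cap, [], 's', 1, 3, 't'))   # min(inf, 3, 1) = 1
```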
Now let w be an augmenting path or a cycle and δ an element of R with 0 ≤ δ ≤ δ(w). We define the flow f ⊕ w(δ) := f' by

    f'(i, j) := f(i, j) + δ  if (i, j) ∈ EWF,
                f(i, j) − δ  if (j, i) ∈ EWB,
                f(i, j)      else.  (4.10)

If w is a cycle the flow value does not change, yet if w is an augmenting path the flow value increases by δ. Let c̃_k < c'_k ≤ c_k and let f resp. f' be flows for P(c̃_k) resp. P(c'_k). Then f' can be represented by

    f' = f ⊕ (∑_{p=1}^{r} w_p(δ_p) + ∑_{p=r+1}^{s} w_p(δ_p)).  (4.11)

The first sum covers only augmenting paths and the second only cycles in G(f). Especially we get for every arc (i, j) ∈ E

    f'(i, j) = f(i, j) + ∑_{p∈R₁} δ_p  if f(i, j) < f'(i, j),
               f(i, j) − ∑_{p∈R₂} δ_p  if f(i, j) > f'(i, j),
               f(i, j)                 if f(i, j) = f'(i, j)  (4.12)

with R₁ = {p | (i, j) ∈ EWF_p} and R₂ = {p | (j, i) ∈ EWB_p} (cf. [3]).

We define the length L(w) of a path or cycle w in G(f) by

    L(w) := *_{(i,j)∈EWF} a(i, j).  (4.13)

Now the following property holds.

Proposition 4.1. Let f be a flow for P(c̃_k) with [f □ a] = e. If f' is a flow for P(c'_k) with c̃_k < c'_k ≤ c_k, then according to (4.11)

    [f' □ a] = *_{p=1}^{s} (δ_p □ L(w_p))

holds.

Proof. Immediate consequence of (4.11), (4.12) and the distributivity laws (2.7).

Let D denote the set of augmenting paths in G(f). w* ∈ D is called a shortest path if

    L(w*) = min_{w∈D} L(w).

The determination of shortest paths in G(f) plays now an important rôle. We
define in G(f) the following cost coefficients

    a_f(i, j) := a(i, j)  if (i, j) ∈ EF(f),
                 e        if (i, j) ∈ EB(f).  (4.14)

In G(f) there exist paths from the source s to all nodes in V' = {1, 2, …, k} ∪ N ∪ {t}. Since all cost coefficients a_f(i, j) in G(f) are elements of a positively ordered semigroup, a modification of the Dijkstra algorithm [6] can be used to determine shortest paths in G(f). In the following algorithm shortest paths from the node k to all nodes in V' are computed. During the algorithm a label [L_i, p(i)] is attached to each node i ∈ V', where L_i is the length of the current shortest path from s to node i and p(i) is the predecessor of i on this path.

Algorithm 4.2 (Shortest path algorithm).
Step 1: Label node k with [e, s] and each node j ∈ N with [a_f(k, j), k]; M' := {1, 2, …, k − 1} ∪ {t}; N' := N.
Step 2: Determine j₀ ∈ N' with L_{j₀} = min{L_j | j ∈ N'} and set N' := N' \ {j₀}.
Step 3: Let S(j₀) := {i ∈ M' | (j₀, i) ∈ E(f)}. If t ∈ S(j₀) go to Step 5. If S(j₀) = ∅ go to Step 2. Else label all nodes i ∈ S(j₀) with [L_{j₀}, j₀] and set M' := M' \ S(j₀).
Step 4: Check for all nodes j ∈ N' and i ∈ S(j₀) successively the inequality L_i * a_f(i, j) < L_j. If this inequality holds, replace the label of node j by [L_i * a_f(i, j), i]. Go to Step 2.
Step 5: Label the sink t with [L_{j₀}, j₀]. Stop.

At the termination of the algorithm the sink t is labeled and the length of the shortest path w* from s to t is L(w*) = L_t; w* can be found by backtracing. Furthermore, for all labeled nodes i ∈ {j ∈ V' | L_j ≤ L(w*)}, L_i is the length of the shortest path from s to i. Shortest paths to all other nodes have at least the length L(w*). The algorithm terminates in O(|V|²) steps. Its validity can be proved in the same way as in the case (ℝ₊, +, ≤).
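The label-setting idea behind Algorithm 4.2 is a Dijkstra scan with the semigroup composition in place of +. The following condensed sketch (Python; not the authors' implementation, and run over an ordinary digraph rather than the incremental graph G(f)) shows why one routine covers both the sum and the bottleneck case:

```python
# Sketch (not the authors' code): Dijkstra-style label setting with an
# arbitrary positively ordered composition `op` and neutral element `e`.
import heapq

def semigroup_dijkstra(graph, source, op, e):
    # graph: {node: [(neighbour, arc cost), ...]}
    dist = {source: e}
    heap = [(e, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue                  # stale label
        for nxt, cost in graph.get(node, []):
            cand = op(d, cost)        # L_i * a_f(i, j)
            if nxt not in dist or cand < dist[nxt]:
                dist[nxt] = cand      # replace the label, as in Step 4
                heapq.heappush(heap, (cand, nxt))
    return dist

g = {'s': [('a', 4), ('b', 1)], 'b': [('a', 2), ('t', 5)], 'a': [('t', 1)]}
print(semigroup_dijkstra(g, 's', lambda x, y: x + y, 0))  # {'s': 0, 'a': 3, 'b': 1, 't': 4}
print(semigroup_dijkstra(g, 's', max, 0))                 # {'s': 0, 'a': 2, 'b': 1, 't': 2}
```

The first call is the classical shortest path computation; the second computes bottleneck path lengths with the same control flow.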
5. An augmenting path method

In the proposed method, alternately problems P(c̃_k; a) are solved and suitable transformations are applied. The subproblems P(c̃_k; a) are solved by means of the shortest path method (Algorithm 4.2), which yields the values u_i, v_j, i ∈ M, j ∈ N for the subsequent suitable (u, v)-transformation. We start with the flow f* = 0 which is optimal for P(c̃₁; a) with c̃₁ = 0. Let f* be optimal for P(c̃_k; a). Then the following theorem shows how f* can be augmented to an optimal solution for P(c̃_k + δ; a).
Theorem 5.1. Let f* be optimal for P(c̃_k; a) with c̃_k < c_k and [f* □ a] = e. Let w* be a shortest path in G(f*) and δ ∈ R with 0 ≤ δ ≤ δ(w*). Then f* ⊕ w*(δ) is optimal for P(c̃_k + δ; a).

Proof. Let f be optimal for P(c̃_k + δ; a). f has a representation (4.11) in G(f*). Then the flow f̄ obtained from this representation by omitting the cycles is also optimal for P(c̃_k + δ; a) because of Proposition 4.1. Furtheron δ = ∑_{p=1}^{r} δ_p holds. Since L(w*) ≤ L(w_p) for p = 1, 2, …, r we get

    δ □ L(w*) ≤ *_{p=1}^{r} (δ_p □ L(w_p)).

Thus f* ⊕ w*(δ) is optimal by Proposition 4.1.

We denote the length of a shortest path in G(f*) from s to node i by L_i with i ∈ {1, 2, …, k} ∪ N. Now we define two vectors u, v by

    u_i := L_i     if L_i ≤ L(w*) and 1 ≤ i ≤ k,
           L(w*)   else,
    v_j := L_j     if L_j ≤ L(w*),
           L(w*)   else  (5.1)

for i ∈ M, j ∈ N. The values L(w*) and L_i with L_i ≤ L(w*) are determined in Algorithm 4.2. At first we show that a (u, v)-transformation can be defined by (5.1) and Definition 3.2. In particular we choose in the transformation

    ā_ij := e     if u_i * a_ij = v_j,
            a_ij  if u_i = v_j.  (5.2)

Proposition 5.2. Let u, v be defined according to (5.1). Then u_i * a_ij ≥ v_j for all i ∈ M, j ∈ N.
Proof. If u_i = L(w*), then Proposition 5.2 follows from L(w*) ≥ v_j for all j ∈ N. Otherwise we get 1 ≤ i ≤ k and thus (i, j) ∈ E(f*). Therefore

    L_i * a_f(i, j) ≥ L_j

holds and we find

    u_i * a_ij = L_i * a_ij ≥ L_j ≥ v_j

for all j ∈ N.

At the beginning the flow f* = 0 has the property [f* □ a] = e. By transforming the cost coefficients with u, v according to (5.1) we preserve this property.
Proposition 5.3. Let f* be a flow for P(c̃_k; a) with c̃_k < c_k and [f* □ a] = e. Let w* be a shortest path in G(f*) and u, v be defined according to (5.1). The transformed cost coefficients ā_ij, i ∈ M, j ∈ N are defined by Definition 3.2 according to (5.2). Then

    [(f* ⊕ w*(δ)) □ ā] = e

holds for all δ ∈ R with 0 ≤ δ ≤ δ(w*).

Proof. It is sufficient to show

    f(i, j) > 0 ⟹ ā_ij = e

for 1 ≤ i ≤ k, j ∈ N with f := f* ⊕ w*(δ). Let (i, j) be an arc with f(i, j) > 0. If f*(i, j) > 0, then (j, i) ∈ E(f*) and a_ij = e. Since (i, j) ∈ E(f*), the lengths of the shortest paths to nodes i and j are the same. By (5.1) we get u_i = v_j and therefore ā_ij = e according to (5.2). Otherwise (i, j) is an arc of the shortest path w* and therefore

    u_i * a_ij = v_j

holds. By (5.2) we find ā_ij = e.

The value (α, β) of the (u, v)-transformation according to (5.2) is given by Proposition 3.3. The next theorem enables us to show i(α) ≤ i(β) ≤ i(z*).
Theorem 5.4. (Monotonicity). Let T :aij ~ ~ be a suitable transformation with value (a, [3). L e t f ' be optimal for P(c'e; ~) and f" be optimal f o r P(c[,; ~) with k' < k" or (k' = k" and C'k <--C'kr). Then the following inequalities hold: (a) If' [--]a] -< [f" 1-1 a], (b) i([f' [] ti]) -< i(z*).
Proof. (a) Let f denote the partial flow of f″ which is feasible for P(c̃′_{k′}; ã). Then
we find

[f′ □ ã] ≤ [f □ ã] ≤ [f″ □ ã].

(b) Let f be optimal for P(c_m; ã) and f̂ be optimal for P(c_m; a). Then we find

β * [f □ ã] ≤ β * [f̂ □ ã] = α * z*.

This implies max(i(β), i([f □ ã])) ≤ max(i(α), i(z*)), and together with i(α) ≤ i(β) ≤ i(z*) we find i([f □ ã]) ≤ i(z*). Now (a) yields in particular i([f′ □ ã]) ≤ i([f □ ã]).
Corollary 5.5. Let T : a_ij ↦ ã_ij be the (u, v)-transformation according to (5.1) with value (α, β). Then T is suitable.
Proof. Let L(w*) denote the length of the shortest path w* used in (5.1). Proposition 4.1 and Theorem 5.4 yield i(δ(w*) □ L(w*)) ≤ i(z*). With the aid of Propositions 2.4 and 3.3 and

L(w*) ≥ max(u_i, v_j)

for all i ∈ M, j ∈ N, we find max(i(α), i(β)) ≤ i(z*). With respect to w* the sink t has a unique predecessor j ∈ N. Then v_j = L(w*) holds. This implies i(α) ≤ i(β).

Now we will solve the algebraic transportation problem by successive suitable transformations of the current data ã_ij. Consider the suitable (u, v)-transformation T : a_ij ↦ ã_ij with value (α, β) according to (5.2). We define another (ū, v̄)-transformation T̄ : a_ij ↦ ā_ij by

ū_i = u_i * e_t for all i ∈ M,  v̄_j = v_j * e_t for all j ∈ N   (5.3)

with t = i(L(w*)) and with the neutral element e_t of the semigroup H_t. Then ā_ij is defined as the unique solution of

ū_i * a_ij = ā_ij * v̄_j

for all i ∈ M, j ∈ N, and the special choice of
ā_ij := e if ū_i * a_ij = v̄_j,  ā_ij := a_ij if ū_i = v̄_j.
Then T̄ is a suitable (ū, v̄)-transformation and has the same value as T. Furthermore a_ij = e ⇒ ā_ij = e, and therefore Proposition 5.3 holds with T̄, too. Due to (5.3) and (5.1) we find

i(ū_i) = i(v̄_j) = t   (5.4)

for all i ∈ M, j ∈ N. Therefore the total (α, β)-transformation of the initial data a_ij to ã_ij always has the property that all u_i, v_j are elements of the same semigroup H_t. The index t is the maximal i(L(w*)) of all paths w* considered up to now. The following algorithm yields an optimal solution of the algebraic transportation problem.
Algorithm 5.6 (Algorithm for algebraic transportation problems).
Step 1: Start with k = 1; c̃_1 = 0; f* := 0; α := β := e.
Step 2: Determine a shortest augmenting path w* in G(f*) with capacity δ := δ(w*).
Step 3: f* := f* ⊕ w*(δ); c̃_k := c̃_k + δ.
Step 4: Perform a (ū, v̄)-transformation according to (5.3) with value (ᾱ, β̄). Redefine the current cost coefficients and set α := α * ᾱ, β := β * β̄.
Step 5: If c̃_k < c_k, go to Step 2. If k = m, go to Step 6. Otherwise set k := k + 1; c̃_k := 0; go to Step 2.
Step 6: Solve α * z* = β; stop.
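In the classical instance (H, *, ≤) = (R, +, ≤), Algorithm 5.6 reduces to the familiar successive-shortest-augmenting-path method for the min-cost transportation problem, with the cost transformation playing the role of the usual node potentials. A minimal sketch of that specialization only (plain Bellman-Ford in place of Step 2's transformed shortest path; the graph encoding and function name are ours, not the paper's):

```python
from collections import defaultdict

def transport(supply, demand, cost):
    """Min-cost transportation via successive shortest augmenting paths,
    i.e. Algorithm 5.6 specialized to (H, *, <=) = (R, +, <=).

    supply[i], demand[j]: nonnegative integers with equal totals;
    cost[i][j]: unit shipping cost.  Returns the optimal total cost.
    """
    m, n = len(supply), len(demand)
    S, T = 0, m + n + 1            # source, sink; rows 1..m, columns m+1..m+n
    cap = defaultdict(int)         # residual capacities
    cst = defaultdict(int)         # arc costs (reverse arcs carry negated cost)

    def add(u, v, c, w):
        cap[u, v] += c
        cst[u, v] = w
        cst[v, u] = -w

    for i in range(m):
        add(S, 1 + i, supply[i], 0)
    for j in range(n):
        add(m + 1 + j, T, demand[j], 0)
    for i in range(m):
        for j in range(n):
            add(1 + i, m + 1 + j, min(supply[i], demand[j]), cost[i][j])

    nodes = range(m + n + 2)
    total = 0
    while True:
        # Step 2: shortest augmenting path w* (Bellman-Ford on the residual graph).
        dist = {v: float('inf') for v in nodes}
        pred = {}
        dist[S] = 0
        for _ in nodes:
            for (u, v), c in cap.items():
                if c > 0 and dist[u] + cst[u, v] < dist[v]:
                    dist[v] = dist[u] + cst[u, v]
                    pred[v] = u
        if dist[T] == float('inf'):
            return total           # no augmenting path left: the flow is optimal
        # Capacity delta(w*) of the path, then Step 3: augment along it.
        delta, v = float('inf'), T
        while v != S:
            delta = min(delta, cap[pred[v], v])
            v = pred[v]
        v = T
        while v != S:
            u = pred[v]
            cap[u, v] -= delta
            cap[v, u] += delta
            total += delta * cst[u, v]
            v = u
```

The while loop mirrors Steps 2–5: each pass finds one shortest augmenting path, augments by its capacity δ, and repeats until no augmenting path remains.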
Theorem 5.7. Algorithm 5.6 terminates after a finite number of steps with an optimal solution of the algebraic transportation problem.

Proof. (1) At the termination of the algorithm f* is optimal for P(c_m; ã) with [f* □ ã] = e. From Theorem 3.4 the optimality of f* for P(c_m; a) follows immediately.

(2) To show the finiteness it is sufficient to prove that the number of (ū, v̄)-transformations (Step 4) is finite. At first we show that only a finite number of successively determined shortest augmenting paths with length e can occur. In this case the corresponding (ū, v̄)-transformations do not change the cost coefficients. Therefore the problem reduces to the determination of a maximal flow in the partial network defined by the arcs with current cost coefficients ã_ij = e. This problem is solved in a finite number of steps if the shortest paths are chosen subject to a finiteness rule of Ponstein [9].
Secondly, consider the case in which at least one cost coefficient is changed by the (ū, v̄)-transformation T̄ : ã_ij ↦ ā_ij. Then L(w*) > e, and at least one cost coefficient ã_{i0 j0} is changed for (i0, j0) ∈ E(w*), i.e.

ã_{i0 j0} ≠ ā_{i0 j0}.

We show now that the shortest path w* has not been used in any of the previous augmentation steps with the same partition of its arc set E(w*) into the forward arcs E(w*)_F and the backward arcs E(w*)_B. Let T̂ : a_ij ↦ â_ij be the (û, v̂)-transformation of the initial cost coefficients to â_ij at such a previous stage. Then

â_{i0 j0} ≠ ā_{i0 j0}   (5.5)

holds. If w* were also the shortest path used for augmentation, with the same partition of its arc set, just before the cost coefficients were changed to â_ij, then we could find a contradiction to (5.5). If (s, k) is the first arc of w*, then obviously its coefficient is the same under both transformations. With the aid of the transformation equations (Definition 3.2) and â(i, j) = ā(i, j) = e for all arcs of w*, we find

v̂_k * b_B = b_F * û_k,

with

b_B := Π* { a(i, j) : (j, i) ∈ E(w*)_B },  b_F := Π* { a(i, j) : (i, j) ∈ E(w*)_F },

the products being formed with the operation *.
Furthermore, due to the remark after (5.4),

i(û_k) ≥ i(t̂) ≥ i(a(i, j)) for all (j, i) ∈ E(w*)_B

holds, and therefore

i(v̂_k) ≥ i(t̂) ≥ i(b_B).

Then Proposition 2.1 implies â_{i0 j0} = ā_{i0 j0}.
This is a contradiction to (5.5). Since there is only a finite number of augmenting paths and furthermore only a finite number of partitions of the arc set of an augmenting path, only a finite number of (u, v)-transformations can be performed which change at least one cost coefficient.
6. Assignment problems and bottleneck objectives

In the case of time transportation problems

min_{x ∈ P_T} max_{x_ij > 0} a_ij   (6.1)

the explicit transformation of the cost coefficients a_ij can be avoided. We consider the (ū, v̄)-transformation according to

ū_i = L(w*), i ∈ M,  v̄_j = L(w*), j ∈ N,   (6.2)

defined by

ā_ij := a_ij if a_ij > L(w*),  ā_ij := L(w*) else.   (6.3)

Here w* denotes the current shortest augmenting path. The value of the transformation is (L(w*), L(w*)). By means of Proposition 2.3 we find for the system (H, *, ≤) := (R, max, ≤)

i(a) ≤ i(b) ⇔ a ≤ b.   (6.4)

The transformation (6.3) is suitable because of (6.4), Propositions 2.4 and 4.1, and Theorem 5.4. For the value (ᾱ, β̄) of the total transformation a_ij ↦ ā_ij we find

ᾱ = β̄ = L(w*).
Since [(f* ⊕ w*(δ)) □ ā] = L(w*) holds for δ = δ(w*), the flows determined in the algorithm are not only optimal for P(c̃_k; ā) but also for P(c̃_k; a). At the termination of the algorithm the last L(w*) found is the optimal objective value of the considered time transportation problem. Instead of transforming the cost coefficients explicitly by (6.3), the test

a_ij > L(w*)

can be used in the algorithm. In this way a computer code is developed in [5]. In the case of assignment problems the polynomial bound O(n³) can be derived for the algorithm: every augmenting path increases the flow value by one, so after n steps an optimal solution is found; in every step the shortest path algorithm with complexity O(n²) is used and a suitable transformation is carried out, which consists essentially in solving n² equations of the form in Definition 3.2. An algorithm for the bottleneck assignment problem has been developed in [4]. Extensive numerical investigations and comparisons showed the superiority of the new method.
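Avoiding the explicit transformation, the test a_ij > L(w*) amounts to asking whether a feasible assignment exists using only arcs whose cost does not exceed a threshold. A minimal sketch of a bottleneck assignment solver in that spirit (binary search over the distinct cost values plus a textbook augmenting-path matching; this is not the code of [4] or [5], and all names are ours):

```python
def bottleneck_assignment(a):
    """Smallest threshold t such that a perfect assignment exists using
    only entries a[i][j] <= t, i.e. the bottleneck objective min max a_ij."""
    n = len(a)

    def feasible(t):
        # Augmenting-path bipartite matching restricted to arcs passing
        # the test a_ij <= t (arcs with a_ij > t are simply ignored).
        match = [-1] * n              # match[j] = row assigned to column j

        def augment(i, seen):
            for j in range(n):
                if a[i][j] <= t and not seen[j]:
                    seen[j] = True
                    if match[j] == -1 or augment(match[j], seen):
                        match[j] = i
                        return True
            return False

        return all(augment(i, [False] * n) for i in range(n))

    values = sorted({x for row in a for x in row})
    lo, hi = 0, len(values) - 1
    while lo < hi:                    # binary search over the distinct costs
        mid = (lo + hi) // 2
        if feasible(values[mid]):
            hi = mid
        else:
            lo = mid + 1
    return values[lo]
```

For the 2×2 matrix [[3, 7], [2, 9]] the two assignments have bottleneck values 9 and 7, and the function returns 7.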
References

[1] R.E. Burkard, "A general Hungarian method for the algebraic transportation problem", Discrete Mathematics 22 (1978) 219-232.
[2] R.E. Burkard, W. Hahn and U. Zimmermann, "An algebraic approach to assignment problems", Mathematical Programming 12 (1977) 318-327.
[3] R.E. Burkard, H. Hamacher and U. Zimmermann, "The algebraic network flow problem", Report 1976-7, Mathematisches Institut der Universität zu Köln (Köln, 1976).
[4] U. Derigs and U. Zimmermann, "An augmenting path method for solving linear bottleneck assignment problems", Computing 19 (1978) 285-295.
[5] U. Derigs and U. Zimmermann, "An augmenting path method for solving linear bottleneck transportation problems", Computing 22 (1979) 1-15.
[6] E.W. Dijkstra, "A note on two problems in connexion with graphs", Numerische Mathematik 1 (1959) 269-271.
[7] B. Dorhout, "Het lineaire toewijzingsprobleem: vergelijken van algoritmen", Report BN 21/73, Stichting Mathematisch Centrum (Amsterdam, 1973).
[8] L. Fuchs, Partially ordered algebraic systems (Addison-Wesley, Reading, MA, 1963).
[9] J. Ponstein, "On the maximal flow problem with real arc capacities", Mathematical Programming 3 (1972) 254-256.
[10] N. Tomizawa, "On some techniques useful for solution of transportation network problems", Networks 1 (1972) 179-194.
[11] U. Zimmermann, "Boole'sche Optimierungsprobleme mit separabler Zielfunktion und matroidalen Restriktionen", Thesis, Universität zu Köln (Köln, 1976).
Mathematical Programming Study 12 (1980) 19-36. North-Holland Publishing Company

CUTTING PLANES FROM CONDITIONAL BOUNDS: A NEW APPROACH TO SET COVERING*

Egon BALAS

Carnegie-Mellon University, Pittsburgh, PA, U.S.A.

Received 8 April 1977
Revised manuscript received 30 July 1979
A conditional lower bound on the minimand of an integer program is a number which would be a valid lower bound if the constraint set were amended by certain inequalities, also called conditional. If such a conditional lower bound exceeds some known upper bound, then every solution better than the one corresponding to the upper bound violates at least one of the conditional inequalities. This yields a valid disjunction, which can be used to partition the feasible set, or to derive a family of valid cutting planes. In the case of a set covering problem, these cutting planes are themselves of the set covering type. The family of valid inequalities derived from conditional bounds subsumes as a special case the Bellmore-Ratliff inequalities generated via involutory bases, but is richer than the latter class and contains considerably stronger members, where strength is measured by the number of positive coefficients. We discuss the properties of the family of cuts from conditional bounds, and give a procedure for generating strong members of the family. Finally, we outline a class of algorithms based on these cuts. Our approach was implemented and extensively tested in a computational study whose results are reported in a companion paper [2]. The algorithm that emerged from the testing seems capable of solving considerably larger set covering problems than earlier methods.
Key words: Set Covering, Cutting Planes, Facets, Integer Programming, Conditional Bounds.
1. Introduction

We consider the set covering problem

min{cx | Ax ≥ e, x_j = 0 or 1, j ∈ N}   (SC)

where A = (a_ij) is m×n, e ∈ R^m, e = (1, ..., 1), c ∈ R^n, and a_ij ∈ {0, 1}, i ∈ M = {1, ..., m}, j ∈ N = {1, ..., n}. We will denote by a^i and a_j the ith row and jth column of A, respectively. Without loss of generality, we assume that c_j > 0, ∀ j ∈ N. Using established terminology, we call a vector x satisfying the constraints of (SC) a cover, and the set of indices j such that x_j = 1 the support of the cover. A cover is called prime if no proper subset of its support defines a cover.
* Research supported by the National Science Foundation under grant MCS 76-12026 A02 and the Office of Naval Research under contract N00014-75-C-0621 NR 047-048.
E. Balas / Set covering problems
This problem, and its equality-constrained counterpart, the set partitioning problem, are useful mathematical models for a great variety of scheduling and other important real world problems, like crew scheduling, truck delivery, tanker routing, information retrieval, fault detection, stock cutting, offshore drilling platform location, etc., and a literature of considerable size exists on solution methods for these models (see [9] for a survey of set covering and set partitioning; [7] for a computational study and comparison of several solution techniques; and [4] for a more recent survey of set partitioning, which also contains a bibliography of applications of both models). In this paper we propose a new approach to set covering, based on the idea of conditional bounds. In Section 2 we introduce this concept for arbitrary mixed integer programs, and show how it can be used to derive valid disjunctions. The latter in turn can be used either to partition the feasible set in the framework of a branch-and-bound approach, or to derive a family of valid cutting planes. In case of a set covering problem, the cutting planes derived from conditional bounds are themselves of the set covering type. These cuts are discussed in Section 3, where the Bellmore-Ratliff inequalities generated from involutory bases are shown to be a special case of the larger family of inequalities defined in this paper. In Section 4 we examine some basic properties of our cutting planes. The family of cuts from conditional bounds is rather large, and in Section 5 we discuss a procedure for generating "strong" members of the family. Section 6 outlines a class of algorithms based on the cutting planes introduced in this paper, and using heuristics as well as subgradient optimization rather than the simplex method. Several versions of this approach were implemented and tested computationally in a joint study of Andrew Ho and the author, that is summarized in a companion paper [2]. 
The algorithm that emerged from this testing seems capable of solving larger problems in less time and more reliably than earlier methods. The approach discussed here was first circulated as [1].
2. Disjunctions from conditional bounds
The central idea of our approach is to derive valid inequalities for the set covering problem from conditional bounds. Since this concept is valid and useful for arbitrary mixed integer programs, we will introduce it in this more general context. In solving pure or mixed integer programs by branch-and-bound, if the feasible set is tightly constrained, it is sometimes possible to derive disjunctions stronger than the usual dichotomy on a single variable. On the other hand, the feasible set of any integer program becomes more or less tightly constrained after the discovery of a "good" solution (in particular, of an optimal solution), provided that one restricts it to those solutions better than the current best. Such
a "tightly constrained" state of the feasible set can often be expressed in the form of an inequality πx ≤ π_0, with π ≥ 0 and π_0 > 0, as will be discussed later on. The smaller π_0 relative to the other coefficients π_j, the tighter the inequality. Whenever such an inequality is at hand, the following result can be used to generate a valid disjunction. Here we denote disjunction by the symbol ∨, and the meaning of

⋁_{i=1}^k A_i = A_1 ∨ A_2 ∨ ... ∨ A_k
is that at least one of the conditions A_1, ..., A_k must hold. R^k_+ denotes the nonnegative orthant of R^k.

Theorem 1. Let π ∈ R^n_+, π_0 ∈ R_+, N = {1, ..., n}, and Q_i ⊆ N, i = 1, ..., p, 1 ≤ p ≤ n. There exists v ∈ R^p_+ such that

Σ_{i : j ∈ Q_i} v_i ≤ π_j,  j ∈ N,   (1)

and

Σ_{i=1}^p v_i > π_0,   (2)

if and only if every integer x ∈ R^n_+ that satisfies πx ≤ π_0 also satisfies the disjunction

⋁_{i=1}^p (x_j = 0, j ∈ Q_i).   (3)
Proof. Let G = (g_ij) be the p×n matrix defined by

g_ij = 1, j ∈ Q_i;  g_ij = 0, j ∈ N \ Q_i,  i = 1, ..., p,

and let e = (1, ..., 1) have p components. From Farkas' Theorem of the Alternative (nonhomogeneous version, see Duffin [8]), one and only one of the following two systems has a solution (here T denotes transpose):

I: πx ≤ π_0, Gx ≥ e, x ≥ 0;    II: ev > π_0, G^T v ≤ π, v ≥ 0.
System II is the same as (1), (2) and v ∈ R^p_+. Thus there exists v ∈ R^p_+ satisfying (1) and (2) if and only if system I has no solution, i.e., if and only if every x ∈ R^n_+ such that πx ≤ π_0 violates at least one inequality of Gx ≥ e. But an integer x ∈ R^n_+ violates the ith inequality of Gx ≥ e, i.e., the inequality

Σ_{j ∈ Q_i} x_j ≥ 1,
if and only if it satisfies x_j = 0, j ∈ Q_i; hence it violates at least one inequality of Gx ≥ e if and only if it satisfies the disjunction (3).
Example 1. The inequality

9x_1 + 8x_2 + 8x_3 + 7x_4 + 7x_5 + 6x_6 + 6x_7 + 5x_8 + 5x_9 + 5x_10 + 4x_11 + 4x_12 + 3x_13 + 3x_14 + 3x_15 + 2x_16 + 2x_17 ≤ 10,

together with the condition x ≥ 0, x_j integer, ∀j, implies the disjunction

(x_j = 0, j = 1, 2, 3, 4, 5, 6, 7) ∨ (x_j = 0, j = 1, 8, 9, 10, 11, 12, 13, 14) ∨ (x_j = 0, j = 2, 3, 8, 9, 10, 15, 16, 17).

Indeed, setting v_1 = 6, v_2 = 3 and v_3 = 2, we have 6 + 3 + 2 > 10, i.e., (2) holds; and defining the sets Q_i, i = 1, 2, 3, to be those used in the above disjunction, condition (1) is satisfied. This can easily be seen from Table 1, whose rows are the incidence vectors of the sets Q_i, while the numbers on top are the π_j and those to the right are the v_i. The columns of the table correspond to the inequalities (1), which for the vector v = (6, 3, 2) are 6 + 3 ≤ 9, 6 + 2 ≤ 8, ..., 2 ≤ 2, all satisfied.
Table 1

π_j:  9 8 8 7 7 6 6 5 5 5 4 4 3 3 3 2 2   v_i
Q_1:  1 1 1 1 1 1 1 . . . . . . . . . .    6
Q_2:  1 . . . . . . 1 1 1 1 1 1 1 . . .    3
Q_3:  . 1 1 . . . . 1 1 1 . . . . 1 1 1    2
Remark 1.1. Theorem 1 remains true if πx ≤ π_0 is replaced by πx < π_0 and (2) is replaced by

Σ_{i=1}^p v_i ≥ π_0.   (2')
Proof. If the indicated changes are made in systems I and II, the Theorem of the Alternative still holds.

One way of obtaining a "tight" inequality πx ≤ π_0 (or πx < π_0), in order to derive from it a conveniently strong disjunction, is as follows. Consider the mixed integer program

min{cx | Ax ≥ b, x ≥ 0, x_j integer, j ∈ N_1 ⊆ N},   (P)

let z_U be a known upper bound on the value of (P), and let the vectors u and s satisfy

u ≥ 0,  s = c − uA ≥ 0.   (5)
Then, multiplying Ax ≥ b by −u and adding the resulting inequality −uAx ≤ −ub to cx < z_U, we obtain sx < z_U − ub, an inequality of the desired form πx < π_0.
Corollary 1.2. Let z_U be an upper bound on the value of (P), and let u, s satisfy (5). If there exists S ⊆ N_1, S = {j(1), ..., j(p)}, 1 ≤ p ≤ |N_1|, such that

Σ_{j ∈ S} s_j ≥ z_U − ub,   (6)

then for any collection of sets Q_i ⊆ N_1, i = 1, ..., p, such that

Σ_{i : j ∈ Q_i} s_{j(i)} ≤ s_j,  j ∈ N_1,   (7)

every feasible solution x to (P) for which cx < z_U satisfies the disjunction

⋁_{i=1}^p (x_j = 0, j ∈ Q_i).   (3)
Note that if p = 1, i.e., (3) has a single term, then (3) reduces to the condition x_j = 0, j ∈ Q_1. Somewhat more generally, we have

Remark 1.3. Let z_U, u and s be as in Corollary 1.2, and define Q_0 = {j ∈ N_1 | s_j ≥ z_U − ub}.
Then every feasible solution x to (P) such that cx < z_U satisfies x_j = 0, j ∈ Q_0.

Corollary 1.2 has an interpretation (and alternative proof) in terms of conditional bounds, which yields some insight and is appealing to intuition. Consider the pair of dual linear programs

min{cx | Ax ≥ b, x ≥ 0}   (L)

and

max{ub | uA ≤ c, u ≥ 0}   (D)

associated with (P). Clearly, for any u feasible to (D), ub is a lower bound on the value of (L), hence of (P). Now suppose the constraint set of (P) (and (L)) is amended by the system Gx ≥ e defined by (4). Then (L) and (D) become

min{cx | Ax ≥ b, Gx ≥ e, x ≥ 0}   (L_G)

and

max{ub + ve | uA + vG ≤ c, u ≥ 0, v ≥ 0}   (D_G)
respectively, and ub + ve is a lower bound on the value of (L_G), hence of (P_G), the problem obtained from (P) by adding to its constraints Gx ≥ e. Now if a vector v can be found that together with u satisfies the constraints of (D_G) and ub + ve ≥ z_U, then, since cx ≥ ub + ve, every feasible solution to (L_G), hence to (P_G), satisfies cx ≥ z_U. It follows that every feasible solution x to (P) such that cx < z_U must violate the constraint set Gx ≥ e, hence (as x_j is integer-constrained for j ∈ N_1) must satisfy the disjunction (3). If we set v_i = s_{j(i)}, i = 1, ..., p, with s defined as in (5), then the above conditions on v are a paraphrase of (6), (7), and we obtain Corollary 1.2. The inequalities Gx ≥ e are not part of the problem (P), and the sole purpose of introducing them is to conclude that if they were to hold, that would imply a lower bound at least equal to the upper bound z_U; hence any solution x better than the one that produced z_U must violate at least one of them. We therefore call these inequalities, as well as the lower bound obtained from them, conditional.

In a broader context, the idea of deriving a valid ("unconditional") constraint from one or several conditional constraints may have many other applications. One of them appears in a recent paper by Kovács and Dienes [10], where a properly chosen inequality is used to derive a bound from the fact that either the inequality or its complement must be satisfied by any feasible solution.

From Corollary 1.2, a valid disjunction (3) can be derived for the problem (P) if an upper bound z_U is known, a feasible solution u to the dual linear program (D) is at hand, and a subset S of N_1 can be found for which (6) holds. This latter condition is usually easy to satisfy, and we will have more to say about this later on. Given such a set S, however, every collection of subsets Q_i of N_1 that satisfies (7) gives rise to a valid disjunction (3), and the question arises of choosing one that yields a disjunction as "strong" as possible, i.e., one with p as small as possible and the sets Q_i, i = 1, ..., p, as large as possible. Next we state a simple heuristic that generates a disjunction (3) with that objective in mind.

(1) Choose a minimal subset S ⊆ N_1 such that

Σ_{j ∈ S} s_j ≥ z_U − ub,
and order S = {j(1), ..., j(p)} according to decreasing values of s_{j(i)}.

(2) Set Q_1 = {j ∈ N_1 | s_j ≥ s_{j(1)}} and define recursively

Q_i = {j ∈ N_1 | s_j ≥ s_{j(i)} + Σ_{k=1}^{i−1} s_{j(k)} g_{kj}},  i = 2, ..., p,

where g_{kj} = 1 if j ∈ Q_k, and g_{kj} = 0 if j ∉ Q_k. The sets Q_i, i = 1, ..., p, then define a valid disjunction (3).

A disjunction (3) can be used either for branching or for generating cuts. If used for branching, this disjunction can be strengthened so as to define a
partition of the feasible set; namely, (3) can be replaced by

⋁_{i=1}^p (x_j = 0, j ∈ Q_i;  Σ_{j ∈ Q_k} x_j ≥ 1, k = 1, ..., i − 1).   (3')

Note that, by construction of the sets Q_i, s_j ≥ s_{j(i)} > 0 for j ∈ Q_i, i = 1, ..., p, and thus on all branches except the one corresponding to i = 1, the lower bound ub given by the dual solution associated with the reduced cost vector s can be strengthened immediately after branching, by associating with each inequality

Σ_{j ∈ Q_k} x_j ≥ 1

the positive multiplier s_{j(k)}. In other words, on the ith branch (i > 1) the lower bound ub can be replaced right after branching by ub + s_{j(1)} + ... + s_{j(i−1)}.

The above described branching rule, while often considerably stronger than the traditional one, can occasionally be a lot weaker. Therefore, the best way of using it is to judiciously combine it with other branching rules, according to criteria that make sure it is only used at such nodes of the search tree where it can be expected to perform relatively well. It is in this fashion that disjunctions of the type (3) are being successfully used for branching in our set covering algorithm that also uses them to generate cutting planes (see the companion paper [2]), and in a restricted Lagrangean algorithm for the traveling salesman problem [3]. Next we turn to the other use of disjunctions of type (3), namely for generating cutting planes. In the case of the set covering problem, these cutting planes turn out to be of the same type as the original constraints.
3. The cutting planes

From now on, we address ourselves to the set covering problem

min{cx | Ax ≥ e, x_j = 0 or 1, j ∈ N}   (SC)

introduced in Section 1. (Here A is m×n.) We will denote

N_i = {j ∈ N | a_ij = 1},  i ∈ M.

Consider the ith term of a disjunction (3), i.e., x_j = 0, j ∈ Q_i. Clearly, every cover x that satisfies the ith term of (3) also satisfies the inequalities

Σ_{j ∈ N_h \ Q_i} x_j ≥ 1,  h ∈ M,

and hence, for any choice of indices h(i) ∈ M, i = 1, ..., p, every cover that satisfies (3) also satisfies the disjunction

⋁_{i=1}^p (Σ_{j ∈ N_{h(i)} \ Q_i} x_j ≥ 1),
which is easily seen to imply (for integer x) the inequality Σ x_j ≥ 1, with the summation taken over the union of the sets N_{h(i)} \ Q_i, i = 1, ..., p. Combining this reasoning with Corollary 1.2 yields the following.

Theorem 2. Let z_U be an upper bound on the value of (SC), and let u, s satisfy (5). If there exists a set of column indices S = {j(1), ..., j(p)}, ∅ ≠ S ⊆ N, such that

Σ_{j ∈ S} s_j ≥ z_U − ue,   (8)

then for any set of p row indices h(i) ∈ M, i = 1, ..., p, and any collection of p subsets Q_i ⊆ N, i = 1, ..., p, satisfying

Σ_{i : j ∈ Q_i} s_{j(i)} ≤ s_j,  j ∈ N,   (7)

every cover x such that cx < z_U satisfies the inequality

Σ_{j ∈ W} x_j ≥ 1,   (9)

where

W = ⋃_{i=1}^p (N_{h(i)} \ Q_i).   (10)
Remark 2.1. The family of cuts (9) remains the same if the condition Q_i ⊆ N in Theorem 2 is replaced by Q_i ⊆ N_{h(i)}, i = 1, ..., p.

Proof. From (10), the change does not affect the set W which defines inequality (9).

The inequalities (9) are valid cutting planes in the sense of being satisfied by every cover better than a given one. Further, they are of the set covering type. Since these properties are the same as those of the Bellmore-Ratliff cuts [5] obtained by the use of involutory bases, we next examine the relationship between the latter and our inequalities from conditional bounds. First, we show that the Bellmore-Ratliff inequalities are a subclass of the class of inequalities (9). Then we show by way of example that the subclass in question is a proper one.

Theorem 3. The Bellmore-Ratliff inequalities [5] are a subclass of the class defined in Theorem 2.
Proof. Let x̄ be a prime cover, B an involutory basis associated with x̄, and c_j − c_B a_j the jth reduced cost, where c_B is the m-vector whose ith component is c_{j(i)} if the basic variable associated with row i is (the structural variable) x_{j(i)}, and 0 if the basic variable associated with row i is a slack. (When B is an involutory basis, the reduced costs are known to be of this form.) Let the columns of B be indexed by I, and denote F = {j ∈ N | c_j − c_B a_j < 0}. The Bellmore-Ratliff cut associated with x̄ and B is then

Σ_{j ∈ F} x_j ≥ 1.   (11)

To obtain this cut via our procedure, set S = I ∩ N, S = {j(1), ..., j(p)}, i.e., let S be the index set of the basic structural variables, and set u = 0, s = c. Then u and s satisfy (5), and S satisfies (8) (with equality) for z_U = cx̄. Next let h(i) be the row index associated with the basic variable x_{j(i)}, and set Q_i = N_{h(i)} \ F, i = 1, ..., p. It is easy to see that these sets Q_i satisfy (7). Substituting for Q_i in (10) then yields

W = ⋃_{i=1}^p (N_{h(i)} ∩ F).

On the other hand, from the definition of F it follows that j ∈ F implies j ∈ N_{h(i)} for some i ∈ {1, ..., p}, hence

F ⊆ ⋃_{i=1}^p N_{h(i)},

and therefore W = F. Thus (11) is a special case of (9).

Note that the cutting planes derived by Bowman and Starr [6] via a vector partial ordering are a special case of the Bellmore-Ratliff inequalities, hence they can also be obtained by our procedure.

Next we illustrate by an example the fact that the Bellmore-Ratliff inequalities are a proper subclass of the class of inequalities (9), and that in some cases those inequalities (9) that cannot be derived by the Bellmore-Ratliff procedure are considerably stronger than the ones that can.
Example 2. Consider the set covering problem whose costs c_j and coefficient matrix A are shown in Table 2. The 0-1 vector x̄ whose support is {2, 3, 5, 12, 13, 17} is a cover, satisfying with equality all the inequalities except for 1 and 8, which are oversatisfied. The Bellmore-Ratliff procedure generates cuts from the involutory bases that can be associated with x̄, and it can obtain one cut from every such basis. The variables x_2, x_3 and x_5 can be basic only in rows 3, 4 and 6 respectively. Since rows 1 and 8 are slack, x_12 and x_13 can be basic only in rows 11 and 10 respectively. Finally, x_17 can be basic in any of the four rows 2, 5, 7, 9; accordingly, there are four involutory bases that can be associated with x̄. We will denote them by B_2, B_5, B_7 and B_9, according as x_17 is basic in row 2, 5, 7 or 9 respectively. The basis B_2 (after row permutations) is shown in Table 3. All variables whose index exceeds 20 are slacks.
Table 2. The costs c_j, j = 1, ..., 20, and the 11×20 zero-one coefficient matrix A of Example 2.
The four cutting planes that can be obtained by the Bellmore-Ratliff procedure, depending on which basis is used, are

x_1 + x_6 + x_9 + x_10 + x_15 + x_16 + x_18 + x_20 ≥ 1  from B_2,
x_4 + x_6 + x_9 + x_10 + x_11 + x_19 ≥ 1  from B_5,
x_6 + x_7 + x_10 + x_15 + x_19 + x_20 ≥ 1  from B_7,
x_6 + x_8 + x_10 + x_14 + x_18 + x_20 ≥ 1  from B_9.
On the other hand, using the conditional bound approach, we construct (by inspection or a heuristic) the dual vector u = (0, 1, 1, 1, 1, 1, 2, 0, 1, 2, 2) which, together with the associated reduced cost vector s = (2, 0, 0, 2, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 2, 0, 1), satisfies condition (5). The cover x̄ whose support is {2, 3, 5, 12, 13, 17} yields z_U = cx̄ = 14; and the dual vector u yields the lower bound ue = 12.
Table 3. The involutory basis B_2 (after row permutations); its columns are those of the structural variables x_3, x_5, x_12, x_13, x_17 and of slack variables.
Since z_U − ue = 2, Q_0 = {j ∈ N | s_j ≥ 2} = {1, 4, 18}, and thus (Remark 1.3) every cover better than x̄ satisfies x_1 = x_4 = x_18 = 0. Hence we replace N by N \ {1, 4, 18}. Further, to apply Corollary 1.2, we pick the column indices j(1) = 12, j(2) = 13, for which (8) holds, since s_12 + s_13 = 2 ≥ z_U − ue. Next we pick the row indices h(1) = 8, h(2) = 5, and choose the sets Q_1 = {12, 13}, Q_2 = {9, 11}, to obtain N_{h(1)} \ Q_1 = {6, 19} and N_{h(2)} \ Q_2 = {10, 16, 19}, hence W = {6, 10, 16, 19}. In choosing the sets Q_i we make sure that (7) is satisfied, and apart from that try to make each successive N_{h(i)} \ Q_i add as few new elements to W as possible. We have thus obtained the cut

x_6 + x_10 + x_16 + x_19 ≥ 1,

which has only four positive coefficients, whereas each of the involutory basis cuts has at least six. The above inequality cuts off x̄. This is due to the way we chose the column indices j(i) and the row indices h(i), i = 1, ..., p, as will be shown in the next section. If we do not care about cutting off a specified cover, we can obtain inequalities which are "stronger" in the sense of having fewer positive coefficients. Thus, for instance, if we choose j(1) = 13, j(2) = 9, and h(1) = 8, h(2) = 5, we can generate the cut

x_17 + x_19 ≥ 1

(by setting Q_1 = {12, 13}, Q_2 = {9, 11}); and for j(1) = 13, j(2) = 14, h(1) = 8, h(2) = 4, we obtain the cut

x_3 + x_19 ≥ 1

(by choosing Q_1 = {12, 13} and Q_2 = {14, 20}).
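The first derivation above can be replayed from the stated s vector alone, once the two rows are written out; N_8 = {6, 12, 13, 19} and N_5 = {9, 10, 11, 16, 19} are inferred here from the quoted set differences (the full matrix in Table 2 did not survive the scan):

```python
# Reduced costs s_1, ..., s_20 as given in the text.
s = [2, 0, 0, 2, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 2, 0, 1]
sj = {j: s[j - 1] for j in range(1, 21)}
zU, lb = 14, 12                       # c x-bar and ue

# Remark 1.3: columns fixed to zero.
Q0 = {j for j in sj if sj[j] >= zU - lb}
assert Q0 == {1, 4, 18}

# Rows h(1) = 8 and h(2) = 5, inferred from the quoted set differences.
N8 = {6, 12, 13, 19}
N5 = {9, 10, 11, 16, 19}
Q1, Q2 = {12, 13}, {9, 11}

# Condition (8): s_12 + s_13 = 2 >= zU - ue = 2.
assert sj[12] + sj[13] >= zU - lb
# Cut (9): W = (N8 \ Q1) | (N5 \ Q2).
W = (N8 - Q1) | (N5 - Q2)
assert W == {6, 10, 16, 19}
```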
4. Some properties of cuts from conditional bounds

The family of cuts defined by Theorem 2 is vast, and one is of course interested in computationally cheap procedures for generating "strong" members of this class. In this section we investigate some properties of the cuts (9) that will be helpful toward that goal. The first practical question that arises is whether condition (8) can always be met, and how. Since s depends on u, it should not be surprising that one answer to this question comes in terms of additional conditions on u.

Theorem 4. Let the vectors ū and s̄ satisfy (5), and let x̄ be a cover with support S(x̄). If

ū(Ax̄ − e) = 0,

then (8) holds for S = S(x̄).
Proof. Consider the pair of dual linear programs

(L1)  min{cx | Ax ≥ e; xj ≥ 1, j ∈ S(x̄); xj ≥ 0, j ∈ N \ S(x̄)}

and

(D1)  max{ue + Σ_{j∈S(x̄)} sj | ua^j + sj = cj, j ∈ N; u ≥ 0, s ≥ 0}.

Clearly, x̄ is a feasible solution to (L1), and (ū, s̄) is a feasible solution to (D1). Further, x̄ and (ū, s̄) satisfy the complementary slackness conditions ū(Ax̄ − e) = 0 and (x̄j − 1)s̄j = 0, j ∈ S(x̄), x̄j s̄j = 0, j ∈ N \ S(x̄); hence they are optimal solutions to (L1) and (D1) respectively. Therefore

ūe + Σ_{j∈S(x̄)} s̄j = cx̄,
which together with zU ≤ cx̄ proves the statement.

For any cover x, denote T(x) = {i ∈ M | a_i x = 1}. Then as an immediate consequence of Theorem 4, we have

Remark 4.1. Let x̄ be a cover, and let (ū, s̄) satisfy (5). If ū also satisfies

ūi = 0, ∀i ∈ M \ T(x̄),

then (8) holds for S = S(x̄).

Thus, if an upper bound zU and vectors u, s satisfying (5) are at hand, but condition (8) does not hold, it can be made to hold by successively setting to 0 components ui of u such that i ∈ M \ T(x̄). At worst all such components may have to be set to 0; then (8) will hold.

Before turning to other characteristics of the cuts (9), we now state a basic property of the set covering problem. Let P be the convex hull of all integer n-vectors satisfying Ax ≥ e, x ≥ 0, i.e.,

P = conv{x ∈ R^n | Ax ≥ e, x ≥ 0, xj integer, j ∈ N}.

We then have the following

Theorem 5. The inequality
Σ_{j∈Ni} xj ≥ 1,    (12)

where i ∈ M, defines a facet of P if and only if there exists no k ∈ M such that Nk ⊆ Ni, Nk ≠ Ni.
Proof. The " o n l y if" part is obvious. To prove the " i f " part, we assume there is no k E M such that NkC1V~,Nk# Ni, and we exhibit n linearly independent integer n-vectors that satisfy A x >- e, x >- 0 and for which (1) holds with equality. Let INil--p, and assume w.l.o.g, that N~ is the set of the first p indices in N. Let y = (1 . . . . . 1), y E R n-p, and let ei and [i be the unit vector in R p and R n-p respectively, whose ith c o m p o n e n t is 1. Now consider the p n-vectors (e~, y), i = 1. . . . . p, and the n - p n-vectors (el, y + f~_p), i = p + 1. . . . . n. Since there is no k E M such that Nk C 1V~,N k # N~, each of these nonnegative integer vectors satisfies Ax-> e; and since each one of them has a single 1 among its first p components, they all satisfy (12) with equality. Further, the n x n matrix whose rows are these vectors is
Yp where for k = p and k = n - p , /k is the identity of order k, while Yk is the k• matrix whose entries are all equal to 1; and E is the ( n - p ) x p matrix whose first column consists of l's, and whose remaining columns consist of O's. Now define the matrix
Z = ( I" + YPE -E
- Yp I~_p )"
Using the fact that EYp = Y,_p, it is easy to see that X Z = In, i.e., Z = X -I and hence X is nonsingular. This proves that the n vectors introduced above are linearly independent. In a cut-generating procedure it is important to make sure that no cut is repeated. N e x t we give a necessary and sufficient condition for a cut to be " n e w " . Let (SC) stand for the set covering problem amended with all the cutting planes generated up to some point, and let
Σ_{j∈W} xj ≥ 1    (9)

be the next cut generated. We then say that the inequality (9) is new if there is no i ∈ M such that Ni ⊆ W.

Remark 5.1. The inequality (9) is new if and only if N \ W is the support of a cover for (SC).

Proof. The cut (9) is new if and only if Ni ⊄ W, ∀i ∈ M; hence if and only if Ni \ W ≠ ∅, ∀i ∈ M. But this condition holds if and only if N \ W is the support of some cover.

While the condition of Remark 5.1 is straightforward, it is easier to embed in a cut generating procedure when paraphrased as follows.
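Remark 5.1's test for newness is cheap to implement. A minimal sketch, assuming rows are given as a list of column index sets (all names are illustrative, not from the paper):

```python
def is_new_cut(W, rows):
    """The cut sum_{j in W} x_j >= 1 is new iff no row set N_i is contained
    in W -- equivalently, iff every row keeps an element outside W, so that
    N \ W is the support of a cover (Remark 5.1)."""
    return all(not (N_i <= W) for N_i in rows)

rows = [{1, 2}, {2, 3}]
print(is_new_cut({2}, rows))      # True: both rows have an element outside W
print(is_new_cut({1, 2}, rows))   # False: N_1 = {1, 2} is contained in W
```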
Remark 5.1.a. The inequality (9) is new if and only if it cuts off (is violated by) some cover of (SC).

The next theorem gives conditions on the column indices j(i) and row indices h(i) used in generating inequality (9), to guarantee that the inequality obtained cuts off a specified cover. We will denote

Mj = {i ∈ M | aij = 1}, j ∈ N.
Theorem 6. Let zU, u, s, S and Qi, i = 1, ..., p, be as in Theorem 2, and let j(i) ∈ Qi, i = 1, ..., p. If x̄ is a cover such that S ⊆ S(x̄) and

h(i) ∈ T(x̄) ∩ M_j(i), i = 1, ..., p,    (13)

then the inequality (9) cuts off (is violated by) x̄.

Proof. Assume S ⊆ S(x̄) and (13) holds. From h(i) ∈ M_j(i) we have j(i) ∈ N_h(i), i = 1, ..., p; and since j(i) ∈ S ⊆ S(x̄) implies x̄_j(i) = 1, while h(i) ∈ T(x̄) implies |S(x̄) ∩ N_h(i)| = 1, i = 1, ..., p, it follows that

S(x̄) ∩ N_h(i) = {j(i)}, i = 1, ..., p.

Further, since j(i) ∈ Qi, i = 1, ..., p, we have

S(x̄) ∩ (N_h(i) \ Qi) = ∅, i = 1, ..., p,

and hence S(x̄) ∩ W = ∅, i.e., the inequality (9) cuts off x̄.

Remark 6.1. Every inequality (9) for which the conditions of Theorem 6 are satisfied defines a facet of
P* = conv{x ∈ R^n | Ax ≥ e, Σ_{j∈W} xj ≥ 1, x ≥ 0, xj integer, j ∈ N}.
Proof. Follows from Remark 5.1 and Theorem 5.

Theorems 2 and 6 provide rules for generating a sequence of valid cutting planes that are all distinct and, furthermore, are all facets of the current polytope P*. This latter property, however, does not imply that all inequalities generated this way are equally strong. Since all the inequalities in question have coefficients equal to 0 or 1 and a right hand side equal to 1, we will use the number of coefficients equal to 1 as a measure of their strength (the fewer the 1's, the stronger the inequality). Note that some facets of the set covering polytope may be much weaker than others according to this criterion. Thus, for instance, all five inequalities represented by the rows of the matrix A in Table 4 define facets of the set covering polytope corresponding to A, yet inequality 4, with only two 1's, is much stronger than inequality 5, which has ten 1's.
Table 4. (A 0–1 matrix A whose five rows all define facets of the associated set covering polytope; row 4 contains only two 1's, while row 5 contains ten.)
Thus, although they all define facets of the current polytope P*, the cutting planes obtainable via the rules of Theorem 6 are not all equally desirable. The next section discusses a procedure for generating conveniently strong members of the family.
5. Generating cuts

The strength of an inequality (9), i.e., the size of the set W, depends on the integer p and on the size of the sets N_h(i) \ Qi, i = 1, ..., p, of Theorem 6. To keep p conveniently small, the procedure chooses the set S = {j(1), ..., j(p)} corresponding to the p largest reduced costs sj, j ∈ S(x̄), where p is the smallest integer for which (8) is satisfied. Each row index h(i), i = 1, ..., p, is of course chosen from T(x̄) ∩ M_j(i), as prescribed by Theorem 6. Further, in order to have W as small as possible, the sequence of row indices h(i) is chosen so as to make the set Wk \ W_(k−1) as small as possible at each step k ∈ {1, ..., p}, where W0 = ∅ and

Wk = ∪_{i=1}^{k} (N_h(i) \ Qi), k = 1, ..., p.
Since for any S satisfying (8), |S| = 1 implies (Remark 1.3) that the variable xj such that S = {j} can be permanently set to 0, we assume this has already been done for all such singleton sets, and hence |S| ≥ 2 for all S satisfying (8).

Let M and N be the row and column index sets of the current problem (SC), let x̄ be a prime cover for (SC), and denote, as before,

S(x̄) = {j ∈ N | x̄j = 1},  T(x̄) = {i ∈ M | a_i x̄ = 1}.

Further, let u and s satisfy (5), and assume (8) holds for S = S⁺.

Cut-generating procedure

Step 0: Initialize W = ∅, S = S⁺, zL = ue, i = 1, and go to Step 1.

Step 1: Define

s_j(i) = max_{j∈S} sj,  Q = {j ∈ N | sj ≥ s_j(i)}.
E. Balasl Set covering problems
34
Choose h(i) such that

|N_h(i) \ (Q ∪ W)| = min_{h ∈ T(x̄) ∩ M_j(i)} |N_h \ (Q ∪ W)|

(breaking ties arbitrarily), and set

W ← W ∪ (N_h(i) \ Q),  zL ← zL + s_j(i).

If zL ≥ zU, go to Step 2. Otherwise, let

sj ← sj − s_j(i) for j ∈ Q ∩ N_h(i) (sj unchanged otherwise),

set S ← S \ {j(i)}, i ← i + 1, and go to Step 1.

Step 2: Add to the constraint set of (SC) the inequality

Σ_{j∈W} xj ≥ 1.
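Steps 0–2 can be sketched in code. This is a minimal reading of the procedure under an assumed data layout (all names are illustrative); the refined choice rule for j(i) discussed further on is omitted:

```python
# Sketch of the cut-generating procedure (Steps 0-2). Data layout assumed:
# N_rows[h] = column set of row h; M_cols[j] = row set of column j;
# T = T(x), the rows covered exactly once by the current cover x;
# s = reduced costs; S_plus = the starting set S+; zL = ue; zU = upper bound.

def generate_cut(N_rows, M_cols, T, s, S_plus, zL, zU):
    s = dict(s)                                  # local copy; updated in Step 1
    S, W = set(S_plus), set()                    # Step 0
    while True:
        ji = max(S, key=lambda j: s[j])          # Step 1: largest reduced cost
        Q = {j for j in s if s[j] >= s[ji]}
        # pick h(i) in T(x) ∩ M_j(i) adding fewest new elements to W
        hi = min(T & M_cols[ji], key=lambda h: len(N_rows[h] - Q - W))
        W |= N_rows[hi] - Q
        zL += s[ji]
        if zL >= zU:                             # Step 2: emit the cut
            return W                             # sum_{j in W} x_j >= 1
        for j in Q & N_rows[hi]:
            s[j] -= s[ji]
        S.discard(ji)
```

On a toy instance with rows N1 = {1, 2}, N2 = {2, 3}, reduced costs s = (2, 0, 1), S⁺ = {1, 3}, zL = 0 and zU = 3, this returns W = {2}, i.e. the cut x2 ≥ 1.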
In at most |S⁺| iterations, this procedure generates an inequality (9) satisfied by every cover x such that cx < zU and violated by x̄. Indeed, S (initialized as S⁺) is diminished by one element at every iteration, hence there are at most |S⁺| iterations. Further, since (8) holds for S = S⁺, after p ≤ |S⁺| iterations (8) holds for the current set S (i.e., zL ≥ zU), and we go to Step 2 to generate a cut (9). For the sets Qi = Q ∩ N_h(i), i = 1, ..., p, we have j(i) ∈ Qi, and (7) is satisfied (by the definition of Q and of the sj at every iteration). Finally, the choice of h(i) guarantees (13). Thus all requirements of Theorem 6 are met.

A couple of minor improvements are at hand. Choosing the largest sj to define j(i) at every iteration has the purpose of minimizing the size of the set S in (8). But at the last iteration choosing the largest sj may not be indicated, if a smaller sj suffices to satisfy (8). Thus a better rule for choosing j(i) in Step 1 is to set
vi = min{max_{j∈S} sj, min{sj | j ∈ S, sj ≥ zU − zL}}

and then choose as j(i) one of the indices j ∈ S for which sj = vi. Furthermore, whenever this index is not unique, i.e., |J| > 1, where J = {j ∈ S | sj ≥ vi}, the choice of j(i) and h(i) can often be improved by first setting Q = {j ∈ N | sj ≥ vi}, next choosing h(i) so as to minimize |N_h \ (Q ∪ W)| over all h ∈ T(x̄) ∩ M_J, where

M_J = ∪_{j∈J} Mj,
and then choosing as j(i) the unique index {j} = J ∩ N_h(i).

Example 3. Consider the set covering problem of Example 2 (Table 2), with c4 = 3 replaced by c4 = 1. Then the cover x̄ whose support is S(x̄) = {2, 4, 13, 20}
gives zU = cx̄ = 14, and T(x̄) = M \ {1}. The vector u of Example 2 yields the same reduced costs sj as in that example, except for s4, which now is 0. The lower bound ue is 12, and since sj ≥ 14 − 12 = 2 for j = 1, 18, we set x1 = x18 = 0 and replace N by N \ {1, 18}. Condition (8) holds for S = S⁺ = {13, 20}, since s13 + s20 ≥ 2.
Step 0: W = ∅, S = {13, 20}, zL = ue = 12.

Step 1: v1 = min{1, 2} = 1, J = {13, 20}, Q = {8, 9, 11, 12, 13, 14, 15, 20}, M_J = M13 ∪ M20 = M \ {3, 5}. To choose h(1), we minimize |N_h \ Q| over h ∈ T(x̄) ∩ M_J = M \ {1, 3, 5}, and find that the minimum is 1, attained for h = 4, 8, 9. We arbitrarily choose h(1) = 4, and set W = N4 \ Q = {3}, zL = 12 + 1 = 13. The sj remain unchanged except for j = 14, 20, the new values being s14 = s20 = 0. We set S = {13}, i = 2, and go to

Step 1: v2 = min{1, 1} = 1, J = {13}, Q = {8, 9, 11, 12, 13, 15}, M_J = M13 = {1, 8, 10}. To choose h(2), we minimize |N_h \ Q| over h ∈ T(x̄) ∩ M_J = {8, 10}, and find h(2) = 8. We set W = {3} ∪ (N8 \ Q) = {3, 19}, zL = 13 + 1 = 14, and since zL ≥ zU, we go to

Step 2: We add to (SC) the inequality x3 + x19 ≥ 1.
6. A class of algorithms

The cutting planes discussed in this paper can best be used in a framework that takes maximum advantage of their properties. To obtain a cutting plane from conditional bounds, one needs a feasible solution (u, s) to the dual of the linear program associated with (SC). Such a solution also provides a lower bound ue on the value of (SC). On the other hand, the easiest way to make sure that the cuts one generates are all distinct is to have each inequality cut off some cover satisfying all the previously produced inequalities. Thus to obtain a sequence of distinct cutting planes, one also needs a sequence of covers. Each cover x, in turn, provides an upper bound cx on the value of (SC). The best approach then seems one that alternates between (α) generating a cover x for the current problem, and (β) generating a dual solution (u, s) and using it to derive an inequality that cuts off x. In such a procedure, the value of (SC) is bounded from above by the sequence of covers obtained under (α), and bounded from below by the sequence of dual solutions produced under (β). The rate of convergence of the algorithm is the rate at which the gap zU − zL between the two bounds decreases.
Since every inequality generated in the procedure cuts off at least one new cover, and since the number of distinct covers is finite, the procedure outlined above is finite, irrespective of the methods used to generate the sequence of covers x and dual solutions (u, s). Its efficiency, on the other hand, depends crucially on the efficiency of those methods. Several versions of the approach outlined above were implemented and thoroughly tested in a computational study summarized in the companion paper [2]. The algorithm that emerged as a result of the testing uses several different heuristics intermittently to generate prime covers, and produces dual solutions (u, s) both by heuristics and by subgradient optimization. When the decrease in the gap zU − zL slows down, the algorithm branches, using either a disjunction of the type discussed in this paper, or a dichotomy derived from other considerations, according to some measure of comparative strength. The algorithm is particularly well suited for low density problems, and its performance on set covering problems taken from the literature compares favorably with earlier methods. Randomly generated test problems with up to 200 constraints and 2000 variables have been successfully run. For details of the algorithm and results of the computational tests the reader is referred to [2].
References

[1] E. Balas, "Set covering with cutting planes from conditional bounds", MSRR No. 399, Carnegie-Mellon University, Pittsburgh, PA, July 1976.
[2] E. Balas and A. Ho, "Set covering algorithms using cutting planes, heuristics and subgradient optimization: A computational study", Mathematical Programming Study 12 (1980) 37–60 [this volume].
[3] E. Balas and N. Christofides, "A restricted Lagrangean approach to the traveling salesman problem", MSRR No. 439, Carnegie-Mellon University, Pittsburgh, PA, July 1979.
[4] E. Balas and M.W. Padberg, "Set partitioning: A survey", SIAM Review 18 (1976) 710–760.
[5] M. Bellmore and H.D. Ratliff, "Set covering and involutory bases", Management Science 18 (1971) 194–206.
[6] V.J. Bowman and J. Starr, "Set covering by ordinal cuts. I: Linear objective functions", MSRR No. 321, Carnegie-Mellon University, June 1973.
[7] N. Christofides and S. Korman, "A computational survey of methods for the set covering problem", Management Science 21 (1975) 591–599.
[8] R.J. Duffin, "Infinite programs", in: H.W. Kuhn and A.W. Tucker, eds., Linear inequalities and related systems (Princeton University Press, Princeton, NJ, 1956) 157–170.
[9] R. Garfinkel and G.L. Nemhauser, "Optimal set covering: A survey", in: A.M. Geoffrion, ed., Perspectives on optimization (Addison-Wesley, Reading, MA, 1972) 164–183.
[10] L.B. Kovács and I. Dienes, "Maximal direction-complete paths and their application to a geological problem: setting up stratigraphic units", paper presented at the 9th International Symposium on Mathematical Programming (Budapest, August 1976).
Mathematical Programming Study 12 (1980) 37–60. North-Holland Publishing Company
SET COVERING ALGORITHMS USING CUTTING PLANES, HEURISTICS, AND SUBGRADIENT OPTIMIZATION: A COMPUTATIONAL STUDY

Egon BALAS* and Andrew HO
Carnegie-Mellon University, Pittsburgh, PA, U.S.A.

Received 30 July 1979
Revised manuscript received 18 October 1979

We report on the implementation and computational testing of several versions of a set covering algorithm, based on the family of cutting planes from conditional bounds discussed in the companion paper [2]. The algorithm uses a set of heuristics to find prime covers, another set of heuristics to find feasible solutions to the dual linear program which are needed to generate cuts, and subgradient optimization to find lower bounds. It also uses implicit enumeration with some new branching rules. Each of the ingredients was implemented and tested in several versions. The variant of the algorithm that emerged as best was run on 55 test problems (20 of them from the literature), with up to 200 constraints and 2000 variables. The results show the algorithm to be more reliable and efficient than earlier procedures on large, sparse set covering problems.

Key words: Set Covering Problem, Cutting Planes, Subgradient Optimization, Computation, Heuristics, Algorithms, Branch and Bound.
1. Introduction: Cutting planes from conditional bounds

In this paper we report on the implementation and computational testing of an algorithm, or rather a class of algorithms, based on the cutting planes from conditional bounds introduced in [1], and also using as ingredients several heuristics for finding feasible primal and dual solutions, subgradient optimization of a Lagrangean function, and implicit enumeration with some new branching rules.

The family of cutting planes from conditional bounds is briefly described in this section; a more detailed discussion of the properties of these cuts is to be found in the companion paper [2]. In Section 2 we describe the general features of the algorithm whose versions we implemented and tested. Sections 3–7 discuss each of the ingredients of our procedure, with comparisons of various versions based on computational testing. Finally, Section 8 summarizes our computational experience with the algorithm as a whole. While the discussion focuses on a particular instance of the algorithms in the class considered, namely the one that emerged as the most successful from our computational experiments, we also discuss possible variations wherever it seems useful.

* Research supported by the National Science Foundation under grant MCS76-12026 A02 and the Office of Naval Research under contract N00014-75-C-0621 NR 047-048.
E. Balas, A. Ho/ Set covering problems: Computation
38
As one can see from the computational results presented in Section 8, the algorithm discussed here is a reliable and efficient tool for solving large, sparse set covering problems of the kind that frequently occur in practice. With a time limit of 10 minutes on the DEC 20/50, we have solved all but one of a set of 50 randomly generated set covering problems with up to 200 constraints, 2000 variables, and 8000 nonzero matrix entries (here "solving" means finding an optimal solution and proving its optimality), never generating a branch and bound tree with more than 50 nodes. For problems that are too large to be solved within a reasonable time limit, the procedure usually finds good feasible solutions, with a bound on the distance from the optimum (for the one unsolved problem, this bound was 2.3%). We also tested the algorithm on a set of 5% density problems, but as density increases, the performance of the algorithm tends to decline. We consider the set covering problem (SC)
min{cx | Ax ≥ e, x ∈ {0, 1}^n},
where A = (aij) is an m × n 0–1 matrix and e = (1, ..., 1) has m components. Let a_i and a^j denote the i-th row and the j-th column of A, and let M = {1, ..., m}, N = {1, ..., n}. We denote

Mj = {i ∈ M | aij = 1}, j ∈ N;  Ni = {j ∈ N | aij = 1}, i ∈ M.
We will also use the pair of dual linear programs (L)
min{cx I A x >- e, x >- 0}
and
(D)
max{ue I u A <-- c, u <--0},
associated with (SC). A vector x E{0,1} n satisfying A x l e is called a cover, and S ( x ) = {j E N I xs = 1} its support. A cover whose support is minimal, is prime. For a cover x, we denote T ( x ) = {i E M I aix = 1}. The theory underlying the family of cutting planes from conditional bounds can be summarized as follows (for proofs of these statements and further elaboration see the companion paper [2]). Let zu be some known upper bound on the value of (SC), and let u be any feasible solution to (D), with s = c - u A , such that the condition ~ s ~ >- zu - ue
(1)
is satisfied for some S _CN. Let S = {j(1) . . . . . j(p)}, and let Qi, i = 1. . . . . p, be any collection of subsets of N satisfying
~o
s~i~ <-- sj, j E N .
(2)
Then every cover x such that cx < zU satisfies the disjunction

⋁_{i=1}^{p} (xj = 0, j ∈ Qi).    (3)

Further, for any choice of row indices h(i) ∈ M, i = 1, ..., p, the disjunction (3) implies the inequality

Σ_{j∈W} xj ≥ 1,    (4)

where

W = ∪_{i=1}^{p} (N_h(i) \ Qi).

Finally, if j(i) ∈ Qi, i = 1, ..., p, and if x̄ is a cover such that S ⊆ S(x̄) and h(i) ∈ T(x̄) ∩ M_j(i), i = 1, ..., p, then the inequality (4) cuts off x̄ and defines a facet of

P* = conv{x ∈ R^n | Ax ≥ e, Σ_{j∈W} xj ≥ 1, x ≥ 0, xj integer, j ∈ N}.
Using the above results, one can generate a sequence of cutting planes that are all distinct from each other, by generating a sequence of covers x and feasible solutions u to (D). The covers x provide upper bounds, while the vectors u provide lower bounds on the value of (SC). Since every inequality generated cuts off a cover satisfying all previously generated inequalities, and the number of distinct covers is finite, the procedure ends in a finite number of iterations, with an optimal cover at hand.
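The finiteness argument can be illustrated by brute force on a toy instance. Everything below is illustrative only, not the authors' implementation:

```python
from itertools import product

A = [[1, 1, 0],        # a 2 x 3 toy constraint matrix for Ax >= e
     [0, 1, 1]]

def covers(cuts=()):
    """All 0-1 vectors satisfying Ax >= e plus every added cut (0-based W)."""
    return [x for x in product((0, 1), repeat=3)
            if all(sum(r[j] * x[j] for j in range(3)) >= 1 for r in A)
            and all(sum(x[j] for j in W) >= 1 for W in cuts)]

before = covers()
after = covers(cuts=[{0}])        # add the cut x1 >= 1
assert set(after) < set(before)   # the cut removed at least one cover
```

Since each cut eliminates at least one cover feasible for all previous cuts, and {0, 1}^n contains only finitely many covers, the sequence of cuts must terminate.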
2. Outline of the algorithm The algorithm alternates between two sets of heuristics, one of which finds a "good'.' prime cover x for the current problem and a (possibly improved) upper bound, while the other generates a feasible solution to (D) satisfying (1) for S = S ( x ) , and from it a cutting plane that cuts off x, as well as a (possibly improved) lower bound. Whenever a disjunction (3) is obtained with p = 1, all the variables indexed by Q~ are set to 0. The second set of heuristics is periodically supplemented by subgradient optimization to obtain sharper lower bounds. Though this procedure in itself must find an optimal cover in a finite number of iterations, for large problems this may take too many cuts. Therefore, as soon as the rate of improvement in the bounds decreases beyond a certain value, the algorithm branches. A schematic flowchart of the algorithm is given in Fig. 1. P R I M A L designates the set of heuristics used for finding prime covers (feasible primal solutions),
Fig. 1. (Schematic flowchart of the algorithm.)
DUAL designates the heuristics used for finding feasible dual solutions. TEST is the routine for fixing variables at 0. CUT generates a cutting plane violated by the current cover. SGRAD uses subgradient optimization in an attempt to find an improved dual solution and lower bound. BRANCH is the branching routine, which breaks up the current problem into a number of subproblems, while SELECT chooses a new subproblem to be processed.

The four decision boxes of the flowchart can be described as follows. Let zU and zL be the current upper and lower bound, respectively, on the value of (SC).

(1) If zL ≥ zU, the current subproblem is fathomed (1.1). If zL < zU and some variable belonging to the last prime cover has been fixed at 0, a new cover needs to be found (1.2). Otherwise, a cut is generated (1.3).

(2) After adding a cut, the algorithm returns to PRIMAL (2.1) unless the iteration counter is a multiple of α, in which case (2.2) it uses subgradient optimization in an attempt to improve upon zL. On the basis of some experimentation, we set |M|/20 ≤ α ≤ |M|/10.

(3) If zL ≥ zU, the current subproblem is fathomed (3.1). If zL < zU but the gap zU − zL has decreased by at least δ > 0 during the last β iterations, we continue
the iterative process (3.2). Otherwise, we branch (3.3). Again, following some numerical experiments, we use δ = 0.5 and β = 4α, with α as defined in (2).

(4) If there are no active subproblems, the algorithm stops: the cover associated with zU is optimal (4.1). Otherwise, it applies the iterative procedure to the selected subproblem (4.2).

In the next five sections we discuss each of the ingredients of the algorithm in some detail on the basis of computational testing of several versions. After that we report on our computational experience with the algorithm as a whole.
3. Primal heuristics

The heuristics we use to generate prime covers are of the "greedy" type, in that they construct a cover by a sequence of steps, each of which consists of the selection of a variable xj that minimizes a certain function f of the coefficients of xj. They differ in the function f used to evaluate the variables. If kj denotes the number of positive coefficients of xj in those rows of the current constraint set not yet covered, the general form of the evaluation function is f(cj, kj). Since it is computationally cheaper to consider only a subset of variables at a time, and since every row must be covered anyhow, i.e., the cover to be constructed must contain at least one of the variables having a positive coefficient in any given row, we restrict the choice at each step to the set of variables having a positive coefficient in some specified row i* ∈ M, where M denotes the set of rows. Denoting by R the set of rows not yet covered and by S the support of the cover to be constructed, the basic procedure that we use can be stated as follows.

Step 0. Set R = M, S = ∅, t = 1, and go to 1.

Step 1. If R = ∅, go to 2. Otherwise, let kj = |Mj ∩ R|, j ∈ N_i*, and choose j(t) such that

f(c_j(t), k_j(t)) = min_{j∈N_i*} f(cj, kj).

Set R ← R \ M_j(t), S ← S ∪ {j(t)}, t ← t + 1, and go to 1.

Step 2. Consider the elements j ∈ S in order, and if S \ {j} is the support of a cover, set S ← S \ {j}. When all j ∈ S have been considered, S defines a prime cover.

As to the choice of i* in Step 1, the criterion that suggests itself is
|N_i*| = min_{i∈R} |Ni|.

Rather than implement this choice rule directly, which would be costly, we approximate it by ordering once and for all the rows of the initial coefficient
matrix A according to decreasing |Ni|, and always choosing i* as the last element of the ordered set R. Since the cuts generated in the procedure also tend to have a decreasing number of 1's, i.e., later cuts tend to have fewer 1's than earlier cuts, this rule comes sufficiently close to choosing the row with the minimum number of 1's. If the set N_i* in Step 1 is replaced by N, i.e., the choice is not restricted to a certain row, and Step 2 is removed, i.e., the procedure is allowed to stop whenever a cover is obtained, whether prime or not, then the above procedure with f(cj, kj) = cj/kj is the greedy heuristic shown by Chvátal [3] to have the following property: if z_opt is the value of (SC) and z_heu is the value of a solution found by the heuristic, then
z_heu / z_opt ≤ Σ_{i=1}^{d} 1/i,  where d = max_{j∈N} |Mj|,
and this bound is best possible. From a practical point of view this bound is very poor, and it can be shown [7] that there is no better bound for any function f used in the above procedure. However, proving this statement requires the construction of examples for which the worst case bound is attained, and every function f requires a different example. This suggests, as a practical remedy against the poor worst-case performance of the heuristic, the intermittent use of several functions f rather than a single one. This idea has been implemented and tested with reasonably good results. The following five functions f(cj, kj) were considered:

(i) cj;  (ii) cj/kj;  (iii) cj/log2 kj;  (iv) cj/(kj log2 kj);  (v) cj/(kj ln kj).

In cases (iii) and (iv), log2 kj is to be replaced by 1 when kj = 1; and in case (v), ln kj is to be replaced by 1 when kj = 1 or 2. Using (i) amounts to simply choosing the lowest-cost variable at each step. Criterion (ii) minimizes the unit cost of covering an uncovered row. The functions (iii), (iv) and (v) select the same variable as function (ii) whenever cj = 1, ∀j ∈ N, but otherwise (iii) assigns less weight, while (v) assigns more and (iv) even more weight to the number kj of rows covered, versus the cost cj.

The five functions were tested on a set of 13 randomly generated problems with 100 ≤ m ≤ 200, 100 ≤ n ≤ 1000, and 2% density. Though none of them emerged as uniformly dominating any of the others in terms of the quality of the solutions obtained, (iii) scored best and (ii) second best, in the sense that of the 13 problems, (iii) gave the best solution in 6 cases and (ii) in 5 cases. As to the other functions, the best solution was found by (i) and (v) in 3 cases each, and by (iv) in 2 cases (the sum of these numbers exceeds 13, since often more than one
function gave a best solution). Table 1 shows the % deviation from the optimum of the solution found by each function for each problem. The numbers in the first column are those under which the problems are listed in Section 8, where they are also described in more detail. The best solution found by any of the 5 functions never deviated from the optimum by more than 10.8%.

The above described procedure can be amended so as to produce, at little extra cost, more than one cover, the best of which can then be retained. This is done as follows. We first use the above heuristic to find a cover. Then we consider the variables in the order of their inclusion into the cover, and remove from the cover all those that have a positive coefficient in at least one oversatisfied constraint. Next we complete the cover again using the heuristic. We continue in this fashion until either a cover is generated which does not oversatisfy any of the constraints, or the number of covers generated (not necessarily distinct) reaches some specified integer σ. To find the most desirable value for σ, we applied this procedure with σ = 10 to each of the above described 13 problems, with each of the 5 functions discussed. Somewhat surprisingly, we found that of the 13 × 5 = 65 cases, the best of the 10 covers generated was the first one in 22 instances, the second one in 34 instances, the third one in 8 instances, and in 1 instance it was the fifth one. In other words, only once was there an improvement after the third cover was found, and this in spite of the fact that the best cover found was not optimal in any of the 65 cases. Consequently, we set σ = 3, and use function (iii) for the first cover, a different function ((i) or (ii)) for the second cover, and again a different function for the third cover. This way the procedure is still computationally cheap, and yields considerably better results than the version that produces only one cover.
We call this procedure PRIMAL 1.

Table 1
Deviation from optimum (in %) of the values found by primal heuristics

Problem data        Function used with heuristic
No.     m     n     cj     cj/kj   cj/log2 kj   cj/(kj log2 kj)   cj/(kj ln kj)
2.2    200   413    5.4     5.0       6.2           16.3              5.2
2.3    200   300    3.1     8.8      10.0           14.1             13.0
2.4    200   500   11.4    11.7       2.3           14.5             14.5
3.1    100   100    1.6     0.5       1.6            0.5              0.5
3.2    100   200    6.9     3.5       7.6            7.6              6.9
3.3    100   300    3.7    16.4       3.1           12.0             16.4
3.4    100   400   16.2     6.5      10.3            9.1              9.7
3.5    100   500   11.7    28.2      10.8           31.1             28.8
3.6    100   600    6.4    17.6       4.9           23.1             18.7
3.7    100   700    4.7     8.0       6.7           13.7             11.5
3.8    100   800    4.6     2.4       3.5            2.4              2.4
3.9    100   900    7.3    15.9       7.3            9.2              6.4
3.10   100  1000    6.8     1.5       1.5            1.5              1.5
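Steps 0–2 of the basic greedy procedure might look as follows in code. This is a sketch under assumed data structures (all names are illustrative); the row pre-ordering by |Ni| described in the text is replaced here by a direct minimization over the uncovered rows:

```python
import math

# Sketch of the greedy prime-cover heuristic (Steps 0-2), with the
# evaluation function f passed in; f3 is function (iii), c_j / log2 k_j.
# Data layout assumed: N_rows[i] = columns of row i, M_cols[j] = rows of
# column j, c[j] = cost of column j.

def f3(c, k):
    return c / (math.log2(k) if k > 1 else 1.0)

def greedy_prime_cover(N_rows, M_cols, c, f=f3):
    R = set(N_rows)                      # Step 0: rows not yet covered
    S = []                               # support of the cover, in order
    while R:                             # Step 1
        i_star = min(R, key=lambda i: len(N_rows[i]))
        jt = min(N_rows[i_star], key=lambda j: f(c[j], len(M_cols[j] & R)))
        S.append(jt)
        R -= M_cols[jt]
    for j in list(S):                    # Step 2: make the cover prime
        rest = set(S) - {j}
        if all(N_rows[i] & rest for i in N_rows):
            S.remove(j)
    return set(S)
```

With rows N1 = {1, 2}, N2 = {2, 3}, N3 = {3} and costs c = (2, 1, 1), this returns the prime cover {2, 3}.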
The general algorithm discussed above at times fixes variables at 0. Whenever some variable belonging to the current cover gets fixed at 0, we have to generate a new cover. Rather than starting from scratch, in such situations we start with the partial cover at hand, and complete the cover by using the procedure discussed above. This version of the heuristic we call PRIMAL 2.

When the dual heuristics to be discussed in the next section produce a vector (u, s) such that (1) does not hold for S = S(x), where x is the current cover, either the dual solution u must be weakened (see the next section), or else the cover x must be replaced by another one, say x̄, such that (1) is satisfied for S = S(x̄). PRIMAL 3 was designed to accomplish this starting with the cover at hand. It introduces into the cover additional variables xj such that sj > 0, in order of increasing values of f (as defined in (iii)), and removes variables in order of increasing values of sj, so as to increase the left-hand side of (1) while keeping the cover prime. While it is not guaranteed to succeed, the percentage of failure is sufficiently low to justify the use of PRIMAL 3. When it fails, the dual solution u must be weakened as explained in the next section.

Whenever a new cut is generated, the last cover satisfies all the constraints except for the newly added one. Since it is much cheaper to obtain a new cover from the old one than to construct a new one from scratch, a special heuristic was implemented for this purpose. PRIMAL 4 adds to the current cover a number σ of variables with positive coefficients in the cut just generated, chosen in order of increasing cj, and then removes redundant variables from the cover so as to make it prime. Computational experiments with σ = 1, ..., 5 have shown σ = 2 to give the best results.
Finally, every time we apply the subgradient method to obtain an improved lower bound, we also generate a new cover by using the reduced costs sj produced by that procedure. This is done by subroutine PRIMAL 5, by setting xj = 1 if sj = 0, and xj = 0 otherwise. The resulting vector either is a cover, or else, if row i is uncovered, then sj > 0, ∀j ∈ Ni, and the variable ui can be increased to ui + min{sj | j ∈ Ni}. This creates at least one new reduced cost sk equal to 0, and for each such k we set xk = 1. We proceed this way until every row is covered, after which we make sure the cover is prime, as in Step 2 of PRIMAL 1. PRIMAL 5 produces consistently better covers (hence better upper bounds) than any of the other four procedures; but obtaining the reduced costs by subgradient optimization is many times more expensive than producing them by the dual heuristics, as will be discussed below.

While the conditions for using PRIMAL 2, 3 and 5 have been spelled out, the choice between PRIMAL 1 and 4 seems open at this point. PRIMAL 4 is computationally cheaper, but it often produces a cover that differs very little from the previous one. PRIMAL 1 is more expensive, but yields a genuinely new cover. We follow the strategy of using PRIMAL 1 to obtain the first cover, then using PRIMAL 4, except for every θ-th iteration, when we again use PRIMAL 1. As to the value of θ, computational experiments have led us to start with
θ = 1 and then set θ ← min{θ + 1, 7}. In other words, at the beginning we return to PRIMAL 1 more frequently, then at regular intervals of, say, 7 iterations.
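The core of PRIMAL 5 can be sketched in a few lines of Python (an illustrative reconstruction under the row-set data layout assumed above, omitting the final step that makes the cover prime):

```python
def primal5(rows, s):
    """Illustrative sketch of PRIMAL 5: start from the columns with zero
    reduced cost; for every row still uncovered, raise its dual variable
    by the smallest reduced cost in the row, which creates at least one
    new zero and hence at least one new cover column.
    (The final step of making the cover prime is omitted here.)"""
    s = list(s)                                   # reduced costs, mutated below
    cover = {j for j, sj in enumerate(s) if sj == 0}
    for Ni in rows:
        if not (Ni & cover):                      # this row is uncovered
            delta = min(s[j] for j in Ni)         # raise u_i by delta
            for j in Ni:
                s[j] -= delta
            cover |= {j for j in Ni if s[j] == 0}
    return cover
```

Each pass over an uncovered row only adds columns, so rows treated earlier stay covered and a single sweep suffices.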
4. Dual heuristics
The purpose of the dual heuristics is to find, at a low computational cost, a feasible solution to the dual linear program (D) associated with the current problem, with as high an objective function value (lower bound on the value of (D), hence of (SC)) as possible. In addition, the dual solution u and its associated reduced cost vector s = c − uA have to satisfy condition (1) for S = S(x), where x is the current cover. Again, we use a procedure of the "greedy" type, which considers the variables in some prescribed order and assigns to each one the maximum value that can be assigned without violating some constraint or changing some earlier value assignment. Since it is known (see [2], Theorem 4) that for a feasible solution u to (D), the vector s = c − uA satisfies (1) for S = S(x) if u satisfies u(Ax − e) = 0, in considering the variables ui priority is given to i ∈ T(x), where T(x) = {i ∈ M | aix = 1}. We recall at this point that M (the row set) is an ordered set, where the rows of the initial constraint set come first, in order of decreasing |Ni|, while the cut-rows come next in the order of their generation. Denoting by R the index set of the dual variables (rows) not yet assigned a value (ordered in accordance with M), the basic procedure is as follows.
Step 0. Set R = M ∩ T(x), s = c, t = 1, and go to 1.

Step 1. If R = ∅, go to 2. Otherwise, let I ⊆ R, choose i(t) such that

|Ni(t)| = min {|Ni| : i ∈ I},

and let ui(t) = min {sj : j ∈ Ni(t)}. Then set

sj ← sj − ui(t), j ∈ Ni(t);  sj ← sj, j ∈ N \ Ni(t);

R ← R \ {i(t)}, t ← t + 1, and go to 1.

Step 2. If Step 2 is entered for the first time, set R ← M \ T(x) and go to 1. Otherwise, stop: u is a feasible solution to (D).

Restricting the choice of i(t) to a subset I of R has the sole purpose of making this choice computationally less costly, at the risk of sacrificing some quality.
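The greedy dual procedure above can be sketched as follows (an illustrative Python reconstruction; for simplicity we let I = R rather than the cheaper-to-scan subset the paper uses, and `tight` stands for T(x)):

```python
def dual1(rows, cost, tight):
    """Illustrative sketch of the greedy dual heuristic DUAL 1.  `tight`
    is T(x), the set of rows satisfied with equality by the current
    cover; these rows get priority.  Each row's dual variable receives
    the largest value that keeps all reduced costs nonnegative."""
    s = list(cost)                 # reduced costs s = c - uA, updated in place
    u = [0] * len(rows)
    others = set(range(len(rows))) - set(tight)
    # tight rows first, then the rest; within each group, fewest columns first
    order = sorted(tight, key=lambda i: len(rows[i])) + \
            sorted(others, key=lambda i: len(rows[i]))
    for i in order:
        u[i] = min(s[j] for j in rows[i])   # largest value keeping s >= 0
        for j in rows[i]:
            s[j] -= u[i]
    return u, s
```

By construction every sj stays nonnegative, so u is feasible for (D) and ue is a valid lower bound on the optimum of (SC).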
46
E. Balas, A. Ho/ Set covering problems: C9mputation
Since the rows of the original matrix A are ordered according to decreasing |Ni|, and the cuts generated also tend to have progressively fewer 1's, we define I as the union of (i) the last element of M0 ∩ R, where M0 is the row index set of the original constraint matrix, and (ii) the last λ elements of R, where λ = min{|R|, √|M0|}. The row selected this way usually achieves, or comes close to, the minimum of |Ni| over all i ∈ R. This procedure we call DUAL 1.

A feasible solution to the current (D) remains feasible after adding a new cut to (SC), i.e., a new column to the constraint set of (D), but usually ceases to be feasible if the new variable is assigned a positive value. On the other hand, if the new variable is set to 0, the solution remains unchanged. Furthermore, if a new solution is generated from scratch by the same heuristic, it is often identical to the previous one. DUAL 2 is a version of the heuristic that starts with the infeasible solution obtained from the last feasible solution u by assigning a value of 1 to the new variable associated with the last cut. This guarantees that the dual solution to be obtained will differ from all the previous ones. Next, the remaining variables are considered in a certain order, and set to 0 (or, in the case of the last variable, to the maximum allowable value), until the solution becomes feasible. The order in which the variables ui are considered is that of decreasing number of positive coefficients in constraints of (D) that are violated, with priority given to variables ui corresponding to primal constraints that are oversatisfied by the current cover.

Finally, a vector (u, s) generated by DUAL 1 or DUAL 2 may violate condition (1) for S = S(x), where x is the current cover. In such cases the algorithm goes to PRIMAL 3, in an attempt to find a new cover x̄ such that (1) holds for S = S(x̄).
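The repair step at the heart of DUAL 2 can be sketched as below (an illustrative reconstruction, not the authors' code; the extra priority given to oversatisfied primal rows is omitted, and a dense 0/1 matrix A is assumed):

```python
def dual2_repair(A, c, u, new_row):
    """Illustrative sketch of the DUAL 2 idea: give the dual variable of
    the newly added cut the value 1, then restore feasibility of uA <= c
    by zeroing other positive variables, choosing at each step the one
    with the most positive coefficients in violated dual constraints."""
    m, n = len(A), len(c)
    u = list(u)
    u[new_row] = 1                      # forces a solution not seen before

    def violated():
        return [j for j in range(n)
                if sum(u[i] * A[i][j] for i in range(m)) > c[j]]

    V = violated()
    while V:
        # variable appearing most often in the violated constraints
        i = max((i for i in range(m) if i != new_row and u[i] > 0),
                key=lambda i: sum(A[i][j] for j in V))
        u[i] = 0
        V = violated()
    return u
```

Since the loop only ever zeroes variables, it terminates; with 0/1 coefficients and costs of at least 1, the solution with only the new variable positive is always feasible, so a feasible u is reached.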
Though our computational experience has been that PRIMAL 3 seldom fails to find such a cover, failure does sometimes occur. In such cases we use DUAL 3 to modify the solution u, while weakening it as little as possible, so as to satisfy (1). DUAL 3 considers the variables ui, i ∈ M \ T(x), in decreasing order of |Ni ∩ S(x)|, and successively reduces their values until (1) is satisfied for S = S(x). Since this latter condition always holds when all ui, i ∈ M \ T(x), are set to 0, the procedure always ends with a solution having the required property.

While DUAL 3 is used under clearly defined circumstances, DUAL 1 and DUAL 2 can be used intermittently. DUAL 2 is computationally cheaper than DUAL 1 and guarantees a new solution, but DUAL 1 is more likely to produce an improved lower bound. We start with DUAL 1 and then use DUAL 2, except for every μ-th iteration, when we again use DUAL 1. Based on some computational experimentation, we set μ = 4.
5. Subgradient optimization

While the dual heuristics provide reasonably good feasible solutions to (D) at a low computational cost, a sharper lower bound could of course be obtained by
solving to optimality the linear program (D). After sufficiently many cuts have been added to the constraint set of (SC), the value of the current problem (D) may exceed zU, thus bringing the procedure to an end. However, the computational effort involved in solving (D) by the simplex method is considerable, and increases about quadratically with the number of cuts added to the constraint set of (SC). On the other hand, one can use subgradient optimization to find a near-optimal solution to (D) at a computational cost that seems to increase only linearly with the number of cuts added. This is what we do periodically in order to generate lower bounds stronger than those obtained by the dual heuristics. Though the subgradient method consistently produces a stronger bound than the heuristics, the cuts derived from the dual solution obtained this way tend to be weaker than those derived from the heuristically generated dual solutions. This is so because the reduced costs obtained by the subgradient method do not usually satisfy (1), and the dual solution u (together with the bound ue) has to be weakened in order to get (1) satisfied.

The subgradient method that we use is a specialization of the general procedure discussed, for instance, in [6] or [5]. We wish to find or approximate

max_{u∈F} min_{x∈G} L(x, u) = cx + ue − uAx,     (5)
where F and G are suitable relaxations (supersets) of the feasible sets of (D) and (L) respectively, i.e.,

F ⊇ {u | uA ≤ c, 0 ≤ ui ≤ ūi, ∀i}

and

G ⊇ {x | Ax ≥ e, 0 ≤ xj ≤ 1, ∀j},

where ūi = min {cj : j ∈ Ni}, ∀i.
During every subgradient iteration, given the current vector u^k, we solve the problem

min_{x∈G} L(x, u^k),     (6)

and if x(u^k) denotes an optimal solution to (6), we put

d^k = e − Ax(u^k),  u^{k+1} = P_F(u^k + t_k d^k).
Here the direction vector d^k is a subgradient of L(x, u) at u = u^k, the scalar t_k is the step length, while P_F(g) is the projection of the vector g on F. We take the relaxation of the feasible set of (D) to be F = {u | 0 ≤ u ≤ ū}, and we use two different relaxations G of the feasible set of (L). The first one,

G1 = {x ∈ R^n | 0 ≤ xj ≤ 1, j ∈ N},
makes the solution of (6) trivial: the optimum is attained at

xj(u^k) = 0 if u^k aj < cj;  xj(u^k) ∈ [0,1] if u^k aj = cj;  xj(u^k) = 1 if u^k aj > cj.
For j ∈ N such that u^k aj = cj, where any xj ∈ [0,1] is optimal, we have tried setting xj = 0, ½ and 1; xj = 1 gave on the average slightly better results, so this is what we use.

For the second relaxation, we choose a (maximal) subset M̃ ⊆ M such that Ni ∩ Nk = ∅, ∀i, k ∈ M̃, i ≠ k, and define

G2 = {x ∈ R^n | Σ_{j∈Ni} xj ≥ 1, i ∈ M̃;  Σ_{j∈N} dj xj ≥ d0;  0 ≤ xj ≤ 1, j ∈ N},

where dj = |Mj| − 1 if j ∈ Ni for some i ∈ M̃, dj = |Mj| otherwise, and d0 = |M \ M̃|. The idea of using the inequalities defined by a family of disjoint subsets Ni ⊆ N, i ∈ M̃, is borrowed from J. Etcheberry [4]. The extra inequality that we add, which is the sum of the remaining inequalities of Ax ≥ e, usually makes G2 considerably more constrained. While G2 is a tighter relaxation than G1, it is also one for which solving (6) is considerably more expensive. We therefore restrict ourselves, when using G2, to approximating the optimum of (6) by a fast heuristic. In this version, the subgradient procedure using G2 is about 1.2–1.3 times more time consuming than the one using G1, but it also tends to be more reliable and to occasionally produce a slightly better bound. A comparison on 5 randomly generated 200 × 2000 problems (with 8000 nonzero matrix entries) showed the version using G2 to generate 0.61 times the number of nodes (of the search tree) generated by the version using G1, and to require 0.85 times the total time required by the latter. Currently the main version of our algorithm uses G2.

To start the subgradient optimization procedure, one needs an initial solution u^0. We use for this purpose the vector u obtained from the dual heuristics when we apply SGRAD the first time to a problem; at subsequent applications of SGRAD we use as u^0 the dual solution obtained in the last application of SGRAD, which is usually considerably better than the one obtained from the dual heuristics. The quality of the starting solution apparently makes a great difference in the computational effort involved in SGRAD: the first application of SGRAD takes about 6 times the computational effort required by subsequent applications to the same problem (amended with cuts).

As to the overall usefulness of the subgradient method in the algorithm, our
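A single iteration of the method over the simpler relaxation G1 can be sketched as follows (a numpy-based illustration, not the authors' code; the tie rule xj = 1 follows the choice reported above, and the step-length and stopping rules are omitted):

```python
import numpy as np

def subgradient_step(A, c, u, u_bar, t_k):
    """One subgradient iteration over the relaxation G1 (illustrative
    sketch).  A is the 0/1 constraint matrix, c the cost vector, u the
    current dual vector, u_bar the upper bounds defining F, t_k the
    step length."""
    x = (u @ A >= c).astype(float)        # minimizes L(x,u) over G1; ties -> 1
    d = 1.0 - A @ x                       # subgradient d = e - Ax(u)
    bound = c @ x + u @ d                 # L(x(u), u): a lower bound on (SC)
    u_next = np.clip(u + t_k * d, 0.0, u_bar)   # projection P_F onto F
    return u_next, bound
```

The projection onto F is just a componentwise clamp to [0, ū], which is what makes each iteration so cheap compared with a simplex step.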
experience has been that though it is computationally more expensive than the dual heuristics by one and often two orders of magnitude, subgradient optimization nevertheless pays off. On the one hand, it produces consistently better lower bounds than the heuristics, by a margin that tends to increase with problem size; on the other, it provides a set of reduced costs that can be used by PRIMAL 5 to generate consistently better covers, and hence better upper bounds, than the other primal heuristics. These findings are illustrated in Table 2. The problems listed there are all randomly generated, 2% density set covering problems, described in more detail in section 8. Table 3 shows the average time needed for one application of the heuristic DUAL 1 or 2, and one application of SGRAD, with the averages taken over all applications to all of the 23 problems of Table 2. The comparison shows SGRAD to be about 27 times (3.44 : 0.127 ≈ 27) as time consuming as DUAL 1 or 2. The factor 27 is, however, an average: as mentioned earlier, the first application of SGRAD to a problem is about 6 times as expensive as subsequent applications to the same problem (with added inequalities); in particular, the first application of

Table 2
Improvement in bounds (in %) due to SGRAD and PRIMAL 5

       Problem data    Improvement (%)
No.    m      n        Lower bound:          Upper bound:             SGRAD iterations to
                       SGRAD vs. DUAL 1     PRIMAL 5 vs. PRIMAL 1    obtain lower bound
2.2    200     413      8.18                 1.84                     172
2.3    200     300      3.37                 3.02                     119
2.4    200     500     11.27                 4.04                      78
3.1    100     100      1.21                 1.53                      41
3.2    100     200      1.57                 3.69                      23
3.3    100     300      4.90                 3.05                      82
3.4    100     400      6.07                 9.30                      80
3.5    100     500      0.61                 9.08                      44
3.6    100     600      7.39                 3.65                     178
3.7    100     700      9.35                 3.63                     139
3.8    100     800     15.13                 3.15                     164
3.9    100     900      2.87                 6.77                     263
3.10   100    1000      6.58                 2.11                      93
5.1    200    2000     13.82                 8.90                      93
5.2    200    2000     15.98                 0.00                     176
5.3    200    2000     15.90                 6.22                      96
5.4    200    2000     14.95                 0.78                     148
5.5    200    2000     15.90                 6.19                      97
5.6    200    2000     12.38                 5.31                      99
5.7    200    2000     19.18                 0.97                     146
5.8    200    2000     16.25                 7.89                     183
5.9    200    2000      8.74                 3.37                     123
5.10   200    2000     12.29                 6.03                      92
Table 3
Average time per application of DUAL 1 or 2 and SGRAD

              No. of times used^a   Average time for one application   Total time^b spent
DUAL 1 or 2   1,638                 0.127                              207.4
SGRAD           181                 3.44                               623.3

^a Total for all 23 problems of Table 2.
^b DEC 20/50 seconds.
Table 4
Cutting plane algorithm with and without SGRAD

       Problem data     Without SGRAD             With SGRAD^a
No.    m      n         No. of cuts   Time^b      No. of cuts   Time^b
1.1    15     32           3            0.30       0             0.73
1.2    30     30          12            0.58       0             0.77
1.3    30     40          14            0.77      18             1.28
1.4    30     50          28            1.44      10             1.57
1.5    30     60         115            9.44       0             0.76
1.6    30     60          77            4.75      29             2.00
1.7    30     70        >438         >120.00       0             1.27
1.8    30     70        >480         >120.00       0             0.87
1.9    30     80         211           23.35       0             1.12
1.10   30     80          47            3.40       6             1.23
1.11   30     90        >406         >120.00       0             1.16
1.12   30     90        >496         >120.00      72             6.60

^a SGRAD applied at first and every 10-th iteration.
^b DEC 20/50 seconds.
SGRAD requires about 100 times more time than an application of DUAL 1 or 2, while subsequent applications to amended versions of the same problem require on the average 15 times as much time as DUAL 1 or 2. Finally, to support our contention that in spite of these great time differences the use of SGRAD pays off, we show in Table 4 the outcome of the cutting plane procedure with and without SGRAD, on a set of 12 set covering problems taken from the literature and described in section 7. These problems, all of which except for 1.1 have unit costs, were particularly hard for the cutting plane algorithm without SGRAD, which failed to solve 4 of them within a time limit of 2 minutes.
6. Fixing variables and generating cuts

Every time a new solution u to (D) is obtained, either by one of the heuristics or by subgradient optimization, the algorithm searches for variables xj such that sj ≥ zU − ue, and fixes them at 0, removing the corresponding indices from N. Intuitively, one would be inclined to think that this feature of the algorithm becomes operative only after many iterations, when the gap zU − zL has been narrowed down considerably. This, however, was not the case on the randomly generated problems that we solved. Substantial numbers of variables were usually fixed at 0 quite early in the procedure, and by the time the first branching occurred, the number of variables left was almost always close to the number m of initial constraints, as can be seen from the data of Section 8.

To generate a cut, the algorithm uses a subroutine that implements the procedure discussed in [2]. Let x be the current cover, let S(x) and T(x) be defined as above, and let zU be the current upper bound.

Step 0. Set W = ∅, S = {j ∈ S(x) | sj > 0}, y = ue, t = 1, and go to 1.

Step 1. Let

vt = min {max {sj : j ∈ S}, min {sj : j ∈ S, sj ≥ zU − y}},
J = {j ∈ S | sj = vt},
Q = {j ∈ N | sj ≥ vt},
MJ = ∪_{j∈J} Mj.

Choose i(t) such that

|Ni(t) \ (Q ∪ W)| = min {|Ni \ (Q ∪ W)| : i ∈ T(x) ∩ MJ},

and let {j(t)} = J ∩ Ni(t). Then set W ← W ∪ (Ni(t) \ Q), y ← y + sj(t). If y ≥ zU, go to 2. Otherwise set S ← S \ {j(t)},

sj ← sj − sj(t) if j ∈ Q ∩ Ni(t);  sj ← sj otherwise;

t ← t + 1, and go to 1.

Step 2. Add to (SC) the inequality Σ_{j∈W} xj ≥ 1.
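The reduced-cost fixing test described at the beginning of this section is simple enough to state in a few lines (an illustrative sketch; the names are ours):

```python
def reduced_cost_fixing(s, z_upper, dual_value, free_cols):
    """Any free column whose reduced cost satisfies s_j >= z_U - ue
    cannot belong to a cover better than the incumbent, so it is fixed
    at 0; the function returns the set of columns that remain free."""
    gap = z_upper - dual_value            # z_U - ue
    return {j for j in free_cols if s[j] < gap}
```

This is why the bound ue matters even when it does not close the gap: the larger it is, the more columns the test removes.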
This procedure terminates, after a number of iterations equal to the number of j ∈ S(x) such that sj > 0, with an inequality satisfied by every cover better than the one associated with zU, and violated by the last cover x. Typically, the cuts tend to become successively stronger during the procedure, the last few cuts often having just one or two 1's. The total number of cuts required to solve an
m × n problem increases with both m and n. For the randomly generated sparse problems solved in our experiments the number of cuts needed was typically less than 3m or n/3, as can be seen from the results of Section 7. This of course refers to the number of cuts required when the cuts are used within the framework of an algorithm in the class discussed here, which also uses implicit enumeration. The cuts by themselves, without branching (but with periodic use of SGRAD), were able to solve all of the 20 test problems from the literature that we could obtain, and all but one of the 10 test problems with m = 100 and 100 ≤ n ≤ 1000 that we generated. As to the larger problems, six of the ten 200 × 1000 problems and four of the ten 200 × 2000 problems were solved by cutting planes only, without branching (see Section 7 for details).

7. Branching and node selection

As mentioned in Section 2, we branch whenever the gap zU − zL decreases by less than 0.5 during a sequence of 4α iterations, where α is the frequency of applying SGRAD (every α-th iteration). We use two branching rules intermittently. The first one is based on a disjunction (3), which is strengthened to (7) when used for branching, so as to partition the feasible set:

⋁_{i=1,...,p} ( xj = 0, j ∈ Qi;  Σ_{j∈Qk} xj ≥ 1, k = 1, ..., i − 1 ).     (7)
The sets Qi for the disjunction (7) are constructed by a procedure similar to Step 1 of the cut generating routine. The use of the same criterion (of minimizing |Ni(t) \ Qi|) as in the cut generating routine is motivated by the attempt to guarantee that each subproblem created by the branching will have at least one inequality with as few 1's as possible.

The second branching rule is a variation of the one proposed by J. Etcheberry [4]. We choose two row indices i, k ∈ M, such that i is the last element in the ordered set M, and k ≠ i, with

|Nk ∩ Ni| = min {|Nh ∩ Ni| : Nh ∩ Ni ≠ ∅},

and then branch on the disjunction

(xj = 0, j ∈ Ni ∩ Nk)  ∨  (Σ_{j∈Ni∩Nk} xj ≥ 1).     (8)

Whenever |Ni ∩ Nk| = 1, which is usually the case, the second term becomes xj = 1, where {j} = Ni ∩ Nk, and (8) becomes a special case of the usual branching dichotomy (xj = 0) ∨ (xj = 1). However, a comparison of the rule based on (8) with the usual branching dichotomy, combined with a choice rule for the branching variable different from the one used here, has shown (8) to be on the average considerably better.
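The row choice for rule 2 can be sketched as follows (an illustrative Python fragment under the same row-set layout as before; it assumes at least one other row intersects the last row):

```python
def rule2_row_pair(rows):
    """Illustrative sketch of the row choice for branching rule 2: take
    the last row i of the ordered set M and the row k whose column set
    has the smallest nonempty intersection with N_i."""
    i = len(rows) - 1
    k = min((h for h in range(i) if rows[h] & rows[i]),
            key=lambda h: len(rows[h] & rows[i]))
    return i, k, rows[i] & rows[k]
```

When the returned intersection is a singleton {j}, disjunction (8) reduces to the usual dichotomy xj = 0 or xj = 1.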
The choice between rule 1 (using the disjunction (7)) and rule 2 (using the disjunction (8)) is dictated by the following considerations. Since neither of the two rules dominates the other, i.e., at certain nodes it may be more advantageous to use rule 1, whereas at other nodes (of the same search tree) rule 2 may be preferable, we introduce a measure of the relative efficiency of rule 1 (as compared to the usual dichotomy), and then choose rule 1 whenever it meets the efficiency requirements. With the traditional dichotomy, breaking up the feasible set into p = 2^k subsets (all on the k-th level of the search tree) makes it possible to fix a total of f = p log2 p variables. Therefore, in order to use branching rule 1, i.e., a disjunction (7) which breaks up the feasible set into p subsets, we require that

(i) Σ_{i=1,...,p} |Qi| > p log2 p,

i.e., that the number of variables fixed on all branches be greater than p log2 p. Besides, we have also found it useful to require that (ii) there be at most one singleton among the sets Qi, i = 1, ..., p, and that (iii) p not exceed a specified constant (which we usually set to 8). Whenever conditions (i), (ii) and (iii) are met, we use disjunction (7); otherwise we use disjunction (8). Our node selection rule is LIFO.

Table 5 contains information concerning our branching rules. The problems listed are all those among the 200 × 1000 and 200 × 2000, 2% density problems (i.e., among the problems in sets 4 and 5) whose solution required branching. For each problem, the table gives the number of branchings according to rule 1 and rule 2. Further, each branching according to rule 1 is described by a sequence of numbers in parentheses, where the length of the sequence is the number of branches created, and each number in the sequence is the number of variables fixed at 0 on the corresponding branch. Thus, for solving problem 4.4 (with m = 200, n = 1000, density 2%), the code branched 3 times, using each time rule 1 (disjunction (7)). The first branching created 5 new nodes (subproblems), with 58 variables fixed at 0 on the first branch, 53 on the second, 44 on the third, etc.

8. Computational experience with the algorithm as a whole
We have tested the algorithm as a whole on 6 sets of problems, which we now describe. The problems are labelled with two numbers separated by a dot. The first number is the set to which the problem belongs, the second one distinguishes the problems within the same set. Thus 2.3 is the third problem in set 2. Sets 1 and 2, containing 12 and 8 problems respectively, are from Salkin and Koncal [9], who account for their origin as follows. Problem 1.1 was obtained from C.E. Lemke, problem 1.8 from IBM Buffalo, and the remaining problems in
Table 5
Information on the use of the two branching rules

       Problem data    No. of branchings    No. of variables fixed on each branch,
No.    m      n        Rule 1   Rule 2      for every branching according to rule 1
4.4    200    1000     3        0           (58,53,44,15,6), (15,7,5), (15,6,5,4)
4.6    200    1000     6        1           (20,10,8,7,6,1), (24,12,10,9,5,3,1), (7,5,2,2), (14,4,2,1), (16,14,11,4,4), (6,1)
4.8    200    1000     2        2           (26,13,7,5,2,2), (8,7,1)
4.9    200    1000     6        5           (40,21,10,9,8,4,2), (18,15,10,3,2,1), (14,13,13,13,2), (15,4,2), (25,1), (10,3,1)
5.1    200    2000     5        3           (42,24,1), (15,14,14,9,3,3,2,1), (32,24,19,9,2), (8,6,2), (16,8,3,3)
5.4    200    2000     9        3           (35,33,33,27,6), (50,26,3,2), (40,6,5,2,1), (16,10,5,4,1), (20,16,10,5,1), (41,12,9,5,3), (7,7,6,2,1), (21,4,2), (30,16,5,3,2)
5.7    200    2000     2        4           (5,1), (36,11,5,5)
5.8    200    2000     4        2           (35,9,3,3,2,1), (23,8,8,3,2), (9,4,4,2), (20,13,9,5,3,3,3,2)
5.9    200    2000     1        0           (44,20,17,10,3,2)
set 1 from A.M. Geoffrion. All the problems in set 1, except for 1.1, have unit costs. They all have randomly generated coefficient matrices with 7% density. Problem 2.1 is attributed by Salkin and Koncal [9] to American Airlines, problems 2.6 and 2.7 to IBM New York, while the remaining problems in set 2 were randomly generated by H.M. Salkin. The problems in set 2 have coefficient matrices whose density varies between 2% and 11%, and randomly generated costs in the range [0,99]. Sets 3, 4, 5 contain 10 problems each, randomly generated by the second author, with coefficient matrices of 2% density, subject to the requirement that every column has at least one, and every row at least two, nonzero entries. The costs are randomly generated from the range [1,100]. Finally, set 6 contains 5 problems, also randomly generated by the second author subject to the same conditions, with costs from the same range, but with coefficient matrices of 5% density.

Table 6 compares the performance of our algorithm with two other procedures, that of Salkin and Koncal [9] and that of Lemke, Salkin and Spielberg [8], on the 20 Salkin–Koncal problems (sets 1 and 2). The procedure used by Salkin
Table 6
Comparison of algorithms on 20 problems from the literature

       Problem data    Salkin–       Lemke–Salkin–    Algorithm of section 2 (without branching)
No.    m      n        Koncal [9]    Spielberg [8]    No. of cuts    Time^c:
                       Time^a        Time^b                          Total      SGRAD
1.1    15      32        0.51           2.7             0              0.73      0.46
1.2    30      30        0.41           5.3             0              0.77      0.46
1.3    30      40        0.78          19.7            18              1.28      0.63
1.4    30      50       16.33          21.6            10              1.57      1.01
1.5    30      60        2.47          18.0^d           0              0.76      0.36
1.6    30      60       10.07          20.4^d          29              2.00      0.96
1.7    30      70        5.66          31.2^d           0              1.27      0.94
1.8    30      70        4.68          30.6^d           0              0.87      0.39
1.9    30      80        5.99         424.0             0              1.12      0.73
1.10   30      80        6.83         625.9             6              1.23      0.73
1.11   30      90       16.99         461.3             0              1.16      0.79
1.12   30      90       19.16         803.5            72              6.60      1.76
2.1   104     133        5.70         144.5            22              8.03      4.61
2.2   200     413       26.71          35.5             6             12.00      7.10
2.3   200     300       15.90          56.9             0              6.10      4.12
2.4   200     500       22.70         670.0             0              6.23      3.74
2.5    50     450     >120.00^e                         0              2.06      1.24
2.6    36     455       18.64                           0              1.76      0.55
2.7    46     683     >120.00^f                        10              5.94      3.35
2.8    50     905     >120.00^e                         0              5.12      3.34

^a UNIVAC 1108 seconds (about the same speed as DEC 20/50).
^b IBM 360/50 seconds (4–5 times slower than DEC 20/50).
^c DEC 20/50 seconds.
^d Average time for the two problems of the same size.
^e Time limit exceeded.
^f Storage limit exceeded.
and Koncal is Gomory's all-integer cutting plane algorithm, while the algorithm of Lemke, Salkin and Spielberg is a specialized implicit enumeration procedure with an imbedded linear program. Our algorithm solved each of these problems without branching, and on the larger problems of set 2 its performance was an order of magnitude better than that of the other two procedures. Note that about half of the total time (in some cases more, in others less than ½) was spent on SGRAD. The number of times the subgradient procedure was used can be calculated by dividing the number of cuts by 10 and adding 1 to the result. The rest of the time was spent on primal and dual heuristics and cut generation. Note also that 7 of the 12 problems in set 1, and 5 of the 8 problems in set 2, did not require any cuts. This does not necessarily imply that the linear programming relaxation of (SC) had an integer solution in these cases, but it does imply that the gap between the linear programming optimum and the integer optimum was less than 1. This small gap apparently did not make most of
these problems easy for the other two procedures, as evidenced by their performance on problems 1.11, 2.3, 2.4, 2.5, 2.7 and 2.8. Our procedure, however, can take advantage of the small gap due to its use of the primal heuristics.

Table 7 shows the performance of our algorithm, still without branching, on the 10 randomly generated problems of set 3. Note that 6 of these problems did not require cuts. As to the remaining 4 problems, one of them (3.5) required only 4 cuts, while the other three required large numbers of cuts, and one of them (3.8) was actually not solved within the time limit of 5 minutes. Had we used the full algorithm (with branching) on these 3 problems, the number of cuts and the time needed would in all likelihood have been smaller. However, we ran the full version of the algorithm only on problem 3.8, with the outcome that the problem was solved in 92.24 seconds, with a search tree of 30 nodes and a total of 362 cuts.

Table 8 describes the performance of our algorithm (in its complete version) on the 20 randomly generated test problems of sets 4 and 5. It shows the value Zopt of the optimum; the upper and lower bounds, as well as the number of variables left, before the algorithm first branched; the number of nodes and cuts; and, finally, the total time and the time spent on subgradient optimization. Six of the 10 problems in set 4, and 4 of the 10 problems in set 5, did not require any branching. Of those problems that did not require branching, 4 in set 4 and 2 in set 5 did not require cuts either. These 6 problems had a gap of less than 1 between the linear programming optimum and the integer optimum, and for some of them the linear programming optimum may actually be integer (since we don't use the simplex method, we do not necessarily discover this when we solve a problem). As to the remaining problems in sets 4 and 5, they were solved with a
Table 7
Algorithm of section 2 without branching

       Problem data    No. of    Time^a
No.    m      n        cuts      Total      SGRAD
3.1    100     100       0         1.21      0.57
3.2    100     200       0         1.26      0.48
3.3    100     300       0         3.04      1.85
3.4    100     400       0         3.22      2.13
3.5    100     500       4         3.95      1.51
3.6    100     600     146        42.61     19.03
3.7    100     700      59        24.03     14.38
3.8^b  100     800     682      >300.00     65.06
3.9    100     900       0        13.27     11.18
3.10   100    1000       0         8.30      6.00

^a DEC 20/50 seconds.
^b Time limit exceeded.
Table 8
Complete algorithm on 2% density problems (m = 200; n = 1000 for set 4, n = 2000 for set 5)

              Before first branching
No.    Zopt   zU      zL        No. of       No. of nodes    No. of    Time^a
                                variables    in search       cuts      Total      SGRAD
                                left         tree
4.1    429    429     429         0           1                20       31.88      24.04
4.2    512    512     512         0           1                 0       18.62      13.54
4.3    516    516     516         0           1                22       26.02      16.29
4.4    494    507     493.77    243          13               119       81.28      50.34
4.5    512    512     512         0           1                 0       13.33       8.40
4.6    560    572     556.83    258          31               580      316.22     179.28
4.7    430    430     430         0           1                 0       19.59      14.11
4.8    492    492     478.78     99          14               274      167.15     110.52
4.9    641    648     636.57    224          37               686      416.46     215.41
4.10   514    514     514         0           1                 0       22.34      16.50
5.1    253    256     250.58    204          30               473      327.89     181.20
5.2^b  307^c  315     299.32    408          51^d             625     >600.00     206.81
5.3    226    226     226         0           1                 0       26.87      15.83
5.4    242    247     240.29    258          49               765      393.22     133.47
5.5    211    211     211         0           1                15       38.73      24.31
5.6    213    213     213         0           1                10       32.71      19.47
5.7    293    296     291.02    173          15               298      248.65     152.62
5.8    288    288     286.09    125          28               413      241.42     108.72
5.9    279    281     276.21    181           7               118      140.61      94.60
5.10   265    265     265         0           1                 0       25.89      15.38

^a DEC 20/50 seconds.
^b Time limit exceeded.
^c Best solution found before exceeding time limit.
^d 51 nodes generated, of which 30 fathomed.
reasonable computational effort, in terms of the number of nodes in the search tree (never more than 50), the number of cutting planes (several hundred at most), as well as in terms of computing time (between about 20 seconds and 7 minutes), except for problem 5.2, which could not be solved within the time limit of 10 minutes. The best solution found for this problem, with a value of 307, is at most 2.33% worse than the optimum, since the lower bound of 299.32 found before the first branching occurred implies (the costs being integer) an optimum of at least 300.

From Table 8 one can see again that the subgradient procedure in most cases takes up between ½ and ⅔ of the computational effort. The time needed to solve a problem strongly depends on the number of variables left before one has to branch: there is a high positive correlation between this number and the number of nodes in the search tree. There is an even higher correlation, of course, between the number of nodes in the search tree and the total time needed to solve the problem. On the other hand, cuts are cheap to generate, and the
E. Balas, A. Ho / Set covering problems: Computation
number of cuts affects the total time mainly through the fact that after every round of cuts the subgradient procedure is applied (which in turn is costly). This can be seen, for instance, by looking at the 4 problems in set 5 that required no branching. Problems 5.3 and 5.10, which required no cuts either, took 26-27 seconds to be solved. Problems 5.6 and 5.5, which required 10 and 15 cuts respectively, required only 33 and 39 seconds respectively, i.e., about 1.24-1.45 times the time required for problems 5.3 and 5.10. The reason for this is that the subgradient procedure was applied once to each of problems 5.3 and 5.10, and twice to each of problems 5.6 and 5.5. The reason the computational effort increased by less than a factor of two for the second pair of problems is that SGRAD is most time consuming when applied for the first time to a problem, as discussed in Section 5.

As can be seen from the above computational experience, the algorithm discussed here is a reasonably reliable, efficient tool for solving large, sparse set covering problems, as well as for finding good approximate solutions to problems that are too hard to be solved exactly. However, the strength of the family of cuts from conditional bounds strongly depends on sparsity. As problem density increases, the strength of the cuts diminishes, and so does the efficiency of this algorithm, at least in the versions that we have tested. For relatively small problem sizes, the algorithm can cope well with somewhat higher density, as illustrated by problem sets 1 (7% density) and 2 (2% density), on which it clearly outperformed the other two procedures that had been tried on those problems. But for larger problems, this is unlikely to be the case. To see how fast the algorithm's performance declines with problem density, we have run the code on a set of 5 randomly generated 200 × 1000 problems with 5% density (problem set 6). The results are shown in Table 9.
Two of the 5 problems were solved within the 30-minute time limit. By no means accidentally, these two happen to be the problems with the smallest numbers of variables left before branching. For the remaining 3 problems, the best solution found is guaranteed to be within 6%, 6.4% and 5.2%, respectively, of the optimum, though it could of course be much closer. It is possible that a different version of our approach, which would generate a larger number of cuts but retain only the stronger ones, and rely more heavily on branching from a disjunction (7), would be more successful on higher-density problems. It is also possible, even highly probable, that a different backtracking rule, which would lead to earlier processing of the nodes with the best lower bounds, would provide a considerably better bound on the quality of the solution obtained when the procedure stops prematurely because of the time limit. These ideas, however, have not yet been tested.

References
[1] E. Balas, "Set covering with cutting planes from conditional bounds", MSRR No. 399, Carnegie-Mellon University (July 1976).
[2] E. Balas, "Cutting planes from conditional bounds: A new approach to set covering", Mathematical Programming Study 12 (1980) 19-36 [this volume].
[3] V. Chvátal, "A greedy heuristic for the set covering problem", Publication 284, Département d'Informatique et de Recherche Opérationnelle, Université de Montréal (1978).
[4] J. Etcheberry, "The set covering problem: A new implicit enumeration algorithm", Operations Research 25 (1977) 760-772.
[5] J.L. Goffin, "On the convergence rates of subgradient optimization methods", Mathematical Programming 13 (1977) 329-348.
[6] M. Held, P. Wolfe and H.P. Crowder, "Validation of subgradient optimization", Mathematical Programming 6 (1974) 62-88.
[7] A. Ho, "Worst case analysis of a class of set covering heuristics", GSIA, Carnegie-Mellon University (June 1979).
[8] C.E. Lemke, H.M. Salkin and K. Spielberg, "Set covering by single branch enumeration with linear programming subproblems", Operations Research 19 (1971) 998-1022.
[9] H.M. Salkin and R.D. Koncal, "Set covering by an all-integer algorithm: Computational experience", Journal of the ACM 20 (1973) 189-193.
Mathematical Programming Study 12 (1980) 61-77. North-Holland Publishing Company
ON THE SYMMETRIC TRAVELLING SALESMAN PROBLEM: SOLUTION OF A 120-CITY PROBLEM

Martin GRÖTSCHEL*
Universität Bonn, Bonn, Federal Republic of Germany
Received 3 January 1978
Revised manuscript received 5 January 1979

The polytope associated with the symmetric travelling salesman problem has been intensively studied recently (cf. [7, 10, 11]). In this note we demonstrate how the knowledge of the facets of this polytope can be utilized to solve large-scale travelling salesman problems. In particular, we report how the shortest roundtrip through 120 German cities was found using a commercial linear programming code and adding facetial cutting planes in an interactive way.

Key words: Travelling Salesman Problem, Cutting Planes, Facets, Computation.
1. Introduction

The travelling salesman problem (TSP) is one of the oldest and most intensively studied combinatorial optimization problems, but the first approach towards a solution of the TSP, due to Dantzig et al. [3], does not seem to have attracted much attention in the sequel. Their idea was to generate a "good" solution heuristically, to formulate the TSP as a linear programming problem in zero-one variables, and to try to prove optimality of the heuristically obtained tour using cutting planes. Their procedure contained "artistic" and interactive parts and did not result in a straightforward algorithm. Although they were able to solve a 49-city problem, it was not clear how good their cutting planes really were with respect to proving optimality or obtaining a good lower bound. These are some of the reasons why the branch-and-bound techniques that came up in the sixties superseded the cutting plane approach for the TSP. The various branch-and-bound algorithms (for a survey see [1]) were and still are highly successful, but it became evident that, because the computational work grows exponentially with the number of cities, there are bounds on the size of the problems solvable with these methods. S. Hong [12] seems to be one of the first to have rediscovered the appeal of the cutting plane approach. He automated some of the interactive parts of the Dantzig et al. procedure and incorporated further cutting planes. He reported good results on moderately sized problems. So did Miliotis [14], who combined the Dantzig et al. linear programming idea with branch-and-bound techniques.

* Supported by Sonderforschungsbereich 21 (DFG), Institut für Ökonometrie und Operations Research, Universität Bonn.
M. Grötschel / Solutions of a 120-city problem
Stimulated by the work of Edmonds (e.g. [5, 6]) and others, polyhedral combinatorics, i.e. the study of polyhedra associated with combinatorial optimization problems, became a field of intensive research during the last decade and soon grew into a powerful tool for the solution of combinatorial optimization problems. With regard to the travelling salesman problem, intensive studies of the facet structure of polytopes related to the TSP were carried out, cf. [2, 7-13, 15], and tremendously large classes of inequalities essential for the characterization of these polytopes were discovered. However, it also turned out that it is very unlikely that a complete description of these polytopes can ever be obtained (cf. [7]). Based on the results in [7, 10-12] concerning the symmetric travelling salesman problem, Padberg and Hong [15] developed a quite sophisticated cutting plane algorithm which clearly proved the usefulness of this approach; in particular, they solved a symmetric 318-city problem to within 0.26% of optimality, a result which seems to be far outside the range of all presently known branch-and-bound methods. This paper also aims at validating the usefulness of facetial cutting planes and presents the solution of a real-world symmetric 120-city travelling salesman problem obtained in the same interactive fashion which was used in 1954 by Dantzig et al. [3]. The only difference is that our procedure, thanks to the theoretical work in [7, 10, 11], could be based on a much better knowledge of the underlying polytope, and that better LP-routines and computers were available.

2. Notation
All graphs G = [V, E] considered are undirected and have no loops and no multiple edges. The node set V is assumed to be {1, 2, ..., n}; edges e ∈ E are denoted by {i, j} where i ≠ j. A graph on n nodes is called complete if E = {{i, j}: i, j ∈ V, i ≠ j} and will be denoted by K_n = [V, E]. A set C of k ≥ 3 edges, C = {{v_1, v_2}, {v_2, v_3}, ..., {v_{k-1}, v_k}, {v_k, v_1}} where v_i ≠ v_j if i ≠ j, is called a cycle of length k. Cycles of length n are called tours; cycles of length k < n are sometimes also called subtours. If C_1, C_2, ..., C_r are cycles such that every node is contained in exactly one cycle C_i, then the union of these cycles is called a perfect 2-matching or simply a 2-matching. Thus every tour is a 2-matching. For any W ⊂ V and F ⊂ E we use the following abbreviations:
V(F) := {i ∈ V: i is contained in an edge of F} = ⋃_{e∈F} e,
E(W) := {{i, j} ∈ E: i ∈ W, j ∈ W}.

If G = [V, E] is a graph and {x_e: e ∈ E} is a set of variables indexed by E, and if F ⊂ E, W ⊂ V, then we write

x(F) := Σ_{e∈F} x_e
and x(W) := x(E(W)).
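The abbreviations above translate directly into code; the following Python sketch (the helper names are ours and purely illustrative, not from the paper) represents a vector {x_e} as a dictionary indexed by two-element frozensets, so that x(F) and x(W) = x(E(W)) become one-line sums:

```python
from itertools import combinations

def E_of(W):
    """E(W): all edges of the complete graph with both endpoints in W."""
    return {frozenset(p) for p in combinations(sorted(W), 2)}

def x_of(x, F):
    """x(F): the sum of the variables x_e over the edge set F."""
    return sum(x.get(e, 0.0) for e in F)

def x_of_W(x, W):
    """x(W) := x(E(W))."""
    return x_of(x, E_of(W))

# A small vector on K_3, indexed by frozensets; absent edges count as 0.
x = {frozenset({1, 2}): 1.0, frozenset({2, 3}): 1.0, frozenset({1, 3}): 0.5}
print(x_of_W(x, {1, 2, 3}))  # 2.5
```

Indexing by frozensets simply mirrors the fact that the edges {i, j} of an undirected graph are unordered pairs.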
3. Polytopes related to the symmetric travelling salesman problem

The (symmetric) travelling salesman problem can be stated as follows: Given a complete graph K_n = [V, E] and edge lengths c_e ∈ R for all e ∈ E, find the shortest tour in K_n, i.e. a tour T such that Σ_{e∈T} c_e is minimal. This combinatorial optimization problem can be formulated in algebraic terms in the following way: To each edge e ∈ E we associate a variable x_e, and to each tour T ⊂ E we associate an incidence vector x^T, i.e. a vector such that

x^T_e = 1 if e ∈ T, and x^T_e = 0 otherwise.
As |E| = ½ n(n − 1) =: m, we have x^T ∈ R^m. The convex hull Q_T^n of all incidence vectors of tours is called the (symmetric) travelling salesman polytope, i.e.

Q_T^n := conv{x^T ∈ R^m: T is a tour in K_n}.   (3.1)
Hence, to each vertex of Q_T^n corresponds a tour in K_n and vice versa. If a complete characterization of Q_T^n by means of linear equations and inequalities were known, then the TSP could be solved (theoretically) through the linear program min cx, x ∈ Q_T^n. By definition Q_T^n is contained in the unit hypercube {x ∈ R^m: 0 ≤ x_e ≤ 1 for all e ∈ E}, and a tour (also a 2-matching) T has the property that every node is contained in exactly two edges of T; hence the system of equations

Ax = 2e_n,   (3.2)

where A is the node-edge incidence matrix of K_n and e_n is an n-vector of ones, must be satisfied by all incidence vectors of tours and 2-matchings. This implies that Q_T^n ⊂ Q̃_2M^n, where

Q̃_2M^n := {x ∈ R^m: Ax = 2e_n, 0 ≤ x_e ≤ 1 for all e ∈ E}.   (3.3)
It is not difficult to characterize the vertices of Q̃_2M^n: they are incidence vectors of tours, incidence vectors of 2-matchings which are not tours (so-called subtour vertices), or simply-structured fractional vertices, i.e. 0 < x_e < 1 for some e ∈ E (cf. [7]). To get a polytope closer to Q_T^n both the subtour vertices and the fractional vertices of Q̃_2M^n have to be chopped off. In order to cut off the subtour vertices Dantzig et al. [3] introduced the following subtour-elimination constraints

x(W) ≤ |W| − 1 for all W ⊂ V, 2 ≤ |W| ≤ n − 1.   (3.4)
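To see (3.4) at work on a toy instance, the sketch below (our illustrative Python, not part of the paper) enumerates all node sets W of K_6 whose subtour-elimination constraint is violated. The incidence vector of two disjoint triangles, a 2-matching that is not a tour, is cut off, while a tour's incidence vector satisfies every constraint:

```python
from itertools import combinations

def x_W(x, W):
    """x(W): sum of x_e over edges with both endpoints in W."""
    return sum(x.get(frozenset(p), 0.0) for p in combinations(sorted(W), 2))

def violated_subtour_sets(x, n):
    """All W with 2 <= |W| <= n-1 for which (3.4), x(W) <= |W|-1, fails."""
    bad = []
    for k in range(2, n):
        for W in combinations(range(1, n + 1), k):
            if x_W(x, W) > len(W) - 1 + 1e-9:
                bad.append(set(W))
    return bad

def incidence(edges):
    return {frozenset(e): 1.0 for e in edges}

# Two disjoint triangles in K_6: a 2-matching, but not a tour.
subtours = incidence([(1,2), (2,3), (1,3), (4,5), (5,6), (4,6)])
# A tour through the same 6 nodes.
tour = incidence([(1,2), (2,3), (3,4), (4,5), (5,6), (6,1)])

print(violated_subtour_sets(subtours, 6))  # [{1, 2, 3}, {4, 5, 6}]
print(violated_subtour_sets(tour, 6))      # []
```

Brute-force enumeration over all W is of course exponential; it is only meant to make the geometry of (3.4) tangible on a 6-node example.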
Defining Q ] : = {x ~ Q~M: x satisfies (3.4)},
(3.5)
it is obvious that Q_S^n has no other integer vertices apart from incidence vectors of tours; hence Q_T^n = conv{x ∈ Q_S^n: x integer}. The subtour-elimination constraints, however, do not suffice to eliminate all fractional vertices from Q̃_2M^n. Inequalities doing this were found by Edmonds [5]. He introduced the so-called 2-matching constraints

Σ_{i=0}^{k} x(W_i) ≤ |W_0| + ½(k − 1),   (3.6)

where the node sets W_0, W_1, ..., W_k ⊂ V satisfy

|W_0 ∩ W_i| = 1,   i = 1, ..., k,   (3.6.1)
|W_i| = 2,   i = 1, ..., k,   (3.6.2)
k ≥ 1 and odd.   (3.6.3)

Letting

Q_2M^n := conv{x^M ∈ R^m: M is a 2-matching in K_n} = conv{x ∈ Q̃_2M^n: x integer},   (3.7)
Edmonds [5] proved: Q_2M^n = {x ∈ Q̃_2M^n: x satisfies all inequalities (3.6)}. This result shows that the 2-matching constraints cut off all fractional vertices of Q̃_2M^n without creating new ones. By construction we have Q_T^n ⊂ Q_S^n ∩ Q_2M^n, but unfortunately equality does not hold. This means that although all fractional and subtour vertices of Q̃_2M^n are chopped off, new fractional vertices which are "more complicated" than those of Q̃_2M^n are created by the intersection of the half-spaces defined by (3.4) with those given by (3.6). Several new types of inequalities which are valid with respect to Q_T^n, i.e. Q_T^n is contained in the half-spaces defined by these inequalities or, equivalently, all incidence vectors of tours satisfy these inequalities, were proposed in the literature [2, 7, 10, 11, 13, 15]; the largest and best studied class of these are the comb inequalities (cf. [2, 7, 10, 11]): Given W_0, W_1, ..., W_k ⊂ V such that

|W_0 ∩ W_i| ≥ 1,   i = 1, ..., k,   (3.8.1)
|W_i − W_0| ≥ 1,   i = 1, ..., k,   (3.8.2)
W_i ∩ W_j = ∅,   1 ≤ i < j ≤ k,   (3.8.3)
k ≥ 3 and odd,   (3.8.4)

then one can show (cf. [10]) that
Σ_{i=0}^{k} x(W_i) ≤ |W_0| + Σ_{i=1}^{k} (|W_i| − 1) − (k + 1)/2 =: s(C)   (3.8)
is a valid inequality with respect to Q_T^n. Validity, however, is not a proper criterion for checking the "sharpness" of inequalities, i.e. their suitability as cutting planes. The concept of facets allows one to definitely judge the goodness of cutting planes. An inequality ax ≤ a_0 valid with respect to Q_T^n is called a facet of Q_T^n if dim(Q_T^n ∩ {x ∈ R^m: ax = a_0}) = dim Q_T^n − 1, and two facets ax ≤ a_0 and bx ≤ b_0 are called equivalent if Q_T^n ∩ {x ∈ R^m: ax = a_0} = Q_T^n ∩ {x ∈ R^m: bx = b_0}. As Q_T^n is not a fully-dimensional polytope (dim Q_T^n = m − n = |E| − |V|, cf. [10]) many different inequalities turn out to be equivalent with respect to Q_T^n (cf. [7]). If K is the number of different classes of equivalent facets of Q_T^n, and if we choose from each of these classes exactly one inequality a^i x ≤ a^i_0, i = 1, ..., K, then it is well known that

Q_T^n = {x ∈ R^m: Ax = 2e_n, a^i x ≤ a^i_0, i = 1, ..., K}   (3.9)
holds. Furthermore, this characterization of Q_T^n is non-redundant, i.e. if we drop any of the equations of Ax = 2e_n or any of the inequalities a^i x ≤ a^i_0, the polytope on the right-hand side of (3.9) is no longer equal to Q_T^n. These properties establish in a precise sense that facets are the best cutting planes, because only the knowledge of at least one element of each class of facets of Q_T^n renders a complete and non-redundant characterization of Q_T^n possible. This also shows that in cutting plane algorithms only facetial inequalities (if such are known) should be used; all other (non-facet) inequalities do not suffice to fully establish the polytope considered, although in some practical applications they may suffice to prove optimality. With respect to the travelling salesman polytope the following results, using quite involved proof techniques, were obtained (cf. [7, 10, 11]).
Theorem 1. Let n ≥ 6.
(a) The trivial inequalities x_e ≥ 0, x_e ≤ 1 are facets of Q_T^n for all e ∈ E.
(b) The subtour-elimination constraints x(W) ≤ |W| − 1 are facets of Q_T^n for all W ⊂ V, 3 ≤ |W| ≤ n − 3.
(c) Two different subtour-elimination constraints x(W) ≤ |W| − 1, x(W') ≤ |W'| − 1 are equivalent with respect to Q_T^n if and only if W = V − W'.
(d) All comb inequalities (3.8) are facets of Q_T^n.
(e) Two different comb inequalities

Σ_{i=0}^{k} x(W_i) ≤ s(C),   Σ_{i=0}^{h} x(W'_i) ≤ s(C')

are equivalent with respect to Q_T^n if and only if k = h, W_0 = V − W'_0, W_i = W'_i, i = 1, ..., k.
(f) Trivial inequalities, comb inequalities and subtour-elimination constraints are pairwise non-equivalent.

One can also show (cf. [7]) that the comb inequalities (3.8) contain as special cases all 2-matching inequalities (3.6) and all Chvátal-comb inequalities (cf. [2]) which define facets of Q_T^n; hence, comb inequalities are a fairly general class of facets of Q_T^n. Letting

Q_C^n := {x ∈ Q̃_2M^n: x satisfies all inequalities (3.4) and (3.8)},   (3.10)
then obviously Q_C^n ⊂ Q_T^n ⊂ Q_S^n ∩ Q_2M^n, and for larger n these inclusions are proper. By Theorem 1 we know all the facets of Q_C^n, and furthermore, Theorem 1 says that all facets of Q_C^n are also facets of Q_T^n. Although Theorem 1 does not characterize the travelling salesman polytope completely, the polytope Q_C^n seems to be quite a good approximation of Q_T^n, and it seems reasonable to use this polytope as a relaxation of the travelling salesman problem, in particular because the combinatorial structure of the facets of Q_C^n is quite simple. Theorem 1 makes it possible to design a cutting plane algorithm for the TSP using facets only and avoiding the use of inequalities which are equivalent to others or do not define facets.
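As a concrete illustration of the comb inequality (3.8), the Python sketch below (a toy example of ours, not one of the inequalities used later for the 120-city problem) computes the right-hand side s(C) and tests a point against the inequality. For a comb whose teeth all have two nodes, s(C) reduces to the 2-matching bound |W_0| + ½(k − 1), and the classic fractional point with value ½ on a triangle handle is cut off:

```python
from itertools import combinations

def x_W(x, W):
    """x(W): sum of x_e over edges of the complete graph inside W."""
    return sum(x.get(frozenset(p), 0.0) for p in combinations(sorted(W), 2))

def s_of_C(W0, teeth):
    """Right-hand side of (3.8): |W0| + sum_i (|Wi| - 1) - (k + 1)/2."""
    k = len(teeth)
    assert k >= 3 and k % 2 == 1, "a comb needs an odd number k >= 3 of teeth"
    return len(W0) + sum(len(Wi) - 1 for Wi in teeth) - (k + 1) // 2

def comb_violated(x, W0, teeth, tol=1e-9):
    lhs = x_W(x, W0) + sum(x_W(x, Wi) for Wi in teeth)
    return lhs > s_of_C(W0, teeth) + tol

# Toy comb in K_6: handle W0 = {1,2,3}, teeth {1,4}, {2,5}, {3,6}; s(C) = 4.
W0, teeth = {1, 2, 3}, [{1, 4}, {2, 5}, {3, 6}]

# Fractional point: 1/2 on two triangles, 1 on the teeth.  Every node has
# degree 2 and every subtour constraint (3.4) holds, yet the comb fails:
# lhs = 1.5 + 3 = 4.5 > 4.
x = {frozenset(e): 0.5 for e in [(1,2), (2,3), (1,3), (4,5), (5,6), (4,6)]}
x.update({frozenset(e): 1.0 for e in [(1,4), (2,5), (3,6)]})

print(s_of_C(W0, teeth))            # 4
print(comb_violated(x, W0, teeth))  # True
```

This is exactly the kind of point that survives the relaxation Q_S^n but is chopped off on the way to Q_C^n.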
4. Solution of a 120-city problem by linear programming

Having claimed that a good knowledge of the polytope Q_T^n is of high value for the solution of travelling salesman problems, we are going to demonstrate this by solving a real-world symmetric 120-city problem using the facets found as cutting planes. Our method is the same interactive procedure carried out by Dantzig et al. [3]. We did not try to mechanize the generation of cutting planes but rather found cutting planes by inspecting the solutions of relaxed linear programs and added them in an interactive way. For a discussion of the possibilities of mechanically identifying violated facets and activating them in order to cut off non-tour solutions, the reader should consult [15], where several methods of automating these procedures are described. The data of our problem were taken from [4]; the edge lengths are the road distances between every two of 120 cities of the Federal Republic of Germany (including some cities bordering Germany). For a complete listing of the cities see the Appendix. Before using cutting planes we tried to solve the problem with all branch-and-bound algorithms available to us, but none of them terminated with an optimal tour. To get a "good" estimate of the order of magnitude of the length of the optimal tour and to obtain a starting basis for our simplex procedure, we
generated tours using several heuristic methods. The shortest of the heuristically found tours had length 7091 km; by visual analysis of this tour on a map of Germany we were able to construct a tour of length 7011 km, but no other hand or machine improvements could be obtained. As every tour is a 2-matching, the shortest 2-matching (which can be obtained by a good algorithm) gives a lower bound for the length of the optimal tour. The shortest 2-matching of our 120-city problem has length 6694 km; hence we know that the shortest tour will be in the interval 6694 km to 7011 km.

We now used the linear programming cutting plane technique suggested in [3], which works as follows: Relax the TSP as far as possible, i.e. choose a polytope Q_1 that has only few facets and contains Q_T^120, and solve the linear program min cx, x ∈ Q_1. If the optimal solution x_1 of this LP is a tour, one is done; if not, choose from the facets given in Theorem 1 a set of inequalities Bx ≤ b which are violated by x_1 and add these to Q_1, thus cutting off x_1. Then solve the linear program min cx, x ∈ Q_2 = Q_1 ∩ {x: Bx ≤ b} and proceed in the same manner. In our case we started the procedure with the polytope Q̃_2M^120 (3.3), which we consider to be the coarsest meaningful relaxation of the symmetric travelling salesman problem.

After every LP-run we represented the optimal solution graphically by hand on a map. In the beginning a plotter was used, but as the number of different fractional components of the solutions increased there were not enough symbols to distinguish them and the plottings became too cluttered. Using the graphical representation of the optimal solution we looked for subtour-elimination constraints and comb inequalities to cut off the present solution and added them to the present constraint system. Drawing and searching took from 30 man-minutes in the beginning up to 3 man-hours after the last runs.
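The "look for violated constraints" step can be partly mechanized, as is done in [15]. Below is a minimal sketch of one such separation routine (our Python illustration, not the procedure actually used in this paper): when the current LP solution is integral, the connected components of its support graph immediately yield violated subtour-elimination constraints (3.4); fractional solutions require a more careful search.

```python
def support_components(x, n, tol=1e-9):
    """Connected components of the graph of edges with x_e > tol."""
    adj = {i: set() for i in range(1, n + 1)}
    for e, v in x.items():
        if v > tol:
            i, j = tuple(e)
            adj[i].add(j)
            adj[j].add(i)
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def subtour_cuts(x, n):
    """For an integral 2-matching, each component W is a cycle, so
    x(W) = |W| > |W| - 1 and W yields a violated constraint (3.4)."""
    comps = support_components(x, n)
    return [W for W in comps if len(W) < n] if len(comps) > 1 else []

# A 2-matching made of two disjoint cycles in K_7:
x = {frozenset(e): 1.0
     for e in [(1,2), (2,3), (3,1), (4,5), (5,6), (6,7), (7,4)]}
print(subtour_cuts(x, 7))  # two cuts: {1, 2, 3} and {4, 5, 6, 7}
```

Component-based separation is cheap (linear in the number of nonzero edges), which is one reason subtour constraints were the first to be automated in cutting plane codes.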
Altogether 13 LP-runs were needed, and a total number of 96 additional inequalities had to be added to Q̃_2M^120. Among these 96 cutting planes we used 36 subtour-elimination constraints (3.4) and 60 comb inequalities (3.8). The 60 (general) comb inequalities were composed of 25 2-matching inequalities, 14 Chvátal-comb inequalities and 21 other comb inequalities. Thus the polytope defined by the intersection of Q̃_2M^120 with these 96 half-spaces contains Q_T^120 and has the same optimal solution for the given 120-city problem as the LP over Q_T^120.
In the following we list all the 96 facetial inequalities we used and the value of the objective function after each run. We subdivide the inequalities into subtour-elimination constraints x(W) ≤ |W| − 1, which we give by W = {v_1, v_2, ..., v_k}, RS = |W| − 1 = k − 1, and comb inequalities

Σ_{i=0}^{k} x(W_i) ≤ s(C),

which we represent by W_0, W_1, ..., W_k and s(C).
Run 1:

Q̃_2M^120 = {x ∈ R^m: Σ_{j<i} x_ji + Σ_{j>i} x_ij = 2 for i = 1, ..., 120,
             0 ≤ x_ij ≤ 1 for 1 ≤ i < j ≤ 120}.
Minimum: 6662.5.

Run 2: Subtour-elimination constraints:
(1) W = {100, 52, 33}, RS = 2;
(2) W = {100, 91, 79, 68, 58, 52, 33}, RS = 6;
(3) W = {117, 66, 31}, RS = 2;
(4) W = {118, 49, 13}, RS = 2;
(5) W = {118, 49, 17, 13}, RS = 3;
(6) W = {116, 70, 8}, RS = 2;
(7) W = {34, 26, 4}, RS = 2;
(8) W = {119, 115, 103, 82, 51, 23, 11, 9, 3, 2}, RS = 9;
(9) W = {97, 95, 12}, RS = 2;
(10) W = {67, 62, 37}, RS = 2;
(11) W = {104, 99, 84, 36, 35, 10}, RS = 5;
(12) W = {104, 99, 84, 36, 35, 10, 6}, RS = 6;
(13) W = {119, 116, 115, 114, 112, 110, 109, 106, 104, 103, 102, 101, 99, 97, 95, 93, 89, 88, 84, 83, 82, 80, 77, 73, 71, 70, 67, 64, 63, 62, 57, 55, 54, 53, 51, 48, 47, 39, 37, 36, 35, 34, 27, 26, 23, 21, 12, 11, 10, 9, 8, 6, 5, 4, 3, 2}, RS = 55.
Minimum: 6883.5.

Run 3: Subtour-elimination constraints:
(14) W = {114, 112, 110, 106, 104, 102, 101, 99, 89, 84, 83, 73, 67, 62, 57, 55, 48, 47, 37, 36, 35, 10, 6}, RS = 22;
(15) W = {109, 97, 95, 93, 88, 77, 64, 63, 53, 39, 27, 21, 12, 5}, RS = 13;
(16) W = {109, 97, 95, 93, 88, 77, 64, 63, 53, 39, 21, 12, 5}, RS = 12;
(17) W = {120, 92, 32, 30, 29, 28}, RS = 5;
(18) W = {120, 92, 32, 30, 29}, RS = 4;
(19) W = {105, 74, 72, 40}, RS = 3;
(20) W = {105, 72, 40}, RS = 2;
(21) W = {117, 85, 66, 31, 22}, RS = 4;
(22) W = {117, 85, 66, 31, 22, 18}, RS = 5;
(23) W = {100, 91, 79, 68, 58, 52, 43, 33}, RS = 7.
Comb inequalities:
(24) W0 = {85, 22, 18}, W1 = {117, 85}, W2 = {66, 22}, W3 = {19, 18}, s(C) = 4;
(25) W0 = {120, 92, 28}, W1 = {120, 29}, W2 = {92, 32}, W3 = {45, 28}, s(C) = 4;
(26) W0 = {104, 99, 10}, W1 = {104, 36}, W2 = {99, 62}, W3 = {35, 10}, s(C) = 4;
(27) W0 = {53, 27, 5}, W1 = {80, 27}, W2 = {64, 53}, W3 = {63, 5}, s(C) = 4;
(28) W0 = {89, 55, 48, 47}, W1 = {89, 55, 6}, W2 = {102, 48}, W3 = {71, 47}, s(C) = 6.
Minimum: 6912.5.
Run 4: Subtour-elimination constraints:
(29) W = {109, 97, 95, 93, 88, 77, 64, 53, 21, 12}, RS = 9;
(30) W = {119, 115, 114, 112, 110, 109, 106, 104, 103, 102, 101, 99, 97, 95, 93, 89, 88, 84, 83, 82, 80, 77, 73, 71, 67, 64, 63, 62, 57, 55, 53, 51, 48, 47, 39, 37, 36, 35, 27, 26, 23, 21, 12, 11, 10, 9, 6, 5, 4, 3, 2}, RS = 50;
(31) W = {113, 107, 69}, RS = 2.
Comb inequalities:
(32) W0 = {73, 62, 57}, W1 = {114, 73}, W2 = {83, 57}, W3 = {62, 37}, s(C) = 4;
(33) W0 = {118, 113, 107, 98, 69, 65, 50, 49, 46, 44, 20, 17, 13}, W1 = {98, 42}, W2 = {68, 65}, W3 = {75, 44}, s(C) = 14;
(34) W0 = {118, 113, 107, 69, 65, 49, 20, 17, 13}, W1 = {98, 17}, W2 = {68, 65}, W3 = {46, 20}, s(C) = 10;
(35) W0 = {119, 72, 40, 38, 34, 4}, W1 = {105, 72, 40}, W2 = {119, 103}, W3 = {116, 34}, W4 = {38, 7}, W5 = {26, 4}, s(C) = 9.
Minimum: 6918.75.
Run 5: Subtour-elimination constraints:
(36) W = {93, 64, 53}, RS = 2;
(37) W = {118, 98, 49, 42, 17, 13}, RS = 5;
(38) W = {104, 99, 89, 84, 55, 36, 35, 10, 6}, RS = 8;
(39) W = {114, 112, 110, 106, 104, 99, 89, 84, 83, 73, 67, 62, 57, 55, 37, 36, 35, 10, 6}, RS = 18.
Comb inequalities:
(40) W0 = {118, 98, 50, 49, 46, 42, 41, 20, 17, 13}, W1 = {50, 46, 44}, W2 = {56, 41}, W3 = {107, 20}, s(C) = 12;
(41) W0 = {118, 113, 107, 98, 69, 65, 50, 49, 46, 44, 42, 41, 20, 17, 13}, W1 = {75, 44}, W2 = {68, 65}, W3 = {56, 41}, s(C) = 16;
(42) W0 = {119, 116, 103, 72, 71, 70, 54, 40, 38, 34, 26, 8, 4}, W1 = {119, 115, 103, 82, 51, 23, 11, 9, 3, 2}, W2 = {105, 72, 40}, W3 = {71, 47}, W4 = {90, 54}, W5 = {38, 7}, s(C) = 24;
(43) W0 = {119, 116, 105, 103, 72, 71, 70, 54, 40, 38, 34, 26, 8, 4}, W1 = {119, 115, 109, 103, 97, 95, 93, 88, 82, 80, 77, 64, 63, 53, 51, 39, 27, 23, 21, 12, 11, 9, 5, 3, 2}, W2 = {105, 74}, W3 = {71, 47}, W4 = {90, 54}, W5 = {38, 7}, s(C) = 39;
(44) W0 = {71, 54, 8}, W1 = {116, 70, 8}, W2 = {71, 47}, W3 = {90, 54}, s(C) = 5.
Minimum: 6928.
Run 6: Subtour-elimination constraints:
(45) W = {118, 113, 107, 105, 98, 87, 75, 74, 72, 69, 65, 56, 50, 49, 46, 44, 42, 41, 40, 38, 20, 17, 14, 13, 7}, RS = 24;
(46) W = {120, 118, 117, 113, 108, 107, 105, 100, 98, 94, 92, 91, 87, 86, 85, 81, 79, 78, 76, 75, 74, 72, 69, 68, 66, 65, 61, 59, 58, 56, 52, 50, 49, 46, 45, 44, 43, 42, 41, 40, 38, 33, 32, 31, 30, 29, 28, 25, 22, 20, 19, 18, 17, 16, 15, 14, 13, 7, 1}, RS = 58.
Comb inequalities:
(47) W0 = {68, 65, 13}, W1 = {91, 68}, W2 = {69, 65}, W3 = {49, 13}, s(C) = 4;
(48) W0 = {90, 71, 54, 8}, W1 = {96, 90, 54}, W2 = {116, 70, 8}, W3 = {71, 47}, s(C) = 7;
(49) W0 = {89, 55, 48, 47}, W1 = {104, 99, 89, 84, 55, 36, 35, 10, 6}, W2 = {102, 48}, W3 = {71, 47}, s(C) = 12;
(50) W0 = {118, 113, 108, 107, 100, 91, 79, 69, 68, 65, 58, 52, 49, 43, 33, 20, 17, 13}, W1 = {108, 25}, W2 = {98, 17}, W3 = {46, 20}, s(C) = 19.
Minimum: 6935.3.

Run 7: Subtour-elimination constraints:
(51) W = {71, 47, 26}, RS = 2;
(52) W = {94, 86, 81, 78}, RS = 3;
(53) W = {119, 103, 82, 23, 9, 3}, RS = 5.
Comb inequalities:
(54) W0 = {69, 68, 65}, W1 = {113, 69}, W2 = {91, 68}, W3 = {65, 13}, s(C) = 4;
(55) W0 = {69, 68, 65}, W1 = {113, 107, 68}, W2 = {91, 68}, W3 = {118, 65, 49, 13}, s(C) = 7;
(56) W0 = {114, 112, 106, 104, 99, 84, 83, 73, 67, 62, 57, 37, 36, 35, 10}, W1 = {114, 83, 73, 63, 57, 39, 5}, W2 = {112, 110}, W3 = {84, 6}, s(C) = 21;
(57) W0 = {118, 113, 107, 98, 69, 68, 65, 50, 49, 46, 44, 42, 20, 17, 13}, W1 = {91, 68}, W2 = {75, 44}, W3 = {42, 41}, s(C) = 16;
(58) W0 = {119, 116, 115, 114, 112, 111, 110, 109, 106, 104, 103, 102, 101, 99, 97, 96, 95, 93, 90, 89, 88, 84, 83, 82, 80, 77, 73, 72, 71, 70, 67, 64, 63, 62, 60, 57, 55, 54, 53, 51, 48, 47, 40, 39, 38, 37, 36, 35, 34, 27, 26, 24, 23, 21, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2}, W1 = {105, 74, 72, 40}, W2 = {60, 16}, W3 = {56, 7}, s(C) = 68;
(59) W0 = {115, 93, 21, 2}, W1 = {119, 115, 103, 82, 51, 23, 11, 9, 3, 2}, W2 = {93, 64, 53}, W3 = {109, 21}, s(C) = 14.
Minimum: 6937.222.
Run 8: Comb inequalities:
(60) W0 = {34, 26, 4}, W1 = {119, 4}, W2 = {116, 34}, W3 = {71, 26}, s(C) = 4;
(61) W0 = {53, 27, 5}, W1 = {93, 64, 53}, W2 = {80, 27}, W3 = {63, 5}, s(C) = 5;
(62) W0 = {115, 103, 93, 82, 51, 23, 21, 11, 9, 2}, W1 = {119, 103}, W2 = {109, 21}, W3 = {93, 64, 53}, W4 = {82, 3}, W5 = {51, 13}, s(C) = 13;
(63) W0 = {115, 93, 82, 51, 23, 21, 11, 9, 2}, W1 = {103, 23, 9}, W2 = {109, 21}, W3 = {93, 64, 53}, W4 = {82, 3}, W5 = {51, 13}, s(C) = 13;
(64) W0 = {103, 51, 23, 11, 9}, W1 = {119, 103}, W2 = {115, 11}, W3 = {51, 13}, s(C) = 6;
(65) W0 = {89, 55, 48, 47}, W1 = {114, 112, 110, 106, 104, 99, 89, 84, 83, 73, 67, 62, 57, 55, 37, 36, 35, 10, 6}, W2 = {102, 48}, W3 = {71, 47, 26}, s(C) = 23;
(66) W0 = {112, 106, 104, 99, 84, 67, 62, 37, 36, 35, 10}, W1 = {114, 106, 83, 73, 67, 62, 57, 37}, W2 = {112, 110}, W3 = {84, 6}, s(C) = 18;
(67) W0 = {106, 104, 99, 67, 62, 37}, W1 = {114, 106, 83, 73, 67, 62, 57, 37}, W2 = {104, 36}, W3 = {99, 10}, s(C) = 13.
Minimum: 6939.5.

Run 9: Comb inequalities:
(68) W0 = {72, 56, 40, 38, 7}, W1 = {105, 74, 72, 40}, W2 = {119, 103, 38}, W3 = {56, 41}, s(C) = 9;
(69) W0 = {119, 116, 105, 103, 74, 72, 70, 56, 40, 38, 34, 26, 8, 7, 4}, W1 = {119, 115, 103, 82, 51, 23, 11, 9, 3, 2}, W2 = {87, 74}, W3 = {71, 26}, W4 = {56, 41}, W5 = {54, 8}, s(C) = 25;
(70) W0 = {112, 110, 89, 84, 55, 48, 47, 36, 6}, W1 = {112, 106}, W2 = {104, 36}, W3 = {102, 48}, W4 = {84, 35}, W5 = {71, 47}, s(C) = 11;
(71) W0 = {112, 110, 106, 104, 99, 89, 84, 67, 62, 55, 48, 47, 37, 36, 35, 10, 6}, W1 = {102, 48}, W2 = {83, 67}, W3 = {71, 47}, s(C) = 18;
(72) W0 = {118, 117, 113, 108, 107, 100, 98, 91, 85, 79, 75, 69, 68, 66, 65, 58, 52, 50, 49, 46, 44, 43, 42, 41, 33, 31, 25, 22, 20, 19, 18, 17, 13}, W1 = {81, 22}, W2 = {75, 14}, W3 = {56, 41}, s(C) = 34.
Minimum: 6940.38281.

Run 10: Subtour-elimination constraints:
(73) W = {117, 113, 108, 107, 100, 91, 85, 79, 69, 68, 66, 65, 58, 52, 43, 33, 31, 25, 22, 19, 18}, RS = 20.
Comb inequalities:
(74) W0 = {116, 70, 34, 26, 8, 4}, W1 = {116, 70, 54, 8}, W2 = {119, 4}, W3 = {71, 26}, s(C) = 9;
(75) W0 = {114, 106, 83, 73, 67, 62, 57, 37}, W1 = {112, 106}, W2 = {99, 62}, W3 = {83, 39}, s(C) = 9;
(76) W0 = {118, 117, 113, 108, 107, 100, 98, 94, 91, 87, 86, 85, 81, 79, 75, 69, 68, 66, 65, 58, 56, 52, 50, 49, 46, 44, 43, 42, 41, 33, 31, 25, 22, 20, 19, 18, 17, 14, 13}, W1 = {120, 94, 92, 86, 81, 78, 45, 32, 30, 29, 28, 15}, W2 = {87, 74}, W3 = {56, 41, 7}, s(C) = 51.
Minimum: 6940.81641.

Run 11: Comb inequalities:
(77) W0 = {108, 100, 91, 79, 69, 68, 65, 58, 52, 43, 33}, W1 = {113, 107, 69},
W2 = {108, 25}, W3 = {65, 13}, s(C) = 13;
(78) W0 = {120, 92, 76, 59, 32, 30, 29, 15}, W1 = {92, 28}, W2 = {81, 15}, W3 = {76, 1}, s(C) = 9;
(79) W0 = {94, 87, 86, 78, 75, 44, 14}, W1 = {94, 81}, W2 = {87, 74}, W3 = {78, 45}, s(C) = 8;
(80) W0 = {116, 70, 34, 26, 8, 4}, W1 = {119, 4}, W2 = {71, 47, 26}, W3 = {54, 8},
s(C) = 8;
(81) W0 = {114, 112, 106, 104, 99, 84, 83, 73, 67, 62, 57, 37, 36, 35, 10}, W1 = {112, 110}, W2 = {84, 6}, W3 = {83, 39}, s(C) = 16;
(82) W0 = {108, 100, 91, 79, 69, 68, 65, 58, 52, 43, 33, 13}, W1 = {118, 49, 13}, W2 = {113, 107, 69}, W3 = {108, 25}, s(C) = 15;
(83) W0 = {120, 92, 76, 59, 45, 32, 30, 29, 28, 15}, W1 = {81, 15}, W2 = {78, 45}, W3 = {76, 1}, s(C) = 11;
(84) W0 = {94, 87, 86, 81, 78, 75, 44, 41, 22}, W1 = {117, 85, 66, 31, 22}, W2 = {78, 45}, W3 = {87, 74}, s(C) = 13;
(85) W0 = {120, 92, 45, 32, 28}, W1 = {120, 29}, W2 = {78, 45}, W3 = {32, 30}, s(C) = 6;
(86) W0 = {116, 105, 72, 71, 70, 60, 54, 47, 40, 38, 34, 26, 24, 8, 7, 4}, W1 = {105, 87, 74, 72, 56, 41, 40, 38, 7}, W2 = {114, 112, 110, 106, 104, 102, 101, 99, 89, 84, 83, 73, 71, 67, 62, 57, 55, 48, 47, 37, 36, 35, 10, 6}, W3 = {96, 90, 54}, W4 = {60, 16}, W5 = {111, 24}, s(C) = 48;
(87) W0 = {73, 67, 57}, W1 = {67, 62, 37}, W2 = {114, 73}, W3 = {83, 57}, s(C) = 5;
(88) W0 = {83, 73, 67, 57}, W1 = {83, 39}, W2 = {114, 73}, W3 = {67, 37}, s(C) = 5.
Minimum: 6941.18359.

Run 12: Comb inequalities:
(89) W0 = {105, 74, 72, 56, 40, 38, 7}, W1 = {119, 103, 38, 23, 9}, W2 = {87, 74}, W3 = {56, 41}, s(C) = 11;
(90) W0 = {118, 117, 113, 108, 107, 100, 98, 94, 91, 87, 86, 85, 81, 79, 78, 75, 69, 68, 66, 65, 58, 52, 50, 49, 46, 44, 43, 42, 41, 33, 31, 25, 22, 20, 19, 18, 17, 14, 13}, W1 = {87, 74}, W2 = {92, 78, 45, 28}, W3 = {56, 41}, s(C) = 42;
(91) W0 = {116, 71, 70, 54, 34, 26, 8, 4}, W1 = {119, 4}, W2 = {90, 54}, W3 = {71, 47}, s(C) = 9;
(92) W0 = {105, 74, 72, 56, 40, 38, 7}, W1 = {87, 74}, W2 = {56, 41}, W3 = {119, 103, 38, 23, 9}, s(C) = 11;
(93) W0 = {108, 69, 68, 65, 43, 13}, W1 = {118, 49, 13}, W2 = {113, 69}, W3 = {108, 25}, W4 = {91, 68}, W5 = {79, 43}, s(C) = 9.
Minimum: 6941.5.

Run 13: Comb inequalities:
(94) W0 = {109, 93, 21}, W1 = {109, 88}, W2 = {93, 64, 53}, W3 = {115, 21}, s(C) = 5;
M. Grötschel / Solutions of a 120-city problem
Fig. 1. The shortest roundtrip through 120 German cities. The length of this tour is 6942 km.
(95) W_0 = {115, 109, 93, 21, 2}, W_1 = {119, 115, 103, 82, 51, 23, 11, 9, 3, 2}, W_2 = {109, 88}, W_3 = {93, 64, 53}, s(C) = 15;
(96) W_0 = {119, 115, 114, 109, 103, 101, 93, 83, 82, 80, 77, 73, 64, 63, 57, 53, 51, 39, 27, 23, 21, 11, 9, 5, 3, 2}, W_1 = {114, 112, 110, 106, 104, 99, 89, 84, 83, 73, 67, 62, 57, 55, 37, 36, 35, 10, 6}, W_2 = {109, 88}, W_3 = {102, 101}, W_4 = {95, 77}, W_5 = {51, 13}, s(C) = 45.
Minimum: 6942.
The optimal solution of the 13th LP-run was the incidence vector of a tour of length 6942 km; hence this vector represents the shortest roundtrip through the 120 cities of Germany. A graphical representation of this optimal tour is given in Fig. 1.
We have calculated the number of non-equivalent facets of Q_T^120 which are given by Theorem 1. This number is exactly
26792549076063489375554618994821987399578869037768 70780484651943295772470308627340156321170880759399 86913459296483643418942533445648036828825541887362 42799920969079258554704177287.
Considering the fact that the trivial inequalities, the subtour-elimination constraints, and the comb inequalities are far from being all facets of Q_T^120, it is quite surprising that only the trivial inequalities and an additional 96 inequalities out of these more than 10^178 inequalities, and no others, were needed to find an optimal tour and prove optimality.
It can be seen from the sequence of minimum values of the 13 linear programs that the increase of these values is considerable during the first runs (221 km after the second run, a total of 273 km after the first six runs), but it took a further seven runs to gain the last 6 km. This fact was observed in several other experiments of this kind. Due to limited time for the solution procedure, or due to possible incorrectness of the data, a near-optimal solution is often good enough for practical purposes.
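The size of this count can be verified directly (a small illustrative computation, not part of the original paper):

```python
# The number of non-equivalent facets of Q_T^120 quoted above.
n_facets = int(
    "26792549076063489375554618994821987399578869037768"
    "70780484651943295772470308627340156321170880759399"
    "86913459296483643418942533445648036828825541887362"
    "42799920969079258554704177287")
print(len(str(n_facets)))  # 179 digits, i.e. the count exceeds 10**178
```

Only 96 of these inequalities were needed in the computation, a vanishing fraction of the total.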
If, for instance, a tour at most 2% off optimality were considered satisfactory, we could have stopped after the second LP-run, having a lower bound of 6883.5 km and a "good" (heuristically found) tour of length 7011 km. For LP-problems of our size (7140 variables, 120 equations, 7140 upper and lower bounds, 96 inequalities) advanced LP-codes and large computers are indispensable. We have used the LP-program of the MPSX-package of IBM, and all runs were executed on the IBM 370/168 computer of the Rechenzentrum der Universität Bonn. To simplify the data input, several auxiliary routines were written: one program that generated the equation system Ax = 2e_120 and the upper and lower bounds in the input format required by MPSX, another that after each run saved the whole constraint system on tape, and a third program that generated comb inequalities and subtour-elimination constraints in MPSX input
format, given their combinatorial description as in our cutting plane list above, and added these inequalities to the present constraint system. In each run the heuristically obtained tour of length 7011 km was given as a partial LP-basis, which was then completed by MPSX to a full basis and used as the starting basis for the linear program. We found that such a device can reduce the computation times considerably. The CPU-times needed for the solution of the thirteen programs ranged between 30 seconds and 2 minutes; the number of pivot operations was between 100 and 1000; both CPU-times and pivot operations increased slightly, but not monotonically, with the number of additional inequalities. The last run, for instance, was executed in 1.76 CPU-minutes, and 714 pivot operations were necessary to obtain the optimal solution. Considering these moderately sized CPU-times, it seems possible that even larger travelling salesman problems can be solved using ordinary linear programming codes, provided that the user is capable of identifying violated inequalities and has the patience to solve the problem interactively.
Clearly, the author does not suggest the method presented here as a standard method for solving travelling salesman problems. The main purpose of this note is to show the practical usefulness of the theoretical research done in polyhedral combinatorics and to give an example showing that we already have tools to solve very large real-world problems optimally. As is to be expected, an optimal solution requires a lot of effort, but the chances of finding the shortest tour are quite good. Further research in applying the results on the facetial structure of polytopes associated with hard (NP-complete) combinatorial optimization problems should go in the direction of automating the interactive procedure used above. A first step with respect to the TSP was taken with considerable success by Padberg and Hong [15].
If good methods for solving some of the problems encountered by them can be found, it seems likely that we will be able to attack truly large problems by sophisticated combinations of heuristics, cutting plane methods and branch and bound techniques.

Appendix

List of the 120 German cities contained in the distance table in [4]:

1 Aachen, 2 Amberg, 3 Ansbach, 4 Aschaffenburg, 5 Augsburg, 6 Baden-Baden, 7 Bad Hersfeld, 8 Bad Kreuznach, 9 Bamberg, 10 Basel, 11 Bayreuth, 12 Berchtesgaden, 13 Berlin, 14 Bielefeld, 15 Bocholt, 16 Bonn, 17 Braunschweig, 18 Bremen, 19 Bremerhaven, 20 Celle, 21 Cham, 22 Cloppenburg, 23 Coburg, 24 Cochem, 25 Cuxhaven, 26 Darmstadt, 27 Donauwörth, 28 Dortmund, 29 Düsseldorf, 30 Duisburg, 31 Emden, 32 Essen, 33 Flensburg, 34 Frankfurt, 35 Freiburg, 36 Freudenstadt, 37 Friedrichshafen, 38 Fulda, 39 Garm.-Partenk., 40 Gießen,
Appendix. (Continued)

41 Göttingen, 42 Goslar, 43 Hamburg, 44 Hameln, 45 Hamm, 46 Hannover, 47 Heidelberg, 48 Heilbronn, 49 Helmstedt, 50 Hildesheim, 51 Hof, 52 Husum, 53 Ingolstadt, 54 Kaiserslautern, 55 Karlsruhe, 56 Kassel, 57 Kempten, 58 Kiel, 59 Kleve, 60 Koblenz, 61 Köln, 62 Konstanz, 63 Landsberg, 64 Landshut, 65 Lauenburg, 66 Leer, 67 Lindau, 68 Lübeck, 69 Lüneburg, 70 Mainz, 71 Mannheim, 72 Marburg, 73 Memmingen, 74 Meschede, 75 Minden, 76 Mönchengladb., 77 München, 78 Münster, 79 Neumünster, 80 Nördlingen, 81 Nordhorn, 82 Nürnberg, 83 Oberstdorf, 84 Offenburg, 85 Oldenburg, 86 Osnabrück, 87 Paderborn, 88 Passau, 89 Pforzheim, 90 Pirmasens, 91 Puttgarden, 92 Recklinghausen, 93 Regensburg, 94 Rheine, 95 Rosenheim, 96 Saarbrücken, 97 Salzburg, 98 Salzgitter-Bad, 99 Schaffhausen, 100 Schleswig, 101 Schwäb. Gmünd, 102 Schwäb. Hall, 103 Schweinfurt, 104 Schwenningen, 105 Siegen, 106 Sigmaringen, 107 Soltau, 108 Stade, 109 Straubing, 110 Stuttgart, 111 Trier, 112 Tübingen, 113 Uelzen, 114 Ulm, 115 Weiden, 116 Wiesbaden, 117 Wilhelmshaven, 118 Wolfsburg, 119 Würzburg, 120 Wuppertal.
The numbers in Fig. 1 correspond to the numbering of the cities in the list above. Readers interested in trying their TSP-code or -heuristic on the 120-city problem solved in this paper can obtain a card deck with the road distances between the above listed cities from the author.
References
[1] R.E. Burkard, "Travelling salesman and assignment problems: A survey", Annals of Discrete Mathematics 4 (1979) 193-215.
[2] V. Chvátal, "Edmonds polytopes and weakly Hamiltonian graphs", Mathematical Programming 5 (1973) 29-40.
[3] G.B. Dantzig, D.R. Fulkerson and S.M. Johnson, "Solution of a large-scale travelling-salesman problem", Operations Research 2 (1954) 393-410.
[4] Deutscher General Atlas (Mairs Geographischer Verlag, Stuttgart, 1967/68).
[5] J. Edmonds, "Maximum matching and a polyhedron with 0,1 vertices", Journal of Research of the National Bureau of Standards 69B (1965) 125-130.
[6] J. Edmonds, "Matroids and the Greedy algorithm", Mathematical Programming 1 (1971) 127-136.
[7] M. Grötschel, Polyedrische Charakterisierungen kombinatorischer Optimierungsprobleme (Anton Hain Verlag, Meisenheim/Glan, 1977).
[8] M. Grötschel and M.W. Padberg, "Partial linear characterizations of the asymmetric travelling salesman polytope", Mathematical Programming 8 (1975) 378-381.
[9] M. Grötschel and M.W. Padberg, "Lineare Charakterisierungen von Travelling Salesman Problemen", Zeitschrift für Operations Research 21A (1977) 33-64.
[10] M. Grötschel and M.W. Padberg, "On the symmetric travelling salesman problem I: Inequalities", Mathematical Programming 16 (1979) 265-280.
[11] M. Grötschel and M.W. Padberg, "On the symmetric travelling salesman problem II: Lifting theorems and facets", Mathematical Programming 16 (1979) 281-302.
[12] S. Hong, "A linear programming approach for the travelling salesman problem", Thesis (Johns Hopkins University, Baltimore, 1972).
[13] J.F. Maurras, "Some results on the convex hull of the Hamiltonian cycles of symmetric complete graphs", in: B. Roy, ed., Combinatorial programming: methods and applications (Reidel, Dordrecht, 1975).
[14] P. Miliotis, "Integer programming approaches to the travelling salesman problem", Mathematical Programming 10 (1976) 367-378.
[15] M.W. Padberg and S. Hong, "On the symmetric travelling salesman problem: A computational study", Mathematical Programming Studies 12 (1980) 61-77.
Mathematical Programming Study 12 (1980) 78-107. North-Holland Publishing Company

ON THE SYMMETRIC TRAVELLING SALESMAN PROBLEM: A COMPUTATIONAL STUDY

Manfred W. PADBERG
New York University, New York, U.S.A.

Saman HONG
Operations Research Group, RCA Corporation, Princeton, NJ, U.S.A.

Received 1 September 1977
Revised manuscript received 5 July 1979

The symmetric travelling salesman problem has been formulated by Dantzig, Fulkerson and Johnson in 1954 as a linear programming problem in zero-one variables. We use this formulation and report the results of a computational study addressing itself to the problem of proving optimality of a particular tour. The empirical results, based on a total of 74 problems of sizes ranging from 15 cities to 318 cities, lend convincing support to the hypothesis that inequalities defining facets of the convex hull of tours are of substantial computational value in the solution of this difficult combinatorial problem.

Key words: Travelling Salesman Problem, Cutting Planes, Facets, Computation.
0. Introduction

The symmetric travelling salesman problem (TSP) is the problem of finding the shortest Hamiltonian cycle (or tour) in a weighted finite undirected graph without loops. This problem appears to have been formulated some 45 years ago [14] and has been a subject of intensive investigation in combinatorial optimization during the past 25 years. The interest that this problem has received is well deserved: Many practical combinatorial problems in scheduling and production management can be formulated as or shown to be equivalent to a symmetric travelling salesman problem. On the other hand, the travelling salesman problem is of theoretical interest because it is a "hard" combinatorial problem (see Karp [10]). In this paper we focus on the oldest approach to this problem, due to Dantzig et al. [3], who formulated the problem as a linear programming problem in zero-one variables in 1954 and used a cutting plane approach to prove the optimality of a heuristically obtained solution to a 49-city problem (see also [4]). Our proceeding is very much the same: We have used the most effective heuristic known to us in order to obtain a good, possibly optimal tour. This tour is then used as the starting solution for a linear program that gradually increases in size as we identify more and more cutting planes in order to prove optimality. (The heuristic used in this study is due to Lin and Kernighan [13] and we are grateful to Shen Lin for making his FORTRAN IV program available to us.)
The objective of our computational study is to empirically validate the usefulness of cutting plane methods for the travelling salesman problem and, by generalization, for other hard combinatorial optimization problems as well. The cutting planes used here are, however, not the cutting planes commonly used in integer programming; rather, they are problem-specific and belong to the class of inequalities that define facets of the convex hull of tours (see Grötschel [6], Grötschel and Padberg [7]). The difficulty with the implementation of this approach lies in the identification of a suitable cutting plane, a problem that has so far been neglected in the work on the facial structure of polyhedra associated with various combinatorial problems of practical interest. While we offer only a small contribution towards bridging this gap between theory and computation, the results of our computational study support very well the contention that facetial cutting planes are of considerable practical help in establishing optimality in combinatorial problems.
We begin by introducing some notation and by reviewing facetial inequalities, some of which were used in the computational study. In Section 3 we describe the overall flow of the algorithm, while in Section 4 we discuss our routines for identifying a suitable constraint. In Section 5 we discuss the results of the computational study, which is divided into four parts. In part 1 we report the results on randomly generated euclidean problems with n = 15, 30, 45, 60, 75 and 100 cities. In part 2 we examine problems due to Papadimitriou and Steiglitz [17], which are practically insoluble for heuristic algorithms but which were solved to optimality surprisingly easily by the linear programming approach. In part 3 we report on various travelling salesman problems that have been used as test problems in the literature. In part 4 we report on the drilling-machine problem with 318 points (cities) published in [13].
The reader should be aware that in only 54 of the 74 sample problems do we actually prove optimality, while in all problems we obtain excellent lower bounds on the optimum tour length. The "goodness" of this bound is measured by a ratio which is invariant under scaling and translating the original problem data.
1. Notation
Let G = (V, E) be the undirected graph with node-set V = {1, ..., n} and edge-set E = {(i, j) | i ∈ V, j ∈ V, i ≠ j}. A tour is either a cyclic permutation of nodes (i_1, i_2, ..., i_n) or, equivalently, a set of n edges {(i_1, i_2), (i_2, i_3), ..., (i_{n-1}, i_n), (i_n, i_1)} which form a Hamiltonian cycle (or tour) in G. A cyclic permutation of r nodes with r < n, or its associated edge-set, is called a subtour. Algebraically a tour is described by a zero-one vector x with the convention that x_ij = 1 if the edge (i, j) is in the tour and x_ij = 0 if not. As we are dealing with undirected edges, the vector x has m = n(n - 1)/2 components. For any S ⊆ V and H ⊆ E we use the following abbreviations:
N(H) = {i ∈ V | i is incident with an edge in H},
E(S) = {(i, j) ∈ E | i ∈ S, j ∈ S},
star(S) = {(i, j) ∈ E | i ∈ S, j ∈ V - S},
star(H) = star(N(H)),
x(H) = Σ {x_ij | (i, j) ∈ H},
x(S) = x(E(S)),
(S_1 : S_2) = {(i, j) ∈ E | i ∈ S_1, j ∈ S_2}.
For x ∈ ℝ we let ⟨x⟩ = min{z ∈ Z | x ≤ z}, where Z denotes the set of all integers.
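For illustration, these abbreviations are straightforward to compute; the sketch below (modern Python, not part of the original study) implements E(S), star(S) and x(·) with edges stored as sorted node pairs:

```python
from itertools import combinations

def E_of(S):
    """E(S): all edges of the complete graph with both endpoints in S."""
    return {tuple(sorted(e)) for e in combinations(S, 2)}

def star(S, V):
    """star(S): all edges with exactly one endpoint in S."""
    return {tuple(sorted((i, j))) for i in S for j in V - S}

def x_of(x, H):
    """x(H): sum of the edge variables over the edge set H."""
    return sum(x.get(e, 0.0) for e in H)

# Example: the 5-city tour 1-2-3-4-5-1 as a 0-1 vector (dict of edge values).
V = {1, 2, 3, 4, 5}
x = {(1, 2): 1.0, (2, 3): 1.0, (3, 4): 1.0, (4, 5): 1.0, (1, 5): 1.0}

S = {1, 2, 3}
print(x_of(x, E_of(S)))     # 2.0: x(S) counts the tour edges (1,2) and (2,3)
print(x_of(x, star(S, V)))  # 2.0: the tour crosses the cut star(S) exactly twice
```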
2. Valid inequalities

Let A be the node-edge incidence matrix of the complete graph, i.e. A has n rows corresponding to the nodes of G and m = n(n - 1)/2 columns corresponding to the edges of G. Let e_k be the k-dimensional vector of ones. Then the assignment polytope Q_A^n is given by

Q_A^n = {x ∈ ℝ^m | Ax = 2e_n, 0 ≤ x ≤ e_m}.   (2.1)

It is well known that Q_A^n has three different kinds of extreme points or vertices: (i) fractional vertices, i.e. vertices x̄ satisfying 0 < x̄_ij < 1 for some (i, j); (ii) subtour vertices, i.e. vertices which have zero-one components and correspond to subtours in G; and (iii) tour vertices, i.e. vertices which have zero-one components and correspond to tours in G. In order to cut off the subtour vertices of Q_A^n, one intersects Q_A^n with the totality of halfspaces given by the subtour-elimination constraints due to Dantzig et al. [3]:

x(S) ≤ |S| - 1 for all S ⊆ V, 2 ≤ |S| ≤ ⌊n/2⌋.   (2.2)

Denote by Q_T^n the convex hull of tours and let Q_S^n = {x ∈ Q_A^n | x satisfies (2.2)}. Then it is well known that Q_T^n = conv{x ∈ Q_S^n | x_ij ∈ {0, 1} for all (i, j) ∈ E}. On the other hand, Q_S^n has fractional vertices, part of which are "inherited" from Q_A^n and part of which are "newly created" by intersecting Q_A^n with the subtour-elimination constraints. The fractional vertices of Q_A^n are cut off by the 2-matching constraints due to Edmonds [5]:
x(S) + x(H) ≤ |S| + ⌊|H|/2⌋,   (2.3)

where S ⊆ V satisfies 3 ≤ |S| and H ⊆ star(S) is a collection of edges with mutually distinct endpoints in S. The polytope that results from intersecting Q_S^n with all constraints (2.3) has fractional vertices, part of which are excluded by the following generalized comb inequality, which we have obtained jointly with Martin Grötschel: Let S_i ⊆ V for i = 0, ..., k be any k + 1 proper subsets of V satisfying

|S_0 ∩ S_i| ≥ 1 and |S_i - S_0| ≥ 1 for i = 1, ..., k,   (2.4a)
S_i ∩ S_j = ∅ for 1 ≤ i < j ≤ k,   (2.4b)
k ≥ 3, k odd.   (2.4c)

Then the generalized comb inequality

Σ_{i=0}^{k} x(S_i) ≤ |S_0| + Σ_{i=1}^{k} (|S_i| - 1) - ⌈k/2⌉   (2.5)

is valid for Q_T^n, i.e. satisfied by every tour. Inequality (2.5) contains the comb inequality of Chvátal [1] as a special case. In this case (2.4a) is replaced by the stronger requirement

|S_0 ∩ S_i| = 1 and |S_i - S_0| ≥ 1 for i = 1, ..., k.   (2.4ā)

Furthermore, the 2-matching constraints are also a special case of the generalized comb inequality (2.5), with S = S_0, |S_i| = 2 for i = 1, ..., k and H = ∪_{i=1}^{k} E(S_i). The subtour elimination constraints (2.2) as well as all generalized comb inequalities (2.5) constitute facets of Q_T^n. The proof of this statement is very involved, see Grötschel [6], and will be published in detail elsewhere [7].
A further valid inequality for Q_T^n is the following chain constraint: Let S_i ⊆ V for i = 0, 1, ..., k be any proper subsets of V satisfying S_i ∩ S_0 = ∅ for i = 1, ..., p, condition (2.4a) for i = p + 1, ..., k, and condition (2.4b) as stated. Let R ⊆ S_0 be a subset of S_0 satisfying |R| = p and R ∩ S_i = ∅ for i = 1, ..., k. Then

Σ_{i=0}^{k} x(S_i) + Σ_{i=1}^{p} x(R : S_i) ≤ |S_0| + Σ_{i=1}^{k} (|S_i| - 1) - ⌈(k - 2p + 1)/2⌉   (2.6)

is a valid inequality for Q_T^n, where 1 ≤ p < k. This follows because
2 (Σ_{i=0}^{k} x(S_i) + Σ_{i=1}^{p} x(R : S_i))
  ≤ Σ_{v ∈ S_0} x(star(v)) + x(R ∪ S_1 ∪ ... ∪ S_p) + Σ_{i=1}^{p} x(S_i) + Σ_{i=p+1}^{k} (x(S_i) + x(S_i - S_0) + x(S_i ∩ S_0))
  ≤ 2|S_0| + (|R| + Σ_{i=1}^{p} |S_i| - 1) + Σ_{i=1}^{p} (|S_i| - 1) + Σ_{i=p+1}^{k} (|S_i| - 1 + |S_i - S_0| - 1 + |S_i ∩ S_0| - 1)
  = 2|S_0| + |R| + 2 Σ_{i=1}^{k} (|S_i| - 1) - (k - p + 1).
Then inequality (2.6) follows by dividing by two and integerizing the right-hand side (using |R| = p). Inequality (2.6) has yet to be studied in detail from a theoretical point of view. It does not define a facet of Q_T^n for small n, though it intersects Q_T^n in a face of high dimension, and it has been useful on occasion in our computational study. To be sure, the inequalities stated here do not completely characterize Q_T^n, i.e. the polytope that results from intersecting Q_A^n with all inequalities (2.2), (2.5) and (2.6) has fractional vertices as well as tour vertices.
3. Description of the algorithm

The core of the algorithm is the primal simplex algorithm for linear programs with bounded variables; we refer the reader to e.g. Simmonard [19] for a general description of this method. Our program starts by reading in the distance table of the graph or the coordinates of the n points and the starting tour (see Fig. 1). This solution is then used to initialize the linear program min{cx | Ax = 2e_n, 0 ≤ x ≤ e_m}. Since the columns of A that are in the tour yield a submatrix B of A with row and column sums equal to two, we have a starting basis if n is odd. If n is even, B is singular. In this case we modify B by discarding the column corresponding to the edge (i_{n-1}, i_n) and introducing in its place the edge (i_1, i_{n-1}). The resulting matrix is nonsingular, and in both cases one can readily determine B^{-1} explicitly. (Of course, the discarded column (i_{n-1}, i_n) is marked to be nonbasic and at its upper bound of 1.) Given B^{-1} we next compute c_B B^{-1}, where c_B is the vector of distances corresponding to the edges in the initial basis. Both B^{-1} and c_B B^{-1} are carried along during the entire computation and updated whenever a pivot operation is carried out. Given c_B B^{-1} we can now price out all columns by computing c - c_B B^{-1} A. Next we scan through the entire graph and order the edges according to increasing reduced costs c - c_B B^{-1} A. Though this ordering is very time consuming, we found it useful in order to have a simple but effective tie-breaking rule for degenerate pivots. Next we proceed according to the simplex method, using a steepest edge criterion for the selection of the pivot column; see e.g. Crowder and Hattingh [2] for a description of this criterion.
We found this criterion superior to the usual minimum reduced cost criterion, but decided to use it for another reason as well: It appears to be difficult to identify improving tour vertices, so while scanning for the best pivot column we check concurrently whether or not a candidate column defines an improved tour vertex. To this end we need to construct the full column B^{-1}a_j anyway, which is used in order to compute the steepest-edge pivot column, where a_j is the jth column of A. Though this may appear to be rather time consuming, our computation times justify ex post the added work.
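The parity argument for the starting basis can be checked on small instances. The sketch below (modern Python, purely illustrative and not the authors' MPSX/FORTRAN setup) builds the node-edge incidence submatrix B of a tour and tests whether it is nonsingular:

```python
from fractions import Fraction

def tour_basis(n):
    """Node-edge incidence submatrix B of the tour (1, 2, ..., n):
    column j corresponds to tour edge (j, j+1), the last column to (n, 1)."""
    B = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        u, v = j, (j + 1) % n  # 0-based endpoints of tour edge j
        B[u][j] = B[v][j] = Fraction(1)
    return B

def det(M):
    """Determinant by exact Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

print(det(tour_basis(5)))  # 2: odd n, B is nonsingular and serves as a basis
print(det(tour_basis(6)))  # 0: even n, B is singular and must be modified
```

This is exactly why the column swap described above is needed whenever n is even.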
[Flowchart omitted. Its boxes read: "Read distances or coordinates and starting tour; initialize linear program by setting up basis inverse etc." → "Select pivot column by steepest edge criterion" → if the pivot is degenerate: "Select pivot row and carry out degenerate pivot"; if an improving vertex is reached and it is a tour: "Carry out pivot and drop superfluous cuts"; otherwise: "Identify a constraint that is satisfied at equality by current tour and cuts off adjacent vertex" → "Append new constraint and pivot in pivot column: degenerate pivot".]

Fig. 1. Flowchart.
Having identified the next pivot column, we can have one of two situations: Either the pivot column can be pivoted into the basis without changing the current solution vector (degenerate pivot) or alternatively, introducing the pivot column into the basis changes the solution vector, i.e. defines a vertex for the current linear program which is adjacent to the current tour. In the first case we use as the pivot row selection criterion the usual criterion, i.e. the maximal pivot element, and break ties by selecting as exiting variable the one with the highest index as defined by our initial ordering. In the second case, we find an improving adjacent vertex. If this vertex is a tour, we update the solution vector and drop all superfluous cuts that were previously introduced (if any). If the vertex is not a tour, then we need to identify a constraint (cut) from among the constraints discussed in Section 2 that satisfies two requirements: (i) The constraint must cut off the improving non-tour vertex. (ii) The constraint must be satisfied at equality by the current tour (= current solution vector). The constraint identification is
described in the next section. If our program finds a suitable constraint, the linear program is enlarged by this new constraint, i.e. the basis inverse is enlarged suitably and the coefficients of the cut are stored compactly. It follows that the pivot column can now be pivoted into the basis by a degenerate pivot and the procedure is repeated. If we cannot find a suitable constraint, we default to solving the current linear program to optimality, thus obtaining a lower bound on the minimum tour length.
4. Identification of valid inequalities

When the introduction of the pivot column into the basis produces a new vertex which is not a tour, we must find a facetial inequality which is satisfied by the current tour and cuts off the new vertex. Due to the fact that we have already "activated" other facetial inequalities, the new vertex generally is not a vertex of Q_A^n; rather, it is a basic feasible solution to the enlarged linear programming problem. Consequently, this identification is rather difficult and must be carried out by an algorithm geared specifically to answer the following problem:

(P1) Given x^1 ∈ Q_A^n, a tour vertex, and x^2 ∈ Q_A^n, a non-tour feasible point in Q_A^n, find an inequality ax ≤ a_0 of type i satisfying ax^1 = a_0 and ax^2 > a_0, if one exists.
That is, we distinguish different inequalities by their respective combinatorial description: Type 1 refers to the subtour elimination constraints (2.2), type 2 to the 2-matching constraints (2.3), type 3 to comb inequalities, type 4 to generalized comb inequalities (2.5) satisfying |S_0 ∩ S_i| ≥ 2 for at least one i ∈ {1, ..., k}, and type 5 to chain constraints (2.6). We have been able to solve problem (P1) completely only for the type 1 constraints. In all other cases, including the 2-matching constraints, we have at present only an improvised way of identifying a suitable constraint. This is in part due to the fact that we originally pursued a different strategy with regard to the identification of a suitable constraint: If x^2 in (P1) is a fractional vertex of Q_A^n, we can readily identify an appropriate 2-matching constraint. Furthermore, we can again completely characterize the resulting new vertices of Q_A^n ∩ {x | ax = a_0}, where ax = a_0 is a 2-matching constraint. Thus we can attack the constraint identification problem by chopping off 1-faces, 2-faces, etc. of Q_A^n. We found that this proceeding becomes quite impossible, as in actual computation one fairly quickly gets down to k-faces with k ≥ 10. Furthermore, if x^2 is actually a subtour vertex of Q_A^n, possibly adjacent to x^1 on Q_A^n, then, quite contrary to what one might expect, subtour elimination constraints do not always solve problem (P1). A case in point is the following situation for n = 8: Let x^1 be the tour (12 ... 78) and x^2 be the subtour vertex on the subtours (1256) and (3478). Then no subtour elimination constraint solves (P1). Rather, we need the generalized comb inequality (2.5) with S_0 =
{1, 2, 3, 8}, S_1 = {1, 2, 5, 6}, S_2 = {3, 4} and S_3 = {8, 7} in order to solve (P1). Similar observations apply to the chain constraint (2.6). A major conclusion with regard to the continuing theoretical development of this approach to travelling salesman problems is that, in lieu of attempting to cut off k-faces of Q_A^n, algorithms for the solution of problem (P1) must be found. In the following we describe the routines that we have used to identify constraints in our program. Since our procedures for identifying constraints of type i ≥ 2 must be considered preliminary, we will discuss them only very briefly.
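The n = 8 example can be verified numerically. The following sketch (modern Python, purely illustrative) evaluates the generalized comb inequality (2.5) on the tour x^1 and on the subtour vertex x^2:

```python
def x_of_S(edges, S):
    """x(S): number of 0-1 edges with both endpoints in S."""
    return sum(1 for (u, v) in edges if u in S and v in S)

# x^1: the tour (1 2 ... 7 8); x^2: the subtour vertex on (1256) and (3478).
tour = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (1, 8)}
subtours = {(1, 2), (2, 5), (5, 6), (1, 6), (3, 4), (4, 7), (7, 8), (3, 8)}

# Generalized comb (2.5): S_0 = {1,2,3,8}, S_1 = {1,2,5,6}, S_2 = {3,4}, S_3 = {7,8}.
S = [{1, 2, 3, 8}, {1, 2, 5, 6}, {3, 4}, {7, 8}]
k = 3
rhs = len(S[0]) + sum(len(Si) - 1 for Si in S[1:]) - (k + 1) // 2  # ceil(k/2)

lhs_tour = sum(x_of_S(tour, Si) for Si in S)
lhs_sub = sum(x_of_S(subtours, Si) for Si in S)
print(lhs_tour, rhs)  # 7 7: the tour satisfies (2.5) with equality
print(lhs_sub, rhs)   # 8 7: the subtour vertex violates (2.5)
```

So this comb inequality is tight at x^1 and cuts off x^2, exactly as required by (P1).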
4.1. Subtour elimination constraints

For subtour elimination constraints we have two different ways of solving (P1): One possibility is to solve a minimum-cut problem on the partial (undirected) graph G_F = (V, E_F), where (i, j) ∈ E_F iff x^1_ij > 0 or x^2_ij > 0. Indeed, let z = x^1 + x^2; then there exists a subtour constraint containing x^1 and cutting off x^2 if and only if z(S : V - S) < 4 for some S ⊆ V. This follows from the well-known equivalence of a subtour constraint (2.2) with the constraint x(S : V - S) ≥ 2. Hence there exists a good algorithm for identifying subtour elimination constraints. Rather than implementing a minimum-cut algorithm, we have used an enumeration algorithm which is polynomially bounded because there are at most ½n(n + 1) - 1 constraints (2.2) which need to be examined in order to solve (P1). This algorithm is particularly simple to implement and effective, as it exploits the sparsity of G_F. Let us assume that the current tour x^1 is given by (12 ... n). If the subtour elimination constraint on a node-set S solves (P1), then so does the subtour elimination constraint on V - S. Furthermore, either S or V - S must be a partial sequence of 1, 2, ..., n in order to contain the tour x^1. Denote by x^2 the vertex to be cut off. Then the following procedure determines the most violated subtour elimination constraint, if any exists:
Routine A
Step 1. Let SUM(1) = 0, GAP = 0 and k = 2. Go to Step 2.
Step 2. If k > n, go to Step 4. Otherwise, let SUM(k) = 0 and recompute, for each i with 1 ≤ i < k, SUM(i) to be
SUM(i) + Σ {x^2_jk | i ≤ j < k}.
Let DELTA = Σ {x^2_jk | 1 ≤ j < k} and go to Step 3.
Step 3. If DELTA ≤ 1, then no partial sequence of 1, 2, ..., k containing node k provides a better subtour elimination constraint. In this case, set k = k + 1 and go to Step 2. If DELTA > 1, compute y = max{SUM(i) - k + i | i = 1, ..., k - 1}. If y ≤ GAP, set k = k + 1 and go to Step 2. Else set GAP = y and store the two nodes i* and k* at which the maximum is attained. Set k = k + 1 and go to Step 2.
Step 4. If GAP = 0, there does not exist a subtour elimination constraint solving (P1). If GAP > 0, then construct a subtour elimination constraint on S = {i*, i* + 1, ..., k*} or on V - S, whichever has the smaller cardinality.
Note that Routine A enumerates all subtour elimination constraints which contain the current tour and selects the one which cuts off the new vertex. If the vertex x^2 is a subtour vertex, the reverse process is easier and, furthermore, useful for the determination of other types of constraints. This is done with the following procedure:
Routine B
Step 1. Given x^2, construct the edge-sets T_1, T_2, ..., T_k ⊆ E defining the subtours in G. Any subtour elimination constraint on S = N(T_i) cuts off x^2. Go to Step 2.
Step 2. Compute for i = 1, ..., k
SLACK(T_i) = |T_i| - |T ∩ E(N(T_i))|,
where T denotes the edge-set of the tour vertex x^1. If there exists an i ∈ {1, ..., k} such that SLACK(T_i) = 1, construct a subtour elimination constraint on S = N(T_i) or on V - S, whichever has the smaller cardinality. Else there does not exist a subtour elimination constraint solving (P1).
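Routine A can be stated in executable form. The following sketch (an illustrative modern Python rendering, with x^2 stored as a sparse dict on sorted city pairs) enumerates the contiguous segments S = {i, ..., k} of the tour (1 2 ... n) and returns the most violated subtour elimination constraint:

```python
def routine_a(x2, n):
    """Most violated subtour constraint x(S) <= |S| - 1 over contiguous
    segments S = {i, ..., k} of the tour (1 2 ... n). Returns (i*, k*, gap)
    or None if no such constraint containing the tour cuts off x2."""
    val = lambda i, j: x2.get((min(i, j), max(i, j)), 0.0)
    SUM = {1: 0.0}
    gap, best = 0.0, None
    for k in range(2, n + 1):
        SUM[k] = 0.0
        # The new edges (j, k) extend every segment {i, ..., k} with i <= j < k.
        col = [val(j, k) for j in range(1, k)]
        delta = sum(col)
        for i in range(1, k):
            SUM[i] += sum(col[i - 1:])  # SUM(i) = x2({i, ..., k})
        if delta > 1.0:  # otherwise no segment ending at k can improve (Step 3)
            for i in range(1, k):
                y = SUM[i] - k + i  # violation of x(S) <= |S| - 1 on S = {i..k}
                if y > gap:
                    gap, best = y, (i, k)
    return (best[0], best[1], gap) if best else None

# Example (n = 6): x^2 lies on the subtours (1 2 3) and (4 5 6).
x2 = {(1, 2): 1, (2, 3): 1, (1, 3): 1, (4, 5): 1, (5, 6): 1, (4, 6): 1}
print(routine_a(x2, 6))  # (1, 3, 1.0): S = {1, 2, 3} is violated by 1
```

The quadratic inner loops are written for clarity; the paper's version exploits the sparsity of G_F.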
4.2. 2-Matching constraints

Despite the fact that matching theory is a well understood area in integer programming, we have not yet been able to utilize (or modify) the known results so as to solve (P1) for type 2 constraints. Rather, we have implemented a heuristic for identifying 2-matching constraints which is essentially enumerative and derived from the following consideration: Suppose that the point x^2 in (P1) is a fractional vertex of Q_A^n adjacent to x^1 on Q_A^n. Then one can show that there exists a type 2 constraint that solves (P1). (See [15, Lemma 3.6, Theorem 3.8] for a related argument that can be modified to prove the assertion.) The reason is that the partial graph G_F = (V_F, E_F) induced by the edges (i, j) ∈ E with 0 < x^2_ij < 1 consists of an even number of disjoint odd cycles. Every odd cycle has an odd
number of edges (i, j) incident to it which satisfy x^2_ij = 1. Consequently, choosing S to be the node-set of the odd cycle and H to be the collection of these edges incident to it, we obtain an even number of 2-matching constraints which solve (P1). To generalize this construction, we first determine for an arbitrary x^2 ∈ Q_A^n the graph G_F = (V_F, E_F). (Note that this G_F is different from the G_F used in Section 4.1.) By tedious but elementary considerations one can show that not only the different connected components, but under certain circumstances also the weakly connected components of G_F play a role in determining a constraint of type 2. To this end we use an algorithm [18] to determine all cutnodes of G_F. (A cutnode of G_F is a node whose removal increases the number of connected components of G_F by at least one.) The resulting connected components are then used as candidate sets for the set S in the 2-matching constraint (2.3). For each set S we identify a set of edges H incident with S and satisfying x^2_ij > 0 for (i, j) ∈ H. If a constraint of type 2 thus found is satisfied at equality by x^1 and cuts off x^2, it is returned to the main program. Connected components having cutnodes with no edge (i, j) satisfying x^1_ij = x^2_ij = 1 incident to them are also considered for constraint generation, and the program attempts the constraint generation starting with the smallest cardinality candidate set S. Even though this procedure does not guarantee a solution to (P1), we have found it to be very effective in identifying 2-matching constraints. In judging the computational work involved it should be kept in mind that the graph G_F is extremely sparse and generally has only a very small number of nodes. Furthermore, lengthy considerations show that we can always cut off points x^2 which are such that the face of minimal dimension of Q_A^n containing both x^1 and x^2 has dimension two.
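The odd-cycle construction can be illustrated on a standard small fractional point (illustrative data, not taken from the paper): two triangles at value 1/2 joined by three edges at value 1 satisfy all degree equations, and the 2-matching constraint (2.3) built from one triangle is violated:

```python
def two_matching_violation(x, S, H):
    """LHS minus RHS of the 2-matching constraint x(S) + x(H) <= |S| + |H|//2."""
    x_S = sum(v for (i, j), v in x.items() if i in S and j in S)
    x_H = sum(x.get(e, 0.0) for e in H)
    return x_S + x_H - (len(S) + len(H) // 2)

# Fractional point: two triangles at value 1/2, joined by three edges at value 1.
# Every node has degree 1/2 + 1/2 + 1 = 2, so the degree equations hold.
x = {(1, 2): .5, (1, 3): .5, (2, 3): .5, (4, 5): .5, (4, 6): .5, (5, 6): .5,
     (1, 4): 1.0, (2, 5): 1.0, (3, 6): 1.0}

# S = node-set of one odd cycle, H = the edges at value 1 incident to it.
S, H = {1, 2, 3}, [(1, 4), (2, 5), (3, 6)]
print(two_matching_violation(x, S, H))  # 0.5 > 0: the constraint is violated
```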
4.3. Other constraints

The generation of comb inequalities, generalized combs and chain-constraints is based on adjacency properties of vertices on Q~. If x^1 and x^2 in (P1) are adjacent integer vertices of Q~, then the symmetric difference graph G_S = (V_S, E_S) is either an even cycle or consists of two odd cycles joined at a node, where E_S = {(i, j) ∈ E | x^1_ij ≠ x^2_ij} and V_S = N(E_S). The edges in E_C = {(i, j) | x^1_ij = x^2_ij = 1} form "chains" connecting the nodes in V_S. We have studied various patterns of these graphs and derived several sufficient conditions that ensure existence of a constraint of type 3, 4 or 5 with the required properties. Rather than discussing all the cases that we have programmed, we will discuss a few that are representative. Let T_1, T_2, ..., T_r ⊆ E be the edge-sets of the subtours defined by x^2 and T ⊆ E be the edge-set of the tour x^1. Since we can assume that there does not exist a subtour elimination constraint solving (P1), we know that SLACK(T_i) ≥ 2 for i = 1, ..., r, where SLACK is computed by Routine B of Section 4.1. If
SLACK(T_i) > 2 for all i, we do not know of any constraint that solves (P1). Otherwise let SLACK(T_1) = 2 and select a pair of nodes u and v of V_S ∩ N(T_1) which are connected by a chain H_0 in E_S − T_1. (Since SLACK(T_1) = 2, there are exactly two candidate pairs (u, v).) If u and v are connected by a chain H_1 ⊆ T ∩ T_1, let S_0 = N(H_0) ∪ N(H_1), and S_1 = N(T_1). Let k = |E_C ∩ star(N(H_0) − {u, v})| + 1 and denote by S_i the k − 1 two-element node-sets defined by the edges in E_C ∩ star(N(H_0) − {u, v}) for i = 2, ..., k. Then the generalized comb constraint (2.5) on S_0, S_1, ..., S_k solves (P1). If u and v are not connected by a chain H_1 ⊆ T ∩ T_1, we check whether or not we can partition N(T_1) into node-sets R, S_1, ..., S_p satisfying
(i) u ∈ R, v ∈ R, |R| = p,
(ii) Σ_{i=1}^p x^2(S_i) + Σ_{i=1}^p x^2(R : S_i) = |T_1|,
(iii) Σ_{i=1}^p x^1(S_i) + Σ_{i=1}^p x^1(R : S_i) = |T_1| − 2.
If these conditions cannot be met, we do not know any constraint that solves (P1). Else we let q = |E_C ∩ star(N(H_0) − {u, v})| and denote by S_i the q two-element node-sets defined by the edges in E_C ∩ star(N(H_0) − {u, v}) for i = p + 1, ..., p + q. Then the chain-constraint (2.6) on S_0, S_1, ..., S_k solves (P1), where k = p + q and S_0 = N(H_0) ∪ R. The above conditions (i)–(iii) can be checked very effectively by a labelling technique in both cases where G_S is an even cycle or two odd cycles joined at a node. Similar conditions have been developed to identify comb inequalities. In order to use the above results to cut off fractional vertices x^2, we represent x^2 as a convex combination of two integer vertices (if possible) and attempt to generate constraints of type 3, 4 or 5 according to the above procedure. If the attempt is successful, we check whether or not the fractional vertex x^2 can be cut off this way. If the answer is negative, we default to solving the current linear program.
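The sets E_S and E_C of the symmetric difference construction are straightforward to compute; the sketch below is our own illustration of the definitions (it does not check adjacency of the two vertices on Q~):

```python
def symmetric_difference_graph(edges1, edges2):
    """E_S: edges used by exactly one of the two 0-1 solutions (the
    symmetric difference); E_C: edges used by both (they form the
    "chains" connecting the nodes of G_S)."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    return e1 ^ e2, e1 & e2

def components(edges):
    """Node-sets of the connected components of the graph induced by
    `edges`, via a small union-find."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for e in edges:
        i, j = tuple(e)
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())
```

For the tours 0-1-2-3-4-0 and 0-1-3-2-4-0, E_S is the even cycle 1-2-4-3-1 and E_C consists of the common edges (0,1), (2,3), (4,0).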
5. The computational study

The program was written in FORTRAN IV and compiled using the H-compiler (option 2). The runs were executed on the IBM 370-168 MVS of the IBM T.J. Watson Research Center. To execute the program for up to 120 cities with a proviso for up to 120 additional (automatically generated) constraints, a total storage capacity of 608 K was needed. The entire program uses fixed-point arithmetic so as to avoid any problems connected with round-off errors. In order to save storage, integer halfwords are used wherever possible. All simplex computations are carried out in rational form; integer words (not halfwords) are used in this part of the program, with three double-precision floating-point storage locations for the calculation of the steepest-edge pivot column. Special provisions against cycling in the simplex-method part are not implemented: We use an upper bound of 3n on the total pivot count and, of course, a time limit for
the execution of the program. The basis inverse B⁻¹ is stored explicitly, i.e. we do not use the product-form of the basis inverse. To accommodate the rational form, we calculate the greatest common divisor whenever the denominator exceeds 16. To our surprise, the largest value of the denominator encountered in this study was of the order of 2000, i.e. overflow problems were not encountered. The value of order 2000 was, however, a singular event, and the bulk of values for the denominator was well below 200. It should be kept in mind that the determinant of the basis may very well be several orders of magnitude larger than the denominator in the rational form after the factoring out of common divisors. (A discussion with Philip Wolfe concerning this point convinced us that the work involved to compute and factor out the greatest common divisor is about of the order of a pivot operation and thus can be done reasonably quickly.) The incidence matrix A of the complete graph G is stored compactly as a 2 × m dimensional array of integer halfwords. The constraints that are generated during the constraint-identification phase are stored compactly using integer halfwords as two separate lists with respective pointers: In one list we store the edge-set of each constraint, in a second list we store the node-sets of each constraint. As the coefficients of the generated constraints are 0, 1 or 2, this can readily be implemented with coefficients of 2 stored as two consecutive indices so as to avoid multiplications in the computation of a transformed column or the reduced cost. The storage of new constraints as both edge- and node-sets is necessary if the optimization is carried out sequentially over sparse subgraphs rather than the complete graph, see Section 5.2. The overall objective of the computational study is to empirically validate the usefulness of facetial inequalities in the solution of travelling salesman problems.
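The factoring out of common divisors mentioned above can be illustrated in a few lines; this is a sketch of the idea only, not of the FORTRAN implementation, which keeps a row of rationals as integer numerators over a single common denominator:

```python
from math import gcd
from functools import reduce

def reduce_row(numerators, denominator):
    """A row of rationals stored as integer numerators over one common
    denominator; dividing everything by the greatest common divisor
    keeps the stored integers small and staves off overflow."""
    g = reduce(gcd, numerators, denominator)
    return [a // g for a in numerators], denominator // g
```

For example, `reduce_row([6, 10, 4], 16)` returns `([3, 5, 2], 8)`.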
In order to evaluate the value of facetial inequalities towards the goal of proving optimality we proceed as follows: Given a heuristically obtained solution, we solve the 2-matching problem as a linear program, i.e. without the integrality stipulation. Using the same solution as a starting solution, we run the problem a second time generating facetial inequalities: In the second run we either terminate with an optimal tour or, in case of our inability to identify a suitable new constraint, default to solving the by now enlarged linear programming problem. This gives us two values: VALUE1 is the objective function value without cuts and VALUE2 is the objective function value with cuts. If TOUR denotes the minimum length tour of the problem, then the following ratio is a good proxy for measuring the added value of the additional work: RATIO = (VALUE2 − VALUE1)/(TOUR − VALUE1). Note that RATIO is zero if no improvement is obtained (e.g. if no constraint was generated), while RATIO is one if the constraint-generation procedure terminates with the optimal tour. RATIO is, of course, always between zero and one and, due to taking both differences and a ratio, the measure is invariant under scaling and translating the data. This is of particular importance since a single
ratio, e.g. VALUE2/TOUR, can be made to "look arbitrarily good" by a simple translation of the data (distances) which affects the numerical values of both tour length and VALUE2, but neither the optimal tour nor the linear programming solution vector. The above measure RATIO has the additional merit of being a conservative measure if the actual optimum tour length is not known and TOUR represents the tour length of the best tour obtained either by the heuristic or during the course of a computation which ended with a default to the amended linear program. In this case, the corresponding value of RATIO is less than the value that is obtained if TOUR is the minimum tour length, so that the added value is never overestimated. It should be noted further that for the computation of both VALUE1 and VALUE2 the same starting solution for the respective linear programs is used, which is known to greatly impact the performance of simplex methods. Due to this choice it is to some extent meaningful to evaluate the trade-off between added value and additional work in terms of increased CPU-times and increased numbers of pivots, which we report as well. Finally, we note that all CPU-times reported include the entire set-up of the problem, in particular the very costly ordering of all edges of the complete graph mentioned in Section 3. Also, to have an insurance against possibly remaining "bugs" in the program, at termination we always check whether or not the final B⁻¹, c_B B⁻¹ as well as the transformed right-hand side vector are computed correctly by multiplying B⁻¹B, recomputing c_B B⁻¹ and the transformed right-hand side vector afresh and comparing them to the values that the program has computed. Though probably negligible, these times are included in the reported CPU-times as well.

5.1. Randomly generated Euclidean problems
The objective of this part of the computational study is to assess broadly the added value of the use of the facetial inequalities in proving optimality of a tour. To generate the problems for this part we used the pseudo-random number generator of Lin and Kernighan [13], which generates coordinates x_i and y_i with values between 1 and 1000 for i = 1, ..., n. Due to the fact that coordinates are generated as pairs, the same random seed produces for n + m cities a graph that properly contains the graph on the first n cities. This is desirable as we wanted to study how increasing n affects the added value of the facetial inequalities. We ran 10 different problems for n = 15, 30, 45, 60 and 75, respectively, using 10 different seeds for the random number generator. Furthermore, 5 Euclidean problems with n = 100 due to Krolak [12] were included in this statistical part of the study, though they are different from the other ones. Table 1 contains all the relevant statistics for this part of the study. The entries in Table 1 were obtained by averaging the respective individual figures; the mean μ is given along with the standard deviation σ. The top row of Table 1 contains the value RATIO. As is to be expected, RATIO declines with increasing n. TOUR is the tour length obtained by the heuristic [13] and
Table 1

          n =      15      30      45      60      75     100
RATIO   μ         1.0    0.99    0.93    0.92    0.88    0.92
        σ         0.0    0.03    0.11    0.10    0.09    0.02
TOUR    μ        3555    4738    5566    6297    6878   21507
        σ         383     314     224     181     224     525
GAP1    μ         224     352     379     452     387    1507
        σ         121     100     149     133      93     313
GAP2    μ         0.0       5      24      38      50     120
        σ         0.0      15      57      44      44      43
PIVOT2  μ          11      34      47      76      87     167
        σ           2       7      13      30      23      40
APIVOT  μ           2      13      17      36      37      97
        σ           2       5      12      29      22      38
TIME2   μ        0.33    1.37    4.46   14.47   30.52  108.74
        σ        0.03    0.26    1.27    6.82   10.81   39.97
ATIME   μ        0.07    0.46    1.39    6.25   11.64   50.4
        σ        0.03    0.23    1.23    6.63   10.63   31.7
CUTS    μ           3      12      15      26      28      72
        σ           2       5       8      13      12      18
OPTIM              10       9       5       4       3       0
somewhat to our surprise, the linear program did in no case find an improved tour. (The average running times for the heuristic on the IBM 370-168 of the Computation Center of the City University of New York using the H-compiler (option 2) were 0.45, 3.48, 6.06, 12.21 and 16.23 seconds of CPU-time for n = 15, 30, 45, 60 and 75. We set the parameter specifying the number of trials in the heuristic equal to 20, see [13].) The bottom line of Table 1 (OPTIM) specifies the number of times the linear program terminated by proving optimality of the heuristically obtained tour. GAP1 measures the average difference between TOUR and the objective function value of the (initial) linear program given by min{cx | Ax = 2e_n, 0 ≤ x ≤ e_m}. GAP2 is the crucial measure in evaluating the constraint-generation procedure. It is the difference between TOUR and the objective function value of the amended linear program. In case of n = 100 this "remaining" gap was on average 120 with a standard deviation of 43. With an average tour length of 21 507 this means that the optimum tour is greater than or equal to 21 387, a truly good lower bound if the order of magnitude has economic significance. (In Appendix A we state all individual figures.) PIVOT2 is the average of the total number of pivot operations carried out by the constraint-generating program. APIVOT is the average increment of the pivot count over what it takes to solve the initial linear program. Likewise, TIME2 specifies the total CPU-time of the constraint-generating program, ATIME the average increment over the respective times for the initial linear program. Thus to terminate with an average lower bound of 21 387 for n = 100 it took on average
108.74 seconds of CPU-time, while it took on average 50.4 seconds of CPU-time less than 108.74 to obtain an average lower bound of 20 000. (The numbers in the respective rows labelled σ are the respective standard deviations of our sample problems.) Finally, CUTS is the average number of constraints that were generated and amended to the original linear program. Thus for n = 100 the initial linear program had 100 rows and 4950 variables, while at termination of the constraint-generation procedure the linear program had increased on average to 172 rows and 5022 variables, a truly modest increase given the complexity of the problem and the goodness of the bound obtained.
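The measure RATIO introduced at the beginning of this section is a one-line computation; the sketch below also checks its invariance under affine transformations of the data (the numerical values are the DAN42 entries as read from Table 3):

```python
def ratio(value1, value2, tour):
    """RATIO = (VALUE2 - VALUE1) / (TOUR - VALUE1): zero if the cuts add
    nothing over the 2-matching LP bound, one if the amended linear
    program reaches the tour length."""
    return (value2 - value1) / (tour - value1)

# DAN42: VALUE1 = 641, VALUE2 = TOUR = 699, so RATIO = 1.0.
assert ratio(641, 699, 699) == 1.0

# Invariance: transforming every value as v -> a*v + b (a > 0), as induced
# by scaling and translating the distance data, leaves RATIO unchanged,
# whereas a single ratio such as VALUE2/TOUR is not invariant.
a, b = 3, 1000
assert ratio(a*641 + b, a*699 + b, a*699 + b) == 1.0
```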
5.2. "Traps" for the travelling salesman [16]

This part of the computational study is concerned with the optimum-finding ability of the linear programming approach. In a recent paper, Papadimitriou and Steiglitz [17] demonstrated the difficulty that local search procedures for combinatorial problems (more precisely, exchange heuristics) can encounter. The authors of [17] construct a class of noneuclidean travelling salesman problems for which local search procedures find the optimum tour only if computational work tantamount to total enumeration is carried out. In fact, by choosing a cost parameter appropriately, the best solution found by a local search heuristic can be made "arbitrarily bad". More precisely, the edge-weights of these graphs take on only the values 0, 1, M and 2M. We ran these test problems both with the program made available to us by Shen Lin [13] and with our code. The results of our tests are displayed in Table 2. The cost parameter M was arbitrarily set to 5n, where n, the number of cities, is a multiple of 8. To get a representative picture we executed 11 problems ranging from n = 40 to n = 120 cities. The first run (RUN1) was started as usual at the solution found by the heuristic. This solution has length START1 and is in terms of the cost matrix a very satisfactory, but not optimal solution. (These problems were run in batch with the heuristic [13]: Using 20 tries, the batch of problems with 40, 48, ..., 72 cities required a total of 99 seconds of CPU-time, the batch of problems with 80, 88 and 96 cities required a total of 217 seconds and the batch of problems with 96, 104, 112 and 120 cities required 550 seconds of CPU-time on the IBM 370-168 of CUNY. The problem with 96 cities was accidentally run twice.) Our code always found the optimal tour and proved optimality in excellent CPU-times.
The second run (RUN2) was started at a randomly selected starting solution with initial tour length START2. Again, in all but one instance the optimum tour was found and proved to be optimal. The only exception is for n = 104 where, after 69 pivots and the generation of 5 cuts, the program did not find a suitable constraint and thus prematurely defaulted to solving the amended linear programming problem. (The numbers in brackets indicate that 34 pivots were carried out after defaulting and that one cut was dropped again.)
Table 2 (results for RUN1 and RUN2 on the eleven test problems, n = 40, ..., 120: START1, START2 and the pivot counts PIVOTS2 and ROUND2)
The execution of these test problems differs from all other reported results in one perhaps important aspect: The problems are solved sequentially by first selecting the sparse subgraph of the entire graph induced by the edges with weight zero and one, plus as many edges with weights M or 2M as were necessary to set up the initial basis, i.e. essentially the number of such edges in the starting solution. Then the optimization is carried out over this sparse graph. Upon termination a subroutine is called to check the reduced cost of the edges not considered so far. If they price out correctly, we terminate. This was always the case in RUN1. If these edges do not price out correctly, as was always the case in RUN2 (except for n = 40), a second "round" is initiated: Given the basis inverse B⁻¹ and the associated dual variables c_B B⁻¹ obtained at the termination of round one, all edges of the graph are ordered on their respective reduced cost computed with respect to the current c_B B⁻¹. Due to our storing the new constraints generated during round 1 as both edge- and node-set lists, it is possible to implement this procedure. Then the constraint-generation program is called again. In no case, however, was it necessary to amend a further constraint. Rather, after as many pivots as shown under the heading PIVOTS2, column ROUND2, the program halted with all columns of the amended linear programming problem priced out correctly. It should be noted that this procedure of extracting a sparse subgraph for a sequential optimization extending over several "rounds" can as well be used on arbitrary travelling salesman problems. Different from the Papadimitriou/Steiglitz test problems, however, we do not know how to "suitably" choose a sparse subgraph. In the case of these test problems the subgraph selection was predicated on our knowledge that the optimum solution was to be found among the edges with weight zero and one.
In general we do not have a sufficiently sharp criterion that would allow us to separate in a meaningful way "desirable" from "less desirable" edges of the graph. The original distances are, generally speaking, a rather weak criterion. Another reason why we did not use a sequential optimization over several rounds for the other problems as well is the fact that the ordering of the edges of the original graph in ascending magnitude of their respective reduced cost is so time-consuming that we wanted to avoid doing it several times. Of course, in principle such sequential optimization can very well be done within a linear programming framework and certainly deserves to be studied further, see also [2].
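Disregarding the generated cuts, a column of the degree constraints Ax = 2e_n has exactly two nonzero entries, so the reduced cost of edge (i, j) is c_ij − u_i − u_j, where u denotes the dual variables of the degree equations. Pricing out the edges left out of the sparse subgraph is then a single pass; the following is a simplified sketch (the dual contributions of the amended constraints are omitted):

```python
def price_out(all_edges, cost, active, u, eps=1e-9):
    """Return the edges outside the active sparse subgraph whose reduced
    cost c_ij - u_i - u_j is negative: these do not price out correctly
    and must be brought in for a further "round" of optimization."""
    return [(i, j) for (i, j) in all_edges
            if (i, j) not in active and cost[(i, j)] - u[i] - u[j] < -eps]
```

On a toy triangle with duals u = (1, 1, 1), costs c_01 = 2, c_02 = 5, c_12 = 1 and active set {(0, 1)}, only edge (1, 2) fails to price out.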
5.3. Test problems from the literature on TSPs

In order to permit a limited comparison of the performance of the constraint-generation procedure vis-à-vis other approaches we solved a number of test problems that have been used by other researchers. The results are summarized in Table 3. The heading "without cuts" refers to the solution of the (initial) linear program: TIME1 is the CPU-time in seconds, PIVOT1 the pivot count,
Table 3

                Without cuts                      With cuts
Problem    TIME1  PIVOT1  VALUE1     TOUR   VALUE2   PIVOT2      CUTS     TIME2   RATIO
DAN42       2.57     30     641       699     699      37   0    9   0     3.10   1.0
GRO48       4.09     33    4769      5046    5031~     83   9   32   8     9.16   0.95
HEL48       3.69     33   11197     11461   11461      38   0   10   0     4.30   1.0
TOM57       7.79     44   12633~    12955   12940      61   4   22   1    10.40   0.95
KROL70     16.33     53     623~      675     673~    120   8   44  10    31.91   0.98
GRO120     111.20    69    6662~     6942    6938~    243   4   75  17   221.50   0.99
GRO120                               6951    6928~    166  17   49   4   171.44   0.92
KNU121      4.54     45     328       349     343~     74  13   10   1     7.25   0.76
LIN318     670.8    251   38765~    41349   41236~    578  70  171  64  1751.46   0.96
VALUE1 the objective function value. TOUR refers to the minimum length tour or the value of the best tour found by the heuristic. The heading "with cuts" refers to the constraint-generation procedure: VALUE2 is the objective function value of the linear program with cuts; the first column under PIVOT2 refers to the total number of pivots, the second column under PIVOT2 to the number of pivots carried out after the default in the constraint-generation procedure (i.e. the second column is counted already in the first). CUTS specifies the total number of cuts generated in the run, its second column the number of cuts that were dropped again after defaulting. TIME2 is the total execution time to termination in CPU-seconds. RATIO is the value discussed in the introduction to this section. DAN42 is the 42-city version of the 49-city problem due to Dantzig et al. [3]. The solution was proven to be optimal in 3.10 seconds of CPU-time after adding nine constraints. These nine constraints are given in Appendix B. GRO48 is a 48-city problem due to Grötschel [6] (48 cities with distances given in Shell's Roadatlas of Germany). After 9.16 seconds of CPU-time the program terminated with a lower bound of 5032 for the optimum tour; the best tour found by the heuristic has a length of 5046. HEL48 is the 48-city problem due to Held and Karp [8]. The solution was proven to be optimal in 4.30 seconds of CPU-time after adding ten constraints. These ten constraints are given in Appendix B. TOM57 is the 57-city problem due to Thompson and Karg [9]. After 10.40 seconds of CPU-time a lower bound of 12 940 for the optimum length tour of 12 955 was obtained. (Optimality was proven by Held and Karp [8].) KROL70 is a 70-city problem due to Krolak [12]. After 31.91 seconds of CPU-time a lower bound of 674 on the heuristically obtained best tour of length 675 was obtained.
GRO120 is a 120-city problem due to Grötschel [6], who proved 6942 to be the minimum-length tour using the same general algorithmic approach as described in this paper. (The problem has 120 cities with distances given in the Deutscher
Generalatlas, Mairs Geographischer Verlag, Stuttgart 1967/8.) In the case of this problem, the heuristic [13] obtained in 158 CPU-seconds and 20 tries on CUNY's IBM 370-168 a best tour with length 6951. When this suboptimal solution was used as a starting solution to our linear programming code, we obtained after roughly 3 minutes of CPU-time a lower bound of 6929. When the optimal tour was used as a starting solution to our linear programming code, we obtained after roughly 4 minutes a better bound of 6939. (This problem was also run in the interactive mode using a blip to time the amount of time needed to order the 7140 edges of the complete graph on 120 nodes according to increasing reduced costs. As the program was executed during the night shift with comparatively few users, the blip count is fairly accurate. The preprocessing took about 60 seconds of CPU-time. Of course, it must be noted that this time is not in any sense "absolute", though it is part of the reported CPU-times for the linear programming code. This time depends crucially upon the distribution of the reduced cost and upon the order in which these numbers are encountered. A rather straightforward ordering routine was programmed, as we initially paid not very much attention to this detail.) KNU121 is a supersparse 121-city problem due to Knuth [11]. Our code encountered very early an "unknown" vertex and defaulted to solving the amended linear program. 7.25 seconds of CPU-time were used to obtain a lower bound of 344 on the optimum tour length of 349 published in the New York Times.
5.4. A 318-city problem
LIN318 is a 318-city problem the data of which are published in [13]. The data come from an actual problem involving the routing of a numerically controlled drilling-machine through three identical sets of 105 points each plus three outliers. As the drilling is done by a pulsed laser, drilling time is negligible and the problem becomes a standard travelling salesman problem. The only exception from the standard form is that a particular start and end point are to be used; the resulting Hamiltonian path problem can, however, easily be accommodated within the linear programming framework by assigning a negative distance to the particular arc. The distance-table of the complete graph on 318 points (with the exception of one arc) was computed from the coordinates published in [13]. The coordinates are given in milli inches and the usual single-precision square-root function was used to calculate the distances. More precisely, we compute the distances by taking the square-root of the sum of the squared differences, adding 0.5 to the resulting real number and subsequently truncating it to its integer part. While the resulting total path length differed by approximately 10 milli inches from the total path length that results from adding up the 317 individual real numbers, we felt that this difference was small enough to justify our approximation. If the best solution published in [13] is calculated this way, it is 41 871 milli inches (rather than 41 883 milli inches) and the best solution that our
procedure produced (see Fig. 2) has 41 349 milli inches. It is thus 522 milli inches shorter and, furthermore, it is at most 0.3% off the shortest possible Hamiltonian path through the 318 points with distances as defined above (see Table 3). The (initial) linear programming problem that has to be solved to obtain this result has 318 equations and 50 403 variables with upper bounds of one. Since 50 000 exceeds the allowable maximum for an integer halfword, several changes in our program had to be carried out. Also, it seemed unreasonable to continue to order columns by their reduced cost as done previously. (In fact, in one trial run we used this part of the program to order the first 30 000 variables. The
Fig. 2. 318-city problem.
ordering took 10 minutes of computing time.) Furthermore, the "longest" arcs are the most unlikely candidates to enter into the solution. We consequently considered, as in the case of sequential optimization in sparse subgraphs (see Section 5.2), explicitly only a small subset of columns in the actual optimization. The bulk of "long" arcs was then simply checked at termination and always found to price out correctly, if the cut-off point was chosen large enough. We first ran the problem with a cut-off point of 1500 milli inches. This produced a problem with 30 939 variables. The associated (initial) linear program was solved in 11 minutes, 10.8 seconds on the IBM 370-168 MVS of the IBM T.J. Watson Research Center and the remaining 19 464 arcs priced out correctly. As the storage capacity presented no problem, a proviso for up to 382 additional constraints was made and 3688 K of storage used. Later the storage requirement was reduced to about 2700 K by using a smaller cut-off point, and this number could have been halved further by reducing the proviso for additional constraints, as the program never added more than 200 constraints. Obviously, the 700 × 700 array for the storage of the explicit basis inverse as integer words accounts for most of the storage requirements. The optimal solution of the (initial) linear programming problem was plotted and consisted of 12 cycles of length 3 with associated arc values equal to 1/2, 32 subtours (most of which had length 3 or 4) and a number of chains linking the various parts. As a consequence of the 12 fractional cycles, the determinant of the optimal basis is greater than or equal to 2^12 = 4096 and probably several orders of magnitude larger than that, due to some further 17 3-cycles with associated arc values equal to one.
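The distance computation described at the beginning of this section (single-precision square root of the sum of squared coordinate differences, plus 0.5, truncated to the integer part) can be rendered as:

```python
from math import sqrt

def dist(p, q):
    """Distance in milli inches between two coordinate pairs: square root
    of the sum of squared differences, plus 0.5, truncated to its
    integer part (i.e. rounded to the nearest integer)."""
    return int(sqrt((p[0] - q[0])**2 + (p[1] - q[1])**2) + 0.5)
```

For example, `dist((0, 0), (3, 4))` is 5, and `dist((0, 0), (1, 1))` truncates 1.914... to 1.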
After factoring out the common divisors, the denominator of the 252 different bases in rational form encountered in the solution of this large-scale, highly degenerate linear programming problem never exceeded 16. (In fact, it is at most 2.) Furthermore, the optimum objective function value is 38 765~ milli inches; thus we have a gap of 3105~ milli inches to the solution published in [13]. The problem was then run with the constraint-generation procedure started with a tour length of 41 745 and a cut-off point of 1500. (This solution had been found in a previous trial run executed in the interactive mode using the sparse-subgraph sequential optimization described in Section 5.2.) The only changes to this point that had been made in the program concern the necessity of using integer words rather than halfwords for the storage of certain lists or their pointers. The procedure found after 209 pivots and the generation of 43 cuts a better tour of 41 729 milli inches and terminated in 23 minutes, 46.85 seconds after generating a total of 60 cuts and defaulting after 256 pivots to the linear programming subroutine, where an additional 151 pivots were carried out. The remaining 19 464 arcs priced out correctly and the optimal objective function value of the amended linear program was 40 518, leaving a gap of 1211. The largest denominator encountered in this run was 48. The new tour was plotted and manually improved to 41 713 milli inches. (Plotting and inputting a new tour into the computer took about three man-hours.) As the gap was
rather large, we decided at this point to rerun the entire problem with the heuristic procedure made available to us by Shen Lin. As the solution published in [13] had been "obtained by joining several pieces of about 100 points, each optimized" by the heuristic (see [13, p. 513]), we expected the heuristic to find a better solution if the problem was solved as a whole, rather than by partitioning. The problem was run on the IBM 370-168 of the City University of New York using a region size of 900 K. After 50 tries and 28.33 minutes of CPU-time, the heuristic found a best solution of length 41 415 milli inches. (As we believe that this is the first time that the heuristic [13] has been used on a problem of such large scale, we have included Table 4, which gives the frequency distribution of the 50 local optima found during the execution of the heuristic. 41 415 is the smallest, 42 676 the longest path identified as locally optimum by the procedure. Raw data are contained in Appendix C.) Before the best solution found by the heuristic was used to run our constraint-generation procedure, several changes were implemented: In order to delay a default to the linear programming subroutine, i.e. in order to increase the chance of generating more constraints than in the first run, a kick-off list of four was used to store the four "best" pivot columns. In case the first pivot-column, i.e. the steepest-edge column, defined an "unknown" vertex, the next best pivot-column was tried, and so forth. As a consequence, a default to the linear programming subroutine occurred only if all four pivot columns on the kick-off list defined an "unknown" vertex that the program could not cut off. Furthermore, casual analysis of the fractional vertices encountered in the previous run showed that the program identified subtour elimination constraints later on during execution which could have been added earlier when a non-tour vertex was encountered.
(This was due to the fact that we generated only one constraint at a time.) It was thus desirable to add several constraints at a time. This change was carried out for subtour-elimination constraints only. Also, while in previous runs all constraints with basic slack-variables were

Table 4

Tour length      Frequency
41415-41449          1
41450-41549          4
41550-41649          5
41650-41749          6
41750-41849          7
41850-41949          6
41950-42049          6
42050-42149          4
42150-42249          4
42250-42349          4
42350-42676          3
M. W. Padberg, S. Hong / A computational study for TSPs
dropped when a default occurred, this no longer seemed desirable and the program was changed accordingly. The constraint-generation procedure was then started with the best solution with value 41 415 and a cut-off point of 600 milli-inches, producing a sparse subgraph with 4076 edges. In this run, the solution was improved (twice) to a new solution with value 41 355, and after a total of 478 pivots and 153 constraints the program defaulted to solving the amended linear programming problem. Up to this point the largest value of the denominator was 120. The linear programming subroutine (possibly due to a programming error found subsequently in the changed linear programming subroutine) encountered a rapid build-up of the denominator to a value greater than 4000. As a result the array c_B B^{-1} overflowed, the program started to cycle and stopped due to the time limit. After correcting the code, the constraint-generation procedure was run again with a starting value of 41 355. This run improved the value to 41 349, i.e. the tour plotted in Fig. 2 was found. (We are indebted to Arthur Appel of IBM Research for his kind help with the computer graphics facilities.) In this run the largest value of the denominator was 912 and 183 constraints were generated. After 508 pivots the program defaulted to solving the amended linear programming problem, which took another 53 pivots. At termination, all 50 403 arcs as well as the nonbasic slack variables priced out correctly. Consequently, the optimal objective function value of 41 236 obtained in this run constitutes a lower bound on the minimum length path through the 318 points. (Total execution time for this run was 37 minutes, 11.26 seconds.)
The problem was run again with a starting value of 41 439 and with the following additional check programmed into the subroutine which checks all 50 403 arcs at termination for optimality: If we terminate optimally and the reduced cost at optimality of a zero-one variable j is greater than or equal to the (positive) gap between the value of the best path, i.e. 41 439, and the l.p. objective function value, then there cannot be a strictly better integer solution satisfying x_j = 1. Likewise, if variable j is at its upper bound of one in the optimal l.p. solution and the negative of its reduced cost is greater than or equal to the gap, then there cannot be a strictly better integer solution satisfying x_j = 0. (See also Dantzig et al. [3] for the same kind of argument. These observations are, of course, basic elements of any branch-and-bound procedure.) It was thus interesting to know how many variables could be eliminated at the end of the constraint-generation procedure using this simple bounding device. (The details of this run are in Table 3.) The run terminated optimally and it turned out that at termination 48 871 variables could be eliminated this way, thus leaving a problem in 1532 zero-one variables, 318 equations and 171 inequalities. If the economics of this particular application demanded a true optimum solution, one would have, in view of the small remaining gap of 112, a better than even chance to solve this comparatively small problem exactly by any good branch-and-bound code.
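The bounding device just described amounts to what is now called reduced-cost fixing. A minimal sketch follows (in Python, with hypothetical names and data structures; an illustration of the argument, not the authors' code):

```python
def eliminate_variables(reduced_costs, at_upper, z_lp, best_tour):
    """Reduced-cost fixing sketch (hypothetical helper, not the original code).

    reduced_costs[j] : reduced cost of 0-1 variable j at the LP optimum
    at_upper[j]      : True if x_j = 1 in the optimal LP solution
    z_lp             : optimal LP objective value (a lower bound)
    best_tour        : length of the best known path/tour (an upper bound)
    Returns lists of variables that can be fixed at 0 and at 1.
    """
    gap = best_tour - z_lp
    fix_at_zero, fix_at_one = [], []
    for j, d in enumerate(reduced_costs):
        if not at_upper[j] and d >= gap:
            # no strictly better integer solution can have x_j = 1
            fix_at_zero.append(j)
        elif at_upper[j] and -d >= gap:
            # no strictly better integer solution can have x_j = 0
            fix_at_one.append(j)
    return fix_at_zero, fix_at_one
```

Variables fixed this way can simply be dropped from the problem handed to a branch-and-bound code, which is exactly how the 50 403 arcs were reduced to 1532.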
6. Conclusions
The most surprising outcome of our study is that only very few additional facetial inequalities are needed in order to obtain an excellent lower bound on the minimum tour length and, in some cases, to also prove optimality. The bounds obtained in the statistical part of the computational study are consistently very close to the optimal tour length, and the standard deviations are consistently small as well. The test problems which proved insoluble to the local search procedure were solved to optimality without any difficulties. Our results for the test problems from the literature, including the (by today's standards) truly large-scale travelling salesman problems with 120 and 318 cities, respectively, were generally better than we would have expected based on the statistical part of our study. In particular, the bound for the 120-city problem indicates that the solution is within 0.04% of the optimum tour (Grötschel [6] actually proved optimality of the solution using facetial inequalities as well) and the bound for the 318-city problem indicates that the solution is within 0.26% of the minimum length Hamiltonian path through the 318 points. With the resulting (remaining) gap between the best tour found and the bound obtained by the use of facetial inequalities being so relatively small, it is entirely realistic to expect that any good branch-and-bound procedure will enable one to solve large-scale travelling salesman problems to optimality. It thus appears worthwhile and promising to continue the line of the algorithmic approach to travelling salesman problems begun by Dantzig et al. [3] in 1954 and to continue the theoretical studies concerning the facial structure of the travelling salesman polytope as done in [1], [6] and [7]. In addition, our study substantiates the fact that a carefully designed heuristic such as [13] does indeed provide very satisfactory, if not optimal, solutions to travelling salesman problems.
The main difficulty encountered in the study is connected with the solution of problem (P1) formulated in Section 4. While problem (P1) captures the implementational difficulties of a primal-simplex approach to proving optimality of solutions to travelling salesman problems, the following problem (P2) formulates the same problem from a dual-simplex point of view:

(P2) Given x² ∈ Q̃_A, a non-tour feasible point of Q̃, find an inequality ax ≤ a₀ of type i satisfying ax² > a₀, if one exists.

As before, type i refers to the special combinatorial description of, e.g., subtour-elimination and generalized comb inequalities. In the case of subtour-elimination constraints, problem (P1) is a min-cut problem which can be solved efficiently. A promising strategy to pursue in the solution of both (P1) and (P2) is to look for a constraint of the given type which is violated most by x². If (P1) and/or (P2) can be solved efficiently for comb inequalities and generalized comb inequalities, this should go a long way towards solving the problem of proving optimality in travelling salesman problems.
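For illustration, separating subtour-elimination constraints can be viewed as a cut-weight minimisation: with the degree equations in force, x(E(S)) ≤ |S| − 1 is violated exactly when the cut weight x(δ(S)) falls below 2. The sketch below (Python; names are ours) replaces an efficient min-cut algorithm by plain subset enumeration, which is only workable for very small n:

```python
from itertools import combinations

def most_violated_subtour(n, xbar):
    """Brute-force separation sketch for subtour-elimination constraints.

    xbar : dict {(i, j): value} with i < j, the fractional LP point.
    Returns the node set S with the smallest cut weight x(delta(S)),
    together with that weight; a weight below 2 certifies a violated
    subtour-elimination constraint. A min-cut algorithm would do this
    efficiently; enumeration here is purely illustrative.
    """
    def cut_weight(S):
        S = set(S)
        return sum(v for (i, j), v in xbar.items() if (i in S) != (j in S))

    best_S, best_w = None, float("inf")
    for k in range(1, n // 2 + 1):          # complements give the same cuts
        for S in combinations(range(n), k):
            w = cut_weight(S)
            if w < best_w:
                best_S, best_w = set(S), w
    return best_S, best_w
```

On a point consisting of two disjoint fractional cycles, the routine returns one of the cycles with cut weight 0, i.e. a maximally violated subtour-elimination constraint.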
Note added in proof. Problems (P1) and (P2) have recently been solved by Padberg and Rao [20] for 2-matching constraints. The necessary calculations essentially require the solution of a minimum cut-set problem and can be done in polynomial time. The large-scale problems of Sections 5.3 and 5.4 have been solved to optimality by Crowder and Padberg [21] using the IBM packages MPSX/370 and MIP/370 in a novel (iterative) way.
Acknowledgements

We are indebted to the many people who helped us gather the data for the study: Harlan Crowder of IBM, Martin Grötschel of the University of Bonn, Shen Lin of Bell Laboratories, Chris Papadimitriou of Harvard University and Bill Stewart of the University of Maryland. The work on the study began in February 1975 and was terminated in July 1977. We are most grateful for the generous support given to us by the computation centers of the City University of New York, the IBM T.J. Watson Research Center and New York University during the course of the study. The study was completed while M.W. Padberg was a Visiting Scientist at the IBM T.J. Watson Research Center (September 1976-September 1977). During this time, he benefited from numerous discussions with H.P. Crowder, P.C. Gilmore, A.J. Hoffman, E.L. Johnson and Phil Wolfe on the subject of this study.
Appendix A. Legend to Table A1 through Table A6

TIME1: XX.YY seconds of CPU time on an IBM 370-168 using FORTRAN H, Option 2.
PIVOT1: Total number of pivots to solve min{cx | Ax = 2, 0 ≤ x ≤ 1} starting from the tour obtained through the heuristic.
VALUE1: Objective function value of min{cx | Ax = 2, 0 ≤ x ≤ 1}.
TOUR: Tour length obtained through the heuristic.
VALUE2: Objective function value after adding cuts. (If VALUE2 = TOUR, then the heuristic solution was optimal.)
PIVOT2: First column gives the total number of pivots in the constraint-generation part plus BVLINP; second column gives the number of pivots in BVLINP only.
CUTS: First column gives the total number of cuts generated; second column gives the number of cuts dropped again in BVLINP.
TIME2: XX.YY total execution time in CPU seconds for the constraint-generation part plus BVLINP.
RATIO: (VALUE2 - VALUE1)/(TOUR - VALUE1).
Table A1. 15-cities random Euclidean problems

Problem   Without cuts                With cuts
No.       TIME1  PIVOT1  VALUE1      TOUR  VALUE2  PIVOT2  CUTS  TIME2  RATIO
 1        00.28    9     3211 8/9    3412   3412   12  0   5  0  00.36   1.0
 2        00.26   11     3469        3961   3961   13  0   3  0  00.37   1.0
 3        00.24    7     3531 1/2    3767   3767    9  0   2  0  00.30   1.0
 4        00.26    9     2968        3156   3156   12  0   2  0  00.31   1.0
 5        00.27    8     3759        4039   4039   13  0   5  0  00.31   1.0
 6        00.25    7     3070        3119   3119    7  0   1  0  00.30   1.0
 7        00.25    8     3760        3996   3996   14  0   6  0  00.37   1.0
 8        00.28    8     2709        3011   3011    8  0   3  0  00.31   1.0
 9        00.24    8     3328 1/2    3452   3452   10  0   1  0  00.31   1.0
10        00.28   10     3502        3632   3632    9  0   1  0  00.32   1.0
Table A2. 30-cities random Euclidean problems

Problem   Without cuts                With cuts
No.       TIME1  PIVOT1  VALUE1      TOUR  VALUE2  PIVOT2  CUTS   TIME2  RATIO
 1        00.93   24     4499 8/9    4812   4812   36  0    9  0  01.29   1.0
 2        00.99   25     4814        5138   5138   44  0   21  0  01.76   1.0
 3        00.87   17     4161        4616   4616   24  0    8  0  01.07   1.0
 4        01.11   23     4106~       4607   4607   34  0    9  0  01.44   1.0
 5        00.89   21     4276        4624   4624   45  0   20  0  01.80   1.0
 6        00.80   19     4434        4739   4739   29  0   10  0  01.17   1.0
 7        00.89   22     4636        5111   5064   37  6   12  0  01.49   0.90
 8        00.97   26     4053 8/9    4288   4288   35  0   13  0  01.41   1.0
 9        00.79   20     4935        5132   5132   29  0    8  0  01.08   1.0
10        00.92   19     3947 1/2    4313   4313   30  0    9  0  01.24   1.0
Table A3. 45-cities random Euclidean problems

Problem   Without cuts                With cuts
No.       TIME1  PIVOT1  VALUE1      TOUR  VALUE2   PIVOT2    CUTS   TIME2  RATIO
 1        03.79   35     5199        5438   5422    61   4   22  2   05.87   0.97
 2        02.82   26     5786 8/9    5846   5834~   38   9   13  4   03.58   0.81
 3        02.94   32     5025        5549   5549    40   0   11  0   03.62   1.0
 4        03.03   27     5142        5608   5608    53   0   21  0   04.98   1.0
 5        03.14   31     5049        5518   5518    44   0   15  0   03.92   1.0
 6        02.79   29     4771 8/9    5241   5237~   74   3   32  5   07.30   0.97
 7        03.03   33     5296        5808   5623~   41  13   10  1   04.00   0.64
 8        02.94   23     4779        5218   5218    30   0    6  0   03.35   1.0
 9        02.54   24     5486        5822   5822    42   0    8  0   03.48   1.0
10        03.76   38     5334        5612   5572    49   6   13  0   04.55   0.88
Table A4. 60-cities random Euclidean problems

Problem   Without cuts                With cuts
No.       TIME1  PIVOT1  VALUE1      TOUR  VALUE2     PIVOT2    CUTS    TIME2  RATIO
 1        08.23   43     5954        6330   6305      66   5   22   2   12.53   0.93
 2        07.77   39     6187 1/2    6559   6440      51  16   14   3   09.78   0.68
 3        07.89   37     5764~       6344   6344      89   0   32   0   14.10   1.0
 4        08.37   35     5823        6324   6324      59   0   19   0   11.19   1.0
 5        08.63   47     5751 1/2    6101   6101      61   0   19   0   10.75   1.0
 6        08.50   42     5241        5960   5865     146  33   54  18   31.56   0.88
 7        09.00   42     5890~       6439   6405~    105  11   42  11   20.91   0.94
 8        08.06   43     5749        6191   6115      69  15   21   2   12.53   0.83
 9        07.98   36     6174~       6484   6484      51   0   12   0   09.58   1.0
10        07.83   33     5916 1/2    6235   6204 1/2  62   5   22   1   11.81   0.90
Table A5. 75-cities random Euclidean problems

Problem   Without cuts                With cuts
No.       TIME1  PIVOT1  VALUE1      TOUR  VALUE2     PIVOT2    CUTS    TIME2  RATIO
 1        19.89   51     6568        6865   6832 1/2   80  10   26   3  27.37   0.89
 2        19.14   49     6708        7086   7043 1/2   69   9   23   4  24.42   0.89
 3        19.21   52     6447 1/2    6763   6763       83   0   19   0  24.59   1.0
 4        17.44   51     6390        6701   6701       61   0   16   0  20.46   1.0
 5        19.47   51     6344        6837   6735       81  17   27   6  31.09   0.79
 6        19.34   58     6030        6624   6503      138  35   50  13  54.73   0.80
 7        16.62   48     6710        7120   7064 1/4   75   9   23   6  23.47   0.87
 8        18.78   49     6183        6563   6500      111  19   45   5  43.82   0.83
 9        20.29   55     6877 1/2    7232   7232       71   0   16   0  23.20   1.0
10        18.67   52     6654~       6990   6912~     103  16   36   6  32.07   0.77
Table A6. 100-cities Euclidean problems due to Krolak

Problem   Without cuts                 With cuts
No.       TIME1  PIVOT1  VALUE1      TOUR    VALUE2    PIVOT2    CUTS    TIME2   RATIO
24        62.85   78     19378 8/9   21282   21187~   159  12   63   2  104.55   0.95
25        56.41   65     2033~       22141   21937~   141  18   51   5   97.03   0.89
26        54.65   66     19705       20749   20661    135  17   53   6   80.22   0.92
27        58.40   69     19952~      21294   21193~   156  19   53  12   90.06   0.93
28        59.05   74     20622       22068   21955~   245  39   90  33  171.84   0.92
FORTRAN listing for random generation of x-, y-coordinates
      DO 1 I=1,NCITY
      CALL RAND (IRANDX,IRANDY,RANDNO)
      IRANDX=IRANDY
      IX(I)=RANDNO*1000.+1.
      CALL RAND (IRANDX,IRANDY,RANDNO)
      IRANDX=IRANDY
      IY(I)=RANDNO*1000.+1.
    1 CONTINUE
      DO 2 I=1,NCITY
      DO 2 J=1,I
      X=IX(I)-IX(J)
      Y=IY(I)-IY(J)
      IDIST(I,J)=SQRT(X*X+Y*Y)+.5
    2 CONTINUE

      SUBROUTINE RAND(IX,IY,YFL)
      IY=IX*65539
      IF (IY) 5,6,6
    5 IY=IY+2147483647+1
    6 YFL=IY
      YFL=YFL*.4656613E-9
      RETURN
      END
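For readers who wish to reproduce the test problems, the listing can be transcribed as follows (a sketch that assumes the original FORTRAN relied on 32-bit signed integer overflow, as this RANDU-style multiplicative congruential generator does; the Python names are ours):

```python
import math

def rand(ix):
    """Transcription of SUBROUTINE RAND: multiply by 65539 with 32-bit
    wraparound, then force the result into [0, 2**31) as the IF branch does."""
    iy = (ix * 65539) & 0xFFFFFFFF          # 32-bit wraparound
    if iy >= 0x80000000:                    # negative as a signed 32-bit value
        iy = iy - 0x100000000 + 2147483647 + 1
    yfl = iy * 0.4656613e-9                 # scale to (0, 1)
    return iy, yfl

def generate_cities(ncity, seed):
    """Random integer coordinates in [1, 1000], as in the DO 1 loop."""
    ix, coords = seed, []
    for _ in range(ncity):
        ix, rx = rand(ix)
        x = int(rx * 1000.0 + 1.0)          # truncation mimics IX(I)=RANDNO*1000.+1.
        ix, ry = rand(ix)
        y = int(ry * 1000.0 + 1.0)
        coords.append((x, y))
    return coords

def distance(p, q):
    """Rounded Euclidean distance, as in IDIST(I,J)=SQRT(X*X+Y*Y)+.5."""
    return int(math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) + 0.5)
```

Feeding the published seeds to generate_cities should reproduce the coordinate sets used in Tables A1 through A5.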
Random number seeds for 10 problems

Problem no.   Seed
 1            7864680
 2            7930219
 3            7995758
 4            8061297
 5            8126836
 6            8192375
 7            8257914
 8            8323453
 9            8388992
10            8454531
Appendix B

The nine additional constraints for DAN42 are the following seven subtour-elimination constraints (2.2), a matching constraint (2.5) and a chain-constraint (2.6). RHS is the value of the right-hand side:
Subtour-elimination constraints:
(1) S = {1, 2, 42}, RHS = 2.
(2) S = {1, 2, 41, 42}, RHS = 3.
(3) S = {13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23}, RHS = 10.
(4) S = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23}, RHS = 12.
(5) S = {24, 25, 26, 27}, RHS = 3.
(6) S = {3, 4, 5, 6, 7, 8, 9}, RHS = 6.
(7) S = {13, 14, 15, 16, 17}, RHS = 4.
Matching-constraint:
(8) S0 = {15, 16, 18}, S1 = {14, 15}, S2 = {18, 19}, S3 = {16, 17}, RHS = 4.
Chain-constraint:
(9) S0 = {21, 22, 25, 27, 28}, R = {25, 27, 28},
    S1 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42},
    S2 = {24}, S3 = {26}, S4 = {20, 21}, S5 = {22, 23}, p = 3, RHS = 31.

The ten additional constraints for HEL48 are the following nine subtour-elimination constraints (2.2) and one matching-constraint (2.5):
Subtour-elimination constraints:
(1) S = {39, 45, 46}, RHS = 2.
(2) S = {1, 2, 4, 11, 16, 19, 27, 37, 39, 42, 44, 45, 46}, RHS = 12.
(3) S = {9, 23, 30}, RHS = 2.
(4) S = {13, 22, 28}, RHS = 2.
(5) S = {1, 2, 4, 11, 16, 17, 19, 20, 24, 26, 27, 29, 31, 35, 37, 39, 41, 42, 44, 45, 46, 48}, RHS = 21.
(6) S = {15, 21, 43}, RHS = 2.
(7) S = {6, 7, 8, 14, 18, 34, 36, 38, 47}, RHS = 8.
(8) S = {3, 5, 25, 32}, RHS = 3.
(9) S = {11, 16, 42}, RHS = 2.
Matching-constraint:
(10) S0 = {2, 15, 48}, S1 = {1, 2}, S2 = {15, 43}, S3 = {26, 48}, RHS = 4.
Appendix C

Heuristic solutions of the 318-cities problem
The Lin-Kernighan heuristic algorithm (coded in FORTRAN) was run for the 318-cities problem on an IBM 370/168 using the H-compiler. The required region size was 900 K bytes of memory, which includes two full-word arrays of size 318 by 318: one for the distance matrix and one for the names of the cities arranged in ascending order of distances from each city. The execution time was 28.33 minutes for 50 tries. The tour lengths of the 50 heuristic solutions are the following:
41415  41460  41481  41489  41493  41551  41630  41643  41645  41648
41664  41672  41673  41686  41722  41740  41762  41782  41788  41789
41822  41832  41845  41850  41851  41856  41871  41872  41931  41967
41976  41995  42004  42009  42030  42062  42099  42115  42134  42151
42155  42174  42211  42258  42269  42269  42303  42~4~  42560  42676
References

[1] V. Chvátal, "Edmonds polytopes and weakly Hamiltonian graphs", Mathematical Programming 5 (1973) 29-40.
[2] H. Crowder and J. Hattingh, "Partially normalized pivot selection in linear programming", Mathematical Programming Study 4 (1975) 12-25.
[3] G.B. Dantzig, D.R. Fulkerson and S.M. Johnson, "Solution of a large-scale travelling-salesman problem", Operations Research 2 (1954) 393-410.
[4] G.B. Dantzig, D.R. Fulkerson and S.M. Johnson, "On a linear programming approach to the travelling salesman problem", Operations Research 7 (1959) 59-66.
[5] J. Edmonds, "Maximum matching and a polyhedron with 0, 1 vertices", Journal of Research of the National Bureau of Standards, Section B 69 (1965) 125-130.
[6] M. Grötschel, "Polyedrische Charakterisierungen kombinatorischer Optimierungsprobleme", Dissertation, Rheinische Friedrich-Wilhelms-Universität (Bonn, 1977).
[7] M. Grötschel and M.W. Padberg, "On the symmetric travelling salesman problem, Part I and Part II", Mathematical Programming 16 (1979) 265-302.
[8] M. Held and R.M. Karp, "The travelling salesman problem and minimum spanning trees, Part I", Operations Research 18 (1970) 1138-1162; "Part II", Mathematical Programming 1 (1971) 6-25.
[9] R.L. Karg and G.L. Thompson, "A heuristic approach to solving travelling-salesman problems", Management Science 10 (1964) 225-247.
[10] R.M. Karp, "Reducibility among combinatorial problems", in: R.E. Miller and J.W. Thatcher, eds., Complexity of computer computations (Plenum Press, New York, 1972) pp. 85-103.
[11] D. Knuth, "The travelling salesman problem", illustrative example in: W. Sullivan, "Frontiers of science, from microcosm to macrocosm", The New York Times (February 24, 1976) p. 18.
[12] P. Krolak, W. Felts and G. Marble, "A man-machine approach towards solving the travelling salesman problem", Communications of the Association for Computing Machinery 14 (1971) 327-334.
[13] S. Lin and B.W. Kernighan, "An effective heuristic algorithm for the travelling-salesman problem", Operations Research 21 (1973) 498-516.
[14] K. Menger, "Botenproblem", in: K. Menger, ed., Ergebnisse eines mathematischen Kolloquiums (Leipzig, 1932) Heft 2, pp. 11-12.
[15] M.W. Padberg, "On the facial structure of set packing polyhedra", Mathematical Programming 5 (1973) 199-215.
[16] Ch. Papadimitriou and K. Steiglitz, "Traps for the travelling salesman", paper presented at the ORSA Meeting, Miami, FL, November 1976.
[17] Ch. Papadimitriou and K. Steiglitz, "On the complexity of local search for the travelling salesman problem", Tech. Rep. No. 189, Department of Electrical Engineering, Princeton University (Princeton, NJ, 1976).
[18] K. Paton, "An algorithm for the blocks and cutnodes of a graph", Communications of the Association for Computing Machinery 14 (1971) 468-475.
[19] M. Simonnard, Linear programming (Prentice-Hall, Englewood Cliffs, NJ, 1962).
[20] M.W. Padberg and M.R. Rao, "Odd minimum cut-sets and b-matchings", Working Paper 68A, New York University (New York, July 1979).
[21] H.P. Crowder and M.W. Padberg, "Solving large-scale symmetric travelling salesman problems to optimality", T.J. Watson Research Center Report, IBM Research (Yorktown Heights, NY, June 1979).
Mathematical Programming Study 12 (1980) 108-114. North-Holland Publishing Company
A LIFO IMPLICIT ENUMERATION ALGORITHM FOR THE ASYMMETRIC TRAVELLING SALESMAN PROBLEM USING A ONE-ARBORESCENCE RELAXATION

T.H.C. SMITH
The University of the Orange Free State, Bloemfontein, South Africa
Received 23 December 1975 Revised manuscript received 12 December 1977 It is well-known that for the symmetric travelling salesman problem search methods using the 1-tree relaxation introduced by Held and Karp are much more efficient than those using the assignment relaxation due to the sharper bounds obtained. Held and Karp noted an analogous relationship between the asymmetric travelling salesman problem and the minimum one-arborescence problem. We implemented a LIFO implicit enumeration algorithm based on this idea and found that it is inferior to the assignment relaxation for two reasons: the relatively large computation time required to compute a minimum one-arborescence; and, more importantly, the fact that the bounds obtained from the two relaxations are about the same for asymmetric problems. Key words: Implicit Enumeration, One-Arborescence, Travelling Salesman Problem, Integer Programming.
1. Introduction
In two excellent papers [7, 8], Held and Karp investigated the relationship between the symmetric travelling salesman problem and the minimum spanning tree problem (see also the contemporaneous work of Christofides [3]). Held and Karp used this relationship to determine a lower bound on the minimum tour cost and in [8] developed an efficient ascent method for improving this lower bound. They incorporated this method in a branch-and-bound algorithm for the symmetric travelling salesman problem and reported exceptionally good computational experience with this algorithm. In a subsequent paper [9] Held et al. reported additional experience with a refined implementation of Held and Karp's ascent method, verifying the effectiveness of the method in obtaining a near-maximal lower bound (of this type) on the minimum tour cost. In [11] Smith and Thompson used this refined ascent method in a LIFO implicit enumeration algorithm for the symmetric travelling salesman problem which, on the basis of computational experience, appears to be currently the most efficient algorithm (with reference to both execution time and storage requirements) for this problem.
T.H.C. Smith / Asymmetric travelling salesman problems
In [7] Held and Karp also noted the analogous relationship between the asymmetric travelling salesman problem and the minimum spanning arborescence problem. In this paper we propose a LIFO implicit enumeration algorithm for the asymmetric travelling salesman problem based on this relationship, which employs the ascent method of [8] and [9]. We also present computational experience with a FORTRAN V implementation of the algorithm on the UNIVAC 1108.
2. Notation and review
Consider a directed graph G with node set N = {1, 2, ..., n} and edge set E containing an edge (i, j) (with associated non-negative integer cost c_ij) directed from any node i to any other node j ≠ i. The cost of a subgraph of G is the sum of the associated costs of the edges in the subgraph. With any permutation i_1, i_2, ..., i_p, 1 < p ≤ n, of p different nodes from N we can associate a subgraph of G with node set N_1 = {i_1, i_2, ..., i_p} and edge set E_1 = {(i_1, i_2), (i_2, i_3), ..., (i_{p-1}, i_p), (i_p, i_1)}, which we call a tour if p = n and a subtour if p < n. A spanning arborescence rooted at node 1 is a subgraph of G which contains no subtours, exactly one edge directed into every node j ≠ 1 and no edge directed into node 1. The minimum spanning arborescence problem (MSAP) is that of finding a minimum cost spanning arborescence rooted at node 1 (this entails no loss of generality since any node could be labeled "1"). Several efficient methods for solving the MSAP have been proposed [2, 4, 5]. We chose to use the modified Edmonds algorithm proposed by Fulkerson in our computational work. A 1-arborescence is defined to be a subgraph of G consisting of a spanning arborescence rooted at node 1 together with a single edge directed into node 1. A minimum cost 1-arborescence can be obtained by finding the least cost edge directed into node 1 as well as a minimum cost spanning arborescence rooted at node 1. The asymmetric travelling salesman problem (TSP) is that of finding a minimum cost tour in G. It follows from the results in [7] that if, for any set of node weights {π_i | i ∈ N}, we transform the edge costs using the transformation c̄_ij = c_ij + π_i, the set of minimum cost tours remains the same while the set of minimum cost 1-arborescences may change. A lower bound L on the minimum tour cost is found by subtracting the sum of the node weights from the cost of a minimum cost 1-arborescence with respect to the transformed edge costs.
This lower bound is a piecewise-linear concave function of the node weights and can be maximised using the ascent method of [8] and [9]. A single iteration of this ascent method can be described as follows: Given a set of node weights {π_i, i ∈ N} and an upper bound U on the minimum tour cost, find a minimum cost 1-arborescence A with respect to the transformed edge costs and let L be the lower bound computed from A. If A is a tour the ascent is terminated, since L is the optimal lower bound. Otherwise let d_i
be the number of edges in A directed away from node i and let λ be a given positive scalar smaller than or equal to 2. Compute the scalar quantity

    t = λ(U - L) / Σ_{i∈N} (d_i - 1)²

and replace the old set of node weights with the new set of node weights {π'_i, i ∈ N} computed from the following formula:

    π'_i = π_i + t(d_i - 1),    i ∈ N.    (1)
In [9] Held and Karp started the ascent method for the symmetric TSP with all node weights equal to zero. They also indicated how the dual variables associated with the optimal solution to the assignment problem with cost matrix (c_ij) can be combined to yield a good set of starting node weights. A similar result is true for the asymmetric TSP since, if we use the node weights π_i = -u_i, i ∈ N, where u_i is the optimal dual variable for row i of the assignment problem, it follows easily that the resulting lower bound L is greater than or equal to the optimal assignment cost (in our computational work we always found L equal to the optimal assignment cost). Our implementation of the ascent method is based on the strategies used in [8] and [9]. We make use of three tolerances, α, β and τ (positive real numbers, all less than one in magnitude). Given a set of node weights and the upper bound U, the ascent method requires input parameters K and z which control the number of iterations, as well as an initial value of λ, the scalar used in the computation of the step size t. In an ascent we initially do K iterations (each involving the computation of a minimum cost 1-arborescence and a change of node weights). Thereafter we successively halve the current value of λ, set K = max(½K, z) and do another K iterations, until the first iteration at which at least one of the following statements is true (at which point the ascent is terminated): (i) the computed t-value is less than α; (ii) K = z and no improvement in the maximum lower bound of at least β occurred in a block of 4z ascent iterations; (iii) U - L ≤ τ (this includes the case where the minimum cost 1-arborescence is a tour). The particular values of K and λ that we used to start the ascents, as well as the settings of the tolerances α, β and τ, are discussed in the section where we report our computational experience.
The quantity z (which was called a "threshold value" in [9]) was set equal to the integer part of λn. The block size of 4z in (ii) was determined experimentally. The Edmonds-Fulkerson algorithm for the MSAP assumes non-negative edge costs. We maintained this condition by projecting the node weight vector with components obtained from (1) onto the nonnegative orthant (for the validity of
this, see [9]). This is accomplished by using the following formula instead of (1) to compute the new node weights:

    π'_i = max(0, π_i + t(d_i - 1))    for all i ∈ N.    (2)
However, the use of (2) slows down the convergence of the ascent method if it is started with all node weights equal to zero. It is advisable to start with a strictly positive set of node weights. We ensured this by adding a positive constant larger than the maximum edge cost to each of the initial node weights (either zero or given by the negatives of the optimal row dual variables of the assignment problem). Note that this does not change the value of the lower bound that would be obtained from the initial set of node weights.
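One iteration of the ascent, with the transformation c̄_ij = c_ij + π_i, the lower bound L and the projected update (2), can be sketched as follows (Python; the brute-force 1-arborescence routine is only a stand-in for the Edmonds-Fulkerson algorithm, usable for very small n, and all names are ours):

```python
from itertools import product

def min_one_arborescence(n, cost, pi):
    """Minimum 1-arborescence under the transformed costs
    cbar[i][j] = cost[i][j] + pi[i], by brute force over all parent
    assignments; node 1 of the text is node 0 here."""
    def reaches_root(par):
        # every parent chain must reach node 0 without revisiting a node
        for j in par:
            seen, k = set(), j
            while k != 0:
                if k in seen:
                    return False
                seen.add(k)
                k = par[k]
        return True

    best_edges, best_cost = None, float("inf")
    for choice in product(range(n), repeat=n - 1):
        par = {j: choice[j - 1] for j in range(1, n)}
        if any(par[j] == j for j in par) or not reaches_root(par):
            continue
        edges = [(par[j], j) for j in par]            # spanning arborescence
        tail = min(range(1, n), key=lambda i: cost[i][0] + pi[i])
        edges.append((tail, 0))                       # cheapest edge into the root
        c = sum(cost[i][j] + pi[i] for i, j in edges)
        if c < best_cost:
            best_edges, best_cost = edges, c
    return best_edges, best_cost

def ascent_step(n, cost, pi, U, lam):
    """One ascent iteration: lower bound L, step size t of (1), and the
    projected node-weight update (2)."""
    edges, cbar = min_one_arborescence(n, cost, pi)
    L = cbar - sum(pi)                  # lower bound on the minimum tour cost
    d = [0] * n
    for i, _ in edges:
        d[i] += 1                       # d_i = edges directed away from node i
    denom = sum((di - 1) ** 2 for di in d)
    if denom == 0:                      # the 1-arborescence is a tour
        return pi, L
    t = lam * (U - L) / denom           # step size t
    new_pi = [max(0.0, pi[i] + t * (d[i] - 1)) for i in range(n)]
    return new_pi, L
```

Iterating ascent_step while halving lam reproduces the scheme described above on toy instances; a practical code would replace the brute-force routine by the Edmonds-Fulkerson algorithm.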
3. A LIFO implicit enumeration algorithm

In our algorithm the minimum cost 1-arborescence problem is solved and, if the subgraph A obtained is not a tour but contains more than one edge directed away from some node i, the constraint

    Σ_{(i,j)∈A} x_ij ≤ 1    (3)
is added to eliminate this solution (where x_ij is the 0-1 decision variable corresponding to edge (i, j) of G). Suppose the minimal 1-arborescence A contains the p (> 1) edges (i, j_k), k = 1, 2, ..., p, where c_ij_k ≤ c_ij_{k+1} for k < p. Then we apply the above constraint by partitioning the problem into p + 1 subproblems with no common solutions as follows: (i) the first subproblem is formed by fixing the edge (i, j_1) in; (ii) for 1 < q ≤ p the qth subproblem is formed by fixing edge (i, j_q) in and fixing all edges (i, j_k), k < q, out; (iii) the last subproblem is formed by fixing out all edges (i, j_k), k = 1, 2, ..., p. We omit further details of our algorithm, henceforth called algorithm SPAR, since it is a simple LIFO implicit enumeration algorithm similar to, for example, algorithm TSP2 in [12].
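The partitioning rule (i)-(iii) can be sketched as follows (Python; an illustration with hypothetical data structures, not the SPAR code itself):

```python
def partition_on_node(arb_edges, i, fixed_in, fixed_out):
    """Branching sketch: node i has p > 1 edges directed away from it in
    the minimum 1-arborescence; split into p + 1 disjoint subproblems.

    arb_edges : edges of the 1-arborescence, as (tail, head) pairs
    fixed_in, fixed_out : edge sets already fixed in the current subproblem
    Returns a list of (fixed_in, fixed_out) pairs, one per subproblem.
    """
    out_edges = sorted(e for e in arb_edges if e[0] == i)  # (i, j_1), ..., (i, j_p)
    # the paper orders these by nondecreasing cost; sorting by head is a placeholder
    subproblems = []
    for q, e in enumerate(out_edges):
        # q-th subproblem: fix (i, j_q) in and (i, j_1), ..., (i, j_{q-1}) out
        subproblems.append((fixed_in | {e}, fixed_out | set(out_edges[:q])))
    # last subproblem: fix all p edges out
    subproblems.append((fixed_in, fixed_out | set(out_edges)))
    return subproblems
```

The subproblems are pairwise disjoint (each fixes in an edge that a later one fixes out) and together they cover every tour, so pushing them onto a LIFO stack yields the enumeration scheme described above.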
4. Computational experience

An important detail in the efficient implementation of the Edmonds-Fulkerson algorithm is the choice of data structures for the set manipulation required to keep track of connected components and contracted nodes. In our implementation we used the node labelling scheme discussed in [10]. This scheme is also discussed in [1], as is a more sophisticated technique based on tree
structures (see [1, Algorithm 4.3]). We recoded the Edmonds-Fulkerson algorithm using the latter technique but found that, at least for problems with 60 or fewer nodes, the original implementation is slightly more efficient. In [12] implicit enumeration algorithms for the asymmetric TSP, based on the assignment relaxation, were tested on randomly generated problems with edge costs drawn from a discrete uniform distribution over the interval (1, 1000). We used some of these problems to test the algorithm of this paper. For each problem tested we took the initial upper bound U in algorithm SPAR equal to one plus the minimum tour cost (available from the results of [12]). We first experimented with the ascent method on the original problem (i.e., terminating algorithm SPAR after the first application of the ascent method) to determine whether the use of the negative optimal row dual variables as node weights is indeed advantageous. We found that although the ascent method converges to approximately the same value whether started with zero node weights or with the dual node weights, fewer ascent iterations were necessary to obtain this value when started with the dual node weights. Furthermore, we found that when starting with the dual node weights, the smallest number of ascent iterations was required when K and λ were initially set to 4z and 1, respectively. If L1, L2 and L3 respectively denote the lower bound on the minimal tour cost obtained using zero node weights, the dual node weights and the final node weights (available at termination of the initial ascent), we define P1, P2 and P3 as L1, L2 and L3 expressed as percentages of the minimum tour cost (which had an average value of approximately 1600 for the problems considered). In Table 1 we report the average values of these percentages for the different problem sizes. We also report the average time (in seconds) to find the duals of the optimal assignment solution (i.e.
to solve the assignment problem), the average time (in seconds) to find a minimum cost 1-arborescence for a given set of node weights and the average number of ascent iterations before termination of the initial ascent. In the last column of Table 1 we give the average time (in seconds) taken by subtour elimination algorithm TSP2 of [12] to solve the same problems completely. From Table 1 it is clear that the time for the initial ascent (i.e. the number of
Table 1

Problem  Sample  Average  Average  Average  Average assign-  Average itera-   Average number  Average TSP2-
size     size    P1       P2       P3       ment time (s)    tion time (s)    of iterations   time (s)
30       5       62.1     97.4     98.9     0.2              0.045             45             0.9
40       5       63.0     96.8     98.5     0.4              0.082             59             2.9
50       5       62.7     98.3     99.3     0.5              0.127             76             1.7
60       5       61.9     98.6     99.5     0.7              0.200            101             9.3
iterations times the average iteration time) already greatly exceeds the total time for algorithm TSP2. Comparing the average iteration time with the average TSP2-time minus the average assignment time, one realizes that the number of iterations in the initial ascent has to be cut significantly if the initial ascent is to require less time than the complete solution of the problem by algorithm TSP2. Also note that the P3-values, obtained by the ascent method, are not much better than the P2-values, obtainable by solving a single assignment problem. It is therefore clear that algorithm SPAR is less efficient than algorithm TSP2. However, for interest's sake we solved ten problems to completion using algorithm SPAR (note that the upper bound was initialized at one plus the optimum). The computational results are reported in Table 2 (the ith problem of size n is identified as Pn - i). In all ascents we took α = 0.01, β = 0.1, τ = 0.9 and in all ascents except the first, the initial values of K and λ were taken equal to z (the threshold value) and 2, respectively. It is perhaps worth noting that our implementation of the Edmonds-Fulkerson algorithm requires on the average three times as much time to compute a minimum cost spanning arborescence as our implementation in [11] of the Prim-Dijkstra algorithm requires to compute a minimum cost spanning tree (for problems of the same size). Since the ascent method used in this paper is virtually the same as that used in [11], it is clear that algorithm SPAR will not be more efficient than algorithm IE (proposed and tested in [11] for symmetric TSP problems). For example, the average runtime required by algorithm IE to solve a 60-node symmetric TSP was 34.1 seconds.
5. Conclusion

Whereas the 1-tree relaxation algorithms [8, 11] are more efficient than the assignment relaxation algorithms [6, 12] for the symmetric travelling salesman problem, the opposite is true for the asymmetric travelling salesman problem.

Table 2
Problem    Number of ascent    Number of nodes    Total SPAR-time    Total TSP2-time
           iterations          in search tree     (seconds)          (seconds)
P30-1          382                  35                 14.4                0.7
P30-2          220                  20                  7.2                0.5
P30-3          137                  16                  5.3                1.6
P30-4          185                  14                  7.6                0.7
P30-5           70                   6                  2.8                1.0
P40-1           88                   9                  5.8                0.4
P40-2          847                  64                 53.7                1.9
P60-1          247                  14                 39.1                4.1
P60-2          348                  22                 50.7                4.9
P60-3         1201                  70                174.7               12.5
References

[1] A.V. Aho, J.E. Hopcroft and J.D. Ullman, The design and analysis of computer algorithms (Addison-Wesley, Reading, MA, 1974).
[2] F.C. Bock, "An algorithm to construct a minimum directed spanning tree in a directed network", in: B. Avi-Itzhak, ed., Developments in operations research (Gordon and Breach, New York, 1971) pp. 29-44.
[3] N. Christofides, "The shortest Hamiltonian chain of a graph", SIAM Journal of Applied Mathematics 19 (1970) 689-696.
[4] J. Edmonds, "Optimum branchings", Journal of Research of the National Bureau of Standards 71B (1967) 233-240.
[5] D.R. Fulkerson, "Packing rooted directed cuts in a weighted directed graph", Mathematical Programming 6 (1974) 1-13.
[6] R.S. Garfinkel and G.L. Nemhauser, Integer programming (John Wiley, New York, 1972).
[7] M. Held and R.M. Karp, "The traveling salesman problem and minimum spanning trees", Operations Research 18 (1970) 1138-1162.
[8] M. Held and R.M. Karp, "The traveling salesman problem and minimum spanning trees: Part II", Mathematical Programming 1 (1971) 6-25.
[9] M. Held, P. Wolfe and H.P. Crowder, "Validation of subgradient optimization", Mathematical Programming 6 (1974) 62-88.
[10] J. Kershenbaum and R. Van Slyke, "Computing minimum trees", in: Proceedings of the ACM Annual Conference (Boston, 1972) pp. 518-527.
[11] T.H.C. Smith and G.L. Thompson, "A LIFO implicit enumeration search algorithm for the symmetric traveling salesman problem using Held and Karp's 1-tree relaxation", Annals of Discrete Mathematics 1 (1977) 479-493.
[12] T.H.C. Smith, V. Srinivasan and G.L. Thompson, "Computational performance of three subtour elimination algorithms for solving asymmetric traveling salesman problems", Annals of Discrete Mathematics 1 (1977) 495-506.
Mathematical Programming Study 12 (1980) 115-119. North-Holland Publishing Company
POLYNOMIAL BOUNDING FOR NP-HARD PROBLEMS*

P.M. CAMERINI and F. MAFFIOLI
Politecnico di Milano, Milano, Italy

Received 7 July 1977
Revised manuscript received 21 April 1978
A polynomial bounded method is presented for computing bounds to the value of the optimum of a large class of NP-hard combinatorial optimization problems.
Key words: NP-Hard Problems, Suboptimal Solutions, Polynomial Algorithms, Heuristics.
1. Introduction

Consider the following constrained transportation problem:

minimize

    Σ_{i,j} c_ij x_ij,

subject to

    Σ_i x_ij = a_j,    j = 1, 2, ..., m,

    Σ_j x_ij = b_i,    i = 1, 2, ..., n,        (1)

    x_ij ∈ N,

    Σ_{(i,j) ∈ S_k} x_ij ≤ r_k,    k = 1, 2, ..., l.        (2)
Here each S_k denotes a set of index pairs {(i, j)}; a_j, b_i, r_k are non-negative integers and C = [c_ij] is an n × m "cost" matrix of real numbers. This work presents a polynomial bounded method (in the size of matrix C) for calculating lower bounds to the value of the solution to problem (1), (2), provided that it is possible to solve in polynomial time the following subproblem: given any non-negative integer matrix X = [x_ij], find, if any, a violated constraint among those defined in (2). The interest of problem (1), (2) arises from the fact that many NP-hard problems in the sense of Karp [9], such as the traveling salesman, 3-dimensional assignment, K-parity matroid and some scheduling problems, are particular cases of problem (1), (2), so that it is very unlikely that polynomial bounded methods for the exact solution of problem (1), (2) will ever be found [10].

* This paper is a complete and correct version of the work presented at the IX International Symposium on Mathematical Programming, Budapest, August 1976.
P.M. Camerini and F. Maffioli / Polynomial bounding
However, for the above mentioned instances, and for many others, the constraints (2) can be checked in polynomial time, so that the algorithm proposed here may be useful both for guiding implicit search methods (e.g. branch and bound, heuristically guided search [2]) towards the optimal solution and for evaluating heuristically obtained suboptimal solutions [1, 11].
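The testing subproblem just described, given X, find a violated constraint among (2), is all that the method needs from the constraint family. A minimal sketch in Python, under the assumption that the constraints are explicitly stored as a list of pairs (S_k, r_k) (the paper only requires that this check run in polynomial time; the representation here is for illustration):

```python
def find_violated_constraint(x, constraints):
    """Return the index k of a constraint of type (2) violated by the
    flow matrix x, or None if all constraints hold.

    x           -- dict mapping index pairs (i, j) to non-negative integers
    constraints -- list of pairs (s_k, r_k): s_k a set of index pairs,
                   r_k a non-negative integer bound
    """
    for k, (s_k, r_k) in enumerate(constraints):
        if sum(x.get(ij, 0) for ij in s_k) > r_k:
            return k
    return None
```

For the traveling salesman specialization of Section 3, the sets S_k would range over the (exponentially many) subtour index sets, so they would be generated on the fly rather than stored.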
2. A bordering algorithm

Step 0 (Start): L ← 0.
Step 1 (Bounding): Solve the transportation problem (1), relaxing constraints (2), by the Hungarian method (see for instance [4, 7]). Let C' = [c'_ij] be the cost matrix at the end of the computation and X = [x_ij] the optimal flow matrix. L ← L + Σ c_ij x_ij.
Step 2 (Testing): Find, if any, a set S_k such that the corresponding constraint defined in (2) is violated by X. If none exists, stop: L is the best lower bound obtained.
Step 3 (Bordering): Adjoin to C' a new row n + 1 and a new column m + 1. Set c'_{n+1,m+1} ← 0. Let T_k = {(i, j) ∈ S_k : x_ij > 0}. For all (i, j) ∈ T_k, set c'_ij ← +∞, c'_{n+1,j} ← 0, c'_{i,m+1} ← 0. Set all other entries of row n + 1 and column m + 1 of C' equal to +∞. Set C ← C', a_{m+1} ← r_k, b_{n+1} ← r_k, n ← n + 1, m ← m + 1 and go to Step 1.

The rationale of this algorithm is as follows. At each iteration of Step 1, if we were able to solve problem P, i.e. the current problem (1), (2), the cost ĉ = Σ c_ij x̂_ij of an optimal solution to P could be added to L, thus producing a valid lower bound to the original problem. (Assume this by inductive hypothesis.) Instead, we compute a lower bound c = Σ c_ij x_ij to ĉ by solving a relaxed problem R, i.e. the current problem (1) without constraints (2). If some constraint (2) is violated by the solution to R, new problems P', R' are obtained in Step 3 by bordering matrix C' with a new row and a new column. The entries corresponding to the violated constraint are made prohibitively costly, whereas the total flow in the new row (and column) is bounded by r_k and is taken into account by R'. The following lemma implies that the cost of an optimal solution to P' is a lower bound to ĉ − c, thus proving (by induction) the correctness of the bordering algorithm.

Lemma. Let X̂ = [x̂_ij] be any feasible solution to P and ĉ = Σ c_ij x̂_ij be its cost. Then there corresponds to X̂ a feasible solution X' to P' whose cost is c' = ĉ − c.
Proof. Denote by C̄ = [c̄_ij] the n × m matrix obtained at the end of Step 1, and by C' = [c'_ij] the corresponding (n + 1) × (m + 1) bordered matrix of
Step 3. Recall [7] that

    c̄_ij = c_ij + u_i − v_j,        (3)

where u_i and v_j are the optimal dual variables of problem R. Moreover, the orthogonality conditions of primal-dual optimal solutions imply that

    x_ij > 0  ⟹  c̄_ij = 0.        (4)

Define a flow matrix X' = [x'_ij] as follows. For i ≤ n, j ≤ m,

    x'_ij = 0 if (i, j) ∈ T_k,  x'_ij = x̂_ij otherwise,

    x'_{n+1,j} = Σ_{h ∈ I_j} x̂_hj,    I_j = {i : (i, j) ∈ T_k},        (5)

    x'_{i,m+1} = Σ_{h ∈ J_i} x̂_ih,    J_i = {j : (i, j) ∈ T_k},

    x'_{n+1,m+1} = r_k − Σ_{j=1}^{m} x'_{n+1,j}.

One can readily see that, X̂ being a feasible solution to P, X' is a feasible solution to P'. Moreover, by (4), (5) and the construction of C' in Step 3, we have

    c' = Σ c'_ij x'_ij = Σ c̄_ij x̂_ij.        (6)

Substitution of (3) into (6) yields

    c' = ĉ − Σ (v_j − u_i) x̂_ij.        (7)

Since X̂ is a feasible solution to R,

    Σ (v_j − u_i) x̂_ij = Σ_j v_j a_j − Σ_i u_i b_i.        (8)

By (4),

    c = Σ c_ij x_ij = Σ (v_j − u_i) x_ij = Σ_j v_j a_j − Σ_i u_i b_i,

since X too is a feasible solution to R. Hence, by (8), Σ (v_j − u_i) x̂_ij = c and from (7) the lemma follows.

Since, at each iteration of Step 3, at least one entry in the original cost matrix is set equal to +∞, after no more than n × m iterations (n × m being the size of the original cost matrix) the algorithm terminates. This fact shows that the
bordering algorithm is polynomial bounded, provided that the constraints (2) can be checked in polynomial time. Note also (see [4, 7]) that in Step 1 c̄_ij ≥ 0 for all i, j, so that the final value of the lower bound is never worse and is likely to be better than that obtained by solving problem R only once.
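The bordering operation of Step 3 can be transcribed almost literally. The sketch below borders a cost matrix held as a list of lists; the Hungarian solution of Step 1 and the violated-constraint test of Step 2 are assumed to be supplied elsewhere:

```python
import math

def border(c, a, b, s_k, r_k, x):
    """One execution of Step 3: adjoin row n+1 and column m+1 to the
    reduced cost matrix c (list of lists), given the violated set s_k
    with bound r_k and the current optimal flow matrix x."""
    n, m = len(c), len(c[0])
    t_k = [(i, j) for (i, j) in s_k if x[i][j] > 0]
    for row in c:                    # new column, +infinity by default
        row.append(math.inf)
    c.append([math.inf] * (m + 1))   # new row, +infinity by default
    c[n][m] = 0.0                    # c'(n+1, m+1) <- 0
    for (i, j) in t_k:
        c[i][j] = math.inf           # forbid the arcs carrying the
        c[n][j] = 0.0                # violated flow and reroute them
        c[i][m] = 0.0                # through the bordering row/column
    a.append(r_k)                    # a(m+1) <- r_k
    b.append(r_k)                    # b(n+1) <- r_k
```

Each call grows the instance by one row and one column and sets at least one original entry to +∞, matching the n × m termination bound proved above.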
3. Remarks
If problem (1) is an assignment problem (i.e. a_j = b_i = 1 for all i, j) on a square n × n matrix and the constraints (2) are given in such a way as to forbid any cyclic sequence of length less than n formed by nonzero flows, problem (1), (2) becomes the well known traveling salesman problem, and the algorithm described here becomes similar to that proposed in [3]. Note also that in general the final value of L may be strongly dependent on which constraint (if more than one is violated) is selected in Step 2 at each iteration. It would be interesting to find a policy for choosing the constraint which yields the best lower bound at the end of the procedure. This seems to be a very hard task, at least in the general case. A heuristic approach in this direction could be that of choosing, at each iteration, the violated constraint which yields the greatest increment of L. This would require solving R' for each violated constraint. Another possibility could be that of considering, in Step 2, all the violated constraints and for each of these adding a new row and a new column to C' in the same way as described in Step 3.

Table 1
Lower bounds
Example             Number     Value of    By solving once    By the bordering    Number of iterations
                    of nodes   optimum     the assignment     algorithm           needed to reach
                                           problem                                the bound
Croes [5]              20         246           218                236                    3
Held and Karp [8]      25        1711          1281               1459                    8
Dantzig [6]            42         699           583                621                   14
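For the traveling salesman specialization just mentioned, the Step 2 test only has to find a short cycle in the current assignment solution. A minimal sketch, with the assignment represented as a successor array (an assumed representation):

```python
def shortest_subtour(succ):
    """Return the vertex list of a shortest cycle of the permutation
    succ, where succ[i] is the city visited after city i.  If the
    cycle's length equals len(succ), the assignment is already a tour
    and no subtour constraint of type (2) is violated."""
    n = len(succ)
    seen = [False] * n
    best = None
    for start in range(n):
        if seen[start]:
            continue
        cycle, i = [], start
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = succ[i]
        if best is None or len(cycle) < len(best):
            best = cycle
    return best
```

The scan is linear in n, so for this specialization the whole separation step is polynomial even though the family (2) is exponentially large.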
No extensive computational testing of the bordering method has yet been performed. Only a few instances [5, 6, 8] of the traveling salesman problem, taken from the existing literature, have been considered. These results are summarized in Table 1 and seem to be comparable with those of [3], whereas the bounds are much less satisfactory than those of [12]. However, this latter method is not polynomial bounded. It is hoped that extensive computational experience will become available in the future, possibly ranging over a wider class of problems.
Acknowledgment

Many insightful discussions with Professor E.L. Lawler are gratefully acknowledged: his suggestions were essential to correct and complete this work.
References

[1] P.M. Camerini and F. Maffioli, "Bounds for 3-matroid intersection problems", Information Processing Letters 3 (1975) 81-83.
[2] P.M. Camerini and F. Maffioli, "Heuristically guided algorithm for k-parity matroid problem", Discrete Mathematics 21 (1978) 103-116.
[3] N. Christofides, "Bounds for the traveling salesman problem", Operations Research 20 (1972) 1044-1056.
[4] N. Christofides, Graph theory: an algorithmic approach (Academic Press, New York, 1975) pp. 380-381.
[5] G.A. Croes, "A method for solving traveling salesman problems", Operations Research 6 (1958) 791-812.
[6] G.B. Dantzig, D.R. Fulkerson and S.M. Johnson, "Solution of a large scale traveling salesman problem", Operations Research 2 (1954) 393-410.
[7] L.R. Ford and D.R. Fulkerson, Flows in networks (Princeton University Press, Princeton, NJ, 1962) pp. 93-111.
[8] M. Held and R.M. Karp, "A dynamic programming approach to sequencing problems", Journal of SIAM 10 (1962) 196-210.
[9] R.M. Karp, "On the computational complexity of combinatorial problems", Networks 5 (1975) 45-68.
[10] E.L. Lawler, "Polynomial bounded and (apparently) non-polynomial bounded matroid computations", in: R. Rustin, ed., Combinatorial algorithms (Algorithmic Press, 1973) pp. 49-57.
[11] F. Maffioli, "Subgradient optimization, matroid problems and heuristic evaluation", in: Optimization techniques, Proceedings 7th IFIP Conference, Part 2 (Springer-Verlag, Berlin, 1976) pp. 389-396.
[12] P. Wolfe, M. Held and H.P. Crowder, "Validation of subgradient optimization", Mathematical Programming 6 (1974) 62-88.
Mathematical Programming Study 12 (1980) 120-131. North-Holland Publishing Company
WORST CASE ANALYSIS OF GREEDY TYPE ALGORITHMS FOR INDEPENDENCE SYSTEMS

D. HAUSMANN and B. KORTE
Universität Bonn, Bonn, Federal Republic of Germany
T.A. JENKYNS
Brock University, St. Catharines, Ont., Canada

Received 9 January 1978
Revised manuscript received 10 July 1979
This paper deals with results on performance measures of greedy type algorithms for maximization or minimization problems on general independence systems which were given by the authors independently in earlier papers ([3] and [6]). Besides a unified formulation of the earlier results some modifications and extensions are presented here underlining the central role which the greedy algorithm plays in combinatorial optimization.
Key words: Independence Systems, Greedy Algorithm, Heuristics, Analysis of Algorithms.
1. Introduction

Many optimization problems take the following form: given a finite set E, a non-negative real-valued weight function c on E, and a family F of subsets of E, find a member M of F with largest total weight; i.e.

    maximize  c(A) = Σ_{e ∈ A} c(e),  subject to  A ∈ F.        (1)

In this paper we consider the case that F is an independence system on E; that is, F is non-empty and A ⊆ B ∈ F implies A ∈ F. The elements of F are called independent sets. The most naive method for attempting to solve (1) efficiently has been called by Jack Edmonds the "greedy algorithm" in [1] and elsewhere. It consists of forming a set G ∈ F by taking the heaviest independent element, adding the next heaviest element which preserves independence, then the next, and so on until all elements of E are considered. More formally:

(2a) Set G_0 = ∅ and set R = E;
(2b) Let n = |R| and order the elements of R in any way such that c(e_1) ≥ c(e_2) ≥ ... ≥ c(e_n);
D. Hausmann et al. / Greedy type algorithms
(2c) Given G_i, set

    G_{i+1} = G_i ∪ {e_{i+1}}  if G_i ∪ {e_{i+1}} ∈ F,
    G_{i+1} = G_i              otherwise;

(2d) If i + 1 < n repeat (2c).

The set G = G_n is called a greedy solution of (1) (or the greedy solution determined by the ordering (e_1, e_2, ..., e_n) of E). It is well-known that a greedy solution need not be optimal and that c(G) may be much smaller than c(M). However, in Section 2 a parameter q > 0 is given for independence systems such that for any weight function and any greedy solution

    q · c(M) ≤ c(G) ≤ c(M).        (3)

It is also shown that the lower bound is attained by certain weight functions. Section 3 contains results on the value of q for several familiar independence systems, and Section 4 contains corollaries of (3) related to certain minimization problems. Section 5 contains quite startling results on modifications of the greedy algorithm which show the fundamental role it plays in combinatorial optimization.

Problem (1) may be solved for M exactly if one is willing to test all 2^|E| subsets of E for independence; the greedy algorithm tests only n subsets. It is fairly natural to believe that if some compromise between exhaustive examination and the greedy algorithm is used, one obtains an approximation to M which, even in the worst case, must be at least as good as a greedy solution. Consider, for instance, a k-starting-greedy algorithm, where k is some moderate integer:

(4a) Solve (by exhaustive examination)

    maximize c(T),  subject to |T| = k and T ∈ F;

(4b) Set G_0 = T and R = {e ∈ E ∖ T : T ∪ {e} ∈ F}, and apply the greedy algorithm (from step (2b)).

Of course, if one is willing to examine all k-subsets of E for independence, one could proceed, after obtaining T, by adding the heaviest independent k-subset at each stage. Consider now what might be called an (=k)-greedy algorithm:

(5a) Set G^0 = ∅;
(5b) Given G^i, set R^i = {e ∈ E ∖ G^i : G^i ∪ {e} ∈ F}; if R^i = ∅ stop; otherwise solve

    maximize c(T),  subject to |T| = k, G^i ∪ T ∈ F, and T ⊆ R^i;

(5c) If there is a solution T, set G^{i+1} = G^i ∪ T and repeat (5b);
(5d) If there is no solution T, set G_0 = G^i, R = R^i, and apply the greedy algorithm.

The number of k-subsets of E is of order |E|^k, but so is the number of subsets of E with cardinality ≤ k, so one might consider one further modification we call the (≤k)-greedy algorithm:

(6a) Set G_0 = ∅;
(6b) Given G_i, set R_i = {e ∈ E ∖ G_i : G_i ∪ {e} ∈ F}; if R_i = ∅ stop; otherwise solve

    maximize c(T),  subject to |T| ≤ k, G_i ∪ T ∈ F and T ⊆ R_i;

(6c) Set G_{i+1} = G_i ∪ T and repeat (6b).

Clearly for k = 1 all these "modifications" are essentially the usual greedy algorithm. Incredibly, for k > 1 the solution obtained by any of these modifications may be worse than that obtained by the usual greedy algorithm. Solutions obtained by the k-starting-greedy algorithm and the (=k)-greedy algorithm may be so bad that (3) is violated. But despite all this, the lower bound on the worst case performance, given in (3) above, holds for the (≤k)-greedy algorithm.
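For concreteness, the usual greedy algorithm (2a)-(2d) and the (≤k)-greedy algorithm (6a)-(6c) can be sketched as follows, with the independence system supplied as a predicate on sets and the candidate sets T found by exhaustive enumeration, exactly as in the definitions:

```python
from itertools import combinations

def greedy(elements, c, independent):
    """Usual greedy algorithm (2a)-(2d)."""
    g = set()
    for e in sorted(elements, key=c, reverse=True):
        if independent(g | {e}):
            g.add(e)
    return g

def le_k_greedy(elements, c, independent, k):
    """(<=k)-greedy algorithm (6a)-(6c): at every stage add a heaviest
    independent set T of at most k admissible new elements."""
    g = set()
    while True:
        r = [e for e in elements if e not in g and independent(g | {e})]
        if not r:
            return g
        best = max((set(t) for s in range(1, k + 1)
                    for t in combinations(r, s) if independent(g | set(t))),
                   key=lambda t: sum(c(e) for e in t))
        g |= best
```

Both return a basis of E; the point of Section 5 is that, surprisingly, the second need not return a heavier one.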
2. The rank quotient and worst case performance of the greedy algorithm

Let (E, F) be an independence system with F ≠ {∅}. For a subset S ⊆ E, a basis of S is a maximal independent subset of S, and we define

    lower rank of S = lr(S) = min{|B| : B a basis of S},
    upper rank of S = ur(S) = max{|B| : B a basis of S}.

The number

    q(E, F) = min{lr(S)/ur(S) : S ⊆ E and ur(S) ≠ 0}

will be called the rank quotient of (E, F). Edmonds [1] defined a matroid to be an independence system with lr = ur, i.e. with q(E, F) = 1. Thus, the rank quotient of an independence system (E, F) can be regarded as a measure of how much (E, F) differs from being a matroid. Clearly, q(E, F) ≤ 1, and if B is a basis of S and X ∈ F, then

    |B| ≥ lr(S) ≥ q(E, F) · ur(S) ≥ q(E, F) · |X ∩ S|.

The following basic theorem gives a performance guarantee of the greedy algorithm in terms of the lower and upper rank. This result was discovered by
Jenkyns and appears in [3]. Independently the other two authors found different proofs for it in [6]. We give here what we consider to be the most elegant proof from [6].
Theorem 1. Let (E, F) be an independence system. If c is a weight function on E, M is an optimal solution of (1), and G is any greedy solution, then

    c(G) ≥ q(E, F) · c(M).        (7)

Furthermore, for some weight function c, (7) holds with equality.
Proof. Suppose that G is the greedy solution with respect to the ordering (e_1, e_2, ..., e_n) of E. For j = 0, 1, ..., n let E_j = {e_1, e_2, ..., e_j}, G_j = G ∩ E_j, and M_j = M ∩ E_j. Set c_{n+1} = 0 and for j = 1, 2, ..., n let c_j = c(e_j) and d_j = c_j − c_{j+1}. Then, since G_j is a basis of E_j for all j, we have

    c(G) = Σ_{j=1}^{n} {|G_j| − |G_{j−1}|} c_j = Σ_{j=1}^{n} |G_j| d_j
         ≥ Σ_{j=1}^{n} lr(E_j) d_j
         ≥ Σ_{j=1}^{n} q(E, F) ur(E_j) d_j
         ≥ q(E, F) Σ_{j=1}^{n} |M_j| d_j
         = q(E, F) Σ_{j=1}^{n} {|M_j| − |M_{j−1}|} c_j = q(E, F) · c(M).
To show that equality in (7) might hold, let S_0 be a subset of E such that q(E, F) = lr(S_0)/ur(S_0) and define a weight function c on E by: c(e) = 1 if e ∈ S_0, otherwise c(e) = 0. Let B_1 and B_2 be bases of S_0 such that |B_1| = lr(S_0) and |B_2| = ur(S_0). Choose any ordering (e_1, e_2, ..., e_n) of E satisfying

    e_i ∈ B_1,  e_j ∈ S_0 ∖ B_1,  e_k ∈ E ∖ S_0  ⟹  i < j < k.

Then the ordering satisfies the condition c(e_i) ≥ c(e_{i+1}); c(G) = c(B_1) = |B_1|, c(M) = c(B_2) = |B_2|, and (7) does hold with equality.

A direct consequence of Theorem 1 is the following result of Edmonds [1]:

Corollary 2. An independence system (E, F) is a matroid iff, for any weight function c : E → R+, any greedy solution of (1) is an optimal solution of (1).
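On tiny ground sets the rank quotient can be computed directly from its definition; the brute-force sketch below is exponential-time and purely illustrative:

```python
from fractions import Fraction
from itertools import combinations

def rank_quotient(e, independent):
    """q(E, F) = min over subsets S of lr(S)/ur(S), by brute force."""
    elems = list(e)
    q = Fraction(1)
    for size in range(len(elems) + 1):
        for s in combinations(elems, size):
            # bases of S are the maximal independent subsets of S
            ind = [set(t) for r in range(len(s) + 1)
                   for t in combinations(s, r) if independent(set(t))]
            bases = [b for b in ind if not any(b < b2 for b2 in ind)]
            ur = max(len(b) for b in bases)
            if ur:
                q = min(q, Fraction(min(len(b) for b in bases), ur))
    return q
```

A matroid yields q = 1, while a system with bases of different sizes yields q < 1, in line with Theorem 1 and Corollary 2.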
3. Bounds for the rank quotient

Theorem 1 characterizes the worst-case performance of the greedy algorithm for a general independence system. Thus, to get performance guarantees for special independence systems (E, F) we have only to derive sharp lower bounds
on the rank quotient q(E, F). A very useful tool for that purpose is the following theorem, which is related to a well-known characterization of a matroid [1] by its circuits. A circuit of an independence system (E, F) is a minimal subset of E not belonging to F.

Theorem 3. Let (E, F) be an independence system. If, for any A ∈ F and e ∈ E, A ∪ {e} contains at most p circuits, then q(E, F) ≥ 1/p.

Proof. Let A ⊆ E, J be a basis of A, K be any independent subset of A and J ∖ K = {e_1, e_2, ..., e_r}. Since K ∪ {e_1} contains at most p circuits and since each such circuit meets K ∖ J, there is a subset X_1 ⊆ K ∖ J with |X_1| ≤ p such that (K ∖ X_1) ∪ {e_1} ∈ F. Repeating this argument we obtain subsets X_1, ..., X_r of K ∖ J, each of cardinality at most p, such that (K ∖ (X_1 ∪ ... ∪ X_r)) ∪ J ∈ F. Since J is a maximal independent subset of A, this implies K ∖ (X_1 ∪ ... ∪ X_r) ⊆ J, and since

    |X_1 ∪ ... ∪ X_r| ≤ p |J ∖ K|,

we get

    |K| ≤ p |J| − (p − 1) |J ∩ K| ≤ p |J|,

and when A, J and K are the appropriate "extreme" sets

    q(E, F) = lr(A)/ur(A) = |J|/|K| ≥ 1/p.

It is well-known that for every independence system (E, F) there are matroids (E, F_i), 1 ≤ i ≤ p, such that F = F_1 ∩ ... ∩ F_p; we say (E, F) is the intersection of the matroids (E, F_i). In fact, let C_1, ..., C_p be the circuits of (E, F); then (E, F) is the intersection of the matroids (E, F^i) with F^i = {A ⊆ E : C_i ⊄ A}. Hence the following immediate corollary to Theorem 3 is of some interest:
Corollary 4. If (E, F) is the intersection of p matroids, then q(E, F) ≥ 1/p.

Proof. This is an immediate consequence of Theorem 3 and of the well-known fact (cf. [1]) that, for a matroid (E, F_i), the union of an independent set A ∈ F_i and a singleton {e} contains at most one circuit.

Applying Theorem 3 to some special independence systems and exploiting their special structures, we get the following results. For the standard definitions used in these results and their proofs we refer to [3, 4, 5, 6].
Theorem 5. Let G_u = (V, U) be an undirected graph and G_d = (V, D) be a directed graph (without loops or multiple edges). Let n = |V|.
(a) Let F be the system of all matchings in G_u. If every connected component of G_u is a triangle or a path of length 0, 1, or 2, then q(U, F) = 1; otherwise q(U, F) = 1/2.
(b) Let F be the system of all subsets of Hamiltonian cycles in G_u. Then q(U, F) ≥ 1/(2 − 1/⌊n/2⌋) and equality holds when G_u is a complete graph.
(c) Let F be the system of all subsets of directed Hamiltonian cycles in G_d. Then q(D, F) ≥ 1/3. If G_d is a complete directed graph, q(D, F) = 1/3, and if G_d is a tournament, q(D, F) equals 1/2 or 1/3.
(d) Let F be the system of all vertex packings (stable sets) in G_u. Then 1/q(V, F) equals the maximum cardinality of a vertex packing contained in the neighborhood of some vertex v ∈ V. In particular, there is a sequence of graphs G^n = (V^n, U^n) such that |V^n| = n and q(V^n, F^n) → 0 for n → ∞.
(e) Let G_d = (V, D) be a complete directed graph, and let F be the system of all the subsets of D not containing a directed cycle ("acyclic subgraphs"). Then q(D, F) = 2/n.
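Theorem 3 and Theorem 5(a) can be checked on a small instance. For the matching system of a graph, a matching A plus one edge e contains at most two circuits (at most one per endpoint of e), so Theorem 3 applies with p = 2 and gives q ≥ 1/2. The sketch below counts the circuits by brute force:

```python
from itertools import combinations

def is_matching(edges):
    """Independence test: no two chosen edges share an endpoint."""
    ends = [v for e in edges for v in e]
    return len(ends) == len(set(ends))

def circuits_in(ground, independent):
    """All circuits (minimal dependent subsets) inside `ground`."""
    dep = [frozenset(t) for r in range(len(ground) + 1)
           for t in combinations(ground, r) if not independent(t)]
    return [c for c in dep if not any(d < c for d in dep)]

# path on vertices 1-2-3-4: A = {12, 34} is a matching; adding edge 23
# creates exactly the two circuits {12, 23} and {23, 34}
e12, e23, e34 = frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})
circuits = circuits_in([e12, e23, e34], is_matching)
```

This path has a component that is neither a triangle nor a path of length at most 2, so by Theorem 5(a) its rank quotient is exactly 1/2, matching the p = 2 bound.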
4. Greedy algorithm for minimum problems

Since the greedy algorithm constructs a basis of E, it can also be applied to the minimum problem:

    minimize c(B),  subject to B being a basis of E.        (8)

We restrict ourselves to bases in (8), for if we admitted arbitrary independent sets as in (1), then the empty set would be an optimal solution of (8) for any (E, F) and c. The greedy algorithm for the minimum problem (8) first orders the elements e ∈ E such that c(e_i) ≤ c(e_{i+1}) for any i, and then proceeds as before. Again the set G = G_n will be called a greedy solution of (8). Now let c_max = max{c(e) : e ∈ E} and assume that all bases of E have the same cardinality k. Then it is easy to see that G is a greedy solution and M an optimal solution of the minimum problem (8) iff G is a greedy solution and M an optimal solution of the maximum problem (1) for the weight function c' : E → R+ defined as c'(e) = c_max − c(e). Moreover, for any basis B of E, in particular for G and M, we have

    c(B) = k · c_max − c'(B).

Hence, Theorem 1 immediately implies the following performance guarantee of the greedy algorithm for minimum problems.
Theorem 6. Let (E, F) be an independence system such that all bases of E have the same cardinality k, and let q = q(E, F). Let c be a weight function, c_max = max{c(e) : e ∈ E}, G a greedy solution, and M an optimal solution of the minimum problem (8). Then

    c(G) ≤ q · c(M) + (1 − q) · k · c_max        (9)

and for any independence system there is a weight function such that (9) holds with equality.
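The reduction behind Theorem 6 is easy to exercise: running the greedy algorithm for the maximum problem on c'(e) = c_max − c(e) visits the elements in exactly the same order as the minimum-problem greedy on c, so both produce the same basis. A small sketch (the rank-2 uniform matroid is an arbitrary illustrative choice; all its bases have cardinality 2):

```python
def greedy_basis(elements, key, independent):
    """Greedy scan in the order given by `key` (ascending), keeping every
    element that preserves independence; the result is a basis of E."""
    g = set()
    for e in sorted(elements, key=key):
        if independent(g | {e}):
            g.add(e)
    return g

cost = {1: 5, 2: 1, 3: 4, 4: 2}
ind = lambda a: len(a) <= 2                     # all bases have cardinality 2
c_max = max(cost.values())

g_min = greedy_basis(cost, lambda e: cost[e], ind)             # minimum greedy
g_max = greedy_basis(cost, lambda e: -(c_max - cost[e]), ind)  # maximum greedy on c'
```

Here both scans pick the two cheapest elements, confirming the equivalence used in the proof of Theorem 6.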
Obviously, the bound in (9) depends not only on the independence system (E, F), but also on the weight function c. This shows that for any (E, F) the worst-case behaviour of the greedy algorithm for the minimum problem (8) may be arbitrarily bad for some weight function c; cf. the corresponding examples in [3] and [6]. Two very important independence systems which satisfy the additional assumption of equicardinality of the bases of E are the systems of subsets of all travelling salesman tours (Hamiltonian cycles in a complete graph or complete digraph). Hence, Theorem 5(b) and (c) and Theorem 6 immediately imply the following results:
Corollary 7. Let G be a greedy solution and M an optimal solution of the symmetric travelling salesman problem for n cities (vertices) and distances (weights on the edges) c(e) ≥ 0, e ∈ E. Then c(G) ≤ q · c(M) + (1 − q) · n · c_max, where q = 1/(2 − 1/⌊n/2⌋).
Corollary 8. Let G be a greedy solution and M an optimal solution of the asymmetric travelling salesman problem for n cities and distances c(e), e ∈ D. Then c(G) ≤ (1/3) · c(M) + (2/3) · n · c_max.
5. Modifications of the greedy algorithm

The usual greedy algorithm, in every step, adds a single element e ∈ E to the current independent set. In this section we examine modifications of the greedy heuristic which in every step add up to k elements to the current independent set, where k is a fixed integer, 1 < k < n = |E|. These modifications were given earlier in Section 1.

Example 1. Let K_{m,n} be the complete bipartite graph whose vertex set is V = R ∪ B. Let F = {X ⊆ V : no two members of X are adjacent}. Then (V, F) is an independence system with q(V, F) = 1/max{m, n}. Suppose |R| = m = k − 1 and |B| = n = p · k for some integer p ≥ 1. Define a weight function c on V by

    c(v) = α > 0  if v ∈ R,
    c(v) = β > 0  if v ∈ B.
Then R and B are the only bases of V, and the basis G* obtained by either the k-starting-greedy algorithm or the (=k)-greedy algorithm must be B. Therefore c(G*) = p · k · β. If α ≫ [(pk)^2/(k − 1)] · β, then p · k · β ≪ (k − 1)α, so M = R and

    c(G*) = p · k · β ≪ q · c(M) = {1/(pk)}[(k − 1)α] < c(M).

Thus, the lower bound in (3) for the usual greedy algorithm is violated. On the
other hand, the usual greedy algorithm (and the (≤k)-greedy algorithm) would find R as the greedy solution. The solution G* is so bad because exactly k elements must be taken at every stage, despite the fact that it would be better to take fewer.

Example 2. Let (V, F) be the independence system of Example 1, but suppose |R| = m = k and |B| = n ≫ k + 1. Suppose also that v_0 is a particular vertex in B. Define a new weight function on V by

    c(v) = 2      if v ∈ R,
    c(v) = 1      if v ∈ B ∖ {v_0},
    c(v) = k + ε  if v = v_0,  for 0 < ε < 1.
Then M = B and, since 2 < k + ε, the solution G obtained by the usual greedy algorithm is B, so that c(G) = (k + ε) + (n − 1) · 1 ≫ 2k = c(R). However, since 2k > (k + ε) + (k − 1) · 1 = 2k − (1 − ε), R is the solution G* obtained by the (≤k)-greedy algorithm. Thus the (≤k)-greedy solution may also be worse than (the worst) solution obtained by the usual greedy algorithm. However, the techniques of the proof of Theorem 1 may be modified to prove the generalization:

Theorem 9. Let (E, F) be an independence system. If c is a weight function on E, M is an optimal solution of (1), and G is any solution obtained by the (≤k)-greedy algorithm, then

    c(G) ≥ q(E, F) · c(M).

Proof. Let T^1, T^2, ..., T^m be the sets added to the solution at the successive stages of the (≤k)-greedy algorithm. The elements of E are not ordered by the (≤k)-greedy algorithm, but we may choose an ordering (e_1, e_2, ..., e_n) of E with c_1 ≥ c_2 ≥ ... ≥ c_n such that ties are broken as follows:

    e_i ∈ T^h, e_j ∈ T^{h+1}, c_i = c_j  ⟹  i < j;
    e_i ∈ G, e_j ∉ G, c_i = c_j  ⟹  i < j.        (10)

Then for j = 0, 1, ..., n let E_j = {e_1, e_2, ..., e_j}, G_j = G ∩ E_j, and M_j = M ∩ E_j. Set c_{n+1} = 0 and for j = 1, 2, ..., n let d_j = c_j − c_{j+1}. Set α(0) = 0 and for i = 1, 2, ..., m set α(i) to be the largest index j such that e_j ∈ T^i. If we could show

    c(G) ≥ Σ_{j=1}^{n} lr(E_j) d_j,        (11)

the theorem would follow, since for any M ∈ F we have

    q · c(M) = q Σ_{j=1}^{n} |M_j| d_j ≤ q Σ_{j=1}^{n} ur(E_j) d_j ≤ Σ_{j=1}^{n} lr(E_j) d_j,

where q denotes q(E, F).

Suppose α' = α(i − 1) < j < α(i) = α and suppose B is a basis of E_j containing G_j. Let X = B ∖ G_j. If |X| > |T^i ∖ G_j| = t, let Y be any t-subset of X and let Z = Y ∪ (T^i ∩ G_j). Then Z ⊆ E ∖ G^{i−1}, |Z| = |T^i| and G^{i−1} ∪ Z ∈ F. Since any element of Y has a strictly larger weight than any element of T^i ∖ G_j, c(Z) > c(T^i), contradicting the choice of T^i. Hence |X| = |B| − |G_j| ≤ |T^i ∖ G_j| and

    |B| ≤ |G_j| + |T^i ∖ G_j| = |G_α|.

From this and the fact that G_{α(h)} is a basis of E_{α(h)} for all h we obtain

    lr(E_j) ≤ |G_α|  for j = 0, 1, ..., α(i).        (12)

We shall show that for i = 1, 2, ..., m there is an index β(i) such that, setting b = β(i) and a = α(i),

    c(G_a) ≥ Σ_{j=1}^{b} lr(E_j) d_j + |G_a| c_{b+1}.        (13)

Thus, if α(m) < j ≤ n, then, since G = G_{α(m)} is a basis of E_j, |G| d_j ≥ lr(E_j) d_j. From this and (12) we get

    |G_{α(m)}| c_{b+1} = |G| Σ_{j=b+1}^{n} d_j ≥ Σ_{j=b+1}^{n} lr(E_j) d_j.

Thus, since (13) holds for i = m, we have (11).

To prove (13) for i = 1, let H denote the usual greedy solution of (1). Since H_j is a basis of E_j in the construction, |H_j| ≥ lr(E_j) for 0 ≤ j ≤ n. Let t be the largest index j such that |H_j| ≤ |G_{α(1)}| = |T^1|. Then

    c(G_{α(1)}) = c(T^1) ≥ c(H_t) = Σ_{j=1}^{t} |H_j| d_j + |H_t| c_{t+1}
                ≥ Σ_{j=1}^{t} lr(E_j) d_j + |H_t| c_{t+1}.

If t = n we have (13) for all i; if t < n, |H_t| = |G_{α(1)}| and, putting β(1) = t, we have (13) for i = 1.

Suppose now that (13) holds for i = 1, 2, ..., h < m; in particular, let b be the largest index such that

    c(G_a) ≥ Σ_{j=1}^{b} lr(E_j) d_j + |G_a| c_{b+1},        (14)

where a denotes α(h). If b < a, then, using (12),

    |G_a| c_{b+1} = |G_a| Σ_{j=b+1}^{a} d_j + |G_a| c_{a+1} ≥ Σ_{j=b+1}^{a} lr(E_j) d_j + |G_a| c_{a+1}.

Then b could be replaced by a in (14), contradicting our selection of b. Let T denote T^{h+1} and a' denote α(h + 1). If c(T) ≥ |T| c_{b+1}, then

    c(G_{a'}) = c(G_a) + c(T) ≥ Σ_{j=1}^{b} lr(E_j) d_j + |G_a| c_{b+1} + |T| c_{b+1}.

So, setting β(h + 1) = b, we have (13) for i = h + 1. Assume then that c(T) < |T| c_{b+1} and, implicitly, that b < n. Extend G_a using the usual greedy algorithm, i.e. set H_a = G_a and, given H_j for a ≤ j < n, set

    H_{j+1} = H_j ∪ {e_{j+1}}  if H_j ∪ {e_{j+1}} ∈ F,  H_{j+1} = H_j  otherwise.

Let t be the largest index j such that |H_j| ≤ |G_{a'}| = |G_a| + |T|, and let X = H_t ∖ G_a. Then X ⊆ E ∖ G_a, |X| ≤ |T| and G_a ∪ X ∈ F. Hence, by the construction of the (≤k)-greedy algorithm, c(T) ≥ c(X). If t ≤ b we would have |X| = |T| and every element of X would have weight at least c_{b+1}, whence c(X) ≥ |T| c_{b+1} > c(T) ≥ c(X), a contradiction. Therefore we must have b < t ≤ n, and then, with X_j = X ∩ E_j = H_j ∖ G_a,

    c(X) ≥ Σ_{j=b+1}^{t} |X_j| d_j + |X| c_{t+1}
         = Σ_{j=b+1}^{t} |H_j| d_j − Σ_{j=b+1}^{t} |G_a| d_j + |X| c_{t+1}
         ≥ Σ_{j=b+1}^{t} lr(E_j) d_j − |G_a| (c_{b+1} − c_{t+1}) + |X| c_{t+1}.

Hence

    c(G_{a'}) = c(G_a) + c(T) ≥ Σ_{j=1}^{b} lr(E_j) d_j + |G_a| c_{b+1} + c(X)
              ≥ Σ_{j=1}^{t} lr(E_j) d_j + |G_a| c_{t+1} + |X| c_{t+1}.

Hence, if t = n, we have (13) for all i; if t < n, |X| = |T| and, putting β(h + 1) = t, we have (13) for i = h + 1. The theorem follows by induction.
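Example 2 can be replayed numerically. The sketch below (with the assumed parameters k = 3, n = 10, ε = 0.5) shows the (≤k)-greedy algorithm settling for R, of weight 2k = 6, while the usual greedy algorithm finds the optimum B:

```python
from itertools import combinations

k, n, eps = 3, 10, 0.5
R = [("r", i) for i in range(k)]
B = [("b", i) for i in range(n)]
v0 = ("b", 0)

def c(v):
    if v[0] == "r":
        return 2.0
    return k + eps if v == v0 else 1.0

def stable(x):
    """Stable sets of the complete bipartite graph: one side only."""
    return len({side for (side, _) in x}) <= 1

def usual_greedy(elts):
    g = set()
    for e in sorted(elts, key=c, reverse=True):
        if stable(g | {e}):
            g.add(e)
    return g

def le_k_greedy(elts):
    g = set()
    while True:
        r = [e for e in elts if e not in g and stable(g | {e})]
        if not r:
            return g
        g |= max((set(t) for s in range(1, k + 1)
                  for t in combinations(r, s) if stable(g | set(t))),
                 key=lambda t: sum(c(e) for e in t))

g_usual = usual_greedy(R + B)   # = B, weight (k + eps) + (n - 1)
g_le_k = le_k_greedy(R + B)     # = R, weight 2k only
```

The first batch decision {r_0, r_1, r_2}, of weight 6, beats {v_0, b, b}, of weight 5.5, and then no further element can be added, exactly as in the text.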
We have just proved that the (≤k)-greedy algorithm satisfies the same worst case bound as the usual greedy algorithm. The following example shows that this lower bound need not be attained by the (≤k)-greedy algorithm.

Example 3. Consider the independence system (E, F) where E = {1, 2, ..., m} for m > n + 2 ≥ 4 and

    F = {A ⊆ E : A = {1, 2, ..., n} or |A ∩ {1, 2, ..., n}| < n};

then

    q(E, F) = lr(E)/ur(E) = n/(m − 1).

If n = 2, then this lower bound can be achieved by the (≤2)-greedy algorithm using a constant weight function. However, if n = 3 this lower bound cannot be achieved by the (≤2)-greedy algorithm using any weight function c on E. On account of the symmetry in the definition of F, we may assume that

    c_1 ≥ c_2 ≥ c_3  and  c_4 ≥ c_5 ≥ ... ≥ c_m.        (15)
Let G be a (≤2)-greedy solution and M an optimal solution of (1).

Case 1: G = {1, 2, 3}. If also M = {1, 2, 3}, then obviously

    c(G)/c(M) > 3/(m − 1) = q(E, F).        (16)

Thus, we can assume that |M ∩ {1, 2, 3}| ≤ 2; hence

    c(M) ≤ c_1 + c_2 + c_4 + (m − 4) c_5.

Since G = {1, 2, 3}, it follows from (15) and the definition of the (≤2)-greedy algorithm that

    c(G) = c_1 + c_2 + c_3 ≥ c_1 + c_2 + c_4 + c_5.

Thus, if c_5 = 0, c(G) ≥ c(M), so (16) holds. If c_5 > 0, then

    c(G)/c(M) ≥ (c_1 + c_2 + c_4 + c_5)/(c_1 + c_2 + c_4 + (m − 4) c_5)
             ≥ (5 c_5 + c_5)/(5 c_5 + (m − 4) c_5) = 6/(m + 1) > 3/(m − 1)

(as c_1 + c_2 + c_4 ≥ 2 c_3 + c_4 ≥ 5 c_5 and m > 4).

Case 2: |G ∩ {1, 2, 3}| = 2.
Because of (15) we can assume that G = E − {3} and c₄ + c₅ ≥ c₃. If also |M ∩ {1, 2, 3}| = 2, inequality (16) follows at once, and if M = {1, 2, 3}, (16) follows from

c(G) ≥ c₁ + c₂ + c₄ + c₅ ≥ c₁ + c₂ + c₃ = c(M).
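The ranks in Example 3 are small enough to check by exhaustive enumeration. The sketch below (function names `independent` and `rank_extremes` are ours, introduced for illustration) lists all maximal independent sets for m = 6, n = 3 and confirms lr(E) = n and ur(E) = m − 1:

```python
from itertools import combinations

def independent(A, n):
    """Membership test of Example 3: A is independent iff
    A = {1,...,n} or |A ∩ {1,...,n}| < n."""
    base = set(range(1, n + 1))
    return A == base or len(A & base) < n

def rank_extremes(m, n):
    """Return (lr(E), ur(E)): the minimum and maximum cardinality
    of a maximal independent subset of E = {1,...,m}."""
    E = set(range(1, m + 1))
    all_subsets = [set(c) for k in range(m + 1)
                   for c in combinations(sorted(E), k)]
    maximal = [A for A in all_subsets if independent(A, n)
               and not any(independent(A | {e}, n) for e in E - A)]
    return min(len(A) for A in maximal), max(len(A) for A in maximal)

lr, ur = rank_extremes(m=6, n=3)   # m > n + 2 >= 4
print(lr, ur)                      # 3 5, so q(E, F) = 3/(6 - 1)
```

Here ur(E) = m − 1 because a maximal independent set may avoid only one element of {1, 2, 3}, while {1, ..., n} itself is maximal of size n.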
References

[1] J. Edmonds, "Matroids and the greedy algorithm", mimeographed notes, International Symposium on Mathematical Programming (Princeton, NJ, 1966) pp. 93-117.
[2] D. Hausmann and B. Korte, "K-greedy algorithm for independence systems", Working Paper No. 7764-OR, Institut für Ökonometrie und Operations Research, University of Bonn (Bonn, June 1977).
[3] T.A. Jenkyns, "The efficacy of the 'greedy' algorithm", in: Proceedings of the seventh Southeastern conference on combinatorics, graph theory, and computing, pp. 341-350.
[4] T.A. Jenkyns, "p-systems as intersections of matroids", in: Proceedings of the eighth Southeastern conference on combinatorics, graph theory, and computing, to appear.
[5] T.A. Jenkyns, "The greedy travelling salesman's problem", Networks, to appear.
[6] B. Korte and D. Hausmann, "An analysis of the greedy heuristic for independence systems", Annals of Discrete Mathematics 2 (1978) 65-74.
Mathematical Programming Study 12 (1980) 132-149. North-Holland Publishing Company
QUADRATIC KNAPSACK PROBLEMS*

G. GALLO
Istituto M. Picone per le Applicazioni del Calcolo, CNR, Roma, Italy

P.L. HAMMER and B. SIMEONE
University of Waterloo, Waterloo, Ont., Canada

Received 17 June 1977
Revised manuscript received 10 April 1978
The quadratic knapsack (QK) model naturally arises in a variety of problems in operations research, statistics and combinatorics. Some "upper planes" for the QK problem are derived, and their different uses in a branch-and-bound scheme for solving such a problem are discussed. Some theoretical results concerning the class of all upper planes, as well as extensive computational experience, are reported.
Key words: Knapsack Problem, Quadratic Programming, Upper Planes, Branch-and-Bound, Computation.
1. Introduction

1.1. Let Bⁿ be the set of all 0-1 n-vectors, and define as a quadratic 0-1 knapsack problem (QK) any of the following two problems:

max xᵀQx  s.t.  ax ≤ b, x ∈ Bⁿ,  (1.1)

min xᵀQx  s.t.  ax ≥ b, x ∈ Bⁿ,  (1.2)

where Q is a non-negative¹ square matrix of order n, a is a positive n-vector and b a positive scalar. Since x_i² = x_i for x_i = 0, 1, linear terms in the objective function of (1.1) or (1.2) can be implicitly taken into account in the quadratic part. We can also assume that Q is symmetric. This model arises from a variety of applications.

* Presented at the IX International Symposium on Mathematical Programming, Budapest, August 1976.
¹ Actually, the presence of negative elements on the diagonal does not affect our development.
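For small n, problem (1.1) can be made concrete by exhaustive enumeration. The instance below is hypothetical and only meant to fix ideas (the function name `qk_bruteforce` is ours, not from the paper):

```python
from itertools import product

def qk_bruteforce(Q, a, b):
    """Solve max x^T Q x  s.t.  a x <= b, x in {0,1}^n by enumeration.
    Exponential in n; useful only as a reference on tiny instances."""
    n = len(a)
    best_val, best_x = 0, (0,) * n
    for x in product((0, 1), repeat=n):
        if sum(ai * xi for ai, xi in zip(a, x)) <= b:
            val = sum(Q[i][j] * x[i] * x[j]
                      for i in range(n) for j in range(n))
            if val > best_val:
                best_val, best_x = val, x
    return best_val, best_x

# symmetric, non-negative Q; positive a; positive b (hypothetical data)
Q = [[0, 3, 1, 0],
     [3, 0, 2, 1],
     [1, 2, 0, 4],
     [0, 1, 4, 0]]
a = [3, 2, 4, 1]
b = 7
print(qk_bruteforce(Q, a, b))   # (14, (0, 1, 1, 1))
```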
G. Gallo et al./ Quadratic knapsack problems
Example 1 (Witzgall [14]). The technology of communication satellites has recently inspired the design of mail subsystems in which messages are transmitted electronically rather than physically. Electronic message handling stations convert physical messages into electronic ones and vice versa, and communicate with each other through satellites. Given n potential sites for the stations, the investment cost a_i for building a station in site i, and the average daily mail volume q_ij observed between i and j, the problem of selecting a set S ⊆ {1, ..., n} of locations such that the global traffic Σ_{i,j∈S} q_ij is maximized and a budget constraint Σ_{i∈S} a_i ≤ b is met is obviously of the form (1.1). Similar problems have been investigated regarding the location of railway stations (Land [8]), freight handling terminals (Rhys [11]) and airports.

Example 2. In hydrological studies, the observations made by different pluviometers in the same region are usually seen to be correlated. Thus it is desirable to reduce the number of pluviometers by choosing only a few of them, with minimal redundancy and an acceptable loss of information. If q_ij, i ≠ j, is the absolute value of the covariance between the rainfall in stations i and j, estimated from a record of past observations, q_ii = 0, a_i is the variance of rainfall in i, and b is a fixed fraction of the total variance Σ_i a_i, the problem can be formulated as (1.2). A similar model arises in applications to portfolio selection (Laughhunn [9]), when one wishes to select a subset of n possible investment proposals such that some fixed minimum return is guaranteed and the risk is minimized. The sum of the absolute values of the covariances of the (random) returns associated with the proposals can be taken as a reasonable measure of risk.

Example 3. The problem of determining whether a given graph possesses a clique of order k (k a fixed integer) is a special case of (1.1), where Q is the node-node adjacency matrix of the graph, a_i = 1 for all i and b = k. The clique does exist if and only if the optimum value of (1.1) is k(k − 1). It is well known (Karp [7]) that this clique problem is NP-complete. More generally, the QK problem can be formulated in the following graph-theoretic terms. Given a graph G, let a "benefit" be associated with each edge, and a "cost" with each node of G. Define the "benefit" of a subgraph H to be the sum of the benefits of all edges in H, and the "cost" of H to be the sum of the costs of all nodes in H. (1.1) is then equivalent to the problem of finding a subset S of nodes such that the subgraph induced by S has a cost not exceeding some fixed level, and yields a maximum benefit.

1.2. The present paper is mainly concerned with the formulation (1.1) of QK. Actually, the algorithms we propose can be converted, in a straightforward way, for handling problem (1.2). Any lagrangian relaxation of (1.1) with respect to the knapsack constraint is amenable to the maximization over Bⁿ of a quadratic function whose diagonal
terms are negative and whose off-diagonal terms are non-negative. Rhys [11] showed that the latter problem is equivalent to a network flow one, and Balinski [1] proposed a labelling procedure for solving it.

1.3. The algorithms developed in this paper are based on bounds on the optimum value of (1.1), obtained by considering linear relaxations in which the quadratic objective f(x) is replaced by an "upper plane"², i.e., a linear function g(x) which dominates f(x) in all feasible points. This approach is motivated by the ease with which even large linear knapsack problems can be solved (Salkin and de Kluyver [19]). The use of upper planes is not new in the literature (Taha [13]) and goes back at least as early as the works of Beale [3] and Balinski [1] on certain nonlinear transportation problems. Upper planes have usually been considered in connection with the outer-linearization of a concave function; also, Cabot and Francis [4] suggested a simple method for deriving upper planes for non-convex continuous quadratic problems. (As a matter of fact, the upper planes constructed in Section 2 of this paper turn out to be closely related to those of Cabot and Francis.) The focus of this work is on the relative efficiency of different upper planes and on the different ways in which they can be exploited in branch-and-bound techniques for solving (1.1). Extensive computational experience is presented. Finally, in the last section some theoretical properties concerning the family of all upper planes are investigated.
2. Upper planes for the quadratic knapsack problem

2.1. Let us denote by X ≡ {x ∈ Bⁿ: ax ≤ b} the feasible set of (1.1). Without loss of generality, we can assume Σ_{i=1}^{n} a_i > b ≥ a₁ ≥ a₂ ≥ ⋯ ≥ a_n > 0. An upper plane (UP) for the function f(x) = xᵀQx in X is any linear function g(x) such that g(x) ≥ f(x) for all x ∈ X. Given an UP g(x) for f(x) in X, the corresponding linear relaxation of QK is the (linear) knapsack problem

max{g(x): x ∈ X}.  (2.1)

From the solution of (2.1) both an upper and a lower bound on the optimum value of (1.1) can be derived. Indeed, if x* and x̃ are optimal solutions of (1.1) and (2.1) respectively, one has f(x̃) ≤ f(x*) ≤ g(x̃).

2.2. Given the upper plane g(x), a cheaper way of getting bounds consists in solving the continuous knapsack problem

max{g(x): x ∈ X̄},  (2.2)

where X̄ ≡ {x ∈ Rⁿ: ax ≤ b, 0 ≤ x ≤ e}, with e = (1, 1, ..., 1).

² Some authors use the term "linear overestimator".
Such a problem admits an optimal solution whose components are all zero or one, with the possible exception of a single component. Setting this variable down to zero, a binary feasible vector x⁰ is obtained and f(x⁰) is a lower bound for the quadratic optimum; an upper bound is given by the optimum value of (2.2).

2.3. The algorithm which will be described in the next section is an enumerative one, in which the exploration of each node can be viewed as a two-stage process. The role of the first stage is the generation (through the exact or approximate solution of a number of linear knapsack problems) of the coefficients of an upper plane, while the role of the second stage is the determination of upper and lower bounds on the maximum of f(x) in X (through the solution of the linear relaxation corresponding to the above upper plane).

2.4. A simple way to derive upper planes is the following. If v_j is an upper bound of the set {q_jᵀx: x ∈ X}, where q_j is the jth column of Q, the function Σ_{j=1}^{n} v_j x_j is obviously an UP for f(x) = Σ_{j=1}^{n} (q_jᵀx) x_j in X. This turns out to be true even when v_j is an upper bound for the smaller set {q_jᵀx: x ∈ X, x_j = 1}, since (q_jᵀx) x_j ≤ v_j x_j when x_j = 1 and (q_jᵀx) x_j = v_j x_j = 0 when x_j = 0. Four possible choices for v_j are given, under the hypothesis that q_ij ≥ 0 for all i, j:

v_j¹ = Σ_{i=1}^{n} q_ij,  (1)

v_j² = Σ^{(h)} q_ij,  (2)

where Σ^{(h)} means that the sum is restricted to the h largest elements of the jth column of Q, and h is the maximum number of ones in a feasible solution (since the a_i's are non-increasing, h is the largest index k for which a_{n−k+1} + ⋯ + a_n ≤ b),

v_j³ = max{q_jᵀx: x ∈ X̄}  (3)

(of course one can take the integer part of the r.h.s. if Q is integer),

v_j⁴ = max{q_jᵀx: x ∈ X}.  (4)

The order relationship between the four upper bounds is shown in Fig. 1, where a continuous arrow points from a smaller to a larger upper bound, while the dotted arrow means that in most observed cases (but not always) v_j³ was found to be smaller than v_j².³ On the other hand, the computational effort needed for evaluating the v_j's increases from (1) to (4). The upper bound (4) is the best possible one, but it is also by far the most expensive. Extensive numerical experimentation has been

³ A remarkable exception arises when all the maximal feasible vectors have the same number of ones. In this case v_j² = v_j³ ≥ v_j⁴.
Fig. 1.

performed (see Section 4) in order to find an upper plane which corresponds to a good trade-off between computational effort and tightness to the quadratic function.

2.5. A given lower bound on the optimum value of (1.1) can often be improved, at low cost, through a sequence of the following elementary operations (suggested for linear knapsack problems by Petersen [10]):

Fill-up, which transforms a given feasible point x, having x_j = 0, into a point of the form x' = x + u_j (u_j is the jth unit vector). Then one has f(x') ≥ f(x), the inequality being strict unless q_j is the zero vector.

Exchange, which transforms a point x such that x_i = 1 and x_j = 0 for some i, j into the point x' = x − u_i + u_j, so that x'_i = 0 and x'_j = 1. The exchange is performed only if x' is still feasible and (using the terminology in [5]) the second derivative

Δ_ij(x) = f(x₁, ..., 1, ..., 0, ..., x_n) − f(x₁, ..., 0, ..., 1, ..., x_n)

(with the displayed 1 and 0 in positions i and j respectively) is negative. Since

Δ_ij(x) = Δ_i(x) − Δ_j(x) + 2 q_ij (x_i − x_j),

where

Δ_k(x) = f(x₁, ..., 1, ..., x_n) − f(x₁, ..., 0, ..., x_n) = q_kk + 2 Σ_{h≠k} q_hk x_h

(with the displayed value in position k) is the first derivative, only the first derivatives need to be kept in storage; they can be updated after each elementary operation by simple formulae.

There are two good reasons for trying to improve the lower bounds. As we shall see later, the above discussed linear relaxations yield very good lower bounds. Thus, the solution of the linear relaxation, followed by the improving procedure sketched above, gives rise to an economic and effective heuristic algorithm, which often leads to the true optimum. If, on the other hand, one wishes to use linear relaxations in a branch-and-bound scheme for finding an optimal solution, early knowledge of tight lower bounds enhances the fathoming power of the algorithm.
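The two elementary operations can be sketched as follows. This is a simplified single-routine illustration, not the authors' code; for brevity the first derivatives are recomputed from scratch instead of being updated by the incremental formulae:

```python
def first_derivatives(Q, x):
    """Delta_k(x) = q_kk + 2 * sum_{h != k} q_hk x_h  (Q symmetric)."""
    n = len(x)
    return [Q[k][k] + 2 * sum(Q[h][k] * x[h] for h in range(n) if h != k)
            for k in range(n)]

def improve(Q, a, b, x):
    """Repeated fill-up and exchange moves of Section 2.5 on a feasible x."""
    n, improved = len(x), True
    while improved:
        improved = False
        d = first_derivatives(Q, x)
        w = sum(a[j] * x[j] for j in range(n))
        # fill-up: raising x_j from 0 to 1 gains Delta_j(x) >= 0
        for j in range(n):
            if x[j] == 0 and w + a[j] <= b and d[j] > 0:
                x[j], w, improved = 1, w + a[j], True
                d = first_derivatives(Q, x)
        # exchange i -> j, accepted when the "second derivative"
        # Delta_ij = Delta_i - Delta_j + 2 q_ij is negative
        for i in range(n):
            for j in range(n):
                if x[i] == 1 and x[j] == 0 and w - a[i] + a[j] <= b \
                        and d[i] - d[j] + 2 * Q[i][j] < 0:
                    x[i], x[j] = 0, 1
                    w += a[j] - a[i]
                    d = first_derivatives(Q, x)
                    improved = True
    return x

# hypothetical instance: starting from the feasible point (0,1,0,0)
Q = [[0, 3, 1, 0], [3, 0, 2, 1], [1, 2, 0, 4], [0, 1, 4, 0]]
a, b = [3, 2, 4, 1], 7
x_imp = improve(Q, a, b, [0, 1, 0, 0])
print(x_imp)   # [0, 1, 1, 1]
```

Each accepted move strictly increases f, so the loop terminates.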
3. Use of upper planes in branch-and-bound algorithms

3.1. Upper planes can be exploited as basic tools in deriving enumerative algorithms for QK. For any subproblem obtained from (1.1) by restricting certain variables to 0 or 1, it is possible to derive bounds through the solution of a discrete or continuous linear relaxation. Moreover, upper planes play an interesting role in other basic aspects of enumerative algorithms, such as forcing of variables and branching. Up and down penalties for linear relaxations can also be obtained at a modest cost. Finally, the amount of work required in order to generate the linear relaxations can be considerably reduced by means of recursive formulae.

Throughout this section, we shall assume, for the sake of simplicity, that all the data Q, a, b are integer; we shall denote by U, Z and F the index sets of the variables which, in the current subproblem (P), are fixed to 1, fixed to 0, or free, respectively, so that the feasible set of (P) is

S ≡ {x ∈ X: x_j = 1 ∀j ∈ U; x_j = 0 ∀j ∈ Z};

v₀ + Σ_{j∈F} v_j x_j will stand for the current UP, and Σ_{j∈F} a_j x_j ≤ b̃ (with b̃ = b − Σ_{j∈U} a_j) for the current knapsack constraint, while (P⁰) and (P¹) will represent the two subproblems, with feasible sets S⁰ and S¹ respectively, obtained from (P) by adding the further constraint x_r = 0 or x_r = 1.

3.2. Bounding. In each subproblem (P), the restriction of the objective function f to S is a non-negative quadratic function in the free variables, for which we can construct an upper plane v₀ + Σ_{j∈F} v_j x_j according to one of the rules (1), ..., (4) of Section 2.4. It is important to recognize that, in all four cases, the law which associates to any subproblem (P) the corresponding upper plane g is adaptive: in other words, if g' and g'' are the upper planes corresponding to (P⁰) and (P¹), generated according to the same rule as for g, one has g'(x) ≤ g(x) ∀x ∈ S⁰ and g''(x) ≤ g(x) ∀x ∈ S¹. Let us prove it, for illustration purposes, in the case of the UP of type (4); let us also restrict our attention to the upper plane g''(x) = v₀'' + Σ_{j∈F−{r}} v_j'' x_j for (P¹). Dropping for simplicity the index "4", we have:

v₀ = Σ_{i,j∈U} q_ij,
v_j = 2 Σ_{i∈U} q_ij + max{Σ_{i∈F} q_ij x_i : Σ_{i∈F} a_i x_i ≤ b̃, x_i = 0, 1},

v₀'' = Σ_{i,j∈U∪{r}} q_ij = v₀ + 2 Σ_{i∈U} q_ir + q_rr,
v_j'' = 2 Σ_{i∈U} q_ij + 2 q_rj + max{Σ_{i∈F−{r}} q_ij x_i : Σ_{i∈F−{r}} a_i x_i ≤ b̃ − a_r, x_i = 0, 1}.

Hence, taking x_r = 1 in the maximum defining v_j, we have

v_j ≥ v_j'' − q_rj, ∀j ∈ F − {r},

and, for any fixed x ∈ S¹ (whose restriction to F is feasible for the maximum defining v_r),

v_r ≥ 2 Σ_{i∈U} q_ir + q_rr + Σ_{i∈F−{r}} q_ir x_i = v₀'' − v₀ + Σ_{i∈F−{r}} q_ir x_i,

so that

g(x) = v₀ + v_r + Σ_{j∈F−{r}} v_j x_j ≥ v₀'' + Σ_{i∈F−{r}} q_ir x_i + Σ_{j∈F−{r}} (v_j'' − q_rj) x_j = v₀'' + Σ_{j∈F−{r}} v_j'' x_j = g''(x),

by the symmetry of Q.
3.3. Branching. Let x⁰ be an optimal solution of the linear relaxation of the current subproblem, and let F₁ ≡ {j ∈ F: x⁰_j = 1}. A reasonable criterion for selecting the branching variable x_r is to choose r such that

v_r/a_r = max_{j∈F₁} v_j/a_j,

the branch x_r = 0 being explored first.

3.4. Recursions. If v₀ + Σ_{j∈F} v_j x_j is the UP for the current node, generated according to rule (1) of Section 2.4, the corresponding UPs for the two successors can be generated through simple recursions. Actually, a direct computation shows that:

Proposition 3.1. The UPs of type (1) for (P⁰) and (P¹) are given by

v₀ + Σ_{j∈F−{r}} (v_j − q_rj) x_j

and by

(v₀ + 2 Σ_{i∈U} q_ir + q_rr) + Σ_{j∈F−{r}} (v_j + q_rj) x_j,

respectively.

Slightly more complicated recursions hold for the UPs (2) and (3). Even if recursions are not used, it is important to notice that the most expensive part of the work required for generating the UP (3), i.e., sorting the elements of each column j according to non-increasing ratios q_ij/a_i, need not be repeated once it has been done for the initial problem. A similar observation can be made for the UP (2).

3.5. Forcing. At each node, one is really interested only in those feasible points x in which the quadratic objective function, and hence, a fortiori, the current UP, is strictly greater than the current lower bound z. The constraint v₀ + Σ_{j∈F} v_j x_j ≥ z + 1, combined with the knapsack constraint Σ_{j∈F} a_j x_j ≤ b̃, may be very effective in forcing variables to the value 0 or 1. To this end, the constraint pairing techniques described in Hammer et al. [6] can be successfully exploited. Such techniques also allow for detecting whether the system composed of the two above inequalities is inconsistent.
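The recursions of Proposition 3.1 are easy to verify numerically. In the sketch below (function names ours), `up1` computes the type-(1) coefficients of a subproblem directly, and `up1_fix` applies the recursion; both are compared on a small hypothetical symmetric Q:

```python
def up1(Q, U, F):
    """Type-(1) UP coefficients v0, {v_j} for the subproblem whose
    variables in U are fixed to 1 and whose free set is F."""
    v0 = sum(Q[i][j] for i in U for j in U)
    v = {j: 2 * sum(Q[i][j] for i in U) + sum(Q[i][j] for i in F)
         for j in F}
    return v0, v

def up1_fix(Q, U, F, r, value):
    """Recursion of Proposition 3.1: update the UP when the free
    variable r is fixed to 0 or 1."""
    v0, v = up1(Q, U, F)
    rest = [j for j in F if j != r]
    if value == 0:
        return v0, {j: v[j] - Q[r][j] for j in rest}
    return (v0 + 2 * sum(Q[i][r] for i in U) + Q[r][r],
            {j: v[j] + Q[r][j] for j in rest})

Q = [[1, 3, 1, 0],
     [3, 2, 2, 1],
     [1, 2, 0, 4],
     [0, 1, 4, 5]]
U, F, r = [0], [1, 2, 3], 2
assert up1_fix(Q, U, F, r, 0) == up1(Q, U, [1, 3])
assert up1_fix(Q, U, F, r, 1) == up1(Q, U + [r], [1, 3])
print("Proposition 3.1 recursions agree with direct recomputation")
```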
3.6. Penalties. Let u, u⁰ and u¹ denote upper bounds for the subproblems (P), (P⁰), (P¹) respectively, computed according to one of the methods of Section 2.4. Down and up penalties, i.e., underestimates of the decreases u − u⁰ and u − u¹, can be exploited, in the usual way, for fathoming (P) by bound, for forcing variables or, as an alternative to the criterion in Section 3.3, for selecting the branching variable.

The following proposition provides an inexpensive procedure for evaluating down and up penalties when the upper bounds are obtained by solving continuous linear relaxations. Such penalties are valid for an arbitrary upper plane, provided that it is adaptive (according to the definition in Section 3.2). Let v₀ + Σ_{j∈F} v_j x_j be such an UP for (P). Assuming the free variables numbered according to non-increasing ratios v_j/a_j, let x_t be the last positive variable in the optimal solution x̂ of the corresponding continuous linear relaxation of (P).

Proposition 3.2.
(a) If x̂_r = 1, then v_r − (a_r/a_t) v_t is a down penalty for x_r.
(b) If x̂_r = 0, then (a_r/a_t) v_t − v_r is an up penalty for x_r.
(c) If 0 < x̂_r < 1 (i.e., t = r), then

x̂_t (v_t − (a_t/a_{t+1}) v_{t+1})  and  (1 − x̂_t) ((a_t/a_{t−1}) v_{t−1} − v_t)

are down and up penalties for x_r, respectively.

Let us prove, for example, (a). If x̂_r = 1, since the upper plane is adaptive we have

u⁰ ≤ max{v₀ + Σ_{j∈F} v_j x_j : Σ_{j∈F} a_j x_j ≤ b̃; 0 ≤ x_j ≤ 1, ∀j ∈ F; x_r = 0}.  (3.1)

On the other hand,

u = v_r + max{v₀ + Σ_{j∈F−{r}} v_j x_j : Σ_{j∈F−{r}} a_j x_j ≤ b̃ − a_r; 0 ≤ x_j ≤ 1, ∀j ∈ F − {r}}
  ≥ v_r − (v_t/a_t) a_r + max{v₀ + Σ_{j∈F−{r}} v_j x_j : Σ_{j∈F−{r}} a_j x_j ≤ b̃; 0 ≤ x_j ≤ 1, ∀j ∈ F − {r}}

(the increase of the capacity from b̃ − a_r to b̃ raises the maximum by at most (v_t/a_t) a_r, since the extra capacity is filled at a marginal density not exceeding v_t/a_t); that is,

u ≥ v_r − (v_t/a_t) a_r + max{v₀ + Σ_{j∈F} v_j x_j : Σ_{j∈F} a_j x_j ≤ b̃; 0 ≤ x_j ≤ 1, ∀j ∈ F; x_r = 0}.

Comparing the last inequality with (3.1), one gets

u − u⁰ ≥ v_r − (a_r/a_t) v_t,
which proves (a). Statements (b) and (c) can be proved in a similar way. 3.7. We now give a general description of a class of branch-and-bound methods for QK, involving the procedures discussed in this section. Two algorithms in the class may differ in one or more of the following aspects: - - t h e type of upper plane ((1), (2), (3) or (4) of Section 2.2) generated at every node during the first stage (see Section 2.1). the kind (discrete or continuous) of linear relaxation solved at every node during the second stage. the presence or absence of routines for constraint pairing and for evaluating and using penalties. Each algorithm performs a standard search over a binary tree, whose nodes correspond to subproblems obtained by restricting some variables to the values 0 or 1. The two successors of each node are generated by fixing some free variable to 0 or to 1, respectively. In exploring the tree, a LIFO priority rule is adopted. At each node the following steps are performed:
Step 1: Step 2:
A linear relaxation for the corresponding subproblem is generated. An attempt is made in order either to detect infeasibility or to force a single variable by the constraint pairing techniques mentioned in Section 3.4. If the problem is feasible and no variable can be forced, Step 3: The relaxed subproblem is solved and the bounds on the quadratic optimum are updated. Step 4: Every time the lower bound is improved, use is made of the routine described in Section 2.5 in an attempt to further improving it. Step 5: If the subproblem cannot be fathomed by bound or by optimality, Step 6: Up and down penalties for all the free variables are computed, by means of which the algorithm again tries either to fathom the subproblem by bound or to force as many variables as possible. In case of failure, Step 7: A branching is made on the variable with the highest up or down penalty. Step 8: (Only if Steps 6 and 7 are not executed) A branching is made on the variable selected according to the criterion of Section 3. Step 9: Whenever a subproblem is fathomed (by infeasibility, bound or
optimality), backtracking to the predecessor occurs and the other branch emanating from it is explored, unless this has already been done.

Steps 2, 6 and 7 are optional.
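The scheme above can be condensed into a short sketch. This is our illustration, not the authors' FORTRAN implementation: it uses the type-(1) upper plane, a continuous relaxation for bounding, the branching rule of Section 3.3 (with x_r = 0 explored first under the LIFO rule), and omits the optional Steps 2, 6 and 7:

```python
def bb_qk(Q, a, b):
    """Depth-first branch-and-bound for max x^T Q x, ax <= b, x in {0,1}^n."""
    n = len(a)

    def f(x):
        return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

    def bound(U, F, cap):
        # type-(1) upper plane of the subproblem ...
        v0 = sum(Q[i][j] for i in U for j in U)
        v = {j: 2 * sum(Q[i][j] for i in U) + sum(Q[i][j] for i in F)
             for j in F}
        # ... relaxed over the continuous knapsack (Section 2.2)
        ub, c = float(v0), float(cap)
        for j in sorted(F, key=lambda j: v[j] / a[j], reverse=True):
            if c <= 0:
                break
            take = min(1.0, c / a[j])
            ub += take * v[j]
            c -= take * a[j]
        return ub, v

    best_val, best_x = 0, [0] * n
    stack = [([], list(range(n)), b)]          # (U, F, residual capacity)
    while stack:
        U, F, cap = stack.pop()
        x = [1 if j in U else 0 for j in range(n)]
        if f(x) > best_val:                    # free variables at 0 are feasible
            best_val, best_x = f(x), x
        if not F:
            continue
        ub, v = bound(U, F, cap)
        if ub <= best_val:
            continue                           # fathomed by bound (Step 5)
        r = max(F, key=lambda j: v[j] / a[j])  # branching rule of Section 3.3
        rest = [j for j in F if j != r]
        if a[r] <= cap:
            stack.append((U + [r], rest, cap - a[r]))   # x_r = 1
        stack.append((U, rest, cap))                    # x_r = 0, popped first
    return best_val, best_x

# hypothetical instance
Q = [[0, 3, 1, 0], [3, 0, 2, 1], [1, 2, 0, 4], [0, 1, 4, 0]]
a, b = [3, 2, 4, 1], 7
print(bb_qk(Q, a, b))   # (14, [0, 1, 1, 1])
```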
4. Computational experience

4.1. Numerical experimentation has been performed on ten different variants of the general branch-and-bound method sketched at the end of the last section. To this end, a FORTRAN code for this class of algorithms has been implemented on an IBM 370/168. For reference, each variant is identified by a label, in which the first character is B or C, according to whether discrete or continuous linear relaxations are solved during the second stage at each node; the second character is 1, 2, 3 or 4, according to the type of upper plane generated during the first stage; one or more optional characters may follow, with the following meaning:
F: a routine for forcing variables, based on constraint pairing techniques, is included;
Fk: as above, but only k surrogate constraints are generated rather than the n + 2 suggested in [6];
P: a routine for evaluating and using penalties is included.

4.2. The numerical experiments were performed on randomly generated test problems, ranging from 10 to 70 variables, in which the elements of Q were integers uniformly distributed between 0 and 100, the knapsack coefficients a_j were integers uniformly distributed between 1 and 50, and the b's were integers uniformly distributed between 1 and Σ_{j=1}^{n} a_j. Unless otherwise specified, Q was taken full (100% dense). For each problem size n = 10, 20, ..., ten test problems were run.

4.3. Let UB and LB denote the upper and lower bounds for the initial subproblem, computed as in Sections 2.1 and 2.2, and let OPT denote the quadratic optimum. Table 1 gives, for the variants B1, B2, B3, B4, C3, the average "Av", the standard deviation "Sd" and the maximum "Max" (estimated, for each size n, from a sample of ten test problems) of (UB − LB)/LB % and of (OPT − LB)/LB %. The following remarks are in order:
- The upper bounds given by B4 and B3 are remarkably better than those provided by B2 and B1.
- The lower bounds are always much closer to the quadratic optimum than the upper bounds.
- The relative error of the bounds with respect to the optimum appears to remain of the same order of magnitude as the size n of the problem increases.
Table 1

                 (UB − LB)/LB %            (OPT − LB)/LB %
Version    n     Av       Sd      Max      Av      Sd      Max
B1        10    75.42    82.77  293.48    8.52   16.16   51.55
          20   166.02   229.03  835.05    1.03    2.45    8.30
          30    74.01    92.09  307.47    0.63    1.69    5.68
B2        10    25.73    14.30   42.24    3.80    5.73   15.09
          20    39.61    22.97   76.46    3.82    8.28   28.43
          30    31.02    28.44   97.39    0.41    0.86    2.69
          40    32.96    22.30   69.74    1.76    2.38    7.24
B3        10    17.89    19.80   73.31    6.40   17.92   50.28
          20    12.62     6.08   20.32    0.56    0.70    1.24
          30    10.50     4.68   14.73    0.15    0.29    0.79
          40    13.46     5.31   20.54    0.20    0.37    1.14
          50    10.88     5.77   18.91    0.08    0.10    0.22
B4        10    11.40     6.13   21.55    1.43    3.16   10.01
          20    10.83     5.60   17.87    0.55    0.70    1.76
          30     9.66     4.17   14.24    0.14    0.28    0.78
          40    10.87     6.66   16.68    0.49    0.35    0.75
C3        10    35.47    31.79  111.33   10.34   15.22   51.56
          20    21.46    14.45   52.05    5.51    8.30   26.83
          30    17.86    11.98   49.27    3.90    6.32   22.45
          40    16.55     6.68   26.89    1.31    1.20    4.43
          50    16.49     7.46   30.21    2.75    3.33   10.74
The accuracy of the lower bounds can be further improved, as Table 2 shows, when use is made of the routine sketched in Section 2.5. The column "Score" shows in how many cases out of ten the initial lower bound was found to be equal to the true quadratic optimum.

Table 2

                 (OPT − LB)/LB %
Version    n     Av      Sd      Max    Score
B3        10    4.39   12.64   42.29      8
          20    0.23    0.46    1.24      8
          30    0.08    0.24    0.79      9
          40    0.19    0.36    1.14      7
          50    0.04    0.09    0.22      8
C3        10    3.17    6.35   16.02      8
          20    0.78    2.00    6.70      8
          30    0.29    0.45    1.07      7
          40    0.47    0.47    1.38      3
          50    0.90    1.82    4.97      8
          60    0.34    0.87    2.94      6
Table 3

Version   n = 10     20       30       40       50       60       70
B1F        0.09     1.65    21.28
B2         0.08     0.85     8.78
B2F        0.04     0.40     3.93    37.34
B3         0.09     0.49     5.89    43.99
B3F        0.056    0.23     2.56    14.45    78.83
C3         0.036    0.19     1.54     9.46    27.93    68.92   318.714
C3F        0.033    0.17     1.17     7.88    29.22    78.36
C3F3                         1.53     9.31
C3P        0.04     0.21     1.13     8.24    27.78
C3F1P      0.042    0.25     1.23     9.18    30.30
Notice that, with this improvement, the lower bounds given by C3 become comparable with those given by B3.

In Tables 3 and 4, the performance of the above mentioned ten variants⁴ of the algorithm is compared in terms of average running time per problem (in CPU seconds on the IBM 370/168) and in terms of average number of linear relaxations generated and solved during the branch-and-bound search. The averages have been estimated, as usual, from a sample of ten test problems for each size n. It should be remarked that, since forcing a variable x_r to the value α (α = 0 or 1) is always followed by a forced branching along the opposite direction x_r = 1 − α (which implies the generation of a new relaxation and thus delays a possible fathoming by bound), the number of linear relaxations generated is usually higher in those variants which include forcing routines. The following remarks can be made:
- The algorithms' performance is heavily conditioned by the particular type of UP chosen.
- The C3 variants unquestionably outperform the other ones. None of the variants C3, C3F and C3F3 seems to be definitely dominant over the other ones in terms of running time.
- The forcing routine tends to increase the number of linear relaxations generated and to decrease the number of those solved. While it is very effective in B type variants (with savings of 50% and more), it loses its effectiveness in C type variants, where the times for generating a linear relaxation are roughly comparable with those needed for solving it.
- Use of penalties appreciably reduces the number of nodes explored, but obviously increases the amount of work per node. In the range of problem

⁴ Because of its expensiveness, variant B4 was only used to solve the initial problem.
Table 4. Average number of linear relaxations generated / solved.

Version       10            20           30             40              50              60
B1F       65.4/12.7    457.9/77.7
B2        18.8/18.8     63.4/62.9    227.4/227
B2F       29.4/7       108.2/17.5    453.4/66.7
B3        23.2/20.4     38.6/32.3    147.6/137      533.2/526.8
B3F       10.2/2.8      45.6/6.4     215.2/34.5     991.2/126.1    3235.4/361.8
C3        12.12/12.12   58/53.8      259/251.8     1138.6/1130.6   2094.4/2079.8   3476.4/3449.1
C3F       10.44/2.82    50.6/7.6     231/38.9      1257.7/181.4    3329/321.3      5556.8/530.9
C3F3                                 239.5/56.6    1121.5/284.4    2868/426        2480/322
C3P        7.4/7.3      36/36        110.2/110.2    587.4/587.4    1397/1396.9
C3F1P      8.2/6.4      51.6/29.1    136.6/87.9     785.8/453.4    2094.9/885.5   15611/1787.8
sizes observed, there seems to be no particular advantage in using them. The variant in which they are used together with constraint pairing is not recommended. Tables 5 and 6 show, for some variants, the effect of decreasing the density of Table 5
Table 5

Version B3, (UB − LB)/LB % (Av):
Density   n = 10     20      30      40      50
100%       17.89   12.62   10.50   13.46   10.88
50%        27.88   39.63   27.92   37.50   28.97
25%        38.27   52.12   44.28   31.44
5%          9.63   21.39   34.77   39.90

Version B3, (OPT − LB)/LB % (Av):
100%        6.40    0.23    0.08    0.20    0.04
50%         0       1.13    1.53    1.05    3.43
25%         0       1.29    5.54    0.22
5%          0       2.92    4.34    4.87

Version C3, (UB − LB)/LB % (Av):
100%       35.47   21.46   17.86   16.55   16.49
50%        38.63   43.62   28.37   39.57   28.06
25%        70.60   53.62   40.53   53.64
5%         16.55   23.36   36.44   36.49

Version C3, (OPT − LB)/LB % (Av):
100%       10.34    5.51    3.90    1.31    2.75
50%         1.12    2.02    1.86    1.65    2.49
25%         7.95    0.58    3.18    3.72
5%                  2.05    4.53    3.01
Table 6. Average running time per problem (CPU seconds).

Version B3F:
Density   n = 10     20      30      40
100%       0.056    0.23    2.56   14.45
50%        0.047    0.65    6.38   88.50
25%        0.040    0.54    4.53   74.55
5%         0.026    0.11    0.70    5.40

Version C3:
100%       0.036    0.20    1.54    9.47
50%        0.039    0.55    6.81   71.71
25%        0.037    0.52    3.99   90.42
5%         0.021    0.12    0.79    6.61

Version C3F:
100%       0.033    0.18    1.17    7.88
50%        0.034    0.40    3.93   43.02
25%        0.029    0.30    2.04   50.92
5%         0.018    0.088   0.45    3.28
Q on the tightness of the bounds (Table 5) and on the average running time per problem (Table 6). All variants are seen to behave better when the matrix is very dense (100%) or very sparse (5%). Fortunately this is actually the case in most applications.
5. Some theoretical properties of upper planes

5.1. Let S be any subset of Rⁿ and f an arbitrary real function whose domain contains S and which has a maximum in S. For convenience, we shall often denote a point V ≡ (v₀, v₁, ..., v_n) in R^{n+1} as (v₀, v). Define

S* ≡ {V ∈ R^{n+1}: v₀ + vx ≥ f(x) for all x ∈ S};

by convention, ∅* = R^{n+1}. The following properties hold:

(a) S ⊇ T ⇒ S* ⊆ T*;
(b) (S ∪ T)* = S* ∩ T*;
(c) (S ∩ T)* ⊇ S* ∪ T*;
(d) S* is non-empty, convex and unbounded;
(e) if S is finite, S* is a convex polyhedron;
(f) if f is convex and S is a convex polytope (i.e., a bounded convex polyhedron), then S* = E(S)*, where E(S) is the set of the vertices of S;
(g) if f is convex and S is a convex polytope, S* is a convex polyhedron.

The proof of properties (a) through (e) is straightforward⁵; (g) follows at once from (e) and (f). Let us prove (f). By property (a), E(S)* ⊇ S*. On the other hand, let V ≡ (v₀, v) ∈ E(S)*. Any x ∈ S is a convex combination λ₁z₁ + ⋯ + λ_s z_s of the vertices of S. Hence

v₀ + vx = v₀ Σ_{i=1}^{s} λ_i + v Σ_{i=1}^{s} λ_i z_i = Σ_{i=1}^{s} λ_i (v₀ + v z_i) ≥ Σ_{i=1}^{s} λ_i f(z_i) ≥ f(x),

the last inequality following from the convexity of f. Thus V ∈ S* and E(S)* = S*.
Thus V E S* and E ( S ) * = S*. 5.2. For simplicity, we shall assume n o w S to be a c o m p a c t set. For any V E S*, maxxas(v0+ vx) does exist and is an upper bound for z = m a x x ~ s f ( X ) . On the other hand, there is at least one I) E S* for which such an u p p e r bound actually equals z (it suffices to take I7"= (z, 0 . . . . . 0)9. H e n c e we have
Proposition 5.1. min max(v0 + vx) = max f ( x ) . VES* xES
x~S
This equality suggests that the m a x i m u m of f in S could be found, in principle, by minimizing o v e r S* the convex function h ( V ) = maxx~s(V0 + vx). Notice that, if h ( V ) = vo+v$, then [1 " ~] is a subgradient of h in V, since, for any V ' E S * , we have h ( V ' ) > v ~ + v ' ~ = h ( V ) + [ 1 i ~ ] ( V ' - V).
Corollary 5.2. I f h(17) = minv~s, h( V) and [ ( s = maxx~s/(x), then Vo + ~x attains its m a x i m u m over S in s Proof. for all x ~ S, ~50+ v.f -< h ( f ' ) = f(.f) ~ z5o + ~ . 5.3. Remark. It is worth noting that, if we restrict ourselves to homogeneous U P s vx, it is still true that m i n ( m a x vx: v ~ R", vx >- f ( x ) u k xES
~ S } >- m a x f ( x ) , J
(5.1)
xES
but equality does not h a v e to hold, as the following example shows: 5 Properties (a), (b), (c) are satisfied, more generally, by any Galois connexion between the subsets of R" and the subsets of R"+L
Example. f(x) = 18x1x2 + 2x1x4 + 2x1x5 + 2x2x3 + 2x2x6 + 10x3x4 + 2x3x6 + 2x4x5 + 10x5x6,

S = {x ∈ B^6 : 5x1 + 4x2 + 4x3 + 2x4 + 3x5 + 2x6 ≤ 10}.

The maximum of f(x) in S is 18. Equality in (5.1) would imply the existence of a vector v such that

18 ≤ v1 + v2 ≤ 18,  10 ≤ v3 + v4 ≤ 18,  10 ≤ v5 + v6 ≤ 18,
6 ≤ v1 + v4 + v5 ≤ 18,  6 ≤ v2 + v3 + v6 ≤ 18,

since the points (1, 1, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0), (0, 0, 0, 0, 1, 1), (1, 0, 0, 1, 1, 0), (0, 1, 1, 0, 0, 1) belong to S. But the above system of inequalities is clearly inconsistent: adding the last two inequalities gives (v1 + v2) + (v3 + v4) + (v5 + v6) ≤ 36, while the first three give (v1 + v2) + (v3 + v4) + (v5 + v6) ≥ 38.
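The value max f = 18 and the membership of the five points in S can be verified by enumeration over B^6; a minimal sketch:

```python
from itertools import product

# Objective and knapsack constraint of the example above (B^6 = {0,1}^6).
def f(x):
    x1, x2, x3, x4, x5, x6 = x
    return (18*x1*x2 + 2*x1*x4 + 2*x1*x5 + 2*x2*x3 + 2*x2*x6
            + 10*x3*x4 + 2*x3*x6 + 2*x4*x5 + 10*x5*x6)

S = [x for x in product((0, 1), repeat=6)
     if 5*x[0] + 4*x[1] + 4*x[2] + 2*x[3] + 3*x[4] + 2*x[5] <= 10]

assert max(f(x) for x in S) == 18
# The five points used in the inconsistency argument all belong to S.
points = [(1, 1, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0), (0, 0, 0, 0, 1, 1),
          (1, 0, 0, 1, 1, 0), (0, 1, 1, 0, 0, 1)]
assert all(p in S for p in points)
```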
5.4. Let us now come back to the particular case in which f(x) = x^T Q x and S = X = {x ∈ B^n : ax ≤ b}. Since X is finite, X* is a convex polyhedron by the above property (e). An explicit description of X* as an intersection of half-spaces is in practice very difficult to obtain, because of the usually large number of points in X. However, property (a) above suggests the possibility of getting more tractable subsets of X* from suitable extensions of X. Let us consider, for instance, the extension X̂ = {x ∈ R^n : x ≥ 0, ax ≤ b} ⊇ X. Define p as the vector whose jth component is pj = (b/aj) qjj.
Theorem 5.3. If Q is positive semidefinite, px is an UP for x^T Q x in X̂. Moreover,

h(0, p) = min_{(v0,v)∈X̂*} h(v0, v).

Proof. Since x^T Q x is convex when Q is positive semidefinite, and X̂ is a polytope whose vertices are 0 and (b/aj) uj, j = 1, …, n (uj denoting the jth unit vector), property (f) gives

X̂* = E(X̂)* = { (v0, v) : v0 ≥ 0, vj + (aj/b) v0 ≥ (b/aj) qjj for j = 1, …, n }.

From the last expression of X̂* it is apparent that the point (0, p) belongs to
X̂*. On the other hand, for all (v0, v) ∈ X̂* and x ∈ X̂, one has

v0 + Σ_{j=1}^n vj xj ≥ v0 + Σ_{j=1}^n ((b/aj) qjj − (aj/b) v0) xj = v0 (1 − ax/b) + px ≥ px,

since v0 ≥ 0 and ax ≤ b; hence h(v0, v) ≥ h(0, p).
This proves the second half of the theorem.

The upper plane px has been tested on randomly generated test problems ranging from 10 to 80 variables. The results, except for some very tight problems (i.e., with b ≤ 0.2 Σ_{j=1}^n aj), showed a poor behaviour, due perhaps to the fact that in most observed cases there were a few pj's taking large values, which sharply pushed up the optimum value of the linear relaxation, even though the vast majority of the pj's were low. A slight improvement was obtained by replacing the above considered X̂ with X̂1 = {x ∈ R^n : x ≥ 0, ax ≤ b, xr ≤ 1}, where pr = max_{1≤j≤n} pj. It is hoped that tighter extensions of X could lead to significant improvements.
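Theorem 5.3 is easy to replay on a toy instance. In the sketch below the data (M, a, b) are illustrative assumptions; Q = MᵀM is positive semidefinite, pj = (b/aj)qjj is computed as in the theorem, and px ≥ xᵀQx is verified on X = {x ∈ Bⁿ : ax ≤ b}:

```python
from itertools import product
from fractions import Fraction as F

# Toy data (assumed): Q = M^T M is positive semidefinite; knapsack ax <= b.
M = [[1, 2, 0], [0, 1, 1]]
n = 3
Q = [[sum(F(M[k][i]) * M[k][j] for k in range(len(M))) for j in range(n)]
     for i in range(n)]
a, b = (F(3), F(2), F(2)), F(5)

# p_j = (b / a_j) * q_jj  (the upper plane of Theorem 5.3).
p = [b / a[j] * Q[j][j] for j in range(n)]

def xQx(x):
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

X = [x for x in product((0, 1), repeat=n)
     if sum(a[j] * x[j] for j in range(n)) <= b]
# px is an upper plane for x^T Q x on X (indeed on the whole simplex X^).
assert all(sum(p[j] * x[j] for j in range(n)) >= xQx(x) for x in X)
```

Exact rational arithmetic (fractions) avoids any floating-point tolerance in the comparison.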
Acknowledgement

Partial support from the Istituto per le Applicazioni del Calcolo (Rome) and the Centro Nazionale Universitario di Calcolo Elettronico (Pisa) of the Consiglio Nazionale delle Ricerche, as well as from the National Research Council of Canada (Grant A8552) and a NATO Systems Science Exchange Grant, is gratefully acknowledged.
References

[1] M.L. Balinski, "Fixed cost transportation problems", Naval Research Logistics Quarterly 11 (1961) 41-54.
[2] M.L. Balinski, "On a selection problem", Management Science 17 (3) (1970) 230-231.
[3] E.M.L. Beale, "An algorithm for solving the transportation problem when the shipping cost over each route is convex", Naval Research Logistics Quarterly 6 (1959) 43-56.
[4] V.A. Cabot and R. Francis, "Solving certain non-convex quadratic minimization problems by ranking the extreme points", Operations Research 18 (1970) 82-86.
[5] P.L. Hammer and P. Hansen, "Quadratic 0-1 programming", Discussion Paper No. 7129, CORE (Heverlee, 1972).
[6] P.L. Hammer, M. Padberg and U.N. Peled, "Constraint pairing in integer programming", INFOR, Canadian Journal of Operational Research and Information Processing 13 (1975) 68-81.
[7] R.M. Karp, "Reducibility among combinatorial problems", Technical Report 3, Computer Science, University of California (Berkeley, CA, 1972).
[8] A. Land, personal communication (1975).
[9] D.J. Laughhunn, "Quadratic binary programming with applications to capital budgeting problems", Operations Research 18 (1970) 454-461.
[10] C.C. Petersen, "A capital budgeting heuristic algorithm using exchange operations", AIIE Transactions 6 (1974) 143-150.
[11] J. Rhys, "A selection problem of shared fixed costs and network flows", Management Science 17 (3) (1970) 200-207.
[12] H. Salkin and C.A. de Kluyver, "The knapsack problem: a survey", Naval Research Logistics Quarterly 22 (1) (1975) 127-144.
[13] H.A. Taha, "Concave minimization over a convex polyhedron", Naval Research Logistics Quarterly 20 (1973) 533-548.
[14] C. Witzgall, "Mathematical methods of site selection for Electronic Message Systems (EMS)", NBS Internal Report (1975).
Mathematical Programming Study 12 (1980) 150-162. North-Holland Publishing Company
FRACTIONAL VERTICES, CUTS AND FACETS OF THE SIMPLE PLANT LOCATION PROBLEM Monique GUIGNARD University of Pennsylvania, Philadelphia, PA, U.S.A.
Received 22 May 1978 Revised manuscript received 3 January 1979
This paper investigates the structure of the integer programming polytope of an uncapacitated (simple) plant location problem. One can describe families of fractional vertices and derive from them valid inequalities for the integer problem. Some of these will actually be shown to be facets of the integer polytope. Also, some families of SPLP with large duality gaps will be described, together with facets which bridge these gaps. Much of the motivation stems from algorithmic work in which the exploitation of "good" cutting planes within a direct dual algorithm has been shown to be of crucial importance.

Key words: Plant Location Problem, Cutting Planes, Facets, Convex Polytopes, Duality Gap.
0. Introduction

This paper investigates the structure of the LP polytope of an uncapacitated location problem:

SPLP(m, n):  min_{x,y} Σ_{i∈I} Σ_{j∈J} cij xij + Σ_{i∈I} fi yi

subject to

∀j ∈ J: Σ_{i∈I} xij = 1,
∀i ∈ I, ∀j ∈ J: xij − yi ≤ 0,
xij ≥ 0, 0 ≤ yi ≤ 1, yi integer,

where I is a set of m possible plant locations, J a set of n destination points, xij the ratio of the requirement of j ∈ J satisfied from location i ∈ I, and yi, a 0-1 variable, is 1 if and only if location i ∈ I is open. cij is a normalized cost of shipment from i to j, and fi is a fixed charge associated with opening plant (or location) i ∈ I. We assume n ≥ m ≥ 3.

We will first construct families of fractional vertices of the LP polytope, and generate from them valid cuts. We will then prove that some of these cuts are facets.
Monique Guignard/ Plant location problems
151
In a second part, we will study families of SPLP with large duality gaps, and show that for these one specific facet bridges the gap. Fractional vertices have been characterized in [2, Theorem 8], where certain cuts are proposed without proof. In this paper we construct similar inequalities and derive from them cuts which are shown to be facets of the integer polyhedron. We have shown elsewhere [4] how such cuts can be (and indeed must be) exploited via their associated dual variables within a direct dual algorithm (for the more difficult capacitated plant location problem and for pure integer cuts), and expect that a similar approach may be even more effective when suitable facets are constructed and utilized during the calculations.
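For very small instances, SPLP(m, n) can be solved exactly by enumerating the open-plant sets: once y is fixed, each destination simply takes the cheapest open plant. A minimal sketch (the instance data are illustrative assumptions):

```python
from itertools import product

def solve_splp(c, f):
    """Brute-force SPLP: c[i][j] = shipment cost, f[i] = fixed charge.
    Enumerate y in {0,1}^m (at least one plant open); each destination j
    is then assigned to the cheapest open plant."""
    m, n = len(f), len(c[0])
    best = None
    for y in product((0, 1), repeat=m):
        if not any(y):
            continue
        cost = sum(f[i] for i in range(m) if y[i])
        cost += sum(min(c[i][j] for i in range(m) if y[i]) for j in range(n))
        if best is None or cost < best[0]:
            best = (cost, y)
    return best

# Illustrative 3-plant, 4-destination instance.
c = [[1, 4, 4, 4],
     [4, 1, 1, 4],
     [4, 4, 4, 1]]
f = [2, 2, 2]
cost, y = solve_splp(c, f)
```

This enumerative viewpoint is only a reference point; the point of the paper is the polyhedral structure of the LP relaxation, whose fractional vertices are constructed next.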
1. Some fractional vertices of SPLP(m, n)

Define ȳi = 1 − yi and sij = 1 − xij − ȳi, ∀i ∈ I, ∀j ∈ J. SPLP(m, n) can be rewritten as a set partitioning problem:

Σ_{i∈I} fi + min_{x,ȳ} [ Σ_{i∈I} Σ_{j∈J} cij xij − Σ_{i∈I} fi ȳi ]

subject to

∀j ∈ J: Σ_{i∈I} xij = 1,  (~j)
∀i ∈ I, ∀j ∈ J: xij + ȳi + sij = 1,  (*ij)
xij, ȳi, sij ≥ 0, ȳi integer.

LP(m, n) will denote the LP relaxation of SPLP(m, n).

Let σ be a surjection from J onto Ip = I − Īp, where |Īp| = p ≥ 0 and m > p + 2, and let Mi = {j ∈ J | σ(j) = i}, i ∈ Ip, and M̄i = J − Mi. Let Jq ⊆ J be such that the restriction of σ to Jq, σ|Jq, is still a surjection onto Ip, and let J̄q = J − Jq, where q = |J̄q| ≥ 0.

We will need the following results in the generation of the valid inequalities.

Lemma 1.1. ∀i ≠ i′, i, i′ ∈ Ip: Mi ∩ Jq is strictly contained in M̄i′ ∩ Jq.
Proof. Let i, i′ ∈ Ip, i ≠ i′.

(1) Mi ∩ Jq ≠ M̄i′ ∩ Jq: choose i″ ∈ Ip, i″ ≠ i, i′ (i″ exists as |Ip| > 2). Since σ is a function (σ(j) is unique for a given j), Mi″ ∩ Jq ⊆ M̄i ∩ M̄i′ ∩ Jq. If Mi ∩ Jq = M̄i′ ∩ Jq, then Mi″ ∩ Jq ⊆ (Mi ∩ Jq) ∩ (M̄i ∩ Jq) = ∅, which contradicts the fact that σ|Jq is a surjection.

(2) Mi ∩ Jq ⊆ M̄i′ ∩ Jq: ∀j ∈ Mi ∩ Jq, σ(j) = i ≠ i′, so that j ∈ M̄i′ ∩ Jq.
Corollary 1.2. Let nj be a positive integer, ∀j ∈ Jq. Then ∀i ≠ i′, i, i′ ∈ Ip,

Σ_{j∈Mi∩Jq} nj < Σ_{j∈M̄i′∩Jq} nj.
Definition. Let K ⊆ Jq with |K| = m − p and σ(K) = Ip, and let K̄ = J − K. Let τ be a point-to-set mapping from K̄ into Ip such that

τ(j) ∩ {σ(j)} = ∅, ∀j ∈ K̄,  and  |τ(j)| = m − p − 2.

Let X = (x, ȳ, s). Then X(σ, K, τ, p) will be defined as follows:

∀i ∈ Īp: ∀j ∈ J, xij = 0, and ∃j = j(i): sij(i) = 0.
∀i ∈ Ip: ∀j ∈ Mi, xij = 0; ∀j ∈ M̄i ∩ K, sij = 0.
∀j ∈ K̄: sij = 0 for every i ∈ τ(j).
Proposition 1.3. X(σ, K, τ, p) is a vertex of LP(m, n).

Proof. One can easily compute the remaining components of X(σ, K, τ, p):

xij = 1/(m − p − 1) for all the components not already fixed at 0,

and

ȳi = (m − p − 2)/(m − p − 1), i.e. yi = 1/(m − p − 1), ∀i ∈ Ip,
ȳi = 1, i.e. yi = 0, ∀i ∈ Īp,

the sij being determined by sij = 1 − xij − ȳi. By [2, Theorem 8], the non-integer solution X(σ, K, τ, p) is a vertex of LP(m, n).
Example. Let m = 5, n = 7, p = 1, Ip = {1, 2, 3, 4}, J = {1, 2, 3, 4, 5, 6, 7}, τ(5) = {1, 2}, τ(6) = {2, 3}, τ(7) = {1, 4}, M1 = {1}, M2 = {2}, M3 = {3, 7}, M4 = {4, 5, 6}, K = {1, 2, 3, 4}, |K| = 4, σ(K) = Ip = {1, 2, 3, 4}, j(5) = 1.
The known entries of X(σ, K, τ, p) are: xij = 0 for i = 5 ∈ Īp and all j; xij = 0 for i = σ(j) (i.e. x11, x22, x33, x44, x45, x46, x37); s51 = 0 (from j(5) = 1); sij = 0 for i ∈ Ip and j ∈ M̄i ∩ K; and s15 = s25 = 0, s26 = s36 = 0, s17 = s47 = 0 (sτ(j)j = 0 for j ∈ K̄ = {5, 6, 7}).

We must determine the other entries. Wherever sij = 0 and xij is not fixed at 0, the partitioning constraint xij + ȳi + sij = 1 gives xij = 1 − ȳi = yi. The demand constraints Σ_{i∈I} xij = 1 for j ∈ K then read

y2 + y3 + y4 = 1,  y1 + y3 + y4 = 1,  y1 + y2 + y4 = 1,  y1 + y2 + y3 = 1  ⟹  y1 = y2 = y3 = y4 = 1/3.

For j ∈ K̄ and i ∈ τ(j), xij = yi = 1/3; the demand constraint then fixes the remaining component of each column, e.g. for j = 5, x35 = 1 − x15 − x25 = 1/3, with s35 = y3 − x35 = 0. Altogether,

x =
  0    1/3  1/3  1/3  1/3  1/3  1/3
  1/3  0    1/3  1/3  1/3  1/3  1/3
  1/3  1/3  0    1/3  1/3  1/3  0
  1/3  1/3  1/3  0    0    0    1/3
  0    0    0    0    0    0    0

ȳ = (2/3, 2/3, 2/3, 2/3, 1),

s =
  1/3  0    0    0    0    0    0
  0    1/3  0    0    0    0    0
  0    0    1/3  0    0    0    1/3
  0    0    0    1/3  1/3  1/3  0
  0    0    0    0    0    0    0
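The fractional point X(σ, K, τ, p) of this example can be checked against the LP constraints directly. The sketch below hard-codes the x and y values obtained above and verifies Σi xij = 1, xij ≤ yi and the fractionality of the yi, i ∈ Ip:

```python
from fractions import Fraction as F

t = F(1, 3)
# x[i][j] for the m = 5, n = 7 example (rows = plants, cols = destinations).
x = [[0, t, t, t, t, t, t],
     [t, 0, t, t, t, t, t],
     [t, t, 0, t, t, t, 0],
     [t, t, t, 0, 0, 0, t],
     [0, 0, 0, 0, 0, 0, 0]]
y = [t, t, t, t, 0]

# Demand constraints: each column sums to 1.
assert all(sum(x[i][j] for i in range(5)) == 1 for j in range(7))
# Variable upper bounds: x_ij <= y_i.
assert all(x[i][j] <= y[i] for i in range(5) for j in range(7))
# The point is fractional: y_i = 1/3 for every row of Ip.
assert all(0 < y[i] < 1 for i in range(4))
```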
2. Valid inequalities associated with X(σ, K, τ, p)

Consider the inequalities

xij + ȳi ≤ 1,  ∀i ∈ Ip, ∀j ∈ M̄i ∩ Jq,  (*ij)

Σ_{i∈Ip} xij ≤ 1,  ∀j ∈ Jq,  (~j)

satisfied as equalities by X(σ, K, τ, p). Any nonnegative integer combination of these inequalities will be a valid inequality for SPLP(m, n). If the resulting left-hand side coefficients are all congruent to 0 modulo an integer and the right-hand side is not, one can divide all coefficients by that integer and round down the right-hand side; the result will still be a valid inequality.

Multiply every (*ij) inequality by a positive integer nj and every (~j) inequality by (a − nj), where a is an integer larger than every nj. In the composite inequality, ȳi will have, ∀i ∈ Ip, the coefficient Σ_{j∈M̄i∩Jq} nj, and xij will appear only if j ∈ M̄i, i.e. if i ≠ σ(j), with the coefficient nj + (a − nj) = a.

Proposition 2.1. If the system

Σ_{j∈M̄i∩Jq} nj = a,  ∀i ∈ Ip,

has a positive integer solution (nj, a), then

Σ_{j∈Jq} Σ_{i∈Ip, i≠σ(j)} xij + Σ_{i∈Ip} ȳi ≤ m + n − p − q − 2  (Fσ,p,q)

is a valid inequality for SPLP(m, n).

Proof. Assume that the positive integers (nj, a) satisfy

Σ_{j∈M̄i∩Jq} nj = a,  ∀i ∈ Ip.

Then the composite inequality will read

a [ Σ_{j∈Jq} Σ_{i∈Ip, i≠σ(j)} xij + Σ_{i∈Ip} ȳi ] ≤ a(m − p) + Σ_{j∈Jq} (a − nj) = a(m − p) + (n − q)a − Σ_{j∈Jq} nj.
We know that ∀i ∈ Ip, Mi ∩ Jq ≠ ∅ and

Σ_{j∈Mi∩Jq} nj < Σ_{j∈M̄i∩Jq} nj = a;

therefore

a < Σ_{j∈Jq} nj = Σ_{j∈Mi∩Jq} nj + Σ_{j∈M̄i∩Jq} nj < 2a,

and 1 < (1/a) Σ_{j∈Jq} nj < 2, i.e.

⌊ −(1/a) Σ_{j∈Jq} nj ⌋ = −2.

Dividing the composite inequality by a and rounding down the right-hand side, we obtain that

Σ_{j∈Jq} Σ_{i∈Ip, i≠σ(j)} xij + Σ_{i∈Ip} ȳi ≤ (m − p) + (n − q) − 2 = m + n − p − q − 2

is a valid inequality for SPLP(m, n).
Proposition 2.2. The system

Σ_{j∈M̄i∩Jq} nj = a,  ∀i ∈ Ip,

has a positive integer solution.

Proof.

a = Σ_{j∈Jq} nj − Σ_{j∈Mi∩Jq} nj

is a constant for all i ∈ Ip, so that the nj, j ∈ Jq, must be such that ∀i ≠ i′, i, i′ ∈ Ip:

Σ_{j∈Mi∩Jq} nj = Σ_{j∈Mi′∩Jq} nj.

Consider β = LCM{ |Mi ∩ Jq|, i ∈ Ip }. Then set nj = β / |Mσ(j) ∩ Jq|, j ∈ Jq; each nj is a positive integer. As the sets Mi ∩ Jq form a partition of Jq, all the nj are determined in this way, and

a = Σ_{j∈Jq} nj − Σ_{j∈Mi∩Jq} nj = β(m − p) − β = β(m − p − 1)

is a positive integer.
Corollary 2.3. Fσ,p,q is a valid inequality which eliminates X(σ, K, τ, p).
Example. Consider again the example of Section 1. Suppose q = 0. We must find positive integers a, nj such that

n2 + n3 + n4 + n5 + n6 + n7 − a = 0,  (1)
n1 + n3 + n4 + n5 + n6 + n7 − a = 0,  (2)
n1 + n2 + n4 + n5 + n6 − a = 0,  (3)
n1 + n2 + n3 + n7 − a = 0.  (4)
It is clear that n1 = n2 = n3 + n7 = n4 + n5 + n6. Thus

a = n1 + 2n3 + 2n7  (from (1) and (2))
  = 2n1 + n3 + n7  (from (3) and (4)),

and finally a = 3n1. Choose β = LCM{|Mi|} = LCM(1, 2, 3) = 6. Then:

j:   1      2      3      4      5      6      7
nj:  6/1=6  6/1=6  6/2=3  6/3=2  6/3=2  6/3=2  6/2=3
and a = 18. The valid inequality Fσ,p,q reads:

x12 + x13 + x14 + x15 + x16 + x17
+ x21 + x23 + x24 + x25 + x26 + x27
+ x31 + x32 + x34 + x35 + x36
+ x41 + x42 + x43 + x47
+ ȳ1 + ȳ2 + ȳ3 + ȳ4 ≤ 9.

For X(σ, K, τ, p) the left-hand side is 7 + 4 · (2/3) = 9 2/3, and Fσ,p,q is violated.
3. Some facets of SPLP(m, n)

Consider now p = 0, q = n − m and Jq = K. Fσ,p,q becomes:

Σ_{j∈K} Σ_{i≠σ(j)} xij ≤ Σ_{i∈I} yi + m − 2.  (Fσ,K)

Rather than being derived from the previously treated valid inequalities, these facets could also be generated more directly with nj = 1. However, the simple form of the valid inequalities, and their direct relationship with fractional vertices, may make them important in their own right, for example in a simplex-based algorithm.

Proposition 3.1. Fσ,K is a facet of SPLP(m, n).

Proof. We must show that Fσ,K contains mn + m − n linearly independent integer vertices of SPLP(m, n), and that there exists at least one integer vector satisfying Fσ,K as a strict inequality.
(a) (1) Let X^i be defined by xij = 1, ∀j ∈ J, yi = 1, yk = 0, ∀k ≠ i (all other components zero). X^i satisfies Fσ,K as an equality:

Σ_{j∈K} Σ_{l≠σ(j)} xlj = m − 1 = Σ_{l∈I} yl + m − 2.

There are m such vertices X^i.

(2) Consider j0 < j1 < j2 < ⋯ < jh, the elements of Mσ(j0) in increasing order (so that there is no k < j0 with σ(k) = σ(j0), and σ(j0) = σ(j1) = ⋯ = σ(jh)). Let i ≠ σ(j0). Then

X_i^{j0} will be defined by xij0 = 1, xik = 0, ∀k ≠ j0;
X_i^{j1} will be defined by xij0 = xij1 = 1, xik = 0, ∀k ≠ j0, j1;
⋮
X_i^{jh} will be defined by xij0 = xij1 = ⋯ = xijh = 1, xik = 0, ∀k ≠ j0, …, jh;

and for all X_i^{jl}, l = 0, …, h, all other shipments are made in row σ(j0): xσ(j0)j = 1, ∀j ≠ j0, …, jl, and yi = 1, yσ(j0) = 1 (σ(j0) ≠ i), all other yk = 0. There are n(m − 1) such X_i^j.

(3) We want to prove that {X^i}, i ∈ I, and {X_i^j}, j ∈ J, i ≠ σ(j), are linearly independent, by showing that

Σ_i αi X^i + Σ_j Σ_{i≠σ(j)} βij X_i^j = 0  ⟹  αi = 0, ∀i; βij = 0, ∀i, ∀j: i ≠ σ(j).

Let us form a matrix whose rows are indexed by y1, …, ym, x11, …, xmn, and whose columns are {X^i}, i ∈ I, and {X_i^j}, j ∈ J, i ≠ σ(j). The rows xij0(i), where j0(i) is the smallest element of Mi, have only one 1, in column X^i, so that αi = 0, ∀i. With the same notation as above, comparing the rows xij0, …, xijh (σ(jh) ≠ i) then yields βij0 = βij1 = ⋯ = βij(h−1) = 0. The only remaining columns are the X_i^{jh} with i ≠ σ(jh). It is easy to see that, ∀j0 < j1 < ⋯ < jh with σ(j0) = ⋯ = σ(jh) and ∀i, βijh = βi, say, from the xij rows, since they now appear independently in equal sums whose other terms are identical. Finally, by substituting into the yi rows, one must solve the system

(J − I)(β1, …, βm)ᵀ = 0,

where J is the m × m all-ones matrix and I the identity.
We know that this matrix is nonsingular, so that all βi must be zero. The m + n(m − 1) = mn + m − n vertices are linearly independent.

(b) The vector (x, y) defined by

xij = 1 if i = σ(j), xij = 0 otherwise, ∀j ∈ J;  yi = 1, ∀i,
satisfies Fσ,K with a strict inequality.

(c) Fσ,K is therefore a facet of SPLP(m, n).

Example. m = 3, n = 7, M1 = {1}, M2 = {2, 6, 7}, M3 = {3, 4, 5}, K = {1, 2, 3}. The facet is

x12 + x13 + x21 + x23 + x31 + x32 ≤ y1 + y2 + y3 + 1.
[In the original, a 0-1 table follows whose 17 = mn + m − n columns are the linearly independent vertices X^1, X^2, X^3 and X_i^j of Proposition 3.1, each column listing the components y1, y2, y3, x11, …, x37 of one vertex.]
Remarks. In the (x, y) space, by varying p, K, and σ, one obtains quite a number of vertices of the form X(σ, K, τ, p). For instance, for m = 3, n = 4, one has 36 fractional vertices of the form X(σ, K, τ, p) vs. 81 integer vertices. Similarly, one obtains (n choose m) · m! facets Fσ,K; for example, with m = 3 and n = 4, one has 24 facets. Notice that some fractional points are eliminated by several facets: one fractional solution, for instance, is eliminated by both

x11 + x31 + x22 + x32 + x13 + x23 ≤ y1 + y2 + y3 + 1,  and
x14 + x34 + x22 + x32 + x13 + x23 ≤ y1 + y2 + y3 + 1.

On the other hand, each facet eliminates m^{n−m} fractional vertices with p = 0, and many more with p > 0.
4. Duality gaps and facets

We shall now consider a special family of SPLP(m, n)'s, for which the structure of the cost matrix is such that the continuous and the integer optimal solutions are known explicitly and the duality gaps are very large. For these problems we shall show that a facet of the form Fσ,K bridges the duality gap.

Let σ be a surjection from J onto I. Let cσ(j)j be infinite, and let cij = c, ∀i ≠ σ(j). In other words, the shipment costs are either infinite or equal to a constant c. Let fi = L, ∀i ∈ I. We shall denote this problem by SPLP(σ, L).

Proposition 4.1. The relative duality gap for SPLP(σ, L) cannot exceed 1/2.

Proof. A continuous optimal solution will be determined by:

∀j ∈ J: xσ(j)j = 0, xij = 1/(m − 1), ∀i ∈ I, i ≠ σ(j),

and ∀i ∈ I: yi = 1/(m − 1).

The cost of this solution is

zC = m · L/(m − 1) + n(m − 1) · c/(m − 1) = mL/(m − 1) + nc.

An integer solution will correspond to two open plants (given that at least one route is prohibited from any single open plant by means of an infinite cost), and an optimal solution will, for instance, be defined by:

yi1 = yi2 = 1, i1, i2 ∈ I, say i1 < i2;
xi1j = 1, ∀j ∈ M̄i1;  xi2j = 1, ∀j ∈ Mi1;

with a total cost of zI = 2L + nc.
Then the relative duality gap α will be

α = (zI − zC)/zI = (2L + nc − (mL/(m − 1) + nc))/(2L + nc)
  = (2L − mL/(m − 1))/(2L + nc)
  = (2(m − 1)L − mL)/(2(m − 1)L + (m − 1)nc)
  = (m − 2)L/(2(m − 1)L + (m − 1)nc) < (m − 1)L/(2(m − 1)L) = 1/2.

In order to generate a problem with a given relative duality gap α, one must choose

(m − 2)L = 2α(m − 1)L + α(m − 1)nc,

or

mL − 2αmL + 2αL − 2L = α(m − 1)nc,

i.e.

L = α(m − 1)nc/(m(1 − 2α) + 2(α − 1)).
This makes sense only if m(1 − 2α) + 2(α − 1) > 0, i.e. if m > 2(1 − α)/(1 − 2α), since we know that α < 1/2.

Example. For α = 10/21 (very close to 1/2) we must have

m > 2(1 − 10/21)/(1 − 20/21) = (2 · 11)/1 = 22.

Take m = 23. Then L should be equal to

L = ((10/21) · 22 · nc)/(23(1 − 20/21) + 2(10/21 − 1)) = 220nc.

If n = 30 and c = 1, for instance, we obtain L = 6600. Then zI = 2 · 6600 + 30 = 13230 and zC = 30 + (23/22) · 6600 = 6930, so that indeed the gap is

α = (zI − zC)/zI = (13230 − 6930)/13230 = 10/21.
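The numbers in this example can be verified exactly with rational arithmetic (a minimal sketch):

```python
from fractions import Fraction as F

m, n, c = 23, 30, 1
alpha = F(10, 21)

# L = alpha (m-1) n c / (m (1 - 2 alpha) + 2 (alpha - 1))
L = alpha * (m - 1) * n * c / (m * (1 - 2 * alpha) + 2 * (alpha - 1))
assert L == 6600

z_I = 2 * L + n * c                # integer optimum: two open plants
z_C = F(m, m - 1) * L + n * c      # continuous optimum
assert z_I == 13230 and z_C == 6930
assert (z_I - z_C) / z_I == alpha  # the relative gap is exactly 10/21
```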
The cost matrix looks as follows: fi = 6600 for each of the m = 23 plants, and in each of the n = 30 columns the entry cσ(j)j is ∞ while the remaining 22 entries are equal to 1.
Proposition 4.2. Given a subset K ⊆ J such that σ(K) = I, |K| = m, Fσ,K bridges the gap for SPLP(σ, L).

Proof. Add Fσ,K to the constraints of SPLP(σ, L) and write the corresponding dual problem.

Primal:
  min Σ_i fi yi + Σ_i Σ_j cij xij
  ∀j: Σ_i xij = 1  (dual variable vj, unrestricted in sign)
  ∀i, ∀j: yi − xij ≥ 0  (wij ≥ 0)
  Σ_i yi − Σ_{j∈K} Σ_{i≠σ(j)} xij ≥ 2 − m  (ρ ≥ 0)
  xij ≥ 0, yi ≥ 0.

Dual:
  max zd = Σ_j vj − (m − 2)ρ
  ∀i: Σ_j wij + ρ ≤ fi  (yi)
  ∀j ∈ K, ∀i ≠ σ(j): vj − wij − ρ ≤ cij  (xij)
  ∀j ∈ K: vj − wσ(j)j ≤ cσ(j)j  (xσ(j)j)
  ∀j ∈ J − K, ∀i: vj − wij ≤ cij  (xij)
  wij ≥ 0, ρ ≥ 0.

The following solution is feasible for the dual:

wij = 0, ∀i, ∀j;  ρ = L (= fi);  vj = c + L, ∀j ∈ K;  vj = c, ∀j ∈ J − K.

Then

zd = Σ_{j∈K} (c + L) + Σ_{j∈J−K} c − L(m − 2)
   = m(c + L) + (n − m)c − Lm + 2L = 2L + nc.

Since we know a primal solution with the same objective function value, this is the optimal value for the dual, and the duality gap vanishes.
References

Pertinent references are either the following papers, or can be found referenced in them.

[1] E. Balas and M. Padberg, "Set partitioning", in: B. Roy, ed., Combinatorial programming: methods and applications (D. Reidel Publishing Co., Dordrecht, 1975).
[2] G. Cornuejols, M. Fisher and G.L. Nemhauser, "On the uncapacitated location problem", Annals of Discrete Mathematics 1 (1977) 163-177.
[3] M. Guignard and K. Spielberg, "Algorithms for exploiting the structure of the simple plant location problem", Annals of Discrete Mathematics 1 (1977) 247-271.
[4] M. Guignard and K. Spielberg, "A dual method for the mixed plant location problem, with some side constraints", Mathematical Programming 17 (1979) 198-228.
[5] J. Krarup and P. Pruzan, "Selected families of discrete location problems, Part III: The plant location family", University of Calgary Working Paper No. WP 12.77.
[6] M. Guignard and K. Spielberg, "A direct dual approach to a transshipment formulation for multi-layer network problems with fixed charges", Wharton School Department of Statistics Report No. 43 (August 1979).
Mathematical Programming Study 12 (1980) 163-175. North-Holland Publishing Company

BALANCED MATRICES AND PROPERTY (G)

Claude BERGE
Université Pierre et Marie Curie, Paris, France

Received 25 January 1979
Revised manuscript received 5 September 1979

Let A be a (0, 1)-matrix with n rows and m columns, considered as the incidence matrix of a hypergraph H with edges E1, E2, …, Em (the columns) and with vertices x1, x2, …, xn (the rows). H is called a balanced hypergraph if A does not contain a square sub-matrix of odd order with exactly two ones in each row and in each column. In this paper, we prove more "minimax" equalities for balanced hypergraphs than those already proved in Berge [1], Berge and Las Vergnas [3], Fulkerson et al. [7], Lovász [12]; in fact, the known results will follow easily from our main theorem.

Key words: Balanced Hypergraphs, Normal Hypergraphs, Zero-One Matrices, Minimax Theorems, Integral Polyhedra.
1. General definitions

Let A be a (0, 1)-matrix with n rows and m columns, considered here as the incidence matrix of a hypergraph H. For p ∈ N^n, q ∈ N^m, we consider the following coefficients:

ν(H; p, q) = max{ (q, z) | z ∈ Z^m, z ≥ 0, Az ≤ p },
τ(H; p, q) = min{ (p, t) | t ∈ Z^n, t ≥ 0, tA ≥ q },
τ*(H; p, q) = min{ (p, t) | t ∈ R^n, t ≥ 0, tA ≥ q }.

Clearly, ν(H; p, q) ≤ τ*(H; p, q) ≤ τ(H; p, q). The coefficient ν(H; p, q) is the max q-value of a p-matching, while τ(H; p, q) is the min p-value of a q-transversal in H.

Some properties of H, which have been considered under different names by different authors, can be rephrased as follows:

H is normal (Lovász; "perfect matrices" in Padberg [14]) iff

ν(H; 1, q) = τ(H; 1, q)  for all q ∈ N^m.

H has the Menger property (Seymour, Schrijver) iff

ν(H; p, 1) = τ(H; p, 1)  for all p ∈ N^n.

H is paranormal (Fulkerson, Lehman, Sakarovitch, etc.) iff

τ*(H; p, 1) = τ(H; p, 1)  for all p ∈ N^n.
164
C. Berge / Balanced matrices and property (G)
Clearly, the Menger property implies paranormality, but the converse is not true. Equivalent formulations exist in the literature. For instance, H is "paranormal" means that every vertex of the "transversal" polytope

{ t | t ∈ R^n, t ≥ 0, tA ≥ 1 }

has only integral coordinates. Also, H is "normal" means that every vertex of the "matching" polytope

{ z | z ∈ R^m, z ≥ 0, Az ≤ 1 }

has only integral coordinates.

We shall say that H has the König property iff ν(H; 1, 1) = τ(H; 1, 1). Clearly, if H is normal, or if H has the Menger property, then H also has the König property. Numerous examples of hypergraphs having one of the above-mentioned properties exist in the context of graph theory (cf. [16]). They appear most often as a "minimax" theorem, expressing the duality principle for some linear program in integers.

The other classical definitions needed in this paper are the following. If H = (E1, E2, …, Em) is a hypergraph, its vertex-set is X = ∪ Ei; its rank is r(H) = max|Ei|; its anti-rank is s(H) = min|Ei|; they are both positive integers. H′ is a partial hypergraph of H if it is obtained by removing edges (and all the vertices which become isolated). H_A is the subhypergraph induced by A ⊆ X: its edges are the non-empty sets of the form Ei ∩ A. Also, H/A is the section hypergraph of H by A, i.e. its edges are all the Ei which are contained in A. For u = (u1, u2, …, un) ∈ N^n, we denote by H^u a hypergraph obtained from H by replacing each vertex xj of H by a set Xj with cardinality uj and each edge Ei by ∪_{xj∈Ei} Xj.

A set T is a transversal set of H if

T ∩ Ei ≠ ∅  (i = 1, 2, …, m).

The set of all minimal transversal sets of H is denoted by Tr H ("transversal hypergraph"). So, τ(H; p, 1) can be described as the minimum weight of a transversal set. The number τ(H; 1, 1), more often denoted by τ(H) = min{|T| : T ∈ Tr H}, is the transversal number of H, and ν(H; 1, 1), more often denoted by ν(H), is the matching number of H. A well-known result of Lehman-Fulkerson [6, 10] is:

Lemma (Lehman-Fulkerson lemma). Let H be a hypergraph with vertices x1, x2, …, xn. Then H is paranormal if and only if
τ(H; p, 1) · τ(Tr H; w, 1) ≤ Σ_{i=1}^n pi wi  for all p ∈ N^n, w ∈ N^n.

From this result, it follows that H is paranormal if and only if Tr H is paranormal.

H is semi-normal (Lovász) if for every subset A ⊆ X, the section hypergraph H/A has the König property. Clearly, if H has the Menger property, or if H is normal, then H is semi-normal.
2. The property (G)

Let H be a hypergraph of n vertices with anti-rank s(H); denote by σ(H) the maximum number of pairwise disjoint transversal sets of H. Clearly,

σ(H) ≤ s(H).

We shall say that H has the property (G) if

σ(H_A) = s(H_A)  for all A ⊆ X.

We shall say that H has the strong property (G) if

σ(H^u) = s(H^u)  for all u ∈ N^n.

For instance, if H is the dual of a bipartite multigraph G with edges e1, e2, …, en, then H^u is the dual of a bipartite multigraph G′ obtained from G by multiplying each edge ej by uj. By a theorem of Gupta [8], the edge set of a bipartite multigraph G′ can be partitioned into δ(G′) edge covers, where δ(G′) = s(H^u) is the minimum degree; hence σ(H^u) = s(H^u). So, the dual of a bipartite multigraph has the strong property (G).

Proposition 1. A hypergraph H has property (G) if and only if Tr H is semi-normal.

Proof. Tr H is semi-normal if and only if, for every A ⊆ X, the hypergraph (Tr H)/A satisfies the König property. Since (Tr H)/A = Tr(H_A) (unless (Tr H)/A does not exist), this is equivalent to ν(Tr H_A) = τ(Tr H_A). Note that τ(Tr H_A) is equal to the minimum size of an edge of H_A, so τ(Tr H_A) = s(H_A). On the other hand, ν(Tr H_A) is the largest integer k such that there exist k pairwise disjoint transversal sets of H_A, so ν(Tr H_A) = σ(H_A), and the proposition follows.
Proposition 2. A hypergraph H has the strong property (G) if and only if Tr H has the Menger property.

Proof. Let H be such that Tr H has the Menger property:

ν(Tr H; p, 1) = τ(Tr H; p, 1)  (p ∈ N^n).  (1)

For any integer k, each of the following inequalities is equivalent to the next one:

τ(Tr H; p, 1) ≥ k;
Σ_{xj∈Ei} pj ≥ k  (i ≤ m);
s(H^p) ≥ k.

Hence

s(H^p) = τ(Tr H; p, 1).

On the other hand, H^p admits k pairwise disjoint minimal transversal sets iff there exist k transversal sets of H, say T1, T2, …, Tk ∈ Tr H, such that:

|{ i | xj ∈ Ti }| ≤ pj  (j ≤ n).

Since this is equivalent to ν(Tr H; p, 1) ≥ k, we have

σ(H^p) = ν(Tr H; p, 1).

Thus, (1) is equivalent to:

σ(H^p) = s(H^p),  (2)

which means that H has the strong property (G).

Corollary. Every hypergraph with the strong property (G) is paranormal.

Proof. From Proposition 2, a hypergraph H with the strong property (G) has a transversal hypergraph H′ = Tr H which has the Menger property. By the duality principle of linear programming, we have

ν(H′; p, 1) ≤ τ*(H′; p, 1) ≤ τ(H′; p, 1).

Since H′ has the Menger property we also have τ*(H′; p, 1) = τ(H′; p, 1). From the Lehman-Fulkerson lemma, it follows that H is paranormal.

Remark. The converse of the above corollary is not true. For instance, the hypergraph H = {abc, cde, efa, bdf, ad, be, cf} is paranormal; however, s(H) = 2 and σ(H) = 1, so that H does not have the property (G).
We shall see in the next section that if all partial hypergraphs of H are paranormal, then H has the strong property (G).
3. Minimax theorems for balanced hypergraphs

A hypergraph H is balanced if every odd cycle has an edge containing three vertices of the cycle, i.e. if the incidence matrix of H does not contain an odd square submatrix of the following form (up to permutations of rows and columns):

1 1 0 0 ⋯ 0
0 1 1 0 ⋯ 0
0 0 1 1 ⋯ 0
⋮         ⋮
0 0 ⋯   1 1
1 0 ⋯   0 1

Clearly, if H is balanced, then a partial hypergraph, a sub-hypergraph, or the dual hypergraph of H is also balanced; furthermore, we have

ν(H) = τ(H),  q(H) = Δ(H)

(see [1]). To establish more properties of balanced hypergraphs we shall show the following result:
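The matrix characterization lends itself to a direct (exponential, small-instance) check: by the abstract's definition, H is balanced iff no square submatrix of odd order has exactly two ones in every row and every column. A minimal sketch:

```python
from itertools import combinations

def is_balanced(A):
    """A: 0-1 incidence matrix (list of rows). Search every square
    submatrix of odd order >= 3 for the forbidden pattern: exactly
    two ones in each row and in each column."""
    nrows, ncols = len(A), len(A[0])
    for k in range(3, min(nrows, ncols) + 1, 2):          # odd orders
        for rows in combinations(range(nrows), k):
            for cols in combinations(range(ncols), k):
                if all(sum(A[i][j] for j in cols) == 2 for i in rows) and \
                   all(sum(A[i][j] for i in rows) == 2 for j in cols):
                    return False
    return True

# The odd cycle C3 is the smallest unbalanced example ...
C3 = [[1, 1, 0],
      [0, 1, 1],
      [1, 0, 1]]
# ... while an interval matrix (ones consecutive in each column) is balanced.
I4 = [[1, 0, 0],
      [1, 1, 0],
      [0, 1, 1],
      [0, 0, 1]]
assert not is_balanced(C3)
assert is_balanced(I4)
```

Any odd-order submatrix with two ones per row and column decomposes into cycles, at least one of odd length, so this test is equivalent to searching for the cycle pattern displayed above.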
Theorem 1. For a hypergraph H = (E1, E2, …, Em) on X = {x1, x2, …, xn}, the following conditions are equivalent:
(1) H is balanced;
(2) every partial hypergraph H′ ⊆ H has the strong property (G);
(3) every partial hypergraph H′ ⊆ H has a transversal hypergraph Tr H′ with the Menger property;
(4) every partial hypergraph H′ ⊆ H has a transversal hypergraph Tr H′ which is paranormal;
(5) every partial hypergraph H′ ⊆ H is paranormal;
(6) every partial hypergraph H′ ⊆ H has the Menger property;
(7) every partial hypergraph H′ ⊆ H has a transversal hypergraph Tr H′ which is semi-normal;
(8) every partial hypergraph H′ ⊆ H has the property (G).

Proof. (1) ⟹ (2). Let H be a balanced hypergraph of order n, and let p ∈ N^n. Since H^p is also balanced, we have only to show that every balanced hypergraph H = (E1, E2, …, Em) satisfies

σ(H) = s(H) = min|Ei|.
Let k = min[Ei[, and let (S,, 82 ..... Sk) be a partition of the vertex-set X into k classes. Denote by k(i) the number of classes which meet Ei. If k(i) = k for all i, all the classes are transversal sets of H, and the proof is achieved; otherwise, there exists an index j _-<m with k(j) < k. Since kfj) < k _-< [/~[, there exists a class Sp such that [Sp n Ej{--> 2. Furthermore, there exists a class Sq such that
Is~n~l=0. The subhypergraph Hspnsq is balanced, and therefore has a 2-coloring (S~, S~). Put S~=S~ if irSp, q. The partition (S~,S~ ..... S'k) determines as above new coefficients k'(i), and
k'(i) > k(i)
(i <=m),
k'(j) = k(j) + 1. With this method, it is always possible to improve a partition (S~, S~. . . . . S~) until we have k ' ( i ) = k for all i; then we have a partition of X into k = s ( H ) transversal sets, and the proof is achieved. (2) ~ (3). This follows from the Proposition 2. (3) ~ (4). This follows from the general relation:
ν(H'; p, 1) ≤ τ*(H'; p, 1) ≤ τ(H'; p, 1).
(4) ⇒ (5). This follows from the Lehman-Fulkerson lemma.
(5) ⇒ (6). Let H be a hypergraph whose partial hypergraphs are all paranormal, and assume (per absurdum) that some partial hypergraph of H does not have the Menger property. Let H' be a partial hypergraph of H which does not satisfy the Menger property and which has the least number of edges. Let n be the order of H' and let m be the number of edges of H'. Let p be an integer vector with (p, 1) minimum such that

(i) ν(H'; p, 1) < τ*(H'; p, 1) = τ(H'; p, 1).

Let z be an optimal p-matching of H', that is, a vector of the p-matching polytope

{z | z ∈ R^m, z ≥ 0, Az ≤ p}

such that Σ z_i is maximum. By (i), z has not all its coordinates integral; let z_1, say, be the smallest non-integral coordinate of z. By the minimality of (p, 1), we have

(ii) 0 < z_1 < 1.
From the minimality of H', we know that the partial hypergraph H̃ = H' − E_1,
with incidence matrix Ã, has the Menger property. Thus the maximum of Σ z̃_i over the p-matching polytope of H̃, which equals τ*(H̃; p, 1), is attained by an integral vector u = (u_2, u_3, ..., u_m). On the other hand, z̃ = (z_2, z_3, ..., z_m) is also a fractional p-matching of H̃, so that

τ*(H'; p, 1) − z_1 = Σ_{i=2}^m z_i ≤ Σ_{i=2}^m u_i = τ*(H̃; p, 1).
But τ*(H'; p, 1) and τ*(H̃; p, 1) are both integers, and it follows from (ii) that:

(iii) τ*(H'; p, 1) ≤ τ*(H̃; p, 1).
Since (0, u_2, u_3, ..., u_m) belongs also to the p-matching polytope of H', we obtain:

(iv) τ*(H'; p, 1) ≥ Σ_{i=2}^m u_i = τ*(H̃; p, 1).
Comparing (iii) and (iv), we see that (0, u_2, u_3, ..., u_m) is an optimal p-matching of H'; since all its coordinates are integral,

ν(H'; p, 1) = τ*(H'; p, 1) = τ(H'; p, 1).

This contradicts (i) and achieves the proof.

(6) ⇒ (1). If H satisfies (6) and is not balanced, there exists an odd cycle, say (x_1, E_1, x_2, E_2, ..., x_{2k+1}, E_{2k+1}, x_1), with no E_i containing three x_j's of the sequence. Take H' = (E_1, E_2, ..., E_{2k+1}), and

p_j = 1 for j = 1, 2, ..., 2k+1; p_j = 0 otherwise.

We have

ν(H'; p, 1) = k, τ(H'; p, 1) = k + 1.

This contradicts (6) and achieves the proof.

(3) ⇒ (7). The Menger property implies semi-normality.
(7) ⇒ (8). This follows from Proposition 1.
(8) ⇒ (1). If H satisfies (8) and is not balanced, then there exists an odd cycle, say (x_1, E_1, ..., E_{2k+1}, x_1), with no E_i containing three x_j's of the sequence. Take H' = (E_1, E_2, ..., E_{2k+1}), A = {x_1, x_2, ..., x_{2k+1}}. We have σ(H'_A) = 2, s(H'_A) = 1. This contradicts (8) and achieves the proof.

Corollary 1. Every balanced hypergraph has the Menger property; that is, the
minimum p-value of a transversal set is equal to the maximum cardinality of a p-matching.
This is a result of Fulkerson et al. [7].

Corollary 2. The transversal hypergraph of a balanced hypergraph has the Menger property.

This is a partial answer to a problem raised by Fulkerson et al. [7], who noted that the transversal hypergraph of a balanced hypergraph is not necessarily balanced, but asked whether it may have similar properties.
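The odd-cycle computation in the proof of (6) ⇒ (1) can be verified mechanically. The brute-force sketch below (Python; the choice k = 2, i.e. the 5-cycle, is our own illustration, not part of the original text) confirms ν(H'; p, 1) = k and τ(H'; p, 1) = k + 1, so an odd cycle indeed fails the Menger property.

```python
from itertools import product

# The odd cycle C_5 as a hypergraph: E_i = {x_i, x_{i+1}} (indices mod 5),
# with p_j = 1 on every vertex, as in the proof of (6) => (1).
edges = [{i, (i + 1) % 5} for i in range(5)]
p = [1] * 5

# nu(H'; p, 1): maximum size of an integral p-matching (at most p_j edges at x_j).
nu = max(sum(z) for z in product(range(2), repeat=5)
         if all(sum(z[i] for i, e in enumerate(edges) if v in e) <= p[v]
                for v in range(5)))

# tau(H'; p, 1): minimum p-value of an integral transversal (meets every edge).
tau = min(sum(t) for t in product(range(2), repeat=5)
          if all(any(t[v] for v in e) for e in edges))

print(nu, tau)  # 2 3
```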
4. Minimax theorems for balanced hypergraphs which follow from normality
Let H be a hypergraph; its chromatic index q(H) is the least number of colors needed to color the edges so that two intersecting edges have different colors. Let Δ(H) be the maximum degree, that is, the maximum number of edges having one point in common; H is called a normal hypergraph if every partial hypergraph H' of H satisfies q(H') = Δ(H'). Several characterisations of normal hypergraphs have been given by Lovász [11], Padberg [13, 14] and by Chvátal [4, 5]. The main results have been summarized by Lovász in a very simple theorem, which we shall use (in a slightly different context) to get more minimax theorems.

Lemma (Lovász [11]). Let H = (E_1, E_2, ..., E_m) be a normal hypergraph on X = {x_1, x_2, ..., x_n}, and let y = (y_1, y_2, ..., y_m) ∈ N^m. Then the hypergraph H̃ obtained from H by multiplying each E_i by y_i is also normal.

It suffices to show that H̃ = (E_1, E'_1, E_2, E_3, ..., E_m), with E'_1 = E_1, satisfies q(H̃) = Δ(H̃). Put q(H) = Δ(H) = q.

Case 1. In H, the edge E_1 contains a vertex x with degree d_H(x) = Δ(H). Then Δ(H̃) = q + 1, and we have

Δ(H̃) ≤ q(H̃) ≤ q(H) + 1 = q + 1 = Δ(H̃).

Hence q(H̃) = Δ(H̃).

Case 2. The edge E_1 does not contain a vertex with degree Δ(H), i.e.

d_H(x) ≤ q − 1   (x ∈ E_1).

Denote by (1) the color received by the edge E_1 in an optimal q-coloring of the edges of H; each vertex of H with maximum degree belongs to an edge of color (1).
Denote by H_1 the family of edges of H having color (1) and different from E_1. So Δ(H − H_1) = q − 1. Since H is normal, q(H − H_1) = q − 1; therefore we can color the edges of H − H_1 with q − 1 colors and, using one new color for H_1 + E'_1, we obtain a q-coloring of H̃. Hence q(H̃) = Δ(H̃).

Theorem 2. For a hypergraph H = (E_1, E_2, ..., E_m) on X = {x_1, x_2, ..., x_n} with incidence matrix A, the following conditions are equivalent:
(1) H is normal;
(2) every vertex of the matching polytope Q = {y | y ∈ R^m, y ≥ 0, Ay ≤ 1} has (0, 1)-coordinates;
(3) every vertex of the matching polytope has integral coordinates;
(4) ν(H; 1, q) = τ*(H; 1, q) for all q ∈ N^m;
(5) ν(H; 1, q) = τ(H; 1, q) for all q ∈ N^m;
(6) every partial hypergraph H' has the König property.

Proof. (1) ⇒ (2). Let z be a vertex of the polytope Q; since z is determined by a system of equalities with integral coefficients, its coordinates are rational, and there exist integers k, p_1, p_2, ..., p_m ≥ 0 such that kz = (p_1, p_2, ..., p_m). In the hypergraph H̃ obtained from H by multiplying each edge E_i by p_i, we have

d_H̃(x_j) = Σ_{E_i ∋ x_j} p_i = (a^j, kz) = k(a^j, z) ≤ k.
Hence Δ(H̃) ≤ k and, by the lemma, q(H̃) ≤ k: the edges of H̃ can be partitioned into k matchings H̃_1, H̃_2, ..., H̃_k. For h = 1, 2, ..., k, put

y^h_i = 1 if one copy of E_i is in H̃_h; y^h_i = 0 otherwise.

The vector y^h = (y^h_1, y^h_2, ..., y^h_m), with (0, 1)-coordinates, belongs to Q; furthermore

(1/k) Σ_{h=1}^k y^h = (1/k)(p_1, p_2, ..., p_m) = z.
Since z is a vertex of Q,
y^1 = y^2 = ... = y^k = z.

This shows that z is a vector with (0, 1)-coordinates.
(2) ⇒ (3). Obvious.
(3) ⇒ (4). Since max{(q, y) | y ∈ Q} is attained at a vertex of the polytope Q, we have
ν(H; 1, q) = τ*(H; 1, q).

(4) ⇒ (5). Put Q_1 = {z | z ∈ Q, (z, q) = max_{y ∈ Q} (y, q)}. Q_1 being a face of the polytope Q, there exists a row vector a^{j_1} of the incidence matrix A such that

z ∈ Q_1 ⇒ (a^{j_1}, z) = 1.

It follows from (4) that each matching of H with maximum q-value covers the vertex x_{j_1}. Put

q^1_i = q_i − 1 if x_{j_1} ∈ E_i; q^1_i = q_i otherwise.
Thus

ν(H; 1, q^1) = ν(H; 1, q) − 1.

As above, there exists a vertex x_{j_2} of H and a vector q^2 = (q^2_1, q^2_2, ..., q^2_m) such that ν(H; 1, q^2) = ν(H; 1, q^1) − 1. We can continue, defining a sequence σ = (x_{j_1}, x_{j_2}, ..., x_{j_k}), until we have

ν(H; 1, q^k) = 0.

The vector t = (t_1, t_2, ..., t_n), where t_j is the number of appearances of x_j in the sequence σ, is a q-transversal of H such that

(t, 1) = Σ_{j=1}^n t_j = k = ν(H; 1, q).
Hence t is a q-transversal of minimum value, and τ(H; 1, q) = ν(H; 1, q).

(5) ⇒ (6). Let H' be a partial hypergraph of H, and put

q_i = 1 if E_i ∈ H'; q_i = 0 otherwise.

The vector q = (q_1, q_2, ..., q_m) satisfies
ν(H; 1, q) = ν(H'), τ(H; 1, q) = τ(H').

Thus, (5) implies ν(H') = τ(H').

(6) ⇒ (1). It suffices to show that a hypergraph H which satisfies (6) is such that q(H) = Δ(H). Let H̃ be the hypergraph whose vertices are the matchings of H, and where an edge Ẽ_i denotes the set of all matchings of H containing E_i. Clearly, Ẽ_i ∩ Ẽ_j = ∅ if and only if E_i ∩ E_j ≠ ∅. Since H satisfies (6), it has the Helly property, hence ν(H̃) = Δ(H). Also,

q(H̃) = τ(H), τ(H̃) = q(H), Δ(H̃) = ν(H).

Hence H̃ is normal; since we have already shown that (1) ⇒ (6), we also have ν(H̃) = τ(H̃), that is, q(H) = Δ(H).
This achieves the proof.

Theorem 3. Let H = (E_1, E_2, ..., E_m) be a hypergraph on X with incidence matrix A. For J ⊆ {1, 2, ..., n}, we denote by A^J the matrix obtained from A by deleting the rows whose index is not in J. For I ⊆ {1, 2, ..., m}, we denote by A_I the matrix obtained from A by deleting the columns whose index is not in I. The following conditions are equivalent:
(1) H is balanced;
(2) for all J, all the vertices of the matching polytope {y | y ∈ R^m, y ≥ 0, A^J y ≤ 1} have integral coordinates;
(3) ν(H_A; 1, q) = τ*(H_A; 1, q) for all q ∈ N^m and all A ⊆ X;
(4) ν(H_A; 1, q) = τ(H_A; 1, q) for all q ∈ N^m and all A ⊆ X.

This follows immediately from Theorem 2 and from the fact that H is balanced if and only if H_A is normal for all A ⊆ X.

Corollary 1. Let H be a balanced hypergraph; then ν(H; 1, q) = τ(H; 1, q), i.e. the maximum q-value of a matching of H is equal to the minimum value of a q-transversal of H.
(This is a result of Fulkerson et al. [7].)

Corollary 2. For all J, the vertices of the matching polytope

{y | y ∈ R^m, y ≥ 0, A^J y ≤ 1}

have integral coordinates if and only if, for all I, the vertices of the transversal polytope

{z | z ∈ R^n, z ≥ 0, zA_I ≥ 1}

have integral coordinates.

This follows from Theorem 3 and Theorem 1.

Remark. A balanced hypergraph H satisfies

(1) ν(H; p, q) = τ(H; p, q)

for p = 1 and q ∈ N^m, or for p ∈ N^n and q = 1. However, (1) does not hold for every pair (p, q): for H = (ad, bd, cd, abcd), p = (2, 2, 2, 3) and q = (1, 1, 1, 2), we see that t = (1/2, 1/2, 1/2, 1/2) is a fractional q-transversal and z = (1/2, 1/2, 1/2, 3/2) is a fractional p-matching with (p, t) = (q, z), so that τ*(H; p, q) = 9/2. Also,

ν(H; p, q) = 4, τ(H; p, q) = 5.

Hence

ν(H; p, q) < τ*(H; p, q) < τ(H; p, q).

Thus (1) is not true for H. In fact, no balanced hypergraphs except the unimodular ones satisfy (1) for all p ∈ N^n and q ∈ N^m. The above considerations suggest the following problem: determine all pairs (p, q) with p ∈ N^n and q ∈ N^m such that ν(H; p, q) = τ(H; p, q) holds for all balanced hypergraphs H.
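The figures in the Remark can be checked by exhaustive search. The following sketch (Python; the vertex order a, b, c, d and the encoding of H are our own) recovers ν(H; p, q) = 4 and τ(H; p, q) = 5 for H = (ad, bd, cd, abcd), p = (2, 2, 2, 3), q = (1, 1, 1, 2).

```python
from itertools import product

# Hypergraph H = (ad, bd, cd, abcd) with a=0, b=1, c=2, d=3 (from the Remark).
edges = [{0, 3}, {1, 3}, {2, 3}, {0, 1, 2, 3}]
p = [2, 2, 2, 3]   # vertex capacities
q = [1, 1, 1, 2]   # edge weights

def nu():
    """nu(H; p, q): max q-value of an integral p-matching z."""
    best = 0
    for z in product(range(4), repeat=4):
        if all(sum(z[i] for i, e in enumerate(edges) if v in e) <= p[v]
               for v in range(4)):
            best = max(best, sum(qi * zi for qi, zi in zip(q, z)))
    return best

def tau():
    """tau(H; p, q): min p-value of an integral q-transversal t."""
    best = None
    for t in product(range(3), repeat=4):
        if all(sum(t[v] for v in e) >= q[i] for i, e in enumerate(edges)):
            val = sum(pv * tv for pv, tv in zip(p, t))
            best = val if best is None else min(best, val)
    return best

print(nu(), tau())  # 4 5
```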
References
[1] C. Berge, "Balanced matrices", Mathematical Programming 2 (1972) 19-31.
[2] C. Berge, "Sur une extension de la théorie des matrices bi-stochastiques", in: Dall'Aglio, ed., Studi di probabilità, statistica e ricerca operativa in onore di G. Pompilj (Oderisi-Gubbio, Rome, 1970) 475-482.
[3] C. Berge and M. Las Vergnas, "Sur un théorème du type König pour hypergraphes", Annals of the New York Academy of Sciences 175 (1970) 32-40.
[4] V. Chvátal, "On certain polytopes associated with graphs", Journal of Combinatorial Theory 18(B) (1975) 138-154.
[5] V. Chvátal, "On the strong perfect graph conjecture", Journal of Combinatorial Theory 20(B) (1976) 139-141.
[6] D.R. Fulkerson, "Blocking and anti-blocking pairs of polyhedra", Mathematical Programming 1 (1971) 168-194.
[7] D.R. Fulkerson, A.J. Hoffman and R. Oppenheim, "On balanced matrices", Mathematical Programming Study 1 (1974) 120-132.
[8] R.P. Gupta, "An edge-coloration theorem for bipartite graphs of paths in trees", Discrete Mathematics 23 (1978) 229-233.
[9] A.J. Hoffman, "A generalisation of max-flow min-cut", Mathematical Programming 6 (1974) 352-359.
[10] A. Lehman, "On the width-length inequality", mimeo (1965).
[11] L. Lovász, "Normal hypergraphs and the perfect graph conjecture", Discrete Mathematics 2 (1972) 253-267.
[12] L. Lovász, "On two minimax theorems in graphs", Journal of Combinatorial Theory 21(B) (1976) 96-103.
[13] M.W. Padberg, "On the facial structure of set packing polyhedra", Mathematical Programming 5 (1973) 199-215.
[14] M.W. Padberg, "Perfect zero-one matrices", Mathematical Programming 6 (1974) 180-196.
[15] M.W. Padberg, "Almost integral polyhedra related to certain combinatorial optimization problems", Linear Algebra and its Applications 15 (1976) 69-88.
[16] A. Schrijver, "Fractional packing and covering", in: Packing and covering (Mathematisch Centrum, Amsterdam, 1978) 175-248.
Mathematical Programming Study 12 (1980) 176-196. North-Holland Publishing Company
DUAL INTEGRALITY IN b-MATCHING PROBLEMS

W. PULLEYBLANK

The University of Calgary, Calgary, Alberta, Canada
Received 24 June 1977
Revised manuscript received 20 July 1978

We investigate the problem of when a b-matching problem with integer edge costs has an integer optimal dual solution. We introduce the concept of b-bicritical graphs, give a characterization of them and show that they play a pivotal role in determining when there exists an integer optimal dual solution.

Key words: b-Matching, Matching, Dual Integrality, Bicritical, b-Critical, Duality.
1. Introduction

The weighted b-matching problem is a well-solved class of integer programming problems. For such a problem we know a polynomially bounded algorithm which solves the problem, we know a good optimality criterion and we know the unique minimal set of inequalities necessary and sufficient to define the convex hull of the feasible solutions. In fact these are all related via the duality theory of linear programming. In this paper we study the dual linear programming problem and obtain integrality results.

There are several reasons why we are interested in such theorems. First, when a b-matching problem has an integral optimal dual solution, the duality theorem of linear programming gives a combinatorial min-max theorem. An example of this is the famous König theorem, which states that the maximum cardinality of a matching in a bipartite graph is equal to the minimum cardinality of a set of nodes that meets every edge. One way of proving this theorem is by showing that the associated linear program always has a 0-1 valued optimal dual solution.

A second reason for studying this problem is that the problem of finding an optimal integer solution to the dual of a b-matching problem is well known to be very difficult. For example, in the case that all edge costs are 1 and all degree constraints are 2, the set of nodes given a dual variable of 1 in a 0-1 valued solution to the dual problem is a set of nodes that covers all the edges of the graph. Thus the problem of finding a minimum integer solution is polynomially equivalent to the problem of determining whether or not a graph has a set of k nodes that covers the edges. This problem is well known to be NP-complete (see Aho et al. [1]). We elaborate on consequences of this viewpoint of the results presented here in another paper [8].

In Section 2 we present the basic terminology used throughout the paper. In
Section 3 we describe the weighted b-matching problem and give some fundamental results that are used in the rest of the paper. When determining whether a b-matching problem has an integer optimal dual solution, the set of primal constraints that we use is very important. We observe in Section 3 that if enough redundant primal constraints are added to the problem, then there always exists an integer optimal dual solution. Thus through the bulk of this paper we require that the set of primal constraints used be essentially minimal, that is, it should correspond to the set of facets of the matching polytope. In actual fact, for ease of development, we do use a set of primal constraints which may include some non-facets, but in Section 7 we show how this may be restricted to the case in which only dual variables corresponding to facets are permitted.

In Section 4 we introduce the concept of b-bicritical graphs. If we are given a graph G = (V, E, ψ) and a vector b = (b_i: i ∈ V) of positive integral degree constraints, we say that x = (x_j: j ∈ E) is a perfect b-matching if x is an assignment of nonnegative integers to the edges of G such that, for each node i, the sum of the x_j on the edges j incident with i is exactly equal to b_i. Then we say that G is b-bicritical if G is connected, |V| > 2, b_i ≥ 2 for all i ∈ V, and whenever we reduce any b_i by 2, there then exists a perfect b-matching. These graphs are shown to play a central role in determining when a b-matching problem has an integer optimal dual solution. This is the subject of Sections 5 and 6. In Section 5 we consider the case in which all edge costs are equal to 1 and in Section 6 we consider the case in which the edge costs are arbitrary integers. It is known that for any b-matching problem in which the edge costs are integers, there exists an optimal dual solution for which all dual variables are integer or half-integer. We call such a dual solution discrete.
Theorem 5.5 shows that if G is b-critical and edge costs are 1, then the only optimal discrete dual solution is obtained by letting every node have a dual variable of 1/2 and letting all other dual variables be 0. Theorems 5.9 and 6.2 show that when no integer optimal dual solution exists, the set of dual variables that receive fractional values in an optimal dual solution can be restricted to the node set of a b-bicritical subgraph of the graph (which also satisfies some additional properties).
2. Terminology

A graph G is an ordered triple (V, E, ψ) where V is a finite set of nodes, E is a finite set of edges and ψ is an incidence function. For any j ∈ E, ψ(j) is the set of two nodes incident with the edge j. For any S ⊆ V we let δ(S) denote the set of edges incident with exactly one node of S. That is, δ(S) = {j ∈ E: |ψ(j) ∩ S| = 1}. If S consists of a single node v, then we write δ(v) for δ({v}), and so δ(v) is the set of edges incident with v. For any S ⊆ V, we let γ(S) denote the set of edges having both ends incident with S. That is,
γ(S) = {j ∈ E: ψ(j) ⊆ S}.

For any S ⊆ V, we let G[S] denote the subgraph of G induced by S. That is,

G[S] = (S, γ(S), ψ'),

where ψ' is the restriction of ψ to γ(S). For any vector x = (x_j: j ∈ J) and for any K ⊆ J we let x(K) denote Σ_{j ∈ K} x_j. Thus, for example, if x is a vector indexed by the edges of G and S is a subset of nodes, then x(δ(S)) is the sum of the values over those edges having one end in S, and x(γ(S)) is the sum of the values over those edges having both ends in S. For any S ⊆ V, N(S) is the set of nodes of V − S which are adjacent to a node of S. Thus v ∈ N(S) if and only if there is some j ∈ E such that ψ(j) − S = {v}. We abbreviate N({v}) by N(v); thus N(v) is simply the set of nodes of G adjacent to v. Finally, let 𝒮 be any set of subsets of V. Then for any edge j, 𝒮(j) denotes those members of 𝒮 that contain both nodes incident with j. Thus

𝒮(j) = {S ∈ 𝒮: ψ(j) ⊆ S}.
3. The weighted b-matching problem

Let G = (V, E, ψ) be a graph, let b = (b_i: i ∈ V) be a vector of positive integral degree constraints and let c = (c_j: j ∈ E) be a vector of arbitrary real edge costs. The b-matching problem is to find a vector x = (x_j: j ∈ E) such that

x_j is a nonnegative integer for every j ∈ E,   (3.1)
x(δ(i)) ≤ b_i for every i ∈ V,   (3.2)

and cx is maximized subject to these conditions.
A vector satisfying (3.1) and (3.2) is called a b-matching. If a b-matching x satisfies x(δ(i)) = b_i for all i ∈ S ⊆ V, then we say that x is a perfect b-matching on S. The matching polytope P(G, b) is defined to be the convex hull of the set of b-matchings. (Note that since the zero vector is always a b-matching, P(G, b) is never empty.) We are able to give a minimal set of linear inequalities necessary and sufficient to define P(G, b) (Edmonds and Pulleyblank, see Pulleyblank [7]). In order to describe these, we first introduce some terminology. Let S ⊆ V be such that b(S) is odd. A near perfect matching (np-matching) of G[S] deficient at node i ∈ S is a b-matching of G[S] which satisfies

x(δ(i)) = b_i − 1,
x(δ(v)) = b_v for every v ∈ S − {i}.
That is, an np-matching is a b-matching of G[S] which is "as close as possible" to being a perfect matching on S. We say that G is b-critical if for each node i there exists an np-matching of G deficient at i. Note that if a graph G is b-critical, then G is connected and b(V) is odd. (A simple example of a b-critical graph is a triangle with b_i = 3 for every node i.) We let

𝒮 = {S ⊆ V: |S| ≥ 3 and G[S] is b-critical}.   (3.3)
Theorem 3.1 (Edmonds and Pulleyblank, see Pulleyblank [7]).

P(G, b) = {x ∈ R^E: x ≥ 0,   (3.4)
x(δ(i)) ≤ b_i for all i ∈ V,   (3.5)
x(γ(S)) ≤ ½(b(S) − 1) for every S ∈ 𝒮}.   (3.6)
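To see that the odd-set inequalities (3.6) are not redundant, consider the triangle K_3 with b_i = 1 for every node (our own minimal example, not taken from the paper): the half-integral point x = (1/2, 1/2, 1/2) satisfies every node constraint (3.5) but violates (3.6) for S = V, since ½(b(V) − 1) = 1. A quick check in Python:

```python
# The fractional point x = (1/2, 1/2, 1/2) on the triangle K_3 with b_i = 1.
x = [0.5, 0.5, 0.5]
edges = [(0, 1), (1, 2), (0, 2)]
b = [1, 1, 1]

# Node constraints (3.5): x(delta(i)) <= b_i -- satisfied.
node_ok = all(sum(xj for xj, e in zip(x, edges) if i in e) <= b[i]
              for i in range(3))

# Odd-set constraint (3.6) for S = V: x(gamma(S)) <= (b(S) - 1)/2 = 1 -- violated.
odd_set_ok = sum(x) <= (sum(b) - 1) / 2

print(node_ok, odd_set_ok)  # True False
```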
This set of inequalities may not in fact be minimal, but it is more convenient to work with here, and our results are easily shown to hold when we have a minimal set of inequalities. This is the subject of Section 7. By virtue of Theorem 3.1, we can now consider the b-matching problem as the linear program: maximize cx subject to (3.4), (3.5), (3.6). The dual linear program is

minimize Σ_{i ∈ V} b_i y_i + Σ_{S ∈ 𝒮} q_S y_S

subject to

y_i ≥ 0 for all i ∈ V,
y_S ≥ 0 for all S ∈ 𝒮,   (3.7)
y(ψ(j)) + y(𝒮(j)) ≥ c_j for all j ∈ E,

where for any S ∈ 𝒮, q_S = ½(b(S) − 1). This linear programming problem (3.7), henceforth called the dual problem, is the subject of our study. If the vector c of edge costs is integer valued, then in some cases we can find an optimal solution y* to the dual problem which is integer valued. In fact we can always find an optimal dual solution which satisfies the following discreteness property.
Theorem 3.2 (Edmonds, see Pulleyblank [7]). If c is integer valued, then there exists an optimal dual solution y* satisfying

y*_S is integer valued for all S ∈ 𝒮,   (3.8)
y*_i is integer or half-integer valued for all i ∈ V.   (3.9)
We call a dual solution that satisfies (3.8) and (3.9) a discrete dual solution. Note that every integer dual solution is a discrete dual solution. In fact, Theorem 3.2 is an immediate consequence of Edmonds' b-matching algorithm. It solves a b-matching problem by finding a b-matching and a dual solution which satisfy the complementary slackness conditions for optimality. Provided that we initialize the algorithm in the proper fashion, if we have integer edge costs, then the optimal dual solution found by the algorithm will be a discrete optimal dual solution. The complementary slackness conditions of linear programming which are satisfied by optimal matchings and dual solutions are the following.

Theorem 3.3. A b-matching x and a dual solution y are optimal if and only if

x(δ(i)) = b_i for every i ∈ V such that y_i > 0,
x(γ(S)) = q_S for every S ∈ 𝒮 such that y_S > 0,
y(ψ(j)) + y(𝒮(j)) = c_j for every j ∈ E such that x_j > 0.

When studying dual integrality, our set of primal constraints is very important, since we have a dual variable available for each such primal constraint. Thus although the presence of superfluous primal constraints does not affect the set of solutions to the b-matching problem, the presence of corresponding superfluous dual variables may enable us to find an integer optimal dual solution when none exists if we are limited to dual variables corresponding to essential primal constraints. Let 𝒮̄ = {S: S ⊆ V}. Then for any S ∈ 𝒮̄ and any b-matching x, we have

x(γ(S)) ≤ ⌊½ b(S)⌋,
where for any real number t, ⌊t⌋ denotes the greatest integer no greater than t. This is true since for any b-matching x,

b(S) ≥ Σ_{i ∈ S} x(δ(i)) = 2x(γ(S)) + x(δ(S)),

and so x(γ(S)) ≤ ½ b(S). But x(γ(S)) is integer valued and the result follows. Note that if S ∈ 𝒮, then x(γ(S)) ≤ ½(b(S) − 1) is the same constraint as x(γ(S)) ≤ ⌊½ b(S)⌋. Thus it follows from Theorem 3.1 that

P(G, b) = {x ∈ R^E: x satisfies (3.4), (3.5) and x(γ(S)) ≤ ⌊½ b(S)⌋ for all S ∈ 𝒮̄}.   (3.10)
Now (3.10) contains many redundant constraints, in fact enough that the dual problem always has an integer solution, provided c is integer valued.

Theorem 3.4. If c is integer valued, then there exists an integer optimal dual solution to the linear program: maximize cx subject to (3.4), (3.5) and (3.10).
Proof. By Theorem 3.2 there exists a discrete optimal dual solution y to the problem of maximizing cx subject to (3.4), (3.5) and (3.6). Let X be the set of nodes i for which y_i is fractional. By linear programming duality,

Σ_{i ∈ V} b_i y_i + Σ_{S ∈ 𝒮} q_S y_S = cx

for an optimal b-matching x, and since c is integer valued, it follows that Σ_{i ∈ X} b_i y_i is integer valued; thus b(X) must be even. If we lower y_i by 1/2 for every i ∈ X, let y_X = 1 and let y_S = 0 for all other S ∈ 𝒮̄ − 𝒮, we obtain an integral optimal solution to the dual problem of maximizing cx subject to (3.4), (3.5) and (3.10).

Note that in order to obtain an integer optimal dual solution, we let y_X be positive for a set X ∈ 𝒮̄ − 𝒮. Fig. 1 is an example of a b-matching problem which has an integer optimal dual solution only if we are allowed to use such a dual variable. We obtain a maximum matching with value 3 by letting x_j = 1 for each edge j. Since b(S) is even for every S ⊆ V, 𝒮 = ∅ and Σ_{i ∈ V} b_i y_i is even for any integer dual solution y. Thus if we are restricted to dual variables corresponding to sets S in 𝒮 and nodes in V, we cannot obtain an integer valued dual solution whose objective value is less than 4. The problem that we now study is: when does the dual problem (3.7) (i.e., when we are restricted to 𝒮 instead of 𝒮̄) have an integer optimal solution?
Fig. 1. A b-matching problem requiring a fractional optimal dual solution. All edge costs = 1.

4. Bicritical graphs

In this section we introduce the concept of bicritical graphs and establish some of their basic properties. In the following sections we show the role that these graphs play in characterizing b-matching problems that have an integer optimal dual solution. The bicritical graphs defined in this section are a generalization of the bicritical graphs introduced by Lovász [5] and Lovász and Plummer [6], who were concerned with obtaining bounds on the number of perfect b-matchings possessed by a graph when b_i = 1 for all i ∈ V. Let b = (b_i: i ∈ V) be a vector of integers satisfying b_i ≥ 2 for all i ∈ V. We call b' a simple reduction of b at v if
b'_i = b_i for i ∈ V − {v}; b'_v = b_v − 2.
Thus we obtain a simple reduction of b by simply subtracting 2 from the degree constraint of one node. We say that G is weakly b-bicritical if

b_i ≥ 2 for all i ∈ V,   (4.1)
for any simple reduction b' of b, there exists a perfect b'-matching.   (4.2)

It is an immediate consequence of (4.2) that if G is a weakly b-bicritical graph, then

b(V) is even.   (4.3)
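Conditions (4.1) and (4.2) can be tested directly on small graphs by enumerating candidate perfect b'-matchings. In the sketch below (Python; the triangle and 4-cycle instances are our own illustrations), the triangle with b ≡ 2 qualifies while the 4-cycle with b ≡ 2 does not:

```python
from itertools import product

def perfect(n, edges, b):
    """True if some integer edge vector x has x(delta(i)) = b_i at every node."""
    return any(all(sum(xj for xj, e in zip(x, edges) if i in e) == b[i]
                   for i in range(n))
               for x in product(range(max(b) + 1), repeat=len(edges)))

def weakly_b_bicritical(n, edges, b):
    """Check (4.1) and (4.2): b_i >= 2, and every simple reduction of b
    (subtract 2 at one node) leaves a perfect b'-matching."""
    if any(bi < 2 for bi in b):
        return False
    for v in range(n):
        br = list(b)
        br[v] -= 2
        if not perfect(n, edges, br):
            return False
    return True

triangle = [(0, 1), (1, 2), (0, 2)]
square = [(0, 1), (1, 2), (2, 3), (0, 3)]
print(weakly_b_bicritical(3, triangle, [2, 2, 2]))     # True
print(weakly_b_bicritical(4, square, [2, 2, 2, 2]))    # False
```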
Fig. 2 is an example of a weakly b-bicritical graph. We call a graph nontrivial if |V| ≥ 3. It is easily seen that the only trivial weakly b-bicritical graph consists of a single node i with b_i = 2.

Lemma 4.1. A nontrivial weakly b-bicritical graph has a perfect b-matching.
Proof. Suppose G is weakly b-bicritical and has no perfect b-matching. Let y* be an optimal dual solution to the b-matching problem, taking c_j = 1 for all j ∈ E. Then its value is ½ b(V) − 1. For any simple reduction b' of b at i, the perfect b'-matching is an optimal solution to the original problem, so by complementary slackness (Theorem 3.3) we must have y*_i = 0 for all i ∈ V. For any S ∈ 𝒮 there is an optimum matching x having a deficiency of 2 at a node i ∈ S, so x(γ(S)) < q_S and hence, by Theorem 3.3, y*_S = 0. Hence the dual solution has value 0 and b(V) = 2. This together with (4.1) shows that G is a trivial weakly b-bicritical graph.

Fig. 2. A weakly b-bicritical graph.

It should be observed that weakly b-bicritical graphs need not be connected; any union of disjoint nontrivial weakly b-bicritical graphs is itself weakly b-bicritical. We now prove an analogue of a theorem of Lovász and Plummer [6] which characterizes the nontrivial weakly b-bicritical graphs. First we introduce some
terminology. Let G = (V, E, ψ) be a graph and let X ⊆ V. We partition V − X as follows:

𝒞⁰(X) = {i ∈ V − X: G[{i}] is a component of G[V − X]},
𝒞¹(X) = {S ⊆ V − X: b(S) is odd, |S| ≥ 2 and G[S] is a component of G[V − X]},
𝒞²(X) = {S ⊆ V − X: b(S) is even, |S| ≥ 2 and G[S] is a component of G[V − X]}.

See Fig. 3. Tutte [9] proved the following characterization of those graphs possessing a perfect b-matching (see also Berge [2]).

Theorem 4.2. G has a perfect b-matching if and only if for every X ⊆ V,

b(X) ≥ b(𝒞⁰(X)) + |𝒞¹(X)|.
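For very small graphs, the existence of a perfect b-matching, which Theorem 4.2 characterizes, can be decided by plain enumeration. A brute-force sketch (Python; both example graphs are our own, not taken from the paper):

```python
from itertools import product

# Brute-force test for a perfect b-matching: every node i must be
# saturated, x(delta(i)) = b_i. A sketch for tiny instances only.
def has_perfect_b_matching(n, edges, b):
    for x in product(range(max(b) + 1), repeat=len(edges)):
        if all(sum(xj for xj, e in zip(x, edges) if i in e) == b[i]
               for i in range(n)):
            return True
    return False

# Triangle with b = (2, 2, 2): x = (1, 1, 1) is a perfect b-matching.
print(has_perfect_b_matching(3, [(0, 1), (1, 2), (0, 2)], [2, 2, 2]))  # True
# A single edge with b = (1, 2) has none: one x cannot equal both 1 and 2.
print(has_perfect_b_matching(2, [(0, 1)], [1, 2]))  # False
```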
Now we prove a characterization of weakly b-bicritical graphs.

Theorem 4.3. Let G be a graph with no trivial components. Then G is weakly b-bicritical if and only if b(K) is even for the node set K of each component and, for every nonempty X ⊆ V,

b(X) ≥ b(𝒞⁰(X)) + |𝒞¹(X)| + 2.   (4.4)

Proof. First we prove the sufficiency. If there exists i ∈ V such that b_i = 1, then setting X = {i} we would obtain a contradiction to (4.4); so b_i ≥ 2 for all i ∈ V. Now suppose that for some i ∈ V there is a simple reduction b' of b at i for which there does not exist a perfect b'-matching. Then by Theorem 4.2 there exists X ⊆ V such that

b'(X) < b'(𝒞⁰(X)) + |𝒞¹(X)|.

Fig. 3. Sample 𝒞⁰(X), 𝒞¹(X), 𝒞²(X). The integer beside each node i is the value of b_i.
If X = ∅, then since G has no trivial components and since b(K) is even for the node set K of each component, we must have 𝒞⁰(X) = 𝒞¹(X) = ∅. Thus b'(X) < 0, a contradiction. Hence X ≠ ∅. If i ∈ V − X, then

b(X) = b'(X) < b(𝒞⁰(X)) + |𝒞¹(X)|,

so (4.4) is violated for X. If i ∈ X, then

b(X) = b'(X) + 2 < b(𝒞⁰(X)) + |𝒞¹(X)| + 2,

so (4.4) is violated. Thus the sufficiency is proved.

To prove the necessity, let X be a nonempty subset of V. Let b' be a simple reduction of b at i ∈ X. Since G has a perfect b'-matching, by Theorem 4.2,

b'(X) ≥ b'(𝒞⁰(X)) + |𝒞¹(X)|.

Therefore

b(X) = b'(X) + 2 ≥ b(𝒞⁰(X)) + |𝒞¹(X)| + 2.

It follows from (4.3) that b(K) is even for the node set K of each component, which completes the proof.

It is interesting to compare Theorem 4.3 to the following characterization of b-critical graphs.
Theorem 4.4 (Pulleyblank [7]). A connected graph G is b-critical if and only if b(V) is odd and, for every nonempty X ⊆ V,

b(X) ≥ b(𝒞⁰(X)) + |𝒞¹(X)| + 1.

Let b = (b_i: i ∈ V) be a vector of positive integers. We call an integer vector b' = (b'_i: i ∈ V) a complex reduction of b if 0 ≤ b' ≤ b and b'(V) = b(V) − 2. In other words, a complex reduction of b is obtained either by subtracting 2 from the value of b_i for some i ∈ V such that b_i ≥ 2, or else by subtracting 1 from the values of b_u, b_v for distinct nodes u, v ∈ V. We say that G is strongly b-bicritical if for any complex reduction b' of b, G has a perfect b'-matching. There are essentially two trivial strongly b-bicritical graphs. The first consists of a single node i with b_i = 2 (the trivial weakly b-bicritical graph). The second consists of two nodes u and v joined by any number of edges (including no edges) and for which b_u = b_v = 1. The following is immediate.

A strongly b-bicritical graph has, for any edge j, a perfect b-matching x such that x_j > 0.   (4.5)

The following characterizes nontrivial strongly b-bicritical graphs.
Theorem 4.5. G is a nontrivial strongly b-bicritical graph if and only if G is connected
and
G has no cutnode i
IV[ -> 3,
for which bi = 1,
b ( V ) is even, b ( X ) >- b(%'~
(4.6) (4.7) (4.8)
+ Ics
+ 2,
(4.9)
for any X C_ V such that b ( X ) >_2. ProoL The necessity of (4.6)-(4.8) is easily seen and the necessity of (4.9) can be shown in a manner analogous to that used in the proof of Theorem 4.3. We now prove the sufficiency. Suppose G satisfies (4.6)-(4.8) but is not a nontrivial strongly b-bicritical graph. Then there exists a complex reduction b' of b such that G has no perfect b'-matching. Therefore by Theorem 4.2 there exists X _C V such that
b'(X) < b'(C⁰(X)) + |C¹'(X)|,

where C¹'(X) is the set of node sets S of components G[S] of G[V − X] for which b'(S) is odd and |S| > 1. If b(X) = 0, then X = ∅ and so G is not connected, contrary to (4.6). Suppose b(X) = 1. If G[V − X] has two or more components, then X consists of a cutnode i for which b_i = 1, contradicting (4.7). If G[V − X] is connected, then by (4.8), b'(X) and b'(C⁰(X)) + |C¹'(X)| must have the same parity and hence differ in value by at least two. Therefore G[V − X] consists of a single node u with b_u ≥ 3. But then |V| = 2, contradicting (4.6). Therefore b(X) ≥ 2. Since
(b(X) − b'(X)) + (|C¹'(X)| − |C¹(X)|) ≤ 2,

we have

b(X) < b'(C⁰(X)) + |C¹'(X)| + 2 ≤ b(C⁰(X)) + |C¹(X)| + 2,

so (4.9) is violated and the proof is complete.

Theorem 4.6. A connected weakly b-bicritical graph is strongly b-bicritical.
Proof. If G is trivial, then so too is the result. Otherwise (4.6)-(4.8) are immediate and (4.9) follows from Theorem 4.3. Thus the result follows from Theorem 4.5.

In view of Theorem 4.6 we simplify our terminology and say that G is b-bicritical if it is a connected nontrivial weakly b-bicritical graph.

Edmonds [4] introduced the concept of a "good" characterization. A characterization is called good if it can be checked in time which grows polynomially with the size of the problem. Since we can determine in polynomial time whether or not a graph has a perfect b-matching, at most |V| applications of this algorithm will determine whether or not a graph is b-bicritical. Thus there exists a priori a good characterization of which graphs are b-bicritical (connectivity and nontriviality being easy to check). However, Theorem 4.3 gives a good characterization of weakly b-bicritical graphs (and hence of b-bicritical graphs) independent of the b-matching algorithm or Theorem 4.2. Thus we can show that a graph G is weakly b-bicritical by exhibiting |V| b-matchings, one for each simple reduction of b. We can show that G is not weakly b-bicritical by exhibiting a single nonempty X ⊆ V which violates (4.4).

5. Dual integrality with unit edge costs

We now turn to the main subject of this paper: determining when we require a fractional optimal dual solution if all edge costs are equal to 1. First we observe that in this case, Theorem 3.2 specializes to the following.

Theorem 5.1. If c_j = 1 for all j ∈ E, then there exists an optimal dual solution y* satisfying

y*_S ∈ {0, 1} for all S ∈ Q, (5.1)
y*_i ∈ {0, ½, 1} for all i ∈ V. (5.2)
In fact, when all edge costs are equal to 1, optimal discrete dual solutions have a particularly simple structure, as described in the following lemmas.

Lemma 5.2. Let y* be an optimal discrete dual solution. Then for any S ∈ Q such that y*_S = 1 there exists at most one i ∈ S such that y*_i > 0, and if such an i exists, then b_i = 1.

Proof. Let X = {i ∈ S: y*_i = 1} and Y = {i ∈ S: y*_i = ½}. We define a vector y' by

y'_i = 1 if i ∈ X, ½ if i ∈ S − X, y*_i if i ∈ V − S;
y'_T = 0 if T = S, y*_T if T ∈ Q − {S}.

Then y' is easily seen to be a discrete dual solution. Moreover its value exceeds that of y* by

½ b(S − (X ∪ Y)) − ½(b(S) − 1) = ½(1 − b(X) − b(Y)).
Since y* is optimal, this value is nonnegative. Hence b(X) + b(Y) ≤ 1. Since b_i ≥ 1 for all i, the result now follows.

Lemma 5.3. Let y* be an optimal discrete dual solution. Then for any S, T ∈ Q such that y*_S = y*_T = 1 we have b(S ∩ T) ≤ 2.

Proof. We define a feasible dual solution y to the problem (3.4), (3.5) and (3.10) by letting y_i = y*_i for all i ∈ V, and
y_X = 0 if X = S or T,
y_X = 1 if X = S ∪ T,
y_X = y*_X otherwise.

The value of y exceeds that of y* by

⌊b(S ∪ T)/2⌋ − ½(b(S) − 1) − ½(b(T) − 1) ≤ ½(−b(S ∩ T) + 2).

This value must be nonnegative, so b(S ∩ T) ≤ 2.

Fig. 4 gives examples to show that the bounds given in Lemmas 5.2 and 5.3 can in fact be achieved. If G is b-bicritical, then an optimal discrete dual solution is uniquely determined: the only optimal y* is obtained by letting y*_i = ½ for all i ∈ V and letting y*_S = 0 for all S ∈ Q. To prove this we first establish the following lemma.
Lemma 5.4. Let G be weakly b-bicritical. Then for any optimal dual solution y*, we have y*_i ≤ ½ for all i ∈ V.

Proof. If G is trivial, then the zero vector is the unique optimal dual solution, so assume G is nontrivial. Suppose for some v ∈ V, y*_v > ½. Since G has a perfect b-matching (by Lemma 4.1), the value of the dual solution y* is ½ b(V). If b' is a simple reduction of b at v, then we obtain a feasible dual solution y to the problem (3.4), (3.5), (3.10) with b' by extending the domain of y* to Ω by letting

Fig. 4. Sample optimal dual solutions (solid edges j: x_j = 1; broken edges j: x_j = 0).
y_S = 0 for all S ∈ Ω − Q. Its value is ½ b(V) − 2y*_v, which is less than ½ b(V) − 1. But since G is weakly b-bicritical, there exists a solution to (3.4), (3.5), (3.10) with b' having value ½ b(V) − 1. Thus we have contradicted linear programming duality.

We now give a sufficient condition for a graph to require a fractional dual solution.

Theorem 5.5. If G is a b-bicritical graph, then the only optimal discrete dual solution y* is obtained by letting y*_i = ½ for all i ∈ V and y*_S = 0 for all S ∈ Q.

Proof. It will be sufficient to show that

y*_S = 0 for all S ∈ Q, (5.3)

for since G has a perfect b-matching (by Lemma 4.1), by linear programming duality the value of y* is ½ b(V). Thus Lemma 5.4 and (5.3) imply y*_i = ½ for all i ∈ V.

Let Q⁺ = {S ∈ Q: y*_S = 1} and suppose Q⁺ ≠ ∅. (Note that by Theorem 5.1, y*_S = 0 for all S ∈ Q − Q⁺.) Since b_i ≥ 2 for all i ∈ V, we have by Lemma 5.2 that

y*_i = 0 for all i ∈ S, for all S ∈ Q⁺. (5.4)
Let j ∈ δ(S) for some S ∈ Q⁺. If there is no T ∈ Q⁺ such that j ∈ γ(T), then y*(ψ(j)) + y*(Γ(j)) = y*(ψ(j)) ≤ ½ by Lemma 5.4 and (5.4). But this contradicts the feasibility of y*, so

for any S ∈ Q⁺ and any j ∈ δ(S), there exists T ∈ Q⁺ such that j ∈ γ(T). (5.5)

By Lemma 5.3 we have, for any S, T ∈ Q⁺,

b(S ∩ T) ≤ 2 and |S ∩ T| ≤ 1. (5.6)

Let S ∈ Q⁺. Then since b(S) is odd but b(V) is even and G is connected, by (5.5) and (5.6) there are v ∈ S and T ∈ Q⁺ − {S} such that {v} = S ∩ T and b_v = 2. If v is the only member of S belonging to a member of Q⁺ − {S}, then by (5.5) v is a cutnode of G. Hence if we reduce b_v by 2, G[S − {v}] is "disconnected" from the rest of G. But b(S − {v}) is odd, so no perfect matching can exist, contradicting the fact that G is b-bicritical. Hence there exist u ∈ S − {v} and U ∈ Q⁺ − {S, T} such that b_u = 2 and {u} = S ∩ U. See Fig. 5.

Now let x be a perfect matching of G. By complementary slackness (Theorem 3.3) x is a np-matching of G[T], so there is some j ∈ γ(T) ∩ δ(v) for which x_j ≥ 1. Similarly x is a np-matching of G[U], so there is some k ∈ δ(u) ∩ γ(U) for which x_k ≥ 1. But {j, k} ⊆ δ(S), so x(δ(S)) ≥ 2 and hence x(γ(S)) ≤ ½(b(S) − 2). Thus x cannot be near perfect on G[S], which contradicts Theorem 3.3 for S, since we assumed x and y* to be optimal. Therefore Q⁺ = ∅, so (5.3) is established and the result follows.
Fig. 5.
For any feasible dual solution y, we let ω(G; y, b) denote the value of this dual solution relative to the dual objective function. That is,

ω(G; y, b) = Σ (y_i b_i: i ∈ V) + Σ (y_S q_S: S ∈ Q).

(Since q_S = ½(b(S) − 1) we do not include q as a parameter.) If we omit the first parameter, then the graph G is understood, i.e., ω(y, b) ≡ ω(G; y, b). We now establish some structural properties of graphs that do not possess an integer optimal dual solution. We frequently wish to "combine" dual solutions in the following fashion. Let y be an optimal discrete dual solution and let X = {i ∈ V: y_i = ½}. Let ȳ be any feasible dual solution to the b-matching problem restricted to G[X]. The modification of y by ȳ is defined to be the vector ŷ given by

ŷ_i = ȳ_i if i ∈ X, y_i if i ∈ V − X;
ŷ_S = ȳ_S if S ⊆ X, y_S if S ⊄ X.
Proposition 5.6. The modification of y by ȳ is dual feasible.

Proof. We must show that ŷ(ψ(j)) + ŷ(Γ(j)) ≥ 1 for all j ∈ E. This is obvious for j ∈ γ(X) ∪ γ(V − X). Let j ∈ δ(X). Then ŷ(ψ(j)) + ŷ(Γ(j)) ≥ y(ψ(j)) + y(Γ(j)) − ½. But since exactly one end of j had a fractional y-value, we must have had y(ψ(j)) + y(Γ(j)) ≥ 1½ since y was dual feasible. The result now follows.

By Lemma 5.2, if S ∈ Q and S ⊆ X, then y_S = 0. The following is now immediate.

Proposition 5.7. ω(ŷ, b) = ω(y, b) − ½ b(X) + ω(G[X]; ȳ, b).

As a result of Proposition 5.7 we have the following "local optimality" condition.
Lemma 5.8. Let y be an optimal discrete dual solution to a b-matching problem and let Z = {i ∈ V: y_i = ½}. Let G[H] be any component of G[Z]. If we let ȳ be the restriction of y to H ∪ {S ∈ Q: S ⊆ H}, then ȳ is an optimal dual solution to the b-matching problem on G[H]. Moreover, for any other optimal dual solution ỹ to the b-matching problem on G[H], the modification of y by ỹ is an optimal dual solution to the b-matching problem on G.

Proof. Clearly ȳ is a feasible dual solution to the b-matching problem on G[H], and the modification of y by ȳ is of course just y. Now let ỹ be any feasible dual solution to the b-matching problem on G[H]. All we need show is that if we let ŷ be the modification of y by ỹ, then ŷ is a feasible dual solution to the original problem; the result will then follow from Proposition 5.7. Let j ∈ E. If j ∈ γ(H) or j ∈ γ(V − H), then clearly ŷ(ψ(j)) + ŷ(Γ(j)) ≥ 1. If j ∈ δ(H), then since H is the node set of a component of G[Z], if {v} = ψ(j) − H, then v ∈ V − Z. Then, as we saw when proving Proposition 5.6, y(ψ(j)) + y(Γ(j)) ≥ 1½, and so ŷ(ψ(j)) + ŷ(Γ(j)) ≥ ŷ_v + ŷ(Γ(j)) ≥ 1.

We now give necessary conditions for a b-matching problem to have no integer optimal dual solution.

Theorem 5.9. Let G and b be such that there exists no integer optimal dual solution. Then there exists X ⊆ V such that

b_i ≥ 2 for all i ∈ X, (5.7)
x(δ(i)) = b_i for all i ∈ X, for every optimal matching x, (5.8)
at least one component of G[V − X] is b-bicritical, (5.9)
if Z is the set of nodes in the b-bicritical components of G[V − X], then for every optimal matching x, x_j = 0 for all j ∈ δ(Z) ∪ γ(X), (5.10)
there is an optimal discrete dual solution y* which is fractional for precisely those nodes in Z. (5.11)
See Fig. 6.

Proof. Let y* be an optimal discrete dual solution for which Z = {i ∈ V: y*_i = ½} is minimal. Since there exists no integer optimal dual solution, Z ≠ ∅. Let X = N(Z). Clearly (5.11) is satisfied; we show that (5.7)-(5.10) are also satisfied for X and Z so defined.

Let G[H] be a connected component of G[Z]. We wish to show that G[H] is b-bicritical. By Lemma 5.8, the restriction of y* to H ∪ {S ∈ Q: S ⊆ H} is an optimal dual solution to the b-matching problem on G[H]. The value of this solution, ½ b(H), must be integral and hence b(H) must be even.
Fig. 6. Structure when there exists no integer optimal dual solution. (Labels: x(δ(i)) = b_i for every optimal matching; x_j = 0 for every optimal matching.)
Suppose G[H] is not b-bicritical. If G[H] is trivial, then let v be the single node of G[H], for which b_v = 2. Since y* is optimal and y*_v = ½, we must have δ(v) ≠ ∅. Let j ∈ δ(v), where ψ(j) = {u, v}. Since u ∉ Z we have y*(ψ(j)) + y*(Γ(j)) ≥ 1½. Thus we can reduce y*_v to zero, preserving feasibility and contradicting the optimality of y*. Hence G[H] is nontrivial, so by Theorem 4.3 there exists nonempty M ⊆ H such that

b(M) < b(C⁰(M)) + |C¹(M)| + 2,

where C⁰, C¹ and C² are taken relative to G[H]. We define a dual solution ỹ to the b-matching problem on G[H] (using Ω instead of Q) by

ỹ_i = 1 if i ∈ M,
ỹ_i = ½ if i ∈ S for some S ∈ C²(M),
ỹ_i = 0 if i ∈ H − (M ∪ ⋃{S: S ∈ C²(M)}),
ỹ_S = 1 if S ∈ C¹(M),
ỹ_S = 0 if S ∈ Ω − C¹(M).
It is easily checked that ỹ is dual feasible. Moreover

ω(G[H]; ỹ, b) = b(M) + ½ Σ (b(S): S ∈ C²(M)) + Σ (½(b(S) − 1): S ∈ C¹(M))
= ½ b(M) + ½ b(M) − ½ |C¹(M)| + ½ Σ (b(S): S ∈ C¹(M) ∪ C²(M))
< ½ b(M) + ½ b(C⁰(M)) + ½ |C¹(M)| + 1 − ½ |C¹(M)| + ½ Σ (b(S): S ∈ C¹(M) ∪ C²(M))
= ½ b(H) + 1.

Thus ω(G[H]; ỹ, b) ≤ ½ b(H) = ω(G[H]; y*, b). Moreover, {i ∈ H: ỹ_i = ½} ⊊ H. It may happen that ỹ_S = 1 for some S ∈ Ω − Q; in this case we can simply find an optimal discrete dual solution y' to the b-matching problem on G[S] and make ỹ conform with y' on S ∪ {T ∈ Ω: T ⊆ S}. Thus we are able to obtain an optimal discrete dual solution to the b-matching problem on G[H] having a smaller set of fractional components than y*. Hence if we modify y* by this solution, we obtain an optimal discrete dual solution which contradicts the minimality of Z. Hence G[H] is b-bicritical.

If some component G[H] of G[V − (X ∪ Z)] were b-bicritical, then by
Theorem 5.5 the unique optimal discrete dual solution to the b-matching problem on G[H] is obtained by defining y_i = ½ for all i ∈ H. We could then perform the obvious modification to y*, which would contradict the optimality of y*. (Since y*_i = 1 for all i ∈ X, and since j ∈ δ(H) implies j ∈ δ(X), we know this modification will be dual feasible.) Thus G[Z] consists precisely of the b-bicritical components of G[V − X], and at least one such component exists, establishing (5.9).

By Lemma 5.2 there can be no S ∈ Q such that S ∩ Z ≠ ∅ and y*_S = 1, for since every component of G[Z] is b-bicritical, we must have b_i ≥ 2 for all i ∈ Z. Therefore, for y* to be feasible, we must have y*_i = 1 for all i ∈ X. Therefore (5.8) follows from complementary slackness (Theorem 3.3). Moreover,
y*(ψ(j)) + y*(Γ(j)) > 1 for all j ∈ δ(Z) ∪ γ(X), so again by complementary slackness, x_j = 0 for any such j in any optimal matching, establishing (5.10).

Finally, suppose b_i = 1 for some i ∈ X. Let H be the set of nodes in components of G[Z] which contain a node adjacent to i. We show that G[H ∪ {i}] is b-critical; that is, we show that there is a np-matching x of G[H ∪ {i}] deficient at any prescribed v ∈ H ∪ {i}. This follows from Lemma 4.1 for v = i, since G[H] is nontrivial and weakly b-bicritical. For v ∈ N(i) we can find a b-matching x of G[H] which is perfect at each u ∈ H − {v} and which has a deficiency of 2 at v (since G[H] is weakly b-bicritical); then by setting x_j = 1 for the edge j joining v and i we get the desired result. Finally, for v ∈ H − N(i), let u ∈ N(i) belong to the same component of G[Z] as v. Construct a b-matching x perfect for H − {u, v} and having a deficiency of 1 at each of u and v (which is possible by Theorem 4.6), and set x_j = 1 for the edge j joining u and i to get the desired result. Hence G[H ∪ {i}] is b-critical. Thus we can modify y* by letting y*_v = 0 for v ∈ H ∪ {i} and y*_{H∪{i}} = 1 and obtain a discrete dual solution whose value is no greater but which has fewer fractional components, contradicting our choice of y*. Hence b_i ≥ 2 for all i ∈ X; (5.7) is established and the proof is complete.

Unfortunately, if not all the b_i are even, conditions (5.7)-(5.11) are not sufficient to require a fractional optimal dual solution, as can be seen from Fig. 7.
Fig. 7. Example of insufficiency of (5.7)-(5.11).
Although in this case an optimal dual solution y* is obtained by letting y*_i = ½ for i ∈ Z, y*_i = 1 for i ∈ X and y*_i = 0 for i ∈ V − (Z ∪ X), we can in fact find an integer optimal dual solution y by letting y_a = 1, y_S = 1, and all other components of y be 0. In general, the problem is to determine when a b-bicritical graph is contained in a b-critical graph whose dual variable can be used to keep the nodal dual variables in the b-bicritical part from becoming fractional. Thus a question which is still open is that of finding necessary and sufficient conditions for a graph to require a fractional optimal dual solution when all edge costs are 1. However, if b_i is even for all i ∈ V, then we obtain the following converse results.
Theorem 5.10. Suppose b_i is even for all i ∈ V and let Z ⊆ V be such that G[Z] is b-bicritical and x_j = 0 for every j ∈ δ(Z), for every maximum matching x. Then for any optimal dual solution y* we have y*_i = ½ for all i ∈ Z.

Proof. Any maximum b-matching of G[Z] can be extended to a maximum b-matching of G, since every maximum matching x of G has x_j = 0 for all j ∈ δ(Z). Since G[Z] is b-bicritical, by (4.5) and Theorem 4.6, for each edge j ∈ γ(Z) there exists a perfect b-matching x of G[Z] such that x_j > 0. Therefore, by complementary slackness (Theorem 3.3), y*(ψ(j)) = 1 for all j ∈ γ(Z). (Since b_i is even for all i ∈ V, Q = ∅.) Thus the restriction of y* to Z is a solution to this system of equations; but since G[Z], being b-bicritical, is connected and nonbipartite, we have y*_i = ½ for all i ∈ Z.

Thus, when b is even, we have the following characterization of optimal dual solutions with a minimal set of fractional components.
Theorem 5.11. Suppose b_i is even for all i ∈ V and let y* be an optimal discrete dual solution such that the subgraph of G induced by Z = {i ∈ V: y*_i = ½} is nontrivial and weakly b-bicritical. Then y* has a minimal set of fractional components.

Proof. All we need show is that x_j = 0 for every j ∈ δ(Z), for every maximum b-matching x; the result then follows from an application of Theorem 5.10 to each component of G[Z]. Let j ∈ δ(Z) and let {u, v} = ψ(j), where u ∈ Z and v ∈ V − Z. Then y*_u = ½, and since y* is feasible and y*_v is integer valued, we must have y*_v = 1. But then y*(ψ(j)) > 1, so by complementary slackness (Theorem 3.3) x_j = 0 for every optimum b-matching x.

We close this section by noting the following corollaries of Theorem 5.9.
Corollary 5.12. If G is a minimal graph which does not have an integer optimal dual solution, then G is b-bicritical.
Corollary 5.13. If bi = 1 for all i E V, then there exists an integer optimal dual solution.
6. Arbitrary integer edge costs

Now we give some results concerning b-matching problems in which the edge costs are arbitrary integers. First we note that for any edge j such that c_j ≤ 0, we can assume x_j = 0 in any optimal matching x. Thus we can ignore all such edges. Moreover, any feasible dual solution to the problem obtained by deleting these edges will also be a feasible dual solution to the problem with them present. Hence we assume that all edge costs are positive integers.
Proposition 6.1. If c_j is even for all j ∈ E, then there exists an integer optimal dual solution.
Proof. Let y* be an optimal discrete dual solution to the problem obtained by replacing c_j with ½ c_j for all j ∈ E. Then 2y* is an integer optimal dual solution to the original problem.
We can also obtain the following analogue of Theorem 5.9.
Theorem 6.2. Let G, b and c be such that there does not exist an integer optimal dual solution. Then there exists Z ⊆ V such that every component of G[Z] is b-bicritical and there exists an optimal dual solution y* which is fractional for precisely the nodes in Z.
Proof. Let y* be an optimal discrete dual solution for which Z = {i ∈ V: y*_i is fractional} is minimal. Then Z ≠ ∅. We show that every component of G[Z] is b-bicritical. Let y = ⌊y*⌋; that is, we truncate the fractional members of y*. Then y is not a feasible dual solution, but for any edge j such that y(ψ(j)) + y(Γ(j)) < c_j we will have j ∈ γ(Z) and y(ψ(j)) + y(Γ(j)) = c_j − 1. Thus if we add to y any feasible dual solution y' for the b-matching problem on G[Z], taking all edge costs to be 1, we will obtain a feasible dual solution ŷ to the original problem. Moreover the only nodes i for which ŷ_i will be fractional are those nodes i in Z for which y'_i is fractional. Since Z is minimal, the only discrete optimal dual solution y' to the problem on G[Z] is obtained by letting y'_i = ½ for all i ∈ Z and y'_S = 0 for all S. Therefore it follows from Theorem 5.9 that G[Z] consists of b-bicritical components.
Corollary 6.3. If b_i = 1 for all i ∈ V, then for any integer objective function c there exists an integer optimal dual solution. (This has also been proved by Cunningham and Marsh [3].)
7. Restriction to facet dual variables
We mentioned in Section 3 that there may exist i ∈ V such that x(δ(i)) ≤ b_i is not a facet of the matching polytope, and S ∈ Q such that x(γ(S)) ≤ q_S is not a facet of this polytope. We conclude by showing that the corresponding superfluous dual variables we have used do not affect the problem of finding an integer optimal dual solution. Let Q' = {S ∈ Q: x(γ(S)) ≤ q_S is a facet} and V' = {i ∈ V: x(δ(i)) ≤ b_i is a facet}. All we need show is that if we can obtain an integer optimal dual solution using dual variables corresponding to members of V and Q, then we can also find an integer optimal dual solution using dual variables corresponding to members of V' and Q'. The following characterizes Q'.
Theorem 7.1 (Pulleyblank [7]). Q' = {S ∈ Q: G[S] contains no cutnode v for which b_v = 1}.

Thus for any S ∈ Q − Q', G[S] decomposes into "weak" blocks, i.e., S = S₁ ∪ S₂ ∪ ⋯ ∪ S_k where for each i ∈ {1, 2, …, k}, S_i ∈ Q'; if S_i ∩ S_j ≠ ∅, then b(S_i ∩ S_j) = |S_i ∩ S_j| = 1; and ⋃_{i=1}^{k} γ(S_i) is a partition of γ(S). (See Pulleyblank [7] for details.) Hence if an optimal dual solution y* had y*_S = ε, we could instead have set y*_{S_i} = ε for i ∈ {1, 2, …, k} and y*_S = 0. This is easily seen to be feasible, and a straightforward induction on k shows that

b(S) − 1 = Σ_{i=1}^{k} (b(S_i) − 1),

so this modification will retain the same objective value.

Now we analyze nodal dual variables. A balanced edge is a graph consisting of two nodes u, v joined by one or more edges and satisfying b_u = b_v. The following characterizes the essential nodal dual variables.

Theorem 7.2 (Pulleyblank [7]). V − V' = {i ∈ V:

b(N(i)) ≤ b_i, and if b(N(i)) = b_i, then i belongs to a component of G which is not a balanced edge, (7.1)

or

b(N(i)) = b_i + 1 and γ(N(i)) ≠ ∅}. (7.2)

See Fig. 8.
Fig. 8. Inessential nodal dual variables (filled nodes do not correspond to primal facets).
Now suppose i ∈ V − V' and y*_i = ε > 0 for an optimal dual solution y*. If (7.1) applies, we simply set y*_i to 0 and add ε to the dual variables of the nodes in N(i). If (7.2) applies, we set y*_i to 0 and add ε to the dual variable y*_S, where S = N(i) ∪ {i}. (It is easily checked that G[N(i) ∪ {i}] is b-critical and has no cutnode v for which b_v = 1, and hence S ∈ Q'.)
Acknowledgements

The bulk of this work was carried out while the author was a research fellow at CORE, Université Catholique de Louvain, Belgium. I am indebted to Gérard Cornuéjols for several helpful discussions during the preparation of this paper, as well as to Martin Grötschel and an anonymous referee for careful readings of an earlier version and suggested improvements. This work was supported in part by the National Research Council of Canada.
References

[1] A. Aho, J. Hopcroft and J. Ullman, The design and analysis of computer algorithms (Addison-Wesley, Reading, MA, 1974).
[2] C. Berge, "Sur le couplage maximum d'un graphe", Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences 247 (1958) 258-259.
[3] W.H. Cunningham and A.B. Marsh III, "A primal algorithm for optimum matching", Mathematical Programming Study 8 (1978) 50-72.
[4] J. Edmonds, "Paths, trees, and flowers", Canadian Journal of Mathematics 17 (1965) 449-467.
[5] L. Lovász, "On the structure of factorizable graphs", Acta Mathematica Academiae Scientiarum Hungaricae 23 (1972) 179-195.
[6] L. Lovász and M.D. Plummer, "On bicritical graphs", Colloquia Mathematica Societatis János Bolyai 10 (1973) 1051-1079.
[7] W. Pulleyblank, "The faces of matching polyhedra", Thesis, University of Waterloo (Waterloo, Ont., 1974).
[8] W. Pulleyblank, "Minimum node covers and 2-bicritical graphs", Mathematical Programming 17 (1979) 91-103.
[9] W.T. Tutte, "The factors of graphs", Canadian Journal of Mathematics 4 (1952) 314-328.
Mathematical Programming Study 12 (1980) 197-205. North-Holland Publishing Company
A TECHNIQUE FOR DETERMINING BLOCKING AND ANTI-BLOCKING POLYHEDRAL DESCRIPTIONS

H.-C. HUANG
Nanyang University, Republic of Singapore

L.E. TROTTER, Jr.
Cornell University, Ithaca, NY, U.S.A.

Received 24 October 1977
Revised manuscript received 7 March 1978

A general method is described for determining blocking and anti-blocking polyhedra related to any combinatorial family given as the extreme points of a polyhedron. This technique is illustrated in detail for the common independent sets of two matroids.
Key words: Blocking Polyhedra, Anti-Blocking Polyhedra, Matroids, Matroid Intersection.
1. Introduction

Suppose M is a nonnegative rational matrix with n columns and let ℬ = {x ∈ Rⁿ₊: Mx ≥ 1}, where Rⁿ₊ denotes the nonnegative orthant of Rⁿ and 1 is a vector of appropriate dimension, each of whose components is one. Fulkerson has shown in [6, 8] that associated with each such polyhedron ℬ is a unique blocking polyhedron ℬ̄ = {x ∈ Rⁿ₊: Σⱼ₌₁ⁿ yⱼxⱼ ≥ 1, ∀y ∈ ℬ}. Furthermore, ℬ̄ is of the same form as ℬ; i.e., ℬ̄ = {x ∈ Rⁿ₊: Bx ≥ 1} for a nonnegative matrix B. Also, the blocking polyhedron of ℬ̄ is ℬ itself. Any such nonnegative matrix B is called a blocking matrix for M. (Observe that we do not require the elimination of inessential rows for matrices M and B; i.e., M and B need not be "proper", as in [6].) Analogously, when M has no zero columns, so that 𝒜 = {x ∈ Rⁿ₊: Mx ≤ 1} is bounded, Fulkerson [7, 8] associates with 𝒜 the unique anti-blocking polyhedron 𝒜̄ = {x ∈ Rⁿ₊: Σⱼ₌₁ⁿ yⱼxⱼ ≤ 1, ∀y ∈ 𝒜}. Here too, the anti-blocking polyhedron of 𝒜̄ is 𝒜, and 𝒜̄ = {x ∈ Rⁿ₊: Ax ≤ 1} for a nonnegative matrix A with no zero columns, called an anti-blocking matrix for M. Thus the blocking relation pairs polyhedra of the form ℬ above, whereas the anti-blocking relation pairs polyhedra of the form 𝒜; these relations provide respective frameworks for viewing the paired occurrence of certain combinatorial theorems. For blocking polyhedra in Rⁿ₊ such results concern a maximum rational packing of extreme points of ℬ (or ℬ̄) into a vector w ∈ Rⁿ₊; for anti-blocking polyhedra in Rⁿ₊ attention is directed to a minimum rational covering of w ∈ Rⁿ₊ by extreme points of 𝒜 (or 𝒜̄). For specific examples the reader is referred to references [1, 2, 6-9, 11, 13-17]. Thus, given a polyhedron
198
H.-C. Huang and L.E. Trotter Jr. / Blocking and anti-blocking polyhedra
𝒫 ⊆ Rⁿ₊ whose extreme points correspond to some combinatorial family of interest, it is natural to attempt deducing rational packing and covering results for members of this family by determining a blocking polyhedron ℬ̄ for ℬ = {x ∈ Rⁿ₊: x ≥ y for some y ∈ 𝒫} and (when 𝒫 is bounded) an anti-blocking polyhedron 𝒜̄ for 𝒜 = {x ∈ Rⁿ₊: x ≤ y for some y ∈ 𝒫}. In this paper we describe a general method for determining linear systems which define the polyhedra ℬ̄ and 𝒜̄ when the polyhedron 𝒫 is given by any linear system. This technique is based on results which relate blocking and anti-blocking polyhedra to generalized circulations (see [14]) and on the algebraic operations of deletion, contraction and projection for manipulating blocking and anti-blocking polyhedra (see [8]). In the final section of the paper we illustrate this technique on a "matroid intersection" example.
2. Procedure

Suppose ℬ = {x ∈ Rⁿ₊: Mx ≥ 1}, where M is a nonnegative matrix with n columns, and let B be a blocking matrix for M, so that ℬ̄ = {x ∈ Rⁿ₊: Bx ≥ 1} is the blocking polyhedron of ℬ. When M has no zero columns, we define 𝒜 = {x ∈ Rⁿ₊: Mx ≤ 1} and let A be an anti-blocking matrix for M, with 𝒜̄ = {x ∈ Rⁿ₊: Ax ≤ 1} the anti-blocking polyhedron of 𝒜. By contraction of the jth column of M is meant the removal of column j from M; deletion of column j is the removal of this column along with all rows which have a positive entry in column j. In polyhedral terms, contraction of the jth column of M corresponds to intersecting ℬ with the hyperplane ℋ = {x ∈ Rⁿ: x_j = 0}, whereas deletion of column j corresponds to projecting ℬ onto ℋ. These operations are used on blocking pairs of polyhedra as described in the theorem below. Since for an anti-blocking polyhedron 𝒜, 0 ≤ y ≤ x ∈ 𝒜 implies y ∈ 𝒜, the projection of 𝒜 onto ℋ is identical to its intersection with ℋ. Thus for anti-blocking matrices we define the single matrix operation, projection of column j, as the removal of column j.

Theorem 1 (Fulkerson [6, 7]). Suppose matrices M, B, A and polyhedra ℬ, ℬ̄, 𝒜, 𝒜̄ are as described above. Also let matrices M', B', A' be obtained from M, B, A by respective contraction (projection), deletion and projection of column j, and define the resulting polyhedra:

ℬ' = {x ∈ Rⁿ⁻¹₊: M'x ≥ 1},
ℬ̄' = {x ∈ Rⁿ⁻¹₊: B'x ≥ 1},
𝒜' = {x ∈ Rⁿ⁻¹₊: M'x ≤ 1},
𝒜̄' = {x ∈ Rⁿ⁻¹₊: A'x ≤ 1}.
Then ℬ' and ℬ̄' are a blocking pair of polyhedra, and 𝒜' and 𝒜̄' are an anti-blocking pair of polyhedra.

Suppose N is a real matrix with n columns and suppose we are given n
intervals I_j, 1 ≤ j ≤ n, where I_j = [a_j, b_j] with 0 ≤ a_j ≤ b_j. We list as the rows of matrix M the extreme points of the polyhedron 𝒫 = {x ∈ Rⁿ: Nx = 0 and x_j ∈ I_j, ∀j}. Theorem 2 below, which specifies blocking and anti-blocking matrices for M, is proved in [14]. We recall that the support of the vector k ∈ Rⁿ is the set {j: k_j ≠ 0}; K₊ = {j: k_j > 0} and K₋ = {j: k_j < 0} denote respectively the positive and negative support of k. Also denote the row space of N by ℛ and recall that nonzero vectors of ℛ with minimal support are termed elementary vectors of ℛ. We denote by ℰ(ℛ) the collection of all elementary vectors of ℛ.

Theorem 2. (i) The blocking polyhedron of ℬ = {x ∈ Rⁿ₊: Mx ≥ 1} is given by

{x ∈ Rⁿ₊: x_j ≥ a_j, j = 1, …, n;
    Σ_{j∈J} k_j x_j ≥ Σ_{j∈K₋} (−k_j) a_j − Σ_{j∈K₊\J} k_j b_j, k ∈ ℰ(ℛ), J ⊆ K₊}.

(ii) If M has no zero columns, the anti-blocking polyhedron of 𝒜 = {x ∈ Rⁿ₊: Mx ≤ 1} is given by

{x ∈ Rⁿ₊: 0 ≤ x_j ≤ b_j, j = 1, …, n;
    Σ_{j∈J} k_j x_j ≤ Σ_{j∈K₋} (−k_j) b_j − Σ_{j∈K₊\J} k_j a_j, k ∈ ℰ(ℛ), J ⊆ K₊}.
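Theorem 2(i) can be sanity-checked on a minimal instance (a toy example of ours, not taken from [14]). Take N = (1, −1) with I₁ = I₂ = [1, 2], so 𝒫 = {(t, t): 1 ≤ t ≤ 2} has extreme points (1, 1) and (2, 2). Up to scaling, the elementary vectors of ℛ are ±(1, −1), and with J = K₊ the elementary-vector constraints reduce to the lower bounds x₁ ≥ 1, x₂ ≥ 1; the blocking polyhedron of ℬ is the dominant of conv{(1, 1), (2, 2)}:

```python
def in_theorem2_description(x):
    # Theorem 2(i) with a = (1, 1), b = (2, 2): bound constraints x_j >= a_j;
    # for k = (1, -1) and J = K+ = {1} the constraint reads x_1 >= a_2 = 1,
    # and symmetrically x_2 >= a_1 = 1 for k = (-1, 1).
    return x[0] >= 1.0 and x[1] >= 1.0

def in_blocking_polyhedron(x):
    # The blocker is the dominant of conv{(1,1), (2,2)}: x dominates the
    # point (2 - s, 2 - s) for some s in [0, 1] exactly when min(x) >= 1.
    return min(x) >= 1.0

grid = [(i / 2, j / 2) for i in range(9) for j in range(9)]
print(all(in_theorem2_description(x) == in_blocking_polyhedron(x)
          for x in grid))  # True
```

The two membership tests agree on every grid point, as the theorem predicts for this instance.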
Notice that the description of the blocking polyhedron of ℬ and the anti-blocking polyhedron of 𝒜 in Theorem 2 using elementary vectors of ℛ actually requires only finitely many constraints, since there are only finitely many vectors of ℰ(ℛ) up to scalar multiplication. Also observe that we could broaden the stipulation in Theorem 2 that k ∈ ℰ(ℛ) to any k ∈ ℛ. This follows from the fact (see [5]) that any k ∈ ℛ is a conformal sum of vectors in ℰ(ℛ); i.e., k = k¹ + ⋯ + kᵐ, where for 1 ≤ i ≤ m, kⁱ ∈ ℰ(ℛ), Kⁱ₊ ⊆ K₊ and Kⁱ₋ ⊆ K₋.

Theorem 2 provides a means for determining both blocking and anti-blocking results for a given matrix M, when M is the list of extreme points of a polyhedron 𝒫 as described above. This situation is completely general in the following sense (see also [1]). Suppose we wish to determine blocking and anti-blocking results for the extreme points of an arbitrary bounded polyhedron 𝒫' ⊆ Rⁿ₊, given as the solution set of a system of linear equalities and inequalities. Then it is routine to transform the representation of 𝒫' into the form of 𝒫 above by the addition of slack variables and appropriate coordinate bounds. (If 𝒫' is unbounded in coordinate j, we take b_j arbitrarily large in Theorem 2; see [14].) Furthermore, an elementary argument establishes a one-to-one correspondence between the extreme points of 𝒫' and those of 𝒫. That is, the matrix M', whose rows are the extreme points of 𝒫', is a full-rowed submatrix of M, whose rows are the extreme points of 𝒫. Thus we may apply
Theorem 2 to obtain a blocking matrix B for M and an anti-blocking matrix A for M. Since M' is obtained from M by contracting (projecting) certain of its columns, Theorem 1 implies that a blocking matrix B' for M' and an anti-blocking matrix A' for M' may be obtained, respectively, by using the column deletion operation on B and by projecting certain columns of A. We next illustrate this process in detail with an example.
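The column operations that drive this process are simple to state on a matrix held as a list of rows; a minimal sketch (the function names are ours):

```python
def contract(M, j):
    """Contraction of column j: remove column j from M (all rows kept)."""
    return [row[:j] + row[j + 1:] for row in M]

def delete(M, j):
    """Deletion of column j: remove column j together with every row
    having a positive entry in that column."""
    return [row[:j] + row[j + 1:] for row in M if row[j] <= 0]

M = [[1, 0, 2],
     [0, 1, 0],
     [3, 1, 0]]
print(contract(M, 2))  # [[1, 0], [0, 1], [3, 1]]
print(delete(M, 2))    # [[0, 1], [3, 1]]
```

Deletion discards the first row because its entry in column 2 is positive; contraction keeps every row.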
3. Example

In [3] Edmonds has demonstrated that the incidence vectors of common independent sets of maximum cardinality for two matroids defined on the set E are the extreme points of the following polyhedron $\mathcal{P}' \subseteq R^E_+$:

$$z(S) \le r_1(S), \quad S \subseteq E,$$
$$z(T) \le r_2(T), \quad T \subseteq E,$$
$$z(E) = r_0(E),$$
$$z_j \ge 0, \quad j \in E, \tag{1}$$
where $z(U)$, $r_1(U)$, $r_2(U)$ and $r_0(U)$ denote respectively, for $U \subseteq E$: $\sum_{j \in U} z_j$; the maximum size of a subset of U which is independent in the first matroid; similarly for the second matroid; and the maximum size of a set contained in U which is independent in both matroids. Rewriting (1) in a form compatible with Theorem 2 gives the polyhedron $\mathcal{P}$ defined by (2) and (3) below:

$$z(S) + s_S - r_1(S)y = 0, \quad S \subseteq E,$$
$$z(T) + t_T - r_2(T)y = 0, \quad T \subseteq E,$$
$$z(E) - r_0(E)y = 0, \tag{2}$$

$$0 \le z_j \le b_j, \quad j \in E,$$
$$0 \le s_S \le b_S, \quad S \subseteq E,$$
$$0 \le t_T \le b_T, \quad T \subseteq E,$$
$$1 \le y \le b_y. \tag{3}$$
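As a concrete illustration of the polyhedral description (1), it can be checked by brute force on a small instance; the ground set, the two partition matroids and all names below are illustrative choices, not taken from the text:

```python
from itertools import chain, combinations

E = [0, 1, 2, 3]

def subsets(U):
    return chain.from_iterable(combinations(U, k) for k in range(len(U) + 1))

# Two illustrative partition matroids on E: a set is independent if it
# contains at most one element from each block of the partition.
blocks1 = [{0, 1}, {2, 3}]
blocks2 = [{0, 2}, {1, 3}]

def indep(S, blocks):
    return all(len(set(S) & b) <= 1 for b in blocks)

def rank(U, blocks):
    # r(U): maximum size of an independent subset of U
    return max(len(S) for S in subsets(U) if indep(S, blocks))

def rank0(U):
    # r0(U): maximum size of a common independent subset of U
    return max(len(S) for S in subsets(U)
               if indep(S, blocks1) and indep(S, blocks2))

# Pick a maximum common independent set and form its incidence vector z.
best = max((S for S in subsets(E) if indep(S, blocks1) and indep(S, blocks2)),
           key=len)
z = [1 if j in best else 0 for j in E]

# z satisfies every constraint of system (1).
assert all(sum(z[j] for j in S) <= rank(S, blocks1) for S in subsets(E))
assert all(sum(z[j] for j in T) <= rank(T, blocks2) for T in subsets(E))
assert sum(z) == rank0(E)
```

Edmonds' theorem says more, of course: every extreme point of (1) arises this way, which the sketch does not attempt to verify.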
The conjecture was shown true for the special case of transversal matroids by Weinberger [15, 17]. The general form of the conjecture was established independently by McDiarmid [13], Cunningham [2] and Edmonds and Giles [4]. The derivation which follows, based on the approach of Section 2, is due to Huang [11]. McDiarmid [13] also determined an anti-blocking matrix for $M'$; the anti-blocking analogue of the derivation below also yields this result.

Let $n = |E| + 2^{|E|+1} + 1$ and let N denote the matrix with n columns whose rows are determined by the constraints (2), with $\mathcal{L}$ the row space of N. Also denote, as in Theorem 2,

$$\mathcal{B} = \{x \in R^n_+ : Mx \ge 1\},$$
$$\mathcal{B}' = \{z \in R^E_+ : M'z \ge 1\},$$

and let $\hat{\mathcal{B}}$ and $\hat{\mathcal{B}}'$ denote the respective blocking polyhedra for $\mathcal{B}$ and $\mathcal{B}'$. Theorem 2 indicates that $\hat{\mathcal{B}}$ is described by lower bound constraints and elementary vector constraints. After deleting the s, t, y coordinates, the lower bound constraints simply become nonnegativity restrictions on the z coordinates. Furthermore, in any constraint for $\hat{\mathcal{B}}$ determined by $k \in \mathcal{E}(\mathcal{L})$, $K_-$ must contain the coordinate corresponding to the variable y. Otherwise the right-hand side of this constraint is nonpositive, and hence the constraint is inessential (see [6]) for defining $\hat{\mathcal{B}}$, and it need not be considered in determining $\hat{\mathcal{B}}'$. It then follows from the fact that upper bound components may be taken arbitrarily large for all remaining coordinates that an elementary vector constraint is also inessential for defining $\hat{\mathcal{B}}$ unless $K_+ \setminus J = \emptyset$, i.e., unless $J = K_+$. Finally, the deletion operation requires that we omit such a constraint in determining the blocking matrix for $M'$ if it contains a positive coefficient in any coordinate corresponding to an s or t variable. Thus the general form of those elementary vector constraints defining $\hat{\mathcal{B}}$ which are useful in determining $\hat{\mathcal{B}}'$ is

$$\sum_{j \in K_+} k_j z_j \ge -k_y, \tag{4}$$

where $k_y$ denotes the component of k corresponding to the variable y. In the following we assume given a vector $k \in \mathcal{E}(\mathcal{L})$ which determines a constraint of the form (4) that is essential for defining $\hat{\mathcal{B}}$, and we show that after deleting the coordinates corresponding to the s, t and y variables, this constraint is of a specific form (see (9) below) which defines the polyhedron $\hat{\mathcal{B}}'$ of interest. Since $k \in \mathcal{L}$, we may write k as a linear combination of the rows of N using multipliers $-\sigma_S$ for $S \subseteq E$, $-\tau_T$ for $T \subseteq E$ and $\rho$ for the final row of N. Suppose that the nonzero multipliers among these are $-\sigma_{S_i}$, $1 \le i \le p$, and $-\tau_{T_j}$, $1 \le j \le q$, together with $\rho$ (see Fig. 1); here $\sigma_{S_i} > 0$, $1 \le i \le p$, $\tau_{T_j} > 0$, $1 \le j \le q$, and $\rho > 0$, since we have $k_y < 0$. Thus we may assume that all multipliers $\sigma_{S_i}$ ($1 \le i \le p$), $\tau_{T_j}$ ($1 \le j \le q$) and $\rho$ are positive integers.
[Fig. 1 displays the matrix N together with the row multipliers: the rows determined by the sets $S_1, \ldots, S_p$ (incidence vectors of subsets of E) carry multipliers $-\sigma_{S_1}, \ldots, -\sigma_{S_p}$ and y-column entries $-r_1(S_1), \ldots, -r_1(S_p)$; the rows determined by $T_1, \ldots, T_q$ carry multipliers $-\tau_{T_1}, \ldots, -\tau_{T_q}$ and y-column entries $-r_2(T_1), \ldots, -r_2(T_q)$; the final row carries the multiplier $\rho$.]

Fig. 1. Row multipliers defining a constraint of the form (4).
Thus far we have used no "matroid" properties in the problem. We now do so by invoking the well-known fact that matroid rank functions are submodular (see [3]); i.e., for $r_1$ we have

$$r_1(S) + r_1(S') \ge r_1(S \cup S') + r_1(S \cap S'), \quad S, S' \subseteq E.$$
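Submodularity can be checked exhaustively on a small example; the partition matroid below is an illustrative choice, not one used in the text:

```python
from itertools import chain, combinations

E = [0, 1, 2, 3]

def subsets(U):
    return chain.from_iterable(combinations(U, k) for k in range(len(U) + 1))

# Rank function of an illustrative partition matroid: independent sets
# contain at most one element of {0, 1} and at most one of {2, 3}.
blocks = [{0, 1}, {2, 3}]

def r(U):
    return max(len(S) for S in subsets(U)
               if all(len(set(S) & b) <= 1 for b in blocks))

# Submodularity: r(S) + r(S') >= r(S | S') + r(S & S') for all S, S'.
for S1 in subsets(E):
    for S2 in subsets(E):
        A, B = set(S1), set(S2)
        assert r(A) + r(B) >= r(A | B) + r(A & B)
```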
Suppose that for some distinct i and $i'$, $1 \le i, i' \le p$, we have $\sigma_{S_i} \ge \sigma_{S_{i'}}$ and both $S_i \setminus S_{i'}$ and $S_{i'} \setminus S_i$ are nonempty. Consider the vector $k' \in \mathcal{L}$ determined by altering the row multipliers which produced k as follows: increase the multipliers $\sigma_{S_i \cup S_{i'}}$ and $\sigma_{S_i \cap S_{i'}}$ both by the amount $\sigma_{S_{i'}}$, reduce the multipliers $\sigma_{S_i}$ and $\sigma_{S_{i'}}$ both by this same amount, and leave the remaining multipliers unchanged. Now $k' \in \mathcal{L}$, so the comments following the statement of Theorem 2 imply that the following constraint determined by $k'$ is satisfied for each vector $x = (z, s, t, y) \in \mathcal{B}$:

$$\sum_{j \in K'_+} k'_j z_j \ge -k'_y, \tag{4'}$$

i.e., the constraint (4') is valid for $\hat{\mathcal{B}}$. Furthermore, $k'_j = k_j$ for all $j \in K'_+ = K_+$, and it follows easily from the submodularity inequality that $k_y \ge k'_y$. If this inequality holds strictly, i.e., if $k_y > k'_y$, then (4) is inessential for defining $\hat{\mathcal{B}}$, a contradiction. Hence we must have $k_y = k'_y$, and we may replace k by $k'$. Note that in passing from k to $k'$ the total $\sigma$-weight, $\sum_{S \subseteq E} \sigma_S$, does not change. Thus after a finite number of applications of this process to both $\sigma$ and $\tau$ multipliers we obtain a vector $k' \in \mathcal{L}$ determined by positive integer-valued multipliers $\sigma_{S'_i}$, $\tau_{T'_j}$ and $\rho$ for which the sets $S'_i$ and $T'_j$ form two nested sequences: $S'_1 \supseteq S'_2 \supseteq \cdots \supseteq S'_{p'}$ and $T'_1 \supseteq T'_2 \supseteq \cdots \supseteq T'_{q'}$. Furthermore, $k'$ determines the same constraint (4) as k. Now if $k' \notin \mathcal{E}(\mathcal{L})$, then we may write $k'$ as a conformal sum of elementary vectors of $\mathcal{L}$, say $k' = k^1 + \cdots + k^m$. Since this sum is conformal, the sets $S'_i$ and $T'_j$ defining each $k^j$ must also be nested. Thus we may restrict consideration to those vectors $k \in \mathcal{E}(\mathcal{L})$ for which the sets $S_i$, $1 \le i \le p$, and $T_j$, $1 \le j \le q$, form two nested sequences: $S_1 \supseteq S_2 \supseteq \cdots \supseteq S_p$ and $T_1 \supseteq T_2 \supseteq \cdots \supseteq T_q$. Finally, by allowing copies of certain sets, we may also assume that $\sigma_{S_i} = 1$, $1 \le i \le p$, and $\tau_{T_j} = 1$, $1 \le j \le q$; since the constraints of (2) for $S = \emptyset$ and $T = \emptyset$ are trivial, we also have $S_p \ne \emptyset$ and $T_q \ne \emptyset$.
The nesting property of the sets $S_i$ and $T_j$ is crucial to the remainder of our development. The importance of the structure obtained from two such nested families of sets is well known in other problems concerning the intersection of two matroids (see [3, 10, 12]).

Suppose $p > \rho$ and consider the vector $k' \in \mathcal{L}$ which is the row of N corresponding to the set $S_p$. Since $p > \rho$, the support of k contains that of $k'$, and because $k \in \mathcal{E}(\mathcal{L})$, $k'$ must be a nonzero scalar multiple of k. This is impossible, since the vectors k and $k'$ disagree in sign on z coordinates but agree in sign on the y coordinate. Thus we must have $p \le \rho$, and similarly, $q \le \rho$. Suppose $q < \rho$ and consider the vector $l \in \mathcal{L}$ defined by subtracting the row of N corresponding to the set $S_1$ from the last row of N. Since $q < \rho$, it is clear that the support of k contains that of l. Thus k must be a nonzero scalar multiple of l. But this implies $q = 0$, a contradiction. Hence we have $q = \rho$, and similarly, $p = \rho$; i.e., $p = q = \rho > 0$.

Consider the vector $k' \in \mathcal{L}$ defined by subtracting the two rows of N corresponding to the sets $S_1$ and $T_q$ from the last row of N. We claim that the support of k contains that of $k'$. For coordinates corresponding to the s, t and y variables this is clear. If coordinate j corresponds to a z variable and j is in the negative support of $k'$, then $j \in S_1 \cap T_q$. Since $T_1 \supseteq T_2 \supseteq \cdots \supseteq T_q$, we also have $j \in S_1 \cap T_1 \cap T_2 \cap \cdots \cap T_q$; hence j is also in the negative support of k. On the other hand, if coordinate j corresponding to some z variable is in the positive support of $k'$, then $j \in \bar{S}_1 \cap \bar{T}_q$ (here $\bar{U} = E \setminus U$). Thus $j \in \bar{S}_1 \cap \bar{S}_2 \cap \cdots \cap \bar{S}_p \cap \bar{T}_q$, and so j is also in the positive support of k. This establishes the claim, and since k is elementary, k must be a positive scalar multiple of $k'$. Hence after deleting $s_{S_i}$, $1 \le i \le p$, and $t_{T_j}$, $1 \le j \le q$, constraint (4) is of the form

$$z(E) - z(S_1) - z(T_q \setminus S_1) \ge r_0(E) - r_1(S_1) - r_2(T_q). \tag{5}$$
If $p = 0$ ($q = 0$), the same analysis determines a constraint of the form (5) in which $S_1 = \emptyset$ ($T_q = \emptyset$). Thus the polyhedron $\hat{\mathcal{B}}'$ is given by

$$z(E) - z(S) - z(T \setminus S) \ge r_0(E) - r_1(S) - r_2(T), \quad S, T \subseteq E,$$
$$z_j \ge 0, \quad j \in E. \tag{6}$$
Since $r_2(T) \ge r_2(T \setminus S)$, an equivalent representation is

$$z(E) - z(S) - z(T) \ge r_0(E) - r_1(S) - r_2(T), \quad S, T \subseteq E,\ S \cap T = \emptyset,$$
$$z_j \ge 0, \quad j \in E. \tag{7}$$
The latter system may be rewritten as

$$z(E) - z(U) \ge r_0(E) - \min_{U' \subseteq U}\{r_1(U') + r_2(U \setminus U')\}, \quad U \subseteq E,$$
$$z_j \ge 0, \quad j \in E. \tag{8}$$
Edmonds has shown (see [3]) that for any set $U \subseteq E$,

$$r_0(U) = \min_{U' \subseteq U}\{r_1(U') + r_2(U \setminus U')\}.$$

Thus the following system determines the polyhedron $\hat{\mathcal{B}}'$:

$$z(E) - z(U) \ge r_0(E) - r_0(U), \quad U \subseteq E,$$
$$z_j \ge 0, \quad j \in E. \tag{9}$$
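Edmonds' min formula can be verified by brute force on a small instance; the two partition matroids below are illustrative choices, not taken from the text:

```python
from itertools import chain, combinations

E = [0, 1, 2, 3]

def subsets(U):
    return chain.from_iterable(combinations(U, k) for k in range(len(U) + 1))

blocks1 = [{0, 1}, {2, 3}]   # illustrative partition matroid 1
blocks2 = [{0, 2}, {1, 3}]   # illustrative partition matroid 2

def indep(S, blocks):
    return all(len(set(S) & b) <= 1 for b in blocks)

def rank(U, blocks):
    return max(len(S) for S in subsets(U) if indep(S, blocks))

def rank0(U):
    return max(len(S) for S in subsets(U)
               if indep(S, blocks1) and indep(S, blocks2))

# r0(U) = min over U' <= U of r1(U') + r2(U \ U'), for every U <= E.
for W in subsets(E):
    U = set(W)
    lhs = rank0(U)
    rhs = min(rank(Up, blocks1) + rank(U - set(Up), blocks2)
              for Up in subsets(U))
    assert lhs == rhs
```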
An analysis similar to that above, using part (ii) of Theorem 2 rather than part (i) of that theorem, yields the following anti-blocking polyhedron $\hat{\mathcal{A}}'$ for the polyhedron $\mathcal{A}' = \{z \in R^E_+ : M'z \le 1\}$:

$$z(U) \le r_0(U), \quad U \subseteq E,$$
$$z(S \cap T) \le r_1(S) + r_2(T) - r_0(E), \quad S, T \subseteq E \text{ and } S \cup T = E,$$
$$z_j \ge 0, \quad j \in E. \tag{10}$$
Of course, in this case we insist that M have no zero columns; i.e., so that no z-column of M is zero we require that each element of E be in a common independent set of size $r_0(E)$, and so that no s- or t-column of M is zero, no element of E can be in all common independent sets of size $r_0(E)$. The inequalities (10) were first determined by McDiarmid [13]. One may obtain blocking and anti-blocking results for common independent sets of fixed cardinality $r \le r_0(E)$ simply by truncating (see [3]) the matroids at value r, i.e., by using the rank functions $r'_i(S) = \min(r_i(S), r)$, $S \subseteq E$, $i = 1, 2$.

It is perhaps worth reemphasizing the generality and the limitations of the technique used here. In the present instance the success in determining relations (9) and (10) derived mainly from two factors: (i) knowing at the outset a polyhedral representation (1) for the combinatorial family of interest, and (ii) applying submodularity of the matroid rank functions to determine nesting of the sets $S_i$ and $T_j$, as described following relation (4'). Item (i) allowed us to invoke Theorem 2, whereas item (ii) led to the eventual simplification of (4) into the more meaningful form (9). Perhaps a similar analysis for other combinatorial families will prove useful in determining additional blocking and anti-blocking polyhedra.
References
[1] R.G. Bland, "Elementary vectors and two polyhedral relaxations", Mathematical Programming Study 8 (1978) 159-166.
[2] W. Cunningham, "An unbounded matroid intersection polyhedron", Linear Algebra and Its Applications 16 (1977) 209-215.
[3] J. Edmonds, "Submodular functions, matroids, and certain polyhedra", in: R. Guy, H. Hanani, N. Sauer and J. Schönheim, eds., Combinatorial structures and their applications, Proceedings of the 1969 Calgary International Conference (Gordon and Breach, New York, 1970) pp. 69-87.
[4] J. Edmonds and F.R. Giles, "A min-max relation for submodular functions on graphs", Annals of Discrete Mathematics 1 (1977) 185-204.
[5] D.R. Fulkerson, "Networks, frames, blocking systems", in: G.B. Dantzig and A.F. Veinott Jr., eds., Mathematics of the decision sciences, Lectures in Applied Mathematics, Vol. 11 (Am. Math. Soc., Providence, RI, 1968) pp. 303-335.
[6] D.R. Fulkerson, "Blocking polyhedra", in: B. Harris, ed., Graph theory and its applications (Academic Press, New York, 1970) pp. 93-112.
[7] D.R. Fulkerson, "Anti-blocking polyhedra", Journal of Combinatorial Theory 12 (1972) 50-71.
[8] D.R. Fulkerson, "Blocking and anti-blocking pairs of polyhedra", Mathematical Programming 1 (1971) 168-194.
[9] D.R. Fulkerson and D.B. Weinberger, "Blocking pairs of polyhedra arising from network flows", Journal of Combinatorial Theory 18 (1975) 265-283.
[10] F.R. Giles, "Submodular functions, graphs and integer polyhedra", Thesis, Department of Combinatorics and Optimization, University of Waterloo (Waterloo, Ont., 1975).
[11] H.-C. Huang, "Investigations on combinatorial optimization", Thesis, School of Organization and Management, Yale University (Cornell University, School of Operations Research and Industrial Engineering, Tech. Rept. No. 308, Ithaca, NY, 1976).
[12] E. Lawler, "Matroid intersection algorithms", Mathematical Programming 9 (1975) 31-56.
[13] C.J.H. McDiarmid, "Blocking, anti-blocking, and pairs of matroids and polymatroids", Journal of Combinatorial Theory (B) 25 (1978) 313-325.
[14] L.E. Trotter Jr. and D.B. Weinberger, "Symmetric blocking and anti-blocking relations for generalized circulations", Mathematical Programming Study 8 (1978) 141-158.
[15] D.B. Weinberger, "Investigations in the theory of blocking pairs of polyhedra", Thesis, Cornell University, School of Operations Research and Industrial Engineering, Tech. Rept. No. 190 (Ithaca, NY, 1973).
[16] D.B. Weinberger, "Network flows, minimum coverings, and the four-color conjecture", Operations Research 24 (1976) 272-290.
[17] D.B. Weinberger, "Transversal matroid intersections and related packings", Mathematical Programming 11 (1976) 164-176.
Mathematical Programming Study 12 (1980) 206-213. North-Holland Publishing Company
CERTAIN KINDS OF POLAR SETS AND THEIR RELATION TO MATHEMATICAL PROGRAMMING*

Jørgen TIND

Aarhus Universitet, Aarhus, Denmark
Received 28 April 1977
Revised manuscript received 14 August 1979

This paper discusses some special polar correspondences related to the Minkowski polarity known from convex analysis. They represent a natural generalization of the concepts of blocking and antiblocking polyhedra, developed by D.R. Fulkerson and used in the study of certain problems in mathematical programming and combinatorics. Here emphasis is placed on the study of necessary and sufficient conditions that ensure the validity of the polar correspondences in question. At the same time an economic interpretation is given within the framework of activity analysis.

Key words: Blocking Sets, Antiblocking Sets, Polarity, Activity Analysis.
1. Introduction
This paper describes a generalization of, and a connection between, the Minkowski polarity and the idea of blocking and anti-blocking polyhedra, developed by Fulkerson in [2], [3] and [4]. This is done by introducing generalized notions of blocking and antiblocking sets and functions. Together with them an economic interpretation is given along the lines of nonlinear activity analysis (see e.g. Williams [11] or Knudsen [6]). Some of the material has been presented earlier in Tind [8] and [9], but in the present development with some modifications. Finally a relation is included between some special blocking and antiblocking pairs.
2. Polar sets
We will first state the most basic properties of the classic Minkowski polarity. Consider a closed, convex set $C \subseteq R^n$, $0 \in C$. Let $C^*$ denote the polar set of C, i.e.

$$C^* = \{x^* \in R^n \mid x \cdot x^* \le 1,\ \forall x \in C\}.$$

* Paper presented at the IXth International Symposium on Mathematical Programming, Budapest, 1976.
It is seen that $C^*$ is also closed, convex and $0 \in C^*$. Additionally, we have the so-called involutory property, i.e.

$$C^{**} = C, \tag{2.1}$$

which says that C is the polar of $C^*$. For arbitrary sets $C \subseteq R^n$ we have the more general formula

$$C^{**} = \operatorname{cl}(\operatorname{conv}(C \cup \{0\})), \tag{2.2}$$

where "cl" denotes the closure and "conv" denotes the convex hull (see Rockafellar [7, p. 125]).
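For a polytope $C = \operatorname{conv}(V)$ with $0 \in C$, the polar is cut out by the vertex inequalities $v \cdot x^* \le 1$, and the involutory property (2.1) can be spot-checked numerically; the square below and all test points are illustrative choices:

```python
# For a polytope C = conv(V) containing 0, the polar is
# C* = {x* : v . x* <= 1 for every vertex v in V}, and C** = C (2.1).
# Illustrative check with C = [-1, 1]^2, whose polar is the cross-polytope.

V = [(1, 1), (1, -1), (-1, 1), (-1, -1)]          # vertices of C

def in_polar(xstar, vertices):
    return all(v[0] * xstar[0] + v[1] * xstar[1] <= 1 for v in vertices)

# Vertices of C*: the cross-polytope |x1| + |x2| <= 1.
Vstar = [(1, 0), (-1, 0), (0, 1), (0, -1)]
assert all(in_polar(v, V) for v in Vstar)          # they lie in C*
assert not in_polar((0.6, 0.6), V)                 # but (0.6, 0.6) does not

# C** = C: the vertices of C lie in the polar of C*, and a point
# outside C does not.
assert all(in_polar(v, Vstar) for v in V)
assert not in_polar((1.2, 0), Vstar)
```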
3. Antiblocking sets
The idea here is to modify the definition of polar sets by intersecting the polar set with a given set D. As the polar set is basically a set which contains all normals of the supporting hyperplanes of the original set, we may lose some normals by this intersection. In general this destroys the involutory property. Therefore we will impose some necessary and sufficient conditions in order to save the involutory property.

Let $B \subseteq R^n$ and $D \subseteq R^n$ be nonempty sets. Define the antiblocking set $\bar{B} \subseteq R^n$ of B with respect to D as follows:

$$\bar{B} = B^* \cap D.$$

The notion "antiblocking set" is used here because it represents a generalization of antiblocking polyhedra, introduced and studied by Fulkerson (see e.g. [3]). The basic question is now to examine the conditions under which

$$\bar{\bar{B}} = B, \tag{3.1}$$

i.e. when B is the antiblocking set of $\bar{B}$ with respect to D. In this case B and $\bar{B}$ constitute a pair of antiblocking sets. It is remarked that if $D = R^n$, we are back in the Minkowski polarity. Hence we are actually dealing with a generalization of this polarity. The following two theorems give necessary and sufficient conditions for Eq. (3.1) to be valid.
Theorem 3.1. Let D be closed, convex and containing the origin. Then

$$\bar{\bar{B}} = B$$

if and only if

$$B = \operatorname{cl}(\operatorname{conv}(B \cup D^*)) \cap D.$$
Proof. We will first show that

$$\operatorname{cl}(\operatorname{conv}(B^{**} \cup D^*)) = \operatorname{cl}(\operatorname{conv}(B \cup D^*)). \tag{3.2}$$

Since $B^{**} \supseteq B$, the inclusion "$\supseteq$" is obvious. To show that

$$\operatorname{cl}(\operatorname{conv}(B^{**} \cup D^*)) \subseteq \operatorname{cl}(\operatorname{conv}(B \cup D^*)), \tag{3.3}$$

it is remarked that $\operatorname{cl}(\operatorname{conv}(B \cup D^*)) \supseteq \operatorname{cl}(\operatorname{conv}(B \cup \{0\})) = B^{**}$, as $\{0\} \subseteq D^*$. Additionally, $\operatorname{cl}(\operatorname{conv}(B \cup D^*)) \supseteq D^*$. Since $\operatorname{cl}(\operatorname{conv}(B \cup D^*))$ is closed and convex, we obtain (3.3), and (3.2) is proved. For closed, convex sets $P, Q \subseteq R^n$ containing the origin we have $(P \cap Q)^* = \operatorname{cl}(\operatorname{conv}(P^* \cup Q^*))$; see e.g. Rockafellar [7, Corollary 16.5.2]. Hence

$$\bar{\bar{B}} = (B^* \cap D)^* \cap D = \operatorname{cl}(\operatorname{conv}(B^{**} \cup D^*)) \cap D = \operatorname{cl}(\operatorname{conv}(B \cup D^*)) \cap D,$$

where the last equation follows by (3.2).

Theorem 3.2.* Let D be closed, convex and containing the origin. Then $\bar{\bar{B}} = B$ if and only if there exists a closed, convex set $C \subseteq R^n$ such that $B = C \cap D$ and such that $D^* \subseteq C$.
Proof. Let us first assume that $\bar{\bar{B}} = B$, and let $C = \operatorname{cl}(\operatorname{conv}(B \cup D^*))$. Obviously C is closed and convex, and $C \supseteq D^*$. Theorem 3.1 implies that $B = C \cap D$. This shows one direction of the theorem. Now assume that we have a set C such that $B = C \cap D$ and $D^* \subseteq C$. [It is remarked that in general C might here be different from the previous set $\operatorname{cl}(\operatorname{conv}(B \cup D^*))$.] From Theorem 3.1 it is now sufficient to show that

$$B = \operatorname{cl}(\operatorname{conv}(B \cup D^*)) \cap D. \tag{3.4}$$

Since $B = B \cap D \subseteq \operatorname{cl}(\operatorname{conv}(B \cup D^*)) \cap D$, it is enough to show the reverse inclusion:

* It is remarked that for polyhedra Theorem 3.2 is a special case of joint work by Julián Aráoz, Jack Edmonds and Victor Griffin. Personal communication with Julián Aráoz.
$$B \supseteq \operatorname{cl}(\operatorname{conv}(B \cup D^*)) \cap D.$$

By assumption $C \supseteq D^*$. Moreover, $C \supseteq B$. Since C is closed and convex, we obtain that $C \supseteq \operatorname{cl}(\operatorname{conv}(B \cup D^*))$. Hence

$$\operatorname{cl}(\operatorname{conv}(B \cup D^*)) \cap D \subseteq C \cap D = B,$$

where the last equation follows by assumption. This implies (3.4), and the theorem is proved.

The assumption $D^* \subseteq C$ in the theorem is equivalent to $D \supseteq C^*$. This expresses in particular that all supporting hyperplanes for C have their normals contained in D, the defining linear forms being normalized (equal to 1).
4. Antiblocking functions

Antiblocking functions present a development similar to the one just outlined for antiblocking sets. The starting point here is the concept of conjugate functions and especially the similar notion of polar functions [7]. Again the basic concepts take place within a given set D, and here we intersect the subgraphs of concave functions with D. Consider a concave, non-negative, closed function $f : R^n_+ \to R_+$, where $R^n_+$ denotes the non-negative orthant, i.e. $R^n_+ = \{x \in R^n \mid x \ge 0\}$. Let $\operatorname{sub}_+ f \subseteq R^{n+1}$ denote the non-negative subgraph of $f(x)$, i.e.

$$\operatorname{sub}_+ f = \{(x, y) \in R^{n+1} \mid x \ge 0 \text{ and } 0 \le y \le f(x)\}.$$

Since $f(x)$ is a non-negative function, it is uniquely determined by $\operatorname{sub}_+ f$. With the given specifications on $f(x)$ it follows that $\operatorname{sub}_+ f$ is a closed, convex set containing 0. Define the following function $\bar{f} : R^n_+ \to R_+$:

$$\bar{f}(x^*) = \sup\{y^* \in R \mid -x \cdot x^* + y^* f(x) \le 1,\ \forall x \ge 0\}.$$

Call $\bar{f}(x^*)$ the antiblocking function of f. $\bar{f}(x^*)$ is non-negative and non-decreasing. Its subgraph is given by

$$\operatorname{sub}_+ \bar{f} = \{(x^*, y^*) \in R^{n+1} \mid (x^*, y^*) \cdot (-x, y) \le 1 \text{ for all } (x, y) \in \operatorname{sub}_+ f\} \cap \{(x^*, y^*) \in R^{n+1} \mid (x^*, y^*) \ge 0\}.$$

This shows that $\bar{f}$ is also concave and closed. Let T denote the linear transformation $T : (x, y) \to (-x, y)$. If $D = R^{n+1}_+$, it is seen that $\operatorname{sub}_+ \bar{f}$ is obtained by the antiblocking relation with respect to D as follows:

$$\operatorname{sub}_+ \bar{f} = \overline{T(\operatorname{sub}_+ f)}.$$

We now additionally assume that $f(x)$ is non-decreasing in each component.
This implies that all supporting hyperplanes for $T(\operatorname{sub}_+ f)$ have their normals in $D = D^{**}$. Hence it is obtained, by the same reasoning as in the proof of Theorem 3.1, that $\overline{\overline{T(\operatorname{sub}_+ f)}} = T(\operatorname{sub}_+ f)$. This shows that

$$\bar{\bar{f}} = f, \tag{4.1}$$

i.e. f is the antiblocking function of $\bar{f}$.
5. An economic interpretation

For a pair of conjugate convex functions Williams [11] gives an economic interpretation of the corresponding involutory property. Along similar lines an economic interpretation is here given for a pair of antiblocking functions and Eq. (4.1). It is remarked that $\bar{f}(x^*)$ can be expressed alternatively as

$$\bar{f}(x^*) = \inf_{x \ge 0} \frac{x^* \cdot x + 1}{f(x)},$$

where $(x^* \cdot x + 1)/f(x) = \infty$ if $f(x) = 0$. The polarity will not be disturbed by rescaling, i.e. by replacement of the number 1 by an arbitrary number $k > 0$, which means that

$$\bar{f}(x^*) = \inf_{x \ge 0} \frac{x \cdot x^* + k}{f(x)}.$$

Assume now that a manufacturer produces a product by means of n activities. Let the components of the vector $x \ge 0$ denote the activity level of each activity. With a given activity level he produces $f(x)$ units of the product. Assume additionally that the components of the vector $x^* \ge 0$ denote market prices that equal the cost for use or consumption of one unit of the corresponding activities. Hence $x \cdot x^*$ is the cost of producing $f(x)$ units of the product. In addition to this cost, which is linear in x, there is supposed to be a constant cost of size k. It is further assumed that the manufacturer's objective is to minimize the average cost per unit produced, i.e. the manufacturer wants to solve the problem

$$\bar{f}(x^*) = \inf_{x \ge 0} \frac{x \cdot x^* + k}{f(x)}.$$

Hence the antiblocking function of f denotes the minimal average cost, given a price vector $x^*$. Now the manufacturer also considers selling his activities at a given level
$x \ge 0$ to a contractor, who in return should pay him with an amount of the finished product. For that purpose the contractor quotes a unit price $x^* \ge 0$ on each activity. Based on this price the manufacturer would at least demand an amount of the finished product that equals the estimated production costs divided by the average cost per unit, i.e.

$$\frac{x \cdot x^* + k}{\bar{f}(x^*)}.$$

Hence the contractor, seeing no reason to return more than that amount, gets the task of finding a price $x^* \ge 0$ that solves the problem

$$f(x) = \inf_{x^* \ge 0} \frac{x \cdot x^* + k}{\bar{f}(x^*)}.$$

By (4.1) we get the reasonable result that with such a price the amount of finished product does not depend on whether the manufacturer produces by himself or lets the contractor do it for him. The main difference between the development here and that in [11] is that the manufacturer in [11] minimizes his total costs, whereas here he minimizes the average costs. Additionally, the whole development here takes place over the non-negative orthant $D = R^n_+$. This has the effect that x and $x^*$ are non-negative and that $f(x)$ and $\bar{f}(x^*)$ are non-decreasing functions, which seems reasonable in the economic context above.
6. Reverse polar sets

The difference between the material on polar sets and the following is due to a change in the direction of the inequality sign in the basic definition. Consider a non-empty set $C \subseteq R^n$. Let $C^0$ denote the reverse polar set of C, i.e.

$$C^0 = \{x^0 \in R^n \mid x \cdot x^0 \ge 1,\ \forall x \in C\}.$$

It is seen that $C^0$ is convex, closed and $0 \notin C^0$. Additionally, $C^0$ is non-empty if and only if $0 \notin \operatorname{cl}\operatorname{conv} C$. Let $P_0(C)$ denote the cone generated by C with vertex at 0, i.e.

$$P_0(C) = \{\lambda x \mid x \in C \text{ and } \lambda \ge 0\}.$$

It can then be shown that

$$C^{00} = \operatorname{cl}\operatorname{conv} C + P_0(\operatorname{cl}\operatorname{conv} C);$$

see [8, Theorem 5.1]. This equation corresponds to Eq. (2.2) for polar sets. If $C = \operatorname{cl}\operatorname{conv} C + P_0(\operatorname{cl}\operatorname{conv} C)$ and $0 \notin C$, then the convex set C is called reverse closed, or in the terminology of Aráoz for polyhedra, $\beta$-closed [1].
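The formula for $C^{00}$ can be spot-checked on a small finite set; the two-point set and the sampled normals below are illustrative choices (the sampling only approximates the unbounded set $C^0$, which suffices for the chosen test points):

```python
# Reverse polar: C0 = {x0 : x . x0 >= 1 for all x in C}, and
# C00 = cl conv C + P0(cl conv C).  Illustrative check with the
# two-point set C = {(1, 0), (0, 1)}.

C = [(1.0, 0.0), (0.0, 1.0)]

def in_C0(x0):
    return all(x[0] * x0[0] + x[1] * x0[1] >= 1 for x in C)

# Here C0 is the translated orthant {x0 >= (1, 1)}.
assert in_C0((1, 1)) and in_C0((3, 2))
assert not in_C0((0.5, 2))

def in_C00(x):
    # conv C + cone(conv C) = {x >= 0 : x1 + x2 >= 1}
    return x[0] >= 0 and x[1] >= 0 and x[0] + x[1] >= 1

# Spot-check C00 against the definition applied to sampled points of C0.
samples = [(1 + i / 4, 1 + j / 4) for i in range(40) for j in range(40)]
for x in [(0.5, 0.5), (2, 0), (0.3, 0.9), (0.2, 0.3), (-0.2, 1.5)]:
    holds = all(x[0] * s[0] + x[1] * s[1] >= 1 for s in samples)
    assert holds == in_C00(x)
```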
For C reverse closed we have

$$C^{00} = C,$$

which corresponds to the similar Eq. (2.1) for polar sets.
7. Blocking sets

Here we will make a development similar to the one for antiblocking sets. Let $B \subseteq R^n$ and $D \subseteq R^n$ be nonempty sets. Define the blocking set $\bar{B} \subseteq R^n$ of B with respect to D as follows:

$$\bar{B} = B^0 \cap D.$$

Again, the notion "blocking" is due to Fulkerson [2]. We could now proceed in full detail along the same lines as in Section 3, but instead we refer to [8], where this has been done for certain sets. There are many similarities between antiblocking and blocking sets. One of them may be illustrated by the following observation. Let

$$D = \{(x, y) \in R^{n-1} \times R \mid (x, y) > 0\}$$

be the positive orthant in $R^n$. Let $B \subseteq R^n$ and $\bar{B} \subseteq R^n$ be a pair of antiblocking sets with respect to this D. Let $(x_A, y_A) \in B$, where $(x_A, y_A) \in R^{n-1} \times R$, and let $(x^*_A, y^*_A) \in \bar{B}$, where $(x^*_A, y^*_A) \in R^{n-1} \times R$. Define the relation

$$(x_A, y_A) \to (x_B, y_B), \quad \text{where } x_B = \frac{x_A}{y_A} \text{ and } y_B = \frac{1}{y_A}, \quad \text{for } (x_A, y_A) \in B,$$

and the relation

$$(x^*_A, y^*_A) \to (x^0_B, y^0_B), \quad \text{where } x^0_B = \frac{x^*_A}{y^*_A} \text{ and } y^0_B = \frac{1}{y^*_A}, \quad \text{for } (x^*_A, y^*_A) \in \bar{B}.$$

Then the two sets of points $\{(x_B, y_B)\}$ and $\{(x^0_B, y^0_B)\}$ constitute a pair of blocking sets with respect to D. This is due to the fact that the following two basic inequalities are equivalent for the present choice of D:
$$-x_A \cdot x^*_A + y_A y^*_A \le 1$$

and

$$x_B \cdot x^0_B + y_B y^0_B \ge 1$$

(the first being the antiblocking inequality after the sign change of the transformation T from Section 4). The differences between blocking and antiblocking sets become clear, for example, in the literature on the polyhedral case; see the next section.
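The equivalence of the two inequalities under the transformation can be checked numerically; the sketch below assumes, as in Section 4, that the antiblocking inequality carries the sign change of the transformation T, and the random sampling is an illustrative choice:

```python
import random

random.seed(0)

# The map (x, y) -> (x / y, 1 / y) on the positive orthant turns the
# antiblocking inequality  -xA . xA* + yA * yA* <= 1  into the blocking
# inequality  xB . x0B + yB * y0B >= 1, and conversely.

for _ in range(1000):
    n = 3
    xA = [random.uniform(0.01, 5) for _ in range(n - 1)]
    yA = random.uniform(0.01, 5)
    xAs = [random.uniform(0.01, 5) for _ in range(n - 1)]
    yAs = random.uniform(0.01, 5)

    anti = -sum(a * b for a, b in zip(xA, xAs)) + yA * yAs <= 1

    xB = [a / yA for a in xA]
    yB = 1 / yA
    x0B = [a / yAs for a in xAs]
    y0B = 1 / yAs
    block = sum(a * b for a, b in zip(xB, x0B)) + yB * y0B >= 1

    assert anti == block
```

The algebra behind the check: dividing $1 - x_A \cdot x^*_A$... rather, multiplying the blocking inequality through by $y_A y^*_A > 0$ turns it into $x_A \cdot x^*_A + 1 \ge y_A y^*_A$, which is the first inequality rearranged.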
8. Polyhedra

If $D = R^n_+ = \{x \in R^n \mid x \ge 0\}$ and $C = \{x \in R^n \mid Ax \le 1\}$, where A is an m by n matrix of non-negative elements and $\mathbf{1} = (1, \ldots, 1)$ with m elements, then Theorem 3.2 can be applied to $B = C \cap D$. In this case B and $\bar{B}$ constitute a pair of antiblocking polyhedra. There is a similar connection to blocking polyhedra. Both concepts play an important role in many extremal combinatorial problems. See the original papers by Fulkerson [2, 3, 4] and the later papers by Fulkerson and Weinberger [5] and Weinberger [10], among others.
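For polytopes the antiblocking correspondence can be made concrete through vertex inequalities: for a bounded B, the antiblocking set is $\{y \ge 0 : v \cdot y \le 1$ for every vertex v of B$\}$. The matrix, vertex lists and test points below are illustrative choices:

```python
# Antiblocking polyhedra sketch: D = R^n_+, B = {x >= 0 : Ax <= 1} with
# A >= 0.  Illustrative instance A = [[1, 2]], so B is the triangle with
# vertices (0, 0), (1, 0), (0, 0.5).

verts_B = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.5)]

def in_antiblock(y, vertices):
    # Membership in the antiblocking set via the vertex inequalities.
    return y[0] >= 0 and y[1] >= 0 and all(
        v[0] * y[0] + v[1] * y[1] <= 1 for v in vertices)

# The antiblocking set of B is the box {y >= 0 : y1 <= 1, y2 <= 2}.
verts_Bbar = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (1.0, 2.0)]
assert all(in_antiblock(v, verts_B) for v in verts_Bbar)
assert not in_antiblock((1.1, 0.0), verts_B)
assert not in_antiblock((0.0, 2.1), verts_B)

# Applying the operation twice returns B (Eq. (3.1)): points of B
# satisfy all vertex inequalities of the box ...
assert in_antiblock((1.0, 0.0), verts_Bbar)
assert in_antiblock((0.2, 0.4), verts_Bbar)
# ... and points outside B are cut off.
assert not in_antiblock((0.8, 0.2), verts_Bbar)
```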
References

[1] J.A. Aráoz, "Polyhedral neopolarities", Dissertation, Department of Applied Analysis and Computer Science, University of Waterloo (Waterloo, Ont., 1973).
[2] D.R. Fulkerson, "Blocking polyhedra", in: B. Harris, ed., Graph theory and its applications (Academic Press, New York, 1970) pp. 93-112.
[3] D.R. Fulkerson, "Anti-blocking polyhedra", Journal of Combinatorial Theory 12 (1972) 50-71.
[4] D.R. Fulkerson, "Blocking and anti-blocking pairs of polyhedra", Mathematical Programming 1 (1971) 168-194.
[5] D.R. Fulkerson and D.B. Weinberger, "Blocking pairs of polyhedra arising from network flows", Journal of Combinatorial Theory 18(B) (1975) 265-283.
[6] N.C. Knudsen, Production and cost models of a multi-product firm (Odense University Press, Odense, Denmark, 1973).
[7] R.T. Rockafellar, Convex analysis (Princeton University Press, Princeton, NJ, 1970).
[8] J. Tind, "Blocking and antiblocking sets", Mathematical Programming 6 (1974) 157-166.
[9] J. Tind, "On antiblocking sets and polyhedra", in: Hammer et al., eds., Studies in integer programming, Annals of Discrete Mathematics 1 (1977) 507-516.
[10] D.B. Weinberger, "Network flows, minimum coverings, and the four-color conjecture", Operations Research 24 (1976) 272-290.
[11] A.C. Williams, "Nonlinear activity analysis", Management Science 17 (1970) 127-139.
Mathematical Programming Study 12 (1980)214-221. North-Holland Publishing Company
BIBLIOGRAPHY

[1] Aho, A.V., J.E. Hopcroft and J.D. Ullman, The design and analysis of computer algorithms (Addison-Wesley, Reading, MA, 1974).
[2] Araoz Durand, J.A., "Polyhedral neopolarities", Ph.D. Dissertation, University of Waterloo (November 1973).
[3] Araoz, J.A., "Blocking and antiblocking extensions", in: Henn et al., eds., Operations Research Verfahren 32 (Athenäum/Hain/Scriptor/Hanstein, Germany, 1978) pp. 5-18.
[4] Balas, E., "Un algorithme additif pour la résolution des programmes linéaires à variables bivalentes", Comptes Rendus de l'Académie des Sciences, Paris 258 (1964) 3817-3820.
[5] Balas, E., "An additive algorithm for solving linear programs with zero-one variables", Operations Research 13 (1965) 517-546.
[6] Balas, E., "Intersection cuts--A new type of cutting planes for integer programming", Operations Research 19 (1971) 19-39.
[7] Balas, E., "Integer programming and convex analysis: Intersection cuts from outer polars", Mathematical Programming 2 (1972) 330-382.
[8] Balas, E., "Ranking the facets of the octahedron", Discrete Mathematics 2 (1972) 1-15.
[9] Balas, E., "A constraint-activating outer polar method for pure and mixed integer 0-1 programs", in: P.L. Hammer and G.T. Zoutendijk, eds., Mathematical programming in theory and practice (North-Holland, Amsterdam, 1974) pp. 275-310.
[10] Balas, E., "Disjunctive programming: cutting planes from logical conditions", in: O.L. Mangasarian, R.R. Meyer and S.M. Robinson, eds., Nonlinear programming, 2 (Academic Press, New York, 1975) pp. 279-312.
[11] Balas, E., "Facets of the knapsack polytope", Mathematical Programming 8 (1975) 146-164.
[12] Balas, E., "A note on duality in disjunctive programming", Journal of Optimization Theory and Applications 15 (1977).
[13] Balas, E., "Some valid inequalities for the set partitioning problem", Annals of Discrete Mathematics 1 (1977) 13-47.
[14] Balas, E., V.J. Bowman, F. Glover and D. Sommer, "An intersection cut from the dual of the unit hypercube", Operations Research 19 (1971) 40-44.
[15] Balas, E. and M.W. Padberg, "On the set covering problem", Operations Research 20 (1972) 1152-1161.
[16] Balas, E. and M.W. Padberg, "On the set covering problem, II. An algorithm for set partitioning", Operations Research 23 (1975) 74-90.
[17] Balas, E. and M.W. Padberg, "Set partitioning: A survey", SIAM Review 18 (1976) 710-760.
[18] Balas, E. and M.W. Padberg, "Adjacent vertices of the convex hull of feasible 0-1 points", R.A.I.R.O. 13 (1979) 3-12.
[19] Balas, E. and H. Samuelsson, "A node covering algorithm", Naval Research Logistics Quarterly 24 (1977) 213-233.
[20] Balas, E. and E. Zemel, "Facets of the knapsack polytope from minimal covers", SIAM Journal on Applied Mathematics 34 (1978) 119-148.
[21] Balas, E. and E. Zemel, "Graph substitution and set packing polytopes", Networks 7 (1977) 267-284.
[22] Balas, E. and E. Zemel, "Critical cutsets of graphs and canonical facets of set packing polytopes", Mathematics of Operations Research 2 (1977) 15-20.
[23] Balas, E. and A. Zoltners, "Intersection cuts from outer polars of truncated cubes", Naval Research Logistics Quarterly 22 (1975) 477-496.
[24] Baker, T., J. Gill and R. Solovay, "Relativizations of the P = ? NP question", SIAM Journal on Computing 4 (1975) 431-442.
[25] Balinski, M.L., "Labelling to obtain a maximum matching", in: R.C. Bose and T.A. Dowling, eds., Combinatorial mathematics and its applications (University of North Carolina Press, 1969) pp. 585-601.
[26] Balinski, M.L., "On maximum matching, minimum covering and their connections", in: H.W. Kuhn, ed., Proceedings of the Princeton symposium on mathematical programming (Princeton University Press, Princeton, 1970).
[27] Balinski, M.L., "Establishing the matching polytope", Journal of Combinatorial Theory 13 (1972) 1-13.
[28] Berge, C., Graphes et hypergraphes (Dunod, Paris, 1970). [English translation: North-Holland, Amsterdam, 1973.]
[29] Berge, C., "Färbung von Graphen, deren sämtliche bzw. deren ungerade Kreise starr sind. Zusammenfassung", Wiss. Z. Martin-Luther-Univ., Halle-Wittenberg, Math.-Natur. R. (1961) 119.
[30] Berge, C., "Balanced matrices", Mathematical Programming 2 (1972) 19-31.
[31] Bellman, R.E. and S.E. Dreyfus, Applied dynamic programming (Princeton University Press, 1962).
[32] Bellmore, M. and H.D. Ratliff, "Set covering and involutory bases", Management Science 18 (1971) 194-206.
[33] Bland, R.G., "Elementary vectors and two polyhedral relaxations", Mathematical Programming Study 8 (1978) 159-166.
[34] Bland, R.G. and M. Las Vergnas, "Orientability of matroids", Journal of Combinatorial Theory (B) 24 (1978) 94-123.
[35] Bland, R.G., H.-C. Huang and L.E. Trotter, Jr., "Graphical properties related to minimal imperfection", Discrete Mathematics 27 (1979) 11-22.
[36] Burdet, C.A., "Enumerative inequalities in integer programming", Mathematical Programming 2 (1972) 32-64.
[37] Burdet, C.A., "Polaroids: A new tool in nonconvex and integer programming", Naval Research Logistics Quarterly 20 (1973) 13-24.
[38] Burkard, R.E., Methoden der ganzzahligen Optimierung (Springer, Berlin, 1972).
[39] Burkard, R.E., "A general Hungarian method for the algebraic transportation problem", Discrete Mathematics 22 (1978) 219-232.
[40] Burkard, R.E., "Travelling salesman and assignment problems: A survey", Annals of Discrete Mathematics 4 (1979) 193-215.
[41] Burkard, R.E., "Remarks on some scheduling problems with algebraic objective functions", Operations Research Verfahren 32 (1979) 63-77.
[42] Burkard, R.E., W. Hahn and U. Zimmermann, "An algebraic approach to assignment problems", Mathematical Programming 12 (1977) 318-327.
[43] Burkard, R.E. and K.-H. Stratmann, "Numerical investigations on quadratic assignment problems", Naval Research Logistics Quarterly 25 (1978) 129-148.
[44] Burkard, R.E. and U. Zimmermann, "The solution of algebraic assignment and transportation problems", in: R. Henn, B. Korte and W. Oettli, eds., Optimization and operations research, Lecture Notes in Economics and Mathematical Systems 157 (1979) 55-65.
[45] Camerini, P.M. and F. Maffioli, "Heuristically guided algorithm for K-parity matroid problems", Discrete Mathematics 21 (1978) 103-116.
[46] Camion, P., "Characterization of totally unimodular matrices", Proceedings of the American Mathematical Society 16 (1965) 1068-1073.
[47] Christofides, N., Graph theory--An algorithmic approach (Academic Press, New York, 1975).
[48] Chvátal, V., "Edmonds polytopes and weakly hamiltonian graphs", Mathematical Programming 5 (1973) 29-40.
[49] Chvátal, V., "On certain polytopes associated with graphs", Journal of Combinatorial Theory (B) 18 (1975) 138-154.
[50] Chvátal, V., "On the strong perfect graph conjecture", Journal of Combinatorial Theory (B) 20 (1976) 139-141.
[51] Chvátal, V., "Determining the stability number of a graph", SIAM Journal on Computing 6 (1977) 643-662.
[52] Cook, S., "The complexity of theorem-proving procedures", Conference record of the third ACM symposium on theory of computing (1970) pp. 151-158.
[53] Cornuéjols, G., M. Fisher and G.L. Nemhauser, "On the uncapacitated location problem", Annals of Discrete Mathematics 1 (1977) 163-178.
Bibliography
[54] Cunningham, W.H., "An unbounded matroid intersection polyhedron", Linear Algebra and Its Applications 16 (1977) 209-215.
[55] Cunningham, W.H. and A.B. Marsh, "A primal algorithm for optimum matching", Mathematical Programming Study 8 (1978) 50-72.
[56] Dantzig, G.B., Linear programming and extensions (Princeton University Press, 1963).
[57] Dantzig, G.B., D.R. Fulkerson and S.M. Johnson, "Solution of a large-scale travelling salesman problem", Operations Research 2 (1954) 393-410.
[58] Dantzig, G.B., D.R. Fulkerson and S.M. Johnson, "On a linear programming, combinatorial approach to the travelling salesman problem", Operations Research 7 (1956) 59-66.
[59] Dantzig, G.B. and A.F. Veinott, "Integral extreme points", SIAM Review 10 (1968) 371-372.
[60] Dreyfus, S.E., "An appraisal of some shortest path algorithms", Operations Research 17 (1969) 395-412.
[61] Derigs, U., "Duality and the algebraic matching problem", Operations Research Verfahren 28 (1978) 253-264.
[62] Derigs, U., "On solving symmetric assignment and perfect matching problems with algebraic objectives", in: R. Henn and W. Oettli, eds., Optimization and operations research, Lecture Notes in Economics and Mathematical Systems 157 (1978) 79-86.
[63] Derigs, U., "A generalized Hungarian method for solving minimum weight perfect matching problems with algebraic objective", Discrete Applied Mathematics 1 (1979) 167-180.
[64] Derigs, U. and U. Zimmermann, "An augmenting path method for solving linear bottleneck assignment problems", Computing 19 (1978) 285-295.
[65] Derigs, U. and U. Zimmermann, "An augmenting path method for solving linear bottleneck transportation problems", Computing 22 (1979) 1-15.
[66] Dobkin, D. and R. Lipton, "On the complexity of computations under varying sets of primitive operations", in: H. Brakhage, ed., Automata theory and formal languages, Lecture Notes in Computer Science 33 (Springer, Berlin, 1975).
[67] Edmonds, J., "Covers and packings in a family of sets", Bulletin of the American Mathematical Society 68 (1962) 494-499.
[68] Edmonds, J., "Maximum matching and a polyhedron with 0,1 vertices", Journal of Research of the National Bureau of Standards 69B (1965) 125-130.
[69] Edmonds, J., "Paths, trees and flowers", Canadian Journal of Mathematics 17 (1965) 449-467.
[70] Edmonds, J., "Minimum partition of a matroid into independent subsets", Journal of Research of the National Bureau of Standards 69B (1965) 67-72.
[71] Edmonds, J., "Submodular functions, matroids, and certain polyhedra", in: R. Guy, H. Hanani, N. Sauer and J. Schönheim, eds., Combinatorial structures and their applications (Gordon and Breach, New York, 1970) pp. 69-87.
[72] Edmonds, J., "Optimum branchings", in: G.B. Dantzig and A.F. Veinott, Jr., eds., Mathematics of the decision sciences, Lectures in Applied Mathematics, Vol. 11 (American Mathematical Society, Providence, RI, 1968) pp. 346-361.
[73] Edmonds, J., "Matroids and the greedy algorithm", Mathematical Programming 1 (1971) 127-137.
[74] Edmonds, J., "Edge-disjoint branchings", in: R. Rustin, ed., Combinatorial algorithms (Algorithmics Press, New York, 1972) pp. 91-96.
[75] Edmonds, J. and D.R. Fulkerson, "Transversals and matroid partition", Journal of Research of the National Bureau of Standards 69B (1965) 147-153.
[76] Edmonds, J. and D.R. Fulkerson, "Bottleneck extrema", Journal of Combinatorial Theory (B) 8 (1970) 299-306.
[77] Edmonds, J. and R. Giles, "A min-max relation for submodular functions on graphs", Annals of Discrete Mathematics 1 (1977) 185-204.
[78] Edmonds, J. and E.L. Johnson, "Matching, Euler tours and the Chinese postman", Mathematical Programming 5 (1973) 88-124.
[79] Edmonds, J. and E.L. Johnson, "Matching: A well-solved class of integer programs", in: R. Guy, ed., Combinatorial structures and their applications (Gordon and Breach, New York, 1970).
[80] Edmonds, J. and R.M. Karp, "Theoretical improvements in algorithmic efficiency for network flow problems", Journal of the ACM 19 (1973) 248-264.
[81] Erdős, P. and J. Spencer, Probabilistic methods in combinatorics (Academic Press, New York, 1974).
[82] Even, S., Algorithmic combinatorics (Macmillan, New York, 1973).
[83] Ford, L.R., Jr. and D.R. Fulkerson, Flows in networks (Princeton University Press, 1962).
[84] Frank, H. and I. Frisch, Communications, transmissions and transportation networks (Addison-Wesley, Reading, MA, 1971).
[85] Fulkerson, D.R., "Networks, frames, blocking systems", in: G.B. Dantzig and A.F. Veinott, Jr., eds., Mathematics of the decision sciences, Lectures in Applied Mathematics, Vol. 11 (American Mathematical Society, Providence, RI, 1968) pp. 303-335.
[86] Fulkerson, D.R., "Blocking polyhedra", in: B. Harris, ed., Graph theory and its applications (Academic Press, New York, 1970) pp. 93-112.
[87] Fulkerson, D.R., "Blocking and anti-blocking pairs of polyhedra", Mathematical Programming 1 (1971) 168-194.
[88] Fulkerson, D.R., "Anti-blocking polyhedra", Journal of Combinatorial Theory 12 (1972) 50-71.
[89] Fulkerson, D.R., "On the perfect graph theorem", in: T.C. Hu and S.M. Robinson, eds., Mathematical programming (Academic Press, New York, 1973) pp. 69-77.
[90] Fulkerson, D.R., "Packing rooted directed cuts in a weighted directed graph", Mathematical Programming 6 (1974) 1-14.
[91] Fulkerson, D.R., A.J. Hoffman and R. Oppenheim, "On balanced matrices", Mathematical Programming Study 1 (1974) 120-133.
[92] Fulkerson, D.R., G.L. Nemhauser and L.E. Trotter, Jr., "Two computationally difficult set covering problems that arise in computing the 1-width of incidence matrices of Steiner triple systems", Mathematical Programming Study 2 (1975) 72-81.
[93] Fulkerson, D.R. and D.B. Weinberger, "Blocking pairs of polyhedra arising from network flows", Journal of Combinatorial Theory 18 (1975) 265-283.
[94] Gallai, T., "Über extreme Punkt- und Kantenmengen", Ann. Univ. Sci. Budapest, Eötvös, Sect. Math. 2 (1958) 133-138.
[95] Garey, M.R. and D.S. Johnson, Computers and intractability (Freeman and Co., 1979).
[96] Garfinkel, R.S. and G.L. Nemhauser, Integer programming (Wiley, New York, 1972).
[97] Glover, F., "Convexity cuts and cut search", Operations Research 21 (1973) 123-134.
[98] Glover, F., "Convexity cuts for multiple choice problems", Discrete Mathematics 6 (1973) 221-234.
[99] Glover, F., "Polyhedral annexation in mixed integer and combinatorial programming", Mathematical Programming 9 (1975) 161-188.
[100] Glover, F. and D. Klingman, "The generalized lattice point problem", Operations Research 21 (1973) 141-156.
[101] Golumbic, M.C., "Comparability graphs and a new matroid", Journal of Combinatorial Theory (B) 22 (1977) 68-90.
[102] Gomory, R.E., "An algorithm for integer solutions to linear programs", in: R.L. Graves and Ph. Wolfe, eds., Recent advances in mathematical programming (McGraw-Hill, New York, 1963).
[103] Gomory, R.E., "The travelling salesman problem", in: Proceedings of the IBM scientific computing conference on combinatorial problems, IBM Data Processing Division, White Plains, NY (1964).
[104] Gomory, R.E., "Polyhedra related to combinatorial problems", Linear Algebra and Its Applications (1968).
[105] Grötschel, M., Polyedrische Charakterisierungen kombinatorischer Optimierungsprobleme (Verlag A. Hain, Meisenheim am Glan, 1977).
[106] Grötschel, M. and M.W. Padberg, "Zur Oberflächenstruktur des Travelling Salesman Polytopen", Proceedings in Operations Research 4 (1974) 207-211.
[107] Grötschel, M. and M.W. Padberg, "Lineare Charakterisierungen von Travelling Salesman Problemen", Zeitschrift für Operations Research 21 (1977) 33-64.
[108] Grötschel, M. and M.W. Padberg, "On the symmetric travelling salesman problem I: Inequalities", Mathematical Programming 16 (1979) 265-280.
[109] Grötschel, M. and M.W. Padberg, "On the symmetric travelling salesman problem II: Lifting theorems and facets", Mathematical Programming 16 (1979) 281-302.
[110] Grünbaum, B., Convex polytopes (Wiley, New York, 1967).
[111] Hammer, P.L., E.L. Johnson and B.H. Korte (eds.), Discrete optimization I/II, Annals of Discrete Mathematics 4/5 (North-Holland, Amsterdam, 1979).
[112] Hammer, P.L., E.L. Johnson, B. Korte and G.L. Nemhauser (eds.), Studies in integer programming, Annals of Discrete Mathematics 1 (North-Holland, Amsterdam, 1977).
[113] Harary, F., Graph theory (Addison-Wesley, Reading, MA, 1969).
[114] Harary, F. and E.M. Palmer, Graphical enumeration (Academic Press, New York, 1973).
[115] Hausmann, D. (ed.), Integer programming and related areas: A classified bibliography 1976-1978, Lecture Notes in Economics and Mathematical Systems 160 (Springer, Berlin, 1978).
[116] Hausmann, D., Adjacency on polytopes in combinatorial optimization (Athenäum Verlagsgruppe, Königstein i. Ts., 1979).
[117] Hausmann, D. and B. Korte, "Lower bounds on the worst-case complexity of some oracle algorithms", Discrete Mathematics 24 (1978) 261-276.
[118] Hausmann, D. and B. Korte, "Colouring criteria for adjacency on 0-1 polyhedra", Mathematical Programming Study 8 (1978) 106-127.
[119] Held, M. and R.M. Karp, "The traveling-salesman problem and minimum spanning trees", Operations Research 18 (1970) 1138-1162.
[120] Held, M. and R.M. Karp, "The traveling-salesman problem and minimum spanning trees: Part II", Mathematical Programming 1 (1971) 6-25.
[121] Hoffman, A.J. and J.B. Kruskal, "Integral boundary points of convex polyhedra", in: H.W. Kuhn and A.W. Tucker, eds., Linear inequalities and related systems, Annals of Mathematics Studies 38 (1956).
[122] Hoffman, A.J., "A generalization of max flow-min cut", Mathematical Programming 6 (1974) 352-359.
[123] Hopcroft, J.E. and R.M. Karp, "An n^{5/2} algorithm for maximum matchings in bipartite graphs", SIAM Journal on Computing 2 (1973) 225-231.
[124] Hu, T.C., Integer programming and network flows (Addison-Wesley, Reading, MA, 1969).
[125] Ibarra, O.H. and C.E. Kim, "Fast approximation algorithms for the knapsack and sum of subsets problems", Journal of the ACM 22 (1975) 463-468.
[126] Jenkyns, T.A., "The efficacy of the 'greedy' algorithm", in: F. Hoffman et al., eds., Proc. 7th SE conf. combinatorics, graph theory, and computing, Baton Rouge, Congressus Numerantium 17 (Utilitas Mathematica, Winnipeg, 1975) pp. 341-350.
[127] Jeroslow, R.G., "The theory of cutting planes", in: N. Christofides et al., eds., Combinatorial optimization (Wiley, New York, 1979) pp. 21-72.
[128] Jeroslow, R.G., "An introduction to the theory of cutting planes", Annals of Discrete Mathematics 5 (1979) 71-96.
[129] Johnson, E.L., "The group problem for mixed integer programming", Mathematical Programming Study 2 (1974) 137-179.
[130] Johnson, E.L., "Support functions, blocking pairs and antiblocking pairs", Mathematical Programming Study 8 (1978) 167-196.
[131] Johnson, E.L., "On the group problem and a subadditive approach to integer programming", Annals of Discrete Mathematics 5 (1979) 97-112.
[132] Karp, R.M., "Reducibility among combinatorial problems", in: R.E. Miller and J.W. Thatcher, eds., Complexity of computer computations (Plenum Press, New York, 1972) pp. 85-103.
[133] Karp, R.M., "On the computational complexity of combinatorial problems", Networks 5 (1975) 45-68.
[134] Karp, R.M., "The fast approximate solution of hard combinatorial problems", in: F. Hoffman et al., eds., Proc. 6th SE conf. combinatorics, graph theory, and computing, Boca Raton, Congressus Numerantium 14 (Utilitas Mathematica, Winnipeg, 1975) pp. 15-34.
[135] Karp, R.M., "The probabilistic analysis of some combinatorial search algorithms", in: Traub, ed., Algorithms and complexity (Academic Press, 1976) pp. 1-20.
[136] Karzanov, A.V., "Determining the maximal flow in a network by the method of preflows", Soviet Mathematics Doklady 15 (1974) 434-437.
[137] Kastning, C. (ed.), Integer programming and related areas: A classified bibliography, Lecture Notes in Economics and Mathematical Systems 128 (Springer, Berlin, 1976).
[138] Knuth, D., The art of computer programming, Vols. 1 and 3 (Addison-Wesley, 1969 and 1973).
[139] Korte, B. and D. Hausmann, "An analysis of the greedy heuristic for independence systems", Annals of Discrete Mathematics 2 (1978) 65-74.
[140] Kuhn, H.W. and A.W. Tucker (eds.), Linear inequalities and related systems, Annals of Mathematics Studies, No. 38 (Princeton University Press, 1956).
[141] Land, A. and S. Powell, "Computer codes for problems of integer programming", Annals of Discrete Mathematics 5 (1979) 221-270.
[142] Lawler, E.L., Combinatorial optimization: Networks and matroids (Holt, Rinehart & Winston, New York, 1976).
[143] Lehman, A., "On the width-length inequality", Mathematical Programming 17 (1979) 403-417.
[144] Lenstra, J.K., Sequencing by enumerative methods, Mathematical Centre Tracts 69 (Mathematisch Centrum, Amsterdam, 1977).
[145] Lenstra, J.K., A.H.G. Rinnooy Kan and P. van Emde Boas, eds., Interfaces between computer science and operations research, Mathematical Centre Tracts (Mathematisch Centrum, Amsterdam, 1978).
[146] Lovász, L., "Normal hypergraphs and the perfect graph conjecture", Discrete Mathematics 2 (1972) 253-267.
[147] Lovász, L., "A characterization of perfect graphs", Journal of Combinatorial Theory (B) 13 (1972) 95-98.
[148] Lovász, L., "On two minimax theorems in graph theory", Journal of Combinatorial Theory (B) 21 (1976) 96-103.
[149] Lovász, L., "Certain duality principles in integer programming", Annals of Discrete Mathematics 1 (1977) 363-374.
[150] Lovász, L., "Graph theory and integer programming", Annals of Discrete Mathematics 4 (1979) 141-158.
[151] Lucchesi, C. and D.H. Younger, "A minimax theorem for directed graphs", Journal of the London Mathematical Society 17 (1978) 369-374.
[152] McDiarmid, C.J.H., "Blocking, antiblocking, and pairs of matroids and polymatroids", Journal of Combinatorial Theory (B) 25 (1978) 313-325.
[153] Minty, G.J., "On the axiomatic foundation of the theories of directed linear graphs, electrical networks and network programming", Journal of Mathematics and Mechanics 15 (1966) 485-520.
[154] Minty, G.J., "Maximum independent sets of vertices in claw-free graphs", Journal of Combinatorial Theory (B) (1979), to appear.
[155] Nemhauser, G.L. and L.E. Trotter, "Properties of vertex packing and independence system polyhedra", Mathematical Programming 6 (1974) 48-61.
[156] Nemhauser, G.L. and L.E. Trotter, Jr., "Vertex packings: Structural properties and algorithms", Mathematical Programming 8 (1975) 232-248.
[157] Nemhauser, G.L., L.A. Wolsey and M.L. Fisher, "An analysis of approximations for maximizing submodular set functions", Mathematical Programming 14 (1978) 265-294.
[158] Nemhauser, G.L. and L.A. Wolsey, "Best algorithms for approximating the maximum of a submodular set function", Mathematics of Operations Research 3 (1978) 177-188.
[159] Padberg, M.W., "On the facial structure of set packing polyhedra", Mathematical Programming 5 (1973) 199-215.
[160] Padberg, M.W., "Perfect zero-one matrices", Mathematical Programming 6 (1974) 180-196.
[161] Padberg, M.W., "Perfect zero-one matrices II", in: Proceedings in Operations Research 3 (Physica-Verlag, Würzburg-Wien, 1974) pp. 75-83.
[162] Padberg, M.W., "A note on zero-one programming", Operations Research 23 (1975) 833-837.
[163] Padberg, M.W., "Characterizations of totally unimodular, balanced and perfect matrices", in: B. Roy, ed., Combinatorial programming: Methods and applications (Reidel, Boston, 1975) pp. 275-284.
[164] Padberg, M.W., "Almost integral polyhedra related to certain combinatorial optimization problems", Linear Algebra and Its Applications 15 (1976) 69-88.
[165] Padberg, M.W., "A note on the total unimodularity of matrices", Discrete Mathematics 14 (1976) 273-278.
[166] Padberg, M.W., "On the complexity of set packing polyhedra", Annals of Discrete Mathematics 1 (1977) 421-434.
[167] Padberg, M.W., "Covering, packing and knapsack problems", Annals of Discrete Mathematics 4 (1979) 265-287.
[168] Padberg, M.W., "(1,k)-Configurations and facets for packing problems", Mathematical Programming 18 (1980) 94-99.
[169] Padberg, M.W. and M.R. Rao, "The travelling salesman problem and a class of polyhedra of diameter two", Mathematical Programming 7 (1974) 32-45.
[170] Parthasarathy, K.R. and G. Ravindra, "The strong perfect-graph conjecture is true for K_{1,3}-free graphs", Journal of Combinatorial Theory (B) 21 (1976) 212-223.
[171] Peled, U.N., "Properties of facets of binary polytopes", Ph.D. Thesis, Department of Combinatorics and Optimization, University of Waterloo, Waterloo (1973).
[172] Pulleyblank, W.R., "Faces of matching polyhedra", Ph.D. Thesis, Department of Combinatorics and Optimization, University of Waterloo, Waterloo (1973).
[173] Pulleyblank, W.R., "Minimum node covers and 2-bicritical graphs", Mathematical Programming 17 (1979) 91-103.
[174] Pulleyblank, W.R. and J. Edmonds, "Facets of 1-matching polyhedra", in: Hypergraph seminar, Lecture Notes in Mathematics 411 (Springer, Berlin, 1974) pp. 214-242.
[175] Rabin, M.O., "Probabilistic algorithms", in: Traub, ed., Algorithms and complexity (Academic Press, New York, 1976) pp. 21-40.
[176] Randow, R. von, Introduction to the theory of matroids, Lecture Notes in Economics and Mathematical Systems 109 (Springer, Berlin, 1975).
[177] Rinnooy Kan, A.H.G., Machine scheduling problems (Martinus Nijhoff, The Hague, 1976).
[178] Rockafellar, R.T., Convex analysis (Princeton University Press, 1970).
[179] Rockafellar, R.T., "The elementary vectors of a subspace of R^N", in: R.C. Bose and T.A. Dowling, eds., Combinatorial mathematics and its applications (University of North Carolina Press, Chapel Hill, NC, 1969) pp. 104-127.
[180] Roy, B., Algèbre moderne et théorie des graphes, I (Dunod, Paris, 1969).
[181] Roy, B., Algèbre moderne et théorie des graphes, II (Dunod, Paris, 1970).
[182] Sahni, S. and T. Gonzalez, "P-complete approximation problems", Journal of the ACM 23 (1976) 555-565.
[183] Schrijver, A. (ed.), Packing and covering in combinatorics, Mathematical Centre Tracts 106 (Mathematisch Centrum, Amsterdam, 1979).
[184] Simonnard, M., Linear programming (Prentice-Hall, Englewood Cliffs, NJ, 1966).
[185] Seymour, P.D., "The matroids with the max-flow-min-cut property", Journal of Combinatorial Theory (B) 23 (1977) 189-222.
[186] Spielberg, K., "Enumerative methods in integer programming", Annals of Discrete Mathematics 5 (1979) 139-183.
[187] Stoer, J. and C. Witzgall, Convexity and optimization in finite dimensions I (Springer, Berlin, 1970).
[188] Tarjan, R.E., "Complexity of combinatorial algorithms", SIAM Review 20 (1978) 457-491.
[189] Tind, J., "Blocking and antiblocking sets", Mathematical Programming 6 (1974) 157-166.
[190] Tind, J., "On antiblocking sets and polyhedra", in: P. Hammer et al., eds., Studies in integer programming, Annals of Discrete Mathematics 1 (North-Holland, 1977) pp. 507-516.
[191] Trotter, L.E., Jr. and D.B. Weinberger, "Symmetric blocking and antiblocking relations for generalized circulations", Mathematical Programming Study 8 (1978) 141-158.
[192] Tucker, A.C., "Critical perfect graphs and perfect 3-chromatic graphs", Journal of Combinatorial Theory (B) 23 (1977) 143-149.
[193] Weinberger, D.B., "Network flows, minimum coverings, and the four-color conjecture", Operations Research 24 (1976) 272-290.
[194] Weinberger, D.B., "Transversal matroid intersections and related packings", Mathematical Programming 11 (1976) 164-176.
[195] Welsh, D.J., Matroid theory (Academic Press, New York, 1976).
[196] Weyl, H., "Elementare Theorie der konvexen Polyeder", Commentarii Mathematici Helvetici 7 (1935) 290-306. [Translated in: Contributions to the theory of games, Vol. I, Annals of Mathematics Studies, No. 24 (Princeton, 1950) pp. 3-18.]
[197] Whitney, H., "On the abstract properties of linear dependence", American Journal of Mathematics 57 (1935) 509-533.
[198] Wolsey, L.A., "Further facet generating procedures for vertex packing polytopes", Mathematical Programming 11 (1976) 158-163.
[199] Wolsey, L.A., "Valid inequalities, covering problems and discrete dynamic programs", Annals of Discrete Mathematics 1 (1977) 527-538.
[200] Zemel, E., "Lifting the facets of 0-1 programming polytopes", Mathematical Programming 15 (1978) 268-277.
[201] Zimmermann, U., "Matroid intersection problems with generalized objectives", in: Proceedings of the IX international symposium on mathematical programming, Budapest, 1976.
[202] Zimmermann, U., "Some partial orders related to Boolean optimization and the greedy algorithm", Annals of Discrete Mathematics 1 (1977) 539-550.
[203] Zimmermann, U., "Threshold methods for Boolean optimization problems with separable objectives", in: J. Stoer, ed., Optimization techniques, Lecture Notes in Control and Information Sciences 7 (Springer, Berlin, 1978) pp. 289-298.
[204] Zimmermann, U., "Duality and the algebraic matroid intersection problem", Operations Research Verfahren 28 (1978) 285-296.
[205] Zimmermann, U., "Duality principles and the algebraic transportation problem", in: L. Collatz, G. Meinardus and W. Wetterling, eds., Numerische Methoden bei graphentheoretischen und kombinatorischen Problemen, Band 2 (Birkhäuser Verlag, Basel, 1979) pp. 234-255.