Network optimization problems : algorithms, applications, and complexity

NETWORK PROBLEMS SERIES ON APPLIED MATHEMATICS Editor-in-Chief: Frank Hwang Associate Editors-in-Chief: Zhong-ci Shi a...

Author: Du D.-Z. | Pardalos P.M. (eds.)

119 downloads 1757 Views 17MB Size Report

This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!

Report copyright / DMCA form

DOWNLOAD PDF

NETWORK PROBLEMS

SERIES ON APPLIED MATHEMATICS Editor-in-Chief: Frank Hwang Associate Editors-in-Chief: Zhong-ci Shi and Kunio Tanabe

Vol. 1

International Conference on Scientific Computation ed. T. Chan and Z.-C. Shi

P"NETWORK PROBLEMS ALGORITHMS, APPLICATIONS AND COMPLEXITY Editors

Ding-Zhu Du Department of Computer Science University of Minnesota and Institute of Applied Mathematics Academia Sinica, Beijing

P a n o s M. P a r d a l o s Department

of Industrial and Systems University of Florida

Engineering

Y J * World Scientific wfc

Singapore • New Jersey • London • Hong Kong

Published by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farcer Road, Singapore 9128 USA office: Suite IB, 1060 Main Street, River Edge, NJ 07661 UK office: 73 Lynton Mead, Totteridge, London N20 8DH

Library of Congress Cataloging-in-Publication Data Network optimization problems : algorithms, applications, and complexity / editors, Ding-Zhu Du, Panos M. Pardalos. p. cm. — (Series on applied mathematics; v. 2) Includes bibliographical references. ISBN 9810212771 1. System analysis. 2. System design. 3. Mathematical optimization. I. Du, Dingzhu. II. Pardalos, P. M. (Panos M.), 1954III. Series. T57.6.N47 1993 OO3-dc20 93-16337 CIP

Copyright © 1993 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form orby any means, electronic or mechanical, including photocopying, recording orany information storage and retrieval system now known or to be invented, without written permission from the Publisher.

Printed in Singapore by Utopia Press.

V

Preface T h e field of networks is a lively one, both in terms of theoretical developments and in t e r m s of t h e diversity of its applications. Many problems of design and analysis of large systems can be formulated and solved using techniques of network theory. Such problems include communication systems, electrical networks, computer networks, transportation, scheduling of industrial processes, facility location, and modeling of combinatorial optimization problems. Network theory originated m a n y years ago, before our information age. In the eighteenth century, Euler solved the famous Konigsberg Bridge problem and later Kirchoff initiated the theory of electrical networks. But it was not until late last century, when Bell invented t h e telephone, t h a t many areas of network theory were stimulated. After t h e appearance of t h e first graph theory book (by D. Konig) in 1936, there was tremendous development regarding t h e theory and applications of networks. Hitchcock proposed t h e first complete algorithm for t h e transportation problem in 1941, Dantzig proposed t h e simplex algorithm for linear programming in 1947, and algorithms for t h e m i n i m u m spanning tree (Kruskal, 1956) and shortest p a t h problems were proposed ( P r i m , 1957). During t h e same period, the first commercial computers became available. As it happened with many other areas of research, the fields of computer science and networks influenced each other in many respects. In 1962 the book by Ford and Fulkerson on "Flows in Networks" appeared. W i t h t h e development of new d a t a structure techniques and t h e theory of computational complexity we entered a new era of algorithmic developments in networks. During t h e second half of our century we saw major technological developments in all areas of h u m a n endeavor and particularly in information processing. C o m p u t e r networks play a vital role in providing fast, reliable, cost-effective means of communication and information sharing. In addition, network techniques and computer technology enable us to solve large-scale network models t h a t appear in applications such as transportation and telecommunications. It is clear t h a t the theory and applications of networks is so great that this book could not give a full account and systematic t r e a t m e n t of the subject in its entirety. It is our intention to introduce a number of special topics in order to show t h e spectrum of recent research activities and the richness of ideas in the development of algorithms and t h e applications of networks. While we were able to provide only a glimpse of this expansive field, we felt t h a t this glimpse would allow the reader to sense t h e breadth and t h e depth of t h e field. We would like to take the opportunity to thank t h e authors of the papers, t h e anonymous referees, and t h e publisher for helping us to produce this excellent collection of papers.

vi Ding-Zhu Du and Panos M. Pardalos University of Minnesota and University of Florida October 1992

Preface

vii

Contents Preface

v

Greedily Solvable Transportation Networks and Edge-Guided Vertex Elimination Ilan Adler and Ron Shamir

1

1. Introduction 2. Preliminaries 3. Vertex Elimination and Spanning Trees 4. Generating an O p t i m a l Spanning Tree 5. Gale Certificates 6. Signatures 7. S u m m a r y References Networks Minimizing Length Plus t h e N u m b e r of S t e i n e r P o i n t s Thomas Colthurst, Chris Cox, Joel Foisy, Hugh Kathryn Kollett, Holly Lowy, and Stephen Root

1 4 7 11 12 15 19 20

23 Howards

1. Introduction 23 2. Examples of Networks Minimizing Length Plus t h e N u m b e r of Steiner Points .24 3. Bounds on t h e N u m b e r of Edges Meeting at Steiner Points 32 References 35 P r a c t i c a l E x p e r i e n c e s U s i n g an I n t e r a c t i v e O p t i m i z a t i o n P r o c e d u r e for V e h i c l e S c h e d u l i n g Joachim R. Daduna, Miodrag Mojsilovic, and Peter Schiitze

37

1. Introduction 2. Problem Formulation 3. M a t h e m a t i c a l Formulation 4. Operational Process 5. Example 6. Results 7. Conclusions 8. Outlook References

37 38 40 43 43 48 49 51 51

viii

Contents

S u b s e t I n t e r c o n n e c t i o n D e s i g n s : G e n e r a l i z a t i o n s of S p a n n i n g T r e e s and Steiner Trees Ding-Zhu Du and Panos M. Pardalos

53

1. Introduction 2. Multi-Phase Spanning Networks 3. Multi-Phase Steiner Networks References

53 54 59 61

Polynomial and Strongly Polynomial Algorithms for C o n v e x N e t w o r k O p t i m i z a t i o n Dorit S. Hochbaum

63

1. Introduction 2. P r o x i m i t y Theorems and Piecewise Linear Approximations 3. T h e Impossibility of Strongly Polynomial Algorithms 4. Q u a d r a t i c Network Flow Problems References

64 68 77 80 88

H a m i l t o n i a n C i r c u i t s for 2 - R e g u l a r I n t e r c o n n e c t i o n N e t w o r k s Frank K. Hwang and Wen-Ching Winnie Li

93

1. Introduction 2. A General Approach 3. T h e E x t e n d e d Double Loop Network 4. Semi-Torus Networks 5. S e m i - M a n h a t t a n Networks References

94 95 96 103 106 109

E q u i v a l e n t F o r m u l a t i o n s for t h e S t e i n e r P r o b l e m in G r a p h s Bassam N. Khoury, Panos M. Pardalos, and Donald W. Hearn

Ill

1. Introduction 2. T h e S P G and t h e S P D G 3. Mixed Integer Formulations 4. Integer Formulations 5. Continuous Formulations 6. Concluding Remarks References

Ill 112 113 114 119 120 121

Contents

ix

M i n i m u m Concave-Cost Network Flow Problems with a Single Nonlinear Arc Cost Bettina Klinz and Hoang Tuy

125

1. Introduction 2. A P a r a m e t r i c Method for Rank Two Quasiconcave Minimization 3. A Special M i n i m u m Concave-Cost Network Flow Problem 4. A Strongly Polynomial T i m e Algorithm for SSU Networks 5. An Improved Algorithm for SSU Networks 6. S u m m a r y and Concluding Remarks References

125 127 130 137 138 143 143

A M e t h o d for S o l v i n g N e t w o r k F l o w P r o b l e m s with General Nonlinear Arc Costs Bruce W. Lamar

147

1. Introduction 2. Problem Formulation 3. Conversion Procedure 4. Numerical Examples 5. S u m m a r y References

147 150 152 158 165 166

A p p l i c a t i o n of G l o b a l L i n e S e a r c h in O p t i m i z a t i o n of N e t w o r k s Jonas Mockus

169

1. Global Line Search 2. Optimization of Networks 3. T h e Optimization of High-Voltage Net of Power System 4. Mixed Integer Global Line Search References

169 170 171 173 175

S o l v i n g N o n l i n e a r P r o g r a m s w i t h E m b e d d e d N e t w o r k S t r u c t u r e s . . . . 177 Mustafa C. Pinar and Stavros A. Zenios 1. Introduction and Background 2. T h e Linear-Quadratic Penalty Algorithm for Networks with Side Constraints and Variables 3. Numerical Experience 4. Conclusions References

177 179 189 200 200

x

Contents

O n A l g o r i t h m s for N o n l i n e a r D y n a m i c N e t w o r k s Warren B. Powell, Elif Berkkam, and Irvin J. Lustig

203

1. Introduction 2. T h e N D N - X Formulation 3. T h e Transformed Problem N D N - T 4. Solution Algorithms for N D N - T 5. Numerical Results 6. Appendix: Calculation of t h e Derivatives References

203 205 207 209 212 215 221

S t r a t e g i c a n d T a c t i c a l M o d e l s a n d A l g o r i t h m s for t h e C o a l I n d u s t r y U n d e r t h e 1990 Clean Air A c t Hanif D. Sherali and Quaid J. Saifee

233

1. Introduction 2. Related Models in the Literature 3. Formulation of a Long-Term Strategic Model 4. Solution Procedures 5. C o m p u t a t i o n a l Experience Appendix: Modifications for a Tactical Day-to-Day References

234 236 237 246 249 255 261

M u l t i - O b j e c t i v e R o u t i n g in S t o c h a s t i c E v a c u a t i o n N e t w o r k s J. MacGregor Smith

263

1. Problem Overview 2. Assumptions and Definitions 3. Mathematical Model 4. Congestion Properties 5. Algorithm 6. Example 7. S u m m a r y and Conclusions References

263 265 267 268 272 274 280 280

A S i m p l e x M e t h o d for N e t w o r k P r o g r a m s w i t h C o n v e x Separable P i e c e w i s e Linear Costs and Its Application to Stochastic Transshipment Problems Jie Sun, K.-H. Tsai, and L. Qi 1. Introduction

283

284

Contents

xi

2. Background Materials 3. T h e Simplex Algorithm for ( N e t P L P ) and Its Convergence 4. Implementation of t h e Algorithm 5. C o m p u t a t i o n a l Results 6. T h e S T P and Computational Results 7. S T P C and O t h e r Extensions References

285 288 288 292 294 296 298

A Bibliography on N e t w o r k Flow P r o b l e m s Marinus Veldhorst

301

1. Introduction References

301 304

Tabu Search: Applications and P r o s p e c t s Stefan Vofl

333

1. Introduction 2. Tabu Search 3. Applications 4. Concepts for Parallel Tabu Search 5. Conclusions References

333 334 338 346 350 351

T h e S h o r t e s t P a t h N e t w o r k a n d I t s A p p l i c a t i o n s in Bicriteria Shortest Path Problems Guo-Liang Xue and Shang-Zhi Sun

355

1. Introduction 2. T h e Shortest P a t h Network 3. Applications 4. Conclusions References

356 356 359 360 361

A N e t w o r k F o r m a l i s m for P u r e E x c h a n g e E c o n o m i c E q u i l i b r i a Lan Zhao and Anna Nagurney

363

1. Introduction 2. T h e Variational Inequality Model of t h e P u r e Exchange Economy and its Isomorphic Network Equilibrium Representation 3. A General Iterative Scheme for t h e C o m p u t a t i o n of Walrasian Price Equilibrium

363 365 367

xii 4. T h e Projection and Relaxation Methods for t h e C o m p u t a t i o n of t h e Equilibrium Prices 5. Numerical Examples 6. S u m m a r y and Conclusions References

Contents

372 380 384 385

S t e i n e r P r o b l e m in M u l t i s t a g e C o m p u t e r N e t w o r k s Sourav Bhattacharya and Bhaskar Dasgupta

387

1. Introduction 2. Multistage Interconnection Networks 3. Multistage Communication Networks 4. Conclusion References

387 389 397 400 401

1 Network Optimization Problems, pp. 1-22 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Greedily Solvable Transportation Networks and Edge-Guided Vertex Elimination Ilan Adler IEOR Department,

University

of California,

Berkeley,

CA 94720

USA.

Ron Shamir Department of Computer Science, Sackler Faculty of Exact Sciences, University, Tel-Aviv 69978, ISRAEL.1

Tel Aviv

Abstract

The greedy algorithm for the transportation problem repeatedly picks an edge, maximizes flow on it and updates the supplies and demands. If, with the same order of edges, the greedy algorithm gives an optimal (resp., feasible) solution for every feasible supply and demand functions, we call that order an optimality (resp., feasibility) sequence. We show that with a feasibility sequence, one can guarantee a feasible solution which is also a spanning tree, or give a certificate on infeasibility. Furthermore, with an optimality sequence, one can guarantee an optimal basic solution for every feasible problem. We also show how to obtain a spanning tree with a given signature or a dual feasible basis with a given signature, using feasibility and optimality sequences. Our results build on some interesting properties of vertex elimination algorithms which are guided by edge orders.

1

Introduction

T h e transportation problem can be stated as follows: A commodity which is available at certain sources is demanded at some destinations. Shipping costs from each source 'Supported by AFOSR grants 89-0512 and 90-0008, and by NSF grant STC88-09648.

2

I. Adler & R.

Shamir

to each destination are known. Given the amount available at each source and the a m o u n t d e m a n d e d at each destination, the problem is to determine how much to send directly from each source t o each destination so t h a t all supplies and d e m a n d s are met and t h e total shipping cost incurred is m i n i m u m . W h e n not every source can ship to every destination, we say t h a t the problem is restricted. T h e transportation problem is one of t h e fundamental problems in combinatorial optimization, and has been studied intensively over t h e last fifty years. T h e excellent surveys [2] and [11] describe several algorithms for transportation and related network flow problems. One n a t u r a l approach for solving t h e transportation problem is by a greedy algorithm, guided by a predetermined order of t h e edges. This algorithm repeatedly picks t h e next edge in t h a t order, sends t h e m a x i m u m possible a m o u n t of flow along it and u p d a t e s t h e supplies and d e m a n d s accordingly. In general, this approach does not guarantee an optimal solution, but is often used to generate an initial feasible solution for more elaborate methods in unrestricted problems. T h e "north-west corner rule" and t h e " m i n i m u m CV, rule" are two well-known examples of this approach (see, e.g., [13].) T h e greedy algorithm has t h e obvious advantage t h a t it is very fast, hence t h e interest in identifying when it guarantees an optimal solution. Research has focused on cases when there exists a single order of the edges which - when used by t h e greedy algorithm - is guaranteed to produce an optimal solution for all feasible problems on t h a t network. Such order of the edges is called an optimality sequence. T h e knowledge of an optimality sequence is useful in particular if one needs to solve several problems with t h e same costs, but with varying supplies and demands: Given an optimality sequence, each problem is solvable in linear time. Similarly, if the greedy algorithm guided by some edge-order produces a feasible solution for every feasible problem, t h a t order is called a feasibility sequence. Hoffman [15] gave a characterization of bipartite networks which a d m i t an optimality sequence, in t e r m s of t h e Monge property. (The property will be defined in t h e next section.) Dietrich and Shamir [8, 18] (see also [19]) generalized the characterization to restricted transportation problems. Many families of problems which fit those characterizations or closely related ones have been studied in operations research, computational geometry and molecular biology applications. (See [1] for a list of references.) Alon et. al. [3] provided an efficient algorithm which finds an optimality sequence or determines t h a t no such sequence exists for unrestricted problems. Dietrich and Shamir [8, 18, 19]) generalized t h e algorithm to restricted transportation problems. Adler, Hoffman and Shamir [1] characterized t h e bipartite graphs which admit a feasibility sequence in graph-theoretic terms and gave very efficient algorithms for recognizing t h e m . They also extended t h e characterizations and algorithms for feasibility and optimality sequences to general, non-bipartite networks. In this paper we introduce a requirement t h a t the solution obtained from t h e greedy algorithm will also be basic. This requirement has several motivations: In m a n y linear p r o g r a m m i n g situations, t h e optimal solution of a problem is also required

Greedily

Solvable

Transportation

Networks

3

to be basic. This is needed when dual information is required, or for post-optimality analysis. Such complete information on t h e problem is not always available from an optimal solution, unless it is also a dual feasible basis. For example, a solution to t h e dual problem m a y not be immediately available in t h a t situation. In case t h e problem is infeasible, it may also be i m p o r t a n t to obtain a proof of infeasibility by pointing out a "bottleneck" which caused t h e infeasibility in t h e original problem. For example, m e t h o d s using linear programming solvers as subroutines (or "oracles"), require a separating hyperplane in case the problem is infeasible (cf. [10]). How much t h e n do we have to sacrifice in terms of complexity, and in t e r m s of t h e size of t h e class of greedily solvable problems, in order to guarantee t h a t we get a basis? As it turns out, by modest modifications of t h e algorithms, we can guarantee obtaining basic information without increasing t h e complexity and without decreasing t h e size of the class. It is well known in linear programming theory t h a t proper p e r t u r b a t i o n of t h e right-hand side a n d / o r the objective function generates a problem in which t h e only optimal solution is also an optimal basis. However, it is interesting to show how to achieve an optimal basis directly, by equipping an optimization algorithm with an appropriate mechanism (typically some sort of tie breaker). For example, Bland [7] showed how to prevent cycling in t h e simplex m e t h o d , by endowing t h e simplex algorithm with some combinatorial properties, instead of using t h e geometric 'trick' of p e r t u r b a t i o n . Similarly, for t h e greedy algorithm in t h e transportation problem, we construct here edge selection rules which guarantee independently t h a t t h e final selected set is a basis, primal feasible and dual feasible. One immediate consequence of this setup is t h a t imposing t h e intersection of these selection rules guarantees t h a t t h e final set of edges is an optimal basis. T h e "modular" approach outlined above yields additional results: Since these selection rules are independent, one can combine a subset of these rules with other rules so t h a t t h e final selected set of edges will satisfy additional desired properties. For example, an additional rule which enforces a given signature leads t h e greedy algorithm (using an optimality sequence) t o produce a dual feasible spanning tree with a a prescribed signature. This t y p e of analysis is used to obtain several additional results. Our m a i n results are the following: • In section 3 we define a very simple generic vertex elimination algorithm, which scans t h e edges in predetermined order and eliminates one vertex in each step. We show t h a t with any edge order and on any graph, t h a t algorithm generates a spanning forest. These results do not require the transportation and greedy algorithm setting, and may be of interest by themselves. Next we show t h a t for bipartite graphs, if the edge order forms a feasibility sequence, then t h e generic algorithm forms a spanning tree and therefore a basis. Moreover, if t h e order is an optimality sequence, then t h e resulting basis is dual feasible. • In section 4 we slightly modify t h e greedy algorithm so t h a t it will have t h e

4

I. Adler & R.

Shamir

properties of t h e vertex elimination algorithm defined in section 3. We show t h a t with an optimality sequence this new algorithm guarantees an optimal basis for every feasible problem. • In section 5 we concentrate on t h e consequences on negative termination in t h e greedy algorithm: Whenever t h e algorithm uses a feasibility sequence and determines t h a t a problem is infeasible, we can identify a "bottleneck" set of sources (or destinations) whose total supply exceeds t h e total d e m a n d of all their neighbors in t h e original problem. • As a by-product, we show in section 6 t h a t t h e tools developed here yield some interesting results on signatures of transportation problems. Recall t h a t t h e signature of a tree on a bipartite graph is t h e vector of t h e degrees of t h e sources in t h a t tree. We give an algorithm which when guided by a feasibility sequence, produces a spanning tree with a given signature or determines t h a t t h e signature is invalid. T h e same algorithm, when guided by an optimality sequence, produces a dual feasible basis with t h a t signature, if it is valid. Finally, we give new characterizations for feasibility and optimality sequences using signatures.

2

Preliminaries

Let G = (7, J; E) be a b i p a r t i t e graph whose two vertex sets (or sides) are I and J , where E C I x J , |7| = m , \J\ = n and \E\ = p. T h e vertices in I and J are numbered 1 , . . . , m and m + 1 , . . . , m -\- n, respectively. We assume throughout t h a t graphs are undirected and connected. For convenience, ij and ji are used interchangeably to denote t h e edge between i and j . Denote V = IUJ. T h e set N(v) = {j € V\vj 6 E} is called t h e neighborhood of v, and dv = \N(y)\ is called t h e degree of vertex v. v is called an end-vertex if dv = 1. For each edge ij 6 E, a cost Ci3 > 0 is assigned. Again, we use b o t h dj and Cji for t h e cost of t h e edge between i and j . Edge costs m a y also be represented by an m x n m a t r i x , in which case we define C;,- = oo for each ij $ E. T h e graph together with its edge-costs are called a network and are denoted N = (G,C). Finally, for each vertex v £ V, a non-negative excess e„ is given. T h e bipartite network together with an excess vector specify t h e input to a transportation problem. T h e problem is called restricted if t h e underlying bipartite graph is incomplete. T h e origin of t h e model is in planning of shipping, where t h e sets I and J correspond to sources and destinations, respectively. A commodity available at t h e sources must be shipped to satisfy d e m a n d s at t h e destinations, dj is t h e cost of shipping each unit of t h e c o m m o d i t y on edge ij. ev is t h e supply available at source v if v € I, or t h e d e m a n d at destination v if v 6 J- T h e transportation problem is to determine t h e a m o u n t Xij to be shipped each edge ij, so t h a t all supplied and d e m a n d s are met

Greedily

Solvable

Transportation

Networks

5

at m i n i m u m overall cost, i.e., min

s.t. JZ xa

= e,

i<EV

Xij

> 0

ij G E

(P(N,e))

We say t h a t t h e excess vector e is feasible if P(N, e) has a feasible solution, i.e., a solution in which all supplies and demands are satisfied. An obvious necessary condition for feasibility is t h a t J2iei e; = Y.j^j^jGiven a spanning tree T in G, t h e (unique) solution of YlijeT xij = e > ? £ ^ xij = 0 ij £ E — T is called t h e the primal basic solution associated with T. (A spanning tree is sometimes called a basis, since t h e set of columns corresponding to its edges forms a m a x i m a l independent set in t h e coefficient m a t r i x of t h e equations in P(N,e).) If t h e primal basic solution is non-negative t h e n it is called a feasible solution, and T is called primal feasible. T h e dual of t h e transportation problem is defined as : 2 J eiui + zZ is/ jeJ S.t. U; + Vj < Cij

max

e v

jj (D(N,e)) ij G E

Given a spanning tree T in G, t h e (unique) solution of t h e system ui + Vj = Cij ij G T is called t h e the dual basic solution associated with T. If t h e dual basic solution satisfies t h e rest of t h e inequalities in D(N,e) then it is called dual feasible solution, and T is called dual feasible. A tree which is b o t h primal and dual feasible is called optimal. Let S = (Si,S2,.-.Sp) be a p e r m u t a t i o n of t h e edges. For a problem P(N,e), t h e greedy algorithm maximizes each variable in t u r n , according to the order given in S. A formal description is the following. (We assume t h a t initially X;J = 0 for all ij G E). algorithm G R E E D Y ( S ) ; begin For i = 1, ...,p do : begin pick the next edge •?,- = rs. xrs «- min{e r ,e 5 } 6r <

6r

XTS

e$ * es

%rs

end if e 5^ 0 t h e n output "infeasible" else output x end A p e r m u t a t i o n S of t h e edges is a feasibility sequence for t h e graph G if for every feasible excess vector e, G R E E D Y ( 5 ) provides a feasible solution to P(N,e). S is

I. Adler & R. Shamir

The Z Property kl i E

The Hoffman Property Cij + Cu < Ca + Ck3

Figure 1: demonstration of the Z and Hoffman properties. A broken edge cannot appear first in S among the edges in its figure.

called an optimality sequence for the network N if for every feasible excess vector e, algorithm GREEDY(5) provides an optimal solution to P(N,e). For an order S s of the edges, write ij -< kl if ij precedes kl in S. To simplify notation, we shall use ij -< kl when there is no ambiguity about the order. We say that S has the Z property for the graph G if the following holds: The Z Property: s .. s For every i,k G / and j , I £ J , if ij -< il and ij -< kj then kl € £ . Note that the condition implicitly requires that ij, il and kj are edges, by the fact that they are included in S. An equivalent statement to that above is that if kl £ E s s then the order li >- ij -< jk is impossible (see figure 1). We say that S has the Hoffman property for the network ./V if the following property holds (see figure 1): The Hoffman Property: For every i,k 6 I and j , I € J, such that ij, il, kj and kl are all edges in s .. s E, if ij -< il and ij -< kj then Cij + Ckt < Cu + CkjA sequence is called a Monge sequence for the network N if it satisfies both the Z property and the Hoffman property. Note that in a complete bipartite graph every permutation trivially satisfies the Z property, hence the Hoffman property and the Monge property are equivalent on unrestricted transportation networks. The following theorem summarizes some of the known characterizations and complexity results for optimality and feasibility sequences in bipartite networks. (Recall that a bipartite graph is called chordal bipartite if it does not contain an induced cycle of length six or more [12].) For the complexity results we assume m < n: Theorem 2.1 (a) [15, 19] S is an optimality sequence for a network if and only if it is a Monge sequence.

Greedily

Solvable

Transportation

Networks

1

(b) [19, 1] S is a feasibility sequence for G if and only if it has the Z property. (c) [1] A bipartite graph admits a feasibility sequence if and only if it is chordal bipartite. (d) [3, 19] In a bipartite network, one can construct an optimality sequence or determine that no such sequence exists in 0(pm log n) steps. (e) [1] In a bipartite graph, one can construct a feasibility sequence or determine that no such sequence exists in 0(p\ogn) steps.

3

Vertex Elimination and Spanning Trees

First, we consider vertex elimination in a general setting. A basic step which will be used in several algorithms in t h e sequel is eliminating a vertex together with all t h e edges incident on it from t h e graph G(V, E). A formal description is t h e following: procedure ELIMINATE^); begin V <- V - {v} E <- E- {vj e E\j € N{v)} end In all t h e algorithm we shall discuss, we call a vertex active at a certain stage if it is in t h e current set of vertices V. An edge is called active if it is in t h e current set E, i. e., if b o t h of its endpoints are active. T h e generic algorithm described below builds a set of edges B which under certain conditions will be shown to form a spanning tree. Initially, B is empty. P a r t of t h e input of t h e algorithm is a p e r m u t a t i o n of the edges S = ( S i , S 2 , . . . , Sp), which is used to guide t h e algorithm: T h e algorithm examines the edges one at a t i m e , according to their order in S. Whenever t h e examined edge is active, it is added to t h e set B and one of t h e vertices incident on it is eliminated. If exactly one vertex incident on t h e examined edge is an end-vertex, t h a t vertex is eliminated. Otherwise, t h e eliminated vertex is chosen arbitrarily. This algorithm is described below: algorithm GENERATE-B(£); begin B^-0; For i = 1, ...,p d o : b e g i n pick the next edge S{ = rs. if rs is active t h e n b e g i n B <- B U {rs} if exactly one endpoint of rs is an end-vertex t h e n call it v. else pick either v <— r or v <— s e n d i f ELIMINATE(v) end

8

I. Adler & R.

Shamir

end end This algorithm satisfies two elementary properties, which we call t h e vertex ination properties:

elim-

• V E . l : For each edge added to B, one of its endpoints is eliminated and no other edge incident on t h a t endpoint may be added later into B. • VE.2: If an edge added to B is incident on at least one end-vertex, then one of its endpoints which is an end-vertex is eliminated. T h e freedom to choose t h e p e r m u t a t i o n S and t h e eliminated vertex is quite large. We shall show in t h e sequel several ways of exploiting this freedom, in designing algorithms which carry out additional tasks while maintaining these vertex elimination properties. Vertex and edge elimination schemes have been studied in t h e past, beginning with t h e work of Rose [17], in conjunction with efficient Gaussian elimination in sparse matrices (see [12, ch. 3 and 12] and t h e references thereof). In t h a t context, vertex elimination procedures use a p e r m u t a t i o n of t h e vertices, and in each step one vertex is eliminated. In edge elimination, an ordering of a subset of the edges in sought, and both endpoints are always eliminated with t h e current edge. T h e vertex elimination properties defined here require t h a t the algorithm eliminate one endpoint of a selected edge in each step, b u t be guided by an ordering of all t h e edges. T h e reason for this tighter definition will become clear in t h e following sections. We first note some useful properties of t h e algorithm which are t r u e for any ordering S of t h e edges. L e m m a 3 . 1 Let B be a set of edges generated by algorithm GENERATE-B. If (i\i2, ^ 3 > • • •, h-iik ) is a path of edges in B, then either i\ii -< iTiT+i for r = 2 , . . . k — 1, or ik-iik X M r + i for r = 1 , . . . ^ — 2. P r o o f . Assume t h a t itit+i is t h e first among t h e edges in the p a t h to appear in S, and 1 < t < k — 1. T h e n on t h e same step when t h a t edge was introduced into B, by property V E . l , either it or it+i became inactive. In the former case, it-iit could not be added later to B. In t h e latter, it+iit+2 could not be added later to B. In both cases we obtain a contradiction. • C o r o l l a r y 3 . 2 If B was generated by algorithm GENERATE-B, then for every path of edges in B, the order of appearance of these edges in S is unimodal. That is, if the path is ( i i i i , 22^3, • • •, ik-ih), then for some r, 1 < r < k — I, «1«2 -< »2*3 -< . . . X J r - l * r ~< ir,ir+l

X ir+l*r+2 X • • • X

P r o o f . Apply L e m m a 3.1 t o s u b p a t h s of t h a t p a t h . •

ik-lik-

Greedily

Solvable

Transportation

C o r o l l a r y 3 . 3 The set B generated

9

Networks by the algorithm

is acyclic.

•

Note t h a t for t h e above only t h e first vertex elimination property ( V E . l ) was required. Note also t h a t in every stage of the algorithm, every connected component of B contains exactly one active vertex. So far we did not require t h e bipartiteness of t h e graph. We shall use t h e results above for graphs of t h e transportation problems, which are bipartite. Generating a spanning forest or a spanning tree in any graph is an easy, well-studied problem. It is t h e additional requirements from t h e spanning tree (e.g., dual feasibility, optimality or correspondence to a given signature) which make t h e problem of interest to us. In general, t h e final set B generated by t h e above algorithm may contain several connected components. We now show t h a t if S is a feasibility sequence, t h e final set B will be a spanning tree. T h e proof relies on t h e following lemma: L e m m a 3.4 Let S be a feasibility sequence for G. Algorithm GENERATE-B(S) maintains the following invariants: (i) the set E is connected, (ii) the set B U E is connected, and (Hi) the set B U E spans the original graph. P r o o f . T h e fact t h a t in each step B U E spans t h e original graph is i m m e d i a t e from t h e algorithm. T h e proof of (i) and (ii) is inductive. We give here the proof of (i) only, since t h e proof of (ii) in similar. Originally G is connected. Assume t h a t E is connected before the insertion of edge rs into B. Denote the set of active edges after t h a t step by E'. We need to prove t h a t E' is connected. If dT = 1 or d„ = 1, by property V E . 2 , E' = E — {rs}, so t h e statement is clearly true. Otherwise t h e degrees of both endpoints are at least two. Let q ^ s and ( / r be such t h a t rq £ E and ts £ E. Since rs ~< rq, rs -< ts, by t h e Z property of sequence S, tq € E. More generally, by t h e Z property, t h e set N(r) U N(s) induces a complete bipartite graph in E. W i t h o u t loss of generality suppose r is eliminated. T h e n N(s) — {r} jt 0, and t h e set N(r)(jN(s) — {r} induces a complete bipartite graph in E'. If E' is not connected, then since E was connected, and since all t h e edges which were inactivated in t h a t step were incident on r, each component of E' must contain a vertex in N(r). But any two neighbors of r are also neighbors of t, so they remain connected in E', a contradiction. • P r o p o s i t i o n 3.5 If S is a feasibility sequence, gorithm GENERATE-B(S) is a spanning tree.

then the final set B produced by al-

P r o o f . W h e n t h e algorithm terminates, E = 0. Hence by L e m m a 3.4, B is connected and spans t h e original graph. By Corollary 3.3 B is cycle-free. Hence it is a spanning tree. • Let us now show t h a t in a sense, t h e converse of Proposition 3.5 also holds: Call a p e r m u t a t i o n S a spanning tree sequence for G if for every induced connected subgraph G'(V',E') of G, algorithm G E N E R A T E - B ( S ) operating on G' generates a spanning

10

I. Adler & R.

Shamir

tree. ( T h e algorithm can be viewed as operating on G while considering all vertices which are not in V as inactive from the beginning. An equivalent interpretation is t h a t t h e induced subsequence of S which is obtained by restricting S to edges in E' is used on G'.) P r o p o s i t i o n 3 . 6 S is a spanning sequence for G.

tree sequence for G if and only if it is a

feasibility

P r o o f . If S is a feasibility sequence then for every induced subgraph G', t h e corresponding induced subsequence of S is also a feasibility sequence on G', by t h e Z property. Hence by Proposition 3.5, S is a spanning tree sequence. If S is not a feasibility sequence, then there exist some i,j, k, / such t h a t kl £ E and kj >- ji -< il (cf. figure 1). On t h e subgraph induced by those four vertices, S does not form a spanning tree. • We now show t h a t when S is a Monge sequence, t h e spanning tree generated by t h e algorithm is also dual feasible: T h e o r e m 3.7 If S is a Monge sequence for N, GENERATE-B(S) feasible spanning tree.

produces

a dual

P r o o f . Since every Monge sequence is a feasibility sequence, by Proposition 3.5, B is a spanning tree. To show it is dual feasible, it suffices to prove t h a t no cycle which is formed by adding a non-basic edge to B is a negative cost cycle (see, e.g., [2]). T h a t is, if ikik+i € B k = 1 , . . . 21 — 1, and i2iii £ E — B, t'l'iij + ki'314 + •- • + t/iji_i«jt < C; 2 ; 3 + G;4,'5 + • • • + L ^ m •

(3-1)

T h e proof is by induction on I: For I = 2, let t h e cycle be (st, tu, ur, rs) where rs is t h e non-tree edge. Applying l e m m a 3.1 to t h e p a t h (st, tu, ur) we get t h a t tu cannot precede b o t h st and ur in S. Clearly, rs cannot precede both st and ur, since otherwise rs would have been in B. Hence by the Hoffman property, Cst + Cur < CTS + C ( u , and equation (3.1) is satisfied. Assume now t h a t the equation (3.1) is t r u e for all cycles of length up to 2/ — 2, and let ( i \ i 2 , 22*3, • • •, 121-^21, ^2fli) be a cycle of 2/ edges, where only i2ii\ £ B (see figure 2). Using L e m m a 3.1 again, we can assume without loss of generality t h a t i\i2 appeared first in S among t h e cycle edges. By property Z of t h e Monge sequence this implies t h a t 12113 € E. Moreover, by t h e Hoffman property, Giji2 + Ci2li3 < C,-2,-3 + Cjjjij.

(3.2)

Using t h e induction hypothesis for t h e cycle (1314, 1415, . . . , i2/-iJ2/, J2/J3) whose length is 2/ — 2, we get t h a t G,'3,'4 + Ci5i6 + • • • + t7,-2t_1i21 < 6, 4 i 5 + C,'6;7 + • • • + Ci2li3.

(3-3)

Greedily

Solvable

Transportation

Networks ^^-?*

ii9h-

*2I—1

'-+ ?i

Figure 2: diagram for the proof of Theorem 3.7. Solid lines: edges in B. Dotted lines: edges in E — B.

S u m m i n g equations (3.2) and (3.3) together, we obtain t h e desired proof of (3.1) for

/. • We can again show t h a t t h e above condition leads to a new characterization of Monge sequences, in a fashion similar to Proposition 3.6: Call a sequence a dual feasibility sequence for N if GENERATE-B(S') produces a dual feasible tree on every induced connected subnetwork. P r o p o s i t i o n 3.8 S is a dual feasibility mality sequence for G.

sequence for N if and only if it is an opti-

P r o o f . Analogous to the proof of Proposition 3.6. •

4

Generating an Optimal Spanning Tree

T h e algorithm above used only the graph structure and t h e order S to select a subset of t h e edges which forms a spanning tree. We now describe an algorithm which also receives as input t h e excesses at t h e vertices, and a t t e m p t s to generate a feasible or optimal solution to t h e transportation problem. T h e algorithm greedily maximizes the flow on each active edge in turn, according to the order S, and eliminates one of its two endpoints. If one endpoint has strictly smaller excess then t h a t vertex is eliminated. If both have equal excess and one is an end-vertex, then t h a t vertex is eliminated. Otherwise, (if excesses at t h e two endpoints are equal, and either none is an end-vertex or b o t h are,) we can choose t h e eliminated vertex arbitrarily. Infeasibility is detected whenever a vertex with positive excess becomes isolated. T h e algorithm is described below: algorithm GREEDY*(S); begin B *- 0; k <- 0;

12

I. Adler & R.

Shamir

For i = 1, ...,p do : begin pick the next edge Si = rs. if rs is inactive t h e n return else begin k <— k + 1 /* step fc begins */ if one endpoint has less excess, say e r < e s t h e n begin ELIMINATE(r); if ds = 0 then output "infeasibility" and stop end else (e r = e s ) begin if dr = 1 t h e n ELIMINATE(r) else ELIMINATE^) end B *- flu {rs} x r 5 <- min{e r ,e s } cr * cr

*Crs

end end end Successful termination occurs when step k = m + n — l i s completed without detecting infeasibility. In t h a t case a single vertex remains active and has excess zero. T h e above algorithm is a specialization of algorithm G R E E D Y which was described in section 2, modified so t h a t it will also have t h e vertex elimination properties. In other words, it has both the properties of t h e greedy algorithm and of algorithm G E N E R A T E - B . An immediate consequence of this, using T h e o r e m 2.1(b) and Proposition 3.5, is t h e following: P r o p o s i t i o n 4 . 1 If the sequence S satisfies the Z property, then for every feasible problem, algorithm GREEDY*(S) produces a primal feasible basic solution and its associated spanning tree B. • From Proposition 4.1 and Theorem 3.7 we get t h e final result of this section: T h e o r e m 4 . 2 If S is a Monge sequence, then for feasible problems, GREEDY*(S) terminates with an optimal basic solution and an optimal tree B. •

algorithm spanning

As mentioned in section 2, given t h e optimal tree B, one can also construct t h e corresponding dual basic solution, by solving a system of linear equations. Note t h a t t h e converse of Proposition 4.1 and T h e o r e m 4.2 are already known to be true, by [1] and [19], respectively.

5

Gale Certificates

In this section we discuss t h e consequences of terminating t h e algorithm of Section 4 with t h e o u t p u t 'infeasible'. For a set of vertices S G V, let e(S) = J2ves ev be its

Greedily Solvable

Transportation

Networks

13

total excess. If there exists a set of sources (or destinations) whose total supply (resp., d e m a n d ) exceeds t h e total d e m a n d (resp., supply) of all their neighbors t h e n t h e problem is obviously infeasible. Gale [9] has shown t h a t t h e converse is also true, i.e., whenever t h e t r a n s p o r t a t i o n problem is infeasible, there exists a set S € / (or S € J) such t h a t e(S) > e(N(S)). (This is a generalization of t h e celebrated Hall's theorem for t h e assignment problem[14].) Therefore, finding such a set S can be viewed as a proof of infeasibility. Let us call such a set S a Gale certificate of infeasibility. We shall show t h a t whenever t h e greedy algorithm is guided by a feasibility sequence and t e r m i n a t e s with t h e answer "infeasible", a Gale certificate can be pointed out. For t h e following proposition we need some notation which refers to intermediate steps of t h e algorithm. Let Gk = (Vk, Ek) be t h e graph after step k of t h e algorithm, and let ek and dk be t h e excess and t h e degree, respectively, of vertex v in t h a t stage. G = G° is t h e original graph. We can view t h e problem Pk = ((Gk,C),ek) as a new transportation problem starting with t h e modified d a t a after step k. Let Bk be t h e final set B produced by an algorithm operating on problem Pk as t h e initial problem. Note t h a t for each k, step i of t h e algorithm on Pk is identical to step k + i on t h e original problem. In particular, if ak is t h e edge introduced into B in step k by t h e algorithm operating on t h e original problem P°, t h e n Bk = Bk+1 U {ak}. Moreover, infeasibility is detected at the same vertex in all these nested problems. Suppose t h e algorithm t e r m i n a t e d after determining infeasibility at vertex w, in step / of the original problem. T h a t is, d'w = 0 and e'w > 0. W i t h o u t loss of generality assume w £ I. Define Rk = {u £ / | 3 a p a t h in-B^ from u; to u } Lk = {u £ J \3v £ Rk such t h a t vu £ Ek } In other words, R is t h e set of vertices in I which are in t h e same connected component of Bk with w, and Lk is t h e set of their neighbors in Gk. Finally, let Dk = RkULk. P r o p o s i t i o n 5.1 Suppose S is a feasibility sequence for G. Then with the definitions above, for k = 0 , . . . , I: (i) Rk is a Gale certificate for Pk. (ii) For every v £ D , there exists a path from v to w containing only edges from B . P r o o f . T h e proof is by backwards induction on k = Z, Z — 1 , . . . , 0. For t h e induction basis, note t h a t after t h e last step Z, d'w = 0 and e'w > 0. Hence R — {w} is a Gale certificate, and D' = {w}, so (ii) is trivially true. We now show t h e inductive step. Suppose (i) and (ii) hold for k, and let us prove t h e m for k — 1. Denote t h e edge introduced into B in step k by ak = rt, where r is t h e vertex eliminated in step k. Recall t h a t we assumed t h a t w £ I. We distinguish four cases (compare figure 3): C a s e I : r € I, t £ Lk. Since t € Lk, by t h e induction hypothesis there exists p £ Rk such t h a t pt 6 Bk, so rt belongs to the same component as w in B f c _ 1 . Is it possible t h a t in Gk~1, vertex r has a neighbor which is not in Lk1 Suppose rj € Ek_1.

14

I. Adler & R.

Case I

Case II

Case III

Shamir

Case IV

Figure 3: diagrams for the proof of the existence of a Gale certificate. In all cases r is the eliminated vertex, and dotted edges are eliminated with it. Edges included in the final set B are marked by double lines. Only vertices which are used in the proof are labeled.

Since j is active in step A;—1, we get rj >- rt -< pt, so by t h e Z property of t h e sequence S, pj £ Ek~x. But then also pj £ Ek, so j £ Lk. Hence all neighbors of r in Gk~x are also present in Gk, which implies Dk~x = Dk U {rt}. This immediately implies (ii). (i) follows since e * " 1 ^ - 1 ) = ek{Rk)

+ e*" 1 > ek(Lk)

+ e*" 1 = ek~\Lk^),

(5.1)

where t h e inequality follows from t h e inductive hypothesis. C a s e II: r £ I, t £ Lk. Here we want to show t h a t Rk = Rk~1, i.e., rt is not in t h e connected component of w in Bk~l. This will imply (i) and (ii). Suppose there exists a p a t h in Bk~x from r to w. Such p a t h must contain an edge it £ Bk_1 such t h a t i is connected to w in Bk~x, since r is eliminated in step k — 1. B u t then it £ Bk and i £ Rk, so t £ Lk, a contradiction. C a s e III: r £ J, t g Rk. In t h a t case Rk = Rk~1, and r £ Lk, since r g Vk. (i) and (ii) are thus i m m e d i a t e by t h e inductive hypothesis. C a s e I V : r £ J, t £ Rk. By t h e inductive hypothesis, t belongs to t h e same component as w in Bk. Hence in I ? * - 1 , both r and t belong to t h a t component, which implies t h a t there exists an edge ti on t h e p a t h from t to w in Bk~1, and t h a t edge is also present in Bk. Since every other neighbor of t (except r ) is also in Vk, by t h e inductive hypothesis it is also in Lk. Hence (ii) is true, (i) now follows by an argument identical to t h e chain of inequalities (5.1) in case I. • Applying t h e proposition with k = 0 we get the desired result: T h e o r e m 5.2 If S is a feasibility sequence and algorithm GREEDY* detects infeasibility at vertex w, then the vertices in w 's connected component of B which are on the same side as w form a Gale certificate of infeasibility. •

Greedily

6

Solvable

Transportation

Networks

15

Signatures

Let B b e a subset of t h e edges in t h e bipartite graph G = (7, J; E). In this section it will be more convenient to denote edges by ordered pairs of vertices, i.e., for t h e edge (r, s) G E, r € I and s G J. T h e (row) signature of B is defined as an integer vector crB = ( o f , . . . , o-%) such t h a t

a? = \{k\(i,k)eB}\

« = l,...,m.

(6.1)

T h e column signature is defined analogously. We shall discuss row signatures here, b u t t h e same results apply to column signatures, by interchanging t h e roles of I and J. For a bipartite graph an integer vector 0 and E J l t o\ = m + n — 1. (For complete bipartite graphs, these conditions are sufficient for validity.) Balinsky and Russakoff have shown t h a t in unrestricted problems, for every valid signature vector cr, there exists at least one dual feasible spanning tree B with t h a t signature [6] (see also [4, 5]). Indeed, under dual non-degeneracy, they showed t h a t there exists one to one correspondence between dual feasible bases and valid signatures. In this section, we shall show how to use vertex elimination and feasibility sequences in order to determine quickly if a given signature is valid. Subsequently, we shall show t h a t with a Monge sequence, one can also find a dual feasible spanning tree with t h a t signature, if one exists. We shall also provide new characterizations based on signatures for feasibility and optimality sequences. A preliminary question is whether one can efficiently decide if a given signature is valid. More precisely, suppose we are given a bipartite graph and a signature cr, and asked whether there exist a spanning tree with t h a t signature. We now show t h a t this problem is polynomial, and also give a polynomial algorithm which constructs a spanning tree with t h e given signature if it is valid. Clearly, t h e conditions cr > 0 and J27Li &i = m, + n — I are necessary. For complete bipartite graphs these conditions are also sufficient, so t h e question is trivial. For an arbitrary bipartite graph, having verified t h e necessary conditions, we can rephrase t h e question: P r o b l e m 1: Given cr, does there exist an independent (cycle-free) set of edges D with aD =
I. Adler & R.

16

Shamir

It is easy t o verify t h a t t h e collection of edge-sets which satisfies (a) forms a m a t r o i d , and so does t h e collections of edge-sets which satisfies (b). Hence problem 2 is t h a t of finding a m a x i m u m cardinality set in t h e intersection of two matroids. This problem is known t o be polynomially solvable (see, e.g., [16]). A solution D to problem 2 gives also t h e answer to problem 1: If l,ds> I, of < oT) ELIMINATE(s) end end if a ^ aB t h e n output "failure" else output B. end T h e algorithm terminates either with a set of edges B corresponding to t h e given signature, or with t h e answer "failure". Note t h a t termination with failure can be done earlier, by stopping t h e algorithm when a vertex r is eliminated because dT = 1

but still of < <JT. Observe t h a t t h e algorithm eliminates one vertex in every step, and it chooses to eliminate an end-vertex if such a vertex is incident on t h e currently examined edge. This immediately implies: L e m m a 6.1 Algorithm VE.l and VE.2. m

SIGNATURE-B

satisfies

the vertex

elimination

properties

T h e o r e m 6.2 7 / 5 is a feasibility sequence then algorithm SIGNATURE-B(cr,S) determines if the given vector a is a valid signature, and constructs a spanning tree B with that signature if one exists. P r o o f . T h e fact t h a t t h e algorithm constructs a spanning tree follows from L e m m a 6.1 and Proposition 3.5. If t h e algorithm terminates successfully, it follows from t h e algor i t h m t h a t B has t h e correct signature. It remains to show t h a t t h e algorithm never

Greedily

Solvable

Transportation

Networks

17

t e r m i n a t e s with failure for a valid signature. To prove this, it suffices to show t h a t by performing t h e first step of t h e algorithm we do not lose t h e validity of t h e signature. Suppose G has a spanning tree T w i t h signature 1 and ds > 1. This can h a p p e n only if ar = 1, and in t h a t case again T — {a} has t h e desired properties. T h e only remaining case is when s 6 J was t h e eliminated vertex, ar > 1, dT > 1 and d„ > 1. Let us first show t h a t there exists a spanning tree T" in G with signature cr which includes (r, s). If (r, s) (j_ T then T U {(r, s)} contains a unique cycle. In t h a t cycle there exists an edge (r,t), t ^ s. T h e set T" = T U {(r, 5)} — { ( r , t ) } has t h e same signature as T, it contains (r, 5) and is also a spanning tree for G. Next, suppose (r, s) € T and t h e spanning tree T contains additional edges incident on t h e vertex s. In t h a t case, since t h e algorithm removes all those edges during t h e first step, we should show t h a t there is another spanning tree with t h e same sign a t u r e b u t without those edges. Let ( j , s) G T be such an edge. Since aT > 1, there is some other edge (r, k) £ T,r ^ s. Since (r, s) -i, (r,k) and (r, s) -< ( j , s ) , by property Z of t h e sequence S, (j, k) 6 E. Moreover, (j, k) $ T, or else t h e four edges would have formed a cycle in T. Let T * = T U {(j, k)} — {(j, s)}. T h e set T* has t h e same signature as T , is still a spanning tree for G and has one less edge incident on s. By repeating t h e argument (with T* replacing T) if necessary, we conclude t h a t there exists a spanning tree T with signature cr such t h a t (r, 5) is t h e only edge it contains which is incident on s. Therefore, T — {(r, s)} is a spanning tree in G' with signature

a'. • To prove t h e converse, we need t h e following l e m m a which shows t h a t it is possible to extend a valid signature by "going backwards in time" and adding rather t h a n eliminating vertices. Define the residual signature at any stage of t h e algorithm to be
18

I. Adler &: R.

Shamir

j is eliminated, define <x,- =
S

t h a t (i,l) € E, (k,j) € E, (i,j) -< («',/) and (i,j) •< (k,j) and (k,l) £ E. If (i,j) is t h e first edge in S, define Rk = {(fc,.s)|s G N(k)}, 8k = \Rk\, Ri = {(i,s)\s e N(i) - N(k)}, Si - \Ri\ + 1. T h e set R = Rk U Ri is cycle free, hence it can be extended t o a spanning tree with a valid signature cr, satisfying <7; = <$;, ak = Sk- Moreover, any such spanning tree T must contain all t h e edges in Rk- If in addition (i,j) £ T, then no other edge (i,s) with 5 £ N(k) can be included in T , since it would form a cycle. Hence every spanning tree T which satisfies = ^k, and (i,j) € T must contain all edges in Rk and Ri. Since in t h e first step t h e algorithm introduces (i,j) into t h e spanning tree and eliminates one of its endpoints, it is impossible t h a t b o t h (i,l) 6 Ri and (k,j) 6 Rk will be included later in T, a contradiction. Assume now t h a t (i,j) is the ( t + l ) - s t edge in t h e sequence S, t > 1. Suppose we use t h e algorithm G E N E R A T E - B on t h e subsequence Si,..., St, temporarily ignoring signature requirements, b u t making sure t h a t none of the vertices i,j,k,l is eliminated. This can be guaranteed since none of those vertices is an end vertex incident on an edge in ST for r < t. Let Et be t h e set of active edges produced by the algorithm after step t. Since t h e subsequence S\,..., St satisfies property Z, by L e m m a 3.4, El is connected. Hence t h e same argument as in t h e first part of this proof implies t h a t in t h e active graph after step t, there is a valid signature
Greedily Solvable

Transportation

19

Networks

P r o o f . By Theorem 6.4, S must be a feasibility sequence. Suppose it does not have the Hoffman property. T h e n there exist edges (i,j), (i, I), (k,j), (k, I) such t h a t

(«',iH(«\0>

(MH(M),

(i,j)*{k,j)

(6.2)

and Cn + Cu > Cu + Ckj.

(6.3)

Let (i,j) be t h e first edge in the order S which satisfies conditions (6.2) and (6.3). Define G to be the same graph without edge (k, I). Note t h a t S does not satisfy the Z property on G, by (6.2). Let er be the signature generated on G by t h e process described in the proof of Theorem 6.4. Since u is valid for G, it is also a valid signature for t h e original graph G. By the proof of Theorem 6.4, algorithm S I G N A T U R E B(cr,S) must fail on G. Consider the operation of algorithm SIGNATURE-B(
Uk + Vi = Cjfcf

Substituting the above equations into (6.3), we obtain: w; + VJ + uk + vi> Cu + Ckj or, equivalently, (U{ + V,-

Cu) + {uk +Vj-

Ckj)

> 0 .

Hence, at least one of the values in parentheses must be strictly positive, in contradiction to the dual feasibility of (u,v). • Theorems 6.5 and 6.6 together give another characterization to Monge sequences, analogous to the primal one: Call S a dual feasibility sequence for C if for every feasible signature
7

feasibility

Summary

This paper brings together two previously studied topics: vertex elimination, and greedily solvable bipartite flow problems. A concept which, somewhat surprisingly,

20

I. Adler &: R.

Shamir

turns out to be relevant to both, is t h e feasibility sequence. We have defined a simple vertex elimination algorithm guided by an edge order, and have shown t h a t with a feasibility sequence, t h e algorithm generates a spanning tree. Moreover, only a feasibility sequence guarantees t h a t t h e algorithm will generate a spanning tree for every induced subgraph. W h e n we consider also t h e edge costs, then with a Monge sequence t h e algorithm generates a dual feasible spanning tree, and again, Monge sequences are t h e only sequences t h a t guarantee this for every induces subgraph. W h e n supplies and demands are also introduced, we have a complete transportation problem. By combining t h e properties of t h e greedy algorithm and t h e vertex elimination algorithm, we have proved t h a t with a Monge sequence, t h e hybrid algorithm generates an optimal basic solution and its corresponding optimal spanning tree, for every feasible supply and d e m a n d function. W h e n t h a t function is infeasible, t h e algorithm produces a Hall-type certificate of infeasibility. Finally, we have considered spanning trees with given signatures. By adjusting t h e vertex elimination algorithm to take into consideration a given signature, we have shown t h a t with a feasibility sequence, t h e algorithm generates a spanning tree with t h e input signature whenever t h a t signature is valid (i.e., corresponds to a spanning tree). Moreover, with any other sequence, this is not t h e case. As we noted, building a spanning tree with a given signature can be done polynomially even without a feasibility sequence, using matroid intersection. Several interesting problems arise from our discussions. First, given a signature, how can one find a dual feasible spanning tree which corresponds to it, or determine t h a t none exist? We have shown how to do this only in case a Monge sequence exists. Second, if t h e signature algorithm terminates with infeasibility, can one give a certificate t h a t shows t h a t the signature is invalid, in a fashion similar to what was done in section 5? Finally, can t h e results of this paper be generalized t o non-bipartite problems?

References [1] I. Adler, A. J. Hoffman, and R. Shamir, Monge and Feasibility Sequences in General Flow Problems, Technical report, R U T C O R , Rutgers University, (1990). To appear in Discrete Mathematics. [2] R. K. Ahuja, T. L. Magnanti, and J. B . Orlin, Network Flows, Handbooks in Operations Research and Management Science, Vol. I, G. L. Nemhauser, A. H. G. Rinnooy Kan, and M. S. Todd (editors), (Elsevier, A m s t e r d a m , 1989) p. 211-369. [3] N. Alon, S. Cosares, D. S. Hochbaum and R. Shamir, An algorithm for t h e detection and construction of Monge sequences, Linear Algebra and Its Applications, 1 1 4 / 1 1 5 (1989) 669-680.

Greedily Solvable

Transportation

Networks

21

[4] M. L. Balinsky, T h e Hirsch conjecture for dual transportation polyhedra, ematics of Operations Research, 9 ( 4 ) (1984) 629-633.

Math-

[5] M. L. Balinsky, Signature m e t h o d s for the assignment problem, Operations search, 3 3 ( 3 ) (1985) 527-536. [6] M. L. Balinsky and A. Russakoff, Faces of dual transportation polyhedra, ematical Programming Study, 22 (1984) 1-8.

Re-

Math-

[7] R. G. Bland, New finite pivoting rules for t h e simplex m e t h o d , Mathematics Operations Research 2 ( 2 ) , (1977) 103-107.

of

[8] B.L. Dietrich, Monge sequences, antimatroids, and t h e transportation problem with forbidden arcs, Linear Algebra and Its Applications, 1 3 9 (1990) 133-145. [9] D. Gale, A theorem on flows in networks, (1957) 1073-1082.

Pacific

[10] M. Grotschel, L. Lovasz and A. Schrijver, Geometric torial Optimization (Springer, Berlin, 1988).

Journal

of Mathematics,

Algorithms

and

7

Combina-

[11] A. V. Goldberg, E. Tardos, and R. E. Tarjan, Network Flow Algorithms, Paths, Flows and VLSI-Layout, B . Korte, L. Lovasz and A. Schrijver (editors), (SpringerVerlag, Berlin 1990) p. 101-164. [12] M. C. Golumbic, Algorithmic Press, New York, 1980). [13] G. Hadley, 1962).

Linear

Graph Theory

Programming.

and Perfect

Graphs.

(Academic

(Addison-Wesley, Reading, Massachusetts,

[14] P. Hall, On representatives of subsets, J. London Math. Soc,

10 (1935) 26-30.

[15] A.J. Hoffman, On simple linear programming problems, in Convexity: Proceedings of symposia in Pure Mathematics, Vol 7, (V. Klee, E d . ) , (American M a t h e m a t i c a l Society, 1963). [16] E. L. Lawler, Combinatorial Optimization: h a r t and Winston, New York, 1976).

Networks

and Matroids.

(Holt, Rein-

[17] D. J. Rose, Triangulated graphs and the elimination process, J. Math. Appi, 32 (1970) 597-609.

Anal.

[18] R. Shamir, A fast algorithm for constructing Monge sequences in transportation problems with forbidden arcs, Report 136/89 (1989), Tel Aviv University. To appear in Discrete Mathematics.

22

I. Adler & R. Shamir

[19] R. Shamir and B. Dietrich, Characterization and algorithms for greedily solvable transportation problems, Proceedings of the first ACM/SIAM Symposium on Discrete Algorithms, (SIAM, Philadelphia, 1990) p. 358-366.

23 Network Optimization Problems, pp. 23-36 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Networks Minimizing Length Plus the Number of Steiner Points T h o m a s Colthurst, Chris Cox, Joel Foisy, Hugh Howards, K a t h r y n Kollett, Holly Lowy, and Stephen Root c/o Prof. Frank Morgan, Department of Mathematics, Williams College, Williamstown, MA 01267, Internet: [email protected]

1

Introduction

T h e Steiner problem seeks to connect a given set of vertices with t h e shortest network, generally with added vertices called Steiner points (cf. [CR] p p . 354-361, [BG]). We consider networks minimizing a cost function of length plus t h e n u m b e r of Steiner points, modeling, for example, t h e expense of each intersection in a road network. Unlike solutions t o t h e Steiner problem, these cost-minimizing networks are not scale invariant, and in general there will be different optimal solutions for similar but scaled point sets. T h e existence of networks minimizing length plus t h e n u m b e r of Steiner points follows from s t a n d a r d compactness arguments. This paper establishes four as t h e sharp bound on t h e n u m b e r of edges t h a t can meet at a Steiner point in a cost-minimizing network in St2 (Theorem 3.5). We have learned recently t h a t this was originally conjectured by W . Boyce in 1970 [B], and proved independently Rubinstein, T h o m a s , and Weng [RTW] using a variational approach to t h e Steiner ratio. In addition to our proof of Boyce's conjecture we show t h a t in general In edges may meet in SR", as along t h e coordinate axes (Theorem 2.6), and t h a t t h e m a x i m u m number of edges which can meet in 3?" grows exponentially (3.2, 3.7). T h e higher-dimensional problem of surfaces minimizing area plus t h e length of singular curves models interfacial energies materials (see [MT], [M]). Our results suggest t h a t such interfaces would meet only in threes or fours. This paper is t h e work of t h e Geometry Group of t h e Williams College SMALL U n d e r g r a d u a t e Research Project, Summer 1991, a National Science Foundation site for Research Experiences for Undergraduates ( R E U ) . For a period of ten weeks, each of twenty- four students worked in one or two of t h e seven groups t h a t comprised t h e project. Professor Frank Morgan advised the Geometry Group. Support for t h e project was provided by grants from t h e National Science Foundation Research Experiences for Undergraduates Program, N E C U S E , Shell, and t h e Bronfman Science Center at Williams College.

24

T. Colthurst et al.

2

Examples of Networks Minimizing Length Plus the Number of Steiner Points

Examples 2.1 shows that cost-minimizing networks connecting the vertices of a square can meet in threes and fours, depending on the size of the square. Theorem 2.6 shows that In unit vectors from the origin along the In axis directions in 5ftn are costminimizing. E x a m p l e s 2.1 The Square

H Figure 1: The four possible topological types of networks minimizing length plus number of Steiner points on the square. The network in the upper right is never the minimizer, but the other three are minimizing at some scale. We list the cost-minimizing networks on a square to show that three and four edges can meet at a Steiner point in such a network. We do this by considering all

Networks

Minimizing

Length

Plus the Number

of Steiner Points

25

topological types which do not have non-adjacent vertices connected by an edge, in which case an edge connecting adjacent vertices would clearly decrease cost. On t h e square there are four such types, as is shown in Figure 1: 1. Two Steiner points connected to each other and each connecting adjacent vertices. This is t h e length minimizing tree, and has cost C(L) — 2 + (1 + \/?>)L « 2 + 2.732L. 2. One Steiner point connecting three of t h e vertices and a side of t h e square connecting t h e fourth. This has cost C(L) = 1 + (1 + ^ + &)L « 1 + 2.932L. 3. One Steiner point at t h e center of t h e square connecting all four vertices. This has cost C(L) = 1 + 2^/2L « 1 + 2.8281. 4. T h e perimeter m e t h o d , with no Steiner points and three sides of t h e square. Here t h e cost C(L) = 3L. T h e network with two Steiner points is t h e unique network with two Steiner points and four vertices [C2]. We see here t h e scale dependence: when L > /?^ / i , , t h e length minimizing network is cost-minimizing. For * ,- < L < ^ r 1 /-_ , t h e third network with four edges meeting at t h e center is minimizing. T h e perimeter m e t h o d is best for relatively small values of L, when L < * y-. L e m m a 2.2 In a network minimizing length plus the number of Steiner points, sum of the unit vectors of segments meeting at a Steiner point must be 0.

the

PROOF: For a given vector v, t h e derivative of |v| is A . T h e sum of t h e derivatives, i.e., of t h e unit vectors meeting at a point m u s t add t o zero, or else we could move t h e Steiner point to a position t h a t does allow t h e unit vectors to sum to 0, and t h e network would have reduced total length. • In particular, any Steiner point of degree three must have 120° angles between its edges. L e m m a 2.3 Suppose a network N(0) is continuously transformed to a new network N(l) of the same topological type. Then the function F(t) describing the length of the intermediate networks is strictly convex. PROOF: Consider any two networks ./V(O) and ./V(l) of a given topological t y p e , with nodes rti(0), ...,nm(Q) and n ^ l ) , . . . , n m ( l ) . T h e n a n e w network N(t) of t h e same t y p e can be created by interpolating between each pair of nodes n,(0) and ra,-(l) to form the node n , ( i ) . Let F(t) equal the length of N(t), which will clearly be a function of t. T h e n

m = £ fuM ni,rij

26

T. Colthurst

et a}.

where fij(t) — d(ni(t),rij(t)), t h e distance between two connected nodes «,,«,•. Consider two pairs of corresponding points ra,(0),rjj(0) and n ; ( l ) , n j ( l ) . T h e n .1. / l j (

2)

n,-(0) + n,-(l) n,-(0) + n,-(l) =

d(

2

'

2

>

+

nj(l)\\

so fiA\)

= \hi(0)+ni{l),nj(0)

Considering t h e four points as vectors, it follows from t h e triangle inequality t h a t i | K ( 0 ) + n,(\),n3(0)

+ ^(1)11 < | ( | | n , - ( 0 ) + ^(0)11 + ||r»,-(l) + ^ ( l ) | | ) / . , ( 0 ) + /.•,(!) 2

Hence fi,j(\) < ,; 2 which implies t h a t /(<) is an upward convex function, so F(t) is also convex. Observe t h a t equality holds and F(t) is not strictly convex only when equality holds in the triangle inequality above. This implies t h a t all edges are parallel. It can b e shown by a simple induction a r g u m e n t t h a t , assuming there are no cycles, no two distinct networks of the same topological t y p e have all edges parallel. Hence, F(t) is strictly convex. Q L e m m a 2.4 For n > 2, the cost-minimizing network connecting the 2ra vertices ( 0 , . . . , ± a , ...,0) of a regular cross-polytope with at most one Steiner point connects every vertex directly to the origin for a > J—T-—^—T=T. PROOF: Consider t h e cross-polytope of edge length / = a\/2. Since it has In vertices, connecting t h e m by a tree on t h e perimeter has cost l(2n — 1). Connecting its vertices to a Steiner point at t h e origin has cost 1 + y/2nl. T h u s , t h e center network is cheaper for / > l/(2ra — \/2n — 1) and for such an I has cost greater t h a n 1 + _ ^* _, • Any cost-minimizing network N with one Steiner point may be classified by t h e n u m b e r of vertices t h e Steiner point connects; call this n u m b e r t. Since t h e Steiner tree portion of N has length at least 4 - t h a t of the m i n i m u m spanning tree [GH], N has cost at least 1 + l[2n — t) + -yz(t — 1)/. Comparing this to the center cost of 1 + y/2nl

one concludes t h a t it is cheaper to connect a vertex to the Steiner point if

n(2-\/2)—7-

t < —-^T—iS-.

For t higher t h a n this, we claim t h a t there is a vertex not directly

connected to t h e Steiner point which is closer to it t h a n to t h e nearest vertex. We m a y assume t h a t t h e In — t unconnected vertices are m u t u a l l y adjacent, containing no two vertices on the same axis, and form a 2n — t — 1 dimensional simplex. For if neither of two points on the same axis are connected to t h e Steiner point, symm e t r y dictates t h a t t h e Steiner point m u s t lie in the hyperplane orthogonal to their axis. Since it m u s t b e contained in t h e convex hull of the vertices in t h e hyperplane,

Networks Minimizing Length Plus the Number of Steiner Points

27

the Steiner point will actually be closer to both vertices than they are to any other vertices. On the other hand axis directions with both vertices connected directly to the Steiner point need not be considered either, as the Steiner point will always lie in the K" _1 dimensional orthogonal hyperplane, so the problem can be considered in Call the distance between the origin and the Steiner point x. Then the coordinates of the Steiner point look like (-£-, ~fri ~TTi •••)"7§-)i a " d the coordinates of a given unconnected vertex are ( /,""_,, 0,0, ...0). Then the distance between them is just x

I

(2n-Qa»

fVlVl'

2n-t

Hence, a vertex is closer to the Steiner point than to other vertices if f >

2 XI ^ + 2x2 V2n-i

x < /(2,/2 - —?

7=L=)

Clearly x is less than the distance from a vertex to the origin, so we know that x <

/ . y/2n-t

Combining the two inequalities and simplifying we get 3 - < in - 2t Now we just use the limit on t relative to n established above and we get 0.6141n + 1.3660 > 1.5 which is true for all n > 2. Thus, the network connecting all the vertices at the center is the cheaper than networks of all other topological types with one Steiner point. It remains to show that the Steiner point should be placed at the origin in the minimal network of this type. Consider how the length of the central network changes as the Steiner point is moved away from the center along some vector u. The change in length with respect to time is given by ^ = — (uj + v2... + v6) • u where the V{ are the six unit vectors along the axes. A quick calculation show that 4j = 0 for all u, so the origin is an equilibrium for this configuration. By Lemma 2.3 length is a strictly convex function of the position of the Steiner point, so it follows that the network connecting all the vertices to the center must be minimal. •

28

T. Colthurst

L e m m a 2.5 The unit vectors in the six axis directions minimizing network for the octahedron.

in 5K3 constitute

et a.1. a cost-

PROOF:

By L e m m a 2.4 we know t h a t t h e unit vectors are cost-minimizing over all networks with (at most) one Steiner point. We will show t h a t there exists a scaling (a range of /) over which t h e unit vectors are of lesser cost t h a n t h e minimal spanning tree and all possible trees with two or more Steiner points. T h e minimal spanning tree has cost 5y/2l and t h e unit vectors have cost 6/ + 1. It can be easily calculated t h a t if / > 0.934 > >s_6> t h e n t h e unit vectors will be minimizing over t h e minimal spanning tree. Now consider networks having at least two Steiner points. T h e cost of such a network is at least two plus t h e length of t h e shortest network connecting all six points. A group of Williams students in 1989 proved t h a t t h e length-minimizing network for the octahedron had four Steiner points. T h e y also showed t h e eight combinatorial choices for t h e minimizer [CI], by classifying t h e topological types according to how t h e four Steiner points connect pairs of vertices on t h e same axis. Here we examine each case individually. All we have to do is find a lower bound for each candidate, then show t h a t for some / > 0.934, t h e network with one Steiner point is cheaper. Note t h a t in these representations, x, y, and z represent t h e axis t h a t each segment hits at distance 1 from t h e origin. Case 1:

z

X

Figure 2: Case 1. T h e network at right has length three times t h a t of t h e piece of t h e network component on t h e left, which has length > 1.93/. This case uses a Steiner point at the origin which is connected to t h e vertices in pairs using three additional Steiner points. Hence every pair of vertices sharing an

Networks

Minimizing

Length

Plus the Number

of Steiner

Points

29

axis are connected through three Steiner points. Here a straightforward c o m p u t a t i o n shows t h a t t h e length is greater t h a n 5.7/, which means t h a t when 6/ + 1 < 5.7/ + 2, t h e one Steiner point network will be cheaper. T h u s for 0.934 < I < 3.333, t h e one Steiner point network will be better. Cases 2, 3, 4:

Figure 3: Case 2. T h e portion of t h e network in t h e dotted circle has length at least 21.

Y

Z

Figure 4: Case 3. T h e ?/-legs have length > 2/ and t h e remaining portion of t h e network is t h e minimizing network on a square with side length \/2l.

Figure 5: Case 4. T h e z-legs have length > 2/ and t h e remaining portion of t h e network is t h e minimizing network on a square with side length \J2l. Cases 2, 3, and 4 consider t h e possible topological types in which exactly one pair of boundary points on the same axis are connected through a single Steiner point. T h e m i n i m u m distance between this pair of points is 21. T h e cost of these three cases is bounded below by 2/ + L, where L is the length of t h e minimizing network connecting four points on t h e same plane forming a square with side length y2l. More specifically, t h e overall length is bounded by ( 4 w | + \/2 — \ + 2)1. T h u s , it is

T. Colthurst et al.

30

cheaper to use one Steiner point when 6/ + 1 < ( 4 w | + v 2 —yr + 2)1 + 2, i.e. when 0-934 < / < 6_(4(0,g11)+1.7g) » 1.04. Cases 5, 6:

Figure 6: Case 5. The length of the portion of the network in each dotted circle is > 21.

'•--/.is-'

X-i..--'

Figure 7: Case 6. The length of the portion of the network in each dotted circle is >2l. Cases 5 and 6 are the two topological types in which all three pairs of vertices on the same axis are connected through a single Steiner point. In these cases, it is clear that cost is bounded by 6/. Clearly, the one Steiner point network is cheaper. Case 7:

Figure 8: Case 7. The parts of the network in the dotted ellipses each have length > 21, and the parts of the network in the dotted circles each have length > i / | . In this case two pairs are connected through two Steiner points and the other through four. Its cost is bounded by (2 -f 2 + 2
Networks

Minimizing

Length

Plus the Number

of Steiner

Points

31

b e t t e r when 6/ + 1 < (2 + 2 + 2-^/f)/. Therefore t h e center network is cheaper when 0-934 < / < g Case 8:

^

« 2.63.

Figure 9: Case 8. T h e p a r t s of t h e network in t h e dotted circles each have length > J\l,

a n d t h e parts in t h e large ellipse have length > ( 4 - i / | + \ / 2 — -\)l-

In this case two pairs of vertices are connected through two Steiner points a n d t h e other through four. Its length is bounded by t h e shortest network connecting x, x, y a n d y + 2 i / | / . In other words, by 6 ^ / | + y/2 — -7g. T h u s , t h e 1 point network is b e t t e r when 0.934 < 7 < 6_{6{J1)+0,i8) < 1.78. D We have seen t h a t four edges of a network t h a t minimizes length plus t h e n u m b e r of Steiner points may meet in 5R2, and t h a t six edges may meet in 5R3. We will now complete t h e proof t h a t for n > 2, there is a network in 3?" t h a t minimizes length plus t h e n u m b e r of Steiner points and has In edges meeting at a point. T h e o r e m 2 . 6 The In vectors of length a in the In axis directions constitute a costminimizing network for some value of a. In fact, if n > 2, then the unit vectors are cost-minimizing. PROOF: By L e m m a 2.4, we need only consider networks with two or more Steiner points. A network with two or more Steiner points must again have length of at least -j- t h a t of t h e m i n i m u m spanning tree [GH], and thus cost of at least 2 + -T;(2?I — 1)/, where / = \/2a is t h e distance between adjacent points. B u t for I < l / ( v 2 n — 4 - ( 2 n — 1)) this is greater t h a n t h e cost for t h e center network, a n d for n > 5, l/(\/2n —7T(2TC — 1)) > l / ( 2 n — \/2n — 1), the value for / at which t h e center network becomes cheaper t h a n t h e minimal spanning tree. Therefore there exists a range of values for a over which t h e vectors along t h e axes form a minimal network for n > 5. For n = 2 a n d n = 3 , Examples 2.1 a n d L e m m a 2.5 provide examples in which t h e vectors along t h e axes form cost-minimizing networks. For n = 4, t h e cross-polytope has t e t r a h e d r a as 3-faces, and any network on it m u s t have length twice t h a t of t h e shortest network on a tetrahedron, or 2(2.445)/ [Cl], since we may project it onto a tetrahedron in four orthogonal directions. T h u s any network with two or more Steiner points has cost > 2(1 + 2.445), which is more expensive t h a n t h e center network for l / ( 2 n - V^n - 1) = 0.744 < / < 1.319.

32

T. Colthurst

et al.

T h u s for n > 4 there is a range of / for which 2n edges of length / meeting at a point is cost-minimizing. •

3

Bounds on the Number of Edges Meeting at Steiner Points

T h e o r e m 3.5 establishes four as t h e sharp upper bound on t h e n u m b e r of edges in a cost-minimizing network t h a t can meet at a Steiner point in the plane. Proposition 3.6 and Corollary 3.7 establish lower bounds on t h e n u m b e r of edges t h a t can meet in 5K". First L e m m a 3.1 gives a lower bound on t h e angle between two edges in a cost minimizing network. L e m m a 3.1 If AB and AC are two edges meeting at a point A in a network, then IB AC > 60°.

cost-minimizing

P R O O F : If LB AC < 60°, then LABC or IBCA > 60°. W i t h o u t loss of generality, take LABC > 60°. T h e n ||AC|| > \\BC\\, so AB + BC is a cheaper network t h a n AB + AC which was, by assumption, minimal. • R e m a r k 3 . 2 An exponential Steiner point.

upper bound on the number of edges that can meet at a

L e m m a 3.1 yields as an upper bound on the n u m b e r of segments t h a t m a y meet at a point t h e n u m b e r of balls of radius | t h a t can be packed on t h e unit sphere. This upper bound is 6 in 3J2, between 12 and 14 in 3J3, and grows exponentially as a function of n.

Figure 10: In a triangle with largest angle 7, ||a|| + ||6|| > |(cj|(2cos-7 + 1).

Networks

Minimizing

Length

Plus the Number

of Steiner Points

33

a4

Figure 11: Five edges cannot meet at a Steiner point in a cost minimizing network. L e m m a 3 . 3 Given a Aabc such that 90° > 7 > a and 7 > /? (see Figure 10),

then

||a|| + | | 6 | | > | | c | | ( 2 c o s 7 + l ) . PROOF: W i t h o u t loss of generality, fix 7 and take /3 > a. By t h e law of sines,

Nl sin a

=

II&II sin j3

Ml

=

=

sin 7

HI sin(/? + 7)

Therefore ||q|| + ||6|| ^ s i n ( / ? + 7 ) + s i n ^ ||c|| sin7 T h e m i n i m u m value of this expression when varying f3 occurs when /? = a , so

W^i

||c||

= l+ ^

sin 7

= 2cos7+l.

D L e m m a 3.4 If five edges, po,Pi, • • • ,Pi, meet at a Steiner point in a cost-minimizing network in the plane, as in Figure 11, then a; < 2 t a n - 1 ;A=, for all i € { 0 , 1 , . . . , 4 } .

PROOF: W i t h o u t loss of generality, let Lao — 29 be t h e largest angle. By L e m m a 2.2 t h e unit vectors must add to zero. Consider t h e components in t h e po and p$

34

T. ColthuTst

et al.

directions: 2 c o s # + cos(# + ai)-|-cos(# + a:i + oc2) + cos(a4 + 8) = 0. 6 is maximized if a a = Q 2 = an = 60° by aconvexity argument, so 2cos 6> + cos(# + 60°) + cos(6l + 120°) = 0, or 6 = t a n - 1 ^ , so a 0 = 20 < 87.795°. a T h e o r e m 3.5 Four is a sharp bound on the number point in a cost-minimizing network.

of edges meeting

at a

Steiner

PROOF: We have shown examples of four edges meeting (See Examples 2.1). By L e m m a 3.1 it follows t h a t seven or more edges cannot meet, and if six meet all angles must be exactly 60°. In this case, however, a regular hexagon can be substituted in some small region about the Steiner point, and if one edge is removed the length is decreased. This creates six new Steiner points, but they can be eliminated by expanding t h e hexagon until t h e new Steiner points reach t h e vertices at t h e end of t h e original six edges. For this reason, six edges cannot meet. Hence it remains to determine if five can meet. Consider five edges p 0 , . . . , p 4 meeting at a Steiner point in a cost-minimizing network. ct0 > /30 and a0 > f0, since otherwise we could switch a0 for p0 or p 4 and decrease t h e overall length. Now apply L e m m a 3.4 to obtain ||p,|| + ||pi+i]| > | | a ; | | ( 2 c o s a , + l ) . T h e n 2 £ 4 = 0 | | p ; | | > £ 4 = 0 ||a,||(2cos a, + 1). W i t h o u t loss of generality, let a0 be t h e longest side. T h e cost of t h e perimeter is Z ] 4 = 1 ||'II + !• Since X^=o <*; = 27T, it follows from a convexity argument t h a t Ylt=ocos a' is minimized when t h e angles are at the e x t r e m u m of t h e domain (60°, 87.796°). This m e a n s t h a t a{ = 60°,60°,64.41°,87.796°,87.796°, and ZUocosai > 1-5087 > | . Therefore, 1 ' 4 1 ||ao||(cosa 0 + - ) + ^ ||a,||(cosa v - - ) > 0, so 1 1 4 -|| 0, and i£||a1||(2cosa, + l)>£||a,||. Z

i=0

."=1

4

Recall t h a t 2 £ = o ||p,|| > Et=o ||a.||(2cos a,- + 1), so

EW>EWt=0

i=l

and

Ellwll + i>ElNli=0

i=l

Networks

Minimizing

Length

Plus the Number

of Steiner

Points

35

But t h e n t h e cost of five edges meeting at a Steiner point is greater t h a n t h e cost of t h e perimeter. Therefore, five edges cannot meet at a Steiner point in a cost minimizing network. D P r o p o s i t i o n 3 . 6 If t points are distributed on a unit n-sphere such that the minimum angle a between any two points is greater than 78.687947° and ift > 1/(1— i//ft/»)' where fi = A/2 — 2 cos a is the minimum distance between two points, then for some scaling to a sphere of radius r, there is a cost-minimizing network in which at least a third of the points meet at a Steiner point. PROOF: If t h e n-sphere has radius r then connecting the points through t h e center has cost 1 + r t . Since t h e m i n i m u m distance between any two points is y/2 — 2 cos ctr = fir, connecting t h e points through a perimeter tree has cost at least fir(t — 1). T h u s t h e center network has lower cost for r > l/(fi{t — 1) — t), which is positive for t > 4 and a > 78.68°. All other one Steiner point networks can be determined by t h e n u m b e r of points they connect; call this number c. Since t h e Steiner tree portion of t h e network must have cost at least A- times the minimal spanning tree [GH], the one Steiner point network has m i n i m u m cost 1 + l/%/3/3r(c — 1) + fir(t — c), which is greater t h a n 1 + rt for c <

t{

*~1)+'JS

« 0.5* - 1.36 when a > 78.687947°. Hence t h e

optimal network with one Steiner point must have at least half of its vertices meeting at a Steiner point as t becomes large, and in fact calculations reveal t h a t in any case Now consider networks with two or more Steiner points. Again use t h e fact t h a t any Steiner network has length at least A- t h a t of t h e minimal spanning tree, which implies t h a t t h e cost of such networks is at least 2 + A=fir{t — 1). A one Steiner point network is then cheaper for r < l/(t — A-fir(t — l)). T h u s there is a range of r for which a one Steiner point network is cost minimizing if \ftt —izfi r {t — 1)) > ^l{fi{t ~ 1) — t) or t > 1/(1 - 2 / ( ( l 4- -j$)fi))-

This is positive for a > 78.687947°.

•

C o r o l l a r y 3 . 7 The number of edges which can meet in a cost-minimizing in ?Hn grows exponentially in n.

network

PROOF: By [EF], there is a set of points of exponential cardinality on an n-sphere such t h a t t h e angle between any pair exceeds any given angle less t h a n 90°, including the angle a RJ 78.687947° above. •

References [BG]

Marshall W . Bern and Ronald L. G r a h a m , T h e shortest-network problem, Scientific American, January, 1989, pp. 84-89.

36

T. Colthurst

et al.

[B]

W . M. Boyce, Can degree five Steiner points reduce network cost for planar sets?, 1970, preprint.

[CI]

M. Conger, M. Christian, D. Cooper, J. Goodell, Length-minimizing networks in 3i 3 : T h e t e t r a h e d r o n and new examples, Geometry Group, W i n t e r Study, 1989, Advisor: F . Morgan.

[C2]

M. A. Conger, Energy-minimizing networks in SRn, Honors Thesis, Williams College, 1989, revised 1989.

[CR]

R. Courant and H. Robbins, What is Mathematics?, 1941.

[EF]

P. Erdos and Z. Fiiredi, T h e greatest angle among n points in t h e d-dimensional Euclidean space, Annals of Discrete Mathematics 17(1983) 275-283.

Oxford University Press,

[GH] R. L. G r a h a m and F . K. Hwang, R e m a r k s on Steiner minimal trees, /. Bull. Inst. Math. Acad. Sinica 4(1976) 177- 182. [M]

Frank Morgan, Surfaces minimizing area plus length of singular curves, 1991, preprint.

[MT] Frank Morgan and Jean Taylor, Destabilization of t h e t e t r a h e d r a l point junction by positive triple junction line energy, Scripta Met. et Mat 25(1991) 19071910. [RTW] J. H. Rubinstein, D. A. T h o m a s , and J. F . Weng, Degree five Steiner points cannot reduce network costs for planar sets, Networks, to appear.

37 Network Optimization Problems, pp. 37-52 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Practical Experiences Using an Interactive Optimization Procedure for Vehicle Scheduling Joachim R. D a d u n a Dornier G m b H / Management Consulting Biiro Berlin, Niirnberger Strafle 68 / 69, D - 1000 Berlin SO Miodrag Mojsilovic Peter Schiitze HanseCom Gesellschaft Informationsverarbeitung

fur angewandte Informatik und mbH, Spohrstrafle 4, D - 2000 Hamburg

60

Abstract During the last twenty years operations research methods became more and more important in urban mass transit planning. One of the most essential subjects in this area is the preparation of solutions for the vehicle scheduling problem. For the HOT program system two different modules have been developed for this scheduling problem. As well as the usual scheduling procedure the system offers a module, that connects timetable preparation and vehicle scheduling. The usage of the module "sensitivity analysis" undoubtedly lead to additional savings in fleet size and operational cost. Practical experiences are described.

1

Introduction

A line of software systems has been developed since the 70's, with which efficient planning in public mass transit is attained (see Wren (1981), Bodin et al. (1983), Rousseau (1985), D a d u n a / Wren (1988), Desrochers / Rousseau (1992)). Early successful solutions were especially evident in vehicle scheduling (cp. Mojsilovic (1983), Luedtke (1985)), since it involved structures, which can be relatively easily described through models and then can be worked out using efficient algorithms.

38

J.R. Daduna., M. Mojsilovic,

and P.

Schiitze

One of t h e first of such systems available in this m a r k e t was t h e H O T program syst e m , by H a m b u r g Hochbahn Aktiengesellschaft, and used in bus, t r a m , and light rail operational planning, for which it was developed. T h e intense technological progress in t h e field of d a t a processing was used to overhaul t h e fundamental hardware and software during the 80's (cp. D a d u n a / Mojsilovic (1988)). This development is m a d e by HanseCom, a subsidiary of Hamburger Hochbahn Aktiengesellschaft. By overhauling t h e software an additional module was developed, t h a t connects two spheres of planning, timetable preparation and vehicle scheduling, which were, as a rule, treated separately in t h e past. T h e goal was to analyze with t h e help of an interactive optimization procedure how stable a solution is, in particular to which degree b e t t e r results in scheduling could be achieved through slight alteration to t h e i n p u t . This module, which is called "sensitivity analysis", is now in use for several years by various urban mass transit authorities. T h e results achieved through this module m a k e it evident t h a t efficient algorithms, in conjunction with a highly developed d a t a processing technology, form an im- p o r t a n t basis for operational planning. First of all the problem, as well as the formal foundations, will be described, based on D a d u n a (1988). Next t h e procedure will be presented and explained through an example. New results from urban mass transit authorities of varying size and structure will show t h e benefits of the sensitivity analysis.

2

Problem Formulation

T h e sensitivity analysis is a module that gives the user assistance in operational planning. T h e applicability which is described in this paper shows, in this particular form, only the advanced version H O T II. T h e background to this development was t h e consideration of how t h e schedule aspects could b e included in t i m e t a b l e construction to achieve a higher efficiency rate of operation. Since a simultaneous t r e a t m e n t of the planning problems is not possible due to its extreme complexity, t h e sensitivity analysis is applied to check the entered timetable d a t a from t h e point of view of scheduling. It is systematically applied to t h e critical t i m e periods of operation, i.e. rush hour. These time periods of peak capacity cause an unproportional increase in costs for t h e transit authorities. This implies t h a t t h e m a x i m u m n u m b e r of blocks in this t i m e period determines the vehicle d e m a n d and therefore, to a certain extent, also the n u m b e r of required personnel (see D a d u n a 1987). From this t h e basic objective results, namely to reduce the n u m b e r of blocks during rush hour not only through vehicle scheduling with the help of an optimization procedure, as is t h e case with t h e usual planning of operation, but also by changing some timetable data. Herewith possible changes of the timetable d a t a are determined with an interactive optimization procedure. T h e utilization of such a module in t h e operational planning is further demonstrated by t h e example of Fig. 2.1. Carrying out eight trips on passenger service

Practical Experiences for Vehicle Scheduling

m

Blffi!

mmt

BUS 3

13 i

3TO§4

mm 5 Rasimlng fjmea for deadhead tri|is; Regular trip to deiicihe^i trip Regular trip to fcgufor ^P °»J fie same line Regularrtnp to mauler frip on different lines

1 min BOS

MAR

WUB

__._LL

VED

21

11

8 25 20

28

NGR

n n

_ _ _ 2 miis |

5min

28

j

8

1

38

]

Figure 2.1: Example of a vehicle schedule before an alteration to the input data we need, in the present case, five vehicles, because only three links are feasible. The crucial point is the trip of line 140, ending in NGE at 7:03 a.m.. This trip cannot be linked with the one on line 240, starting at 7:16 a.m.. It is only one minute short. The necessary deadhead trip takes eight minutes with an additional pre-layover of five minutes after the deadhead trip and a post-layover of one minute before it. This result will certainly be unsatisfactory to the planning staff. Therefore we must look for alterations to specific input data in view of the following essential limitations: • It is not permitted to reduce the offered service level. The total number of trips on service, as well as routes, and running times have to remain unchanged. @ in-company and contractual agreements are not allowed to be violated, e.g. layover times and the duration of deadhead trips. To a certain extent, however, slight alterations of departure times will be possible, because this does not mean a reduction, but only a marginal alteration without any

40

J.R. Daduna,, M. Mojsilovic,

a n d P.

Schiitze

effects on t h e service level. T h e possible range of alteration is determined by t h e following aspects: • In general, published trips with even headways have to remain unchanged. • Additional trips during peak hours are not allowed to come after a regular trip. Furthermore, t h e difference between t h e departure times of these trips should not exceed a given range, because otherwise t h e objective of relieving t h e respective regular trip will not be achieved. • Connections to transfer stops on other lines or with other means of t r a n s p o r t must be considered. School trips and trips on contract hire, which, in general, are not in t h e timetable, yield a good scope for such alterations. In these cases, it only must be guaranteed t h a t t h e vehicle arrives at t h e corresponding point of destination in t i m e before t h e start of lesson or work. In addition, special delay buffers may be reduced, because they are fixed in the course of planning and are not restricted by in-company or contractual agreements. T h e possible alterations offered by sensitivity analysis are: • Altering t h e d e p a r t u r e t i m e of one of either t h e trips to be linked or of b o t h trips • Reducing t h e delay buffers (determined by t h e company) In t h e example described above, we can shift t h e d e p a r t u r e t i m e either of t h e trip on line 140 or 240. T h e necessary alteration amounts to only one m i n u t e . T h e adaption of input d a t a produces a solution, which yields a saving of one vehicle (see Fig. 2.2). This example has, at first sight, a relatively simple structure. However one should not come to t h e conclusion t h a t these links can be realized in m a n u a l planning, because t h e example represents an isolated part of a complex structure.

3

Mathematical Formulation

Vehicle scheduling planning pursues two objectives, which are hierarchically structured, at t h e same time. T h e main objective is minimizing the n u m b e r of blocks in peak hours (see Mojsilovic (1983), Carraresi / Gallo (1984), Luedtke (1985)). T h e other, subordinate objective pursues t h e reduction of non-productive times between two trips on service (deadhead trips, etc.).To a certain extent, this emerges, however, from t h e m a i n objective, because considerable cuts in pull-out / pull-in trips will be achieved by reducing the number of blocks. To solve such problems using optimization procedures, t h e complex s t r u c t u r e of real world d a t a must be reflected in a model. In doing so, a lot of different d a t a will

Practical Experiences for Vehicle Scheduling

41

Figure 2.2: Extract from a vehicle schedule after shifting the departure time of one trip. be qualitatively assessed and aggregated in a weight. This process represents one of the major steps in using computer-aided systems, because the result very much depends on this model form. The basic data for vehicle scheduling are the given trips of one timetable period. In addition technical and in-company restrictions must be considered when linking two trips. For two trips i and j with i preceding j , we must take into account the following data 6{j

:=

duration of the deadhead trip

Pi

:=

delay buffer at the end of trip i

Aa*C»)

:=

layover at the end of trip t depending on trip j (post-layover)

A^'<*)

:=

layover before the start of trip j depending on trip i (pre-layover)

These A-values resulting from in-company restrictions depend on which trips are to be linked. A linking of two trips i and j will be admissible, if the following conditions are satisfied: Si + di + A a w + % + Pi + Arfxo < SJ (1)

42

J.R. Daduna,

M. Mojsilovic,

and P.

Schiitze

and s3 - {si + di} < TMax

(2)

with Si as the departure t i m e and
.

if it is possible t o link trip i and j without any dviation if it is possible t o link trip i and j in t h e range of a given maximal deviation otherwise

. -

.

.

.

.

.

(4)

R

This m a t r i x C has t o b e u p d a t e d in each iteration in accordance with t h e users' decision. This combinatorial problem may be solved as an assignment problem with an additional constraint. This problem results in t h e following formulation: n

Minimize

n

^ J ^ Cijxij

(5)

Vj = l,2,...,n

(6)

Vi=l,2,...,n

(7)

Vi,j = 1 , 2 , . . . , n

(8)

with the constraints

5>y =l i=i

X>y = l J'=I

xo-€{0;l} r
(9)

Practical

Experiences

for Vehicle

Scheduling

43

with n defining t h e n u m b e r of trips and r t h e n u m b e r of calculated cycles of t h e solution of t h e assignment problem for t h e given R.

4

Operational Process

T h e operational consists of four main steps which will be described separatedly. Fig. 4.1 shows t h e flow chart for these steps. S t e p 1: In t h e first step, the user must define which t y p e of day and which t i m e interval will be processed and which alterations at t h e most will be p e r m i t t e d . On this basis, t h e necessary d a t a structure with t h e m a t r i x CR will be computed within t h e system. S t e p 2: In t h e second step, a first optimal solution will be calculated for R = n, the initial solution. This produces the value r, t h e m a x i m u m n u m b e r of blocks in t h e t i m e interval under examination, without altering any d a t a . S t e p 3 : This step represents the interactive part of the sensitivity analysis. T h e n u m b e r of t h e blocks R will be set to r — 1. Starting from t h e existing optimal basic solution with r blocks, a re-optimization procedure will be effected to obtain a saving. T h e necessary alterations will be shown successively on screen. T h e user must decide in each case whether t h e suggested alteration will be accepted or not. If not, then t h e next suggestion follows. After an finite n u m b e r of possible alterations, a saving will be achieved or t h e procedure will be t e r m i n a t e d because no feasible solution for R can be found. If t h e procedure is continued, the re-optimization procedure starts again with t h e value R = r — 1 . Because the sensitivity analysis works with p a r a m e t e r s , t h e procedure may be repeated with different values, starting with step 1. However, repetition procedures may also be started in order to modify individual decisions in t h e interactive operational process. S t e p 4: In the last step, t h e user must choose t h a t solution which is t h e most suitable one in his opinion. Finally, t h e input d a t a is altered accordingly in t h e d a t a base.

5

Example

In the following, we will show in an example with real data, how t h e interactive part of t h e operation works from t h e users point of view. This d a t a is taken from t h e s u m m e r timetable 1992 of Verkehrsbetriebe Hamburg-Holstein AG.

J.R. Daduna, M. Mojsilovk, and P. SckMze

44 f-^T

^^^pji???^

PARAMETER

INPUT DATA

Figure 4.1: Flow chart of the sensitivity analysis

Practical

Experiences for Vehicle

Fahrzeuge: Anfangswert 350

Linie

Abfahrt

davor

136

6:33

Fahrt 1 Fahrt 2

131 136

danach

136

45

Scheduling

gegenwaert ige Anzah] 350

von --> nach

Ankunf t

MWZ F-Art

Einsparung Zeitpuffer

MAX

BBF

6:43

0

N

1

_< 6:59 SPE 7:24>_ HID

BBF SRE

7:11 7:36

0 0

V N

-3 8

BBF

8:13

0

N

7:46

SRE

Vorschlaege seit der letzten Einsparung:

0

Veraend.

1

Koennen Sie eine Veraenderung urn 3 Minuten akzept .eren ?

Figure 5.1: Suggestion for alteration shown on the scree In t h e first step t h e following suggestion for alteration appears on screen (see Fig. 5.1): T h e trips shown, in this case four, represent a segment from one block. Lines 5 and 6 contain t h e d a t a of those two trips which may be linked after altering the respective d e p a r t u r e times. Lines 4 and 7 show in addition t h e trips before t h e first and after t h e second trip to be linked. Such an additional information helps t h e user to assess more easily t h e impacts of an alteration. T h e following will be indicated: t h e number of t h e line on which t h e trip will be carried out, d e p a r t u r e and arrival time, first and last stop of t h e trip, (in-company) delay buffer ( M W Z ) , the type of trip ( F - A r t ) , and t i m e range (Zeitpuffer) for alterations or the missing time for a link. This would mean in t h e example above t h a t trip 1 starts at 6:59 a.m. on line 131 from stop S P E and ends at B B F at 7:11 a.m. V (the type of trip) indicates t h a t it is an additional trip. (The other types of trips, mentioned in t h e following, are: N = regular trip, Y = school trip.) T h e value -3 in t h e field "Zeitpuffer" of trip 1 at 6:59 indicates t h a t a link with trip 2 is not possible, because there is a shortage of three minutes between t h e end of t h e first trip and t h e start of the second one. True, we have thirteen minutes between t h e two trips, b u t there is not enough t i m e for a deadhead trip from B B F to HID, plus layovers. However, subsequent to trip 2, there is still a t i m e range of eight minutes in S R E (after subtracting two minutes for layover). True, we have t h e possibility to start trip 2 t h e necessary three minutes later. But trip 2 is a trip with an even headway, published in t h e timetable, and therefore in general not p e r m i t t e d to be shifted.

46

J.R. Daduna,

M. Mojsilovic,

and P.

Schntze

— Fahrzeuge: Anfangswert 350

Linie

Abfahrt

gegenwaertige Anzahl 349

von —> nach

Ankunft

MWZ F-Art

Einsparung Zeitpuffer

1

Veraend.

davor Fahrt 1 Fahrt 2

275 275

danach

275

_< 6:43 SKA 7:08>_ RAD 7:32

MEO

RAD MEO

7:07 7:29

0 0

N N

RAD

7:54

0

N

Vorschlaege seit der letzten Einsparung:

1

Koennen Sie eine Veraenderung um

-1 0

1

1 Minuten akzept .eren ?

Figure 5.2: Second suggestion for alteration F u r t h e r m o r e , there is a connection to City Rail in B B F which has to be adhered to. Trip 1, on t h e contrary, is an additional trip and therefore not published in t h e timetable. From an in-company point of view, this makes it possible to start this t r i p t h e required three m i n u t e s earlier. Should t h e suggestion be accepted, then already one vehicle has been saved in this example. Often, however, further alterations will have to be m a d e until a saving is a t t a i n e d , because a restructuring of t h e basis solution usually requires several steps. T h e suggestion for alteration shown in Fig. 5.2 will be indicated in t h e next iteration. In this case, t h e line above trip one contains no data. T h i s m e a n s t h a t trip 1 represents t h e start of t h e block. Correspondingly, trip 2 forms t h e end of a block, if no information after this trip is given. This suggestion for alteration cannot be accepted for in-company reasons, because it is a regular t r i p with a fixed headway. Only one alteration to t h e d e p a r t u r e t i m e is suggested in Fig. 5.1. This was t h e only possibility in t h e earlier versions of t h e sensitivity analysis. By expanding t h e p r o g r a m m e system, we can now shift the two trips which have to be linked. This is described by taking t h e above example one step further. After not having accepted the suggestion in Fig. 5.2 nor three other suggestions for alteration, t h e user will be presented with t h e suggestion shown in Fig. 5.3. For linking the school trips at 7:02 a.m. and 7:17 a.m., there is a shortage of four m i n u t e s . Due to t h e d a t a structure, t h e departure time of one of t h e two trips might be altered by t h e required four minutes. However, it is appropriate in this situation to shift both trips by two minutes.

Practical

Experiences

for Vehicle Scheduling

Fahrzeuge: Anfangswert 350

Linie

Abfahrt

47

g e g e n w a e r t i g e Anzah] 349

von —> nach

Ankunf t

Einsparung Zeitpuffer

MWZ F-Art

1

Veraend.

davor Fahrt 1 Fahrt 2

293

danach

700

293

_< 7:02 WSS 7:17>_ WSS 7:45

USU

SZU SZU

7:12 7:30

0 0

Y Y

ULS

7:53

0

Y

Vorschlaege s e i t der l e t z t e n Einsparung:

Koennen S i e e i n e Veraenderung um

-4 4

5

4 Mixiuten akzept .eren ?

Figure 5.3: Fifth suggestion for alteration in t h e second iteration.

At first, t h e user must enter whether he accepts the suggestion. be asked to indicate a shift for one of t h e two trips. This is shown t h e n u m b e r "2" before t h e sign "_<" in t h e first t r i p . T h e necessary second trip will be automatically calculated within t h e system. By fifth suggestion for alteration, a saving will have been m a d e again. being m a d e are listed in Tab. 5.1.

Line

Departure time

131

6.59 7:02

293 293

7:17

If so, he will in Fig. 5.4 by shifting of t h e accepting this All alterations

From

To

Arrival time

Departure time (changed)

Arrival time (changed)

SPE

BBF SZU SZU

7:11

6:56 7:00

7:08

7:12 7:30

7:19

WSS WSS

Table 5.1: Alterations in t h e d e p a r t u r e times

7:10 7:32

48

J.R. Daduna, M. Mojsilovic, and P. Schiitze

Fahrzeuge: Anfangswert 350

Linie

Abfahrt

gegenwaertige Anzahl 349

von —> nach

Ankunft

MUZ F-Ar1

Einsparung 1 Zeitpuffer

Veraend.

davor Fahrt 1 Fahrt 2

293 293

danach

700

2< 7:02 WSS 7:17>_ WSS 7:45

USU

SZU SZU

7:12 7:30

0 0

Y Y

ULS

7:53

0

Y

-4 4

Vorschlaege seit der letzten Einsparung: E Bitte bei einer Fahrt die Verschiebung angeben

Figure 5.4: Alteration in the departure times of both trips by two minutes each

6

Results

The programme system HOT is being used by several urban mass transit companies in their operational planning. The experiences and results of four of these companies from using the sensitivity analysis in the bus networks are explained. H a m b u r g e r H o c h b a h n AG ( H H A ) HHA operates 142 mainly inner-city bus lines. In the summer timetable 1992 the maximum number of trips to be served at the same time is 634 in the morning peak. This requires a total of 735 buses. By using the sensitivity analysis, eight vehicles could be saved (ca. 1.1 Verkehrsbetriebe Hamburg-Holstein AG (VHH) The VHH bus network consists of 121 lines in total in a city area as well as on the outskirts. Out of the original 350 blocks in the summer timetable 1992 (with 260 trips as maximum) 10 could be saved (ca. 2.9 which operate at the same time, amounts to 327 (see fig. 7.1). This difference results from specific in-company conditions, which, in the scheduling procedure, lead to short blocks ending before the maximum peak. Dresdner Verkehrsbetriebe AG (DVB) Dresdner Verkehrsbetriebe AG mainly serves the municipal area. In the summer timetable 1992 it was possible to save in total four busses out of 143 (ca. 2,8

Practical

Experiences

tor Vehicle

Scheduling

49

Compagnie des Transport de Besancon ( C T B ) C T B operates 20 bus lines in t h e municipal area. For t h e winter t i m e t a b l e 1992/93, in which several lines were restructured and at t h e same t i m e t h e service was increased, it was possible to reduce the number of vehicles in t h e morning peak hours by eleven vehicles from 137 to 126 (ca. 8.0 As t h e results d e m o n s t r a t e , t h e savings differ obviously from each other. We are not able to define t h e exact causes, but some factors can be indicated. W i t h municipal lines, even headways and connections to rail systems, only a small choice of possible suggestions for alterations will be available to t h e sensitivity analysis (e.g. with H H A ) . This applies especially when no or not enough school trips are offered, because this transit is part of the regular timetable. If t h e service run mainly in the outskirts, t h e n u m b e r of school trips and trips on contractional hire increases sharply (e.g. with V H H ) , which offers a greater choice of alterations. T h e considerable improvements with C T B divert clearly from t h e other results. T h e y are probably due to t h e restructuring of the network and t h e transitional problems involved. However, we cannot exclude t h e considerable individual influence of t h e user in t h e process. Because t h e alterations are not fixed to rigid rules and therefore not a u t o m a t e d , t h e user holds a central position. By accepting or refusing, he strongly influences t h e result by his assessment and, of course, in-company specialized knowledge. However, how long t h e sensitivity analysis has already been used by t h e company might also be an important question. More experience might result in impacts on t h e decisions in the interactive p a r t . T h e attained percentage savings might, at first sight, appear to be small, especially in t h e case of HHA. A more detailed view yields, however, another interpretation. First, we must consider t h a t the savings through t h e sensitivity analysis represent an additional effect to t h e actual vehicle scheduling. W h e n calculating t h e basic solution, an optimization step will be m a d e which, in comparison with m a n u a l planning, offers considerable savings. In any case, a proportional value is often only partially meaningful. R a t h e r t h e absolute results of t h e savings m a d e in the operating cost must be looked at for an interpretation, because these may, even with small percentages, show a considerable a m o u n t . Take t h e operating cost, (including payroll cost) claimed by vehicle producers per vehicle and year, which rank between \JS170.000andUS 270.000 for middleEuropean countries, and the additional positive effects will become more clear.

7

Conclusions

T h e results described show t h a t t h e strongest effects are obviously m a d e in restructured networks when using the sensitivity analysis. However, this does not limit t h e usage to such cases.

50

J.R. Daduna,

M. Mojsilovic,

and P.

Schutze

350-r

Number of busses 300

250

200--

150

100

50--

frfr o o o o o o o o o o o o o o invot--ooo>0'~©t--ooo\

Figure 7.1: Distribution curve of t h e n u m b e r of vehicles ( V H H ) T h e examples of HHA and V H H , two companies which have used t h e sensitivity analysis for several years, show t h a t cost-saving potentials result even from, to some extent, small alterations to the timetable in successive schedule periods. T h e cause can be found in the complexity of t h e problem structure. Even small alterations may allow entirely new combinations. T h e m a i n objective for the sensitivity analysis is, also because of its incorporation in t h e planning process, t h e reduction of the m a x i m u m fleet size. This, however, does not m e a n t h a t t h e usage has to be limited to the t i m e before and after t h e absolute peak, which in the case of VHH (see Fig. 7.1) lies between 5:30 a.m. and 8:40 a.m. Because t h e number of vehicles required at certain times determines t h e n u m b e r of duties to some extent, it might be of interest with regard to d u t y scheduling to analyse also t h e local m a x i m a . W i t h a distribution curve of t h e n u m b e r of vehicles like t h e one of C T B (see Fig. 7.2), which in total has four almost equal m a x i m a , all peaks m u s t be analysed with regard to t h e m a x i m u m number of vehicles. By altering an originally absolute m a x i m u m , it is possible with such a structure t h a t due to a

Practical

Experiences

for Vehicle

Scheduling

51

i50j Nujnbe]- 0 f busses

oo

©

o

C4

o

oo

O

oo

IllMMNMIIIIII iiillllllll © © O O © O «n ^£5 r - 00 <3\ (S ( S <S

oo

oo

Figure 7.2: Distribution curve of the number of vehicles ( C T B ) reduction a new absolute m a x i m u m occurs in another time period.

8

Outlook

W i t h t h e sensitivity analysis the program system H O T II offers an efficient module. Its efficiency results from an effective interaction of algorithm and in-company experience. Naturally, this module will be further developed. T h e aim is to take advantage of technological progress in t h e field of hardware and software and furthermore to include t h e experiences from in-company usage. This is m a d e by HanseCom, the developer and distributor of t h e H O T II system.

References: [1 ] L. Bodin, B . Golden, A. Assad, and M. Ball (1983): Routing and Scheduling of Vehicles and Crews: T h e State of t h e Art, in: Computer and Operations Research 10, 63-211 [2 ] P. Carraresi and G. Gallo (1984): Network Models for Vehicle and Crew Scheduling, in: European Journal of Operations Research 16, 139-151 [3 ] J.R. D a d u n a (1987): DV-gestutzte Dienstplanbildungfjl29/,r das Fahrpersonal in Verkehrsbetrieben - Problematik und Losungsansatze, in: Operations Research Proceedings 1986, (Springer, Berlin, Heidelberg) 76-85 [4 ] J.R. D a d u n a (1988), A Decision Support System for Vehicle Scheduling in Public Transport, in: W. Gaul and M. Schader (eds.) Data, Expert Knowledge and

52

J.R. Daduna, Decisions 102

M. Mojsilovic,

and P.

Schiitze

(Springer, Berlin, Heidelberg, New York, London, Paris, Tokyo) 93-

[5 ] J.R. D a d u n a and M. Mojsilovic (1988): Computer-Aided Vehicle and D u t y Scheduling Using t h e H O T P r o g r a m m e System, in: J.R. D a d u n a and A. Wren (eds.) 133-146 [6 ] J.R. D a d u n a and A. Wren (eds.) (1988): Computer-Aided Transit (Springer, Berlin, Heidelberg, New York, London, Paris, Tokyo)

Scheduling

[7 ] M. Desrochers and J.-M. Rousseau (eds.) (1992): Computer-Aided Transit Scheduling (Springer, Berlin, Heidelberg, New York, London, Paris, Tokyo) [8 ] Luedtke, L.K. (1985): RUCUS II: A Review of System Capcilities, in: J.-M. Rousseau (ed.), 61-115 [9 ] M. Mojsilovic (1983): Verfahren fur die Bildung von Fahrzeugumlaufen, Dienstplanen und Dienstreihenfolgenplanen in Verkehr und Transport, in: H E U R E K A '83 - Optimierung in Transport und Verkehr, Karlsruhe, 178-191 [10 ] J.-M. Rousseau (ed.) (1981): Computer Scheduling Holland, A m s t e r d a m , New York, Oxford)

of Public Transport 2 (North

[11 ] A. Wren (1981): Computer Scheduling A m s t e r d a m , New York, Oxford)

Transport

of Public

1 (North Holland,

53 Network Optimization Problems, pp. 53-62 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Subset Interconnection Designs: Generalizations of Spanning Trees and Steiner Trees Ding-Zhu Du Computer Science USA and Institute

Department, University of Applied Mathematics,

Panos M. Pardalos Department of Industrial and Systems Gainesville, FL 32611 USA

of Minnesota, Minneapolis, MN 55455, Academia Sinica, Beijing 100080, PRC

Engineering,

University

of

Florida,

Abstract

Given an edge-weighed graph with vertex set X and a family of subsets of X, we consider problems of finding a subgraph G with the minimum total weight such that G gives some necessary connection between vertices in each subset in the family. The problems are generalizations of the spanning tree problem and the Steiner tree problem. Recently, many results for such problems have appeared. We give a survey of these together with some new results.

1

Introduction

Consider t h e following two problems: (1) Communication Network Design: There is a set of employees and a set of projects. In each project there is a subset of employees. Each employee m a y work in several projects. Now, we want to find a communication network between employees such t h a t any two m e m b e r s of a project are able to communicate through a sequence of other m e m b e r s of t h a t project. We also wish to find t h e less expensive network. (2) Vacuum System Design: A vacuum usually contains some valves. T h e function of valves is to give different interconnection states between components of t h e system.

54

D.-Z. Du and P. M. Pa.rda.los

T h e system must have specific interconnection states for each job. Now, given a particular job, how do we design a vacuum system t o do t h e job? Here, we want t h e n u m b e r of valves to be as small as possible. Both problems ask for a network to have multi-phase interconnections. In t h e first problem, a phase corresponds to a project; for each project, t h e network needs to give a spanning tree interconnecting all m e m b e r s in t h e project, where a spanning tree for a point set is just a tree with t h e point set as its vertex set. In t h e second problem, a phase corresponds to a state; for each state, t h e network needs to give a Steiner tree interconnecting a subset of the components with some restrictions (see Section 3 for details), where a Steiner tree for a point set is a tree with t h e vertex set containing t h e point set and leaf set contained by t h e point set. Problem (2) was first proposed by Du and Chen [4] in 1976. It was a very practical problem. In fact, they were in a bulb-making factory and wanted to p r o m o t e new products. For each new product, they had to design a new vacuum system for t h e new job, which raised t h e problem. In [4] t h e authors gave a m a t h e m a t i c a l model for problem (2) and several examples in t h e real world as well as an approach to adjust t h e design when valves of more complicated types are used. Since problem (2) appeared, a special case has been studied extensively in which, t h e restricted Steiner trees are spanning trees. Prisner [18] [19] generalized this case and proposed problem (1). In (1), all edges are weighed; different edges may have different weights. In (2), however, only the number of edges is counted; i.e., all edges have a uniform weight of one (called t h e unit-weight case). Both problems have been known to be N P - h a r d [6]; special cases and heuristics have gained attention. In this paper we survey existing results, extend some results from t h e unit-weight case to t h e general case, and point out some open questions. Before doing so, let us give some notations and terminologies. For a graph G, we denoted by ||G|| t h e n u m b e r of edges in G. For a set Y, we denote by \Y\ t h e n u m b e r of elements in Y. Let G = (V, E) and G' = (V, E'). T h e n t h e union, t h e intersection and t h e difference of graphs G and G' are defined to be G U G' = {V U V, E U E'), G n G' = (V n V, E n E'_) and G \ G' = (V, E \ E'), respectively. T h e complement of a graph G is t h e graph G with vertex set V(G) such t h a t an edge is in G if, and only if, it is not in G.

2

Multi-Phase Spanning Networks

Given an edge-weighed complete graph with vertex set X (\X\ — n) and subsets Xi, • • • ,Xm of vertices, t h e problem considered in this section is to find a m i n i m u m weighed subgraph G such t h a t for every i — 1, • • •, m, G contains a spanning tree for Xi. We will call this problem t h e Multi-Phase Spanning Network Problem ( M P S p N ) . A graph is called a feasible graph for (X^,--- , Xm) if for every X{ it contains a spanning tree for Xi. A solution of t h e M P S p N is called a minimum feasible graph for (w ; Xi, • • • ,Xm) where w is the weight function for t h e edge set. W h e n w = 1, t h e

Subset

Interconnection

Designs

55

solution of t h e M P S p N is said to be a m i n i m u m feasible graph for (1; X\, • • •, Xm). W h e n t h e M P S p N is studied, t h e following are usually assumed. (a) X — U^LjX,. (In fact, every m i n i m u m feasible graph has no edge incident to a point \T\X\ UJljX,-.) (b) \X{\ > 2 for all i. (\Xi\ with \Xi\ = 1 can always be deleted.) (c) Every feasible graph G satisfies G = Uj^jG,-. In fact, an edge not in U J ^ G ; can be deleted without changing t h e feasibility. To see an example, consider five subsets X\ = { f i , ^ } , ^ = { " l i ^ j ^ 3 ) 1 -^3 = {i>3,^4,v{\, X4 — {vi,V2,v4}, and Xs = {v2,v4,Vs}. These subsets and t h e weight function w are as shown in F i g . l . Some feasible graphs for (Xi,X2,X3,X4,Xs) are in Fig.2. Among t h e m , G* is a m i n i m u m feasible graph for (w ; X\,Xv, A3,X4,Xs).

w 1 2

1

2

3

4

5

5

6

7

g

5

6

7

5

6

3 4

5

5

Figure 1: An input of t h e M P S p N

Figure 2: Feasible graphs for t h e input of Fig.l T h e M P S p N is an N P - h a r d problem. It is closely related t o several classical problems. For m = 1, t h e M P S p N is exactly t h e m i n i m u m spanning tree problem. G r a h a m [13] gave an interesting historical survey. T h e most popular polynomial-time algorithm for t h e m i n i m u m spanning tree problem was discovered by Kruskal [15]. A set system (X\, • • • ,Xm) t h a t has a feasible graph to be a forest is called subtree hypergraph. Such a system has various applications in computer science [1] and statistics [16]. It is also related to chordal graphs [9][10]. How can we tell whether a set system is a subtree hypergraph? Tarjan and Yannakakis [21] found an 0(m -f rc)-time algorithm.

56

D.-Z. Du and P. M. Pa.rda.los

If (X\,- • • ,Xm) is a subtree hypergraph, then we can find a m i n i m u m feasible graph for (w ; X j , - • • ,Xm) in polynomial time. In fact, a subtree hypergraph (Xi, • • • ,Xm) must satisfy t h e following condition. (A) T h e r e exists a feasible graph G such t h a t for any i, j 6 {1, • • •, m } , G[X;] fl G[X,] is a tree, where G[X,] is t h e subgraph of G induced by X{. T h e proof is easy. Suppose t h a t G is the feasible forest. Consider two vertices u and v in G[X,-] 0 G[X,]. Since G is feasible, there exists a p a t h in Gi from u t o w and there also exists a p a t h in G[Xj] from u to v. However, G is a forest. So, t h e two paths are identical, which is a p a t h in G[X;] fl G [ X , ] . T h u s , G[X,] fl G[X,] is connected. A connected subgraph of a forest must be a tree. Therefore, (A) holds. T h e following theorem has been proved in [6] for t h e unit-weight case. We will show t h a t it is also t r u e in general. T h e o r e m 2.1 If (X\, • • •,Xm) satisfies condition (A), then computing feasible graph for (w ; X\, • • • ,Xm) can be done in polynomial time.

a

minimum

Before doing so, let us first look at an efficient special case given by Du and Miller [6]. Consider t h e following condition. (B) For any two subsets X,- and Xj, if |X; n Xj\ > 2, then there exists a subset Xk such t h a t Xi 0 Xj = XkThis condition implies t h a t for any 7 C {1, • • •, TO} either | C\iejX{\ < 1 or there exists k £ {1, • • • ,m} such t h a t Xk = flig/X,-. T h e following theorem was shown in [6]. T h e o r e m 2.2 If (B) holds, then all complements independent subsets of a matroid.

of feasible graphs form a family

of

^,From this theorem and t h e theory of matroids [12], we see t h a t t h e M P S p N with (B) can be solved by t h e following two greedy algorithms. A L G O R I T H M 1. G := t h e complete graph of X; Sort edges of G in weight-increasing order, e\, • • •, e„( n _i)/ 2 ; for i := 1 to n(n — l ) / 2 do if G \ e; is feasible then G : = G \ e;. A L G O R I T H M 2.

Subset

Interconnection

57

Designs

Sort all edges of t h e complete graph of X in weight-decreasing order, ei,- • • ,e„(n-i)/2; G:=0; for i := 1 to n(n — l ) / 2 do if there exists j such t h a t e, connects two connected components of G[-Xj] then G := G U e,-. Algorithm 2 is a variation of a greedy algorithm given by Prinsner [19]. It was proved in [6] t h a t if (A) holds, t h e n t h e feasible graph in (A) is a m i n i m u m feasible graph for (1; X\, • • •, Xm) and every m i n i m u m feasible graph for (1; X\, • • •, Xm) possesses t h e property in (A). In addition, all m i n i m u m feasible graphs for ( 1 ; X\,- • • ,Xm) are m i n i m u m feasible graphs for (1; CiieiXi, I C {1, • • •, m } ) . It follows t h a t : T h e o r e m 2.3 If (A) holds, then all minimum form a family of bases of a matroid.

feasible

graphs for

(l;Xi,---,Xm)

This theorem means t h a t t h e M P S p N with (A) can also be solved by greedy algorithms like Algorithms 1 and 2. T h u s , Theorem 2.1 is proved. An application of T h e o r e m 2.1 is to study t h e M P S p N with small m. It was proved by Tang [11] t h a t for m = 1,2,3, (A) holds. So, we have: C o r o l l a r y 2.4 The MPSpN

is polynomial-time

computable for m = 1,2,3.

Since the M P S p N is equivalent to a problem about intersection of m matroids, t h e case for m = 3 is a little surprising. In fact, a problem about three matroid intersection is usually N P - h a r d [11]. Tang [11] also constructed a counterexample which shows t h a t (A) does not hold for m > 4. T h e following question, therefore is still open. O p e n P r o b l e m 1 With a fixed m > 4, is the MPSpN

polynomial-time

computable?

Note t h a t in t h e unit-weight case, it is easy to show t h a t for m fixed, M P S p N is polynomial-time solvable. So, t h e problem is significant only for general weight function. An extension of bounding m is t h e following condition. (a,/?) for any a distinct points x-i, x?, • • •, xa £ X, there are at most f) X^s containing all X\, X2-, ' ' ' , Xa where a and (j are two natural numbers. Clearly, ( a , /?) implies (a1, /3') whenever a' > a and /?' > /3. Under t h e assumption t h a t all weights equal one, Du [8] proved t h a t for a < 2 and /? < 2, t h e problem is polynomial-time solvable while for a > 3 or j3 > 6 or (a > 2, /? > 3) t h e problem is N P - h a r d . T h e following is an open problem

left in [8].

58

D.-Z. Du and P. M.

Pardalos

O p e n P r o b l e m 2 What is the computational complexity of computing a feasible graph for (1; X\, • • •, Xm) in the case of a = 1 and 3 < (3 < 5 ?

minimum

We believe t h a t it is polynomial-time solvable in case ( a = 1,(3 = 3), b u t is N P - h a r d in cases ( a = 1,(3 = 4) and ( a = l,/3 = 5). T h e interesting work in [8] is about t h e case of a = (3 = 2. T h e problem is transfered to a m a x i m u m matching problem on a graph. It is not known whether this work could be extended to arbitrary edge-weights. An application of the results on a = (3 = 2 is to construct a p p r o x i m a t e solutions for t h e M P S p N . There are several ways. T h e first is to divide t h e collection of subsets [Xi ,••• , Xm) into several small collections satisfying condition (a = 2, (3 = 2), construct a m i n i m u m feasible graph for each small collection, and then union all of t h e m . T h e second is as follows: W h e n a pair of points appears in more t h a n two subsets X^s, we stick t h e m together. In this way, we can reduce t h e original collection of subsets to a new one, satisfying condition (a = 2, (3 = 2). After a m i n i m u m feasible graph for t h e new collection of subsets is found, we break stuck pairs by adding some edges. Although several heuristics have appeared in t h e literature, none of t h e m has been proved to be a bounded heuristics; i.e., a heuristic t h a t produces an approximation solution with total weight within a constant factor from optimal. T h u s , t h e following is an i m p o r t a n t question for the M P S p N . O p e n P r o b l e m 3 Does the MPSpN

have a bounded

heuristic?

Du and Miller [6] proved the following. T h e o r e m 2 . 5 Any set system (Xi, • • •,Xm) can be partitioned in time 0(mn) less than \Jlm subsystems such that for each (X,-,, • • •, X^) of them, X^, • • •, Xii an element in common.

into have

W h e n all weights equal one, a m i n i m u m feasible graph for (1; X\, • • •, Xm) with n^LjX,- ^ 0 is t h e star with vertex set X and t h e center chosen from flJ^A",-. This fact with Proposition 2. together yields a \/2m-heuristic for t h e M P S p N in t h e unitweight case. However, t h e same idea does not work for t h e M P S p N in general. To see this, we point out t h e following. T h e o r e m 2.6 Computing r\?=1Xi jt 0 is NP-hard.

a minimum

feasible

graph for (w ; X\,-

• •, Xm)

with

To prove this, let us mention an N P - h a r d problem, the vertex covering: Given a graph G, find a smallest subset of vertices t h a t cover all edges [11]. To do polynomialt i m e reduction, choose a point s different from all vertices and for each edge uv, give a set Xuv = {v,u,s}. Moreover, we assign weights in t h e following way: t h e distance between any two vertices of G is one; t h e distance from s to any vertex of G is n2

Subset

Interconnection

Designs

59

where n is t h e n u m b e r of vertices of G. Now, we claim t h a t there is a vertex covering of size at most k if and only if there is a feasible graph of total weight at most

\\G\\ + kn2. In fact, if a vertex covering of size at most k exists, then connecting s to every vertex in t h e vertex covering, we obtain a feasible graph of weight at most ||G|| + kn2. For t h e other direction, suppose t h a t a feasible graph of weight at most ||G|| + kn2 exists. In such a graph t h e degree of s must be at most k. Otherwise, t h e total weight would exceed ||G|| + kn2. Now, all vertices adjacent to s in t h e feasible graph form a vertex covering of size at most k. T h u s , we proved t h a t t h e vertex covering can be polynomial-time reduced to t h e considered problem in T h e o r e m 2. If t h e endpoints of an edge belong t o more X^s, then they are more likely to appear in a m i n i m u m feasible graph. From this thought, Prisner [19] gave t h e concept of "benefit". For a graph G and a set system (X\, • • •, Xm), t h e benefit b(uv,G) of an edge uv is t h e n u m b e r of G[X;]'s such t h a t uv connects two connected components of G[X;]. T h e benefit-cost ratio of uv is b(uv, G)/w(uv) where w(uv) is weight of t h e edge uv. Using this concept, Prisner [19] discovered a YA=2 7-heuristic for t h e M P S p N where K is the m a x i m u m number of X^s, which have two elements in common. His heuristic runs in t i m e 0(n4 + mn2) as follows. A L G O R I T H M 3. G:=0; while t h e r e is an edge of positive benefit do choose an edge uv with largest benefit-cost ratio and set G := G U uv. Prisner's heuristic is the best known heuristic for t h e M P S p N .

3

Multi-Phase Steiner Networks

Given an edge-weighed graph B with vertex set X and subsets X\,Y\, • • • ,Xm, Ym of X with Xi r\Y{ = 0, t h e problem considered in this section is to find a m i n i m u m weighed subgraph G such t h a t for every i = l , - - - , m , G contains a Steiner tree for Xi without using vertices not in Y{. We will call this problem t h e Multi-Phase Steiner Network Problem ( M P S t N ) and its solution a minimum feasible graph for [w ; X\, Y\, • • •, Xm, Ym]. For m = 1, the M P S t N is t h e m i n i m u m Steiner tree problem [22] which is already NP-hard. T h u s , t h e M P S t N is much harder t h a n t h e MPStN. It is a well-known fact [14] t h a t t h e m i n i m u m spanning tree is a 2-approximation of t h e m i n i m u m Steiner tree; i.e., the total length of a m i n i m u m spanning tree is not bigger t h a n twice t h e total length of a m i n i m u m Steiner tree. However, there does not exist a constant c such t h a t t h e total weight of a m i n i m u m feasible graph for (w • X\, • • • ,Xm) is not bigger t h a n c times t h e total weight of a m i n i m u m feasible graph for [w ; X\, Y\, • • •, Xm, Ym\. To see this, we look at t h e following example.

60

D.-Z. Du and P. M. Pa.rda.los

Let Xij = {i,j} and Yij — {i,j,0} for 1 < i < j < n. A m i n i m u m feasible graph for (1; Xij, 1 < i < j < n) t h e n has weight n(n — l ) / 2 . However, a m i n i m u m feasible graph for [1; Xij,Yij, 1 < i < j < n] has weight n. T h u s , t h e ratio between t h e m is (n — l ) / 2 , which cannot be bounded by a constant. W h a t is t h e relation between m i n i m u m feasible graphs for (w ; Ylt • • •, Ym) and [w ; X\, Yx, •••, Xm, Ym]l In t h e unit-weight case, Chao and Du [3] proved t h e following. T h e o r e m 3.1 If G is a minimum feasible graph for (1 ; Y\, • • •, Ym) satisfying condition (A), and for every edge uv of G there exists an index i such that u,v 6 Xi, then G is a minimum feasible graph for [1 ; X-y, Yj, • • •, Xm ,Ym]. So far, t h e following question is still open. O p e n P r o b l e m 4 Could Theorem

3.1 hold for general weight

function?

Concerning the ratio between m i n i m u m feasible graphs for (w ; Y\,---,Ym) and [w ; Xi, Yi, • • •, Xm, Ym], we look at t h e following example. Let X\ = {1,2} and Y\ = {1, 2, • • •, n}. T h e n a m i n i m u m feasible graph for (1; Yi) must have weight n — 1 while a m i n i m u m feasible graph for [1; X\, Y\\ has weight 1. Such a ratio cannot have a constant bound. In t h e above example, t h e m i n i m u m feasible graph for (1; Yi) contains many unnecessary edges for [1; A ^ Y i ] . Such edges are easily deleted. So, t h e following question arises: O p e n P r o b l e m 5 Starting from a minimum feasible graph for (w ; Y\,- • • ,Ym), could we modify it in polynomial time to obtain a bounded approximation solution of minimum feasible graph for [w ; Xi, Y\, • • • ,Xm,Ym}? In [6], two operatuions for simplifying a feasible graph for (1; Xi, • • • ,Xm) were described. Such operations can be extended to simplify a feasible graph G for [w; X1,Y1,---,Xm,Ym]as follows. ( 0 1 ) Check each edge e of G. If t h e removal of edge e preserves t h e feasibility of t h e graph G, then delete e. ( 0 2 ) Check each edge e of G, the complement of G. If we can use operation ( 0 1 ) to remove some edges of e U G with total weight more t h a n the weight of e, then remove such edges and add e. Clearly, t h e two operations work in polynomial time. An open question left in [6] is as follows. O p e n P r o b l e m 6 Can we obtain a bounded approximation of the minimum feasible graph for (1; Xi, • • • , Xm) from the minimum feasible graph for (1; fl.-g/X,-, / C {!,••• ,m}) by using operations (01) and (02)?

Subset Interconnection Designs

61

We can ask a similar question between [w ; X\, Y\, • • •, Xm, Ym] and [w ; Pl.-g/i^, / C {1, •••,"»}]. An interesting special case for the MPStN is n^ljY; ^ 0. In this case, there exists a star which is a feasible graph. Many problems in the real world can be decomposed into several subproblems in such a special case. So, a feasible graph can be obtained by taking the union of solutions of subproblems. Unfortunately, it was proved in [7] that: T h e o r e m 3.2 Computing a minimum feasible graph for [1; Xi,Y\, • • • ,Xm,Ym] n^jK' + 0 is still NP-hard.

with

For [1; Xi, Y\, • • •, Xm,Ym] with n^LjY; j= 0, the star feasible graph has, at most, twice the weight of a minimum feasible graph. However, it is not known whether or not for [w ; X\,Y\, • • • ,Xm,Ym] with n^jYi 7^ 0 there is a bounded approximation with a factor two from optimal. Bern and Plassmann [2] proved that the minimum Steiner tree problem in graphs is MAX SNP-hard; i.e., it is unlikely to have a polynomial-time approximation scheme. The MPStN clearly inherits this property.

References [1] C. Beeri, R. Pagin, D. Maier, M. Yannakakis, On the desidability of acyclic database schemes, J. ACM30 (1983) 479-513. [2] M. Bern and P. Plassmann, The Steiner problem with edge lengths 1 and 2, Information Processing Letters 32 (1989) 171-176. [3] S.-C. Chao and D.-Z. Du, A sufficient optimality condition for the valveplacement problem, J. North-East Heavy Industry Institute, 4 (1983) (in Chinese). [4] D.-Z. Du and Y.-M. Chen, Placement of valves in vacuum systems, J. Electric Light Sources, 4 (1976) (in Chinese). [5] D.-Z. Du, An optimization problem, Discrete Appl. Math., 14 (1986) 101-104. [6] D.-Z. Du and Z. Miller, Matroids and subset interconnection design, SIAM J. Disc. Math., 1 (1988) 416-424. [7] D.-Z. Du and X.-F. Du, A special case of valve-placement problem, Acta Mathematics Applicatae Sinica, 4 (1991) (in Chinese). [8] D.-Z. Du, On complexity of subset interconnection designs, DIMACS TR 91-23.

62

D.-Z. Du and P. M.

Pardalos

[9] P. Duchet, Propriete de Helly et problemes de representation, in: Colloque ternational Paris-Orsay 260 (1978) 117-118. [10] C. F l a m e n t , Hypergraphes arbores, Discrete

Mathematics

21 (1978) 223-227.

[11] M.R. Garey and D.S. Johnson, Computers and Intractability, Theory of NP-Completeness, Freeman, San Francisco, 1979. [12] M. Gondran and M. Minoux, Graphs and Algorithms, 1984.

In-

A Guide to the

John Wiley, New York,

[13] R.L. G r a h a m , P. Hell, On t h e history of t h e m i n i m u m spanning tree problem, Ann. History Computing 7 (1985) 43-57. [14] R.M. K a r p , Reducibility among combinatorial problem, in R.E. Miller and J . W . T h a t c h e r (ed.), Complexity of Computer Computation (Plenum Press, New York, 1972) 85-103. [15] J . B . Kruskal, On the shortest spanning subtree of a graph and t h e traveling saleman problem, Proc. Amer. Math. Soc. 71 (1956) 48-57. [16] S.L. Lauritzen, T.P. Speed, K. Vijayan, Decomposable graphs and hypergraphs, J. Austral. Math. Soc. A 36 (1984) 12-29. [17] E. Prisner, Intersection-representation of graphs in n-cyclomatic graphs, Combinatoria, to appear.

Ars

[18] E. Prisner, Familien zusammenhangender Teilgraphen eines Graphen und ihre Durchschnittsgraphen, Dissertation in Universitat H a m b u r g 1988. [19] E. Prisner, Two algorithms for the subset interconnection design, Networks, appear.

to

[20] R.E. Tarjan, M. Yannakakis, Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs, SIAM J. Comput. 13 (1984) 566-579. [21] T.-Z. Tang, An optimality condition for m i n i m u m feasible graphs, Applied ematics, 2 (1989) 21-24 (in Chinese). [22] S. Vofi, Steiner

Probleme

in Graphen, (Hain, Frankfurt, 1990).

Math-

63 Network Optimization Problems, pp. 63-92 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Polynomial and Strongly Polynomial Algorithms for Convex Network Optimization Dorit S. H o c h b a u m 1 School of Business Administration Research Department, University

and Industrial Engineering of California, Berkeley

and

Operations

Abstract

Numerous recent developments in complexity theory and the design and analysis of algorithms have contributed to new insights into nolinear network flows problem. In particular, it is now known that the separable convex costs network flow problem is polynomial, in either integers or continuous variables, the latter for output with specified number of significant digits. Moreover, assuming a reasonable computation model consisting of the arithmetic operations and the floor operation and comparisons, it was demonstrated that strongly polynomial algorithms are impossible for the problem. It follows that the available polynomial algorithms are very close in their running time to optimal algorithms. Several properties of convex separable network problems are investigated here. These include so-called proximity theorems that establish the proximity of integer and continuous solutions, and also that of the closeness of an optimal solution to a scaled piecewise linear approximation of the nonlinear problem to the optimial solution to the nonlinear problem. Such proximity theorems make it possible to use scaled piecewise linear approximations in algorithms of polynomial complexity and obtain integer solutions from continuous and vice versa. Classes of algorithms that use the proximity properties are described. Quadratic network problems have some special properties differentiating them from other nonlinear instances of the problem. It is shown how these properties could be exploited for faster and more efficient algorithms. Issues related to quadratic nonseparable network problems are discussed as well. Complexity issues and NP-hardness of classes of nonlinear network flow problems are presented and important directions for future work are delineated.

64

1

D. S. Hochbaum

Introduction

The general nonlinear costs network flow problem is an instance of nonlinear optimization subject to linear constraints. Given a network G = (N,A), let T be the n x m node-arc adjacency matrix of the network, n = \N\ and m = \A\, x the vector of flows on the arcs, x = {a;.j}(ij)e4, &v the supply (demand) of node i, and U;J the capacity upper bounds. The integer formulation of the nonlinear flow problem is, min F(x) s.t. Tx = b 0 < x < u x integer For F(x) a linear function, the problem of minimum cost flows have been studied since the early 60's, and has been studied extensively in the past decade. Until recently, results for the nonlinear problem were limited to iteratively convergent algorithms, rather than algorithms with finite established complexity. An extensive review of such iterative algorithms for the continuous nonlinear constrained problem is given in [Min86a]. The reason for the lack of finite algorithms is that in general the continuous version of the problem (when the integrality requirement is omitted) may posses solutions that cannot be expressed as finite output. Such solutions are irrational numbers or even nonalgrbraic numbers that cannot be represented as a solution to a certain polynomial equation. This property renders complexity theory inapplicable, as it requires finite length input and output. The nonconvex continuous version of the problem is NP-hard [Sah74] even for the quadratic case and so are many simplified versions of the nonseparable problem even if convex. The corresponding concave minimization problem is in general NP-hard. The complexity of the concave network problem as well as some algorithms (exponential) are discussed in [GP90], [GP91a] and [GP91b]. An excellent unifying presentation of some polynomial instances of the concave flow problem is given in [EMV87]. There the problem is proved polynomial when the arcs are uncapacitated and number of demand nodes is fixed. It is also proved polynomial for certain classes of planar graphs. Yet, even with the assumption of convexity, a quadratic nonseparable problem when the set of constraints on the adjacency matrix is empty, is NP-complete. (An example is provided in section 4). It is therefore the case, that for general purpose polynomial algorithms we focus primarily on the class of problems with F{x) = £ fij{xij) where fij are convex. Throughout we discuss convex network flow problems with the classification: 1. Separable or nonseparable 2. Quadratic or nonquadratic and nonlinear 3. Integer or continuous. Hochbaum and Shanthikumar proved several proximity results for convex separable nonlinear optimization ([HS90]). For the nonlinear separable network flow

Polynomial

&c Strongly

Polynomial

Algorithms

for Convex Optimization

65

problem these d e m o n s t r a t e closeness properties between an integer optimal solution to t h e problem and a continuous optimal solution, between two scaled solutions to t h e problem, and between a scaled solution and an o p t i m a l (integer or continuous) solution. A scaled solution is a solution to t h e problem when t h e objective is given on a piecewise linear grid determined by t h e scaling constant. T h e closeness in all these case is t h a t t h e two vectors differ in t h e m a x i m u m n o r m by at most m or 2m units. E d m o n d s and K a r p ' s [EK72] approach of capacity scaling could also be viewed as a proximity result, except t h a t for t h e m i n i m u m cost network flow problem, t h e solution at each phase is not necessarily a feasible solution to t h e scaled problem. We shall discuss this approach in t h e context of proximity. Defining what it means to obtain a solution in t h e continuous case is not straightforward due to t h e potential irrationality of t h e o u t p u t for nonlinear problems. We therefore use t h e definition of t- accurate solution, where log ^ is t h e n u m b e r of significant digits desired in t h e o u t p u t . A solution, x^ is e-a.ccura.te if there exists an optimal solution x* such t h a t ||a;' £ ' — a;*||oo < e. T h a t is, t is t h e accuracy required in t h e solution space. W h e n t h e functions f<j are convex nonlinear functions, both t h e continuous and integer versions of t h e problem are solvable by algorithms running in a n u m b e r of steps t h a t is a polynomial function of m and n, b u t also of log 2 ||M||OO or log 2 ||6||ooSuch an algorithm for t h e integer case is described by Minoux [Min86] using t h e capacity scaling approach, and for the integer and continuous cases in [HS90], using t h e proximity results, where t h e running t i m e for t h e continuous case depends polynomially on t h e n u m b e r of digits required to describe t h e solution. These algorithms are polynomial b u t not strongly polynomial, as they depend on t h e magnitudes of t h e n u m b e r s appearing in t h e problem instance, as well as t h e problem size p a r a m e t e r s . On t h e practice side, a heuristic implementation of this algorithm, using the interior point m e t h o d , indicates t h a t it is very efficient, [HS91]. T h e topic of strong polynomiality became an i m p o r t a n t issue once a polynomial algorithm, t h e Ellipsoid m e t h o d , was devised for solving linear p r o g r a m m i n g problems. T h e ellipsoid m e t h o d , as well as all other polynomial algorithms known for linear programming, is polynomial b u t not strongly polynomial. T h a t is, t h e running t i m e depends on t h e d a t a coefficients, rather then only on t h e n u m b e r of variables and constraints. Consequently, solving linear programming problems with different degrees of accuracy in t h e cost coefficients results in different running times. T h a t is, t h e actual n u m b e r of arithmetic operations grows as t h e accuracy of t h e d a t a , and hence the length of t h e numbers in the input, increases. Such behavior of an algorithm is undesirable as it requires careful monitoring of t h e size of t h e n u m b e r s appearing in t h e d a t a describing t h e problem instance, and hence limits t h e efficient applicablity of t h e algorithm. W h e t h e r linear programming is solvable in strongly polynomial t i m e is still an open problem, yet much progress has been reported. T h e most i m p o r t a n t work in

66

D. S.

Hochbaum

this context is t h a t of Tardos, [Tard86], t h a t established t h a t "combinatorial" linear p r o g r a m m i n g problems, those with a constraint m a t r i x having small coefficients, are solvable in strongly polynomial time. In particular, m i n i m u m cost network flow problems, which have coefficients t h a t are either 0, 1 or -1 in t h e constraint m a t r i x , are solvable in strongly polynomial time, [Tard85]. For non-quadratic objective functions, it was proved in [Hoc90] t h a t there is no algorithm solving t h e single source separable convex Transportation problem in strongly polynomial t i m e . This lower bound result holds either in the comparison model or in t h e algebraic computation tree model with t h e four arithmetic operations + , — , - , : , and comparisons, and it holds even if t h e floor operation is added. Since the single source Transportation problem is a special case of separable convex minimization problems over totally unimodular constraint matrices or over polymatroidal constraints, then in particular all nonlinear convex network flow problems cannot be solved in strongly polynomial time. T h e question of whether there exists a strongly polynomial algorithm for quadratic separable network flow problems remains open. T h e lower bound result in [Hoc90] does not apply to this case. Moreover, for quadratic problems there is a polynomial bound on the length of the o u t p u t (this is because the optimality conditions are linear). Partial results have been derived recently. Tamir, [Tam89], devised a strongly polynomial algorithm for m i n i m u m convex quadratic separable cost flow when the underlying network is series-parallel, in t h e presence of a single source-sink pair. In [CH90] a strongly polynomial algorithm for t h e Q u a d r a t i c Transportation problem with fixed n u m b e r of sources or sinks is described. T h e strongly polynomial algorithm described in t h a t paper delivers optimal continuous solutions using t h e operations mentioned above, without t h e floor operation. For t h e integer version, a strongly polynomial algorithm follows from a technique proposed in [HS90] t h a t relies on t h e proximity result. T h e technique is a strongly polynomial algorithm for deriving an integer optimal solution from a continuous optimal solution to nonlinear convex problems over totally unimodular constraint matrix. This technique requires the floor operation. In [HH92], Hochbaum and Hong describe a strongly polynomial algorithm for t h e q u a d r a t i c separable convex network Allocation problem with complexity equal to t h a t of a single m a x i m u m flow algorithm. This problem is not a general network flow problem as only some of t h e edges have costs associated with t h e m . All other edges only have capacity limits. Again, t h e idea here is to solve the continuous problem optimally, and then use the technique for converting a continuous optimal solution to a integer optimal solution. T h e r e is no hope of solving t h e integer version in strongly polynomial t i m e without t h e use of t h e floor operation. This follows from an observation by Tamir, [Tam89], illustrating t h a t the floor operation is essential for solving integer quadratic problems. Tamir has demonstrated this via t h e following quadratic Allocation problem which

Polynomial

&: Strongly

Polynomial

Algorithms

for Convex

Optimization

67

m a y be viewed as a quadratic Transportation problem with a single source node: min S.t.

[-x\

+ - ( a - l)a^]

Xi + X2 = b xi,

x2 > 0, integer

T h e optimal value of x2 is [-J. Therefore t h e floor operation can be executed via a routine t h a t solves a quadratic Transportation problem. This, along with t h e impossibility result of strongly polynomial algorithms for non-quadratic problems mentioned above, implies t h a t the floor operation, if helpful in devising strongly polynomial algorithms, is limited in its power to quadratic problems only. Although separable quadratic problems m a y be solvable in strongly polynomial t i m e , t h e question of strong polynomiality of nonsepamble quadratic continuous optimization problems subject only to nonnegativity constraints, is at least as hard as t h e question of strong polynomiality of linear p r o g r a m m i n g (this insight is due to I. Adler). Therefore, we do not investigate this issue here, as it should be treated in t h e framework of t h e strong polynomiality of linear programming. Nonseparable quadratic integer problems are N P - h a r d , [MK87], so in t h a t sense also these problems are beyond t h e scope of our current investigation. Nonseparable convex continuous problems, as well as nonseparable q u a d r a t i c convex problems are solvable as follows. A solution approximating t h e o p t i m a l objective value to t h e convex continuous problem is in principle obtainable in polynomial t i m e , provided t h a t t h e gradient of t h e objective functions are available and t h a t t h e value of t h e optimal solution is bounded in a certain interval. Such work, based on t h e Ellipsoid m e t h o d , is described by Yudin and Nemirovsky ([NY83]). In t h e q u a d r a t i c case, exact solutions are possible. Indeed, t h e polynomial solvability of continuous convex quadratic programming problems over linear constraints was established as a byproduct of the ellipsoid algorithm for Linear P r o g r a m m i n g (see Kozlov, Tarasov and Khachian, [KTK79]). T h e best running time reported to d a t e is by Monteiro and Adler [MA89], 0(m3L), where L represents t h e total length of t h e input coefficients and m t h e n u m b e r of variables. Similar results were also given by Kapoor and Vaidya [KV86]. T h e case for t h e integer problems t h a t are nonseparable is harder. We give in Section 5 a condition for such problems to be solvable in polynomial t i m e for t h e convex nonseparable problem. We t h e n describe special cases of integer nonseparable network flow problems t h a t are polynomial, all of which with quadratic objective. [HSS92] gave a polynomial algorithm for a specific quadratic nonseparable Transportation problem in integers. T h a t algorithm is strongly polynomial with some restrictions on t h e size of t h e linear coefficients in t h e objective function. Granot and Skorin-Kapov [GSK90] consider also a case of Transportation problem in integers solvable in polynomial t i m e . Cases where quadratic convex nonseparable optimization over box constraints (which are a special case of t h e quadratic network flow with an e m p t y set of flow

68

D. S. Hochbaum

balance constraints), have been shown to have strongly polynomial algorithms are: by Barahona [Bar86], by Baldick and Wu [BW90] and by Hochbaum [Hoc89]. All these are special cases in which the quadratic matrix in the objective can be made separable by using a totally unimodular transformation of the variables. When this is the case there is a polynomial time algorithm for separable convex optimization over totally unimodular constraints, that delivers optimal integer solutions in polynomial time ([HS90]). Baldick ([Bal91]) has also presented several classes of matrices that can be diagonalized (i.e. the quadratic problems are made separable) via the use of totally unimodular matrices. The plan of this paper is as follows. In section 1 we describe the scaled piecewise linear approximation and the proximity theorems for the separable flow problem and a general purpose polynomial algorithm. In Section 2 we propose a framework for obtaining several classes of algorithms that rely on these properties and on scaling. Section 3 provides the description of the lower bound proof that results in the impossibility of strongly polynomial algorithm for convex network flow problems. In Section 4 we demonstrate several results concerning the case of quadratic separable network flow, and Section 5 gives some limited results for nonseparable problems. The notation in this paper, includes bold letters for denoting vectors, and the notation e is used for the vector (1,1, ..., 1).

2 2.1

Proximity Theorems and Piecewise Linear Approximations The Scaled Piecewise Linear Approximation

Let fij : R —> R, (i,j) 6 A, be m convex functions and define

(•J')6A

We are interested in the solutions to the nonlinear convex integer network flow, (INF), problem: (INF)

min s.t.

F(x) Tx = b 0< x < u x integer

and to its continuous (real) relaxation {RNF)

min F(x) s.t. Tx = b 0< x < u

Polynomial

Si Strongly

Polynomial

Algorithms

for Convex

Optimization

69

Here T is an n x m adjacency m a t r i x of t h e network G = (V, A), 6 is a demandsupply n -vector, and u t h e capacity upper bounds vector on t h e arcs. T h e idea of linearizing a nonlinear function in order to obtain solutions is not new. In a 1959 book [Den59], Dennis writes regarding quadratic cost networks: "The electrical model for network flow problems can be extended to include flow branches for which the total cost contains terms depending on the square of the individual branch flows.... It appears that the algorithms presented in this chapter could be generalized.... These methods however are combinatorial in character and could require prohibitive calculation time, even on relatively simple networks. Certainly the simplicity and elegance of the diode-source algorithms would be absent. It would seem that the most practical means of attacking flow problems with quadratic costs would be to approximate the cost curve with piece-wise linear curve and substitute an appropriate number of linear cost branches connected in parallel." Whereas it is clear t h a t solving t h e problem on a piecewise linear approximation yields a feasible solution, t h e quality of such solution and its closeness to an optimal solution was not evaluated. [HS90] analyzed and described these proximity properties between the piecewise linearized problem's optimal solution and an optimal solution for t h e convex separable minimzation over linear constraints. We focus here only on t h e special case of network flow problems. We now describe formally t h e piecewise linear approximation. For this we introduce t h e following class of problems. For any scaling constant s € R+ let t h e scaled problem (RNF — s) be defined by

(RNF-s)

min

F(sy) Ty = b/s 0 < y < u/s

By setting x = sy in (RNF — s) it is easily seen t h a t (RNF — s) is t h e same as (RNF). If we define the integer scaled problem to be (RNF — s) with an integerality requirement t h a t y is integer, then there is a feasible solution only if b/s is integer. In [HS90], t h e constraints are given as inequality constraints. T h e scaled problem is then,

(INF-s)

min

F(sy) Ty > b/s -Ty > -b/s 0 < y < u/s y integer

Note t h a t a solution to (INF — s) is not necessarily feasible for t h e original problem. T h e a m o u n t of unsatisfied supply (demand) is bounded by n s units. This problem can be solved as a linear network flow problem by rounding t h e scaled supplies

70

D. S.

Hochbaum

down to t h e nearest integer, and t h e scaled d e m a n d s ( t h a t are negative numbers) up to t h e nearest integer and adding a d u m m y node t h a t absorbes t h e difference between t h e total supply and total d e m a n d at cost 0. W h e n s = 1, the difference betweent the scaled supply and d e m a n d is 0, and hence there is no need for using t h e d u m m y node. E d m o n d s and Karp [EK72] used such idea of capacity scaling for t h e m a x i m u m flow problem t h a t can be formulated as a m i n i m u m cost problem with 6 = 0. Since t h e network flow problem readily provides integer solutions for integer right hand sides, and 0 is integer, feasibility was not an issue. Instead of t h e scaled version of (RNF) we now use a linearized version of (RNF) as defined below. For any s > 0 let /,^ : s : R —> R be t h e linearized version of fij such t h a t ffj's takes t h e same value as fij at all integer multiples of s: t h a t is, fij''s(sy) = fij(sy), for V integer and

S

~T~

x

S

S

iy

S

o

where [^-J (the floor of ^LL) is t h e largest integer value smaller t h a n or equal to ^-. Clearly fffs is a piecewise linear function which is convex if fij is convex. /,^ :s is depicted in Figure 1. Now define

{LNF-s)

min

FL'(x) Tx>b 0< x < u

where

Note t h a t t h e optimal solution to t h e integer program

(INF'-s)

min

FL:s{sy) Ty > b/s -Ty > -b/s 0 < y < u/s y integer

is also an optimal solution to (INF — s) because FL:s and F take t h e same value at integer multiples of 5 : i.e., FL:s(sy) = F(sy) for all y integers. Hence, to solve (INF — s) one can solve instead t h e piecewise linear version, (INF' — s). T h e set of feasible solutions {a; | Tx = 6,0 < a; < u] is bounded in a box of length B — minfJIuHoo, ||6||i} in each dimension (the flow on each edge cannot exceed capacity, or total sum of demands). In t h e initial iteration we substitute t h e tightest bound for t h e upper bound on t h e capacities. In t h e proximity-scaling algorithm, the

Polynomial

Si Strongly

Polynomial

Algorithms

for Convex Optimization

71

Figure 1: upper and lower bounds on the variables will be updated at each iteration. So in a generic iteration t h e capacity bounds are replaced by constraints: Ijjj

2i -Eij _^ U ij i

(i,j)

G A.

These constraints will be scaled as well. In our procedure, at each iteration we shall work with t h e upper and lower bounds, U{j and Ljj, and a scaling constant s, such t h a t t h e length '^-, is an integer constant. We denote N= Each variable ?/;J, for t h e integer case, is then substituted by a sum of TV 0-1 variables:

., N. For t h e continuous case the substitution for i,-

c, = *{F i j+i>i; ) } s

k=i

o < 2,!*> < l So now (LNF — s) and (INF' on t h e variables z,(<0

forfc =

1,...,N.

— s) are piecewise linear convex minimization problems

72

D. S. Hochbaum

The modified objective function for the linear programming formulation (e.g. [D63] pp.482-486) of both (LNF - s) and (INF' - s) is min £

f^(s[^\)

+ f:

£

[/^(^j+^-J^^L^j+fc-l))]^).

Let the column of T corresponding to arc (i,j) be denoted by a;j. Each such column is duplicated N times. Alternatively, in the graph, each arc is multiplied N times where each duplicated arc has the capacity 1. Let TN be the matrix T in which each column is duplicated TV times. The constraint set is then, TNz

= 6'.

where b'p = b'p/s — Yl(aij)v [Lij/S\ • The linear programming version of the (LNF — s) problem is then (omitting the constant from the objective function),

(LNF'-s)

min

£

£[/£'(*( | ^ J + *)) - /£'(*( L^J + * - 1))]^

(<,i)eAk=i

s.t.

*

s

N

T z = 6'. 0<^
One then has from well known results (e.g. [D63]), L e m m a 2.1 Let z be an optimal solution to (LNF' — 5). If fij is convex for each (i,j) € A then x defined by Xij = s[-f-\ + Ylk=i % > for a " {hj)> *'s an optimal solution to (LNF — s). Polynomial time algorithms for the piecewise linear problem are essentially an adaptation of the minimum cost flow problem's algorithms to graphs with multiple edges between a pair of nodes. An algorithm for this problem is given in [ABG84]. The drawback is that the number of edges in the graph depends now on the number of grid intervals, or alternatively on the scaling constant. Although a straightforward implementation works in polynomial time, it is nevertheless possible to speed up the running time. Pinto and Shamir [PS90] presented an algorithm that is more efficient than the straightforward approach.

2.2

Proximity Results

In this section we give proximity results for the optimal solutions of (INF), (RNF), (INF — s) and (LNF — s). Proofs for these results in more general context are given in [HS90]. The first proximity result is between the optimal solution of (INF) and (RNF). This is useful either when a continuous solution is more easily achievable than an integer one, such as in the case of quadratic network flow. Or, it is useful

Polynomial Si Strongly Polynomial Algorithms for Convex Optimization

73

when an integer solution is achievable, and then this proximity is used to obtain an eps-accurate solution for the continuous problem. Note that integrality can be defined with respect to an arbitrary grid, in particular, with respect to an eps size grid. Theorem 2.2 (i) For each optimal solution x for (RNF) (INF) such that \\x - z*||oo < m.

there exists an optimal solution z* for

(ii) For each optimal solution z for (INF), (RNF) such that \\x" - z]],^ < m.

there exists an optimal solution x* for

Theorem 2.2 for the quadratic case was also proved in [GSK90], and an equivalent result for the general nonlinear case was independently derived by Werman [W91]. The following proximity is between the optimal solutions to (INF) and (INF — s). It allows to solve the scaled problem on finer and finer grids in order to derive a solution for the integer problem (INF). Theorem 2.3 Let s be a positive integer. (i) For each optimal solution y for (INF — s) there exists an optimal solution z" for (INF) such that \\sy — z*||oo < ms. (ii) For each optimal solution zhat for (INF) there exists an optimal solution y* for (INF — s) such that ||sy* — z||oo < rns. Remark: Since (RNF) is a continuous relaxation of (INF — s) with x := sy, from Theorem 2.2, one sees that for any optimal solution y of (INF — s), there exists an optimal solution x* to (RNF) such that ||x*/s—y|joo < m. Also from Theorem 2.2 we know that there exists an optimal solution z* to (INF) such that ||as* — z*||oo < rn. Combining these two proximity results using the triangular inequality of the l^, norm, one has \\sy — z*||<x> < m(\ + s). Similarly, it can be shown that for any optimal solution y of (INF — s), there exists an optimal solution y* for (INF — s) such that \\sy* — z||oo < m(\ + s). These bounds, however, are slightly improved in Theorem 2.3. The following proximity result between the optimal solutions of (INF) and (LNF— s) could be used to develop an algorithm for (INF) in case (LNF — s) is more efficiently solvable than (INF — s). Theorem 2.4 Let s be a fixed positive integer. For every optimal solution x for (LNF — s) there exists an optimal solution z* for (INF) such that

||*-z*IU <2ms. To develop an algorithm for the continuous case, (RNF), we use the following proximity result between the optimal solutions to (RNF) and (LNF — s). Theorem 2.5 Let s > 0 be fixed. For every optimal solution y for (INF — s) there exists an optimal solution x* for (RNF) such that \\y — x * ] ^ < ms.

74

D. S.

2.3

Hochbaum

The Proximity-Scaling Algorithm

In this section we give an overview of the [HS90], which will be used as a framework for t h e implementation. T h e proof to all claims in this section are in [HS90]. T h e algorithm maintains a box t h a t contains an optimal solution to t h e problem. At each iteration t h e volume of this box is reduced by a factor of 2 m . This is achieved by reducing each dimension of t h e box by a factor of 2. A direct implementation will therefore require a total of log 2 B iterations, where B is t h e length of a side of t h e initial cube for t h e integer problem. For t h e continuous problem solved to eps-accuracy there are log 2 ^ such iterations. T h e iterations continue, until, t h e optimal solution interval is reduced to a size at most It. T h e complexity of the algorithm is summarized in t h e following theorem: T h e o r e m 2.6 Let the complexity of unit capacity arcs and n nodes, (INF (INF) for an integer optimal solution ing (RNF) for an t-accurate optimal network flow problem is log 2 — T(im2,

solving the scaled network flow problem on m — s), be T{m,n), then the complexity of solving is log 2 —T(4m 2 , n), and the complexity of solvsolution to the nonlinear separable and convex n).

T h e idea of t h e algorithm is as follows. At each iteration t h e optimal value of each variable is to be found in a bounded length interval, which is initially of length B. T h e interval is subdivided in Am units, each of size s. T h e scaled problem is then solved for s (solved optimally or within some approximate feasibility). T h e optimal solution to t h e scaled problem is then within m units, each of size s away from an optimal solution. ( T h e same statement can be m a d e with regard to an optimal solution t h a t is not feasible in t h a t it satisfies only a portion of t h e supplies and d e m a n d s ) . T h e o p t i m a l solution is then to be found in a new interval the length of which is only 2m units of size s each. We then divide s by 2 to produce a new scaling constant for t h e half length interval centered around t h e solution derived in t h e previous iteration. At t h e last iteration t h e length of t h e interval for each variable is 0(m). Therefore then t h e piecewise linear approximation on the integer grid provides an optimal solution to t h e problem. Since we stop when t h e scaling constant is 1 and t h e length of t h e interval containing each variable Uij — Lij is TO, t h e total n u m b e r of iterations of this algorithm is at most log 2 —. In order to solve the continuous problem, we solve t h e integer problem on a grid of size ^ and a final interval of length e. From T h e o r e m 2.2 this integer solution is at most me units away from the continuous optimal solution. It is therefore an t -accurate solution. Let B = minJU-ulloo, | | 6 | | i } . T h e generic proximity-scaling algorithm is described below. Procedure Proximity-Scaling Step 0: Let s = 2^°^\

U^

= B, £ (1 > = Oe, i = 1.

Polynomial

& Strongly

Polynomial

Algorithms

for Convex Optimization

75

Let j/<°) = Oe. Step 1: Solve, (using 2 • j / ( , _ 1 ) as an initial feasible solution):

(INF'-s)

min

FL:'(sy) Ty > b/s -Ty > -b/s 0 < y < u/s L{i)
Let an optimal solution be y'O. If s = 1, (for t h e continuous case s = e / m ) , t h e o u t p u t is y^'K Stop, j/<0 is an o p t i m a l solution. Step 2: Set [7 ( , ' + 1 ) = i / « + s n e Set L ( , + 1 ) = 1/(0 - s u e Set s <- | . Set i <— i -f 1. Go to step 1. T h e proximity scaling algorithm is clearly polynomial, yet its efficiency depends on t h e algorithm used for each scaled problem. One way of solving t h e scaled problem is using t h e successive shortest paths method ([Je58], [Ir60] and [BG61] with some improvements by [EK72]). This algorithm works with a solution t h a t is capacity feasible, dual feasible - all reduced costs are nonnegative, and iteratively progresses towards t h e feasibility of t h e flow balance constraints. T h e initial solution is t h e 0 vector and t h e initial reduced costs are all set to 0 as well. This algorithm can be used directly for a piecewise linear convex function with t h e costs being t h e left segment derivative. For each node with a unit of unsatisfied supply (excess), there exists at least one node with a unit of unsatisfied d e m a n d (deficit), since sum of supplies equals sum of demands. T h e basic iteration of t h e algorithm is to find all t h e shortest paths in t h e residual network, originating at a node with an excess using as costs t h e reduced costs. T h e reduced costs are then u p d a t e d with t h e distance label - an operation t h a t maintains their nonnegativity, and one unit of excess supply is sent to some node with a deficit. Each such iteration involves finding single source shortest paths in a graph with nonnegative costs. This can be done using Dijkstra's algorithm in 0(m + n logn) operations, on a graph with n nodes and m arcs. Notice t h a t also in the graph we process at each iteration, t h a t has each arc duplicated 0(m) times, t h e single source shortest paths can be evaluated in t h e same running t i m e , since among all t h e duplicated arcs, there is only one of lowest cost to be considered, and maintaining t h e arcs sorted is straightforward.

76

D. S.

Hochbaum

Since for a given problem there are initially | | | 6 | | i = | H" = i \h\ unit oi excess, t h e total running t i m e of t h e successive shortest paths algorithm is 0 ( | | 6 | | i ( m + n log n)) steps. T h e piecewise linear problem, (INF' — s), differs from t h e linear problem in t h a t t h e sum of supplies is not to t h e sum of d e m a n d s . Each supply value 6; is rounded down to s[^J whereas each d e m a n d value (which is a negative n u m b e r ) , is rounded down in absolute value, i.e. t o s|^f-~|. Once all these d e m a n d s and supplies are satisfied t h e r e are up t o n unit multiples of s yet unsatisfied. Also, because capacity upper bounds in (INF' — s) are effectively rounded down it m a y be impossible to find a p a t h of capacity 1 in t h e residual network from an excess node to a deficit node. Since t h e original problem (INF) was feasible, t h e infeasibility per arc is at most one unit of s. It follows t h a t each arc can prevent at most one unit of excess from getting cancelled against deficit. Hence, applying t h e successive shortest p a t h s algorithm at an iteration will result in a solution satisfying all b u t 0(n + m) unit multiples of s of supply and demand. Therefore, if we start iteration i with t h e initial solution 2 • J/' 1 - 1 ', this solution will be capacity-feasible and dual feasible for t h e problem with s,- = |-s._i, while at most 0(m + n) units of excess need to be processed. Hence t h e running t i m e of each iteration for a given scaling factor is only 0 ( ( m + n ) ( m - f n l o g n ) ) . Since there are log fracBm iterations, t h e running t i m e for t h e integer convex problem is 0 ( l o g ^ ( m + n)(m + n l o g n ) ) , and for t h e e-accurate solution it is O(log —(m + n)(m + n l o g n ) ) . This complexity analysis does not depend on t h e value of [/'*' — 2/ , t h e interval size. W i t h o u t t h e validity of t h e proximity theorems, the proximity-scaling algorithm will still produce an optimal solution on any specified e-grid, with e = 1 for t h e integer case. Yet, in order to derive an e-accurate solution for t h e continuous case, one m u s t m a k e use of proximity T h e o r e m 2.5. Indeed, Minoux [Min86] was first to discover a polynomial algorithm for t h e integer convex flow problem. Minoux a d a p t e d t h e out-of -kilter m e t h o d with scaling, so t h a t at each iteration there is a factor reduction in t h e sum of kilter numbers. T h e reported running t i m e in [Min86] is 0 ( l o g 2 ||tt||oo • rati1). Recently, Ahuja Magnanti and Orlin [AM092], introduced another algorithm for t h e integer convex network flow problem using a capacity scaling algorithm. T h e y again use t h e solution at one scaling step as t h e initial solution in t h e next iteration for t h e next scaling step. T h e running time they report is O(log 2 ||6||oo • m(m + n l o g n ) ) , which is t h e same as t h e one reported here. Interestingly, this running t i m e is also t h e same as t h a t of the capacity scaling algorithm applied to t h e linear network flow problem. All these three algorithms depend in their complexity on log 2 B, ( recall t h a t B = mindluHoo, ||6||i}) which is essentially t h e length of t h e right h a n d sides. For linear network flow there are polynomial algorithms with running times independent of t h e right hand sides and independent of t h e objective function - strongly polynomial algorithms. Since as observed for the capacity scaling algorithm, one can achieve algorithms for t h e convex case with same running t i m e as linear case, it is conceivable t h a t t h e strongly polynomial algorithms could also be a d a p t e d to t h e convex case.

Polynomial

&: Strongly

Polynomial

Algorithms

for Convex Optimization

77

This is however impossible as proved in t h e next section. It follows t h a t algorithms with t h e running t i m e as t h e proximity-scaling algor i t h m are in a sense close to being optimal (with smallest complexity possible). This s t a t e m e n t holds in t h e sense t h a t in order to derive more efficient algorithms for t h e convex case, there must be more efficient algorithms for t h e linear case of t y p e t h a t depend on t h e right hand sides in their complexity. We believe t h a t any improvement of those should be extremely challenging.

3

The Impossibility of Strongly Polynomial Algorithms

T h e lower bounds presented in this section are from [Hoc90]. T h e problem for which t h e lower bound is given is t h e Allocation problem which is identical to a single source Transportation problem in maximization form, max{5Z" =1 fi(x{)\ X^"=11{ = B,x > 0 } . T h e generic presentation of this problem is as a maximization problem with a concave objective function. We first present a comparison model lower bound followed by an algebraic tree model lower bound.

3.1

A C o m p a r i s o n M o d e l Lower B o u n d

We provide a lower bound proof t h a t establishes t h a t no algorithm exists t h a t can solve SRA in less t h a n log 2 B comparisons. In this, we rely on a result of information theory according to which there is no algorithm t h a t finds a value in a monotonic decreasing n-array t h a t is the first to be smaller t h a n some specified constant in less t h a n log 2 n t i m e . We first show t h e lower bound result for t h e Allocation problem in two variables. Suppose one can solve t h e problem: max

/i(a;i) + cx 2 x\ + %i = B x1,x2

> 0, and integer.

Let the function f\(xi) be given as an array of B increments at t h e B integer points. Namely, if / i ( i ) = a; t h e array of increments is {ao> a i — a,0,a,2 — a i , • • • ,aB — O B _ I } . Since the function is concave, t h e entries in t h e array are monotonic nonincreasing, a, + i — a,- < a,- — a,_i. T h e optimal solution to this problem is X\ = j and x 2 = B — j , where j is t h e largest index such t h a t aj — Oj_i > c. Since t h e array of t h e differences between consecutive entries of t h e array is monotonic decreasing, determining in t h e array of differences t h e index j can be done using binary search in log 2 B comparisons. T h e information theoretic lower bound, is also log 2 B comparisons. This is because t h e comparison tree has B leaves so t h e p a t h of comparisons leading from t h e root to a leaf could be as long as log 2 B. (See [K73]).

78

D. S.

Hochbaum

Suppose t h e problem could be solved independently of B, t h e n given a monotonic decreasing array and a value c, it has a corresponding concave function, fx such t h a t t h e solution to t h e problem given above (name) is independent of B. Consequently, t h e required entry in t h e vector could be found independently of B, which is a contradiction. A similar proof works for t h e problem with a single variable if t h e constraint is an inequality constraint. T h e problem is, max

f(x) x < B x nonnegative integer .

Since / is concave it has an increment array corresponding to it as before. T h e array of differences is monotonic decreasing, and we shall look for t h e largest index such t h a t t h e entry with t h a t index is still nonnegative. In other words t h e role of c above is replaced here by 0. T h e conclusion follows precisely as above. T h e same arguments can be extended to prove t h a t in t h e comparison model t h e Allocation problem on n + 1 variables has complexity fi(nlog 2 —). Let t h e problem be defined for c > 0, n 3=1 n+1 3=1

xj > 0 and integer

j = I,...

,n + I.

Let t h e functions fj be concave and monotonic increasing in t h e interval [0, [—J], and zero in [|_~_]>-B]- Solving this problem is then equivalent to determining in n arrays of length [_—J each, t h e last entry of value > c. Since t h e arrays are independent, t h e information theory lower bound is Q(n log [_^J). Similarly, for t h e case of an inequality constraint t h e same lower bound applies for t h e problem on n variables, n

Y^fj(xi)

max 3=1

3=1

Xj > 0

integer

j = 1,..., n

since xn+\ can simply be viewed as t h e slack and c = 0. This comparison model lower bound holds also for t h e quadratic case. It is therefore impossible to solve t h e quadratic problems in strongly polynomial t i m e using only comparisons. Indeed, as stated in t h e introduction, t h e floor operation is essential for t h e quadratic integer problem.

Polynomial

3.2

&: Strongly

Polynomial

Algorithms

for Convex

Optimization

79

The Algebraic-Tree Model

One might criticize t h e choice of t h e comparison model for this problem as being too restrictive. Indeed, t h e use of arithmetic operations may help to reduce t h e problems complexity. This is t h e case for t h e quadratic simple Allocation problem, which is solvable in linear t i m e , 0 ( n ) , [B84]. T h e lower bound here demonstrates t h a t such success is not possible for other nonlinear functions. T h e c o m p u t a t i o n model t h a t is used hereafter allows t h e a r i t h m e t i c operations + , —, x , - j - as well as comparisons and branching based on any of these operations. It is d e m o n s t r a t e d t h a t t h e n a t u r e of t h e lower bound is unchanged even if t h e floor operation is p e r m i t t e d as well. We rely on Renegar's lower bound proof ([R87]) in this a r i t h m e t i c model of comp u t a t i o n for finding epsi/ora-accurate roots of polynomials of fixed degree > 2. In particular, t h e complexity of identifying an e-accurate single real root in an interval [0,.R] is O(loglog—) even if t h e polynomial is monotone in t h a t interval. Let Pi(x), ..., pn(x) be n polynomials each with a single root to t h e equation pi(x) = c in t h e interval [0, —], and each Pi(x) a monotone decreasing function in this interval. Since t h e choice of these polynomials is arbitrary, t h e lower bound on finding t h e n roots of these n polynomials is fi(rcloglog ~). Let fj(xj) = JQ3 pj(x)dx. T h e fjS are then polynomials of degree > 3. T h e problem, (P £ )

max

^2fj{xj-e) i

+

£J

B

j=i

t

c-xn+l-e

Xj > 0 Xj integer ,

has an optimal solution x such t h a t y — e • x is t h e (ne)-accurate vector of roots solving t h e system ' Pi(2/i) = c P2(«/ 2 ) = C

I Pn{Vn) = C. This follows directly from t h e Kuhn-Tucker conditions of optimality, and t h e fact t h a t an optimal integer solution to t h e scaled problem with a scaling constraint, s, x", and t h e optimal solution t o t h e continuous problem y* satisfy ||as* — J/*||oo < ns (Theorem 2.3). Hence, a lower bound for t h e complexity of solving (Pe) is Q(n log log -7-). For t = 1, we get t h e desired lower bound for t h e integer problem. In [MST88] there is a lower bound proof for finding epsi/on-accurate square roots t h a t allows also t h e floor, | J , operation. In our notation t h e resulting lower bound for our problem is fi(\/loglog ®), hence even with this additional operation t h e problem

80

D. S.

Hochbaum

cannot be solved in strongly polynomial t i m e . Again, t h e quadratic objective is an exception and t h e algorithms for solving t h e quadratic objective simple resource Allocation problems rely on solving for t h e continuous solution first, then rounding down, using t h e floor operation, and proceeding to c o m p u t e t h e resulting integer vector to feasibility and optimality using fewer t h a n n greedy steps. See for instance, [IK88] for such an algorithm. Since t h e lower bound result applies also in t h e presence of t h e floor operation, it follows t h a t t h e "ease" of solving t h e quadratic case is indeed due to t h e quadratic objective and not to this, perhaps powerful, operation.

4

Quadratic Network Flow Problems

As noted earlier, t h e quadratic problem takes a special place among nonlinear optimization problems over linear constraints. This is because t h e optimality conditions are linear (only t h e derivative of the objective function appears), and the solution to a system of linear inequalities is of polynomial length in t h e size of t h e coefficients. So for quadratic problems an optimal continuous solution is a polynomial function of t h e length of t h e input. In addition, the proof of impossibility for strongly polynomial algorithms using t h e algebraic tree computation model, is not applicable to t h e quadratic case (although for t h e comparison computation model t h e proof is valid and it is impossible to derive strongly polynomial algorithms using only comparisons). . P P All this raises a n u m b e r of issues: are there strongly polynomial algorithms for quadratic separable convex network flow? Are there strongly polynomial algorithms for t h e nonseparable convex continuous case (where polynomial algorithms exist)?, and what classes of nonseparable quadratic integer problems, that are generally N P hard, are solvable in polynomial time?. We survey here some results related to these questions.

4.1

Strongly Polynomial Algorithms for the Separable Case

Using a known transformation, any m i n i m u m cost network flow problem can be written as a Transportation problem. It is therefore sufficient in t h e search for efficient algorithms for t h e quadratic separable network flow problem to focus on t h e quadratic separable Transportation problem. . P P T h e Q u a d r a t i c Transportation problem ( Q T P ) is defined on a bipartite network, with k supply nodes and n dem a n d nodes. T h e cost of transporting flow from a supply node to a d e m a n d node is a convex quadratic function of t h e flow quantity. T h e formulation of t h e continuous problem is as follows:

k

min

n

i

J2 ]CK'T'J + oMij]

Polynomial

&z Strongly

S. I .

Polynomial

y

Algorithms

&ij — $\

Optimization

81

' — -I-5 • * • } ™

5 > i i = dj xn > 0

for Convex

j = l,...,n i = l,...,fc,

(QTP)

j = l,...,n

where 6,-j > 0, s,- > 0, and dj > 0 are rational numbers and £],- 5,- = 22j 4 r It is not known whether t h e quadratic Transportation problem is solvable in strongly polynomial t i m e . W h e n t h e n u m b e r of supply nodes k is fixed, Cosares and Hochbaum gave a strongly polynomial algorithm. T h e algorithms presented in [CH90] exploit t h e relationship between t h e Transportation problem and t h e Allocation problem. T h e continuous Allocation problem can be solved by identifying a Lagrangean multiplier associated with t h e single constraint. In t h e q u a d r a t i c case this can be done in linear time. T h e algorithm for t h e Q u a d r a t i c Transportation problem entails relaxing and aggregating source constraints, and then searching for o p t i m a l values for t h e Lagrangean multipliers. For t h e case of two supply nodes, k = 2, t h e algorithm is linear. For greater values of k, t h e algorithm has running t i m e of 0(nk+1). A recent result by Megiddo and Tamir, [MT91], which invokes an alternative searching m e t h o d , yields a linear running t i m e of t h e algorithm for fixed k (the constant coefficient is exponential in k). T h e Allocation problem may be viewed as a single source Transportation problem. We use this problem to illustrate t h e technique of converting a continuous solution t o an integer solution. Allocation problems are all characterized by solvability using t h e greedy algorithm t h a t amounts to adding one unit increment to a variable if its marginal contribution to t h e objective function is minimized. T h e greedy algorithm however is not a polynomial algorithm. For t h e simple Allocation problem, m i n { ^ " = 1 fi(xi)\ E)" =1 Xi = B,x > 0 } , the running t i m e of t h e greedy is 0(B). T h e most general case of Allocation problem involves separable convex optimization over polymatroidal constraints: Given a submodular rank function r : 2E —* R, for E = {1, ..., n), i.e. r() = 0 and for all A,B C E, r(A) + r(B) > r{A U B) + r(A D B). T h e polymatroid defined by t h e rank function r, is t h e polytope {x\^2jeAXj r(A), A C E}. We call t h e system of inequalities {Y.jzAxi < r(^)> A C E}, polymatroidal constraints. T h e general Allocation problem, GAP, is (GAP)

min

£

/,•(*,•)

jEE

Y.

XJ

< r(A)

ACE

jeA

Xj > lj and integer j € E.

< the

82

D. S.

Hochbaum

T h e network allocation problem is special case of (GAP) as proved in [FG86a] and [FG86b]. As such it is solvable in pseudo polynomial t i m e by t h e greedy algorithm. It is defined with respect t o any network (or graph), with a single source - root, and a set of sinks. Let G = (V, A) b e a directed network with node set V and arc set A. Let s G V be t h e source and T C V be t h e set of sinks. Let t h e total supply of t h e root, B > 0 be given. Let Cuv be t h e capacity limit on each arc (u,v). Let t h e vector of t h e flow b e = (<j>uv : (u,v) € A). min

£ cijXj + 0

£

4>vu-

(D,U)£A

ii)

£ £

su— £

4>uv = 0

v£V-T-{s}

^<" - &

(u,s)eA <j)uj -

(u,j)£A iv)

£ (u,v)eA

(s,
-bjxj

Yl

J

e

T

(j,u)£A

0 < <j>uv 5: kuv

0 < Xj < Uj

$3" = xi (u, v) G A

integers j € T.

T h e total sum of flow leaving t h e source B cannot exceed t h e m i n i m u m cut in t h e network. Also, so long as each variable is bounded in an interval where t h e derivative is negative and t h e sum of upper bounds is at least as large as B, t h e amount of flow in the network will be equal to B. So subject t o such preprocessing t h e problem can be stated either with equality or inequality constraint on t h e source. T h e quadratic network allocation problem is not t h e same problem as a quadratic cost flow problem. In t h e latter problem there is an underlying graph and a quadratic cost associated with the flow along each arc. In this problem there is also an underlying network, b u t costs (quadratic) are associated only with t h e flow arriving a t each sink. For this purpose we add a new d u m m y sink " i " , and send all flows from t h e set of nodes T t o t h a t node. T h e costs are then only associated with arcs adjacent t o node t. This graph is described in Figure 2, where only t h e wiggly arcs, t h a t connect t h e sink t o t h e "variables" nodes have costs associated with t h e m . T h e quadratic network allocation problem is solvable in strongly polynomial t i m e , 0 ( | A | | V | l o g ( | V | 7 | A | ) , as described in [HH92]. T h e description of t h a t algorithm is beyond t h e scope of this paper, b u t t h e algorithm for t h e simple allocation problem illustrates t h e i m p o r t a n t computational concerns t h a t are used in t h e network algor i t h m as well as other algorithms for special cases of t h e network Allocation problem and t h e Q u a d r a t i c Transportation Problem. Brucker described a linear t i m e algorithm for t h e quadratic continuous simple Allocation problem in [Bru84]. Our adaptation of t h e algorithm, and its application to t h e integer case is now described.

Polynomial

&: Strongly

Polynomial

Algorithms

for Convex Optimization

83

a, ^ +1/2 b " ^

Figure 2: T h e algorithm for t h e Quadratic Resource Allocation problem, ( Q R A ) , is based on a search for an optimal Lagrange multiplier. T h e continuous Q u a d r a t i c Resource Allocation problem (QRA) is formulated as follows: QRA

Xi+ 26-'x«'J 1=1

s.t.

£ ,nxi i=l

1,

Xi > 0

where d is positive and each 6,- is positive. T h e convexity of t h e objective function guarantees t h a t a solution satisfying t h e Kuhn-Tucker conditions is also optimal. In particular, we seek a non-negative solution x* and a value 8* such t h a t :

y^ nx* = d x* > 0 implies t h a t a,- + b{x' = 8* T h e situation is illustrated by t h e following figure: T h e value set for 8 determines associated values for x,-. associated solution x is: Xi = 0

for i such t h a t a,- > 8

For any value 8, t h e

D. S.

0

Hochbaum

x

5 a

k

+b

J

k k

a

k

Figure 3: 8 — a,i

X{ = — ; —

for i such t h a t

a,- < 6

0;

Finding t h e optimal solution to Q R A is equivalent to finding a value 8" such t h a t t h e associated solution satisfies d = ]C"=i xi >s equal to d. If d < d, t h e n we could conclude t h a t 8* is greater t h a n 8, because any smaller value would yield an even smaller value for d. Similarly, if d > d, then 8* is less t h a n 8. For any 8, t h e value of d is dependent on t h e coefficients in t h e set, {i | a,- > 6 }. Consequently, d(8), is a monotonic, piecewise linear function having breakpoints at t h e values a;, i = 1, ...,n. Its monotonicity allows for a binary search for t h e optimal value, 8* satisfying d(8) = d. Since t h e value of d is finite, it follows t h a t there is a finite optimal 8* for every instance of Q R A . T h e algorithm we propose for finding 8*, chooses "guesses", (from among t h e breakpoint values, a,), until it finds two consecutive breakpoints which contain 8* in t h e interval between t h e m . In this range, d = £ ; x,- is a linear function in 8. T h e problem is then solved by finding t h e particular value of 8 for which d = d, (i.e., by solving t h e linear equation in one variable). In t h e algorithm t h e parameters A and B maintain partial sums necessary to evaluate £ " = i x,-, without computing the sum at every iteration from scratch. Procedure quadratic-Allocation Step 0: Let L = {a l 5 ..., a „ } , / = {1, ..., n}

Polynomial

& Strongly

Polynomial

Algorithms

for Convex Optimization

85

Step 1: Set S to am, t h e median value from t h e set L. Let A = A - £.-£/+ ft,B = B - E,6/+ £, where 7 + = {« € 7|a,- > 5} Siep 2: Let d = B8 - A

Step 3: If

If 5iep ^ : If

lid = d then S T O P , 5 = S'. lid>d t h e n <5 > 8\ H d < d then <5 < 8*. 8 > 8* then: Set I = {i e 7|a,- <S}, L = {a,-|i e 7} Set A = A - £ % £ = £ - T ^ , 5 < <$}, 7, = {a,-|i € 7} |L| > 2, G O T O Step 1. Else 5* = ^

T h e algorithm o u t p u t s a value 6*. T h e optimal solution x* is t h e n readily available, and can be determined in linear time: ^"' 0 T h e o r e m 4 . 1 Algorithm

for i such t h a t a,- < <5* otherwise

quadratic Allocation finds S* and x* in 0(n)

time.

P r o o f : For any guess 8, t h e values of A and B are set to assure t h a t d = J^x,- is set to t h e appropriate value, (i.e. a:,- = 0 when a; > 5). T h e element a,- is removed from L if either it is known to be greater t h a n 6* or if it is less t h a n an established lower bound for 8*. W h e n L contains only one element, say a,- , then we can conclude t h a t 8* is between a; and aj, t h e next largest of the a's. Furthermore, since d is a linear function of 8 in this range, (i.e. d = B8 — A), 8* and x* are determined as in Step 4. T h e 0(n) complexity of t h e algorithm follows from the fact t h a t each of Steps 1, 2 and 3 can be performed in a n u m b e r of arithmetic operations t h a t is linear in t h e cardinality of t h e set L, including t h e selection of t h e median value ([BFPRT72]). Since t h e n u m b e r of elements in t h e set is initially n and is cut in half after each pass, t h e total work is linear in (n + n/2 + n / 4 + .... ) = 2n, so t h e complexity of t h e algorithm is 0(n). • W h e n t h e problem is to be solved in integers we apply t h e proximity T h e o r e m 2.2 (or alternatively a tighter proximity theorem developed for all Allocation problems in [Hoc90]). From t h e optimal continuous solution x* we create a lower bound vector to the optimal integer solution x* — ne (or using t h e tighter proximity theorem, x* — e is a tighter lower b o u n d ) . Since Ylx*j — B, there are only n2 more units to add (or rather only n more units to add), which can be determined in such n u m b e r of

D. S. Hochbaum

86

iterations of the greedy algorithm, each taking a constant time. The running time is therefore linear for the integer version of the problem. The question of the strong polynomiality of the general quadratic Allocation problem is not settled yet. Still, all known special cases of the network Allocation problem are all solvable very efficiently. These problems include the tree Allocation problem, the nested Allocation problem and the Generalized Upper Bounds Allocation problem (and of course the simple allocation problem). In [HH92] we describe algorithms for these problems with complexities, 0(n log n), O(nlogn) and 0(n), respectively. All these algorithm are based on efficient search for the optimal Lagrange multipliers and in that sense generalize the procedure quadratic-allocation. Finally, unrelated to the Allocation problem, Tamir ([Tam89]) devised a strongly polynomial algorithm for minimum convex quadratic separable cost flow when the underlying network is series-parallel, in the presence of a single source-sink pair. This algorithm is exploiting the series-parallel structure of the network in order to construct an optimal continuous solution, namely, the solution is constructed from series parallel operations applied to subproblems defined on subgraphs. An integer solution is determined using the same approach as described above.

4.2

The Complexity of Some Nonseparable Cases

We first illustrate that if there exists a "separating scheme" such that the separating matrix with T forms a totally unimodular matrix, then can still solve the integer nonseparable network flow problem in polynomial time ([HS90]). This relies on an algorithm that was developed based on proximity and scaling, which is polynomial when the largest subdeterminant of the constraint matrix has a polynomial size. The algorithm solves in polynomial time, in integers, separable convex problems over constraints for which the constraint matrix is totally unimodular. Such problems generalize the integer separable convex network flow problem. Consider the nonseparable problem, min s.t.

F(&) Tx = 6 0 <x
Suppose there exists an ntimesn matrix U such that F(Ux) is a separable function, then the newly stated problem is, min s.t.

F(y) TU~ly = b 0< U^y
Polynomial

& Strongly

Polynomial

Algorithms

for Convex Optimization

87

Now, if t h e m a t r i x U is totally unimodular, then t h e integrality requirement is "

preserved. If t h e new m a t r i x of constraints

T U

-i

-

JJ_l

is totally unimodular, t h e n t h e

noseparable flow problem is solvable in polynomial t i m e and in integers. T h e nonseparable quadratic convex continuous problem is polynomial as indicated in t h e introduction. W i t h t h e integrality requirement, t h e problem becomes N P - h a r d as illustrated by t h e following known reduction from t h e independent set problem. Consider t h e maximization of t h e weight of an independent set in a graph: Given a graph G = (V, E) with nonnegative weights Wv for each v € V, find a subset of vertices U C V such t h a t for any i, j € U, {i,j} £ E, and such t h a t t h e t o t a l weight W(U) = Z)vec/ Wv is m a x i m u m . T h e weighted independent set problem can be posed as t h e quadratic maximization problem: max

^2 Wvx„ — v£V

^2

W(V)

• xu • xv

{u,v}€E

xve {0,1}. Let x* be t h e optimal solution. T h e m a x i m u m weight independent set is then {v\x"v = 1}. Note t h a t t h e reduction also applies for t h e unweighted case. So even in t h e absence of t h e flow balance constraints, the integer problem is N P - h a r d . T h e objective function in this case is not necessarily concave. T h e question is t h e n asked whether t h e complexity of t h e problem is not a result of t h e indefiniteness of t h e quadratic m a t r i x . Baldick gives a negative answer to this question [Bal91], by constructing a reduction from t h e set splitting problem to 0-1 minimization of a convex quadratic function. T h e formulation of a quadratic nonseparable problem on box constraints (i.e. t h e set of flow balance constraints is e m p t y ) , is as follows: n

max

^

a{Xi +

^

U <Xi < m G { 0 , 1 } . X,-

integer .

T h e idea of making t h e problem separable so t h a t t h e [HS90] algorithm applies is translated here to finding a diagonalizing m a t r i x U, so t h a t for t h e m a t r i x Q = (g,j), U'^QU is a diagonal m a t r i x . Baldick and Wu [BW90] used this approach for a problem of electric distribution system where only box constraints are present. Baldick [Bal91] has further identified several classes of matrices Q where a "diagonalizing" scheme witha totally unimodular m a t r i x exists. T h e two classes are: (1) Diagonally dominant matrices, qa > Yli^j

\lij\-

(2) Matrices with forest structure, these are matrices with a partial order on t h e positive coefficients inducing a forest.

88

D. S.

Hochbaum

For b o t h these classes, with T empty, there are polynomial algorithms. Also if t h e [ TU'1 1 constraint m a t r i x , ,,_ a is totally unimodular t h e n still integer solutions can be obtained in polynomial time. Continuous solutions can be obtained in polynomial t i m e if t h e largest subdeterminant of t h e constraint m a t r i x is bounded by a polynomial. Barahona, [Bar86], discovered t h a t t h e case of quadratic nonseparble 0-1 optimization is polynomial if t h e quadratic m a t r i x Q has a series-parallel graph characteristic structure. T h a t is, there exists a series-parallel graph G = (V, E) with q^ / 0 if and only if (i,j) 6 E. This is again a case of network flow with an e m p t y set of flow balance constraints. A class of nonseparable problems over box constraints (upper and lower bound constraints) which is solvable in polynomial (and strongly polynomial) t i m e is presented in [Hoc89]. T h e class is identified by t h e property of t h e multivariate polynomial objective function called the bipartition property. This property can be easily described on a graph for quadratic objective functions: G = (V, E) with V = V\ U Vi so t h a t qij > 0 only if both i, j € Vi or i,j € V2. This property was discovered independently, for quadratic objective functions, by Hansen and Simeone, [HS86]. In [HS86], an objective function with this property is called a unate function. In [HSS92] a "high multiplicity" m i n i m u m weighted tardiness scheduling problem was discussed. This problem was formulated as a quadratic Transportation problem with a nonseparable objective function. This problem is unique among t h e problems discussed in this section in t h a t the set of flow balance constraints is not empty. In t h a t problem, t h e right hand sides (supplies and d e m a n d s ) , and t h e linear coefficients in the objective function are large, so t h e aim was to find an algorithm for t h e integer case t h e running t i m e of which is independent of these numbers. Such algorithm was found by solving a related continuous problem (not t h e relaxation), t h e solution of which could be rounded using a simple procedure, to derive an optimal integer solution. All problems presented in this section are special classes. T h e r e is still a need to discover t h e extent of t h e polynomial solvability of nonseparable network flow problems, although this cannot be expected to be so unified as in t h e separable case.

References [ABG84] R. K. Ahuja, J. L. B a t r a and S. K. G u p t a , A P a r a m e t r i c Algorithm for t h e Convex Cost Network Flow and Related Problems, Europ. J. Oper. Res., 16, (1984), 222-235. [AM092] R. K. Ahuja. T. L. Magnanti and J. B. Orlin, Network Flows: Theory, Algorithms and Applications, manuscript, to be published by Prentice Hall in 1993.

Polynomial

& Strongly

Polynomial

Algorithms

for Convex Optimization

89

[Bal91] R. Baldick, A Unification of Polynomially Solvable Cases Of Integer "Nonseparable" Q u a d r a t i c Optimization, Lawrence Berkeley Laboratory manuscript, November (1991). [BW90] R. Baldick and F . F . Wu, Efficient Integer Optimization Algorithms for Optimal Coordination of Capacitators and Regulators, IEEE Transactions on Power Systems, 5(3), 805-812, (1990). [Bar86] F . Barahona, A Solvable Case of Q u a d r a t i c 0-1 P r o g r a m m i n g , Discrete plied Mathematics, 13 (1) (1986), 23-28. [BS85] M. S. B a z a r a a and C. M. Shetty, Nonlinear rithms, (John Wiley & Sons, 1979).

Programming:

Ap-

Theory and Algo-

[BG61] R. G. Busacker and P. J. Gowen, A Procedure for Determining Minimal-Cost Network Flow P a t t e r n s , Operational Research Office, John Hopkins University, Baltimore, M D , (1961). [BFPRT72] M. Blum, R . W . Floyd, V.R. P r a t t , R.L. Rivest and R.E. Tarjan, T i m e Bounds for Selection, J. Computer Systems Science 7 (1972) 448-461 [Bru84] P. Brucker, An 0(n) Algorithm for Quadratic Knapsack Problems, Operations Research Letters, 3 (1984), 163-166. [CH90] S. Cosares and D. S. Hochbaum A Strongly Polynomial Algorithm for t h e Quadratic Transportation Problem with Fixed N u m b e r of Suppliers, manuscript, U . C . Berkeley, August 1990, t o appear Mathematics of Operations Research. [D63] G.B. Dantzig, Linear Press, 1963).

Programming

and Extensions,

(Princeton University

[Den59] J. B . Dennis, Mathematical Programming and Electrical Networks, (Technology Press Research Monographs, Technology Press and Wiley, 1959), pp.74-75. [D67] I.I. Dikin, Iterative Solution of Problems of Linear and Q u a d r a t i c Programming, Soviet Mathematics Doklady, 8, (1967), 674-675. [EK72] J. E d m o n d s and R. M. K a r p , Theoretical Improvements in Algorithmic Efficiency for Network Flow Problems, Journal of ACM , 19, (1972), 248-264. [EMV87] R.E. Erickson, C.L. M o n m a and A . F . Veinott, Send-and-Split Method for Minimum-Concave-Cost Network Flows, Math, of Oper. Res., 12 (1987), 634-664. [FG86a] A. Federgruen and H. Groenevelt, T h e Greedy Procedure for Resource Allocation Problems: Necessary and Sufficient Conditions for Optimality, Operations Research, 34, (1986), 909-918.

90

D. S.

Hochbaum

[FG86b] A. Federgruen and H. Groenevelt, Optimal Flows in Networks with Multiple Sources and Sinks, with Applications to Oil and Gas Lease Investment Programs, Operations Research, 34, (1986), 218-225. [F85] R. Fourer, Simplex Algorithm for Piecewise-Linear P r o g r a m m i n g I: Derivation and Proof, Mathematical Programming, 33 (1985) 204-233. [GSK90] F . Granot and J. Skorin-Kapov, Some Proximity and Sensitivity Results in Q u a d r a t i c Integer programming, Mathematical programming, 47, No. 2, (1990), 259-268. [GSK90a] F . Granot and J. Skorin-Kapov, Strongly Polynomial Solvability of a Nonseparable Quadratic Integer Program with Applications to Toxic Waste Disposal, manuscript, December (1990). [GP90] G. Guisewite and P.M. Pardalos, M i n i m u m Concave Cost Network Flow Problems: Applications, Complexity, and Algorithms, Annals of Operations Research, 25, (1990), 75-100. [GP91a] G. Guisewite and P.M. Pardalos, Algorithms for t h e Single-Source Uncapacitated M i n i m u m Concave-Cost Network Flow Problem, Journal of Global Optimization, 1, No. 3 (1991), 245-265. [GP91b] G. Guisewite and P.M. Pardalos, Global Search Algorithms for M i n i m u m Concave Cost Network Flow Problems, Journal of Global Optimization, 1, No. 4 (1991), 309-330. [HS86] P. Hansen and B . Simeone, Unimodular Functions, Discrete matics 14 (1986), 269-281.

Applied

Mathe-

[Hoc89] D. S. Hochbaum, On a Polynomial Class of Nonlinear Optimization Problems, manuscript, U.C. Berkeley, (1989). [Hoc90] D. S. Hochbaum, On the Impossibility of Strongly Polynomial Algorithms for t h e Allocation Problem and Its Extensions, Proceedings of the 1st Integer Programming and Combinatorial Optimization Conference, May (1990), 261-274. Also see, D. S. Hochbaum, Lower and Upper Bounds for t h e Allocation Problem and Other Nonlinear Optimization Problems, to appear, Mathematics of Operations Research. [HH92] D. S. Hochbaum and S. P. Hong, About Strongly Polynomial Algorithms for Q u a d r a t i c Network Allocation Problems, manuscript, U.C. Berkeley, (1992). [HS91] D. S. H o c h b a u m and S. Seshadri, T h e Empirical Performance of a Polynomial Algorithm for Constrained Nonlinear Optimization, (1991), to appear, Annals of Operations Research.

Polynomial

& Strongly

Polynomial

Algorithms

tor Convex Optimization

91

[HSS92] . S. Hochbaum, R. Shamir and J. G. S h a n t h i k u m a r , A Polynomial Algorithm for an Integer Q u a d r a t i c Nonseparable Transportation Problem, Mathematical Programming, Vol. 55, No. 3, p p . 359-372, (1992). [HS90] D.S. Hochbaum and J.G. Shanthikumar, Convex Separable Optimization is not Much Harder T h a n Linear Optimization, Journal of ACM Vol. 37, No 4, (1990), 843-862. [IK88] T. Ibaraki and N. Katoh, Resource proaches, ( M I T Press, 1988).

Allocation

Problems:

Algorithmic

[Ir60] M. Iri, A New Method of Solving Transportation Network Problems, of the Operations Research Society of Japan, 3, 27-87, (1960).

Ap-

Journal

[Je58] W . S. Jewell, O p t i m a l Flow Through Networks, Technical report No. 8, Operations research Center, M I T Cambridge, MA (1958). [KV86] S. Kapoor and P. M. Vaidya, Fast Algorithms for Convex Q u a d r a t i c Prog r a m m i n g and Multicommodity Flows, Proc. 18th STOC, (1986), 147-159. [KTK79] M.K. Kozlov, S. P. Tarasov and L.G. Khachian, Polynomial Solvability of Convex Q u a d r a t i c P r o g r a m m i n g , Doklady Akad. Nauk SSR 5 (1979), 1051-1053 [Translated in Soviet Mathematics Doklady 20 (1979), 1108-1111]. [MST88] Y. Mansour, B. Schieber and P. Tiwari, Lower Bounds for C o m p u t a t i o n s with the Floor Operations, manuscript, December (1988). [MT91] N. Megiddo and A Tamir, Linear T i m e Algorithms for Some Separable Q u a d r a t i c P r o g r a m m i n g Problems, manuscript, Nov (1991). [Min84] M. Minoux, A Polynomial Algorithm for m i n i m u m Q u a d r a t i c Cost Flow Problems, European Journal of Operational Research, 18 (1984), 377-387. [Min86] M. Minoux, Solving Integer M i n i m u m Cost Flows with Separable Convex Cost Objective Polynomially, Mathematical Programming Study, 26 (1986), 237239. [Min86a] M. Minoux, Mathematical 1986), Chapters 5,6.

Programming,

Theory

and Algorithms,

(Wiley,

[MA89] R. D.C. Monteiro and I. Adler, Interior P a t h Following Primal-Dual Algor i t h m s . P a r t II: Convex Q u a d r a t i c P r o g r a m m i n g , Mathematical Programming 44 (1989) 43-66. [MS87] K. G. M u r t y and S. N. Kabadi, Some N P - Complete Problems in Q u a d r a t i c and Nonlinear P r o g r a m m i n g , Mathematical programming, 39, 117-129, (1987).

92 [NY83] A. S. Nemirovsky and D. B . Yudin, Problem ciency in Optimization, (Wiley, 1983).

D. S. Complexity

Hochbaum

and Method

Effi-

[PS90] Y. P i n t o and R. Shamir, Efficient Algorithms for M i n i m u m Cost Flow Problems with Convex Costs, Technical Report, D e p a r t m e n t of C o m p u t e r Science, Tel Aviv University, Tel Aviv, Israel, (1990). [R87] J. Renegar, On the Worst Case Arithmetic Complexity of Approximation Zeroes of Polynomials, J. of Complexity, 3, (1987) 9-113. [Sah74] S. Sahni, Computationally related Problems, SIAM 279 (1974).

J. on Computing,

3, 262-

[Tam89] A. Tamir, A Strongly Polynomial Algorithm for M i n i m u m Convex Separable Q u a d r a t i c Cost Flow Problems on Series-Parallel Networks, (1989), to appear Mathematical Programming. [Tard85] E. Tardos, A Strongly Polynomial M i n i m u m Cost Circulation Algorithm, Combinatorica 5 (1985) 247-255 [Tard86] E. Tardos, A Strongly Polynomial Algorithm to Solve Combinatorial Linear P r o g r a m s , Operations Research 34 (1986) 250-256 [W92] M. W e r m a n , T h e relationship between Integer and Real Solutions of Constrained Convex P r o g r a m m i n g , Mathematical Programming, 5 1 , (1991), 133-135.

93 Network Optimization Problems, pp. 93-110 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Hamiltonian Circuits for 2-Regular Interconnection Networks F. K. Hwang AT&T Bell Laboratories,

Murray Hill, NJ 07974

Wen-Ching Winnie Li* Department of Mathematics, 16802

Pennsylvania

State

University,

University

Park,

PA

Abstract

A 2-regular directed graph, in which each node has two inlinks and two outlinks, is a popular topology for interconnection networks due to its simple and homogeneous structure, and also due to the reduced transmission delay and the increased reliability as compared to a ring network. When the network is used as a processor network, it has been argued that a hamiltonian circuit embedded in such a network provides a maximum linear array of processors even under a fault, and also reduces the number of spare processors required. The problem of the existence of a hamiltonian circuit has been solved for the double loop network, the generalized deBruijn network, the Imase-Itoh network and the torus network. In this paper we study the same problem for extended double loop network, semi-torus network and semi-Manhattan street network. Our method enables us to unify the aforementioned known results and provides generalizations at the same time.

'This research, supported in part by the grant from NSA no. MDA904-90-H-1021, was done when the author was visiting AT&T Bell Laboratories, Murray Hill, NJ.

94

1

F. K. Hwang & W.-C.

W. Li

Introduction

Two i m p o r t a n t features in designing an interconnection network are a small transmission delay and some kind of fault tolerance. Represent t h e network by a graph with stations as nodes and wires between stations as links. T h e n delay is usually measured by t h e diameter (the distance of a pair of nodes farthest apart) and t h e fault tolerance by connectivity (the size of t h e smallest cutset). If t h e network is used as a processor network, then embedding a hamiltonian circuit is desirable [12] since it provides a linear array of n processors even with a link failed (or n — 1 processors with a processor failed). It has also been argued [9] t h a t t h e existence of a hamiltonian circuit reduces t h e n u m b e r of spare processors in certain types of 2-regular interconnection networks. In practice, simplicity and homogeneity are also i m p o r t a n t considerations. These are often built into t h e design by posing as constraints. T h u s regular networks (networks having t h e same n u m b e r £ of outlinks and inlinks at each node) are popular since t h e nodes are homogeneous. One would also like to keep £ small to reduce t h e hardware of t h e network as well as to keep t h e control operation within a node simple. T h e simplest regular network with n nodes is £ = 1, which, if strongly connected (i.e., there is a p a t h from every node to every other node), is a hamiltonian circuit. However, t h e 1-regular network is known to have large diameter and 1-connectivity. T h e next level of simplicity is when £ = 2. Not only have we shortened the diameter and increased t h e connectivity, we also now have an abundance of networks to choose from. Let [i,j] denote a link from node i to node j . We consider two general classes. (i) T h e extended double loop class. A network in this class has n nodes 0 , 1 , . . . ,n — 1 and In links [i,aii + &i], [i,a2i + b2] for i = 0 , 1 , . . . , n — 1, where a j , a2, &i, b2 are fixed integers and addition is modulo n. W i t h a\ = a 2 = 2, b\ = 0, b2 = 1, we have t h e generalized deBruijn network [6, 14] (n = 2 P being t h e graph corresponding to t h e deBruijn sequence). W i t h a± — a2 = —2, &i = — 1 , b2 = —2, we have the Imase-Itoh network [7]. W i t h Oj = a 2 = 1 we have t h e usual double loop network. If, furthermore, &i = 1, then t h e network is known as F L B H (forward loop backward

hop) [15]. (ii) T h e mesh class. A network in this class is defined on a grid lying on t h e surface of a torus. T h e nodes are t h e grid points and the links are the grid segments. All links on t h e same grid line have the same direction. W h e n the horizontal and vertical grid lines are all unidirectional, t h e network is known as a torus network [3, 10]. (A variation is known as a rectangular array [8].) W h e n t h e horizontal and vertical grid lines alternate in direction, except if t h e number of horizontal or vertical lines is odd, then one line does not alternate, t h e network is known as a Manhattan street network [11]. We also relax t h e requirements on line directions of torus networks and M a n h a t t a n street networks to only either t h e horizontal or t h e vertical lines, and call t h e m semi-torus networks and semi-Manhattan street networks, respectively.

Hamiltonian

Circuits for 2-Regular

Interconnection

Networks

95

Necessary and sufficient conditions for the existence of hamiltonian circuits have been determined for t h e usual double loop network and t h e torus network, which are directed Cayley graphs on abelian groups with two generators [13], and also for t h e generalized deBruijn network and the Imase-Itoh network [1, 2]. In this paper we investigate t h e hamiltonian property for t h e extended double loop network (Theorem 3.11), t h e semi-torus network (Theorem 4.4), and t h e s e m i - M a n h a t t a n street network (Theorem 5.1). Our m e t h o d is rather general, which enables us to unify t h e aforementioned known results and meanwhile provides generalizations to non-Cayley digraphs as well.

2

A General Approach

Consider a 2-regular network G with two types of links. For an extended double loop network, t h e two types are defined by t h e two equations ffi(i) = a^i + 6 t and <jr2(i) = o-ii + &2- For a mesh network t h e two types are the horizontal and t h e vertical links. Let L\ and i 2 denote t h e sets of type 1 and type 2 links of G, respectively. A subgraph F of G is called a factor if every node of G has exactly one inlink and one outlink in F. It is easily verified t h a t L\ and I/ 2 are factors for the extended double loop network and for t h e mesh network. Let F be a factor of G. Suppose t h a t F contains a type 1 link [ii, Ji]- T h e n F cannot contain t h e t y p e 2 link [j2,ii] since otherwise the node j \ has two inlinks in F. Consequently, F must contain t h e t y p e 1 link [12, J2] o r the node i 2 would have no outlink in F. A recursive argument shows t h a t if F contains a link [i,j], then F m u s t contain a set of links SQi,.?]) of links of t h e same type as [i,j]. Suppose t h a t t h e two types of links can be described by two invertible functions gi and 52 o n t h e nodes such t h a t t h e two outlinks of t h e node i are [i,<72(0]For j = 1,2, represent t h e set Sj(i) = S([i,gj(i)]) by t h e starting nodes of the links. P u t g = g^1 051 and g'1 = g^1 o g2. T h e n 5 i ( i ) = {i,g(i),g2{i), • • • ,gm~1(i)} and 5 2 (i) = {i, g _ 1 ( i ) , g~2(i),. . ., ( / _ ( m _ 1 ) ( i ) } , where m is the smallest integer k such t h a t gk(i) = i. Observe t h a t S\{i) = S
96

F. K. Hwang & W.-C.

W. Li

T h e o r e m 2.1 Suppose that G is m-uniform and every path in G of length n/m is basic. Then G is hamiltonian if and only if there is a path of length n/m and period m. P r o o f . T h e assumptions on G imply t h a t when we move along a p a t h or cycle contained in a factor of G, only every n/m nodes belong to t h e same class, hence a hamiltonian circuit exists if and only if there is a p a t h of length n/m and period m. T h e above approach is essentially t h e "arc-forcing subgroup" technique introduced by Rankin [13] in dealing with a group theoretic problem which can be used to determine if certain Cayley graphs are hamiltonian. There are m a n y papers in t h e literature dealing with t h e hamiltonian property of a Cayley graph, see [19] and t h e papers referred therein. Our formulation affords a broader application, including non-Cayley graphs as well. We end this section by noting the following L e m m a 2.2 A 2-regular digraph is connected

if and only if it is strongly

connected.

P r o o f . Clearly, strong connectedness implies connectedness. Conversely, let X be a connected 2-regular digraph. We show t h a t it is strongly connected. Let v be a node of X. Denote by V t h e set of nodes reachable from v. Since all t h e outlinks of nodes in V must end at nodes of V, the 2\V\ outlinks of V must be t h e same 2|V| inlinks of V. This shows t h a t the subgraph induced by V is a connected component of X. As X is connected, V contains all nodes of X, t h a t is, there is a p a t h from v to any other node of V. T h u s X is strongly connected. In view of this lemma, we shall only use the t e r m "connected" from now on.

3

The Extended Double Loop Network

Given a positive integer n and integers a, b, e, / , denote by G(n; a, e; b, / ) t h e directed graph with nodes being t h e elements of Z / x Z and t h e outlinks from a node i being [i, ai + e] and [i, bi -f / ] . L e m m a 3.1 G(n;a,e;b,f) is 2-regular if and only if either gcd(a,n) or gcd(a, n) = gcd(b, n) = 2 and e — / is odd.

= gcd(b,n)

= 1

P r o o f . By definition, there are two outlinks from each node. There are two inlinks at a node i if and only if t h e two equations ax + e = i ( m o d n)

and

bx -f / = i ( m o d n)

have two solutions modulo n altogether. If a is prime to n, then ax -f e = i ( m o d n) has a unique solution for all i, and hence the same must be true for bx + / = i (mod n ) , this implies t h a t 6 is also prime to n. We have shown t h a t gcd(n,a) = 1 if and only if gcd(n,b) = 1. Next consider the case gcd{n,a) = d > 1. An equation

Hamiltonian

Circuits for 2-Regular

Interconnection

97

Networks

ax + e = i (mod n) has zero or d solutions modulo N. As it is solvable when i = e, we conclude t h a t d = 2. As gcd(n, b) is also > 1 in this case, we have gcd(a, n) = gcd(b, n) = 2. Further, for each i, exactly one of ax + e = i (mod n) and bx + f = i ( m o d n) is solvable if and only if e and / have opposite parity. This proves t h e l e m m a . Call G(n; a, e; 6, / ) an extended double loop network when it is 2-regular. The special case a = b — 1 gives t h e usual double loop network Gn(e,f). In this section we study t h e hamiltonian property of extended double loop networks. First of all, observe t h e following. T h e translation by an integer k defines a bijection on Z / K Z , t h a t is, (f>(i) = i + k for all i. As tf>(ai + e) = ai + e + k = a(i + k) +e + k{\ — a) and cj>(bi + f) = b(i + k) + / + k(l — b), yields an isomorphism from G(n; a, e; 6, / ) to G(n; a, e'; b, / ' ) , where e' = e + k(l — a) and / ' = / + fc(l — 6). Let d = gcd(e — f,a — b,n). T h e n m other words, gcd(e! — f',n) = d. This shows t h a t among t h e extended double loop networks of t h e form G(n;a,e';b,f) which are isomorphic to G(n;a,e;b,f), we can choose suitable e', / ' such t h a t gcd(e' — f',n) divides a — b. Therefore we shall always assume that gcd(e — f, n) divides a — b for an extended double loop network G(n; a, e; b, / ) . Note t h a t a hamiltonian digraph is necessarily connected. So we begin by investigating when an extended double loop network G(n; a, e; 6, / ) is connected. L e m m a 3 . 2 Suppose gcd(n,a) = gcd(n,b) work G(n;a,e;b,f) is the line graph of

= 2. Then the extended G(n/2;a,e;b,/).

double loop net-

P r o o f . Denote t h e nodes of G ( n / 2 ) = G ( n / 2 ; o , e; 6 , / ) by 7, where i € Z/(*/*)% and call t h e links p , ai + e] of type 1 and p , bi + f] of type 2. We shall label t h e n links of G(n/2) by elements in Z / K Z . More precisely, t h e type 1 outlink from i is labelled ai + e and t h e t y p e 2 outlink labelled bi + / . Since ai + e = ai' + e ( m o d n) (resp. bi + f = bi' + f (mod n)) implies i = i' (mod ra/2) and since e — / is odd and a,b,n are even, we have ai -f e ^ 6j + / (mod n) for all i, j , therefore no two links are labelled t h e same. Further, t h e labels of t h e two inlinks at t h e node i are congruent to i ( m o d ra/2) from construction, so they are i and i + n/2. T h e nodes of t h e line graph of G(n/2) are t h e links of G ( n / 2 ) , which we identify with t h e elements of 7LJKTL via labelling, and t h e out-neighbors of a node i of t h e line graph are t h e outlinks at t h e node i, namely, ai + e and bi + f. This shows t h a t t h e line graph of G(n/2; a, e; b, / ) is precisely G(n; a, e; 6, / ) . T h e o r e m 3 . 3 Let G(n; a, e; b, / ) be an extended double loop network with n Let n' be the largest odd factor of n. Suppose that gcd(a,n) = gcd(b,n) — 2. the following three statements are equivalent. (1) G(n; a, e; b, f) is hamiltonian, (2) G(n; a, e; 6, / ) is connected. (3) G(n'; a, e; 6, / ) is connected.

even. Then

98

F. K. Hwang & W.-C.

W. Li

P r o o f . (3) = > (1). Since G(n';a,e;b, f) is 2-regular and connected, it is Eulerian. Its line graph, which is G(2n'; a, e; 6, / ) by L e m m a 3.2, is hamiltonian and hence connected. If n > 2n', then an iterative argument concludes t h a t G(n;a,e;b, f) is hamiltonian. (1) = > (2) is obvious. (2) = > (3) If G(n'; a, e; b, f) is not connected, then its line graph G(2n'; a, e; b, / ) is also not connected. Again, an iterative argument shows t h a t G(n; a, e; 6, / ) is not connected. T h e reader is referred to [16] for a similar, b u t more general situation. Given an extended double loop network G(n; a, e; 6, / ) , denote by ,

J n if gcd(n,a) = gcd(n,b) = 1 , 1 t h e largest odd factor of n if gcd(n, a) = gcd(n, b) = 2 and e — / is odd .

Note t h a t gcd(n', a) = gcd(n', b) = 1 and gcd(e — f,n)

= gcd(e — / , n').

T h e o r e m 3 . 4 Let G(n; a, e; b, f) be an extended double loop network with d = gcd(e — f,n). Assume that G(n';a,e;b,f) is ^-uniform. Then G(n\a,e;b, f) is connected if and only if G(d; a, e) is hamiltonian. Here G(m; k, h) denotes the directed graph with nodes being the elements of Z / > Z and t h e outlink at a node i being [i, ki + h]. P r o o f . By Theorem 3.3, G(n; a, e; 6, / ) is connected if and only if G(n'; a, e; b, f) is. Hence it suffices to prove the theorem for the case a, b b o t h prime to n. Write b' for t h e integer between 1 and n such t h a t bb' = 1 ( m o d n ) , and let c = ab'. Suppose nodes x and y have outlinks to the same node, say, ax + e = by + f (mod n ) , then x and y are in the same connected component and they are related by y = b'ax+ b'(e — / ) = ex + b'(e — f) (mod n ) . Therefore t h e nodes in S(x) = {x, ex + (e - /)&', c2x + (e - f)b'{c + 1 ) , . . . , cmx + (e - f)b'(cm-1 + • • • + c + 1),...} are in t h e same connected component. Since c = 1 (mod d) and e = / (mod d), all nodes in S(x) are congruent to x modulo d. This together with the uniformity assumption implies t h a t S(x) consists of the n/d nodes of the network which have t h e same residue as x when divided by d. Further, G(n; a, e; b, f) is connected if and only if the new graph obtained from G(n;a,e\b,f) modulo d, t h a t is, G(d;a,e), is connected (since b = a (mod d) and / = e (mod d)), or equivalently, G(d;a,e) is hamiltonian. This proves the theorem. T h e following algebraic criterion for t h e uniformity of an extended double loop network is proved in [5], T h e o r e m 3.5 Let G(n; a, e; 6, / ) be an extended double loop network with gcd(a, n) = gcd(b,n) = 1 and d = gcd(e — f,n). Then G(n;a,e;b, / ) is n/d-uniform if and only if every prime factor of n/d divides (a — b)/d and 4 divides a — b whenever it divides

n/d.

Hamiltonian

Circuits for 2-Regular

Interconnection

Networks

99

Observe t h a t t h e algebraic condition is trivially satisfied when a = b. In particular, all t h e usual double loop networks Gn(hi,fi2) = G(n; 1, h\\ 1, A2) are n/d-uniform, where d — gcd(hi — h2,n). A criterion for G(m;k,h) to be hamiltonian is given in [4], which we now recall. Let 6 = gcd(k — 1, ra). T h e n G(m; k, h) is hamiltonian if and only if (1) h is prime to m , (2) 6 and m have t h e same prime factors, (3) 4 divides 6 whenever it divides m. Hence a necessary condition for G{n; a, e; 6, / ) to be connected is gcd(e, d) = gcd(e,gcd(e — f,n)) = gcd(e,f,n) — 1. W h e n a = 6 = 1 , this condition is also sufficient as a — 1 = 0 satisfies t h e other requirements. We have shown C o r o l l a r y 3 . 6 (van D o o m [18]). Gn{hi,h,2) 1.

is connected if and only if gcd(hi,h2,n)

=

Other immediate consequences of Theorem 3.4 are C o r o l l a r y 3.7 An extended double loop network G(n; a, e; 6, / ) is connected e — f is prime to n and G(n'; a, e; b, f) is n1 -uniform.

whenever

C o r o l l a r y 3 . 8 Suppose n is square free. An extended double loop network G(n; a, e; a, f) is connected if and only if a = 1 (mod d) and gcd(e,f,n) = 1, where d = gcd(e —

Let G = G(n;a,e;b,f) be an extended double loop network with gcd(n,a) gcd(n, b) = 1. T h e n G is closely related to a Cayley graph as follows. Denote by

a e\

=

fb f

and by T t h e subgroup of GL2( ^ ( i ) • T h e group T acts on t h e nodes of G via m a t r i x multiplication, t h a t is, a m a t r i x M in T sends t h e node (M to t h e node MyA. It acts transitively since G is strongly connected. T h e stabilizer of t h e node [ ) is the subgroup H consisting of t h e diagonal matrices f J in T. Hence t h e vertices of G may be identified with t h e right coset space T'/H. As such, G can be viewed as a quotient graph of t h e Cayley graph Cay(T\ A,B) on T with generators A, B. If H is a normal subgroup of T, then G is also a Cayley graph. Denote by C t h e cyclic subgroup of T generated by B~lA. It is t h e "arc-forcing subgroup" of t h e Cayley digraph Cay(T; A, B) arising from considering outlinks as in §2. More precisely, for any node x g T, t h e set S(x) denned in §2 is t h e left coset Cx. Therefore t h e set S(i) for a node i in G is a double coset CxH viewed as a union of right H cosets. Let D be the cyclic subgroup of T generated by AB~*. It is. t h e "arc-forcing subgroup" of Cay(T;A,B) arising from considering inlinks instead of outlinks.

F. K. Hwang & W.-C. W. Li

100

Lemma 3.9 C is a normal subgroup of T if and only if C — D, i.e., {B-'A) = {A-'B) = {AB'1) _1

= (BA-1) .

Proof. Since AB = B(B~ A)B~ is conjugate to B~XA, if C is normal, then 1 C = BCB' = D. Conversely, suppose C = D. We want to show that C is a normal subgroup of T. It amounts to showing that B'CB~' = C and A'CA~' — C for all integers i > 0. Due to the symmetry in A and B, it suffices to prove the first statement. ^From B^^AB^ = Bi-1(AB-1)B^i-11 it follows that B'Cfl-' = Bi-iCB-(i-i)

=

...

=

l

1

c

If C is a normal subgroup of T, then C \ T/H = T/CH and G is J-rW-uniform. Conversely, we shall see that the uniformity condition is weaker than the normality of C. Indeed, by the Lemma above, C is normal if and only if

-Mo T)(; 0-C ' V " - - ( ; \)(" T)-(; -,"

is a power of

A direct computation shows that B XA= (AB ')* if and only if c' _1 = 1 (mod n) , (c'" 1 + • • • + c + l)(e - cf) = b'(e - f)

(mod n) .

Suppose G is n/d-uniform. It follows from the algebraic criterion above that gcd(e — cf,n) = gcd((e — / ) + (1 — c)f,n) = d and the system (*) is equivalent to 1 / ••- 2o d

vC

+

(- c + 1) = 0 ' ~

^e-c/

t •n mod V dj , e - /

e - c /

By Theorem'3.5, the n/d numbers 1, c+1, c2 + c + l , . . . , cn/d—l-\ hc+1 represent all the residues modulo n/d. Thus equation (b) has a unique solution mod n/d. Let i be a solution to (b). Then i satisfies (a) if and only if W y i z i . i Z f / U o

fmodj

We have shown Theorem 3.10 Suppose that G(n; a, e; 6, / ) is n/d-uniform, where gcd(a, n) = gcd(b, n) 1 and gcd(e — f,n) = d. Then the cyclic group C = (B_1A) is a normal subgroup of T = (A, B) if and only if

d where cb = a(mod n).

H^-^HHi)- '«»

flamiitonian

Circuits for 2-Regular

Interconnection

Networks

101

In particular, when a = b, G(n; a, e; o, / ) is n/d-uniform and t h e group C is always a normal subgroup of T. We exhibit one example below to show t h a t uniformity does not imply normality. E x a m p l e . Let p b e an odd prime and n = p 2 . Choose e , / such t h a t e — / = 1. T h e n d = 1. Choose c = p + 1. Equation (c) in this case reads p ( l -b

+ pfb) = 0 ( m o d p 2 ) ,

which is equivalent to 1 = b ( m o d p) . If we choose b ^ 0,1 (mod p) and a = he = 6(1 + p) ( m o d p 2 ) , then G(p2; a, e; 6, / ) is p 2 -uniform, while C = (B~1A) is not a normal subgroup of T. Suppose C is a normal subgroup of T. T h e n t h e Cayley graph Cay(T;A,B) is |C|-uniform and every p a t h in t h e graph with length | T | / | C | is basic. T h u s , by T h e o r e m 2.1, t h e Cayley graph is hamiltonian if and only if there is a word W in A, B of length | T | / | C | which generates the group C. This is Rankin's criterion. T h e corresponding extended double loop network G{n; a, e; b, f) is hamiltonian if and only if G(d;a,e) is hamiltonian and there is a word W in A, B of length | T | / | C i J | such t h a t Wm lies in H for m = \CH\/\H\, but not any 1 < m < \CH\/\H\. In particular, if H is a subgroup of C , then t h e hamiltonian property of the extended double loop network follows from t h a t of t h e Cayley graph. We now t u r n to the general situation. Let G = G(n; a, e; b, f) be an extended double loop network with d = gcd(e — / , n) such t h a t G(n'; a, e; i, / ) is n'/d-uniform. Suppose t h a t G is connected, we want to investigate when it is hamiltonian. In view of T h e o r e m 3.3, it suffices to consider t h e case gcd(n,a) — gcd(n,b) = 1, which we shall assume. By Theorem 3.4, every p a t h in G of length d is basic. Therefore G is hamiltonian if and only if there is a p a t h of length d with period n/d. As seen before, a p a t h P in G of length d corresponds to a word M = (° ^ J in A, B of length d. Note t h a t d divides j3 since P is basic. Further, P has period n/d if and only if t h e smallest positive integer m such t h a t Mm lies in H, the stabilizer of (°) in T = {A,B), is m = n/d. T h e uniformity assumption on G implies t h a t for any node x in G, t h e set S(x) consists of t h e nodes in G which are congruent to x modulo d. T h e m a t r i x M acts on S(x) for any x. It has period n/d if and only if this action is transitive. T h e latter can be easily checked algebraically as follows. T h e action of M on S(0) = {jd : 0 < j < n/d} defines a directed graph G' on 5(0) with t h e out-neighbor from jd being ajd + /9 since M(3A = r"31 J. Clearly, M is transitive on 5 ( 0 ) if and only if G' is hamiltonian. After dividing by t h e common factor d on nodes of G', t h e graph G' is nothing but t h e graph G(n/d;ce,/3/d). Therefore G is hamiltonian if and only if there is a word M = {£ x J in A, B of length d such t h a t G(n/d; a, fi/d) is hamiltonian. Using t h e algebraic criterion for t h e hamiltonicity of G ( m ; k, h), we summarize t h e above discussion in

102

F. K. Hwang &W.-C.

W. Li

T h e o r e m 3 . 1 1 Let G = G(n;a,e;b,f) be an extended double loop network with d = gcd(e — / , n ) . Suppose that G(n'; a, e; b, f) is n'/d-uniform. Then G is hamiltonian if and only if (1) gcd(e, f,n) = 1, gcd{a — l,d) = : d' and d have t h e same p r i m e factors, and 4 divides d' whenever 4 divides d; (2) in case gcd(n,a) = gcd(n,b) — 1, i.e., n' = n, there is a word M = (£ j J in A = ( ° j ) and £ = ( £ { ) of length d such t h a t gcd(/3,n) = d, gcd(a - l , n / d ) = : d" and n/d have t h e same prime factors and 4 divides d" whenever 4 divides n / d . Further, t h e n u m b e r of ways of getting M satisfying (2) is t h e n u m b e r of hamiltonian circuits in G in t h e case gcd(n, a) = gcd(n, 6) = 1. W h e n a = b = 1, t h e m a t r i x M = (a0 () has t h e form [\ k'+(d~k)^ for s o m e 0 < k < d, and t h e above theorem simplifies as C o r o l l a r y 3 . 1 2 Gn{hi,h2) is hamiltonian if and only if gcd{hi,h,2,n) = 1 and there is an integer 0 < k < d, where d = gcd{h\ — h2,n), such that gcd(khi + (d—k)h2,n) = d. Note t h a t Gn(hi,h2) with gcd(hi — h2,n) > 1 is not always hamiltonian, for instance, G3fJ(5,3) is hamiltonian, while Geo(5,3) is not. C o r o l l a r y 3 . 1 3 Let G = G(n;a,e;b,f) be an extended double loop network with gcd(a,n) = gcd(b,n) — gcd(e — f,n) = 1. Then G is hamiltonian if and only if at least one of G(n;a,e), G(n;b,f) is hamiltonian. C o r o l l a r y 3 . 1 4 Let G = G(n;a,e;b, f) be an extended double loop network gcd[a,n) = gcd(b,n) = 2 and gcd(e — f, n) = 1. Then G is hamiltonian.

with

Both generalized deBruijn network and Imase-Itoh network ha.ve gcd(e — f,n) = 1. T h e results on these networks given in [1, 2] are special cases of Corollaries 3.13 and 3.14. R e m a r k . W h e n t h e extended double loop network G = G(n; a, e; b, / ) is a Cayley graph, t h e criterion for G being hamiltonian obtained above is t h e same as t h a t for a Cayley graph as studied in [13, 16, 17]. This is t h e case when a = b = 1 in particular. On t h e other hand, there are extended double loop networks which are not Cayley graphs, and we can still determine if they are hamiltonian using t h e theorem above. We exhibit one example below. E x a m p l e . One checks easily t h a t G = G(18; 11,5; 11,7) is an extended double loop network with d = 2 and
M1.1 !) -M!, 1 !)•

Hamiltonian

Circuits for 2-Regular

Interconnection

Networks

103

One finds A 2 = ( » « ) , B 2 = ( » \ 2 ) , A B = ( » \°) and BA = ( £ «) (mod 18). Since gc
"•(;)• (;)- B , (;)-(;)-' , i , G)-(;)- d M (0-(0 ( ™ d u ) O n e verifies easily t h a t each of t h e first 2 equations has 6 solutions, while t h e last two equations have none. This shows t h a t G has only 6 cycles of length 2. By t h e way, this also explains why G(18; 11,5) and G(18; 11,7) are not hamiltonian. In t h e example above, t h e Cayley graph Cay(T;A,B) is also hamiltonian. It is unknown to us if the hamiltonian property of G is related to t h a t of t h e associated Cayley graph when A, B are invertible.

4

Semi-Torus Networks

Let r, c be integers > 2 and let T(r, c) denote an r x c rectangular grid lying on t h e surface of a torus with n = re nodes labeled by t h e grid points (i,j) with i modulo r and j modulo c. Suppose t h a t t h e row links are unidirectional, while t h e links on each column are unidirectional, but t h e directions m a y vary from column to column. We call T(r, c) a semi-torus network. W h e n t h e columns are also unidirectional, it is t h e familiar torus network. T h e directions on columns can be described by a ± 1 sequence {eo, ei> • • •, e c - i } so t h a t columns with t h e same direction have t h e same sign, and t h e opposite directions with opposite signs. T h e functions <7i(i, j ) = ( i , i + 1) and 9i{}-ij) — (z + ejtj) describe t h e two types of links on T(r,c). E x t e n d t h e definition of ft to integers k such t h a t e^ = e,-, where j is t h e remainder of k divided by c. For any node (a, b) of T(r, c), g(a, b) = g^ o gj(a, b) = (a + «&, b — 1), and consequently, S\(a,b)

= {(a + eb + e t _i H

(- eb-i+1,b-

i) : i = 0 , 1 , . . . , m - 1} ,

where m is t h e smallest positive integer k such t h a t b = b — k ( m o d c) and a -f ej + £;,_! + • • • + et-ic+i = a ( m o d r ) . T h e first equation implies m = cq for some integer q. Substituting it into the second equation yields tb + £(,_] H

h £j,_c,,+i = q(e0 + «i H

h e c - i ) = qsc = 0 ( m o d r ) ,

104

F. K. Hwang & W.-C.

W. Li

where sc — £o + £1 + • • • + £ c -i- Let d = gcd{r, sc). T h e n t h e second equation means t h a t q is a multiple of r/d. Thus m = rc/d — n/d. We have shown L e m m a 4 . 1 T(r,c)

is

nId-uniform.

C o r o l l a r y 4.2 If r,c > 2 and d = gcd(r,sc)

= 1, then T(r,c)

is not

hamiltonian.

T h e o r e m 4 . 3 Let sc = to + £i + • • • + £c-i o-nd let d = gcd(r, sc). If d = 2, then every path in T(r,c) of length 2 is basic. When d > 3, every path in T(r,c) of length d is basic if and only if to = £i = • • • = £ c _i, that is, all columns are unidirectional. P r o o f . Let p be a p a t h in T(r, c) starting at t h e node (a, 6). Its ending node has t h e form (a + ioffc +'ii£6+i + • • • + ik^b+k, b + k) for some nonnegative integers i0, i l t . .. ,ik, k with io + J'I + • • • + ik + k = d, and t h e intermediate nodes on p are represented by (a + i0£6 + • • • + it-i^b+t-i + i't^b+t, b + t), where 0 < i't < it, 0 < t < k, and JoH [-it-i + i't + t runs from 1 to d— 1. Let (a + i0eb-\ r-«t-i£6+i-i + ijCb+t,fe+ i) and (a + i0tb + • • • + iu-itb+u-i + i'l^b+u, b + u) be two nodes on p with d — 1 > u + i" + i u _i + • • • + io — (t + ?J + it-i + • • • + io) ^ 1- T h e n either u > < or else u = t and i„ > i\. We study t h e solvability of t h e following system of two congruence equations: b+ t = b+ u —j

( m o d c) ,

- a + i0tb + ••• + it-iCb+t-i + i'ttb+t s a + i0tb + • • • + i„_i£6 + „_i -\-i'^b+u + £(,+" + Q>+u-i H

r- ei+u-j+i

( m o d r) .

i,From t h e first equation we get t = u — j + qc for some integer q. Assume first t h a t u = i, t h a t is, t h e two nodes are on t h e same column. T h e n j = qc and t h e second equation becomes (i" — i'u)tb+u + £&+u + £(>+"-i H H e 6+u- 9 c+i = («" — *!JEH-U + i"u — iu > 0, this equation is not solvable for q so t h e two nodes on t h e same column are not in the same class. Next we discuss the case u > t. In this case j = u — t + qc and t h e second equation becomes (lt

— i't)tb+t

+ H+l^b+t

+ 1 + • ' • + « u - l £ & + u - l + i"«fc+u + C-b+u + £6+14-1 + • ' • +

= (?( — i',) £ 6+t + 2f+i£&+t+l H +

£b+u-qc-l

+ "- " +

h J „ - i £ n - u - i + i"ib+u + qsc +

ib+u-qc

^b+t-qc+1

= (i, - i't)tb+t + (it+i + l)«6+(+i H = M + qsc = 0

tb+t-qc+l

H (iu-i + l)£6+u-i + (*" + 1 )£(,+„ + gs c

( m o d r) ,

where M = (it - i't)tb+t + (*t+i + l)«6+<+i "I

1" (*u-i + l)£6+u-i + (*" + !)£(.+« •

Hamiltonian

Circuits for 2-Regular

Interconnection

Networks

105

As \M\ < it - i\ + ( i ( + 1 + 1) + • • • + (i«_i + 1) + (•; + 1) = (io + ... + iu-i

+ i" + «) - (*o + • • • + it-i + i't + t)
,

t h e system has a solution, t h a t is, the two nodes are in t h e same class, if and only if M = 0. W h e n e0 = • • • = e c - i , we have M / 0 and hence t h e p a t h p is basic. Conversely, assume all paths of length d > 3 are basic, we want to show t h a t e0 = • • • = e c _i. Suppose otherwise, say, £(,+i = —ej for some 0 < 6 < c — 2. Consider a p a t h p of length d starting at (0,6), going to (tb,b), then to (£(,, 6 + 1) and onward to other nodes, so t h a t i0 = 1. Then (0, 6) and (ej, 6 + 1 ) are in t h e same class since M = (i0 — 0) £(,+o + (0 + l)et-t-i = 0. Furthermore, (et, 6 + 1) is not the ending node of p since p has length > 3. This shows t h a t p is not basic. Finally, we are left with t h e case d = 2. We have shown t h a t two consecutive nodes (a, 6) and (a + £(,, 6) on t h e same column are not in t h e same class. Next consider any two consecutive nodes on a row, say, (a, b) and (a, 6 + 1). T h e n u = 1, t = 0 and M = «(,+i has absolute value 1. Again (a, 6) and (a, 6 + 1 ) are not in t h e same class. This shows t h a t in case d = 2 any p a t h of length d is basic. Concerning t h e existence of a hamiltonian circuit in a semi-torus network, we have t h e following result: T h e o r e m 4 . 4 Suppose r,c> 2. Let sc = e0 + • • • + e c _! and d = gcd(sc,r). (1) If d = 1, then T(r,c) is not hamiltonian. (2) If d = 2, then T(r, c) is hamiltonian. (3) If d > 3 and T(r, c) is unidirectional in both rows and columns, then it is hamiltonian if and only if there exists an integer k, 1 < k < d— 1, SUC/J £/ia£ gcd(k, r) = 1 and gcd(d — k, c) = 1. P r o o f . T h e first statement is Corollary 4.2. W h e n d = 2, we know t h a t any p a t h in T ( r , c) of length 2 is basic by Theorem 4.3. Consider t h e p a t h p from t h e node (0,0) through t h e node (0,1) to t h e node (ei, 1). It determines a factor F of T(r,c). We shall show t h a t p has period njl = rc/d in F. As we have seen in the proof of T h e o r e m 7 above, in T(r,c) adjacent nodes are not in the same class. So t h e nodes lying on t h e p a t h in F containing p belonging to t h e same class as (0, 0) have t h e form (ex + e2 + • • • + et-, i), and t h e period of p is t h e smallest positive solution to t h e system i = 0 ( m o d c) , t\ + ••• + £; = 0

( m o d r) .

which is rc/d by a simple computation. Finally, when rows and columns are unidirectional, T(r, c) is the usual torus network. As any p a t h of length k is basic by Theorem 4.3, a similar a r g u m e n t proves t h e third s t a t e m e n t . However, the network is also t h e Cartesian product of two circuits of lengths r and c. T h e necessary and sufficient condition for such a network to

106

F. K. Hwang & W.-C.

W. Li

be hamiltonian as described in (3) has been implied in t h e work of Rankin [13] and proved by Trotter and Erdos [17]. So no detail is given here. While we do not have a general criterion for t h e case d > 3 and columns not unidirectional, t h e following construction gives a hamiltonian circuit for T ( 6 , 9) with d = 3:

t l l t l t t t l Figure 1: A Hamiltonian Circuit of T ( 6 , 9 )

5

Semi-Manhattan Networks

A mesh with r rows and c columns is called a s e m i - M a n h a t t a n network M(r, c) if all (except possibly one) rows alternate in direction but the columns have arbitrary directions. We are interested in t h e existence of a hamiltonian circuit in M(r,c). Observe from Corollary 4.2 t h a t when r is odd, r > 3, and columns are unidirectional, there is no hamiltonian circuit in M{r,c). T h e following theorem says t h a t this is t h e only exception. T h e o r e m 5.1 M(r,c) unidirectional.

is hamiltonian

except when r > 3 is odd and the columns

are

P r o o f . T h e o r e m 5.1 is easily checked when r < 2 or c < 2. So assume r > 3 and c > 3 and we shall prove the theorem by construction. In all t h e following constructions, the orientation of t h e hamiltonian circuit (clockwise or counterclockwise) depends on t h e direction of t h e last column and an arc in a figure represents a single edge. C a s e 1. T h e r e are three adjacent columns alternating in direction. W i t h o u t loss of generality, assume t h a t these are the first two and the last columns. For even r we use t h e vertical comb in Fig. 2(a) and for odd r we use the vertical comb in Fig. 2(b).

Hamiltonian

Circuits for 2-Regular

Interconnection

Networks

teeth

107

r - 1

(a)

teeth

(b)

Figure 2: Combs as Hamiltonian Circuits

teeth

iih i + I s (a)

teeth

i + lst (b) Figure 3: 2-Sided Combs as Hamilton Circuits

F. K. Hwang & W.-C.

108

W. Li

C a s e 2 . T h e r e are two pairs of adjacent columns, say, t h e first, t h e last, t h e i t h and t h e i + 1 s t such t h a t t h e first and t h e i t h have one direction, while t h e i + I s ' and t h e last have t h e other. For even r we use t h e 2-sided comb in Fig. 3(a) and for odd r t h e 2-sided comb in Fig. 3(b). C a s e 3 . r is even and there are two adjacent columns with t h e same direction, say, t h e first and t h e last. We use a different comb as shown in Fig. 4.

^

teeth

Figure 4: A Different Comb as a Hamiltonian Circuit In general, when t h e columns are not unidirectional, partition t h e columns into consecutive runs of columns in t h e same direction. If there exists a run of length 1, then it falls in Case 1; if every run is of length at least 2, then it falls in Case 2. W h e n all t h e columns are unidirectional, we have seen t h a t M(r, c) is not hamiltonian for odd r, for even r this is Case 3. T h e proof of T h e o r e m 5.1 is now completed. C o r o l l a r y 5.2 The Manhattan

street network is

This is included in Case 1 above.

hamilton

Hamiltonian

Circuits for 2-Regular

Interconnection

Networks

109

References [1] D. Z. Du and F . K. Hwang, Generalized deBruijn digraphs, Networks, 27-38.

18 (1988)

[2] D. Z. Du, D. F . Hsu, F . K. Hwang and X. M. Zhang, T h e hamiltonian property of generalized deBruijn digraphs, J. Cornbin. Theory, series B, 52 (1991) 1-8. [3] L. J. Guibas, H. T . Kung, and C. D. Thompson, Direct VLSI implementation of combinatorial algorithms, Cal. Tech. Conf. VLSI, Pasadena, Ca., J a n . 1979. [4] F . K. Hwang, T h e hamiltonian property of linear functions, OR Letters, 6 (1987) 293-296. [5] F . K. Hwang and W.-C. W . Li, 2-connectivities of extended double loop networks, submitted. [6] M. Imase and M. Itoh, Design t o minimize diameter on building-block network, IEEE Trans. Comput., C - 3 0 (1981) 439-442. [7] M. Imase and M. Itoh, A design for directed graphs with m i n i m u m diameters, IEEE Trans. Comput., C - 3 2 (1983) 782-784. [8] W . H. K a u t z , K. N. Levitt and A. Waksman, Cellular interconnection arrays, IEEE Trans. Comput, C - 1 7 (1968) 443-451. [9] W . Liu, T . H. Hildebrandt and R. Cavin, III, Hamiltonian cycles in t h e shuffleexchange network, IEEE Trans. Comput., C - 3 8 (1989) 745-750. [10] A. J. Martin, T h e T O R U S : An exercise in constructing a processing surface, Proc. VLSI Conf. 1981. [11] N . F . Maxemchuk, Regular mesh topologies in local and metropolitan area networks, AT&T Tech. J., 6 5 (1985) 1659-1687. [12] D. K. P r a d h a n , Fault-tolerant multiprocessor and VLSI-based systems communication architecture in Fault Tolerant Computing Theory and Techniques, Vol. II (ed. D. K. P r a d h a n ) , (Prentice-Hall, Englewood Cliff, N J , 1986). [13] R. A. Rankin, A campanological problem in group theory, Proc. Camb. Soc, 4 4 (1948) 17-25.

Phil.

[14] S. M. Reddy, D. K. P r a d h a n and J. G. Kuhl, Direct graphs with m i n i m u m diameter and maximal connectivity, School of Eng., Oakland Univ. Tech. Rep., July 1980.

110

F. K. Hwang & W.-C. W. Li

[15] C. S. Raghavendra, M. Gerla and A. Avizienis, Reliable loop topologies for large local computer networks, IEEE Trans. Comput., C-34 (1985) 46-54. [16] 0 . Serra, M. Mora and M. A. Fiol, On c-circulant digraphs, in Proc. Combinatorics '88, Ravello (Italy), F. Rocca and F. Mazzocca, eds., (Mediterranean Press, 1991) pp. 371-387. [17] W. T. Trotter, Jr. and P. Erdos, When the Cartesian product of directed cycles is hamiltonian, J. Graph Theory, 2 (1978) 137-142. [18] E. A. van Doom, Connectivity of circulant digraphs, J. Graph Theory, 10 (1986) 9-14. [19] D. Witte and J. A. Gallian, A survey: hamiltonian cycles in Cayley graphs, Disc. Math., 51 (1984) 293-304.

Ill Network Optimization Problems, pp. 111-123 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Equivalent Formulations for the Steiner Problem in Graphs Bassam N. Khoury Panos M. Pardalos Donald W . Hearn Department of Industrial and Systems Gainesville, FL 32611 USA

Engineering,

University

of

Florida,

Abstract In the following paper, we present a collection of known and new formulations for the Steiner problem in undirected and directed graphs. K e y w o r d s : Steiner problem, graphs, directed graphs, integer formulations, mixed integer formulations, continuous formulations.

1

Introduction

Providing an appropriate formulation for a problem is a crucial step toward solving it. In addition, different formulations can be used to obtain insight of t h e properties and t h e complexity of a problem. In this paper, we present a collection of known and new m a t h e m a t i c a l formulations for t h e Steiner problem in graphs ( S P G ) , a problem t h a t was first posed by Hakirni [11] and Levin [18] independently in 1971. Since its conception, the S P G has become of importance due to its numerous applications in areas like manufacturing systems, communication and transportation network design and phylogeny. For good sources of information on t h e S P G , t h e reader is referred to [13, 12, 30, 29, 19]. This paper is organized in t h e following way. In section 2 we define t h e S P G and a related problem called t h e Steiner problem in directed graphs ( S P D G ) , and we establish the fact t h a t any m a t h e m a t i c a l model for t h e S P D G can be used to model t h e S P G . Based on this fact, all the m a t h e m a t i c a l models given in t h e following sections, either for the SPG or for the S P D G , are considered potentially for t h e S P G . In section 3 mixed integer formulation models are presented. Integer formulation models are given in section 4. In section 5, a continuous formulation, which is the first of its kind, is presented.

B. N. Khoury,

112

2

P. M. Pardalos,

and D. W.

Hearn

The SPG and the SPDG

In this section, t h e S P G and t h e SPDG are defined and t h e relationship between the two problems is highlighted. T h e SPG definition given below has an interesting matroid interpretation [16, 9]. Given an undirected graph G = (J\f, A,C), where Af = { 1 , . . . , n} is a set of nodes, and A is a set of undirected arcs (i,j)

with each arc incident to two nodes, and

C is a set of nonnegative costs cy associated with undirected arcs then t h e S P G is defined as follows:

(i,j),

I n s t a n c e : A graph G = (Af,A,C), a node subset TlCAf. Q u e s t i o n : Find t h e m i n i m u m cost tree, on G, t h a t would connect all the nodes in 1Z. T h e S P D G is t h e directed graph version of t h e Steiner problem. Apparently, [23] is the first reference to consider the S P D G . Consider Gd = (Af, Ad,Cd) where Af = { 1 , . . . , ra} is a set of nodes, and Ad is a set of directed arcs (i,j) d

incident to nodes in Af, and

C = {cfj} is a set of costs associated with directed arcs

(i,j).

Furthermore, let 1Z. (a set of regular nodes) be a subset of Af, and let r (a be an arbitrary node in 1Z. Also, define a directed Steiner tree T rooted with respect to 71 on G, to be a directed tree where all nodes in TV\{r} can from r via directed paths in T , and all leaf nodes are in 1Z. Next, we define

root node) at node r, be reached the SPDG.

I n s t a n c e : A graph G = (Af,A,C), a node subset 1Z and a root node r in 72.. Q u e s t i o n : Find t h e m i n i m u m cost directed Steiner tree, on Gd with respect to TZ, t h a t has node r as root node. In t h e light of t h e S P G and S P D G definitions given above, it is not hard for one to see t h a t t h e S P D G is a generalization of t h e S P G . In fact, t h e S P G can be transformed to a special case of t h e SPDG where the arc cost structure is symmetric. This transformation can be done as follows. Given G = (Af, A,C), create the corresponding directed graph Gd = (Af, Ad,Cd), where every undirected arc (i,j) € A corresponds to two oppositely directed arcs (i,j) and (j,i) G Ad associated with the same cost (cfj = CJ; = Cij). Potentially, any m a t h e m a t i c a l formulation for t h e S P D G can be used to formulate t h e S P G . Having highlighted this relationship, we can proceed to t h e next sections where different m a t h e m a t i c a l formulations are given for t h e S P D G and t h e S P G . Whenever used in t h e following sections, 0(i) and X(i) denote the subsets of arcs t h a t are going out of node i and into node i respectively.

Equivalent Formulations for the Steiner Problem in Graphs

3

113

Mixed Integer Formulations

One of the classes of mathematical models used to formulate the SPG is mixed integer programming, where some of the variables used in the model are integer, and others are continuous. Two known mixed integer formulations based on network flow requirements are given here.

3.1

Network Formulation 1

Given a directed graph Gd = (A/-, Ad,Cd) and a subset of nodes H, consider a network flow problem in which the root node r EH offers |7?.| — 1 units of flow and each node in 7?.\{r} demands one unit. Let yij be the amount of flow on arc (i,j), and x,j be a binary variable corresponding to arc (i,j)- Then, consider the following formulation. ninimize subject to:

/ . (i,J)£Ad

c x

ij ij

vn = - l

VH -

Y

yn-

Y

yji = \n\-i

(for i = r)

Y

yn -

Y

yji = 0

(Vz G Af\Tl)

(i,j)€0(i)

Y

(Vi G Tl\{r})

Y

0,i)ex(i)

0 < yn <(\K\-

l)xtj

x«e{o,i}

W,j)eAd) (v(i,j)eAd)

The constraints involving variables t/,-j only are simply flow conservation constraints. The coupling constraints enforce the fact that an arc can carry a positive flow only if it is used in the tree, meaning that its corresponding binary variable Xij is 1. There are formulations of the same type used in location problems and in the traveling salesman problem [8, 25]. This formulation for the SPDG is given in [2] by Arpin et al. where relaxation methods are applied to it.

3.2

Network Formulation 2

Given a directed graph Gd = (J\f,Ad,Cd) and a subset of nodes 1Z, consider a multicommodity flow problem in which the root node r 6 U offers one unit of every commodity k (k = 1 • • • |7?.| — 1), where commodities correspond to nodes in 7V\{r}. Let yf, be the amount of commodity k (the amount of flow between nodes r and k) on arc (i,j), and let X{3 be a binary variable that indicates whether or not an arc (i,j) is included in the solution. Then, the SPDG second formulation is given below.

114

{J-2)

B. N. Khoury,

minimize subject to:

^

P. M. Pa.ida.los, and D. W. Hearn

CijXij

£ y* £ y% = (i,j)eo(i) U,i)ei(i)

E 4 - E 4 == (M)6O(0

0,.)6r(.)

(••J)6O(0

(j,0er(.)

1 -1

= 0 E 4 - E yji =

(fori=r,VteR\{r}) (for i = k,V (for z ^ {r, fc

0 < 4 < *«

(V(t,j)e^:

z.-j e { 0 , 1 }

(V(z,j)e^

T h e coupling constraints indicate t h a t flow of any commodity is allowed on arc (i,j) only if t h a t arc is included in t h e solution. In a way, we can view xtJ as t h e flow capacity for commodity k on arc (i,j). T h e first three constraints, involving yf- only, dictate t h a t one unit of commodity k be routed between nodes r and k. This formulation is given in [31] by Wong where a dual ascent approach is applied t o it. Note t h a t formulation Ti can be reduced in size by aggregating t h e coupling constraints [31]. However, a number of studies of related problems such as plant location and fixed charge network design [7, 2 1 , 27, 28] have concluded t h a t the disaggregated formulation, although it is larger in size, is much more useful for developing efficient algorithms.

4

Integer Formulations

T h e S P G can be approached as an instance in t h e realm of integer programming. In integer programming, formulating a problem is crucially i m p o r t a n t to solving it [24]. In this section, we present several integer formulations for t h e S P G and t h e S P D G . Some of these formulations have been approached with relaxation m e t h o d s used to develop lower bounds for a branch and bound scheme in some previous studies.

4.1

An Elementary Path Formulation

Given an undirected graph G = (Af,A,C), a subset of regular nodes 11 and an arbitrary node r € 7?., associate with every arc (i,j) two binary variables xtJ and yijk\ Xij is one if arc (i,j) is used in t h e solution tree, and zero otherwise; y,-^ is one if arc (i,j) is on t h e unique elementary p a t h in t h e solution tree from node r to node k (k € TZ\{r}), and zero otherwise. T h e S P G is formulated as follows.

Equivalent

(^3)

Formulations

minimize subject to:

for the Steiner Problem

in Graphs

115

c x

2J

ij ij

there exists an elementary p a t h [yijk] from r to k , Vfc €

Xij>y,jk

7Z\{r]

{Vk£K\{r},V(i,j)eA)

iye{o,i} ^€{0,1}

(V(i,j)eA) (Vken\{r},v(i,j)eA).

T h e first constraint ensures t h a t t h e solution is connected. T h e second constraint dictates t h a t an arc is on some p a t h only if t h a t arc is used in t h e tree. This formulation is given in [4] by Beasley where a Lagrangean relaxation technique is used to develop lower bounds for a branch and bound algorithm. In this approach, redundant "box constraints" are added to the formulation to strengthen lower bounds.

4.2

A Minimum Spanning Tree Formulation

This formulation is given in [5] by Beasley where a branch and bound algorithm is devised using lower bounds obtained from Lagrangean relaxation techniques. Given an undirected graph G = (Af, A, C), a subset of nodes 1Z and an arbitrary node r £ 1Z, t h e formulation is motivated by t h e following. T h e S P G is to choose a subset of A t h a t would connect together all nodes in 1Z with t h e m i n i m u m cost. It is well known t h a t edges in this subset of A will form a m i n i m u m cost spanning tree, t h e Steiner tree, on some subset of vertices 1Z U <S, where S C Af\1Z is called the set of Steiner vertices. In order to exploit this valuable information, let us add an artificial node 0 to t h e graph. Also, for each node i g Af\1Z, add an edge (0, i) of cost zero. For node r, add an edge ( 0 , r ) of cost zero. Clearly, t h e SPG reduces to finding the m i n i m u m spanning tree (MST) on the augmented graph G° = (Af°, A°,C°) with the additional restriction t h a t in this M S T , any node i € Af\1Z connected by the arc (0, i) to vertex 0 must have degree 1 (see Figure 1). In more rigorous m a t h e m a t i c a l terms, this can expressed by t h e following formulation. (T4)

minimize

^Tj

CijXij

(•',j)e^°

[xij\

=

spanning tree on G° = (A^°,^4°,C°)

xoi + xpq

<

1

subject to: x,3e{0,l}

4.3

(for p or q = i, Mi e

Af°\TZ)

(V(i,j)eA°).

A Set Covering Formulation

Given a directed graph Gd = (Af,Ad,Cd) and a subset of nodes 7?., consider the partition of Af = X U ~X such t h a t X d ~X = 0, 11 D X ^ 0 and H n ~X ^ 0. Let

B. N. Khoury, P. M. Pardalos, and D. W. Hearn

116

^

Edge of cost zero

Isolated nodes connected to 0 by edges of cost 0

Subtree connected to

Figure 1: An optimal solution for the restricted MST problem.

Equivalent Formulations for the Steiner Problem in Graphs

117

V = (X,X) denote the cut-set of edges between X and X. Obviously, there are exponentially many sets. Suppose that we enumerate such sets to be V\,- • •, 7-V. Define constants ap;j (p € {1 • • • TV}, (i,j) £ Ad) to be 1 if arc (i,j) is in cut set Vp, and zero otherwise. Now, the SPDG can be formulated as the following set-covering problem: (^5)

minimize

£

CijXij

(.',j)6-4 d

subject to:

£

apijXij

x«€{0,l}

>

1

(Vp € {1, • • • ,w} (V(z,j)G^).

This formulation is given in [1] by Aneja where a set covering algorithm for the SPG is proposed. In addition, Chopra et al. [6] have used this formulation to find lower bounds, useful for a branch and bound algorithm, by solving the relaxed linear problem. The relaxed version has potentially an exponential number of constraints; however, one need not consider all of them, and this can be done using a row generation technique [1].

4.4

A Rooted Tree Formulation

In the following, a rooted-tree formulation, which was proposed recently by Khoury et al. [14] in the context of a test problem generator for the SPG based on the Karush-Kuhn-Tucker optimality conditions, is presented. Given a directed graph Gd = (Af,Ad,Cd), a subset of nodes U and a root node r e H, the SPDG can be formulated as follows. {J~e)

minimize subject to:

2_,

c x

ij ij

^P

X{j = 1

(Vj 6 1Z\{r~\)

(O)ex(j)

£ xkl >Xij (*.0er(0\{(i.01 ^•€{0,1} [x{j] contains no cycles.

(V(z,j)eAd\0(r)) (V(iJ)€Ad)

Whether or not an arc (i,j) is used in the tree is translated to assigning the value 1 or 0 to its corresponding binary variable Xij. An aggregated form of the above formulation, where aggregation is performed on the second set of constraints, is given by Nastansky et al. [23]. The first type of constraint implies that at least one arc is going into every regular node i, i 6 7^\{r}. The second type of constraint dictates that for a directed arc (i,j) to be used in a tree, at least one tree arc has to be going into node i, i 6 A/"\{r}. This constraint and the last constraint together imply the

118

B. N. Khoury,

P. M. Pardalos,

and D. W.

Hearn

structure of a rooted tree. Note t h a t there are different ways to represent the last constraint mathematically. This type of constraint is encountered in formulating the traveling salesman person where cycles corresponding to subtours are to be eliminated

[17]. Although this formulation does not involve flow variables, it has t h e feature of rooted trees in common with t h e network flow formulations given earlier. This combination of t h e integer and rooted-tree features gives t h e formulation a unique ingredient. Also, note t h a t this formulation does not involve potentially an exponential n u m b e r of constraints like Ts-

4.5

A Node-Arc Integer Formulation with Valid Cuts

Here, we present a new formulation motivated by the fact t h a t we can see an undirected Steiner tree on an undirected graph as a directed tree, on the corresponding directed graph, rooted at any regular node (we have \R\ such directed trees). Given an undirected graph G = (A/-, A,C), its corresponding s y m m e t r i c directed graph Gd = (Af,Ad,Cd) and a subset of regular nodes 1Z, associate with every undirected arc {i,j) in G a binary variable Xij, and associate with every directed arc (i, j) in Gd binary variables yt*; X{} is one if undirected arc (i,j) is used in t h e solution tree, and zero otherwise; y\- is one if directed arc (i,j) is in a tree rooted at k G TZ, and zero otherwise. Also, with every node i G Af\TZ, associate a binary variable 6{ t h a t is one if node i is in the solution tree, and zero otherwise. T h e SPG is formulated as follows. (.7-7)

minimize subject to:

J^

ctJXij ^

Xij

>

E *••; ^

1

(Vt € 71)

2S{

(Vi G Af\Tl)

(ij)eA Si

>

X{j

Vij + Vji

=

Xjj

{VieAr\ny{i,j)eA) (Vfc€ft,V(i,j)€ Ad)

E 4 =

i

{Vken,VjeK\{k})

E

s3

(Vke1i,VjeAr\1l)

0

(\/kenMi,k)ei{k))

(>j)ei(j)

»£• =

(•'J)6I(J')

Vik [Vij] contains no cycles

=

(Vfc G K).

T h e first three constraints ensure that in t h e undirected tree, every regular node is connected to t h e tree, and t h a t every Steiner node has at least a degree of two. T h e fourth set of constraints enforce t h e fact t h a t every rooted tree solution is derived from t h e unrooted tree by appropriately assigning directions to arcs in the undirected

Equivalent

Formulations

for the Steiner Problem

in Graphs

119

tree and vice versa. T h e fifth, sixth, seventh and eighth sets of constraints ensure t h a t t h e directed interconnecting network corresponding to node k € TZ is a directed tree rooted at k connecting its root node to all other regular nodes in 7?.\{A:}, a property which is present in formulation T%. Although this formulation contains a relatively large n u m b e r of constraints, most of these constraints are valid cuts. As a result, we expect this formulation t o develop good lower bounds for a branching tree.

5

Continuous Formulations

Every m a t h e m a t i c a l formulation t h a t is not continuous because of integer zero-one variables X{ can be m a d e continuous by relaxing the integrality constraints, replacing t h e m with zero-one "box constraints" and adding to the objective function a large penalty term corresponding to violated integer constraints. This is usually done by adding a large M t e r m , if we are dealing with a minimization problem, of the form a:'M(l — x). Naturally, t h e question now is how large should M be? How difficult is it to deal with a very large M factor in the minimization process? In general, these questions are very difficult to answer, see [26]. Here, we present a practical continuous formulation, which was proposed by Khoury [15] in t h e context of continuous approaches to t h e S P G . However, in this formulation, t h e source of concavity is still attached t o t h e objective function. Indeed, a series of continuous e-concave optimization problems is presented, and a proof t h a t the limit of t h e series when t goes to zero is the fixed charge formulation of the S P D G (see [10]) is given. Also, a rough error bound on approximating the solution of the SPDG by t h e solution of a sequence in t h e series is presented below.

5.1

An e-Concave Optimization Approach

Assume we are given a directed graph G = (J\f,Ad,Cd). Let ~R be a subset of nodes on t h e graph, set of regular nodes. Furthermore, let node r, t h e root node, be an arbitrary node in 1Z. In addition, let y,j be a flow variable t h a t is associated with arc (i, j). Moreover, let e be a strictly positive real number. Now, define Ts(t), a concave minimization problem, as follows. (jFs.J

minimize fs{e,Y) subject to:

=

^Z

c

«"j

T

-

AY = B F >0.

where A is an \M\ x \Ad\ node-arc incidence m a t r i x , and Y is an l ^ l x 1 vector whose entries are t h e !/;JS, and B is an \J\f\ x 1 vector whose entries fc; correspond to the set of nodes; furthermore, 6j = 0 if i € A/\7?-, and 6; = —1 if i € TZ\{r}, and br = \R.\ — 1. This is basically a m i n i m u m cost flow problem where t h e objective

120

B. N. Khoury,

P. M. Pardalos,

and D. W. Hearn

function is a separable concave function. Note t h a t due to the strict concavity of fi(Y), t h e unimodularity of A and t h e integrality of B, t h e optimal solution to T&tt is always integer. Now, if we let i , j = (Fs)

— , the limit of T% t when e goes to zero becomes Vi3 + «

minimize fs{Y) subject to:

=

^Z

AY = B Ad)

xti = 0(1) for ytJ = 0 ( > 0) (V(i,j) € Y

>0.

Observe t h a t F s , which is a fixed formulation of t h e S P G , is an equivalent tight formulation of T\.

5.2

e-Error Bound

Here, we give an e-error bound on the approximation of t h e solution of the SPDG by t h e solution of one sequence Fs, £ of the series of concave optimization problems given above. T h e o r e m : T h e error on approximating /g by /g(eo) is less t h a n 1 +

e

J^ ctJ. °(.',i)e^

Proof Since / s ( e , Y) is a convex and decreasing function of e, we have 0
8

*M

<

lim/8(c,y*(eo))-/8(eo) i +

e

e

° (i,j)eAd

< i T° 7 ^

6

C D

-

Concluding Remarks

In this paper, we have presented eight different formulations for the S P D G and the S P G . Those formulations fall into three categories, namely mixed integer, integer and continuous categories. In the mixed integer category, two network formulations are presented, where one of t h e m is based on flow requirements and the other on c o m m o d i t y requirements. Under t h e second category, we have presented three known formulations, t h e elementary p a t h , t h e set covering and t h e m i n i m u m spanning tree formulations. Also, we have proposed two new formulations. One of t h e proposed formulations, t h e rooted-tree integer formulation, combines the properties of involving

Equivalent

Formulations

for the Steiner Problem

in

Graphs

121

only integer variables and having t h e structure of rooted tree, a property present in t h e network formulations mentioned above. T h e second proposed integer formulation is more complicated in nature; however, it contains more valid cuts, and it is expected to give good lower bounds. In addition, we have presented, in t h e category of continuous formulations, a new approach where t h e S P D G is approximated by a sequence of a series of concave optimization problems. Every one of these formulations is valid. However, some formulations are better t h a n others due to a b e t t e r distribution of bad and good properties between t h e constraint set and t h e objective function. It is well known t h a t manipulating an optimization model can be done by either manipulating t h e constraint set or manipulating t h e objective function (see also [22]). Sometimes a formulation handles good optimality properties through its objective function. In this case, it is a good idea to express those properties in t h e feasible domain by adding more constraints to the problem to further tighten t h e feasible region. An alternative situation is when the set of constraints has bad properties while the objective function has nice properties. It would be a good idea in this case to take some of the bad properties out of the constraint set and express t h e m in the objective function. T h e moral of t h e story here is t h a t with better shared information between t h e objective function and the constraint set, m a t h e m a t i c a l optimization models would be b e t t e r off. Criteria to identify b e t t e r models and systematic ways to formulate t h e m are, however, open questions.

References [1] Aneja, Y.P. (1980), "An Integer Linear P r o g r a m m i n g Approach to the Steiner Problem in Graphs," Networks, Vol. 10, 167-178. [2] Arpin, D., Maculan, N. and Nguyen, S. (1983), " Le Probleme de Steiner sur un Graphe Oriente: Formulations et Relaxations," Publication 315, Centre de Recherche sur les Transports, Universite de Montreal. [3] Balakrishnan, A. (1982), "Formulations and Algorithms for the Steiner Network Problem," Unpublished manuscript, Sloan School Management, M I T . [4] Beasley, J . E . (1984), "An Algorithm for the Steiner Problem in Graphs," works, Vol. 14, 147-159.

Net-

[5] Beasley, J . E . (1989), "An SST-Based Algorithm for t h e Steiner Problem on Graphs," Networks, Vol. 19, 1-16. [6] Chopra, S., Gorres E.R. and Rao, M.R. (1992), " Solving t h e Steiner Tree Problem on Graphs Using Branch and Cut," ORSA Journal on Computing, Vol. 4, No. 3, 320-335.

122

B. N. Khoury,

P. M. Pardalos,

and D. W.

Hearn

7] Cornuejolis, G., Fisher, M.L. and Nemhauser, G.L. (1977), "Location of Bank Accounts to Optimize Float: An Analytic Study of Exact and A p p r o x i m a t e Algorithms," Management Science, Vol 23, 789-810. 8] Gavish, R. and Graves, S.C. (1982), " Scheduling and routing in transportation systems: Formulations and new relaxations," G r a d u a t e School of Management, University of Rochester, N.Y. 9] Gondran, M. and Minoux, M. (1984), Graphs and Algorithms,

John Wiley & Sons.

[10] Guisewite, G.M. and Pardalos, P.M. (1991), "Algorithms for t h e Single-Source Uncapacitated M i n i m u m Concave-Cost Network Flow Problem," Journal of Global Optimization, Vol. 1, 245-265. [11] Hakimi, S.L. (1971), "Steiner's Problem in Graphs and its Implications," works, Vol. 1, 113-133. [12] Hwang, F.K. and Richards, D.S. (1992), "Steiner Tree Problems," Vol. 2, 55-89. [13] Hwang, F.K., Richards, D.S. and Winter, P. (1992), The Steiner Elsevier, A m s t e r d a m .

Tree

Net-

Networks,

Problem,

[14] Khoury, B.N., Pardalos, P.M. and Du, D.-Z (1993, to a p p e a r ) , "A Test Problem Generator for t h e Steiner Problem in Graphs," ACM TOMS. [15] Khoury, B.N. (1993), "The Steiner Problem in Graphs," P h . D . Dissertation (under preparation), D e p a r t m e n t of Industrial and Systems Engineering, University of Florida, Gainesville, Florida. [16] Lawler, E.L. (1976), Combinatorial Rinehart and Winston. [17] Lengauer, T. (1990), Combinatorial John Wiley & Sons, Inc. .

Optimization

Algorithms

Networks

and Matroids,

for Integrated

Circuit

Holt,

Layout,

[18] Levin, A . J u . (1971), "Algorithms for t h e Shortest Connection of a G r o u p of Graph vertices," Soviet Math. Doklady, Vol. 12, 1477-1481. [19] J. MacGregor Smith and P. Winter (Eds.), (1991), Topological Network Vol. 31, No. 1-4 of Annals of Operations Research. [20] Maculan, N. (1987), "The Steiner Problem in Graphs," Ann. Vol. 31, 185-222.

Discrete

Design,

Math.,

[21] Magnanti, T.L. and Wong, R . T . (1981), "Accelerating Benders Decomposition: Algorithmic E n h a n c e m e n t and Model Selection Criteria," Operations Research, Vol. 29, 464-484.

Equivalent

Formulations

for the Steiner Problem

in Graphs

123

[22] Martin R.K. (1991), "Using Separation Algorithms to Generate Mixed Integer Model Reformulations," Operations Research Letters, Vol. 10, 119-128. [23] Nastansky, L., Selkow, S.M. and Stewart, N . F . (1974), "Cost-Minimal Trees in Directed Acyclic Graphs," Zeitschrift filr Operations Research, Vol. 18, 59-67. [24] Nemhauser, G.L. and Wolsey L.A. (1988), Integer and Combinatorial tion, J o h n Wiley & Sons, Inc.

Optimiza-

[25] Nobert, Y. (1982), "Construction d'Algorithmes O p t i m a u x pour des Extensions du Problem de Voyageur de Commerce," P h D Dissertation, Publication 297, Cent r e de Recherche sur les Transport, Universite de Montreal. [26] Pardalos, P.M. and Rosen, J . B . (1987), Constrained Global Optimization, Algorithms and Applications, Springer-Verlag, Lecture Notes in C o m p u t e r Science 268. [27] Rardin, R.L. (1982), "Tight Relaxations of Fixed Charge Network Flow Problems," Technical Report J-82-3, School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA. [28] Rardin, R.L., Parker R.G. and Lim, W.K. (1982), "Some Polynomially Solvable M u l t i - C o m m o d i t y Fixed Charge Network Flow Problems," Technical Report J-822, School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA. [29] Vofi, S. (1990), Steiner-Probleme

in Graphen [in German], Hain, Frankfurt.

[30] Winter, P. (1987), "Steiner Problem in Networks: A Survey," Networks, 129-167.

Vol. 17,

[31] Wong, R . T . (1984), "A dual Ascent Approach to Steiner Tree Problems on a Directed Graph," Mathematical Programming Vol. 28, 271-287.

125 Network Optimization Problems, pp. 125-145 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Minimum Concave-Cost Network Flow Problems with a Single Nonlinear Arc Cost B e t t i n a Klinz Institut fur Mathematik

Hoang Tuy Institute of Mathematics, Vietnam

B, TU Graz, Kopemikusgas.se

Vien Toan Hoc, P.O.B.

24, A-8010

Graz,

631 Bo Ho, 10.000

Austria

Hanoi,

Abstract We apply the parametric method recently developed for rank two quasiconcave minimization problems to minimum concave-cost network flow problems in which all arcs except one have linear cost. The main step of the resulting algorithm consists of solving a specially structured parametric linear network flow problem. We derive upper bounds on the number of breakpoints in the optimal value function of this parametric problem and obtain the following results: For general networks with integer capacities and demands, we get a pseudo-polynomial bound on both the number of operations and of objective function evaluations required by our algorithm. On single source, uncapacitated networks with n nodes, m arcs and |T| sinks, our method can be shown to run in 0(S(n,m)) operations and at most \T\ + 1 evaluations of the single nonlinear arc cost function, where S(n,m) is the time needed to solve a single source shortest path problem on the given network. These bounds depend on n and m only and are thus strongly polynomial.

1

Introduction

Q u a s i c o n c a v e r a n k - t w o m i n i m i z a t i o n . In view of t h e difficulty inherent to global optimization, it is natural to look for specially structured global optimization problems t h a t can be solved practically by properly adapted methods. In [23] an i m p o r t a n t

B. Klinz and H. Tuy

126

class of these problems has been singled out whose nonconvexity is located in a subspace of low dimension and hence can be handled very efficiently. This class includes problems which have t h e following general formulation: (P)

minimize

f(x)

subject t o

x € D

where D is a polyhedron in R.d while f(x) is a quasiconcave function defined on an open convex set Q D D and satisfying t h e following "rank two condition1'': T h e r e exist two linearly independent vectors c and c such t h a t if (c\w)

> 0 (t = 1,2) t h e n f(x + w) > f{x)

for all i £ f i .

(1)

Here (y, z) denotes the scalar product of the two vectors y and z. For solving problems of t y p e (P), Tuy and Tarn [22] proposed two algorithms. T h e first one, using a polyhedral annexation technique, is most suitable when t h e computation of the values of f(x) is t i m e consuming (which is the case, in particular, if f(x) is defined in some implicit way). T h e second one, which practically reduces t h e problem to solving a p a r a m e t r i c linear program, has proven to be very efficient when t h e values of f(x) are easy to c o m p u t e (which occurs e.g. if f(x) is t h e product of two affine functions, as considered in [5, 7, 14, 18, 21, 22, 23]). In t h e present paper we shall discuss a special global optimization problem for which t h e above approach seems to be particularly useful. Specifically, we will s t u d y the minimum concave-cost network flow problem (MCCFP) in t h e case when only one arc has a nonlinear cost. It is well known t h a t the general (MCCFP) is MVhard. Nevertheless, as will be shown below, t h e special version of this problem t h a t we shall be considering falls into t h e above class of rank two quasiconcave minimization problems and therefore can be efficiently solved by t h e parametric method as developed by Tuy et al. in [22, 23]. O r g a n i z a t i o n of t h e p a p e r . In the next section we will first present t h e theoretical foundations of t h e parametric approach to quasiconcave minimization problems satisfying t h e rank two condition (1). T h e n in Section 3 we show t h a t the m i n i m u m concave-cost network flow problem with a single concave arc cost (MCCFP\) belongs to this class and how it can be solved through a two-phase algorithm. In t h e first phase we solve a specially structured parametric-cost linear network flow problem (TPa), i.e. we determine an optimal flow for each optimality interval (^interval between two adjacent breakpoints in the optimal value function of (TPa)). Due to the results obtained in Section 2, in the second phase it suffices to minimize the concave cost function over t h e set of flows determined in t h e first step. To analyze this algor i t h m , we derive a pseudo-polynomial upper bound on t h e n u m b e r of breakpoints of (TPa) for t h e case of integer-valued capacities and d e m a n d s . Based on this result, the n u m b e r of operations and of evaluations of t h e single nonlinear arc cost function performed by our algorithm can be shown to be pseudo-polynomial as well.

Concave-Cost Network

Flow Problems

with a Single Nonlinear

Arc Cost

127

In Section 4 we discuss t h e special case of a single source, uncapacitated (SSU) network as considered by Guisewite and Pardalos [8] and prove t h a t for SSU networks, \T\, t h e n u m b e r of sinks in t h e network, provides an upper bound on t h e n u m b e r of breakpoints of (TPa). Finally, in Section 5 we describe an implementation of t h e p a r a m e t r i c algorithm for SSU networks with n nodes and m arcs which requires 0(S(n,m)) operations and at most | T | + 1 evaluations of t h e single nonlinear arc cost function, where S(n, m) denotes t h e t i m e needed to solve a single source shortest p a t h problem on t h e given network. In contrast to t h e algorithm of Guisewite and Pardalos [8], these t i m e bounds depend on n and m only and not on t h e size of t h e total d e m a n d , and are thus strongly polynomial. We close t h e paper with a short s u m m a r y of our m a i n results and some concluding remarks in Section 6.

2

A Parametric Method for Rank Two Quasiconcave Minimization

We first prove a theorem which serves as the foundation of the p a r a m e t r i c approach t o quasiconcave minimization problems satisfying t h e rank two condition (1). In [22], this proposition was indirectly derived from a polyhedral annexation procedure for solving ( P ) . T h e proof below is more direct and may provide insight into t h e structure of t h e problem. Consider t h e p a r a m e t r i c linear program (Qa), (Qa)

minimize

a € [0,1], where

(c 1 + a(c2 — c 1 ), x)

s.t.

i e O .

Assuming D ^ 0, let (j>(ct) denote t h e optimal cost curve of (Qa)- From t h e theory of p a r a m e t r i c linear programming (cf. e.g. M u r t y [15]) it is well known t h a t (a) is a continuous, concave and piecewise linear function in t h e p a r a m e t e r a. Let a0 = 0 < cti < ... < ctM-i < CLB — 1 be the sequence of breakpoints of (a). For each interval Ajt = [ajt_i,a/fc] (k = 1 , . . . ,B) ( = o p t i m a l i t y interval) let us choose any vertex i ' ' of D which is an optimal solution of (Qa) for all a £ Afc. If (Qa) has unbounded optimal value for a £ A&, then x'*' is taken to be a vertex "at infinity", i.e. an unbounded edge of D over which the objective function of (Qa) is unbounded below. It follows from t h e properties of (a) mentioned above t h a t this may happen only for k € { 1 , B } . W h e n a;'*' is an unbounded edge of D, we agree to denote by f(x^) t h e inflmum of f(x) over this edge. Let E be a convex set in IR . T h e n we denote by E* t h e polar of E, i.e. t h e set of all v such t h a t (v,x) < 1 for all x € E. T h e o r e m 2 . 1 We have

mm{f(x)

: x £ D) = mm{f(x^)

:k=

l,...,B}.

128

B. Klinz and H. Tuy

P r o o f . Let k* € a r g m i n { / ( z W ) : k = 1 , . . . ,B} C = {x € fi : f(x)

and 7 = /(*<**>). Define > 7}.

We have to show t h a t D C C . Since this is trivial if 7 = —00, we may assume 7 > - c o . Arguing by contradiction, suppose there exists a point z £ D \C. Take any point x° g D D C and let «/ be t h e intersection of t h e line segment [x°; z] with t h e b o u n d a r y of t h e convex set C. T h e n there exists a vector u ^ O such t h a t (v,x-y)

<

0

(v,z-y)

>

0.

for all x € C,

(2) (3)

(So (v,x — y) = 0 is a, supporting hyperplane for C at point y.) By multiplying v by a positive factor, if necessary, we may always assume (v,y — x°) < 1. T h e n from (2) we have (v,x — x°) = (v,x — y) + (v,y — x°) < 1 for all x 6 C , i.e. {u,x} < 1 for all 1 6 C - 1 0 and hence, u € (C - x°)*. (4) But in view of t h e rank two condition (1), if K := {w : (c\w) > 0, i — 1,2} then f(x° + w)> f(x°) > 7 for all w € X , i.e. x° + K C C*. Consequently, K cC - x°, which in t u r n implies (C - x 0 )* C K*. (5) On t h e other h a n d , it is known from convex analysis t h a t t h e polar of t h e cone K = {w : (c',w) > 0, i = 1,2} is the cone generated by the vectors —c1 and —c2. From (4) and (5) we deduce v = —t\cl — i 2 c 2 , with ii,f 2 > 0, ii + t 2 > 0 (note t h a t v / 0). Setting a = £2/(^1 + ^2)1 so t h a t a 6 [0,1], we have v = - ( < ! + t2)[cl + a{c2 - c 1 )].

(6)

Let a 6 A^. Since x'*) solves (Qa) for a € A^, it follows from t h e above expression (6) t h a t x'k) solves also t h e linear program max{(v,x) : x £ D}. In view of t h e fact z 6 D, t h e inequality (3) implies m a x { ( » , i ) : x € D] > (u,z) > (v,y). Hence, in case x'*' is a vertex, (v,x^k' — y) > 0 follows, and consequently, by virtue of (2), x' fc ' ^ C , i.e. / ( x ' k ) ) < 7. If a;(*) is an unbounded edge of D, then by taking a sufficiently r e m o t e point x' ' on this edge, we have (v, x ' ' — y) > 0, hence x ' ' ^ C , i.e. /(x' f c )) < / ( x ' f c ' ) < 7. We arrived at a contradiction to t h e definition of 7 in both cases and thus DCC, i.e. x'4*) solves ( P ) . • T h e o r e m 2.1 gives rise to t h e following procedure for solving ( P ) : Algorithm 1 (parametric method): 1. C o m p u t e the successive breakpoints ct0,ai, •• • • ,C*B of t h e optimal cost curve of t h e p a r a m e t r i c linear program (Qa) and t h e associated sequence x ' 1 ) , . . . , x ' s * such t h a t x'*' is a basic optimal solution of (Qa) for all a € A* = [0^-1,0:*] or an unbounded edge of D over which (Qa) is unbounded below for all a € A*, respectively.

Concave-Cost

Network

Flow Problems

with a Single Nonlinear

Arc Cost

2. Find k* € a r g m i n { / ( x ^ ' ) : k = 1 , . . . , B], where for an unbounded edge z is defined as t h e infimum of f(x) over this edge. T h e n a;'**' solves ( P ) .

129 f(z)

T h u s solving a quasiconcave minimization problem with rank two property requires almost t h e same amount of work as solving a p a r a m e t r i c linear program. In t h e next section we will examine a case when this procedure seems to b e particularly efficient. R e m a r k . At this point it should b e mentioned t h a t t h e rank two condition (1) is only a special case of t h e following more general "rank K condition''' given by Tuy in [24]: T h e r e exists a cone K whose polar K* has dimension K such t h a t x + K C{xen:

f(x)

> /(£)}

for all x G D.

(7)

In our case K = 2 and K = {w : (c',w) > 0, i = 1,2}. In [23, 24] Tuy showed t h a t b o t h t h e parametric approach described above and t h e polyhedral annexation m e t h o d mentioned in t h e introduction, work also for problems of t y p e ( P ) satisfying t h e more general rank K condition (7). (For details and even further generalizations see [23].) D At first sight it might be surprising t h a t t h e rank two condition allows t h e reduction of t h e original d-dimensional problem ( P ) to a problem of dimension 2. However, we m a y gain some intuition by observing t h a t from condition (1) it follows t h a t (c\w)

= 0 (i = 1,2)

=>

f(x + w) = f(x)

for all i £ f l ,

(8)

i.e. t h e objective function f(x) is constant on a subspace of R. whose codimension is 2. Note t h a t (8) can also b e interpreted as rank two condition in t h e sense of (7), though it is obviously weaker t h a n (1). (Set simply K = {w : (c',w) = 0, i = 1,2}.) R e m a r k . A careful inspection of the proof of T h e o r e m 2.1 shows t h a t only a small change to Algorithm 1 is necessary for solving problems of t y p e ( P ) which satisfy only t h e weaker rank two condition (8) — one simply has to c o m p u t e all breakpoints of t h e parametric linear program (Qa) for a G R, instead of for a € [0,1] as above. • To conclude this section, let us r e m a r k t h a t an alternative p a r a m e t r i c procedure for solving problem ( P ) can be developed by considering t h e following parametric right-hand-side linear program. (Rfj)

minimize

(c2,x)

s.t.

x€D

and

(c1,x)=/3

for j3 € R.. It is interesting to note t h a t there holds a duality relationship between the parametric problems (Qa) and (Rp).

B. Klinz and H. Tuy

130

3 3.1

A Special M i n i m u m Concave-Cost N e t w o r k Flow Problem Basic Notation and Problem Statement

M i n i m u m c o n c a v e c o s t flow p r o b l e m s Let G = (N, A) be a connected directed graph with node set N and arc set A, where |iV| = n and |A| = m . Associated with each arc (i,i') € A are a nonnegative integer capacity u,y (u,;/ = co is possible) and a concave cost function A,,/ : R^" —» R.. Furthermore, an integer number 6;, the demand, is attached to each node i G N, where we assume XTigjv k — 0. Let S := {i € N : i; < 0} and T := {i € N : 6; > 0} denote t h e set of sources and t h e set of sinks, respectively. T h e minimum lated as follows: (MCCFP)

concave-cost

min

network flow problem (MCCFP)

£

can now be formu-

M*"')

(<,i')eA

s.t.

^2 (i',i)€A

Xi't — ^

xa* = bi

for all i £ N

(i,i')£A

0 < x;,-' < u,{i

for all (i,i')

S A.

By introducing AG, the node-arc incidence m a t r i x of t h e graph G, and using vector notation for t h e flow variables i,-t->, t h e capacities u;;/ and t h e d e m a n d s bi, we obtain t h e following more compact formulation for t h e constraints of (MCCFP): AGx = b Q<x
(9) (10)

C o m p l e x i t y o f m i n i m u m c o n c a v e c o s t flow p r o b l e m s It is well known t h a t (MCCFP) is A^P-hard in general. Unfortunately, also m a n y special cases of this problem, such as t h e fixed-charge network flow problem where the costs hai(-) are restricted to be fixed-charge costs and the single source uncapacitated (SSU) network case in which \S\ = 1 and u;;/ = co for all arcs (i,i') G A remain A^P-hard. (For further details on complexity issues in connection with (MCCFP) t h e interested reader is referred to the survey article of Guisewite and Pardalos [9] and t h e references given therein.) T h e inherent difficulty of solving the general (MCCFP) provides t h e motivation for looking for efficiently solvable special cases. Optimization problems with general nonlinear functions in the objective, like (MCCFP), however, pose certain difficulty

Concave-Cost

Network

Flow Problems

with a Single Nonlinear

Arc Cost

131

as far as their complexity is concerned. T h e main problem t h a t arises is t h a t there is no complexity model available for t h e description of general functions (though some a t t e m p t s to define such a model can be found in t h e book of Nemirowsky and Yudin [16]). For nonanalytical functions or functions without explicit regular representation, the length of t h e input may even become infinite. Likewise, t h e length of t h e o u t p u t may become infinite, if t h e computation of t h e exact optimal solution is required. For (MCCFP) with integer capacities and d e m a n d s , however, t h e latter case cannot occur and there is thus no need to deal with t h e concept of £-accurate optimal solutions. One way out of t h e problems with encoding t h e objective function, is to assume t h a t there exists an oracle providing us with t h e required function values (cf. Hochbaum and Shantikumar [12]), since then no representation is necessary. In this complexity model an algorithm is called (strongly) polynomial if both t h e number of operations (additions, multiplications, comparisons etc.) and the number of objective function evaluations it performs are (strongly) polynomial in t h e input length. Of course, if t h e involved functions can be evaluated in (strongly) polynomial time, the algorithm becomes (strongly) polynomial in t h e more traditional sense. As t h e above definition of polynomiality and strong polynomiality, respectively, seems to be most appropriate for our purpose, we will use it throughout t h e paper. T h e same convention is apparently adopted in m a n y papers on polynomially solvable nonlinear optimization problems, in particular on concave-cost flow problems (see e.g. [4, 8, 27]), though mostly only implicitly. (For a more detailed discussion of this topic see Hochbaum and Shantikumar [12]). Up to now not m a n y special cases of (MCFFP) are known to be solvable in polynomial t i m e (in t h e sense explained above). Almost all rely on a reduction to some kind of shortest p a t h problem. In this regard, Zangwill [27] gave a polynomial t i m e algorithm for SSU networks with only one sink and Erickson, M o n m a and Veinott [4] derived from their general send-and-split m e t h o d polynomial t i m e algorithms for (MCCFP) on fc-planar graphs (k fixed), and for several special production and inventory planning problems. M i n i m u m c o n c a v e - c o s t flows w i t h a s i n g l e n o n l i n e a r a r c c o s t In t h e following we will consider another interesting special case, namely m i n i m u m concave cost flow problems where all cost functions /&;,'(•), except a single one, are linear functions. Recently Guisewite and Pardalos [8] developed a polynomial t i m e algorithm for this problem on SSU networks. In this section we will give a pseudopolynomial t i m e algorithm for general networks and in t h e subsequent sections we shall show t h a t for SSU networks our algorithm is in fact strongly polynomial and improves upon t h e algorithm obtained by Guisewite and Pardalos. For t h e ease of exposition, assume from now on t h a t t h e arcs in A are labeled { 1 , 2 , . . . , rn) and t h a t t h e single arc with nonlinear cost is t h e first arc in this order.

132

B. Klinz a n d H.

Tuy

F u r t h e r m o r e , let / i ( - ) be a concave monotone nondecreasing function and denote by CJ £ R, t h e unit cost of arc j , j = 2 , . . . , m. T h e n t h e problem under consideration can be written as follows: m

(MCCFPJ

min

f{x)

= f1{x1)

s.t.

(9) and (10).

+ Y^cixi

(11)

R e m a r k . Note t h a t in t h e b o u n d e d solution case t h e assumption t h a t / i is nondecreasing can be m a d e without loss of generality because in t h a t case t h e flow on an arc is bounded above by t h e sum of t h e flows leaving t h e source. We may e.g. proceed as follows: If necessary, we split arc 1 into two arcs and assign t h e cost functions fi+g and —g, respectively, to t h e newly created arcs, where t h e linear function g is chosen such t h a t / i + g becomes nondecreasing. An alternative approach t o problems with cost functions / x which do not satisfy t h e required monotonicity assumption would be to use t h e weaker rank two condition (8) instead of condition (1) used in t h e current paper. In m a n y practical applications, however, / i will be nondecreasing from t h e beginning, anyway. •

3.2

A Parametric Algorithm for Minimum Concave Cost Flow Problems with a Single Nonlinear Arc Cost

It is not h a r d to see t h a t for (MCCFPi) c1 = ( 1 , 0 , . . . , 0 )

t h e rank two condition (1) holds with and

c2 = ( 0 , c 2 , . . . , c m ) .

Indeed if (c 1 , w) = wx > 0 and (c 2 , w) = YJ?=2 cjwi ^ 0 t n e n f(x + w) > f(x) for any x £ K?. Observe t h a t for t h e weaker rank two condition (8) to hold, it is not necessary to require / i to be monotone. For efficiency reasons (smaller p a r a m e t e r interval in t h e associated p a r a m e t r i c p r o g r a m ) , we prefer to use t h e stronger condition (1) nevertheless. Consequently, (MCCFPi) is a special quasiconcave rank-two minimization problem and hence we may apply t h e parametric m e t h o d developed in t h e previous section (cf. Algorithm 1). Recall t h a t Algorithm 1 consists of t h e following two steps: 1. Solve t h e parametric linear program {Qa) becomes

which in t h e case of m

(TPa)

min

(1 — a)xi + a J ^ CjXj

s.t.

(9) and (10).

J=2

(MCCFPi)

Concave-Cost

Network

Note t h a t (TPa) dj(a) where

Flow Problems

with a Single Nonlinear

Arc Cost

133

is a parametric-cost linear network flow problem with arc costs

, , , d

' ^

f 1— a = \c

j a

for j = 1

. . <12>

fori = 2,...,m.

Let {a) denote t h e optimal value function of problem (TPa) and, as previously, let a0 = 0 < cti < . .. < otB-i < as = 1 and x ' 1 ' , . . . , x ' B ' be t h e associated sequences of breakpoints and corresponding optimal solutions, respectively. Depending on whether or not there exists a finite optimal solution of (TPa) for a € [ctk-i, Ck], t h e optimal solution x ^ ' is either a vertex or an unbounded edge of t h e set of feasible flows given by (9) and (10), respectively. 2. Determine k* g a r g m i n { / ( i ' ' ; ' ) : k = 1 , . . . , B } . Therein f(x^) fi{x[ ) + X]j=2 c3xi 'f x ^ ls a v e r t e x and by t h e m i n i m u m of fi(xi) over t h e edge x'*' if x'*' is an unbounded edge (cf. (11). T h e n x^'^

is an optimal solution of

is given by + J2T-2 cixi

(MCCFPX).

A n a l y s i s of t h e p a r a m e t r i c a l g o r i t h m Step 1: Using an algorithm often a t t r i b u t e d to Eisner and Severance [3] t h e breakpoints a^ and t h e associated optimal flows x^ (k = 1,...,B) can be computed in 0(BM(n,m)) t i m e , where M(n,m) is t h e t i m e needed to solve a classical m i n i m u m cost flow problem on a network with n nodes and m arcs. T h e currently best strongly polynomial m i n i m u m cost flow algorithm with t i m e complexity M(n,m) = 0(m log n (n log n + rre)) is due to Orlin [17]. (This algorithm essentially consists of a sequence of 0(mlogn) shortest p a t h calculations). Hence, Step 1 can be implemented in 0( Bmlogn(n\ogn + m)) time, i.e. polynomial in t h e number of breakpoints of t h e optimal value function 4>{a). (Other algorithms to solve the parametric-cost problem (TPa) can e.g. be found in Gusfield [10, 11].) Step 2: We need to evaluate / ( x ' f c ' ) for k = 1 , . . . , B. For x ^ being a vertex, this task simply boils down to one evaluation of t h e single nonlinear arc cost function f\. T h e situation is, however, slightly more complicated in t h e case of an unbounded edge where we have to decide whether or not t h e original nonlinear objective function is unbounded below on this edge. Whereas till now it was sufficient to assume t h a t the function values fi(t) are computable (given to us by an oracle), we now need to require t h a t also the following limit is available: lim^oo fi'+[t) where fi'+{t) denotes t h e righthand-side derivative of f\ at t\ T h e n t h e infimum of / ( x ) along an unbounded edge can be computed as follows: L e m m a 3.1 Let z be an unbounded edge, say V = {q + Xy : A > 0 } , of the set and let £ = lim / / (t). Then f(z) = i n f { / ( x ) : i g H is given by t—*oo

feasible

134

B. Klinz and H. Tuy

if &/i + E Wi < 0, (13)

1=2

otherwise. P r o o f . T h e concave function f(x) either attains its m i n i m u m over F a t the origin q of F, or is unbounded below over T; t h e latter case occurs if and only if we have inf A _ 1 /(Aj/) < 0 (see e.g. Rockafellar [19], T h e o r e m 8.6.). But since fi(Xyi) < fi{iyi)

+ (A — t)yifi'+(tyi)

inf ^ A>0

= A

for all t > 0 with equality for t = A, we have

fc^. ^

+

infinf^M±(A^)M^l) A>0 t>0

TTi

=

A m

= H cj3/j + jnf 3/i/i'+(*2/i) = 53 ci& + ^£j=2

-

j=2

(Note t h a t due to t h e concavity of f\ t h e right-hand-side derivative fi'+(t) nonincreasing.)

is monotone •

R e m a r k . Let us have another look a t the condition £?/]. + J2T=2cjyj < 0 in (13) from a more combinatorial point of view. For t h a t purpose, observe t h a t due t o t h e fact t h a t F = {q + Xy : A > 0} is an edge of t h e flow polyhedron, t h e arc set {j (z A : yj > 0} induces a cycle C. Hence (13) simply checks whether or not C has negative weight w.r.t. t h e arc weights Cj for arcs j = 2 , . . . , m and £ = l i m ^ o o fi'+(t) for arc 1, respectively. Note t h e close relationship of (13) to conditions guaranteeing t h e existence of a finite optimal solution for t h e general m i n i m u m concave-cost flow problem (cf. e.g. Erickson e t al. [4]). D Summarizing we have obtained t h e following result: T h e o r e m 3.2 Let a network with n nodes and m arcs be given and assume that <j>(a), the optimal value function of the parametric-cost network flow problem (TPa), has 5 — 1 breakpoints aside from 0 and 1. Furthermore, assume that the limit lim fi'+(t) is available. Then the parametric algorithm computes an optimal solut—HXt

'

lion of(MCCFPi) on this network within 0( Bm\ogn(n\ogn at most B evaluations of the concave cost function f^.

+ m)) operations

and

P r o o f . First, note t h a t in Algorithm 1, the parametric algorithm, Step 1 dominates Step 2 with respect t o t i m e complexity. As explained above, Step 1 can b e implem e n t e d t o run in 0( Bm log n(n logn + m ) ) time. T h e upper bound of B for t h e n u m b e r of objective function evaluations is an i m m e d i a t e consequence of t h e above discussion. • R e m a r k . As already remarked in Section 2, due t o the concavity of (ct), an unbounded edge x ^ ' can only occur for k = 1 or k = B. In t h e special case of (TPa)

Concave-Cost

Network

Flow Problems

with a Single Nonlinear

Arc Cost

135

dealt with here, however, only k = B is possible, for if there exists a feasible flow x at all, i.e. a vector x satisfying the constraints (9) and (10), then it follows t h a t (j>(0) > - c o . (In fact, we have <j>{0) = m i n { x i : x is a feasible flow } > 0.) Obviously, for nonnegative arc costs and nonnegative concave cost function / i , we know beforehand t h a t (a) > —oo for all a € [0,1]. Hence, in this case, it is not necessary to have t h e limit lim;-,,*, fi'+(t) available. D U p p e r b o u n d s o n t h e n u m b e r of b r e a k p o i n t s T h e o r e m 3.2 provides the motivation for investigating how large t h e number of breakpoints of (ce) might become in the worst case. Unfortunately, a class of parametriccost linear network flow problems of t y p e (TPa) for which t h e n u m b e r of breakpoints is exponential in t h e n u m b e r n of nodes of t h e underlying network can be derived in t h e following way: We start from t h e famous pathological network class of Zadeh, see [26]. For every integer r > 2 this class contains a network AfT with n = 2r + 2 nodes and m = r 2 + r + 2 arcs. While t h e network topology and the capacities remain unchanged, t h e costs have to be modified in order to obtain a problem of t y p e (TPa). For simplicity, assume t h a t t h e arc from the source 1 to t h e sink n in Mr is arc 1 and let c j denote t h e unit cost of arc j in Zadeh's original construction. T h e n t h e p a r a m e t r i c costs dj(a) are constructed as follows: We set rfi(a) = 1 — a and dj(a) = cfa for j = 2 , . . . , m . Furthermore, we require a flow of value v = 2r + 2r~2 — 2 by defining t h e right hand side as follows: bx = —v, 6; = 0 for i / l , n and bn = v. Using similar arguments as in Carstensen [2] and Ruhe [20], it can now easily be shown t h a t in this problem, 4>(a) has 2 r — 2 breakpoints in t h e interval [0,1], aside from t h e endpoints 0 and 1. One might ask now whether there exist special cases for which there is a polynomial upper bound on t h e number of breakpoints. Since Afr has only one source, there is no hope to get such a bound for single source capacitated networks. By using a classical transformation for getting rid of t h e capacities (see e.g. Ahuja, Magnanti and Orlin [1]), the same negative result follows for multi source uncapacitated networks. However, as we will see in t h e next section, there is a nice polynomial bound for t h e single source uncapacitated case. Since in t h e bad network class described above t h e size of t h e d e m a n d s depends exponentially on t h e problem size n, t h e question whether there also exist bad networks with "small" demands arises. T h e answer is "no" and is provided by t h e pseudo-polynomial bound given in Theorem 3.3 below. T h e key observation will be t h a t due to t h e special structure of t h e costs in (TPa), t h e n u m b e r of breakpoints is bounded above by the number of different values for t h e flow on arc 1. Let V denote t h e set of basic feasible solutions of (TPa) Xi : = {xi : x €

V},

and define

B. Klinz and H. Tuy

136

i.e. X\ is t h e set of values t h e flow on arc 1 might attain in a basic feasible solution. Recall t h a t T denotes t h e set of sinks and Cj and 6; denote t h e capacity of arc j and t h e d e m a n d of sink i, respectively. We now can show T h e o r e m 3.3 (i) The optimal and 1.

cost curve of (TPa)

has at most \X\ \ — 1 breakpoints

aside from 0

(ii) Let R : = m i n { u i , J2 M - Then we have

|*i I (a) > ~°° f ° r a ' l a £ [0, <*B-I] provided there exists a feasible flow at all. From t h e linearity of t h e objective of (TPa) it follows now t h a t for a € [0, a s - i ] <j>{a) = mm{<j>x(a) : x € V},

(14)

where x(o) = (1 — a)xi + aJ2T=2cJxjF ° r short we write x(a)). For each I £ I i let V( be the collection of all x € V such t h a t x\ — £ and let qi = min {qx : x G V^} and (j>e(a) = £ + qict. Since t h e functions 4>x(a), x £ Vt, have equal intercepts, it is clear t h a t <j>e(ot) = m i n {(j>x(a) : x € Ve] for all a e [0,1], hence by (14), 4>{a) =mm{£+ qea: £<EX1}. (15) Now each interval A r = [ a r _ i , a r ] , (r = 1 , . . . , JB — 1), corresponds to a value £r 6 X\ such t h a t (a) = £r + 9< r a for all a £ A r .

F u r t h e r m o r e , since (^(a) is concave and continuous, its slope on A r , i.e. qeT, must decrease with r, while its intercept £T must increase. T h a t is, two different intervals correspond to two distinct elements from X\. As t h e number of breakpoints aside from 0 and 1 is one less t h a n t h e number of intervals, (i) follows immediately. To give an upper bound on \XX \ note t h a t clearly Xx C {£ : £ integer,0 < £ < R = m i n { u i , £ & , • } } . P r o p e r t y (ii) is now a direct consequence. Combining Theorems 3.2 and 3.3 we obtain

(16) •

Concave-Cost Network Flow Problems with a Single Nonlinear Arc Cost

137

Corollary 3.4 Let R := min{«i, J2 °i}- In order to compute an optimal solution for (MCCFPi), the parametric algorithm requires 0(Rm\ogn (n log n + m ) ) operations and at most R + 1 evaluations of the concave function f\, again provided that lim f\'+(t) is available for problems in which the existence of an unbounded optimal solution cannot be ruled out beforehand. The bounds in Corollary 3.4 imply that (MCCFP\) can be solved in pseudopolynomial time. This is in sharp contrast to a result by Guisewite and Pardalos [9] for the general (MCCFP) in which they show that this problem cannot be solved by a pseudo-polynomial algorithm being polynomial in the sum of the arc capacities unless V = MV.

4

A Strongly Polynomial Time Algorithm for SSU Networks

In [8] Guisewite and Pardalos gave a polynomial time algorithm for (MCCFP^) in the case of a single source, uncapacitated (SSU) network. In the sequel we will show that the parametric approach presented above yields another polynomial time algorithm for this special case. Let node 1 be the only source, i.e. &i < 0 and 6; > 0 (J; = 2, . . . , n ) . Since in the SSU case there are no capacity restrictions we set Uj = oo for j = l , . . . , m . Theorem 3.3 can now be strengthened as follows: T h e o r e m 4.1 For a SSU network with \T\ sinks, <j>(a), the optimal cost curve of (TPa), has at most \T\ breakpoints aside from the two endpoints of the interval [0,1]. Proof. Recall that for fixed a € [0,1] the problem (TPa) is a classical minimum cost flow problem. For the proof it is now essential to observe that the problem of finding a flow of minimum cost on a SSU network can be reduced to a single source shortest path problem. (This can for example easily be seen by applying the classical successive shortest path algorithm, see e.g. Ahuja et al. [1].) More specifically, for each sink i g T consider the subproblem (TPaj) obtained from (TPa) by replacing &i by —1, 6; by 1 and b( ((. / l,i) by 0. This problem is in fact to find the shortest path Pi(a) from node 1 to node i with respect to the arc weights dj(a) defined in (12). Fix 5 G [0,1] and let ,(a) denote the optimal value function of problem (TPa,i). (In other words, <j>i(a) is the weight of -P,(5), the shortest path from 1 to j , in case it exists and —oo otherwise.) If j/(') is a basic optimal solution of (TPaii) for a = 5, then, as can easily be checked, x = E, e T 6,y*'' is an optimal solution of (TPa) for

B. Klinz and H. Tuy

138

a = a. (x is t h e flow one obtains by sending 6,- units of flow along t h e p a t h -Pi (5) for every sink i € T. Hence,

#5) = 2>&(5)

(17)

whenever i(a) > — oo for all i € T, i.e. whenever there exists a shortest p a t h P , ( 5 ) from node 1 to node i for all i G T. But since t h e subproblem (TPaj) involves just one sink, with d e m a n d 1, for any basic feasible solution x to this subproblem one must have X\ 6 { 0 , 1 } . (x\ = 1 corresponds to t h e case where t h e arc 1 is included in t h e p a t h P , ( 5 ) while for x\ = 0 this arc is not contained in P;(5).) Relation (16) and Theorem 3.3 now imply t h a t the function i(a) has at most one breakpoint aside from t h e two endpoints of t h e interval [0,1]. formula (17) then shows t h a t <j>(ct) has at most \T\ breakpoints aside from 0 and 1. Q T h e following corollary is now an i m m e d i a t e consequence of Theorems 3.2 and 4.1. C o r o l l a r y 4 . 2 For SSU networks with n nodes and m arcs and lim fi'Ai) available, the parametric algorithm computes an optimal solution of problem (MCCFPi) in 0( \T\m\ogn ( n l o g n + m ) ) operations and at most | T | + 1 evaluations of the concave cost function / i . As t h e above bounds depend on n and m only (note \T\ < n — 1), (MCCFPi) can be solved in strongly polynomial t i m e . W h a t we have not exploited so far is t h a t in t h e SSU network case t h e parametric network flow problem (TPa) has a very special s t r u c t u r e and hence can be solved more efficiently t h a n t h e general problem. T h e key idea appears already in t h e proof of T h e o r e m 4.1 where it is shown how a solution t o (TPa) can be obtained by shortest p a t h computations (cf. formula (17)). T h e next section will deal with an efficient implementation of this idea.

5 5.1

A n I m p r o v e d A l g o r i t h m for S S U N e t w o r k s An Efficient Algorithm for a Special Parametric Shortest Path Problem

In t h e sequel it will be shown t h a t t h e parametric shortest p a t h problem t h a t occurred in t h e last section can be solved asymptotically in t h e same t i m e already needed for solving a shortest p a t h problem with non-parametric weights. For a G [0,1] consider t h e weighted digraph Ga = (N,A,d(a)) weights 1—a for j = 1 ^ CjCt for j = 2 , . . . , m .

with p a r a m e t r i c

Concave-Cost

Network

Flow Problems

with a Single Nonlinear

Arc Cost

139

T h e problem (SPa) is then to determine for each sink i £ T t h e shortest p a t h from t h e source node 1 to sink i with respect to t h e weights dj(a) or to find a negative cycle in Ga, where in t h e first case the shortest paths are to be chosen such t h a t they form a tree, t h e so-called shortest path tree. For t h e ease of exposition we introduce t h e class V of all paths containing arc 1 exactly £ times. T h e following observations follow now easily from t h e special structure of t h e weights dj(a): 1. If Ga contains no negative cycle, then for each sink i, t h e shortest p a t h from 1 to i either belongs to class V1 or to class V°, i.e. it either uses arc 1 exactly once or does not use this arc at all. 2. T h e r e exists a m a x i m u m value of a, say a*, such t h a t there are no cycles of negative weight in the graph Ga. (If Ga contains no negative cycle for all a € [0,1], we set a* = 1.) 3. Go contains no negative cycle as dj(0) = 0 for j = 2 , . . . , m and d$0) = 1. A closely related parametric shortest p a t h problem, (SP1$ for short, was treated by K a r p and Orlin [13] and subsequently by Young, Tarjan and Orlin [25]. Let A' C. A and ej € R for j 6 A and consider t h e following weights with A € R: l\\ WiW

- \ ei~ ~ \ ej

^

for j 6 A' for j e A \ A'.

Actually, our problem (SPa) can easily be transformed such t h a t it fits into the framework of (SPl\). (In this special case, t h e set A' contains only one arc, namely arc 1.) Let us therefore first give a brief overview of what is known for problem (SPl\). In [13] Karp and Orlin showed t h a t t h e optimal value function of this p a r a m e t r i c shortest p a t h problem has at most n(n — l ) / 2 = 0(n2) breakpoints. Furthermore, they developed two algorithms, t h e first, a direct approach, running in 0(n3) t i m e and t h e second, a specialized parametric simplex algorithm, running in 0(mn log n) time. Recently, Young et al. [25] improved t h e t i m e complexity of t h e latter approach to 0(mn + n 2 l o g n ) by using Fibonacci-heaps (see F r e d m a n and Tarjan [6]). These t i m e bounds rely however on the worst case assumption of 0(n2) breakpoints whereas (SPa) has only 0(n) breakpoints (cf. Theorem 4.1). In t h e following we will deal with t h e simplifications and improvements t h a t are possible for problem (SPa). Therein we assume some familiarity with t h e above algorithms. The parametric simplex method T h e p a r a m e t r i c simplex algorithm proceeds tree by tree. First we c o m p u t e an initial shortest p a t h tree To for an appropriately chosen starting value a = ctQ. In our case t h e choice a0 = 0 suggests itself. Assume now t h a t for a = a^ an optimal tree T)t is available. T h e n our task is to determine the next breakpoint 0^+1, where Tk ceases

140

B. Klinz and H. Tuy

to be a shortest p a t h tree, along with a shortest p a t h tree Tk+i optimal at some a > cth+i. It is definitely beyond t h e scope of this paper to describe in full detail how t h e transition from tree Tk to its successor Tfc+1 is m a d e in [25]. T h e m a i n idea, however, is rather simple: It is clear t h a t if Tk ceases to be a shortest p a t h tree, there is a node i which acquires a new shortest p a t h . Obviously in our case as long as a < a ' , this can h a p p e n only if t h e new shortest p a t h from 1 to i contains arc 1 while this arc was not contained in t h e old p a t h from Tk. Hence there are at most n — 1 p a t h changes since afterwards all shortest paths contain arc 1. Conceptually t h e parametric simplex algorithm proceeds as follows: For each arc j currently not in J \ , it computes a critical value a such t h a t the subgraph we obtain by inserting j into t h e tree and deleting t h e unique arc of Tj. which is directed into t h e head of j , has smaller weight t h a n T/, for a > a. T h e n an arc j * with m i n i m u m critical value is selected to enter the tree. If t h e resulting subgraph contains a cycle, we have found a* and t h e corresponding cycle whose weight is negative for a > a*. Otherwise, we obtain t h e next breakpoint otk+\ along with t h e next tree T^+i. To analyse this algorithm, note first t h a t t h e initial shortest p a t h problem for ao = 0 can be solved in 0(m) t i m e since only one arc has a weight different from zero. (In t h e general problem t h e same task needs 0(nm) t i m e since we m u s t be able to detect a negative cycle.) Second, as in our case there at most n — 1 p a t h changes as opposed to 0(n2) such changes for (SP1\), we can get rid of a factor of n in the t i m e taken by t h e algorithm after initialization. Based on t h e analysis in [25] we thus have obtained an 0 ( n log n + m ) algorithm for solving t h e p a r a m e t r i c shortest p a t h problem (SPa). This is t h e same t i m e as already needed for solving a single shortest p a t h problem with non-parametric weights. T h e direct, m e t h o d Let D[(a) denote t h e length of t h e shortest p a t h w.r.t t h e weights dj(a) t h a t starts in node 1, t e r m i n a t e s in node i and uses the special arc 1 exactly £ times. Further let S(i,i',a) designate t h e length of the shortest p a t h from i to i' not containing arc 1 and assume t h a t arc 1 = ( i i , ^ ) - T h e following l e m m a can now be easily established directly (it follows also from t h e results of Karp and Orlin [13]). L e m m a 5.1 (i) Let a < a*. Then we have £>f ( a )

= 6{1,H,

a) + £ • (1 -

a) + {I -

1) • 6{i2,

iua)

+ 6{i2,

i, a)

(18)

for all i £ N and 1 < £ < n. (it) If the graph G = (N,A\ {1}) contains a negative cycle then a' = 0. let W be such that <5(i 2 ,n,a) = Wa. Then 1/(1-WO

forW<\

1

forW>\.

Otherwise

{

'

Concave-Cost

Network

Flow Problems

with a Single Nonlinear

Arc Cost

141

P r o o f , (i): Let us start with £ = 1. Observe t h a t a shortest p a t h from 1 to i using arc (ii,i2) exactly once is obviously composed of a shortest p a t h Pi from node 1 to node i i , t h e arc (11,12) and a shortest p a t h P2 from node i2 to node i. (Of course neither Pi nor P2 contain arc 1.) As t h e weight of arc 1 is di(ot) = 1 — a , formula (18) is correct for £ = 1. T h e correctness for £ > 1 can be established by induction. J u s t note t h a t a shortest p a t h from 1 to i in class V1 can be split into a shortest p a t h from 1 to ii belonging to t h e class V1'1, t h e arc 1 = (21,22) and a shortest p a t h from i2 to i in class V°. (ii): In case t h a t G = (N, A \ {1}) contains a negative cycle, t h e n this cycle will be negative for all a £ (0,1]. Hence a* = 0. From (18) it follows t h a t Df> - L% = (/ 2 - / 1 ) • ((1 - a) +

S(i2,iua))

for 1 < £1 < £2 < n. T h e expression ((1 — a) + S(i2,ii,a)) equals however t h e weight of t h e cycle C which consists of arc 1 and t h e shortest p a t h from i2 to ii within t h e class V°. It follows from Theorem 2 of [13] t h a t a* is equal to t h e value of a for which t h e weight W(C, a) of C becomes 0. By t h e definition of W t h e weight of C is however equal to W(C, a) = (1 — a) + W • a. For W < 1, W(C, a) becomes negative for a > 1/(1 - W), whereas for W > 1, W(C) > 0 for all a € [0,1]. • Recall t h a t for a < a* t h e length of t h e shortest p a t h from node 1 to node i is given by mm{D°(a),D}(a)}. Hence it is an i m m e d i a t e consequence of L e m m a 5.1 t h a t t h e direct m e t h o d solves (SPa) essentially by solving two single source shortest p a t h problems, one with source 1 (to c o m p u t e 6(1, i, a) for i G iV) and t h e other with source i2 (to determine 6(i2,ii,a)). In case t h a t no negative weights are involved in these shortest p a t h computations (i.e. for Cj > 0 for all arcs j = 2, . . . , m ) , we get an 0 ( n l o g 7 i + m) algorithm, thus the same t i m e bound as we already obtained for t h e p a r a m e t r i c simplex approach. In the general case we need 0(nm) steps to solve (SPa)- One advantage of the direct approach over t h e parametric simplex m e t h o d is however t h a t it might be easier to implement.

5.2

An Improved SSU Implementation

From t h e results above it follows t h a t Corollary 4.2 can be strengthened as follows: C o r o l l a r y 5.2 For SSU networks with n nodes and m arcs and lim fi'+(t) available, the parametric algorithm can be implemented to solve problem (MCCFPi) in 0(n log n + rn) operations and at most \T\ + 1 evaluations 0 / / 1 . C o m p a r i s o n w i t h t h e a l g o r i t h m of G u i s e w i t e a n d P a r d a l o s To obtain additional insight into t h e problem structure, let us briefly compare our p a r a m e t r i c algorithm for t h e SSU case and t h e algorithm of Guisewite and Pardalos [8], henceforth called GP algorithm for short. Though obtained by fundamentally

142

B. KHnz and H.

Tuy

different approaches, our algorithm and t h e GP algorithm share several interesting properties. Both are two-phase approaches where in t h e first phase a set of up to | T | -f 1 candidate flows is determined and in the second phase t h e objective function is minimized over this candidate set only. T h e essential difference lies merely in t h e m e t h o d used for constructing the candidate set. Define T" C T to contain all sinks i for which the breakpoint of the optimal value function <j>i{ct) = min{.D°(a),.D; (a)} (cf. also Section 4) is contained in t h e interval [0,1]. (These are t h e sinks for which a p a t h change occurs in t h e course of t h e p a r a m e t r i c simplex algorithm.) W h a t our algorithm and t h e GP algorithm have in c o m m o n is t h a t for b o t h there exists an ordering tk1,tk7i • • • > *fc, of t h e set T", q = \T'\, such t h a t t h e set of candidate flows used in t h e second phase can be described as follows: (i) For t h e first candidate flow we send &,- units of flow from t h e source 1 to each sink i along t h e shortest p a t h in t h e class V° whenever this class contains at least one p a t h from 1 to i. For sinks i with no such p a t h , t h e shortest p a t h from 1 to i in t h e class V1 is used instead, (ii) T h e r - t h candidate flow is obtained from t h e (r — l)-st flow by rerouting t h e flow to be sent to sink £*, along t h e shortest p a t h from 1 to tkr belonging to class V1. Hence more and more flow is sent on arc 1. In our algorithm t h e ordering of T' is chosen such t h a t t h e breakpoints of t h e functions <j>i(a) , i € T", are in increasing order. As t h e weights in t h e p a r a m e t r i c shortest p a t h problem (SPa) do not depend on the single nonlinear concave cost function / i , this ordering is independent of / j . In t h e GP algorithm t h e ordering of t h e set T" is obtained by a binary search approach which in contrast to t h e m e t h o d described in this paper depends on t h e cost function f\. (Actually, Guisewite and Pardalos construct an ordering of the whole set T and not only of X", but it can easily be shown t h a t this is not necessary since for sinks not in T", arc 1 will never become beneficial in t h e sense of Guisewite and Pardalos.) Hence, in general t h e orderings obtained by t h e GP algorithm and by our algorithm will differ. T h o u g h in [8] no tight analysis is given, it is easy to see t h a t t h e GP algorithm can be implemented to run in 0(S(n,m) + |T|log_R + | T ] l o g | T | ) operations and 0 ( | T | l o g - R ) evaluations of / j , where R is t h e total d e m a n d , thus R — YlieT^i, and S(n, m) denotes again t h e t i m e needed to solve a single source shortest p a t h problem. (Thus S(n,m) = O(nlogn + m) or S(n,m) = 0(nm) depending on whether or not t h e given arc costs Cj are restricted to be nonnegative.) Note t h a t b o t h the number of operations and t h e number of evaluations of / i depend on R and are thus not strongly polynomial.

Concave-Cost Network

6

Flow Problems

with a Single Nonlinear

Arc Cost

143

Summary and Concluding Remarks

In this paper we considered (MCCFPi), t h e m i n i m u m concave-cost flow problem when all arcs except one have linear cost. By specializing a m e t h o d recently developed for rank two quasiconcave minimization problems, we derived a pseudo-polynomial t i m e algorithm for this problem. For a network with n nodes and m arcs, our algor i t h m runs in 0( Rm log n (n log n + m ) ) t i m e and performs at most R evaluations of t h e single nonlinear arcs cost function, where R is an upper bound on t h e flow on t h e special arc (i.e. R = min{ u j , Y^ieT ^« })• I n t h e case of a single source uncapacitated network with \T\ sinks we described a strongly polynomial t i m e implementation requiring 0 ( n l o g n + m) operations and \T\ + 1 evaluations of t h e single nonlinear arc cost function. T h e r e are, however, still quite a few issues t h a t deserve further attention. Let us briefly mention some of t h e m . (1) Since for general networks our algorithm to solve (MCCFPi) is only pseudopolynomial, t h e question on t h e complexity status of this problem arises — is it N P - h a r d or can a polynomial t i m e algorithm be developed? (2) In (MCCFPi) we allowed only one arc to have a nonlinear cost. A natural generalization would be to consider k > 2 arcs with nonlinear cost for fixed k. In two forthcoming papers it will be shown how t h e techniques developed in this paper can be generalized to solve t h e more general problem efficiently. It turns out t h a t for fixed k t h e SSU network case can again be solved in strongly polynomial t i m e . (3) It appears to be a challenging research question to investigate to what other types of combinatorial optimization problems t h e dimension reduction technique based on the rank two and rank K properties respectively could be applied successfully. (4) Are there any other interesting special cases of (MCCFPi) apart from t h e SSU network case for which t h e number of breakpoints of t h e optimal value function 4>(a) is bounded polynomially in the number of nodes of t h e given network? A c k n o w l e d g e m e n t : This work was initiated during Hoang Tuy's stay at t h e Technical University of Graz in t h e W i n t e r t e r m 1991. B e t t i n a Klinz acknowledges partial financial support by t h e Christian Doppler Laboratorium "Diskrete Optimierung" and by t h e Fonds zur Forderung der wissenschaftlichen Forschung, Project P8971-PHY.

References [1] R.K. Ahuja, T. Magnanti and J . B . Orlin, Network Flows, in: G.L. Nemhauser et al. (eds.), Handbooks in OR & MS, Vol. 1, (North Holland, A m s t e r d a m , 1989) p p . 211-369. [2] P.J. Carstensen, Complexity of some parametric integer and network programming problems, Mathematical Programming 5 (1983) 64-75.

144

B. Klinz and H. Tuy

[3] M.J. Eisner and D.G. Severance, Mathematical techniques for efficient record segmentation in large shared databases, Journal of the Association for Computing Machinery 2 3 (1976) 619-635. [4] R.E. Erickson, C.L. M o n m a and A . F . Veinott, Send-and-split m e t h o d for minimum-concave-cost network flows, Mathematics of Operations Research 12 (1987) 634-664. [5] F . Forgo, T h e solution of a special quadratic problem, (in Hungarian), (1975) 53-59.

Szigma

[6] M.L. Fredman and R.E. Tarjan, Fibonacci heaps and their uses in improved network optimization algorithms, Journal of the Association for Computing Machinery 3 4 (1987) 596-615. [7] R. Gabasov and F . M . Kirillova, Linear Programming problems) (in Russian) (Minsk, 1980).

Methods,

P a r t 3 (special

[8] G.M. Guisewite and P.M. Pardalos, A polynomial t i m e solvable concave network flow problem, Technical Report CS-90-48, D e p a r t m e n t of C o m p u t e r Science, Pennsylvania State University, PA (1990), s u b m i t t e d to Networks. [9] P.M. Guisewite and P.M. Pardalos, M i n i m u m concave-cost network flow problems: applications, complexity and algorithms, Annals of Operations Research25 (1990) 75-100. [10] D. Gusfield, Sensitivity analysis for combinatorial optimization, U C N / E R L M80/22, Research M e m o r a n d u m , Electronics Research Laboratory, Berkeley, CA (1980). [11] D. Gusfield, P a r a m e t r i c combinatorial computing and a problem of program module distribution, Journal of the Association for Computing Machinery 3 0 (1983) 551-563. [12] D.S. Hochbaum and J.G. Shantikumar, Convex separable optimization is not much harder t h a n linear optimization, Journal of the Association for Computing Machinery 3 7 (1990) 843-862. [13] R.M. K a r p , R.M. and J . B . Orlin, P a r a m e t r i c shortest p a t h algorithms with an application to cyclic staffing, Discrete Applied Mathematics 3 (1981) 37-45. [14] H. Konno and T. Kuno, Linear Multiplicative Programming, Mathematical gramming 5 6 (1992) 51-64. [15] K.G. Murty, Linear Programming

(John Wiley k, Sons, New York, 1983).

Pro-

Concave-Cost

Network

Flow Problems

with a Single Nonlinear

Arc Cost

[16] A.S. Nemirowsky and D.D. Yudin, Problem Complexity and Method in Optimization (John Wiley & Sons, New York, 1983).

145 Efficiency

[17] J . B . Orlin, A faster strongly polynomial m i n i m u m cost flow algorithm, 20-th Annual Symp, Theory of Computing (1988) 377-387.

Proc.

[18] P.M. Pardalos, Polynomial t i m e algorithms for some classes of constrained nonconvex quadratic problems, Optimization 2 1 (1990) 843-853. [19] R . T . Rockafellar, Convex Analysis (Princeton University Press, Princeton, 1970). [20] G. R u h e , Algorithmic Aspects of Flows in Networks, M a t h e m a t i c s and Its Applications, Volume 69 (Kluwer Academic Publishers, Doortrecht, 1991). [21] P.T. T h a c h , R.E. Burkard and W . Oettli, Mathematical programs with a twodimensional reverse convex constraint, Journal of Global Optimization 1 (1991) 145-154. [22] H. Tuy and B . T . Tarn, An efficient solution m e t h o d for rank two quasiconcave minimization problems, Optimization 2 4 (1992) 43-56. [23] H. Tuy, T h e complementary convex structure in global optimization, Global Optimization 2, 21-40. [24] H. Tuy, Polyhedral annexation, dualization and dimension reduction technique in global optimization, Journal of Global Optimization 1 (1991), 229-244. [25] N . E . Young, R.E. Tarjan and J . B . Orlin, Faster parametric shortest minimum-balance algorithms, Networks 2 1 (1991) 205-221.

path

[26] N. Zadeh, A bad network problem for the simplex m e t h o d and other m i n i m u m cost flow algorithms, Mathematical Programming 5 (1973) 255-266. [27] W . Zangwill, M i n i m u m concave cost flows in certain networks, Science 14 (1968) 429-450.

Management

147 Network Optimization Problems, pp. 147-167 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

A Method for Solving Network Flow Problems with General Nonlinear Arc Costs Bruce W . Lamar Department of Management,

University

of Canterbury,

Christchurch,

New

Zealand

Abstract General nonlinear network flow problems concern the minimization of costs over networks involving arcs with arbitrary cost functions. Finding the optimal solution to such problems is challenging because the arc cost functions are, in general, neither convex nor concave. This paper presents a procedure for converting any network with general nonlinear arc cost functions into an equivalent network containing only concave arc cost functions. This conversion procedure permits established solution methods for concave minimum cost network flow problems to be applied to networks with arbitrary arc cost functions.

1

Introduction

Single commodity network flow problems can be classified according to t h e complexity of their solution procedure. T h e salient feature of this taxonomy is t h e form of t h e arc cost function. Accordingly, we consider four classes of m i n i m u m cost network flow problems ( M C N F P ' s ) based on the arc costs: (i) linear, ( u ) convex, (Hi) concave, and (iv) general nonlinear. Linear M C N F P ' s have constant marginal arc costs. Because linear M C N F P ' s are a special form of linear program, this class of problems can be solved in polynomial time. Moreover, basis updating can be performed using simple list processing techniques [13]. T h u s , linear M C N F P ' s can be solved very efficiently. Algorithms and applications for linear M C N F P ' s are discussed in [1, 4, 15, 20]. Convex M C N F P ' s have nondecreasing marginal arc costs. This class of problems involves t h e minimization of a convex function over a convex feasible region. T h u s ,

148

B. W.

Lamar

a local o p t i m u m is also a global o p t i m u m . Iterative procedures, wherein each subproblem is a linear M C N F P , are used to solve convex M C N F P ' s . This means t h a t , although convex M C N F P ' s are harder to solve t h a n linear M C N F P ' s , large scale convex M C N F P ' s (on t h e order of 1000's of arcs) are still tractable. For convex M C N F P algorithms and applications see [12, 21]. Concave M C N F P ' s have nonincreasing marginal arc costs. Concave M C N F P ' s are considerably harder to solve t h a n either linear M C N F P ' s or convex M C N F P ' s . This increased complexity arises from t h e fact t h a t — a l t h o u g h an optimal solution (if one exists) always occurs at an e x t r e m e point of t h e feasible region—identification of t h e optimal point requires, in the worst case, a complete enumeration of all t h e e x t r e m e points in t h e feasible region. In fact, the concave M C N F P is known to be N P - h a r d [5]. So, except for special cases, exact algorithms for concave M C N F P ' s run in exponential t i m e . For m o d e r a t e sized problems (involving 100's of arcs), exact procedures—using branch and bound or dynamic programming techniques—are available. Large scale problems, however, require heuristic m e t h o d s . Recent work on concave M C N F P ' s is contained in [7, 8, 9, 10, 11, 14, 16, 19]. See Guisewite and Pardalos [6] for a comprehensive survey of algorithms and applications for concave M C N F P ' s . Lastly, general nonlinear M C N F P ' s allow for arbitrary arc cost functions including functions t h a t are neither concave nor convex J . Because of t h e arbitrary n a t u r e of t h e arc cost functions, general nonlinear M C N F P ' s are t h e most difficult of the four classes of M C N F P ' s to solve optimally. Yet, m a n y practical problems involving t h e t r a n s p o r t of people, commodities, or information have a general nonlinear arc cost structure. For example, consider the transport of passengers on vehicles with limited capacity. T h e costs are typically dominated by fixed charges (such as driver cost, insurance, fuel charges) t h a t are proportional to t h e n u m b e r of vehicles (not t h e n u m b e r of passengers). T h u s , if arc flow is in units of passengers, t h e arc cost function is in t h e form of a "staircase" as shown in Figure 1. For a second example, consider t h e distribution system of a wholesaler. T h e costs in t h e distribution network may typically include setup costs and quantity discounts. Larger volume flows on an arc will have lower marginal costs. But there will be discontinuities at t h e volume "breakpoints" making t h e arc cost functions neither concave nor convex. Instead, this cost structure will result in a "sawtooth" arc cost function of t h e form shown in Figure 2. See [3] for a discussion of quantity discounting and other forms of discontinuity in cost functions. T h e focus of this paper is on t h e last—and most challenging—class of M C N F P ' s . T h e paper is organized as follows. Section 2 formulates t h e general nonlinear M C N F P as a nonlinear integer programming problem and describes an equivalent formulation of t h e problem as a concave M C N F P . T h e equivalence between t h e two problems is based on converting each arc with an arbitrary cost function in t h e original problem 'Note that a general nonlinear MCNFP is different than a "generalized MCNFP". A "generalized MCNFP" refers to a network with flow losses or gains and either linear or convex arc cost functions. "Generalized MCNFP's" are not discussed in the paper.

Network Flow Problems with General Nonlinear Arc Costs

<j>(x 12 8 4

*-

0 0

5

10

15

Figure 1: "Staircase" Arc Cost Function.

100

50

10

20

30

Figure 2: "Sawtooth" Arc Cost Function.

149

B. W.

150

Lamar

Arc i, 1 -.

Arc i,0

O

L

Arc i,2

"O

—

V

Arc i, q{

*

J

Figure 3: Arcs in Expanded Network.

into an arc with a concave cost function in series with a set of parallel arcs, each with a linear (and therefore concave) arc cost function. This conversion procedure is described in Section 3. Section 4 illustrates this procedure with two numerical examples. Finally, Section 5 summarizes t h e paper.

2

Problem Formulation

In this section we formulate two problems, denoted P and Q. Problem P is the general nonlinear M C N F P . Problem Q is equivalent to Problem P. However, Problem Q is a concave M C N F P . To make t h e two problems equivalent, each arc in Problem P is replaced with an arc in series with a set of parallel arcs, as shown in Figure 3. T h u s , although Problem Q involves a larger network, it can be solved using established solution techniques for concave M C N F P ' s .

2.1

General Nonlinear MCNFP Formulation

To formulate Problem P, let • A denote t h e (directed) arc set with generic element i; • N denote t h e node set with generic element n; • Ln denote t h e set of arcs leaving node n; • En denote t h e set of arcs entering node n; • Xi denote t h e flow carried on arc i; • U{ denote t h e flow capacity for arc i; • bn denote t h e net supply at node n (if bn > 0 then n is supply node; if bn < 0 then n is a d e m a n d node);

Network

Flow Problems

with General Nonlinear

Arc

Costs

151

• <j>i(xi) denote t h e general nonlinear cost function for arc i defined over t h e domain of a:, (i.e., [0, u,]). T h e decision variables in t h e problem are t h e set of arc flows {a;,}. Using this notation, Problem P (the general nonlinear M C N F P ) is formulated as follows: Problem P: z* = g l o b a l m m ^ ^ , - ( x , - )

(1)

subject to:

53 *i - ^2 Xi = bn for a11 " G N i€L„

(2)

.'g£„

0 < x{ <

Ui

for a l i i € A

Xi integer for all i G A

(3)

(4)

T h e objective function (1) minimizes costs. Constraints (2) represent t h e conservation of flow at each node. Constraints (3) restrict t h e flow decision variables { i , } to feasible values on each arc. Because of t h e arbitrary form of t h e objective function, t h e optimal value of t h e decision variables will, in general, not b e integer-valued unless explicitly constrained as in (4).

2.2

Equivalent Concave M C N F P Formulation

We refer to t h e network for Problem P as t h e "original network". To construct Problem Q, we replace each arc i in arc set A in t h e original network with an arc in series with a set of parallel arcs (again, refer to Figure 3). We refer t o t h e network for P r o b l e m Q as t h e "expanded network". For each arc i in t h e original network, we refer to t h e series arc in t h e expanded network as arc i, 0. Also, for each arc i in t h e original network, let q b e t h e index of t h e set of parallel arcs in t h e expanded network and let <j; be t h e number of parallel arcs. T h e parallel arcs are denoted as arcs i , l through i,q~i. Let a:,- be t h e flow on arc z,0 and let a:,-,, be t h e flow on arc i, q for q = 1,. .. ,qi. In addition, let u; be the capacity of arc i, 0; and let u;>g be t h e capacity of arc i,q for q — 1,. . . ,g,-. We define r,'(x;) to b e the cost function for arc z,0; and we define St,g(£i,?) to be t h e cost function for arc i,q for q = 1 , . . . ,<j;. By construction, r ^ i , ) will be a piecewise-linear-concave function and each ^(a:,-,,) will be a linear (and thus concave) function for q = 1 , . . . , q~i. T h e decision variables in Problem Q are t h e arc flows in t h e expanded network; t h a t is, {xi} and {x;,,}. We now formulate Problem Q (the concave M C N F P equivalent of Problem P) as follows:

152

B. W.

Lamar

Problem Q: z* = globalmin J2 ri{xi) i£A

+ 2 ^ 5 3 si,q(xi,i)

(5)

i£A 9 = 1

subject to:

J2 x> - J2 xi = 6" f o r all n G iV i£L„

(6)

.££„

H = Yl xi
(7)

9=1

0 < Xi < Ui for all i € A

(8)

0 < xi,q < ui,q for all i € .A, 17 = 1 , . . . , 9,-

(9)

T h e objective function (5) minimizes costs in the expanded network. Constraints (6) and (7) ensure t h a t flow is conserved at all nodes in t h e expanded network. Constraints (8) and (9) allow only feasible flows on t h e arcs in t h e expanded network. Note t h a t if t h e net supplies {&„} and t h e capacities {u,} and {u,-,,} are integer valued, then t h e integer constraints (4) are not required in Problem Q. This is because every e x t r e m e point of t h e feasible region will be integer valued; and t h e minimal value of a concave function over a convex region will always occur at an e x t r e m e point [2]. In other words, Problem Q is a concave minimization problem whereas Problem P is a general nonlinear integer programming problem. Because of the equivalence between t h e two problems, t h e optimal solution {x*} to Problem Q will also be optimal in P r o b l e m P; and t h e optimal objective function value, z*, will be t h e same for t h e two problems. However, because Problem Q is a concave M C N F P , it can be solved by established solution techniques for this class of problems (see [6]). Note t h a t some concave M C N F P algorithms (e.g., [22]) assume t h a t the arc cost functions are continuously differentiable. In such cases, t h e piecewise-linear-concave function ^ ( i , ) can be approximated by a continuously differentiable function, denoted fi(xi). We let Problem Q denote t h e approximation of Problem Q in which the s u m m a t i o n of r;(a:;) is objective function (5) is replaced by the s u m m a t i o n of fi(xi). T h e next section describes a general procedure for determining t h e cost functions {r,(a:,')} and {•s1-]?(xt-l,)} and the capacities {ui,q} in t h e expanded network of Problem Q (and t h e approximate functions f;(x,) in Problem Q).

3

Conversion Procedure

In this section, we demonstrate how an arc in t h e original network with a general nonlinear cost function can be converted into a set of arcs in t h e expanded network

Network

Flow Problems

with General Nonlinear

Arc

Costs

153

consisting of an arc with a concave cost structure in series with a set of parallel arcs, each with a linear arc cost function. Figure 4 provides an overview of t h e five steps involved in t h e conversion procedure. We point out, however, t h a t these steps are for conceptual purposes only. T h a t is, they are intended solely to explain t h e rationale behind t h e conversion procedure. T h e actual mechanics of converting an arc in the original network with a general nonlinear cost structure into t h e form described above are summarized in Subsection 3.6. In this section we consider t h e conversion procedure for a generic arc i contained in t h e original arc set A. Because it is understood t h a t in this section we are always referring to a given arc i, t h e subscript i is omitted on all coefficients and variables defined and used in this section. T h u s , we let x be t h e flow on arc i, u be the flow capacity for arc i, and {x) be t h e general nonlinear cost function for arc i. We allow <j)(x) to be any possible functional form. However, we restrict out attention to flows x on arc i t h a t are integer-valued (as required by constraints (4)).

3.1

Step 1

In t h e first step of t h e conversion procedure, we observe t h a t , without loss of generality, we can replace [x) for x integer. Note t h a t , like (x), t h e domain of f(x) for arc i is [0,u]. We now divide this domain into contiguous intervals and we divide each interval into contiguous segments. Let j be t h e index for t h e intervals and let k be t h e index for t h e segments. Also let j denote t h e number of intervals for arc i and let kj denote t h e n u m b e r of segments in t h e j - t h interval for arc i. In addition, let rrijt\ and m j + i j denote, respectively, the left and right endpoints for t h e j - t h interval for arc i; and define t h e set Ij as Ij = {x : mJtl

< x < mj+i,i}

(10)

W i t h this notation, we define the intervals for arc i such t h a t t h e number of segments j is as small as possible given t h a t t h e following two conditions are satisfied: • If j is odd, t h e n f(x)

is a piecewise-linear-concave function for x G IJ;

• If j is even, then f(x)

is a piecewise-linear-convex function for x 6 Ij.

For t h e J - t h interval of arc i, let rrij^ and rrij^+i denote, respectively, t h e left and right endpoints of t h e fc-th segment. Note t h a t , because t h e intervals and segments are contiguous, the right endpoint of t h e last segment of the j - t h interval always equals the left endpoint of t h e first segment of t h e (j + l)-st interval. T h a t is, Tij £•+! = mj+i,!. Moreover, the left endpoint of first segment of t h e first interval of arc i is always zero and t h e right endpoint of the last segment of t h e last interval of arc i is always equal to t h e capacity u. T h a t is, m ^ i = 0 and m j j . + 1 = m j + 1 1 = u. To specify t h e segments for t h e j - t h interval, we define t h e set Ij^ as

154

B. W. Lamar

StepO:

O

M

Step 1:

O

—

Step 2:

Step 3:

O

-O

QJ°l(*LoJM+0-...-^yjM^0

O

^

O

^

-O

si(xi) r x

Step 4:

O

()

Step 5:

O

r(x)

s

X

2{x2)

S

2JX^

tX

V

s

l{xg)

Figure 4: Steps in Conversion Procedure.

J

Network

Flow Problems

with General Nonlinear

Arc

Costs

Ij,k = {x : m i i t < x < m j i t + 1 }

155

(11)

T h e segments of interval j for arc i are defined such t h a t kj, t h e n u m b e r of segments for t h e j-th interval, is as small as possible given t h a t f(x) is linear for each x £ Ij,k-

3.2

Step 2

Having defined t h e intervals and segments for arc i, we can now describe t h e second step in t h e conversion procedure (again, refer to Figure 4). In the second step, we replace t h e single arc i with a series of j arcs, one associated with each interval for arc i. We refer to t h e j-th arc in this series as arc i,j and we let gj(x) denote t h e cost function for arc i,j. To specify gj(x), let Ajtk denote t h e slope of t h e (linear) function f(x) for x 6 Ij,k and let Aj denote t h e slope of f(x) in t h e kj-th (i.e., last) segment of interval j . We define A 0 = 0. Note t h a t gj(x) must be defined for x in the domain [0,u], not just [mj,!,771^+1,1]. T h u s , for each arc i,j we define two additional segments—the zero-th segment and the (k3 + l)-st segment. For each arc i, j , we also define t h e sets Ij:o and 7 j j + 1 as lj,o = {x:0<x
!j,h+i

=

(12)

(x ' m ^+i.i ^x
(13)

We now specify t h e cost function gj(x) for arc i,j recursively as g}(x) = 0 for all x <E I]fi

gj(x) = gj{mjik)

9iix)

+ (Aj:k - Aj-i)

= 9j(mJ+i,i)

• (x - mjik)

(14)

for all x € Ihk, k = 1 , 2 , . . . , kj

+ (&i ~ A , - i ) • (x -

roj+i,i)

for all x € 7 i i S j . + 1

(15)

(16)

Note t h a t , for _;' odd, gj(x) is a piecewise-linear-concave function for x in t h e domain [0,tx]. Similarly, for j even, gj(x) is a piecewise-linear-convex function for x in t h e domain [0, u]. T h e logic underlying the specification of the cost functions gj{x) is as follows. For flows x in t h e first interval, i.e., for x £ 7 1 ; the function gi(x) is the same as t h e function f(x) and the other arc cost functions g2(x),. . . ,g~j(x) all have zero cost. For flows x in the second interval, i.e., for x 6 I2, t h e sum of t h e functions g\(x) plus g2(x) is the same as the function f(x) and the remaining functions gs(x),... ,gj(x) all have zero cost. For flows x € 73, t h e sum of g\(x) plus g2{x) plus 53(1) equals f(x) and t h e remaining functions 5 4 ( 1 ) , . . . ,g~j{x) are all zero; and so on. T h u s , for any x between zero and u, t h e cost of sending x units of flow through t h e series of arcs i, 1

156

B. W.

Lamar

through i,j is t h e same as t h e cost of sending t h e same amount of flow through t h e single arc i.

3.3

Step 3

In t h e third step of t h e conversion process, we simply note t h a t t h e order of the arcs in t h e series i, 1 through i,j is u n i m p o r t a n t . This means t h a t we may equivalently rearrange t h e order of t h e series of arcs such t h a t t h e arcs i,j for j odd are to t h e left of t h e arcs i,j for j even. Furthermore, we can recombine t h e arcs i, j with j odd into a single arc. We refer to this recombined arc as arc i, 0. Let r(x) denote t h e cost function for arc i, 0 where

r(x)=

£

9j(x)

(17)

j=l,3,...

Observe t h a t r{x) is the sum of piecewise-linear-concave functions and so is itself a piecewise-linear-concave function. Moreover, the endpoints of t h e linear segments of r(x) are given by rrij^ for k = l,...,kj and j odd. However, to simplify notation, let p be t h e index of t h e linear segments of the function r ( x ) , let p be t h e n u m b e r of segments, and let vp and u p + 1 be the left and right endpoints of t h e p-th segment. In a similar fashion, we can replace t h e arcs i, j for j even with a single arc. Let s(x) denote t h e cost function for this recombined arc where

*(*)=

£

9j(*)

(18)

j"=2,4,...

Here, s(x) m e n t s are by letting segments,

3.4

is a piecewise-linear-convex function and the endpoints of t h e linear seggiven by rrij^ for k = 1 , . . . , kj and j even. Once again, we simplify notation q be t h e index of t h e linear segments of s(x), letting q be t h e n u m b e r of and letting wq and w,+i be t h e left and right endpoints of the q-th segment.

Step 4

In t h e fourth step of t h e conversion procedure, we use t h e established m e t h o d of converting a piecewise-linear-convex arc cost function with q linear segments into a set of q parallel arcs, each with a linear cost function (see [12, p. 80]). We denote the q-th parallel arc as arc i,q, we let xq denote t h e flow on arc i,q, and we let s , ( x g ) be the (linear) cost function for arc i,q. T h e functional form of sq(xq) is s , ( x , ) = Sq • xq where Sq is t h e slope of t h e (7-th linear segment of t h e piecewise-linear function We let uq denote t h e flow capacity for arc i, q. Here, uq is given by

(19) s(x).

Network

Flow Problems

with General Nonlinear

Arc

Costs

157

uq = wq+1 - wq

3.5

(20)

Step 5

U p t o this point, t h e conversion procedure we have described is exact. T h a t is, solving a network with each arc i of t h e form shown in Step 0 in Figure 4 (i.e., t h e original network of Problem P) is identical to solving a network in which each arc i is replaced with t h e set of arcs shown in Step 4 (i.e., t h e expanded network of Problem Q). As mentioned at t h e end of Section 2, however, it may be desirable to approximate r(x) with a continuously differentiable concave function, denoted f(x). T h u s , t h e (optional) fifth step in t h e conversion procedure is to use an appropriate curve fitting technique to approximate r(x) by f(x). T h e n f(x) is used as t h e cost function for arc i , 0 in t h e expanded network for Problem Q.

3.6

Computational Summary

We conclude this section by pointing out t h a t t h e form of t h e functions r(x) and sq(xq) in t h e expanded network (i.e., Step 4) can be computed directly from t h e general nonlinear function <j>(x) in t h e original network (i.e., Step 0). To describe these computations, let 9L(x) and 6R(x) denote, respectively, t h e slope of t h e piecewiselinear function f(x) to t h e left and right of x for x = 1 , 2 , . . . ,u — 1. These slopes are computed directly from t h e general nonlinear function (x) as 9L(x)

= 4>(x) -4>{x-

0R{x)

= <)>(x + 1) - 4>{x)

1)

(21) (22)

In addition, let 8(x) denote t h e difference between t h e right and left slopes. T h a t is, 6(x) = 6R(x)

- 6L{x)

(23)

To determine t h e intervals j and segments k, and t h e associated endpoints rtij^, we initially set j <— 1, k <— 1, and m ^ i «— 0. T h e n , for x sequentially set to 1,2,. . . ,u — 1 (as in a "DO-loop") we perform t h e following four tests: • If j is odd and 9(x) < 0, then set rrij^+i «— x and increment k <— k -\- 1; • If j is odd and 9(x) > 0, then set mj^+\ <— x, set m j + i j <— x, and set kj <— k; t h e n increment j *— j + 1 and reset k *— 1; • If j is even and 9(x) > 0, t h e n set m^k+i <— x and increment k <— k -f 1; • If j is even and 6(x) < 0, then set m ^ i + i <— x, set m,j+iti then increment j <— j + 1 and reset k <— 1;

<— x, and set kj <— k;

B. W.

158

Lamar

Note t h a t when 6(x) = 0 t h e conditions of none of t h e four tests are satisfied. After performing these four tests for x = 1,2, ...,u — 1, we t h e n set j <— j , k3 <— k, mj^+i <— u and rrtj+iti <— u. To c o m p u t e t h e piecewise-linear-concave function r ( x ) , t h e numerical value of t h e slopes Aj j , and A j used in eqs. (14) through (16) are obtained from t h e slopes 9L(x) and 8R(x) given in eqs. (21) and (22). Specifically,

Ajik

= eR(mjik)

A,- = eL(mHhl)

(24)

(25)

T h e n t h e segments k = 1 , . . . , kj for j odd are reindexed as p = 1,. . . ,p and t h e endpoints vp are set equal to mj,* for the appropriate j and k. In a similar way, to compute the linear functions sq(xq), t h e segments k — 1,. . . , kj for j even are reindexed as q = 1 , . . . , q, t h e slopes 6q in eq. (19) are determined from Aj,* and A j using eqs. (24) and (25), t h e endpoints wq are set equal to rrij^ for t h e appropriate j and k, and t h e capacities uq are computed using eq. (20). T h e next section illustrates t h e conversion procedure with two numerical examples.

4

Numerical Examples

In this section, we apply t h e conversion procedure described in Section 3 t o t h e "staircase" arc cost function shown in Figure 1 and the "sawtooth" arc cost function shown in Figure 2. To illustrate the technique, t h e conversion steps are shown in full for t h e "staircase" function, but just briefly summarized for t h e "sawtooth" function. As in Section 3, because we are referring to a given arc i in the original arc set A, we omit t h e subscript i in t h e notation in this section.

4.1

"Staircase" Function

For t h e function in Figure 1, we have

4>(x) 12

for for for for

x = 0 0 < x < 5 5 < x < 10 10 < x < 15

Network

Flow Problems

with General Nonlinear

Arc

Costs

159

m

_y~ M

12 -

4 i 0

h

0

10

15

Figure 5: Piecewise-Linear Equivalent for "Staircase" Function.

We assume 4>(x) is defined over t h e domain [0,15]; i.e, u = 15. T h e function t h e piecewise-linear-continuous equivalent of
/(*) =

4•x 4 4 + 4 • (x - 5) 8 8 + 4 • (x - 10) 12

f(x),

for 0 < x < 1 for 1 < x < 5 for 5 < x < 6 for 6 < x < 10 for 10 < x < 11 for 11 < x < 15

and is shown in Figure 5. T h e domain [0,15] is divided into five intervals; i.e., j — 5. T h e y are 7i = {x : m ^ i <

x

< m 2 ,i} = {x : 0 < x < 5}

7 2 = {x : rre2,i < ^ < "^s.i} = {a: : 5 < x < 6}

I3 = {x : m 3 ] i < x < m4,i} = {x : 6 < x < 10}

I4 = {x : m4il

< x < m5,i] =

{x:10<x
7 5 = {x : m 5 ] i < x < m 6 i i } = {x : 11 < x < 15} T h e first interval contains two segments; i.e., fci = 2. They are 7 X i = {x : mi,i < x < m i j 2 } = {x : 0 < x < 1} 7i,2 = {x : m i ^ < x < m 1 ) 3 ] = {x : 1 < x < 5}

160

B. W.

Lamar

Each of t h e other intervals contains exactly one segment. T h a t is, I2,i = h, h,i = ^3, I*,\ = -f-4) and 75]i = 7 5 . For each j = 1 , . . . ,j, we also define t h e sets J,-i0 and /,-{•+! as follows: A,o = {x • 0 < x < mi,i]

= {x : 0 < x < 0}

^1,3 = {a; : rri2,\ < x < u} = {x : 5 < x < 15}

^2,0 = {x : 0 < 3; < rri2,i} = {x : 0 < x < 5}

h,2 — {x : m3,i < x < u } = { x : 6 < x < 15}

^3,0 = {x : 0 < x < m 3 ] i } = {x : 0 < x < 6}

^3,2 = {x : r?i4a < x < u} = {x : 10 < x < 15}

hp

= {x : 0 < x < m 4 a } = {x : 0 < x < 10}

^4,2 = {x : rasa < x < u} = {x : 11 < x < 15}

^5,o = {x : 0 < x < m 5 a } = {x : 0 < x < 11} ^5,2 — {x '• m^i

< x < u } = {x : 15 < x < 15}

Applying eqs. (24) and (25), t h e slopes A]tk and Ay are as follows:

A1.1 Ai,2 A2a A3a A4a Asa

= = = = = =

4 0 4 0 4 0

Ao Ar A2 A3 A4 A5

Using t h e slopes and intervals given above, t h e functions gj(x) through (16) are as follows:

defined in eqs. (14)

Network

Flow Problems

with General Nonlinear

, , _ ] 4 - x > " 4

9^x>

~ |

0 4 • (x - 5)

for 0 < x < 6 for 6 < x < 15

0 4 • (x - 10)

for 0 < x < 10 for 10 < x < 15

' 0 5sW- i ^

4

.(

Recombining t h e functions gj(x) ()

x

161

for 0 < x < 5 for 5 < x < 15

fO - 4 • (a; - 6)

9t{x) ^

r x

Costs

for 0 < x < 1 for 1 < x < 15

9l[X

92{x) = |

Arc

_n)

for 0 < x < 11 forll<x<15

according to eqs. (17) and (18) gives

= 9iix)

+ 9z(x) + 9s{x)

s{x) =g2(x)

+g4{x)

Numerically, this yields

4•x 4 4-4-(x-6) - 1 6 - 8 - (x- 11)

r(x)

s(x) = |

[0 4-(i-5) ( 20 + 8 • (x - 10)

for for for for

0< x <1 1< x < 6 6 < x < 11 11 < x < 15

for 0 < x < 5 for 5 < x < 10 for 10 < x < 15

T h e functions r ( x ) and s(x) are shown in Figures 6 and 7, respectively. Since function s(x) contains three linear segments, we have q = 3 and, using eqs. (19) and (20), we c o m p u t e t h e linear functions sq(xq) and capacities uq as follows: •Si(xi) ^2(2:2) 33(23)

= = =

0 4 • x2 8 • x3

Ui u2 u3

= = =

5 5 5

Finally, if desired, t h e piecewise-linear-concave function r ( x ) can be approximated by a continuously differentiable function f ( x ) . For instance, to a p p r o x i m a t e r(x) by a quadratic, t h e following function can be used: f(x) = 5 - 0 . 4 - ( x - 3 . 5 ) 2 This approximation is shown by the dotted line in Figure 6.

B. W. Lamar

162

-16 -

Figure 6: Concave Component for "Staircase" Function.

0

5

10

Figure 7: Convex Component for "Staircase" Function.

Network Flow Problems with General Nonlinear Arc Costs

163

/(*) 100

50

0

10

20

30

Figure 8: Piecewise-Linear Equivalent for "Sawtooth" Function.

4.2

"Sawtooth" Function

For the "sawtooth" function in Figure 2, (x) is given by

+(*) =

0 6+ 5• x 6+ 4•x 6+ 3• x

for x = 0 for 0 < x < 10 for 10 < x < 20 for 20 < x < 30

We assume that this function is defined over the domain [0,30]; i.e, u = 30. The function f(x), the piecewise-linear-continuous equivalent of 4>(x), is given by

/(*) =

11-x 11+ 5 - ( X - 1 ) 51 - 5 • (x - 9) 46 + 4 • (x - 10) 82 - 16 -(x- 19) ( 66 + 3 • (x - 20)

for for for for for for

0< x < 1 1< x < 9 9 < x < 10 10 < x < 19 19 < x < 20 20 < x < 30

as shown in Figure 8. The domain [0,30] is divided into four intervals; i.e., j = 4. They are Ii = {x : mi,! < x < m,2ti} = {x : 0 < x < 10}

I2 = {x :TO2,i< x < 7713,1} = {x : 10 < x < 19}

7 3 = {x : 7713,1 5: 1 < 7774,1} = {x . 19 < x < 20}

164

B. W. Lamar

Figure 9: Concave Component for "Sawtooth" Function.

I4 = {x : 77i4tl < x < r«5Ti} = {x : 20 < x < 30} The first interval contains three segments; i.e., fcj = 3. Each of the other intervals contains exactly one segment. Using eqs. (14) through (16), the functions gj(x) are determined for j — 1 , . . . ,4. Then using eqs. (17) and (18), functions r(x) and s(x) are given by r(x) = gi{x) + g3(x) s x

( ) = 9-iix) + 9i{x)

This yields

r(x)

11 • a; ll+5-(x-l) 51 - 5 • (x - 9) 1 - 25 • (x - 19)

for 0 < x < 1 forl
for 0 < x < 10 s(x) = I 9 • (x - 10) for 10 < x < 20 ( 90 + 28 • (x - 20) for 20 < x < 30 The functions r(x) and s(x) are shown in Figures 9 and 10, respectively. Since function s(x) contains three linear segments, we have q = 3 and the linear functions s,(x g ) and capacities uq are given as follows:

Network

Flow Problems

with General Nonlinear

Arc

Costs

165

s(x)

90

0

10

20

Figure 10: Convex Component for "Sawtooth" Function.

SlOd) s2(x2) S3{X3)

= = =

0 9-x2 28-x3

tii u2 u3

= = =

10 10 10

Lastly, if desired, t h e piecewise-linear-concave function r(x) can be approximated by a continuously differentiable function f(x). For instance, to a p p r o x i m a t e r(x) by a quadratic, t h e following function can be used: f(x)

= 55 - 0.75 • (x - 9) 2

This approximation is shown by t h e dotted line in Figure 9. T h e next section summarizes t h e paper.

5

Summary

This paper has described how a network containing arcs with general nonlinear cost functions can be converted into an expanded network involving only concave arc cost functions. T h e functional form of t h e concave arc cost functions can be calculated very efficiently from t h e original problem data. Although t h e expanded network will be larger t h a n t h e original network, it can be solved using established solution techniques for concave m i n i m u m cost network flow problems. Special cases of this conversion procedure have been applied to less-than-truckload motor carrier networks and to cash flow m a n a g e m e n t problems [17, 18]. This paper has extended t h e technique to any m i n i m u m cost network flow problem with arbitrary arc cost functions.

B. W.

166

Lamar

References [1] D.P. Bertsekas (1991), Linear MA.

Network

Optimization,

M I T Press, Cambridge,

[2] A. Charnes and W . W . Cooper (1961), Management Models and Industrial plications of Linear Programming, J o h n Wiley and Sons, New York, NY.

Ap-

[3] R . J . Dolan (1987), "Quantity Discounts: Managerial Issues and Research Opportunities," Marketing Science, vol. 6, p p . 1-22. [4] J . R . Evans and E. Minieka (1992), Optimization Graphs, Marcel Dekker, New York, NY.

Algorithms

for Networks

and

[5] M.R. Garey and D.S. Johnson (1979), Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman and Co., San Francisco, CA. [6] G.M. Guisewite and P.M. Pardalos (1990), "Minimum Concave-Cost Network Flow Problems: Applications, Complexity, and Algorithms," Annals of Operations Research, vol. 25, p p . 75-100. [7] G.M. Guisewite and P.M. Pardalos (1991a), "Single-Source Uncapacitated Minim u m Concave Cost Network Flow Problems," in H.E. Bradley (ed.), Operational Research '90, Pergamon Press, Oxford, England, p p . 703-713. [8] G.M. Guisewite and P.M. Pardalos (1991b), "Algorithms for t h e Single-Source Uncapacitated M i n i m u m Concave-Cost Network Flow Problem," Journal of Global Optimization, vol. 1, p p . 245-265. [9] G.M. Guisewite and P.M. Pardalos (1991c), "Global Search Algorithms for Mini m u m Concave Cost Network Flow Problems," Journal of Global Optimization, vol. 1, p p . 309-330. [10] G.M. Guisewite and P.M. Pardalos (1991d), "A Polynomial T i m e Solvable Concave Network Flow Problem," Networks, forthcoming. [11] G.M. Guisewite and P.M. Pardalos (1992), "Performance of Local Search in M i n i m u m Concave-Cost Network Flow Problems," in C.A. Floudas and P.M. Pardalos (eds.), Recent Advances in Global Optimization, Princeton University Press, Princeton, N J , p p . 50-75. [12] P.A. Jenson and J . W . Barnes (1980), Network and Sons, New York, NY.

Flow Programming,

[13] E.L. Johnson (1966), "Networks and Basic Solutions," Operations 14, pp. 619-623.

John Wiley

Research,

vol.

Network

Flow Problems

with General Nonlinear

Arc Costs

167

[14] D.B. K h a n g and 0 . Fujiwara (1991), "Approximate Solutions of Capacitated Fixed-Charge M i n i m u m Cost Network Flow Problems," Networks, vol. 2 1 , p p . 689-704. [15] J.L. Kennington and R.V. Helgason (1980), Algorithms ming, J o h n Wiley and Sons, New York, NY.

for Network

Program-

[16] B . W . Lamar (1992), "An Improved Branch and Bound Algorithm for M i n i m u m Concave Cost Network Flow Problems," Journal of Global Optimization, forthcoming. [17] B . W . Lamar and S. Jorjani (1990), "Incorporating Discounting Into NetworkBased Cash Flow Management Models," working paper, G r a d u a t e School of M a n a g e m e n t , University of California, Irvine, CA. [18] B . W . Lamar and Y. Sheffi (1988), "An Implicit E n u m e r a t i o n Method for LTL Network Design," Transportation Research Record, no. 1120, p p . 1-16. [19] B . W . Lamar, Y. Sheffi, and W . B . Powell (1990), "A Capacity Improvement Lower Bound for Fixed Charge Network Design Problems," Operations Research, vol. 38, pp. 704-710. [20] D . T . Phillips and A. Garcia-Diaz (1981), Fundamentals Prentice-Hall, Englewood Cliffs, N J .

of Network

Analysis,

[21] Y. Sheffi (1985), Urban Transportation Networks: Equilibrium Analysis Mathematical Programming Methods, Prentice-Hall, Englewood Cliffs, N J .

with

[22] B. Yaged, Jr. (1971), "Minimum Cost Routing for Static Network Models," Networks, vol. 1, p p . 139-172.

169 Network Optimization Problems, pp. 169-175 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Application of Global Line Search in Optimization of Networks Jonas Mockus Department of Optimal Decision Theory, Institute Akademijos 4, Vilnius 2600, Lithuania

of Mathematics

and

Informatics,

Abstract

In this paper a review of application of global line search to optimization of networks is given. Advantages and disadvantages of this approach are discussed. It is shown that global line search provides global minimum after finite number of steps in two cases of piecewise linear cost functions of arcs. The first case is where all cost functions are convex. The second case is where all costs are equal to zero at zero flow and equal to some constant at non-zero flow. In other cases the global line search approaches a global minimum with small average error. The extension of the method to vector demands is given. The application of the method to the optimization of high-voltage net of power system is described.

1

Global Line Search

Suppose t h a t t h e objective function f(x), x = (xj,j = 1,..., J) m a y be approximately expressed as a s u m of components depending on one variable Xj.

f(x) = £ /,-(*;)

(1)

3= 1

T h e n t h e original J-dimensional optimization problem can be reduced to a sequence of one-dimensional optimization problems. If t h e decomposition (1) is exact, then we shall get t h e global o p t i m u m after J steps of optimization. If the sum (1)

J.

170

Mockus

represents f(x) approximately, then generally we shall get some approximation of a global o p t i m u m . T h e result of step i we regard as an initial point for t h e i + 1-sth step of optimization. T h e optimization stops, if no change happens during J steps. T h e difference from t h e classical version of line search m e t h o d is t h a t search is not local b u t global. Generally it helps to approach a global m i n i m u m closer. T h e r e are i m p o r t a n t cases when global line search gets global m i n i m u m and local line search does not. One of such cases is t h e following problem of network optimization.

2

Optimization of Networks

Suppose t h a t t h e cost of network f(x) can be expressed as a sum (1), where fj(x() is a cost of arc j and Xj is a flow of arc j . Sum of flows of arcs connected to each node i has to be equal to t h e nodes demand. It means t h a t : j

22aijxj

=

ac

i i,

i'•

=

1) •••! I

(2)

where .=/ ] T aiCi = 0 1=1

Here J is a number of arcs, 7 is a number of nodes, c; is d e m a n d of node i, C + 1 , if arc j goes to node i a,j = < — 1, if arc j goes from node i I 0, if arc j is not connected to i. and _ f + 1 , if node i is source \ — 1, if node i is sink T h e ./-dimensional problem of network cost minimization (1) under I — 1 of conservation-of-flow equations (2) can be reduced to J — / + 1-dimensional unconstrained minimization problem: K

J

K

fix) = Ylfc{xk) + Yl h(52hikXk + Zfo) k=l

1=K+1

Here K = J — 7 + 1 is a n u m b e r of loops, xi0 is from (2) taking x^ = 0, k = 1,

{

(3)

k=-[

+ 1 , if arc / is directed as loop k — 1, if arc / is directed opposite to loop k 0, if arc / don't belong t o loop k

...,K,

Application

of Global Line Search in Optimization

of

Networks

171

Loop k is generated connecting a pair of nodes of some tree containing arcs /, I = K + 1,..., J by some additional arc k, k = 1,..., K. T h e flow x^ of arc k generating a loop k is called loop flow. Loop flows Xk,k = 1,...,K can be changed independently during optimization. Expression (3) depends on t h e tree which generates loops k, k = 1,..., K. Assume t h a t t h e costs of arcs are piecewise linear functions. T h e n it is convenient to carry out t h e global line search along t h e bounds of linear areas. It can be done changing t h e tree after each step of optimization. An obvious rule is to remove some arc from t h e tree, if t h e arcs flow happens to be on a bound of linear parts of piecewise linear cost function. Here we do not consider degenerate problems. It is shown, see Mockus (1967), t h a t t h e algorithm provides global m i n i m u m in two cases: 1. Cost functions of all arcs are convex 2. Cost functions of all arcs are constant, with exception of zero flow point. At zero flow t h e value of cost function has t o be zero. It means t h a t cost functions are very "non-convex". It is easy to see t h a t in t h e linear case t h e algorithm is as simple and efficient as conventional algorithms of linear programming. T h e difference is t h a t global line search algorithm works almost as well also in non-linear convex problems and in some special non-convex problems. T h e generalization of the algorithm to 5-dimensional d e m a n d is straightforward theoretically: we just replace the scalar d e m a n d c; by vector d e m a n d C; = (c,i,..., Cis). To keep conservation-of-flow equations (2) we have to replace scalar flows XJ by corresponding vector flows Xj, Xj = (XJI, ..., XJS)- For practical applications t h e straightforward generalization is not convenient by two reasons: T h e first reason is t h e exponential growth of calculations when S is large. T h e second reason is the practical difficulties defining general .^-dimensional piecewise linear functions fj(x3i, ...,XJS)In special .^-dimensional demand cases global line search can be carried out more conveniently using some heuristics. T h e convergence proof can be extended directly only for straightforward generalization. However the experience shows t h a t global line search is efficient in solving ^-dimensional load problems of network design. One of such problems is optimization of t h e high-voltage net of large power system.

3

The Optimization of High-Voltage Net of Power System

T h e arcs may transformers. higher voltage of nodes c; =

represent components of network such as power transmission lines and T h e nodes represent demands. Demands are sources (generators or substations) or sinks (users or lower voltage substations). T h e demands (c, n ,ra = 1,...,N) in real life power systems are some vector-valued

172

J.

Mockus

functions of t i m e c,„ = c >tl ( t ),t € [1,T]. Usually those functions are approximated as some step functions of t i m e , so c; = {cint,n

= l,...,N,t

- l,...,T),i = 1,...,/

Here different f-components of S = AT-dimensional d e m a n d represent different periods of t i m e , from t = 1 until t = T. T h e n flows of arcs are Xj = (xjnt, n — l,...,N,t = l,...,T),j = 1,..., J and t h e scalar problem (3) can be directly extended to t h e vector case:

/(*) = £/*(**) + E /KE6'*** + *») fc=l

/=A'+1

(4)

*=1

where x t = (x f c n t ,n = l ) . . . ) J V , i = 1 , . . . , T ) , A ; = 1 , . . . , A -

(5)

T h e costs of arcs j representing transmission lines and transformers j directly depends not only on flows Xj but also on states yj. So t h e cost of arc can be more conveniently expressed as fj{xj,yj). Here t h e state variable yj = {yj,t,t = 1,...,T) usually depends on t i m e b u t not on n. Each component yjt of vector yj is a non-negative integer defining t h e technical p a r a m e t e r s of arc j , such as t h e number of parallel circuits of transmission line, t h e n u m b e r and cross-section area of wires, t h e n u m b e r and t h e power rating of transformers and so on. Assume t h a t capacity of arc Xjt(yjt) is an increasing function of its s t a t e yjt (this assumption will help us later to deal with capacity constraints). So we define t h e mixed integer programming problem: K

f{x,v)

J

= ^2fk{xk,yk)+ fc=l

^

K

fi(J2bikXk

l=K+l

+ x,0,y,)

(6)

k=\

This problem can be reduced to continuous non-linear p r o g r a m m i n g problem by choosing t h e cheapest state yj for a fixed flow Xj, namely: fj(xj)

= rain f(Xj,yj)

(7)

providing t h a t t h e capacity constraints hold font I < X]nt(yJt)

(8)

T h e capacity constraints \xjnt\ < Xjnt(yjt) are satisfied by increasing t h e s t a t e variable yjt if inequality (8) does not holds. Notation Yj = YJ(XJ) means a set of feasible states of arc j which can depend on flow Xj. So expression (7) defines t h e cost function, generally a multimodal one.

Application

of Global Line Search in Optimization

of

Networks

173

Expression (7) gives a convenient definition of S'-dimensional cost function fj(xj). However there remains the exponential complexity of minimization of non-convex cost function (6) depending on vector variables xk, k = 1,..., K. So we shall consider some other ways too. An interesting way is to reduce t h e problem of mixed integer p r o g r a m m i n g (6) to a problem of pure integer programming, regarding t h e states yj as independent integer variables. Here for each fixed state y = yk t h e optimal value of flow xk has to be calculated. An advantage of pure integer programming approach is t h a t the cost of arc fj(xj,yj) at a fixed state yj is usually a convex function (compare it with nonconvex cost function (7) of non-linear programming approach). It is well known t h a t a sum (6) of convex functions also is a convex function. However we should carry out the optimization of convex function (6) m a n y times, for each s t a t e y. It is a hard task, if TV and T are not small. T h e practical experience shows, t h a t t h e best computational results can b e obtained by facing t h e mixed integer p r o g r a m m i n g problem (6) directly. It means t h a t we optimize states and flows together at each step of global line search.

4

Mixed Integer Global Line Search

We shall consider a special case T

fAX3i Vi) = Yj9it{Vit-1. (=1

N

Vjt) + Yl

h

int{yjt)x)nt)

(9)

n=l

Here t h e function gjt{yjt-i,yjt) defines t h e cost of reconstruction of node j from state yjt-i to t h e state yjt. Expression hjntxjt means t h e power loss of flow xjnt in arc j a t s t a t e J/ J ( . It is well known t h a t minimization of (6) u n d e r assumption (9) defines t h e natural distribution of power flows in a homogeneous electrical net for any fixed s t a t e y. T h e net is not homogeneous if it contains transmission lines of different voltages or if it includes lines and transformers together. Suppose t h a t yjnt = 0,ifxjnt = 0 and t h a t yjnt+i > Vint- It means t h a t zero s t a t e is feasible only for zero flows and t h a t t h e s t a t e variable can't be decreased in time. T h e last assumption is usually true, but not always. If the state of arc is zero from t h e t i m e 1 until t h e t i m e t we shall call it tinterrupted arc. It is supposed t h a t after t h e t i m e t the s t a t e of ^-interrupted arc is non-zero. T h e optimization of each loop k is carried out in T + 1 stages. At t h e 1-sth stage we compare all possible cases of T-interruption of arcs belonging to t h e loop k. For each fixed s t a t e we minimize sum (6) as a function of flow xk. We do it by solving linear equations corresponding t o t h e condition of zero first derivatives with regard t o xfc n ( ,n = 1 , . . . ,N,t = 1,... ,T. In t h e most economical s t a t e of loop

174

J.

Mockus

we replace zero s t a t e values yjt = 0 by unit s t a t e values, t h a t is by yjt = 1. We accept it as an initial s t a t e for t h e next stage. At t h e second stage we consider all cases of T — 1 interruption and so on until t h e last T + 1- sth stage. At T + 1-sth stage we consider all non-zero states. We compare sums of costs of arcs belonging to a loop k for all stages. T h e state which corresponds to t h e minimal cost function we accept as a result of global search along t h e "line" k. T h e enumeration of arcs may be changed after each step of global line search. T h e purpose of this change is to keep "most interrupted" arcs out of t h e tree. We call an arc as "most interrupted" if it is interrupted for a longest t i m e to- Here t 0 = a r g m a x j g i k i;, where ij we define as yiit = 0,t = l,...ti,yu > l , i = t/ + 1,...,T and Lk is a set of arcs belonging to a loop k. If T < 2, then t h e optimization at each stage can be carried out by a simple comparison of all corresponding states. If T is greater, t h e n some d y n a m i c p r o g r a m m i n g procedure usually is more efficient. T h e optimization stops if the cost of net changes less then e during K steps of global line search. T h e C P U t i m e T of global line search software developed by J.Valeviciene can be estimated as

r =

cKINM

Here c depends on computer, for P C approximately c = 0.2 — l . O s e c , K is a n u m b e r of loops, / is a number of nodes, A^ is a n u m b e r of flow components and M is an average n u m b e r of states of arcs. T h e algorithm was used since 1969 for t h e optimal planning of North-Western power system of t h e former USSR designing t h e new power transmission lines of 110 K V , 220 K V and 330 K V in t h e Leningrad branch of "Energosetprojekt", which was t h e leading institution in the country at t h e t i m e . D y n a m i c p r o g r a m m i n g procedures developed in Riga were also used at t h e some place, with lesser success. T h e reason was t h a t approximate global line search m e t h o d was solving t h e problems up to 100 nodes and more. T h e exact d y n a m i c programming procedures were directly applicable only to problems with tens of nodes. Pardalos and Rosen (1987) present an approximation technique based on piecewise linear underestimation of concave cost functions fj(xj). T h e resulting model is a linear, zero-one, mixed integer problem. A direct comparison of this approach and global line search techniques is an interesting problem of future research. A complete review of western results in network optimization is given by Guisewite and Pardalos (1990).

Application of Global Line Search in Optimization of Networks

175

References [1] G.M. Guisewite and P.M.Pardalos, Minimum Concave-Cost Network Flow Problems: Applications, Complexity, and Algorithms, Ann. of Operations Research, 25 (1990) 75-100. [2] J.Mockus , Multimodal Problems in Engineering Design (Nauka, Moscow, 1967) p. 216 (in Russian) [3] P.M. Pardalos and J.B. Rosen J.B. Constrained global optimization: Algorithms and applications, Lecture Notes in Computer Science 268 (Springer, Berlin, 1987)

177 Network Optimization Problems, pp. 177-202 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Solving Nonlinear Programs with Embedded Network Structures Mustafa Q. P m a r Institute for Numerical Lyngby, Denmark

Analysis,

Stavros A. Zenios Decision Sciences Department, USA

The Technical

University

University

of Pennsylvania,

of Denmark,

2800

Philadelphia,

PA

19104

Abstract

We present an algorithm for solving large scale nonlinear programs with embedded network structures. It is based on a Linear-Quadratic Penalty (LQP) function that eliminates the non-network constraints from the problem. The resulting nonlinear and nonseparable problems are solved using a simplicial decomposition algorithm that induces separability in the objective function. As a result one can employ network simplex technology. At the same time the values of any side variables can be determined by inspection. The algorithm is implemented in the software system GENOS/LP. Extensive numerical results are reported for diverse application areas: multicommodity network flows, Naval personnel assignment, matrix balancing and some of the NETLIB test problems. Comparisons with general purpose optimizers, like MINOS and OBI, are included.

1

Introduction and Background

It is well documented in t h e optimization literature t h a t network-structured optimization problems can be solved substantially faster t h a n t h e general linear program.

178

Mustafa

Q. Pmar & Stavros

A.

Zenios

This observation holds t r u e even with t h e recent developments of K a r m a r k a r ' s algorithm [1984] for linear programming, and t h e research t h a t followed it on interior point m e t h o d s . Furthermore, t h e superior performance of special purpose network algorithms has been documented for both pure and generalized networks, as well as for nonlinear programs. For example, in t h e mid-seventies, several studies established t h a t codes based on t h e network simplex algorithm for pure network problems were 150-200 times faster t h a n t h e state-of-the-art LP codes of t h e t i m e . See, for example, Glover et al. [1979] and Mulvey [1978]. In t h e early eighties research concentrated on t h e generalized network problem. Once more the network simplex algorithm was shown to be approximately 50 times faster t h a n LP codes. See, for instance, Brown and McBride [1985] and Mulvey and Zenios [1985]. This line of research was extended to t h e nonlinear network problem - see Dembo, Mulvey and Zenios [1989] - for a recent survey. Network specializations of nonlinear programming algorithms — like the primal t r u n c a t e d Newton or simplicial decomposition — were shown to be at least one order of m a g n i t u d e faster t h a n general purpose nonlinear programming solvers. Every development in network algorithms was followed by research to use the new algorithms in solving linear programs with large embedded networks. These efforts were generally successful in solving linear programs where t h e majority of constraints and variables had a network structure. Such programs are known as networks with side constraints and variables. In this category are included several well-known classes of problems: t h e processing (or blending) problem of Koene [1982], t h e equal flow problem of Ali, Kennington and Shetty [1988], the multicommodity network flow problem, Kennington and Helgason [1980] and so on. T h e applications of these problems in m a n a g e m e n t science are numerous and well documented in the above references. In this paper we develop an algorithm for solving nonlinear networks with side constraints and variables. Our p r i m a r y objective is to m a k e special-purpose nonlinear network optimization software applicable to a broader class of problems. T h e technique we propose here can also solve linear network problems with side constraints and side variables. As such it fits in the line of research pursued in t h e past by several others: McBride [1985], Chen and Engquist [1986], Glover and Klingman [1981], Chen and Saigal [1977] and so on. (See, also, Kennington and Helgason [1980, Chapter 7].) Even in t h e case of linear networks, however, our approach differs significantly from t h e earlier studies. Most of t h e earlier work dealt with specializations of the simplex algorithm. These specializations aimed at developing basis partitioning techniques t h a t would separate t h e network basis form t h e non-network component. These two separate components were treated using distinct computational procedures. Graph d a t a structures were used to carry out operations on a tree (corresponding to the network basis). General sparse m a t r i x factorizations were applied to the non-network component. W h e n applied to networks with few side constraints these methods were proven very efficient. T h e algorithm we propose here takes a different approach.

We use an exact

Solving Nonlinear

Programs

with Embedded

Network

Structures

179

penalty function to move t h e side constraints into the objective function. We then introduce a smoothing of t h e penalty t e r m in order to obtain a differentiable problem which we solve using a linearization procedure: simplicial decomposition. T h e use of penalty functions has been very effective in solving t h e multicommodity network flow problem. Ali, Kennington and Shetty [1988] use a relaxation approach together with subgradient optimization. Schultz and Meyer [1990] develop a barrier t y p e algorithm and Zenios, P m a r and D e m b o [1990] propose t h e use of a Linear-Quadratic Penalty (LQP) function. T h e last two algorithms were particularly successful in solving some very large problems from a Military Airlift C o m m a n d Application. In this paper we extend t h e L Q P algorithm of Zenios, P m a r and D e m b o [1990] from the multicommodity network flow problem to t h e more general networks with side constraints and variables. We also discuss key features of the software system G E N O S / L P t h a t we develop based on the L Q P algorithm. A comprehensive computational investigation provides information on t h e relative merits of t h e L Q P special purpose algorithm compared to general purpose optimizers. Section 2 formulates the problem we are going to solve and develops t h e algorithm. Section 3 describes t h e G E N O S / L P software system and reports its use on several diverse applications. Concluding remarks are given in Section 4.

2

The Linear-Quadratic Penalty Algorithm for Networks with Side Constraints and Variables

In this section we describe t h e Linear-Quadratic Penalty ( L Q P ) algorithm for nonlinear programs with embedded network structures. We begin with a formulation of the problem and proceed with t h e main components of t h e L Q P algorithm.

2.1

Problem Formulation

We consider t h e following nonlinear program:

[NLP] minimize x, z subject to

fix,

z) Ax

= b

Sx + Pz < d 0 < x < u 0• 3? is t h e objective function, assumed t o be convex and continuously differentiable,

180

Mustafa

Q. Pmar & Stavros

A.

Zenios

x € 3i n i is t h e vector of decision variables which represent flows on a graph, z € 5R"2 is t h e vector of decision variables which represent the side (non-network) columns, A is an m x n^ constraint m a t r i x with network structure. It could be t h e n o d e - a r c incidence m a t r i x of a network flow problem, or a block-diagonal m a t r i x where each block is a n o d e - a r c incidence m a t r i x as occurs in multicommodity network flows, stochastic networks and time-staged problems. S is t h e s x t i j m a t r i x of side (i.e., non-network) constraints imposed on t h e network flow variables, P is t h e s x n? m a t r i x of side (i.e., non-network) constraints imposed on t h e side variables, u

€ 5J"1 are upper bounds on the flow variables x,

r 6 5R"2 are upper bounds on t h e side variables z, b £ 3J m , d € 5RS are t h e r i g h t - h a n d side coefficients of t h e constraints. Also, let X = {{x,z)\Ax

=

b,0<x
Throughout t h e manuscript, transposition is indicated by a superscript T , Vxf and V z / denote t h e gradient vector of t h e function / with respect to x and z, and all vectors are column vectors.

2.2

The Linear-Quadratic Penalty (LQP) Algorithm

To exploit t h e network structure we want to remove t h e side constraints and append t h e m to t h e objective. To this end we use an exact penalty function p(t) = m a x { 0 , i } .

(1)

where t is a scalar variable. By placing the side constraints into the objective function using a penalty function we obtain a problem with network constraints. In particular, for multicommodity flows or stochastic networks, the penalty problem has a disjoint constraint set. Unfortunately, using an exact penalty function like (1) produces a non-differentiable problem. To avoid t h e difficulties of non-differentiability we use a smoothing approximation to t h e exact penalty function. This approach has been proposed in t h e context of m i n - m a x optimization by Bertsekas [1975] and later by Zang [1980]. For t h e exact

Solving Nonlinear

Programs

with Embedded

Network

Structures

181

e ' Figure 1: T h e linear-quadratic penalty function penalty function (1), we consider t h e linear-quadratic penalty function of Zenios, Pinar and D e m b o [1990]: 0 (C,t):

t-

if t<0 if 0 < t < t if t> e

(2)

where t is a scalar real variable and e is a positive real number. T h e linear-quadratic penalty function is depicted in Figure 1. T h e linear-quadratic penalty function is used to eliminate t h e side constraints by placing those in t h e objective function. T h e nonlinear network problem obtained by penalizing t h e side constraints Sx + Pz < d is formulated as: [NETNLP] minimize

$(:r, z) = f(x, z) + \i J ^ (j>(e, pj)

'

subject to

3

Ax = b 0<x
where p = Sx + Pz — d, the linear-quadratic penalty function is given by (2) and (i is a positive scalar which determines t h e severity of the penalty. T h e resulting nonlinear network problem is solved repeatedly with adaptively changing p a r a m e t e r s p and e until suitable stopping criteria are satisfied. T h e algorithm can be concisely stated as follows:

182

Mustafa Q. Pmar & Stavros

A.

Zenios

The Linear-Quadratic Penalty Algorithm LQP—0 (Initialization.) Find an initial feasible solution to t h e network component of N L P ignoring t h e side constraints, i.e., solve t h e problem minimize x, z subject to

fix,

z)

Ax = b 0 < x < u 0 < z < r

If t h e solution to this problem satisfies all side constraints, stop. Otherwise choose initial values for penalty parameters n and e and go to LQP—1. LQP—1 (Penalty Problem.) Solve - perhaps inexactly - t h e nonlinear network problem N E T N L P . Go t o L Q P - 2 . LQP—2 If t h e solution satisfies optimality criteria, stop. penalty p a r a m e t e r s fi and e and go to LQP—1.

Otherwise, adjust t h e

T h e r e are three main components of t h e L Q P algorithm which deserve special attention. T h e solution of the nonlinear network problem at step LQP—1 demands t h e most computational effort. This problem is solved using t h e network specialized version of simplicial decomposition algorithm, see Mulvey, Zenios and Ahlfeld [1990]. T h e second component is t h e multiplier adjustment procedure which is crucial to t h e efficient performance of t h e L Q P algorithm. Finally, we c o m p u t e lower and upper bounds to t h e optimal value and test stopping criteria. We study these topics next.

2.3

Simplicial Decomposition for the Penalty Problem

Simplicial decomposition iterates by solving a sequence of linearized subproblems to generate e x t r e m e points of the feasible region of t h e network component and master problems which minimize the nonlinear objective function over t h e simplex spanned by t h e e x t r e m e points. For a more detailed t r e a t m e n t t h e reader is directed to Mulvey, Zenios and Ahlfeld [1990]. Here we discuss t h e specialization of t h e algorithm in solving t h e penalty problem N E T N L P . In particular N E T N L P is decomposed into a linear network problem t h a t can be solved by the network simplex algorithm and a simple linear program t h a t can be solved by inspection.

T h e Simplicial D e c o m p o s i t i o n Algorithm S D - 0 Set v = 0, and use (x0) G X as t h e starting point. Let Y = 0, and v «— 0 denote t h e set of generated vertices and its cardinality, respectively.

Solving Nonlinear Programs with Embedded Network

Structures

183

SD—1 (Linearized subproblem.) Compute the gradient of the penalty function $ at the current iterate ( x J and solve a linear program to get a new vertex of the constraint set, i.e., solve for j / " + 1 = argmin v e x yTV$(x",z") and let Y = Y\j{yv+1}, vi-v + l. SD—2 (Nonlinear master problem.) Using the set of vertices Y to represent a simplex over the constraint set X, find an optimizer of the penalized objective function $ over this subset of X. Let w* = arg mmwewv <&(Bw) where Wv = {wi\^1Wi = l,wi>0\/i = l,2,...,v} and B = [yl\y2\... |y»] is the basis for the simplex generated by the set of vertices Y. The optimizer of $ over the simplex is given by (*„+i 1 = Bw". SD—3 Let v *— v + 1, and return to Step 1. T h e S u b p r o b l e m . At step SD—1 a new vertex (*J is generated as the solution to the following subproblem: Minimize x,z subject to

xTVx$(x\

z") + zTVz${x\

z")

Ax = b 0< x < u 0< z < r where \ v ) is the iterate at the f-th iteration of simplicial decomposition. This problem decomposes into two independent linear programs as follows: Minimize x subject to

xTVx$(x",

z")

Ax — b 0< x < u and Minimize z subject to

z T V 2 $(x",z") 0< z < r

The first problem is a linear network problem and is solved using the network simplex method. The second problem is solved trivially by assigning each component Zj of the vector z to its lower or upper bound depending on the sign of the gradient V^xV'O.i.e., " TV if V ^ x V ) > 0 , . (J 0 if V ^ z V ) < 0

Mustafa Q. Pmar k. Stavros A. Zenios

184

However, when no upper bound is specified for the side variables, this procedure fails to produce an accurate approximation to the optimal value of the side variables. We consider an alternative scheme instead. Instead of taking a full step in the direction of either the lower or the upper bound as indicated by the sign of the gradient component, we choose a point between the current value of the side variable and the bound. This is allowed since the descent direction is not affected by this operation. Thus, the side variable portion of the new vertex is obtained as follows:

_ f a(r,--^) if V , , * ( x V ) > 0 3

if VZj${xv,zv)

~\a{z?)

<0

l j

where a is a positive scalar in the interval (0,1]. This procedure is reminiscent of the trust region methods, Dennis and Schnabel [1983]. Using this procedure, the values of the side variables were computed to five digits of accuracy where this level of accuracy was not attained using the first procedure after an identical amount of computation time. The accuracy verification was made by comparing the value reported by the LQP algorithm and that of the general purpose code MINOS. We used a = 0.5 in this study. The Master Problem. At step S D - 2 a nonlinear master problem optimizes the objective function on the simplex specified by the extreme points generated by the subproblems. The master problem is formulated in the form: Minimize subject to

$(Bw) V

I>. = 1 1=1

W{ > 0 i = 1,. . . , v where v is the number of extreme points generated by the subproblems, B is the matrix whose columns are the extreme points and w = [w1, u>2,. . ., w"\ are the corresponding weights. The master problem, though nonlinear, is of significantly smaller size than the original problem since it is posed as a problem over the weights w. There are several standard methods that can be used for its solution, like, for example, Bertsekas's projected Newton method [1982]. If the simplicial decomposition algorithm drops vertices that carry zero weight at the optimal solution of the master problem, then subsequent master programs are locally unconstrained. Hence, methods of unconstrained optimization can be used to compute a descent direction. A simple ratio test determines the maximum feasible step length that will not violate the bounds. The master program can be rewritten in the form: mm$(Dw) ui>0

(5)

Solving Nonlinear

Programs

with Embedded

Network

Structures

185

where D = [yi — yv\y2 — yv\... \yv-i — Vv] is t h e derived linear basis for t h e simplex generated by t h e vertices yi,y2,---,yv. We denote by w t h e vector [wi, w2,...,u^-i] and t h e solution for wv is computed as v-l

wv = 1 - J^ Wi

(6)

;=i

At t h e current iteration we have v — 1 active vertices (i.e. u>; > 0, for i = 1,... ,v — 1) and t h e last vertex yv lies along a direction of descent. Hence, given an iterate (x", z") a descent direction p to (5) can be obtained as t h e solution to {DTMD)p

= -DTV^(xl/,

z"),

(7)

T h e choice of t h e m a t r i x M and alternative solution m e t h o d s for system (7) are discussed in Zenios, P m a r and D e m b o [1990].

2.4

Adjusting the Penalty Parameters

T h e procedure used to u p d a t e t h e penalty parameters p and t consist of dynamically decreasing t h e value of e to a small final tolerance and increasing t h e value of p when certain criteria are met. Suppose ( * * ) , pk, £k are given at iteration k of t h e L Q P algorithm. Also let pk = Sxk + Pzk — d and define t h e set V(x,z) = {j\pj > e} to be t h e set of violated constraints. T h e iterate (*k j is t e r m e d e-feasible if t h e index set of violated constraints V(xk,zk) is empty. We distinguish between t h e following two cases when u p d a t i n g t h e penalty parameters: C a s e 1: If V(xk, zk) = 0, this is an indication t h a t t h e m a g n i t u d e of t h e penalty par a m e t e r /J, was adequate in t h e previous iteration since e-feasibility is achieved. In this case t h e infeasibility tolerance t should be reduced. C a s e 2 : If V(xk,zk) / 0, t h e current point is not t-feasible, an indication t h a t t h e penalty p a r a m e t e r p should be increased. Let 7 = rjek be a target degree of infeasibility where r\ £ ( 0 , 1 ] . We consider t h e following u p d a t e equation: ^+1

=

/( tft )

(8)

7 or equivalently,

±

H>» = / ^ -k

(9)

rjc

And if \V(xk,zk)\

> 1, we get /+

1

= H. m a x pk. rjt jev{xk,zk)

(10)

186

Mustafa Q. Pmar & Stavros A. Zenios

In summary we have the following update procedure: Pick 7/1, 7/2 e (0,1] If V{xk,zk) = % ek+1 = max{ £„,,-„, 7/i e*} Else „fc+i _ _n_ m a x _t " 2 £ jSV(i*,z«)

J

where em,-n is a suitable final feasibility tolerance. A suitable initial value for ft can be found through some preliminary experimentation. W i t h t h e test problems we used in this study, t h e absolute m a x i m u m of objective function coefficients proved to be a good choice. T h e solution (* 0 J obtained by ignoring t h e side constraints can be used to provide an initial value for e. A reasonable choice is to pick a value equal to a fraction of t h e m a x i m u m of t h e side constraint violations, i.e., in t h e interval (0,maxj 6 y( I o i jO) p°). T h e value of parameters ?/i and T/2 was taken to be 0.5 for all computational tests reported in this study.

2.5

Bounds to the Optimal Value and Stopping Criteria

It is possible to compute lower bounds to the optimal objective function value during t h e course of t h e L Q P algorithm. This computation is performed after the subproblem phase of t h e simplicial decomposition and is based on a first-order Taylor series expansion of t h e function $ around t h e current iterate. Let v* be t h e optimal value of N L P and (x. J an optimal solution of N E T N L P for given penalty parameters n and t and x = (x) for notational simplicity. Also, let X = {x\Ax = 6,0 < x < u , 0 < 2 < r). Then §(x,z)
(11)

since N E T N L P is a relaxation of N L P . Therefore the optimal solution of N E T N L P is a lower bound for the optimal objective value. But in t h e presence of inexact minimizations of t h e penalized objective function $ , this is not always guaranteed to be a lower bound. Hence, we consider t h e first order Taylor series expansion of $ around a point y * ( x ) = * ( y ) + (x - y ) r V $ ( y ) + o(||y|| 2 )

(12)

Ignoring t h e second order t e r m define t h e function h

Mx) = $(y) + (x - y ) T V $ ( y )

(13)

By convexity of $ , min^g^ h(x) < $ ( £ ) , and hence, min r e x h(x) < v*. This bound is readily computed by the simplicial decomposition algorithm that generates extreme

Solving Nonlinear

Programs

with Embedded

Network

Structures

187

points of X by minimizing a linearized approximation to t h e objective function over X; see step SD—1. However, it is possible to obtain tighter lower bounds to t h e optimal value as follows. We slightly change notation for expositional simplicity. We denote t h e side constraint m a t r i x by E and temporarily ignore t h e distinction between network and side variables. Let x be an arbitrary iterate and denote by Q(e,pk) = Y?j=\ 4>{ei p)) where pk = Exk — d. Recall t h e subproblem objective function of t h e simplicial decomposition algorithm: V $ ( x i ) T • x = ( V / ( x f c ) + pVpQ(e,

k P

fE)

•x

(14)

We define t h e following lower bound function V(u) = ( V / ( x ' £ ) T - uE)x'

+ ud

(15)

where t h e vector u is given by u = -liVpQ(e,pk)

(16)

and x ' is t h e solution to t h e linearized subproblem, i.e., x ' = a r g m i n x s x i r V $ ( x * : ) . Next we show t h a t V(.) provides a lower bound superior to the linearized subproblem bound. T h e analysis is a generalization of the result given in Brown et al. [1989]. To proceed we need the following intermediate result. L e m m a 1. Let g : -R71 i—> 5R be a convex and at least once continuously differentiable function with the property t h a t 9(0) = 0

(17)

where 0 is t h e zero vector. T h e n

g(y) - yTvg(y)

v y.

(18)

P r o o f . Consider a first-order Taylor series expansion of g. By convexity

g(*) > g(y) + (* - y)Tvg(y)

v x, y e »-

In particular, g{0) > g{y) -

yTVg{y)

However, by hypothesis g(0) = 0, and t h e result follows.

•

T h e assumption in L e m m a on / holds for example for linear objective functions and quadratic objective functions of t h e form J^- djX2-.

Mustafa Q. Pwar & Stavros A. Zenios

188

Proposition 2. Let / be as in Lemma 1 and x' = argmin x g x z r V $ ( x ) where x is the current iterate. Then /t(x') < V(u) where h is given by (13). Proof. We will show that A(x') - V(u) < 0 Equivalently we want to show $(x) + (x' - x ) r V $ ( x ) - ( V / ( x ) T - u £ ) x ' - ud < 0 Consider the left hand side. Let p = £ x — d. By algebraic manipulation and using the definition of u, $(x) = = =

+ (x' - x ) r V $ ( x ) - ( V / ( x ) T - v,E)x' - ud f(±) + uQ{e,p)-Vf(x)Tx + u{EZ-d) / ( * ) - V / ( x ) T x + pQ(e, p) - pVpQ{t, p)T{E± - d) / ( x ) - V / ( x ) r x + p(Q(c,p) - VpQ(e,p)Tp) 1

2

By the assumption imposed on / , the first term is nonpositive following Lemma 1. The nonpositivity of the second term follows from the fact that p > 0 and Q(e,O) = 0 and by invoking Lemma 1. Hence the claim is established. • We now describe the procedure for generating upper bounds in the linear-quadratic penalty algorithm. For a general discussion on bounding exterior penalty function algorithms see Fiacco and McCormick [1968]. Define the set R° = {x G X\Sx + Pz > r} and assume that _R° is non-empty. Let x = (x) € -R0 and (x.) be an — perhaps approximate — optimal solution of NETNLP. Then a new interior point f^J is generated as follows: let y = Sx + Pz — d, and I = {i\yi > 0}.

p=

n

Vi

(19)

•=* y. + -r y'i y,''£' Vi

and define x = (1

-0)i - + /3x

z =:(1 -P)z '• + fiz

(20) (21)

It is easily seen that [ - J is feasible for NLP and thus provides an upper bound, Fiacco and McCormick [1968, Theorem 29, p. 107]. The same result also states that the upper bound converges monotonically to the optimal objective value. Obviously

Solving Nonlinear

Programs

with Embedded

Network

Structures

189

this procedure requires an interior point t o be generated a t t h e beginning of t h e algorithm. For example, in Zenios, P m a r and D e m b o [1990], a solution satisfying t h e m u t u a l capacity constraints of t h e multicommodity flow problem is c o m p u t e d based on t h e solution to t h e network relaxation. We were also able to generate initial feasible solutions for t h e Naval personnel assignment problems used in this study due to a special property of t h e problem. This is detailed in t h e forthcoming section on numerical experiments. Therefore t h e L Q P algorithm generates both upper and lower bounds for t h e optimal objective value during t h e execution of t h e algorithm for problems where an initial feasible solution can be computed. T h e algorithm t e r m i n a t e s when b o t h of t h e following error measures are within acceptable tolerance: 1. Absolute error in side constraint feasibility ||5a: + Pz-

dlloo < tmin

2. Bound gap

f(x,z)-V(u) V(u)

~

9ap

where x = (*J is t h e current iterate and ( | J is obtained from (20)-(21). T h e values of emin and cgap used in this study are 10~ 5 and 10~ 2 respectively. T h e ability to c o m p u t e improving upper bounds is an i m p o r t a n t feature of our approach since computation can be stopped as soon as a reasonable improvement in t h e upper bound is achieved.

3

Numerical Experience

T h e L Q P algorithm was implemented to solve problems of t h e form [ N L P ] . T h e code was written in Fortran 77. We refer to t h e code as the G E N O S / L P system. T h e computational testing was performed on D E C stations 3100 and 5100/200 running Ultrix and for large problems on a CRAY Y-MP. On the D E C stations, t h e code was compiled with t h e default compiler optimization option. For t h e CRAY experiments t h e code was tailored to take advantage of vectorization capabilities of t h e CRAY architecture. Before we report t h e results of computational testing, we discuss briefly t h e m a i n components of t h e vectorized code.

3.1

Vector Computing

T h e simplicial decomposition algorithm is particularly rich in dense linear algebra computations which can be efficiently vectorized. We mention here t h e m a i n components of t h e linear algebra involved in the L Q P m e t h o d .

Mustafa Q. Pmai
190

Computing Descent Directions. The following system of linear equations is solved to compute a descent direction during the course of the simplicial decomposition algorithm: (DTMD)p = -Z> T V$(z, z), (22) where D is a projection matrix, V $ denotes the gradient of the objective function, the pair (a;, z) is an arbitrary iterate and M is a matrix which usually approximates the second derivatives of the function $ . The matrix D tends to be very large depending on problem size. Typically for D can be 100000 x TV where N is the number of extreme points used in the master problem solution. TV varies from 1 to 100. The computation of the product DTMD can be very efficiently vectorized. Function and Gradient Evaluations. Having computed a descent direction, a one-dimensional search is executed to compute the next iterate. The time spent in the search procedure is dominated by the computation of function values and the gradient vector. The function and gradient evaluation of the original linear objective function can be vectorized trivially as it involves a simple DO-loop over all variables in the problem. However, the function and gradient values contributed by the nonseparable penalty function requires the evaluation of the side constraints. These computations are also vectorized. Other Linear Algebra. The solution of the system {DTMD)p

=

-DTV$(x\z"),

at every step of the master problem also requires the computation of the right-hand side reduced gradient vector DT V$(x", z"). This is a fully dense matrix-vector product suitable for vector architectures. To illustrate the impact of vectorization on the performance of the above components, we give in Table 1 below the time spent in these master problem components during execution of the LQP algorithm with problem PDS3 both with and without vectorization. Compiler vectorization refers to automatic vectorization of the code using compiler options whereas user vectorization refers to restructuring of code segments and use of library subroutines such as the BLAS (Basic Linear Algebra) subroutines as explained in Pmar and Zenios [1990]. O p t . level no vectorization compiler vectorization user vectorization

Descent Dir. 35.2 3.7 3.4

O t h e r Lin. A l g . 17.6 13.2 0.8

Func. a n d G r a d . Evals. 36.1 12.3 12.0

Solving Nonlinear

Programs

with Embedded

Network

Structures

191

Table l:Reduction in C P U spent in t h e main master problem components due to vectorization with P D S 3 . As evidenced by t h e results, significant gains are realized in t h e master problem phase with vectorization. It is not possible to improve t h e subproblem solution t i m e through vectorization due to t h e inherently scalar n a t u r e of the graph d a t a structures used in implementing t h e network simplex algorithm.

3.2

Solving Multicommodity Network Flow Problems

T h e multicommodity network flow problem can be seen as a special case of t h e networks with side constraints model. T h e side constraints in this case have t h e following simple form: commodity flows on all or a fraction of t h e arcs compete for a joint arc capacity. This special structure of t h e side constraints considerably simplifies t h e computer representation and evaluation of these constraints in t h e context of t h e L Q P algorithm. Extensive computational experience with multicommodity network flow problems using the L Q P algorithm is reported in Zenios, P m a r and D e m b o [1990]. Further work with a parallel decomposition of t h e algorithm is given in P m a r and Zenios [1990]. We only give a s u m m a r y of t h e results here. T h e first set of test problems is a collection of linear multicommodity network flow problems derived from a Military Airlift C o m m a n d (MAC) application. They are referred to as the Patient Distribution System (PDS) problems. T h e second set of test problems are randomly generated linear multicommodity network problems communicated to us by J.L. Kennington, see Ali and Kennington [1977]. T h e characteristics of t h e test problems are given in Table 2. Problem PDS1 PDS3 PDS5 PDS10 PDS15 PDS20 KEN 11 KEN13

No. of arcs

No. of nodes

339

126 390 686

1117 2149 4433 7203 10116

1399 2125 2447

176 225

121 169

No. of commodities

No. of rows

No. of columns

121 169

1473 4593 7546 15389 23375 31427 14694 28632

3816 12590 23639 48763 79233 105728 21349 42659

Table 2: Characteristics of multicommodity network flow problems. C o m p u t a t i o n a l results with multicommodity flow problems using t h e L Q P algorithm are given in Table 3. For each problem we report t h e total number of simplicial decomposition iterations, n u m b e r of e x t r e m e points retained upon completion, and t h e C P U t i m e consumed during t h e subproblem and master problem phase of t h e simplicial decomposition algorithm. These two components comprise t h e total C P U

192

Mustafa

Q. Pmar
A.

Zenios

usage during execution of t h e algorithm.

Test Problem

PDSl PDS3 PDS5 PDSIO PDS15 PDS20 KENll KEN13

Simplicial iters 23 41 71 103 121 145 16 87

GENOS/LP Subproblem Master time time 0.85 1.01 12.04 6.73 55.12 37.97 232.31 175.82 559.35 381.08 1225.83 720.06 7.4 8.1 70.6 66.8

Total time 1.86 18.77 93.09 408.13 940.43 1945.89 15.5 137.5

OBI time

1530 16000 21 67

Table 3: Multicommodity flow problems solution statistics on t h e CRAY Y - M P . W i t h all t h e test problems, both t h e infeasibility tolerance e m ,„ = 10~ 5 and t h e bound gap tolerance esap = 1 0 - 2 were achieved. P D S l and P D S 3 were also solved with t h e general purpose package MINOS of Murtagh and Saunders [1987]. T h e optimal values reported by MINOS matched t h e L Q P optimal values to 5 digits. Some of these problems were solved on t h e same computer by Marsten et al. [1990] using t h e code O B I based on interior point methods. T h e results are also given in Table 3. It is clear t h a t the L Q P algorithm outperforms substantially state-of-theart implementations of interior point methods. By virtue of t h e linearization in t h e subproblem phase of simplicial decomposition, t h e linear network flow problems for individual commodities can be solved on parallel processors. T h e results of this study are reported elsewhere; see P m a r and Zenios [1990].

3.3

Solving the Naval Personnel Assignment Problem

In this section we report numerical results obtained using G E N O S / L P on two Naval Personnel Assignment problems. Each year thousands of decisions are m a d e t o (re)allocate t h e Navy Enlisted Personnel to a fleet of combat units and to mission areas within these units. Allocations are m a d e in such a m a n n e r as to provide t h e best defense at t h e lowest cost. All mission areas within a combat unit require personnel with different skills to support operational capabilities. A unit's capability to perform its functions in all its mission areas is referred to as "readiness". Readiness is measured based on t h e skills of personnel assigned. A shortage of skilled personnel would decrease t h e level of readiness of a mission area and thus degrade the capabilities of t h e u n i t . Clearly maximizing t h e level of readiness is a complex decision making problem given t h e large n u m b e r of mission areas and personnel to be m a t c h e d . This problem can be formulated as a network optimization problem with side constraints and variable(s). T h e reader is directed Krass and Thompson [1990] for more details

Solving Nonlinear

Programs

with Embedded

Network

Structures

193

on t h e model. Expressed in m a t r i x notation t h e model has the following form: minimize x,z subject to

ex — z Ax Sx + Pz 0 < x 0 < 2

= > < <

b d u r

where A is a node-incidence m a t r i x for t h e network and represents t h e flow conservation conditions and S and P are matrices used to capture t h e non-network requirements. T h e variables x denote t h e flow variables and z is t h e side variable which represents t h e level of readiness to be maximized. T h e objective is to maximize t h e level of readiness for all units considered and minimize t h e cost of t h e assignment. For t h e Naval assignment problems used in this study, an initial feasible solution was readily c o m p u t e d since t h e solution to the network relaxation satisfied the side constraints when the side variable was ignored, i.e., let x° be a solution of t h e network relaxation minimize subject to

Ax = b 0 < x < u

If x° is such t h a t

Sx°> d then z° is computed as

(£,,.

-di)

(23)

Pij

where S{j and pij denote t h e entries at t h e i-th row and j-th column of t h e matrices S and P respectively. T h e first model — NAVY — is a simplified version of t h e complete model which we we call H U G E N A V Y . T h e size and characteristics of b o t h problems are given in Table 4. In addition t h e problem NAVY has a nonzero assignment cost vector c whereas t h e larger problem H U G E N A V Y has an assignment cost vector which is identically zero. T h e objective in problem H U G E N A V Y is solely to minimize t h e readiness level. Problem NAVY HUGENAVY

LP form Rows Columns 4144 6842 64542 36013

Network Nodes Arcs 3457 6841 30639 64541

N o . of side const. 687 5374

Opt. value - 2 . 7 2 3 4 7 x 10 5 -0.5340

Table 4: Problem Characteristics of Naval Personnel Assignment. Both problems were solved on t h e CRAY Y - M P . We give in Table 5 t h e solution

194

Mustafa

Q. Pmar
A.

Zenios

statistics of t h e LQP algorithm. All times are stated in C P U seconds exclusive of i n p u t / o u t p u t . Major iterations refer t o t h e total n u m b e r of times step 1 of t h e L Q P algorithm is executed. It is interesting to note t h a t t h e larger Navy problem is solved in a t i m e very close to t h e solution t i m e of t h e smaller problem. This can be att r i b u t e d to t h e larger number of major iterations t h e algorithm took in the case of t h e problem NAVY because t h e smaller problem is more tightly constrained t h a n t h e larger. Since the iterates generated by t h e L Q P algorithm become only feasible on t e r m i n a t i o n , t h e previous observation leads to t h e conclusion t h a t , though much larger in size, H U G E N A V Y is a relatively easier problem for our m e t h o d .

Problem

NAVY (CRAY Y - M P ) NAVY (DEC 5100) HUGENAVY(CRAY Y - M P )

Simpl. iters 6 6 2

GENOS/LP Master Subproblem time time 45 149 132 1428 157 181

Total time 194 1560 276

MINOS time

OBI time

NA 600 NA

NA NA 150

Table 5: Performance of t h e L Q P algorithm on the Naval personnel assignment problems. We also report t h e solution t i m e by MINOS for NAVY. T h e L Q P algorithm was outperformed by MINOS on this problem. However, MINOS was not able to produce a feasible solution to H U G E N A V Y after one hour of C P U t i m e on a CRAY Y - M P whereas this problem was solved within 5 minutes using t h e L Q P m e t h o d . On t h e other hand, t h e same problem was solved in less than 3 minutes using the O B I code based on interior point methods. This indicates t h a t t h e L Q P algorithm based on nonlinear programming technology is competitive with current linear programming technology while it outperforms the state-of-the-art simplex based linear programming code M I N O S . We also note t h a t t h e value of t h e side variable which represents t h e readiness level was computed to 5 digits of accuracy by t h e L Q P m e t h o d as confirmed for both problems by t h e O B I and MINOS solutions. We also experimented with nonlinear versions of NAVY and H U G E N A V Y problems. We refer to these problems as NAVYQ and H U G E N A V Y Q where the objective function is a separable quadratic function of t h e form YljajrfF ° r t h e problem NAVYQ, t h e coefficients a ; are precisely t h e coefficients given in the linear model. For t h e H U G E N A V Y Q problem t h e coefficient vector was taken to be identically unity. We report t h e solution statistics in Table 6. Both results were obtained on a CRAY Y-MP.

Solving Nonlinear

Problem NAVYQ HUGENAVYQ

Programs

Major iters 18 8

with Embedded

Subprob. time 89 270

Network

Master time 367 867

Total time 456 1137

Structures

Lower bound -273025.94 0.7220 x 108

195

Upper bound -252403. 0.7320 x 10s

Table 6: Performance of t h e L Q P algorithm on t h e nonlinear Naval personnel assignment problems. W i t h b o t h problems, t h e infeasibility tolerance e m ;„ = 1 0 - 5 was attained on termination. Using MINOS to solve NAVYQ a feasible solution with objective function value -269515.4 was obtained in 485 C P U seconds on t h e CRAY Y - M P . This solution is 1% b e t t e r t h a n the best feasible solution produced by the L Q P algorithm in 456 seconds. However, t h e L Q P algorithm produced a more accurate solution on H U G E N A V Y Q . MINOS was not used for this problem due to t h e anticipated C P U t i m e and m e m o r y requirements.

3.4

Solving Constrained Matrix Estimation Problems

In this section we report results with constrained versions of two m a t r i x estimation problems from t h e World Bank. T h e m a t r i x estimation problem is t h a t of adjusting t h e entries of Social Accounting Matrices (SAM) for an economy and can be formulated as a nonlinear network optimization problem, see Zenios, Drud and Mulvey [1989]. T h e first problem S A M K E is a SAM model for Kenya and t h e second problem S A M B O is a SAM model for Botswana. Both problems were derived from econometric studies conducted at the World Bank. Both problems have separable objective functions. T h e problem S A M K E has a weighted entropy objective function of t h e form ax • (logf — 1.0) for each flow variable and t h e problem S A M B O has a quadratic objective function of the form a • (x — b)2. We note t h a t these functions do not satisfy t h e assumption of L e m m a 1 of section 2.5 and therefore we rely on t h e subproblem lower bound for these problems. Since no initial feasible solution can be readily obtained, we do not compute upper bounds. We constructed side constraints for these problems as follows. A n u m b e r of arcs were randomly chosen and a fraction of t h e sum of t h e optimal flow values on these arcs was taken as the right-hand side of t h e inequality. This was repeated as m a n y times as the number of side constraints we added to t h e problem. Therefore, t h e side constraints have t h e following form:

22 *» >P22 *:,W)e£

(<j)€f

where £ is an arbitrary subset of t h e arcs of t h e underlying graph and HJ is t h e flow variable for arc (i,j), x*- are the optimal flows obtained by solving t h e original m a t r i x estimation problem and /? G (0,1]. Characteristics of t h e problems are summarized in Table 7.

196

Mustafa soo -r

Q. Pmar
A.

Zenios

—•—•»«*

Figure 2: Variation of t h e solution t i m e for constrained m a t r i x estimation problem S A M B O as a function of t h e number of side constraints with G E N O S / L P and M I N O S .

Problem SAMKE SAMBO

Network N o d e s Arcs 50 202 128 662

Optimal value -7768.11 10.71

Table 7: Characteristics of m a t r i x estimation problems. Starting with one side constraint, b o t h problems were solved with an increasing number of side constraints using t h e L Q P algorithm. Tests were performed on a D E C station 3100. All times are given in C P U seconds. We provide in Figure 2 t h e variation of t h e C P U t i m e taken by G E N O S / L P algorithm to solve SAMBO and t h e C P U t i m e taken by MINOS on t h e same problem as a function of t h e increasing number of side constraints added to the problem. We observe t h a t while MINOS t i m e does not vary considerably, G E N O S / L P outperforms MINOS by a significant margin. However t h e advantage of G E N O S / L P is reduced as the number of side constraints increase. This is not surprising since the L Q P m e t h o d is more sensitive to the size of t h e network component which gets smaller in percentage as more side constraints are added. In Figure 3, we plot t h e optimal values reported by G E N O S / L P and MINOS with t h e problem S A M B O as a function of the number of side constraints. In all experiments, G E N O S / L P was able to produce reasonably accurate solutions to the problem. We provide in Table 8 a s u m m a r y of the L Q P algorithm statistics for both problems. T h e problems are referred to as SAMKE25 and SAMBO20 to indicate t h e

Solving Nonlinear Programs with Embedded Network

Structures

197

MINOS • U3P

5 + 0 +5

10

15

20

number ot side constraints

Figure 3: Variation of the optimal value for constrained matrix estimation problem SAMBO as a function of the number of side constraints with GENOS/LP and MINOS.

number of side constraints present in the problem. Problem SAMKE25 SAMBO20

Major iters. 3 7

Infeasibility 5.82 x 10"4 6 x 10~4

Total time 14. 303.

Objective value -7624.85 28.10

Lower bound -7678.11 21.89

Table 8: The LQP statistics with the matrix estimation problems. The lower bounds in both cases are not very tight. The solution statistics with MINOS for problems SAMKE25 and SAMBO20 are given in Table 9. Problem SAMKE25 SAMBO20

Number of iterations 475 1804

Optimal value -7630.94 27.73

C P U time 7 472

Table 9: Performance of MINOS on the matrix estimation problems. Although MINOS provides more accurate solutions and was faster with the smaller SAMKE25, the LQP method outperforms MINOS on the larger problem with respect to CPU usage for different number of side constraints while it produced an acceptable level of accuracy.

Mustafa

198

3.5

Q. Pmar & Stavros

A. Zenios

Analyzing the NETLIB Test Problems

Using t h e network extracting heuristics of Bixby a n d Fourer [1988], we analyzed a subset of N E T L I B linear programming test problems. A s u m m a r y of t h e characteristics of t h e test problems a n d t h e associated network formulations are given in Table 10. As can be observed from t h e statistics, two of t h e problems have a large network component t o warrant some attention. Problem RECIPE GREENBEA GIFFPINC SCAGR25 SCRS8 SHIP12L SIERRA STANDATA

Linear Programming Rows Columns Opt. Val. 92 180 -2.6661 x 102 2400 5443 -7.2462 x 107 617 1092 6.9022 x 106 472 500 -1.4753 x 107 491 1169 9.9429 x 102 1165 5427 1.4701 x 106 9252 1228 1.5394 x 107 468 3686 1.2576 x 103

Network Nodes Arcs 54 140 895 4641 523 1071 372 200 301 1096 735 5321 878 2726 96 331

Side Const. 30 1423 69 147 156 104 349 226

Side Vars. 14 606 1 127 78

789

Table 10: T h e N E T L I B problems in LP a n d network forms. Our experience with t h e N E T L I B problems using t h e L Q P algorithm revealed t h a t solving these problems as general linear programs is more efficient. We report results with two problems G I F F P I N C and SHIP12L. T h e L Q P statistics are given below in Table 11. T h e n u m b e r of major iterations refer t o t h e n u m b e r of executions of step 1 of t h e L Q P algorithm. We also report t h e final objective function value reported on termination a n d t h e best lower bound computed thus far. Infeasibility refers t o t h e m a x i m u m degree of violation of t h e side constraints. It was not possible to provide an initial feasible solution for these problems a n d hence no feasible iterates a n d upper bounds t o optimal value were computed. T h e tests were performed on a D E C station 3100. SHIP12L was solved in 110 C P U seconds using MINOS a n d G I F F P I N C was soved in 25 seconds using t h e same code. Problem GIFFPINC SHIP12L

Major iters. 10 15

Infeasibility 0.242 x 10" 2 2.37

Total time 8240.9 7200

Objective value 7.581 x 106 1.768 x 106

Lower bound 6.766 x 106 NA

Table 11: T h e L Q P statistics with t h e N E T L I B problems. T h e statistics clearly indicate t h a t these problems proved t o be extremely hard for t h e L Q P algorithm. In t h e case of G I F F P I N C , although an acceptable level of infeasibility a n d a reasonable lower bound was achieved, t h e objective value is off t h e known optimal value by a considerable margin. T h e case of SHIP12L was more problematic.

Solving Nonlinear

Programs

with Embedded

Network

Structures

199

T h e c o m p u t a t i o n was stopped after two C P U hours and t h e iterate was still far from reaching t h e absolute infeasibility tolerance e m l n = 1 0 - 5 . It was also impossible to assess t h e quality of t h e solutions to nonlinear network (penalty) problems due to t h e poor quality of t h e lower bounds. We also note t h a t b o t h problems had a dense side constraint m a t r i x structure, a factor which affects t h e L Q P algorithm negatively. To conclude, we r e m a r k t h a t t h e L Q P technology was not effective in dealing with t h e N E T L I B problems although t h e experience contributed to t h e robustness testing of t h e G E N O S / L P code.

3.6

Integration with MINOS

T h e L Q P m e t h o d delivers quickly an approximate solution to t h e problem. W h e n higher accuracy is needed a linear programming solver m a y be used. T h e linear prog r a m NAVY is solved with t h e general purpose linear programming solver M I N O S of M u r t a g h and Saunders [1987]. T h e statistics are given in Table 12.

Number of PhaseT pivots Total number of pivots CPU time(DEC 5100)

1247 2423 10 mins.

Table 12: Performance of M I N O S on NAVY. Interfacing t h e G E N O S / L P system with MINOS may provide M I N O S with an advanced starting point. However, since t h e L Q P algorithm is essentially based on an exterior point penalty function, no basis for the problem is readily available. T h e optimal network basis produced as a result of solving linear network subproblems is input to M I N O S . This idea produced a significant reduction in the n u m b e r of pivots taken by M I N O S to reach optimality. A comparison is given below in Table 13.

Number of Phase-I pivots Total number of pivots

MINOS 1247 2423

MINOS with advanced start 1750 1782

Table 13: Performance of MINOS on NAVY using advanced start. As can be observed from Table 13 t h e total number of iterations were reduced significantly. Due to t h e anticipated C P U usage this strategy was not applied to HUGENAVY.

200

4

Mustafa

Q. Pmar
A.

Zenios

Conclusions

We presented in this paper a solution m e t h o d suitable for large scale optimization problems with embedded network structures, and results of extensive computational testing with various test problems. T h e L Q P m e t h o d is an exterior point m e t h o d based on a smooth penalty function and can produce feasible iterates if an initial feasible point is available. For applications where several problem instances need to be solved such as t h e Patient Distribution System and t h e Naval Personnel Assignment, there is a high payoff in exploiting t h e network structure and developing a specialized algorithm. Particularly as t h e problem size gets bigger, t h e benefits of using the L Q P algorithm become more accentuated. In this paper we presented strong evidence to support this claim. For smaller problems it is more beneficial to use general purpose algorithms as evidenced by the analysis with t h e N E T L I B problems. Another import a n t factor which affects t h e performance of the L Q P algorithm is the sparsity p a t t e r n of t h e side constraint matrix. W i t h the P D S problems and Naval personnel assignment problems t h e side constraint m a t r i x had a favourable sparsity p a t t e r n whereas t h e N E T L I B problems had a very dense structure. However, t h e L Q P algorithm may still be a viable alternative in smaller problems with relatively few side constraints as observed in t h e case of constrained m a t r i x estimation problems. In s u m m a r y t h e L Q P algorithm is able to provide quickly approximate solutions to t h e problem. For large problems it outperforms state-of-the-art general purpose optimization software while it remains competitive with the more recent interior point based optimization technology. To achieve higher accuracy, the L Q P solution can be used as an advanced start for a general purpose linear programming solver. A c k n o w l e d g m e n t s . This research was partially supported by N S F grants S E S 91-00216 and C C R - 9 1 - 0 4 0 2 and A F O S R grant 91-0168. T h e assistance of Mr. Ted T h o m p s o n and Mr. Iosif Krass with supplying t h e d a t a for t h e Navy Personnel Assignment problem is gratefully acknowledged. Mr. John Gregory kindly provided assistance with the CRAY experiments and O B I . Professor Bob Fourer kindly m a d e his network extraction program available.

References [1] A.I. Ali and J.L. Kennington, M N E T G N Program Documentation, Technical Report I E O R 77003. D e p a r t m e n t of Industrial Engineering and Operations Research, Southern Methodist University, Dallas (1985). [2] A.I. Ali, J.L. Kennington and B. Shetty, T h e Equal Flow Problem, Journal of Operational Research 3 6 (1988) 107-115. [3] D.P. Bertsekas, Nondifferentiable Optimization via Approximation, Programming Study 3 (1975) 1-25.

European

Mathematical

Solving Nonlinear

Programs

with Embedded

Network

Structures

201

[4] D.P. Bertsekas, Projected Newton Methods for Optimization Problems with Simple Constraints, SIAM Journal on Control and Optimization 2 0 (1982) 221-246. [5] R.E. Bixby and R. Fourer, Finding E m b e d d e d Network Rows in Linear Programs I. Extraction Heuristics, Management Science 3 4 (1988) 342-376. [6] G.G. Brown, G.W. Graves, H. Lange, C. Staniec and R.K. Wood Dual Decomposition Methods for Solving Multicommodity Flow Problems, Technical Report, Naval P o s t g r a d u a t e School (1989). [7] G.G. Brown and R.D. McBride, Solving Generalized Networks, Management ence 3 0 (1984) 1497-1523.

Sci-

[8] C.H.J. Chen and M. Engquist, A Primal Simplex Approach to P u r e Processing Networks, Management Science 32 (1986) 1582-1598. [9] S. Chen and R. Saigal, A Primal Algorithm for Solving a Capacitated Network Flow Problem with Additional Linear Constraints, Networks 7 (1977) 59-79. [10] R.S.Dembo, J.M. Mulvey and S.A. Zenios, Large-Scale Nonlinear Network Models and Their Application. Operations Research 3 7 (1989) 353-372. [11] J . E . Dennis,Jr. and R . B . Schnabel, Numerical Methods for Unconstrained mization and Nonlinear Equations (Prentice-Hall, New Jersey, 1983). [12] A.V. Fiacco and G.P. McCormick, Nonlinear Programming: Sequential strained Minimization Techniques (John Wiley, New York, 1968).

Opti-

Uncon-

[13] A . B . Gamble, A.R. Conn and W . R . PuUeyblank, A Network Penalty Problem, Mathematical Programming 50 (1991) 53-74. [14] F . Glover, J. Hultz, D.Klingman and J . S t u t z , Generalized Networks: A Fund a m e n t a l C o m p u t e r Based Planning Tool, Management Science 2 4 (1978) 12091220. [15] F . Glover and D. Klingman, T h e Simplex SON Algorithm for L P / E m b e d d e d Network Problems, Mathematical Programming Study 15 (1981) 148-176. [16] N. Karmarkar, A New Polynomial T i m e Algorithm for Linear P r o g r a m m i n g , Combinatorica 4 (1984) 373-395. [17] J.L. Kennington and R.V. Helgason, Algorithms Wiley and Sons, New York, 1980).

for Network Programming

(John

[18] J. Koene, Minimal Cost Flow in Processing Networks, A P r i m a l Approach, P h . D . Thesis, Eindhoven University of Technology, Eindhoven, T h e Netherlands (1982).

202

Mustafa C. Pinar & Stavros

A.

Zenios

[19] LA. Krass and T . J . Thompson, M a t h e m a t i c a l Formulation of E D P R O J - Readiness Connection, Technical Report, Navy Personnel Research and Development Center, San Diego CA (1990). [20] R. Marsten, R. S u b r a m a n i a n , M. Saltzman, I. Lustig and D. Shanno, Interior Point Methods for Linear Programming, Interfaces 2 0 (1990) 105-116. [21] R . D . McBride, Solving E m b e d d e d Generalized Network Problems, Journal of Operational Research 2 1 (1985) 82-92.

European

[22] J . M . Mulvey, S.A. Zenios and D.P. Ahlfeld, Simplicial Decomposition for Convex Generalized Networks, Journal of Information and Optimization Sciences 1 1 (1990) 359-387.

[23] J.M. Mulvey and S.A. Zenios, Solving Large Scale Generalized Networks, of Information and Optimization Sciences 6 (1985) 95-112.

Journal

[24] J . M . Mulvey, Testing of a Large Scale Network Optimization Program, matical Programming 15 (1978) 291-315.

Mathe-

[25] B.A. M u r t a g h and M.A. Saunders, MINOS 5.1 User's Guide, Report SOL 8 3 20R, December 1983 (revised J a n u a r y 1987), Stanford University. [26] M.C. P m a r , Decomposition and Parallel Solution of Network Structured Optimization Problems, P h . D . Thesis, University of Pennsylvania, Philadelphia PA 19104 (1992). [27] M.Q. P m a r and S.A. Zenios, Parallel Decomposition of Multicommodity Network Flows using Smooth Penalty Functions, ORSA Journal on Computing 4 (1992) (forthcoming). [28] G.L. Schultz and R.R. Meyer., An Interior Point Method for Block Angular Optimization, SIAM Journal on Optimization 1 (1991) . [29] I. Zang, A Smoothing-out Technique for Min-Max Optimization, Programming 19 (1980) 61-77.

Mathematical

[30] S.A. Zenios, A. Drud and J.M. Mulvey, Balancing Large Social Accounting Matrices with Nonlinear Network Programming. Networks 19 (1989) 569-585. [31] S.A. Zenios, M.C. Pinar and R.S. Dembo, A Smooth Penalty Function Algorithm for Network Structured Problems, D e p a r t m e n t of Decision Sciences Report 9 0 12-05, T h e W h a r t o n School, University of Pennsylvania, Philadelphia, PA. 19104 (1990).

203 Network Optimization Problems, pp. 203-231 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

On Algorithms for Nonlinear Dynamic Networks Warren B . Powell Elif Berkkam Irvin J. Lustig Department of Civil Engineering and Operations Research, School of Engineering and Applied Science, Princeton University, Princeton, NJ 08544 USA

Abstract

We consider the problem of minimizing costs over a dynamic, acyclic network with convex, separable link cost functions. The standard approach is to formulate the problem as a convex, separable optimization problem subject to flow conservation constraints, where the decision variable is the flow x{j on link (i,j). We show that standard network algorithms applied to dynamic problems exhibit surprisingly poor performance for networks with as few as 10 or 20 time periods, suggesting that dynamic networks are intrinsically much harder to solve than static networks of comparable size. The problem can be reformulated using decision variables #,j, which gives the fraction of the total flow passing through node i that should be routed over link (i, j). This formulation has been used by other researchers in the development of parallel algorithms which take advantage of the simple constraint structure. We show that this reformulation produces substantially faster execution times for Frank-Wolfe type methods than the same methods applied to the standard formulation.

1

Introduction

We consider t h e problem of optimizing flows over a dynamic network with separable nonlinear cost functions. T h e problem can be motivated by problems arising in network models of dynamic fleet management for common carriers in freight transportation (truck, rail, containers). In these models, supplies of vehicles enter the network in t h e first few t i m e periods but then flow through t h e network and exit

204

W. B. Powell, E. Berkkam,

and I. J.

Lustig

via a supersink. An i m p o r t a n t characteristic of these problems is t h a t most of t h e nodes are p u r e transshipment nodes and only t h e supersink is a deficit node. Powell et al. [11] presents a model with this structure to manage a fleet of trucks over t i m e under uncertain d e m a n d s . Gallagher [5] introduces a similar model with multicommodity flows to optimize t h e routing of messages over communication networks. We show in this paper t h a t dynamic networks are intrinsically much more difficult to solve with first-order m e t h o d s t h a n static networks of comparable size. Algorithms applied to s t a n d a r d formulations of nonlinear dynamic networks can perform very poorly. Even special packages for nonlinear networks such as G E N O S (Mulvey and Zenios [8]) require exceptionally long run times for relatively small networks with 10 or 20 t i m e periods. By contrast, even relatively simplistic algorithms such as FrankWolfe are shown to work quite well when applied to a different formulation of t h e same problem. T h e r e are two approaches t h a t can be used to solve this problem. T h e first is to view it as a s t a n d a r d m i n i m u m cost flow problem with nonlinear cost functions and link flows, Xij, as decision variables. T h e second approach does not require t h e use of any network flow algorithms but takes advantage of t h e acyclic structure of t h e network (Figure 1) to describe the impact of decisions m a d e now on future costs. This new approach uses flow fractions, 8{j, as decision variables. T h e decision variable 9ij represents t h e fraction of total flow passing through node i t h a t is to be routed over link (i, j). We refer to t h e classical formulation, which uses XJJ as decision variables, as N D N - X (Nonlinear Dynamic Network with X variables), and refer to t h e new formulation, which uses 0;J, as N D N - T . T h e formulation N D N - T appears to have been first introduced by Gallagher [5] for multicommodity flow problems arising in telecommunications. T h e motivation for the formulation was t h e development of distributed algorithms for routing in telecommunications. As shown below, t h e N D N - T formulation uses a very simple constraint structure t h a t lends itself easily to parallel computation. T h e same formulation was developed independently by Powell et al. [11] in t h e context of managing fleets of vehicles under uncertainty. However, neither of these papers really investigate t h e behavior of solution algorithms for dynamic networks. Researchers have long realized t h a t t h e structure of dynamic networks could be used to develop specialized algorithms (see, for example, Aronson [1]). By contrast, there has been relatively little recognition of t h e challenges posed by dynamic networks. There are relatively few algorithms specialized for dynamic networks (see, for example, t h e extensive review in Aronson and Chen [2]). Aronson [1] presents a specialization of t h e network simplex algorithm t h a t takes advantage of breaks in t h e tree t h a t limit the effect of pivots on earlier time periods. This result, however, appears to be restricted to pure dynamic networks t h a t arise in inventory planning problems. W h i t e and Bomberault [12] offer a specialization of a primal dual algorithm for d y n a m i c networks motivated by e m p t y railcar models. Powell et al. [11] propose a stochastic formulation of t h e dynamic vehicle allocation problem which produces

On Algorithms

for Nonlinear

Dynamic

205

Networks

a nonlinear, dynamic network with t h e structure considered here. A flow splitting algorithm of t h e t y p e presented above is introduced and shown to exhibit good performance in limited tests. Bertsekas [3] and Bertsekas et al. [4] explore in d e p t h enhancements of t h e flow splitting formulation we refer to as N D N - T . Taking advantage of t h e simple structure of the constraint set, these papers develop projection algorithms and second order algorithms. In this paper we use this formulation to expose t h e impact of t h e dynamic structure of t h e network on algorithmic performance. Section 2 presents t h e formulation of t h e nonlinear dynamic network problem in t h e traditional form N D N - X . Section 3 presents a more detailed description of t h e N D N - T formulation in terms of dynamic networks, and states a simple backward recursion for calculating derivatives, with a derivation left to t h e appendix. Section 4 outlines several s t a n d a r d algorithms t h a t can be used with t h e new formulation, taking advantage of t h e simple structure of t h e constraint set. Finally, section 5 compares the N D N - X and N D N - T formulations.

2

The N D N - I Formulation

We begin by presenting t h e basic problem in a form t h a t explicitly reflects t h e dynamic structure of t h e problem. Let i and j refer to cities (points in space) and let a node in t h e network be denoted by (i,t). For notational simplicity only, we assume any links e m a n a t i n g from (i,t) t e r m i n a t e in period t + 1. Define R\ x\j fij(xlj)

=

net surplus (R\ > 0) or deficit (R\ < 0) at node i at t i m e t

=

total flow from i to j departing at t i m e t (and arriving at t i m e t + 1)

=

cos

t °f sending flow on the link from node (i,t)

to node (j,t + 1)

T h e formulation N D N - X is then minimize

F(x)

=

Y.J2 t

subject to

X) i

fij(xh)

j

Ax

=

R

(1)

x

>

0

(2)

where A is t h e node-arc incidence m a t r i x for t h e network. It is assumed t h a t the functions ff- are convex and t h a t R is the vector of surpluses and deficits at each node. This formulation produces a problem with a separable, nonlinear objective function with nonseparable network constraints. T h e relative ease with which derivatives can be calculated, due to separability, makes this traditional formulation favorable. In addition, t h e conservation of flow constraints can be handled in the context of first order nonlinear p r o g r a m m i n g algor i t h m s , such as t h e Frank-Wolfe algorithm, since t h e linearized subproblems are pure

206

W. B. Powell, E. Berkkam,

and I. J.

Lustig

networks. T h e gradient g of F is defined by

If a first-order algorithm is applied to N D N - X , then t h e linearized subproblem is based on t h e gradient at some point x. This problem is then a pure network problem of t h e form minimize

X ) 1C 5Z S i j ^ l ' ) ' vh t

i

j

subject to

Ay

=

R

(4)

y > o T h e Frank-Wolfe algorithm applied to N D N - X generates iterates i ' * ' for k > 0 using t h e following steps: Step 1. Set z( 0 ) = 0 and find g(x^). Solve (4) with x = x<0' and set k = l , so t h a t our initial solution is i ' 1 ' = y(°\ Step 2. Evaluate g'^x^). optimal solution.

Solve t h e linear network problem (4) and let yW be t h e

Step 3. Find the o p t i m u m step size o* by solving min 0
Step 4. U p d a t e a;'*-*"1) = x « + a*(yW

- i«).

1

|F(I( +'))-F(I('))|

L Step 5. If i . (t), < e, then stop, otherwise set k *— k + 1 and go to step 2. T h e problem with a s t a n d a r d application of Frank-Wolfe is t h e n a t u r e of t h e e x t r e m e solution, as depicted in Figure 2. T h e structure of t h e problem, where flows typically enter t h e network in t h e first few t i m e periods and leave through the supersink, produces a p a t t e r n of flows from the subproblem t h a t follows a tree. In fact, we are actually just solving a shortest p a t h problem into t h e supersink over an uncapacitated network with linear costs. For dynamic networks with at least five or ten t i m e periods, t h e result is a set of flows t h a t converges on a single p a t h several t i m e periods into t h e problem. This e x t r e m e point solution is an unusually poor approximation to t h e optimal solution, and hence is t h e cause of extremely slow rates of convergence. In fact, standard algorithms, applied to even relatively small problems with a large (greater t h a n 20) number of t i m e periods converge so slowly t h a t they may stop well short of optimality. In t h e next section, we show how a simple transformation takes advantage of t h e dynamic structure of the problem, and allows us to develop very simple and efficient algorithms for solving these networks.

On Algorithms for Nonlinear Dynamic Networks

3

207

The Transformed Problem NDN-T

In this section we give a different formulation of the same problem, where the decision variables are fractions of total supply at a node instead of link flows. The transformation from the original formulation NDN-X into this new formulation is performed using

*« = *irSj

(5)

where 9{j 0% P 6 §* S* S* + R\

= = = = = = =

fraction of total supply at node i at time t to be sent to node j at time t + 1 {..., 0\j, • • •} = values of 9 at a fixed time t total number of time periods {6\6\...,6P} 2 {0\0 ,...,0t} = all 6 values up to time t total endogenous flow through node i at time t total available supply at node i at time t

For simplicity of notation, we assume all flows pass from period t to t + 1. Flows from period t to t + m, m > 1, are easily handled, whereas flows between regions within a time period would require us to solve certain systems of linear equations, adding substantially to the computational effort. The supply at a node S\ is defined not as a constraint but is actually an implicit function of 0. This is because Sj+1 can be calculated as

sr i =-R5 +i +ix--$

(6)

As a result, the total flow 5 ' through a node depends on the partial vector 0* that gives the decisions made in earlier time periods. We will henceforth use the notation 5 ' and Sj (0) interchangeably, using the latter when the stress on the functional nature of S* is needed. Our objective is to minimize the total cost of flow along the links over the entire network using the link cost functions IhW = fb(°h • s<(0))-

(7)

The optimization problem can be stated as follows : minimize

F(0)

= E E E 4 W

6

subject to

t

Y,eh

=

l

i

(8)

i Vi

>*

(9)

Vi,j,t

(io)

j

0\3 > o

208

W. B. Powell, E. Berkkam,

and I. J.

Lustig

T h e first constraint ensures t h a t t h e sum of all the fractions of total supply at a particular node i add up to unity and t h e second one is t h e nonnegativity constraint of t h e flow fractions. By (7) and (6), we have t h a t t h e new formulation produces a nonseparable objective function, b u t separable constraints. For t h e special case of dynamic networks we show in t h e appendix t h a t t h e derivatives of F(9) can be calculated using only nominally more effort t h a n required for t h e original N D N - X case, producing subproblems t h a t are trivial to solve when using a Frank-Wolfe algorithm. T h r o u g h a straightforward application of t h e chain rule, we can obtain t h e following recursions for t h e derivatives:

dF

_

(dfj3

Wi ~ U ^

Jfasm

dF +

\

3Sp(0)J ' ( )

= y(M.#. +

yxdx^

_a^.*:\ °'^ ds}+1(6) ")

( }

(12) {

'

To simplify t h e notation, let t h e gradients be denoted as

9ij

=

(13)

-QQC

It will be clear t h a t g\- represents t h e gradient of F evaluated at some point 6. These derivatives can be found by a backward pass, where t h e main idea is to start at t i m e t = P and move backward in time over t h e entire network. T h e B a c k w a r d P a s s procedure works as follows where it is assumed t h a t 9\- and S\ are known for all i, j , and t. Step 1. Set gfj+1 = 0 for all i,j and ^f +1 = 0 for all i. Step 2. For each t = P, P — 1 , . . . , 1 , compute

4 = (H+3+iVs -d

§\ =

JX(||

+

#»

^ (16)

T h e loop is derived from t h e recursions (11) and (12). Note t h a t

is the gradient of t h e original objective function evaluated at the value x\- = Q\- • S\.

On Algorithms

for Nonlinear

Dynamic

Networks

209

For t h e overall algorithm we need t o calculate t h e supplies S* and t h e gradients
Step 2. For each t = 1,2,..., P - 1, update for each j , S]+1 = Rfl + £,- fy • Sj.

4

Solution Algorithms for NDN-T

In this section we outline three standard algorithms t h a t can b e used with t h e new formulation: t h e Frank-Wolfe algorithm, a gradient projection algorithm, and an active set strategy. All of these algorithms take advantage of t h e special structure of the constraint set. In t h e gradient projection m e t h o d , t h e required projection operator is particularly simple (see Bertsekas [3] for a discussion of projection m e t h o d s using this problem formulation). T h e active set m e t h o d also uses t h e structure of t h e constraints t o find a basis. These algorithms are only used t o illustrate t h e general performance of t h e transformation, and do not represent a comprehensive study of algorithms for this class of problems.

4.1

Frank-Wolfe

A s t a n d a r d application of t h e Frank-Wolfe algorithm involves solving t h e following subproblem determined by computing t h e gradients g\- at some point 6: minimize

£ £ £ < ^ t

i

subject t o

(17)

j

Ysfti =

l

Vi

><

(18)

3

fit

>

0

Vt,i,<

(19)

T h e solution to this subproblem is t h e vector j3 . T h e problem decomposes by each node (i, t). Hence, for each pair (i, t), we simply choose t h e index j corresponding to t h e most negative value g\j. We then set ji\j = 1 and t h e values /?'• = 0 for k ^ j . Using t h e backward and forward passes, t h e complete algorithm can b e built u p very efficiently, which is as follows: Step 1. Let 0'°) = 0 and calculate g\i at t h e point 6 = 0. Solve minimizeX)EE4-^ "

t

i

j

subject to (18) and (19). T h e solution ft becomes t h e initial solution 0' 1 ' = /5*. Set k = 1.

210

W. B. Powell, E. Berkkam,

Step 2. Use t h e Forward Pass t o find t h e vector (Sj)^.

and I. J.

Lustig

.

Step 3. C o m p u t e t h e derivatives g\- evaluated at #(*' using t h e Backward Pass. Step 4. Solve the linearized subproblem (17) subject to (18) and (19) to get t h e solution / 3 « . Step 5. Find t h e o p t i m u m step size ct by solving: minimize F(0<*> + a(/?(*> - 0 (fc) ))

Step 6. U p d a t e 0 = 0<*> + a*(/?<*> - 0<*>). |F(S( fc + 1 ))-F(»(*))|

Step 7. If J cva(fch < e i then stop. Otherwise set fc=fc+l and go back to Step 2. T h e linearized subproblem is solved easily when compared with t h e classical linearized subproblem since it needs to look only for t h e most negative gradient out of each node t o decide on which arc to put flow. By contrast, t h e classical formulation using link flows produces a subproblem t h a t requires t h e solution of a linear network.

4.2

Gradient Projection Method

An attraction of Frank-Wolfe is t h a t it produces a feasible descent vector, but t h e cost is t h e use of an extreme point solution. Gradient m e t h o d s generally offer a better search direction, but require a projection operation to regain feasibility. T h e simple constraint structure of t h e transformed problem allows this projection to be performed with relative ease. Let 0\. be t h e vector of flow fractions out of node (i,t). We scale the gradient out of node (i,t), g\_, by dividing all entries out of t h a t node by t h e most negative element. Let the scaled gradient be denoted as g\_. T h e u p d a t e d flow fractions out of node (i,i) are obtained by Pi = 0i. + 9i,

(20)

which will be infeasible. T h u s we project /?,-. back onto t h e feasible region using the m e t h o d presented by Held et ai. [7] and determine the feasible direction dk by dk = /? p r o j - 6

(21)

This direction is t h e n used in t h e main iterate

0<*+D = *<*) + a * . d*

(22)

On Algorithms

4.3

for Nonlinear

Dynamic

Networks

211

Active Set M e t h o d

An alternative use of t h e simple constraint set is t h e active set m e t h o d (see Gill et &1. [6]). Having a single separable constraint for each node (i,t) makes it easy to find a basis by eliminating one variable using each constraint. In doing so we distinguish between t h e nonnegativity constraints t h a t hold exactly (active) and those t h a t do not (inactive). At each node, apart from t h e convexity constraint which is always active, whenever 6\j = 0, we set it as active and whenever 6\j > 0, it is referred to as inactive. Out of all t h e inactive variables we eliminate one and m a k e it basic. Let

and

El

=

the set of variables 6\- t h a t are eliminated (basic),

N\

=

the set of variables 8'- t h a t are not eliminated and t h a t are inactive (nonbasic),

where t h e sets are m u t u a l l y exclusive. T h e basic iteration is 0(*+i) = fl(*) + a* • dk

(23)

We start by eliminating one of the inactive flow fractions, which is strictly positive, out of each node (i,t) such t h a t

e =1

for

« - E °h

°«^E*

(24)

If 6\: is nonbasic then t h e direction of movement is t h e negative gradient out of node i evaluated at $,-••

( 25 )

< = -9h If 0\; is basic then < =

E

9tk

(26)

Since t h e elements with d\j < 0 are going to decrease from their current values in order to m a i n t a i n feasibility ({8'j)k+1) > 0), a ratio test must be performed t h a t evaluates t h e distance of t h e variables to zero:

mm

4<0

(27)

i,j,t

T h e variables are then u p d a t e d with respect to a stepsize a* such t h a t 0 < a* < a n is obtained through a one dimensional search.

212

5 5.1

W. B. Powell, E. Berkkam,

and I. J.

Lustig

Numerical Results Experimental Design

Experiments were run to provide an indication of t h e i m p o r t a n t properties of each formulation and to contrast t h e performance of t h e different algorithms on each formulation. Relatively little formal experimentation has been reported specifically for dynamic networks, with t h e notable exception of t h e work by Aronson and Chen [2]. This work focussed on dynamic networks t h a t featured potentially unbalanced transportation problems in each t i m e period with inventory carry-over arcs from one period to t h e next. Aside from t h e fact t h a t t h e networks are linear, these networks exhibit a basically different structure from the deep networks t h a t we consider. Given t h e exploratory n a t u r e of our experiments, we used randomly generated networks to test t h e algorithms. T h e network generator, however, was designed in the context of dynamic fleet management problems arising in truckload trucking. In this application, the nonlinear "cost" functions arise from an a t t e m p t to capture the uncertainity in t h e d e m a n d for transportation from city i to city j (see Powell et al. [9] and Powell [10] for complete details of t h e model). Let D be t h e uncertain demand for transportation in a particular m a r k e t , and let x be t h e flow of vehicles. T h e n m i n { D , : r } vehicles will move loaded, generating revenue r, while x — mm{D,x} will move empty, at a cost c. Let p(x) be the expected cost (negative profit) generated on this link, given by: p(x) = EQ[ c(x — min{_D,x}) — r m i n { D , x } ] If D is described by a simple density function fo{x) then

= Ae~ Al: , where A =

p(x) = ex - j ( r + c)(l - e~Xx)

(28) 1/E[D], (29)

A

Of course, any nonlinear cost function could be used, but we felt t h a t it helped our design of the network generator to use parameters and assumptions t h a t were motivated by an application. For example, we were able to choose input supplies (representing t h e supplies of vehicles in the fleet distributed among t h e set of cities) in a m a n n e r consistent with t h e demand for t h e vehicles. In order to generate t h e parameters r, c, and A for each m a r k e t , we generated r a n d o m coordinates for cities, uniformly over a 1000 by 2000 mile rectangle, from which we could calculate distances dij for each city pair (i,j). Given these distances, we used r%]

=

1.2^

(30)

C{j =

0.6dij

(31)

Xn =

J-

(32)

On Algorithms

for Nonlinear

Dynamic

Networks

213

where 1.2 and 0.6 are typical per mile revenues and costs for t h e trucking industry. T h e external supplies t h a t enter the network at t h e first t i m e period are generated by:

where 7 G [0.3,0.7]. By choosing the fraction 7 within this range, we m a d e sure t h a t flows stayed on t h e curved part of t h e nonlinear cost function, which corresponds to t h e hatched region in Figure 3. Inconsistency between t h e flows (representing t h e supply of vehicles) and t h e market demands (which determines t h e shape of t h e cost function) may result in landing on t h e linear portion of t h e function by being too far off to t h e right or to the left. T h e last p a r a m e t e r required is t h e link density. We generated each link with probability a , using a = 0.5 for most problems. In addition to using our algorithms, certain sets of experiments were run using G E N O S (Mulvey and Zenios [8]). G E N O S includes implementations of the primal t r u n c a t e d Newton m e t h o d and simplicial decomposition, both specialized for nonlinear generalized networks. To accomodate t h e predefined functions in G E N O S , t h e experiments which involved comparisons against G E N O S used cost functions of t h e form ae . In t e r m s of t h e parameters of our problem, the functions were given by: p(x) = j{r

+ c)e-Xx

(34)

T h e test problems are designed to compare t h e classical formulation against t h e new one using Frank-Wolfe, and to see how well the new algorithm performs against G E N O S . Beyond t h a t there is the question of how different m e t h o d s like Frank-Wolfe, projected gradient, and t h e active set m e t h o d compare with each other when they are applied to t h e new formulation. Toward this goal, a series of experiments were run to test t h e effect of the number of regions and t i m e periods on networks with various densities. Of particular interest is t h e effect of longer planning horizons on t h e r a t e of convergence. We used a simple stopping rule based on t h e relative change in t h e objective function from one iteration to t h e next, given by:

—

• ,, F($W)

< t

(35) '

V

where e is a p a r a m e t e r t h a t we set to 0.001. For some of t h e experiments, we solved t h e problem with one algorithm and then measured how long a competing algorithm required to produce t h e same objective function value. T h e codes t h a t implement t h e solution algorithms are written using t h e C prog r a m m i n g language. Computational tests are performed on a Silicon Graphics 4 D / 7 0 workstation running SGI Unix V3.2 with code compiled with t h e M I P S c c compiler using t h e default optimization level.

214

5.2

W. B. Powell, E. Berkkam,

and I. J.

Lustig

Results and Conclusions

We began by experimenting with different algorithms for N D N - T to investigate t h e properties of t h e transformation. Six test problems were randomly generated. Each problem is characterized by t h e n u m b e r of cities, number of t i m e periods, and network density. Other p a r a m e t e r s (such as r,j and c,j) were fixed as described earlier. Initial experiments indicated t h a t t h e projected gradient algorithm was superior to others. To obtain fair comparisons between t h e three algorithms for N D N - T , we ran t h e projected gradient algorithm until it satisfied t h e e-optimality test. T h e other two algorithms were then run until they produced an objective value t h a t met or was closest to t h e result obtained using t h e projected gradient algorithm. Figure 4 illustrates t h e rate of convergence of t h e three methods. Table 2 gives t h e results of a side-by-side comparison of t h e Frank-Wolfe algorithm for N D N - T and N D N - X . Here t h e results of the Frank-Wolfe run for N D N - T were taken from table 1. T h e n , Frank-Wolfe was used on NDN-A" until it produced an objective value t h a t m e t or came closest to t h e results for N D N - T . T h e results show a d r a m a t i c deterioration in performance as t h e n u m b e r of t i m e periods is increased. For 20 t i m e periods, t h e Frank-Wolfe algorithm applied to N D N - X could not even reach t h e result obtained using N D N - T within a reasonable t i m e . This behavior is explained by t h e n a t u r e of t h e e x t r e m e point solution given by t h e Frank-Wolfe algorithm for N D N - X , as illustrated in figure 2. By contrast, figure 5 illustrates t h e Frank-Wolfe solution for N D N - T using the same network and link flows as figure 2. In the ^ - f o r m u l a t i o n , many nodes have no flow moving through t h e m in t h e e x t r e m e solution, especially in t h e later t i m e periods. As a result, t h e one dimensional search uniformly decreases t h e flow on all t h e links of these nodes. T h e links e m a n a t i n g from the same nodes in t h e T-formulation will also experience a net reduction in flow. However, this formulation also allows t h e algorithm to shift flow between t h e links emanating from these nodes, further refining t h e solution. It m u s t be acknowledged t h a t Frank-Wolfe is not t h e best algorithm for either of t h e formulations. In table 3, we used t h e best available algorithm for each formulation. For NDNT , we used t h e projected gradient algorithm, and for N D N - X we used G E N O S , a package designed for nonlinear (generalized) networks. G E N O S includes specialized implementations of both the primal t r u n c a t e d Newton algorithm and simplicial decomposition. These experiments revealed some limitations of t h e system. First, we were unable to run our larger problems due to restrictions in t h e G E N O S software. Second, initial experiments produced almost pathologically slow execution times using t h e primal t r u n c a t e d Newton algorithm. We concluded t h a t additional work was needed on this algorithm and t h a t t h e run times were probably not an accurate measure of t h e performance of t h e algorithm. As a result, we only report t h e results using simplicial decomposition. Recall t h a t we were forced to use different link cost functions to accomodate G E N O S . A new set of test problems were generated using at most 10 cities and 10 t i m e periods. In each case G E N O S was run until its inter-

On Algorithms

for Nonlinear

Dynamic

Networks

215

nal optimality conditions were satisfied. T h e n , t h e projected gradient algorithm was run for N D N - T until t h e objective function value m e t or came t h e closest t o t h a t produced by G E N O S . T h e execution times reported in table 3 indicate t h e d r a m a t i c improvements in r u n times over G E N O S , especially in t h e larger problems. This result should not b e t o o surprising when we consider t h a t simplicial decomposition is just a generalization of Frank-Wolfe, a n d must still use e x t r e m e points with t h e qualities depicted in figure 2. A final set of experiments was r u n t o investigate t h e possibility t h a t t h e results are sensitive t o t h e structure of t h e input flows. All t h e networks generated u p t o now exhibit t h e property t h a t R{(t) = 0 for t > 2. While this is fairly realistic for dynamic fleet m a n a g e m e n t problems, it produces t h e sparse structure of t h e e x t r e m e point solution exhibited in Figure 2. T h e final set of tests was conducted using a set of input flows t h a t satisfied i?,(i) > 0 for all cities a n d t i m e periods. T h e values for t h e external supply vector are obtained by: R,(t)

= ptf-^l

-S)^:

ji-]

(36)

where p = 0.5 and 6 = 0.3. This expression p u t s a declining amount of flow into later t i m e periods, with t h e total amount of flow entering all t i m e periods comparable t o t h a t used in earlier experiments. Again, t h e reason is to insure t h a t link flows stay on t h e "interesting" part of t h e nonlinear cost functions. In this case, a plot of t h e nonzero flows in an e x t r e m e point solution of Frank-Wolfe looks more like figure 5 for b o t h formulations. T h e results, shown in table 4, indicate t h a t t h e relative results do not change significantly as a result of all positive input flows. T h e explanation is t h a t while a plot of nonzero flows looks more like a dense tree, a plot of t h e flow volumes in t h e e x t r e m e point solution would still show a noticeable funneling of flows onto a single p a t h .

6

Appendix: Calculation of the Derivatives

In this section we wish to use t h e acyclic structure of network t o develop backward recursions for t h e derivatives dF/dO'j, in order to minimize t h e function F(6). For this we need t o evaluate its derivative with respect t o B\-. To m a k e t h e calculations easier t o follow, we can split t h e objective function into three p a r t s by considering a particular t i m e t. T h e objective function is then viewed as a combination of t h e earlier t i m e periods t' < t, t h e present t i m e t, a n d t h e future t' > t. For any value t, let F(0) = Ht1{0) + Ht2{8) + Ht3{0) (37) where

Hl(9) = EEE/«W t'
k

i

(38)

W. B. Powell, E. Berkkam,

216

H'2(9) = £•(*) + E E & W A

Lustig

(39)

(4°)

Hm = EEE/Sw t'X

and I. J.

(

T h e r e are some i m p o r t a n t facts about these functions stated in t h e following two lemmas, t h e first of which is stated without proof: L e m m a 6.1 For any time t < P, # J ( 0 ) = Hl2+1{B) + L e m m a 6.2 For s > t, the partial

Hl+l{6).

derivative

dH\{6) _

Proof. For f < t, f(,(0) = /£,(0£,5J'(0)) by equation (7). Since 5 / ( 0 ) does not depend on any values 6\- for t > t', and H[ is a sum of the functions fl, with t' < t, it follows t h a t H\ has no dependence on any value of &"• for s > t. O To find dF/dB^j, we first need to evaluate dH^/d9\ , which corresponds to differentiating t h e function of future time periods with respect to a change in the flow fraction at t h e present t i m e t. T h e n we need to evaluate dH\jd6\- which corresponds to differentiating t h e function of t h e present t i m e period with respect to a change at t h e present t i m e t. L e m m a 6.2 has shown t h a t dH\ld6\j = 0, i.e., the derivative of t h e function of t h e past t i m e periods with respect to t h e current t i m e period is zero. So we have t h e following two propositions: P r o p o s i t i o n 6 . 3 The partial

derivative

dH

>aetj

Proof.

dp

ds^ie)

-sue).

From L e m m a 6.1, it follows t h a t

H^("i+' + Ei:;4

(«)

Since t h e derivatives for k ^ j are zero due to t h e structure of the dynamic network, t h e expression (41) is simplified to

On Algorithms for Nonlinear Dynamic Networks

217

Using the chain rule and equation (7),

df]t\0) _

dfir^T) mi df)t\xT)

w3

dx),

(43)

dx^ de\3

(44)

It follows from the chain rule that dx'+l dS*+1{0)

9*r ddlj

ds<+1{6) d9\,

(45)

where

uu

ij

°ij

k

= jrSiW-t>h

(«)

=

(48)

St(6).

This last expression is obtained by using the relationship (6) and realizing that Sl(0) is an implicit function of 6(t — 1), and hence is constant with respect to #,--. Using the chain rule once again helps build the recursive structure by dH?1 — = dB\3

3#< + 1 dS' + 1 (0) • —-—— dSf\B) d6\j

= ^%-5lW

K(491

'

(50)

Substituting (48) into (45) and then into (44), and finally together with (50) into the original expression (42) we obtain dH*3

dH?1

„„„,

, „

df i + 1

\dS}+\o) + ^~d^i ' dsf\o) I' s'{9) dF SKO)dS<+1{&)

(52)

(53)

The last equality follows from the fact that F = H{+1 + H'2+1 + H'3+l and that Lemma 6.2 implies that (d# 1 + 1 (0)/dSj + 1 (0)) = 0. D

W. B. Powell, E. Berkkam, and I. J. Lustig

218 It is useful to note that r)-rt+l

ft

+

+1

~ ds* (e)(

dsl \e)

il

(

' '

'

(55)

= «T

Proposition 6.4 The derivative of the present cost function H\{6) with respect to

e\. is de\3

6

- ^ - -

'

w

'

Proof. Here we are evaluating the relative change in the objective function due to a change in the flow fraction 6\3 at the present time period t. Since this change is only along the arc from node i in period t to node j in period t + l,the derivative is not effected by changes along arcs other than this specific arc. From the definition of H\{6) given in (39), dHj(9) d0\3

_ ~

dfjM d9\3

[

dftMi •ffW)

(

de\3 =

Jl ,l}

'\ ox-

• Sl(0).

Theorem 6.5 The partial derivative of the function F with respect to 9\- is

Proof.

(dfj3{x\3) \ dx\3

+

dP \ dS}+1{0))

Differentiating (37) and using Lemma 6.2 gives dP{6) d6\3

=

dHj(6) d9\-

+

dHj(6) 36% '

,„, '

(58)

Using Proposition 6.3 and Proposition 6.4, we are now ready to state

dF{9) d9<3

'

'{>'

On Algorithms for Nonlinear Dynamic Networks

219

The first expression is presented as Proposition 6.4 and the second one as Proposition 6.3. Combining the two proves the theorem. •

The next step is to find dF/dSl{9). Hence we need to evaluate 5#|(0)/dS/(0) and dHl/dSl(8). We present the following two propositions: Proposition 6.6 The partial derivative of the future cost function H\(6) with respect to the flow S*(9) through node i at time t is dH<3(6)

dF(6)

=

asm

ij

^ds?\e)

-

Proof. From Lemma 6.1 and the definitions of Hl+1(8) and Hl+1(6) given by equations (39) and (40), one can write

dH*(0)_ a ((^f'ne^+H^m) dsi(0) asm u . ,

(59)

Using the chain rule,

dsjie)

(60)

dsj(e)

df^T)1 dx'j

=

x

&ff dsi(oy

(61)

Now the last derivative in (61) can be written as

dSi{0) ~ dS*+1(9) ' dSI(6) '

(62)

where it follows from equation (6) that dS'+1{9)

am

8

- sm^sm-9i>

(63)

st(oy (65)

220

W. B. Powell, E. Berkkam,

Since for each j , Sj+1(9)

and I. J.

Lustig

is a function of Sj(9), it follows from t h e chain rule t h a t

y dsl+1{e)'

ds}{6) _

^8H^(9)

- tww

{

dsi(8) ir

' (}

Substituting (65) into (62), then into (61), and finally together with (67) into t h e original expression (59) we obtain:

dam _ ( = £

aff1

dgrtfT) dfT(*T)

v agj

+1

dx'f ] , dH^(9)\

dxf1

.V

\

OS?1 {6)

dStJ+1{9)J

= zZj~y°\r

w

t

(69)

(70)

T h e last equality follows from t h e fact t h a t the expression in parentheses in equation (69) is precisely 8Hl+1(8) dHl+1(8) dS<+1{8)

+

dSt+\9)

and t h e fact t h a t

dHr(9) _ dS)+1{6)

'

which follows from L e m m a 6.2.

•

P r o p o s i t i o n 6.7 The derivative of the present the flow S\{9) through node % at time t is

ds\{9)

Proof.

y

^From the definition of Hl(9)

cost function

dx\3

with respect to

*'•

in equation (39),

dim

= r iw

dS't(9)

y

dS\{9)

" y dsm _ -

H\{9)

y ^ "Ji]{xij) at 2^—sit—•%•>

(7]) (

'

(U) /7o\ (73)

On Algorithms

for Nonlinear

Dynamic

Networks

221

with t h e last equality derived in a similar fashion as in equations (62) through (65).

• We now can c o m p u t e t h e derivative of F with respect to S\{9), lowing theorem. T h e o r e m 6.8 The derivative through node i at time t is

dF

of the cost function

(dfliiAi)

v

gt

Proof. Differentiating (37) and using (dH[{6)ldS\{6)) 8F dSt{6)

_ dH'2{6) dSl(6)

,

stating t h e fol-

F with respect to the flow Sj{9)

dF

= 0, gives dHl{6) dSj{d)'

T h e first expression is presented as Proposition 6.4 and t h e second one as Proposition 6.3. By combining t h e two we obtain the desired final derivative. •

References [1] J. E. Aronson, A survey of dynamic network flows, Annals search, 2 0 (1989) 1-66.

of Operations

Re-

[2] J. E. Aronson and B. D. Chen, A forward network simplex algorithm for solving multiperiod network flow problems, Naval Research Logistics Quarterly, 3 3 (1986) 445-467. [3] D. P. Bertsekas, Algorithms for nonlinear multicommodity flow problems, in International Symposium on Systems Optimization and Analysis, (Springer-Verlag, 1979) p. 210-224. [4] D. P. Bertsekas, E. M. Gafni, and R. G. Gallagher, Second derivative algorithms for m i n i m u m delay distributed routing in networks, IEEE Transactions on Communications, C O M - 3 2 (1984) 911-919. [5] R. G. Gallagher, A m i n i m u m delay routing algorithm using distributed computation, IEEE Transactions on Communications, C O M - 2 5 (1977) 73-85. [6] P. E. Gill, W . Murray, and M. H. Wright, Practical Press, London, 1981).

Optimization

(Academic

222

W. B. Powell, E. Berkkam,

and I. J.

Lustig

[7] M. Held, P. Wolfe, and H. P. Crowder, Validation of subgradient optimization, Mathematical Programming 6 (1974) 62-88. [8] J. M. Mulvey and S. A. Zenios, G E N O S 1.0 user's guide: A generalized network optimization system, Tech. R e p . 87-12-03, D e p a r t m e n t of Decision Sciences, T h e W h a r t o n School, University of Pennsylvania (1987). [9] W . B . Powell, A stochastic model of t h e dynamic vehicle allocation problem, Transportation Science 20 (1986) 117-129. [10] W . B . Powell, A Comparative Review of Alternative Algorithms for t h e Dynamic Vehicle Allocation Problem, in Vehicle Routing: Methods and Studies (North Holland, New York, 1988), p . 249-292. [11] W . B . Powell, Y. Sheffi, and S. Thiriez, T h e dynamic vehicle allocation problem with uncertain d e m a n d s , in Ninth International Symposium on Transportation and Traffic Theory (1984) p. 357-374. [12] W . W . W h i t e and A. M. Bomberault, A network algorithm for e m p t y freight car allocation, IBM Systems Journal 8 (1969) 147-171.

CN
NO CO

vo cs •«3-

co

ON

w->

oo

00

r~

• *

cs

NO

•»*

O l O >n 0 0 fa *-^ ^ <s NO

co r~

ON NO CO CO

TjON

—i

ON

00

u-> 00

Tt f^ ©

ON

ON

•<* Tf

*-H

«-i

r-

t•*r

•* CO

,_

co

rr

oo u-i NO

cs

1

© oo

NO* NO CO

ON

CO »—1

oo*

u~t 00

m

CO

oo

ON

tOO NO NO

O "1

rrNO

NO —<^

T

CN

OO

CO

d oo '-I

CO

>o ,_T

•*_

NO

©

•

°i

ON

r--

t-~
ON

—'

d d © © »—H OO CO O N

CO

q CO •*

NO

i—«

00 NO

ru-i NO

r-

co

•* d

00

fa OO

<s

ts

Cfl 0 0

<

CN

"-*

CN

OS (S

r-

NO

oo"

8.

ts

»—( m NO

o m O oo IT) fa <s d

fa O t-~

i

00 N©

ON

CO O

< NO NO

!/->

00 o 00 *—< r-^ C ; rNO oC ON CO "1 »—'

OO

r~

O t-^

W oC NO* NO fa oo r1

CS

vo

©

-.

cs

Q

fa

i

©

©

©

©

>r> >n

co"

"7 <"i

cs

•*

ON

*o r~ r" d oo"

1

oo

'

r~ NO

I-H

00 ON NO

VI

©

cs

,"^

VI 41 •™

w

"!

1-^

o

§

oo

CN

ro co

<

wa

On Algorithms for Nonlinear Dynamic Networks

V)

C

0)

*•*

.2 ^ «

/*^ cJ

-*-»

i

s o

1 4>

1 > 1 u

oo

NO CO

»—<

1/1

o

d d d d d

V)

U1

<s

NO

fa

>o

• *

m

VI 41

S

o

Tf

©

<s

CO

o

z

-

<s

eriods

(se

CPU Functi Obje

223

*.

Comparison of NDN-T and NDN-X using Frank-Wolfe

Objective Function No. 1

Cities PeFiods 20 5

Dens. 0.5

NDN-T -189,887

NDN-X -189,815

CPU time (sec.) NDN-T 6.12

NDN-X 22.15

NDN-T 37

NDN-X 103

152.01 70.74

47

181

31

171

898.82 246.54

65 47

529 297

to

999.88**

76

268

O

2

40

5

0.5

-766,618

-766,851

29.52

3 4

20 40

10 10

0.5

-367,280

0.5

10.32 80.94

5

20

20

0.5

-1,597,279 -710,476

-367,560 -1,597,414

6

40

20

0.5

-3,268,208

-706,348*

30.44

-2,065,743

190.85

Table 2

Iterations

to re

* result up to stopping criteria of 0.00001 ** result up to CPU Time of 1000

Ig

a

I

Comparison of GENOS verses NDN-T*

Objective Function *

0.5

GENOS 13,981 29,604

3 4

8

8

0.5

63,373

10

2

0.2

5

10

5

0.2

10,607 35,554

6

10

10

0.2

7

10

2

0.5 0.5 0.5

No. 1 2

8 9

Cities Periods 2 8 4 8

10 10

5 10

Dens. 0.5

NDN-T 13,981 29,594 63,348

CPU time (sec.) GENOS 2.20

NDN-T 0.68

5.76

0.46

10.94

10,609

1.85

0.98 1.85

5.59

0.47

62,485

35,545 62,472

10.78

0.74

43,180

43,179

4.63

0.36

90,342

90,237

164,096

163,883

10.86 25.12

0.69 1.60

Table 3

* same exponential function is used in both GENOS and NDN-T * objective function values reflect the minimum cost attained

Comparison of NDN-T and NDN-X using Frank-Wolfe (with external supplies at each period)

Objective Function No. 1 2 3 4 5 6

Cities Periods 20 5 40 5 20 10 40 10 20 20 40 20

Dens. 0.5 0.5 0.5 0.5 0.5 0.5

NDN-T -176,807 -608,662 -328,328

NDN-X -177,107

NDN-T 8.24

NDN-X 13.31 88.20

-606,669

43.69

-328,095

13.55

-1,228,038 -603,379

-1,228,730

78.96

-601,000

15.71

46.31 398.80 140.48

-2,764,947

-932,789

170.15

998.72*

Table 4

* result up to CPU Time of 1000

CPU time (sec.)

Iterations NDN-T 46 67 40 61 23 65

NDN-X 62 105 110 232 164 268

TIME PERIODS

o

->-

a cm o

(0

3

z o a

r

UJ

oc

f

*o

Figure 1 Space -time representation of a network

Example problem of 10 Regions and 10 Time periods

O

Q O

O

O

o o o o o\ o o o o o o o

\o/o\o o

o

o

\o / o o o \o/ o o o o o o o o Figure 2

Solution of classical linearized subproblem using Frank-Wolfe

°\

\° / \° /

s~\ Super >~J Sink

3 to well, E. Berkkam, and I. J. Lu,

ID/

o

o o o o o o o o o o o\ o o o\ o o o/ o o \o/ o o

O O

OS

On Algorithms for Nonlinear Dynamic Networks

)SO0

o v. TO C O

229

Sipnq

I

I

en

O

jB o'

o

-1

0 _§9_

p j pwe 'urejpfjsg- -g 'ffSAioj g •/&

O

o

-2 a5

1 Q5

Objective value 2

3 _§5

0£S

O

Example problem of 10 Regions and 10 Time periods

13

p

o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o p o o o o o o o o o o o o o Figure 5 Solution of linearized subproblem for the transformed problem using Frank-Wolfe

r i Super Sink

233 Network Optimization Problems, pp. 233-262 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Strategic and Tactical Models and Algorithms for the Coal Industry Under the 1990 Clean Air Act* Hanif D. Sherali Department of Industrial and Systems Engineering, Virginia Polytechnic and State University, Blacksburg, Virginia 24061-0118

Quaid J. Saifee Association of American Michigan 48076-4251

Railroads,

26555 Evergreen

Road, Suite 1120,

Institute

Southfield,

Abstract

This paper is concerned with a study of the effect of the Acid Rain Provision of the 1990 Clean Air Act on the investment, production, and distribution operations in the coal industry, with concentration on the development of new mines, shutting down inefficient strips of existing mines, and on blending and distribution problems. The problem here is to determine which new mines to open and when, and what decisions and schedules to make for the shipment of coal from the mines to silos, cleaning and blending operations at silos, and subsequent shipment of coal to customers over a multi-period time horizon, so as to satisfy the demand at a minimum total operational cost. A longterm strategic model is developed to meet this objective. The final product is a computer based decision tool which will serve as a mechanism for implementing cost effective decisions in light of complex variations in the production levels of existing and potential mines, ore quality, and demand and quality requirements. The strategic model will play a useful role in planning future growth and in making capital investment decisions. The model can also be used to study the effect of various policies, by testing the sensitivity, feasibility, and the cost of system operations under different perturbations of system configuration, data * Acknowledgement: This research has been supported by the Department of the Interior's Mineral Institute program administered by the Bureau of Mines under allotment grant # G1114151

H. D. Sherali and Quaid J. Saifee

234

and demand specifications. Real operational data from the Westmoreland Coal Company and hypothetical test data are used for testing purposes. We also present a related short-term tactical model in the appendix, that can be used to assist in decisions regarding day-to-day operations.

1

Introduction

T h e research conducted in this study was motivated by t h e Acid Rain Provision of t h e 1990 Clean Air Act, passed by Congress on October 28, and signed into law by President Bush on November 15, 1990. This Provision in t h e Act places stringent restrictions on t h e sulfur dioxide and nitrous oxide emissions, particularly on electric utilities operating coal-fired generators, and it promotes t h e role of Southwest Virginia (our area of study) as becoming a central player in t h e coal industry due to its low sulfur coal resources, as compared with other coal mined east of the Mississippi River. However, in order for different companies in the industry to remain solvent and competitive in light of anticipated sudden changes in quality requirements of future coal d e m a n d , they will need to judiciously plan for t h e usage of their reserves, and t h e operation of their cleaning and distribution facilities. This paper addresses t h e combined problems of choosing new mines to develop, determining mine production, ore purification, and blending of different grades of coal, along with t h e problem of distributing the coal from mines to silos to customers, in order to satisfy customer d e m a n d s placed over several t i m e periods, each having specified quality requirements as driven by t h e Clean Air Act. A long-term strategic model is developed to address this problem from a planning perspective, and is designed to help coal companies in making strategic decisions such as the development of new mines or t h e development of new production units at mines. Also, this model is intended to assist coal companies by providing guidelines for t h e usage of their existing and potential reserves, and in the operation of their cleaning and distribution facilities, along with a possible investment in new technologies to offset t h e development of new mines. Additionally, a modified formulation of Sherali and Puri's [14] short-term tactical model is presented in t h e appendix. This model can be used to aid in t h e day-to-day operational decision making process. Information and d a t a for these models have been provided by t h e Westmoreland Coal Company, one of t h e largest coal mining companies in Southwest Virginia. T h e company owns several coal mines, some of which are currently in use (existing mines), and some of which can be developed later as needs arise. Each mine can produce coal at a specific rate and has certain quality specifications t h a t vary over time. This coal needs to be appropriately shipped to silo facilities, where it is subjected to a beneficiation process in order to be partially cleaned to a desirable degree. T h e different grades of coal at individual silo facilities then need to be blended and shipped

Strategic

and Tactical Models

and Algorithms

for the Coal Industry

235

to customers in order to satisfy d e m a n d s for various quantities having stipulated quality specifications. In our case study, the Westmoreland Coal Company owns two large silo facilities, known as t h e Bullitt facility and t h e Wentz facility. T h e Bullitt facility has six silos, each with a capacity of 14,000 tons. Silo 1 and Silo 2 store coal t h a t does not require to be cleaned. Silo 3 contains high sulfur coal and Silo 4 contains low sulfur coal. After beneficiation, coal from Silo 3 is transferred to Silo 5 and coal from Silo 4 is transferred to Silo 6 for storage. T h e Wentz facility has five silos, three with a capacity of 6,000 tons and two with a capacity of 9,000 tons. One of t h e m is used to store coal which does not need t o be cleaned, two of t h e m are used to store t h e remaining coal prior to cleaning, and t h e remaining two are used for storing coal after it has been cleaned. This being a typical structure in the industry, our model partitions t h e silos within each facility into two categories based on two different kinds of coal stored. These categories are denoted as J\ and J-i types of silos. T h e Ji silos are "run-of-mine" (ROM) blend silos, where a shipment is received, stored, and is directly blended with t h e other coal in its run-of-mine form itself. Each J2 t y p e of silo, on t h e other hand, constitutes a pair of silos, one of which receives t h e run-of-mine coal, while t h e other stores t h e cleaned coal following t h e beneficiation process, holding it ready for t h e blending operation. These silo pairs are indexed by t h e sets J2w and J2B for t h e Wentz and Bullitt facilities, respectively. Besides t h e conventional mine to silo to customer shipments, there might also be a coal transfer between t h e silo facilities at two different geographical sites. Such a transfer occurs due to the "stoker" customers. In t h e example of t h e Westmoreland Coal Company, "stoker" customers are provided with cleaned, sifted coal from t h e Wentz facility. Simultaneously, an almost equivalent amount of sifted by-product is shipped to t h e Bullitt facility. T h e normal blending processes resume at this stage. T h e problem is to find out which production units of potential mines should be opened and at what t i m e periods, and to determine optimal schedules for shipping coal from mines to silos, for cleaning and blending at t h e silos, and for t h e distribution of coal to t h e customers. This has to be accomplished subject to restrictions involving storage at t h e silos, production capacity limits, and material flow balance constraints, so t h a t customer d e m a n d s having required quality specifications are satisfied at minimal total cost. T h e long-term strategic model developed to address this problem has the structure of a fixed-charge, mixed-integer, zero-one programming problem. Zeroone integer variables are used to model t h e decision of opening a production unit at each particular mine. In addition to t h e cleaning and shipping costs, t h e objective function incorporates a fixed-charge component to reflect t h e cost of opening a new production unit at a potential mine. Also, a storage cost component is added to t h e objective function in order to penalize t h e underutilization of active resources. T h e constraints include production capacity, storage, material flow balance, and quality requirement restrictions, along with restrictions on t h e sequence in which units can be opened, and on t h e m a x i m u m n u m b e r of units t h a t can b e opened in any t i m e

236

H. D. Sherali and Quaid J. Sa.ifee

period without any substantial surcharge penalties.

2

Related Models in the Literature

T h e problems related to t h e coal industry have been analyzed in a n u m b e r of different ways. Different techniques varying over linear, nonlinear, and mixed integer 0-1 programming problems have been used to approach different problems related to t h e coal industry. Young et al. [18], Faulkner [5] and Johnson [7] provide few of t h e pioneering papers in t h e area. Many practical case studies of Operations Research in t h e coal mining industry have been surveyed by Tomlinson [16]. Knight and Manula [9] have developed a long-term simulation model to study potential coal production and utilization systems in Pennsylvania. Gershon [6] has formulated a mixed-integer model for a mine scheduling problem. Lietaer [10] presents a linear programming model with an objective of preparing a combination of mining works at a m i n i m u m total operational cost subject to different restrictions including operational restrictions on the mines and the concentrators, among other constraints. T h e problem of allocating coking coals from collieries to washeries and blending plants is also modeled as a linear prog r a m m i n g problem by Williams and Haley [17]. Sherali and Puri [14] develop three short-term tactical models for analyzing day-to-day coal flow operations in t h e coal industry, with a concentration on t h e blending and distribution problems. A modified formulation of their most accurate model is presented in the Appendix of this paper. Steinmann and Schwinn [15] have formulated a zero-one programming model to minimize t h e total resources necessary for balancing t h e capacity structure of the coal mine, subject to capacity constraints for a particular mine, and have reported on c o m p u t a t i o n a l experience with different algorithms used to solve this problem. Two link models, a t r a n s p o r t a t i o n - t r a n s s h i p m e n t model defined for t h e coal distribution network, and a location-allocation model defined for t h e potential location of coal handling facilities within receiving and shipping nodes of t h e network, have been developed by Osleeb and Ratick [12] to determine t h e optimal capacity, placement, and railroad and marine interface of coal handling facilities within and between the. New England ports and converting power plants. A mixed-integer location-allocation problem has also been formulated by Osleeb et al. [13] to evaluate the potential for reducing water-borne coal transportation costs, and concomitantly, t h e cost of delivering coal to the European markets. Candler [4] formulates a blending problem with integer constraints on a m i n i m u m usage of each coal t y p e in a mix, and recommends a sampling plan along with penalty function and rejection options. As evident in t h e literature, different papers have considered different detailed aspects of specific problems faced by t h e coal industry. However, the imminent problem of developing a strategic plan for a time-staged resource m a n a g e m e n t along with coalblending and distribution operations in order to comply with stringent future quality

Strategic

and Tactical Models and Algorithms

for the Coal Industry

237

requirements as p r o m p t e d by t h e 1990 Clean Air Act, has not been addressed for the coal industry. However, strategic models developed for other contexts do share a philosophical structure with our model. For example, Aboudi et al. [1] develop a longt e r m planning model for t h e petroleum production and distribution problem using mixed-integer programming techniques, and K h a n [8] also presents a mixed-integer model t o minimize t h e total disposal costs in an u r b a n solid waste disposal problem. Section 3 below presents a formulation of our strategic planning model. Exact and heuristic solution procedures are described in Section 4, and Section 5 presents computational test results using real and hypothetical d a t a sees. T h e Appendix contains a related formulation t h a t deals with t h e tactical day-to-day operational problem.

3

Formulation of a Long-Term Strategic Model

As described earlier, t h e problem at hand is to determine which production units of potential mines should be opened and at what t i m e periods, along with optimal schedules for shipping coal from mines to silos, for cleaning and blending coal at the silos, and for t h e distribution of coal to t h e customers, subject to various constraints. These constraints include production capacity, silo capacity, material flow balance, d e m a n d satisfaction for customers, and quality requirement restrictions, along with restrictions on t h e sequence in which units can be opened, and on the m a x i m u m n u m b e r of units t h a t can be opened in any t i m e period, without any substantial surcharge penalties. T h e model requires specific d a t a pertaining to t h e existing and potential mines, silos, and customers. This model considers periods equal to six m o n t h s in duration, with a horizon of three to five years. As this is a long-term model, all costs such as cleaning and shipping costs, revenues for shipping b e t t e r quality coal, and penalties for storage and underutilization at mines, are present values using a certain rate of return. T h e subscript t attached to these cost factors reflects this representation. In this regard, note t h a t the fixed-charge cost to open a production unit at a potential mine is t h e present value of t h e semi-annualized payments over the life of t h e mine t h a t are to be m a d e over the horizon of the model. Real d a t a from t h e Westmoreland Coal Company and nine other similar, hypothetical d a t a sets have been used for making t h e model runs. Given below are t h e d a t a requirements, along with t h e notation used.

t = 1 , . . . , T = n u m b e r of t i m e periods ( < 8). i = 1,. . . , / = number of existing and potential mines ( < 23). q = 1,. . . , hi = n u m b e r of units at a potential mine i ( < 30).

H. D. Sherali and Quaid J. Saifee

238 j = 1 , . . . , J = n u m b e r of silo units ( < 10). k = 1,... ,K = n u m b e r of customers ( < 12).

Kst = "stoker customers" served by Wentz silos. pit = production (tons) at existing mine i, in period t. Piq(t) = a function which represents production (tons) at unit q of potential mine i, in period t of its life. (Note t h a t t denotes t h e period of its life after this unit has been opened, and not t h e period of t h e model.) a,-j = ash content (%) in coal produced at existing mine i, in period t. Su = sulfur content (%), in coal produced at existing mine i, in period t. a, g( = ash content (%), in coal produced at unit q of potential mine i, in period t. (Note t h a t t denotes t h e period of horizon, and not t h e period of t h e unit's life after it has been opened.) Siqt = sulfur content (%), in coal produced at unit q of potential mine i, in period t. ( T h e same c o m m e n t applies here as for a,, ( ) I\ = existing mines. I2 = potential mines. J\ = R O M silo storage units. J2W = cleaned silo units at t h e Wentz facility. J2B = cleaned silo units at the Bullitt facility. J2 = cleaned silo units ( J?w U JIB )• SCjt = storage capacity of silo unit j in period t. (For j 6 J2, this is taken as t h e sum of t h e two associated silo storage capacities.) c-jt = present value of shipping cost (per ton), from existing mine i to silo j in period t. c

fqjt = P r e s e n t value of shipping cost (per ton), from a unit q of a potential mine i to silo j in period t. cf,t = present value of cleaning cost (per t o n ) , at silo j for coal from existing mine i in period t. cf -t = present value of cleaning cost (per ton), at silo j for coal from a unit q of a potential mine i in period t.

Strategic and Tactical Models and Algorithms for the Coal Industry

239

ctij € (0,1] = total weight attenuation factor (output per ton input) at silo j € Ji for coal from existing mine i. ctiqj = defined similar to a^- for production unit q at potential mine i. fiij € (0,1] = ash content attenuation factor at silo j g J 2 for coal from existing mine i. /3iqj = defined similar to ftij for production unit q at potential mine i. Hi € (0,1] = sulfur content attenuation factor at silo j S J 2 for coal from existing mine i. -/iqj = defined similar to 7^ for production unit q at potential mine i. Note: ctij, a,gj', fy, /3iqj, 7,j, and 7,-w- are assumed to be 1 for j € J1# Ai = flow arcs {i,j) from mine i to silo j ; Ff — {j : (i,j) £ A\ }, iZj = {i :

(«',i)e A x }. A2 = flow arcs (j, fc) from silo j to customer k; Ff = {k : (j,k) iP t = { j : ( i , * ) e A a } .

g A2 },

As = arcs (j, j ' ) representing by-product flow from Wentz silo j € J2w to Bullitt silo j ' € J 2 B , corresponding to stoker customer shipments; Fj = { j ' : (j,j') G A3},Rj = {j:(j',j)eA3}. c

jkt = present value of shipping cost (per ton), from silo j to customer k in period t. dfj?t = present value of shipping cost (per ton), from Wentz silo j 6 J2w to Bullitt silo j ' 6 JIB in period t. dkt = demand placed by customer k during period t. ASHukt = upper limit on ash percentage in coal to be delivered to customer k in period t. SULukt = upper limit on sulfur percentage in coal to be delivered to customer k in period t. Rkt — present value of additional revenue earned per percentage point below the maximum specification of ash content, in coal delivered to customer k in period t. fiqt = fixed charge ( equivalent present value as described above) to open a unit q at a potential mine i in period t.

H. D. Sherali and Quaid J. Saifee

240 / ( m a x = m a x i m u m { /•'
for all t = 1,...

,T.

PEa = present value of penalty per ton of coal unused at existing mine i in period t. PE{qt = present value of penalty per ton of coal unused at unit q of potential mine i in period i. Ut — upper limit on t h e number of units t h a t can be opened at a potential mine i in period t.

Mathematical Formulation A detailed m a t h e m a t i c a l formulation is given below, followed by an explanation and motivation of t h e objective function and the various constraints. T h e decision variables track t h e flow of coal from mines through silos (within the blending process) to the different customers. Specifically, these variables correspond to (i) t h e amount (tons) shipped from existing mine i through silo j to customer k in t i m e period t, given by yiju (ii) stoker customer by-product amount (tons) shipped from existing mine to i to silo j G J2W, which is then shipped to silo j ' G J 2 g , and then to customer k in period t, given by Yijyki (iii) amount (tons) shipped from unit q of potential mine i through silo j to customer k in t i m e period t, given by Wiq]kt (iv) stoker customer by-product amount (tons) shipped from unit q of potential mine i to silo j G J W , which is then shipped to silo j ' G J 2 s , and then to customer A: in a period 2, given by Wiqjjikt (v) a binary variable t h a t takes on a value of one if unit q of a potential mine i G I2 is opened or initiated in a period t, and is zero otherwise, given by X{qt. Note t h a t t h e aggregate shipments of coal from mines to silos, the blending process at the silos, and the aggregate shipments of blended coal from the silo facilities to the customers can be readily computed via these variables. Auxiliary Decision Variables 1. zn = slack variable equal to the amount (tons ) of coal produced at existing mine i during period t t h a t remains unused in t h a t period, for i G h, t = 1,. . . , T. 2. Ziqt = slack variable equal to the amount (tons) of coal produced at a unit q of potential mine i during period t t h a t remains unused in t h a t period, for i G h, q = l,...,hi, 8 = 1,...,T. 3. ASHkt = % ash content in blended coal delivered to customer k in period t, for k = l,...,K,t = l,...,T. 4. SULkt = % sulfur content in blended coal delivered to customer k in period t, for k = 1,...,K, t = 1,...,T.

Strategic

and Tactical Models and Algorithms

for the Coal Industry

241

5. 8it = n u m b e r of mines opened within t h e set limit Ut in period t, for t = 1 , . . . , T . 6. 02f = n u m b e r of mines opened beyond t h e set limit Ut in period t, for t = Objective Function: Minimize

£

£

£

£

( < # + eg, + c $ ) y y t f

j = i iefljn/] tgF? *=i

+ 3&J2W £ iGR)nh £ j'eFf £ j'GF £ , *=1 E ( ^ + ^ + ^" + 4 )ijj'ktn 2

£ E EEE(C +4 + 4 W

i = l i6fl;n/ 2 9=1 i g F 2 '=1

+ E E E E E E ^ + ^ + ^ +4 - f e + E E E / ' A + E £^<** + E E E ^ ' ^ <SH}n72 9=1 i = l

r

O.2/tmax02i

+ £

igfijn/i *=1 A-

r

i6R]-n/2 9=1 «=1

5

- £ E ( ^ ^ « - ASHkt)dktRkt

t=i

«:=i (=i

Constraints: 1. F/ow balance at the

mines:

Pit ~ E E

2to*< + £

j € f ? keF]

£

i£F} j'£Ff

y

£

.^'t<

Zit

k£Fj,

for i £ J ] , and < = 1 , . . . , T

E

Pi
t

+ 1 ) I ''9<

E

E

"'•uw + E

jeF} keFj

E

E

w

iijj'ks

^inS

jeF? }'£Ff tef*

for i £ I2, q = 1,. . . , hi and S = 1,. . . , T 2. Capacity

constraints

E

E

e H j n ^ k£Ff

on silos: 1

a 7,0-+ n)y>jkt + J2 2

E

E ^

+

a

# »

igfijn/i j ' e J j w fcgF?

+ E £ £ ^(i+ <*«>.»* + £ £ £ £ i(i + ^„-)w;-, ie«}n/ 2 i = i fcgF2

<

5C,(

igfl!n/ 2 9=1 i ' g J j w fcgf2

H. D. Sherali and Quaid J. Saifee

242 for j € (J2B U J2w) n J2, and t = 1 , . . . , T

E

E

WW + E

i'e«}n/i * e ^

w

E E

<«*< ^

ieH}n/3 9=1 /te^?

^

for j € Ji, and £ = 1 , . . . , T 3. Wentz to Bullitt transfer:

E

E

*«'* ^

E

j'eFj1 jteF2, for j € J

w

y**. < i-i E

E *«'«

j'eFf keF2,

keFfnK,,

fl F?, i € # } n 7 1; and t = 1 , . . . , T

E

w

E

>in'kt <

j'6F/ ieF*

E

w

iijkt < 1.1 E

E

ww**

j'eFf keF2,

keFfnK,,

for i e J 2 w n i ? , i € R) fl J 2 , 9 = 1 , . . . , hit and t = 1 , . . . , T 4. Demand constraints for customers: hi

E

a

E

u 3/<j*« +

12

12

12 a'j Ynj'kt + E

E

E

E E

E

E a w *"<«*<

A,-

+

a

iii wiqi?kt = dkt

j€JWn/S, ieR)m2 9=1 j'eF/ for k = 1,...,K,

and* = 1,...,T

5. Customer product quality constraints: 1 a

ASHU =

E

E

y^kt u Pa +

Ai

E

E

E

E ^jj't« a<< A'J

E

jehwnR*, ieR)ni, j'eF 3

jefij ieR)nh

+

E

hi

w

iqjktaiqtf3iqj+ E

E E Wi«y*
E

jeJwnR3, ieR)nh 9=1 j'eF?

jeflj iefljn/2 9=1 for fc = l , . . . , / r , f = 1 , . . . , T

5£/Xw

=

—

E2

jeR

12 vanilla + iefijn/,

E

E

E

E

in'kt sit m

iehwnR*, ieR)ml j'eFf

hi

+

Y

E h,

E wi»ikt siqt 7.«+

E

E

E E

jeJ^nfl 3 , .efl]n/2 9=1 ygF?

ww^,^;?

Strategic

and Tactical Models and Algorithms

for k = 1,...,K,

t-

for the Coal Industry

1,...,T

0 < ASHkt

< ASHuH,

for k = 1 , . . . , K, and t = 1 , . . . , T

0<SULkt

<SULukt,

foi k = l,...,K,andt

6. Restricting

243

=

l,...,T

a unit of mine to be opened only once: T

E

x

(=1

1

'ii -

for i G I2, q = 1, • . . , hi. 1. Upper limit on the number of units that can be opened in a time

period:

hi

E E xtgt<elt

+ e2i{ovt = i,...,T

ieh 9=1

0<0lt
92t>0,ioit

8. Sequencing of units at a potential to derive a tighter relaxation.)

=

l,...,T

mine ( with some implied constraints

added

*"iqt _ / , •Tiq'6 6
for i G I2, q' = l,...,q9. Disaggregated ation)

1, q = 2 , . . . , A,-, t = 1 , . . . , T

constraints E jeF?

W

(further

ilikS +

restrictions

E E j€F>nJ2W j'eFf

added to derive a tighter

W

ilH'kS < dkS E Xigt t<s

for i £ 7 2 , q = 1,. . . , hi, k = 1 , . . . , K, 8 = 1 , . . . , T 10. Nonnegativity

and binary

constraints:

Vijkt > 0, for i e h, j = 1 , . . . , J , k = 1 , . . . , K, t = 1 , . . . , T Vigjkt > 0, for i G 7 2 , q = 1 , . . . , hi, j = 1 , . . . , J , k = 1 , . . . , K, t = 1 , . . . , T Yijj'kt > 0, for i e h, j e J2w, j ' € J2B, k = 1 , . . . ,K, t = 1 , . . . , T

relax-

244

H. D. Sherali and Quaid J. Yiqjj 0, for i G h,

q = l,...,ft,-, j G J 2 w , j ' € ^2i5, * =

Saifee 1,...,K,

t=l,...,T zu > 0, for i G / ! , t = 1 , . . . , T Ziqt > 0, foi i e I2, q = 1,...

,hi, t = 1,...

Xiqt = 0 01 1, foi i € I2, q = 1,...

,T

,hi, t = 1,...

,T

C o m m e n t s on the Formulation: Given below is an explanation of some of t h e finer points related to t h e objective function and t h e constraints. As this is a long-term model, all t h e cost coefficients in the objective function are present values. A fixed charge cost is incurred whenever a unit at a potential mine is opened. Shipment of better ash quality coal is rewarded, and storage and underutilization at a mine is discouraged by accommodating a penalty t e r m in t h e objective function. Also, in any t i m e period t, if the n u m b e r of units t h a t can be opened exceeds t h e specified upper limit on the number of units t h a t can be opened given the available b u d g e t a r y resources, t h e n a surcharge equal t o 20% of t h e m a x i m u m fixed cost over t h e different units of the potential mines is applied to the number of units t h a t exceed t h e stated limit. This surcharge (its value being governed by finance considerations) reflects t h e burden for acquiring additional capital for developing units beyond what t h e available budget permits. In t h e flow balance constraints (1) for t h e mines, we require t h a t all t h e coal produced within each six m o n t h period should account for flow within this duration, without any carryover of inventory. If the flow falls far shorter t h a n t h e production, despite t h e underutilization penalty in t h e objective function, then this mine is a strong candidate for a shutdown. Also, in t h e flow balance constraint for a new unit of a potential mine, t h e production rate function at t h a t unit is multiplied by a binary variable t h a t takes on a value of one when this unit is opened and is zero otherwise, such t h a t t h e constraint reflects t h e appropriate rate of production according to its age in t h e given period. As each J2 t y p e of silo constitutes a pair of silos, one of which receives the runof-mine coal, while t h e other stores t h e cleaned coal following t h e beneficiation process, we have used t h e coefficient ( "''' with the flow variables in t h e capacity constraints (2) for t h e J2 type of silos. Furthermore, t h e normal storage or handling capacity of t h e silo is multiplied by t h e n u m b e r of working days in a t i m e period to derive t h e p a r a m e t e r SCjtT h e Wentz to Bullitt transfer constraints (3) are interval constraints, and they reflect a transfer of coal from t h e Wentz silos to t h e Bullitt silos as actually practiced by t h e Westmoreland Coal Company. These constraints can easily be generalized

Strategic

and Tactical Models and Algorithms

for the Coal Industry

245

for any coal company, and can be removed from t h e formulation if no such t y p e of transfer is practiced in a particular company. In t h e d e m a n d constraints (4) for customers, t h e flow variables are multiplied by coefficients ctij to take into account t h e total weight attenuation during t h e beneficiation process at t h e silos. In t h e customer product quality constraints (5), we have simply used coefficients a,„( and s,-,( instead of using £ a
s
as it would lead to nonlinear constraints t h a t would have to be linearized. (These l a t t e r functions are of t h e same n a t u r e as t h e production r a t e function used in t h e flow balance constraints for a unit of a given potential mine.) However, since t h e variation in t h e ash/sulfur content over t h e horizon is not too significant, we can approximate these nonlinear terms by a,-3t and «;,(, which in effect assumes t h a t t h e ash/sulfur content fraction varies as if t h e unit was opened at t i m e period 1. In t h e constraints (6), every unit of a given potential mine is restricted to be opened only once. Each constraint (7) has a right-hand side equal to t h e sum of 8u and 02t, where 9U is bounded from above by t h e m a x i m u m number of units from all potential mines t h a t can be opened in t i m e period t. This upper limit depends on t h e availability of resources such as capital, equipment, and manpower. T h e variable 02t, is simply nonnegatively restricted, b u t is penalized as described in t h e c o m m e n t pertaining t o t h e objective function. T h e constraints (8) enforce t h e sequencing of units at a potential mine. Here, it. is sufficient to use only t h e constraints having
Wi,

j"6F!

jkS+

£ j€F>nJ2W

£

j'£Ff

WiqjjikS

t+ l)x,',t, 4 ^ I i , (

< min t<8

H. D. Sherali and Quaid J. Saifee

246 for i € h, q = I,...,hi,

k = 1,...,K,

8 = 1,...,T.

However, as we already have flow balance constraints of t h e form

Yl *«.•«•« + jeF> for i G h, 1 = !,•••,hi, constraints

Y izF}

w

Yl

JL wiqjjlkS < J2 Pi<,(s-t + \)xiqt

jeF>nJ2W j'eFf

t<s

k = \,...,K,

S = 1,...,T,

idkS +

Y

12

we have simply added t h e

W

idi'kS < dks J2 xkt

jeF?nj2W j'eFf

t<s

for i € h, 1 = 1, • • • ,hi, k = 1,...,K, 6 = 1,...,T in our formulation. T h e computational results given in Section 5 exhibit t h e effect of constraints (8) and (9) in tightening t h e continuous relaxation of t h e problem.

4

Solution Procedures

This section presents methodologies for solving t h e long-term strategic model, using b o t h exact and heuristic techniques aimed at deriving near optimal solutions. Solving t h e M o d e l Using C o m m e r c i a l Softwares T h e long-term model is a mixed-integer 0-1 programming problem. We first tried solving t h e various test problems generated using Z O O M , which is t h e default solver in GAMS (see Brook et al., [3]). However Z O O M could only find an optimal solution for two out of t h e ten test problems even with a 10% optimality tolerance, and did not even find an integer feasible solution for t h e remaining problems except for test problem 4. Hence, we had to resort to more sophisticated, state-of-the-art mixedinteger solvers, namely, OSL (developed by IBM) and C P L E X ( M I P option, developed by C P L E X Optimization Inc.). Different links to connect these softwares with GAMS are available through t h e GAMS Development Corporation. After coding t h e problem in G A M S , these solvers can be called to execute t h e solution process, provided of course, they have been installed within t h e system. T h e optimality tolerance while solving these test problems was kept at 10%. As t h e results of Section 5 indicate, these options, if available, are viable alternatives. C P L E X was able to solve seven out of t e n problems, while OSL solved all t h e test problems. However, C P L E X found b e t t e r quality solutions t h a n did OSL for five of t h e seven instances it solved. If these options are unavailable and one has access only to a linear p r o g r a m m i n g code, or if a robust procedure is required for solving larger sized problems, we propose t h e following linear programming based heuristic procedure. We remark here t h a t we tried to derive approximate solutions for our test problems using t h e Pivot and Complement Heuristic of Balas and Martin [2], but it failed

Strategic

and Tactical Models and Algorithms

for

the Coal Industry

247

to solve all b u t t h e smallest of our test problems. Hence, we designed our own heuristic procedures, exploiting t h e n a t u r e of our problem. (Some of t h e ideas below (see Step 2 in particular) are portable to other 0-1 mixed-integer p r o g r a m m i n g contexts as well.) Linear P r o g r a m m i n g Based Heuristic ( L P H ) This heuristic procedure employs a sequential rounding scheme based on a series of continuous linear programming relaxations, exploiting t h e structure of t h e problem in determining which variables to round up to 1 at each step of t h e process. T h e following are t h e steps involved in this heuristic procedure: Step 1: Solve t h e linear programming (LP) relaxation of t h e problem. If t h e solution obtained has all t h e x-variables at binary values, then stop; t h e optimal solution obtained for the LP relaxation solves t h e mixed-integer p r o g r a m m i n g problem. Otherwise, go to Step 2. Step 2: (Optional, if package such as MINOS is available to handle nonlinear objective functions.) Replace t h e objective function by

Maximize

£

£

E

(*•* ~ ll2f

(U)

ieR)r\I2 9=1 '=1

and incorporate t h e following additional constraint in t h e problem z < i/ L P (l + A)

(12)

where z represents the objective function of t h e strategic model, VLP is the value of its linear programming relaxation, and 100A is a specified % deviation p e r m i t t e d from v^p. Solve t h e continuous relaxation of this linearly constrained nonlinear programming problem. ( We used MINOS 5.2 available with GAMS for this purpose.) As t h e form of t h e objective function indicates, a solution to this problem encourages all t h e binary variables t o take a value of 0 or 1, in order to maximize t h e objective function, while satisfying t h e problem constraints, including t h e additional constraint (12). In fact, a solution is integer feasible to t h e original strategic model and has an objective value within t h e specified tolerance I^LP(1 + A ) if and only if it solves t h e above problem. However, this problem is nonconvex, and so MINOS might stall at a nonoptimal local m a x i m u m , which is not integer valued. Hence, a range of values of A € [0,0.1] say, may be a t t e m p t e d in a sequential set of runs, and t h e best solution obtained may be recorded. (Note t h a t if A = 0, and if t h e solution obtained has all the i-variables at binary values, t h e n an alternative optimal solution to the LP relaxation has been obtained which solves t h e original mixed-integer program.) Proceed t o Step 3.

H. D. Sherali and Quaid J. Saifee 3: Define F = { a set of mines i for which some unit q has a fractional variable Xiqt for some period t}. For each i G F, let q(i) b e t h e imminent unit defined as t h e smallest index q for which x,-,( is fractional for some t. Note t h a t by t h e constraints of t h e problem, and t h e definition of q(i), we must have Xiqs = 1 for each q < q(i), for some 8 < t, where t is t h e smallest t i m e period for which t h e x-variable for q(i) is fractional. Now, for each i G Fi, let tn < £,2 < . . . < tini be t h e t i m e periods t for which t h e variables xiq^y are fractional, and let vn,..., Vini be t h e fractional values of X{q^tik, k — 1 , . . . , n,-. Accordingly, c o m p u t e

Ti = ^2 k=l

vik tik + T(l - Y^, Vik) for each i G F k=\

where T is n u m b e r of t i m e periods in t h e horizon. Note t h a t t h e smaller the value of Ti, t h e relatively earlier is t h e tendency for t h e imminent unit of mine i to b e opened. Hence, find T" = m i n i m u m (T,-)

(13)

Instead of selecting a mine t h a t is determined simply by (13) as t h e one for which t h e imminent unit should be opened, we examine a b a n d of T; values by finding T = {i G F : T{ < T* + 1 } For mines within this band, identify a mine r according to r £ argmax { ^ ( i ) } , where !/;*(,-) = m a x i m u m {vik} for i G F , E 7?

k=i,...,m

If t h e fractional x-variables represent only one unit of a potential mine at different t i m e periods, then select k(r) = 1, for fixing xTqiT)tTl = 1. 4: Fix xrq(r)tr = 1, along with all t h e other binary variables which turned out to be 1 in t h e most recent linear programming relaxation solved. Re-solve t h e linear programming relaxation after fixing t h e above binary variables. If t h e problem is infeasible, go to Step 5. If t h e solution obtained has all t h e xvariables at binary values, then stop; t h e optimal solution obtained is prescribed as a heuristic solution. If the solution obtained still has fractional values for some of the x-variables, then return to Step 3. 5: Replace k(r) by k(r) — 1. ( Note t h a t by t h e structure of t h e problem and t h e n a t u r e of t h e solution to t h e last feasible linear p r o g r a m m i n g relaxation, it must b e t h a t t h e revised k(r) > 1.) R e t u r n t o Step 4.

Strategic

5

and Tactical Models and Algorithms

for the Coal Industry

249

Computational Experience

Real d a t a from t h e Westmoreland Coal Company and nine other similar, hypothetical d a t a sets are generated to test different heuristics and t h e commercial softwares used to solve t h e problem. T h e real d a t a provided by Westmoreland Coal Company has 11 existing mines and 8 potential mines. Of t h e latter, one has four units, two of t h e m have two units each, and t h e remaining five have one unit each. T h e r e are 8 silos and 6 customers. T h e horizon considered in running t h e model with this d a t a is 6 periods (3 years) in duration. Nine hypothetical test problems are created using a different n u m b e r of existing mines, potential mines, units in each potential mine, silo units, and customers. Also, various combinations and connections between existing mines, potential mines, units in a potential mine, silo units, and customers are incorporated into these test problems. These problems vary in t h e range of 388 constraints, 27 binary variables, and 481 t o t a l n u m b e r of variables, to 4155 constraints, 240 binary variables and 6565 total n u m b e r of variables. Table 1 provides a list of specifications for all t h e test problems generated. In all these d a t a sets, we have included a special hypothetical potential mine with a single production unit t h a t has a very high rate of production, and has a zero ash and sulfur percentage content in t h e coal produced by it. An inordinately large fixedcharge is required to open this mine. Also, t h e coal produced by it is given very high shipping and cleaning costs. In effect, it is ensured t h a t this special hypothetical unit is used only when t h e model would otherwise have been infeasible. In other words, whenever t h e model uses this particular mine, it implies t h a t t h e model is infeasible and t h a t t h e infeasiblity lies where t h e flow is satisfied using t h e coal from this special mine. (While this should be included in practice, none of our test problems needed to resort to this hypothetical mine.) Effect of I m p l i e d C o n s t r a i n t s After coding t h e test problems in G A M S , we also performed an investigation on t h e effect of using constraints t h a t are implied in t h e integer sense, though not in the continuous sense, for t h e purpose of derving a tighter relaxation of the model, as discussed in Section 3. Tighter relaxations of a discrete programming problem help in improving t h e performance of any exact or heuristic solution procedure. To ascertain the effect of t h e implied constraints in (8) and (9) on the model, linear programming relaxations of all t h e test problems were run using MINOS 5.2 as a solver, both with and without these implied constraints. T h e results in Table 2 exhibit t h a t t h e implied constraints used in t h e formulation produced a tighter relaxation for 6 out of the 10 test problems. C o m p a r i s o n of D i f f e r e n t C o m m e r c i a l S o f t w a r e s

250

H. D. Sherali and Quaid J. Saifee

T h e test problems were first solved by Z O O M ( Z e r o / O n e Optimization M e t h o d ) , which is a default solver in GAMS for mixed-integer programming problems. As it t u r n e d o u t , Z O O M faced significant difficulties even in obtaining a feasible solution to t h e linear p r o g r a m m i n g relaxation of t h e original mixed-integer programming problem. After trying different combinations of options, like C H E A T , DIVE, E X P A N D , F A C T O R , G A P , P A R T I A L , Q U I T ( see Brooke, Kendrick and Meeraus [3]) in t h e G A M S / Z O O M options file, we were able to find an integer solution for only three out of t h e t e n test problems using an optimality tolerance of 10%. (It should be noted t h a t if OSL, C P L E X , or Z O O M does not find an optimal solution within the specified range, t h e n it gives t h e best solution found up to t h e point of termination, mentioning t h a t there is no optimal solution within t h e specified range.) Next, we solved t h e ten problems using OSL (Optimization Subroutine Library) and C P L E X , again with a 10% optimality tolerance. Separate G A M S L I N K S , developed by t h e GAMS Development Corporation, were used to link GAMS with OSL and with C P L E X . T h e OSL runs were m a d e on an IBM RS/6000 workstation model 320H, running AIX 3.1. T h e runs with C P L E X as a solver were m a d e on a SUN Sparc 1, running SUNOS 4.1.1. T h e runs using Z O O M as t h e solver were m a d e on an I B M 3090. (Different computers have been used since t h e OSL and C P L E X runs were m a d e at t h e GAMS Development Corporation.) T h e results obtained after solving t h e problems with Z O O M / G A M S , O S L / G A M S and C P L E X / G A M S are tabulated in Table 3. OSL was able to solve all t h e test problems as shown in Table 3 while C P L E X was unable to solve three of these problems due to m e m o r y limitations on t h e SUN Sparc 1 computer. On t h e other hand, Z O O M could only solve three test problems, two of which were solved within 10% of optimality, while for t h e third problem (number 4), only a feasible integer solution within about 4 3 % of the linear programming lower bound could be found. No feasible solution was found for t h e remaining problems. C o m p a r i n g C P L E X and OSL in t e r m s of t h e number of iterations and t h e relative gap from t h e linear programming relaxation value, t h e results in the table show t h a t for 4 out of t h e 7 problems t h a t C P L E X solved, it required more iterations t h a n did OSL, b u t for 5 out of these 7 problems, it obtained a better solution t h a n did OSL. (Blank spaces in t h e table specify those problems which could not be solved either by C P L E X due to shortage of memory or by Z O O M due to different kinds of numerical difficulties.) C o m p a r i s o n of R e s u l t s O b t a i n e d U s i n g C o m m e r c i a l S o f t w a r e s a n d H e u r i s tic L P H Table 4 presents t h e results using Heuristic LPH to solve t h e ten test problems. In all t h e test problems except for problem n u m b e r 2, t h e proposed heuristic consumes more iterations to solve t h e problems t h a n do the commercial softwares OSL and

N u m

„ , . Prob. exist.. . Num. ine r mines

1

8

of

Num. of Num. of units in . , potential poten. '. , mines tial

Num. of cus, tomers

Num. of .. time . , periods

Num. of Num. of silos Constr.

Num. of Num. of, . binary vars. vars.

CPU .. time , -, (sec)

Iters.

& ^ a> 03

=

a.

T.

s

mines

5

9

7

3

8

388

481

27

3.7

560

£3 £;•

2

11

8

13

6

6

8

950

1145

78

31.2 2047

3

5

8

14

7

6

8

1195

2309

84

57.6

2599

^

4

4

8

15

5

6

8

776

1337

90

25.5

1885

g-

5

7

8

15

10

6

4

1310

2987

90

93.3

3367

t a

6

5

8

16

6

6

4

1053

863

96

36.6

2238

^

7

7

8

13

6

8

8

1290

1957

104

56.7

2634

1

8

4

8

14

6

8

6

1251

1752

112

105.8

5016

f1

9

6

13

24

10

8

8

2981

4149

192

497.8

9850

S,

10

7

16

30

12

8

10

4155

6565

240

584.6

6947

Table 1: Test problem specifications for the long-term strategic model Legend: CPU time = CPU seconds for solving the linear programming (LP) relaxation on an IBM 3090 computer Iters. = Number of iterations required to solve the linear programming relaxation of the problem. Vars. = Variables, Num. = Number, Prob. = Problem, Constr. = Constraints.

H. D. Sherali and Quaid J. Saifee

252

Solved without implied constraints (objective value)

Solved with implied constraints (objective value)

1

258368136

258368136

2

296801038

316045551

3

76531010

76531010

4

31379965

31379965

5

77480004

77740986

6

328998463

351037130

7

391664242

392528791

8

56816347

56816347

9

655812636

673834448

10

409673011

412477641

Test problem number

Table 2: C o m p a r i s o n of s o l u t i o n s o b t a i n e d w i t h a n d w i t h o u t i m p l i e d c o n straints

C P L E X , given t h e 10% optimality tolerance used within t h e latter methods. It should be noted, however, t h a t while solving the problems using our heuristic, we did not use any advanced bases from one LP run to t h e next. T h e effort for t h e proposed heuristic can be substantially reduced by using the optimal basis obtained for one run in the subsequent problem solved, since only one additional fractional binary variable is fixed at 1 from one run to t h e next in t h e sequence of problems solved. This automation can be done by incorporating t h e M P S file generated through G A M S 2.25 within a F O R T R A N or C program, and updating this file from one call to t h e next of t h e solver M I N O S 5.2, OSL, or C P L E X . As far as t h e solution quality is concerned, Heuristic LPH obtains a better solution t h a n does OSL for 3 of t h e test problems and the same solution for 2 of t h e m . It also obtains a better solution t h a n does C P L E X for a single case. In this case, Heuristic L P H actually identifies an alternative optimal linear p r o g r a m m i n g solution t h a t happens to be integer feasible at Step 2 of t h e procedure, using A = 0.0. Overall, as can be seen by comparing Tables 3 and 4, for t h e most part, the solutions obtained via t h e different procedures are comparable in quality. Finally, it should be noted t h a t because OSL and C P L E X employ a branch-and-bound technique, it can be expected t h a t as t h e problem size increases, the computer memory requirements would increase

OSL M I P soTest Prob.

lution value

RG %

ZOOM

CPLEX CPUl time

Iters

(sees)

M I P solution value

RG %

CPU2 time

Iters

(sees)

M I P solution value

RG %

CPU3 time

Iters

(sees)

5.4

1

259637230

0.49

1.8

178

259637230

0.49

6.0

426

259637230

0.49

2

368434780

16.5

1335.9

47348

373343025

18.12

414.2

8765

-

-

-

1121

-

3

86359033

12.84

27.9

1419

83701728

9.36

230.1

4436

-

-

4

33697853

7.38

20.6

1455

33642804

7.21

243.0

6825

44930032

43.18

355.7

49077

5

78278455

1.03

32.6

2126

78278455

1.03

83.9

2045

79326334

2.38

139.2

7554

6

446829414

27.28

196.0

13046

-

-

-

-

-

-

-

-

7

448381433

14.22

121.9

5912

440208987

12.14

179.9

4271

-

-

-

-

8

60458788

6.41

41.1

2238

60584061

6.63

127.7

2634

~

-

965484334

43.28

166.9

3994

-

-

-

-

-

-

10

449239319

8.91

221.5

4004

-

-

-

-

-

-

-

-

9

-

-

Table 3: Comparison of solutions obtained and effort required for solving the test problems with OSL/GAMS, C P L E X / G A M S , and Z O O M / G A M S Legend: CPUl time = Resource usage in CPU seconds on an IBM RS/6000 workstation model 320H CPU2 time = Resource usage in CPU seconds on a SUN Sparc 1 CPU3 time = Resource usage in CPU seconds on an IBM 3090 Iters = Total number of simplex iterations required to solve the problem RG = % Relative gap of the MIP solution obtained from the initial linear programming relaxation value

H. D. Sherali and Quaid J. Saifee

254

Test Problem

MIP soln.

RG %

XPTT time (sees)

Iters.

LPR

FRA

1

259637230

0.49

4.1

1108

3

1

2

370792903

17.32

88.2

5305

3

7

3

84327533

10.18

227.5

12340

6

3

4

34704832

10.59

102.5

9435

7

6

5

78278455

1.03

540.7

22516

8

6

6

469914352

33.86

293.1

19006

12

9

7

448381433

14.22

279.4

12883

6

4

8

60272957

6.08

770.8

42031

14

10

9

1023861971

51.94

4003.4

77540

12

11

10

412477641

0.00

1009.4

13447

2

8

Table 4: C o m p u t a t i o n a l E x p e r i e n c e U s i n g H e u r i s t i c L P H

Legend: C P U t i m e = Resource usage in C P U secounds on an IBM 3090 Iters = Total number of simplex iterations required to solve t h e problem RG = % Relative gap of t h e M I P solution obtained from t h e initial linear p r o g r a m m i n g value L P R = N u m b e r of linear programming runs required before discovering the mixed-integer solution. F R A = N u m b e r of units of potential mines having fractional values at t h e first step.

Strategic

and Tactical Models and Algorithms

lor the Coal Industry

255

substantially, and t h e optimality tolerance will need t o b e further relaxed t o keep these procedures viable. On t h e other hand, t h e proposed Heuristic L P H can be expected to remain robust since it only relies on t h e solution of a limited sequence of linear programming relaxations.

Appendix: Modifications for a Tactical Day-toDay Operational Model Sherali and Puri [14] have presented a description of three linear programming tactical models for making day-to-day mining, cleaning, blending, and distribution decisions, given a set of operating mines. T h e most accurate and detailed of these three models is called "Model 1". In this model, coal is assumed to be shipped out of mines to t h e silos at the beginning of t h e t i m e periods, and to be shipped out of t h e silo units to t h e customers at t h e end of t h e t i m e periods. A m a x i m u m of a three period shipment lag between coal production at t h e mines and t h e final shipment to customers is p e r m i t t e d , based on an estimate of the clearance t i m e at t h e silos. If t-y is t h e t i m e period for a certain mine to silo shipment, and ti is t h a t for a continuing silo to customer shipment, t h e shipment lag is given by t 2 — h- T h e transfer lag, another t i m e lag factor, t h a t indicates t h e difference between t h e t i m e of dispatch of a coal shipment from a mine to a silo and its actual arrival t i m e at the silo, is negligible for t h e problem under study, and is therefore assumed to be zero. (Nonzero transfer lags can, however, be readily accommodated by time-shifting t h e data.) We also adopt this same structure of t h e model. However, t h e following modifications have been m a d e to enhance t h e problem representation based on t h e feedback from our case study implementation.

1. A penalty-reward component has been introduced into the objective function, which either penalizes or rewards t h e quality sent to t h e customers relative to what is desired. T h e need for this function arose due to t h e 1990 Clean Air Act as t h e customers are now somewhat more liberal about t h e content of ash in the coal, b u t are more stringent about its sulfur content. Hence, besides a piecewise linear reward or penalty function imposed on t h e ash quality of t h e coal shipped relative to t h e m a x i m u m specified limit, we have also introduced a piecewise linear reward structure in t h e objective function to reflect t h e incentive for shipping coal having a better sulfur quality, while restricting a m a x i m u m sulfur content in each period as a h a r d constraint. 2. Due to t h e 1990 Clean Air Act, coal companies might have to look for different types of cleaning technologies instead of using a single type. "Model 1" can

H. D. Sherali and Quaid J. Saifee

256

be given an interpretation so t h a t it can accommodate more t h a n one t y p e of cleaning technology. For example, if there are two types of cleaning technologies used by a coal company, then t h e silos in J2 can be divided further into subsets J 2 i a n d ^22, where each J21 t y p e of silo constitutes a pair of silos, one of which receives t h e run-of-mine coal, while t h e other stores t h e coal following t h e cleaning operation, and where each J22 t y p e of silo constitutes a similar pair of silos, b u t representing an alternative cleaning technology. Note t h a t this is principally a d a t a processing, rather t h a n a model oriented modification. 3. T h e equality constraint t h a t strictly enforces as much amount of coal to be transferred from t h e Wentz to Bullitt silos as is shipped to stoker customers has been relaxed to an interval constraint, to b e t t e r reflect the actual practice in this transfer. Since Sherali and Puri do not provide a m a t h e m a t i c a l formulation for "Model 1", for t h e sake of convenience in reference by practitioners, we give below a complete m a t h e m a t i c a l formulation (including t h e foregoing modifications), t h a t has been tested and implemented using G A M S . For completeness, we first specify t h e d a t a requirements along with our notation. Note t h a t t h e short-term tactical model, in contrast with t h e long-term strategic model, has to contend with daily storage restrictions at the mines and at t h e silos, as well as with the dissipation of any initial a m o u n t in storage at the silos within a rolling horizon implementation framework. t = 1 , . . . , T = n u m b e r of t i m e periods. i — 1 , . . . , m = number of mines. j — 1 , . . . , J = n u m b e r of silo units. k = 1 , . . . , K = number of customers. Kst = "stoker customers" served by Wentz silos. Pit = production (tons) at mine i, in period t. an, sn = ash and sulfur content (%), respectively, in coal produced at mine i, in period t. SMi = storage capacity at mine i. c?m = storage cost (per ton) at mine i. J j = R O M silo storage units. J^w = cleaned silo units at Wentz facility. JiB= cleaned silo units at Bullitt facility.

Strategic

and Tactical Models and Algorithms

for the Coal Industry

257

J 2 = cleaned silo units (J2w U J2B)SSJ = storage capacity of silo unit j . ( For j £ J 2 , this is taken as t h e sum of t h e two associated silo storage capacities.) C*JS = storage cost per ton of coal at silo j . eft = initial amount in storage at silo j . a°,s°; = ash and sulfur content (%), respectively, in initial storage a m o u n t at silo j . C?M

_

snippmg cost

(per ton), from mine i to silo j .

ciij £ (0,1] = total weight attenuation factor (output per ton input) at silo j £ Ji for coal from mine i. Pij £ (0,1] = ash content attenuation factor (output per ton input) at silo j £ J2 for coal from mine i. 7tj 6 (0,1] = sulfur content a t t e n u a t i o n factor (output per ton i n p u t ) a t silo j £ J2 for coal from mine i. Note: Q . J , fa;, and 7,-j are assumed to be 1 for j £ J\. A\ = flow arcs (i,j) (i,j)eA1}.

from mine i to silo j ; F} = {j

: (i,j)

€ Ai }, R1- = {i: :

A2 = flow arcs (j, k) from silo j to customer k; FJ = { k : (j,k) Rl = {]--{hk)eA2).

£ A2 },

A3 = arcs (j, j') representing by-product flow from Wentz silo j 6 J2w to Bullitt silo j ' £ J2B, corresponding to stoker customer shipments; Ff = {_}' : {j,j') £

A3},R? =

{j:(j,j')eA3}.

cfj^ = shipping costs (per ton), from silo j to customer k. cSjjf = shipping costs (per ton), from Wentz silo j £ J2w to Bullitt silo j ' £ J2Bdkt = d e m a n d placed by customer k during period t. ASHukt = upper limit on ash percentage in coal to be delivered to customer k in period t. SULukt = upper limit on sulfur percentage in coal to be delivered to customer k in period t. c\k = slope (revenue/ton) of reward function for each % point below t h e maxim u m specified limit of ash in coal shipped to customer k.

H. D. Sherali and Quaid J. Saifee

258

c\k = slope ( c o s t / t o n ) of penalty function for each % point above t h e m a x i m u m specified limit of ash in coal shipped to customer k. c\k = slope (revenue/ton) of reward function for each % point below t h e maxim u m specified limit of sulfur in coal shipped to customer k.

Mathematical Formulation T h e decision variables for t h e tactical model are defined as follows: (i) amount (tons) shipped from mine i to silo j in period t, with continued shipment to customer k in period r, given by y\jt (ii) a m o u n t (tons) in initial storage at silo j , shipped t o customer k in period t, given by y^kt (iii) amount (tons) shipped from mine i to silo j £ J2W in t i m e period t1 which is then shipped to j ' € J2B in period t2, and finally shipped t o customer k in period r , given by Y^lJjH2. Here, for a given t\, (Note t h a t t h e initial storage at t h e silos is also assumed to be dissipated within three periods). Hence, we have, {h, r) € tL{U) = {{h + l,*i + 1), (*i + M i + 2), (h + 2, h + 2)} Auxiliary Decision Variables 1. x's = slack variable equal to t h e amount (tons) of coal remaining in storage at mine i during period &, for i = 1 , . . . , m, S = 1 , . . . , T. 2. u'jS = accumulated storage amount (tons) in silo unit j during period S, for j = 1 , . . . , J, 6 = 1,...,T. 3. ASHkT = % ash content in blended coal delivered t o customer k in period r , for k = 1,...,K, T= 1,...,T. (a) {ASHkT)i = % ash content in blended coal delivered to customer k in period T, t h a t is below t h e m a x i m u m specified limit, for k — 1 , . . . , A ' ,

r = i,...,r. (b) {ASHkT)2 = % ash content in blended coal delivered to customer k in period r , t h a t exceeds t h e m a x i m u m specified limit, for k = 1,...,K, r = l,...,T. 4. SULkr = % sulfur content in blended coal delivered t o customer k in period r , for k = 1,...,K, T = l,...,r. Objective Function: J

Minimize

T min((+2,T)

E E E E

E

3=1 .'£«] keF] <=i

*=t

(^+4

+

4 ^

Strategic and Tactical Models and Algorithms for the Coal Industry

+ E E E E E 3

(41 + <s% +
E

3

jeJ2w ieflj j'eff J

259

k<=F ,
2

T

J

T

+ E E E 4 ^ + 12E « + E E « +

K T E E

[ - C U ( ^ # U * T - (AStf* T )i)<4 T +

c\k{ASHkT)2dkT\

/fc=l T = l

+

E E [-clk(SULukT

-

SULkT)dkT]

A= l T= l

Constraints: 1. Flow balance and storage at mines: min((+2,T)

E y&- E E E E

E P - E E E

E

*#„•<* = *?«

for i = 1,. . ., m, and <5= 1,. . . , T 0 < xf5 < 5M; for z = 1 , . . . , m, and 8 = 1 , . . . , T 2. Storage constraints for silo units: t

min(«+2,T)

EE E

E

/

1,

^(1+ ««)!&+ « ? - E E 4 | + ^ = «].

0 < uj t < 55,- for j = 1 , . . . , J , and t = 1 , . . . , T where,

E E E E ieR)

j'€Fj> k<=Fj, S=t-2

E

ki + °n)Y/rM fnjtJiw

(i2,T)e(L(«),(2>l

E E E E E x(l + « O - 0 ^ , i < a j'eflj ieflj,fceF,2((s,T)ei,(t) *I=(T-2) Z 0

fori6J2B otherwise

and where t^ (t) = {(t — l,t), (t, t), (t, t + 1)} for defined combinations. 3. Dissipation of initial storage at silos: E

E

/c£F? ' = 1

V%t = 1° for j = 1,.. . , J

260

H. D. Sherali and Quaid J. Saifee

4. Wentz to Bullitt transfer: min(r,(+2)

E E

E

*&.„ <

j'SFf fcsF3, (*2,T)e«L(*)

E

E

keFjnK.,

-r=t

y%

< i-i E E

E

*&.„

i'6F? Jtei^, (i2,r)eti(()

for j G J 2 w, « € R), and < = 1 , . . . , T 5. Demand constraints for customers: T

T-l

E E E «*v% + E y%r+ E E E E jSflJ • € « ;

j'ZRl * j£H 3J, igfl 1J < = T - 2

*=T-2

=

for A;= 1,...,K

and r =

min(f+2,T)

E

l 2 =t+l

dkT

1,...,T

6. Customer product quality constraints: ASHkT

=

— dkr

E E E i & a « & + £ y°kTa° -1 min(t+2,T)

+ £ £ £ £ for k = 1,...,K, SULkT

T= =

E

1,...,T 1

<4T E E

E y&'iiia+Y, y%*° r —l

min((+2,r)

+ E E E E j'eRl

for k = 1,...,K,

for k = l,...,K,

T=

^l2«,^

jeR3, i£R\ * = T - 2

E

*&-„*« 7«

t2=t+i

1,...,T

ASHkt

=

(ASHu)!

+

(ASHkt)2

0

<

(ASH*)!

<

ASHukt

0 0

< <

(ASHkt)2 SULkt

< <

oo SULukt

and t = 1,...,T

«o-*&••«,

Strategic

and Tactical Models and Algorithms

7. Nonnegativity

for the Coal Industry

constraints:

V% > 0, y%t > 0, f o r i = l , . . . , m , j = 1 , . . . , J , fc = \,...,K,t r = <,..., min{r,t + 2 } Y

ijtu't2

261

= 1,...,2\

> 0, for i = 1 , . . . , m , j € J 2 w , j ' € J 2 B , fc = 1 , . . . , K, h = 1 , . . . , T,

(*2,T)€*L(*I)

References [1] R. Aboudi, A. Hallefjord, C. Helgesen, R. Helming, K. Jornsten, A.S. Pettersen, T. R a u m , and P. Spence. A Mathematical P r o g r a m m i n g Model for t h e Development of Petroleum Fields and Transport Systems. Euorpean Journal of Operational Research, 43:13-25, 1989. [2] E. Balas and C.H. Martin. Pivot and Complement- A Heuristic for 0-1 Programming. Management Science, 26(l):86-96, 1980. [3] A. Brooke, D. Kendrick, and A. Meeraus. GAMS A User's Guide. T h e International Bank for Reconstruction and Development, T h e World Bank, 1988. [4] W . Candler. Coal Blending - W i t h Acceptance Sampling. Computers tions Research, 18(7):591-596, 1991.

& Opera-

[5] G.B. Faulkner. Linear P r o g r a m m i n g Applied to a Mining Smelting Operation. Canadian Mining & Metallurgical Bulletin, 60(677):1297-1300, 1967. [6] M. Gershon. Mine Scheduling Optimization with Mixed Integer P r o g r a m m i n g . Mining Engineering, 35(4):351-354, 1983. [7] T . B . Johnson. O p t i m u m open-pit mine production scheduling. A Decade of Digital C o m p u t a t i o n in t h e Mineral Industry, S M E - A I M E , New York, USA., 1969. [8] A.M. K h a n . Solid-Waste Disposal with Intermediate Transfer Stations: An Application of the Fixed-Charge Location Problem. Journal of the Operational Research Society, 38(l):31-37, 1987. [9] C.G. Knight and C.B. Manula. T h e Pennsylvania Coal Model. In Proceedings of the 14th APCOM Symposium, SME-AIME, 655-665, New York, 1976. [10] B.A. Lietaer. A Planning Model for Underground Mines - An Application in a Developing Country. OMEGA, The International Journal of Management Science, 5(2):149-159, 1977.

262

H. D. Sherali and Quaid J. Saifee

[11] G.L. N e m h a u s e r a n d L.A. Wolsey. Integer and Combinatorial Wiley and Sons Inc., New York, NY, 1988.

Optimization.

John

[12] J . P . Osleeb and S.J. Ratick. A Mixed Integer and Multiple Objective Programm i n g Model t o Analyze Coal Handling in New England. European Journal of Operational Research, 12:302-313, 1983. [13] J . P . Osleeb, S.J. Ratick, P. Buckley, K. Lee, and M. Kuby. Evaluating Dredging and Offshore Loading Locations for U.S. Coal Exports Using t h e Local Logistics System. Annals of Operations Research, 6:163-180, 1986. [14] H.D. Sherali and R. P u r i . Model Development, Testing and C o m p u t e r Implem e n t a t i o n for a Coal Blending and Distribution Problem. OMEGA, The International Journal of Management Science. (To appear). [15] H. Steinmann and R. Schwinn. Computational Experience with a Zero-One P r o g r a m m i n g Problem. Operations Research, 17:917-920, 1969. [16] R.C. Tomlinson. T h e Practice of O.R. in Coal Mining. Operational Research, 1:9-21, 1977.

European

Journal

of

[17] K . B . Williams and K . B . Haley. A Practical Application of Linear P r o g r a m m i n g in t h e Mining Industry. Operational Research Quarterly, 10(3):131—138, 1989. [18] W . Young, J.G. Ferguson, and B. Corbishley. Some Aspects of Planning in Coal Mining. Operational Research Quarterly, 14(1):31—45, 1963.

263 Network Optimization Problems, pp. 263-281 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Multi-Objective Routing in Stochastic Evacuation Networks J. MacGregor Smith Industrial Engineering and Operations University of Massachusetts, Amherst,

Research Department, MA 01002, USA

Abstract

A fundamental problem of routing in stochastic queueing networks is the identification of paths which extremize a collection of objective functions. In this paper, an integer set partitioning model with two conflicting objective functions is presented and examined. Also, the properties of the Noninferior set of solutions and the mathematical development of the algorithm for iteratively generating the paths are described. The algorithm is based on a multi-objective k-shortest path algorithm for generating the Noninferior set of paths for which tradeoffs between evacuation time and distance travelled within the network are evaluated. An example of the methodology is also presented.

1

Problem Overview

Fundamentally, t h e problem of routing customers, occupants, packets, and other transactions in queueing networks is a complex, stochastic, integer, and nonlinear programming problem. These problems are highly transient as well as being multiobjective in n a t u r e and there are numerous performance measures inherent within t h e problem which complicate t h e different routing strategies. Mathematically, we have a finite queueing network G(V, E), with a finite set of nodes V, and edges(arcs) E, over which multiple classes of customers (occupants) flow from source(s) to sink(s) while a vector of objective functions fi = {fi(x), / 2 ( a ; ) , . . . , fp{x)} is simultaneously

J. M. Smith

264

extremized subject to a set of constraints on t h e occupants flowing through t h e network. In this paper, some of t h e m a t h e m a t i c a l properties, methodology and corresponding algorithms for solving this multi-objective routing problem in queueing networks are presented. One m a i n concern in this paper is to demonstrate how one can model t h e multi-objective n a t u r e of t h e problem and calculate effective alternative routing strategies t h a t allow t h e system planner of t h e queueing network to tradeoff perform a n c e between one objective and another. Much of our past research in this area has considered stochastic evacuation networks [14, 15, 1], and also static and real-time routing in production and manufacturing settings [5, 6, 7, 9]. In these latter studies, maximum throughput, sojourn time, and average number of customers in the system have been objectives of interest. In this paper, t h e focus is on stochastic evacuation networks in which we consider two p r i m a r y objectives: fi(x):= Total Evacuation Time and f2(x):= Total Distance travelled. T h e methodology and properties found can be generalized for other networks with similar objectives, such as p a t h reliability, see Figure 1.

c

_ Minimize Travel T i m e

-J

_ Minimize Evacuation Time

Overall Safety

Minimize Routing Complexity

Minimize Maximum Queue Lengths

_ Minimize Congestion

Minimize Average Queue Lengths

' — M i n i m i z e Total E v a c u a t i o n T i m e M i n i m i z e Total D i s t a n c e Travelled

Minimize Shortest Routes

P a t h Complexity

Minimize Maximum P a t h Lengths I—Min I -Minimize ]ym # Up-down Transitions

_ Minimize R e c e p t i o n _ | Center Failures

M:

_ Maximize P a t h _ Reliability Minimize _ Arc Failures

Minimize Latest Arrival Time

Minimize Maximum Flow Capacity Equalize Average Flow Capacities

Equalize Average —I Arc Flows Minimize Maximum Arc Flows

Figure 1: Morphological Diagram of Multi-Objective Approaches One might think t h a t t i m e and distance are highly correlated and for those situations where no congestion in t h e network evacuation p a t h exists, this is substantiated. However, when there is congestion in t h e network and all occupants seek t h e shortest

Multi-Objective

Routing

in Stochastic

Evacuation

Networks

265

p a t h , then there is a tradeoff which occurs, where routing some of t h e occupants on longer paths will reduce the overall evacuation t i m e of all occupants. T h e approach that is described is flexible and effective and can be utilized for transient situations where discrete-event simulation models such as Q - G E R T [12] and their variants, or for steady-state analytical queueing network models such as Mean Value Analysis Q N E T - C [15] are used to evaluate and generate the set of efficient routing paths in an evacuation network.

2

Assumptions and Definitions

By definition, a queueing network (graph) G(V, E) is comprised of a finite set V of nodes (vertices) of size N where V = { v\, V2, • • •, Vn} together with a finite set E of arcs e/t = (vt,Vj) V ( i , j ) nodal pairs. V can further be partitioned into three sets, Vi, which represents the occupant source nodes during the evacuation, V2 which represents t h e intermediate nodes during t h e evacuation, and V3 which represents t h e sink or destination nodes of the occupants. T h e set of arcs represent t h e different streets, passageways, or routes from V\ to V3. Associated with each node £ 6 V and each arc (v{,Vj) € E are variables and parameters which represent node and arc processing times, node and arc capacities, arrival times to t h e network, distances, and occupant population sizes at the source nodes. Figure 1 illustrates a small evacuation network with many of t h e parameters and variables of significance to the evacuation planning problem t h a t can be embedded in the network model. Some of the notation most commonly associated with this t y p e of network model is discussed below. A : = T h e r e are A chains of occupants labelled a = 1, 2 , . . . A. Each occupant chain represents a sequence (vector) of nodes and arcs which the occupant chain population will travel during the evacuation. D : = A distance m a t r i x where the elements of D , dij represent t h e Euclidean or rectilinear distance between nodal pairs (i,j) € E. E : = T h e network has a finite set E, of arcs(nodal pairs) fj{x)

: = An objective function evaluating the set of routing alternatives denoted by (x).

G(V,E)\

t h e queueing network (graph)

A : = the arrival rate vector of the occupant classes into the routing alternatives of t h e network. A = (Ai, A2,. . . , Ajj) for all occupant sources. fi : = t h e service rate vector for the nodes and arcs comprising the evacuation network. ^1, represents the service rate of a node while /J,( represents the service

J. M. Smith

266

r V31

V i •<

yv3

V12

v32 V

J

Y

v2 Figure 2: E x a m p l e E v a c u a t i o n N e t w o r k r a t e of an arc (travel time) between two nodes in G. Each queue is assumed to have infinite waiting room. NI

: = a Noninferior evacuation p a t h for an occupant class in an evacuation network.

V : = T h e network model has a finite set V of nodes and further t h a t V is partitioned into three sets: Vi, Va,V3 which represent the source(s), intermediate,& sink(s) nodes. fi : = T h e set of objectives in our routing problem. Since we are dealing with a multi-objective problem, we need to define t h e notion of a Noninferior NI evacuation p a t h [3]. Definition: x* is said to be a Noninferior evacuation p a t h for our evacuation problem defined in §1 if there exists no other feasible evacuation p a t h x such t h a t f(x) < f(x*) meaning t h a t fj{x) < fj(x*) for all j = 1,2, . . . , p with strict inequality for at least one j .

Multi-Objective

267

Routing in Stochastic Evacuation Networks

In other words, if we have a candidate path which we suspect is NI, there should be no other feasible path which is more minimal in both of the performance measures: time and. distance. The set of all paths which are NI is called the NI set.

3

Mathematical Model

There are few mathematical models which have appeared in the literature for generating and evaluating evacuation paths for an occupant population [10, 2, 11]. The model which is presented below is a variation of one model appearing in [11]. It was one of the first to account for the critical features of the stochastic evacuation problem. Another class of models that one might utilize to formulate the problem are those of the class of multi-commodity flow models. Unfortunately, these models will not control the splitting of the occupant population along the different evacuation paths which is problematic since splitting the different source populations will engender confusion and a potential sense of panic among the evacuating occupants. The integer programming model presented below has the desired property to control splitting of the flows. The multi-objective model of our routing problem is: Minimize{f\{x)\l

/2(*)}

where: (EvacuationTime)

: fx(x)

=

51HX^*-^*^ «

(DistanceTravelled) : f2(x)

i

— ^y^S^'i*'^*'1'*.** i

j

(1)

k

(^)

k

subject to: {V2 Arcs) :Y^YlYlatijk^ijkXijk «

j

< pe W

(3)

< Cq

(4)

k

(V3 Sinks) -YlY^llPiikX'iok i

j

Vq

k

(Occupant Classes) : 22 Xijk = (Routes) : x^

1 Vij

= 0,1

Vijfc

(5) (6)

and where: Xijk'-= 1 if the ith occupant class from the j t h source is assigned the kth NI route alternative. aajk-

a

data coefficient which equals 1 if the £th arc is included in the ijhth route assignment and equals 0 otherwise.

268

J. M. Smith

pt. m a x i m u m allowable traffic along arc (.. Cq: capacity of sink (destination) node q. Pijk'. occupant population of source ij on the kth NI route alternative. qijk'. expected evacuation (sojourn) t i m e of t h e ijkth occupant class. These values must be calculated from t h e particular stochastic model used in the evacuation study, see discussion below. dijk- average distance travelled for t h e ijkth

occupant class.

Because of t h e complexity of solving this model directly, an alternative approach which systematically generates feasible routing alternatives to a relaxed version of our m a t h e m a t i c a l model but at t h e same t i m e measures the critical objectives of evacuation t i m e and distance travelled is proposed and demonstrated in t h e next two sections of t h e paper.

4

Congestion Properties

Some crucial issues guide us in t h e routing/re-routing process: • How should the routing/re-routing • How should alternative

process be

initialized?

NI paths be selected

• How should the re-routing

process be

terminated

In general, a problem faced with multi-objective programming problems is t h e often exponential n u m b e r of NI solutions. We would like to limit the exploration of t h e NI solutions to a manageable quantity by decomposing and relaxing the original problem presented in §C so t h a t we systematically treat one objective at a t i m e . Before we present t h e relaxed m a t h e m a t i c a l model along with the m a t h e m a t i c a l properties, an i m p o r t a n t definition is needed which concerns the eventual gain in re-routing occupants along longer NI paths in t h e evacuation networks.

E x p e c t e d S a v i n g s : Given a current occupant populations' evacuation route, t h e potential gain in re-routing an occupant population along some other NI route is given as: where: Eijk '•= is t h e net increase or decrease in the average egress time per person caused by re-routing occupants to the (kth+1) NI route.

Multi-Objective

Routing

in Stochastic

Evacuation

Networks

269

qtjk := t h e sum of the average queue times per person on the original route. dif.=

t h e increased distance travelled on the (kth+1) NI route (e.g. if t h e kth NI route is 100 feet and t h e kth+l NI route is 120 feet, d*- is equal to 20 feet i.e. 120 minus 100).

u := is t h e average travel speed for d1--. qfj.= t h e sum of the expected queue times per person on the (kth+1)

NI route.

One impor!ant point in what follows is t h a t we assume that we have an accurate estimate of q^. For evacuation and networks which are Markovian (Poisson arrivals and Exponential service), this estimation process is exact. For more general networks however, a stochastic approximation technique is required which gives an estimate of t h e impact of re-routing occupants from their current evacuation p a t h to another NI one. Since t h e alternative evacuation paths are acyclic directed graphs of infinite queues, this estimation problem is difficult, yet not intractable. Let's present a relaxed version of our original mathematical programming problem which will form t h e foundation of what is to follow. Minimize]

fi(x)\

/2(a)}

where: (EvacuationTime)

: fy(x)

=

^Z^ZS^'i*^*^* i

(DistanceTravelled)

: f2(x)

=

3

YL^Z^2,(^iik^iJkXiJk j

•

(7)

k

(8)

k

subject to: {Occupant

Classes) : y , Xq^

=

1

=

0,1

Vij

(9)

k

(Routes)

: xijk

Vijk

(10)

T h e relaxed model removes constraint equations (1&2) and concentrates on evacuating t h e occupant populations along selected routes where time is minimized. T h e relaxation of constraint equations (1&2) is justified in certain evacuation situations where t h e capacity restrictions are not severe or critical. To find an initial NI solution to our relaxed model, we ignore f\(x) and solve our network for the first NI p a t h for each occupant population using a multi-objective shortest p a t h algorithm such as t h e one by Climaco and Martins [4]. Thus: T h e o r e m 1: The collection of 1st shortest paths for each occupant population resents a NI solution for the entire evacuation network.

rep-

270

J. M. Smith

P r o o f : This is a s t a n d a r d result in multi-objective optimization [3], viz. selecting one objective, fi(x), solving a weighting problem with u>,- = 1 and Wj = 0V? ^ i. T h u s , simultaneously minimizing t h e distance travelled for each occupant population results in t h e m i n i m u m distance travelled for all occupant classes for fi(x). | At t h e same t i m e we generate t h e first shortest paths, we compute t h e evacuation (sojourn time) for t h e occupant population. As we shall see, we approach the solution of our original m a t h e m a t i c a l model in §C oscillal ing between the set of NI paths and t h e tradeoffs gained in reducing t h e evacuation time by sending occupant populations along longer p a t h s . T h e 1st shortest p a t h solution for each occupant population may result in a unique optimal solution across all populations for both objectives fi(x), f2(x), if queueing delays along t h e routes are not significant. However, we need to ensure this by quantitatively using our notion of expected savings to verify this. T h e o r e m 2: / / the NI solutions at the first stage of the routing process where each occupant source is routed along the 1st shortest path an expected savings calculation: E

ij = lijk ~ [{dijM

+ Qij] > 0

for some

then, the 1st shortest paths are not a unique optimal solution will improve the evacuation time fi(x).

ijk and further

rerouting

P r o o f : T h e proof of this property rests on constructing a Linear/Integer programming problem and its dual which indicates how one can move from one NI solution to another on t h e efficient frontier, if necessary. T h e Linear/Integer programming problem represents another relaxation of t h e original model in §3 where we are only focusing on t h e expected savings possible by re-routing occupants along longer NI p a t h s . This Linear/Integer programming problem can be considered as a way of scalarizing t h e two objectives into a single objective problem in t h e expected savings of t h e alternative NI routes. Let's establish t h e following expected savings problem: Maximize

Z = ^ ^ ^ i

(Occupant Classes) 2 ^ 1 , ^

EijXijk

i

k

<

1

Vij

(11)

k

x

ijk

5: 0,1

Vijk routes

(12)

where

4 = [?**-[«•/<")+ 4-1 vy* we would like t o reroute t h e occupants perhaps on a longer NI path, if the expected savings in evacuation t i m e would be maximized. T h e unimodularity of t h e above

Multi-Objective Linear/Integer unimodularity Programming If we take alternative NI

Routing

in Stochastic Evacuation Networks

271

program follows from t h e 0,1 properties of t h e sparse m a t r i x [8]. T h e property allows us to solve for integer solutions using ordinary Linear algorithms. t h e dual of t h e above problem, we gain some insight into whether paths may result in some savings in / i ( x ) . T h e dual is: Minimize

^

J ^ 71^ »

3

*a > fe-[(4H + 4 l V i ^ *<j

>

OVy

( 13 ) (14)

Finally, from t h e Dual, we obtain the following Complimentary Slackness conditions: Uij ~ [qijk ~ [(dij/u)

+ q*j]}x,jk = 0

\fijk

At our current NI solution, x{jk = 1 for some NI alternative for each ij occupant population. By t h e above complimentary slackness condition, if Xijk = 1 then =>

{^-hik-[(dyco)+q^}}

= 0.

Therefore, for t h e other JV7 route alternatives of each occupant population source, t h e dual feasibility condition remains unsatisfied, i.e.

if and only if t h e Expected Savings for a route alternative is p o s i t i v e , viz. [ 0. T h u s , t h e current NI solution can be improved if t h e Expected Savings for a NI routing alternative is p o s i t i v e and a new x^k will enter t h e basis.| Since in t h e construction of t h e Linear/Integer programming problem, more t h a n one occupant population might be re-routed, t h e change of basis rule defined below determines t h e occupant population t h a t should be chosen. More t h a n one change of basis is not r e c o m m e n d e d because of t h e nonlinear interaction effects which might be caused by t h e multiple re-routings. Change-of Basis Rule: to the NI path solution for

At each stage of our re-routing which: E* =

max

process, we should

move

{EU,E22,...,EIJ}

V13 sources

Essentially, t h e rule is a greedy one and has been shown to be effective in practice. It will generate a subset of t h e NI solution space and will not guarantee complete enumeration of t h e NI set. To t e r m i n a t e the re-routing process, we have:

J. M. Smith

272 C o r o l l a r y 1: If E* < OV i j occupant sources then no more re-routing result in an improvement in f\{x).

iterations

will

P r o o f : Again, this result is derived from t h e Linear/Integer programming problem using t h e Dual Feasibility and Complimentary Slackness conditions. If t h e only NI route which provides t h e m a x i m u m savings is a current route, for each occupant population source, t h e n no additional Expected Savings Eij > OV ij are possible and t h e process is t e r m i n a t e d .

5

Algorithm

T h e problem we face in our evacuation planning problem is t h a t we do not know a priori which p a t h s are NI without assessing the congestion in G(V,E). We must iteratively generate candidate paths, assess t h e congestion in G(V,E), and then iterate again until t h e desired tradeoffs between distance travelled and evacuation t i m e is acceptable t o t h e planner. This iterative process leads to the algorithm described below. For product form networks where the estimate of t i m e delays in the Expected Savings calculation for re-routing among the alternative NI paths can be computed exactly, t h e n t h e algorithm will guarantee finding a NI p a t h for re-routing t h e occupant classes. For non-product form networks, which are typically t h e case, we can only a p p r o x i m a t e these t i m e delays, therefore, the algorithm can only guarantee an a p p r o x i m a t e NI solution. Considering the complexity of the underlying stochasticinteger p r o g r a m m i n g problem, this is a reasonable and practical strategy. Before presenting t h e algorithm formally, let us discuss our overall approach to t h e evacuation planning problem [15]. Our approach for modelling evacuation planning problems has t h r e e separate b u t interrelated steps: S t e p 1.0 concerns t h e Representation of the region or facility as a queueing network, while S t e p 2.0 concerns t h e Analysis of the queueing network to estimate t h e critical performance measures of the evacuation. A queueing network model is utilized in order to capture t h e potential congestion in the network where large populations come together during t h e evacuation process. Finally, S t e p 3 . 0 concerns t h e synthesis or multi-objective generation of t h e routing paths for t h e occupants, where t r a d e offs among t h e different performance measures estimated in S t e p 2.0 of t h e algorithmic process may be necessary. T h e algorithm to facilitate t h e design methodology can be incorporated into any simulation e . g . Q - G E R T or analytical model e . g . Q N E T - C to estimate / i , / 2 , and carry out t h e evacuation planning/routing analysis. To summarize and focus t h e efforts in this paper, an algorithmic description of S t e p s 1.0, 2 . 0 , & 3.0 and it substeps are presented. S t e p 1.0: R e p r e s e n t a t i o n Represent the underlying facility or region as a network G(V, E) where V :— is a finite set of nodes and E := is a finite set of arcs

Multi-Objective

Routing in Stochastic Evacuation Networks

273

or nodal pairs. Step 2.0: Analysis Analyze G(V, E) as a queueing network either with a transient or steady-state model and compute the total evacuation time of the occupant population along with total distance travelled to evacuate given a set of evacuation paths. Step 3.0 Synthesis Algorithm Step 3.1: Analyze the queueing output from the evacuation model and compute the set of NI evacuation paths which simultaneously minimize time and distance travelled in G(V, E) for each occupant population. 3.1.1 If the set on NI paths are uniquely optimal then

go to Step 3.2 otherwise: 3.1.2 Significant queueing (congestion) exists on one or more routes then go to Step 3.3. Step 3.2: STOP! The NI shortest time/distance routes are optimal and identical and total evacuation time, distance and congestion are minimized. Step 3.3: Determine the total number of occupants who pass through the queueing area(s) and trace them back to their origins. Step 3.4: Select the total number of occupants to be re-routed from each source node. The total number of occupants re-routed is correlated to both the size of the queues and the number of occupants on each route. In selecting the population, the analyst should strive to achieve uniformity of occupants and queues on each egress route. Step 3.5: Re-route the population to the kth route of the NI set of paths where k is selected by employing the following formula:

4 = « i i * - [ ( 4 » + «S] WJfc Step 3.6: Select the largest positive E* for each set of populations to be rerouted, where: £* =

. m a x V\j sources

{EU,E22,---,EIJ}

for all possible savings, and then re-run the computer evacuation planning model with the new set of routes, by returning to Step 2.0 of the General Algorithm. If all E[s are negative, stop! The current set of NI shortest routes used on the previous iteration are selected.

J. M. Smith

274

T h e overall t i m e and space complexity of the algorithm is exponential largely because of t h e t i m e complexity of S t e p 2.0. and the fact that t h e number of NI solution p a t h s for each occupant class may be exponential in the size of t h e network. W h e t h e r one has a product form network or not, t h e time complexity of S t e p 2.0 will be a key bottleneck in t h e efficiency of t h e algorithmic re-routing process. Even if an heuristic is utilized in S t e p 2.0, t h e time and space complexity of the algorithm would then be governed by the exponential number of alternative NI paths for each occupant class.

6

Example

In t h e following section of t h e paper, an example of t h e evacuation of three distinct occupant populations is utilized to demonstrate t h e scope and effectiveness of t h e previous design methodology to route and re-route t h e occupants based on t h e m a t h ematical formulation and subsequent algorithm. T h e example is taken from [11]. Q - G E R T is utilized here to estimate the evacuation times and congestion on the NI set of evacuation routes since it is a transient model and represents t h e most general t y p e of stochastic estimation tool in which t h e multi-objective routing methodology might be used. T h e arrivals to t h e network are from a log-normal distribution and log-normal distributions were used for the arc and nodal service time distributions. More discussion of t h e parameters are included in [11]. At the end of the example discussion, t h e use of Q N E T - C , a steady-state analytical queueing network model, is described to estimate t h e evacuation times. T h e example has three occupant groups located at source nodes # 1 , # 2 , and $ 3 with occupant populations of 40,10 and 30 persons respectively. Figure 1 illustrates t h e sample G(V,E). As you can see in Table # 1 , t h e paths marked with a single * represent t h e set of NI solutions for each of t h e occupant populations. The populations are denoted by t h e large P in each of t h e following tables and the V-j denotation represents t h e sink nodes of G(V, E). T h e other paths were generated by t h e algorithm, but since we are viewing t h e evacuation G(V, E) without congestion, the shortest distance paths also correspond to t h e shortest time paths. T h e above paths are chosen and t h e algorithm returns to S t e p 2 . 0 of t h e General Algorithm to assess the congestion in G(V,E). If no significant congestion exists on these routes, the algorithm terminates. Unfortunately, in our pass through S t e p 2.0 significant congestion at nodes # 5 , # 7 , and # 8 were encountered in the amount of 50.97,2.217 and 2.86 t i m e u n i t s ( t . u . ' s . ) and a total evacuation t i m e was 158.50 t.u.'s. In iterating again through S t e p 3 . 0 with t h e above queueing delay estimates along t h e paths, a different set of NI paths were generated due t o congestion, see Table # 2 . Inspecting Table # 2 , t h e p a t h s denoted by t h e single * are chosen from t h e set of NI paths, while those denoted by four * * * * are dominated or else were considered on the previous iteration. We are thus beginning t o t r a d e off distance travelled in order to seek reductions in evacuation

Multi-Objective

Routing

in Stochastic

Evacuation

Networks

275

time for all occupants. T h e results of the next iteration are displayed in Table # 3 . For this iteration # 2 , t h e total evacuation t i m e slightly decreased from 158.50 —> 152.09 t.u.'s. While t h e delay decreased significantly at node # 5 from 50.97 —> 25.48, t h e delay at node # 8 increased from 2.86 —» 18.70 which thus resulted in the marginal decrease in total evacuation time. W i t h these new queueing delays, t h e NI set was again generated and t h e new NI paths are depicted in Table # 3 . T h e new set of p a t h s were chosen a n d once again t h e algorithm cycled through S t e p 2 . 0 , b u t this iteration resulted in a d r a m a t i c decrease in evacuation time from 152.09 —» 114.23 t.u.'s. It is interesting to note t h a t t h e program Q N E T - C was run with t h e same set of p a t h s and t h e results where similar to t h e G E R T runs in t h a t t h e evacuation times where 165.78, 159.87, and 111.02 for t h e same iterations. T h e discrepancy in evacuation times is due to t h e fact t h a t Q N E T - C utilizes exponential service t i m e distributions for t h e service times along t h e evacuation paths so there is a tendency t o be pessimistic, while QGER.T makes no special distribution assumptions.

J. M. Smith

276

6.1

Iteration # 0

POP = 401

Y

l<

1

POP = 10

yv3

POP = 30( 3 J

v2 Figure 3: Initial Evacuation Graph

G(V,E)

P=l

Time

Distance

Route

V3 = 8 V3 = 9

35 35

120 180

1 -> 5 ^ 7 ^ 8 1-^4-^6^9

P=2 V3 = 8 V3 = 9

Time 24 35

Distance 80 160

Route 2 ^5-+7-+8 2^4-*6^9

P=3 V3 = 8 V3 = 9

Time 27 50

Distance 90 180

Route 3^5-+7^8 3 -> 4 ^ 6 - * 9

Table 1: NI set of Evacuation Paths

Multi-Objective

6.2

Routing

in Stochastic

Evacuation

Networks

277

Iteration # 1 : Evacuation Time = 158.50 t.u.

V2 Figure 4: E v a c u a t i o n G r a p h G(V, E) w / 1st p a t h s

P=l V3 = 8 V3 = S V3 = 9 P=2 V3 = S V3 = 8 ^3 = 9 P=3 V3 = 8 V3 = 8 y3 = 9

Time 91 45 50 Time 80 40 45 Time 83 45 50

Distance 120 160 180 Distance 80 140 120 Distance 90 160 180

Route l->-5->- 7 ^ 8 1 -> 4 - * 6 -» 8 1 ^ 4 ^ 6 — 9

Route 2 - * 5 -• 7 — 8 2 ^ 4 ^ 6 ^ 8 2^ 4^6-^ 9 Route 3 -+ 5 ^ 7 -+ 8 3 ^ 4 ^ 6 -»8 3-^4^6-^9

Table 2: NI set of Evacuation P a t h s

Savings 0 40.61*

**** 0* 37.65

**** 0* 35.05 ****

J. M. Smith

278

6.3

Iteration # 2 Evacuation Time = 152.09 t.u.'s

VW

>V

Figure 5: E v a c u a t i o n G r a p h G(V,E)w/

2nd s e t of p a t h s

Time 107 88 76 Time

Distance

Route

Savings

V3 = 8 V3 = 8 V3 = 9 P=2

120 150 180 Distance

l-+5^7-*8 1^4-+6^7-*8 1 ^ 4 ^ 6 ^ 9 Route

**** 0.38 13.50* Savings

V3 = 8 V3 = 8 V3 = 9 P=3

96 61 49 Time

80 130 160 Distance

2-»5^7^8 2 ^ 4 ^ 6 ^ 7 ^ 8 2->4 -s-6-* 9 Route

0* -12.74 0.39 Savings

v3 = s

99 66 54

90 150 180

3->5->-7-+8 3-»4 — 6 ^ 7-+8 3-^4-^6^9

0* -15.33 -2.21

P=l

V'3 = 8 V'3 = 9

Table 3: NJ set of Evacuation P a t h s

3

Multi-Objective

6.4

Routing

in Stochastic

Evacuation

Networks

279

Iteration # 3 : Evacuation Time 114.23 t.u.'s

v.

J

v2 Figure 6: E v a c u a t i o n G r a p h G(V, E) 3rd s e t of p a t h s

P=l

Time

Distance

Route

Savings

V3 = 8 V3 = 8 V3 = 9 P=2

107 88 78 Time

120 150 180 Distance

1 ^ 5 ^ 7 _ 8 1^4-^6-^7-^8 1-^4-^6^9 Route

**** **** 0* Savings

V3 = 8 Vs = 8 V3 = 9 P=3

96 61 51 Time

80 130 160 Distance

2^5-+7-*8 2^4-+6-* 7^8 2 - ^ 4 -> 6 ^ 9 Route

0* -12.72 -15.38 Savings

3 ^ 5 -+ 7 -^ 8 3 ^ 4 ^ 6 —7^8 3 — 4 -> 6 -+ 9

0* -15.33 -19.74

V3 = 8 V3 = 8 V3 = 9

99 66 56

90 150 180

Table 4: NI set of Evacuation P a t h s Table # 4 illustrates t h e final set of NI paths generated by t h e algorithm. In cycling through S t e p 3 . 0 , no additional savings are possible by re-routing occupants on

J. M.

280

Smith

longer p a t h s , therefore, the last set of paths yield the most favorable decrease in evacuation time for the occupant population.

7

Summary and Conclusions

In this paper, we have focused on the generation of the Noninferior NI set of paths for evacuating occupants in an emergency situation. T h e complex, multi-objective n a t u r e of the problem has been described and mathematical properties and a corresponding algorithm have been developed for generating the set of NI evacuation paths which allow for tradeoffs between evacuation time and distance travelled for the occupant populations. A small example problem also illustrates the iterative process of the algorithm which can be implemented either in a simulation environment e.g. QG E R T , or in an analytical model environment e.g. Q N E T - C .

Acknowledgement This material is based upon work supported by the National Science Foundation under grants #MSM-X1 l-il-ij and #MSS-9U6666.

References [1] J. Ahlberg, Stochastic Queueing Network Program for Evacuation Planning, Master's Project. Department of Industrial Engineering and Operations Research, University of Massachusetts, Amherst, MA 01003 (1988). [2] L.G. C h a h n e t , 11.L. Francis, and P.B. Saunders. Network Models for Building Evacuation. Management Science 2 8 , (1) (1982) 86-105. [3] V. Chankong and Y.Y. Haimes, Multiobjective Methodology (North-Holland, 1983).

Decision

Making:

Theory

and

[4] J.C.N. Climaco and E.Q.V. Martins, A Bi-Criterion Shortest P a t h Algorithm, European Journal of Operations Research,11(1982) 399-404. [5] S. Daskalaki and J. MacGregor Smith. T h e Static Routing Problem in Open Finite Queueing Networks. Presented « ORSA/TIMS Meeting. Miami, Florida, (October 1986.) [6] S. Daskalaki and J. MacGregor Smith, Optimal Routing and Buffer Space Allocation in Series-Parallel Queueing Networks. Invited presentation to t h e EURO/TIMS XXVIII Conference in Paris France, (July 6-8, 1988).

Multi-Objective

Routing in Stochastic Evacuation Networks

281

[7] S. Daskalaki and J. MacGregor Smith, Real Time Routing in Finite Queueing Networks, in Queueing Networks with Blocking, eds. H.G. Perros and T. Altiok. (New York: Elsevier Science Publishers B.V., 1989) 313-324. [8] R Garfmkel and G.L. Nemhauser, Integer Programming. (Wiley, 1972). [9] Hemant Gosavi and J. MacGregor Smith, Heavy Traffic Multi-Commodity Routing in Open Finite Queueing Networks, Paper Presented at the ORSA/TIMS Meeting, Denver, CO. (October 1988.) [10] R. L. Francis and L.G. Chalmet. Network Models for Building Evacuation: A Prototype Primer. Unpublished Paper, Department of Industrial and Systems Engineering, University of Florida, Gainesville, Florida (1980). [11] C.J. Karbowicz and J. MacGregor Smith, A K-Shortest path Routing Heuristic for Stochastic Evacuation Networks. Engineering Optimization 7 (1984) 253-280. [12] A.Pritsker, Modelling and Analysis using Q-GERT Networks. (Wiley, New York, 1979). [13] Harvey M. Salkin, Integer Programming ( Addison-Wesley Publishing Co., Reading Massachusetts, 1975). [14] J. MacGregor Smith and D. Towsley. The Use of Queueing Networks in the Evaluation of Egress from Buildings. Environment and Planning B-8 (1981) 125-139. [15] J.MacGregor Smith, QNET-C: An Interactive Graphics Computer Program for Evacuation Planning Proceedings of the Conference on Emergency Planning, SCS Multiconference, 14-16 (January 1987), pp 19-24

283 Network Optimization Problems, pp. 283-300 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

A Simplex Method for Network Programs with Convex Separable Piecewise Linear Costs and Its Application to Stochastic Transshipment Problems 1 J. Sun Department University, K.-H. Tsai Department

of Industrial Engineering and Management Evanston, IL 60208, USA

of Computer

L. Qi School of Mathematics, NSW2033, Australia

Sciences,

National

The University

Normal

of New South

Sciences,

University,

Wales,

Northwestern

Taipei,

Taiwan

Kensington,

Abstract

This paper is concerned with the pure network program whose objective function is convex, separable, and piecewise linear. We describe a direct simplex algorithm and its implementation for solving such problems. Computational results of applying this algorithm to stochastic transshipment problems are reported. The algorithm keeps the number of variables in the original level by allowing nonbasic variables to take breakpoint values and by using a straightforward pricing strategy similar to the traditional network simplex method. Tree data structure is used to construct efficient implementation. Computational results indicate that the solution time is insensitive to the increase of "piecenumber" of the objective function. As a result, we have been able to solve stochastic transshipment problems of more than 30,000 arcs and 3000 nodes, where each node has a discrete random demand of 100 possible values, within one minute on a SUN computer.

'The research is supported in part by NSF and Australian Research Counsel.

J. Sun, K.-H. Tsai, and L. Qi

284

1

Introduction

This paper is aimed at t h e following optimization problem (NetPLP) |

minimize subject to

F(x) = £ " = 1 Ax = 6,

fj(xj)

where A £ Ji^xn j s ^ n e node-arc incidence m a t r i x of a connected network, x = (x-i, • • • ,xn)T, c = (ci, • • • , c „ ) T £ if!", and b £ 7? m . Each fj is a convex piecewise linear function of t h e single variable Xj and has t h e following form: SjiXj

fj(xj)

+ rji,

if Cjo < Xj < CjX;

= Sjk,X3 + Tjkj,

if Cjkj-l

+co

otherwise,

< X3 <

Cjk/,

where for each j , we call t h e numbers c ; o, c ; i , . . . , c,^ t h e breakpoints of the variable XJ; they satisfy — OO < Cjo < Cji < . . . < Cjkj < + 0 0 .

Notice t h a t t h e bound constraints CJQ < Xj < Cj^ , for j = 1, • • •, n, are imposed from t h e definition of / , . If all kj equal to one, this model reduces to t h e ordinary network linear program. T h e problem ( N e t P L P ) arises i n at least two areas of operations research. First, it is used to model practical network problems with linear penalties or variable linear costs. Later in this paper, we will show how stochastic transshipment problems with r a n d o m d e m a n d s can be formulated as an ( N e t P L P ) . Second, it may represent an approximation of a network separable convex program, especially when t h e exact formula of t h e cost function is unknown and only experimental d a t a are available. T h e third possible application of ( N e t P L P ) is to offer a warm start for nonlinear network algorithms. We admit t h a t a nonlinear network program should be in general approached by nonlinear programming algorithms, b u t it might be as well i m p o r t a n t to find a good staring point for those algorithms to be truly efficient. A good algorithm for ( N e t P L P ) could conveniently produce such a warm start by solving a piecewise linearization of t h e nonlinear problem. In addition, t h e algorithms for ( N e t P L P ) could also be useful in directly solving some nonlinear problems, as demonstrated by Charnes, Song and Ali [CSA86]. Theoretically speaking, ( N e t P L P ) can be solved by reformulating it as a linear network problem where kj new arcs are introduced to replace the arc j . However, in practice, we would rather seek algorithms t h a t directly deal with the piecewise linear objective function because reformulation would often increase t h e number of variables to a prohibited level. Some authors have contributed specific pricing rules for applying the simplex m e t h o d to certain cases of ( N e t P L P ) , for example, see t h e algorithm of

Algorithm

for Network

Piece-wise Linear

Programming

285

Ali, Cook and Kress [ACK86] for ordinal ranking problems and t h e algorithm of Wets [We83] for stochastic programs with simple recourse. T h e contribution of this paper, however, is to introduce a unified simplex approach for solving ( N e t P L P ) and to elaborate its implementation based on t h e tree d a t a structure (e.g. Chvatal[C83]). Other t h a n thinking of implicitly dealing with t h e reformulated linear problem, we develop a direct pricing rule t h a t can be implemented efficiently using t h e tree d a t a structure. Although there have been considerable developments in interior point methods for linear p r o g r a m m i n g in recent years, t h e simplex m e t h o d remains being very competitive in solving network linear programs. It is our belief t h a t t h e simplex m e t h o d would be still efficient in solving t h e piecewise linear problem. To support our contention, we first test 40 randomly generated ( N e t P L P ) problems similar to t h a t of Klingman et. al [KNS74]. T h e n we choose four of these problems and observe how t h e solution t i m e depends on t h e piece n u m b e r kj. We find t h e increase of solution time is insignificant compared to t h e increase of k\ + • • • + kn. An i m p o r t a n t application of our algorithm is to solve stochastic transshipment problems ( S T P ) with discrete r a n d o m demand. This model include t h e classical stochastic t r a n s p o r t a t i o n problem as a special case. Due to t h e underlying network structure, t h e stochastic transshipment problem can b e efficiently solved by t h e network piecewise linear programming approach. In our computational test the method has been able to solve a problem of 35,000 arcs and 3000 nodes, with each node having a r a n d o m d e m a n d of 100 possible values, within one m i n u t e on a SUN computer. In t h e literature, some researchers have proposed direct m e t h o d s to solve general convex piecewise linear programs (Rockafellar [R84], Fourer [F88], Premoli [P87]). However, to t h e best of t h e a u t h o r s ' knowledge, none of those m e t h o d s has been implemented under t h e network circumstance. For an extensive literature list of applications of piecewise linear programs, see Fourer [F86]. This paper is organized as follows. In the next section we review t h e m a t h e m a t ical background. T h e m e t h o d and its convergence property are given in Section 3. Section 4 is devoted to implementation techniques. Computational experiment on 40 randomly generated b e n c h m a r k ( N e t P L P ) is reported in Section 5. T h e S T P is introduced and t h e corresponding test problems are solved in Section 6. Since only discrete distributions of d e m a n d in S T P can result in network piecewise linear programs, we discuss how to use t h e algorithm in S T P with continuous distribution and other extensions in Section 7.

2

Background Materials

Nothing much is new in this section. For more detailed introduction of network flow problems and monotropic optimization, we refer the reader to [R84]. Here, we just s t a t e some i m p o r t a n t facts t h a t are to be used in t h e sequel.

286

J. Sun, K.-H. Tsai, and L. Qi

Let A g Rmxn be t h e node-arc incidence m a t r i x of a connected network and let x = (xi, • • •, xn)T be a flow vector. Let [B, N] be a basis-nonbasis partition of t h e arc index set J = {1, • • •, n}. We then have t h e corresponding partition of the vector x = (XB, IJV)- A variable Xj in t h e vector XB is called a basic variable and in t h e vector xpf a nonbasic variable. T h e arcs in B form a m a x i m a l spanning tree of the network. For simplicity of notation, we will use B to represent both t h e index set and the tree. It is well-known (e.g. Rockafellar [R84]) t h a t , for each partition J = BU N, there are a vector b and a m a t r i x K such t h a t XB = KXN -f b, is an equivalent system to the system Ax = b. Moreover, t h e m a t r i x K has a combinatorial structure, namely, for each j £ N, t h e j ' - t h column of K is t h e incidence vector of t h e unique p a t h in tree B t h a t starts from t h e head node of arc j and terminates at the tail node of j . T h u s the increase (decrease) of the nonbasic variable Xj in t h e simplex m e t h o d corresponds to a equal increase (decrease) of t h e fluxes along t h e circuit Cj t h a t consists of this p a t h and arc j with j being positive in Cj. We call t h e incidence vector of the circuit Cj a simplex direction. Let d? = [d\, • • • ,d3n) be this vector. T h e n we have

(

1 if arc j is positive in Cj, — 1 if arc j is negative in Cj, 0 if arc j is not in Cj.

We now give a new interpretation of the reduced cost in linear programming, which will be instructive later in describing our pricing rule. O b s e r v a t i o n . In a network linear program (correspondingly, all kj = 1 in ( N e t P L P ) ) , if a; is a nondegenerate solution, or a degenerate solution with a non-zero ratio, the reduced cost associated with Xj in the simplex method is t h e directional derivative of F a t i along d3. i. e. F'{x y ^)

'

= l^X Ho

+

t d

t

^ -

F

^ .

In t h e case of degenerated solution with zero ratio, F'(x,d3) = oo. As a m a t t e r of fact, from t h e theory of monotropic programming the above directional derivative can be computed by F\x,d3)

= £ > a x K / f ( s O , d ? / , + (*/)},

(2.1)

where /,"" and / , + are t h e ordinary left and right derivatives of / ; (could be ± o o ) . In the case t h a t all kj = 1„ one has ftx\=fsixi I oo

if cm < xi < otherwise.

cn,

Therefore, if x is a nondegenerate basic feasible solution in t h e sense of usual network linear programming, one has c/0 < x\ < cn V / € B, which implies ff{x{) = ff(xi) =

Algorithm

for Network

Piecewise

Linear

Programming

287

5/,V / 6 B. therefore we get F\x,

#) = £

3/

-

£

a,

= si - ( P * - W«)»

(2-2)

where Cj" and CT are t h e positive and negative arc sets in Cj, respectively, pjh and Pjt are t h e prices at t h e head and t h e tail of arc j . T h e price vector p = (pi, • • •, pm) assigns to each node a price such t h a t pih — pit = S| for every / € B. (2.2) is exactly t h e formula used to c o m p u t e t h e reduced cost in linear network programs. It is also easy to see in t h e case of degenerate solution with non-zero ratio, all basic arcs with x; = C(0 are in Cj", while those with X{ = c/t, are in C~, so formula (2.2) is still valid. Finally, in t h e case of degenerate solution with zero ratio, at least one basic arc / with xi0 = c/0 is in C~, or at least one basic arc / with xi0 = c^, is in Cj, hence t h e /-th t e r m in (2.1) is + c o , resulting in F'(x,dJ) = oo. T h e simplex m e t h o d looks for a simplex direction d3, j € N such t h a t F'(x, d1) < 0. In t h e piecewise linear case, t h e simplex direction is defined as either d3 or — d3, and we simply call t h e derivative (2.1) t h e reduced cost and t h e right h a n d side of (2.2) t h e nominal cost. It is t h e reduced cost, not the nominal cost, determines if d3 is a feasible descend direction. However, since nominal cost can be obtained by relatively small efforts, we use it as an indicator for possible descend simplex directions. In our implementation, we first find a candidate of a descent simplex direction by checking if its nominal cost is negative, then as we go along the circuit Cj to c o m p u t e the maximal allowable change of flux (ratio test), we successively add t h e differentials between t h e reduced cost and t h e nominal cost to t h e later one to get t h e real directional derivative (2.1). D e f i n i t i o n 2 . 1 A basic solution x = [ I J , I J V ] to ( N e t P L P ) is defined as a solution to Ax = b, where each nonbasic variable takes one of its breakpoint values. If, in addition, F(x) is finite, then x is called a basic feasible solution. Using t h e "tree language", each basic feasible solution of ( N e t P L P ) can be identified as a m a x i m a l spanning tree so t h a t F(x) is finite and all arcs not in the tree take breakpoint values. T h e following proposition can be derived from a similar property of linear programs. P r o p o s i t i o n 2.2 If ( N e t P L P ) has an optimal solution, t h e n there exists a basic feasible solution which is also optimal. Suppose t h a t a; is a basic feasible solution to problem ( N e t P L P ) . One of the fundamental properties of ( N e t P L P ) is t h e following (see [R84]). P r o p o s i t i o n 2 . 3 T h e following statements are equivalent: (a) x is o p t i m a l to ( N e t P L P ) . (b) x is feasible and for all possible partitions J — B U N, none of the reduced cost of the simplex directions is strictly negative.

J. Sun, K.-H. Tsai, and L. Qi

288

(c) x is feasible and there exist a price vector p G Rm and a differential vector v = — ATp, such t h a t fjT(xi) < vi < /, + (xj), for / = 1, • • •, n. Notice t h a t by t h e special structure of t h e node-arc incidence m a t r i x A, one has vi = pih — pit. Now we are ready to s t a t e t h e algorithm.

3

The Simplex Algorithm for (NetPLP) and Its Convergence

A l g o r i t h m 3.1 S t e p 0. Find an initial basic feasible solution of ( N e t P L P ) or show t h a t the problem is infeasible. If a basic feasible solution is found, then set this solution as x° and k = 0. Go t o Step 1. S t e p 1. Determine by Proposition 2.3(c) whether t h e current solution is optimal. If not, find a simplex direction associated with a nonbasic variable j such t h a t F'(xk,d3) < 0. Go to Step 2. By Proposition 2.3(b), such a direction exists. S t e p 2 . Set xk+1 = xk + adJ where a minimizes F(xk + adJ) over a > 0. If such an a does not exist, i.e. inf a >o F(x + ad3) = —oo, stop; t h e ( N e t P L P ) has unbounded solution. Otherwise, set k = k + 1 and go to Step 1. It will be shown (Proposition 4.3) t h a t the minimizer a in Step 2 can be selected so t h a t xk+1 is again a basic feasible solution. If we maintain this machinery in Algorithm 3.1, t h e n it is obvious t h a t in a finite number of iterations, t h e algorithm will either find an unbounded solution or an optimal basic feasible solution because t h e total n u m b e r of basic feasible solution is finite and t h e strictly decreasing sequence {F(xk)} ensures t h a t no same basic feasible solution can repeat in t h e sequence {a; 4 }.

4 4.1

Implementation of the Algorithm Data Structure

We use t h e "tree d a t a structure" t h a t has been long used in m a n y successful implementations of network simplex m e t h o d s . For a description of it, see [C83]. T h e arrays used are as follows. F R O M and T O arrays. These two arc-length arrays store t h e initial and ending nodes of arcs. B R E A K P O I N T and S L O P E arrays. These arrays store t h e value of breakpoints and slopes for arcs in their natural order. Namely, in t h e orders of Cio, • • • , e n , , • • •, cno, • • •, cnk„

and

s n , • • •, s u , , • • •, s n i , • • • , snkn.

S T A T E array. This arc-length array records which breakpoint value is being taken by a nonbasic variable.

Algorithm

for Network

Piecewise

Linear

Programming

289

X array. This node-length array stores t h e values of basic variables. P array. This node-length array stores t h e dual variables, i.e. t h e price vector. A U array. This node-length array is used for convenience of adjusting dual variables. To maneuver operations on a tree, we use four node-length arrays called P R E D E C E S S O R , D E P T H , T H R E A D and E D G E . T h e functions of t h e first three can be found in [C83]. T h e last array points to t h e arc in t h e tree t h a t is above and incident to a given node. Here by convention, t h e tree is thought of as upside-down.

4.2

Implementation of Step 0

We use t h e Gradual Penalty Method ( G P M ) in Grigoriadis [G86] to solve ( N e t P L P ) in a single phase. It is a so-called big-M m e t h o d with a self-adaptive mechanism for choosing t h e n u m b e r M. It creates an initial base with all-artificial arcs, each with a m o d e r a t e linear penalty cost, and solve t h e augmented N e t P L P . If t h e optimal solution contains a positive artificial variable, its penalty is enlarged gradually to certain limit, and t h e problem is repeatedly solved until t h e artificial arc has a zero flux, otherwise t h e problem is claimed infeasible. A empirical formula M = m i n { ^ , l + ( m - l ) m a x { | s . | , | s * | } , s , + (s* -

s,)l.5[2*~2]}

is used to decide the penalty, where s» and s* are t h e m i n i m u m and m a x i m u m slopes of t h e cost functions, respectively; 7r is a pass n u m b e r which refers to t h e times of penalty changes, [i is a threshold value (10 9 in our code).

4.3

Implementation of Step 1

4 . 3 . 1 T h e S e t t i n g of t h e P r i c e V e c t o r As we mentioned in Section 2, given a tree B, each node i is associated with a price Pi. T h u s one has vi = p//, — pit. In our implementation only p is stored. Initially, the vector p is set so t h a t vi = / ( + (a:?) VZ 6 B (if ff{x°) happen to be + 0 0 , then vi = fi~(x°)). Same as in t h e linear network simplex m e t h o d , initially p.'s are set to satisfy t h e system pih — p/ ( = vi, VZ G B. Afterwards p can be u p d a t e d together with t h e u p d a t e of t h e tree. However, no m a t t e r how t h e vector p is u p d a t e d , we always keep / , _ ( x i ) < u; < ff(xi) for all / 6 B, and pih — pit = vi, VZ € B. Especially, after t h e line search in Step 2, t h e p,'s related to t h e simplex direction d? need to be changed according to t h e rule Pih - Pit = ff(xi)

if d] = 1, p,h - pu = fi~(xi)

if d\ = - 1 .

To save time, these changes are recorded in A U , while t h e m i n i m u m ratio a is computed. It can be seen t h a t such u p d a t i n g procedure always assigns vi = f+(xi) or fl~(xi), VZ G B, and t h e descent property of d' will ensure ft{xj) / 00 and

290

J. Sun, K.-H. Tsai, and L. Qi

ff{xj) 7^ —oo. At t h e end of an iteration, if there is a pivoting, then p is u p d a t e d by taking into consideration of b o t h t h e change of t h e tree and t h e value of A U . T h e above c o m p u t a t i o n procedure for p is completely t h e same as t h e linear case, if t h e objective function happens to be linear. 4 . 3 . 2 T h e S e a r c h of a D e s c e n t S i m p l e x D i r e c t i o n T h e key to find such a direction is t h e computation of t h e reduced cost (2.1) in Section 2. We separate t h e task into two processes. In process 1, we c o m p u t e the nominal cost as if we were dealing with t h e linear case. T h e concrete process is: For each arc j £ N, if pjh — Pjt > ff{xj), then d3, as described in Section 2, is taken as a candidate of t h e descent direction and the nominal cost is rij = ff{xj) — pj/, + pjt. If pjh — Pjt < f7{xj), then — d3 is taken as a candidate of t h e descent vector and the nominal cost is rij = —//" (XJ) +Pjh ~Pjt- If none of such j exists, namely for all j 6 N, fr(xj) — Pjh ~ Pjt ^ ff(xj)> t h e n t h e condition (c) in Proposition 3.3 is satisfied and x is t h e optimal solution. Otherwise, a sample pricing strategy is adopted to select an entering arc from among a group of candidates. This strategy, with details in [G86], also reduces t h e danger of cycling when degeneracy arises. Note t h a t t h e nominal cost n} is not necessarily t h e directional derivative F'(x, d3) in (2.1). In fact, we always have rij < F'(x,d3) and the equality is valid only if V( = fj~(xi) as dj = —1 and i>/ = ff(xi) as d\ = 1, V j € B. Therefore, if rij > 0, then d3 is definitely not a descent direction, while if rij < 0, we still have to c o m p u t e (2.1) to know whether d3 is really a descent direction. T h e case of rij < 0 and F'(x, d3) > 0 results a degenerate iteration in which we pivot to change t h e base but t h e iterative solution remains t h e same. After t h a t , the process 1 is restarted. In process 2, t h e computation of F'(x, d3) and t h e line search in Step 2 are carried over simultaneously. Notice t h a t both operations need to travel around the circuit indicated by d3, we do t h e two jobs by traveling around t h e circuit once. We leave t h e details in Section 4.4.

4.4

The Implementation of Step 2

As mentioned in Section 4.3, suppose t h a t rij < 0, we need to check if d3 is really a descent vector, and if it is, do a line search along d3. To achieve the two goals together, we first use t h e P R E D E C E S S O R , D E P T H , and E D G E arrays to identify the arcs of t h e circuit associated with d3. Details may be found in [C83]. Once an arc / in this circuit is identified, c o m p u t e , -i • • ,. • J c\ — xi if d3, = 1 (a) m i n i m u m ratio : a = m m < „ '• w ( \x, -c, if d\ = - 1 ' and (b) directional derivative : / ? = / ? + ( ^ ^ ~^xh~Jli\ [(Pih-pit)-f, (xi)

l

\ d\ = l lid3, =

, -1

Algorithm

for Network

Piecewise

Linear

291

Programming

where Q > xi is t h e closest breakpoint on t h e right of xi and c( < x\ is t h e closest breakpoint on t h e left of x\. If no such breakpoint exists, we regard t h e difference c; — x\ or x\ — C[ as + 0 0 . T h e initial value of j3 is ff{xj) — pjk + pjt = rij if tfj = 1, or -ff(xi) ~ Pit + Pjh if d] = - 1 . If P become positive or zero at some arc / (note: Initially /? < 0), then d3 is not a descent elementary direction because j3 increasingly reaches F'(x,d3) when we travel around t h e circuit (see Proposition 4.2 below). In this case we adjourn t h e process and pivot on (j, I). Here arc j is t h e entering arc and arc / is t h e leaving one. After pivoting, we go to Step 1 and start the next iteration. If after scanning the entire circuit we still have j3 < 0, then d3 is a real descent direction because now P = F'(x,d3) (see Proposition 4.2 below). Set x <— x + ad3.

(4.1)

If a = 00, t h e ( N e t P L P ) has unbounded solution. Otherwise, repeat (a)(b) until t h e pivoting operation happens. It should be noticed t h a t t h e values of d] can be obtained through t h e direction of t h e arc / in t h e circuit. Therefore, there is no need to store it. Since d1 could be still a descent direction at t h e new x obtained in (4.1), we repeat (a) and (b) until pivoting is necessary. Hence t h e line search can cross several pieces of F(x) instead of just one. To justify t h a t t h e implementation really achieves the goal of Algorithm 3.1, we prove t h e following. P r o p o s i t i o n 4.1 T h e j3 computed after scanning t h e whole circuit is t h e desired derivative (2.1). P r o o f . We only prove t h e case of d3j = 1, t h e case of crj = —1 can be discussed similarly, d1 satisfies Ad1 = 0 and v satisfies v = —ATp, so we have vTdJ = 52"=i d3kvi — 0. T h u s Vj = - EigB d\vi. At beginning we have (3 = ff{xj)-Vj = fj~{xj) + Y2ieB d\vi. In (b), we add /, + (x/) and subtract i>; when d] = 1, and add —fi~(xi) and subtract —vi when dj = — 1 for each / g B. These operations are equivalent to adding the terms max{
according to (2.1). (Q.S.V.) P r o p o s i t i o n 4.2 For any 0 < a < a, F(x + ad3) = F(x) + aP. P r o o f . By t h e way of determining a, F(x) is linear on t h e line segment [x, x + ad3}. T h u s F(x + ad3) = F(x) + aF'(x,d3) = F(x) + a/3 according to Proposition 4.1.

(Q.e.v.) ^From this proposition, ( N e t P L P ) is unbounded if a = 00. We still need to show P r o p o s i t i o n 4 . 3 After each iteration x remains to be a basic feasible solution.

292

J. Sun, K.-H. Tsai, and L. Qi

P r o o f . A new x is obtained from t h e current i b y a pivoting operation. T h e criterion for choosing t h e leaving variable is t h e change of sign of /3. This change is possible only if /; + (a;;) ^ vi or ff(xi) ^ vi. By t h e setting of vi, vi is either ff{xi) or fj~(xi). Hence Xf m u s t be a breakpoint and t h e new x is then a basic solution. T h e feasibility comes from Ad3 = 0 and F(x + ad3) < oo, which is implied by Proposition 4.2. (Q.S.V.) In summary, if after t h e circuit is scanned we have /? < 0, then we do a line search and a real decrease of t h e objective function is achieved. T h e new iterative solution is still a basic feasible one. On t h e other hand during t h e course if for some arc / in t h e circuit ft becomes nonnegative, then no decrease is m a d e and we merely pivot to another basic feasible solution (a degenerate step). In addition, if there is no "cycling" in consecutive degenerate steps, we will eventually end up with one of the three alternatives: Optimality, unboundedness, or a descent direction. Although there exist some anti-cycling procedures [F88], for simplicity of coding and efficiency of t h e algorithm, we did not adopt anyone of t h e m . However, it seems t h a t our policy of choosing t h e "first arc t h a t makes /? negative" as t h e leaving arc and t h e sample pricing strategy used in selecting t h e entering arc have practically prevented t h e algorithm from cycling. In all tested problems, no cycling is observed.

5

Computational Results

T h e 40 benchmark problems in Klingman et al. [KNS74] are generated with piecewise linear costs (Although in its original sense t h e assignment problem should not have piecewise linear cost, we treat it here as a special transportation problem for testing purposes.). Each arc incurs a piecewise linear cost whose breakpoints and slopes are randomly assigned according to uniform distribution. T h e code is written in integer version of F O R T R A N , compiled by F77 with optimization option - 0 , and executed on a SUN computer. Table 5.1 lists t h e computational time (in seconds) for these problems which does not include t h e i n p u t - o u t p u t time. Table 5.2 presents results for various kj. Four representative problems are solved repeatedly with respect to different numbers of pieces in their objective functions (without loss of generality, all / j ' s are assumed to have t h e same kj, because one can introduce r e d u n d a n t breakpoints if necessary). It is interesting to see t h a t the c o m p u t a t i o n a l t i m e of t h e m e t h o d does not change much with respect to t h e changes of kj.

Algorithm for Network Piecewise Linear Programming

293

Problem Number

Type

m

n

Max. No. of Pieces

C P U T i m e of NETPLP

Total Number of Iterations

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

Transportation Transportation Transportation Transportation Transportation Transportation Transportation Transportation Transportation Transportation Assignment Assignment Assignment Assignment Assignment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment

200 200 200 200 200 300 300 300 300 300 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 1000 1000 1000 1000 1500 1500 1500 1500 8000 5000 3000 5000 3000

1308 1511 2000 2200 2900 3174 4519 5169 6075 6320 1500 2250 3000 3750 4500 1306 2443 1306 2443 1416 2836 1416 2836 1382 2676 1382 2676 2900 3400 4400 4800 4342 4385 5107 5730 15000 23000 35000 15000 23000

4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

0.15 0.15 0.20 0.23 0.25 0.34 0.50 0.62 0.53 0.59 0.30 0.37 0.48 0.57 0.59 0.1& 0.30 0.17 0.33 0.23 0.33 0.21 0.30 0.23 0.47 0.16 0.31 0.83 1.03 1.09 1.09 1.82 1.95 2.05 2.32 41.53 25.35 17.04 20.66 12.86

636 566 801 785 881 1158 1531 1708 1550 1746 1252 1533 1936 2235 2294 1036 1724 984 1718 1178 1773 1056 1570 1366 2648 906 2050 2258 2713 3328 3316 3857 4010 4391 5408 15052 14768 14574 11696 11983

Table 5.1. Solution Times and Optimal values of 40 Benchmark Problems (on SUN SPARC 2)

J. Sun, K.-H. Tsui, and L. Qi

294

No. of pieces

Problem # 1

Problem # 1 3

Problem # 1 6

Problem # 2 8

8 12 16 20 24 28 32 36 40 44 48 52 56 60 64 68 72 76 80 84 88 92 96 100

0.42 0.39 0.47 0.40 0.43 0.43 0.52 0.51 0.48 0.52 0.42 0.47 0.47 0.42 0.42 0.42 0.42 0.53 0.48 0.50 0.40 0.38 0.49 0.40

1.60 1.35 1.74 1.90 1.25 1.30 1.41 1.22 1.29 1.33 1.17 1.19 0.96 1.23 1.02 0.91 0.88 1.00 0.99 0.88 0.68 0.84 0.88 0.91

0.88 0.91 0.94 1.16 1.19 0.82 0.86 1.11 1.07 1.03 0.92 1.00 1.03 1.05 1.07 1.34 1.29 1.46 1.29 1.29 1.49 1.19 1.18 1.32

2.37 2.58 2.61 2.55 2.77 1.99 2.42 2.58 2.47 2.27 2.30 2.48 2.84 2.11 2.48 2.54 2.42 2.18 2.84 2.41 2.54 2.63 2.53 2.58

Table 5.2. Solution Times as Number of Pieces Increases (on S U N 4 / 3 3 0 )

6

The STP and Computational Results

The STP can be described as follows. Suppose that a commodity is manufactured and consumed in m cities and transported between them through a highway network. At city i; (1 < i < m) there is a random demand u>; and a fixed supply 6,. The commodity flux Xj along highway link j (1 < j < n) has lower and upper limits dj and df. In accordance with the usual convention, negative flux is understood as positive flux in the opposite direction. The transportation cost along highway j is a convex piecewise linear function tj(xj). (e.g. qf per positive unit and qj per negative unit) Let E be the m x n incidence matrix of the network and Z{ be the net supply (i.e. the fixed supply minus export) of the commodity at city i. Then in vector notation we have b-Ex = z, d' <x
Algorithm

for Network

where x = (xi:and b = (hi, • • are understood T h e penalty

Piecewise

Linear

Programming

295

• • ,xn)T,, d~ = (d^,--- ,d~)T, and d+ = (df, • • • ,d+)T belong to Rn, T •, bm) and z = (zj, • • • , zm)T belong to Rm. T h e vector inequalities coordinatewise. for surplus or shortage of t h e commodity at city i is introduced as Fi(z{, Wi) = hi max{0, z; — «>,-} + /; max{0, u>,- — z , } ,

where /i; and /; are given cost coefficients, hi > 0 and /; > 0. We want to choose t h e a m o u n t of t h e c o m m o d i t y transported on each highway (i.e. t h e vector x ), subject to (6.1), so as to minimize t h e total shipment costs plus t h e expected total penalty, or in other words to minimize n

m

4>{x,z) = X>-(xJ-) + +£H,{X) #(*,-, «?0} t=l

i=l

j=l

1=1

where £w stands for t h e m a t h e m a t i c a l expectation with respect to r a n d o m vector W = [ y i > - - - , ^ m ] r and Ui(zi) = £w{Fi(zi,Wi)}. It is shown [S86] t h a t u,- is convex and t h a t if u>,- has marginal probability distribution function Wi then t h e subdifferential of u,- at z,- is given by t h e formula dui{zi)

= [{hi + h) lim Wi(0

-li,(h,

+ l,) lim Wi(t) - l,].

(6.2)

Specifically, if %£i has discrete (marginal) distribution of finite support 0 = {c; 0 , • • •,

cikt},

then, from (6.2) t h e function it; is convex piecewise linear and it consists of fc, -f 2 linear pieces with c; 0 , • • • , c ^ , being t h e breakpoints. We refer to this case as the discrete STP. T h e works on S T P can be looked back upon t h e paper of Dantzig and Ferguson on stochastic transportation problem [FD56]. Since then, the stochastic transportation problem and other network-related stochastic programs have received considerable attention over t h e years (see, for instance, [CL77] [B79] [E60] [L80] [P86] [Q84b] [Q85] [Q87] [Wi63] [Co78] [Q84a] [MV88a] [MV88b] [W86] [W89] [WW89], which cover t h e applications such as traffic control, production planning, and financial investment). Because of t h e non-bipartite network structure, our S T P model is more general t h a n the stochastic t r a n s p o r t a t i o n problem. On t h e other hand, due to t h e explicit form of t h e recourse function Fi(zi, Wi), the model is a special case of stochastic programming with simple recourse [We83]. Existing methods for t h e stochastic transportation problem do not lend themselves an obvious extension to S T P , while general methods for stochastic p r o g r a m m i n g with simple recourse do not take advantage of t h e network structure and hence tend to be inefficient for STP.

J. Sun, K.-H. Tsai, and L. Qi

296

We suggest t h a t t h e S T P problem be solved by explicitly solving its deterministic equivalence. To t u r n a discrete S T P into a piecewise linear network program, we set an artificial node r and m additional arcs, each initiates from a node i of the original network and t e r m i n a t e s at r. T h e flux on arc (i, r) is z< and its cost is Pj(zi). Now the S T P is equivalent to t h e following piecewise linear program: ' minimize^

£ " = i tj(xj)

+ YZLi Ui(zi)

••"*** (f -'OCM-iJ' dj <

XJ

(63)

< d* for j = 1, • • • , n,

T

where / is a unit m a t r i x , —e = ( — 1, • • • , —1) and b = (bi, • • • , bm)T. T h e functions tj and Ui are convex piecewise linear. Thus the S T P is a special case of the problem (NetPLP). In our computational test, t h e forty benchmark problems developed by Klingm a n , Napier and Stutz [KNS74] for network linear programming are regenerated under t h e S T P context. Although t h e r a n d o m d e m a n d does not apply to assignment problems, we still test t h e m and treat t h e m as transportation problems with b = (1, • • • , 1, — 1, • • • , — 1)T. In each problem, we associate to t h e nodes uniformly distributed d e m a n d s having 100 possible values. This is probably a typical size of t h e support for discrete distributions in practice. These problems cover transportation and transshipment networks of various size and have been long used in evaluating and comparing algorithms in network linear programming. Because at this time, no other codes for S T P are available to us, we have not been able to compare t h e efficiency of this m e t h o d with other m e t h o d s . However, t h e solution times in all tested problems are satisfactory — none of t h e m need more t h a n one m i n u t e of C P U time. Table 6.1 listed results for these S T P s . T h e biggest problem in this table is a transshipment problem with 3000 nodes and 35000 arcs. Each arc has a linear shipment cost and each node has a 100-piece expected penalty cost. Overall, we feel t h a t there are at least two advantages to solve S T P (including stochastic transportation problems) by using t h e network piecewise linear algorithm: First, t h e m e t h o d is conceptually simpler t h a n other current methods and yet is practically efficient in solving large-scale problems. Second, it is flexible in dealing with different types of problems and complications in practice, for example, it allows t h e transportation cost also to be piecewise linear.

7

S T P C and Other Extensions

Assume t h a t in problem (6.3) t h e distributions of the r a n d o m d e m a n d s at cities are continuous. T h u s , u;(z t ) are convex functions, but not necessarily piecewise linear. We may first discretize t h e distributions. This is equivalent to use convex piecewise linear functions u;(z;) to approximate «,(£;). T h e properties of approximating a

Algorithm

for Network

Piecewise

Linear

297

Programming

convex function by a piecewise linear function were studied by Geoffrion [G77]. In general his result requires a global finer division of t h e domain in order to obtain higher accuracy. However, for t h e S T P , more grid points m a y not be necessary. Let us denote t h e approximation problem by S T P D . Suppose t h a t we get an optimal solution (x, z) for S T P D by using t h e ( N e t P L P ) m e t h o d . We now discuss how to improve this approximation solution for S T P C . P r o p o s i t i o n 7 . 1 . S T P C has an optimal solution (if it exists) (x*,z*) such t h a t x*takes a breakpoint value of t h e domain of tj for each j except j belongs to a set of arcs forming a spanning tree in t h e network. P r o o f . Suppose t h a t an optimal solution to S T P C is ( x , z * ) . Fix t h e values of z at 2*. T h e resulted problem is a piecewise linear network program which has an optimal solution x* satisfying t h e requirements. (Q.S.V.) If pi are good piecewise linear approximations to p,-, we may expect Xj to take t h e same breakpoint value as x*- for each j except j belongs to an optimal spanning tree T. We m a y also assume t h a t Xj is in t h e same linear "piece" of t h e domain of tj as x*j if j e T. To find x*, it suffices to find t h e values of x"j for j G T by fixing the values of Xj for j' £ T at these breakpoint values. Replace Xj [j £ T) by Cj/t, where Cjk is t h e corresponding breakpoint values and adjust 6; accordingly. T h e n it suffices to solve the following problem: ' minimize^

£ j e T tj(xj)

+ YT=\ «••(•*••)

(STPT) subject to

1

(^

4 0

(.)

= ( - E L 6.'

dj < Xj < d+ for j e T,

where E is t h e incidence m a t r i x of T and x = {XJ)J^J. As proved in [Q85], S T P T m a y be solved with the work of solving a one-dimensional monotone equation. This certainly gives us a better approximation solution of S T P C . One of t h e extensions of S T P is the stochastic generalized transshipment problem, i.e., t h e problem with a magnification or reduction coefficient at each arc. Some versions of such a problem have been discussed in [FD56][E60][Q87]. It is expected t h a t t h e ( N e t P L P ) m e t h o d m a y be generalized to such a problem without big difficulty since t h e only difference is t h a t t h e tree structure is replaced by a one-tree structure. See [KH80] and [Q87]. It will be much difficult if we allow t h e magnification or reduction coefficients are also convex piecewise linear functions. This occurs in the real world. For example, t h e reduction effect increases in an electricity transmission line as t h e flow increases. Other extensions of S T P includes t h e multi-commodity case and t h e stochastic programming problem with network flow recourse, which was first considered by Wallace for fisheries m a n a g e m e n t . See [W86][W89] and [WW89].

298

J. Sun, K.-H. Tsai, and L. Qi

m Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

Type

m

n

Size of Support

CPU Time (in Sec.)

Transportation Transportation Transportation Transportation Transportation Transportation Transportation Transportation Transportation Transportation Assignment Assignment Assignment Assignment Assignment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment Transshipment

200 200 200 200 200 300 300 300 300 300 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 1000 1000 1000 1000 1500 1500 1500 1500 8000 5000 3000 5000 3000

1311 1500 2007 2205 2900 3150 4500 5170 6095 6311 1500 2250 3000 3750 4500 1306 2443 1306 2443 1416 2836 1416 2836 1382 2676 1382 2676 2900 3400 4400 4800 4342 4385 5107 5730 15000 23000 35000 15000 23000

100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

0.36 0.33 0.48 0.44 0.50 0.63 0.93 1.07 1.14 1.21 0.38 0.49 0.56 0.52 0.69 0.73 1.24 0.77 1.24 0.80 1.44 0.82 1.49 0.81 1.50 0.86 1.55 2.38 2.53 3.35 3.39 4.25 4.14 4.49 4.81 50.48 28.01 24.82 24.91 18.82

Table 6.1. Solution Times of 40 Benchmark STP (on S U N 4 / 3 3 0 )

References [ACK86] I. Ali, W.D. Cook and M. Kress, Ordinal ranking and intensity of preference:

Algorithm

for Network

Piecewise

Linear

Programming

a linear p r o g r a m m i n g approach, Management [C83] V. Chvatal, Linear Programming,

299

Science,

32(1986)1642-1647.

(Freeman, New York, NY, 1983).

[CSA86] A. Charnes, T. Song and I. Ali, A two-segment approximation algorithm for separable convex programming with linear constraints, Mathematische Operationsforschung und Statistik-Series Optimization, 17(1986)147-159. [Co78] L. Cooper, T h e stochastic transportation-location problem, Computer Mathematics with Applications 4(1978)265-275.

and

[CL77] L. Cooper and L.J. LeBlanc, Stochastic transportation problems and other network related convex problems, Naval Research Logistics Quarterly 24(1977) 324-336. [E60] S. Elmaghrabi, Allocation under uncertainty when t h e d e m a n d has a continuous distribution function, Management Science 6(1960)270-294. [F86] R. Fourer, A simplex algorithm for piecewise-linear programming III: Computational Analysis and Applications, Tech. Report 86-03 D e p a r t m e n t of I E / M S , Northwestern University, Evanston IL 60208 (1986). [F88] R. Fourer, A simplex algorithm for piecewise-linear programming: finiteness, feasibility and degeneracy, Mathematical Programming, 41(1988)281-316. [FD56] A. R. Ferguson and G. B. Dantzig, T h e allocation of aircraft to routes—an example of linear programming under uncertain d e m a n d , Management Science 3(1956)45-73. [FF62] L.R. Ford and D.R. Fulkerson, Flows and Networks, Princeton, N J , 1962).

(Princeton Univ. Press,

[G77] A.M. Geoffrion, Objective function approximation in m a t h e m a t i c a l programming, Mathematical Programming 13(1977)23-37. [Gr86] M.D. Grigoriadis, An efficient implementation of t h e network simplex m e t h o d , Mathematical Programming Study 26(1986)83-111. [KH80] J.L. Kennington and R.V. Helgason, Algorithms Wiley-Interscience, New York (1980).

for Network

Programming,

[KNS74] D. Klingman, A. Napier and J. Stutz, N E T G E N : a program for generating large scale capacitated assignment, transportation, and m i n i m u m cost flow network problems, Management Science 20(1974)814-821. [L80] F . Louveaux, A solution m e t h o d for multi-stage stochastic programs with recourse with applications to an energy investment problem, Operations Research 28(1980)889-902.

300

J. Sun, K.-H. Tsai, and L. Qi

[MV88a] J. M. Mulvey and H. Vladimirou, Solving multistage stochastic network flows: An application of scenario aggregation, Tech. Report SOR-88-1, Dept. of Civil Engineering and Operations Research, Princeton University, Princeton , N J (1988). [MV88b] J. M. Mulvey and H. Vladimirou, Stochastic network optimization models for investment planning, Tech. Report SOR-88-2, Dept. of Civil Engineering and Operations Research, Princeton University, Princeton , N J (1988). [P86] W . B . Powell, A stochastic model of t h e dynamic vehicle allocation problem, Transportation Science 20 (1986) 117-129. [Q84a] L. Qi, Finitely convergent methods for solving stochastic linear programming and stochastic network flow problems, Ph. D. Dissertation, University of Wisconsin, Madison, W I (1984). [Q84b] L. Qi, T h e dual forest iteration m e t h o d for t h e stochastic transportation problem, Working Paper WP-84-59, IIASA, Laxenburg, Austria (1984). [Q85] L. Qi, Forest iteration m e t h o d for stochastic transportation problem, matical Programming Study 25(1985)142-163.

Mathe-

[Q87] L. Qi, T h e A-forest iteration m e t h o d for t h e stochastic generalized transportation problem, Mathematics of Operations Research 12(1987)1-15. [R84] R . T . Rockafellar, Network Flows Interscience, New York, 1984).

and

Monotropic

Optimization,

(Wiley-

[S86] J. Sun, On monotropic piecewise quadratic programming, P h . D Dissertation, University of Washington, Seattle, Washington (1986). [Sz64] W . Szwarc, T h e transportation problem with stochastic d e m a n d s , ment Science 11(1964)33-50.

Manage-

[W86] S. Wallace, Solving stochastic programs with network recourse, 16(1986)295-317.

Networks

[W89] S. Wallace, Bounding t h e expected time-cost curve for a stochastic P E R T network from below, Operations Research Letters 8(1989)89-94. [We83] R. J - B . Wets, Solving stochastic programs with simple recourse, 10(1983)219-242. [Wi63] A.C. Williams, A stochastic transportation problem, Operations 1(1963)759-770.

Stochastics

Research

[WW89] S. Wallace a n d R. J-B Wets, Preprocessing in stochastic programming: T h e case of uncapacitated networks, ORSA Journal on Computing 1(1989)252-270.

301 Network Optimization Problems, pp. 301-331 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

A Bibliography on Network Flow Problems 1 Marinus Veldhorst Department of Computer Science, Utrecht University, 3508 TB Utrecht, The Netherlands

P.O. Box

80.089,

Abstract

Network flow problems form an important class of research problems in optimization, with many new developments in the last decade. There are many different subclasses of network flow problems, as well as many different techniques to address these problems. In this bibliography we concentrate on combinatorial algorithms for the maximum flow problem and for the minimum cost flow problem with integral capacities and linear cost functions with integral coefficients. Especially results published since 1982 are compiled.

1

Introduction

Network flow problems form an i m p o r t a n t class of optimization problems on graphs. Basically, we are given a network (directed graph) G = (V, A) and one or more commodities. Some a m o u n t of each commodity m u s t b e pushed through t h e network from a n u m b e r of sources (vertices in G which have a supply of t h e commodity) to a n u m b e r of sinks (vertices with a d e m a n d of the commodity). At every vertex except t h e sources and t h e sinks, t h e incoming flow of every commodity is pushed on in certain portions over t h e outgoing arcs. T h e flow of t h e commodities may incur some costs, and t h e amount of flow through the arcs of t h e network may be restricted by certain capacity constraints and m a y be subject to losses or gains. Network flow problems usually ask for criteria and efficient algorithms for determining flows of the commodities t h a t satisfy certain conditions. 'This work was partially supported by the ESPRIT II Basic Research Actions of the EC under contract no. 7141 (project ALCOM II).

302

M.

Veldhorst

I m p o r t a n t subclasses of network flow problems are t h e m a x i m u m flow problems, t h e m i n i m u m cost flow problems, t h e multicommodity flow problems, and t h e flows "with losses and gains" (the generalized flow problem). Even within a subclass, problems may vary in such a way t h a t different algorithms are necessary to find t h e solution for different sorts of problems. For example, t h e capacity constraints m a y be nonnegative integral or real n u m b e r s , and can be upper or lower bounds on t h e amount of flow through t h e arcs; t h e cost functions of t h e arcs may be integral or real valued functions and can be linear or nonlinear in t h e amount of flow through t h e arc. For special networks (e.g., planar networks) specific algorithms have been designed in order to obtain solutions more efficiently. Hence, in a bibliography on network flow problems t h a t is not too extensive, one must m a k e a selection from an overwhelming number of publications in this research area. In this bibliography we compiled results which, in our opinion, are interesting from t h e viewpoint of t h e design and analysis of algorithms, especially combinatorial algorithms. We concentrated on t h e m a x i m u m flow problems and on t h e m i n i m u m cost flow problems with integral capacities and linear cost functions with integral coefficients. These are, in a way, the most classical network flow problems and were already t h e subject of most of Ford and Fulkerson's book in 1962. We intend to be complete in these two areas as far as results published after 1982 are concerned. Results t h a t are not published in regular journals or proceedings of conferences, are only included when we consider t h e m i m p o r t a n t from an historic point of view or when they constitute t h e current s t a t e of t h e art. For t h e more general varieties and subclasses of the network flow problem t h e bibliography is certainly not complete, b u t nevertheless we hope to give a rather broad entrance to t h e scientific literature on these problems. In t h e remainder of this introduction we give an overview of t h e variants of the m a x i m u m flow problems and m i n i m u m cost flow problems, and mention a number of publications with good introductions, overviews or i m p o r t a n t contributions to the development of t h e field of network flow algorithms. After section 1 t h e bibliography is given 2 .

1.1

Maximum Flow Problems

In the (basic) m a x i m u m flow problem there is one commodity and a m a x i m u m amount of flow of t h e commodity has to be sent from one source s to one sink t; t h e flow in each arc (i,j) £ A is not allowed to be more t h a n a given positive integral value (upper bound) U J J . To be precise, in an instance of t h e m a x i m u m flow problem we are given a directed graph G = (V, A), and two specified vertices s and t which are assumed to have no incoming and outgoing arcs, respectively. W i t h each arc 2

Earlier versions of this bibliography have been published in Algorithms Review, vol. 1 (1990), pp. 97-117, and as Technical Report RUU-CS-91-38, Dept. of Computer Science, University of Utrecht, Utrecht, The Netherlands.

A Bibliography

on Network

Flow

303

Problems

(i, j) € A is associated a positive integral n u m b e r u,-j, its capacity. A (valid) flow for this instance consists of a real number /,-j for each {i,j) G A such t h a t t h e following two conditions are satisfied: 0 < fij H

fi,i=

< Ui,j 12

for all (i, j) € A hi

ior ea,chi^

s,i^=t

T h e flow has value £i:(»,«)e.4 /».'• A m a x i m u m flow is a valid flow with m a x i m u m value among all possible valid flows. T h e m a x i m u m flow problem is t h e problem to design an efficient algorithm t h a t computes a m a x i m u m flow for every given instance. Pioneering work on t h e m a x i m u m flow problem has been done by Ford and Fulkerson (e.g. [94]). F u n d a m e n t a l improvements have been given by E d m o n d s and Karp ([76]), Dinic ([72]), Karzanov ([208]), Malhotra et al. ([243]) and Goldberg ([120]). These fundamental improvements often triggered further improvement: faster algor i t h m s could be obtained by incorporation of more sophisticated d a t a structures in t h e fundamental algorithms (e.g., [ I l l ] , [333], [130], [125]), or by making proper choices t h a t were left open in t h e fundamental algorithms (e.g., [344], [130], [52]). Good introductions or overviews of t h e algorithmic area of m a x i m u m network flow can be found in [5], [61], [94], [177], [194], [231], [287], [344] and [357]. Included in [5] is an excellent historic overview. For subclasses of m a x i m u m flow problems specially designed algorithms may run faster t h a n t h e general algorithms. For example there are m a x i m u m flow algorithms for planar graphs (e.g., [189], [170], [98]), and for bipartite graphs (e.g., [10], [159]). On t h e other h a n d one can look for efficient algorithms for t h e special case t h a t t h e capacities Uij are relatively small numbers (e.g., [86], [88]). In case t h e capacities are bounded by a polynomial in t h e number of vertices of G, a scaling technique might be useful ([107], [7], [11]). Algorithms for sequential computers are not necessarily efficient for computers of a different architecture. Hence, several researchers have looked for efficient parallel algorithms (e.g., [328], [130]), and efficient distributed algorithms (e.g., [20], [244], [130], [52]). O t h e r researchers have considered t h e m a x i m u m flow of r a n d o m instances of t h e m a x i m u m flow problem (e.g., [150], [335], [268]). T h e m a x i m u m flow problem can naturally be extended to t h e so-called multiterminal flow problem. In this problem one wants to c o m p u t e t h e m a x i m u m flow values for k source-destination pairs (•Si,< 1 ),.. . , ( s j t , i t ) , simultaneously. Obviously this problem can be solved by separately solving k m a x i m u m flow problems (one for each pair (s,,z;)), b u t more efficient algorithms have been designed for t h e case t h a t G is an undirected graph (e.g., [140], [144], [156]), and for t h e case G is planar ([252]). Another extension of t h e m a x i m u m flow problem is t h e p a r a m e t r i c flow problem in which t h e capacities depend on one additional p a r a m e t e r , and one wants to compute some information about how t h e m a x i m u m flow depends on this p a r a m e t e r (e.g., [8],

304

M.

Veldhorst

[115], [158]). In a third extension there are upper bounds set on t h e amount of flow streaming through a vertex. Usually problems of this type can be transformed to ordinary m a x i m u m flow problems, b u t by these transformations one m a y loose several desirable properties of t h e networks, e.g. planarity ([215]). Other variants of t h e m a x i m u m flow problem can be found in e.g. [262], [272], [279]. As for implementations of t h e algorithms and their efficiency on existing computers, we refer to e.g. [225], [71], [203], [16] and [14].

1.2

M i n i m u m Cost Flow P r o b l e m s

In t h e m i n i m u m cost flow problem a nonnegative cost function c;j is associated with each arc (i,j) € A. Instead of maximizing t h e flow value, we want to c o m p u t e t h e flow of a given value v with m i n i m u m cost. T h e cost of a flow / is defined as ^2(i,j)eAci,j(fi,j)Arcs may have capacity constraints t h a t consist of nonnegative lower bounds and positive (possibly infinite) upper bounds. T h e m i n i m u m cost flow problem is easily generalized to t h e m i n i m u m cost circulation problem. Here we have a directed graph G = (V, A). W i t h each arc (i,j) £ A is associated a cost function Cij, and a (upper bounding) capacity u , j (a positive number, possibly infinite). W i t h each vertex i £ V is associated a n u m b e r &,-; if 6; < 0, vertex i has a d e m a n d of flow; if 6; > 0, i has a surplus of flow. A (valid) circulation consists of a nonnegative real n u m b e r fcj for each (i,j) £ A such t h a t t h e following two conditions hold: 0 < fi,j < « i j

S

j--(i,j)eA

/'J

_

for all (i,j) b

Yl hi = i

j:U,')eA

£ A

for a11

*e

V

and has cost J2(i,j)eAci,j{fi,j)T h e problem is to find a valid circulation of m i n i m u m cost. Usually t h e cost functions are convex or linear. For t h e case of linear cost functions with integral coefficients we refer to [5]. It concentrates on combinatorial algorithms, but contains also a historic overview of the different fundamental approaches (e.g., network simplex, primal-dual, out-of-kilter, scaling, relaxation) to t h e solution of t h e m i n i m u m cost circulation problem. For the case of convex cost functions we refer to [212], [301] and [35]. O t h e r variants of t h e m i n i m u m cost flow problems are mentioned in e.g. [166], [214] and [210]. M i n i m u m cost flow in planar graphs is treated in [183]. As for implementations of t h e algorithms and their efficiency on existing computers, we refer to e.g. [225], [37], [203], [126] and [122].

References [1] G. K. Adel'son-Velskii, E. A. Dinic, and A. V. Karzanov. Science, Moscow, 1975. in Russian.

Flow

algorithms.

A Bibliography

on Network

Flow

Problems

305

[2] R. K. Ahuja. Algorithm for t h e m i n i m a x transportation problem. Naval Log. Quart., 33:725-739, 1986.

Res.

[3] R. K. Ahuja, J. L. B a t r a , and S. K. G u p t a . A p a r a m e t r i c algorithm for t h e convex cost network flow and related problems. Europ. J. Oper. Res., 16:222235, 1984. [4] R. K. Ahuja, A. V. Goldberg, J. B . Orlin, and R. E. Tarjan. Finding minimumcost flows by double scaling. Math. Programming, 53:243-266, 1992. [5] R. K. Ahuja, T. L. Magnanti, and J. B . Orlin. Network flows. In G. L. Nemhauser, A. H. G. Rinnooy Kan, and M. J. Todd, editors, Handbooks of Operations Research and Management Science, vol. 1: Optimization, pages 211-369. North-Holland Publ. Comp., A m s t e r d a m , 1989. [6] R. K. Ahuja and J. B. Orlin. Improved primal simplex algorithms for t h e shortest p a t h , assignment and m i n i m u m cost flow problems. Technical Report 2090-88, Sloan School of Management, M I T , Cambridge, Mass., 1988. [7] R. K. Ahuja and J. B . Orlin. A fast and simple algorithm for t h e m a x i m u m flow problem. Operations Res., 37:748-759, 1989. [8] R. K. Ahuja and J. B. Orlin. Distance directed augmenting p a t h algorithms for m a x i m u m flow and parametric m a x i m u m flow problems. Naval Res. Log. Quart., 38:413-430, 1991. [9] R. K. Ahuja and J. B. Orlin. T h e scaling network simplex algorithm. Res., 40:Supplement S5-S13, 1992.

Operations

[10] R. K. Ahuja, J. B . Orlin, C. Stein, and R. E. Tarjan. Improved algorithms for bipartite network flow problems. Technical Report TR-338-91, D e p a r t m e n t of C o m p u t e r Science, Princeton University, Princeton, NJ., 1991. [11] R. K. Ahuja, J. B . Orlin, and R. E. Tarjan. Improved t i m e bounds for t h e m a x i m u m flow problem. SIAM J. Comput., 18:939-954, 1989. [12] A. I. Ali, R. P a d m a n , and H. Thiagaran. Dual algorithms for pure network problems. Operations Res., 37:159-171, 1989. [13] I. Ali, D. B a r n e t t , K. Farhangian, J. Kennington, B . Patty, B. Shetty, B. McCarl, and P. Wong. Multicommodity network problems: Applications and computations. A.I.I.E. Trans., 16:127-134, 1984. [14] F . Alizadeh and A. V. Goldberg. Implementing t h e push-relabel m e t h o d for t h e m a x i m u m flow problem on a connection machine. Technical Report STANCS-92-1410, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, Feb. 1992.

306

M.

Veldhorst

[15] N. Alon. Generating pseudo-random permutations and m a x i m u m flow algor i t h m s . Inf. Process. Lett, 35:201-204, 1990. [16] R. J. Anderson and J. C. Setubal. On t h e parallel implementation of Goldberg's m a x i m u m flow algorithm. In Proc. 4th Annual ACM Symp. on Parallel Algorithms and Architectures, pages 168-177, 1992. [17] E. M. Arlin and C. H. Papadimitriou. On t h e complexity of circulations. Algorithms, 7:134-145, 1986.

J.

[18] J. Aronson and B . Chen. A primary/secondary m e m o r y implementation of a forward network simplex algorithm for multiperiod network flow problems. Comput. Oper. Res., 16:379-391, 1989. [19] A. Assad. Multicommodity network flows - a survey. Networks,

8:37-91, 1978.

[20] B . Awerbuch. Reducing complexities of t h e distributed max-flow and breadthfirst-search algorithms by means of network synchronization. Networks, 15:425437, 1985. [21] F . B a r a h o n a and E. Tardos. Note on W e i n t r a u b ' s minimum-cost circulation algorithm. SI AM J. Comput., 18:579-583, 1989. [22] A. E. Baratz. T h e complexity of m a x i m u m network flow. Technical Report M I T / L C S / T R - 2 3 0 , Lab. for C o m p u t e r Science, M I T , Cambridge, Mass., 1980. [23] M. Bazaraa and J. J. Jarvis. Linear Programming ed.). J o h n Wiley L Sons, New York, 1990.

and Network

Flows

(2nd

[24] M. Bellmore and R. R. Vemuganti. On multicommodity m a x i m a l dynamic flows. Operations Res., 21:10-21, 1973. [25] G. E. Bennington. An efficient minimal cost flow algorithm. 19:1042-1051, 1973. [26] C. Berge. Graphs and Hypergraphs, A m s t e r d a m , 1973.

Manag.

Sci.,

chapter 5. North-Holland Publ. Comp.,

[27] C. Berge and A. Ghouila-Houri. Programming, works. John Wiley & Sons, New York, 1962.

Games and Transportation

Net-

[28] D. P. Bertsekas. A unified framework for primal-dual methods in m i n i m u m cost network flow problems. Math. Programming, 32:125-145, 1985. [29] D. P. Bertsekas. Distributed asynchronous relaxation m e t h o d s for linear network flow problems. Technical Report LIDS-P-1986, Lab. for Decision Systems, M I T , Cambridge, Mass., 1986.

A Bibliography

on Network

Flow

307

Problems

[30] D. P. Bertsekas and J. Eckstein. Dual coordinate step methods for linear network flow problems. Math. Programming, 42:203-243, 1988. [31] D. P. Bertsekas and D. El Baz. Distributed asynchronous relaxation methods for convex network flow problems. SIAM J. Contr. & Optim., 25:74-85, 1987. [32] D. P. Bertsekas, P. A. Hosein, and P. Tseng. Relaxation m e t h o d s for network flow problems with convex arc costs. SIAM J. Contr. & Optim., 25:1219-1243, 1987. [33] D. P. Bertsekas and P. Tseng. T h e relax codes for linear m i n i m u m cost network flow problems. In B . Simeone et al., editors, FORTRAN Codes for Network Optimization, Annals of Operations Research, vol. 13, pages 125-190, 1988. [34] D. P. Bertsekas and P. Tseng. Relaxation m e t h o d s for m i n i m u m cost ordinary and generalized network flow problems. Operations Res., 36:93-114, 1988. [35] D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Computation, chapter 5, 6.5 and 6.6. Prentice-Hall, Inc., Englewood Cliffs, N J , 1989. [36] D. Bienstock. Some generalized max-flow min-cut problems in t h e plane. Oper. Res., 16:310-333, 1991.

Math.

[37] R. G. Bland and D. L. Jensen. On t h e computational behavior of a polynomialt i m e network flow algorithm. Math. Programming, 54:1-40, 1992. [38] G. Bradley, G. Brown, and G. Graves. Design and implementation of large scale primal transshipment algorithms. Manag. Sci., 24:1-38, 1977. [39] S. P. Bradley, A. C. Hax, and T. L. Magnanti. Applied Mathematical ming. Addison-Wesley Publ. Comp., New York, 1977.

Program-

[40] R. G. Busacker and P. J. Gowen. A procedure for determining a family of minimal-cost network flow p a t t e r n s . O.R.O. Technical paper 15, Johns Hopkins University, Baltimore, M D , 1961. [41] R. G. Busacker and T. L. Saaty. Finite Graphs and Networks: with Applications. McGraw-Hill, New York, 1965.

An

Introduction

[42] I. N. Chen. A new parallel algorithm for network flow problems. In T. Y. Feng, editor, Proc. 1974 Sagamore Computer Conf., Lecture Notes in C o m p u t e r Science, vol. 24, pages 306-307, Springer-Verlag, Berlin, 1975. [43] I. N. Chen, P. Y. Chen, and T. Y. Feng. Associative processing of network flow problems. IEEE Trans. Comput., C-28:184-190, 1979.

308

M.

Veldhorst

[44] I. N. Chen and T. Y. Feng. A parallel algorithm for m a x i m u m flow problem. In Proc. 1973 Sagamore Computer Conf., 1973. [45] Y. L. Chen and Y. H. Chin. Multicommodity network flows with safety considerations. Operations Res., 40:Supplement S48-S55, 1992. [46] C. K. Cheng. Ancestor tree for arbitrary multi-terminal-cut functions. of Operations Res., 33:199-213, 1991.

Annals

[47] C. K. Cheng and T. C. Hu. M a x i m u m concurrent flow and m i n i m u m cuts. Algorithmica, 8:233-249, 1992. [48] J. Cheriyan. Parametrized worst case networks for preflow push algorithms. Technical report, C o m p u t e r Science Group, T a t a Institute of F u n d a m e n t a l Research, Bombay, India, 1988. [49] J. Cheriyan and T. Hagerup. A randomized maximum-flow algorithm. In Proc. 30th Annual IEEE Symp. Foundations of Computer Science, pages 118-123, 1989. [50] J. Cheriyan, T. in o(nm) time? Languages and pages 235-248,

Hagerup, and K. Mehlhorn. Can a m a x i m u m flow be computed In M. Paterson, editor, Proc. 17th Intern. Coll. on Automata, Programming, Lecture Notes in C o m p u t e r Science, vol. 443, Springer-Verlag, Berlin, 1990.

[51] J. Cheriyan, T . Hagerup, and K. Mehlhorn. An o(n 3 )-time maximum-flow algorithm. Technical Report MPI-I-91-120, Max-Planck-institut fur Informatik, Saarbrucken, Germany, Nov. 1991. [52] J. Cheriyan and S. N. Maheshwari. Analysis of preflow push algorithms for m a x i m u m network flow. SI AM J. Comput., 18:1057-1086, 1989. [53] J. Cheriyan and S. N. Maheshwari. T h e parallel complexity of finding a blocking flow in a 3-layer network. Inf. Process. Lett., 31:157-161, 1989. [54] R. V. Cherkasky. Algorithm of construction of m a x i m a l flow in networks with complexity of 0(V2\/E) operations. Math. Methods of Solution of Economical Problems, 7:112-125, 1977. (In Russian). [55] T. Cheung. Computational comparison of eight m e t h o d s for the m a x i m u m network flow problem. ACM Trans. Math. Softw., 6:1-16, 1980. [56] T . Cheung. Graph traversal techniques and t h e m a x i m u m flow problem in distributed computation. IEEE Trans. Softw. Eng., SE-9:504-512, 1983. [57] N. Christofides. Graph Theory: demic Press, New York, 1975.

An Algorithmic

Approach,

chapter 11. Aca-

A Bibliography

on Network

Flow

309

Problems

[58] E. Cohen. Approximate m a x flow on small d e p t h networks. In Proc. 33rd Annual IEEE Symp. Foundations of Computer Science, pages 648-658, 1992. [59] E. Cohen and N. Megiddo. Algorithms and complexity analysis for some flow problems. In Proc. 2nd Annual ACM-SIAM Symp. Discrete Algorithms, pages 120-130, 1991. [60] E. Cohen and N. Megiddo. New algorithms for generalized network flows. In D. Dolev, Z. Galil, and M. Rodeh, editors, Theory of Computing and Systems, Proc. ISTCS '92, Haifa, Israel 1992, Lecture Notes in C o m p u t e r Science, vol. 601, pages 103-114, Springer-Verlag, Berlin, 1992. [61] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction chapter 28. M I T Press, Cambridge, Mass., 1990.

to

Algorithms,

[62] W. Cui. A network simplex m e t h o d for the m a x i m u m balanced flow problem. J. Oper. Res. Soc. Japan, 31:551-563, 1988. [63] W . Cui and S. Fujishige. A primal algorithm for the submodular flow problem with m i n i m u m - m e a n cycle selection. J. Oper. Res. Soc. Japan, 31:431-441, 1988. [64] W . H. C u n n i n g h a m . A network simplex m e t h o d . Math. Programming, 116, 1976.

11:105-

[65] W. H. Cunningham. Theoretical properties of the network simplex m e t h o d . Math. Oper. Res., 4:196-208, 1979. [66] W . H. C u n n i n g h a m and A. Frank. A primal-dual algorithm for submodular flows. Math. Oper. Res., 10:251-262, 1985. [67] G. B . Dantzig. Application of t h e simplex m e t h o d to a transportation problem. In T. C. Koopmans, editor, Activity Analysis of Production and Allocation, pages 359-373, John Wiley & Sons, New York, 1951. [68] G. B . Dantzig. Linear Princeton, N J , 1962.

Programming

and Extensions.

Princeton Univ. Press,

[69] G. B. Dantzig and D. R. Fulkerson. On t h e max-flow min-cut theorem of networks. In H. W . K u h n and A. W . Tucker, editors, Linear Inequalities and Related Systems, Annals of Mathematics Study, vol. 38, pages 215-221, Princeton Univ. Press, Princeton, N J , 1956. [70] U. Derigs. Programming in Networks and Graphs. Lecture Notes in Economics and M a t h e m a t i c a l Systems, vol. 300. Springer-Verlag, Berlin, 1988.

310

M.

Veldhorst

[71] U. Derigs and W . Meier. Implementing Goldberg's max-flow algorithm, a comp u t a t i o n a l investigation. Z. Oper. Res., 33:383-403, 1989. [72] E. A. Dinic. Algorithm for solution of a problem of m a x i m u m flow in networks with power estimation. Soviet Math. Dokl, 11:1277-1280, 1970. [73] J. Divoky and M. Hung. Performance of shortest p a t h algorithms in network flow problems. Manag. Sci., 36:661-673, 1990. [74] J. R. Driscoll, H. N. Gabow, R. Shrairman, and R. E. Tarjan. Relaxed heaps: An alternative to Fibonacci heaps with applications to parallel computations. Commun. ACM, 31:1343-1354, 1988. [75] J. E d m o n d s and R. Giles. A m i n - m a x relation for submodular functions on graphs. Annals of Discrete Math., 1:185-204, 1977. [76] J. E d m o n d s and R. M. Karp. Theoretical improvements in algorithmic efficiency for network flow problems. J. ACM, 19:248-264, 1972. [77] J. E l a m , F . Glover, and D. Klingman. A strongly convergent primal simplex algorithm for generalized networks. Math. Oper. Res., 4:39-59, 1979. [78] P. Elias, A. Feinstein, and C. E. Shannon. Note on m a x i m u m flow through a network. IRE Trans, on Inform. Theory, 2:117-119, 1956. [79] S. E. Elmaghraby. Sensitivity analysis of multi-terminal network flows. ORSA, 12:680-688, 1964.

J.

[80] T. R. Ervolina and S. T. McCormick. A strongly polynomial dual cancel and tighten algorithm for m i n i m u m cost network flow. Technical Report 90-MSC010, U B C Faculty of Commerce, 1990. [81] T . R. Ervolina and S. T. McCormick. A strongly polynomial m a x i m u m mean cut cancelling algorithm for m i n i m u m cost network flow. Technical Report 90-MSC-009, U B C Faculty of Commerce, 1990. [82] J. R. Evans. M a x i m u m flow in probabilistic graphs - t h e discrete case. 6:161-183, 1976.

Networks,

[83] S. Even. T h e max-flow algorithm of Dinic and Karzanov. An exposition. Technical Report M I T / L C S / T M - 8 0 , Lab. for C o m p u t e r Science, M I T , Cambridge, Mass., 1976. [84] S. Even. Graph Algorithms, 1979.

chapter 4, 5, and 10.8. P i t m a n Publ. Ltd, London,

A Bibliography

on Network

Flow

Problems

311

[85] S. Even, A. Itai, and A. Shamir. On t h e complexity of timetable and multicommodity flow problems. SIAM J. Comput., 5:691-703, 1976. [86] S. Even and R. E. Tarjan. Network flow and testing graph connectivity. J. Comput, 4:507-518, 1975. [87] T . E. Feather. The parallel complextity of some flow and matching P h D thesis, University of Toronto, Toronto, Canada, 1984.

SIAM

problems.

[88] D. Fernandez-Baca and C. U. Martel. On the efficiency of maximum-flow algorithms on networks with small integer capacities. Algorithmica, 4:173-189, 1989. [89] L. R. Ford and D. R. Fulkerson. Maximal flow through a network. Canad. Math., 8:399-404, 1956.

J.

[90] L. R. Ford and D. R. Fulkerson. A simple algorithm for finding m a x i m a l network flows and an application to t h e Hitchcock problem. Canad. J. Math., 9:210-218, 1957. [91] L. R. Ford and D. R. Fulkerson. Constructing m a x i m a l dynamic flows from static flows. Operations Res., 6:419-433, 1958. [92] L. R. Ford and D. R. Fulkerson. A suggested computation for m a x i m a l multic o m m o d i t y network flow. Manag. Sci., 5:97-101, 1958. [93] L. R. Ford and D. R. Fulkerson. A network flow feasibility theorem and combinatorial applications. Canad. J. Math., 11:440-450, 1959. [94] L. R. Ford and D. R. Fulkerson. Flows in Networks. Princeton, N J , 1962.

Princeton Univ. Press,

[95] A. Frank. Augmenting graphs to meet edge-connectivity requirements. J. Discr. Math., 5:25-53, 1992.

SIAM

[96] A. Frank and E. Tardos. An application of simultaneous Diophantine approximation in combinatorial optimization. Combinatorica, 7:49-65, 1987. Preliminary version: An application of simultaneous approximations in combinatorial optimization, Proc. 26th Annual I E E E Symp. Foundations of C o m p u t e r Science, 459-463, 1985. [97] H. Frank and I. T. Frisch. Communication, Transmission, networks. Addison-Wesley Publ. Comp., New York, 1971.

and

Transportation

[98] G. N. Frederickson. Fast algorithms for shortest paths in planar graphs, with applications. SIAM J. Comput., 16:1004-1022, 1987.

312

M.

Veldhorst

T . Fujisawa. Maximal flow in a lossy network. In Proc. Allerton Circuit and System Theory, pages 385-393, 1963.

Conf. on

S. Fujishige. Algorithms for solving t h e independent-flow problem. J. Res. Soc. Japan, 21:189-204, 1978.

Oper.

S. Fujishige. A capacity-rounding algorithms for t h e minimum-cost circulation problem: a dual framework of t h e Tardos algorithm. Math. Programming, 35:298-308, 1986. S. Fujishige. An out-of-kilter m e t h o d for submodular flows. Discrete Math., 17:3-16, 1987.

Applied

S. Fujishige, A. Nakayama, and W . - T . Cui. On t h e equivalence of the m a x i m u m balanced flow problem a m d t h e weighted m i n i m a x flow problem. Operations Res. Lett, 5:207-209, 1986. S. Fujishige, A. Rock, and U. Z i m m e r m a n n . A strongly polynomial algorithm for m i n i m u m cost submodular flow problems. Math. Oper. Res., 14:60-69, 1989. D. R. Fulkerson. An out-of-kilter m e t h o d for minimal cost flow problem. J. Appl. Math., 9:18-27, 1961.

SIAM

D. R. Fulkerson and G. B . Dantzig. C o m p u t a t i o n of m a x i m u m flow in networks. Naval Res. Log. Quart., 2:277-283, 1955. H. N. Gabow. Scaling algorithms for network problems. J. Comput. 31:148-168, 1985.

Syst.

Sci.,

H. N. Gabow and R. E. Tarjan. Faster scaling algorithms for network problems. SIAM J. Comput., 18:1013-1036, 1989. D. Gale. A t h e o r e m on flows in networks. Pacific J. Math., 7:1073-1082, 1957. D. Gale. Transient flows in networks. Michigan 5 3

J. Math., 6:59-63, 1959.

2 3

Z. Galil. An 0{n l m l ) algorithm for t h e m a x i m a l flow problem. Acta Inf., 14:221-242, 1980. Preliminary version in Proc. 19th Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 231-245, 1978. Z. Galil. On t h e theoretical efficiency of various network flow algorithms. oretical Comput. Sci., 14:103-111, 1981.

The-

Z. Galil and A. N a a m a d . An 0(EV log 2 V) algorithm for the maximal flow problem. J. Comput. Syst. Sci., 21:203-217, 1980. Preliminary version as "Network flow and generalized p a t h compression" in Proc. 11th Annual ACM Symp. Theory of Computing, pages 13-26, 1979.

A Bibliography

on Network

Flow

Problems

313

[114] Z. Galil and E. Tardos. An 0(n2(m + rclogn)logn) min-cost flow algorithm. J. ACM, pages 374-386, 1988. Preliminary version in Proc. 27th Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 1-9, 1986. [115] G. Gallo, M. D. Grigoriadis, and R. E. Tarjan. A fast p a r a m e t r i c m a x i m u m flow algorithm and applications. SIAM J. Comput., 18:30-55, 1989. [116] M. R. Garey and D. S. Johnson. Computers and Intractibility, a Guide to the Theory of NP-Completeness, chapter A2. W . H . Freeman and Co., San Francisco, 1979. [117] F . Glover, D. Karney, and D. Klingman. Implementation and computational comparisons of primal, dual and primal-dual computer codes for m i n i m u m cost network flow problem. Networks, 4:191-212, 1974. [118] F . Glover, D. Karney, D. Klingman, and A. Napier. A c o m p u t a t i o n a l study on start procedures, basis change criteria, and solution algorithms for transportation problem. Manag. Sci., 20:793-813, 1974. [119] A. V. Goldberg. A new max-flow algorithm. Technical Report M I T / L C S / T M 291, Lab. for C o m p u t e r Science, M I T , Cambridge, Mass., 1985. [120] A. V. Goldberg. Efficient graph algorithms for sequential and parallel computers. P h D thesis, Dept. of Electr. Engin. and C o m p u t e r Science, M I T , Cambridge, Mass., 1987. Also available als Technical Report TR-374, Lab. for Computer Science, M I T , Cambridge, Mass., 1987. [121] A. V. Goldberg. Processor-efficient implementation of a m a x i m u m flow problem. Inf. Process. Lett, 38:179-185, 1991. [122] A. V. Goldberg. An efficient implementation of a scaling minimum-cost flow algorithm. Technical Report STAN-CS-92-14139, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, Aug. 1992. [123] A. V. Goldberg. A n a t u r a l randomization strategy for multicommodity flow and related problems. Inf. Process. Lett., 42:249-256, 1992. [124] A. V. Goldberg, M. D. Grigoriadis, and R. E. Tarjan. Efficiency of t h e network simplex algorithm for t h e m a x i m u m flow problem. Technical Report STAN-CS89-1248, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, 1989. To appear in Math. Programming. [125] A. V. Goldberg, M. D. Grigoriadis, and R. E. Tarjan. Use of dynamic trees in a network simplex algorithm for t h e m a x i m u m flow problem. Math. Programming, 50:277-290, 1991.

314

M.

Veldhorst

[126] A. V. Goldberg and M. Kharitonov. On implementing scaling push-relabel algorithms for the minimum-cost flow problem. Technical Report STAN-CS92-1418, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, Mar. 1992. [127] A. V. Goldberg, S. A. Plotkin, and E. Tardos. Combinatorial algorithms for t h e generalized circulation problem. Math. Oper. Res., 16:351-381, 1991. Preliminary version in Proc. 29th Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 432-443, 1988. [128] A. V. Goldberg, S. A. Plotkin, and P. M. Vaidya. Sublinear-time parallel algor i t h m s for matching and related problems. In Proc. 29th Annual IEEE Symp. Foundations of Computer Science, pages 174-185, 1988. [129] A. V. Goldberg, E. Tardos, and R. E. Tarjan. Network flow algorithms. In B . K o r t e , L. Lovasz, H. P r o m e l , and A. Schrijver, editors, Flows, paths, and VLSI-layout, pages 101-164, 1990. Previously published as Tech. Rep. STANCS-89-1252, D e p a r t m e n t of C o m p u t e r Science, Stanford University, March 1989. [130] A. V. Goldberg and R. E. Tarjan. A new approach to the m a x i m u m flow problem. J. ACM, 35:921-940, 1988. Preliminary version in Proc. 18th Annual A C M Symp. Theory of Computing, pages 136-146, 1986. [131] A. V. Goldberg and R. E. Tarjan. Finding minimum-cost circulations by canceling negative cycles. J. ACM, 36:873-886, 1989. Preliminary version in Proc. 20th Annual ACM Symp. Theory of Computing, pages 388-397, 1987. [132] A. V. Goldberg and R. E. Tarjan. A parallel algorithm for finding a blocking flow in an acyclic network. Inf. Process. Lett., 31:265-271, 1989. [133] A. V. Goldberg and R. E. Tarjan. Finding minimum-cost circulations by successive approximation. Math. Oper. Res., 15:430-466, 1990. Preliminary version published as M I T / L C S / T M - 3 3 3 , M I T , 1987, and as Solving minimum-cost flow problems by successive approximation, Proc. 19th Annual ACM Symp. Theory of C o m p u t i n g , pages 7-18. [134] B . Golden and T. L. Magnanti. Deterministic network optimization: A bibliography. Networks, 7:149-183, 1977. [135] D. Goldfarb and M. D. Grigoriadis. A computational comparison of the Dinic and network simplex methods for m a x i m u m flow. In B. Simeone, et al., editor, FORTRAN Codes for Network Optimization, Annals of Operations Research, vol. 13, pages 83-124, 1988.

A Bibliography

on Network

Flow

Problems

315

[136] D. Goldfarb and J. Hao. A primal simplex algorithm t h a t solves t h e m a x i m u m flow problem in at most 0(nm) pivots and Oiji^m) time. Math. Programming, 47:353-363, 1990. [137] D. Goldfarb and J. Hao. On strongly polynomial variants of t h e network simplex algorithm for t h e m a x i m u m flow problem. Operations Res. Letters, 10:383-387, 1991. [138] D. Goldfarb, J. Hao, and S. Kai. Anti-stalling pivot rules for the network simplex algorithm. Networks, 20:79-91, 1990. [139] L. M. Goldschlager, R. A. Shaw, and J. Staples. T h e m a x i m u m flow problem is log space complete for P . Theoretical Comput. Sci., 21:105-111, 1982. [140] R. E. Gomory and T . C. Hu. Multi-terminal network flows. J. SIAM, 9:551-570, 1961. [141] R. E. Gomory and T . C. Hu. An application of generalized linear programming to network flows. J. SIAM, 10:260-283, 1962. [142] R. E. Gomory and T . C. Hu. Synthesis of a communication network. J. 12:348-369, 1964. [143] M. Gondran and M. Minoux. Graphs and Algorithms, Interscience, New York, 1984.

SIAM,

chapter 5 and 6. Wiley-

[144] F . Granot and R. Hassin. Multi-terminal m a x i m u m flows in node capacitated networks. Discrete Applied Math., 13:157-163, 1986. [145] F . Granot and M. Penn. On t h e integral plane two-commodity flow problem. Operations Res. Lett, 11:135-139, 1992. [146] F . Granot and A. F . Veinott J r . Substitutes, complements and ripples in network flows. Math. Oper. Res., 10:471-497, 1985. [147] M. D. Grigoriadis. An efficient implementation of t h e network simplex m e t h o d . Math. Prog. Study, 26:83-111, 1986. [148] M. D. Grigoriadis and W . W . W h i t e . A partitioning algorithm for t h e multicommodity network flow problem. Math. Programming, 3:157-177, 1972. [149] G. R. G r i m m e t t and W . - C . S. Suen. T h e m a x i m a l flow through a directed graph with r a n d o m capacities. Stochastics, 8:153-159, 1982. [150] G. R. G r i m m e t t and D. J. A. Welsh. Flow in networks with r a n d o m capacities. Stochastics, 7:205-229, 1982.

M.

316

Veldhorst

151] R. C. Grinold. Calculating m a x i m a l flows in a network with positive gains. Operations Res., 21:528-541, 1973. 152] M. Grotschel, L. Lovasz, and A. Schrijver. Geometric natorial Optimization. Springer-Verlag, Berlin, 1988.

Algorithms

and

Combi-

153] G. Guisewite and P. M. Pardalos. M i n i m u m concave cost network flow problems: applications, complexity, and algorithms. Annals of Operations Research, 25:125-190, 1990. 154] R. P. G u p t a . O n flows in pseudosymmetric networks. J. SIAM, 1966.

14:215-225,

[155] D. Gusfield. Simple constructions for multi-terminal network flow synthesis. SIAM J. Comput, 12:157-165, 1983. [156] D. Gusfield. Very simple methods for all pairs network flow analysis. SIAM Comput., 19:143-155, 1990. [157] D. Gusfield. Computing t h e strength of a graph. SIAM J. Comput., 1991.

J.

20:639-654,

[158] D. Gusfield and C. Martel. A fast algorithm for the generalized parametric m i n i m u m cut problem and applications. Algorithmica, 7:499-519, 1992. [159] D. Gusfield, C. Martel, and D. Fernandez-Baca. Fast algorithms for bipartite network flow. SIAM J. Comput., 16:237-251, 1987. [160] D. Gusfield and D. Naor. Efficient algorithms for generalized cut trees. In Proc. 1st Annual ACM-SIAM Symp. Discrete Algorithms, pages 422-433, 1990. [161] H. H a m a c h a r . Numerical investigations on t h e maximal flow algorithm of Karzanov. Computing, 22:17-29, 1979. [162] H. Hamachar and L. R. Foulds. Algorithms for flows with p a r a m e t r i c capacities. Z. Oper. Res., 33:21-37, 1989. [163] J. Hao and J. B. Orlin. A faster algorithm for finding t h e m i n i m u m cut in a graph. In Proc. 3rd Annual ACM-SIAM Symp. Discrete Algorithms, pages 165-174, 1992. [164] J. K. H a r t m a n and L. S. Lasdon. A generalized upper-bounding algorithm for multicommodity network flow problems. Networks, 1:333-354, 1971. [165] R. Hassin. M a x i m u m flow in (s,t) 107, 1981.

planar networks. Inf. Process. Lett., 13:107-

A Bibliography

on Network

Flow

Problems

317

[166] R. Hassin. M i n i m u m cost flow in set-constraints. Networks,

12:1-21, 1982.

[167] R. Hassin. T h e m i n i m u m cost flow problem: a unifying approach to dual algorithms and a new tree search algorithm. Math. Programming, 25:228-239, 1983. [168] R. Hassin. On multicommodity flow in planar graphs. Networks, 1985.

14:225-235,

[169] R. Hassin. Algorithms for t h e m i n i m u m cost circulation problem based on maximizing t h e m e a n improvement. Operations Res. Lett., 12:227-233, 1992. [170] R. Hassin and D. B . Johnson. An 0 ( n l o g 2 n ) algorithm for m a x i m u m flow in undirected planar networks. SIAM J. Comput., 14:612-624, 1985. [171] R. Hassin and E. Zemel. Probabilistic analysis of t h e capacitated transportation problem. Math. Oper. Res., 13:80-89, 1988. [172] R. V. Helgason and J. L. Kennington. An efficient procedure for implementing a dual simplex network flow algorithm. A.I.I.E. Trans., 9:63-68, 1977. [173] F . L. Hitchcock. T h e distribution of a product from several sources to numerous facilities. J. Math. Phys., 20:224-230, 1941. [174] D. S. Hochbaum and A. Segev. Analysis of a flow problem with fixed charges. Networks, 19:291-312, 1989. [175] T. C. Hu. Multicommodity network flows. Operations [176] T. C. Hu. Integer Programming C o m p . , Reading, Mass., 1969.

& Network

Flows.

Res., 11:344-360, 1963. Addison-Wesley Publ.

[177] T. C. Hu. Combinatorial Algorithms, chapter 2.1, 2.2 and 2.3. Addison-Wesley Publ. Comp., Reading, Mass., 1982. [178] T . C. Hu and M. T. Shing. Algorithms, 4:241-261, 1983.

Multiterminal flows in outerplanar graphs.

J.

[179] T. C. Hu and M. T . Shing. A decomposition algorithm for multi-terminal network flows. Technical Report T R C S 84-08, D e p a r t m e n t of C o m p u t e r Science, University of California, Santa Barbara, CA, 1984. [180] C. A. J. Hurkens, A. Schrijver, and E. Tardos. On fractional multicommodity flows and distance functions. Discrete Math., 73:99-109, 1989. [181] T. Ichimori, H. Ishii, and T. Nishida. Weighted m i n i m a x real-valued flow. Oper. Res. Soc. Japan, 24:52-59, 1981.

J.

M.

318

Veldhorst

182] H. Imai. On t h e practical efficiency of various m a x i m u m flow algorithms. Oper. Res. Soc. Japan, 26:61-82, 1983.

J.

183] H. Imai and K. Iwano. Efficient sequential and parallel algorithms for planar m i n i m u m cost flow. In T. Asano, T. Ibaraki, H. Imai, and T. Nishizeki, editors, Proc. SIGAL International Symposium on Algorithms SIGAL '90, Lect u r e Notes in C o m p u t e r Science, vol. 450, pages 21-30, Springer-Verlag, Berlin, 1990. 184] M. Iri. A new m e t h o d of solving transportation-network problems. J. Res. Soc. Japan, 3:27-87, 1960. 185] M. Iri. Network York, 1969.

Flows,

Transportation

and Scheduling.

Oper.

Academic Press, New

186] A. Itai. Two-commodity flow. J. ACM, 25:596-611, 1978. 187] A. Itai and D. K. P r a d h a n . Synthesis of directed multicommodity flow networks. Networks, 14:213-224, 1984. 188] A. Itai and M. Rodeh. Scheduling transmissions in a network. J. 6:409-429, 1985. 189] A. Itai and Y. Shiloach. M a x i m u m flows in planar networks. SI AM J. 8:135-150, 1979.

Algorithms,

Comput.,

190] A. V. Iyer, J. J. Jarvis, and H. D. Ratliff. Hierarchical solution to network flow problems. Networks, 20:731-752, 1990. 191] L. Janiga and V. Koubek. A note on finding cuts in directed planar networks by parallel computation. Inf. Process. Lett., 21:75-78, 1985. [192] J. J. Jarvis. On the equivalence between node-arc and arc-chain formulations for t h e multicommodity maximal flow problem. Naval Res. Log. Quart., 16:525529, 1969. [193] J. J. Jarvis and A. M. Jezior. Maximal flow with gains through a special network. Operations Res., 20:678-688, 1972. [194] P. A. Jensen and W . Barnes. Network New York, 1980.

Flow Programming.

J o h n Wiley & Sons,

[195] P. A. Jensen and G. B h a u m i k . A flow augmentation approach t o t h e network with gains m i n i m u m cost flow problem. Manag. Sci., 23:631-643, 1977. [196] W . S. Jewell. O p t i m a l flow through networks. Interim Technical Report No. 8, Operations Research Center, M I T , Cambridge, Mass., 1958.

A Bibliography

on Network

Flow

Problems

319

[197] W . S. Jewell. O p t i m a l flow through networks with gains. 10:476-499, 1962.

Operations

Res.,

[198] W . S. Jewell. A primal-dual multicommodity flow algorithm. O R C Report 6624, Operations Research Center, University of California, Berkeley, C A , 1966. [199] W . S. Jewell. Multicommodity network solutions. Dunod, Paris, page 183, 1967.

In Thiorie

des

graphes.

[200] D. B . Johnson. Parallel algorithms for m i n i m u m cuts and m a x i m u m flows in planar networks. J. ACM, 34:950-967, 1987. Preliminary version in Proc. 23rd Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 244-254, 1982. [201] D. B. Johnson and S. M. Venkatesan. Using divide and conquer to find flows in directed planar networks in 0 ( r c 3 ' 2 logra) t i m e . In Proc. 20th Annual Allerton Conf. on Communication, Control, and Computing, pages 898-905, Univ. of Illinois, U r b a n a - C h a m p a i g n , IL., 1982. [202] D. B. Johnson and S. M. Venkatesan. Partition of planar flow networks. In Proc. 24th Annual IEEE Symp. Foundations of Computer Science, pages 259264, 1983. [203] D. S. Johnson and C. C. McGeoch. DIMACS implementation challenge workshop algorithms for network flow and matching. Technical Report 92-4, DIM A C S , New Brunswick, N J , 1992. [204] E. L. Johnson. 1966.

Networks and basis solutions.

Operations

Res.,

14:619-624,

[205] S. Kapoor and P. M. Vaidya. Fast algorithms for convex quadratic programming and multicommodity flows. In Proc. 18th Annual ACM Symp. Theory of Computing, pages 147-159, 1986. [206] R. M. K a r p . A characterization of the m i n i m u m cycle mean in a digraph. Discrete Math., 23:309-311, 1978. [207] R. M. K a r p , E. Upfal, and A. Wigderson. Constructing a m a x i m u m matching is in R a n d o m N C . Combinatorica, 6:35-48, 1986. [208] A. V. Karzanov. Determining t h e m a x i m a l flow in a network by t h e m e t h o d of preflows. Soviet Math. DokL, 15:434-437, 1974. [209] A. V. Karzanov. Half-integral 18:263-278, 1987.

five-terminus

flows.

Discrete

Applied

Math.,

[210] N. K a t o h . An efficient algorithm for the bicriteria minimum-cost circulation problem. J. Oper. Res. Soc. Japan, 32:420-440, 1989.

320

M.

Veldhorst

[211] J. L. Kennington. Survey of linear cost multicommodity network flows. ations Res., 26:209-236, 1978. [212] J. L. Kennington and R. V. Helgason. Algorithms Wiley-Interscience, New York, 1980.

for Network

Oper-

Programming.

[213] J. L. Kennington and M. Shalaby. An effective subgradient procedure for minimal cost multicommodity flow problems. Manag. Sci., 23:994-1004, 1977. [214] D . B . K h a n g and 0 . Fujiwara. A p p r o x i m a t e solutions of capacitated fixedcharge m i n i m u m cost network flow problems. Networks, 21:689-704, 1991. [215] S. Khuller and J. Naor. Flow in planar graphs with vertex capacities. Technical Report 90-1089, C o m p u t e r Science D e p a r t m e n t , Cornell University, Ithaca, NY, J a n . 1990. [216] S. Khuller, J. Naor, and P. Klein. T h e lattice structure of flow in planar graphs. Technical Report UMIACS-TR-2566, Univ. of Maryland Inst, for Advanced C o m p u t e r Studies, 1990. [217] S. Khuller and B . Schieber. Efficient parallel algorithms for testing kconnectivity and finding disjoint s — t paths in graphs. SI AM J. Comput., 20:352-375, 1991. Preliminary version in Proc. 30th Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 288-293, 1989. [218] A. B . Kinariwala and A. G. Rao. Flow switching approach to t h e m a x i m u m flow problem. J. ACM, 24:630-645, 1977. [219] V. King, S. Rao, and R. E. Tarjan. A faster deterministic m a x i m u m flow algorithm. In Proc. 3rd Annual ACM-SIAM Symp. Discrete Algorithms, pages 157-165, 1992. [220] M. Klein. A primal m e t h o d for minimal cost flows with applications t o t h e assignment and transportation problems. Manag. Sci., 14:205-220, 1967. [221] P. Klein, A. Agrawal, R. Ravi, and S. Rao. Approximation through multic o m m o d i t y flow. In Proc. 31th Annual IEEE Symp. Foundations of Computer Science, pages 726-737, 1990. [222] P. Klein, C. Stein, and E. Tardos. Leighton-Rao might be practical: faster approximation algorithms for concurrent flow with uniform capacities. In Proc. 22th Annual ACM Symp. Theory of Computing, pages 310-321, 1990. [223] D. J. Kleitman. An algorithm for certain multicommodity flow problems. Networks, 1:75-90, 1971.

A Bibliography

on Network

Flow

Problems

321

[224] J. G. Klincewicz. A Newton m e t h o d for convex separable network flow problems. Networks, 13:427-442, 1983. [225] D. Klingman, A. Napier, and J. Stutz. N E T G E N : A program for generating large scale capacitated assignment, transportation, and m i n i m u m cost flow network problems. Manag. Sci., 20:814-821, 1974. [226] E. K n a p p . An exercise in t h e formal derivation of parallel programs: M a x i m u m flows in graphs. ACM Trans. Program. Lang. Syst., 12:203-223, 1990. [227] T. C. Koopmans. O p t i m u m utilization of t h e transportation system. In Proc. International Statistical Conference, Washington, D.C., 1947. Also reprinted as supplement t o Econometrica 1 7 , 1949. [228] V. Koubek and A. Riha. T h e m a x i m u m fc-flow in a network. In J. Gruska and M. Chytil, editors, Proc. Mathem. Foundations of Computer Science, Lecture Notes in C o m p u t e r Science, vol. 118, pages 389-397, Springer-Verlag, Berlin, 1981. [229] L. Kucera. M a x i m u m flow in planar networks. In J. Gruska and M. Chytil, editors, Proc. Mathem. Foundations of Computer Science, Lecture Notes in C o m p u t e r Science, vol. 118, pages 418-422, Springer-Verlag, Berlin, 1981. [230] L. Kucera. Finding a m a x i m u m flow in / s , t / - p l a n a r network in linear expected t i m e . In M. P. Chytil and V. Koubek, editors, Proc. Mathem. Foundations of Computer Science, Lecture Notes in C o m p u t e r Science, vol. 176, pages 370-377, Springer-Verlag, Berlin, 1984. [231] E. L. Lawler. Combinatorial Optimization: Networks and Matroids, 6.3 and 7.11. Holt, Rinehart and Winston, New York, 1976. [232] E. L. Lawler. Shortest p a t h and network flow algorithms. Annals Math., 4:251-263, 1979.

chapter 4,

of

Discrete

[233] E. L. Lawler. An introduction to polymatroidal network flows. In G. Ausiello and M. Lucertini, editors, Analysis and Design of Algorithms in Combinatorial Optimization, International Centre for Mechanical Sciences, Courses and Lectures - No. 266, pages 129-146. Springer-Verlag, Vienna, 1981. [234] E. L. Lawler and C. U. Martel. Computing maximal "polymatroidal" network flow. Math. Oper. Res., 7:334-347, 1982. [235] T. Leighton, F . Makedon, S. Plotkin, C. Stein, E. Tardos, and S. Tragoudas. Fast approximation algorithms for multicommodity flow problems. In Proc. 23rd Annual ACM Symp. Theory of Computing, pages 101-111, 1991.

322

M.

Veldhorst

[236] T . Leighton and S. Rao. An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In Proc. 29th Annual IEEE Symp. Foundations of Computer Science, pages 422-431, 1988. [237] T. Lengauer and K. W. Wagner. T h e binary network flow problem is logspace complete for P . Theoretical Comput. Sci., 75:357-363, 1990. A preliminary version was part of: T. Lengauer and K. W . Wagner, T h e correlation between t h e complexities of non-hierarchical and hierarchical versions of graph problems. In: F . J. Brandenburg, G. Vidal-Nacquet and M. Wirsing (eds.), Proc. STACS 87 - 4th Annual Symp. on Theor. Aspects of C o m p u t e r Science, Lecture Notes in C o m p u t e r Science, vol. 247, pages 100-113, Springer-Verlag, Berlin, 1987. [238] R. J. Lipton and R. E. Tarjan. A separator theorem for planar graphs. J. Appl. Math., 36:177-189, 1979. [239] M. V. Lomonosov. 3:207-218, 1983.

On the planar integer two-flow problem.

SIAM

Combinatorica,

[240] M. V. Lomonosov. Combinatorial approaches to multiflow problems. Applied Math., 11:1-94, 1985.

Discrete

[241] M. Malek-Zavarei and J. K. Aggarwal. Optimal flow in networks with gains and costs. Networks, 1:355-365, 1972. [242] M. Malek-Zavarei and I. T. Frisch. On t h e fixed cost flow problem. Control, 16:897-902, 1972.

Int.

J.

[243] V. M. Malhotra, M. P. K u m a r , and S. N. Maheshwari. An 0(n3) algorithm for finding m a x i m u m flows in networks. Inf. Process. Lett., 7:277-278, 1978. [244] J. M. Marberg and E. Gafni. An 0 ( n 2 m 1 / 2 ) distributed max-flow algorithm. In S. Sahni, editor, Proc. International Conf. on Parallel Processing, pages 2 1 3 216, 1987. [245] C. Martel. A comparison of phase and non-phase network flow algorithms. Networks, 19:691-705, 1989. [246] K. M a t s u m o t o , T. Nishizeki, and N. Saito. An efficient algorithm for finding multicommodity flows in planar networks. SIAM J. Comput., 14:289-302, 1985. [247] K. M a t s u m o t o , T. Nishizeki, and N. Saito. Planar multicommodity flows, maxi m u m matchings and negative cycles. SIAM J. Comput., 15:495-510, 1986. [248] J. F . Maurras. Optimization of t h e flow through networks with gains. Programming, 3:135-144, 1972.

Math.

A Bibliography

on Network

Flow

323

Problems

[249] N. Megiddo. Optimal flows in networks with multiple sources and sinks. Programming, 7:97-107, 1974.

Math.

[250] N. Megiddo. A good algorithm for lexicographically optimal flows in multiterminal networks. Bull, of the AMS, 83:97-107, 1977. [251] K. Mehlhorn. Data structures and Algorithms; vol. 2, Graph Algorithms NP-completeness, chapter IV.9. Springer-Verlag, Berlin, 1984.

and

[252] G. L. Miller and J. Naor. Flow in planar graphs with multiple sources and sinks, extended abstract. In Proc. 30th Annual IEEE Symp. Foundations of Computer Science, pages 112-117, 1989. [253] E. Minieka. Optimal flow in a network with gains. INFOR,

10:171-178, 1972.

[254] E. Minieka. P a r a m e t r i c network flows. Operations

Res., 20:1162-11678, 1972.

[255] E. Minieka. Optimization New York, 1978.

and Graphs. Marcel Dekker,

Algorithms

for Networks

[256] M. Minoux. Resolution des problemes de multiflots en nombres entier dans les grands resaux. RAIRO, 3:21-40, 1975. [257] M. Minoux. Flots equilibres et flots avec securite. E.D.F.-Bull. et Recherches, serie C - Mathem., Inform., 1:5-16, 1976.

Direction

[258] M. Minoux. Multiflots de cout minimal avec fonctions de cout concaves. Telecommun., 31:77-92, 1976.

Etudes

Annls

[259] M. Minoux. A polynomial algorithm for m i n i m u m quadratic cost flow problems. Europ. J. Oper. Res., 18:377-387, 1984. [260] M. Minoux. Network synthesis and o p t i m u m network design problems: Models, solution m e t h o d s and applications. Networks, 19:313-360, 1989. [261] G. J. Minty. Monotone networks. Proc. Royal Soc. London, 1960.

A(257):194-212,

[262] J. S. B . Mitchell. On m a x i m u m flows in polyhedral domains. J. Comput. Set., 40:88-123, 1990.

Syst.

[263] K. Mizuno, S. Mizuno, and M. Mori. A polynomial t i m e interior point algorithm for m i n i m u m cost flow problems. J. Oper. Res. Soc. Japan, 33:157-167, 1990. [264] J. Mulvey. Pivot strategies for primal-simplex network codes. J. ACM, 25:266270, 1978.

324 [265] K. G. Murty. Linear New York, 1976.

M. and Combinatorial

Programming.

Veldhorst

J o h n Wiley & Sons,

[266] H. Nagamochi and T. Ibaraki. On max-flow min-cut and integral flow properties for m u l t i c o m m o d i t y flows in directed networks. Inf. Process. Lett., 31:279-285, 1989. [267] H. Nagamochi and T. Ibaraki. Multicommodity flows in certain planar directed networks. Discrete Applied Math., 27:125-145, 1990. [268] H. Nagamochi and T. Ibaraki. M a x i m u m flows in probabilistic networks. works, 21:645-666, 1991.

Net-

[269] H. Nagamochi and T. Ibaraki. Computing edge-connectivity in multigraphs and capacitated networks. SIAM J. Discr. Math., 5:54-66, 1992. [270] A. Nakayama. A polynomial algorithm for t h e m a x i m u m balanced flow problem with a constant balancing r a t e function. J. Oper. Res. Soc. Japan, 29:400-410, 1986. [271] A. Nakayama. A polynomial-time dual simplex algorithm for t h e m i n i m u m cost flow problem. J. Oper. Res. Soc. Japan, 30:265-289, 1987. [272] A. Nakayama. A polynomial-time binary search algorithm for t h e m a x i m u m balanced flow problem. J. Oper. Res. Soc. Japan, 33:1-11, 1990. [273] A. Nakayama. NP-completeness and approximation algorithm for the m a x i m u m integral vertex-balanced flow problem. J. Oper. Res. Soc. Japan, 34:13-27, 1991. [274] T. Nishizeki and N. Chiba. Planar Graphs: Theory and Algorithms, chapter 11. Annals of Discrete M a t h e m a t i c s , vol. 32. North-Holland Publ. Comp., Amsterd a m , 1988. [275] H. Okamura. Multicommodity flows in graphs. Discrete Applied Math., 6:55-62, 1983. [276] H. O k a m u r a and P. D. Seymour. Multicommodity flows in planar graphs. Combin. Theory, B-31:75-81, 1981.

J.

[277] K. Onaga. D y n a m i c programming of o p t i m u m flows in lossy communication nets. IEEE Trans. Circuit Th., CT-13:282-287, 1966. [278] K. Onaga. O p t i m a l flows in general communication networks. J. Franklin 283:308-327, 1967. [279] J. B . Orlin. M a x i m u m t h r o u g h p u t - d y n a m i c networks flows. Math. ming, 27:214-231, 1983.

Inst.,

Program-

A Bibliography

on Network

Flow

Problems

325

[280] J. B . Orlin. Genuinely polynomial simplex and non-simplex algorithms for t h e m i n i m u m cost flow problem. Technical Report 1615-84, Sloan School of M a n a g e m e n t , M I T , Cambridge, Mass., 1984. Also as CWI-OS R8504, Center for M a t h e m a t i c s and C o m p u t e r Science, A m s t e r d a m , 1985. [281] J. B. Orlin. M i n i m u m convex cost dynamic network flows. Math. 9:190-207, 1984.

Oper.

Res.,

[282] J. B . Orlin. On t h e simplex algorithm for networks and generalized networks. Math. Prog. Stud., 24:166-178, 1985. [283] J. B. Orlin. A faster strongly polynomial m i n i m u m cost flow algorithm. In Proc. 20th Annual ACM Symp. Theory of Computing, pages 377-387, 1988. To appear in Operations Res. [284] J. B . Orlin and R. K. Ahuja. New distance-directed algorithms for m a x i m u m flow and p a r a m e t r i c m a x i m u m flow problems. Technical Report 1908-87, Sloan School of M a n a g e m e n t , M I T , Cambridge, Mass., 1987. [285] J. B . Orlin and R. K. Ahuja. New scaling algorithms for assignment and minim u m cycle m e a n problems. Math. Programming, 54:41-56, 1988. [286] M. P a d b e r g and G. Rinaldi. An efficient algorithm for t h e m i n i m u m capacity cut problem. Math. Programming, 47:19-36, 1990. [287] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization, Algorithms and Complexity., chapter 4.3, 5.6, 6, 7, 9 and 10.3. Prentice-Hall, Inc., Englewood Cliffs, N J , 1982. [288] A. B . P h i l p o t t . Continuous-time flows in networks. Math. 661, 1990.

Oper. Res., 15:640-

[289] S. A. Plotkin and E. Tardos. Improved dual network simplex. In Proc. Annual ACM-SIAM Symp. Discrete Algorithms, pages 367-376, 1990. [290] J. Ponstein. Programming,

On t h e maximal flow problem with real arc capacities. 3:254-256, 1972.

[291] R. B . P o t t s and R. M. Oliver. Flows in Transportation Press, New York, 1972.

Networks.

1st

Math.

Academic

[292] P. S. P u l a t . A decomposition algorithm to determine t h e m a x i m u m flow in a generalized network. Comput. Oper. Res., 16:161-172, 1989. [293] P. S. P u l a t . M a x i m u m outflow in generalized flow networks. Europ. J. Res., 43:65-77, 1989.

Oper.

326

M.

Veldhorst

[294] A. P. P u n n e n . A linear t i m e algorithm for t h e m a x i m u m capacity p a t h . J. Oper. Res., 53:402-404, 1991.

Europ.

[295] M. Queyranne. Theoretical efficiency of t h e algorithm "capacity" for t h e maxi m u m flow problem. Math. Oper. Res., 5:258-266, 1980. [296] T. Radzik and A. V. Goldberg. Tight bounds on t h e n u m b e r of m i n i m u m - m e a n cycle cancellations and related results. In Proc. 2nd Annual ACM-SIAM Symp. Discrete Algorithms, pages 110-119, 1991. [297] V . R a m a c h a n d r a n . T h e complexity of m i n i m u m cut and m a x i m u m flow problems in an acyclic network. Networks, 17:387-392, 1987. [298] K. G. R a m a k r i s h n a n . Solving two-commodity transportation problems with coupling constraints. J. ACM, 27:736-757, 1980. [299] J. H. Reif. M i n i m u m s-t cut of a planar undirected network in 0 ( n l o g 2 ( n ) ) t i m e . SI AM J. Comput., 12:71-81, 1983. [300] H. Rock. Scaling techniques for m i n i m u m cost network flows. In U. P a p e , editor, Discrete Structures and Algorithms, pages 181-191, Carl Hansen Verlag, Miinchen, 1980. [301] R. T. Rockafellar. Network k Sons, New York, 1984.

Flows and Monotropic

Optimization.

John Wiley

[302] B . Rothfarb and I. T. Frisch. On t h e 3-commodity flow problem. Appl. Math., 17:46-58, 1969.

SIAM

J.

[303] B . Rothfarb, N. P. Shein, and I. T. Frisch. Common terminal multicommodity flow. Operations Res., 16:202-205, 1968. [304] B. Rothschild and A. Whinston. Feasibility of two commodity network flows. Operations Res., 14:1121-1129, 1966. [305] B . Rothschild and A. Whinston. On two commodity network Res., 14:377-387, 1966.

flows.

Operations

[306] G. Ruhe. P a r a m e t r i c m a x i m a l flows in generalized networks - complexity and algorithms. Optimization, 19:235-251, 1988. [307] H. M. Safer. Scaling algorithms for distributed m a x flow. Technical report, Sloan School of Management, M I T , Cambridge, Mass., 1988. [308] R. Saigal. Multicommodity flows in directed networks. Operations Research Center, University of California, Berkeley, CA, 1968.

A Bibliography

on Network

Flow

327

Problems

[309] M. Sakarovitch. T h e multicommodity m a x i m u m flow problem. O R C Report 6625, Operations Research Center, University of California, Berkeley, CA, 1968. [310] M. Sakarovitch. Two commodity network flows and linear programming. Programming, 4:1-20, 1973.

Math.

[311] B . Schieber and S. Moran. Parallel algorithms for m a x i m u m bipartite matchings and m a x i m u m 0-1 flows. J. of Parallel and Distributed Computing, 6:20-38, 1989. [312] A. Schrijver. Applications of polyhedral combinatorics to multicommodity flows and compact surfaces. Technical Report CWI-BS-R8921, Center for M a t h e m a t ics and C o m p u t e r Science, A m s t e r d a m , 1989. [313] A. Schrijver. T h e Klein bottle and multicommodity 9:375-384, 1989.

flows.

Combinatorial,

[314] A. Schrijver. Short proofs on multicommodity flows and cuts. Technical Report CWI-BS-R8922, Center for Mathematics and C o m p u t e r Science, A m s t e r d a m , 1989. [315] A. Segall. Decentralized maximum-flow protocols. Networks,

12:213-230, 1982.

[316] M. Sengoku, S. Skinoda, and R. Yatsuboshi. On a function for t h e vulnerability of a directed flow network. Networks, 18:73-83, 1988. [317] M. Serna and P. Spirakis. Tight R N C approximations to maxflow. Technical Report T R 90.01.1, C o m p u t e r Technology Institute, P a t r a s University, P a t r a s , Greece, 1990. [318] P. D. Seymour. T h e matroids with t h e max-flow min-cut property. J. Theory, B-23:189-222, 1977. [319] P. D. Seymour. A two-commodity cut theorem. Discrete 1978.

Math.,

23:341-355,

[320] P. D. Seymour. A short proof of t h e two-commodity flow theorem. J. Theory, B-26:370-371, 1979. [321] P. D. Seymour. Four-terminus flows. Networks,

Comb.

Comb.

10:79-86, 1980.

[322] P. D. Seymour. On odd cuts and planar multicommodity flows. Proc. Mathem. Soc, 42:178-192, 1981.

London

[323] F . Shahrokhi. Approximation algorithms for t h e m a x i m u m concurrent flow problem. ORSA Jrnl. on Computing, 1:62-69, 1989.

328

M.

Veldhorst

[324] F . Shahrokhi and D. Matula. T h e m a x i m u m concurrent flow problem. J. 37:318-334, 1990.

ACM,

[325] Y. Shiloach. An 0(nl log 2 7) m a x i m u m flow algorithm. Technical Report STAN78-702, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, 1978. [326] Y. Shiloach. Multi-terminal 0 - 1 flows. SIAM

J. Comput.,

8:422-430, 1979.

[327] Y. Shiloach. A multi-terminal m i n i m u m cut algorithm for planar graphs. J. Comput, 9:214-219, 1980.

SIAM

[328] Y. Shiloach and U. Vishkin. An 0(n2 log n) parallel max-flow algorithm. Algorithms, 3:128-146, 1982.

J.

[329] M. T. Shing and P. K. Agarwal. Multi-terminal flows in planar networks. Technical Report T R C S 86-07, D e p a r t m e n t of C o m p u t e r Science, University of California, Santa Barbara, CA, 1986. [330] J. F . Sibeyn. A pseudo-polylog t i m e parallel maxflow algorithm. Technical Report RUU-CS-90-17, D e p a r t m e n t of C o m p u t e r Science, University of Utrecht, Utrecht, T h e Netherlands, 1990. [331] K. Simon. On m i n i m u m flow and transitive reduction. In Proc. 15th Intern. Coll. on Automata, Languages and Programming, Lecture Notes in C o m p u t e r Science, vol. 317, pages 535-546, Springer-Verlag, Berlin, 1988. [332] D. D. Sleator and R. E. Tarjan. An O(nmlogn) algorithm for m a x i m u m network flow. Technical Report STAN-CS-80-831, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, 1980. [333] D. D. Sleator and R. E. Tarjan. A d a t a structure for dynamic trees. J. Syst. ScL, 26:362-390, 1983. [334] D. D. Sleator and R. E. Tarjan. Self adjusting binary search trees. J. 32:652-686, 1985.

Comput.

ACM,

[335] J. E. Somers. M a x i m u m flow in networks with a small n u m b e r of r a n d o m arc capacities. Networks, 12:242-253, 1982. [336] H. Soroush and P. B . Mirchandani. T h e stochastic multicommodity flow problem. Networks, 20:121-155, 1990. [337] Y. Soun and K. Truemper. Single commodity representation of multicommodity networks. SIAM J. Algebraic Discrete Methods, 1:348-358, 1980.

A Bibliography

on Network

Flow

329

Problems

[338] V. Srinivasan and G. L. Thompson. Accelerated algorithms for labeling and relabeling of trees, with applications t o distribution problems. J. ACM, 19:712— 726, 1972. [339] V. Srinivasan and G. L. Thompson. Benefit-cost analysis of coding techniques for primal transportation problems. J. ACM, 20:194-213, 1973. [340] H. Suzuki, T . Nishizeki, and N. Saito. Algorithms for multicommodity flows in planar graphs. Algorithmica, 4:471-501, 1989. Preliminary version in Proc. 17th Annual ACM Symp. Theory of Computing, pages 195-204, 1985. [341] E. Tardos. A strongly polynomial m i n i m u m cost circulation algorithm. binatorica, 5:247-255, 1985.

Com-

[342] E. Tardos. Improved approximation algorithm for concurrent multi-commodity flows. Technical Report 872, School of Operations Research and Industrial Engineering, Cornell University, 1989. [343] E. Tardos, C. Tovey, and M. Trick. Layered augmented p a t h algorithms. Oper. Res., 11:362-370, 1986. [344] R. E. Tarjan. Data Structures Philadelphia, PA, 1983.

and Network

Algorithms,

chapter 8.

Math.

SIAM,

[345] R. E. Tarjan. A simple version of Karzanov's blocking flow algorithm. tions Res. Lett., 2:265-268, 1984.

Opera-

[346] R. E. Tarjan. Algorithms for m a x i m u m network flow. Math. Prog. Study, 26:1— 11, 1986. [347] R. E. Tarjan. Efficiency of t h e primal network simplex algorithm for the minimum-cost circulation problem. Math. Oper. Res., 16:272-291, 1991. [348] N. Tomizawa. On some techniques useful for solution of t r a n s p o r t a t i o n network problems. Networks, 1:173-194, 1972. [349] J. A. Tomlin. Minimum-cost multicommodity network flows. Operations 14:45-51, 1966.

Res.,

[350] L. E. Trotter, Jr. On the generality of multi-terminal flow theory. Annals Discrete Math., 1:517-525, 1977. [351] K. Truemper. On m a x flows with gains and pure m i n i m u m cost J. Appl. Math., 32:450-456, 1977.

flows.

[352] K. Truemper. O p t i m a l flows in nonlinear gain networks. Networks, 1978.

of

SIAM

8:17-36,

330

M.

Veldhorst

K. Truemper. Max-flow min-cut matroids: polynomial testing and polynomial algorithms for m a x i m u m flow and shortest routes. Math. Oper. Res., 12:72-96, 1987. P. Tseng, D. P. Bertsekas, and J. N. Tsitsiklis. Partially asynchronous, parallel algorithms for network flows and other problems. SIAM J. Control & Optim., 28:678-710, 1990. A. Tucker. A note on t h e convergence of t h e Ford-Fulkerson flow algorithm. Math. Oper. Res., 2:143-144, 1977. P. M. Vaidya. Speeding-up linear programming using fast m a t r i x multiplication, (extended a b s t r a c t ) . In Proc. 30th Annual IEEE Symp. Foundations of Computer Science, pages 332-337, 1989. J. van Leeuwen. Graph algorithms. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, vol. A: Algorithms and Complexity, pages 5 2 5 631. Elsevier Science Publ., A m s t e r d a m , 1990. U. Vishkin. A parallel blocking flow algorithm for acyclic networks. J. rithms, 13:489-501, 1992.

Algo-

S. W . Wallace. Investing in arcs in a network to maximize t h e expected max flow. Networks, 17:87-103, 1987. A. W e i n t r a u b . A primal algorithm to solve network flow problems with convex costs. Manag. Sci., 21:87-97, 1974. R. J. Wittrock. Operator assignment and t h e parametric prefiow problem. Manag. Sci., 38:1354-1359, 1992. R. D. Wollmer. Multicommodity networks with resource constraints: t h e generalized multicommodity flow problem. Networks, 1:245-263, 1972. M. A. Yakovleva. A problem on m i n i m u m transportation cost. In V. S. Nemchinov, editor, Applications of Mathematics in Economic Research, pages 390-399, Izdat. Social'no-Ekon. Lit., Moscow, 1959. N. E. Young, R. E. Tarjan, and J. B. Orlin. Faster p a r a m e t r i c shortest p a t h and m i n i m u m balance algorithms. Networks, 21:205-221, 1991. N. Zadeh. Theoretical efficiency of t h e E d m o n d s - K a r p algorithm for computing m a x i m a l flows. J. ACM, 19:184-192, 1972. N. Zadeh. A bad network problem for t h e simplex m e t h o d and other m i n i m u m cost flow algorithms. Math. Programming, 5:255-266, 1973.

A Bibliography on Network Flow Problems

331

[367] N. Zadeh. More pathological examples for network flow problems. Math. Programming, 5:217-224, 1973. [368] W. I. Zangwill. Minimum concave cost flows in certain networks. Manag. Sci., 14:429-450, 1968. [369] C.-Q. Zhang. Minimum cycle coverings and integer flows. J. Graph Theory, 14:537-546, 1990. [370] U. Zimmermann. Minimization on submodular flows. Discrete Applied Math., 4:303-323, 1982.

333 Network Optimization Problems, pp. 333-353 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Tabu Search: Applications and Prospects Stefan VofJ Technische Hochschule Darmstadt, FB 1 / FG Operations Hochschulstrafie 1, D - 6100 Darmstadt, Germany

Research,

Abstract Tabu Search is a metastrategy for guiding known heuristics to overcome local optimality. Successful applications of this kind of metaheuristic to a great variety of problems have been reported in the literature. In this paper we consider two applications of tabu search with special emphasis on dynamic tabu list management. Although still in its infancy, recently some implementations of tabu search on parallel computers have come up. Whereas these implementations are tailored to specific problems we attempt to provide ideas for a more general concept for developing parallel tabu search algorithms.

1

Introduction

Due to t h e complexity of a great variety of combinatorial optimization problems, heuristic algorithms are especially relevant for dealing with these large scale problems. T h e m a i n drawback of algorithms such as deterministic exchange procedures is their inability to continue t h e search upon becoming t r a p p e d in a local o p t i m u m . This suggests consideration of recent techniques for guiding known heuristics to overcome local optimality. Following this t h e m e , we investigate t h e application of t h e tabu search m e t a s t r a t e g y for solving combinatorial optimization problems. T h e first part of this paper considers two specific applications of t a b u search to t h e multiconstraint zero-one knapsack problem and to t h e quadratic semi-assignment problem. These applications have been performed on a sequential computer. Although usually a very fast m e t h o d , in real-world applications of e.g. t h e quadratic semi-assignment problem sometimes very large computation times for t a b u search

334

S. VoB

may be observed. Therefore, t h e idea of parallelization as it recently came up for t a b u search, too, may be especially relevant in reducing C P U - t i m e s for t h e algorithms under consideration. T h e key issue in designing parallel algorithms is to decompose t h e execution of t h e various ingredients of a procedure into processes executable by parallel processors. Improvement procedures like t a b u search or simulated annealing at first glance, however, have an intrinsic sequential n a t u r e due to t h e idea of performing t h e neighbourhood search from one solution t o t h e next. Therefore, there is not yet a common or generally applicable parallelization of t a b u search in t h e literature. In t h e second part of this paper we a t t e m p t to describe some general ideas and a classification scheme for parallel t a b u search algorithms. Following t h e above framework t h e paper is organized as follows. In Section 2, we present an outline of t a b u search. Section 3 describes our application of t a b u search t o two combinatorial optimization problems. For t h e multiconstraint zeroone knapsack problem we report some improvements in comparison to an already known t a b u search m e t h o d . Computational results are performed over a sample of 57 problems known from t h e literature. W i t h respect t o t h e quadratic semi-assignment problem we review results derived for a real-world application arising in the field of schedule synchronization in public mass transit systems. Before describing some concepts for parallel t a b u search algorithms in more detail we briefly discuss t h e common parallel machine models and algorithms in Section 4. Some examples are given and finally some conclusions are drawn (Section 5). T h e a t t e m p t , of course, is not to give a complete t r e a t m e n t of parallel t a b u search b u t to sketch t h e potential this area of research carries.

2

Tabu Search

Many solution approaches are characterized by identifying a neighbourhood of a given solution which contains other (transformed) solutions t h a t can be reached in a single iteration. A transition from a feasible solution to a transformed feasible solution is referred to as a move and m a y be described by a set of one or more attributes. For example, in a zero-one integer p r o g r a m m i n g context these a t t r i b u t e s m a y be t h e set of all possible value assignments (or changes in such assignments) for t h e binary variables. T h e n two a t t r i b u t e s e and e, which denote t h a t a certain binary variable is set to 1 or 0, m a y b e called complementary t o each other. Following a steepest descent/mildest ascent approach, a move m a y either result in a best possible improvement or a least deterioration of t h e objective function value. W i t h o u t additional control, however, such a process can cause a locally optimal solution t o b e re-visited immediately after moving to a neighbour. To prevent t h e search from endlessly cycling between t h e same solutions, t a b u search m a y be visualized as follows. Imagine t h a t t h e a t t r i b u t e s of all moves are stored in a list, n a m e d a running list, representing t h e trajectory of solutions encountered.

Tabu Search: Applications and Prospects

335

Then, related to a sublist of the running list a so-called tabu list may be defined. Based on certain restrictions, it keeps some moves, consisting of attributes complementary to those of the running list, which will be forbidden in at least one subsequent iteration because they might lead back to a previously visited solution. Thus, the tabu list restricts the search to a subset of admissible moves (consisting of admissible attributes or combinations of attributes). This hopefully leads to 'good' moves in each iteration without re-visiting solutions already encountered. A general outline of a tabu search procedure (for solving a minimization problem) may be described as follows: Tabu Search Given: A feasible solution x* with objective function value z*. Start: Let x := x* with z(x) = z*. Iteration: while stopping criterion is not fulfilled1do begin (1) select best admissible move that transforms x into x' with objective function value z(z') and add its attributes to the running list (2) perform tabu list management: compute moves to be set tabu, i.e., update the tabu list (3) perform exchanges: x := x', z(x) = z(x') if z{x) < z* then z* := z(x), x* := x endif endwhile Result: x* is the best of all determined solutions, with objective function value z*.

*** For a background on tabu search and a number of references on successful applications of this metaheuristic see, e.g., Glover (1989, 1990), Domschke et al. (1992), and Glover and Laguna (1992).

Tabu List Management Tabu list management concerns updating the tabu list, i.e., deciding on how many and which moves have to be set tabu within any iteration of the search. Up to now, usually static methods have been applied in the literature as, e.g., the tabu navigation method (TNM). In TNM, single attributes are set tabu as soon as their complements have been part of a selected move. The attributes stay tabu for a distinct time, i.e. number of iterations, until the probability of causing a solution's re-visit is small. The efficiency of the algorithm depends on the choice of the tabu status duration, i.e. the length tl_size of the underlying tabu list. (In the literature often a 'magic' tl_size=7 is proposed.) For the sake of an improved effectivity, a so-called aspiration level criterion 1

A possible stopping criterion can be, e.g., a prespecified time limit.

336

S. VoB

is considered, which permits t h e choice of an a t t r i b u t e even when it is tabu. This can be advantageous when a new best solution m a y be calculated, or when t h e t a b u status of t h e a t t r i b u t e s prevent any move from feasibility. T h e static approach, though successful in some applications, seems to be a rather limited one. Another probably more fruitful idea is to define an a t t r i b u t e as being potentially tabu if it belongs to a chosen move and to handle it in a candidate list first. Via additional criteria these a t t r i b u t e s can be definitely included in t h e t a b u list if necessary, or excluded from t h e candidate list if possible. Therefore, t h e candidate list is an intermediate list between a running list and a t a b u list. Glover (1990) suggests t h e use of different candidate list strategies in order t o avoid extensive computational effort without sacrificing solution quality. In t h e sequel, we describe t h e following dynamic strategies for managing t a b u lists: the cancellation sequence method (CSM, in a revised version, cf. D a m m e y e r et al. (1991)), and t h e reverse elimination method (REM). CSM as well as R E M both use additional criteria for setting a t t r i b u t e s tabu. T h e p r i m a r y goal is to permit t h e reversion of any a t t r i b u t e b u t one between two solutions to prevent from re-visiting t h e older one. To find those critical moves, CSM needs a candidate list t h a t contains t h e complements of a t t r i b u t e s being potentially t a b u . This active tabu list (ATL) is built like t h e running list where elimination of certain a t t r i b u t e s is furthermore p e r m i t t e d . Whenever an a t t r i b u t e of t h e last performed move finds its complement on ATL this complement will be eliminated from ATL. All a t t r i b u t e s between t h e cancelled one and its recently added complement build a cancellation sequence separating t h e actual solution from t h e solution t h a t has been left by t h e move t h a t contains t h e cancelled a t t r i b u t e . Any a t t r i b u t e but one of a cancellation sequence is allowed to be cancelled by future moves. This condition is sufficient b u t not necessary, as some aspects have t o be taken into account so t h a t CSM works well. • Making a single a t t r i b u t e t a b u prevents m a n y moves which could lead to yet unvisited solutions. (An a t t r i b u t e becomes t a b u if its complement is t h e only a t t r i b u t e of a cancellation sequence. An a t t r i b u t e becomes t a b u for one iteration if its complement is t h e most recent a t t r i b u t e of ATL. Otherwise a cancellation sequence could not be defined between these two attributes.) • For building a cancellation sequence, t h e remaining a t t r i b u t e s of the older and t h e current move are not necessarily taken into consideration. This depends on t h e order in which t h e move's attributes are added to ATL. • Those a t t r i b u t e s of a move t h a t did not cancel another a t t r i b u t e within a specific cancellation sequence are disregarded when making its last remaining a t t r i b u t e t a b u (although they separate two solutions). Whenever a cancellation sequence includes a smaller one t h e smaller sequence is said to dominate t h e larger. T h e n t h e larger cancellation sequence may be disre-

Tabu Search: Applications

and

Prospects

337

garded, because any of its a t t r i b u t e s will only become t a b u if they are within t h e smaller sequence, too. T h e above mentioned aspects work well for t h e case t h a t a move consists of exactly one a t t r i b u t e , i.e., when so-called single-attribute moves are considered instead of multi-attribute moves. In addition, t h e corresponding p a r a m e t e r s have to be chosen appropriately (e.g. t h e t a b u list duration of a t a b u a t t r i b u t e , and how to apply t h e aspiration level criterion). Applying CSM to m u l t i - a t t r i b u t e moves needs additional criteria to prevent errors caused by uncovered special cases. E.g. for paired-attribute moves (moves consisting of exactly two attributes) those moves must be prohibited t h a t m a y cancel a cancellation sequence consisting of exactly two a t t r i b u t e s (because none of t h e m is t a b u when choosing a move). T h e conditions of T N M and CSM need not be necessary to prevent from re-visiting previously encountered solutions. Necessity, however, can be achieved by R E M . T h e idea of R E M is t h a t any solution can only be re-visited in t h e next iteration if it is a neighbour of t h e current solution. Therefore, in each iteration t h e running list will b e traced back to determine all moves which have to be set t a b u (since they would lead to an already explored solution). For this purpose, a residual cancellation sequence (RCS) is built up stepwise by tracing back t h e running list. In each step exactly one a t t r i b u t e is processed, from last to first. After initializing an e m p t y R C S , only those a t t r i b u t e s are added whose complements are not in t h e sequence. Otherwise their complements in t h e RCS are eliminated (i.e. cancelled). T h e n at each tracing step it is known which a t t r i b u t e s have to be reversed in order to t u r n t h e current solution back into one examined at an earlier iteration of t h e search. If t h e remaining a t t r i b u t e s in t h e RCS can be reversed by exactly one move then this move is t a b u in t h e next iteration. For single-attribute moves, for instance, t h e length of an RCS must be one to enforce a t a b u move. Obviously, t h e execution of R E M represents a necessary and sufficient criterion to prevent re-visiting known solutions. Since t h e computational effort of R E M increases if t h e n u m b e r of iterations increases, ideas for reducing t h e n u m b e r of computations have been developed (cf. Glover (1990) and D a m m e y e r and Vofi(1991a)).

Search Intensification and Search Diversification A general idea for reducing t h e computational effort in a t a b u search algorithm is t h a t of search intensification using a so-called short term memory (cp. t h e expression intermediate term memory in Glover (1989)). Its basic idea is to observe t h e a t t r i b u t e s of all performed moves and to eliminate those from further consideration t h a t have not been part of any solution generated during a given n u m b e r of iterations. This results in a concentration of t h e search where the number of neighbourhood solutions in each iteration, and consequently t h e computational effort, decreases. Obviously t h e cost of this reduction can be a loss of accuracy. Correspondingly, a search diversification may be defined as a long term

memory

S. VoB

338

to penalize often selected assignments. T h e n the neighbourhood search can be led into not yet explored regions where t h e t a b u list operation is restarted (resulting in an increased c o m p u t a t i o n t i m e ) . An appealing opportunity for search diversification is created by R E M . Let t > 1 be integer. If at any tracing step t h e a t t r i b u t e s t h a t have to be reversed to t u r n t h e current solution back into an already explored one equal exactly t moves t h e n it is possible to set these moves t a b u for t h e next iteration. Note t h a t for t h e case of m u l t i - a t t r i b u t e moves, due to various combinations of a t t r i b u t e s to moves, even more t h a n t moves may be set t a b u in order to avoid different p a t h s through t h e search space leading to t h e same solution. Accordingly, search diversification is obvious. We highlight t h e case of t = 2 as a new stand-alone m e t h o d called R E M 2 since in t h e same n u m b e r of iterations a larger n u m b e r of solutions is encountered t h a n with R E M . To have t = 2 means t h a t all common neighbours of t h e current solution and of an already explored one are forbidden. These neighbours were implicitly investigated during a former step of t h e procedure (due to the choice of a best non-tabu neighbour) and need not be looked at again. Therefore, R E M 2 retains the nice property of being a necessary and a sufficient criterion as mentioned above without having the fear t h a t some solutions would not be encountered, as may be the case for t > 2. Nevertheless, from a computational point of view R E M t for t > 3 may be advantageous. However, as also with R E M or R E M 2 , black hole suboptima may still occur when all moves to neighbourhood solutions are t a b u . For applications and (sequential) comparisons of T N M , CSM, and R E M see D a m m e y e r and Vofi(1991b) and Domschke et al. (1992).

3

Applications

In this paragraph we report on two applications of tabu search. multiconstraint zero-one knapsack problem, which is an example for single-attribute moves are performed. T h e second example is t h e assignment problem, a representative for t h e use of paired-attribute

T h e first is t h e problems where quadratic semimoves.

Multiconstraint Zero-One Knapsack Problem T h e multiconstraint zero-one knapsack problem ( M C K P ) is a special case of general zero-one p r o g r a m m i n g with a great variety of applications in t h e areas of, e.g., resource allocation and capital budgeting. Given n objects with positive profits c, and m resources with nonnegative resource consumption values a;j and positive limitations 6,-. W i t h binary decision variables x, t h e problem may be stated as follows: n

Maximize

Z(x) = ^

Cj • Xj

(1)

Tabu Search: Applications

and

339

Prospects

subject to n

J2 a 'i " xi - b> xjG{0,l}

i=l,-.-,m i = l,...,n

(2) (3)

Various algorithms for solving this NP-hard problem have been proposed in t h e literature. Here we refer to Drexl (1988), who developed an efficient simulated annealing algorithm, and to Dammeyer and Vofl(1991a) for an implementation of R E M and a comparison with simulated annealing. To have a comparative study 57 test problems with known optimal solutions were taken as a reference in both papers. T h e d a t a for these studies with n varying from 6 to 105 and m from 2 to 30 are fully reproduced in Frevilleand Plateau (1982, 1990). Dammeyer and Vofi(1991a) show t h a t simulated annealing may be outperformed by R E M with respect to various criteria. Here we use their implementation of different versions of R E M and compare t h e m with t h e results of R E M 2 and R E M 3 as defined in Section 2. Despite t h e modification caused by this diversification approach all p a r a m e t e r setting is identical. We report some of t h e details (see Dammeyer and Vofi(1991a) for a complete description). For t h e purpose of finding a neighbourhood solution within t a b u search t h e following transformation is defined: G i v e n : A feasible solution x = ( x i , . . . , x n ) . Choose j * = arg max{a,-.j/cj|Xj = l,j = 1 , . . . , n } with t* being a bottleneck resource with i* : = arg max{£)? = i
Vi = 1 , . . . , m)

This so-called D R O P / A D D - move may be considered as a m u l t i - a t t r i b u t e move with variable length depending on a specific instance of M C K P . Any such move consists of exactly one DROP-attribute j * and a variable number of ADD-attributes according to t h e choice of elements k* given in t h e while-loop. Our implementation considers these moves as successive multi-attribute moves, i.e., every move is regarded as a n u m b e r of separate single-attribute moves. In any iteration t h e n u m b e r of traces to be performed, i.e. t h e number of times an RCS is built, is equal to t h e number of a t t r i b u t e s of t h e corresponding move. For each of t h e 57 test problems 20 initial feasible solutions have been generated. Starting with x = (0,..., 0) for t h e first feasible solution we added elements according to t h e ADD-criterion as applied in t h e D R O P / A D D - move as long as possible. Nine further solutions have been gained from x = (0, ...,0) by adding randomly chosen elements as long as possible. In t h e same way we proceeded with an initialization of x = ( ! , . . . , ! ) . Nine solutions have been obtained by randomly dropping elements

340

S. Vo5

until feasibility was achieved, and t h e t e n t h by applying a DROP-criterion inverse to t h e ADD-criterion. Given any of t h e 20 initial feasible solutions t h e algorithms t e r m i n a t e whenever there is no improvement of t h e best feasible solution within a certain n u m b e r a of iterations. Starting from an initialization a = n this n u m b e r is increased over t i m e by a factor of 1.1. An additional overall stopping criterion of 10 • n iterations did not affect t h e termination of t a b u search. T h e n u m b e r of tracing steps for building an RCS was limited to 4 • n. Table 1 gives a summarized description of our results. (All programs are implem e n t e d in PASCAL and run on an IBM PS 2/70, 386 personal computer.) T h e first rows show a comparison of R E M (t = 1), R E M 2 , and R E M 3 as described above. T h e n t h e influence of applying t h e short and long t e r m m e m o r y is analyzed in t h e next rows. In t h e short t e r m memory (STM) an element is eliminated from further consideration if it has not been included in any solution examined during a iterations. For a combined long and short t e r m memory ( L + S T M ) t h e following modification is used. Whenever S T M stops a new starting solution is obtained by choosing from those elements t h a t have previously been eliminated according to S T M and the algor i t h m is restarted. This procedure is repeated no more t h a n ten times as long as a new starting solution can be found. Correspondingly in t h e long t e r m memory (LTM) new starting solutions are obtained from those elements t h a t have not been in any solution for a certain n u m b e r of iterations. T h e first two columns of Table 1 show t h e n u m b e r of optimal solutions found referring to t h e 57 test problems and referring to all instances (i.e. from 20 • 57 = 1140). Correspondingly, t h e average deviation from optimality is given with respect to t h e best found solution out of t h e 20 instances over t h e 57 test problems and referring to all 1140 instances. T h e average number of moves gives t h e average n u m b e r of neighbourhood exchanges needed with respect to t h e best found solution over all 57 test problems. Finally, t h e average C P U - t i m e referring to all 1140 instances is given. All specified m e t h o d s behave in nearly t h e same way. Concerning solution quality, t h e inclusion of S T M into R E M t (t = 1,2,3) gives only slightly worse results, b u t with a remarkable decrease in C P U - t i m e s . T h e most astonishing entries in Table 1 are t h e average number of moves needed to find t h e best solution referring to t h e most successful out of t h e sample of 20 trials. R E M t with L + S T M leads to improved results b u t with significantly increased CPU-times. If R E M t is applied with LTM instead of L + S T M t h e solution quality is slightly affected in some test problems with mostly increased C P U - t i m e s . T h e modifications of R E M proposed in this paper lead to improvements in solution quality with only slightly affected CPU-times. Although t h e average n u m b e r of necessary neighbourhood exchanges increases, t h e C P U - t i m e s over all instances even might decrease because t h e stopping criterion gets active when no further improvements are found in a certain number of iterations. As a recommendation we conclude t h a t R E M 2 should replace R E M whenever this kind of dynamic t a b u list m a n a g e m e n t is used.

Tabu Search: Applications

and

Prospects

number of optimal solutions found ref. 57 all instances

341

average deviation from optimality (in %) ref. 57 all instances

average no. of moves ref. 57

average CPU-time all instances

algorithm

t

REM

1 2 3

40 43 44

283 302 223

0.126 0.117 0.088

3.483 2.925 2.681

11 17 18

4.85 5.30 4.97

REM with STM

1 2 3

39 43 41

268 261 210

0.130 0.119 0.113

4.207 3.904 3.871

9 12 17

2.99 2.89 2.69

REM with L + S T M

1 2 3

44 48 49

591 617 605

0.101 0.073 0.061

0.558 0.504 0.491

28 36 65

14.59 19.33 19.90

REM with LTM

1 2 3

45 47 50

519 677 633

0.095 0.097 0.064

0.646 0.462 0.494

42 48 66

33.50 32.31 30.71

Table 1: Numerical results for M C K P Further significant improvements for all versions of R E M are still possible, e.g., when increasing t h e factor for modifying a up to 5. T h e average deviation values for R E M 2 with LTM decrease to 0.062 and 0.317 with 769 out of 1140 instances solved to optimality, however, with a significant increase in CPU-times.

Quadratic Semi-Assignment Problem Assigning items to sets such t h a t a quadratic function is minimized may be referred to as t h e quadratic semi-assignment problem ( Q S A P ) . This problem indeed is a relaxed version of t h e well known quadratic assignment problem ( Q A P ) and may be represented in a m a t h e m a t i c a l model as follows. Given sets A = { l , . . . , m } and B = { 1 , . . . ,n} and a (not necessarily) symmetric cost m a t r i x (cjhjk)- W i t h binary variables 1

if h € B is assigned to i G A

0

otherwise

%ih

we get t h e model: m

Minimize

n

m

n

Z(x) = Y,Y.Y.^2cihjk i=i fc=i j = i fc=i

• xih • xjk

(4)

342

S. VoB

subject t o n

J2xih = l

i = l,...,m

(5)

h=l

xih G { 0 , 1 }

i = l,...,m,h

= l,...,n

(6)

Q S A P has been formulated in t h e literature by several authors with respect to different application areas like floor layout planning, certain median problems with m u t u a l communication, and t h e problem of schedule synchronization in public transit networks (see, e.g., D u t t a et al. (1982), Klemt and S t e m m e (1988), and Chhajed and Lowe (1992)). T h e problem also arises in certain scheduling problems where t h e deviation from due dates is penalized by a quadratic function. Here we focus on a real-world application in schedule synchronization in public mass transit networks where t h e objective is to minimize t h e total transfer waiting times of passengers in a mass transit system which is expressed as t h e sum of individual waiting times within given operation hours (cf. Domschke (1989), Vofi(1990), and Domschke et al. (1992)). Let there be a n u m b e r of m lines or routes (Fig. 1 shows an example for a transit network with m = 3; note t h a t a line is defined for one direction only). W i t h each route i we join a set N(i). Given a cycle t i m e <,• (in t i m e units, e.g. minutes) for route i t h e n N(i) = { 1 , . . . , <,} is a nodeset with each node giving a specific departure t i m e within t h e cycle t i m e . T h e d e p a r t u r e t i m e is t h e starting t i m e of i at its first station so t h a t all arrival and departure times furtheron may be easily calculated resulting in a complete t i m e table. All routes have to be scheduled such t h a t t h e above objective is minimized. Fixing t h e starting times corresponds to choosing exactly one representative from each set such t h a t t h e sum of all arc weights of t h e subgraph induced by these nodes is minimal. Sets A and B denote traffic lines and possible d e p a r t u r e times, respectively. T h e problem size may be referred to as ' n u m b e r of lines X cycle t i m e ' with identical cycle times for all lines. So, (4) - (6) is a suitable model for schedule synchronization. In what follows we relate specific issues of t a b u search to QSAP. A transition from one feasible solution to another one needs two exchanges within t h e binary m a t r i x (xih)mxn such t h a t conditions (5) remain valid. Accordingly moves are paired-attribute moves denoting b o t h t h e assignments and t h e type of exchange, i.e., selection for or exclusion from t h e (actual) solution. In more detail each move is described by two a t t r i b u t e s of different t y p e belonging to assignments between one element of set A and two different elements of set B. We explain t h e neighbourhood search by an example (cf. Domschke et al. (1992)): Consider a Q S A P with A = { 1 , 2 , 3 } , 2? = { 1 , 2 , 3 } , and a m a t r i x of symmetric cost coefficients given in Table 2. Fig. 1 shows an underlying transit network with three routes r = 1,2,3 corresponding with t h e elements of set A. Three possible departure times t = 1,2,3 are assumed for each route corresponding with t h e elements of set B. Fig. 2 visualizes in three different submatrices t h e objective function values (multiplied by 0.5 because of t h e s y m m e t r y of t h e problem) for all 3 3 feasible solutions. In

Tabu Search: Applications and Prospects

route 1

route 3 11

route 2 Figure 1: Transit network 1 2 3 2 3 1 2 3 1 2 3 8 9 Qs] LlJ 6 0 6 6 7 5 7 1 0 3 4 8 7 3 6 0 5 8 2 6 3 4 7 8 7 4 \o\ 8 9 5 8 5 4 0 7 7 8 7 8 1 3 2 8 9

T

t 1 1 2 3 1 2 2 3 1 3 2 3

r

1

8 9 5 1 6 0

Table 2: Cost matrix

rl — tl r t 1 3 2 3

1 14 3 22 10

r l —12 2 2 14 4 22 17

rl—13

r 3 [~6f 19 5 14 6

t 1 3 2 3

2 1 2

r 3

2 l ( 2 0 22 9 15 17

Figure 2: Example for TNM

t 1 3 2 3

2 1 2 3 13 15 12 2 15 17 19 © T 4 16

344

S. VoB

a figurative sense they represent t h e layers of a threedimensional solution cube each of t h e m with a fixed assignment of route 1 to a specific departure time. As starting solution we have a local o p t i m u m solution (1,3,1) by choosing route 1 to start at t i m e 1 (i.e. assignment r l —• 21), route 2 to start at t i m e 3 (r2 —> 23), and route 3 to start at t i m e 1 (r3 —» 21): i n = i 2 3 = ^31 = 1 a n d xTt = 0 otherwise. This solution has an objective function value of 6 (cf. Fig.2) which is calculated by t h e framed entries of t h e m a t r i x in Table 2. In addition, Fig. 2 shows all six neighbourhood solutions of (1,3,1) by entries with upper indices. T h e first and t h e second neighbourhood solution m a y be derived by varying times of route 1 with fixed times of routes 2 and 3. T h e third and fourth m a y be derived by varying times of route 2 with fixed times of routes 1 and 3 and t h e fifth and sixth by varying route 3, correspondingly. Changing to (2,3,1) is best possible (in t h e sense t h a t no b e t t e r neighbourhood solution can be found). T h e corresponding move may be described as (11,12) with the attributes 11 indicating t h a t t h e binary variable x\\ becomes 0, and 12 indicating t h a t variable X\2 gets t h e entry 1. For applying T N M we assume a t a b u list of length tl_size=4. (Note t h a t T N M is very sensitive with respect to tl_size since tl_size=2 would lead to cycling in our example.) As t h e choice of t h e first move is not restricted by t a b u attributes the move (11,12) is selected resulting in (2,3,1). By this t h e complements of the corresponding a t t r i b u t e s are stored in the tabu list preventing from any exchange of the d e p a r t u r e t i m e of route 1 as long as they stay t a b u . 2 T h u s , in t h e second iteration only four moves are allowed leading to solutions (2,1,1), (2,2,1), (2,3,2), and (2,3,3), respectively, of which (23,22) is selected increasing t h e objective function value less t h a n t h e other allowed moves. T h e t a b u list is u p d a t e d and now only allows moves according to route 3: (31,32) and (31,33). T h e b e t t e r of these two exchanges results in solution (2,2,3). As t h e new a t t r i b u t e s become t a b u t h e oldest t a b u a t t r i b u t e s are freed again. T N M for this small example continues in a very restricted manner (which is caused by the relation of problem size to tl_size) and finally reaches the optimal solution (3,1,3) in the fifth iteration. In Fig. 2 the trajectory of all performed moves is presented with arrows and Table 3 shows all necessary statistics of the five iterations. In Domschke et al. (1992) improvement procedures for Q S A P are compared. Initial feasible solutions are calculated either randomly or with different versions of a regret heuristic extended by a 2-optimal exchange procedure. To sum up t h e main results, there is not much difference between t h e three tested t a b u search m e t h o d s in solution quality if p a r a m e t e r s are chosen well. T N M asks for most exact determination of tLsize whereas CSM seems to be more robust according to nonoptimal parameters. CSM proves to be most independent from t h e starting solution quality, too. R E M performs slightly worse but steadily improves a given solution with increasing C P U time. It is t h e m e t h o d t h a t prevents from revisiting solutions best which can easily 2 Note that for TNM it is not necessary to store the attributes rt when x r l gets the entry 1. The proceeding becomes more transparent, however, when describing the moves in more detail.

Tabu Search: Applications

iteration 1 2 3 4 5

and Prospects

345

move

solution

tabu list (tabu attributes)

(11,12) (23,22) (31,33) (12,13) (22,21)

(1,3,1) (2,3,1) (2,2,1) (2,2,3) (3,2,3) (3,1,3)

12,11 12,11,22,23 22,23,33,31 33,31,13,12 13,12,21,22

Table 3: Example be guaranteed by choosing suitable parameters. Based on the procedures developed by Domschke et al. (1992) additional computational testing on large scale real-world problems has been performed including R E M 2 as proposed in Section 2 (as well as sequential testing of t h e look ahead m e t h o d proposed in Section 4). T h e d a t a represent schedule synchronization problems from three G e r m a n cities with 14, 24, and 27 routes operating in two directions each (i.e. m < 54). A modification with respect t o t h e above mentioned problem description concerns different cycle times for different routes as well as variable cycle times with 10 < t; < 80 m i n u t e s over t h e day. T h e results are quite astonishing in t h e sense t h a t an analysis of the case studies reveals very specialized d a t a , i.e., t h e underlying traffic networks have a special shape. In one case it might be characterized as a star network with a central station in t h e downtown part of t h e respective city. In t h e other two cases different lines partly use t h e same tracks implicating t h a t security distances have to be observed. Motivated by t h e modelling process above we m a y still use t h e Q S A P model with slight modifications. This is done as follows (cf. Vofi(1990)): • For any two lines using t h e same tracks calculate those combinations of depart u r e t i m e s which lead to superposition on t h e commonly used tracks, and define all weights in Q S A P corresponding to those combinations to be oo. T h e special structure of t h e problems leads to t h e following results. Nearly in all cases t h e different t a b u search m e t h o d s are able to find only slight improvements over t h e initialized solutions obtained with t h e regret and t h e simple 2-optimal exchange procedures. Even t h e modifications of R E M mentioned above do not give any additional reasonable improvements. Careful analysis of t h e smallest of t h e three examples gives us some reasoning (based on t h e branch and bound approach of Domschke (1989)) t h a t t h e t a b u search methods do not fail b u t t h a t t h e initial feasible solutions are close to t h e o p t i m u m or even equal to it. In addition, even local o p t i m a close to the o p t i m u m have a great number of neighbourhood solutions each with the same objective function value such t h a t additional testing, which is still under way,

S. VoS

346

has to deal with some more sophisticated diversification structure like a combination of REMt for larger t and the long term memory. In addition, another single-attribute approach could be tested with the attributes corresponding to objective function values. With respect to the real-world application, however, the results obtained by our algorithms (including tabu search) are quite satisfactorily and promising.

4

Concepts for Parallel Tabu Search

Parallel Machine Models Over the years a great variety of architectures have been proposed for parallel computing. The most widely known classification of parallel machine models (although somehow limited) is given by Flynn (1966). He distinguishes four general classes based on the idea of whether single or multiple instruction streams are executed on either one or multiple data set streams: • SISD (Single Instruction, Single Data) including the classical sequential computers • SIMD (Single Instruction, Multiple Data) including vector computers and array processors • MISD (Multiple Instructions, Single Data) • MIMD (Multiple Instructions, Multiple Data) with the processors performing each successive set of instructions either simultaneously (synchronous) or independently (asynchronous) The above classification of parallel machine models may lead to different classes of parallel algorithms. Vectorized algorithms operate uniformly on vectors of data sets (SIMD). Systolic ones operate rhythmically on streams of data sets (SIMD and synchronous MIMD). Parallel processing algorithms operate on a set of synchronously communicating parallel processors (synchronous MIMD). Correspondingly, asynchronous communication leads to distributed processing algorithms (asynchronous MIMD and neural networks). In addition to architectural aspects communication networks are used to classify parallel machine models. For instance, it makes a difference whether processors have simultaneous access to a shared memory, allowing communication between two arbitrary processors in constant time, or whether they communicate through a fixed interconnection network. Less formally, in certain models it is assumed that there is a master processor controlling the communication of the network, with the remaining processors of the network called slaves. For a comprehensive survey on parallel machines and algorithms see e.g. Akl (1989) and Van Leeuwen (1990).

Tabu Search: Applications

and

347

Prospects

T h e quality of parallel algorithms may be judged by a n u m b e r of quantities, t h e most i m p o r t a n t one being t h e speedup, which is t h e running t i m e of t h e best sequential implementation of t h e algorithm divided by t h e running t i m e of t h e parallel implementation executed on a n u m b e r of p processors. Similarly, given a prespecified t i m e limit (cf. footnote 1) a scaleup m a y be defined as t h e ratio of t h e average problem sizes solvable with a parallel implementation to a sequential implementation of t h e algorithm. W i t h heuristics, t h e solution quality attainable may also be measured. T h e processor utilization or efficiency is t h e speedup divided by p. T h e best one can achieve is a speedup of p and an efficiency equal to one.

Parallel Tabu Search Algorithms Due to t h e success and t h e underlying simplicity of t h e m a i n idea of t a b u search, recently some implementations on parallel computers have come up tailored to specific problems. Surprisingly, to t h e best of our knowledge, they are solely devoted to problems using t h e notion of paired-attribute moves: t h e travelling salesman problem (see Malek et al. (1989) and Fiechter (1990)), t h e job shop problem (see Taillard (1989)), and t h e quadratic assignment problem (see Chakrapani and Skorin-Kapov (1991, 1992), Taillard (1991)). In a first step we shall describe a classification of different types of parallelism t h a t is applicable to most iterative search techniques. Its basis is t h e idea of having different starting solutions (so-called balls, motivated by t h e idea of m o u n t a i n s ' like solution space where a ball is rolling to find a stable low altitude state) as well as a n u m b e r of different strategies, e.g. based on various possibilities of t h e p a r a m e t e r setting or on t h e t a b u list m a n a g e m e n t described in Section 2. • S B S S (Single Ball, Single

Strategy)

T h e algorithm starts from exactly one given feasible solution and performs its moves following exactly one strategy. • S B M S (Single Ball, Multiple

Strategies)

T h e algorithm starts from exactly one given feasible solution by t h e use of different strategies where each strategy is performed on a different processor. • M B S S (Multiple

Balls, Single

Strategy)

T h e algorithm starts from different initial feasible solutions, each on a different processor. T h e same t y p e of instruction, i.e. strategy, is performed on each processor. • M B M S (Multiple

Balls, Multiple

Strategies)

T h e algorithm starts from different initial feasible solutions performing different strategies.

348

S. VoB

In what follows we discuss t h e above ideas in more detail with special emphasis on further principles of parallelism within specific strategies. For ease of description we assume t h e notion of parallel or distributed processing algorithms.

SBSS T h e single ball, single strategy idea is t h e simplest version, and obviously corresponds to t h e idea of classical sequential computations (cf. t h e SISD-model). This, however, does not restrict t h e possibility of parallelization. Starting from an initial feasible solution, t h e best move which is not t a b u must be performed. T h e search for this move may be done in parallel by decomposing t h e set of admissible moves into a number of subsets. E.g. in a master-slave architecture each (slave) processor m a y evaluate the best move in a specific subset. T h e best move of each subset is communicated t o t h e m a s t e r who picks t h e overall best as t h e transformed solution and also performs t h e t a b u list m a n a g e m e n t . To restrict t h e amount of communication necessary for synchronizing t h e d a t a each slave could determine the best possible move in its subset without observing any t a b u list, while t h e t a b u list in t h e same t i m e is u p d a t e d by t h e master. T h e n the master picks among all answers t h e best which is not tabu. If no such move exists, a second trial must be m a d e while each processor has to receive and to observe t h e t a b u list. Otherwise t h e next iteration is to be performed. Additional ideas m a y be developed with respect to t h e specific strategies. In T N M , t h e t a b u list m a n a g e m e n t may be done by each processor itself by simply providing the most recent move (whose complement will be in t h e list). In CSM, t h e master builds t h e cancellation sequences and partitions t h e m to t h e slaves, i.e., every slave has to evaluate a certain n u m b e r of sequences. In subsequent iterations, t h e a t t r i b u t e s of t h e current moves are communicated. Whenever a cancellation sequence is reduced to 1 it will be re-communicated to t h e master.

SBMS In SBMS each processor executes a process which is one of the above t a b u search strategies with different t a b u conditions and p a r a m e t e r s , like e.g. R E M t for various t. For T N M this can be different (eventually randomly modified) t a b u list lengths; for CSM, different t a b u durations may be considered. T h e (slave) processors are halted after a prespecified t i m e and t h e results are compared and t h e best one is calculated. A restart is possible with the best or a good seed solution. Each strategy may take a different p a t h through t h e search space because of different t a b u list m a n a g e m e n t or p a r a m e t e r setting. A restart may be performed either with e m p t y running and t a b u lists or with a previously encountered list.

Tabu Search: Applications

and

Prospects

349

MBSS T h e multiple balls approaches start from at most p (the n u m b e r of processors available) different initial feasible solutions, whose calculation can vary. T h e y m a y be determined either randomly or by applying different heuristics to t h e same problem. This may also incorporate ideas involving different diversification and intensification strategies as described above. A third possibility assumes one given feasible solution and starts with a suitable subset of its transformed (neighbourhood) solutions. (Especially with R E M 2 it m a y be assured t h a t even in future iterations there is no overlap with t h e initial feasible solutions of t h e other processors.) T h e single strategy approach assumes t h e application of exactly one t a b u search algorithm with t h e same p a r a m e t e r setting for all processors. As with SBMS, t h e processes m a y be halted after a specific t i m e period to coordinate their results and possibly to initiate a restart with new (hopefully) improved solutions. If t h e processes are performed synchronously, then t h e stopping may be initiated after having generated, say, m successive moves. On synchronous M I M D machines t h e latter approach may be especially relevant. Note t h a t t h e above-mentioned possibility of parallelization within SBSS is related to a m e t h o d with m = 1 where t h e best transition is evaluated. W i t h respect to MBSS, this modifies to t h e evaluation of t h e p best moves usable for a restart. For m > 2 this approach m a y be used as a look ahead method, and is especially helpful when evaluating a region of a solution space with almost all neighbours having identical objective function values.

MBMS T h e multiple balls, multiple strategies approach subsumes all previous classes, allowing search of t h e solution space from different starting points with different m e t h o d s or p a r a m e t e r settings.

Examples In t h e sequel we sketch some of t h e ideas given in t h e previous sections with respect to well known combinatorial optimization problems. Surprisingly, as mentioned above, we only found some work on problems with t h e idea of paired-attribute moves to perform t h e neighbourhood search. Therefore, we start with respect to binary integer programming, exploiting single-attribute moves, as is t h e case for M C K P described in Section 3. Consider t h e SBSS concept. Also consider n decision variables in a binary problem with no (implicit or explicit) restriction on t h e n u m b e r of variables set to either 1 or 0. We m a y define simple ADD- or DROP-moves by complementing t h e corresponding entries of t h e binary variables x,-. Assume t h e existence of n + 2 processors with n + 2 being t h e master processor. T h e t a b u list m a n a g e m e n t is performed by processor n + 1. In any iteration of t h e search, each of t h e synchronously controlled processors

S. VoB

350

i € { l , . . . , n } receives t h e information whose variables' entry has been chosen to be exchanged as t h e most recent move. This move is performed together with t h e reversion of a:,-. This usually can be done quite efficiently by reconstructing t h e previous solution stored at i with at most one assignment complemented. T h e n i offers its objective function value to t h e m a s t e r who re-calls all results of processors referring to non-tabu moves (evaluated by processor n + 1). Obviously this approach may be generalized in various ways to t h e more general classes described above. This concept may be applied, for instance, to M C K P , to t h e well-known warehouse location problem, and to Steiner's problem in graphs. If t h e number or weighted number of variables with value 1 is limited (as for M C K P ) or fixed (as e.g. in t h e p-median problem) t h e n t h e same approach may be applied with combined A D D / D R O P - or SWAP-moves leading to paired-attribute moves. Malek et al. (1989) follow t h e SBMS approach to solve travelling salesman problems ( T S P ) by T N M with t h e 2-opt exchange as moves. T h e t a b u a t t r i b u t e s follow different strategies in t h a t they are restricted either to one or to t h e two cities t h a t have been swapped or to t h e cities and their respective positions in tour. In addition different t a b u p a r a m e t e r s were used on different processors. For another parallel t a b u search algorithm for t h e T S P see Fiechter (1990). T h e quadratic assignment problem ( Q A P ) is treated by Chakrapani and SkorinKapov (1991, 1992) by t h e use of SBSS and T N M with search intensification and search diversification performed sequentially while evaluating t h e moves in parallel. T h e set of moves is partitioned into disjoint subsets each one on a different processor as described above. T h e neighbourhood search is performed by pairwise interchanges such t h a t for 0(n2) processors available all moves can be evaluated in constant time, achieving a speedup of O ( n 2 / l o g n ) . Battiti and TecchioUi (1992) use T N M together with a hashing function and compare their algorithm also with a parallel genetic algor i t h m . Another parallel algorithm for Q A P based on T N M (with randomly varying tl_size) has been presented by Taillard (1991). It is an SBSS approach, too. T h e same idea has also been applied to t h e job shop as well as to the flow shop problem (see Taillard (1989, 1990)). T h e latter, in fact, also describes a single-attribute based implementation with attributes corresponding to objective function values. Chakrapani and Skorin-Kapov (1992) is especially relevant since its implementation is based on a connectionist approach related t o a Boltzmann machine (cf. Aarts and Korst (1989)).

5

Conclusions

In this paper we have summarized some ideas for developing parallel t a b u search algorithms. Motivated by a famous classification scheme for parallel machine models we proposed a classification scheme for parallel t a b u search algorithms. While research in this field is still in its infancy we believe t h a t reasonable achievements in the following two aspects will be provided.

Tabu Search: Applications

and

Prospects

351

• Development of a framework for a general parallel t a b u search algorithm t h a t can be applied to a wide range of combinatorial optimization problems. • Empirical results for parallel t a b u search algorithms tailored to specific problems. Some results known from t h e literature (cf. Section 4) support this feeling. Despite t h e emphasis on parallel t a b u search, sequential testing is still far from complete. Numerical results for t h e proposed algorithm R E M 2 have been reported in this paper, improving t h e previously known R E M . T h e results are quite encouraging, especially when combining R E M 2 with t h e proposed look ahead m e t h o d for m = 2. In addition, t h e t a b u search metastrategy should be tested on different classes of parallel algorithms and machine models. Especially relevant seems to be a comparison of algorithms tailored to different hardware specifications like vector computers versus synchronous and asynchronous MIMD machines. However, one should take into account identical user specifications with respect to t a b u search (e.g. p a r a m e t e r setting, definition of t h e neighbourhood). Note t h a t our classification scheme is not restriced to parallel t a b u search, but may be applied for nearly any iterative search procedure, such as simulated annealing or genetic algorithms.

References [1] E. Aarts and J. Korst (1989), Simulated ley, Chichester).

Annealing

[2] S.G. Akl (1989), The Design and Analysis Englewood Cliffs).

and Boltzmann

of Parallel Algorithms

Machines

(Wi-

(Prentice-Hall,

[3] R. B a t t i t i and G. Tecchiolli (1992), Parallel biased search for combinatorial optimization: genetic algorithms and t a b u search. Technical Report 9207-02, I R S T , Istituto Trentino di Cultura, Trento. [4] J. Chakrapani and J. Skorin-Kapov (1991), Massively parallel t a b u search for t h e quadratic assignment problem. Working Paper H a r r i m a n School for Management and Policy, S t a t e Univ. of New York at Stony Brook. [5] J. Chakrapani and J. Skorin-Kapov (1992), A connectionist approach to t h e quadratic assignment problem. Computers & Operations Research 1 9 , 287-295. [6] D. Chhajed and T . J . Lowe (1992), M-median and M-center problems with m u t u a l communication: solvable special cases. Operations Research 4 0 , S56-S66. [7] F . Dammeyer, P. Forst and S. Vofi(1991), On t h e cancellation sequence m e t h o d of t a b u search. ORSA Journal on Computing 3 , 262-265.

352

S. VoB

[8] F . D a m m e y e r and S. Vofl(1991a), Dynamic t a b u list m a n a g e m e n t using t h e reverse elimination m e t h o d . Annals of Operations Research, to appear. [9] F . D a m m e y e r and S. Vofi(1991b), Application of t a b u search strategies for solving multiconstraint zero-one knapsack problems. Working paper T H D a r m s t a d t . [10] W . Domschke (1989), Schedule synchronization for public transit networks. OR Spektrum 1 1 , 17-24. [11] W . Domschke, P. Forst and S. Vofi(1992), Tabu search techniques for the quadratic semi-assignment problem. In: G. Fandel, T. Gulledge and A. Jones (eds.), New Directions for Operations Research in Manufacturing (Springer, Berlin), 389-405. [12] A. Drexl (1988), A simulated annealing approach to t h e multiconstraint zero-one knapsack problem. Computing 4 0 , 1-8. [13] A. D u t t a , G. Koehler and A. Whinston (1982), On optimal allocation in a dist r i b u t e d processing environment. Management Science 2 8 , 839-853. [14] C.-N. Fiechter (1990), A parallel t a b u search algorithm for large traveling salesm a n problems. Paper presented at 1st Int. Workshop on Project Management and Scheduling, Compiegne. [15] M . J . Flynn (1966), Very high-speed computing systems. Proc. IEEE 5 4 , 19011909. [16] A. Freville and G. Plateau (1982), Methodes heuristiques performantes pour les problemes en variables 0-1 a plussieurs constraintes en inegalite. Publication ANO-91, Universite des Sciences et Techniques de Lille. [17] A. Freville and G. Plateau (1990), Hard 0-1 multiknapsack test problems for size reduction m e t h o d s . Investigacion Operativa 1, 251-270. [18] F . Glover (1989), Tabu search - part I. ORSA [19] F . Glover (1990), Tabu search - part II. ORSA

Journal Journal

on Computing on Computing

1, 190-206. 2, 4-32.

[20] F . Glover and M. Laguna (1992), Tabu search. Working Paper Univ. of Colorado at Boulder, to appear. [21] W . D . Klemt and W . S t e m m e (1988), Schedule synchronization for public transit networks. In: J.R. D a d u n a and A. Wren (eds.), Computer-Aided Transit Scheduling, Lecture Notes in Economics and Mathematical Systems 308 (Springer, Berlin), 327-335.

Tabu Search: Applications and Prospects

353

[22] M. Malek, M. Guruswamy, M. Pandya and H. Owens (1989), Serial and parallel simulated annealing and tabu search algorithms for the traveling salesman problem. Annals of Operations Research 21, 59-84. [23] E. Taillard (1989), Parallel taboo search technique for the jobshop scheduling problem. Working Paper Ecole Polytechnique Federale de Lausanne. [24] E. Taillard (1990), Some efficient heuristic methods for the flow shop sequencing problem. European Journal of Operational Research 47, 65-74. [25] E. Taillard (1991), Robust taboo search for the quadratic assignment problem. Parallel Computing 17, 443-455. [26] J. Van Leeuwen (1990), Algorithms and Complexity (Elsevier, Amsterdam). [27] S. Vofi(1990), Network design formulations in schedule synchronization. Working Paper TH Darmstadt, to appear.

355 Network Optimization Problems, pp. 355-362 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

The Shortest P a t h Network and Its Applications in Bicriteria Shortest P a t h Problems Guo-Liang Xue Army High Performance Computing Research Center, University of Suite 101, 1100 South Washington Avenue, Minneapolis, MN 55415,

Shang-Zhi Sun Computer Science USA

Department,

University

of Minnesota,

Minneapolis,

Minnesota, USA

MN

55455,

Abstract

Let N = (V, A, I, s) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is the set of e arcs, l(u, v) > 0 is the length of an arc (u, v) € A, and s 6 V is the source. The S h o r t e s t Path Network (SPN) is a subnetwork of N with the property that an s—u path in N is a shortest path in N if and only if it is a path in SPN. The SPN is a counterpart of the well-known S h o r t e s t Path Tree (SPT). Unlike the SPT which may not be unique for a given network, the SPN is unique for any given network. Also, the SPN provides a unified approach for solving certain kind of bicriteria or multicriteria shortest path problems where one criterion is more important than the others. We present a simple and efficient algorithm for computing the SPN with time complexity oiTSP(n, e) where TSP(n, e) is the time complexity for solving the one-to-all shortest path problem on a network with n vertices and e arcs. We also present applications of the SPN, including a unified approach for solving the maximum capacity shortest path problem, the least risky shortest path problem, and the most reliable shortest path problem.

356

1

G.-L. Xue and S. Z. Sun

Introduction

Shortest p a t h problems are among t h e most commonly encountered problems at the interface of C o m p u t e r Science and Operations Research due to their i m p o r t a n t applications in communication networks and in road transportation m a n a g e m e n t . In recent years, there have been increased interests in various bicriteriaor multicriteria shortest p a t h problems. Examples are t h e m a x i m u m capacity shortest p a t h problem, t h e least risky shortest p a t h problem, t h e most reliable shortest p a t h problem, t h e m i n i m u m cost-reliability ratio problem, and t h e quickest p a t h problem [1, 2, 3, 4, 8, 10, 11, 12]. Since t h e publication of Dijkstra's famous paper [6] in 1959, there have been m a n y m a n y papers dealing with algorithms for t h e one-to-all or all-to-all shortest p a t h problems. T h e fastest algorithm for the one-to-all shortest p a t h problem is the one provided by Fredman and Tarjan [7] which requires 0(e + log log n) t i m e on a network with n vertices and e arcs by using a d a t a structure called Fibonacci heaps. One i m p o r t a n t concept associated with t h e one-to-all shortest p a t h problem is the S h o r t e s t P a t h T r e e (SPT) [5] or t h e S h o r t e s t S p a n n i n g T r e e [9]. T h e SPT of a network TV has a nice property t h a t t h e unique p a t h from t h e source to any vertex in SPT is guaranteed to be a shortest p a t h in t h e network N. However, a shortest p a t h in t h e network N might not be a p a t h in t h e SPT. In this paper, we introduce a counterpart concept of t h e SPT called the S h o r t e s t P a t h Network (SPN) which enables a unified approach for solving certain kind of bicriteria or multicriteria shortest p a t h problems where one criterion is more i m p o r t a n t t h a n t h e other criteria and t h e goal is to find a p a t h which optimizes t h e other criteria among all the shortest paths with respect to the most i m p o r t a n t criterion. A simple and efficient algorithm for computing t h e SPN is presented together with examples of its applications. In section 2, we first observe t h e deficiency of t h e ordinary S h o r t e s t P a t h T r e e in solving bicriteria shortest p a t h problems. We then introduce t h e concept of S h o r t e s t P a t h Network and prove its existence and uniqueness. A simple algorithm is provided which computes t h e S h o r t e s t P a t h Network in t i m e TSP(n,e), where n and e are t h e n u m b e r of vertices and number of arcs of t h e network and TSP(n, e) is the t i m e complexity of solving the one-to-all shortest p a t h problem on t h a t network. In section 3, we show how t h e S h o r t e s t P a t h Network can be used as a useful tool in solving the m a x i m u m capacity shortest problem and t h e least risky shortest p a t h problems. Some conclusions are given in section 4.

2

The Shortest P a t h Network

Let N = (V, A, I, s) be a given network where G = (V, A) is a simple directed graph, V is t h e set of n vertices, A is t h e set of e arcs, l(u,v) > 0 is the length of an arc (u,v) £ A, s € V is t h e source. Applying Dijstra's one-to-all shortest p a t h algorithm [6], we may find, in t i m e 0(n2), the shortest paths from t h e source s to all t h e other

The Shortest

Path Network

and Bicriteria

Shortest

Path

Problems

357

vertices of N, together with a tree rooted at s with t h e property t h a t t h e unique p a t h from s to any vertex u in t h e tree is also a shortest s—u p a t h in N. Such a subnetwork is usually called a S h o r t e s t P a t h T r e e [5] which is formally defined as follows. D e f i n i t i o n 2 . 1 . Let N = (V, A, l,s) be a given network with a distinguished source node s. A subnetwork SPT of N is be called a S h o r t e s t P a t h T r e e of N If (1). t h e vertex set of SPT are all t h e vertices of N which are reachable from s; (2). any 5—u p a t h in SPT is a shortest s—u p a t h in A^; (3). SPT is a tree rooted at s. Given an SPT of N and a vertex u in N which is reachable from s, t h e unique s—u p a t h in SPT is a shortest s-u p a t h in N. However, t h e shortest s—u p a t h s in N may not be unique. Therefore there might be another shortest s—u p a t h in N which is not a p a t h in SPT. We are interested in a subnetwork which has t h e property t h a t an s—u p a t h in N is a shortest s—u p a t h if and only if it is a p a t h in t h e subnetwork. We will call such a subnetwork a S h o r t e s t P a t h Network and it is formally defined below. D e f i n i t i o n 2 . 2 . Let A^ = (V,A,l,s) be a given network with a distinguished source node s. A subnetwork SPN of A' will be called a S h o r t e s t P a t h Network of N If (1). the vertex set of SPN are all t h e vertices of N which are reachable from s; (2). any s—u p a t h in SPN is a shortest s—u p a t h in A^; (3). any shortest s—u p a t h in A7 is a p a t h in SPN.

T h e following theorem establishes t h e existence and uniqueness of t h e SPN and its characterization. T h e o r e m 2 . 1 . Let N = (V,A,l,s) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is the set of e arcs, l(u,v) > 0 is t h e length of an arc (u,v) 6 A, s £ V is t h e source. T h e n t h e SPN for N is unique and is t h e union of all t h e shortest s—u paths for u G V such t h a t there is an s—u p a t h in A'. P r o o f . Since l(u,v) > 0 for any arc (u,v) g A, there is a shortest 3 — u p a t h for a vertex u € V if and only if u is reachable from s. Let Union be t h e union of all t h e shortest 5—u p a t h s for u € V such t h a t there is an 5—u p a t h in N. We want to show t h a t Union is a S h o r t e s t P a t h Network. Clearly, t h e vertex set of Union are all the vertices of A' which are reachable from s and t h a t any shortest s—u path in N is a path in Union. Using the property t h a t every subpath of a shortest path is itself a shortest path, one can prove that any s—u p a t h in Union is a shortest s—u p a t h in N. This shows t h a t Union is a SPN for N. Now for any SPN of N, property (2) in t h e definition implies t h a t t h e SPN is a subnetwork of Union while property (3) in t h e definition implies t h a t Union is a subnetwork of SPN. Therefore Union is t h e unique SPN of N. •

358

G.-L. Xue and S. Z. Sun

Algorithm 2.1. Step 1. Apply any one-to-all shortest path algorithm on N. Let d(u) be the shortest distance from s to u for any u f F . Let d(u) = oo when there is no s—u path in iV. Step 2. For each u £ V if d(u) = oo then delete u from V and delete the arcs from A which are adjacent with u. Step 3. For each arc (u,v) € A if d(u) + l(u,v) > d(v) then delete (u,v) from A. Figure 1: Computing the SPN from a given network. The above proof also suggests the following algorithm for computing the SPN for a given network. It is clear that Algorithm 2.1 correctly changes the input network N = (V, A, I, s) to its unique SPN. Since steps 2 and 3 take at most 0(e) time, the time complexity of Algorithm 2.1 is TSP(n,e) + 0(e), where TSP(n,e) is the time complexity for the one-to-all shortest path problem on a network with n vertices and e arcs. Since TSP(n, e) is always greater than or equal to e, the time complexity of Algorithm 2.1 is TSP(n, e). In Figure 2 we illustrate a network and its unique S h o r t e s t Path Network. For clarity, the arcs in the SPN are drawn in thicker lines. It can be easily observed from the figure that the SPN is not a tree and that the given network has more than on SPT's.

Figure 2: A network and its Shortest Path Network.

Once the subnetwork SPN is found, the above mentioned bicriteria shortest path problems all reduce to single criterion problems on SPN. This makes the Shortest Path Network a very useful concept in bicriteria shortest path problems. In the next section, we will discuss applications of the SPN.

The Shortest

3

Path Network

and Bicriteria

Shortest

Path Problems

359

Applications

In t h e previous section, we have introduced t h e concept of SPN and presented an algorithm for computing t h e SPN. In this section, we will show how t h e SPN can be used to solve various bicriteria shortest p a t h problems in a unified approach. Specifically, we will investigate t h e m a x i m u m capacity shortest p a t h problem, t h e m i n i m u m risky shortest p a t h problem, and t h e most reliable shortest p a t h problem.

3.1

The Maximum Capacity Shortest Path Problem

Let N = (V, A, I, s, c) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is t h e set of e arcs, /(•) and c(») are weighting functions where l(u,v) > 0 is the length of an arc (u,v) € A and c(u,v) > 0 is t h e capacity of t h a t arc, and s £ V is the source. We want to find a m a x i m u m capacity shortest p a t h from s to u for all u g V , where the length of a p a t h in N is t h e sum of all t h e lengths of the arcs on t h e p a t h while t h e capacity of a p a t h in N is the m i n i m u m of all t h e capacities of t h e arcs on t h a t p a t h . Here we find our first application of t h e S h o r t e s t P a t h Network. Ignoring t h e weighting function c(«) for t h e m o m e n t , we may find t h e SPN of N with respect to / ( • ) . Now, for each u € V, there is an s—u p a t h if and only if u is a vertex of SPN. In addition, an s—u p a t h in N is a m a x i m u m capacity shortest s—u p a t h if and only if it is a m a x i m u m capacity 5—u p a t h in SPN. Therefore, t h e m a x i m u m capacity shortest p a t h problem can be solved by t h e following algorithm (see Figure 3). For convenience, we will assume t h a t t h e vertices of t h e SPN are labeled 1,2, • • •, k ( < n) where s is labeled 1. Algorithm 3.1. Step 1. Find t h e SPN with respect to / ( • ) . Delete all t h e arcs and vertices of N which are not on its SPN. Step 2. F o r i = 1 to k d o b e g i n f[i] = false; C[i] = c[l,z]; e n d ; Step 3. / [ I ] = t r u e ; C[l] = oo; For i = 1 to k — 2 do choose u £ argmax{C[w]\f[w] = false}; f[u] = t r u e ; for w = 1 t o k d o if n o t f[w] a n d min{C[u]|c[u,ti)]} > C[w] t h e n C[w] = mm{C[u],c[u,w]}; end; Figure 3: C o m p u t i n g t h e m a x i m u m capacity shortest p a t h . Step 1 takes TSP(n,e) + 0 ( e ) time. Step 2 takes 0(n) t i m e . Step 3 takes time. Therefore the t i m e complexity of t h e algorithm is 0(n2).

0(n2)

360

3.2

G.-L. Xue and S. Z. Sun

The Least Risky Shortest Path Problem

Let N = (V, A,l,s,R) be a given network where G = (V,A) is a simple directed graph, V is t h e set of n vertices, A is t h e set of e arcs, /(•) and R(») are weighting functions where l[u,v) > 0 is t h e length of an arc (u,v) £ A and R(u,v) > 0 is t h e risk of t h a t arc, s € V is t h e source. We want to find a least risky shortest p a t h from s to u for all u € V, where t h e length of a p a t h in ./V is t h e sum of all t h e lengths of t h e arcs on t h e p a t h while t h e risk of a p a t h in N is t h e sum of all t h e risks of the arcs on t h a t p a t h . As another application of t h e shortest p a t h network, we present in Figure 4 an algorithm for solving t h e least risky shortest p a t h problem. Algorithm 3.2. Step 1. Find t h e SPN with respect to / ( • ) . Delete all the arcs and vertices of N which are not in its SPN. Step 2. Solve t h e one-to-all shortest p a t h problem with respect to -/?(•). Figure 4: C o m p u t i n g t h e least risky shortest p a t h . S t e p . l takes TSP(n,e) + 0 ( e ) time. Step 2 takes TSP(n,e) t i m e complexity of t h e algorithm is TSP(n, e).

3.3

time. Therefore, the

The Most Reliable Shortest Path Problem

Let N = (V, A, I, s, r) be a given network where G = (V, A) is a simple directed graph, V is t h e set of n vertices, A is the set of e arcs, /(•) and r(») are weighting functions where l(u, v) > 0 is t h e length of an arc (u, v) £ A and r(u, v) £ (0, 1] is t h e reliability of t h a t arc, s £ V is t h e source. We want to find a most reliable shortest p a t h from s to u for all u £ V, where the length of a p a t h in A^ is t h e sum of all the lengths of the arcs on t h e p a t h while t h e reliability of a p a t h in A^ is the product of all the reliabilities of the arcs on t h a t p a t h . Now replace t h e weighting function r(») by R(»), where R(u,v) = — log(r(u, v)) for each arc (u,v) £ A. Note t h a t R(») > 0 because r(») £ (0, 1] We will call R(u,v) t h e risk of arc (u,v). T h e n it is clear t h a t t h e most reliable shortest path problem on (V, A,s,l, r) is equivalent to t h e m i n i m u m risky shortest p a t h problem on (V, A, s, I, R) which can be solved by Algorithm 3.2.

4

Conclusions

We have introduced t h e concept of S h o r t e s t P a t h Network which is a counterpart of t h e well-known S h o r t e s t P a t h T r e e . Unlike the S h o r t e s t P a t h T r e e , t h e SPN is unique to a given network with a distinguished source. One advantage of the SPN over t h e SPT is t h a t any p a t h in the network from the source is a shortest p a t h if

The Shortest

Path Network

and Bicriteria

Shortest

Path Problems

361

and only if it is also a p a t h in t h e SPN. A simple algorithm for computing the SPN is presented and its t i m e complexity is shown to be t h e same as t h a t of computing the one-to-all shortest p a t h problem. Examples are given to show the applications of the SPN in some bicriteria shortest p a t h problems. T h e SPN greatly simplifies t h e discussion/solution of certain kind of bicriteria shortest p a t h problems. Therefore, it has both education and practical value. We hope to see more applications of t h e SPN.

Acknowledgment T h e work of the first author was supported in part by the Army Research Office contract number DAAL03-89-C-0038 with the University of Minnesota Army High Performance C o m p u t i n g Research Center. T h e work of t h e second author was supported in part by t h e Computer Science D e p a r t m e n t of t h e Univeristy of Minnesota.

References [1] R.K. Ahuja, M i n i m u m Cost-Reliability Ratio Problem, Computers tions Research, Vol. 15 (1988), pp. 83-89.

and

Opera-

[2] L.D. Bodin, B.L. Golden, A.A. Assad, and M.O. Ball, Routing and Scheduling of Vehicles and Crews: T h e State of t h e Art, Computers and Operations Research, Vol. 10 (1983), p p . 63-211. [3] Y.L. Chen and Y.H. Chin, T h e Quickest P a t h Problem, Computers tions Research, Vol. 17 (1990), p p . 153-161.

and

Opera-

[4] Y.L. Chen, An Algorithm for Finding the K Quickest P a t h in a Network, revised in Computers and Operations Research. [5] R. Dial, F. Glover, D. Karney and D. Klingman, A Computational Analysis of Alternative Algorithms and Labeling Techniques for Finding Shortest P a t h Trees, Networks, Vol. 9(1979), pp. 215-248. [6] E. Dijkstra, A Note on Two Problems in Connection with Graphs, Mathematics, Vol. 1 (1959), pp. 269-271.

Numerical

[7] M.L. F r e d m a n and R.E. Tarjan, Fibonacci Heaps and Their Uses in Improved Network Optimization Algorithms, Journal of the Association of Computing Machinery, Vol. 34 (1987), pp. 596-615. [8] M. Minoux, Solving Combinatorial Problems with Combined Min-Max-Min-Sum Objective and Applications, Mathematical Programming, Vol. 45 (1989), pp. 361372.

362

G.-L. Xue and S. Z. Sun

[9] A.R. Pierce, Bibliography on Algorithms for Shortest P a t h , Shortest Spanning Tree, and Related Circuit Routing Problems, Networks, Vol. 5 (1975), p p . 129149. [10] J . B . Rosen, S.Z. Sun, and G.L. Xue, Algorithms for t h e Quickest P a t h Problem and t h e Enumeration of Quickest P a t h s , Computers and Operations Research, Vol. 18 (1991), p p . 579-584. [11] J . B . Rosen and G.L. Xue, Sequential and Distributed Algorithms for t h e All Pairs Quickest P a t h Problem, in Proceedings of the 1991 International Conference on Computing and Information, Ottawa, Canada, (Springer-Verlag, 1991), p p . 471473. [12] G.L. Xue, S.Z. Sun, and J . B . Rosen, M i n i m u m T i m e Message Transmission in Networks, in Proceedings of the 1992 International Conference on Computing and Information, May 28-30, 1992, Toronto, Canada, I E E E C o m p u t e r Society Press, p p . 22-25.

363 Network Optimization Problems, pp. 363-386 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

A Network Formalism for P u r e Exchange Economic Equilibria Lan Zhao Department 11568-0219

of Mathematics, USA

A n n a Nagurney School of Management,

SUNY/College

University

at Old Westbury,

of Massachusetts,

Amherst,

Old Westbury,

MA 01003

NY

USA

Abstract

In this paper we develop a network formalism for general economic equilibrium problems in the case of pure exchange economies. We first establish that the Walrasian price equilibrium (and its variational inequality formulation) is isomorphic to a network equilibrium problem with special structure. We then propose a general iterative scheme for the computation of the equilibrium prices, which contains, as special cases, the projection method and the relaxation method, and which allows for the full exploitation of the special network structure. Finally, we compare the numerical performance of the projection and the relaxation methods on several economic examples.

1

Introduction

Network equilibrium models have been used to formulate and study competitive phen o m e n a in a wide range of applications in operations research, m a n a g e m e n t science, and, more recently, in economics. Examples include: congested urban transportation systems (see, e.g., [1], [2], [3], [5], [6], [12], [15]), oligopolistic markets ([11]), spatial price equilibrium problems ([8], [10], [14], [17]), disequilibrium problems ([19]), and problems of h u m a n migration ([16], [18]).

364

Lan Zhao Si A n n a

Nagurney

T h e aforementioned models, however, have been exclusively partial equilibrium models in t h a t only a subset of agents/activities/commodities has been incorporated within t h e network equilibrium framework. In this paper, in contrast, we focus on t h e general economic equilibrium problem in t h e case of pure exchange, in which all of t h e commodities in an economy can be considered. Our approach utilizes the underlying special network structure of the problem, in an abstract setting, in which t h e nodes do not correspond to locations in space, and in which the links correspond to commodities. This underlying structure, heretofore unidentified and unexplored, motivates t h e subsequent algorithmic developments in this paper, with t h e u l t i m a t e goal being t h e computation of large-scale general economic equilibrium problems. As is well-known, t h e simplicial approximation m e t h o d s pioneered by Scarf [20] for t h e computation of economic equilibria, in their present s t a t e of development, cannot handle large-scale problems. In particular, we consider t h e variational inequality formulation of t h e problem recently described in Dafermos [9]. Thus far, as discussed therein, the variational inequality approach has been used to obtain only qualitative results, in the form of existence, uniqueness, and stability of pure exchange equilibria, and t h e computational analogue for this class of problems has not been addressed. In Section 2 we briefly review t h e pure exchange or Walrasian price equilibrium model. We then establish t h a t t h e problem is isomorphic to a particular network equilibrium problem with fixed demand. In Section 3 we propose a general iterative scheme for t h e computation of the Walrasian equilibrium price vectors t h a t is based on the general iterative scheme of Dafermos [7], and provide conditions for convergence. T h e Walrasian iterative scheme, as is then demonstrated in Section 4, contains, as special cases, both t h e projection m e t h o d and the relaxation m e t h o d . T h e projection m e t h o d resolves t h e Walrasian price equilibrium problem into a series of linear and s y m m e t r i c network equilibrium problems, each of which, as we also show, can be solved exactly in closed form. T h e relaxation m e t h o d , on the other hand, resolves t h e economic equilibrium problem into a series of nonlinear network flow problems, to which we then apply a network equilibration algorithm for its solution. In Section 5 we then t u r n to t h e empirical performance of t h e algorithms and compare t h e efficiency of the relaxation method with t h a t of the projection method on several economic examples. In Section 6 we summarize and conclude. T h e marriage of network theory and variational inequalities has already yielded efficient algorithms for a variety of applications characterized by their large-scale n a t u r e . This work brings a class of general economic equilibrium problems under the umbrella of network equilibrium.

A Network

2

Formalism

for Pure Exchange

Economic

Equilibria

365

The Variational Inequality Model of the P u r e Exchange Economy and its Isomorphic Network Equilibrium Representation

In this section we first briefly review t h e pure exchange economic equilibrium model and its variational inequality formulation. We then develop its isomorphic network equilibrium representation. In particular, we consider a pure exchange economy with / commodities, price vector p = (p\,p2, • • • ,Pi)T taking values in t h e positive ort h a n t R1, and with induced aggregate excess d e m a n d function z(p), with components 2 i ( p ) , . . . , zj(p). As usual, z(p) will be assumed t o be homogeneous of degree zero in p, and, therefore, we may normalize prices so t h a t they take values in t h e simplex:

S'= {p:pe tf+,I> = !}•

(!)

As is s t a n d a r d in general economic equilibrium theory, the aggregate excess dem a n d function must satisfy Walras' law: / • z ( p ) = 0,

Vp€5'.

(2)

We now state t h e definition of a Walrasian equilibrium. D e f i n i t i o n 1: A price vector p* £ S1 is called a Walrasian equilibrium if t h e market is cleared for valuable commodities and is in excess supply for free commodities, t h a t is, if p* > 0 Zi(p*) = 0 «.(P*)<0

if

P*=0.

(3)

T h e following theorem shows us t h a t Walrasian equilibrium price vectors can be characterized as solutions of a variational inequality (see, e.g., Dafermos [9] Theorem 1.1), and is included here for completeness. T h e o r e m 2 . 1 A price vector p* € S1 is a Walrasian satisfies the variational inequality z(P*)T-(p-p*)<0,

VpeS1

equilibrium

if and only if it

VI(z,Sl).

We now establish t h a t the variational inequality model VI(z, S') for the Walrasian price equilibrium problem is identical to the variational inequality problem governing a network equilibrium problem with a single origin-destination ( O / D ) pair and fixed demand. Consider t h e following network equilibrium problem: A network is given consisting of a single origin node x, a single destination node y, and with a single origin/destination pair (x,y). T h e r e are / links connecting t h e origin/destination pair

366

Lan Zhao & Anna

Nagurney

-z,(p)

Figure 1: Network equilibrium formulation of t h e pure exchange economy (x,y) (cf. Figure 1). A fixed O / D d e m a n d dxy is assumed given. Let / ; be t h e flow passing through link i; i = 1 , . . . , I, and let c; be t h e user cost associated with link i; i = 1 , . . . , /. Group t h e link loads into a vector / € R1, and t h e costs into a vector c G R . Assume t h e general situation t h a t a cost on a link may depend upon t h e entire link load p a t t e r n , t h a t is, c, = c,'(/)- T h e n / * is a user equilibrium p a t t e r n if and only if no user has any incentive to change his p a t h (which in t h e model corresponds to a link), t h a t is, mathematically, there exists an ordering of t h e links n,; i = 1 , . . . , /, such t h a t

.,(/•),•••,<>,.(/*) = A < c. + 1 (/') < ... < c„,(/«)

(4)

vhere

,, f > 0, i = l , . . . , s , '\ = 0 , i = s + l,...,l.

Jn

As shown in Dafermos ([5], [6]) t h e above s t a t e m e n t is equivalent to t h e following: A vector / * £ K is a user equilibrium load p a t t e r n if and only if it is a solution to the variational inequality

c(f')T • (f - f) > o,

V/ e K,

where

K = {f:f>0,J2f,

= dxy}.

(5)

A Network

Formalism

for Pure Exchange

Economic

We now establish t h e relationship between VI(z, librium problem. Consider t h e d e m a n d

Equilibria

367

Sl) and t h e above network equi-

t h e link load p a t t e r n f = P, and t h e user travel cost <•)

= -»(•)•

(6)

T h e equilibrium condition of t h e network with t h e cost vector defined in (6) is: Pi > 0

,_,

*••<*> { < A, if ; • = o .

/ .\ I = A, if

w

Multiplying now t h e above inequalities by p*; i: = 1 , . . . , /, summing then t h e resulting equalities, and using Walras' law, we obtain A = p'T • z(p')

= 0;

t h u s , t h e equilibrium condition (7) of t h e above network with t h e cost function defined in (6) is identical to t h e equilibrium condition (3) of t h e p u r e exchange economy. Furthermore, variational inequality (5) which governs the traffic network equilibrium problem described above coincides with VI(z,S'). Since t h e variational inequality problem VI(z, S1) and, hence, t h e Walrasian equilibrium problem is isomorphic t o t h e above user equilibrium network problem with disjoint p a t h s , we can develop algorithms for the network problem which exploit the disjoint p a t h structure in order to c o m p u t e t h e Walrasian price equilibrium.

3

A General Iterative Scheme for the Computation of Walrasian Price Equilibrium

In this section we develop a general iterative scheme for t h e computation of Walrasian price equilibria, which at each step allows for the exploitation of t h e special network structure depicted in Figure 1. In studying algorithms and their convergence, t h e s t a n d a r d assumption in the economics literature (cf. Scarf [20]) is t h a t the aggregate excess d e m a n d function z(p) is well-defined and continuous on all of S . In this paper we also make this assumption. T h e Iterative Scheme Construct a smooth function g(p, q) : S1 x S1 i-t R1 with the following properties: ( 0 g[p,p)

= -*(p),

Vpe5',

368

Lan Zhao & Anna Nagurney

(ii) for every fixed p, q 6 S', the / x / matrix Vpg(p,q) is positive definite. Any smooth function g(p, q) with the above properties generates the following algorithm. S t e p 0: Initialization Start with some p° € S'. Set k := 1. S t e p 1: C o n s t r u c t i o n and C o m p u t a t i o n Compute pk by solving the variational inequality 5(PV",)T-(I'-P*)>O,

VpeS'.

S t e p 2: Convergence Verification If \pk — p*_11 < t, with e > 0, a prespecified tolerance, then stop; otherwise, set k := k + 1, and go to Step 1. We denote the above variational inequality by VIk(g,Sl). Since Vpg(p,q) is positive definite, VI (g, S ) admits a unique solution p . Thus, we obtain a well-defined sequence {pk}. It is easy to see that if the sequence {pk} is convergent, say pk —> p", as k —+ oo, then p* is an equilibrium price vector, that is, it is a solution of variational inequality VI(z, Sl). In fact, on account of the continuity of g(p, q), VIk(g, Sl) yields -z{p'f

• (p - p*) = g{p\p')T

• (P - Pk) > 0, Vp € S<

• (p - p*) = lim gip^P^f k—*oo

so that p* is a solution of the original variational inequality VI(z, S1). The problem is now to find conditions on g(p, q) which guarantee that the sequence {pk} is convergent. Let | • | denote the usual Euclidean norm in the space Rl and let || • || denote the norm of the operator Q : G?V K-> R', \\Q\\=

™p

\Qu\

(8)

u£G$V,\u\ = l

where G(p, q) = ^MP,

I) + V P / ( P , q)),

which, in view of condition (ii), is positive definite, V = {v.ve

R', Y, Vi = 0}

(9)

•=i

and G*V = {u:u

= G'(p,q)v,ve

We now present conditions for convergence.

V}.

(10)

A Network Formalism for Pure Exchange Economic Equilibria

369

Theorem 3.1 Assume that I|G-*(PSOV,«,(PV)G-*(PW)II < i, 1

1

2

2

3

3

(ii)

k

for all {p ,q ),(p ,q ),(p ,q ) VIk(g,Sl) is CauchyinS1.

€ S'. Then the sequence {p } obtained by solving

Proof: Let p = pk+1 for VIk{g,S'),

that is,

gip^p'-'f-ip^-p^^O,

(12)

and let p = pk for V 7 t + 1 ( 5 , 5'), that is, 9(pk+\pkf-(pk~pk+1)>0.

(13)

Adding (12) and (13), we obtain M A P * - 1 ) - 0,

(14)

or (g(pk+\pk)-9(pk,Pk)f-(pk+1-Pk) < {g(pk,pk-1)-g{pk,pk))T • (Pk+1-Pk)By the Mean Value Theorem, there exists a t € (0,1), such that (g(Pk+\pk)

- g(pk,Pk)f

• (P* + 1 - Pk) = (Pk+1 -

(is)

k P

f

•Vvg(tpk + (1 - 0P* +1 ,P*) • (PM - p*),

(16)

or (g(pk+\pk)-g(pk,pk)f-(pk+1-Pk) = \(pk+1

- Pkf • (V pfl (tp* + (l -

t)Pk+\pk)

+VTpg(tPk + (1 - * ) p f c + V ) • (pk+i - Pk)-

(17)

Let Gk be defined as Gk = \(VPg(ipk

+ (1 " 0 P * + 1 . P * ) + VTpg(tpk + (1 - 0 P * + 1 , P * ) ) .

(18)

Observe that Gk is symmetric and positive definite. Using now (15), (17), and (18) yields (p*+i -pkf . Gfc(P*+1 - p * ) < (gip",?"-1) -g(pk,Pk))

• (Pk+1~pk).

(19)

370

Lan Zhao & Anna Nagurney

We define now the inner product on V as (vi, v2)k = vfGkv2,

Vvu v2 e V

(20)

which induces the norm \v\k = {vTGkvY

= \G\v\,

VeK

(21)

By applying the Mean Value Theorem, (19) yields

\pk+1-pk\l<(pk-1-pkfGLG-J1 V,ff(/, V + (1 - s)pk'1)GPGl(pk^

- pk)

(22)

for s 6 (0,1). Using the Schwarz inequality and condition (11), (22) yields \pk+1 -Pk\l < |G|-.(P* - P * " 1 ) ! • \\G-k2xVqg{p\spk

+ (1 - S ) p * - ' ) G ^ | |

•|G|(P*+1-P*)| k

k

k

k

+ {l - stf-l)G-S\\

= \p -p -'\k.,\\G-k2lVq9{p ,sp

• \Pk+1 ~

P

\.

(23)

Hence, \pk^-p\
k=l,2,...,

(24)

1

where 7 is the maximum over the compact set S of the lefthand side of (11), which is less than 1. ^From (24) we obtain

\pk+1 - A < 71/ - A V i < ... < l V -P°|0.

(25)

On the other hand, since Gk; k = 1,2,..., is nonsingular, for every (p, q) € Sl x 5', there is a /3 > 0 such that

IP^-P^/TV^-A.

VA1,P*;* = 0,1,2,....

Therefore, (25) yields fc+r-1 fc+r-1

|p f c + r -/l< £ i=k

IP^-P1'!^/?-1

£

IP''+1-P"'I,-

i=k

7 vhich shows that {p*} is a Cauchy sequence in 5 ' and the proof is complete

(26)

A Network

Formalism

for Pure Exchange

Economic

Equilibria

371

R e m a r k 1: Naturally, VI (g,S!) should be constructed in such a way so t h a t it is easy to solve. For example, when Vpg(p,q) is also symmetric, VIk(g, S1) is equivalent to t h e convex m a t h e m a t i c a l programming problem: Find p* € S' such t h a t F(p') = mm F(p), (28) pSS1

where F(p) is a strictly convex function denned by the line integral

F(P) = Jg(p,q)Tdp. Hence, any algorithm suitable for solving (28) can then be used for solving variational inequality VIk(g,S'). P r o p o s i t i o n 3 . 2 Assume that the Jacobian matrixVpg(p,q) a necessary condition for (11) to hold is that the Jacobian definite over V for any p G S1, that is, vTVz{p)v

< 0,

is also symmetric. Then matrix V z ( p ) is negative

Vt; € V, v ± 0, Vp € 5 ' .

The above condition implies that the function is (p1-p2)T-(z(P1)-z(p2))<0,

—z(p) is strictly

(29) monotone

on S , that

Vp\p2eSl,pl^p\

(30)

P r o o f : Assume t h a t condition (11) holds and select 1

2

p =p

3

=p

1

2

= q =q

3

=q

.

Note t h a t -Vpz{p)

= Vpg(p,p)

+

Vgg{p,p).

Therefore, (11) takes t h e form | | / + G-i{p,p)Vpz(p)G-l(P,p)\\

< 1.

(31)

Set B(P) = G-HP,P)VPZ(P)G-HP,P)-

(32)

Substituting now (32) into (31) and expanding t h e lefthand side of (31), we obtain ||7 + B | | 2 =

sup

\(I +

B)u\2

u6<3iv,|«|=l

sup uT(I usGiv,M=i

+ B)T{I

+ B)u = s u p ( l + 2uTBu "

+ uTBTBu)

< 1

(33)

or, 2uTBu

< -uTBTBu.

(34)

372

Lan Zhao & Anna Nagurney

Since u = G?(p,p)v, (34) yields 2vTVpz(p)v

<

= -\G-HP,P)VPZ(P)V\2

-vTV^z{p)G-i2(p,p)G-^(p,p)Vpz(p)v < 0,

W e V,p

£S',VT&

0.

Hence, 'Vpz(p) is negative definite over V for any p G 5'. The proof is complete. We would like to point out that, since z{p) is homogeneous of degree zero, Vz(p) cannot be positive definite. Therefore, z(p) is never strictly monotone on a set containing a segment of the ray originating from the origin of the /-dimensional space. However, it can be strictly monotone on the / — 1 dimensional simplex S1 (see, e.g., [9])-

4

The Projection and Relaxation Methods for the Computation of the Equilibrium Prices

In this section we show that the general iterative scheme induces a projection method and a relaxation method for the computation of the equilibrium prices. We first present the projection method and then the relaxation method. We also propose equilibration algorithms PMN and RMN for the solution of the respective symmetric network equilibrium subproblems with special structure. We note that the network subproblems induced by the projection method are characterized by linear user link cost functions, whereas those induced by the relaxation method are, in general, nonlinear. a. The Projection Method The projection method corresponds to the choice g(p,q) = -z(q) + -G(p-q),

(35)

where p is a positive scalar and G is a fixed, symmetric positive definite matrix. In this case properties (i) and (ii) are satisfied. In fact, (0 9{P, l) = ~Z(P) + \G{p - p) = -z(p), (ii) Vpg(p,q) = p _ 1 G, is positive definite and symmetric. Condition (11) then takes the form ||/ + p G - 5 V „ z ( p ) G - 3 | | < l .

(36)

The following lemma give conditions under which (36) is satisfied. Lemma 4.1 If —z(p) is strongly monotone on S1, then condition (36) is satisfied.

A Network

Formalism

for Pure Exchange

Economic

373

Equilibria

P r o o f : Let B{p) = G ? S/pz(p)G *. By virtue of t h e strong monotonicity assumption, t h e following inequality holds: vTVvz{p)v

< -a\v\2,

VveV,peS'.

(37)

Since z(p) is continuously differentiable on Sl, there is a sufBciently large number M bounding WVjzipjG'1 V p z ( p ) | | such t h a t vTVTpz(p)G-1

Vpz{p)v

<M\v\\

VpeS',ve

V.

Therefore, \\I + PB(p)\\2

=

=

sup {uT(I ueaiv,\u\=i

sup

+ PB(p))T(I

{1 + 2puTBu

+

PB(p))u}

+p2uTBTu}

u£G% V,|u|=l

= s u p { l + 2pvTVpz(p)v

+ p2vTV

< s u p { l - 2 p a | u | 2 + p2M\v\2}

T

pz

{p)G'xV

= sup{l + p\v\2{PM

pz{p)v)

- 2a)}.

(38)

T h e righthand side of (38) is strictly less t h a n 1, whenever p < | | . T h u s , condition (11) is satisfied. T h e proof is complete. R e m a r k 2: Define 8(p) = s u p j l - 2pa\v\2

+ p2M\v\2}.

(39)

We observe t h a t it is t h e value of 6(p) t h a t affects t h e speed of convergence. In fact, the smaller 6 is, the quicker the sequence {pk} converges. From (39) we know t h a t 8(p) is minimized at p = j ^ . Therefore, p = jg is t h e optimal choice for the projection method. W i t h such a selected g(p,q), each subproblem VIk(g,S') is isomorphic to t h e network equilibrium problem with linear link cost functions. In particular, we choose G to be the diagonal positive definite m a t r i x of t h e form a-y

•••

0

(40) a, where a;; i = 1 , 2 , . . . , / , is any positive number. A natural choice is to have a; = — I^lpo; i = 1 , 2 , . . . , / , in which case VI (g,Sl) is then isomorphic to t h e separable network equilibrium problem depicted in Figure 2. We now show t h a t VIk(g,S')

374

Lan Zhao &i Anna.

c

c, - « , P , • V P * " ' )

,• " I P " *

h

i
Nagurney

>

1 -ZP'

Figure 2: Network equilibrium representation of VIk(g, method

S1) induced by t h e projection

can b e solved in closed form. We first provide t h e motivation for t h e equilibration algorithm which will yield t h e exact solution to VIk(g,S'), and then its s t a t e m e n t . Let t h e components of g(p, pk_1) be given by

g.&p"-1) = -zi(pk~l) + -cute - p?-1), p

• = 1,2,.... /,

(41)

and define h,{pk-')

= -Zi{pk-')

- -*,pk-\ P

i = 1,2,..., I

Then gi&P1"1)

= -<*iPi + A,-(p t_1 ),

t = 1 , 2 , . . . , /.

(42)

If pk is a solution of VIk(g, S') (that is, p is t h e corresponding equilibrium of t h e network depicted in Figure 2), then we have ndPk,Pk

1

)=9n2(p\pk

= 9n,(pk,Pk-1)

= *

p£, > 0 , k

p =0,

i= j = s+

l,...,s l,...,l.

(43)

A Network Formalism for Pure Exchange Economic Equilibria

375

Substituting (42) into (43), we obtain A = -anipk P

+ hni,

i=

Pn. = — ( A - V ) ,

l,...,s,

1= 1,...,-.

(44)

Summing (44) over i yields k _ \„v-

:

-^phn,

E^ = vEr--E^-

(45)

Since pk € 5', p n j = 0, for all j > s, (45) yields

P 2-.=i a „. Hence, the solution p* is given by Pn, =

(A-/»..(),

P^. = 0 ,

j =

s

i = l,...,s + l,...,Z,

where A is determined through (46). In order to find A, we must know the critical index s. procedure for finding the critical index s. ^,From (43), we obtain S n , ( / y _ 1 ) = A,

if

(47) Below we describe a

p*. > 0

which implies A„,(p'=- 1 )
if p ^ > 0 ,

gitfy-1)^*,

(48)

if vi, = o

which implies MP*_1)>A> Hence, s is an index such that 1+

^

Pn. = 0 .

(49)

<*».+.-

(50)

PT.L-1^

A= — ™ — j - ^

376

Lan Zhao & A n n a

Nagurney

We are now ready to state t h e following algorithm for solving subproblem VIk(g, where g(-) is specified by (35) and (40).

S1)

A l g o r i t h m P M N (equilibration algorithm for Projection Method Network subproblems) S t e p 0: S o r t Sort t h e numbers hni;i = 1 , 2 , . . . , / , in nondescending order, and relabel t h e m accordingly. Assume, henceforth, t h a t they are relabeled. Also, define A; +1 = oo. Set L := 1. S t e p 1: C o m p u t a t i o n Compute A L =

i + p£f=1^ L

pT.Ut

1

S t e p 2: E v a l u a t i o n If hi < \ L < hi+1, and go to Step 1.

let s = L, A = XL, and go to Step 3; otherwise, set L : = L + l ,

Step 3: U p d a t e Set p? = —(A - A,-), a; P*=0,

: = l,2,...,a

j = s + l,s +

2,...,l.

T h e algorithm converges in a finite number of steps (cf. Dafermos and Sparrow [12]). b. T h e Relaxation M e t h o d T h e relaxation m e t h o d corresponds to t h e choice ff;(p>
Vi = i , 2 , . . . , ; .

(51)

In this case properties (i) and (ii) are also satisfied. In fact, (i) g{p,p) = - z ( p ) , - p -

(ii) Vpg{p,q)

•••

0

is a diagonal matrix.

=

0

••• -p , dpi

J

By recalling the properties of the aggregate excess d e m a n d function z(p), deduces t h a t it is reasonable to assume t h a t | ^ < 0 ,

Vz= 1,2,...,/.

one

(52)

A Network

Formalism

for Pure Exchange

Economic

Equilibria

377

Hence, V p j ( p , q) is positive definite. Furthermore, 0

^a

aP2

fa V,s(p,g) =

dp, dz2

0

opi

Bz,

£ and V p fl

2

8;n /• dp2^

0 |a(_|a)

3p

i

2

{pi,qi)Vqg(p2,q2)Vpg

5

(P3,9a)

dz, \~2 I dpi ' ^

(-|^) *

dzi y dp2 '

0

(53)

(-! a -r 5 (-! a r

! ^

opi '

dp; >

^

We now state: T h e o r e m 4 . 2 Lei

and assume

dzr -~— dpT

. , = mm{•

dzi. —-} dpi

(54)

that dzT > „ dzi dzk 2 dzT2 "a—
Then condition

(11) of Theorem

Vz = 1 , 2 , . . . , / .

(55)

3.1 holds.

P r o o f : Introduce t h e norm |i|oo = max;{|a;;|} in t h e Euclidean space, which leads to t h e n o r m || • H^, for any operator Q: sup \Qx\,

(56)

We use t h e norm

M? = IGMoo

(57)

in t h e proof of T h e o r e m 3.1. T h e n condition (11) in Theorem 3.1 becomes

\\G-Hp\ql)Vqg(p\q2)G-Hp\q3)\U
\\G~Hp\
(58)

Lan Zhao k. Anna

378

<max{^|—-|(-—-) dpi k^i opk

dzT'1

,, <

- — ) "PT

(-TT-) dph dzk

} dzT

\~K

\~K

m a x { ^ | — - (-•—

Nagurney

^Q^

(—TT-) °PT

}•

(59)

' t*i dp* "P* By virtue of assumption (55), t h e righthand side of (59) is strictly less t h a n 1, t h a t is, condition (11) holds. R e m a r k 3 : Note t h a t (—g 31 ) 2 (—f 2 1 1 ) 5 < 1, a n d , hence, (55) is a diagonal dominance condition which has been imposed in t h e literature to ensure t h e global stability of t h e t a t o n n e m e n t process (see, e.g., Cornwell [4]). Recalling t h a t V p g r (p, q) is diagonal and positive definite, and observing t h a t the diagonal elements — -^ depend only on p;, we see t h a t VI (g, S1) is equivalent to the separable strictly convex m a t h e m a t i c a l programming problem fP

mF(p) = mm{ s^p*-1) mm s< v pes' Jo pes' *' pes'

^•{-Ef^'.-.pi:'*^ pE-V

1

f

dp}

4

M

«

(so)

- _ . JO

which can be solved, in general, by any efficient m a t h e m a t i c a l programming algorithm. Next we design an algorithm for solving VI (g, Sl), where
and let

S t e p 1: S e l e c t i o n Select m and s such t h a t gm(p\ph-1)=m&x{g,(pn,pk-1)}, \',p">o}

or (nk~1 mVP\

-z z

=

nk~1 n" n*_1 n*_M I • • • ; P m - l > P m ; P m + l i • • • iPl )

max A-zt(pk~\

.. .,pk:},p?,pk;},...

{•.p">°}

gs(pn,pk-i)=mm{gt(pn,pk-i)},

,pk-')},

(61)

A Network Formalism for Pure Exchange Economic Equilibria

379

« -ZP1;

Figure 3: Network equilibrium representation of VIk(g, S') induced by the relaxation method

- 2 .(p{- 1 ,...,p;: 1 1 , P ;,P^ 1 1 ,...,pf- 1 ) = min{-Z,(rf-1,...,p?_-11,p?,pf+-11,...,pf-1)}.

(62)

If \gm{pn>pk~1)— ^SCP") P*_1)l < e, for e > 0 a preset convergence tolerance, then stop. The current pn is a solution of VI (g,Sl). Otherwise, go to Step 2. Step 2: Equilibration Equilibrate gm and gs by solving the following one-dimensional mathematical programming problem for 8: min \zm(p\~l,...,

Pkm~\ ,Pl-6,

piT + \, • • • , Pf" 1 ) I (63)

subject to 0 < 6 < p^. Suppose that 6" is the solution of the above minimization problem. Let P?+1=P?, ,n+l

P.

Vi^m,s, ^n Ps + SU

(64)

380

Lan Zhao k. A n n a

Nagurney

Table 1: P a r a m e t e r s for a 4 commodity, 2 consumer economy

a),w) i= 1 i =2

i =i 0.1, 20.0 0.2, 10.0

j=2 0.5, 30.0 0.4, 30.0

J =3 0.1, 6.0 0.2, 16.0

i=4 0.3, 2.0 0.2, 2.0

and go back to Step 1 with n = n + 1. T h e sequence {p"} thus obtained converges to t h e solution of VI can be seen by t h e fact t h a t F(p»+1) < F(pn)

(g,Sl),

which (65)

where F(-) is t h e objective function of (60). We conclude this section by pointing out t h e economic meaning of t h e convergence condition (36) of the projection m e t h o d and convergence condition (55) of the relaxation method: If the price of a commodity is a decreasing function of t h e demand for this c o m m o d i t y and is affected principally by the d e m a n d for this commodity, then conditions (36) and (55) can be expected to hold.

5

Numerical Examples

In this section, we illustrate t h e performance of t h e projection m e t h o d and t h e relaxation m e t h o d on several numerical examples. T h e aggregate excess d e m a n d functions in the economies are derived from Cobb-Douglas utility functions and are of the form: m

nTW'

a'

m

^•(P) = E ^ — T ( 7 ) - E « ' j .

; = !.•••.'.

(66)

where W is the vector with components {w\,... ,w}}. For completeness and ease of reproducibility, we give the d a t a for t h e examples below. E x a m p l e 1: T h e r e are 4 commodities and 2 consumers in this economy. T h e coefficients aland W'J are given in Table 1. E x a m p l e 2: T h e second example is taken from Eaves [13] and t h e d a t a are given in Table 2. In this economy there are eight commodities and five consumers. E x a m p l e 3: T h e d a t a for this example, consisting of 15 commodities and 4 consumers in the economy, are reported in Table 3. E x a m p l e 4:

A Network Formalism for Pure Exchange Economic Equilibria

Table 2: Parameters for an 8 commodity, 5 consumer economy a),w) 3 i i J J i 3 J

=1 =2 = 3 =4 =5 =6 =1 =8

i=2 i=1 i= 3 i=4 0.3,3.0 0.0,0.0 0.0,0.0 0.0,0.0 0.0,0.0 0.0,15. 1.0,0.0 0.0,0.0 .13,0.0 0.0,0.0 0.0,0.0 0.0,5.0 0.0,3.0 0.0,0.0 0.0,0.0 .73,4.0 0.0,3.0 1.0,2.0 0.0,3.0 0.0,0.0 0.0,5.0 1.0,0.0 0.0,0.0 0.0,0.0 .38,2.0 1.0,0.0 0.0,0.0 0.0,4.0 .19,0.0 1.0,0.0 0.0,0.0 .27,4.0

i =5 0.0,4.0 0.0,0.0 0.0,0.0 .47,13.0 0.0,0.0 .11,0.0 .05,6.0 .37,6.0

Table 3: Parameters for a 15 commodity, 4 consumer economy a),w) i = i 3 =2 J' = 3 J =4

i =1 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 j =5 .06, 10.0 J = 6 .06, 12.0 i = 7 .03, 14.0 i = 8 .01, 16.0 i = 9 .05, 14.0 i = io .05, 12.0 i = n .20, 10.0 J = 12 .30, 8.0 J = 13 .02, 4.0 J = 14 .04, 2.0 J = 15 .04, 2.0

i=2 .02, 10.0 .30, 1.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .05, 2.0 .02, 4.0 .02, 5.0 .02, 6.0 .02, 7.0

i= 3 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .01, 3.0 0.0, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 12.0 .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0

i =4 .20, 10.0 .00, 5.0 .01, 7.0 .02, 9.0 0.0, 1.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .02, 5.0 .02, 7.0 .02, 7.0 .02, 6.0 .04, 5.0 .06, 10.0

382

Lan Zhao & Anna

Nagurney

Table 4: P a r a m e t e r s for a 20 commodity, 4 consumer economy

a),w) J=l i=2 i=3 j =4 i=5 i=6 j=7

i =8 J =9 J = 10 J = 11 J = 12 i = 13 j = 14 J = 15 i = 16 j = 17 j = 18 J = 19 i = 20

t =1 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .06, 12.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0

t=2 .20, 10.0 .30, 8.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .05, 2.0 .02, 8.0 .03, 6.0 .04, 8.0 .06, 10.0

i=3 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .06, 10.0 .06, 12.0 .03, 14.0 .01, 16.0 .01, 14.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 12.0 .20, 12.0 .30, 4.0 .02, 4.0 .04, 2.0 .04, 2.0

z= 4 .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .06, 10.0

T h e fourth example consists of 20 commodities and 4 consumers with the coefficients given in Table 4. E x a m p l e 5: T h e fifth example consists of 25 commodities and 4 consumers in the economy and t h e coefficients are given in Table 5. Both t h e projection m e t h o d and t h e relaxation m e t h o d were coded in F O R T R A N . T h e projection m e t h o d was embedded with P M N and the relaxation method with R M N . T h e golden section method was used to solve the one variable minimization problem encountered in R M N . In the projection method we chose t h e m a t r i x G = {f^, i = 1 , . . . ,/}|p° and p = .8 for / = 4 , m = 2, p = .5 for / = 8 , m = 5, p = .5 for / = 15, m = 4, p = .1 for / = 20, m = 4, and p = .5 for / = 25, m = 4. T h e codes were implemented on an IBM 3090 at Brown University, and the F O R T V S compiler was used for compilation. Both algorithms were initialized with p° = ( j , . . . , j ) , and t h e termination criterion was \pk — pk~1\ < 1 0 - 6 . As we can see from Table 6, t h e projection m e t h o d converged faster t h a n t h e relaxation m e t h o d , even though the

A Network Formalism for Pure Exchange Economic Equilibria

Table 5: Parameters for a 25 commodity, 4 consumer economy a),w)

i= l .01, 3.0 3=1 .02, 5.0 3=2 .02, 6.0 i = 3 .02, 7.0 i = 4 .02, 6.0 i = 5 .02, 5.0 i = 6 3=7 .00, 5.0 .01, 7.0 3=8 j=9 .02, 9.0 i = io .00, 1.0 j = 11 .05, 2.0 J = 12 .02, 4.0 i = 13 .03, 6.0 3 = 14 .04, 8.0 i = 15 .06, 10.0 j = 16 .06, 12.0 J = 17 .30, 14.0 J = 18 .01, 16.0 i = 19 .05, 14.0 3=20 .05, 12.0 .20, 10.0 3=21 3=22 .30, 8.0 J = 2 3 .02, 4.0 J = 2 4 .04, 2.0 ;=25 .04, 2.0

t' = 2 .20, 10.0 .30, 8.0 .01, 3.0 .02, 5.0 .02, 6.0 .02, 7.0 .02, 6.0 .02, 5.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .05, 2.0 .02, 4.0 .02, 5.0 .04, 6.0 .02, 7.0

i= 3 .02, 6.0 .02, 5.0 .03, 6.0 .04, 8.0 .06, 10.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .06, 10.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 12.0 .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0

i =A .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .02, 5.0 .02, 6.0 .02, 7.0 .02, 6.0 .04, 5.0 .06, 10.0

384

Lan Zhao & A n n a

Nagurney

Table 6: Numerical Results for t h e Projection and Relaxation Methods Example Number

1 2 3 4 5

N u m b e r of Iterations

Relaxation 11 28 5 4 4

Projection 18 117 80 91 64

C P U T i m e (seconds)

Relaxation .20 1.64 1.02 1.97 3.49

Projection .05 .20 .11 .36 .38

number of iterations in the projection m e t h o d is larger than t h e n u m b e r of iterations in the relaxation m e t h o d , for the same degree of accuracy. This is most likely due to t h e fact t h a t t h e projection m e t h o d solves the network subproblems VIk{g,Sl) in closed form. As t h e scale of the economy becomes larger, the advantage of the projection m e t h o d becomes more significant.

6

Summary and Conclusions

In this paper we have developed a network formalism for the study of a class of general economic equilibrium problems - t h a t of pure exchange or Walrasian price equilibrium problems. We first established t h a t the Walrasian price equilibrium problem is isomorphic to a network equilibrium problem with special s t r u c t u r e and, hence, t h e corresponding variational inequality formulations are one and the same. We then turned to the computation of the equilibrium p a t t e r n s . We proposed a general iterative scheme for t h e computation of t h e Walrasian price equilibrium, which was then shown to induce the projection m e t h o d and the relaxation m e t h o d , as special cases. In particular, t h e projection m e t h o d resolves t h e network equilibrium problem into linear symmetric network equilibrium problems, for which an equilibration algorithm termed P M N was then proposed. T h e relaxation m e t h o d , on the other h a n d , resolves the problem into separable nonlinear problems, for which an equilibration algorithm named RMN was developed. Finally, we presented numerical results for five examples which demonstrated that t h e projection m e t h o d in combination with P M N consistently outperformed t h e relaxation m e t h o d in combination with R M N . This is due, in p a r t , to the simplicity of the network equilibrium subproblems which were then solved in closed form. As the scale of the problems increased, t h e relative efficiency of t h e projection m e t h o d vis a vis the relaxation m e t h o d also increased, suggesting t h a t the full exploitation of the underlying network structure of this class of general economic equilibrium problems will enable the computation of large-scale problems in practice.

A Network

Formalism

for Pure Exchange

Economic

Equilibria

385

References [I] H. Z. Aashtiani and T. L. Magnanti, Equilibria on a congested transportation network SIAM Journal on Algebraic and Discrete Methods 2 (1981) 213-226. [2] M. Beckmann, C. B. McGuire, and C. B. Winsten, Studies in the Economics Transportation (Yale University Press, New Haven, Connecticut, 1956).

of

[3] D. P. Bertsekas and E. Gafni, Projection m e t h o d s for variational inequalities and application t o t h e traffic assignment problem, Mathematical Programming Study 1 7 (1982) 139-159. [4] R. Cornwell, Introduction to the Use of General Holland, A m s t e r d a m , T h e Netherlands, 1984).

Equilibrium

Analysis

(North-

[5] S. Dafermos, Traffic equilibrium and variational inequalities, Transportation ence 14 (1980) 42-54. [6] S. Dafermos, T h e general multimodal traffic equilibrium problem, Networks (1982) 57-72. [7] S. Dafermos, An iterative scheme for variational inequalities, Mathematical gramming 15 (1983) 40-47.

Sci-

12

Pro-

[8] S. Dafermos, Isomorphic multiclass spatial price and multimodal traffic network equilibrium models, Regional Science and Urban Economics 16 (1986) 197-209. [9] S. Dafermos, Exchange price equilibria and variational inequalities, Programming 4 6 (1990) 391-402.

Mathematical

[10] S. Dafermos and A. Nagurney, Sensitivity analysis for t h e general spatial economic equilibrium problem, Operations Research 3 2 (1984) 1069-1086. [II] S. Dafermos and A. Nagurney, Oligopolistic and competitive behavior of spatially separated m a r k e t s , Regional Science and Urban Economics 17 (1987) 245-254. [12] S. C. Dafermos and F. T. Sparrow, T h e traffic assignment problem for a general network, Journal of Research of the National Bureau of Standards 7 3 B (1969) 91-118. [13] B . C. Eaves, W h e r e solving for stationary points by L C P s is mixing Newton iterates, in Homotopy Methods and Global Convergence, B. C. Eaves, F . J. Gould, H. O. Peitgen, and M. J. Todd, editors (Plenum Press, New York, 1983) pp. 63-78. [14] M. Florian and M. Los, A new look at static spatial price equilibrium models, Regional Science and Urban Economics 12 (1982) 579-597.

386

Lan Zhao & Anna Nagurney

[15] M. Florian and H. Spiess, The convergence of diagonalization algorithms for asymmetric network equilibrium problems, Transportation Research 16B (1982) 477-483. [16] A. Nagurney, Migration equilibrium and variational inequalities, Economics Letters 31 (1989) 109-112. [17] A. Nagurney and D. S. Kim, Parallel computation of large-scale dynamic market network equilibria via time period decomposition, Mathematical and Computer Modelling 15 (1991) 55- 67. [18] A. Nagurney, J. Pan, and L. Zhao, Human migration networks, European Journal of Operational Research 59 (1992) 262-274. [19] A. Nagurney and L. Zhao, A network equilibrium formulation of market disequilibrium and variational inequalities, Networks 21 (1991) 102-132. [20] H. Scarf (with T. Hansen), Computation of Economic Equilibria (Yale University Press, New Haven, Connecticut, 1973).

387 Network Optimization Problems, pp. 387-401 Eds. D.-Z. Du and P.M. Pardalos ©1993 World Scientific Publishing Co.

Steiner Problem in Multistage Computer Networks Sourav B h a t t a c h a r y a Bhaskar Dasgupta 1 Computer Science Department,

University

of Minnesota,

Minneapolis,

MN

55455

Abstract

Multistage computer networks are popular in parallel architectures and communication applications. We consider the message communication problem for the two types of multistage networks: one popular for parallel architectures and the other popular for communication networks. A subset of the problem can be equated to the Steiner tree problem for multistage graphs. Inherent complexities of the problem is shown and polynomial-time heuristics are developed. Performance of these heuristics is evaluated using analytical as well as simulation results.

1

Introduction

Multistage interconnection networks (MINs) are popular among parallel architecture a n d / o r communication network topologies. An N x logiN element MIN consists of log2N stages of N elements each. A common pictorial view of an N x log2N MIN is to collect N elements in a stage (vertically) and arrange log^N + 1 such stages horizontally one after t h e other. MINs offer a good balance between network cost and performance. They are often characterized as intermediate {0(N x logzN)} cost networks falling within t h e two e x t r e m e cases: fully connected {0(N2) cost} and bus connected {O(N) cost}. Architectural and other topological properties of MIN may be found in [8]. 'Supported in part by NSF grant CCR-9208913

388

S. Bhattacharya

1.1

and B.

Dasgupta

T w o Versions of M I N s

Let Sij denote t h e z'-th stage j ' - t h row element in an N x log^N MIN, 0 < i < log2N,0 < j < N — 1. We consider source-to-source wrap-around MINs only, i.e., when V; : SQJ = Siog:2N,j- These networks can allow multiple passes using t h e wraparound connections. Depending on t h e role of intermediate stage elements, two types of MINs are possible as outlined below: • Intermediate stages as switches only: This t y p e is popular in parallel architecture applications. Here t h e source end (leftmost) and t h e destination end (rightmost stage) constitute of processors, while t h e intermediate elements are bare switches which interconnect various sources and destinations. Such MINs are of commercial usage in parallel processors, e.g., t h e BBN Butterfly machine. We refer to MINs of this t y p e as type-1 MIN. • Intermediate stages as processors: This t y p e is common in communication network applications. Here t h e intermediate stage elements are identical to t h e source or destination stage processors, i.e., they can have their own message traffic. Example of such MINs can be found in [10]. We refer to MINs of this t y p e as type-S MIN.

1.2

C o m m u n i c a t i o n in M I N s

Depending on the n u m b e r of destinations involved in a communication in MIN, three types can be classified: one-to-one, one-to-many and one-to-all. These are commonly known as routing, multicast and broadcast. In this article we focus ourselves to the multicast problem for MINs. Note t h a t routing k, broadcast are two special instances of multicast and do not offer any opportunity for traffic reduction. T h e multicast problem specifies a source node and a set of k destination nodes. W i t h o u t loss of generality we assume t h e source node to be So,o- Destination nodes are spread over t h e MIN, l < k < N ( k = l = routing, k = N = broadcast). Objective of t h e multicast problem is to transmit t h e message from the source node to t h e destination nodes.

Flow-control Mechanism For multihop networks, various form of switching and flow-control mechanisms have evolved. Store and forward is a traditional approach to message communication. Virtual cut-through, wormhole, deflection routing etc. have been subsequently proposed. A survey can be found in [11, 4]. We assume packetized message communication, where packets are independently flown through t h e network. Our focus is to estimate (and possibly reduce) the overall traffic overhead in message communications.

Steiner Problem

1.3

in Multistage

Computer

Networks

389

Optimality Criteria in MIN Multicast

Two possible criteria to measure the optimality of MIN multicast communication are to minimize one of t h e following two objective functions: • t h e total traffic generated in t h e network ( each occupied link of t h e network counts as one unit of traffic. • the hops-distance between t h e source node and any destination node. T h e traffic metric makes t h e problem equivalent to t h e Steiner problem for MIN, while t h e time metric is a different dimension altogether. These two metrics work in the dual sense. Reducing one increases the other and vice versa. T h u s , we focus on the traffic metric only. Considerations along t h e time metric is an open problem.

2

Multistage Interconnection Networks

We consider type 1 MINs with t h e cube network topology. These class of networks (e.g., baseline, delta, generalized cube, indirect binary-cube, omega, banyan [8]) have been proposed as fixed-degree alternative to hypercube architecture. T h e y are popular in switching and communication applications. They can also emulate t h e performance of hypercube in most applications (e.g., t h e CCC architecture [12]). Let MINd denote a d dimensional generalized MIN.

2.1

Formulation of the Traffic Reduction Problem.

We consider multicasting on MINd which are unique path networks. Given a set of k multicast destinations (Di, 1 < i < k) and a source node S in MINd, t h e p a t h from S to any particular D{ is fixed. However, it is clear t h a t for a given set of multicast destinations, t h e total traffic generated in MINd depends on t h e relative order in which d different dimensions are arranged. This leads to our problem formulation as (see Section 2.1.1 for practical applicability): Given a set of destination nodes, traffic optimum multicasting in MINd is to find a permutation of the d dimensions (each stage of MINd is allocated to one particular dimension value) so that the total traffic is minimized. Unfortunately, this problem is NP-complete as shown by the next theorem. Hence, we need to investigate t h e possibility of designing efficient heuristics for this problem. T h e o r e m 2.1 The traffic optimum

multicasting

problem is

NP-complete.

Proof sketch: T h e problem is obviously in NP. To show NP-hardness one can reduce the space minimized full trie problem, which is shown to be NP-complete in [3, 6], to this problem. Details are available in [1]. •

390 2.1.1

S. Bhattacharya

and B.

Dasgupta

Design Issues

Any hardware implementation of a MINj would assume an ordering among t h e d dimensions. In such cases, online dimension ordering (as required by t h e traffic reduction criterion in this paper) in a MINj may be argued from t h e practical viewpoint. We identify t h e following situations as practical applications. (1) Communication networks often use MINs. Traditional hardware implementation of switches at every intermediate stages have been replaced using Wave-Time Division Multiplexors ( W T D M ) over passive stars [5]. T h e actual interconnection is formed by wavelength (frequency) or time-slot assignment of different nodes, i.e., by firmware control. A firmware controlled design can be changed without changing t h e underlying hardware. T h u s , it is possible to re-order t h e dimensions in a MINd dynamically. Every stage may have to configure to at most d possible dimensions, for which t h e w a v e / t i m e assignments can be pre-computed and stored. (2) If t h e traffic p a t t e r n is known and repetitive (as m a y happen in periodically occurring similar message communications) then from t h e above o p t i m u m dimensional ordering for each multicasting instance one can derive the most common p a t t e r n and design t h e MINj using t h e corresponding o p t i m u m dimensional ordering. T h e idea here is to achieve traffic optimality for most multicasting instances which leads to an overall traffic reduction. (3) Hierarchical hypercubes are designed for several practical reasons [9]. Such hierarchical designs limit the availability of different dimensions at any node. Only a certain set of dimensions can be availed at each node. This imposes a hierarchy among dimensions in a r o u t i n g / multicasting operation. In some other cases, even with complete hypercubes r o u t i n g / multicasting is done in hierarchical fashion, imposing a (arbitrary) desired ordering among dimensions [4]. W i t h these applications our results and optimality ordering among dimensions can be used as a measure whether or not a particular multicast operation is generating optimal traffic. Note t h a t a hypercube with hierarchically ordered dimensions can be treated as a MINd for analysis purpose and results from t h e latter can be used for t h e former.

2.2

Greedy Heuristic

Let Reachp denote t h e number of nodes which received a copy of t h e message at stage p. Let ki be t h e dimension between stage p and stage p+1. We define an expansion ratio Fkd = ^ J ' . Intuitively, this fraction Fkd indicate how much t h e size of t h e multicast destination is increasing at every stage. This expansion ratio depends of the dimension, stage position and on t h e set of all prior dimensions served already. For the sake of brevity we treat this all previous information as part of t h e stage information and denote compactly using t h e stage n u m b e r position. Now, t h e total traffic equals

Steiner Problem

in Multistage

Computer

Networks

391

P=d

J2 Rtachp

= Fki x [1 + Fk2 x [1 + . . . + Fkd_, x [1 +

Fkd]...]}

Our objective is to arrive at values of k\, k^, ..., kj, such t h a t t h e above expression is minimized. Note t h a t Fki (1 < i < d) varies between 1 and 2 and have real values. We propose t h e following greedy algorithm which works stage by stage and selects one dimension in each stage. After d iterations of t h e algorithm t h e complete p e r m u t a t i o n of d dimensions in MINd are generated. Greedy heuristic: At each stage select dimension ki, where Wj, Fkj > Fki. In case of a tie, anyone of the smallest Fki dimension may be chosen. A particular dimension fc; used in a preceding stage may not be repeated in a subsequent stage. Regarding t i m e and space complexities of t h e above heuristic, it is easy to prove the following theorem. T h e o r e m 2.2 The greedy heuristic runs in 0(k.d3) arithmetic operations and uses O(k.d) space.

0(k.d4)

time, performs

bitwise

T h e following theorem, whose proof can be found in [1], shows local optimality of dimension ordering of t h e greedy heuristic. T h e o r e m 2.3 The greedy heuristic leads to a locally optimal ordering of the dimensions, i.e., orders the dimensions so that no a d j a c e n t p a i r w i s e i n t e r c h a n g e of the dimensions can minimize the total traffic. Table 1 compares t h e t i m e and space complexities of t h e greedy heuristic with two other known exponential t i m e o p t i m u m strategies[l]. Algorithm

T i m e (in bit operations)

Space (in bits)

Direct P e r m u t a t i o n Dynamic P r o g r a m m i n g Heuristic

0(k • d3 • d\) 0(k • d2 • 2d) 0(k • d4)

0(k • d)

0(k-d

+ 2d-d)

0(k • d)

Table 1: Multicast in MINd with k destinations: S u m m a r y of t i m e and space complexity of different solutions.

392

S. Bhattacharya

2.3

and B.

Dasgupta

Performance

This section analytically compares t h e greedy heuristic with "randomly ordered dimensions" approach as well as t h e optimal algorithm. Detailed proofs of all t h e results are available in [1]. For our analysis purpose, we characterize the destination node set into two classes: t h e "complete subcube multicast" and the "incomplete subcube multicast". Each one of these cases are explained below and worst (average) case performance of the greedy heuristic is compared with t h a t of optimal algorithm as well as r a n d o m dimension ordering. Let traffic "overhead" comparison between two approaches be denoted as the absolute difference in t h e traffic generated by those two individual approaches 2 . For example, Overhead(greedy, o p t i m u m ) = greedy traffic o p t i m u m traffic, Overhead(random, greedy) = traffic in "random ordering" - greedy traffic.

2.3.1

Complete Subcube Multicast

In this case t h e set of multicast destinations, D;, 1 < i < k, forms a complete subcube. T h u s , i = 2 r , for some integer r and t h e set [Di] can form a r-dimensional subcube. Let CSM(r) denote this situation. T h e o r e m 2 . 4 The greedy heuristic produces optimum multicast case. Also, in the worst CSM(r) case,

traffic for the complete

subcube

— r) x (2 r — 1).

Overhead(random,optimum)=Overhead(random,greedy)=(d

T h e next theorem gives t h e probabilistic traffic overhead of " r a n d o m dimension ordering" as opposed to t h e worst case performance as stated above. T h e o r e m 2 . 5 In the average CSM{r) case, the random dimensions ordering approach incurs a traffic overhead Overhead(random, optimum) = Overhead(random, greedy) = J2p=[ QAp) X {NP - N0), where

Np = ] T 2 ; + 2 p x (d - r) + 2" x £ 2 '

J • (d - r) • (d + p - r - 1)! • (r - p)\ Qr(p) = ^ 2

7[

A random dimension ordering occurs when d dimensions of MINj

are randomly ordered.

Steiner

Problem

in Multistage

Computer

2.3.2

Incomplete Subcube Multicast

393

Networks

In this case not all t h e 2 r destination nodes are present in t h e set of multicast destinations. T h u s , t h e set of multicast destinations (£),, 1 < i: < k) forms an incomplete r-dimensional subcube, where r is t h e m i n i m u m dimension value t o include all those destination nodes. Let ISM(r) denote this situation. T h e following theorems give worst case and average case performance ratios of various strategies. First we consider t h e case when A: is a power of 2. T h e o r e m 2.6 In the case of incomplete Overhead(greedy, optimum) Overhead (random, greedy)

r-subcube multicast < <

with k = V

destinations,

((r — j) x (k — 1)) [(d — 1r + log2k) x (k — 1)]

Since j = log2k, Overhead(greedy,optimum)~ (r — log2k) x k. For a given r, this value is maximized when k = 2 r _ 1 . T h u s , t h e worst case performance degradation suffered by t h e greedy heuristic equals 2 r _ 1 . Next we generalize t h e value of A; to a non-power of 2. L e m m a 2.7 In ISM(r), with k = 23 + I nodes (0 < I < V - I), optimum) = A(d,k,r,j), where A(d,k,r,j)

=

[(d-r) + E S

a

2 J + 1 +{r-j-l)

+ 1

Overheadfgreedy,

2' + (r-i-l)xfc]-

[(d-r) + (r~j) + r=^']

and, Overhead (random, greedy) = B(d,k,r,j), B(d,k,rJ)

= w

x (Jfc-1) where

[ ( r - j ) + EJlj2''+ (
T h e following l e m m a gives estimates for t h e average performance. Let

k We reuse notations of t h e previous l e m m a for brevity. L e m m a 2.8 In ISM(r), Overhead(greedy, Overhead(random,

with k = V' + I nodes (0 < I < V -1), on an average optimum) greedy)

= =

case:

Ep=o~ Pk=v{r) x A(d, k,r,p) Ep=o~ Pk=2p{r) x B(d,k,r,p)

T h e above result for t h e average case is based on equal distribution of destination nodes among t h e given nodes.

S. Bha.ttacha.rya

394

2.4

and B.

Dasgupta.

Simulation Performance

Often a MINd design implicitly assumes ascending or descending order among its d dimensions. For example, in a MIN3 an increasing order would be [0,1,2], while a decreasing order would b e [2,1,0]. Such is also t h e usual practice in hierarchical routing in hypercube, where dimensions are treated one after another alike distinct stages of MINd- Clearly, these linearly ordered dimensions approaches cannot generate traffic optimal multicasting in MIN. We compare performance of t h e greedy heuristic with t h e linearly ordered dimensions heuristic approach and demonstrate t h e advantages. We show for randomly generated M multicast destinations how much traffic can b e reduced (on an average) if t h e locally optimal greedy dimension ordering approach proposed in this paper is followed. We present simulation results towards this. Our simulation implemented four situations: t h e exhaustive o p t i m u m traffic generation approach, greedy approach, linearly increasing and linearly decreasing (hereafter referred as 'increasing' and 'decreasing' respectively). Performance of t h e "increasing" and "decreasing" cases are found similar, and hence we report only t h e "increasing" case. We consider three different dimension values 4, 5 and 6 (i.e., MIN4, MINS and MINe). N u m b e r of multicast destinations are varied as 1%, 2%, 5%, 10%, 20%, 50%, 80%, 90%, 9 5 % and 99% of t h e total number of nodes in t h e cube. For each multicast set size 30 r a n d o m distributions were generated and averages taken to introduce the effect of large numbers. Thus, each algorithm is run 300 times with every dimension - leading to a total of 3600 experiments. T h e proposed greedy algorithm produces optimal multicast traffic in most ( « 90%) of t h e cases. T h u s t h e greedy heuristic is "almost always" o p t i m u m . We present simulation results showing t h e miss-rate, i.e., t h e percentage of test runs in which t h e proposed greedy algorithm does not coincide with t h e o p t i m u m multicast algorithm. Even in cases where t h e greedy algorithm deviates from o p t i m u m solution, the deviation is found to be small. This section also shows the relationship of optimal traffic cost (T) to t h e n u m b e r of multicast destinations (M). We present simulation results to show variation of T for different values of M. Let To, Ta and T/ denote t h e traffic generated using t h e optimum (exhaustively generated), greedy and linearly ordered increasing approaches respectively. T h u s , t h e traffic overhead using greedy and increasing approaches equal (TQ — TO) and (Tj — To) respectively. F i g . l a shows the average value of these p a r a m e t e r s for MINd with d=i, 5, 6. Let "miss-rate" be defined as t h e percentage of simulation runs in which a particular heuristic approach (e.g., t h e greedy approach or t h e increasing approach) differs from t h e o p t i m u m approach. This percentage shows the frequency by which the heuristic deviates from the o p t i m u m solutions. A low value of miss-rate indicates t h a t the corresponding heuristic is 'almost always' o p t i m u m . F i g . l b shows t h e missr a t e greedy, increasing heuristics for different dimensions. O b s e r v a t i o n 1: T h e greedy approach misses t h e o p t i m u m solution rarely (e.g.,

Steiner Problem in Multistage Computer Networks

Traffic Overhead

395

Traffic Overhead

Traffic Overhead

Dimension«4

I

I

Dimensioned

t T 1—I 10 20 50 80 90 95 99

,

10 20 50 80 90 95 99

J

Destinations (percentage)

Destinations (percentage)

10 20 50 80 90 95 99

Destinations (percentage)

a) Miss-Rate 80%-

Dimension=4

Dimension*^)

Dimensi<£n=6

80%-

\

i

60%—

60%-

40%—

40%-

I

/ ,^VS \

5 10 20 50 80 90 95 99

Destinations (percentage)

5

\

\

I

20%-

-r-t

\

, 1 /MVS ;

10 20 50 80 90 95 99

5

Destinations (percentage)

10 20 50 80 90 95 99

Destinations (percentage)

b)

Figure 1: Greedy (solid line) and Increasing (dashed line) heuristic performance: a) Traffic overhead (—traffic in heuristic - optimal traffic), b) Miss-rate.

Traffic (scaled by size)

Traffic (scaled by dimension and size) 1 -

0.80.63-

•

t •

\ t

\

V-t:' \ \

0.420.2-i—i—i—i—i—i—i—i—i—r1 2 5 10 2 0 5 0 8 0 9 0 95 99

Destinations (percentage)

a)

I1*

5

10 20 SO 80 90 95 99

Destinations (percentage)

b)

Figure 2: Optimum traffic (per destination node) variation with different multicast sizes (solid line for dimension=4, dashed line for dimension=5, dotted line for dimen-

396

S. Bhattacharya

Traffic Overhead

and B.

Dasgupta

Traffic Overhead

(scaled) 0.5-

£

0.4-

/ :- *

0.30.2-

A.

7

-i—i—i—f—f—i—rI 2 5 10 20 50 80 90 95 99

Destinations (percentage)

_

\

| \ ^ ,

0.1— i

i

l

-H 10 20

i

i

i

i V"

50 80 90 95 99

Destinations (percentage)

b) a) Figure 3: a) Traffic overhead of Greedy heuristic, b) Scaled traffic overhead (soild line for d i m e n s i o n = 4 , dashed for 5, dotted for 6). 1 out of 30 cases or at most 2 out of 30 cases). T h u s it is an 'almost always o p t i m u m ' algorithm. T h e n u m b e r of mismatches increases as t h e dimension increases. However, t h e existing dimension ordering approaches (e.g., increasing or decreasing) have high miss-rate. O b s e r v a t i o n 2: For small (or large) number of multicast destinations all three heuristics yield o p t i m u m (or near o p t i m u m ) traffic solutions. This is because, for small n u m b e r of destinations expansion ratio (refer Section 2.2) is almost always 1 and regardless of t h e actual heuristic used, a near-optimum strategy is effected. Similarly, with large fraction of nodes as destinations, t h e expansion ratio is almost always 2 and regardless of t h e actual heuristic used, a near-optimum strategy is effected. O b s e r v a t i o n 3 : Traffic overhead of greedy approach is superior to "increasing" approach. O b s e r v a t i o n 4 : Traffic overhead of greedy approach increases with dimension (Fig. 3a). This is expected since at higher dimensions, each source-destination multicast involves larger traffic a m o u n t . At t h e same t i m e it can also be a t t r i b u t e d to the inherent characteristic of t h e greedy approach, i.e., t h e greedy approach deviates from optimality more with increasing dimension. These two factors are distinguished by scaling t h e traffic overhead using dimension value. T h e idea is to normalize t h e traffic overhead using the corresponding dimension value. T h e scaled traffic overhead (Fig. 3b) using t h e greedy heuristic for different dimensions also show t h a t traffic overhead of greedy approach increases with dimension. Fig.2 shows t h e o p t i m u m traffic load variation for different multicast sizes. T h e idea is to explore relationship (if any) between t h e total amount of traffic (T) required for a M destination multicast. We show the average traffic (i.e., traffic per destination node) reported from our simulations for cube sizes 4, 5 and 6. Then, we scale the traffic requirement using the corresponding dimension value. This plot also shows

Steiner Problem

in Multistage

Computer

similar trend as t h e absolute traffic plot. O b s e r v a t i o n 5: O p t i m u m multicast destination size. Initially, with small M, of traffic per stage. This is because t h e average no common message traffic can be increasing M this ratio decreases until M range of M t h e value of ' o p t i m u m traffic' a constant indicating high availability of share message links most efficiently.

3

Networks

397

traffic decreases with increasing multicast every destination requires nearly one unit destinations nodes are sparse and on an shared by two destinations. However, with reaches 50% of t h e cube size. Beyond this per multicast destination becomes almost destination nodes and frequent ability to

Multistage Communication Networks

We consider type 2 MINs in this section. Among several topologies we choose the shuffle connection and t h e multistage binary cube connection. W i t h o u t loss of generality we assume t h a t SQ^ is the source node, while t h e k destinations nodes are spread over t h e (log2N + 1) stages and N rows. Type-1 MINs are unique p a t h networks. This required us to re-order dimensions in order to have traffic reduction. T h e online dimension re-ordering led to practical feasibility questions (which is addressed in Section 2.1.1) and related issues. However, type-2 MINs are not unique p a t h networks. They allow multiple paths between source and any destination. Hence, traffic reduction can be achieved even without any online topological reconfiguration.

Optimality Criterion Given a,n N x log2N type-2 multistage communication network (connected using a particular topology T) with a source node S and k destination nodes D{ (1 < i < k), t h e objective is to find a p a t h from the source node to each one of t h e destinations such t h a t one of t h e following objective functions is minimized: • Total

traffic.

• Time (in hops) between source and each destination. T h e first objective equates the problem to t h e Steiner tree problem for t h e topology T. T h e second objective has not been investigated so far.

3.1

Multistage Shuffle Network

We consider a shuffle network with N x log2N P E s , arranged as log?N stages of N P E s each. Let such a network be called a log2N-sh\iffie, Let PEi:j be t h e 2-th row P E in the j - t h stage, 0 < i < N - 1, 0 < j < log2N - 1. log2N-Shuffte is a cyclical

398

S. Bhattacharya

a n d B.

Dasgupta

Figure 4: NP-completeness: a) Example 3-stage shuffle network, b) Restricted instance of t h e shuffle graph - dotted lines indicate zero-cost edges - nodes which are grouped together are tagged with identical integer, c) Equivalence to a 3-cube multicast. wrap around network, with log2N-th stage = stage 0. Formally, t h e binary shuffle connectivity is defined as (iV = 2") [10] PEij

has outgoing links to PEpi+p)mod

jv,(j+i)mo
T h e o r e m 3 . 1 The problem of traffic optimal shuffle multicast complete.

tree generation

is NP-

Proof sketch: It can be shown t h a t a special case of this problem is t h e problem of optimal traffic multicast for hypercube (which is known as NP-complete[7]) by setting the costs of some edges to zero and thereby identifying some nodes together. Details can be found in [2]. Fig. 4 pictorially depicts the idea. D

3.2

Multistage Cube Network

We consider a multistage cube network with N x log2N P E s , arranged as log2N of N P E s each. Let such a network be called a log2N-sta.ge cube. Let PEij i-th row P E in t h e j-th stage, 0 < i < N — l , 0 < j < log2N — 1. log2N-sta,ge a cyclical wrap around network, with log2N-th stage = stage 0. Formally, t h e multistage connectivity is defined as (N = 2")[8] PEij

has outgoing links to

PEi^j+i) mori n and PEi+(1-2r)x2>,u+i)m<,dn

stages be t h e cube is binary

where r = i-th bit in j

Note t h a t this multistage cube network is a particular instance of t h e generalized MINd considered in Section 2, where t h e dimensions are increasing from left to right. Also, intermediate stage nodes are active P E s here, unlike in t h e MINd of Section 2 (which are b a r e switches).

Steiner

Problem

in Multistage

Computer

Networks

399

define Dist(D,a) = MIN { H{Dj,a) : Dj e D } Vn, (n $ D) A (3Z»j e £ : H(Dj,n) = 1) Do courai(n) = \{dk <E D I : Dist(D U {n},djt) < H i s ^ D , ^ ) } ! select n such that counifra) is maximum;

Figure 5: Greedy algorithm for type-2 MINs: selection of t h e next step message recipient node. T h e proof of t h e following theorem is essentially similar to theorem 3.1. Details can be found in [2]. T h e o r e m 3 . 2 The optimal traffic multicast

3.3

problem is

NP-complete.

Greedy Heuristic

We developed a greedy heuristic for type-2 MINs. This heuristic is applied to b o t h t h e type-2 shuffle MIN as well as type-2 multistage cube MIN. We describe this greedy heuristic first and then d e m o n s t r a t e its performance using simulation results. T h e greedy heuristic is an iterative process selecting one node in each iteration. Every t i m e a node is selected it is included in a set D (representing t h e set of nodes which received a copy of t h e message so far). Initially D set only includes S, i.e., the source node. T h e algorithm stops when Vi 1 < i < k Z), £ D. Let DL denote t h e set of k destinations, and H(a, 6) equal t h e shortest distance between nodes a and b. Let Dist(D,a) be a function indicating t h e shortest distance from any node in D to a. Each step of t h e greedy iteration chooses a node n as in Fig. 5.

3.4

Simulation Performance

We simulated t h e greedy algorithm in t h e multistage shuffle MIN as well as in t h e multistage cube MIN. Its performance is compared with t h e exhaustively generated o p t i m u m algorithm in t h e respective architectures. T h e n u m b e r of destinations is varied as 1%, 2%, 5%, 10%, 20%, 50%, 80%, 90%, 95% and 99% of t h e total system size. In each case 50 random set of destinations were generated; b o t h t h e greedy & o p t i m u m algorithms are run and their results compared. Two metrics are used to characterize performance of t h e greedy heuristic: t h e number of miss and t h e average overhead. T h e former denote how often (out of t h e 50 runs) t h e greedy algorithm fails to produce optimal result, while t h e latter indicates average deviation of t h e greedy result when a miss occurs. A low miss r a t e indicates t h a t t h e greedy algorithm is almost always optimum, while a low average overhead indicates t h a t t h e greedy algorithm almost always produces a near-optimum solution. Let TG (TO) denote t h e traffic produced by t h e greedy (optimum) algorithm.

400

S. Bhcdta.cha.iya • Number

of miss = \TQ =/= TO\-

• Average

overhead = £)

T<1 Tn

^ ,

and B.

Dasgupta

where TQ ^ To-

Table 2 shows t h e performance of t h e greedy algorithm in multistage shuffle MIN, while Table 3 shows t h e same for multistage cube MIN. As can be observed from these two tables, t h e greedy algorithm has low miss rate (particularly for t h e multistage cube MIN) and low traffic overhead.

Systems

Destinations Metrics

0.02 3 1.03

O.OB 6 1.23

O.l 8 1.4

0.3 11 1.52

O.B 24 1.6

0.8 16 1.56

O.O 14 1.31

0.95 12 1.19

0.99

N u m b e r of m i s a Average overhead

O.Ol 1 1.01

4-stage Shuffle

N u m b e r of m i s s Average overhead

1 1.12

— J "

9 1.33

11 1.47

16 1.63

32 1.62

27 1.58

19 1.39

N u m b e r of m i s s Average overhead

2 1.14

10 1.41

14 1.53

21 1.71

37 2.09

29 1.69

23 1.45

- »/-» rr-

10 1.17

5-at a g e Shuffle

3-At a g e Shuffle

1.15 5"

1 *-23 1

17

„.''32

9 1.06

11 1.21

Table 2: Multistage Shuffle Multicast: Performance of t h e Greedy heuristic over the optimal solution.

Systems

Destinations Metrics

O.Ol 1

0.02 2

0.OB 3

O.l 3

0.2 4

O.B 7

0.8 5

o.e

0.95 1

o.ee

3

1.01

1.01

1.01

1.01

1.04

1.1

1.04

1.61

1.01

1.01

4 1.03

3 1-01

l 1.01

6 1.03

- 3 1.01

2 1.01

3-atage Cube

N u m b e r of m i s s Average overhead

4-stage Cube

N u m b e r of m i s s Average overhead

2 1.01

2 1.01

3 1.01

4 1.03

6 1.11

-To— 1.18

5 1.09

5-stage Cube

N u m b e r of m i s s Average overhead

2 1.01

3 1.01

6 1.02

7 1.04

9 1.14

-1 . 2— 3

8 1.12

I

Table 3: Multistage Cube Multicast: Performance of the Greedy heuristic over the optimal solution.

4

Conclusion

Multistage networks are popular for parallel architecture a n d / o r communication network applications. We formulate t h e traffic o p t i m u m multicasting problem for multistage networks. O p t i m u m traffic multicasting problem is NP-complete. Several greedy heuristics are proposed and their performances are shown using analytical as well as simulation m e t h o d s . This work considered just traffic optimality in MIN multicasting. O p t i m a l i t y issues of time metric in MIN multicasting is left as an interesting open problem. A c k n o w l e d g m e n t s : We thank Gary Elsesser, Lionel M. Ni and Wei-Tek Tsai for helpful discussions.

Steiner

Problem

in Multistage

Computer

Networks

401

References S. B h a t t a c h a r y a , G. Elsesser, W . T . Tsai and D.Z. Du, Multicasting in Generalized Multistage Interconnection Networks, to appear in Journal of Parallel and Distributed Computing, 1993. S. B h a t t a c h a r y a and L.M. Ni, Multicasting in Multistage Communication Networks, preprint, 1992. D. Comer and R. Sethi, Complexity of T R I E Index Construction, 17th Symposium on FOCS, 1976, pp 197-207.

Annual

W. J. Dally and C. L. Seitz, Deadlock-Free Message Routing in Multiprocessor Interconnection Networks, IEEE Transactions on Computers, C-36 (1987), p p . 547-553. P. Dowd, R a n d o m Access Protocols for High-Speed Interprocessor Communication Based on an Optical Passive Star Topology, Journal of Lightwave Technology, 9 (1991), pp. 799-808. M.R. Garey M.R. and D.S. Johnson, Computers and Intractability - A Guide to the Theory of NP-Completeness, ( Freeman, San Fransisco, CA, 1979 ). L.R. Foulds and R.L. G r a h a m , T h e Steiner Problem in Phylogeny in N P Complete, Advances in Applied Mathematics, 3 (1982), p p . 43-49. K. Hwang and F.A.Y. Briggs, Computer (New York : McGraw-Hill, cl984).

architecture

and parallel

processing,

K. Hwang and J. Ghosh, Hypernets: A Communication-Efficient Architecture for Constructing Massively Parallel Computers, IEEE Transactions on Computers, C-36 (1987), p p . 1450-1466. M.G. Hluchyj and M.J. Karol, ShuffleNet: An Application of Generalized Perfect Shuffles to Multihop Lightwave Networks, Journal of Lightwave Technology, 9 (1991). P. Kermani and L. Kleinrock, Virtual cut-through: A new computer communication switching technique, Computer Networks, 3 (1979), p p . 267-286. F.P. P r e p a r a t a and J. Vuillemin, T h e Cube-Connected Cycles: A Versatile Network for Parallel C o m p u t a t i o n , C A C M , 24 (1981), p p . 300-309.

Combinatorial optimization: algorithms and complexity

Read more

Combinatorial Optimization: Algorithms and Complexity

Read more

Combinatorial Optimization: Algorithms and Complexity

Read more

Combinatorial Optimization: Algorithms and Complexity

Read more

Algorithms and Complexity

Read more

Algorithms and Complexity

Read more

Algorithms and Complexity

Read more

Algorithms and Complexity 001

Read more

Algorithms and Complexity

Read more

Network flows: theory, algorithms, and applications

Read more

Network Flows: Theory, Algorithms, and Applications

Read more

Algorithms and Complexity

Read more

Algorithms and Complexity

Read more

Network flows: theory, algorithms, and applications(conservative)

Read more

Practical optimization. Algorithms and engineering applications

Read more

Inverse Eigenvalue Problems: Theory, Algorithms, and Applications

Read more

Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties

Read more

Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties

Read more

Network Optimization and Control

Read more

Bioinspired Computation in Combinatorial Optimization: Algorithms and Their Computational Complexity

Read more

Algorithms and Complexity, 2nd edition

Read more

Algorithms and Complexity (Second edition)

Read more

Complexity in numerical optimization

Read more

WiMAX Network Planning and Optimization

Read more

Optimization with Multivalued Mappings: Theory, Applications and Algorithms (Springer Optimization and Its Applications)

Read more

WiMAX Network Planning and Optimization

Read more

WiMAX Network Planning and Optimization

Read more

Algorithms and Architectures (Neural Network Systems Techniques and Applications)

Read more

Combinatorial optimization: theory and algorithms

Read more

Combinatorial optimization theory and algorithms

Read more

Recommend Documents

Combinatorial optimization: algorithms and complexity

Combinatorial Optimization: Algorithms and Complexity

Combinatorial Optimization: Algorithms and Complexity

Combinatorial Optimization: Algorithms and Complexity

Algorithms and Complexity

Algorithms and Complexity

Algorithms and Complexity Herbert S. Wilf University of Pennsylvania Philadelphia, PA 19104-6395 Copyright Notice Copy...

Algorithms and Complexity

Algorithms and Complexity Second Edition Algorithms and Complexity Second Edition Herbert S. Wilf A K Peters Natick...

Algorithms and Complexity 001

Algorithms and Complexity Herbert S. Wilf University of Pennsylvania Philadelphia, PA 19104-6395 Copyright Notice Copy...

Algorithms and Complexity

Algorithms and Complexity Second Edition Algorithms and Complexity Second Edition Herbert S. Wilf A K Peters Natick...

Network flows: theory, algorithms, and applications