φ(x) for x integer. Note that, like

I_j = {x : m_{j,1} ≤ x < m_{j+1,1}}    (10)
With this notation, we define the intervals for arc i such that j̄, the number of intervals, is as small as possible given that the following two conditions are satisfied:
• If j is odd, then f(x) is a piecewise-linear-concave function for x ∈ I_j;
• If j is even, then f(x) is a piecewise-linear-convex function for x ∈ I_j.
For the j-th interval of arc i, let m_{j,k} and m_{j,k+1} denote, respectively, the left and right endpoints of the k-th segment. Note that, because the intervals and segments are contiguous, the right endpoint of the last segment of the j-th interval always equals the left endpoint of the first segment of the (j+1)-st interval. That is, m_{j,k_j+1} = m_{j+1,1}. Moreover, the left endpoint of the first segment of the first interval of arc i is always zero and the right endpoint of the last segment of the last interval of arc i is always equal to the capacity u. That is, m_{1,1} = 0 and m_{j̄,k_{j̄}+1} = m_{j̄+1,1} = u. To specify the segments for the j-th interval, we define the set I_{j,k} as
Figure 4: Steps in Conversion Procedure.
I_{j,k} = {x : m_{j,k} ≤ x < m_{j,k+1}}    (11)
The segments of interval j for arc i are defined such that k_j, the number of segments for the j-th interval, is as small as possible given that f(x) is linear for each x ∈ I_{j,k}.
3.2  Step 2
Having defined the intervals and segments for arc i, we can now describe the second step in the conversion procedure (again, refer to Figure 4). In the second step, we replace the single arc i with a series of j̄ arcs, one associated with each interval for arc i. We refer to the j-th arc in this series as arc i,j and we let g_j(x) denote the cost function for arc i,j. To specify g_j(x), let Λ_{j,k} denote the slope of the (linear) function f(x) for x ∈ I_{j,k} and let Λ̄_j denote the slope of f(x) in the k_j-th (i.e., last) segment of interval j. We define Λ̄_0 = 0. Note that g_j(x) must be defined for x in the domain [0,u], not just [m_{j,1}, m_{j+1,1}]. Thus, for each arc i,j we define two additional segments: the zero-th segment and the (k_j+1)-st segment. For each arc i,j, we also define the sets I_{j,0} and I_{j,k_j+1} as

I_{j,0} = {x : 0 ≤ x < m_{j,1}}    (12)

I_{j,k_j+1} = {x : m_{j+1,1} ≤ x ≤ u}    (13)
We now specify the cost function g_j(x) for arc i,j recursively as

g_j(x) = 0    for all x ∈ I_{j,0}    (14)

g_j(x) = g_j(m_{j,k}) + (Λ_{j,k} − Λ̄_{j−1}) · (x − m_{j,k})    for all x ∈ I_{j,k}, k = 1, 2, ..., k_j    (15)

g_j(x) = g_j(m_{j+1,1}) + (Λ̄_j − Λ̄_{j−1}) · (x − m_{j+1,1})    for all x ∈ I_{j,k_j+1}    (16)
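The recursion in eqs. (14) through (16) is easy to implement once the endpoints m_{j,k} and the slopes Λ_{j,k}, Λ̄_j are known. The following sketch (Python; the dictionary-based data layout and the function name are illustrative assumptions, not part of the paper) evaluates g_j at a point x.

    def eval_g(j, x, m, Lam, Lam_bar):
        """Evaluate g_j(x) per eqs. (14)-(16).

        m[j]       : endpoints m_{j,1}, ..., m_{j,k_j+1} of interval j (dict of lists)
        Lam[j]     : segment slopes Lambda_{j,1}, ..., Lambda_{j,k_j}
        Lam_bar[j] : slope of the last segment of interval j, with Lam_bar[0] = 0
        """
        if x < m[j][0]:                      # zero-th segment, eq. (14)
            return 0.0
        g = 0.0
        for k in range(len(Lam[j])):         # ordinary segments, eq. (15)
            left, right = m[j][k], m[j][k + 1]
            slope = Lam[j][k] - Lam_bar[j - 1]
            if x <= right:
                return g + slope * (x - left)
            g += slope * (right - left)
        # beyond m_{j+1,1}: the (k_j+1)-st segment, eq. (16)
        return g + (Lam_bar[j] - Lam_bar[j - 1]) * (x - m[j][-1])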
Note that, for j odd, g_j(x) is a piecewise-linear-concave function for x in the domain [0,u]. Similarly, for j even, g_j(x) is a piecewise-linear-convex function for x in the domain [0,u]. The logic underlying the specification of the cost functions g_j(x) is as follows. For flows x in the first interval, i.e., for x ∈ I_1, the function g_1(x) is the same as the function f(x) and the other arc cost functions g_2(x), ..., g_{j̄}(x) all have zero cost. For flows x in the second interval, i.e., for x ∈ I_2, the sum of the functions g_1(x) plus g_2(x) is the same as the function f(x) and the remaining functions g_3(x), ..., g_{j̄}(x) all have zero cost. For flows x ∈ I_3, the sum of g_1(x) plus g_2(x) plus g_3(x) equals f(x) and the remaining functions g_4(x), ..., g_{j̄}(x) are all zero; and so on. Thus, for any x between zero and u, the cost of sending x units of flow through the series of arcs i,1
through i,j̄ is the same as the cost of sending the same amount of flow through the single arc i.
3.3  Step 3
In the third step of the conversion process, we simply note that the order of the arcs in the series i,1 through i,j̄ is unimportant. This means that we may equivalently rearrange the order of the series of arcs such that the arcs i,j for j odd are to the left of the arcs i,j for j even. Furthermore, we can recombine the arcs i,j with j odd into a single arc. We refer to this recombined arc as arc i,0. Let r(x) denote the cost function for arc i,0 where
r(x) = Σ_{j=1,3,...} g_j(x)    (17)
Observe that r(x) is the sum of piecewise-linear-concave functions and so is itself a piecewise-linear-concave function. Moreover, the endpoints of the linear segments of r(x) are given by m_{j,k} for k = 1, ..., k_j and j odd. However, to simplify notation, let p be the index of the linear segments of the function r(x), let p̄ be the number of segments, and let v_p and v_{p+1} be the left and right endpoints of the p-th segment. In a similar fashion, we can replace the arcs i,j for j even with a single arc. Let s(x) denote the cost function for this recombined arc where
s(x) = Σ_{j=2,4,...} g_j(x)    (18)
Here, s(x) is a piecewise-linear-convex function and the endpoints of the linear segments are given by m_{j,k} for k = 1, ..., k_j and j even. Once again, we simplify notation by letting q be the index of the linear segments of s(x), letting q̄ be the number of segments, and letting w_q and w_{q+1} be the left and right endpoints of the q-th segment.

3.4  Step 4
In the fourth step of the conversion procedure, we use the established method of converting a piecewise-linear-convex arc cost function with q̄ linear segments into a set of q̄ parallel arcs, each with a linear cost function (see [12, p. 80]). We denote the q-th parallel arc as arc i,q, we let x_q denote the flow on arc i,q, and we let s_q(x_q) be the (linear) cost function for arc i,q. The functional form of s_q(x_q) is

s_q(x_q) = δ_q · x_q    (19)

where δ_q is the slope of the q-th linear segment of the piecewise-linear function s(x). We let u_q denote the flow capacity for arc i,q. Here, u_q is given by
u_q = w_{q+1} − w_q    (20)

3.5  Step 5
Up to this point, the conversion procedure we have described is exact. That is, solving a network with each arc i of the form shown in Step 0 in Figure 4 (i.e., the original network of Problem P) is identical to solving a network in which each arc i is replaced with the set of arcs shown in Step 4 (i.e., the expanded network of Problem Q). As mentioned at the end of Section 2, however, it may be desirable to approximate r(x) with a continuously differentiable concave function, denoted r̂(x). Thus, the (optional) fifth step in the conversion procedure is to use an appropriate curve fitting technique to approximate r(x) by r̂(x). Then r̂(x) is used as the cost function for arc i,0 in the expanded network for Problem Q.
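Any standard curve-fitting routine can serve in this optional step. As one possibility (purely an illustration; the paper does not prescribe a particular technique), a quadratic r̂(x) = a + b·x + c·x² can be fitted to samples of r(x) by least squares:

    import numpy as np

    def fit_quadratic(r_values, xs):
        """Least-squares fit of r_hat(x) = a + b*x + c*x**2 to samples of r."""
        A = np.vstack([np.ones_like(xs), xs, xs ** 2]).T
        a, b, c = np.linalg.lstsq(A, r_values, rcond=None)[0]
        return lambda x: a + b * x + c * x ** 2

    # e.g. sample r at the integer flows 0, 1, ..., u and fit:
    # xs = np.arange(u + 1.0)
    # r_hat = fit_quadratic(np.array([r(x) for x in xs]), xs)

A concavity check (c ≤ 0) or a constrained fit can be added if the quadratic must remain concave on [0, u].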
3.6  Computational Summary
We conclude this section by pointing out that the form of the functions r(x) and s_q(x_q) in the expanded network (i.e., Step 4) can be computed directly from the general nonlinear function φ(x) in the original network (i.e., Step 0). To describe these computations, let θ_L(x) and θ_R(x) denote, respectively, the slope of the piecewise-linear function f(x) to the left and right of x for x = 1, 2, ..., u−1. These slopes are computed directly from the general nonlinear function φ(x) as

θ_L(x) = φ(x) − φ(x − 1)    (21)

θ_R(x) = φ(x + 1) − φ(x)    (22)
In addition, let θ(x) denote the difference between the right and left slopes. That is,

θ(x) = θ_R(x) − θ_L(x)    (23)

To determine the intervals j and segments k, and the associated endpoints m_{j,k}, we initially set j ← 1, k ← 1, and m_{1,1} ← 0. Then, for x sequentially set to 1, 2, ..., u−1 (as in a "DO-loop") we perform the following four tests (a code sketch follows the list):
• If j is odd and θ(x) < 0, then set m_{j,k+1} ← x and increment k ← k + 1;
• If j is odd and θ(x) > 0, then set m_{j,k+1} ← x, set m_{j+1,1} ← x, and set k_j ← k; then increment j ← j + 1 and reset k ← 1;
• If j is even and θ(x) > 0, then set m_{j,k+1} ← x and increment k ← k + 1;
• If j is even and θ(x) < 0, then set m_{j,k+1} ← x, set m_{j+1,1} ← x, and set k_j ← k; then increment j ← j + 1 and reset k ← 1.
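A direct transcription of these four tests, assuming φ is available as a function on the integers 0, ..., u (the dictionary-of-lists layout matches the earlier sketches and is an assumption, not the paper's):

    def build_intervals(phi, u):
        """Detect intervals/segments of f and the slopes of eqs. (24)-(25)."""
        theta_L = lambda x: phi(x) - phi(x - 1)          # eq. (21)
        theta_R = lambda x: phi(x + 1) - phi(x)          # eq. (22)
        j, m = 1, {1: [0]}                               # m[j] lists m_{j,1}, m_{j,2}, ...
        for x in range(1, u):
            t = theta_R(x) - theta_L(x)                  # eq. (23)
            if (j % 2 == 1 and t < 0) or (j % 2 == 0 and t > 0):
                m[j].append(x)                           # new segment inside interval j
            elif (j % 2 == 1 and t > 0) or (j % 2 == 0 and t < 0):
                m[j].append(x)                           # close interval j at x ...
                j += 1
                m[j] = [x]                               # ... and open interval j+1
            # when t == 0 none of the four tests applies
        m[j].append(u)                                   # m_{jbar, k_jbar + 1} = u
        Lam = {i: [theta_R(pt) for pt in pts[:-1]] for i, pts in m.items()}   # eq. (24)
        Lam_bar = {0: 0.0}
        Lam_bar.update({i: theta_L(pts[-1]) for i, pts in m.items()})         # eq. (25)
        return m, Lam, Lam_bar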
Note that when θ(x) = 0 the conditions of none of the four tests are satisfied. After performing these four tests for x = 1, 2, ..., u−1, we then set j̄ ← j, k_{j̄} ← k, m_{j̄,k_{j̄}+1} ← u and m_{j̄+1,1} ← u. To compute the piecewise-linear-concave function r(x), the numerical values of the slopes Λ_{j,k} and Λ̄_j used in eqs. (14) through (16) are obtained from the slopes θ_L(x) and θ_R(x) given in eqs. (21) and (22). Specifically,
Λ_{j,k} = θ_R(m_{j,k})    (24)

Λ̄_j = θ_L(m_{j+1,1})    (25)
Then the segments k = 1, ..., k_j for j odd are reindexed as p = 1, ..., p̄ and the endpoints v_p are set equal to m_{j,k} for the appropriate j and k. In a similar way, to compute the linear functions s_q(x_q), the segments k = 1, ..., k_j for j even are reindexed as q = 1, ..., q̄, the slopes δ_q in eq. (19) are determined from Λ_{j,k} and Λ̄_j using eqs. (24) and (25), the endpoints w_q are set equal to m_{j,k} for the appropriate j and k, and the capacities u_q are computed using eq. (20). The next section illustrates the conversion procedure with two numerical examples.
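Putting the pieces together, the data of the expanded network (the breakpoints w_q, slopes δ_q and capacities u_q of the parallel arcs, together with samples of the concave component r(x)) can be assembled as follows. This sketch builds on the earlier hypothetical routines; the running-slope bookkeeping reflects the fact that, past an even interval j, the arc i,j keeps contributing the slope Λ̄_j − Λ̄_{j−1} to s(x).

    def expand_arc(phi, u):
        """Return samples of r(x) and the parallel-arc data (w_q, delta_q, u_q)."""
        m, Lam, Lam_bar = build_intervals(phi, u)
        j_max = max(m)
        w, delta, base = [0], [0.0], 0.0        # s(x) has zero slope on [0, m_{2,1}]
        for j in range(2, j_max + 1, 2):        # even intervals only, eq. (18)
            for k, lam in enumerate(Lam[j]):
                w.append(m[j][k])
                delta.append(base + lam - Lam_bar[j - 1])
            base += Lam_bar[j] - Lam_bar[j - 1] # contribution interval j keeps adding
        w.append(u)
        caps = [w[q + 1] - w[q] for q in range(len(delta))]        # eq. (20)
        r_samples = [eval_r_s(x, m, Lam, Lam_bar, j_max)[0] for x in range(u + 1)]
        return r_samples, w, delta, caps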
4  Numerical Examples
In this section, we apply the conversion procedure described in Section 3 to the "staircase" arc cost function shown in Figure 1 and the "sawtooth" arc cost function shown in Figure 2. To illustrate the technique, the conversion steps are shown in full for the "staircase" function, but just briefly summarized for the "sawtooth" function. As in Section 3, because we are referring to a given arc i in the original arc set A, we omit the subscript i in the notation in this section.
4.1  "Staircase" Function
For t h e function in Figure 1, we have
φ(x) = 0     for x = 0
     = 4     for 0 < x ≤ 5
     = 8     for 5 < x ≤ 10
     = 12    for 10 < x ≤ 15
Figure 5: Piecewise-Linear Equivalent for "Staircase" Function.
We assume φ(x) is defined over the domain [0,15]; i.e., u = 15. The function f(x), the piecewise-linear-continuous equivalent of φ(x), is given by

f(x) = 4·x             for 0 ≤ x ≤ 1
     = 4               for 1 ≤ x ≤ 5
     = 4 + 4·(x − 5)   for 5 ≤ x ≤ 6
     = 8               for 6 ≤ x ≤ 10
     = 8 + 4·(x − 10)  for 10 ≤ x ≤ 11
     = 12              for 11 ≤ x ≤ 15
and is shown in Figure 5. The domain [0,15] is divided into five intervals; i.e., j̄ = 5. They are

I_1 = {x : m_{1,1} ≤ x < m_{2,1}} = {x : 0 ≤ x < 5}
I_2 = {x : m_{2,1} ≤ x < m_{3,1}} = {x : 5 ≤ x < 6}
I_3 = {x : m_{3,1} ≤ x < m_{4,1}} = {x : 6 ≤ x < 10}
I_4 = {x : m_{4,1} ≤ x < m_{5,1}} = {x : 10 ≤ x < 11}
I_5 = {x : m_{5,1} ≤ x ≤ m_{6,1}} = {x : 11 ≤ x ≤ 15}

The first interval contains two segments; i.e., k_1 = 2. They are

I_{1,1} = {x : m_{1,1} ≤ x < m_{1,2}} = {x : 0 ≤ x < 1}
I_{1,2} = {x : m_{1,2} ≤ x < m_{1,3}} = {x : 1 ≤ x < 5}
Each of the other intervals contains exactly one segment. That is, I_{2,1} = I_2, I_{3,1} = I_3, I_{4,1} = I_4, and I_{5,1} = I_5. For each j = 1, ..., j̄, we also define the sets I_{j,0} and I_{j,k_j+1} as follows:

I_{1,0} = {x : 0 ≤ x < m_{1,1}} = {x : 0 ≤ x < 0}
I_{1,3} = {x : m_{2,1} ≤ x ≤ u} = {x : 5 ≤ x ≤ 15}
I_{2,0} = {x : 0 ≤ x < m_{2,1}} = {x : 0 ≤ x < 5}
I_{2,2} = {x : m_{3,1} ≤ x ≤ u} = {x : 6 ≤ x ≤ 15}
I_{3,0} = {x : 0 ≤ x < m_{3,1}} = {x : 0 ≤ x < 6}
I_{3,2} = {x : m_{4,1} ≤ x ≤ u} = {x : 10 ≤ x ≤ 15}
I_{4,0} = {x : 0 ≤ x < m_{4,1}} = {x : 0 ≤ x < 10}
I_{4,2} = {x : m_{5,1} ≤ x ≤ u} = {x : 11 ≤ x ≤ 15}
I_{5,0} = {x : 0 ≤ x < m_{5,1}} = {x : 0 ≤ x < 11}
I_{5,2} = {x : m_{6,1} ≤ x ≤ u} = {x : 15 ≤ x ≤ 15}
Applying eqs. (24) and (25), the slopes Λ_{j,k} and Λ̄_j are as follows:

Λ_{1,1} = 4    Λ̄_0 = 0
Λ_{1,2} = 0    Λ̄_1 = 0
Λ_{2,1} = 4    Λ̄_2 = 4
Λ_{3,1} = 0    Λ̄_3 = 0
Λ_{4,1} = 4    Λ̄_4 = 4
Λ_{5,1} = 0    Λ̄_5 = 0
Using the slopes and intervals given above, the functions g_j(x) defined in eqs. (14) through (16) are as follows:
g_1(x) = 4·x            for 0 ≤ x ≤ 1
       = 4              for 1 ≤ x ≤ 15

g_2(x) = 0              for 0 ≤ x ≤ 5
       = 4·(x − 5)      for 5 ≤ x ≤ 15

g_3(x) = 0              for 0 ≤ x ≤ 6
       = −4·(x − 6)     for 6 ≤ x ≤ 15

g_4(x) = 0              for 0 ≤ x ≤ 10
       = 4·(x − 10)     for 10 ≤ x ≤ 15

g_5(x) = 0              for 0 ≤ x ≤ 11
       = −4·(x − 11)    for 11 ≤ x ≤ 15

Recombining the functions g_j(x) according to eqs. (17) and (18) gives
r(x) = g_1(x) + g_3(x) + g_5(x)

s(x) = g_2(x) + g_4(x)
Numerically, this yields
r(x) = 4·x               for 0 ≤ x ≤ 1
     = 4                 for 1 ≤ x ≤ 6
     = 4 − 4·(x − 6)     for 6 ≤ x ≤ 11
     = −16 − 8·(x − 11)  for 11 ≤ x ≤ 15

s(x) = 0                 for 0 ≤ x ≤ 5
     = 4·(x − 5)         for 5 ≤ x ≤ 10
     = 20 + 8·(x − 10)   for 10 ≤ x ≤ 15
The functions r(x) and s(x) are shown in Figures 6 and 7, respectively. Since function s(x) contains three linear segments, we have q̄ = 3 and, using eqs. (19) and (20), we compute the linear functions s_q(x_q) and capacities u_q as follows:

s_1(x_1) = 0         u_1 = 5
s_2(x_2) = 4·x_2     u_2 = 5
s_3(x_3) = 8·x_3     u_3 = 5
Finally, if desired, the piecewise-linear-concave function r(x) can be approximated by a continuously differentiable function r̂(x). For instance, to approximate r(x) by a quadratic, the following function can be used:

r̂(x) = 5 − 0.4·(x − 3.5)²

This approximation is shown by the dotted line in Figure 6.
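For reference, feeding this staircase cost into the sketches of Section 3 reproduces the components reported above, so the snippets double as a consistency check (the function names are the hypothetical ones introduced earlier, not routines from the paper):

    phi = lambda x: 0 if x == 0 else (4 if x <= 5 else (8 if x <= 10 else 12))
    r_samples, w, delta, caps = expand_arc(phi, 15)
    # expected: w = [0, 5, 10, 15], delta = [0, 4, 8], caps = [5, 5, 5]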
Figure 6: Concave Component for "Staircase" Function.

Figure 7: Convex Component for "Staircase" Function.
Figure 8: Piecewise-Linear Equivalent for "Sawtooth" Function.
4.2  "Sawtooth" Function
For the "sawtooth" function in Figure 2, >(x) is given by
+(*) =
0 6+ 5• x 6+ 4•x 6+ 3• x
for x = 0 for 0 < x < 10 for 10 < x < 20 for 20 < x < 30
We assume that this function is defined over the domain [0,30]; i.e., u = 30. The function f(x), the piecewise-linear-continuous equivalent of φ(x), is given by

f(x) = 11·x              for 0 ≤ x ≤ 1
     = 11 + 5·(x − 1)    for 1 ≤ x ≤ 9
     = 51 − 5·(x − 9)    for 9 ≤ x ≤ 10
     = 46 + 4·(x − 10)   for 10 ≤ x ≤ 19
     = 82 − 16·(x − 19)  for 19 ≤ x ≤ 20
     = 66 + 3·(x − 20)   for 20 ≤ x ≤ 30
as shown in Figure 8. The domain [0,30] is divided into four intervals; i.e., j̄ = 4. They are

I_1 = {x : m_{1,1} ≤ x < m_{2,1}} = {x : 0 ≤ x < 10}
I_2 = {x : m_{2,1} ≤ x < m_{3,1}} = {x : 10 ≤ x < 19}
I_3 = {x : m_{3,1} ≤ x < m_{4,1}} = {x : 19 ≤ x < 20}
Figure 9: Concave Component for "Sawtooth" Function.
I_4 = {x : m_{4,1} ≤ x ≤ m_{5,1}} = {x : 20 ≤ x ≤ 30}

The first interval contains three segments; i.e., k_1 = 3. Each of the other intervals contains exactly one segment. Using eqs. (14) through (16), the functions g_j(x) are determined for j = 1, ..., 4. Then using eqs. (17) and (18), functions r(x) and s(x) are given by

r(x) = g_1(x) + g_3(x)

s(x) = g_2(x) + g_4(x)
This yields
r(x) = 11·x              for 0 ≤ x ≤ 1
     = 11 + 5·(x − 1)    for 1 ≤ x ≤ 9
     = 51 − 5·(x − 9)    for 9 ≤ x ≤ 19
     = 1 − 25·(x − 19)   for 19 ≤ x ≤ 30

s(x) = 0                 for 0 ≤ x ≤ 10
     = 9·(x − 10)        for 10 ≤ x ≤ 20
     = 90 + 28·(x − 20)  for 20 ≤ x ≤ 30

The functions r(x) and s(x) are shown in Figures 9 and 10, respectively. Since function s(x) contains three linear segments, we have q̄ = 3 and the linear functions s_q(x_q) and capacities u_q are given as follows:
Figure 10: Convex Component for "Sawtooth" Function.
s_1(x_1) = 0          u_1 = 10
s_2(x_2) = 9·x_2      u_2 = 10
s_3(x_3) = 28·x_3     u_3 = 10
Lastly, if desired, the piecewise-linear-concave function r(x) can be approximated by a continuously differentiable function r̂(x). For instance, to approximate r(x) by a quadratic, the following function can be used:

r̂(x) = 55 − 0.75·(x − 9)²

This approximation is shown by the dotted line in Figure 9. The next section summarizes the paper.
5  Summary
This paper has described how a network containing arcs with general nonlinear cost functions can be converted into an expanded network involving only concave arc cost functions. The functional form of the concave arc cost functions can be calculated very efficiently from the original problem data. Although the expanded network will be larger than the original network, it can be solved using established solution techniques for concave minimum cost network flow problems. Special cases of this conversion procedure have been applied to less-than-truckload motor carrier networks and to cash flow management problems [17, 18]. This paper has extended the technique to any minimum cost network flow problem with arbitrary arc cost functions.
References

[1] D.P. Bertsekas (1991), Linear Network Optimization, MIT Press, Cambridge, MA.

[2] A. Charnes and W.W. Cooper (1961), Management Models and Industrial Applications of Linear Programming, John Wiley and Sons, New York, NY.

[3] R.J. Dolan (1987), "Quantity Discounts: Managerial Issues and Research Opportunities," Marketing Science, vol. 6, pp. 1-22.

[4] J.R. Evans and E. Minieka (1992), Optimization Algorithms for Networks and Graphs, Marcel Dekker, New York, NY.

[5] M.R. Garey and D.S. Johnson (1979), Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman and Co., San Francisco, CA.

[6] G.M. Guisewite and P.M. Pardalos (1990), "Minimum Concave-Cost Network Flow Problems: Applications, Complexity, and Algorithms," Annals of Operations Research, vol. 25, pp. 75-100.

[7] G.M. Guisewite and P.M. Pardalos (1991a), "Single-Source Uncapacitated Minimum Concave Cost Network Flow Problems," in H.E. Bradley (ed.), Operational Research '90, Pergamon Press, Oxford, England, pp. 703-713.

[8] G.M. Guisewite and P.M. Pardalos (1991b), "Algorithms for the Single-Source Uncapacitated Minimum Concave-Cost Network Flow Problem," Journal of Global Optimization, vol. 1, pp. 245-265.

[9] G.M. Guisewite and P.M. Pardalos (1991c), "Global Search Algorithms for Minimum Concave Cost Network Flow Problems," Journal of Global Optimization, vol. 1, pp. 309-330.

[10] G.M. Guisewite and P.M. Pardalos (1991d), "A Polynomial Time Solvable Concave Network Flow Problem," Networks, forthcoming.

[11] G.M. Guisewite and P.M. Pardalos (1992), "Performance of Local Search in Minimum Concave-Cost Network Flow Problems," in C.A. Floudas and P.M. Pardalos (eds.), Recent Advances in Global Optimization, Princeton University Press, Princeton, NJ, pp. 50-75.

[12] P.A. Jenson and J.W. Barnes (1980), Network Flow Programming, John Wiley and Sons, New York, NY.

[13] E.L. Johnson (1966), "Networks and Basic Solutions," Operations Research, vol. 14, pp. 619-623.

[14] D.B. Khang and O. Fujiwara (1991), "Approximate Solutions of Capacitated Fixed-Charge Minimum Cost Network Flow Problems," Networks, vol. 21, pp. 689-704.

[15] J.L. Kennington and R.V. Helgason (1980), Algorithms for Network Programming, John Wiley and Sons, New York, NY.

[16] B.W. Lamar (1992), "An Improved Branch and Bound Algorithm for Minimum Concave Cost Network Flow Problems," Journal of Global Optimization, forthcoming.

[17] B.W. Lamar and S. Jorjani (1990), "Incorporating Discounting Into Network-Based Cash Flow Management Models," working paper, Graduate School of Management, University of California, Irvine, CA.

[18] B.W. Lamar and Y. Sheffi (1988), "An Implicit Enumeration Method for LTL Network Design," Transportation Research Record, no. 1120, pp. 1-16.

[19] B.W. Lamar, Y. Sheffi, and W.B. Powell (1990), "A Capacity Improvement Lower Bound for Fixed Charge Network Design Problems," Operations Research, vol. 38, pp. 704-710.

[20] D.T. Phillips and A. Garcia-Diaz (1981), Fundamentals of Network Analysis, Prentice-Hall, Englewood Cliffs, NJ.

[21] Y. Sheffi (1985), Urban Transportation Networks: Equilibrium Analysis with Mathematical Programming Methods, Prentice-Hall, Englewood Cliffs, NJ.

[22] B. Yaged, Jr. (1971), "Minimum Cost Routing for Static Network Models," Networks, vol. 1, pp. 139-172.
Network Optimization Problems, pp. 169-175
Eds. D.-Z. Du and P.M. Pardalos
©1993 World Scientific Publishing Co.
Application of Global Line Search in Optimization of Networks

Jonas Mockus
Department of Optimal Decision Theory, Institute of Mathematics and Informatics, Akademijos 4, Vilnius 2600, Lithuania
Abstract
In this paper a review of the application of global line search to the optimization of networks is given. Advantages and disadvantages of this approach are discussed. It is shown that global line search provides the global minimum after a finite number of steps in two cases of piecewise linear arc cost functions. The first case is where all cost functions are convex. The second case is where all costs are equal to zero at zero flow and equal to some constant at non-zero flow. In other cases the global line search approaches a global minimum with small average error. The extension of the method to vector demands is given. The application of the method to the optimization of the high-voltage net of a power system is described.
1  Global Line Search
Suppose that the objective function f(x), x = (x_j, j = 1, ..., J), may be approximately expressed as a sum of components depending on one variable x_j:
f(x) = Σ_{j=1}^{J} f_j(x_j)    (1)
Then the original J-dimensional optimization problem can be reduced to a sequence of one-dimensional optimization problems. If the decomposition (1) is exact, then we shall get the global optimum after J steps of optimization. If the sum (1)
represents f(x) only approximately, then generally we shall get some approximation of a global optimum. The result of step i is taken as the initial point for the (i+1)-st step of optimization. The optimization stops if no change happens during J steps. The difference from the classical version of the line search method is that the search is not local but global. Generally this helps to approach a global minimum more closely. There are important cases in which global line search finds the global minimum and local line search does not. One such case is the following problem of network optimization.
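The scheme can be written down compactly. The sketch below is an illustrative Python rendering (the grid-based one-dimensional global minimization and all names are assumptions, not taken from the paper): it cycles through the coordinates, globally minimizing each one-dimensional restriction, and stops after a full cycle with no change.

    def global_line_search(f, x0, grids, max_cycles=100):
        """Cyclic coordinate search with a global one-dimensional step.

        f     : objective taking a list of J coordinates
        x0    : starting point (list of length J)
        grids : grids[j] is the list of candidate values for coordinate j
        """
        x = list(x0)
        for _ in range(max_cycles):
            changed = False
            for j in range(len(x)):
                best = min(grids[j], key=lambda v: f(x[:j] + [v] + x[j + 1:]))
                if best != x[j]:
                    x[j], changed = best, True
            if not changed:          # no change during a full cycle of J steps
                break
        return x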
2  Optimization of Networks
Suppose that the cost of the network f(x) can be expressed as a sum (1), where f_j(x_j) is the cost of arc j and x_j is the flow on arc j. The sum of the flows of the arcs connected to each node i has to be equal to the node's demand. That is:

Σ_{j=1}^{J} a_{ij} x_j = a_i c_i,    i = 1, ..., I    (2)

where

Σ_{i=1}^{I} a_i c_i = 0

Here J is the number of arcs, I is the number of nodes, c_i is the demand of node i,

a_{ij} = +1, if arc j goes to node i
       = −1, if arc j goes from node i
       = 0,  if arc j is not connected to node i

and

a_i = +1, if node i is a source
    = −1, if node i is a sink
J
K
fix) = Ylfc{xk) + Yl h(52hikXk + Zfo) k=l
1=K+1
Here K = J — 7 + 1 is a n u m b e r of loops, xi0 is from (2) taking x^ = 0, k = 1,
{
(3)
k=-[
+ 1 , if arc / is directed as loop k — 1, if arc / is directed opposite to loop k 0, if arc / don't belong t o loop k
...,K,
Loop k is generated by connecting a pair of nodes of some tree containing the arcs l, l = K+1, ..., J, by an additional arc k, k = 1, ..., K. The flow x_k of the arc k generating loop k is called the loop flow. The loop flows x_k, k = 1, ..., K, can be changed independently during optimization. Expression (3) depends on the tree which generates the loops k, k = 1, ..., K.

Assume that the costs of arcs are piecewise linear functions. Then it is convenient to carry out the global line search along the bounds of the linear areas. This can be done by changing the tree after each step of optimization. An obvious rule is to remove an arc from the tree if that arc's flow happens to be on a bound of the linear parts of its piecewise linear cost function. Here we do not consider degenerate problems. It is shown, see Mockus (1967), that the algorithm provides the global minimum in two cases:

1. The cost functions of all arcs are convex;

2. The cost functions of all arcs are constant, with the exception of the zero-flow point, where the value of the cost function has to be zero. This means that the cost functions are very "non-convex".

It is easy to see that in the linear case the algorithm is as simple and efficient as conventional algorithms of linear programming. The difference is that the global line search algorithm works almost as well in non-linear convex problems and in some special non-convex problems.

The generalization of the algorithm to S-dimensional demands is straightforward theoretically: we just replace the scalar demand c_i by the vector demand c_i = (c_{i1}, ..., c_{iS}). To keep the conservation-of-flow equations (2) we have to replace the scalar flows x_j by corresponding vector flows x_j = (x_{j1}, ..., x_{jS}). For practical applications the straightforward generalization is not convenient for two reasons. The first reason is the exponential growth of calculations when S is large. The second reason is the practical difficulty of defining general S-dimensional piecewise linear functions f_j(x_{j1}, ..., x_{jS}). In special S-dimensional demand cases the global line search can be carried out more conveniently using some heuristics. The convergence proof can be extended directly only for the straightforward generalization. However, experience shows that global line search is efficient in solving S-dimensional load problems of network design. One such problem is the optimization of the high-voltage net of a large power system.
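As a small illustration of the reduction (3): once a spanning tree has been chosen, the coefficients b_{lk} and the base flows x_{l0} are fixed, and the network cost becomes an unconstrained function of the K loop flows alone. The sketch below assumes a dense list-of-lists layout for b and is not taken from the paper; the global line search of Section 1 can then be applied to this function, one loop flow at a time.

    def network_cost(loop_flows, arc_costs, b, x0):
        """Evaluate eq. (3): the network cost as a function of the K loop flows.

        loop_flows : x_1, ..., x_K (flows on the loop-generating arcs)
        arc_costs  : list of J cost functions f_j
        b          : b[l-K-1][k] coefficients for the tree arcs l = K+1, ..., J
        x0         : base flows x_{l0} of the tree arcs (all loop flows set to zero)
        """
        K = len(loop_flows)
        cost = sum(arc_costs[k](loop_flows[k]) for k in range(K))
        for l in range(K, len(arc_costs)):          # tree arcs
            flow = x0[l - K] + sum(b[l - K][k] * loop_flows[k] for k in range(K))
            cost += arc_costs[l](flow)
        return cost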
3
The Optimization of High-Voltage Net of Power System
T h e arcs may transformers. higher voltage of nodes c; =
represent components of network such as power transmission lines and T h e nodes represent demands. Demands are sources (generators or substations) or sinks (users or lower voltage substations). T h e demands (c, n ,ra = 1,...,N) in real life power systems are some vector-valued
172
J.
Mockus
functions of t i m e c,„ = c >tl ( t ),t € [1,T]. Usually those functions are approximated as some step functions of t i m e , so c; = {cint,n
= l,...,N,t
- l,...,T),i = 1,...,/
Here different f-components of S = AT-dimensional d e m a n d represent different periods of t i m e , from t = 1 until t = T. T h e n flows of arcs are Xj = (xjnt, n — l,...,N,t = l,...,T),j = 1,..., J and t h e scalar problem (3) can be directly extended to t h e vector case:
/(*) = £/*(**) + E /KE6'*** + *») fc=l
/=A'+1
(4)
*=1
where x t = (x f c n t ,n = l ) . . . ) J V , i = 1 , . . . , T ) , A ; = 1 , . . . , A -
(5)
T h e costs of arcs j representing transmission lines and transformers j directly depends not only on flows Xj but also on states yj. So t h e cost of arc can be more conveniently expressed as fj{xj,yj). Here t h e state variable yj = {yj,t,t = 1,...,T) usually depends on t i m e b u t not on n. Each component yjt of vector yj is a non-negative integer defining t h e technical p a r a m e t e r s of arc j , such as t h e number of parallel circuits of transmission line, t h e n u m b e r and cross-section area of wires, t h e n u m b e r and t h e power rating of transformers and so on. Assume t h a t capacity of arc Xjt(yjt) is an increasing function of its s t a t e yjt (this assumption will help us later to deal with capacity constraints). So we define t h e mixed integer programming problem: K
f{x,v)
J
= ^2fk{xk,yk)+ fc=l
^
K
fi(J2bikXk
l=K+l
+ x,0,y,)
(6)
k=\
This problem can be reduced to continuous non-linear p r o g r a m m i n g problem by choosing t h e cheapest state yj for a fixed flow Xj, namely: fj(xj)
= rain f(Xj,yj)
(7)
providing t h a t t h e capacity constraints hold font I < X]nt(yJt)
(8)
T h e capacity constraints \xjnt\ < Xjnt(yjt) are satisfied by increasing t h e s t a t e variable yjt if inequality (8) does not holds. Notation Yj = YJ(XJ) means a set of feasible states of arc j which can depend on flow Xj. So expression (7) defines t h e cost function, generally a multimodal one.
Application
of Global Line Search in Optimization
of
Networks
173
Expression (7) gives a convenient definition of S'-dimensional cost function fj(xj). However there remains the exponential complexity of minimization of non-convex cost function (6) depending on vector variables xk, k = 1,..., K. So we shall consider some other ways too. An interesting way is to reduce t h e problem of mixed integer p r o g r a m m i n g (6) to a problem of pure integer programming, regarding t h e states yj as independent integer variables. Here for each fixed state y = yk t h e optimal value of flow xk has to be calculated. An advantage of pure integer programming approach is t h a t the cost of arc fj(xj,yj) at a fixed state yj is usually a convex function (compare it with nonconvex cost function (7) of non-linear programming approach). It is well known t h a t a sum (6) of convex functions also is a convex function. However we should carry out the optimization of convex function (6) m a n y times, for each s t a t e y. It is a hard task, if TV and T are not small. T h e practical experience shows, t h a t t h e best computational results can b e obtained by facing t h e mixed integer p r o g r a m m i n g problem (6) directly. It means t h a t we optimize states and flows together at each step of global line search.
4
Mixed Integer Global Line Search
We shall consider a special case T
fAX3i Vi) = Yj9it{Vit-1. (=1
N
Vjt) + Yl
h
int{yjt)x)nt)
(9)
n=l
Here t h e function gjt{yjt-i,yjt) defines t h e cost of reconstruction of node j from state yjt-i to t h e state yjt. Expression hjntxjt means t h e power loss of flow xjnt in arc j a t s t a t e J/ J ( . It is well known t h a t minimization of (6) u n d e r assumption (9) defines t h e natural distribution of power flows in a homogeneous electrical net for any fixed s t a t e y. T h e net is not homogeneous if it contains transmission lines of different voltages or if it includes lines and transformers together. Suppose t h a t yjnt = 0,ifxjnt = 0 and t h a t yjnt+i > Vint- It means t h a t zero s t a t e is feasible only for zero flows and t h a t t h e s t a t e variable can't be decreased in time. T h e last assumption is usually true, but not always. If the state of arc is zero from t h e t i m e 1 until t h e t i m e t we shall call it tinterrupted arc. It is supposed t h a t after t h e t i m e t the s t a t e of ^-interrupted arc is non-zero. T h e optimization of each loop k is carried out in T + 1 stages. At t h e 1-sth stage we compare all possible cases of T-interruption of arcs belonging to t h e loop k. For each fixed s t a t e we minimize sum (6) as a function of flow xk. We do it by solving linear equations corresponding t o t h e condition of zero first derivatives with regard t o xfc n ( ,n = 1 , . . . ,N,t = 1,... ,T. In t h e most economical s t a t e of loop
174
J.
Mockus
we replace zero s t a t e values yjt = 0 by unit s t a t e values, t h a t is by yjt = 1. We accept it as an initial s t a t e for t h e next stage. At t h e second stage we consider all cases of T — 1 interruption and so on until t h e last T + 1- sth stage. At T + 1-sth stage we consider all non-zero states. We compare sums of costs of arcs belonging to a loop k for all stages. T h e state which corresponds to t h e minimal cost function we accept as a result of global search along t h e "line" k. T h e enumeration of arcs may be changed after each step of global line search. T h e purpose of this change is to keep "most interrupted" arcs out of t h e tree. We call an arc as "most interrupted" if it is interrupted for a longest t i m e to- Here t 0 = a r g m a x j g i k i;, where ij we define as yiit = 0,t = l,...ti,yu > l , i = t/ + 1,...,T and Lk is a set of arcs belonging to a loop k. If T < 2, then t h e optimization at each stage can be carried out by a simple comparison of all corresponding states. If T is greater, t h e n some d y n a m i c p r o g r a m m i n g procedure usually is more efficient. T h e optimization stops if the cost of net changes less then e during K steps of global line search. T h e C P U t i m e T of global line search software developed by J.Valeviciene can be estimated as
τ = c·K·I·N·M
Here c depends on computer, for P C approximately c = 0.2 — l . O s e c , K is a n u m b e r of loops, / is a number of nodes, A^ is a n u m b e r of flow components and M is an average n u m b e r of states of arcs. T h e algorithm was used since 1969 for t h e optimal planning of North-Western power system of t h e former USSR designing t h e new power transmission lines of 110 K V , 220 K V and 330 K V in t h e Leningrad branch of "Energosetprojekt", which was t h e leading institution in the country at t h e t i m e . D y n a m i c p r o g r a m m i n g procedures developed in Riga were also used at t h e some place, with lesser success. T h e reason was t h a t approximate global line search m e t h o d was solving t h e problems up to 100 nodes and more. T h e exact d y n a m i c programming procedures were directly applicable only to problems with tens of nodes. Pardalos and Rosen (1987) present an approximation technique based on piecewise linear underestimation of concave cost functions fj(xj). T h e resulting model is a linear, zero-one, mixed integer problem. A direct comparison of this approach and global line search techniques is an interesting problem of future research. A complete review of western results in network optimization is given by Guisewite and Pardalos (1990).
Application of Global Line Search in Optimization of Networks
175
References

[1] G.M. Guisewite and P.M. Pardalos, Minimum Concave-Cost Network Flow Problems: Applications, Complexity, and Algorithms, Annals of Operations Research, 25 (1990) 75-100.

[2] J. Mockus, Multimodal Problems in Engineering Design (Nauka, Moscow, 1967) p. 216 (in Russian).

[3] P.M. Pardalos and J.B. Rosen, Constrained Global Optimization: Algorithms and Applications, Lecture Notes in Computer Science 268 (Springer, Berlin, 1987).
Network Optimization Problems, pp. 177-202
Eds. D.-Z. Du and P.M. Pardalos
©1993 World Scientific Publishing Co.
Solving Nonlinear Programs with Embedded Network Structures

Mustafa Ç. Pınar
Institute for Numerical Analysis, The Technical University of Denmark, 2800 Lyngby, Denmark

Stavros A. Zenios
Decision Sciences Department, University of Pennsylvania, Philadelphia, PA 19104, USA
19104
Abstract
We present an algorithm for solving large scale nonlinear programs with embedded network structures. It is based on a Linear-Quadratic Penalty (LQP) function that eliminates the non-network constraints from the problem. The resulting nonlinear and nonseparable problems are solved using a simplicial decomposition algorithm that induces separability in the objective function. As a result one can employ network simplex technology. At the same time the values of any side variables can be determined by inspection. The algorithm is implemented in the software system GENOS/LP. Extensive numerical results are reported for diverse application areas: multicommodity network flows, Naval personnel assignment, matrix balancing and some of the NETLIB test problems. Comparisons with general purpose optimizers, like MINOS and OBI, are included.
1
Introduction and Background
It is well documented in t h e optimization literature t h a t network-structured optimization problems can be solved substantially faster t h a n t h e general linear program.
178
Mustafa
Q. Pmar & Stavros
A.
Zenios
This observation holds t r u e even with t h e recent developments of K a r m a r k a r ' s algorithm [1984] for linear programming, and t h e research t h a t followed it on interior point m e t h o d s . Furthermore, t h e superior performance of special purpose network algorithms has been documented for both pure and generalized networks, as well as for nonlinear programs. For example, in t h e mid-seventies, several studies established t h a t codes based on t h e network simplex algorithm for pure network problems were 150-200 times faster t h a n t h e state-of-the-art LP codes of t h e t i m e . See, for example, Glover et al. [1979] and Mulvey [1978]. In t h e early eighties research concentrated on t h e generalized network problem. Once more the network simplex algorithm was shown to be approximately 50 times faster t h a n LP codes. See, for instance, Brown and McBride [1985] and Mulvey and Zenios [1985]. This line of research was extended to t h e nonlinear network problem - see Dembo, Mulvey and Zenios [1989] - for a recent survey. Network specializations of nonlinear programming algorithms — like the primal t r u n c a t e d Newton or simplicial decomposition — were shown to be at least one order of m a g n i t u d e faster t h a n general purpose nonlinear programming solvers. Every development in network algorithms was followed by research to use the new algorithms in solving linear programs with large embedded networks. These efforts were generally successful in solving linear programs where t h e majority of constraints and variables had a network structure. Such programs are known as networks with side constraints and variables. In this category are included several well-known classes of problems: t h e processing (or blending) problem of Koene [1982], t h e equal flow problem of Ali, Kennington and Shetty [1988], the multicommodity network flow problem, Kennington and Helgason [1980] and so on. T h e applications of these problems in m a n a g e m e n t science are numerous and well documented in the above references. In this paper we develop an algorithm for solving nonlinear networks with side constraints and variables. Our p r i m a r y objective is to m a k e special-purpose nonlinear network optimization software applicable to a broader class of problems. T h e technique we propose here can also solve linear network problems with side constraints and side variables. As such it fits in the line of research pursued in t h e past by several others: McBride [1985], Chen and Engquist [1986], Glover and Klingman [1981], Chen and Saigal [1977] and so on. (See, also, Kennington and Helgason [1980, Chapter 7].) Even in t h e case of linear networks, however, our approach differs significantly from t h e earlier studies. Most of t h e earlier work dealt with specializations of the simplex algorithm. These specializations aimed at developing basis partitioning techniques t h a t would separate t h e network basis form t h e non-network component. These two separate components were treated using distinct computational procedures. Graph d a t a structures were used to carry out operations on a tree (corresponding to the network basis). General sparse m a t r i x factorizations were applied to the non-network component. W h e n applied to networks with few side constraints these methods were proven very efficient. T h e algorithm we propose here takes a different approach.
We use an exact
Solving Nonlinear
Programs
with Embedded
Network
Structures
179
penalty function to move t h e side constraints into the objective function. We then introduce a smoothing of t h e penalty t e r m in order to obtain a differentiable problem which we solve using a linearization procedure: simplicial decomposition. T h e use of penalty functions has been very effective in solving t h e multicommodity network flow problem. Ali, Kennington and Shetty [1988] use a relaxation approach together with subgradient optimization. Schultz and Meyer [1990] develop a barrier t y p e algorithm and Zenios, P m a r and D e m b o [1990] propose t h e use of a Linear-Quadratic Penalty (LQP) function. T h e last two algorithms were particularly successful in solving some very large problems from a Military Airlift C o m m a n d Application. In this paper we extend t h e L Q P algorithm of Zenios, P m a r and D e m b o [1990] from the multicommodity network flow problem to t h e more general networks with side constraints and variables. We also discuss key features of the software system G E N O S / L P t h a t we develop based on the L Q P algorithm. A comprehensive computational investigation provides information on t h e relative merits of t h e L Q P special purpose algorithm compared to general purpose optimizers. Section 2 formulates the problem we are going to solve and develops t h e algorithm. Section 3 describes t h e G E N O S / L P software system and reports its use on several diverse applications. Concluding remarks are given in Section 4.
2
The Linear-Quadratic Penalty Algorithm for Networks with Side Constraints and Variables
In this section we describe t h e Linear-Quadratic Penalty ( L Q P ) algorithm for nonlinear programs with embedded network structures. We begin with a formulation of the problem and proceed with t h e main components of t h e L Q P algorithm.
2.1
Problem Formulation
We consider t h e following nonlinear program:
[NLP] minimize x, z subject to
fix,
z) Ax
= b
Sx + Pz < d 0 < x < u 0
180
Mustafa
Q. Pmar & Stavros
A.
Zenios
x € 3i n i is t h e vector of decision variables which represent flows on a graph, z € 5R"2 is t h e vector of decision variables which represent the side (non-network) columns, A is an m x n^ constraint m a t r i x with network structure. It could be t h e n o d e - a r c incidence m a t r i x of a network flow problem, or a block-diagonal m a t r i x where each block is a n o d e - a r c incidence m a t r i x as occurs in multicommodity network flows, stochastic networks and time-staged problems. S is t h e s x t i j m a t r i x of side (i.e., non-network) constraints imposed on t h e network flow variables, P is t h e s x n? m a t r i x of side (i.e., non-network) constraints imposed on t h e side variables, u
€ 5J"1 are upper bounds on the flow variables x,
r 6 5R"2 are upper bounds on t h e side variables z, b £ 3J m , d € 5RS are t h e r i g h t - h a n d side coefficients of t h e constraints. Also, let X = {{x,z)\Ax
=
b,0<x
Throughout t h e manuscript, transposition is indicated by a superscript T , Vxf and V z / denote t h e gradient vector of t h e function / with respect to x and z, and all vectors are column vectors.
2.2
The Linear-Quadratic Penalty (LQP) Algorithm
To exploit t h e network structure we want to remove t h e side constraints and append t h e m to t h e objective. To this end we use an exact penalty function p(t) = m a x { 0 , i } .
(1)
where t is a scalar variable. By placing the side constraints into the objective function using a penalty function we obtain a problem with network constraints. In particular, for multicommodity flows or stochastic networks, the penalty problem has a disjoint constraint set. Unfortunately, using an exact penalty function like (1) produces a non-differentiable problem. To avoid t h e difficulties of non-differentiability we use a smoothing approximation to t h e exact penalty function. This approach has been proposed in t h e context of m i n - m a x optimization by Bertsekas [1975] and later by Zang [1980]. For t h e exact
Solving Nonlinear
Programs
with Embedded
Network
Structures
181
e ' Figure 1: T h e linear-quadratic penalty function penalty function (1), we consider t h e linear-quadratic penalty function of Zenios, Pinar and D e m b o [1990]: 0 (C,t):
t-
if t<0 if 0 < t < t if t> e
(2)
where t is a scalar real variable and e is a positive real number. T h e linear-quadratic penalty function is depicted in Figure 1. T h e linear-quadratic penalty function is used to eliminate t h e side constraints by placing those in t h e objective function. T h e nonlinear network problem obtained by penalizing t h e side constraints Sx + Pz < d is formulated as: [NETNLP] minimize
Φ(x, z) = f(x, z) + μ Σ_j φ(ε, p_j)

subject to

Ax = b,  0 ≤ x ≤ u,  0 ≤ z ≤ r
Ax = b 0<x
where p = Sx + Pz — d, the linear-quadratic penalty function is given by (2) and (i is a positive scalar which determines t h e severity of the penalty. T h e resulting nonlinear network problem is solved repeatedly with adaptively changing p a r a m e t e r s p and e until suitable stopping criteria are satisfied. T h e algorithm can be concisely stated as follows:
182
Mustafa Q. Pmar & Stavros
A.
Zenios
The Linear-Quadratic Penalty Algorithm LQP—0 (Initialization.) Find an initial feasible solution to t h e network component of N L P ignoring t h e side constraints, i.e., solve t h e problem minimize x, z subject to
fix,
z)
Ax = b 0 < x < u 0 < z < r
If t h e solution to this problem satisfies all side constraints, stop. Otherwise choose initial values for penalty parameters n and e and go to LQP—1. LQP—1 (Penalty Problem.) Solve - perhaps inexactly - t h e nonlinear network problem N E T N L P . Go t o L Q P - 2 . LQP—2 If t h e solution satisfies optimality criteria, stop. penalty p a r a m e t e r s fi and e and go to LQP—1.
Otherwise, adjust t h e
T h e r e are three main components of t h e L Q P algorithm which deserve special attention. T h e solution of the nonlinear network problem at step LQP—1 demands t h e most computational effort. This problem is solved using t h e network specialized version of simplicial decomposition algorithm, see Mulvey, Zenios and Ahlfeld [1990]. T h e second component is t h e multiplier adjustment procedure which is crucial to t h e efficient performance of t h e L Q P algorithm. Finally, we c o m p u t e lower and upper bounds to t h e optimal value and test stopping criteria. We study these topics next.
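Read as pseudocode, the outer loop is compact. The sketch below is a rough Python illustration, not code from GENOS/LP: the solver stub, the tolerances and the particular update rules are assumptions (the paper's own parameter adjustment is detailed in Section 2.4), and the smoothed penalty is the function of eq. (2).

    def lqp_penalty(t, eps):
        """Smoothed penalty phi(eps, t) of eq. (2)."""
        if t <= 0.0:
            return 0.0
        if t <= eps:
            return t * t / (2.0 * eps)
        return t - eps / 2.0

    def lqp(solve_network_penalty, violation, mu, eps, eps_min=1e-5, max_iter=50):
        """Outer LQP loop: solve the penalty problem, then adjust mu and eps.

        solve_network_penalty(mu, eps) -> (x, z), e.g. by simplicial decomposition
        violation(x, z)                -> vector Sx + Pz - d
        """
        x = z = None
        for _ in range(max_iter):
            x, z = solve_network_penalty(mu, eps)
            rho = violation(x, z)
            violated = [r for r in rho if r > eps]
            if not violated and eps <= eps_min:
                return x, z                         # eps-feasible at final tolerance
            if violated:
                mu *= max(violated) / (0.5 * eps)   # push harder on violated rows
            else:
                eps = max(eps_min, 0.5 * eps)       # tighten the smoothing
        return x, z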
2.3
Simplicial Decomposition for the Penalty Problem
Simplicial decomposition iterates by solving a sequence of linearized subproblems to generate e x t r e m e points of the feasible region of t h e network component and master problems which minimize the nonlinear objective function over t h e simplex spanned by t h e e x t r e m e points. For a more detailed t r e a t m e n t t h e reader is directed to Mulvey, Zenios and Ahlfeld [1990]. Here we discuss t h e specialization of t h e algorithm in solving t h e penalty problem N E T N L P . In particular N E T N L P is decomposed into a linear network problem t h a t can be solved by the network simplex algorithm and a simple linear program t h a t can be solved by inspection.
T h e Simplicial D e c o m p o s i t i o n Algorithm S D - 0 Set v = 0, and use (x0) G X as t h e starting point. Let Y = 0, and v «— 0 denote t h e set of generated vertices and its cardinality, respectively.
Solving Nonlinear Programs with Embedded Network
Structures
183
SD—1 (Linearized subproblem.) Compute the gradient of the penalty function $ at the current iterate ( x J and solve a linear program to get a new vertex of the constraint set, i.e., solve for j / " + 1 = argmin v e x yTV$(x",z") and let Y = Y\j{yv+1}, vi-v + l. SD—2 (Nonlinear master problem.) Using the set of vertices Y to represent a simplex over the constraint set X, find an optimizer of the penalized objective function $ over this subset of X. Let w* = arg mmwewv <&(Bw) where Wv = {wi\^1Wi = l,wi>0\/i = l,2,...,v} and B = [yl\y2\... |y»] is the basis for the simplex generated by the set of vertices Y. The optimizer of $ over the simplex is given by (*„+i 1 = Bw". SD—3 Let v *— v + 1, and return to Step 1. T h e S u b p r o b l e m . At step SD—1 a new vertex (*J is generated as the solution to the following subproblem: Minimize x,z subject to
xTVx$(x\
z") + zTVz${x\
z")
Ax = b 0< x < u 0< z < r where \ v ) is the iterate at the f-th iteration of simplicial decomposition. This problem decomposes into two independent linear programs as follows: Minimize x subject to
xTVx$(x",
z")
Ax — b 0< x < u and Minimize z subject to
z T V 2 $(x",z") 0< z < r
The first problem is a linear network problem and is solved using the network simplex method. The second problem is solved trivially by assigning each component Zj of the vector z to its lower or upper bound depending on the sign of the gradient V^xV'O.i.e., " TV if V ^ x V ) > 0 , . (J 0 if V ^ z V ) < 0
Mustafa Q. Pmar k. Stavros A. Zenios
184
However, when no upper bound is specified for the side variables, this procedure fails to produce an accurate approximation to the optimal value of the side variables. We consider an alternative scheme instead. Instead of taking a full step in the direction of either the lower or the upper bound as indicated by the sign of the gradient component, we choose a point between the current value of the side variable and the bound. This is allowed since the descent direction is not affected by this operation. Thus, the side variable portion of the new vertex is obtained as follows:
_ f a(r,--^) if V , , * ( x V ) > 0 3
if VZj${xv,zv)
~\a{z?)
<0
l j
where a is a positive scalar in the interval (0,1]. This procedure is reminiscent of the trust region methods, Dennis and Schnabel [1983]. Using this procedure, the values of the side variables were computed to five digits of accuracy where this level of accuracy was not attained using the first procedure after an identical amount of computation time. The accuracy verification was made by comparing the value reported by the LQP algorithm and that of the general purpose code MINOS. We used a = 0.5 in this study. The Master Problem. At step S D - 2 a nonlinear master problem optimizes the objective function on the simplex specified by the extreme points generated by the subproblems. The master problem is formulated in the form: Minimize subject to
$(Bw) V
I>. = 1 1=1
W{ > 0 i = 1,. . . , v where v is the number of extreme points generated by the subproblems, B is the matrix whose columns are the extreme points and w = [w1, u>2,. . ., w"\ are the corresponding weights. The master problem, though nonlinear, is of significantly smaller size than the original problem since it is posed as a problem over the weights w. There are several standard methods that can be used for its solution, like, for example, Bertsekas's projected Newton method [1982]. If the simplicial decomposition algorithm drops vertices that carry zero weight at the optimal solution of the master problem, then subsequent master programs are locally unconstrained. Hence, methods of unconstrained optimization can be used to compute a descent direction. A simple ratio test determines the maximum feasible step length that will not violate the bounds. The master program can be rewritten in the form: mm$(Dw) ui>0
(5)
Solving Nonlinear
Programs
with Embedded
Network
Structures
185
where D = [yi — yv\y2 — yv\... \yv-i — Vv] is t h e derived linear basis for t h e simplex generated by t h e vertices yi,y2,---,yv. We denote by w t h e vector [wi, w2,...,u^-i] and t h e solution for wv is computed as v-l
wv = 1 - J^ Wi
(6)
;=i
At t h e current iteration we have v — 1 active vertices (i.e. u>; > 0, for i = 1,... ,v — 1) and t h e last vertex yv lies along a direction of descent. Hence, given an iterate (x", z") a descent direction p to (5) can be obtained as t h e solution to {DTMD)p
= -DTV^(xl/,
z"),
(7)
T h e choice of t h e m a t r i x M and alternative solution m e t h o d s for system (7) are discussed in Zenios, P m a r and D e m b o [1990].
2.4
Adjusting the Penalty Parameters
T h e procedure used to u p d a t e t h e penalty parameters p and t consist of dynamically decreasing t h e value of e to a small final tolerance and increasing t h e value of p when certain criteria are met. Suppose ( * * ) , pk, £k are given at iteration k of t h e L Q P algorithm. Also let pk = Sxk + Pzk — d and define t h e set V(x,z) = {j\pj > e} to be t h e set of violated constraints. T h e iterate (*k j is t e r m e d e-feasible if t h e index set of violated constraints V(xk,zk) is empty. We distinguish between t h e following two cases when u p d a t i n g t h e penalty parameters: C a s e 1: If V(xk, zk) = 0, this is an indication t h a t t h e m a g n i t u d e of t h e penalty par a m e t e r /J, was adequate in t h e previous iteration since e-feasibility is achieved. In this case t h e infeasibility tolerance t should be reduced. C a s e 2 : If V(xk,zk) / 0, t h e current point is not t-feasible, an indication t h a t t h e penalty p a r a m e t e r p should be increased. Let 7 = rjek be a target degree of infeasibility where r\ £ ( 0 , 1 ] . We consider t h e following u p d a t e equation: ^+1
=
/( tft )
(8)
7 or equivalently,
±
H>» = / ^ -k
(9)
rjc
And if \V(xk,zk)\
> 1, we get /+
1
= H. m a x pk. rjt jev{xk,zk)
(10)
186
Mustafa Q. Pmar & Stavros A. Zenios
In summary we have the following update procedure: Pick 7/1, 7/2 e (0,1] If V{xk,zk) = % ek+1 = max{ £„,,-„, 7/i e*} Else „fc+i _ _n_ m a x _t " 2 £ jSV(i*,z«)
J
where em,-n is a suitable final feasibility tolerance. A suitable initial value for ft can be found through some preliminary experimentation. W i t h t h e test problems we used in this study, t h e absolute m a x i m u m of objective function coefficients proved to be a good choice. T h e solution (* 0 J obtained by ignoring t h e side constraints can be used to provide an initial value for e. A reasonable choice is to pick a value equal to a fraction of t h e m a x i m u m of t h e side constraint violations, i.e., in t h e interval (0,maxj 6 y( I o i jO) p°). T h e value of parameters ?/i and T/2 was taken to be 0.5 for all computational tests reported in this study.
2.5
Bounds to the Optimal Value and Stopping Criteria
It is possible to compute lower bounds to the optimal objective function value during t h e course of t h e L Q P algorithm. This computation is performed after the subproblem phase of t h e simplicial decomposition and is based on a first-order Taylor series expansion of t h e function $ around t h e current iterate. Let v* be t h e optimal value of N L P and (x. J an optimal solution of N E T N L P for given penalty parameters n and t and x = (x) for notational simplicity. Also, let X = {x\Ax = 6,0 < x < u , 0 < 2 < r). Then §(x,z)
(11)
since N E T N L P is a relaxation of N L P . Therefore the optimal solution of N E T N L P is a lower bound for the optimal objective value. But in t h e presence of inexact minimizations of t h e penalized objective function $ , this is not always guaranteed to be a lower bound. Hence, we consider t h e first order Taylor series expansion of $ around a point y * ( x ) = * ( y ) + (x - y ) r V $ ( y ) + o(||y|| 2 )
(12)
Ignoring t h e second order t e r m define t h e function h
Mx) = $(y) + (x - y ) T V $ ( y )
(13)
By convexity of $ , min^g^ h(x) < $ ( £ ) , and hence, min r e x h(x) < v*. This bound is readily computed by the simplicial decomposition algorithm that generates extreme
Solving Nonlinear
Programs
with Embedded
Network
Structures
187
points of X by minimizing a linearized approximation to t h e objective function over X; see step SD—1. However, it is possible to obtain tighter lower bounds to t h e optimal value as follows. We slightly change notation for expositional simplicity. We denote t h e side constraint m a t r i x by E and temporarily ignore t h e distinction between network and side variables. Let x be an arbitrary iterate and denote by Q(e,pk) = Y?j=\ 4>{ei p)) where pk = Exk — d. Recall t h e subproblem objective function of t h e simplicial decomposition algorithm: V $ ( x i ) T • x = ( V / ( x f c ) + pVpQ(e,
k P
fE)
•x
(14)
We define t h e following lower bound function V(u) = ( V / ( x ' £ ) T - uE)x'
+ ud
(15)
where t h e vector u is given by u = -liVpQ(e,pk)
(16)
and x ' is t h e solution to t h e linearized subproblem, i.e., x ' = a r g m i n x s x i r V $ ( x * : ) . Next we show t h a t V(.) provides a lower bound superior to the linearized subproblem bound. T h e analysis is a generalization of the result given in Brown et al. [1989]. To proceed we need the following intermediate result. L e m m a 1. Let g : -R71 i—> 5R be a convex and at least once continuously differentiable function with the property t h a t 9(0) = 0
(17)
where 0 is t h e zero vector. T h e n
g(y) - yTvg(y)
v y.
(18)
P r o o f . Consider a first-order Taylor series expansion of g. By convexity
g(*) > g(y) + (* - y)Tvg(y)
v x, y e »-
In particular, g{0) > g{y) -
yTVg{y)
However, by hypothesis g(0) = 0, and t h e result follows.
•
T h e assumption in L e m m a on / holds for example for linear objective functions and quadratic objective functions of t h e form J^- djX2-.
Mustafa Q. Pwar & Stavros A. Zenios
188
Proposition 2. Let $f$ be as in Lemma 1 and let $x' = \arg\min_{x \in X} x^T \nabla\Phi(\bar x)$, where $\bar x$ is the current iterate. Then $h(x') \le V(u)$, where $h$ is given by (13).

Proof. We will show that $h(x') - V(u) \le 0$. Equivalently, we want to show

$$\Phi(\bar x) + (x' - \bar x)^T \nabla\Phi(\bar x) - \big(\nabla f(\bar x)^T - u E\big) x' - u d \le 0.$$

Consider the left-hand side. Let $\rho = E\bar x - d$. By algebraic manipulation and using the definition of $u$,

$$\Phi(\bar x) + (x' - \bar x)^T \nabla\Phi(\bar x) - \big(\nabla f(\bar x)^T - u E\big) x' - u d
= f(\bar x) + \mu Q(\epsilon, \rho) - \nabla f(\bar x)^T \bar x + u(E\bar x - d)$$
$$= f(\bar x) - \nabla f(\bar x)^T \bar x + \mu Q(\epsilon, \rho) - \mu \nabla_\rho Q(\epsilon, \rho)^T (E\bar x - d)
= \big[f(\bar x) - \nabla f(\bar x)^T \bar x\big] + \mu \big[Q(\epsilon, \rho) - \nabla_\rho Q(\epsilon, \rho)^T \rho\big].$$
By the assumption imposed on $f$, the first term is nonpositive following Lemma 1. The nonpositivity of the second term follows from the fact that $Q(\epsilon, 0) = 0$, again by invoking Lemma 1. Hence the claim is established. $\Box$

We now describe the procedure for generating upper bounds in the linear-quadratic penalty algorithm. For a general discussion of bounding in exterior penalty function algorithms see Fiacco and McCormick [1968]. Define the set $R^0 = \{(x, z) \in X \mid Sx + Pz \ge d\}$ and assume that $R^0$ is non-empty. Let $(\tilde x, \tilde z) \in R^0$ and let $(\bar x, \bar z)$ be a -- perhaps approximate -- optimal solution of NETNLP. A new feasible point $(\hat x, \hat z)$ is then generated as follows: let $\tilde y = S\tilde x + P\tilde z - d$, let $\bar y = S\bar x + P\bar z - d$, and let $I = \{i \mid \bar y_i < 0\}$ denote the set of side constraints violated at $(\bar x, \bar z)$.
$$\beta = \min_{i \in I} \frac{\tilde y_i}{\tilde y_i - \bar y_i} \qquad (19)$$

and define

$$\hat x = (1 - \beta)\,\tilde x + \beta\,\bar x \qquad (20)$$
$$\hat z = (1 - \beta)\,\tilde z + \beta\,\bar z \qquad (21)$$
It is easily seen that $(\hat x, \hat z)$ is feasible for NLP and thus provides an upper bound; see Fiacco and McCormick [1968, Theorem 29, p. 107]. The same result also states that the upper bound converges monotonically to the optimal objective value. Obviously
this procedure requires an interior point to be generated at the beginning of the algorithm. For example, in Zenios, Pınar and Dembo [1990], a solution satisfying the mutual capacity constraints of the multicommodity flow problem is computed based on the solution to the network relaxation. We were also able to generate initial feasible solutions for the Naval personnel assignment problems used in this study due to a special property of the problem. This is detailed in the forthcoming section on numerical experiments. Therefore the LQP algorithm generates both upper and lower bounds for the optimal objective value during the execution of the algorithm for problems where an initial feasible solution can be computed. The algorithm terminates when both of the following error measures are within acceptable tolerance:

1. Absolute error in side constraint feasibility:
$$\|S\bar x + P\bar z - d\|_\infty \le \epsilon_{\min}$$

2. Bound gap:
$$\frac{f(\hat x, \hat z) - V(u)}{V(u)} \le \epsilon_{gap}$$

where $(\bar x, \bar z)$ is the current iterate and $(\hat x, \hat z)$ is obtained from (20)-(21). The values of $\epsilon_{\min}$ and $\epsilon_{gap}$ used in this study are $10^{-5}$ and $10^{-2}$, respectively. The ability to compute improving upper bounds is an important feature of our approach, since computation can be stopped as soon as a reasonable improvement in the upper bound is achieved.
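To make the bounding and termination machinery concrete, the following sketch shows one way the feasible point of (20)-(21) and the two stopping tests could be assembled. It is only an illustration: the array layout, the function names, and the use of the negative part of $Sx + Pz - d$ as the infeasibility measure are our own assumptions, not details of the GENOS/LP code.

```python
import numpy as np

def feasible_point(x_bar, z_bar, x_int, z_int, S, P, d):
    """Blend the (possibly infeasible) NETNLP solution (x_bar, z_bar) with a
    point (x_int, z_int) satisfying the side constraints, as in (19)-(21)."""
    y_int = S @ x_int + P @ z_int - d   # side-constraint slacks at the feasible point
    y_bar = S @ x_bar + P @ z_bar - d   # slacks at the NETNLP solution (may be negative)
    I = y_bar < 0                       # violated side constraints
    beta = 1.0 if not I.any() else float(np.min(y_int[I] / (y_int[I] - y_bar[I])))
    return (1 - beta) * x_int + beta * x_bar, (1 - beta) * z_int + beta * z_bar

def terminated(x_bar, z_bar, x_hat, z_hat, V_u, f, S, P, d,
               eps_min=1e-5, eps_gap=1e-2):
    """Check the two LQP stopping tests: side-constraint violation of the
    current iterate and the relative gap between f at the feasible point
    and the lower bound V(u)."""
    violation = np.linalg.norm(np.minimum(S @ x_bar + P @ z_bar - d, 0.0), np.inf)
    gap = (f(x_hat, z_hat) - V_u) / abs(V_u)
    return violation <= eps_min and gap <= eps_gap
```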
3 Numerical Experience
The LQP algorithm was implemented to solve problems of the form [NLP]. The code was written in Fortran 77. We refer to the code as the GENOS/LP system. The computational testing was performed on DECstations 3100 and 5100/200 running Ultrix and, for large problems, on a CRAY Y-MP. On the DECstations, the code was compiled with the default compiler optimization option. For the CRAY experiments the code was tailored to take advantage of the vectorization capabilities of the CRAY architecture. Before we report the results of computational testing, we briefly discuss the main components of the vectorized code.
3.1 Vector Computing
The simplicial decomposition algorithm is particularly rich in dense linear algebra computations which can be efficiently vectorized. We mention here the main components of the linear algebra involved in the LQP method.
Computing Descent Directions. The following system of linear equations is solved to compute a descent direction during the course of the simplicial decomposition algorithm:

$$(D^T M D)\, p = -D^T \nabla\Phi(x, z), \qquad (22)$$

where $D$ is a projection matrix, $\nabla\Phi$ denotes the gradient of the objective function, the pair $(x, z)$ is an arbitrary iterate, and $M$ is a matrix which usually approximates the second derivatives of the function $\Phi$. The matrix $D$ tends to be very large depending on problem size. Typically $D$ can be $100000 \times N$, where $N$ is the number of extreme points used in the master problem solution; $N$ varies from 1 to 100. The computation of the product $D^T M D$ can be very efficiently vectorized.

Function and Gradient Evaluations. Having computed a descent direction, a one-dimensional search is executed to compute the next iterate. The time spent in the search procedure is dominated by the computation of function values and the gradient vector. The function and gradient evaluation of the original linear objective function can be vectorized trivially, as it involves a simple DO-loop over all variables in the problem. However, the function and gradient values contributed by the nonseparable penalty function require the evaluation of the side constraints. These computations are also vectorized.

Other Linear Algebra. The solution of the system

$$(D^T M D)\, p = -D^T \nabla\Phi(x^k, z^k)$$

at every step of the master problem also requires the computation of the right-hand-side reduced gradient vector $D^T \nabla\Phi(x^k, z^k)$. This is a fully dense matrix-vector product suitable for vector architectures.

To illustrate the impact of vectorization on the performance of the above components, we give in Table 1 below the time spent in these master problem components during execution of the LQP algorithm on problem PDS3, both with and without vectorization. Compiler vectorization refers to automatic vectorization of the code using compiler options, whereas user vectorization refers to restructuring of code segments and use of library subroutines such as the BLAS (Basic Linear Algebra Subprograms), as explained in Pınar and Zenios [1990].

Opt. level                Descent Dir.   Other Lin. Alg.   Func. and Grad. Evals.
no vectorization              35.2            17.6                 36.1
compiler vectorization         3.7            13.2                 12.3
user vectorization             3.4             0.8                 12.0
Table 1: Reduction in CPU time spent in the main master problem components due to vectorization, for problem PDS3.

As evidenced by the results, significant gains are realized in the master problem phase with vectorization. It is not possible to improve the subproblem solution time through vectorization due to the inherently scalar nature of the graph data structures used in implementing the network simplex algorithm.
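The dense master-problem kernels described above are easy to express with standard BLAS-level operations. The sketch below is our own illustration (not the Fortran GENOS/LP code); it shows the reduced Hessian and reduced gradient products behind (22), which are exactly the operations that benefit most from vectorization.

```python
import numpy as np

def descent_direction(D, M, grad_phi):
    """Solve (D^T M D) p = -D^T grad_phi, cf. (22).  D is n-by-N with N the
    number of retained extreme points (N <= 100), and M approximates the
    Hessian of the penalized objective; both products below are dense and
    vectorize well."""
    reduced_hessian = D.T @ (M @ D)      # small N-by-N matrix
    reduced_gradient = D.T @ grad_phi    # dense matrix-vector product
    return np.linalg.solve(reduced_hessian, -reduced_gradient)
```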
3.2 Solving Multicommodity Network Flow Problems
The multicommodity network flow problem can be seen as a special case of the networks with side constraints model. The side constraints in this case have the following simple form: commodity flows on all or a fraction of the arcs compete for a joint arc capacity. This special structure of the side constraints considerably simplifies the computer representation and evaluation of these constraints in the context of the LQP algorithm. Extensive computational experience with multicommodity network flow problems using the LQP algorithm is reported in Zenios, Pınar and Dembo [1990]. Further work with a parallel decomposition of the algorithm is given in Pınar and Zenios [1990]. We only give a summary of the results here. The first set of test problems is a collection of linear multicommodity network flow problems derived from a Military Airlift Command (MAC) application; they are referred to as the Patient Distribution System (PDS) problems. The second set of test problems are randomly generated linear multicommodity network problems communicated to us by J.L. Kennington; see Ali and Kennington [1977]. The characteristics of the test problems are given in Table 2.

Problem   No. of arcs   No. of nodes   No. of commodities   No. of rows   No. of columns
PDS1          339            126               11               1473           3816
PDS3         1117            390               11               4593          12590
PDS5         2149            686               11               7546          23639
PDS10        4433           1399               11              15389          48763
PDS15        7203           2125               11              23375          79233
PDS20       10116           2447               11              31427         105728
KEN11         176            121              121              14694          21349
KEN13         225            169              169              28632          42659
Table 2: Characteristics of the multicommodity network flow problems.

Computational results with multicommodity flow problems using the LQP algorithm are given in Table 3. For each problem we report the total number of simplicial decomposition iterations, the number of extreme points retained upon completion, and the CPU time consumed during the subproblem and master problem phases of the simplicial decomposition algorithm. These two components comprise the total CPU
usage during execution of the algorithm.
Test Problem   Simplicial iters   GENOS/LP Subproblem time   Master time   Total time   OB1 time
PDS1                 23                   0.85                   1.01          1.86
PDS3                 41                  12.04                   6.73         18.77
PDS5                 71                  55.12                  37.97         93.09
PDS10               103                 232.31                 175.82        408.13        1530
PDS15               121                 559.35                 381.08        940.43
PDS20               145                1225.83                 720.06       1945.89       16000
KEN11                16                   7.4                    8.1          15.5            21
KEN13                87                  70.6                   66.8         137.5            67
Table 3: Multicommodity flow problem solution statistics on the CRAY Y-MP.

With all the test problems, both the infeasibility tolerance $\epsilon_{\min} = 10^{-5}$ and the bound gap tolerance $\epsilon_{gap} = 10^{-2}$ were achieved. PDS1 and PDS3 were also solved with the general purpose package MINOS of Murtagh and Saunders [1987]. The optimal values reported by MINOS matched the LQP optimal values to 5 digits. Some of these problems were solved on the same computer by Marsten et al. [1990] using the code OB1, based on interior point methods. The results are also given in Table 3. It is clear that the LQP algorithm substantially outperforms state-of-the-art implementations of interior point methods. By virtue of the linearization in the subproblem phase of simplicial decomposition, the linear network flow problems for individual commodities can be solved on parallel processors. The results of this study are reported elsewhere; see Pınar and Zenios [1990].
3.3 Solving the Naval Personnel Assignment Problem
In this section we report numerical results obtained using GENOS/LP on two Naval Personnel Assignment problems. Each year thousands of decisions are made to (re)allocate Navy Enlisted Personnel to a fleet of combat units and to mission areas within these units. Allocations are made in such a manner as to provide the best defense at the lowest cost. All mission areas within a combat unit require personnel with different skills to support operational capabilities. A unit's capability to perform its functions in all its mission areas is referred to as "readiness". Readiness is measured based on the skills of the personnel assigned. A shortage of skilled personnel would decrease the level of readiness of a mission area and thus degrade the capabilities of the unit. Clearly, maximizing the level of readiness is a complex decision making problem given the large number of mission areas and personnel to be matched. This problem can be formulated as a network optimization problem with side constraints and variables. The reader is directed to Krass and Thompson [1990] for more details
on the model. Expressed in matrix notation, the model has the following form:

$$\begin{array}{ll}
\min_{x,z} & cx - z \\
\text{s.t.} & Ax = b \\
& Sx + Pz \ge d \\
& 0 \le x \le u \\
& 0 \le z \le r
\end{array}$$
where $A$ is a node-arc incidence matrix for the network and represents the flow conservation conditions, and $S$ and $P$ are matrices used to capture the non-network requirements. The variables $x$ denote the flow variables and $z$ is the side variable which represents the level of readiness to be maximized. The objective is to maximize the level of readiness for all units considered and to minimize the cost of the assignment. For the Naval assignment problems used in this study, an initial feasible solution was readily computed, since the solution to the network relaxation satisfied the side constraints when the side variable was ignored. That is, let $x^0$ be a solution of the network relaxation

$$\begin{array}{ll}
\min_x & cx \\
\text{s.t.} & Ax = b \\
& 0 \le x \le u
\end{array}$$
If $x^0$ is such that $Sx^0 \ge d$, then $z^0$ is computed as

$$z^0 = \min_{i,j:\, p_{ij} \ne 0}\ \frac{\sum_k s_{ik}\, x_k^0 - d_i}{p_{ij}} \qquad (23)$$
where $s_{ij}$ and $p_{ij}$ denote the entries in the $i$-th row and $j$-th column of the matrices $S$ and $P$, respectively. The first model -- NAVY -- is a simplified version of the complete model, which we call HUGENAVY. The size and characteristics of both problems are given in Table 4. In addition, the problem NAVY has a nonzero assignment cost vector $c$, whereas the larger problem HUGENAVY has an assignment cost vector which is identically zero; the objective in problem HUGENAVY is solely to maximize the readiness level.

Problem      LP Rows   LP Columns   Network Nodes   Network Arcs   No. of side const.   Opt. value
NAVY            4144        6842         3457            6841              687           -2.72347 x 10^5
HUGENAVY       36013       64542        30639           64541             5374           -0.5340
Table 4: Problem characteristics of the Naval Personnel Assignment models.

Both problems were solved on the CRAY Y-MP. We give in Table 5 the solution
statistics of the LQP algorithm. All times are stated in CPU seconds exclusive of input/output. Major iterations refer to the total number of times step 1 of the LQP algorithm is executed. It is interesting to note that the larger Navy problem is solved in a time very close to the solution time of the smaller problem. This can be attributed to the larger number of major iterations the algorithm took in the case of the problem NAVY, because the smaller problem is more tightly constrained than the larger one. Since the iterates generated by the LQP algorithm become feasible only on termination, the previous observation leads to the conclusion that, though much larger in size, HUGENAVY is a relatively easier problem for our method.
Problem                 Simpl. iters   GENOS/LP Master time   Subproblem time   Total time   MINOS time   OB1 time
NAVY (CRAY Y-MP)              6                 45                  149              194          NA          NA
NAVY (DEC 5100)               6                132                 1428             1560         600          NA
HUGENAVY (CRAY Y-MP)          2                157                  181              276          NA         150
Table 5: Performance of the LQP algorithm on the Naval personnel assignment problems.

We also report the solution time of MINOS for NAVY. The LQP algorithm was outperformed by MINOS on this problem. However, MINOS was not able to produce a feasible solution to HUGENAVY after one hour of CPU time on a CRAY Y-MP, whereas this problem was solved within 5 minutes using the LQP method. On the other hand, the same problem was solved in less than 3 minutes using the OB1 code based on interior point methods. This indicates that the LQP algorithm, based on nonlinear programming technology, is competitive with the more recent interior point based linear programming technology, while it outperforms the state-of-the-art simplex based linear programming code MINOS. We also note that the value of the side variable which represents the readiness level was computed to 5 digits of accuracy by the LQP method, as confirmed for both problems by the OB1 and MINOS solutions. We also experimented with nonlinear versions of the NAVY and HUGENAVY problems. We refer to these problems as NAVYQ and HUGENAVYQ; the objective function is a separable quadratic function of the form $\sum_j a_j x_j^2$. For the problem NAVYQ, the coefficients $a_j$ are precisely the coefficients given in the linear model. For the HUGENAVYQ problem the coefficient vector was taken to be identically unity. We report the solution statistics in Table 6. Both results were obtained on a CRAY Y-MP.
Problem      Major iters   Subprob. time   Master time   Total time   Lower bound       Upper bound
NAVYQ             18             89             367           456      -273025.94        -252403.
HUGENAVYQ          8            270             867          1137      0.7220 x 10^8     0.7320 x 10^8
Table 6: Performance of the LQP algorithm on the nonlinear Naval personnel assignment problems.

With both problems, the infeasibility tolerance $\epsilon_{\min} = 10^{-5}$ was attained on termination. Using MINOS to solve NAVYQ, a feasible solution with objective function value -269515.4 was obtained in 485 CPU seconds on the CRAY Y-MP. This solution is 1% better than the best feasible solution produced by the LQP algorithm in 456 seconds. However, the LQP algorithm produced a more accurate solution on HUGENAVYQ. MINOS was not used for this problem due to the anticipated CPU time and memory requirements.
3.4 Solving Constrained Matrix Estimation Problems
In this section we report results with constrained versions of two matrix estimation problems from the World Bank. The matrix estimation problem is that of adjusting the entries of Social Accounting Matrices (SAM) for an economy, and can be formulated as a nonlinear network optimization problem; see Zenios, Drud and Mulvey [1989]. The first problem, SAMKE, is a SAM model for Kenya, and the second problem, SAMBO, is a SAM model for Botswana. Both problems were derived from econometric studies conducted at the World Bank. Both problems have separable objective functions. The problem SAMKE has a weighted entropy objective function of the form $a\,x(\log x - 1.0)$ for each flow variable, and the problem SAMBO has a quadratic objective function of the form $a(x - b)^2$. We note that these functions do not satisfy the assumption of Lemma 1 of Section 2.5, and therefore we rely on the subproblem lower bound for these problems. Since no initial feasible solution can be readily obtained, we do not compute upper bounds. We constructed side constraints for these problems as follows. A number of arcs were randomly chosen, and a fraction of the sum of the optimal flow values on these arcs was taken as the right-hand side of the inequality. This was repeated as many times as the number of side constraints we added to the problem. Therefore, the side constraints have the following form:

$$\sum_{(i,j) \in \mathcal{E}} x_{ij} \;\ge\; \beta \sum_{(i,j) \in \mathcal{E}} x_{ij}^*$$

where $\mathcal{E}$ is an arbitrary subset of the arcs of the underlying graph, $x_{ij}$ is the flow variable for arc $(i,j)$, $x_{ij}^*$ are the optimal flows obtained by solving the original matrix estimation problem, and $\beta \in (0, 1]$. Characteristics of the problems are summarized in Table 7.
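A small sketch of the side-constraint generator just described; the sampling fraction, the random seed, and the representation of the optimal flows as a dictionary keyed by arc are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def make_side_constraints(arcs, x_opt, n_constraints, beta, frac=0.5, seed=0):
    """Build constraints  sum_{(i,j) in E} x_ij >= beta * sum_{(i,j) in E} x*_ij
    for randomly sampled arc subsets E.  Returns (arc_subset, rhs) pairs."""
    rng = np.random.default_rng(seed)
    constraints = []
    for _ in range(n_constraints):
        mask = rng.random(len(arcs)) < frac          # pick a random subset of arcs
        subset = [a for a, keep in zip(arcs, mask) if keep]
        rhs = beta * sum(x_opt[a] for a in subset)   # fraction of optimal flow on E
        constraints.append((subset, rhs))
    return constraints
```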
Figure 2: Variation of t h e solution t i m e for constrained m a t r i x estimation problem S A M B O as a function of t h e number of side constraints with G E N O S / L P and M I N O S .
Problem    Network Nodes   Arcs   Optimal value
SAMKE            50         202      -7768.11
SAMBO           128         662         10.71
Table 7: Characteristics of the matrix estimation problems.

Starting with one side constraint, both problems were solved with an increasing number of side constraints using the LQP algorithm. Tests were performed on a DECstation 3100. All times are given in CPU seconds. We provide in Figure 2 the variation of the CPU time taken by the GENOS/LP algorithm to solve SAMBO and the CPU time taken by MINOS on the same problem as a function of the increasing number of side constraints added to the problem. We observe that while the MINOS time does not vary considerably, GENOS/LP outperforms MINOS by a significant margin. However, the advantage of GENOS/LP is reduced as the number of side constraints increases. This is not surprising since the LQP method is more sensitive to the size of the network component, which gets smaller in percentage as more side constraints are added. In Figure 3, we plot the optimal values reported by GENOS/LP and MINOS for the problem SAMBO as a function of the number of side constraints. In all experiments, GENOS/LP was able to produce reasonably accurate solutions to the problem. We provide in Table 8 a summary of the LQP algorithm statistics for both problems. The problems are referred to as SAMKE25 and SAMBO20 to indicate the
Figure 3: Variation of the optimal value for constrained matrix estimation problem SAMBO as a function of the number of side constraints with GENOS/LP and MINOS.
number of side constraints present in the problem.

Problem     Major iters.   Infeasibility     Total time   Objective value   Lower bound
SAMKE25          3          5.82 x 10^-4         14.          -7624.85        -7678.11
SAMBO20          7             6 x 10^-4        303.             28.10           21.89
Table 8: LQP statistics for the matrix estimation problems.

The lower bounds in both cases are not very tight. The solution statistics with MINOS for problems SAMKE25 and SAMBO20 are given in Table 9.

Problem     Number of iterations   Optimal value   CPU time
SAMKE25             475               -7630.94          7
SAMBO20            1804                  27.73        472
Table 9: Performance of MINOS on the matrix estimation problems.

Although MINOS provides more accurate solutions and was faster on the smaller SAMKE25, the LQP method outperforms MINOS on the larger problem with respect to CPU usage for different numbers of side constraints, while producing an acceptable level of accuracy.
3.5 Analyzing the NETLIB Test Problems
Using the network extraction heuristics of Bixby and Fourer [1988], we analyzed a subset of the NETLIB linear programming test problems. A summary of the characteristics of the test problems and the associated network formulations is given in Table 10. As can be observed from the statistics, two of the problems have a large enough network component to warrant some attention.

Problem     LP Rows   LP Columns   LP Opt. Val.       Network Nodes   Network Arcs   Side Const.   Side Vars.
RECIPE          92        180      -2.6661 x 10^2           54             140            30            14
GREENBEA      2400       5443      -7.2462 x 10^7          895            4641          1423           606
GFRD-PNC       617       1092       6.9022 x 10^6          523            1071            69             1
SCAGR25        472        500      -1.4753 x 10^7          372             200           147           127
SCRS8          491       1169       9.9429 x 10^2          301            1096           156            78
SHIP12L       1165       5427       1.4701 x 10^6          735            5321           104             -
SIERRA        1228       9252       1.5394 x 10^7          878            2726           349             -
STANDATA       468       3686       1.2576 x 10^3           96             331           226           789
Table 10: The NETLIB problems in LP and network forms.

Our experience with the NETLIB problems using the LQP algorithm revealed that solving these problems as general linear programs is more efficient. We report results with two problems, GFRD-PNC and SHIP12L. The LQP statistics are given below in Table 11. The number of major iterations refers to the number of executions of step 1 of the LQP algorithm. We also report the final objective function value reported on termination and the best lower bound computed thus far. Infeasibility refers to the maximum degree of violation of the side constraints. It was not possible to provide an initial feasible solution for these problems, and hence no feasible iterates and no upper bounds on the optimal value were computed. The tests were performed on a DECstation 3100. SHIP12L was solved in 110 CPU seconds using MINOS, and GFRD-PNC was solved in 25 seconds using the same code.

Problem     Major iters.   Infeasibility     Total time   Objective value   Lower bound
GFRD-PNC         10         0.242 x 10^-2       8240.9      7.581 x 10^6     6.766 x 10^6
SHIP12L          15         2.37                7200        1.768 x 10^6     NA
Table 11: LQP statistics for the NETLIB problems.

The statistics clearly indicate that these problems proved to be extremely hard for the LQP algorithm. In the case of GFRD-PNC, although an acceptable level of infeasibility and a reasonable lower bound were achieved, the objective value is off the known optimal value by a considerable margin. The case of SHIP12L was more problematic.
The computation was stopped after two CPU hours and the iterate was still far from reaching the absolute infeasibility tolerance $\epsilon_{\min} = 10^{-5}$. It was also impossible to assess the quality of the solutions to the nonlinear network (penalty) problems due to the poor quality of the lower bounds. We also note that both problems had a dense side constraint matrix structure, a factor which affects the LQP algorithm negatively. To conclude, we remark that the LQP technology was not effective in dealing with the NETLIB problems, although the experience contributed to the robustness testing of the GENOS/LP code.
3.6 Integration with MINOS
The LQP method quickly delivers an approximate solution to the problem. When higher accuracy is needed, a linear programming solver may be used. The linear program NAVY was solved with the general purpose linear programming solver MINOS of Murtagh and Saunders [1987]. The statistics are given in Table 12.
Number of Phase-I pivots      1247
Total number of pivots        2423
CPU time (DEC 5100)           10 mins.
Table 12: Performance of MINOS on NAVY.

Interfacing the GENOS/LP system with MINOS may provide MINOS with an advanced starting point. However, since the LQP algorithm is essentially based on an exterior point penalty function, no basis for the problem is readily available. The optimal network basis produced as a result of solving the linear network subproblems is input to MINOS. This idea produced a significant reduction in the number of pivots taken by MINOS to reach optimality. A comparison is given below in Table 13.
                              MINOS   MINOS with advanced start
Number of Phase-I pivots       1247            1750
Total number of pivots         2423            1782
Table 13: Performance of MINOS on NAVY using an advanced start.

As can be observed from Table 13, the total number of iterations was reduced significantly. Due to the anticipated CPU usage, this strategy was not applied to HUGENAVY.
4 Conclusions
We presented in this paper a solution method suitable for large scale optimization problems with embedded network structures, and results of extensive computational testing with various test problems. The LQP method is an exterior point method based on a smooth penalty function, and it can produce feasible iterates if an initial feasible point is available. For applications where several problem instances need to be solved, such as the Patient Distribution System and the Naval Personnel Assignment, there is a high payoff in exploiting the network structure and developing a specialized algorithm. Particularly as the problem size gets bigger, the benefits of using the LQP algorithm become more accentuated; in this paper we presented strong evidence to support this claim. For smaller problems it is more beneficial to use general purpose algorithms, as evidenced by the analysis with the NETLIB problems. Another important factor which affects the performance of the LQP algorithm is the sparsity pattern of the side constraint matrix. With the PDS problems and the Naval personnel assignment problems the side constraint matrix had a favourable sparsity pattern, whereas the NETLIB problems had a very dense structure. However, the LQP algorithm may still be a viable alternative for smaller problems with relatively few side constraints, as observed in the case of the constrained matrix estimation problems. In summary, the LQP algorithm is able to provide approximate solutions to the problem quickly. For large problems it outperforms state-of-the-art general purpose optimization software, while it remains competitive with the more recent interior point based optimization technology. To achieve higher accuracy, the LQP solution can be used as an advanced start for a general purpose linear programming solver.

Acknowledgments. This research was partially supported by NSF grants SES-91-00216 and CCR-91-0402 and AFOSR grant 91-0168. The assistance of Mr. Ted Thompson and Mr. Iosif Krass in supplying the data for the Navy Personnel Assignment problem is gratefully acknowledged. Mr. John Gregory kindly provided assistance with the CRAY experiments and OB1. Professor Bob Fourer kindly made his network extraction program available.
References

[1] A.I. Ali and J.L. Kennington, MNETGN Program Documentation, Technical Report IEOR 77003, Department of Industrial Engineering and Operations Research, Southern Methodist University, Dallas (1985).

[2] A.I. Ali, J.L. Kennington and B. Shetty, The Equal Flow Problem, European Journal of Operational Research 36 (1988) 107-115.

[3] D.P. Bertsekas, Nondifferentiable Optimization via Approximation, Mathematical Programming Study 3 (1975) 1-25.
[4] D.P. Bertsekas, Projected Newton Methods for Optimization Problems with Simple Constraints, SIAM Journal on Control and Optimization 2 0 (1982) 221-246. [5] R.E. Bixby and R. Fourer, Finding E m b e d d e d Network Rows in Linear Programs I. Extraction Heuristics, Management Science 3 4 (1988) 342-376. [6] G.G. Brown, G.W. Graves, H. Lange, C. Staniec and R.K. Wood Dual Decomposition Methods for Solving Multicommodity Flow Problems, Technical Report, Naval P o s t g r a d u a t e School (1989). [7] G.G. Brown and R.D. McBride, Solving Generalized Networks, Management ence 3 0 (1984) 1497-1523.
Sci-
[8] C.H.J. Chen and M. Engquist, A Primal Simplex Approach to P u r e Processing Networks, Management Science 32 (1986) 1582-1598. [9] S. Chen and R. Saigal, A Primal Algorithm for Solving a Capacitated Network Flow Problem with Additional Linear Constraints, Networks 7 (1977) 59-79. [10] R.S.Dembo, J.M. Mulvey and S.A. Zenios, Large-Scale Nonlinear Network Models and Their Application. Operations Research 3 7 (1989) 353-372. [11] J . E . Dennis,Jr. and R . B . Schnabel, Numerical Methods for Unconstrained mization and Nonlinear Equations (Prentice-Hall, New Jersey, 1983). [12] A.V. Fiacco and G.P. McCormick, Nonlinear Programming: Sequential strained Minimization Techniques (John Wiley, New York, 1968).
Opti-
Uncon-
[13] A . B . Gamble, A.R. Conn and W . R . PuUeyblank, A Network Penalty Problem, Mathematical Programming 50 (1991) 53-74. [14] F . Glover, J. Hultz, D.Klingman and J . S t u t z , Generalized Networks: A Fund a m e n t a l C o m p u t e r Based Planning Tool, Management Science 2 4 (1978) 12091220. [15] F . Glover and D. Klingman, T h e Simplex SON Algorithm for L P / E m b e d d e d Network Problems, Mathematical Programming Study 15 (1981) 148-176. [16] N. Karmarkar, A New Polynomial T i m e Algorithm for Linear P r o g r a m m i n g , Combinatorica 4 (1984) 373-395. [17] J.L. Kennington and R.V. Helgason, Algorithms Wiley and Sons, New York, 1980).
for Network Programming
(John
[18] J. Koene, Minimal Cost Flow in Processing Networks, A P r i m a l Approach, P h . D . Thesis, Eindhoven University of Technology, Eindhoven, T h e Netherlands (1982).
202
Mustafa C. Pinar & Stavros
A.
Zenios
[19] LA. Krass and T . J . Thompson, M a t h e m a t i c a l Formulation of E D P R O J - Readiness Connection, Technical Report, Navy Personnel Research and Development Center, San Diego CA (1990). [20] R. Marsten, R. S u b r a m a n i a n , M. Saltzman, I. Lustig and D. Shanno, Interior Point Methods for Linear Programming, Interfaces 2 0 (1990) 105-116. [21] R . D . McBride, Solving E m b e d d e d Generalized Network Problems, Journal of Operational Research 2 1 (1985) 82-92.
European
[22] J . M . Mulvey, S.A. Zenios and D.P. Ahlfeld, Simplicial Decomposition for Convex Generalized Networks, Journal of Information and Optimization Sciences 1 1 (1990) 359-387.
[23] J.M. Mulvey and S.A. Zenios, Solving Large Scale Generalized Networks, of Information and Optimization Sciences 6 (1985) 95-112.
Journal
[24] J . M . Mulvey, Testing of a Large Scale Network Optimization Program, matical Programming 15 (1978) 291-315.
Mathe-
[25] B.A. M u r t a g h and M.A. Saunders, MINOS 5.1 User's Guide, Report SOL 8 3 20R, December 1983 (revised J a n u a r y 1987), Stanford University. [26] M.C. P m a r , Decomposition and Parallel Solution of Network Structured Optimization Problems, P h . D . Thesis, University of Pennsylvania, Philadelphia PA 19104 (1992). [27] M.Q. P m a r and S.A. Zenios, Parallel Decomposition of Multicommodity Network Flows using Smooth Penalty Functions, ORSA Journal on Computing 4 (1992) (forthcoming). [28] G.L. Schultz and R.R. Meyer., An Interior Point Method for Block Angular Optimization, SIAM Journal on Optimization 1 (1991) . [29] I. Zang, A Smoothing-out Technique for Min-Max Optimization, Programming 19 (1980) 61-77.
Mathematical
[30] S.A. Zenios, A. Drud and J.M. Mulvey, Balancing Large Social Accounting Matrices with Nonlinear Network Programming. Networks 19 (1989) 569-585. [31] S.A. Zenios, M.C. Pinar and R.S. Dembo, A Smooth Penalty Function Algorithm for Network Structured Problems, D e p a r t m e n t of Decision Sciences Report 9 0 12-05, T h e W h a r t o n School, University of Pennsylvania, Philadelphia, PA. 19104 (1990).
On Algorithms for Nonlinear Dynamic Networks

Warren B. Powell, Elif Berkkam, and Irvin J. Lustig
Department of Civil Engineering and Operations Research, School of Engineering and Applied Science, Princeton University, Princeton, NJ 08544 USA
Abstract
We consider the problem of minimizing costs over a dynamic, acyclic network with convex, separable link cost functions. The standard approach is to formulate the problem as a convex, separable optimization problem subject to flow conservation constraints, where the decision variable is the flow $x_{ij}$ on link $(i,j)$. We show that standard network algorithms applied to dynamic problems exhibit surprisingly poor performance for networks with as few as 10 or 20 time periods, suggesting that dynamic networks are intrinsically much harder to solve than static networks of comparable size. The problem can be reformulated using decision variables $\theta_{ij}$, which give the fraction of the total flow passing through node $i$ that should be routed over link $(i,j)$. This formulation has been used by other researchers in the development of parallel algorithms which take advantage of the simple constraint structure. We show that this reformulation produces substantially faster execution times for Frank-Wolfe type methods than the same methods applied to the standard formulation.
1 Introduction
We consider the problem of optimizing flows over a dynamic network with separable nonlinear cost functions. The problem can be motivated by problems arising in network models of dynamic fleet management for common carriers in freight transportation (truck, rail, containers). In these models, supplies of vehicles enter the network in the first few time periods but then flow through the network and exit
via a supersink. An important characteristic of these problems is that most of the nodes are pure transshipment nodes and only the supersink is a deficit node. Powell et al. [11] present a model with this structure to manage a fleet of trucks over time under uncertain demands. Gallagher [5] introduces a similar model with multicommodity flows to optimize the routing of messages over communication networks. We show in this paper that dynamic networks are intrinsically much more difficult to solve with first-order methods than static networks of comparable size. Algorithms applied to standard formulations of nonlinear dynamic networks can perform very poorly. Even special packages for nonlinear networks such as GENOS (Mulvey and Zenios [8]) require exceptionally long run times for relatively small networks with 10 or 20 time periods. By contrast, even relatively simplistic algorithms such as Frank-Wolfe are shown to work quite well when applied to a different formulation of the same problem. There are two approaches that can be used to solve this problem. The first is to view it as a standard minimum cost flow problem with nonlinear cost functions and link flows, $x_{ij}$, as decision variables. The second approach does not require the use of any network flow algorithms but takes advantage of the acyclic structure of the network (Figure 1) to describe the impact of decisions made now on future costs. This new approach uses flow fractions, $\theta_{ij}$, as decision variables. The decision variable $\theta_{ij}$ represents the fraction of total flow passing through node $i$ that is to be routed over link $(i,j)$. We refer to the classical formulation, which uses $x_{ij}$ as decision variables, as NDN-X (Nonlinear Dynamic Network with X variables), and refer to the new formulation, which uses $\theta_{ij}$, as NDN-T. The formulation NDN-T appears to have been first introduced by Gallagher [5] for multicommodity flow problems arising in telecommunications. The motivation for the formulation was the development of distributed algorithms for routing in telecommunications. As shown below, the NDN-T formulation uses a very simple constraint structure that lends itself easily to parallel computation. The same formulation was developed independently by Powell et al. [11] in the context of managing fleets of vehicles under uncertainty. However, neither of these papers really investigates the behavior of solution algorithms for dynamic networks. Researchers have long realized that the structure of dynamic networks could be used to develop specialized algorithms (see, for example, Aronson [1]). By contrast, there has been relatively little recognition of the challenges posed by dynamic networks. There are relatively few algorithms specialized for dynamic networks (see, for example, the extensive review in Aronson and Chen [2]). Aronson [1] presents a specialization of the network simplex algorithm that takes advantage of breaks in the tree that limit the effect of pivots on earlier time periods. This result, however, appears to be restricted to pure dynamic networks that arise in inventory planning problems. White and Bomberault [12] offer a specialization of a primal-dual algorithm for dynamic networks motivated by empty railcar models. Powell et al. [11] propose a stochastic formulation of the dynamic vehicle allocation problem which produces
a nonlinear, dynamic network with the structure considered here. A flow splitting algorithm of the type presented above is introduced and shown to exhibit good performance in limited tests. Bertsekas [3] and Bertsekas et al. [4] explore in depth enhancements of the flow splitting formulation we refer to as NDN-T. Taking advantage of the simple structure of the constraint set, these papers develop projection algorithms and second order algorithms. In this paper we use this formulation to expose the impact of the dynamic structure of the network on algorithmic performance. Section 2 presents the formulation of the nonlinear dynamic network problem in the traditional form NDN-X. Section 3 presents a more detailed description of the NDN-T formulation in terms of dynamic networks, and states a simple backward recursion for calculating derivatives, with a derivation left to the appendix. Section 4 outlines several standard algorithms that can be used with the new formulation, taking advantage of the simple structure of the constraint set. Finally, Section 5 compares the NDN-X and NDN-T formulations.
2 The NDN-X Formulation
We begin by presenting t h e basic problem in a form t h a t explicitly reflects t h e dynamic structure of t h e problem. Let i and j refer to cities (points in space) and let a node in t h e network be denoted by (i,t). For notational simplicity only, we assume any links e m a n a t i n g from (i,t) t e r m i n a t e in period t + 1. Define R\ x\j fij(xlj)
=
net surplus (R\ > 0) or deficit (R\ < 0) at node i at t i m e t
=
total flow from i to j departing at t i m e t (and arriving at t i m e t + 1)
=
cos
t °f sending flow on the link from node (i,t)
to node (j,t + 1)
T h e formulation N D N - X is then minimize
F(x)
=
Y.J2 t
subject to
X) i
fij(xh)
j
Ax
=
R
(1)
x
>
0
(2)
where A is t h e node-arc incidence m a t r i x for t h e network. It is assumed t h a t the functions ff- are convex and t h a t R is the vector of surpluses and deficits at each node. This formulation produces a problem with a separable, nonlinear objective function with nonseparable network constraints. T h e relative ease with which derivatives can be calculated, due to separability, makes this traditional formulation favorable. In addition, t h e conservation of flow constraints can be handled in the context of first order nonlinear p r o g r a m m i n g algor i t h m s , such as t h e Frank-Wolfe algorithm, since t h e linearized subproblems are pure
206
W. B. Powell, E. Berkkam,
and I. J.
Lustig
networks. T h e gradient g of F is defined by
If a first-order algorithm is applied to N D N - X , then t h e linearized subproblem is based on t h e gradient at some point x. This problem is then a pure network problem of t h e form minimize
X ) 1C 5Z S i j ^ l ' ) ' vh t
i
j
subject to
Ay
=
R
(4)
y > o T h e Frank-Wolfe algorithm applied to N D N - X generates iterates i ' * ' for k > 0 using t h e following steps: Step 1. Set z( 0 ) = 0 and find g(x^). Solve (4) with x = x<0' and set k = l , so t h a t our initial solution is i ' 1 ' = y(°\ Step 2. Evaluate g'^x^). optimal solution.
Solve t h e linear network problem (4) and let yW be t h e
Step 3. Find the o p t i m u m step size o* by solving min 0
Step 4. U p d a t e a;'*-*"1) = x « + a*(yW
- i«).
1
|F(I( +'))-F(I('))|
L Step 5. If i . (t), < e, then stop, otherwise set k *— k + 1 and go to step 2. T h e problem with a s t a n d a r d application of Frank-Wolfe is t h e n a t u r e of t h e e x t r e m e solution, as depicted in Figure 2. T h e structure of t h e problem, where flows typically enter t h e network in t h e first few t i m e periods and leave through the supersink, produces a p a t t e r n of flows from the subproblem t h a t follows a tree. In fact, we are actually just solving a shortest p a t h problem into t h e supersink over an uncapacitated network with linear costs. For dynamic networks with at least five or ten t i m e periods, t h e result is a set of flows t h a t converges on a single p a t h several t i m e periods into t h e problem. This e x t r e m e point solution is an unusually poor approximation to t h e optimal solution, and hence is t h e cause of extremely slow rates of convergence. In fact, standard algorithms, applied to even relatively small problems with a large (greater t h a n 20) number of t i m e periods converge so slowly t h a t they may stop well short of optimality. In t h e next section, we show how a simple transformation takes advantage of t h e dynamic structure of the problem, and allows us to develop very simple and efficient algorithms for solving these networks.
On Algorithms for Nonlinear Dynamic Networks
3
207
The Transformed Problem NDN-T
In this section we give a different formulation of the same problem, where the decision variables are fractions of total supply at a node instead of link flows. The transformation from the original formulation NDN-X into this new formulation is performed using
*« = *irSj
(5)
where 9{j 0% P 6 §* S* S* + R\
= = = = = = =
fraction of total supply at node i at time t to be sent to node j at time t + 1 {..., 0\j, • • •} = values of 9 at a fixed time t total number of time periods {6\6\...,6P} 2 {0\0 ,...,0t} = all 6 values up to time t total endogenous flow through node i at time t total available supply at node i at time t
For simplicity of notation, we assume all flows pass from period t to t + 1. Flows from period t to t + m, m > 1, are easily handled, whereas flows between regions within a time period would require us to solve certain systems of linear equations, adding substantially to the computational effort. The supply at a node S\ is defined not as a constraint but is actually an implicit function of 0. This is because Sj+1 can be calculated as
sr i =-R5 +i +ix--$
(6)
As a result, the total flow 5 ' through a node depends on the partial vector 0* that gives the decisions made in earlier time periods. We will henceforth use the notation 5 ' and Sj (0) interchangeably, using the latter when the stress on the functional nature of S* is needed. Our objective is to minimize the total cost of flow along the links over the entire network using the link cost functions IhW = fb(°h • s<(0))-
(7)
The optimization problem can be stated as follows : minimize
F(0)
= E E E 4 W
6
subject to
t
Y,eh
=
l
i
(8)
i Vi
>*
(9)
Vi,j,t
(io)
j
0\3 > o
208
W. B. Powell, E. Berkkam,
and I. J.
Lustig
T h e first constraint ensures t h a t t h e sum of all the fractions of total supply at a particular node i add up to unity and t h e second one is t h e nonnegativity constraint of t h e flow fractions. By (7) and (6), we have t h a t t h e new formulation produces a nonseparable objective function, b u t separable constraints. For t h e special case of dynamic networks we show in t h e appendix t h a t t h e derivatives of F(9) can be calculated using only nominally more effort t h a n required for t h e original N D N - X case, producing subproblems t h a t are trivial to solve when using a Frank-Wolfe algorithm. T h r o u g h a straightforward application of t h e chain rule, we can obtain t h e following recursions for t h e derivatives:
dF
_
(dfj3
Wi ~ U ^
Jfasm
dF +
\
3Sp(0)J ' ( )
= y(M.#. +
yxdx^
_a^.*:\ °'^ ds}+1(6) ")
( }
(12) {
'
To simplify t h e notation, let t h e gradients be denoted as
9ij
=
(13)
-QQC
It will be clear t h a t g\- represents t h e gradient of F evaluated at some point 6. These derivatives can be found by a backward pass, where t h e main idea is to start at t i m e t = P and move backward in time over t h e entire network. T h e B a c k w a r d P a s s procedure works as follows where it is assumed t h a t 9\- and S\ are known for all i, j , and t. Step 1. Set gfj+1 = 0 for all i,j and ^f +1 = 0 for all i. Step 2. For each t = P, P — 1 , . . . , 1 , compute
4 = (H+3+iVs -d
§\ =
JX(||
+
#»
^ (16)
T h e loop is derived from t h e recursions (11) and (12). Note t h a t
is the gradient of t h e original objective function evaluated at the value x\- = Q\- • S\.
On Algorithms
for Nonlinear
Dynamic
Networks
209
For t h e overall algorithm we need t o calculate t h e supplies S* and t h e gradients '• for all nodes and all t i m e periods using a given 6 value. T h e values S\ can be found by a forward pass, where t h e main idea is t o start at t i m e t = 1 and move forward in t i m e over t h e entire network. This F o r w a r d P a s s procedure works as follows: Step 1. Set Sj = R] for all i t o determine t h e initial supplies for period 1.
Step 2. For each t = 1,2,..., P - 1, update for each j , S]+1 = Rfl + £,- fy • Sj.
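A compact sketch of the two passes is given below. It assumes, for illustration only, that the fractions theta and supplies R are stored in dictionaries keyed by (i, j, t) and (i, t), that every node pair is treated as a potential link (a real implementation would use adjacency lists), and that df(i, j, t, x) returns the derivative of the link cost f_ij^t at flow x.

```python
def forward_pass(theta, R, nodes, P):
    """Compute node throughputs S from S_j^{t+1} = R_j^{t+1} + sum_i theta_ij^t * S_i^t."""
    S = {(i, 1): R.get((i, 1), 0.0) for i in nodes}
    for t in range(1, P):
        for j in nodes:
            S[(j, t + 1)] = R.get((j, t + 1), 0.0) + sum(
                theta.get((i, j, t), 0.0) * S[(i, t)] for i in nodes)
    return S

def backward_pass(theta, S, nodes, P, df):
    """Compute the gradients g_ij^t of F with respect to theta_ij^t by moving
    backward in time; dS accumulates dF/dS_i^t as in the recursion above."""
    g = {}
    dS = {(i, P + 1): 0.0 for i in nodes}
    for t in range(P, 0, -1):
        for i in nodes:
            dS[(i, t)] = 0.0
            for j in nodes:
                slope = df(i, j, t, theta.get((i, j, t), 0.0) * S[(i, t)])
                g[(i, j, t)] = (slope + dS[(j, t + 1)]) * S[(i, t)]
                dS[(i, t)] += theta.get((i, j, t), 0.0) * (slope + dS[(j, t + 1)])
    return g
```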
4
Solution Algorithms for NDN-T
In this section we outline three standard algorithms t h a t can b e used with t h e new formulation: t h e Frank-Wolfe algorithm, a gradient projection algorithm, and an active set strategy. All of these algorithms take advantage of t h e special structure of the constraint set. In t h e gradient projection m e t h o d , t h e required projection operator is particularly simple (see Bertsekas [3] for a discussion of projection m e t h o d s using this problem formulation). T h e active set m e t h o d also uses t h e structure of t h e constraints t o find a basis. These algorithms are only used t o illustrate t h e general performance of t h e transformation, and do not represent a comprehensive study of algorithms for this class of problems.
4.1
Frank-Wolfe
A s t a n d a r d application of t h e Frank-Wolfe algorithm involves solving t h e following subproblem determined by computing t h e gradients g\- at some point 6: minimize
£ £ £ < ^ t
i
subject t o
(17)
j
Ysfti =
l
Vi
><
(18)
3
fit
>
0
Vt,i,<
(19)
T h e solution to this subproblem is t h e vector j3 . T h e problem decomposes by each node (i, t). Hence, for each pair (i, t), we simply choose t h e index j corresponding to t h e most negative value g\j. We then set ji\j = 1 and t h e values /?'• = 0 for k ^ j . Using t h e backward and forward passes, t h e complete algorithm can b e built u p very efficiently, which is as follows: Step 1. Let 0'°) = 0 and calculate g\i at t h e point 6 = 0. Solve minimizeX)EE4-^ "
t
i
j
subject to (18) and (19). T h e solution ft becomes t h e initial solution 0' 1 ' = /5*. Set k = 1.
210
W. B. Powell, E. Berkkam,
Step 2. Use t h e Forward Pass t o find t h e vector (Sj)^.
and I. J.
Lustig
.
Step 3. C o m p u t e t h e derivatives g\- evaluated at #(*' using t h e Backward Pass. Step 4. Solve the linearized subproblem (17) subject to (18) and (19) to get t h e solution / 3 « . Step 5. Find t h e o p t i m u m step size ct by solving: minimize F(0<*> + a(/?(*> - 0 (fc) ))
Step 6. U p d a t e 0 = 0<*> + a*(/?<*> - 0<*>). |F(S( fc + 1 ))-F(»(*))|
Step 7. If J cva(fch < e i then stop. Otherwise set fc=fc+l and go back to Step 2. T h e linearized subproblem is solved easily when compared with t h e classical linearized subproblem since it needs to look only for t h e most negative gradient out of each node t o decide on which arc to put flow. By contrast, t h e classical formulation using link flows produces a subproblem t h a t requires t h e solution of a linear network.
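The following sketch shows how simple the NDN-T linearized subproblem is: for every node and period, all of the weight goes to the outgoing link with the most negative gradient. The dictionary g, assumed to map (i, j, t) to the gradient value, is an illustrative representation and not part of the authors' code.

```python
def solve_linearized_subproblem(g, nodes, P):
    """Solve (17)-(19): at each node (i, t), set beta_ij^t = 1 for the outgoing
    link with the most negative gradient and 0 for all other outgoing links."""
    beta = {}
    for t in range(1, P + 1):
        for i in nodes:
            out = [(g[(i, j, t)], j) for j in nodes if (i, j, t) in g]
            if not out:
                continue                  # node has no outgoing links in period t
            _, j_best = min(out)
            for _, j in out:
                beta[(i, j, t)] = 1.0 if j == j_best else 0.0
    return beta
```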
4.2
Gradient Projection Method
An attraction of Frank-Wolfe is t h a t it produces a feasible descent vector, but t h e cost is t h e use of an extreme point solution. Gradient m e t h o d s generally offer a better search direction, but require a projection operation to regain feasibility. T h e simple constraint structure of t h e transformed problem allows this projection to be performed with relative ease. Let 0\. be t h e vector of flow fractions out of node (i,t). We scale the gradient out of node (i,t), g\_, by dividing all entries out of t h a t node by t h e most negative element. Let the scaled gradient be denoted as g\_. T h e u p d a t e d flow fractions out of node (i,i) are obtained by Pi = 0i. + 9i,
(20)
which will be infeasible. T h u s we project /?,-. back onto t h e feasible region using the m e t h o d presented by Held et ai. [7] and determine the feasible direction dk by dk = /? p r o j - 6
(21)
This direction is t h e n used in t h e main iterate
0<*+D = *<*) + a * . d*
(22)
On Algorithms
4.3
for Nonlinear
Dynamic
Networks
211
Active Set M e t h o d
An alternative use of t h e simple constraint set is t h e active set m e t h o d (see Gill et &1. [6]). Having a single separable constraint for each node (i,t) makes it easy to find a basis by eliminating one variable using each constraint. In doing so we distinguish between t h e nonnegativity constraints t h a t hold exactly (active) and those t h a t do not (inactive). At each node, apart from t h e convexity constraint which is always active, whenever 6\j = 0, we set it as active and whenever 6\j > 0, it is referred to as inactive. Out of all t h e inactive variables we eliminate one and m a k e it basic. Let
and
El
=
the set of variables 6\- t h a t are eliminated (basic),
N\
=
the set of variables 8'- t h a t are not eliminated and t h a t are inactive (nonbasic),
where t h e sets are m u t u a l l y exclusive. T h e basic iteration is 0(*+i) = fl(*) + a* • dk
(23)
We start by eliminating one of the inactive flow fractions, which is strictly positive, out of each node (i,t) such t h a t
e =1
for
« - E °h
°«^E*
(24)
If 6\: is nonbasic then t h e direction of movement is t h e negative gradient out of node i evaluated at $,-••
( 25 )
< = -9h If 0\; is basic then < =
E
9tk
(26)
Since t h e elements with d\j < 0 are going to decrease from their current values in order to m a i n t a i n feasibility ({8'j)k+1) > 0), a ratio test must be performed t h a t evaluates t h e distance of t h e variables to zero:
mm
4<0
(27)
i,j,t
T h e variables are then u p d a t e d with respect to a stepsize a* such t h a t 0 < a* < a n is obtained through a one dimensional search.
212
5 5.1
W. B. Powell, E. Berkkam,
and I. J.
Lustig
Numerical Results Experimental Design
Experiments were run to provide an indication of t h e i m p o r t a n t properties of each formulation and to contrast t h e performance of t h e different algorithms on each formulation. Relatively little formal experimentation has been reported specifically for dynamic networks, with t h e notable exception of t h e work by Aronson and Chen [2]. This work focussed on dynamic networks t h a t featured potentially unbalanced transportation problems in each t i m e period with inventory carry-over arcs from one period to t h e next. Aside from t h e fact t h a t t h e networks are linear, these networks exhibit a basically different structure from the deep networks t h a t we consider. Given t h e exploratory n a t u r e of our experiments, we used randomly generated networks to test t h e algorithms. T h e network generator, however, was designed in the context of dynamic fleet management problems arising in truckload trucking. In this application, the nonlinear "cost" functions arise from an a t t e m p t to capture the uncertainity in t h e d e m a n d for transportation from city i to city j (see Powell et al. [9] and Powell [10] for complete details of t h e model). Let D be t h e uncertain demand for transportation in a particular m a r k e t , and let x be t h e flow of vehicles. T h e n m i n { D , : r } vehicles will move loaded, generating revenue r, while x — mm{D,x} will move empty, at a cost c. Let p(x) be the expected cost (negative profit) generated on this link, given by: p(x) = EQ[ c(x — min{_D,x}) — r m i n { D , x } ] If D is described by a simple density function fo{x) then
= Ae~ Al: , where A =
p(x) = ex - j ( r + c)(l - e~Xx)
(28) 1/E[D], (29)
A
Of course, any nonlinear cost function could be used, but we felt t h a t it helped our design of the network generator to use parameters and assumptions t h a t were motivated by an application. For example, we were able to choose input supplies (representing t h e supplies of vehicles in the fleet distributed among t h e set of cities) in a m a n n e r consistent with t h e demand for t h e vehicles. In order to generate t h e parameters r, c, and A for each m a r k e t , we generated r a n d o m coordinates for cities, uniformly over a 1000 by 2000 mile rectangle, from which we could calculate distances dij for each city pair (i,j). Given these distances, we used r%]
=
1.2^
(30)
C{j =
0.6dij
(31)
Xn =
J-
(32)
On Algorithms
for Nonlinear
Dynamic
Networks
213
where 1.2 and 0.6 are typical per mile revenues and costs for t h e trucking industry. T h e external supplies t h a t enter the network at t h e first t i m e period are generated by:
where 7 G [0.3,0.7]. By choosing the fraction 7 within this range, we m a d e sure t h a t flows stayed on t h e curved part of t h e nonlinear cost function, which corresponds to t h e hatched region in Figure 3. Inconsistency between t h e flows (representing t h e supply of vehicles) and t h e market demands (which determines t h e shape of t h e cost function) may result in landing on t h e linear portion of t h e function by being too far off to t h e right or to the left. T h e last p a r a m e t e r required is t h e link density. We generated each link with probability a , using a = 0.5 for most problems. In addition to using our algorithms, certain sets of experiments were run using G E N O S (Mulvey and Zenios [8]). G E N O S includes implementations of the primal t r u n c a t e d Newton m e t h o d and simplicial decomposition, both specialized for nonlinear generalized networks. To accomodate t h e predefined functions in G E N O S , t h e experiments which involved comparisons against G E N O S used cost functions of t h e form ae . In t e r m s of t h e parameters of our problem, the functions were given by: p(x) = j{r
+ c)e-Xx
(34)
The test problems are designed to compare the classical formulation against the new one using Frank-Wolfe, and to see how well the new algorithm performs against GENOS. Beyond that there is the question of how different methods like Frank-Wolfe, projected gradient, and the active set method compare with each other when they are applied to the new formulation. Toward this goal, a series of experiments were run to test the effect of the number of regions and time periods on networks with various densities. Of particular interest is the effect of longer planning horizons on the rate of convergence. We used a simple stopping rule based on the relative change in the objective function from one iteration to the next, given by:

|F(θ^(n)) − F(θ^(n−1))| / |F(θ^(n))| ≤ ε     (35)
where ε is a parameter that we set to 0.001. For some of the experiments, we solved the problem with one algorithm and then measured how long a competing algorithm required to produce the same objective function value. The codes that implement the solution algorithms are written using the C programming language. Computational tests are performed on a Silicon Graphics 4D/70 workstation running SGI Unix V3.2 with code compiled with the MIPS cc compiler using the default optimization level.
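A minimal sketch of this stopping test, assuming the objective values are collected in a list as the iterations proceed (the history values below are illustrative, not experimental results):

def converged(objective_values, eps=0.001):
    """Relative-change stopping rule of equation (35):
    stop when |F_n - F_{n-1}| / |F_n| <= eps."""
    if len(objective_values) < 2:
        return False
    f_prev, f_curr = objective_values[-2], objective_values[-1]
    return abs(f_curr - f_prev) <= eps * abs(f_curr)

history = [-100.0, -180.5, -189.0, -189.8, -189.87]
print([converged(history[:k]) for k in range(2, len(history) + 1)])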
5.2    Results and Conclusions
We began by experimenting with different algorithms for NDN-T to investigate the properties of the transformation. Six test problems were randomly generated. Each problem is characterized by the number of cities, number of time periods, and network density. Other parameters (such as r_ij and c_ij) were fixed as described earlier. Initial experiments indicated that the projected gradient algorithm was superior to the others. To obtain fair comparisons between the three algorithms for NDN-T, we ran the projected gradient algorithm until it satisfied the ε-optimality test. The other two algorithms were then run until they produced an objective value that met or was closest to the result obtained using the projected gradient algorithm. Figure 4 illustrates the rate of convergence of the three methods. Table 2 gives the results of a side-by-side comparison of the Frank-Wolfe algorithm for NDN-T and NDN-X. Here the results of the Frank-Wolfe run for NDN-T were taken from Table 1. Then, Frank-Wolfe was used on NDN-X until it produced an objective value that met or came closest to the results for NDN-T. The results show a dramatic deterioration in performance as the number of time periods is increased. For 20 time periods, the Frank-Wolfe algorithm applied to NDN-X could not even reach the result obtained using NDN-T within a reasonable time. This behavior is explained by the nature of the extreme point solution given by the Frank-Wolfe algorithm for NDN-X, as illustrated in Figure 2. By contrast, Figure 5 illustrates the Frank-Wolfe solution for NDN-T using the same network and link flows as Figure 2. In the X-formulation, many nodes have no flow moving through them in the extreme solution, especially in the later time periods. As a result, the one-dimensional search uniformly decreases the flow on all the links of these nodes. The links emanating from the same nodes in the T-formulation will also experience a net reduction in flow. However, this formulation also allows the algorithm to shift flow between the links emanating from these nodes, further refining the solution. It must be acknowledged that Frank-Wolfe is not the best algorithm for either of the formulations. In Table 3, we used the best available algorithm for each formulation. For NDN-T, we used the projected gradient algorithm, and for NDN-X we used GENOS, a package designed for nonlinear (generalized) networks. GENOS includes specialized implementations of both the primal truncated Newton algorithm and simplicial decomposition. These experiments revealed some limitations of the system. First, we were unable to run our larger problems due to restrictions in the GENOS software. Second, initial experiments produced almost pathologically slow execution times using the primal truncated Newton algorithm. We concluded that additional work was needed on this algorithm and that the run times were probably not an accurate measure of the performance of the algorithm. As a result, we only report the results using simplicial decomposition. Recall that we were forced to use different link cost functions to accommodate GENOS. A new set of test problems were generated using at most 10 cities and 10 time periods. In each case GENOS was run until its
internal optimality conditions were satisfied. Then, the projected gradient algorithm was run for NDN-T until the objective function value met or came the closest to that produced by GENOS. The execution times reported in Table 3 indicate the dramatic improvements in run times over GENOS, especially in the larger problems. This result should not be too surprising when we consider that simplicial decomposition is just a generalization of Frank-Wolfe, and must still use extreme points with the qualities depicted in Figure 2. A final set of experiments was run to investigate the possibility that the results are sensitive to the structure of the input flows. All the networks generated up to now exhibit the property that R_i(t) = 0 for t ≥ 2. While this is fairly realistic for dynamic fleet management problems, it produces the sparse structure of the extreme point solution exhibited in Figure 2. The final set of tests was conducted using a set of input flows that satisfied R_i(t) > 0 for all cities and time periods. The values for the external supply vector are obtained by:

R_i(t) = ρ δ^(t−1) (1 − δ) Σ_j E[D_ij]     (36)

where ρ = 0.5 and δ = 0.3. This expression puts a declining amount of flow into later time periods, with the total amount of flow entering all time periods comparable to that used in earlier experiments. Again, the reason is to ensure that link flows stay on the "interesting" part of the nonlinear cost functions. In this case, a plot of the nonzero flows in an extreme point solution of Frank-Wolfe looks more like Figure 5 for both formulations. The results, shown in Table 4, indicate that the relative results do not change significantly as a result of all positive input flows. The explanation is that while a plot of nonzero flows looks more like a dense tree, a plot of the flow volumes in the extreme point solution would still show a noticeable funneling of flows onto a single path.
6    Appendix: Calculation of the Derivatives
In this section we wish to use the acyclic structure of the network to develop backward recursions for the derivatives ∂F/∂θ_ij^t, in order to minimize the function F(θ). For this we need to evaluate the derivative of F with respect to θ_ij^t. To make the calculations easier to follow, we can split the objective function into three parts by considering a particular time t. The objective function is then viewed as a combination of the earlier time periods t' < t, the present time t, and the future t' > t. For any value t, let

F(θ) = H_1^t(θ) + H_2^t(θ) + H_3^t(θ)     (37)

where

H_1^t(θ) = Σ_{t'<t} Σ_i Σ_k f_ik^{t'}(θ)     (38)

H_2^t(θ) = Σ_i Σ_j f_ij^t(θ)     (39)

H_3^t(θ) = Σ_{t'>t} Σ_i Σ_k f_ik^{t'}(θ)     (40)
There are some important facts about these functions stated in the following two lemmas, the first of which is stated without proof:

Lemma 6.1 For any time t < P, H_3^t(θ) = H_2^{t+1}(θ) + H_3^{t+1}(θ).

Lemma 6.2 For s ≥ t, the partial derivative ∂H_1^t(θ)/∂θ_ij^s = 0.

Proof. For t' < t, f_ij^{t'}(θ) = f_ij^{t'}(θ_ij^{t'} S_i^{t'}(θ)) by equation (7). Since S_i^{t'}(θ) does not depend on any values θ_ij^t for t ≥ t', and H_1^t is a sum of the functions f_ij^{t'} with t' < t, it follows that H_1^t has no dependence on any value of θ_ij^s for s ≥ t. □

To find ∂F/∂θ_ij^t, we first need to evaluate ∂H_3^t/∂θ_ij^t, which corresponds to differentiating the function of the future time periods with respect to a change in the flow fraction at the present time t. Then we need to evaluate ∂H_2^t/∂θ_ij^t, which corresponds to differentiating the function of the present time period with respect to a change at the present time t. Lemma 6.2 has shown that ∂H_1^t/∂θ_ij^t = 0, i.e., the derivative of the function of the past time periods with respect to the current time period is zero. So we have the following two propositions:

Proposition 6.3 The partial derivative

∂H_3^t/∂θ_ij^t = (∂F/∂S_j^{t+1}(θ)) · S_i^t(θ).

Proof. From Lemma 6.1, it follows that

∂H_3^t/∂θ_ij^t = Σ_k Σ_l ∂f_kl^{t+1}(θ)/∂θ_ij^t + ∂H_3^{t+1}(θ)/∂θ_ij^t.     (41)

Since the derivatives for k ≠ j are zero due to the structure of the dynamic network, the expression (41) is simplified to

∂H_3^t/∂θ_ij^t = Σ_l ∂f_jl^{t+1}(θ)/∂θ_ij^t + ∂H_3^{t+1}(θ)/∂θ_ij^t.     (42)
Using the chain rule and equation (7),

∂f_jl^{t+1}(θ)/∂θ_ij^t = ∂f_jl^{t+1}(θ_jl^{t+1} S_j^{t+1}(θ))/∂θ_ij^t     (43)
                       = (∂f_jl^{t+1}(x_jl^{t+1})/∂x_jl^{t+1}) · (∂x_jl^{t+1}/∂θ_ij^t).     (44)

It follows from the chain rule that

∂x_jl^{t+1}/∂θ_ij^t = θ_jl^{t+1} · (∂S_j^{t+1}(θ)/∂θ_ij^t),     (45)

where

∂S_j^{t+1}(θ)/∂θ_ij^t = ∂/∂θ_ij^t [ Σ_k θ_kj^t S_k^t(θ) ]     (46)
                      = Σ_k S_k^t(θ) · (∂θ_kj^t/∂θ_ij^t)     (47)
                      = S_i^t(θ).     (48)

This last expression is obtained by using the relationship (6) and realizing that S_k^t(θ) is an implicit function of θ(t − 1), and hence is constant with respect to θ_ij^t. Using the chain rule once again helps build the recursive structure:

∂H_3^{t+1}/∂θ_ij^t = (∂H_3^{t+1}/∂S_j^{t+1}(θ)) · (∂S_j^{t+1}(θ)/∂θ_ij^t)     (49)
                   = (∂H_3^{t+1}/∂S_j^{t+1}(θ)) · S_i^t(θ).     (50)

Substituting (48) into (45) and then into (44), and finally together with (50) into the original expression (42), we obtain

∂H_3^t/∂θ_ij^t = [ Σ_l (∂f_jl^{t+1}(x_jl^{t+1})/∂x_jl^{t+1}) θ_jl^{t+1} + ∂H_3^{t+1}(θ)/∂S_j^{t+1}(θ) ] · S_i^t(θ)     (51)
               = [ ∂H_2^{t+1}(θ)/∂S_j^{t+1}(θ) + ∂H_3^{t+1}(θ)/∂S_j^{t+1}(θ) ] · S_i^t(θ)     (52)
               = (∂F/∂S_j^{t+1}(θ)) · S_i^t(θ).     (53)

The last equality follows from the fact that F = H_1^{t+1} + H_2^{t+1} + H_3^{t+1} and that Lemma 6.2 implies that ∂H_1^{t+1}(θ)/∂S_j^{t+1}(θ) = 0. □
It is useful to note that

∂H_2^{t+1}(θ)/∂S_j^{t+1}(θ) + ∂H_3^{t+1}(θ)/∂S_j^{t+1}(θ)     (54)
= ∂F/∂S_j^{t+1}(θ).     (55)
Proposition 6.4 The derivative of the present cost function H_2^t(θ) with respect to θ_ij^t is

∂H_2^t(θ)/∂θ_ij^t = (∂f_ij^t(x_ij^t)/∂x_ij^t) · S_i^t(θ).     (56)

Proof. Here we are evaluating the relative change in the objective function due to a change in the flow fraction θ_ij^t at the present time period t. Since this change is only along the arc from node i in period t to node j in period t + 1, the derivative is not affected by changes along arcs other than this specific arc. From the definition of H_2^t(θ) given in (39),

∂H_2^t(θ)/∂θ_ij^t = ∂f_ij^t(θ)/∂θ_ij^t     (57)
                  = (∂f_ij^t(x_ij^t)/∂x_ij^t) · S_i^t(θ). □
Theorem 6.5 The partial derivative of the function F with respect to θ_ij^t is

∂F(θ)/∂θ_ij^t = ( ∂f_ij^t(x_ij^t)/∂x_ij^t + ∂F/∂S_j^{t+1}(θ) ) · S_i^t(θ).

Proof. Differentiating (37) and using Lemma 6.2 gives

∂F(θ)/∂θ_ij^t = ∂H_2^t(θ)/∂θ_ij^t + ∂H_3^t(θ)/∂θ_ij^t.     (58)

Using Proposition 6.3 and Proposition 6.4, we obtain

∂F(θ)/∂θ_ij^t = ( ∂f_ij^t(x_ij^t)/∂x_ij^t + ∂F/∂S_j^{t+1}(θ) ) · S_i^t(θ).

The first expression in (58) is given by Proposition 6.4 and the second one by Proposition 6.3; combining the two proves the theorem. □
The next step is to find ∂F/∂S_i^t(θ). Hence we need to evaluate ∂H_3^t(θ)/∂S_i^t(θ) and ∂H_2^t(θ)/∂S_i^t(θ). We present the following two propositions:

Proposition 6.6 The partial derivative of the future cost function H_3^t(θ) with respect to the flow S_i^t(θ) through node i at time t is

∂H_3^t(θ)/∂S_i^t(θ) = Σ_j (∂F(θ)/∂S_j^{t+1}(θ)) · θ_ij^t.
Proof. From Lemma 6.1 and the definitions of H_2^{t+1}(θ) and H_3^{t+1}(θ) given by equations (39) and (40), one can write

∂H_3^t(θ)/∂S_i^t(θ) = ∂/∂S_i^t(θ) [ H_2^{t+1}(θ) + H_3^{t+1}(θ) ]
                    = Σ_j Σ_l ∂f_jl^{t+1}(θ)/∂S_i^t(θ) + ∂H_3^{t+1}(θ)/∂S_i^t(θ).     (59)-(60)

Using the chain rule,

∂f_jl^{t+1}(θ)/∂S_i^t(θ) = (∂f_jl^{t+1}(x_jl^{t+1})/∂x_jl^{t+1}) · (∂x_jl^{t+1}/∂S_i^t(θ)).     (61)

Now the last derivative in (61) can be written as

∂x_jl^{t+1}/∂S_i^t(θ) = (∂x_jl^{t+1}/∂S_j^{t+1}(θ)) · (∂S_j^{t+1}(θ)/∂S_i^t(θ)) = θ_jl^{t+1} · (∂S_j^{t+1}(θ)/∂S_i^t(θ)),     (62)

where it follows from equation (6) that

∂S_j^{t+1}(θ)/∂S_i^t(θ) = ∂/∂S_i^t(θ) [ Σ_k θ_kj^t S_k^t(θ) ]     (63)
                        = Σ_k θ_kj^t · (∂S_k^t(θ)/∂S_i^t(θ))     (64)
                        = θ_ij^t.     (65)

Since for each j, S_j^{t+1}(θ) is a function of S_i^t(θ), it follows from the chain rule that

∂H_3^{t+1}(θ)/∂S_i^t(θ) = Σ_j (∂H_3^{t+1}(θ)/∂S_j^{t+1}(θ)) · (∂S_j^{t+1}(θ)/∂S_i^t(θ))     (66)
                        = Σ_j (∂H_3^{t+1}(θ)/∂S_j^{t+1}(θ)) · θ_ij^t.     (67)

Substituting (65) into (62), then into (61), and finally together with (67) into the original expression (59), we obtain

∂H_3^t(θ)/∂S_i^t(θ) = Σ_j [ Σ_l (∂f_jl^{t+1}(x_jl^{t+1})/∂x_jl^{t+1}) θ_jl^{t+1} + ∂H_3^{t+1}(θ)/∂S_j^{t+1}(θ) ] · θ_ij^t     (68)-(69)
                    = Σ_j (∂F(θ)/∂S_j^{t+1}(θ)) · θ_ij^t.     (70)

The last equality follows from the fact that the expression in parentheses in equation (69) is precisely

∂H_2^{t+1}(θ)/∂S_j^{t+1}(θ) + ∂H_3^{t+1}(θ)/∂S_j^{t+1}(θ),

and the fact that

∂H_1^{t+1}(θ)/∂S_j^{t+1}(θ) = 0,

which follows from Lemma 6.2. □
Proposition 6.7 The derivative of the present cost function H_2^t(θ) with respect to the flow S_i^t(θ) through node i at time t is

∂H_2^t(θ)/∂S_i^t(θ) = Σ_j (∂f_ij^t(x_ij^t)/∂x_ij^t) · θ_ij^t.

Proof. From the definition of H_2^t(θ) in equation (39),

∂H_2^t(θ)/∂S_i^t(θ) = Σ_j ∂f_ij^t(θ)/∂S_i^t(θ)     (71)
                    = Σ_j (∂f_ij^t(x_ij^t)/∂x_ij^t) · (∂x_ij^t/∂S_i^t(θ))     (72)
                    = Σ_j (∂f_ij^t(x_ij^t)/∂x_ij^t) · θ_ij^t,     (73)
with the last equality derived in a similar fashion as in equations (62) through (65). □

We can now compute the derivative of F with respect to S_i^t(θ), stating the following theorem.

Theorem 6.8 The derivative of the cost function F with respect to the flow S_i^t(θ) through node i at time t is

∂F/∂S_i^t(θ) = Σ_j ( ∂f_ij^t(x_ij^t)/∂x_ij^t + ∂F/∂S_j^{t+1}(θ) ) · θ_ij^t.

Proof. Differentiating (37) and using ∂H_1^t(θ)/∂S_i^t(θ) = 0 gives

∂F/∂S_i^t(θ) = ∂H_2^t(θ)/∂S_i^t(θ) + ∂H_3^t(θ)/∂S_i^t(θ).

The first expression is given by Proposition 6.7 and the second one by Proposition 6.6. By combining the two we obtain the desired final derivative. □
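The two theorems suggest a simple way to organize the gradient computation: one forward pass to accumulate the nodal flows S_i^t(θ), followed by one backward pass over the time periods that applies Theorems 6.8 and 6.5. The sketch below illustrates this backward recursion under simplifying assumptions that are not spelled out in the paper: external supplies R[t][i] may enter in any period, the arc-cost derivative function df_dx is supplied by the caller, and ∂F/∂S beyond the horizon is taken to be zero.

def gradient_by_backward_recursion(theta, R, df_dx, P):
    """Compute dF/dtheta[t][i][j] for a dynamic network with flow fractions
    theta[t][i][j], external supplies R[t][i] (an assumption here; in most of
    the experiments supplies enter only in period 1), and arc-cost derivatives
    df_dx(t, i, j, x)."""
    n = len(R[0])
    nodes = range(n)
    # Forward pass, in the spirit of relationship (6):
    # S[t][i] = R[t][i] + sum_k theta[t-1][k][i] * S[t-1][k]
    S = [[0.0] * n for _ in range(P)]
    for t in range(P):
        for i in nodes:
            S[t][i] = R[t][i]
            if t > 0:
                S[t][i] += sum(theta[t - 1][k][i] * S[t - 1][k] for k in nodes)
    # Backward pass.
    dF_dS = [[0.0] * n for _ in range(P + 1)]          # zero beyond the horizon
    dF_dtheta = [[[0.0] * n for _ in nodes] for _ in range(P)]
    for t in reversed(range(P)):
        for i in nodes:
            for j in nodes:
                x = theta[t][i][j] * S[t][i]
                marginal = df_dx(t, i, j, x) + dF_dS[t + 1][j]
                dF_dtheta[t][i][j] = marginal * S[t][i]        # Theorem 6.5
                dF_dS[t][i] += marginal * theta[t][i][j]       # Theorem 6.8
    return dF_dtheta

With these derivatives available, gradient-based methods such as Frank-Wolfe or projected gradient can be applied directly to the transformed formulation.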
References

[1] J. E. Aronson, A survey of dynamic network flows, Annals of Operations Research, 20 (1989) 1-66.
[2] J. E. Aronson and B. D. Chen, A forward network simplex algorithm for solving multiperiod network flow problems, Naval Research Logistics Quarterly, 33 (1986) 445-467.
[3] D. P. Bertsekas, Algorithms for nonlinear multicommodity flow problems, in International Symposium on Systems Optimization and Analysis (Springer-Verlag, 1979) pp. 210-224.
[4] D. P. Bertsekas, E. M. Gafni, and R. G. Gallagher, Second derivative algorithms for minimum delay distributed routing in networks, IEEE Transactions on Communications, COM-32 (1984) 911-919.
[5] R. G. Gallagher, A minimum delay routing algorithm using distributed computation, IEEE Transactions on Communications, COM-25 (1977) 73-85.
[6] P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization (Academic Press, London, 1981).
[7] M. Held, P. Wolfe, and H. P. Crowder, Validation of subgradient optimization, Mathematical Programming 6 (1974) 62-88.
[8] J. M. Mulvey and S. A. Zenios, GENOS 1.0 user's guide: A generalized network optimization system, Tech. Rep. 87-12-03, Department of Decision Sciences, The Wharton School, University of Pennsylvania (1987).
[9] W. B. Powell, A stochastic model of the dynamic vehicle allocation problem, Transportation Science 20 (1986) 117-129.
[10] W. B. Powell, A comparative review of alternative algorithms for the dynamic vehicle allocation problem, in Vehicle Routing: Methods and Studies (North Holland, New York, 1988), pp. 249-292.
[11] W. B. Powell, Y. Sheffi, and S. Thiriez, The dynamic vehicle allocation problem with uncertain demands, in Ninth International Symposium on Transportation and Traffic Theory (1984) pp. 357-374.
[12] W. W. White and A. M. Bomberault, A network algorithm for empty freight car allocation, IBM Systems Journal 8 (1969) 147-171.
Table 1: Comparison of the three algorithms for NDN-T on the randomly generated test problems (CPU time (sec.), objective function value, and iterations for each problem).
Table 2: Comparison of NDN-T and NDN-X using Frank-Wolfe

                               Objective Function           CPU time (sec.)       Iterations
No.  Cities  Periods  Dens.    NDN-T        NDN-X           NDN-T     NDN-X       NDN-T   NDN-X
1    20      5        0.5      -189,887     -189,815        6.12      22.15       37      103
2    40      5        0.5      -766,618     -766,851        29.52     152.01      47      181
3    20      10       0.5      -367,280     -367,560        10.32     70.74       31      171
4    40      10       0.5      -1,597,279   -1,597,414      80.94     898.82      65      529
5    20      20       0.5      -710,476     -706,348*       30.44     246.54      47      297
6    40      20       0.5      -3,268,208   -2,065,743      190.85    999.88**    76      268

* result up to stopping criterion of 0.00001
** result up to CPU time of 1000
Table 3: Comparison of GENOS versus NDN-T*

                               Objective Function           CPU time (sec.)
No.  Cities  Periods  Dens.    GENOS        NDN-T           GENOS     NDN-T
1    2       8        0.5      13,981       13,981          2.20      0.68
2    4       8        0.5      29,604       29,594          5.76      0.46
3    8       8        0.5      63,373       63,348          10.94     0.98
4    10      2        0.2      10,607       10,609          1.85      1.85
5    10      5        0.2      35,554       35,545          5.59      0.47
6    10      10       0.2      62,485       62,472          10.78     0.74
7    10      2        0.5      43,180       43,179          4.63      0.36
8    10      5        0.5      90,342       90,237          10.86     0.69
9    10      10       0.5      164,096      163,883         25.12     1.60

* the same exponential cost function is used in both GENOS and NDN-T; objective function values reflect the minimum cost attained
Table 4: Comparison of NDN-T and NDN-X using Frank-Wolfe (with external supplies at each period)

                               Objective Function           CPU time (sec.)       Iterations
No.  Cities  Periods  Dens.    NDN-T        NDN-X           NDN-T     NDN-X       NDN-T   NDN-X
1    20      5        0.5      -176,807     -177,107        8.24      13.31       46      62
2    40      5        0.5      -608,662     -606,669        43.69     88.20       67      105
3    20      10       0.5      -328,328     -328,095        13.55     46.31       40      110
4    40      10       0.5      -1,228,038   -1,228,730      78.96     398.80      61      232
5    20      20       0.5      -603,379     -601,000        15.71     140.48      23      164
6    40      20       0.5      -2,764,947   -932,789        170.15    998.72*     65      268

* result up to CPU time of 1000
Figure 1: Space-time representation of a network (regions against time periods).
Figure 2: Solution of the classical linearized subproblem using Frank-Wolfe (example problem of 10 regions and 10 time periods; flows terminate at a super sink).
Figure 3: The nonlinear expected cost function (cost versus flow).
Figure 4: Rate of convergence of the three algorithms for NDN-T (objective value).
Figure 5: Solution of the linearized subproblem for the transformed problem using Frank-Wolfe (example problem of 10 regions and 10 time periods; flows terminate at a super sink).
Network Optimization Problems, pp. 233-262. Eds. D.-Z. Du and P.M. Pardalos. ©1993 World Scientific Publishing Co.
Strategic and Tactical Models and Algorithms for the Coal Industry Under the 1990 Clean Air Act*

Hanif D. Sherali
Department of Industrial and Systems Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061-0118

Quaid J. Saifee
Association of American Railroads, 26555 Evergreen Road, Suite 1120, Southfield, Michigan 48076-4251
Abstract
This paper is concerned with a study of the effect of the Acid Rain Provision of the 1990 Clean Air Act on the investment, production, and distribution operations in the coal industry, with concentration on the development of new mines, shutting down inefficient strips of existing mines, and on blending and distribution problems. The problem here is to determine which new mines to open and when, and what decisions and schedules to make for the shipment of coal from the mines to silos, cleaning and blending operations at silos, and subsequent shipment of coal to customers over a multi-period time horizon, so as to satisfy the demand at a minimum total operational cost. A long-term strategic model is developed to meet this objective. The final product is a computer-based decision tool which will serve as a mechanism for implementing cost-effective decisions in light of complex variations in the production levels of existing and potential mines, ore quality, and demand and quality requirements. The strategic model will play a useful role in planning future growth and in making capital investment decisions. The model can also be used to study the effect of various policies, by testing the sensitivity, feasibility, and the cost of system operations under different perturbations of system configuration, data and demand specifications. Real operational data from the Westmoreland Coal Company and hypothetical test data are used for testing purposes. We also present a related short-term tactical model in the appendix that can be used to assist in decisions regarding day-to-day operations.

* Acknowledgement: This research has been supported by the Department of the Interior's Mineral Institute program administered by the Bureau of Mines under allotment grant # G1114151.
1    Introduction
T h e research conducted in this study was motivated by t h e Acid Rain Provision of t h e 1990 Clean Air Act, passed by Congress on October 28, and signed into law by President Bush on November 15, 1990. This Provision in t h e Act places stringent restrictions on t h e sulfur dioxide and nitrous oxide emissions, particularly on electric utilities operating coal-fired generators, and it promotes t h e role of Southwest Virginia (our area of study) as becoming a central player in t h e coal industry due to its low sulfur coal resources, as compared with other coal mined east of the Mississippi River. However, in order for different companies in the industry to remain solvent and competitive in light of anticipated sudden changes in quality requirements of future coal d e m a n d , they will need to judiciously plan for t h e usage of their reserves, and t h e operation of their cleaning and distribution facilities. This paper addresses t h e combined problems of choosing new mines to develop, determining mine production, ore purification, and blending of different grades of coal, along with t h e problem of distributing the coal from mines to silos to customers, in order to satisfy customer d e m a n d s placed over several t i m e periods, each having specified quality requirements as driven by t h e Clean Air Act. A long-term strategic model is developed to address this problem from a planning perspective, and is designed to help coal companies in making strategic decisions such as the development of new mines or t h e development of new production units at mines. Also, this model is intended to assist coal companies by providing guidelines for t h e usage of their existing and potential reserves, and in the operation of their cleaning and distribution facilities, along with a possible investment in new technologies to offset t h e development of new mines. Additionally, a modified formulation of Sherali and Puri's [14] short-term tactical model is presented in t h e appendix. This model can be used to aid in t h e day-to-day operational decision making process. Information and d a t a for these models have been provided by t h e Westmoreland Coal Company, one of t h e largest coal mining companies in Southwest Virginia. T h e company owns several coal mines, some of which are currently in use (existing mines), and some of which can be developed later as needs arise. Each mine can produce coal at a specific rate and has certain quality specifications t h a t vary over time. This coal needs to be appropriately shipped to silo facilities, where it is subjected to a beneficiation process in order to be partially cleaned to a desirable degree. T h e different grades of coal at individual silo facilities then need to be blended and shipped
to customers in order to satisfy d e m a n d s for various quantities having stipulated quality specifications. In our case study, the Westmoreland Coal Company owns two large silo facilities, known as t h e Bullitt facility and t h e Wentz facility. T h e Bullitt facility has six silos, each with a capacity of 14,000 tons. Silo 1 and Silo 2 store coal t h a t does not require to be cleaned. Silo 3 contains high sulfur coal and Silo 4 contains low sulfur coal. After beneficiation, coal from Silo 3 is transferred to Silo 5 and coal from Silo 4 is transferred to Silo 6 for storage. T h e Wentz facility has five silos, three with a capacity of 6,000 tons and two with a capacity of 9,000 tons. One of t h e m is used to store coal which does not need t o be cleaned, two of t h e m are used to store t h e remaining coal prior to cleaning, and t h e remaining two are used for storing coal after it has been cleaned. This being a typical structure in the industry, our model partitions t h e silos within each facility into two categories based on two different kinds of coal stored. These categories are denoted as J\ and J-i types of silos. T h e Ji silos are "run-of-mine" (ROM) blend silos, where a shipment is received, stored, and is directly blended with t h e other coal in its run-of-mine form itself. Each J2 t y p e of silo, on t h e other hand, constitutes a pair of silos, one of which receives t h e run-of-mine coal, while t h e other stores t h e cleaned coal following t h e beneficiation process, holding it ready for t h e blending operation. These silo pairs are indexed by t h e sets J2w and J2B for t h e Wentz and Bullitt facilities, respectively. Besides t h e conventional mine to silo to customer shipments, there might also be a coal transfer between t h e silo facilities at two different geographical sites. Such a transfer occurs due to the "stoker" customers. In t h e example of t h e Westmoreland Coal Company, "stoker" customers are provided with cleaned, sifted coal from t h e Wentz facility. Simultaneously, an almost equivalent amount of sifted by-product is shipped to t h e Bullitt facility. T h e normal blending processes resume at this stage. T h e problem is to find out which production units of potential mines should be opened and at what t i m e periods, and to determine optimal schedules for shipping coal from mines to silos, for cleaning and blending at t h e silos, and for t h e distribution of coal to t h e customers. This has to be accomplished subject to restrictions involving storage at t h e silos, production capacity limits, and material flow balance constraints, so t h a t customer d e m a n d s having required quality specifications are satisfied at minimal total cost. T h e long-term strategic model developed to address this problem has the structure of a fixed-charge, mixed-integer, zero-one programming problem. Zeroone integer variables are used to model t h e decision of opening a production unit at each particular mine. In addition to t h e cleaning and shipping costs, t h e objective function incorporates a fixed-charge component to reflect t h e cost of opening a new production unit at a potential mine. Also, a storage cost component is added to t h e objective function in order to penalize t h e underutilization of active resources. 
T h e constraints include production capacity, storage, material flow balance, and quality requirement restrictions, along with restrictions on t h e sequence in which units can be opened, and on t h e m a x i m u m n u m b e r of units t h a t can b e opened in any t i m e
period without any substantial surcharge penalties.
2    Related Models in the Literature
T h e problems related to t h e coal industry have been analyzed in a n u m b e r of different ways. Different techniques varying over linear, nonlinear, and mixed integer 0-1 programming problems have been used to approach different problems related to t h e coal industry. Young et al. [18], Faulkner [5] and Johnson [7] provide few of t h e pioneering papers in t h e area. Many practical case studies of Operations Research in t h e coal mining industry have been surveyed by Tomlinson [16]. Knight and Manula [9] have developed a long-term simulation model to study potential coal production and utilization systems in Pennsylvania. Gershon [6] has formulated a mixed-integer model for a mine scheduling problem. Lietaer [10] presents a linear programming model with an objective of preparing a combination of mining works at a m i n i m u m total operational cost subject to different restrictions including operational restrictions on the mines and the concentrators, among other constraints. T h e problem of allocating coking coals from collieries to washeries and blending plants is also modeled as a linear prog r a m m i n g problem by Williams and Haley [17]. Sherali and Puri [14] develop three short-term tactical models for analyzing day-to-day coal flow operations in t h e coal industry, with a concentration on t h e blending and distribution problems. A modified formulation of their most accurate model is presented in the Appendix of this paper. Steinmann and Schwinn [15] have formulated a zero-one programming model to minimize t h e total resources necessary for balancing t h e capacity structure of the coal mine, subject to capacity constraints for a particular mine, and have reported on c o m p u t a t i o n a l experience with different algorithms used to solve this problem. Two link models, a t r a n s p o r t a t i o n - t r a n s s h i p m e n t model defined for t h e coal distribution network, and a location-allocation model defined for t h e potential location of coal handling facilities within receiving and shipping nodes of t h e network, have been developed by Osleeb and Ratick [12] to determine t h e optimal capacity, placement, and railroad and marine interface of coal handling facilities within and between the. New England ports and converting power plants. A mixed-integer location-allocation problem has also been formulated by Osleeb et al. [13] to evaluate the potential for reducing water-borne coal transportation costs, and concomitantly, t h e cost of delivering coal to the European markets. Candler [4] formulates a blending problem with integer constraints on a m i n i m u m usage of each coal t y p e in a mix, and recommends a sampling plan along with penalty function and rejection options. As evident in t h e literature, different papers have considered different detailed aspects of specific problems faced by t h e coal industry. However, the imminent problem of developing a strategic plan for a time-staged resource m a n a g e m e n t along with coalblending and distribution operations in order to comply with stringent future quality
requirements as p r o m p t e d by t h e 1990 Clean Air Act, has not been addressed for the coal industry. However, strategic models developed for other contexts do share a philosophical structure with our model. For example, Aboudi et al. [1] develop a longt e r m planning model for t h e petroleum production and distribution problem using mixed-integer programming techniques, and K h a n [8] also presents a mixed-integer model t o minimize t h e total disposal costs in an u r b a n solid waste disposal problem. Section 3 below presents a formulation of our strategic planning model. Exact and heuristic solution procedures are described in Section 4, and Section 5 presents computational test results using real and hypothetical d a t a sees. T h e Appendix contains a related formulation t h a t deals with t h e tactical day-to-day operational problem.
3    Formulation of a Long-Term Strategic Model
As described earlier, t h e problem at hand is to determine which production units of potential mines should be opened and at what t i m e periods, along with optimal schedules for shipping coal from mines to silos, for cleaning and blending coal at the silos, and for t h e distribution of coal to t h e customers, subject to various constraints. These constraints include production capacity, silo capacity, material flow balance, d e m a n d satisfaction for customers, and quality requirement restrictions, along with restrictions on t h e sequence in which units can be opened, and on the m a x i m u m n u m b e r of units t h a t can be opened in any t i m e period, without any substantial surcharge penalties. T h e model requires specific d a t a pertaining to t h e existing and potential mines, silos, and customers. This model considers periods equal to six m o n t h s in duration, with a horizon of three to five years. As this is a long-term model, all costs such as cleaning and shipping costs, revenues for shipping b e t t e r quality coal, and penalties for storage and underutilization at mines, are present values using a certain rate of return. T h e subscript t attached to these cost factors reflects this representation. In this regard, note t h a t the fixed-charge cost to open a production unit at a potential mine is t h e present value of t h e semi-annualized payments over the life of t h e mine t h a t are to be m a d e over the horizon of the model. Real d a t a from t h e Westmoreland Coal Company and nine other similar, hypothetical d a t a sets have been used for making t h e model runs. Given below are t h e d a t a requirements, along with t h e notation used.
t = 1 , . . . , T = n u m b e r of t i m e periods ( < 8). i = 1,. . . , / = number of existing and potential mines ( < 23). q = 1,. . . , hi = n u m b e r of units at a potential mine i ( < 30).
j = 1, . . . , J = number of silo units (≤ 10).
k = 1, . . . , K = number of customers (≤ 12).
Kst = "stoker customers" served by Wentz silos. pit = production (tons) at existing mine i, in period t. Piq(t) = a function which represents production (tons) at unit q of potential mine i, in period t of its life. (Note t h a t t denotes t h e period of its life after this unit has been opened, and not t h e period of t h e model.) a,-j = ash content (%) in coal produced at existing mine i, in period t. Su = sulfur content (%), in coal produced at existing mine i, in period t. a, g( = ash content (%), in coal produced at unit q of potential mine i, in period t. (Note t h a t t denotes t h e period of horizon, and not t h e period of t h e unit's life after it has been opened.) Siqt = sulfur content (%), in coal produced at unit q of potential mine i, in period t. ( T h e same c o m m e n t applies here as for a,, ( ) I\ = existing mines. I2 = potential mines. J\ = R O M silo storage units. J2W = cleaned silo units at t h e Wentz facility. J2B = cleaned silo units at the Bullitt facility. J2 = cleaned silo units ( J?w U JIB )• SCjt = storage capacity of silo unit j in period t. (For j 6 J2, this is taken as t h e sum of t h e two associated silo storage capacities.) c-jt = present value of shipping cost (per ton), from existing mine i to silo j in period t. c
fqjt = P r e s e n t value of shipping cost (per ton), from a unit q of a potential mine i to silo j in period t. cf,t = present value of cleaning cost (per t o n ) , at silo j for coal from existing mine i in period t. cf -t = present value of cleaning cost (per ton), at silo j for coal from a unit q of a potential mine i in period t.
ctij € (0,1] = total weight attenuation factor (output per ton input) at silo j € Ji for coal from existing mine i. ctiqj = defined similar to a^- for production unit q at potential mine i. fiij € (0,1] = ash content attenuation factor at silo j g J 2 for coal from existing mine i. /3iqj = defined similar to ftij for production unit q at potential mine i. Hi € (0,1] = sulfur content attenuation factor at silo j S J 2 for coal from existing mine i. -/iqj = defined similar to 7^ for production unit q at potential mine i. Note: ctij, a,gj', fy, /3iqj, 7,j, and 7,-w- are assumed to be 1 for j € J1# Ai = flow arcs {i,j) from mine i to silo j ; Ff — {j : (i,j) £ A\ }, iZj = {i :
(«',i)e A x }. A2 = flow arcs (j, fc) from silo j to customer k; Ff = {k : (j,k) iP t = { j : ( i , * ) e A a } .
g A2 },
As = arcs (j, j ' ) representing by-product flow from Wentz silo j € J2w to Bullitt silo j ' € J 2 B , corresponding to stoker customer shipments; Fj = { j ' : (j,j') G A3},Rj = {j:(j',j)eA3}. c
jkt = present value of shipping cost (per ton), from silo j to customer k in period t. dfj?t = present value of shipping cost (per ton), from Wentz silo j 6 J2w to Bullitt silo j ' 6 JIB in period t. dkt = demand placed by customer k during period t. ASHukt = upper limit on ash percentage in coal to be delivered to customer k in period t. SULukt = upper limit on sulfur percentage in coal to be delivered to customer k in period t. Rkt — present value of additional revenue earned per percentage point below the maximum specification of ash content, in coal delivered to customer k in period t. fiqt = fixed charge ( equivalent present value as described above) to open a unit q at a potential mine i in period t.
H. D. Sherali and Quaid J. Saifee
240 / ( m a x = m a x i m u m { /•'< : q = I,. •• ,hi, i € h}
for all t = 1,...
,T.
PEa = present value of penalty per ton of coal unused at existing mine i in period t. PE{qt = present value of penalty per ton of coal unused at unit q of potential mine i in period i. Ut — upper limit on t h e number of units t h a t can be opened at a potential mine i in period t.
Mathematical Formulation A detailed m a t h e m a t i c a l formulation is given below, followed by an explanation and motivation of t h e objective function and the various constraints. T h e decision variables track t h e flow of coal from mines through silos (within the blending process) to the different customers. Specifically, these variables correspond to (i) t h e amount (tons) shipped from existing mine i through silo j to customer k in t i m e period t, given by yiju (ii) stoker customer by-product amount (tons) shipped from existing mine to i to silo j G J2W, which is then shipped to silo j ' G J 2 g , and then to customer k in period t, given by Yijyki (iii) amount (tons) shipped from unit q of potential mine i through silo j to customer k in t i m e period t, given by Wiq]kt (iv) stoker customer by-product amount (tons) shipped from unit q of potential mine i to silo j G J W , which is then shipped to silo j ' G J 2 s , and then to customer A: in a period 2, given by Wiqjjikt (v) a binary variable t h a t takes on a value of one if unit q of a potential mine i G I2 is opened or initiated in a period t, and is zero otherwise, given by X{qt. Note t h a t t h e aggregate shipments of coal from mines to silos, the blending process at the silos, and the aggregate shipments of blended coal from the silo facilities to the customers can be readily computed via these variables. Auxiliary Decision Variables 1. zn = slack variable equal to the amount (tons ) of coal produced at existing mine i during period t t h a t remains unused in t h a t period, for i G h, t = 1,. . . , T. 2. Ziqt = slack variable equal to the amount (tons) of coal produced at a unit q of potential mine i during period t t h a t remains unused in t h a t period, for i G h, q = l,...,hi, 8 = 1,...,T. 3. ASHkt = % ash content in blended coal delivered to customer k in period t, for k = l,...,K,t = l,...,T. 4. SULkt = % sulfur content in blended coal delivered to customer k in period t, for k = 1,...,K, t = 1,...,T.
Strategic
and Tactical Models and Algorithms
for the Coal Industry
241
5. 8it = n u m b e r of mines opened within t h e set limit Ut in period t, for t = 1 , . . . , T . 6. 02f = n u m b e r of mines opened beyond t h e set limit Ut in period t, for t = Objective Function: Minimize
£
£
£
£
( < # + eg, + c $ ) y y t f
j = i iefljn/] tgF? *=i
+ 3&J2W £ iGR)nh £ j'eFf £ j'GF £ , *=1 E ( ^ + ^ + ^" + 4 )ijj'ktn 2
£ E EEE(C +4 + 4 W
i = l i6fl;n/ 2 9=1 i g F 2 '=1
+ E E E E E E ^ + ^ + ^ +4 - f e + E E E / ' A + E £^<** + E E E ^ ' ^ <SH}n72 9=1 i = l
r
O.2/tmax02i
+ £
igfijn/i *=1 A-
r
i6R]-n/2 9=1 «=1
5
- £ E ( ^ ^ « - ASHkt)dktRkt
t=i
«:=i (=i
Constraints: 1. F/ow balance at the
mines:
Pit ~ E E
2to*< + £
j € f ? keF]
£
i£F} j'£Ff
y
£
.^'t<
Zit
k£Fj,
for i £ J ] , and < = 1 , . . . , T
E
Pi
t
+ 1 ) I ''9<
E
E
"'•uw + E
jeF} keFj
E
E
w
iijj'ks
^inS
jeF? }'£Ff tef*
for i £ I2, q = 1,. . . , hi and S = 1,. . . , T 2. Capacity
constraints
E
E
e H j n ^ k£Ff
on silos: 1
a 7,0-+ n)y>jkt + J2 2
E
E ^
+
a
# »
igfijn/i j ' e J j w fcgF?
+ E £ £ ^(i+ <*«>.»* + £ £ £ £ i(i + ^„-)w;-, ie«}n/ 2 i = i fcgF2
<
5C,(
igfl!n/ 2 9=1 i ' g J j w fcgf2
H. D. Sherali and Quaid J. Saifee
242 for j € (J2B U J2w) n J2, and t = 1 , . . . , T
E
E
WW + E
i'e«}n/i * e ^
w
E E
<«*< ^
ieH}n/3 9=1 /te^?
^
for j € Ji, and £ = 1 , . . . , T 3. Wentz to Bullitt transfer:
E
E
*«'* ^
E
j'eFj1 jteF2, for j € J
w
y**. < i-i E
E *«'«
j'eFf keF2,
keFfnK,,
fl F?, i € # } n 7 1; and t = 1 , . . . , T
E
w
E
>in'kt <
j'6F/ ieF*
E
w
iijkt < 1.1 E
E
ww**
j'eFf keF2,
keFfnK,,
for i e J 2 w n i ? , i € R) fl J 2 , 9 = 1 , . . . , hit and t = 1 , . . . , T 4. Demand constraints for customers: hi
E
a
E
u 3/<j*« +
12
12
12 a'j Ynj'kt + E
E
E
E E
E
E a w *"<«*<
A,-
+
a
iii wiqi?kt = dkt
j€JWn/S, ieR)m2 9=1 j'eF/ for k = 1,...,K,
and* = 1,...,T
5. Customer product quality constraints: 1 a
ASHU =
E
E
y^kt u Pa +
Ai
E
E
E
E ^jj't« a<< A'J
E
jehwnR*, ieR)ni, j'eF 3
jefij ieR)nh
+
E
hi
w
iqjktaiqtf3iqj+ E
E E Wi«y*
E
jeJwnR3, ieR)nh 9=1 j'eF?
jeflj iefljn/2 9=1 for fc = l , . . . , / r , f = 1 , . . . , T
5£/Xw
=
—
E2
jeR
12 vanilla + iefijn/,
E
E
E
E
in'kt sit m
iehwnR*, ieR)ml j'eFf
hi
+
Y
E h,
E wi»ikt siqt 7.«+
E
E
E E
jeJ^nfl 3 , .efl]n/2 9=1 ygF?
ww^,^;?
Strategic
and Tactical Models and Algorithms
for k = 1,...,K,
t-
for the Coal Industry
1,...,T
0 < ASHkt
< ASHuH,
for k = 1 , . . . , K, and t = 1 , . . . , T
0<SULkt
<SULukt,
foi k = l,...,K,andt
6. Restricting
243
=
l,...,T
a unit of mine to be opened only once: T
E
x
(=1
1
'ii -
for i G I2, q = 1, • . . , hi. 1. Upper limit on the number of units that can be opened in a time
period:
hi
E E xtgt<elt
+ e2i{ovt = i,...,T
ieh 9=1
0<0lt
92t>0,ioit
8. Sequencing of units at a potential to derive a tighter relaxation.)
=
l,...,T
mine ( with some implied constraints
added
*"iqt _ / , •Tiq'6 6
for i G I2, q' = l,...,q9. Disaggregated ation)
1, q = 2 , . . . , A,-, t = 1 , . . . , T
constraints E jeF?
W
(further
ilikS +
restrictions
E E j€F>nJ2W j'eFf
added to derive a tighter
W
ilH'kS < dkS E Xigt t<s
for i £ 7 2 , q = 1,. . . , hi, k = 1 , . . . , K, 8 = 1 , . . . , T 10. Nonnegativity
and binary
constraints:
Vijkt > 0, for i e h, j = 1 , . . . , J , k = 1 , . . . , K, t = 1 , . . . , T Vigjkt > 0, for i G 7 2 , q = 1 , . . . , hi, j = 1 , . . . , J , k = 1 , . . . , K, t = 1 , . . . , T Yijj'kt > 0, for i e h, j e J2w, j ' € J2B, k = 1 , . . . ,K, t = 1 , . . . , T
relax-
244
H. D. Sherali and Quaid J. Yiqjj
q = l,...,ft,-, j G J 2 w , j ' € ^2i5, * =
Saifee 1,...,K,
t=l,...,T zu > 0, for i G / ! , t = 1 , . . . , T Ziqt > 0, foi i e I2, q = 1,...
,hi, t = 1,...
Xiqt = 0 01 1, foi i € I2, q = 1,...
,T
,hi, t = 1,...
,T
C o m m e n t s on the Formulation: Given below is an explanation of some of t h e finer points related to t h e objective function and t h e constraints. As this is a long-term model, all t h e cost coefficients in the objective function are present values. A fixed charge cost is incurred whenever a unit at a potential mine is opened. Shipment of better ash quality coal is rewarded, and storage and underutilization at a mine is discouraged by accommodating a penalty t e r m in t h e objective function. Also, in any t i m e period t, if the n u m b e r of units t h a t can be opened exceeds t h e specified upper limit on the number of units t h a t can be opened given the available b u d g e t a r y resources, t h e n a surcharge equal t o 20% of t h e m a x i m u m fixed cost over t h e different units of the potential mines is applied to the number of units t h a t exceed t h e stated limit. This surcharge (its value being governed by finance considerations) reflects t h e burden for acquiring additional capital for developing units beyond what t h e available budget permits. In t h e flow balance constraints (1) for t h e mines, we require t h a t all t h e coal produced within each six m o n t h period should account for flow within this duration, without any carryover of inventory. If the flow falls far shorter t h a n t h e production, despite t h e underutilization penalty in t h e objective function, then this mine is a strong candidate for a shutdown. Also, in t h e flow balance constraint for a new unit of a potential mine, t h e production rate function at t h a t unit is multiplied by a binary variable t h a t takes on a value of one when this unit is opened and is zero otherwise, such t h a t t h e constraint reflects t h e appropriate rate of production according to its age in t h e given period. As each J2 t y p e of silo constitutes a pair of silos, one of which receives the runof-mine coal, while t h e other stores t h e cleaned coal following t h e beneficiation process, we have used t h e coefficient ( "''' with the flow variables in t h e capacity constraints (2) for t h e J2 type of silos. Furthermore, t h e normal storage or handling capacity of t h e silo is multiplied by t h e n u m b e r of working days in a t i m e period to derive t h e p a r a m e t e r SCjtT h e Wentz to Bullitt transfer constraints (3) are interval constraints, and they reflect a transfer of coal from t h e Wentz silos to t h e Bullitt silos as actually practiced by t h e Westmoreland Coal Company. These constraints can easily be generalized
Strategic
and Tactical Models and Algorithms
for the Coal Industry
245
for any coal company, and can be removed from t h e formulation if no such t y p e of transfer is practiced in a particular company. In t h e d e m a n d constraints (4) for customers, t h e flow variables are multiplied by coefficients ctij to take into account t h e total weight attenuation during t h e beneficiation process at t h e silos. In t h e customer product quality constraints (5), we have simply used coefficients a,„( and s,-,( instead of using £ a
s
as it would lead to nonlinear constraints t h a t would have to be linearized. (These l a t t e r functions are of t h e same n a t u r e as t h e production r a t e function used in t h e flow balance constraints for a unit of a given potential mine.) However, since t h e variation in t h e ash/sulfur content over t h e horizon is not too significant, we can approximate these nonlinear terms by a,-3t and «;,(, which in effect assumes t h a t t h e ash/sulfur content fraction varies as if t h e unit was opened at t i m e period 1. In t h e constraints (6), every unit of a given potential mine is restricted to be opened only once. Each constraint (7) has a right-hand side equal to t h e sum of 8u and 02t, where 9U is bounded from above by t h e m a x i m u m number of units from all potential mines t h a t can be opened in t i m e period t. This upper limit depends on t h e availability of resources such as capital, equipment, and manpower. T h e variable 02t, is simply nonnegatively restricted, b u t is penalized as described in t h e c o m m e n t pertaining t o t h e objective function. T h e constraints (8) enforce t h e sequencing of units at a potential mine. Here, it. is sufficient to use only t h e constraints having ' = — 1 in order to obtain a valid representation. However, additional constraints have been added for 1 < q' < q — 1 as well, in order to obtain a tighter relaxation. Given below is an example to illustrate this feature. Consider i = 1, q = 1,2,3 and t = 2. For q' = q — 1, we would obtain t h e following constraints: x122 < Z m + £112 a n d X132 < X121 + X m - Along with constraints (6), these restrictions are sufficient to require the sequencing of units 1,2, and 3 in this order up to period 2. However, if we let 1 < q' < q — 1, we would include an additional constraint: X132 < X m + Xn2- Although this is implied in t h e integer sense, it is not implied in t h e continuous sense, and hence it assists in obtaining a tighter linear programming relaxation for our problem. Such a tighter relaxation can enhance the performance of b o t h exact and heuristic solution procedures ( see Nemhauser and Wolsey [11] ). In a similar spirit, t h e disaggregated constraints (9) are included in t h e formulation to further tighten t h e continuous relaxation. Analogous to t h e standard disaggregated fixed-charge problem formulation (see Nemhauser and Wolsey [11]), these constraints could be written as
Wi,
j"6F!
jkS+
£ j€F>nJ2W
£
j'£Ff
WiqjjikS
t+ l)x,',t, 4 ^ I i , (
< min t<8
H. D. Sherali and Quaid J. Saifee
246 for i € h, q = I,...,hi,
k = 1,...,K,
8 = 1,...,T.
However, as we already have flow balance constraints of t h e form
Yl *«.•«•« + jeF> for i G h, 1 = !,•••,hi, constraints
Y izF}
w
Yl
JL wiqjjlkS < J2 Pi<,(s-t + \)xiqt
jeF>nJ2W j'eFf
t<s
k = \,...,K,
S = 1,...,T,
idkS +
Y
12
we have simply added t h e
W
idi'kS < dks J2 xkt
jeF?nj2W j'eFf
t<s
for i € h, 1 = 1, • • • ,hi, k = 1,...,K, 6 = 1,...,T in our formulation. T h e computational results given in Section 5 exhibit t h e effect of constraints (8) and (9) in tightening t h e continuous relaxation of t h e problem.
4
Solution Procedures
This section presents methodologies for solving t h e long-term strategic model, using b o t h exact and heuristic techniques aimed at deriving near optimal solutions. Solving t h e M o d e l Using C o m m e r c i a l Softwares T h e long-term model is a mixed-integer 0-1 programming problem. We first tried solving t h e various test problems generated using Z O O M , which is t h e default solver in GAMS (see Brook et al., [3]). However Z O O M could only find an optimal solution for two out of t h e ten test problems even with a 10% optimality tolerance, and did not even find an integer feasible solution for t h e remaining problems except for test problem 4. Hence, we had to resort to more sophisticated, state-of-the-art mixedinteger solvers, namely, OSL (developed by IBM) and C P L E X ( M I P option, developed by C P L E X Optimization Inc.). Different links to connect these softwares with GAMS are available through t h e GAMS Development Corporation. After coding t h e problem in G A M S , these solvers can be called to execute t h e solution process, provided of course, they have been installed within t h e system. T h e optimality tolerance while solving these test problems was kept at 10%. As t h e results of Section 5 indicate, these options, if available, are viable alternatives. C P L E X was able to solve seven out of t e n problems, while OSL solved all t h e test problems. However, C P L E X found b e t t e r quality solutions t h a n did OSL for five of t h e seven instances it solved. If these options are unavailable and one has access only to a linear p r o g r a m m i n g code, or if a robust procedure is required for solving larger sized problems, we propose t h e following linear programming based heuristic procedure. We remark here t h a t we tried to derive approximate solutions for our test problems using t h e Pivot and Complement Heuristic of Balas and Martin [2], but it failed
Strategic
and Tactical Models and Algorithms
for
the Coal Industry
247
to solve all b u t t h e smallest of our test problems. Hence, we designed our own heuristic procedures, exploiting t h e n a t u r e of our problem. (Some of t h e ideas below (see Step 2 in particular) are portable to other 0-1 mixed-integer p r o g r a m m i n g contexts as well.) Linear P r o g r a m m i n g Based Heuristic ( L P H ) This heuristic procedure employs a sequential rounding scheme based on a series of continuous linear programming relaxations, exploiting t h e structure of t h e problem in determining which variables to round up to 1 at each step of t h e process. T h e following are t h e steps involved in this heuristic procedure: Step 1: Solve t h e linear programming (LP) relaxation of t h e problem. If t h e solution obtained has all t h e x-variables at binary values, then stop; t h e optimal solution obtained for the LP relaxation solves t h e mixed-integer p r o g r a m m i n g problem. Otherwise, go to Step 2. Step 2: (Optional, if package such as MINOS is available to handle nonlinear objective functions.) Replace t h e objective function by
Maximize   Σ_{i∈I_2} Σ_{q=1}^{h_i} Σ_{t=1}^{T} ( x_iqt − 1/2 )²     (11)
and incorporate the following additional constraint in the problem:

z ≤ v_LP (1 + Δ)     (12)
where z represents the objective function of t h e strategic model, VLP is the value of its linear programming relaxation, and 100A is a specified % deviation p e r m i t t e d from v^p. Solve t h e continuous relaxation of this linearly constrained nonlinear programming problem. ( We used MINOS 5.2 available with GAMS for this purpose.) As t h e form of t h e objective function indicates, a solution to this problem encourages all t h e binary variables t o take a value of 0 or 1, in order to maximize t h e objective function, while satisfying t h e problem constraints, including t h e additional constraint (12). In fact, a solution is integer feasible to t h e original strategic model and has an objective value within t h e specified tolerance I^LP(1 + A ) if and only if it solves t h e above problem. However, this problem is nonconvex, and so MINOS might stall at a nonoptimal local m a x i m u m , which is not integer valued. Hence, a range of values of A € [0,0.1] say, may be a t t e m p t e d in a sequential set of runs, and t h e best solution obtained may be recorded. (Note t h a t if A = 0, and if t h e solution obtained has all the i-variables at binary values, t h e n an alternative optimal solution to the LP relaxation has been obtained which solves t h e original mixed-integer program.) Proceed t o Step 3.
H. D. Sherali and Quaid J. Saifee 3: Define F = { a set of mines i for which some unit q has a fractional variable Xiqt for some period t}. For each i G F, let q(i) b e t h e imminent unit defined as t h e smallest index q for which x,-,( is fractional for some t. Note t h a t by t h e constraints of t h e problem, and t h e definition of q(i), we must have Xiqs = 1 for each q < q(i), for some 8 < t, where t is t h e smallest t i m e period for which t h e x-variable for q(i) is fractional. Now, for each i G Fi, let tn < £,2 < . . . < tini be t h e t i m e periods t for which t h e variables xiq^y are fractional, and let vn,..., Vini be t h e fractional values of X{q^tik, k — 1 , . . . , n,-. Accordingly, c o m p u t e
T_i = Σ_{k=1}^{n_i} v_ik t_ik + T ( 1 − Σ_{k=1}^{n_i} v_ik )   for each i ∈ F,

where T is the number of time periods in the horizon. Note that the smaller the value of T_i, the relatively earlier is the tendency for the imminent unit of mine i to be opened. Hence, find

T* = minimum_{i∈F} ( T_i ).     (13)
Instead of selecting a mine t h a t is determined simply by (13) as t h e one for which t h e imminent unit should be opened, we examine a b a n d of T; values by finding T = {i G F : T{ < T* + 1 } For mines within this band, identify a mine r according to r £ argmax { ^ ( i ) } , where !/;*(,-) = m a x i m u m {vik} for i G F , E 7?
k=i,...,m
If t h e fractional x-variables represent only one unit of a potential mine at different t i m e periods, then select k(r) = 1, for fixing xTqiT)tTl = 1. 4: Fix xrq(r)tr = 1, along with all t h e other binary variables which turned out to be 1 in t h e most recent linear programming relaxation solved. Re-solve t h e linear programming relaxation after fixing t h e above binary variables. If t h e problem is infeasible, go to Step 5. If t h e solution obtained has all t h e xvariables at binary values, then stop; t h e optimal solution obtained is prescribed as a heuristic solution. If the solution obtained still has fractional values for some of the x-variables, then return to Step 3. 5: Replace k(r) by k(r) — 1. ( Note t h a t by t h e structure of t h e problem and t h e n a t u r e of t h e solution to t h e last feasible linear p r o g r a m m i n g relaxation, it must b e t h a t t h e revised k(r) > 1.) R e t u r n t o Step 4.
Strategic
5
and Tactical Models and Algorithms
for the Coal Industry
249
Computational Experience
Real d a t a from t h e Westmoreland Coal Company and nine other similar, hypothetical d a t a sets are generated to test different heuristics and t h e commercial softwares used to solve t h e problem. T h e real d a t a provided by Westmoreland Coal Company has 11 existing mines and 8 potential mines. Of t h e latter, one has four units, two of t h e m have two units each, and t h e remaining five have one unit each. T h e r e are 8 silos and 6 customers. T h e horizon considered in running t h e model with this d a t a is 6 periods (3 years) in duration. Nine hypothetical test problems are created using a different n u m b e r of existing mines, potential mines, units in each potential mine, silo units, and customers. Also, various combinations and connections between existing mines, potential mines, units in a potential mine, silo units, and customers are incorporated into these test problems. These problems vary in t h e range of 388 constraints, 27 binary variables, and 481 t o t a l n u m b e r of variables, to 4155 constraints, 240 binary variables and 6565 total n u m b e r of variables. Table 1 provides a list of specifications for all t h e test problems generated. In all these d a t a sets, we have included a special hypothetical potential mine with a single production unit t h a t has a very high rate of production, and has a zero ash and sulfur percentage content in t h e coal produced by it. An inordinately large fixedcharge is required to open this mine. Also, t h e coal produced by it is given very high shipping and cleaning costs. In effect, it is ensured t h a t this special hypothetical unit is used only when t h e model would otherwise have been infeasible. In other words, whenever t h e model uses this particular mine, it implies t h a t t h e model is infeasible and t h a t t h e infeasiblity lies where t h e flow is satisfied using t h e coal from this special mine. (While this should be included in practice, none of our test problems needed to resort to this hypothetical mine.) Effect of I m p l i e d C o n s t r a i n t s After coding t h e test problems in G A M S , we also performed an investigation on t h e effect of using constraints t h a t are implied in t h e integer sense, though not in the continuous sense, for t h e purpose of derving a tighter relaxation of the model, as discussed in Section 3. Tighter relaxations of a discrete programming problem help in improving t h e performance of any exact or heuristic solution procedure. To ascertain the effect of t h e implied constraints in (8) and (9) on the model, linear programming relaxations of all t h e test problems were run using MINOS 5.2 as a solver, both with and without these implied constraints. T h e results in Table 2 exhibit t h a t t h e implied constraints used in t h e formulation produced a tighter relaxation for 6 out of the 10 test problems. C o m p a r i s o n of D i f f e r e n t C o m m e r c i a l S o f t w a r e s
The test problems were first solved by ZOOM (Zero/One Optimization Method), which is the default solver in GAMS for mixed-integer programming problems. As it turned out, ZOOM faced significant difficulties even in obtaining a feasible solution to the linear programming relaxation of the original mixed-integer programming problem. After trying different combinations of options, such as CHEAT, DIVE, EXPAND, FACTOR, GAP, PARTIAL, and QUIT (see Brooke, Kendrick and Meeraus [3]), in the GAMS/ZOOM options file, we were able to find an integer solution for only three out of the ten test problems using an optimality tolerance of 10%. (It should be noted that if OSL, CPLEX, or ZOOM does not find an optimal solution within the specified range, then it reports the best solution found up to the point of termination, noting that there is no optimal solution within the specified range.)

Next, we solved the ten problems using OSL (Optimization Subroutine Library) and CPLEX, again with a 10% optimality tolerance. Separate GAMS links, developed by the GAMS Development Corporation, were used to link GAMS with OSL and with CPLEX. The OSL runs were made on an IBM RS/6000 workstation model 320H running AIX 3.1. The runs with CPLEX as the solver were made on a SUN Sparc 1 running SUNOS 4.1.1. The runs using ZOOM as the solver were made on an IBM 3090. (Different computers were used because the OSL and CPLEX runs were made at the GAMS Development Corporation.) The results obtained after solving the problems with ZOOM/GAMS, OSL/GAMS, and CPLEX/GAMS are tabulated in Table 3.

OSL was able to solve all the test problems, as shown in Table 3, while CPLEX was unable to solve three of these problems due to memory limitations on the SUN Sparc 1 computer. On the other hand, ZOOM could solve only three test problems, two of which were solved within 10% of optimality, while for the third problem (number 4), only a feasible integer solution within about 43% of the linear programming lower bound could be found. No feasible solution was found for the remaining problems. Comparing CPLEX and OSL in terms of the number of iterations and the relative gap from the linear programming relaxation value, the results in the table show that for 4 out of the 7 problems that CPLEX solved, it required more iterations than did OSL, but for 5 out of these 7 problems, it obtained a better solution than did OSL. (Blank spaces in the table indicate those problems that could not be solved, either by CPLEX due to a shortage of memory or by ZOOM due to various numerical difficulties.)
Prob.  Num. of   Num. of    Num. of units   Num. of    Num. of  Num. of  Num. of  Num. of  Num. of       CPU time  Iters.
num.   existing  potential  in potential    customers  time     silos    constr.  vars.    binary vars.  (sec)
       mines     mines      mines                      periods
  1       8         5            9              7         3        8       388      481        27           3.7      560
  2      11         8           13              6         6        8       950     1145        78          31.2     2047
  3       5         8           14              7         6        8      1195     2309        84          57.6     2599
  4       4         8           15              5         6        8       776     1337        90          25.5     1885
  5       7         8           15             10         6        4      1310     2987        90          93.3     3367
  6       5         8           16              6         6        4      1053      863        96          36.6     2238
  7       7         8           13              6         8        8      1290     1957       104          56.7     2634
  8       4         8           14              6         8        6      1251     1752       112         105.8     5016
  9       6        13           24             10         8        8      2981     4149       192         497.8     9850
 10       7        16           30             12         8       10      4155     6565       240         584.6     6947

Table 1: Test problem specifications for the long-term strategic model
Legend: CPU time = CPU seconds for solving the linear programming (LP) relaxation on an IBM 3090 computer
Iters. = Number of iterations required to solve the linear programming relaxation of the problem
Vars. = Variables, Num. = Number, Prob. = Problem, Constr. = Constraints
Test problem   Solved without implied          Solved with implied
number         constraints (objective value)   constraints (objective value)
  1                258368136                       258368136
  2                296801038                       316045551
  3                 76531010                        76531010
  4                 31379965                        31379965
  5                 77480004                        77740986
  6                328998463                       351037130
  7                391664242                       392528791
  8                 56816347                        56816347
  9                655812636                       673834448
 10                409673011                       412477641

Table 2: Comparison of solutions obtained with and without implied constraints
Comparison of Results Obtained Using Commercial Software Packages and Heuristic LPH

Table 4 presents the results of using Heuristic LPH to solve the ten test problems. In all the test problems except for problem number 2, the proposed heuristic consumes more iterations to solve the problems than do the commercial packages OSL and CPLEX, given the 10% optimality tolerance used within the latter methods. It should be noted, however, that while solving the problems using our heuristic, we did not use any advanced bases from one LP run to the next. The effort for the proposed heuristic can be substantially reduced by using the optimal basis obtained for one run in the subsequent problem solved, since only one additional fractional binary variable is fixed at 1 from one run to the next in the sequence of problems solved. This automation can be accomplished by incorporating the MPS file generated through GAMS 2.25 within a FORTRAN or C program, and updating this file from one call to the next of the solver MINOS 5.2, OSL, or CPLEX. As far as solution quality is concerned, Heuristic LPH obtains a better solution than does OSL for 3 of the test problems and the same solution for 2 of them. It also obtains a better solution than does CPLEX for a single case. In this case, Heuristic LPH actually identifies an alternative optimal linear programming solution that happens to be integer feasible at Step 2 of the procedure, using A = 0.0. Overall, as can be seen by comparing Tables 3 and 4, for the most part, the solutions obtained via the different procedures are comparable in quality.
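A sketch of the basis-reuse idea mentioned above is given below. It is purely illustrative and assumes a hypothetical solver interface solve_lp(fixed_at_one, start_basis) that accepts an advanced starting basis and returns the new solution and basis; the actual automation described in the text operates on the GAMS-generated MPS file, and the infeasibility back-off of Step 5 is omitted for brevity.

```python
def lph_with_warm_starts(solve_lp, choose_variable_to_fix):
    """Re-solve the sequence of LP relaxations, passing the optimal basis of
    each run to the next, so that only a few simplex iterations are needed
    after one more binary variable is fixed at 1."""
    fixed_at_one, basis = set(), None
    x, basis = solve_lp(fixed_at_one, start_basis=basis)
    while any(0.0 < v < 1.0 for v in x.values()):
        fixed_at_one = fixed_at_one | {choose_variable_to_fix(x)}
        x, basis = solve_lp(fixed_at_one, start_basis=basis)   # warm start
    return x
```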
Test     --------------- OSL ----------------   -------------- CPLEX ---------------   --------------- ZOOM ---------------
prob.    MIP solution   RG %   CPU1     Iters   MIP solution   RG %   CPU2     Iters   MIP solution   RG %   CPU3     Iters
         value                 (secs)           value                 (secs)           value                 (secs)
  1      259637230      0.49      1.8     178   259637230      0.49      6.0     426   259637230      0.49      5.4    1121
  2      368434780     16.5    1335.9   47348   373343025     18.12    414.2    8765   -              -        -        -
  3       86359033     12.84     27.9    1419    83701728      9.36    230.1    4436   -              -        -        -
  4       33697853      7.38     20.6    1455    33642804      7.21    243.0    6825    44930032     43.18    355.7   49077
  5       78278455      1.03     32.6    2126    78278455      1.03     83.9    2045    79326334      2.38    139.2    7554
  6      446829414     27.28    196.0   13046   -              -        -        -      -              -        -        -
  7      448381433     14.22    121.9    5912   440208987     12.14    179.9    4271   -              -        -        -
  8       60458788      6.41     41.1    2238    60584061      6.63    127.7    2634   -              -        -        -
  9      965484334     43.28    166.9    3994   -              -        -        -      -              -        -        -
 10      449239319      8.91    221.5    4004   -              -        -        -      -              -        -        -

Table 3: Comparison of solutions obtained and effort required for solving the test problems with OSL/GAMS, CPLEX/GAMS, and ZOOM/GAMS
Legend: CPU1 time = Resource usage in CPU seconds on an IBM RS/6000 workstation model 320H
CPU2 time = Resource usage in CPU seconds on a SUN Sparc 1
CPU3 time = Resource usage in CPU seconds on an IBM 3090
Iters = Total number of simplex iterations required to solve the problem
RG = % Relative gap of the MIP solution obtained from the initial linear programming relaxation value
Test      MIP soln.      RG %    CPU time   Iters.   LPR   FRA
problem   value                  (secs)
  1        259637230      0.49       4.1      1108     3     1
  2        370792903     17.32      88.2      5305     3     7
  3         84327533     10.18     227.5     12340     6     3
  4         34704832     10.59     102.5      9435     7     6
  5         78278455      1.03     540.7     22516     8     6
  6        469914352     33.86     293.1     19006    12     9
  7        448381433     14.22     279.4     12883     6     4
  8         60272957      6.08     770.8     42031    14    10
  9       1023861971     51.94    4003.4     77540    12    11
 10        412477641      0.00    1009.4     13447     2     8

Table 4: Computational Experience Using Heuristic LPH
Legend: CPU time = Resource usage in CPU seconds on an IBM 3090
Iters = Total number of simplex iterations required to solve the problem
RG = % Relative gap of the MIP solution obtained from the initial linear programming relaxation value
LPR = Number of linear programming runs required before discovering the mixed-integer solution
FRA = Number of units of potential mines having fractional values at the first step
Finally, it should be noted that because OSL and CPLEX employ a branch-and-bound technique, it can be expected that, as the problem size increases, the computer memory requirements will increase substantially, and the optimality tolerance will need to be further relaxed to keep these procedures viable. On the other hand, the proposed Heuristic LPH can be expected to remain robust, since it relies only on the solution of a limited sequence of linear programming relaxations.
Appendix: Modifications for a Tactical Day-to-Day Operational Model

Sherali and Puri [14] have presented a description of three linear programming tactical models for making day-to-day mining, cleaning, blending, and distribution decisions, given a set of operating mines. The most accurate and detailed of these three models is called "Model 1". In this model, coal is assumed to be shipped out of the mines to the silos at the beginning of the time periods, and to be shipped out of the silo units to the customers at the end of the time periods. A maximum of a three-period shipment lag between coal production at the mines and the final shipment to customers is permitted, based on an estimate of the clearance time at the silos. If t_1 is the time period for a certain mine-to-silo shipment, and t_2 is that for a continuing silo-to-customer shipment, the shipment lag is given by t_2 - t_1. The transfer lag, another time-lag factor, which indicates the difference between the time of dispatch of a coal shipment from a mine to a silo and its actual arrival time at the silo, is negligible for the problem under study, and is therefore assumed to be zero. (Nonzero transfer lags can, however, be readily accommodated by time-shifting the data.) We adopt this same structure of the model. However, the following modifications have been made to enhance the problem representation, based on the feedback from our case study implementation.
1. A penalty-reward component has been introduced into the objective function, which either penalizes or rewards the quality sent to the customers relative to what is desired. The need for this function arose due to the 1990 Clean Air Act, as the customers are now somewhat more liberal about the content of ash in the coal, but are more stringent about its sulfur content. Hence, besides a piecewise linear reward or penalty function imposed on the ash quality of the coal shipped relative to the maximum specified limit, we have also introduced a piecewise linear reward structure in the objective function to reflect the incentive for shipping coal having a better sulfur quality, while restricting the maximum sulfur content in each period as a hard constraint.

2. Due to the 1990 Clean Air Act, coal companies might have to look for different types of cleaning technologies instead of using a single type. "Model 1" can
be given an interpretation so that it can accommodate more than one type of cleaning technology. For example, if there are two types of cleaning technologies used by a coal company, then the silos in J_2 can be divided further into subsets J_21 and J_22, where each J_21-type silo constitutes a pair of silos, one of which receives the run-of-mine coal while the other stores the coal following the cleaning operation, and where each J_22-type silo constitutes a similar pair of silos, but representing an alternative cleaning technology. Note that this is principally a data processing, rather than a model oriented, modification.

3. The equality constraint that strictly enforces as much coal to be transferred from the Wentz to the Bullitt silos as is shipped to stoker customers has been relaxed to an interval constraint, to better reflect the actual practice in this transfer.

Since Sherali and Puri do not provide a mathematical formulation for "Model 1", for the sake of convenience in reference by practitioners, we give below a complete mathematical formulation (including the foregoing modifications) that has been tested and implemented using GAMS. For completeness, we first specify the data requirements along with our notation. Note that the short-term tactical model, in contrast with the long-term strategic model, has to contend with daily storage restrictions at the mines and at the silos, as well as with the dissipation of any initial amount in storage at the silos within a rolling horizon implementation framework.

t = 1, ..., T = number of time periods.
i = 1, ..., m = number of mines.
j = 1, ..., J = number of silo units.
k = 1, ..., K = number of customers.
K_st = "stoker customers" served by the Wentz silos.
P_it = production (tons) at mine i in period t.
a_it, s_it = ash and sulfur content (%), respectively, in coal produced at mine i in period t.
SM_i = storage capacity at mine i.
c_i^sm = storage cost (per ton) at mine i.
J_1 = ROM (run-of-mine) silo storage units.
J_2W = cleaned silo units at the Wentz facility.
J_2B = cleaned silo units at the Bullitt facility.
J_2 = cleaned silo units (J_2W ∪ J_2B).
SS_j = storage capacity of silo unit j. (For j ∈ J_2, this is taken as the sum of the two associated silo storage capacities.)
c_j^ss = storage cost per ton of coal at silo j.
q_j^0 = initial amount in storage at silo j.
a_j^0, s_j^0 = ash and sulfur content (%), respectively, in the initial storage amount at silo j.
c_ij^MS = shipping cost (per ton) from mine i to silo j.
α_ij ∈ (0,1] = total weight attenuation factor (output per ton input) at silo j ∈ J_2 for coal from mine i.
β_ij ∈ (0,1] = ash content attenuation factor (output per ton input) at silo j ∈ J_2 for coal from mine i.
γ_ij ∈ (0,1] = sulfur content attenuation factor (output per ton input) at silo j ∈ J_2 for coal from mine i.
Note: α_ij, β_ij, and γ_ij are assumed to be 1 for j ∈ J_1.
A_1 = flow arcs (i,j) from mine i to silo j; F_i^1 = {j : (i,j) ∈ A_1}, R_j^1 = {i : (i,j) ∈ A_1}.
A_2 = flow arcs (j,k) from silo j to customer k; F_j^2 = {k : (j,k) ∈ A_2}, R_k^2 = {j : (j,k) ∈ A_2}.
A_3 = arcs (j,j') representing by-product flow from Wentz silo j ∈ J_2W to Bullitt silo j' ∈ J_2B, corresponding to stoker customer shipments; F_j^3 = {j' : (j,j') ∈ A_3}, R_j'^3 = {j : (j,j') ∈ A_3}.
c_jk^SC = shipping cost (per ton) from silo j to customer k.
c_jj'^WB = shipping cost (per ton) from Wentz silo j ∈ J_2W to Bullitt silo j' ∈ J_2B.
d_kt = demand placed by customer k during period t.
ASH_kt^u = upper limit on ash percentage in coal to be delivered to customer k in period t.
SUL_kt^u = upper limit on sulfur percentage in coal to be delivered to customer k in period t.
c_k^1 = slope (revenue/ton) of the reward function for each % point below the maximum specified limit of ash in coal shipped to customer k.
c_k^2 = slope (cost/ton) of the penalty function for each % point above the maximum specified limit of ash in coal shipped to customer k.
c_k^3 = slope (revenue/ton) of the reward function for each % point below the maximum specified limit of sulfur in coal shipped to customer k.
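To make the role of these slopes concrete, the following sketch computes the quality penalty-reward contribution for one customer and one period. It is an illustration only, assuming the piecewise-linear structure described in modification 1 at the start of this appendix (a reward of c_k^1 per % point of ash below the limit, a penalty of c_k^2 per % point above it, and a reward of c_k^3 per % point of sulfur below its limit, all scaled by the tonnage delivered); the function name is hypothetical.

```python
def quality_penalty_reward(ash_pct, sul_pct, ash_limit, sul_limit,
                           tons_delivered, c1, c2, c3):
    """Objective contribution of coal quality for one customer-period.

    Negative values are rewards (revenue), positive values are penalties,
    matching a minimization objective.
    """
    if ash_pct <= ash_limit:
        ash_term = -c1 * (ash_limit - ash_pct) * tons_delivered   # reward
    else:
        ash_term = c2 * (ash_pct - ash_limit) * tons_delivered    # penalty
    # Sulfur above its limit is excluded by a hard constraint, so only the
    # reward side appears here.
    sul_term = -c3 * (sul_limit - sul_pct) * tons_delivered
    return ash_term + sul_term
```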
Mathematical Formulation

The decision variables for the tactical model are defined as follows:

(i) the amount (tons) shipped from mine i to silo j in period t, with continued shipment to customer k in period τ, given by y_{ijt}^{kτ};

(ii) the amount (tons) in initial storage at silo j, shipped to customer k in period t, given by y_{jkt}^{0};

(iii) the amount (tons) shipped from mine i to silo j ∈ J_2W in period t_1, which is then shipped to j' ∈ J_2B in period t_2, and finally shipped to customer k in period τ, given by Y_{ijj'kt_1t_2}^{τ}. (Note that the initial storage at the silos is also assumed to be dissipated within three periods.) Hence, for a given t_1, we have

(t_2, τ) ∈ t_L(t_1) = {(t_1+1, t_1+1), (t_1+1, t_1+2), (t_1+2, t_1+2)}.

Auxiliary Decision Variables

1. x_{iδ}^{s} = slack variable equal to the amount (tons) of coal remaining in storage at mine i during period δ, for i = 1, ..., m, δ = 1, ..., T.

2. u_{jδ}^{s} = accumulated storage amount (tons) in silo unit j during period δ, for j = 1, ..., J, δ = 1, ..., T.

3. ASH_{kτ} = % ash content in blended coal delivered to customer k in period τ, for k = 1, ..., K, τ = 1, ..., T.

(a) (ASH_{kτ})_1 = the part of the % ash content in blended coal delivered to customer k in period τ that is below the maximum specified limit, for k = 1, ..., K, τ = 1, ..., T.

(b) (ASH_{kτ})_2 = the part of the % ash content in blended coal delivered to customer k in period τ that exceeds the maximum specified limit, for k = 1, ..., K, τ = 1, ..., T.

4. SUL_{kτ} = % sulfur content in blended coal delivered to customer k in period τ, for k = 1, ..., K, τ = 1, ..., T.

Objective Function:

Minimize

Σ_{j=1}^{J} Σ_{i∈R_j^1} Σ_{k∈F_j^2} Σ_{t=1}^{T} Σ_{τ=t}^{min(t+2,T)} (c_{ij}^{MS} + c_{jk}^{SC}) y_{ijt}^{kτ}

+ Σ_{j∈J_2W} Σ_{i∈R_j^1} Σ_{j'∈F_j^3} Σ_{k∈F_{j'}^2} Σ_{t_1=1}^{T} Σ_{(t_2,τ)∈t_L(t_1)} (c_{ij}^{MS} + c_{jj'}^{WB} + c_{j'k}^{SC}) Y_{ijj'kt_1t_2}^{τ}

+ Σ_{j=1}^{J} Σ_{k∈F_j^2} Σ_{t=1}^{T} c_{jk}^{SC} y_{jkt}^{0} + Σ_{i=1}^{m} Σ_{δ=1}^{T} c_i^{sm} x_{iδ}^{s} + Σ_{j=1}^{J} Σ_{δ=1}^{T} c_j^{ss} u_{jδ}^{s}

+ Σ_{k=1}^{K} Σ_{τ=1}^{T} [ -c_k^1 (ASH_{kτ}^u - (ASH_{kτ})_1) d_{kτ} + c_k^2 (ASH_{kτ})_2 d_{kτ} ]

+ Σ_{k=1}^{K} Σ_{τ=1}^{T} [ -c_k^3 (SUL_{kτ}^u - SUL_{kτ}) d_{kτ} ]

Constraints:
1. Flow balance and storage at mines:

Σ_{t=1}^{δ} P_{it} - Σ_{t=1}^{δ} Σ_{j∈F_i^1} Σ_{k∈F_j^2} Σ_{τ=t}^{min(t+2,T)} y_{ijt}^{kτ} - Σ_{t=1}^{δ} Σ_{j∈F_i^1∩J_2W} Σ_{j'∈F_j^3} Σ_{k∈F_{j'}^2} Σ_{(t_2,τ)∈t_L(t)} Y_{ijj'ktt_2}^{τ} = x_{iδ}^{s}
    for i = 1, ..., m, and δ = 1, ..., T

0 ≤ x_{iδ}^{s} ≤ SM_i    for i = 1, ..., m, and δ = 1, ..., T

2. Storage constraints for silo units:

(accumulated inflow into silo unit j through period t) + q_j^0 - (accumulated shipments out of silo unit j through period t) = u_{jt}^{s}
    for j = 1, ..., J, and t = 1, ..., T

0 ≤ u_{jt}^{s} ≤ SS_j    for j = 1, ..., J, and t = 1, ..., T

where, for j ∈ J_2B, the accumulated inflow also includes the by-product transfers Y_{ijj'kt_1t_2}^{τ} received from the Wentz silos over the admissible combinations t'_L(t) = {(t-1,t), (t,t), (t,t+1)}, and this additional term is zero otherwise.

3. Dissipation of initial storage at silos:

Σ_{k∈F_j^2} Σ_{t=1}^{3} y_{jkt}^{0} = q_j^0    for j = 1, ..., J

4. Wentz to Bullitt transfer:

Σ_{j'∈F_j^3} Σ_{k∈F_{j'}^2} Σ_{(t_2,τ)∈t_L(t)} Y_{ijj'ktt_2}^{τ} ≤ Σ_{k∈F_j^2∩K_st} Σ_{τ=t}^{min(t+2,T)} y_{ijt}^{kτ} ≤ 1.1 Σ_{j'∈F_j^3} Σ_{k∈F_{j'}^2} Σ_{(t_2,τ)∈t_L(t)} Y_{ijj'ktt_2}^{τ}
    for j ∈ J_2W, i ∈ R_j^1, and t = 1, ..., T

5. Demand constraints for customers:

Σ_{j∈R_k^2} Σ_{i∈R_j^1} Σ_{t=τ-2}^{τ} α_ij y_{ijt}^{kτ} + Σ_{j∈R_k^2} y_{jkτ}^{0} + Σ_{j'∈R_k^2} Σ_{j∈R_{j'}^3} Σ_{i∈R_j^1} Σ_{t_1=τ-2}^{τ-1} Σ_{t_2=t_1+1}^{min(t_1+2,T)} α_ij Y_{ijj'kt_1t_2}^{τ} = d_{kτ}
    for k = 1, ..., K and τ = 1, ..., T
6. Customer product quality constraints:

ASH_{kτ} = (1/d_{kτ}) [ Σ_{j∈R_k^2} Σ_{i∈R_j^1} Σ_{t=τ-2}^{τ} β_ij a_{it} y_{ijt}^{kτ} + Σ_{j∈R_k^2} a_j^0 y_{jkτ}^{0} + Σ_{j'∈R_k^2} Σ_{j∈R_{j'}^3} Σ_{i∈R_j^1} Σ_{t_1=τ-2}^{τ-1} Σ_{(t_2,τ)∈t_L(t_1)} β_ij a_{it_1} Y_{ijj'kt_1t_2}^{τ} ]
    for k = 1, ..., K, τ = 1, ..., T

SUL_{kτ} = (1/d_{kτ}) [ Σ_{j∈R_k^2} Σ_{i∈R_j^1} Σ_{t=τ-2}^{τ} γ_ij s_{it} y_{ijt}^{kτ} + Σ_{j∈R_k^2} s_j^0 y_{jkτ}^{0} + Σ_{j'∈R_k^2} Σ_{j∈R_{j'}^3} Σ_{i∈R_j^1} Σ_{t_1=τ-2}^{τ-1} Σ_{(t_2,τ)∈t_L(t_1)} γ_ij s_{it_1} Y_{ijj'kt_1t_2}^{τ} ]
    for k = 1, ..., K, τ = 1, ..., T

ASH_{kt} = (ASH_{kt})_1 + (ASH_{kt})_2
0 ≤ (ASH_{kt})_1 ≤ ASH_{kt}^u
0 ≤ (ASH_{kt})_2 < ∞
0 ≤ SUL_{kt} ≤ SUL_{kt}^u
    for k = 1, ..., K and t = 1, ..., T
7. Nonnegativity constraints:

y_{ijt}^{kτ} ≥ 0,  y_{jkt}^{0} ≥ 0,    for i = 1, ..., m, j = 1, ..., J, k = 1, ..., K, t = 1, ..., T, τ = t, ..., min(T, t+2)

Y_{ijj'kt_1t_2}^{τ} ≥ 0,    for i = 1, ..., m, j ∈ J_2W, j' ∈ J_2B, k = 1, ..., K, t_1 = 1, ..., T, (t_2, τ) ∈ t_L(t_1)
References

[1] R. Aboudi, A. Hallefjord, C. Helgesen, R. Helming, K. Jornsten, A.S. Pettersen, T. Raum, and P. Spence. A Mathematical Programming Model for the Development of Petroleum Fields and Transport Systems. European Journal of Operational Research, 43:13-25, 1989.

[2] E. Balas and C.H. Martin. Pivot and Complement - A Heuristic for 0-1 Programming. Management Science, 26(1):86-96, 1980.

[3] A. Brooke, D. Kendrick, and A. Meeraus. GAMS: A User's Guide. The International Bank for Reconstruction and Development, The World Bank, 1988.

[4] W. Candler. Coal Blending - With Acceptance Sampling. Computers & Operations Research, 18(7):591-596, 1991.

[5] G.B. Faulkner. Linear Programming Applied to a Mining Smelting Operation. Canadian Mining & Metallurgical Bulletin, 60(677):1297-1300, 1967.

[6] M. Gershon. Mine Scheduling Optimization with Mixed Integer Programming. Mining Engineering, 35(4):351-354, 1983.

[7] T.B. Johnson. Optimum Open-Pit Mine Production Scheduling. In A Decade of Digital Computation in the Mineral Industry, SME-AIME, New York, USA, 1969.

[8] A.M. Khan. Solid-Waste Disposal with Intermediate Transfer Stations: An Application of the Fixed-Charge Location Problem. Journal of the Operational Research Society, 38(1):31-37, 1987.

[9] C.G. Knight and C.B. Manula. The Pennsylvania Coal Model. In Proceedings of the 14th APCOM Symposium, SME-AIME, 655-665, New York, 1976.

[10] B.A. Lietaer. A Planning Model for Underground Mines - An Application in a Developing Country. OMEGA, The International Journal of Management Science, 5(2):149-159, 1977.

[11] G.L. Nemhauser and L.A. Wolsey. Integer and Combinatorial Optimization. John Wiley and Sons Inc., New York, NY, 1988.

[12] J.P. Osleeb and S.J. Ratick. A Mixed Integer and Multiple Objective Programming Model to Analyze Coal Handling in New England. European Journal of Operational Research, 12:302-313, 1983.

[13] J.P. Osleeb, S.J. Ratick, P. Buckley, K. Lee, and M. Kuby. Evaluating Dredging and Offshore Loading Locations for U.S. Coal Exports Using the Local Logistics System. Annals of Operations Research, 6:163-180, 1986.

[14] H.D. Sherali and R. Puri. Model Development, Testing and Computer Implementation for a Coal Blending and Distribution Problem. OMEGA, The International Journal of Management Science. (To appear).

[15] H. Steinmann and R. Schwinn. Computational Experience with a Zero-One Programming Problem. Operations Research, 17:917-920, 1969.

[16] R.C. Tomlinson. The Practice of O.R. in Coal Mining. European Journal of Operational Research, 1:9-21, 1977.

[17] K.B. Williams and K.B. Haley. A Practical Application of Linear Programming in the Mining Industry. Operational Research Quarterly, 10(3):131-138, 1989.

[18] W. Young, J.G. Ferguson, and B. Corbishley. Some Aspects of Planning in Coal Mining. Operational Research Quarterly, 14(1):31-45, 1963.
Network Optimization Problems, pp. 263-281, Eds. D.-Z. Du and P.M. Pardalos, ©1993 World Scientific Publishing Co.
Multi-Objective Routing in Stochastic Evacuation Networks

J. MacGregor Smith
Industrial Engineering and Operations Research Department, University of Massachusetts, Amherst, MA 01002, USA
Abstract
A fundamental problem of routing in stochastic queueing networks is the identification of paths which extremize a collection of objective functions. In this paper, an integer set partitioning model with two conflicting objective functions is presented and examined. Also, the properties of the Noninferior set of solutions and the mathematical development of the algorithm for iteratively generating the paths are described. The algorithm is based on a multi-objective k-shortest path algorithm for generating the Noninferior set of paths for which tradeoffs between evacuation time and distance travelled within the network are evaluated. An example of the methodology is also presented.
1  Problem Overview
Fundamentally, the problem of routing customers, occupants, packets, and other transactions in queueing networks is a complex, stochastic, integer, and nonlinear programming problem. These problems are highly transient as well as being multi-objective in nature, and there are numerous performance measures inherent within the problem which complicate the different routing strategies. Mathematically, we have a finite queueing network G(V, E), with a finite set of nodes V and edges (arcs) E, over which multiple classes of customers (occupants) flow from source(s) to sink(s), while a vector of objective functions Ω = {f_1(x), f_2(x), ..., f_p(x)} is simultaneously
extremized subject to a set of constraints on the occupants flowing through the network. In this paper, some of the mathematical properties, the methodology, and corresponding algorithms for solving this multi-objective routing problem in queueing networks are presented. One main concern in this paper is to demonstrate how one can model the multi-objective nature of the problem and calculate effective alternative routing strategies that allow the system planner of the queueing network to trade off performance between one objective and another. Much of our past research in this area has considered stochastic evacuation networks [14, 15, 1], and also static and real-time routing in production and manufacturing settings [5, 6, 7, 9]. In these latter studies, maximum throughput, sojourn time, and the average number of customers in the system have been objectives of interest. In this paper, the focus is on stochastic evacuation networks in which we consider two primary objectives: f_1(x) := Total Evacuation Time and f_2(x) := Total Distance Travelled. The methodology and properties found can be generalized for other networks with similar objectives, such as path reliability; see Figure 1.
Figure 1: Morphological Diagram of Multi-Objective Approaches (the objectives include overall safety; minimizing travel time, evacuation time, routing complexity, maximum and average queue lengths, total evacuation time, and total distance travelled; path complexity and maximum path lengths; maximizing path reliability; and related flow-capacity measures)

One might think that time and distance are highly correlated, and for those situations where no congestion in the network evacuation paths exists, this is substantiated. However, when there is congestion in the network and all occupants seek the shortest
path, then there is a tradeoff, where routing some of the occupants on longer paths will reduce the overall evacuation time of all occupants. The approach described here is flexible and effective, and can be utilized both in transient situations, where discrete-event simulation models such as Q-GERT [12] and their variants are used, and with steady-state analytical queueing network models such as the Mean Value Analysis code QNET-C [15], to evaluate and generate the set of efficient routing paths in an evacuation network.
2  Assumptions and Definitions
By definition, a queueing network (graph) G(V, E) is comprised of a finite set V of nodes (vertices) of size N, where V = {v_1, v_2, ..., v_N}, together with a finite set E of arcs e_k = (v_i, v_j) for the (i,j) nodal pairs. V can further be partitioned into three sets: V_1, which represents the occupant source nodes during the evacuation; V_2, which represents the intermediate nodes during the evacuation; and V_3, which represents the sink or destination nodes of the occupants. The set of arcs represents the different streets, passageways, or routes from V_1 to V_3. Associated with each node v_i ∈ V and each arc (v_i, v_j) ∈ E are variables and parameters which represent node and arc processing times, node and arc capacities, arrival times to the network, distances, and occupant population sizes at the source nodes. Figure 2 illustrates a small evacuation network with many of the parameters and variables of significance to the evacuation planning problem that can be embedded in the network model. Some of the notation most commonly associated with this type of network model is discussed below.

A := There are A chains of occupants, labelled a = 1, 2, ..., A. Each occupant chain represents a sequence (vector) of nodes and arcs which the occupant chain population will travel during the evacuation.

D := A distance matrix whose elements d_ij represent the Euclidean or rectilinear distance between nodal pairs (i,j) ∈ E.

E := The network has a finite set E of arcs (nodal pairs).

f_j(x) := An objective function evaluating the set of routing alternatives denoted by (x).

G(V,E) := the queueing network (graph).

λ := the arrival rate vector of the occupant classes into the routing alternatives of the network; λ = (λ_1, λ_2, ...) for all occupant sources.

μ := the service rate vector for the nodes and arcs comprising the evacuation network; μ_i represents the service rate of a node, while μ_ℓ represents the service
rate of an arc (travel time) between two nodes in G. Each queue is assumed to have infinite waiting room.

Figure 2: Example Evacuation Network

NI := a Noninferior evacuation path for an occupant class in an evacuation network.
V := The network model has a finite set V of nodes, and V is further partitioned into three sets V_1, V_2, V_3, which represent the source(s), intermediate, and sink(s) nodes.

Ω := The set of objectives in our routing problem.

Since we are dealing with a multi-objective problem, we need to define the notion of a Noninferior (NI) evacuation path [3].

Definition: x* is said to be a Noninferior evacuation path for our evacuation problem defined in §1 if there exists no other feasible evacuation path x such that f(x) ≤ f(x*), meaning that f_j(x) ≤ f_j(x*) for all j = 1, 2, ..., p, with strict inequality for at least one j.
In other words, if we have a candidate path which we suspect is NI, there should be no other feasible path which is better in both of the performance measures: time and distance. The set of all paths which are NI is called the NI set.
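As an illustration of this definition, the following sketch filters a list of candidate paths down to the NI set by pairwise dominance checks on their (time, distance) criterion vectors. It is a generic helper, not part of the paper's algorithm; the function name, the data layout, and the numbers in the example are assumptions.

```python
def noninferior_set(candidates):
    """Return the NI (Pareto) subset of candidate paths.

    Each candidate is (path, (time, distance)); a path is dominated if some
    other path is no worse in both criteria and strictly better in at least one.
    """
    def dominates(f, g):
        return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))

    return [(p, f) for p, f in candidates
            if not any(dominates(g, f) for _, g in candidates)]

# Illustrative values only: path C is dominated by B, so A and B form the NI set.
print(noninferior_set([("A", (10, 100)), ("B", (12, 90)), ("C", (15, 95))]))
```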
3  Mathematical Model
There are few mathematical models in the literature for generating and evaluating evacuation paths for an occupant population [10, 2, 11]. The model presented below is a variation of one model appearing in [11], which was one of the first to account for the critical features of the stochastic evacuation problem. Another class of models that one might utilize to formulate the problem is the class of multi-commodity flow models. Unfortunately, these models will not control the splitting of the occupant population along the different evacuation paths, which is problematic, since splitting the different source populations will engender confusion and a potential sense of panic among the evacuating occupants. The integer programming model presented below has the desired property of controlling the splitting of the flows. The multi-objective model of our routing problem is:

Minimize {f_1(x), f_2(x)}

where:

(Evacuation Time):    f_1(x) = Σ_i Σ_j Σ_k q_ijk P_ijk x_ijk        (1)

(Distance Travelled): f_2(x) = Σ_i Σ_j Σ_k d_ijk P_ijk x_ijk        (2)

subject to:

(V_2 Arcs):           Σ_i Σ_j Σ_k a_ℓijk P_ijk x_ijk ≤ ρ_ℓ   ∀ℓ     (3)

(V_3 Sinks):          Σ_i Σ_j Σ_k P_ijk x_ijk ≤ C_q   ∀q            (4)

(Occupant Classes):   Σ_k x_ijk = 1   ∀ij                           (5)

(Routes):             x_ijk = 0, 1   ∀ijk                           (6)

and where:

x_ijk := 1 if the i-th occupant class from the j-th source is assigned the k-th NI route alternative.

a_ℓijk := a data coefficient which equals 1 if the ℓ-th arc is included in the ijk-th route assignment, and equals 0 otherwise.

ρ_ℓ := maximum allowable traffic along arc ℓ.

C_q := capacity of sink (destination) node q.

P_ijk := occupant population of source ij on the k-th NI route alternative.

q_ijk := expected evacuation (sojourn) time of the ijk-th occupant class. These values must be calculated from the particular stochastic model used in the evacuation study; see the discussion below.

d_ijk := average distance travelled for the ijk-th occupant class.

Because of the complexity of solving this model directly, an alternative approach is proposed and demonstrated in the next two sections of the paper: one which systematically generates feasible routing alternatives to a relaxed version of the mathematical model, while at the same time measuring the critical objectives of evacuation time and distance travelled.
4  Congestion Properties
Some crucial issues guide us in the routing/re-routing process:

• How should the routing/re-routing process be initialized?
• How should alternative NI paths be selected?
• How should the re-routing process be terminated?

In general, a problem faced in multi-objective programming is the often exponential number of NI solutions. We would like to limit the exploration of the NI solutions to a manageable quantity by decomposing and relaxing the original problem presented in §3, so that we systematically treat one objective at a time. Before we present the relaxed mathematical model along with its mathematical properties, an important definition is needed, which concerns the eventual gain from re-routing occupants along longer NI paths in the evacuation network.
Expected Savings: Given a current occupant population's evacuation route, the potential gain in re-routing the occupant population along some other NI route is given as

E_ij = q_ijk - [(d*_ij / ω) + q*_ij]

where:

E_ijk := the net increase or decrease in the average egress time per person caused by re-routing occupants to the (k+1)st NI route.

q_ijk := the sum of the average queue times per person on the original route.

d*_ij := the increased distance travelled on the (k+1)st NI route (e.g., if the k-th NI route is 100 feet and the (k+1)st NI route is 120 feet, d*_ij is equal to 20 feet, i.e., 120 minus 100).

ω := the average travel speed for d*_ij.

q*_ij := the sum of the expected queue times per person on the (k+1)st NI route.
fi(x)\
/2(a)}
where: (EvacuationTime)
: fy(x)
=
^Z^ZS^'i*^*^* i
(DistanceTravelled)
: f2(x)
=
3
YL^Z^2,(^iik^iJkXiJk j
•
(7)
k
(8)
k
subject to: {Occupant
Classes) : y , Xq^
=
1
=
0,1
Vij
(9)
k
(Routes)
: xijk
Vijk
(10)
T h e relaxed model removes constraint equations (1&2) and concentrates on evacuating t h e occupant populations along selected routes where time is minimized. T h e relaxation of constraint equations (1&2) is justified in certain evacuation situations where t h e capacity restrictions are not severe or critical. To find an initial NI solution to our relaxed model, we ignore f\(x) and solve our network for the first NI p a t h for each occupant population using a multi-objective shortest p a t h algorithm such as t h e one by Climaco and Martins [4]. Thus: T h e o r e m 1: The collection of 1st shortest paths for each occupant population resents a NI solution for the entire evacuation network.
rep-
270
J. M. Smith
P r o o f : This is a s t a n d a r d result in multi-objective optimization [3], viz. selecting one objective, fi(x), solving a weighting problem with u>,- = 1 and Wj = 0V? ^ i. T h u s , simultaneously minimizing t h e distance travelled for each occupant population results in t h e m i n i m u m distance travelled for all occupant classes for fi(x). | At t h e same t i m e we generate t h e first shortest paths, we compute t h e evacuation (sojourn time) for t h e occupant population. As we shall see, we approach the solution of our original m a t h e m a t i c a l model in §C oscillal ing between the set of NI paths and t h e tradeoffs gained in reducing t h e evacuation time by sending occupant populations along longer p a t h s . T h e 1st shortest p a t h solution for each occupant population may result in a unique optimal solution across all populations for both objectives fi(x), f2(x), if queueing delays along t h e routes are not significant. However, we need to ensure this by quantitatively using our notion of expected savings to verify this. T h e o r e m 2: / / the NI solutions at the first stage of the routing process where each occupant source is routed along the 1st shortest path an expected savings calculation: E
ij = lijk ~ [{dijM
+ Qij] > 0
for some
then, the 1st shortest paths are not a unique optimal solution will improve the evacuation time fi(x).
ijk and further
rerouting
P r o o f : T h e proof of this property rests on constructing a Linear/Integer programming problem and its dual which indicates how one can move from one NI solution to another on t h e efficient frontier, if necessary. T h e Linear/Integer programming problem represents another relaxation of t h e original model in §3 where we are only focusing on t h e expected savings possible by re-routing occupants along longer NI p a t h s . This Linear/Integer programming problem can be considered as a way of scalarizing t h e two objectives into a single objective problem in t h e expected savings of t h e alternative NI routes. Let's establish t h e following expected savings problem: Maximize
Z = ^ ^ ^ i
(Occupant Classes) 2 ^ 1 , ^
EijXijk
i
k
<
1
Vij
(11)
k
x
ijk
5: 0,1
Vijk routes
(12)
where
E_ij = q_ijk - [(d*_ij / ω) + q*_ij]    ∀ijk
Multi-Objective Linear/Integer unimodularity Programming If we take alternative NI
Routing
in Stochastic Evacuation Networks
271
program follows from t h e 0,1 properties of t h e sparse m a t r i x [8]. T h e property allows us to solve for integer solutions using ordinary Linear algorithms. t h e dual of t h e above problem, we gain some insight into whether paths may result in some savings in / i ( x ) . T h e dual is: Minimize
^
J ^ 71^ »
3
*a > fe-[(4H + 4 l V i ^ *<j
>
OVy
( 13 ) (14)
Finally, from t h e Dual, we obtain the following Complimentary Slackness conditions: Uij ~ [qijk ~ [(dij/u)
+ q*j]}x,jk = 0
\fijk
At our current NI solution, x{jk = 1 for some NI alternative for each ij occupant population. By t h e above complimentary slackness condition, if Xijk = 1 then =>
{^-hik-[(dyco)+q^}}
= 0.
Therefore, for t h e other JV7 route alternatives of each occupant population source, t h e dual feasibility condition remains unsatisfied, i.e.
if and only if t h e Expected Savings for a route alternative is p o s i t i v e , viz. [
At each stage of our re-routing which: E* =
max
process, we should
move
{EU,E22,...,EIJ}
V13 sources
Essentially, t h e rule is a greedy one and has been shown to be effective in practice. It will generate a subset of t h e NI solution space and will not guarantee complete enumeration of t h e NI set. To t e r m i n a t e the re-routing process, we have:
J. M. Smith
272 C o r o l l a r y 1: If E* < OV i j occupant sources then no more re-routing result in an improvement in f\{x).
iterations
will
P r o o f : Again, this result is derived from t h e Linear/Integer programming problem using t h e Dual Feasibility and Complimentary Slackness conditions. If t h e only NI route which provides t h e m a x i m u m savings is a current route, for each occupant population source, t h e n no additional Expected Savings Eij > OV ij are possible and t h e process is t e r m i n a t e d .
5
Algorithm
T h e problem we face in our evacuation planning problem is t h a t we do not know a priori which p a t h s are NI without assessing the congestion in G(V,E). We must iteratively generate candidate paths, assess t h e congestion in G(V,E), and then iterate again until t h e desired tradeoffs between distance travelled and evacuation t i m e is acceptable t o t h e planner. This iterative process leads to the algorithm described below. For product form networks where the estimate of t i m e delays in the Expected Savings calculation for re-routing among the alternative NI paths can be computed exactly, t h e n t h e algorithm will guarantee finding a NI p a t h for re-routing t h e occupant classes. For non-product form networks, which are typically t h e case, we can only a p p r o x i m a t e these t i m e delays, therefore, the algorithm can only guarantee an a p p r o x i m a t e NI solution. Considering the complexity of the underlying stochasticinteger p r o g r a m m i n g problem, this is a reasonable and practical strategy. Before presenting t h e algorithm formally, let us discuss our overall approach to t h e evacuation planning problem [15]. Our approach for modelling evacuation planning problems has t h r e e separate b u t interrelated steps: S t e p 1.0 concerns t h e Representation of the region or facility as a queueing network, while S t e p 2.0 concerns t h e Analysis of the queueing network to estimate t h e critical performance measures of the evacuation. A queueing network model is utilized in order to capture t h e potential congestion in the network where large populations come together during t h e evacuation process. Finally, S t e p 3 . 0 concerns t h e synthesis or multi-objective generation of t h e routing paths for t h e occupants, where t r a d e offs among t h e different performance measures estimated in S t e p 2.0 of t h e algorithmic process may be necessary. T h e algorithm to facilitate t h e design methodology can be incorporated into any simulation e . g . Q - G E R T or analytical model e . g . Q N E T - C to estimate / i , / 2 , and carry out t h e evacuation planning/routing analysis. To summarize and focus t h e efforts in this paper, an algorithmic description of S t e p s 1.0, 2 . 0 , & 3.0 and it substeps are presented. S t e p 1.0: R e p r e s e n t a t i o n Represent the underlying facility or region as a network G(V, E) where V :— is a finite set of nodes and E := is a finite set of arcs
Multi-Objective
Routing in Stochastic Evacuation Networks
273
or nodal pairs. Step 2.0: Analysis Analyze G(V, E) as a queueing network either with a transient or steady-state model and compute the total evacuation time of the occupant population along with total distance travelled to evacuate given a set of evacuation paths. Step 3.0 Synthesis Algorithm Step 3.1: Analyze the queueing output from the evacuation model and compute the set of NI evacuation paths which simultaneously minimize time and distance travelled in G(V, E) for each occupant population. 3.1.1 If the set on NI paths are uniquely optimal then
go to Step 3.2 otherwise: 3.1.2 Significant queueing (congestion) exists on one or more routes then go to Step 3.3. Step 3.2: STOP! The NI shortest time/distance routes are optimal and identical and total evacuation time, distance and congestion are minimized. Step 3.3: Determine the total number of occupants who pass through the queueing area(s) and trace them back to their origins. Step 3.4: Select the total number of occupants to be re-routed from each source node. The total number of occupants re-routed is correlated to both the size of the queues and the number of occupants on each route. In selecting the population, the analyst should strive to achieve uniformity of occupants and queues on each egress route. Step 3.5: Re-route the population to the kth route of the NI set of paths where k is selected by employing the following formula:
E*_ij = q_ijk - [(d*_ij / ω) + q*_ij]    ∀ijk
. m a x V\j sources
{EU,E22,---,EIJ}
for all possible savings, and then re-run the computer evacuation planning model with the new set of routes, by returning to Step 2.0 of the General Algorithm. If all E[s are negative, stop! The current set of NI shortest routes used on the previous iteration are selected.
J. M. Smith
274
T h e overall t i m e and space complexity of the algorithm is exponential largely because of t h e t i m e complexity of S t e p 2.0. and the fact that t h e number of NI solution p a t h s for each occupant class may be exponential in the size of t h e network. W h e t h e r one has a product form network or not, t h e time complexity of S t e p 2.0 will be a key bottleneck in t h e efficiency of t h e algorithmic re-routing process. Even if an heuristic is utilized in S t e p 2.0, t h e time and space complexity of the algorithm would then be governed by the exponential number of alternative NI paths for each occupant class.
6
Example
In t h e following section of t h e paper, an example of t h e evacuation of three distinct occupant populations is utilized to demonstrate t h e scope and effectiveness of t h e previous design methodology to route and re-route t h e occupants based on t h e m a t h ematical formulation and subsequent algorithm. T h e example is taken from [11]. Q - G E R T is utilized here to estimate the evacuation times and congestion on the NI set of evacuation routes since it is a transient model and represents t h e most general t y p e of stochastic estimation tool in which t h e multi-objective routing methodology might be used. T h e arrivals to t h e network are from a log-normal distribution and log-normal distributions were used for the arc and nodal service time distributions. More discussion of t h e parameters are included in [11]. At the end of the example discussion, t h e use of Q N E T - C , a steady-state analytical queueing network model, is described to estimate t h e evacuation times. T h e example has three occupant groups located at source nodes # 1 , # 2 , and $ 3 with occupant populations of 40,10 and 30 persons respectively. Figure 1 illustrates t h e sample G(V,E). As you can see in Table # 1 , t h e paths marked with a single * represent t h e set of NI solutions for each of t h e occupant populations. The populations are denoted by t h e large P in each of t h e following tables and the V-j denotation represents t h e sink nodes of G(V, E). T h e other paths were generated by t h e algorithm, but since we are viewing t h e evacuation G(V, E) without congestion, the shortest distance paths also correspond to t h e shortest time paths. T h e above paths are chosen and t h e algorithm returns to S t e p 2 . 0 of t h e General Algorithm to assess the congestion in G(V,E). If no significant congestion exists on these routes, the algorithm terminates. Unfortunately, in our pass through S t e p 2.0 significant congestion at nodes # 5 , # 7 , and # 8 were encountered in the amount of 50.97,2.217 and 2.86 t i m e u n i t s ( t . u . ' s . ) and a total evacuation t i m e was 158.50 t.u.'s. In iterating again through S t e p 3 . 0 with t h e above queueing delay estimates along t h e paths, a different set of NI paths were generated due t o congestion, see Table # 2 . Inspecting Table # 2 , t h e p a t h s denoted by t h e single * are chosen from t h e set of NI paths, while those denoted by four * * * * are dominated or else were considered on the previous iteration. We are thus beginning t o t r a d e off distance travelled in order to seek reductions in evacuation
Multi-Objective
Routing
in Stochastic
Evacuation
Networks
275
time for all occupants. T h e results of the next iteration are displayed in Table # 3 . For this iteration # 2 , t h e total evacuation t i m e slightly decreased from 158.50 —> 152.09 t.u.'s. While t h e delay decreased significantly at node # 5 from 50.97 —> 25.48, t h e delay at node # 8 increased from 2.86 —» 18.70 which thus resulted in the marginal decrease in total evacuation time. W i t h these new queueing delays, t h e NI set was again generated and t h e new NI paths are depicted in Table # 3 . T h e new set of p a t h s were chosen a n d once again t h e algorithm cycled through S t e p 2 . 0 , b u t this iteration resulted in a d r a m a t i c decrease in evacuation time from 152.09 —» 114.23 t.u.'s. It is interesting to note t h a t t h e program Q N E T - C was run with t h e same set of p a t h s and t h e results where similar to t h e G E R T runs in t h a t t h e evacuation times where 165.78, 159.87, and 111.02 for t h e same iterations. T h e discrepancy in evacuation times is due to t h e fact t h a t Q N E T - C utilizes exponential service t i m e distributions for t h e service times along t h e evacuation paths so there is a tendency t o be pessimistic, while QGER.T makes no special distribution assumptions.
J. M. Smith
276
6.1
Iteration # 0
POP = 401
Y
l<
1
POP = 10
yv3
POP = 30( 3 J
v2 Figure 3: Initial Evacuation Graph
G(V,E)
P=l
Time
Distance
Route
V3 = 8 V3 = 9
35 35
120 180
1 -> 5 ^ 7 ^ 8 1-^4-^6^9
P=2 V3 = 8 V3 = 9
Time 24 35
Distance 80 160
Route 2 ^5-+7-+8 2^4-*6^9
P=3 V3 = 8 V3 = 9
Time 27 50
Distance 90 180
Route 3^5-+7^8 3 -> 4 ^ 6 - * 9
Table 1: NI set of Evacuation Paths
Multi-Objective
6.2
Routing
in Stochastic
Evacuation
Networks
277
Iteration # 1 : Evacuation Time = 158.50 t.u.
V2 Figure 4: E v a c u a t i o n G r a p h G(V, E) w / 1st p a t h s
P=l V3 = 8 V3 = S V3 = 9 P=2 V3 = S V3 = 8 ^3 = 9 P=3 V3 = 8 V3 = 8 y3 = 9
Time 91 45 50 Time 80 40 45 Time 83 45 50
Distance 120 160 180 Distance 80 140 120 Distance 90 160 180
Route l->-5->- 7 ^ 8 1 -> 4 - * 6 -» 8 1 ^ 4 ^ 6 — 9
Route 2 - * 5 -• 7 — 8 2 ^ 4 ^ 6 ^ 8 2^ 4^6-^ 9 Route 3 -+ 5 ^ 7 -+ 8 3 ^ 4 ^ 6 -»8 3-^4^6-^9
Table 2: NI set of Evacuation P a t h s
Savings 0 40.61*
**** 0* 37.65
**** 0* 35.05 ****
J. M. Smith
278
6.3
Iteration # 2 Evacuation Time = 152.09 t.u.'s
VW
>V
Figure 5: E v a c u a t i o n G r a p h G(V,E)w/
2nd s e t of p a t h s
Time 107 88 76 Time
Distance
Route
Savings
V3 = 8 V3 = 8 V3 = 9 P=2
120 150 180 Distance
l-+5^7-*8 1^4-+6^7-*8 1 ^ 4 ^ 6 ^ 9 Route
**** 0.38 13.50* Savings
V3 = 8 V3 = 8 V3 = 9 P=3
96 61 49 Time
80 130 160 Distance
2-»5^7^8 2 ^ 4 ^ 6 ^ 7 ^ 8 2->4 -s-6-* 9 Route
0* -12.74 0.39 Savings
v3 = s
99 66 54
90 150 180
3->5->-7-+8 3-»4 — 6 ^ 7-+8 3-^4-^6^9
0* -15.33 -2.21
P=l
V'3 = 8 V'3 = 9
Table 3: NJ set of Evacuation P a t h s
3
Multi-Objective
6.4
Routing
in Stochastic
Evacuation
Networks
279
Iteration # 3 : Evacuation Time 114.23 t.u.'s
v.
J
v2 Figure 6: E v a c u a t i o n G r a p h G(V, E) 3rd s e t of p a t h s
P=l
Time
Distance
Route
Savings
V3 = 8 V3 = 8 V3 = 9 P=2
107 88 78 Time
120 150 180 Distance
1 ^ 5 ^ 7 _ 8 1^4-^6-^7-^8 1-^4-^6^9 Route
**** **** 0* Savings
V3 = 8 Vs = 8 V3 = 9 P=3
96 61 51 Time
80 130 160 Distance
2^5-+7-*8 2^4-+6-* 7^8 2 - ^ 4 -> 6 ^ 9 Route
0* -12.72 -15.38 Savings
3 ^ 5 -+ 7 -^ 8 3 ^ 4 ^ 6 —7^8 3 — 4 -> 6 -+ 9
0* -15.33 -19.74
V3 = 8 V3 = 8 V3 = 9
99 66 56
90 150 180
Table 4: NI set of Evacuation P a t h s Table # 4 illustrates t h e final set of NI paths generated by t h e algorithm. In cycling through S t e p 3 . 0 , no additional savings are possible by re-routing occupants on
J. M.
280
Smith
longer p a t h s , therefore, the last set of paths yield the most favorable decrease in evacuation time for the occupant population.
7
Summary and Conclusions
In this paper, we have focused on the generation of the Noninferior NI set of paths for evacuating occupants in an emergency situation. T h e complex, multi-objective n a t u r e of the problem has been described and mathematical properties and a corresponding algorithm have been developed for generating the set of NI evacuation paths which allow for tradeoffs between evacuation time and distance travelled for the occupant populations. A small example problem also illustrates the iterative process of the algorithm which can be implemented either in a simulation environment e.g. QG E R T , or in an analytical model environment e.g. Q N E T - C .
Acknowledgement This material is based upon work supported by the National Science Foundation under grants #MSM-X1 l-il-ij and #MSS-9U6666.
References [1] J. Ahlberg, Stochastic Queueing Network Program for Evacuation Planning, Master's Project. Department of Industrial Engineering and Operations Research, University of Massachusetts, Amherst, MA 01003 (1988). [2] L.G. C h a h n e t , 11.L. Francis, and P.B. Saunders. Network Models for Building Evacuation. Management Science 2 8 , (1) (1982) 86-105. [3] V. Chankong and Y.Y. Haimes, Multiobjective Methodology (North-Holland, 1983).
Decision
Making:
Theory
and
[4] J.C.N. Climaco and E.Q.V. Martins, A Bi-Criterion Shortest P a t h Algorithm, European Journal of Operations Research,11(1982) 399-404. [5] S. Daskalaki and J. MacGregor Smith. T h e Static Routing Problem in Open Finite Queueing Networks. Presented « ORSA/TIMS Meeting. Miami, Florida, (October 1986.) [6] S. Daskalaki and J. MacGregor Smith, Optimal Routing and Buffer Space Allocation in Series-Parallel Queueing Networks. Invited presentation to t h e EURO/TIMS XXVIII Conference in Paris France, (July 6-8, 1988).
Multi-Objective
Routing in Stochastic Evacuation Networks
281
[7] S. Daskalaki and J. MacGregor Smith, Real Time Routing in Finite Queueing Networks, in Queueing Networks with Blocking, eds. H.G. Perros and T. Altiok. (New York: Elsevier Science Publishers B.V., 1989) 313-324. [8] R Garfmkel and G.L. Nemhauser, Integer Programming. (Wiley, 1972). [9] Hemant Gosavi and J. MacGregor Smith, Heavy Traffic Multi-Commodity Routing in Open Finite Queueing Networks, Paper Presented at the ORSA/TIMS Meeting, Denver, CO. (October 1988.) [10] R. L. Francis and L.G. Chalmet. Network Models for Building Evacuation: A Prototype Primer. Unpublished Paper, Department of Industrial and Systems Engineering, University of Florida, Gainesville, Florida (1980). [11] C.J. Karbowicz and J. MacGregor Smith, A K-Shortest path Routing Heuristic for Stochastic Evacuation Networks. Engineering Optimization 7 (1984) 253-280. [12] A.Pritsker, Modelling and Analysis using Q-GERT Networks. (Wiley, New York, 1979). [13] Harvey M. Salkin, Integer Programming ( Addison-Wesley Publishing Co., Reading Massachusetts, 1975). [14] J. MacGregor Smith and D. Towsley. The Use of Queueing Networks in the Evaluation of Egress from Buildings. Environment and Planning B-8 (1981) 125-139. [15] J.MacGregor Smith, QNET-C: An Interactive Graphics Computer Program for Evacuation Planning Proceedings of the Conference on Emergency Planning, SCS Multiconference, 14-16 (January 1987), pp 19-24
Network Optimization Problems, pp. 283-300, Eds. D.-Z. Du and P.M. Pardalos, ©1993 World Scientific Publishing Co.
A Simplex Method for Network Programs with Convex Separable Piecewise Linear Costs and Its Application to Stochastic Transshipment Problems 1 J. Sun Department University, K.-H. Tsai Department
of Industrial Engineering and Management Evanston, IL 60208, USA
of Computer
L. Qi School of Mathematics, NSW2033, Australia
Sciences,
National
The University
Normal
of New South
Sciences,
University,
Wales,
Northwestern
Taipei,
Taiwan
Kensington,
Abstract
This paper is concerned with the pure network program whose objective function is convex, separable, and piecewise linear. We describe a direct simplex algorithm and its implementation for solving such problems. Computational results of applying this algorithm to stochastic transshipment problems are reported. The algorithm keeps the number of variables in the original level by allowing nonbasic variables to take breakpoint values and by using a straightforward pricing strategy similar to the traditional network simplex method. Tree data structure is used to construct efficient implementation. Computational results indicate that the solution time is insensitive to the increase of "piecenumber" of the objective function. As a result, we have been able to solve stochastic transshipment problems of more than 30,000 arcs and 3000 nodes, where each node has a discrete random demand of 100 possible values, within one minute on a SUN computer.
'The research is supported in part by NSF and Australian Research Counsel.
1 Introduction
This paper is aimed at the following optimization problem:

(NetPLP)
$$\begin{array}{ll} \text{minimize} & F(x) = \sum_{j=1}^{n} f_j(x_j) \\ \text{subject to} & Ax = b, \end{array}$$

where $A \in R^{m \times n}$ is the node-arc incidence matrix of a connected network, $x = (x_1, \ldots, x_n)^T$, $c = (c_1, \ldots, c_n)^T \in R^n$, and $b \in R^m$. Each $f_j$ is a convex piecewise linear function of the single variable $x_j$ and has the following form:
$$f_j(x_j) = \begin{cases} s_{j1} x_j + r_{j1}, & \text{if } c_{j0} \le x_j \le c_{j1}; \\ \quad \vdots & \\ s_{jk_j} x_j + r_{jk_j}, & \text{if } c_{j,k_j-1} \le x_j \le c_{jk_j}; \\ +\infty, & \text{otherwise}, \end{cases}$$
where for each $j$ we call the numbers $c_{j0}, c_{j1}, \ldots, c_{jk_j}$ the breakpoints of the variable $x_j$; they satisfy $-\infty < c_{j0} < c_{j1} < \ldots < c_{jk_j} < +\infty$.
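Such a function is determined entirely by its breakpoints and slopes. A minimal sketch of this representation (an illustrative helper class, not the authors' FORTRAN code) that also exposes the one-sided derivatives $f_j^-$ and $f_j^+$ used later in the paper:

```python
# A convex piecewise linear function given by breakpoints c_0 <= ... <= c_k and
# slopes s_1 <= ... <= s_k; f is +infinity outside [c_0, c_k].
import bisect
import math

class PiecewiseLinear:
    def __init__(self, breakpoints, slopes, value_at_first):
        assert len(breakpoints) == len(slopes) + 1
        assert all(a <= b for a, b in zip(slopes, slopes[1:]))   # convexity
        self.c, self.s = breakpoints, slopes
        self.v = [value_at_first]                                # values at the breakpoints
        for i, s in enumerate(slopes):
            self.v.append(self.v[-1] + s * (breakpoints[i + 1] - breakpoints[i]))

    def __call__(self, x):
        if x < self.c[0] or x > self.c[-1]:
            return math.inf
        i = min(max(bisect.bisect_right(self.c, x) - 1, 0), len(self.s) - 1)
        return self.v[i] + self.s[i] * (x - self.c[i])

    def right_derivative(self, x):          # f^+(x); +inf at and beyond the last breakpoint
        if x < self.c[0] or x >= self.c[-1]:
            return math.inf
        i = bisect.bisect_right(self.c, x) - 1
        return self.s[min(max(i, 0), len(self.s) - 1)]

    def left_derivative(self, x):           # f^-(x); -inf at and below the first breakpoint
        if x <= self.c[0] or x > self.c[-1]:
            return -math.inf
        i = bisect.bisect_left(self.c, x) - 1
        return self.s[min(max(i, 0), len(self.s) - 1)]
```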
Notice that the bound constraints $c_{j0} \le x_j \le c_{jk_j}$, for $j = 1, \ldots, n$, are imposed by the definition of $f_j$. If all $k_j$ equal one, this model reduces to the ordinary network linear program. The problem (NetPLP) arises in at least two areas of operations research. First, it is used to model practical network problems with linear penalties or variable linear costs. Later in this paper, we will show how stochastic transshipment problems with random demands can be formulated as a (NetPLP). Second, it may represent an approximation of a network separable convex program, especially when the exact formula of the cost function is unknown and only experimental data are available. A third possible application of (NetPLP) is to offer a warm start for nonlinear network algorithms. We admit that a nonlinear network program should in general be approached by nonlinear programming algorithms, but it may be just as important to find a good starting point for those algorithms to be truly efficient. A good algorithm for (NetPLP) could conveniently produce such a warm start by solving a piecewise linearization of the nonlinear problem. In addition, algorithms for (NetPLP) could also be useful in directly solving some nonlinear problems, as demonstrated by Charnes, Song and Ali [CSA86]. Theoretically speaking, (NetPLP) can be solved by reformulating it as a linear network problem in which $k_j$ new arcs are introduced to replace arc $j$. However, in practice we would rather seek algorithms that directly deal with the piecewise linear objective function, because reformulation would often increase the number of variables to a prohibitive level. Some authors have contributed specific pricing rules for applying the simplex method to certain cases of (NetPLP); for example, see the algorithm of
Ali, Cook and Kress [ACK86] for ordinal ranking problems and the algorithm of Wets [We83] for stochastic programs with simple recourse. The contribution of this paper, however, is to introduce a unified simplex approach for solving (NetPLP) and to elaborate its implementation based on the tree data structure (e.g. Chvatal [C83]). Rather than implicitly dealing with the reformulated linear problem, we develop a direct pricing rule that can be implemented efficiently using the tree data structure. Although there have been considerable developments in interior point methods for linear programming in recent years, the simplex method remains very competitive in solving network linear programs. It is our belief that the simplex method is still efficient in solving the piecewise linear problem. To support our contention, we first test 40 randomly generated (NetPLP) problems similar to those of Klingman et al. [KNS74]. Then we choose four of these problems and observe how the solution time depends on the piece number $k_j$. We find that the increase in solution time is insignificant compared to the increase of $k_1 + \cdots + k_n$. An important application of our algorithm is to solve stochastic transshipment problems (STP) with discrete random demand. This model includes the classical stochastic transportation problem as a special case. Due to the underlying network structure, the stochastic transshipment problem can be efficiently solved by the network piecewise linear programming approach. In our computational test the method has been able to solve a problem of 35,000 arcs and 3000 nodes, with each node having a random demand of 100 possible values, within one minute on a SUN computer. In the literature, some researchers have proposed direct methods to solve general convex piecewise linear programs (Rockafellar [R84], Fourer [F88], Premoli [P87]). However, to the best of the authors' knowledge, none of those methods has been implemented in the network setting. For an extensive literature list of applications of piecewise linear programs, see Fourer [F86]. This paper is organized as follows. In the next section we review the mathematical background. The method and its convergence property are given in Section 3. Section 4 is devoted to implementation techniques. Computational experiments on 40 randomly generated benchmark (NetPLP) problems are reported in Section 5. The STP is introduced and the corresponding test problems are solved in Section 6. Since only discrete distributions of demand in STP result in network piecewise linear programs, we discuss how to use the algorithm for STP with continuous distributions, as well as other extensions, in Section 7.
2 Background Materials
Nothing much is new in this section. For a more detailed introduction to network flow problems and monotropic optimization, we refer the reader to [R84]. Here we just state some important facts that are to be used in the sequel.
Let $A \in R^{m \times n}$ be the node-arc incidence matrix of a connected network and let $x = (x_1, \ldots, x_n)^T$ be a flow vector. Let $[B, N]$ be a basis-nonbasis partition of the arc index set $J = \{1, \ldots, n\}$. We then have the corresponding partition of the vector $x = (x_B, x_N)$. A variable $x_j$ in the vector $x_B$ is called a basic variable and one in the vector $x_N$ a nonbasic variable. The arcs in $B$ form a maximal spanning tree of the network. For simplicity of notation, we will use $B$ to represent both the index set and the tree. It is well-known (e.g. Rockafellar [R84]) that, for each partition $J = B \cup N$, there are a vector $\bar b$ and a matrix $K$ such that $x_B = K x_N + \bar b$ is an equivalent system to the system $Ax = b$. Moreover, the matrix $K$ has a combinatorial structure, namely, for each $j \in N$, the $j$-th column of $K$ is the incidence vector of the unique path in tree $B$ that starts from the head node of arc $j$ and terminates at the tail node of $j$. Thus the increase (decrease) of the nonbasic variable $x_j$ in the simplex method corresponds to an equal increase (decrease) of the fluxes along the circuit $C_j$ that consists of this path and arc $j$, with $j$ being positive in $C_j$. We call the incidence vector of the circuit $C_j$ a simplex direction. Let $d^j = (d_1^j, \ldots, d_n^j)$ be this vector. Then we have
$$d_l^j = \begin{cases} 1 & \text{if arc } l \text{ is positive in } C_j, \\ -1 & \text{if arc } l \text{ is negative in } C_j, \\ 0 & \text{if arc } l \text{ is not in } C_j. \end{cases}$$
We now give a new interpretation of the reduced cost in linear programming, which will be instructive later in describing our pricing rule.

Observation. In a network linear program (correspondingly, all $k_j = 1$ in (NetPLP)), if $x$ is a nondegenerate solution, or a degenerate solution with a non-zero ratio, the reduced cost associated with $x_j$ in the simplex method is the directional derivative of $F$ at $x$ along $d^j$, i.e.
$$F'(x, d^j) = \lim_{t \downarrow 0} \frac{F(x + t d^j) - F(x)}{t}.$$
In the case of a degenerate solution with zero ratio, $F'(x, d^j) = \infty$. As a matter of fact, from the theory of monotropic programming the above directional derivative can be computed by
$$F'(x, d^j) = \sum_{l} \max\{\, d_l^j f_l^-(x_l),\; d_l^j f_l^+(x_l) \,\}, \qquad (2.1)$$
where $f_l^-$ and $f_l^+$ are the ordinary left and right derivatives of $f_l$ (they could be $\pm\infty$). In the case that all $k_j = 1$, one has
$$f_l^+(x_l) = \begin{cases} s_l & \text{if } c_{l0} \le x_l < c_{l1}, \\ \infty & \text{otherwise}. \end{cases}$$
Therefore, if $x$ is a nondegenerate basic feasible solution in the sense of usual network linear programming, one has $c_{l0} < x_l < c_{l1}$ for all $l \in B$, which implies $f_l^+(x_l) = f_l^-(x_l) =$
$s_l$ for all $l \in B$. Therefore we get
$$F'(x, d^j) = \sum_{l \in C_j^+} s_l - \sum_{l \in C_j^-} s_l = s_j - (p_{jh} - p_{jt}), \qquad (2.2)$$
where $C_j^+$ and $C_j^-$ are the positive and negative arc sets in $C_j$, respectively, and $p_{jh}$ and $p_{jt}$ are the prices at the head and the tail of arc $j$. The price vector $p = (p_1, \ldots, p_m)$ assigns to each node a price such that $p_{lh} - p_{lt} = s_l$ for every $l \in B$. Formula (2.2) is exactly the formula used to compute the reduced cost in linear network programs. It is also easy to see that in the case of a degenerate solution with non-zero ratio, all basic arcs with $x_l = c_{l0}$ are in $C_j^+$, while those with $x_l = c_{l1}$ are in $C_j^-$, so formula (2.2) is still valid. Finally, in the case of a degenerate solution with zero ratio, at least one basic arc $l$ with $x_l = c_{l0}$ is in $C_j^-$, or at least one basic arc $l$ with $x_l = c_{l1}$ is in $C_j^+$; hence the $l$-th term in (2.1) is $+\infty$, resulting in $F'(x, d^j) = \infty$.

The simplex method looks for a simplex direction $d^j$, $j \in N$, such that $F'(x, d^j) < 0$. In the piecewise linear case, the simplex direction is defined as either $d^j$ or $-d^j$, and we simply call the derivative (2.1) the reduced cost and the right hand side of (2.2) the nominal cost. It is the reduced cost, not the nominal cost, that determines whether $d^j$ is a feasible descent direction. However, since the nominal cost can be obtained with relatively small effort, we use it as an indicator of possible descent simplex directions. In our implementation, we first find a candidate descent simplex direction by checking whether its nominal cost is negative; then, as we go along the circuit $C_j$ to compute the maximal allowable change of flux (ratio test), we successively add the differences between the reduced cost and the nominal cost to the latter to get the real directional derivative (2.1).

Definition 2.1 A basic solution $x = [x_B, x_N]$ to (NetPLP) is defined as a solution to $Ax = b$ in which each nonbasic variable takes one of its breakpoint values. If, in addition, $F(x)$ is finite, then $x$ is called a basic feasible solution.

Using the "tree language", each basic feasible solution of (NetPLP) can be identified with a maximal spanning tree such that $F(x)$ is finite and all arcs not in the tree take breakpoint values. The following proposition can be derived from a similar property of linear programs.

Proposition 2.2 If (NetPLP) has an optimal solution, then there exists a basic feasible solution which is also optimal.

Suppose that $x$ is a basic feasible solution to problem (NetPLP). One of the fundamental properties of (NetPLP) is the following (see [R84]).

Proposition 2.3 The following statements are equivalent:
(a) $x$ is optimal to (NetPLP).
(b) $x$ is feasible and, for all possible partitions $J = B \cup N$, none of the reduced costs of the simplex directions is strictly negative.
(c) $x$ is feasible and there exist a price vector $p \in R^m$ and a differential vector $v = -A^T p$ such that $f_l^-(x_l) \le v_l \le f_l^+(x_l)$, for $l = 1, \ldots, n$.

Notice that by the special structure of the node-arc incidence matrix $A$, one has $v_l = p_{lh} - p_{lt}$. Now we are ready to state the algorithm.
3 The Simplex Algorithm for (NetPLP) and Its Convergence
Algorithm 3.1
Step 0. Find an initial basic feasible solution of (NetPLP) or show that the problem is infeasible. If a basic feasible solution is found, then set this solution as $x^0$, set $k = 0$, and go to Step 1.
Step 1. Determine by Proposition 2.3(c) whether the current solution is optimal. If not, find a simplex direction associated with a nonbasic variable $j$ such that $F'(x^k, d^j) < 0$, and go to Step 2. By Proposition 2.3(b), such a direction exists.
Step 2. Set $x^{k+1} = x^k + \alpha d^j$, where $\alpha$ minimizes $F(x^k + \alpha d^j)$ over $\alpha \ge 0$. If such an $\alpha$ does not exist, i.e. $\inf_{\alpha \ge 0} F(x^k + \alpha d^j) = -\infty$, stop; (NetPLP) has an unbounded solution. Otherwise, set $k = k + 1$ and go to Step 1.

It will be shown (Proposition 4.3) that the minimizer $\alpha$ in Step 2 can be selected so that $x^{k+1}$ is again a basic feasible solution. If we maintain this machinery in Algorithm 3.1, then it is obvious that in a finite number of iterations the algorithm will either find an unbounded solution or an optimal basic feasible solution, because the total number of basic feasible solutions is finite and the strictly decreasing sequence $\{F(x^k)\}$ ensures that no basic feasible solution can repeat in the sequence $\{x^k\}$.
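A high-level sketch of this loop, with hypothetical helper names standing in for the routines detailed in Section 4 (these names are illustrative, not the authors' code):

```python
# Skeleton of Algorithm 3.1; find_initial_bfs, find_descent_direction and
# line_search are placeholders for the Section 4 routines.
def network_plp_simplex(problem):
    x = find_initial_bfs(problem)                         # Step 0 (e.g. the gradual penalty method)
    if x is None:
        return "infeasible"
    while True:
        direction = find_descent_direction(problem, x)    # Step 1, via Proposition 2.3(c)
        if direction is None:
            return x                                      # current basic feasible solution is optimal
        alpha, x = line_search(problem, x, direction)     # Step 2 (ratio test along the circuit)
        if alpha == float("inf"):
            return "unbounded"
```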
4 Implementation of the Algorithm

4.1 Data Structure
We use the "tree data structure" that has long been used in many successful implementations of network simplex methods. For a description of it, see [C83]. The arrays used are as follows.
FROM and TO arrays. These two arc-length arrays store the initial and ending nodes of the arcs.
BREAKPOINT and SLOPE arrays. These arrays store the values of the breakpoints and slopes for the arcs in their natural order, namely, in the orders of
$$c_{10}, \ldots, c_{1k_1}, \ldots, c_{n0}, \ldots, c_{nk_n} \quad \text{and} \quad s_{11}, \ldots, s_{1k_1}, \ldots, s_{n1}, \ldots, s_{nk_n}.$$
STATE array. This arc-length array records which breakpoint value is being taken by a nonbasic variable.
X array. This node-length array stores the values of the basic variables.
P array. This node-length array stores the dual variables, i.e. the price vector.
AU array. This node-length array is used for convenience when adjusting the dual variables.
To maneuver operations on a tree, we use four node-length arrays called PREDECESSOR, DEPTH, THREAD and EDGE. The functions of the first three can be found in [C83]. The last array points to the arc in the tree that is above and incident to a given node. Here, by convention, the tree is thought of as upside-down.
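A compact sketch of how these arrays might be grouped in code (a plain container under assumed field names, mirroring the array list above rather than reproducing the authors' FORTRAN layout):

```python
# Illustrative container for the arrays of Section 4.1; field names are assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class NetPLPState:
    # arc-length arrays
    from_node: List[int] = field(default_factory=list)     # FROM
    to_node: List[int] = field(default_factory=list)       # TO
    state: List[int] = field(default_factory=list)         # STATE: breakpoint index held by a nonbasic arc
    # breakpoints and slopes stored contiguously in their natural order
    breakpoint: List[float] = field(default_factory=list)  # BREAKPOINT
    slope: List[float] = field(default_factory=list)       # SLOPE
    # node-length arrays
    x: List[float] = field(default_factory=list)           # X: values of basic variables
    p: List[float] = field(default_factory=list)           # P: dual prices
    au: List[float] = field(default_factory=list)          # AU: pending price adjustments
    predecessor: List[int] = field(default_factory=list)   # tree arrays (see [C83])
    depth: List[int] = field(default_factory=list)
    thread: List[int] = field(default_factory=list)
    edge: List[int] = field(default_factory=list)          # EDGE: tree arc above a node
```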
4.2 Implementation of Step 0
We use the Gradual Penalty Method (GPM) of Grigoriadis [G86] to solve (NetPLP) in a single phase. It is a so-called big-M method with a self-adaptive mechanism for choosing the number $M$. It creates an initial basis of all-artificial arcs, each with a moderate linear penalty cost, and solves the augmented NetPLP. If the optimal solution contains a positive artificial variable, its penalty is enlarged gradually up to a certain limit, and the problem is repeatedly solved until the artificial arc has zero flux; otherwise the problem is declared infeasible. An empirical formula
$$M = \min\{\,\mu,\; 1 + (m-1)\max\{|s_*|, |s^*|\},\; s_* + (s^* - s_*)\,1.5[2^{\pi-2}]\,\}$$
is used to decide the penalty, where $s_*$ and $s^*$ are the minimum and maximum slopes of the cost functions, respectively, $\pi$ is a pass number which refers to the number of penalty changes, and $\mu$ is a threshold value ($10^9$ in our code).
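Read literally, the formula can be evaluated as below. This is a sketch only: the grouping of the last term is one plausible reading of the printed formula, and $\mu = 10^9$ as stated in the text.

```python
# Hedged sketch of the gradual-penalty value M from Section 4.2.
# The product 1.5 * 2**(pi - 2) in the last term is an assumed reading of the
# printed formula; mu = 1e9 is the threshold used in the authors' code.
def gradual_penalty(m, s_min, s_max, pi, mu=1e9):
    return min(
        mu,
        1 + (m - 1) * max(abs(s_min), abs(s_max)),
        s_min + (s_max - s_min) * 1.5 * 2 ** (pi - 2),
    )
```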
4.3 Implementation of Step 1
4.3.1 The Setting of the Price Vector

As we mentioned in Section 2, given a tree $B$, each node $i$ is associated with a price $p_i$. Thus one has $v_l = p_{lh} - p_{lt}$. In our implementation only $p$ is stored. Initially, the vector $p$ is set so that $v_l = f_l^+(x_l^0)$ for all $l \in B$ (if $f_l^+(x_l^0)$ happens to be $+\infty$, then $v_l = f_l^-(x_l^0)$). As in the linear network simplex method, the initial $p_i$'s are set to satisfy the system $p_{lh} - p_{lt} = v_l$ for all $l \in B$. Afterwards $p$ can be updated together with the update of the tree. However, no matter how the vector $p$ is updated, we always keep $f_l^-(x_l) \le v_l \le f_l^+(x_l)$ and $p_{lh} - p_{lt} = v_l$ for all $l \in B$. In particular, after the line search in Step 2, the $p_i$'s related to the simplex direction $d^j$ need to be changed according to the rule
$$p_{lh} - p_{lt} = f_l^+(x_l) \ \text{ if } d_l^j = 1, \qquad p_{lh} - p_{lt} = f_l^-(x_l) \ \text{ if } d_l^j = -1.$$
To save time, these changes are recorded in AU while the minimum ratio $\alpha$ is computed. It can be seen that such an updating procedure always assigns $v_l = f_l^+(x_l)$ or $f_l^-(x_l)$ for all $l \in B$, and the descent property of $d^j$ will ensure $f_l^+(x_l) \ne \infty$ and
$f_l^-(x_l) \ne -\infty$. At the end of an iteration, if there is a pivot, then $p$ is updated by taking into consideration both the change of the tree and the values recorded in AU. The above computation procedure for $p$ is exactly the same as in the linear case if the objective function happens to be linear.

4.3.2 The Search for a Descent Simplex Direction

The key to finding such a direction is the computation of the reduced cost (2.1) in Section 2. We separate the task into two processes. In process 1, we compute the nominal cost as if we were dealing with the linear case. The concrete process is: for each arc $j \in N$, if $p_{jh} - p_{jt} > f_j^+(x_j)$, then $d^j$, as described in Section 2, is taken as a candidate descent direction and the nominal cost is $n_j = f_j^+(x_j) - p_{jh} + p_{jt}$. If $p_{jh} - p_{jt} < f_j^-(x_j)$, then $-d^j$ is taken as a candidate descent vector and the nominal cost is $n_j = -f_j^-(x_j) + p_{jh} - p_{jt}$. If no such $j$ exists, namely if for all $j \in N$ we have $f_j^-(x_j) \le p_{jh} - p_{jt} \le f_j^+(x_j)$, then condition (c) in Proposition 2.3 is satisfied and $x$ is the optimal solution. Otherwise, a sample pricing strategy is adopted to select an entering arc from among a group of candidates. This strategy, with details in [G86], also reduces the danger of cycling when degeneracy arises. Note that the nominal cost $n_j$ is not necessarily the directional derivative $F'(x, d^j)$ in (2.1). In fact, we always have $n_j \le F'(x, d^j)$, and equality is valid only if $v_l = f_l^-(x_l)$ whenever $d_l^j = -1$ and $v_l = f_l^+(x_l)$ whenever $d_l^j = 1$, for all $l \in B$. Therefore, if $n_j \ge 0$, then $d^j$ is definitely not a descent direction, while if $n_j < 0$, we still have to compute (2.1) to know whether $d^j$ is really a descent direction. The case of $n_j < 0$ and $F'(x, d^j) \ge 0$ results in a degenerate iteration in which we pivot to change the basis but the iterative solution remains the same. After that, process 1 is restarted. In process 2, the computation of $F'(x, d^j)$ and the line search in Step 2 are carried out simultaneously. Since both operations need to travel around the circuit indicated by $d^j$, we do the two jobs by traveling around the circuit once. We leave the details to Section 4.4.
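A sketch of process 1 (nominal-cost pricing) under the breakpoint representation sketched in Section 1; `prices`, `left_deriv` and `right_deriv` are illustrative stand-ins for the P array and the one-sided derivatives, not the authors' routines:

```python
# Process 1 of Section 4.3.2: scan nonbasic arcs, compare the price difference
# p_head - p_tail against [f_j^-(x_j), f_j^+(x_j)], and collect candidate
# entering arcs with negative nominal cost.
def nominal_cost_candidates(nonbasic_arcs, x, prices, head, tail, left_deriv, right_deriv):
    candidates = []                          # tuples (arc j, orientation +1/-1, nominal cost n_j)
    for j in nonbasic_arcs:
        dp = prices[head[j]] - prices[tail[j]]
        if dp > right_deriv(j, x[j]):        # +d^j is a candidate descent direction
            candidates.append((j, +1, right_deriv(j, x[j]) - dp))
        elif dp < left_deriv(j, x[j]):       # -d^j is a candidate descent direction
            candidates.append((j, -1, -left_deriv(j, x[j]) + dp))
    return candidates                        # empty list: Proposition 2.3(c) holds and x is optimal
```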
4.4 The Implementation of Step 2
As mentioned in Section 4.3, suppose that $n_j < 0$; we need to check whether $d^j$ is really a descent vector, and if it is, do a line search along $d^j$. To achieve the two goals together, we first use the PREDECESSOR, DEPTH, and EDGE arrays to identify the arcs of the circuit associated with $d^j$. Details may be found in [C83]. Once an arc $l$ in this circuit is identified, compute
(a) the minimum ratio:
$$\alpha = \min_l \begin{cases} \bar{c}_l - x_l & \text{if } d_l^j = 1 \\ x_l - \underline{c}_l & \text{if } d_l^j = -1 \end{cases}$$
and (b) the directional derivative:
$$\beta = \beta + \begin{cases} f_l^+(x_l) - (p_{lh} - p_{lt}) & \text{if } d_l^j = 1 \\ (p_{lh} - p_{lt}) - f_l^-(x_l) & \text{if } d_l^j = -1 \end{cases}$$
where $\bar{c}_l > x_l$ is the closest breakpoint to the right of $x_l$ and $\underline{c}_l < x_l$ is the closest breakpoint to the left of $x_l$. If no such breakpoint exists, we regard the difference $\bar{c}_l - x_l$ or $x_l - \underline{c}_l$ as $+\infty$. The initial value of $\beta$ is $f_j^+(x_j) - p_{jh} + p_{jt} = n_j$ if $d_j^j = 1$, or $-f_j^-(x_j) - p_{jt} + p_{jh}$ if $d_j^j = -1$. If $\beta$ becomes positive or zero at some arc $l$ (note: initially $\beta < 0$), then $d^j$ is not a descent elementary direction, because $\beta$ increasingly reaches $F'(x, d^j)$ as we travel around the circuit (see Proposition 4.2 below). In this case we adjourn the process and pivot on $(j, l)$. Here arc $j$ is the entering arc and arc $l$ is the leaving one. After pivoting, we go to Step 1 and start the next iteration. If after scanning the entire circuit we still have $\beta < 0$, then $d^j$ is a real descent direction, because now $\beta = F'(x, d^j)$ (see Proposition 4.2 below). Set
$$x \leftarrow x + \alpha d^j. \qquad (4.1)$$
If $\alpha = \infty$, (NetPLP) has an unbounded solution. Otherwise, repeat (a) and (b) until a pivoting operation happens. It should be noticed that the values of $d_l^j$ can be obtained from the direction of the arc $l$ in the circuit; therefore, there is no need to store them. Since $d^j$ could still be a descent direction at the new $x$ obtained in (4.1), we repeat (a) and (b) until pivoting is necessary. Hence the line search can cross several pieces of $F(x)$ instead of just one. To justify that the implementation really achieves the goal of Algorithm 3.1, we prove the following.

Proposition 4.1 The $\beta$ computed after scanning the whole circuit is the desired derivative (2.1).

Proof. We only prove the case $d_j^j = 1$; the case $d_j^j = -1$ can be discussed similarly. $d^j$ satisfies $Ad^j = 0$ and $v$ satisfies $v = -A^T p$, so we have $v^T d^j = \sum_{l=1}^{n} d_l^j v_l = 0$. Thus $v_j = -\sum_{l \in B} d_l^j v_l$. At the beginning we have $\beta = f_j^+(x_j) - v_j = f_j^+(x_j) + \sum_{l \in B} d_l^j v_l$. In (b), we add $f_l^+(x_l)$ and subtract $v_l$ when $d_l^j = 1$, and add $-f_l^-(x_l)$ and subtract $-v_l$ when $d_l^j = -1$, for each $l \in B$. These operations are equivalent to adding the terms $\max\{d_l^j f_l^-(x_l), d_l^j f_l^+(x_l)\}$ and subtracting the terms $d_l^j v_l$, so the final $\beta$ equals the right hand side of (2.1). (Q.E.D.)

Proposition 4.2 For any $0 \le a \le \alpha$, $F(x + a d^j) = F(x) + a\beta$.

Proof. By the way $\alpha$ is determined, $F(x)$ is linear on the line segment $[x, x + \alpha d^j]$. Thus $F(x + a d^j) = F(x) + a F'(x, d^j) = F(x) + a\beta$ according to Proposition 4.1.
(Q.E.D.)

From this proposition, (NetPLP) is unbounded if $\alpha = \infty$. We still need to show

Proposition 4.3 After each iteration, $x$ remains a basic feasible solution.
Proof. A new $x$ is obtained from the current $x$ by a pivoting operation. The criterion for choosing the leaving variable is the change of sign of $\beta$. This change is possible only if $f_l^+(x_l) \ne v_l$ or $f_l^-(x_l) \ne v_l$. By the setting of $v_l$, $v_l$ is either $f_l^+(x_l)$ or $f_l^-(x_l)$. Hence $x_l$ must be a breakpoint, and the new $x$ is then a basic solution. The feasibility comes from $Ad^j = 0$ and $F(x + \alpha d^j) < \infty$, which is implied by Proposition 4.2. (Q.E.D.)

In summary, if after the circuit is scanned we have $\beta < 0$, then we do a line search and a real decrease of the objective function is achieved; the new iterative solution is still a basic feasible one. On the other hand, if during the scan $\beta$ becomes nonnegative for some arc $l$ in the circuit, then no decrease is made and we merely pivot to another basic feasible solution (a degenerate step). In addition, if there is no "cycling" in consecutive degenerate steps, we will eventually end up with one of three alternatives: optimality, unboundedness, or a descent direction. Although there exist some anti-cycling procedures [F88], for simplicity of coding and efficiency of the algorithm we did not adopt any of them. However, it seems that our policy of choosing the first arc that makes $\beta$ nonnegative as the leaving arc, together with the sample pricing strategy used in selecting the entering arc, has practically prevented the algorithm from cycling. In all tested problems, no cycling was observed.
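A sketch of the combined ratio test and derivative accumulation of this section, again with hypothetical helpers: `circuit` yields the basic arcs $l$ of $C_j$ with their orientations $d_l^j$, the entering arc contributes the initial value `beta0` $= n_j$, and the breakpoint and derivative routines are those of the earlier sketches.

```python
import math

# One pass of (a) and (b) along the circuit C_j: accumulate beta and the minimum
# ratio alpha; stop early (degenerate pivot) as soon as beta becomes nonnegative.
def scan_circuit(circuit, x, prices, head, tail, left_deriv, right_deriv,
                 next_breakpoint, prev_breakpoint, beta0):
    beta, alpha, leaving = beta0, math.inf, None
    for l, d in circuit:                            # d = d_l^j in {+1, -1}
        dp = prices[head[l]] - prices[tail[l]]
        if d == 1:
            beta += right_deriv(l, x[l]) - dp
            step = next_breakpoint(l, x[l]) - x[l]  # +inf if there is no breakpoint to the right
        else:
            beta += dp - left_deriv(l, x[l])
            step = x[l] - prev_breakpoint(l, x[l])  # +inf if there is no breakpoint to the left
        alpha = min(alpha, step)
        if beta >= 0:                               # degenerate case: pivot on (j, l), no move
            leaving = l
            break
    return beta, alpha, leaving                     # leaving is None if d^j is a true descent direction
```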
5 Computational Results
The 40 benchmark problems in Klingman et al. [KNS74] are generated with piecewise linear costs (although in its original sense the assignment problem should not have a piecewise linear cost, we treat it here as a special transportation problem for testing purposes). Each arc incurs a piecewise linear cost whose breakpoints and slopes are randomly assigned according to a uniform distribution. The code is written in an integer version of FORTRAN, compiled by F77 with optimization option -O, and executed on a SUN computer. Table 5.1 lists the computational time (in seconds) for these problems; the time does not include input-output. Table 5.2 presents results for various $k_j$: four representative problems are solved repeatedly with different numbers of pieces in their objective functions (without loss of generality, all $f_j$'s are assumed to have the same $k_j$, because one can introduce redundant breakpoints if necessary). It is interesting to see that the computational time of the method does not change much with respect to changes in $k_j$.
Problem  Type            m      n      Max. No.   CPU Time     Total Number
Number                                 of Pieces  of NETPLP    of Iterations
 1       Transportation   200    1308      4        0.15            636
 2       Transportation   200    1511      4        0.15            566
 3       Transportation   200    2000      4        0.20            801
 4       Transportation   200    2200      4        0.23            785
 5       Transportation   200    2900      4        0.25            881
 6       Transportation   300    3174      4        0.34           1158
 7       Transportation   300    4519      4        0.50           1531
 8       Transportation   300    5169      4        0.62           1708
 9       Transportation   300    6075      4        0.53           1550
10       Transportation   300    6320      4        0.59           1746
11       Assignment       400    1500      4        0.30           1252
12       Assignment       400    2250      4        0.37           1533
13       Assignment       400    3000      4        0.48           1936
14       Assignment       400    3750      4        0.57           2235
15       Assignment       400    4500      4        0.59           2294
16       Transshipment    400    1306      4        0.1&           1036
17       Transshipment    400    2443      4        0.30           1724
18       Transshipment    400    1306      4        0.17            984
19       Transshipment    400    2443      4        0.33           1718
20       Transshipment    400    1416      4        0.23           1178
21       Transshipment    400    2836      4        0.33           1773
22       Transshipment    400    1416      4        0.21           1056
23       Transshipment    400    2836      4        0.30           1570
24       Transshipment    400    1382      4        0.23           1366
25       Transshipment    400    2676      4        0.47           2648
26       Transshipment    400    1382      4        0.16            906
27       Transshipment    400    2676      4        0.31           2050
28       Transshipment   1000    2900      4        0.83           2258
29       Transshipment   1000    3400      4        1.03           2713
30       Transshipment   1000    4400      4        1.09           3328
31       Transshipment   1000    4800      4        1.09           3316
32       Transshipment   1500    4342      4        1.82           3857
33       Transshipment   1500    4385      4        1.95           4010
34       Transshipment   1500    5107      4        2.05           4391
35       Transshipment   1500    5730      4        2.32           5408
36       Transshipment   8000   15000      4       41.53          15052
37       Transshipment   5000   23000      4       25.35          14768
38       Transshipment   3000   35000      4       17.04          14574
39       Transshipment   5000   15000      4       20.66          11696
40       Transshipment   3000   23000      4       12.86          11983

Table 5.1. Solution Times and Optimal Values of 40 Benchmark Problems (on SUN SPARC 2)
No. of pieces   Problem #1   Problem #13   Problem #16   Problem #28
  8               0.42          1.60          0.88          2.37
 12               0.39          1.35          0.91          2.58
 16               0.47          1.74          0.94          2.61
 20               0.40          1.90          1.16          2.55
 24               0.43          1.25          1.19          2.77
 28               0.43          1.30          0.82          1.99
 32               0.52          1.41          0.86          2.42
 36               0.51          1.22          1.11          2.58
 40               0.48          1.29          1.07          2.47
 44               0.52          1.33          1.03          2.27
 48               0.42          1.17          0.92          2.30
 52               0.47          1.19          1.00          2.48
 56               0.47          0.96          1.03          2.84
 60               0.42          1.23          1.05          2.11
 64               0.42          1.02          1.07          2.48
 68               0.42          0.91          1.34          2.54
 72               0.42          0.88          1.29          2.42
 76               0.53          1.00          1.46          2.18
 80               0.48          0.99          1.29          2.84
 84               0.50          0.88          1.29          2.41
 88               0.40          0.68          1.49          2.54
 92               0.38          0.84          1.19          2.63
 96               0.49          0.88          1.18          2.53
100               0.40          0.91          1.32          2.58

Table 5.2. Solution Times as Number of Pieces Increases (on SUN 4/330)
6 The STP and Computational Results
The STP can be described as follows. Suppose that a commodity is manufactured and consumed in $m$ cities and transported between them through a highway network. At city $i$ ($1 \le i \le m$) there is a random demand $w_i$ and a fixed supply $b_i$. The commodity flux $x_j$ along highway link $j$ ($1 \le j \le n$) has lower and upper limits $d_j^-$ and $d_j^+$. In accordance with the usual convention, negative flux is understood as positive flux in the opposite direction. The transportation cost along highway $j$ is a convex piecewise linear function $t_j(x_j)$ (e.g. $q_j^+$ per positive unit and $q_j^-$ per negative unit). Let $E$ be the $m \times n$ incidence matrix of the network and $z_i$ be the net supply (i.e. the fixed supply minus export) of the commodity at city $i$. Then in vector notation we have
$$b - Ex = z, \qquad d^- \le x \le d^+, \qquad (6.1)$$
where $x = (x_1, \ldots, x_n)^T$, $d^- = (d_1^-, \ldots, d_n^-)^T$, and $d^+ = (d_1^+, \ldots, d_n^+)^T$ belong to $R^n$, and $b = (b_1, \ldots, b_m)^T$ and $z = (z_1, \ldots, z_m)^T$ belong to $R^m$. The vector inequalities are understood coordinatewise. The penalty for surplus or shortage of the commodity at city $i$ is introduced as
$$F_i(z_i, w_i) = h_i \max\{0, z_i - w_i\} + l_i \max\{0, w_i - z_i\},$$
where $h_i$ and $l_i$ are given cost coefficients, $h_i \ge 0$ and $l_i \ge 0$. We want to choose the amount of the commodity transported on each highway (i.e. the vector $x$), subject to (6.1), so as to minimize the total shipment costs plus the expected total penalty; in other words, to minimize
$$\phi(x, z) = \sum_{j=1}^{n} t_j(x_j) + \sum_{i=1}^{m} \mathcal{E}_W\{F_i(z_i, w_i)\},$$
where $\mathcal{E}_W$ stands for the mathematical expectation with respect to the random vector $W = [w_1, \ldots, w_m]^T$ and $u_i(z_i) = \mathcal{E}_W\{F_i(z_i, w_i)\}$. It is shown in [S86] that $u_i$ is convex and that, if $w_i$ has marginal probability distribution function $W_i$, then the subdifferential of $u_i$ at $z_i$ is given by the formula
$$\partial u_i(z_i) = \Big[\,(h_i + l_i) \lim_{t \uparrow z_i} W_i(t) - l_i,\; (h_i + l_i) \lim_{t \downarrow z_i} W_i(t) - l_i\,\Big]. \qquad (6.2)$$
Specifically, if $w_i$ has a discrete (marginal) distribution with finite support $\Omega = \{c_{i0}, \ldots, c_{ik_i}\}$, then, from (6.2), the function $u_i$ is convex piecewise linear and consists of $k_i + 2$ linear pieces with $c_{i0}, \ldots, c_{ik_i}$ being the breakpoints. We refer to this case as the discrete STP.

Work on the STP can be traced back to the paper of Dantzig and Ferguson on the stochastic transportation problem [FD56]. Since then, the stochastic transportation problem and other network-related stochastic programs have received considerable attention over the years (see, for instance, [CL77] [B79] [E60] [L80] [P86] [Q84b] [Q85] [Q87] [Wi63] [Co78] [Q84a] [MV88a] [MV88b] [W86] [W89] [WW89], which cover applications such as traffic control, production planning, and financial investment). Because of the non-bipartite network structure, our STP model is more general than the stochastic transportation problem. On the other hand, due to the explicit form of the recourse function $F_i(z_i, w_i)$, the model is a special case of stochastic programming with simple recourse [We83]. Existing methods for the stochastic transportation problem do not lend themselves to an obvious extension to STP, while general methods for stochastic programming with simple recourse do not take advantage of the network structure and hence tend to be inefficient for STP.
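For the discrete case, (6.2) translates directly into breakpoints and slopes for $u_i$. A sketch (assuming the demand values are sorted and the probabilities sum to one; names are illustrative):

```python
# Build the convex piecewise linear expected-penalty function u_i from a discrete
# demand distribution, using the one-sided derivatives given by (6.2):
# the slope between consecutive breakpoints is (h + l) * P(w <= c) - l.
def expected_penalty_pieces(values, probs, h, l):
    # values: demand support c_{i0} < ... < c_{ik}; probs: their probabilities
    breakpoints = list(values)
    slopes = [-l]                         # slope for z below the smallest demand value
    cdf = 0.0
    for p in probs:
        cdf += p
        slopes.append((h + l) * cdf - l)  # slope to the right of each breakpoint
    # k+1 breakpoints and k+2 slopes give the k_i + 2 linear pieces noted above
    return breakpoints, slopes
```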
We suggest that the STP be solved by explicitly solving its deterministic equivalent. To turn a discrete STP into a piecewise linear network program, we add an artificial node $r$ and $m$ additional arcs, each initiating from a node $i$ of the original network and terminating at $r$. The flux on arc $(i, r)$ is $z_i$ and its cost is $u_i(z_i)$. Now the STP is equivalent to the following piecewise linear program:
$$\begin{array}{ll} \text{minimize} & \sum_{j=1}^{n} t_j(x_j) + \sum_{i=1}^{m} u_i(z_i) \\[4pt] \text{subject to} & \begin{pmatrix} E & I \\ 0 & -e^T \end{pmatrix} \begin{pmatrix} x \\ z \end{pmatrix} = \begin{pmatrix} b \\ -\sum_{i=1}^{m} b_i \end{pmatrix}, \\[4pt] & d_j^- \le x_j \le d_j^+ \ \text{ for } j = 1, \ldots, n, \end{array} \qquad (6.3)$$
where $I$ is a unit matrix, $-e = (-1, \ldots, -1)^T$ and $b = (b_1, \ldots, b_m)^T$. The functions $t_j$ and $u_i$ are convex piecewise linear. Thus the STP is a special case of the problem (NetPLP).

In our computational test, the forty benchmark problems developed by Klingman, Napier and Stutz [KNS74] for network linear programming are regenerated in the STP context. Although random demand does not apply to assignment problems, we still test them, treating them as transportation problems with $b = (1, \ldots, 1, -1, \ldots, -1)^T$. In each problem, we associate with the nodes uniformly distributed demands having 100 possible values. This is probably a typical size of the support for discrete distributions in practice. These problems cover transportation and transshipment networks of various sizes and have long been used in evaluating and comparing algorithms in network linear programming. Because at this time no other codes for STP are available to us, we have not been able to compare the efficiency of this method with other methods. However, the solution times in all tested problems are satisfactory; none of them needs more than one minute of CPU time. Table 6.1 lists results for these STPs. The biggest problem in this table is a transshipment problem with 3000 nodes and 35,000 arcs. Each arc has a linear shipment cost and each node has a 100-piece expected penalty cost. Overall, we feel that there are at least two advantages to solving STP (including stochastic transportation problems) with the network piecewise linear algorithm. First, the method is conceptually simpler than other current methods and yet is practically efficient in solving large-scale problems. Second, it is flexible in dealing with different types of problems and complications in practice; for example, it allows the transportation cost also to be piecewise linear.
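A sketch of the reduction behind (6.3): append one artificial node and one arc per city, using the expected-penalty pieces from the previous sketch as the arc cost (all names are illustrative):

```python
# Turn a discrete STP into a (NetPLP) instance: an artificial node r collects the
# net supplies z_i through arcs (i, r) whose costs are the expected penalties u_i.
def stp_to_netplp(num_cities, highway_arcs, highway_costs, supplies,
                  demand_support, demand_probs, h, l):
    r = num_cities                                      # index of the artificial node
    arcs = list(highway_arcs)
    costs = list(highway_costs)                         # the given piecewise linear t_j
    for i in range(num_cities):
        arcs.append((i, r))                             # arc (i, r) carries z_i
        costs.append(expected_penalty_pieces(demand_support[i], demand_probs[i], h[i], l[i]))
    node_balance = list(supplies) + [-sum(supplies)]    # right hand side of (6.3)
    return arcs, costs, node_balance
```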
7 STPC and Other Extensions
Assume that in problem (6.3) the distributions of the random demands at the cities are continuous. Then the $u_i(z_i)$ are convex functions, but not necessarily piecewise linear. We may first discretize the distributions. This is equivalent to using convex piecewise linear functions $\bar u_i(z_i)$ to approximate $u_i(z_i)$. The properties of approximating a
convex function by a piecewise linear function were studied by Geoffrion [G77]. In general his result requires a globally finer division of the domain in order to obtain higher accuracy. However, for the STP, more grid points may not be necessary. Let us denote the approximating problem by STPD. Suppose that we obtain an optimal solution $(\bar x, \bar z)$ for STPD using the (NetPLP) method. We now discuss how to improve this approximate solution for STPC.

Proposition 7.1 STPC has an optimal solution (if one exists) $(x^*, z^*)$ such that $x_j^*$ takes a breakpoint value of the domain of $t_j$ for each $j$, except for $j$ belonging to a set of arcs forming a spanning tree of the network.

Proof. Suppose that an optimal solution to STPC is $(x, z^*)$. Fix the values of $z$ at $z^*$. The resulting problem is a piecewise linear network program, which has an optimal solution $x^*$ satisfying the requirements. (Q.E.D.)

If the $\bar u_i$ are good piecewise linear approximations of $u_i$, we may expect $\bar x_j$ to take the same breakpoint value as $x_j^*$ for each $j$ except $j$ belonging to an optimal spanning tree $T$. We may also assume that $\bar x_j$ is in the same linear "piece" of the domain of $t_j$ as $x_j^*$ if $j \in T$. To find $x^*$, it suffices to find the values of $x_j^*$ for $j \in T$ by fixing the values of $x_j$ for $j \notin T$ at these breakpoint values. Replace $x_j$ ($j \notin T$) by $c_{jk}$, where $c_{jk}$ is the corresponding breakpoint value, and adjust the $b_i$ accordingly. Then it suffices to solve the following problem:
$$\begin{array}{ll} \text{minimize} & \sum_{j \in T} t_j(x_j) + \sum_{i=1}^{m} u_i(z_i) \\[4pt] \text{subject to} & \begin{pmatrix} \bar E & I \\ 0 & -e^T \end{pmatrix} \begin{pmatrix} \bar x \\ z \end{pmatrix} = \begin{pmatrix} \bar b \\ -\sum_{i=1}^{m} \bar b_i \end{pmatrix}, \\[4pt] & d_j^- \le x_j \le d_j^+ \ \text{ for } j \in T, \end{array} \qquad \text{(STPT)}$$
where $\bar E$ is the incidence matrix of $T$ and $\bar x = (x_j)_{j \in T}$. As proved in [Q85], STPT may be solved with the work of solving a one-dimensional monotone equation. This certainly gives us a better approximate solution of STPC.

One extension of STP is the stochastic generalized transshipment problem, i.e., the problem with a magnification or reduction coefficient at each arc. Some versions of such a problem have been discussed in [FD56][E60][Q87]. It is expected that the (NetPLP) method may be generalized to such a problem without great difficulty, since the only difference is that the tree structure is replaced by a one-tree structure; see [KH80] and [Q87]. It will be much more difficult if we allow the magnification or reduction coefficients to also be convex piecewise linear functions. This occurs in the real world; for example, the reduction effect increases in an electricity transmission line as the flow increases. Other extensions of STP include the multi-commodity case and the stochastic programming problem with network flow recourse, which was first considered by Wallace for fisheries management. See [W86][W89] and [WW89].
Problem  Type            m      n      Size of    CPU Time
Number                                 Support    (in Sec.)
 1       Transportation   200    1311     100        0.36
 2       Transportation   200    1500     100        0.33
 3       Transportation   200    2007     100        0.48
 4       Transportation   200    2205     100        0.44
 5       Transportation   200    2900     100        0.50
 6       Transportation   300    3150     100        0.63
 7       Transportation   300    4500     100        0.93
 8       Transportation   300    5170     100        1.07
 9       Transportation   300    6095     100        1.14
10       Transportation   300    6311     100        1.21
11       Assignment       400    1500     100        0.38
12       Assignment       400    2250     100        0.49
13       Assignment       400    3000     100        0.56
14       Assignment       400    3750     100        0.52
15       Assignment       400    4500     100        0.69
16       Transshipment    400    1306     100        0.73
17       Transshipment    400    2443     100        1.24
18       Transshipment    400    1306     100        0.77
19       Transshipment    400    2443     100        1.24
20       Transshipment    400    1416     100        0.80
21       Transshipment    400    2836     100        1.44
22       Transshipment    400    1416     100        0.82
23       Transshipment    400    2836     100        1.49
24       Transshipment    400    1382     100        0.81
25       Transshipment    400    2676     100        1.50
26       Transshipment    400    1382     100        0.86
27       Transshipment    400    2676     100        1.55
28       Transshipment   1000    2900     100        2.38
29       Transshipment   1000    3400     100        2.53
30       Transshipment   1000    4400     100        3.35
31       Transshipment   1000    4800     100        3.39
32       Transshipment   1500    4342     100        4.25
33       Transshipment   1500    4385     100        4.14
34       Transshipment   1500    5107     100        4.49
35       Transshipment   1500    5730     100        4.81
36       Transshipment   8000   15000     100       50.48
37       Transshipment   5000   23000     100       28.01
38       Transshipment   3000   35000     100       24.82
39       Transshipment   5000   15000     100       24.91
40       Transshipment   3000   23000     100       18.82

Table 6.1. Solution Times of 40 Benchmark STP (on SUN 4/330)
References

[ACK86] I. Ali, W.D. Cook and M. Kress, Ordinal ranking and intensity of preference: a linear programming approach, Management Science, 32(1986)1642-1647.
[C83] V. Chvatal, Linear Programming, (Freeman, New York, NY, 1983).
[CSA86] A. Charnes, T. Song and I. Ali, A two-segment approximation algorithm for separable convex programming with linear constraints, Mathematische Operationsforschung und Statistik-Series Optimization, 17(1986)147-159.
[Co78] L. Cooper, The stochastic transportation-location problem, Computers and Mathematics with Applications 4(1978)265-275.
[CL77] L. Cooper and L.J. LeBlanc, Stochastic transportation problems and other network related convex problems, Naval Research Logistics Quarterly 24(1977)324-336.
[E60] S. Elmaghrabi, Allocation under uncertainty when the demand has a continuous distribution function, Management Science 6(1960)270-294.
[F86] R. Fourer, A simplex algorithm for piecewise-linear programming III: Computational analysis and applications, Tech. Report 86-03, Department of IE/MS, Northwestern University, Evanston, IL 60208 (1986).
[F88] R. Fourer, A simplex algorithm for piecewise-linear programming: finiteness, feasibility and degeneracy, Mathematical Programming, 41(1988)281-316.
[FD56] A.R. Ferguson and G.B. Dantzig, The allocation of aircraft to routes - an example of linear programming under uncertain demand, Management Science 3(1956)45-73.
[FF62] L.R. Ford and D.R. Fulkerson, Flows in Networks, (Princeton Univ. Press, Princeton, NJ, 1962).
[G77] A.M. Geoffrion, Objective function approximation in mathematical programming, Mathematical Programming 13(1977)23-37.
[Gr86] M.D. Grigoriadis, An efficient implementation of the network simplex method, Mathematical Programming Study 26(1986)83-111.
[KH80] J.L. Kennington and R.V. Helgason, Algorithms for Network Programming, (Wiley-Interscience, New York, 1980).
[KNS74] D. Klingman, A. Napier and J. Stutz, NETGEN: a program for generating large scale capacitated assignment, transportation, and minimum cost flow network problems, Management Science 20(1974)814-821.
[L80] F. Louveaux, A solution method for multi-stage stochastic programs with recourse with applications to an energy investment problem, Operations Research 28(1980)889-902.
[MV88a] J.M. Mulvey and H. Vladimirou, Solving multistage stochastic network flows: An application of scenario aggregation, Tech. Report SOR-88-1, Dept. of Civil Engineering and Operations Research, Princeton University, Princeton, NJ (1988).
[MV88b] J.M. Mulvey and H. Vladimirou, Stochastic network optimization models for investment planning, Tech. Report SOR-88-2, Dept. of Civil Engineering and Operations Research, Princeton University, Princeton, NJ (1988).
[P86] W.B. Powell, A stochastic model of the dynamic vehicle allocation problem, Transportation Science 20(1986)117-129.
[Q84a] L. Qi, Finitely convergent methods for solving stochastic linear programming and stochastic network flow problems, Ph.D. Dissertation, University of Wisconsin, Madison, WI (1984).
[Q84b] L. Qi, The dual forest iteration method for the stochastic transportation problem, Working Paper WP-84-59, IIASA, Laxenburg, Austria (1984).
[Q85] L. Qi, Forest iteration method for stochastic transportation problem, Mathematical Programming Study 25(1985)142-163.
[Q87] L. Qi, The A-forest iteration method for the stochastic generalized transportation problem, Mathematics of Operations Research 12(1987)1-15.
[R84] R.T. Rockafellar, Network Flows and Monotropic Optimization, (Wiley-Interscience, New York, 1984).
[S86] J. Sun, On monotropic piecewise quadratic programming, Ph.D. Dissertation, University of Washington, Seattle, Washington (1986).
[Sz64] W. Szwarc, The transportation problem with stochastic demands, Management Science 11(1964)33-50.
[W86] S. Wallace, Solving stochastic programs with network recourse, Networks 16(1986)295-317.
[W89] S. Wallace, Bounding the expected time-cost curve for a stochastic PERT network from below, Operations Research Letters 8(1989)89-94.
[We83] R.J-B. Wets, Solving stochastic programs with simple recourse, Stochastics 10(1983)219-242.
[Wi63] A.C. Williams, A stochastic transportation problem, Operations Research 1(1963)759-770.
[WW89] S. Wallace and R.J-B. Wets, Preprocessing in stochastic programming: The case of uncapacitated networks, ORSA Journal on Computing 1(1989)252-270.
Network Optimization Problems, pp. 301-331, Eds. D.-Z. Du and P.M. Pardalos, ©1993 World Scientific Publishing Co.
A Bibliography on Network Flow Problems¹

Marinus Veldhorst
Department of Computer Science, Utrecht University, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
Abstract
Network flow problems form an important class of research problems in optimization, with many new developments in the last decade. There are many different subclasses of network flow problems, as well as many different techniques to address these problems. In this bibliography we concentrate on combinatorial algorithms for the maximum flow problem and for the minimum cost flow problem with integral capacities and linear cost functions with integral coefficients. Especially results published since 1982 are compiled.
1 Introduction
Network flow problems form an important class of optimization problems on graphs. Basically, we are given a network (directed graph) G = (V, A) and one or more commodities. Some amount of each commodity must be pushed through the network from a number of sources (vertices in G which have a supply of the commodity) to a number of sinks (vertices with a demand of the commodity). At every vertex except the sources and the sinks, the incoming flow of every commodity is pushed on in certain portions over the outgoing arcs. The flow of the commodities may incur some costs, and the amount of flow through the arcs of the network may be restricted by certain capacity constraints and may be subject to losses or gains. Network flow problems usually ask for criteria and efficient algorithms for determining flows of the commodities that satisfy certain conditions.

¹This work was partially supported by the ESPRIT II Basic Research Actions of the EC under contract no. 7141 (project ALCOM II).
Important subclasses of network flow problems are the maximum flow problems, the minimum cost flow problems, the multicommodity flow problems, and the flows "with losses and gains" (the generalized flow problem). Even within a subclass, problems may vary in such a way that different algorithms are necessary to find the solution for different sorts of problems. For example, the capacity constraints may be nonnegative integral or real numbers, and can be upper or lower bounds on the amount of flow through the arcs; the cost functions of the arcs may be integral or real valued functions and can be linear or nonlinear in the amount of flow through the arc. For special networks (e.g., planar networks) specific algorithms have been designed in order to obtain solutions more efficiently. Hence, in a bibliography on network flow problems that is not too extensive, one must make a selection from an overwhelming number of publications in this research area. In this bibliography we compiled results which, in our opinion, are interesting from the viewpoint of the design and analysis of algorithms, especially combinatorial algorithms. We concentrated on the maximum flow problems and on the minimum cost flow problems with integral capacities and linear cost functions with integral coefficients. These are, in a way, the most classical network flow problems and were already the subject of most of Ford and Fulkerson's book in 1962. We intend to be complete in these two areas as far as results published after 1982 are concerned. Results that are not published in regular journals or proceedings of conferences are only included when we consider them important from an historic point of view or when they constitute the current state of the art. For the more general varieties and subclasses of the network flow problem the bibliography is certainly not complete, but nevertheless we hope to give a rather broad entrance to the scientific literature on these problems. In the remainder of this introduction we give an overview of the variants of the maximum flow problems and minimum cost flow problems, and mention a number of publications with good introductions, overviews or important contributions to the development of the field of network flow algorithms. After Section 1 the bibliography is given.²
1.1 Maximum Flow Problems
In the (basic) maximum flow problem there is one commodity and a maximum amount of flow of the commodity has to be sent from one source s to one sink t; the flow in each arc (i,j) ∈ A is not allowed to be more than a given positive integral value (upper bound) u_{i,j}. To be precise, in an instance of the maximum flow problem we are given a directed graph G = (V, A) and two specified vertices s and t, which are assumed to have no incoming and no outgoing arcs, respectively. With each arc
²Earlier versions of this bibliography have been published in Algorithms Review, vol. 1 (1990), pp. 97-117, and as Technical Report RUU-CS-91-38, Dept. of Computer Science, University of Utrecht, Utrecht, The Netherlands.
(i,j) ∈ A is associated a positive integral number u_{i,j}, its capacity. A (valid) flow for this instance consists of a real number f_{i,j} for each (i,j) ∈ A such that the following two conditions are satisfied:
$$0 \le f_{i,j} \le u_{i,j} \quad \text{for all } (i,j) \in A$$
$$\sum_{j:(j,i) \in A} f_{j,i} = \sum_{j:(i,j) \in A} f_{i,j} \quad \text{for each } i \ne s,\ i \ne t$$
The flow has value $\sum_{i:(s,i) \in A} f_{s,i}$. A maximum flow is a valid flow with maximum value among all possible valid flows. The maximum flow problem is the problem of designing an efficient algorithm that computes a maximum flow for every given instance.

Pioneering work on the maximum flow problem has been done by Ford and Fulkerson (e.g. [94]). Fundamental improvements have been given by Edmonds and Karp ([76]), Dinic ([72]), Karzanov ([208]), Malhotra et al. ([243]) and Goldberg ([120]). These fundamental improvements often triggered further improvement: faster algorithms could be obtained by incorporating more sophisticated data structures in the fundamental algorithms (e.g., [111], [333], [130], [125]), or by making proper choices that were left open in the fundamental algorithms (e.g., [344], [130], [52]). Good introductions or overviews of the algorithmic area of maximum network flow can be found in [5], [61], [94], [177], [194], [231], [287], [344] and [357]. Included in [5] is an excellent historic overview.

For subclasses of maximum flow problems, specially designed algorithms may run faster than the general algorithms. For example, there are maximum flow algorithms for planar graphs (e.g., [189], [170], [98]) and for bipartite graphs (e.g., [10], [159]). On the other hand, one can look for efficient algorithms for the special case that the capacities u_{i,j} are relatively small numbers (e.g., [86], [88]). In case the capacities are bounded by a polynomial in the number of vertices of G, a scaling technique might be useful ([107], [7], [11]). Algorithms for sequential computers are not necessarily efficient for computers of a different architecture. Hence, several researchers have looked for efficient parallel algorithms (e.g., [328], [130]) and efficient distributed algorithms (e.g., [20], [244], [130], [52]). Other researchers have considered the maximum flow of random instances of the maximum flow problem (e.g., [150], [335], [268]).

The maximum flow problem can naturally be extended to the so-called multiterminal flow problem. In this problem one wants to compute the maximum flow values for k source-destination pairs (s_1, t_1), ..., (s_k, t_k) simultaneously. Obviously this problem can be solved by separately solving k maximum flow problems (one for each pair (s_i, t_i)), but more efficient algorithms have been designed for the case that G is an undirected graph (e.g., [140], [144], [156]) and for the case that G is planar ([252]). Another extension of the maximum flow problem is the parametric flow problem, in which the capacities depend on one additional parameter, and one wants to compute some information about how the maximum flow depends on this parameter (e.g., [8],
[115], [158]). In a third extension, upper bounds are set on the amount of flow streaming through a vertex. Usually problems of this type can be transformed into ordinary maximum flow problems, but these transformations may lose several desirable properties of the networks, e.g. planarity ([215]). Other variants of the maximum flow problem can be found in e.g. [262], [272], [279]. As for implementations of the algorithms and their efficiency on existing computers, we refer to e.g. [225], [71], [203], [16] and [14].
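The two defining conditions above are easy to check mechanically. A small illustrative helper (not taken from any of the cited codes) that verifies the capacity and conservation constraints and returns the flow value:

```python
# Check that f is a valid s-t flow for capacities u, and return its value.
# u and f map arcs (i, j) to numbers; s has no incoming arcs, t no outgoing arcs.
def flow_value(nodes, u, f, s, t):
    for arc, cap in u.items():
        if not (0 <= f.get(arc, 0) <= cap):
            raise ValueError(f"capacity violated on arc {arc}")
    for i in nodes:
        if i in (s, t):
            continue
        inflow = sum(v for (a, b), v in f.items() if b == i)
        outflow = sum(v for (a, b), v in f.items() if a == i)
        if abs(inflow - outflow) > 1e-9:
            raise ValueError(f"flow not conserved at node {i}")
    return sum(v for (a, b), v in f.items() if a == s)
```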
1.2 Minimum Cost Flow Problems
In the minimum cost flow problem a nonnegative cost function c_{i,j} is associated with each arc (i,j) ∈ A. Instead of maximizing the flow value, we want to compute the flow of a given value v with minimum cost. The cost of a flow f is defined as $\sum_{(i,j) \in A} c_{i,j}(f_{i,j})$. Arcs may have capacity constraints that consist of nonnegative lower bounds and positive (possibly infinite) upper bounds. The minimum cost flow problem is easily generalized to the minimum cost circulation problem. Here we have a directed graph G = (V, A). With each arc (i,j) ∈ A is associated a cost function c_{i,j} and an (upper bounding) capacity u_{i,j} (a positive number, possibly infinite). With each vertex i ∈ V is associated a number b_i; if b_i < 0, vertex i has a demand of flow; if b_i > 0, i has a surplus of flow. A (valid) circulation consists of a nonnegative real number f_{i,j} for each (i,j) ∈ A such that the following two conditions hold:
$$0 \le f_{i,j} \le u_{i,j} \quad \text{for all } (i,j) \in A$$
$$\sum_{j:(i,j) \in A} f_{i,j} - \sum_{j:(j,i) \in A} f_{j,i} = b_i \quad \text{for all } i \in V$$
and has cost $\sum_{(i,j) \in A} c_{i,j}(f_{i,j})$. The problem is to find a valid circulation of minimum cost. Usually the cost functions are convex or linear. For the case of linear cost functions with integral coefficients we refer to [5]. It concentrates on combinatorial algorithms, but also contains a historic overview of the different fundamental approaches (e.g., network simplex, primal-dual, out-of-kilter, scaling, relaxation) to the solution of the minimum cost circulation problem. For the case of convex cost functions we refer to [212], [301] and [35]. Other variants of the minimum cost flow problems are mentioned in e.g. [166], [214] and [210]. Minimum cost flow in planar graphs is treated in [183]. As for implementations of the algorithms and their efficiency on existing computers, we refer to e.g. [225], [37], [203], [126] and [122].
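Analogously to the flow checker above, the circulation conditions can be verified directly (again only an illustrative helper, not part of any cited implementation):

```python
# Check capacities and node balances b_i for a circulation f, and return its cost.
# cost maps each arc to a callable c_{i,j}; u, b and f are as defined above.
def circulation_cost(nodes, u, b, cost, f):
    for arc, cap in u.items():
        if not (0 <= f.get(arc, 0) <= cap):
            raise ValueError(f"capacity violated on arc {arc}")
    for i in nodes:
        net = (sum(v for (a, _), v in f.items() if a == i)
               - sum(v for (_, d), v in f.items() if d == i))
        if abs(net - b[i]) > 1e-9:
            raise ValueError(f"balance violated at node {i}")
    return sum(cost[arc](v) for arc, v in f.items())
```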
References [1] G. K. Adel'son-Velskii, E. A. Dinic, and A. V. Karzanov. Science, Moscow, 1975. in Russian.
Flow
algorithms.
A Bibliography
on Network
Flow
Problems
305
[2] R. K. Ahuja. Algorithm for t h e m i n i m a x transportation problem. Naval Log. Quart., 33:725-739, 1986.
Res.
[3] R. K. Ahuja, J. L. B a t r a , and S. K. G u p t a . A p a r a m e t r i c algorithm for t h e convex cost network flow and related problems. Europ. J. Oper. Res., 16:222235, 1984. [4] R. K. Ahuja, A. V. Goldberg, J. B . Orlin, and R. E. Tarjan. Finding minimumcost flows by double scaling. Math. Programming, 53:243-266, 1992. [5] R. K. Ahuja, T. L. Magnanti, and J. B . Orlin. Network flows. In G. L. Nemhauser, A. H. G. Rinnooy Kan, and M. J. Todd, editors, Handbooks of Operations Research and Management Science, vol. 1: Optimization, pages 211-369. North-Holland Publ. Comp., A m s t e r d a m , 1989. [6] R. K. Ahuja and J. B. Orlin. Improved primal simplex algorithms for t h e shortest p a t h , assignment and m i n i m u m cost flow problems. Technical Report 2090-88, Sloan School of Management, M I T , Cambridge, Mass., 1988. [7] R. K. Ahuja and J. B . Orlin. A fast and simple algorithm for t h e m a x i m u m flow problem. Operations Res., 37:748-759, 1989. [8] R. K. Ahuja and J. B. Orlin. Distance directed augmenting p a t h algorithms for m a x i m u m flow and parametric m a x i m u m flow problems. Naval Res. Log. Quart., 38:413-430, 1991. [9] R. K. Ahuja and J. B. Orlin. T h e scaling network simplex algorithm. Res., 40:Supplement S5-S13, 1992.
Operations
[10] R. K. Ahuja, J. B . Orlin, C. Stein, and R. E. Tarjan. Improved algorithms for bipartite network flow problems. Technical Report TR-338-91, D e p a r t m e n t of C o m p u t e r Science, Princeton University, Princeton, NJ., 1991. [11] R. K. Ahuja, J. B . Orlin, and R. E. Tarjan. Improved t i m e bounds for t h e m a x i m u m flow problem. SIAM J. Comput., 18:939-954, 1989. [12] A. I. Ali, R. P a d m a n , and H. Thiagaran. Dual algorithms for pure network problems. Operations Res., 37:159-171, 1989. [13] I. Ali, D. B a r n e t t , K. Farhangian, J. Kennington, B . Patty, B. Shetty, B. McCarl, and P. Wong. Multicommodity network problems: Applications and computations. A.I.I.E. Trans., 16:127-134, 1984. [14] F . Alizadeh and A. V. Goldberg. Implementing t h e push-relabel m e t h o d for t h e m a x i m u m flow problem on a connection machine. Technical Report STANCS-92-1410, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, Feb. 1992.
306
M.
Veldhorst
[15] N. Alon. Generating pseudo-random permutations and m a x i m u m flow algor i t h m s . Inf. Process. Lett, 35:201-204, 1990. [16] R. J. Anderson and J. C. Setubal. On t h e parallel implementation of Goldberg's m a x i m u m flow algorithm. In Proc. 4th Annual ACM Symp. on Parallel Algorithms and Architectures, pages 168-177, 1992. [17] E. M. Arlin and C. H. Papadimitriou. On t h e complexity of circulations. Algorithms, 7:134-145, 1986.
J.
[18] J. Aronson and B. Chen. A primary/secondary memory implementation of a forward network simplex algorithm for multiperiod network flow problems. Comput. Oper. Res., 16:379-391, 1989.
[19] A. Assad. Multicommodity network flows - a survey. Networks, 8:37-91, 1978.
[20] B. Awerbuch. Reducing complexities of the distributed max-flow and breadth-first-search algorithms by means of network synchronization. Networks, 15:425-437, 1985.
[21] F. Barahona and E. Tardos. Note on Weintraub's minimum-cost circulation algorithm. SIAM J. Comput., 18:579-583, 1989.
[22] A. E. Baratz. The complexity of maximum network flow. Technical Report MIT/LCS/TR-230, Lab. for Computer Science, MIT, Cambridge, Mass., 1980.
[23] M. Bazaraa and J. J. Jarvis. Linear Programming and Network Flows (2nd ed.). John Wiley & Sons, New York, 1990.
[24] M. Bellmore and R. R. Vemuganti. On multicommodity maximal dynamic flows. Operations Res., 21:10-21, 1973.
[25] G. E. Bennington. An efficient minimal cost flow algorithm. Manag. Sci., 19:1042-1051, 1973.
[26] C. Berge. Graphs and Hypergraphs, chapter 5. North-Holland Publ. Comp., Amsterdam, 1973.
[27] C. Berge and A. Ghouila-Houri. Programming, Games and Transportation Networks. John Wiley & Sons, New York, 1962.
[28] D. P. Bertsekas. A unified framework for primal-dual methods in m i n i m u m cost network flow problems. Math. Programming, 32:125-145, 1985. [29] D. P. Bertsekas. Distributed asynchronous relaxation m e t h o d s for linear network flow problems. Technical Report LIDS-P-1986, Lab. for Decision Systems, M I T , Cambridge, Mass., 1986.
[30] D. P. Bertsekas and J. Eckstein. Dual coordinate step methods for linear network flow problems. Math. Programming, 42:203-243, 1988.
[31] D. P. Bertsekas and D. El Baz. Distributed asynchronous relaxation methods for convex network flow problems. SIAM J. Contr. & Optim., 25:74-85, 1987.
[32] D. P. Bertsekas, P. A. Hosein, and P. Tseng. Relaxation methods for network flow problems with convex arc costs. SIAM J. Contr. & Optim., 25:1219-1243, 1987.
[33] D. P. Bertsekas and P. Tseng. The relax codes for linear minimum cost network flow problems. In B. Simeone et al., editors, FORTRAN Codes for Network Optimization, Annals of Operations Research, vol. 13, pages 125-190, 1988.
[34] D. P. Bertsekas and P. Tseng. Relaxation methods for minimum cost ordinary and generalized network flow problems. Operations Res., 36:93-114, 1988.
[35] D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Computation, chapter 5, 6.5 and 6.6. Prentice-Hall, Inc., Englewood Cliffs, NJ, 1989.
[36] D. Bienstock. Some generalized max-flow min-cut problems in the plane. Math. Oper. Res., 16:310-333, 1991.
[37] R. G. Bland and D. L. Jensen. On the computational behavior of a polynomial-time network flow algorithm. Math. Programming, 54:1-40, 1992.
[38] G. Bradley, G. Brown, and G. Graves. Design and implementation of large scale primal transshipment algorithms. Manag. Sci., 24:1-38, 1977.
[39] S. P. Bradley, A. C. Hax, and T. L. Magnanti. Applied Mathematical Programming. Addison-Wesley Publ. Comp., New York, 1977.
[40] R. G. Busacker and P. J. Gowen. A procedure for determining a family of minimal-cost network flow patterns. O.R.O. Technical paper 15, Johns Hopkins University, Baltimore, MD, 1961.
[41] R. G. Busacker and T. L. Saaty. Finite Graphs and Networks: An Introduction with Applications. McGraw-Hill, New York, 1965.
[42] I. N. Chen. A new parallel algorithm for network flow problems. In T. Y. Feng, editor, Proc. 1974 Sagamore Computer Conf., Lecture Notes in C o m p u t e r Science, vol. 24, pages 306-307, Springer-Verlag, Berlin, 1975. [43] I. N. Chen, P. Y. Chen, and T. Y. Feng. Associative processing of network flow problems. IEEE Trans. Comput., C-28:184-190, 1979.
[44] I. N. Chen and T. Y. Feng. A parallel algorithm for maximum flow problem. In Proc. 1973 Sagamore Computer Conf., 1973.
[45] Y. L. Chen and Y. H. Chin. Multicommodity network flows with safety considerations. Operations Res., 40:Supplement S48-S55, 1992.
[46] C. K. Cheng. Ancestor tree for arbitrary multi-terminal-cut functions. Annals of Operations Res., 33:199-213, 1991.
[47] C. K. Cheng and T. C. Hu. Maximum concurrent flow and minimum cuts. Algorithmica, 8:233-249, 1992.
[48] J. Cheriyan. Parametrized worst case networks for preflow push algorithms. Technical report, Computer Science Group, Tata Institute of Fundamental Research, Bombay, India, 1988.
[49] J. Cheriyan and T. Hagerup. A randomized maximum-flow algorithm. In Proc. 30th Annual IEEE Symp. Foundations of Computer Science, pages 118-123, 1989.
[50] J. Cheriyan, T. Hagerup, and K. Mehlhorn. Can a maximum flow be computed in o(nm) time? In M. Paterson, editor, Proc. 17th Intern. Coll. on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 443, pages 235-248, Springer-Verlag, Berlin, 1990.
[51] J. Cheriyan, T. Hagerup, and K. Mehlhorn. An o(n³)-time maximum-flow algorithm. Technical Report MPI-I-91-120, Max-Planck-Institut für Informatik, Saarbrücken, Germany, Nov. 1991.
[52] J. Cheriyan and S. N. Maheshwari. Analysis of preflow push algorithms for maximum network flow. SIAM J. Comput., 18:1057-1086, 1989.
[53] J. Cheriyan and S. N. Maheshwari. The parallel complexity of finding a blocking flow in a 3-layer network. Inf. Process. Lett., 31:157-161, 1989.
[54] R. V. Cherkasky. Algorithm of construction of maximal flow in networks with complexity of O(V²√E) operations. Math. Methods of Solution of Economical Problems, 7:112-125, 1977. In Russian.
[55] T. Cheung. Computational comparison of eight methods for the maximum network flow problem. ACM Trans. Math. Softw., 6:1-16, 1980.
[56] T. Cheung. Graph traversal techniques and the maximum flow problem in distributed computation. IEEE Trans. Softw. Eng., SE-9:504-512, 1983.
[57] N. Christofides. Graph Theory: An Algorithmic Approach, chapter 11. Academic Press, New York, 1975.
[58] E. Cohen. Approximate max flow on small depth networks. In Proc. 33rd Annual IEEE Symp. Foundations of Computer Science, pages 648-658, 1992.
[59] E. Cohen and N. Megiddo. Algorithms and complexity analysis for some flow problems. In Proc. 2nd Annual ACM-SIAM Symp. Discrete Algorithms, pages 120-130, 1991.
[60] E. Cohen and N. Megiddo. New algorithms for generalized network flows. In D. Dolev, Z. Galil, and M. Rodeh, editors, Theory of Computing and Systems, Proc. ISTCS '92, Haifa, Israel 1992, Lecture Notes in Computer Science, vol. 601, pages 103-114, Springer-Verlag, Berlin, 1992.
[61] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms, chapter 28. MIT Press, Cambridge, Mass., 1990.
[62] W. Cui. A network simplex method for the maximum balanced flow problem. J. Oper. Res. Soc. Japan, 31:551-563, 1988.
[63] W. Cui and S. Fujishige. A primal algorithm for the submodular flow problem with minimum-mean cycle selection. J. Oper. Res. Soc. Japan, 31:431-441, 1988.
[64] W. H. Cunningham. A network simplex method. Math. Programming, 11:105-116, 1976.
[65] W. H. Cunningham. Theoretical properties of the network simplex method. Math. Oper. Res., 4:196-208, 1979.
[66] W. H. Cunningham and A. Frank. A primal-dual algorithm for submodular flows. Math. Oper. Res., 10:251-262, 1985.
[67] G. B. Dantzig. Application of the simplex method to a transportation problem. In T. C. Koopmans, editor, Activity Analysis of Production and Allocation, pages 359-373, John Wiley & Sons, New York, 1951.
[68] G. B. Dantzig. Linear Programming and Extensions. Princeton Univ. Press, Princeton, NJ, 1962.
[69] G. B. Dantzig and D. R. Fulkerson. On t h e max-flow min-cut theorem of networks. In H. W . K u h n and A. W . Tucker, editors, Linear Inequalities and Related Systems, Annals of Mathematics Study, vol. 38, pages 215-221, Princeton Univ. Press, Princeton, N J , 1956. [70] U. Derigs. Programming in Networks and Graphs. Lecture Notes in Economics and M a t h e m a t i c a l Systems, vol. 300. Springer-Verlag, Berlin, 1988.
[71] U. Derigs and W . Meier. Implementing Goldberg's max-flow algorithm, a comp u t a t i o n a l investigation. Z. Oper. Res., 33:383-403, 1989. [72] E. A. Dinic. Algorithm for solution of a problem of m a x i m u m flow in networks with power estimation. Soviet Math. Dokl, 11:1277-1280, 1970. [73] J. Divoky and M. Hung. Performance of shortest p a t h algorithms in network flow problems. Manag. Sci., 36:661-673, 1990. [74] J. R. Driscoll, H. N. Gabow, R. Shrairman, and R. E. Tarjan. Relaxed heaps: An alternative to Fibonacci heaps with applications to parallel computations. Commun. ACM, 31:1343-1354, 1988. [75] J. E d m o n d s and R. Giles. A m i n - m a x relation for submodular functions on graphs. Annals of Discrete Math., 1:185-204, 1977. [76] J. E d m o n d s and R. M. Karp. Theoretical improvements in algorithmic efficiency for network flow problems. J. ACM, 19:248-264, 1972. [77] J. E l a m , F . Glover, and D. Klingman. A strongly convergent primal simplex algorithm for generalized networks. Math. Oper. Res., 4:39-59, 1979. [78] P. Elias, A. Feinstein, and C. E. Shannon. Note on m a x i m u m flow through a network. IRE Trans, on Inform. Theory, 2:117-119, 1956. [79] S. E. Elmaghraby. Sensitivity analysis of multi-terminal network flows. ORSA, 12:680-688, 1964.
J.
[80] T. R. Ervolina and S. T. McCormick. A strongly polynomial dual cancel and tighten algorithm for minimum cost network flow. Technical Report 90-MSC-010, UBC Faculty of Commerce, 1990.
[81] T. R. Ervolina and S. T. McCormick. A strongly polynomial maximum mean cut cancelling algorithm for minimum cost network flow. Technical Report 90-MSC-009, UBC Faculty of Commerce, 1990.
[82] J. R. Evans. Maximum flow in probabilistic graphs - the discrete case. Networks, 6:161-183, 1976.
[83] S. Even. The max-flow algorithm of Dinic and Karzanov. An exposition. Technical Report MIT/LCS/TM-80, Lab. for Computer Science, MIT, Cambridge, Mass., 1976.
[84] S. Even. Graph Algorithms, chapter 4, 5, and 10.8. Pitman Publ. Ltd, London, 1979.
[85] S. Even, A. Itai, and A. Shamir. On the complexity of timetable and multicommodity flow problems. SIAM J. Comput., 5:691-703, 1976.
[86] S. Even and R. E. Tarjan. Network flow and testing graph connectivity. SIAM J. Comput., 4:507-518, 1975.
[87] T. E. Feather. The parallel complexity of some flow and matching problems. PhD thesis, University of Toronto, Toronto, Canada, 1984.
[88] D. Fernandez-Baca and C. U. Martel. On the efficiency of maximum-flow algorithms on networks with small integer capacities. Algorithmica, 4:173-189, 1989.
[89] L. R. Ford and D. R. Fulkerson. Maximal flow through a network. Canad. J. Math., 8:399-404, 1956.
[90] L. R. Ford and D. R. Fulkerson. A simple algorithm for finding maximal network flows and an application to the Hitchcock problem. Canad. J. Math., 9:210-218, 1957.
[91] L. R. Ford and D. R. Fulkerson. Constructing maximal dynamic flows from static flows. Operations Res., 6:419-433, 1958.
[92] L. R. Ford and D. R. Fulkerson. A suggested computation for maximal multicommodity network flow. Manag. Sci., 5:97-101, 1958.
[93] L. R. Ford and D. R. Fulkerson. A network flow feasibility theorem and combinatorial applications. Canad. J. Math., 11:440-450, 1959.
[94] L. R. Ford and D. R. Fulkerson. Flows in Networks. Princeton Univ. Press, Princeton, NJ, 1962.
[95] A. Frank. Augmenting graphs to meet edge-connectivity requirements. SIAM J. Discr. Math., 5:25-53, 1992.
[96] A. Frank and E. Tardos. An application of simultaneous Diophantine approximation in combinatorial optimization. Combinatorica, 7:49-65, 1987. Preliminary version: An application of simultaneous approximations in combinatorial optimization, Proc. 26th Annual IEEE Symp. Foundations of Computer Science, pages 459-463, 1985.
[97] H. Frank and I. T. Frisch. Communication, Transmission, and Transportation Networks. Addison-Wesley Publ. Comp., New York, 1971.
[98] G. N. Frederickson. Fast algorithms for shortest paths in planar graphs, with applications. SIAM J. Comput., 16:1004-1022, 1987.
T . Fujisawa. Maximal flow in a lossy network. In Proc. Allerton Circuit and System Theory, pages 385-393, 1963.
Conf. on
S. Fujishige. Algorithms for solving t h e independent-flow problem. J. Res. Soc. Japan, 21:189-204, 1978.
Oper.
S. Fujishige. A capacity-rounding algorithms for t h e minimum-cost circulation problem: a dual framework of t h e Tardos algorithm. Math. Programming, 35:298-308, 1986. S. Fujishige. An out-of-kilter m e t h o d for submodular flows. Discrete Math., 17:3-16, 1987.
Applied
S. Fujishige, A. Nakayama, and W . - T . Cui. On t h e equivalence of the m a x i m u m balanced flow problem a m d t h e weighted m i n i m a x flow problem. Operations Res. Lett, 5:207-209, 1986. S. Fujishige, A. Rock, and U. Z i m m e r m a n n . A strongly polynomial algorithm for m i n i m u m cost submodular flow problems. Math. Oper. Res., 14:60-69, 1989. D. R. Fulkerson. An out-of-kilter m e t h o d for minimal cost flow problem. J. Appl. Math., 9:18-27, 1961.
SIAM
D. R. Fulkerson and G. B . Dantzig. C o m p u t a t i o n of m a x i m u m flow in networks. Naval Res. Log. Quart., 2:277-283, 1955. H. N. Gabow. Scaling algorithms for network problems. J. Comput. 31:148-168, 1985.
Syst.
Sci.,
H. N. Gabow and R. E. Tarjan. Faster scaling algorithms for network problems. SIAM J. Comput., 18:1013-1036, 1989. D. Gale. A t h e o r e m on flows in networks. Pacific J. Math., 7:1073-1082, 1957. D. Gale. Transient flows in networks. Michigan 5 3
J. Math., 6:59-63, 1959.
2 3
Z. Galil. An 0{n l m l ) algorithm for t h e m a x i m a l flow problem. Acta Inf., 14:221-242, 1980. Preliminary version in Proc. 19th Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 231-245, 1978. Z. Galil. On t h e theoretical efficiency of various network flow algorithms. oretical Comput. Sci., 14:103-111, 1981.
The-
Z. Galil and A. N a a m a d . An 0(EV log 2 V) algorithm for the maximal flow problem. J. Comput. Syst. Sci., 21:203-217, 1980. Preliminary version as "Network flow and generalized p a t h compression" in Proc. 11th Annual ACM Symp. Theory of Computing, pages 13-26, 1979.
[114] Z. Galil and E. Tardos. An 0(n2(m + rclogn)logn) min-cost flow algorithm. J. ACM, pages 374-386, 1988. Preliminary version in Proc. 27th Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 1-9, 1986. [115] G. Gallo, M. D. Grigoriadis, and R. E. Tarjan. A fast p a r a m e t r i c m a x i m u m flow algorithm and applications. SIAM J. Comput., 18:30-55, 1989. [116] M. R. Garey and D. S. Johnson. Computers and Intractibility, a Guide to the Theory of NP-Completeness, chapter A2. W . H . Freeman and Co., San Francisco, 1979. [117] F . Glover, D. Karney, and D. Klingman. Implementation and computational comparisons of primal, dual and primal-dual computer codes for m i n i m u m cost network flow problem. Networks, 4:191-212, 1974. [118] F . Glover, D. Karney, D. Klingman, and A. Napier. A c o m p u t a t i o n a l study on start procedures, basis change criteria, and solution algorithms for transportation problem. Manag. Sci., 20:793-813, 1974. [119] A. V. Goldberg. A new max-flow algorithm. Technical Report M I T / L C S / T M 291, Lab. for C o m p u t e r Science, M I T , Cambridge, Mass., 1985. [120] A. V. Goldberg. Efficient graph algorithms for sequential and parallel computers. P h D thesis, Dept. of Electr. Engin. and C o m p u t e r Science, M I T , Cambridge, Mass., 1987. Also available als Technical Report TR-374, Lab. for Computer Science, M I T , Cambridge, Mass., 1987. [121] A. V. Goldberg. Processor-efficient implementation of a m a x i m u m flow problem. Inf. Process. Lett, 38:179-185, 1991. [122] A. V. Goldberg. An efficient implementation of a scaling minimum-cost flow algorithm. Technical Report STAN-CS-92-14139, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, Aug. 1992. [123] A. V. Goldberg. A n a t u r a l randomization strategy for multicommodity flow and related problems. Inf. Process. Lett., 42:249-256, 1992. [124] A. V. Goldberg, M. D. Grigoriadis, and R. E. Tarjan. Efficiency of t h e network simplex algorithm for t h e m a x i m u m flow problem. Technical Report STAN-CS89-1248, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, 1989. To appear in Math. Programming. [125] A. V. Goldberg, M. D. Grigoriadis, and R. E. Tarjan. Use of dynamic trees in a network simplex algorithm for t h e m a x i m u m flow problem. Math. Programming, 50:277-290, 1991.
[126] A. V. Goldberg and M. Kharitonov. On implementing scaling push-relabel algorithms for the minimum-cost flow problem. Technical Report STAN-CS92-1418, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, Mar. 1992. [127] A. V. Goldberg, S. A. Plotkin, and E. Tardos. Combinatorial algorithms for t h e generalized circulation problem. Math. Oper. Res., 16:351-381, 1991. Preliminary version in Proc. 29th Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 432-443, 1988. [128] A. V. Goldberg, S. A. Plotkin, and P. M. Vaidya. Sublinear-time parallel algor i t h m s for matching and related problems. In Proc. 29th Annual IEEE Symp. Foundations of Computer Science, pages 174-185, 1988. [129] A. V. Goldberg, E. Tardos, and R. E. Tarjan. Network flow algorithms. In B . K o r t e , L. Lovasz, H. P r o m e l , and A. Schrijver, editors, Flows, paths, and VLSI-layout, pages 101-164, 1990. Previously published as Tech. Rep. STANCS-89-1252, D e p a r t m e n t of C o m p u t e r Science, Stanford University, March 1989. [130] A. V. Goldberg and R. E. Tarjan. A new approach to the m a x i m u m flow problem. J. ACM, 35:921-940, 1988. Preliminary version in Proc. 18th Annual A C M Symp. Theory of Computing, pages 136-146, 1986. [131] A. V. Goldberg and R. E. Tarjan. Finding minimum-cost circulations by canceling negative cycles. J. ACM, 36:873-886, 1989. Preliminary version in Proc. 20th Annual ACM Symp. Theory of Computing, pages 388-397, 1987. [132] A. V. Goldberg and R. E. Tarjan. A parallel algorithm for finding a blocking flow in an acyclic network. Inf. Process. Lett., 31:265-271, 1989. [133] A. V. Goldberg and R. E. Tarjan. Finding minimum-cost circulations by successive approximation. Math. Oper. Res., 15:430-466, 1990. Preliminary version published as M I T / L C S / T M - 3 3 3 , M I T , 1987, and as Solving minimum-cost flow problems by successive approximation, Proc. 19th Annual ACM Symp. Theory of C o m p u t i n g , pages 7-18. [134] B . Golden and T. L. Magnanti. Deterministic network optimization: A bibliography. Networks, 7:149-183, 1977. [135] D. Goldfarb and M. D. Grigoriadis. A computational comparison of the Dinic and network simplex methods for m a x i m u m flow. In B. Simeone, et al., editor, FORTRAN Codes for Network Optimization, Annals of Operations Research, vol. 13, pages 83-124, 1988.
[136] D. Goldfarb and J. Hao. A primal simplex algorithm t h a t solves t h e m a x i m u m flow problem in at most 0(nm) pivots and Oiji^m) time. Math. Programming, 47:353-363, 1990. [137] D. Goldfarb and J. Hao. On strongly polynomial variants of t h e network simplex algorithm for t h e m a x i m u m flow problem. Operations Res. Letters, 10:383-387, 1991. [138] D. Goldfarb, J. Hao, and S. Kai. Anti-stalling pivot rules for the network simplex algorithm. Networks, 20:79-91, 1990. [139] L. M. Goldschlager, R. A. Shaw, and J. Staples. T h e m a x i m u m flow problem is log space complete for P . Theoretical Comput. Sci., 21:105-111, 1982. [140] R. E. Gomory and T . C. Hu. Multi-terminal network flows. J. SIAM, 9:551-570, 1961. [141] R. E. Gomory and T . C. Hu. An application of generalized linear programming to network flows. J. SIAM, 10:260-283, 1962. [142] R. E. Gomory and T . C. Hu. Synthesis of a communication network. J. 12:348-369, 1964. [143] M. Gondran and M. Minoux. Graphs and Algorithms, Interscience, New York, 1984.
SIAM,
chapter 5 and 6. Wiley-
[144] F . Granot and R. Hassin. Multi-terminal m a x i m u m flows in node capacitated networks. Discrete Applied Math., 13:157-163, 1986. [145] F . Granot and M. Penn. On t h e integral plane two-commodity flow problem. Operations Res. Lett, 11:135-139, 1992. [146] F . Granot and A. F . Veinott J r . Substitutes, complements and ripples in network flows. Math. Oper. Res., 10:471-497, 1985. [147] M. D. Grigoriadis. An efficient implementation of t h e network simplex m e t h o d . Math. Prog. Study, 26:83-111, 1986. [148] M. D. Grigoriadis and W . W . W h i t e . A partitioning algorithm for t h e multicommodity network flow problem. Math. Programming, 3:157-177, 1972. [149] G. R. G r i m m e t t and W . - C . S. Suen. T h e m a x i m a l flow through a directed graph with r a n d o m capacities. Stochastics, 8:153-159, 1982. [150] G. R. G r i m m e t t and D. J. A. Welsh. Flow in networks with r a n d o m capacities. Stochastics, 7:205-229, 1982.
151] R. C. Grinold. Calculating m a x i m a l flows in a network with positive gains. Operations Res., 21:528-541, 1973. 152] M. Grotschel, L. Lovasz, and A. Schrijver. Geometric natorial Optimization. Springer-Verlag, Berlin, 1988.
Algorithms
and
Combi-
153] G. Guisewite and P. M. Pardalos. M i n i m u m concave cost network flow problems: applications, complexity, and algorithms. Annals of Operations Research, 25:125-190, 1990. 154] R. P. G u p t a . O n flows in pseudosymmetric networks. J. SIAM, 1966.
14:215-225,
[155] D. Gusfield. Simple constructions for multi-terminal network flow synthesis. SIAM J. Comput, 12:157-165, 1983. [156] D. Gusfield. Very simple methods for all pairs network flow analysis. SIAM Comput., 19:143-155, 1990. [157] D. Gusfield. Computing t h e strength of a graph. SIAM J. Comput., 1991.
J.
20:639-654,
[158] D. Gusfield and C. Martel. A fast algorithm for the generalized parametric m i n i m u m cut problem and applications. Algorithmica, 7:499-519, 1992. [159] D. Gusfield, C. Martel, and D. Fernandez-Baca. Fast algorithms for bipartite network flow. SIAM J. Comput., 16:237-251, 1987. [160] D. Gusfield and D. Naor. Efficient algorithms for generalized cut trees. In Proc. 1st Annual ACM-SIAM Symp. Discrete Algorithms, pages 422-433, 1990. [161] H. H a m a c h a r . Numerical investigations on t h e maximal flow algorithm of Karzanov. Computing, 22:17-29, 1979. [162] H. Hamachar and L. R. Foulds. Algorithms for flows with p a r a m e t r i c capacities. Z. Oper. Res., 33:21-37, 1989. [163] J. Hao and J. B. Orlin. A faster algorithm for finding t h e m i n i m u m cut in a graph. In Proc. 3rd Annual ACM-SIAM Symp. Discrete Algorithms, pages 165-174, 1992. [164] J. K. H a r t m a n and L. S. Lasdon. A generalized upper-bounding algorithm for multicommodity network flow problems. Networks, 1:333-354, 1971. [165] R. Hassin. M a x i m u m flow in (s,t) 107, 1981.
planar networks. Inf. Process. Lett., 13:107-
[166] R. Hassin. M i n i m u m cost flow in set-constraints. Networks,
12:1-21, 1982.
[167] R. Hassin. T h e m i n i m u m cost flow problem: a unifying approach to dual algorithms and a new tree search algorithm. Math. Programming, 25:228-239, 1983. [168] R. Hassin. On multicommodity flow in planar graphs. Networks, 1985.
14:225-235,
[169] R. Hassin. Algorithms for t h e m i n i m u m cost circulation problem based on maximizing t h e m e a n improvement. Operations Res. Lett., 12:227-233, 1992. [170] R. Hassin and D. B . Johnson. An 0 ( n l o g 2 n ) algorithm for m a x i m u m flow in undirected planar networks. SIAM J. Comput., 14:612-624, 1985. [171] R. Hassin and E. Zemel. Probabilistic analysis of t h e capacitated transportation problem. Math. Oper. Res., 13:80-89, 1988. [172] R. V. Helgason and J. L. Kennington. An efficient procedure for implementing a dual simplex network flow algorithm. A.I.I.E. Trans., 9:63-68, 1977. [173] F . L. Hitchcock. T h e distribution of a product from several sources to numerous facilities. J. Math. Phys., 20:224-230, 1941. [174] D. S. Hochbaum and A. Segev. Analysis of a flow problem with fixed charges. Networks, 19:291-312, 1989. [175] T. C. Hu. Multicommodity network flows. Operations [176] T. C. Hu. Integer Programming C o m p . , Reading, Mass., 1969.
& Network
Flows.
Res., 11:344-360, 1963. Addison-Wesley Publ.
[177] T. C. Hu. Combinatorial Algorithms, chapter 2.1, 2.2 and 2.3. Addison-Wesley Publ. Comp., Reading, Mass., 1982. [178] T . C. Hu and M. T. Shing. Algorithms, 4:241-261, 1983.
Multiterminal flows in outerplanar graphs.
J.
[179] T. C. Hu and M. T . Shing. A decomposition algorithm for multi-terminal network flows. Technical Report T R C S 84-08, D e p a r t m e n t of C o m p u t e r Science, University of California, Santa Barbara, CA, 1984. [180] C. A. J. Hurkens, A. Schrijver, and E. Tardos. On fractional multicommodity flows and distance functions. Discrete Math., 73:99-109, 1989. [181] T. Ichimori, H. Ishii, and T. Nishida. Weighted m i n i m a x real-valued flow. Oper. Res. Soc. Japan, 24:52-59, 1981.
J.
M.
318
Veldhorst
182] H. Imai. On t h e practical efficiency of various m a x i m u m flow algorithms. Oper. Res. Soc. Japan, 26:61-82, 1983.
J.
183] H. Imai and K. Iwano. Efficient sequential and parallel algorithms for planar m i n i m u m cost flow. In T. Asano, T. Ibaraki, H. Imai, and T. Nishizeki, editors, Proc. SIGAL International Symposium on Algorithms SIGAL '90, Lect u r e Notes in C o m p u t e r Science, vol. 450, pages 21-30, Springer-Verlag, Berlin, 1990. 184] M. Iri. A new m e t h o d of solving transportation-network problems. J. Res. Soc. Japan, 3:27-87, 1960. 185] M. Iri. Network York, 1969.
Flows,
Transportation
and Scheduling.
Oper.
Academic Press, New
186] A. Itai. Two-commodity flow. J. ACM, 25:596-611, 1978. 187] A. Itai and D. K. P r a d h a n . Synthesis of directed multicommodity flow networks. Networks, 14:213-224, 1984. 188] A. Itai and M. Rodeh. Scheduling transmissions in a network. J. 6:409-429, 1985. 189] A. Itai and Y. Shiloach. M a x i m u m flows in planar networks. SI AM J. 8:135-150, 1979.
Algorithms,
Comput.,
190] A. V. Iyer, J. J. Jarvis, and H. D. Ratliff. Hierarchical solution to network flow problems. Networks, 20:731-752, 1990. 191] L. Janiga and V. Koubek. A note on finding cuts in directed planar networks by parallel computation. Inf. Process. Lett., 21:75-78, 1985. [192] J. J. Jarvis. On the equivalence between node-arc and arc-chain formulations for t h e multicommodity maximal flow problem. Naval Res. Log. Quart., 16:525529, 1969. [193] J. J. Jarvis and A. M. Jezior. Maximal flow with gains through a special network. Operations Res., 20:678-688, 1972. [194] P. A. Jensen and W . Barnes. Network New York, 1980.
Flow Programming.
J o h n Wiley & Sons,
[195] P. A. Jensen and G. B h a u m i k . A flow augmentation approach t o t h e network with gains m i n i m u m cost flow problem. Manag. Sci., 23:631-643, 1977. [196] W . S. Jewell. O p t i m a l flow through networks. Interim Technical Report No. 8, Operations Research Center, M I T , Cambridge, Mass., 1958.
[197] W . S. Jewell. O p t i m a l flow through networks with gains. 10:476-499, 1962.
Operations
Res.,
[198] W . S. Jewell. A primal-dual multicommodity flow algorithm. O R C Report 6624, Operations Research Center, University of California, Berkeley, C A , 1966. [199] W . S. Jewell. Multicommodity network solutions. Dunod, Paris, page 183, 1967.
In Thiorie
des
graphes.
[200] D. B . Johnson. Parallel algorithms for m i n i m u m cuts and m a x i m u m flows in planar networks. J. ACM, 34:950-967, 1987. Preliminary version in Proc. 23rd Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 244-254, 1982. [201] D. B. Johnson and S. M. Venkatesan. Using divide and conquer to find flows in directed planar networks in 0 ( r c 3 ' 2 logra) t i m e . In Proc. 20th Annual Allerton Conf. on Communication, Control, and Computing, pages 898-905, Univ. of Illinois, U r b a n a - C h a m p a i g n , IL., 1982. [202] D. B. Johnson and S. M. Venkatesan. Partition of planar flow networks. In Proc. 24th Annual IEEE Symp. Foundations of Computer Science, pages 259264, 1983. [203] D. S. Johnson and C. C. McGeoch. DIMACS implementation challenge workshop algorithms for network flow and matching. Technical Report 92-4, DIM A C S , New Brunswick, N J , 1992. [204] E. L. Johnson. 1966.
Networks and basis solutions.
Operations
Res.,
14:619-624,
[205] S. Kapoor and P. M. Vaidya. Fast algorithms for convex quadratic programming and multicommodity flows. In Proc. 18th Annual ACM Symp. Theory of Computing, pages 147-159, 1986. [206] R. M. K a r p . A characterization of the m i n i m u m cycle mean in a digraph. Discrete Math., 23:309-311, 1978. [207] R. M. K a r p , E. Upfal, and A. Wigderson. Constructing a m a x i m u m matching is in R a n d o m N C . Combinatorica, 6:35-48, 1986. [208] A. V. Karzanov. Determining t h e m a x i m a l flow in a network by t h e m e t h o d of preflows. Soviet Math. DokL, 15:434-437, 1974. [209] A. V. Karzanov. Half-integral 18:263-278, 1987.
five-terminus
flows.
Discrete
Applied
Math.,
[210] N. K a t o h . An efficient algorithm for the bicriteria minimum-cost circulation problem. J. Oper. Res. Soc. Japan, 32:420-440, 1989.
[211] J. L. Kennington. Survey of linear cost multicommodity network flows. ations Res., 26:209-236, 1978. [212] J. L. Kennington and R. V. Helgason. Algorithms Wiley-Interscience, New York, 1980.
for Network
Oper-
Programming.
[213] J. L. Kennington and M. Shalaby. An effective subgradient procedure for minimal cost multicommodity flow problems. Manag. Sci., 23:994-1004, 1977. [214] D . B . K h a n g and 0 . Fujiwara. A p p r o x i m a t e solutions of capacitated fixedcharge m i n i m u m cost network flow problems. Networks, 21:689-704, 1991. [215] S. Khuller and J. Naor. Flow in planar graphs with vertex capacities. Technical Report 90-1089, C o m p u t e r Science D e p a r t m e n t , Cornell University, Ithaca, NY, J a n . 1990. [216] S. Khuller, J. Naor, and P. Klein. T h e lattice structure of flow in planar graphs. Technical Report UMIACS-TR-2566, Univ. of Maryland Inst, for Advanced C o m p u t e r Studies, 1990. [217] S. Khuller and B . Schieber. Efficient parallel algorithms for testing kconnectivity and finding disjoint s — t paths in graphs. SI AM J. Comput., 20:352-375, 1991. Preliminary version in Proc. 30th Annual I E E E Symp. Foundations of C o m p u t e r Science, pages 288-293, 1989. [218] A. B . Kinariwala and A. G. Rao. Flow switching approach to t h e m a x i m u m flow problem. J. ACM, 24:630-645, 1977. [219] V. King, S. Rao, and R. E. Tarjan. A faster deterministic m a x i m u m flow algorithm. In Proc. 3rd Annual ACM-SIAM Symp. Discrete Algorithms, pages 157-165, 1992. [220] M. Klein. A primal m e t h o d for minimal cost flows with applications t o t h e assignment and transportation problems. Manag. Sci., 14:205-220, 1967. [221] P. Klein, A. Agrawal, R. Ravi, and S. Rao. Approximation through multic o m m o d i t y flow. In Proc. 31th Annual IEEE Symp. Foundations of Computer Science, pages 726-737, 1990. [222] P. Klein, C. Stein, and E. Tardos. Leighton-Rao might be practical: faster approximation algorithms for concurrent flow with uniform capacities. In Proc. 22th Annual ACM Symp. Theory of Computing, pages 310-321, 1990. [223] D. J. Kleitman. An algorithm for certain multicommodity flow problems. Networks, 1:75-90, 1971.
[224] J. G. Klincewicz. A Newton m e t h o d for convex separable network flow problems. Networks, 13:427-442, 1983. [225] D. Klingman, A. Napier, and J. Stutz. N E T G E N : A program for generating large scale capacitated assignment, transportation, and m i n i m u m cost flow network problems. Manag. Sci., 20:814-821, 1974. [226] E. K n a p p . An exercise in t h e formal derivation of parallel programs: M a x i m u m flows in graphs. ACM Trans. Program. Lang. Syst., 12:203-223, 1990. [227] T. C. Koopmans. O p t i m u m utilization of t h e transportation system. In Proc. International Statistical Conference, Washington, D.C., 1947. Also reprinted as supplement t o Econometrica 1 7 , 1949. [228] V. Koubek and A. Riha. T h e m a x i m u m fc-flow in a network. In J. Gruska and M. Chytil, editors, Proc. Mathem. Foundations of Computer Science, Lecture Notes in C o m p u t e r Science, vol. 118, pages 389-397, Springer-Verlag, Berlin, 1981. [229] L. Kucera. M a x i m u m flow in planar networks. In J. Gruska and M. Chytil, editors, Proc. Mathem. Foundations of Computer Science, Lecture Notes in C o m p u t e r Science, vol. 118, pages 418-422, Springer-Verlag, Berlin, 1981. [230] L. Kucera. Finding a m a x i m u m flow in / s , t / - p l a n a r network in linear expected t i m e . In M. P. Chytil and V. Koubek, editors, Proc. Mathem. Foundations of Computer Science, Lecture Notes in C o m p u t e r Science, vol. 176, pages 370-377, Springer-Verlag, Berlin, 1984. [231] E. L. Lawler. Combinatorial Optimization: Networks and Matroids, 6.3 and 7.11. Holt, Rinehart and Winston, New York, 1976. [232] E. L. Lawler. Shortest p a t h and network flow algorithms. Annals Math., 4:251-263, 1979.
chapter 4,
of
Discrete
[233] E. L. Lawler. An introduction to polymatroidal network flows. In G. Ausiello and M. Lucertini, editors, Analysis and Design of Algorithms in Combinatorial Optimization, International Centre for Mechanical Sciences, Courses and Lectures - No. 266, pages 129-146. Springer-Verlag, Vienna, 1981. [234] E. L. Lawler and C. U. Martel. Computing maximal "polymatroidal" network flow. Math. Oper. Res., 7:334-347, 1982. [235] T. Leighton, F . Makedon, S. Plotkin, C. Stein, E. Tardos, and S. Tragoudas. Fast approximation algorithms for multicommodity flow problems. In Proc. 23rd Annual ACM Symp. Theory of Computing, pages 101-111, 1991.
[236] T . Leighton and S. Rao. An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In Proc. 29th Annual IEEE Symp. Foundations of Computer Science, pages 422-431, 1988. [237] T. Lengauer and K. W. Wagner. T h e binary network flow problem is logspace complete for P . Theoretical Comput. Sci., 75:357-363, 1990. A preliminary version was part of: T. Lengauer and K. W . Wagner, T h e correlation between t h e complexities of non-hierarchical and hierarchical versions of graph problems. In: F . J. Brandenburg, G. Vidal-Nacquet and M. Wirsing (eds.), Proc. STACS 87 - 4th Annual Symp. on Theor. Aspects of C o m p u t e r Science, Lecture Notes in C o m p u t e r Science, vol. 247, pages 100-113, Springer-Verlag, Berlin, 1987. [238] R. J. Lipton and R. E. Tarjan. A separator theorem for planar graphs. J. Appl. Math., 36:177-189, 1979. [239] M. V. Lomonosov. 3:207-218, 1983.
On the planar integer two-flow problem.
SIAM
Combinatorica,
[240] M. V. Lomonosov. Combinatorial approaches to multiflow problems. Applied Math., 11:1-94, 1985.
Discrete
[241] M. Malek-Zavarei and J. K. Aggarwal. Optimal flow in networks with gains and costs. Networks, 1:355-365, 1972. [242] M. Malek-Zavarei and I. T. Frisch. On t h e fixed cost flow problem. Control, 16:897-902, 1972.
Int.
J.
[243] V. M. Malhotra, M. P. K u m a r , and S. N. Maheshwari. An 0(n3) algorithm for finding m a x i m u m flows in networks. Inf. Process. Lett., 7:277-278, 1978. [244] J. M. Marberg and E. Gafni. An 0 ( n 2 m 1 / 2 ) distributed max-flow algorithm. In S. Sahni, editor, Proc. International Conf. on Parallel Processing, pages 2 1 3 216, 1987. [245] C. Martel. A comparison of phase and non-phase network flow algorithms. Networks, 19:691-705, 1989. [246] K. M a t s u m o t o , T. Nishizeki, and N. Saito. An efficient algorithm for finding multicommodity flows in planar networks. SIAM J. Comput., 14:289-302, 1985. [247] K. M a t s u m o t o , T. Nishizeki, and N. Saito. Planar multicommodity flows, maxi m u m matchings and negative cycles. SIAM J. Comput., 15:495-510, 1986. [248] J. F . Maurras. Optimization of t h e flow through networks with gains. Programming, 3:135-144, 1972.
Math.
[249] N. Megiddo. Optimal flows in networks with multiple sources and sinks. Programming, 7:97-107, 1974.
Math.
[250] N. Megiddo. A good algorithm for lexicographically optimal flows in multiterminal networks. Bull, of the AMS, 83:97-107, 1977. [251] K. Mehlhorn. Data structures and Algorithms; vol. 2, Graph Algorithms NP-completeness, chapter IV.9. Springer-Verlag, Berlin, 1984.
and
[252] G. L. Miller and J. Naor. Flow in planar graphs with multiple sources and sinks, extended abstract. In Proc. 30th Annual IEEE Symp. Foundations of Computer Science, pages 112-117, 1989. [253] E. Minieka. Optimal flow in a network with gains. INFOR,
10:171-178, 1972.
[254] E. Minieka. P a r a m e t r i c network flows. Operations
Res., 20:1162-11678, 1972.
[255] E. Minieka. Optimization New York, 1978.
and Graphs. Marcel Dekker,
Algorithms
for Networks
[256] M. Minoux. Resolution des problemes de multiflots en nombres entier dans les grands resaux. RAIRO, 3:21-40, 1975. [257] M. Minoux. Flots equilibres et flots avec securite. E.D.F.-Bull. et Recherches, serie C - Mathem., Inform., 1:5-16, 1976.
Direction
[258] M. Minoux. Multiflots de cout minimal avec fonctions de cout concaves. Telecommun., 31:77-92, 1976.
Etudes
Annls
[259] M. Minoux. A polynomial algorithm for m i n i m u m quadratic cost flow problems. Europ. J. Oper. Res., 18:377-387, 1984. [260] M. Minoux. Network synthesis and o p t i m u m network design problems: Models, solution m e t h o d s and applications. Networks, 19:313-360, 1989. [261] G. J. Minty. Monotone networks. Proc. Royal Soc. London, 1960.
A(257):194-212,
[262] J. S. B . Mitchell. On m a x i m u m flows in polyhedral domains. J. Comput. Set., 40:88-123, 1990.
Syst.
[263] K. Mizuno, S. Mizuno, and M. Mori. A polynomial t i m e interior point algorithm for m i n i m u m cost flow problems. J. Oper. Res. Soc. Japan, 33:157-167, 1990. [264] J. Mulvey. Pivot strategies for primal-simplex network codes. J. ACM, 25:266270, 1978.
324 [265] K. G. Murty. Linear New York, 1976.
M. and Combinatorial
Programming.
Veldhorst
J o h n Wiley & Sons,
[266] H. Nagamochi and T. Ibaraki. On max-flow min-cut and integral flow properties for m u l t i c o m m o d i t y flows in directed networks. Inf. Process. Lett., 31:279-285, 1989. [267] H. Nagamochi and T. Ibaraki. Multicommodity flows in certain planar directed networks. Discrete Applied Math., 27:125-145, 1990. [268] H. Nagamochi and T. Ibaraki. M a x i m u m flows in probabilistic networks. works, 21:645-666, 1991.
Net-
[269] H. Nagamochi and T. Ibaraki. Computing edge-connectivity in multigraphs and capacitated networks. SIAM J. Discr. Math., 5:54-66, 1992. [270] A. Nakayama. A polynomial algorithm for t h e m a x i m u m balanced flow problem with a constant balancing r a t e function. J. Oper. Res. Soc. Japan, 29:400-410, 1986. [271] A. Nakayama. A polynomial-time dual simplex algorithm for t h e m i n i m u m cost flow problem. J. Oper. Res. Soc. Japan, 30:265-289, 1987. [272] A. Nakayama. A polynomial-time binary search algorithm for t h e m a x i m u m balanced flow problem. J. Oper. Res. Soc. Japan, 33:1-11, 1990. [273] A. Nakayama. NP-completeness and approximation algorithm for the m a x i m u m integral vertex-balanced flow problem. J. Oper. Res. Soc. Japan, 34:13-27, 1991. [274] T. Nishizeki and N. Chiba. Planar Graphs: Theory and Algorithms, chapter 11. Annals of Discrete M a t h e m a t i c s , vol. 32. North-Holland Publ. Comp., Amsterd a m , 1988. [275] H. Okamura. Multicommodity flows in graphs. Discrete Applied Math., 6:55-62, 1983. [276] H. O k a m u r a and P. D. Seymour. Multicommodity flows in planar graphs. Combin. Theory, B-31:75-81, 1981.
J.
[277] K. Onaga. D y n a m i c programming of o p t i m u m flows in lossy communication nets. IEEE Trans. Circuit Th., CT-13:282-287, 1966. [278] K. Onaga. O p t i m a l flows in general communication networks. J. Franklin 283:308-327, 1967. [279] J. B . Orlin. M a x i m u m t h r o u g h p u t - d y n a m i c networks flows. Math. ming, 27:214-231, 1983.
Inst.,
Program-
[280] J. B . Orlin. Genuinely polynomial simplex and non-simplex algorithms for t h e m i n i m u m cost flow problem. Technical Report 1615-84, Sloan School of M a n a g e m e n t , M I T , Cambridge, Mass., 1984. Also as CWI-OS R8504, Center for M a t h e m a t i c s and C o m p u t e r Science, A m s t e r d a m , 1985. [281] J. B. Orlin. M i n i m u m convex cost dynamic network flows. Math. 9:190-207, 1984.
Oper.
Res.,
[282] J. B . Orlin. On t h e simplex algorithm for networks and generalized networks. Math. Prog. Stud., 24:166-178, 1985. [283] J. B. Orlin. A faster strongly polynomial m i n i m u m cost flow algorithm. In Proc. 20th Annual ACM Symp. Theory of Computing, pages 377-387, 1988. To appear in Operations Res. [284] J. B . Orlin and R. K. Ahuja. New distance-directed algorithms for m a x i m u m flow and p a r a m e t r i c m a x i m u m flow problems. Technical Report 1908-87, Sloan School of M a n a g e m e n t , M I T , Cambridge, Mass., 1987. [285] J. B . Orlin and R. K. Ahuja. New scaling algorithms for assignment and minim u m cycle m e a n problems. Math. Programming, 54:41-56, 1988. [286] M. P a d b e r g and G. Rinaldi. An efficient algorithm for t h e m i n i m u m capacity cut problem. Math. Programming, 47:19-36, 1990. [287] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization, Algorithms and Complexity., chapter 4.3, 5.6, 6, 7, 9 and 10.3. Prentice-Hall, Inc., Englewood Cliffs, N J , 1982. [288] A. B . P h i l p o t t . Continuous-time flows in networks. Math. 661, 1990.
Oper. Res., 15:640-
[289] S. A. Plotkin and E. Tardos. Improved dual network simplex. In Proc. Annual ACM-SIAM Symp. Discrete Algorithms, pages 367-376, 1990. [290] J. Ponstein. Programming,
On t h e maximal flow problem with real arc capacities. 3:254-256, 1972.
[291] R. B . P o t t s and R. M. Oliver. Flows in Transportation Press, New York, 1972.
Networks.
1st
Math.
Academic
[292] P. S. P u l a t . A decomposition algorithm to determine t h e m a x i m u m flow in a generalized network. Comput. Oper. Res., 16:161-172, 1989. [293] P. S. P u l a t . M a x i m u m outflow in generalized flow networks. Europ. J. Res., 43:65-77, 1989.
Oper.
326
M.
Veldhorst
[294] A. P. P u n n e n . A linear t i m e algorithm for t h e m a x i m u m capacity p a t h . J. Oper. Res., 53:402-404, 1991.
Europ.
[295] M. Queyranne. Theoretical efficiency of t h e algorithm "capacity" for t h e maxi m u m flow problem. Math. Oper. Res., 5:258-266, 1980. [296] T. Radzik and A. V. Goldberg. Tight bounds on t h e n u m b e r of m i n i m u m - m e a n cycle cancellations and related results. In Proc. 2nd Annual ACM-SIAM Symp. Discrete Algorithms, pages 110-119, 1991. [297] V . R a m a c h a n d r a n . T h e complexity of m i n i m u m cut and m a x i m u m flow problems in an acyclic network. Networks, 17:387-392, 1987. [298] K. G. R a m a k r i s h n a n . Solving two-commodity transportation problems with coupling constraints. J. ACM, 27:736-757, 1980. [299] J. H. Reif. M i n i m u m s-t cut of a planar undirected network in 0 ( n l o g 2 ( n ) ) t i m e . SI AM J. Comput., 12:71-81, 1983. [300] H. Rock. Scaling techniques for m i n i m u m cost network flows. In U. P a p e , editor, Discrete Structures and Algorithms, pages 181-191, Carl Hansen Verlag, Miinchen, 1980. [301] R. T. Rockafellar. Network k Sons, New York, 1984.
Flows and Monotropic
Optimization.
John Wiley
[302] B . Rothfarb and I. T. Frisch. On t h e 3-commodity flow problem. Appl. Math., 17:46-58, 1969.
SIAM
J.
[303] B . Rothfarb, N. P. Shein, and I. T. Frisch. Common terminal multicommodity flow. Operations Res., 16:202-205, 1968. [304] B. Rothschild and A. Whinston. Feasibility of two commodity network flows. Operations Res., 14:1121-1129, 1966. [305] B . Rothschild and A. Whinston. On two commodity network Res., 14:377-387, 1966.
flows.
Operations
[306] G. Ruhe. P a r a m e t r i c m a x i m a l flows in generalized networks - complexity and algorithms. Optimization, 19:235-251, 1988. [307] H. M. Safer. Scaling algorithms for distributed m a x flow. Technical report, Sloan School of Management, M I T , Cambridge, Mass., 1988. [308] R. Saigal. Multicommodity flows in directed networks. Operations Research Center, University of California, Berkeley, CA, 1968.
[309] M. Sakarovitch. T h e multicommodity m a x i m u m flow problem. O R C Report 6625, Operations Research Center, University of California, Berkeley, CA, 1968. [310] M. Sakarovitch. Two commodity network flows and linear programming. Programming, 4:1-20, 1973.
Math.
[311] B . Schieber and S. Moran. Parallel algorithms for m a x i m u m bipartite matchings and m a x i m u m 0-1 flows. J. of Parallel and Distributed Computing, 6:20-38, 1989. [312] A. Schrijver. Applications of polyhedral combinatorics to multicommodity flows and compact surfaces. Technical Report CWI-BS-R8921, Center for M a t h e m a t ics and C o m p u t e r Science, A m s t e r d a m , 1989. [313] A. Schrijver. T h e Klein bottle and multicommodity 9:375-384, 1989.
flows.
Combinatorial,
[314] A. Schrijver. Short proofs on multicommodity flows and cuts. Technical Report CWI-BS-R8922, Center for Mathematics and C o m p u t e r Science, A m s t e r d a m , 1989. [315] A. Segall. Decentralized maximum-flow protocols. Networks,
12:213-230, 1982.
[316] M. Sengoku, S. Skinoda, and R. Yatsuboshi. On a function for t h e vulnerability of a directed flow network. Networks, 18:73-83, 1988. [317] M. Serna and P. Spirakis. Tight R N C approximations to maxflow. Technical Report T R 90.01.1, C o m p u t e r Technology Institute, P a t r a s University, P a t r a s , Greece, 1990. [318] P. D. Seymour. T h e matroids with t h e max-flow min-cut property. J. Theory, B-23:189-222, 1977. [319] P. D. Seymour. A two-commodity cut theorem. Discrete 1978.
Math.,
23:341-355,
[320] P. D. Seymour. A short proof of t h e two-commodity flow theorem. J. Theory, B-26:370-371, 1979. [321] P. D. Seymour. Four-terminus flows. Networks,
Comb.
Comb.
10:79-86, 1980.
[322] P. D. Seymour. On odd cuts and planar multicommodity flows. Proc. Mathem. Soc, 42:178-192, 1981.
London
[323] F . Shahrokhi. Approximation algorithms for t h e m a x i m u m concurrent flow problem. ORSA Jrnl. on Computing, 1:62-69, 1989.
[324] F . Shahrokhi and D. Matula. T h e m a x i m u m concurrent flow problem. J. 37:318-334, 1990.
ACM,
[325] Y. Shiloach. An 0(nl log 2 7) m a x i m u m flow algorithm. Technical Report STAN78-702, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, 1978. [326] Y. Shiloach. Multi-terminal 0 - 1 flows. SIAM
J. Comput.,
8:422-430, 1979.
[327] Y. Shiloach. A multi-terminal m i n i m u m cut algorithm for planar graphs. J. Comput, 9:214-219, 1980.
SIAM
[328] Y. Shiloach and U. Vishkin. An 0(n2 log n) parallel max-flow algorithm. Algorithms, 3:128-146, 1982.
J.
[329] M. T. Shing and P. K. Agarwal. Multi-terminal flows in planar networks. Technical Report T R C S 86-07, D e p a r t m e n t of C o m p u t e r Science, University of California, Santa Barbara, CA, 1986. [330] J. F . Sibeyn. A pseudo-polylog t i m e parallel maxflow algorithm. Technical Report RUU-CS-90-17, D e p a r t m e n t of C o m p u t e r Science, University of Utrecht, Utrecht, T h e Netherlands, 1990. [331] K. Simon. On m i n i m u m flow and transitive reduction. In Proc. 15th Intern. Coll. on Automata, Languages and Programming, Lecture Notes in C o m p u t e r Science, vol. 317, pages 535-546, Springer-Verlag, Berlin, 1988. [332] D. D. Sleator and R. E. Tarjan. An O(nmlogn) algorithm for m a x i m u m network flow. Technical Report STAN-CS-80-831, D e p a r t m e n t of C o m p u t e r Science, Stanford University, Stanford, CA, 1980. [333] D. D. Sleator and R. E. Tarjan. A d a t a structure for dynamic trees. J. Syst. ScL, 26:362-390, 1983. [334] D. D. Sleator and R. E. Tarjan. Self adjusting binary search trees. J. 32:652-686, 1985.
Comput.
ACM,
[335] J. E. Somers. M a x i m u m flow in networks with a small n u m b e r of r a n d o m arc capacities. Networks, 12:242-253, 1982. [336] H. Soroush and P. B . Mirchandani. T h e stochastic multicommodity flow problem. Networks, 20:121-155, 1990. [337] Y. Soun and K. Truemper. Single commodity representation of multicommodity networks. SIAM J. Algebraic Discrete Methods, 1:348-358, 1980.
[338] V. Srinivasan and G. L. Thompson. Accelerated algorithms for labeling and relabeling of trees, with applications t o distribution problems. J. ACM, 19:712— 726, 1972. [339] V. Srinivasan and G. L. Thompson. Benefit-cost analysis of coding techniques for primal transportation problems. J. ACM, 20:194-213, 1973. [340] H. Suzuki, T . Nishizeki, and N. Saito. Algorithms for multicommodity flows in planar graphs. Algorithmica, 4:471-501, 1989. Preliminary version in Proc. 17th Annual ACM Symp. Theory of Computing, pages 195-204, 1985. [341] E. Tardos. A strongly polynomial m i n i m u m cost circulation algorithm. binatorica, 5:247-255, 1985.
Com-
[342] E. Tardos. Improved approximation algorithm for concurrent multi-commodity flows. Technical Report 872, School of Operations Research and Industrial Engineering, Cornell University, 1989. [343] E. Tardos, C. Tovey, and M. Trick. Layered augmented p a t h algorithms. Oper. Res., 11:362-370, 1986. [344] R. E. Tarjan. Data Structures Philadelphia, PA, 1983.
and Network
Algorithms,
chapter 8.
Math.
SIAM,
[345] R. E. Tarjan. A simple version of Karzanov's blocking flow algorithm. tions Res. Lett., 2:265-268, 1984.
Opera-
[346] R. E. Tarjan. Algorithms for m a x i m u m network flow. Math. Prog. Study, 26:1— 11, 1986. [347] R. E. Tarjan. Efficiency of t h e primal network simplex algorithm for the minimum-cost circulation problem. Math. Oper. Res., 16:272-291, 1991. [348] N. Tomizawa. On some techniques useful for solution of t r a n s p o r t a t i o n network problems. Networks, 1:173-194, 1972. [349] J. A. Tomlin. Minimum-cost multicommodity network flows. Operations 14:45-51, 1966.
Res.,
[350] L. E. Trotter, Jr. On the generality of multi-terminal flow theory. Annals Discrete Math., 1:517-525, 1977. [351] K. Truemper. On m a x flows with gains and pure m i n i m u m cost J. Appl. Math., 32:450-456, 1977.
flows.
[352] K. Truemper. O p t i m a l flows in nonlinear gain networks. Networks, 1978.
of
SIAM
8:17-36,
K. Truemper. Max-flow min-cut matroids: polynomial testing and polynomial algorithms for m a x i m u m flow and shortest routes. Math. Oper. Res., 12:72-96, 1987. P. Tseng, D. P. Bertsekas, and J. N. Tsitsiklis. Partially asynchronous, parallel algorithms for network flows and other problems. SIAM J. Control & Optim., 28:678-710, 1990. A. Tucker. A note on t h e convergence of t h e Ford-Fulkerson flow algorithm. Math. Oper. Res., 2:143-144, 1977. P. M. Vaidya. Speeding-up linear programming using fast m a t r i x multiplication, (extended a b s t r a c t ) . In Proc. 30th Annual IEEE Symp. Foundations of Computer Science, pages 332-337, 1989. J. van Leeuwen. Graph algorithms. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, vol. A: Algorithms and Complexity, pages 5 2 5 631. Elsevier Science Publ., A m s t e r d a m , 1990. U. Vishkin. A parallel blocking flow algorithm for acyclic networks. J. rithms, 13:489-501, 1992.
Algo-
S. W . Wallace. Investing in arcs in a network to maximize t h e expected max flow. Networks, 17:87-103, 1987. A. W e i n t r a u b . A primal algorithm to solve network flow problems with convex costs. Manag. Sci., 21:87-97, 1974. R. J. Wittrock. Operator assignment and t h e parametric prefiow problem. Manag. Sci., 38:1354-1359, 1992. R. D. Wollmer. Multicommodity networks with resource constraints: t h e generalized multicommodity flow problem. Networks, 1:245-263, 1972. M. A. Yakovleva. A problem on m i n i m u m transportation cost. In V. S. Nemchinov, editor, Applications of Mathematics in Economic Research, pages 390-399, Izdat. Social'no-Ekon. Lit., Moscow, 1959. N. E. Young, R. E. Tarjan, and J. B. Orlin. Faster p a r a m e t r i c shortest p a t h and m i n i m u m balance algorithms. Networks, 21:205-221, 1991. N. Zadeh. Theoretical efficiency of t h e E d m o n d s - K a r p algorithm for computing m a x i m a l flows. J. ACM, 19:184-192, 1972. N. Zadeh. A bad network problem for t h e simplex m e t h o d and other m i n i m u m cost flow algorithms. Math. Programming, 5:255-266, 1973.
[367] N. Zadeh. More pathological examples for network flow problems. Math. Programming, 5:217-224, 1973. [368] W. I. Zangwill. Minimum concave cost flows in certain networks. Manag. Sci., 14:429-450, 1968. [369] C.-Q. Zhang. Minimum cycle coverings and integer flows. J. Graph Theory, 14:537-546, 1990. [370] U. Zimmermann. Minimization on submodular flows. Discrete Applied Math., 4:303-323, 1982.
Network Optimization Problems, pp. 333-353. Eds. D.-Z. Du and P.M. Pardalos. ©1993 World Scientific Publishing Co.
Tabu Search: Applications and Prospects

Stefan Voß
Technische Hochschule Darmstadt, FB 1 / FG Operations Research, Hochschulstraße 1, D-6100 Darmstadt, Germany
Abstract

Tabu Search is a metastrategy for guiding known heuristics to overcome local optimality. Successful applications of this kind of metaheuristic to a great variety of problems have been reported in the literature. In this paper we consider two applications of tabu search with special emphasis on dynamic tabu list management. Although still in its infancy, recently some implementations of tabu search on parallel computers have come up. Whereas these implementations are tailored to specific problems, we attempt to provide ideas for a more general concept for developing parallel tabu search algorithms.
1 Introduction
Due to the complexity of a great variety of combinatorial optimization problems, heuristic algorithms are especially relevant for dealing with these large scale problems. The main drawback of algorithms such as deterministic exchange procedures is their inability to continue the search upon becoming trapped in a local optimum. This suggests consideration of recent techniques for guiding known heuristics to overcome local optimality. Following this theme, we investigate the application of the tabu search metastrategy for solving combinatorial optimization problems. The first part of this paper considers two specific applications of tabu search to the multiconstraint zero-one knapsack problem and to the quadratic semi-assignment problem. These applications have been performed on a sequential computer. Although usually a very fast method, in real-world applications of e.g. the quadratic semi-assignment problem sometimes very large computation times for tabu search
may be observed. Therefore, the idea of parallelization, as it recently came up for tabu search, too, may be especially relevant in reducing CPU-times for the algorithms under consideration. The key issue in designing parallel algorithms is to decompose the execution of the various ingredients of a procedure into processes executable by parallel processors. Improvement procedures like tabu search or simulated annealing at first glance, however, have an intrinsic sequential nature due to the idea of performing the neighbourhood search from one solution to the next. Therefore, there is not yet a common or generally applicable parallelization of tabu search in the literature. In the second part of this paper we attempt to describe some general ideas and a classification scheme for parallel tabu search algorithms. Following the above framework the paper is organized as follows. In Section 2, we present an outline of tabu search. Section 3 describes our application of tabu search to two combinatorial optimization problems. For the multiconstraint zero-one knapsack problem we report some improvements in comparison to an already known tabu search method. Computational results are reported for a sample of 57 problems known from the literature. With respect to the quadratic semi-assignment problem we review results derived for a real-world application arising in the field of schedule synchronization in public mass transit systems. Before describing some concepts for parallel tabu search algorithms in more detail we briefly discuss the common parallel machine models and algorithms in Section 4. Some examples are given and finally some conclusions are drawn (Section 5). The attempt, of course, is not to give a complete treatment of parallel tabu search but to sketch the potential this area of research carries.
2 Tabu Search
Many solution approaches are characterized by identifying a neighbourhood of a given solution which contains other (transformed) solutions that can be reached in a single iteration. A transition from a feasible solution to a transformed feasible solution is referred to as a move and may be described by a set of one or more attributes. For example, in a zero-one integer programming context these attributes may be the set of all possible value assignments (or changes in such assignments) for the binary variables. Then two attributes e and ē, which denote that a certain binary variable is set to 1 or to 0, may be called complementary to each other. Following a steepest descent/mildest ascent approach, a move may either result in a best possible improvement or in a least deterioration of the objective function value. Without additional control, however, such a process can cause a locally optimal solution to be re-visited immediately after moving to a neighbour. To prevent the search from endlessly cycling between the same solutions, tabu search may be visualized as follows. Imagine that the attributes of all moves are stored in a list, named a running list, representing the trajectory of solutions encountered.
Then, related to a sublist of the running list, a so-called tabu list may be defined. Based on certain restrictions, it keeps some moves, consisting of attributes complementary to those of the running list, which will be forbidden in at least one subsequent iteration because they might lead back to a previously visited solution. Thus, the tabu list restricts the search to a subset of admissible moves (consisting of admissible attributes or combinations of attributes). This hopefully leads to 'good' moves in each iteration without re-visiting solutions already encountered. A general outline of a tabu search procedure (for solving a minimization problem) may be described as follows:

Tabu Search
Given: A feasible solution x* with objective function value z*.
Start: Let x := x* with z(x) = z*.
Iteration:
while the stopping criterion is not fulfilled do
begin
  (1) select the best admissible move that transforms x into x' with objective function value z(x') and add its attributes to the running list;
  (2) perform tabu list management: compute the moves to be set tabu, i.e., update the tabu list;
  (3) perform the exchange: x := x', z(x) := z(x');
      if z(x) < z* then z* := z(x), x* := x endif
endwhile
Result: x* is the best of all determined solutions, with objective function value z*.
(A possible stopping criterion is, e.g., a prespecified time limit.)
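For concreteness, the outline might be sketched in Python roughly as follows. The neighbourhood generator, the attribute representation, and the tabu list management are placeholders to be supplied by a concrete application, and the aspiration level criterion discussed below is omitted; this is only an illustrative skeleton, not the implementation used later in the paper.

    def tabu_search(x0, z, neighbours, update_tabu_list, stop):
        # x0: initial feasible solution; z: objective function (minimized);
        # neighbours(x): yields (move_attributes, transformed_solution) pairs;
        # update_tabu_list(running_list, tabu_list, move): step (2) of the outline;
        # stop(): stopping criterion, e.g. a prespecified time limit.
        x, x_best, z_best = x0, x0, z(x0)
        running_list, tabu_list = [], set()
        while not stop():
            # (1) best admissible move: none of its attributes may be tabu
            #     (the aspiration level criterion discussed below is omitted)
            candidates = [(z(xn), mv, xn) for mv, xn in neighbours(x)
                          if not any(a in tabu_list for a in mv)]
            if not candidates:
                break
            z_new, mv, x = min(candidates, key=lambda c: c[0])
            running_list.extend(mv)
            # (2) tabu list management
            tabu_list = update_tabu_list(running_list, tabu_list, mv)
            # (3) keep track of the best solution found so far
            if z_new < z_best:
                x_best, z_best = x, z_new
        return x_best, z_best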
For a background on tabu search and a number of references on successful applications of this metaheuristic see, e.g., Glover (1989, 1990), Domschke et al. (1992), and Glover and Laguna (1992).

Tabu List Management

Tabu list management concerns updating the tabu list, i.e., deciding on how many and which moves have to be set tabu within any iteration of the search. Up to now, usually static methods have been applied in the literature, such as the tabu navigation method (TNM). In TNM, single attributes are set tabu as soon as their complements have been part of a selected move. The attributes stay tabu for a distinct time, i.e. number of iterations, until the probability of causing a re-visit of a solution is small. The efficiency of the algorithm depends on the choice of the tabu status duration, i.e. the length tl_size of the underlying tabu list. (In the literature often a 'magic' tl_size=7 is proposed.) For the sake of improved effectiveness, a so-called aspiration level criterion
is considered, which permits the choice of an attribute even when it is tabu. This can be advantageous when a new best solution may be calculated, or when the tabu status of the attributes prevents any feasible move. The static approach, though successful in some applications, seems to be a rather limited one. Another, probably more fruitful, idea is to define an attribute as being potentially tabu if it belongs to a chosen move and to handle it in a candidate list first. Via additional criteria these attributes can be definitely included in the tabu list if necessary, or excluded from the candidate list if possible. Therefore, the candidate list is an intermediate list between a running list and a tabu list. Glover (1990) suggests the use of different candidate list strategies in order to avoid extensive computational effort without sacrificing solution quality. In the sequel, we describe the following dynamic strategies for managing tabu lists: the cancellation sequence method (CSM, in a revised version, cf. Dammeyer et al. (1991)) and the reverse elimination method (REM). CSM as well as REM use additional criteria for setting attributes tabu. The primary goal is to permit the reversion of any attribute but one between two solutions, to prevent re-visiting the older one. To find those critical moves, CSM needs a candidate list that contains the complements of attributes being potentially tabu. This active tabu list (ATL) is built like the running list, where elimination of certain attributes is furthermore permitted. Whenever an attribute of the last performed move finds its complement on the ATL, this complement will be eliminated from the ATL. All attributes between the cancelled one and its recently added complement build a cancellation sequence separating the actual solution from the solution that has been left by the move that contains the cancelled attribute. Any attribute but one of a cancellation sequence is allowed to be cancelled by future moves. This condition is sufficient but not necessary, and some aspects have to be taken into account so that CSM works well.

• Making a single attribute tabu prevents many moves which could lead to yet unvisited solutions. (An attribute becomes tabu if its complement is the only attribute of a cancellation sequence. An attribute becomes tabu for one iteration if its complement is the most recent attribute of the ATL. Otherwise a cancellation sequence could not be defined between these two attributes.)

• For building a cancellation sequence, the remaining attributes of the older and the current move are not necessarily taken into consideration. This depends on the order in which the move's attributes are added to the ATL.

• Those attributes of a move that did not cancel another attribute within a specific cancellation sequence are disregarded when making its last remaining attribute tabu (although they separate two solutions).

Whenever a cancellation sequence includes a smaller one, the smaller sequence is said to dominate the larger.
Then the larger cancellation sequence may be disregarded, because any of its attributes will only become tabu if they are within the smaller sequence, too. The above mentioned aspects work well for the case that a move consists of exactly one attribute, i.e., when so-called single-attribute moves are considered instead of multi-attribute moves. In addition, the corresponding parameters have to be chosen appropriately (e.g. the tabu duration of a tabu attribute, and how to apply the aspiration level criterion). Applying CSM to multi-attribute moves needs additional criteria to prevent errors caused by uncovered special cases. E.g., for paired-attribute moves (moves consisting of exactly two attributes) those moves must be prohibited that may cancel a cancellation sequence consisting of exactly two attributes (because none of them is tabu when choosing a move). The conditions of TNM and CSM need not be necessary to prevent re-visiting previously encountered solutions. Necessity, however, can be achieved by REM. The idea of REM is that any solution can only be re-visited in the next iteration if it is a neighbour of the current solution. Therefore, in each iteration the running list will be traced back to determine all moves which have to be set tabu (since they would lead to an already explored solution). For this purpose, a residual cancellation sequence (RCS) is built up stepwise by tracing back the running list. In each step exactly one attribute is processed, from last to first. After initializing an empty RCS, only those attributes are added whose complements are not in the sequence; otherwise their complements in the RCS are eliminated (i.e. cancelled). Then at each tracing step it is known which attributes have to be reversed in order to turn the current solution back into one examined at an earlier iteration of the search. If the remaining attributes in the RCS can be reversed by exactly one move, then this move is tabu in the next iteration. For single-attribute moves, for instance, the length of an RCS must be one to enforce a tabu move. Obviously, the execution of REM represents a necessary and sufficient criterion to prevent re-visiting known solutions. Since the computational effort of REM increases with the number of iterations, ideas for reducing the number of computations have been developed (cf. Glover (1990) and Dammeyer and Voß (1991a)).
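A compact sketch of one REM tracing pass, assuming single-attribute moves, might look as follows; complement(a) is an application-supplied function returning the attribute that reverses a. The sketch only illustrates the residual cancellation sequence described above, not the authors' implementation.

    def rem_tabu_moves(running_list, complement):
        # One REM tracing pass over the running list (newest attribute first),
        # assuming single-attribute moves.  complement(a) is the attribute that
        # reverses a.  Returns the attributes that are tabu in the next iteration.
        rcs = set()        # residual cancellation sequence
        tabu = set()
        for a in reversed(running_list):
            if complement(a) in rcs:
                rcs.remove(complement(a))     # cancel
            else:
                rcs.add(a)
            if len(rcs) == 1:
                # exactly one attribute separates the current solution from an
                # earlier one, so reversing it would re-visit that solution
                (b,) = rcs
                tabu.add(complement(b))
        return tabu

For the REMt diversification discussed below, the test len(rcs) == 1 would be relaxed to len(rcs) == t.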
Search Intensification and Search Diversification

A general idea for reducing the computational effort in a tabu search algorithm is that of search intensification using a so-called short term memory (cp. the expression intermediate term memory in Glover (1989)). Its basic idea is to observe the attributes of all performed moves and to eliminate those from further consideration that have not been part of any solution generated during a given number of iterations. This results in a concentration of the search where the number of neighbourhood solutions in each iteration, and consequently the computational effort, decreases. Obviously the cost of this reduction can be a loss of accuracy. Correspondingly, a search diversification may be defined as a long term memory
to penalize often selected assignments. Then the neighbourhood search can be led into not yet explored regions where the tabu list operation is restarted (resulting in an increased computation time). An appealing opportunity for search diversification is created by REM. Let t ≥ 1 be an integer. If at any tracing step the attributes that have to be reversed to turn the current solution back into an already explored one equal exactly t moves, then it is possible to set these moves tabu for the next iteration. Note that for the case of multi-attribute moves, due to the various combinations of attributes into moves, even more than t moves may be set tabu in order to avoid different paths through the search space leading to the same solution. Accordingly, search diversification is obvious. We highlight the case of t = 2 as a new stand-alone method called REM2, since in the same number of iterations a larger number of solutions is encountered than with REM. To have t = 2 means that all common neighbours of the current solution and of an already explored one are forbidden. These neighbours were implicitly investigated during a former step of the procedure (due to the choice of a best non-tabu neighbour) and need not be looked at again. Therefore, REM2 retains the nice property of being a necessary and a sufficient criterion as mentioned above, without the fear that some solutions would not be encountered, as may be the case for t > 2. Nevertheless, from a computational point of view REMt for t ≥ 3 may be advantageous. However, as also with REM or REM2, black hole suboptima may still occur when all moves to neighbourhood solutions are tabu. For applications and (sequential) comparisons of TNM, CSM, and REM see Dammeyer and Voß (1991b) and Domschke et al. (1992).
3 Applications

In this section we report on two applications of tabu search. The first is the multiconstraint zero-one knapsack problem, an example of a problem where single-attribute moves are performed. The second example is the quadratic semi-assignment problem, a representative for the use of paired-attribute moves.
Multiconstraint Zero-One Knapsack Problem

The multiconstraint zero-one knapsack problem (MCKP) is a special case of general zero-one programming with a great variety of applications in the areas of, e.g., resource allocation and capital budgeting. Given are n objects with positive profits c_j and m resources with nonnegative resource consumption values a_{ij} and positive limitations b_i. With binary decision variables x_j the problem may be stated as follows:

Maximize   Z(x) = \sum_{j=1}^{n} c_j \cdot x_j        (1)
subject to

\sum_{j=1}^{n} a_{ij} \cdot x_j \le b_i,   i = 1, \ldots, m        (2)

x_j \in \{0, 1\},   j = 1, \ldots, n        (3)
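Written out in code, the model (1)-(3) amounts to the following two functions; the argument names c, a, b, x simply mirror the notation above.

    def mckp_objective(c, x):
        # Objective (1): total profit of the packed objects.
        return sum(cj * xj for cj, xj in zip(c, x))

    def mckp_feasible(a, b, x):
        # Constraints (2) and (3): every resource limit b[i] is respected
        # (x is assumed to be a 0/1 vector).
        n = len(x)
        return all(sum(a[i][j] * x[j] for j in range(n)) <= b[i]
                   for i in range(len(b)))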
Various algorithms for solving this NP-hard problem have been proposed in the literature. Here we refer to Drexl (1988), who developed an efficient simulated annealing algorithm, and to Dammeyer and Voß (1991a) for an implementation of REM and a comparison with simulated annealing. To have a comparative study, 57 test problems with known optimal solutions were taken as a reference in both papers. The data for these studies, with n varying from 6 to 105 and m from 2 to 30, are fully reproduced in Freville and Plateau (1982, 1990). Dammeyer and Voß (1991a) show that simulated annealing may be outperformed by REM with respect to various criteria. Here we use their implementation of different versions of REM and compare them with the results of REM2 and REM3 as defined in Section 2. Despite the modification caused by this diversification approach, all parameter settings are identical. We report some of the details (see Dammeyer and Voß (1991a) for a complete description). For the purpose of finding a neighbourhood solution within tabu search the following transformation is defined:

Given: A feasible solution x = (x_1, \ldots, x_n). Choose j* = arg max{ a_{i*j} / c_j | x_j = 1, j = 1, \ldots, n }, where i* is a bottleneck resource, i.e., i* := arg max{ \sum_{j=1}^{n} a_{ij} \cdot x_j | i = 1, \ldots, m }, and drop element j* (i.e., set x_{j*} = 0). Then, in a while-loop, elements k* are added (ADD-attributes) according to an ADD-criterion as long as feasibility is maintained.
This so-called DROP/ADD-move may be considered as a multi-attribute move with variable length depending on the specific instance of the MCKP. Any such move consists of exactly one DROP-attribute j* and a variable number of ADD-attributes according to the choice of elements k* given in the while-loop. Our implementation considers these moves as successive multi-attribute moves, i.e., every move is regarded as a number of separate single-attribute moves. In any iteration the number of traces to be performed, i.e. the number of times an RCS is built, is equal to the number of attributes of the corresponding move. For each of the 57 test problems, 20 initial feasible solutions have been generated. Starting with x = (0, \ldots, 0) for the first feasible solution, we added elements according to the ADD-criterion as applied in the DROP/ADD-move as long as possible. Nine further solutions have been gained from x = (0, \ldots, 0) by adding randomly chosen elements as long as possible. In the same way we proceeded with an initialization of x = (1, \ldots, 1): nine solutions have been obtained by randomly dropping elements
until feasibility was achieved, and the tenth by applying a DROP-criterion inverse to the ADD-criterion. Given any of the 20 initial feasible solutions, the algorithms terminate whenever there is no improvement of the best feasible solution within a certain number a of iterations. Starting from an initialization a = n, this number is increased over time by a factor of 1.1. An additional overall stopping criterion of 10 · n iterations did not affect the termination of tabu search. The number of tracing steps for building an RCS was limited to 4 · n. Table 1 gives a summarized description of our results. (All programs are implemented in PASCAL and run on an IBM PS/2 70, 386 personal computer.) The first rows show a comparison of REM (t = 1), REM2, and REM3 as described above. Then the influence of applying the short and long term memory is analyzed in the next rows. In the short term memory (STM) an element is eliminated from further consideration if it has not been included in any solution examined during a iterations. For a combined long and short term memory (L+STM) the following modification is used: whenever STM stops, a new starting solution is obtained by choosing from those elements that have previously been eliminated according to STM, and the algorithm is restarted. This procedure is repeated no more than ten times, as long as a new starting solution can be found. Correspondingly, in the long term memory (LTM) new starting solutions are obtained from those elements that have not been in any solution for a certain number of iterations. The first two columns of Table 1 show the number of optimal solutions found, referring to the 57 test problems and referring to all instances (i.e. out of 20 · 57 = 1140). Correspondingly, the average deviation from optimality is given with respect to the best found solution out of the 20 instances over the 57 test problems and referring to all 1140 instances. The average number of moves gives the average number of neighbourhood exchanges needed with respect to the best found solution over all 57 test problems. Finally, the average CPU-time referring to all 1140 instances is given. All specified methods behave in nearly the same way. Concerning solution quality, the inclusion of STM into REMt (t = 1, 2, 3) gives only slightly worse results, but with a remarkable decrease in CPU-times. The most astonishing entries in Table 1 are the average numbers of moves needed to find the best solution, referring to the most successful out of the sample of 20 trials. REMt with L+STM leads to improved results but with significantly increased CPU-times. If REMt is applied with LTM instead of L+STM, the solution quality is slightly affected in some test problems, with mostly increased CPU-times. The modifications of REM proposed in this paper lead to improvements in solution quality with only slightly affected CPU-times. Although the average number of necessary neighbourhood exchanges increases, the CPU-times over all instances may even decrease because the stopping criterion becomes active when no further improvements are found in a certain number of iterations. As a recommendation we conclude that REM2 should replace REM whenever this kind of dynamic tabu list management is used.
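As an illustration of the ADD-criterion start solution described above, a rough sketch might look as follows. The definition of the bottleneck resource as the one with the highest relative load is an assumption made for this sketch, not necessarily the authors' exact rule.

    def greedy_add_start(c, a, b):
        # Greedy ADD start solution: beginning from x = (0,...,0), repeatedly add
        # the not yet packed object j with the smallest ratio a[i*][j]/c[j], where
        # i* is the current bottleneck resource (taken here as the resource with
        # the highest relative load -- an assumption of this sketch), as long as
        # feasibility is preserved.
        n, m = len(c), len(b)
        x = [0] * n
        while True:
            load = [sum(a[i][j] * x[j] for j in range(n)) for i in range(m)]
            i_star = max(range(m), key=lambda i: load[i] / b[i])
            fits = [j for j in range(n) if x[j] == 0
                    and all(load[i] + a[i][j] <= b[i] for i in range(m))]
            if not fits:
                return x
            x[min(fits, key=lambda j: a[i_star][j] / c[j])] = 1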
    algorithm          t    optimal solutions found     avg. deviation from opt. (%)    avg. no. of moves    avg. CPU-time
                            ref. 57    all instances    ref. 57    all instances        (ref. 57)            (all instances)
    REM                1       40          283           0.126        3.483                 11                   4.85
                       2       43          302           0.117        2.925                 17                   5.30
                       3       44          223           0.088        2.681                 18                   4.97
    REM with STM       1       39          268           0.130        4.207                  9                   2.99
                       2       43          261           0.119        3.904                 12                   2.89
                       3       41          210           0.113        3.871                 17                   2.69
    REM with L+STM     1       44          591           0.101        0.558                 28                  14.59
                       2       48          617           0.073        0.504                 36                  19.33
                       3       49          605           0.061        0.491                 65                  19.90
    REM with LTM       1       45          519           0.095        0.646                 42                  33.50
                       2       47          677           0.097        0.462                 48                  32.31
                       3       50          633           0.064        0.494                 66                  30.71

Table 1: Numerical results for MCKP

Further significant improvements for all versions of REM are still possible, e.g., when increasing the factor for modifying a up to 5. The average deviation values for REM2 with LTM decrease to 0.062 and 0.317, with 769 out of 1140 instances solved to optimality, however, with a significant increase in CPU-times.
Quadratic Semi-Assignment Problem

Assigning items to sets such that a quadratic function is minimized may be referred to as the quadratic semi-assignment problem (QSAP). This problem is in fact a relaxed version of the well-known quadratic assignment problem (QAP) and may be represented in a mathematical model as follows. Given are sets A = \{1, \ldots, m\} and B = \{1, \ldots, n\} and a (not necessarily symmetric) cost matrix (c_{ihjk}). With binary variables

x_{ih} = 1 if h \in B is assigned to i \in A, and x_{ih} = 0 otherwise,

we get the model:

Minimize   Z(x) = \sum_{i=1}^{m} \sum_{h=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{n} c_{ihjk} \cdot x_{ih} \cdot x_{jk}        (4)
subject to

\sum_{h=1}^{n} x_{ih} = 1,   i = 1, \ldots, m        (5)

x_{ih} \in \{0, 1\},   i = 1, \ldots, m,  h = 1, \ldots, n        (6)
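In code, the objective (4) and the paired-attribute move used below (changing the element of B assigned to one element of A) can be written as follows; representing a solution by a list assign with assign[i] = h automatically satisfies constraints (5) and (6).

    def qsap_objective(c, assign):
        # Objective (4), where assign[i] = h encodes x_ih = 1 and c[i][h][j][k]
        # is the cost coefficient c_ihjk; constraint (5) holds by construction.
        m = len(assign)
        return sum(c[i][assign[i]][j][assign[j]]
                   for i in range(m) for j in range(m))

    def move_value(c, assign, i, h_new):
        # Value after the paired-attribute move that drops the assignment
        # (i, assign[i]) and adds (i, h_new).
        trial = list(assign)
        trial[i] = h_new
        return qsap_objective(c, trial)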
The QSAP has been formulated in the literature by several authors with respect to different application areas like floor layout planning, certain median problems with mutual communication, and the problem of schedule synchronization in public transit networks (see, e.g., Dutta et al. (1982), Klemt and Stemme (1988), and Chhajed and Lowe (1992)). The problem also arises in certain scheduling problems where the deviation from due dates is penalized by a quadratic function. Here we focus on a real-world application in schedule synchronization in public mass transit networks, where the objective is to minimize the total transfer waiting time of passengers in a mass transit system, expressed as the sum of individual waiting times within given operation hours (cf. Domschke (1989), Voß (1990), and Domschke et al. (1992)). Let there be a number of m lines or routes (Fig. 1 shows an example of a transit network with m = 3; note that a line is defined for one direction only). With each route i we associate a set N(i). Given a cycle time t_i (in time units, e.g. minutes) for route i, N(i) = \{1, \ldots, t_i\} is a node set with each node giving a specific departure time within the cycle time. The departure time is the starting time of route i at its first station, so that all arrival and departure times may then easily be calculated, resulting in a complete time table. All routes have to be scheduled such that the above objective is minimized. Fixing the starting times corresponds to choosing exactly one representative from each set such that the sum of all arc weights of the subgraph induced by these nodes is minimal. Sets A and B denote traffic lines and possible departure times, respectively. The problem size may be referred to as 'number of lines × cycle time' with identical cycle times for all lines. So, (4)-(6) is a suitable model for schedule synchronization. In what follows we relate specific issues of tabu search to the QSAP. A transition from one feasible solution to another one needs two exchanges within the binary matrix (x_{ih})_{m×n} such that conditions (5) remain valid. Accordingly, moves are paired-attribute moves denoting both the assignments and the type of exchange, i.e., selection for or exclusion from the (actual) solution. In more detail, each move is described by two attributes of different type belonging to assignments between one element of set A and two different elements of set B. We explain the neighbourhood search by an example (cf. Domschke et al. (1992)): Consider a QSAP with A = \{1, 2, 3\}, B = \{1, 2, 3\}, and a matrix of symmetric cost coefficients given in Table 2. Fig. 1 shows an underlying transit network with three routes r = 1, 2, 3 corresponding to the elements of set A. Three possible departure times t = 1, 2, 3 are assumed for each route, corresponding to the elements of set B. Fig. 2 visualizes in three different submatrices the objective function values (multiplied by 0.5 because of the symmetry of the problem) for all 3^3 = 27 feasible solutions.
Figure 1: Transit network (routes 1, 2, and 3).

Table 2: Cost matrix.

Figure 2: Example for TNM (submatrices for r1 → t1, r1 → t2, and r1 → t3).
In a figurative sense they represent the layers of a three-dimensional solution cube, each of them with a fixed assignment of route 1 to a specific departure time. As starting solution we have a local optimum solution (1,3,1) obtained by choosing route 1 to start at time 1 (i.e. assignment r1 → t1), route 2 to start at time 3 (r2 → t3), and route 3 to start at time 1 (r3 → t1): x_{11} = x_{23} = x_{31} = 1 and x_{rt} = 0 otherwise. This solution has an objective function value of 6 (cf. Fig. 2), which is calculated from the framed entries of the matrix in Table 2. In addition, Fig. 2 shows all six neighbourhood solutions of (1,3,1) by entries with upper indices. The first and the second neighbourhood solution may be derived by varying the times of route 1 with fixed times of routes 2 and 3. The third and fourth may be derived by varying the times of route 2 with fixed times of routes 1 and 3, and the fifth and sixth by varying route 3, correspondingly. Changing to (2,3,1) is best possible (in the sense that no better neighbourhood solution can be found). The corresponding move may be described as (11,12), with the attribute 11 indicating that the binary variable x_{11} becomes 0 and the attribute 12 indicating that the variable x_{12} gets the entry 1. For applying TNM we assume a tabu list of length tl_size=4. (Note that TNM is very sensitive with respect to tl_size, since tl_size=2 would lead to cycling in our example.) As the choice of the first move is not restricted by tabu attributes, the move (11,12) is selected, resulting in (2,3,1). By this the complements of the corresponding attributes are stored in the tabu list, preventing any exchange of the departure time of route 1 as long as they stay tabu. (Note that for TNM it is not necessary to store the attributes rt when x_{rt} gets the entry 1; the procedure becomes more transparent, however, when describing the moves in more detail.) Thus, in the second iteration only four moves are allowed, leading to solutions (2,1,1), (2,2,1), (2,3,2), and (2,3,3), respectively, of which (23,22) is selected, increasing the objective function value less than the other allowed moves. The tabu list is updated and now only allows moves according to route 3: (31,32) and (31,33). The better of these two exchanges results in solution (2,2,3). As the new attributes become tabu, the oldest tabu attributes are freed again. TNM for this small example continues in a very restricted manner (which is caused by the relation of problem size to tl_size) and finally reaches the optimal solution (3,1,3) in the fifth iteration. In Fig. 2 the trajectory of all performed moves is presented with arrows, and Table 3 shows all necessary statistics of the five iterations. In Domschke et al. (1992) improvement procedures for the QSAP are compared. Initial feasible solutions are calculated either randomly or with different versions of a regret heuristic extended by a 2-optimal exchange procedure. To sum up the main results, there is not much difference between the three tested tabu search methods in solution quality if parameters are chosen well. TNM asks for the most exact determination of tl_size, whereas CSM seems to be more robust with respect to nonoptimal parameters. CSM proves to be most independent of the starting solution quality, too. REM performs slightly worse but steadily improves a given solution with increasing CPU time. It is the method that best prevents re-visiting solutions, which can easily be guaranteed by choosing suitable parameters.
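The tabu list bookkeeping of this example can be reproduced in a few lines of Python; the move sequence is the one reported in Table 3 below, and the fixed-length list is modelled by a deque of length tl_size=4.

    from collections import deque

    tl_size = 4
    tabu = deque(maxlen=tl_size)       # the two oldest attributes drop out automatically

    # moves of the example, iteration by iteration: (dropped, added) assignment
    moves = [(11, 12), (23, 22), (31, 33), (12, 13), (22, 21)]
    for dropped, added in moves:
        tabu.append(added)             # dropping the newly chosen departure time ...
        tabu.append(dropped)           # ... or re-adding the old one is forbidden
        print(list(tabu))              # reproduces the tabu lists of Table 3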
    iteration    move       solution    tabu list (tabu attributes)
                            (1,3,1)
    1            (11,12)    (2,3,1)     12, 11
    2            (23,22)    (2,2,1)     12, 11, 22, 23
    3            (31,33)    (2,2,3)     22, 23, 33, 31
    4            (12,13)    (3,2,3)     33, 31, 13, 12
    5            (22,21)    (3,1,3)     13, 12, 21, 22

Table 3: Example (TNM iterations for the QSAP instance above)

Based on the procedures developed by Domschke et al. (1992), additional computational testing on large scale real-world problems has been performed, including REM2 as proposed in Section 2 (as well as sequential testing of the look ahead method proposed in Section 4). The data represent schedule synchronization problems from three German cities with 14, 24, and 27 routes operating in two directions each (i.e. m ≤ 54). A modification with respect to the above mentioned problem description concerns different cycle times for different routes as well as variable cycle times with 10 ≤ t_i ≤ 80 minutes over the day. The results are quite astonishing in the sense that an analysis of the case studies reveals very specialized data, i.e., the underlying traffic networks have a special shape. In one case it might be characterized as a star network with a central station in the downtown part of the respective city. In the other two cases different lines partly use the same tracks, implying that security distances have to be observed. Motivated by the modelling process above, we may still use the QSAP model with slight modifications. This is done as follows (cf. Voß (1990)):

• For any two lines using the same tracks, calculate those combinations of departure times which lead to superposition on the commonly used tracks, and define all weights in the QSAP corresponding to those combinations to be ∞.

The special structure of the problems leads to the following results. Nearly in all cases the different tabu search methods are able to find only slight improvements over the initial solutions obtained with the regret and the simple 2-optimal exchange procedures. Even the modifications of REM mentioned above do not give any additional reasonable improvements. Careful analysis of the smallest of the three examples gives us some reasoning (based on the branch and bound approach of Domschke (1989)) that the tabu search methods do not fail but that the initial feasible solutions are close to the optimum or even equal to it. In addition, even local optima close to the optimum have a great number of neighbourhood solutions, each with the same objective function value, such that additional testing, which is still under way,
has to deal with some more sophisticated diversification structure, like a combination of REMt for larger t and the long term memory. In addition, another single-attribute approach could be tested, with the attributes corresponding to objective function values. With respect to the real-world application, however, the results obtained by our algorithms (including tabu search) are quite satisfactory and promising.
4 Concepts for Parallel Tabu Search

Parallel Machine Models

Over the years a great variety of architectures have been proposed for parallel computing. The most widely known classification of parallel machine models (although somewhat limited) is given by Flynn (1966). He distinguishes four general classes based on whether single or multiple instruction streams are executed on either one or multiple data streams:

• SISD (Single Instruction, Single Data), including the classical sequential computers

• SIMD (Single Instruction, Multiple Data), including vector computers and array processors

• MISD (Multiple Instructions, Single Data)

• MIMD (Multiple Instructions, Multiple Data), with the processors performing each successive set of instructions either simultaneously (synchronous) or independently (asynchronous)

The above classification of parallel machine models may lead to different classes of parallel algorithms. Vectorized algorithms operate uniformly on vectors of data sets (SIMD). Systolic ones operate rhythmically on streams of data sets (SIMD and synchronous MIMD). Parallel processing algorithms operate on a set of synchronously communicating parallel processors (synchronous MIMD). Correspondingly, asynchronous communication leads to distributed processing algorithms (asynchronous MIMD and neural networks). In addition to architectural aspects, communication networks are used to classify parallel machine models. For instance, it makes a difference whether processors have simultaneous access to a shared memory, allowing communication between two arbitrary processors in constant time, or whether they communicate through a fixed interconnection network. Less formally, in certain models it is assumed that there is a master processor controlling the communication of the network, with the remaining processors of the network called slaves. For a comprehensive survey on parallel machines and algorithms see e.g. Akl (1989) and Van Leeuwen (1990).
The quality of parallel algorithms may be judged by a number of quantities, the most important one being the speedup, which is the running time of the best sequential implementation of the algorithm divided by the running time of the parallel implementation executed on p processors. Similarly, given a prespecified time limit (cf. the stopping criterion in Section 2), a scaleup may be defined as the ratio of the average problem sizes solvable with a parallel implementation to those solvable with a sequential implementation of the algorithm. With heuristics, the attainable solution quality may also be measured. The processor utilization or efficiency is the speedup divided by p. The best one can achieve is a speedup of p and an efficiency equal to one.
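In code these quantities are simple ratios; the function and argument names are illustrative only.

    def speedup(t_sequential, t_parallel):
        # Running time of the best sequential implementation divided by the
        # running time of the parallel implementation.
        return t_sequential / t_parallel

    def efficiency(t_sequential, t_parallel, p):
        # Processor utilization: speedup divided by the number of processors p.
        return speedup(t_sequential, t_parallel) / p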
Parallel Tabu Search Algorithms

Due to the success and the underlying simplicity of the main idea of tabu search, some implementations on parallel computers, tailored to specific problems, have recently come up. Surprisingly, to the best of our knowledge, they are solely devoted to problems using the notion of paired-attribute moves: the travelling salesman problem (see Malek et al. (1989) and Fiechter (1990)), the job shop problem (see Taillard (1989)), and the quadratic assignment problem (see Chakrapani and Skorin-Kapov (1991, 1992), Taillard (1991)). In a first step we shall describe a classification of different types of parallelism that is applicable to most iterative search techniques. Its basis is the idea of having different starting solutions (so-called balls, motivated by the picture of a mountain-like solution space in which a ball rolls until it finds a stable low-altitude state) as well as a number of different strategies, e.g. based on various possibilities of the parameter setting or on the tabu list management described in Section 2.

• SBSS (Single Ball, Single Strategy): The algorithm starts from exactly one given feasible solution and performs its moves following exactly one strategy.

• SBMS (Single Ball, Multiple Strategies): The algorithm starts from exactly one given feasible solution by the use of different strategies, where each strategy is performed on a different processor.

• MBSS (Multiple Balls, Single Strategy): The algorithm starts from different initial feasible solutions, each on a different processor. The same type of instruction, i.e. strategy, is performed on each processor.

• MBMS (Multiple Balls, Multiple Strategies): The algorithm starts from different initial feasible solutions performing different strategies.
In what follows we discuss the above ideas in more detail with special emphasis on further principles of parallelism within specific strategies. For ease of description we assume the notion of parallel or distributed processing algorithms.
SBSS

The single ball, single strategy idea is the simplest version and obviously corresponds to the idea of classical sequential computations (cf. the SISD model). This, however, does not rule out parallelization. Starting from an initial feasible solution, the best move which is not tabu must be performed. The search for this move may be done in parallel by decomposing the set of admissible moves into a number of subsets. E.g., in a master-slave architecture each (slave) processor may evaluate the best move in a specific subset. The best move of each subset is communicated to the master, who picks the overall best as the transformed solution and also performs the tabu list management. To restrict the amount of communication necessary for synchronizing the data, each slave could determine the best possible move in its subset without observing any tabu list, while the tabu list is updated by the master at the same time. Then the master picks among all answers the best one which is not tabu. If no such move exists, a second trial must be made in which each processor has to receive and to observe the tabu list. Otherwise the next iteration is to be performed. Additional ideas may be developed with respect to the specific strategies. In TNM, the tabu list management may be done by each processor itself by simply providing the most recent move (whose complement will be in the list). In CSM, the master builds the cancellation sequences and partitions them among the slaves, i.e., every slave has to evaluate a certain number of sequences. In subsequent iterations, the attributes of the current moves are communicated. Whenever a cancellation sequence is reduced to length 1 it will be re-communicated to the master.
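A minimal sketch of this master-slave decomposition, using a Python process pool in place of the slave processors; evaluate, is_tabu, and the move subsets are assumed to be picklable, module-level objects, and the second communication round mentioned above is not shown.

    from concurrent.futures import ProcessPoolExecutor

    def best_in_subset(task):
        # Slave task: best move within one subset of the neighbourhood; as
        # suggested above, no tabu list is consulted at this stage.
        evaluate, subset = task
        best = min(subset, key=evaluate, default=None)
        return None if best is None else (evaluate(best), best)

    def parallel_best_move(evaluate, move_subsets, is_tabu, workers=4):
        # Master: collect the subset winners and pick the overall best move that
        # is not tabu.
        tasks = [(evaluate, subset) for subset in move_subsets]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            winners = [w for w in pool.map(best_in_subset, tasks) if w is not None]
        admissible = [(value, move) for value, move in winners if not is_tabu(move)]
        return min(admissible, key=lambda vm: vm[0]) if admissible else None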
SBMS

In SBMS each processor executes a process which is one of the above tabu search strategies with different tabu conditions and parameters, like e.g. REMt for various t. For TNM this can be different (possibly randomly modified) tabu list lengths; for CSM, different tabu durations may be considered. The (slave) processors are halted after a prespecified time, the results are compared, and the best one is determined. A restart is possible with the best or a good seed solution. Each strategy may take a different path through the search space because of different tabu list management or parameter settings. A restart may be performed either with empty running and tabu lists or with a previously encountered list.
MBSS

The multiple balls approaches start from at most p (the number of processors available) different initial feasible solutions, whose calculation can vary. They may be determined either randomly or by applying different heuristics to the same problem. This may also incorporate ideas involving the different diversification and intensification strategies described above. A third possibility assumes one given feasible solution and starts with a suitable subset of its transformed (neighbourhood) solutions. (Especially with REM2 it may be assured that even in future iterations there is no overlap with the initial feasible solutions of the other processors.) The single strategy approach assumes the application of exactly one tabu search algorithm with the same parameter setting for all processors. As with SBMS, the processes may be halted after a specific time period to coordinate their results and possibly to initiate a restart with new (hopefully improved) solutions. If the processes are performed synchronously, then the stopping may be initiated after having generated, say, m successive moves. On synchronous MIMD machines the latter approach may be especially relevant. Note that the above-mentioned possibility of parallelization within SBSS is related to a method with m = 1 where the best transition is evaluated. With respect to MBSS, this modifies to the evaluation of the p best moves usable for a restart. For m ≥ 2 this approach may be used as a look ahead method, and is especially helpful when evaluating a region of a solution space in which almost all neighbours have identical objective function values.
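The MBSS coordination might be sketched as follows; tabu_search stands for a single-argument, picklable wrapper around the sequential procedure of Section 2 that maps a start solution to a (solution, value) pair, and the restart-from-the-best-seed policy is a deliberate simplification.

    from concurrent.futures import ProcessPoolExecutor

    def mbss(tabu_search, start_solutions, rounds=3, workers=4):
        # Multiple balls, single strategy: run the same tabu search wrapper from
        # several start solutions, halt the runs after each round, keep the best
        # (solution, value) pair found, and restart every processor from that seed.
        best = None
        balls = list(start_solutions)
        for _ in range(rounds):
            with ProcessPoolExecutor(max_workers=workers) as pool:
                results = list(pool.map(tabu_search, balls))
            round_best = min(results, key=lambda r: r[1])
            if best is None or round_best[1] < best[1]:
                best = round_best
            balls = [round_best[0]] * len(balls)     # simplistic restart policy
        return best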
MBMS

The multiple balls, multiple strategies approach subsumes all previous classes, allowing a search of the solution space from different starting points with different methods or parameter settings.
Examples

In the sequel we sketch some of the ideas given in the previous sections with respect to well-known combinatorial optimization problems. Surprisingly, as mentioned above, we only found work on problems where paired-attribute moves are used to perform the neighbourhood search. Therefore, we start with binary integer programming, exploiting single-attribute moves, as is the case for the MCKP described in Section 3. Consider the SBSS concept. Also consider n decision variables in a binary problem with no (implicit or explicit) restriction on the number of variables set to either 1 or 0. We may define simple ADD- or DROP-moves by complementing the corresponding entries of the binary variables x_i. Assume the existence of n + 2 processors, with n + 2 being the master processor. The tabu list management is performed by processor n + 1. In any iteration of the search, each of the synchronously controlled processors
i ∈ {1, \ldots, n} receives the information which variable's entry has been chosen to be exchanged as the most recent move. This move is performed together with the reversion of x_i. This usually can be done quite efficiently by reconstructing the previous solution stored at i with at most one assignment complemented. Then i offers its objective function value to the master, who recalls all results of processors referring to non-tabu moves (evaluated by processor n + 1). Obviously this approach may be generalized in various ways to the more general classes described above. This concept may be applied, for instance, to the MCKP, to the well-known warehouse location problem, and to Steiner's problem in graphs. If the number or weighted number of variables with value 1 is limited (as for the MCKP) or fixed (as e.g. in the p-median problem), then the same approach may be applied with combined ADD/DROP- or SWAP-moves leading to paired-attribute moves. Malek et al. (1989) follow the SBMS approach to solve travelling salesman problems (TSP) by TNM with 2-opt exchanges as moves. The tabu attributes follow different strategies in that they are restricted either to one or to the two cities that have been swapped, or to the cities and their respective positions in the tour. In addition, different tabu parameters were used on different processors. For another parallel tabu search algorithm for the TSP see Fiechter (1990). The quadratic assignment problem (QAP) is treated by Chakrapani and Skorin-Kapov (1991, 1992) by the use of SBSS and TNM with search intensification and search diversification performed sequentially while evaluating the moves in parallel. The set of moves is partitioned into disjoint subsets, each one on a different processor as described above. The neighbourhood search is performed by pairwise interchanges such that with O(n^2) processors available all moves can be evaluated in constant time, achieving a speedup of O(n^2 / log n). Battiti and Tecchiolli (1992) use TNM together with a hashing function and compare their algorithm also with a parallel genetic algorithm. Another parallel algorithm for the QAP based on TNM (with randomly varying tl_size) has been presented by Taillard (1991). It is an SBSS approach, too. The same idea has also been applied to the job shop as well as to the flow shop problem (see Taillard (1989, 1990)). The latter, in fact, also describes a single-attribute based implementation with attributes corresponding to objective function values. Chakrapani and Skorin-Kapov (1992) is especially relevant since its implementation is based on a connectionist approach related to a Boltzmann machine (cf. Aarts and Korst (1989)).
5 Conclusions

In this paper we have summarized some ideas for developing parallel tabu search algorithms. Motivated by a famous classification scheme for parallel machine models, we proposed a classification scheme for parallel tabu search algorithms. While research in this field is still in its infancy, we believe that reasonable achievements in the following two aspects will be provided.
• Development of a framework for a general parallel tabu search algorithm that can be applied to a wide range of combinatorial optimization problems.

• Empirical results for parallel tabu search algorithms tailored to specific problems.

Some results known from the literature (cf. Section 4) support this feeling. Despite the emphasis on parallel tabu search, sequential testing is still far from complete. Numerical results for the proposed algorithm REM2 have been reported in this paper, improving the previously known REM. The results are quite encouraging, especially when combining REM2 with the proposed look ahead method for m = 2. In addition, the tabu search metastrategy should be tested on different classes of parallel algorithms and machine models. Especially relevant seems to be a comparison of algorithms tailored to different hardware specifications, like vector computers versus synchronous and asynchronous MIMD machines. However, one should take into account identical user specifications with respect to tabu search (e.g. parameter setting, definition of the neighbourhood). Note that our classification scheme is not restricted to parallel tabu search, but may be applied to nearly any iterative search procedure, such as simulated annealing or genetic algorithms.
References

[1] E. Aarts and J. Korst (1989), Simulated Annealing and Boltzmann Machines (Wiley, Chichester).

[2] S.G. Akl (1989), The Design and Analysis of Parallel Algorithms (Prentice-Hall, Englewood Cliffs).

[3] R. Battiti and G. Tecchiolli (1992), Parallel biased search for combinatorial optimization: genetic algorithms and tabu search. Technical Report 9207-02, IRST, Istituto Trentino di Cultura, Trento.

[4] J. Chakrapani and J. Skorin-Kapov (1991), Massively parallel tabu search for the quadratic assignment problem. Working Paper, Harriman School for Management and Policy, State Univ. of New York at Stony Brook.

[5] J. Chakrapani and J. Skorin-Kapov (1992), A connectionist approach to the quadratic assignment problem. Computers & Operations Research 19, 287-295.

[6] D. Chhajed and T.J. Lowe (1992), M-median and M-center problems with mutual communication: solvable special cases. Operations Research 40, S56-S66.

[7] F. Dammeyer, P. Forst and S. Voß (1991), On the cancellation sequence method of tabu search. ORSA Journal on Computing 3, 262-265.
[8] F. Dammeyer and S. Voß (1991a), Dynamic tabu list management using the reverse elimination method. Annals of Operations Research, to appear.

[9] F. Dammeyer and S. Voß (1991b), Application of tabu search strategies for solving multiconstraint zero-one knapsack problems. Working Paper, TH Darmstadt.

[10] W. Domschke (1989), Schedule synchronization for public transit networks. OR Spektrum 11, 17-24.

[11] W. Domschke, P. Forst and S. Voß (1992), Tabu search techniques for the quadratic semi-assignment problem. In: G. Fandel, T. Gulledge and A. Jones (eds.), New Directions for Operations Research in Manufacturing (Springer, Berlin), 389-405.

[12] A. Drexl (1988), A simulated annealing approach to the multiconstraint zero-one knapsack problem. Computing 40, 1-8.

[13] A. Dutta, G. Koehler and A. Whinston (1982), On optimal allocation in a distributed processing environment. Management Science 28, 839-853.

[14] C.-N. Fiechter (1990), A parallel tabu search algorithm for large traveling salesman problems. Paper presented at the 1st Int. Workshop on Project Management and Scheduling, Compiegne.

[15] M.J. Flynn (1966), Very high-speed computing systems. Proc. IEEE 54, 1901-1909.

[16] A. Freville and G. Plateau (1982), Methodes heuristiques performantes pour les problemes en variables 0-1 a plusieurs contraintes en inegalite. Publication ANO-91, Universite des Sciences et Techniques de Lille.

[17] A. Freville and G. Plateau (1990), Hard 0-1 multiknapsack test problems for size reduction methods. Investigacion Operativa 1, 251-270.

[18] F. Glover (1989), Tabu search - part I. ORSA Journal on Computing 1, 190-206.

[19] F. Glover (1990), Tabu search - part II. ORSA Journal on Computing 2, 4-32.

[20] F. Glover and M. Laguna (1992), Tabu search. Working Paper, Univ. of Colorado at Boulder, to appear.

[21] W.D. Klemt and W. Stemme (1988), Schedule synchronization for public transit networks. In: J.R. Daduna and A. Wren (eds.), Computer-Aided Transit Scheduling, Lecture Notes in Economics and Mathematical Systems 308 (Springer, Berlin), 327-335.
[22] M. Malek, M. Guruswamy, M. Pandya and H. Owens (1989), Serial and parallel simulated annealing and tabu search algorithms for the traveling salesman problem. Annals of Operations Research 21, 59-84.

[23] E. Taillard (1989), Parallel taboo search technique for the jobshop scheduling problem. Working Paper, Ecole Polytechnique Federale de Lausanne.

[24] E. Taillard (1990), Some efficient heuristic methods for the flow shop sequencing problem. European Journal of Operational Research 47, 65-74.

[25] E. Taillard (1991), Robust taboo search for the quadratic assignment problem. Parallel Computing 17, 443-455.

[26] J. Van Leeuwen (1990), Algorithms and Complexity (Elsevier, Amsterdam).

[27] S. Voß (1990), Network design formulations in schedule synchronization. Working Paper, TH Darmstadt, to appear.
Network Optimization Problems, pp. 355-362. Eds. D.-Z. Du and P.M. Pardalos, ©1993 World Scientific Publishing Co.
The Shortest Path Network and Its Applications in Bicriteria Shortest Path Problems

Guo-Liang Xue
Army High Performance Computing Research Center, University of Minnesota, Suite 101, 1100 South Washington Avenue, Minneapolis, MN 55415, USA

Shang-Zhi Sun
Computer Science Department, University of Minnesota, Minneapolis, MN 55455, USA
Abstract
Let N = (V, A, l, s) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is the set of e arcs, l(u, v) > 0 is the length of an arc (u, v) ∈ A, and s ∈ V is the source. The Shortest Path Network (SPN) is a subnetwork of N with the property that an s-u path in N is a shortest path in N if and only if it is a path in the SPN. The SPN is a counterpart of the well-known Shortest Path Tree (SPT). Unlike the SPT, which may not be unique for a given network, the SPN is unique for any given network. Also, the SPN provides a unified approach for solving certain kinds of bicriteria or multicriteria shortest path problems where one criterion is more important than the others. We present a simple and efficient algorithm for computing the SPN with time complexity of TSP(n, e), where TSP(n, e) is the time complexity for solving the one-to-all shortest path problem on a network with n vertices and e arcs. We also present applications of the SPN, including a unified approach for solving the maximum capacity shortest path problem, the least risky shortest path problem, and the most reliable shortest path problem.
1 Introduction
Shortest path problems are among the most commonly encountered problems at the interface of Computer Science and Operations Research due to their important applications in communication networks and in road transportation management. In recent years, there has been increased interest in various bicriteria or multicriteria shortest path problems. Examples are the maximum capacity shortest path problem, the least risky shortest path problem, the most reliable shortest path problem, the minimum cost-reliability ratio problem, and the quickest path problem [1, 2, 3, 4, 8, 10, 11, 12]. Since the publication of Dijkstra's famous paper [6] in 1959, there have been many papers dealing with algorithms for the one-to-all or all-to-all shortest path problems. The fastest algorithm for the one-to-all shortest path problem is the one provided by Fredman and Tarjan [7], which requires O(e + n log n) time on a network with n vertices and e arcs by using a data structure called Fibonacci heaps. One important concept associated with the one-to-all shortest path problem is the Shortest Path Tree (SPT) [5] or the Shortest Spanning Tree [9]. The SPT of a network N has the nice property that the unique path from the source to any vertex in the SPT is guaranteed to be a shortest path in the network N. However, a shortest path in the network N might not be a path in the SPT. In this paper, we introduce a counterpart concept of the SPT called the Shortest Path Network (SPN), which enables a unified approach for solving certain kinds of bicriteria or multicriteria shortest path problems where one criterion is more important than the other criteria and the goal is to find a path which optimizes the other criteria among all the shortest paths with respect to the most important criterion. A simple and efficient algorithm for computing the SPN is presented together with examples of its applications. In section 2, we first observe the deficiency of the ordinary Shortest Path Tree in solving bicriteria shortest path problems. We then introduce the concept of the Shortest Path Network and prove its existence and uniqueness. A simple algorithm is provided which computes the Shortest Path Network in time TSP(n, e), where n and e are the number of vertices and the number of arcs of the network and TSP(n, e) is the time complexity of solving the one-to-all shortest path problem on that network. In section 3, we show how the Shortest Path Network can be used as a useful tool in solving the maximum capacity shortest path problem and the least risky shortest path problem. Some conclusions are given in section 4.
2 The Shortest Path Network
Let N = (V, A, l, s) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is the set of e arcs, l(u, v) > 0 is the length of an arc (u, v) ∈ A, and s ∈ V is the source. Applying Dijkstra's one-to-all shortest path algorithm [6], we may find, in time O(n^2), the shortest paths from the source s to all the other
vertices of N, together with a tree rooted at s with the property that the unique path from s to any vertex u in the tree is also a shortest s-u path in N. Such a subnetwork is usually called a Shortest Path Tree [5], which is formally defined as follows.

Definition 2.1. Let N = (V, A, l, s) be a given network with a distinguished source node s. A subnetwork SPT of N is called a Shortest Path Tree of N if
(1) the vertex set of the SPT consists of all the vertices of N which are reachable from s;
(2) any s-u path in the SPT is a shortest s-u path in N;
(3) the SPT is a tree rooted at s.

Given an SPT of N and a vertex u in N which is reachable from s, the unique s-u path in the SPT is a shortest s-u path in N. However, the shortest s-u path in N may not be unique. Therefore there might be another shortest s-u path in N which is not a path in the SPT. We are interested in a subnetwork which has the property that an s-u path in N is a shortest s-u path if and only if it is a path in the subnetwork. We will call such a subnetwork a Shortest Path Network; it is formally defined below.

Definition 2.2. Let N = (V, A, l, s) be a given network with a distinguished source node s. A subnetwork SPN of N will be called a Shortest Path Network of N if
(1) the vertex set of the SPN consists of all the vertices of N which are reachable from s;
(2) any s-u path in the SPN is a shortest s-u path in N;
(3) any shortest s-u path in N is a path in the SPN.
The following theorem establishes the existence and uniqueness of the SPN and its characterization.

Theorem 2.1. Let N = (V, A, l, s) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is the set of e arcs, l(u,v) > 0 is the length of an arc (u,v) ∈ A, and s ∈ V is the source. Then the SPN of N is unique and is the union of all the shortest s-u paths for u ∈ V such that there is an s-u path in N.

Proof. Since l(u,v) > 0 for any arc (u,v) ∈ A, there is a shortest s-u path for a vertex u ∈ V if and only if u is reachable from s. Let Union be the union of all the shortest s-u paths for u ∈ V such that there is an s-u path in N. We want to show that Union is a Shortest Path Network. Clearly, the vertex set of Union consists of all the vertices of N which are reachable from s, and any shortest s-u path in N is a path in Union. Using the property that every subpath of a shortest path is itself a shortest path, one can prove that any s-u path in Union is a shortest s-u path in N. This shows that Union is an SPN for N. Now for any SPN of N, property (2) in the definition implies that the SPN is a subnetwork of Union, while property (3) in the definition implies that Union is a subnetwork of the SPN. Therefore Union is the unique SPN of N. □
The above proof also suggests the following algorithm for computing the SPN of a given network (see Figure 1).

Algorithm 2.1.
Step 1. Apply any one-to-all shortest path algorithm on N. Let d(u) be the shortest distance from s to u for any u ∈ V. Let d(u) = ∞ when there is no s-u path in N.
Step 2. For each u ∈ V, if d(u) = ∞ then delete u from V and delete the arcs from A which are adjacent with u.
Step 3. For each arc (u,v) ∈ A, if d(u) + l(u,v) > d(v) then delete (u,v) from A.

Figure 1: Computing the SPN from a given network.

It is clear that Algorithm 2.1 correctly changes the input network N = (V, A, l, s) to its unique SPN. Since Steps 2 and 3 take at most O(e) time, the time complexity of Algorithm 2.1 is TSP(n,e) + O(e), where TSP(n,e) is the time complexity for the one-to-all shortest path problem on a network with n vertices and e arcs. Since TSP(n, e) is always greater than or equal to e, the time complexity of Algorithm 2.1 is TSP(n, e). In Figure 2 we illustrate a network and its unique Shortest Path Network. For clarity, the arcs in the SPN are drawn in thicker lines. It can easily be observed from the figure that the SPN is not a tree and that the given network has more than one SPT.
Figure 2: A network and its Shortest Path Network.
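As a concrete illustration, the following Python sketch implements Algorithm 2.1, using Dijkstra's algorithm as the one-to-all solver. The arc-list representation and the function name are our own choices and are not taken from the paper.

```python
import heapq
from math import inf

def shortest_path_network(n, arcs, s):
    """Reduce a network to its Shortest Path Network (Algorithm 2.1).

    n    -- number of vertices, labelled 0..n-1
    arcs -- list of (u, v, length) with length > 0
    s    -- source vertex
    Returns (d, spn_arcs): shortest distances and the arcs kept in the SPN.
    """
    # Step 1: one-to-all shortest paths (Dijkstra with a binary heap).
    adj = [[] for _ in range(n)]
    for u, v, w in arcs:
        adj[u].append((v, w))
    d = [inf] * n
    d[s] = 0
    heap = [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > d[u]:
            continue
        for v, w in adj[u]:
            if du + w < d[v]:
                d[v] = du + w
                heapq.heappush(heap, (du + w, v))
    # Steps 2 and 3: keep only reachable vertices and "tight" arcs,
    # i.e. arcs with d(u) + l(u,v) = d(v); all other arcs are deleted.
    spn_arcs = [(u, v, w) for u, v, w in arcs
                if d[u] < inf and d[v] < inf and d[u] + w == d[v]]
    return d, spn_arcs

# Small example: the direct arc (0, 3) of length 3 is not on any shortest path
# and is therefore removed from the SPN.
d, spn = shortest_path_network(4, [(0, 1, 1), (0, 2, 1), (1, 3, 1), (2, 3, 1), (0, 3, 3)], 0)
```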
Once the subnetwork SPN is found, the above mentioned bicriteria shortest path problems all reduce to single criterion problems on SPN. This makes the Shortest Path Network a very useful concept in bicriteria shortest path problems. In the next section, we will discuss applications of the SPN.
3 Applications
In the previous section, we have introduced the concept of the SPN and presented an algorithm for computing the SPN. In this section, we will show how the SPN can be used to solve various bicriteria shortest path problems in a unified way. Specifically, we will investigate the maximum capacity shortest path problem, the least risky shortest path problem, and the most reliable shortest path problem.
3.1 The Maximum Capacity Shortest Path Problem
Let N = (V, A, l, s, c) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is the set of e arcs, l(·) and c(·) are weighting functions such that l(u,v) > 0 is the length of an arc (u,v) ∈ A and c(u,v) > 0 is the capacity of that arc, and s ∈ V is the source. We want to find a maximum capacity shortest path from s to u for all u ∈ V, where the length of a path in N is the sum of the lengths of the arcs on the path, while the capacity of a path in N is the minimum of the capacities of the arcs on that path. Here we find our first application of the Shortest Path Network. Ignoring the weighting function c(·) for the moment, we may find the SPN of N with respect to l(·). Now, for each u ∈ V, there is an s-u path if and only if u is a vertex of the SPN. In addition, an s-u path in N is a maximum capacity shortest s-u path if and only if it is a maximum capacity s-u path in the SPN. Therefore, the maximum capacity shortest path problem can be solved by the following algorithm (see Figure 3). For convenience, we will assume that the vertices of the SPN are labeled 1, 2, ..., k (k <= n), where s is labeled 1.

Algorithm 3.1.
Step 1. Find the SPN with respect to l(·). Delete all the arcs and vertices of N which are not in its SPN.
Step 2. For i = 1 to k do begin f[i] = false; C[i] = c[1,i]; end;
Step 3. f[1] = true; C[1] = ∞;
For i = 1 to k - 2 do
  choose u ∈ argmax{ C[w] | f[w] = false };
  f[u] = true;
  for w = 1 to k do
    if not f[w] and min{C[u], c[u,w]} > C[w] then C[w] = min{C[u], c[u,w]};
end;

Figure 3: Computing the maximum capacity shortest path.

Step 1 takes TSP(n,e) + O(e) time. Step 2 takes O(n) time. Step 3 takes O(n^2) time. Therefore the time complexity of the algorithm is O(n^2).
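A possible implementation of Algorithm 3.1 is sketched below; it reuses the shortest_path_network helper from the sketch in Section 2 (that helper, the dictionary-based capacity map, and the function name are our own assumptions, not the paper's).

```python
from math import inf

def max_capacity_shortest_paths(n, arcs, caps, s):
    """Maximum capacity shortest paths from s (Algorithm 3.1).

    arcs -- list of (u, v, length); caps -- dict {(u, v): capacity > 0}.
    Returns C, where C[u] is the largest bottleneck capacity over all
    shortest s-u paths (-inf if u is unreachable from s).
    Assumes shortest_path_network() from the earlier sketch.
    """
    # Step 1: restrict the network to its Shortest Path Network.
    _, spn_arcs = shortest_path_network(n, arcs, s)
    cap = {(u, v): caps[(u, v)] for u, v, _ in spn_arcs}
    nodes = {s} | {u for u, _, _ in spn_arcs} | {v for _, v, _ in spn_arcs}
    # Steps 2 and 3: a Dijkstra-like sweep that maximizes the bottleneck capacity.
    C = {u: cap.get((s, u), -inf) for u in nodes}
    C[s] = inf
    done = {s}
    while len(done) < len(nodes):
        u = max((x for x in nodes if x not in done), key=lambda x: C[x])
        done.add(u)
        for w in nodes:
            if w not in done and (u, w) in cap:
                C[w] = max(C[w], min(C[u], cap[(u, w)]))
    return C
```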
3.2 The Least Risky Shortest Path Problem
Let N = (V, A, l, s, R) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is the set of e arcs, l(·) and R(·) are weighting functions such that l(u,v) > 0 is the length of an arc (u,v) ∈ A and R(u,v) > 0 is the risk of that arc, and s ∈ V is the source. We want to find a least risky shortest path from s to u for all u ∈ V, where the length of a path in N is the sum of the lengths of the arcs on the path, while the risk of a path in N is the sum of the risks of the arcs on that path. As another application of the Shortest Path Network, we present in Figure 4 an algorithm for solving the least risky shortest path problem.

Algorithm 3.2.
Step 1. Find the SPN with respect to l(·). Delete all the arcs and vertices of N which are not in its SPN.
Step 2. Solve the one-to-all shortest path problem with respect to R(·).

Figure 4: Computing the least risky shortest path.

Step 1 takes TSP(n,e) + O(e) time. Step 2 takes TSP(n,e) time. Therefore, the time complexity of the algorithm is TSP(n, e).
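Under the same assumptions as the earlier sketches (and again reusing the hypothetical shortest_path_network helper), Algorithm 3.2 is simply a composition of two shortest-path computations:

```python
def least_risky_shortest_paths(n, arcs, risks, s):
    """Least risky shortest paths from s (Algorithm 3.2).

    arcs  -- list of (u, v, length); risks -- dict {(u, v): risk > 0}.
    Returns risk[u]: the smallest total risk over all shortest s-u paths.
    Assumes shortest_path_network() from the earlier sketch.
    """
    # Step 1: keep only the arcs of the Shortest Path Network (w.r.t. length).
    _, spn_arcs = shortest_path_network(n, arcs, s)
    # Step 2: run a one-to-all shortest path computation on the SPN,
    # this time using the risks as the arc weights.
    risk_arcs = [(u, v, risks[(u, v)]) for u, v, _ in spn_arcs]
    risk, _ = shortest_path_network(n, risk_arcs, s)
    return risk
```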
3.3 The Most Reliable Shortest Path Problem
Let N = (V, A, l, s, r) be a given network where G = (V, A) is a simple directed graph, V is the set of n vertices, A is the set of e arcs, l(·) and r(·) are weighting functions such that l(u,v) > 0 is the length of an arc (u,v) ∈ A and r(u,v) ∈ (0, 1] is the reliability of that arc, and s ∈ V is the source. We want to find a most reliable shortest path from s to u for all u ∈ V, where the length of a path in N is the sum of the lengths of the arcs on the path, while the reliability of a path in N is the product of the reliabilities of the arcs on that path. Now replace the weighting function r(·) by R(·), where R(u,v) = -log(r(u,v)) for each arc (u,v) ∈ A. Note that R(·) >= 0 because r(·) ∈ (0, 1]. We will call R(u,v) the risk of arc (u,v). Then it is clear that the most reliable shortest path problem on (V, A, s, l, r) is equivalent to the least risky shortest path problem on (V, A, s, l, R), which can be solved by Algorithm 3.2.
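The reduction in this subsection is a one-line transformation; a hedged sketch (building on the hypothetical helper of the previous subsection) might look like:

```python
from math import log

def most_reliable_shortest_paths(n, arcs, reliabilities, s):
    """Most reliable shortest paths via the risk transform R = -log(r).

    Builds on least_risky_shortest_paths() from the previous sketch.
    """
    # r in (0, 1]  =>  risk = -log(r) >= 0, and maximizing a product of
    # reliabilities becomes minimizing a sum of risks.
    risks = {e: -log(r) for e, r in reliabilities.items()}
    return least_risky_shortest_paths(n, arcs, risks, s)
```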
4 Conclusions
We have introduced the concept of the Shortest Path Network, which is a counterpart of the well-known Shortest Path Tree. Unlike the Shortest Path Tree, the SPN is unique for a given network with a distinguished source. One advantage of the SPN over the SPT is that any path in the network from the source is a shortest path if and only if it is also a path in the SPN. A simple algorithm for computing the SPN is presented and its time complexity is shown to be the same as that of solving the one-to-all shortest path problem. Examples are given to show the applications of the SPN in some bicriteria shortest path problems. The SPN greatly simplifies the discussion and solution of a certain kind of bicriteria shortest path problem. Therefore, it has both educational and practical value. We hope to see more applications of the SPN.
Acknowledgment. The work of the first author was supported in part by the Army Research Office contract number DAAL03-89-C-0038 with the University of Minnesota Army High Performance Computing Research Center. The work of the second author was supported in part by the Computer Science Department of the University of Minnesota.
References

[1] R.K. Ahuja, Minimum Cost-Reliability Ratio Problem, Computers and Operations Research, Vol. 15 (1988), pp. 83-89.

[2] L.D. Bodin, B.L. Golden, A.A. Assad, and M.O. Ball, Routing and Scheduling of Vehicles and Crews: The State of the Art, Computers and Operations Research, Vol. 10 (1983), pp. 63-211.

[3] Y.L. Chen and Y.H. Chin, The Quickest Path Problem, Computers and Operations Research, Vol. 17 (1990), pp. 153-161.

[4] Y.L. Chen, An Algorithm for Finding the K Quickest Paths in a Network, revised in Computers and Operations Research.

[5] R. Dial, F. Glover, D. Karney and D. Klingman, A Computational Analysis of Alternative Algorithms and Labeling Techniques for Finding Shortest Path Trees, Networks, Vol. 9 (1979), pp. 215-248.

[6] E. Dijkstra, A Note on Two Problems in Connection with Graphs, Numerical Mathematics, Vol. 1 (1959), pp. 269-271.

[7] M.L. Fredman and R.E. Tarjan, Fibonacci Heaps and Their Uses in Improved Network Optimization Algorithms, Journal of the Association for Computing Machinery, Vol. 34 (1987), pp. 596-615.

[8] M. Minoux, Solving Combinatorial Problems with Combined Min-Max-Min-Sum Objective and Applications, Mathematical Programming, Vol. 45 (1989), pp. 361-372.

[9] A.R. Pierce, Bibliography on Algorithms for Shortest Path, Shortest Spanning Tree, and Related Circuit Routing Problems, Networks, Vol. 5 (1975), pp. 129-149.

[10] J.B. Rosen, S.Z. Sun, and G.L. Xue, Algorithms for the Quickest Path Problem and the Enumeration of Quickest Paths, Computers and Operations Research, Vol. 18 (1991), pp. 579-584.

[11] J.B. Rosen and G.L. Xue, Sequential and Distributed Algorithms for the All Pairs Quickest Path Problem, in Proceedings of the 1991 International Conference on Computing and Information, Ottawa, Canada (Springer-Verlag, 1991), pp. 471-473.

[12] G.L. Xue, S.Z. Sun, and J.B. Rosen, Minimum Time Message Transmission in Networks, in Proceedings of the 1992 International Conference on Computing and Information, May 28-30, 1992, Toronto, Canada, IEEE Computer Society Press, pp. 22-25.
A Network Formalism for Pure Exchange Economic Equilibria

Lan Zhao
Department of Mathematics, SUNY/College at Old Westbury, Old Westbury, NY 11568-0219, USA

Anna Nagurney
School of Management, University of Massachusetts, Amherst, MA 01003, USA
Abstract
In this paper we develop a network formalism for general economic equilibrium problems in the case of pure exchange economies. We first establish that the Walrasian price equilibrium (and its variational inequality formulation) is isomorphic to a network equilibrium problem with special structure. We then propose a general iterative scheme for the computation of the equilibrium prices, which contains, as special cases, the projection method and the relaxation method, and which allows for the full exploitation of the special network structure. Finally, we compare the numerical performance of the projection and the relaxation methods on several economic examples.
1 Introduction
Network equilibrium models have been used to formulate and study competitive phenomena in a wide range of applications in operations research, management science, and, more recently, in economics. Examples include: congested urban transportation systems (see, e.g., [1], [2], [3], [5], [6], [12], [15]), oligopolistic markets ([11]), spatial price equilibrium problems ([8], [10], [14], [17]), disequilibrium problems ([19]), and problems of human migration ([16], [18]).
The aforementioned models, however, have been exclusively partial equilibrium models in that only a subset of agents/activities/commodities has been incorporated within the network equilibrium framework. In this paper, in contrast, we focus on the general economic equilibrium problem in the case of pure exchange, in which all of the commodities in an economy can be considered. Our approach utilizes the underlying special network structure of the problem, in an abstract setting, in which the nodes do not correspond to locations in space, and in which the links correspond to commodities. This underlying structure, heretofore unidentified and unexplored, motivates the subsequent algorithmic developments in this paper, with the ultimate goal being the computation of large-scale general economic equilibrium problems. As is well known, the simplicial approximation methods pioneered by Scarf [20] for the computation of economic equilibria, in their present state of development, cannot handle large-scale problems. In particular, we consider the variational inequality formulation of the problem recently described in Dafermos [9]. Thus far, as discussed therein, the variational inequality approach has been used to obtain only qualitative results, in the form of existence, uniqueness, and stability of pure exchange equilibria, and the computational analogue for this class of problems has not been addressed. In Section 2 we briefly review the pure exchange or Walrasian price equilibrium model. We then establish that the problem is isomorphic to a particular network equilibrium problem with fixed demand. In Section 3 we propose a general iterative scheme for the computation of the Walrasian equilibrium price vectors that is based on the general iterative scheme of Dafermos [7], and provide conditions for convergence. The Walrasian iterative scheme, as is then demonstrated in Section 4, contains, as special cases, both the projection method and the relaxation method. The projection method resolves the Walrasian price equilibrium problem into a series of linear and symmetric network equilibrium problems, each of which, as we also show, can be solved exactly in closed form. The relaxation method, on the other hand, resolves the economic equilibrium problem into a series of nonlinear network flow problems, to which we then apply a network equilibration algorithm for its solution. In Section 5 we then turn to the empirical performance of the algorithms and compare the efficiency of the relaxation method with that of the projection method on several economic examples. In Section 6 we summarize and conclude. The marriage of network theory and variational inequalities has already yielded efficient algorithms for a variety of applications characterized by their large-scale nature. This work brings a class of general economic equilibrium problems under the umbrella of network equilibrium.
2 The Variational Inequality Model of the Pure Exchange Economy and its Isomorphic Network Equilibrium Representation
In this section we first briefly review the pure exchange economic equilibrium model and its variational inequality formulation. We then develop its isomorphic network equilibrium representation. In particular, we consider a pure exchange economy with l commodities, price vector p = (p_1, p_2, ..., p_l)^T taking values in the positive orthant R^l_+, and with induced aggregate excess demand function z(p), with components z_1(p), ..., z_l(p). As usual, z(p) will be assumed to be homogeneous of degree zero in p, and, therefore, we may normalize prices so that they take values in the simplex:

    S^l = { p : p ∈ R^l_+, Σ_{i=1}^{l} p_i = 1 }.        (1)
As is standard in general economic equilibrium theory, the aggregate excess demand function must satisfy Walras' law:

    p · z(p) = 0,   for all p ∈ S^l.        (2)
We now state the definition of a Walrasian equilibrium.

Definition 1: A price vector p* ∈ S^l is called a Walrasian equilibrium if the market is cleared for valuable commodities and is in excess supply for free commodities, that is, if

    z_i(p*) = 0   if p*_i > 0,
    z_i(p*) ≤ 0   if p*_i = 0.        (3)
The following theorem shows that Walrasian equilibrium price vectors can be characterized as solutions of a variational inequality (see, e.g., Dafermos [9], Theorem 1.1); it is included here for completeness.

Theorem 2.1 A price vector p* ∈ S^l is a Walrasian equilibrium if and only if it satisfies the variational inequality

    z(p*)^T · (p - p*) ≤ 0,   for all p ∈ S^l,

which we denote by VI(z, S^l).
366
Lan Zhao & Anna
Nagurney
-z,(p)
Figure 1: Network equilibrium formulation of t h e pure exchange economy (x,y) (cf. Figure 1). A fixed O / D d e m a n d dxy is assumed given. Let / ; be t h e flow passing through link i; i = 1 , . . . , I, and let c; be t h e user cost associated with link i; i = 1 , . . . , /. Group t h e link loads into a vector / € R1, and t h e costs into a vector c G R . Assume t h e general situation t h a t a cost on a link may depend upon t h e entire link load p a t t e r n , t h a t is, c, = c,'(/)- T h e n / * is a user equilibrium p a t t e r n if and only if no user has any incentive to change his p a t h (which in t h e model corresponds to a link), t h a t is, mathematically, there exists an ordering of t h e links n,; i = 1 , . . . , /, such t h a t
.,(/•),•••,<>,.(/*) = A < c. + 1 (/') < ... < c„,(/«)
(4)
vhere
,, f > 0, i = l , . . . , s , '\ = 0 , i = s + l,...,l.
Jn
As shown in Dafermos ([5], [6]) t h e above s t a t e m e n t is equivalent to t h e following: A vector / * £ K is a user equilibrium load p a t t e r n if and only if it is a solution to the variational inequality
c(f')T • (f - f) > o,
V/ e K,
where
K = {f:f>0,J2f,
= dxy}.
(5)
A Network
Formalism
for Pure Exchange
Economic
We now establish t h e relationship between VI(z, librium problem. Consider t h e d e m a n d
Equilibria
367
Sl) and t h e above network equi-
t h e link load p a t t e r n f = P, and t h e user travel cost <•)
= -»(•)•
(6)
T h e equilibrium condition of t h e network with t h e cost vector defined in (6) is: Pi > 0
,_,
*••<*> { < A, if ; • = o .
/ .\ I = A, if
w
Multiplying now t h e above inequalities by p*; i: = 1 , . . . , /, summing then t h e resulting equalities, and using Walras' law, we obtain A = p'T • z(p')
= 0;
thus, the equilibrium condition (7) of the above network with the cost function defined in (6) is identical to the equilibrium condition (3) of the pure exchange economy. Furthermore, variational inequality (5), which governs the traffic network equilibrium problem described above, coincides with VI(z, S^l). Since the variational inequality problem VI(z, S^l) and, hence, the Walrasian equilibrium problem is isomorphic to the above user equilibrium network problem with disjoint paths, we can develop algorithms for the network problem which exploit the disjoint path structure in order to compute the Walrasian price equilibrium.
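To make the isomorphism concrete, the sketch below builds an aggregate excess demand function of the Cobb-Douglas form used in the numerical section (z_j(p) = Σ_i a_j^i (p·w^i)/p_j - Σ_i w_j^i), treats -z as the vector of link costs of Figure 1, and checks the equilibrium condition (7) at a candidate price vector. The function names and the tolerance are our own choices, not the paper's; the data are those of Example 1 (Table 1).

```python
import numpy as np

def excess_demand(p, a, w):
    """Cobb-Douglas aggregate excess demand:
    z_j(p) = sum_i a[i, j] * (p . w[i]) / p[j] - sum_i w[i, j]."""
    p = np.asarray(p, dtype=float)
    incomes = w @ p                       # income of consumer i, p . w^i
    return (a * incomes[:, None] / p).sum(axis=0) - w.sum(axis=0)

def is_walrasian_equilibrium(p, a, w, tol=1e-6):
    """Check condition (7): the link costs -z_i(p) attain their minimum lambda
    on every link with p_i > 0, and are no smaller on the unused links."""
    cost = -excess_demand(p, a, w)        # user cost on link i of Figure 1
    lam = cost[p > tol].min()             # common cost level on the used links
    used_ok = np.all(np.abs(cost[p > tol] - lam) <= tol)
    unused_ok = np.all(cost[p <= tol] >= lam - tol)
    return used_ok and unused_ok

# Example 1 of the paper: 2 consumers, 4 commodities
# (a = Cobb-Douglas coefficients a_j^i, w = endowments w_j^i from Table 1).
a = np.array([[0.1, 0.5, 0.1, 0.3],
              [0.2, 0.4, 0.2, 0.2]])
w = np.array([[20.0, 30.0, 6.0, 2.0],
              [10.0, 30.0, 16.0, 2.0]])
p0 = np.full(4, 0.25)                     # the starting point p^0 = (1/l, ..., 1/l) used in Section 5
print(excess_demand(p0, a, w), is_walrasian_equilibrium(p0, a, w))
```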
3 A General Iterative Scheme for the Computation of Walrasian Price Equilibrium
In this section we develop a general iterative scheme for the computation of Walrasian price equilibria, which at each step allows for the exploitation of the special network structure depicted in Figure 1. In studying algorithms and their convergence, the standard assumption in the economics literature (cf. Scarf [20]) is that the aggregate excess demand function z(p) is well-defined and continuous on all of S^l. In this paper we also make this assumption.

The Iterative Scheme

Construct a smooth function g(p, q) : S^l × S^l → R^l with the following properties:

(i) g(p, p) = -z(p), for all p ∈ S^l;

(ii) for every fixed p, q ∈ S^l, the l × l matrix ∇_p g(p, q) is positive definite.

Any smooth function g(p, q) with the above properties generates the following algorithm.

Step 0: Initialization. Start with some p^0 ∈ S^l. Set k := 1.

Step 1: Construction and Computation. Compute p^k by solving the variational inequality

    g(p^k, p^{k-1})^T · (p - p^k) ≥ 0,   for all p ∈ S^l.

Step 2: Convergence Verification. If |p^k - p^{k-1}| < ε, with ε > 0 a prespecified tolerance, then stop; otherwise, set k := k + 1 and go to Step 1.

We denote the above variational inequality by VI^k(g, S^l). Since ∇_p g(p, q) is positive definite, VI^k(g, S^l) admits a unique solution p^k. Thus, we obtain a well-defined sequence {p^k}. It is easy to see that if the sequence {p^k} is convergent, say p^k → p* as k → ∞, then p* is an equilibrium price vector, that is, it is a solution of variational inequality VI(z, S^l). In fact, on account of the continuity of g(p, q), VI^k(g, S^l) yields

    -z(p*)^T · (p - p*) = g(p*, p*)^T · (p - p*) = lim_{k→∞} g(p^k, p^{k-1})^T · (p - p^k) ≥ 0,   for all p ∈ S^l,

so that p* is a solution of the original variational inequality VI(z, S^l). The problem is now to find conditions on g(p, q) which guarantee that the sequence {p^k} is convergent. Let | · | denote the usual Euclidean norm in the space R^l and let || · || denote the norm of the operator Q : G^{1/2}V → R^l,

    ||Q|| = sup { |Qu| : u ∈ G^{1/2}V, |u| = 1 }        (8)

where G(p, q) = (1/2)(∇_p g(p, q) + ∇_p^T g(p, q)), which, in view of condition (ii), is positive definite,

    V = { v : v ∈ R^l, Σ_{i=1}^{l} v_i = 0 }        (9)

and

    G^{1/2}V = { u : u = G^{1/2}(p, q) v, v ∈ V }.        (10)

We now present conditions for convergence.
1
2
2
3
3
(ii)
k
for all {p ,q ),(p ,q ),(p ,q ) VIk(g,Sl) is CauchyinS1.
€ S'. Then the sequence {p } obtained by solving
Proof: Let p = pk+1 for VIk{g,S'),
that is,
gip^p'-'f-ip^-p^^O,
(12)
and let p = pk for V 7 t + 1 ( 5 , 5'), that is, 9(pk+\pkf-(pk~pk+1)>0.
(13)
Adding (12) and (13), we obtain M A P * - 1 ) - (P* + V)) T • (P*+1 - P * ) > 0,
(14)
or (g(pk+\pk)-9(pk,Pk)f-(pk+1-Pk) < {g(pk,pk-1)-g{pk,pk))T • (Pk+1-Pk)By the Mean Value Theorem, there exists a t € (0,1), such that (g(Pk+\pk)
- g(pk,Pk)f
• (P* + 1 - Pk) = (Pk+1 -
(is)
k P
f
•Vvg(tpk + (1 - 0P* +1 ,P*) • (PM - p*),
(16)
or (g(pk+\pk)-g(pk,pk)f-(pk+1-Pk) = \(pk+1
- Pkf • (V pfl (tp* + (l -
t)Pk+\pk)
+VTpg(tPk + (1 - * ) p f c + V ) • (pk+i - Pk)-
(17)
Let Gk be defined as Gk = \(VPg(ipk
+ (1 " 0 P * + 1 . P * ) + VTpg(tpk + (1 - 0 P * + 1 , P * ) ) .
(18)
Observe that Gk is symmetric and positive definite. Using now (15), (17), and (18) yields (p*+i -pkf . Gfc(P*+1 - p * ) < (gip",?"-1) -g(pk,Pk))
• (Pk+1~pk).
(19)
370
Lan Zhao & Anna Nagurney
We define now the inner product on V as (vi, v2)k = vfGkv2,
Vvu v2 e V
(20)
which induces the norm \v\k = {vTGkvY
= \G\v\,
VeK
(21)
By applying the Mean Value Theorem, (19) yields
\pk+1-pk\l<(pk-1-pkfGLG-J1 V,ff(/, V + (1 - s)pk'1)GPGl(pk^
- pk)
(22)
for s 6 (0,1). Using the Schwarz inequality and condition (11), (22) yields \pk+1 -Pk\l < |G|-.(P* - P * " 1 ) ! • \\G-k2xVqg{p\spk
+ (1 - S ) p * - ' ) G ^ | |
•|G|(P*+1-P*)| k
k
k
k
+ {l - stf-l)G-S\\
= \p -p -'\k.,\\G-k2lVq9{p ,sp
• \Pk+1 ~
P
\.
(23)
Hence, \pk^-p\
k=l,2,...,
(24)
1
where 7 is the maximum over the compact set S of the lefthand side of (11), which is less than 1. ^From (24) we obtain
\pk+1 - A < 71/ - A V i < ... < l V -P°|0.
(25)
On the other hand, since Gk; k = 1,2,..., is nonsingular, for every (p, q) € Sl x 5', there is a /3 > 0 such that
IP^-P^/TV^-A.
VA1,P*;* = 0,1,2,....
Therefore, (25) yields fc+r-1 fc+r-1
|p f c + r -/l< £ i=k
IP^-P1'!^/?-1
£
IP''+1-P"'I,-
i=k
7 vhich shows that {p*} is a Cauchy sequence in 5 ' and the proof is complete
(26)
A Network
Formalism
for Pure Exchange
Economic
Equilibria
371
R e m a r k 1: Naturally, VI (g,S!) should be constructed in such a way so t h a t it is easy to solve. For example, when Vpg(p,q) is also symmetric, VIk(g, S1) is equivalent to t h e convex m a t h e m a t i c a l programming problem: Find p* € S' such t h a t F(p') = mm F(p), (28) pSS1
where F(p) is a strictly convex function denned by the line integral
F(P) = Jg(p,q)Tdp. Hence, any algorithm suitable for solving (28) can then be used for solving variational inequality VIk(g,S'). P r o p o s i t i o n 3 . 2 Assume that the Jacobian matrixVpg(p,q) a necessary condition for (11) to hold is that the Jacobian definite over V for any p G S1, that is, vTVz{p)v
< 0,
is also symmetric. Then matrix V z ( p ) is negative
Vt; € V, v ± 0, Vp € 5 ' .
The above condition implies that the function is (p1-p2)T-(z(P1)-z(p2))<0,
—z(p) is strictly
(29) monotone
on S , that
Vp\p2eSl,pl^p\
(30)
P r o o f : Assume t h a t condition (11) holds and select 1
2
p =p
3
=p
1
2
= q =q
3
=q
.
Note t h a t -Vpz{p)
= Vpg(p,p)
+
Vgg{p,p).
Therefore, (11) takes t h e form | | / + G-i{p,p)Vpz(p)G-l(P,p)\\
< 1.
(31)
Set B(P) = G-HP,P)VPZ(P)G-HP,P)-
(32)
Substituting now (32) into (31) and expanding t h e lefthand side of (31), we obtain ||7 + B | | 2 =
sup
\(I +
B)u\2
u6<3iv,|«|=l
sup uT(I usGiv,M=i
+ B)T{I
+ B)u = s u p ( l + 2uTBu "
+ uTBTBu)
< 1
(33)
or, 2uTBu
< -uTBTBu.
(34)
372
Lan Zhao & Anna Nagurney
Since u = G?(p,p)v, (34) yields 2vTVpz(p)v
<
= -\G-HP,P)VPZ(P)V\2
-vTV^z{p)G-i2(p,p)G-^(p,p)Vpz(p)v < 0,
W e V,p
£S',VT&
0.
Hence, 'Vpz(p) is negative definite over V for any p G 5'. The proof is complete. We would like to point out that, since z{p) is homogeneous of degree zero, Vz(p) cannot be positive definite. Therefore, z(p) is never strictly monotone on a set containing a segment of the ray originating from the origin of the /-dimensional space. However, it can be strictly monotone on the / — 1 dimensional simplex S1 (see, e.g., [9])-
4
The Projection and Relaxation Methods for the Computation of the Equilibrium Prices
In this section we show that the general iterative scheme induces a projection method and a relaxation method for the computation of the equilibrium prices. We first present the projection method and then the relaxation method. We also propose equilibration algorithms PMN and RMN for the solution of the respective symmetric network equilibrium subproblems with special structure. We note that the network subproblems induced by the projection method are characterized by linear user link cost functions, whereas those induced by the relaxation method are, in general, nonlinear. a. The Projection Method The projection method corresponds to the choice g(p,q) = -z(q) + -G(p-q),
(35)
where p is a positive scalar and G is a fixed, symmetric positive definite matrix. In this case properties (i) and (ii) are satisfied. In fact, (0 9{P, l) = ~Z(P) + \G{p - p) = -z(p), (ii) Vpg(p,q) = p _ 1 G, is positive definite and symmetric. Condition (11) then takes the form ||/ + p G - 5 V „ z ( p ) G - 3 | | < l .
(36)
The following lemma give conditions under which (36) is satisfied. Lemma 4.1 If —z(p) is strongly monotone on S1, then condition (36) is satisfied.
A Network
Formalism
for Pure Exchange
Economic
373
Equilibria
P r o o f : Let B{p) = G ? S/pz(p)G *. By virtue of t h e strong monotonicity assumption, t h e following inequality holds: vTVvz{p)v
< -a\v\2,
VveV,peS'.
(37)
Since z(p) is continuously differentiable on Sl, there is a sufBciently large number M bounding WVjzipjG'1 V p z ( p ) | | such t h a t vTVTpz(p)G-1
Vpz{p)v
<M\v\\
VpeS',ve
V.
Therefore, \\I + PB(p)\\2
=
=
sup {uT(I ueaiv,\u\=i
sup
+ PB(p))T(I
{1 + 2puTBu
+
PB(p))u}
+p2uTBTu}
u£G% V,|u|=l
= s u p { l + 2pvTVpz(p)v
+ p2vTV
< s u p { l - 2 p a | u | 2 + p2M\v\2}
T
pz
{p)G'xV
= sup{l + p\v\2{PM
pz{p)v)
- 2a)}.
(38)
T h e righthand side of (38) is strictly less t h a n 1, whenever p < | | . T h u s , condition (11) is satisfied. T h e proof is complete. R e m a r k 2: Define 8(p) = s u p j l - 2pa\v\2
+ p2M\v\2}.
(39)
We observe t h a t it is t h e value of 6(p) t h a t affects t h e speed of convergence. In fact, the smaller 6 is, the quicker the sequence {pk} converges. From (39) we know t h a t 8(p) is minimized at p = j ^ . Therefore, p = jg is t h e optimal choice for the projection method. W i t h such a selected g(p,q), each subproblem VIk(g,S') is isomorphic to t h e network equilibrium problem with linear link cost functions. In particular, we choose G to be the diagonal positive definite m a t r i x of t h e form a-y
•••
0
(40) a, where a;; i = 1 , 2 , . . . , / , is any positive number. A natural choice is to have a; = — I^lpo; i = 1 , 2 , . . . , / , in which case VI (g,Sl) is then isomorphic to t h e separable network equilibrium problem depicted in Figure 2. We now show t h a t VIk(g,S')
374
Lan Zhao &i Anna.
c
c, - « , P , • V P * " ' )
,• " I P " *
h
i
Nagurney
>
1 -ZP'
Figure 2: Network equilibrium representation of VIk(g, method
S1) induced by t h e projection
can b e solved in closed form. We first provide t h e motivation for t h e equilibration algorithm which will yield t h e exact solution to VIk(g,S'), and then its s t a t e m e n t . Let t h e components of g(p, pk_1) be given by
g.&p"-1) = -zi(pk~l) + -cute - p?-1), p
• = 1,2,.... /,
(41)
and define h,{pk-')
= -Zi{pk-')
- -*,pk-\ P
i = 1,2,..., I
Then gi&P1"1)
= -<*iPi + A,-(p t_1 ),
t = 1 , 2 , . . . , /.
(42)
If pk is a solution of VIk(g, S') (that is, p is t h e corresponding equilibrium of t h e network depicted in Figure 2), then we have ndPk,Pk
1
)=9n2(p\pk
= 9n,(pk,Pk-1)
= *
p£, > 0 , k
p =0,
i= j = s+
l,...,s l,...,l.
(43)
A Network Formalism for Pure Exchange Economic Equilibria
375
Substituting (42) into (43), we obtain A = -anipk P
+ hni,
i=
Pn. = — ( A - V ) ,
l,...,s,
1= 1,...,-.
(44)
Summing (44) over i yields k _ \„v-
:
-^phn,
E^ = vEr--E^-
(45)
Since pk € 5', p n j = 0, for all j > s, (45) yields
P 2-.=i a „. Hence, the solution p* is given by Pn, =
(A-/»..(),
P^. = 0 ,
j =
s
i = l,...,s + l,...,Z,
where A is determined through (46). In order to find A, we must know the critical index s. procedure for finding the critical index s. ^,From (43), we obtain S n , ( / y _ 1 ) = A,
if
(47) Below we describe a
p*. > 0
which implies A„,(p'=- 1 )
if p ^ > 0 ,
gitfy-1)^*,
(48)
if vi, = o
which implies MP*_1)>A> Hence, s is an index such that 1+
^
Pn. = 0 .
(49)
<*».+.-
(50)
PT.L-1^
A= — ™ — j - ^
376
Lan Zhao & A n n a
Nagurney
We are now ready to state t h e following algorithm for solving subproblem VIk(g, where g(-) is specified by (35) and (40).
S1)
A l g o r i t h m P M N (equilibration algorithm for Projection Method Network subproblems) S t e p 0: S o r t Sort t h e numbers hni;i = 1 , 2 , . . . , / , in nondescending order, and relabel t h e m accordingly. Assume, henceforth, t h a t they are relabeled. Also, define A; +1 = oo. Set L := 1. S t e p 1: C o m p u t a t i o n Compute A L =
i + p£f=1^ L
pT.Ut
1
S t e p 2: E v a l u a t i o n If hi < \ L < hi+1, and go to Step 1.
let s = L, A = XL, and go to Step 3; otherwise, set L : = L + l ,
Step 3: U p d a t e Set p? = —(A - A,-), a; P*=0,
: = l,2,...,a
j = s + l,s +
2,...,l.
T h e algorithm converges in a finite number of steps (cf. Dafermos and Sparrow [12]). b. T h e Relaxation M e t h o d T h e relaxation m e t h o d corresponds to t h e choice ff;(p>) = - 2 . ( < 7 i , - - - , q i - u P i , q i + i , • • • , ? ( ) '
Vi = i , 2 , . . . , ; .
(51)
In this case properties (i) and (ii) are also satisfied. In fact, (i) g{p,p) = - z ( p ) , - p -
(ii) Vpg{p,q)
•••
0
is a diagonal matrix.
=
0
••• -p , dpi
J
By recalling the properties of the aggregate excess d e m a n d function z(p), deduces t h a t it is reasonable to assume t h a t | ^ < 0 ,
Vz= 1,2,...,/.
one
(52)
A Network
Formalism
for Pure Exchange
Economic
Equilibria
377
Hence, V p j ( p , q) is positive definite. Furthermore, 0
^a
aP2
fa V,s(p,g) =
dp, dz2
0
opi
Bz,
£ and V p fl
2
8;n /• dp2^
0 |a(_|a)
3p
i
2
{pi,qi)Vqg(p2,q2)Vpg
5
(P3,9a)
dz, \~2 I dpi ' ^
(-|^) *
dzi y dp2 '
0
(53)
(-! a -r 5 (-! a r
! ^
opi '
dp; >
^
We now state: T h e o r e m 4 . 2 Lei
and assume
dzr -~— dpT
. , = mm{•
dzi. —-} dpi
(54)
that dzT > „ dzi dzk 2 dzT2 "a—
Then condition
(11) of Theorem
Vz = 1 , 2 , . . . , / .
(55)
3.1 holds.
P r o o f : Introduce t h e norm |i|oo = max;{|a;;|} in t h e Euclidean space, which leads to t h e n o r m || • H^, for any operator Q: sup \Qx\,
(56)
We use t h e norm
M? = IGMoo
(57)
in t h e proof of T h e o r e m 3.1. T h e n condition (11) in Theorem 3.1 becomes
\\G-Hp\ql)Vqg(p\q2)G-Hp\q3)\U
\\G~Hp\
(58)
Lan Zhao k. Anna
378
<max{^|—-|(-—-) dpi k^i opk
dzT'1
,, <
- — ) "PT
(-TT-) dph dzk
} dzT
\~K
\~K
m a x { ^ | — - (-•—
Nagurney
^Q^
(—TT-) °PT
}•
(59)
' t*i dp* "P* By virtue of assumption (55), t h e righthand side of (59) is strictly less t h a n 1, t h a t is, condition (11) holds. R e m a r k 3 : Note t h a t (—g 31 ) 2 (—f 2 1 1 ) 5 < 1, a n d , hence, (55) is a diagonal dominance condition which has been imposed in t h e literature to ensure t h e global stability of t h e t a t o n n e m e n t process (see, e.g., Cornwell [4]). Recalling t h a t V p g r (p, q) is diagonal and positive definite, and observing t h a t the diagonal elements — -^ depend only on p;, we see t h a t VI (g, S1) is equivalent to the separable strictly convex m a t h e m a t i c a l programming problem fP
mF(p) = mm{ s^p*-1) mm s< v pes' Jo pes' *' pes'
^•{-Ef^'.-.pi:'*^ pE-V
1
f
dp}
4
M
«
(so)
- _ . JO
which can be solved, in general, by any efficient m a t h e m a t i c a l programming algorithm. Next we design an algorithm for solving VI (g, Sl), where (•) is now specified by (51). T h e algorithm exploits t h e special network structure of t h e problem. For a graphical depiction, see Figure 3. A l g o r i t h m R M N (equilibration algorithm for Relaxation Method Network subproblems) S t e p 0: I n i t i a l i z a t i o n Start with the feasible point p * _ 1 G 5 ' obtained by solving VIk~1(g,Sl) n = k — 1.
and let
S t e p 1: S e l e c t i o n Select m and s such t h a t gm(p\ph-1)=m&x{g,(pn,pk-1)}, \',p">o}
or (nk~1 mVP\
-z z
=
nk~1 n" n*_1 n*_M I • • • ; P m - l > P m ; P m + l i • • • iPl )
max A-zt(pk~\
.. .,pk:},p?,pk;},...
{•.p">°}
gs(pn,pk-i)=mm{gt(pn,pk-i)},
,pk-')},
(61)
A Network Formalism for Pure Exchange Economic Equilibria
379
« -ZP1;
Figure 3: Network equilibrium representation of VIk(g, S') induced by the relaxation method
- 2 .(p{- 1 ,...,p;: 1 1 , P ;,P^ 1 1 ,...,pf- 1 ) = min{-Z,(rf-1,...,p?_-11,p?,pf+-11,...,pf-1)}.
(62)
If \gm{pn>pk~1)— ^SCP") P*_1)l < e, for e > 0 a preset convergence tolerance, then stop. The current pn is a solution of VI (g,Sl). Otherwise, go to Step 2. Step 2: Equilibration Equilibrate gm and gs by solving the following one-dimensional mathematical programming problem for 8: min \zm(p\~l,...,
Pkm~\ ,Pl-6,
piT + \, • • • , Pf" 1 ) I (63)
subject to 0 < 6 < p^. Suppose that 6" is the solution of the above minimization problem. Let P?+1=P?, ,n+l
P.
Vi^m,s, ^n Ps + SU
(64)
380
Lan Zhao k. A n n a
Nagurney
Table 1: P a r a m e t e r s for a 4 commodity, 2 consumer economy
a),w) i= 1 i =2
i =i 0.1, 20.0 0.2, 10.0
j=2 0.5, 30.0 0.4, 30.0
J =3 0.1, 6.0 0.2, 16.0
i=4 0.3, 2.0 0.2, 2.0
and go back to Step 1 with n = n + 1. T h e sequence {p"} thus obtained converges to t h e solution of VI can be seen by t h e fact t h a t F(p»+1) < F(pn)
(g,Sl),
which (65)
where F(-) is t h e objective function of (60). We conclude this section by pointing out t h e economic meaning of t h e convergence condition (36) of the projection m e t h o d and convergence condition (55) of the relaxation method: If the price of a commodity is a decreasing function of t h e demand for this c o m m o d i t y and is affected principally by the d e m a n d for this commodity, then conditions (36) and (55) can be expected to hold.
5
Numerical Examples
In this section, we illustrate the performance of the projection method and the relaxation method on several numerical examples. The aggregate excess demand functions in the economies are derived from Cobb-Douglas utility functions and are of the form:
nTW'
a'
m
^•(P) = E ^ — T ( 7 ) - E « ' j .
; = !.•••.'.
(66)
where W is the vector with components {w\,... ,w}}. For completeness and ease of reproducibility, we give the d a t a for t h e examples below. E x a m p l e 1: T h e r e are 4 commodities and 2 consumers in this economy. T h e coefficients aland W'J are given in Table 1. E x a m p l e 2: T h e second example is taken from Eaves [13] and t h e d a t a are given in Table 2. In this economy there are eight commodities and five consumers. E x a m p l e 3: T h e d a t a for this example, consisting of 15 commodities and 4 consumers in the economy, are reported in Table 3. E x a m p l e 4:
A Network Formalism for Pure Exchange Economic Equilibria
Table 2: Parameters for an 8 commodity, 5 consumer economy a),w) 3 i i J J i 3 J
=1 =2 = 3 =4 =5 =6 =1 =8
i=2 i=1 i= 3 i=4 0.3,3.0 0.0,0.0 0.0,0.0 0.0,0.0 0.0,0.0 0.0,15. 1.0,0.0 0.0,0.0 .13,0.0 0.0,0.0 0.0,0.0 0.0,5.0 0.0,3.0 0.0,0.0 0.0,0.0 .73,4.0 0.0,3.0 1.0,2.0 0.0,3.0 0.0,0.0 0.0,5.0 1.0,0.0 0.0,0.0 0.0,0.0 .38,2.0 1.0,0.0 0.0,0.0 0.0,4.0 .19,0.0 1.0,0.0 0.0,0.0 .27,4.0
i =5 0.0,4.0 0.0,0.0 0.0,0.0 .47,13.0 0.0,0.0 .11,0.0 .05,6.0 .37,6.0
Table 3: Parameters for a 15 commodity, 4 consumer economy a),w) i = i 3 =2 J' = 3 J =4
i =1 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 j =5 .06, 10.0 J = 6 .06, 12.0 i = 7 .03, 14.0 i = 8 .01, 16.0 i = 9 .05, 14.0 i = io .05, 12.0 i = n .20, 10.0 J = 12 .30, 8.0 J = 13 .02, 4.0 J = 14 .04, 2.0 J = 15 .04, 2.0
i=2 .02, 10.0 .30, 1.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .05, 2.0 .02, 4.0 .02, 5.0 .02, 6.0 .02, 7.0
i= 3 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .01, 3.0 0.0, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 12.0 .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0
i =4 .20, 10.0 .00, 5.0 .01, 7.0 .02, 9.0 0.0, 1.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .02, 5.0 .02, 7.0 .02, 7.0 .02, 6.0 .04, 5.0 .06, 10.0
382
Lan Zhao & Anna
Nagurney
Table 4: P a r a m e t e r s for a 20 commodity, 4 consumer economy
a),w) J=l i=2 i=3 j =4 i=5 i=6 j=7
i =8 J =9 J = 10 J = 11 J = 12 i = 13 j = 14 J = 15 i = 16 j = 17 j = 18 J = 19 i = 20
t =1 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .06, 12.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0
t=2 .20, 10.0 .30, 8.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .05, 2.0 .02, 8.0 .03, 6.0 .04, 8.0 .06, 10.0
i=3 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .06, 10.0 .06, 12.0 .03, 14.0 .01, 16.0 .01, 14.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 12.0 .20, 12.0 .30, 4.0 .02, 4.0 .04, 2.0 .04, 2.0
z= 4 .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .06, 10.0
T h e fourth example consists of 20 commodities and 4 consumers with the coefficients given in Table 4. E x a m p l e 5: T h e fifth example consists of 25 commodities and 4 consumers in the economy and t h e coefficients are given in Table 5. Both t h e projection m e t h o d and t h e relaxation m e t h o d were coded in F O R T R A N . T h e projection m e t h o d was embedded with P M N and the relaxation method with R M N . T h e golden section method was used to solve the one variable minimization problem encountered in R M N . In the projection method we chose t h e m a t r i x G = {f^, i = 1 , . . . ,/}|p° and p = .8 for / = 4 , m = 2, p = .5 for / = 8 , m = 5, p = .5 for / = 15, m = 4, p = .1 for / = 20, m = 4, and p = .5 for / = 25, m = 4. T h e codes were implemented on an IBM 3090 at Brown University, and the F O R T V S compiler was used for compilation. Both algorithms were initialized with p° = ( j , . . . , j ) , and t h e termination criterion was \pk — pk~1\ < 1 0 - 6 . As we can see from Table 6, t h e projection m e t h o d converged faster t h a n t h e relaxation m e t h o d , even though the
A Network Formalism for Pure Exchange Economic Equilibria
Table 5: Parameters for a 25 commodity, 4 consumer economy a),w)
i= l .01, 3.0 3=1 .02, 5.0 3=2 .02, 6.0 i = 3 .02, 7.0 i = 4 .02, 6.0 i = 5 .02, 5.0 i = 6 3=7 .00, 5.0 .01, 7.0 3=8 j=9 .02, 9.0 i = io .00, 1.0 j = 11 .05, 2.0 J = 12 .02, 4.0 i = 13 .03, 6.0 3 = 14 .04, 8.0 i = 15 .06, 10.0 j = 16 .06, 12.0 J = 17 .30, 14.0 J = 18 .01, 16.0 i = 19 .05, 14.0 3=20 .05, 12.0 .20, 10.0 3=21 3=22 .30, 8.0 J = 2 3 .02, 4.0 J = 2 4 .04, 2.0 ;=25 .04, 2.0
t' = 2 .20, 10.0 .30, 8.0 .01, 3.0 .02, 5.0 .02, 6.0 .02, 7.0 .02, 6.0 .02, 5.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .05, 2.0 .02, 4.0 .02, 5.0 .04, 6.0 .02, 7.0
i= 3 .02, 6.0 .02, 5.0 .03, 6.0 .04, 8.0 .06, 10.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .06, 10.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 12.0 .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0
i =A .20, 10.0 .30, 8.0 .02, 4.0 .04, 2.0 .04, 2.0 .06, 12.0 .03, 14.0 .01, 16.0 .05, 14.0 .05, 12.0 .01, 3.0 .00, 5.0 .01, 7.0 .02, 9.0 .00, 1.0 .05, 2.0 .02, 4.0 .03, 6.0 .04, 8.0 .02, 5.0 .02, 6.0 .02, 7.0 .02, 6.0 .04, 5.0 .06, 10.0
384
Lan Zhao & A n n a
Nagurney
Table 6: Numerical Results for t h e Projection and Relaxation Methods Example Number
1 2 3 4 5
N u m b e r of Iterations
Relaxation 11 28 5 4 4
Projection 18 117 80 91 64
C P U T i m e (seconds)
Relaxation .20 1.64 1.02 1.97 3.49
Projection .05 .20 .11 .36 .38
number of iterations in the projection m e t h o d is larger than t h e n u m b e r of iterations in the relaxation m e t h o d , for the same degree of accuracy. This is most likely due to t h e fact t h a t t h e projection m e t h o d solves the network subproblems VIk{g,Sl) in closed form. As t h e scale of the economy becomes larger, the advantage of the projection m e t h o d becomes more significant.
6
Summary and Conclusions
In this paper we have developed a network formalism for the study of a class of general economic equilibrium problems - t h a t of pure exchange or Walrasian price equilibrium problems. We first established t h a t the Walrasian price equilibrium problem is isomorphic to a network equilibrium problem with special s t r u c t u r e and, hence, t h e corresponding variational inequality formulations are one and the same. We then turned to the computation of the equilibrium p a t t e r n s . We proposed a general iterative scheme for t h e computation of t h e Walrasian price equilibrium, which was then shown to induce the projection m e t h o d and the relaxation m e t h o d , as special cases. In particular, t h e projection m e t h o d resolves t h e network equilibrium problem into linear symmetric network equilibrium problems, for which an equilibration algorithm termed P M N was then proposed. T h e relaxation m e t h o d , on the other h a n d , resolves the problem into separable nonlinear problems, for which an equilibration algorithm named RMN was developed. Finally, we presented numerical results for five examples which demonstrated that t h e projection m e t h o d in combination with P M N consistently outperformed t h e relaxation m e t h o d in combination with R M N . This is due, in p a r t , to the simplicity of the network equilibrium subproblems which were then solved in closed form. As the scale of the problems increased, t h e relative efficiency of t h e projection m e t h o d vis a vis the relaxation m e t h o d also increased, suggesting t h a t the full exploitation of the underlying network structure of this class of general economic equilibrium problems will enable the computation of large-scale problems in practice.
A Network
Formalism
for Pure Exchange
Economic
Equilibria
385
References [I] H. Z. Aashtiani and T. L. Magnanti, Equilibria on a congested transportation network SIAM Journal on Algebraic and Discrete Methods 2 (1981) 213-226. [2] M. Beckmann, C. B. McGuire, and C. B. Winsten, Studies in the Economics Transportation (Yale University Press, New Haven, Connecticut, 1956).
of
[3] D. P. Bertsekas and E. Gafni, Projection m e t h o d s for variational inequalities and application t o t h e traffic assignment problem, Mathematical Programming Study 1 7 (1982) 139-159. [4] R. Cornwell, Introduction to the Use of General Holland, A m s t e r d a m , T h e Netherlands, 1984).
Equilibrium
Analysis
(North-
[5] S. Dafermos, Traffic equilibrium and variational inequalities, Transportation ence 14 (1980) 42-54. [6] S. Dafermos, T h e general multimodal traffic equilibrium problem, Networks (1982) 57-72. [7] S. Dafermos, An iterative scheme for variational inequalities, Mathematical gramming 15 (1983) 40-47.
Sci-
12
Pro-
[8] S. Dafermos, Isomorphic multiclass spatial price and multimodal traffic network equilibrium models, Regional Science and Urban Economics 16 (1986) 197-209. [9] S. Dafermos, Exchange price equilibria and variational inequalities, Programming 4 6 (1990) 391-402.
Mathematical
[10] S. Dafermos and A. Nagurney, Sensitivity analysis for t h e general spatial economic equilibrium problem, Operations Research 3 2 (1984) 1069-1086. [II] S. Dafermos and A. Nagurney, Oligopolistic and competitive behavior of spatially separated m a r k e t s , Regional Science and Urban Economics 17 (1987) 245-254. [12] S. C. Dafermos and F. T. Sparrow, T h e traffic assignment problem for a general network, Journal of Research of the National Bureau of Standards 7 3 B (1969) 91-118. [13] B . C. Eaves, W h e r e solving for stationary points by L C P s is mixing Newton iterates, in Homotopy Methods and Global Convergence, B. C. Eaves, F . J. Gould, H. O. Peitgen, and M. J. Todd, editors (Plenum Press, New York, 1983) pp. 63-78. [14] M. Florian and M. Los, A new look at static spatial price equilibrium models, Regional Science and Urban Economics 12 (1982) 579-597.
386
Lan Zhao & Anna Nagurney
[15] M. Florian and H. Spiess, The convergence of diagonalization algorithms for asymmetric network equilibrium problems, Transportation Research 16B (1982) 477-483. [16] A. Nagurney, Migration equilibrium and variational inequalities, Economics Letters 31 (1989) 109-112. [17] A. Nagurney and D. S. Kim, Parallel computation of large-scale dynamic market network equilibria via time period decomposition, Mathematical and Computer Modelling 15 (1991) 55- 67. [18] A. Nagurney, J. Pan, and L. Zhao, Human migration networks, European Journal of Operational Research 59 (1992) 262-274. [19] A. Nagurney and L. Zhao, A network equilibrium formulation of market disequilibrium and variational inequalities, Networks 21 (1991) 102-132. [20] H. Scarf (with T. Hansen), Computation of Economic Equilibria (Yale University Press, New Haven, Connecticut, 1973).
Steiner Problem in Multistage Computer Networks

Sourav Bhattacharya    Bhaskar Dasgupta¹
Computer Science Department, University of Minnesota, Minneapolis, MN 55455
Abstract
Multistage computer networks are popular in parallel architectures and communication applications. We consider the message communication problem for the two types of multistage networks: one popular for parallel architectures and the other popular for communication networks. A subset of the problem can be equated to the Steiner tree problem for multistage graphs. Inherent complexities of the problem is shown and polynomial-time heuristics are developed. Performance of these heuristics is evaluated using analytical as well as simulation results.
1 Introduction
Multistage interconnection networks (MINs) are popular among parallel architecture and/or communication network topologies. An N x log2(N) element MIN consists of log2(N) stages of N elements each. A common pictorial view of an N x log2(N) MIN is to collect N elements in a stage (vertically) and arrange log2(N) + 1 such stages horizontally, one after the other. MINs offer a good balance between network cost and performance. They are often characterized as intermediate-cost {O(N log2(N))} networks falling between the two extreme cases: fully connected {O(N^2) cost} and bus connected {O(N) cost} networks. Architectural and other topological properties of MINs may be found in [8].

¹Supported in part by NSF grant CCR-9208913.
388
S. Bhattacharya
1.1
and B.
Dasgupta
T w o Versions of M I N s
Let Sij denote t h e z'-th stage j ' - t h row element in an N x log^N MIN, 0 < i < log2N,0 < j < N — 1. We consider source-to-source wrap-around MINs only, i.e., when V; : SQJ = Siog:2N,j- These networks can allow multiple passes using t h e wraparound connections. Depending on t h e role of intermediate stage elements, two types of MINs are possible as outlined below: • Intermediate stages as switches only: This t y p e is popular in parallel architecture applications. Here t h e source end (leftmost) and t h e destination end (rightmost stage) constitute of processors, while t h e intermediate elements are bare switches which interconnect various sources and destinations. Such MINs are of commercial usage in parallel processors, e.g., t h e BBN Butterfly machine. We refer to MINs of this t y p e as type-1 MIN. • Intermediate stages as processors: This t y p e is common in communication network applications. Here t h e intermediate stage elements are identical to t h e source or destination stage processors, i.e., they can have their own message traffic. Example of such MINs can be found in [10]. We refer to MINs of this t y p e as type-S MIN.
1.2
C o m m u n i c a t i o n in M I N s
Depending on the n u m b e r of destinations involved in a communication in MIN, three types can be classified: one-to-one, one-to-many and one-to-all. These are commonly known as routing, multicast and broadcast. In this article we focus ourselves to the multicast problem for MINs. Note t h a t routing k, broadcast are two special instances of multicast and do not offer any opportunity for traffic reduction. T h e multicast problem specifies a source node and a set of k destination nodes. W i t h o u t loss of generality we assume t h e source node to be So,o- Destination nodes are spread over t h e MIN, l < k < N ( k = l = routing, k = N = broadcast). Objective of t h e multicast problem is to transmit t h e message from the source node to t h e destination nodes.
Flow-control Mechanism For multihop networks, various form of switching and flow-control mechanisms have evolved. Store and forward is a traditional approach to message communication. Virtual cut-through, wormhole, deflection routing etc. have been subsequently proposed. A survey can be found in [11, 4]. We assume packetized message communication, where packets are independently flown through t h e network. Our focus is to estimate (and possibly reduce) the overall traffic overhead in message communications.
Steiner Problem
1.3
in Multistage
Computer
Networks
389
Optimality Criteria in MIN Multicast
Two possible criteria to measure the optimality of MIN multicast communication are to minimize one of t h e following two objective functions: • t h e total traffic generated in t h e network ( each occupied link of t h e network counts as one unit of traffic. • the hops-distance between t h e source node and any destination node. T h e traffic metric makes t h e problem equivalent to t h e Steiner problem for MIN, while t h e time metric is a different dimension altogether. These two metrics work in the dual sense. Reducing one increases the other and vice versa. T h u s , we focus on the traffic metric only. Considerations along t h e time metric is an open problem.
2
Multistage Interconnection Networks
We consider type 1 MINs with t h e cube network topology. These class of networks (e.g., baseline, delta, generalized cube, indirect binary-cube, omega, banyan [8]) have been proposed as fixed-degree alternative to hypercube architecture. T h e y are popular in switching and communication applications. They can also emulate t h e performance of hypercube in most applications (e.g., t h e CCC architecture [12]). Let MINd denote a d dimensional generalized MIN.
2.1
Formulation of the Traffic Reduction Problem.
We consider multicasting on MINd which are unique path networks. Given a set of k multicast destinations (Di, 1 < i < k) and a source node S in MINd, t h e p a t h from S to any particular D{ is fixed. However, it is clear t h a t for a given set of multicast destinations, t h e total traffic generated in MINd depends on t h e relative order in which d different dimensions are arranged. This leads to our problem formulation as (see Section 2.1.1 for practical applicability): Given a set of destination nodes, traffic optimum multicasting in MINd is to find a permutation of the d dimensions (each stage of MINd is allocated to one particular dimension value) so that the total traffic is minimized. Unfortunately, this problem is NP-complete as shown by the next theorem. Hence, we need to investigate t h e possibility of designing efficient heuristics for this problem. T h e o r e m 2.1 The traffic optimum
multicasting
problem is
NP-complete.
Proof sketch: T h e problem is obviously in NP. To show NP-hardness one can reduce the space minimized full trie problem, which is shown to be NP-complete in [3, 6], to this problem. Details are available in [1]. •
390 2.1.1
S. Bhattacharya
and B.
Dasgupta
Design Issues
Any hardware implementation of a MINj would assume an ordering among t h e d dimensions. In such cases, online dimension ordering (as required by t h e traffic reduction criterion in this paper) in a MINj may be argued from t h e practical viewpoint. We identify t h e following situations as practical applications. (1) Communication networks often use MINs. Traditional hardware implementation of switches at every intermediate stages have been replaced using Wave-Time Division Multiplexors ( W T D M ) over passive stars [5]. T h e actual interconnection is formed by wavelength (frequency) or time-slot assignment of different nodes, i.e., by firmware control. A firmware controlled design can be changed without changing t h e underlying hardware. T h u s , it is possible to re-order t h e dimensions in a MINd dynamically. Every stage may have to configure to at most d possible dimensions, for which t h e w a v e / t i m e assignments can be pre-computed and stored. (2) If t h e traffic p a t t e r n is known and repetitive (as m a y happen in periodically occurring similar message communications) then from t h e above o p t i m u m dimensional ordering for each multicasting instance one can derive the most common p a t t e r n and design t h e MINj using t h e corresponding o p t i m u m dimensional ordering. T h e idea here is to achieve traffic optimality for most multicasting instances which leads to an overall traffic reduction. (3) Hierarchical hypercubes are designed for several practical reasons [9]. Such hierarchical designs limit the availability of different dimensions at any node. Only a certain set of dimensions can be availed at each node. This imposes a hierarchy among dimensions in a r o u t i n g / multicasting operation. In some other cases, even with complete hypercubes r o u t i n g / multicasting is done in hierarchical fashion, imposing a (arbitrary) desired ordering among dimensions [4]. W i t h these applications our results and optimality ordering among dimensions can be used as a measure whether or not a particular multicast operation is generating optimal traffic. Note t h a t a hypercube with hierarchically ordered dimensions can be treated as a MINd for analysis purpose and results from t h e latter can be used for t h e former.
2.2 Greedy Heuristic

Let Reach_p denote the number of nodes which have received a copy of the message at stage p. Let k_i be the dimension between stage p and stage p+1. We define an expansion ratio F_{k_i} = Reach_{p+1} / Reach_p. Intuitively, this fraction F_{k_i} indicates how much the multicast expands at every stage. This expansion ratio depends on the dimension, on the stage position, and on the set of all dimensions served in prior stages. For the sake of brevity we treat all this previous information as part of the stage information and denote it compactly using the stage position number. Now, the total traffic equals
Σ_{p=1}^{d} Reach_p = F_{k_1} × [1 + F_{k_2} × [1 + ... + F_{k_{d-1}} × [1 + F_{k_d}] ... ]]
Our objective is to arrive at values of k_1, k_2, ..., k_d such that the above expression is minimized. Note that F_{k_i} (1 ≤ i ≤ d) varies between 1 and 2 and takes real values. We propose the following greedy algorithm, which works stage by stage and selects one dimension in each stage. After d iterations of the algorithm the complete permutation of the d dimensions of MIN_d is generated.

Greedy heuristic: At each stage select the dimension k_i such that F_{k_j} ≥ F_{k_i} for all j. In case of a tie, any one of the dimensions with the smallest F_{k_i} may be chosen. A particular dimension k_i used in a preceding stage may not be repeated in a subsequent stage.

Regarding the time and space complexities of the above heuristic, it is easy to prove the following theorem.

Theorem 2.2 The greedy heuristic runs in O(k·d^4) time, performs O(k·d^3) bitwise arithmetic operations and uses O(k·d) space.
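To make the stage-by-stage selection concrete, the following Python sketch (our own illustration, not the authors' implementation) assumes the source is node 0, that destinations are given as d-bit integers, and that Reach_p can be computed as the number of distinct projections of the destination addresses onto the dimensions served so far; it also compares the greedy ordering against exhaustive enumeration over all d! orderings.

    from itertools import permutations

    def reach(dests, dims):
        """Reach after serving the dimensions in `dims`: the number of distinct
        projections of the destination addresses onto those bit positions."""
        mask = sum(1 << k for k in dims)
        return len({d & mask for d in dests})

    def total_traffic(dests, order):
        """Total traffic = sum of Reach_p over the d stages for a given ordering."""
        served, traffic = [], 0
        for k in order:
            served.append(k)
            traffic += reach(dests, served)
        return traffic

    def greedy_order(dests, d):
        """Greedy heuristic: at each stage pick an unused dimension with the
        smallest expansion ratio F_k = Reach(served + [k]) / Reach(served)."""
        served, remaining = [], set(range(d))
        while remaining:
            base = reach(dests, served)          # equals 1 while nothing is served
            k_best = min(remaining, key=lambda k: reach(dests, served + [k]) / base)
            served.append(k_best)
            remaining.remove(k_best)
        return served

    if __name__ == "__main__":
        d = 4
        dests = [0b0011, 0b0111, 0b1011, 0b1111]   # a 2-subcube varying in bits 2 and 3
        g = greedy_order(dests, d)
        best = min(permutations(range(d)), key=lambda p: total_traffic(dests, p))
        print("greedy :", g, total_traffic(dests, g))
        print("optimum:", list(best), total_traffic(dests, list(best)))

On this small example the greedy ordering matches the exhaustive optimum, which is consistent with Theorem 2.4 below, since the chosen destinations form a complete 2-subcube.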
The following theorem, whose proof can be found in [1], shows local optimality of the dimension ordering produced by the greedy heuristic.

Theorem 2.3 The greedy heuristic leads to a locally optimal ordering of the dimensions, i.e., it orders the dimensions so that no adjacent pairwise interchange of the dimensions can reduce the total traffic.

Table 1 compares the time and space complexities of the greedy heuristic with two other known exponential-time optimum strategies [1].

Algorithm              Time (in bit operations)   Space (in bits)
Direct Permutation     O(k·d^3·d!)                O(k·d)
Dynamic Programming    O(k·d^2·2^d)               O(k·d + 2^d·d)
Heuristic              O(k·d^4)                   O(k·d)

Table 1: Multicast in MIN_d with k destinations: summary of time and space complexity of different solutions.
2.3 Performance
This section analytically compares the greedy heuristic with the "randomly ordered dimensions" approach as well as the optimal algorithm. Detailed proofs of all the results are available in [1]. For our analysis we characterize the destination node set into two classes: the "complete subcube multicast" and the "incomplete subcube multicast". Each of these cases is explained below, and the worst (average) case performance of the greedy heuristic is compared with that of the optimal algorithm as well as random dimension ordering. Let the traffic "overhead" comparison between two approaches denote the absolute difference in the traffic generated by those two individual approaches². For example, Overhead(greedy, optimum) = greedy traffic − optimum traffic, and Overhead(random, greedy) = traffic in "random ordering" − greedy traffic.
2.3.1 Complete Subcube Multicast
In this case the set of multicast destinations D_i, 1 ≤ i ≤ k, forms a complete subcube. Thus, k = 2^r for some integer r, and the set {D_i} forms an r-dimensional subcube. Let CSM(r) denote this situation.

Theorem 2.4 The greedy heuristic produces optimum multicast traffic for the complete subcube case. Also, in the worst CSM(r) case,

Overhead(random, optimum) = Overhead(random, greedy) = (d − r) × (2^r − 1).
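As a quick numerical check of the worst-case bound (our own illustration, not taken from the paper): take d = 5 and a complete subcube with r = 3, i.e., k = 2^3 = 8 destinations. The best ordering serves the d − r = 2 dimensions on which all destinations agree first, giving total traffic 1 + 1 + 2 + 4 + 8 = 16; the worst random ordering serves the 3 varying dimensions first, giving 2 + 4 + 8 + 8 + 8 = 30. The difference, 30 − 16 = 14, equals (d − r) × (2^r − 1) = 2 × 7, as stated in Theorem 2.4.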
The next theorem gives the probabilistic traffic overhead of "random dimension ordering" as opposed to the worst-case performance stated above.

Theorem 2.5 In the average CSM(r) case, the random dimension ordering approach incurs a traffic overhead

Overhead(random, optimum) = Overhead(random, greedy) = Σ_p Q_r(p) × (N_p − N_0),

where the quantities N_p and Q_r(p) are derived in [1].

² A random dimension ordering occurs when the d dimensions of MIN_d are randomly ordered.
2.3.2 Incomplete Subcube Multicast
In this case not all of the 2^r destination nodes are present in the set of multicast destinations. Thus, the set of multicast destinations (D_i, 1 ≤ i ≤ k) forms an incomplete r-dimensional subcube, where r is the minimum dimension value needed to include all those destination nodes. Let ISM(r) denote this situation. The following theorems give worst case and average case performance ratios of various strategies. First we consider the case when k is a power of 2.

Theorem 2.6 In the case of incomplete r-subcube multicast with k = 2^j destinations,

Overhead(greedy, optimum) ≤ (r − j) × (k − 1)
Overhead(random, greedy) ≤ (d − 2r + log₂k) × (k − 1)

Since j = log₂k, Overhead(greedy, optimum) ≈ (r − log₂k) × k. For a given r, this value is maximized when k = 2^{r−1}. Thus, the worst case performance degradation suffered by the greedy heuristic equals 2^{r−1}. Next we generalize the value of k to a non-power of 2.

Lemma 2.7 In ISM(r), with k = 2^j + l nodes (0 < l < 2^j − 1), Overhead(greedy, optimum) = A(d, k, r, j) and Overhead(random, greedy) = B(d, k, r, j), where the closed-form expressions for A(d, k, r, j) and B(d, k, r, j) are derived in [1].

The following lemma gives estimates of the average performance; we reuse the notation of the previous lemma for brevity.

Lemma 2.8 In ISM(r), with k = 2^j + l nodes (0 < l < 2^j − 1), on an average case:

Overhead(greedy, optimum) = Σ_p P_{k=2^p}(r) × A(d, k, r, p)
Overhead(random, greedy) = Σ_p P_{k=2^p}(r) × B(d, k, r, p)

where the weights P_{k=2^p}(r) are derived in [1]. The above result for the average case is based on an equal distribution of destination nodes among the given nodes.
2.4 Simulation Performance
Often a MIN_d design implicitly assumes an ascending or descending order among its d dimensions. For example, in a MIN_3 an increasing order would be [0,1,2], while a decreasing order would be [2,1,0]. Such is also the usual practice in hierarchical routing in hypercubes, where dimensions are treated one after another like distinct stages of MIN_d. Clearly, these linearly ordered dimension approaches cannot generate traffic optimal multicasting in a MIN. We compare the performance of the greedy heuristic with the linearly ordered dimensions heuristic approach and demonstrate the advantages. We show, for randomly generated M multicast destinations, how much traffic can be reduced (on average) if the locally optimal greedy dimension ordering approach proposed in this paper is followed, and we present simulation results towards this. Our simulation implemented four situations: the exhaustive optimum traffic generation approach, the greedy approach, and linearly increasing and linearly decreasing orderings (hereafter referred to as 'increasing' and 'decreasing' respectively). The performance of the 'increasing' and 'decreasing' cases was found similar, and hence we report only the 'increasing' case. We consider three different dimension values 4, 5 and 6 (i.e., MIN_4, MIN_5 and MIN_6). The number of multicast destinations is varied as 1%, 2%, 5%, 10%, 20%, 50%, 80%, 90%, 95% and 99% of the total number of nodes in the cube. For each multicast set size 30 random distributions were generated and averages taken to capture the effect of large numbers. Thus, each algorithm is run 300 times with every dimension, leading to a total of 3600 experiments. The proposed greedy algorithm produces optimal multicast traffic in most (≈ 90%) of the cases; thus the greedy heuristic is "almost always" optimum. We present simulation results showing the miss-rate, i.e., the percentage of test runs in which the proposed greedy algorithm does not coincide with the optimum multicast algorithm. Even in cases where the greedy algorithm deviates from the optimum solution, the deviation is found to be small. This section also shows the relationship of optimal traffic cost (T) to the number of multicast destinations (M), and we present simulation results to show the variation of T for different values of M. Let T_O, T_G and T_I denote the traffic generated using the optimum (exhaustively generated), greedy and linearly ordered increasing approaches respectively. Thus, the traffic overheads using the greedy and increasing approaches equal (T_G − T_O) and (T_I − T_O) respectively. Fig. 1a shows the average value of these parameters for MIN_d with d = 4, 5, 6. Let "miss-rate" be defined as the percentage of simulation runs in which a particular heuristic approach (e.g., the greedy approach or the increasing approach) differs from the optimum approach. This percentage shows the frequency with which the heuristic deviates from the optimum solution. A low value of miss-rate indicates that the corresponding heuristic is 'almost always' optimum. Fig. 1b shows the miss-rate of the greedy and increasing heuristics for different dimensions.
Figure 1: Greedy (solid line) and Increasing (dashed line) heuristic performance: a) Traffic overhead (= traffic in heuristic − optimal traffic), b) Miss-rate. (Panels for dimension = 4, 5 and 6; the horizontal axis is the number of destinations as a percentage.)
Figure 2: Optimum traffic (per destination node) variation with different multicast sizes (solid line for dimension=4, dashed line for dimension=5, dotted line for dimension=6): a) traffic scaled by size, b) traffic scaled by dimension and size. (The horizontal axis is the number of destinations as a percentage.)
Figure 3: a) Traffic overhead of the Greedy heuristic, b) Scaled traffic overhead (solid line for dimension=4, dashed for 5, dotted for 6).

Observation 1: The greedy approach misses the optimum solution rarely (e.g., 1 out of 30 cases or at most 2 out of 30 cases). Thus it is an 'almost always optimum' algorithm. The number of mismatches increases as the dimension increases. However, the existing dimension ordering approaches (e.g., increasing or decreasing) have a high miss-rate.

Observation 2: For a small (or large) number of multicast destinations all three heuristics yield optimum (or near-optimum) traffic solutions. This is because, for a small number of destinations, the expansion ratio (refer to Section 2.2) is almost always 1 and, regardless of the actual heuristic used, a near-optimum strategy is effected. Similarly, with a large fraction of nodes as destinations, the expansion ratio is almost always 2 and, regardless of the actual heuristic used, a near-optimum strategy is effected.

Observation 3: The traffic overhead of the greedy approach is superior to that of the 'increasing' approach.

Observation 4: The traffic overhead of the greedy approach increases with dimension (Fig. 3a). This is expected, since at higher dimensions each source-destination multicast involves a larger amount of traffic. At the same time it can also be attributed to an inherent characteristic of the greedy approach, i.e., the greedy approach deviates from optimality more with increasing dimension. These two factors are distinguished by scaling the traffic overhead by the dimension value; the idea is to normalize the traffic overhead using the corresponding dimension value. The scaled traffic overhead (Fig. 3b) using the greedy heuristic for different dimensions also shows that the traffic overhead of the greedy approach increases with dimension.

Fig. 2 shows the optimum traffic load variation for different multicast sizes. The idea is to explore the relationship (if any) between the total amount of traffic (T) and the number of destinations (M) in a multicast. We show the average traffic (i.e., traffic per destination node) reported from our simulations for cube sizes 4, 5 and 6. Then we scale the traffic requirement using the corresponding dimension value. This plot also shows
a similar trend as the absolute traffic plot.

Observation 5: Optimum traffic decreases with increasing multicast destination size. Initially, with small M, every destination requires nearly one unit of traffic per stage. This is because the destination nodes are sparse and, on average, no common message traffic can be shared by two destinations. However, with increasing M this ratio decreases until M reaches 50% of the cube size. Beyond this range of M the value of 'optimum traffic' per multicast destination becomes almost a constant, indicating high availability of destination nodes and the frequent ability to share message links most efficiently.

3 Multistage Communication Networks
We consider type-2 MINs in this section. Among several topologies we choose the shuffle connection and the multistage binary cube connection. Without loss of generality we assume that S_{0,0} is the source node, while the k destination nodes are spread over the (log₂N + 1) stages and N rows. Type-1 MINs are unique-path networks; this required us to re-order dimensions in order to obtain traffic reduction, and the online dimension re-ordering led to practical feasibility questions (addressed in Section 2.1.1) and related issues. However, type-2 MINs are not unique-path networks: they allow multiple paths between the source and any destination. Hence, traffic reduction can be achieved even without any online topological reconfiguration.
Optimality Criterion

Given an N × log₂N type-2 multistage communication network (connected using a particular topology T) with a source node S and k destination nodes D_i (1 ≤ i ≤ k), the objective is to find a path from the source node to each one of the destinations such that one of the following objective functions is minimized:

• Total traffic.

• Time (in hops) between source and each destination.

The first objective equates the problem to the Steiner tree problem for the topology T. The second objective has not been investigated so far.
3.1 Multistage Shuffle Network

We consider a shuffle network with N × log₂N PEs, arranged as log₂N stages of N PEs each. Let such a network be called a log₂N-shuffle. Let PE_{i,j} be the i-th row PE in the j-th stage, 0 ≤ i ≤ N − 1, 0 ≤ j ≤ log₂N − 1. The log₂N-shuffle is a cyclical wrap-around network, with the log₂N-th stage = stage 0. Formally, the binary shuffle connectivity is defined as (N = 2^n) [10]:

PE_{i,j} has outgoing links to PE_{(2i+p) mod N, (j+1) mod n}, for p = 0, 1.

Figure 4: NP-completeness: a) Example 3-stage shuffle network, b) Restricted instance of the shuffle graph (dotted lines indicate zero-cost edges; nodes which are grouped together are tagged with identical integers), c) Equivalence to a 3-cube multicast.
Theorem 3.1 The problem of traffic optimal shuffle multicast tree generation is NP-complete.

Proof sketch: It can be shown that a special case of this problem is the problem of traffic optimal multicast in a hypercube (which is known to be NP-complete [7]), obtained by setting the costs of some edges to zero and thereby identifying some nodes together. Details can be found in [2]. Fig. 4 pictorially depicts the idea. □
3.2 Multistage Cube Network

We consider a multistage cube network with N × log₂N PEs, arranged as log₂N stages of N PEs each. Let such a network be called a log₂N-stage cube. Let PE_{i,j} be the i-th row PE in the j-th stage, 0 ≤ i ≤ N − 1, 0 ≤ j ≤ log₂N − 1. The log₂N-stage cube is a cyclical wrap-around network, with the log₂N-th stage = stage 0. Formally, the multistage binary cube connectivity is defined as (N = 2^n) [8]:

PE_{i,j} has outgoing links to PE_{i, (j+1) mod n} and PE_{i + (1 − 2r)×2^j, (j+1) mod n}, where r is the j-th bit of i.

Note that this multistage cube network is a particular instance of the generalized MIN_d considered in Section 2, where the dimensions increase from left to right. Also, intermediate stage nodes are active PEs here, unlike in the MIN_d of Section 2 (where they are bare switches).
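For concreteness, the two connectivity rules can be coded as small neighbor functions. The sketch below is our own illustration (the function names and the (row, stage) node encoding are assumptions), following the formulas as stated above.

    def shuffle_neighbors(i, j, N):
        """Outgoing links of PE_{i,j} in a log2(N)-shuffle:
        PE_{(2i+p) mod N, (j+1) mod n} for p in {0, 1}."""
        n = N.bit_length() - 1               # N = 2**n
        return [((2 * i + p) % N, (j + 1) % n) for p in (0, 1)]

    def cube_neighbors(i, j, N):
        """Outgoing links of PE_{i,j} in a log2(N)-stage cube:
        the straight link PE_{i,(j+1) mod n} and the link that flips bit j of i."""
        n = N.bit_length() - 1
        r = (i >> j) & 1                     # j-th bit of i
        return [(i, (j + 1) % n), (i + (1 - 2 * r) * 2 ** j, (j + 1) % n)]

    if __name__ == "__main__":
        N = 8                                # 3-stage networks (n = 3)
        print(shuffle_neighbors(5, 1, N))    # [(2, 2), (3, 2)]
        print(cube_neighbors(5, 1, N))       # [(5, 2), (7, 2)]  (bit 1 of 5 flipped)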
The proof of the following theorem is essentially similar to that of Theorem 3.1; details can be found in [2].

Theorem 3.2 The optimal traffic multicast problem is NP-complete.

3.3 Greedy Heuristic

We developed a greedy heuristic for type-2 MINs. This heuristic is applied to both the type-2 shuffle MIN and the type-2 multistage cube MIN. We describe this greedy heuristic first and then demonstrate its performance using simulation results. The greedy heuristic is an iterative process selecting one node in each iteration. Every time a node is selected it is included in a set D (representing the set of nodes which have received a copy of the message so far). Initially the set D includes only S, i.e., the source node. The algorithm stops when D_i ∈ D for all i, 1 ≤ i ≤ k. Let DL denote the set of k destinations, and let H(a, b) equal the shortest distance between nodes a and b. Let Dist(D, a) be a function indicating the shortest distance from any node in D to a. Each step of the greedy iteration chooses a node n as in Fig. 5:

define Dist(D, a) = MIN { H(D_j, a) : D_j ∈ D }
∀n, (n ∉ D) ∧ (∃D_j ∈ D : H(D_j, n) = 1) do
    count(n) = |{d_k ∈ DL : Dist(D ∪ {n}, d_k) < Dist(D, d_k)}|
select n such that count(n) is maximum;

Figure 5: Greedy algorithm for type-2 MINs: selection of the next-step message recipient node.
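The selection rule of Fig. 5 can be sketched in Python as follows (our own illustration; the helper names, the representation of the network as a neighbor function, and the BFS computation of H are assumptions rather than the authors' code). In a log₂N-shuffle or log₂N-stage cube, the neighbor function would be the corresponding connectivity rule from Sections 3.1 and 3.2.

    from collections import deque

    def hop_distance(src, dst, neighbors):
        """H(src, dst): shortest hop count from src to dst, by BFS over the directed graph."""
        if src == dst:
            return 0
        seen, frontier = {src}, deque([(src, 0)])
        while frontier:
            node, dist = frontier.popleft()
            for nxt in neighbors(node):
                if nxt == dst:
                    return dist + 1
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, dist + 1))
        return float("inf")

    def greedy_multicast(source, destinations, neighbors):
        """Grow the covered set D one node at a time, as in Fig. 5: among nodes one hop
        away from D, add the node whose inclusion lowers Dist(D, d) for the most d in DL.
        Assumes every destination is reachable from the source."""
        D, DL = {source}, set(destinations)

        def dist_to(cover, d):                  # Dist(cover, d) = min over a in cover of H(a, d)
            return min(hop_distance(a, d, neighbors) for a in cover)

        while not DL <= D:
            candidates = {n for a in D for n in neighbors(a)} - D
            def gain(n):
                return sum(dist_to(D | {n}, d) < dist_to(D, d) for d in DL)
            D.add(max(candidates, key=gain))    # ties broken arbitrarily
        return D

    if __name__ == "__main__":
        # Tiny hand-built directed graph, just to exercise the selection rule.
        adj = {0: [1, 2], 1: [3], 2: [3], 3: [0]}
        print(sorted(greedy_multicast(0, [2, 3], lambda v: adj[v])))   # [0, 2, 3]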
3.4 Simulation Performance

We simulated the greedy algorithm in the multistage shuffle MIN as well as in the multistage cube MIN. Its performance is compared with the exhaustively generated optimum algorithm in the respective architectures. The number of destinations is varied as 1%, 2%, 5%, 10%, 20%, 50%, 80%, 90%, 95% and 99% of the total system size. In each case 50 random sets of destinations were generated; both the greedy and optimum algorithms are run and their results compared. Two metrics are used to characterize the performance of the greedy heuristic: the number of misses and the average overhead. The former denotes how often (out of the 50 runs) the greedy algorithm fails to produce the optimal result, while the latter indicates the average deviation of the greedy result when a miss occurs. A low miss rate indicates that the greedy algorithm is almost always optimum, while a low average overhead indicates that the greedy algorithm almost always produces a near-optimum solution. Let T_G (T_O) denote the traffic produced by the greedy (optimum) algorithm.
• Number of misses = |{runs : T_G ≠ T_O}|.

• Average overhead = (Σ T_G / T_O) / (number of misses), the sum taken over the runs where T_G ≠ T_O.
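Read this way, the two metrics can be computed as in the short sketch below (our own illustration; it assumes the average is taken only over the runs in which T_G ≠ T_O, as the definition above suggests).

    def performance_metrics(runs):
        """runs: list of (T_greedy, T_optimum) pairs, one pair per random destination set."""
        misses = [(tg, to) for tg, to in runs if tg != to]
        number_of_misses = len(misses)
        average_overhead = (sum(tg / to for tg, to in misses) / number_of_misses
                            if misses else 1.0)
        return number_of_misses, average_overhead

    # Example: performance_metrics([(12, 12), (15, 13), (9, 9)]) -> (1, 1.1538...)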
Table 2 shows the performance of the greedy algorithm in the multistage shuffle MIN, while Table 3 shows the same for the multistage cube MIN. As can be observed from these two tables, the greedy algorithm has a low miss rate (particularly for the multistage cube MIN) and low traffic overhead.
Destinations (fraction of system size):   0.01   0.02   0.05   0.1    0.2    0.5    0.8    0.9    0.95   0.99

3-stage Shuffle
  Number of misses:                        1      3      6      8      11     24     16     14     12     –
  Average overhead:                        1.01   1.03   1.23   1.4    1.52   1.6    1.56   1.31   1.19   –

4-stage Shuffle
  Number of misses:                        1      –      9      11     16     32     27     19     –      –
  Average overhead:                        1.12   –      1.33   1.47   1.63   1.62   1.58   1.39   –      –

5-stage Shuffle
  Number of misses:                        2      –      10     14     21     37     29     23     –      –
  Average overhead:                        1.14   –      1.41   1.53   1.71   2.09   1.69   1.45   –      –

Table 2: Multistage Shuffle Multicast: performance of the Greedy heuristic over the optimal solution ("–" denotes an entry that is not legible in the source).
Destinations (fraction of system size):   0.01   0.02   0.05   0.1    0.2    0.5    0.8    0.9    0.95   0.99

3-stage Cube
  Number of misses:                        1      2      3      3      4      7      5      –      1      3
  Average overhead:                        1.01   1.01   1.01   1.01   1.04   1.1    1.04   1.61   1.01   1.01

4-stage Cube
  Number of misses:                        2      2      3      4      6      10     5      4      3      1
  Average overhead:                        1.01   1.01   1.01   1.03   1.11   1.18   1.09   1.03   1.01   1.01

5-stage Cube
  Number of misses:                        2      3      6      7      9      –      8      6      3      2
  Average overhead:                        1.01   1.01   1.02   1.04   1.14   –      1.12   1.03   1.01   1.01

Table 3: Multistage Cube Multicast: performance of the Greedy heuristic over the optimal solution ("–" denotes an entry that is not legible in the source).
4 Conclusion

Multistage networks are popular for parallel architecture and/or communication network applications. We formulated the traffic optimum multicasting problem for multistage networks; the optimum traffic multicasting problem is NP-complete. Several greedy heuristics are proposed and their performance is demonstrated using analytical as well as simulation methods. This work considered only traffic optimality in MIN multicasting. Optimality issues for the time metric in MIN multicasting are left as an interesting open problem.

Acknowledgments: We thank Gary Elsesser, Lionel M. Ni and Wei-Tek Tsai for helpful discussions.
References

[1] S. Bhattacharya, G. Elsesser, W. T. Tsai and D. Z. Du, Multicasting in Generalized Multistage Interconnection Networks, to appear in Journal of Parallel and Distributed Computing, 1993.

[2] S. Bhattacharya and L. M. Ni, Multicasting in Multistage Communication Networks, preprint, 1992.

[3] D. Comer and R. Sethi, Complexity of TRIE Index Construction, 17th Annual Symposium on FOCS, 1976, pp. 197-207.

[4] W. J. Dally and C. L. Seitz, Deadlock-Free Message Routing in Multiprocessor Interconnection Networks, IEEE Transactions on Computers, C-36 (1987), pp. 547-553.

[5] P. Dowd, Random Access Protocols for High-Speed Interprocessor Communication Based on an Optical Passive Star Topology, Journal of Lightwave Technology, 9 (1991), pp. 799-808.

[6] M. R. Garey and D. S. Johnson, Computers and Intractability - A Guide to the Theory of NP-Completeness (Freeman, San Francisco, CA, 1979).

[7] L. R. Foulds and R. L. Graham, The Steiner Problem in Phylogeny is NP-Complete, Advances in Applied Mathematics, 3 (1982), pp. 43-49.

[8] K. Hwang and F. A. Briggs, Computer Architecture and Parallel Processing (New York: McGraw-Hill, 1984).

[9] K. Hwang and J. Ghosh, Hypernets: A Communication-Efficient Architecture for Constructing Massively Parallel Computers, IEEE Transactions on Computers, C-36 (1987), pp. 1450-1466.

[10] M. G. Hluchyj and M. J. Karol, ShuffleNet: An Application of Generalized Perfect Shuffles to Multihop Lightwave Networks, Journal of Lightwave Technology, 9 (1991).

[11] P. Kermani and L. Kleinrock, Virtual Cut-Through: A New Computer Communication Switching Technique, Computer Networks, 3 (1979), pp. 267-286.

[12] F. P. Preparata and J. Vuillemin, The Cube-Connected Cycles: A Versatile Network for Parallel Computation, CACM, 24 (1981), pp. 300-309.