M A T H E M A T I C A L P R O G R A M M I N G STUDIES
Founder and first Editor-in-Chief M.L. BALINSKI Editor-in-Chief R.W. COTTLE, Department of Operations Research, Stanford University, Stanford, CA 94305, U.S.A. Co-Editors L.C.W. DIXON, Numerical Optimisation Centre, The Hatfield Polytechnic, College Lane, Hatfield, Hertfordshire ALt0 9AB, England B. KORTE, Institut fiar Okonometrie und Operations Research, Universit~tt Bonn, Nassestrasse 2, D-5300 Bonn 1, W. Germany M.J. TODD, School of Operations Research and Industrial Engineering, Upson Hall, Cornell University, Ithaca, NY 14853, U.S.A. Associate Editors E.L. ALLGOWER, Colorado State University, Fort Collins, CO, U.S.A. W.H. CUNNINGHAM, Carleton University, Ottawa, Ontario, Canada J.E. DENNIS, Jr., Rice University, Houston, TX, U.S.A. B.C. EAVES, Stanford University, CA, U.S.A. R. FLETCHER, University of Dundee, Dundee, Scotland D. GOLDFARB, Columbia University, New York, USA J.-B. HIRIART-URRUTY, Universit6 Paul Sabatier, Toulouse, France M. IRI, University of Tokyo, Tokyo, Japan R.G. JEROSLOW, Georgia Institute of Technology, Atlanta, GA, U.S.A. D.S. JOHNSON, Bell Telephone Laboratories, Murray Hill, N J, U.S.A. C. LEMARECHAL, INRIA-Laboria, Le Chesnay, France L. LOVASZ, University of Szeged, Szeged, Hungary L. MCLINDEN, University of Illinois, Urbana, IL, U.S.A. M.J.D. POWELL, University of Cambridge, Cambridge, England W.R. PULLEYBLANK, University of Calgary, Calgary, Alberta, Canada A.H.G. RINNOOY KAN, Erasmus University, Rotterdam, The Netherlands K. R1TTER, University of Stuttgart, Stuttgart, W. Germany R.W.H. SARGENT, Imperial College, London, England D.F. SHANNO, University of California, Davis, CA, U.S.A. L.E. TROTTER, Jr., Cornell University, Ithaca, NY, U.S.A. H. TUY, Institute of Mathematics, Hanoi, Socialist Republic of Vietnam R.J.B. WETS, University of Kentucky, Lexington, KY, U.S.A. Senior Editors E.M.L. BEALE, Scicon Computer Services Ltd., Milton Keynes, England G.B. DANTZIG, Stanford University, Stanford, CA, U.S.A. L.V. KANTOROVICH, Academy of Sciences, Moscow, U.S.S.R. T.C. KOOPMANS, Yale University, New Haven, CT, U.S.A. A.W. TUCKER, Princeton University, Princeton, N J, U.S.A. P. WOLFE, IBM Research Center, Yorktown Heights, NY, U.S.A.
MATHEMATICAL
PROGRAMMING
STUDY29 A PUBLICATION OF THE MATHEMATICAL PROGRAMMING SOCIETY
Quasidifferential Calculus
Edited by V.F. D E M Y A N O V and L . C . W . D I X O N
May 1986
N O R T H - H O L L A N D - AMSTERDAM
© T h e M a t h e m a t i c a l P r o g r a m m i n g Society, Inc. -
1986
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner. Submission to this journal of a paper entails the author's irrevocable and exclusive authorization of the publisher to collect any sums or considerations for copying or reproduction payable by third parties (as mentioned in article 17 paragraph 2 of the Dutch Copyright Act of 1912 and in the Royal Decree of June 20, 1974 (S. 351) pursuant to article 16b of the Dutch Copyright Act of 1912) a n d / o r to act in or out of Court in connection therewith.
This STUDY is also available to nonsubscribers in a book edition.
Printed in The Netherlands
To P.L. Chebyshev, the Godfather of Nonsmooth Analysis
PREFACE
F2Ia~Ko 6bI~O na 6yMare, ~a 3a6bi~rl npo oBparn, A no HHM XO~ltTb It was smooth on paper But ravines had been forgotten Where we should walk
The papers in the present Study deal with quasidifferentiable functions, i.e. functions which are directionally differentiable and such that at each fixed point the directional derivative as a function of direction can be expressed as the difference of two convex positively homogeneous functions. It turns out that quasidifferentiable functions form a linear space closed with respect to all 'differentiable' operations and (very importantly) with respect to the operations of taking the point-wise maximum and minimum. Many properties of these functions have been discovered, and we are now in a position to speak about Quasidifferential Calculus. But the importance of quasidifferentiable functions is not simply based on the results obtained so far. We can foresee a much greater role for these functions since (as far as the first-order properties are concerned) all directionally differentiable Lipschitzian functions can be approximated by quasidifferentiable functions. This is due to the fact that the directional derivative of any directionally differentiable Lipschitzian function can be approximated to within any given accuracy by the difference of two convex positively homogeneous functions. This Study reflects the state-of-the-art of Quasidifferential Calculus. The original idea of simply publishing English translations of a number of Russian papers on the subject was immediately rejected by the Editor-in-Chief, Professor R. W. Cottle; we are now grateful for this decision, since the authors obtained new results, thus leading to a much greater understanding of the subject. The Editors of this Study are greatly indebted to the International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria, which provided editorial and secretarial support for preparing the Study. We offer especial thanks to our language editor, Helen Gasking, whose role cannot be overestimated, and to Nora Avedisians, Edith Gruber and Elfriede Herbst for typing and retyping the papers. Thanks are also due to the referees, whose assistance, advice and criticism helped to improve many of the contributions. It is also necessary to note that the idea of such a Study was proposed by Professor Roger Wets and supported by Professor Andrzej Wierzbicki, then the Chairman of the System and Decision Sciences Program at IIASA. Some of the authors became vii
viii
Preface
involved in Quasidifferential Calculus through or at IIASA, and therefore this Study is in some sense a child of IIASA (although whether it is an offspring to be proud of is a question that can only be answered by the reader). Most of the Soviet authors of this Study are graduates a n d / o r staff members of Leningrad State University, where the first serious attempt to attack the problem of nondifferentiability was made more than a hundred years ago by P.L. Chebyshev, to whom this Study is dedicated. V.F. Demyanov L.C.W. Dixon
(Editors)
CONTENTS
Preface V.F. Demyanov, L.N. Polyakova and A.M. Rubinov, Nonsmoothness and quasidifferentiability V.F. Demyanov, Quasidifferentiable functions: Necessary conditions and descent directions L.N. Polyakova, On the minimization ofa quasidifferentiable function subject to equality-type quasidifferentiable constraints A. Shapiro, Quasidifferential calculus and first-order optimality conditions in nonsmooth optimization L.N. Polyakova, On minimizing the sum of a convex function and a concave function V.F. Demyanov, S. Gamidov and T.I. Sivelina, An algorithm for minimizing a certain class of quasidifferentiable functions K.C. Kiwiel, A linearization method for minimizing certain quasidifferentiable functions V.A. Demidova and V.F. Demyanov, A directional implicit function theorem for quasidifferentiable functions V.F. Demyanov and I.S. Zabrodin, Directional differentiability of a continual maximum function of quasidifferentiable functions D. Melzer, On the expressibility of piecewise-linear continuous functions as the difference of two piecewise-linear convex functions S.L. Pechersky, Positively homogeneous quasidifferentiable functions and their application in cooperative game theory N.A. Pecherskaya, Quasidifferentiable mappings and the differentiability of maximum functions V.F. Demyanov, V.N. Nikulina and I.R. Shablinskaya, Quasidifferentiable functions in Optimal Control A.M. Rubinov and A.A. Yagubov, The space of star-shaped sets and its applications in nonsmooth optimization V.V. Gorokhovik, e-Quasidifferentiability of real-valued functions and optimality conditions in extremal problems
vii 1 20 44 56 69 74 85 95 108 118 135 145 160 176 203
Appendix A guide to the bibliography on quasidifferential calculus Bibliography on quasidifferential calculus (January 1985)
219 219
Mathematical Programming Study 29 (1986) 1-19 North-Holland
NONSMOOTHNESS
AND
QUASIDIFFERENTIABILITY
V.F. D E M Y A N O V Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR, and International Institute for Applied Systems Analysis, A-2361 Laxenburg, Austria L.N. P O L Y A K O V A Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR A.M. R U B I N O V Institute for Social and Economic Problems, USSR Academy of Sciences, ul. Voinova 50-a, Leningrad 198015, USSR Received 9 April 1984 Revised manuscript received 15 November 1984
This paper is an introduction to the present volume. It is first shown that quasidifferentiable functions form a very distinct class of nondifferentiable functions. This and other papers in this volume demonstrate that we do not need to consider any other class of nonsmooth functions at least from the point of view of first-order approximation. The heart of quasidifferential calculus is the concept of a quasidiiterential--this replaces the concept of a gradient in the smooth case and that of a subdifferential in the convex case.
Key words: Nondifferentiable Functions, Quasidifferentiable Functions, Quasiditterentials, Subdifferentials, Superdiilerentials, Optimization Problems, Directional Differentiability, Upper Convex and Lower Concave Approximations, Clarke Subdifferential.
1. Introduction This is not the place to go into the m o t i v a t i o n s a n d origins of n o n d i f f e r e n t i a b i l i t y (although these are very i m l ~ r t a n t a n d interesting): for the p u r p o s e of this p a p e r it is only necessary to realize that although a n o n d i f f e r e n t i a b l e f u n c t i o n can often be a p p r o x i m a t e d by a differentiable one, this s u b s t i t u t i o n is u s u a l l y u n a c c e p t a b l e from a n o p t i m i z a t i o n v i e w p o i n t since some very i m p o r t a n t properties of the f u n c t i o n are lost (see E x a m p l e 2.1 below). We must therefore find some n e w analytical tool to apply to the p r o b l e m . Define a finite-valued f u n c t i o n f o n an o p e n set D c E,. I f f u n c t i o n f i s directionally differentiable, i.e., if the following limit exists:
af(x) = lim l [ f ( x + a g ) - f ( x ) ] Og ,,~+o ot
VgcE,, 1
(1.1)
2
V.F. Demyanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
then
f ( x + ag) = f ( x ) + ~ of(x) + o(~).
ag
Many important properties of the function can be described using the directional derivative. To solve optimization problems we must be able to (i) check necessary conditions for an extremum; (ii) find steepest-descent or -ascent directions; (iii) construct numerical methods. In general, we cannot solve these auxiliary problems for an arbitrary function f: we must have some additional information. In classical differential calculus it is assumed that af(x)/ag can be represented in the form
Of(x) ag -
(f'(x),
g),
where f'(x) ~ E, and (a, b) is the scalar product of vectors a and b. The function f is said to be differentiable at x and the vector f ' ( x ) is called the gradient o f f at x. Dif[erentiable functions form a well-known and important class of functions. The next cases that we shall consider are convex functions and m a x i m u m functions. It turns out that for these functions the directional derivative has the form
Of(x) Og
max (v, g),
(1.2)
vcOf(x)
where Of(x) is a convex compact set called the subdifferential of f at x. Each of these two classes of functions forms a convex cone and therefore their calculus is very limited (only two operations are allowed: addition, and multiplication by a positive number). The importance of (1.2) has led to m a n y attempts to extend the concept of a subdifferential to other classes of nondifferentiable functions (see, e.g., [1, 15, 16, 18, 22, 23, 28, 32]). One very natural and simple generalization was suggested by the authors of the present paper in 1979 [7, 13]. We shall say that a function f is quasidifferentiable at x if it is directionatty differentiable at x and if there exists a pair of compact convex sets Of(x)~ E. and -~f(x)c E. such that
Of(x) Og
max (v, g ) + re_in (w, g).
o~f(x)
wec~f(x)
(1.3)
The pair D f ( x ) = [_0f(x), 0f(x)] is called a quasidifferential o f f at x. It has been shown that quasidifferentiable functions form a linear space closed with respect to all algebraic operations and, even more importantly, to the operations of taking pointwise maxima and minima. This has led to the development of quasidifferential calculus, and many important and interesting properties of these
V.F. Demyanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
3
functions have been discovered (including a chain rule, an implicit function theorem, and so on). One very important property of these functions is that if f is directionally differentiable and its directional derivative a f ( x ) / a g at x is a continuous function of direction g (every directionally ditterentiable Lipschitzian function has this property), then a f ( x ) / a g can be approximated to within any prescribed accuracy by a function of form (1.3). Thus, the quasidifterential is an ideal tool for studying the first-order properties of functions. A more general approach, involving an extension of quasiditterential calculus, has been presented by Rubinov and Yagubov [29]. They proved that if a f ( x ) / a g is continuous in g then it can be represented in the form Of(x) =inf{h > 0 [ g ~ h U } + s u p { h < 0 [ g e AV}, Og
(1.4)
where U and V are what are known as star-shaped sets. If U and V are convex sets then eqn. (1.4) can be rewritten in the form (1.3). Thus, i f f is directionally ditterentiable it is natural to use this construction (the directional derivative) to study optimization problems. However, i f f is not directionally ditterentiable some other tool must be found. One approach is to generalize the notion of the directional derivative (1.1). We shall mention only the following two generalizations: 1. The Dini upper derivative o f f at x in the direction g, defined as Oof(x)~ _ ~ 1 [ f ( x + ag') - f ( x ) ] . Og s'~g ct a~+O
In the case of a Lipschitzian function this becomes: Oof(x)'~ _ ~ 1 [ f ( x + ag) - f ( x ) ]. Og ~+o a
(1.5)
2. The Clarke upper derivative o f f at x in the direction g, defined as Octf(x)~ - ~ l[f(x' + Og x'~x et.
ag) +f(x')].
(1.6)
Other generalizations and extensions are given in [18, 22, 28]. Equation (1.5) is a natural generalization of (1.1) and, in the case of a directionally ditterentiable function, the Dini upper derivative (1.5) coincides with the directional derivative (1.1). However, this is not the case for the Clarke upper derivative (1.6). The reason for this is that (1.6) describes not the local properties o f f at x but some 'cumulative' properties o f f in a neighborhood of x. It seems to the authors that for optimization purposes it is better to use the Dini derivative (and this idea has been exploited by B.N. Pschenichnyi [23]).
4
V..E Demyanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
The Dini and Clarke upper derivatives are used to study minimization problems: for maximization problems it is necessary to invoke the Dini and Clarke lower derivatives. These are defined analogously to (1.5) and (1.6) with the operation I - ~ replaced by lim. We shall discuss both these generalizations later in the paper: for now, note only that if the Dini upper derivative is continuous (which is always the case if f is Lipschitzian), then it can be approximated by a function of the form (1.3), so that quasidifferential calculus can be used here as well. In Section 2 we discuss directional ditterentiability. Section 3 is concerned with convex functions and m a x i m u m functions, as well as with the Clarke subdifferential and pschenichnyi upper convex and lower concave approximations. Quasidifferentiable functions are treated in Section 4. This should be seen as a survey paper: we hope that it will provide a general introduction to the subject of this Study and enable readers to make use of the results in their own research.
2. Directional differentiability Let S c E, be an open set and f be defined and finite-valued on $. Fix x E S and g e E,. The function f is said to be differentiable at x in the direction g if the following finite limit exists: af(x) = f ' ( g ) -- lim l [ f ( x + ag a~+o a
ag) - f ( x ) ] .
(2.1)
(It is naturally assumed that x + a g ~ S ; since S is open this is the case for all a ~ [0, O~o(g)], where a o ( g ) > 0). The limit (2.1) is called the (first-order) directional derivative of f at x in the direction g. I f f is differentiable in every direction g ~ E, it is said to be directionally differentiable at x. If J" is directionally differentiable at x and Lipschitzian in some neighborhood of x, then lim l[f(x~-otg(t~))-f(x)] ~+o a
g(~)~g
Of(x) ag
i.e., in this case it is sufficient to consider only 'line' directions. It is clear from (2.1) that i f f is directionally ditierentiable then
f ( x + ag) = f ( x ) + ct af(x) + o ( a ), ag
i.e., the directional derivative provides a first-order approximation o f f in a neighborhood of x.
V.F. Demyanov, L.N. Polyakova and A.M. Rubinov / Ouasidifferentiability
5
Let f be directionally differentiable at x, x e S. A direction g(x) is known as a
steepest-descent direction o f f at x if Of(x) Of(x) = inf~g(x) ~ s , Og ' where S, = {go E. I Ilgll = 1}. Here Ilgll is the euclidean norm g. A direction g'(x) is called a steepest-ascent direction o f f at x if
Of(x) Of(x) = sup Og'(x) ~ s . Og Directions of steepest descent or ascent need not necessarily exist and if they do, they are not necessarily unique. It is clear that for a point x* c E, to be a minimum point o f f it is necessary that
Of(x*) >-0 Og
Vg e En.
An analogous necessary condition for a m a x i m u m is Of(x**) ~<0 Og
Vg ~ E,.
However, these necessary conditions are in general difficult to verify; they are also trivial reformulations of the definitions of a minimum and a maximum. We therefore have to make use of certain specific properties of the function under consideration. Remark. Prof. J. Zowe observed that if a function f is Lipschitzian in a neighborhood of a point x* and if
Of(x*)>O 0g
VgeE~, g#0,
then the point x* is a strict local minimum point o f f . One very important class is that of differentiable functions. In this case
Of(x) Og
- ( f ' ( x ) , g),
(2.2)
where f ' ( x ) is the g r a d i e n t . o f f at x. Applying the concept of a gradient, for example, to the optimization problem, it is possible to: 1. C o m p u t e the directional derivative. 2. Derive the following necessary condition for a minimum or a maximum: for a differentiable function f to attain its local minimum (or maximum) value at x* c S it is necessary that
f ' ( x * ) --- 0.
(2.3)
The point x* at which condition (2.3) is satisfied is called a stationary point o f f .
V.F. Derayanov, L.N. Polyakovaand A.M. Rubinot;/ Quasidifferentiability 3. Find directions of steepest descent and ascent as follows: I f f ' ( x o ) # 0 then the direction
g(xo) =
f'(xo)
(2.4)
Ilf'xo)][
is the direction of steepest descent of f at Xo, and the direction
g'(xo)
f'(xo)
IIf'(xo)ll
is the direction of steepest ascent o f f at Xo. In this case the directions of steepest descent and ascent both exist and are unique. 4. Construct numerical methods for finding an extremum. The concept of a gradient (a derivative in the one-dimensional case) has had a profound impact on the development of science. It is impossible to overestimate its importance and influence. From being an art, mathematics became a technical science. However, differential calculus is only applicable if the functions studied are smooth (i.e., ditterentiable). For most practical problems tackled in the past (and for many presently under study) it has been sufficient to consider only smooth functions. Nevertheless, an increasing number of problems arising in engineering and technology are of an essentially non-smooth nature. There are two very popular ways to avoid nonditterentiability. First, one tries to replace a non-smooth problem by a smooth one. For example, the problem of minimizing the function
f(x) = max 6,(x), i61 where ~b~'s are smooth nonnegative functions, I = 1 : N and x E E,, is often replaced by the minimization of
F(x) = E a,6,(x), where the a~ are positive coefficients. The function F is smooth but it now describes quite a different problem. The second possibility is to consider the function
Fp(x) = ( ~ [( qb,(x)]P) '/, instead
off.
It is well-known that F,(x) p---~-s
Vx. Note that in many cases
the computations process by which Fp(x) is minimized becomes unstable. Some very important properties of the original function can thus be lost in the pursuit of smoothness. We can illustrate this using a very simple example.
Example 2.1. Let x = (x ~'), x C2~)~ E2; f(x) = Ix ')l- Ix'='l, go = (0, 0). R e function f
E F. Demyanov, L.N. Polyakova and A.M. Rubino~/ Quasidifferentiability
7
is not ditterentiable at points where x (~) = 0 or x ~2~= 0. Take a direction g = (g{3~, g~2~). The function f is directionally differentiable with directional derivative
Of(xo) Og
-- iim l [ f ( x o + a~+0
ag) - f ( x o ) ]
Ig"'l-Ig'2'l 9
=
O/
It is clear that there are two steepest-descent directions of f at Xo: g, = (0, 1) and
g'l = ( 0 , - 1 ) . There are also two steepest-ascent directions: g2 = (1, 0), g ~ = ( - 1 , 0 ) . Let us try to s m o o t h the function f functions:
Take e > 0
and consider the following
f , , ( x ) = ~/ix"))2 + e - ~/(xl2))2+ e,
(1)
f2,(x) = x/(x~'; + e) 2 - ~/(x~Z) + e) z,
(2)
f3,. (x) = ~ / ( - ~ ) 2 + e - ~/(x'2) + e)2,
(3)
It is clear that
f~(x)
?,f(x)
~0
V i e 1:3.
Find the gradients o f these functions at xo:
~f,~.(x) ox
(
of,,(Xo)
- - =
x I'~ x ''-~ ) 4[.~,~")~+ ~' ,/(x%2 + ~ '
(0,0)
Ox
'de>O,
,/(x (11 +~) 2 t 4(x (2) +~)12
ox of~,(Xo) Ox 3f~(x)
(-1,1)
(
re>O,
- x (~)
-
of3,(Xo) ax
x(2'+e
) '
(0, 1~) VE>O.
We can then make the following deductions: forfl~: Xo is a stationary point. forf2~ : the steepest-descent direction at Xo is g3 = (,f2/2, - x / 2 / 2 ) and the steepestascent direction is g~ = ( - x / 2 / 2 , x/2/2). forf3~ : the steepest-descent direction at Xo is g4 = (0, - 1 ) and the steepest-ascent direction is g~ = (0, 1). Thus, all three s m o o t h i n g functions provide incomplete or even misleading information about stationarity or directions o f steepest descent and ascent. The reason
8
V.F. Demyanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
is that these smoothing functions are zeroth-order approximations while steepestascent and -descent directions reflect first-order properties of the function. Since it appears that we cannot avoid nondifferentiability, we should rather study the properties of special classes of nonsmooth functions with the aim of developing analytical tools to handle these problems.
3. The subdifferentiai and its generalizations
3.1. Maximum functions Let
f ( x ) = max ~b(x, y), yeG
(3.1)
where ~b(x, y) is continuous in x and y on S x G and continuously differentiable in x on S; G is a compact set. The function f described above is not necessarily continuously differentiable. However, it is directionally differentiable on S and
of(x) = max (~b'(x,y), g), Og y~R~)
(3.2)
where g ( x ) = {y E G I ~p(x, y) =f(x)}. The set R ( x ) is closed and bounded. We can rewrite (3.2) in the form
of(x) = max (v, g), ag v~af(x)
(3.3)
Of(x) = co{~b'(x, y)Jy c R(x)}.
(3.4)
where
It is not difficult to see that the set af(x) described by (3.4) can be used for several purposes [2, 6]: 1. To compute the directional derivative (see (3.3)). 2. To derive the following necessary condition for an unconstrained minimum: for x*~ S to be a local minimum point o f f defined by (3.1) it is necessary that
0 ~ Of(x*).
(3.5)
A point x*~ S at which (3.5) is satisfied is called a stationary point o f f (note that S is an open set). 3. If xo is not a stationary point then the direction
g(xo) =
V(Xo)
IIv(xo)ll'
where V(Xo) c Of(xo), IIV(xo) ll = min o~oy~xo)IIv II, is a steepest-descent direction o f f at Xo. This direction is unique.
V.F. Demyanov, L.N. Polyakovaand A.M. Rubinov / Quasidifferentiability
9
If we find vl(Xo) e af(xo) such that IIv,(Xo)ll = maxv~or<~o)llvll, and if IIv~(xo)ll > 0, then the direction g~(xo) = v~(xo)/II v,(xo)II is a steepest-ascent direction of f at Xo. Note that this direction is not necessarily unique. The set Of(x) can also be used to construct numerical methods for minimizing f on E, or on a bounded set (see, e.g., [6]).
3.2. Convex functions Let S c E, be a convex open set and f be a convex function defined on S, i.e.,
f(ax,+(1-a)x2)<~af(xO+(1-a)f(x2)
Vae[0,1]
Vx,,x2~S.
Any finite-valued convex function is necessarily continuous and directionally differentiable on S, and
af(x) -
c3g
max (v, g), o~af(~)
(3.6)
where
Of(x)={v~U. lf(z)-f(x)~(v,z-x)
Vz~S}.
(3.7)
The set of Of(x) is nonempty, convex and compact, and is called the subdifferential of f at x. The subdifferential plays exactly the same role as the set a f defined by (3.4) for a m a x i m u m function (except that condition (3.5) in the convex case is sufficient as well as necessary). For this reason we shall refer to the set Of(x) defined by (3.4) as the subdifferential of the m a x i m u m function f described by (3.1). Note that if $ is also convex in x for any y c G then the set af(x) defined by (3.4) coincides with the set Of(x) defined by (3.7) (assuming that f is a m a x i m u m function of form (3.1)). Convex functions have been studied and used very widely: their fundamental properties were discovered and exploited by Fenchel [14], Moreau [21], and Rockafellar [27]. Thus we can define the subdifferential m a p p i n g a f for two very important classes of nondifferentiable functions. We may view the concept of a subdifferential as a generalization of the concept of a gradient (for continuously differentiable functions). I f f is a differentiable at x (where f is either a m a x i m u m function or a convex one), then a f ( x ) = {f'(x)}. The properties of convex and m a x i m u m functions (and especially eqns. (3.3) and (3.6)) seem to have had a mesmerizing effect on m a n y mathematicians. They have tried to generalize the concept of a subdifferential to other classes of nondifferentiable functions, while trying to somehow preserve (3.3) [1, 15, 16, 17, 18, 22, 23, 28, 32]. We shall consider here only two of these generalizations which are particularly relevant to the subject of this Study.
V.F. Demyanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
10
3.3. The Clarke subdifferential Let a function f be Lipschitzian on S. By T ( f ) we shall denote the subset of S on w h i c h f i s differentiable. It is well-known that Lipschitzian functions are differentiable almost everywhere. For x c S, consider the set a c , f ( x ) = co Os~f(x), where Oshf(x) = {v e E~ [:l{xk}: Xk ~ T ( f ) , xk "~ x , f ' ( x E ) ~ V}. The set OShf(X ) w a s introduced by Shor in [31] and the set 0 o f ( x ) by Clarke in [1]. The latter set will be referred to here as the Clarke subdifferential o f f at x. It has been shown that Octf(x) is a nonempty convex compact set. Clarke also introduced the Clarke upper derivative o f f at x in the direction g ~ En: Octf(x)'~ Og
~l[f(x'+ ~'~x u
ag)-f(x')].
(3.8)
ct ~ + 0
The most important result related to the Clarke upper derivative is the following: Oc,f(x)'~ 0g
max (v, g). v'=ac,f(x)
(3.9)
It is possible to show that for a point x * ~ S to be a minimum point of f it is necessary that 0 ~ Oof(x*).
(3.10)
We shall call any point x* at which (3.10) is satisfied a Clarke stationary point. If Xo is not such a stationary point then the direction
V(Xo)
g(xo) = -
IIv(xo)ll
where
IIv(xo)ll~
rain
v c a(nf(xo)
Ilvll
is a direction of descent o f f at Xo (but not necessarily a direction of steepest descent). There are also some very interesting numerical algorithms for minimizing a Lipschitzian function f based on the Clarke subdiilerential [19]. Let 0c,f(x)~, -
-
Og
1 -
lira - - [ f ( x ' +
x'~x ct a~+O
ag) - f ( x ' ) ] .
(3.11)
V.F. Demyanov, LN. Polyakova and A.M. Rubinov / Quasidifferentiability
11
This value is called the Clarke lower derivative of f at x in the direction g. It is possible to show that Oc:,f(x)~, ag
-
rain
veOc, r(~)
(v,g)
and that for x** to be a m a x i m u m point o f f it is necessary that (3.10) be satisfied at x**, i.e., the necessary conditions for a minimum and a m a x i m u m coincide. Thus, the role played by the Clarke subdiiierential with respect to Lipschitzian functions is analogous to that played by the subdifferential for convex and maximum functions. These results are very attractive from the aesthetic point of view. However, this approach nevertheless has some deficiencies from the optimization standpoint, the main reason for which being the fact that the Clarke u p p e r (lower) directional derivative does not necessarily coincide with the directional derivative (if the latter exists). Let us consider once again the function f described in Example 2.1:
Xo=(O,O). It is not difficult to check that Oc,f(xo) = co{(1, 1), ( 1 , - 1 ) , ( - 1 , 1), ( - 1 , - 1 ) } ,
i.e., 0 e Oof(xo), where Xo is a Clarke stationary point but is neither a minimum nor a maximum o f f The Clarke subdifferential reflects some 'cumulative' properties of the function in a neighborhood of a point. For example, if OcIT(X) - - < 0 Og
then the direction g is not only a descent direction of f at x: it is also a descent direction of f at every x' in some neighborhood of x. The Clarke subdifferential enables us to discover some very important properties of the function. However, the Clarke directional derivatives (upper and lower) defined by (3.8) and (3.11) are only very rough approximations of the directional derivative (if it exists). In our opinion the Clarke subditterential is not an appropriate tool for solving problems where directional derivatives are used (such as, for example, optimization problems). Nevertheless, the concept of the Clarke subdifferential is very important and can be very powerful in other areas of nonsmooth analysis. Note also that the calculus based on the Clarke subdifferential is incomplete (since the main relations are formulated as inclusions, not equalities) and this makes it unsuitable for computational use.
V.F. Demyanov, LN. Polyakovaand A.M. Rubinov/ Quasidifferentiability
12
3.4. The Pschenichnyi upper convex and lower concave approximations Consider first the Dini upper derivative Fx(g) = lim l [ f ( x + a g ) - f ( x ) ] , cf~+O
(3.12)
01~
where f is a Lipschitzian function and x is fixed. In the case where f is directionally ditterentiable, Fx(g) coincides with its directional derivative. The function Fx(g) provides a better local approximation than the Clarke upper directional derivative. However, Fx(g) is not a convex function and therefore it cannot be approximated by a maximum function of linear functions. Pschenichnyi [23] suggested that it should be approximated by a family of convex functions. Let f be Lipschitzian on S and directionally ditterentiable at a fixed point x e S. Note that the directional derivative Of(x)/ag =f'x(g) is both continuous in g (because f is Lipschitzian) and positively homogeneous, i.e., f ' ( A g ) = hf'x(g)
VA/> O.
A function p is said to be an upper convex approximation (u.c.a.) o f f at x if p is sublinear (i.e., convex and positively homogeneous) and if p(g)>~f'(g) Vg ~ E,. If p is an u.c.a, o f f at x then
f ( x + ag) <~f(x) + ap(g) + ox,g(ct),
(3.13)
where o~.g(o~) 01~
, 0. a~+O
Since p is sublinear there exists a unique convex compact set _0pc E, such that
p(g) = maxv~_0p (v, g). A function q is said to be a lower concave approximation (l.c.a.) o f f at x if q is superlinear (i.e., concave and positively homogeneous) and if q(g) <~f'(g) Vg ~ E,. Since q is superlinear there exists a unique convex compact set 0q e E, such that q(g) = minwc~q (w, g). Note that an upper convex approximation is not necessarily unique, and therefore a single u.c.a, cannot provide a satisfactory approximation of the function. The notion of an exhaustive family of upper convex approximations was introduced in [8], where it was defined as follows: Let A be an arbitrary set. A family {PA [A e A}, where px is an u.c.a, o f f at x, is called an exhaustive family of u.c.a.'s for f at x if
Of(x) Og
infp~(g)=--
xcA
VgcE,,
(3.14)
i.e., if
f ( x + a g ) = f ( x ) + a inf px(g)+ox.,(a) AcA
V g e E,.
V.F. Demyanov, L.N. Polyakovaand A.M. Rubinov/ Quasidifferentiability
13
Analogously, a family {q~l)teA}, where q~ is a l.c.a, o f f at x, is called an exhaustive family of lower concave approximations for f at x if
Of(x) ag
sup qA(g) = - -
XcA
Vg 9
i.e., if
f(x+ag)=f(x)+asupqx(g)+o~.g(a)
VgeEn.
Ag-A
The existence of an exhaustive family of u.c.a.'s (or l.c.a.'s) implies that f ' ( g ) may be represented in the equivalent forms
O f ( x ) inf m a x ( v , g ) = ( s u p min ( w , g ) ) ag A~A v~Op~ kAvA w~q~
(3.15)
(of course, A is not the same for a family of u.c.a.'s and that of 1.c.a.'s). It possible to show that exhaustive families of u.c.a.'s and I.c.a.'s exist for every directionally ditterentiable function whose directional derivative is continuous as a function of direction (see [8] for an illustration of the construction of a family of l.c.a.'s). The concepts of upper convex approximation and lower concave approximation can be applied with some success to the solution of extremal problems. The following properties are of particular use: if x* is a minimum point o f f on S (recall that S is an open set) then for every u.c.a, p(g) it is necessary that 0 e _0p. If {p~lx cA} is an exhaustive family of u.c.a.'s o f f at x* then we have the following necessary condition for a minimum: 0 e _Op~ VAeA.
(3.16)
If { PA [A 9 A } is an exhaustive family of u.c.a.'s o f f at xo and (3.16) is not satisfied, find
sup min Ilvll = IIv~IIXczA veO_px
The direction g ( x o ) = - v J [Iv~oll is then a direction of steepest descent o f f at Xo. Thus, if we have an exhaustive family of upper convex approximations we can: 1. Compute the directional derivative (see (3.14)). 2. State a necessary condition for a minimum (see (3.16)). 3. Find a steepest-descent direction. Analogous result~ can be obtained for maximization problems by using an exhaustive family of l.c.a.'s. Thus, the essence of this approach is to reduce the optimization problem to one of constructing the required families of u.c.a.'s (or l.c.a.'s). In what follows we describe a class of functions for which families of upper convex approximations and ler concave approximations can be constructed with relative ease.
14
V.F. Derayanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
4. Quasidifferentiable functions 4.1. Definitions and properties Let f be a finite-valued function defined on an open set S c En. The function f is said to be quasidifferentiable at x ~ S if it is directionally differentiable at x and if there exist convex c o m p a c t sets ~_f(x) c E~ and -af(x) c E~ such that
Of(x) ~g
max ( v , g ) + min ( w , g )
=f'(g)=
vc0f(x)
VgcEn.
(4.1)
wc~f(x)
The pair of sets D f ( x ) = [O_(x),-af(x)] is called a quasidifferential of f at x; sets O_f(x) and -df(x) are described as a subdifferential and a superdifferential, respectively, o f f at x. It is clear that a quasidifferential at a point is not unique. If the set of quasidifferentials of f at x contains an element of type D f ( x ) = [ a f ( x ) , 0], then the function f is said to be subdifferentiable at x. If there exists a quasidifferential of the form D f ( x ) = [ 0 , a f ( x ) ] , then this function is said to be superdifferentiable at point x. Some e x a m p l e s of quasiditterentiable functions are given below. 1. If f is continuously ditierentiable on S then it is quasiditterentiable at every point x e S, and the pair of sets D f ( x ) = [ f ' ( x ) , 0] (where f ' ( x ) is the gradient o f f at x) is a quasiditterential o f f at x. It is clear that the pair Df(x) = [O,f'(x)] is also a quasidifferential of f at x. Thus, if a function f is smooth at x it is also both subdifferentiable and superdifferentiable at x. 2. From (3.3) and (3.6) it is clear that both m a x i m u m functions (defined by (3.1)) and convex functions are quasidifferentiable at x e S, and that D f ( x ) = [_~f(x), 0], where ~_f(x) = af(x) (defined by (3.4) or (3.7), respectively) is a quasidifferential of f at x. In other words, both m a x i m u m functions and convex functions are subditterentiable. 3. In a similar way it can be seen that if f is concave on a convex open set S (i.e., f~ = - f is convex), then f is quasidifferentiable on S, with quasiditterential Df(x) = [0, ~f(x)]. Here Of(x) = { w c E~ If(z) - f ( x ) <~( w, z - x) Vz ~ S} is the superdifferential of the concave function f at x. Let D = [A, B] be a pair of sets, where A c E,, B c En. We define multiplication by a real n u m b e r )t as follows: ~[AA, AB] A D = ( [ A B , AA]
if h / > 0 , if h < 0 .
(4.2)
Let D, = [ A , , B~], D 2 = [A2, B2], where A~, A2, Bi, B2c En. We define addition of sets in the following way: D, + D2 = [A, B] where A = AI + A2, B = B~ + B 2. It follows from (4.1)-(4.3) that
(4.3)
V.F. D e m y a n o v , L. IV. Polyakova a n d A . M . R u b i n o v / Quasidifferentiability
15
9
N
1. If f u n c t i o n s f h . . . ,fN are quasidifferentiable at x then the f u n c U o n f = Y'-i=~ c~f~ (where c~ e E,) is also quasiditterentiable at x and N
Df(x) = ~. c,Dfi(x).
(4.4)
i=1
2. If functions f~ and f2 are quasidifferentiable at x then the function also quasidifferentiable at x and
f = f l "f2 is
Df(x) =fl(x)Df2(x) +f2(x)Dfl(x).
(4.5)
3. If functions f~ and f2 are continuous and quasidifferentiable at a point x and f2(x) ~ 0 then the function f =f~/f2 is quasidifferentiable at x and 1
Df(x) =f-'~[f2(x)Dfl(x) --fl(x)Df2(x)].
(4.6)
It is clear that (4.4)-(4.6) represent generalizations o f well-known relations from classical differential calculus. However, quasiditterentiable functions also have the following very i m p o r t a n t additional properties (see [7, 8, 29]): 4. Let functions f~, i e I = 1 : N, be quasidifferentiable at x ~ S. Then the function
f(x)
= max f i ( x ) i~l
is quasidifferentiable at x and
Df(x)=
[_0f(x), 0f(x)], where
O_f(x)=co{O_fk(X)-- ,~Y.R(,Oof,(x)lke R(x)}, ir
0f(x)=
~ kcR(x)
Ofk(X),
(4.7)
R(x)={iellf~(x)=f(x)}.
5. If functions f~, i e I = 1 : N , are quasidifferentiable at x e S then the function
f(x)
= min,~if/(x) is quasidifferentiable at x and
Df(x)=
[_0f(x),
Off(x)],
where
O_f(x) = ~. -#_fk(x), keQ(x)
(4.8)
Q(x) =
{i e
llf~(x) = f ( x ) } .
Thus, the class of quasidifferentiable functions is a linear space closed with respect to all algebraic operations and, even m o r e importantly, to the operations o f taking pointwise m a x i m a and minima.
16
V.F. Derayanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
4.2. Necessary conditions for an unconstrained extremum It is easy to state necessary conditions for extrema of quasidifferentiable functions. We shall limit ourselves to consideration of the unconstrained case; other cases are discussed in detail in [5, 25]. Let f be quasidifferentiable on E,. Theorem 4.1 (see [24]). For a point x * e E, to be a minimum point o f f on En it is necessary that
--af(x*) c Of(x*).
(4.9)
Theorem 4.2. For a point x** ~ E~ to be a maximum point o f f on 17,. it is necessary that
- O f ( x * * ) c af(x**).
(4.10)
A point x * e E, at which condition (4.9) is satisfied is called an inf-stationary point of function f on E,. A point x** c E, at which condition (4.10) is satisfied is called a sup-stationary point o f f on E,. Assume that Xo is not an inf-stationary point (i.e., condition (4.9) does not hold). Find woe-af(xo) and roe o_f(xo) such that max
min I I v + w l l - min
w~3f(xo) vcd_f(xo)
v6Sf(x~)
IIV+woll=llVo+woll.
It turns out that the direction go = -(Vo + Wo)/II vo + woll is a steepest-descent direction of f at the point xo, This direction may not be unique. Analogously, if a point Xo is not a sup-stationary point o f f on E, then we lind vl ~ O_f(xo) and wl ~ -af(xo) such that max
m_in IIv + wll = m_in IIvl + wll = IIvl § w, ll.
vcS_f(xo) weaf(xo)
wcOf(xo)
The direction gl = (vl + wt)/II o, + w~ll is a steepest-ascent direction o f f at Xo. The problem of verifying the necessary conditions for a minimum is thus reduced to that of finding the Hausdorff deviation of the set -af(xo) from the set ~_f(xo). Similarly, the verification of the necessary conditions for a maximum is equivalent to finding the Hausdorff deviation of the set O.f(xo) from the set -af(xo). If the necessary condition for a maximum or for a minimum holds at a point Xo, then the corresponding Hausdorff deviation is zero. Otherwise the deviation is positive and its absolute value is equal to the rate of steepest ascent (or descent) at point xo. Thus the concept of a quasidifferential is an extension of the idea of a gradient. The main formulae of quasidifferential calculus represent generalizations of relations from classical differential calculus (see (4.4)-(4.6)). A new and important additional operation is allowed in quasidifferential calculus--that of taking pointwise maxima or minima. This brings into play a host of new nondifferentiable functions obtained by combining ordinary "differentiable operations" with the taking of pointwise
V.F. Demyanov, L.N. Polyakovaand A.M. Rubinov/ Quasidifferentiability
17
maxima and minima. A chain rule for quasidifferentiable functions has been discovered and was proved in [8-10], while implicit function and inverse theorems were established in [3, 9]. The relation between the quasidifferential and the Clarke subdifferential has also been studied (see [4]): it appears that for a rather wide class of quasidifferentiable functions there exists a very simple relationship between the Clarke subdifferential and the quasidifferential. The next step is to develop numerical methods for finding extreme points of quasidifferentiable functions. First of all, we should recognize that there may be several directions of steepest descent (or ascent, if we are looking for a maximum). This property requires a new approach to the construction of algorithms. In the convex case, for example, the greatest differences between many algorithms lie in (i) the rule used to find a descent direction and (ii) the step-size rule. In the quasidifferentiable case, however, it is necessary to consider several directions at each step. Some promising results in this area are given in [12, 26].
4.3. The place and role of quasidifferentiable functions in nonsmooth optimization It follows from (4.1) that
Of(x)=f,(g)= ag
min[
It is clear that for every w 9
Pw(g) =
max ( v + w , g ) ] .
w~Of(x) L vcof(x)
max
ve[w+~f(x)]
the function
(v, g)
is an upper convex approximation o f f at x and the set of functions {pw] we af(x)} is an exhaustive family of upper convex approximations o f f at x. Analogously, for every v 9 Of(x) the function
q~(g) =
mi_n
we[v+af(x)]
(w, g)
is a lower concave approximation o f f at x and the set of functions {qv[v 9 _0f(x)} represents an exhaustive family of lower concave approximations o f f at x. Thus quasidifferentiable functions represent one class of functions for which it is possible to construct ~xhaustive families of upper convex and lower concave approximations. Note that the most important properties for optimization purposes are those of the directional derivative, because they can be used to check necessary conditions for an extremum and to find directions of steepest descent or ascent. If the directional derivative f ' ( g ) is a continuous function (as is always the case for a Lipschitzian, directionally differentiable function), then f ' ( g ) can be approximated by the difference of two convex, positively homogeneous functions. This means that the function f can be approximated to within any given accuracy ( o f f ' ( g ) ) by a quasidifferentiable function, thus ensuring that properties of f which are important from the
18
V.F. Demyanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
computational standpoint (e.g., the number of steepest-descent and -ascent directions, etc.) can be derived. The quasidifferential therefore seems to be quite adequate for studying the first-o~der properties of the function. Of course, there are many functions which are not quasidifferentiable (see, e.g., [11]), but for the purposes outlined above it is sumcient to consider only those which are. The main problem is how to approximate f'(g) by a quasidifferentiable function, and this is discussed in some detailed in papers by Rubinov and Yagubov [29], Shapiro [30] and Melzer [20]. If a function is not directionally differentiable then we employ the Dini directional derivative, which can also be approximated (for example, in the Lipschitzian case) by the difference of two positively homogeneous convex functions. Thus quasidifferential calculus is once again of use. It is now clear why it is important to develop quasidifferential calculus (and especially the software based on it).
5. Concluding remarks This paper considers only the finite-dimensional case, although most of the results can be extended to infinite-dimensional spaces (see, e.g., [9]). Second-order approximation problems seem to present an important and promising area of research, but at present only a few results have been obtained in this field.
References [ 1] F.H. Clarke, "Generalized gradients and applications", Transactions of the American Mathematical SocieO, 205 (1975) 247-262. [2] J.M. Danskin, The theory ofmax-min (Springer-Verlag, New York, 1967). [3] V.A. Demidova and V.F. Demyanov, "A directional implicit function theorem for quasiditterentiable functions", Working Paper WP-83-125, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983). [4] V.F. Demyanov, "'On a relation between the Clarke subditierential and the quasiditierential", Vestnik Leningradskogo Universiteta 13 (1980) 18-24 (translated in Vestnik Leningrad University Mathematics 13 (1981) 183-189). [5] V.F. Demyanov, "Qaiasidiiterentiable functions: Necessary conditions and descent directions", Working Paper W'P-83-64, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983). (See also this volume, pp. 20-43). [6] V.F. Demyanov and V.N. Malozemov, Introducton to minimax, ( Wiley, New York, 1974). [7] V.F. Demyanov and A.M. Rubinov, "On quasiditierentiable functionals", Doklady Akademii Nauk SSSR 250 (1980) 21-25 (translated in Soviet Mathematics Doklady 21 (1980) 14-17). [8] V.F. Demyanov and A.M. Rubinov, "On some approaches to nonsmooth optimization problems" (in Russian), Ekonomika i Matematicheskie Metody 17 (1981) 1153-1174. [9] V.F. Demyanov and A.M. Rubinov, "'Elements of quasiditterentiable calculus" (in Russian), in: V.F. Demyanov, ed., Nonsmooth Problems of Control Theory and Optimization (Leningrad University Press, Leningrad, 1982) pp. 5-127. [10] V.F. Demyanov and A.M. Rubinov, "On quasiditierentiable mappings", Mathematische Operations Forschung und Statistik, Series Optimization 14 (I) (1983) 3-21.
V.F. Demyanov, L.N. Polyakova and A.M. Rubinov / Quasidifferentiability
19
[11] V.F. Demyanov and I.S. Zabrodin, "Directional differentiability of a continual maximum function of quasidifferentable functions", Working Paper WP-83-58, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983). (See also this volume, pp. 108-117). [12] V.F. Demyanov, S. Gamidov and T.I. Sivelina, "An algorithm for minimizing a certain class of quasidifferentiable functions", Working Paper WP-83-122, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983). (See also this volume, pp. 74-84). [13] V.F. Demyanov, L.N. Polyakova and A.M. Rubinov, "On one generalization of the concept of subdifferential", in All Union Conference on Dynamic Control: Abstracts of Reports (Sverdlovsk, 1979) pp. 79-84. [14] W. Fenchel, "On conjugate convex functions", Canadian Journal of Mathematics 1 (1949) 73-77. [15] A.A. Goldstein, "'Optimization of Lipschitzian continuous functions", Mathematical Programming 13 (1977) 14-22. [16] J.-B. Hiriart-Urrnty, "New concepts in nondifferentiable programming", Bulletin Socidtd Mathdmatique de France, M~moire 60 (1979) 57-85. [17] A.D. Ioffe, "Nonsmooth analysis: Differential calculus ofnonditterentiable mappings", Transactions of the American Mathematical Society 26 (1981) 1-56. [ 18] A.Ya. Kruger and B.S. Mordukhovich, "Extremal points--the Euler equation in nonsmooth optimization problems", Doklady of the Byelorussian Academy of Sciences, 24 (1980) 684-687. [19] C. Lemarechal and R. Mifflin, eds., Nonsmooth optimization (Pergamon Press, New York, 1977). [20] D. Melzer, "Expressibility of piecewise linear continuous functions as a difference oftwo piecewise linear convex functions". (See this volume pp. 118-134). [21] J.-J. Moreau, "Fonctionelles sous-ditirrentiables", Comptes Rendas de l'Academie des Sciences de Paris 257 (1963) 4117-4119. [22] J.-P. Penot, "Calcus sous-ditterentirl et optimization", Journal of Functional Analysis 27 (1978) 248-276. [23] B.N. Pschenichnyi, Convex analysis and extremal problems (Nauka, Moscow, 1980). [24] L.N. Polyakova, "Necessary conditions for an extremum of quasidifferentiable functions" (in Russian), Vestnik Leningradskogo Universiteta 13 (1980) 57-62 (translated in Vestnik Leningrad University Mathematics 13 (1981) 241-247). [25] L.N. Polyakova, "'On the minimization of a quasidifferentiable function subject to equality-type quasidifferentiable constraints". Collaborative Paper, CP-84-27, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984). (See also this volume, pp. 44-55). [26] L.N. Polyakova, "On the minimization of the sum of a convex function and a concave function". Collaborative Paper, CP-84-28, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984). (See also this volume, pp. 69-73). [27] R.T. Rockafellar, Convex analysis (Princeton University Press, Princeton, New Jersey, 1970). [28] R.T. Rockafellar, The theory of subgradients and its applications to problems of optimization (Lecture Notes Series, Montreal University Press, Montreal, 1978). [29] A.M. Rubinov and A.A. Yagubov, "The space of star-shaped sets and its applications in nonsmooth optimization", Collaborative Paper CP-84-28, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984). (See also this volume, pp. 176-202). [30] A. Shapiro, "Quasidifferential calculus and first-order optimality conditions in nonsmooth optimization". SIAM Journal on Control and Optimization 23 (4) (1984) 610-617. [31 ] N.Z. 
Shor, "'On one class~ol~almost-differentiable functions and a method for minimizing functions of this class", Kibernetika'4 (1972) 65-70. [32] J. Warga, "Derivative containers, inverse functions and controllability", in: D.L. Russei, eds., Calculus of variations and control theory (Academic Press, New York, 1976) pp. 13-45.
Mathematical Programming Study 29 (1986) 20-43 North-Holland
QUASIDIFFERENTIABLE FUNCTIONS: NECESSARY CONDITIONS AND DESCENT DIRECTIONS V.F. DEMYANOV Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR and International Institute for Applied Systems Analysis, A-2361 Laxenburg, Austria Received 13 March 1985
Necessary and sufficientconditions for extrema of quasi-ditterentiable functions are considered. These conditions are expressed in terms of quasidifterentials of the functions involved (i.e., the function to be optimized and a function describing the set over which optimization is to be performed). It is important that the optimality conditions should be expressed in a form which yields some information concerning search directions if the point under examination does not satisfy the necessary conditions. It is shown that most of the conditions discussed here provide such information.
Key words: Quasiditierentiable Functions, Necessary and SufficientConditions, Steepest Ascent and Descent Directions, Stationary Points, Cone of Feasible Directions.
1. Introduction To solve o p t i m i z a t i o n p r o b l e m s in practice it is necessary to be able to check whether a given p o i n t is a n extreme p o i n t or not, a n d if it is not, to find a p o i n t which is in some sense 'better'. This is generally achieved t h r o u g h the specification of c o n d i t i o n s necessary for optimality. This p a p e r is c o n c e r n e d with extremal p r o b l e m s i n v o l v i n g a new class of n o n d i f f e r e n t i a b l e f u n c t i o n s - - t h e so-called quasidifferentiable functions. O n l y m i n i m i z a t i o n p r o b l e m s are discussed, without loss of generality. Different forms of necessary c o n d i t i o n s yield different descent directions which can be used to develop a variety of n u m e r i c a l algorithms. Subsections 1.1 a n d 1.2 provide a b r i e f s u m m a r y of related p r o b l e m s in m a t h e m a t i c a l p r o g r a m m i n g a n d convex analysis.
1.1. M a t h e m a t i c a l programming problems Let /2 c E~, x c c l / 2 where c l / 2 denotes the closure o f / 2 . Set
r(x)--
/
ve
llx,_xll
v=Ag
/
.
(1.1)
It is clear that l ' ( x ) is a closed cone. F ( x ) is called the set of feasible (in a b r o a d sense) directions of the set O at the p o i n t x.
20
V.F. Demyanov / Necessary conditions and descent directions
21
Let a function f be defined and continuously differentiable on an open set S c E, and l e t / 2 c S. N o w consider the problem o f minimizing f on the set/2. Let f * = i n f ~ 2 f ( x ) . Theorem 1. For a point x* e cl/2 to be an infimum point o f f on 12 it is necessary that
(f'(x*), v)>~o vver(x*)
(1.2)
where (a, b) denotes the scalar product of a and b, and f ' ( x ) represents the gradient of f a t x. Unfortunately it is difficult to use this trivial condition in practice. Let A c F ( x ) be a convex cone such that 0 9 A and let M(x) be a family o f convex cones such that
AcF(x)
VA 9
U
A=F(x).
(1.3)
Ac.~t(x)
In [1] cones o f this type are called 'tents'. It is always possible to find a family .d(x) defined as above (take, for example, ~t(x) = {l I l = {v = Xvol A/> 0}, Vo9 F(x)}). We denote by A + the cone conjugate to A: A + = {w 9 Enl (v, w) >I 0 Vv 9 A}. Theorem 2. Condition (1.2) is equivalent to
f'(x*) 9
§
VA 9
(1.4)
A point x * 9 c l / 2 which satisfies (1.4) (or, equivalently, (1.2)) is called a stationary
point o f f on /2. In what follows we shall suppose that 12 is a closed set. Assume that x 9 is not a stationary point o f f o n / 2 . Then there exists A 9 M(x) such that
f ' ( x ) Z A 4. Let us find min IIf'(x)-wll w~m
= IIf'(x)-w(m)ll =- Ilv(a)[I.
(1.5)
§
It is not difficult to see that
v(A) = w(A) - f ' ( x ) 9 A and that v ( A ) is a descent direction o f f o n / 2 at x, i.e.,
(f(x), v(A)) < 0 . It is also clear that the direction go = Vo/I[volh where IIVo[[ = maxa~.~r direction o f steepest descent o f the function f on the s e t / 2 at x, i.e.,
Of(x) Of(x) inf Ogo g~s,~r~) c3g
[[v(A)ll, is a
22
V.F. Demyanov / Necessa~ conditions and descent directions
Here St = { g e E,I
Ilgll =
3f(x) f(x + ag) - f ( x ) = lira Og , , ~ + o a
1},
A steepest descent direction m a y not be u n i q u e . Note that
3f(x) Of(x) - rain 3g(A) s~s,~a 3g where g ( A ) =
(1.6)
v(A)/IIv(A)II.
R e m a r k 1. C o n d i t i o n (1.4) is equivalent to
f'(x*) e ~(x*)
(1.7)
where ~ ( x ) = Oa~,~(x)A+. If ~ ( x * ) = {0} then we obtain the well-known condition f ' ( x * ) = 0. Example 1. Let
X=(X ('),x (2))e E2,
Xo = (0, 0),
/2 = Ii w 12tO 13,
where
I, = {x = (c~,O)l~ ~0},
12-~{x = (0, ~ ) l a ~0},
/3={x=(-a,-a)la~>O}. It is clear that F(xo) = 12 and M(Xo)= {It, 12, 13}, i.e., M(Xo) = {A,, A2, A3}, where At = It, A2 = 12, A3 = 13. NOW we have
A~={xcE21(x, ft)>~O}, F t = ( 1 , 0 ) , A~={x~E2I(x, F2)~>O}, / 2 = ( 0 , 1 ) , Af={xeE21(x, /3) ~> 0},
T3 = ( - - 1 , --1).
It can be seen from Fig. 1 that Z/'(Xo) = f"qi~t :3 A ~ - = {0} and therefore f ' ( x o ) = 0 is a necessary condition at Xo.
s
\
\,,A~ \x
~a Fig. 1.
\
V.F. Demyanov / Necessary conditions and descent directions
23
Remark 2. If x c $2 is not a stationary point then min
IIv-f(x)ll
= Ilv(x)-f(x)tl >0.
vc~(x,)
However, note that the direction
v(x) - f ( x ) g- IIv(x)-f(x)ll has nothing to do with descent directions (it m a y not even be feasible). Thus, the necessary condition (1.7) provides no information about descent directions if Xo is not a stationary point. In contrast, condition (1.4) is more workable because it allows us to construct descent and even steepest descent directions. For a continuously differentiable function f,
Of(x) = ( f ( x ) , g). Og Thus the p r o b l e m of finding steepest descent directions of f o n / 2 at x is reduced to that o f solving (1.6) for all A e s C ( x ) . For this reason we are interested in constructing a family s f ( x ) containing as few cones as possible. I f / 2 is a convex set the cone F ( x ) is convex and therefore ~r consists of only one set. L e t / 2 be described by inequalities /2 - - { x 9 E, lh,(x)<~O Y i 9 I}
(1.8)
where the hi's are from C1, ! = 1 : N. If x 9 and
O~ co{ hl(x)l i 9 Q(x)},
(1.9)
where Q(x) = {i 9 I lh,(x) = 0}, then (see, e.g., [6])
( F ( x ) ) ~ =- F+(x) = c o n e { - h l ( x ) l i 9 Q(x)}. Here cone B is the conic hull of B. It is an easy exercise to show that if a convex cone A contains an interior point then the condition (see (1.4))
f(x*) 9 A + is equivalent for any r / > 0 to the condition 0 9 c o { f ' ( x * ) w T, (A)} where
T,7(A) = { v 9 E.I v 9 [ - A + ] , Assume that x 9 such that
f ' ( x ) Z A +.
Ilvll =
~}.
is not a stationary point o f f o n / 2 . Then there exists A 9 . ~ ( x )
V.F. Derayanov / Necessary conditions and descent directions
24
Suppose that int A g 0. Then, from the above condition,
O~ co{if(x) u T. (A)} -= Ln(A).
(1.10)
Let us find min
v~ L, (A)
IIv[l= [Iv.(a)ll.
From (1.10) we deduce that
IIv, (a)ll > 0. It is easy to see that the direction v,1(A) gn(A) = - I I v, (a)ll
(1.11)
is such that
(if(x),gn(A))
g,(A)eintA.
Hence, g, (A) is a descent direction leading strictly inside the cone A. The fact that g , ( A ) is an interior direction is i m p o r t a n t - - t h e direction g(A) (see (1.6)) may be tangential even though it is the steepest descent direction o f f on A (see (1.6)). This feature may be crucial if O is described by (1.8) and condition (1.9) holds, since in this case F(x) is a convex cone and therefore ~ ( x ) consists of only one set (namely F(x)). Thus, on the one hand it is possible to find the steepest descent direction g(A) (see (1.7)) but this direction may not be feasible if the hi's are not linear; on the other hand the descent direction g , ( F ( x ) ) is feasible for any rl > 0, where V,?
g, (F(x)) =
IIv, II
and
IIv~ II = minll vii,
L, = co{if(x); ~/h~(x) I i ~ Q(x)}.
v~ L n
The foregoing analysis reveals the importance of having several (possibly equivalent) necessary conditions, in that this enables us to develop different numerical methods. Remark 3. It is not difficult to show that, in (1.11), gn (A) ~ ~ ~o g(A), where g(A) is the steepest descent direction o f f on A at x.
1.2. Convex programming problems Similar considerations can be applied to constrained non-differentiable convex programming problems of the form
m i n { f ( x ) l x ~ [2}
V.F. Demyanov / Necessary conditions and descent directions
25
where
/2 = {x ~ E~l h(x) <~O} and functions f and h are finite and convex (but not necessarily differentiable) on En. Suppose that there exists a point g such that h(~)
(1.12)
(This is called the Slater condition.) It follows from convex analysis (see [10]) that
F+(x)
if h(x) < O, ifh(x) =0
S{O} (cone{0h(x)} "t
where ah(x) is the subdifferential of h at x, i.e.,
Oh(x) = {v ~ E, lf(z) - f ( x ) ~ (v, z - x ) V z ~ E,}.
(1.13)
Theorem 3 (see [8]). For x* ~ 12 to be a minimum point of f on II it is necessary and sufficient that
Of(x*) n F+(x *) r O.
(1.14)
Theorem 4. (see [5]). Let h ( x* ) = 0. Condition (1.14) is equivalent to the condition
Oeco{Of(x*)w T n ( x * ) }=- L,7(x*)
Vr/>O
(1.15)
where
T,(x) = { r e [ - F §
I Ilvll = n}.
If x ~ / 2 is not a minimum point o f f on /2 then the direction
v(x)- w(x) g(x) = -
IIv(x)- w(x)ll/'
where
]]v(x)-w(x)]]=
min
]]v-w]],
veo/(x)
weF+(x)
is the steepest descent direction o f f o n / 2 at x. Let us find
gn(x) =
vn(x) IIv.(x)ll
(1.16)
where IIv.(x)ll = min~L,~x)Ilvll. The direction g, (x) given by (1.16) is a descent direction and it can be shown that g, (x) e int F(x).
V.F.. Demyanov / Necessary conditions and descent directions
26
Thus condition (1.15) enables us to find a 'feasible' direction (i.e., a direction leading strictly inside/2), and this can be useful in constructing numerical methods. Some of the methods based on (1.15) are described in Chapter IV of [5]. Note that if x is not a stationary point then
gn(x)
, g(x)
where g(x) is the steepest descent direction o f f o n / 2 at x.
Theorem 4' (see [5]). Let h(x*)= O. Condition (1.14) is equivalent to the condition
Oeco{Of(x*)w[,1 Oh(x*)]}=- L,n(x*)
V'r/>O.
(1.15')
Proof. Consider a function ~bn (x) = m a x { f (x) - f * , "oh(x)} where f * = m i n x c a f ( x ) . Since d~ (x) >!0 Vx ~ E,, and ~b~(x*) = 0, x* is a minimum point of ~b, on E,. However, ~b~ is a convex function and so cg~n (x*) = co{Of(x*) w [r/h(x*)]}. Applying a necessary and sufficient condition for an unconstrained minimum of a convex function, we immediately obtain (1.15'). Assume that x e / 2 is not a minimum point o f f on /2, and find the direction
g,n(x)-
V'n(X)
IIv,,(x)lt
(1.16')
where
iiv,,(x)ll
=
min
vcLt,~(x)
Ilvll.
It can be shown that the direction gl~ (x) defined by (1.16') is a descent direction and gin (x) e int F(,x). Note also that
g,n(x)
, g(x),
where g(x) is the steepest descent direction o f f on /2.
Remark 4. Condition (1.15') is applicable even i f / 2 is an arbitrary convex compact set (not necessarily described explicitly by a convex function).
V.F. Demyanov / Necessary conditions and descent directions
27
2. Quasidifferentiable functions
2.1. Definitions and some properties A f u n c t i o n f is called quasidifferentiable (q.d.) at a point x c E~ if it is directionally differentiable at x and if there exist convex compact sets Of(x) c E, and "~f(x) c E, such that
Of(x)_ lim f ( x + a g ) - f ( x ) = Og ~+o a
max (v,g)+ min (w,g). vr weaf(x)
The pair of sets Df(x)= [0f(x), 0f(x)] is called a quasi-differential o f f at x. Quasidifferentiable functions were introduced in [3] and have been studied in more detail in [7, 2]. A survey of results concerning this class of functions is presented in [4]. It turns out that q.d. functions form a linear space closed with respect to all differentiable operations and, more importantly, to the operations of taking pointwise maximum and minimum. A new form of calculus (quasidifferential calculus) has been developed to handle these functions, and both a chain rule for composite functions and an inverse function theorem have been established [5, 4]. In what follows we shall use only two results from quasidifferential calculus (see below). If Di = [Ai, B1], D2 -- [A2, B2] a r e pairs of convex sets (i.e., A~ c En, Bi c En are convex sets) we put
Dl + D2 = [Ai + A2, BI + B2] and if D = [A, B] then ~[AA, AB]
AD=[[AB, AA]
if ht>0, ifh <0.
The following is then true: 1. If functions f i ( i e I = - l : N ) are q.d. at x and Df~(x)=[~_fi(x),-Of(x)] is a quasidifferential of f~ at x then a function f=Y,i~t hif~ (where the hi's are real numbers) is q.d. at x and
Df(x) = Z a,Df,(x).i~.!
is a quasidifferential o f f at x. 2. If functions f ( i e I = - l : N )
are q.d. at x then
f = m a x f/ icl
is a q.d. function and
Df(x) = [_Of(x), Of(x)]
(2.1)
V..F. Demyanov / Necessary conditions and descent directions
28
is a quasiditterential of f at x, where O icR(x)
~f(x)=
Y~ ofk(x),
keR(x)
R(x)={i~IIf~(x)=f(x)}.
L.N. Polyakova [7] has discovered necessary conditions for an unconstrained optimum o f f on En: Theorem 5. For x* ~ En to be a minimum point of a q.d. function f on E. it is necessary
that -'Of(x*) c Of(x*).
(2.2)
For x** c E,, to be a maximum point of a q.d. function on En it is necessary that -Of(x**) c "Of(x**).
(2.3)
Conditions (2.2) and (2.3) represent generalizations of the classical necessary conditions for an extreme point of a smooth f u n c t i o n f on E, (in this case -Of(x) = {0}, Of(x) = {f'(x)} and from (2.2) it follows that f ' ( x * ) = 0. From (2.3) it also follows that f ( x * * ) = 0, i.e., the necessary conditions for a maximum and for a minimum coincide.) If f is convex on E, then -Of(x) = {0}, O_f(x) = Of(x), where Of(x) is the subditterential o f f at x (see (1.13)), and (2.2) becomes the well-known condition [8, 10]
0 ~ Of(x*). 2.Z Quasidifferentiable sets. Necessary conditions for constrained optimality A set /2 is called quasidifferentiable if it can be represented in the form
O = { x ~ E , Ih(x)<~O} where h is quasiditterentiable on E~. The properties of q.d. sets and the necessary conditions for optimality of a q.d. function on a q.d. set are discussed in [2] (see also [5, Chapter II]). Take x ~ 12 and introduce cones
y ( x ) = { g c E , lOho~)
yl(X)={gEE~lOh(x) <~O}. ag
Let h(x)= 0. We say that the nondegeneracy condition is satisfied at x if cl[y(x)] = yl(x) where cl A denotes t h e closure o f A.
(2.4)
V.F. Demyanov / Necessary conditions and descent directions
29
Lemma 1 (see [5, 2]). l f h ( x ) < 0 then F ( x ) = E,. l f h ( x ) = 0 and the nondegeneracy condition (2.4) is satisfied at x and h(x) is Lipschitzian in some neighborhood of x then
r(x)
= ~,(x)
(2.5)
where F ( x ) is the set of feasible (in a broad sense) directions of 12 at x (see (1.1)). The following two theorems and lemma are proved in [2]. Theorem 6. Let a function f be Lipschitzian and quasidifferentiable in some neighborhood of a point x*~12. I f h ( x * ) = 0 then let h be Lipschitzian and q.d. in some neighborhood of x* and the nondegeneracy condition (2.4) be satisfied at x*. For the function f to attain its smallest value on 12 at x* it is necessary that -'Of(x*) c O_f(x*)
ifh(x*) < 0
(2.6)
and (O_f(x*) + w) c~ [-cl(cone(_~h(x*) + w'))] r 0
ifh(x*) = 0
(2.7)
for every w~-~f(x*), w' e-~h(x*).
Theorem 7. Condition (2.7) is equivalent to the condition --Of(x*) c L(x*)
(2.8)
L(x)=
(2.9)
where
(-7 [O_f(x)+cl(cone(O_h(x)+w))]. we~htx)
A point x * e 12 which satisfies (2.7) when h ( x * ) = 0 and (2.6) when h ( x * ) < 0 is called a stationary point o f f on 12. Note that L(x) is a convex set (and nonempty, since O_f(x)c L(x)). Corollary. I f f and h are convex functions it follows from (2.8) that 0 ~ Of(x*) - F+(x *)
(2.10)
where Of(x) is the subdifferential of f a t x (see (1.13)) and F ( x ) is the cone of feasible directions of 12 at x. This condition is both necessary and sufficient for x* e O to be a minimum point o f f on 1"2. Necessary conditions for a maximum of a q.d. function on a q.d. set can be derived in an analogous fashion [2, 5].
V.F. Demyanov / Necessary conditions and descent directions
30
2.3. Descent and steepest descent directions Take x e / 2 and suppose that x is not a s t a t i o n a r . / p o i n t o f f o n / 2 . We shall now consider in more detail the case where h ( x ) = 0 and condition (2.7) is not satisfied. For every w e -af(x) and w'e -ah(x) we calculate min zeOf(x)+w z' ecl(cone(Oh(x)+w'))
IIz+ z'll--IIz(w, w')+z'(w, w')ll = d(w, w').
(2.11)
Then we find p ( x ) = max
d(w, w') = d(wo, W'o).
(2.12)
w'cah(x)
Since (2.7) does not hold, p ( x ) > O. Let
Vo+ W( Vo) go=
(2.13)
Ilvo+w(oo)ll"
Lemma 2. If h ( x ) = 0 and the nondegeneracy condition (2.4) is satisfied then the direction go (see (2.13)) is a steepest descent direction o f f on 12 at x and d ( x ) =
Ilvo+ W(vo)ll is the rate of steepest descent, i.e., Of(x) = Of(x) min = -d(x). Ogo g~t~x),~s, ag
(2.14)
Remark 5. Since there m a y exist several Wo, w~ satisfying (2.12), there may exist several (or infinitely many) directions o f steepest descent. (This is impossible for convex sets and convex or continuously differentiable functions.) Remark 6. Let K(w') = cl(cone(_0h(x*)+ w')). If int K+(w ') # ~, then condition (2.7) is equivalent to 0 e co{[af(x*) + w] u T~ (w')} =- L , (w, w') where
T~(w)={v~g(w')lllvll=,fl,
7>0.
If for some x e 12 and w 9 -af(x), w'e Oh(x) we have h(x) = 0 and 0 r L n (w, w'), then
z.(w, w') g~ (w, w') = IIz~ (w, w')ll =
IIz~ (w, w')ll min
zeLn(w,w')
where
Ilzl[
is a descent direction o f f on 12 at x and, above all, is feasible, i.e.,
Of(x) <0 Ogn(w, w')
and
Oh(x) <0. Og,(w, w')
V.F. Demyanov / Necessary conditions and descent directions
31
Remark 7. If x ei/2 is not a stationary point of f on /2 conditions (2.6) and (2.7) allow us to find steepest descent directions (see Lemma 2), but in the case where h(x) = 0 the directions thus obtained may not necessarily be feasible. Condition (2.8) is similar to (2.2) and if x is not a stationary point we have - a f ( x ) r L(x).
(2.15)
Let us find max p ( V ) = p ( O ( X ) ) where p ( x ) = min IIv+wll = IIo+w(o)Jl. wc L(x)
It follows from (2.15) that p(v(x))>O but it is not clear whether
v(x) + w(v(x))
IIv(x)+w(v(x))ll
go=
is a descent direction. Let h(x)= O. The problem of finding a steepest descent direction is equivalent to the following problem: minimize
0
(2.16)
subject to af(x) ~0, ag
Oh(x) ag
(2.17)
~<0,
(2.18)
Ilgll ~< 1.
(2.19)
Since f and h are quasiditterentiable functions, problem (2.16)-(2.19) can be rewritten as min{OI 0 e E,, g 9 E,,, [ 0, g] 9
(2.16')
where/21 c E.4 ~ is described by inequalities max (v,g)+ min (w,g)<~O, Or163
max (v',g)+
t,'~o_h(x)
(2.17')
wc'Of(x)
rain (w',g)<~O,
(2.18')
w'E~h(x)
[Igl[ <~ 1.
(2.19')
Let O(w, w') =- O(w, w', x), g(w, w')-~ g(w, w', x) be a solution to the problem min{0l 0 c E,, g e E., [0, g]e/21(w, w')}
(2.20)
V.F. Demyanov / Necessary conditions and descent directions
32
where we-~f(x), w'e'~h(x), and nl(w, w ' ) c En+l is described by inequalities max (v,g)+(w,g)<-O,
(2.21)
v~f(x)
max (v',g)+(w',g)<-O,
(2.22)
v'~o_h(x)
IlgU ~ 1.
(2.23)
Let [O*(x), g*(x)] denote a solution to problem (2.16')-(2.19'). It is clear that
O*(x) = O(w*, w'*),
g * ( x ) = g(w*, w'*)
where [w*, w'*] = arg min{0(w, w')J w ~ 0f(x), w'e 0h(x)}.
(2.24)
It is also clear that 0* = -p(x) (see (2.12)). Here g*(x) is a steepest descent direction, and O*(x) is the rate of steepest descent; it has already been pointed out that direction g*(x) (as well as directions g(w, w')) may not be feasible. We shall therefore consider the following problem: min{0] 0 c El, g~ En, [0, g]e/21,}
(2.25)
where r/> 0, and/'2~n c En+l is described by max (v,g)+ rain (w,g)<~O, veo_f(x)
max (v',g)+ rain (w',g)<~,70, v'eO_h(x)
(2.26)
w~f(x)
(2.27)
w'E~h(x)
Ilgll ~ 1.
(2.28)
Let (O~(x), g~(x)) be a solution to problem (2.25)-(2.28). Now let us also consider the following problem: min{O I 0 ~ E~, g e En, [O, g] e / 2 ~ (w, w')}
(2.29)
where w ~ -Of(x), w' E "~h(x), and 12~,(w, w') c En+t is described by inequalities max (v,g)+(w,g)<~ O,
(2.30)
v~O_f(x)
max (v',g)+(w',g)<~nO ,
v'~O_h(x)
(2.31)
V.F. Demyanov / Necessary conditions and descent directions
Ilgll <~ 1.
33
(2.32)
If 07 ( w, w') =- 07 (w, w', x), g7 (w, w') = g7 (w, w', x) is a solution to problem (2.29)(2.32) then
gT(x) = g7 (wT, w~)
07(x) = OT(w,, wO, t
.
where [ w,, w~] = arg min{ 07 (w, w') I w e ~f(x), w'~ Oh(x)}.
(2.33)
Direction g, (x) is feasible for any 7/> 0. Remark 8. When solving problem (2.24) (as well as (2.33)) it is sufficient to consider only boundary points of the sets ~f(x) and Oh(x). Furthermore, if each of these sets is a convex hull of a finite number of points, it is sufficient to solve only a finite number of problems of the form (2.20)-(2.23) (or, for problem (2.33), of the form (2.29)-(2.32)). These become linear programming problems if the Euclidean norm in (2.23) (or (2.32)) is replaced by the m-norm:
Ilgll,. = max{Ig, l]i~ 1: n} where
g=(gl,...,g,). Remark 9. Let "Ok ~ k~o O0. Without loss of generality we can assume that g,k (x) g*. It is possible to show that g* is a steepest descent direction o f f on 12 at x and that 0,k ( x ) ~ O*(x), where O*(x) is the rate of steepest descent. Remark 10. Let
x e 12 and h(x) not necessarily equal zero. Consider the problem (2.34)
min{ 010 ~ E,, g ~ E,, [ 0, g] ~ .O27} where 'O > 0, and 027 ~ E,+~ is described by max (v,g)+ min (w,g)<~O, vcd.f(x)
wcdf(x)
h(x)+ max (v',g)+ rain (w',g)~'oO, v' ~O_h(x)
w'~ h ( x )
Ilgll ~ 1.
(2.35)
The replacement of (2.31) by (2.35) enables us to deal with points in 12 close to the boundary. It is hoped that, as in mathematical programming (see, e.g., [9]), it will eventually be possible to develop superlinearly (or even quadratically) convergent algorithms.
V.F. Demyanov / Necessary conditions and descent directions
34
A geometric interpretation of problem (2.16)-(2.19) is given by (2.12). For a similar interpretation of problem (2.29)-(2.32) we use the following result (obtained by A. Shapiro [11]): Theorem 8. Let x ' e ~ 2 and h ( x * ) = 0 . Functions f and h are assumed to be quasidifferentiable on E,. For x* to be a minimum point o f f on/2 it is necessary that Ll(x*) c L2(x*)
(2.36)
Ll(x) = - [ 0 f ( x ) + 0h(x)],
(2.37)
L2 (x) = co{-of(x) - 0h (x), _Oh(x) - 0f(x)}.
(2.38)
where
Proof. Let x* be a minimum point o f f o n / 2 and let h(x*) = 0. Consider a function F ( x ) = max{f (x) - f * , h(x)} where f * = f ( x * ) = min f ( x ) . X.2$2
It is clear that F ( x ) >i 0 V x e E,. Since F(x*) = 0 it can be concluded that x* is a minimum point of F on E,. But F is a q.d. function (because it is the pointwise maximum of q.d. functions f ( x ) - f * and h(x)). Applying (2.1) we have D F ( x * ) = [-OF(x*), 0F(x*)] where -OF(x*) = c o { 0 f ( x * ) - Oh(x*), -oh (x*) - 0f(x*)}, ~ F ( x * ) = ~f(x*) + ~h(x*). Since x* is a minimum point of F on E,, (2.2) leads immediately to (2.36).
[]
Remark 11. Condition (2.36) is equivalent to (2.7) and is applicable even in the case where the nondegeneracy condition (2.4) does not hold. However, it seems that condition (2.6) is always satisfied at a degenerate point. Now let us consider the case where x ~/2, h(x) = 0 and condition (2.36) does not hold. We first find d ( x ) = max p ( v ) = p ( v ( x ) )
(2.39)
t~c L l ( x )
where
p(v)= min LIv- wlb--IIv- w(v)ll. w e L2(x )
It is clear that p ( v ( x ) ) > O.
(2.40)
V.F. Demyanov / Necessary conditions and descent directions
35
Since sets Ll(X) and L2(x) are convex there exists for every v s Ll(x) a unique w(v) which satisfies (2.40), but there is not necessarily a unique v(w) which satisfies (2.39). Consider a direction
v(x) - w(v(x))
(2.41)
go- IIv(x)- w(v(x))ll" Lemma 3. The direction go defined by (2.41) is a descent direction o f f on 12 at x. Proof. By definition (see (2.39)-(2.41)), max (v, go)> max (w, go). v~ Ldx)
(2.42)
v~ L2(x)
In particular, it follows from (2.42) that max (v, go) > vELI(X )
max
(W, go),
(2.43)
(w, go).
(2.44)
wcOf(x)-~h(x)
max (v, go) > v~ Ll(x)
max w~Oh(x)-Of(x)
From (2.43) max
(v, go)+ max (v, go)> max (w, go)+ vr
re[ -af(x) l
w c (.)f( x )
max
(w, go),
w~ [ - ~ h ( x ) ]
i.e., -
min (v, go)> max (w, go). veaf(x)
(2.45)
wcaf(x)
But (2.45) implies that
Of(x) _ Ogo
max (v, go)+ min (w, go)<0. veOf(x)
(2.46)
wc0f(x)
Analogously, it follows from (2.44) that
oh(x) ago
= max (v, go)+ min (w, go)<0. v~a_h(x)
(2.47)
wc~h(x)
Inequality (2.47) implies that go is feasible; inequality (2.46) shows that it is a descent direction. [] Remark 12. The direction go defined by (2.39)-(2.41) may not be }mique. Observe that since/'2 can be described by
12 = {xm E, lh,(x)<-O}, where h , ( x ) = ~Th(x), 7/> 0, we can obtain the following necessary condition Lln (x*) = L2, (x*)
(2.36')
V.F. Demyanov / Necessary conditions and descent directions
36
where L,n (x) = - [ 0 f ( x ) + r/0h (x)],
(2.37')
Lzn(x) = co{_af(x) - r/0h(x); , 0 h ( x ) - af(x)}.
(2.38')
For a nonstationary point x (when h ( x ) = 0) it is possible to obtain a descent direction got different from go. It is also useful to note that if ~t is a quasidifferentiable function strictly positive o n / 2 then O can be given in the form a = {xl x ( x ) h ( x ) <~0}. This representation provides a variety of necessary conditions and, consequently, a variety of descent directions at a non-stationary point.
2.4. Sufficient conditions for a local minimum Necessary conditions (2.7), (2.8), (2.36) can be modified in such a way that they become sufficient conditions for a local minimum of f on O. Recall that
f(xo+ag)=f(xo)+a
Of(xo) +o(ct, g), Og
h (Xo+ ag) = h(xo) + et Oh(xo) + or(or, g). ag
(2.48)
(2.49)
Functions f and h are assumed to be continuous and quasidifferentiable at Xo~/2; it is also assumed that o(a, g)
- - - ~ 0
uniformly with respect to g ~ St in (2.48) and that if h(xo)= 0 then ot(a,g)
- - - * 0 o/
uniformly with respect to g ~ St in (2.49). Recall also that
St={g~E.IIIgH =1}. Theorem 9 (see [5, 2]). / f h ( x o ) < 0 and
-'af(xo) = int a_f(xo) then Xo is a local minimum point o f f on 1"2.
(2.50)
V.F.. Demyanov / Necessary conditions and descent directions
37
If h(xo) = 0 and r=
r(w,w')>O,
min
(2.51)
w~2af( x O) w'E~h(XO)
where r( w, w') is the radius of the maximal sphere centered at the origin that can be inscribed in the set .~(w, w') = a_f(xo) + w+ci(cone(ah(xo)+ w')), then xo is a strict local minimum point o f f on 12 and r=
af(xo)
min g~ F(Xo)c~St
f~g
Theorem 10. I f h(xo)=O and
-af(xo) c i n t L(xo),
-
(2.52)
where L(x) is defined by (2.9), then Xo is a strict local minimum point o f f on 12. The p r o o f of this theorem is analogous to that of Theorem 9 (see, e.g., [5, Section 7, Chapter II]). Theorem
11. I f h(xo)=0 and
L~(xo) c i n t L2(xo),
(2.53)
where L~(xo) and L2(xo) are defined by (2.37) and (2.38), then Xo is a strict local minimum point o f f on 12. Proof. From (2.53) it follows that there exists an r > 0 such that max ( v , g ) ~< o~ Ll(xo)
max
VgeS~,
(w,g)-r
we L2 (xo)
i.e.,
(v,g)<-M-r
max
VgeS~
(2.54)
v~ [--af( xo)--ah( xo) ]
where M =
max w~r
(w, g).
Xo)-ah( xo),ah( xo)-af( xo) }
Since max
(v' g) = max l mv~a a x (v' g)' max ~B (v'
vcco{A~B}
then from (2.43) -
min
(w,g)-
weOf(xo)
min
f (w,g)<~max~ max ( v , g ) -
w~ah(xo)
L v~a_f(xo)
/ max ( v , g ) ve a_h(xo)
Two cases are possible:
(w,g)~-r w~'af( xo) J min
VgeS1.
min
(w,g);
w~ah(xo)
(2.55)
38
V.F. Demyanov / Necessary conditions and descent directions
1. M = max w~.~S(xo)-;~h(xo)(W, g) = max v~_a.f(xo)(v, g ) - minw~h(xo) (w, g) 2. M = max,c~_~h(xo)-~S(~o)(w, g) = maxo~_~h(xo)(v, g) - minw~S(~o) (w, g). In case 1 it follows from (2.55) that max (v,g)+ vt- a_f(Xn)
min
(w,g)=af(x~
wc af( xo)
(2.56)
Og
In case 2 it follows from (2.55) that max (v,g)+ v~a_h(xo)
rain
(w,g)>~r.
(2.57)
wcah(xo)
Since o(a, g)/a ~ 0 uniformly with respect to g c $1 in (2.48) and o~(a,g)/a ~ 0 uniformly with respect to g e S1 in (2.49), then (2.56) and (2.57) suggest that there exists an a > 0 such that for any x e S~ (Xo) = {x e E.IIIx- xoll<~~} and x # Xo either
f(x) > f(xo)
(2.58)
(in case 1), or
h(x) > h(xo) = 0
(2.59)
(in case 2). If (2.59) holds, then x ~ ~. Thus, it follows from (2.58) and (2.59) that
f ( x ) > f(xo)
Vx~llnS~(xo), X#Xo,
i.e., Xo is a strict local minimum point o f f on /2.
[]
Remark 13. Theorem 11 is stated by A. Shapiro in [11]. Example 2. Let - ~lxr
X=(X(I),X(2))EE2,
x <=~, o = { x
Xo=(0,0);
f(x)=lx<')l-lx<=>l+x<=~; h(x)=
~ E2I h(x) <~0}.
From quasiditierential calculus we have
O_f(xo) = co{(1, 1), ( - 1 , 1)}, af(xo) = co{(0, 1), ( 0 , - 1 ) } , Oh(xo) = {(0, -1)}, -Oh(xo)= co{(89 0), (-~-, 0)}. It is shown in [5, Section 5, Chapter II] that the nondegeneracy condition (2.4) is satisfied at Xo (see Fig. 2). We shall now verify the necessary conditions for a minimum. Construct sets L(xo), Ll(xo), L2(xo) (see (2.9), (2.37), (2.38)):
L(xo) =
['~
[O_f(xo)+cl(cone(a_h(xo)+ w))]
w6"ah(xo)
= c o { ( - 1 , 1), (1, 1), (0, -1)}, LI (Xo) = - [ a f ( x o ) + ah (Xo)] = co{(89 1 ), (89 - 1 ), ( - 89 - 1), ( - ~, 1)},
Of(xo) -Oh(xo) = co{( 3, 1), (_3, 1)}, a_h(xo)- af(xo) = co{(0, 0), (0, -2)}, L2(xo) = co{_af(xo) - a h ( x o ) ; a_h(xo)- af(xo)} = co{( 3, 1), (_3, 1), (0, -2)}. From Figs. 3 and 4 it is clear that the necessary conditions for a minimum (2.8) and (2.36) are satisfied.
E E Demyanov / Necessary conditions and descent directions
x(21
Fig. 2.
x(21
1
x (1)
Fig. 3.
Example 3. Let x e E2, Xo= (0,0); 12 and h be the same as in Example 2 and f(x) =
Ix'"l
- ~jx(21J + x r
Now we have Of ( x o ) = co{( 1,-1'), ( - I, 1)},
-Sf(xo) = CO{(O, I), (0, - ~)}.
L(xo) remains the same:
L(xo) = c o { ( - 1 , 1), (1, 1), ( 0 , - 1 ) } .
Let us find Ll(xo) and L2(xo): L,(xo) = - [ ~ f ( x o ) + 0h(xo)] = co{(~, ~), (I,- 89 ( - I , -89 ( - I , ~)}, Of ( x o ) - O h ( x o ) = co{( 3, 1), ( - 3, 1)},
O_h(xo) --Of(xo) = co{(0, -~), (0, _3)},
39
40
V.E Demyanov / Necessary conditions and descent directions
(2)
1
2; :. x(1)
Fig. 4.
--~h(xo), O_h(xo)-
L2(xo) = co{0f(xo)
~f(xo)} = co{( 3,
1),
( - 3 , 1), (0, --~)}.
It is clear from Fig. 5 that
-'Sf(xo) cint L(xo) i.e., the sufficient condition for a strict local m i n i m u m (2.52) is satisfied. Figure 6 shows that the sufficient condition (2.53) is also satisfied. Remark 14. In E x a m p l e 2 Xo was in fact a m i n i m u m point but we cannot deduce this from the necessary conditions alone.
x(2) 1~~4~
----1~
1
Fig. 5.
x(1)
V.F. Demyanov / Necessary conditions and descent directions
41
(2)
1
L2/
r- :-,, /
.,,,
-2 Fig. 6.
Example 4. Let x = (x (~), x(2))e E2, Xo = (0, 0); f(x) = -Ix('~l - x (2), ~ = {xl
h(x) ~
o}. T h u s the
functionf is t he
Ix~'l-Ix(2~l+x~2';
s a m e as in
h(x) = Example 2--.only
/2 has been changed (see Fig. 7). Note that the nondegeneracy condition (2.4) is satisfied and that
Of(xo) = co{(l, 1), ( - 1 , 1)}, "Of(xo)= co{(0, 1), ( 0 , - 1 ) } , o_h(xo)={(O,-1)}, -Oh(Xo)=CO{(1,O),(-1,O)}. Construct sets L(xo), L~(xo), L2(xo): L(xo)= A [O_f(xo)+cl(cone(Oh(xo)+W))]=co{(-1, 1),(1, 1),(0,0)}, wegh(x O)
xf2)
1
-1
S2
1 ,~ x (1)
/ Fig. 7.
42
V.F. Demyanov / Necessary conditions and descent directions O_f(Xo) --~h(xo) = co{(2, 1), ( - 2 , 1)}, O_h(xo) --~f(xo) = co{(0, 0), (0, -2)}, L,(xo) = - [ ~ f ( x o ) + ~h(xo)] = co{(1, 1), (1, -1), ( - 1 , - 1 ) , ( - 1 , 1)t,
L2 (Xo) = co{-0_f(xo) - -~h (Xo), Oh (Xo) - ~f(xo)t = co{(2, 1), ( - 2 , 1), (0, -2)}. We observe that the necessary condition (2.8) is not satisfied (see Fig. 8). We then calculate the steepest descent directions (see I'5, Section 7, Chapter II]), obtaining go = (,/-2/2,-x/2/2) and g ; = ( - x / 2 / 2 , - 4 2 / 2 ) . It can be seen from Fig. 8 that the necessary condition (2.36) is also not satisfied (this is hardly surprising since conditions (2.8) and (2.36) are equivalent). We shall now find directions satisfying (2.39)-(2.41). It is clear from Fig. 9 that there exist (2)
1
J
--1
x(1)
-1 Fig. 8. (2)
-~
x(1)
i -1
Fig. 9.
V.F. Demyanov / Necessary conditions and descent directions
43
two directions of this kind:
gl--
,
,
g'l=
i-6'
"
Figure 6 shows that the directions of steepest descent go and g~ are tangent directions but that the descent directions gl and g'l are interior.
References [1] B.G. Boltyanskiy, "A method of tents in the theory of extremal problems" (in Russian), Uspekhi Matematicheskih Nauk 30 (1975) 3-55. Translated in Russian Mathematical Surveys 30 (3) (1975) 1-54. [2] V.F. Demyanov and L.N. Polyakova, "Minimization of a quasidifferentiable function on a quasidifferentiable set", Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki 20 (1980) 849-856. Translated in U.S.S.R. Computational Mathematics and Mathematical Physics 20 (1980) 34--43. [3] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable functionals", Doklady Akademii Nauk SSSR 250 (1980) 21-25. Translated in Soviet Mathematics DoMady 21 (1980) 14-17. [4] V.F. Demyanov and A.M. Rubinov, "On some approaches to the non-smooth optimization problem" (in Russian) Ekonomika i Matematicheskie Metody 17 (1981) 1153-1174. [5] V.F. Demyanov and L.V. Vasiliev, Nondifferentiable optimization (in Russian) (Nauka, Moscow, 1981). [6] S. Karlin, Mathematical methods and theory in games, programming and economics, Volumes I, II (Addison Wesley, Reading, MA, 1959). [7] L.N. Polyakova, "Necessary conditions for an extremum of quasi-ditIerentiable functions", Vestnik Leningradskogo Universiteta 13 (1980) 57-62. [Translated in Vestnik Leningrad University Mathematics 13 (1981) 241-247. [8] B.N. Pshenichniy, Necessary conditions for extremum problems (Marcel Dekker, New York, 1971). [9] B.N. Pshenichniy and Yu. M. Danilin, Numerical methods for extremum problems (in Russian) (Nauka, Moscow, 1975). [10] R.T. Rockafellar, Convex analysis (Princeton University Press, Princeton, N J, 1970). [ 11] A. Shapiro, "On optimality conditions in quasidifferentiable optimization", SIAM Journal on Control and Optimization 23 (1984) 610-617.
Mathematical Programming Study 29 (1986) 44-55 North-Holland
ON THE MINIMIZATION OF A QUASIDIFFERENTIABLE FUNCTION SUBJECT TO EQUALITY-TYPE QUASIDIFFERENTIABLE CONSTRAINTS L.N. P O L Y A K O V A Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR Received 25 December 1983 Revised manuscript received 24 March 1984 This paper considers the problem of minimizing a quasidifferentiable function on a set described by equality-type quasidifferentiable constraints. Necessary conditions for a minimum are derived under regularity conditions which represent a generalization of the well-known Kuhn-Tucker regularity conditions. Key words: Quasidifferentiable Functions, Quasidifferentiable Constraints, Regularity Conditions, Necessary and Sufficient Conditions for a Minimum.
1. Introduction In this p a p e r we consider the problem o f minimizing a quasidifferentiable function [2, 5] subject to equality-type constraints which m a y also be described by quasidifferentiable functions. A regularity condition is stated which in the s m o o t h case is similar to the first-order K u h n - T u c k e r regularity condition. Sufficient conditions for this regularity qualification to be satisfied are then formulated in terms o f suband superdifferentials o f the constraint function. We also consider cases where the quasidifferentiable contraint is given in the form o f the union or intersection o f a finite n u m b e r o f quasidifferentiable sets: analytical representations o f the cone of feasible directions (in a b r o a d sense) are obtained for such cases. Necessary and sufficient conditions for a m i n i m u m o f a quasidifferentiable function on an equalitytype quasidifferentiable set are proved, as are sufficient conditions for a strict local minimum. A m e t h o d o f finding steepest-descent directions in the case where the necessary conditions are not satisfied (but u n d e r some additional natural assumptions) is also given. The theory is illustrated by means o f examples, some o f which c a n n o t be studied using the Clarke subdifferential or other similar constructions. Let h be a locally Lipschitzian function which is quasidifferentiable on E,, and Dh(x)=[O_h(x),'~h(x)] be its quasidifferential at x e En. Then the directional Translated from Russian at the International Institute for Applied Systems Analysis (IIASA), A-2361, Laxenburg, Austria. 44
L N. P o l y a k o v a / E q u a l i t y - t y p e quasidifferentiable constraints
45
derivative of h is given by
Oh(x) - m a x ( v , g ) + rain (w,g). 8g won(x) w6!oh(x)
(1)
a={x~E,
(2)
Let
lh(x)=O}.
Assume that the set 12 is n o n e m p t y and contains no isolated points. For every x ~ 12 set
~g It is clear that 7o(X) is a closed cone which d e p e n d s on h. It is not difficult to check that 3,o(X) =
I.J
[ c o n e + ( 0 h ( x ) + v) n ( - c o n e ) §
w)].
(3)
vcO_h(x) wc~h(x)
Here and elsewhere cone A is u n d e r s t o o d to refer to the conical hull o f set A, and cone + A to the cone conjugate to cone A. E x a m p l e 1. Let 12 = { x ~ E21h(x) =0}, where
x = (x (1), x ~2))~ E2,
h(x) = max{0, hi(x), h2(x)},
hi(x) = (x(t)) 2 + (x (2)- 2) 2 - 4 ,
h2(x) = -(x(l)) 2 - (x (2)- 1)2+ 1.
Let Xo = (0, 0) ~ E2: it is clear that Xo ~ / 2 . It is not difficult to show that we can take the pairs o f sets
Dh~(xo) = [{(0, -4)}, {(0, 0)}], Dh2(xo) = [{(0, 2)}, {(0, 0)}], Dh(xo) = [co{(0, - 4 ) , (0, 2)}, {(0, 0)}] as quasidifferentials o f functions ht, h2 a n d h at Xo. Here co A denotes the convex hull o f set A. Then we have yo(Xo)=
I._.J [cone+(v)c~(-cone)+(~h(xo))]= I._J (A, 0 ) = E ~ x{0}. v~_0h(xo)
A~E 1
For any x ~ f/ introduce the closed cone
F ( x ) = { g ~ E, 13A > 0, {x,}: x, ~ x, x, ~ x, xi ~ 12, X i -- X IIx, -
]
xll-' go, g = Xgo~.
(4)
L.N. Polyakova/ Equality-type quasidifferentiable constraints
46
The cone F ( x ) is called the cone o f feasible directions (in a b r o a d sense) o f set/2 at x. We say that the regularity condition is satisfied for function h at x e 12 if
r(x)
= ~,o(X).
(5)
Note that in E x a m p l e 1 the regularity condition is satisfied at x = Xo.
2. Sufficient conditions for the regularity qualification to be satisfied F r o m the definition o f a quasidifferentiable function it follows that the directional derivative is a continuous, positively h o m o g e n e o u s function o f direction g and is defined on E,. We shall use the following notation. Define hx(g) = Oh(x),
Og
h+~(g)= max (v, g), o~Oh(x)
h~(g)= min ( w , g ) , weOh(x)
Dh(x)=[O_h(x),'ah(x)] is a quasidifferential o f h at x. Then hx(g) = h+x(g)+h~(g). We shall now find a quasiditterential o f function hx(g) at point g e En. Since
where
function hx+(g) is finite and convex on En, and function h~(g) is finite and concave on En, the sets
O_h~(g)=co{vlv~R+(x)},
0h~(g) = {0}, (6)
_abe(g) = {0},
-ahx(g)=co{wl w e R - ( x ) } ,
can be taken as subditterentials and superditterentials o f functions h+~(g) and h~(g), where
R-~(x) -- {v ~ a_h(x)l(v, g) = h~+(g)}, R - ( x ) = {w ~ -ah(x)l(w, g) = h~(g)}. Therefore
Dh~(g) = [0h~(g), 0hx(g)].
(7)
Note that at point g = 0 an arbitrary quasidifferential o f function h at x can be taken as a quasiditterential o f function h~. For all other points g e E, we have
O_h~(g) c a_h(x),
Ohx(g) c -Oh(x).
(8)
The converse is also true: any quasiditterential o f function hx at point g = 0 is a quasidifferential o f function h at x.
Theorem 1. I f the function hx(g) has no strict local extrema on 3'o(X) then the regularity
condition is satisfied for function h at point x ~ 1-2.
L.N. Polyakova/ Equality-type quasidifferentiable constraints
47
Proof. Since the function h(x) is assumed to be locally Lipschitzian, the following inclusion (see [3]) holds:
r ( x ) ~ yo(X). We shall now try to prove the opposite. Choose an arbitrary r yo(X) and assume that r ~ F(x). Since the function h is continuous (see [2]), there exists a positive n u m b e r ao such that for every a e (0, oto] and any g e S~o(~,), g # g, the inequality h ( x + a g ) # O holds and sign h ( x + a g ) = c o n s t a n t . (Here S~(z)=
{v~ l~l llv- zll ~ r}.) Let us first assume that for all a 9 (0, ao] and g 9 S~o(g), g # ~, the inequality
h ( x + ag) > 0
(9)
holds. Since
h(x + ctg) = h(x) + othx(g) + o(a, g) and
o(,~, g)
O,
then without loss of generality we can assume that hx(g)I>0= h x ( g ) V g e S~o(r From the assumptions of the theorem the function hx(g) has no strict local minimum at/~ and therefore inequality (9) is not satisfied. In the same way it can be shown that there exists an a~ > 0 such that for every O/ E (0, Otfl] and any g e S~,(~), g # ~, the inequality h(x + a g ) < 0 is also not satisfied. The contradiction means that yo(x) c F ( x ) and thus proves the theorem.
Theorem 2. I f the function h has a quasidifferential Dh(x)=[_0h(x), ah(x)] at x e 12 such that -O_h( x ) c~ -ah(x ) = ~, then the function h satisfies the regularity condition at point x. Proof. Since 0 e yo(X), it follows from the properties o f a quasidifferential of function hx(g) at 0 (see (8)) and the assumptions of the theorem that neither the necessary condition for a minimum nor that for a m a x i m u m is satisfied for quasidifferentiable function hx on En at any point g e En. Thus, it follows from Theorem 1 that function h satisfies the regularity condition at point x, and Theorem 2 is proved. This regularity condition is first-order and therefore it possesses all the deficiencies characteristic of first-order conditions. We shall now consider an example in which this condition is not satisfied.
Example 2. Let t2 -- {x e El I h(x) = 0},
L N. Polyak.ova/ Equality-typequasidifferentiableconstraints
48 where
(x-l)
,
x>l,
h(x)= 0,
-1<~x<~1, x<-l.
( x + l ) 2,
Then 12 = c o { - 1 , 1}. The function h is s m o o t h and achieves its m i n i m u m value on E~ at every point x c/2. It is clear that the regularity condition is satisfied at every point of the set O except for points - 1 and +1. At these points yo(x) = Ej, F ( 1 ) = - g and F ( - 1 ) = g , where g 1> 0. Let hi, i e I = 1 : N, be locally Lipschitzian functions which are quasidifferentiable on E., and let Dhi(x) = [_~h~(x), 0h~(x)] be their quasidifferentials at x e El. Set
/ 2 , = { x e E , lh,(x)=O},
y , o = { g e E , lah'(x)=o }. ag
(a) A s s u m e t h a t / 2 = [ ' - ~ i / 2 ~ . Then
12 = {x ~ E, [h(x) = 0},
(10)
where h(x) = max{Ih,(x)lli ~ I}. In the case where the s e t / 2 is n o n e m p t y and function h satisfies the regularity condition at some point x c/2, we have
r(x) = 3,o(X) = ['l yio(X) = ("/ i~l
~_J
T(v,, w,)
i ~ l vial_hi(x) w, e a h d x )
=
U
T(vt, w,,..., vN, wN),
Vle#hl(x)iwacahl(x) ONE~_hN(x)iWNeOhN(X)
where
T(v, w) = c o n e + ( 0 h ( x ) + v) ~ I(-cone)§
w)],
T(vl, w,,..., vN, wN) = r] T(v,, wi). ir
(b) Let us n o w consider the case where /2 is the union o f a finite n u m b e r of quasidifferentiable sets: /2 = U / 2 , . icl
Then /2 = { x ~ E, l h(x) =O}, where
h(x) = min{Ih,(x)ll i ~ I}.
(11)
L N. Polyakooa/Equality-type quasidifferentiable constraints
49
If, in addition, the regularity condition is satisfied by function h at point x e O, then
r(x) = yo(X)= U y,o(X)= U icl(x)
i~:l(x)
r(v,, ~,y,
where l ( x ) = { i e l I x e O } .
3. Necessary conditions for a minimum of a quasidifferentiable function on an equality-type quasidifferentiable set
Let quasidifferentiable functions f and h be locally Lipschitzian on En and let D f ( x ) = [_0f(x), 0f(x)],
D h ( x ) = [_0h(x), 0h(x)]
be their quasidifferentials at some point x c O. Assume also that the set O is described by relation (2). We shall consider the following problem: Find minf(x).
(12)
Theorem 3 (see [6]). I f x* is a solution to (12) and if F c F(x*) is a convex cone then
--Of(x*) c O_f(x*) - F +. Theorem 4. Assume that function h satisfies the regularity condition at some point
x* c s
Then for x* to be a minimum point o f f on s it is necessary that -af(x*)c
~
[_0f(x*)- T+(v, w)].
(13)
vea h ( x " )
w~h(x*)
Proof. Let x* be a minimum point of f u n c t i o n f on/2. Then it follows from Theorem
3 and (3) that - a f ( x * ) c o _ f ( x * ) - T§
w),
v e S h(x*), weOh(x*).
Note that T+(v, w) = cl(cone(ah(x*) + v) - cone(_Oh(x*) + w)).
Inclusion (14) holds for every v e ~_h(x*) and w e Oh(x*), and therefore -0f(x*) c
~ ve#h(x*) we~h(x*)
This completes the proof.
[O_f(x*)-T+(v,w)].
(14)
50
L.N. Polyakova / Equality-type quasidifferentiable constraints
If set 12 is described by (10) and function h satisfies the regularity condition at some point x* e 12, then for x* to be a minimum point o f f on g2 it is necessary that
vt~Oht(x*)iwtcSht(x*) o,~-OhN(x*)iwN~-~hm(x*)
If set 12 is described by (11) and function h satisfies the regularity condition at some point x* ~ 12, then for x* to be a minimum point o f f on 12 it is necessary that
-'~f(x*) c
n
n
ie I(x*) v,c~_h,(x*) w~c?*hAx*)
[_0f(x*)-
T+(v,, w,)].
A point x* for which condition (13) is satisfied will be called an of function f on set 12.
inf-stationarypoint
Example 3. Let function f be superdifferentiable on E2 (i.e., such that it has a quasidifferentiai of the form Df(x)=[{0},0f(x)]) at each x~ E2. The set 12 is described by the relation D={x=(xlt~,xC2))~E21h(x)=O}, where h ( x ) =
IIx" l+x 2'I Consider the point Xo = (0, 0). We have
~h(xo) = co{(2, 2), ( - 2 , 2), (0, 0)}, "Oh(xo) = co{(-1, - 1 ) , (1, -1)}. It is easy to check that
"ro(Xo)= U
N
x c Et
T*(v, w) = cone(co{(-1, - 1 ) , (1, -1)}).
v~_h(xo) wc~h(xo)
It is clear that function h satisfies the regularity condition at Xo. Therefore for x0 to be a minimum point of f on /2 it is necessary that
Of(xo) c cone(co{(-1, - 1 ) , (1, -1)}).
4. Steepest-descent directions Assume that point x is not an inf-stationary point of quasidifferentiable function f on quasidifferentiable set 12, and that f satisfies the regularity condition at x. We shall now find a steepest-descent direction of function f on 1-2 at point x. First compute g o = a r g min Ilgll=l
g~l'(x)
3f(x) 3g
L. N. Polyakova / Equality-type quasidifferentiable
constraints
51
We have of(x) O> min - - min rain max (z,g) Ilgll= 1 ag Ilgll = t we~f(x) zcc3f(x)+w gcF(x)
g~_ F ( x )
rain ~,,e~f(x)
min
max
= min
min
/ l-
min
w ~ f ( x ) v'~oh(x) w'c~h(x)
=-
(z,g)
[Igll ~ l zc~_f(x)~w g~_ "yo(X )
max
min
w~-~f(x) zcb_f(x)+w v'c~h(x) te T+(v',w ') w'e~h(x)
\
min ze~_f(x)+w t~T*(t,',w')
IIz-tll)
IIz-tll.
Let Zo9 O_f(x) + Wo, Woe "~f(xo), roe O_h(xo), W'o9 -Oh(x), toe T ~( v', w') be such that Ilzo-toll =
max
min
IIz-tll.
wc~f(x) zeOf(x)+w -+ t~=T (v',w')
w'c~h(x)
v'r
Then the direction go=-(Zo-to)/llZo-toll is a steepest-descent direction of quasidifferentiable function f on set f/ (described by (2)) at point x. This steepestdescent direction may not be unique.
5. S u f f i c i e n t c o n d i t i o n s f o r a strict l o c a l m i n i m u m
If quasidifterentiable functions f and h are directionaUy differentiable at x 9 E., then
f ( x + ag) = f ( x ) + ot Of(x) + o(a, g), ag h ( x + ag) = h ( x ) + a
ah(x) + o,(a, g), Og
where
o(a, g) Og
o,(~, g) ot~+O
' 0,
Clg
ct~+O
' 0.
(15)
Assume that the convergence described by (15) is uniform with respect to g e E,,
Ilgll=l. Denote by r(w, v', w') the radius of the largest ball centered at the origin which can be inscribed in the set
Of(x)+ w - T+(v ', w'),
L.N. Polyakova / Equality-type quasidifferentiable constraints
52
where we-Of(x), w'e-Oh(x), v'e O_h(x). Let
r( x ) = min
r( w, v', w').
c~f(x) v'~a_h(x) w'c'~h(x)
Theorem 5. If set 12 is described by (2), point Xoe 12 and
--~f(xo) c i n t
~
v'ea_h(xo) w'e~h(xo)
[_af(xo) - T+(v', w')],
(16)
then min
af(xo)
e~ vo(~) Ilgll=l
- r ( x o ) > O.
c~g
Proof. If inclusion (16) is satisfied at Xo e 12, then for every w e 0f(xo), v ' e _ah(xo), w'e-~h(xo) we have min
(v, g) = r(w, v', w').
max
g~ T(u',w') vea_f(xo)+w Ilgll=l
But since ro(Xo) =
T( v', w'),
U
v'c#h(xo) w'eoh(xo)
then min
af(xo)
rain
min
min
w~3f(xo) w' c~h(xo) v' ~h(Xo)
Ilell=l
x
min
max
(v,g)
ge T(v',w') vcOf(xo)+W
Ilgll=l
=
min
min
rain
r(w, v', w') = r(xo).
weSf(xo) v'e_~h(xo) w'~'Oh(x.o)
It is clear that r ( ~ ) > 0, thus proving the theorem.
Theorem 6. I f inclusion (16) is satisfied at Xoe 12, then Xo is a strict local minimum
o f f on 1"1 and there exist numbers e > 0 and 8 > 0 such that f ( x ) >~f(xo) + e IIx- xoll Vx ~ 12 n S~(xo). Proof. Take F > 0 and set (see [1])
A~(xo)={g~E,,IUgll=l,
Oh(xo) -~g <~e}.
L.N. Polyakova/ Equality-type quasidifferentiable constraints
53
The set A~(xo) c E, is clearly compact, and if ~ = 0 then Ao(xo) = yo(Xo) n SI(0). It follows from Theorem 5 that there exists an r(Xo)> 0 such that min
af(xo)
8~Ao(~o)
Og
= r(xo) > 0,
and therefore we can find g > 0 and F(Xo)> 0 such that rain
g c Ai(xo)
Of(xo) = F(xo) > 0. ag
Fix a > 0 and choose an arbitrary x ~/2 c~ S~(xo). IfA = then
IIx- xoll and g = ( l / A ) ( x
- Xo)
(af(xo) + o(,~,x g)~,/ ag
f(x)-f(xo) = A \
(17)
dg where
o(,~, g) A
A~+o
o,(,~, g)
; O,
h
A---+O
~0
uniformly with respect to g, Ilgll = 1. Set F--- min{f, 89 such that max{l~,
~
}~
Then there exists a 8 > 0
V A t ( O , 8].
(18)
Given such a 8, eqs. (17) are valid for any x s / 2 n S~(xo). This gives us
f ( x ) - f ( x o ) >i ]Ix - xoll ~, where e = F(Xo) Example 4. Consider the same f u n c t i o n f and set O as in Example 3. I f the inclusion
af(xo) c i n t cone(co{(-1, - 1 ) , (1, -1)}) is satisfied at Xo = (0, O) c/2, then Xo is a strict local minimum point of function f on set 12.
6. Reduction to the unconstrained case
Consider the function
F ( x ) = m a x { f (x) - f * , h(x), - h ( x ) } ,
L.N. Polyakova / Equality-type quasidifferentiable constraints
54
where f * = infxr162 Function F is quasidifferentiable on En. It is clear that if a point Xo is a solution to problem (12) then Xo is also a minimum point of F on En. We shall now write down a necessary condition for F to have a minimum on En at Xo. Since
~_F(Xo)= co{A, B, C], where
A = O_f(xo) - -~h(xo) + O_h(xo), B = 2_0h(Xo) --~f(xo),
C = -2~h(xo) --Of(xo)
and
-~F ( xo) :- ~f( Xo) + "~h( xo) - ~_h( xo), then the following result holds. Proposition. For a point Xo ~ 0 to be a minimum point o f f on 1"1 it is necessary that
- ~ F ( x o ) c OF(xo).
(19)
Remark. In some cases condition (19) is a worse requirement for an extremum than condition (13). This can be illustrated by means of an example. Example 5. Consider the came function h as in Example 3:
h(x)=llxd+ x=l,
x=(x"),xr
E=.
Let Xo = (0, 0). It is not difficult to check that
O_h(xo) --~h(xo) = co{(1, 1), ( - 1 , 1), (3, 3), ( - 3 , 3)}, 2Oh(xo) = co{(0, 0), (4, 4), ( - 4 , 4)} and
O_h(xo) - - ~ h ( ~ ) ~ 20_h(xo).
(20)
However, inclusion (20) implies that any quasidifferentiable function f satisfies (19) (the necessary condition for a minimum on the set 12) at the point Xo = (0, 0). Theorem 7. I f functions f and h are quasidifferentiable, the convergence in (15) is
uniform with respect to g e EM, IIg[[ = 1, and -'~F(xo)C int ~_F(xo), then Xo~ O is a strict local minimum point o f f on the set aq described by (2). The p r o o f is analogous to that of Theorem 11 in [4].
L. N. Polyakova / Equality-type quasidifferentiable constraints
55
References [1] V.A. Daugavet and V.N. Malozemov, "Nonlinear approximation problems" (in Russian), in: N.N. Moiseev, ed., The state-of-the-art of operations research theory (Nauka, Moscow, 1979) pp. 336-363. [2] V.F. Demyanov and A.M. Rubinov, Approximate methods in optimization problems (American Elsevier, New York, 1970). [3] V.F. Demyanov and L.V. Vasiliev, Nondifferemiableoptimization (in Russian) (Nauka, Moscow, 1981). [4] V.F. Demyanov, "Quasidifferentiable functions: Necessary conditions and descent directions", Working Paper WP-83-64, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983). [5] L.N. Polyakova, "Necessary conditions for an extremum of quasidifferentiable functions", Vestnik Leningradskogo Universiteta 13 (1980) 57-62 (translated in Vestnik Leningrad University Mathematics 13 (1981) 241-247). [6] L.N. Polyakova, "On one problem in nonsmooth optimization" (in Russian), Kibernetika 2 (1982) 119-122.
Mathematical ProgrammingStudy 29 (1986) 56-68 North-Holland
QUASIDIFFERENTIAL CALCULUS AND FIRST-ORDER OPTIMALITY CONDITIONS IN NONSMOOTH OPTIMIZATION Alexander S H A P I R O Department of Mathematics and Applied Mathematics, University of South Africa, P.O. Box 392, Pretoria 0001, South Africa
Received 22 September 1983 Revised manuscript received 28 October 1984 This paper is concerned with first-order optimality conditions for nonsmooth extremal problems. Local approximations are obtained in terms of positively homogeneous functions representable as the sum of sublinear and superlinear functions or, equivalently, as the difference of two sublinear functions (d.s.I. functions). The resulting optimality conditions are expressed in the form of set inclusions. The idea of such approximations is exploited through the detailed study of d.s.I, functions and the cones corresponding to nonpositive values of d.s.I, functions. Key words: Nonsmooth Optimization, Quasidifferentiable Functions, Optimality Conditions.
I. Introduction
In recent years much attention has been paid to nondifferentiable functions appearing in optimization theory. Efforts to find adequate tools for handling nonsmooth problems have resulted in the use of methods and techniques borrowed from convex analysis. Linear functions, which served quite well for the purposes of local approximation in differential calculus, have been replaced by sublinear or superlinear functions. Analogously, single gradient vectors have been substituted by convex compact sets (see, e.g., Rockafellar [16], Demyanov and Malozemov [4], Pshenichnyi [13] and Clarke [3]). These efforts have led to Clarke's theory of generalized gradients [ 1-3], which has been found to be a powerful instrument for investigating nonsmooth problems. However, although the method is good enough for global analysis it is often too rough for the purposes of local approximation, which is the main concern in optimization theory. The reasons for this are quite clear: if a function f ( x ) is essentially nonconvex then a sublinear function can provide only a rough bound on the local behavior o f f ( x ) . In this paper we consider an approach aimed at obtaining better local approximations of nondifferentiable functions. This approach was first put forward by Demyanov and Rubinov in [5] (see also [7] and the references therein). They introduced a class of functions (quasidifferentiable functions) which could be approximated to first order by a sum of sublinear and superlinear functions. In Section 2 we carry out a detailed investigation of positively homogeneous functions 56
A. Shapiro / Quasidifferential calculus and optimality conditions
57
which can be represented as the sum of sublinear and superlinear functions or, equivalently, as the difference of two sublinear functions (d.s.I. functions). It will be shown that there is a close relation between d.s.l, functions and the class of functions which can be locally represented as the difference of two convex functions (d.c. functions). Section 3 is concerned with first-order approximations of locally Lipschitz functions. It is shown how Clarke's method of generalized derivatives can be used to obtain approximations in terms of d.s.i, functions. Finally, different optimality conditions involving d.s.1, approximations are discussed in Section 4. It is demonstrated that necessary conditions proposed in [6] hold for almost every perturbed inequality-constrained problem. We denote by SA(') the support function of a bounded set A d E,, i.e.,
SA(X) =--sup{(x, y)lY r A}. It is known that every support function is sublinear and conversely that every sublinear function ~ is the support function of a certain convex compact set. This set is unique and given by
A = {yI(x, y)<~ q~(x), Vx ~ En}. The distance from a point x to a set A will be denoted by da(x), i.e.,
da(x)-= inf{l[x-yll [y~ A}.
2. The spaces of d.s.I, and d.c. functions In this section we study positively homogeneous functions which can be represented as the difference of two sublinear functions. The importance of these functions will become evident in later sections where they will be used for the purposes of local approximation. Definition 1. We say that a positively homogeneous function ~: En --, E~ is difference sublinear (d.s.I.) if it can be represented as the difference of two sublinear functions. The linear space of d.s.l, functions will be denoted by D S L ( E , ) or simply DSL. Definition 2. We say that a function f : D--, El, defined on an open domain D_c E~, is difference convex (d.c.) at a point x ~ D if there exist a convex neighborhood U of x and a pair of convex functions g, h on U such that
f(x) =g(x)-h(x) for all x ~ U. We say t h a t f is d.c. on D i f f is d.c. at every point x of D. The class of d.c. functions has been investigated by several authors (see [15, Section 14 and the references on page 28]). Hartman [10] showed that this class is
58
A. Shapiro / Quasidifferential calculus and optimality conditions
closed u n d e r superpositton and then under any algebraic operation which is defined. In particular, the pointwise m a x i m u m ( m i n i m u m ) of a finite family of d.c. functions is also a d.c. function. Rockafellar [17] introduced a class of m a x functions, called l o w e r - C 2 functions, and characterized them in terms of representability as a difference of convex and quadratic convex functions. In particular, this result implies that the squared distance function d 2 ( . ) is a d.c. function for every set A c_ E~. ( L o w e r - C 2 functions characterized as the pointwise s u p r e m u m of a family of C2-functions uniformly b o u n d e d in a C 2 - n o r m have also been studied by Yomdin [19].) The following t h e o r e m demonstrates that there is a close relation between the spaces of d.c. and d.s.l, functions. We denote the tangent h y p e r p l a n e to the sphere S ~ _ ~ = { x ~ E , [ [ I x l [ = l } at a point x c S , _ ~ by Px, i.e., P ~ = { x + y l ( x , y ) = O } . Note that P~, as an affine subspace of E,, has a linear structure. Theorem I. A positively homogeneous function ~p belongs to D S L ( E , ) iff the restriction o f ~ to p~ is d.c. at x f o r every x ~ S,_~. Proof. The restriction of a convex function to an afline subspace is convex, so that the condition must be necessary. N o w s u p p o s e that the restriction ~ of ~ to Px is d.c. at x, i.e., there exist a convex n e i g h b o r h o o d V of x and a convex function ~ on ~'-- Vc~ Px such that ~ + ~ is convex on V. The convex function ~ can be represented on V as the pointwise s u p r e m u m of a family {g~, i c I} of functions which are affine on p~, i.e., ~ ( x + y ) = sup{g,(y) I i e I},
(1)
where g~(. ) = (a~, 9) + b~, a~ c En and b~ ~ E~. Moreover, this family and the neighborh o o d V can be chosen in such a way that []a,[[<~M and [b,l~<M,
Vial,
(2)
for some positive constant M. Every afline function g~ is associated with a linear function G~ on E,, defined by G~(tx+y)=tb~+(a~,y),
VteE~,
such that the restriction o f G~ to P~ coincides with g~. N o w consider the sublinear function ~7(x) = s u p { G i ( x ) [ i ~ I}. From (2) we have that 7/(. ) is finite on E, and, from (1), the restriction of 7/to P~ coincides with r~. Then, since ~ + -~ is convex on ~', we deduce that r + 7/is convex in a n e i g h b o r h o o d of x. N o w , due to the c o m p a c t n e s s of Sn_t, there exist points Xl, Xk S n_l and sublinear functions "0t. . . . . r/k such that tp + rh is convex in a n e i g h b o r h o o d of x~, i = 1 . . . . . k, and S,_~ is covered by the union of these neighborhoods. C o n s i d e r the sublinear function ~: = r/1 +" 9 9+ r/k. We know that the positively .
.
.
,
E
A. Shapiro / Quasidifferential calculus and optimality conditions
59
homogeneous function ~p+sc is convex in some neighborhood of every point of S~_t. The following lemma shows that this implies that ~ + sc is sublinear on E,, thus completing the proof. Lemma 1. If a positively homogeneous function ~b: E, ~ E~ is convex in some neighbor-
hood of every point of S,_~, then 0 is sublinear on E,. Proof. First we observe that, because of its positive homogeneity, 0 is convex in some neighborhood of every point x e/5,, x # 0. Consider a, b e E, and the interval [ a, b] = { ta +(1 - t)b]0 ~ t ~ 1}. We have to show that the restriction t~ of 0 to [a, b] is convex. If 0~ [a, b], then ~ is convex in some neighborhood of every point of [a, b] and hence is convex on [a, b]. The proof in the remaining case, when 0 e [a, b], follows easily from continuity arguments. The result of Theorem 1 indicates that many properties of d.c. functions are shared by d.s.I, functions. In particular, it shows that if the restriction of a positively homogeneous function ~0 to the sphere S,_~ is C2-smooth (in an atlas of local coordinate systems of S , 1), then ~o~ DSL(E,). There is a dual relationship between sublinear (superlinear) functions and the class /2 of convex, compact subsets of E,. This duality suggests that every d.s.1. function ~o is associated with a pair of sets A, B e/-2 such that ~(x) = max{(x, v ) l v ~ A}+min{(x, w ) l w ~ B} or equivalently,
~p(x) = SA(X) -- S-n(X).
(3)
Consider an equivalence relation -- defined o n / 2 x/2 such that (At, B~) - ( A 2 , B2) iff At - B2 = A 2 - B~. In other words, the pairs (A~, B~) and (A2, B2) are equivalent iff they represent the same d.s.l, function ~ in (3). The equivalence class containing (A, B) will be denoted by [A, B]. The linear space .0 is taken to be the quotient space /2 x ~ / - , where the algebraic operations in ~ are imposed by the linear space DSL, i.e.,
[ml, Bt]+[A2, B2]=-[A~+A2, B~+ B2] and if a I>0 then
a[A, B] ~- [aA, aB] while if or < 0 then
a[A, B] =- [aB, aA]. The space ~ with the norm imposed by the Hausdortt distance p(A, - B ) between the sets A and - B becomes a normed space considered by RadstrOm [14]. It can be shown that the correspondence
[ A, B]~--~SA(" )-- s_n(" )
(4)
A. Shapiro / Quasidifferential calculus and optimality conditions
60
between Radstr6m's space ~ and the DSL space, with the sup norm
II~ll~-- sup{l~(x)ll x ~ S~_,}, is isometric (cf. Demyanov and Rubinov [5]). Since the d.s.l, functions are continuous, their restrictions to S,_~ form a subspace of C (S~ _j), i.e., the Banach space of continuous functions on S,_ j with the sup-norm. A continuous function on Sn-~ can be approximated with given precision by a C2-smooth function on S~_~ with the sup-norm. We have seen that a positively homogeneous function with C2-smooth restriction to Sn_~ is d.s.l. This leads to the following result: Corollary 1. The normed spaces D and DSL (with the sup norm) are isometric to
each other and to a dense subspace of C(Sn_l). It has been shown by Demyanov and Rubinov [8, Theorem 2] that the superposition of d.s.l, functions is also a d.s.I, function (see also [18, Theorem 2.1]). In particular, the DSL space is closed under the operations of taking pointwise maximum and minimum of a finite family of d.s.l, functions. Moreover, if ~o~ are d.s.l, functions and [A~, B~] are the corresponding elements of D, i = 1. . . . . m, then the max function ~0ma~(X) --=max{~oi(x), i e 1 : m} is associated with
and the min function ~mm with
(cf. [7, Lemmas 2.2 and 2.3]). In the classical theory of convex analysis, every closed convex cone C is associated with another closed convex cone C o
C~
y)~O, V x ~ C},
which is said to be polar (or dual) to C. Note that (C~ ~ C, i.e., cone C can be considered as the polar of C o (e.g., [16, p. 121]). Therefore every d o s e d convex cone C can be represented in the form c = { x l s A ~ x ) <~ 0},
(7)
where A is a compact convex set generating the cone C ~ It is interesting and perhaps a little surprising that every closed cone C can be represented in a form similar to (7) if the sublinear function SA(') is replaced by a d.s.l, function ~o(.).
A. Shapiro / Quasidifferential calculus and optimality conditions
61
Theorem 2. Every closed cone C is associated with a d.s.l, function ~ such that
c = Ix I,p(x) <~0}.
(8)
Proof. C o n s i d e r the function
~:(.)= a ~ ( . ) - a~e( . ), where (" = E , \ C is the c o m p l e m e n t of (7. Clearly ~:(x) <~ 0 iff x E C and ~(tx) = t:~(x) for every nonnegative n u m b e r t. Also, as we have already mentioned, d 2 ( . ) and d ~ ( - ) are d.c. functions and therefore their difference ~(. ) is a d.c. function. Define the function ~o as follows: ~(x)=llxll-'~(x).
It can be seen that the function ~0 is positively h o m o g e n e o u s and has the same restriction to S, i as s~. For a point XE S,_~ the restriction ~ of ~o to the h y p e r p l a n e p~ is ~ ( x + y ) = (1 +
Ilyll2)-~/2~(x+y),
where ~ is the restriction of sr to Px. Since ~ is d.c. at x and the function (1 + Ilyl12) -1/2 is C : - s m o o t h we deduce that ~ is d.c. at x. By T h e o r e m 1 this implies that ~o E DSL. It follows from the definition of ~o that ~ ( x ) <~ 0 iff x e C. For a given closed cone C we let n[C] denote the set of d.s.l, functions ~o satisfying (8) and N [ C ] the set o f corresponding elements of ~ , i.e., [A, B] E N [ C ] iff C = { X l S a ( X ) -- S _B(X) ~
0}.
It can easily be verified that n[C] and N [ C ] form closed convex cones in the D S L and R a d s t r r m spaces, respectively. We say that a d.s.1, function ~0 is nondegenerate if
cl{xl ~(x) < 0} = {x I~(x) <~0} (cf. [7, p. 221]). Clearly, if there exists a n o n d e g e n e r a t e d.s.l, function r in n[C], then the cone C is the topological closure of its interior int(C). Conversely, let r be the d.s.1, function constructed in T h e o r e m 2 and x E int(C). Then d e ( x ) > 0 and hence ~o(x) < 0. C o n s e q u e n t l y if c l ( i n t ( C ) ) = C,
(9)
then the function r is nondegenerate. Corollary 2. For a closed cone C there exists a nondegenerate d.s.l, function 9~ in n[ C ] iff condition (9) holds. Consider a finite family o f closed cones {Ca iE l : m } and let ~oiE n[Ci], i = 1 , . . . , m. The corresponding max function Cma~ belongs to n[(],"=l Ci] and the min
A. Shapiro / Quasidifferential calculus and optimality conditions
62
function ~mi, to n[I,_]~'_~ Ci]. Therefore if some elements [Ai, B~] of N[C~] are known, elements of N[(]~'=I C,] and N[l,_JT'=t C~] can be constructed in accordance with formulae (5) and (6), respectively.
3. Quasidifferentiable functions and DSL approximations Let f ( x ) be a real-valued function defined on an open domain D c E,. It is said that f is quasidifferentiable at x if f is directionally differentiable at x and the directional derivative f ' ( y ) can be represented in the form
f ' ( y ) = max{(y,
v)l v c A} + min{(y,
w)[ w e B}.
The corresponding element [A, B] of 12 is called the quasidifferential o f f at x and is denoted by
~f(x) = [Of(x), af(x)]. Quasidifferentiable functions were introduced by Demyanov and Rubinov in [5] and have been studied in a number of publications since then (see [7, 8, 18] and the references therein). It has been demonstrated that quasiditterentiable functions form a linear space closed under superposition and then under algebraic operations whenever these are defined. The resulting quasidifferential calculus has been used to obtain first-order optimality conditions for nonsmooth problems involving quasidifterentiable functions [6-9, 12, 18]. In this and the following section we exploit ideas very close to the quasidifferentiable approach mentioned above. We shall assume throughour that function f is locally Lipschitz although not necessarily directionally difIerentiable. We recall that t
fx(Y) = iim sup t~o'
f ( x + ty) - f ( x ) t
is called the upper Dini (directional) derivative o f f ( . ) at x. The lower Dini derivative is defined in a similar way (e.g., [3, p. 243]).
Definition 3. A positively homogeneous continuous function 9: E n ~ Et is said to be an upper (first-order) approximation of f at x if f~+(- ) ~< ~0(. ). We say that an upper approximation ~p is upper d.s.1, if ~ e DSL. Upper approximations by sublinear functions have been used in many studies of first-order optimality conditions (see, e.g., Ioffe [11 ]). But if the upper Dini derivative f ~ ( . ) is essentially nonconvex, then it is not possible to find a 'good' upper approximation in the class of sublinear functions. On the other hand, the DSL space provides, at least theoretically, an apportunity to approximate f~+(.) with given precision by a d.s.I, function.
A. Shapiro / Quasidifferential calculus and optimality conditions
63
O f course, the concept of d.s.l, a p p r o x i m a t i o n s has no practical significance unless some way o f constructing such a p p r o x i m a t i o n s can be found. Here we make use of the generalized directional derivative f o ( . ) and the generalized gradient a o f ( x ) of Clarke (see [3, Section 2.1]). Recall that f o ( . ) is the s u p p o r t function of a c l f ( x ) and hence is a sublinear function [1, Proposition 1.4]. Also, it follows immediately from the definition that - q ( . ) : _ f o ( . ) provides an u p p e r a p p r o x i m a t i o n o f f at x. Following Clarke [3, Definition 2.3.4], we say that a locally Lipschitz function h is regular at x if h is directionally differentiable at x and h',(. ) = h~ 9). Let function h: E, ~ E~ be regular at x and consider the function g = f + h. Since gO(.) is an u p p e r a p p r o x i m a t i o n of g at x we have that ~p(.) = g O ( . ) _ h , ( . )
(10)
is an u p p e r d.s.I, a p p r o x i m a t i o n o f f and [ a c t g ( x ) , - a c ~ h ( x ) ] is the corresponding element o f / 2 . C o m p a r i n g the d.s.I, a p p r o x i m a t i o n tp(- ), given by (10), with "0(" ) = f o ( . ) one finds that tp(. ) is always better or at least the same as r/(-) in the sense that r ) <~ r/(. ). This follows immediately from the fact that the generalized derivative gO(. ) of the sum is less than or equal to the sum o f the generalized derivatives f o ( . ) + hO(. ) [3, Proposition 2.3.3]. The pointwise m i n i m u m of a finite family of u p p e r d.s.l, a p p r o x i m a t i o n s is also an u p p e r d.s.l, a p p r o x i m a t i o n . Therefore, by choosing a sequence ht, h2, 9 9 9 of functions which are regular at x and constructing the c o r r e s p o n d i n g d.s.l, functions ~o~, go2,. 9 9 one can generate a decreasing sequence 6r = min{tp~, i 9 1: k}, k = 1, 2 . . . . , o f u p p e r d.s.1, approximations. We should mention that a m e t h o d of choosing the regular functions in some optimal m a n n e r has still to be found. N o w let us consider a closed set Sc_ E, and a point Xoe S. A vector y is said to be tangent to S at Xo if there exist a sequence {x,} in S, where x, ~ Xo, and a sequence of positive n u m b e r s t,--} 0 + such that ( x , - X o ) / t , tends to y as n ~ oo. It is well known that the set of tangents forms a closed cone called the tangent cone to S at Xo. We denote the tangent cone by r(Xo, S).
Definition 4. An element ~ o f / 2 is said to be quasinormal to S at Xo if ~ e N [ C ] , where C ~ r(Xo, S) is a closed cone of tangents. The set of all q u a s i n o r m a l s to S at xo is d e n o t e d by W(xo, S). G i v e n S and x0 9 S, we are interested in finding a q u a s i n o r m a l c o r r e s p o n d i n g to a possibly larger cone of tangents. Let S be a quasidifferentiable set, i.e., S = { x l g ( x ) ~ 0}, where g is a quasidifferentiable function [7, C h a p t e r II, Section 6]. Furthermore, let g(xo) --0 and suppose that the d.s.1, function s~,Cxo,(" ) - s-~,~xo~( 9 )
is nondegenerate. Then ~g(xo) is a q u a s i n o r m a l c o r r e s p o n d i n g to the tangent cone ~'(Xo, S). E x a m p l e s of quasidifferentiable sets and the c o r r e s p o n d i n g quasinormals can be f o u n d in [7, pp. 224-228].
64
A. Shapiro / Quasidifferential calculus and optimality conditions
It follows almost immediately from the definition that y e r(Xo, S) iff ~'(y)=0, where sr(-) is the lower Dini derivative of ds(') at Xo. Thus, if there is a known upper d.s.l, approximation ~o of ds(') at Xo, then ~ ( 0 ) gives a quasinormal to S at Xo. In particular, if we take ~o to be the generalized derivative of ds(" ) at Xo, we obtain a quasinormal [cgclds(xo), {0}] corresponding to the tangent cone of Clarke [3, p. 11]. Finally, we note that if S is the union of several closed sets $ 1 , . . . , S,,, then r(Xo, S)=I._J~'=I ~'(Xo, Si). Thus, if quasinormals [Ai, Bi]EX(Xo, S~) are given, a quasinormal to S can be constructed in accordance with formula (6).
4. First-order optimality conditions In this section we discuss various optimality conditions involving d.s.l, approximations. Another way of looking at the d.s.I, function ~p given by (3) is to consider it in the form ~o(x) = min{sa§ w(x)l w ~ B}.
(11)
This shows that if ,p(.) is a d.s.l, upper approximation of f, then SA+w(') is a sublinear upper approximation of f for every w ~ B. In this sense an upper d.s.l. approximation may be considered as a family of upper sublinear approximations. Let us consider the unconstrained problem of minimizing a function f(x) over E,. Let Xo be a local minimizer o f f and ~p be an upper d.s.l, approximation o f f at Xo with ~ p ( 0 ) = [_0~o(0),~o(0)] the corresponding element of/~. We know that for every w E ~ ( 0 ) , s_0,~o)+w(") is an upper sublinear approximation o f f . Consequently we deduce from the well-known optimality condition for sublinear approximations that (cf. [7, p. 215])
o~o~(O)+w, Vw~ ~(0).
(12)
(12) may be considered as a family of necessary conditions. An equivalent formulation of (12) is - 0 ~ ( 0 ) c_ _0f(0),
(13)
i.e., a set inclusion. Condition (13) is an obvious extension of the corresponding necessary condition for quasidifferentiable functions due to Polyakova [6] (see also [7, Theorem 5.1]). Now let us consider the problem (P)
minimize
f(x)
subject to
gi(x)<~0, i = 1 , . . . , k,
F(x)=O and x e S , where F(x) = (fl(x), 9 99,fro(x)), f b . . - ,f,,, gl,. -., gk and f are real-valued, locally
A. Shapiro / Quasidifferential calculus and optimality conditions
65
Lipschitz functions on En and S is a closed subset of E,. In what follows we assume that the functions f ~ , . . . ,fro are quasidifferentiable and Xo is a feasible point of problem (P) such that the set
I(xo) ={i[g,(xo)=0, i c 1: k} is nonempty. Moreover, it will be assumed that the point Xo is regular for F in the sense that the directional derivative F'~(. ) is continuously differentiable at y and the corresponding Jacobian matrix is of full rank m whenever F'(y) = 0 and y # 0 (see 118, Definition 3.1]). It is simple to verify the following assertion (e.g., [11]): If Xo is a local solution of problem (P), then Xo is also a local solution of the problem (Q)
minimize
h(x)
subject to
F(x)=O and x e S ,
where the function h(x) is defined by
h(x) = m a x { f ( x ) - f ( x o ) ; g,(x), i e l(xo)}. Let r/ be an upper d.s.l, approximation of h at Xo and ~ : ( 0 ) e N(Xo, S) be a quasinormal corresponding to a d.s.l, function ~:. We have that if Xo is a local solution of problem (Q), then the solution of the linearized (or, to be more precise, quasilinearised) problem is zero (cf. [18, Theorem 2.2]): (L)
minimize
rl(y )
subject to
F'o(y )=0
and
~:(y)<~0.
Applying the necessary conditions proposed in [18, Theorem 4.2], to problem (L) we obtain the following ~:esult. Theorem 3. If xo is a local solution of problem (P) and Xo is regular for F, then for sufficiently large r > 0, -B
- r-alllF(xo)[t[ =- A + ra_lllf(xo)ll I ,
(14)
where B = ~n (0) + ~ ( 0 ) , and
II1"III is a
A = co{_a,0(0) - 0~:(0), _a~:(O)- an (0)}
norm on E~.
Note that if ~o and ~'i, ir I(xo), are upper d.s.I, approximations of f and gi, respectively, then r/(y) = max{~0(y); ~',(y), i c l(Xo)}
(15)
is an upper d.s.l, approximation of h and the corresponding quasiditterential ~r/(0) can be calculated from (5).
A. Shapiro / Quasidifferential calculus and optimality conditions
66
The necessary conditions of Theorem 3 (for r/given by (15)) mean that ~o(y)I> 0 whenever ~(y), i ~ l(xo), and ~(y) are less than zero and F ~ ( y ) = 0. For the sake of simplicity suppose now that F is continuously ditterentiable. Since, for every v~a~0(0), w~asr~(0), i~l(Xo), and u~a~:(0) the support functions of the sets a~o(0)+ v, a~'~(0)+ w~ and _~(0)+ u yield the corresponding upper approximations, the usual optimality conditions in convex analysis imply that (14) is equivalent to the following:
For every v~-a~o(O), w~e-a~(O), i~ I(xo), ue'a~(O) there exist numbers Izj, j e 1: m, and non negative numbers A~, i ~ l(xo), A and ct, not all of which are zero, such that
0eA(_0~o(0)+v)+ ~ i~l(x o)
A,(O_~,(O)+w,)+~,lxjVfj(xo)+a(~_~(O)+u ). j=l
(16)
If ~(.)/> 0, then the set a~:(0)+ u contains zero for every u ~ a~:(0) and hence condition (16) holds trivially. An obvious improvement would be to replace (16) by 0eA(_a~0(0)+v)+
y.
A,(Osr,(0)+w,)+ ~. ~tjVfj(xo)+cone{~_~(O)+u}
iel(xo)
j=l
(17) and to require that at least one of the numbers A~, tzj and )t be nonzero. It is appropriate to describe optimality conditions of form (17) as Fritz John-type necessary conditions. To ensure that for all v, w~ and u there exist multipliers satisfying (17) with positive A it is necessary to have some sort of constraint qualification. A straightforward constaint qualification of this type would make it possible to pass from constraints ~'~(y)< 0 to ~(y) <~O, i ~ l(Xo), by the operation of topological closure. In other words, we should require that cl{yl ~',(y) < 0, i e I(xo)} = {Y I~',(Y) <~0, i e/(Xo)}, which is equivalent to the condition that the d.s.l, function ~max = max {sr~, i c l(xo)} is nondegenerate. This constraint qualification was proposed in [6], where it was called a nondegeneracy condition (see also [7, Chapter II]). Under the nondegeneracy condition we have A ~ 0 and can take a = 1. Then for the problem with inequality constraints only, the necessary conditions take the form proposed in [6] (see also [7, Theorem 7.1]):
(a_~o(O)+v)n[-cone{~cUo)(O,,(O)+w~)}] ~ 0
(18)
for all v c aq~(0) and w~~. a~'~(0), i ~ l(xo). Condition (18) is equivalent to the following (cf. [7, Theorem 7.2]): -a~(0)c
N wi~af;,(O) ic l(xo)
[_~(0)+cone{
U i~ l(Xo)
(_~r
(19)
A. Shapiro / Quasidifferential calculus and optimality conditions
67
We say that the inequality constrained problem is normal if for every solution xo and for all upper d.s.l, approximations ~o, ~'~ it admits necessary conditions of the form (19). Now let us consider the following perturbed problem minimize
f(x)
subject to
g~(x)<~s,, i = l , . . . , k ,
where s = ( s l , . . . , Sk)~ Ek. Slight modification of the results presented by Clarke in [2] shows that for almost every s the perturbed problem (P,) is normal. Let O(s) denote the optimal value of program (P~). Following Clarke [2], we say that problem (P~) is calm if O(s) is finite and lim inf [ 0(s') - O(s)]/IIs's'~$
Theorem 4. / f (P,)
sll >
-~.
is calm, then (P,) is normal
Proof. We have that if (Ps) is calm and Xo is a solution to (P,), then for some/3 > 0, x0 minimizes the function
e(x)=f(x)+/3 ~ max{O,g,(x)-s,}, ir
where I = {i1 g~(xo)= s~, i~ 1: k}, in a neighborhood of Xo (see the proof of Theorem 2 in [2]). Let ~ and ~ be upper d.s.I, approximations at xo o f f and g~, respectively. It follows that for every v ~ 0~p(0) and w, ~ 0~'~(0), i a 1, the support function of the set
(_09(0) + v) +/3 E co{{O}, _o~r,(o)+ w,}
(20)
i~I
is an upper approximation of e(-) at Xo. Therefore zero is a member of the set defined by (20) and consequently there exist non-negative multipliers A~, i e I, such that
oe _~(o)+ v+ Z a,(_or
w,).
icl
This implies (18) or, equivalently, (19). It is known that if O(s) is finite in a neighborhood of zero, then (P~) is calm and consequently normal for almost every s in this neighborhood [2, Theorem 3]. Finally, we should mention that it is possible to find steepest-descent directions in the unconstrained case and descent directions in the inequality-constrained case (see [7, 9]). We hope that this will lead to the development of numerical algorithms for optimization problems involving quasidifferentiable functions.
68
A. Shapiro / Quasidifferential calculus and optimality conditions
References [ 1] F.H. Clarke, "Generalized gradients and applications", Transactions of the American Mathematical Society 205 (1975) 247-262. [2] F.H. Clarke, "A new approach to Lagrange multipliers", Mathematics of Operations Research 1 (1976) 165-174. [3] F.H. Clarke, Optimization and nonsmooth analysis (Wiley & Sons, New York, 1983). [4] V.F. Demyanov and V.N. Malozemov, Introduction to minimax (Wiley & Sons, New York, 1974). [5] V.F. Demyanov and A.M. Rubinov, "'On quasidifferentiable functionals", Soviet Mathematics Doklady 21 (1980) 14-17. [6] V.F. Demyanov and L.N. Polyakova, "'Minimization of a quasidifferentiable function on a quasidifferentiable set", USSR Computational Mathematics and Mathematical Physics 20 (1980) 34-43. [7] V.F. Demyanov and L.V. Vasiliev, Nondifferentiable optimization (Nauka, Moscow, 1981) (in Russian). [8] V.F. Demyanov and A.M. Rubinov, "On some approaches to the nonsmooth optimization problem" (in Russian), Ekonomika i Matematicheskie Metody 17 (1981) 1153-1174. [9] V.F. Demyanov, "Quasidifferentiable functions: necessary conditions and descent directions", Working paper WP-83-64, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983). [10] P. Hartman, "On functions representable as a difference of convex functions", Pacific Journal of Mathematics 9 (1959) 707-713. [1 !] A.D. Ioffe, "Necessary and sufficient conditions for a local minimum. 1: A reduction theorem and first-order conditions", S I A M Journal on Control and Optimization 17 (1979) 245-250. [12] L.N. Polyakova, "Necessary conditions for an extremum of quasidifferentiable functions", Vestnik Leningrad University Mathematics 13 (1981) 241-247. [13] B.N. Pshenichnyi, Necessary conditions for extremum problems (Marcel Dekker, New York, 1971). [14] H. Radstrrm, "An embedding theorem for spaces of convex sets", Proceedings of the American Mathematical Society 3 (1952) 165-169. [15] A.W. Roberts and D.E. Varberg, Convex functions (Academic Press, New York, 1973). [16] R.T. Rockafellar, Convex analysis (Princeton University Press, Princeton, N J, 1970). [17] R.T. R•ckaf•••ar•``Fav•rab•ec•asses•fLipschitz-c•ntinu•usfuncti•nsinsubgradient•ptimizati•n••• in: E. Nurminski, ed., Progress in nondifferentiable optimization (International Institute for Applied Systems Analysis (Laxenburg, Austria, 1982). [ 18] A. Shapiro, "On optimality conditions in quasidifferentiable optimization", S I A M Journal on Control and Optimization 22 (1984) 610-617. [19] Y. Yomdin, "On functions representable as a supremum of a family of smooth functions", SIAM Journal on Mathematical Analysis 14 (1983) 239-246.
Mathematical Programming Study 29 (1986) 69-73 North-Holland
ON MINIMIZING A CONCAVE
THE
SUM
OF A CONVEX
FUNCTION
AND
FUNCTION
L.N. P O L Y A K O V A
Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR Received 27 December 1983 Revised manuscript received 24 March 1984
We consider here the problem of minimizing a particular subclass of quasiditterentiable functions: those which may be represented as the sum of a convex function and a concave function. It is shown that in an n-dimensional space this problem is equivalent to the problem of minimizing a concave function on a convex set. A successive approximations method is suggested; this makes use of some of the principles of e-steepest-descent-type approaches.
Key words: Quasidifferentiable Functions, Convex Functions, Concave Functions, e-SteepestDescent Methods.
1. Introduction The p r o b l e m of m i n i m i z i n g n o n c o n v e x n o n d i f f e r e n t i a b l e f u n c t i o n s poses a considerable challenge to specialists in m a t h e m a t i c a l p r o g r a m m i n g . Most o f the difficulties arise from the fact that there may be several directions o f steepest descent. To solve this p r o b l e m requires both a new t e c h n i q u e a n d a new a p p r o a c h . I n this p a p e r we discuss a special subclass of n o n d i f f e r e n t i a b l e f u n c t i o n s : those which can be represented in the form
f(x) = f , ( x ) +f2(x), where fl is a finite f u n c t i o n which is convex o n E . a n d f2 is a finite f u n c t i o n which is concave o n E.. T h e n f is c o n t i n u o u s a n d q u a s i d i t t e r e n t i a b l e on En, with a q u a s i d i t t e r e n t i a l at x c E . which may be t a k e n to be the pair o f sets
D f ( x ) = [Of(x), 0 f ( x ) l , where
~_f(x) = ofl(x) = {v ~ E. Ifl(z) - f l ( x ) >t (v, z - x ) V z ~ E.},
~ f ( x ) = af2(x) = {w ~ t,:, If2(z) - A ( x ) ~ ( w, z - x ) V z ~ En}. In other words, a_f(x) is the subdif[erential of the convex f u n c t i o n ]'1 at x ~ E . (as defined in convex analysis) a n d a f ( x ) is the s u p e r d i t t e r e n t i a l o f the c o n c a v e f u n c t i o n
f2atx~E.. Translated from Russian at the International Institute for Applied Systems Analysis (IIASA), A-2361 Laxenburg, Austria. 69
70
L.N. Polyakova / The sum of a convex function and a concave function Consider the p r o b l e m of calculating inf f ( x ) .
(1)
x(: E n
Quasiditterential calculus shows that for x* e E. to be a m i n i m u m point o f f on E. it is necessary that -af(x*) c af(x*).
(2)
We shall now show that the p r o b l e m of m i n i m i z i n g f on the space E, can be reduced to that of minimizing a concave function on a convex set. Let /2 denote the epigraph of the convex function f~, i.e., $2 = epi f~ = {z = [x, IX] e E, x E~ ] h(z) =-f~(x) - IX <~0}, and define the following function on E, x El: tl,(z)=f2(x)+ix, Set z 9 E, af2(x) Let
z:[x, ix]eE,•
f2 is closed and convex and function ~ is quasidifferentiable at any point • El. Take as its quasidifferentiai at z = [x, Ix] the pair of sets D ~ , ( z ) = [{0}, x{1}], where 0 9 E,+t. us now consider the p r o b l e m of finding
inf 6(z).
(3)
zCO
It is well-known (see, e.g., [3]) that if a concave function achieves its infimai value on a convex set, this value is achieved on the b o u n d a r y of the set. Theorem 1. For a point x* to be a solution o f problem (1), it is both necessary and sufficient that point [x*, IX*] be a solution to problem (3), where Ix* = f l ( x ) . Proof. Necessity. Let x* be a'solution of p r o b l e m (1). Then
ix+f2(x)>~f,(x)+f2(x)>~f,(x*)+f2(x* )
VIx~>f,(x), V x 9
(4)
But (4) implies that q,(z) >~f,(x*) +f2(x*) = f2(x*) + Ix* where Ix* = f , ( x * ) . Thus there exists a z* = [ x * , / x * ] 9
O(z)/> ~(z*)
Vz 9 O.
such that
(5)
This proves that the condition is necessary. Sufficiency. That the condition is also sufficient can be p r o v e d in an analogous way by arguing b a c k w a r d s from inequality (5). Remark. An analogous result was obtained in [4].
L. IV. Polyakova / The s u m o v a convex f u n c t i o n a n d a concave f u n c t i o n
71
2. A numerical algorithm Set e I> 0. A point Xo 9 E, is called an e-inf-stationary point of the function f on E, if
--df(xo) ~ O_~f(xo),
(6)
where _~.(Xo) = a.L(Xo) = {v e E. IZ(z)
-A(Xo)
>I ( v , , z - Xo) - ~
Vx ~
E.},
i.e., a_,f(xo) is the e-subdifferential of the convex function f, at Xo. Fix g 9 E, and set
O~f(xo)
max ( r , g ) +
min (w,g).
(7)
Theorem 2. For a point Xo to be an e-inf-stationary point of the function f on E,, it is both necessary and sufficient that o~f(xo) min - - / > Og
0.
(8)
[[gl~ :1
Proof. Necessity. Let Xo be an e-inf-stationary point o f f on E,. Then from (6) it follows that
Oc W+~_~f(Xo) VWE'Of(Xo). Hence rain
max
(z,g)~>0
VweOf(xo),
Ilgll= 1 zew+Oef(x 0)
and thus for every g e E,, IIg[I = 1, we have min
max (z, g)/> 0.
wESf(xo) ULi2ff(Xo)
However, this means that min ilgll=l
a,f(xo)
I> 0
(9)
c~g
proving that the condition is necessary. That it is also sufficient can be d e m o n s t r a t e d in an a n a l o g o u s way, arguing backwards from the inequality (9). Note that since the m a p p i n g _0~f: E, •
+co]--, 2E-
is H a u s d o r t t - c o n t i n u o u s if e > 0 (see, e.g., [1]), then the following t h e o r e m holds.
Theorem 3. I f e > 0 then the function max~o,y(x)(v , g) is continuous in x on E. for any fixed g c E,.
L.N. Polyakova / The sum of a convex function and a concave function
72
Assume that Xo is not an e-inf-stationary point. Then we can describe the vector
O,.f(xo) g,(xo) = arg min - ~g
Ilgll= 1
as a direction o f e-steepest-descent o f function f at point Xo. It is not difficult to show that the direction
/' Vo.+Wo)
=
+
ll
'
where vo~ e O_~f(xo), woe-~f(xo) and -
max w (: ~f(xo)
min IIv+wll---IlVo~+woll--a~(xo), veO,:f(Xo)
is a direction o f e-steepest-descent of function f at point Xo. N o w let us consider the following m e t h o d o f successive approximations. Fix e > 0 and choose an arbitrary initial a p p r o x i m a t i o n Xor E~. Suppose that the Lebesque set
D(xo) = {Xo ~ E, If(x) <~f(xo)) is b o u n d e d . Assume that a point xk c E, has already been found, lf---~f(Xk) ~ o_~f(Xk), then Xk is an e-inf-stationary point o f f on E. ; if not, take
ak=argminf(xk+ag~k),
Xk+l =Xkq-akgek,
a~>O
where g~k----gE(Xk) is an e-steepest-descent direction o f f at Xk. Theorem 4. The following relation holds: lim a~(Xk) = O.
k~of~,
Proof. We shall prove the theorem by contradiction. Assume that a subsequence {Xk~} o f sequence {Xk} and a n u m b e r a > 0 exist such that
a~(xk,)<~-a
Vs.
(The required subsequ~nce must exist since D(xo) is compact.) Without loss of generality, we can assume that Xk, -~ X* (clearly, x* ~ D(xo)). Then
f(Xk + otg~k,) = f ( x k ~ ) + fO' (of'(Xk~+ 7"g~k~)) dT"+ a(cgf2(Xk')~ ~
\
cggrk~
/
\
+O(a'
g*'k~)'
cgg~k~ f
where O(Ot,
The term o ( a ,
gF.k~.)
~ O.
gek~) appears
in the above equation due to the concavity off2. The
L.N. Polyakova / The sum o v a convex function and a concave function
73
fact that function f2 is concave implies that O(a,g~k,)~
Vet>O,
Vg~k t e n ,
and therefore
f(Xk+ag~k,)~f(Xk,)+
max
(v,g~k~)dr+c~
t'eafl(x~.3+rgak ~)
min
(w,g~k,).
weaf2(xk )
Since C~jl(X)DOJ~(x) for every x e E., we have
(v,g.k~) >-
max V(-'tl~fl(Xk §
x)
max (v,g.k,), t~cafl(xk ~-rg, k3)
and thus
f(xk +ag~k~)~f(xk,)+
max
(v,g~k,)dr+a
vea~fl(xk, ~-'rg~k )
min
(w,g,-k,).
weOf2(xks)
Since the mapping a~f~ is Hausdorff-continuous at the point x*, there exists a (5 > 0 such that
O~fl(X ) c O.f,(y) +2 S,(O)
Vx, y ~ Sa(x*),
where S.(z) = {x ~ E. i llx - zil <~r}. Also, there exists a number K > 0 such that
Xk, 9 S~/2(x*)
Vk.
>
K,
and hence
f(Xk +ag~k~)~f(Xk~)+a
(a)
a~(Xk~)+~
Va 9
0,
, Vk~> K.
Therefore
f(Xk,§
= m i n f ( x k +ag, k,)~ s Xk,+-~g~k, <~f(Xk.,)--~.
(10)
Inequality (10) contradicts the fact that sequence {f(Xk)} is bounded, thus proving the theorem.
References [1] E.A. Nurminski, "'On the continuity of e-subgradient mappings", Cybernetics 5 (1977) 148-149. [2] L.N. Polyakova, "Necessary conditions for an extremum of a quasiditterentiable function", Vestnik Leningradskogo Universiteta 13 (1980) 57-62 (translated in Vesmik Leningrad University Mathematics 13 (1981) 241-247). [3] R.T. Rockafellar, Convex analysis (Princeton University Press, Princeton, New Jersey, 1970). [4] H. Tuy, "Global minimization of a difference of two convex functions", in: G. Hammer and D. Pallaschke, eds. Selected topics in Operations Research and Mathematical Economics, Proceedings of the VIII Symposium of Operations Research, Lecture notes in economics and mathematical systems, v. 226 (Springer Verlag, Berlin, 1984) pp. 98= 118.
Mathematical Programming Study 29 (1986) 74-84 North-Holland
AN ALGORITHM FOR MINIMIZING A CERTAIN CLASS OF QUASIDIFFERENTIABLE FUNCTIONS V.F. D E M Y A N O V Applied Mathematics Department, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR, and International Institute for Applied Systems Analysis, A-2361 Laxenburg, Austria
S. G A M I D O V a n d T.I. S I V E L I N A Applied Mathematics Department, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR Received 15 November 1983 Revised manuscript received 15 October 1984 We consider the problem of minimizing a function which is a smooth composition of max-type functions. We treat this function as quasidifferentiable. A very important property of the algorithm suggested is that at each step it is necessary to consider more than one descent direction (this is due to the quasiditterentiability of the function). Key words: Maximum Function, Composite Quasidifferentiable Function, Steepest Descent Direction.
1. Introduction
O n e interesting a n d i m p o r t a n t class o f n o n d i f f e r e n t i a b l e f u n c t i o n s is t h a t p r o d u c e d by s m o o t h c o m p o s i t i o n s o f m a x - t y p e functions. Such f u n c t i o n s are o f practical value a n d have been s t u d i e d extensively by several r e s e a r c h e r s [7, 1, 5], W e treat t h e m as q u a s i d i t t e r e n t i a b l e functions a n d a n a l y z e t h e m using q u a s i d i t t e r e n t i a l calculus. O n e special s u b g r o u p o f this class o f f u n c t i o n s ( n a m e l y , the sum o f a m a x - t y p e f u n c t i o n a n d a m i n - t y p e function) has b e e n s t u d i e d b y T.I. Sivelina [9]. The m a i n feature o f the a i g o d t l l m d e s c r i b e d in the p r e s e n t p a p e r is t h a t at e a c h step it is n e c e s s a r y to c o n s i d e r a b u n d l e o f a u x i l i a r y d i r e c t i o n s a n d points, o f w h i c h only one can be c h o s e n for the next step. This r e q u i r e m e n t seems to arise from the intrinsic n a t u r e o f n o n d i t t e r e n t i a b l e functions.
2. The unconstrained case
Let
(1)
f ( x ) = F ( x , y l ( x ) , 99 9 y m ( x ) ) 74
V.F. Demyanov et al. / An algorithm for minimizing quasidifferentiable functions
75
where
x e En,
y~(x) =- max ~bu(x), j~- I~
li =--1 : Ni
and functions F(x, y , . . . , y,,) and ~bu(x) are continuously ditterentiable on E, ~,, and E,, respectively. Take any g c En. Then for a/> 0 we have
Oy,(x)
y~(x+ag)=y~(x)+ot
~g
+o~(a, g)
where
Oyi(x)
Og
lira y~(x+ag)-y~(x)
a~+o
a
Ri(x) = {j ~ Ii [ &i~(x) = yi(x)},
r
max (~blj(x), g),
j~R,(~) O,(a,g)
=Ock~(x)
Ox '
(2)
)0.
This leads to + o ( a , g) f ( x + ag)= f ( x ) + a [ ( O F ( ~ x ) ) , g) + ~, cgF(y(x)) Or i~ l cgyi Og _1
(3)
where
y(x) =- (x, y,(x), . . . , ym(x) ),
l--=l:m, and
o(,~, g) a
ct~+O
,0.
(4)
It is clear that convergence in (2) and (4) is uniform with respect to g ~ $1 -= {g c En I Ilgll = 1}. Let
I+(x)={iellOF(y(x))>O},
I (x)={iEl OF(y(x))
cgyi
c~yi
Then from (3) we have
ie l§
+
Je R , ( x ) \
E min(OF(y(x))cblj(x),g)]+o(a,g) i~ t_(x~J~ R,Cx)\ Oyi
It follows from (5) that f is quasiditierentiable and
Df(x) = [~f(x), ~f(x)]
Oy i
9
(5)
V.F. Demyanovet al. / An algorithmfor minimizingquasidifferentiablefunctions
76 where
af(x)
= co
A(x)=
A(x),
-af(x)
= co
B(x),
Y. v~E. v OF(y(x)) Ox t- i~+~x)
B ( x ) = { w ~ E . Iw=
~ ir
Oy~
4)b(x),je
R,(x) ,
aF(y(x))qb~(x),j~Ri(x)}. Oyi
(x)
Recall [4, 3] that a necessary condition for x* e E, to be a m i n i m u m point of a quasidifferentiable function f on E, is
--af(x*) c af(x*). A point x* satisfying this inclusion is called an inf-stationary point o f f on E,. For x* c E, to be a local m i n i m u m point of f it is sufficient that'
-'af(x*)
af(x*).
cint
The following l e m m a s can be derived from the a b o v e necessary and sufficient conditions: Lemma 1.
For any set of coefficients
{Aoliel-(x*),jcR,(x*),Aq>~O,
~.
A/t----l }
jsRi(x*)
there exists another set of coefficients {AijliEI+(x*),jER,(x*),Aij>~O,
~
Aij = 1 }
j ~ R~(x*)
such that aF(y(x*)) OXi
+ ~
aF(y(x*)) Oyi
icl
Y.
je Ri(x*)
(If aF(y(x*))/ay, = 0 put Ao = 0 Vj c
Aoqb~(x* ) = 0 .
(6)
R,(x*).)
C o n d i t i o n (6) is a multipliers rule - note the difference between it and the Lagrange multipliers rule for m a t h e m a t i c a l p r o g r a m m i n g . It follows from (6) that x* is a stationary point of the s m o o t h function
Fx(x)= F(x,
Y~ jC Rl(x*)
alj~bu(x),... ,
~
A,,jdp,.j(x)),
teRm(x*)
t For an arbitrary quasidifferentiable function this condition is sufficient for a minimum only with certain additional assumptions (see [2]). However, this condition is sufficient for functions described by(l).
V.F. Demyanov et al. / An algorithm for minimizing quasidifferentiable functions
77
and if ~f(x*) consists of more than one point then the set {;tis} is not unique. (Of course, it may not be unique even if-~f(x*) is a singleton.) Lemma 2. I f for any w ~ B(x*) there exist sets
{EeA(x*)liel:(n+l)}
and
~i a~>O, ~ a ~ = l i=l
such that the vectors {v~}form a simplex (i.e., vectors { v i - v.§ i e 1 : n} are linearly independent) and w=~__+~ a,v,, then x* is a local minimum point o f f on En. We shall now introduce the following sets, where e >I 0,/.~ t> 0:
Ri~(x) = {j e I,l~o(x) >1yi(x) - e}, O_,f(x)=co{vEEn]v
OF(y(x))F t~X
B,(x)={wcE,
Iw=
~ ict (x)
~
cgF(y(x))~b~(x),jeRi,(x)),
iel,(x)
c3yi
OF(y(x))~,j(x),jERi~(x)}" c3yi
Let f be defined by (1). A point x * s E. will be called an e-inf-stationary point o f f on E~ if
--Of(x*) c O_~f(x*). We shall now describe an algorithm for finding an e-inf-stationary point, with e > 0 and ~z > 0 fixed. Choose an arbitrary xo c E~. Suppose that xk has been found. If
---Of(Xk) c O_,f(Xk)
(7)
then Xk is an e-inf-stationary point and the process terminates. If, on the other hand, (7) is not satisfied then for every we B~,(Xk) we find min
vc #,f(x~)
IIw+vll = IIw+v~(w)ll.
If w+ Vk( W) ~ 0 then let gk ( W) = --( W + v~ ( w ) ) / ll w + v~(w)ll and compute
min f(xk + agk( w) ) = f(Xk + ak( W)gk( W) ). c~~O If w + vk(w) = 0 then take ak(W)gk(W) = 0. Next find min
f(Xk +C~k(w)gk(w))=f(xk +ak(Wk)gk(Wk)).
wca~,(xk)"
We then set
Xk§ = Xk + ak(Wk)gk( Wk).
(8)
78
V.F. Demyanoo et al. / An algorithm for minimizing quasidifferentiable functions
It is clear that (9)
f ( x k + , ) <~f(xk).
I f f(xk ~1) = f(Xk) then the procedure terminates. By repeating this procedure we obtain a sequence of points {Xk}. If it is a finite sequence (i.e., consists of a finite number of points) then its final element is an e-inf-stationary point by construction. Otherwise the following result holds. Theorem 1. I f the set D(xo) = {x c E. If(x) <~f(Xo)} is bounded then any limit point o f the sequence {xk} is an e-inf-stationary point o f f on E,.
Proof. The existence of limit points follows from the boundedness of D(xo). Let x* be a limit point of {Xk}, i.e., x* ----limk,~oo Xkc It is clear that x* ~ D(xo).
Assume that x* is not an e-inf stationary point. Then there exists a w * e Bo(x*) such that min
v~-#,f(x*)
IIw*+vll=a>o.
(lo)
We shall denote by w* the point in B~,(Xk) which is nearest to w* and by p(w*) the distance of w* from w*. It is obvious that
p(wk*)
k~c~
,0.
It may also be seen that the mapping ~_~f(x) is upper-semicontinuous. From (10) and the above statements it follows that there exists a K < oo such that
~r
min
IIw*. §
.
.
+v~,(w*~,)ll=a~~ .
a
V k ~ > K.
(11)
Now we have f ( x k , + agE.,) ; J ' ( x * + (xk, - x* + C~gk,)) = f(x*) 4
a/(x*) B[Xk, -- X* + Otgk,]
~-or IIx~, - x* + ~g~, II)
where
w* +%(w*,) gk, =- gk.,(w*,) = -- IIw~*+ v~,(w~*)I~'
c3f(x*) c3[xk - x* + a g j
(cgF(y(x*)) ) k -~x , Xk, -- X* + agk,
r 12)
V. t( Demyanov et al. / An algorithm for minimizing quasidifferentiable functions
,., -1- L . icl§
+
~,
L .
79
[OF(y(x*)) . , , , , "~ max. I 7-cpo[ x ), Xk -- X* + Otgk,]
)Je~i~ x ) \
.
oyi
[OF(y(x*)) . , , , ,
mm. |
iel..(x )J~-Ri(x ) \
m oyi
'~
q)ij[X ) , X k - - X * - l - O ~ g k , ] .
(13) Since max a~ + m i n b~ <~ max[a~ + b,] ~< max ai + m a x b~, i~I
i~l
i~-I
i~l
icl
min a~ + min b~ <~ m i n [ a i + b,] ~< min a~ + m a x b. iE [
i~- I
i~: l
i~ l
i(: l
it follows from (13) that
of(x*) O[xg, - x* + Otgk,] - - a [ ( O F ( y ( x * ) ) gk~)+ ~ max (OF(y(x*)) ~j(X,),gk.,) L\ Ox ' ~ 1+(x*)J~R,C~*)\ Oy~ +
: '~
Y. min ( O F ( y ( x * ) ) ~ : j ( X , ) , g k , ) ] + ~ f l i ( O e , let (x*)jeRi(x*)\ by, i~t
Of(x*)
Xk_X, )
(14)
+ Z t3,(,~, x ~ , - x * ) ,
Ogk,
i~:l
where
fli(a, Xk --X*)E[ "
min
(aF(y(x*)) + OF(y(x*))
LJenAx*~\
c)X
Oy~
\ 4,~Ax*), x~ - x*}, /
max (OF(y(x*)) ~ OF(y(x*)) dpli(x*),Xk _ X * ) ] . j~n,(~*)\ Ox ayi It is clear that
p,(a,x~ -x*).~ I~oo ,0 uniformly with respect to a. Recall that
Of(x*)_ Ogks
max (V, gk~)+ vcaf(x*)
rain (w, gk). wc~,f(x*)
Geometrically of(x*)/agk~ is the difference o f the maximal projection o f the set Of(x*) on the line {z = Agk, [A/> 0} and the maximal projection o f the set --~f(x*) on the same line.
8O
V.F. D e m y a n o v et al. / A n algorithm f o r minimizing quasidifferentiable functions
But
w* + v~,( w*? gk, = -iiW,k~ § vk~(w, )ll where w* ~ w* e-~f(x*), vk~(w* ) satisfies (11), hence,
p(Vk,(W*),Of(x*))=
min
-
ve~f(x*)
IIvk~(w~%)-vll-,0.
Therefore, for k~ sufficiently large, of(x*)
a
0gk,
4"
From (14) and (12) we conclude that there exist values of a o > 0 and ks such that f ( X k ~ + Otogk,)
But this is impossible since f(xk~+,)=
min
we B~(xk5 )
f(Xk x +ak~(W)gk~(W))
<~f(Xk~ + ak~(W* )gk~(W*)) = minf(xk W otgk (w*)) <~f(Xk~-t- Otogk,) < f ( x * ) . This contradicts (9) and the fact that
f(Xk)
k~cx3
' f(x*).
3. The constrained case
Let us consider the set
= { x e E,,[h(x)~O}~
(15)
where h ( x ) = H ( x , ym+,(x) . . . . , y A x ) ) ,
y,(x)=maxcbo(x), je. Ij
//---l:Ni, i ~ ( m + l ) : p ,
and the functions H(x, y,,+, . . . . , yp) and ~bu(x) are continuously differentiable on E,_,,+p and E,, respectively. Let the function f be of the form (1). The function h is quasidifferentiable and its quasidifferential can be described analogously to that o f f in Section 2. The set ft defined by (15) is called quasidifferentiable.
V.F. Demyanov et al. / An algorithm for minimizing quasidifferentiable functions
81
The problem is to find m i n d , o f ( x ) . As in (3) we have
h(x + ag) = h(x)+ a
[OH(~xX)) +
i,-~.r
aH07(x)) Oyi(x) +-o'(a, g ) ]
by,
- ag
where
o'(,~, g)
l'=(m+l):p;
,0,
;(x) =(x, y ' + , ( x ) , . . . , yAx)).
Let
l'+(x)={iel'-0H07(x)) > 0},
I'_(x)={ieI' OH(y(x))
Oy~
Now we have
h ( x + a g ) = h ( x ) + a [ ( a H ( ~ x ( X ) ) g ) + Y ' ,r
+~
,cmax(aH(y(x))gblj(x)'gl] n.t.,\ ay, ,.1
/aH(;(x)) , '~ ~ min / -c~,j(x),g + o ' ( a , g ) ,~._(x~j,:R,(x)\ ayi J _
where
Ri(x) = {j e]i ] ~b0(x) = yi(x)}. We now introduce the sets
Ri,-(x) = {j 9 lil qb,j(x) >i yi(x) - e},
{
0~h(x)=co v~E,
]v
aH07(x))+ ax
B'u(x)={weE,,lw=
~ it1 '(x)
OH(y(x))4~'J(x),jeRi~(x) } ,
~ i~l~.(x)
~'Yi
cgH(y(x))dp,j(.x),jERi~(x)} Cgyi
where e i> 0,/z/> 0. Several equivalent necessary conditions for a minimum have been obtained [2, 3, 8]. Here we take the necessary condition in the form proposed by A. Shapiro
[81: In order that x * e 12 be a minimum point of a quasidifferentiable function f defined on a quasidifferentiable set/2, it is necessary that
--~f(x*) c Of(x*)
for h(x*) < 0
(16)
- [ 0 f ( x * ) + Oh(x*)] c co{Of(x*) - 0h (x*), Oh(x*) - ~f(x*) } for h(x*) =0.
(17)
82
ILF. Demyanov et al. / An algorithm for minimizing quasidifferentiable functions
Take e 1>0, r~>0. We shall call x * c 12 an (e, r)-inf-stationary point o f f on 12 if
-Of(x*) c O_,f(x*)
for h(x*) < - r ,
-[Of(x*) + ~h(x*)] c co{O~f(x*) - ~,h (x*), O~h(x*) - ~ f ( x * ) } for -r<~ h(x*)<-O. We shall now describe an algorithm for finding an (e, r)-inf-stationary point with e > 0 , / z > 0 and r > 0 fixed. Choose an arbitrary x0c/2. Suppose that Xk 9 12 has been found. If condition (16) or (17) is satisfied at xk then Xk is an (e, r)-inf-stationary point and the process terminates. There are two other possibilities: (a) h ( x k ) < - r , (b) -r<~ h(Xk)<~O. In case (a) we perform one step in the minimization of the function f, using the same algorithm as in Section 2 except that min f ( xk + agk( W) ) ~0
must be replaced by min{f(xg + agk(w))la ~ O, xk + ~gk(w) 9 12} in (8). In case (b) we have to find min{ll w, + w2 + vii Iv 9 CO{O_,f(xk) --5,h(Xk), o~h(Xk) --o,f(xk)} = Ilw, + w~+ v~(w, +
wz)ll
for every wl 9 Bu(Xk) and w2 9 B'~(Xk). Compute m i n f ( x k ( a ) ) = f ( x k ( w ~ , w2)),
h(x~(a))<~O,
(18)
where x k ( a ) = xk - a ( w ,
+ w2+ v~(w, + w2)).
We then find
min{f(xk(wt, w2))l wl c B~(Xk), W2E B~(Xk)} =f(Xk(Wkl, Wk2)). Setting Xk+I ~ - X k ( W k l , Xk.lE12,
Wk2), it is clear that
f(Xk4-1)<~f(Xk).
lff(xk+l) =f(Xk) then the procedure terminates. Repeating this procedure, we construct a sequence of points {Xk}. If it is a finite sequence then the final element is an (e, r)-inf-stationary point o f f on J'2 ; otherwise it can be shown that the following theorem holds.
V.F. Dem.vanov et al. / An algorithm for minimizing quasidifferentiable functions
Theorem 2. I f the set ~(Xo) = {x c O
If(x) <~f(xo)} is b o u n d e d
the s e q u e n c e {Xk} is an (e, r ) - i n f - s t a t i o n a r y
83
then a n y limit p o i n t o f
p o i n t o f f on 1-1.
Proof. T h e o r e m 2 can be proved in the same way as T h e o r e m 1. Remark 1. If the initial point x0 does not belong to 1"2 it is necessary to take a few preliminary steps in the minimization o f function h until a point belonging to 12 is obtained. Remark 2. To find an inf-stationary point (i.e., an (e, r)-inf-stationary point where e = r = 0) it is necessary for e to tend to zero (this can be achieved using the standard mathematical p r o g r a m m i n g techniques). Remark 3. It is possible to extend the p r o p o s e d a p p r o a c h to the case where f(x)
= max F~(x, y , ~ ( x ) . . . . , y , m , ( x ) ) , i~-I
h ( x ) = max H j ( x , z j , ( x ) , . . . , zjm,(x)), jcJ
y , k ( X ) = max ~b,k,(x) IC li~
and the functions F~(x, y , . . . . , yim.), H i ( x , zj~, . . . , Zjmj), r differentiable.
are continuously
Remark 4. Instead o f the one-dimensional minimization p r o p o s e d in (18) it is possible to take x~ ( w , , w2) = xk - Xk(w~ + w2 + vk ( w, + w2) )
where ~x~
Ak
, +0, k~cr
~. Ak = +oo. k
=0
Concluding remarks The algorithm described in this paper is 'a conceptual one". To make it implementable it is necessary to apply specific subroutines for solving auxiliary problems arising here (line search problems like (8), finding the distance between a point and a set, etc.). We did not discuss these problems since they have been widely studied by m a n y scholars in the field o f mathematical programming. Other subproblems are essentially o f quasiditterentiable nature (like the choice o f the parameters e, Iz and r). Our numerical experience is not sufficient to make any strong recommendations. It seems that it is heavily d e p e n d e n t on the class o f problems you are solving.
84
V.F. Demyanov et al. / An algorithm for minimizing quasidifferentiable functions
It is quite possible that a n e - i n f stationary p o i n t is not a minimizer. O u r only hope is that the p r o b l e m is not ill-posed. Nevertheless, by decreasing e we are still able to decrease the value o f the function, a n d it is quite acceptable from practical c o n s i d e r a t i o n s . O f course, this problem requires a special a t t e n t i o n but it seems u n l i k e l y that a n answer exists for a n arbitrary function. We m u s t treat i n d i v i d u a l classes o f f u n c t i o n s a n d get as m u c h i n f o r m a t i o n as possible by s t u d y i n g their specific properties. Practical e x p e r i m e n t s with this m e t h o d are b e i n g carried out in different places (see, e.g., D. Pallaschke [6]) a n d we hope to be able to discuss the practical aspects o f the p r o b l e m elsewhere.
Acknowledgment The a u t h o r s wish to t h a n k c o r d i a l l y , P r o f e s s o r A.M. R u b i n o v for his v a l u a b l e advice a n d H e l e n G a s k i n g for her careful editing.
References [1] A. Ben-Tal and J. Zowe, "Necessary and sufficient optimality conditions for a class of nonsmooth minimization problems", Mathematical Programming 24 (1982) 70-91. [2] V.F. Demyanov,"Quasiditterentiable functions: Necessary conditions and descent directions", Working Paper WP-83-64, International Institute for Applied SystemsAnalysis (Laxenburg, Austria, 1983). [3] V.F. Demyanov and L.N. Polyakova, "Minimization of a quasidifferentiable function on a quasiditterentiable set", USSR Computational Mathematics and Mathematical Physics 20 (4) (1980) 34-43. [4] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable functionals", Soviet Mathematics Doklady 21 (1) (1980) 14-17. [5] R. Fletcher and G.A. Watson, "First and second order conditions for a class of nondifferentiable optimization problems", Mathematical Programming 18 (1980) 291-307. [6] D. Pallaschke, "On numerical experiments with a quasidifferentiable optimization algorithm", in Abstracts of the IIASA workshop on nondifferentiable optimization: Motivations and application (held in Sopron, Hungary, 17-22 September 1984), International Institute for Applied SystemAnalysis (Laxenburg, Austria, 1984) pp. 138-140. [7] G. Papavassilopoulos, "Algorithms for a class of nonditterentiable problems", Journal of Optimization Theory and Applications, 34 (1) (1981) 41-82. [8] A. Shapiro, "On optimality conditions in quasidifferentiable optimization", SIAM Journal on Control and Optimization 22 (4) (.1984) 610-617. [9] T.I. Sivelina, "Minimizing a certain class of quasiditterentiable functions", Vestnik of Leningrad University 7 (1983) 103-105.
Mathematical Programming Study 29 (1986) 85-94 North-Holland
A LINEARIZATION METHOD FOR MINIMIZING CERTAIN QUASIDIFFERENTIABLE FUNCTIONS K r z y s z t o f C. K I W I E L
Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01447 Warsaw, Poland Received 20 March 1984 Revised manuscript received 17 September 1984 An algorithm for minimizing those quasidifferentiable functions that are smooth compositions of max-type functions is given. At each iteration several search directions are found by solving a number of quadratic programming subproblems. Then an Armijo-type search is performed simultaneously along all the search directions to produce the next approximation to a solution. Quasidifferential calculus is used to establish the global convergence of the algorithm to infstationary points.
Key words: Nondifterentiable Optimization, Quasidifferential Calculus, Descent Methods.
1. Introduction We are c o n c e r n e d with methods for minimizing a nondifferentiable, n o n c o n v e x function f : E. -~ El o f the form
f(x)=g(x,
mi~J, a x f j , ( x ) , . . . , m a x fjjm(X) ) ,
(1)
where the functions g: E, • -~ El and f~i : E, --, El are continuously differentiable, and I := { 1 , . . . , m} and Ji, ie I, are n o n e m p t y finite sets o f indices. Such functions arise in m a n y applications (e.g., minimax problems, It and loo a p p r o x i m a t i o n problems, exact penalty methods) and have been studied in a n u m b e r o f papers [2, 3, 4, 6, 7, 8, 11, 13, 16]. Clarke's subdifferential calculus [5] can be used to derive necessary optimality conditions and to design minimization methods for f when the function g(x, Yl, 9 9 9 Ym) is nondecreasing with respect to each y~, i ~ I. This a s s u m p t i o n was essentially used in [2, 3], and the c o r r e s p o n d i n g methods can be f o u n d in [2, 10, 11, 12]. If this assumption fails, e.g., if
f ( x ) = m a x ~ l ( x ) - max fj2(x), J~-JI
J~J2
a more subtle a p p r o a c h is needed. In this case the quasidifterential calculus o f D e m y a n o v and R u b i n o v (see, e.g., [7]) yields sharper optimality conditions and suggests suitable algorithms [6, 16]. We s h o u l d add that some other methods [4, 13] treat the original n o n s m o o t h minimization problem indirectly by solving an infinite sequence o f differentiable problems. 85
K.C. Kiwiel / A linearization method
86
This paper presents an algorithm that is tailored to the structure of (1). At each iteration several search directions are found by solving a number of quadratic programming subproblems. Then an Armijo-type search is performed simultaneously along all the search directions to produce the next approximation to a solution. We should add that the idea of using several search directions at each iteration stems from [6, 16]. However, our search-direction-finding subproblems are natural extensions of those that lead to at least a linear rate of convergence for special cases of (1) (see [9, 15]), while no such results seem to hold for the subproblems of [6, 16] (see [14]). Also our line search procedure is readily implementable, whereas [6] requires exact directional minimizations. The algorithm is 'globally' convergent in the sense that each of its accumulation points is inf-stationary for f (see Section 2 for the definition). We note that the method of [6] converges to only approximately inf-stationary points. In effect, our algorithm seems to be the first readily implementable and globally convergent method for the problem in question. The method is derived and stated in Section 2. Its global convergence is established in Section 3. Finally, we draw some conclusions in Section 4. E, denotes the n-dimensional Euclidean space with the usual inner product (., 9) and associated norm II" II. Superscripts are used to denote different vectors, e.g., x ~ and x 2. All vectors are row vectors. The convex hull of a set S c E, is denoted by co S.
2. Derivation of the method
We start by reviewing the properties of the problem minimize f ( x )
on E,
(2)
(see [2, 3, 6] for details). A necessary condition for a point ~ e E, to solve (2) is (3)
Of(x) >~O V d e E., ad
where
of(x) od
- f ' ( d ) = li~n If(x+ td) - f ( x ) ] / t 1~0
denotes the derivative o f f at x in the direction d. Points ~ satisfying (3) are called inf-stationary for f. For convenience, we shall use the following notation: f~(x) = m a x f j , ( x ) , jaJi
i e I,
F ( x ) = ( f l ( x ) , . . . , fm (x)), f ( x ) = g(x, F ( x ) ) .
K.C. Kiwiel/ A linearization method
87
Vg(x, y) the n - v e c t o r
F o r z = [x, y ] e E , x Em we d e n o t e by
while
3•(x,y) denotes
-
Og
ieL
az~+. (z),
Let a i ( x ) - - ag (x, r
F(x))
Vx e E,,, i e / ,
b(x) = Vg(x, F(x))
Vx.
Then
of(x) -(b(x), d ) + Y. a,(x) )at;(x .~ o(l Od i~t = (b(x), d ) + Y. ai(x) max (Vfi(x), d), ic!
Jc-Ji(x)
"
so that
af(x) =(b(x),d)+ ad +
•
~
max
(ai(x)Vfj,(x),d)
ic I~(x) ) ~ . J d x )
min
(a,(x)Vfj,(x), d),
(4)
ie t _ ( x ) j s
where
J , ( x ) = { j c J , ls
ie I,
I+(x)={i~1la,(x)>O},
l_(x)={i~1la,(x)
a n d s u m m a t i o n over an e m p t y set o f i n d i c e s yields zero. T h e r e f o r e if(d)=
m a x ( v , d ) + " rain
vea_f~x)
w~f(x)
(w,d),
where
Of(x) = co A(x),
~tf(x)= co B(x), (5) ir
B(x):{w[w=~.
ai(x)~fjji(x),jEJi(x)}. i~/_(x)
88
K.C. Kiwiel / A linearization method
In this notation, the dual formulation of the necessary condition for a minimum (3) is - 0 f ( $ ) c a f('2). We shall need the following primal formulation, which follows immediately from the preceding formulae (see (3)-(4)). Lemma 1. A point ,2 9 E, is inf-stationary if and only if (b(,2),d)+
Y.
a,(,2) m a x ( V f j , ( , 2 ) , d ) + ( ~ , d ) ~ 0
V d e E , , f f , eB('2).
The minimization methods of [2, 11] require that ! (x) be empty for all x, i.e., ai(x) >i 0 Vx 9 En, i 9 L In this case the methods only need to handle the discontinuities in af(- )/ad arising from the max terms in (4). To see the essential difficulties of the case when L (x) is nonempty, consider the example
f ( x ) = f l ( x ) - max{0, x)
for x 9 El,
where f~ is a continuously ditterentiable function defined by
f'(x)={oX3
ifx>0.ifx<~0'
A straightforward generalization of the method of steepest descent with Armijo's line searches [1] would be based on the step x k~l = x k + tkd k, for k = 1, 2 , . . . , where
d k = arg min{f'k(d) + 89
1121d9 E,),
tk=max{1/2'lf(xk+-~dk)<~f(xk)+~-zif'k(d~),
i=0, 1,...}.
It is easy to check that for x ~9 (-~, 0) we would have f ' k ( d ) = --3(xk)2d, d k = 3(xk) 2 and x k < xg+~<~ Xk+ 3(xk)2< 0 for all k, so that x k would converge to some x* in ( - .~, 0] (in fact to x* = 0) which is not inf-stationary f o r f ( f ' . ( 1 ) = - 1 ) . This failure results from the discontinuity of af(. )/ad at x * = 0, where -/2(') changes from {1} to {1, 2}, and the fact that the d k tend to zero. The above example shows that the use of f ' ( . ) may yield a feeble (although steepest !) descent direction f o r f at x. A better model o f f around x may be obtained as follows. From (4), for any w 9 B(x) the function f ( . ; x, w, 8) defined by
f ( d ; x, w, 8) = (b(x), d)+
•
i~ l~(x)
ai(x) max [fj,(x)-fi(x)+(Vf~,(x), d ) ] + ( w , d) J~Ji(x,6)
(6) approximates f ' ( d ) from above, where the use of Ji(x, 8) = (j 9 Ji If~,(x) ~ fi(x) - 8} for some fixed 8 > 0 takes into account possible variations in J~(. ) around x. Note that i f f ( d ; x, w, 8) < 0, t h e n f ' ( d ) ~
89
K.C. Kiwiel / A linearization m e t h o d
we B(x)
f o r f at x. Therefore, for each
we shall choose
d(w)
to
minimize f ( d ; x, w, 8)+'211dll 2 over all d e E,,
(7)
where the term IId112/2 ensures that x+ d(w) stays in the region where f ( . ; x, w, 8) is a close approximation t o f ( x + . ) -f(x). Since the objective function (7) isstrongly convex, it has a unique minimizer d(w) which satisfies
d(w)=-[b(x)+
~
a~(x) ~
ic 14 (x)
Ai~(w)Vfj~(x)+w],
(8a)
j c J~(.x,t$)
where
Aj,(w)>~O forjeJi(x, 8),
Y,
Aj~(w)=l
for
iEL(x)
(8b)
jcJi(x,8)
(see, e.g., [9]). Moreover, since
f(d(w);
x, w, 8) + 89 d(w)rl 2 ~
= 0,
we have
f(d(w); x, w, 8) <~-'lld(w)ll 2,
(9)
so that d(w) is a descent direction for f at x if d(w)# O. To take into account possible abrupt changes in B ( - ) around x, which may be caused by variations in Ji(" ) for I e I_(x), ~r shall find more search directions by calculating d(w) for all w in the set
B(x, 8)={wEEnlw=
~
a,(x)Vfj,(x),jeJi(x, 8)}.
(10)
i~l_(x)
As before, we shall set B(x, 8 ) = {0} ifl_(x) is empty. We shall now give the method in detail.
Algorithm 1
Step 0 (Initialization). Select a starting point xlE En, a final accuracy tolerance es i> 0, an activity tolerance 8 > 0 and a line search parameter m > 0. Set k = 1. Step 1 (Direction finding). For each we B(x k, 8), find d(w) from the solution (d(w); l,l i ( w ) , i E l+(xk)) to {he quadratic programming subproblem min d, ul
89
d)+
Y. ~ a,(xk)u,+(w, d),
icl§
)
subject to f~,(x k) - f ~ ( x k) + (Vfj,(xk), d) <~ u,
Step 2 (Stopping criterion).
If
u k -- -max{~lld(w)rr2[ and continue.
for
IId(w)ll <~ef
w e B(x ~, 8)}
j e J,(x k, 8), i e I+(xR).
for all
w e B(xk),
stop. Otherwise, set (! 1)
K.C. Kiwiel / A linearization method
90
Step 3 ( Stepsize selection). (i) S e t t = l . (ii) Find ~ in B ( x k, 3) such that f ( x k + td( ~,)) = m i n { f ( x k + t d ( w ) ) l w ~ B ( x k, 3)}. (iii) If
f ( x k + td( ~,) ) <~f ( x k) + m( t)2u k, set t k= 1, d k = d( ~,), x k+l = xk + tkd k and go to Step 4; otherwise, replace t by t/2 and go to Step 3(ii). Step 4. Increase k by 1 and go to Step 1. A few comments on the algorithm are in order. It may be efficient to compute d ( w ) via (8) by finding Lagrange multipliers A~(w), j e Ji( x k, ?J), i c I~( x k) that solve the dual subproblem
k
subject to Aji >~O, j e j k i , J
;k h j i = l ,
i e l k,
where b k = b(xk), I k ----I§ etc. To check that the algorithm cannot go into an infinite loop at Step 3, observe that Step 3 is always entered with d ( 6 ) # 0 for some ~ e B(xk), sO that f ' k ( d ( ~ ) ) < 0 by (9). Hence t~,0 would lead to f'k(d(~,))~>liminf t~O
min wcB(xk).~)
[f(xk+td(w))--f(xk)]/t>~lim
mtuk=O,
t~.O
producing a contradiction. We may add that if B(X k, t$) is a singleton, e . g . , B(x k, c$) = {0} as considered in [2, 11], then only one search direction is used at the kth iteration. In this case Algorithm 1 may be regarded as a natural extension of well-known and relatively efficient linearization methods [9, 15], and is close to the method of [2].
3. Convergence In this section we shall establish the global convergence of the method. In the absence of convexity, we will content ourselves with finding an inf-stationary point for f. Naturally, we assume that the final accuracy tolerance es is set to zero. We start by analyzing the properties of search directions generated around nonstationary points.
K.C. Kiwiel/ A linearization method
9I
Lemma 2. Suppose that ~ ~ E,, if, ~ B( Y,) and d e E, are such that f ( d ; .~, ~, O)< O. Then there exist ~ > 0 and neighborhoods S(s and S( ff,) of ~ and if,, respectively, such that
f~(d(x, w))<~-g
V[x, w]~ S(2) xS(ff),
(12)
lid(x, w)ll~>g V[x, w]eS(~)xS(Ce),
(13)
where d(x, w) denotes the solution of (7). Proof. By assumption, we have (see (6))
(b(.x'), d)+
~. ai(x.) max (Vf~,(.~), d)+(~, d ) < - e ie
I~(.~)
JcJt(x)
for some e > 0 and d ~ 0. Hence, using the continuity of a~, b , f , fj~ and Vfj~, we may choose bounded S(s and S(ff) such that I+(s c I+(x) and l+.(x)\l+(,2)c {i~ II a,(s =0},
(14a)
tl a,(~z) =0},
(14b)
I_(~)~ L ( x ) and l_(x)\l_(s (b(x), d)+
Y.
{ie
ai(x) max (Vfji(x), d ) + ( w , d ) < ~ - e / 2
ic I+(x)
J~'Ji (~)
for all [x, w ] c S(~) • Next, since f and fj; are continuous, we have J~(~)c Ji(x, 6) for 6 > 0 and f j ; ( x ) - f ( x ) ~ < - ~ for some fixed ~ > 0 i f x is close to ~ and j e J,(x, 6)\J~(.f). We may therefore shrink S(.~) and choose small / > 0 such that max (Vfj~(x), ?d)/> max -~llVf~,(x)ll Ildll i> - ~ / 2
j~Ji( ~ )
jE Ji(.x)
> f,,(x) -f,(x) + (vfj,(x), ~d) for a n y j e J i ( x , 8)\Ji(2), and (b(x), a~)+
~
ic I+(x)
ai(x) max [f~i(x)-f(x)+(Vfi,(x), tt)]+(w, a~)~ <-g J e J d -r',a)
for a~= fd, ~ = ? e / 4 > 0 and all [ x , w ] E S ( ~ ) x S ( ~ ) . Thus f ( J ; x , w , 8)<~-~ and since f ( 9 ; x, w, 8) is conves~ f ( t d ; x, w, 8) ~<-t~ for all t c [0, 1]. Then, since d(x, w) solves (7),
f(d(x, w); x, w, 8)+89
w)ll 2
~< te[O, minIJ {f(ta~; x, w, ,~)+89
-~/211dll ~
and we have
(b(x),d(x, w))+
Y. is l+(x)
ai(x) max [f~,(x)-f(x)+(Vfj,(x),d(x, w))] j ~ Ji(x.~ )
+(w, dfx, w))<<--~/211dll 2
(15)
K.C. Kiwiel / A linearization method
92
for all [x, w]E S ( 2 ) x S ( ~ ) . Since (4) and (5) yield
f " (d(x, w))<~ (b(2), d(x, w)) +
Y. ie
a,(2) m a x ( V f j , ( 2 ) , d ( x , w ) ) + ( ~ , d ( x , w ) ) ,
1+(.~)
(16)
jaJ,(~)
while, by (8), d(x, w) is uniformly bounded for [x, w] in S(2) x S(ff), we may shrink S(~) x S ( f f ) to obtain (12) from (15)-(16) for g = ~/4[dl ~ using the continuity of b, a,, f~, fj~ and Vfj~ together with (14) and the fact that J~(2)c J~(x, 8) for x close to *. Since f ' ( . ) is continuous and f ' ( O ) = O, (12) implies (13) for small g > O. We may now justify the stopping criterion. Lemma 3. If Algorithm 1 terminates at the kth iteration, then X k is inf-stationary for
f. Proof. Suppose, arguing by contradiction, that 2 = x k is nonstationary, but d(2, w) = 0 for all w ~ B ( 2 ) . By Lemma 1, there exist ~ e B ( 2 ) and d c E ~ such that f(aT; 2, ~, 0 ) < 0 , so Lemma 2 yields d(~, ~ ) # 0, thus proving the lemma.
Our principal result is Theorem 1. Every accumulation point of the infinite sequence {x k} generated by
Algorithm 1 is inf-stationary for f. Proof. Suppose that there exist 2 c E, and an infinite set K c {1, 2 , . . . } such that x k --~ 2. Assume, contrary to the assertion of the theorem, that 2 is nonstationary. By Lemmas 1 and 2, there exist ~ e B(2) and g > 0 such that (12) and (13) hold for some S( 2) • S( ff~). Since X k - ~ 2 and 8 > 0 is fixed, it is easy to see from (10) that B(x k, 8) ~ S(w) ~ 0 for large k ~ K. Thus there must exist ~ k E B(X k, 8) and ~lk = d ( x k, w k) such that
f,(~k)<~_g
Ilakll >e
for large k e K,
for large k ~ K .
(17) (18)
Since x k r__,~, (8) and (11) imply the existence of t~ < 0 such that t~ ~< u k <<-0 for all k ~ K ; in particular, {dk}k~r is bounded. Hence Taylor's expansion yields (see [6])
f ( ~ + td k) <~f(2) + tf~ (d k) + o(t, k), where o(t, k)/t-~O as t~0 uniformly with respect to k e K. Therefore, by (17)
f ( x k + td k ) 6 f ( x k ) + f ( x k + td k) - f ( g + td k) + f ( 2 ) - f ( x k ) - gt + o(t, k) for large k E K. Since x k --~ 2, {dk}k~ r is bounded and f is continuous, we may choose, for any s > O and O < i < g, a t ( 1 ) > O such that
f ( x k + td k) --f(2 + td k) + f ( 2 ) - f ( x k ) + o( t, k) < e + it,
K.C. Kiwiel / A linearization method
93
and hence, for ~ = g - g > 0, (19)
f ( x k + td k) ~ f ( x k) + e - ~t
for all t ~ [0, t(g)] and large k ~ K. Let us choose e such that the interval [_t(e), ?(e)] of solutions to the inequality (20)
e - ~t <~ m ( t ) 2 ~
contains 1/2' ~< t(~) for some i > 0. This is possible, since [t(e), T(e)]--~ [0, - ~ / m ~ ] as e~,0. Then, from (19)-(20) and the fact that m~<~ mu g for k e K, t = 1/2' satisfies f(xk+tdk)<~f(xk)+m(t)2u
Hence
tk ~ t
k
for large k e K .
and
f ( x k+l) = f ( x k + tkd k) <~f(x k) + m ( t k ) 2 u k <~f(x k) + m ( t ) 2 u k
(21)
by construction, for all large k c K. Since f ( x k ) ~ f ( ~ ) from the continuity o f f and the fact that x k ~ ,2 and f ( x k ~ 1 ) < ~ f ( x k) for all k, (21) yields u k x_~ O. But - u k >~ lid k112/2 I> g2/2 for large k e K from (18). This contradiction completes the proof. Remark 1. From the preceding results, one may establish the convergence of a version of the method described in [6] in which line searches involving exact minimizations are replaced by the simultaneous Armijo-type search of Algorithm 1. Remark 2. For greater efficiency, one may replace the constant activity tolerance 8 in Algorithm l by a sequence {8 k} such that ~k ~ ~ for some fixed g > 0 and all k (see [9]). Clearly, the preceding convergence results remain valid.
4. Conclusions
We have presented an extension of the linearization method [9, 15] for minimizing smooth compositions of max-type functions. The algorithm seems to be the first readily implementable and globally convergent method for solving the problem in question, since the only other known comparable method [6] requires exact onedimensional minimizations during the line searches and converges to only approximately inf-stationary points. We should add that the method can be extended to constrained quasiditterentiable problems as in [6]. Due to lack of space, we shall report the extensions elsewhere.
References [ 1] L. Armijo, "Minimization of functions having Lipschitz continuous first partial derivatives", Pacific Journal o f Mathematics 16 (1966) 1-3.
94
K.C. Kiwiel / A linearization method
[2] A. Auslender, "Minimisation de fonctions localement Lipschitziennes: Applications h la programmation mi-convexe, mi-differentiable", in: O.L. Mangasarian, R.R. Meyer and S.M. Robinson, eds., Nonlinear programming 3 (Academic Press, New York, 1981) pp. 429-460. [3] A. Ben-Tal and J. Zowe, "Necessary and sufficient optimality conditions for a class of nonsmooth minimization problems", Mathematical Programming 24 (1982) 70-91. [4] D. Bertsekas, "Approximation procedures based on the method of multipliers", Journal of Optimization Theory and Applications 23 (1977) 487-510. [5] F.H. Clarke, Optimization and nonsmooth analysis (Wiley, New York, 1983). [6] V.F. Demyanov, S. Gamidov and T.I. Sivelina, "An algorithm for minimizing a certain class of quasiditterentiable functions", Working Paper WP-83-122, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983). [7] V.F. Demyanov and A.M. Rubinov, "'On quasidifferentiable mappings", Mathematische Opera. tionsforschung und Statistik, Series Optimization 14 (1983) 3-21. [8] R. Fletcher, Practical methods of optimization, Volume II, Constrained optimization (Wiley, New York, 1981). [9] K.C. Kiwiel, "A phase 1-phase II method for inequality constrained minimax problems", Control and Cybernetics 12 (1983) 55-75. [10] K.C. Kiwiel, "A linearization algorithm for nonsmooth minimization", Mathematics of Operations Research (to appear). [11] K.C. Kiwiel, "A quadratic approximation method for minimizing a class of quasidifferentiable functions", Numerische Mathematik 45 (1984) 411-430. [12] R. Mifflin, "A modification and an extension of Lemarechal's algorithm for nonsmooth minimization", Mathematical Programming Study 17 (1982) 77-90. [13] G. Papavassilopoulo~, "Algorithms for a class of nondifferentiable problems", Journal of Optimization Theory and Applications 34 (1981) 31-82. [14] O. Pironneau and E. Polak, "On the rate of convergence of certain methods of centers", Mathematical Programming 2 (1982) 230-257. [15] B.N. Pshenichny, Method oflinearizations (in Russian) (Nauka, Moscow, 1983). [16] T.I. Sivelina, "Minimizing a certain class of quasidifferentiable functions" (in Russian), Vesmik Leningradskogo Universiteta 7 (1983) 103-105.
Mathematical Programming Study 29 (1986) 95-107 North-Holland
A DIRECTIONAL IMPLICIT FUNCTION QUASIDIFFERENTIABLE FUNCTIONS
THEOREM
FOR
V.A. D E M I D O V A Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7//9, Leningrad 199164, USSR
V.F. D E M Y A N O V Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR and International Institute for Applied Systems Analysis, A-2361 Laxenburg, Austria. Received 15 October 1983 Revised manuscript received 15 November 1984
The implicit and inverse function theorems provide an essential c o m p o n e n t of classical differential calculus, and for this reason many attempts have been made to extend these theorems to n o n s m o o t h analysis (see, for example, the work of F. Clarke, H. Halkin, J.-B. Hiriart-Urruty, A.D. loffe, B.H. Pourciau, J. Warga). In this paper, we consider the case of quasiditierentiable functions. It is shown that to obtain nontrivial results it is necessary to study a directional implicit function problem (it turns out that in some directions there are several functions, while in others there are none).
Key words: Implicit Functions, Inverse Functions, Quasidifferential Calculus, Directional Derivatives.
1. Introduction In this paper we consider problems quasidifferential calculus to the implicit differential calculus. Let us first recall some definitions. A set S of Em is called quasidifferentiahle at x and if there exists a "lJair of convex such that, for any g ~ Em, od~(x) c3g
lim
related to the derivation of analogues in and inverse function theorems o f classical function 4' defined and finite on an open at x ~ S if it is directionally differentiable compact sets _~d~(x)c Em and 0 & ( x ) c Em
~[~b(x+ag)-~(x)]
a~+o ~
max
(v,g)+
ve_~4,(x)
min (w,g). wc~S(x)
The pair Dcb(x)= [_0~b(x),0&(x)] is called a quasidifferentialof ~b at x; it is not unique. The properties of quasidifferentiable functions were first investigated in [3, 5, 12]. This work led to the development of quasidifferential calculus, which is a generalization of classical differential calculus (see, e.g., [2, 4, 6]. Some extensions to Banach spaces are discussed in [2] and [4]. 95
V.A. Demidova, V,F. Demyanov / An implicit function theorem
96
The implicit and inverse function theorems of classical differential calculus represent an essential element in the structure of the calculus and have important applications. In the nonsmooth case this problem was discussed, for example, by F. Clarke [1], H. Halkin [7], J.-B. Hiriart-Urruty [8], A.D. Ioffe [9], B.H. Pourciau [13], H. Warga [14]. The problem of deriving analogous theorems in quasidifferential calculus was introduced and briefly examined in [2,6]. In the present paper we continue our study of this problem.
2. An implicit function theorem
Let z=[x,y], xeEm, y 9
and let the functions f ( z )
(/el:n)
be finite
quasidifferentiable on E.,+.. Consider the following system: f ( x , y) = 0
Vi 9 l:n.
This can be rewritten in the form
f(z) =0
(1)
where
f =(f," " " .f.),
Oe E..
The problem is to find a function y(x) such that
f(x,y(x))=O
Vx 9
u
Unfortunately we cannot solve this very general formulation of the problem for an arbitrary quasidifferentiable system of type (1). But what we shall try to do is to solve this problem for a given direction g 9 E,,. We shall call this a directional
implicit function problem. Suppose that Zo= [Xo, Y0] is a solution o f system (1), i.e., f(zo) = 0
Vi 9 l:n.
Consider the system of cqaations
f(xo+ ctg, y(a)) = 0
(2)
where a > 0. Since the functions f are quasidifferentiable, for any q 9 En we have, from (1),
f(xo+ otg, yo + aq) = f ( x o , Yo) + ct O[g, Of(zo) q] + oi(ct, q) =
af(zo) +o,(ot, q)
a O[g, q]
(3)
V.A. Demidova, V.F. Demyanov / An implicitfunction theorem
97
where
af,(zo) _
max [(vii, g)+(v2, q ) ] +
rain [(wli, g)+(w2/, q)].
(4)
Here Df~(z)=[O_fi(z),Of~(z)] is a quasiditterential of f~ at z; 9 f . . ( z ) c E m + . , ~f.(z) c E,.+. are respectively sub- and superditterentials off~ at z (convex c o m p a c t sets)" vi = [vii, w,], and wi = [wli, w2i]. Let q0 ~- E . be a solution to the quasi-linear system
of,( zo)
O[g, qo]=O
Vi 9
(5)
~+o 9 0
(6)
Suppose that in (3)
oi(a, q) a
uniformily with respect to q 9 S ~ ( q o ) = {q 9 E~
l llq- qoll ~< ~},
where 8 > 0 is fixed. Is it possible to find a vector function r ( a ) with ao>O such that
f~(xo+ ag, yo+ a [ q o + r
=0
Vi 9 1 : n, a 9 [0, ao]
(7)
where r ( a ) 9 E . V a c [0, ao]? Take e ~> 0 and introduce the sets R,. = {v, e 0f~(zo)[(v,,, g) +(v2,, qo) t> m a x [(~,i. g) + (v2,. qo)] - e}, /~i~ = {w, e0f~(zo)l (w,~ g)+(w2,, q o ) ~<
min [(we,, g)+(w2,, q o ) ] + e } ,
~,~eSs
_R,(r) ~={v, e O_f~(zo)i(v,,, g) + (v2,. qo+ r) = /~,(v) :: {w, e ~f~(Zo)l (w~,, g) + (w2,, qo+ ~) =
m a x [(vl,. g) + (v2,. qo + ~)]}, t;~c_oA(zo) rain [ ( ~ , , g) + (~2~ qo + r)]}.
~c~f,(zo)
It is clear that all these sets d e p e n d on Zo, g, qo. Note that m a p p i n g s _R~(r) and J~i(ff') are u p p e r - s e m i c o n t i n u o u s (i.e., closed) and that for any e > 0 there exists a 81 > 0 such that
8~=8~(e)<8,
_R,(~)~_R,,,
~,(~-)=.~,~
V i e l : n , VreS~,(0)
From (4),
OL(zo) o[g, qo + r]
(vl,(v). g) + (v2,(r). qo + r) + (wli(r). g) + (w2i(v), qo + r) = (V2i(v) + W2,(v), T) + r,,(v)
(8)
98
V.A. Demidova, V.F. Demyanov / An implicit function theorem
where
rli( z) = (v,,(~-), g ) + ( v2i( r), qo)+ (wli(z), g) + ( w2i( r), qo), v,(~) = [v,,(~), v2,(~)] ~ 8,(z),
w,(~-) = [w,,(~-), w2,(~)] ~/L( ~) .
Since _Ri(r) and /~i(r) are upper-semicontinuous, if T. ~ w,(%) ~ w, then 1.)i E7 Ri(O),
0, V ~ ( Z , ) ~
V, and
w i E Ri(O).
This means that
rli(O)
af,.(Zo) cg[g, qo]
and rti(T) are continuous. From (5) it follows that rli(0) = 0
(9)
Vic l:n.
Thus, from (3) f~(xo + ag, yo + c~(qo + T)) = a[(v2,(r) + W2i(T), T) + r,(a, ~-)] where
ri( a, "r) =- r,i('r) +
oi(a, qo + ~) Ot
Consider the functions F/,,(r) = (v2i(~') + W2i(T), 7")+ ri(ot, ~').
(10)
Here t)2i('r ) E V2i(7"), w2i(7") E l~2i('/'), where V2,(~') = {V2i J::IV, i E Era: [ V,i, V2i] E _Ri(~')}, ~k~l"r2i('r) : { W2i l a Wli E Era: [ w i i , w2i ] C_ Ri(,/')}.
The mappings V~(T) and w.~(~-) are upper-semicontinuous. Now introduce the set M (7) of matrices such th~tf A e M (r) if A is a matrix with i-th row [ v2~(r) + w2i(z) iT where v2i(r)~ V2i(z)
and
w2i('r)e
~'r2i(T ).
The mapping M is convex-valued and upper-semicontinuous. Let us denote by ME (where e >/0) the set of matrices defined as follows:
M~=
f (A/I
I
Ai=[o2i+w2i]T, v2ie_Ri~,w2ieRieVi ,.
A=
\ao/i
V.A. Demidova, V.F. Demyanov / An implicit function theorem
99
From (8) it is clear that Vr ~ S~,(0).
M(r)cM~
(11)
Note that if 81 = 81(r in (11) then (8) is satisfied. Theorem 1. I f for some e > 0 we have rain det A > 0
(12)
Ae M~
then for a positive and sufficiently small there exists a solution r ( a ) to system (7) or, equivalently, to the system
F~(r) = 0
Vic l:n.
Proof. Let us construct the mapping M - ' ( r ) r ( a , r) = &,,(~')
where M - I ( z ) = { B = A - ' I A ~ M(z)}. From (11) and (12) it follows that r a e [0, ao]) in r e S~,(0) and that
is upper-semicontinuous (for any fixed
~ff ( S ~ l ( 0 ) ) C S ~ I ( 0 ) 9
It is easy to see that r is convex for each ~'. This means that all of the conditions o f the Kakutani theorem (see [10, 11]) are satisfied and therefore there exists at least one point r ( a ) which is a fixed point of the mapping r z(~) e r From (6) and (9) it is also clear that ~'(a)
ot --}0
,0.
Now from the above equation and (10) it follows that ~(r(~))
= 0.
[]
Corollary. I f q o is a solution to (5) and the condition (12) o f Theorem I is satisfied then system (2) has a solution y ( r ) defined on [0, ao] (where d o > O) such that y~.(O) --- lim 1 [ y ( a ) - y ( O ) ] = qoa ~ + 0 O/
We shall call Theorem 1 a directional implicit function theorem. Of course, there could be several solutions to (5), or none at all.
V.A. Demidova, V.F. Demyanov / An implicit function theorem
100
It is important to be able to solve systems o f equations o f the form max[(v1~,g)+(v2~,q)]+min[(w~,g)+(W2l, q)]=b~
oi~O'li
wlEo'2i
Viel:n
vi
where = [ v , i , v2~], wi =[wli, w2i], and t r , , c E,,+. and ~ 2 i c E,.+. are convex c o m p a c t sets. We shall call systems o f this type In some cases (for example, if cr~i and 0"2~ are convex hulls o f a finite n u m b e r of points) the problem o f solving quasilinear systems can be reduced to that o f solving several linear systems o f algebraic equations (we shall illustrate this later on).
quasilinear.
3. An inverse function theorem N o w let us consider a special case o f the problem, namely, where system (1) is o f the form x + d~(y) = 0
(13)
i.e.,
x(i)+~bi(y)=O
Vie l:n,
where x = (x (1). . . . , x (")) e E,,
y = (y(l) . . . . , yC~)) e E,
and the functions ~b~ are quasidifferentiable on E.. Suppose that Zo = [Xo, Yo] e E2,, is a solution to (13), i.e, Xo+ ~b(yo) = 0. C h o o s e and fix any direction g e E,. We n o w have to consider two questions: 1. What conditions are necessary for the existence o f a positive ao and a continuous at a = 0 vector function such that the expressions
y(a)
y(0)=yo, xo+ag+c~(y(a))=0
V a e [ 0 , ao]
(14)
are satisfied? 2. I f y ( a ) exists does ' y ' ( 0 ) =- lim 1 [ y ( a ) - y ( 0 ) ] a~-~O
necessarily exist? To answer these questions we turn to T h e o r e m 1 and its corollary. Let = [_0$~(y),~$i(y)] be a quasidifferential o f ~bi at y. We then have
Dcbi(y)
~bi(yo+aq)=c~i(yo)+a[L v,~_.o.~,(vo) max (vi, q)+ min (wi, q)]+oi(a,q). w, c ~,,(yo)
(15)
V.A. Demidova, V.E Demyanov / An implicit function theorem
101
In this case equation (4) takes the form max ( v , . q ) +
m in
vic-O. ~ , ( y o )
(wi, q ) = - g i
(16)
Vi~l:n.
wiCr)cbi(yo)
Suppose that qoe E. is a solution to (16) and that in (15) oi( a, q) Ol
a ~ +O
~0
uniformaly with respect to q c S~(qo). We now introduce the sets max (vi, q ) - e } ,
_Ri~ = {o,c~_4~,(yo)l(vi, q)>-
Vi~_d~i(Yo)
/~,. ={W~eO~b,(yo)l(wi, q)~<
min
(w,, q ) + e } .
Let M, be a set of matrices defined as follows:
Me =
A=
Ai=[vi+wi]T,v,e_Ri~,w2eRi~Vi n
}
where e/> 0. Theorem 2. I f f o r some e > 0 we have (17)
min det A > 0
A~ M~
then there exist an a o > 0 and a continuous vector function y( a ) such that y(O) = Yo, Xo+ ag + r
=0
and
y~(O):= qoRemark 1. In the case where each of the sets _0~bi(yo) and 0~bi(yo) (for all values of i) is a convex hull of a finite number of points, it can be shown that theorem 2 is valid if (17) holds for e = 0 . An analogous result can also be obtained for Theorem 1. Remark 2. Suppose that [Xo, Yo] is a solution to (14). Then to solve the directional inverse function problem it is necessary to find all the solutions to (16) and check whether condition (17) is satisfied. As an illustration of Theorem 2 and the use of the technique outlined above we shall now present a simple example.
V.A. Demidova, V.F. Demyanov / An implicit function theorem
102
Example. Let x = (x (1~, x t2)) ~ E2, y = (yU), y(2)) E E2, x o = (0, 0), Yo -- (0, 0). C o n s i d e r the following system of equations: x I1) + ly ~I)I - 2ly 12~l= O,
x~2~+ ly(t~_ y(2)] = 0.
(18)
This system is simple enough to be solved directly. It is not difficult to derive the following solutions: 1.
y(1} = x(I)-- 2x(2), (19)
y(2) = X(1)_X(2)
if y~/'21 = {y = (y('), yt2))ly~l) > O, y(2)> O, ytt)_y(2)>~ 0};
y(t) = X(I) + 2X(2),
2.
X(2)
y(2) = x ( 1 ) +
(20)
if y c O2 = {Y = (y~l), y(2))ly(i) > 0, y(2) >10, y(n _ y~2) <~ 0} ;
3.
y(l)= __ l x ( I ) -
2X(2) '
yC2~=_kx.)+kx(2~
(21)
if y ~ g23 = {y = (y<~), y(2)) ] yU)/> O, y(2) ~< O, y(l) _ y(2)/> O} ;
4.
yU)=]x(')+2x(2), y(2~ =
(22) ~x(l)
-- lx(2)
if y 6 g24 = {y = (yU), y~2))ly(t)<~ 0, y(2)/> 0, y(1)_ y(2) <~0};
5.
y(1) = - x ( ~ ) - 2x~2), (23) y(2) = _ x ( a ) _
xO)
if y ~ Ft5 = {y = (y(~), y(2))ly(~) <~O, y(2) ~< O, y(i) _ y(2)/> O}; 6.
y ( l ) = _X~O + 2X(2), y(2)___X(l) d- X (2)
if y
~ ,(~6 :
(24)
{Y = (y(1), yC2))ly(1),~~ O, yt2) <~ 0, yU) _ y(2) ~< 0}.
In this e x a m p l e it is obvious that [Xo, Yo] satisfies (18). N o w consider the (arbitrarily chosen) four directions gl = (1, 0), g2 = ( - 1 , 0), g3 = (1, 1), g 4 = ( - - 1 , - - 1 ) . For gt we have Xo+ ctg~ = (a, 0). We now look at each of the possible solutions in turn. From (19), Yll (t~) = (a, a ) ~ ~1 V a / > 0, i.e., Y11(a) satisfies (14) for all a / > 0 and therefore Y~l(a) is a directional inverse function o f (13) in the direction gl and y'H+(O) = (1, 1) = qo,. Solution (20) yields the same directional inverse function as (19).
V.A. Demidova, V.F. Demyanov / An implicit function theorem
103
From (21) we obtain yt3(X) = ( - ~ a1 , - ~ a ) ; in this case Y13~'23 V a > 0 and therefore y~a(a) is not a directional inverse function o f (13) in the direction g~. From (22) y~4(a) = ( gaa , ~ a ) r V a > 0 , and therefore ylg(a) is also not a directional inverse function of (13) in the direction g~. Solutions (23) and (24) yield the functions yt5(a)=y~6(a)= ( - a , - a ) , where
yls(a)r
and
Yl6(Ot ) C ~"~6 ~Of ~ 0.
ThUS ylS(Ot)=y(ct) is a directional inverse function of (13) in the direction g~ and y'~s+(0) = ( - 1 , - 1 ) = q05. Thus there are two directional inverse functions o f (13) in the direction gl :yll(Ct) = (a, a ) and yls(a) = (-a, - a ) . N o w let us consider g2 = ( - 1 , 0). F r o m (19),
y21(Ot)=(--a,--ot)~Ol
Vot > 0,
and therefore y2~(a) is not a directional inverse function of (13) in the direction g~. In the same way we obtain (for a > 0):
yz2(a) = ( - a , - a ) ~ .02,
yz3(a) = (-~a, Jet) ~ J'~3,
y24(ct) = ( - l a , - ~ a ) ~ 124,
y25(a) = (a, a)~ 12s,
Y26(a) = (a, or)I~ J'~6.
This m e a n s that there is no directional inverse function of the system (13) in the direction g2. The same is also true for the direction g3 = (1, 1) since for a > 0 y31 (or) = (-- a, 0) ~ ,('21,
Y32 = (3a, 2 a ) ~ 122,
Y33(ot) : (--Or, 0) t~ ~'~3,
Y34(a) = (a, 0) ~ .O4,
y35(ct)=(-3a,-2a)~ 05,
Y36(0~) = (O~, O) ~ J~6.
In the same way we find that there are two directional inverse functions of (13) in the direction g4 = ( - 1 , - 1 ) :
y4~(ct)=Y4a(a)=(ct, O),
y44(ct)=Y46(a)=(-a,O),
and y~l+(0) = y~3+(0) = ( 1 , 0),
y h + ( 0 ) = y~s+(0) = ( - 1, 0).
N o w let us solve the p r o b l e m again using the results of T h e o r e m 2. System (18) can be rewritten in the following form (see (13)): x + 4,(y)=0
where 6 = (4h, ~b2), 6,(Y) = {Y<')I- 21Y<2~1, and ~b2(y) = ly "1 - y~2)l. The functions ~b, and &2 are quasidifferentiable. We first find their quasidifferentials at Yo = (0, 0): Dthl(Yo) = [_0~bt(yo), 06t(Yo)],
Dqb2(yo)= [_3tb2(yo),-Oq~2(Yo)]
(25)
104
V.A. Demidova, V.F. Demyanov / An implicitfunction theorem
where O~b,(yo) = {v = (v I'), v(2)) I v(') c [ - 1 , 1], v (2) = O} = c o { ( - 1 , 0), (1, 0)}, ~b,(yo) = {w = (w CI), w(2))lw(') = 0, w~2~c [ - 2 , 2]} = co{(0, - 2 ) , (0, 2)}, _8~b2(Yo)= c o { ( - 1, 1), (1, -1)},
~b2(Yo) = {(0, 0)}.
For any fixed g = (gtl), g(2)), we have to solve (16) and find qo = (q~o~), q~o2))9 From (25) and (16) we obtain the system
v~l)q(I) +
max v l l ) e [ - l , 1]
min
w~2)q(2)=_g~l),
w~2;~[-2,2]
(26) max V2r
(v2, q) = _g(2).
1,1 ),( I,-- I )}
In general we cannot solve (16) but if _a~bi and Oqb~ are convex hulls o f a finite number o f points, as is the case here, we can solve (26) by considering the following eight linear systems o f algebraic equations:
1.
_q(~)_ 2q(2) = _gl~),
(27)
_ q(l) + q(2) = _ g ( 2 ) ;
2.
_q(1)_2q(2) = _g(1), (28)
q(l)_q(2)= _g(2); 3.
-q(I) + 2q(2) = -g(X), (29)
_ q ( l ) + q(2) = _g(2); 4.
_q(l) + 2q(2) = _g(i), (30)
q(l)_ q(2) = _g(2); 5.
q(~)_2q(2) = _g(t),
(31)
_ q ( t ) + q(2) = _g(2); 6.
q(1)-2q (2) = -g(~), (32)
q(1)_ q(2) = _g(2); 7.
q(I)+2q(2) = _ g o )
" (33)
_q(l) -F q(2) = _g(2);
8.
q(~)+ 2q (2)= --gft), q(1)_ q(2) = _g(2).
(34)
The systems (27)-(34) are all nondegenerate and thus solutions exist for any g ~ E2. Take gl = (1, 0). Solving (27)-(34) we obtain four ditterent vectors: qH = (], ~)
(from (27) and (28)),
V.A. Demidova, V.F. Demynnov / An implicitfunction theorem q,2 = (--1,--1)
from (29) and (30)),
ql3 = (1, 1)
(from (31) and (32)),
q~4 = ( - ~ , - ~ )
(from (33) and (34)).
105
Now it is necessary to check which ofthe values of q~ are solutions to (26), i.e., satisfy max
v~)q(l)+
ull)~[--I,1]
min
W~2)q(2)=--l,
w12)6[--2.2]
(35) max
v2cco{(- 1.1).(1.- 1)}
(v2, q) = 0.
A quick check shows that only vectors qt2 and q~3 satisfy (35). For q~2 = ( - 1 , - 1 ) we have R,o={V, 9 -
ql2)=
"
-
min (v,, q , 2 ) } = { ( - 1 , O)}, ul~#4,t(.vo)
R,o = {w, ~ Ock,(yo) l(w,, q,2) =
820 = { v2 9 _,34,2(yo) I (v~, q,2) =
~2o={W29
(w,, q,2)} = {(0, 2)},
min max
(v2, ql2)} = c o { ( - 1, 1), (1, - 1)},
u2c _~,t,2(yo)
q,2) =
min
(w2, q t 2 ) } = { ( O , O ) } .
W2C0~2(Y0)
Then
{a (~ 1
a, = [vi + wi] T, l)i 9 _Rio, wi 9 Rio
a2
}
=
co{A,, A2}
where
a(i Note that
i (;
Ao - ~A, +~A 2 =
E Mo
and det Ao--0, i.e., condition (17) is. not satisfied. But we should remember that condition (17) was originally introduced to deal with o~(a, q) in (15) and that it is a sufficient, not necessary condition for the existence of a directional inverse. In our case ol(a, q) = 0 for all values of i and q, and therefore there must be a vector function y,2(a) such that Xo+agt+c~(y,2(a))=O
Va>~O
and
y',2+(O)=q,2=(-1,-1).
Moving to ql3 = (1, 1), and following the same line of argument we deduce the existence of a vector function y13(a) such that Xo+Otgl+gb(yl3(a))=OVot>~O
and
y'13+(O)=ql3=(1,1).
106
V.A. Demidova, V.F. Demyanov / An implicitfunction theorem
Thus there are two solutions to (15) for g ~ = ( 1 , 0 ) : duplicates the result o b t a i n e d earlier. For g2 = ( - 1, 0) we again arrive at the four vectors q2, = (~,-~),
qz2 = ( - 1 , - 1 ) ,
q23= (1, 1),
Y~2(~) and y13(ot). This
q2,= ( - ~ , - ~ )
calculated previously as solutions to (27)-(34); however, none of them satisfies (26). Thus the system (13) has no directional inverse function in the direction g2. For g3 = (1, 1), solving the systems (27)-(34) yields six different vectors: q3, = (1, 0)
(from (27) and (29)),
q32: (--1, 0)
(from (32) and (34)),
q33 = ( - ~ , 2)
(from (28)),
q34 = (89 - ] )
(from (33)),
q35 = ( - 3 , - 2 )
(from (30)),
q36 :
(3, 2)
(from (31)).
These values should then be tested by substituting them into (26) (for q<')= 1, q~2~ = 1). We find that none of these six vectors satisfies (26), and therefore system (13) has no directional inverse function in the direction g3. For g4 = ( - 1 , - 1 ) we obtain the same six vectors as for g3 q,, = (1,0), q44 = (~, - ]),
q,2 = ( - 1 , 0),
q43 = (--~, 2),
qas = ( - 3 , - 2 ) ,
q46 = (3, 2),
but now q4t = (1, 0) and q42 = ( - 1 , 0) satisfy (26) (the four other vectors still do not). C o n d i t i o n (17) does not hold but it is not essential to invoke T h e o r e m 2 in this case since in (15) o~(cr
V~>~0 V i E I : 2 .
Thus for the direction g4=(--1,--1) there are two directional inverse functions y41(a) and y42(a) such that Xo+Otg4+~(y4j(ot))=O VOt>~O,
Xo+Otg4+qb(y42(ot))=O Va>~O,
and y~,+(0) = (1, 0),
y~2~(0) = ( - 1 , 0).
This again duplicates the results obtained earlier.
References
[1] F.H. Clarke, Optimization and Nonsmooth Analysis (Wiley, New York, 1983). [2] V.F. Demyanov, ed., "'Nonsmooth problems of control theory and optimization" (Leningrad University Press, Leningrad, 1982).
V.A. Demidova, V.F. Demyanov / An implicit function theorem
107
[3] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable functionals", Doklady Akademii Nauk SSSR 250 (1980) 21-25 (translated in Soviet Mathematics Doklady 21 (1980) 14-17). [4] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable mappings", Mathematische Operationsforschung und Statistik, Series Optimization 14 (1) (1983) 3-21. [5] V.F. Demyanov and L.N. Polyakova, "Minimization of a quasidifferentiable function on a quasidifferentiable set" (in Russian), Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fisiki 20 (4) (1980) 849-856 (translated in the USSR Computational Mathematics and Mathematical Physics 20 (4) (1981) 34-43). [6] V.F. Demyanov and L.V. Vasiliev, Nondifferentiable optimization (in Russian) (Nauka, Moscow, 1981). [7] H. Halkin, "Interior mapping theorem with set-valued derivative"", Journal d'Analyse Mathematique 30 (1976) 200-207. [8] J.-B. Hiriart-Urruty, "Tangent cones, generalized gradients and mathematical programming in Banach spaces", Mathematics of Operations Research 4 (1979) 79-97. [9] A.D. Ioffe, "Nonsmooth analysis: Differential calculus of nondifferentiable mappings", Transactions of the American Mathematical Society 266 (1), ( 1981) 1-56. [10] S. Kakutani, "'A generalization of Brower's fixed point theorem", Duke Mathematical Journal 8 (3) (1941) 457-459. [11] L.V. Kantorovich and G.P. Akilov, Functional analysis, (Nauka, Moscow, 1977). [12] L.N. Polyakova, "Necessary conditions for an extremum of quasidifferentiable functions'" (in Russian), Vestnik Leningradskogo Universiteta 13 (1980) 57-62 (translated in Vestnik Leningrad University" Mathematics 13 (1981) 241-247. [13] B.H. Pourciau, "Analysis and optimization of Lipschitz continuous mappings", Journal of Optimization Theory and Applications 22 (1977) 311-351. [14] J. Warga, "An implicit function theorem without differentiability", Proceedings of the American Mathematical Society 69 (1978) 65-69.
Mathematical Programming Study 29 (1986) 108-117 North-Holland
DIRECTIONAL DIFFERENTIABILITY OF A CONTINUAL MAXIMUM FUNCTION OF QUASIDIFFERENTIABLE FUNCTIONS V.F. DEMYANOV Department of Applied Mathematics, Leningrad State University, Universitetskaya Naberezhnaya 7/9, Leningrad 199164, USSR, and International Institute for Applied Systems Analysis (IIASA) A-2361 Laxenburg, Austria
I.S. ZABRODIN Department of Applied Mathematics, Leningrad State University, Universitetskaya Naberezhnaya 7/9, Leningrad 199164, USSR Received 15 October 1983 Revised manuscript received 15 November 1984
The problem of the directional differentiability of a m a x i m u m function over a continual set of quasiditterentiable functions is discussed. A formula for the directional derivative such a m a x i m u m function is obtained. It is shown that the function is not necessarily quasidifferentiable any more. An illustrative example is provided.
Key words: M a x i m u m Function, Quasidifferentiable Function, Directional Derivative, Feasible Direction.
1. Introduction
Optimization problems involving nondifferentiable functions are recognized to be of great theoretical and practical significance. There are many ways of approaching the problems caused by nondifferentiability, some of which are now quite well developed while others still require much further work. A comprehensive bibliography of publications concerned with non-differentiable optimization has recently been compiled [7]-major contributors in this field include J.P. Aubin, F.H. Clarke, Yu.M. Ermoliev, J.B. Hiriart-Urruty, A.Ya. Kruger, S.S. Kutateladze, C. Lemar~chal, B.S. Morduchovich, E.A. Nurminski, B.N. Pshenichniy, R.T. Rockafellar, and J. Warga. The notion of subgradient has been generalized to nonconvex functions in a number of different ways. One of these involves the definition of a new class of nondifferentiable functions (quasidifferentiable functions) which has been shown to represent a linear space closed with respect to all algebraic operations as well as to the taking of pointwise maximum and minimum [3, 8]. This has led to the development of quasidifferential calculus - - a generalization of classical differential calculus - - which may be used to solve many new optimization problems involving nondifferentiability [4]. 108
V.F. Demyanov, LS. Zabrodin / Differentiability of a maximum function
109
This p a p e r deals with the problem of the directional ditierentiability of a maximum function over a continual set of quasiditterentiable functions. It will be shown that in general the operation of taking the 'continual' m a x i m u m (minimum) leads to a function which is itself not necessarily quasiditterentiable.
2. Auxiliary results Let us consider a mapping G: E~ ~ 2 E~, where 2 e~ denotes the set of all subsets of Era. Fix x o e E n and g o e s , Ilgll--1. Choose y ~ G(xo) and introduce the set y ( y ) -= y(Xo, g, y)
= { v ~ E., I=lao> O: y + a v r G ( x o + a g ) Vct c[O, ao]}. We shall denote the closure of y(y) by F ( y ) , i.e., F ( y ) =- F(xo, g, y) = cl y(y). The set F ( y ) is called the set of first-order feasible directions at the point y ~ G(xo) in the direction g. Remark 1. In the case where G does not depend on x, the set F ( y ) = F(xo, g, y) does not depend on Xo and g, and is a cone called the cone of feasible directions at y. A mapping G is said to allow first-order approximation at a point Xo in the direction g e E,, Igll-- 1, if, for an arbitrary convergent sequence {Yk} such that yk -~ y,
yk E G( Xo + akg ),
Otk "-~ + O
ask~,
the following representation holds: YR = Y + Otkl)k "~ O( Olk )
where Ok e F(Xo, g, y), akVk -~ O, y e G(xo). In what follows it is assumed that the mapping G is continuous (in the Hausdortt metric) at a point Xo and allows first-order approximation at Xo in any direction g e E,, Ilgll-- 1. It is also assumed that for every
xcS~(Xo)=(xcEnlllx-xoll~}, ~>0, the set G ( x ) is closed and sets G ( x ) are jointly bounded on S~(xo), i.e., there exists an open bounded set B c Em such that G(x) c B
Vx~S~(Xo).
V.E Demyanov, LS. Zabrodin / Differentiability of a maximumfunction
110
Let us consider the function f ( x ) = max
~(x,y)
ycG(x)
where function d~(z) = tb(x, y) is continuous in z = [x, y] on S~(xo) x B and differentiable on Zo in any direction r / = [ g , q]E E,+,,, i.e., there exists a finite limit c~&(Xo, y)
m
c~b(Xo,Yo)
On
O[g, q] = lim l [ ~ b ( X o + ag, yo + aq) - 6 ( X o , Yo)] ~+0
VZo = [Xo, Yo] E Zo.
Here
Zo :{XoIXlR(xo)},
R(x)={yEG(x)lck(x,y)=f(x)}.
Suppose that the following conditions hold: Condition 1. If qk --> q then
Ocb(Xo, y) <~ l-J~-mOc~(Xo,y) O[g, q] k,e~o O[g, qk]
Vy E R(xo).
Condition 2. Let Yk E R(xo+ Otkq), Yk ~ Y. Since G allows first-order approximation at Xo, then
Yk= f+akqk +O(ak), It is assumed that the qk'S
are
~IE R(Xo). bounded.
Condition 3. Function 4~ is Lipschitzian in some neighborhood of the set Zo.
Then the following result holds. Theorem I. The function f is differentiable at the point Xo in the direction g and
Of(xo) Og
Of(xo, y)
-- sup sup . y~R(~o)q~r(y) O[g, q]
(1)
Proof. Let us denote by A the right-hand side of (1). Fix yE R(xo) and qE y(y). Then y + aq E G(xo+ ag) for sufficiently small a > 0 and
f ( x o + ag) >! r
(OCb(Xo, y)~ + o(a ). ag, y + aq) = f ( x o ) + a \ ~--~,q] ]
V.F. Demyanov, LS. Zabrodin / Differentiability o f a maximum function
111
Here lim h(a)=- lim 1_[f(xo+ag)_f(xo)]>3c~(xo, y) ,~~+o ,,~+o a O[g, q] Since y ~ R(xo) and q c y(y) are arbitrary then 0~b(Xo, y) lim h(a)>~ sup sup ,~*o .... mxo) q~cy) O[g, q]
(2)
Let qk ~ q, qk E y(y). Then q ~ F(y). It follows from Condition 1 that 0~b(Xo, y) ~<~
3[g, q]
&b (Xo, y). k,+~ O[g, qk]
(3)
But
O~(Xo, y) OCb(Xo,y) <~ sup O[g, qk] q~:,(y) O[g, q] Since F ( y ) = cl y(y), then from (3)
Od~(xo, y)
ad~(xo, y) <
OCb(Xo, y)
sup ~< sup ~ sup qc~(y) 0[g, q] q~r~y) O[g, q] q~,(y~ O[g,q] Hence
Od~(Xo, y) sup q~r(y) O[g, q]
Od~(Xo, y)
sup q~:,(y) O[g, q]
(4)
From (2) and (4) it follows that
(5)
lim h(a) >- A. c*~+O
Now let us choose sequences {Yk} and {ak} such that 1
- - [ f ( x o + akg)--f(Xo)]-~ lira h(a), OCk
(6)
a ~ +0
yRER(Xo+akg),
yk~37,
Os
The conditions imposed on the mapping G and the continuity of the function ~b ensure that the function f is continuous at Xo. Hence, from the equality f(xo + akg) = C~(Xo+ akg, Yk), one can conclude that f ( x o ) = ~b(Xo, y), i.e., tic R(xo). Since the mapping G allows first-order approximation at Xo, then yk = rid-Otkq k d- O( Olk) , where qk E F ( 3 7 ) , Otkq k -~ d-O. From Conditions 2 and 3 the qk'S a r e bounded and the function ~b is Lipschitzian around Zo. Without loss of generality one can assume that qk ~ q. It is clear that q E F(37). Hence
f(xo + akg) -- f(Xo) = d~(Xo+ akg, Yk) -- 6(Xo, 37) = 6(Xo+ akg, 37+ akqk +O(ak)) -- ~b(Xo, 37) = QI + Q2
(7)
112
V.F. Demyanov, I.S. Zabrodin / Differentiability of a maximum function
where
{ o4,(Xo, y)'~
Q, = ~b(Xo + akg, fi + akq ) -- Cb( Xo, .~) = Otk k . . .O[g, . . q] ] + O( ak )' Q2 = Ch(Xo+ akg, .9 + akqk + O( Otk) ) -- ~b(xo+ akg, ~ + akq ). Since ~b is a Lipschitzian function, then
IQ~I ~
Lakllqk
-
q+
(8)
o(~k)ll.
It follows from (6)-(8) that lim h ( a ) = lim l [ f ( x o + ,~+o k ~ ak
akg) - f ( x o ) ] -
Odp(Xo, y) O[g, q]
from which it is clear that lira h(ot)<~ sup
sup
ycR(xo) qeF(y)
a~+O
O~b(Xo, y) O[g, q]
A.
(9)
Comparison of (5) and (9) now shows that l i m ~ + o h ( a ) exists and is equal to A, thus completing the proof. Remark 2. Equation (1) has been proved under some different assumptions elsewhere [10] (see also [1, Section 10]). The case where ~b is differentiable was studied by Hogan [5].
3. Quasidifferentiable case Let us consider once again the function f ( x ) = max d~(x,y)
(10)
y~G(x)
where mapping G satisfies the conditions specified earlier and function tb(z)= ~b(x, y) is continuous in z on S,(xo) x B and quasidifferentiable on Zo, i.e., for any point zo = [Xo, Yo] ~ Zo there exist convex compacts _0~b(zo)c E,+m and 0~b(zo) c E. ~,, such that Odp(xo, Yo) - lim l [ ~ b ( x o + ag, yo + aq) c3[g, q] ~ + o ct --
max
[ v~,v2]~ _a~,(Zo)
-
[(vl, g) + (v2, q ) ] +
~b(Xo, Yo)] min
[ wt,w21e 3,~(Zo)
[(wl, g ) + ( w 2 , q)]. (11)
It is also assumed that Conditions 2 and 3 are satisfied. (Condition 1 follows immediately from (11).) Thus, all the conditions of Theorem 1 are fulfilled and we arrive at
V.F. Demyanov, L S. Zabrodin / Differentiability of a maximum function
113
Theorem 2. The function f defined by (10) is directionally differentiable and, moreover,
Of(xo)
sup
~g
sup ~
max
.vt:R(xo)qEF(y) ([vl,va]~-~(xO,.v)
[(v,, g) +(v2, q)]
+
mi_n
[(wl, g)+(w2, q)]~. J
[ w~,w2]ca,f,(Xo,y)
(12)
Remark 3. Since y e R(xo, y), the following relation holds: sup ~
max
(v2, q)+
qc F(y) [ v2e-~qbv(Xo,Y)
min
(w2, q ) } = 0
w2Ea~by(Xo,y)
Vy~R(xo).
Here ~_~y(Xo,y) and a4~y(Xo,y) are the projections of sets ~b(Xo, y) and ~b(Xo, y), respectively, onto Era. Remark 4. Pshenichniy [9] considered the case where G(x) does not depend on x and Fy(x) = ~b(x, y) is a directionally differentiable function for every fixed y, i.e., there exists
OCb(x,y) lim l[~b(x+ag, y)-c~(x,y)]. Og ,~-~o a Then ~b(x + ag, y) = d~(x, y ) + a 0Oh(x, y) ~-o(a, y).
Og
(13)
Under an additional assumption about the behavior of o(a, y) in (13), it has been proved that
Of(x) Og
a~b(x, y) = max ~
(14)
Og
yeR(x)
It is clear that equation (14) differs from equation (12). Example 1. Let x ~ E l , y f(x)=
max
yc[ -2,2]
~ E l , G ( x ) =- G =
[ - 2 , 2], ~b(x, y) = x - 2 [ y - x}, and
(x-~y-xl).
(15)
It is clear that
f(x)=x, R(x)={x}
Vx~(-2,2).
(16)
Choose x e (-2, 2) and verify equation (14). We shall now compute the right-hand side of (14). Since ( o ( x , y ) = x - 2 m a x { y - x , - y + x } , then for yeR(x)={x} it follows [2] that ark(x, y) Og
g - 2 max{-g, g}.
V.F. Demyanov, LS. Zabrodin / Differentiability of a maximum function
114
Hence, for gl = +1,
Oc~(x,y)
max - - y~mx) Ogl
1-2=-1,
and for g2 = - 1
Od)(x, y)
max
1-2
= -3.
But from (16) it is clear that
Of(x) 0g
=g
(17)
Vx e ( - 2 , 2).
Thus equation (14) does not hold for any direction g (in E 1 there are only two directions g such that Ilgll = 1 : g = + l and g = - 1 ) . Now let us verify equation (12). Denote by D the right-hand side of (12). The function ~b(x, y) is quasiditterentiable. From quasidifterential calculus [3, 4, 8] it follows that if y = x then one can choose _0~b(x,y) = {(1, 0)}, 0qb(x,y)= co{(-2, 2), (2, -2)}. For the function f described by (15) we have
F(y) =-F(x, y) = E,
Vx 9 ( - 2 , 2).
Computing D:
D = s u p l ( l ' g ) t-(O'q)+ ~ qc E t ~
= g + sup qc= E I [ Wl .w2]r
min
[(wl"g)W(w2"q)]}
min
[ 't.w2]cco[(-
2,2),(2.2)]
[(w~ 9 g)+(w2" q)].
(18)
" 2.21.(2,--2)]
It is clear from Figure 1 that for any g the second term on the right-hand side of (18) is equal to zero, i.e., D = g. (The supremum in (18) is attained at q = g.) Thus, from (17), equation (12) is correct in this case. R e m a r k 5. When solving practical problems in which it is required to minimize a max function over a continual set of points, this maximum function is often discretized (the contirt~lal set replaced by a grid of points). In m a n y cases this operation is a legitimate one [6], but we shall show that in the case where ~b is a quasidifferentiable function this replacement may be dangerous. Let f again be described by (15). Define fN as
fN(X) = m a x ( x - 2IY - x[), yCo" N
where crN = { X b . . . , XN}, Xk e[--2, 2]. This function has N local minima (see Figure 2), although the original f = x has no local minimum which is not also global on [ - 2 , 2]. This demonstrates that the discretization of a max-type function must be carried out very cautiously.
V F. Demyanov, L S. Zabrodin/Differentiability of a maximum function
115
w2
\ \ \ \ \
\ \
I -2
-I
~- w I
N k N
-i
-2
\\
~ ' \ \ N. \
\
\
Fig. 1.
-2
-1
~_ x
,-2
Fig. 2.
116
V.F. Demyano~, LS. Zabrodin / Differemiability of a maximum function
Example 2. This e x a m p l e illustrates that Condition 2 is essential to our argument.
q~(x,y)
= x-2
min
,/(x-t3)2+(y-t) 2,
t~[-2, 2]
f(x)
= y~c-2.21 max dp(x,y).
(19)
It is clear that for x ~ ( - 2 , 2),
R(x)={yly=,~x},
f(x)=x.
(20)
I f y = ~ x , then the m i n i m u m in (19) is achieved at t = x. Take xo = 0. Then R(0) = {0}, F(y) = E~. Construct a quasidifferential of the function ~b at the point (0, 0). By the rules of quasidifferential calculus one can choose _O~b(O,O) = {(1, 0)},
~b(O, O) = c o { ( - 2 , 0), (2, 0)}.
Let us denote by D the right-hand side of equation (12) and evaluate it.
D=-D(g)=sup{1.g+O+ qc E t
min
(w,'g+w2"q) }
[ wt, w2]cc~ (-- 2,0),(2,0) ]
=g-21gl. If gt = 1 then D ( g l ) = - 1 ; i f g2 = - 1 then O ( g 2 ) = - 3 . But it is clear from (20) that af(O)/~g = g. Thus equation (12) does not hold, and the reason is that Condition 2 is not satisfied. Indeed, taking an arbitrary sequence Xk=Xo+akg where ak--'+O, putting, for example, g = I , Xo=0, we obtain yk =
)7-]- OlklJk, Yk E R ( X k ) . For )5 = 0 this leads to I)k =
O~k
R(Xk) =
{~ak}. But
~ "4-00. k~oo
Concluding remarks Thus, a f o r m u l a for the directional derivative of a m a x i m u m function over quasidifferentiable functions has been established ( T h e o r e m 2). Two other p r o b l e m s arise: (1) h o w to use the f o r m u l a to construct effectively a steepest descent direction; (2) how to a p p r o x i m a t e the dea'ivative as the difference o f two positively h o m o g e n o u s functions (in this case we can apply the results o f QuasiditIerential Calculus to c o m p u t e ( a p p r o x i m a t e ) descent directions).
References [1] V.F. Demyanov, Minimax: Directional differentiability (in Russian)(Leningrad University Press, Leningrad, 1974). [2] V.F. Demyanov and V.N. Malozemov, Introduction to minimax (WHey, New York, 1974). [3] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable functionals", Soviet Mathematics Doklady 21(1) (1980) 14-17.
V.F. Demyanov, LS. Zabrodin / Differentmbility of a maximum function
117
[4] V.F. Demyanov and L.V. Vasiliev, Nondifferentiable optimization (in Russian) (Nauka, Moscow, 1981). [5] W.W. Hogan, "Directional derivatives for extremal value functions", Western Management Science lnstitue, Working Paper No. 177 (Los Angeles, 1971). [6] V.N. Malozemov, "On the convergence of a grid method in the best polynomial approximation problem" (in Russian), Vesmik Leningrad University 19 (1970) 138-140. [7] E.A. Nurminski, ed., Progress in nondifferentiable optimization, CP-82-$8, IIASA, Laxenburg, 1982. [8] L.N. Polyakova, "Necessary conditions for an extremum of quasidifferentiable functions", Vestnik Leningrad University, 13 (1980) 57-62. [9] P.N. Pshenichniy, Necessary conditions for extremum problems (Marcel Dekker, New York, 1971). [10] T.K. Vinogradova, V.F. Demyanov and A.B. Pevnyi, "On the directional differentiability of functions of maximum and minimum" (in Russian) Optimization (Novosibirsk) 10 (27) (1973) 17-21.
Mathematical Programming Study 29 (1986) 118-134 North-Holland
ON THE EXPRESSIBILITY OF PIECEWISE-LINEAR C O N T I N U O U S F U N C T I O N S AS T H E D I F F E R E N C E OF TWO PIECEWISE-LINEAR CONVEX FUNCTIONS D. M E L Z E R Mathematics Division, Humboldt University, Berlin, GDR Received 20 January 1984 Revised manuscript received 9 November 1984
The differential calculus for convex, compact-valued multi functions developed by Tyurin, Banks and Jacobs is used to give an equivalent description in terms of multifunctions of the class of functions which can be represented as the difference of two globally Lipschitzian convex functions. This approach is also used to develop a means of representing piecewise-linear continuous functions as the difference of two piecewise-linear convex functions in finite dimensions. This leads directly to a Minkowski duality theorem for piecewise-linear positively homogeneous continuous functions and equivalence classes of convex compact sets produced by convex compact polyhedrons: every piecewise-linear positively homogeneous continuous function may be uniquely characterized by its quasidifferential (as defined by Demyanov and Rubinov) at zero.
Key words: Convex Compact-Valued Mappings, Support Functions, Quasidifferentiability, Globally Lipschitzian Functions, Polyhedral-Valued Mappings, Piecewise-Linear Continuous Functions, Minkowski Duality.
1. Introduction
Consider locally Lipschitzian functions f : E, ~ E~. It has been shown that Clarke's generalized gradients [5] provide a useful way of handling such functions, from both optimization and analysis viewpoints, in the nonsmooth, nonconvex case. Clarke [5] has shown that the generalized directional derivative f ~ .) is the support function of a nonempty convex compact set--the generalized gradient af(x) Vx ~ E,. This leads directly to a necessary condition for a point Xo to be a local minimizer of f :
Oe Of(xo).
(1)
Condition (1) covers both the convex case in which af(x) is the usual subgradient defined in convex analysis and the smooth case in which af(x) reduces to the sing!eton {~Tf(x)}. This observation has also led to the derivation of some calculus rules and paved the way for various generalizations (see, e.g., [25] for an introduction to the finite-dimensional case, and [19] for further references). On the other hand, there are also nontrivial examples of locally Lipschitzian functions for which (from the optimization viewpoint) af(x) may be so large that (1) does not exclude the existence of proper descent directions with the usual definition of the directional 118
D. Melzer/ On the expressibilityof piecewise-linearfunctions
119
derivative f ' ( x ; v) [14,30]. There are even examples of Lipschitzian mappings
f : El ~ El whose generalized gradients are identically equal to an interval [a, b] specified a priori [25, p. 97; 26]). As an illustration, let us consider the following simple example of a piecewise-linear continuous function f : E2--> E~: O
ifx.y
Y
ify~<x~<0, ifx~
f(x,y)= x/2 Ix~2
(2)
ifO<~y<~x.
Here we have 0~c3f(0,0) but. there are proper descent directions ( f ' ( ( 0 , 0 ) ;
-(1, ~)) =-1). One class of functions for which this 'weakness' of (1) cannot occur is characterized by the condition f~ Vx, v. (3) Such functions are called subdifferentially regular [26, p. 6] (see also [25, pp. 36-41], where this notion is introduced in a more general manner) or quasidifferentiable [22, p. 58]. For example, (3) is satisfied by lower-C k functions. ( f is defined to he Iower-C k if it is locally representable as the maximum of a continuous function F(x, s) which has partial derivatives up to order k with respect to x which are continuous jointly in both variables, where the maximum is taken over s from a compact topological space [26]. This result is due to Pshenichnyi [22, p. 64] and Clarke [5].) Obviously, (2) is not a local max function but it suggests a feeling that the 'weakness' of (1) at (0, 0) comes from insufficient separation (in some sense) of the 'convex' and 'concave' parts o f f or, more strictly, from insufficient consideration of the power of the 'concave' part (or of the 'convex' part if one is interested in local maxima). Exactly this thought is reflected in the class of quasidifferentiable functions introduced and investigated by Demyanov and Rubinov (see, e.g., [7-9]; for their relationship to Clarke's generalized gradients see [6; 7, pp. 112-115]). A function f is said to be quasidifferentiable at a point x if its ordinary directional derivative f'(x; 9) can be expressed as the difference of the support functions of two nonempty convex compact sets Of(x) and - ~ f ( x ) . Here Of(x) is called a subdifferential and -~f(x) a superdifferential of f at x. In what follows, the term 'quasidifferentiable function' will always have this meaning. Examples include differentiable functions, convex functions, concave functions, saddle functions, and any linear combination, product, quotient (if they exist), or pointwise maximum or minimum of a finite collection of such functions [7, 8]. Necessary conditions for Xoto be a local minimizer or maximizer are then
-'~f( xo) c O_f(xo),
(4)
-O_f( xo) c Of( xo),
(5)
120
D. Melzer / On the expressibility of piecewise-linear functions
respectively (cf. [21; 7, pp. 87-105]). In particular, for example (2) we have (see Example 2 in Section 5 of this paper)
o_f(O,O)={(x,y)llx[~ l, ly[~l,x + y>~ -89 (6) Of(O, O) = {(x, y)[ Ix] ~< 1, [y] <~ 1, Ix + y] <~3}. Hence, (0, O) cannot be a local minimizer or maximizer. One of the simplest and most interesting classes of quasidifferentiable functions which are not necessarily subdifferentially regular is given by those functions which may be represented (at least locally) as the difference of two convex functions. (See, e.g., [10]). There does not seem to be any very extensive literature devoted to the problem of expressing functions in this way. Let K c E, be a nonempty convex set, and let C o ( K ) denote the set of continuous convex functions f : K -~ E~. We use the notation C(K), C I ( K ) in the usual way. If K is assumed to be compact then a well-known consequence of the StoneWeierstrass Theorem [16, p. 46] is that C o ( K ) - C o ( K ) is dense in C(K) with respect to the sup norm. It is shown in [1, 2] that every function f c CI(K) with Lipschitzian partial derivatives belongs to C o ( K ) - C o ( K ) and that, for each closed real interval [a, b], Co([a, b]) - Co([a, b]) is equal to the set of integrals of functions of bounded variation over [a, hi. In the same papers, it is proved that C o ( E 2 ) Co(E2) includes every piecewise-linear function f ~ C(E2). Zalgaller [31] has sketched a theoretical algorithm for deciding whether or not f c C ( K ) belongs to C o ( K ) - C o ( K ) which requires the calculation of the convex hull of two functions at each step. He showed how the representation of a piecewise-linear function fE C(E2) may be obtained as the difference of two convex functions using the simplex method. Finally, Rockafellar [26] proved that for 2 ~ k ~ co the classes of lower-C k functions all coincide, and that f 6 lower-C 2 is equivalent to the statement that f can be locally represented as f - - g - h, where g is convex and h is convex quadratic. Some quadratic programming problems were discussed in [11, 28]. In the present paper, we extend some results obtained by Pecherskaya [20] (see also [7, Chapter II]) to give an equivalent description in terms of differentiable multifunctions of the subclass of C o ( E , ) - C o ( E , ) composed of globally Lipschitzian convex functions [4, 29] (Section 2). Let K c E, be a convex compact set. Then since every g e C o ( K ) with finite directional derivatives over K with respect to each feasible direction may be suitabily extended to produce a Lipschitzian function ~ c C o ( E , ) [24, Theorem 23.3], there is no loss of generality in limiting ourselves to global considerations. In Section 3 we show a class of polyhedral-valued mappings to be differentiable, and in Section 4 we use these results to construct a new proof of the fact that every piecewise-linear function f ~ C ( E , ) belongs to C o ( E , ) Co(E,). We also sketch an algorithm for computing a representation of f as the difference of two convex functions which is simpler than those given by Zalgaller [31]. This leads to a Minkowski duality-type relationship [12; 24, Section 13; 27]). Two simple examples, including (2), are used to illustrate the method in Section 5.
D. Melzer / On the expressibility o f piecewise-linear functions
121
We shall use the following notation: B., Sn - - unit ball and unit sphere in the Euclidean space E.. co M - - convex hull of M c E.. M* - - positive polar cone of M. 2 fo - - set of all nonempty convex compact subsets of E.. d~(M, N ) - - Hausdorff distance between M ~ E. and N c E.. J~(t) - - Jacobian of a function f,: E I ~ E. at a point t; the argument t will be omitted iff~ is affine-linear. ~(E.) - - piecewise-linear continuous function on E.. , ~ (E.) - - piecewise-linear positively homogeneous continuous function on E.. ~(E.) - - piecewise-linear positively homogeneous continuous convex function o n E n. All the other notation is standard or defined directly where it is used for the first time.
2. Directional differentiability of multifunctions and quasidifferential calculus The definitions and assertions given below are, with slight alterations, valid for a much broader class of spaces than that considered here [4, 13]. We shall simply confine ourselves to what is necessary for the purposes of the present paper. Let 12c E,. be an open set and I':12-->2 eo be a (multivalued) mapping.
Definition 1. The mapping F is said to be directionally differentiable at x ~ [2 in the direction h ~ E,. if there exist two sets Fx+(h), I'~ (h) ~ 2 F. such that, for small positive a, we have d H ( F ( x + ah) + a r ; ( h ) , F ( x ) + a F t ( h ) ) = O,.h(a),
(7)
where Ol -l Ox,h(Ol )
at ~0+
) O.
This is Tyurin's definition [29], wlaich is equivalent to one given later by Banks and Jacobs [4] based on Radstrorrfs embedding theorem [23]. From this we also have that the pair (l'~(h), F~(h)) is not unique, corresponding to the fact that d H ( A + C, B + C) = dH(A, B) for arbitrary fixed A, B, C 6 2 v,,. However, uniqueness may be achieved by use of the equivalence relation
(A,B)-(C,D)
iff
A+D=B+C,
A , B , C , D c 2 E..
Furthermore, F is said to be directionally differentiable at x c 12 if it is directionally differentiable in all directions. In this case, F + , F X : E , , ~ 2 w. are positively homogeneous mappings. Finally, F is positively (negatively) conically directionally differentiable at x in the direction h if (7) holds with F ~ ( h ) = {0}(F~(h)= {0}). In
D. Melzer / On the expressibility o f piecewise-linear functions
122
the special case of convex c o m p a c t images, the differentiable m a p p i n g s F : E--} 2 Y (where E, Y are finite-dimensional Euclidean spaces and 2 Y is the p o w e r set of Y) considered by Nurminski [18] are actually conically differentiable mappings. As mentioned below, m a p p i n g s of this type have some interesting local monotonicity properties. O f course, there are a lot of other a p p r o a c h e s to the differentiability of multifunctions (see, e.g., [15] for the relationships between different notions and for an incomplete bibliography), but that given a b o v e is suitable for our purposes. In what follows, we are interested only in the differentiability properties of F for fixed x, h. Hence, we can simplify the notation; a m a p p i n g F which is differentiable at x in the direction h could also be considered as a m a p p i n g F : [ 0 , 8)--> 2 E", ~ > 0, which is differentiable from the right at zero. Using Minkowski duality, we can reformulate Definition 1 in terms of support functions as follows: Let P ( E . ) be the set of real sublinear functions over E, with the t o p o l o g y induced by the C ( S . ) sup norm, and consider 4> :2 ~o ~ P(En) given by ~b(M,v)=max(z,v),
M ~ 2 ~~ v ~ E , ,
zcM
(the support-function m a p p i n g ) , and ~br :[0, 8 ) ~ P ( E , ) defined by rbr(a,v)=fb(l'(a),v),
veE,.
(8)
Then for all sets M, N e 2 e. we have
d,,(M, N ) = II~(M, ") - ~ ( N , . )11 = m a x l 6 ( M , v) - 4,(N, v)l,
(9)
t~s S n
which leads to Proposition 1 [20; 7, p. 131]. Proposition 1. 1":[0, 8 ) ~ 2 ~o is differentiable from the right at zero iff the limit lim ~ - l ( ~ b r ( a , v) - ~br(0, v)) =: ~b~-(0+, v)
(10)
~0+
exists uniformly with respect to v ~ S, and there are sets I'+, F - ~ 2 e" such that
4,~-(o +, . ) = ~ ( r +, 9) - , / , ( r , - ) . Let us n o w consider ~b~.(0+, .) : E, --> El. If F is differentiable from the right at zero then ~br(0 ' + , 9) is direetionally differentiable at zero in every direction g e E,, and we have - a- ~ , . r( o 0g
+
, o ) -- 6,-(0 ' , g) = ~b(l "+, g ) - ~b(F-, g).
(11)
But this m e a n s that ~b~-(0+, 9) is quasidifferentiable at zero with _a~b~-(0+, 0 ) = F +, a6~,(O +, O) = - Z ' - . 9
!
-i-
Proposition 1'. I ' : [0, 8) --} 2 E- is differentiable from the right at zero/ff~br(0 , v) exists uniformly with respect to v ~ S. and ~br(0 ' +, 9) is quasidifferentiable at zero.
D. Melzer / On the expressibility of piecewise-linear functions
123
Thus, if F is differentiable from the right at zero we have ~b~.(0§ ( C o ( E . ) - C o ( E . ) ) ; F is positively (negatively) conically differentiable from the right at zero if it is differentiable from the right and ~b~-(0§ 9) is a convex (concave) function. In the latter case, especially if we assume ~b~-(0+,v)>0, v e S.(cb'r(O ~, v) < O, v e S.), or, equivalently, 0 e int I "+, F - = {0} (0 e int F, I'* = {0}), it is clear that for sufficiently small positive a, F ( a ) is strictly increasing (decreasing) with respect to the set inclusion. In order to derive the desired close relation between differentiable mappings and ( C o ( E . ) - C o ( E . ) ) functions, l e t f e C ( E . ) be such that it can be represented as the difference of two Lipschitzian functions g, h e C o ( E . ) . The supposition that g e C o ( E . ) is Lipschitzian is equivalent to stating that
~(x,t)
~tg(x/t), = [ g0+(x),
t>0, t~<0,
(12)
belongs to Co(E,§ ~), where gO*(. ) denotes the recession function of g [24, Theorem 10.5]. Hence, without loss of generality, f, g, h may be assumed to be positively homogeneous. Proposition 2. Let g, h e C o ( E . ) be positively homogeneous functions and f:= g - h. Then
F(a):={xeE, l(x,u)<~h(u)+af(u),ueSn},
aeE~,
(13)
is differentiable from the right at zero with F+=Og(O),
/'-=0h(0).
(14)
Proof. For arbitrarily fixed cre [0, 1], we have r ( o t ) = {x e E,, I(x, u) <~ag(u) + (1 - a)h(u), u e S,,}
= O(otg(O) + (1 - a)h(O)). Thus, using well-known properties of the subgradient we obtain F ( a ) = a F t + ( 1 - a ) F - , a e [0, 1], which implies d• ( F ( a ) + ctl'-, F(O) + a l "+) =- O, a e [0, 1]. Now let f : E.-~ E, be an arbitrary function. Following the pattern of the convex case, we say that f has a recession function f0+: E, -~/~ = [-co, +co] if the pointwise limit f 0 + ( x ) : = l i m t-~f(tx),
xeE.,
(15)
exists (in either the proper or the improper sense) [24, Corollary 8.5.2]. Then f : E,~ l ~, ff, l defined by
jr(x,
.,.
I)
6~
[tf}x/t) If0 (x) IfO§ [+co r
/
ift>0, if t = 0 , if t < 0 and f 0 ~ is everywhere finite, otherwise,
D. Melzer / On the expressibility of piecewise-linear functions
124
is called the positively homogeneous function produced by f. Moreover, if f is Lipschitzian then f belongs to C(E~+l). Making use of these notions, we arrive at the following result: Corollary 1. f : E, -->E 1 can be represented as the difference of two Lipschitzian convex functions if and only if it has a recession function fO + which is everywhere finite and there exist a 8 > 0 and a mapping F: [0, 8) ~ 2 Fo which is differentiable from the right at zero such that
4,~-(o +, (v, t)) =?(v, t),
(v, t) c s,,+l.
The following example shows that global Lipschitz continuity and the fact that a function f can be expressed as the difference of two convex functions are not sufficient to guarantee the existence of a representation o f f as the difference of two Lipschitzian convex functions. Let f : El ~ El be given by
f(x)=
x cos In x, (l+sin(x-1))cos(x-1),
xt> 1, x~l.
(17)
f has globally bounded first and second derivatives and, consequently, it is globally Lipschitzian and may be represented [ 1, 2] as the difference of two convex functions g, h : El ~ E,. However, f has no recession function and thus cannot be expressed as the difference of two Lipschitzian convex functions.
3. Differentiability of polyhedral-valued mappings We shall now attempt to apply the above results to the class of piecewise-linear continuous functions. With this goal in mind, we shall first look at a suitable class of polyhedral-valued mappings which is actually somewhat more extensive than necessary but which provides a good understanding of the strength of the notion of differentiability adopted. The methods used are borrowed from linear parameteric programming, and, for convenience, we first of all give definitions of the terms 'partitioning' (see [3, p. 91] or [17, p. 3]) and 'piecewise-linear function' as they are used here. A finite system {M~ c En I1 <~ i <~ s} of (nonempty) convex polyhedrons is said to be a polyhedral partitioning of a given convex polyhedron M c E, if the following three conditions hold: (i) ~--J~=lMi = M. (ii) riM~c~riMj=O,i#j,l~i,j~s. (iii) dim M = dim M~, 1 <~ i <~ s. The partitioning is said to be conically polyhedral if the Mi, 1 ~ i ~ s, (and thus also M ) are convex polyhedral cones with a c o m m o n vertex (without loss of generality,
D. Meher / On the expressibility of piecewise-linear functions
125
located at the origin). A polyhedral partitioning is said to be regular if the following condition is satisfied: (iv) foreachlc{1,...,s},lll>~2, suchthatf-)~ M ~ O 1"'1~ M~ is a common closed face of M~, i e / . Definition 2. f e C ( E.) is said to be a piecewise-linear continuous function (f e ~ ( E.) ) if there is a polyhedral partitioning {M f . . . . , M f} of E. such that f is aftine-linear on M f for all i. If f ~ ( E n ) is positively homogenous ( f s ~ + ( E n ) ) then { M ~ , . . . , M~} is assumed to be conical. Obviously, there is no loss of generality in restricting consideration to functions whose domain is the whole space E., and similarly piecewise-linear functions may be defined which are not necessarily continuous. I f f e ~ ( E . ) is convex then f is a convex polyhedral function [24, Section 19], and the partitioning of E. into maximal (with respect to the set inclusion) linearity ranges o f f results in a regular polyhedral partitioning. In the nonconvex case, such a maximal polyhedral partitioning does not necessarily exist. If M c E. is a compact convex polyhedron then ~b(M, 9) ~ ~+(E~) and
{(-cone( M - y))*lye H ( M ) }
(18)
(i.e., the set of normal cones of M with respect to its vertices) represents the partitioning of E, into maximal linearity ranges of ~b(M, 9) (see, e.g., [17, Chapter 8.5]). We also have that {~b(M, ")10# M c E, a convex compact polyhedron} is a proper subclass of ~c(En). However, it may be shown (see Proposition 4 of this paper) that for each f ~ ~ ( E , ) and each polyhedral partitioning P f of En into linearity ranges o f f there exist a regular refinement/Sy = { ~ ( . . . . . h~t~} of PY and a convex compact polyhedron QcEn+~ such that {hT/~Yx{1}ll<~i<~}= {(-cone(Q - y)* n (E, x {1})IY e H(Q)}. Now let functions f : [0, 8) --> E,, 1 ~< i <~ r, be continuously ditterentiable and F : [ 0 , 8 ) ~ 2 E. be defined by
r(t) := c o { f ( t ) l 1 <~ i ~< r}.
(19)
Then, at t ~ [0, 8), we have
r
v)= max .(.v,f~(t)) l<~i~r
(20)
and [22, p. 61] ~b~-(0+, v) = max (v, J~(0)),
(21)
R( t, v):= { il (v, fi( t)) = ~br(t, v)}.
(22)
i~ R(O,v)
where
As in (18), H(t) denotes the set of vertices of F ( t ) ; let V(t):= { / I f ( t ) e H(t)}.
D. Melzer / On the expressibility of piecewise-linear functions
126
!
+
Lemma 1. ~br( O , 9) : E, ~ E~ is continuous iff (23)
6'~(v) := max{(v, Jj(0)) Ifj(0) = f ( 0 ) } = ~b~-(0+, v) 3
for all v~ E, and i~ R(O, v ) ~ V(O). Proof. The 'only if' part of the condition was stated by Pecherskaya [7, Chapter
II, T h e o r e m 4.1] (see also [20, T h e o r e m 2]) in a stronger but not generally valid form: her necessary condition for a multifunction (19) to be differentiable would be true only u n d e r the additional assumptions that f j ( 0 ) , . . . , f r ( 0 ) are pairwise different points and {f(0)[ 1 <~ i ~< r} = H ( 0 ) . However, her p r o o f also holds in our case. To p r o v e the 'if' part of the condition, let v ~ E, be arbitrarily fixed. Since f , 1 <~ i <~ r, are continuous functions there exists an e > 0 such that, for each u ~ v + eB,, we have R(0, u) c R(0, v) and hence v~
['-)
( - c o n e ( F ( 0 ) - f (0)))*.
i~ R(O.v)c~ v(o)
N o w let {u,} be an arbitrary sequence converging to v, and i,~ R(O, u,) c~ V(O), t = 1, 2, 3 , . . . , be such that ~b}-(0§ u,) = (u,, J~,(O)) = ~b'i,(u,). For every i ~ V(0) satisfying s u p { t l i , = i } = ~ se have i~R(O,v). Exploiting the differentiability properties o f f , l<~i<<-r, it then follows that for the subsequence {U,~}k o f {U,} defined by t'~=min{tli,=i}, tk= ' m i n { t l t > tk-~,il=i}, i k=2,3,..., we have t + t . 4~r(O , u,~) = ,/,,(u,~) ~ ,/,;(v) = ~br(0 ' + , v). Consequently, { ~ b' r ( 0+, u,)}, is split into at most r subsequences with the same limit ~br(0 ' + , v), and the p r o o f is complete. The continuity of 4)]-(0 +, .) yields ~br(0 ' + , ' ) c ~ ( E n ) ; l e t P ' = { K t , . . , K' ' ~ }. b e a n y conically polyhedral partitioning of E, into linearity ranges of ~b~-(0+, 9). N o w let us assume temporarily that dim F(0) = n and define a refinement pO, of P' in the following way:
Kep ~
r
dimK=n,
K=K~
', K ~
~ K'~P'.
(24)
Here pO denotes the partitioning (18) of E, into the normal cones of F(0) with respect to its vertices. Then all cones K ~ pO, are pointed; let U = { u , . . . , up} be the set o f their edge vectors, and for arbitrarily fixed e/> 0
mr(O,e):={y~E,](y,u,)<~br(O, uj)+ec~-(O+,uj),j=l,...,p}.
(25)
Each extreme point Yo of F(0) is uniquely characterized by the index set I(yo), where
(yo, uj)=dpr(O,u~),
j~l(yo),
(yo, uj)
(26)
For i~. { 1 , . . . , r} satisfying f ( 0 ) = Yo we set
e,(yo):=sup{el(u~,yo+eJ~(O))<4~r(O, uj)+e4~'r(O +, uj),j~/(Yo)}.
(27)
It then follows that eo := inf{inf{e~(y)If(0) = Y}IY c- H(0)} > 0. y
i
(28)
1). Melzer / On the expressibility of piecewise-linear functions
127
Theorem I. The mapping F defined by (19) is differentiable from the right at zero iff t 4 dpr(O , 9)" E, ~ E~ is a continuous function. In this case, we can take
F + = e-'M,.(O, e),
r- = e-'r(o),
(29)
where Mr(O, e) is given by (25) and e c (0, eo) is arbitrarily fixed. Proof. The 'only if' part of the condition is an immediate consequence of Proposition 1. Otherwise let e e (0, Co) and v E S, be arbitrarily fixed. We first compute r e), v). For this purpose, we choose a cone K c pO, with v ~_K. Furthermore, let YK e H ( 0 ) be the unique vertex satisfying K c ( - c o n e ( F ( 0 ) - YK ))*, lr := { i [Yr = f (0), r (0 +, v) = (v, J~(0))}, ir ~ Ir be an arbitrary index, and y~ (e) := YK + eJ~(O). Then because Cr(0, ") and r +, 9) are linear functions over K we have
r
(30)
e), v) <~(v, y,,(e)).
On the other hand, if J ( K ) c then we obtain
{ t , . . . , p} denotes the index set of edge vectors of K
(uj, y,k ( e )) = Cr(O, us) + ego'r(O +, uS), j e J ( K ) , (uj, Y,k (e )) <~r
uj) + e~b'r(O ~, uj),
(uj, y~K (e )) < Cr( O, uj) + er
O+, uj),
j~ I(yK)\J(K), j ~ I(yK ).
This leads to Y,K(e) e M r ( 0 , e) and consequently the equality in (30) holds, implying e), v ) = Cr(0, v)+e4a~-(O +, v).
r
(31)
We shall now turn to the 'if' part of the condition. Let a e (0, e] be arbitrarily fixed. F r o m (31) and the continuity of C r ( a , " ), Cr(0, ") and r 9), it follows (cf. (9)) that there exists a v ( a ) c S, such that
dn( l'( a ) + e - l a F ( O), F( O) + e - l a M r ( O, e ) )
= 14,,-(~, v ( ~ ) ) - 6,-(0, v ( ~ ) ) - ,~r (0", v(o~))!. If a is sufficiently small then we can always obtain R ( a , v ( a ) ) c R(O, v ( a ) ) . Hence, for every i~ c R ( a , v ( a ) ) that satisfies C r' ( a + , v ( ~ ) ) = ( v ( a ) , 4 ~ ( a ) ) = ( v ( a ) , 4o(~)) we have
d u ( l'( a ) + e - l a F ( O), F( O) + e - l a M r ( O, e ) )
= I(v(~), f,~ (,~) -f,o ( 0 ) - I,o (,~))1 <~ m a x I l l ( a ) - f , ( 0 ) where a - l o ( a ) ~
O.
-/,(~)ll
= o(~),
D. Melzer / On the expressibility of piecewise-linear functions
128
The p r o o f of T h e o r e m 1 shows clearly what kind of conditions ensure that F is conically differentiable. If F is differentiable from the right at zero then it is positively conically differentiable iff
(uj, J,(O)) ~< 4,~-(0 +, uj)
(32)
for each j ~ l ( f ( O ) ) , i= 1 , . . . , r. In this case we can take
r + = { y l ( u j , y)<~ Or(O ' ~", us),j = 1, . . . . p},
r - = {0}
(33)
thus satisfying eo = o0 and
F§
lim e-IMr(O, e).
(34)
In this sense, e = eo m a y be chosen in every case. Finally, let us relax the a s s u m p t i o n dim F ( 0 ) = n. If dim F ( 0 ) < n we m a y carry out all the above constructions for a m a p p i n g / ~ ( . ) defined by
F(t)=F(t)+Fo,
t e [0, 6),
(35)
with an arbitrarily fixed n-dimensional c o m p a c t convex p o l y h e d r o n Fo. T h e o r e m 1 then holds with r + = e ~Mr(0, e),
F - = e-t/~(0).
4. Representation of piecewise-linear continuous functions as the difference of two piecewise-linear convex functions We should first note that the positively h o m o g e n e o u s function p r o d u c e d by a piecewise-linear continuous function is itself a piecewise-linear continuous function. Thus, without loss of generality, we m a y restrict our considerations to ~ + ( E , ) functions. Let h ~ ~ c ( E , ) , f ~ ~ + ( E , ) , and let F : El ~ 2 E" be defined by
F ( t ) = {x~ E, [(x, u)<~ h ( u ) + tf(u), u ~ S,}.
(36)
If pOy is any conically polyhedral partitioning of E, into pointed linearity cones of both h and f, and U = {u, . . . . , up} is the set of their edge vectors, then F m a y be written in the form
l ' ( t ) = {x ~ E, I(x, us) ~ h(u s) + tf(uj),j = 1. . . . . p}.
(37)
i.e., the values of F ( t ) are either e m p t y or convex c o m p a c t p o l y h e d r o n s d e p e n d i n g on one p a r a m e t e r from the right-hand side of the linear constraint inequalities. If F ( t ) # O , t ~ [ 0 , 8), then so := sup{a ]'r
~ H(fl)3yo ~ H(O) such that
I(y~) c t(yo) ' Vfl c [0, a)} > O.
(38)
D. Melzer / On the expressibility of piecewise-linear functions
129
Let a ~ (0, ao) be arbitrarily fixed and H(a) = { Y l , . . - , Y,}. For each i, we choose a basis matrix U ~ of yi; the column vectors f , hi e E, are defined to consist of the c o m p o n e n t s f(uj), h(uj) corresponding to the rows uj o f U ~. Then we have F ( t ) = co{(U')-~(h~ + tf~)[ 1 ~< i <~ r},
r
+, v)=a-'(dpr(a, v)-ckr(O, v)),
t c [0, ao),
(39)
v~S,,
(40)
so that ~b~-(0 ~, v) is continuous with respect to v. Thus, as a consequence of T h e o r e m 1, we obtain the following result:
Proposition 3. Let the mapping F defined by (36) have nonempty values for t c [0, r
r > 0. Then I" is differentiable from the right at zero.
This proposition covers results obtained in [20; 7, C h a p t e r II, Section 2] where F(0) is a s s u m e d to be a nondegenerate p o l y h e d r o n (i.e., I ' ( 0 ) has no degenerate vertex). Propositions 2 and 3 and formula (29) of T h e o r e m 1 show us what we have to do in order to obtain a representation o f f ~ ~ + ( E , ) in the desired form: we sketch an algorithm for c o m p u t a t i o n of a function h ~ ~c(En) such that 4~-(0 § v) = f ( v ) , v c S, (where I" is defined as in (36)). Step O. We start with any conical partitioning pS of E, into linearity ranges o f f and an arbitrary full-dimensional n o n d e g e n e r a t e convex c o m p a c t p o l y h e d r o n Fo. Let pO be the unique partitioning (18) of E, into linearity cones of ~b(Fo,.), P~ Kr} be defined as in (24), and U = { u l , . . . , up} be the set of edge vectors of Ki, 1 ~< i<~ r. Then for each i e {1 . . . . . r} there exists a unique vertex y, of Fo such that K, c ( - c o n e ( F o - y ~ ) ) * . We choose v~ ~ int Ki, 1 ~< i ~< r, arbitrarily, and define functions F~ : El --> En and a m a p p i n g F : Et ~ 2 E- in the following way:
Fi(t)=yi+tVf(v,),
l<~i<<-r,
F(t) := co{F,(t) ] 1 <~ i<~ r}.
(41) (42)
Then F is given by (19) with I"(0) - Fo. If ~b~-(0§ 9) now coincides with f (which can be checked by m e a n s of the conditions ' § ,uj)=f(u~), Ckr(O
j=l,...,p,
(43)
see (37)) then we have the expected result; otherwise, we c o m p u t e a suitable refinement of p O / a n d the corresponding p o l y h e d r o n as follows. Step 1. Let Uo c U denote the set of edge vectors of the linearity cones of r 9). If there is no u~o~ U \ Uo such that ~b~-(0§ uj)>f(Uio) then go to Step 2. Otherwise, for an arbitrary index ioc R(0, ujo), c o m p u t e ao := min{(u~o , y ~ - y , ) [ i~ R(0, Ujo)},
(44)
take a ~ (0, ao), and define /~o := Fort {yl(y, u~o)<~(y~, u ~ ) - a}.
(45)
130
D. Melzer / On the expressibility of piecewise-linear functions
Then we have U = U, Uo= Uou{u~o}, and ~ = F ~ , i ~ R ( O , u ~ ) . The vertices of/~o that do not coincide with those of Fo may be easily identified by the simplex method (in a number of steps less than or equal to the number of basis matrices of all vertices F~(0), i c R(0, ujo)). All other necessary computations are elementary. For simplicity, we shall revert to our original notation Io, U, Uo. . . . rather than using /zo, U, t~o,.... Lemma 2. Let 4~'r(O~, u j ) = f ( uj), j ~ U \ Uo, and joe Uo be such that ~b~-(0+, u~o)> f(Ujo). Then there exists a cone Koe pO, Kogujo, such that Ko contains at least one edge vector uj, ~: U \ Uo.
Proof. Let ioe R(0, u~o)be an index such that d~-(0 +, Ujo)=(J~, ujo), where J~ denotes the Jacobian of F , 1 ~< i ~ r. Defining K0 := ( - c o n e ( F o - Fg(0)))* we evidently obtain Joe R(0, v), v e i n t Ko. Assuming that R(0, v)={io}, v e i n t Ko, then we have f ( v ) = ( J ~ , v), v e i n t Ko, andf(u~,) < (Jq, uj,,), which contradicts the continuity o f f We thus have R(0, v) # {iot, which implies that Ko has common interior points with at least two cones of PJ. Therefore, if Ko did not contain u~,~ U \ Uo then y~-= F~(0) would be a proper degenerate vertex of Fo, which cannot happen since both Step 1 and Step 2 preserve the assumed nondegeneracy of Fo. Step 2. Let ujoc Uo satisfy 4,~.(0 ~, u~o)>f(u~). Then select an index Joe R(0, ujo) such that ~b).(0+, Ujo) = (J~, ujo), choose ujl ~ ( U \ Uo) c~ ( - c o n e ( l o - F~(0)))*, and perform the same computations with Ujl as with U~oin Step 1. Proposition 4. Every piecewise-linear continuous function f : E, ~ E1 can be expressed as the difference f = g - h of two piecewise-linear functions g, h ~ Co( E, ). I f f is also positively homogeneous then we can take h(. ) = e-t4~,.(O, 9),
(46)
where F is given by (42), and e e (0, eo] is chosen as described in Section 3. Proof. From Lemma 2 the algorithm sketched above stops when Uo= {ut . . . . , ur} (if not before) and then conditions (43) are satisfied. Hence, after a number of steps bounded from above by the starting cardinality of U \ Uo, we obtain a mapping I" described by (19), (42) Which is differentiable from the right at zero and such that f ( . ) = ~b~(9+, 9). This last, together with (29), proves (46). Remark 1. (a) If our starting polyhedron has proper degenerate vertices then the case excluded by Lemma 2 may arise. If this occurs, then choose io, Ko = ( - c o n e ( l o - F~,(0)))* as in Step 2 and carry out the same computations as in Step 1 with an arbitrary vector u cint Ko (best but not necessarily taken from the boundary of some cone KS~ pI). The new vertices of Fo produced in this manner are nondegenerate or, at least, have a proper weaker degeneracy than F~(0). Therefore, with this modification our algorithm is successful in every case.
D. Melzer / On the expressibility of piecewise-linear functions
131
(b) L e t f ~ ~(En), p r be some polyhedral partitioning of E. into linearity ranges of f, and {Yl. . . . , y,} be composed of the vertices and one point from the relative interior of the unbounded edges of each MfE Pf (the polyhedrons MfE P f are assumed to have vertices). Then Zalgaller's method [31 ] yields a minimal (in some sense) representation f = g - h, where g and h are piecewise-linear convex functions in E~, by computing the values h(yi), 1 <~i<~ r, as the components of a solution of a linear programming problem. In higher dimensions, the computer time taken to determine the initial data of the linear programming problem may be considerable; in our method, this may be substantially reduced by a suitable choice of Io. Moreover, the representation of h obtained by our method seems to be more suitable for applications than that given by Zalgaller's method. Using the notation of quasidifferential calculus, Proposition 4 may be reformulated as follows: Corollary 2. j'r C (E.) is a piecewise-linear positively homogeneous function iff there exists a pair 0f(0), -~f(O)c E. of convex compact polyhedrons such that f ( v ) = max (y, v)+ min (z, v). yf~_f(O)
zc,~f{O)
(47)
In this case, Df(O) = (0f(0), ~f(0)) is the quasidifferential of f a t zero. Thus, we have found that ~ , ( E , ) fimctions and equivalence classes of pairs of convex compact sets produced by convex compact polyhedrons are connected by a relation similar to the Minkowski duality existing between positively homogeneous convex functions and convex compact sets.
5. Examples Example 1. L e t f ~ ~ ( E t ) be given by f(x)=
-x-2 x -x+2
ifx~-l, if-l~<x~
1.
Then -x-2t f(x,t)=~-x+2t
i f 0 ~< t < ~ - x , i f 0 < ~ t ~< x , if Ixl ~< t, ift<~0,
is the positively homogeneous function produced by f. We choose F0= co{(1, 0), (0, 1), (-1, 0), (0,-1)} and using the partitioning P] g!ven by the above description of f obtain the conically polyhedral partition pOy of E2 shown in Table 1.
D. Melzer / On the expressibility of piecewise-linear functions
132 Table 1 Cone
Edge vectors
Vertex of Io
f
K, K~ K~ K,, K5 K6
(-1, 0), (-1,1) (-1, 0), ( - I , - 1 ) (-1,)), (1, 1) (-1,-1),(1,- 1) (1,0), (I,l) (1,0), (I,- 1)
(-1,0) (-1,0) (o, l) (0,-1) (1,0) (1,0)
-x-2t --X X --X
-x+2t --X
The c o r r e s p o n d i n g functions t.~ are
F,(t) =
F4(t)=(O,-1)+ t(-1, 0),
( - 1 , O) + t ( - 1 , - 2 ) ,
F2(t)=(-1, O)+ t(-l, O),
Fs(t)=(1, O)+ t(-1,2),
F3(t) = (0, 1 ) + t(l, 0),
F6(t)=(l, 0)+ t(-1, O).
On checking conditions (43) we obtain Table 2. Table 2 u
&(l~, u)
R(0, u)
~ . ( 0 ' , u)
f(u)
(1,0) (-1,0) (1,1) (-1,1) (1, - 1 ) (-1,-1)
1 1 1 1 1 1
5,6 1,2 3,5,6 1,2,3 4,5,6 1,2,4
-!
-1 1 1 -1 -1 1
1 ! 1 -1 3
Application o f Step 2 o f our algorithm with u = ( - 1 , 0), a = 89yields Fo = co{(1, 0), ( - ~', ~), N o w we have the results shown in Tables 3 and (o, 1), ( o , - 1 ) , ' (-12,- 89 4, for which conditions (43) are satisfied. We take e = eo = ~- and obtain
h(x, t) =
4~,(ro; (x, t))
=
4x
ifltl~<x,
4t
if Ixl <~ t,
-4t
i f t ~< - x ,
-2x+2t
ifx<~ - t < ~ 0 ,
L-2x -2t
ifx<~ t~<0.
Table 3 Cone
Vertex of F o
f
Fi
K~ K2
(-89189 (- 89 -89
-x-2t
F~(t) = ( -89189 t ( - 1 , - 2 ) Fz(t) = (- 89 - 89 t ( - l , 0 )
-x
133
O. Melzer / On the expressibility of piecewise-linearfunctions
Table 4 u
6(Fo, u)
R(O, u)
4;r(O+, u)
f(u)
(-1,1) (-1,-1)
1 1
1,3 2,4
-1 1
-1 1
(-1,o)
89
1,2
1
1
Thus, f = ( f + / ~ ) - / ~ is a representation o f f as the difference o f two functions = f + h,/~ ~ ~c(Ez). Finally, restriction to Et x { 1} yields the desired representation f = g - h o f f as the difference o f two piecewise-linear convex functions g, h, where -2x+2 h(x)=
ifx<~-l, 4
3x
iflx[<~l, ifx~> 1,
{-3x g(x)=
i f x ~< - 1 ,
x+4
if[x[<~l,
3x+2
ifx~> 1.
E x a m p l e 2, Consider the example given by formula (2). Take
1o = co{(2, 1), (1, 2), ( - 2 , 2), ( - 2 , - 1 ) , ( - 1 , - 2 ) , (2, -2)}, and use the partitioning PY o f E2 from (2). Then conditions (43) are satisfied directly, and c h o o s i n g e = eo = 2 we obtain the following representation o f the quasidifferential Df(O) = (0f(0), 0f(0)) o f f at 0 e Ez: o_f(O)=~Mr(O, 2 ) = { ( x , y ) l l x l < ~ l , l y l < ~ l , x + y > ~ - ~ } ,
of(0) = - ~,Io.
Acknowledgement
The a u t h o r is indebted to one o f the u n k n o w n referees for calling his attention to the p a p e r by Zalgaller [31].
References
[1] A.D. Alexandrov, "On surfaces which may be represented by a difference of convex functions" (in Russian), Izoest(va Akaderaii Nauk Kazakhskoj SSR, Seriya Fiziko-Matematicheskikh 3 (1949) 3-20. [2] A.D. Alexandrov, "On surfaces which may be represented by differences of convex functions" (in Russian), Doklady Akademii Nauk SSSR, 72 (1950) 613-616. [3] B. Bank, J. Guddat, D. Klatte, B. Kummer and K. Tammer, Non-linear parameteric optimization (Akademie-Verlag, Berlin, 1982). [4] It.T. Banks and M.Q. Jacobs, "A differential calculus for multifunctions", Journal of Mathematical Analysis and Applications 29 (1970) 246-272. [5] F.H. Clarke, "'Generalized gradients and applications", Transactions of the American Mathematical Society 205 (1975) 247-262. [6] V.F. Demyanov, "On connections between Clarke's subdifferential and the quasidifferential'" (in Russian), Vestnik Leningradskogo Universiteta 13 (1980) 18-24. [7] V.F. Demyanov, ed., Nonsmooth problems of optimization and control theory (in Russian) (Leningrad University Press, Leningrad. 1983).
134
D. Melzer / On the expressibility of piecewise-linear functions
[8] V.F. Demyanov and A.M. Rubinov, "'On quasidifferentiable functions", Soviet Mathematics Doklady 21 (1980) 14-17. [9] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable mappings", Mathematische Opera. tionsJbrschung und Statistik, Series Optimization 14 (1983) 3-21. [10] Tuy Hoang, "Global minimization of a difference of two convex functions", Preprint of the Institute of Mathematics of Hanoi, 1983. [l l] P.F. Kough, "The indefinite quadratic programming problem", Operational Research 27 (1979) 516-533. [12] S.S. Kutateladze and A.M. Rubinov, Minkowski duality and its applications (in Russian) (Nauka, Novosibirsk, 1976). [ 13] Le van Hot, "On the differentiability of multivalued mappings I, I I", Commentationes Mathematical Universitae Carolinae 22 (1981) 267-280, 337-350. [ 14] R. Mifflin "Semismooth and semiconvex functions in constrained optimization", S I A M Journal on Control and Optimization 15 (1977) 959-972. [15] S. Miri~a, "The contingent and the paratingent as generalized derivatives for vector-valued and set-valued mappings", Preprint Series in Mathematics 31 (1981), University of Bucharest. A version may also be found in Nonlinear Analysis 6 (1982) 1335-1368. [ 16] A.M. Neumark, Normierte AIgebren, (Deutscher Verlag der Wissenschaft, Berlin, 1959). [17] F. No~,i~ka, J. Guddat, H. Hollatz and B. Bank, Theorie der linearen parametrischen Optimierung (Akademie-Verlag, Berlin, 1974). [18] E.A. Nurminski, "'On the differentiability of multivalued mappings" (in Russian), Kibernetika 5 (1978) 46-48. [19] E.A. Nurminski, "Bibliography on nondifferentiable optimization", Working Paper WP-82-32, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1982). [20] N.A. Pecherskaya, "On the differentiability of multivalued mappings (in Russian), Vesmik Leningradskogo Universiteta 7 (1981 ) 115-117. [21] L.N. Polyakova, "Necessary conditions for an extremum of quasidifferentiable functions" (in Russian), Vesmik Leningradskogo Universiteta 13 (1980) 57-62. [22] B.N. Pshenichnyi, Necessary conditions for an extremum (Nauka, Moscow, 1969 and 1982); English translation by Dekker, New York, 1971. [23] H. Radstrrm, "An embedding theorem for spaces of convex sets", Proceedings of the American Mathematical Society 3 (1952) 165-169. [24] R.T. Rockafellar, Convex analysis, (Princeton University Press, Princeton, New Jersey, 1970). [25] R.T. Rockafellar, "The theory of subgradients and its applications to problems of optimization. Convex and nonconvex functions". In the series Research and education in mathematics ( Heldermann-Verlag, Berlin (West), 1981 ). [26] R.T. Rockafellar, "'Favourable classes of Lipschitz continuous functions in subgradient optimization", Working Paper WP-81-1, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1981). [27] A.M. Rubinov, Superlinear multivalued mappings and their applications to problems of mathematical economy (in Russian) (Nauka, Leningrad, 1980). [28] K. Tammer, "Mrglichkeiten zur Anwcnding der Erkenntnisse der parametrischen Optimierung fiir die L/~sung indefiniter quadratischer Optimierungsprobleme", Mathematische Optimierungsforschung und Statistik 7 (1976~ "~9--222. [29] Yu.N. Tyurin, "Mathematical formulation of a simplified model of production planning" (in Russian), Ekonomika i Matematicheskie Metody 1 (1965) 391 -410. [30] R.S. Womersley, "Optimality conditions for piecewise smooth functions", Mathematical Program. ming Study 17 (1982) 13-27. [31] V.A. 
Zalgaller, "'On the representation of functions of two variables as a difference of convex functions" (in Russian) Vestnik Leningradskogo Universiteta 1 (1963) 44-45.
Mathematical Programming Study 29 (1986) 135-144 North- Holland
POSITIVELY HOMOGENEOUS QUASIDIFFERENTIABLE FUNCTIONS AND THEIR APPLICATIONS IN COOPERATIVE GAME THEORY S.L. P E C H E R S K Y Institute for Social and Economic Problems, USSR Academy of Sciences, ul. Voinova 50-a, Leningrad 198015, USSR
Received 28 December 1983 Revised manuscript received 15 April 1984 One interesting class of quasidifferentiable functions is that formed by the family of positively h o m o g e n e o u s functions. In this paper, the author studies the properties of these functions and uses them to derive some new results in the theory of cooperative games.
Key words: Homogeneous Quasidifferentiable Functions, Cooperative Games with Side Payments, Subdifferentials.
I. I n t r o d u c t i o n
We shall begin by recalling the definition o f quasidifferentiability (for more information on the properties o f quasidifferentiable functions see [5]). Let a finitevalued function f : S ~ Et be defined on an o p e n set S c E,. D e f i n i t i o n 1 [5]. A function f is said to be quasidifferentiable at a point x ~ S if it is differentiable at x in every direction g ~ En and there exist convex c o m p a c t sets Of(x) c En and ~f(x) c 17.. such that
Of(x) Og
-
max (v,g)+ rain (w,g)
vc~_f(x)
wc~f(x)
VgeE..
(1)
The pair o f sets Df(x) =[_~f(x), ~f(x)] is called a quasidifferential o f the function f at the point x and the sets Of(x) and -df(x) are called a subdifferential and a superdifferential, respectively, o f f at x. In what follows we shall consider a positively h o m o g e n e o u s function f, i.e., f ( A x ) = Af(x)
VA/>0.
(2)
Let K be a convex cone in E, with a c o m p a c t base and a n o n - e m p t y interior. We shall suppose that T is the base o f this cone, where dim T < n ; let ri T denote the relative interior o f the set T, and R.r the affine hull of T. Definition
2. A function f : T ~ E1 is said to be quasidifferentiable at a point x ~ ri T 135
S.L. Pechersky/ Positive homogeneity and game theory
136
if it is differentiable at this point in every direction g 9 Rr= R r - x c o m p a c t sets _aft(x), 0Tf(,x) c R r exist such that
Of(x) = m a x ( v , g ) + rain ( w , g ) Og vc=a.Tf(x) WE3Tf(X)
and convex
Vg 9
The following proposition is an i m m e d i a t e corollary of these definitions.
Proposition 1. Let a function f : K ~ E~ be quasidifferentiable at a point x 9 int K. Then the function fl r(x.p), where T(x, p) = {z 9 K I (z - x, p) = 0}, p 9 E,, is quasidiffer. entiable at x, and its quasidifferential is defined by the pair [A, B], where A = Prp(~f(x)),
B = Prp(af(x)),
and PrpC represents the orthogonal projection of a set C on the hyperplane H.
={z 9 E.l(z,p)=O}.
2. Qunsidifferentinbility of n positively homogeneous extension Let us s u p p o s e that the function f : K ~ El is the positively h o m o g e n e o u s extension to the cone K of a function f defined on the set T(x,x), x ~ i n t K . Let f be quasidifferentiable at x.
Theorem 1. The function f is quasidifferentiable at x and moreover
Df(x)=[O_f(x) t ' X f~(,x ) "of(x)] Proof. Since f is quasidifferentiable, the equality
Of(x) c3h
= max ( v , h ) + vc~.f(x)
rain ( w , h ) ,
we~J'(x)
(3)
holds for every direction
h ~ Hx = { v E E . l(v, x)=0} and
a_f(x), ~f(x) c Hx.
(4)
Let us consider an arbitrary direction g c E. and s u p p o s e that
g~Ax
for every A ~ El.
Consider lim ( f ( x + h g ) - f ( x ) ' ~ . A /
F(x, g) = ~ + o \
(5)
S.L. Pechersky/ Positivehomogeneityand game theory
137
It is clear that ( x + Ag)
IIx[I 2
T(x, x),
(x + ;tg, x)
where Ilxll 2 = (x, x). Let
h
Ilxl12 ( x + g ) - x . (x+g,x)
(6)
Then h e H~ and we have the following representation: Ilxll 2
(x+Ag)
(x + ;tg, x)
- x + l.th
where A=
,llxll ~
Ilxll=+ (g, x ) -
Iz(g, x)"
(Note that h # 0 because g ~s Ax.) It is clear that A ~ +0 itt/z -* +0, and thus we have +
= lim
( f ( x + p . h ) - f ( x ) +f(x+lzh) (g' x)~
A . +o \
= l i m ~ ( f(x
I-~ /
a
+ tzh)-f(x) . Ilxll2+(g,x)-l~(g,x). ) + f(x) " (g,x) ~,
lieu ~
Hence for every g ~ Ax the derivative
Of(X)
af(x) Jlxll2+(g,x) = ah " Ilxll =
ag
af(x)/Og exists and
l_f(x),
~
Ilxll ~
, tg, x)
(7)
where h is defined by (6). From (3) we then get
ay(x) -
og
Ilxll2+(g,x) [ Ilxll 2
max ( v , h ) + min
o~-~.~(~,
..~s,~)
f(x)
(~, h)]+ i1-~ (g, x).
Since the function af(x)/ag is positively homogeneous in g, it is enough to assume that g satisfies the condition Ilxll=+(g, x ) > 0. Then, taking (6) into account, we have a fag (x)
max (v,g ,,,_,,f,x) \
(g,x)'~+ rain ( w -, g - x
-xl-~)
,,,~:,x)\
(1g-,~x )) \ + f ([-~(g,x). x)
S.L. Pechersky / Positive homogeneity and game theory
138
Since Of(x), Of(x)c H~,
__~f(x)= max Og
(v,g)+
vcOf(x)
=
min ( w , g ) + f ( x ) "
I - ~ ~g'
we~f(x)
max
(v, g ) +
min
vog_f(x)+(ftx)/lixllZ)x
x)
(w,g).
(8)
wc~f(x)
N o w we have to check that this formula holds for g ~ Hx or g = Ax for some A ~ 0. If g ~ Hx, then (g, x) = 0 and max
oc~.f(x)+(f(x)/llx]12)x
(v, g) = max (v, g). Yes,f (x)
Let us s u p p o s e that g = Ax for some A ~ O. Then
3g
i~
/
tz
/
But from (4) we have
v~_f(x)+(f(x)/llxllmax ~.~ (V. hx) + ~;~a,omin(w, hx)= tti[{ f (~x ) x, Ax ) = AS(x), thus proving the theorem.
3. Game-theoretical applications of quasidifferentiable functions N o w let us consider the game-theoretical applications o f quasidifferentiable functions. The study o f so-called fuzzy or generalized games is currently attracting a great deal o f interest. We will not go into the reasons for this here (but see J.-P. Aubin [173 ] on this topic): we shall simply recall the main definitions. Let I = 1: n be a set o f n players. We can then identify an arbitrary set S c I, called a coalition, with a characteristic vector e s, where e = ~ = (1 . . . . , 1 ) e E, and e s is the projection o f vector e on the subspace
R s = {xe E, lx, = 0 f o r i~ S}. Thus the set o f all coalitions is {0, 1}~. The set o f generalized (fuzzy) coalitions is, by definition, the convex hull co{0, 1}" = [0, 1]". Hence a generalized coalition r e [0, 1]" associates with each player i~ I a participation rate r, c [0, 1], which is a n u m b e r between 0 and 1.
Definition 3 [3]. An n-person generalized cooperative game (with side payments) is defined by a positively h o m o g e n e o u s function v : [ 0 , 1 ] " ~ E~ which assigns a payoff v ( r ) c El to each generalized coalition r e [ 0 , l]". The function v is called the characteristic function o f the game.
S.L. Pechersky / Positive homogeneity and game theory
139
Since v is positively homogeneous we can extend v to E~. by setting
v(0) =0,
for r e E ~ + , r # 0 . We shall take the vector space E. as the space of outcomes (or multi-utilities). Vector x = ( x ~ , . . . , x . ) c E~ represents the utilities of the players; the utility of the generalized coalition z is given by (r, x) = ~=1 ~ix~. If S c / , then this utility is equal to (eS, x)=Y.i~sXi. It is well-known (see, for example, [1, 2]) that the directional derivative may be used to define the solutions of a game. In an extension of this idea, J.-P. Aubin has proposed that the Clarke subdifferential could be used to define a set of solutions to locally Lipschitzian games, i.e., games with a locally Lipschitzian characteristic function. Definition 4 [3]. We say that the Clarke subdifferential ac~V(~) of v at ~ is the set of solutions S(v) to a locally Lipschitzian game with characteristic function v.
The following properties of the set S(v) are worthy of note: (a) S(v) is nonempty, compact and convex, (b) S(v) is Pareto-optimal, i.e., if x e S(v), then ~=1 x~ = v(U, (c) S(Av)= AS(v) for A ~ E,, (d) S ( u + v ) c S ( u ) + S ( v ) , (e) If v is superadditive, then S(v) coincides with the core of v, (f) If v is continuously differentiable at ~, then S(v)= Vv(U, i.e., S(v) contains only one element which coincides with the generalized Shapley value of the game v. Definition 5. A generalized game is said to be quasidifferentiable if its characteristic
function is quasidifferentiable. Remark 1. Since quasidifferentiability is essential only on the diagonal of cube [0, 1]" then from Theorem t and the positive homogeneity of function v it is sufficient to assume that v is quasidifferentiable only at L Let v be quasidifferentiable and its quasidifferential be [_0v(]), ~v(~)]. From Proposition 1 we deduce that the function v 1= VITal.I) is quasidifferentiable at ~ with a quasidifferential defined by the pair [Pr~ _0v(U, Prl 0v(U]. It is clear that the positively homogeneous extension ~ of the function v ~ on E + coincides with v; the quasidifferential of this function at ~, which may be found using Theorem 1, is
[Pr~ _3v(~)+ H - ~ ~' Pr~
140
S.L. Pechersky / Positive homogeneity and game theory
It is also clear that this pair is in some sense 'Pareto-optimal', since for x e Pr~ _av(9) + (v(9)/Ilzl2)9 and y e Pr~ 0v(9) we have
[v(9)
)
(xi + Yi ) : (x + y, "~l = \ll--~-~'tl, 9 = v ( ~ ) i=l
(because Pr, _or(9), Pr~ 0v(9) = H 0 . Let D % ( 9 ) be a quasidifferential o f v at 9 which is Pareto-optimal in the sense described above. We then have the following definition: Definition 6. The quasidifferential D"v(~) o f the characteristic function v at the point 9 is called a quasisolution o f the game. There are at least two reasons for using the term 'quasisolution'. Firstly, it is k n o w n that quasidifferentials are not unique and are defined up to the equivalence relation. We should also note that a locally Lipschitzian function is not necessarily quasidifferentiable and vice versa. Moreover, it is obvious that a function which is both locally Lipschitzian and quasidifferentiable may have both a directional derivative and an u p p e r Clarke derivative, which are essentially different quantities. Quasisolutions also possess certain properties which go some way towards justifying their name. 1. If a characteristic function v is continuously differentiable at 9, then D"v(~) = IVy(9), 0], where Vv(9) is the gradient o f v at 9 and a quasisolution can be identified with the generalized value o f the game. 2. If v is concave (i.e., superadditive), then D"v(~) = [0, av(~)], where av(~) is the superdifferential o f the concave function v and the quasisolution D'~v(9) can be identified with the core o f the game. 3. A quasisolution is linear on v. Remark 2. In general, if one element o f a quasidifferential is zero, then it is natural to regard the c o r r e s p o n d i n g quasisolution as a solution o f the game. Finally, using the properties o f quasidifferentials we can find quasisolutions o f the m a x i m u m and m i n i m u m games o f a finite n u m b e r o f quasidifferentiable games, and thus we m a y speak about the calculus o f quasisolutions. Let us n o w consider the directional derivative
av( ) = Og
lim
~~+o
v(9+ Ag)- v(9) A
This value shows the marginal gain o f coalition 9 when a new coalition g joins the existing coalition 9. (We do not assume that g e E ,+, and hence this vector can have negative components. Such c o m p o n e n t s m a y be interpreted as the ' d a m a g e ' caused to the c o r r e s p o n d i n g players or alternatively as an indication that they should leave the whole set o f players).
S.L. Pechersky / Positive homogeneity and game theory
141
Since representation (1) holds for a quasidifferentiable game, it is interesting to consider the vectors x(g) and y(g) at which the corresponding maximum and minimum are attained. Since _0v(~) and av(~) are convex compact sets, the sets Arg max{(x, g)lx ~ _0v(~)} and
Arg min{(y, g)lY ~ ~v(~)}
consist of only one element for almost every g ~ S n-I where S n-t is the unit sphere in E,. Let G ( v ) denote the set of such g, and z(g) = x(g) +y(g). Note that if the function v is both locally Lipschitzian and quasidifferentiable and also satisfies some additional property (which is too cumbersome to describe h e r e - - s e e D e m y a n o v [4]), then the points z(g), g c G(v), describe all extreme points of the Clarke subdifferential of o at ~ (the set of solutions proposed by J.-P. Aubin).
4. Solution of quasidifferentiable games We shall now define the solution of a quasidifferentiable game, which we shall call an st-solution. We require the following additional definition: Definition 7 [6]. Let K be a compact convex set in E,. The Steiner point of the set K is the point
s(K)=Zfs,
o ,aP(K'a)dA'
(9)
where )t is the Lebesque measure on the unit sphere S "-I in E,, trn is the volume of the unit ball in E,, a is a variable vector on S n-1 and p(K, .) is the support function of K. Note that we always have s ( K ) e K and s ( - K ) = - s ( K ) . entiable characteristic function with quasidifferential
Let v be a quasidiffer-
D'~v(~) = [0v(~), 0v(~)]. Definition 8. The st-solution of a quasidifferentiable game with characteristic function v is the vector st(v) defined by the equality st(v) = s( O_v(~)) + s('Ov(~) ).
(10)
We first have to prove that this definition does not depend upon the pair defining a particular quasidifferential v (such a quasidifferential may not even be 'Paretooptimal'). This follows immediately from the linearity on K (with respect to vector addition of sets) of the function s defined by (9), and from the following obvious property of quasiditierentials: if [A, B] is a quasidifferential of v at x, then the pair
S.L. Pechersky / Positive homogeneity and game theory
142
[At, B~] is also a quasidifferential of v at x if and only if
A-BI=A1-B.
(11)
Using the equality (11) and the linearity of s we get
s(A-B~)=s(A~-B)
r
s(A)-s(B~)
=s(A~)-s(B) r
s(A)+s(B)=s(AI)+s(Bt).
The vector st(v) can be interpreted as the vector of average marginal utilities received by the players. We shall now describe some properties of st-solutions. Proposition 2. If a generalized game is quasidifferentiable, then:
1. The mapping st: v-~st(v) is linear in v. 2. The st-solution is Pareto-optimal, i.e., (st(v)), = v(ll). i=1
3. l f v is continuously differentiable, then st(v) = V v(~) and the st-solution coincides with the generalized Shapley value of v. 4. I f v is concave (superadditive), then st(v) is the Steiner point of the core of the game. The p r o o f of this proposition follows immediately from Proposition 1, Theorem 1, and the definition of quasisolutions. Now let us prove two more important properties of an st-solution: it satisfies the 'dummy' axiom (Theorem 2) and is symmetric (Theorem 3). Let a quasiditterentiable game have characteristic function v such that v ( x ) = v(x ~\i) for every x 9 E~+ . Then for every g c E, we have av(~)0g=/imo~ (v(~+Ag)-v(~))A / - lira { v~-
-A-+0\
~v(7 t\i)
x- v(J\')/- ~g'\'
(12) "
It is clear that the function ~3= v]n'\' is quasidifferentiable at ~t\i and its quasidifferential at this point is defined by the pair [Pr_0v(U, Pr~v(U], where Pr A is the projection of A on R t\i. Hence, from (12), this pair is the quasidifierential of v at ~. Thus if xcPr(_0v(~)) and yCPr(~v(U), then x ; = 0 , y , = 0 . From this we have (st(v)), = 0 and the following theorem holds. Theorem 2. I f a quasidifferentiable game with characteristic function v is such that v(x) = v(x TM)for every x 6 [ 0 , l]", then (st(v))i =0.
S.L. Pechersky / Positive homogeneity and game theory
143
In other words, the function st(-) satisfies the so-called d u m m y axiom, which states that a ( d u m m y ) player who gives nothing to any coalition will also receive nothing. " N o t h i n g will c o m e o f n o t h i n g " Shakespeare, King Lear Suppose n o w that v is quasidifferentiable and 7r is a permutation o f the set o f players l = l : n . We shall define the game zr*v as follows: 7 r * v ( x ) = v(x,.
Let
',1), . . . , x .
,,,).
(Tr-lx)i = x .
Theorem
3. The
,.) and (Trx)i = x,.~.
st-solution is symmetric, i.e., st(It*v)
= zr st(v).
Proof. If [_~v(~), 0v(~)] is a quasiditierential o f v at ~, then
aTr*v(~) limo( Tr*v(~+hg)-Tr*v(~)) Og h = lim ( A~o\
v(rr-l~+Tr-l(Ag))-v(zr i.fl)) h
= lim (V(~+A(rr-~g))-v('~)) Ov(~) ~,~+o h -O(rr-l g)" Hence
o~r*v(~) -
-
Og
- max (z, z r - l g ) + min (y, 7r-lg) ~c_~v(~)
yc~v(~)
= max (Trz, 7r(Tr-lg))+ min (Try, 7r(Tr-lg)) zc#vO)
-
max
zc ~r(0v(l))
),cSv(I)
(z,g)+ yc min (y,g). ~r(Sv(~))
Thus [zr(_Ov(~)), ~r(0v(~))] is a quasidifferential o f rr*v at ~. Since the Steiner point is invariant u n d e r orthogonal transformations o f E., then s(~O_v(~))
= ~-s(O_v('~)),
s(~-~v('O)) = ~s(-~v(~))
and hence St(Tr*v) = ~" st(v),
(13)
which is the proposition o f the theorem. It is clear that the above formula holds for every orthogonal transformation o f E, which leaves the vector ~ unchanged.
144
S.L. Pechersky / Positive homogeneity and game theory
References [1] J.-P. Aubin, Mathematical methods of game and economic theory (North-Holland, Amsterdam, 1979). [2] J.-P. Aubin, "Cooperative fuzzy games", Mathematics of Operations Research 6 (1981) 1-13. [3] J.-P. Aubin, "Locally Lipschitz cooperative games", Journal of Mathematical Economics 8 (1981) 241-262. [4] V.F. Demyanov, "'On a relation between the Clarke sub-differential and the quasidifferentiar', Vestnik Leningradskogo Universiteta 13 (1980) 18-24 (translated in Vestnik Leningrad University Mathematics 13 (1981) 183-189). [5] V.F. Demyanov and A.M. Rubinov, "On some approaches to a nonsmooth optimization problem" (in Russian), Ekonomika i Matematicheskie Metody 17 (1981) 1153-1174. [6] W.J. Meyer, "Characterization of the Steiner point", Pacific Journal of Mathematics 35 (1970) 717-725.
Mathematical Programming Study 29 (1986) 145-159 North-Holland
QUASIDIFFERENTIABLE MAPPINGS DIFFERENTIABILITY OF MAXIMUM
AND THE FUNCTIONS
N.A. P E C H E R S K A Y A Institute for Social and Economic Problems, USSR Academy of Sciences, ul. Voinova 50-a, Leningrad 198015, USSR Received 28 December 1983 Revised manuscript received 10 October 1984 In this paper the author discusses what is meant by a derivative of set-valued mappings, and generalizes some existing definitions. The problem of the differentiability of m a x i m u m functions is taken as an example.
Key words: Set-Valued Mappings, Quasidifferentiability, Radstr6m Theorem, M a x i m u m Functions.
1. Introduction
Reviews of the literature dealing with set-valued mappings [7, 13, 20] show that the question of their differentiability has so far received relatively little attention. However, although there is not as yet any general agreement on what is meant by a derivative in this context and the applications of any such derivative have not been clearly defined, the steps already made in this direction justify further investigation. Moreover, they show that the concept of differentiability as applied to set-valued mappings is not simply a generalization of the corresponding concept from classical calculus but is of a more informal character. On the basis of this concept it is possible to develop a uniform approach which can be used to study optimization methods, to describe the properties of economic models, to simplify the proofs of certain extremal theorems, to link some important definitions in convex analysis and optimization theory, and to formulate and solve variational problems, etc. (see, e.g., [4, 11, 15,21,24]). The approach described in this paper is related to the research described in [2, 4, 17, 25], where the definition of differentiability is based on the Redstr6m theorem and is an extension of the analogous definition for a function in a normed space. We take as an example of the use of this concept the question of the differentiability of a maximum function defined over a differentiable set-valued mapping.
2. Definition of a differentiable mapping
Let us consider a mapping G: ~-~ ~ ( E , ) , where l'l c Em is an open set, ~(Em) is a collection of nonempty convex compact sets in E,, with Hausdorff metric OH. 145
N.A. Pecherskaya/ Ouasidifferentiable mappings
146
Let us remind the definition of the space of convex sets M(E,,). The standard definition of this space [14] as a set of classes of equivalent pairs [A, B], A, Be ~ ( E , , ) , means that it can be considered as a linear space in which ~ ( E m ) is a generating cone up to the isomorphic relation. The equivalence class containing the pair [A, B] will be denoted by (A, B). The pairs [A, B] and [C, D] are said to be equivalent iff A + D = B + C. Multplication by a number A is defined as follows: A(A, B)
~(AA, AB), = [,
A/>0, a <0.
The function 8n((A, B), (C, D)) =- p u ( A + D, B + C) defines a metric on the space M(E,,). Any equivalence class which contains the pair [0, 0], e.g., the collection of all pairs {[D, D]ID 9 Le(E,.)}, is said to be a zero of M(E,.). The relation II(A, B)II = 8u((A, B), (0, 0)) defines a norm in M(E,,). Fix x 9 and v 9 E,. Let us also consider a function (~:/2 ~ M ( E , . ) which maps every x 9 onto the space of convex sets M(Em) : G ( x ) = (G(x), 0). The directional differentiability of the function (~ at x 9 12 in a direction v 9 En is equivalent to stating that there exists an element ( ~ ' ( v ) 9 M ( E , . ) such that
r
~(x)+ac~'(v)+~.o(~),
~>0,
(1)
where ([lax,~(~)ll/~)-, 0, ,,-, 0. Denote G~(v)=(G+~(v), G~(v)), ox.o(a)=(ox+~(cr), o~-v(ct)). Then (1) takes the form
(G(x + ~v, o) = (G(x), O)+ ~(G+~(v), G~(v))+ (o~*~(~), o~.o(~)). Using the rules for addition and multiplication by a number in the space M(Er,), (1) can be rewritten in the form
pt~( G ( x + ctv) + ctG~( v), G(x) + ctG+~(v) ) = o~.~(ct), where o~.o(~)=lla~.o(~)ll equivalence relation.
(2)
and the pair [G~+(v), G~(v)] is defined up to the
Definition 1. A mapping G is said to be differentiable at x 9 O in a direction v 9 En if -tthere exist non-empty convex sets Gx(v), G ~ ( v ) 9 Ze( E,.) such that for a sufficiently small tr > 0 condition (2) holds. The equivalence class CJ"( v) = ( G+~(v), G~-( v)) is then said to be the derivative of the mapping G at x in the direction v.
We shall now reformulate this definition in terms of support functions. A mapping p : . O ~ ~ ( E , . ) is called the support function of a mapping G if it maps every x 9 .O onto a support function pG(x,') of the set G(x). Here p c ( x , ' ) is an element of the collection ~ ( E , . ) of all finite continuous sublinear functions on E,. [23]: Pc~(x, u)= maxz~(x)(z, u) V u 9 E,,. Since the support functions are positively homogeneous, they are completely defined by their values on the sphere St = [u 9 E,, Ill ull = x}. The track of a support
N.A. Pecherskaya / Quasidifferentiable mappings
147
function on the sphere S~ will also be denoted by p, using the equivalent notation pc~)(u)=- p~7(x, u), u e S~. For an arbitrary u e S~ consider the limit
lim (PO(X+aV,u)-po(x,u)) = OpG(X,U) c, ~+o
Ot
cgV
(3)
We shall call this the uniform derivative of the function Po at the point x in the direction v if the convergence is uniform with respect to u e S~. This means that there exists a function Opt(x, u)/Ov such that for every v e En and a sufficiently small a > 0,
p c ( x + av, u ) - p c ( x , u ) - a
~pc(x, u) Ov
- ox,~(a, u),
(4)
where (ox.v(a, u)/a)-->O, a - > +0 uniformly with respect to u. One fact should be mentioned in connection with the inclusion ~ s , ( E m ) c C(S1), where C(S~) is the space of all functions with a uniform norm which are continuous on S~. Let A, D e ~(E,n). It is shown in [15] that the following equality is satisfied:
pn(m, D ) = [[PA--Po[IC(SO,
(5)
where pA and Po are the support functions of A and D, respectively. Going back to (2) and making use of (5), we get m a x l P c ~ + ~ + , c z ~ ) ( u ) --pGC~)+~C.+~o~(U)[= o ( a ) , u~S I
(6)
where o(a )/ a -->0, a ~ +0 (o(~) =- o,.v(cr)). As a consequence of Minkowski duality [14], which allows us to identify the support function of a convex set with the set itself, relation (6) can be rewritten in the form
]pc(~+~o)(u) - pc(x)(u) - a(pc~(o)(u) -Po:<~>(u))I-- O(a, u), where (o(a, u ) / a ) ~ O uniformly with respect to u e S~. Taking the limit as a ~ +0 (uniformly with respect to u) in the equality above, we obtain
~p~(x, u) --p~+~(~)(u)--pG:~o~(u) av
VueSi.
(7)
Thus, we have proved the following:
Proposition 1. A mapping G is differentiable at x c 12 in a direction v c E, if the uniform derivative of the support function Pa at x in this direction exists, and non-empty convex compact sets G~ ( v ) , G ~( v ) exist such that (7) holds. The pair [G~+(v), G~(v)] defined up to the equivalence relation is the derivative of the mapping G.
N.A. Pecherskaya / Ouasidifferentiable mappings
148
Remark I. It is clear from the above definition that the derivative of a differentiable mapping G at fixed x, v can be described as a function of support direction u (using Opo(x, u)/Ov). The representation of the derivative of the function Pa as the difference of two convex functions (the sum of convex function and a concave function) suggests that this derivative should be called partially quasidifferentiable with respect to u, by analogy with the terminology introduced by Demyanov and Rubinov [6]. In the same way, the mapping defined by quasidifferentiable support functions were initially called quasidifferentiable. As ever, on reflection we realized that the quasidifferential introduced by Demyanov and Rubinov [6] and the derivative of differentiable mappings defined above are different concepts, and therefore decided to omit the prefix 'quasi' in order to avoid confusion. In the case of a mapping with a convex epigraph the function Op~(x, u)/Ov with fixed x, u arbitrary v, defines the locally conjugate mapping studied by Pschenichny [21, 22]. It is shown in [ 19] that the class of differentiable mappings is closed with respect to multiplication by a real number and to addition; this paper also proves the differentiability of 'complex' mappings and considers some classes of differentiable mappings which are interesting from the point of view of applications.
3. Derivatives, tangent cones, and sets of feasible directions We shall now demonstrate the relation between the derivative of a mapping and some important concepts in optimization theory. Fix x e D, v e E,. Choose z e G ( x ) and consider the cone K ( z ) normal to the set G ( x ) at z:
K (z) = {u e Eo [(z, u) = p~,~.~(u)}. The cone
-K*(z)
= { u * e Em[(u, u * ) ~ < 0 V u e K ( z ) }
is then a tangent cone to the set G ( x ) at z. The role of the tangent cone in constructing necessary conditions for an extremum needs no elaboration; the tangent cone is also the main indicator of the local behavior of a mapping. However, it is sometimes more convenient to describe this behavior in terms of feasible directions. In particular, some authors [ 1, 11, 24] use the notion of a set of feasible directions, with its definition modified in some way, to study the derivative of a mapping. We shall define this notion as follows. A set of vectors g e Em is called the set of feasible directions F(x, z, v) at point z e G ( x ) in direction v if there exists a vector function o ( a ) e En, ([[o(a)[l/a)~ 0 such that z + ag + o( a ) e G( x + av ) for a > 0 sufficiently small. If G ( x ) = G for all x e / 2 , the set F ( z ) = F(x, z, v) does not depend on x and v, and is a cone coinciding with the tangent cone.
N.A. Pecherskaya/ Quasidifferentiablemappings
149
If the support function of a mapping is uniformly differentiable, the feasible directions form a nonempty, convex, closed set which can be represented in the form [16]:
F(x,z, v)={gcEm'(u'g)<~OPa(X'U)VueK(z)) " B y
(8)
The function Opt(x, u)/Ov is sublinear on the cone K(z). It is clear that this function is also positively homogeneous, and its convexity follows from the relations pG(X, Ul)+pG(X, U2)=pG(X, UI+U2), where ul, u2e K(z). Thus, for u=u,+u2 we have =~+o\lim (pG(X+~
OpG(X,
)
o
r
<-lira( pa(x + ctv' ul) a+ pG(x + av' = ~+o\lim (pG(X + av,
Ul)--pG(x ,or
+ ~-+o\lim (pG(x +av,
U2)-pG(Xo,t
Pc;(X' Ul)+ Pc;(x'
Ul))
U2))
OpG(x, u,) + OpG(X,U2)
OV
c3V
From (8) we deduce that the set F(x, z, v) is a subdifferential of the sublinear function
OpG(X, U) /~(x, u)
=)
OV +oo,
' ueK(z),
(9)
u ~ K(z),
which in its turn can be rewritten in the form
/~(x, u)= sup (g, u)
Vue K(z).
(10)
geT(z)
We shall now establish a relation between the derivative of a differentiable mapping, the set F(z) and the tangent cone.
Theorem 1. If mapping G is differentiable at point x in direction v then F ( x , z, v ) + G ~ ( v ) = G+~(v) - g * ( z ) .
(11)
Proof. Consider (7) for u ~ K(z). Invoking (9) and (10), we have /~(x, u)+pG?~(~,~(u)=PG~(v)(U) VU~ K(z).
(12)
150
N.A. Pecherskaya / Quasidifferentiable mappings
Equation (12) is equivalent to the subdifferential equality (Theorems 2.2, 2.3 in [23]): Ofi + cgp~-,~) - K*( z ) = cgpa' ~v) - K*( z ).
(13)
Using Minkowski duality and F ( z ) = c9/~, (13) can be rewritten in the form F(x, z, v)+ G ~ ( v ) - K * ( z ) = G+~(v) - K * ( z ) .
(14)
Comparing the definitions of the set F ( z ) and the cone - K * ( z ) , it is not difficult to see that F ( z ) - K * ( z ) = F(z). Since 0 e [ - K * ( z ) ] , the reverse inclusion F ( z ) K * ( z ) ~ F ( z ) is always true, and we immediately obtain (11) from (14). We shall now introduce one more notion, which is closely related to the derivative of a mapping. A mapping G is said to allow first-order approximation at x in a direction v with respect to a closed subset R (x) of the set G ( x ) if for arbitrary convergent sequences {ak} , {Zk} such that k -* ~ , a k "~ "Jr-O, Zk ") Z, Z k (: G ( x -~- Otkl)), the following representation holds: Z k ~ Z -[- Otkg k + O( O:k ),
( 1 5)
where gk ~ F(X, z, v), z ~ R ( x ) . Remark 2. Our definition of a mapping allowing first order approximation is slightly different from one defined in [5]. A point z is called a point of smoothness of the set G ( x ) if the normal cone K ( z ) at z consists of a vector Uo normal to G ( x ) at z. For example, points belonging to the interior of a face of a polyhedral are points of smoothness in this sense. Theorem 2. Let mapping G be differentiable at x ~ O in a direction v, Let R ( x ) c G ( x ) be a closed set of the points of smoothness o f the set G ( x ) . Then G allows first order approximation at x in the direction v with respect to R ( x ) . Proof. Let {Otk} , {Zk} be such that ZkC G ( x + a k V ) , a k o + 0 , Zk~Z, ZC R ( x ) . Let u0 be a normal vector to G ( x ) at z. Take a vector g _ e G ~ ( v ) such that Pc-t~)(u)=(g_,uo),
(16)
uo~ K ( z ) .
It follows from the differentiability of the mapping G that for sufficiently small • 0 V z k E G ( x + akv) Vg._ e G~(v), there exist Yk E G(X) and (g+)k e G~(v) such that Olk
Zk + a k g _ = y k +ak(g+)k +O(ak),
(l[O(ak)ll/ak)~O.
(17)
It is clear that Yk -~ Z. NOW take Yk in the form Yk = Z + akbg, where bk ~ [ - K*(z)].
(18)
N.A. Pecherskaya / Quasidifferentiable mappings
151
(17) can then be rewritten as zk = z + ak( (g+)k -- g- + bk) + o(~k), where ak((g+)k -- g- + bk) ~ O. We shall now show that ( g + ) k - - g - + b k C F ( z ) . From (16) and (18) we have ( (g* ) k, Uo) -- (g-, Uo) + ( bk, Uo) <~p GZ(o)( Uo) -- p G: (o)( Uo) for Uo~ K ( z ) . The right-hand side of this inequality is equal to apG(x, u)/Ov and hence ( (g+)k, Uo) -- (g-, Uo) + ( bk, Uo)
opG(x, u) OV
Using the representation of F ( z ) given in (8), we obtain (g+)k - g- + bk ~ F(z), and thus prove that G allows first-order approximation.
4. Differentiability of a maximum function
Let Go c E. be such that G ( x ) c Go for any x ~ O. Consider the function c~(x) = mcax f ( x , z),
(19)
where the function f ( x , z) is defined and continuous on ~Q x Go and is differentiable at (x, z) in any direction (v, g), v ~ E,, g ~ E,.. The question of the differentiability of the function ~b(x) has been discussed by many authors [3, 5, 8-12, 16, 18, 24] under various assumptions about the function f ( x , z) and the mappping G. We try to show that for diiterentiable mappings some of these assumptions (for example, the local convexity or concavity o f f ( x , z) at z for every x, or the existence of a Siater point for the mapping G) are not essential. Let B(x, u) = {z ~ G ( x ) l ( u , z) = pc(x, u)}, R ( x ) = {z ~ G(x)]da(x) = f ( x , z)}. [,emma 1. l f zo~ R ( x ) then Zo~ B(x, Of(x, Zo)/Oz). Proof. Suppose that the assertion of the lemma is incorrect, i.e., there exists a point z o ~ R ( x ) such that z o ~ B ( x , Of(x, zo)/az). Take an arbitrary z c G ( x ) , z~ B(x, Of(x, z)/Oz). Then from the definition of the set B(x, u), where u = Of(x, Zo)/Oz, it follows that Oz
Oz
/"
(20)
N.A. Pecherskaya / Quasidifferentiable mappings
152
If Of(x, Zo)/Oz ~ 0, then from (20) it is clear that
Oz
go,, z o - z ) = - p < O .
(21)
Let g = z - zo. The ditterentiability of the function f ( x , z) implies that
f ( x , Zo+ a g ) = f ( x , Zo)+ a Of(x' Zo__..~+) o(g, a). Og
(22)
Then for sufficiently small c~ > 0 we deduce from (21) and (22) that ot
f ( x , Zo+ ag) ~ f ( x , Zo) +'~p + o(g, a) > f ( x , Zo), which contradicts the fact that Zo is a m a x i m u m point of the function f ( x , z) on G(x), since the point Z o + a g is a member of G ( x ) for a e [ 0 , [[Z-Zol[]. Hence ZoC B(x, Of(x, Zo)/Oz). In the case where Of(x, Zo)/Oz = 0 for ZoC R ( x ) it is clear that B(x, Of(x, Zo)/Oz) = G ( x ) and therefore Zoe B(x, Of(x, Zo)/Oz), thus proving the lemma. It is clear from Lemma 1 that for zoe R ( x ) the vector u =Of(x, zo)/Oze K(zo), where K(zo) is the normal cone at z0 as defined in Section 3. We shall use the notation - K*( zo) =- W( Zo). Suppose now that the tangent cone ~(Zo) does not contain straight lines. This means [26] that for s o m e j > 0 and for an arbitrary g c ~(zo) the following inequality holds:
Oz
' g ~ -Jllg[l,
(23)
or, in other words, Of(x, Zo)/Oz.e int K(zo). Fix v e E,. Consider the sequences {ak}, a k ~ + 0 , {s :~ke R(X+akV), and let :~k~ ZO. From the upper semicontinuity of the mapping R it follows that Zoe R(x). We shall represent ik in the form
~k:Zo+akgk,
akgk~O,
k~.
(24)
The following lemma is proved in [18]. Lemma 2. I f the tangenl cone 9'(Zo) does not contain straight lines, then the sequence [gk} defined in (24) is bounded. Corollary. Since G is a differentiable mapping, there exists a convergent subsequence {gk}, g,k ~ g, such that ~ F(Zo), where F(Zo) is defined by (8). Proof. Take u =Of(x, Zo)/Ozc K(zo) and find the scalar product of u and :~k (the latter defined by (24)):
(%, u) = (Zo, u) + ~k(ik, u).
N.A. Pecherskaya / Quasidifferentiable mappings
153
Since Zoe R(x), then po(x + akv, u) >--p(7(x, U) + ak(g,k, U), or
(25)
po(x + akv, u) - p o ( x , u) >~(P,k, u). Olk
Now choose a convergent subsequence of the bounded sequence {gk} (Lemma 2), the elements of which will also be denoted by gk. Let limk~oogk -~ g. As shown earlier, the mapping G has a uniformly differentiable support function and thus, taking the limit as a k - +0 (uniformly with respect to u) in (25), we obtain apo(x, u) >-(~,u), cgv
ueK(zo),
and therefore ~ e l'(zo), thus proving the Corollary Theorem 3. Let a mapping G be differentiable at point x e 12 in a direction v e E,. If
condition (23) holds for z e R ( x ) or the smoothness assumption holds for z e R ( x ) then the function ~b(x) defined by (19) is differentiable at x in the direction v, and its derivative can be represented in the form: Ov
-pci~(~,k~j+(af(~x
,v)],
(26'
where the pair [G~(v), G~,(v)] is the derivative of mapping G (up to the equivalence relation ). Proof. Let condition (23) hold for z e R(x), ~ e F(x, z, v). Then for a > 0 sufficiently small we have
z+~r
G(x+ ~v),
(]}o(,~)ly,~)--, o,
,~-, +0,
and the following inequality holds: ck(x + av) >~f ( x +av, z + a~ + o(a ))
~(,~)-~ o, o(~)/~-, o, a -~ +O. Then lira l [ ~ b ( x + a v ) - c b ( x ) ] > ~ \
~x
,v
+
,~ .
a~40
Denote the right-hand side of this inequality by A(z,~). Since z e R ( x ) and
N.A. Pecherskaya I Quasidifferemiable mappings
154
~ F(x, z, v)
are arbitrary, we have
lim
l[ch(x+av)-#~(x)]>~
~+0
sup
Of
sup
A(z,g).
(27)
z~_R(x) ~ _ l ' ( z )
We shall now prove the reverse inequality. Let h(a) -~ (1/a)[~b(x + a v ) - ~b(x)]. Choose sequences {ak}, {Zk} such that ~.kc R(x +akV), Z'k~Z, ak-*+O, h(ak)~ li---m~o h(a). It is clear that ze R(x) and that Zk = Z+ak~k, where a~k~O. Then
(o(xk) = f(x -t- Orgy, Z + a~k) z) =4,(x)+~L[ (Of(x,-~zz), ~,k) +(af(x, , ~ ,v)]+o(,~), and hence
1--[4~(x+akv)-fb(x)]=L\
~z
,gk
,V
)] +_o
ak
Taking the limit as
ak-* 0
li---m h ( a ) - - l i m cf~+O
in (28), we obtain
l[4~(X+akV)--4~(X)]=A(z,~,),
~k ~ + 0 O/k
where ~ -= l i m k ~ lim
(28)
Ot k
g,k, g ~ F(x, z, v)
(corollary to Lemma 2). It is then clear that
l[r
sup
c , ~ + o O~
sup
A(z,~).
(29)
zc- R ( x ) ~ c l ' ( z )
Comparing (27) and (28), we obtain the formula
Or"
,
Oz 'g' + \
Making use of relation (12) for u = p,.
~x
,v
Of(x, z)/az, z e R(x),
.
(30)
we have
sup (~, af(x'z)) i~-}'(z~ Oz
=P(;*~("~\ az
/-Pc;~<,~\~/.
(31)
Since z ~ R(x) is chosen arbitrarily, substituting (31) into (30) leads to (26). We see that inequality (23) plays an essential role in the proof above. However, this inequality could fail, for example, at the point of smoothness of the set G(x). In this case it can be shown, using Theorem 2 and arguing along the same lines as in the proof of the first part of the Theorem, that the function ~b(x) is once again diiterentiable and its derivative has the form (26).
N.A. Pecherskaya / Quasidifferentiable mappings
155
Remark 3. I f condition (23) is satisfied for one point Zoe R ( x ) then it follows from the definition of B(x, u) that B(x, Of(x, Zo)/az) = {Zo} and hence Lemma 1 implies R ( x ) = {Zo} and therefore the maximum in formula (26) can be omitted.
5. Computation of the derivative of the maximum function ~ ( x ) Consider the mapping
G ( x ) = {z e E,. [Az <~~ ( x ) } where A is an n • matrix, x 9 n>m, .~:PoO, PCEr, OcEn, ~is a directionally differentiable function. Without loss of generality we can assume that the rows ai, i = 1 , . . . , n, of matrix A are normalized. Suppose that G ( x ) is bounded and non-degenerated (in the linear programming sense) and does not contain unnecessary polyhedral constraints for x 9 dom G = {x 9 O[ G ( x ) ~ ~)}. Now let us consider the mapping
(32)
a ( x ) = {z 9 Em I A z <~x}.
The differentiability of the mapping (32) under these assumptions was proved in [19], and its derivative (up to the equivalence relation) obtained in the form:
Z~(v) = {g 9 Em lAg <~v+ + N~},
(33)
Z2( v) = {g 9 E,,[Ag<~ v_+ N~},
(34)
where the coordinates of the vectors v§ v_ 9 En are given by
i = v', v+ [0,
Vi
/>0, v~<0,
--V ~,
vi =
[0,
V %0,
9 v'~>0,
i=l,..,
n,
and N is a sufficiently large positive number, ~--(1, 1 , . . . , 1), ~ 9 E,. We shall try to use (26) to compute the derivative of the function ~b(x) (see (19)) defined by the mapping G (see (32)) and a function f ( x , z) which is differentiable with respect to both x and z. Consider the function max (g, Of(• z~:R(x)
where G~(v) is defined by (33). It follows from Lemma 1 that if z 9 R ( x ) , then z 9 B(x, af(x, z)/az). Let us define the set J ( z ) , z 9 B(x, af(x, z)/az), as follows:
J ( z ) = {j 9 1 : n l(aj, z) = xi}, where aj is a column vector composed of the elements of the jth row of the matrix
A, aje E,,. Let J ( z ) = l : s, s<- m.
156
N.A. Pecherskaya/ Quasidifferentiablemappings
Remark 4. Note that points z 9 R ( x ) satisfying the conditions of Theorem 3 do not belong to the co-dimension 1 faces of the polyhedron. It is clear that the solution to (35) is also a solution to the system of equations (aj, g)-(vJ++ N ) = O
Vj 9
(36)
which can be represented in the form g'=
~
/3jaj.
(37)
j~J(z)
(36) can then be rewritten as follows; /3~(aj, a,)-(vJ++ N ) = O
Vj 9
(38)
i=1
Let Aj be an s x s matrix, where
/(al, at) Aj =/(a2'.al).
\(as]at)
"'" 999
(at, a,) 1 (a=,.as)/.
"'" (as, as)~]
Since the vectors ai, j 9 J(z), in (37) are linearly independent, the matrix ,Aj is nondegenerate. For this reason system (38) has a unique solution, which can be written as /3 =- ~,;'(~7+ + N~), where ~+ =
~vJ, vJ>>-O, j c J ( z ) = [ l : s ] , 1.0, v J < 0 ,
~=(1,...,1), ~E,.
Any other solution to system (36) can be represented in the form g = g' + g", where g' is defined by (37) and g" belongs to the orthogonal complement of the set of vectors {aj IJ 9 J(z)}. Thus the vector g" does not attect the value of the scalar product (35), and so
z, g~&(o)\
Oz
\
Oz
\
Oz
jeJ(~)
where Aj is an m x s matrix, Aj = ( a t , . . . , as).
(39)
N.A. Pecherskaya/ Quasidifferentiablemappings
157
In the same way we can derive a computational scheme for calculating max
s~z2(,,) \
,g =
Oz
\
Oz
'
A~A2'(O_ + N~)
(40)
where vj'
~-=
O,
vJ<0'
jeJ(z).
vJ>~O,
Substituting (39) and (40) into (26) and remembering that t3j = obtain
zT~ -
~_, we then
max [(Of(x,z) ) (Of(~zZ) itT)], Ov =z~r(x)L\ ~x ,v + ,Afi.j
Or
t7 = ( v ' , . . . , vS).
(41)
Thus the following result holds. Theorem 4. The maximum function r defined by (19), where G(x) is a mapping
described by (32), is directionally differentiable and its directional derivative is given by (41). Note that (41) can be reformulated as follows. Let
L(z)
Of(x'z--)~-[((Aj'4]t)TOf(;;z))T'o'o'L's'O] n
(42)
Then from (41) we have
o6(x) do
= max (L(z), v).
(43)
zeRtx)
Finally, let us consider one special case. Let G be a difterentiable mapping, and consider the function r
=
max max (1,, z),
(44)
zeG(x) i E l : k
where 1~e En. Theorem
5. The function dp is differentiable at x in a direction v and Or
OV
Proof.
max ( l ~ , g ) - m a x
ieR(x)L.geG~(v)
geG:~v)
(i~,g)].
(45)
Let us reverse the order in which maxima are taken in (44): r
= m a x max (li, z), iel:k zeG(x)
and set f/(x) = maxz+~<x) (l,, z).
(46)
158
N.A. Pecherskaya / Quasidifferentiable mappings
It follows from Proposition 1 that the f~ are directionally differentiable at x in a direction v and
of,(x) OV
-
max
g~G*, (v)
(l~,g)-
max
~G=(v)
(l~,g).
(47)
Since
f,(x+av)=f,(x)+aof,(X) +o,.o(a) ' 013
o,.Aa) Ot
,0,
(48)
,~~+o
it is not difficult to show that for a > 0 sufficiently small,
r
+av) =max fi(x +av)= i r I :k
m a x fi(x +av),
i(R(x)
(49)
where R(x) = {i ~ 1 : k ]f~(x) = ~b(x)}. Therefore, from (48),
r
r
m a x Of/(X)+max o , . ~ ( a ) . icR(x)
OZ)
icl:k
On the other hand, we also have
~b(X+ctg)>~r162
max
of,(x)
i n oi,~(a). Ov + mi~l:k
These two inequalities imply (45), thus proving the theorem.
References [1] J.-P. Aubin, "'Contingent derivatives of set-valued maps and existence of solutions to nonlinear inclusions and differential inclusions", Mathematical Analysis and its Applications, Advances in Mathematics, Supplementary Studies, Part A 7 (1981) 159-228. [2] H.T. Banks and M.Q. Jacobs, "'A differential calculus for multifunctions', Journal of Mathematics and its Applications 24 (1970) 246-272. [3] V.V. Beresnev and B.N. Pschenichnyi, "On differential properties of the maximum function" (in Russian), Journal of Compulationol Mathematics and Mathematical Physics 14 (1974) 639-651. [4] M. Bradly and R. Datko, "Some analytic and measure theoretic properties of set-valued maps", SIAM Journal on Com~'ol and Optimization 15 (1977) 625-635. [5] V.F. Demyanov, Minimax: Directional differentiability (in Russian) (Leningrad University Press, Leningrad, 1974). [6] V.F. Demyanov and A.M. Rubinov, "'On quasidifferentiable functionals" (in Russian), Doklady Akademii Nauk SSSR 250 (1980) 21-25. [7] R.V. Gamkrelidze (ed.), "Progress in science and engineering," (in Russian), MathematicaIAnalysis 19 (1981) 127-230. [8] J. Gauvin and F. Dubeau, "Differential properties of the marginal functions in mathematical programming", Mathematical Programming Study 19 (1982) 101-119. [9] E.G. Golstein, Convex programming. Elements of the theory (in Russian) (Nauka, Moscow, 1970). [10] J.-B. Hiriart-Urruty, "Gradients g6n6ralises de fonction marginal", SIAM Journal on Control and Optimization (1978) 381-416.
N.A. Pecherskaya / Ouasidifferentiable mappings
159
[11] K.H. Hoffman and J. Kolumban, "'Verlagemeinerte Differentialbarkeitsbegriffe und Anwendung in der Optimierungs theorie", Computing 12 (1974) 17-41. [12] W.W. Hogan, "'Directional derivatives for extremal value functions with applications to the completely convex case", Operations Research 21 (1973) 188-209. [13] P. Huard, ed., "'Point-to-set maps and mathematical programming", Mathematical Programming Study 10 (1979) 1-190. [14] S.S. Kutateladze and A.M. Rubinov, Minkowski duality and its applications (in Russian) (Nauka, Novosibirsk, 1976). [15] V.L. Makarov and A.M. Rubinov, Mathematical theory of economic dynamics and equilibria (in Russian) (Nauka, Moscow, 1973). [16] L.I. Minchenko and O.F. Borisenko, "On the directional differentiability of a maximum function", Journal of Computational Mathematics and Mathematical Physics 23 (1983) 567-575. [17] E.A. Nurminski, "On the differentiability of set-valued mappings" (in Russian), Kibernetika 5 (1978) 46-48. [18] N.A. Pecherskaya, "On the directional differentiability of a maximum function subject to linked constraints" (in Russian), in: Yu.G. Evtushenko, ed., Operations research (models, systems, solutions) (Moscow Computing Center, Moscow, 1976) pp. 11-16. [19] N.A. Pecherskaya, "Differentiability of set-valued mappings" (in Russian), in: V.F. Demyanov, ed., Nonsmooth problems of control and optimization (Leningrad University Press, Leningrad, 1982) pp. 128-147. [20] N.A. Pecherskaya, "On the differentiability of set-valued mappings" (in Russian), Vestnik Leningradskogo Universiteta 7 ( 1981 ) I 15-117. [21] B.N. Pschenichnyi, Convex analysis and extremal problems (in Russian) (Nauka, Moscow, 1980). [22] B.N. Pschenichnyi, "Convex multi~'alued mappings and their conjugates" (in Russian), Kibernetika 3 (1972) 94-102. [23] A.M. Rubinov, Superlinear multivalued mappings and their application to economic and mathematical problems (in Russian) (Nauka, Leningrad, 1980). [24] S. Tagawa, "Optimierung mit mengenwerten Abbildungen", Operations Research Verfahren 31 (1979) 619-629. [25] Yu.N. Tyurin, "A mathematical formulation of a simplified model of industrial planning" (in Russian), Ekonomika i Mathematicheskie Metody 1 (1965) 391-409. [26] B.Z. Vulich, Special problems of the geometry of cones in normed spaces (in Russian) (Kalinin University Press, Kalinin, 1978).
Mathematical Programming Study 29 (1986) 160-175 North-Holland
QUASIDIFFERENTIABLE OPTIMAL CONTROL
FUNCTIONS
IN
V.F. D E M Y A N O V Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR and International Institute for Applied Systems Analysis, A-2361 Laxenburg, Austria V.N. N I K U L I N A a n d I.R. S H A B L I N S K A Y A Department of Applied Mathematics, Leningrad State University, Universitetskaya nab. 7/9, Leningrad 199164, USSR Received 14 October 1983 Revised manuscript received 16 November 1984
This paper is concerned with nonsmooth optimal control problems in which the functionals on the right-hand sides of the differential equations describing the controlled system are nondifferentiable (more specifically, quasidifferentiable). Several necessary conditions are derived. It turns out that different variations of a control produce different necessary conditions which are generally not equivalent. As a result we obtain several necessary conditions of different complexity which may be used to solve nonsmooth optimal control problems. Key words: Nonsmooth Optimal Control Problems, Quasidifferentiable Functions, Necessary Conditions, Variation of a Control, Variation of a Trajectory, Nonsingular Control.
1. Introduction N o n d i f f e r e n t i a b i l i t y in control t h e o r y a p p e a r s n a t u r a l l y on the r i g h t - h a n d side o f the system o f e q u a t i o n s as well as in the f u n c t i o n a l ( t h r o u g h s a t u r a t i o n functions, b y t a k i n g t h e m o d u l e s , etc.). (See, e.g., [2, 6, 7, 10].) In m a n y cases b o t h the system a n d the f u n c t i o n a l are des~afbed by quasidifferentiable functions, a class w h i c h is defined a n d i n v e s t i g a t e d in [3, 4, 8]. This p a p e r is c o n c e r n e d with the v a r i a t i o n s o f t r a j e c t o r y c a u s e d by using different v a r i a t i o n s o f the c o n t r o l for such q u a s i d i f f e r e n t i a b l e f i g h t - h a n d sides. W e c o n s i d e r five different t y p e s o f c o n t r o l variations. N e c e s s a r y c o n d i t i o n s for an e x t r e m u m o f a q u a s i d i f f e r e n t i a b l e f u n c t i o n a l are then stated. The m a i n i n t e n t i o n o f the a u t h o r s is to d r a w the a t t e n t i o n o f specialists in c o n t r o l t h e o r y a n d its a p p l i c a t i o n s to a new class o f p r o b l e m s which seems to be p r o m i s i n g a n d p r a c t i c a l l y oriented. A special case o f this class o f p r o b l e m s has a l r e a d y b e e n d i s c u s s e d in [5]. 160
V.F. Demyanov et al. / Optimal control problems
161
I. 1. Statement of the problem Let the object o f study be governed by the following system o f ordinary differential equations:
Yc(t) = f(x( t), u( t), t),
(1)
x(0) = Xoe E,
(2)
where x = (x (1). . . . , xr u = (u (n . . . . , utn),f = (f<'),... ,f<")), t ~ [0, T], and T > 0 is fixed. We shall use .N to denote the set o f r-dimensional vector functions which are piece-wise continuous (right-hand continuous) on [0, T]. Let us set
U={ueXlu(t)~
VVtc[0,
T]}
where V c E, is a c o m p a c t set. The set U is called the class of admissible controls and any u c U a control Functions fu~ are (i) defined on S (where S c En+r.t is the set o f all admissible x, u, t); (ii) continuous with respect to x and u; (iii) Lipschitzian with respect to x on S; (iv) piece-wise continuous with respect to t on S; and (v) quasidifferentiable with respect to x in S. (In Section 2.5 it will be assumed that the f u r ' s are quasidifferentiable jointly with respect to x and u.) Recall that the function F defined on En is quasidifferentiable at x c E, if it is directionally differentiable and there exist convex c o m p a c t sets O_F(x)c En and -aF(x) c E, such that
aF(x)_ lim l [ F ( x + a g ) - F ( x ) ] = f~g
a ~ + O ~1~
max (v,g)+ v~_OF(x)
m_in (w,g)
WEOF(X)
Vg~En.
Let x(t, u) denote the solution of system (1)-(2) for a chosen u e U. The p r o b l e m is to minimize the functional
.~(u)=~p(x( T, u))
(3)
subject to u e U where ~b(x) is quasidifferentiable, finite and Lipschitzian on the set o f admissible x. Let u * ~ U denote a u which minimizes #, i.e., 99 ( u * ) = m i n ~ ( u ) . ucU
(We shall not consider here the problem o f whether such a u exists or is unique.) The pair o f functions (x*, u*) where x*(t)=x(t, u*) will be called an optimal process; u*(t) is k n o w n as an optimal control and x*(t) an optimal trajectory.
V.F. Demyanovet al. / Optimal controlproblems
162
2. Variations o f a control
To derive necessary conditions for a minimum of (3) the following controls are generally used:
u~ = u* + Au~ ~ U where the function Au~ is called a variation of u*. We shall consider several variations of the control and the corresponding variations of the trajectory.
2.1. A needle variation ( a sharp variation) Let
Au~(t)={y-u*(t), O,
t~[O,O+e), tr O+e),
(4)
where y ~ V, 0~[0, T), e > 0 . We wish to find
h(t) =-(h")(t),..., h(")(t))= lira l [ x ( t , u~)-x(t~ u*)], ~-,+0
(5)
E
where the vector function h is the variation of the trajectory x* caused by variation of the control u*. It is clear that h(t) = 0 Vt ~[0, 0). For t > O + e we have
x~(t) -- x(t, u~) = xo+ +
I'
L,
f(x*(r), u*(r), r) d r +
fo+
f ( x ~ r ) , y, r) dr
dO
f(xE(r), u*(r), ~') dT.
O+e
Invoking (5), and taking the limit as e-~ +0, we obtain (see [1]) h")(t) = f(~ +
O), y, O) -f(1)(x*( O), u*( O), O)
fr
max
(v,h(.r))+
min
(w,h(r))
]
dr
Viel:n
(6)
where _Of")(r) -_~fx <')(x , (r), u*(r r) and Of")(~')=Of~')(x*(r), u*(~'), ~') a r e respectively a subditierential and a superditterential o f f ") with respect to x. For every r the sets ~_f")(r)c 17., and 0 f " ~ ( r ) c E, are convex and compact. Let us now rewrite system (6) in the following shorter form:
h( t) = f(x*( O), y, O) - f ( x * ( O), u*( O), O) +
fr o
max (v,h(z))+
L~
min (w,h(r)) ~Eoft'r)
]
dr
(7)
V.k: Demyanov et al. / Optimal control problems
163
where
Of(r) = [_3f(')(r) . . . . , 3 f ( " ) ( r ) ] ,
-3f(r) = [ 3 f ( ' ) ( r ) , . . . , 3f(")(r)].
Suppose that all m a p p i n g s 0 f ~~ and 0f") are piece-wise continuous on [0, T]. Then it follows from (6) that h(~)(t)=
m a x ( v , h ( t ) ) + min ( w , h ( t ) ) , veO_f~O(t) wE3fli)(t)
h~
O)-f(i)(x*(O), u*(O), 0)
Vic l:n.
We can again rewrite this system in a shorter form: /~(t) = m a x
vc~f(t)
(v, h(t))+
rain
weaf(t)
(w, h(t)),
(8)
h( O) = f ( x * ( O), y, O ) - f ( x * ( O ) , u*( O), 0).
(9)
If O_f(t) and Of(t) are piece-wise continuous m a p p i n g s then a solution to (8)-(9) exists and is unique for any fixed y e V and 0 6 [0, T]. Here, h(t) d e p e n d s on 0 and y.
2.2. A multiple needle variation (needle variations at several points) Let
Au~(t)=
{
yi-u*(t),
t~[Oi, Oi+el~)
Vi~l:r,
0,
t~ U [0,,O,+eli), i~:l:r
where yi e V, 0~ e [0, T), li/> 0, e > 0, and r is an arbitrary (but fixed) natural number. It is clear that x~(t) = x*(t) for t<~ 0~. If t > 01 then we have
x~(t)=-x(t,u~)=Xo +
Io'
f(x*(r),u*(r),r)dr+
I 2
+
+
01+ el I
f(x~(r),y,,r)dr
~ 02+el2
f(x~(r), u*(r), r) d r +
~ Oro~+el (r;
f ( x , ( r ) , Y2, 7") d r + " 9 9 dO 2
f ( x , ( r ) , y,~,), r) d r +
al Or(t)
f t
f(x~(r), u * ( r ) , r) d r
Or(t)+elr(O~
(10)
"
where r ( t ) c l : r is such that
O,m < t ~ 0,o)+ ~. If r( t ) = r then Or+~= T. Without loss of generality we can assume that t > Orm+ elr~,).
(11)
V.F. Demyanov et al. / Optimal control problems
164
From (10) it follows that
h(t) =- lim 1 [ x , ( t ) - x * ( t ) ] = l~[f(x*(Oi), Yl, 01)--f(x*(01), u*(O~), 01)] e~+O E
+
max (v, h(r))+ re_in (w, h(r))] d r weOf(r) I O!~ [ vcO.f(':)
+
12[f(x*(02), Y2, 02)-f(x*(02), u*(02), 02)]
+
I 02~ [ max
(v,h(r))+ re_in ( w , h ( r ) ) ] d r + . . . w~P,f(r)
vcOf(r
+ Ir(,)[f(x*(Or(,)), Yr~t), OrU))--f(x*(Or(t))), U*(Or(t),
+
I
0,(,))]
[ max (v, h ( r ) ) + m_in (w, h(r))] dr.
(12)
weOf(r)
0tit I Ve 0 f ( r )
Now let us introduce the functions ho(t)=0
Vt~[0, T],
hi(t)=0
Vt
while for t > 0i the function hi(t) satisfies the differential equation /~,(t) = max (v, hi(t))+ m_in (w, hi(t)) re=Of(t)
w~.af(t)
(13)
with initialcondition
h,(0i) = hi-,(0i)+ li[f(x*(Oi), Yi, 0i)-f(x*(O,), u*(0i), 0i)].
(14)
From (12) it is clear that h(t) = hr(,)(t). Thus, h(t) (which depends on {yi}, {Oi}, and {li}) is a piecewise continuous function satisfying the system of differential equations (8) (or, equivalently (13)) with several 'jumps' as indicated by (14).
2.3. A bundle of variations Let ~yi--u*(t),
Au~(t)=(O,
tE[Oi, Oi+eli),
t~[O, O + e ) ,
where Yi ~ V, li ~> 0, ~. ~..~ li = 1, 01 = 0, 0i+~ = 0i + eli, 0r + el, = 0 + e and r is an arbitrary natural number. It is not difficult to check that h(t)=0
Vtc[0,0).
V.F. Demyanov et al. / Optimal control problems
165
For t/> 0, we have
h(t) = f l,[f(x*(O),yi, O)-f(x*(O), u*(O), 0)] i=l
+
;r
max (v, h(r))+
re_in (w, h(r))
weaf(r)
o L vc-~f(~')
]
dr.
(15)
The variation of trajectory h(t) satisfies the system o f ordinary differential equations (8) with initial condition
h(O) = Y. li[f(x*(O),yi, O)-f(x*(O), u*(O), 0)]. i=1
The vector function h(t) depends here on {Yi}, {li} and 0.
2.4. A multiple bundle of variations ( a bundle of variations at several points) Take
l Yo - u*(t), Aug(t) = ( 0 ,
t E
Oi-t- e
~
lik, Oi + e
k~O
lik
Vj E
I:Mi V/el:N,
k'~O
t~s U [0i, 0,+eli), i~l:N
,M.
where e > 0, 0~ e [0, T), Yo ~ V, lO>1O, and 1io= 0 for all i c 1 : N, j c 1 : Mi, ~j='~ 10 = 1 and Mi and N are natural numbers. Consider the functions ho(t)=0
h,(t)=O
V t E [ 0 , T],
Vt
while for t > 0i the function hi(t) satisfies the differential equation (13) with initial conditions Mi
h,(0i) = h,_,(O,)4 2 l,k[f(x*(O,), Y,k, O,)--f(x*(0i), u*(O,), 0,)]. k-O
It is now possible to show that
h(t) = hrt,)(t) where r(t) was defined in (11). The function h(t) depends on {Yo}, {0i}, and {l~}.
2.5. A classical variation Suppose in addition to the above assumptions that the set U is convex and f is quasiditterentiable jointly in x and u, i.e.,
4f(x, U~ t ) lira l [ f ( x + a h , O[h, q] ~+o ct =
max [v,. v2] ~ o_.f~.,,(t )
u+aq, t ) - f ( x , u, t)]
[(v,, h)+(v2, q ) ] +
min [ wl. w2lc alT~.~(t )
[(wl, h) + (w2, q)].
V.F. Demyanov et aL / Optimal control problems
166
Now let
Au~(t)=e(u(t)--u*(t))=--eq(t),
u~ U.
Proceeding as above we find that h(t) satisfies the system of ordinary differential equations h(t) =
[(vb h(t))+(v2, q(t))]
max [vt,o2]~_~f,,w(O
+
min
[ ~,,,w2],- ~/,.~(t)
[(w~, h(t))+(w2, q(t))]
with initial condition h(0) =0. Here _0fx,~(t) c En+r and 0fx.~(t) c En+r are convex compact sets. Thus for all of the five control variations considered here we obtain
x,(t) = x*(t) + eh(t) + o(e) where h(t) satisfies a particular system of equations, depending on the control variation chosen.
3. Necessary optimality conditions Since ~b is quasidifferentiable and Lipschitzian we have
J(u~) = ok(x*( T)+ eh( T ) + o ( e ) ) = ~b(x*(T)) + e
Oek(x*( T)) ~o(~) Oh(T)
and therefore the following necessary condition holds:
Theorem 1. If u* ~ U is an optimal control then Ock(x*( T)) Oh(T)
max
~O~r
(v, h ( T ) ) +
min
~,~(.:~T))
(w, h(T))>~O
(16)
for all admissible variations of.trajectory h(T). It is possible to obtain different necessary conditions by considering different types of control variations. Suppose, for example, that f is smooth with respect to x and that we choose a needle variation. Then equation (8) becomes an ordinary system of variational equalities (see, e.g., [9]):
h ( t ) _ O f ( x * ' u * ' t ) h(t) ' Ox with initial condition (9).
t>~O,
V.F. Demyanov et al. / Optimal control problems
167
Applying the Cauchy formula, we obtain
h(T) = Y(T) Y-~(O)(f(x*(O), y, O) -f(x*(O), u*(0), 0)) = Y ( T ) Y '(O)Arf(X* , u*, 0), where Y(t) is the fundamental matrix of solutions to the system of variational equalities,
~try(x* , u*, O):f(x*(O), y, O)-f(x*(O), u*( O), 0)). Substituting the above expression for h(T) into (16) we obtain a~b(x*(T)) ah(T)
max ((Y(T) Y-'(o))rv,/trf(x*,y*,O)) ~O,~(x'(r)) +
min
wc~d~(x*(T))
(( Y(T) Y-'(0)) Tw,/tyf(x*, u*, 8)).
Let us introduce the following n-dimensional vector functions:
Ov(O)=(Y(T)Y-'(o))Tv,
veS_ck(x*(T)),
Ow(O)=(Y(T)Y-'(O))rw,
weOck(x*(T)).
It is not difficult to see that the function ~ ( 0 ) satisfies the following system of differential equations:
dO~(O) dO
Oft(x*, u*, 0)
- - =
Ox
0~(0),
0 ~< T,
veOqb(x*(T).
0v(T) =v,
(17)
Similarly, the function 0w(0) satisfies the system dOw(0)=
afT(x *, U*, 0)
dO
Ox
0w(T) = w,
0~(0),
0 ~< T, (18)
w e 0~b(x*(T)).
Using (16) we can deduce the following theorem: Theorem 2. For a control u* E U to be optimal it is necessary that min [ y~ v
max
L vc,),;t,(x*(T))
+
min
AyH(x*, u*, d/,,, O)
weSgO(x*( T) )
/tyH(X*,U*,Ow, o ) l = o 3
VOe(O,T)
where H(x, u, O, O) = (f(x(O), u(O), 0), qJ(O)),
AyH(x*, u*, ~, 0) = H(x*, y, ~b, O) - H(x*, u*, d/, 0).
(19)
V.F. Demyanov et al. / Optimal control problems
168
Condition (19) is a generalization of the Pontryagin maximum principle [9]. Functions O.(t) and ~bw(t) are referred to as conjugate functions and systems (17) and (18) as conjugate systems. Now let us consider a multiple needle variation (and f is again supposed to be smooth with respect to x). Making use of formula (12) and passing to the limit as r ~ +co it is possible to obtain the following 'integral' necessary optimality condition.
Theorem 3. For a control u* ~ U to be optimal it is necessary that inf
max
ueU I v~Ocb(x*(T))
Io
zl.H(x*, u*, ~ . r) dr
+ minwcs,(xO(r))fro AuH(x*'u*'O~'r) dr}=O"
(20)
In the case where the set U of admissible controls is convex and f(x, u, t) is smooth with respect to both x and u it is not difficult to deduce the following condition
Theorem 4. For a control u* ~ U to be optimal it is necessary that inf I
max
u e U [. vc_~q~(x*(T))
+
min
wc~r162
r (OH(x*'u*' \ ~u ~b'' r) , u(~) - u * ( r ) ) dr
I0( \
~u
'
u(r)-u*(r)
)}
dr =0.
(21)
If ~b is a smooth function then conditions (19) and (20) are equivalent. For nonsmooth problems condition (20) may happen to be 'stronger' than conditions (19) and (21). A more detailed comparison of different necessary conditions can be found in the paper by V.N. Nikulina and I.R. Shablinskaya (see [5, Chapter IV]).
Example 1. Consider the system of two equations .~(1) = U,
~c~2)= -x(')-i-U with the initial condition xr
xC2~(O)= 0. Let the functional be defined by
J(u) = d~(x(1, u)) = Isin xtl)(1, u ) l - Isin xr
u)l.
Find a quasidifferential of function ,fi at x = 0 = (0, 0): D(k(0) = [co{(cos 0; 0); ( - c o s 0;0)}; co{(0, cos 0); (0, - c o s 0)}]
= [co{(l, o), (-1, o)}, co{(O, 1), (o, -1)}].
V.F. Demyanov et al. / Optimal control problems
169
Let V = [ - 1 , 1]. Take i f ( t ) = 0 Vte[0, 1]. Then
s
Vt~[0,1]
and
J(i) =0. Functions q,o and ~,~ (defined by (17) and (18)) satisfy the same system
(~l')= 2x~'tq/2J'
451:~=0. Therefore 4,~(0) = (v ~'~, vt2~), ~w(0) = (w "~, w~2~) u
z[0, 1]
and
H(.~,i,g,~,O)=H(s
qJ~,O)=O V0c[0, 1],
H(.~, u, i/J~,0) = v(l)u+v(2~u 2,
(22)
H(:~, u, ~p~,0 ) = wtl)u+wt2~u 2. Let us first check condition (19): max (vtl~y+ v~2~y2)+ m_in (wr vc_~,b(O)
wt2~y2) = max{-y, y}+min{-y 2, y2}
wead~(O)
--ly[-y2>~O V y E [ - 1 , 1] r o e [ 0 , 1]; i.e., condition (19) is satisfied. Now let us verify condition (21). It follows from (22) that OH(~, i, ~ , 0) = v(,~'
OH(~, i, q'w, 0) _ w(,)
3u
3u
and
v~)u(O) dO+ m_in
max
v~_~(O)
w~a,b(0)
=max
{fo'
u(O)dO;-
fo
w~)u(O) dO
}Fro
u(O) dO =
I
u(O) dO >!0 Y u z U
i.e., condition (21) is also satisfied. But now we shall show that nevertheless condition (20) is not satisfied for i. To do this it is not necessary to find infimum in (20), it is enough to pick up a 'violator' of this condition. Take fi(t)=
1, -1,
tz[0, 89 tE[~,l].
V.F. Demyanov et al. / Optimal control problems
170
Then inf _ m a x uc u [woq,(o) + rain
we~q~(o)
<~ max
ve04~(o)
+ rain
(t~~
dO
Io'(w(X)u(O)+w(2)u2(O))dO} (v(')a(o)+v(2)a2(O))dO
Io'(w"~a(O)+w(2)a2(O))dO
:max{f]a(O) dO,-f]u(O)dO)+min(~u2(O)dO,-f]u2(O)dO} =
a(0)d0
I Io -
~i~(0)d0=-l<0,
i.e., condition (20) is not satisfied. Thus, condition (20) allows us to discover that control ~ is not optimal while conditions (19) and (21) have failed to do this. Remark 1. Conditions (19)-(21) can be used to develop numerical methods for
solving problem (3).
4. The case of a nonsingular control
Let us go back to the case where every c o m p o n e n t of vector-function f is quasidifferentiable in x. Let us remind that if a function F is Lipschitzian on E. then it is almost everywhere differentiable on E. and therefore for almost every x ~ E. it is possible to take
OF(x) = [ ~ J
"SF(x)= {0}.
Definition 1. A control u * c U will be called nonsingular (and its corresponding trajectory a nonsingular trajectory) if for almost every t ~[0, T] the function f(x*, u*, t) is differentiable in x along the trajectory x*(t). In this case it is possible to take
Iof,(x*,Oxu*, t)},
O_f(x*, u*, t) =--Of(t) = (
-Of(x*, u*, t)=-~f(t)={0}
(23) V a.e. t~[O, T].
V.F. Demyanov et al. / Optimal control problems
17 I
A trajectory for which condition (23) is not satisfied will be referred to as a
singular trajectory and the control which generates this trajectory will be called a singular control. Let x* = x(t, u*) be a nonsingular trajectory. Suppose that the number of points where (23) is violated is finite. Denote them i t , . . . , tN (assume also that tl < " 9 9< tN). Now let us consider a variation of control u* at some point O~[tk-i, tk], k ~ I : ( N + I ) . Here to=0, tN+~ = T. Then h(t)=-O Vt 0 then the system (8)-(9) can be decomposed into N - k + 1 linear systems of the type
h(t)= Ak(t)h(t)
Vtc-[O, tk)
with the initial condition
h(O) = a y f ( x * , u*, 0) and
l~(t)= Aj(t)h(t) h(t~__0 =
lira
Vte[tj_l, tj] V j e ( k + l ) : ( N + l ) , (24)
h(t).
I~l/- I 0
Here
Aj(t)-
Of(x*, u*, t) Ox
Vtc[tj_,,b] Vj~ k : ( N F 1),
Ayf(x*, u*, 0) = f ( x * , y, O) - f(x*, u*, 0). The function h(t) can now be rewritten in the form
h(t) = Yk(t) Ykl(O)Aof(x *, U*, O) Vte[O, tk), h ( t ) = Yi(t)yfl(tj_t)h(tj_~)
Vt~[tj-l,t 2) V j > k
where Yj is the fundamental matrix of solutions of system (24) for j e k : ( N + 1). Now let us introduce the notation: Fj(t,r)=Yj(t)Yj-'(r)
Vjek+(N+l).
=f~i~. [7 F~(tj, tj_~), k < N + l , Rk [E, (k+t):N+l k = N + 1, where E is the identity (n x n)-matrix. Then
h( T) = RkFk( T, O)Ayf(x*, u*, 0).
172
V.F. Demyanov et at / Optimal control problems
Substituting this expression for h ( T ) into (16) and transforming the scalar product under the operation of maximum we shall obtain (v, h ( T ) ) = (v, RkFk( T, O)Ayf(x*, u*, 0))= (FTR~v, Ayf(x*, u*, 0)) where T denotes the transposition. Introducing functions &v(O) = F~(T, O)R~v
and
r
= F~(T, O)R~w
we are able to reformulate condition (16) in the following form Theorem 5. Let u* ~ U be a nonsingular optimal control and tt, . 9 9 tu be all the points where condition (23) is violated. Then the relation max
v c ~_r ( x*( T ) )
(g,v(0), ayf(x*, u*, 0 ) ) +
rain
w ~ d a ( x*( T ) )
(r
zl,f(x*, u*, 0))>!0 (25)
is satisfied for every y c V and for almost all 0 ~ [tk-1, tk] where k c I : ( N + 1). Here q,v(0) is the solution of the system of linear equations ~b(0)=--AT(0)qJ(0)
VO~[tk_,, tk]
(26)
with the terminal condition $o(tk) = RTv,
v60_Ob(x*(T)),
and Sw(0) is the solution of system (26) with the terminal condition ~bw(tk) = R~w,
w~qb(x*(T)).
Note that to check condition (16) it is necessary to solve the systems (8)-(9) for each y c V and every 0~[0, T]; but to check condition (25) it is enough to find functions ~0o(0) and ~0w(0) and then we can use them for all 0 and y.
Example 2. Let us. consider the following system of two equations .~(I) ----_Xf2),
(27) X'z'-- 21x
The functional is defined by J,(u) = ~,(x(1, u ) ) = Ixct)(l, u)l +]x~2)(l, u)l.
v.F. Demyanov et al. / Optimal controlproblems
173
Let
U={ucXlu(t)~[-1,1]
Vt e [0, 1]}.
Take the control u * ( t ) = 0 V t e [ 0 , 1 ] and check its optimality. Then x ~ ) * ( t ) = x(Z)*(t) = 0 Vt e[0, 11 and J~(u*) =0. It is clear that functions
~ ( x ) := Ix')l + Ix(=) I are quasidifferentiable and _0~h(0) = co{(-1, -1), (-1, 1), (1, -1), (1, 1)}, _0f(t)(t) = {(0, 1)},
ajd')(t) = {(0, 0)}
0f2)(t) = co{(0, 2), (0, -2)},
0~b(0) = {0},
Vt e[0, 1],
0f(2)(t) = co{(1, 0), ( - 1, 0)}
Ve[0, 1].
The trajectory x*(t) = (x(')*(t), x(2)*(t)) is singular in the sense of Definition 1. The system for the variation of the trajectory x* is the following:
t~(')( t) = h~2)(t), h(t)(t)=
max
(v,h(t))+
min
(w,h(t))
= max{2h~2)(t), -2h(2)(t)}+min{h(')(t), -h(1)(t)} -- 2lh~2)(t)t- Ih(2)(t)l .
(28)
Let us check necessary oondition (16). The solution of (28) is
h(')(1)=y(l-O)e'-~
h12~(1)= y(2 -O) e'-~
if y > 0 and h~l)(1) = ~ 2 (e~4i-')~l -e) _ e(-./~--l)(l-q)) ___a3(O)y,
h(2)(1) = ~ 2
(('/-~ - 1) eC~--'x'-~
if y < 0 . Note that a~(0)>0
VOc[0,1] V i ~ I : 4 .
(,f2+ 1) e (-4~ I)(I--O)) ~ a4(O)y
174
V.F. Demyanov et al. / Optimal control problems
S u b s t i t u t i n g the e x p r e s s i o n for h(1) into the l e f t - h a n d side o f (16) we get, for y t> 0, m a x (v, h ( 1 ) ) = m a x (y(a~(O)+a2(O))) wOmb(o) vE~_r (al(o) + a2(O))y
=max
t
(-a~(O)+a2(O))y (-al(O)-a2(O))y (a~(O)-a2(O))y
=(a,(O)+a2(O))y>~O
Vyc[0,1]
V06[0,1].
A n a l o g o u s l y , if y < 0, max (v,h(1))=-y(a3(0)+aa(0))~0 w_~,/,(O)
Vyr[-1,0].
Thus for the control u * ( t ) the necessary c o n d i t i o n (16) is satisfied. It is also clear that this c o n t r o l is o p t i m a l . R e m a r k 2. T h e m o s t interesting case arises w h e n there exists a set o f n o n z e r o m e a s u r e for which ~_f(t) a n d -df(t) are not singletons. This i n t r o d u c e s the p r o b l e m o f s o - c a l l e d ' s l i d i n g m o d e s ' - - a very i m p o r t a n t a r e a for further study.
Remark 3. The p r o b l e m n o w is to find m o r e c o m p u t a t i o n a l l y useful f o r m u l a t i o n s o f (16) for different c o n t r o l variations. W e are also faced with a new t y p e of differential e q u a t i o n in the s h a p e o f e q u a t i o n ( 8 ) - - w e shall call this a q u a s i l i n e a r differential e q u a t i o n . T h e p r o p e r t i e s o f its s o l u t i o n s have yet to be investigated.
References
[1] V.M. Alekseev, V.M. Tikhomizov and S.V. Fomin, Optimal control (in Russian) (Nauka, Moscow, 1979). [2] F.H. Clarke, "'The maximum principle under minimal hypothesis", SIAM Journal on Control and Optimization 14(6).~,r976) 1078-1091. [3] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable functionals", Doklady Akademii Nauk SSSR 250 (1980) 21-25. (Translated in Soviet Mathematics Doklady 21(1) (1980) 14-17.) [4] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable mappings", Mathematische Operationsforschung und Statistik, Series Optimization 14(1) (1983) 3-21. [5] V.F. Demyanov, ed., Nonsmooth problems of optimization theory and of control (in Russian) (Leningrad University Press, Leningrad, 1982) (see especially chapter IV by V.N. Nikulina and I.R. Shablinskaya). [6] E.I. Kugushev, "'The maximum principle in optimal control problems with nonsmooth right-side hands" (in Russian), Vesmik Moskovskogo Universiteta 3 (1973) 107-113. [7] S.S. Mordukhovichand A.Ya. Kruger,"Necessaryoptimalityconditionsinterminalcontrolproblems with nonfunctional constraints" (in Russian), Doklady Akademii Nauk BSSR 20(12) (1976) 10641067.
V.F. Demyanov et al. / Optimal control problems
175
[8] L.N. Polyakova, "Necessary conditions for an extremum of quasidifferentiable functions" (in Russian), Vestnik Leningradskogo Universiteta 13 (1980) 57-62 (translated in Vestnik Leningrad University Mathematics 13 (1981) 241-247). [9] L.S. Pontrjagin, V.G. Boltjanskii, R.V. Gamkrelidse and E.F. Mischenko, The mathematical theory of optimal processes (Wiley, Chichester, 1962). [10] J. Warga, Optimal control of differential and functional equations (Academic Press, New York, 1972).
Mathematical Programming Study 29 (1986) 176-202 North-Holland
THE SPACE OF STAR-SHAPED SETS AND ITS APPLICATIONS IN NONSMOOTH OPTIMIZATION A.M. R U B I N O V a n d A.A. Y A G U B O V Institute for Social and Economic Problems, USSR Academy of Sciences, ul, Voinova 50-a, Leningrad 198015, USSR, Institute of Mathematics and Mechanics, Academy of Sciences of Azerbaijan SSIL ul. Agaeva 9, Baku 370602, USSR Received 15 December 1983 Revised manuscript received 3 March 1984 The study of quasidifterentiable functions is based on the properties of the space of convex sets. One very important concept in convex analysis is that of the gauge of a set. However, the definition of a gauge does not require convexity, and therefore the notion of a gauge can be extended beyond convex sets to a much wider class of sets. In this paper the authors develop a theory of gauge functions and study some properties of star-shaped sets. The results are then used to study nonsmooth extremal problems (of which problems involving quasidifferentiable functions represent a special class). Key words: Gauge, Star-Shaped Sets, Positively Homogeneous Functions, Directional Derivatives, Nonsmooth Optimization, Quasiditterentiable Functions, Necessary Conditions.
1. Introduction O n e very i m p o r t a n t c o n c e p t in subdifferential calculus is that of M i n k o w s k i duality, through which every convex c o m p a c t set is associated with a specific support function. The study of q u a s i d i t t e r e n t i a b l e f u n c t i o n s (see 4-6) is essentially b a s e d on the properties of the space of convex sets. M a k i n g use of this space, the sum of a convex f u n c t i o n a n d a concave f u n c t i o n can be associated with every class of e q u i v a l e n t pairs of convex c o m p a c t sets. The c o n c e p t of a gauge (a gauge f u n c t i o n of a convex set c o n t a i n i n g the origin [8]) is very i m p o r t a n t in convex analysis. However, the definition of a gauge does not require the c o r r e s p o n d i n g set to be convex but only to have a 'star s h a p e ' with respect to its 'zero' (origin). For this reason the idea of a gauge is not limited to convex sets, but can be a p p l i e d to a m u c h wider class of sets altogether (correspond e n c e b e t w e e n gauges a n d these sets have long been recognized in the geometry of n u m b e r s [1]). W h e n d o i n g this it is c o n v e n i e n t to c o n s i d e r only those sets which are star-shaped with respect to their zero a n d which have a c o n t i n u o u s gauge. In the present paper, these sets will be called star-shaped. It is possible to i n t r o d u c e algebraic o p e r a t i o n s (called here inverse a d d i t i o n a n d inverse m u l t i p l i c a t i o n by a n o n n e g a t i v e n u m b e r ) within this family of sets in such a way that the n a t u r a l c o r r e s p o n d e n c e b e t w e e n gauges a n d star-shaped sets b e c o m e s a n algebraic i s o m o r p h i s m . This allows us to use the s t a n d a r d algebraic t e c h n i q u e n o r m a l l y used to construct the space of convex sets to b u i l d the space of star-shaped sets. The 176
A.M. Rubinov and A.A. Yagubov / The xpace of star-shaped sets and its applications
177
duality between gauge functions and support functions (which holds in the convex case) allows us to consider the polar operator as a linear mapping from the space of star-shaped sets into the space of convex sets. It is then possible to look at some problems previously studied using the space of convex sets from a different, in some respects more general, standpoint. This is particularly useful in quasidifferential calculus. In the first part of this paper we study star-shaped sets and their gauges and the family of all star-shaped sets. Algebraic operations and an order relation are introduced, and their properties are discussed. The properties of the mapping which associates every star-shaped set with its gauge are also considered. We then define the space of star-shaped sets and study its properties. The second part of the paper is concerned with applications. Of particular importance is a geometrical interpretation of the directional derivative and its application to quasidifferentiable functions, and a definition of quasidifferentiable mappings. We also discuss the asymptotic behavior of trajectories which are generated by mappings with star-shaped images.
2. Star-shaped sets and gauges Definition. A closed subset U of the n-dimensional space E, is called a star-shaped set if it contains the origin as an interior point and every ray
~.~ ={,~xl~ >~0}
(xr
does not intersect the boundary of U more than once. To justify the definition we shall show that a star-shaped set U is star-shaped with respect to its zero, i.e., for all points x c U the set U contains the interval [0, x] = {Ax[A ~ [0, 1]}. Let us consider the set
Ux=Uc~, where x • O. This set is closed since it is a subset of the ray ~ and the endpoints of the intervals adjoining it are the boundaries of U. The fact that U is star-shaped implies either that there is no adjoining interval (i.e., Ux = ~ ) or that an adjoining interval is unique and of the form
{;xl ~ ~ (,,', +oo)}, where t,' > O. In this case Ux = [0, u'x]. The star-shape of U with respect to its zero follows immediately from the above, and is equivalent to either of the two relations hUc
U VA~[O, 1],
AUDU
VA~>I.
Recall that a finite function f defined on E, is called positively homogeneous if f(Ax) = Af(x) VA ~>O.
178
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
Let 12 be a set in E , , 0 9 int 12. The function
Ixl---Ixl,
= inf{)t > 0 ] x e )tO}
(1)
is called the gauge of set .(2 (or the Minkowski gauge function). If 12 is convex then the gauge coincides with the gauge function familiar from convex analysis; if 12 is a ball then the gauge is a n o r m corresponding to this ball.
Theorem 1. Let s be a functional defined on E,. The following propositions are then equivalent: (a) the functional s is positively homogeneous, nonnegative and continuous; (b) s coincides with the gauge o f a star-shaped set I2, where 12 = {xl s ( x ) <~ 1}. Proof. (a) Let s be a positively h o m o g e n e o u s , nonnegative, continuous functional and 12 = {xls(x)~< 1}. Then
Ixl,
= inf{A > O Is(x) <~A } = s(x).
It is easy to check that the set 12 is star-shaped. (b) Let s coincide with the gauge o f a star-shaped set/2. Since 1"2 is star-shaped then it follows from the definition that s ( x ) ~< 1 if x c 12 and if s ( x ) < 1 then x 9 O. Since 12 is closed then .O ={xllxl<~ 1}. It is clear that the gauge is both positively h o m o g e n e o u s and nonnegative. Let us now show that the gauge is continuous. Since the gauge is positively h o m o g e n e o u s it is enough to check that the set B, ={xllxl 1} is closed and that the set B2= {xllxl < 1} is open. However, B1 must be closed since it coincides with 12. Suppose now that B2 is not open, that x 9 B2 and that there exists a,sequence {Xk} such that Xk ~ X, IXkl/> 1. Without Ioss o f generality we can assume that limixg I = v/> 1. Take yk --xk/Ixkl. Then lYd = 1 and therefore Yk is a b o u n d a r y point o f 12. Since Yk ~ X~ Z" then the point x / v is also a b o u n d a r y point o f 12. If x ~ 0 it follows that the ray ~x intersects the b o u n d a r y o f 1~/ at least two different points x / l x I and x/~,, which is impossible. If Ix I = 0 then the ray ~x lies entirely in 12 and (from the definition o f ' s t a r - s h a p e d ' ) does not contain any b o u n d a r y points o f 12. Thus the gauge o f a star-shaped set must also be continuous a n d the theorem is proved. Remark. Since the gauge is continuous and int O coincides with the set {x]lx] < 1}, 12 must be regular, i.e., it coincides with the closure o f its interior. Let us denote by ~ the set o f all star-shaped subsets o f the space E,, and by ~/ the family o f all nonnegative, continuous, positively h o m o g e n e o u s functions defined on E.. The following proposition may then be d e d u c e d :
Proposition 1. A mapping ~b: bP~ ~ which associates a gauge with every star-shaped set is a bijection.
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
179
The set .9'[is a cone in the space Co(En ) of all continuous, positively homogeneous functions defined on E.. Since every function from Co(E,) is completely defined by its trace on the unit sphere S l = { x c E.l[lxll=l}, where Ilx]] is the euclidean norm of x, the space Co(E,,) can be identified with the space C(St) of all functions which are continuous on St and the cone Y[ coincides with the cone of functions which are nonnegative on St. Assume that C(St) (and hence the cone 5T[)are ordered in some natural way; fl >~f2<=>fl(x)~f2(x) Vx. Let us introduce the following order relation (by anti-inclusion) within the family ,5e of all star-shaped sets: -Qt~>02
ifO, c02.
It follows immediately from the definition of a gauge that the bijection ~ which associates a gauge with every star-shaped set is an isomorphism of ordered sets ,~ and K. In other words, relations O, ~ 02 and [xlt/> ]x[2 Vx are equivalent (where l" h is the gauge of set .0~). The cone ~ is a lattice, i.e., i f f ~ , . . . ,f,, ~ ~ then functions f and f defined by
f ( x ) = min f ( x ) ,
f ( x ) = max fi(x)
i
i
also belong to bY. Let f be the gauge of a star-shaped set /2~. Then f is the gauge of the union 0 = I,_J~~ and f is the gauge of the intersection ~ = (-'1~~ . This follows from the relations {A > O[xe AO} = U {A > Ol x ~ AO,},
(2)
i
{A > O l x e AO} = O {A > 0 l x e ) t O , } ,
(3)
i
which can be verified quite easily. Thus, the union and intersection of a finite number of star-shaped sets are themselves star-shaped sets. Furthermore, the union coincides with the infimum and the intersection with the supremum of these sets in lattice 5f.
Proposition 2. Let A be a set of indices and U~ be a star-shaped set with gauge ]. [,. I f the function Ix[ = i n f ~ a Ix[~ is continuous, then it is the gauge of the set cl U ~ u~. If the function ]xl = sup,,~a ]xl~ is finite and continuous, then it is the gauge of the set
A,,u~. We shall prove only the first part of the proposition. Since the function Ix I = inf~ca Ix]o is continuous it follows from Theorem 1 that this function is the gauge of some star-shaped set _O. It is now not difficult to check that
~=cJU u,,. Indeed, the continuity of functions I" [ and I" [,, implies that int _0 = {xllxl < 1} = {xlinf Ixl~ < Ct
1} = U t~
int U..
180
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
Therefore, taking into account the regularity of star-shaped sets we get O = cl int _/2 = cl U int u~ = cl I J u~. at
at
This proves the first part of the proposition.
3. Addition and multiplication The algebraic operations of addition and multiplication by a nonnegative number have been introduced within the family E{ of gauges of star-shaped sets in a natural way. We shall now introduce corresponding operations within the family ~ with the help of isomorphism ~,. Let ~ c 5e, h/>0. We shall describe the set h @ ~ with gauge ['1 = h ] ' l a , where [" It~ is the gauge of O, as the inverse product of set ~ and number h. The set ~ ( ~ ~2 with gauge l" I which satisfies the relation
1"l=l'l,+l'12, where [. I~ is the gauge of set 0~, is called the inverse sum of the shar-shaped sets ~21 and ~2. It follows from the definition that if h > 0 then
~o~ =!~. A If h = 0 then the set h O ~ coincides with the entire space E.. We shall now describe inverse summation. To do this we require the following elementary proposition.
Proposition 3. Let a , , . . . , am be nonnegative numbers. Then 1
al+'''+am=
min m a x i a i
(4)
(where it is assumed that 0 / 0 = 0). If a~ = 0 Vi then (4) is trivial. Otherwise, for any set {ct~} such that a~/> O, ~ a~ = 1 there exists an index j such that aj aJ<~Kk= 1 ak and therefore 1
max i i
ol i
m
a~/> ~ k=|
a k.
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
181
At the same time max i
-O1~i
ai = .
at,,
k=l
and this proves the proposition. Now let us consider star-shaped sets /21 and /22 with gauges l-In, and [.In2 respectively, and let 1. [ be the gauge of their inverse sum/21G 122. Then the following equality holds for every x: Ix[ = [xln, + ]xln: = min max
Ix[n,,
x],~2
0 ~ ar ~ I
min
Ixl,I-~
0~: tr ~ 1
O~t~'~
1
I 1.,
where l" Io is the gauge of set ct/2~ c~ (1 - ct).O2. (It is assumed that 0 9 = r'l~>o a/2.) Since the function l" I is continuous it follows from Proposition 2 that O1|
U
[~c~(1-a)/22].
0 ~ 1
Note that the role of zero (a neutral element) with respect to summation in a 'semilinear space' 5~ is played by the space E, (since the gauge of E, coincides with the identity zero). At the same time, E, is the smallest element of the ordered set ,5C We shall now give some computational examples.
Example 1. Consider the following rectangles in E~: U = [ - 1 , 1] x [ - 2 , 2],
V, = [-2A, 2A] • I-A, a].
Their inverse sum coincides with an octagon which is symmetric with respect to the coordinate axes. The intersection of this octagon with the first quadrant has the vertices: A
2A
2A
Rectangles U and V1 and their inverse sum are shown in Fig. 1. The set U | shown in Fig. 2.
V~o is
Example 2. Let U={(x,y)e E21y<~1} and V={(x,y)e E~lx<~1}. The set U | V is depicted in Fig. 3.
Example 3. Sets U and V are presented in Figs. 4(a) and 4(b), respectively; the set U O V coincides with the intersection of U and V (see Fig. 4(c)).
182
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
I
. . . . I- . . . . II U I I
[-- . . . . .
I i
9
I
L .... -4. . . . . . . . . .
L ..........
Fig. 1.
1.67
/
0.48
1 ] 0.43
I Fig. 2.
Fig. 3.
r ....
I Vl
I
I
I
I I
!
I
,
_I! . . . . I I I I I
I
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
183
7 (a)
(c)
(b) Fig. 4.
4. The cone of star-shaped sets
We shall now describe the vector space generated by the 'cone' of star-shaped sets re for which an order relation (with respect to anti-inclusion) and inverse algebraic operations have been defined. Let 6e2 be the set of pairs (U1, U2), where Ui ~ ~. Let us introduce within 6e2 the operations of inverse addition O and inverse multiplication by a number (3, and a preordering relation/> and an equivalence relation - . These are defined as follows:
(u,, u2)|
v2)= ( u,| v,, u~| v2),
AQ(U,,U2)=(A(3U,,A(3U2)
Ae(e,, u2)=(Ixleu ,lxlou,)
ifA~O, irA<0,
(u,, u2)>~(vl, v2) r
u . | v2 >~~ @ v,,
(u,, e , ) - ( v , , v2) r
u,|189
u2|
We shall now factorize the set bv2 with respect to the equivalence relation ~. In other words, we shall consider the family T of all classes of equivalent pairs. Since the operators O and Q produce equivalent pairs when applied to equivalent pairs, the operations for inverse summation and inverse multiplication by a number can be introduced within T in q ~ t e a natural way. The order relation within T is derived naturally from 6e2. An element of T which contains a given pair (U~, /3"2) will be denoted by [ U], U2]. We shall identify an element U of the set 6e with the element [ U, E,] of the set T. The equality [ U,, U2] = [ U1, E,]@[En, /3"2]= [ U,, E . ] O [ U~, E~] (where srQ 77 = ~:@ ( - 1 ) Q 7/) then implies that every element of T can be represented as the difference of two elements of ~, i.e., T is the smallest vector-ordered space containing ~T. For this reason we shall call ~ the space of star-shaped sets (compare with the space of convex sets).
184
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
We shall associate with every pair ( U,, U2) e 5e2 a positively homogeneous function [. [, is the gauge of Ui. It is clear that two pairs generate the same function if and only if they are equivalent. Hence, the function f = l" h -1" 12 Co(E,) is associated with every element [ UI,/-/2] of the space T. Conversely, by representing a continuous positively homogeneous function f in various forms f = f l -f2 (where f E ~ ) , we conclude that every element of the space Co(E,) is associated with the class of equivalent pairs [U~, Uz], where Ui = {x[f(x) <~1}. Identifying, as above, a star-shaped set U with the element [ U, E,] c T, we conclude that the mapping
f= l" I~- l" 12, where
[Ul,
(5)
u~]--,I.I,-I.12
is an extension of the bijection ~0: ~ ~ (which associates a gauge with a star-shaped set) to the bijection T ~ Co(E,). We shall use the same symbol q, to denote this bijection and refer to it as a natural isomorphism. It is clear that qJ preserves both the algebraic operations and the order relation. It is also clear that T, Co(En) and C(S1) can be viewed as different manifestations of the same ordered vector space. It is well-known that the space C(S1) is a vector lattice: its elements fl . . . . . f,, include a point-wise supremum V~'-lf. In addition, i f f =fli-f2~ then
Vmf = V
m ( =1
i
Af,
=
i=1
A
k=l
f,k+
~
#__~___, f2 ik) =~,,1f2i, ,
i=1
f~,.
We may now conclude that the space T is also a vector lattice: if a ~ , . . . , ~,, e T, ai = [ U,, U2,] then
(6) i -1
k=l
; 4,=[0
i=1
where (Y, | relation
(7)
k=l
denotes the inverse sum of the corresponding terms. From (6) and the
A
,~, = -
i=l
(-o,,) i=1
we conclude that
(8) Equation (8) is in some respects more convenient than (7). Let a = [ U,, U2] be an element of the space of star-shaped sets, and f = l" h - [ " 12 be the corresponding positively homogeneous function.
A.M. Rubinov and A.A. Yagubov/ The space of star-shaped sets and its applications
185
Let V = {x If(x)<~ 1}. The set V is star-shaped. It is not difficult to check that the element a + = a v 0 coincides with IV, E,], i.e., that V is the smallest (in the sense of the ordering within 5e, or the largest with respect to inclusion) star-shaped set with the property U, ~ U 2 e V. We shall now introduce a norm [. [within the space Co(E,). I f f ~ Co(E,) then If[ = max xEE.
If(x)[ Ilxll '
where [[. [[ is the euclidean norm in E,. The corresponding norm in C(S~) is [f] = max.-~s, If(z)[. In what follows we shall use the equality [f] = inf{h/> 0 1 - h Ilxll <~f(x) <~ h Ilxll, Vx ~ E, }. Let B be the unit ball in E,. The element e = (B, E , ) o f the space T corresponds to the function 11" [1, and the element - e = (E,, B) to the function -I1" II. Let us define the following norm in T:
lal=inf{h > O [ - h @e<~a<~ h Ge}, where a ~ E,. If a = [ U1, /-/2] then lal = i n f { h > 0 [ U , ~ U20) I B ; U2~ U,O) 1 B]. For a star-shaped set U we set IUI--- I[ U, E,]I and therefore [U[ = inf{h > 01AU = B} -= inf{A > 01U = ;t Q B}. Let X be a star-shaped c o m p a c t set in E,, and ,~ be some subset of the family 5e(X) of all star-shaped subsets of X. Let U e , ~ , and I1 be the gauge of U. We shall consider the sets
~u = { x l l x l ~ = l } (the b o u n d a r y of U) and
a~, - - { x l l - e ~ 0 there exists a 8 > 0 such that a~j + B~ c a~ V U c ~ where Ba =
{xlllxll < 81. Proof. Let us consider the set ~Ti.I of all functions from C ( S ~ ) - - t h i s represents a contraction (on S~) of the gauges of sets from ,~. The fact that the set 2~ is c o m p a c t is equivalent to the set E I I being compact. By the Arzeid-Ascoli t h e o r e m this property of E I I is equivalent to this set being b o u n d e d and equicontinuous. It is clear that condition (i) is satisfied if and only if-Vl. I is bounded.
186
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
We shall now show that c o n d i t i o n (ii) is e q u i v a l e n t to I; I.I b e i n g e q u i c o n t i n u o u s a s s u m i n g that (i) holds. Let c o n d i t i o n (ii) be satisfied a n d U ~ E. First o f all note that there exists an r > 0 such that for any U 9 I; a n d any x for which Ixl u = 1 the following i n e q u a l i t y holds: ]]xll > r. It follows i m m e d i a t e l y from c o n d i t i o n (i). Since X is c o m p a c t then there exists an R < oo such that
Ilxll<-R
Vx 9
Let condition (ii) be satisfied. We show that the set E l I is uniformly continuous i.e. for any e > 0 there exists a 8 > 0 such that relations Ilxll -- Ilyll = ] and [Ix - y l l < 8 imply
Ilxl.-lyl.[<
e
vu 9
Putting e ' = re let us find a 8' c o r r e s p o n d i n g to e' ( a c c o r d i n g to (ii)). By 8 let us d e n o t e 8'/R. T a k e e l e m e n t s x a n d y such that
Ilxll--Ilyll-- 1,
IIx-yll < 8
c h o o s e a n y U 9 ~. Let 1
A
-
ixlu ,
x'=
Ax,
y' =
Av.
It is clear that IIx'll = Ily'!! = A. Since I x ' l . -- a l x l .
= l,
we have xeUcX
and hence
--- IIx'll ~< R. Thus
IIx'-y'll -- x I t x - y l l < R a = 8'. A p p l y i n g (ii) we get y ' e a ~ , i.e.
]ly'l. - I x ' l . I < ~'. Since Ix'Iv = 1, IIx'll-- A > r a n d t h e r e f o r e
Ilxl u -lYl u I = ~-1 IIx'l v -ly'l u ] < ~ e' -- e. Since 8 d o e s not d e p e n d on U then the r e q u i r e d u n i f o r m c o n t i n u i t y is established.
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
187
Now let us assume that the set of functions "~t.! is uniformly continuous. It is necessary to prove that (ii) holds. Choose an e > 0 and let e'<e/R. Let us find a 6' such that the relations
Ilx'll--Ily'l] =- 1,
[[x'-Y'll < a
imply the inequality
I[x'lu-ly'[uI < ~'
Vue~.
Choose a ~ such that
<(e-Re')r, Let U c X
and
x~aU,
~
]X]u =
1. Assume also that
I]Y - x ll < a and put x ' - - x / ] l x r l , f , =
IIX--IIx Ilyll y 2<
y/[lYll.
The inequality
Ilxll.1 Ilyll
Irx-yll 2
implies that
[Ix'-y'[[ < a'. Besides, since ]]Y'{[-- 1, lY'lu <~ l/r; therefore Ilxl,~-[yl,~[= [Ix'lu"
Hxll-[y'[u" [[ylll
= IIx'l u" [[xll- I[x[[" ly'lu + llx[[ " I.v'lu -lylJ[" <~ [Ixll" Ilx'[ ~ - l y ' l u l + ly'l~ll[xll- Ilylll ~
]]Y[][
Re'+l [Ix-yll r
< e.
i.e.,
illyllu-11<~ ~, or, equivalently, y c O~. This completes the proof. R e m a r k . Let X be the family of all convex c o m p a c t sets belonging to X for which
condition (i) of Proposition 4 is satisfied. Then it is not difficult to show that set ~7I.lis equicontinuous and therefore X is compact.
5. The space o f convex sets
In conjunction with the space o f star-shaped sets T, we shall consider the space of convex sets M (see [3, C h a p t e r I]). Recall that this space consists o f classes of
188
A.M. Rubinov and A.A. Yaguboc / The space o f star-shaped sets and its applications
equivalent pairs [ U, V], where U and V are convex compact sets in E,, and the equivalence relation is defined by (U,, V,)-(U2, V2) r
U , - V2= U2- V,.
The algebraic operations in M are defined as follows:
[e,, v,]+[e2, v2] =[ u,+ e2, v,+ v~], h [ A , a ] = ( [ h A , hB] [,lAB, hA]
ifA~>0, ifh<0.
The order relation /> is given by
[U,, V,]>~[U2, I/2] if U , - V2 m/-/2- VI. Let L be the subspace of the space Co(E,) which consists of functions which can be represented by the sum of a convex function and a concave function. The mapping qb : M ~ L defined by 4 ( [ U, V])(x) = max (u, x) + m i n (v, x) ut_ U
(9)
oC V
is an algebraic and ordering isomorphism (it is, of course, assumed that L is provided with natural algebraic operations and an order relation). The inverse mapping ~-~ associates an element [_0p,~q] from M with a function p + q ~ L (where _~p is the subdifferential of the sublinear functional p and ~q is the superdifferential of the superlinear functional q). Let us consider a subset U of the space En. Let U ~ denote its polar:
U~
Vu~ U}.
Here (and in (9)) 0', x) is the scalar product of y and x. Let us recall the main properties of the polar: (i) The set U ~ is convex and closed; 0 c U ~ (ii) I f U is convex and closed and 0 c U, then U ~176U. (iii) U is compact if and only if 0 c int U ~ (iv) Let U be a convex closed set, with 0 e U. Then the gauge function of U coincides with support function of the polar U ~ and the support function of U coincides with the gauge function of the polar. (v) Let UI and U2 be convex and closed and let 0 ~ U1, 0 ~ U2. Then the relations UI D U2, U ~c U ~ are equivalent and
(U1+U2)~176
~
(AU)~
h
~ ifh>0.
Now let us consider star-shaped convex sets U~ and U2. Since 0 ~ int U,, the polar U ~ is compact. Since the gauge [. [i of the set U, coincides with the support function of the polar U ~ the following relations holds: [ x [ , - [x[2 = max (1, x ) - m a x (1, x) 1~u ~ 1~tJ~ = max (l, x ) + l~u, ~
min i~t-u2 ~
(l, x).
(10)
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
189
Let ~b and 9 be mappings defined by formulas (5) and (9) respectively, ~t be an element of the space T containing the pair (U1, U2), and fl be an element of the space M containing the pair ( U ~ - U ~ From (10) it follows that
and hence = (~-1~)(,~).
The operator 7r = ~-~q/ defines the operation of taking the polar (or is the polar operator). It is defined on the subspace Tc of the space T which consists of elements a such that there exists a pair (U, V) c a, where U and V are convex sets. It is clear that Tc is a linear space (this follows from the equivalence of the convexity o f a star-shaped set and that of its gauge). The set of values of the operator Tr coincides with the space of convex sets. Indeed, for f l e M it is always possible to find a pair ( U, V) e fl such that 0 e U, 0 e V. Then U = (U~ ~ V = (V~ ~ so that fl = 7ra, where ~ = [ U ~ V~ e To. From the properties of the polar it follows that the operator r is linear and order-preserving.
6. Quasidifferentiability and a geometrical interpretation of directional derivatives
The space of star-shaped sets can be used to provide a geometrical intepretation of directional derivatives. Let f be a function defined on an open set f~ c En and suppose that at a point x ~ E, we can construct the directional derivative o f f : af(x) -f'x(g) =- lira 1 [ f ( x + c t g ) - f ( x ) ] , 8g , ~ o o~ where the function fy~(g) is continuous in g. Since the functional f~ is positively homogeneous, an element of the space T of star-shaped sets is associated with f~. In other words, a pair of star-shaped sets (U, V) exists such that
f'(g) = min{A > 0 1 g e A U } - m i n { A > 0 ] g ~ AV} or, equivalently, f ' ( g ) = min{A > 01ge AU}+max{A < 0 ] g c ( - A ) V}.
(11)
Note (from equation (11)) that the pairs (U, V) and (U~, V,) represent the derivative o f f if and only if they are equivalent. Let us denote the set U in (11) by Of(x) and the set V by df(x). Invoking the properties of the space T of star-shaped sets, it is possible to state rules for algebraic operations over functions and the
190
A.M. Rubinov and A.A. Yagubov / The .space of star-shaped sets and its applications
corresponding pairs:
d_(fl + f2)(x) = _dfl(x)(~ d_f2(x), a ( f , +f . ) ( x ) = a f , ( x ) |
d_(fl "f2)(x) =fl(x)Qd_f2(x)@f2(x)Qdfl(x), a ( f l 9f2)(x) = f l ( x ) Q a f 2 ( x ) O f 2 ( x ) Q a f , ( x ) . Using formulas (6) and (7) and the rules for ditterentiability of the m a x i m u m function it is easy to find _d(maxfi(x)),
a(maxfi(x)), i
d(minf~(x)), -
i
d(minfi(x)). i
It is clear that a function f is quasiditterentiable at x if and only if there exist convex sets df(x) and a f ( x ) . In this case
df(x) = [of(x)] ~
af(x) = [ - ~ f ( x ) ]
~
where 0f(x) and -gf(x) are a subdifferential and a superdifferential, respectively, of f at x. We shall now present a geometrical interpretation of necessary conditions for a minimum. It is based on the following lemma. Lemma 1. Let a functional f be directionalty differentiable at x c 17.,, the derivative f ' ( g ) be continuous in g and :K be a cone in E,. Then (i) The relation min f ' ( g ) = 0
is satisfied if and only if Of(x) ~ 5~c 8f(x). (ii) The relation max f'(g)=0 g t= :,Y
is satisfied if and only if d f ( x ) c~ :K c df(x). Proof. Let us write f ' ( g ) in the form
f'x(g) = Igl, -Ig[2, where l" I1 is the gauge of the set d_f(x) a n d [ . 12 is the gauge of the set df(x). Assume that minf'(g)=0
and
ged_f(x)c~ffL
Then Ig, I <- l and Igll-lg21 0, so that Ig2[<~ 1, which is equivalent to the inclusion g e d f ( x ) . Thus, we have Of(x) (~ ~Kc df(x).
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
191
A r g u i n g from the o t h e r direction, s u p p o s e that this last i n c l u s i o n holds. F o r a g ~ ~ such that Igh > 0 let us find a h > 0 such that lag[, = 1. Then Xg ~ d_f(x). But since hg c (If(x) we have the i n e q u a l i t y IAgl= <~ 1. This m e a n s that Igh -[gl= =f!,(g) >0. Thus, if [gh = 0 then [g[2 = 0 (since [g[2 <~ [gh). Part (ii) o f the l e m m a can be p r o v e d in the s a m e way. Let x c 12 c E,. By Yx we shall d e n o t e the cone o f feasible d i r e c t i o n s o f the set O at the p o i n t x, i.e., g c y~ i f x + ag E 12 V a c (0, ao], where ao is s o m e positive n u m b e r (which d e p e n d s on x a n d g). Let E~ d e n o t e the c o n e o f feasible (in a b r o a d sense) d i r e c t i o n s o f O at x : g ~ I'~ if for any e > 0 there exists an e l e m e n t g~ ~ B~(g)=-{qlllq-gll < ~} a n d a n u m b e r a~ ~ (0, e) such that x + a ~ g ~ 1 2 . A f u n c t i o n a l f defined on an o p e n set 12 c E~ is said to be u n i f o r m l y d i r e c t i o n a l l y differentiable at x e / 2 if for a n y g c E , a n d e > 0 there exist n u m b e r s t~> 0 a n d ao > 0 such that
[ f ( x + aq) - f ( x )
-
af!.(q)l
< ae Vq c B~(g), Vet ~ (0, ao].
It is s h o w n in [3, C h a p t e r I] that a d i r e c t i o n a l l y differentiable, locally Lipschitzian function is also u n i f o r m l y d i r e c t i o n a i l y differentiable.
Theorem 2. Let x* ~ 12 be a minimum point o f f on 12. I f f is directionally differentiable at x* and f ' . ( g ) is continuous in g then d f ( x * ) c~ Tx* c d f ( x * ) .
(12)
I f f is uniformly differentiable at x* then d f ( x * ) n Ix* c af(x*).
(13)
C o r o l l a r y . I f f attains its minimal value at an interior point o f the set 12, then
df(x*) c df(x*). Remark. I f f is q u a s i d i f f e r e n t i a b l e a n d the sets d f ( x * ) a n d d f ( x * ) are convex then the relation d f ( x * ) c elf(x*) is e q u i v a l e n t to the inclusion - "~f(x*) c ~.Of(x*),
which is f a m i l i a r from quasidifferential calculus. A n a l o g o u s necessary c o n d i t i o n s for a c o n s t r a i n e d e x t r e m u m o f a quasidifferentia b l e f u n c t i o n can be o b t a i n e d from (12) a n d (13). The values a = Iminllgll=1f ' ( g ) ] , b = maxllgll_ 1i f ( g ) are called the rates o f steepest descent a n d steepest ascent, respectively, o f f on E,.
Proposition 5. The following relations hold: a = inf{A > 0 [d f ( x ) D _df(x)<~ A Q B},
(14)
b = inf{A > Old_f(x) ~ a f ( x ) | A Q B}.
(15)
192
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
Proof. Note that a = - min f~(g) = max (-f'(g)) = inf{A > O [ - f ' ( g ) ~< h [[g[[}, Ilgll= I
Ilgll =1
b = max f'(g) = inf{h > o [/s
<~ A Ilgll}.
Ilgll=l
Since f ' ( g ) = Igl,- Ig12, where l" It is the gauge o f the set df(x) a n d ] . 12 is the gauge o f the set df(x), we immediately arrive at (14) and (15). Note that max{a, b} = I f ' ( g ) l = I[_df(x), df(x)]l.
7. Differentiability of star-shaped-set-valued mappings We shall now use the space o f star-shaped sets to derive a definition o f ditterentiability for star-shaped-set-valued mappings. Let a :/2 ~ 5e be a mapping, w h e r e / 2 is an open set in E, and ,90 is the family o f all star-shaped subsets o f the space E,.. Identifying ow with the cone of elements o f space T with the form [ U, E,], we can assume that a operates into the Banach space T. The m a p p i n g a is said to be strongly star-shaped directionally differentiable at x ~ .0 if there exists a mapping a '" x . E, ~ T such that for every g c E, and sufficiently small a > 0 the following relation holds:
[a(x+ag), a ( x ) ] = otQa'(g)| where o(ot)/a ~ + 0 Let
(16)
0. Here the convergence is in the metric o f space T.
a'(g)=[a~(g),a~(g)],
o(ot)=[o*(a),o-(ot)].
Then (16) can be reformulated as follows:
[a(x + ag), a ( x ) ] =[otQa+(g)@o+(a), ct'~a~(g)~o-(ot)]. Since the pairs o f sets on both sides of this equality define the same element o f the space T, they are equivalent, i.e.,
a(x + ag)O c~Q a-~(g)~)o-(e~) = a(x)| a (S)a+(g) 0) o + ( a ) .
(17)
Thus, a m a p p i n g a is strongly star-shaped directionally differentiable if and only if there exist mappings a~ : E, ~ 5e, a+: E , ~ ,90 which satisfy (17). Remark. Several other definitions o f the derivative o f a m a p p i n g have been proposed. These are based on the use o f the space o f convex sets and the derivative o f the support function of a m a p p i n g (see, for example, [3, C h a p t e r II]). Let us associate a gauge I" Ix with each set a(x). This means that we define a m a p p i n g (an abstract function) x ~ 1. Ix with values in Co(E,). It follows from the
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
193
definition that a m a p p i n g a is strongly star-shaped differentiable if and only if this abstract function is directionally differentiable (in the topology o f space Co(E,)). We shall now consider an example. Let f ( x , y ) be a function defined on 12 x E,, (where 12 is an open set in E,). Assume that it is nonnegative, continuous and continuously ditIerentiable with respect to x in its domain. Suppose also that f is positively h o m o g e n e o u s in y:
f(x, Ay) = Af(x, y)
VA ~ 0.
Set
a(x) = {y If(x, y) <~1}. It is easy to check that the gauge [.Ix o f the set a(x) coincides with the function fix, 9). From the properties o f f it now follows that the m a p p i n g a is directionally dilIerentiable and that the function
Y~\
Ox
'g
corresponds to a'(g) (through a natural isomorphism). Note the following relations between strong ditierentiability and algebraic operations: 1. Let a~ : 12 ~ 5~ and a2 : 12 -~ cf be strongly directionally ditterentiable mappings, and let a~(~a2 be their inverse sum:
(a,Ga2)(x) = al(x)O)a2(x) V x e 12. Then the m a p p i n g a~|
is directionally differentiable and
( a~O) a2)" (g) = ( a,) !~(g)| ( a2)'~(g). 2. Let a m a p p i n g a : 12 ~ 5eand a f u n c t i o n / : 12 --, E~ be directionally differentiable. Then the m a p p i n g b:x ~ f ( x ) @ a ( x ) is directionally ditterentiable and
b" (g) =.f'~(g) Q) a(g)~) f ( x ) @ a" (g). To prove these two assertions it is necessary to view the m a p p i n g s 12 ~ 5e as single-valued mappings /-/-", T and to make use o f the properties o f directional derivatives o f single-valued operators. The following property can be proved in the same way: 3. Let m a p p i n g s F : X ~ 12 and a : 12 ~ cf be directionally differentiable and a be Lipschitzian. Then the m a p p i n g b(x) = a ( F x ) is also directionally differentiable and
b'(g) =-a'~:~(F'(g)). We say that a strongly directionally ditterentiable m a p p i n g a is strictly quasidiilerentiable if its derivative a'(g) belongs to the subspace Tc o f space T or, equivalently, if there exists a representation a'(g) = [ a ~ ( g ) , a 2 ( g ) ] , where sets a~ (g) and a;(g) are convex.
194
A.M. Rubinov and A.A. Yagubov / 7he space of star-shaped sets and its applications
The f u n c t i o n / z ( x , y) = [ylx, where I. Ix is the gauge o f set a ( x ) , is called the gauge function o f the m a p p i n g a. If a is strongly quasidifferentiable (in g), then the function /z is directionally differentiable and the following equality holds: /z'(x, y, g) =
lyl- -lyl~
= max (1, y) + min (1, y), I c A~
I c7 Bg
where I1 and I1+ are the gauges o f the sets a x ( g ) and a~(g), respectively, and A e = [ a ~ ( g ) ] ~ B e = - [ a ~ ( g ) ] ~ The element [Ae, B e ] = z r ( a ' ( g ) ) o f the space of convex sets (where zr is the polar operator) is called a quasidifferential o f the m a p p i n g a in direction g. Let a m a p p i n g a have convex images and the polar m a p p i n g a ~ be defined by
a~
= [a(x)]
~
Applying the polar operator zr to the equality [ a ( x + ag), E , ] = [ a ( x ) , E,]~) a G a'~(g)O)o(a), we obtain [a~
+ ag), 0] = [a~
0] + a e r ( a ' ( g ) ) + rr. o ( a ) .
This provides a p r o o f o f the following theorem.
Theorem 3. I f a mapping a possesses the property o f strong (star-shaped) quasidifferentiability, this is equivalent to saying that a strong (convex) derivative o f mapping a ~ exists.
8. Weakly star-shaped directional differentiability Let a m a p p i n g a : 1 2 - * ~ have gauge function /z. We say that a is weakly (starshaped) ditterentiable in a direction g if for any y e En the partial derivative /z'(x, y, g) exists. Note that the function y~/x!~(x, y, g) is not even required to be continuous. We shall now discuss in detail the conditions necessary for the partial derivative tz'(x, y, g) to exist. Let a : E, ~ 2 ~-, be a mapping. Fix x ~ E,, y ~ a ( x ) , and g ~ E,. Let 7(x, y, g) = { v c E,,13ao > 0: y + a v c a ( x + ag) V a ~ (0, ao]}.
(18)
F ( x , y, g) = cl 7(x, y, g). We say (see [4]) that the m a p p i n g a:E~--> 2 ~,,, allows first-order a p p r o x i m a t i o n at x ~ En in the direction g ~ E, if for any numerical sequence {ak} such that a k --~ d-0 and any convergent sequence {Yk} such that YR E a ( x d- akg), Yk -~ Y, the representation YR = Y + Os -~- O( a k) holds, where Vk~I'(x,y,g),
CtkVk~O,
yea(x).
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
195
Assume also that a is a continuous m a p p i n g and that the t o p o l o g y of ~ is induced from the Banach space T. This is equivalent to saying that the m a p p i n g x-~ [. [~(x) = I t ( x , . ) is continuous. Fix an element yo~ Era, and for x ~ g2 take V ( x ) = [it(x, Yo), +oo). We shall now describe the set Fv(x, ", g) (the closure of set yv(X, ", g) constructed from formula (18)). Let A ~ V(x). The relation v ~ 7v(X, A, g) means that, for a sufficiently small, we have I t ( x + ag, Yo) <~A + cry.
(19)
IrA > It(x, Yo) then (19) is valid for every v (with a sufficiently small). IfA = It(x, Yo) then (19) can be rewritten in the form 1 v / > - - [ i t ( x + ag, Yo) - i t (x, Yo)]O~
N o w we have ( (-oo, +oo), Z'v(X, A, g) = {t [ ~ ' ( x , Yo, g), +co),
A > v-(x, Yo), A = It(x, Yo),
where /%'~(x, Yo, g) = lim 1 [ i t ( x + ag, Yo) - v- (x, Yo)]. c~ ~ t - 0 0 t '
Proposition 6. A mapping a is weakly star-shaped directionally differentiable at x if and only if the mapping V allows first-order approximation in every direction f o r all Yo ~ E,. Proof. 1. Let V be such that first-order a p p r o x i m a t i o n is allowed in a direction g, and
ot k -q. + 0 .
Then
I t ( x + akg , Yo) ~ It(x, Yo) and therefore It ( x + akg, Yo) = It (X, Yo) + akVk + O( ak ), where Vk ~ V-~(X, YO, g)" This leads to I t ' ( x , Yo, g) = lira __1 [v-(x + akg, Yo) -- It(x, Yo)] ~> ~ ' ( X , Yo, g). Otk
2. Let a be directionally differentiable. Then the derivative v-'~(x, Yo, g) exists for every y o g a ( x ) , g e En. Let Ak~A, Ak~ V ( X + a k g ) . Then Ak >1 V- ( X + akg, Y0) = It (X, Y0) + ak It'~ (X, Y0, g) + O( ak ). If A = It(X, Yo), set Vk = Otkv-t~(X,YO, g) and we have a representation which is used in the definition of the first-order a p p r o x i m a t i o n . If A > v-(x, Yo) then this representation is obvious, and the proposition is proved.
196
A . M . Rubinot~ and A.A. Yagubov / The space o f star-shaped sets and its applications
Remark. The gauge function can be viewed as a minimum function with dependent constraints ~t(x, y0)= min h, x~_ V(x)
and therefore its differentiability can be studied with the help of a theorem by Demyanov [2]. However, this theorem is proved under the assumption that V allows first-order appI'oximation. Proposition 6 shows that this assumption is absolutely essential in the case under consideration. It is clear that the inverse sum of weakly differentiable mappings is also weakly differentiable. If a is weakly differentiable,f(x)/> 0 a n d f is a directionally differentiable function, then the mapping b ( x ) = f ( x ) Q a ( x ) is also weakly differentiable. Let a~:12--> 6e (i ~ 1: N ) be a weakly directionally differentiable mapping. Then the union of these mappings _a(x)=LJ~l:N a~(x) and their intersection a ( x ) = ['-'),~l:n a~(x) are also weakly directionaily differentiable. I f / ~ is the gauge of the mapping ai then the derivatives of the gauge functions /2 and /z of the mappings a and a are described by the following equations: /2'~(x, y, g) = max
ic R(x, v)
/zi(x, y, g),
~',(x, y, g) = min /xi(x, y, g), i~. Q(x,.v)
where R ( x , y ) = { i e 1: Nlli(x,y)=/z,(x, y)}, Q(x,y)={i~ 1: Nl~_(x,y)=/xdx, y)}. We shall now consider some examples of weakly differentiable mappings. Example 4. Let l: En ~ Em be directionally differentiable and set
a(x) = {yi(l(x), y) <~1}. It is clear that
a(x) is a star-shaped set, with gauge
[ylx -- ~(x,
y) = max{(/(x), y), 0}.
The derivative of/x(x, y) at x in direction g (where y is fixed) exists and is given by
I(ol'~(g),Y) /x'~(x, y, g) =
if(l(x),Y) >-0, if (/(x), y) < 0,
[max{(l'x(g),y),O}
if(l(x),y)=O.
Thus the mapping a is at least weakly differentiable. The function y--, # ' ( x , y, g) may be discontinuous, and in this case the mapping a is not strongly differentiable. Example 5. Let a(x) = {y](li(x), y) <~ 1, i e 1 : k}, where the l~: E, ~ Em are directionally differentiable mappings. Take a~(x)={yl(l~(x),y)<-l}. Since a ( x ) = ['-'1~a~(x) we deduce that the gauge /z of mapping a is of the form
i~(x, y)=max(l~(x), y), icO:k
where
lo(x ) = 0 V x c E,.
A.M. Rubinov and A.A. Yagubov / The space o f star-shaped sets and its applications
197
The function/.t is directionally differentiable for any fixed y and ~t'(x, y, g) = icR(x,y) m a x ( ( li)" g" y),
where g(x, y) -- {il ~(x, y) : (/,(x), y)}. Example 6. Let l~j: E, -> Em(i ~ 1 : k ( j ) ; j ~ 1 :p) be directionally differentiable m a p pings p
aj(x)={y[(lij(x),y)<~lVi~l:k(j)},
a(x):Ua2(x). j=l
The gauge f u n c t i o n / z o f m a p p i n g a is given by /x(x,y)=min
m a x (lo(x),y),
jEl:p i~O:k(j)
where loj(X) =0 Vj~ 1: p; V x c En. The f u n c t i o n / z is directionally differentiable and hence the m a p p i n g a is weakly differentiable.
Example 7. Let k
a(x)-- U g,(x)u,, i=1
where Ui, i c 1: k, are star-shaped sets in Era, the gi are functions defined in E,, and the set
0 ={xlg,(x)>OVie l:k} is not empty. For x ~ O the set a(x) is star-shaped with gauge lyl~ -- ~ ( x , y) = min
[YI___C_,
i gi(x)'
where I" I~ is the gauge of set U~. It is clear that the m a p p i n g a is weakly differentiable. Analogously, the m a p p i n g a(x)=
O
gi(x)Ui
i=l:k
is also weakly differentiable with gauge /z (x, y) = m a x - lyl, -
i gi(x)"
Example 8. Let F : En ~ Em be a directionally differentiable m a p p i n g with coordinate functions f~, i e 1 : m. Take
O={xlf(x)>OVi~
l:m}
198
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
a n d a s s u m e that 12 is not empty. C o n s i d e r the m a p p i n g
a(x) = { y c E,,iy<~ F ( x ) }
=
F(x)-E+,.
defined on /2. Since a(x) can be rewritten in the form
a(x)={y
Y' -<11,
It is clear that the g a u g e o f the m a p p i n g a is
Yi
/z(x, y ) = m a x ,
f,(x)
It is p o s s i b l e to i n t r o d u c e the n o t i o n o f q u a s i d i f f e r e n t i a b i l i t y for w e a k derivatives as well as s t r o n g derivatives. W e say that a m a p p i n g a : 12 -, S is weakly quasidifferentiable if for every 3' ~ E~. there exist convex c o m p a c t sets Ay a n d B v such that / z ' ( x , y, g) = m a x (1, g) + r a i n (I, g), I~- A~
I,._ Hv
where /z is the gauge o f m a p p i n g a. We shall n o w c o n s i d e r one a p p l i c a t i o n o f w e a k q u a s i d i f f e r e n t i a b i l i t y to e x t r e m a l problems. Let Z be a set d e s c r i b e d by Z = { x E / 2 1 y ~ a ( x ) } , where a is a m a p p i n g defined on an o p e n set 12 c E , a n d o p e r a t i n g into the set 5e o f s t a r - s h a p e d subsets o f E , ; y is a fixed vector from E,~. In o t h e r words, Z = a-I(y). (A m o r e general case is d i s c u s s e d in [7].) It is n e c e s s a r y to c o n s t r u c t the cone o f feasible d i r e c t i o n s o f Z at x~Z. I f / z is the gauge f u n c t i o n o f m a p p i n g a then
Z = Ix' e /2l~,.(x', y ) ~ l}. If a is w e a k l y q u a s i d i t t e r e n t i a b l e we can c o n s i d e r the cones:
ya={gllz'(x,y,g)
"y2={g[Iz'(x,y,g)<~O}.
Let y~ d e n o t e the cone o f feasible d i r e c t i o n s o f Z at x. Then y, ~ y~ c 3'2. F r o m [3, c h a p t e r 1, P r o p o s i t i o n 2, Sectictn 10], it follows that if
Gx(--axlz(x, y) ) r Gx(O~lz(x,y) ), where G x ( V ) = {/x e V [ / z ( x ) = m a x
l,(x)}
vC V
a n d ~J~/x(x, y ) a n d _0x/z(x, y ) are respectively a s u p e r d i t t e r e n t i a l a n d a subdifferential o f function # with r e s p e c t to x, then cL Yt = c] Y = Y2.
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
199
Consider the following example. Let
a ( x ) = { v c E ~ l v ~ Fx}, where F is a quasidifferentiable m a p p i n g with coordinate functions f~, i e l : m ; y = ~r = (1 . . . . . 1). Then (see Example 8 above)
1
1
/x(x, y) = max f ( x ) - minifi(x)" The inequality I~(X, y)~< 1 is equivalent to both min f,(x)/> 1 i
and max gi(x) ~ O, i
where gi(x) = 1 -fi(x).
9. Trajectories of star-shaped mappings Let us now discuss the asymptotic behavior o f trajectories generated by a starshaped mapping. Problems of this type c o m m o n l y arise in mathematical economics, where they are studied u n d e r additional convexity assumptions. The same problems without the convexity assumptions have been discussed in [9]. Let ~ be a star-shaped c o m p a c t set in En. A Hausdorff continuous m a p p i n g a:~llst(~) defined on ,T is called a discrete dispersible dynamic system ( 0 3system). Here Hs,(~) is the family o f all star-shaped subsets o f set ~. A sequence {x~ I i ~ 0, 1 . . . . } o f elements o f ~ such that xi.lEa(xi) ,
ie0,1 .... ,
is calle&a trajectory o f the D3-system a. A n o n e m p t y subset H o f set ~ is called a semiinvariant set o f the D~-system a if a ( O ) c O. Take
Pa(~:) = cl U a'(~) t=l
for sc~ llst(~), where a ' + ' ( ~ ) = a(a'(~)). A point x e ~T is called a Poisson stable point o f D3-system a if x e Pa(x)= P~({x}). A set /~/~ IIs,(gT) is called a turn-pike set o f D3-system a if p(x,, 1(4)-~ 0 for any trajectory {x,} o f this system. Let M denote the intersection o f all turn-pike sets.
200
A.M. Rubinov and A.A. Yagubov / The space o f star-shaped sets and its applications
A functional h defined on eT is said to be in equilibrium if h is continuous, h(x) >~O V x e ~, and
h(y)<~h(x)
Vxe~,
yea(x).
Let the functional h be in equilibrium. Take
(h o a)(x) = h ( a ( x ) ) = max{h(y)ly 9 a(x)} for x 9 ~F and set
Wh={Xe~.lh(x)=(hoa)(x)}
,
W = ( ' ] Wh, h
where the intersection is taken over all functionals in equilibrium. It is shown in [6] that W ~ M. Let 2 be a c o m p a c t subset o f the space H~t(~) in the t o p o l o g y induced from the space of star-shaped sets T. A m a p p i n g a : ~ 2 is called quasihomogeneous if a(Ax) = Aa(x)
VA e [0, 1).
In addition, a ( / z x ) c / z a ( x ) V # > l for q u a s i h o m o g e n e o u s mappings. Some examples o f q u a s i h o m o g e n e o u s mappings are given below. 1. Concave mappings (under the additional assumption 0 9 a(0)). A m a p p i n g a defined on a convex c o m p a c t set ~ is concave if
a ( a x + f l y ) D a a ( x ) + f l a ( y ) Va, ~ >10, a + ~ = 1. 2. Homogeneous mappings of degree & A m a p p i n g a is h o m o g e n e o u s of degree t$ if it follows from x, Ax 9 ~f that
a(~x) = A~a(x). Proposition 7. I r a mapping a : ~-> Z is quasihomogeneous then the function
g(x) = {Ixle 1
if x r ~,
ifxc ~,
is in equilibrium, where ~ is a star-shaped semi-invariant set.
Proof. If x 9 ~: then h(x) = 1. If y 9 a(x) then y 9 a(x) c ~ since sr is semi-invariant, and hence h(y) = 1. Let x ~ ~: and y e a(x). Then h(x) = Ix[r =inf{A > O [ x c A~:}> 1. If y 9 ~: then h(y) = 1 < h(x). Let y ~ ~:. Then, using the inequality h(x) = X > 1, the quasihomogeneity of the m a p p i n g a and the semi-invariance o f ~c, we obtain
y 9 a(x) c a(A~) c Aa(~) c ~ . Therefore h(y)=lYlr the proposition.
= h(x). This implies that h is in equilibrium and proves
A.M. Rubinov and A.A. Yagubov / The space of star-shaped sets and its applications
201
L e m m a 2. I f C is a compact star-shaped set, then f o r any ~ there exists an e > 0 such
that C + e B + ~ (1 + ~7)C.
Proof. Assume the converse to be true. Suppose that there exist sets {gk}, {t~k}, gk E B, ~k e C, 1)k ~ l) and a n u m b e r 7 ' > 0 such that Ok + gk ~ ( l + "rf)C. Taking the limit as k-->oo we obtain v~ (1 + ' q ' ) C , which contradicts the inclusion v e C and thus proves the lemma.
T h e o r e m 4. I r a : ~ ~ lI~t( ~ ) is a quasihomogeneous mapping and a ( x ) c Z f o r every
x, then W = M = ~ where Y( is the f a m i l y o f all Poisson stable points.
Proof. It is necessary to check the inclusions ~ D W, M D ~. 1. We shall first verify ~ ~ W. If x ~ ~, then x ~ Pa(x). T h e set P~(x) is star-shaped (since a ( x ) c Z and Z is compact) and semi-invariant. Let h be the function defined in Proposition 7 with respect to set ~: = P~(x). Then h is in equilibrium and since x ~ P a ( x ) we have h ( x ) > 1. However, we also have a ( x ) e P a ( x ) and therefore ( h o a ) ( x ) = l . Thus x ~ Wh and hence x ~ W. 2. To verify M D ~, we first let x e ~ , i.e., x e P~(x). From L e m m a 2 it is clear that for every e e (0, l) there exists a n u m b e r t such that (1 - e)x e a'(x).
(20)
Consider a sequence o f positive numbers {ek} such that [ I k - i ( 1 - Ek) converges to some n u m b e r ~,e(0, 1). Using (20) and the q u a s i h o m o g e n e i t y o f a we can construct a trajectory X = {x,} starting from x and containing the subsequence {% =l-I~=~ (1--ek)X}. This means that ux is a limit point o f the trajectory x and therefore u x e M. Since u is an arbitrary n u m b e r we conclude that x e M. This completes the p r o o f o f the thereom.
References [1] J. Cassels, An introduction to the geometry of numbers (Springer-Verlag, Berlin, 1959). [2] V.F. Demyanov, Minimax: directional differentiability (in Russian) (Leningrad University Press, Leningrad, 1974). [3] V.F. Demyanov (Ed.), Nonsmooth problems in the theory of optimization and control (in Russian) (Leningrad University Press, Leningrad, 1982). [4] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable functionals', Doklady Akademii Nauk, SSSR 250 (1980) 21-25. (Translated in Soviet Mathematics Doklady 21 (1) (1980) 14-17.) [5] V.F. Demyanov and A.M. Rubinov, "On some approaches to the nonsmooth optimization problem" (in Russian), Ekonomika i Matematieheskie Metocly 17 (1981) 1153-1174.
202
A.M. Rubinov and A.A. Yagubot~ / The space of star-shaped sets and its applications
[6] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable mappings", Mathematische Operationsforschung und Statistik, Series Optimization 14 (l) (1983) 3-21. [7] B.N. Pschenichnyi, Convex analysis and extremal problems (in Russian) (Nauka, Moscow, 1980). [8] R.T. Rockafellar, Convex analysis (Princeton University Press, Princeton, 1970). [9] A.M. Rubinov, "Turn-pike sets in discrete dispersible dynamic systems'" (in Russian), Sibirskii Matematicheskii Zhurnal 21 (4) (1980) 136-145.
Mathematical Programming Study 29 (1986) 203-218 North-Holland
e-QUASIDIFFERENTIABILITY OF REAL-VALUED FUNCTIONS AND OPTIMALITY CONDITIONS IN E X T R E M A L P R O B L E M S V.V. G O R O K H O V I K Institute of Mathematics, Academy of Sciences of the Byelorussian SSR, Surganov St. 11, Minsk 220604, USSR Received 2 December 1983 Revised 6 June 1984
In this paper we demonstrate how the concept of quasidifferentiability introduced by Demyanov and Rubinov may be extended to the more general concepts of e-quasidifferentiability and approximate quasidifferentiability. We study the e-quasiditterentiability of composite functions and present some rules for e-quasidifferential calculus. The optimality conditions for some typical extremal problems are restated in terms of e-quasiditterentials.
Key words: Nonsmooth Analysis, e-Quasidifferential, Local Extremum, Inequality and Equality Constraints, Optimality Conditions.
1. Introduction
The main aim of this paper is to show how the concept of quasidifferentiability introduced into nonsmooth analysis by Demyanov and Rubinov [4] (see also [ 1, 2, 5, 6, 16]) may be extended to the concepts of e-quasidifferentiability and approximate quasidifferentiability. These concepts are more general than that of quasidifferentiability, every quasidifferentiable function being also approximately quasiditterentiable. In addition, we shall see that the class of approximately quasidifferentiable functions contains locally Lipschitzian functions that are also directionally differentiable. The concepts of e-quasidifferentiability and approximate quasidifferentiability (termed simply quasidifferentiability) were introduced in earlier papers by the present author [8, 9]. The structure of the paper is as follows. Section 2 provides a summary of the results and notation which are used in the body of the paper. In Section 3 we introduce the concepts of e-quasidifferentiability and approximate quasidifferentiability for real-valued functions. We also investigate the property of e-quasidifferentiability for compositions of e-quasidifferentiable functions, and present some rules for e-quasidifferential calculus. A number of typical extremal problems are considered in Section 4 and their optimality conditions are derived in terms of equasidifferentials. It should be noted that the choice of extremal problems considered in Section 4 and in fact the paper as a whole was greatly influenced by [1, 2, 4, 5, 6, 16]. Some familiarity with these papers would be useful to the reader. 203
V. V. Gorokhovik / e-Quasidifferentiability
204
2. Difference-snhlinear functions and their quasidifferentials Let ~ ( E . ) be the vector space of real-valued positively homogeneous ( h ( h x ) = hh(x), h > 0 , x ~ E . ) continuous functions defined on n-dimensional Euclidean space E., and l e t / q ( E . ) be a convex cone in Y((E.) consisting ofsublinear functions. We shall say that a function h from 9((E.) is difference-sublinear if h may be represented as the difference of two sublinear functions, i.e., if there exist sublinear functions h and k7 such that h ( x ) = h ( x ) - h ( x ) V x ~ E.. The collection of all difference-sublinear functions is denoted by H(E.). It is not hard to see that H(En) is the smallest vector subspace of Yg(E.) which contains /q(E.). Moreover, the vector subspace H ( E . ) is closed with respect to the operations of taking pointwise m a x i m u m and minimum on finite subsets of H(E.). To demonstrate this we consider difference-sublinear functions h, and h~ and assume that hi(x) = hi(x) - ~(x), x c E., i = 1, 2 where _hi,/~i, i = 1, 2, are sublinear functions. From the equalities max{hi(x), h2(x)} = max{_hl(x) +/~2(x), _h2(x)+/~,(x)} - (/~l(x) +/~2(x)) and min{ hi(x), h2(x)} = (_h,(x) + h2(x)) - max{/~,(x) + _h2(x),/~2(x) + _hi(x)} we can conclude that max{h,(x), h2(x)} and min{h,(x), h2(x)} are also differencesublinear functions. It is well-known that each sublinear function h e / ~ ( E . ) is uniquely associated with a convex compact subset ah(0) = {v c E, [(x, v) <~h(x) Vx ~ E,} called the subdifferential of h at zero, where h ( x ) = max (x, v). vc,~h(O)
(By (x, v) we denote the scalar product of vectors x and v.) From this and from the definition of difference-sublinear functions it follows that for each h ~ H ( E . ) there exist two convex compact subsets _~h(0)c E, and ~ h ( 0 ) c E. such that h ( x ) = max ( x , v ) v~gh(O)
max (x,w)
Vx~E,.
An ordered pair Dh(O)= [0h(0), ~h(0)] of convex compact subsets ~h(0) c E, which satisfy equality (1) is called a quasidifferential of Any ordered pair A = { A , , 4 ] of convex compact subsets A c represents a quasidifferential at zero of the function ~b(A)e H ( E , )
r
v ) - m a x ( x , w) t,c_A
(1)
wcS,h(0)
V x c E,,.
0 h ( 0 ) c E~ and h at zero. E, and , 4 c E, defined by (2)
wc./~
If pairs A1 = [_A~, ,4~] and A2 = [_A2, fi~2] satisfy the equality _A~+ ,42 = -4~ + _Az then d~(A~)(x) = cb(A2)(x) V x e E,. Thus the quasidifferential of a difference-sublinear function is not uniquely defined. In particular, if Dh(O)=[~_h(O),-~h(O)] is a quasidifferential of h at zero then for any convex compact set M c E, the pair [0h(0) + M, a h ( 0 ) + M ] is also a quasidifferentiai of h at zero.
V. V. Gorokhovik / e-Quasidifferentiability
205
Let us define the addition of pairs A1 = [_AI, A1] and A2 = [_A2, .42] as follows:
A,] + [_A2, A2] = [_A, + A2, A, + and scalar multiplication in the following way: A[A, A] = ~'[A_A, AA],
-
A~0,
L[I IA, I l_a],
We therefore have
4o(A~ + A2)(x)= r
V x c E,,
and
ck(AA)(x)=A4~(A)(x)
V x e En.
Let h: E, --> El be a ditterence-sublinear function and Dh(O) = [0h(0), ~h(0)] be a quasiditterential of h at zero. Then h is nonnegative on E,, i.e., h(x)>-O V x ~ E, if and only if ~h(0) c _0h(0).
(3)
Finally we endow ,9~'(En) (and H(E,)) with the norm JJhl[-= maxllxu~lJh(x) [ and note that by the Weierstrass-Stone theorem the subspace ofditterence-sublinear functions H ( E , ) is dense in ~ ( E , ) with respect to this norm. By virtue of this fact each positively homogeneous continuous function h may be approximated by differencesublinear functions, and the quasidifferentials of the approximating functions used to characterize h. This is the main idea of the approach described in the following section. A more detailed account of this preliminary material can be found in [1, 4, 5, 6, 14]. Here we have confined ourselves simply to those facts which are necessary for an understanding of the present paper.
3. e-Quasidifferentiability and approximate quasidifferentiability of real-valued functions Let f be a real-valued function defined on some open set U c E, containing a point x. From Demyanov and Rubinov [4], a function f is said to be quasidifferentiable at a point x if (a) f is directionally differentiable at x, i.e., the following limit exists:
f!~(g) = lim
f ( x + ag) - f ( x )
Vg ~ E..
(b) the directional derivative f'~: En ~ E1 is difference-sublinear.
V. V. Gorokhovik / e-Ouasidifferentiability
206
An ordered pair D f ( x ) = [_~f(x), Of(x)] of convex compact sets ~_f(x)c E, and ~ f ( x ) c E~ such that
f'(g)=
max ( g , v ) roOf(x)
max (g,w)
VgcE,,
w~-af(x)
is called a quasidifferential of the function f at the point x. Thus we can refer to a quasidifferential of the directional derivative f!~: E, ~ El at zero (if it exists and is difference-sublinear) as a quasidifferential of the function f at the point x. Generalizing this notion, we introduce the following definitions: Definition 1. Let e >/0. A function f is said to be e-quasidifferentiable at a point x i f f is directionally ditierentiable at x and there exist convex compact sets O_~f(x) c En and -~ff(x)c En such that
If(g)-
max ( g , v ) + max v~-Orf(x)
wcS~f(x~
(g,w)[<~ellgll
VgcE,.
(4)
A pair D~f(x) = [_~ff(x), 0 , f ( x ) ] satisfying (4) is called an e-quasidifferential of the
function f at the point x. In other words, a function f is e-quasiditterentiable if its directional derivative may be uniformly approximated to within e on the unit ball by a difference-sublinear function. It is evident that any e-quasiditterential D~f(x) of the function f at x is also an e'-quasiditterential o f f for any e' t> e. In particular we can consider a quasidifferential Df(x) of a quasidifferentiable function f to be an e-quasiditterential of f for any e > 0. Hence quasiditierentiable functions are also e-quasidifferentiable for any positive e. However, this property is not limited to quasidifferentiable functions--it characterizes a much wider class of functions which can be defined as follows: Definition 2. A function f is said to be approximately quasidifferentiable at a point x i f f is e-quasidifferentiable at x for any positive e. The next theorem gives a criterion for approximate quasidifferentiability. Theorem 1. A function f: tJ ~ El is approximately quasidifferentiable at a point x if
and only if f is directionall): differentiable at x and its directional derivative f " : E, ~ E~ is continuous. Proof. This theorem follows immediately from the fact that the subspace H ( E , ) of difference-sublinear functions is dense in the space ~ ( E , ) of positively homogeneous continuous functions with respect to the norm II h II ~-: max I1~11~1Jh (x)[. From this theorem it follows that any uniformly directionally differentiable function is also approximately quasidifferentiable.
V.. V. Gorokhovik / e-Quasidifferentiability
207
Recall [3, 7, 11] that a f u n c t i o n f i s said to be uniformly directionally ditterentiable at a point x if it is directionally differentiable at x and for any g e E, and any A > 0 there exist ~ > 0 and 3' > 0 such that
It
' ( f ( x + tz) - f ( x ) ) - f ' ( g ) [ < A
for all t e (0, 3') and all z e Sa(g) = {y e E. [ Ily- gll ~ ~}. In particular, any function f that both satisfies the Lipschitz condition at the point x and is directionally differentiable at x is uniformly directionally differentiable at this point. Hence any locally Lipschitzian function that is also directionally differentiable must be a p p r o x i m a t e l y quasiditterentiable. Let a function f : U - E~ be a p p r o x i m a t e l y quasidifferentiable at a point x. We shall use ~ , f ( x ) to denote the set whose elements are e-quasiditterentials of the function f at x. From the definition of a p p r o x i m a t e quasiditterentiability it follows that the set @~f(x) is not e m p t y for any positive e and that ~ , f ( x ) c ~ , f ( x ) , where e~E
t.
The collection ~ f ( x ) = {~,f(x)l e > 0} is called an approximate quasidifferential of the function f at the point x. The intersection ("]~>o ~ , f ( x ) is not e m p t y if and only if the function f is quasiditterentiable at x, and in this case any element of("~,> o ~ f ( x ) is a quasiditterential o f f at x. R e m a r k 1. O u r a p p r o a c h to the notion of a p p r o x i m a t e quasidifferentiability is based on the uniform a p p r o x i m a t i o n of the directional derivative by ditterence-sublinear functions. In this sense it is similar to W a r g a ' s a p p r o a c h to the derivative container [18, 19], which is based on uniform a p p r o x i m a t i o n by C~-functions. As a first step towards a calculus for e-quasidifferentials we shall now derive a chain rule for the c o m p o s i t i o n of e-quasidifferentiable functions. Theorem 2. Let U be an open subset of En and V be an open subset of Era. Consider given functions f ~ : U ~ E l , i ~ l : m , u: V ~ E ~ and a point x ~ U such that y = (fl(x) . . . . ,f,,,(x))e V. Suppose furthermore that the functions .~, i e l : m , are e~quasidifferentiable at the point x and the function u is uniformly directionally differentiable at the point y. Let D , ~ ( x ) ; i e 1 : m, be any e~-quasidifferentials of the functions f~, i e l : m at the point x and D~u(y) be an ~-quasidifferential of the function u at the point y.
Then the composite functions S: x' ~ u ( f l ( x ' ) , . . . , f,, (x') ) is e-quasidifferentiable at the point x for e-= max
Ih,le,+ max
Ae_~;u(y)i=l
A ~ e u ( ) ' ) i=1
and if vectors p = ( v b . . . , v~<~Ai~txi,
IX,Ie,+~
,
max](f;)x(z)[
i--1 Ilzll <~1
~,,,) and Ix = (txl . . . . . izm) satisfy the inequalities
/el:m,
VAeO_;u(y)t_)~u(y)
(5)
V. V. Gorokhovik / e-Quasidifferentiability
208
then D~S(x) = [O~S(x), ~S(x)], where
o_~S(x)= t_)
,
(6)
0,S(x) = x~,,~)[-J{~.~=~(A~- u~)0,,f(x) + ~-~ (/x~- A~)0~,f(x)},
(7)
Z (,~,-~,,)_0~,f,(x)+
A C ~.'~,,u ( y )
i--I
(m-x,)~,f(x) i=l
is an e-quasidifferential of the composite function S at the point x. Proof. Since the function u is uniformly directionally differentiable at the point x and the functions f , i~ 1 : m, are directionally dilterentiable at the point x, then (from [3, 11 ]), the composite function S ( x ' ) = u ( f l ( x ' ) , . . . ,fm(x')) is directionally ditterentiable at the point x, where S'~(z) = u'y((fl)'(z) . . . . . (fm)'(z)), z E E,. Define the ditterence-sublinear functions Pe(y')--- ch(D;u(y))(y'), y'~ E,,, and h~,(z)=-ch(D,,f(x))(z),zcE,,i6l:m, and note that DP,:(O)=D;u(_v) and Dh~,(0) = D~,f(x), i ~ 1 : m. By Zaslavskii's lemma (see [1,5]), the composite function q~:z P e ( h ~ , ( z ) , . . . , h , . ( z ) ) is difference-sublinear and [O_~S(x), 0,S(x)], with ~_~.S(x) and SLS(x) defined by (6) and (7), is a quasiditterential of q, at zero, i.e., q~(z)= max (z, v ) vE.~_,S(x)
max (z, w) Vzc E,. w~;5~S(x)
To prove the theorem we use the inequality [S'(z) - q, (z)l ~
. . . . . (f,,,)'(z)) I
+[P~((f])~(z),...,(f,,,)~(z))-q~(z) I 3zeE,,
(8)
and estimate both terms on the right-hand side. From the definition of aft g-quasiditterential of the function u at y we have [S'(z) - P ~ ( ( f , ) ' ( z ) , . . . , (fm)'(z))[ = [ u ' y ( ( f , ) ' ( z ) , . . . , ( f , . ) ' ( z ) ) - 4a(D~u(y))((f,)'x(z) . . . . , (fm)'(z))] ~< ~ ~ max i=l
Vz
(9)
E~
Ilzll~ 1
Since the inequalities
I~,1~, lizli
~,h~,(z)i=1
i
i~l
1
i =1
\ i =1
/
hold for any ;tl, A z , . . . , )t,, we can use well-known inequalities for the maximum
V. V. G o r o k h o v i k / e-Quasidifferentiability
209
of a sum and of a difference to obtain max _ A i h , , ( z ) - m a x
A~M \i=1
ACM i~l
hi[ei
Ilzll
i=l~Aihe'(z)+m?XM(~' [ ] z l[lhilei) /=l
-<max~ ~ i=l~h / ( f ) ~(_) ' " ~- < max ~.~-M
VzcE.,
where M is an arbitrary subset of E,.. Since
P~(y'l,...,y',)=
max
h~vi- max
A~v'~,
A~7t~u(y) t=l
A~_~iu(y) i=1
Y' = (Y'1. . . . . y'~)cE,,, then from the preceding chain of inequalities (assuming M equals a_~u(y) or ~ t u ( y ) , as a p p r o p r i a t e ) we have IP~((f~)'~(z) . . . . . (f,.)'~(z))-q~(z)[ <~~!!zll
Vz~ Eo,
(10)
where max
max Ac~r
h(~iu(y) i-I
i :1
Then from inequalities (8)-(10) we obtain
]S',,(z)-q~(z)[ <~llzll
Vz~E.,
which proves the theorem. Corollary 1. Let the functions f , i ~ l: m, satisfy the assumptions of Theorem 2. Then (a) the function x'-~ ~ = l Z i f ( x ' ) (x' e U, A1. . . . , Am e El) is e-quasidifferentiable at the point x for e---~--~"=1 and
[A,lei,
D*: (i~_,
A'f ) (x)=i~ "LD.,f~(x)
is an e-quasidifferential of the function at x. (b) the function x'--> maxi~l:m f ( x ' ) is e-quasidifferentiable at the point x for e = maxi~i(x) el, l ( x ) = {i ~ 1: m[fi(x)=maxi~l:,.fi(x)}, and
i(:l:m
L
kiEl(x) \-
'
is an e-quasidifferential of the function at x.
je_l(x)
ie_l(x)
210
V. V. G o r o k h o v i k / e-Quasidifferentiability
(c) the function x ' ~ m i n ~ t : m f ( x ' ) is e-quasidifferentiable at the point x for e = max,~1(x) el, I ( x ) = {i c 1 : m l f ( x ) = m i n i ~ l : m f ( x ) } , and
": :
i~l(x)
li~:l(x)
j~-I(x) jr
is an e-quasidifferential o f the function at x. To prove Corollary 1 we make use of Theorem 2, defining the function u: Em -+ E1 as follows: (a) u(y)=Y~_~ AtYi; (b) u ( y ) = m a x i ~ : , , y ~ ; (c) u ( y ) = m i n ~ c l : , , y , We close this section with the following remark. Remark 2. In those cases where the function f : U ~ E~ is not directionally differenti-
able at the point x, the notions of e-quasidifferentiability and approximate quasidifferentiability may be defined on the basis of other positively homogeneous local approximations. Lower and upper Dini's derivatives [10, 15] are particularly suitable for this purpose. The approach based on Dini's derivatives is described in detail elsewhere [8, 9]. Here we note only that every locally Lipschitzian function is lower and upper approximately quasidifferentiable.
4. Conditions for a local extremum
In this section we shall present conditions for a local extremum expressed in terms of e-quasidifferentials. Let U be an open set in E. and f : U - E1 be a real-valued function with a finite value at a point x ~ U. Let the function f be e-quasidifferentiable at the point x for some e/> O. Theorem 3. l f a function f attains a local minimum ( m a x i m u m ) at a point x then for
any e-quasidifferential D ~ f ( x ) we have -O.f(x) = O_J'(x.) + eB.
(O_~f(x) c - ~ f ( x ) + eB.),
( 11 )
where B. is the closed unit ball in E.. Proof. Since [Igll = maxt,,n,. (g, v) we can rewrite relation (4) from Definition 1 in the following way: max (g, v ) v~_i_Jrf(x)
<~
max
(g, w ) < ~ f ' ( g )
wc~rf(x)+eB"
max v~-i_l.f(.x) § rB,,
(g,v)-
max (g,w) w~_3.f(x)
Vg~E,.
V. V. Gorokhovik / e-Quasidifferentiability
211
I f f attains a local m i n i m u m at a point x then f'(g)>~O V g ~ En. Hence from the right-hand side o f the preceding inequality we have max (g, v)<~
max
(g, v)
V g c E,,
which is equivalent to inclusion (11). The p r o o f for the case o f a local m a x i m u m follows the same lines. Theorem 4. If a function f is uniformly directionally differentiable at a point x and if
for some e-quasidifferential D~f(x) (e >!O) of f a t x there exists a real ~ > e such that -O~f(x) + 8B, c O~f(x)
(~_,f(x) + 8B, ~ g,f(x)),
(12)
then the function f achieves a strict local minimum (maximum) at the point x. Proof. We shall confine ourselves to proving the sufficient condition for a strict local minimum. From condition (12) and the definition o f an e-quasidif[erential we obtain 0<~ max (g, v ) v~O~f(x)
max
(g, w)
wc~,f(x)+,SBn
= max ( g , v ) v~_~ftx)
max w~rf(x)
<~f'(g)-(8-e)llgll
(g,w)-ellgll-(~-~)llgll
Vg~E,.
Hence, we have
f'~(g)>~(,5-e)llgll
VgE E,.
(13)
We shall n o w argue by contradiction. Suppose that f does not attain a strict local m i n i m u m at the point x. Then there exists a sequence xk ~ U, k =.1, 2 , . . . , tending to x and such that f(Xk)<-f(x), k = 1,2 . . . . . Let tk = ][Xk--XH-~, k = 1 , 2 , . . . , and let zk~, s = 1, 2 , . . . , be a subsequence o f the sequence tk(Xk --X), k = 1, 2 , . . . , which converges to a unit vector r Since f is uniformly directionally ditterentiable at x we have f~(r
= lim f ( x + tk zk,) --f(x) = lim f(Xk,) --f(x) ~ 0 s~
tk~
s ~oo
lk ~
which contradicts (13). This completes the proof. Corollary 2 (due to Polyakova [2, 4]). Suppose that afunctionfis quasidifferentiable
at a point x. If the function f attains a local minimum (maximum) at the point x then -~f(x) ~ Of(x)
(~_f(x) ~ Of(x)).
v. v. Gorokhovik/' e-Quasidifferentiability
212
If, in addition, f is uniformly directionally differentiable at x and we have "Of(x) c i n t Of(x)
(Of(x) c i n t -Of(x))
then f attains a strict local minimum (maximum) at the point x. 4.1. Minimization under an inequality constraint Now we consider the problem of minimizing the function f : E. --> E 1 on the set G c E,. Suppose that G = { x ' c E. I u(x') <<-0}, where u: E, ~ E~ is a given real-valued function on E,. Theorem 5. Let x ~ G, f be e~-quasidifferentiable at the point x and u be e2-quasidiffer-
entiable at the point x. I f the function f attains a local minimum on G at the point x then for every e 1-quasidifferential D~if(x) and every eE-quasidifferential D ~ u ( x ) we have -O,,f(x) +-O~2u(x) c co{(_O~f(x) +-O~u(x) + elB,,) (O_~u(x) + O~,f(x) + e2B,,)}.
(14)
Proof. If the function f achieves a local minimum at the point x on the set G then
f'(g)>~O
VgcA(xtG),
(15)
where A ( x I G) is the cone of feasible directions for G at the point x [3]. Since the inclusions {g ~ E~ [ u ' ( g ) < 0} c A ( x I G) c {g E E. l u'~(g) <~0} hold for G = { x ' c E , inequalities
f'x(g)
(16)
lu(x')~O}, then it follows from (15) that the system of u'(g)
(17)
is inconsistent with respect to g on E.. This is associated with the inconsistency of the inequality system
6(D~lf(x))(g)+ell]gll
c~(D~2u(x))(g)+e2llgll
with respect to g for every,D~if(x) and every D~u(x). The inconsistency of the last inequality system is equivalent to the condition
max{~b(D~J(X))(g)+~lllgll,~(D~2u(x))(g)+~llgll}~O
V g e E..
(18)
Now, using the rules of quasiditterential calculus, we can demonstrate that (18) is equivalent to (14). This completes the p r o o f of the theorem. The set G = { x ' ~ E , [ u ( x ' ) ~ O } is said to satisfy the regularity condition for an inequality constraint at a point x if (i) the function u: E, ~ E1 is directionally differentiable at x (ii) there is no point at which the function u~. ' " E~ -~ El attains a local minimum.
V.V. Gorokhovik / e-Quasidifferentiability
213
Because u" : En --> El is positively h o m o g e n e o u s we have u'(go) = 0 at points go ~ E, where u" attains a local m i n i m u m (or m a x i m u m ) . It is now easy to see that (ii) is equivalent to (ii') for any goe E, which satisfies u'(go)=0 and any 3 , > 0 there exists a ~ e E~ such that u ' ( g ) < 0 and Ilg-goll < y. (The regularity condition (ii') was first f o r m u l a t e d in [3]). Let us n o w s u p p o s e that instead of (i) we have (a) the function u: En -~ Et is locally convex [ 11 ] (this means that u is directionally differentiable at x and u" is convex). In this case (ii) is equivalent to the well-known Slater regularity condition: (b) there exists a g c E~ such that u ' ( g ) < 0. It should be noted that (i) is not sufficient for condition (ii) to be equivalent to condition (b). Let us consider the function u:E2oE~, where u ' ( g ) = min{lg~,l, _g<2)}. It is not hard to see that u" satisfies (b) with ~ = (0, 1) and therd is a point go = (0, - 1 ) at which u" achieves a local m i n i m u m . Theorem 6. Suppose that the assumptions of Theorem 5 are satisfied and that the set
Theorem 6. Suppose that the assumptions of Theorem 5 are satisfied and that the set $G$ satisfies the regularity condition for an inequality constraint at the point $x$. If the function $f$ achieves a local minimum at the point $x$ on the set $G$ then for every $\varepsilon_1$-quasidifferential $D_{\varepsilon_1}f(x)$ of the function $f$ at the point $x$ and for every $\varepsilon_2$-quasidifferential $D_{\varepsilon_2}u(x)$ of the function $u$ at the point $x$ we have
\[
\overline{\partial}_{\varepsilon_1} f(x) \subset \bigcap_{v \in \overline{\partial}_{\varepsilon_2} u(x)} \bigl[\underline{\partial}_{\varepsilon_1} f(x) + \varepsilon_1 B_n + \operatorname{cone}\bigl(\underline{\partial}_{\varepsilon_2} u(x) + \varepsilon_2 B_n - \{v\}\bigr)\bigr]. \tag{19}
\]
Here $\operatorname{cone} M$ denotes the closure of the conic hull of the set $M$.

Proof. In the proof of Theorem 5 we showed that for the function $f$ to achieve a local minimum on $G$ at the point $x$ it is necessary that the system of inequalities (16), (17) be inconsistent. Consider the function $\psi(g) \equiv \phi(D_{\varepsilon_1}f(x))(g) + \varepsilon_1\|g\|$, $g \in E_n$. We shall show that if the regularity condition is satisfied then the system
\[
\psi(g) < 0, \qquad u'_x(g) \le 0 \tag{20}
\]
is also inconsistent. Arguing by contradiction, we suppose that there exists a vector $\bar g \in E_n$ such that $\psi(\bar g) = \alpha < 0$, $u'_x(\bar g) \le 0$. Since $f'_x(\bar g) \le \psi(\bar g) < 0$, the inconsistency of the system (16), (17) implies $u'_x(\bar g) = 0$. In view of the regularity condition we can find a sequence $g_k \in E_n$, $k = 1, 2, \dots$, converging to $\bar g$ and such that $u'_x(g_k) < 0$, $k = 1, 2, \dots$. Since the function $\psi$ is continuous we can find a natural number $N$ such that $|\psi(g_k) - \psi(\bar g)| < |\alpha|/2$ for all $k > N$. Thus
\[
\psi(g_k) < \psi(\bar g) + |\alpha|/2 = \alpha/2 < 0, \qquad u'_x(g_k) < 0 \quad \forall k > N,
\]
so that $f'_x(g_k) \le \psi(g_k) < 0$ and $u'_x(g_k) < 0$, which contradicts the fact that the system (16), (17) is inconsistent. Hence, system (20) must be inconsistent.
It follows from the inconsistency of (20) that
\[
\phi(D_{\varepsilon_1}f(x))(g) + \varepsilon_1\|g\| \ge 0
\]
for all $g$ satisfying the inequality
\[
\phi(D_{\varepsilon_2}u(x))(g) + \varepsilon_2\|g\| \le 0.
\]
Continuing the argument as in [1, 2, 5] leads to (19), thus proving the theorem.

Remark 3. The necessary condition presented in [1, 2, 5] for this problem with quasidifferentiable functions $f$ and $u$ is a consequence of Theorem 6.

Remark 4. Condition (14) is a consequence of condition (19). Indeed, noting that the pairs $[\underline{\partial}_{\varepsilon_1}f(x) + \overline{\partial}_{\varepsilon_2}u(x),\ \overline{\partial}_{\varepsilon_1}f(x) + \overline{\partial}_{\varepsilon_2}u(x)]$ and $[\underline{\partial}_{\varepsilon_2}u(x) + \overline{\partial}_{\varepsilon_1}f(x),\ \overline{\partial}_{\varepsilon_2}u(x) + \overline{\partial}_{\varepsilon_1}f(x)]$ are respectively an $\varepsilon_1$-quasidifferential of $f$ and an $\varepsilon_2$-quasidifferential of $u$ at $x$, we can rewrite condition (19) in the following way:
\[
\overline{\partial}_{\varepsilon_1} f(x) + \overline{\partial}_{\varepsilon_2} u(x) \subset \underline{\partial}_{\varepsilon_1} f(x) + \overline{\partial}_{\varepsilon_2} u(x) + \varepsilon_1 B_n + \operatorname{cone}\bigl(\underline{\partial}_{\varepsilon_2} u(x) + \overline{\partial}_{\varepsilon_1} f(x) + \varepsilon_2 B_n - \{w\}\bigr) \tag{21}
\]
for all $w \in \overline{\partial}_{\varepsilon_1} f(x) + \overline{\partial}_{\varepsilon_2} u(x)$. Let $v \in \overline{\partial}_{\varepsilon_1} f(x)$ and $w \in \overline{\partial}_{\varepsilon_2} u(x)$ and let us consider two possible cases:
(a) $v + w \notin \underline{\partial}_{\varepsilon_2} u(x) + \overline{\partial}_{\varepsilon_1} f(x) + \varepsilon_2 B_n$;
(b) $v + w \in \underline{\partial}_{\varepsilon_2} u(x) + \overline{\partial}_{\varepsilon_1} f(x) + \varepsilon_2 B_n$.
Since for any convex compact set $M$ satisfying $0 \notin M$ the conic hull of $M$ is closed, so that it coincides with $\operatorname{cone} M$, in case (a) it follows from (21) that
\[
v + w \in \underline{\partial}_{\varepsilon_1} f(x) + \overline{\partial}_{\varepsilon_2} u(x) + \varepsilon_1 B_n + \operatorname{cone}\bigl(\underline{\partial}_{\varepsilon_2} u(x) + \overline{\partial}_{\varepsilon_1} f(x) + \varepsilon_2 B_n - \{v + w\}\bigr).
\]
Hence there exist $v_1 \in \underline{\partial}_{\varepsilon_1} f(x) + \overline{\partial}_{\varepsilon_2} u(x) + \varepsilon_1 B_n$, $v_2 \in \underline{\partial}_{\varepsilon_2} u(x) + \overline{\partial}_{\varepsilon_1} f(x) + \varepsilon_2 B_n$ and a real number $\lambda \ge 0$ such that
\[
v + w = v_1 + \lambda\bigl(v_2 - (v + w)\bigr).
\]
From this we have
\[
v + w = \frac{1}{1 + \lambda}\, v_1 + \frac{\lambda}{1 + \lambda}\, v_2
\]
and therefore
\[
v + w \in \operatorname{co}\bigl(\bigl(\underline{\partial}_{\varepsilon_1} f(x) + \overline{\partial}_{\varepsilon_2} u(x) + \varepsilon_1 B_n\bigr) \cup \bigl(\overline{\partial}_{\varepsilon_1} f(x) + \underline{\partial}_{\varepsilon_2} u(x) + \varepsilon_2 B_n\bigr)\bigr). \tag{22}
\]
For case (b) the inclusion (22) is trivial. Since $v \in \overline{\partial}_{\varepsilon_1} f(x)$ and $w \in \overline{\partial}_{\varepsilon_2} u(x)$ are arbitrary, inclusion (22) implies that inclusion (14) is true.

We shall now compare the necessary conditions (14) and (19) with other known necessary conditions by considering a particular case.
Example 1. Let us suppose that the function $f$ is locally convex at the point $x$ and let $u(x') = \max_{i \in 1:m} u_i(x')$, where $u_i$ is also locally convex at $x$ for $i \in I(x) = \{i \in 1:m \mid u_i(x) = 0\}$. Then $f$ and $u$ are quasidifferentiable at $x$ and their quasidifferentials are
\[
Df(x) = [\partial f(x), \{0\}], \qquad Du(x) = \bigl[\operatorname{co}\{\partial u_i(x),\ i \in I(x)\}, \{0\}\bigr],
\]
where $\partial f(x)$ and $\partial u_i(x)$, $i \in I(x)$, are respectively the subdifferentials of $f'_x$ and $(u_i)'_x$, $i \in I(x)$, at zero. In this special case condition (14) has the form
\[
0 \in \operatorname{co}\{\partial f(x),\ \partial u_i(x),\ i \in I(x)\},
\]
which is equivalent to the existence of vectors $v \in \partial f(x)$, $v_i \in \partial u_i(x)$, $i \in I(x)$, and real numbers $\lambda_0 \ge 0$, $\lambda_i \ge 0$, $i \in I(x)$, with $\lambda_0 + \sum_{i \in I(x)} \lambda_i = 1$, such that
\[
\lambda_0 v + \sum_{i \in I(x)} \lambda_i v_i = 0.
\]
Since $u'_x(g) = \max_{i \in I(x)} (u_i)'_x(g)$, $g \in E_n$, satisfies condition (a), for the set $G = \{x' \in E_n \mid \max_{i \in 1:m} u_i(x') \le 0\}$ to satisfy the regularity condition at $x$ we require the existence of a vector $\bar g \in E_n$ such that $(u_i)'_x(\bar g) < 0$ for all $i \in I(x)$. For the example under consideration the necessary condition (19) may be stated as follows:
\[
0 \in \partial f(x) + \operatorname{cone}\bigl(\operatorname{co}\{\partial u_i(x),\ i \in I(x)\}\bigr).
\]
This is equivalent to the existence of vectors $v \in \partial f(x)$, $v_i \in \partial u_i(x)$, $i \in I(x)$, and real numbers $\lambda_i \ge 0$, $i \in I(x)$, such that
\[
v + \sum_{i \in I(x)} \lambda_i v_i = 0.
\]
Thus we see that necessary conditions (14) and (19) generalize the Fritz John multiplier rule [12] and the Kuhn-Tucker multiplier rule [13], respectively.
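For concreteness, here is a minimal one-dimensional instance of this reduction; the example is an editorial addition and does not appear in the original text. Take $f(x') = x'$ and $u_1(x') = -x'$ on $E_1$, so that $G = \{x' \in E_1 \mid x' \ge 0\}$ and $f$ attains its minimum on $G$ at $x = 0$.
\[
\partial f(0) = \{1\}, \qquad \partial u_1(0) = \{-1\}, \qquad I(0) = \{1\},
\]
\[
(14)\colon\ 0 \in \operatorname{co}\{1, -1\} = [-1, 1] \ \ (\lambda_0 = \lambda_1 = \tfrac12), \qquad
(19)\colon\ 1 + \lambda_1 \cdot (-1) = 0 \ \ (\lambda_1 = 1),
\]
and the regularity condition holds with $\bar g = 1$, since $(u_1)'_0(\bar g) = -1 < 0$; the two conditions thus reproduce the Fritz John and Kuhn-Tucker multipliers for this problem.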
4.2. Minimization under an equality constraint

We shall now consider the problem of minimizing the real-valued function $f: E_n \to E_1$ subject to the equality constraint $G = \{x' \in E_n \mid u(x') = 0\}$, where $u: E_n \to E_1$ is a real-valued function defined on $E_n$.
The set $G = \{x' \in E_n \mid u(x') = 0\}$ is said to satisfy the regularity condition for an equality constraint at a point $x$ if
(i) the function $u: E_n \to E_1$ is uniformly directionally differentiable at the point $x$, and
(ii) the function $u'_x: E_n \to E_1$ does not achieve a local extremum (maximum or minimum) at any point.
Using the same arguments as in the case of the regularity condition for an inequality constraint, we can see that condition (ii) is equivalent to
(ii') for any $g \in E_n$ satisfying $u'_x(g) = 0$ and any real $\gamma > 0$, there exist $\bar g_1, \bar g_2 \in E_n$ such that $u'_x(\bar g_1) < 0$, $\|\bar g_1 - g\| < \gamma$ and $u'_x(\bar g_2) > 0$, $\|\bar g_2 - g\| < \gamma$.
(A discussion of condition (ii') is given in [3].)

Theorem 7. Suppose that a function $f$ is $\varepsilon_1$-quasidifferentiable at a point $x$ and the constraint $G = \{x' \in E_n \mid u(x') = 0\}$ satisfies the regularity condition for an equality constraint at $x$. If the function $f$ achieves a local minimum on the set $G$ at the point $x$ then for every $\varepsilon_1$-quasidifferential $D_{\varepsilon_1}f(x)$ and for every $\varepsilon_2$-quasidifferential ($\varepsilon_2 > 0$) $D_{\varepsilon_2}u(x)$ of the function $u$ at $x$, the following inclusion holds:
\[
\overline{\partial}_{\varepsilon_1} f(x) \subset \bigcap_{\substack{w \in \overline{\partial}_{\varepsilon_2} u(x) \\ v \in \underline{\partial}_{\varepsilon_2} u(x)}} \bigl\{\underline{\partial}_{\varepsilon_1} f(x) + \varepsilon_1 B_n + \operatorname{cone}\bigl[\bigl(\underline{\partial}_{\varepsilon_2} u(x) + \varepsilon_2 B_n - \{w\}\bigr) \cup \bigl(\overline{\partial}_{\varepsilon_2} u(x) + \varepsilon_2 B_n - \{v\}\bigr)\bigr]\bigr\}. \tag{23}
\]
Proof. Under the regularity condition we have (see [3])
\[
f'_x(g) \ge 0 \quad \text{for all } g \in E_n \text{ such that } u'_x(g) = 0, \tag{24}
\]
since the function $f$ achieves a local minimum on $G$ at the point $x$. We shall now show that for every $w \in \overline{\partial}_{\varepsilon_1} f(x)$ either the system
\[
\max_{v \in \underline{\partial}_{\varepsilon_1} f(x)} \langle g, v\rangle - \langle g, w\rangle + \varepsilon_1\|g\| < 0, \qquad u'_x(g) \le 0 \tag{25}
\]
or the system
\[
\max_{v \in \underline{\partial}_{\varepsilon_1} f(x)} \langle g, v\rangle - \langle g, w\rangle + \varepsilon_1\|g\| < 0, \qquad -u'_x(g) \le 0 \tag{26}
\]
is inconsistent. (Note that a similar idea was used in [17].) Arguing by contradiction, we suppose that both systems are consistent, and let $g^+$ be a solution of system (25) and $g^-$ a solution of (26). Since the function $u'_x(g)$ is continuous with respect to $g$, there exists a real number $\alpha \in (0, 1)$ such that $u'_x(g_\alpha) = 0$, where $g_\alpha = \alpha g^+ + (1 - \alpha) g^-$. From the chain of inequalities
\[
f'_x(g_\alpha) \le \max_{v \in \underline{\partial}_{\varepsilon_1} f(x)} \langle g_\alpha, v\rangle - \langle g_\alpha, w\rangle + \varepsilon_1\|g_\alpha\|
\le \alpha\Bigl(\max_{v \in \underline{\partial}_{\varepsilon_1} f(x)} \langle g^+, v\rangle - \langle g^+, w\rangle + \varepsilon_1\|g^+\|\Bigr)
+ (1 - \alpha)\Bigl(\max_{v \in \underline{\partial}_{\varepsilon_1} f(x)} \langle g^-, v\rangle - \langle g^-, w\rangle + \varepsilon_1\|g^-\|\Bigr) < 0,
\]
which contradicts condition (24), we conclude that at least one of the systems (25) and (26) is inconsistent whenever $w \in \overline{\partial}_{\varepsilon_1} f(x)$. Arguing as in the proof of Theorem 6 we can then show that the inclusion
\[
\overline{\partial}_{\varepsilon_1} f(x) \subset \bigl[\underline{\partial}_{\varepsilon_1} f(x) + \varepsilon_1 B_n + \operatorname{cone}\bigl(\underline{\partial}_{\varepsilon_2} u(x) + \varepsilon_2 B_n - \{w\}\bigr)\bigr] \cup \bigl[\underline{\partial}_{\varepsilon_1} f(x) + \varepsilon_1 B_n + \operatorname{cone}\bigl(\overline{\partial}_{\varepsilon_2} u(x) + \varepsilon_2 B_n - \{v\}\bigr)\bigr]
\]
holds for all $w \in \overline{\partial}_{\varepsilon_2} u(x)$ and $v \in \underline{\partial}_{\varepsilon_2} u(x)$, which is equivalent to condition (23).
Remark 5. The problem of minimization under an equality constraint with quasidifferentiable functions was considered in [1, 16, 17]. The necessary conditions presented in these reports are consequences of (23).
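As an elementary check of condition (23) (an editorial illustration, not part of the original text), take $f(x') = |x'^{(1)}|$ and $u(x') = x'^{(2)}$ on $E_2$, so that $G = \{x' \in E_2 \mid x'^{(2)} = 0\}$ and $f$ attains a local minimum on $G$ at $x = 0$; the pairs displayed below are the obvious quasidifferentials, used here as the $\varepsilon$-quasidifferentials with small $\varepsilon_1 \ge 0$ and $\varepsilon_2 > 0$.
\[
\underline{\partial} f(0) = [-1, 1] \times \{0\}, \quad \overline{\partial} f(0) = \{0\}, \qquad
\underline{\partial} u(0) = \{(0, 1)\}, \quad \overline{\partial} u(0) = \{0\}.
\]
Here $u'_x(g) = g^{(2)}$ has no local extremum, so the regularity condition for an equality constraint is satisfied; and since $0 \in \underline{\partial} f(0)$ and $0 \in \operatorname{cone} M$ for every nonempty $M$, the right-hand side of (23) contains $\overline{\partial} f(0) = \{0\}$ for every admissible choice of $w$ and $v$, so that (23) holds at the minimizer, as Theorem 7 requires.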
Acknowledgements The author is greatly indebted to V.F. Demyanov and A.M. Rubinov for their encouragement. The author also thanks N.S. Soroka for English language assistance and H. Gasking for careful editing.
References
[1] V.F. Demyanov, ed., Nonsmooth Problems of Optimization Theory and Control (in Russian) (Leningrad University Press, Leningrad, 1982).
[2] V.F. Demyanov and L.N. Polyakova, "Minimum conditions for a quasidifferentiable function on a quasidifferentiable set" (in Russian), Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki 20 (1980) 843-856.
[3] V.F. Demyanov and A.M. Rubinov, Approximate Methods in Extremal Problems (American Elsevier, New York, 1970).
[4] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable functionals", Doklady Akademii Nauk SSSR 250 (1980) 21-25. English translation in Soviet Mathematics Doklady 21 (1980) 14-17.
[5] V.F. Demyanov and A.M. Rubinov, "On some approaches to the nonsmooth optimization problem" (in Russian), Ekonomika i Matematicheskie Metody 17 (1981) 1153-1174.
[6] V.F. Demyanov and L.V. Vasiliev, Nondifferentiable Optimization (in Russian) (Nauka, Moscow, 1982).
[7] A.Ja. Dubovitskii and A.A. Milyutin, "Extremum problems in the presence of restrictions", Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki 5 (1965) 395-453. English translation in USSR Computational Mathematics and Mathematical Physics 5 (1965) 1-80.
[8] V.V. Gorokhovik, "On the quasidifferentiability of real-valued functions", Doklady Akademii Nauk SSSR 266 (1982) 1294-1298. English translation in Soviet Mathematics Doklady 26 (1982) 491-494.
[9] V.V. Gorokhovik, "Quasidifferentiability of real-valued functions and local extremum conditions" (in Russian), Siberian Mathematical Journal 25 (3) (1984) 62-70.
[10] A.D. Ioffe, "Calculus of Dini subdifferentials", Nonlinear Analysis: Theory, Methods and Applications 8 (1984) 517-539.
[11] A.D. Ioffe and V.M. Tihomirov, Theory of Extremal Problems (North-Holland, Amsterdam, 1979).
[12] F. John, "Extremum problems with inequalities as subsidiary conditions", in: K.O. Friedrichs, O.E. Neugebauer and J.J. Stoker, eds., Studies and Essays: Courant Anniversary Volume (Wiley Interscience, New York, 1948) pp. 187-204.
[13] H.W. Kuhn and A.W. Tucker, "Nonlinear programming", in: J. Neyman, ed., Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability (University of California Press, Berkeley, CA, 1951) pp. 481-492.
[14] S.S. Kutateladze and A.M. Rubinov, Minkowski Duality and its Applications (in Russian) (Nauka, Siberian Division, Novosibirsk, 1976).
[15] J.-P. Penot, "Calcul sous-differentiel et optimisation", Journal of Functional Analysis 27 (1978) 248-276.
[16] L.N. Polyakova, "On one problem of nonsmooth optimization" (in Russian), Kibernetika 2 (1981) 119-122.
[17] B.N. Pshenichnyi and R.A. Khachatryan, "Equality constraints in nonsmooth optimization problems" (in Russian), Ekonomika i Matematicheskie Metody 18 (1982) 1133-1140.
[18] J. Warga, "Necessary conditions without differentiability assumptions in optimal control", Journal of Differential Equations 18 (1975) 41-62.
[19] J. Warga, "Derivative containers, inverse functions and controllability", in: D.L. Russell, ed., Calculus of Variations and Control Theory (Academic Press, New York, 1976) pp. 13-46.
Mathematical Programming Study 29 (1986) 219-221, North-Holland
APPENDIX
A guide to the bibliography on quasidifferential calculus (January 1985 version)

Quasidifferentiable (q.d.) functions were introduced in 1979 (see [9, 11]). Their properties are described in [12, 15]. Extremal properties involving q.d. functions were first discussed in [8, 25]. Necessary conditions for q.d. functions are examined in [5, 8, 25, 26, 31]. Numerical methods for solving optimization problems described by q.d. functions were first proposed by Sivelina in [32] and later in [6]. New algorithms generalizing the method outlined in [6, 32] have been suggested by Pallaschke [22], and Pallaschke and Recht [23]. (These papers also report on numerical experience.) Another method generalizing that described in [6, 32] is given by Kiwiel [19]. A different approach to the solution of a particular class of quasidifferentiable optimization problems has been proposed by Polyakova [27-29]. Industrial problems formulated in terms of q.d. functions are discussed by Voiton in [33]. Various problems in quasidifferential calculus are treated in [1, 2, 3, 16, 34]. Optimal control problems involving a quasidifferentiable functional and quasidifferentiable right-hand sides are studied in [7, 20, 21]. The infinite-dimensional case is treated in [13, 14]. The concept of ε-quasidifferentiability was suggested by Gorokhovik in [17, 18]. Star-shaped sets and their relations with quasidifferentials are studied by Rubinov and Yagubov in [30]. An application of quasidifferentiable functions to game theory is described by Pechersky in [24]. Reports [1, 5, 7, 10, 16, 21, 24, 27, 28, 30] are earlier versions of some papers included in this Study.
Bibliography on quasidifferential calculus (January 1985)

[1] V.A. Demidova and V.F. Demyanov, "A directional implicit function theorem for quasidifferentiable functions", Working Paper WP-83-125, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983).
[2] V.F. Demyanov, "On a relation between the Clarke subdifferential and the quasidifferential", Vestnik Leningradskogo Universiteta 13 (1980) 18-24. Translated in Vestnik Leningrad University Mathematics 13 (1981) 183-189.
[3] V.F. Demyanov, "Problems of nonsmooth optimization and quasidifferentials" (in Russian), Technical Cybernetics 1 (1983) 9-19.
[4] V.F. Demyanov, ed., Nonsmooth Problems of Control Theory and Optimization (Leningrad University Press, Leningrad, 1982).
[5] V.F. Demyanov, "Quasidifferentiable functions: Necessary conditions and descent directions", Working Paper WP-83-64, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983).
[6] V.F. Demyanov, S. Gamidov and T.I. Sivelina, "An algorithm for minimizing a certain class of quasidifferentiable functions", Working Paper WP-83-122, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983).
[7] V.F. Demyanov, V.N. Nikulina and I.R. Shablinskaya, "Quasidifferentiable problems in optimal control", Working Paper WP-84-2, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984).
[8] V.F. Demyanov and L.N. Polyakova, "Minimization of a quasidifferentiable function on a quasidifferentiable set" (in Russian), Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki 20 (4) (1980) 849-856. Translated in USSR Computational Mathematics and Mathematical Physics 20 (4) (1981) 34-43.
[9] V.F. Demyanov, L.N. Polyakova and A.M. Rubinov, "On one generalization of the concept of subdifferential", in: All-Union Conference on Dynamic Control: Abstracts of Reports (Sverdlovsk, 1979) pp. 79-84.
[10] V.F. Demyanov, L.N. Polyakova and A.M. Rubinov, "Nonsmoothness and quasidifferentiability", Working Paper WP-84-22, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984).
[11] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable functionals", Doklady Akademii Nauk SSSR 250 (1980) 21-25. Translated in Soviet Mathematics Doklady 21 (1980) 14-17.
[12] V.F. Demyanov and A.M. Rubinov, "On some approaches to non-smooth optimization problems" (in Russian), Ekonomika i Matematicheskie Metody 17 (1981) 1153-1174.
[13] V.F. Demyanov and A.M. Rubinov, "Elements of quasidifferential calculus" (in Russian), in [4], pp. 5-127.
[14] V.F. Demyanov and A.M. Rubinov, "On quasidifferentiable mappings", Mathematische Operationsforschung und Statistik, Series Optimization 14 (1) (1983) 3-21.
[15] V.F. Demyanov and L.V. Vasiliev, Nondifferentiable Optimization (in Russian) (Nauka, Moscow, 1981). English translation forthcoming.
[16] V.F. Demyanov and I.S. Zabrodin, "Directional differentiability of a continual maximum function of quasidifferentiable functions", Working Paper WP-83-58, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1983).
[17] V.V. Gorokhovik, "On quasidifferentiability of real-valued functions" (in Russian), Doklady Akademii Nauk SSSR 266 (1982) 1294-1298. Translated in Soviet Mathematics Doklady 26 (1982) 491-494.
[18] V.V. Gorokhovik, "Quasidifferentiability of real-valued functions and local extremum conditions" (in Russian), Siberian Mathematical Journal 25 (1984).
[19] K.C. Kiwiel, "Randomized search directions in descent methods for minimizing certain quasidifferentiable functions", Collaborative Paper CP-84-56, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984).
[20] V.N. Nikulina and I.R. Shablinskaya, "Optimality conditions in control problems with a quasidifferentiable functional" (in Russian), in [4], pp. 175-204.
[21] V.N. Nikulina and I.R. Shablinskaya, "Quasidifferentiable terminal problems of optimal control", in: Abstracts of the IIASA Workshop on Nondifferentiable Optimization: Motivations and Applications (September 17-22, 1984, Sopron, Hungary), International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984) pp. 121-127.
[22] D. Pallaschke, "On numerical experiences with a quasidifferentiable optimization algorithm", in: Abstracts of the IIASA Workshop on Nondifferentiable Optimization: Motivations and Applications (September 17-22, 1984, Sopron, Hungary), International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984) pp. 138-140.
[23] D. Pallaschke and P. Recht, "On the steepest descent method for a class of quasidifferentiable optimization problems", Collaborative Paper CP-84-57, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984).
[24] S.L. Pechersky, "Positively homogeneous quasidifferentiable functions and their applications in cooperative game theory", Collaborative Paper CP-84-26, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984).
[25] L.N. Polyakova, "Necessary conditions for an extremum of quasidifferentiable functions" (in Russian), Vestnik Leningradskogo Universiteta 13 (1980) 57-62. Translated in Vestnik Leningrad University Mathematics 13 (1981) 241-247.
[26] L.N. Polyakova, "On one nonsmooth optimization problem" (in Russian), Kybernetika 3 (1982) 119-122.
[27] L.N. Polyakova, "On the minimization of a quasidifferentiable function subject to equality-type quasidifferentiable constraints", Collaborative Paper CP-84-27, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984).
[28] L.N. Polyakova, "On minimizing the sum of a convex function and a concave function", Collaborative Paper CP-84-28, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984).
[29] L.N. Polyakova, "On the minimization of a quasidifferentiable function subject to an equality-type constraint", in: Abstracts of the IIASA Workshop on Nondifferentiable Optimization: Motivations and Applications (September 17-22, 1984, Sopron, Hungary), International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984) pp. 141-144.
[30] A.M. Rubinov and A.A. Yagubov, "The space of star-shaped sets and its application in nonsmooth optimization", Collaborative Paper CP-84-29, International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984).
[31] A. Shapiro, "On optimality conditions in quasidifferentiable optimization", SIAM Journal on Control and Optimization 23 (4) (1984) 610-617.
[32] T.I. Sivelina, "On the minimization of one class of quasidifferentiable functions" (in Russian), Vestnik Leningradskogo Universiteta 7 (1983) 103-105.
[33] E.F. Voiton, "Quasidifferentiable functions in the problems of optimal synthesis of electric circuits" (in Russian), in [4], pp. 291-307.
[34] Z.Q. Xia, "On mean value theorems in quasidifferential calculus", in: Abstracts of the IIASA Workshop on Nondifferentiable Optimization: Motivations and Applications (September 17-22, 1984, Sopron, Hungary), International Institute for Applied Systems Analysis (Laxenburg, Austria, 1984) pp. 186-189.