Nonsmooth Equations in Optimization: Regularity, Calculus, Methods, and Applications


Author: Diethard Klatte | B. Kummer


Nonsmooth Equations in Optimization

Nonconvex Optimization and Its Applications
Volume 60

Managing Editor: Panos Pardalos

Advisory Board:
J.R. Birge, Northwestern University, U.S.A.
Ding-Zhu Du, University of Minnesota, U.S.A.
C. A. Floudas, Princeton University, U.S.A.
J. Mockus, Lithuanian Academy of Sciences, Lithuania
H. D. Sherali, Virginia Polytechnic Institute and State University, U.S.A.
G. Stavroulakis, Technical University Braunschweig, Germany

The titles published in this series are listed at the end of this volume.

Nonsmooth Equations in Optimization Regularity, Calculus, Methods and Applications

by

Diethard Klatte Institute for Operations Research and Mathematical Methods of Economics, University of Zurich, Switzerland

and

Bernd Kummer Institute of Mathematics, Faculty of Mathematics and Natural Sciences II, Humboldt University Berlin, Germany

KLUWER ACADEMIC PUBLISHERS NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW

eBook ISBN: 0-306-47616-9
Print ISBN: 1-4020-0550-4

©2002 Kluwer Academic Publishers, New York, Boston, Dordrecht, London, Moscow
Print ©2002 Kluwer Academic Publishers, Dordrecht

All rights reserved. No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher.

Created in the United States of America

Visit Kluwer Online at: http://kluweronline.com
and Kluwer's eBookstore at: http://ebooks.kluweronline.com

Contents

Introduction
List of Results
Basic Notation

1 Basic Concepts
  1.1 Formal Settings
  1.2 Multifunctions and Derivatives
  1.3 Particular Locally Lipschitz Functions and Related Definitions
      Generalized Jacobians of Locally Lipschitz Functions
      Pseudo-Smoothness and D°f
      Piecewise Functions
      NCP Functions
  1.4 Definitions of Regularity
      Definitions of Lipschitz Properties
      Regularity Definitions
      Functions and Multifunctions
  1.5 Related Definitions
      Types of Semicontinuity
      Metric, Pseudo-, Upper Regularity; Openness with Linear Rate
      Calmness and Upper Regularity at a Set
  1.6 First Motivations
      Parametric Global Minimizers
      Parametric Local Minimizers
      Epi-Convergence

2 Regularity and Consequences
  2.1 Upper Regularity at Points and Sets
      Characterization by Increasing Functions
      Optimality Conditions
      Linear Inequality Systems with Variable Matrix
      Application to Lagrange Multipliers
      Upper Regularity and Newton's Method
  2.2 Pseudo-Regularity
    2.2.1 The Family of Inverse Functions
    2.2.2 Ekeland Points and Uniform Lower Semicontinuity
    2.2.3 Special Multifunctions
      Level Sets of L.s.c. Functions
      Cone Constraints
      Lipschitz Operators with Images in Hilbert Spaces
      Necessary Optimality Conditions
    2.2.4 Intersection Maps and Extension of MFCQ
      Intersection with a Quasi-Lipschitz Multifunction
      Special Cases
      Intersections with Hyperfaces

3 Characterizations of Regularity by Derivatives
  3.1 Strong Regularity and Thibault's Limit Sets
  3.2 Upper Regularity and Contingent Derivatives
  3.3 Pseudo-Regularity and Generalized Derivatives
      Contingent Derivatives
      Proper Mappings
      Closed Mappings
      Coderivatives
      Vertical Normals

4 Nonlinear Variations and Implicit Functions
  4.1 Successive Approximation and Persistence of Pseudo-Regularity
  4.2 Persistence of Upper Regularity
      Persistence Based on Kakutani's Fixed Point Theorem
      Persistence Based on Growth Conditions
  4.3 Implicit Functions

5 Closed Mappings in Finite Dimension
  5.1 Closed Multifunctions in Finite Dimension
    5.1.1 Summary of Regularity Conditions via Derivatives
    5.1.2 Regularity of the Convex Subdifferential
  5.2 Continuous and Locally Lipschitz Functions
    5.2.1 Pseudo-Regularity and Exact Penalization
    5.2.2 Special Statements for
    5.2.3 Continuous Selections of Pseudo-Lipschitz Maps
  5.3 Implicit Lipschitz Functions on

6 Analysis of Generalized Derivatives
  6.1 General Properties for Abstract and Polyhedral Mappings
  6.2 Derivatives for Lipschitz Functions in Finite Dimension
  6.3 Relations between and
  6.4 Chain Rules of Equation Type
    6.4.1 Chain Rules for and with
    6.4.2 Newton Maps and Semismoothness
  6.5 Mean Value Theorems, Taylor Expansion and Quadratic Growth
  6.6 Contingent Derivatives of Implicit (Multi-)Functions and Stationary Points
    6.6.1 Contingent Derivative of an Implicit (Multi-)Function
    6.6.2 Contingent Derivative of a General Stationary Point Map

7 Critical Points and Generalized Kojima-Functions
  7.1 Motivation and Definition
      KKT Points and Critical Points in Kojima's Sense
      Generalized Kojima-Functions - Definition
  7.2 Examples and Canonical Parametrizations
      The Subdifferential of a Convex Maximum Function
      Complementarity Problems
      Generalized Equations
      Nash Equilibria
      Piecewise Affine Bijections
  7.3 Derivatives and Regularity of Generalized Kojima-Functions
      Properties of N
      Formulas for Generalized Derivatives
      Regularity Characterizations by Stability Systems
      Geometrical Interpretation
  7.4 Discussion of Particular Cases
    7.4.1 The Case of Smooth Data
    7.4.2 Strong Regularity of Complementarity Problems
    7.4.3 Reversed Inequalities
  7.5 Pseudo-Regularity versus Strong Regularity

8 Parametric Optimization Problems
  8.1 The Basic Model
  8.2 Critical Points under Perturbations
    8.2.1 Strong Regularity
      Geometrical Interpretation
      Direct Perturbations for the Quadratic Approximation
      Strong Regularity of Local Minimizers under LICQ
    8.2.2 Local Upper Lipschitz Continuity
      Reformulation of the C-Stability System
      Geometrical Interpretation
      Direct Perturbations for the Quadratic Approximation
  8.3 Stationary and Optimal Solutions under Perturbations
    8.3.1 Contingent Derivative of the Stationary Point Map
      The Case of Locally Lipschitzian F
      The Smooth Case
    8.3.2 Local Upper Lipschitz Continuity
      Injectivity and Second-Order Conditions
      Conditions via Quadratic Approximation
      Linearly Constrained Programs
    8.3.3 Upper Regularity
      Upper Regularity of Isolated Minimizers
      Second-Order Optimality Conditions for Programs
    8.3.4 Strongly Regular and Pseudo-Lipschitz Stationary Points
      Strong Regularity
      Pseudo-Lipschitz Property
  8.4 Taylor Expansion of Critical Values
    8.4.1 Marginal Map under Canonical Perturbations
    8.4.2 Marginal Map under Nonlinear Perturbations
      Formulas under Upper Regularity of Stationary Points
      Formulas under Strong Regularity
      Formulas in Terms of the Critical Value Function Given under Canonical Perturbations

9 Derivatives and Regularity of Further Nonsmooth Maps
  9.1 Generalized Derivatives for Positively Homogeneous Functions
  9.2 NCP Functions
      Case (i): Descent Methods
      Case (ii): Newton Methods
  9.3 The C-Derivative of the Max-Function Subdifferential
      Contingent Limits
      Characterization of for Max-Functions: Special Structure
      Characterization of for Max-Functions: General Structure
      Application 1
      Application 2

10 Newton's Method for Lipschitz Equations
  10.1 Linear Auxiliary Problems
    10.1.1 Dense Subsets and Approximations of M
    10.1.2 Particular Settings and NCP Functions
    10.1.3 Realizations for Functions
  10.2 The Usual Newton Method for
  10.3 Nonlinear Auxiliary Problems
    10.3.1 Convergence
    10.3.2 Necessity of the Conditions

11 Particular Newton Realizations and Solution Methods
  11.1 Perturbed Kojima Systems
      Quadratic Penalties
      Logarithmic Barriers
  11.2 Particular Newton Realizations and SQP-Models

12 Basic Examples and Exercises
  12.1 Basic Examples
  12.2 Exercises

Appendix
      Ekeland's Variational Principle
      Approximation by Directional Derivatives
      Proof of TF = T(NM) = NTM + TNM
      Constraint Qualifications

Bibliography

Index


Introduction

Many questions dealing with solvability, stability and solution methods for variational inequalities or equilibrium, optimization and complementarity problems lead to the analysis of certain (perturbed) equations. This often requires a reformulation of the initial model under consideration. Owing to the specific features of the original problem, the resulting equation is usually either not differentiable (even if the data of the original model are smooth), or it does not satisfy the assumptions of the classical implicit function theorem. This phenomenon is the main reason why a considerable body of analytical tools dealing with generalized equations (i.e., with finding zeros of multivalued mappings) and nonsmooth equations (i.e., the defining functions are not continuously differentiable) has been developed during the last 20 years, under very different viewpoints and assumptions. In this theory, the classical hypotheses of convex analysis, in particular monotonicity and convexity, have been weakened or dropped, and the scope of possible applications seems to be quite large. Briefly, this discipline is often called nonsmooth analysis, sometimes also variational analysis. Our book fits into this discipline; however, our main intention is to develop the analytical theory in close connection with the needs of applications in optimization and related subjects.

Main Topics of the Book

1. Extended analysis of Lipschitz functions and their generalized derivatives, including "Newton maps" and regularity of multivalued mappings.
2. Principle of successive approximation under metric regularity and its application to implicit functions.
3. Characterization of metric regularity for intersection maps in general spaces.
4. Unified theory of Lipschitzian critical and stationary points in optimization, in variational inequalities and in complementarity problems via a particular nonsmooth equation.
5. Relations between this equation and reformulations by penalty, barrier and so-called NCP functions.
6. Analysis of Newton methods for Lipschitz equations based on linear and nonlinear approximations, in particular, for functions having a dense set of points.
7. Consistent interpretation of hypotheses and methods in terms of original data and quadratic approximations.
8. Collection of basic examples and exercises.

Motivations and Intentions

For sufficiently smooth functions, it is clear that many questions discussed in this field become trivial or have classical answers. Even the way of dealing with equations defined by nonsmooth functions seems to be evident: (1) Define a reasonable derivative and prove a related inverse function theorem. Then, as in the Fréchet-differentiable case, (2) derive statements about implicit functions, successive approximation and Newton's method, and (3) develop conditions for characterizing critical points in extremal problems. Of course, this calls for a deeper discussion.

First of all, one has to specify the notion of a derivative. This should be a sufficiently nice function L that approximates the original one, say locally at least of first order, like the usual linearization at some argument in the Fréchet concept.

However, there are many problems when going into the details.

Example 0.1 (piecewise linear Consider the real function if if For the function satisfies Our "linearization" of at the origin is nonlinear, but still has a simple piecewise linear structure. Taking the linearization of should be the usual one, namely, Evidently, in this example, we found a of near which is simpler than the original function But in view of differentiation and inverse mappings, three new problems already arise: What about inverse maps of piecewise linear functions? What about continuity of derivatives or of "linearizations" in terms of Which kind of singularity (critical point) appears at the origin?

Example 0.2 (no piecewise linear For the Euclidean norm on one cannot find any piecewise linear at the origin.

The functions and of these examples are not only of academic interest because they are typical optimal value functions in parametric optimization: with respect to with respect to

(for –1

They may occur, e.g., as objectives or as constraint functions in other optimization problems.

Next, it may happen that one needs different derivatives for different purposes. To illustrate this, we note that there exists a real, strictly monotone, directionally differentiable Lipschitz function such that (i) is on where is a countable set. (ii) The inverse is well-defined and globally Lipschitz. (iii) Newton's method (to determine a zero of ) with start at any point generates an alternating sequence and uses only points in X. Notice that X has full Lebesgue measure. Concerning the construction and further properties of such a function, we refer to Chapter 12, Basic Example BE.1. So, the existence of a Lipschitzian inverse on the one hand and local convergence of Newton's method on the other hand are different things. Indeed, we have to expect and to accept that there are generalized derivatives which allow (for certain nonsmooth functions) the construction of Newton-type solution methods without saying anything about uniqueness and Lipschitz behavior of the inverse, whereas other "derivatives", which characterize the inverse function, are rather inappropriate for Newton-type solution methods. Moreover,

the power of the classical differential calculus lies in the possibility of computing derivatives for the functions of interest. The latter is based on several chain rules. Related rules for composed generalized derivatives of functions or multifunctions are often not true or hold in a weaker form only. Even for rather simple mappings in finite dimensional spaces, it may be quite difficult to determine the limits appearing in an actual derivative-definition. This means an increase in the technical effort. In addition,

everybody has an idea about what tangency is or what a normal cone is. As a result, various more or less useful notions of generalized derivatives have been introduced in the literature, and many relations between them have been shown. Each of these derivatives has its own history and its own motivation by geometrical aspects or by some statement, say by an application. However, these applications and motivations often play a secondary (or no) role in subsequent publications, which are devoted to technical refinements of the calculus, generalization and unification. So, the reader may easily gain the impression that "nonsmooth analysis" is a graph whose vertices are definitions of generalized derivatives and whose edges are interrelations between them. It is hard to see that the graph is indeed something like an electric power network, because the lamps that can be switched on are hidden. In the present book,

as far as general concepts are concerned, we motivate why this or another concept is appropriate (or not) for answering a concrete question, we develop a related theory and indicate possible applications in the context of optimization. We also try to use as few notions of generalized derivatives as possible (only those mentioned below), and we describe necessary assumptions mainly in terms of well-known upper and lower semicontinuity properties. In this way, we hope that every reader who is familiar with basic topological and analytical notions and who is interested in the parametric behavior of solutions to equations and optimization problems (smooth or nonsmooth) or in the theory and methods of nonsmooth analysis itself will easily understand our statements and constructions. As a basic general instrument, we apply Ekeland's variational principle. A second tool consists in a slight generalization of successive approximation, which opens the same applications (by the same arguments) as successive approximation in the usual (single-valued) case, namely implicit function theorems and Newton-type solution methods. Further, as a specific topic of our monograph, we use so-called Kojima-functions (having a nice, simple structure for analytical investigations) in order to characterize crucial points in variational models. For several reasons, but mainly in order to establish tools for studying variational problems with nondata and, closely related, stationary points in optimization, we summarize and extend the calculus of generalized derivatives for locally Lipschitz functions. Finally, we connect generalized Newton-type methods with the continuity of (generalized) differentiability, as in the classical differentiable case; see the concept of Newton maps. Via perturbed Kojima systems, we establish relations to other standard optimization techniques, in particular, to penalty and barrier type methods.
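To make the phrase "Newton-type solution methods" for nonsmooth equations concrete, here is a minimal Python sketch of a Newton iteration on a piecewise linear real function. The function `f`, the slope selection `slope`, and the helper `newton` are illustrative assumptions and are not taken from the book. In particular, this benign example is not the function of Basic Example BE.1, for which the iteration alternates instead of converging.

```python
def f(x):
    """Illustrative piecewise linear function with a kink at 0 (an assumption)."""
    return x + max(x, 0.0) - 1.0


def slope(x):
    """One-sided slope used in place of a derivative.

    At the kink x = 0 we pick the right-hand slope 2; any value in [1, 2]
    would be an admissible choice for this illustration.
    """
    return 2.0 if x >= 0.0 else 1.0


def newton(x, tol=1e-12, max_iter=50):
    """Newton-type iteration x <- x - f(x) / slope(x)."""
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            return x
        x = x - fx / slope(x)
    return x


root = newton(-3.0)
print(root, f(root))
```

Because `f` is strictly increasing with slopes bounded away from zero, the iteration here stops after finitely many steps. The text above stresses that such convergence and the Lipschitz behavior of the inverse are separate questions in general.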
However, the most important tool for understanding nonsmooth analysis with its various definitions and constructions is the knowledge of several concrete functions and examples which show the difficulties and "abnormalities" in comparison with smooth settings. Such examples will be included in all parts of this monograph. The most important ones as well as answers to various exercises are collected in the last chapter. We envision that our book is useful for researchers, graduate students and practitioners in various fields of applied mathematics, engineering, OR and economics, but we think that it is also of interest for university teachers and advanced students who wish to get insights into problems, potentials and recent developments of this rich and thriving area of nonlinear analysis and optimization.

Structure of the Book

In Chapter 1, we start with some basic notation, in particular, with the presentation of certain desirable stability properties: pseudo- (or metric) regularity, strong and upper regularity. We try to find intrinsic conditions, equivalent or sufficient, which (as we hope) make the properties in question more transparent and indicate the relations to other types of stability.

In Chapter 2, we present various conditions for certain Lipschitz properties of multivalued maps and the related types of regularity, we investigate interrelations between them and discuss classical applications such as (necessary) optimality conditions and "stability" in parametric optimization. A great part of this chapter is devoted to pseudo-regularity of multifunctions in Banach spaces, where the use of generalized derivatives is avoided. This approach is based on the observation that the concepts of generalized derivatives which are usually applied for describing this important regularity type (contingent derivatives as well as Mordukhovich's co-derivatives) lead us to conditions that are not necessary even for level set maps of monotone Lipschitzian functionals in separable Hilbert spaces, cf. Example BE.2. Therefore, we present characterizations which directly use Ekeland's variational principle as well as the family of assigned inverse functions. They allow characterizations of pseudo-regularity for the intersection of multifunctions and permit weaker assumptions concerning the image and pre-image space as well. In particular, we reduce the question of pseudo-regularity to the two basic classical problems:
(i) Show the existence of solutions to an equation after small constant perturbations, i.e., provided that and is small, verify that some satisfying exists.
(ii) Estimate the distance for some solution in a Lipschitzian way, i.e., show that there is with such that
Pseudo-regularity requires that, for certain neighborhoods U and V of and respectively, one finds a constant L such that both requirements can be satisfied whenever and We will demonstrate that, under weak hypotheses, it is enough to satisfy (i) and (ii) for all and for satisfying where is some constant depending on and

Chapter 3 is devoted to characterizations of regularity with the help of (generalized) derivatives and may be seen as justification of the derivatives investigated in the current book.

We also intend to motivate why the regularity concepts introduced in the first two chapters are really important. In particular, this will be done in Chapter 4 by showing persistence of regularity with respect to small Lipschitzian perturbations, which has several interesting consequences (e.g. in Section 11.1). Note that we do not aim at presenting a complete perturbation theory for optimization problems and nonsmooth equations; our selection of results is subjective and essentially motivated by the applications mentioned above.

Many general regularity statements can be considerably sharpened for closed multifunctions in finite dimension and for continuous or locally Lipschitz functions sending into itself. So Chapter 5 is devoted to specific properties of such mappings and functions, where we pay attention to statements that are mainly of topological nature and independent of usual derivative concepts. As an essential tool for locally Lipschitz functions, we apply here Thibault's limit sets. In contrast to Clarke's generalized Jacobians, the latter provide us with sufficient and necessary conditions for strong regularity and, even more important, they satisfy intrinsic chain rules for inverse and implicit functions.

Basic tools for dealing with several generalized derivatives will be developed in Chapter 6. Our calculus of generalized derivatives includes chain rules and mean-value statements for contingent derivatives and Thibault's limit sets under hypotheses that are appropriate for critical point theory of optimization problems, where the involved problem-functions are not necessarily Here, we write coderivatives by means of contingent derivatives (which will be computed in Chapter 7), and we also introduce some derivative-like notion called Newton function. It represents linear operators that are of interest in relation to Newton-type solution methods for Lipschitzian equations and describes, in a certain sense, continuous differentiability for non-differentiable functions.
Derivatives for so-called semismooth functions are included in this approach.

Chapter 7 is devoted to stable solution behavior of generalized Kojima-functions. By this approach, we cover in a unified way Karush-Kuhn-Tucker points and stationary points in parametric optimization, persistence and stability of local minimizers, and related questions in the context of generalized equations, of complementarity problems and of equilibria in games. The term Kojima-function has its root in Kojima's representation of Karush-Kuhn-Tucker points as zeros of a particular nonsmooth function. We will see that basic generalized derivatives of such functions can be determined by means of usual chain rules. The properties of these derivatives determine, in a clear analytical way (based on results of Chapter 6), the stable behavior of critical points. In contrast to descriptions of critical points by generalized equations, our approach via Lipschitz equations has three advantages:
(i) The invariance of domain theorem and Rademacher's theorem may be used as additional powerful tools (not valid for multifunctions).
(ii) The classical approach via generalized equations is mainly restricted to systems of the type where varies in and the multifunction is fixed. This means for optimization problems: The involved functions are and the perturbed problem has to have the same structure as the original one. By our approach, for instance, stationary points of the original problem and of assigned penalty or barrier functions may be studied and estimated as zeros of the same perturbed Kojima function (even for involved
(iii) The necessary approximations of the multifunction, when speaking about generalized equations, are now directly determined by the type of derivative we are applying to the assigned Lipschitz equation.

In Chapter 8, the regularity characterizations for zeros of generalized Kojima functions are applied to critical points, stationary solutions and local minimizers of parametric nonlinear programs in finite dimension. The specializations to the case of data, which is well-studied in the optimization literature, are explicitly discussed. In particular, we present second order characterizations of strong regularity, pseudo-regularity and upper Lipschitz stability in this context, give geometrical interpretations, and derive representations of derivatives of (stationary) solution maps. Finally, Taylor expansion formulas for critical value functions are obtained.

In Chapter 9, we regard generalized derivatives of other mappings that are important for the analysis of optimization models, namely of (i) positively homogeneous functions and (ii) Clarke subdifferentials for the maximum of finitely many functions In particular, may be a directional derivative or a so-called NCP-function, used for rewriting complementarity problems in form of nonsmooth equations. We study the latter more extensively in order to show how the properties of determine the behavior of first and second order methods for solving the assigned equations and how related iteration steps can be interpreted in an intrinsic way. The simple derivative defined below plays an essential role in this context.
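The idea of writing Karush-Kuhn-Tucker points as zeros of a nonsmooth function, either through a Kojima-type function or through an NCP-function, can be illustrated on a toy problem. The Python sketch below is only an illustration under stated assumptions: the problem (minimize x^2 subject to 1 - x <= 0), the names `kojima` and `ncp_residual`, and the choice of the `min` function as NCP-function are ours, and the Kojima function is written in one common form with y+ = max(y, 0) and y- = min(y, 0), which may differ in detail from the definitions used in Chapter 7.

```python
def df(x):
    """Gradient of the toy objective f(x) = x**2 (an assumption)."""
    return 2.0 * x


def g(x):
    """Toy inequality constraint g(x) = 1 - x <= 0 (an assumption)."""
    return 1.0 - x


def dg(x):
    """Gradient of g."""
    return -1.0


def kojima(x, y):
    """Kojima-type function: its zeros (x, y) encode KKT points.

    The multiplier is y+ = max(y, 0); the second component forces
    g(x) = y- = min(y, 0) <= 0, so complementarity holds automatically.
    """
    y_plus, y_minus = max(y, 0.0), min(y, 0.0)
    return (df(x) + y_plus * dg(x), g(x) - y_minus)


def ncp_residual(x, lam):
    """KKT system rewritten with the NCP-function min(a, b).

    min(lam, -g(x)) = 0 holds exactly when lam >= 0, g(x) <= 0 and
    lam * g(x) = 0.
    """
    return (df(x) + lam * dg(x), min(lam, -g(x)))


# The KKT point of the toy problem is x = 1 with multiplier 2.
print(kojima(1.0, 2.0), ncp_residual(1.0, 2.0))
```

Both residual maps are Lipschitz but not differentiable everywhere, which is the situation the chapters on generalized derivatives and Newton methods address.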
In view of (ii), it turns out that (which may be seen as a proto-derivative, too) depends on the first and second derivatives of the functions at the reference point only. We will determine the concrete form of in a direct way and establish the relations to C-derivatives of generalized Kojima-functions. Solution methods for general Lipschitzian equations are the subject of Chapter 10. Here, we summarize crucial conditions for superlinear convergence, based on linear and nonlinear auxiliary problems and present typical examples. In this chapter, our subsequent definitions of Newton maps, derivative D° and of locally will be justified from the algorithmic point of view. Moreover, the relations between the regularity conditions, needed for Newton’s method, as well as upper, pseudo- and strong regularity shall be clarified.

In Chapter 11, (generalized) Newton methods will be applied in order to determine Karush-Kuhn-Tucker points of problems. Depending on the reformulation as a (nonsmooth) equation (via NCP- or Kojima-functions) and on the generalized derivative "DF" used, we formulate the related Newton steps in terms of assigned SQP models and of (quadratic) penalty and (logarithmic) barrier settings. The connection of these different solution approaches becomes possible by considering the already mentioned perturbed Kojima functions and by studying the properties of their zeros. Taking the results of Chapter 4 into account, one obtains Lipschitz estimates for solutions assigned to different methods of the mentioned type. From Chapter 10 it is obvious that the are only important for the interpretations in terms of quadratic problems, not for solving according to Chapter 10.

Chapter 12 contains Basic Examples which are used at several places throughout the book, while all numbered Exercises occurring in the first 11 chapters are compiled once more, now accompanied by the answers. In the Appendix, we prove some known basic tools for the convenience of the reader.

Acknowledgements

We would like to thank several colleagues for stimulating discussions and valuable hints during the work on this project, in particular Asen Dontchev, Helmut Gfrerer, Peter Kall, Janos Mayer, Jiri Outrata, Stefan Scholtes and Alexander Shapiro, but especially Peter Fusek, Bert Jongen and Alexander Kruger, who in addition read parts of the manuscript. Our thanks go to Jutta Kerger for her linguistic support and to Ramona Klaass who typed a great part of the manuscript. We are grateful to Panos Pardalos for welcoming our work into the series Nonconvex Optimization and Its Applications and to John Martindale from Kluwer for his editorial help.

List of Results

Introduction
Example 0.1 (piecewise linear
Example 0.2 (no piecewise linear

1. Basic Concepts
Remark 1.1 (derivatives of the inverse)
Lemma 1.2 (composed maps)
Example 1.3 (regularity for functions)
Example 1.4 (pseudo-regular, but not strongly regular)
Example 1.5 (strong regularity for continuous functions)
Example 1.6 (pseudo-regularity for linear operators)
Example 1.7 (Graves-Lyusternik theorem)
Example 1.8 (subdifferential of the Euclidean norm)
Example 1.9 ( is u.s.c., but not l.s.c.)
Lemma 1.12 (metrically regular = pseudo-regular)
Example 1.13 (pseudo-Lipschitz, but not locally upper Lipschitz)
Example 1.14 (the inverse of Dirichlet's function)
Theorem 1.15 (Berge/Hogan stability)
Theorem 1.16 (stability of CLM sets)

2. Regularity and Consequences
Lemma 2.1 (upper Lipschitz and describing Lipschitz functionals)
Theorem 2.4 (the max-form for intersections)
Theorem 2.5 (calm intersections)
Theorem 2.6 (free local minima and upper Lipschitz constraints)
Lemma 2.7 (Hoffman's lemma)
Lemma 2.8 (Lipschitz u.s.c. linear systems)
Corollary 2.9 (Lipschitz u.s.c. multipliers)
Theorem 2.10 (selection maps and optimality condition)
Remark 2.11 (inverse families and pseudo-regularity)
Theorem 2.12 (Ekeland's variational principle)
Lemma 2.13 (proper multifunctions)
Lemma 2.14 (pseudo-regularity for proper mappings)
Example 2.15 (F is not pseudo-regular)
Theorem 2.16 (pseudo-regularity of proper mappings with closed ranges)
Theorem 2.17 (basic equivalences, proper mappings)
Lemma 2.18 (pseudo-singular level sets of l.s.c. functionals)
Lemma 2.19 (Ekeland-points of norm-functionals in a real Hilbert space)
Lemma 2.20 (pseudo-singular cone constraints)
Lemma 2.21 (pseudo-singular equations)
Theorem 2.22 (intersection theorem)
Remark 2.23 (estimates)
Corollary 2.24 (intersection with level set)
Corollary 2.25 (finite sets of directions)
Theorem 2.26 (intersection with hyperfaces)
Corollary 2.27 (Lipschitz equations)

3. Characterizations of Regularity by Derivatives
Lemma 3.1 (strong regularity for multifunctions)
Lemma 3.2 (upper regularity)
Lemma 1.10 (pseudo-regularity at isolated zeros)
Remark 1.11 (pseudo-regularity and Lipschitz continuity)
Corollary 3.3 (pseudo-regularity if CF is linearly surjective 1)
Theorem 3.4 (basic equivalences, closed mappings)
Theorem 3.5 (pseudo-regularity if CF is linearly surjective 2)
Theorem 3.7 (injectivity of co-derivatives and pseudo-regularity)
Theorem 3.11 (vertical normals and regularity)

4. Nonlinear Variations and Implicit Functions
Theorem 4.1 (persistence under variations)
Theorem 4.2 (successive approximation)
Theorem 4.3 (estimates for variations in )
Corollary 4.4 (pseudo- and strong regularity w.r. to )
Theorem 4.5 (persistence of upper regularity)
Lemma 4.6 (lsc. and isolated optimal solutions)
Corollary 4.7 (pseudo-Lipschitz and isolated optimal solutions)
Theorem 4.8 (growth and upper regularity of minimizers)
Theorem 4.9 (estimate of solutions)
Theorem 4.11 (the classical parametric form)

5. Closed Mappings in Finite Dimension
Theorem 5.1 (regularity of multifunctions, summary)
Theorem 5.2 (CF and D*F)
Theorem 5.3 (conv CF)
Theorem 5.4 (regularity of the convex subdifferential)
Theorem 5.6 (pseudo-regularity and exact penalization)
Theorem 5.7 (pseudo-regular & upper regular)
Lemma 5.8 (continuous selections of the inverse map)
Lemma 5.9 (convex pre-images)
Theorem 5.10 (equivalence of pseudo- and strong regularity, bifurcation)
Theorem 5.12 (isolated zeros of Lipschitz-functions,
Theorem 5.13 (inverse functions and
Theorem 5.14 (inverse functions and
Theorem 5.15 (implicit Lipschitz functions)

6. Analysis of Generalized Derivatives Theorem 6.4 Theorem 6.5 Theorem 6.6 Theorem 6.8 Theorem 6.11 Theorem 6.14 Theorem 6.15 Lemma 6.16 Lemma 6.17 Theorem 6.18 Theorem 6.20 Corollary 6.21 Theorem 6.23 Theorem 6.26 Theorem 6.27 Theorem 6.28

(polyhedral mappings) and and generalized Jacobians) (partial derivatives for (partial derivatives for (existence and chain rule for Newton functions) (semismoothness; Mifflin) (selections of (special locally functions) (Newton maps of ) ( expansion) (quadratic growth on a neighborhood) (quadratic growth at a point) (C-derivative of the implicit function) (the case of pseudo-Lipschitz S) (C–derivatives of stationary points, general case)

7. Critical Points and Generalized Kojima–Functions Lemma 7.1 Lemma 7.3 Lemma 7.4 Theorem 7.5 Theorem 7.6

(necessity of LICQ for pseudo-regularity)

(TN, CN) (N simple, and further properties) (TF, CF; product rules) (TF, CF; explicit formulas)


Lemma 7.7 Theorem 7.8 Corollary 7.13 Corollary 7.14 Lemma 7.15 Lemma 7.16 Lemma 7.17 Lemma 7.18 Lemma 7.19 Lemma 7.20 Theorem 7.21 Corollary 7.22


(subspace property of

)

( strong regularity and local u.L. behavior) (difference between TF and CF) (Newton’s method under strong regularity) (strong regularity of an NCP) (transformed Newton solutions) (invariance when reversing constraints) (deleting constraints with zero LM, pseudo-regular) (deleting constraints with zero LM, not strongly regular) (reduction for data) ( pseudo-regular = strongly regular)

8. Parametric Optimization Problems
Theorem 8.2 (strongly regular critical points)
Remark 8.3 (necessity of LICQ, variation of )
Corollary 8.4 (nonlinear variations, strongly regular)
Remark 8.5 (strong stability in Kojima’s sense)
Corollary 8.6 (geometrical interpretation, strongly regular)
Theorem 8.10 (strongly regular local minimizers)
Theorem 8.11 (locally u.L. )
Corollary 8.13 (nonlinear variations, u.L.)
Lemma 8.15 (auxiliary problems)
Corollary 8.16 (geometrical interpretation, u.L.)
Theorem 8.19 (CX under MFCQ)
Theorem 8.24 (locally u.L. stationary points)
Corollary 8.25 (second–order sufficient condition)
Theorem 8.27 (quadratic approximations)
Lemma 8.31 (upper regularity implies MFCQ)
Lemma 8.32 (u.s.c. of stationary and optimal solutions)
Theorem 8.33 (upper regular minimizers, )
Theorem 8.36 (upper regular minimizers, )
Corollary 8.37 (necessary condition for strong regularity, case)
Theorem 8.38 (local minimizer and quadratic growth, case)
Theorem 8.39 (TX under MFCQ)
Lemma 8.41 (TF-injectivity w.r. to )
Theorem 8.42 (TF and pseudo-regularity of X)
Theorem 8.43 ( derivatives of marginal maps)
Theorem 8.45 ( for nonlinear perturbations I)
Theorem 8.47 ( for nonlinear perturbations II)


9. Derivatives and Regularity of Further Nonsmooth Maps
Lemma 9.1 ( and for positively homogeneous functions)
Lemma 9.2 (NCP: minimizers and stationary points)
Lemma 9.3 (limits of for pNCP)
Theorem 9.4 (particular structure of for max-functions)
Theorem 9.7 (general structure of )
Corollary 9.9 (reformulation 1)
Corollary 9.10 (reformulation 2)

10. Newton’s Method for Lipschitz Equations
Lemma 10.1 (convergence of Newton’s method - I)
Theorem 10.5 (regularity condition (10.4) for NCP)
Theorem 10.6 (uniform regularity and monotonicity)
Theorem 10.7 (convergence of Newton’s method - II)
Theorem 10.8 (the condition (CA))
Theorem 10.9 (the condition (CI))

11. Particular Newton Realizations and Solution Methods
Theorem 11.1 (perturbed Kojima–systems)
Lemma 11.3 (Newton steps with perturbed F)
Lemma 11.4 (Newton steps with pNCP)
Lemma 11.5 (Newton steps with perturbed )
Lemma 11.6 (condition (CI) in Wilson’s method)

12. Basic Examples and Exercises
Example BE.0 (pathological real Lipschitz map: lightning function)
Example BE.1 (alternating Newton sequences for real, Lipschitz )
Example BE.2 (level sets in a Hilbert space: pseudo-regularity holds, but the sufficient conditions in terms of contingent derivatives and coderivatives fail)
Example BE.3 (piecewise linear bijection of with )
Example BE.4 (piecewise quadratic function having pseudo-Lipschitz stationary points being not unique)
Example BE.5 (Lipschitz function: directional derivatives nowhere exist, and contingent derivatives are empty)
Example BE.6 (convex non-differentiable on a dense set)


Appendix
Theorem A.1 (Ekeland’s variational principle: proof)
Lemma A.2 (approximation by directional derivatives 1)
Lemma A.3 (approximation by directional derivatives 2)
Lemma A.5 (descent directions)
Lemma A.7 (Gauvin’s theorem and Kyparisis’ theorem)

Basic Notation (in the order of first occurrence in the text) Section 1.1 (or B): the closed unit ball in X (or B°): the open unit ball in X X*: dual of X X × Y (or (X, Y)): product of sets X, Y canonical bilinear form of X* × X the reals

Minkowski operations for C, convention even if is a metric space bd M, cl M, int M, conv M: boundary, closure, interior, convex hull of M similarly: etc. dist point-to-set distance in a metric space near (the statement holds) in some neighborhood of is a locally Lipschitz function from X to Y has continuous first (Fréchet-) derivatives has locally Lipschitz first (Fréchet-) derivatives has continuous first and second (Fréchet-) derivatives for defined componentwise for the real o-type and O-type functions Section 1.2 multi-valued map (multifunction) from X to Y gph F: graph of F dom F: domain of F F(A): image of A under F inverse multifunction of F xxv


contingent derivative of F at in direction Thibault’s limit set of F at in direction coderivative of F at in direction the weak* convergence (one-sided) directional derivatives of at in direction Clarke’s directional derivative of a functional at in direction (usual convex) subdifferential of a functional at Clarke-subdifferential of a functional at

Section 1.3 (Fréchet-) derivative of at second (Fréchet-) derivative of at B-subdifferential for locally Lipschitz Clarke’s generalized Jacobian for locally Lipschitz another subdifferential piecewise is a function constructed by NCP: nonlinear complementarity problem

Sections 1.4–1.6 MFCQ: Mangasarian-Fromovitz constraint qualification l.s.c.: lower semicontinuous u.s.c.: upper semicontinuous lim sup upper Hausdorff limit of a set sequence lim inf lower Hausdorff limit of lim lim upper and lower limits of a multifunction CLM set: complete local minimizing set epi epigraph of a functional

Section 2.1 describes

near

the non-negative real A transposed or partial (Fréchet-) derivative of with respect to or partial second derivative of with respect to and

Section 2.2 set of all local Ekeland points with factor dim Y: dimension of Y


Gfrerer’s DIST function Section 4.0

composition of

and

Section 6.1 contingent cone (Bouligand cone) Clarke’s tangent cone tangent cone related to the Thibault derivative projection map map of normals Sections 6.3–6.6 A is exposed element of M partial Thibault derivative of with respect to is a locally function ker G: kernel of G Sections 7.1–7.3 generalized Kojima-function LICQ: Linear Independence constraint qualification SMFCQ: strict Mangasarian-Fromovitz constraint qualification if if

if if if if if if respectively respectively several critical cones cones related to the T– and C–stability systems, respectively Section 8.1

for


for for critical point map stationary solution map multiplier map

Section 8.2 (SOC): second-order condition (SSOC): strong second-order condition Section 9.2 pNCP: cone of pNCP functions

Chapter 1

BASIC CONCEPTS

1.1 Formal Settings

Given a (real) normed space X, we denote by $B_X$ and $B_X^\circ$ the closed and open unit ball, respectively. If the space is obvious, we omit the subscript. The normed spaces under consideration are always real normed spaces. The canonical bilinear form on the product space X* × X is denoted by $\langle \cdot, \cdot \rangle$, where X* denotes the dual of X. For $C, D \subset X$ and $r \in \mathbb{R}$ we write $C + D$ and $rC$ in the sense of the Minkowski operations. We also often identify a set consisting of a single element with its element. So, for $x \in X$ and $D \subset X$ we write $x + D$ instead of $\{x\} + D$. In particular, $x + rB$ is the closed ball with centre $x$ and radius $r$. This notation will also be used for metric spaces.

In the suitable context, we denote by bd M, cl M, int M, conv M the boundary, closure, interior and the convex hull of a given set M, respectively. To write compactly sets which result from certain operations, we use symbols of the kind etc. to denote the sets etc. (for some set D under consideration and in a well–defined setting).

Given a metric space $(X, d)$, the point–to–set distance is denoted by $\operatorname{dist}(x, A)$, with the convention $\operatorname{dist}(x, \emptyset) = \infty$. We say that some statement holds near $\bar{x}$ if it holds for all $x$ in some neighborhood of $\bar{x}$. For metric spaces X and Y, we indicate by the symbol $f \in C^{0,1}(X, Y)$ that $f: X \to Y$ is a locally Lipschitz function, i.e., for each $\bar{x} \in X$ there are a neighborhood $\Omega$ of $\bar{x}$ and a constant L such that

$$d(f(x'), f(x'')) \le L\, d(x', x'') \quad \text{for all } x', x'' \in \Omega.$$

The constant L is said to be a Lipschitz rank (or Lipschitz modulus) of $f$ near $\bar{x}$. For Banach spaces X and Y, $f \in C^{1}(X, Y)$ [$f \in C^{1,1}(X, Y)$] indicates that $f$ is a function having continuous [locally Lipschitz] first Fréchet-derivatives. Similarly, $f \in C^{2}(X, Y)$ means that $f$ is a function having continuous first and second Fréchet-derivatives. An optimization problem defined by functions is said to be a

Often, we will assign, to some sequence of real certain elements To indicate that the elements converge to we write in M. In order to avoid unnecessary indices, we will also speak of sequences of real, converging and assigned points So the symbol is not reserved for a continuous quantity, a priori. For real we put and For we define and componentwise. As usual, o-type functions are assumed to satisfy as and o(0) = 0, while denotes a vanishing function as and o(0) = 0.

1.2 Multifunctions and Derivatives

The symbol says that F is a multi-valued map (multifunction), defined on X with We abbreviate: gph the graph of F, the domain of F, and the image of Most of the multifunctions considered in this monograph will assign, to certain parameters, feasible sets of optimization problems or solutions of equations. If, for some neighborhood U of F(U) is contained in a compact (bounded) set C, then F is said to be locally compact (locally bounded) near If gph F is closed in the product space X × Y, then F is said to be closed. The inverse is given by For normed spaces X and Y and we associate with F the following maps: CF defined by if there are certain (discrete) and assigned elements such that defined by if there are certain (discrete) assigned points with and elements such that defined by if there are certain (discrete) assigned points in gph F and dual elements in X* × Y * such that if and where is the weak* convergence. Notice that is an existence condition: For all sequences in gph F and there are with and in gph F such that, for sufficiently


large

The mapping CF is the contingent derivative [AE84], also called graphical derivative or Bouligand derivative (since its graph is the contingent cone introduced by Bouligand [Bou32]), D*F is (up to a sign) the coderivative in the sense of Mordukhovich [Mor93], and TF is Thibault’s limit set; it was defined in [RW98] and was called strict graphical derivative there. Note that we prefer to use the name Thibault's limit set (or Thibault derivative) for TF since this derivative was first considered (however, for and with another notation) by Thibault [Thi80] and [Thi82]. To unify the terminology, we call all these mappings generalized derivatives.

Remark 1.1 (derivatives of the inverse). For each of these generalized derivatives, the symmetric definitions imply that the inverse of the derivative is just the derivative of the inverse at corresponding points.

As usual, we will say that a derivative is injective if the origin belongs only to the image of or respectively. For functions F, we have and may write CF TF and D*F(x). Nevertheless, the images of the derivatives as well as the pre-images may be multivalued or empty.

If the (one-sided) limit $\lim t^{-1}[F(x + tu) - F(x)]$ exists uniquely for a function F and all sequences $t \downarrow 0$, then it is called the directional derivative of F at $x$ in direction $u$ and denoted by $F'(x; u)$. Further, for a locally Lipschitz functional $f: X \to \mathbb{R}$, Clarke’s directional derivative of $f$ at $x$ in direction $u$ is defined by the usual limit superior

$$f^{\circ}(x; u) = \limsup_{y \to x,\ t \downarrow 0} t^{-1}\,[f(y + tu) - f(y)],$$

which is obviously finite for locally Lipschitz functions. The set of all $x^* \in X^*$ such that

$$f(y) \ge f(x) + \langle x^*, y - x \rangle \quad \text{for all } y \in X$$

is called the (usual convex) subdifferential of $f$ at $x$. The set of all $x^* \in X^*$ such that

$$\langle x^*, u \rangle \le f^{\circ}(x; u) \quad \text{for all } u \in X$$

is called the Clarke-subdifferential of $f$ at $x$. It coincides with the subdifferential of $f$ at $x$, and $f^{\circ}(x; u) = f'(x; u)$ holds, for convex $f$.

4

1. Basic Concepts

1.3 Particular Locally Lipschitz Functions and Related Definitions Let

be locally Lipschitz.

Generalized Jacobians of Locally Lipschitz Functions

By Rademacher’s theorem (for proofs see, e.g., [Fed69, Har79, Zie89]), the set the Fréchet derivative of

exists at }

has full Lebesgue measure, i.e., Moreover, for near the norm of Df is bounded by a local Lipschitz rank L of facts ensure that the mapping defined by

and These

for certain has non-empty images. In addition, one easily sees that is closed and locally compact. The same properties are induced for the map defined by the generalized Jacobian of at These observations, along with an elaborated calculus for which includes an inverse-function theorem, cf. Theorem 5.13 as well as close connections to (several) directional derivatives, in particular for real-valued are the fundamentals of F.H. Clarke’s [Cla83] concept of nonsmooth analysis. The latter equation induces that, for and convex there is no difference between the classical subdifferential of convex analysis and the generalized Jacobian In the literature, the mapping is often called the B-subdifferential and also denoted by Pseudo-Smoothness and D°f

Next let us copy Clarke’s definition. We put exists and is continuous near }. and

for certain Evidently, In contrast to the pair the pair fulfills on the open set However, and D°f(x) may be empty, cf. the real Lipschitz function G in Example BE.0 where also is a constant, proper interval. If is dense in we call pseudo-smooth. The function of Example BE.1 obeys this property and satisfies additionally as well as
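As a numerical illustration of the limit construction behind the B-subdifferential and the generalized Jacobian introduced earlier in this section, the following Python sketch samples derivatives of the real function f(x) = |x| at points near 0 where f is differentiable. The helper names `derivative_samples` and `clarke_interval`, the sampling radius and the finite-difference step are assumptions made for this illustration only; finitely many samples can suggest, but never prove, what the limit set is.

```python
def derivative_samples(f, x_bar, radius=1e-3, n=200, h=1e-8):
    """Collect central-difference slopes of f at points near x_bar.

    The point x_bar itself is skipped, since f need not be differentiable there.
    Slopes are rounded so that numerically equal values are merged.
    """
    slopes = set()
    for i in range(1, n + 1):
        for sign in (-1.0, 1.0):
            x = x_bar + sign * radius * i / n
            slope = (f(x + h) - f(x - h)) / (2.0 * h)
            slopes.add(round(slope, 6))
    return slopes


def clarke_interval(slopes):
    """For a real function of one variable, the convex hull of finitely many
    slopes is the interval between the smallest and the largest one."""
    return (min(slopes), max(slopes))


# Sampled derivative values of |x| near 0: only the slopes -1 and +1 occur.
limits = derivative_samples(abs, 0.0)
# Their convex hull is the interval [-1, 1].
hull = clarke_interval(limits)
```

For f(x) = |x| the sampled set consists of the two slopes -1 and 1, and its convex hull is [-1, 1], which is also the convex subdifferential of |x| at 0.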

Piecewise $C^1$ Functions

The class of (piecewise ) is defined in the following way. Given a locally Lipschitz function from to one says that belongs to if there is a finite family of functions such that, for all the set is not empty. We will also write of active functions at

The set characterizes the set The generalized Jacobian of has the representations

see [Sch94], and and see [Kum88a]. Note that so the first index set of related

for certain

if and only if may be smaller.

coincides with

near

Note. If one defines in the same way, but requiring the weaker assumption that is only continuous (instead of being locally Lipschitz), then one obtains the same class of functions, see [Hag79]. Obviously, the maximum-norm of is a function, not so the Euclidean norm. Every is pseudo-smooth since contains the open and dense set cf. the proof of Lemma 6.17. Convex functions are not necessarily pseudo-smooth, cf. Example BE.6. Nevertheless, in many applications, they are even

NCP Functions

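An NCP function is any function $G: \mathbb{R}^2 \to \mathbb{R}$ such that

$$G^{-1}(0) = \{(s, t) \in \mathbb{R}^2 \mid s \ge 0,\ t \ge 0,\ st = 0\}.$$

Such functions are connected with nonlinear complementarity problems (NCPs): Given $u, v: \mathbb{R}^n \to \mathbb{R}^n$, find $x \in \mathbb{R}^n$ such that

$$u(x) \ge 0, \quad v(x) \ge 0, \quad \langle u(x), v(x) \rangle = 0.$$

Using G, the NCP can be written as an equation $f(x) = 0$ by setting $f_i(x) = G(u_i(x), v_i(x))$, $i = 1, \dots, n$.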

We will say that the NCP is (strongly) monotone if where

is a fixed constant. A standard NCP is defined by
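The reformulation of a complementarity problem as an equation can be sketched in Python. The sketch assumes the usual characterization of an NCP function (G(s, t) = 0 exactly when s ≥ 0, t ≥ 0 and st = 0); the two concrete choices below, the minimum function and the Fischer-Burmeister function, are standard examples from the literature and are not taken from this section, and the functions `u` and `v` are hypothetical test data.

```python
import math


def g_min(s, t):
    """Minimum function: zero exactly on the complementarity set."""
    return min(s, t)


def g_fb(s, t):
    """Fischer-Burmeister function sqrt(s^2 + t^2) - s - t."""
    return math.hypot(s, t) - s - t


def is_complementary(s, t):
    """Membership test for the set {s >= 0, t >= 0, s*t = 0}."""
    return s >= 0 and t >= 0 and s * t == 0


def ncp_residual(G, u, v, x):
    """Componentwise residual f_i(x) = G(u_i(x), v_i(x)) of the NCP."""
    return [G(ui, vi) for ui, vi in zip(u(x), v(x))]


# Hypothetical data: u(x) = x and v(x) = (x_1 - 1, x_2 + 2).
def u(x):
    return list(x)


def v(x):
    return [x[0] - 1.0, x[1] + 2.0]


residual_at_solution = ncp_residual(g_min, u, v, [1.0, 0.0])
residual_elsewhere = ncp_residual(g_min, u, v, [0.0, 0.0])
```

At x = (1, 0) both components of the residual vanish, so x solves the complementarity problem for this test data; at x = (0, 0) the first component equals min(0, -1) = -1, which signals the violated condition v_1(x) ≥ 0.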

1.4 Definitions of Regularity

Our main subject is the equation where

is a locally Lipschitz function.

Its inverse is a multifunction with possibly empty images We shall be interested in the local properties of near a pair Generally, we will speak of regularity whenever is non-empty for near The type of regularity (strong, pseudo, upper) will be concerned with Lipschitz properties of only. So it does not make any difference whether F is a function or a multifunction having images and the inverse Moreover, the requirements related to make sense for any multifunction F acting between metric spaces X and Y. In particular, Y may be a subset of e.g. which already ensures that is non-empty. Therefore, we present the corresponding definitions, as usual, in this generality.

Definitions of Lipschitz Properties

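Let X and Y be metric spaces, $S: Y \rightrightarrows X$ and $(\bar{y}, \bar{x}) \in \operatorname{gph} S$.

(D1) The mapping S is said to be pseudo-Lipschitz (with rank L) at $(\bar{y}, \bar{x})$ if there are neighborhoods U and V of $\bar{x}$ and $\bar{y}$, respectively, such that, given any points $(y, x) \in \operatorname{gph} S \cap (V \times U)$ and $y' \in V$,

$$\text{there exists some } x' \in S(y') \text{ such that } d(x', x) \le L\, d(y', y). \tag{1.1}$$

(D2) Similarly, if U, V and L exist in such a manner that

$$S(y) \cap U \subset \bar{x} + L\, d(y, \bar{y})\, B \quad \text{for } y \in V, \tag{1.2}$$

then S is called locally upper Lipschitz (briefly locally u.L.) at $(\bar{y}, \bar{x})$ with rank L.

In many papers, condition (1.1) is written in a weaker form, namely as

$$\operatorname{dist}(x, S(y')) \le L\, d(y', y). \tag{1.3}$$

Here, dist is the point-to-set distance in X, as defined above. Having (1.3) one can satisfy (1.1) with any $L' > L$. In this sense, the conditions (1.1) or (1.3) are equivalent. We will prefer the condition (1.1). If S is a function, then definition (D1) simply claims Lipschitz continuity on some neighborhood of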


The notion pseudo-Lipschitz was introduced in [Aub84, AE84]; it is also called the Aubin property [RW98]. It is well-known from Robinson’s [Rob76a] work that a finite-dimensional system has, at and for a pseudo-Lipschitzian solution map if and only if the Mangasarian–Fromovitz constraint qualification (MFCQ) [MF67] is satisfied:

(MFCQ) has full rank and there is some u such that and

see also §2.2.4 below. The solution map S defined by (1.4) is crucial for various properties of an optimization problem

Regularity Definitions

Let

be the inverse of a given multifunction

If S is pseudo-Lipschitz at then F is called pseudo-regular at If, additionally, neighborhoods U and V of and respectively, exist in such a way that is single-valued for then we call F strongly regular at Finally, if S is locally upper Lipschitz at and is non-empty for all then F is said to be upper regular at In every case, one says that L is a rank of (the related) regularity. To distinguish the defining neighborhoods assigned to different regular maps F and G at points and we write and respectively, and to quantify these neighborhoods, we denote by (similarly ) some positive constant such that

is satisfied, where we recall the convention X is a metric space.

if

Pseudo-regularity means that, locally around a Lipschitzian error estimate holds true. Having a solution x to one finds some satisfying the perturbed inclusion with a (small) distance Identifying the mappings S from (1.4), this ensures

Evidently, condition (D1) remains true after changing the point gph S. So, (D1) is a property which concerns the Lipschitz behavior of S near Moreover, as a direct application of the definition only, one sees that pseudo-regularity is persistent with respect to composition of maps, and upper regularity shows the same property after a natural modification.


Lemma 1.2 (composed maps).

If

and are pseudo-regular at and respectively, then as H(x) = F(G(x)) is pseudo-regular at If G and F are upper regular at the given points, then as is upper regular at for sufficiently small neighborhoods of In addition, the following estimates hold. Let and be the assigned neighborhoods with related constants according to (1.5), and let be related ranks of regularity. Then, in both cases, is a rank of regularity for H, and related neighborhoods may be defined as follows: and

provided that Proof. (i) By the choice of

we ensured that and

Let fulfills Since finds some

and and and

we find So we have satisfying

be given. We show that some in such a way that Since one Hence,

Using

next, some fulfills This yields By pseudo-regularity of G we finally obtain the existence of satisfying Therefore, H is pseudo-regular with neighborhoods and rank (ii) Let Since and F is upper regular, we have

Selecting this ensures regularity of G and

and, due to upper

So every belongs to Since we restricted F to the points do not belong to the image of Therefore, upper regularity of H with rank now follows from


Functions and Multifunctions

Given any closed multifunction define = dist( gph F), say with distance on X × Y. Then, condition (D1) for becomes a typical implicit-function requirement for the (globally) Lipschitz function namely: Given such that

with and

and

there is some

Similarly, (D2) requires: For all

with

it holds

Each multifunction is the inverse of the map Thus, there is no principal difference therein whether we investigate F or and speak about pseudo-regularity or the pseudo-Lipschitz condition (D1). However, in any case, our assumptions should concern the given mapping F, If F is a function, then satisfies Conversely, if S satisfies (1.6), then is a function, defined on dom This fact has consequences for extending statements concerning inverse functions to inverse multifunctions: If (1.6) has been nowhere used, then the related statement on is immediately true for multivalued too. On the other hand, one cannot expect to obtain specific results for inverse functions from the theory of multifunctions, as long as (1.6) has not been exploited.

Example 1.3 (regularity for functions). If is a continuously differentiable function, then all these regularity definitions coincide, due to the usual implicit function theorem, with the requirement det

Example 1.4 (pseudo-regular, but not strongly regular). The complex function for F(0) = 0, is a Lipschitz function which is pseudo-regular and upper regular without being strongly regular at the origin.

Example 1.5 (strong regularity for continuous functions). For a continuous function strong regularity at induces that F is a homeomorphism between certain neighborhoods and Hence, is necessarily true due to Brouwer’s famous invariance of domain theorem. This is an essential fact being true for functions, but not for multifunctions.

Example 1.6 (pseudo-regularity for linear operators). Let be a linear operator onto Y where X and Y are normed spaces. Pseudo-regularity now requires that, given and there is some such that and In other words, is bounded as a mapping in the factor space Conversely, one may say that pseudo-regularity is just a nonlinear, local version of this property.
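The behaviour described in Example 1.4 can be made concrete with a short Python sketch. It assumes that the complex function meant there is F(z) = z²/|z| for z ≠ 0 with F(0) = 0; this formula is an assumption of the sketch, as is the helper name `preimages`. In polar form F maps r·e^{iθ} to r·e^{2iθ}, so every w ≠ 0 has exactly two pre-images z and -z, both with |z| = |w|.

```python
import cmath


def F(z):
    """Assumed form of the function in Example 1.4: z^2/|z|, with F(0) = 0."""
    return z * z / abs(z) if z != 0 else 0j


def preimages(w):
    """All solutions z of F(z) = w: halve the argument, keep the modulus."""
    if w == 0:
        return [0j]
    r, phi = cmath.polar(w)
    z = cmath.rect(r, phi / 2.0)
    return [z, -z]


# Two distinct pre-images of the same modulus for a sample right-hand side.
w = 1 + 2j
zs = preimages(w)
```

Since both pre-images of w have modulus |w|, each of them moves at most with rate 1 away from the origin, which is consistent with upper regularity at the origin. Since there are always two of them for w ≠ 0, however small, the inverse cannot be single-valued near the origin, so strong regularity fails.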


Example 1.7 (Graves-Lyusternik theorem). Let be continuously differentiable near X and Y be Banach spaces and Then, F is pseudo-regular at (for references, proof and modifications, see Chapter 4, Theorem 4.11). One may state that pseudo-regularity is the basic topological property of near

Example 1.8 (subdifferential of the Euclidean norm). A relevant multifunction being strongly regular at (0,0): Take the subdifferential (in the sense of convex analysis) of the Euclidean norm Then,

1.5 Related Definitions

Let us recall some common notions concerned with multifunctions for metric spaces X and Y.

Types of Semicontinuity

If and for each sequence then S is said to be lower semicontinuous (l.s.c.) at In the situation of definition (D1), there is even a Lipschitzian estimate for all

in some neighborhood V

of

In the latter case, S is called Lipschitz at with rank L. If, given S is l.s.c. at all then S is said to be at If for each sequence and arbitrary then S is said to be upper semicontinuous (u.s.c.) at This coincides with C. Berge’s [Ber63] u.s.c. - definition if is compact (see, e.g. Chapter 2]), where S is called at if for any open set there is some neighborhood of such that for all The map S is said to be Lipschitz and all

at

(with rank L) if

for all in some neighborhood

If S is a function, we will then also say that S is pointwise Lipschitz at In comparison with the local upper Lipschitz property (D2), one considers now the whole set and does not claim that the elements of converge to a single point as


If the requirements of definition (D1) are satisfied for U = X, then we have and all

for all in some neighborhood V of

In this case, S is said to be Lipschitz-continuous around Example

1.9

line-segment

is u.s.c., but not l.s.c.). Assign, to each The inverse is

the

and is pseudo-regular at (0,0). Setting similarly then becomes

and

for

Now, is Lipschitz u.s.c. with each L at with L = 1 at the origin but Finally, for any given sequence of sets

as well as Lipschitz l.s.c. is not l.s.c. at (k = 1,2,...), one defines

These sets are often called the upper and lower Hausdorff-limits of the sequence respectively; sometimes the Kuratowski–Painlevé limits. Trivially,

Similarly, the limits and are defined for multifunctions S:

Note that lim inf, in the bracket, has the following meaning: First take any sequence next consider the usual lower limit for the related sequence of extended reals Clearly, depends on the selected Now, denotes the infimum of over all sequences Analogously, one has to read limsup. We continue this chapter by clarifying some relations between pseudo-regularity and other regularity notions.


Metric, Pseudo-, Upper Regularity; Openness with Linear Rate

Let us first mention a connection between pseudo- and upper regularity at isolated pre-images.

Lemma 1.10 (pseudo-regularity at isolated zeros). If $F: X \rightrightarrows Y$ (metric spaces) is pseudo-regular at $(\bar{x}, \bar{y})$ and if $\bar{x}$ is isolated in $F^{-1}(\bar{y})$, then F is upper regular at $(\bar{x}, \bar{y})$ with the same rank.

Proof. Indeed, starting with neighborhoods and related to pseudo-regularity, one may exploit (setting and ) that

Decreasing (then moves to ), will be attained at the isolated point Thus (D2) holds true for with and with a new neighborhood U. Decreasing we can further arrange, by pseudo-regularity, that for all Therefore, upper regularity of F at holds with rank L.

Remark 1.11 (pseudo-regularity and Lipschitz continuity). Under the assumptions of Lemma 1.10, one shows analogously the existence of neighborhoods U and V of and respectively, such that the multifunction is Lipschitz on V. So, if is isolated in pseudo-regularity of F at and Lipschitz continuity (in the Hausdorff-distance) near of the map (for fixed small mean exactly the same.

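One says that $F: X \rightrightarrows Y$ is metrically regular (with rank L > 0) at $(\bar{x}, \bar{y}) \in \operatorname{gph} F$ if, for certain neighborhoods U and V of $\bar{x}$ and $\bar{y}$, respectively, the estimate

$$\operatorname{dist}(x, F^{-1}(y)) \le L\, \operatorname{dist}(y, F(x)) \quad \text{for all } x \in U,\ y \in V \tag{1.7}$$

holds true. For completeness, we present a proof of the well-known fact that metric regularity and pseudo-regularity describe the same property. Basically, this statement is known from [Iof81, BZ88, Pen89].

Lemma 1.12 (metrically regular = pseudo-regular). F is pseudo-regular at $(\bar{x}, \bar{y})$ if and only if F is metrically regular at $(\bar{x}, \bar{y})$.

Proof. Writing down both definitions and using condition (1.3) instead of (1.1) one obtains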

metric regularity: dist pseudo-regularity: dist To see that pseudo-regularity ensures (1.7) (the reverse is trivial), note that in case of the distance becomes large if one restricts to a


new, smaller neighborhood of We show (1.7) after taking sufficiently small neighborhoods and Obviously, is valid by pseudo-regularity. Hence, for and we have So (1.7) is true if But otherwise, the inequality tells us that

for small

So (1.7) follows again from pseudo-regularity.
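For a single-valued linear map the metric regularity estimate can be verified by hand, and the following Python sketch does so numerically. It takes the hypothetical map F(x) = ⟨a, x⟩ from the plane to the reals with a ≠ 0 and the Euclidean norm (an illustration, not an example from the text). Here the pre-image of y is a line, the distance from x to it is |⟨a, x⟩ - y| / ‖a‖, and since F is a function, dist(y, F(x)) is simply |y - F(x)|; so the estimate holds globally with rank L = 1/‖a‖.

```python
import math


def project_onto_level_set(a, x, y):
    """Nearest point to x in the set {z : <a, z> = y}, Euclidean norm, a != 0."""
    residual = sum(ai * xi for ai, xi in zip(a, x)) - y
    norm_sq = sum(ai * ai for ai in a)
    return [xi - residual * ai / norm_sq for ai, xi in zip(a, x)]


def distance_to_level_set(a, x, y):
    """dist(x, F^{-1}(y)) for F(x) = <a, x>."""
    return math.dist(x, project_onto_level_set(a, x, y))


def regularity_rank(a):
    """Rank L = 1/||a|| for which the estimate holds with equality."""
    return 1.0 / math.sqrt(sum(ai * ai for ai in a))


a = [3.0, 4.0]
x = [1.0, 1.0]
y = 2.0
lhs = distance_to_level_set(a, x, y)                                      # dist(x, F^{-1}(y))
rhs = regularity_rank(a) * abs(sum(ai * xi for ai, xi in zip(a, x)) - y)  # L * |F(x) - y|
```

With a = (3, 4), x = (1, 1) and y = 2 the residual is 5, the norm of a is 5, and both sides of the estimate equal 1.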

Openness of F with linear rate around means by definition the existence of L > 0 and of some neighborhood of such that

In other words, is Lipschitz l.s.c. with uniform rank L at all such that gph F, and the related neighborhoods of (the l.s.c. Lipschitz estimate holds on which) have again a uniform radius s. Evidently, this is pseudo-regularity of F at too.

Calmness and Upper Regularity at a Set

Our local upper Lipschitz property (D2) for was used in [Don95] for instance. More generally, Robinson [Rob81] defined (D2) with respect to a set by

Now, the neighborhood U of

in (1.2) is replaced by an

open set U containing a set As before, we call F upper regular at Lipschitz at and

if both

Another variation of (D2), called calmness of S at the existence of some L and neighborhoods U, V of such that

is locally upper

and

means respectively,

If then the local upper Lipschitz condition (1.8) implies (1.9) immediately. Calmness also means that the pseudo-Lipschitz condition (D1) has to hold for particular only, and has been applied and investigated e.g. in [Cla83] for deriving optimality conditions. In this respect, calmness can be used similarly to the local upper Lipschitz property at a set, cf. Section 2.1 (optimality conditions) and Theorem 2.10. An interesting calmness condition for multifunctions can be found in [HO01]. It is applicable to the models in [Out00] and many models in [LPR96].


Example 1.13 (pseudo-Lipschitz, but not locally upper Lipschitz). Let and let be the interval for real Then, if S(0), the mapping S is not locally upper Lipschitz at because, for each set and each L > 0, one finds points such that and Further, S is not calm at On the other hand, S is pseudo-Lipschitz at each point

Example 1.14 (the inverse of Dirichlet’s function). For the real function if is rational; otherwise, the inverse is calm at and locally upper Lipschitz at (0,S(0)) since The mapping is even pseudo-Lipschitz at (0,0) since holds for all irrational and all near 0.

The second example indicates that the usual construction of penalties for calm equations, may lead to terrible auxiliary functions F. For related questions we refer to §2.1 and Lemma 2.1.

1.6 First Motivations

Property (D1) is closely related to continuity statements on parametric optimization models Here, S : (metric spaces) and are given, and plays the role of a parameter. If is a local solution for feasible points can be assigned to in a uniform Lipschitzian manner provided that as well as and are close to and respectively. Under (Lipschitz) continuity of this allows estimates of the related infima

and of solution sets

The properties (D1) and (D2) also ensure the validity of well-known necessary optimality conditions and help to estimate related Lagrange multipliers in terms of the parameter distance even if primal-dual solutions are not unique. These facts, which become clearer below, explain the great interest in (D1) and (D2) as well as in the other types of regularity and semicontinuity for multifunctions. As examples and basic results, we mention the following classical statements.


Parametric Global Minimizers

Theorem 1.15 (Berge–Hogan stability). Let

at

be continuous and S be u.s.c. and l.s.c. at some Then one has: (C. Berge [Ber63]) If is compact then, at is continuous and is u.s.c. (W. Hogan [Hog73]) Let be convex, be compact and be closed and convex. Then, at is continuous and is u.s.c. If, in addition, all sets are closed, then for near

Proof. (i) Let and let Since S is l.s.c., one finds semicontinuous due to

denote any sequence that realizes Hence, is upper

On the other hand, to any with By compactness of accumulation point of all and first select a subsequence such that certain such that of

there corresponds some there exists some common Hence, given one may and next choose So we obtain continuity

Finally, considering the (existing) accumulation points of any as one finds first and next Thus yields that is u.s.c. at (ii) Again, is u.s.c. due to the arguments from (i). Therefore, the rest will follow as above by continuity and compactness, provided that is bounded for every sequence of satisfying and To show the latter, choose large enough such that Next assume that certain diverge. Then there are points on the line segment with Since holds by upper semicontinuity, every accumulation point of the bounded elements belongs to the closed and convex set Because of such a point exists. Since is (quasi-) convex, it holds additionally that

Thus, we obtain

in contradiction to the choice of .
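Part (i) of Theorem 1.15 can be illustrated numerically. The following Python sketch evaluates the optimal value and the set of global minimizers of a toy parametric problem on a grid. The objective `f`, the interval-valued constraint map S(t) = [-2 - |t|, 2 + |t|], the grid size, and the tolerance are assumptions chosen for this illustration only; none of them is taken from the text.

```python
import numpy as np

def f(x, t):
    # Toy objective, continuous in (x, t); an assumption for illustration.
    return (x**2 - 1.0)**2 + t * x

def feasible_grid(t, n=20001):
    # Compact, continuously varying feasible set S(t) = [-2 - |t|, 2 + |t|].
    return np.linspace(-2.0 - abs(t), 2.0 + abs(t), n)

def value_and_solutions(t, tol=1e-9):
    # Grid approximation of the infimum and of the set of global minimizers.
    xs = feasible_grid(t)
    vals = f(xs, t)
    phi = vals.min()
    return phi, xs[vals <= phi + tol]

parameters = [-0.1, -0.01, 0.0, 0.01, 0.1]
results = {t: value_and_solutions(t) for t in parameters}
for t in parameters:
    phi, sol = results[t]
    print(f"t = {t:+.2f}  value = {phi:+.6f}  "
          f"solutions in [{sol.min():+.4f}, {sol.max():+.4f}]")
```

In this example the optimal value behaves like -|t| near t = 0 and is therefore continuous, while the solution set is {-1, 1} at t = 0 and collapses to a single point near -1 or near 1 for t > 0 or t < 0, respectively. The solution map is thus upper semicontinuous at t = 0 but not lower semicontinuous, which is consistent with the theorem asserting only the u.s.c. property.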

The statements of the foregoing theorem have been generalized under several points of view: with respect to continuity and compactness, and by investigating also i.e., and where and see, e.g., [RW98]. Even our formulation (ii) is a slight generalization


of Hogan’s original result. Nevertheless, the basic arguments of the original proofs remained valid. It should also be mentioned that the hypotheses concerning S were investigated for several important mappings in finite dimension, e.g., for the mappings analytic and convex on (quasi-) convex polynomials, rational coefficients; integer for These investigations are (with appropriate objectives ) closely related to duality and existence theorems for problems of type (1.10), cf., e.g., [Roc71, Lau72, Roc74, Kum81, BM88, BA93, Kla97, Sha98]. We further note that stability results of the type presented in Theorem 1.15 may also be formulated in terms of (classical) convergence of functions and sets, see, for example, [DFS67, Fia74, Kum77].

Parametric Local Minimizers

In the case the Berge–Hogan theorem may be extended to certain sets of parametric local minimizers of the problem (1.10). Following Robinson [Rob87] (see also [FM68, Kla85]), a nonempty set is said to be a complete (or strict) local minimizing set (CLM set) for on if there is an open set such that

where cl is the closure of and “argmin” is written for the set of global minimizers. Note that is supposed to be open, hence each element of a CLM set is a local minimizer for on In particular, is a CLM set if is a strict local minimizer for on and is a CLM set provided it is not empty. Moreover, certain sets of local minimizers satisfying a linear or quadratic growth condition (sometimes called sets of weak sharp minimizers) are CLM sets, see, e.g., [War94, Kla94a, BS00].

Theorem 1.16 (stability of CLM sets [Kla85, Rob87]). Consider (1.10) in the case Given let Z be a compact CLM set for on and let be closed for each in some neighborhood of Further, suppose that is continuous on X × Y and that S is both u.s.c. at and l.s.c. at some Then there are a neighborhood of and an open bounded set such that and is u.s.c. at with for each is a CLM set for on i.e., in particular, any element of is a local minimizer for on


Proof. By definition, there is some open bounded set such that Hence, since S is l.s.c. at some the sets are nonempty and compact for near Since is continuous, assertion (i) follows from Weierstrass’ theorem and part (i) of Theorem 1.15. Moreover, when applying that is compact, assertion (i) gives the Berge u.s.c. of at Hence, for the open set containing there is some neighborhood of such that for all in i.e., by definition, these sets are CLM sets.

Epi-Convergence

The parametric optimization problem (1.10) with respect to can be reformulated by introducing an (improper) function g as if

otherwise

and studying the “free” parametric extremal problem

Conversely, having an improper function then, after setting (the domain of ), we are just studying problem (1.10) with an objective defined on Similarly, to obtain an objective that is everywhere finite, one can put

whereupon

Of course, the different formulations (1.10), (1.11), (1.12) of the same subject alone cannot present new insights for the analysis of parametric optimization problems. However, since the suppositions for (1.11) are usually written by means of the epigraphs and their convergence properties (types of epi-convergence) as related conditions often have another (shorter) form. On the other hand, they must be re-interpreted in terms of dom Here, we will prefer the classical parametric formulation (1.10), whereas e.g. in [RW84, Rob87, Att84, RW98] just (1.11) has been favored. Note that, in the context of approximations to optimization problems, the close relations of the arguments in the epi-convergence approach to those of the classical theory of functions were discussed by Kall [Kal86].
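The passage from the constrained problem (1.10) to the “free” problem (1.11) can be made concrete with a small Python sketch. The objective `f`, the constant constraint map S(t) = [0, 1], and the search grid are assumptions chosen for illustration; the function `g` takes the value +∞ outside the feasible set, as described above.

```python
import math

def f(x, t):
    # Toy objective; an assumption for illustration.
    return (x - t) ** 2

def in_S(x, t):
    # Toy constraint map S(t) = [0, 1] (here independent of t).
    return 0.0 <= x <= 1.0

def g(x, t):
    # Improper function: equals f on the feasible set and +infinity elsewhere.
    return f(x, t) if in_S(x, t) else math.inf

grid = [i / 100 for i in range(-200, 201)]

def constrained_value(t):
    # Infimum of f over the grid points that are feasible.
    return min(f(x, t) for x in grid if in_S(x, t))

def free_value(t):
    # Infimum of g over the whole grid, without any explicit constraint.
    return min(g(x, t) for x in grid)

for t in (-0.5, 0.5, 1.5):
    print(t, constrained_value(t), free_value(t))
```

Both formulations yield the same infima by construction. The sketch only shows that the constraint information has been moved into the domain of `g`, which is why conditions for (1.11) have to be re-interpreted in terms of that domain.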


Chapter 2

REGULARITY AND CONSEQUENCES

In this chapter, we present conditions for certain Lipschitz properties of multivalued maps and the related types of regularity, we investigate interrelations between them and discuss classical applications such as (necessary) optimality conditions and stability in optimization. A great part of this chapter is devoted to pseudo-regularity of multifunctions in Banach spaces, where we do not utilize generalized derivatives. We directly use Ekeland’s variational principle as well as the family of assigned inverse functions. They lead to characterizations of pseudo-regularity for the intersection of multifunctions and permit rather weak assumptions concerning the image- and pre-image space as well.

2.1

Upper Regularity at Points and Sets

Characterization by Increasing Functions

Let X, Y be metric spaces, and let We call Lipschitzian increasing near there are such that

Further, we say that describes S near for S near briefly

(or

if

S is locally upper Lipschitz at is Lipschitzian increasing near

if

on

and and

is a describing function


We will see by Theorem 2.6 that describing functions can play the role of penalty functions in optimality conditions. So the structure of possible ”candidates” becomes interesting. By the next statement, there is always a describing Lipschitz function, globally defined with rank 1 and not depending on Let us agree that the metric in product spaces Y × X is defined as

Lemma 2.1 (upper Lipschitz and describing Lipschitz functionals).

and

satisfies only if

i.e., S is locally upper Lipschitz at is Lipschitzian increasing near

Proof. For simplicity, we write

both for and vanishes on Let be Lipschitzian increasing near locally upper Lipschitz with rank since for

Conversely, let be not Lipschitzian increasing near there is some such that and

Select any

Given

the distance function

if and Evidently, Then S is

This is, for each

with

Then,

and Thus, both and not locally upper Lipschitz at

vanish (as

); so S is

More examples

of describing functions for

near

are the following ones.

(i) Equations

In the case of functions and one easily sees that fulfills and too. In the form considered here, i.e., as an equation and with one can study all maps that were originally given by a function and a multifunction via


this form occurs in models of generalized semi-infinite optimization where One has only to put

whereupon

Therefore, the map

fulfills (1.8) iff so does

too.

(ii) Cone constraints Let Y be a linear normed space, X be a metric space, convex cone, and Lemma 2.2 (Cone constraints). Let

Then, if

is Lipschitz on

be a

and the function

for some

fulfills with certain constants

and

Proof. Let be some Lipschitz rank of Then, one obtains for all and

Hence

Hence on

and

and

Next, fix any

We verify

Due to some satisfying

it holds

So one finds as well as some

such that

and

From

we conclude (by adding points of a convex cone) that


Therefore, the inclusion holds true, whenever Because of the latter can be guaranteed by Now (by the choice of and belong to and fulfill whereupon (2.8) is ensured whenever Considering yields

this

The assertion now follows from (2.6), (2.7) and Lemma 2.1. Note that the convex cone K did not have to be closed. If X is also a linear normed space and is linear and continuous, then one easily shows that is convex. In addition, is bounded on some neighborhood of due to (2.5). So it is locally Lipschitz on too. Needless to say, is simpler than from the viewpoint of computation. For and one obtains the usual penalty term

(iii) Cone constraints and equations

Let where S satisfies the assumptions of Lemma 2.2, sends X into a linear normed space Z, and Writing in form of cone constraints with the cone in the product space, the interior of K’ is empty and Lemma 2.2 cannot be applied. In addition, the describing distance-function according to Lemma 2.1 only satisfies

So we only know, by the previous statements, that the maximum function

fulfills and q is Lipschitzian increasing near

iff so is from Lemma 2.2.

However, due to the gap between and the function may Lipschitzian increase near while does not (Then S and violate (L1), but


not so

In this situation,

and Q are no describing functions for

On the other hand, the maximum turns out to be a describing function under all classical regularity assumptions that ensure, as in the subsequent lemma, that is pseudo-Lipschitzian (or only calm) at

Lemma 2.3 (the max-form under calmness). Suppose: X, Y, Z are Banach spaces, some satisfies and and Moreover, let be contained in a sufficiently small (by diameter) neighborhood of Then, in (2.9) is a describing function for near

Proof. Our suppositions are nothing but well-known regularity conditions for optimization problems in Banach spaces, cf. [Rob76a, Rob76c, ZK79], which ensure that the map is pseudo-Lipschitz at see also the discussion after Theorem 2.22. So, the lemma will follow from Theorem 2.4 below because is locally Lipschitz.

(iv) Arbitrary Intersections

More generally, let X, Y, Z be metric spaces,

and

Theorem 2.4 (the max-form for intersections). Let

be calm at and be pseudo-Lipschitz at Moreover, suppose that is contained in a sufficiently small (by diameter) neighborhood of Then, the function

fulfills

Proof. The inequality (2.10) follows as above without any assumptions; we estimate in the opposite direction. First notice that is Lipschitz, so it holds as In consequence, for sufficiently small neighborhoods we find arbitrarily small Now, for near and (small) there are (by definition of and ) points and such that

Next we apply that is pseudo-Lipschitz at there exists, for small and some

We thus obtain

and

say with rank K. Since satisfying


So, since (y’, z’, x’) is close to we may use calmness of say with rank L at By (2.12) and (2.13) this ensures the existence of some such that

Finally,

implies the upper estimate

and yields (as and Lemma 2.1, the latter tells us that because so is

Recalling (2.10) is a describing function for near

Notice that Lemma 2.3 and Theorem 2.4 do not assert the upper Lipschitz property of at itself. The relation between the upper and pseudo-Lipschitz properties as well as calmness will be investigated under Theorem 2.10. Next, we inspect the hypothesis of being calm in the previous theorem and reduce calmness of the intersection of two mappings to the intersection of one mapping with a constant set (a new space X) only.

Theorem 2.5 (calm intersections). Let S be calm at

and be pseudo-Lipschitz at be calm at Then Let be close to (say with rank L), there are and

Proof.

Since is pseudo-Lipschitz (rank K), we find such that

T be calm at

Moreover, let is calm at Since S and T are calm such that

and

are close to

and Next observe that satisfying

Therefore, there exists also some

Using these inequalities, we directly obtain the required Lipschitz estimate


(v) Set-Constraints

Assume that and M is a fixed, closed subset of X. Clearly, then one may study S on the new metric space X := M which leads us to a new function However, let us also regard two usual descriptions of via functions under the viewpoint of the pseudo-Lipschitz assumption for in the theorem. the mapping (a) Setting is pseudo-Lipschitz, and If S is already calm (w.r. to the space X) then the theorem allows us to study, instead of the calmness of the mapping

If H is calm at then so is at hence also the original map at This way, one may replace (for the calmness investigation) the fixed set M by and the mapping S by

(b) Setting function

if

otherwise, and as above, the is discontinuous and it holds for The theorem cannot be applied. Indeed, for small we would obtain the trivial constant map

which tells us nothing about Optimality Conditions

The local upper Lipschitz property of feasible set maps S ensures optimality conditions for constrained minimization in terms of free (i.e., unconstrained) local minimizers of an auxiliary function. To study an optimization problem consider any map (between metric spaces) as a parametric family of constraints satisfying for some The following statement, though more general, applies basically the same simple arguments as the related proposition in [Cla83] for calm constraints. Theorem 2.6 (free local minima and upper Lipschitz constraints).

Given metric spaces X, Y, let be locally upper Lipschitz at with rank be Lipschitz near with rank K, and let Further, let be a local minimizer of on Then is a local minimizer of whenever

with

from (2.1).


Proof.

rank for


Let near

let U be the open set in (1.8) and K be some Lipschitz Given select some with

Then, have

After passing to the limit

For and small and we know that is small enough to apply the Lipschitz estimate and Further, since we So, it holds

the latter ensures the assertion due to
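The statement of Theorem 2.6 can be checked on a one-dimensional toy problem. The sketch below is written under assumptions that are not taken from the text: the objective is f(x) = -x (Lipschitz with rank K = 1), the feasible set is the fixed interval C = [0, 1], the distance to C plays the role of the describing function (one admissible choice by Lemma 2.1), and the penalized function is minimized over a grid. The two penalty factors are chosen on either side of K; the exact threshold in Theorem 2.6 depends on the constants in (2.1).

```python
import numpy as np

def f(x):
    # Toy objective with Lipschitz rank K = 1; an assumption for illustration.
    return -x

def dist_to_C(x):
    # Distance to the feasible set C = [0, 1].
    return np.maximum(0.0, np.maximum(-x, x - 1.0))

def penalized_minimizer(alpha, xs):
    # Free (unconstrained) grid minimizer of f + alpha * dist(., C).
    values = f(xs) + alpha * dist_to_C(xs)
    return xs[np.argmin(values)]

xs = np.linspace(-3.0, 3.0, 601)
x_large = penalized_minimizer(2.0, xs)   # penalty factor above K
x_small = penalized_minimizer(0.5, xs)   # penalty factor below K
print(x_large, x_small)
```

With the larger factor, the free minimizer of the penalized function coincides with the constrained minimizer x = 1. With the smaller factor, the free minimizer moves to the boundary of the search grid, so the penalty is no longer exact.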

It is trivial but useful to note that the function may be replaced, in Theorem 2.6, by any function satisfying and Applying the function of Lemma 2.1, the new objective P turns out to be even Lipschitz near Provided that X and Y are normed spaces, now all necessary optimality conditions for free local minimizers of P induce necessary conditions for the originally constrained problem. In particular, if directional derivatives of P at in direction exist, then it must hold

and

follows for the contingent derivative CP. Dual Conditions

Let us mention only two basic approaches for obtaining dual conditions; various other approaches and more involved results can be found in [Roc70, Gol72, Roc74, IT74, LMO74, War75, Iof79a, Ben80, KM80, Roc81, BBZ81, BZ82, Pen82, Cla83, Stu86, Mor88, Cha89, Sha98, BS00].


Dual conditions via directional derivatives

If and are directionally differentiable (below, we see that the directional derivatives may be generalized) and satisfy

then (2.14) yields a condition for the sum

Let, in addition, the directional derivatives be continuous and sublinear in (which is evident for locally Lipschitz convex functions). Then, applying the Hahn-Banach theorem, see e.g. [KA64], to the sublinear function in the product space the supporting functional of Q on the subspace defined by can be extended to an additive and homogeneous functional on that supports Q everywhere. Thus,

and hold for all The latter implies (since Q is continuous by assumption) that are bounded, and

So one obtains the existence of some duality) inequality

satisfying the (conjugate

Since the involved directional derivatives are positively homogeneous, the infima are zero and belongs (by definition) just to the usual, convex subdifferential Similarly, one obtains In other words, after defining a new subdifferential function at as (and applying it to

too) some

and which is the generalized Lagrange condition

for the non-convex

satisfies the inclusions


or simply for Fréchet differentiable functions. Recalling the concrete form of for in (2.3) or in Lemma 2.3, one sees that directional derivatives and contingent derivatives of maximum functions play a crucial role in this context. Further, one observes that several concepts of directional derivatives may be applied to derive (2.18) for the subdifferential (2.17) in the above way, provided that
(i) condition (2.16) remains valid for local minimizers of P, and
(ii) the existence of directional derivatives as well as sublinearity and continuity with respect to the directions can be guaranteed.
For locally Lipschitz functions on linear normed spaces X, these hypotheses are satisfied by Clarke’s directional derivatives and his subdifferential which coincides with after identifying and cf. Sections 1.2 and 1.3. For the equation in terms of generalized Jacobians cf. [Cla76], increases the analytical tools for computing the derivatives in question.

Dual conditions via generalized subdifferentials

Without applying directional derivatives, one may restrict the functions the set sufficiently small, whereafter

to

holds true for the minimizer and all types of generalized subdifferentials Then, provided that a chain rule

is valid, one directly obtains (2.18) with respect to the subdifferential under consideration. We refer the reader who is interested in recent results devoted to inclusion (2.19) for particular subdifferentials to [MS97a, Kru00, NT01, Kru01]. For the related subdifferential-theory (mainly of certain limiting Fréchet subdifferentials), the Lipschitz property of and as well as the fact that X is an Asplund space play an important role, see also subdifferentials in §2.2.2.

Linear Inequality Systems with Variable Matrix

The upper Lipschitz property is particularly important for linear inequalities and polyhedral multifunctions. Basic results on this subject go back to Hoffman [Hof52], Walkup and Wets [WW69] and [NGHB74]. A complete theory of upper Lipschitz properties in the polyhedral case has been elaborated


by Robinson [Rob76b, Rob81]; for an (incomplete) summary, see Theorem 6.4 below. Various results concerning the nonlinear case were first shown in [Rob76c].

Fixed Matrix

The solution set of a parametric, finite dimensional linear inequality system A is an is Lipschitz u.s.c. at

matrix,

This is a consequence of Hoffman’s famous lemma.

Lemma 2.7 (Hoffman [Hof52]). There is some constant L depending only on A such that the inequality

holds for all

and all

with

where

is the

row of A.
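Hoffman's estimate and the resulting Lipschitz upper semicontinuity can be observed numerically in a case where the distance to the solution set is available in closed form. The following Python sketch is an illustration under stated assumptions, not a general implementation: the matrix A = [I; -I] describes a box l ≤ x ≤ u, the Euclidean norm is used, and the projection onto the box is computed by clipping. For this particular A the constant L = 1 works; for general matrices the constant has to be determined differently.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = np.vstack([np.eye(n), -np.eye(n)])           # rows: x_i <= u_i and -x_i <= -l_i
b0 = np.concatenate([np.ones(n), np.ones(n)])    # reference system: the box [-1, 1]^n

def dist_to_solution_set(x, b):
    # Euclidean distance of x to S(b) = {x : A x <= b}, a box for this A.
    u, l = b[:n], -b[n:]
    return np.linalg.norm(x - np.clip(x, l, u))

def residual(x, b):
    # Norm of the violated part of the inequalities A x <= b.
    return np.linalg.norm(np.maximum(A @ x - b, 0.0))

hoffman_gaps = []
upper_lipschitz_ok = []
for _ in range(1000):
    # Hoffman-type estimate at arbitrary points (an equality for this A).
    x = rng.uniform(-3.0, 3.0, size=n)
    hoffman_gaps.append(abs(dist_to_solution_set(x, b0) - residual(x, b0)))
    # Upper Lipschitz behavior: points of S(b) stay close to S(b0).
    b = b0 + rng.uniform(-0.3, 0.3, size=2 * n)
    u, l = b[:n], -b[n:]
    x_b = rng.uniform(l, u)                      # a random point of S(b)
    upper_lipschitz_ok.append(
        dist_to_solution_set(x_b, b0) <= np.linalg.norm(b - b0) + 1e-12)

print(max(hoffman_gaps), all(upper_lipschitz_ok))
```

The second check mirrors the argument in the text: a point of S(b) violates the reference system by at most the positive part of b - b0, so the residual bound transfers directly to a distance bound in terms of the parameter perturbation.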

The lemma has initiated various investigations and proofs in order to find error bounds and best estimates (constants L) for linear inequalities or linear programs/complementarity problems (see, e.g., [Rob73, Rob76b, Man81a, Kla87, Man90, Li93, KT96]) as well as global error bounds for convex multifunctions or convex inequalities (see, e.g., [Rob75, Man85, LP97, BT96, Pan97, Kla98, LS98, KL99]). One of the first extensions of the lemma to Banach spaces can be found in [Iof79b]. If there is some with (component–wise), then is nonempty for near Hence i.e., is upper regular at Moreover, using one easily shows that F is even pseudo-regular at each pair with Having only then, after identifying Y with dom S, these statements remain true, and is an unbounded, convex polyhedral set.

Lipschitzian Matrix

If

where depends (locally) Lipschitzian on then analogous statements as above can be immediately derived. If also A depends Lipschitzian on then dom S is no longer closed and S is not locally upper Lipschitz, in general. Even if is non-empty and bounded, the map S is not necessarily l.s.c. But the upper Lipschitz behavior remains valid. The next lemma is well-known (see Robinson [Rob77]) and applies to the continuous and Lipschitz continuous situation in the same manner.

Lemma 2.8 (Lipschitz u.s.c. linear systems). Let


let A and be pointwise Lipschitz at bounded. Then S is Lipschitz u.s.c. at Proof. Let

and

and let

be non-empty and

be small. Writing

and we have, with some L and related norms,

and If, for certain the elements (for some subsequence of

diverge, then division by

yields

and In this case, Hence, if Then, setting

contains the ray and is unbounded. is small enough, there is an upper bound K for

it holds

and Since, for the fixed matrix now the estimate

S is Lipschitz u.s.c. at

with some rank

completes the proof. The reader will easily see that S is still u.s.c. at at and is non-empty and bounded.

if both A and are continuous

Application to Lagrange Multipliers

Let be the (possibly empty) set of Lagrange multipliers, assigned to a feasible point and to some parameter of an optimization problem

and suppose that Then we have

if and only if

and


Let put, for fixed

Due to


(the index set assigned to active inequalities) and and

for

near

we observe that

and Hence, Lemma 2.8 immediately ensures (by setting there and ) the following well-known result (cf., e.g., [Kla91, BS00]). Corollary 2.9 (Lipschitz u.s.c. multipliers). Provided that

is nonempty and bounded, the multiplier maps and are Lipschitz u.s.c. at If, in addition, card then the restricted map is Lipschitz l.s.c. at

Note that, by Gauvin’s theorem [Gau77], is non-empty and bounded if and only if is a stationary point satisfying MFCQ, while card means in algebraic formulation just the so-called strict MFCQ condition [Kyp85], for both results see also Lemma A.7 in the Appendix. The requirement of being Lipschitz l.s.c. at with respect to dom implies rank conditions concerning submatrices of on dom which are often written as the so-called constant rank condition [Jan84]. Having only the same arguments ensure that and are at least u.s.c. at

Upper Regularity and Newton’s Method

Let V, and let

be upper regular at with rank L and neighborhoods U, be a (pointwise) Lipschitz function with and

for

Then (evidently) the map with rank Supposing

is locally upper Lipschitz at

the iteration process

generates a (possibly not unique) sequence satisfying in particular,

near

and


The same is true if is only locally upper Lipschitz at and if one knows that exists. To obtain a standard application of the process (2.21), let and let be a regular matrix. Setting

with rank L

and

the process (2.21) describes just Newton’s method. Indeed, we obtain and, considering on small neighborhoods U, V of respectively,

So, it holds that

and (2.21) describes Newton’s method as asserted. Since

the assumption of (2.20) is valid with arbitrarily small L > 0. The latter ensures (locally) superlinear convergence.
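The superlinear convergence asserted here can be observed on a small smooth system. The Python sketch below runs the classical Newton iteration for a toy map F with a regular Jacobian at the zero (1, 1). The map `F`, its Jacobian `DF`, the starting point, and the number of steps are assumptions chosen for illustration; the code does not implement the general process (2.21) for multifunctions.

```python
import numpy as np

def F(x):
    # Toy system with the zero x* = (1, 1); an assumption for illustration.
    return np.array([x[0]**2 + x[1]**2 - 2.0, x[0] - x[1]])

def DF(x):
    # Jacobian of F; regular at x* since its determinant there equals -4.
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

def newton(x0, steps):
    # Classical Newton iteration: x_{k+1} = x_k - DF(x_k)^{-1} F(x_k).
    iterates = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        x = iterates[-1]
        iterates.append(x - np.linalg.solve(DF(x), F(x)))
    return iterates

x_star = np.array([1.0, 1.0])
iterates = newton([2.0, 0.5], steps=6)
errors = [np.linalg.norm(x - x_star) for x in iterates]
# Quotients of successive errors; superlinear convergence means they tend to zero.
ratios = [errors[k + 1] / errors[k] for k in range(4)]
print(errors)
print(ratios)
```

The decreasing quotients of successive errors correspond to the fact that, near the zero, the estimate in (2.20) holds with arbitrarily small L.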

2.2 Pseudo-Regularity

Pseudo-regularity is the most interesting and most complicated stability property we are dealing with in the present book and, in fact, there are still several open questions (mainly of topological nature) concerning this property even for Lipschitz functions. For instance, it was a big step forward to know that, if is pseudo-regular at and directionally differentiable at the zero is necessarily isolated (cf. [Fus99, Fus01] and Theorem 5.12). Notice that this statement is nearly trivial for If is not directionally differentiable at then the same question is completely open. Pseudo-regularity may be (and has been) characterized by different means. For mappings in finite dimensional spaces, contingent and coderivatives describe the problem sufficiently well. In more general spaces, our Example BE.2 restricts these approaches essentially. For this reason, limits of certain Ekeland points, which describe the “pseudo-singular” situation, will be taken into consideration. In addition, we investigate the intersection of pseudo-regular maps by means of assigned inverse families. Such families come into play if one studies the inverses of pseudo-regular maps in detail. First of all, we establish a general connection between calm, upper, lower and pseudo-Lipschitzian maps S and the optimality condition of Theorem 2.6.


Theorem 2.10 (selection maps and optimality condition). Let pseudo-Lipschitz at with rank L and neighborhoods U, V. Let fixed such that and define

be be

Then, is Lipschitz u.s.c. at is Lipschitz l.s.c. at all (both with rank L); The functions

and coincide for near If is locally Lipschitz and then, provided that is large enough,

Moreover, the statements calm at with rank L. (i) Let with if S is calm at since

remain true if

is only

Setting in (1.1) there is some Notice that also exists (by definition) with related neighborhoods U, V and rank L. Moreover,

Proof.

we obtain (ii) Let

and

(locally) minimizes on is a free local minimizer of

and and

Then, for some

Consider any there is some we estimate

So

such that such that by applying

satisfies

By (1.1), To show that

and

holds as required. (iii) Clearly, is evident and follows from To verify for near we consider any such that and


Let We show that

realize the distance up to an error Indeed, since

it holds

Thus, the inequalities

and are valid and ensure

whenever

The latter holds true due to (2.22) and yields as well as via (iv) For sufficiently small minimizes on By (i), is Lipschitz u.s.c., so is, by Lemma 2.1 and Theorem 2.6, a local minimizer of whenever is sufficiently large. Using (iii), this is the assertion.

2.2.1

The Family of Inverse Functions

Let us consider the point namely

satisfying the requirements (1.1) of pseudo-regularity,

as a function of

and

for

Then, F is pseudo-regular at iff there is a family such that and, for all has whenever So

is a special family of selections

for

of functions

which tells us that

and, in addition,

We will say that

is a local inverse of F, and

is an inverse family.

one


The functions are not unique a priori and may be discontinuous at So, an inverse family, assigned to a pseudo-regular mapping F, may consist (at least theoretically) of more or less complicated selections of To get a close connection between inverse families and cones of tangents or normals, we consider the directions and for (linear) normed spaces and respectively. Setting and we have Now, each is bounded since So one may identify and a related family of uniformly bounded functions defined for and for all in some interval such that if

Indeed, having pseudo-regular at tions, we have

one easily sees (by ”decreasing” U and V) that F is We call an inverse family of directions. By our defini-

Remark 2.11 (inverse families and pseudo-regularity). The following conditions are equivalent to each other:
1. An inverse family exists.
2. An inverse family of directions exists.
3. F is pseudo-regular at

Up to now, the domain of all was a constant neighborhood V of containing the second component of while all functions were defined on the same interval If these domains are replaced by different neighborhoods of and intervals respectively, the existence of (or ) describes the l.s.c. property (2.24) of To indicate that we understand the families and in this weaker sense, we denote them by and respectively. Our Theorem 2.17 will say that the existence of and pseudo-regularity and the l.s.c. property (2.24) - are equivalent for quite general mappings F.

Particular Local Inverses

Knowing that there are particular local inverses for certain maps gives us extra information about where or how one can find some satisfying (1.1). It can also mean that not all variations of must be regarded. In this way the question of whether F is pseudo-regular or not can be simplified. Let us regard some examples.

Simple Cases

1.

Let X be a linear normed space, let

be locally Lipschitz and


Provided that Clarke’s directional derivative direction one may put

in order to see that F is pseudo-regular at family consists of functions that move direction only.

is negative for some

The related inverse ”Lipschitzian far” in a fixed

2. If, moreover, the locally Lipschitz function is convex or continuously differentiable, then one may even state: Either F is not pseudo-regular at or there is an inverse family that consists of functions having the form (2.26). Indeed, if then either F is not pseudo-regular at namely if or one can take any direction with If is convex and continuous, put where is any point with as far as exists. Otherwise, F cannot be pseudo-regular at since is empty for 3. The reader will easily confirm that, by setting either–or–statement of 2. is true for each mapping

the

provided that the (possibly discontinuous) components are non-decreasing in each component In fact, any point satisfying can be replaced by without violating this inequality. So one may replace in (2.23) by after taking the maximum-norm. Having now can be again replaced by whereafter remains true and the required Lipschitz estimate holds with another constant (or equivalent norm) only. For instance, may be an extreme value function (also called marginal function) of the type

with arbitrary functions or a utility function.

and

or

is a probability distribution function

More Complicated Cases

1.

Discontinuous functions are needed for those pseudo-regular functions which are not strongly regular at Indeed, in that case, the inverse does not possess a (single-valued) selection that is continuous on some neighborhood V of and satisfies cf. Theorem 5.10. Thus, every local inverse for


is necessarily discontinuous at some point of V. 2. Pseudo-Lipschitzian level sets without descent directions in Hilbert spaces: Our Example BE.2, where and is a lower level set map for on is helpful in order to understand pseudo-regularity in general spaces. It shows that locally inverse functions may have (necessarily) rather bad properties. In particular, one has to take into account that the bounded directions of every inverse family of directions do not necessarily converge as Even accumulation points may fail to exist, and directional derivatives, for certain arbitrarily close to can satisfy the first-order minimum condition Note that, in our example, is one of the simplest nonsmooth, non-convex functions on a Hilbert space: is globally Lipschitz, directionally differentiable and concave.

2.2.2 Ekeland Points and Uniform Lower Semicontinuity

In this section, we characterize pseudo-regularity by two topological means:
(i) by so-called Ekeland points, related to the distance functions
(ii) by Lipschitz lower semicontinuity of near the reference point.

The first characterization is our basic tool, the second one will help to understand the content of pseudo-regularity. We will require that Lipschitz l.s.c. holds with uniform rank L at the points in question. But the neighborhoods where the l.s.c. estimates are true may have different size. For this fact, the notion uniform lower semicontinuity will be used. We start with the formal negation of pseudo-regularity: A multifunction (between metric spaces) is not pseudo-regular at iff

Here, we have identified L, and appearing in the definition with and respectively. For instance, (2.27) holds for with and In order to show that such points cannot exist, we want to obtain additional information about possible sequences. This is the purpose of our next considerations where we replace in (2.27) by an Ekeland-point of the function We apply Ekeland’s variational principle [Eke74] in the following form.


Theorem 2.12 (Ekeland).

and let

Let X be a complete metric space and be a l.s.c. function having a finite infimum. Let and be positive, Then there is some such that

and

A proof of the above theorem is given in the appendix. As usual, a function with values in is called l.s.c. if all lower level sets are closed. In applications, X is often a closed subset of a Banach space. Points satisfying are also said to be We say that for short

if

is a local Ekeland-point of a functional is finite and

with factor

If (2.28) holds for all we call a global Ekeland-point. Via property (2.28) ensures that is a lower bound for several generalized directional derivatives of at Note that in Example 0.1 we had (if and (if For each one finds here local Ekeland-points with but not with If X is a normed space and mapping (of approximate local minimizers)

one may introduce the

for

Its inverse

assigns, to

some subset of X*, and defines via

just the so-called subdifferential of be explicitly defined by saying that

at if

This subdifferential can

It has been extensively studied in the literature in recent years. For its behavior as and/or and for applications, we refer to [Kru85, Fab86, Fab89, Kru96, Kru97, NT01] and [Iof00], where the reader finds not only a comprehensive overview but also various further references. As an introduction to the rich world of ideas for generalized derivatives, subdifferentials and their applications, one has to mention (even now) the paper [Iof79a].
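The notion of a global Ekeland-point can be made tangible on a finite metric space, where the construction behind Ekeland's principle reduces to a terminating descent procedure. The Python sketch below is written under assumptions that are not taken from the text: the space is a grid of points in [-2, 2] with the usual distance, the function `f` is a toy example, and a point z is treated as a global Ekeland-point with factor `delta` if f(y) + delta·d(y, z) ≥ f(z) holds for all y, which is the standard form of the condition.

```python
import math

def f(x):
    # Toy function with many local minimizers; an assumption for illustration.
    return x * x + 0.2 * math.sin(15.0 * x)

def dist(x, y):
    return abs(x - y)

def ekeland_point(points, start, delta):
    # Move to any point y with f(y) + delta * d(y, z) < f(z) until none exists.
    # On a finite set this terminates because f(z) strictly decreases.
    z = start
    while True:
        better = next(
            (y for y in points if f(y) + delta * dist(y, z) < f(z)), None)
        if better is None:
            return z
        z = better

points = [i / 200 for i in range(-400, 401)]
start = 1.5
delta = 0.5
z = ekeland_point(points, start, delta)
print(z, f(z))
```

By the triangle inequality, the point returned also satisfies f(z) + delta·d(z, start) ≤ f(start), so it lies close to the starting point whenever the starting point is nearly optimal. In a complete metric space that is not finite, the same idea requires a limiting argument, which is the content of the theorem.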

2.2. Pseudo-Regularity


The idea of dealing with pseudo-regularity by applying Ekeland’s variational principle was a basic one and goes back to J.P. Aubin and I. Ekeland. The proof of Theorem 4, §7.5, in [AE84] is a typical example for realizing this idea. A. Ioffe [Iof79b] and A. Auslender [Aus84] also used the Ekeland principle very early as a crucial tool in a more special context devoted to optimality conditions and finite systems of (in)equalities, respectively. There, sufficient conditions for F being pseudo-regular have been derived in terms of generalized derivatives. Here, we add a condition which is both necessary and sufficient for important special cases which will be summarized in Lemma 2.13 below. We say that is proper near if, for fixed near the function dist (values in ) is l.s.c. on X.

Lemma 2.13 (proper multifunctions). A multifunction is proper (everywhere) under each of the following assumptions.
(i) F is a continuous function;
(ii) F is closed and
(iii) is a closed convex set in a real Hilbert space Y and is a continuous function.
(iv) F is closed and locally compact.
(v) where X, Y are Banach spaces, is continuous and satisfies assumption (iv).
In each of these cases, the sets are closed and dist will be attained, i.e., dist for some provided that

Proof. (i) is trivial. (ii) and (iv) follow by compactness arguments in or (v): Note that dist and apply (iv). (iii): The existence and the lower semicontinuity follow from

where is the non-expansive projection map onto K.

Lemma 2.14 (pseudo-regularity for proper mappings). Let X and Y be metric spaces, let X be complete, and let be proper near Then, F is not pseudo-regular at if and only if for each there exist and (all depending on p) such that is a global Ekeland-point of with factor and the inequalities

are true whenever


Note: Let the given condition be satisfied. Then, in (2.30) exists since So, applying (2.29), it holds and

Proof of Lemma 2.14. For this direction, neither the l.s.c.-assumption nor any Ekeland property of is needed. It suffices to know that According to our previous note, we may put and where satisfies (2.30). Now (2.29) ensures and Thus (2.27) holds true. Let be given. We put assume that (2.27) is true, and fix any with

Setting and is positive by (2.27), and are true by the choice of From we have Hence is for the l.s.c. functional on X. To replace by an Ekeland point, we put in Theorem 2.12: There exists a global Ekeland point with factor such that

In particular, we observe that and

Next consider any satisfying (2.30) with the already fixed ). Using (2.31) we observe hence

should not be confused with

Now the estimate

verifies the first inequality in (2.29). On the other hand, it holds – due to (2.27) and by the choice of

By (2.31), the latter ensures the estimate

Taking again (2.32) into account, we thus obtain

So (2.30) implies (2.29), which completes the proof.


From the proof of direction one easily sees that Lemma 2.14 remains true after replacing the notion ”global Ekeland-point” by ”local Ekeland point”. In order to see what happens if gph F is not closed, consider

Example 2.15 (F is not pseudo-regular). For put rational}. Then, for irrational for rational Clearly, F is not pseudo-regular at (0, 0), and dist for all Thus each pair trivially satisfies the inequality dist ( ,F( )). To fulfill the implication (2.30) (2.29), the pair must be taken near the origin, say with irrational With now (2.29) follows from (2.30) due to and

The situation becomes simpler if F is supposed to have closed images.

Theorem 2.16 (pseudo-regularity of proper mappings with closed ranges). Let F be a closed-valued map satisfying the assumptions of Lemma 2.14. Then, F is not pseudo-regular at if and only if for each there exist and such that is a global Ekeland-point of with factor as well as 0 < dist

Proof. Let for some under consideration. The direction follows as for Lemma 2.14; concerning see the Note following Lemma 2.14 and notice that follows from and the closedness of To show assume that F is pseudo-regular, contrary to the assertion. Then, using the equivalence of both properties, F is metrically regular at with some rank L. Since we may assume that is close to this yields Because of we have and obtain particularly On the other hand, it holds

and

So we derive as well as

Due to

this yields

as

So F cannot be pseudo-regular.


Partial Inverses

Using Theorem 2.16, the notion of an inverse family (of directions) in §2.2.1 may be weakened without violating the equivalence with pseudo-regularity. Let X be a metric space, be normed and A partial inverse (with rank L) at is a mapping that assigns, to some sequence of and related elements such that We say that F is partially invertible near if, for some neighborhood of and some L, there is a partial inverse at each with uniform rank L. Recalling the convention in metric spaces, one can equivalently define F is partially invertible near for each it holds

near

all

if and some fixed L > 0,

In particular, F is partially invertible near if is Lipschitz l.s.c. near with uniform rank K, cf. condition (2.24). In fact, with L = K + 1, a partial inverse can be defined by taking small and setting and For showing partial invertibility of the lower level set map of a continuous and directionally differentiable functional on a Banach space, it is enough to assign, to near some uniformly bounded direction such that Even if is discontinuous, now may consist of all in some interval (where depends on and ) and of

Theorem 2.17 (basic equivalences, proper mappings). Let X be a complete metric space, Y be a normed space and be closed-valued and proper near Moreover, let dist hold for some whenever Then, the following properties are equivalent to each other:
(i) F is pseudo-regular at
(ii) F is partially invertible near
(iii) For some neighborhood of is Lipschitz l.s.c. with uniform rank L at each

Note: Concerning our assumptions we refer to Lemma 2.13.

Proof of Theorem 2.17. If F is pseudo-regular then there is an inverse family (see §2.2.1), so F is partially invertible and fulfills the l.s.c. condition. Thus, we have to show that F is pseudo-regular if F is partially invertible. Let us assume, contrary to the assertion, that F is not pseudo-regular.


Consider any Ekeland pair assigned to some as in Theorem 2.16. Since there is some such that Next put and use our assumption. Hence, for certain there are and with and If is small then one finds

By the Ekeland property of

with factor

it holds for small

that

The left-hand side can be estimated with the already shown relations

thus yields So, the singularity condition

of Theorem 2.16 cannot be satisfied.

It is worth noting that the key inequality (2.33) of the preceding proof is already true if and This means that the claim in the definition of a partial inverse could even be weakened. The foregoing theorem also holds for closed F and Banach spaces X and Y; we refer to our discussion following Corollary 3.3 below.

2.2.3 Special Multifunctions

Here, we apply Theorem 2.16 to particular forms of the map F. Recall that denotes the set of all local Ekeland points of with factor cf. (2.28).

Level Sets of L.s.c. Functions

Lemma 2.18 (pseudo-singular level sets of l.s.c. functions). Let X be a complete metric space, be l.s.c., and Then, F is not pseudo-regular at if and only if for certain

Note: For continuous

the condition takes the form:

and


Proof of Lemma 2.18. It holds where We may apply Theorem 2.16 since is l.s.c. Let and be assigned to as in Theorem 2.16. Then and, due to and one observes The local Ekeland property of yields

For near write

it holds

because

is l.s.c. and

So we may

which ensures ( ) Due to For each yield

we have, with some now the both conditions

hence Thus, the map F may be pseudo-regular at Because converge to as at

and

only with rank F cannot be pseudo-regular

Cone Constraints

Next, we consider case (iii) of Lemma 2.13 in more detail. Suppose that

We wrote cone constraints because K is a convex cone in many applications. The Lipschitz assumption is crucial for forthcoming estimates; the existence of inner points of K is not needed. Let be the projection of onto the closed convex set Equivalently, this means that belongs to the normal cone of at Clearly, is locally Lipschitz. So pseudo-regularity of F can be reduced to the study of global Ekeland points for the locally Lipschitz functionals Instead of norm-functionals, let us now use dual functions of the form

Lemma 2.19 (Ekeland-points of norm-functionals in a real Hilbert space).

Under (2.34), let If Conversely, if

and for

then then

for

and all


Proof. (i) We have, for some

Taking the square of both sides, we obtain With the notation and

this yields

Let be a Lipschitz constant for fulfills a Lipschitz estimate A becomes

near

For near So,

and small then and the term

Hence, Given any we may further restrict necessary), such that

to a smaller neighborhood of (if

Now (2.35) ensures which is So we see that

is a local Ekeland-point of with factor

(ii) By the suppositions, it holds with some

for all

Returning to the particular (cone-) mapping F under consideration, we have to put in order to obtain a characterization of pseudo-regularity by normalized functionals In view of Lemma 2.19, we define To abbreviate we write and being aware that and are fixed. Recall that and that is the projection of onto For basic techniques of dealing with projections in Hilbert spaces we refer to [Har77] and [Sha88b].


Lemma 2.20 (pseudo-singular cone constraints). Under the hypotheses (2.34),

one has: F is not pseudo-regular at if and only if there are points and

depending on such that where is defined as

with If dim and F is not pseudo-regular at then the functional (as have an accumulation point

Note: The inclusion

and

means explicitly that for

near

Proof of Lemma 2.20.

(i), By Lemma 2.19 (ii), the given points are Ekeland-points with factor for Thus, F is not pseudo-regular at by Theorem 2.16. (i), Apply first Theorem 2.16, next Lemma 2.19 (i). (ii) The existence of is evident. We consider and from condition in (i) and assume that for a certain sequence By (2.34), is locally Lipschitz. So there is a uniform Lipschitz constant L for the functions near Then, we obtain for small and small distance

This ensures Since and

with we thus obtain

with the fixed element

Lipschitz Operators with Images in Hilbert Spaces

Having Lipschitzian operators with images in a Hilbert space, we are now in the situation of cone constraints with K = {0} and Given we put

Lemma 2.21 (pseudo-singular equations). Let let X be a complete metric space and Y be a real Hilbert space, put and consider some point Then,
(i) F is not pseudo-regular at if and only if
(ii) If dim then F is not pseudo-regular at if and only if

Exercise 1. The proof of Lemma 2.21 is left as exercise.

By the Lemma, components of vector-valued functions may be aggregated by nontrivial linear functionals Let be a locally Lipschitz function, X be a complete metric space, Then we obtain, due to norm equivalence in is pseudo-regular at is pseudo-regular at

Necessary Optimality Conditions

To demonstrate how standard optimality conditions may be directly derived from the equivalences in the last lemmas, we consider only the case of dim where X is a B-space and all functions belong to Recall that, for more abstract problems with pseudo-Lipschitz constraint maps, the reduction to upper Lipschitz constraints is possible via Theorem 2.10, whereafter necessary optimality conditions can be derived as for free minimizers of functions, provided the objective is locally Lipschitz, too; see the part ”Optimality Conditions” of Section 2.1.

Equality Constraints

Let be a (local) solution of the problem

Then, is not pseudo-regular at So, by Lemma 2.21 (ii), there is some nontrivial and a sequence such that is a local Ekeland point with factor to the functional Clearly, then the (Fritz-John condition)

must be satisfied. Thus, via

it follows

If the constraint map is pseudo-Lipschitz at i.e. if is pseudo-regular at then is impossible. Indeed, otherwise the singularity condition (ii) for G is fulfilled. So, division by leads us (with new multipliers) to the Lagrange condition
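The chain of conclusions in this subsection can be sketched in formulas as follows. All symbols are assumptions made for illustration: $f$ is the objective, $g = (g_1, \dots, g_m)$ the constraint function, both of class $C^1$, $\bar{x}$ the local solution, $F = (f, g)$, and $y^* = (\lambda_0, \lambda)$ the nontrivial functional provided by Lemma 2.21 (ii).

```latex
% Step 1: for \varphi \in C^1, a local Ekeland point x with factor \varepsilon, i.e.
% \varphi(\xi) + \varepsilon \|\xi - x\| \ge \varphi(x) for \xi near x, has a small derivative:
\[
  \varphi(x + t u) + \varepsilon\, t\, \|u\| \ge \varphi(x) \ \ (t \downarrow 0)
  \ \Longrightarrow\ D\varphi(x)\, u \ge -\varepsilon \|u\| \ \ \forall u
  \ \Longrightarrow\ \|D\varphi(x)\| \le \varepsilon .
\]
% Step 2: apply Step 1 to \varphi = \langle y^*, F(\cdot) \rangle = \lambda_0 f + \sum_i \lambda_i g_i
% at points x_k \to \bar{x} with factors \varepsilon_k \downarrow 0 and let k \to \infty:
\[
  \Bigl\| \lambda_0\, Df(x_k) + \sum_{i=1}^{m} \lambda_i\, Dg_i(x_k) \Bigr\| \le \varepsilon_k
  \quad\Longrightarrow\quad
  \lambda_0\, Df(\bar{x}) + \sum_{i=1}^{m} \lambda_i\, Dg_i(\bar{x}) = 0, \quad (\lambda_0, \lambda) \ne 0 .
\]
% Step 3: if g is pseudo-regular at (\bar{x}, 0), then \lambda_0 \ne 0;
% dividing by \lambda_0 and setting \mu_i = \lambda_i / \lambda_0 gives
\[
  Df(\bar{x}) + \sum_{i=1}^{m} \mu_i\, Dg_i(\bar{x}) = 0 .
\]
```

Step 2 is the Fritz John condition, Step 3 the Lagrange condition. The only analytic ingredient is Step 1, which turns the Ekeland inequality into a norm bound on the derivative.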


Inequality Constraints

Let be a (local) solution of the problem

Now is not pseudo-regular at By Lemma 2.20 (ii), there is a nontrivial vector along with related sequences such that is a local Ekeland point with factor to the functional Here, denotes the Euclidean projection of onto the cone i.e.,

Let us write

and Since the set is not empty. For some subsequence of the finite set is constant. By the construction of and because of where

we have: only for

and For we further obtain Since is a local Ekeland point of with factor it holds This yields, as By definition of J, it holds Further, is valid since Therefore, the Fritz-John optimality conditions are satisfied. If then, due to now is a local Ekeland point with factor for where Thus, again by Lemma 2.20 (i), direction the map is not pseudo-Lipschitz at So, even more, the original constrained map has the same property. In other words, if G is pseudo-Lipschitz at now yields the Karush-Kuhn-Tucker conditions.

Exercise 2. How can the situation of mixed constraints (equations and inequalities) be handled in a similar manner?

Exercise 3. Verify that, for every function is nowhere pseudo-regular. Hint: Apply Rademacher’s theorem.

2.2.4 Intersection Maps and Extension of MFCQ

It was already noticed in Section 1.4 that Robinson [Rob76c] proved the following: a finite-dimensional system with has, at a pseudo-Lipschitzian solution map iff the Mangasarian-Fromovitz constraint qualification [MF67] (MFCQ),

has full rank, and there is some such that and

is satisfied. The rank condition means that is pseudo-regular at whereas the MFCQ direction is both a tangent direction of the (regular) manifold at and a ”descent direction” for the cone-mapping at Let us recall the idea of the sufficiency proof in a non-technical way because we intend to use it under more general settings. To show that contains a point close enough to some given move (relatively) sufficiently far in direction Then, the obtained point fulfills the constraint with a big slack and violates the equation only by a little one. Next, using pseudo-regularity of near one can replace by a (close) solution of the equation Due to the big slack the point will also satisfy the inequalities In the smooth case, the existence and suitable estimates for are ensured by the usual implicit function theorem. Now, this tool can be replaced by direct estimates as already done in [Kum00b]. However, the fixed direction must be replaced by (discontinuously) moving directions.

Intersection with a Quasi-Lipschitz Multifunction

Let X, Y, Z be normed spaces, and let be any multifunctions. The mapping

with represents as above the intersection of independent (with respect to and ) constraints. We ask for the objects playing the role of the MFCQ direction now. Note that we cannot use fixed due to our Example BE.2. The following definitions enable us to deal with the quite general constraint as in the case of inequalities. A multifunction is said to be quasi-Lipschitz near if there is some (small) constant such that, for near and


all sufficiently small the inclusion holds true, provided that and Needless to say that we are not interested in the trivial case of int

Examples If and then G is everywhere quasi-Lipschitz. Indeed, it holds (with maximum norm in ). Using some Lipschitz constant of near this implies whenever Here, Similarly, G may describe standard cone constraints, i.e., if and only if where is a convex cone, and int Now,

Pseudo-Regular Intersections

To quantify the distance of inner points to the boundary (our ”slack”), we adopt an idea due to H. Gfrerer, by defining a function DIST with possibly negative values:
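One standard way to realize such a signed distance is sketched below. The sign convention (negative inside the set, positive outside) and the symbols $z$, $M$, $Z$ are assumptions on our side; the formula is offered as an illustration of the idea and not as a quotation of the definition.

```latex
% Signed distance of a point z \in Z to a set M \subset Z,
% with the convention \operatorname{dist}(z, \emptyset) = +\infty:
\[
  \operatorname{DIST}(z, M) =
  \begin{cases}
    \operatorname{dist}(z, M) & \text{if } z \notin M, \\[2pt]
    -\operatorname{dist}(z, Z \setminus M) & \text{if } z \in M .
  \end{cases}
\]
% Example: Z = \mathbb{R}, M = (-\infty, 0]. Then \operatorname{DIST}(z, M) = z for all z,
% so an inner point z = -s with slack s > 0 has \operatorname{DIST}(z, M) = -s.
% Extreme cases: M = Z gives \operatorname{DIST} \equiv -\infty, M = \emptyset gives +\infty.
```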

Of course, DIST may take the values and and is therefore not a distance function in the standard sense. Nevertheless, as noticed in [Gfr98], the function DIST turns out to be useful when dealing with optimality conditions.

Convention. In the remainder of this subsection, the symbol is used for the norms in Y and Z, while is the norm in X. In product-spaces, we take the max-norm.

Theorem 2.22 (intersection theorem). Suppose that X, Y, Z are normed (real) spaces, and Further, let G be quasi-Lipschitz near Then, the intersection map H is pseudo-regular at if the following conditions hold: F is pseudo-regular at and there are elements defined for near and for in some interval such that, uniformly with respect to all sequences and in gph H,

If G describes standard cone constraints, these conditions are necessary for pseudo-regularity of H at and may be replaced by If G describes level sets of a locally Lipschitz functional i.e., then one may restrict all to everywhere, without violating any of the above statements.


Before proving Theorem 2.22 we quantify the estimates.

Remark 2.23 (estimates). Let G be quasi-Lipschitz near with constant let F be pseudo-regular at with rank L > 0. Then, in a weaker form, the conditions (ii) and (iii) in Theorem 2.22 may be written as follows: For some there are elements defined for near and for such that (ii)’ and (iii)’ These are just the properties we shall need in the proof of the sufficiency part, and it will turn out that H satisfies the pseudo-regularity condition with the estimate

Proof of Theorem 2.22. Sufficiency: We consider small and points close to We further agree that denotes In product-spaces, as mentioned above, we take the max-norm. Because of (ii) and (iii) we find, for any some such that, whenever we have

and

for certain Let

we note that

be fixed. Setting

and

From now on, we consider any points

and

satisfying

with some Here, B is the unit ball in the related (product-) space. We have to show that, for small and some constant K, (2.39) ensures the existence of some such that


The appropriate will be specified during the proof by taking such which satisfy the subsequent conditions. To find we move first sufficiently far in (the moving) direction We put with If

is small enough, we may apply (2.36) to see that

Hence, with we obtain Applying (2.37), the existence of some with is ensured. Next we ”solve” Since is still close to (after decreasing once again if necessary) we obtain, due to and by applying pseudo-regularity of F at the existence of some such that and

So we may estimate

Recalling (2.38) this yields

Since G is quasi-Lipschitz we conclude that Indeed, this follows now directly from and (2.41). Therefore, the obtained point satisfies Finally, taking

into account, we estimate

By our definition of the sum on the right-hand side has the form where and are constants depending on and on the fixed only. This gives us the Lipschitz estimate in (2.40) with and verifies the sufficiency of the given conditions.


For an explicit estimate, let us summarize that, if we have

Our construction thus presented some satisfying (2.40) and belonging to the set

This yields

which is the claimed estimate. Necessity for standard cone constraints: Let Obviously, condition (i) is necessary without any assumption. Let and fix any with We consider

near Pseudo-regularity of H at with such that

with rank

(later we specialize provides us, for small

and

and

Since

we obtain, by adding points of a convex cone,

hence Moreover, due to we have Writing we thus obtain (ii) and (iii). Indeed, the norm of is uniformly bounded by and so we may now replace and by and respectively. Then

Notice that (iii) is evident because of Level sets: Let Now we are in the situation of standard cone constraints with The theorem becomes trivial if because one may put

Let


Necessity. We have is near

with

if and only if is near with Therefore, the conclusions of the necessity part for standard cone constraints are particularly true for the points Due to the above constructed point does not depend on and neither does Sufficiency. Using (ii) and (iii) for the special points the general sufficiency proof provides the estimate (2.40) under the additional hypothesis that Having other points satisfying (2.39), i.e. and

it holds to

Using (2.40) for such that

there is some

related to

and

and

Since

fulfills

and

we may put in order to satisfy (2.40) with the given again. So the remark and all the statements of the theorem have been shown.

Special Cases

To interpret the conditions of Theorem 2.22, we consider particular cases and suppose that G describes level sets of a locally Lipschitz function i.e., now the intersection is

Corollary 2.24 (intersection with level set). Let X and Y be normed spaces, Further, let g be locally Lipschitz and Then, H is pseudo-regular at if and only if F is pseudo-regular at and there are elements which satisfy, uniformly with respect to all and in gph F, and

Proof. In comparison with Theorem 2.22, only appears as a new topic. To obtain the necessity of this condition, note first that (ii) must be true since (see the necessity part of the above proof). Then, holds with some since is locally Lipschitz. So we may replace by


By (ii) in the foregoing corollary we know that G is pseudo-regular at because there is an inverse family of selections of the form

whenever the appearing additional parameter is sufficiently close to Condition (iii) says - but only with dependence on - that is an object like a ”horizontal-tangent direction” to gph F at The inverse families of F are implicitly included in the hypothesis (i). Therefore, we are requiring that F is pseudo-regular and that, in addition, a horizontal tangent direction of gph F is a strict descent direction of provided that we interpret directions as bounded functions. Explicitly, the conditions (ii) and (iii) of the corollary may be written as: and

Remark 2.23 tells us that we may even require the following (formally weaker) condition: and

Let us continue specializing the assumptions of Corollary 2.24.

Case 1: is a function, continuous at Then so does not depend on and (ii) and (iii) in Corollary 2.24 attain the simpler form

Case 2: is locally Lipschitz, Now it suffices to consider directions in a finite subset of only.

Corollary 2.25 (finite sets of directions). Let

Let

and

be a locally Lipschitz

and

Then, H is pseudo-regular at if and only if is pseudo-regular at and, for each sufficiently small there exists a finite subset such that, for all near and all sufficiently small the inequalities and

can be satisfied with some


Proof. By Theorem 2.22 and its estimates, the conditions (ii) and (iii) of Corollary 2.24 are equivalent to condition (iv). Therefore, the direction is evident; we show the other one. Let the requirements (iv) be satisfied for some small and let K be a common Lipschitz rank of and near Then, after replacing by some where we have

The same inequality holds for the norm of Therefore, because of the elements will again satisfy the related conditions of (iv); we only have to take larger Since dim one may select a finite U of i.e. a finite set such that So we find new elements satisfying (iv) for too.

Note. Already shows that card U = 1 cannot be expected. The cardinality of U may increase while is vanishing. But, by the estimates of Theorem 2.22, the conditions of Corollary 2.25 must be satisfied only for some sufficiently small which has been already estimated in terms of L, K, and

Case 3: is locally Lipschitz and is fixed. By definition, condition (v) just says that Clarke’s directional derivative

is negative.

Case 4:

We show that these properties are sufficient to put

in Corollary 2.25, i.e., while there (in a more general situation) for each could attain finitely many values, now only one direction in bd B has to be considered. To prove this, let be an accumulation point of as Now (v) and (vi) yield, with some


The first inequality follows from the existence of the directional derivatives and from the Lipschitz property of Due to and condition (vi) is satisfied for

Since and

is u.s.c., (2.44) yields that small enough. Using the estimate

is true for

near

which can be easily shown for all directionally differentiable functions cf. Lemma 6.24, we obtain, for any and that

Consequently, we have derived (v) and (vi) for fixed indeed. Moreover, we have verified that (v) and (vi) together are equivalent with (2.44). Along with the equivalences (i)

is surjective

rank

this leads to the equivalent pseudo-regularity condition

Case 5:

Let and be a max–function of a semi-infinite optimization problem. For a semi-infinite optimization problem with parametric constraints

put If is a compact topological space and is continuous and has continuous derivatives then (2.43) follows from the well-known formula for the directional derivatives

which also implies that

In this case, the necessary and sufficient condition (2.45) is said to be the extended Mangasarian-Fromovitz constraint qualification (extended MFCQ), see [HZ82, JTW92, HK94, Kla94b, Sha94, JRS98].
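In commonly used notation, the ingredients of Case 5 read as follows. The names $\gamma$, $S$, $S_0$, $h$, $\bar{x}$ and $u$ are assumptions made for this sketch; the displayed condition is the usual form of the extended MFCQ and is not claimed to reproduce (2.45) symbol by symbol.

```latex
% Max-function of the semi-infinite constraints \gamma(x, s) \le 0 (s \in S), S compact,
% \gamma and D_x \gamma continuous; S_0(x) is the set of active indices:
\[
  g(x) = \max_{s \in S} \gamma(x, s), \qquad
  S_0(x) = \{\, s \in S : \gamma(x, s) = g(x) \,\} .
\]
% Directional derivative of g, and its coincidence with Clarke's directional derivative:
\[
  g'(x; u) = \max_{s \in S_0(x)} D_x \gamma(x, s)\, u ,
  \qquad
  g^{\circ}(x; u) := \limsup_{x' \to x,\ t \downarrow 0} \frac{g(x' + t u) - g(x')}{t} = g'(x; u) .
\]
% Extended MFCQ at \bar{x} for the system h(x) = 0, g(x) \le 0 with h \in C^1 and g(\bar{x}) = 0:
\[
  Dh(\bar{x}) \text{ is surjective, and there is some } u \text{ with } \quad
  Dh(\bar{x})\, u = 0, \qquad D_x \gamma(\bar{x}, s)\, u < 0 \ \ \forall\, s \in S_0(\bar{x}) .
\]
```

Since $S_0(\bar{x})$ is compact and $D_x \gamma$ is continuous, the strict inequalities for all active $s$ amount to $g'(\bar{x}; u) < 0$.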


If is finite, then our circle is closed: Condition (2.45) coincides with MFCQ, see §1.4.

Exercise. Let

with vector-valued and in Show that satisfies MFCQ already under the formally weaker (but equivalent, because of the hypothesis that M is Lipschitz l.s.c. at ((0,0),

Intersections with Hyperfaces

Again, let Y be a normed space, but X has to be a Banach space, now. We further suppose that is continuous at

and has closed pre-images

So, gph H is the intersection of gph with a (Lipschitz) hyperface.

Theorem 2.26 (intersection with hyperfaces). Under the above assumptions, H is pseudo-regular at if the following conditions are satisfied:

is pseudo-regular at and there are uniformly bounded elements

for

and

and

in X such that

, there holds

Note: For the conditions (i), (ii), (iii) together are nothing else but the full rank condition for the Setting one needs directions indeed. Proof of Theorem 2.26. By Corollary 2.24, the mappings

and

are both pseudo-regular at say with rank K > 0. Let be a Lipschitz constant for near such that Next consider any points


where


is small (again fitted during the proof). We have to find some satisfying a (uniform) Lipschitz estimate

Without loss of generality, let is sufficiently small) there exists

Due to regularity of such that

(provided that

and

Put By definition of we have is the essential case, otherwise we already found we construct a converging sequence such that Lipschitz property of we have

With

Clearly, In what follows tends to By the

this yields

For small we know (since is continuous at are small enough such that all points in

that with

and

and

are sufficiently close to now such and fixed. Then, there is some

in order to apply pseudo-regularity of

We keep

such that and

The Lipschitz property of yields

Beginning with

we may now put

in order to find, again by pseudo-regularity of

and

This yields

some

such that


Moreover, as long as


(the nontrivial case) we have

So, we have defined a fundamental sequence in with as in (2.49). Since X is complete, exists in By the closedness of and because of it holds and Finally, (2.48), (2.49) and (2.50) yield the regularity estimate with rank

due to

Combining Corollary 2.25 and Theorem 2.26 one obtains by induction arguments a characterization of pseudo-regular Lipschitz functions in terms of finite sets of normalized directions.

Corollary 2.27 (Lipschitz equations). Let and Then, is pseudo-regular at if and only if for sufficiently small there are a finite subset and some such that, whenever and the conditions

if as well as

and can be satisfied by taking certain

if and

If has directional derivatives, then the directions and (which in general may jump in U as can be regarded as being fixed at least for small

Exercise 4. How may Theorem 2.26 be extended to the case of a closed multifunction What about necessity of the conditions in Theorem 2.26 (similar to Theorem 2.22)?

Chapter 3

Characterizations of Regularity by Derivatives

This chapter is devoted to characterizations of regularity with the help of (generalized) derivatives and may be seen as justification of the derivatives investigated in the current book. Let be a multifunction, X, Y be normed spaces,

3.1 Strong Regularity and Thibault’s Limit Sets

According to the definition, strong regularity is pseudo-regularity along with a uniquely defined (local) inverse function. In what follows, we characterize this property by means of the generalized derivative TF. As in the context of pseudo-regularity, we start with the negation. Assume that F is not strongly regular at Then, equivalently,

or

(3.1) says that is not Lipschitz l.s.c. at (3.2) holds, in particular, if certain pre-images are multivalued near Let us put and Now (3.2) just means equivalently the existence of sequences satisfying


For the sequence case, the origin belongs, for

has an accumulation point to the set V of all limits

This set V is exactly Hausdorff limit, we have

defined in Section 1.2. In terms of the upper

with respect to

hence, in this

in gph F and

By the construction of TF, we obtain

We summarize this first consequence of the definition in a lemma. As introduced in Section 1.2, we say that is injective if the origin does not belong to for

Lemma 3.1 (strong regularity for multifunctions). Let (normed spaces), Then, injectivity of is necessary for F being strongly regular at If then F is strongly regular at if and only if is injective and is Lipschitz l.s.c. at

Proof. Immediately by the above discussion.
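A one-dimensional example may help to see the injectivity condition at work. The formula below is the usual description of Thibault's limit set for a locally Lipschitz function, written in assumed notation ($x_k$, $t_k$, $u$, $v$ are our labels); the example $f(x) = |x|$ is ours.

```latex
% Thibault's limit set of a locally Lipschitz function f at x in direction u:
\[
  Tf(x)(u) = \Bigl\{\, v \ :\ v = \lim_{k \to \infty} \frac{f(x_k + t_k u) - f(x_k)}{t_k}
  \ \text{ for some } x_k \to x,\ t_k \downarrow 0 \,\Bigr\} .
\]
% Example: f(x) = |x| on \mathbb{R}, x = 0. With x_k = s\, t_k (s \in \mathbb{R} fixed)
% the difference quotient equals |s + u| - |s|, which runs through [-|u|, |u|] as s varies.
% Since f is Lipschitz with rank 1, no other limits occur:
\[
  Tf(0)(u) = [-|u|, |u|] \ni 0 \qquad (u \ne 0) .
\]
% Hence Tf(0) is not injective, and f is not strongly regular at (0, 0).
```

This agrees with the elementary observation that $|x| = y$ has two solutions for small $y > 0$ and none for $y < 0$.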

For locally Lipschitz functions F, the limit sets have been studied by Thibault [Thi80] (to construct certain subdifferentials) and were denoted there by According to §1.2, we call also the Thibault derivative of F at in direction If F is a function, we write because is unique. For locally Lipschitz functionals these sets were already considered by F.H. Clarke [Cla76, Cla83], since his directional derivative are

Concerning strong regularity of Lipschitz functions, the value of the use of TF and the relations to Clarke’s generalized Jacobians [Cla76, Cla83] have been shown in [Kum91b] and become clear in Chapter 6 below. For multifunctions, was defined in [RW98]. There, the T-operation was applied to and the necessary condition of strong regularity took the equivalent form

3.2 Upper Regularity and Contingent Derivatives

By definition (see Section 2.1), F is upper regular at if there exist L > 0 and neighborhoods U and V of and respectively, such that This requires (like strong and pseudo-regularity) in particular that is Lipschitz l.s.c. at On the other hand, the local upper Lipschitz condition cannot be satisfied (for each choice of L, U, V) if and only if there are sequences Now the quotients are vanishing. Having an accumulation point of the bounded sequence the latter can be written by means of the contingent derivative CF as So we have obtained the following well-known result [KR92].

Lemma 3.2 (upper regularity). Let (normed spaces), and let Then, injectivity of is necessary for to be locally upper Lipschitz at If then it holds

is injective is locally upper Lipschitz at

and F is upper regular at

Proof. Immediately by the above discussion.

Exercise 5. Show that, in Lemmas 3.1 and 3.2, one may replace Lipschitzian l.s.c. by l.s.c. for
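As a small numerical illustration of Lemma 3.2 (a sketch added for orientation, with hypothetical helper names), consider the function f(x) = |x| on the real line at the point (0, 0). The difference quotients (f(t u) - f(0))/t equal |u|, so the contingent derivative at the origin does not contain 0 for u ≠ 0, and every solution x of |x| = y satisfies |x| ≤ L|y| with L = 1. Since |x| = y has no solution for y < 0, the inverse is not Lipschitz l.s.c. at (0, 0); under the definition recalled above, this f is therefore not upper regular there.

```python
def f(x):
    return abs(x)

def contingent_quotients(u, ts=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Difference quotients (f(0 + t*u) - f(0)) / t for t decreasing to 0."""
    return [(f(t * u) - f(0.0)) / t for t in ts]

def inverse(y):
    """Solution set of |x| = y (empty for y < 0)."""
    if y < 0:
        return []
    if y == 0:
        return [0.0]
    return [-y, y]

def upper_lipschitz_holds(L, ys):
    """Check |x - 0| <= L * |y - 0| for all solutions x of f(x) = y."""
    return all(abs(x) <= L * abs(y) + 1e-15 for y in ys for x in inverse(y))

if __name__ == "__main__":
    print(contingent_quotients(1.0))    # quotients stay near |u| = 1
    print(contingent_quotients(-2.0))   # quotients stay near |u| = 2
    print(upper_lipschitz_holds(1.0, [-0.1, 0.0, 0.05, 0.1]))
    print(inverse(-0.1))                # empty: no solutions for y < 0
```

The empty solution set for negative right-hand sides is exactly what separates the local upper Lipschitz property of the inverse from upper regularity in this example.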

3.3

Pseudo-Regularity and Generalized Derivatives

To characterize pseudo-regularity, different generalized derivatives have been used in the literature, in particular contingent derivatives (Aubin & Ekeland) and coderivatives (Mordukhovich). These concepts (see Chapter 1 for the definitions) lead us, for closed F and finite dimension, to (primal and dual) criteria for pseudo-regularity. Having infinite dimension, additional assumptions must be imposed for getting equivalent conditions.

Contingent Derivatives

To keep the technical effort small, we will first use the suppositions of Theorem 2.17. So we assume

Concerning the classical hypotheses (F closed and X,Y Banach spaces), we refer to Theorem 3.4.

Proper Mappings

To apply Theorem 2.17 in the framework of contingent derivatives, let X in (3.9) be a Banach space. The contingent derivative (see §1.2), has been successfully applied to describe locally stable behavior of F in [AE84]. In a related concept of tangent cones for sets in normed spaces, the set gph CF coincides with the contingent cone of gph F at cf. Section 6.1 below. If F is a locally Lipschitz function having directional derivatives, then For chain rules and further properties, see again Section 6 below. We say that CF is linearly surjective near if, for some L and some neighborhood of

Corollary 3.3 (pseudo-regularity if CF is linearly surjective 1). Let (3.9) be true, X be a Banach space and Then, F is pseudo-regular at if CF is linearly surjective near For the ”only if” direction is also true.

Proof. Let and Using (3.10) we find 0 such that So we know, by definition of CF (see Section 1.2), that for some sequence and Therefore, F is partially invertible near and Theorem 2.17 guarantees pseudo-regularity.

Conversely, let F be pseudo-regular. We consider the sequence assigned to and by the partial inverse. Now, the bounded sequence has an accumulation point since Thus, and (3.10) are true.

Closed Mappings

The assumptions of Theorem 2.17 and Corollary 3.3 are not the ”classical” ones: In various papers, one imposes the hypotheses

Theorem 3.4 (basic equivalences, closed maps). Under (3.11), the equivalences of Theorem 2.17 remain true, i.e. the following statements are equivalent:
F is pseudo-regular at
F is partially invertible near
For some neighborhood of is Lipschitz l.s.c. with uniform rank L at each

For proving Theorem 3.4, it suffices to study the original proof of Corollary 3.3 under assumption (3.11) in [AE84] and to note that the hypothesis (3.10) may be replaced by the assumption of F being partially invertible. We repeat first the basic arguments given by Aubin and Ekeland, cf. [AE84, Thm. 4, §7.5].

Theorem 3.5 (pseudo-regularity if CF is linearly surjective 2). Let (3.11) be satisfied and Then, F is pseudo-regular at if there exist L > 0 and a neighborhood of such that (3.10) holds true.

Proof. (by contradiction) Given L from condition (3.10), take

such that Next, using points from (2.27), choose such that set and introduce the functional as Then is a of Put and apply Ekeland’s theorem to on the complete metric space gph F. Now there is some such that

and, by definition of

Consider If then By (2.27), we get

and

Hence,

Since the latter yields a contradiction. Therefore, we have The point depends on Increasing if necessary, we have By (3.10), one finds now a direction and a sequence such that special points belong to gph F: This is the crucial inequality for the proof. With we obtain

So (3.12) and (3.13) yield Using here we have Due to

this leads us to

which cannot hold for small because We arrived at a contradiction, which completes the proof.

Proof of Theorem 3.4. In the above proof, the direction in (3.13) had only to ensure to hold. The latter, however, is already guaranteed (by definition) if F is partially invertible near and L is the related constant. Hence, under (3.11), Corollary 3.3 remains valid, too.

Coderivatives

For normed spaces X and Y and one may define a pair to be an to gph F locally around if there is a neighborhood of such that

This definition corresponds with the approximate normals in [KM80]. For condition (3.14) yields a usual definition of normals at to the set The concept of coderivatives (see §1.2) requires to put

Remark 3.6 The inequality in (3.14) implies

Therefore, one may simplify the definition by setting and respectively, provided that the weak* convergence ensures norm-convergence in the related dual space (due to finite dimension or because of particular properties of F).

The injectivity of is an elegant type of pseudo-regularity condition, elaborated by Kruger and Mordukhovich.

Theorem 3.7 (injectivity of coderivatives and pseudo-regularity). If then injectivity of is necessary and sufficient for pseudo-regularity of F at If X is an Asplund space and then injectivity is still a sufficient condition.

Proof. The first statement has been shown in [Mor93] (in terms of openness one finds this statement already in [KM80, Mor88]), the second one in [MS97b].

We obtain Theorem 3.7, for as a consequence of Theorem 3.11, too. For our Example BE.2 which concerns pseudo-regularity, the sufficient conditions of Corollary 3.3 and Theorem 3.7 are not satisfied due to the properties (iii) and (iv) of this example. A Banach space X is an Asplund space if every continuous convex function is Fréchet-differentiable on a dense subset of X, cf. [Asp68].

Vertical Normals

To derive sharper conditions via one has to avoid the requirement of weak* convergence for defining D*F. To weaken the assumptions concerning X, one may modify the definition of by considering the bilinear form only. If in (3.14), we say that is a Obviously, may be defined for metric spaces X by requiring that

For brevity, let us further say: F has no vertical normals near if there is some such that, for all with there is no to gph F locally around which satisfies and (with denoting here the norms in X*, Y*). In this context, X has to be a normed space. If X is a metric space, we say: F has no vertical zero-normals near if there is some such that, for all with there is no to gph F locally around which satisfies We intend to show that the simpler are just the right objects to characterize pseudo-regularity in a dual manner. Remark 3.8 (equivalence for normed spaces). For normed spaces X and Y, the two conditions of having no vertical (zero-) normals are equivalent.

Proof. Indeed, if F has vertical normals, then the inequalities under Remark 3.6 show that F has vertical zero normals, too. Conversely, are with this already completes the equivalence.

In the case of finite dimension, weak* and strong convergence coincide. So one obtains: Lemma 3.9 (vertical normals 1). Let If then F has no vertical normals near is injective. If then is injective F has no vertical normals near

Next observe that F never has vertical normals if it is pseudo-regular.

Lemma 3.10 (vertical normals 2). Let If X and Y are normed spaces then F is pseudo-regular F has no vertical normals. Similarly, if X is a metric space and Y is normed, then F is pseudo-regular F has no vertical zero-normals.

Proof. Let F be pseudo-regular with rank L, and let be an to gph F locally around with We put and choose satisfying Next consider points with If is small then, for and small one finds (by pseudoregularity) some such that Let be the neighborhood, related to in (3.14). Decreasing if necessary we have and

Dividing by

and using

yields the key inequality

(ii) For proving the second statement, we have to regard a which is formally and yields the same formula without the terms including So we obtain hence cannot tend to zero, and F has no vertical zero-normals. (i) Concerning the first statement we obtain The right-hand side tends to as So cannot vanish, which yields again the assertion. The condition in terms of

is motivated by

Theorem 3.11 (vertical normals and regularity). Let

If X and F satisfy (3.9) and Y is a real Hilbert space, then F is pseudo-regular at has no vertical zero-normals near If, in addition, X is a Banach space, then F is pseudo-regular at has no vertical normals near

Proof. For both statements, see Lemma 3.10. (i) We suppose that F has no vertical zero-normals. The norm in Y is denoted by Assuming that F is not pseudo-regular then, by Theorem 2.16, there is a sequence of related Ekeland points to Using the existence of with

and setting

the Ekeland property becomes

Taking the square of both sides, one has

Writing here

and

Now restrict Then

one obtains

to

where

and Hence, we may proceed with

i.e., After re-substituting and setting namely,

(3.16) holds true with

Thus F has vertical zero-normals near which is impossible by the supposition. (ii) If X is a Banach space, then the vertical zero-normals are vertical normals with

The equivalence in Theorem 3.11 (ii) has already been shown for Asplund spaces X, Y and closed F, cf. Theorem 3.4 in [MS98]. Due to this result and Remark 3.8, Theorem 3.11 (i) holds for Asplund spaces and closed F, too. The new topic of Theorem 3.11 consists in the fact that, by applying the simpler the geometry of the unit ball in X is completely out of discussion. Concerning assumption (3.9), we once again refer to Lemma 2.13.

Concluding Remarks

The Lemmas 3.1 and 3.2 indicate that, at least in finite dimension, a deeper study of the ”derivatives” TF and CF could be valuable, even more since CF is also important with respect to pseudo-regularity. We investigate these derivatives in Chapter 6 below. Until now, we have established characterizations of stability properties by more or less simple other conditions. However, at this point, we neither know any practicable criteria for checking some of these properties nor the main analytical reason why we should do it. Therefore, we will deal in the next chapter with nonlinear variations of (multi-)functions, with implicit functions as well as with successive approximation. In this context, the three regularity notions under consideration shall play a key role, indeed. As technical tools in finite dimension, the derivatives CF and TF will considerably help to analyze particular interesting mappings F. If the value of these derivative-concepts is essentially limited by our Example BE.2, while Example BE.5 presents a Lipschitz function with empty contingent derivatives. In consequence, we will work here directly with the definitions.
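The role of TF in Lemma 3.1 can be made tangible by sampling difference quotients for two piecewise linear functions on the real line. The sketch below is an illustration only: the two test functions and the helper name are assumptions of this example, and TF is understood as the set of limits of quotients (h(x' + t u) - h(x'))/t for x' tending to the base point and t decreasing to 0. For h(x) = |x| the sampled quotients in direction u = 1 fill the interval [-1, 1], which contains 0, so injectivity fails and h cannot be strongly regular at the origin. For h(x) = 2x + |x| the quotients stay in [1, 3], in line with the fact that this function has a Lipschitz inverse.

```python
def thibault_quotients(h, u, t=1e-3, n=201):
    """Quotients (h(x' + t*u) - h(x')) / t for base points x' in [-2t, 2t]."""
    xs = [(-2.0 + 4.0 * i / (n - 1)) * t for i in range(n)]
    return [(h(x + t * u) - h(x)) / t for x in xs]

def kink(x):
    return abs(x)

def tilted_kink(x):
    return 2.0 * x + abs(x)

if __name__ == "__main__":
    q1 = thibault_quotients(kink, 1.0)
    q2 = thibault_quotients(tilted_kink, 1.0)
    print(min(q1), max(q1))   # approximately -1 and 1: zero lies in between
    print(min(q2), max(q2))   # approximately 1 and 3: bounded away from zero
```

Keeping the base point fixed at 0 instead of letting it vary would only produce the quotient 1 for |x|, which is the contingent derivative; the difference between the two sampled sets is the difference between CF and TF in this example.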

Chapter 4

Nonlinear Variations and Implicit Functions

In this chapter, we will (at least) suppose that

The notions of regularity, introduced so far, concern the solution sets of the inclusion where is a constant function. To study nonlinear variations of F, we consider now inclusions

and their solution sets where is some neighborhood of on We put

and we equip the space

and

is supposed to be Lipschitz

of our variations

Since (4.2) means that we study fixed points of

with the norm

The following approach is essentially based on [Kum99]. To modify the regularity definitions for variations in

let

be such that We call F pseudo-regular with respect to G at with rank L if there are neighborhoods U of and V of such that, given and there is some satisfying

The former neighborhood V of in Y is now a neighborhood of in G. Notice that holds by definition of S via (4.2). Further, we say that pseudo-regularity of F at is persistent with respect to if there is some neighborhood of such that F is pseudo-regular with respect to at provided that The related Lipschitz ranks, say L and as well as the neighborhoods U and of may differ from each other. In the same manner, we understand strong and upper regularity with respect to G as well as the related persistence.

4.1

Successive Approximation and Persistence of Pseudo-Regularity

In what follows we verify that pseudo- and strong regularity are persistent under small of maps F satisfying (4.1), and we derive estimates for the assigned solutions. Without these estimates, the result of this chapter reads as follows (in fact, it will become a corollary of Theorem 4.3 below).

Theorem 4.1 (persistence under variations). Suppose that F satisfies (4.1) and is pseudo- [strongly] regular at is some neighborhood of and fulfill Then, if is small enough, there exist a second neighborhood of and a constant K such that, to each zero of in U, there is a [unique] zero of in satisfying

Our concept of persistence of pseudo-regularity does not only say that a multifunction is again pseudo-regular (this was shown by Cominetti [Com90]), it even requires that we have to estimate the distance of solutions in terms of Persistence of strong regularity was originally studied by Robinson [Rob80] for small In this context, the mapping may be directly investigated by applying Banach’s fixed point theorem. Having pseudo-regularity, is neither a contraction mapping nor convex-valued. This makes a direct application of known fixed point theorems more difficult. As a basic tool, we construct a solution to directly by successive approximation.

Supposing (4.1), we want to solve (4.2) by a modification of Banach’s fixed point approach, i.e., by selecting such that is “sufficiently small”. However, the following algorithm simply generates (more general) elements and independently of any function In order to start, we need and constants For realizing the initial step, assigned to we define Process (4.4)

Step

Notes 1. Generally,

may not exist, and the procedure becomes stationary if one selects 2. To solve (4.2), put which yields with as far as and belong to 3. Let be the subdifferential of a convex function given on Put Then means and Hence, given

4. Put Hence,

we require now satisfies and and minimizes

A solution Then

means

and

In this case, the algorithm minimizes by a proximal point method.
5. To solve for closed assume and select with The latter is possible if H is pseudo-Lipschitz with rank on

Before dealing with the convergence of the process (4.4), some comments are appropriate. The concrete, general formulation of our algorithm may be new, not so the idea of applying successive approximation schemes for showing solvability of (pseudo-Lipschitzian) equations or inclusions. This idea can be found already in [Lyu34, Gra50] as well as in [Com90] and, in a more general form, in [DMO80] and [Sle96]. One can also find extensions of Newton’s method which use linearization (in the proper Fréchet sense or a generalized one) of the function at some iteration point and solve the auxiliary problems Some solution now replaces in the next step. Methods of this type were studied in [AC95] and [Don96] and were applied to show persistence of solutions like here, based on Kantorovich-type statements [KA64]. Similar approaches are also known from [Kum92, Rob94] and [Kum95a] where the ”derivatives”, however, had to satisfy conditions which lead to strongly regular and upper regular solutions, respectively.

Here, we intend to show that zeros of pseudo-regular mappings F after small Lipschitzian perturbations can be determined and estimated via a procedure like Banach’s successive approximation: not depending on the linear structure of X and without hypotheses concerning derivatives. So, algorithm (4.4) can be used in the same manner as successive approximation for functions; in particular for deriving implicit-function theorems. The linear structure of Y will not be used, so Y may be a metric space. Then G is no longer normed; and must be defined via and the metric in Y. In this form, the algorithm and Theorem 4.2 apply to multifunctions whereafter the relation to Banach's fixed point theorem is even better visible.

Theorem 4.2 (successive approximation). Suppose (4.1), let F be pseudo-regular at with rank L and neighborhoods and assume that

Then,
The process (4.4) generates in U a (geometrically) convergent sequence with
It holds and
If for in (4.4), where is Lipschitz with rank on U, then
If one can satisfy where is a closed mapping and Y is complete, then

Note: To simplify we will put in later applications and will require that

which then will lead to

whenever

and

Proof of Theorem 4.2. (i), (ii) Let us first assume that the points under consideration belong to the regions U, V of pseudo-regularity.

Having satisfying (as for ) we see that some point really exists. Indeed, if then we put , otherwise exists by the definition of the point-to-set distance. So, by pseudo-regularity of F, we observe that

If

then and

If then has already been defined according to one or more previous steps, hence

and

Moreover, the points

fulfill

This ensures

and Therefore, we generate Cauchy sequences to U and V whenever

which is ensured, provided that and are small enough, namely if (4.5) holds true. The sequence then converges in the complete space X, and the limit fulfills

(iii) To show even if Y is not complete, note first that

Pseudo-regularity and now yield

Thus, dist and, since is closed by (4.1), we have

(iv) The existence of and is ensured since H is closed and Y is complete. As under (iii), we obtain the assertion from dist

Theorem 4.3 (estimates for variations in ). Suppose (4.1), and let F, L satisfy the assumptions of Theorem 4.2. For some let certain functions and fulfill on the following relations:

Then, if such that

solves

Proof. Having

there exists a solution

such that

to

we put

and

Then and, since

Thus F is pseudo-regular at

it holds

with rank L and neighborhoods

By our assumptions, it is easy to see that

and

Thus, Theorem 4.2 may be applied to instead of Denoting the fixed point by we observe that both and

The next direct consequences of Theorem 4.3 extend pseudo- and strong regularity explicitly to small Lipschitzian perturbations.

Corollary 4.4 (pseudo- and strong regularity w.r. to ). Let (4.1) be satisfied and let F be pseudo- [strongly] regular at with constant L. Setting then F is pseudo- [strongly] regular at with respect to (U, Y) with rank 2(L + 1). In particular, if satisfies sup and then F is pseudo- [strongly] regular at if and only if so is g + F.

Proof. Concerning pseudo-regularity, see Theorem 4.3. Concerning strong regularity note that the function is a contraction on U if is small enough. Thus fixed points and of are unique on U due to for some

In a slightly more special form and by direct application of Banach’s fixed point theorem, the strong-regularity version of Corollary 4.4 was a main result of [Rob80] while the second statement of Corollary 4.4 can be found in [Com90].
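The strong-regularity case mentioned above can be tried out numerically. The following is a minimal sketch of the single-valued situation only (it is not Process (4.4) itself): the function f, its inverse, the perturbation g and all constants are assumptions chosen for illustration. Here f(x) = 2x + |x| has a Lipschitz inverse with rank L = 1, the perturbation g has Lipschitz rank 0.2 and |g(0)| = 0.05, so the map x ↦ f⁻¹(-g(x)) is a contraction with factor 0.2, and Banach's fixed point theorem gives a zero x of f + g with |x| ≤ L|g(0)|/(1 - 0.2) = 0.0625.

```python
import math

def f(x):
    # Piecewise linear and nonsmooth at 0, with a Lipschitz inverse.
    return 2.0 * x + abs(x)

def f_inv(y):
    # Inverse of f: slope 3 for x >= 0 and slope 1 for x < 0, so rank L = 1.
    return y / 3.0 if y >= 0 else y

def g(x):
    # Small Lipschitz perturbation: |g(0)| = 0.05, Lipschitz rank 0.2.
    return 0.2 * math.sin(x) + 0.05

def solve_perturbed(x0=0.0, tol=1e-12, max_iter=200):
    """Successive approximation x_{k+1} = f_inv(-g(x_k)) for f(x) + g(x) = 0."""
    x = x0
    for k in range(max_iter):
        x_next = f_inv(-g(x))
        if abs(x_next - x) <= tol:
            return x_next, k + 1
        x = x_next
    raise RuntimeError("successive approximation did not converge")

if __name__ == "__main__":
    x, steps = solve_perturbed()
    print(x, steps, f(x) + g(x))
```

The distance of the computed zero from the unperturbed zero 0 is controlled by the size of g at 0 and by its Lipschitz rank, which is the kind of estimate the corollary expresses in terms of the perturbation.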

4.2

Persistence of Upper Regularity

An extension of upper regularity at ( may be a set) to the related upper regularity with respect to G is not possible in the generality of §4.1 in spite of the fact that the upper Lipschitz estimate is very simple:

The problem arises from the requirement that should be solvable. This cannot be ensured by the weak topological hypothesis of upper regularity at alone. So one needs more structure of the mappings, e.g., convexity in order to apply Kakutani’s fixed-point theorem.

Persistence Based on Kakutani’s Fixed Point Theorem

The next hypotheses permit the consideration of quite general perturbations of F. The latter is desirable, e.g., if we are interested in minimizing a perturbed convex function on Then, is the basic multifunction, but studying allows us to deal with differentiable perturbations of only: So, if one is interested, e.g., in a homotopy between two arbitrary convex functions, the mapping becomes the key for analyzing minimizers. But being a map “near F”, is no longer a continuous translation of a multifunction F. We consider this situation for the case of where the typical ideas already apply. For similarly perturbed inclusions in Banach spaces, we refer to [Klu79, Kum84, Kum87].

Theorem 4.5 (persistence of upper regularity).

Let be closed and be non-empty, convex and compact. Let F be upper regular at with rank L and neighborhoods let the ranges of G and (on U and V, respectively) be non-empty and convex and, in addition, let for some bounded set K. Finally, suppose Then, G has a zero in if

is satisfied. Notes: The additional hypothesis

ensures the

estimate and

for some

For continuous functions G, F, the suppositions of the theorem hold true, if F is upper regular at the sets are convex and nonempty (as for strongly regular F) and is sufficiently small. Our statement then follows already from [Rob79] where inclusions have been studied ( is a function, F a multifunction). Robinson’s setting allows the direct application of Kakutani’s theorem to the map Here, we have to prepare this application by partition of unity or by Michael’s selection theorem.

Proof of Theorem 4.5. For upper regularity of F yields

Therefore, the convex and compact set

fulfills

In what follows, we consider only points Condition (4.6) ensures that, if there is some with Then Hence, if G satisfies (4.6), so does the map Using K we find some such that for all The mapping is closed, has uniformly bounded, convex images and fulfills (4.6). To show that we assume the contrary,

Since

is non-empty, closed and convex, one finds some

such that

Since is closed and uniformly bounded, one sees that implies for near Therefore, the map is l.s.c. and has non-empty convex ranges. By Michael’s selection theorem [Mic56], there is a continuous selection defined on C. Since we have after normalization,

Now consider the mapping It fulfills on C and, due to our hypothesis, is closed (since is continuous) and convex-valued. Thus, by Kakutani’s fixed point theorem, there is some with The latter means hence On the other hand, since and contains, due to (4.6), some element which satisfies (4.7) because of This contradicts So must be true.

For making clear the necessity of the imposed assumptions, it is useful to regard the following (Lipschitz continuous) multifunctions

and sets

where

has no zero near

for

Persistence Based on Growth Conditions

Solution sets of optimization problems form a further special class of mappings. Let us point out here a situation which is simple because the feasible set is fixed and is interesting because we get a direct relation to quadratic growth conditions. In Section 5.1.2 we apply the subsequent theorem to the subdifferential of convex functions. Suppose

Lemma 4.6 (l.s.c. and isolated solutions). Under the assumptions (4.8), the map is l.s.c. at only if

Proof. Assume (conversely) the existence of

Let

and let, according to the l.s.c. assumption, for fixed

Then, due to optimality of

and

for the assigned parameters, we notice that

and

Adding both inequalities yields

But

Since

also fulfills, by definition of

the both inequalities imply was arbitrarily taken, we obtain

Recalling that for all

Hence

is not l.s.c. at

whenever

As an immediate consequence we obtain

Corollary 4.7 (pseudo-Lipschitz and isolated solutions). Under the assumptions (4.8), the solution mapping is pseudo-Lipschitz at only if it is single-valued near

Proof. Indeed, considering the neighborhoods U, V in Definition (D1) of §1.4,

we have So

and

must be l.s.c. at all

follows.
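Lemma 4.6 and Corollary 4.7 can be checked on a one-dimensional instance. The sketch below rests on an assumption of this illustration: the perturbed problems are taken to be minimizing f(x) - a·x over a fixed set M (linear "tilt" perturbations), with M = [-1, 1] replaced by a grid. The function names are hypothetical. For the flat objective f ≡ 0 every point of M solves the problem for a = 0, but for any a ≠ 0 the solution set jumps to an endpoint, so the solution map is not l.s.c. at the non-isolated solutions. For f(x) = x² the solution is single-valued and moves continuously with a.

```python
def argmin_set(f, a, xs, tol=1e-12):
    """Grid minimizers of x -> f(x) - a*x over the points xs."""
    vals = [f(x) - a * x for x in xs]
    m = min(vals)
    return [x for x, v in zip(xs, vals) if v <= m + tol]

def flat(x):
    return 0.0

def square(x):
    return x * x

# Grid on M = [-1, 1] with step 0.01.
GRID = [i / 100 for i in range(-100, 101)]

if __name__ == "__main__":
    print(len(argmin_set(flat, 0.0, GRID)))   # all grid points solve the problem
    print(argmin_set(flat, 0.01, GRID))       # jumps to the right endpoint
    print(argmin_set(flat, -0.01, GRID))      # jumps to the left endpoint
    print(argmin_set(square, 0.1, GRID))      # single minimizer near a/2
```

In the flat case the point 0 belongs to the solution set for a = 0 but stays at distance 1 from the solution sets for all small a ≠ 0, which is the failure of lower semicontinuity described by the lemma.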

Next we consider on small neighborhoods of Given any subset of X, we define

and

Theorem 4.8 (growth and upper regularity of minimizers). Let the assumptions (4.8) be satisfied and Consider the following statements:

is Lipschitz l.s.c. at with rank L; and, in addition, there is a neighborhood

the mappings and with rank and, if for small Then, the following implications are true:

and

of

are Lipschitz u.s.c. at it holds

Note: Condition (4.9) is called a quadratic growth condition (of Proof of Theorem 4.8. (i)

and l.s.c.)

with

(ii) Let For small Then

such that

at

consider any (existing by

hence This yields as well as So we obtain

and

as well as, recalling

In particular, this inequality holds for all Thus,

(ii) (iii) Let Notice that Let

whenever

Using

we observe that

with

where

is fixed.

Hence yields the upper Lipschitz properties with rank is small enough, we have

If

Thus, it holds Finally, if is continuous and then it suffices to minimize on the compact set (which contains ) in order to find some element of

We notice that under (ii) in Theorem 4.8, the set does not necessarily cover all local minimizers of with respect to M near see the discussion of so-called local minimizing sets. It is further worth mentioning that the growth condition

is persistent with respect to small perturbations of (cf. Corollary 6.21 and formula (6.43)).

4.3

Implicit Functions

Knowing that a certain ”nice behavior” of solutions can be extended to solutions of equations for small functions in some class G, we have several means for studying solutions of the equation

for near some critical parameter say As one of the simplest, define

Now equation (4.10) becomes It remains to ensure that becomes sufficiently small in G as

For inclusions of the form

there is only a formal difference after the same settings because inclusion (4.11) takes the form ”Nice behavior” must be extended from the inclusion to As before, should be small in G provided that t is close to 0. The ”nice behavior” we have in mind is the pseudo-Lipschitz property of solutions depending on and respectively. This has been clarified by Theorem 4.3. The unchanged multi-valued term in (4.11) makes the hypothesis of (strong, pseudo-) regularity for F more or less hard.

However, if this supposition is satisfied, we have to deal with F directly, and the mapping appears as a formal appendix in the well-known context of usual implicit-function theorems only. Thus, as a technical problem, we have to ensure that is indeed sufficiently small (in the sense of Theorem 4.3) for t near as far as is restricted to a small ball U around This is the content of the rest of this chapter where we formulate the consequences of Theorem 4.3 in the current terminology and discuss the needed suppositions for sufficiently smooth We impose the general assumption

To indicate that in (4.12), one additionally supposes that Y is a Banach space, we write (4.12)’. As before, we put

and

For recall that Lip denotes the smallest Lipschitz rank, and sup denotes the sup-norm of on U.

Theorem 4.9 (estimate of solutions).

and let and neighborhoods for all

Then, if such that

and of inclusion

Let the assumption (4.12) be fulfilled, be pseudo-regular at with rank L Suppose that, for some and for all in some ball it holds

there exists, to each solution some solution of for parameter

Proof. Solutions of are solutions to Because of (4.12), the pre-images are closed. In addition, (4.13) ensures the estimate

Since and are related to pseudo-regularity with rank L, we have only to apply Theorem 4.3 with and

Assumption (4.13) is fulfilled, e.g. for where is locally Lipschitz. Setting and the existence of a solution is guaranteed if is sufficiently small as on Replacing ”pseudo-regular” by ”strongly regular”, the solutions are unique in some neighborhood of

In order to apply Theorem 4.9, there are two essential tasks, namely:
simplify, if possible, the condition of pseudo-regularity imposed on F, and
verify whether is small enough in (for small and ) such that (4.13) can be satisfied.

Practically (since L is usually unknown), we have to ensure that fulfills

whenever t belongs to a small neighborhood of the origin. To check (4.14), the function may be replaced, in the definition of by any (simpler) continuous function such that fulfills

Using a of condition (4.15) is well-known to hold for the linearization of at

The next lemmas as well as the estimates in the proof of Theorem 4.11 below will summarize these facts and are the key for deriving classical implicit function theorems based on the contraction principle, cf. Zeidler [Zei76]. We need here (4.12)’ instead of (4.12) because integrals in Y are needed for proving the used mean-value theorem.

Lemma 4.10 (variations in ). Let (4.12)’ be true and be continuously Fréchet differentiable to Then the function fulfills and the multifunction is pseudo-regular at iff so is

Proof. It holds

Thus,

is true with as due to the continuity of the derivative. For the second statement, now Corollary 4.4 may be applied.

The following statement can also be shown via the Newton-approaches in [AC95, Don96] or by using such assumptions on h which guarantee that fulfills (4.14) because of the so-called (strong) B-differentiability, cf. [Rob91]. In [Com90], condition (4.14) is rewritten as a strict (partial) differentiability condition. There, one also finds a brief and precise characterization of the relations to the Graves-Lyusternik theorem [Gra50, Lyu34] and to the (Robinson-Ursescu) open mapping theorem in [Rob76a]. By our approach, the statement may be verified, under pseudo-regularity, as an identical copy of the usual implicit-function proof (based on strong regularity).

Theorem 4.11 (the classical parametric form). Suppose that (4.12)’ holds,

exists and is continuous near and is continuous at 0. Moreover, let be pseudo-regular at constant L. Then, for sufficiently small there exists whenever and solves of satisfying

Proof. We have to estimate the quantities

small that

Continuity of sufficiently small

and

with such that, there is a solution

and of Theorem 4.9. For the mean-value theorem implies

along with yields and for it holds

Thus, for

To estimate

With small that

one may write

such that

and small

such

this inequality yields Let us add an approximation, applied in §6.6.2, where the existence of is not supposed. For this reason, we assume that also varies in a Banach space and we write, for continuous

provided that the partial derivative of which contains

exists on some convex neighborhood

Lemma 4.12 (linearization w.r. to parameters). Let and where is a fixed constant. If exists and is continuous near then it holds as well as sup Lip

If

is even Lipschitz near

Proof. By the mean-value theorem, one obtains

which yields the first assertion immediately. The second one follows from

If

for

is Lipschitz near

near

and

0) then it even holds, with some constant K,

near the origin.

then

In the present case, solutions of

and can be compared: With the inclusions become

and respectively.

Concluding Remarks

1. If F is pseudo-regular then the possibly multi-valued solution map defined by the solutions of behaves even pseudo-Lipschitz with respect to In addition, solution estimates in terms of sup can be derived from Theorem 4.3 and lead to implicit-function statements for systems
2. The estimates (4.13) in Theorem 4.9 do not only hold for small smooth perturbations of a smooth original function Other simple possibilities are linear homotopies where and are locally Lipschitz, cf. §11.1.
3. All the solutions, whose existence was claimed, can be determined (theoretically) by using successive approximation, see §4.1.
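Remark 3 can be illustrated in the simplest smooth, single-valued case. The sketch below is an assumption-laden example rather than the general procedure: it takes the hypothetical equation h(x, t) = x + 0.5 sin(x) - t = 0 (no multi-valued term) and iterates with the fixed linearization of h with respect to x at (0, 0), whose derivative is 1.5. This is the contraction argument behind classical implicit-function theorems; for this particular h one also has |x(t)| ≤ |t|, a Lipschitz estimate of the solution in terms of the parameter.

```python
import math

def h(x, t):
    # Hypothetical smooth parametric equation h(x, t) = 0 with h(0, 0) = 0.
    return x + 0.5 * math.sin(x) - t

DH0 = 1.5  # derivative of h with respect to x at (x, t) = (0, 0)

def solve(t, x0=0.0, tol=1e-13, max_iter=100):
    """Successive approximation x_{k+1} = x_k - h(x_k, t) / DH0."""
    x = x0
    for _ in range(max_iter):
        x_next = x - h(x, t) / DH0
        if abs(x_next - x) <= tol:
            return x_next
        x = x_next
    raise RuntimeError("successive approximation did not converge")

if __name__ == "__main__":
    for t in (0.0, 0.1, -0.2):
        x = solve(t)
        print(t, x, h(x, t))
```

The derivative is frozen at the reference point instead of being updated in each step, so the scheme is a contraction iteration and not Newton's method; it converges here because the remainder of the linearization has a small Lipschitz rank near the origin.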


Chapter 5

Closed Mappings in Finite Dimension

In this chapter, we regard only closed multifunctions suppose and investigate, in particular, regularity of locally Lipschitz functions.

5.1 Closed Multifunctions in Finite Dimension

In finite dimension, the regularity conditions derived up to now may be simplified and allow additional conclusions that are not true, in general.

5.1.1 Summary of Regularity Conditions via Derivatives

Theorem 5.1 (regularity of multifunctions, summary). Let be closed and Then:

If X is a normed space, the conditions (5.1) and (5.2) are still necessary for the related regularity.

Proof. For (5.1) see Lemma 3.2 and the related exercise. For (5.2) see Lemma 3.1 and the related exercise. For (5.3) see Corollary 3.3 and Theorem 3.7 (§3.3). For (5.4) see the arguments of the proof to Corollary 3.3 for dim For normed X, the necessity of the conditions (5.1) and (5.2) follows from the same lemmas.

The (pointwise) condition (5.4) is indeed not sufficient for to be Lipschitz l.s.c. at cf. Exercise 9 below. The l.s.c. condition under (5.1) and (5.2) is, in general, not ensured by the already imposed injectivity of CF and TF, respectively. On the other hand, the condition (5.3) can be replaced by a formally weaker surjectivity condition. To show this, we first establish a relation between CF and D*F without considering limits of the dual elements.

Theorem 5.2 (CF and D*F). Under the assumptions of Theorem 5.1, it holds

if and only if

Proof. By Remark 3.6, we have

in gph F and

iff there are

such that

Writing here and ball in ) this becomes

with

( the sum-norm

Since every sequence of (as ) possesses a convergent subsequence, and the accumulation points form just gph one easily sees that (5.6) yields both directions of the assertion.

The next statement can also be shown by using the two pseudo-regularity conditions of Theorem 5.1 together with Theorem 5.2. Our proof applies the Ekeland points of Theorem 2.16 and CF only.

Theorem 5.3 (convCF). Under the assumptions of Theorem 5.1,

(i) the mapping F is pseudo-regular at if and only if
(ii) F is not pseudo-regular at if and only if


Note. Evidently, condition (5.8) is equivalent to the (also necessary and sufficient) coderivative condition but (5.8) requires the study of contingent derivatives, multiplied with fixed only. In this way, the present theorem indicates that the condition (in finite dimension) is nothing but a limit-condition for contingent derivatives.

Proof of Theorem 5.3. (i) By Theorem 5.1, only the sufficiency of the condition must be shown. We apply Theorem 2.16. Assume that F is not pseudo-regular at though exists in the given way. Then one finds, for each certain and such that both and is a global Ekeland-point of dist with factor Recall that the latter means

Let be fixed such that nonempty and closed, some Setting exist and

Next consider any satisfy Euclidean norm and

and

Since realizes the distance and applying condition (5.7) to satisfying

is there

such that assigned points One easily determines, by using the for and small that

From (5.9), we thus obtain

and as well as Since is impossible, the assertion now follows from Theorem 2.16.

(ii) This statement can be easily derived by negation of (5.7) along with the separation theorem.

Recall that, for locally Lipschitz functions, Corollary 2.27 gave another criterion without using derivatives.

Exercise 6. Show that, for one obtains

Note: To get here Mordukhovich [Mor88] defined the coderivative with the opposite sign.

5.1.2 Regularity of the Convex Subdifferential

Here, we discuss the content of the regularity definitions for the subdifferential of a convex function on by showing how they are related to growth conditions. Let be a convex functional on and be its subdifferential at

Then,

means that

is a solution of

and the related infimum value is just the value of the (concave) conjugate function at Let us put

The inverse is the mapping of §4.2, now with Strong regularity of at simply means that has unique and locally Lipschitz minimizers for near the origin.

Next we demonstrate that our three regularity properties can be completely characterized by growth at a point or by (uniform) growth on a neighborhood, respectively. For this reason we introduce the following conditions:

(CG.1) such that whenever and
(CG.2) such that, for all a minimizer of exists and satisfies
(CG.3) such that

Taking into account that is a closed and positively homogeneous map, condition (CG.1) is nothing else but injectivity of Condition (CG.2) has been used in [LS97], in a pointwise context. Here, (CG.2) is a uniform growth condition for all near the minimizers For this condition simply means that is positive definite. Condition (CG.3) is the standard quadratic growth condition with respect to at some point

Theorem 5.4 (regularity of the convex subdifferential). Let be convex, and let be a point such that Then

is pseudo-regular at ⇔ is strongly regular at ⇔ (CG.1) ⇔ (CG.2), and
is upper regular at ⇔ (CG.3).


Notes:

(i) Compared with Theorem 5.1, the l.s.c. condition of (5.2) does not appear.
(ii) The local monotonicity condition (CG.1) holds true if is strongly convex; but the reverse is not true; take This shows that the local situation differs from the global one, i.e., if is globally unique and Lipschitz, then is strongly convex, cf. [LS96, Thm. 2.2].
(iii) For the equivalence "strong regularity ⇔ (CG.1)" will also follow from Theorem 5.14 below.

Proof of Theorem 5.4. The claimed equivalences essentially follow from previously proved regularity conditions.

pseudo-regular ⇒ strongly regular: follows from Corollary 4.7.

strongly regular ⇒ (CG.1): see Theorem 5.1.

(CG.1) ⇒ strongly regular: By Theorem 5.1, it is only to show that is l.s.c. at Setting and in (CG.1), one observes

and Taking argmin near and it follows that is isolated in hence, in particular, the set argmin is non-empty and bounded. So, by Theorem 1.15, the solution sets are nonempty for sufficiently small and is u.s.c. at . Since is isolated in this u.s.c. mapping is l.s.c. at too.

(CG.2) ⇒ strongly regular: Due to (CG.2) and Theorem 4.8, is locally Lipschitz u.s.c. with uniform rank at the unique minimizers for near . By uniqueness of the solutions, it is uniformly Lipschitz l.s.c., too. But this is pseudo-regularity due to Theorem 2.17 and strong regularity due to uniqueness.

strongly regular ⇒ (CG.2): see Theorem 4.8.

upper regular ⇔ (CG.3): see Theorem 4.8 and recall again (for ) that is not empty for small because (as a singleton) is non-empty and bounded.

5.2 Continuous and Locally Lipschitz Functions

In what follows, we intend to elaborate certain deeper, specific properties of continuous and pseudo-regular functions Let us first note that we simultaneously speak of mixed systems of equations and inequalities, in this context.

For consider the mapping

For we write

Since pseudo-regularity of is the pseudo-Lipschitz property of H, and M is pseudo-Lipschitz at iff so is H at for

5.2.1 Pseudo-Regularity and Exact Penalization

Let

On some bounded neighborhood recall that

of

we choose any small function

was defined in (4.3). Given near the equation

we want to find some

(Lipschitzian close to

satisfying

By the results of Section 4.1, exists, provided that is pseudo-regular at and is small. Here, we show that can be found by an exact penalty approach if and only if is pseudo-regular. For this reason, we define for

The function

is a penalty function (with parameters and

for the problem

Let Lemma 5.5

Given

that Proof. It holds

it holds whenever

and

for So we have

and are small enough such

If

then

Thus,

and Since inclusion

is compact and nonempty, this yields now follows from (5.11).

The

The statement (ii) of the next theorem characterizes pseudo-regularity by the fact that, for large is a penalty function of problem (P) provided that is small enough. The notion exact penalty indicates here that the minimizers are feasible points for the original problem (P).

Theorem 5.6 (pseudo-regularity and exact penalization). Let be a zero of let be a bounded neighborhood of and put Then the following statements are equivalent: is pseudo-regular at where

is the unit ball in G with respect to

given in (4.3).

Proof.

Given and that (in contrast to the assertion)

let

and assume

For small continuity ensures that has small norm, and by Lemma 5.5, the norm is small, too. From Theorem 4.3 we thus obtain the existence of a constant C such that, to each small there corresponds some with

Now choose any

So, since Hence

Then

for small Decreasing

and

the point

cannot minimize

if necessary, we know that

fulfills whenever

and


Next put constant: for and If is small enough, continuity of ensures Thus, Lemma 5.5 yields the existence of some and guarantees the estimate

Since we obtain (i) with pseudo-Lipschitz rank and neighborhoods U, V having radius Let of in

be pseudo-regular at 0) and be any fixed bounded neighborhood From the above theorem we know that, for small small and for sufficiently large there is a minimizer of with respect to Each such fulfills the perturbed equation and the estimate

This provides us (at least theoretically) with an exact penalty approach for computing a solution of being "Lipschitzian close" to In particular, we may identify with the function H in (5.10). Then takes the place of a perturbation of H, and we are speaking about solutions to a perturbed system of constraints.

What About the Infinite Dimensional Case?

We imposed the hypothesis to obtain in Lemma 5.5. Concerning the image space, one may formally permit that is replaced by a Banach space Y. But, if the pseudo-regularity of already implies (see Exercise 3 in Section 2.2.3). So, at least for the locally Lipschitz case, the assumption dim would be an empty generalization because the remaining hypotheses cannot be satisfied even if dim

5.2.2 Special Statements for m = n

Specific properties of continuous functions can be used if and/or is locally Lipschitz. As a main motivation for particularly investigating such functions, we mention that (generalized) Kojima-functions are just of the considered type, provided the functions involved in the underlying optimization problem have Lipschitzian derivatives. Therefore, our statements concerning zeros (or level sets) of functions are closely related to critical points of optimization and variational problems.

Let us start with a consequence of Rademacher’s theorem. The latter states for any locally Lipschitz function that the set of those where the Fréchet-derivative does not exist, has Lebesgue measure zero. This yields, for a close connection between upper regularity, pseudo-regularity and isolated pre-images.

Theorem 5.7 (pseudo-regular & upper regular). Let be pseudo-regular at Then there are neighborhoods and some such that

(i) For almost all in particular for all points of Fréchet differentiability, is upper regular at
(ii) For almost all the sets are finite.

Proof. (i) If exists, one has By Theorem 5.1, the Jacobians (for near ) are regular matrices with uniformly bounded inverses Therefore, the point is an isolated solution of The upper regularity at follows again from Theorem 5.1.

(ii) With some neighborhood U from (i), let Since and is Lipschitz on U, also is true (note that By pseudo-regularity, is a neighborhood of Let Using (i), the pre-images are isolated, so they are isolated for almost all Taking some closed ball the set must be finite for such Finally, identifying U with int the assertion has been shown.

Selections

Given any with we ask now for the existence of a continuous function that assigns, to near 0, pre-images and satisfies Evidently, exists trivially if, for some neighborhood U of the map is already single-valued and continuous near the origin. The next lemma tells us that exists only in this trivial case.

Lemma 5.8 (continuous selections of the inverse map). Let

selection of and

be open and bounded and on V. Then is open,

Since has the inverse

be any continuous for all the selection

Proof.

of

with The function

has the inverse

hence it is one-to-one, which tells us that for all

By our hypotheses, and are continuous; hence V and U are homeomorphic sets. Since V is open and bounded, U is open, too. Notice that this conclusion is just the statement of Brouwer’s invariance of domain theorem, cf. [AH35, Kap. X.2, Satz IX].

Note. Consider, for the situation of the lemma above, the special case of Then is an isolated zero of and U is a neighborhood of such that the map is single-valued and continuous on V. If there exist such open sets V and U, we will also say that is locally unique and continuous near

Lemma 5.9 (convex pre-images).

Let be open and bounded and suppose that, for some subset the map is l.s.c. on V. Further, suppose that F has nonempty convex images for all Then F is single-valued and continuous on V.

Proof. The application of E. Michael’s [Mic56] selection theorem yields the existence of a continuous selection of F on V. By Lemma 5.8, it holds that Since is open and is convex, we easily obtain from that is single-valued, too.

Projections

Let Put

and

be pseudo-regular at

0).

and With any norm the closed set

we know that, for sufficiently small is nonempty, and

where L may depend on the fixed norm. Moreover, given such that hence

So the distance If card For fixed norm.

is Lipschitz on on

and

there exists

This yields the well-known observation: then the projection

the number of elements in

is continuous. may depend on the used

Theorem 5.10 (equivalence of pseudo- and strong regularity, bifurcation). Let and be pseudo-regular at Then the following properties are equivalent.

(i) has, on some neighborhood of the origin, a continuous selection with
(ii) The projection map is single-valued near the origin.
(iii) is strongly regular at

Moreover, if is not strongly regular at then is a bifurcation point of such that, near the origin, has no (single-valued) continuous selection s satisfying

Proof. The statements follow immediately from the Selection Lemma 5.8 along with continuity of for card

By the theorem, the local uniqueness of the projection near the origin does not depend on the used norm, provided that is pseudo-regular. Recall that Example 1.4 presented just a pseudo-regular function which was not strongly regular.

5.2.3 Continuous Selections of Pseudo-Lipschitz Maps

Suppose that is pseudo-regular at an isolated zero Then one easily sees that, if

of

is small enough, the multifunction H defined

by

(with L from pseudo-regularity) has compact images, is Lipschitz (upper and lower) on and fulfills

With the same properties, H can be defined for isolated zeros of any pseudo-regular mapping. However, if is not isolated, the existence of a nontrivial continuous compact-valued selection seems to be an open problem even for Lipschitz functions Nevertheless, the existence of H is ensured for piecewise

Lemma 5.11 (isolated zeros of Let be pseudo-regular at Then is an isolated zero of Thus, near the origin, has a Lipschitz continuous, compact-valued selection with

Exercise 7. Verify Lemma 5.11.

The previous lemma is a special case of a powerful theorem, which was recently shown by P. Fusek.

Theorem 5.12 (isolated zeros of Lipschitz-functions, Suppose that is pseudo-regular at 0) and directionally differentiable at Then is an isolated zero of and, in addition, holds for all Moreover, if is even directionally differentiable for near then there is some such that for all and

Proof. See [Fus99, Fus01].

Exercise 8. Let

be defined as follows:

Analyze the continuity properties for F with the Euclidean norm, with polyhedral norms and with “ > ” instead of in the definition of F. Does there exist a continuous function such that on B ?


Exercise 9. Find a counterexample showing that the pointwise condition (5.4) in Theorem 5.1 is not sufficient for the Lipschitz l.s.c. of

5.3 Implicit Lipschitz Functions on

For locally Lipschitz functions in finite dimension, the derivative describes precisely (local) uniqueness and Lipschitz behavior of inverse and implicit functions. This will follow from the subsequent Theorems 5.14 and 5.15, shown in [Kum91b] (including Example BE.3, too) and [Kum91a], respectively. Let us first recall Clarke’s basic inverse function theorem in [Cla76].

Theorem 5.13 (inverse functions and If and all matrices in are regular, then is strongly regular at

This statement follows from the next Theorem and Theorem 6.6. Replacing now by we obtain a necessary and sufficient condition and a clear description for the T-derivative of the inverse. In Chapter 6, we shall see that fulfills also several chain rules which are important for computing them in relevant special cases.

Theorem 5.14 (inverse functions and Tf). A function is strongly regular at if and only if

Moreover, if even belongs to and is some neighborhood of then the following statements are equivalent.

(i) is strongly regular at
(ii) is injective.
(iii) is strongly regular at with respect to

If is strongly regular at then the locally unique (and Lipschitz) inverse satisfies the equivalence

Note. In comparison with Theorem 5.1, the requirement is Lipschitz l.s.c. at does not appear and is, on the contrary, a key consequence of (ii). The first statement of the theorem was already a footnote in [Cla76].

Proof. Let (5.12) be true. Then the open ball and its are homeomorphic because (5.12) ensures that as a mapping of the type is well-defined and Lipschitz with constant By the invariance of domain theorem, V is open if so is U. Since we thus obtain int V. Hence is strongly regular at

Conversely, (5.12) follows immediately from strong regularity of at Put and apply that is locally Lipschitz. Thus the first statement is true; (i) ⇔ (5.12). Now, let be locally Lipschitz, say with some constant K on (i) ⇒ (ii) follows from Theorem 5.1. (ii) ⇒ (5.12) can be seen as follows: If (5.12) is not true, then there are sequences with

Setting With subsequence,

the sequence of we thus obtain

has a cluster point and, for some appropriate

hence (ii) is violated. Thus, using (i) ⇔ (5.12), we also have (i) ⇔ (ii). For the equivalence (i) ⇔ (iii), we refer to Corollary 4.4. The formula for is an evident consequence of the definitions.
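The role of a lower Lipschitz estimate in this proof can be probed numerically. The following sketch assumes that condition (5.12), whose formula is not reproduced above, is an estimate of the form |f(x') - f(x'')| >= c |x' - x''| near the reference point; this reading is suggested by the proof (the inverse is Lipschitz with a constant determined by (5.12)) but remains an assumption. The helper `min_difference_quotient` and both test functions are hypothetical, and sampling on a grid gives numerical evidence only, not a proof.

```python
import itertools


def min_difference_quotient(f, center, radius, samples=201):
    # Smallest value of |f(a) - f(b)| / |a - b| over all pairs of grid points
    # in [center - radius, center + radius]; a sampled estimate of the
    # constant c in a lower Lipschitz estimate.
    pts = [center - radius + 2.0 * radius * i / (samples - 1) for i in range(samples)]
    return min(
        abs(f(a) - f(b)) / abs(a - b)
        for a, b in itertools.combinations(pts, 2)
    )


# f(x) = 2x + |x| has slopes 1 and 3: all difference quotients are >= 1.
c_good = min_difference_quotient(lambda x: 2.0 * x + abs(x), 0.0, 1.0)

# f(x) = |x| is not injective near 0: symmetric points give quotient 0.
c_bad = min_difference_quotient(abs, 0.0, 1.0)

print(c_good, c_bad)
```

In the first case the sampled constant stays bounded away from zero, which is consistent with a locally unique Lipschitz inverse; in the second case it collapses to (numerically) zero.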

The implication (i) ⇒ (iii) is not restricted to Lipschitz functions in finite dimension. As already mentioned in the proof of Corollary 4.4, one may consider the function which maps a small neighborhood of into itself and is contractive as far as is strongly regular at and is small in

Example BE.3 indicates that the conditions of Theorem 5.14 are weaker than F.H. Clarke’s requirement of all matrices in being regular because there is a Lipschitz-homeomorphism of (piecewise linear) such that contains the zero matrix. So, Theorem 5.13 presents a sufficient condition for strong regularity, which is not necessary even for piecewise linear functions 1). Equipped with chain rules for computing Theorem 5.14 turns out to be a powerful tool for strong stability analysis of critical points in (finite dimensional) optimization and for regularity of generalized Kojima functions.

Additional Nonlinear Perturbations

Theorem 5.14 can be extended to systems of the form

and to the related implicit function for

near

In order to exploit the equivalence of (i) and (iii) in Theorem 5.14, we can consider, as in §4.3, the nonlinear perturbation of the


original function. But one has to be careful: In §4.1 and §4.3 we needed small of on some neighborhood of as The latter is not necessarily ensured by supposing to be locally Lipschitz, see Example 6.7 in §6.4. We only know that the sup-norm on fulfills

as If is strongly regular at Theorem 4.5 ensures the existence and local upper Lipschitz behavior of solutions to (5.13), but not the uniqueness. Nevertheless, the Thibault derivative describes again the implicit-function situation near

Theorem 5.15 (implicit Lipschitz functions). The implicit function to (5.13) locally exists as a uniquely defined Lipschitz function (that maps a neighborhood V of (0,0) into some neighborhood U of ) if and only if holds for each

Proof.

For the function has an inverse being locally Lipschitz near (0, by Theorem 5.14. So there are positive and such that the solutions in of are well defined and Lipschitz for By Theorem 4.5, we find positive and such that

is solvable whenever

and related solutions

satisfy

with some (local) upper Lipschitz rank C. Therefore, if is wrong, then there exist sequences and related solutions and such that Then, Setting

both tending to

and and

Because of

After selecting a subsequence such that

we obtain

this tells us that

converges to some

this just means

If

and

then we find

for appropriate sequences converging as noted above. Setting and

one easily sees that the inverse if it exists at all, cannot be locally Lipschitz. Relations between and "partial derivatives" and will be studied in Section 6.4.

Exercise 10. Verify: If is strongly regular at and directionally differentiable for near then the local inverse is directionally differentiable for near

Chapter 6

Analysis of Generalized Derivatives

In this chapter, we study properties of selected generalized derivatives which will play a crucial role in the subsequent chapters. We mainly focus on the contingent derivative CF and the Thibault derivative TF of some given (multi-) function F; both generalized derivatives were introduced in Section 1.2. The presented properties of CF can be found in [AE84]; the related statements for TF are often similar, and we refer, e.g., to [Thi80, Kum91b]. Moreover, we examine the relations between Thibault derivatives and Clarke’s [Cla83] generalized Jacobians with respect to locally Lipschitzian functions, and we discuss so-called Newton maps [Kum00a] which are set-valued first-order approximations of nonsmooth functions and are of interest in the local convergence analysis of Newton-type methods.

6.1 General Properties for Abstract and Polyhedral Mappings

Suppose that

We recall the definitions of and given in Section 1.2 above. One has if there exists a sequence and an associated sequence such that while means that there exists some sequence and associated sequences and such that We will also use the characterization that iff there are in gph F and o-type functions such that Clearly, and

By the definitions only, it holds

In general, the inclusion in that relation cannot be replaced by an equation, see the simple example

Tangents

Derivatives of multifunctions are closely related to tangents of sets, cf., e.g., [Cla83, AE84, RW98, BL00]. To see this, let and Note that convergence of sequences is written index-free according to the convention of §1.1. One says that belongs to the contingent cone of Z at (also called Bouligand cone) if, for some sequence and some related sequence there holds More restrictively, belongs to Clarke’s tangent cone if, whenever and in Z, there are related such that Another cone can be defined by saying that belongs to if there exist certain sequences and in Z such that Evidently, If Z is the closure of an open set, then is the whole space. If Z is a convex polyhedral set in finite dimensions, then and are convex polyhedral cones. Returning to the map F and setting the definitions of tangents in the product space (X, Y) yield that

and

Elementary Properties

Because of the definition via limits, the sets and are closed in Y. If the images of F are convex then or are not necessarily convex again. This fact explains one type of difficulties for establishing a related differential calculus for CF and TF as well. Another type of difficulties comes from the fact that, for writing some element in limit-form, it may happen that one needs a particular sequence In other words, some sequence already assigned to the derivative of another function may be inappropriate to represent in limit-form. This leads to difficulties if we want to show the additivity or other chain rules.

On the other hand, many proofs concerning chain rules for generalized derivatives are only straightforward consequences of the definitions: One has to select appropriate converging subsequences from a given one (which exists by the definitions only). If this is possible, we will say that the related rule can be shown in an elementary way.

As a drastic example of an invalid chain rule we regard the following conjecture: Given real functions that are continuous at it holds provided that Put

if is rational, otherwise. Then Hence, the conjecture is false.

otherwise; and

Lemma 6.1 (TF, CF are homogeneous;

if irrational, for and ). Suppose (6.1). Then

Proof. Given one may reverse the role of and as well as of and in the definition of TF. Then one obtains Thus, is a homogeneous mapping. Similarly, one sees that is only positively homogeneous. As already noticed in Remark 1.1, the equivalences for the inverse multifunction are evident due to the symmetric definitions of the derivatives.

Lemma 6.2 (variation by functions). Suppose that X and Y are normed spaces, and Then

where is defined by The mappings G and have the same derivatives All these statements are also valid for the contingent derivatives CG, and

Proof. By writing down the related limits, one directly sees that

Taking Lemma 6.1 into account we thus obtain that


The analogous arguments hold true for contingent derivatives. Moreover, if in particular and then

Applying this fact to the "difference" of G and at G and have the same T-derivatives at Again, the analogous arguments are valid for CG and

Similarly, one shows by elementary means

Lemma 6.3 (small variations of given mappings). Let X and Y be normed spaces, and Then the following properties hold. 1. If then then 2. If and then 3. If If then 4.

Polyhedral Maps

An important special class of multifunctions was introduced and studied by S.M. Robinson [Rob81]. One says that F is polyhedral, if gph F is a union of a finite number of convex polyhedral sets (bounded or not). Such a union will also be called a polyhedral set. Clearly, F is polyhedral if and only if so is

Examples

Given a convex polyhedron the following mappings are polyhedral and send into itself:
(i) the projection map of onto P, which is multivalued for polyhedral norms and piecewise linear for the Euclidean norm;
(ii) the map of normals (if ) and (if which is related to the Euclidean projection by
Further important examples are the solution maps of parametric linear inequalities and parametric linear complementarity problems, respectively,
(iii)
(iv)
where A, B are given as well as linear transformations of convex hulls with given points and
(v) linear functions
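Example (i) can be made concrete for the simplest convex polyhedron, a box. The sketch below is an illustration under the assumption that P is a coordinate box with hypothetical bounds `lo` and `hi`; in that special case the Euclidean projection is the componentwise clipping map, which is piecewise linear (identity on the coordinates that lie inside the bounds, constant on the others).

```python
def project_box(x, lo, hi):
    # Euclidean projection of x onto the box {z : lo <= z <= hi},
    # computed coordinate by coordinate.
    return [min(max(xi, li), hi_i) for xi, li, hi_i in zip(x, lo, hi)]


def euclidean_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


lo = [0.0, 0.0, 0.0]
hi = [1.0, 1.0, 1.0]

x = [2.0, -3.0, 0.5]
y = [0.25, 0.75, 1.5]

px = project_box(x, lo, hi)
py = project_box(y, lo, hi)

print(px)  # the projection of x
print(py)  # the projection of y
# Projections onto closed convex sets are nonexpansive:
print(euclidean_distance(px, py) <= euclidean_distance(x, y))
```

For a general convex polyhedron the Euclidean projection has no such closed form and requires solving a quadratic program; the box case is used here only because the piecewise linear structure is visible directly in the code.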


Properties

Having a finite representation of the graph by closed sets,

the submappings defined by still describe the contingent tangent cones and the contingent derivatives via

respectively. Clearly, some of these sets may be empty. For polyhedral mappings, the cones are solution sets of linear inequality systems; hence they are computable and have a nice structure. This is the key for showing the next well-known statements. Given a linear transformation we define

Clearly, AF is polyhedral if F is so.

Theorem 6.4 (polyhedral mappings). Let be a polyhedral multifunction, and let Then:

(i) The contingent derivative is again polyhedral.
(ii) (exact approximation). For sufficiently small it holds and if
(iii) (linear transformation). Under linear transformations it holds Conversely, if then one has where is defined by

Note. In particular, is l.s.c. at provided that (i) the matrix A is regular or (ii) F is upper regular at (due to and for small

In general, the l.s.c. condition for may fail to hold: Let be given by and with fixed Then is empty for but holds for

Proof of Theorem 6.4. The statements (i) and (ii) are left as Exercise 11.

We consider statement (iii). Let Then, for small it holds and, by definition,


Hence, The supposition there are elements satisfying

means that, for certain

We have to verify that holds for some If the latter holds true then, as shown in the first part, belongs to and fulfills Thus is necessarily l.s.c. for We show the sufficiency. If is l.s.c. at

then the optimization problem

satisfies (6.4)} has solutions for small and Because F is polyhedral, we find a subsequence of and a fixed convex polyhedron such that the points belong to P. Let P have the implicit description

with appropriate fixed matrices M, N and vector our minimum problem reads

Then, for certain

The linear constraints depend on a real parameter on the right-hand sides, and the related feasible set map is Lipschitz l.s.c. on its closed domain, by Hoffman’s lemma given in Section 2.1 above. Therefore, holds for some L and small After selecting an accumulation point of for one sees that Along with (6.4) for this yields the claim

Exercise 11. Prove the statements (i) and (ii) of Theorem 6.4.

6.2 Derivatives for Lipschitz Functions in Finite Dimension

In this section, we suppose that is a locally Lipschitz function with rank L near Then the differences after replacing of

by a sequence

in the definition

are vanishing due to The same holds in view of

Hence,

These sets are non-empty, closed and bounded because

is locally Lipschitz and

For

it holds

For the absolute value $f(x) = |x|$ we observe that $Cf(0)(u) = \{|u|\}$ (the usual directional derivative), and $Tf(0)(u) = [-|u|, |u|]$ (a closed interval). So $Cf$ and $Tf$ are different even for elementary functions.
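The absolute value example can be checked by sampling difference quotients. The sketch below is only a numerical illustration: the function names, the sampling radius `eps` and the grid sizes are hypothetical choices. For the contingent derivative the base point is kept fixed at 0; for the Thibault derivative the base point also varies near 0, which is what produces the whole interval.

```python
def contingent_quotients(f, x0, u, eps=1e-3):
    # Quotients (f(x0 + t u) - f(x0)) / t with the base point fixed at x0.
    return [(f(x0 + t * u) - f(x0)) / t for t in (eps, eps / 2.0, eps / 10.0)]


def thibault_quotients(f, x0, u, eps=1e-3, n=41):
    # Quotients (f(x + t u) - f(x)) / t where the base point x also varies near x0.
    quotients = []
    for i in range(n):
        x = x0 - eps + 2.0 * eps * i / (n - 1)
        for t in (eps, eps / 2.0, eps / 10.0):
            quotients.append((f(x + t * u) - f(x)) / t)
    return quotients


cq_plus = contingent_quotients(abs, 0.0, 1.0)    # direction u = +1
cq_minus = contingent_quotients(abs, 0.0, -1.0)  # direction u = -1
tq = thibault_quotients(abs, 0.0, 1.0)

print(min(cq_plus), max(cq_plus))    # quotients with fixed base point, u = +1
print(min(cq_minus), max(cq_minus))  # quotients with fixed base point, u = -1
print(min(tq), max(tq))              # range of quotients with moving base point
```

The fixed-base quotients equal 1 in both directions u = +1 and u = -1, so the contingent derivative is positively homogeneous but not homogeneous, whereas the moving-base quotients range over the interval from -1 to 1.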

Further Properties

Proof. We consider

Assume there are some and such that meets and the same is true for the (larger) set

and disjoint open sets and Then

as far as is sufficiently small. Indeed, it holds and all the sets under consideration are uniformly bounded. Next, let elements be written with related and Setting, for and the points belong to and form a continuous curve, connecting and So cannot be true; M is connected. For put Both mappings are Lipschitz in

i.e.


This follows from the Lipschitz estimate

by assumption. The mapping has a useful subadditivity property which was already shown in [Thi80].

Setting

(6.10) follows due to

and by passing to the limits as The statements

follow directly from the definition by selecting appropriate subsequences.

For locally Lipschitz functions, the relation between and is very close. To see this, we prove a further characterization of

Theorem 6.5 For one has:

(i) if and only if for
(ii) if and only if

Proof. (i) By Remark 3.6, the relation

means that

The left-hand side of (6.13) does not depend on side fulfills with some Lipschitz rank

and Let

of

explicitly. The right-hand near

and vanishes. So

means equivalently that

The same condition appears for (ii) We show first

for certain

for the given

This verifies (i).

Let

The contingent derivatives fulfill by definition

hence may estimate

satisfies

see Lemma A.2. So, given any

For fixed

one finds

we

such that and

This ensures, for certain

i.e., (ii)

From (i), we thus obtain the assertion Using

which is true (for all

by definition of

with

again for certain sequences of the latter yields the assertion.

6.3 Relations between and

Again, let be locally Lipschitz. Setting we obtain from (6.14)


Generalized Jacobians

We briefly recall the needed definition, see Chapter 1: exists as Fréchet derivative}, for certain and

is the generalized Jacobian of at in Clarke’s sense. Since is closed (in finite dimension), is also a closed mapping, by Caratheodory’s theorem concerning convex hulls. We show that Let one finds

Considering points such that

such that

and

where Hence, After replacing by its convex hull, one obtains Further, if is an exposed matrix (i.e., A is not a proper convex combination of elements in then and, consequently, the related set fulfills

where the symbol "ex" refers to the set of exposed elements. Conversely, the (deeper) relation

holds. To verify this inclusion, one may apply the mean value theorem

shown in [Cla83] and the fact that is closed and locally bounded. Moving and here, and taking into account that (obviously)

the inclusion (6.16) follows immediately.
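The definition recalled above (convex hull of limits of Fréchet derivatives taken at nearby points of differentiability) can be imitated by sampling. The following Python sketch is only an illustration under stated assumptions: the test function f(x) = max(x1, x2), the sampling radius and all helper names are hypothetical and not taken from the text, and random sampling merely approximates the limit set.

```python
import random

def grad_max(x):
    """Gradient of f(x) = max(x[0], x[1]) where it exists (x[0] != x[1])."""
    if x[0] == x[1]:
        return None  # f is not differentiable on the diagonal
    return (1.0, 0.0) if x[0] > x[1] else (0.0, 1.0)

def sampled_gradients(x, radius, n, seed=0):
    """Distinct gradients found at n random points within `radius` of x."""
    rng = random.Random(seed)
    found = set()
    for _ in range(n):
        y = (x[0] + rng.uniform(-radius, radius),
             x[1] + rng.uniform(-radius, radius))
        g = grad_max(y)
        if g is not None:
            found.add(g)
    return found

def in_hull_of_two(a, p, q, tol=1e-12):
    """Check whether a lies on the segment conv{p, q} in the plane."""
    d = (q[0] - p[0], q[1] - p[1])
    lam = ((a[0] - p[0]) * d[0] + (a[1] - p[1]) * d[1]) / (d[0] ** 2 + d[1] ** 2)
    proj = (p[0] + lam * d[0], p[1] + lam * d[1])
    on_line = abs(proj[0] - a[0]) <= tol and abs(proj[1] - a[1]) <= tol
    return on_line and -tol <= lam <= 1.0 + tol

# Near the kink (0, 0) both gradients occur; their hull is a segment.
limits = sampled_gradients((0.0, 0.0), 1e-6, 200)
print(sorted(limits))
print(in_hull_of_two((0.3, 0.7), (1.0, 0.0), (0.0, 1.0)))  # inside the hull
print(in_hull_of_two((0.5, 0.6), (1.0, 0.0), (0.0, 1.0)))  # outside the hull
# Near a point of differentiability only one gradient is found.
print(sampled_gradients((1.0, 0.0), 1e-6, 50))
```

At the kink, the sampled limit set consists of the two unit vectors, so the generalized Jacobian is the segment joining them. Away from the kink the set reduces to the ordinary gradient.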

Theorem 6.6 ( and generalized Jacobians)

Proof. Now, the assertion is a consequence of (6.15) and (6.16) because is convex.

The inclusion (6.16) may be strict even for piecewise linear functions see Example BE.3. For (6.17) shows because is convex as a connected subset in R. The listed properties concerning including also chain rules for composed functions, have been shown basically in [Thi80], while (6.7) and (6.17) were proved in [Kum91b]. Concerning properties of we refer to [Cla83], and concerning to [AE84].

6.4 Chain Rules of Equation Type

6.4.1 Chain Rules for Tf and Cf with

In what follows we derive chain rules for functions in finite dimension and impose the general assumptions: is locally Lipschitz, and Then one has

The proof is elementary: Let and for certain and One can select a subsequence such that has a limit Setting now one obtains and i.e. If and are fixed, our arguments remain valid, so one obtains the assertion for too. For the reverse inclusions, one needs extra assumptions, because elements and may require different sequences for the related limit representations. If the limits do not depend on the particular sequences, the difficulties vanish. So we leave the following chain rule as an


Exercise 12. Verify the following stronger versions of (6.18):

If

and

are not continuously differentiable, then it still holds:

Proof. Let with some be written as Since is pseudo-regular, we find first certain tending to and next related points satisfying a Lipschitz estimate Thus, the points form a sequence with some cluster Then and, due to and it holds also as required.

Note. We have shown that

Hence, if

and we already know that is single-valued, then

belongs to

for some

and

hold true. However, even if is linear, the reverse inclusion in (6.18) may fail to hold for and

Example 6.7 (chain rule, counterexample). Let

where

Then

but

Partial Thibault Derivatives

The next chain rules appear to be the key for our later applications. We consider

Again, and are supposed to be locally Lipschitz. We are interested in the formula

where and denote the partial T-derivatives, defined, as usual, by fixing the remaining arguments. In general, (6.21) is not true; we need a special property of

"Simple" Lipschitz Functions

A locally Lipschitz function is said to be simple at if, for all and each sequence there is a sequence such that holds at least for some subsequence of It is remarkable that neither all functions nor all are simple. On the other hand, all are simple (cf. [Kum91b]). Further simple functions are and see Lemma 7.4 below. Detailed investigations of simple functions and relations to the following chain rule may be found in [Fus94].

Theorem 6.8 (partial derivatives for ). Let and be locally Lipschitz, and let exist and be locally Lipschitz, too. Then

Let, additionally, be simple at Then the equation (6.21) holds true. Moreover, given any and there are sequences and such that as well as

Note. Clearly,

Proof of Theorem 6.8. All sequences will be supposed to be positive and vanishing. Let

Proof of

Let

be given by

and

We put and analyze the right-hand side of (6.21). The set contains the accumulation points of

Since the sequence is bounded, convergence may be assumed (at least for some subsequence). Next consider

Again, convergence may be assumed (again for some subsequence if necessary). Thus, contains the limits of

Since also is bounded, now may be assumed. Therefore, the right-hand side in (6.21) contains It remains to show that is vanishing. Explicitly, we have

Adding elements of "vertical groups" as well as this yields

where A and B denote the two squared brackets. The term may be written as

Similarly, we may write the other three differences:

Since the derivatives as well as are locally Lipschitz, we thus obtain estimates of the form:

for some Finally, recalling that and we see that tends to zero, indeed.

Proof of Suppose that is simple. Now let be any element of the right-hand side in (6.21), with We have to verify that there are sequences

and

such that

The limit expressions of and will be a by-product of the construction. Due to our assumptions we may write

with Since is simple, can be written as a limit (of a subsequence), where the already given sequence of occurs:

provided that has been suitably taken. Notice that

Using next that exists and is locally Lipschitz, the limit

does not depend on the selected sequences and So we may change them and put and Additionally, the terms may be replaced (without changing) by since In this way we obtain with


Setting now as well as we observe that (again for some subsequence)

Moreover, the elements

and

of the first part

are now

and

So we can estimate as above and obtain This proves the theorem.

Corollary 6.9 (standard partial derivative). Suppose that is locally Lipschitz, exists and is locally Lipschitz as well. Then

Proof. The function is simple.

The next conclusion will be our key for dealing with generalized Kojima functions in the subsequent chapter.

Corollary 6.10 (product rule). Let

where and are locally Lipschitz matrix-valued functions of related size. Suppose that one of them is simple. Then the product rule of differentiation holds for TF, i.e.,

Proof. If, for example, N is simple, put

and

Note that the result of the operation is a set S of matrices A having the size of To get the first set of the sum, one has to multiply all by Needless to say, Corollary 6.10 holds (by the same arguments) for sums too. The latter is formally needed if one writes equation

as the equivalent


Partial Contingent Derivatives

What about the contingent derivatives under similar assumptions? To study this problem, we must put and and, of course, replace the derivatives under consideration. We have to begin with the definition of a simple locally Lipschitz function with respect to To be simple at now means the following: For each

and every sequence there holds at least for some subsequence of

This is nothing else but the existence of the directional derivative at

i.e.,

The proof of Theorem 6.8 now remains valid step by step in the present context; it only becomes shorter due to the fixed sequences and the notion simple at with respect to may be replaced by directionally differentiable at The results are the following analogous statements.

Theorem 6.11 (partial derivatives for ). Let and be locally Lipschitz, and suppose exists and is locally Lipschitz, too. Then

If, additionally, is directionally differentiable at then the inclusion holds as equation:

As a product rule this yields

Corollary 6.12 Let where and are locally Lipschitz matrix-valued functions of related size. Suppose that one of them is directionally differentiable. Then

6.4.2 Newton Maps and Semismoothness

Newton Functions

Let be any function and X, Y be normed spaces. If is continuously differentiable and is fixed, the two approximations and


may replace each other because both,

and

satisfy

For

if and f(0) = 0, exists but does not. For the reverse situation occurs. When applying solution methods, we need (or have) at points near a solution So the second approximation becomes important and, if the condition must be specified for multivalued derivatives.

Let be locally bounded. We say that is a Newton function of at if

Our notation will be motivated by Newton’s method, see Lemma 10.1. At this moment, one may regard the actual property as being a generalization of continuous differentiability for nonsmooth functions. Notice that in (6.22), may be replaced by the u.s.c. function

without violating this condition. So may be supposed to be u.s.c. (or continuous as well). Further, the function may be arbitrary at and is not uniquely defined at If satisfies (6.22), then it is also a Newton function of at whenever Here, is not necessarily small in the - norm, cf. (4.3).

Newton functions at are (single-valued) selections of locally bounded maps such that Accordingly, we call M a Newton map of at This property is obviously invariant if one forms the union or the convex hull of two Newton maps (the set on the right-hand side of (6.24) is convex).

Example 6.13 (examples of Newton functions).

1. If is a Newton map at Indeed, since

Thus, we may write

and

denotes the unit ball of

is locally Lipschitz with rank satisfies

with

then

every matrix

and


where


Using the new o-type function

in (6.24), this is the assertion. 2. For

and

one may put

where Indeed, for sufficiently small, the index sets fulfill with some Lipschitz rank L of near , we obtain

Hence, If

Thus,

satisfies (6.24). then, due to

cf. Lemma A.2, one easily confirms that (6.24) implies (with a possibly new o-type function),

However, is not necessarily directionally differentiable (see the next theorem), and M does not have a so-called approximate Jacobian [JLS98] which would require

Surprisingly, condition (6.22) is a weak one, and Newton functions satisfy a common chain rule.

Theorem 6.14 (existence and chain rule for Newton functions).

(i) Every locally Lipschitz function ( Banach spaces) possesses, at each a Newton function being locally bounded by a Lipschitz constant L for near

(ii) Let and with Newton functions at and at Then defines a Newton function of at

Proof.

(i) Given

By Hahn-Banach arguments,

For small and

this yields

there is a linear operator

with

even exist with bounded norm

So it suffices to put

(ii) The straightforward proof is basically the same as for Fréchet derivatives. We put and Our assumptions yield where and

where Thus,

Now If

that

then

is of type since is uniformly bounded for Otherwise, we obtain from

vanishes as

near

Hence

By Theorem 6.14, it turns out that, having Newton maps and at the related points and the canonically composed map is a Newton map for the composed function However, the function defined under (i) in the previous theorem does not use local behavior of near and depends on which is often an unknown solution. So one cannot directly apply statement (i) of Theorem 6.14 for solution methods, since first one has to find satisfying (6.22) without using Nevertheless, having it can be applied for Newton’s method exactly as under the usual boundedness condition of the inverse, see Section 10.1. Maps M satisfying (6.24) and particular realizations have been considered in [Kum88b] in order to investigate convergence of Newton’s method.
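How a Newton function is used in a solution method can be indicated by a small Python sketch. It is an illustration under assumptions, not the method of Section 10.1 itself: the iteration is taken to be x_{k+1} = x_k - Rf(x_k)^{-1} f(x_k), the residual check mirrors the little-o requirement on f(x+u) - f(x) - Rf(x+u)u, and the piecewise linear test function, the slope selection Rf and all names are hypothetical.

```python
def f(x):
    """Piecewise linear test function with a kink at 0 and its zero at 1/3."""
    return 2.0 * x + abs(x) - 1.0

def Rf(x):
    """A selection of slopes: 3 for x >= 0 and 1 for x < 0 (hypothetical choice)."""
    return 3.0 if x >= 0.0 else 1.0

def newton(f, Rf, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f(x) / Rf(x) until |f(x)| <= tol."""
    x = x0
    for k in range(max_iter):
        if abs(f(x)) <= tol:
            return x, k
        x = x - f(x) / Rf(x)
    return x, max_iter

def relative_residual(f, Rf, x, u):
    """|f(x+u) - f(x) - Rf(x+u)*u| / |u|; this should vanish as u -> 0."""
    return abs(f(x + u) - f(x) - Rf(x + u) * u) / abs(u)

root, iterations = newton(f, Rf, x0=-5.0)
print(root, iterations)

# The residual is checked at the kink x = 0, where f is not differentiable.
worst = max(relative_residual(f, Rf, 0.0, u) for u in (1e-1, -1e-1, 1e-4, -1e-4))
print(worst)
```

Note that the slope is evaluated at the perturbed point x + u and not at x. For this piecewise linear example the residual vanishes up to rounding, even at the kink, and the iteration crosses the kink without difficulty.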


Semismoothness

A function is said to be semismooth at if is a Newton map at This notion, based on Mifflin [Mif77], has been introduced and used for Newton’s method by [PQ93] and [QS93] and in many subsequent papers. Some modifications of semismoothness are mentioned in Section 10.1. One well-known class of semismooth functions is the class since and each is trivially semismooth (everywhere). Another class consists of certain NCP functions, which will be considered in Section 9.2. The real, globally Lipschitz function in Example BE.0 is nowhere semismooth. Before showing how Newton maps may be applied to the class defined below, we recall conditions for semismoothness given by [Mif77, Prop. 3, Thm. 2].

Theorem 6.15 (semismoothness; Mifflin). Convex functions and maximum functions over compact sets Y are semismooth, provided that is continuous and exists and is continuous, too.

Proof. We present a proof for completeness and in order to show how the inclusion which holds for near in the case of a finite set Y (where will be modified in the current case of a compact index set Y. For the sake of simplicity, let

(i) Let

be convex. The inclusion

must be shown for all

By the definition of

we already have

i.e., Since

is u.s.c., it holds such that

Thus, one finds some This ensures

as required. (ii) For the maximum function, it holds

where The semismoothness condition becomes by definition

We will see that this is equivalent to

which is just the mentioned inclusion for near if Y is finite. Using compactness of Y and continuity of one estimates uniformly

By upper semicontinuity of (cf. Theorem 1.15), one has So let fulfill Notice that

Thus, taking (6.26) and (6.28) into account, we have to show that (6.27) is valid, indeed. Suppose contrarily that

holds for certain and related Via the mean-value theorem, we obtain points (between 0 and ) satisfying

It follows

and

Passing to a subsequence if necessary, there exists a common accumulation point of and So, with some accumulation point of we finally arrive at a contradiction

Note. In consequence, the function is semismooth for Euclidean norm and compact, non-empty Further, each DC-functional (difference of convex functions) is semismooth. The same holds (by Theorem 6.14), if has DC components since


Pseudo-Smoothness and D°f

We call pseudo-smooth if is a on an open and dense subset These functions appear in many applications, cover the class by Lemma 6.17, have locally bounded derivatives on and obey nonempty sets

Nevertheless, in Example BE.6, §12.1, we present a real convex function that is not pseudo-smooth. Let be the set of all points of It is not difficult to see that does not depend on the choice of in (6.30). One could even replace by any dense subset of in (6.30). In addition, it holds

The single-valued selections of are natural candidates for being Newton functions, because holds necessarily for all closed maps M satisfying

hence also for all closed Newton maps M which assign, to as usual, the Jacobian

In order to check whether is a Newton map for a pseudo-smooth function at it suffices to consider all points in a dense subset of and to investigate whether

holds true. In this case, the contingent derivative can be estimated by too.

Lemma 6.16 (selections of ). If is pseudo-smooth and some selection of is a Newton function at fixed then is a Newton map at the same and it holds

Proof. Let be a second selection. By (6.31), Sf satisfies (6.22) for points If in (6.22) is not u.s.c., we replace it by in (6.23) which is again of little-o-type. Then, since each is a limit of elements in condition (6.22) holds by continuity arguments for at too. Thus, every selection of fulfills (6.22), so (6.24) is true. Inclusion (6.33): For small due to pseudo-smoothness, the quotients

can be approximated (with error < ) by

such that and Then (6.31) and (6.22) guarantee that which yields the assertion because as

In Example BE.1,

is pseudo-smooth and directionally differentiable with (6.33) fails to hold though exists, and neither nor contain a Newton function at Nevertheless, there are pseudo-smooth functions outside such that is always a Newton map.

Locally Functions

Let be pseudo-smooth. We call locally (and write ) if there is an open and dense subset such that is on and the following holds: There exists a finite family of open sets and continuous functions satisfying

(i) is on and is uniformly continuous on for each bounded set K, and
(ii) for each there exists an such that, given one finds some with rel int conv and

In comparison with (proper) functions, we do not claim that is on the whole space. The set in the previous definition will be also called of

Lemma 6.17 (special locally functions). The Euclidean norm of a linear function and all functions are locally A pseudo-smooth function is locally if there is a finite covering of by convex polyhedra such that is and is uniformly continuous on int In addition, if and are locally and then is again locally (provided that are of appropriate dimension).

Proof. Euclidean Norm: If if and Let coincides with

put otherwise. and

near if

Note that We put

and

The set is dense in We prove this known fact for completeness: Otherwise, one finds some ball with Then the relatively


open sets are dense in V, by definition of the intersection is again dense. But means which contradicts Thus, is dense, indeed. Next, given one finds such that

Now assign some equations

So

to

such that Then the and follow from the choice of and and is valid because and coincide near Finally, is uniformly continuous on since is continuous on Covering: Define such that, for the set The existence of is ensured since all

and take

small enough is constant.

are polyhedral sets.

With the related sets and radii assigned to

and

one may

put and

We are now ready to present the motivation for the above definitions.

Theorem 6.18 (Newton maps of and ). Let be a locally function Then
(i) is a Newton map of at
(ii) The function in (6.24) can be taken as provided that both is a modulus of uniform continuity for all functions on near and is continuous.
(iii) For the composition of locally functions and the mapping is a Newton map of at

Note.

Modulus of uniform continuity means that for all

If all

are globally Lipschitz on

then

near holds for small

Proof of Theorem 6.18. Proof of (i) and (ii). Given we find some that defines the ball the definition of Let With some according to the definition of we obtain

in and


This allows us to integrate and to estimate

Due to uniform continuity, the supremum is bounded by this guarantees Let be any selection of Since on provided that belongs to the dense subset

Using

now (6.22) holds true,

of Because remains bounded, continuity arguments then yield that (6.34) also holds for the upper Hausdorff limit:

Replacing here by any open and dense set where is on, we obtain the same set A on the left side of (6.35) because and are continuous on and each can be approximated by arbitrarily close. Thus, by (6.35) and definition of we obtain for all which verifies (i) and (ii). Proof of (iii). Knowing (i), statement (iii) follows from Theorem 6.14.

Remark 6.19 Combining (6.25) and (6.33) one obtains upper and lower estimates of

for

with certain o-type functions

Here, will depend on too. However, for it becomes obvious that these functions are uniformly bounded for all in some compact set K and by

as

and

and

Exercise 13. Let be strongly regular at Show that is semismooth at if is semismooth at Note that you cannot apply invertibility of the matrices in due to Example BE.3.

6.5 Mean Value Theorems, Taylor Expansion and Quadratic Growth

In this section, we establish the Taylor expansion of a in finite dimension in terms of as in the smooth case. We start with the simplest case of a mean value theorem. There holds:

Notice that (6.37) fails to hold with in place of

Proof of (6.37): Assume first that some realizes max on [0,1]. Then

for small yields

for some

and

for yields

for some Since is connected, we obtain If there is no maximizer of on [0,1] in (0,1), then implies that there is a minimizer of on [0,1] in (0,1), and, by similar arguments, one comes to the same inclusion. By using a linear transformation and Lemma 6.2, we have the statement:

For a rather large set must be used for estimates. There holds

Proof. Indeed, otherwise one may separate and C, i.e., there is some with

Put Then property (6.38) yields, with some

The last inclusion holds true due to property (6.18), where denotes the set Considering all one obtains a contradiction. Statement (6.39), in terms of instead of has been shown and is a main theorem in [Cla83]. Our set C is formally smaller than the related set


since

So (6.39) looks even stronger. But recall that, for showing we had already applied the of the mean value theorem (based on Rademacher’s theorem and Fubini’s theorem in [Cla83]). Really, the two versions are equivalent because of the convex-hull operation and Theorem 6.4. The next theorem extends well-known facts from to functions by using the same devices concerning the proof.

Theorem 6.20 ( expansion). Let Then

holds for some and some

Proof. Without loss of generality let and Moreover, replacing by we do not change the statements, since this transformation does not change the set So we may assume that Next put Then

The last inclusion is justified by (6.18). Due to the transformations (6.40), it suffices to show that for some For this reason we define the real function

Then

and

Applying the usual mean value theorem, there is some

such that

Next we use property (6.38) to obtain for some With our settings, this means that

and, since

as required.

For the related statement based on Clarke’s generalized Jacobian we refer to [HUSN84].

Quadratic Growth

The following statements are of particular interest if the subsequent constant is positive, because then is locally growing in comparison with its linearization. Nevertheless, may be any real number.

Corollary 6.21 (quadratic growth on a neighborhood). Let

be a cone and

be a constant such that

Then there exist a neighborhood

of

and some

such that, for all

Note. In particular, one may put (as long as ) in (6.42). Since is homogeneous, (6.41) holds also for hence for a "double cone".

Proof of Corollary 6.21. The mapping is closed and locally bounded, and is Lipschitz in near So, if is small, (6.41) remains valid for and with Applying Theorem 6.20 to and (and using ) now proves the corollary.

Even for convex functions, quadratic growth at a point, i.e. (6.42) for only, does not imply that (6.41) holds with some

Example 6.22 (counterexample). Take a real, monotone Lipschitz function such that is constant on intervals of the form for odd and Setting one obtains a convex function satisfying (6.42) for U = R and Simultaneously, holds.

Condition (6.42) for requires a weaker assumption, hence we may replace by there.

Theorem 6.23 (quadratic growth at a point). Let be a cone and be a constant such that

Then there exist a neighborhood of and some such that (6.42) holds for i.e.,

Proof. We assume otherwise consider Given any sequence there exists a subsequence such that can be written as

where, for certain If (6.42)’ is false -with We consider such

it does not hold for related Let

with

and Since

there is some K depending on

only, such that

Now it holds

Due to (6.43) and the second integral turns out to be of type By (6.41)’ and by the choice of we have, for sufficiently small and

Therefore,

Recalling that this shows, for sufficiently small of the selected sequence, the inequalities

This verifies the assertion.

Further, the contingent derivative can be applied in order to derive a first-order estimate for functionate. Lemma 6.24 (mean values via

If

and

then Proof. Let

be fixed and let

Then T is closed and that assume

Let

be the set of satisfying

be the maximal element of T. To show

Since and

we know that

for certain i.e., This contradiction shows was arbitrary, the lemma is shown. Corollary 6.25 (Lipschitz condition). Let

some and

and suppose, with holds for all near

and max-norm in that Then is Lipschitz near

Proof. Otherwise there are

Put

and

Because

with rank

near

such that, with some

Since, by assumption,

Lemma 6.24 yields the opposite inequality

which completes the proof. Exercise 14. Show that

single-valued

is

on an open set

if

is

6.6 Contingent Derivatives of Implicit (Multi–) Functions and Stationary Points

Let us consider the set of solutions to the equation

under the assumptions that F maps

to

and

where denotes the partial derivative with respect to In order to avoid confusion in notation of the present section, we use calligraphic letters to denote the spaces under consideration. Let

i.e., Under the hypotheses of the usual implicit function theorem for and S are single–valued and locally (with derivatives DS), and there holds

In this section, we are interested in a similar characterization of the contingent derivative for the multivalued map under the assumption (6.45). Moreover, we shall derive contingent derivative formulas for mappings of the (projection-) type In particular settings, this may be interpreted as a mapping of stationary solutions with associated multipliers A crucial motivation to study these questions comes from the sensitivity analysis of solutions to nonlinear programs. A.V. Fiacco and G.P. McCormick [Fia76, FM68] were the first to derive sufficient conditions for the validity of the formula (6.46) when are finite–dimensional spaces and is the (single–valued and differentiable) primal–dual critical point mapping of a perturbed equality/inequality-constrained program. However, it is well–known that even in this special case, the assumptions to guarantee formula (6.46) are very restrictive, involving LICQ and the technical assumption of strict complementarity. The situation is similar for the stationary solution mapping. So, for optimization problems or for smooth programs under weaker assumptions one may only expect the existence of either one or the other generalized (directional) derivative of critical point or stationary/optimal solution maps, including related formulas. For this reason, such generalized differentiability properties play some role in the literature, see the bibliographical note at the end of the section.
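For orientation, the smooth situation that the contingent derivative formulas generalize can be checked numerically. The Python sketch below rests on assumptions that are not taken from the text: it uses the classical implicit function formula in the scalar form DS(t) = -D_xF(x,t)^{-1} D_tF(x,t) for an equation F(x,t) = 0, which may differ in notation and in the role of the right-hand side from formula (6.46), and the test equation, the bisection solver and all names are hypothetical.

```python
def F(x, t):
    """Smooth test equation F(x, t) = x**3 + x - t (hypothetical example)."""
    return x ** 3 + x - t

def DxF(x, t):
    """Partial derivative of F with respect to x; it is always positive."""
    return 3.0 * x ** 2 + 1.0

def DtF(x, t):
    """Partial derivative of F with respect to the parameter t."""
    return -1.0

def S(t, tol=1e-14):
    """Unique real solution x of F(x, t) = 0, found by bisection."""
    lo, hi = -abs(t) - 1.0, abs(t) + 1.0  # F(lo, t) < 0 < F(hi, t)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid, t) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

t0 = 2.0
x0 = S(t0)                               # x0 = 1 solves x**3 + x = 2
formula = -DtF(x0, t0) / DxF(x0, t0)     # implicit function formula
h = 1e-5
fd = (S(t0 + h) - S(t0 - h)) / (2.0 * h)  # central difference quotient of S

print(x0, formula, fd)
```

Here F is strictly increasing in x, so S is single-valued and smooth, and the difference quotient of S agrees with the formula up to discretization error. The nonsmooth and multivalued cases treated in this section are exactly those where such a computation is no longer available.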


Our purpose is to point out which assumptions ensure (different levels of) generalizations of the relation (6.46) to contingent derivatives if is defined by an equation, and this under a smooth parameter–dependence (6.45). For several instances of (6.44) which are related to nonlinear programs or complementarity problems (for example, if F is a so–called generalized Kojima function, see [KK99a]), the contingent derivative of has an explicit representation – and this carries over to provided that a suitable extension of formula (6.46) is true.

6.6.1 Contingent Derivative of an Implicit (Multi-) Function

In this subsection, we are concerned with the contingent derivative of the solution set mapping of the equation Assume that (6.45) holds. By definition, the contingent derivative contains all limits of usual difference quotients:

i.e., consists of all limits that can be obtained for certain sequences and Note that here and in the following, functions will often be equipped with a subscript in order to distinguish them. In any case, writing we are saying that as As only the local behavior of near is of importance, we may identify with where is the neighborhood of appearing in assumption (6.45). Notice further, that one could similarly regard points in a fixed subset only. As it is standard in the smooth case, we consider, for near the function By the mean-value theorem, one obtains

where

can be estimated (uniformly for

by with Due to

as

one easily sees that

and


and, moreover,

Further, for

near

we have that

i.e.,

If, moreover, is locally Lipschitz near with rank K, we even know that the difference

satisfies for

near From now on, according to (6.47) remains the same function, only will be replaced by in the next section. Note that we consider the mapping because, due to Remark 1.1, the contingent derivative CS is known if and only if so is CF:

In the following we intend to exploit (6.48) and (6.50) under different topological assumptions concerning and while, in the chapter on parametric optimization, we will additionally apply algebraic properties of F, which are available when stationary points of optimization problems in come into play. Note that the (simple) inclusion (i) in the following theorem could be derived from [Lev96, Thm. 3.1]. To abbreviate, we denote the linear map by

Theorem 6.26 (C-derivative of the implicit function). Let (6.45) be satisfied. Then

(i) (inclusion)

(ii) (existence) Let If is l.s.c. at and is locally Lipschitz u.s.c. at then

(iii) (best case of a Lipschitzian inverse) Let If is l.s.c. at and is a locally Lipschitz function near then

Proof. (i) Let and By definition of there are some sequence and related such that

Setting and taking (6.50) into account, we obtain

After subtracting and division by now (6.49) ensures that

(ii) We have to show that For this is trivial, so let and We put and use the assumption that S is locally Lipschitz u.s.c. with rank L. Since is l.s.c. at there exist such that Applying (6.48) and (6.50), we observe for sufficiently small and that

and Thus the sequence remains bounded since So holds for every accumulation point of this sequence as vanishes. (iii) It remains to show that Let Then for a certain sequence and some related one has Due to the lower semicontinuity of

we find, for the same sequence

such that

points

as

Once more we apply (6.50) and obtain

Since S is a locally Lipschitz function near say with rank L, we conclude

This ensures, for the current sequence,

and


Note that indeed may be empty if any of the assumptions of (ii) does not hold. Consider, as a first example, with Then for and so Here is l.s.c. at (0,0), but the local Lipschitz upper semicontinuity fails. A second example gives a locally Lipschitz u.s.c. mapping S, but the l.s.c. of fails. Consider with and Obviously is empty for and is equal to {0} for Hence for However is empty for with but coincides with for hence S is locally Lipschitz u.s.c. at the origin.

Remark. From the proof of the preceding theorem, one observes that in (ii)

as well as in (iii) the l.s.c. assumption on may be weakened: Given some direction one has only to suppose that the multifunction is l.s.c. at In the latter case, we shall say that is l.s.c. at in direction q. In the following, we construct a problem for which this weaker l.s.c. assumption on is satisfied, and property (ii) holds but property (iii) fails, i.e.,

In this example, F is even a locally Lipschitzian function satisfying (6.45). Consider

in direction Then for and for hence is l.s.c. at 0 in direction (but not l.s.c. at 0). Moreover, and One easily verifies that is locally Lipschitz u.s.c. at 0. However, CS(0)(1, 1) is the convex hull of (1,0) and (0,1).

To show the equation in assertion (iii) of Theorem 6.26 but allowing that the related sets are empty, it suffices to know that S is pseudo–Lipschitz and that is locally Lipschitz. As a basic tool we use the implicit (multi–) function estimate of Theorem 4.9 in the case of a pseudo–Lipschitz inverse mapping. Note that with defines a nonlinear perturbation of the initial equation so the next theorem does not immediately follow from the pseudo–Lipschitz continuity of the solution set mapping

Theorem 6.27 (the case of pseudo-Lipschitz S). Let (6.45) be satisfied, and let be locally Lipschitz near If S is pseudo–Lipschitz near then If, moreover, then

Proof. By assumption, Moreover, we note that now the Lipschitz estimate (6.51) holds locally with some K. The inclusion (i) of Theorem 6.26 is true as shown above. Hence, we have only to verify that Let and So, for some sequence and related and the equations

are valid. By (6.50), we have Since is pseudo-Lipschitz (say, with constant L) at we may apply Theorem 4.9 which states the following result: Let be sufficiently small, and suppose that any locally Lipschitz functions and with values in fulfill the estimates

and where and denotes the smallest Lipschitz constant of on Then, if solves there exists a solution to such that Setting now and we observe that and the Lipschitz rank of becomes arbitrarily small for sufficiently small Thus, there is a solution to such that

This yields and So the current sequence fulfills and and the main assertion of the theorem is shown. For the second statement under it suffices now to verify that for all By assumption, S is pseudo–Lipschitz at with some rank L. Hence, given any sequence there are such that provided that is small enough. If then is evident, otherwise an accumulation point of the sequence exists and fulfills

6.6.2

Contingent Derivative of a General Stationary Point Map: Topological Assumptions

In contrast to the former settings, let F depend on a third variable and suppose that the spaces are finite–dimensional.

We are now interested in the projection X(p) of the solution set

onto the x-space, i.e. we put

In several settings, e.g., if is a primal-dual solution of the KKT system to a parametric optimization problem, then is said to be a stationary point for the parameter This explains the title of this subsection. In accordance with the previous subsection, we suppose

where again,

denotes a neighborhood of

To abbreviate, let

and

We want to characterize the contingent derivative where we only use topological properties which are typical for standard stationary point mappings. A crucial special realization – devoted to stationary solutions of nonlinear programs – will be regarded in the chapter on parametric optimization. By definition only, it holds

The former mapping S now reads

and is given by the solutions being assigned to

to

Then it holds for directions

Again, CS is known if so is CF. We summarize these facts by

and

Further, we may identify

with

of the previous subsection and set

in order to obtain (by the same arguments) the same estimates, in particular,

and In the following, we intend to obtain a statement saying that for some yields the existence of some such that

Having this statement, it is quite natural to write by using the set and, finally, to write it via (6.56) in terms of CF for the given F. The inclusion (6.60) means that the equation or, equivalently, has solutions satisfying (for some sequence

and related

So (6.58) ensures, with some as well as

and

Moreover, if then it is easy to see that the sequences (6.61) necessarily satisfy a l.s.c. condition (with respect to of Lipschitz-type, namely, for small where L is chosen such that So, to simplify the further presentation, we impose this l.s.c. assumption which is needed anyway if holds:

We will use this condition for points of the form Recall that and are in general supposed to have finite dimension.

Theorem 6.28 (C–derivatives of stationary points, general case). Let (6.54) and (6.63) be satisfied, let and Then one has:

(i) (inclusion) For each there is some such that

(ii) (existence) If X is l.s.c. at and S is locally Lipschitz u.s.c. at then

(iii) (Lipschitzian inverse) If X is l.s.c. at and S is a locally Lipschitz function near then for each q, one has both and with some

Proof. (i) Let and Then, according to (6.55), there are some sequence and functions and such that, by setting the points with some related fulfill By (6.63), the points can be taken in such a way that with some fixed L. Thus, since an accumulation point of exists, (6.64) follows by definition of

(ii) Now is l.s.c. at due to (6.63), and Theorem 6.26 (ii) tells us that contains some element So follows from (6.57).

(iii) Again by Theorem 6.26 (iii), we observe that holds, and (ii) shows Using (i), this is the assertion.

From the proof of the preceding theorem, one observes that in (ii) and (iii) the l.s.c. assumption on may be replaced by the l.s.c. of in direction q (see the remark following Theorem 6.26). Further, notice that Theorem 6.27 can be applied to the current maps and S if one is interested in obtaining the assertions under (iii) without supposing (6.63). In the particular setting of optimality conditions, this relates metric regularity and the local Lipschitz property of S; we shall discuss this in the chapter devoted to parametric programs.

On Assumption (6.63)

In the next theorem we shall substitute (6.63) by a transversality condition. To do this we first suppose local boundedness of as follows:

Then some accumulation point of the sequence exists, since is finite–dimensional by assumption, and – after selecting another subsequence if necessary – we may assume that

exists, too. We also consider the partial inverse mapping

and suppose that is upper Lipschitz at the origin in the following sense: For every bounded set there is a constant K > 0 such that

We note that for X being the stationary point mapping of a standard program, assumption (6.65) is satisfied if the Mangasarian–Fromovitz constraint qualification (MFCQ) holds at (for In this special case, it is obvious that condition (6.63) is stronger than MFCQ; in our general setting (6.53), (6.54), a transversality condition will be added to the boundedness assumption (6.65). Let and let be the contingent (or Bouligand) tangent cone of at i.e.,

Finally, let denote the (partial) contingent derivative of F with respect to at and let be its kernel.

Theorem 6.29 (transversality condition). Let F satisfy (6.54) and (6.65). In addition, let F be locally Lipschitz and be upper Lipschitz at the origin. Let be some accumulation point of y(p) where and Then holds for some subsequence and some L whenever the transversality condition

is satisfied.

Proof. Assume that (6.63) does not hold. Then, by (6.65), for given and some related subsequence considered under (6.66), it holds We show that the nontrivial vector from (6.66) belongs to With and we obtain that

where, since F is locally Lipschitz with some rank

one has

Further, in finite dimension and for locally Lipschitz F, the differences

satisfy

So we obtain from (6.69)

Taking (6.68) into account, division by and passing to the limit yield i.e.,

Hence there exists some sequence such that Since is upper Lipschitz, we derive from that there exist and a constant K such that both and Thus, it holds as well as So belongs to too.

If is a singleton then (6.67) holds trivially. For standard nonlinear programs, this means uniqueness of the Lagrange multiplier to and is often written in an algebraic manner as strict MFCQ at (for It is worth noting that the particular structure of F is not needed for showing (6.63) under this condition. Nevertheless, a complete description of CX as in Theorem 6.28 (iii),

needed hard suppositions due to the following facts. Having some pair such that holds, where we know that but the reverse conclusion is not true, in general. Indeed, the latter inclusion only says that certain sequences (as satisfy and

But it must be shown that (as in the proof of Theorem 6.27), the particular nonlinear equation has solutions close enough to This requires basically implicit function arguments. To avoid them, often the extended stationary point map defined as

has been investigated. We will follow this line in our application to nonlinear programs, see Chapter 8.1. The contingent derivatives of and X are (using the definitions only) related to each other by

where the subscript of 0 refers to the corresponding space.

Bibliographical Note. The approach presented in Section 6.6 is based on the authors’ paper [KK01]. In the context of inclusions, contingent derivatives of general implicit multifunctions were studied, e.g., in [AF90, LR94, Lev96, RW98]. In the context of nonlinear programming, generalized differentiability properties have been investigated in the literature quite often. By using the theory of second–order optimality and stability conditions, the existence and representation of standard directional derivatives of stationary or optimal solutions to in were studied, for example, in the papers [GD82, GJ88, Sha88b, RD95]. For a recent survey of this approach, we refer to [BS98] and [BS00, Chapter 3]. In [LR95] the existence of the proto–derivative (and, hence, of the contingent derivative) of the stationary solution map of a parametric optimization problem in was shown to hold under the Mangasarian–Fromovitz constraint qualification, and a derivative formula was given. The approach in [LR95] was based on the study of proto-derivatives in the context of subgradient mappings, see [PR94, LR96, RW98] and Section 9.3. More results on the existence and representation of proto–derivatives or B–derivatives of the stationary solution mapping (or of the solution sets of parametric nonsmooth equations), can be found, e.g., in [Rob91, Pan93, Lev96, RW98]. For nonlinear programs with data, an extension of formula (6.46) to Thibault derivatives of the critical point map was given in [Kum91a] under the assumption of strong regularity.

Chapter 7

Critical Points and Generalized Kojima–Functions

In order to study critical points of optimization problems as well as solutions of generalized equations and complementarity problems in a unified way, we prefer to use a direct, analytical approach for characterizing such points: namely as zeros of some nonsmooth function F sending into itself. Various functions are suitable for this purpose, and later we will indeed deal with several of them. In the present chapter we consider systems of equations which are defined by locally Lipschitz functions of a special structure and are adapted from Kojima’s [Koj80] form of the KKT conditions for problems. This leads to the notion of a (generalized) Kojima–function. We shall investigate different regularity concepts for such systems, or, equivalently, different Lipschitz properties of critical point and stationary solution mappings.

7.1

Motivation and Definition

As a starting point let us consider the nonlinear optimization problem

where

and the functions are (at least) continuously differentiable near some point of interest. In the following, however, we will mainly deal with the case that the functions and belong to the class The latter hypothesis opens the use of several technical tools and allows us to weaken the standard smoothness assumption of critical point theory in nonlinear programming. The weaker supposition makes sense, for example, in two–level optimization, decomposition approaches and semi–infinite optimization, where data appear in a natural way. Note that even for problems without constraints, the gap between and is very large. This will become clear in several situations below.

KKT Points and Critical Points in Kojima’s Sense

Using the standard Lagrange function the classical necessary optimality conditions to problem (7.1) in the sense of Karush, Kuhn and Tucker have the form

a solution of this system is called a Karush–Kuhn–Tucker point (KKT point) of (7.1). If exists such that satisfies (7.2), then is said to be a stationary solution to (7.1). It is well–known that a local minimizer of (7.1) is necessarily a stationary solution, provided that some constraint qualification holds. Following Kojima [Koj80], one may assign to (7.1) the function with the components

where and and the vectors and are defined by the components and respectively. Evidently, and Notice that and are the Euclidean projections of to the nonnegative and non-positive orthant of respectively. As projections, they are connected with the related normals in an evident way, in particular, The function F is called the (usual) Kojima–function of the program (7.1). If is a zero of F, then we say that is a critical point and is a stationary solution of the system (or simply of F). Moreover, if for some stationary solution of F, then we say that is a critical value of (7.1). Since, in general, may have negative components, the Lagrangian L is now

and We do not need a new symbol because was nonnegative in the former settings. One immediately sees that

Note that both transformations are locally Lipschitz; this is important in view of our regularity notions. One may even directly identify critical points and KKT points: By continuity, inactive constraints – i.e., those with – remain inactive for near So these constraints do not play any role for the local analysis of KKT points near and we may delete in such a context. Hence, one might suppose that at the given stationary solution of interest (but not at near ), whereupon is a KKT point iff is critical. Nevertheless, we will not always make use of this fact, in order to show also certain symmetries concerning active and non–active constraints in several statements.

Generalized Kojima–Functions – Definition
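Before the generalization is introduced, the usual Kojima–function and the two transformations between critical points and KKT points can be illustrated numerically. The following is only a minimal sketch under stated assumptions: problem (7.1) is taken in the standard form min f(x) subject to g(x) ≤ 0, h(x) = 0; the components of (7.3) are taken as F1 = Df(x) + Σ y_i^+ Dg_i(x) + Σ z_j Dh_j(x), F2 = g(x) − y^−, F3 = h(x), with y^+ = max(y, 0) and y^− = min(y, 0) taken componentwise; and the transformations are taken as y ↦ u = y^+ and u ↦ y = u + g(x). The toy problem and all function names are hypothetical and not taken from the text.

```python
import numpy as np

# Hypothetical toy problem (an assumption for illustration only):
#   min x1^2 + x2^2   s.t.   g1 = 1 - x1 <= 0,  g2 = x1 - 5 <= 0,  h = x2 - 1 = 0
def Df(x):
    return 2.0 * x

def g(x):
    return np.array([1.0 - x[0], x[0] - 5.0])

def Dg(x):
    return np.array([[-1.0, 0.0], [1.0, 0.0]])

def h(x):
    return np.array([x[1] - 1.0])

def Dh(x):
    return np.array([[0.0, 1.0]])

def kojima(x, y, z):
    """Assumed form of the usual Kojima function: (F1, F2, F3) stacked."""
    y_plus = np.maximum(y, 0.0)    # projection onto the nonnegative orthant
    y_minus = np.minimum(y, 0.0)   # projection onto the non-positive orthant
    F1 = Df(x) + Dg(x).T @ y_plus + Dh(x).T @ z
    F2 = g(x) - y_minus
    F3 = h(x)
    return np.concatenate([F1, F2, F3])

def kkt_to_critical(x, u, z):
    """KKT point (x, u, z) -> critical point (x, u + g(x), z)."""
    return x, u + g(x), z

def critical_to_kkt(x, y, z):
    """Critical point (x, y, z) -> KKT point (x, y^+, z)."""
    return x, np.maximum(y, 0.0), z

# KKT point of the toy problem: x = (1, 1), multipliers u = (2, 0), z = -2.
x_kkt = np.array([1.0, 1.0])
u_kkt = np.array([2.0, 0.0])
z_kkt = np.array([-2.0])

x_c, y_c, z_c = kkt_to_critical(x_kkt, u_kkt, z_kkt)
residual = kojima(x_c, y_c, z_c)
_, u_back, _ = critical_to_kkt(x_c, y_c, z_c)
```

In this sketch the inactive constraint g2 contributes a negative component of y (its slack), while the active constraint g1 contributes its multiplier; mapping back through y^+ recovers the multiplier vector.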

It was first observed in [Kum98] that several regularity results with respect to the Kojima–function F defined in (7.3) do not require the concrete form of the first component of F, but only the affine–linear structure of with respect to i.e., in (7.3), and can be replaced by arbitrary (continuous) functions and This extension maintains the nice separable form of F, namely, can be written as the product of the row vector

and a specially structured matrix This is our key observation and suggests the following definition. In the next section, we shall see that this product describes besides Kojima’s function (7.3) several further objects of interest in the context of optimization. Let again

Definition. A function is said to be a generalized Kojima–function if it has the representation

where

is given by (7.5),

is defined by

and E is the matrix, is a row vector, and are matrices formed by the row vectors in and are regarded as column vectors of length and respectively.

Convention. To avoid the frequent use of the transposition symbol, we often will omit it if the context is clear. In particular, we sometimes write to denote a column vector composed by the column vectors and and we identify

and for the product of a vector and a matrix A. Further, we will agree in several applications of generalized Kojima functions to

and to omit the transposition symbol in

and so on.
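The product representation F = NM from the definition can be checked numerically against a componentwise evaluation. The sketch below rests on assumptions that are not recoverable from the text as printed here: N is taken as the row vector (1, y^+, y^−, z), and M(x) as the block matrix with rows (Φ(x), g(x), h(x)), (Ψ(x), 0, 0), (0, −E, 0) and (Ξ(x), 0, 0), where the names Phi, Psi, Xi stand for the arbitrary continuous functions replacing Df, Dg, Dh. The concrete data are hypothetical.

```python
import numpy as np

n, m, k = 2, 2, 1  # dimensions of x, y, z in this hypothetical example

# Hypothetical input functions; choosing Phi = Df, Psi = Dg, Xi = Dh
# would give the usual Kojima function under the assumed layout.
def Phi(x):
    return np.array([2.0 * x[0] + x[1], np.sin(x[1])])

def Psi(x):
    return np.array([[-1.0, 0.0], [x[1], 1.0]])

def Xi(x):
    return np.array([[0.0, 1.0]])

def g(x):
    return np.array([1.0 - x[0], x[0] * x[1] - 5.0])

def h(x):
    return np.array([x[1] - 1.0])

def N(y, z):
    """Assumed row vector N(y, z) = (1, y^+, y^-, z)."""
    return np.concatenate([[1.0], np.maximum(y, 0.0), np.minimum(y, 0.0), z])

def M(x):
    """Assumed block matrix M(x) of shape (1 + 2m + k, n + m + k)."""
    top = np.concatenate([Phi(x), g(x), h(x)])[None, :]
    rows_plus = np.hstack([Psi(x), np.zeros((m, m)), np.zeros((m, k))])
    rows_minus = np.hstack([np.zeros((m, n)), -np.eye(m), np.zeros((m, k))])
    rows_z = np.hstack([Xi(x), np.zeros((k, m)), np.zeros((k, k))])
    return np.vstack([top, rows_plus, rows_minus, rows_z])

def F_product(x, y, z):
    return N(y, z) @ M(x)

def F_components(x, y, z):
    y_plus, y_minus = np.maximum(y, 0.0), np.minimum(y, 0.0)
    F1 = Phi(x) + y_plus @ Psi(x) + z @ Xi(x)
    F2 = g(x) - y_minus
    F3 = h(x)
    return np.concatenate([F1, F2, F3])

x0 = np.array([0.3, -2.0])
y0 = np.array([1.5, -0.7])
z0 = np.array([0.4])
lhs = F_product(x0, y0, z0)
rhs = F_components(x0, y0, z0)
```

The point of the separable form is visible in the code: all nonsmoothness sits in `N`, which depends only on (y, z), while `M` depends only on x.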

Obviously, F is the usual Kojima–function if and In view of a ”second–order” analysis, we shall often suppose in (7.6) that and but formally this is not required in the above definition. The notions critical point and stationary solution are used similarly to those for the usual Kojima–function. A generalized Kojima–function F with components and has the explicit form

Of course, the components and are still the same as in (7.3). The equation is sometimes called a (generalized) Kojima system. The separable structure of F = NM and the special form of N are very helpful for computing generalized derivatives. The function N is locally Lipschitz and has, in addition, a special simple structure (as a projection onto a polyhedron). We will see that N is simple in the sense of Section 6.4. Exploiting this fact and supposing that M is locally Lipschitz, one can determine the Thibault derivative TF and the contingent derivative CF (recall that they are crucial for strong, upper and pseudo-regularity of F, cf. Theorem 5.1 and Theorem 5.14) by the product rule of differentiation, cf. the properties (6.5) ... (6.8) of Section 6.4. We shall see below that these derivatives can be (more or less explicitly) described and interpreted in terms of the original functions involved. This is the hardest and most interesting task for applying abstract stability statements formulated in any equivalent setting. Clearly, if one of the input functions in M is a complicated Lipschitz function, the remaining problem may be serious. On the other hand, if M is even continuously differentiable (in particular, if M corresponds to the classical Kojima–function (7.3) with then only the function is nonsmooth, and F becomes piecewise smooth. If, in addition, at some zero the strict complementarity condition holds true, also is smooth near and the usual implicit function theorem is applicable in order to study regularity of F: In fact, then all our regularities (strong, pseudo, upper) coincide and are satisfied if and only if is nonsingular. However, in the following we are just interested in the opposite (non–standard) case. Having (pseudo, strong) regularity of F at we may modify all the functions in (7.8) as long as the variations remain sufficiently small (see the Theorems 4.1 and 4.3 concerning existence and behavior of related zeros). If we vary only M, but not the fixed elements 0 and 1 in (7.6), then we are still in the classical framework of parametric optimization. But what happens after changing the functions and in or the other elements of M? The result and interpretations of such perturbations will be studied in Chapter 11 for usual critical points.

For many stability statements, the assumed constraint qualifications are of crucial importance. In the regularity context, they clarify the local behavior of the feasible points under right–hand side perturbations or, from the dual point of view, the behavior of the related normal cone map. Mostly, one supposes MFCQ or the more restrictive Linear Independence constraint qualification (LICQ) which says that the gradients of active constraints are linearly independent. Therefore, before we consider particular cases and apply rules for differentiating F, let us state that LICQ is a necessary consequence of pseudo-regularity for generalized Kojima functions under quite weak assumptions. Our proof directly indicates which perturbations disturb the pseudo–Lipschitz property, provided LICQ is violated.

Lemma 7.1 (necessity of LICQ for pseudo-regularity). Let F = NM be a generalized Kojima function, where and are continuous and Then, F is pseudo-regular at some zero only if the gradients of active constraints are linearly independent.

Proof. Without loss of generality we may assume put and for summation index For set Then, the point Next let any

For simplicity, and omit the and we

solves satisfy

We will show that

To this end, with

consider solutions

For small fixed and for tending to zero, now there are such solutions which satisfy the pseudo-Lipschitz inequality

The first for

components of and small

are not smaller than This implies

Hence, it holds

to

For equality constraints is trivially true. By the mean-value theorem and due to we derive the identities where

Therefore,

Since

is continuous and

there holds

Recalling (7.9) one thus observes that

and hence

for arbitrarily small

But this inequality yields

7.2

Examples and Canonical Parametrizations

In this section, we discuss relations to other settings of critical point systems. In particular, we give typical examples of generalized Kojima–functions and show which kinds of parametrization appear if the system is perturbed in the right–hand side.

The Subdifferential of a Convex Maximum Function

The two basic notions of convex analysis, the conjugate and the subdifferential of a given convex functional on just describe key quantities of some parametric optimization problem. So yields the infimum value, and the minimizers of the perturbed function Strong regularity of at means Lipschitz continuity and uniqueness of near Further, we know that strong and pseudo-regularity here coincide, cf. Theorem 5.4. The inclusion cannot be modeled by Kojima functions, in general. But for solving it (or for any analysis of related solutions), one needs information concerning At this stage, Kojima functions come into play. To see what happens we regard one of the simplest cases. Let be a maximum function on i.e., where all are convex functions belonging to the class Then

and

Setting

and now we have

and our conditions attain the form

i.e.,

These equations form the Kojima system of the problem

this is a program with convex and smooth data.
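A small numerical check may make the link between the maximum function and its Kojima system concrete. The sketch assumes that problem (7.11) is the usual epigraph reformulation, min t subject to f_i(x) − t ≤ 0 for all i, and that the Kojima system is built from it as F1 = (Σ y_i^+ Df_i(x), 1 − Σ y_i^+), F2_i = f_i(x) − t − y_i^−. The two quadratic functions and all names are hypothetical.

```python
import numpy as np

# Hypothetical data: f(x) = max{ (x - 1)^2, (x + 1)^2 } on the real line.
a = np.array([1.0, -1.0])

def f_all(x):
    """Values f_1(x), f_2(x) of the convex pieces."""
    return (x - a) ** 2

def df_all(x):
    """Derivatives f_1'(x), f_2'(x)."""
    return 2.0 * (x - a)

def kojima_max(x, t, y):
    """Assumed Kojima system of: min t  s.t.  f_i(x) - t <= 0."""
    y_plus = np.maximum(y, 0.0)
    y_minus = np.minimum(y, 0.0)
    F1 = np.array([y_plus @ df_all(x),      # derivative of the Lagrangian in x
                   1.0 - y_plus.sum()])     # derivative of the Lagrangian in t
    F2 = f_all(x) - t - y_minus             # one row per constraint
    return np.concatenate([F1, F2])

# The maximum is minimized at x = 0 with value t = 1; both pieces are active
# and the weights (1/2, 1/2) combine the gradients -2 and 2 to zero.
x_bar, t_bar = 0.0, 1.0
y_bar = np.array([0.5, 0.5])
residual = kojima_max(x_bar, t_bar, y_bar)
```

Here the positive parts of y play the role of the convex weights in the subdifferential of the maximum function: they sum to one and annihilate the combination of active gradients.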

Strong regularity of the system (7.10) means local Lipschitz continuity of the unique solutions to

for the parameter having small norm. The latter is the Lipschitz property of unique primal–dual solutions to the parametric problem

According to Lemma 7.1, strong regularity of system (7.10) at requires LICQ. In the current case for and LICQ means: The vectors have to be affinely independent, where

It is worth noting that the weaker MFCQ requires the existence of such that and is always satisfied. In contrast to the initial situation when studying the subdifferential mapping in system (7.12) additional parameters appear, and we are speaking about dual solutions, too. By variation of we may modify the functions separately such that This was an impossible variation as long as we considered without any inner structure. Keeping fixed we just return to our initial question of strong regularity of The question arises whether the two forms of strong regularity are indeed different, or not. The answer is yes.
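The affine independence condition mentioned above can be tested with a rank computation. The sketch assumes the epigraph form min t subject to f_i(x) − t ≤ 0, so that the gradient of the i-th constraint with respect to (x, t) is (Df_i(x), −1); linear independence of these augmented vectors is then the same as affine independence of the vectors Df_i(x). The helper name `licq_epigraph` and the sample gradients are hypothetical.

```python
import numpy as np

def licq_epigraph(active_grads):
    """Return True if the rows (Df_i(x), -1) are linearly independent,
    i.e. if the active gradients Df_i(x) are affinely independent."""
    G = np.atleast_2d(np.asarray(active_grads, dtype=float))
    augmented = np.hstack([G, -np.ones((G.shape[0], 1))])
    return bool(np.linalg.matrix_rank(augmented) == G.shape[0])

# Three active gradients in the plane forming a proper triangle.
independent = licq_epigraph([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Three collinear gradients: affinely dependent, LICQ fails.
collinear = licq_epigraph([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])

# Two coinciding gradients: also affinely dependent.
coinciding = licq_epigraph([[1.0, 2.0], [1.0, 2.0]])
```

The rank test uses the default tolerance of `numpy.linalg.matrix_rank`, so nearly dependent gradients may need a problem-specific tolerance.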

Remark 7.2 (strong regularity of 1. Clearly, if the Kojima function in (7.10) is strongly regular then so is 2. On the other hand, may be strongly regular while the Kojima system (7.10) is not. This case happens if and only if is strongly regular at and are affinely dependent.

To see [2.], we first recall that under strong regularity of (7.10), LICQ holds because of Lemma 7.1. Conversely, let be strongly regular at and, in addition, let LICQ be true. Then is isolated in and the (uniform) growth condition holds, see Lemma 3.1 and Theorem 4.8. So (7.13) is still solvable with solutions near for small parameters (because is l.s.c. at 0). By LICQ, the duals in (7.12) uniquely exist and are Lipschitz in the indicated variables. This is just strong regularity of (7.10).

Summary. Strong regularity of means only uniform growth of (and can evidently hold even if all coincide). Strong regularity of system (7.10) means just both, uniform growth and LICQ at the solution to problem (7.11).

Complementarity Problems

Given locally Lipschitz functions

find

such that

With we rewrite the conditions as which yields the equation

where In fact, F is a generalized Kojima-function on

where E is again the identity matrix. Clearly, and do not appear. The perturbed system means

and describes a natural parametrization of the original problem with parameters and Find such that

In the standard case we obtain a generalized Kojima–function where i.e., in comparison with the usual Kojima–function (7.3) only is replaced by Explicitly,

The special relation becomes interesting, e.g., in Lemma 7.18.
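The passage from the complementarity conditions to an equation can be illustrated as follows. This is a sketch under assumptions only: the problem is taken as finding x with u(x) ≥ 0, v(x) ≥ 0 and ⟨u(x), v(x)⟩ = 0; the auxiliary variable is taken as y = u(x) − v(x), so that the conditions read u(x) = y^+ and v(x) = −y^−; and the resulting equation is written as F(x, y) = (u(x) − y^+, v(x) + y^−) = 0. The sign convention, the linear data and all names are hypothetical and may differ from the system (7.15) of the text.

```python
import numpy as np

# Hypothetical standard case u(x) = x with an affine v(x) = A x + q.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])

def u(x):
    return x

def v(x):
    return A @ x + q

def F(x, y):
    """Assumed equation form of the complementarity problem."""
    y_plus = np.maximum(y, 0.0)
    y_minus = np.minimum(y, 0.0)
    return np.concatenate([u(x) - y_plus, v(x) + y_minus])

def is_complementary(x, tol=1e-12):
    """Direct check of u >= 0, v >= 0, <u, v> = 0."""
    ux, vx = u(x), v(x)
    return bool(np.all(ux >= -tol) and np.all(vx >= -tol) and abs(ux @ vx) <= tol)

# Solution of the toy problem: x = (0.5, 0), v(x) = (0, 1.5).
x_sol = np.array([0.5, 0.0])
y_sol = u(x_sol) - v(x_sol)
residual = F(x_sol, y_sol)

# A point violating complementarity gives a nonzero residual for y = u - v.
x_bad = np.array([1.0, 1.0])
residual_bad = F(x_bad, u(x_bad) - v(x_bad))
```

Because each component of y carries either the value of u (if positive) or the negative value of v (if negative), nonnegativity and complementarity are encoded without any inequality.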

Several other descriptions of the complementarity problem are possible, for example, via an optimization problem. Instead of (7.16), one may investigate the optimization problem

provided the original problem (7.14) is solvable and The right–hand side perturbations of the related Kojima-system lead us to

The system describes the critical points of the perturbed problem

with parameters Compared with (7.16), an interpretation in the context of complementarity is now less obvious and the analytical form of the Kojimasystem is more complicated. Concerning approaches via so-called NCP functions we refer to Chapter 9. Generalized Equations

Given any closed, convex set and any continuous function a generalized equation (written in traditional form as variational inequality) asks for some such that

After introducing the (contingent) normal cone

this means The introduction of generalized equations for the unified study of KKT systems, complementarity problems and equilibrium problems is due to S.M. Robinson (see, e.g., [Rob80, Rob82]). The general equation (7.18) becomes an equation by writing in terms of the Euclidean projection onto C. But more interesting, let us suppose that some analytic description of C is given, say C is polyhedral,

with some suitable matrix A and vector Taking the particular form of the normal cone into account, (7.18) is equivalent to

for some

With

this is a generalized Kojima system with and does not appear. The related parametric equation now becomes

This system characterizes, by putting the solutions of the parametric generalized equation

with parameters Here, the feasible set C is no longer constant. When studying (7.21) directly in a general framework (e.g. by the tools of Chapter 2), new difficulties will appear because depends on However, knowing that (7.21) is only a perturbed generalized Kojima-system in the sense that satisfies (7.21) satisfies (7.20) with some keeps things simpler since the consideration of multifunctions can be avoided altogether. On the other hand, the traditional form of parametrizing (7.18) according to Robinson’s work [Rob80, Rob82] leads to a problem where C remains fixed:

So the parametrizations (7.21) and (7.22) are dealing with different subjects, both closely related to the original problem. A similar situation was discussed above concerning subdifferentials. Again, with the same arguments as for the subdifferential, linear independence of all active gradients (i.e., of all satisfying for the solution to (7.19) under consideration) is the crucial condition which ensures that strong regularity in Robinson’s sense (i.e., the existence of locally unique and Lipschitz solutions to (7.22)) implies strong regularity with respect to the parametrization (7.21). In particular, if the generalized equation (7.18) describes KKT points of the nonlinear program (7.1), i.e. if (7.18) has the particular form

then the parameterization (7.22) with coincides with the right–hand side perturbation of the Kojima function (7.3) due to the special kind of H and C. Both parameterizations now describe exactly the KKT points of

which follows easily from the particular form of C. Besides, strong regularity with respect to (7.22) implies here strong regularity with respect to (7.21) because the linear independence condition concerning A holds true; note that

Finally, if C is not polyhedral (and/or non–convex), but say C is described by with the situation is similar. The interesting set in (7.18) is now the normal cone of the contingent cone to C at Under a constraint qualification like MFCQ, there holds such that So, setting

equation (7.20) passes into

Clearly, F is again of generalized Kojima-type, where and i.e., in comparison with the usual Kojima function only is replaced by –H.

Nash Equilibria

Consider the following problem of a (Nash) equilibrium of the non-antagonistic 2-person game (u,v, X, Y): Given continuously differentiable functions and compact convex sets X, Y in and respectively, find such that

Writing down the first order necessary optimality conditions for both optimization problems, we get the generalized equation

where are again the usual normal cone maps. For the sake of simplicity, let X be the unit simplex of i.e., (where and let Y be the unit simplex of Further, let be the dual vectors associated with and respectively, and let be the dual variables with respect to the equations. Then the critical point system takes the form

This system can be rewritten as F = 0 by means of a generalized Kojima-function F with

In (7.6), the matrix M depends now on the strategies and and the vector N depends on the dual vector The related system with right-hand side perturbations leads us to linear perturbations of the utilities in the players’ variables as

and requires perturbations of the simplexes X and Y as well: Similarly, one can handle games of more players.

Piecewise Affine Bijections

If F is the usual Kojima–function of a quadratic optimization problem, say

then and are constant and are linear. The investigation of then leads us to piecewise linear systems studied by Kuhn and Löwen [KL87]. Basic extensions of their results to the case of can be found in [JP88, Sch94, PR96, RS97].

7.3

Derivatives and Regularity of Generalized Kojima–Functions

The possibilities for computing relevant derivatives are important for any analysis of nonsmooth functions, in particular also for generalized Kojima–functions. Because of the crucial role which the Thibault derivative TF and the contingent derivative CF are playing for strong, pseudo– and upper regularity, it is desirable to have an explicit and intrinsic description of these mappings in terms of the original functions. We shall derive such descriptions and utilize them both for characterizing regularity and solving Kojima systems.

Properties of N

Recall that

and

where

is the row vector

is defined according to (7.6), i.e.,

Below we shall show that N is simple, hence, we can apply Theorem 6.8 and the Corollaries 6.9 and 6.10 in order to represent TF and CF. We will see that and that, even if M is arbitrarily smooth, we will get in general. However, the only part interesting in this context is the piecewise linear function Its derivatives and can be written down by considering the components separately.

Obviously, by defining

the set

consists of all vectors satisfying

with some

The reader easily sees that and hence coincides with the multivalued directional derivative in terms of Clarke’s generalized Jacobian. Setting similarly,

we obtain a singleton and where

and

For now is simply the directional derivative of at 0 in direction sign Hence, CN is the directional derivative of N. Trivially,

Some Transformation

For several reasons, it makes sense to rewrite the representations (7.28) and (7.29) by using the transformation

with

(I) if

(II) if where

or

The reader easily sees that then then

and

Conversely, if by

and

satisfy the conditions under (I) or (II), we define

and

Then, one concludes in an elementary way that and respectively. Moreover, in both cases, the one-to-one correspondences satisfy The latter will be important for studying the injectivity of the derivatives TF and CF of the generalized Kojima–function.

Derivatives of N

In the following lemma, we summarize representations of TN and CN which immediately follow from the discussion above.

Lemma 7.3 (TN, CN). Let and Then, has the following representations:

Moreover,

has the following representations:
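The difference between the two derivatives can be seen numerically on a single component. The following Python sketch assumes the usual conventions y_plus = max(y, 0) and y_minus = min(y, 0) (an assumption, since the defining formulas are not reproduced here); all function names are illustrative. Difference quotients with a fixed base point approximate the contingent (here: directional) derivative, while quotients with a moving base point near y = 0 sweep out the whole set of values r·v with r in [0, 1], as expected for the Thibault derivative.

```python
# Sketch only: one real component y, direction v.
# Assumed conventions (not taken from the text): plus(y) = max(y, 0), minus(y) = min(y, 0).

def plus(y):
    return max(y, 0.0)

def minus(y):
    return min(y, 0.0)

def directional_derivative_plus(y, v):
    """One-sided directional derivative of plus at y in direction v."""
    if y > 0:
        return v
    if y < 0:
        return 0.0
    return max(v, 0.0)  # kink at y = 0

def contingent_quotient(y, v, t):
    """Difference quotient with the base point fixed at y."""
    return (plus(y + t * v) - plus(y)) / t

def thibault_quotient(base, v, t):
    """Difference quotient with a base point that may move towards y."""
    return (plus(base + t * v) - plus(base)) / t

def sampled_thibault_slopes(v, t=1e-3, steps=10):
    """Quotients at base points -s*t*v for s = 0, 1/steps, ..., 1 (all close to y = 0)."""
    return [thibault_quotient(-(k / steps) * t * v, v, t) for k in range(steps + 1)]

slopes_up = sampled_thibault_slopes(1.0)     # values between 0 and 1
slopes_down = sampled_thibault_slopes(-1.0)  # values between -1 and 0
```

With the base point fixed at 0, only the slopes 0 or v appear, which matches the statement that CN is the directional derivative of N; the moving base points produce the intermediate values as well.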

Basic Lemma on N

Now we make sure that (the locally Lipschitz functions) and N are simple in the sense of Section 6.4 and have further useful properties which are based on their special structure. For real the functions and are monotone and satisfy So the related derivatives of are given componentwise by the "derivatives" of the absolute value function Let denote the components of Since is defined componentwise via independent variables, all the following statements have to be shown only for each component. In addition, they are evident for (where is locally linear), and they also hold for N, which differs from only by additional linear and independent components.

Lemma 7.4 (N simple, and further properties). 1. The functions and N are simple everywhere.


2. Given that

and and

with

one has for all imply the inclusions


and small

(a) (b)

Proof. 1. Obviously, N is simple if and only if so is Given arbitrary and let belong to and let be any given sequence. To show that is simple, we have to construct elements in such a way that

with the chosen sequence of write with

(or with some infinite subsequence). To do this, and define where

Now one easily determines in case of

(i) (ii) (iii) Therefore, even holds for the constructed sequence of

2. The proof of (a) and (b) is ensured by piecewise linearity and is left to the reader.

By construction, the points defined under (i), (ii), (iii) in the previous proof belong to a line given by and

Conventions

Throughout the rest of this section, we suppose that

where

Further, we shall use the abbreviations


and we write

and Moreover, we put for

where

is the first component of F, i.e., while denotes the partial Thibault derivative of at with respect to in direction Analogously, denotes the corresponding partial contingent derivatives. Recall again that for the optimization model (7.1), corresponds to a primal–dual vector, while and

Formulas for Generalized Derivatives

Product Rules

From Lemma 7.4 and Lemma 7.3 we know that N is simple and directionally differentiable. Hence, if then results of §6.4.1 immediately imply the partial differentiation and product rules for TF and CF. For completeness, we repeat here the product rules.

Theorem 7.5 (TF, CF; product rules). Let F = NM be a generalized Kojima–function according to (7.8), and Suppose Then the Thibault derivative TF of F has the representation

Moreover, given and the three conditions

can be satisfied with the same sequences and where all are located on a line and Finally, for the contingent derivative CF, the same statements are true; one only has to replace "T" with "C" and with

Proof. Since N is a simple function according to Lemma 7.4, the statements concerning TF follow from Theorem 6.8 and Corollary 6.10. The contingent derivative can be determined similarly; one has to apply Corollary 6.12 instead.


Explicit Formulas

Theorem 7.6 (TF, CF; explicit formulas). Assume the hypotheses of Theorem 7.5, and suppose in addition that are Then,

consists exactly of all vectors

such that

holds with some Equivalently,

consists exactly of all vectors

holds with some Further,

such that

satisfying consists exactly of all vectors

such that (7.43)

or (7.44) hold after replacing with as well as and with and respectively. In this case, the elements and

with

are unique.

Note. The vectors satisfying (7.44) with describe exactly the set gph where is defined by for The analogous result for gph holds after replacing with and with

Proof of Theorem 7.6. To show (7.43), we still have to determine the terms of the product rule (7.41) by using that Recall that M has the structure (7.6). Since the derivative yields (independently of concrete choices of and with respect to the columns 2 and 3 of M the submatrix

The related limits of column 1 are just given by all elements

So, the vectors of the product form

in (7.41), have exactly the


where the three blocks are assigned to the components the chain rule (6.19), the elements

are also forming the set

and

of F. By

since

By the structure of TN according to Lemma 7.3, the second term

assigns to

and

where

all triples

Thus, the explicit formula (7.43) follows from the chain rules.

The equivalent formula (7.44) follows from (7.43) and Lemma 7.3 based on the transformations (7.30). The explicit formulas for CF can be shown similarly, according to Theorem 7.5.

The main point of the above proof was the product rule TF = T(NM) = N (TM) + (TN) M. Since our way via the rather general Theorem 6.8 and its corollaries is quite long, we add a direct proof in the appendix as Lemma A.6, which is valid for the actual product rule only. Let us emphasize that in order to apply the explicit formulas, we supposed (at least) that and The foregoing theorem will allow us to show that TF and CF are related in a very simple way, provided that is even a mapping. This will be done in §7.4.1. Further notice that in the context of coincides with and reduces to a singleton, namely to where H is the Hessian

Subspace Property

The following lemma will be important for deriving marginal value formulas. Suppose and Further, let us agree that and are the maps which assign to all satisfying the explicit formula (7.43) concerning TF and its modification concerning CF, respectively. Then, If F is near and strongly regular, the map plays the role of


Generally, the component which characterizes the movement of the stationary solutions under variations of F is not uniquely determined by However, in the essential case of and the inner product is fixed at critical points.

Lemma 7.7 (subspace property of Further, let and

holds for all Proof. Since

Let Then,

and

in we obtain from the explicit formula that

is true. Evidently, the lemma similarly also holds for

Regularity Characterizations by Stability Systems

Using the explicit representations of TF and CF derived in the previous subsection, we shall characterize strong regularity of a generalized Kojima–function F = NM and local upper Lipschitz continuity of its critical point mapping. From Theorem 5.14 we know that the generalized Kojima–function F is strongly regular at a zero of F if and only if is injective, i.e., if there is no nontrivial direction such that holds. Further, the specialization of Lemma 3.2 to F says that the critical point mapping is locally upper Lipschitz at if and only if is injective, i.e., if implies Hence, these injectivity properties can be verified by using the explicit formulas derived in Theorem 7.6: Given we have to look for solutions of the T–stability system

or of the C–stability system


In the analysis of strong regularity, we shall also consider the problem of finding solutions of the system related to (7.43),

which is equivalent to the T–stability system. Now, the next theorem immediately follows.

Theorem 7.8 Let F = NM be a generalized Kojima–function according to Definition (7.8), and let be a zero of F. Suppose that and Then the following properties are equivalent:

(i) F is strongly regular at
(ii) The T–stability system has only the trivial solution (i.e., is injective).
(iii) If solves (7.47), then

Moreover, is locally upper Lipschitz at if and only if the C–stability system has only the trivial solution (i.e., is injective).

Geometrical Interpretation

Under the assumptions of Theorem 7.8, the T-stability system permits a geometrical interpretation of strong regularity, first given in [Kum91b, §5] for usual Kojima functions. Set

and let be the subspace of generated by the vectors where we put if no equations are required. Define the (large) tangent space and introduce the polyhedral cone

Remark 7.9 (nontrivial solution of the T-stability system). The T-stability system has a nontrivial solution

(i) with

(ii) with Indeed, using the condition

some pair

satisfies and

the assertion of the foregoing remark follows from in (7.45).


If and hold in Remark 7.9(i), then (i) means precisely that LICQ is violated. Therefore, in the general case, we say that the generalized LICQ holds (at if such a pair does not exist. Notice, however, that the generalized LICQ does not coincide with the linear independence of related vectors and (at considered in Lemma 7.1. So one obtains, for Remark 7.10 (TF injective). (i) the generalized LICQ holds (at (ii)

is injective if and only if and

For the special case of

(i) and (ii) take the form

LICQ and Similarly, the C-stability system permits a geometrical interpretation of being locally upper Lipschitz at Let and be as above, and put

Further, introduce the cone

Now we obtain Remark 7.11 (nontrivial solution of the C-stability system). The C-stability system has a nontrivial solution (i) with

some pair and

(ii) with

satisfies for some

To prove this, one has again to put

and to apply that

Having and in (i), this condition means precisely that the strict MFCQ is violated. In the current case, we say that the generalized strict MFCQ holds (at if such a pair does not exist. So one obtains for Remark 7.12 (CF injective). is injective if and only if the generalized strict MFCQ holds (at and (i) (ii) For the special case of strict MFCQ and

(i) and (ii) take the form

7.4

Discussion of Particular Cases

In this section, we specialize the results of the previous section to the case of and apply them to nonlinear complementarity problems. In particular, we discuss consequences for Newton–type methods. However, the application to usual Kojima functions related to nonlinear programs will be postponed to Chapter 8.

7.4.1

The Case of Smooth Data

Let be a zero of the Kojima function F. We again use the abbreviations (7.37) ... (7.40). To avoid the transposition symbol, we agree that the vectors are columns are rows and use the convention (7.7). Suppose that now one has

For

For example, if F is the Kojima function of the nonlinear program (7.1), H is the (partial) Hessian with respect to at of the Lagrangian

Matrix Representation

From Theorem 7.6 we know that the Thibault derivative of F at can be written as

where

and

is the square matrix of order

is given according to (7.28).

in direction


Specializations of Regularity Characterizations

We know that F is strongly regular at implies this obviously yields

if and only if

F is strongly regular at in our special case. Now let be the set of all matrices under consideration. Since is (arc–wise) connected in the space of one obtains

F is strongly regular at det has the same sign

Because each appears in exactly one column and is affine–linear, the function is affine–linear in each too. So (by induction arguments) it suffices to check only all determinants det for with Moreover, since only includes nondifferentiable terms and

one easily sees that the set of matrices is just Clarke’s [Cla83] generalized Jacobian Clarke’s sufficient condition for local Lipschitz invertibility (i.e., strong regularity) of F says that all matrices in have to be non–singular. Hence, in our particular case, this sufficient condition is also a necessary condition for strong regularity of F. Finally, along with F, let us also study the linearized Kojima function

Clearly, it holds and the related derivatives of F and LF evidently coincide at the point Now we summarize, in the smooth case, several characterizations of strong regularity and local upper Lipschitz continuity of F.

Corollary 7.13 (strong regularity and local u.L. behavior). Let F = NM be a generalized Kojima–function according to (7.8), and let Suppose that Then the following properties are mutually equivalent:

(i) is injective (i.e., F is strongly regular at ).
(ii) LF is strongly regular at
(iii) det for all
(iv) det has the same sign for all with
(v) are non-singular matrices.

Moreover, the following properties are mutually equivalent:


(a) is injective (i.e., is locally u.L. at (0, )).
(b) C(LF) is injective.
(c) is locally u.L. at (0, ).

Proof. Both sets of equivalences are consequences of Theorem 7.8 according to the above discussion.

Note. With respect to problems, Corollary 7.13 ensures that the injectivity condition for TF does not change if, instead of the original problem, we study its quadratic approximation at namely,

Hence, strong stability of the original problem and of its quadratic approximation (PQ) at coincide. This reduction to quadratic problems was a key result of Robinson’s paper [Rob80]. By Corollary 7.13, the same may be said about the locally upper Lipschitz property of KKT–points at (0, ). Concerning necessity of LICQ under strong regularity, let us add the following argument, based on Corollary 7.13 and proposed in [Kum91b]: Strong regularity implies that for all and this again implies LICQ via if

Some Historical Notes on Strong Regularity

The history of the strong regularity conditions presented in Corollary 7.13 (in the context of problems) is quite long. The conditions are basically known from Kojima’s and Robinson’s work in 1980, see [Koj80, Rob80]. Robinson wrote the condition in a different algebraic way by means of Schur complements. Kojima proved the characterization (i) ⇔ (iv); however, LICQ was still an additional assumption. Again assuming LICQ in addition, Jongen et al. [JMRT87] proved in 1987 that Robinson’s and Kojima’s matrix conditions are equivalent. A little gap remained after the mentioned papers: the proof that LICQ is a (simple!) consequence of strong regularity at critical points. To our knowledge, this gap was first closed in 1990-91 in the papers [KT90, Thm. 2.3] and [Kum91b, Thm. 5.1]. Further, in [JKT90], the equivalence between Clarke’s regularity condition for and strong regularity has been shown, so it became again evident, now by means of nonsmooth analysis, that the earlier conditions imposed by Robinson and Kojima are both sufficient and necessary for strong regularity. The strong regularity conditions of Corollary 7.13 can also be derived from the stability results for equations presented in [RS97] in terms of the so-called coherent orientation property; note that F is in this special case. For recent self-contained presentations of characterizing strong regularity in optimization, we refer, e.g., to [Don98, KK99b, BS00].


Additionally, by using the injectivity condition of coderivatives (see our Theorem 3.7), characterizations in terms of intersections of polar cones have also been derived via generalized equations, see [DR96]. There, as a new final result, it has been shown that (in our terminology) strong and pseudo-regularity of generalized Kojima functions coincide provided that M is a function. An alternative proof and an example for the fact that this statement does not hold for optimization without constraints and piecewise linear has been given in [Kum98]; we will present this result in §7.5 and Example BE.4. The present approach to strong regularity of problems (via injectivity and computing TF) has been developed in [Kum91b, Kum91a]; extensions to generalized Kojima systems and conditions for upper regularity have been presented in [KK99a, KK99b]. The reader will easily find various other approaches and remarkable contributions devoted to stability of critical and stationary points for optimization problems and related variational problems; we refer to [KR92, Mor94, PR96]. However, in our opinion, the techniques applied in these papers or in the book [LPR96] are essentially restricted to and respectively. In this historical series, we have to include A.V. Fiacco’s pioneering work concerning sensitivity in parametric optimization, see [FM68, Fia83], though he studied even differentiability properties of solutions. But, at the crucial points, strict complementarity has been supposed there, which allows one to apply the usual implicit function theorem to the KKT-system for deriving a local stability theory. Nevertheless, our desire to extend his clear analytical approach to stability in optimization was a key idea for writing this book at all.

Difference between TF and CF

Let us return to the derivatives TF and CF and regard their difference. Actually, only the replacement of and is of importance for comparing them.

Corollary 7.14 (difference between TF and CF). Let and let Then

where, with

and

unit vector of

Note. The statement says that in the smooth case, is a translation of the polyhedron P along the vector


Proof of Corollary 7.14. We apply the explicit formulas. Since we have where The only difference concerns when T F permits full variation CF restricts to 0 or 1, depending on the selected This leads to different elements and of the derivatives. First, to see what happens, let us suppose that has only one element. Having and we observe that if then

if

where

then

in both cases. Unifying both cases, we have

and each such difference with card the difference

may occur. Considering now the case can be written as a sum:

Again, all right-hand sides may appear. Finally, to send the interesting sets into we defined just the polyhedron P.

So, even for problems, TF and CF are different if On the other hand (see Section 7.2), by Robinson’s approach via generalized equations (cf. §7.1), there is only one crucial generalized equation, based on the linearization of H for such problems. In this framework, one may understand our derivatives TN and CN as different approximations of the normal map. The injectivity check for TF, as well as computing all elements in the set requires solving a finite number of linear equations assigned to the matrices for In particular, for CF, the nontrivial solutions of the C–stability system are of interest. This leads us, by definition of to a (generally non-monotone) linear complementarity system.

Consequences for Newton Methods

For computing solutions of the generalized Kojima systems by a Newton method (based on linear auxiliary problems), one may fix some matrix assigned to and in order


where denotes the matrix (7.49) with in place of Concrete methods then differ from each other by the selection of at a current iteration point

Lemma 7.15 (Newton’s method under strong regularity). Under strong regularity at the solution and for these methods converge (locally) superlinearly to for all selections of

Note. Whether strong regularity is really needed for convergence depends on the choice of

Proof of Lemma 7.15. Put

We have

By strong regularity, all matrices have uniformly bounded inverses for near Since it holds that F is a function. So F is semismooth, cf. §6.4.2, and the matrices form a Newton function of F at But this yields

In Chapter 10 we will see that this statement remains true even if F is less smooth, and we will interpret the auxiliary Newton-systems.

7.4.2

Strong Regularity of Complementarity Problems

Given the nonlinear complementarity problem (NCP) requires finding some such that

To this problem, we assign the generalized Kojima function

does not appear. It holds at any solution of the NCP. Strong regularity of the NCP (i.e., by definition, strong regularity of F) at means regularity of all matrices in (7.49). In the present context, the rows of attain the form

In complementarity theory, one often uses matrices combined by so let us transform the system again.

and

Lemma 7.16 (strong regularity of an NCP). Let u, and let be a solution of the NCP (7.52). Denote by C(r) the matrix formed by the rows Then, for every fixed the matrix P(r) in (7.54) is singular if and only if so is the matrix C(r). Moreover, the NCP is strongly regular at if and only if the matrices C(r) are non-singular for all
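The vertex test suggested by Corollary 7.13 and Lemma 7.16 can be sketched in a few lines of Python. The sketch rests on assumptions that are not visible in this excerpt: the NCP is read as u(x) >= 0, v(x) >= 0 with u(x) and v(x) complementary; row i of C(r) is taken as r_i Du_i + (1 - r_i) Dv_i; r_i is free in [0, 1] exactly at degenerate indices (u_i = v_i = 0) and is otherwise fixed to the function that vanishes. Since each r_i enters only one row, the determinant is affine in each r_i, so equal nonzero signs at all 0/1 vertices exclude a zero on the whole cube. The names `det`, `vertex_determinants` and `same_nonzero_sign` are hypothetical.

```python
from itertools import product

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(a[i][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= f * a[c][j]
    return d

def vertex_determinants(Du, Dv, u, v, tol=1e-12):
    """Determinants of C(r) for all 0/1 choices of r at degenerate indices.

    Du, Dv: Jacobians (lists of rows) of u and v at the solution;
    u, v: the values u(x), v(x) at the solution (assumed complementary).
    """
    n = len(u)
    free = [i for i in range(n) if abs(u[i]) <= tol and abs(v[i]) <= tol]
    dets = []
    for bits in product((0.0, 1.0), repeat=len(free)):
        # assumed convention: u_i = 0 < v_i gives row Du_i (r_i = 1), else row Dv_i
        r = [1.0 if abs(u[i]) <= tol else 0.0 for i in range(n)]
        for i, b in zip(free, bits):
            r[i] = b
        C = [[r[i] * Du[i][j] + (1.0 - r[i]) * Dv[i][j] for j in range(n)]
             for i in range(n)]
        dets.append(det(C))
    return dets

def same_nonzero_sign(dets, tol=1e-12):
    return all(d > tol for d in dets) or all(d < -tol for d in dets)

# Illustrative data: u(x) = x, v(x) = M x at the fully degenerate solution x = 0.
identity = [[1.0, 0.0], [0.0, 1.0]]
M_good = [[2.0, 1.0], [1.0, 2.0]]
M_bad = [[0.0, 1.0], [1.0, 0.0]]
dets_good = vertex_determinants(identity, M_good, [0.0, 0.0], [0.0, 0.0])
dets_bad = vertex_determinants(identity, M_bad, [0.0, 0.0], [0.0, 0.0])
```

For the first matrix all four vertex determinants are positive, so the test passes; for the second one the signs differ, so some C(r) with r in the cube is singular.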


Proof. We verify the first statement, the second one is a consequence of Corollary 7.13. Let Then yields hence The two equations

assigned to lines

and

now yield

Let

due to

Then one finds

such that

by

setting otherwise If

(7.55) follows elementary. In the last case, it holds by definition and (7.55) is shown.

again

To characterize the Newton equations (7.51) for the actual case, let

be the linearization of

at

Lemma 7.17 (transformed Newton solutions). Let solve (7.51) for s = (x,y) = (x,u(x) – v(x)). Then,

Conversely, if satisfies (7.56) for is satisfied with as

and let

and y = u(x) – v(x), then (7.51)

otherwise

Proof. To abbreviate we omit the arguments it holds

of

and Then we have

due to

and

and

By (7.51)


Thus,

So (7.56) is valid. Conversely, we have to discuss the three possible cases, namely, Then

and

Then Thus

Thus,

and

Now we have

and

as well as

This completes the proof. Having

all choices of are possible in (7.56), and The equations (7.56) are crucial if one solves the complementarity problem by means of certain positively homogeneous NCP-functions The coefficients in (7.56) are then defined as normalized partial derivatives of at For details we refer to Section 9.1.
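As a rough illustration of a Newton method built on an NCP-function, the following Python sketch uses min(a, b), which is positively homogeneous, and solves min(u(x), v(x)) = 0 componentwise. This is a swapped-in, simplified variant and not the system (7.51) or (7.56) of the text: at each iterate, row i of the Newton matrix is the gradient of the smaller of u_i and v_i, with ties resolved in favor of u_i (one admissible selection among several). The names `solve` and `newton_min` and the test data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        if aug[p][c] == 0.0:
            raise ValueError("singular Newton matrix")
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def newton_min(u, v, Du, Dv, x, iters=20, tol=1e-12):
    """Newton iteration for min(u(x), v(x)) = 0 with a row-wise selection."""
    for _ in range(iters):
        ux, vx = u(x), v(x)
        phi = [min(a, b) for a, b in zip(ux, vx)]
        if max(abs(p) for p in phi) <= tol:
            break
        Jux, Jvx = Du(x), Dv(x)
        # row of the smaller function; ties go to u (an arbitrary admissible choice)
        J = [Jux[i] if ux[i] <= vx[i] else Jvx[i] for i in range(len(x))]
        d = solve(J, [-p for p in phi])
        x = [xi + di for xi, di in zip(x, d)]
    return x

# Illustrative linear complementarity data: u(x) = x, v(x) = M x + q.
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-1.0, 1.0]
u = lambda x: list(x)
v = lambda x: [sum(M[i][j] * x[j] for j in range(2)) + q[i] for i in range(2)]
Du = lambda x: [[1.0, 0.0], [0.0, 1.0]]
Dv = lambda x: [row[:] for row in M]

solution = newton_min(u, v, Du, Dv, [1.0, 1.0])
```

For this data the iteration started at (1, 1) reaches the solution (0.5, 0) after two steps; whether such a selection converges in general depends on the regularity assumptions discussed in Lemma 7.15.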

7.4.3

Reversed Inequalities

If M is only a mapping, the equation (see (7.48)) is no longer true. The linear operator of (7.49) now includes the multifunction at the place of H and becomes a nonlinear, set–valued operator which is defined in accordance with the explicit formula of Theorem 7.6 for each This operator is still linear in the dual directions and Let us next again assume that and

i.e., in contrast to Kojima systems for standard nonlinear programs, only is replaced by Suppose that is a zero of F such that strong regularity at is violated, i.e., for some


Further, let be some index such that Since the point is also a zero of the function which differs from F only by changing the sign of both and This procedure simply means that the original inequality has been reversed. Changing the sign of in the vector we see that is not strongly regular at either. Therefore, we may state

Lemma 7.18 (invariance when reversing constraints). Suppose and and let be a zero of F. Then, strong regularity of F at is invariant with respect to multiplication of any with provided that

The previous lemma explains why characterizations of strongly regular critical points may differ by sign; compare, e.g., differences in such conditions given in [Kum91b] and [DR96]. It is worth mentioning that the lemma fails to hold for the related injectivity of CF.

7.5

Pseudo–Regularity versus Strong Regularity

In Chapter 5 we gave a characterization of pseudo–regularity for continuous functions. Instead of specializing these results to generalized Kojima functions (which would not give much new insight), we shall discuss the close connections to strong regularity in various situations. Throughout this section, let F be a generalized Kojima function, and suppose that and are functions. In the first two lemmas we shall show that zero–Lagrange–multipliers (zero LM) do not play an essential role for the relation between pseudo– and strong regularity. For the given F, define by removing both the component of and the product in Obviously, is again a generalized Kojima–function. Let as above

Lemma 7.19 (deleting constraints with zero LM, pseudo-regular). If F is pseudo–regular at some zero of F with then is again pseudo–regular at

Proof. Specify the norms used in and to be the maximum–norm, and put We use the letters for elements of for elements of for elements of and for elements of In the definition of pseudo-regularity of F, let L be the Lipschitz constant, let V be the open ball and let where is already small enough such that


Again by continuity, there is some


such that

Now let and be arbitrarily fixed. We have to show that there is some point such that

For this reason, we define the vector with which differs from by the additional component only. Similarly put for defining by using Because of the inequality is not active at Hence the point belongs to The pseudo–regularity of F provides us with some satisfying

Hence, This gives deleting in

and so, the choice of yields Therefore, the point defined by belongs to and satisfies (7.57).

Recall that a (multi-valued) selection of a given multifunction is said to be continuous if it is upper and lower semicontinuous.

Lemma 7.20 (deleting constraints with zero LM, not strongly regular). Let F be pseudo–regular at some zero of F with Further, suppose that has a closed–valued and continuous selection such that Then, if F is not strongly regular at is also not strongly regular at

Proof.

Consider the mapping and suppose first that is single–valued on some ball Then, by the properties of is a continuous selection of and, in accordance with Theorem 5.10, F is strongly regular at Hence, if F is not strongly regular, then for some sequence there exist certain elements satisfying

Our assumptions concerning ensure that and consider the inverse map S of at the parameter points

and

Next,


where is the projection of onto Deleting the of the points in (7.58), we define points and in and respectively. Due to the difference of the parameters becomes

To show that is not strongly regular at we assume the contrary is true (with rank L). With some Lipschitz constant K for near and one then obtains

Since even

and the estimate implies and, due to This contradicts (7.58) and so completes the proof.

Recall that, by Lemma 5.11, is an isolated zero of a F from into itself, provided that F is pseudo–regular at In this case, near the origin, has a continuous multivalued selection with compact images and This result, together with the foregoing lemmas, allows in special cases a simple reduction procedure, which has an interesting application: for the usual Kojima–function of a regularity and pseudo–regularity coincide.

Theorem 7.21 (reduction for data). Let F = NM be a generalized Kojima–function, and suppose that are Let be a zero of F and Define the reduced generalized Kojima–function by deleting from F all components and all products in with If F is pseudo-regular but not strongly regular at then the same is true for at the reduced part of

Proof. The function F is hence has a compact–valued and continuous selection with as long as F is pseudo–regular at Due to the Lemmata 7.19 and 7.20, one may successively remove all constraints with (which automatically includes that After deleting the related components, the corresponding reduced part of is still a zero of which remains pseudo–regular but not strongly regular. By standard continuity arguments, this reduction may be continued for the components with without affecting the desired properties.

With respect to the reduced zero now fulfills the strict complementarity condition. So, if all data in M are even (which means for the usual Kojima-function that then the system is locally a and holds. Therefore, Theorem 5.1 yields that pseudo–regularity of at implies non–singularity of the Jacobian and hence strong regularity of at So we have proved


Corollary 7.22 ( pseudo-regular = strongly regular). If F = NM is a generalized Kojima–function with being then, at any zero F is strongly regular if and only if F is pseudo–regular.

The results of this subsection on pseudo–regularity were taken from [Kum98]. Note that in the context of optimization problems and related variational inequalities, Corollary 7.22 also appears in [DR96]. However, even for stationary points of a function (unconstrained), both regularity concepts do not coincide, see Example BE.4.


Chapter 8

Parametric Optimization Problems

In this chapter, we study the local Lipschitz behavior of critical points and critical values as well as stationary and (local) optimal solutions for parametric nonlinear optimization problems in finitely many variables. We do not aim at a comprehensive or even complete presentation of all aspects of sensitivity and local stability analysis in nonlinear optimization. Our purpose is to derive (Lipschitz) stability results for programs involving data, and for that to apply largely the results of the previous chapter on regularity and Kojima–functions. It will turn out that our approach also yields several known (or new) basic results for perturbed nonlinear programs with data. The statements shall concern strong regularity, pseudo-regularity and upper Lipschitz stability, second order characterizations, geometrical interpretations, as well as representations of derivatives of solution and marginal value maps.

Note that there is a well-developed perturbation theory for programs with smooth (i.e., usually at least ) data; for a book reflecting the state of the art of this theory we refer to Bonnans and Shapiro [BS00]. Basic monographs in the field of parametric nonlinear optimization are, e.g., [Fia83, Mal87, DZ93, Lev94]; crucial aspects and applications of this field are systematically handled, e.g., in the books [Gol72, BM88, GGJ90, BA93, Gau94, RW98]. Moreover, for programs with there exists a powerful and deeply developed singularity theory based on the characterization of generically appearing singular cases of Kojima’s system; we mainly refer to the basic work of Jongen, Jonker and Twilt [JJT86] and to [JJT83, JJT88, JJT91].

As a starting point, we introduce the parametric nonlinear program


where is a subset of and and map to and respectively. If and belong to the class then the problem (8.1) is called a parametric program. In particular, we are interested in the classes and Recall that the related parametric Kojima system defining the critical points of then becomes where and

The associated Lagrangian of is defined by

and we have in (8.2).

An Illustrative Example

Even if is a subset of and if all data functions are arbitrarily smooth, then critical points and critical values considered as functions of may behave rather badly (in particular, discontinuity may hold). Example 8.1 (see metric problem

Consider the real convex, quadratic one–para-

For the stationary (= optimal) solutions and the critical (= optimal) values are unique, namely,

For all feasible are critical, and For and we have and The special discontinuity of at (note that is not lower semicontinuous at 0) has a strange consequence: If P(0) should be solved but (because of a former computational error) one really solves with some then the error becomes the larger, the better approximates the true value 0. On the other hand, if the is large enough, then one gets again the exact critical value.

Thus, even from the practical point of view, the preceding example illustrates the need for some "stable behavior" of problem (8.1) with respect to parameters describing the involved functions.

8.1. The Basic Model


Results on Parametric Optimization from Previous Chapters

Many results of the first chapters of the present book (in particular, those of Chapter 7) can be considered as contributions to stability and parametric analysis of feasible and stationary point sets to optimization problems. We compile here a list of propositions explicitly devoted to parametric optimization problems: Theorem 1.15 (Berge/Hogan stability), Theorem 1.16 (stability of complete local minimizing sets), Theorem 2.6 (free local minima and upper Lipschitz constraints), Lemma 2.7 (Hoffman’s lemma), Lemma 2.8 (Lipschitz u.s.c. linear systems), Theorem 2.10 (selection maps and optimality conditions), Lemma 4.6 (lsc. and isolated optimal solutions), Corollary 4.7 (pseudo-Lipschitz and isolated optimal solutions), Theorem 4.8 (growth and upper regularity of minimizers).

8.1

The Basic Model

Throughout the present chapter, our basic model is

where varies in varies in and are given as above. This is a parametric program with additional canonical perturbations which are particularly needed in showing that certain sufficient conditions for Lipschitz stability (in one sense or the other) are also necessary ones. Since we study the local stability behavior of solutions, throughout we associate with the unperturbed problem a fixed element of where we shall often identify The related problem

is called a parametric program with canonical perturbations, i.e., the perturbations of the corresponding Kojima system are only in the right-hand side. To get a compact and brief description of our results, we have omitted the equality constraints (or ). As far as we apply results on Kojima functions, it can be easily and directly seen by the assumptions and proofs below that the equalities play the same role as inequalities with positive multiplier components of the critical point under consideration. This becomes also formally clear from the explicit representations of the derivatives CF and TF in Theorem 7.6, since there or if We are mainly interested in a local stability analysis of (or programs around some given stationary solution of and for small perturbations So, the following general assumptions are supposed to hold:


Note that many stability results derived in the present chapter can be extended to more general parameter spaces and less restrictive assumptions on the data functions, in particular, when taking the results of Chapters 2 and 3 into account. Recall that the parameterized Kojima function has the product representation with

and where E means the identity matrix. Here, the convention (7.7) is used, i.e., and are considered as column vectors. We again put

see (7.31), and

see (7.32), where

The sets of critical points, stationary solutions and multipliers related to and , respectively, are denoted by

with abbreviations

If the multiplier associated with is fixed, we use the

and Further, since the sets and defined in (7.40) become "generalized Hessians" of L, namely

where is the partial Thibault derivative (contingent derivative) of with respect to at in

8.2. Critical Points under Perturbations


direction To have a concise formulation of stability conditions, we will sometimes suppose that holds at some initial stationary solution of the program i.e., for all Of course, under the general assumptions (8.6), is locally upper Lipschitz (or locally single-valued and Lipschitz) at if and only if the stationary solution set map of the parametric program

has this property. The same argument applies to the critical point mapping at some

8.2

Critical Points under Perturbations

In this section we shall discuss local Lipschitz continuity and local upper Lipschitz behavior of critical points to parametric and programs. In the case of canonical perturbations, these properties are defined via the Kojima function In the case of nonlinear perturbations, the implicit function theorems of previous chapters are helpful. Of course, the regularity characterizations given in Chapter 7 apply to the situation of the present section by putting there and

8.2.1

Strong Regularity

We shall say that the optimization problem given in (8.4) is strongly (pseudo) regular at a critical point (or, synonymously, is a strongly (pseudo) regular critical point) if the associated Kojima function has this property.

Theorem 8.2 (strongly regular critical points). Let be a critical point of the problem (P). For with the notation (8.9)–(8.11), the following properties (i) – (iii) are all equivalent to each other, and each of them implies that holds at

(i) The problem (P) is strongly regular at
(ii) For each solution of the system

one has

(iii) The T-stability system

has only the trivial solution

If then each of the conditions (i) – (iii) is equivalent to each of the following conditions:

(iv) The problem is pseudo-regular at
(v) The determinants of all matrices

have the same non-vanishing sign, where

and

if

Proof. Apply Lemma 7.1, Theorem 7.8, Corollary 7.22 and Corollary 7.13.

Remark 8.3 (necessity of variation of ). In the previous theorem, we used that is a consequence of pseudo-regularity for generalized Kojima functions, see Lemma 7.1. For the present situation, one also knows that if the critical point map of the particularly perturbed program

is locally single-valued near then has necessarily to hold, see [KT90]. For completeness, we give the proof: Assume that is single-valued on some open neighborhood of 0, where is an open neighborhood of but fails at Then there is some such that where and Let and if and = 0 if for given Hence, for sufficiently small one has and the point satisfies

However, for any sufficiently small number one also has and again belongs to which yields a contradiction.
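The matrices in the determinant condition of Theorem 8.2 are not reproduced in the text above. As an illustration only, the following sketch uses the classical index-set version of such a test for twice differentiable data under LICQ, which is an assumption here and not necessarily the exact form of the theorem: for every index set J between the strongly active constraints (positive multiplier) and all active constraints, form the block matrix [[H, A_J^T], [-A_J, 0]], where H is the Hessian of the Lagrangian and A_J collects the gradients of the constraints in J, and compare the signs of the determinants. All names and the toy data are hypothetical.

```python
import itertools
import numpy as np

def determinant_signs(H, A, strongly_active, weakly_active, tol=1e-12):
    """Signs of det [[H, A_J^T], [-A_J, 0]] for all admissible index sets J."""
    n = H.shape[0]
    signs = []
    for k in range(len(weakly_active) + 1):
        for extra in itertools.combinations(weakly_active, k):
            J = np.array(sorted(set(strongly_active) | set(extra)), dtype=int)
            A_J = A[J, :]
            K = np.zeros((n + J.size, n + J.size))
            K[:n, :n] = H
            K[:n, n:] = A_J.T
            K[n:, :n] = -A_J
            d = np.linalg.det(K)
            signs.append(0 if abs(d) <= tol else int(np.sign(d)))
    return signs

def same_nonvanishing_sign(signs):
    return 0 not in signs and len(set(signs)) == 1

# One constraint x_1 <= 0, active at the origin with zero multiplier.
A = np.array([[1.0, 0.0]])
signs_convex = determinant_signs(np.eye(2), A, [], [0])
signs_saddle = determinant_signs(np.diag([-1.0, 1.0]), A, [], [0])
print(signs_convex, same_nonvanishing_sign(signs_convex))
print(signs_saddle, same_nonvanishing_sign(signs_saddle))
```

With the identity as Hessian both determinants are positive, so the test is passed. With the indefinite Hessian diag(-1, 1) the two determinants have opposite signs, so the test fails.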

Corollary 8.4 (nonlinear variations, strongly regular). Consider the parametric program Suppose is an open subset of and are real-valued functions defined on Let be some critical point of Then,


1. the critical point mapping is locally single-valued and Lipschitz around if and only if holds for each Moreover,
2. if, in addition, exists and is Lipschitzian on some neighborhood of then is locally single-valued and Lipschitz around if and only if is strongly regular at

Proof. The first assertion immediately follows from Theorem 5.15, and it implies the second assertion by using Theorem 8.2 and the standard partial derivative formula for Thibault derivatives (cf. Corollary 6.9).

Remark 8.5 (strong stability in Kojima's sense). Adapting Kojima's [Koj80] definition, we say that a critical point of is strongly stable with respect to some perturbation class if (i) there is a constant such that the equation has a unique solution (where is the Kojima function of whenever is small in the norm on and (ii) is continuous at the zero map 0 with To get a relation to strong regularity, one has to ensure that the perturbation class is rich enough. If contains all small perturbations of the type (D symmetric) and then, by Kojima [Koj80], is strongly stable with respect to if and only if (P) (i.e. F) is strongly regular at For a discussion of this relationship see also [KT90, KK99b].

In view of the previous corollary and remark, we now concentrate on canonically perturbed programs. The next discussions specialize several facts known from Section 7.3.

Geometrical Interpretation

Here we adapt the geometrical interpretation of strong regularity given in Remark 7.10. Put there and let be a given critical point of (P)=(P)(0). Then we obtain

and

Hence, we obtain

Corollary 8.6 (geometrical interpretation, strongly regular). is a strongly regular critical point of if and only if

are valid.

Since and imply the second-order condition (SOC) is always true if

is fulfilled. In the case, it holds i.e., (SSOC) is the so-called strong second-order condition.

Direct Perturbations for the Quadratic Approximation

For the T-stability system (8.13) indicates that and where, near a KKT point of the initial problem and for certain with small norm, there are two different KKT points to the quadratic problem, with

such that strong regularity fails to hold. This can be formulated precisely as follows, where i.e., is assumed without loss of generality. Note that the role of the quadratic approximation in the analysis of strong regularity of (P) was pointed out by Robinson in [Rob80]. The following lemma was first given in [Kum98].

Lemma 8.7 (two close critical points, quadratic problems). Let be a KKT point for with i.e., and suppose (i) If and are KKT points for sufficiently close to then and solve (8.13). (ii) If solves (8.13) and has small norm, then, setting and the points and are KKT points for

Proof. One has to insert the given points into the related equations. The proof is similar to that of Lemma 8.17 below; we omit the details.

Corollary 8.8 Let be a KKT point for and and suppose Then problem is not strongly regular at if and only if for some sequences of vanishing vectors and corresponding perturbations and the quadratic problem has two different KKT points and converging to as


Proof. Indeed, if such KKT points exist then we obtain a nontrivial solution of (8.13) via Lemma 8.7. Hence, (P) is not strongly regular. On the other hand, if (P) is not strongly regular at then some nontrivial solves (8.13). If this point is multiplied by some small then it remains a solution, and we obtain two different KKT points for by Lemma 8.7(ii).

So, in order to characterize strong regularity of critical points, only KKT points of problems involving particular canonical perturbations with and must be considered. Lemma 8.7 cannot be applied for since the quadratic program (PQ) is not defined. Nevertheless, if the Kojima function is still piecewise linear, problem may be perturbed in a similar way, only is no longer true in (ii).

Lemma 8.9 (two close critical points, piecewise quadratic problems). Let the functions involved in of (8.4) have piecewise linear first derivatives, and let be a KKT-point for with (i) If and are KKT points for in and is small enough, then and solve the system (8.13). (ii) If solves (8.13) and has small norm, then given any one finds and such that both holds and the point q defined by satisfies as well as Moreover, by setting and the points and are KKT points for

Proof. The representation of as a difference quotient now follows from the piecewise linear structure of the matrix M in the Kojima function F = MN. The rest again requires only a direct calculation.

So the particular perturbations which must be considered have the same form as in Lemma 8.7 above.

Strong Regularity of Local Minimizers under LICQ

We finish this subsection by specializing Theorem 8.2 to the case of critical points of programs such that the are local minimizers. Recall that

Theorem 8.10 (strongly regular local minimizers). Suppose that and are functions. Let be a critical point of the program and suppose that is a local minimizer of this program. Then is strongly regular at if and only if satisfies and is positive definite on (i.e., (SSOC) holds). In this case, if denotes the critical point of the canonically perturbed program then is a local minimizer of whenever is sufficiently small.


Proof. Before proving the equivalence, we note that the additional proposition on persistence of the local minimizing property is an immediate consequence of Theorem 1.16.

"If" direction of the equivalence: This immediately follows from the characterization (8.15) and the sufficient condition (8.16) discussed above.

"Only if" direction of the equivalence: Suppose that F is strongly regular at Then, as shown above, satisfies Note that, by convention, is also satisfied if To prove (SSOC), we consider where and first show that

Indeed, strong regularity at particularly includes that the local minimizer of P (0) is a strict one. Hence, Theorem 1.16 applies and so, for some and all we have

and therefore, each element of is a local minimizer of Assume that were already small enough such that for both LICQ holds on each is a stationary solution of – and has a unique stationary solution in which is implied by strong regularity. Since, by construction, is a stationary solution of even for all we then finally conclude that So (8.17) is shown. Because of (8.17), the relation

holds by a classical necessary optimality condition. Theorem 8.2 implies that, in particular, the matrix of (v) in that theorem with if and if is nonsingular, and so

is also nonsingular, where is the vector function built by Hence, by a known fact from linear algebra, we then obtain that the strict inequality in (8.18) is satisfied for \ {0}, i.e., (SSOC) holds, which completes the proof.

Note. In the case of programs, strong regularity of (P) at for a local minimizer in general does not imply the corresponding second-order condition (SSOC) defined in (8.16), see the counterexample of a simple unconstrained program presented in Example 6.22.


8.2.2 Local Upper Lipschitz Continuity

Consider again the basic model (P) of (8.4) and the parametric counterparts and (P) where t and vary. Let, as above, (or if is fixed) denote the associated Kojima function. In this subsection, we are interested in necessary and sufficient conditions for the local upper Lipschitz continuity of the critical point mappings and respectively. Similarly to the case of strong regularity, we again discuss quadratic approximations and geometrical interpretations; however, now the Thibault derivative TF is replaced by the contingent derivative CF. Recall that a multifunction from to is said to be locally upper Lipschitz (briefly locally u.L.) at if there are positive real numbers such that

Again we note that definition (8.19) includes that is isolated in but it does not include that is nonempty for q near As above, we use the notation

for given and with according to (8.9). Now, Theorem 7.8 which was proved in the context of generalized Kojima functions immediately gives the following characterization theorem.

Theorem 8.11 (locally u.L. ). Let be a critical point of the problem For and with the notation (8.9)–(8.11) and (8.20), the critical point map is locally upper Lipschitz at if and only if the C-stability system

has only the trivial solution

It is worth noting that for unconstrained programs, the condition of Theorem 8.11 is reduced to the second-order criterion

which would imply for even strong regularity. However, in the case of programs, this criterion in general does not imply upper regularity, i.e., the existence of critical points of slightly perturbed problems cannot be guaranteed, see the following simple example.


Example 8.12 (no upper regularity for programs). Put Here, and The origin is not a minimizer, but a stable critical point in the sense of local upper Lipschitz behavior. Indeed, for stationary points to exist (not uniquely) near and If a < 0 then Moreover, replacing the generalized derivative we obtain If all data in the problem near are supposed to be functions with respect to a finite-dimensional the Kojima function (8.2) is locally Lipschitzian with respect to Hence, with one has where and there exist all

and such that Thus, if

for

is locally u.L. at with constants according to (8.19) (put there and then each with and satisfies the estimate

This means that is locally u.L. at From these observations, we immediately obtain

Corollary 8.13 (nonlinear variations, u.L.). Let be open, and suppose that belong to the class Then the critical point map of the parametric program near is locally upper Lipschitz at if and only if the critical point map S of the canonically perturbed program p near 0, is locally upper Lipschitz at

In view of the previous corollary, we now again concentrate on canonically perturbed programs. The next discussions partially specialize facts known from Section 7.3.

Reformulation of the C-Stability System

In what follows, we give a reformulation of the C-stability system (8.21), which allows us to reduce, in the case, the characterization of a locally u.L. critical point to the question whether the reference point is an isolated critical point (i.e., the unique critical point in some neighborhood of it) to the assigned quadratic problem defined in §8.2.1. It will also be seen how the quadratic problems can be substituted for


Replacing TF by CF, we may apply similar arguments as above in the case of strong regularity. As above, let is critical for (P)},

if

and

if

Further, we define

and For some given

the cones coincide, since for each vector one has

satisfying

and

Remark 8.14 (reformulation of the C-stability system). By Theorem 8.11, the critical point map of the canonically perturbed program is locally u.L. at if and only if the system (8.21) has only the trivial solution Using the structure of some point solves (8.21) if and only if and, for some the point is an optimal solution of the linear program

and where again

and If we have and so we arrive at the auxiliary problem

From Theorem 7.6 we know that the set defined by all points

is just

satisfying the perturbed system

i.e., if

(without loss of generality), Z is given by all those KKT points of the perturbed linear program

and which satisfy, in addition, program (8.22), perturbed by

In the and

case, this is the quadratic


The remark tells us that the analysis of system (8.21) is nothing else than the analysis of a family of linear optimization problems and of quadratic problems, respectively. So it is not surprising that the roots of the following statements lie basically in quadratic parametric optimization.

Lemma 8.15 (auxiliary problems). Some point solves (8.21) if and only if is a stationary point of

respectively.

Proof. Recall that for every So, given any objective function the stationary points of the problems

and

(defined as of KKT-points) coincide because all constraints in are linear. In particular, this holds both for and

So our Remark 8.14 finishes the proof.

Geometrical Interpretation

Next we adapt the geometrical interpretation of being locally u.L., which was given for generalized Kojima–functions in the Remarks 7.11 and 7.12. The cone considered there now becomes

(see the previous proof), while and

The cone is just polar to the tangent cone so also coincides with the polar cone of the latter can be written as

Since By putting

So we obtain from the Remarks 7.11 and 7.12 the following corollary.


Corollary 8.16 (geometrical interpretation, u.L.). The critical point map is locally u.L. at if and only if (i) strict MFCQ and (ii) are satisfied. Moreover, (i) is violated if and only if (8.21) has a nontrivial solution with u = 0, and (ii) is violated if and only if (8.21) has a solution with Now and under the sufficient condition

imply

Thus, (ii) holds true

(SSOC)’

and

Direct Perturbations for the Quadratic Approximation

We show for problems, how the system (8.21) indicates where, near there is a second KKT-point for In contrast to the corresponding geometrical interpretation of strong regularity, here we have only to deal with the unperturbed quadratic approximation, i.e., we study with

where we again assume, without loss of generality,

(i.e.,

).

Lemma 8.17 (two critical points, quadratic problems). Let and be a KKT-point for (or, equivalently, for (P)). Let if and if (i) If is a critical point for and then solves (8.21). (ii) If solves (8.21) and then is a KKT-point for

Proof. Some point is critical for if and only if

and Setting

and using

and

this is

Proof of (i). Let be critical for So we already know that (8.25) holds, and we have only to show that


provided that the condition and if if

holds. Indeed, from we obtain that

then then

and hence

Thus, in any case, If

if

Moreover,

then

and

yield

as well as Summarizing, this is

so

solves (8.21).

Proof of (ii). Let solve (8.21) at (0,0). Using and we see that for

it holds

for

it holds

and since

and

while

as well as Hence, since also (8.25) follows from (8.21),

is a KKT-point for

Analogously to the discussion following Lemma 8.7, one has

Corollary 8.18 Let and be a KKT-point for (P). Then, the critical point map of (P)(a,b), is locally u.L. at if and only if the point is an isolated KKT-point of problem

Proof. If (8.21) has a nontrivial solution then, using the solution for small we find KKT-points arbitrarily close to by Lemma 8.17(ii). Conversely, having such KKT-points for we find related by Lemma 8.17(i). Taking Theorem 8.11 into account, this yields the assertion.

8.3

Stationary and Optimal Solutions under Perturbations

In this section, we characterize Lipschitz properties of the stationary solution set maps and of the parametric problems and (P) introduced in §8.1. Let the general assumptions (8.6) be satisfied, i.e., is a stationary

8.3. Stationary and Optimal Solutions under Perturbations


solution of and for some neighborhood of The results will be used to give conditions for the Lipschitz behavior (in one sense or the other) of perturbed local minimizers near a strict local minimizer of the initial problem.

8.3.1 Contingent Derivative of the Stationary Point Map

From the general theory developed in the preceding chapters, we know that generalized derivatives of the mappings X and play an essential role for characterizing Lipschitz stability. In this subsection we describe the contingent derivative of in terms of the contingent derivative of Kojima's function F. Throughout Subsection 8.3.1, we put without loss of generality for the initial parameter, and we assume that

By these assumptions, F is the product of a locally Lipschitz matrix-function (cf. (8.7)) with the piecewise linear vector This yields, for fixed and contingent-derivative of F with respect to the elementary product (or partial derivative) rule

holds, which provides us CS via (6.56). For sufficiently small some satisfying the formula

and

can also be easily shown, where here and in the following subscripts denote partial derivatives in the contingent and Fréchet sense, respectively, and B denotes a closed unit ball independently of the space under consideration. Indeed, the inclusion

is true since M is locally Lipschitz and continuously differentiable with respect to t. The identity holds for small since N is piecewise linear. Rewriting F as a product by using these terms, (8.27) follows immediately. Nevertheless, even for arbitrarily smooth involved data in (8.26) there is (up to now) no complete formula which describes for the stationary point mapping


the contingent derivative under only. However, the contingent derivative can be computed, under when including additional canonical perturbations, i.e., for the stationary point map defined in (8.8),

The importance of adding such canonical (also called ’tilt’) perturbations for characterizing stability properties of optimization and variational problems is well–established in the literature. In our context, a map similar to has been considered in [LR95]. There the authors assumed that are functions and were able to put the additional tilt perturbation only in the objective function. So, their results are, in this smooth case, stronger than those of the following Theorem 8.19. On the other hand, our result applies to in which quadratic approximations of the input functions cannot be applied.

The Case of Locally Lipschitzian F

The following approach was recently presented in [KK01] and is based on the results obtained in [KK99a] for parametric programs under canonical perturbations only. In both papers, was a crucial assumption. Note that this assumption may be slightly weakened, see [Kla00] and the remark following Theorem 8.19. To compare with the usual case of let us, for the present, assume that M and N are functions, and is a regular zero of Then we obtain by the classical implicit function theorem that
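The display belonging to this sentence is not reproduced above. In standard notation (an assumption here, since the symbols are missing), the classical implicit function theorem says that near a regular zero of a smooth map F(x, t) the solution x = S(t) of F(x, t) = 0 has derivative DS(t) = -[D_x F(x, t)]^{-1} D_t F(x, t). The following sketch checks this formula by finite differences on a hypothetical scalar equation, x^3 + x - t = 0; it is not the Kojima system of the text.

```python
def F(x, t):
    return x ** 3 + x - t

def dF_dx(x, t):
    return 3.0 * x * x + 1.0   # never zero, so every zero is regular

def dF_dt(x, t):
    return -1.0

def solve(t, x0=0.0, iterations=60):
    """Newton's method for F(x, t) = 0 in x."""
    x = x0
    for _ in range(iterations):
        x -= F(x, t) / dF_dx(x, t)
    return x

t0 = 2.0
x0 = solve(t0)                                   # exact solution is x = 1
derivative_formula = -dF_dt(x0, t0) / dF_dx(x0, t0)
h = 1e-6
derivative_fd = (solve(t0 + h) - solve(t0 - h)) / (2.0 * h)
print(x0, derivative_formula, derivative_fd)
```

Both numbers agree with the value 1/4 up to discretization error. The contingent-derivative formula derived in the sequel plays the same role when F is only locally Lipschitz.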

Now we derive the same formula for Kojima's function in terms of contingent derivatives by considering all in the set which is bounded under MFCQ. Let denote the partial derivative of M with respect to

Theorem 8.19 (CX under ). Let be the Kojima function of problem (8.26), and let MFCQ be satisfied at Then, it holds

if and only if

for some

Proof. Condition (8.28) means that, for some sequence

o-type functions

the points and related

and certain


fulfill Due to

we find (for some subsequence of

some

Then (8.27) tells us that

and after division by

Since N is piecewise linear and

is a fixed matrix, the set

is closed as a finite union of polyhedral cones, and contains, by the above inclusion, certain elements

On the other hand, the set is compact because M is locally Lipschitz. So there is some accumulation point of the sequence under consideration and

Thus, since the intersection is not empty, (8.29) is true. Conversely, let (8.29) hold and note that N is directionally differentiable. Then one finds certain

as well as some sequence

and related

and For small

(of the given sequence), we know that

such that, with


After an elementary calculation, this tells us, because of F = MN, that

and

The latter gives (8.28).

Remark 8.20 (selection property). The crucial assumption

was only used in the proof of the ”only if” direction in order to show that assertion (8.30) holds. Indeed, it is immediately clear from this proof (see also [Kla00]) that may be replaced by the following selection property: Given a sequence

(put above ) with there exists a sequence of associated multipliers such that has an accumulation point

where In particular, Corollary 2.9 says that under one may take any sequence of associated multipliers. Further, the so-called constant rank condition guarantees the selection property, see [Kla00, Lemma 5]. In the case of fixed a linearly constrained program also fulfills the selection property; this will be shown in Lemma 8.29 below.

The Smooth Case

In specializing the result of Theorem 8.19 to we again suppose and hence This convention is used in the following discussion. For the multivalued term becomes (single-valued and) linear in and namely,

Explicitly, the term according to (8.7) is a matrix of the form

where all first and second-order partial derivatives are taken at So, (8.29) requires, writing for

for some and Next consider, for comparison, the following quadratic approximation of problem (8.26) at


where


similarly When specializing (8.7), the related matrix of Kojima’s function, assigned to (8.32), attains the form

where is the constraint function of the problem (8.32), and So, the derivative coincides with Moreover, at the problems (8.26) and (8.32) have the same set of dual vectors. By Theorem 8.19, this yields

Corollary 8.21. If

and MFCQ holds at then the derivative coincides for the problems (8.26) and (8.32).

Note that in the quadratic-quadratic program (8.32), the parameter now appears only linearly in the first-order terms with respect to

8.3.2 Local Upper Lipschitz Continuity

In this paragraph, we characterize local upper Lipschitz continuity of the stationary point maps X and of the parametric problems and respectively, supposing that the data belong to the class (or ) and MFCQ holds at the initial stationary solution. It turns out that the (generalized) second-order approximation developed in §8.2.2 carries over to stationary and optimal solutions, where the representation of CX given in the previous paragraph plays an essential role. Moreover, we discuss these second-order type conditions with respect to local minimizers. As above, we consider the parametric problems and according to (8.4) and (8.5), and we suppose at least that

Then, for the Kojima function M is Lipschitz on Note once more that equality constraints could be included without any problems; we have not done this for brevity of presentation. The presentation is based on the authors' papers [KK99a, KK01] and is related (in the case) to [Lev96, DR98, HG99, LPR00].

Two Illuminating Examples

To illustrate typical difficulties, we start with two interesting examples. The first example illustrates that even for smooth data standard first and second-order optimality conditions do not imply that the solutions behave locally upper Lipschitz. Some "stronger" second-order sufficient condition is needed. Note that this example was first given in [KK85]; it modifies a proposal made in [GT77].


Example 8.22 Minimize

subject to where is a parameter. Then the optimal solution set (= set of stationary points) is if and if In this example of a perturbed convex quadratic program, the solution set mapping is not locally upper Lipschitz. Note that MFCQ holds, and (0,0) is a strict local minimizer of order 2.

The next example illuminates an essential difference between problems and see Figure 8.1.

Example 8.23 (Ward [War94, Ex. 3.1]) Let Z denote the set of integers, and

define

by

where is a function, and is a strict local minimizer of order 2. For the derivative of at we have

It is easy to see that for each has two zeros in and


Hence, each of the intervals contains a local minimizer and a local maximizer of This example shows: for a function it may happen that in any neighborhood of a strict local minimizer of order two there are infinitely many other stationary points of this function. Hence, in particular, X = S is not locally u.L. at (0,0). For functions, unconstrained strict local minimizers of order 2 are automatically isolated stationary points. Under constraints, to get this property, one has additionally to assume that MFCQ holds, see Robinson [Rob82].

The latter example was originally given in [War94] to illustrate that some second-order optimality condition [Kla92] in terms of Clarke's generalized Hessian is not equivalent to strict local minimality of order 2. Since this example concerns the unconstrained minimization of a function, it applies also to "most regular" constrained problems.

Injectivity and Second-Order Conditions
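Before the injectivity conditions are taken up, the phenomenon of Example 8.23 can be reproduced numerically. The sketch below does not use Ward's function (its formula is not reproduced above) but a hypothetical substitute with the same qualitative behavior: f(x) = x^2 (1.1 + sin(ln|x|)) for x != 0 and f(0) = 0. This function satisfies f(x) >= 0.1 x^2, so the origin is a strict local minimizer of order 2, its derivative x (2.2 + 2 sin(ln|x|) + cos(ln|x|)) is Lipschitz near 0 but not continuously differentiable there, and the derivative changes sign in every neighborhood of the origin.

```python
import math

def f(x):
    if x == 0.0:
        return 0.0
    return x * x * (1.1 + math.sin(math.log(abs(x))))

def df(x):
    if x == 0.0:
        return 0.0
    L = math.log(abs(x))
    return x * (2.2 + 2.0 * math.sin(L) + math.cos(L))

def count_sign_changes(x_min=1e-12, x_max=1.0, samples=20000):
    """Count sign changes of df on a logarithmic grid in [x_min, x_max]."""
    log_min, log_max = math.log(x_min), math.log(x_max)
    changes, previous = 0, None
    for k in range(samples + 1):
        x = math.exp(log_min + (log_max - log_min) * k / samples)
        positive = df(x) > 0.0
        if previous is not None and positive != previous:
            changes += 1
        previous = positive
    return changes

sign_changes = count_sign_changes()
growth_ok = all(f(10.0 ** -k) >= 0.1 * (10.0 ** -k) ** 2 * (1.0 - 1e-12)
                for k in range(13))
print(sign_changes, growth_ok)
```

Each sign change of the derivative marks a stationary point different from the origin, and extending the grid toward 0 produces more of them, which illustrates why the stationary point map cannot be locally upper Lipschitz at such a minimizer.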

Combining Theorem 8.19 on the representation of CX and the Theorems 7.5 and 7.6 on representations of CF, we now easily obtain that the stationary solution set mappings X and are locally upper Lipschitz (locally u.L.) if and only if a restricted injectivity condition on the Kojima function F is satisfied. Given a stationary solution of we say that satisfies the injectivity condition for with respect to u if for all and all one has where is the set of multipliers associated with If the included parameters are obvious, we will also say that satisfies CF-injectivity w.r. to u. In the same sense, TF-injectivity w.r. to u will be used.

If the original problem (P) is a linear program, then one easily observes:
satisfies CF-injectivity w.r. to iff is the unique solution of (P)(0).
satisfies TF-injectivity w.r. to iff is, for each the unique solution of min s.t.

Theorem 8.19, specialized to X, says that for a stationary solution of (P),

The basic notation is used according to §8.1, in particular, is a fixed stationary solution of (P). Moreover, as defined in §8.2.2, consider the cones,

if

and

and and

if


For some given these cones coincide, see the discussion prior to Remark 8.14. We also recall the definition (8.24) of the polar cone of namely, with

Given any and any direction we write because in the following may vary.

Theorem 8.24 (locally u.L. stationary points). Consider the parametric programs and near near 0, and let be functions with respect to Suppose that satisfies Then the following properties are mutually equivalent.
(i) The stationary point map X of is locally u.L. at
(ii) The stationary point map of is locally u.L. at
(iii) satisfies the injectivity condition for with respect to u.
(iv) For each the C-stability system has no solution with
(v) For each and all one has

Proof. The equivalence of (iii) and (iv) is a direct consequence of Theorem 7.6 (see also Remark 8.14). Because of the product rule of Theorem 7.5

Theorem 8.19 applied to the mapping and (8.34) immediately yield the equivalence of (i) and (iii). The equivalence of (iv) and (v) follows from the second part of Corollary 8.16. In order to show the equivalence of (i) and (ii), we use, similarly to the proof of Corollary 8.13, the reformulation

where Since (ii) ⇒ (i) is trivially fulfilled, it suffices to prove the direction (i) ⇒ (ii). Suppose is locally u.L. at i.e., there exist such that

8.3. Stationary and Optimal Solutions under Perturbations


Assuming on the contrary that (ii) is not true, we conclude that there is a sequence with (and associated such that and

By MFCQ, Corollary 2.9 particularly yields that has an accumulation point By being so small that

Hence, if imply that for sufficiently large

is u.s.c. at and so, may be regarded

near

Then (8.36) and (8.38)

is satisfied, which contradicts (8.37) and so completes the proof.

Note. By a slight modification of the proof, one obtains that in the last theorem, MFCQ may be replaced by the selection property defined in Remark 8.20. In this form, the result was given in [Kla00].
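The upper Lipschitz property characterized in Theorem 8.24 can be illustrated numerically in the simplest setting. The following is only a sketch under stated assumptions: the problem is unconstrained and one-dimensional, the perturbation is the tilt min f(x) - a*x (an assumed form, since the displays are not reproduced here), and both test functions are hypothetical choices that do not come from the text. For twice differentiable f the stationary point map is X(a) = {x : f'(x) = a}, and the injectivity condition reduces to "f''(x̄)u = 0 implies u = 0". It holds for f(x) = x²/2 and fails at x̄ = 0 for f(x) = x⁴/4.

```python
import math


def stationary_point_quadratic(a: float) -> float:
    """Stationary point of x -> x**2/2 - a*x, i.e. the solution of x = a."""
    return a


def stationary_point_quartic(a: float) -> float:
    """Stationary point of x -> x**4/4 - a*x, i.e. the real solution of x**3 = a."""
    return math.copysign(abs(a) ** (1.0 / 3.0), a)


def upper_lipschitz_ratio(solution_map, a: float, x_bar: float = 0.0) -> float:
    """Quotient |x(a) - x_bar| / |a| for a single-valued solution map.

    The map is locally upper Lipschitz at (0, x_bar) exactly when this
    quotient stays bounded as a -> 0.
    """
    return abs(solution_map(a) - x_bar) / abs(a)


if __name__ == "__main__":
    for a in (1e-2, 1e-4, 1e-6):
        print(
            a,
            upper_lipschitz_ratio(stationary_point_quadratic, a),
            upper_lipschitz_ratio(stationary_point_quartic, a),
        )
```

In the quadratic case the quotient is constant, while in the quartic case it grows like |a|^(-2/3), so no Lipschitz rank can work near a = 0.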

Corollary 8.25 (second–order sufficient condition). Suppose the assumptions of Theorem 8.24 are satisfied, in particular, let be an element of If for each the condition

holds, then X and are locally upper Lipschitz at and respectively.

Proof. If X is not locally u.L. at then, by Theorem 8.24, there are some and a solution of

with This implies After scalar multiplication of the inclusion in (8.39) by and of the equation in (8.39) by we obtain By contraposition, this completes the proof.

Remark 8.26 (illustration by examples).

The injectivity condition of Theorem 8.24 is obviously not fulfilled in both Example 8.22 and Example 8.23. Choose in Example 8.22 the multipliers and and the nontrivial direction Then implies that (8.39) can be satisfied with and In Example 8.23 no constraints appear, and (8.39) reduces to which is trivially satisfied for by definition of
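The phenomenon behind Example 8.23 can be reproduced with a small numerical sketch. The function below is not the example from the text (whose formula is not reproduced here); it is a hypothetical function chosen for illustration, under the assumption that the function class meant in that discussion is C^{1,1} (differentiable with Lipschitz gradient). For x ≠ 0 put s = log|x| and f(x) = x²(1/2 + (2 sin s - cos s)/5), with f(0) = 0. Then f'(x) = x(1 + sin s) is Lipschitz, f(x) ≥ c x² with c = 1/2 - √5/5 > 0, and f' vanishes at the points x_k = exp(-π/2 - 2πk), which accumulate at the strict local minimizer 0. Unlike in the book's example, these extra stationary points are not local extrema.

```python
import math

# Growth constant: the minimum of 1/2 + (2 sin s - cos s)/5 over s is 1/2 - sqrt(5)/5.
C_GROWTH = 0.5 - math.sqrt(5.0) / 5.0


def f(x: float) -> float:
    """Hypothetical C^{1,1} function with quadratic growth at 0."""
    if x == 0.0:
        return 0.0
    s = math.log(abs(x))
    return x * x * (0.5 + (2.0 * math.sin(s) - math.cos(s)) / 5.0)


def df(x: float) -> float:
    """Derivative of f; Lipschitz because its slope 1 + sin s + cos s is bounded."""
    if x == 0.0:
        return 0.0
    return x * (1.0 + math.sin(math.log(abs(x))))


def stationary_point(k: int) -> float:
    """k-th positive zero of df; these zeros tend to 0 as k grows."""
    return math.exp(-0.5 * math.pi - 2.0 * math.pi * k)


if __name__ == "__main__":
    for k in range(4):
        x_k = stationary_point(k)
        print(k, x_k, df(x_k), f(x_k) / (x_k * x_k))
```

The printed quotient f(x_k)/x_k² stays above the growth constant, so 0 is a strict local minimizer of order 2 although it is not an isolated stationary point.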


Conditions via Quadratic Approximation

The quadratic approximations introduced in §8.2.2 with respect to the upper Lipschitz behavior of critical points to or programs are now studied from the viewpoint of upper Lipschitz behavior of stationary points. By Theorem 8.24, we may restrict our analysis without loss of generality to canonically perturbed programs.

Theorem 8.27 (quadratic approximations). Let be a stationary point of (P)=(P)(0), and suppose that satisfies MFCQ. Then, the following statements are equivalent:
(i) The stationary point map X of is locally u.L. at
(ii) For all the origin u = 0 is the unique point which, for some solves
For condition (ii) coincides with the following one:
(iii) For all the origin is the unique stationary point of the quadratic program

Note. To have, for a relation to the second quadratic approximation with we have to apply Lemma 8.17: Suppose then condition (iii) equivalently means that (iv) for all with defined according to Lemma 8.17, it holds that if is critical for and then

Proof. By Theorem 8.24, (i) holds if and only if for all one has that each solution of the C-stability system (8.21) satisfies Then the Lemmas 8.15 and 8.17 establish correspondences between solutions of (8.21) and the stationary points of the auxiliary optimization problems in such a manner that iff for the problems (8.23), and iff for Using Remark 8.14, so nothing remains to prove.

To derive an alternative form of characterization (iii) in the preceding theorem, we consider the function

Here is again a given stationary solution of (P)=(P)(0). Note that the set of multipliers, assigned to may be unbounded.

Lemma 8.28 (uniform sign of the Hessian form). Let and Suppose that for each the origin is an isolated stationary point of (8.40). Then has no zero in and so, by continuity, sign is constant on

Proof. Indeed, if for some then the related minimizer of over fulfills So we obtain stationary points of (8.40) arbitrarily close to the origin, a contradiction.

The previous two theorems particularly imply that under MFCQ at sign is necessarily constant on if the stationary solution set mapping X is locally u.L. at

Linearly Constrained Programs

For canonically perturbed programs under affine-linear constraints, our characterizations of the local upper Lipschitz behavior of X become simpler and hold without assuming MFCQ [Kla00]. To obtain this, one essentially uses that the selection property of Remark 8.20 for the multiplier mapping Y is automatically satisfied in this case:

Lemma 8.29 (selection property, linear constraints). Let A be an (m, n)-matrix, and consider the parametric program

and let be a stationary solution of this program for p = 0. Then, for any sequence with there exists a sequence of multipliers such that has an accumulation point

Proof. Because of the Lipschitzian one–to–one correspondence between critical points and KKT points, we may assume that is the (standard) Lagrange multiplier set, i.e., for

For and let if and if and for we put

Hence, for one has Now let be any sequence in gph X such that Then there are some and some infinite set such that


By Lemma 2.7 (Hoffman's lemma), the multifunction is Lipschitz on its effective domain Therefore,

Since is a closed multifunction, one has for all near

Hence, and the continuity of imply that Now, (8.42) applied to the sequence yields the desired result.

Theorem 8.30 (locally u.L. X, linear constraints). For the stationary point mapping X of the parametric problem (8.41) and given the following statements are equivalent:
(i) X is locally upper Lipschitz at
(ii) There exists at least one multiplier such that the system

has no solution with
(iii) The origin u = 0 is the unique point which, for some solves

Proof. By the note following Theorem 8.24, the characterization (i) ⇔ (iv) of that theorem also holds if MFCQ is replaced by the selection property for Y which is, by Lemma 8.29, automatically satisfied under linear constraints. Since does not depend on the multiplier Theorem 8.27 yields the claimed result (note again that in the proof of Theorem 8.27 the assumption MFCQ can be replaced by the selection property on Y).

The specializations to the case and to a second–order condition similar to Corollary 8.25 are now obvious; again MFCQ is not needed, and the phrase "for all multipliers" may be replaced by "for at least one multiplier". By the previous theorem, related results in [HG99] are covered.

8.3.3 Upper Regularity

In general, the local upper Lipschitz property of the stationary point mapping to a parametric program does not include persistence of the existence of stationary solutions. However, starting with a strict local minimizer of the initial problem, Theorem 1.16 on stability of complete local minimizing sets guarantees the existence of local minimizers of slightly perturbed problems under natural assumptions. In the present subsection, we derive sufficient conditions for the stationary or (locally) optimal solution set mapping to be locally nonempty-valued and upper Lipschitz (briefly, this is called upper regularity). Again we consider the parametric programs and (cf. §8.1), near near (0,0), and their stationary solution multifunctions and X, respectively. We suppose that some stationary solution of the initial problem is given, and are at least in where is a neighborhood of

Lemma 8.31 (upper regularity implies MFCQ). For the parametric program (P)(0, b), b varies near 0, let be a stationary solution of (P)=(P)(0,0). If is locally nonempty-valued and upper Lipschitz at then MFCQ holds at

Proof. By assumption, for some constant L > 0 and each sequence there exist some sequence if and sufficiently small, such that

if

for small Hence, one has that, with

holds for small positive Therefore, division by and passing to the limit of a suitable subsequence of yield

which means that MFCQ is fulfilled at

Upper Regularity of Isolated Minimizers

Denote by the set of all local minimizers of for fixed and put

where is a given nonempty subset of In the following lemma, we recall conditions (cf. [Kla86], and [Rob82] for the case) which ensure that

is fulfilled for all near and for some neighborhood of an initial minimizer In the case, Robinson [Rob82, Thm. 3.2] has essentially used the fact that a strict local minimizer of order 2 under MFCQ is automatically an isolated stationary solution. Example 8.23 shows that this is no longer true for programs; we have to make an extra assumption. Recall that is said to be an isolated stationary solution of if holds for some neighborhood of

Lemma 8.32 (u.s.c. of stationary and optimal solutions). For the parametric program P(t,p), let be a stationary solution of Suppose that MFCQ holds at with respect to Then one has:
(i) is u.s.c. at for some neighborhood of
(ii) If, in addition, is both a local minimizer and an isolated stationary solution to (P), then holds for some neighborhoods of and of Further, and are u.s.c. at

Proof. By MFCQ, we have from Corollary 2.9 and from persistence of MFCQ under small perturbations that there are neighborhoods of ( may be assumed to be compact), of and of 0 such that is u.s.c. on Hence, in particular, there are a compact set Z and some such that

It suffices to show that the multifunction is closed at Taking any sequence satisfying (with related and we have and the existence of some accumulation point of Hence, because is closed at This yields and so, (i) is shown. To show (ii), we first use that is an isolated point in Because of the MFCQ, then is also an isolated local minimizer. Thus, with from (i), there is a neighborhood satisfying

Since MFCQ persists under small perturbations, we may assume with no loss of generality that MFCQ holds at each point in near and so is a subset of for all near By Theorem 1.16, there exist a neighborhood of and a neighborhood such that is true for Hence, from (i) and (8.45), we obtain that and are also u.s.c. at Let again


Note that, in particular, the injectivity condition at for CF with respect to implies that is an isolated stationary solution of (P). Now, Theorem 8.24 and Lemma 8.32 immediately yield the following result.

Theorem 8.33 (upper regular minimizers, ). Consider the parametric program and let be a local minimizer of Then is locally nonempty–valued and u.L. at if and only if both MFCQ holds at with respect to and satisfies the injectivity condition for with respect to u. Further, if is locally nonempty–valued and u.L. at then has this property, too.

Remark 8.34 (isolated minimizing sets). In the case of replacing by an isolated compact set of local minimizers of (P), characterizations of upper regularity in the sense of Theorem 8.33 are still not known. However, under MFCQ on and under a growth condition of order imposed on with respect to an open bounded set containing local upper Hölder continuity of order for has been shown in the literature; details may be found in Klatte [Kla94a] for a general setting including Lipschitzian programs, for in Bonnans and Shapiro [BS00, Prop. 4.41] concerning optimization problems and Ioffe [Iof94] concerning Lipschitzian programs with fixed constraints. For the case results of that type have been well known since the 1980s, see, e.g., [Alt83, Don83, Aus84, Kla85, Gfr87]. Local upper Lipschitz continuity of holds under LICQ (on ) already if quadratic growth of with respect to is assumed; this was shown for programs first in [Sha88a], and for programs in [Kla94a] and [BS00, Thm. 4.81]. Related results under a different set of assumptions can be found in [Iof94]. In the case the mentioned condition reduces to that of Robinson [Rob82].

Remark 8.35 (some consequences of Theorem 8.33). Recall that, by Corollary 8.25, the second order condition SOCL implies that satisfies the injectivity condition for CF with respect to Hence, by Theorem 8.33, SOCL on and MFCQ at together ensure that and are locally nonempty–valued and u.L. at provided that is a local minimizer of (P). Below we shall derive second–order optimality conditions for programs which guarantee that a stationary solution of (P) is a (strict) local minimizer and also satisfies SOCL at some If is a global minimizer of (P), then one may replace in statement (ii) of Theorem 8.33 the local minimizing set mapping by the global optimizing set mapping provided that is locally bounded near that point.

We finish this paragraph by a complete characterization of upper regularity of the stationary solution set mapping for parametric programs, provided that is a local minimizer of (P). Because of Lemma 8.31, MFCQ may be supposed without restriction of generality. Note that the proof of the inclusion (i) (iii) in the following theorem essentially uses an idea proposed by Gfrerer [Gfr00].


Theorem 8.36 (upper regular minimizers, ). Let be a local minimizer of and suppose that , belong to a neighborhood of If satisfies MFCQ, then the following properties are equivalent to each other and imply that is locally nonempty-valued and u.L. at
(i) is locally nonempty-valued and u.L. at
(ii) is locally u.L. at
(iii) satisfies the quadratic growth property

(iv) SOCL is satisfied on i.e., for each

holds true.

Proof. Without loss of generality, let Then, in particular, KKT points and critical points in Kojima's form coincide, and L is the usual Lagrange function. Note that is a stationary solution of (P) due to MFCQ. The equivalence (i) ⇔ (ii) follows from Theorem 8.33 together with Theorem 8.24. The implication (iii) ⇒ (iv) is trivial, while (iv) ⇒ (ii) under MFCQ follows from Corollary 8.25. Put

If (iii) does not hold, then for some bd B and some one has Then, by a known second-order necessary optimality condition, there is some with hence, for some it follows Therefore (iv) is not true, and we have shown that (iv) (iii). It remains to prove that (ii) (iii) is true. Note that for fixed the function attains its maximum on in vertices of the bounded convex polyhedron Let Putting we define the continuous functions

If (iii) does not hold, then since otherwise already Lemma 8.28 yields the assertion. Further, the second-order necessary optimality condition used above gives By continuity of it follows for some fixed Since and this means

for some Hence, for all directions the directional derivatives of at are non-negative. Setting the latter means

So we obtain, for some compact, convex neighborhood of

and the minimax theorem ensures the existence of a multiplier such that Thus, is a stationary solution of the quadratic auxiliary program

By Theorem 8.27, this contradicts (ii), and so, (ii) ⇒ (iii) is shown. This completes the proof.

Corollary 8.37 (necessary condition for strong regularity, case). Let the assumptions of Theorem 8.36 be satisfied. If is locally single-valued and Lipschitz near then satisfies the strong growth property

Proof. In particular, is locally u.L. at each in some neighborhood of Considering for each the particular right-hand side perturbation with for and for we then have the result immediately from Theorem 8.36.

The opposite direction (under MFCQ) is not true, cf. the counterexample given by Robinson [Rob82]. From the proof of Theorem 8.36 and Corollary 8.37 one sees that the strong growth property already follows if only perturbations appear.

Second–Order Conditions for Programs

Second-order optimality conditions for programs were given in terms of Clarke’s generalized Hessian [HUSN84, KT88] and second-order (tangential) directional derivatives [War94] of the Lagrangian. Here we present necessary and sufficient optimality conditions in terms of the contingent derivative of


(cf. [Kla00]). In particular, this is of interest with respect to the regularity assumptions discussed in Remark 8.35. Throughout we consider a given stationary solution of the unperturbed problem and we again write and so on. Recall that the set is connected and compact. We denote by the constraint set of (P), and the abbreviation SMFCQ means the strict MFCQ.

Theorem 8.38 (local minimizer and quadratic growth, case). Consider program (P). Suppose that is a critical point of (P).
(i) If holds for some c > 0 and for each then there exists a neighborhood of such that for all the quadratic growth condition is fulfilled.
(ii) If is a local minimizer of (P), and SMFCQ is satisfied at then there holds for every the

Proof. Assertion (i) was proved in Theorem 6.23.

It remains to prove (ii). Since is a fixed multiplier vector, we write All constructions in the proof can be restricted to points in some neighborhood of Assume being small enough such that DL is Lipschitzian on with constant By assumption, SMFCQ holds at hence statement (i) of Corollary 8.16 and Gordan’s theorem of the alternative (see (A.8) or, e.g., [Man81b, Man94]) imply that MFCQ holds at even with respect to the constraint set

Let

be any vector with with no loss of generality suppose that Note that is the linearization cone of Hence, by the classical theory of constraint qualifications, there exists a sequence

and By the assumption of (ii), Thus, for sufficiently large

Since such that with

we have

is a local minimizer of (P). Note that there holds with some that

for all

Hence, there exist

by the mean value theorem. Using that and we find that some subsequence of converges to a limit which must belong to Then, after dividing (8.47) by and passing to the limit for the corresponding subsequence, we see that (8.46) and (8.47) imply This completes the proof of (ii).


Note. A local minimizer which satisfies the quadratic growth condition of Theorem 8.38 (i) is called a strict local minimizer of order 2 to (P). If is a strict local minimizer of order 2 to (P) and SMFCQ holds at then

for every this is easy to see from part (ii) of the above proof. We learn from Example 8.23 that the condition in (i) is not necessary for the quadratic growth property at However, statement (ii) of Theorem 8.38 can be considered as a compromise. Moreover, we mention that for a program, statement (i) of this theorem carries over into [Rob82, Thm. 2.2].
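For reference, the quadratic growth condition behind the notion of a strict local minimizer of order 2 can be written out as follows. Since the displays are not reproduced here, the notation is an assumption: f is the objective, M the constraint set of (P), x̄ the point in question, Ω a neighborhood of x̄, and c > 0 the growth constant.

```latex
% Assumed notation: f = objective of (P), M = constraint set of (P),
% \bar{x} = given point, \Omega = neighborhood of \bar{x}, c > 0 = growth constant.
\[
  f(x) \;\ge\; f(\bar{x}) + c\,\lVert x - \bar{x}\rVert^{2}
  \qquad \text{for all } x \in M \cap \Omega .
\]
```

In particular, f(x) > f(x̄) for every feasible x ≠ x̄ in Ω, which is why such a point is a strict local minimizer.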

8.3.4 Strongly Regular and Pseudo-Lipschitz Stationary Points

In this subsection, we are interested in characterizations of strong regularity and of the pseudo-Lipschitz property of stationary solutions if LICQ fails to hold.

Strong Regularity

A necessary condition for strong regularity when supposing that is a local minimizer of a program was given in Corollary 8.37. To find even a characterization, we know from Lemma 3.1 that a suitable representation of the Thibault derivative of the stationary solution set mapping might be helpful. Indeed, Lemma 3.1 and Exercise 5 imply that the stationary point map of the parametric program (P) is a locally Lipschitz function near if and only if both the Thibault derivative satisfies and X is a l.s.c. multifunction. This motivates looking for a suitable representation of TX in terms of the problem data for (P). It will turn out that under MFCQ, this is a more difficult task than the description of the contingent derivative CX. We were only able to give a limit representation of TX. It is an open question whether an explicit description exists. For the sake of simplicity, we again assume that holds for the given stationary solution of (P)(0). Hence, under MFCQ at the multiplier set is a bounded subset of

Theorem 8.39 (TX under MFCQ). Suppose that is a stationary point of (P)(0), satisfying MFCQ and Then there holds if and only if there exist and related points such that and with the vectors


fulfill In terms of original data, the latter means that satisfy

and

where

Proof. Let Then, for some sequence assigned sequences of parameters and appropriate o-type functions we have

and Due to MFCQ,

is compact, so there exist dual variables assigned to and respectively, such that

and, for some subsequence, the points and converge: Since and vanish, can be written with the help of

This yields

After division by (and after selecting an appropriate subsequence), the bounded matrices converge to some element while the term vanishes. Thus, the limit of

exists and satisfies

So we derived

Conversely, assume that such sequences have been found. Setting and one easily determines (by the above calculation) that


So, we know that and

Using also and

we obtain This yields by definition of TX. The second description follows by direct calculation from the first one. This completes the proof.

Remark 8.40 If then we obtain in Theorem 8.39 that

Notice that

and are assigned to the part in the explicit formula for the derivative TF, see Theorem 7.5. However, now is only bounded and, even more important, both and (near ) appear in the formula. For this reason, the description of TX(0, in terms of first and second derivatives of at causes difficulties even for arbitrarily smooth functions. We are not sure that such a description exists at all. For linear problems (P), the condition attains the form for some

On the other hand, the derivative TF presents a necessary condition for X being locally unique and Lipschitz.

Lemma 8.41 (TF-injectivity w.r. to u). Let X be locally single–valued and Lipschitz at Then, satisfies TF-injectivity w.r. to u.

Proof. Otherwise, it holds with some Then, by definition of TF, there are sequences and and related such that

Setting and

one sees that and vanishes. So X cannot be locally single–valued and Lipschitz near


Pseudo-Lipschitz Property

As before, we investigate the map at under MFCQ at Due to MFCQ, X is closed (near (0, )), so we know from Theorem 5.3 that X is not pseudo-Lipschitz at (0, ) if and only if satisfies the ("pseudosingularity") condition of Theorem 5.2. In the current case, this means

We close this subsection with a direct relation between this condition and the derivative TF.

Theorem 8.42 (TF and pseudo-regularity of X). Let fulfill MFCQ.
(i) If the constraints are (affine) linear then (8.48) yields
(ii) Let (8.48) be satisfied and Then the point (u*,v*,Au*) is a solution of the T-stability system (8.13) for some and (a,b) = 0.
(iii) If the constraints are (affine) linear, and satisfies TF-injectivity w.r. to u, then X is pseudo-Lipschitz at (0, ).

Proof. From Theorem 8.19, we have So the set coincides, by Theorem 7.6, with the set of all sums

such that and is restricted to the unit ball. Now we apply these equivalences to the points in (8.48) and take into account that the index sets and the allowed variations of in (7.32) depend on the selection of Similarly, and depend on and we will write Next observe that (8.48) and (8.49) imply

for all in the polyhedral cone Due to the possible variations of ) and in (if

by

and all in (if ), ), this restricts

(if (with


The conditions (8.50) define a polyhedral cone

and ensure

So, the crucial inequality of (8.48) becomes

along with (8.50), i.e., Selecting an appropriate subsequence, the three index sets in (8.50) are constant for all we denote them by Then if and only if

For the non-empty set

one so derives

We consider now the particular cases mentioned in the theorem. (i) If and all are affine-linear, then one can show that To do this, note that now and the matrices A do not depend on Next we verify that, for the components must vanish. Indeed, the relation means that Having for some then one finds, with appropriate that belongs to and fulfills for some a contradiction by definition of Thus, for all The latter yields Since is bounded, we conclude Multiplying finally with a MFCQ direction it follows as claimed. So the non-trivial vector vanishes, i.e., is impossible. (ii) Since

we have

hence (8.53) means

and (8.52) holds with too. For each point ( ) is a solution of system (8.13). (iii) If satisfies TF-injectivity w.r. to (ii), and from (i). So we obtain property must hold.

and

so the

then it follows from i.e., the pseudo-Lipschitz

8.4 Taylor Expansion of Critical Values

In the present section, we derive formulas for the Taylor expansion of the critical value function with second–order terms and supposing different


regularity properties of the critical point map. These results are understood as supplements to the well–developed theory of first- and second-order directional differentiability of the optimal value function, for which we refer to the books [Gol72, DM74, Fia83, Gau94, Lev94, BS00] with many references to the field.

8.4.1 Marginal Map under Canonical Perturbations

Again we regard the canonically perturbed standard problem

at some critical point for Supposing strong regularity, the critical points are locally unique and Lipschitz for small parameters, and so is the marginal map (or critical value function)

Under convexity and/or hypotheses, the structure of is well-known, for basic studies we refer to the literature just mentioned. Because and (of course, locally), we may apply chain rule (6.19) to determine the T-derivative of

The elements of the set are known by Theorem 7.6 and Lemma 6.1: One has iff

where and are defined according to Chapter 7. Recall that the set collects all satisfying (8.55) and (8.56), and is the corresponding set for the contingent derivative, where one has to replace in (8.55) by and in (8.56)

by In what follows, we make sure that exists and study the explicit form of the and Throughout this section, we often use the convention to write instead of { } if exists, the same for

Theorem 8.43 ( derivatives of marginal maps). Under strong regularity of F at a zero the map belongs to it holds


and

and moreover,

where

and

Note: Under strict complementarity of a problem and with this yields

and which is a singleton.

Proof. (Representation of ) We show that

Since we have by Lemma 7.7,

Therefore, independently of the choice of the set consists only of the element

i.e., the inner product of the negative KKT point and the direction under consideration. Since strong regularity is persistent, the same applies to small parameters Hence, is single-valued near the origin. So (see Exercise 14) locally exists and has the form (8.57). (Representation of Up to the term (in place of is where is given by the components of By Lemma 7.4, the terms form just the T-derivative of the proper Lagrange multiplier, i.e., (for small ). Taking into account that for each we can find sequences and which realize all limit-relations of Theorem 7.6 at once,


we obtain that all the representation of (Representation of

belong to is correct. By definition, we have

To see how depends on – which reduces to problems –, we use (8.55) and (8.56) in order to replace

So

for and

For the contingent derivatives of the same system (8.55), (8.56) is crucial after the replacements mentioned above. The formulas follow now by analogous arguments, and holds due to the definition of

In accordance with Theorem 6.20, the set provides us with a second-order approximation of near the origin: For fixed and it holds

with some Clearly,

has a cluster point

in

Thus, for some

Having Theorem 6.23 gives us a condition for growth at some point (like Corollary 6.21 for growth near a point via Corollary 8.44 (lower estimates). Under strong regularity of F at a zero it holds: (i) If then one has for sufficiently small,

(ii) If

inf then this estimate is locally persistent, i.e., there exists such that

whenever

and
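The first-order part of these expansions, where the derivative of the critical value function is the inner product of the negative KKT point with the direction (see the proof of Theorem 8.43), can be checked on a toy problem. The sketch below rests on assumptions that are not taken from the text: the canonical perturbation is assumed to have the form min f(x) - a*x subject to g(x) ≤ b, and the data f(x) = x²/2, g(x) = 1 - x are a hypothetical one-dimensional instance. For small (a, b) the constraint is active, the KKT point is (x, y) = (1 - b, 1 - a - b), and the critical value is (1 - b)²/2 - a(1 - b).

```python
def critical_point(a: float, b: float) -> tuple[float, float]:
    """KKT point (x, y) of min 0.5*x**2 - a*x subject to 1 - x <= b."""
    x_free = a  # unconstrained minimizer of 0.5*x**2 - a*x
    if 1.0 - x_free <= b:
        return x_free, 0.0  # constraint inactive, zero multiplier
    x = 1.0 - b  # constraint active
    y = x - a  # from stationarity of the Lagrangian: x - a - y = 0
    return x, y


def critical_value(a: float, b: float) -> float:
    """Objective value at the critical point of the perturbed problem."""
    x, _ = critical_point(a, b)
    return 0.5 * x * x - a * x


def critical_value_gradient(h: float = 1e-6) -> tuple[float, float]:
    """Central finite differences of the critical value at (a, b) = (0, 0)."""
    d_a = (critical_value(h, 0.0) - critical_value(-h, 0.0)) / (2.0 * h)
    d_b = (critical_value(0.0, h) - critical_value(0.0, -h)) / (2.0 * h)
    return d_a, d_b


if __name__ == "__main__":
    x_bar, y_bar = critical_point(0.0, 0.0)
    d_a, d_b = critical_value_gradient()
    print("KKT point:", (x_bar, y_bar))
    print("finite-difference gradient:", (d_a, d_b))
    print("negative KKT point:", (-x_bar, -y_bar))
```

In this instance the finite-difference gradient agrees with the negative KKT point up to rounding, so the derivative in direction (α, β) is -x̄α - ȳβ.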

8.4.2 Marginal Map under Nonlinear Perturbations

For parametric problems of the kind

the Kojima-function F depends on

For any

we denote by

the set of stationary points, the set of critical points. We assume that varies near 0, and Throughout this subsection we suppose at least that

where the latter means that for some whenever We consider and are interested in the interplay of the different regularity assumptions for obtaining more or less detailed characterizations of the map

Formulas under Upper Regularity of Stationary Points

Note that, by (8.59), but may be multivalued for Further, the set has non-empty images by (8.59). Let us first observe that, due to and again (8.59), the usual chain rule formula holds:

Indeed, to each sequence there corresponds, by (8.59), at least one approach direction (for a certain subsequence of Since every element of coincides with some and because holds with as

uniformly for


we obtain the inclusion The reverse inclusion becomes evident after writing any in limit-form. Having (8.59) and (8.60), we can apply all the arguments of §6.6.2 given there for the contingent derivative There, the hypothesis (6.63) had turned out to be crucial. In the present context, (6.63) attains the following form:

Theorem 8.45 ( for nonlinear perturbations I). Let and belong to and suppose (8.59) and (8.61). Then, for one has

where is the related Lagrangian.

Remark 8.46 In (8.62), is a linear map and approximates the (locally upper Lipschitzian) multifunction in the form

cf. Lemma A.3 and the subsequent remark. So can be identified with the Fréchet derivative of at the origin. Note that Theorem 8.45 even holds if are only functions (in particular, without supposing the smooth parameter dependence of (8.59)). This follows from [JMT86, Lemma 2.1], where Fréchet differentiability at the origin was proven straightforwardly when assuming the existence of a pointwise Lipschitz (at 0) selection of

Proof of Theorem 8.45. By Theorem 6.28 (i), it holds with

where the images of S are the critical points of problem P(0) Thus,

i.e.,

for some Since

Lemma 7.7 ensures

Again, this term does not depend on the selection of Hence (8.60) and (8.63) ensure (8.62).


Note. Writing canonical perturbations as and splitting into formula (8.62) yields for the corresponding marginal function that

this is again formula (8.57). Condition (8.61) is valid under several constraint qualifications which have been already discussed in Chapter 5. Let us consider two special cases of the foregoing theorem. (i) In particular, (8.61) holds true if is pseudo-Lipschitz at because the latter yields LICQ due to Lemma 7.1. Then, (8.63) is even valid as an equation. To see this, one can use the same arguments as under Theorem 6.28 (iii), now supported by Theorem 6.27. So, formula (8.62) holds true though the set of stationary points is not necessarily single-valued. (ii) If the Kojima function of the unperturbed problem is even strongly regular (i.e., is locally Lipschitz invertible), then the critical point mapping is Lipschitz near 0, and (8.62) is true not only for but also for parameters near 0, i.e.

In the present situation, we again easily see that

locally exists as a Lipschitz function.

Formulas under Strong Regularity and Smooth Parametrization

Suppose that P(0) is strongly regular at and hence the critical point map is locally single–valued and Lipschitz near Further suppose that

Then the Thibault derivative of can be computed via formula (6.19). Indeed, let us first put

where id denotes the identity mapping. Then, by (8.64),

Since G is locally Lipschitz and

it holds

8. Parametric Optimization Problems

We show that the crucial component of fulfills the natural chain rule with the local inverse of

Indeed, by putting

the function G assigns to the point the pair where is the critical point given by

Using

this is

In order to estimate let Then we obtain (uniformly) for the quantities of the norm of

on

and

So, since in fact

is locally Lipschitz by assumption and

vanishes, we obtain

The latter set is given by Theorem 7.6 and the rule of the inverse derivative. The same formula for the contingent derivative follows (under strong regularity) by completely analogous arguments; we omit the details.

Theorem 8.47 ( for nonlinear perturbations II). Let and belong to and suppose (8.65). If is a strongly regular critical point of P(0), then

Proof. With as shown above,

and

one has,

and

Hence, Theorem 6.8 yields

i.e., by using (8.67), the assertion for is shown. The second assertion follows analogously.

Note. Writing canonical perturbations as and splitting into and we have for and the term vanishes, and Further,

holds true.

Formulas in Terms of the Critical Value Function Given under Canonical Perturbations

One can directly compare with the contingent derivative of the marginal map of the canonically perturbed problem at

To do this we suppose again that is a strongly regular critical point of (P)(0,0), and is locally Lipschitz near We consider

and use that holds if and only if is a critical point of (P) for the perturbation Setting we may write

Here, depends also on but it holds for uniformly with respect to Due to strong regularity and the points converge to indeed. We thus obtain with some new that where Hence, Due to

this yields

By persistence of strong regularity, the obtained formula

then holds once more also for parameters near the origin. By Theorem 6.11, this yields the contingent derivative of as

and the set becomes

Note that only differentiability properties of F along with strong regularity played any role in the present context. So, in particular, we did not use that the parameter appeared only in the matrix M of the Kojima function F = MN.

Remark. The set is interesting if the original problem appears in a decomposition setting: and are parameters given by the "master" to some (or more than one) follower who solves his problem with primal-dual solution The objective of the master consists in minimizing a function with certain constraints concerning and In the simplest case, we have without constraints which yields a max-min problem, namely,

Clearly, the master is interested in (stationary) points where To show that such a point is a local solution, one may consider in all directions (or under constraints in the "feasible" directions at

Chapter 9

Derivatives and Regularity of Further Nonsmooth Maps

9.1 Generalized Derivatives for Positively Homogeneous Functions

Several practically important functions are positively homogeneous, e.g., Euclidean projections onto a closed convex cone, among them many NCP functions, the function which appears in Kojima functions and the directional derivative of a directionally differentiable, locally Lipschitz function. Injectivity of for the directional derivative then means: There is a unique and (globally) Lipschitzian assignment such that In the current subsection we will investigate the derivatives of such functions while, in the next one, those properties of NCP-functions will be studied that are important for solving the related NCP-equations, cf. Section 1.3. Accordingly, we suppose that is positively homogeneous, and want to determine the derivatives at the origin. First of all we observe that, for all and and positive

This tells us that the derivatives T and C are norm-invariant, i.e.,

Next, we immediately see that

In consequence, (9.1) along with closedness of yields

We will show that, for the set the following collection of difference quotients and

which is points of .

plays a key role. Let

denote the set of all

Lemma 9.1 (T (0) and D° for positively homogeneous functions). Let

be positively homogeneous. Then, it holds (i) (ii) If m = 1, one has

(iii) If

is simple at the origin, and

is pseudo-smooth then

Proof. (iii) This statement is a direct consequence of formula (9.1) and the

definition of D° (0) as being the set of all limits of sequences in

for

(i) To determine all limits L of terms

we put and distinguish three cases.

CASE 1: Then

CASE 2: Now and Without loss of generality,

we may assume that Then

and tends to

CASE 3: With the settings of case 2, we now obtain where Thus, In the cases 2 and 3, each pair of

and

can be written as

by suitable choice of and This yields Moreover, closed set which contains account that

is true due to (9.2). So as well as

is just the Taking into

we now obtain assertion (i). (ii) If the terms may be split into two groups and For group the origin is not contained in the line-segment S connecting and p. By the mean-value statement (6.38), we can write with some Because we get For group it holds Considering on S one easily sees that d is a convex combination of and Conversely, every such convex

combination can be written as a quotient follows from (i) that

Moreover, as a connected set in

So it

is convex. Hence, is contained in

Since the elements and (6.38)), the same holds for conv

belong to C (again due to This yields

So, (i), (ii) and (iii) have been verified. The proof of the simple-property is left as Exercise 15.

Exercise 15. Verify that positively homogeneous are simple at the origin.

Exercise 16. Show that, for the situation cl conv must be taken into account.
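As a small numerical illustration of the observations that open this subsection, the following Python sketch takes the Euclidean projection onto the nonnegative orthant as a sample positively homogeneous function (our choice, not an example worked out in the text). Under this assumption it checks that f(λx) = λ f(x) for λ > 0, that a difference quotient at λx with step t coincides with the one at x with step t/λ, and that every difference quotient at the origin in direction u equals f(u). All helper names are hypothetical.

```python
def project_orthant(x):
    """Euclidean projection onto the nonnegative orthant (a closed convex cone)."""
    return [max(xi, 0.0) for xi in x]


def difference_quotient(f, x, u, t):
    """Componentwise difference quotient (f(x + t*u) - f(x)) / t."""
    shifted = f([xi + t * ui for xi, ui in zip(x, u)])
    base = f(x)
    return [(a - b) / t for a, b in zip(shifted, base)]


x = [1.0, -2.0, 0.0]
u = [-3.0, 1.0, 0.5]
lam, t = 4.0, 0.25

# Positive homogeneity: f(lam * x) = lam * f(x) for lam > 0.
scaled = project_orthant([lam * xi for xi in x])
homogeneous = all(
    abs(s - lam * p) < 1e-12 for s, p in zip(scaled, project_orthant(x))
)

# Scaling the base point by lam only rescales the step size t to t / lam.
q_scaled = difference_quotient(project_orthant, [lam * xi for xi in x], u, t)
q_plain = difference_quotient(project_orthant, x, u, t / lam)

# At the origin the quotient does not depend on t at all: it equals f(u).
q_origin = difference_quotient(project_orthant, [0.0, 0.0, 0.0], u, t)

print(homogeneous, q_scaled, q_plain, q_origin)
```

The second check is the reason why generalized derivatives of a positively homogeneous function depend on the direction of the base point only, not on its norm.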

Difficulties for Compositions

We are now going to study for a composed function where we suppose that is positively homogeneous, and We even suppose that is on so only zeros of can cause difficulties. Clearly, it holds but none of our chain rules in Section 6.4 guarantees the equality. So the present function is a good example for discussing the related problems in detail. We have to regard all limits for certain

and

Setting we may write For we obtain the same is true if So let us turn to the crucial case of for all under consideration. With the possible limits L have been considered in the proof of Lemma 9.1. So we know that L depends on and

according to the cases considered under Lemma 9.1:

Therefore, we put

and select a (further) subsequence of

such that the limits

exist. The vector plays, for the sequence the role of a normalized approach direction. So the directional derivatives become important. We consider the simple cases first. (i) If the limits and are uniquely determined, namely,

This yields Hence and are uniquely defined by and So also L is well-defined by (9.3), (9.4) and (9.5). (ii) Let and Then Indeed, writing each component of z, say by using the mean value theorem, one sees that (iii) The crucial case consists of

So

yields and

and Now

is much greater than For this reason, the quotients

may have limits which depend on the high-order term In the last case, and cannot be written by means of the approach directions and first derivatives only. This makes the explicit computation of hard. To see that the unpleasant situation may really appear, let and Setting, e.g., and we get (which may be non-zero due to even for linear ) and Now depends on

9.2 NCP Functions

NCP-functions are functions :

satisfying

which is supposed for all functions appearing in this section. They are used in order to formulate the NCP (or the subsequent complementarity conditions in some more complex system)

cf. Section 1.3, as an equation

Because of the composed structure of and the difficulties for computing Th (as mentioned above) and similarly for its convex hull map the application of the derivatives and is more convenient in the present situation. Therefore, we will apply the results of Section 6.4.2 and suppose throughout that

We further recall that the NCP is said to be (strongly) monotone if

holds true, where is a fixed constant. A standard NCP is defined by Finally, let and denote the partial derivatives of on If then monotonicity yields (via and first-order approximation):

The same remains true (consider limits for in if the pairs are components of a Newton function i.e.,

In order to find some zero of several NCP functions can be (and have been) used, cf. [SQ99] for some overview. Necessary and desirable properties of may depend on but also on the method one is aiming to apply. So we will regard two principal possibilities of solving (9.6).

(i) minimize a so-called merit function, e.g.,

by a descent method or

(ii) solve (9.6) directly by a Newton method.
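To make the reformulation of complementarity conditions as an equation concrete, the following Python sketch checks, on a small grid, that two familiar candidates vanish exactly at the pairs (a, b) with a ≥ 0, b ≥ 0 and ab = 0. This is the standard defining property of an NCP function; the two functions (the minimum and a Fischer-Burmeister type function, written here with the concave sign convention) are illustrative assumptions, and the helper names are hypothetical.

```python
import math


def ncp_min(a, b):
    """Minimum function: piecewise linear, nonsmooth on the diagonal a = b."""
    return min(a, b)


def ncp_fb(a, b):
    """Fischer-Burmeister type function with the concave sign convention."""
    return a + b - math.hypot(a, b)


def is_complementary(a, b):
    """The pair satisfies a >= 0, b >= 0 and a * b = 0."""
    return a >= 0 and b >= 0 and a * b == 0


# Half-integer grid on [-2, 2] x [-2, 2].
grid = [0.5 * i for i in range(-4, 5)]


def characterizes_complementarity(g, tol=1e-12):
    """True if g vanishes on the grid exactly at the complementary pairs."""
    return all(
        (abs(g(a, b)) <= tol) == is_complementary(a, b)
        for a in grid
        for b in grid
    )


print(characterizes_complementarity(ncp_min))
print(characterizes_complementarity(ncp_fb))
```

A grid test of this kind is of course no proof; it only guards against sign or argument-order mistakes when a new candidate function is coded.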

Though combinations of both ideas are also possible, we study these cases separately because they require different properties of the NCP function We will see that, once these properties are satisfied, the concrete definition of plays a less important role.

CASE (i): Descent Methods

Having satisfies

the function

should ensure that

This is true if

As a second requirement, should imply The latter cannot be ensured for all problems, but at least for monotone standard NCP’s. Clearly, then has to be monotone in a certain sense, too. We call an NCP function strongly monotone if for all

with

Lemma 9.2 (NCP: minimizers and stationary points).

Let fulfill (9.10) and be strongly monotone. Further, let the NCP be monotone, and be a regular matrix. Then implies

Proof. Given

define

by (if

then put

Then

The first sum is non-negative by (9.8), the second one is positive iff
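The practical meaning of Lemma 9.2 is that a descent method which reaches a stationary point of the merit function has in fact found a solution of the NCP. The Python sketch below illustrates this on a two-dimensional monotone linear problem. Everything specific in it is an assumption made for illustration: the data M and q, the merit function ½ Σ g(x_i, F_i(x))² built from a Fischer-Burmeister type function g, and the plain gradient method with Armijo backtracking.

```python
import math

# Hypothetical data of a monotone linear complementarity problem:
# find x >= 0 with F(x) = M x + q >= 0 and x_i * F_i(x) = 0.
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-1.0, 1.0]


def F(x):
    return [sum(M[i][j] * x[j] for j in range(2)) + q[i] for i in range(2)]


def fb(a, b):
    return a + b - math.hypot(a, b)


def fb_grad(a, b):
    r = math.hypot(a, b)
    if r == 0.0:
        # The value is irrelevant here: it is multiplied by fb(0, 0) = 0.
        return 0.0, 0.0
    return 1.0 - a / r, 1.0 - b / r


def merit(x):
    w = F(x)
    return 0.5 * sum(fb(x[i], w[i]) ** 2 for i in range(2))


def merit_grad(x):
    """Chain rule: sum_i g_i * (dg/da * e_i + dg/db * (row i of M))."""
    w = F(x)
    g = [0.0, 0.0]
    for i in range(2):
        phi = fb(x[i], w[i])
        da, db = fb_grad(x[i], w[i])
        g[i] += phi * da
        for j in range(2):
            g[j] += phi * db * M[i][j]
    return g


def descend(x, max_iter=20000, tol=1e-14):
    """Gradient descent with Armijo backtracking on the merit function."""
    for _ in range(max_iter):
        val = merit(x)
        if val < tol:
            break
        g = merit_grad(x)
        gg = sum(gi * gi for gi in g)
        t = 1.0
        for _ in range(60):
            trial = [xi - t * gi for xi, gi in zip(x, g)]
            if merit(trial) <= val - 1e-4 * t * gg:
                break
            t *= 0.5
        x = trial
    return x


x_star = descend([1.0, 1.0])
print(x_star, F(x_star))
```

For these data the solution is x = (0.5, 0) with F(x) = (0, 1.5). The squared Fischer-Burmeister term is used because it is continuously differentiable, whereas the square of the minimum function is not.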

Notes

(i) For strongly monotone NCP’s, the same is true if is monotone in the weaker sense

because now (9.11), (9.12) and ensure and

(ii) For

one may replace in (9.8) and can define

Then

implies

and

by a Newton function as

by the same arguments.

(iii) Without supposing the smoothness (9.10), one may also replace the pair by pairs and comes to the same conclusion.

Knowing that implies all first order methods for minimizing a – or a – function may be applied. NCP-functions satisfying the assumptions of the lemma can be chosen arbitrarily smooth. Nevertheless, one may also apply methods of nonsmooth convex optimization (cf. [SZ88, SZ92, HUL93, OKZ98]) for minimizing

as long as points that

is sublinear and the NCP is monotone. Then we have at

and

hold true. Directions

satisfying

will just appear as Newton directions in the next subsection.

CASE (ii): Newton Methods

Having Newton’s method in mind, we require that the NCP function with the

Let us discuss these conditions.

satisfies

If for

in contrast to (9.13), then hence if both and So system (9.6) degenerates whenever strict complementarity does not hold. By (9.14), belongs to the simplest class of functions satisfying Condition (9.15) guarantees that is on and is at strictly complementary solutions as long as and are continuously differentiable, too. Condition (9.16) is consistent with the assumption of Lemma 9.2 and avoids singular derivatives of for strongly monotone NCP’s, cf. Theorem 10.6.

Properties and Construction of Let

such functions are called

functions. Due to (9.14), we have

Hence one easily derives that

In consequence, there is a positive lower bound for all gradient norms:

Moreover, and hold with certain Taking into account that is norm-invariant and locally bounded, we obtain the basic properties

The first equation is just

Examples of pNCP functions

(i) Put this is an often used concave standard function, or

(ii) and, to satisfy (9.16), change the sign of on

(iii) One can define via any norm of such that its unit sphere bd B is piecewise smooth, has no kinks at the positive axes and fulfills

and

Setting for bd B and for one easily infers that belongs to With the Euclidean ball, one obtains the strongly monotone concave function used e.g. in [Kap76] (for penalization), [Fis97] and [KYF97].

(iv) In addition, can be defined (and each can be written) by means of a real function with zeros at 0 and

where are the polar coordinates of

Then, by well known derivative transformations, for radius

at

In particular, the natural setting for

with the symmetric extension for

defines a function which satisfies, like for all the already mentioned conditions.

Lemma 9.3 (limits of ). For one has

Proof. We apply the polar representation of put and study the first limit for Due to (9.15), the function is near 0, so one may write

where Hence

Since (due to and we obtain the first assertion; the second one is left to the reader.

The previous lemma allows us to interpret a Newton step in terms of the original functions. The linearized equation (9.6), i.e.,

means with

i.e., where is only a vehicle for defining the coefficients

as

If is close to some point then the previous lemma tells us that the weight is large (near to some positive real) and is small. So, roughly speaking, we solve, independently of the concrete choice of

This is exactly the form of the Newton equation with where If is close to the symmetric situation appears. Finally, if holds at the solution then we cannot predict how the derivatives of the functions and will be weighted by for near So we have to hope that the Newton equation has still uniformly bounded solutions for all limits of as vanishes (”Newton-regularity”), whereupon we can define the weights at the point (0,0) by some of these limits. Similarly, we may proceed at other points of Depending on ”Newton-regularity” may be a more or less strong assumption. Though does not fulfill the requirements of Lemma 9.2, our Theorem 10.5 will indicate that it belongs to the best pNCP-functions in view of this regularity hypothesis for Newton’s method.
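The weighting effect described above can be observed numerically. The following Python sketch uses, as an assumed representative of the functions built from the Euclidean norm, the function g(a, b) = a + b − (a² + b²)^{1/2}. It checks that the gradient depends only on the direction of (a, b), that it tends to (0, 1) near the positive a-axis and to (1, 0) near the positive b-axis, and that the gradient norms stay away from zero. The function names are hypothetical.

```python
import math


def fb(a, b):
    """Assumed sample function: g(a, b) = a + b - sqrt(a^2 + b^2)."""
    return a + b - math.hypot(a, b)


def fb_grad(a, b):
    """Gradient of fb, defined for (a, b) != (0, 0)."""
    r = math.hypot(a, b)
    return 1.0 - a / r, 1.0 - b / r


# Norm-invariance: the gradient depends only on the direction of (a, b).
g_small = fb_grad(3.0, 4.0)
g_large = fb_grad(30.0, 40.0)

# Near the positive a-axis almost all weight falls on the second argument,
# near the positive b-axis on the first one.
near_a_axis = fb_grad(1.0, 1e-9)
near_b_axis = fb_grad(1e-9, 1.0)

# Smallest gradient norm over a fine grid of directions on the unit circle.
angles = [2.0 * math.pi * k / 3600 for k in range(3600)]
min_norm = min(
    math.hypot(*fb_grad(math.cos(th), math.sin(th))) for th in angles
)

print(g_small, g_large, near_a_axis, near_b_axis, min_norm)
```

For this particular function the smallest gradient norm is attained on the diagonal a = b > 0 and equals √2 − 1, so the Newton equation never receives two vanishing weights at the same time.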

9.3 The C-Derivative of the Max-Function Subdifferential

To illustrate the above settings and assumptions in the context of multifunctions, we study here the contingent derivative CF where is Clarke’s subdifferential (the generalized Jacobian) for a maximum function

We begin with some basic facts. It is well known that is the convex hull of all "active" gradients, i.e.,

So we have by dualization

Further, for all and in some compact set we obtain due to that

where Let

be any convex-combination of related active gradients

We denote by

the related set of weights

and put if Then,

and equality holds if Hence (9.26) ensures, for all in any compact set and with

So the convex combination plays, in view of lower estimates, the role of a second derivative. Reversing the role of and yields

Adding (9.28) and (9.29), one obtains a monotonicity relation

Contingent Limits

Now assume, in addition, that

With related coefficients and for some subsequence, the index sets are constant and converge, say Hence (9.28) and (9.29) can be applied with and (9.30) yields, up to terms of order

thus Moreover, since for small now (9.28), (9.29) and (9.30) hold as equations, so

and

and may be seen as second order derivative of in direction
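The basic fact recalled at the beginning of this section, namely that Clarke's subdifferential of a maximum function is the convex hull of the active gradients, is easy to evaluate in one dimension, where the convex hull is simply an interval. The following Python sketch does this for the assumed example f(x) = max{x², x}; the tolerance used to decide activity and all names are hypothetical.

```python
def active_gradients(fs, grads, x, tol=1e-12):
    """Indices of the functions attaining the maximum at x, and their gradients."""
    values = [f(x) for f in fs]
    fmax = max(values)
    active = [i for i, v in enumerate(values) if fmax - v <= tol]
    return active, [grads[i](x) for i in active]


def clarke_interval(fs, grads, x, tol=1e-12):
    """Clarke subdifferential of max_i f_i at x for scalar x.

    In one dimension the convex hull of the active gradients is the
    interval between the smallest and the largest of them.
    """
    _, g = active_gradients(fs, grads, x, tol)
    return min(g), max(g)


# Assumed example: f(x) = max(x**2, x), with kinks at x = 0 and x = 1.
fs = [lambda x: x * x, lambda x: x]
grads = [lambda x: 2.0 * x, lambda x: 1.0]

for point in (0.0, 0.5, 1.0, 2.0):
    print(point, active_gradients(fs, grads, point)[0],
          clarke_interval(fs, grads, point))
```

At the two kinks both functions are active and the subdifferential is a proper interval; elsewhere it reduces to the single gradient of the unique active function.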

The analysis of from the viewpoint of second order derivatives has been developed in [Roc88] and can be applied even to composed functions where is convex with polyhedral structure of the sets and F is cf. [Pol90], and to sensitivity analysis as well, cf. [LR95]. The crucial point for this approach consists in the fact that (not only Clarke’s subdifferential) can be interpreted as a composed map of the subdifferentials to the functions and

Indeed, active at } and

for some

briefly Here, is the usual subdifferential of a convex function, but must be a generalized one (since is not convex). The crucial question then consists in the validity and interpretation of the chain rule

In what follows, we determine directly, based on the property of F and Ekeland’s principle only, and show the relations between and the contingent derivative CX of the stationary point map in Chapter 8. This way we obtain, for a particular case, known statements (including the above chain rule). On the other hand, our proof is self-contained and does not require an extension of the tools by proto-derivatives, epi-convergence and approximate subgradients. In addition, we pay attention to those functions that are active at near up to order and show (cf. Corollary 9.6) that variations of F of this order keep invariant though the (proper) active index sets may switch around

Characterization of for Max-Functions: Special Structure

To determine several simplifications are possible. We may assume and since other than those with are not maximal for near and could be deleted. After adding a (symmetric ) quadratic function

and setting

we have

and

The latter follows easily from the definition of

and yields

Thus, we may assume that Finally, taking any with we must only investigate the situation which characterizes just singularity with respect to upper Lipschitz behavior of the stationary point

map at the origin, cf. §4.2 and §5.1.2 where local minimizer and/or convex function have been regarded. Therefore, we study the essential case of

Let us abbreviate

The normal cone of

at

is denoted by

Clearly,

Theorem 9.4 (particular structure of for max-functions). Under the settings (9.34), it holds if and only if both the direction u belongs to and there are an index set elements and with for such that, with

Before proving the theorem some comments are appropriate because the requirements (9.36) and (9.38) permit several equivalent descriptions. These comments basically apply facts from linear optimization.

Remark 9.5 (equivalent conditions). In Theorem 9.4, condition (9.36) may be replaced by

where

while (9.38) may be replaced by

Proof of Remark 9.5. First note that one may identify by removing other indices from J. Further, the condition is not essential for satisfying because it can be satisfied (keeping the rest valid) by

forming new

as

with large

Condition (9.36): Under the remaining conditions, the real numbers satisfy for all

So, (9.36) may be replaced by

because the value

in (9.36) fulfills

Condition (9.41) can be written (put

) by

Hence, by duality of linear programming,

Further, having solvability of problem (9.42) the point solution because feasibility is obvious and (9.38) yields

Thus, instead of (9.36), one may equivalently claim that solves

with optimal value . Then, the additional constraint (satisfied by ) does not change this condition and leads us to (9.39).

Condition (9.38): Again by duality, (9.38) means

which is (9.40).

and

is a nontrivial

Note that by parametric linear programming (via Hoffman’s lemma and the global Lipschitz property of the linear objective), (9.40) means exactly that there exists a constant such that

The latter will be important for the second part of the subsequent proof. After replacing (9.36) by (9.39) and (9.38) by (9.40), respectively, the explicit variables and disappear while and J are connected by the relations

Proof of Theorem 9.4. With

Let

the statement becomes trivial, so let By definition of

there are points

and

such that, for certain one has if and, in addition, Clearly, and depend on for the sake of simplicity, we avoid writing this explicitly. (Notice, in view of the following corollary, that the conclusions in this part of the proof are even valid if contains all such that i.e., must be only "approximately active" at ) Selecting an appropriate subsequence, the sets are constant: Further, convergence of the bounded elements may be assumed, say Next, each can be written as

where

Here, so

is the error of the first-order approximation of vanishes. In consequence, converges:

Due to

In addition, for

the limit

near

exists, too. This implies (9.37):

there are solutions

of the linear system

This yields solvability for the right-hand side

:

holds for certain Therefore, (9.37) and (9.38) are valid. We derive (9.36). Using second order expansion, we conclude that

where again Let Since we obtain

is maximal (or maximal up to ) and vanishes, for all Knowing that one has for all So we have and Next, considering any we obtain by (9.44) for

Therefore, the linear systems (in )

have solutions vanish. So also

for each under consideration. The right-hand sides

remains solvable, and every solution positive

fulfills (9.36).

If the assertion does not hold then, by formal negation, there exist and such that

whenever is small enough. By the well-known relations between and Clarke’s directional derivative cf. [Cla83] and Chapter 6, it holds

Selecting this yields for

with minimal Euclidean norm and setting

Therefore, one obtains from (9.45),

We will show that (9.46) cannot hold. Let the theorem. We fix and put

J and

satisfy the conditions of

(only important if such

Let be sufficiently small such that, if holds true. For any we put

exists)

is defined, small )

and apply

where

implies that small positive

If

then

(being smaller than for is not active for sufficiently So, in what follows, we must only regard

First we intend to show that, for sufficiently small

Due to

Again for small inequality

(from (9.37), (9.38)), it holds

we have

Hence the crucial

is valid. We are now able to apply (9.50) for proving (9.48). CASE 1 If is small enough, namely if

then (9.50) yields

and

The latter is (9.48). So let the opposite hold, and put

for the current

and

CASE 2

If

is small, namely if

we may use (9.43), i.e.,

in order to obtain again

CASE 3

Finally, if for some we estimate with the help of (9.47) and in order to deduce (for small not depending on ),

Summarizing, (9.48) is true, provided that is sufficiently small (depending on only). Next, setting

and

So, for the particular points

(9.47) ensures for all

we have

It remains to apply Ekeland’s principle for Using (9.48) and for on with Ekeland-point with

on

for small the point for small Let This ensures

is be a related

and

Clearly, now vanishes (as ). Moreover, by the construction of and the point belongs to the interior of for small So, the Ekeland- inequality yields necessarily for all

With small such that this contradicts (9.46) and proves the assertion.

Corollary 9.6 (approximations of high order). Suppose, for any index set J that and

and Then and the conditions of Theorem 9.4 are fulfilled with the given set J (though these are not active at in general).

Proof. By the first part of the proof to Theorem 9.4 we obtain the related conditions, the second one verifies

Note. The part of the proof indicates (via that, for every given sequence with certain it holds

This means that if

i.e., by definition,

then even is valid. Mappings F possessing a contingent derivative that satisfies (like this lim sup = lim inf-equation are introduced in [Roc88] as being proto-differentiable multifunctions. This property is a multivalued version of "simple" in §6.4.1 and turns out to be similarly useful for establishing chain rules (for the same technical reasons as for functions).

Characterization of for Max-Functions: General Structure

Theorem 9.7 (general structure of ). It holds if and only if there exist and and such that

and and equation holds for

Remark 9.8 (equivalent conditions). In the theorem, condition (iii) may be replaced by

where

and Condition (v) may be replaced by

Proof of Theorem 9.7. For characterizing

we apply the transformations between

and

generally, in (9.32): with

The quantities of Theorem 9.4 have now the form

Thus, the related conditions for

the real numbers

are as follows:

fulfill

These conditions are independent of the choice of in the equation Condition (9.38) becomes

and

in (9.36) attains the form

Substituting finally by

and using Remark 9.5, the theorem is verified.

Application 1

Let us compare the conditions of Theorem 9.7 with those which describe the contingent derivative CX of the stationary point map X for the in Section 8.2. The map is defined by the stationary points of the (canonically perturbed) parametric problem

where Since holds everywhere for (9.53), we know that, in terms of the related Kojima function F and with

Using the explicit form of CF (see Theorem 7.6), direction by (we write here instead of

for some solutions for

Here, to

is characterized

(for the definition see (7.32)), and the set has the form

denotes the second derivative of

of dual

with respect for all

To model the variations of

that correspond to

we put and

Then, follow. Therefore, for all with follow, too.

and (in consequence)

Corollary 9.9 (reformulation 1). For the stationary point map X of the program (9.53), the elements are exactly characterized by the conditions of Theorem 9.7, except for condition (iii).

Proof. Indeed, the explicit formula for CX attains the form

and there exist

and

satisfying and

So, the point in Theorem 9.7 and the current vector may be identified, while J becomes the set Here, we do not require because permits However, as already noted after Theorem 9.4, the condition is not essential for those satisfying The absence of condition (iii) arises from different variations of the same problem (9.54). The relation describes the existence of stationary points to problem (9.54) with for variations The relation stationary points

and describes the existence of

to problem (9.54) with

for variations

and

Here, the replacement of by does not change the stationary points However, the original constraints

must be satisfied up to error only. Thus, in comparison to the possible sets of active constraints may increase, and new stationary points may occur. However, the requirements and are equivalent in accordance with Corollary 9.6.

Application 2

Let be the mapping given by the stationary points for

and let F = MN be the associated Kojima function. Consider the contingent derivative at some stationary point

and suppose that

.

satisfies

To simplify the calculations, we further suppose that

In order to establish the correspondence between and the contingent derivative of Kojima’s function, introduce the functions

Then one may write

Here,

If then

certain

holds with weights is a Lagrange multiplier for

(put This yields:

if

fulfill, for and and

Thus, setting

and

one has

Corollary 9.10 (reformulation 2). Under assumption (9.55), the application of Theorem 9.7 to leads, via the transformation

to a particular solution

of the system

such that

in addition satisfies

with

and

(not with

Proof. Note that have with

and since

By Theorem 9.7, we

and

that

and

such that and

solves

with optimal value

and one also has

By (9.58), is a Lagrange multiplier vector for Since 1, we obtain from (9.58) that and This yields, due to

Because of

the conditions (9.57) and if

coincide. Using and noting that iff the latter is equivalent to (9.56). The subset of non-negative feasible points in (9.56) is just the set of Lagrange multiplier vectors for (without component Since we also have So (9.59) becomes:

The latter verifies the assertion.

Chapter 10

Newton’s Method for Lipschitz Equations

For computing a zero of a locally Lipschitz function several Newton-type methods have been developed and investigated (from both the theoretical and the practical point of view) during the last 20 years. They have been applied to variational inequalities, generalized equations, Karush-Kuhn-Tucker systems or nonlinear complementarity problems, see, e.g., [KS87, Kum88b, HP90, Pan90, IK92, Kum92, Rob94, Don96, BF97, Fis97, KYF97]. Accordingly, one finds various conditions for convergence of nonsmooth Newton methods (mainly written in terms of semismoothness) and many reformulations of identical problems by means of different equations. Especially for complementarity problems, a large number of so-called NCP functions have been applied in order to obtain such a description cf. In this chapter, we elaborate those properties of and related derivatives which are necessary and sufficient for solving by a Newton method, and we compare the imposed assumptions in terms of the original data. Before going into the details we suggest that the reader study Example BE.1, which indicates that Newton methods cannot be applied to the class of all locally Lipschitz functions, even if (provided that the Newton steps have the usual form at of the given function). We also mention the well-known real function for fixed which shows the difficulties if is everywhere locally except at the origin, and if is not locally Lipschitz.

10.1 Linear Auxiliary Problems

Newton’s method for computing a zero of (Banach spaces) is determined by the iterations

where is supposed to be invertible. The local superlinear convergence of this method means that, for some o-type function and near , we have

which is, after substituting

and applying

to both sides,

The equivalence between (10.1) and (10.2) is still true if one defines, in a more general way,

where is a given set of invertible linear maps. A method of this type is often said to be a generalized Newton method. The elements in (10.3) and in (10.2) now depend on the selected elements A. So we have to specify that the inequality in (10.1) should hold independently of the choice of Next suppose that there are constants and such that

Omitting the indices and setting , condition (10.2) now attains the equivalent form

the convergence

and yields necessarily, with

Conversely, having (10.6), i.e., for some then, via convergence

and (10.5), one obtains the

But (10.6) is condition (6.24) in §6.4.2: M has to be a Newton map of at So we have shown

Lemma 10.1 (convergence of Newton’s method - I). Suppose the regularity condition (10.4). Then, the method (10.3) fulfills the convergence condition (10.7) if and only if M satisfies (10.6). The latter means that M is a Newton map of at
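The content of Lemma 10.1 can be observed on a scalar example. In the following Python sketch (an illustration under our own assumptions, not an example from the text) the function f has a kink at its zero x* = 0, the map M assigns to each point the derivative of the piece of f that is used there, and the quotient |f(x* + u) − f(x*) − Au| / |u| with A taken from M(x* + u) is evaluated for shrinking u. The quotient tends to zero, which is the Newton map property, and the corresponding iteration converges rapidly from both sides.

```python
def f(x):
    """Locally Lipschitz, with a kink at the zero x* = 0 (slopes 2 and 1)."""
    return 2.0 * x + x * x if x >= 0 else x - x * x


def newton_map(x):
    """Single-valued selection: derivative of the piece used at x."""
    return 2.0 + 2.0 * x if x >= 0 else 1.0 - 2.0 * x


def residual_ratio(u, x_star=0.0):
    """|f(x* + u) - f(x*) - A u| / |u| with A taken at the point x* + u."""
    A = newton_map(x_star + u)
    return abs(f(x_star + u) - f(x_star) - A * u) / abs(u)


steps = [s * 10.0 ** (-k) for k in range(1, 6) for s in (1.0, -1.0)]
ratios = [residual_ratio(u) for u in steps]


def newton(x, n_steps=6):
    """Iterates x_{k+1} = x_k - f(x_k) / A_k with A_k = newton_map(x_k)."""
    iterates = [x]
    for _ in range(n_steps):
        x = x - f(x) / newton_map(x)
        iterates.append(x)
    return iterates


from_right = newton(0.5)
from_left = newton(-0.5)
print(ratios)
print(from_right)
print(from_left)
```

Here the quotient equals |u| up to rounding, so the o-type function is even quadratic, and the regularity condition holds because the selected slopes stay between 1 and 3 near the solution.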

Remark 10.2 If the conditions (10.4) and (10.6) hold true with

then the method converges for all initial points

satisfying

since, by induction arguments,

In the current context, the function may be arbitrary (even for normed spaces X, Y) as long as consists of linear bijections between X and Y. Nevertheless, we will suppose that is locally Lipschitz near This is justified by two reasons: If is only continuous, we cannot suggest any practically relevant definition for Having uniformly bounded and writing (10.2) implies that satisfies a pointwise Lipschitz condition at , namely

if is small enough such that Since the solution is unknown, our assumptions should hold for all near the solution. Then, the local Lipschitz property of (near the solution) follows necessarily from (10.8). Further, having uniformly bounded after applying

now (10.2) guarantees and

Therefore,

restricts

again in a canonical manner:

is locally upper Lipschitz at

Condition (10.6) can be found in various versions in the literature. Let be locally Lipschitz with rank L near and

If Clarke’s generalized Jacobian is a Newton map at , then is called semismooth at , sometimes – if is even quadratic – also strongly semismooth. In several papers, e.g. [Fis97], semismoothness has been defined in a slightly different way by using directional derivatives in place of

For condition (10.6) now follows from (10.10) due to the uniform approximation of by directional derivatives (see Lemma A1):

In other papers, M is a mapping that approximates and functions satisfying the related conditions (10.6) are called weakly semismooth. However, neither the relation between M and nor the existence of is essential for the interplay of the conditions (10.4), (10.6), (10.7) in accordance with Lemma 10.1. The main problem consists in the characterization of those functions which allow us to find a practically relevant Newton map. These function classes are not very big up to now. The biggest class of pseudo-smooth functions for which nonsmooth Newton methods have been successfully applied is, at least up to now and to our knowledge, the class of composed locally cf. §6.4.2.

10.1.1 Dense Subsets and Approximations of M

In finite dimension, the conditions (10.4) and (10.6) do not define M uniquely and need to hold only on a dense subset of a neighborhood of the solution. Indeed, let and assume (without loss of generality) that the function under consideration is upper semi-continuous. Then, in order to fulfill (10.4) and (10.6), it suffices to know some M such that (10.4) and (10.6) hold for all in a dense subset S of a neighborhood of Having this, one may define as Hausdorff limit and for

The map has non-empty ranges (for this reason, finite dimension was required), and fulfills (10.4) and (10.6) on by continuity arguments. Further, if M satisfies (10.4) and (10.6), then (10.4) holds for each and (10.6) holds for each map with

with


Finally, one may replace M satisfying (10.4) and (10.6) by another map as far as

In particular, consider

which permits to approximate the elements of

with accuracy

Remark 10.3 When using N instead of M, condition (10.4) is still satisfied with each The function in (10.6) changes only by Thus, the replacement (10.11) of M by N will not disturb locally quadratic (or worse) convergence.

Indeed, let

and

Then

yields

and

With some Lipschitz rank L of near functions and

This provides us

, inclusion (10.6) ensures, for all linear

for N.
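The effect described in Remark 10.3 can be observed numerically. The sketch below is an illustration under assumptions, not the construction (10.11) itself: for a smooth scalar function, the exact derivative is replaced by a difference quotient with step length |h(x)|, whose error is proportional to |h(x)| up to a constant depending on the second derivative. The function `cubic` and all names are hypothetical.

```python
def newton_with_inexact_slope(h, x0, tol=1e-12, max_iter=50):
    """Newton iteration whose slope is a difference quotient with step |h(x)|."""
    trace = [x0]
    x = x0
    for _ in range(max_iter):
        hx = h(x)
        if abs(hx) <= tol:
            break
        step = abs(hx)                      # accuracy of the slope ~ |h(x)|
        slope = (h(x + step) - hx) / step   # approximates h'(x) up to O(|h(x)|)
        if slope == 0.0:
            raise ZeroDivisionError("approximate slope vanished")
        x = x - hx / slope
        trace.append(x)
    return trace


def cubic(x):
    # Hypothetical smooth test function with the simple zero x* = 1.
    return x ** 3 + x - 2.0


trace = newton_with_inexact_slope(cubic, 1.2)
```

In this example the iterates still converge rapidly to the zero although no exact derivative is ever evaluated, which is the behaviour the remark predicts for approximations of this accuracy.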

Remark 10.4 Similarly, the related Newton equation (10.3) may be replaced

by

as long as (10.4), satisfied.

and

Note that, in this case,

is true with some constant C.

10.1.2 Particular Settings

If the following particular settings seem to be appropriate:

One could also define if the rows or one considers the sets

of A belong to

of all difference quotients (with unit vectors

are


and defines the entries In the latter cases, If is a

of A belong to

for all

is not necessarily true. generated by then one may put

This is the setting of one of the first papers on ”non-smooth Newton methods”, cf. Kojima and Shindoh, [KS87]. In each of these cases, it is easy to see that M is closed and locally bounded, and that (10.4) holds true if and only if Both conditions (10.4) and (10.6), however, are not fulfilled a priori. They require different properties of depending on the choice of M.

10.1.3

Realizations for

and NCP Functions

For locally functions Theorem 6.18 yields that is a Newton map, so only condition (10.4) becomes crucial. We consider the case of a complementarity problem with locally data in detail. Let and denote the related sets of by and We already know that the maps and are Newton maps. Given some element let denote its component. Further, let consist of all matrices A the rows of which satisfy

This map contained in the Cartesian product of all sets is a Newton map for the function as

cf. Theorem 6.18. Hence condition (10.4), namely the existence of remains the only problem for applying Newton’s method to the NCP-equation with Newton steps

Moreover, method (10.13) just means to solve the ”weighted equation”

Indeed, by (9.20), it holds at any

after setting


This equation is still valid for the limits the Newton equation


in

The latter ensures for

Theorem 10.5 (regularity condition (10.4) for NCP). Let

and let be a solution of the complementarity problem.
(i) If condition (10.4) is satisfied for the settings of method (10.13), then (10.4) is also satisfied for the special NCP function
(ii) Condition (10.4) holds true if the NCP is strongly regular at the solution.
(iii) Condition (10.4) is equivalent with strong regularity of the NCP at the solution if there is an arc in the of that connects the unit vectors of

Proof. Recalling (9.19), it holds for some So we see that the matrices A in (10.12) are regular iff so are the matrices with rows

where and

For

these rows have the form

and the coefficients form a subset By continuity arguments, it suffices to consider only for showing (10.4). So, (10.4) holds true if and only if all matrices (which form a compact set) are invertible. This condition is the weaker, the smaller the sets are. To study let The pairs vary in If we obtain that is near and Similarly, yields Now let Then is any limit of derivatives for in By norm-invariance of we conclude that So, the pairs vary in the whole set and

In the ”smallest case”, contains 0 and 1. This is just the situation for In the ”largest one”, the full interval [0,1] belongs to whenever Then, nonsingularity of all coincides, by Lemma 7.16, just with strong regularity of the NCP at . Clearly, having a (continuous) arc in which connects the unit vectors of the set is connected and contains 0 and 1. So the equation is in fact true for This proves the theorem.
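To make the Newton steps for an NCP equation concrete, here is a small sketch that uses the min function as NCP function, so that the equation reads min(x_i, F_i(x)) = 0 for each i. This is only one of the settings covered above, chosen for simplicity. The data `F`, the starting point and all names are hypothetical; the row selection (unit row if x_i < F_i(x), row of the Jacobian of F otherwise) is an assumed choice of matrix, and only local convergence near a solution with strict complementarity is illustrated.

```python
def F(x):
    # Hypothetical NCP data with solution x* = (1, 0) and F(x*) = (0, 2).
    x1, x2 = x
    return [x1 ** 3 + 2.0 * x1 + x2 - 3.0, x1 + 2.0 * x2 + 1.0]


def DF(x):
    # Jacobian of F.
    x1, _ = x
    return [[3.0 * x1 ** 2 + 2.0, 1.0], [1.0, 2.0]]


def solve_2x2(a, b):
    """Solve the 2x2 linear system a d = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0.0:
        raise ZeroDivisionError("singular Newton matrix")
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def ncp_min_newton(x0, tol=1e-12, max_iter=30):
    """Newton steps for h(x) = min(x, F(x)) (componentwise) = 0."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    x = list(x0)
    for _ in range(max_iter):
        fx = F(x)
        h = [min(x[i], fx[i]) for i in range(2)]
        if max(abs(v) for v in h) <= tol:
            break
        jac = DF(x)
        # Row i: unit vector if x_i is the smaller term, else row i of DF(x).
        rows = [identity[i] if x[i] < fx[i] else jac[i] for i in range(2)]
        d = solve_2x2(rows, [-h[0], -h[1]])
        x = [x[0] + d[0], x[1] + d[1]]
    return x


solution = ncp_min_newton([1.1, 0.1])
```

Far from the solution this undamped iteration may fail to converge, so the starting point is deliberately taken close to the solution.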


Theorem 10.6 (uniform regularity and monotonicity). Let the NCP be strongly monotone. Then, for and every matrix according to (10.12) is regular and fulfills, for each bounded set

where is the strict monotonicity constant of NCP, and p = p(g) is the constant from (9.19) taken with the max-norm. Suppose one finds (Euclidean sphere) such that

and some

Proof.

This corresponds to the fact that A is singular or the maximum-norm. By definition of it holds for certain

in terms of

We know that is a Newton function for at every fixed by Theorem 6.18. The strict monotonicity of NCP yields, setting This ensures, for small

and For any

and in particular for each we thus obtain

Hence

Let Due to the factors Further,

and have the same (non-zero) sign. So, the inequality ensures

Returning to (10.15) for the latter yields by (9.19), Therefore,

and taking into account that

implies

and

as asserted.


Taking any and as well as one sees that Theorem 10.6 fails to hold for a monotone standard NCP, since follows from and On the other hand, the theorem still holds without strong monotonicity of NCP whenever (10.16) remains true for some and some Moreover, if has locally Lipschitz derivatives on an open and dense subset (which is fulfilled for all the given examples of in Section 9.1) and if then one obtains quadratic convergence because in (10.6) now fulfills

of

10.2

The Usual Newton Method for Functions

Condition (10.6) also holds for all

if we put

The remaining condition (10.4) now means regularity of all matrices In this case, is obviously a strongly regular zero of each -function However, then one may apply the usual Newton method to any fixed generating function active at provided that (as usually supposed) is already close enough to the solution. Notice that this simplification is possible, if all generating functions are explicitly known. But the functions are needed anyway in order to find some element of and are known for many problems, e.g. for an NCP with or for polyhedral generalized equations with cf. Section 7.1.
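The simplification mentioned in this section can be sketched as follows. The example assumes a hypothetical function h = max(f1, f2) with two smooth generating functions that are both active at the zero x* = 0; the classical Newton method is then applied to the fixed piece f1 only. All names are illustrative assumptions.

```python
def f1(x):
    return 2.0 * x + x * x


def f2(x):
    return x - x * x


def h(x):
    # Piecewise smooth function generated by f1 and f2; h(0) = f1(0) = f2(0) = 0.
    return max(f1(x), f2(x))


def df1(x):
    return 2.0 + 2.0 * x


def classical_newton(f, df, x0, tol=1e-14, max_iter=50):
    """Usual Newton method for one fixed smooth generating function."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            break
        x = x - fx / df(x)
    return x


# Start at a point where f2, not f1, realizes the maximum; f1 is still used.
x_final = classical_newton(f1, df1, -0.3)
```

Although f1 does not coincide with h at the starting point, it is active at the zero and its derivative there is regular, so the smooth iteration finds the zero of h.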

10.3 Nonlinear Auxiliary Problems

Solving linear auxiliary problems is only one possibility of dealing with a Lipschitzian equation; many other approaches are conceivable. In this section, we consider linear and non-linear auxiliary problems which may be solved only approximately. In contrast to the previous sections, the existence of exact solutions (for the auxiliary problems) need not be required now. Let (normed spaces) be locally Lipschitz with rank L near let and let

be a mapping satisfying the general supposition


of this section. Having

The fixed parameter


we want to solve an inclusion of the form

prescribes the accuracy when solving

If (10.19) holds true, we call and exact solutions. One may identify with any (suitable) multivalued generalized directional derivative of at In particular, the settings of the former section are still possible,

where is a set of functions in Lin(X, Y). Notice that the existence of an inverse or even surjectivity are not explicitly required now. This is a realistic assumption for equations arising from control problems. Basic ideas on this topic can be found, e.g., in [Alt90].

Feasibility: We call the triple feasible if, for each there are positive and such that, whenever process (10.18) has solutions and generates iterates satisfying

To ensure feasibility of we will impose the following conditions for near which now replace (10.4) and (10.6) in the previous section. Condition (CI) (injectivity of the derivative).

fixed). Condition (CA) (condition for the approximation).

Considering only the directions by using namely,

in (CA) we get a weaker condition

Condition (CA)* (simplified condition CA).

This condition requires a good behavior of the ”directional derivatives” with respect to difference quotients including , provided that is positively homogeneous in the second argument:


Since


(CA)* implies, for all

i.e., It turns out that holds for many relevant settings of cf. Theorem 10.8.

10.3.1 Convergence

Based on (CA) and (CI), let us summarize the convergence properties for the current method (10.18).

Theorem 10.7 (convergence of Newton’s method - II).

(i) The triple is feasible if there exist and a function such that, for all the conditions (CI) and (CA) are satisfied. Moreover, having (CI) and (CA), let

Under this condition, the convergence can be quantified as follows: (ii) If

even satisfies

then and fulfill the requirements in the definition of feasibility. In particular, (10.18) remains solvable if (iii) If there exists a solution of (10.18) for every then provided that So (10.23) and hold for large (iv) If all are exact solutions of (10.18), then they fulfill with

from (CA) if

Note: The conditions (CI), (CA) and (CA)*, respectively, must be imposed for only.
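The role of the accuracy parameter can be illustrated by a scalar sketch. It assumes that an approximate solution u of the auxiliary problem at x only has to satisfy the residual bound |h(x) + A u| <= alpha |h(x)|, with A the slope of the active piece; this criterion, the test function `h` and all names are assumptions made for the example and are not quoted from (10.18).

```python
def h(x):
    # Hypothetical piecewise smooth function with zero x* = 0.
    return max(2.0 * x, x) + x * abs(x)


def slope(x):
    # Slope of the piece that is active at x.
    return 2.0 + 2.0 * x if x >= 0.0 else 1.0 - 2.0 * x


def inexact_newton(x0, alpha, tol=1e-12, max_iter=200):
    """Newton-type steps that solve the linear model only up to accuracy alpha."""
    history = []
    x = x0
    for _ in range(max_iter):
        hx = h(x)
        if abs(hx) <= tol:
            break
        a = slope(x)
        # Deliberately inexact step: the model residual equals alpha * h(x).
        u = -(1.0 - alpha) * hx / a
        residual = hx + a * u
        history.append((x, hx, residual))
        x = x + u
    return x, history


x_final, history = inexact_newton(0.5, alpha=0.1)
```

With a fixed accuracy such as alpha = 0.1 the iterates in this example still converge, but only linearly with a rate close to alpha; superlinear convergence would require the residuals to vanish faster.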


Proof.


Given

let

be taken as under (10.22), and let

(Preparation) First we apply (CA)* and (CI) to elements (CA)* and (10.17) yield

only. With

and (CI) ensures Thus, since

it follows

This yields that, for each

holds true. (ii, existence of a solution to Now let and be small enough such that (10.23) holds true. Then contained in meets the larger balls

Hence solves (10.18) for with the current Up to now we only applied the conditions (CA)* and (CI). (ii, estimate of any solution to Next, let be any solution of the auxiliary problem (10.18) at (CA) we observe

and by (CI), each

Because some inequalities

hold true.

belongs to

By

has at least the norm

this yields with (10.25) that the key


(iv, estimate of any exact solution) For exact solutions of (10.18), condition (CA) yields

so the estimates (10.26) and (10.27) hold even with

Thus, exact solutions and (iv) is true.

satisfy

and lead us to

so

(Note) Recalling our basic settings in (10.22), namely

inequality (10.27) provides (for exact and ”inexact” solutions as well) an estimate for namely,

So, our assumptions (CI), (CA) and (CA)*, respectively, have to hold for

only. (iii) Additionally, (10.27) ensures

Thus, all

are again in

and converge to

(i) and (ii) Finally, if also (10.23) holds true, then (10.27) ensures

So (ii) (and hence (i)) is valid as desired:

as asserted under (iii).


The proof also shows that, with the constants of the theorem, the following holds: Due to (10.24), the zero of is isolated. For and near the point is a solution of the auxiliary problem (10.18) satisfying

The full condition (CA) - not only (CA)* - was needed for showing that all solutions of (10.18) satisfy this estimate

thus they fulfill (10.28), too. Finally, using (ii), the inequalities (10.24) and ensure

So is decreasing provided that (depending on L and ) has been taken sufficiently small.

10.3.2 Necessity of the Conditions

Under several particular settings, the technical condition (CA) may be replaced by (CA)*. Theorem 10.8 (the condition (CA)). Suppose

and let Gh

denote any of the following generalized directional derivatives:

(Clarke’s Jacobian applied to (usual directional derivatives, provided that they exist) and if Then, the conditions (CA) and (CA)* are equivalent. Proof. If

where

is a set of linear functions, condition

(CA)* means

One obtains (CA) by adding

For

the proof follows from the subadditivity inclusion


cf. (6.10) in Section 6.2, and is left to the reader. Next we will only consider directional derivatives because the proof for is basically the same (one has only to select appropriate subsequences). Thus, let be directionally differentiable near We may suppose that in (CA)* is u.s.c. and decreases for Further, may be assumed. We have to show that, for arbitrarily fixed

Our assumption

where limits (for

allows us to write

and we have

as

Regarding the assigned

So we have to verify that

After setting

we notice that and

When computing the limit of the crucial quotients

the term Next, write

may be omitted because with

is locally Lipschitz.

and apply (10.31). Then,

For small it holds norm-decreasing and continuous,

and, since

and

is


Thus the limit lim to

(for vanishing or

belongs

This ensures the assertion (10.30) with a new function Using the approximation of a Lipschitz function by directional derivatives (if they exist near ), i.e., our Theorem 10.8 verifies the following statement: If then

This is an interesting additivity property of directional derivatives for functions. Having normed X and Y, the equivalence (CA) $\Leftrightarrow$ (CA)* for directional derivatives remains true (by the proof just given). However, when dealing with contingent derivatives and dim one needs a strong extra assumption that replaces the existence of Given any sequence there is always a (norm-)convergent subsequence of At the end of this section, we characterize the necessity of (CI) and (CA) for several relevant settings.

Theorem 10.9 (the condition (CI)). Suppose that and h(x*) = 0.

Let Then (CI) holds at holds for near is strongly regular at Having (CI), Condition (CA) is necessary and sufficient for being feasible. Let Then (CI) holds at holds for near non singular. This condition is stronger than strong regularity. Let Then (CI) holds at is locally upper Lipschitz at Let provided that directional derivatives exist near Then, supposing strong regularity, (CA) is necessary and sufficient for being feasible; supposing pseudo-regularity, (CI) is satisfied for near Proof. Let

The first assertions follow from Theorem 5.14 and closedness of To show the necessity of (CA) [for sufficiency apply Theorem 10.7], we may assume that Let We have to show that Using the inverse and subadditivity (6.10) of it holds


We select and such that Every solves Since belongs to Hence, due to feasibility, this yields as desired

where Let By Clarke’s inverse function theorem, we have nonsingular By Example BE.3 we see that the reverse statement does not hold, in general. The rest follows from closedness of Notice that the piecewise linear function of this example fulfills condition (CA); so the example is relevant in the present context. Let The statement follows immediately from Theorem 5.1. Let Let be strongly regular. Then (CI) is obviously true for near Moreover, under strong regularity one can show that is directionally differentiable iff has the same property (cf. Exercise 10). Then also holds - since the next equivalence is valid for contingent derivatives,

Because exists, there is always some satisfying this equation. Having feasibility of our triple (or superlinear convergence of the version), we obtain once more Accordingly, we write Now, to show that (CA) is valid, assume that Then, and

Since

is Lipschitz (with rank L) in the second argument, it follows with

which gives (CA)* and – by Theorem 10.8 – even (CA). Finally, if is only pseudo-regular at the assertion is true due to Theorem 5.12.


Chapter 11

Particular Newton Realizations and Solution Methods

After some problem has been modeled as a (nonsmooth) equation, the Newton steps for solving the latter induce particular iteration steps (actions) and regularity requirements in the original problem; for instance, compare equation (9.21) and the intrinsic equivalent equation (9.22). We study these actions and requirements for Karush-Kuhn-Tucker systems (KKT) of optimization models, and want to demonstrate their dependence on the applied Newton techniques and the corresponding reformulations. We will see that the related auxiliary problems (being linear or nonlinear equations a priori) describe solutions of quadratic optimization problems in all cases. So one can solve the auxiliary problems by several methods (of second or first order), in particular if certain numerical difficulties occur with the current one. In this way, connections to SQP-methods (sequential quadratic programming) become evident, but we also establish a bridge to methods of penalty-barrier type and will compare the hypotheses and actions in terms of the original problems. Our tool consists in studying certain perturbed Kojima systems that describe, in a uniform way, stationary points of assigned penalty or barrier functions close to an original solution So the approach permits solution estimates, based on regularity assumptions at In addition, it also makes clear (by using general properties of pNCP functions) that reformulations of the KKT complementarity condition by pNCP functions can always be modeled in the form of particularly perturbed Kojima systems with perturbations that depend on and only.

11.1 Perturbed Kojima Systems

We consider perturbations of the Kojima function to the problem

assigned

namely,

and study solutions of the system

Let us show that this system is closely related to penalty and barrier methods for problem (11.1). Note again that - in contrast to the common terminology in the literature - the whole auxiliary function (i.e., objective function + penalty/barrier term) is said to be a penalty/barrier function.

Quadratic Penalties

Suppose

Let If If

solve (11.2). Then we know:

then it follows then it follows

and and

Hence, we obtain in both cases is a stationary point of the penalty function

Conversely, if

is stationary for

for

then

and

i.e.,

with

for

solves (11.2).

Logarithmic Barriers

Let Now, the second equation of (11.2), implies feasibility of in (11.1). Let solve (11.2). Then: If If Setting

then then

and and we thus observe

does not appear in the Lagrangian.


Hence, the point


has the following properties. It is feasible for (11.1), fulfills and is stationary (not necessarily minimal !) for the function

Conversely, having some

with the latter properties, the point

with

and

solves (11.2). For the terms see that

coincide with

Accordingly, the current point function

So we

is also stationary for the logarithmic barrier

Theorem 11.1 (perturbed Kojima-systems). Under the above settings, zeros of the perturbed Kojima equation (11.2) and critical points of the well-known auxiliary functions correspond to each other. Under strong regularity of (11.1) at a critical point (x*,y*) of F, the solutions of (11.2) are, for small locally unique and Lipschitz. So, it holds

Proof. This follows directly from the given transformations and Corollary 4.4,

since the maps

are small Lipschitz functions in the

Remark 11.2 (modifications). The inequality (11.3) now compares solutions of different methods in a Lipschitzian manner. Further, one may mix the signs of the and obtains similarly stationary points for auxiliary functions containing both penalty and barrier terms. So, given some initial point it is quite natural to put if

and

if

Moreover, for critical points which are not locally unique, the same arguments including Corollary 4.4, present estimates of under pseudoregularity of F at or ensure estimates of the difference under the upper Lipschitz property of at even if and are only functions.


If problem (11.1) includes also equality constraints with related duals z, then additional perturbations of the type

change the functions

and

only by additional terms

Concerning other auxiliary problems and more details we refer to [Kum95b, Kum97].
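The Lipschitz dependence of penalty and barrier solutions on the perturbation parameter can be checked on a toy problem. The sketch below makes several assumptions that are not taken from the text: the problem is min (x - 2)^2 subject to x - 1 <= 0 (with KKT point x* = 1, y* = 2), the quadratic penalty function has the common form f(x) + max(0, g(x))^2 / (2t), the logarithmic barrier function has the form f(x) - t log(-g(x)), and t > 0 is used in both cases, so the sign conventions may differ from those of (11.2). All function names are hypothetical.

```python
def bisect_increasing(fun, lo, hi, iters=200):
    """Zero of an increasing function with fun(lo) < 0 < fun(hi), by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if fun(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def penalty_stationary_point(t):
    # Derivative of (x - 2)^2 + max(0, x - 1)^2 / (2 t).
    def grad(x):
        return 2.0 * (x - 2.0) + max(0.0, x - 1.0) / t
    x = bisect_increasing(grad, 0.0, 3.0)
    y = max(0.0, x - 1.0) / t          # multiplier estimate
    return x, y


def barrier_stationary_point(t):
    # Derivative of (x - 2)^2 - t * log(1 - x) on the interior x < 1.
    def grad(x):
        return 2.0 * (x - 2.0) + t / (1.0 - x)
    x = bisect_increasing(grad, 0.0, 1.0 - 1e-12)
    y = t / (1.0 - x)                  # multiplier estimate
    return x, y


results = [(t, penalty_stationary_point(t), barrier_stationary_point(t))
           for t in (1e-1, 1e-2, 1e-3)]
```

In this example the penalty points approach x* from the infeasible side and the barrier points from the interior, and in both cases the distance to (x*, y*) is bounded by a constant times t, in line with an estimate of the type stated in Theorem 11.1.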

11.2

Particular Newton Realizations and SQP-Models

Let us assume that in Section 10.1 coincides with Kojima’s function assigned to our standard optimization problem (11.1),

For deriving relations to we suppose Then F is a function, and all the mentioned derivatives are Newton maps (or satisfy condition (CA)). Again, we omit additional equality constraints only for the sake of brevity. Depending on the choice of a Newton map M (or of a generalized derivative) we investigate the kind of the related auxiliary problems and the meaning of the (Newton-) regularity condition (10.4), imposed for points near a zero In all subsequent cases, we assume that is the current iteration point and describes the movement defined by the Newton step. It will be seen that is a solution (primal-dual) of some quadratic optimization problem. So the (generalized) Newton-methods are at the same time and differences between them arise from the different approximations of in the related Newton equations. We are now going to study this interrelation for particular settings.

Case 1:

Apply the usual Newton method to any fixed generating function being active at the initial point

of F


The functions


are defined by index sets

Here, we assigned, to otherwise. The initial set

as

the function has to be active at

if

and

if

if

If we may fix any of the two alternatives. Because during all steps, the iterations require

The equations related to

for

and i.e.,

remains fixed

have the form

and does not appear in any other equation. Thus, we have to solve the problem by linearization of the related system at Condition (10.4) requires regularity of the Jacobians for all S, active at This is strong regularity of all related problems at the solutions So condition (10.4) is weaker than strong regularity of the original problem at Case 2:

With the Kojima-Shindoh approach (see §10.1.2), one selects some set S being active at the current point and makes next a Newton step based on (changing) S as above. The condition (10.4) is the former one. The method is a classical ” active index set” algorithm. Case 3:

Applying the generalized Jacobian (= TF, since may take any matrix cf. (7.49), for the Newton step

) one

Condition (10.4) requires more than above, namely just strong regularity of problem (11.1) (or of F) at


We study the Newton steps for the original Kojima system and the perturbed equation (11.2) at once by considering any and dealing with the Newton equation for

Recall that this setting represents a mixed penalty-barrier approach, cf. Section 11.1, for solving (11.1). Let and be fixed. Practically, may depend on (in each step). To obtain locally superlinear convergence, it suffices to ensure that

e.g., cf. the Remarks 10.3, 10.4 concerning approximation (10.11) in Section 10.1 and notice that not only but also the original Kojima function F has been changed by We abbreviate and Given cf. (7.28), we put

in accordance with the T-derivative of Below, will stand for so the Lagrangian does not depend on (in contrast to the case 4 following next). Finally, put If then statement are zero.

hence

and our weights

in the next

Lemma 11.3 (Newton steps with perturbed F). In the current case, a Newton

step (11.4) means to find a KKT- point

where

and

of the problem

The vector v in (11.4) is then given by

and

Proof. The linearized equations

require (equivalently), by the product

rule given in Corollary 6.10,

i.e., and


By substituting

in the linearized equation


i.e., in

and setting one obtains

For

we have If So the

we have equation becomes

Indeed, if and

we know that

and has the form where

This proves the assertion.

Note. The case of (no constraints in problem (11.5)) can be easily forced by setting and whenever Let Then, if the weights are just the penalty factors. For and all choices of are allowed. So may attain all non-negative values. Let If now is negative, and stationary points are not necessarily minimizers of problem (11.5). If it holds

Case 4:

Application of NCP functions. To solve the KKT-system of (11.1) with the help of some function, require the usual Lagrange condition (without !)

and write the remaining conditions as


Using the derivative D°G in accordance with (10.12) we have to solve

with

Let

Now the Newton equation has again the form of case 3, only identified for and L stands for the usual Lagrangian.

and

must be

Lemma 11.4 (Newton steps with pNCP). In the current case, a Newton step

means to find a KKT-point

where then given by

of problem (11.5)

and

The vector v in (11.6), (11.7) is and

Note. It holds

coincide with

Proof. Since

Replacing

and non-zero coefficients of case 3 after setting

and

(11.7) yields

in (11.6) we obtain

So the equivalence follows by the same arguments as under case 3. For

now

is possible. Further, the convergence and

if

So the method realizes basically a penalty approach.

yields both


Case 5:

Perturbed generalized Jacobians and unperturbed Kojima function. Let the Newton step be given by

where belongs again to the perturbed equation (11.2), We are now using approximations of the (unperturbed) Newton map which is justified as long as we select (assigned to ) in such a way that cf. the Remarks 10.3 and 10.4 Compared with case 3, the terms do not appear, and the above proof leads us via directly to the modified objective

All the other conclusions of case 3 remain true after setting

i.e.,

This way one obtains

Lemma 11.5 (Newton steps with perturbed ). In the current case, a Newton step (11.8) means to find a KKT-point of the problem

where

and

The vector v in (11.8) is then given by and

Note. In comparison with (11.5) now the first derivative of the full Lagrangian

appears in the objective. Setting particularly and selecting with if we obtain as well as: A Newton step (11.8) means to find a stationary point of

where The vector is then given by


Case 6:

Solving auxiliary problems of Wilson- type, means to apply Newton’s method by using directional (or contingent-) derivatives of F:

The solutions fulfill the same conditions as in case 3 since The structure of cf. (7.29) implies additionally that and So the constraints may be written as inequalities. Since it holds (directional derivative). Now we have to solve the system

where again

With the transformations (7.30) the conditions become

for

for

and

and for

The left side in (11.12) is

Therefore, we are solving a linear complementarity problem, and since and for the solutions are the critical points of the quadratic problem (now with inequality constraints) min s.t.

where

for for

The vector

is then given by for for

So we are applying, as before, a method of sequentially quadratic approximation. Basically, the present one is Wilson’s method which has been originally developed under the strict complementarity assumption, and the strong sufficient second order condition (i.e., positive definiteness of the Hessian on the space). We investigate condition (CI). Let S have the same meaning as in case 1. Recall that S is active near if holds on some neighborhood of We put is active near

for certain


Lemma 11.6 (condition (CI) in Wilson’s method). Condition (CI) for method (11.11) means equivalently that

is regular for each

Proof.

Assume (CI) holds true. We show first

Indeed, for

If S is active near

the terms

in (11.11) satisfy

then from

it follows

Conversely, we show that (11.13) ensures (CI). The set of the points under consideration in (11.13), is dense in (shown in the proof of Lemma 6.17). We thus conclude (by continuity arguments) that all for satisfy Formula (6.33) in Section 6.4.2 now yields and ensures (CI) because of inf

So (CI) and (11.13) are equivalent. Taking into account that for and small condition (CI) can be equivalently written as regularity of all for


Chapter 12

Basic Examples and Exercises

12.1 Basic Examples

Example BE.0

A pathological real Lipschitz function (lightning function). We present a simple construction of a special real Lipschitz function G such that F.H. Clarke’s subdifferential fulfills The existence of such functions has been clarified in [BMX94]. It will be seen that the following sets are dense in the set G is not directionally differentiable at }, the set of local minimizers, and the set of local maximizers. To begin with, let be any affine-linear function with Lipschitz rank L(U) < 1, and let As the key of the following construction, we define a linear function V by if U is increasing, otherwise. Here, and denotes the step of the (later) construction. Given any we consider the following 4 points in

By connecting these points in natural order, a piecewise affine function


is defined. It consists of 3 affine pieces on the intervals

By the construction of V and

it holds provided that

is small.

After taking in this way, we may repeat our construction (like defining Cantor’s set) with each of the related 3 pieces and larger see Figures 12.1 and 12.2. Now, start this procedure on the interval [0, 1] with the initial function and

In the next step we apply the construction to the 3 pieces just obtained, then with to the now existing 9 pieces and so on. The concrete choice of the (feasible) is not important in this context. We obtain a sequence of piecewise affine functions on [0,1] with


Lipschitz rank < 1. This sequence has a cluster point in the space C[0,1] of continuous functions, and has the Lipschitz rank L = 1. Let has a kink at } and N be the union of all If

then the values will not change during all forthcoming steps Hence The set N is dense in [0,1]. Connecting any three neighboring kink points of and taking into account that these points belong to the graph of one sees that has a dense set of local minimizers (and maximizers). Further, let D be the dense set of all centre points belonging to some subinterval used during the construction. Then each is again a centre point of some subinterval for each step with sufficiently large Thus, is again true. Moreover, for arbitrary one finds points such that and

as well as

namely the nearest kinks of on the right side of where is (large and) odd or even, respectively. This shows that directional derivatives cannot exist for In addition, by the mean-value theorem for Lipschitz functions [Cla83], one obtains To finish the construction, define G on by setting where integer denotes the integer part of Needless to say, G is also nowhere semismooth.

Derived functions: Let Then for all is strictly increasing, has a continuous inverse which is nowhere locally Lipschitz, and is not directionally differentiable on a dense subset of In the negative direction – 1, is strictly decreasing, but Clarke’s directional derivative is identically zero. The integral

is a convex

function with strictly increasing derivative and

such that

for all in a dense set

holds true.

Example BE.1

Alternating Newton sequences for real, Lipschitzian

with almost all initial


points. To construct integers

consider intervals

for

and put (the center of (the center of

In the

define the points

to be the linear function through and ( 0),

i.e., where Similarly, let the points

be the linear function through and ( 0),

) ).


i.e., where Evidently,

at

at if

Now define for and

We finish the construction by setting

and

as

if

for

The related properties can be seen as follows: For one obtains and The assertion can be directly checked. Again directly, one determines the global Lipschitz rank On the left side of the interval coincides with on the right with Since coincides with on a small neighborhood of the center point Now, let us start Newton’s method at some Then the next iterate is some point There, it holds (or for negative arguments). Hence, the method generates the alternating sequence

Example BE.2

A function which is one of the simplest nonsmooth, nonconvex functions on a Hilbert space. Pseudo-regularity of the map can be easily shown. However, the sufficient conditions of Section 3.3 in terms of contingent derivatives and coderivatives will not be satisfied. Let and

Now is the level set map of a globally Lipschitz functional. Since is concave the directional derivatives exist everywhere. Further, is monotone with respect to the natural vector ordering, and is nowhere positive.

(i) The mapping F is (globally) pseudo-regular, e.g., with rank L = 2. Indeed, if and there is some such that Put where is unit vector in Then, pseudo-regularity follows from and since

(ii) Next we are going to show that, at each with

it holds

in spite of (uniform) pseudo-regularity of F. We show even more:


(iii) If for certain and bounded say for then necessarily depends on and there is no (strong) accumulation point of In fact, by the choice of we have and for some Due to the latter inequality can never hold for if is bounded. So one obtains for an infinite number of components. Assuming to be fixed, this yields the contradiction Hence depends on Assuming convergence for certain we obtain again a contradiction, namely for certain though Finally, we consider

(iv) The point Clearly, Setting (3.14) for provided that we have

yields of F. is an

to gph F around

due to the condition requires and

are small, say But then and condition (3.14) becomes

Since is small, this condition is always true, so our assertion is valid. Further, since was arbitrary, we may put as in order to obtain (weak*). Thus, (v) For the points from (ii) and for weak* or strong accumulation point of around with To verify this, we show that Due to condition (3.14) particularly requires that

there is no to gph F

holds for small and all With and small this implies It remains to consider negative Then we select with large such that Now (12.2) yields the assertion (setting and using that has Lipschitz rank 1), hence


While (iv) says that D*F(0,0) is not injective, the property (v) indicates that the sets for tell us nothing about at points from (ii).

Example BE.3

Piecewise linear bijection of On the sphere of Put

with

let vectors

and

be arranged as follows:

and notice the following important properties:

(i)

(ii) The vectors and turn around the sphere in the same order.

(iii) The cones generated by and and generated by and are proper.

Let

be the unique linear function satisfying and Setting if we define a piecewise linear function which maps onto By the construction, is surjective and has a well-defined inverse; hence it is a (piecewise linear) Lipschitzian homeomorphism of Moreover, on int and on int Thus, contains the unit matrix E as well as –E and, by convexity, the zero matrix, too.

Example BE.4

A piecewise quadratic function having pseudo-Lipschitzian stationary points that are not unique. We put

in polar coordinates,

and describe

as well as the partial derivatives

over the 8 cones

by

and on the remaining cones is defined as in Studying the of the sphere, it is not difficult to see (but needs some effort) that is continuous and is pseudo-Lipschitz at the origin. For there are exactly 3 solutions of

Example BE.5

A Lipschitz function such that directional derivatives nowhere exist, neither as strong nor weak (pointwise) limits; and contingent derivatives are empty. For

define a continuous function

by

The mapping is a Lipschitz function from the interval into C[0, 1]. For small consider the function

If

then

and Hence, the limit (as ) cannot exist in C[0,1] (neither in a strong nor in a weak sense). If then we obtain for that

and Thus (as ) cannot exist either. This shows that is a Lipschitz function without directional derivatives and with empty contingent derivatives for nontrivial directions.

Example BE.6

A convex function non-differentiable on a dense set.

Consider all rational arguments

such that

are positive

integers, prime to each other, and put

For fixed

the sum

over all feasible

is bounded by

and Now define

and

for

Then is increasing, bounded by c and has jumps of size Next extend on by setting

at

if and put

for

Since

is increasing, the function

as Lebesgue integral is convex and for and ( irrational, rational) one obtains different limits of Thus is not differentiable at
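The idea of Example BE.6 can be checked numerically on a truncated version. The sketch below is only an illustration under stated assumptions: the jump at p/q is taken to be q^(-3) (the weights used in the example are not legible in this copy, so this choice is hypothetical; any positive summable weights behave alike), and only denominators up to a hypothetical cutoff `max_q` are used. Because the integral of an increasing step function is a sum of hinge functions, the integral can be evaluated in closed form.

```python
from fractions import Fraction
from math import gcd

def jump_points(max_q):
    """Rationals p/q in (0, 1) in lowest terms with q <= max_q, each with assumed weight q**-3."""
    return [(Fraction(p, q), 1.0 / q**3)
            for q in range(2, max_q + 1)
            for p in range(1, q) if gcd(p, q) == 1]

def f(x, jumps):
    """Integral from 0 to x of the increasing step function with the given jumps."""
    return sum(w * max(0.0, x - float(r)) for r, w in jumps)

def derivative_gap(x, jumps, h=5e-5):
    """Right-hand minus left-hand difference quotient of f at x."""
    right = (f(x + h, jumps) - f(x, jumps)) / h
    left = (f(x, jumps) - f(x - h, jumps)) / h
    return right - left

jumps = jump_points(max_q=50)
gap_rational = derivative_gap(0.5, jumps)             # kink at 1/2 with weight 1/8
gap_irrational = derivative_gap(2 ** 0.5 / 2, jumps)  # no jump point within h for this cutoff
```

At the rational point 1/2 the one-sided slopes differ by the jump size 1/8, while at the irrational point the truncated function is affine on the sampled interval. In the untruncated construction every rational point carries such a kink, which is what makes the set of non-differentiability dense.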

12.2 Exercises

Exercise 1 Proof of Lemma 2.21.

(i)

Let F be pseudo-regular at with rank L and neighborhoods U, V. For fixed then the mapping is again pseudo-regular at with rank L and the same neighborhoods U, V. The second part of the proof to Lemma 2.18 now shows

if Since

(i)

and

was arbitrary, this yields

By Lemma 2.20, there are points and that and Since now means:

depending on where

such

if hence (ii)

F is not pseudo-regular at

because of (i).

(ii)

The condition holds true due to Lemma 2.20 (ii).

Exercise 2

How can the situation of mixed constraints (equations and inequalities) be handled in a similar manner? Define the cone

if i corresponds to an inequality,

if i corresponds to an equation}.

Exercise 3

Verify that, for every function is nowhere pseudo-regular. Hint: Apply Rademacher’s theorem.

Assume the contrary. Take near such that exists (Rademacher’s theorem). Since there exists such that The function

has derivative 0 at Thus, is a local Ekeland-point for with each Apply Lemma 2.21 (ii) to obtain a contradiction.

Exercise 4

Show how Theorem 2.26 may be extended to the case of a closed multifunction What about necessity of the conditions in Theorem 2.26? One can repeat all the arguments of the necessity part for Theorem 2.22.

Exercise 5

Show that, in Lemmas 3.1 and 3.2, one may replace “Lipschitz l.s.c.” by “l.s.c.”.

Proof: If is l.s.c. at without being Lipschitz l.s.c. then, for certain we find with and

Since we obtain a convergent (sub)sequence which shows that for some So, already the necessary injectivity conditions for the related regularity are violated.

Exercise 6

Show that for

one has

The simplest way is to use Theorem 6.5. For statement (ii) yields for all so the linear function

is identically zero. But this means just

Exercise 7

Proof of Lemma 5.11. By definition, is continuous, and is always true for some where and belongs to a finite family F of We may assume that the sets fulfill otherwise the local representation of by the family F would need fewer functions Applying Theorem 5.1 to we obtain for some large Hence, locally, consists of arcs belonging to the strongly regular, inverse functions for sufficiently small Thus,

is an isolated zero, and H exists as required.

Exercise 8

Analyze the continuity properties for

for


F with the Euclidean norm and with polyhedral norms, respectively, and with “>” instead of Does there exist a continuous function such that on B? The first part is left to the reader. A continuous function with on B cannot exist because it would have a fixed point.

Exercise 9

Find a counterexample showing that the pointwise condition (5.4) in Theorem 5.1 is not sufficient for the Lipschitz l.s.c. of We construct

continuous with and

Let if

and G = conv M. For put In order to define at triangle given by the points

let

For

and

with

let D be the

and let

Then We shift the point of D and define to be the related point:

to the left boundary

So

becomes a continuous function of the type Setting where is the projection of onto G, can be continuously extended to the whole space. We identify and Clearly, holds for all and

Exercise 10

Show that if is strongly regular at and directionally differentiable for near then the local inverse is directionally differentiable for near

Otherwise one finds images for near such that exists and contains at least two different elements and Since one obtains For small then the images

differ by a quantity of type while the pre-images differ by Therefore, the local inverse cannot be Lipschitz near for

Exercise 11

Verify Theorem 6.4, first part “polyhedral”. The statements (i) and (ii) can be easily seen for each submapping defined by gph since is a convex polyhedron. From gph and (6.2) then the assertions follow via selection of subsequences assigned to fixed

Exercise 12

Verify

(i) If or (ii) If (iii) If

is directionally differentiable, then then and is l.s.c. at then

Proof. Note that

(i)

the functions are locally Lipschitz by assumption.

directionally differentiable: Let

and

(possible with given

for certain

since

Write

is directionally differentiable). Then,

directionally differentiable: Let and

We may write, with certain

and Using again

(ii) Let

we get and let

Then

be written as

Since

it holds

for all Setting

and

as above.

this yields

and


(iii) Let

and let


be written as with

Since is l.s.c. at Substituting, we obtain

one finds

such that and, since as required.

it follows

Exercise 13

Let be strongly regular at Show that is semismooth at 0 if so is at

Otherwise, is not a Newton map at 0. Then, due to conv (cf. (6.17)), also is not a Newton map at 0. So one finds some and elements such that where Setting and using that and with some new positive constant C :

Since

is a Newton map at

Next apply that homogeneous map

and

are locally Lipschitz, we obtain

we may write (with different o– functions)

By subadditivity of the (cf. (6.10)), we then observe

Hence with certain We read the latter as

which yields, with some Lipschitz rank L of near the origin,

This contradiction proves the statement.


Exercise 14

Show that all and

is

on an open set

if card

for

By its basic properties, now the map is additive and homogeneous for each In addition, as a locally bounded and closed mapping, is continuous on

Exercise 15

Verify that positively homogeneous Let of Given

and be given that there exist such that select some such that

are simple at the origin. We know by the structure and put

Then

Next select and choose a related in the same way as above. Repeating this procedure, the subsequence of all then realizes, with the assigned and

Exercise 16

Show that, for the situation must be taken into account. The situation

occurs once more for


Appendix

In this section, we present proofs of frequently applied (and well-known) basic tools for the convenience of the reader.

Ekeland’s Variational Principle

Theorem A.1 (Ekeland’s variational principle, appears also as Theorem 2.12).

Let X be a complete metric space and be a l.s.c. function having a finite infimum. Let and be positive, and let Then there is some such that and
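On a finite metric space the statement can be checked by direct search. The sketch below assumes the usual form of the conclusion, since the displayed formulas are not legible in this copy: for a starting point x with f(x) <= inf f + eps there is some z with d(z, x) <= alpha, f(z) <= f(x), and f(p) + (eps/alpha) d(p, z) >= f(z) for all p. The name `ekeland_point` and the sample data are hypothetical, and the greedy update is a simple stand-in for finite sets, not the sequence constructed in the proof.

```python
import math

def ekeland_point(points, f, dist, x0, eps, alpha):
    """Return z with f(p) + (eps/alpha) * dist(p, z) >= f(z) for every p in points.

    Assumes `points` is finite and f(x0) <= min f + eps.
    """
    lam = eps / alpha
    z = x0
    while True:
        # Points that violate the Ekeland inequality at the current z.
        violators = [p for p in points if f(p) + lam * dist(p, z) < f(z)]
        if not violators:
            return z
        # Move to the strongest violator; f strictly decreases, so the loop ends.
        z = min(violators, key=lambda p: f(p) + lam * dist(p, z))

def f(x):
    """Sample objective (hypothetical): a wavy function with several local minima."""
    return abs(x) + 0.3 * math.cos(7 * x)

def dist(a, b):
    return abs(a - b)

grid = [i / 100 for i in range(-200, 201)]    # finite metric space inside [-2, 2]
x0 = 1.0
eps = f(x0) - min(f(p) for p in grid) + 1e-9  # guarantees f(x0) <= inf f + eps
alpha = 0.5
z = ekeland_point(grid, f, dist, x0, eps, alpha)
```

The distance bound holds here because each step satisfies (eps/alpha) d(z_k, z_{k+1}) < f(z_k) - f(z_{k+1}), and these differences telescope to at most eps.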

Proof. Put

For arbitrary

and

in X,

we observe

Taking the infimum over all

on both sides, we obtain

Therefore, is a Lipschitz function; in particular, is u.s.c. To construct a sequence we set If then realizes all the assertions of the theorem. Thus, beginning with assume that Then one finds some such that and, in addition,

From (A.1) and

we obtain for each


This yields particularly

By (A.1) and (A.3), is again the point in question whenever Otherwise (A.3) shows that the Cauchy sequence has a limit in the complete space X. Since is l.s.c., we observe By (A.1), the sequence of is decreasing, hence Moreover, using (A.3) we even obtain

Finally, recalling that is u.s.c., we infer, due to (A.2), the key relation

The latter proves the theorem.

Approximation by Directional Derivatives

The following lemma can be found in Shapiro’s paper [Sha90] where a survey of concepts of directional differentiability and their interrelations is presented, see also [BS00].

Lemma A.2 (approximation by directional derivatives 1).

Let be locally Lipschitz (Y normed), and let directional derivatives exist for all Then

Proof. Otherwise one finds converging (a sequence) and some such that

where The directions have some cluster point; so they converge for some subsequence, Setting and we obtain

Since is locally Lipschitz, say with rank L, we also observe (for all small ) that


So, replacing by in (A.4), it holds for some subsequence,

in contradiction to the directional differentiability.

Lemma A.3 (approximation by directional derivatives 2).

Let be locally Lipschitz. Then

Proof. Otherwise one finds a sequence of converging directions and some such that

Setting

and

and

this is

The directions have some cluster point. Thus, for certain (belonging to some subsequence) the bounded quotients converge to an element Since the multifunction is Lipschitz with respect to the Hausdorff distance. In particular, it is lower semicontinuous, so dist vanishes, in contradiction to (A.5).

Remark A.4 Analogously, one shows that if is locally upper Lipschitz at and is lower semicontinuous, it holds

for some neighborhood of

The l.s.c. assumption is essential even for pointwise Lipschitz functions Put, for and if and otherwise.

Lemma A.5 (descent directions).

Let be locally Lipschitz (X normed), and let directional derivatives exist for each Further, let be some sequence such that related elements fulfill with some fixed Then, it holds for each cluster point of the sequence


Proof. If, for certain we have

then using some Lipschitz rank L for near it follows

Thus, holds for the particular sequence Since is directionally differentiable, this yields

for every sequence

Proof of TF = T(NM) = NTM + TNM

The main point in the proof of Theorem 7.6 was the product rule TF = T(NM) = NTM + TNM. Since the way via the more general Theorem 6.8 is quite long, we add a direct proof which is valid for the actual product rule only.

Lemma A.6 (direct proof of the product rule).

Proof. To begin with we set

(similarly

Now put Then

and

and

are defined), and observe that

for any given sequence depend on and

If, moreover, then - since M and N are locally Lipschitz - the bounded sequences and have accumulation points and respectively. The third term is vanishing. So we obtain, for all converging subsequences that the limit can be written as


This tells us To show the reverse inclusion, the special structure of N comes into play. Let and be arbitrarily given, and let and be appropriate sequences such that The existence of such sequences is ensured by the definition of TM. To show that we have to find elements in such a way that can be written as

with the already given sequence of (or with some infinite subsequence). If this is possible then, considering for as above, we obtain

which proves the lemma. We are now going to construct for given By definition of the first and the last components of any element belong to the map and are obviously 0 and respectively. The remaining components of are formed by the T-derivative of the function at in direction which has already been studied in Lemma 7.4. Accordingly, we find such that for small and (A.7) holds even as identity: for small Hence the lemma is true, indeed.

Constraint Qualifications

The following lemma compiles some basic facts on crucial constraint qualifications for a nonlinear program

with and being functions defined from to and respectively. Recall that MFCQ (Mangasarian-Fromovitz constraint qualification) is said to hold at some feasible point if both has full row rank and

while LICQ (linear independence constraint qualification) is said to hold at if

is linearly independent, where


Given a stationary solution of (P), SMFCQ (strict MFCQ) is defined to hold at if the set of Lagrange multipliers

is a KKT point of (P)}

is a singleton. To have a unified algebraic description of the above CQs, let us introduce, for any feasible point of (P) and any index set the polyhedral cone

Let us also recall Gordan’s theorem of the alternative [Man81b, Man94] which says that for matrices and of suitable dimensions,

and, by a standard argument from convex analysis (see, e.g., [Man81a, Man94, Roc70, SW70]), this equivalently means that for any right-hand side the linear system has a bounded (possibly empty) solution set. Hence, it follows immediately that a feasible point of (P) satisfies

Moreover, given a stationary solution the following lemma states that satisfies

where again and From (A.9)–(A.11), the known implications

are obvious.

Lemma A.7 (Gauvin’s theorem [Gau77] and Kyparisis’ theorem [Kyp85]).

Let be a stationary solution of (P). Then

(i) is bounded if and only if MFCQ is satisfied at

(ii) is a singleton (i.e., SMFCQ is satisfied at ) if and only if for some one has

Proof. (i) follows from (A.8) and (A.10) according to the discussion above.

To show the “only if”-direction of (ii), let and with Then, for all sufficiently small, one has

where Thus, is not a singleton. To show the “if”-direction of (ii), assume that is not a singleton. Let be any element of Then there is a second element such that both and satisfy

Hence,

i.e.,

and

which completes the proof.
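Statement (i) of Lemma A.7 can be illustrated in the simplest possible setting. The sketch below assumes a one-dimensional program with inequality constraints only; both problems, the helper names `mfcq_holds_1d` and `is_multiplier`, and the sampled multipliers are hypothetical and chosen for illustration. In one dimension a direction is just a sign, so MFCQ reduces to all active gradients having the same strict sign.

```python
def mfcq_holds_1d(active_grads):
    """MFCQ in R^1 without equations: some d in {+1, -1} has a_i * d < 0 for all active i."""
    return any(all(a * d < 0 for a in active_grads) for d in (1.0, -1.0))

def is_multiplier(c, active_grads, lam, tol=1e-12):
    """KKT test at the stationary point: lam >= 0 and c + sum(lam_i * a_i) = 0."""
    residual = c + sum(l * a for l, a in zip(lam, active_grads))
    return all(l >= 0 for l in lam) and abs(residual) <= tol

# Problem A (hypothetical): min x  s.t.  -x <= 0, -2x <= 0, at the point x = 0.
c_A, grads_A = 1.0, [-1.0, -2.0]
# Problem B (hypothetical): min x  s.t.  -x <= 0,   x <= 0, at the point x = 0.
c_B, grads_B = 1.0, [-1.0, 1.0]

# A: the multipliers form the segment between (1, 0) and (0, 1/2), a bounded set.
segment_A = [(1.0 - s, s / 2.0) for s in (0.0, 0.25, 0.5, 1.0)]
# B: the multipliers contain the unbounded ray (1 + t, t), t >= 0.
ray_B = [(1.0 + t, t) for t in (0.0, 1.0, 1e3, 1e6)]
```

In problem A, MFCQ holds and every multiplier satisfies lam_1 + 2 lam_2 = 1 with lam >= 0, so the multiplier set is bounded but not a singleton (SMFCQ and LICQ fail). In problem B, MFCQ fails and the multipliers run off along a ray, in line with statement (i).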


Bibliography

[AC95] D. Azé and C.C. Chou. On a Newton type iterative method for solving inclusions. Mathematics of Operations Research, 20:790–800, 1995.

[AE84] J.-P. Aubin and I. Ekeland. Applied Nonlinear Analysis. Wiley, New York, 1984.

[AF90] J.-P. Aubin and H. Frankowska. Set–Valued Analysis. Birkhäuser, Boston, 1990.

[AH35] P.S. Alexandroff and H. Hopf. Topologie. Springer, Berlin, 1935.

[Alt83] W. Alt. Lipschitzian perturbations of infinite optimization problems. In A.V. Fiacco, editor, Mathematical Programming with Data Perturbations, pages 7–21. M. Dekker, New York, 1983.

[Alt90] W. Alt. Stability of Solutions and the Lagrange–Newton Method for Nonlinear Optimization and Optimal Control Problems. Universität Bayreuth, Bayreuth, 1990. Habilitationsschrift.

[Asp68] E. Asplund. Fréchet differentiability of convex functions. Acta Math., 121:31–47, 1968.

[Att84] H. Attouch. Variational Convergence for Functions and Operators. Applicable Mathematics Series. Pitman, London, 1984.

[Aub84] J.P. Aubin. Lipschitz behaviour of solutions to convex minimization problems. Mathematics of Operations Research, 9:87–111, 1984.

[Aus84] A. Auslender. Stability in mathematical programming with nondifferentiable data. SIAM Journal on Control and Optimization, 22:29–41, 1984.

[BA93] E.G. Belousov and V.G. Andronov. Solvability and Stability for Problems of Polynomial Programming. Moscow University Publishers, Moscow, 1993. in Russian.

[BBZ81] A. Ben-Israel, A. Ben-Tal, and S. Zlobec. Optimality in Nonlinear Programming: A Feasible Direction Approach. Wiley, New York, 1981.

[Ben80] A. Ben-Tal. Second–order and related extremality conditions in nonlinear programming. Journal of Optimization Theory and Applications, 31:143–165, 1980.

[Ber63] C. Berge. Topological Spaces. Macmillan, New York, 1963.

[BF97] S.C. Billups and M.C. Ferris. QPCOMP: A quadratic programming based solver for mixed complementarity problems. Mathematical Programming B, 76:533–562, 1997.

B. Bank, J. Guddat, D. Klatte, B. Kummer, and K. Tammer. Non-Linear Parametric Optimization. Akademie-Verlag, Berlin, 1982.

[BL00] J.M. Borwein and A.S. Lewis. Convex Analysis and Nonlinear Optimization: Theory and Examples. CMS Books in Mathematics. Springer, New York, 2000.

[BM88] B. Bank and R. Mandel. Parametric Integer Optimization. Mathematical Research, Vol. 39. Akademie-Verlag, Berlin, 1988.

[BMX94] J.M. Borwein, W.B. Moors, and W. Xianfy. Lipschitz functions with prescribed derivatives and subderivatives. CECM Information Document 94026, Simon Fraser Univ., Burnaby, 1994.

[Bou32] G. Bouligand. Introduction à la Géométrie Infinitésimale Directe. Gauthier-Villars, Paris, 1932.

[BS98] J.F. Bonnans and A. Shapiro. Optimization problems with perturbations: A guided tour. SIAM Review, 40:228–264, 1998.

[BS00] J.F. Bonnans and A. Shapiro. Perturbation Analysis of Optimization Problems. Springer, New York, 2000.

[BT96] J.V. Burke and P. Tseng. A unified analysis of Hoffman’s bound via Fenchel duality. SIAM Journal on Optimization, 6:265–282, 1996.

[BZ82] A. Ben-Tal and J. Zowe. A unified theory of first and second order conditions for extremum problems in topological vector spaces. Mathematical Programming Study, 19:39–76, 1982.

[BZ88] J.M. Borwein and D.M. Zhuang. Verifiable necessary and sufficient conditions for regularity of set-valued and single-valued maps. Journal of Mathematical Analysis and Applications, 134:441–459, 1988.

[Cha89] R. W. Chaney. Optimality conditions for piecewise nonlinear programming. Journal of Optimization Theory and Applications, 61:179–202, 1989.

[Cla76] F.H. Clarke. On the inverse function theorem. Pacific Journal of Mathematics, 64:97–102, 1976.

[Cla83] F.H. Clarke. Optimization and Nonsmooth Analysis. Wiley, New York, 1983.

[Com90] R. Cominetti. Metric regularity, tangent sets and second-order optimality conditions. Applied Mathematics and Optimization, 21:265–287, 1990.

[DFS67] G.B. Dantzig, J. Folkman, and N. Shapiro. On the continuity of the minimum set of a continuous function. Journal of Mathematical Analysis and Applications, 17:519–548, 1967.

[DM74] V.F. Demyanov and V.N. Malozemov. Introduction to Minimax. Wiley, New York, 1974.

[DMO80] A.V. Dmitruk, A.A. Milyutin, and N.P. Osmolovski. Lyusternik’s theorem and the theory of the extremum. Uspekhy Mat. Nauk, 35:11–46, 1980. in Russian.

[Don83] A. Dontchev. Perturbations, Approximations and Sensitivity Analysis of Optimal Control Systems. Lecture Notes in Control and Information Sciences 52. Springer, Berlin, 1983.

[Don95] A. Dontchev. Characterizations of Lipschitz stability in optimization. In R. Lucchetti and J. Revalski, editors, Recent Developments in Well–Posed Variational Problems, pages 95–116. Kluwer, 1995.

[Don96] A. Dontchev. Local convergence of the Newton method for generalized equations. Comptes Rendus de l’Académie des Sciences de Paris, 332, Ser. I:327–331, 1996.

[Don98] A. Dontchev. A proof of the necessity of linear independence constraint qualification and strong second–order sufficient optimality condition for Lipschitzian stability in nonlinear programming. Journal of Optimization Theory and Applications, 98:467–473, 1998.

[DR96] A. Dontchev and R.T. Rockafellar. Characterizations of strong regularity for variational inequalities over polyhedral convex sets. SIAM Journal on Optimization, 6:1087–1105, 1996.

[DR98] A. Dontchev and R.T. Rockafellar. Characterizations of Lipschitz stability in nonlinear programming. In A.V. Fiacco, editor, Mathematical Programming with Data Perturbations, pages 65–82. Marcel Dekker, New York, 1998.

[DZ93] A. Dontchev and T. Zolezzi. Well–Posed Optimization Problems, Lecture Notes in Mathematics 1543. Springer, Berlin, 1993.

[Eke74] I. Ekeland. On the variational principle. Journal of Mathematical Analysis and Applications, 47:324–353, 1974.

[Fab86] M. Fabian. Subdifferentials, local ε-supports and Asplund spaces. Journal London Math. Soc., 34:568–576, 1986.

[Fab89] M. Fabian. Subdifferentiability and trustworthiness in the light of a new variational principle of Borwein and Preiss. Acta Univ. Carolinae, 30:51–56, 1989. 17th Winter School on Abstract Analysis, Srni 89.

[Fed69] H. Federer. Geometric Measure Theory. Springer, New York, Heidelberg, 1969.

[Fia74] A.V. Fiacco. Convergence properties of local solutions of sequences of mathematical programming problems in general spaces. Journal of Optimization Theory and Applications, 13:1–12, 1974.

[Fia76] A.V. Fiacco. Sensitivity analysis for nonlinear programming using penalty functions. Mathematical Programming, 10:287–311, 1976.

[Fia83] A.V. Fiacco. Introduction to Sensitivity and Stability Analysis. Academic Press, New York, 1983.

[Fis97] A. Fischer. Solutions of monotone complementarity problems with locally Lipschitzian functions. Mathematical Programming, Series B, 76:513–532, 1997.

[FM68] A.V. Fiacco and G.P. McCormick. Nonlinear Programming: Sequential Unconstrained Minimization Techniques. Wiley, New York, 1968.

[Fus94] P. Fusek. Über Kettenregeln in Gleichungsform für Ableitungen nichtglatter Funktionen. Diplomarbeit, Fachbereich Mathematik. Humboldt–Universität zu Berlin, Berlin, 1994.

[Fus99] P. Fusek. Eigenschaften pseudo-regulärer Funktionen und einige Anwendungen auf Optimierungsaufgaben. Dissertation, Fachbereich Mathematik. Humboldt–Universität zu Berlin, Berlin, Februar 1999.

[Fus01] P. Fusek. Isolated zeros of Lipschitzian metrically regular functions. Optimization, 49:425–446, 2001.

[Gau77] J. Gauvin. A necessary and sufficient regularity condition to have bounded multipliers in nonconvex programming. Mathematical Programming, 12:136–138, 1977.

[Gau94] J. Gauvin. Theory of Nonconvex Programming. Les Publications CRM, Montreal, 1994.

[GD82] J. Gauvin and F. Dubeau. Differential properties of the marginal function in mathematical programming. Mathematical Programming Study, 19:101–119, 1982.

[Gfr87] H. Gfrerer. Hölder continuity of solutions of perturbed optimization problems under Mangasarian-Fromovitz Constraint Qualification. In J. Guddat, H.Th. Jongen, B. Kummer, and F. Nožička, editors, Parametric Optimization and Related Topics, pages 113–124, Akademie-Verlag, Berlin, 1987.

[Gfr98] H. Gfrerer. Personal communication, 1998.

[Gfr00] H. Gfrerer. Personal communication, 2000.

[GGJ90] J. Guddat, F. Guerra, and H.Th. Jongen. Parametric Optimization: Singularities, Pathfollowing and Jumps. Wiley, Chichester, 1990.

[GJ88] J. Gauvin and R. Janin. Directional behaviour of optimal solutions in nonlinear mathematical programming. Mathematics of Operations Research, 13:629–649, 1988.

[Gol72] E.G. Golstein. Theory of Convex Programming. Transactions of Mathematical Monographs 36. American Mathematical Society, Providence, RI, 1972.

[Gra50] L.M. Graves. Some mapping theorems. Duke Mathematical Journal, 17:111–114, 1950.

[GT77] J. Gauvin and J.W. Tolle. Differential stability in nonlinear programming. SIAM Journal on Control and Optimization, 15:294–311, 1977.

[Hag79] W.W. Hager. Lipschitz continuity for constrained processes. SIAM Journal on Control and Optimization, 17:321–338, 1979.

[Har77] A. Haraux. How to differentiate the projection on a convex set in Hilbert space. Some applications to variational inequalities. Journal of the Mathematical Society of Japan, 29:615–631, 1977.

[Har79] R. Hardt. An Introduction to Geometric Measure Theory, Lecture Notes, Melbourne University, 1979.

[HG99] W.W. Hager and M.S. Gowda. Stability in the presence of degeneracy and error estimation. Mathematical Programming, 85:181–192, 1999.

[HK94] R. Henrion and D. Klatte. Metric regularity of the feasible set mapping in semi-infinite optimization. Applied Mathematics and Optimization, 30:103–109, 1994.

[HO01] R. Henrion and J. Outrata. A subdifferential condition for calmness of multifunctions. Journal of Mathematical Analysis and Applications, 258:110–130, 2001.

[Hof52] A.J. Hoffman. On approximate solutions of systems of linear inequalities. Journal of Research of the National Bureau of Standards, 49:263–265, 1952.

[Hog73] W.W. Hogan. Point-to-set maps in mathematical programming. SIAM Review, 15:591–603, 1973.

[HP90] P.T. Harker and J.-S. Pang. Finite-dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms and applications. Mathematical Programming, 48:161–220, 1990.

[HUL93] J.-B. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization Algorithms I, II. Springer, New York, 1993.

[HUSN84] J.-B. Hiriart-Urruty, J.J. Strodiot, and V. Hien Nguyen. Generalized Hessian matrix and second order optimality conditions for problems with $C^{1,1}$ data. Applied Mathematics and Optimization, 11:43–56, 1984.

[HZ82] R. Hettich and P. Zencke. Numerische Methoden der Approximation und Semi–Infiniten Optimierung. Teubner, Stuttgart, 1982.

[IK92] C.M. Ip and J. Kyparisis. Local convergence of quasi-Newton methods for B-differentiable equations. Mathematical Programming, 56:71–89, 1992.

[Iof79a] A.D. Ioffe. Necessary and sufficient conditions for a local minimum. 3: Second order conditions and augmented duality. SIAM Journal on Control and Optimization, 17:266–288, 1979.

[Iof79b] A.D. Ioffe. Regular points of Lipschitz functions. Transactions of the American Mathematical Society, 251:61–69, 1979.

[Iof81] A.D. Ioffe. Nonsmooth analysis: differential calculus of nondifferentiable mappings. Transactions of the American Mathematical Society, 266:1–56, 1981.

[Iof94] A.D. Ioffe. On sensitivity analysis of nonlinear programs in Banach spaces: The approach via composite unconstrained optimization. SIAM Journal on Optimization, 4:1–43, 1994.

[Iof00] A.D. Ioffe. Codirectional compactness, metric regularity and subdifferential calculus. American Math. Soc., Providence, RI:123–163, 2000. Conf. on Nonlin. Analysis, Limoges 1999.

[IT74] A.D. Ioffe and V.M. Tichomirov. Theory of Extremal Problems. Nauka, Moscow, 1974. in Russian.

[Jan84] R. Janin. Directional derivative of the marginal function in nonlinear programming. Mathematical Programming Study, 21:110–126, 1984.

[JJT83] H.Th. Jongen, P. Jonker, and F. Twilt. Nonlinear Optimization in $\mathbb{R}^n$, I: Morse Theory, Chebychev Approximation. Peter Lang Verlag, Frankfurt a.M.-Bern-NewYork, 1983.

[JJT86] H.Th. Jongen, P. Jonker, and F. Twilt. Nonlinear Optimization in $\mathbb{R}^n$, II: Transversality, Flows, Parametric Aspects. Peter Lang Verlag, Frankfurt a.M.-Bern-NewYork, 1986.


[JJT88] H.Th. Jongen, P. Jonker, and F. Twilt. The continuous, desingularized Newton method for meromorphic functions. Acta Applicandae Mathematicae, 13:81–121, 1988.

[JJT91] H.Th. Jongen, P. Jonker, and F. Twilt. On the classification of plane graphs representing structurally stable rational Newton flows. Journal of Combinatorial Theory, Series B, 51:256–270, 1991.

[JKT90] H.Th. Jongen, D. Klatte, and K. Tammer. Implicit functions and sensitivity of stationary points. Mathematical Programming, 49:123–138, 1990.

[JLS98] V. Jeyakumar, D.T. Luc, and S. Schaible. Characterization of generalized monotone nonsmooth continuous maps using approximate Jacobians. Journal of Convex Analysis, 5:119–132, 1998.

[JMRT87] H.Th. Jongen, T. Möbert, J. Rückmann, and K. Tammer. On inertia and Schur complement in optimization. Linear Algebra and its Applications, 95:97–109, 1987.

[JMT86] H.Th. Jongen, T. Möbert, and K. Tammer. On iterated minimization in nonconvex optimization. Mathematics of Operations Research, 11:679–691, 1986.

[JP88] H.Th. Jongen and D. Pallaschke. On linearization and continuous selections of functions. Optimization, 19:343–353, 1988.

[JRS98] H.Th. Jongen, J.-J. Rückmann, and O. Stein. Generalized semi–infinite optimization: A first order optimality condition and examples. Mathematical Programming, 83:145–158, 1998.

[JTW92] H.Th. Jongen, F. Twilt, and G.W. Weber. Semi-infinite optimization: structure and stability of the feasible set. Journal of Optimization Theory and Applications, 72:529–552, 1992.

[KA64]

L.W. Kantorovich and G.P. Akilov. Funktionalanalysis in normierten Räumen. Akademie Verlag, Berlin, 1964.

[Kal86]

P. Kall. Approximations to optimization problems: An elementary review. Mathematics of Operations Research, 11:9–18, 1986.

[Kap76]

A. Kaplan. On the convergence of the penalty function method. Soviet Math. Dokl., 17:1008–1012, 1976.

[KK85]

D. Klatte and B. Kummer. Stability properties of infima and optimal solutions of parametric optimization problems. In V.F. Demyanov and D. Pallaschke, editors, Nondifferentiable Optimization: Motivations and Applications, pages 215–229. Springer, Berlin, 1985.

[KK99a]

D. Klatte and B. Kummer. Generalized Kojima functions and Lipschitz stability of critical points. Computational Optimization and Applications, 13:61–85, 1999.

[KK99b]

D. Klatte and B. Kummer. Strong stability in nonlinear programming revisited. Journal of the Australian Mathematical Society, Series B, 40:336–352, 1999.

[KK01]

D. Klatte and B. Kummer. Contingent derivatives of implicit (multi–) functions and stationary points. Annals of Operations Research, 101:313–331, 2001.


[KL87]

D. Kuhn and R. Löwen. Piecewise affine bijections of $\mathbb{R}^n$, and the equation $Sx^+ - Tx^- = y$. Linear Algebra and its Applications, 96:109–129, 1987.

[KL99]

D. Klatte and W. Li. Asymptotic constraint qualifications and global error bounds for convex inequalities. Mathematical Programming, 84:137–160, 1999.

[Kla85]

D. Klatte. On the stability of local and global optimal solutions in parametric problems of nonlinear programming. Part I: Basic results. Seminarbericht Nr. 75 der Sektion Mathematik, Humboldt-Universität Berlin, pages 1–21, 1985.

[Kla86]

D. Klatte. On persistence and continuity of local minimizers of nonlinear optimization problems under perturbations. Seminarbericht Nr. 80 der Sektion Mathematik, Humboldt-Universität Berlin, pages 32–42, 1986.

[Kla87]

D. Klatte. Lipschitz continuity of infima and optimal solutions in parametric optimization: The polyhedral case. In J. Guddat, H.Th. Jongen, B. Kummer, and F. Nožička, editors, Parametric Optimization and Related Topics, pages 229–248. Akademie-Verlag, Berlin, 1987.

[Kla91]

D. Klatte. Strong stability of stationary solutions and iterated local minimization. In J. Guddat, H.Th. Jongen, B. Kummer, and F. Nožička, editors, Parametric Optimization and Related Topics II, pages 119–136. Akademie-Verlag, Berlin, 1991.

[Kla92]

D. Klatte. Nonlinear optimization under data perturbations. In W. Krabs and J. Zowe, editors, Modern Methods of Optimization, pages 204–235. Springer, Berlin, 1992.

[Kla94a]

D. Klatte. On quantitative stability for non-isolated minima. Control and Cybernetics, 23:183–200, 1994.

[Kla94b]

D. Klatte. On regularity and stability in semi–infinite optimization. Set-Valued Analysis, 3:101–111, 1994.

[Kla97]

D. Klatte. Lower semicontinuity of the minimum in parametric convex programs. Journal of Optimization Theory and Applications, 94:511–517, 1997.

[Kla98]

D. Klatte. Hoffman’s error bound for systems of convex inequalities. In A.V. Fiacco, editor, Mathematical Programming with Data Perturbations, pages 185–199. Marcel Dekker, New York, 1998.

[Kla00]

D. Klatte. Upper Lipschitz behavior of solutions to perturbed $C^{1,1}$ programs. Mathematical Programming, 88:285–311, 2000.

[Klu79]

R. Kluge. Nichtlineare Variationsungleichungen und Extremalaufgaben. VEB Deutscher Verlag der Wissenschaften, Berlin, 1979.

[KM80]

A.Y. Kruger and B.S. Mordukhovich. Extremal points and Euler equations in nonsmooth optimization. Doklady Akad. Nauk BSSR, 24:684–687, 1980. in Russian.

[Koj80]

M. Kojima. Strongly stable stationary solutions in nonlinear programs. In S.M. Robinson, editor, Analysis and Computation of Fixed Points, pages 93–138. Academic Press, New York, 1980.

[KR92]

A. King and R.T. Rockafellar. Sensitivity analysis for nonsmooth generalized equations. Mathematical Programming, 55:341–364, 1992.


[Kru85]

A.Y. Kruger. Properties of generalized subdifferentials. Siberian Mathematical Journal, 26:822–832, 1985.

[Kru96]

A.Y. Kruger. On calculus of strict Dokl. Akad. Nauk Belarus, 40:4:34–39, 1996. in Russian.

[Kru97]

A.Y. Kruger. Strict and extremality conditions. Dokl. Akad. Nauk Belarus, 41:3:21–26, 1997. in Russian.

[Kru00]

A.Y. Kruger. Strict and extremality of sets and functions. Dokl. Nat. Akad. Nauk Belarus, 44:4:21–24, 2000. in Russian.

[Kru01]

A.Y. Kruger. Strict and extremality conditions. Optimization, 2001. to appear.

[KS87]

M. Kojima and S. Shindoh. Extensions of Newton and quasi-Newton methods to systems of $PC^1$ equations. Journal of the Operational Research Society of Japan, 29:352–372, 1987.

[KT88]

D. Klatte and K. Tammer. On second–order sufficient optimality conditions for $C^{1,1}$-optimization problems. Optimization, 19:169–179, 1988.

[KT90]

D. Klatte and K. Tammer. Strong stability of stationary solutions and Karush-Kuhn-Tucker points in nonlinear optimization. Annals of Operations Research, 27:285–307, 1990.

[KT96]

D. Klatte and G. Thiere. A note on Lipschitz constants for solutions of linear inequalities and equations. Linear Algebra and its Applications, 244:365–374, 1996.

[Kum77]

B. Kummer. Global stability of optimization problems. Mathematische Operationsforschung und Statistik, Series Optimization, 8:367–383, 1977.

[Kum81]

B. Kummer. Stability and weak duality in convex programming without regularity. Wissenschaftliche Zeitschrift der Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Reihe, XXX:381–386, 1981.

[Kum84]

B. Kummer. Generalized equations: Solvability and regularity. Mathematical Programming Studies, 21:199–212, 1984. As preprint Nr. 30, Sektion Mathematik, Humboldt-Universität Berlin, 1982.

[Kum87]

B. Kummer. Linearly and nonlinearly perturbed optimization problems in finite dimension. In J. Guddat et al., editors, Parametric Optimization and Related Topics, pages 249–267. Akademie Verlag, Berlin, 1987.

[Kum88a]

B. Kummer. The inverse of a Lipschitz function in $\mathbb{R}^n$: Complete characterization by directional derivatives. Preprint no. 195, Humboldt-Universität Berlin, Sektion Mathematik, 1988.

[Kum88b]

B. Kummer. Newton’s method for non-differentiable functions. In J. Guddat et al., editors, Advances in Math. Optimization, pages 114–125. Akademie Verlag Berlin, (Ser. Math. Res. 45), Berlin, 1988.

[Kum91a]

B. Kummer. An implicit function theorem for $C^{0,1}$-equations and parametric $C^{1,1}$-optimization. Journal of Mathematical Analysis and Applications, 158:35–46, 1991.

[Kum91b]

B. Kummer. Lipschitzian inverse functions, directional derivatives and application in $C^{1,1}$ optimization. Journal of Optimization Theory and Applications, 70:559–580, 1991.


[Kum92]


B. Kummer. Newton’s method based on generalized derivatives for nonsmooth functions: convergence analysis. In W. Oettli and D. Pallaschke, editors, Advances in Optimization, pages 171–194. Springer, Berlin, 1992.

[Kum95a] B. Kummer. Approximation of multifunctions and superlinear convergence. In R. Durier and C. Michelot, editors, Recent Developments in Optimization, volume 429 of Lecture Notes in Economics and Mathematical Systems, pages 243–251. Springer, Berlin, 1995.

[Kum95b] B. Kummer. On solvability and regularity of a parametrized version of optimality conditions. ZOR Mathematical Methods of OR, 41:215–230, 1995.

[Kum97] B. Kummer. Parametrizations of Kojima’s system and relations to penalty and barrier functions. Mathematical Programming, Series B, 76:579–592, 1997.

[Kum98] B. Kummer. Lipschitzian and pseudo-Lipschitzian inverse functions and applications to nonlinear programming. In A.V. Fiacco, editor, Mathematical Programming with Data Perturbations, pages 201–222. Marcel Dekker, New York, 1998.

[Kum99] B. Kummer. Metric regularity: Characterizations, nonsmooth variations and successive approximation. Optimization, 46:247–281, 1999.

[Kum00a] B. Kummer. Generalized Newton and NCP-methods: Convergence, regularity, actions. Discussiones Mathematicae - Differential Inclusions, 20:209–244, 2000.

[Kum00b] B. Kummer. Inverse functions of pseudo regular mappings and regularity conditions. Mathematical Programming, Series B, 88:313–339, 2000.

[KYF97] C. Kanzow, N. Yamashita, and M. Fukushima. New NCP-functions and their properties. Journal of Optimization Theory and Applications, 94:115–135, 1997.

[Kyp85] J. Kyparisis. On uniqueness of Kuhn–Tucker multipliers in nonlinear programming. Mathematical Programming, 32:242–246, 1985.

[Lau72] P.-J. Laurent. Approximation et Optimisation. Hermann, Paris, 1972.

[Lev94] E.G. Levitin. Perturbation Theory in Mathematical Programming and its Applications. Wiley, New York, 1994.

[Lev96] A.B. Levy. Implicit multifunction theorems for the sensitivity analysis of variational conditions. Mathematical Programming, 74:333–350, 1996.

[Li93] W. Li. The sharp Lipschitz constants for feasible and optimal solutions of a perturbed linear program. Linear Algebra and its Applications, 187:15–40, 1993.

[LMO74] E.S. Levitin, A.A. Miljutin, and N.P. Osmolovski. On conditions for a local minimum in a problem with constraints. In B.S. Mitjagin, editor, Mathematical Economics and Functional Analysis, pages 139–202. Nauka, Moscow, 1974. in Russian.

[LP97] A.S. Lewis and J.-S. Pang. Error bounds for convex inequality systems. In J.P. Crouzeix, J.-E. Martinez-Legaz, and M. Volle, editors, Generalized Convexity, Generalized Monotonicity: Recent Results, pages 75–110. Kluwer Academic Publishers, Dordrecht, 1997.


[LPR96]

Z.-Q. Luo, J.-S. Pang, and D. Ralph. Mathematical Programs with Equilibrium Constraints. Cambridge University Press, Cambridge, 1996.

[LPR00]

A.B. Levy, R.A. Poliquin, and R.T. Rockafellar. Stability of locally optimal solutions. SIAM Journal on Optimization, 10:580–604, 2000.

[LR94]

A.B. Levy and R.T. Rockafellar. Sensitivity analysis for solutions to generalized equations. Transactions of the American Mathematical Society, 345:661–671, 1994.

[LR95]

A.B. Levy and R.T. Rockafellar. Sensitivity of solutions in nonlinear programs with nonunique multipliers. In D.-Z. Du, L. Qi, and R.S. Womersley, editors, Recent Advances in Nonsmooth Optimization, pages 215–223. World Scientific Press, Singapore, 1995.

[LR96]

A.B. Levy and R.T. Rockafellar. Variational conditions and the protodifferentiation of partial subgradient mappings. Nonlinear Analysis: Theory, Methods & Applications, 26:1951–1964, 1996.

[LS96]

C. Lemaréchal and C. Sagastizabal. More than first-order developments of convex functions: Primal-dual relations. Journal of Convex Analysis, 3:1–14, 1996.

[LS97]

C. Lemaréchal and C. Sagastizabal. Practical aspects of the Moreau-Yosida regularization: Theoretical preliminaries. SIAM Journal on Optimization, 7:367–385, 1997.

[LS98]

W. Li and I. Singer. Global error bounds for convex multifunctions and applications. Mathematics of Operations Research, 23:443–462, 1998.

[Lyu34]

L. Lyusternik. Conditional extrema of functions. Math. Sbornik, 41:390–401, 1934.

[Mal87]

K. Malanowski. Stability of Solutions to Convex Problems of Optimization. Lecture Notes in Control and Information Sciences 92. Springer, Berlin, 1987.

[Man81a]

O.L. Mangasarian. A condition number of linear inequalities and equalities. Methods of Operations Research, 43:3–15, 1981.

[Man81b]

O.L. Mangasarian. A stable theorem of the alternative: An extension of the Gordan theorem. Linear Algebra and its Applications, 41:209–223, 1981.

[Man85]

O.L. Mangasarian. A condition number for differentiable convex inequalities. Mathematics of Operations Research, 10:175–179, 1985.

[Man90]

O.L. Mangasarian. Error bounds for nondegenerate monotone linear complementarity problems. Mathematical Programming, 48:437–446, 1990.

[Man94]

O.L. Mangasarian. Nonlinear Programming. Classics in Applied Mathematics. SIAM, Philadelphia, 1994. Republication of the work first published by McGraw-Hill Book Company, New York, 1969.

[MF67]

O.L. Mangasarian and S. Fromovitz. The Fritz John necessary optimality conditions in the presence of equality and inequality constraints. Journal of Mathematical Analysis and Applications, 17:37–47, 1967.

[Mic56]

E. Michael. Continuous selections I. Annals of Mathematics, 63:361-382, 1956.


[Mif77]

R. Mifflin. Semismooth and semiconvex functions in constrained optimization. SIAM Journal on Control and Optimization, 15:957–972, 1977.

[Mor88]

B.S. Mordukhovich. Approximation Methods in Problems of Optimization and Control. Nauka, Moscow, 1988. in Russian.

[Mor93]

B.S. Mordukhovich. Complete characterization of openness, metric regularity and Lipschitzian properties of multifunctions. Transactions of the American Mathematical Society, 340:1–35, 1993.

[Mor94]

B.S. Mordukhovich. Stability theory for parametric generalized equations and variational inequalities via nonsmooth analysis. Transactions of the American Mathematical Society, 343:609–657, 1994.

[MS97a]

B.S. Mordukhovich and Y. Shao. Fuzzy calculus for coderivatives of multifunctions. SIAM Journal on Control and Optimization, 35:285–314, 1997.

[MS97b]

B.S. Mordukhovich and Y. Shao. Stability of set-valued mappings in infinite dimensions: point criteria and applications. SIAM Journal on Control and Optimization, 35:285–314, 1997.

[MS98]

B.S. Mordukhovich and Y. Shao. Mixed coderivatives of set-valued mappings in variational analysis. Journal of Applied Analysis, 4:269–294, 1998.

[NGHB74] F. Nožička, J. Guddat, H. Hollatz, and B. Bank. Theorie der linearen parametrischen Optimierung. Akademie-Verlag, Berlin, 1974.

[NT01]

H.V. Ngai and M. Théra. Metric inequality, subdifferential calculus and applications. Set-Valued Analysis, 9:187–216, 2001.

[OKZ98]

J. Outrata, M. Kočvara, and J. Zowe. Nonsmooth Approach to Optimization Problems with Equilibrium Constraints. Kluwer Academic Publ., Dordrecht-Boston-London, 1998.

[Out00]

J. Outrata. A generalized mathematical program with equilibrium constraints. SIAM Journal on Control and Optimization, 38:1623–1638, 2000.

[Pan90]

J.-S. Pang. Newton’s method for B-differentiable equations. Mathematics of Operations Research, 15:311–341, 1990.

[Pan93]

J.-S. Pang. A degree-theoretic approach to parametric nonsmooth equations with multivalued perturbed solution sets. Mathematical Programming, 62:359–383, 1993.

[Pan97]

J.-S. Pang. Error bounds in mathematical programming. Mathematical Programming, 79:299–332, 1997.

[Pen82]

J.-P. Penot. On regularity conditions in mathematical programming. Mathematical Programming Study, 19:167–199, 1982.

[Pen89]

J.-P. Penot. Metric regularity, openness and Lipschitz behavior of multifunctions. Nonlinear Analysis: Theory, Methods & Applications, 13:629–643, 1989.

[Pol90]

R.A. Poliquin. Proto-differentiation of subgradient set-valued mappings. Canadian Journal of Mathematics, XLII, No.3:520–532, 1990.

[PQ93]

J.-S. Pang and L. Qi. Nonsmooth equations: motivation and algorithms. SIAM Journal on Optimization, 3:443–465, 1993.


[PR94] R.A. Poliquin and R.T. Rockafellar. Proto-derivative formulas for basic subgradient mappings in mathematical programming. Set-Valued Analysis, 2:275–290, 1994.

[PR96] J.-S. Pang and D. Ralph. Piecewise smoothness, local invertibility, and parametric analysis of normal maps. Mathematics of Operations Research, 21:401–426, 1996.

[QS93] L. Qi and J. Sun. A nonsmooth version of Newton’s method. Mathematical Programming, 58:353–367, 1993.

[RD95] D. Ralph and S. Dempe. Directional derivatives of the solution of a parametric nonlinear program. Mathematical Programming, 70:159–172, 1995.

[Rob73] S.M. Robinson. Bounds for error in the solution set of a perturbed linear program. Linear Algebra and its Applications, 6:69–81, 1973.

[Rob75] S.M. Robinson. An application of error bounds for convex programming in a linear space. SIAM Journal on Control and Optimization, 13:271–273, 1975.

[Rob76a] S.M. Robinson. Regularity and stability for convex multivalued functions. Mathematics of Operations Research, 1:130–143, 1976.

[Rob76b] S.M. Robinson. Stability theorems for systems of inequalities. Part I: Linear systems. SIAM Journal on Numerical Analysis, 12:754–769, 1976.

[Rob76c] S.M. Robinson. Stability theorems for systems of inequalities. Part II: Differentiable nonlinear systems. SIAM Journal on Numerical Analysis, 13:497–513, 1976.

[Rob77] S.M. Robinson. A characterization of stability in linear programming. Operations Research, 25:435–447, 1977.

[Rob79] S.M. Robinson. Generalized equations and their solutions, Part I: Basic theory. Mathematical Programming Study, 10:128–141, 1979.

[Rob80] S.M. Robinson. Strongly regular generalized equations. Mathematics of Operations Research, 5:43–62, 1980.

[Rob81] S.M. Robinson. Some continuity properties of polyhedral multifunctions. Mathematical Programming Study, 14:206–214, 1981.

[Rob82] S.M. Robinson. Generalized equations and their solutions. Part II: Applications to nonlinear programming. Mathematical Programming Study, 19:200–221, 1982.

[Rob87] S.M. Robinson. Local epi-continuity and local optimization. Mathematical Programming, 37:208–223, 1987.

[Rob91] S.M. Robinson. An implicit function theorem for a class of nonsmooth functions. Mathematics of Operations Research, 16:292–309, 1991.

[Rob94] S.M. Robinson. Newton’s method for a class of nonsmooth functions. Set-Valued Analysis, 2:291–305, 1994.

[Roc70] R.T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, N.J., 1970.

[Roc71] R.T. Rockafellar. Ordinary convex programs without a duality gap. Journal of Optimization Theory and Applications, 7:143–148, 1971.


[Roc74]

R.T. Rockafellar. Conjugate Duality and Optimization. Regional Conference Series in Applied Mathematics. SIAM, Philadelphia, 1974.

[Roc81]

R.T. Rockafellar. The Theory of Subgradients and its Application to Problems of Optimization. Convex and Nonconvex Functions. Heldermann, Berlin, 1981.

[Roc88]

R.T. Rockafellar. First and second order epi-differentiability in nonlinear programming. Transactions of the American Mathematical Society, 307:75–108, 1988.

[RS97]

D. Ralph and S. Scholtes. Sensitivity analysis of composite piecewise smooth equations. Mathematical Programming, Series B, 76:593–612, 1997.

[RW84]

R.T. Rockafellar and R. J.-B. Wets. Variational systems, an introduction. In G. Salinetti, editor, Multifunctions and Integrands: Stochastic Analysis, Approximation and Optimization, Lecture Notes in Mathematics 1091, pages 1–54. Springer, Berlin, 1984.

[RW98]

R.T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer, Berlin, 1998.

[Sch94]

S. Scholtes. Introduction to Piecewise Differentiable Equations. Preprint No. 53/1994. Institut für Statistik und Mathematische Wirtschaftstheorie, Universität Karlsruhe, 1994.

[Sha88a]

A. Shapiro. Perturbation theory of nonlinear programs when the set of solutions is not a singleton. Applied Mathematics and Optimization, 18:215–229, 1988.

[Sha88b]

A. Shapiro. Sensitivity analysis of nonlinear programs and differentiability properties of metric projections. SIAM Journal on Control and Optimization, 26:628–645, 1988.

[Sha90]

A. Shapiro. On concepts of directional differentiability. Journal of Optimization Theory and Applications, 66:477–487, 1990.

[Sha94]

A. Shapiro. On Lipschitzian stability of optimal solutions of parameterized semi–infinite programs. Mathematics of Operations Research, 19:743–752, 1994.

[Sha98]

A. Shapiro. First and second order optimality conditions and perturbation analysis of semi–infinite programming problems. In R. Reemtsen and J.-J. Rückmann, editors, Semi–Infinite Programming, pages 103–133. Kluwer, Boston, Dordrecht, London, 1998.

[Sle96]

B. Slezák. An inverse function theorem in topological groups. Studia Scientiarum Mathematicarum Hungarica, 31:415–421, 1996.

[SQ99]

D. Sun and L. Qi. On NCP functions. Computational Optimization and Applications, 13:201–220, 1999.

[Stu86]

M. Studniarski. Necessary and sufficient conditions for isolated local minima of nonsmooth functions. SIAM Journal on Control and Optimization, 24:1044–1049, 1986.

[SW70]

J. Stoer and C. Witzgall. Convexity and Optimization in Finite Dimensions I. Springer, Berlin, 1970.


[SZ88]

H. Schramm and J. Zowe. A combination of the bundle approach and the trust region concept. In J. Guddat et al., editors, Advances in Math. Optimization, volume 45 of Ser. Mathem. Res., pages 196–209. Akademie Verlag, Berlin, 1988.

[SZ92]

H. Schramm and J. Zowe. A version of the bundle idea for minimizing a nonsmooth function: conceptual idea, convergence analysis, numerical results. SIAM Journal on Optimization, 2:121–152, 1992.

[Thi80]

L. Thibault. Subdifferentials of compactly Lipschitz vector-valued functions. Annali di Matematica Pura ed Applicata, 4:157–192, 1980.

[Thi82]

L. Thibault. On generalized differentials and subdifferentials of Lipschitz vector-valued functions. Nonlinear Analysis: Theory, Methods & Applications, 6:1037–1053, 1982.

[War75]

J. Warga. Necessary conditions without differentiability assumptions in optimal control. Journal of Differential Equations, 15:13–46, 1975.

[War94]

D.E. Ward. Characterizations of strict local minima and necessary conditions for weak sharp minima. Journal of Optimization Theory and Applications, 80:551–571, 1994.

[WW69]

D.W. Walkup and R.J.-B. Wets. A Lipschitzian characterization of convex polyhedra. Proceedings of the American Mathematical Society, 20:167–173, 1969.

[Zei76]

E. Zeidler. Vorlesungen über nichtlineare Funktionalanalysis I - Fixpunktsätze -. Teubner Verlagsgesellschaft, Leipzig, 1976.

[Zie89]

William P. Ziemer. Weakly Differentiable Functions. Springer, New York, 1989.

[ZK79]

J. Zowe and S. Kurcyusz. Regularity and stability for the mathematical programming problem in Banach spaces. Applied Mathematics and Optimization, 5:42–62, 1979.

Index

active functions, 5
Asplund space, 67
Aubin property, 7

B-differentiable function, 85
B-subdifferential, 4
barrier function, 276
Berge-u.s.c. multifunction, 10
Bouligand cone, 106
Bouligand derivative, 3
boundary, 1

of , 128
optimization problem, 2
C-stability system, 168
calmness, 13
Clarke’s directional derivative, 3
Clarke’s tangent cone, 106
CLM set, 16
closed multifunction, 2
closure, 1
coderivative, 3, 66
complete local minimizing set, 16
cone
  Bouligand, 106
  Clarke’s tangent, 106
  contingent, 64, 106
  normal, 158
cone constraints, 21, 44
  standard, 50
conjugate function, 92
constant rank condition, 31
contingent cone, 64, 106
contingent derivative, 3, 64, 165
convex hull, 1
critical point, 150, 152
  isolated, 194
critical value, 150
critical value function, 222

derivative
  D°, 4, 127
  Bouligand, 3
  Clarke’s directional, 3
  contingent, 3, 64, 165
  directional, 3
  generalized, 3
  graphical, 3
  injective, 3, 62
  partial C-, 121
  partial T-, 117
  strict graphical, 3
  Thibault, 165
  Thibault’s, 3
describing function, 19
directional derivatives, 3
Dirichlet’s function, 14
DIST, 50
domain of a multifunction, 2

Ekeland’s variational principle, 37, 303
Ekeland-point, 38
  global, 38
  local, 38
epi-convergence, 17
epsilon-normal to gph F, 66
epsilon-optimal, 38
error estimates, 7
exact penalty, 95
exact solutions, 266
exposed matrix, 114

extended MFCQ, 57
extreme value function, 36

feasible triple, 266
function
  B-differentiable, 85
  barrier, 276
  conjugate, 92
  critical value, 222
  describing, 19
  exact penalty, 95
  generalized Kojima-, 151
  Kojima-, 150
  Lagrange, 150, 184
  Lipschitzian increasing, 19
  locally 128
  locally Lipschitz, 1
  logarithmic barrier, 277
  monotone NCP-, 238
  NCP, 5, 236
  Newton, 122
  see piecewise
  penalty, 94, 276
  piecewise 5
  pNCP, 239
  pseudo-smooth, 4, 127
  semismooth, 125, 260
  simple, 117
  standard Lagrange, 150
  strongly monotone NCP-, 237
  strongly semismooth, 260
functions
  active, 5

Gauvin’s theorem, 309
generalized derivatives, 3
generalized equation, 158
generalized Jacobian, 4, 114
generalized Kojima-function, 151
generalized LICQ, 169
generalized Newton method, 258
generalized semi-infinite optimization, 21
generalized strict MFCQ, 170
global Ekeland-point, 38
Gordan’s theorem, 308
graph of a multifunction, 2
graphical derivative, 3
Graves-Lyusternik theorem, 10, 85
growth condition, 81

Hausdorff-limit
  lower, 11
  upper, 11
Hoffman’s lemma, 29

image of a multifunction, 2
implicit Lipschitz function, 102
injective derivative, 3, 62
injectivity with respect to u, 205
interior, 1
invariance of domain theorem, 97
inverse
  local, 34
  partial, 42
inverse family, 34
inverse family of directions, 35
inverse Lipschitz function, 100
inverse multifunction, 2
isolated critical point, 194

Karush-Kuhn-Tucker point, 150
KKT point, see Karush-Kuhn-Tucker point
Kojima-function, 150
Kuratovski-Painlevé limits, 11

l.s.c. multifunction, 10
Lagrange function, 150, 184
Lagrangian, 150, 184
LICQ, 153, 308
  generalized, 169
limit sets
  Thibault’s, 3, 62
linearly surjective, 64
Lipschitz continuous, 11
Lipschitz l.s.c., 10
Lipschitz modulus, 1
Lipschitz rank, 1
Lipschitz u.s.c., 10

Lipschitzian increasing, 19
local Ekeland-point, 38
local inverse, 34
locally function, 128
locally bounded, 2
locally compact, 2
locally Lipschitz, 1
locally u.L., 6, 193
locally upper Lipschitz, 6, 193
locally upper Lipschitz at a set, 13
logarithmic barrier function, 277
lower Hausdorff-limit, 11
lower semicontinuous, 10

Mangasarian-Fromovitz constraint qualification, 7, 49
map
  marginal, 222
  Newton, 122
  of normals, 108
  projection, 108
marginal function, 36
marginal map, 222
matrix
  exposed, 114
method
  Wilson’s, 284
metrically regular, 12
MFCQ, 7, 49, 308
  extended, 57
  generalized strict, 170
  strict, 31, 308
direction, 49
Minkowski operations, 1
monotone NCP, 5, 236
monotone NCP-function, 238
multifunction, see multivalued map
  Berge-u.s.c., 10
  calm, 13
  closed, 2
  inverse, 2
  l.s.c., 10
  Lipschitz continuous, 11
  Lipschitz l.s.c., 10
  Lipschitz u.s.c., 10
  local inverse, 34
  locally bounded, 2
  locally compact, 2
  locally u.L., 6, 193
  locally upper Lipschitz, 6, 193
  locally upper Lipschitz at a set, 13
  lower semicontinuous, 10
  metrically regular, 12
  open with linear rate, 13
  partially invertible, 42
  pointwise Lipschitz, 10
  polyhedral, 108
  proper near a point, 39
  proto-differentiable, 251
  pseudo-Lipschitz, 6
  pseudo-regular, 7
  quasi-Lipschitz, 50
  strongly regular, 7, 61
  u.s.c., 10
  upper regular, 7, 63
  upper regular at a set, 13
  upper semicontinuous, 10
multivalued map, 2

Nash equilibrium, 159
NCP, see nonlinear complementarity problem
  monotone, 5, 236
  standard, 5, 236
  strongly monotone, 5, 236
NCP function, 5, 236
near , 1
Newton function, 122
Newton map, 122, 258
Newton method, 32, 257
nonlinear complementarity problem, 5
nonsmooth analysis, xi
normal
  66
  vertical, 67
  vertical zero-, 67
  67
normal cone, 158

open mapping theorem, 85
openness with linear rate, 13
optimality conditions, 25, 33, 47

parametric program, 184
parametric nonlinear program, 184
parametric program
  with additional canonical perturbations, 185
  with canonical perturbations, 185
partial C-derivative, 121
partial T-derivative, 117
partial inverse, 42
partially invertible, 42
function, 5
penalty function, 94, 276
persistent regularity, 72
piecewise function, 5
pNCP function, 239
point
  critical, 150, 152
  KKT, 150
point-to-set distance, 1
pointwise Lipschitz multifunction, 10
polyhedral multifunction, 108
polyhedral set, 108
program
  parametric 184
  parametric nonlinear, 184
  pseudo-regular, 187
  strongly regular, 187
projection, 108
proper near a point, 39
proto-differentiable, 251
pseudo-Lipschitz, 6
pseudo-regular, 7
pseudo-regular linear systems, 29
pseudo-regular program, 187
pseudo-smooth function, 4, 127

quasi-Lipschitz, 50

Rademacher’s theorem, 4
rank
  Lipschitz, 1
  of regularity, 7
regularity, 6
  persistent with respect to G, 72
  rank of, 7

selection property, 202
semismooth, 125, 260
simple function, 117
SOC, 190
solution
  stationary, 150, 152
SQP-methods, 275
SSOC, 190
stability system, 168
standard cone constraints, 50
standard NCP, 5, 236
stationary solution, 150, 152
strict graphical derivative, 3
strict MFCQ, 31, 308
strongly monotone NCP, 5, 236
strongly monotone NCP-function, 237
strongly regular, 7, 61
strongly regular program, 187
strongly semismooth, 260
strongly stable in Kojima’s sense, 189
subdifferential
  38
  Clarke-, 3
  convex, 3

T-stability system, 168
theorem
  Gauvin’s, 309
  Gordan’s, 308
  Graves-Lyusternik, 10, 85
  invariance of domain, 97
  open mapping, 85
  Rademacher’s, 4
Thibault derivative, 3, 62, 165
Thibault’s limit set, 3

u.s.c. multifunction, 10
uniform rank of Lipschitz l.s.c., 34
upper Hausdorff-limit, 11
upper regular, 7, 63
upper regular at a set, 13

upper regular linear systems, 29
upper semicontinuous, 10

variational analysis, xi
variational principle
  Ekeland’s, 37, 303
vertical normal, 67
vertical zero-normals, 67

Wilson’s method, 284

67