Preface

The study of solution methods for nonlinear boundary value problems must be described as one of the most important and stimulating research topics in Applied Mathematics and Numerical Analysis, both for its beautiful theory and its interesting applications. The first generally available studies dealing with solution methods for nonlinear problems in function space were the landmark publications by Kantorovich from the 1930s, some of which were later translated from Russian into various languages. This basic work started a school on nonlinear functional analysis methods. Among others, gradient methods and the Newton or, as it is often called, the Newton–Kantorovich method were studied. In these early publications, little concern was devoted to solution methods for the linear systems arising in the methods, presumably because the intention was to use analytical approximations and the arising linear systems had small size.

Since the arrival of electronic computers in the 1950s, tremendous progress has been made in iterative solution methods, in particular using various preconditioning methods. Incidentally, the Newton method can be described as a variable preconditioning method and can be coupled with generalized iterative solution methods. One such iterative method is the conjugate gradient method, which was first studied in Hilbert function space by Hayes in the 1950s and later extended by Daniel to the conjugate gradient method for the solution of nonlinear operator problems.

Although some important previous work has been done on defining preconditioners via approximating boundary value operators, to date the study of preconditioners has been carried out mostly algebraically. In the present book, the authors successfully take up the challenge to develop in a systematic way such a functional analytic framework for the construction of preconditioners. Thereby they give a proper general background on how to understand and improve preconditioners for elliptic boundary value problems, both in continuous function space and in the finite dimensional spaces arising after proper discretization of the differential operators. Their approach provides a natural way to prove spectral equivalence between the pairs of preconditioning and given operators, i.e. that the spectral condition number is bounded uniformly with respect to the discretization (mesh) parameter. In doing so they can also prove mesh independent convergence of the methods, such as the Newton method, in a general way.

The monograph gives a connection between Sobolev space theory and the iterative solution methods required for the actual numerical solution of nonlinear boundary value problems. In particular, a general presentation of how to understand the behaviour of preconditioners and how to improve them is given. Both nonlinear operators and nonlinear boundary conditions are analysed. Furthermore, both theory and applications are presented. Although the presentation is limited to boundary value problems of elliptic type, it is likely to be of interest also for other related problems such as variational inequality problems and parabolic problems. The major emphasis in the book is on the relation between preconditioners and some general properties of elliptic operators in function space.

This monograph can be expected to be important both for researchers involved in the analysis and development of such solution methods and for practitioners who will use the methods for their particular applications.

February 2002
Owe Axelsson, University of Nijmegen, Netherlands
Contents

Introduction

I Motivation

1 Nonlinear elliptic equations in model problems
  1.1 Elasto-plastic torsion of rods
  1.2 Electromagnetic field theory (nonlinear Maxwell equations)
  1.3 Nonlinear elasticity
  1.4 Elasto-plastic bending of clamped plates
  1.5 Semilinear equations
  1.6 Some other examples
      1.6.1 Flow models
      1.6.2 Non-potential problems
  1.7 Weak formulations

2 Linear algebraic systems
  2.1 Conditioning of systems of linear algebraic equations
      2.1.1 Well-posedness
      2.1.2 The condition number and its properties
      2.1.3 Computer implementation
      2.1.4 An example of improved conditioning
      2.1.5 Some iterative methods and their convergence
  2.2 Problems with large condition numbers
      2.2.1 Discretized elliptic problems
      2.2.2 Difficulties arising from large condition numbers
  2.3 Preconditioning of linear algebraic systems
      2.3.1 Preconditioning as the main tool of improving the condition number
      2.3.2 Spectral equivalence and preconditioning
      2.3.3 Some important preconditioning techniques

3 Linear elliptic problems
  3.1 Some properties of linear operators in Hilbert space
      3.1.1 Energy spaces
      3.1.2 Spectral equivalence and contractivity
  3.2 Well-posedness of linear elliptic problems
      3.2.1 Weak solutions
      3.2.2 Regularity
  3.3 Standard solution methods for some linear elliptic problems
      3.3.1 Finite element discretization
      3.3.2 Efficient solution algorithms for general linear elliptic problems
      3.3.3 Fast solvers for problems in special form
  3.4 Iteration and preconditioning in Sobolev space
      3.4.1 The condition number of linear operators
      3.4.2 Preconditioned gradient type methods and mesh independence
      3.4.3 Some remarks on two-sided preconditioning

4 Nonlinear algebraic systems and preconditioning
  4.1 The condition number of nonlinear algebraic systems
  4.2 Some iterative methods for nonlinear algebraic systems
      4.2.1 Gradient type methods
      4.2.2 Newton's method
  4.3 Preconditioning for nonlinear algebraic systems
      4.3.1 Preconditioned simple iterations for nonlinear algebraic systems
      4.3.2 Variable preconditioning and Newton's method
      4.3.3 The problem of preconditioning for discretized nonlinear elliptic equations

II Theoretical background

5 Nonlinear equations in Hilbert space
  5.1 Potentials and monotone operators
  5.2 Iterative methods for smooth mappings
      5.2.1 Simple iterations (gradient method)
      5.2.2 Newton-like methods
  5.3 Preconditioning by linear operators in Hilbert space
      5.3.1 Definition and properties of the condition number
      5.3.2 Fixed preconditioning operators
      5.3.3 Variable preconditioning operators
  5.4 Preconditioning and gradients

6 Solvability of nonlinear elliptic problems
  6.1 Some properties of the generalized differential operators
  6.2 Weak solutions
      6.2.1 General well-posedness results
      6.2.2 Various solvability theorems
      6.2.3 Non-potential problems
  6.3 Qualitative properties
      6.3.1 Regularity of the solution
      6.3.2 Positivity of the solution
  6.4 Examples

III Iterative solution of nonlinear elliptic boundary value problems

7 Iterative methods in Sobolev space
  7.1 Simple iterations
      7.1.1 Second order Dirichlet problems
      7.1.2 Mixed and higher order problems
  7.2 Newton-like methods
      7.2.1 The general damped Newton algorithm
      7.2.2 Second order problems: inner-outer iterations
      7.2.3 Second order problems: variable preconditioning
      7.2.4 Other problems
  7.3 Preconditioning and Sobolev gradients
  7.4 Some more iterative methods in Sobolev space
      7.4.1 The nonlinear conjugate gradient method
      7.4.2 Frozen coefficient iterations
      7.4.3 Double Sobolev gradients
      7.4.4 Symmetric part preconditioning
      7.4.5 Some remarks on multistep iterations

8 Preconditioning strategies for discretized nonlinear elliptic problems based on preconditioning operators
  8.1 Some general properties of the Sobolev space preconditioners
      8.1.1 Stiffness matrices: solvability, updating and structure characteristics
      8.1.2 Mesh independent conditioning properties
  8.2 Various preconditioning strategies based on the Sobolev space background
      8.2.1 Discrete Laplacian preconditioner
      8.2.2 Constant coefficient preconditioners
      8.2.3 Separable preconditioners
      8.2.4 Linear principal part preconditioner
      8.2.5 Frozen coefficient preconditioner
      8.2.6 Initial shape preconditioners
      8.2.7 Preconditioners using domain decomposition
      8.2.8 Diagonal coefficient preconditioners
      8.2.9 Double Sobolev gradient preconditioner
      8.2.10 Symmetric part preconditioners
      8.2.11 Incorporating boundary conditions
      8.2.12 Non-injective problems
      8.2.13 Discrete biharmonic preconditioner
      8.2.14 Double diagonal coefficient preconditioners
      8.2.15 Decoupled Laplacian preconditioners for systems

9 Algorithmic realization of iterative methods based on preconditioning operators
  9.1 General algorithms
      9.1.1 Simple iterations
      9.1.2 Newton-like iterations
      9.1.3 Some convergence results
  9.2 Finite element realization
      9.2.1 The gradient–finite element method
      9.2.2 The Newton–finite element method
      9.2.3 Convergence estimates and mesh independence
      9.2.4 Multilevel type iterations
  9.3 Some remarks on the use of Laplacian preconditioners
  9.4 On time-dependent problems

10 Some numerical algorithms for nonlinear elliptic problems in physics
  10.1 Elasto-plastic torsion of rods
  10.2 Electromagnetic field equation
  10.3 Elasto-plastic bending of clamped plates
  10.4 Nonlinear elasticity
  10.5 Radiative cooling
  10.6 Electrostatic potential in a ball

11 Appendix
Introduction

The study of nonlinear elliptic problems is motivated by the wide scope of scientific areas where such problems occur, and raises the demand for their efficient numerical solution. For a clearer exposition of the approach of this book, first we sketch briefly the general steps of describing phenomena in those fields of science where computational mathematics is involved.

The description of a real-life phenomenon generally requires the knowledge of its characteristic parameters (such as concentration, temperature, etc.). That is, one needs a procedure that yields the quantitative values of the parameters. This problem is usually solved via a modelling process which comprises the construction of a sequence of different models. The first step of this modelling process is an approximation of the original phenomenon that results in a physical (chemical etc.) model, based on suitable laws and neglecting less important factors. Then this is turned into a mathematical model, consisting of governing equations whose solution describes the unknown parameters in the studied domain. In most real-life cases these equations cannot be solved explicitly, hence we apply some approximation method which yields the construction of a numerical model. These models generally lead to linear or nonlinear algebraic systems, and can mostly be solved only with computers. This last procedure results in the computer model. The steps of this modelling process are shown in Figure 1.

    Physical (chemical etc.) model of the real phenomenon
        ↓
    Mathematical model (Operator equation)
        ↓
    Numerical model (System of algebraic equations)
        ↓
    Computer model (The solution of the algebraic problem with computer)

Figure 1: Steps of modelling: the process of solving real problems.

This presentation suggests a basic principle for constructing computational methods, as one incorporates properties arising at different levels of this diagram into the method. Namely, the higher the level one can rely on, the closer one gets back to the original phenomenon, and hence the more natural is the way in which the constructed method may fit the original problem.
The type of mathematical model that forms the main scope of this book is the class of nonlinear elliptic boundary value problems. Such problems arise in various fields of science. Most of these models are encountered in different areas of physics, for instance in elasticity, electromagnetics or flow models. Besides, some others are related to chemical and biological phenomena.

The numerical solution of elliptic problems has been a subject of extended investigation in the past decades. Both for linear and nonlinear elliptic problems, the widespread approach of numerical solution is 'discretization plus iteration'. That is, one first discretizes the problem, most frequently using the finite element, finite difference or finite volume method. Then the solution of the obtained system of linear or nonlinear algebraic equations is obtained by some iterative method. Linear problems are also mentioned here because their study helps in the nonlinear case in two respects: first, linear problems often give analogy and motivation for nonlinear ones; second, the iterative methods for nonlinear problems generally consist of sequences of auxiliary linear problems.

The crucial point in the iterative solution of the discretized elliptic problems is most often preconditioning. This principle has turned out to be generally valid in the case of linear algebraic systems, as pointed out in the works of Axelsson, Evans, Kuznetsov and other authors. Preconditioning essentially means a suitable choice of matrices for the auxiliary linear systems that the iteration consists of, and this concept is equally relevant for handling linear and nonlinear problems if the matrices are allowed to vary stepwise. In the nonlinear case Newton-like iterations are widespread, in which the auxiliary matrices are close to the Jacobians of the nonlinear system. In this case the problem of solving the auxiliary systems is often handled by inner iterations, and one encounters again the problem of preconditioning. Altogether, preconditioning plays a key role for nonlinear problems as well. Moreover, preconditioners have to satisfy two basic requirements, namely, simple solvability and improved convergence. Since these goals are typically conflicting, there is no general rule as to how the preconditioners should be chosen.

The key role of preconditioning underlines the importance of choice in computational methods. Generally speaking, the computer solution process of a real-life problem consists of two parts after the numerical model has been established: the construction of the numerical algorithm and the computer coding. In the presence of many known efficient methods, a basic question is which numerical method to choose. In our case this key question is more specific: which preconditioners should be chosen in the iterative process, i.e. what auxiliary linear systems should the iteration consist of? If these are properly chosen, then one is in a position to realize an efficient computer method, relying on the supply of standard solvers and packages.

A possible basis for the choice of methods is the theoretical background that is used to handle the mathematical model. This has often proved very favourable, since the mathematical properties of the original phenomenon can be naturally incorporated into the numerical method. Schematically, one can thus rely on the second level of the diagram in Figure 1 instead of the third one. In the case of discretized elliptic problems one can rely on the functional analytic background, and this is natural since the original boundary value problem on the continuous level is set up in an infinite dimensional Sobolev space.
Accordingly, this tool has appeared in many authors' works since the results of Kantorovich and his school. As a recent application of functional analytic theory, we point out the Sobolev gradient approach developed by Neuberger, which yields an organized use of descent methods for PDEs. Related ideas are also used in the so-called $H^1$-methods. The relevance of the functional analytic background in numerics is also demonstrated by the works of Glowinski, in particular for elliptic problems.

For a basic motivating example let us turn again to the case of linear problems. A general approach for the efficient preconditioning of linear elliptic problems has been developed using the continuous level in the papers of Concus and Golub, Gunn, D'yakonov, Elman and Schultz, Faber, Manteuffel and Parter, Widlund and other authors. The proposed preconditioner for a discretized linear elliptic problem is the discretization of another linear elliptic problem that can be solved in an easier way, in particular by some fast solver. The idea of this preconditioning is obtained from the conditioning properties of the elliptic operators themselves in the corresponding Sobolev space. In particular, the latter provides mesh independent condition numbers of the derived preconditioned matrices, characterized in the papers of Manteuffel and his co-authors.

This preconditioning approach is a motivation for the present book, namely, the above idea can be extended to the nonlinear case. This kind of preconditioning appears in the context of equivalent stiffness matrices in the papers of Axelsson and Maubach, Rossi and Toivanen, and others. The Sobolev gradient approach of Neuberger leads to preconditioning operators derived from the corresponding Sobolev norm. Following these ideas, our aim is to set up an organized way of preconditioning for nonlinear elliptic problems.

The main goal of this book is to develop the framework of preconditioning operators for discretized nonlinear elliptic problems, which means that the proposed preconditioning matrices are the discretizations of suitable linear elliptic operators. In other words, the preconditioner is the projection of a preconditioning operator from the Sobolev space into the same discretization subspace as was used for the original nonlinear problem. Accordingly, the iterative sequence for the discretized problem is the similar projection of a theoretical iteration, defined in the Sobolev space, into the discretization subspace. This means that the preconditioning operator is chosen for the boundary value problem itself on the continuous level, before discretization is executed. In this way the properties of the original problem can be exploited more directly than with usual preconditioners defined for the discretized system. Schematically, using Figure 1 as before, the framework of preconditioning operators relies on the second level of the diagram instead of the third one.

The above idea can be illustrated in the following way, considering one-step iterations and allowing the preconditioners to vary stepwise. Let us consider a nonlinear boundary value problem

    $T(u) = g$.     (1)

The standard way of numerical solution consists of 'discretization plus iteration'. That is, one first discretizes the problem in some subspace $V_h$ and obtains a nonlinear algebraic system $T_h(u_h) = g_h$. Then one looks for positive definite matrices $A_h^{(n)}$ ($n \in \mathbb{N}$) and one defines the iterative sequence $\{u_h^{(n)}\}_{n \in \mathbb{N}}$ in $V_h$ with these matrices as (variable) preconditioners, as shown by Figure 2.

    $T(u) = g$
        ↓ (discretization)
    $T_h(u_h) = g_h$  →  $u_h^{(n+1)} = u_h^{(n)} - (A_h^{(n)})^{-1} \, (T_h(u_h^{(n)}) - g_h)$

Figure 2: discretization plus iteration.

The idea of preconditioning operators means that one first defines suitable linear elliptic differential operators $S^{(n)}$ and a sequence $\{u^{(n)}\}_{n \in \mathbb{N}}$ with these operators as preconditioners in the corresponding Sobolev space, then one proposes the preconditioning matrices

    $A_h^{(n)} = S_h^{(n)}$     (2)

for the iteration in $V_h$, which means that the preconditioning matrices are obtained using the same discretization for the operators $S^{(n)}$ as was used to obtain the system $T_h(u_h) = g_h$ from problem (1). (This way of derivation is indicated in the notation $S_h^{(n)}$.) The above process is shown schematically in Figure 3.
    $T(u) = g$  →  $u^{(n+1)} = u^{(n)} - (S^{(n)})^{-1} \, (T(u^{(n)}) - g)$
        ↓ (discretization)
    $u_h^{(n+1)} = u_h^{(n)} - (S_h^{(n)})^{-1} \, (T_h(u_h^{(n)}) - g_h)$

Figure 3: iteration plus discretization: the preconditioning operator idea.

As a matter of course, both approaches yield the same type of sequence running in $V_h$, and the difference lies in the special choice of preconditioners in the second case. On the other hand, although in Figure 3 the discretization parameter $h$ is fixed, the idea remains just the same if $h$ is redefined in each step $n$, i.e. in the setting of a multilevel or projection-iteration method. Preconditioners obtained from preconditioning operators will sometimes be called Sobolev space preconditioners.
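To make the diagram of Figure 3 concrete, the following minimal sketch (our own illustration, not taken from the book) carries out the lower-level iteration for a one-dimensional model problem $-(g(|u'|)\,u')' = f$ with homogeneous Dirichlet conditions, using a fixed preconditioning operator $S = -d^2/dx^2$, i.e. the discrete Laplacian $S_h$. The material law $g$ and the damping parameter below are illustrative assumptions only.

```python
# A minimal sketch (our own illustration) of the "iteration plus
# discretization" idea of Figure 3 in one space dimension:
#   solve  -(g(|u'|) u')' = f  on (0,1),  u(0) = u(1) = 0,
# by the preconditioned simple iteration
#   u^{(n+1)} = u^{(n)} - tau * S_h^{-1} (T_h(u^{(n)}) - f_h),
# where S_h is the discrete Laplacian, i.e. the discretization of the
# preconditioning operator S = -d^2/dx^2 on the same grid.
import numpy as np

N = 100                       # number of interior grid points
h = 1.0 / (N + 1)

def g(r):                     # illustrative material law with 0 < mu1 <= g <= mu2
    return 1.0 + 1.0 / (1.0 + r**2)

f = 2.0 * np.ones(N)          # right-hand side (e.g. 2*omega in a torsion model)

# Tridiagonal discrete Laplacian: the fixed Sobolev space preconditioner S_h
S = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2

def T_h(u):                   # discrete nonlinear operator via midpoint fluxes
    ue = np.concatenate(([0.0], u, [0.0]))   # append Dirichlet boundary values
    du = np.diff(ue) / h                     # u' on the N+1 element midpoints
    return -np.diff(g(np.abs(du)) * du) / h  # -(g(|u'|) u')' at the grid points

u = np.zeros(N)
tau = 0.7                     # damping, roughly 2/(mu1 + mu2) for this g
for n in range(200):
    r = T_h(u) - f
    if np.linalg.norm(r) < 1e-10 * np.linalg.norm(f):
        break
    u -= tau * np.linalg.solve(S, r)
print(f"stopped after {n} steps, max u ≈ {u.max():.6f}")
```

The point of the construction is that $S_h$ is assembled once, from the preconditioning operator alone, while the nonlinearity enters only through residual evaluations; spectral equivalence of $S$ with the derivative of the nonlinear operator is what keeps the iteration count bounded.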
We note that, clearly, the preconditioning operator approach does not include important preconditioning methods which rely on different matrix techniques.

The advantages of the preconditioning operator idea appear in both areas that are involved in the requirements of good preconditioners.

• Since the auxiliary systems are discretizations of suitable linear elliptic operators, their solution relies on a highly developed background: there are various efficient standard solvers, ranging from general multigrid type algorithms to fast and direct methods developed for special equations. This is an aspect which has already made the idea work in the mentioned case when linear problems are reduced to other linear elliptic equations. In addition, the construction of the preconditioners is obtained directly from the underlying operator, without studying the actual form of the discretized system.

• The convergence properties of the theoretical sequence in the Sobolev space give an asymptotic bound for those of the discretized problems, hence the latter have mesh independent condition numbers. Moreover, one can obtain a priori bounds for these by carrying out analytic estimates, without using the discretized systems.

The intention of this book is to demonstrate that preconditioning operators yield an organized framework for a class of preconditioners in the majority of widespread iterative methods. For this purpose we consider a range of problems for which we attempt to give a clear presentation, going through from the theoretical basis to algorithms for model problems. The considered nonlinear elliptic problems have a convex potential, i.e. the solutions are minimizers of suitable energy type functionals, and accordingly, preconditioning is based on spectral equivalence. (Non-potential problems are only mentioned.) Further, among iterative methods we consider one-step processes (sometimes completed with inner iterations), and within these we focus on simple and Newton-like iterations. The former are detailed for the sake of understanding, since for simple iterations the idea of preconditioning operators can be presented more directly. Some other iterative methods, not of these two types, are also treated in connection with the Sobolev space background, but many considerations for them are not detailed. Similarly, numerical test results are given as an illustration in some typical cases, whereas the analogous implementation in the other cases is left to the reader.
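The second advantage listed above can be made tangible with a quick experiment (again our own illustration, continuing the 1D sketch given earlier): running the Laplacian-preconditioned simple iteration on successively refined grids, the number of steps needed to reach a fixed tolerance settles to a constant independent of the mesh size.

```python
# Mesh independence experiment (our own illustration): iteration counts of
# the Laplacian-preconditioned simple iteration for -(g(|u'|) u')' = 2 on
# (0,1) should stay essentially constant under mesh refinement.
import numpy as np

def g(r):
    return 1.0 + 1.0 / (1.0 + r**2)

def steps(N, tau=0.7, tol=1e-8):
    h = 1.0 / (N + 1)
    S = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2
    f = 2.0 * np.ones(N)
    u = np.zeros(N)
    for n in range(1, 500):
        ue = np.concatenate(([0.0], u, [0.0]))
        du = np.diff(ue) / h
        r = -np.diff(g(np.abs(du)) * du) / h - f
        if np.linalg.norm(r) <= tol * np.linalg.norm(f):
            return n
        u -= tau * np.linalg.solve(S, r)
    return n

for N in (25, 50, 100, 200):
    print(f"N = {N:4d}:  {steps(N)} iterations")
```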
The structure of the book is based on the above ideas. The book consists of three parts: Part I gives motivation for our subject in different respects, Part II contains the required theoretical background, and the main ideas of the book are presented in Part III. The chapters of the book are devoted to the following topics.

In Part I the model problems of Chapter 1 both show the wide scope and represent typical kinds of nonlinear elliptic equations. This is followed by a summary on linear equations, involving algebraic systems and elliptic problems in Chapters 2 and 3, respectively, with focus on preconditioning. Chapter 2 includes the discussion of computer realization. In Chapter 3 the Sobolev space background also plays a central role, and the motivating results on preconditioning operators are summarized here. Chapter 4 contains a brief summary on iterations for nonlinear algebraic systems and, as a conclusion, demonstrates the importance of finding good preconditioners.

Part II starts with Chapter 5, which summarizes the underlying Hilbert space theory involving monotone and potential operators. The framework of variable preconditioning is also developed here. Chapter 6 deals with nonlinear elliptic problems from the aspect of solvability: the discussion of existence and uniqueness, based on the corresponding results of the previous chapter, both helps the understanding of the nature of these equations and gives a starting point for the study of convergent iterations.

Part III consists of Chapters 7–10: referring to Figure 3, Chapter 7 is devoted to the upper level of the diagram, whereas Chapters 8–10 concern the lower level. That is, Chapter 7 summarizes iterative methods in Sobolev space: as mentioned above, a detailed presentation is given for simple and Newton-like iterations, and some other iterative methods are sketched. The focus is on preconditioning, which is even used to give a common framework for these two types of method via the Sobolev gradient idea. Chapter 8 first summarizes briefly some general properties of the derived preconditioning methods for the discretized problems, then a list of various preconditioners is presented using the preconditioning operator background. The latter makes this chapter a central part of the book. The given preconditioners basically rely on spectral equivalence. Chapter 9 gives algorithmic realization and convergence results for the numerical methods based on preconditioning operators. Special attention is devoted to FEM realization, which is the most natural choice for our setting owing to the Sobolev space background. Finally, in Chapter 10 we return to the model problems of Chapter 1 and, as an illustration, we give algorithms for some of them using the previous results. A brief appendix in Chapter 11 gives some background information.

As shown by the above structure, we can approach the developed iterative methods from three levels: Hilbert space theory, Sobolev space results and then the discretized problems. The Sobolev space level is basic for the understanding of the approach, since the next step to the discretized level is standard, as shown by Figure 3 (and detailed in Chapter 9). For the better understanding of the Sobolev space part, the main results in Chapters 6–7 are formulated for several distinct types of boundary value problems. The division is kept as similar as possible in the different chapters within the demands of presentation, and is mostly suited to the examples of Chapter 1. The following table summarizes this structure concerning solvability, simple and Newton-like iterations in Sobolev space. (Other iterative methods and some non-potential equations are mentioned in these chapters for 2nd order mixed problems of different form.)

    Problem type                      Solvability    Simple iterations    Newton-like iterations
    2nd order Dirichlet or
      mixed problems                  Theorem 6.5    Theorems 7.1, 7.2    Theorems 7.7, 7.9, 7.10
    2nd order problems of 3rd type    Theorem 6.5    Theorem 7.3          Theorem 7.8
    2nd order mixed systems           Theorem 6.4    Theorem 7.4          Theorem 7.8
    2nd order Neumann problems        Theorem 6.7    Theorem 7.5          Theorem 7.8
    4th order Dirichlet problems      Theorem 6.8    Theorem 7.6          Theorems 7.8, 7.11
The corresponding realizations of these Sobolev space iterations for discretized problems in Chapter 9 are detailed for 2nd order mixed problems. For the other problems they are analogous and can be derived in an obvious way. The above structure of theorems is completed by returning to the examples of Chapter 1, to which the given division of problems is suited. These examples are also dealt with concerning solvability in Chapter 6 and in the model algorithms in Chapter 10. (A similar table to the one above will be given on this at the end of Chapter 1.)

This monograph is intended not only for researchers in numerical analysis but for all those interested in any step of the process from mathematical modelling to computer realization for real-life problems. The reader of this book is expected to be familiar with the basic level of the finite element method and functional analysis. The references given in subsection 3.3.1 and at the beginning of Chapter 5 may provide help in this matter in relation to elliptic problems.

Acknowledgements. We dedicate this book to our colleague and former teacher László Czách, who inspired our interest in mathematical analysis together with generations of Hungarian mathematicians and, as a member of Kantorovich's school, gave us a basic motivation for using functional analysis in applied mathematics.

We express our special thanks to Owe Axelsson for the many stimulating discussions on iterative methods and also for pointing out important results related to our topic. Moreover, his encouragement has given us a significant impulse to go on confidently with our subject. We also wish to thank John W. Neuberger for the valuable discussions which reinforced the importance of the analytic basis.

We are grateful to our lead editor Daniele Funaro for the careful handling of the manuscript and his valuable suggestions on our material. We also greatly appreciate the comments of the unknown reviewers on the first draft of our book. We thank NOVA Science Publishers for the opportunity of writing this monograph and, in particular, Marcin Paprzycki for the promotion of the publication process.

The second author is grateful to the Hungarian research fund AMFK (Alapítvány a Magyar Felsőoktatásért és Kutatásért) for the Magyary Zoltán Scholarship, which provided him full support for the time of research during the writing of this book.
Part I

Motivation
Chapter 1

Nonlinear elliptic equations in model problems

Nonlinear elliptic problems arise in various mathematical models connected to diverse applications. Such models are mostly encountered in different fields of physics, and some others are related to chemical and biological phenomena. In this chapter some of the most important elliptic model problems are described briefly. Our aim is to illustrate the wide scope of the class of elliptic problems in nonlinear models, which motivates the investigation of efficient numerical solution methods. The equations are described in strong form, and their weak formulation is summarized separately in the last section.

The ellipticity of these equations means that their principal part can be written in a strictly positive quasilinear form. In the case of second order equations this form is given as

    $\sum_{i,j=1}^{N} a_{ij}(x, u, \nabla u)\, \dfrac{\partial^2 u}{\partial x_i \partial x_j},$     (1.1)

where $\{a_{ij}(x, s, \eta)\}_{i,j=1}^{N}$ is a strictly positive definite matrix for any variables $x \in \Omega \subset \mathbb{R}^N$, $s \in \mathbb{R}$, $\eta \in \mathbb{R}^N$. The studied problems will be written in divergence form.

The main theoretical feature connected to ellipticity is that the generalized differential operators corresponding to these problems (except for those in subsection 1.6.2) have a convex potential, i.e. the solutions are minimizers of suitable energy type functionals. This property will be used in the theoretical discussion on solvability and conditioning in Part II, where the potentials for the studied classes of problems will also be given. (We note that the convexity of the potential corresponds to the monotonicity of the generalized differential operator. In the non-potential case this is replaced by monotonicity in principal part. The monotone potential operator property will in fact be used in a form which relies on the Gateaux differentiability of the generalized differential operator; see the discussion at the beginning of Chapter 5.)

For convenience of reading we note that when coordinates are not used, the brief notation for a vector of $\mathbb{R}^N$ will be $\mathbf{x}$ in this chapter, but in the further parts of the book we will only write $x$, which will cause no ambiguity.

The mentioned examples represent typical types of elliptic problems to which the discussion in this book is suited. To help understanding, this structure is sketched in
a table at the end of this chapter which gives the occurrence of the main examples or corresponding types of problems in the book.
1.1 Elasto-plastic torsion of rods

Our first example in this chapter is the elasto-plastic torsion of a hardening rod. Since its background involves typical physical considerations that lead to a nonlinear boundary value problem, we give more details here than for the further examples and sketch how the BVP is derived.

The mathematical model of plastic state under plane deformation conditions was first given by Saint-Venant [262]. The model of elasto-plastic torsion is given below in the hardening state following the presentation of Kachanov [159]. For further details see [190, 215].

Let us consider a hardening rod with cross-section $\Omega \subset \mathbb{R}^2$, the lower end of the rod being clamped in the $(x, y)$-plane. The aim is to determine the tangential stress in the points of the rod under given torsion. The notations of this section are as follows: $(x, y) \in \mathbb{R}^2$ stands for the plane variable, subscripts like $u_x$ mean coordinates, and partial derivatives are denoted by $\partial/\partial x$ etc. as in (1.1).

In the Saint-Venant model of torsion one assumes that the cross-sections experience rigid rotation in their planes and are distorted in the direction of the $z$-axis. Denoting by $\omega > 0$ the torsion per unit length of the rod, the displacements $u_x$, $u_y$ and $u_z$ are then given by

    $u_x = -\omega z y, \quad u_y = \omega z x, \quad u_z = w(x, y, \omega)$     (1.2)

where $w$ is the distortion of a cross-section. Consequently, the shear strain vector $\gamma$ and the tangential stress vector $\tau$ act in cross-sections parallel to the $(x, y)$-plane, i.e. we can write

    $\tau = (\tau_x, \tau_y), \quad \gamma = (\gamma_x, \gamma_y)$;

further, the shear strain satisfies

    $\gamma_x = \dfrac{\partial u_x}{\partial z} + \dfrac{\partial u_z}{\partial x} = \dfrac{\partial w}{\partial x} - \omega y, \quad \gamma_y = \dfrac{\partial u_y}{\partial z} + \dfrac{\partial u_z}{\partial y} = \dfrac{\partial w}{\partial y} + \omega x.$     (1.3)

The equations (1.3) imply the continuity condition

    $\dfrac{\partial \gamma_y}{\partial x} - \dfrac{\partial \gamma_x}{\partial y} = 2\omega.$     (1.4)

Further, for the tangential stress there holds the equilibrium equation

    $\dfrac{\partial \tau_x}{\partial x} + \dfrac{\partial \tau_y}{\partial y} = 0.$     (1.5)

Hence we can introduce the stress function $u$ fulfilling

    $\tau_x = \dfrac{\partial u}{\partial y}, \quad \tau_y = -\dfrac{\partial u}{\partial x}.$     (1.6)

The surface of the rod is free of normal stresses, i.e.

    $\tau \cdot \nu = 0 \quad$ on $\partial\Omega$     (1.7)

where $\nu$ denotes the outward normal direction, and therefore by (1.6) the tangential derivative of $u$ vanishes, i.e. we have

    $u|_{\partial\Omega} \equiv \mathrm{const}.$     (1.8)

The condition of the hardening state involves the single curve model, wherein the connection between strain and stress depends only on the strain and stress intensities

    $\Gamma = \big(\gamma_x^2 + \gamma_y^2\big)^{1/2}, \quad T = \big(\tau_x^2 + \tau_y^2\big)^{1/2}.$     (1.9)

Moreover, $T$ is a strictly increasing function of $\Gamma$ defined below a certain strain $\Gamma_*$ (the end of validity of the elasto-plastic model), for which crack of the material first occurs. The stress–strain function is usually written in the form

    $T = g(\Gamma)\,\Gamma$     (1.10)

where the decreasing function $g$ is called the modulus of plasticity. The inverse is expressed in the similar product form

    $\Gamma = g(T)\,T,$     (1.11)

and, since by Hencky's relations the strain and stress vectors are parallel, there holds also

    $\gamma_x = g(T)\,\tau_x, \quad \gamma_y = g(T)\,\tau_y.$     (1.12)

According to the above, the increasing function $g$ is also defined in a bounded validity interval $[0, T_*]$ where $T_* = g(\Gamma_*)\Gamma_*$. We require that $g \in C^1[0, T_*]$; then its described properties can be summarized by the inequalities

    $0 < \mu_1 \le g(T) \le (g(T)\,T)' \le \mu_2 \quad (T \in [0, T_*])$     (1.13)

with suitable constants $\mu_1, \mu_2$ independent of $T$.

Summing up, the relations that determine the tangential stress are (1.4), (1.5), (1.7) and (1.12), written briefly as

    $\mathrm{rot}\,\gamma = 2\omega, \quad \mathrm{div}\,\tau = 0, \quad \gamma = g(T)\,\tau \quad$ in $\Omega$,
    $\tau \cdot \nu = 0 \quad$ on $\partial\Omega$.     (1.14)

Substituting (1.12) into (1.4) and using (1.6), we obtain

    $-\dfrac{\partial}{\partial x}\Big(g(T)\,\dfrac{\partial u}{\partial x}\Big) - \dfrac{\partial}{\partial y}\Big(g(T)\,\dfrac{\partial u}{\partial y}\Big) = 2\omega, \quad$ where $\ T = |\tau| = |\nabla u|.$     (1.15)

Since $u$ is only determined up to an additive constant, the boundary value in (1.8) may be chosen 0. Hence the discussed model leads to the nonlinear Dirichlet boundary value problem

    $-\dfrac{\partial}{\partial x}\Big(g(|\nabla u|)\,\dfrac{\partial u}{\partial x}\Big) - \dfrac{\partial}{\partial y}\Big(g(|\nabla u|)\,\dfrac{\partial u}{\partial y}\Big) = 2\omega,$
    $u|_{\partial\Omega} = 0$     (1.16)

or, written briefly,

    $-\mathrm{div}\,(g(|\nabla u|)\,\nabla u) = 2\omega,$
    $u|_{\partial\Omega} = 0.$     (1.17)

If this is solved for $u$ then the required tangential stress is obtained from (1.6).
Remark 1.1 The boundary value problem of torsion can also be formulated in terms of the distortion $w$ of a cross-section, introduced in (1.2). (See [190].) First, (1.10) and (1.12) imply

    $\tau_x = g(\Gamma)\,\gamma_x, \quad \tau_y = g(\Gamma)\,\gamma_y,$     (1.18)

and here (1.3) yields

    $\Gamma = \Big( \Big(\dfrac{\partial w}{\partial x} - \omega y\Big)^2 + \Big(\dfrac{\partial w}{\partial y} + \omega x\Big)^2 \Big)^{1/2}.$     (1.19)

The equilibrium equation (1.5) implies

    $\dfrac{\partial}{\partial x}\big(g(\Gamma)\,\gamma_x\big) + \dfrac{\partial}{\partial y}\big(g(\Gamma)\,\gamma_y\big) = 0,$     (1.20)

further, from (1.7) and (1.12),

    $\gamma \cdot \nu = 0 \quad$ on $\partial\Omega$.     (1.21)

Using (1.3), the equations (1.20) and (1.21) yield the nonlinear Neumann boundary value problem

    $-\dfrac{\partial}{\partial x}\Big(g(\Gamma)\Big(\dfrac{\partial w}{\partial x} - \omega y\Big)\Big) - \dfrac{\partial}{\partial y}\Big(g(\Gamma)\Big(\dfrac{\partial w}{\partial y} + \omega x\Big)\Big) = 0,$
    $\Big(\dfrac{\partial w}{\partial x}\,\nu_1 + \dfrac{\partial w}{\partial y}\,\nu_2\Big)\Big|_{\partial\Omega} = \omega(y\nu_1 - x\nu_2).$     (1.22)

Letting $\mathbf{x} = (x, y)$ and introducing the function $r(\mathbf{x}) = (\omega y, -\omega x)$, problem (1.22) can be written briefly as

    $-\mathrm{div}\,\big(g(|\nabla w - r(\mathbf{x})|)\,(\nabla w - r(\mathbf{x}))\big) = 0,$
    $(\nabla w - r(\mathbf{x})) \cdot \nu\,\big|_{\partial\Omega} = 0.$     (1.23)
1.2 Electromagnetic field theory (nonlinear Maxwell equations)

The problem of electromagnetic field in devices has been widely investigated, see Glowinski–Marrocco [134], Křížek–Neittaanmäki [188] and the references there. Here we consider the two-dimensional case in cross-sections. The arising boundary value problem is derived from the nonlinear Maxwell equations, and for a brief formulation we denote by $x \in \mathbb{R}^2$ the plane variable.

The 2D electromagnetic potential in some device $\Omega \subset \mathbb{R}^2$ is described by the boundary value problem

    $-\mathrm{div}\,(b(x, |\nabla u|)\,\nabla u) = \rho(x),$
    $u|_{\partial\Omega} = 0,$     (1.24)

where the scalar-valued function $b : \Omega \times \mathbb{R}^+ \to \mathbb{R}$ describes magnetic reluctance and $\rho \in L^2(\Omega)$ is the electric current density. The function $b$ is measurable and bounded w.r. to $x$ and $C^1$ w.r. to the variable $r$; further, it satisfies

    $0 < \mu_1 \le b(x, r) \le \dfrac{\partial}{\partial r}\big(r\, b(x, r)\big) \le \mu_2 \quad (x \in \Omega,\ r > 0)$     (1.25)

with constants $\mu_2 \ge \mu_1 > 0$ independent of $(x, r)$.

Problem (1.24) can be obtained from the nonlinear Maxwell equations

    $\mathrm{rot}\,H = \rho, \quad \mathrm{div}\,B = 0 \quad$ in $\Omega$,
    $B \cdot \nu = 0 \quad$ on $\partial\Omega$

and the relation $H = b(x, |B|)\,B$, where $H$ and $B$ are the magnetic field and induction, respectively. The calculation that yields (1.24) is the same as for (1.17) using (1.14).

Typically, the function $b$ is independent of $x$ in some subdomain and constant on the complement, where these subdomains correspond to ferromagnetic and other media, respectively. That is,

    $b(x, r) = a(r)$ if $x \in \Omega_1$, $\qquad b(x, r) = \alpha$ if $x \in \Omega \setminus \Omega_1$,

where $\Omega_1 \subset \Omega$ is a given subdomain, $\alpha > 0$ is a constant and $a \in C^1(\mathbb{R}^+)$ satisfies

    $0 < \mu_1 \le a(r) \le a(r) + a'(r)\,r \le \mu_2 \quad (r \ge 0)$     (1.26)

with constants $\mu_2 \ge \mu_1 > 0$ independent of $r$.

Problem (1.24) describes the electromagnetic potential in an isotropic material. In the case of an anisotropic material the scalar coefficient $b(x, |\nabla u|)$ is replaced by a matrix-valued coefficient $B(x, |\nabla u|)$.

Examples. The following function $a$, which satisfies (1.26), characterizes the reluctance of stator sheets in the cross-sections of an electrical motor in the case of isotropic media [134, 188]:

    $a(r) = \dfrac{1}{\mu_0}\Big( \alpha + (1 - \alpha)\,\dfrac{r^8}{r^8 + \beta} \Big) \quad (r \ge 0).$     (1.27)

Here $\mu_0$ is the vacuum permeability and $\alpha, \beta > 0$ are characteristic constants. (Concrete values of these will be cited and the related conditioning properties discussed in subsection 8.2.7 and section 10.2.)

Another example for (1.26) in a somewhat similar form is

    $a(r) = 1 - (c - d)\,\dfrac{1}{r^2 + c} \quad (r \ge 0),$     (1.28)

where $c > d > 0$ are constant. The corresponding boundary value problem describes magnetostatic field [75, 238].
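As a quick plausibility check (our own illustration, not part of the book), the ellipticity bounds (1.26) for the stator-sheet reluctance (1.27) can be sampled numerically: both $a(r)$ and $(r\,a(r))' = a(r) + a'(r)\,r$ should stay between two positive constants. The values of $\alpha$ and $\beta$ below are placeholders, not the measured motor data cited in [134, 188].

```python
# Numerically sample the bounds in (1.26) for the reluctance (1.27):
#   a(r) = (1/mu0) * (alpha + (1 - alpha) * r^8 / (r^8 + beta)).
# alpha and beta are illustrative placeholder constants only.
import numpy as np

mu0 = 4e-7 * np.pi          # vacuum permeability
alpha, beta = 3e-4, 1.6e4   # placeholder characteristic constants

def a(r):
    return (alpha + (1.0 - alpha) * r**8 / (r**8 + beta)) / mu0

r = np.linspace(0.0, 10.0, 200001)
ra = r * a(r)
d_ra = np.gradient(ra, r[1] - r[0])   # numerical (r a(r))' = a(r) + a'(r) r

mu1 = min(a(r).min(), d_ra.min())     # lower ellipticity bound
mu2 = max(a(r).max(), d_ra.max())     # upper ellipticity bound
print(f"mu1 ≈ {mu1:.4g}, mu2 ≈ {mu2:.4g}, ratio mu2/mu1 ≈ {mu2/mu1:.4g}")
```

The large ratio $\mu_2/\mu_1$ produced by such reluctance curves is precisely what makes the choice of preconditioner delicate for this problem; this is taken up again in subsection 8.2.7 and section 10.2.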
1.3 Nonlinear elasticity

The problem of elasticity of a body $\Omega \subset \mathbb{R}^3$ with nonlinear behaviour of the material can be described with the aid of the displacement vector $u : \Omega \to \mathbb{R}^3$, the strain tensor $\varepsilon : \Omega \to \mathbb{R}^{3\times 3}$ and the stress tensor $\sigma : \Omega \to \mathbb{R}^{3\times 3}$ (see Blaheta [45], Nečas–Hlaváček [224]). Using the notation $x \in \mathbb{R}^3$ for the space variable, the basic system of equations is

    $-\mathrm{div}\,\sigma_i = \varphi_i(x) \quad$ in $\Omega$,
    $\sigma_i \cdot \nu = \tau_i(x) \quad$ on $\Gamma_N$, $\qquad u_i = 0 \quad$ on $\Gamma_D$ $\qquad (i = 1, 2, 3),$     (1.29)

where $\sigma_i = (\sigma_{i1}, \sigma_{i2}, \sigma_{i3})$ ($i = 1, 2, 3$) is the $i$th row of the matrix $\sigma$, the functions $\varphi : \Omega \to \mathbb{R}^3$ and $\tau : \Gamma_N \to \mathbb{R}^3$ describe the body and boundary force vectors, respectively; further, $\partial\Omega = \Gamma_N \cup \Gamma_D$ is a disjoint measurable subdivision and $\Gamma_D \neq \emptyset$.

The problem (1.29) can be formulated as a second order system in terms of the displacement $u$. First, the strain tensor $\varepsilon = \varepsilon(u)$ is determined by the displacement via the relation

    $\varepsilon(u) = \dfrac{1}{2}\big(\nabla u + \nabla u^t\big)$

where $\nabla u^t(x)$ denotes the transpose of the matrix $\nabla u(x) \in \mathbb{R}^{3\times 3}$ for $x \in \Omega$. The connection of strain and stress is given by a matrix-valued function $T$ as follows. For any $\Theta \in \mathbb{R}^{3\times 3}$ let

    $\mathrm{vol}\,\Theta = \dfrac{1}{3}\,\mathrm{tr}\,\Theta \cdot I, \qquad \mathrm{dev}\,\Theta = \Theta - \mathrm{vol}\,\Theta,$     (1.30)

where $\mathrm{tr}\,\Theta = \sum_{i=1}^{3} \Theta_{ii}$ is the trace of $\Theta$ and $I$ is the identity matrix. Using these notations, there holds

    $\sigma(x) = T(x, \varepsilon(u(x)))$     (1.31)

with $T : \Omega \times \mathbb{R}^{3\times 3} \to \mathbb{R}^{3\times 3}$ given by

    $T(x, \Theta) = 3k(x, |\mathrm{vol}\,\Theta|^2)\,\mathrm{vol}\,\Theta + 2\mu(x, |\mathrm{dev}\,\Theta|^2)\,\mathrm{dev}\,\Theta \quad (x \in \Omega,\ \Theta \in \mathbb{R}^{3\times 3}),$     (1.32)

where $k(x, s)$ is the bulk modulus of the material and $\mu(x, s)$ is Lamé's coefficient. Here the functions $k, \mu : \Omega \times \mathbb{R}^+ \to \mathbb{R}$ are measurable and bounded w.r. to $x$ and $C^1$ w.r. to the variable $s$; further, they satisfy

    $0 < \mu_0 \le \mu(x, s) < \dfrac{3}{2}\,k(x, s) \le k_0,$
    $0 < \delta_0 \le \dfrac{\partial}{\partial s}\big(k(x, s^2)\,s\big) \le \tilde\delta_0, \qquad 0 < \delta_0 \le \dfrac{\partial}{\partial s}\big(\mu(x, s^2)\,s\big) \le \tilde\delta_0$     (1.33)

with constants $\mu_0, k_0, \delta_0, \tilde\delta_0$ independent of $(x, s)$. Then, substituting (1.31) into (1.29), we obtain the system

    $-\mathrm{div}\, T_i(x, \varepsilon(u)) = \varphi_i(x) \quad$ in $\Omega$,
    $T_i(x, \varepsilon(u)) \cdot \nu = \tau_i(x) \quad$ on $\Gamma_N$, $\qquad u_i = 0 \quad$ on $\Gamma_D$ $\qquad (i = 1, 2, 3).$     (1.34)
1.4 Elasto-plastic bending of clamped plates

The elasto-plastic bending of a clamped thin plane plate $\Omega \subset \mathbb{R}^2$ is described by a fourth order nonlinear Dirichlet boundary value problem (Langenbach [190], Nečas–Hlaváček [224], Mikhlin [215]). It can be derived from the 3D elasticity system, given in the previous section, after neglecting the direction orthogonal to the plate. The following presentation of the problem is based on [215]. Using now for convenience the notation $(x, y) \in \mathbb{R}^2$ for the plane variable, the deflection $u$ must satisfy

    $\dfrac{\partial^2}{\partial x^2}\Big[ g(E(D^2 u))\Big(\dfrac{\partial^2 u}{\partial x^2} + \dfrac{1}{2}\,\dfrac{\partial^2 u}{\partial y^2}\Big)\Big] + \dfrac{\partial^2}{\partial x \partial y}\Big[ g(E(D^2 u))\,\dfrac{\partial^2 u}{\partial x \partial y}\Big] + \dfrac{\partial^2}{\partial y^2}\Big[ g(E(D^2 u))\Big(\dfrac{\partial^2 u}{\partial y^2} + \dfrac{1}{2}\,\dfrac{\partial^2 u}{\partial x^2}\Big)\Big] = \alpha(x),$
    $u|_{\partial\Omega} = \dfrac{\partial u}{\partial \nu}\Big|_{\partial\Omega} = 0$     (1.35)

where $\alpha(x)$ is proportional to the external normal load per unit area, $g$ depends on the given material and

    $E(D^2 u) = \Big(\dfrac{\partial^2 u}{\partial x^2}\Big)^2 + \dfrac{\partial^2 u}{\partial x^2}\,\dfrac{\partial^2 u}{\partial y^2} + \Big(\dfrac{\partial^2 u}{\partial y^2}\Big)^2 + \Big(\dfrac{\partial^2 u}{\partial x \partial y}\Big)^2,$     (1.36)

where

    $D^2 u = \begin{pmatrix} \dfrac{\partial^2 u}{\partial x^2} & \dfrac{\partial^2 u}{\partial x \partial y} \\[2mm] \dfrac{\partial^2 u}{\partial x \partial y} & \dfrac{\partial^2 u}{\partial y^2} \end{pmatrix}$     (1.37)

is the Hessian of a function $u \in C^2(\Omega)$. The boundary condition expresses that the plate is rigidly clamped at its edge. The material function $g \in C^1(\mathbb{R}^+)$ satisfies the inequalities

    $0 < \mu_1 \le g(r) \le \mu_2,$
    $0 < \mu_1 \le (g(r^2)\,r)' \le \mu_2$     (1.38)

with suitable constants $\mu_1, \mu_2$ independent of the variable $r > 0$.
Let us introduce the following notations. For $u \in C^2(\Omega)$ we set

    $\tilde{D}^2 u = \begin{pmatrix} \dfrac{\partial^2 u}{\partial x^2} + \dfrac{1}{2}\,\dfrac{\partial^2 u}{\partial y^2} & \dfrac{1}{2}\,\dfrac{\partial^2 u}{\partial x \partial y} \\[2mm] \dfrac{1}{2}\,\dfrac{\partial^2 u}{\partial x \partial y} & \dfrac{\partial^2 u}{\partial y^2} + \dfrac{1}{2}\,\dfrac{\partial^2 u}{\partial x^2} \end{pmatrix},$

and for any matrix-valued function $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in C^2(\Omega, \mathbb{R}^{2\times 2})$ let

    $\mathrm{div}^2 \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \dfrac{\partial^2 a}{\partial x^2} + \dfrac{\partial^2 b}{\partial x \partial y} + \dfrac{\partial^2 c}{\partial y \partial x} + \dfrac{\partial^2 d}{\partial y^2}.$     (1.39)

Then problem (1.35) is written briefly as

    $\mathrm{div}^2\big( g(E(D^2 u))\,\tilde{D}^2 u \big) = \alpha(x),$
    $u|_{\partial\Omega} = \dfrac{\partial u}{\partial \nu}\Big|_{\partial\Omega} = 0.$     (1.40)

The clamping of the plate is expressed by the second boundary condition $(\partial u/\partial \nu)|_{\partial\Omega} = 0$. We note that in the case of a freely supported plate, this condition is replaced by

    $(\tilde{D}^2 u)\,\nu \cdot \nu\,\big|_{\partial\Omega} = 0.$     (1.41)
1.5 Semilinear equations

Semilinear equations arise in various models, most often connected with nonlinear diffusion processes (see e.g. [175, 233]). These equations have a linear principal part, i.e. the nonlinearity appears in the zeroth-order term. We list some typical problems in different contexts.

(a) Diffusion-kinetic enzyme problems

Steady-state diffusion problems for enzyme-catalyzed reactions are described e.g. in Keller [175], Murray [219]. Denoting by $u \ge 0$ the steady-state concentration of the substrate in a cell $\Omega \subset \mathbb{R}^3$ whose surface is a semi-permeable membrane, the governing equation and corresponding boundary condition are

    $-\mathrm{div}\,(d(x)\,\nabla u) + \dfrac{u}{\varepsilon u + k} = 0,$
    $\Big[\, d(x)\,\dfrac{\partial u}{\partial \nu} + h(x)\,(u - u_0(x)) \,\Big]\Big|_{\partial\Omega} = 0,$     (1.42)

where $d(x) > 0$ is the molecular diffusion coefficient of the substrate in a medium containing some continuous distribution of bacteria, $k > 0$ is the Michaelis constant and $\varepsilon > 0$, $h(x) > 0$ is the permeability of the membrane, $u_0(x) > 0$ is the external concentration of substrate. We assume that the coefficients $d$, $h$ and $u_0$ are $C^{1,\alpha}$ for some $\alpha > 0$. Here the nonlinearity

    $r(u) = \dfrac{u}{\varepsilon u + k} \quad (u > 0)$     (1.43)

describes the rate of the enzyme-substrate reaction by the Michaelis–Menten rule. (Other rates than (1.43) may also arise in different reactions [219].)

(b) Radiative cooling

The steady-state temperature distribution in various radiating bodies or gases is described by the problem

    $-\mathrm{div}\,(\kappa(x)\,\nabla u) + \sigma(x)\,u^4 = 0,$
    $\Big[\, \kappa(x)\,\dfrac{\partial u}{\partial \nu} + \alpha(x)\,(u - \tilde{u}(x)) \,\Big]\Big|_{\partial\Omega} = 0,$     (1.44)

where $u \ge 0$ is the unknown temperature in the body $\Omega$ which is radiating on the surface, and the boundary condition there is given by Newton's law [175]. Here $\kappa(x) > 0$ is the thermal conductivity, $\sigma(x) > 0$ is the Boltzmann factor, $\alpha(x) > 0$ is the heat transfer coefficient and $\tilde{u}(x) > 0$ is the external temperature. We assume that the coefficients $\kappa$, $\sigma$, $\alpha$ and $\tilde{u}$ are $C^{1,\alpha}$ for some $\alpha > 0$. A modification of (1.44) is obtained if the boundary condition is given by Stefan's law:

    $\kappa(x)\,\dfrac{\partial u}{\partial \nu} + \alpha(x)\,\big(u^4 - \tilde{u}(x)^4\big) = 0 \quad$ on $\partial\Omega$.

(c) Autocatalytic chemical reactions

Reaction-diffusion equations and systems form a wide area in mathematical models [58, 139, 267]. Elliptic problems arise as the steady-states of reaction-diffusion processes, and here the nonlinearities in the lower order terms describe the rate of the reaction. For instance, the problem

    $-\Delta u + u^p = 0,$
    $u|_{\partial\Omega} = 1$     (1.45)

in a domain $\Omega \subset \mathbb{R}^2$ with some $p \ge 1$ describes a chemical reaction-diffusion process where the reaction is autocatalytic, i.e. the growth of the concentration $u \ge 0$ speeds up the rate of the reaction [88].

(d) Electrostatic potentials

The electrostatic potential $u$ in a charged body $\Omega \subset \mathbb{R}^3$ is described by the problem

    $-\Delta u + e^u = 0,$
    $u|_{\partial\Omega} = 0,$     (1.46)

see [188]. The modification of problem (1.46) with the nonlinearity $2\sinh u$ instead of $e^u$ and inhomogeneous data describes the electrostatic potential in van Roosbroeck's drift-diffusion model (see e.g. [126, 254]).
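As a small illustration of how semilinear problems of this type are typically handled numerically (this sketch is ours, not taken from the book), Newton's method for a one-dimensional analogue of (1.46), $-u'' + e^u = 0$ on $(0,1)$ with $u(0) = u(1) = 0$, looks as follows; each step solves a linear system with the Jacobian $-\Delta_h + \mathrm{diag}(e^{u})$, which stays symmetric positive definite because $e^u > 0$.

```python
# A minimal sketch (our own illustration) of Newton's method for the 1D
# analogue of (1.46):  -u'' + exp(u) = 0 on (0,1), u(0) = u(1) = 0,
# discretized by central finite differences.
import numpy as np

N = 99
h = 1.0 / (N + 1)
L = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2  # discrete -u''

u = np.zeros(N)                        # initial guess
for it in range(20):
    F = L @ u + np.exp(u)              # nonlinear residual T_h(u)
    J = L + np.diag(np.exp(u))         # Jacobian T_h'(u), SPD since exp(u) > 0
    du = np.linalg.solve(J, F)
    u -= du
    if np.linalg.norm(du, np.inf) < 1e-12:
        break
print(f"converged in {it + 1} Newton steps, min u ≈ {u.min():.6f}")
```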
1.6 Some other examples

1.6.1 Flow models

(a) Subsonic potential flow

The behaviour of potential flows has been studied in several works, see e.g. [24, 57, 257, 231] and the references there. Following [24], the subsonic potential flow in a wind tunnel section $\Omega \subset \mathbb{R}^2$ is described by the boundary value problem

    $-\mathrm{div}\,(\varrho(|\nabla u|^2)\,\nabla u) = 0 \quad$ in $\Omega$,
    $\varrho(|\nabla u|^2)\,\dfrac{\partial u}{\partial \nu} = \gamma(x) \quad$ on $\Gamma_N$,
    $u = v_\infty(x) \quad$ on $\Gamma_D$     (1.47)

with the nonlinearity

    $\varrho(|\nabla u|^2) = \varrho_\infty \Big( 1 + \dfrac{1}{5}\big(M_\infty^2 - |\nabla u|^2\big) \Big)^{5/2},$     (1.48)

where $M_\infty$ is the Mach number at infinity, $\varrho_\infty$ the air density at infinity, and $\Gamma_N$ consists of disjoint subparts $\Gamma_N^{(0)}$ and $\Gamma_N^{(1)}$ (the sides and the end of the wind tunnel section, respectively) such that

    $\gamma(x) = 0$ on $\Gamma_N^{(0)}$, $\qquad \gamma(x) = c_\infty$ on $\Gamma_N^{(1)}$,

where the constant $c_\infty > 0$ is the wind outlet velocity; further, $v_\infty$ describes the wind inblow. The flow is subsonic if $\sup_\Omega |\nabla u| < 1$, in which case the equation is elliptic. (Otherwise the flow is transonic and the equation is hyperbolic in the subdomains where $|\nabla u|$ exceeds 1.)

The subsonic potential flow around a non-lifting aerofoil $S$ is given by the same equation with only Neumann boundary conditions, i.e.

    $-\mathrm{div}\,(\varrho(|\nabla u|^2)\,\nabla u) = 0 \quad$ in $\Omega$,
    $\varrho(|\nabla u|^2)\,\dfrac{\partial u}{\partial \nu} = \gamma(x) \quad$ on $\partial\Omega$     (1.49)

with the nonlinearity (1.48), where now $\Omega \subset \mathbb{R}^2$ is the region between the aerofoil $S$ and an artificial far-field boundary $\Gamma_\infty$, and the boundary value $\gamma$ satisfies

    $\gamma(x) = 0$ on $S$, $\qquad \gamma(x) = \gamma_\infty$ on $\Gamma_\infty$,

where $\gamma_\infty$ is the normal component of the velocity. (See [257] and the references there.)

(b) Electric field in dielectric fluids

The model of electrorheological fluids under the action of a non-constant electric field has been elaborated in [63]. The quasi-static electric field $E$ in a newtonian dielectric fluid can be obtained as $E = -\nabla u$, where the potential $u$ satisfies

    $-\mathrm{div}\,\big((1 + c|\nabla u|^2)\,\nabla u\big) = 0,$
    $u|_{\partial\Omega} = \phi_0.$     (1.50)

The function $\phi_0 = \phi_0(x, t)$ depends also on the time $t$, i.e. $u$ is the solution of an elliptic problem for each fixed time level. Here $c = \chi_1/(\chi_0 + 1) > 0$ is a constant coming from the coefficients of the dielectric susceptibility relation $\chi_E = \chi_0 + \chi_1 |E|^2$.
1.6.2 Non-potential problems

The two problems in this subsection are non-potential in the sense that their solution is not the minimizer of a suitable functional, in contrast to all the other examples of the chapter. (The related theoretical background is discussed in sections 5.1 and 6.2.3; see some more details on the weak differential operator in subsection 7.4.4.) We note that for the semiconductor system in paragraph (b), this sense of non-potential property is independent of the fact that the unknown functions are potentials themselves for certain corresponding fields.

(a) Stationary heat conduction

The temperature $u$ in a steady state of heat conduction in a body $\Omega \subset \mathbb{R}^N$ ($N = 2$ or 3) with nonhomogeneous isotropic material and prescribed temperature on the boundary is described by the problem

    $-\mathrm{div}\,(a(x, u)\,\nabla u) = g(x) \quad$ in $\Omega$,
    $u|_{\partial\Omega} = \varphi(x),$     (1.51)

see e.g. [117]. Here the continuous function $a : \Omega \times \mathbb{R} \to \mathbb{R}$ describes the heat conduction properties of the material, depending also on the temperature. Accordingly, it satisfies

    $0 < \alpha \le a(x, s) \le \alpha' \quad (x \in \Omega,\ s \in \mathbb{R})$

with constants $\alpha, \alpha'$ independent of $(x, s)$. Further, the function $g$ gives the internal heat sources. If the heat flow is prescribed on a part $\Gamma_N$ of the boundary, then the boundary condition there is

    $a(x, u)\,\dfrac{\partial u}{\partial \nu}\Big|_{\Gamma_N} = \gamma(x).$

Further, in the case of an anisotropic material the function $a(x, s)$ is replaced by a positive definite matrix-valued function $A(x, s) = \{a_{ij}(x, s)\}_{i,j=1}^{N}$.

(b) Semiconductor equations

The behaviour of semiconductor devices is described by a system of three equations called van Roosbroeck's system (see Křížek–Neittaanmäki [188] and the references there, including modifications of this model). The three unknown functions are the electrostatic potential $u_0$ and the electrochemical potentials $u_1$ and $u_2$ of the holes and electrons, respectively. The electric field is obtained as $-\nabla u_0$ from the potential $u_0$. The system is as follows:

    $-\mathrm{div}\,(\varepsilon(x)\,\nabla u_0) = f(x) + e^{u_1 - u_0} - e^{u_0 + u_2}$
    $-\mathrm{div}\,(\mu_1(x)\,e^{u_1 - u_0}\,\nabla u_1) = k(u_1, u_2)\,(1 - e^{u_1 + u_2})$
    $-\mathrm{div}\,(\mu_2(x)\,e^{u_0 + u_2}\,\nabla u_2) = k(u_1, u_2)\,(1 - e^{u_1 + u_2})$ $\quad$ in $\Omega$,     (1.52)
    $\alpha(x)\,u_0 + \dfrac{\partial u_0}{\partial \nu} = \dfrac{\partial u_1}{\partial \nu} = \dfrac{\partial u_2}{\partial \nu} = 0 \quad$ on $\Gamma_N$,
    $u_i = \bar{u}_i(x) \quad$ on $\Gamma_D$,

where $f \in L^\infty(\Omega)$ is the net density of charges of ionized impurities, the continuous function $k > 0$ represents the rate of generation of holes and electrons, $\varepsilon \in L^\infty(\Omega)$ is the dielectric permittivity, and $\mu_1, \mu_2 \in L^\infty(\Omega)$ are the mobilities of holes and electrons, respectively. The coefficients $\varepsilon$, $\mu_1$, $\mu_2$ satisfy

    $\varepsilon(x) \ge m_0, \quad \mu_1(x) \ge m_0, \quad \mu_2(x) \ge m_0$

with some constant $m_0 > 0$ independent of $x$.
1.7 Weak formulations

In this section we give the weak formulations of the main examples which have so far been presented in strong form in this chapter. Since the reader is in general assumed to be familiar with weak formulations, we only go into some details where the symmetry of the weak form requires some calculations. Besides, for the examples not mentioned here the similar weak formulation itself is left to the reader.

The weak form of the boundary value problems is obtained in two steps. First, the strong form is transformed: we multiply it by an arbitrary test function $v$ (taken from the corresponding Sobolev space), we integrate and then apply the divergence theorem to halve the order of derivation in $u$. Second, we only require the weak solution to be in a Sobolev space for which the obtained expression makes sense. This means that for classical (regular) functions $u$, the left-hand side of the weak form coincides with the integral

    $\int_\Omega T(u)\,v,$

and a regular weak solution is a classical (strong) solution. On the other hand, the weak solution is generally not regular, in which case it fails to be a strong solution of the original problem.

(a) Elasto-plastic torsion

The weak formulation of problem (1.17) reads as follows: find $u \in H_0^1(\Omega)$ such that

    $\int_\Omega g(|\nabla u|)\,\nabla u \cdot \nabla v = 2\omega \int_\Omega v \quad (v \in H_0^1(\Omega)).$     (1.53)

(For regular functions $u \in H^2(\Omega) \cap H_0^1(\Omega)$, the above integral is obtained via multiplying (1.17) by $v \in H_0^1(\Omega)$, integration and the divergence theorem. The converse of these operations shows in this case that if the weak solution is in $H^2(\Omega) \cap H_0^1(\Omega)$ then it is a strong solution as well.)
1.7. WEAK FORMULATIONS
31
these operations shows in this case that if the weak solution is in H 2 (Ω) ∩ H01 (Ω) then it is a strong solution as well.) (b) Electromagnetic field equation The weak formulation of problem (1.24) reads as follows: find u ∈ H01 (Ω) such that Z
Ω
b(x, |∇u|)∇u · ∇v =
Z
Ω
ρv
(v ∈ H01 (Ω)).
(It is derived similarly as (1.53).) (c) Nonlinear elasticity The Sobolev space corresponding to the boundary conditions of system (1.34) is 1 HD (Ω) := {u ∈ H 1 (Ω) : u|ΓD = 0}.
Then the weak formulation of system (1.34) reads as follows: find u = (u1 , u2 , u3 ) ∈ 1 HD (Ω)3 such that Z
Ω
T (x, ε(u)) · ε(v) =
Z
Ω
ϕ·v+
Z
ΓN
τ · v dσ
1 (v ∈ HD (Ω)3 )
(1.54)
or, with the representation (1.32) for T (x, ε(u)), Z Ω
3k(x, |vol ε(u)|2 ) vol ε(u) · vol ε(v) + 2µ(x, |dev ε(u)|2 ) dev ε(u) · dev ε(v) dx =
Z
Ω
ϕ · v dx +
Z
ΓN
τ · v dσ
1 (v ∈ HD (Ω)3 ).
Here the elementwise matrix product · is defined by A · B :=
N X
(A, B ∈ R3×3 ).
Aik Bik
i,k=1
(1.55)
1 For regular functions u ∈ (H 2 (Ω) ∩ HD (Ω))3 , the weak formulation is obtained via 1 multiplying the terms of the system (1.34) by vi ∈ HD (Ω) (i = 1, 2, 3), integration, summation and the divergence theorem. In this way we have
Z
Ω
T (x, ε(u)) · ∇v =
Z
Ω
ϕ·v+
Z
ΓN
τ · v dσ
1 (v ∈ HD (Ω)3 )
instead of (1.54). The latter can be obtained from here if we apply the following identities. 1 Proposition 1.1 For any u, v ∈ HD (Ω)3 there holds
vol ε(u) · ∇v = vol ε(u) · vol ε(v),
dev ε(u) · ∇v = dev ε(u) · dev ε(v).
32 CHAPTER 1. NONLINEAR ELLIPTIC EQUATIONS IN MODEL PROBLEMS Proof. The symmetry of the matrices vol ε(u) and dev ε(u) implies vol ε(u) · ∇v = vol ε(u) · ε(v),
dev ε(u) · ∇v = dev ε(u) · ε(v),
(1.56)
respectively. Further [47], the decomposition A = vol A + dev A
(1.57)
is orthogonal w.r. to the product · in the sense that arbitrary matrices A, B ∈ R3×3 satisfy vol A · dev B = 0, and this implies vol A · B = vol A · vol B and dev A · B = dev A · dev B. The latter and (1.56) imply the required identities. (d) Elasto-plastic bending of clamped plates The weak formulation of problem (1.40) reads as follows: find u ∈ H02 (Ω) such that Z 1Z 2 2 2 g(E(D u)) (D u · D v + ∆u ∆v) = αv 2 Ω Ω
(v ∈ H02 (Ω)),
(1.58)
where the elementwise matrix product · is defined by (1.55). For regular functions u ∈ H 4 (Ω) ∩ H02 (Ω), the weak formulation is obtained via multiplying problem (1.40) by v ∈ H02 (Ω), integration and the divergence theorem. In this way we have Z
Ω
˜ 2 u · D2 v = g(E(D2 u)) D
Z
Ω
(v ∈ H02 (Ω))
αv
instead of (1.58). The latter can be obtained from here by observing that ˜ 2 u = 1 D2 u + ∆u · I D 2
(where I ∈ R2×2 is the identity matrix) and I · D2 v = ∆v, which yield that ˜ 2 u · D2 v = 1 (D2 u · D2 v + ∆u ∆v). D 2 (e) Semilinear equations We give the weak formulation of the radiative cooling problem. The formulation for the other three semilinear problems in section 1.5 is analogous to this. The weak formulation of problem (1.44) reads as follows: find u ∈ H 1 (Ω) such that Z Ω
4
κ ∇u · ∇v + σu v +
Z
∂Ω
αuv dσ =
Z
∂Ω
α˜ uv dσ
(v ∈ H 1 (Ω)).
(For regular functions u ∈ H 2 (Ω), the above integral is obtained via multiplying (1.44) by v ∈ H 1 (Ω), integration and the divergence theorem.) Summary of the examples. The examples of this chapter have illustrated that nonlinear elliptic problems arise in various mathematical models connected to diverse
1.7. WEAK FORMULATIONS
33
applications. The wide scope of the class of elliptic problems in nonlinear models motivates the theoretical understanding of this kind of equations and, based on this, the development of efficient numerical solution methods. In addition, the mentioned examples represent typical types of elliptic problems to which the discussion in this book is suited. The structure of the studied types of problems has been summarized in the table at the end of the introduction. To help understanding, this structure is completed by the examples, hence they are mentioned separately in Chapter 6 (solvability) and most of them are also considered in Chapter 10 to illustrate algorithmic realization. The following table sketches the occurrence of the main examples or corresponding types of problems in the main contexts. (Namely, existence and uniqueness theorems are given in Chapter 6 for certain types of problems and applied to the examples in Propositions 6.1-6.4. Simple and Newton-like iterations in Sobolev space are formulated in the theorems of Chapter 7. Finally, some derived numerical algorithms for the examples are discussed in the distinct sections of Chapter 10.) Problem Existence Torsion (1.17) Th. 6.6 Magnetic (1.24) Cor. 6.1 Elasticity (1.34) Th. 6.4 Plate (1.40) Th. 6.8 Semilinear Th. 6.5 (1.42)–(1.46)
Example Simple iter. Newton Prop. 6.1 Th. 7.1 Ths. 7.7,9,10 Prop. 6.2 Prop. 6.3 Prop. 6.4
Th. 7.4 Th. 7.6 Th. 7.3
Th. 7.8 Ths. 7.8,11 Th. 7.8
Algorithm Sec. 10.1 Sec. 10.2 Sec. 10.4 Sec. 10.3 Sec. 10.5– Sec. 10.6
Table 1. The occurrence of the main examples or corresponding types of problems in the main contexts of the book. The above examples and studied methods form the main line of presentation, as explained in the introduction. In addition, problems (1.17) and (1.24) are also covered by the methods in subsections 7.4.1–7.4.3. Among the other model problems, the flow problem (1.47) is of the same type as (1.17) and (1.24); as an example it is mentioned in section 6.4 together with the dielectric fluid equation. Neumann problems, including the examples (1.23) and (1.49), are addressed in Theorem 6.7 for solvability and in Theorems 7.5 and 7.8 concerning iterations. Finally, the non-potential problems (1.51) and (1.52) are respectively involved in Theorems 6.9 and 6.10 for solvability, and in Theorems 7.15 and 7.13–7.14 concerning frozen coefficient iterations; both problems are covered by subsection 7.4.3 and problem (1.51) is also involved in subsection 7.4.4.
34 CHAPTER 1. NONLINEAR ELLIPTIC EQUATIONS IN MODEL PROBLEMS
Chapter 2 Linear algebraic systems The aim of this chapter is twofold. On the one hand, linear systems give motivation and analogy for the nonlinear case in the study of the properties and importance of conditioning. On the other hand, the iterative solution of nonlinear systems incorporates the solution of auxiliary linear systems, hence the latter has basic influence on the efficiency of the overall method. This chapter focuses on the importance of conditioning properties and preconditioning for linear algebraic systems, and also includes a brief discussion of computer realization. For comprehensive summaries on linear algebraic systems and their solution, the reader is e.g. referred to the monographs of Axelsson [11], Kelley [176], Molchanov [217], Varga [282], Young [288], Wilkinson [292, 293].
2.1 2.1.1
Conditioning of systems of linear algebraic equations Well-posedness
A system of linear algebraic equations is a widespread tool in the description of physical and other models. It generally has the form ˆx = ˆb, Aˆ
(2.1)
where Aˆ ∈ Rs×s and ˆb ∈ Rs are a given matrix and vector, respectively. The exact values of the entries of Aˆ and ˆb, however, are only known in some exceptional cases, i.e. we usually know only some approximation of the form Ax = b
(2.2)
with a given accuracy of data: kAˆ − Ak = k∆A k ≤ εA ,
kˆb − bk = k∆b k ≤ εb .
(2.3)
Therefore, a linear physical model is described by a family of systems of linear algebraic equations of the form (2.2)-(2.3). Consequently, for the system (2.1) with inaccurate data, the analysis of the solution cannot be replaced by the analysis of an individual 35
36
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS
problem (2.2) but only by the set of problems (2.2) with the data under the conditions (2.3). The requirement of well-posedness hereby means to guarantee the closeness of the solution of any problem in the set to the exact unknown solution xˆ of (2.1). An easy calculation shows that assuming the regularity of the matrices A and k∆A A−1 k < 1 ,
(2.4)
there holds the estimate kˆ x − xk ≤
kA−1 k (k∆b k + k∆A A−1 k) kbk 1 − k∆A A−1 k
(2.5)
which yields the continuous dependence of the solution on the input data. This means that under the condition (2.4) the problem (2.2) is well-posed. Assume that the relation k∆A kkA−1 k < 1 (2.6)
holds, that is, the condition (2.4) is satisfied. Then for the relative error we have the estimate ! kˆ x − xk kAkkA−1 k k∆A k k∆b k ≤ + , (2.7) bk kxk kAk kbk 1 − k∆ kbk moreover, this estimate is sharp on the class of regular matrices.
2.1.2
The condition number and its properties
As we have seen, even for well-posed systems with a fixed relative inaccuracy the relative error may be very large, depending on the number cond(A) = kAkkA−1 k, which is called condition number of the regular matrix A. Then the estimate (2.7) means cond(A) kˆ x − xk ≤ bk kxk 1 − k∆ kbk
!
k∆A k k∆b k + . kAk kbk
(2.8)
We list some important properties of the condition number (see e.g. [11]). • The condition number cond(A) depends on the chosen norm. • For all matrices cond(A) ≥ 1. • In Euclidean norm for any matrix A the condition number has the following relationship with its maximal and minimal eigenvalue: cond(A) ≥
|λmax | , |λmin |
(2.9)
and for symmetric matrices the equality holds, i.e., cond(A) =
|λmax | . |λmin |
(2.10)
2.1. CONDITIONING OF SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS 37 • For symmetric positive definite (SPD) matrices A the relations λmax =
hAx, xi , 2 x∈Rs \{0} kxk sup
λmin =
hAx, xi \{0} kxk2
inf s
x∈R
(2.11)
are valid. Therefore, having the estimate m≤
hAx, xi ≤M kxk2
(x ∈ Rs \ {0})
(2.12)
with some suitable constants 0 < m ≤ M < ∞, there holds cond(A) ≤
M . m
(2.13)
• Roughly speaking, an almost singular matrix is expected to have a large condition number and small determinant. Nevertheless, there is no direct proportion between the values of det(A) and cond(A), e.g. multiplying A by a constant changes the former but not the latter. An alternative condition number for a matrix A ∈ Rs×s , which also takes det(A) into account, is the K-condition number trace(A)s (2.14) K(A) = det(A) (see [19]). The number
!
k∆A k k∆b k p := cond(A) , + kAk kbk
is called condition number of the system (2.2). The dramatic effect of a large condition number is shown by the following example of a system of linear algebraic equations [217]: 100x1 + 500x2 = 1700 15x1 + 75.01x2 = 255
(2.15) (2.16)
and a very slightly perturbed system of the form 100x1 + 500x2 = 1700 15x1 + 75.01x2 = 255.03.
(2.17) (2.18)
As one can verify, the exact solution of the first system is x1 = 17, x2 = 0, whereas for the second one we have the quite different result x1 = 2, x2 = 3. In virtue of (2.8), the explanation for this is the large condition number cond(A) = 2.6585 · 105 . In this case, in order to have a solution reasonably closer to the unperturbed one, we have to compensate the large condition number with much greater accuracy, i.e. we must restrict ourselves to much smaller perturbations. For instance, the system 100x1 + 500x2 = 1700 15x1 + 75.01x2 = 255.000003 has the exact solution x1 = 16.9985, x2 = 0.0003.
(2.19) (2.20)
38
2.1.3
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS
Computer implementation
Besides the inaccuracy in the mathematical model, some error may arise also from the computer implementation of the applied numerical method. Each computer has a word length consisting of the number of binary digits contained in each memory word, and this word length determines the number of digits that can be carried in the usual arithmetic, called single precision arithmetic, of the machine. On most scientific computers this is equivalent to a number of decimal digits between 7 and 14. Higher precision arithmetic (e.g. double precision arithmetic) can also be carried out. (For interest we mention that on many computers the double precision arithmetic is a part of the hardware and, compared to the single-precision arithmetic, only rarely requires twice as much time.) Due to the different kinds of errors (computer realization of the mathematical model, rounding errors during the computation, etc.), instead of x we get some other solution, denoted by xcomp in the sequel. Namely, round-off errors can affect the final computed result in different ways. First, during a sequence of millions of operations, each subject to a small error, there is the danger that these small errors will accumulate so as to eliminate much of the accuracy in the computed result. Moreover, the catastrophic cancellation error may happen due to the subtractions. This is one way in which an algorithm can be numerically instable. (However, as we have already seen, it is also possible that the results of a computation are completely erroneous of round-off errors even in case of only a small number of arithmetic operation steps if cond(A) is large. For some more interesting examples see [237].) Our aim is to analyse the closeness of x and xcomp , relying on Molchanov [217], Wilkinson [292, 293]. Using the so-called backward error analysis, for some numerical methods one can show that the round-off error has the same effect as that caused by perturbations to the original problem data. (Of course, special analytical tools are needed for different methods.) Assuming the existence of such analysis (which exists for the majority of solvers for systems of linear algebraic equations), we can write the equivalent perturbed equation for xcomp as follows: (A + δA)xcomp = b + δb
(2.21)
(A + F )xcomp = b,
(2.22)
or where the perturbations δA and δb (or F ) are the equivalent perturbations depending on the numerical method chosen, the order of the system, computer characteristics, etc. As before, the estimate kx − xcomp k cond(A) ≤ kxcomp k 1 − kδbk kbk
kδAk kδbk + kAk kbk
!
(2.23)
is valid. If the right side is small then the given algebraic system is called computationally well-conditioned. (In the opposite case, it is called computationally ill-conditioned.) It is worth mentioning that with the decrease of the perturbations δA and δb (i.e., with the increase of the computer word length), we can achieve the closeness of x and
2.1. CONDITIONING OF SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS 39 xcomp . (We note that, of course, this does not imply the same for xˆ and x.) This is illustrated by the example [217] 0.135x1 + 0.188x2 + 0.191x3 + 0.178x4 0.188x1 + 0.262x2 + 0.265x3 + 0.247x4 0.191x1 + 0.265x2 + 0.281x3 + 0.266x4 0.178x1 + 0.247x2 + 0.266x3 + 0.255x4
= 0.3516 = 0.4887 = 0.5105 = 0.4818
(2.24)
(2.25) having the exact solution x = [0.4, 0.5, 0.6, 0.5].
(2.26)
By using the Cholesky factorization on some computer with a fixed word length of 6 digits, the computer solution xcomp is as follows: xcomp = [−0.0378848, 0.764402, 0.710031, 0.434482].
(2.27)
Although the residual vector rcomp = b − Axcomp satisfies rcomp = −10−6 [0.245, 0.3506, 0.12562, 1.4556],
(2.28)
which is quite small, the difference between (2.26) and (2.27) is significant. At the same time, by increasing the length of the computer words up to 10 and using double precision arithmetic in the numerical algorithm, we get xcomp = [0.3995346979, 0.5002810367, 0.6001168362, 0.4999307078],
(2.29)
and by further increasing the length of the computer words up to 12 and using again double precision arithmetic in the numerical algorithm, we get xcomp = [0.4000000001, 0.4999999999, 0.6000000000, 0.5000000000].
(2.30)
We note that the computer realization of a decomposition A = QP , where Q and P are triangular or orthogonal matrices, corresponds to the main part of the decomposition A + F = QP
(2.31)
since the backward substitution is almost negligible [292, 293]. This fact enables us to check the closeness of x and xcomp , to improve the computer solution xcomp and also to give an estimate for cond(A). Namely, using (2.31), let us define the iteration (a)
x0 = 0,
δ 0 = xcomp ;
for any k = 0, 1, 2, . . . :
(b) rk = b − Axk ;
(QP )δ k = rk ;
xk+1 = xk + δ k . (2.32)
40
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS
The vectors δ s (so-called correction vectors) give information about the computer conditioning of the system: if this vector sequence is not decreasing or its decrease is slow then the system of linear algebraic equations under consideration is computationally ill conditioned. This phenomenon is shown by numerical experience, an exact treatment is found e.g. in Wilkinson [292, 293]. Now let us apply the algorithm (2.32) to xcomp from (2.27). Preserving the 6-digit length of the words, we obtain x1 x2 x3 x4 -0.037884 -0.514729 -1.302137 -3.15656 0.764402 1.052429 1.52790 2.64787 . 0.710031 0.829836 1.02789 1.49359 0.434782 0.363669 0.24625 -0.030055
(2.33)
The residual vectors have the following values: r1 −0.245 · 10−6 −0.3506 · 10−6 −0.12562 · 10−5 −0.14556 · 10−5
r2 0.5 · 10−6 −0.129 · 10−6 −0.316 · 10−6 −0.172 · 10−6
r3 0.346 · 10−5 0.463 · 10−5 , 0.4188 · 10−5 0.4297 · 10−5
(2.34)
and for the correction vectors we obtain δ1 δ2 δ3 0.476845 -0.787408 -1.854371 0.2880228 0.475488 1.119974 -0.119805 0.197993 0.465775 -0.071113 -0.117465 -0.276260 which shows the divergence of the iteration. If we increase the length up to 10 then we get the following solution: x1 x2 x3 0.3995346979 0.3999954658 0.4000000742 0.5002810367 0.5000027368 0.4999999552 0.6001168362 0.6000011469 0.5999999813 0.4999307078 0.4999993230 0.5000000111
(2.35)
modified values for the x4 0.4 0.5 0.6 0.5
(2.36)
and for the correction vectors: δ1 0.4607679081 · 10−3 0.2782980169 · 10−3 0.1159612523 · 10−3 0.6861518675 · 10−4
δ2 0.4608453927 · 10−5 −0.2781626695 · 10−5 −0.1159611323 · 10−5 0.6881067490 · 10−6
δ3 −0.7420070607 · 10−7 0.4480042552 · 10−7 . 0.1870018490 · 10−7 −0.1110010759 · 10−7
(2.37)
The vectors δ k (k = 1, 2, 3) show a proper behaviour and we observe that the third iteration x4 already gives the exact solution up to 10−12 .
2.1. CONDITIONING OF SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS 41 The vectors δ k can be also applied to estimate the condition number cond(A): if δ k are considerably decreasing then cond(A) ≈
kδ 1 k εkx1 k
(2.38)
(see [292, 293]), where ε denotes the computer zero. In our example, 10 kδ
1
k∞ 1 kx k∞
cond(A) ≈ 10
≈ 0.77 · 107 ,
while the exact value is cond(A) = 1.34 · 107 .
2.1.4
An example of improved conditioning
The following example is taken from [217]. Let us consider the system of linear algebraic equations (2.2) with
10 0.01 0 A = 0.01 0.001 0.1 , 0 0.1 1000
20.05 b = 0.125 . 1000.5
(2.39)
The exact solution is y = [2; 5; 1]T . The SPD matrix A has the eigenvalues λ1 = 0.000979999,
λ2 = 10.00001,
λ3 = 1000.00001,
(2.40)
hence A has the condition number cond(A) ≈ 1.0204 · 106 . Let us use the explicit one-step iterative method y k+1 − y k + Ay k = b (2.41) τ 2 ≈ 0.00199999806 (the so-called gradient method, with the optimal choice τ = λ1 +λ 3 see in the next subsection), which is theoretically a linearly convergent method with 1 the ratio of convergence q = λλ33 −λ ≈ 0.99999808. However, by the choice y 0 = 0, with +λ1 simple precision arithmetic, we get the ‘quasi-constant’ oscillating sequence y 2k
2.0004559 = 4.5465279 , 0.95203686
y 2k+1
2.0004559 = 4.5465374 1.0480337
(2.42)
(for all k = 1, 2, . . .), which is even far from exact the solution. At the same time, using the double precision arithmetic, after 4 · 106 iterations we get the numerical solution y 4000000
2.000001968 = 4.998032428 . 0.9996063277
(2.43)
This means that by increasing the computer capability and (extremely) the computer costs, one can achieve convergence in spite of the computational ill-conditioning of the system.
42
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS
Using some simple rearrangement one can rewrite the system of linear algebraic equations corresponding to (2.39) in the form Bx = ˜b,
(2.44)
where B ∈ Rs×s and ˜b ∈ Rs have the form
10 1 0 B = 1 10 1 , 0 1 10
25 b = 53 . 15
(2.45)
(This intuitive transformation corresponds to the multiplication of both sides of (2.2) with the choice (2.39) by the matrix D−1 = BA−1 .) The matrix B has the eigenvalues λ1 = 8.5857864,
λ2 = 10.0,
λ3 = 11.414214,
(2.46)
so B has the condition number cond(B) ≈ 1.3294. Using the same gradient method to the transformed system (2.44) (2.45) with zero initial vector and simple precision arithmetic, after the fifth iteration we obtain
2.0002003 y5 = 5.0001192 , 1.0001993
(2.47)
which is already sufficiently close to the exact value. Moreover, the rate of the convergence of the latter iteration is q ≈ 0.141421, that is, compared to the original one, it is reduced 7.07 times.
2.1.5
Some iterative methods and their convergence
In this subsection the most widespread iterative methods for SPD matrices are briefly mentioned. The main goal is to show the influence of cond(A) on the convergence. Comprehensive summaries on iterative methods for linear problems are found in the wide literature, e.g. in the monographs of Axelsson [11], Bruaset [60], Kelley [176], Young [288]. We consider systems of linear algebraic equations of the form (2.2) with an SPD matrix A. The basic iterative methods for its solution are the first order (one-step) iterative methods of the form D(xk+1 − xk ) = −τk rk ,
rk = Axk − b,
k = 0, 1, . . .
(2.48)
with some given initial vector x0 , SPD matrix D and numbers τk > 0. (If τk ≡ τ then the method is called stationary, otherwise non-stationary.) In this subsection we assume that D is the identity matrix, that is, we consider the iteration xk+1 = xk − τk (Axk − b),
k = 0, 1, . . .
(2.49)
If the estimate (2.12) is known then the stationary iterative method xk+1 = xk − τ (Axk − b),
k = 0, 1, . . .
(2.50)
2.1. CONDITIONING OF SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS 43 with the choice τ=
2 m+M
(2.51)
where m and M are from (2.12), is called gradient method or method of steepest descent. (The iterations (2.49)–(2.50) are sometimes also called simple/fixed point/Richardson iterations.) The method is linearly convergent with the quotient q=
M −m M/m − 1 = < 1. M +m M/m + 1
(2.52)
Clearly, the gradient method has the optimal order of convergence if the spectral bounds (2.12) are sharp, i.e. m = λmin and M = λmax . Then qopt =
cond(A) − 1 . cond(A) + 1
(2.53)
If we consider the variable choice of the parameters τk (that is, instationary iterative methods) by the Chebysev choice of the parameters, then the asymptotic rate of the convergence is √ √ M− m (2.54) q=√ √ M+ m in terms of the bounds in (2.12), and in the optimal case we have q
cond(A) − 1 qopt = q . cond(A) + 1
(2.55)
We note that for both methods we require the estimate (2.12), that is an estimate for the spectrum of A. If we do not have it, we can use the iteration (2.49) with the choice τk =
hrk , rk i . hArk , rk i
(2.56)
This method has the convergence rate of (2.53). The parameter choice in the Chebysev iterative method is very sensitive to the round-off error. In order to avoid this difficulty and preserve the convergence rate, we can apply different kinds of conjugate gradient methods (CGM). The CGM was first presented by Hestenes and Stiefel [150] and discussed in a general setting by Daniel [79]. Its three-term recurrence version is called Lanczos method [11, 290]. The CGM has become a particularly efficient tool for large-scale problems via suitable parallelization techniques (Brugnano–Marrone [61], Navon–Phua–Ramamurthy [222]). The theory and several versions of the CGM are discussed in detail in the monograph of Axelsson [11]. The standard CG algorithm constructs the sequence (xk ) (together with conjugate
44
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS
directions dk and the residuals rk ) as follows: (a) x0 ∈ Rs is arbitrary; r0 = −d0 = Ax0 − b; for any k = 1, 2, ... : if xk−1 , rk−1 , dk−1 are obtained, then (b1) γk = hAdk−1 , dk−1 i, αk = − γ1k hrk−1 , dk−1 i; k k−1 k−1
(b2)
(b3) (b4)
(b5)
r =r
+ αk Ad
;
xk = xk−1 + αk dk−1 ; βk =
1 hAdk−1 , rk i; γk
dk = −rk + βk dk−1 .
(2.57) In the CG algorithms the knowledge of the spectrum of the matrix A is not assumed, this possible information is used only for the convergence analysis. Namely, the CGM has the convergence quotient (2.54): √ √ M− m q=√ √ . M+ m Moreover, we note that it gives the exact solution in s steps where A ∈ Rs×s , but this result is only useful for small sized problems. One may instead benefit by the superlinear convergence estimates, first established by Hayes [148] (see also Winter [294]). A characterization of the rate of superlinear convergence is given by Axelsson and Kaporin [19] in terms of the K-condition number defined in (2.14).
2.2
Problems with large condition numbers
In this section we consider problems with large condition numbers and analyse the arising difficulties.
2.2.1
Discretized elliptic problems
Let us consider a second order elliptic boundary value problem in RN with the assumption that the corresponding bilinear form is symmetric, coercive and bounded. The finite element method leads to a system of linear algebraic equations of the form (2.2), where A ∈ Rs×s , and we are interested in the value cond(A). The standard technique (see e.g. in Axelsson–Barker [14]) yields
cond(A) = O s2/N .
(2.58)
That is, with the increase of s (the dimension of the matrix A) the upper bound for the condition number grows very rapidly. Some further properties of the condition numbers are as follows (see e.g [14, 218, 269]).
2.2. PROBLEMS WITH LARGE CONDITION NUMBERS
45
• If the discretization parameter (traditionally denoted by h) is given, then the above consideration is replaced by the suitable estimate
cond(A) = O h−2 ,
(2.59)
and this bound is independent of the space dimension N. The rapid growth of cond(A) (now of course as h → 0) is observed again. • The estimates (2.58) and (2.59) are independent of the degree of the finite element subspace. • Due to the previous remark we can conclude that the round-off error does not depend strongly on the degree of the polynomial finite elements. • For 2d-th order boundary value problems we have the estimate
cond(A) = O h−2d .
(2.60)
• In the estimates the constants depend inversely on the smallest eigenvalue of the considered continuous problem and they increase if the geometry of the elements becomes degenerate. However, the exponents of h are correctly defined in the estimates. • Similar results hold for the finite difference method [217, 263]. E.g., for the fivepoints difference matrix for a unit square with Dirichlet boundary conditions, λmin = 1 − cos πh and λmax = 1 + cos πh. Hence cond(A) = cot2 πh ≈ O(h−2 ) 2 as h → 0. As this example shows, the gradient method (2.50)–(2.51) has the convergence rate q = cos πh = 1 − O(h2 ).
2.2.2
Difficulties arising from large condition numbers
Now we are in the position to summarize the difficulties arising from large condition numbers. 1. Uncertainties of the mathematical model (ill-posedness) If cond(A) is large, then we cannot guarantee that the exact solution of the system of linear algebraic equations (2.2) is close to the ‘real physical’ solution (2.1). 2. Large round-off error (numerical instability) As we have seen in subsection 2.1.3, due to the large condition number the theoretically well based methods can fail during the computer realization. 3. More computer costs (slow convergence) We have also noticed that in case of a large condition number the basic iterative methods show very slow convergence or can even fail. At the same time, the iterative method of solving systems of linear algebraic equations plays a central role in the whole solution process. This is illustrated by the the following example (Molchanov [217]). The problem is orientated to the stability computations in 3D air-plane simulation by use of FEM. The fully discretized system of linear algebraic equations is solved by using a variant of the conjugate gradient method.
46
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS process CPU time in the whole process (%) decomposition of the domain 0.25 construction of SLAE’s 4.25 construction of the iterative process 38.25 computation by the iteration (25 steps) 57.25
As one can see, the main computational effort is arising from the iterative steps. Therefore, if one is able to decrease the number of iterations then the computational costs are also significantly reduced.
2.3
Preconditioning of linear algebraic systems
In subsection 2.2.2 we have listed the difficulties arising from large condition numbers. The first one (ill-posedness) can be solved by some regularization of the problem. The study of this question is not a goal of this book, we refer to the related works, e.g. Molchanov [217], Tichonov [279, 280] and others. The other two difficulties (numerical instability and slow convergence) can be handled by preconditioning, which is a basic topic of this work. (We note that suitable preconditioning can also help in case of ill-posedness, see Axelsson-Neytcheva-Polman [28].) The notion of preconditioning was introduced in an early paper of Evans [105], and since then abundant preconditioning ideas and techniques have been developed, see e.g. Axelsson [11], Bruaset [60], Evans [105].
2.3.1
Preconditioning as the main tool of improving the condition number
Let us consider the system of linear algebraic equations (2.2) and assume that D ∈ Rs×s is a regular matrix with the following properties: • the solution of systems with matrix D is not costly, • the matrix D−1 A is considerably better conditioned than A. In the sequel, such a matrix is called a preconditioning matrix or preconditioner. Clearly, the above goals of the choice of the preconditioning matrices are conflicting. The choice of D as the identity matrix is optimal from the viewpoint of the first requirement, but not of the second requirement. On the other hand, the choice D = A optimally satisfies the second but not the first requirement. We mention that the matrix D in subsection 2.1.4 is not a real preconditioner: its inversion requires the same work as that of A. (In that illustrative example we have a priori defined the inverse D−1 without the explicit definition of D, therefore we referred to this approach as an ‘intuitive” method.) It is worth mentioning that, according to the requirements, the preconditioning matrix is necessarily also ill-conditioned but well-structured. Favourable choices of D to minimize the total computational effort can depend on the size of the problem, the sparsity pattern and the spectrum of A, the computer architecture, etc.
2.3. PRECONDITIONING OF LINEAR ALGEBRAIC SYSTEMS
47
Having a preconditioning matrix D, instead of problem (2.2) we consider the following equivalent problem D−1 Ax = D−1 b, (2.61) which is called preconditioned problem. Obviously, on the basis of the assumptions made, cond(D−1 A) << cond(A), (2.62) that is, the system of linear algebraic equations (2.61) is considerably better conditioned than (2.2). Typically, assuming A to be an SPD matrix, we choose D also to be an SPD matrix, hence the spectrum of D−1 A remains positive. Let us notice that the solution of the preconditioned problem (2.61) with the iterative methods discussed in subsection 2.1.3 turns into the iteration of type (2.48) with the preconditioning matrix D. However, the spectral bounds m and M of A in (2.51)–(2.52) are replaced by constants m ˜ and −1 ˜ M that are spectral bounds for D A. That is, let ˜ hDx, xi mhDx, ˜ xi ≤ hAx, xi ≤ M
(x ∈ Rs ),
which, using the D-energy norm kxk2D = hDx, xi, is equivalent to
hD−1 Ax, xiD ˜ ≤M (x ∈ Rs \ {0}). (2.63) kxk2D ˜ << M are assumed, hence (2.62) holds. Clearly, the relations m ˜ >> m and/or M That is, the preconditioned problems are better conditioned and hence yield faster convergence for (2.50). Consequently, the algorithm (2.48) can be regarded as a preconditioned one step iterative method. The preconditioning process results in only a small change in the recent algorithm, namely, in the steps of the iteration we have to solve an additional system of linear algebraic equations with the preconditioning matrix. For instance, whereas the method (2.50)–(2.51) means the algorithm m ˜ ≤
(a)
x0 is given, ; for any k = 0, 1, 2, . . . :
(b) rk = Axk − b;
τ=
2 ; M +m
xk+1 = xk − τ rk ;
(2.64)
the preconditioned version is (a)
x0 is given, ; for any k = 0, 1, 2, . . . :
(b) rk = Axk − b;
ek = D−1 rk ;
τ=
2 ; ˜ +m M ˜
xk+1 = xk − τ ek .
(2.65)
48
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS
As one can see, the extra computer work arises from the step of solving Dek = rk , which is, by assumption, an easier task. For the conjugate gradient method one can similarly introduce a preconditioning matrix and define the preconditioned conjugate gradient method (PCGM), which means applying (2.57) to the preconditioned system D−1 Ax = D−1 b. In the corresponding numerical algorithm one has to add the same extra work as before. (See Axelsson–Barker [14].) The algorithm is the appropriate modification of (2.57) as follows: (a) x0 ∈ Rs is arbitrary; r0 = −d0 is the solution of the system Dr0 = Ax0 − b; for any k = 1, 2, ... : if xk−1 , rk−1 , dk−1 are obtained, then (b1) γk = hAdk−1 , dk−1 i, αk = − γ1k hDrk−1 , dk−1 i; k k k−1
(b2)
(b3) (b4) (b5)
(b6)
y is the solution of the system
Dy = Ad
;
rk = rk−1 + αk y k ;
xk = xk−1 + αk dk−1 ; βk =
1 hAdk−1 , rk i; γk
dk = −rk + βk dk−1 .
(2.66) then the precondiAccording to the results of subsection 2.1.5, if cond(D A) ≤ tioned CGM converges with the ratio (2.54). In fact (see e.g. [11]), the residuals −1
rk = D−1 (Axk − b)
M m
(2.67)
(calculated recursively in the algorithm) satisfy the estimate k
kr kD ≤
√ √ !k M− m √ kr0 kD , √ M+ m
(2.68)
where krk kD = hDrk , rk i1/2
(2.69)
is the energy norm corresponding to D. Remark 2.1 The energy norm (2.69) of the residuals is obtained from the algorithm (2.66) with no extra work. Namely, there holds hDrk , rk i = −hDrk , dk i
(2.70)
2.3. PRECONDITIONING OF LINEAR ALGEBRAIC SYSTEMS
49
(k=1,2,...) and the latter is calculated in the steps (b1). The relation (2.70) is verified as follows. The construction of the CGM implies that hDrk , dk−1 i = 0,
(2.71)
namely, steps (b1)-(b3) yield hDrk , dk−1 i = hDrk−1 , dk−1 i + αk hDy k , dk−1 i = hDrk−1 , dk−1 i + αk hAdk−1 , dk−1 i Then (2.71) and (b6) imply
= hDrk−1 , dk−1 i + αk γk = 0.
hDrk , rk i = −hDrk , dk i + βk hDrk , dk−1 i = −hDrk , dk i.
2.3.2
Spectral equivalence and preconditioning
As before, we assume that A and D are SPD matrices. We call these matrices spectrally equivalent if there exist constants 0 < γ1 ≤ γ2 such that the relation γ1 ≤
hAx, xi ≤ γ2 hDx, xi
(x ∈ Rs \ {0})
holds. Introducing a new inner product and the induced norm as kxk2D = hx, xiD ,
hx, yiD = hDx, yi, we get
hAx, xi hD−1 Ax, xiD = hDx, xi kxk2D
(x ∈ Rs \ {0}).
(2.72)
(2.73)
This means that the condition number of the preconditioned matrix D−1 A in k · kD norm satisfies the estimate γ2 . (2.74) cond(D−1 A) ≤ γ1 Hence under the natural assumptions γ2 < M and γ1 > m the condition number is improved (in D-norm). The notion of spectral equivalence can be extended to two infinite sets of SPD matrices {Ah } and {Dh } in a uniform sense as follows: there exist constants 0 < γ1 ≤ γ2 (independent of h) such that the relation γ1 ≤
hAh x, xi ≤ γ2 hDh x, xi
(h ∈ R,
x ∈ Rs \ {0})
(2.75)
holds. (See Axelsson–Barker [14], Manteuffel–Parter [204]. In [14] we can also find the construction of such preconditioners in some finite element problems using piecewise linear basic functions for a boundary value problem over a rectangular domain.) We mention that if the preconditioner D can be decomposed into the form D = B T B then the above preconditioning can be expressed in an equivalent two-sided form. Namely, h(B −T AB −1 )y, yi hAx, xi = (2.76) hDx, xi kyk2
50
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS
holds for y = Bx 6= 0 and here the ranges of the two sides are equal. Hence cond(B −T AB) in the original norm coincides with cond(D−1 A) in the D-norm, further, the iterative sequences (xn ) and (yn ) corresponding to D−1 A and B −T AB, respectively, satisfy yn = Bxn .
2.3.3
Some important preconditioning techniques
In this subsection we give an overview about possible preconditioners D for an SPD matrix A. Only some widely used constructions are briefly listed. For details we refer to the monographs of Axelsson [11], Axelsson–Barker [14], Bruaset [60], and for aspects of high-performance computing to Dongarra–Duff–Sorensen–van der Vorst [89]. • Incomplete factorization preconditioning In this approach (where A is not necessarily SPD) the preconditioner D preserves the sparsity structure of the matrix A and has the decomposition D = LU , where L and U are lower and upper triangular matrices, respectively. Here L and U are suitable modifications of the LU -factorization of A such that they have the same sparsity pattern on the lower and upper triangular part, respectively. (In order to define these triangular matrices, usually certain Gauss elimination steps for the matrix A are applied.) If A is a matrix with few nonzero elements and D is close to A, then this preconditioner is effective, hence it is widely used for sparse matrices. • SSOR preconditioner Assume that A is SPD and it is decomposed into the sum A = L + diag(A) + LT , where now L is the lower triangular part of A. Then the matrix 1 D= 2−ω
1 diag(A) + L ω
1 diag(A) ω
−1
1 diag(A) + LT , ω
ω ∈ (0, 2)
is called symmetric successive overrelaxation (SSOR) preconditioner. Here the basic question is the parameter choice. E.g. for finite difference discretizations of elliptic problems in a square domain with size π, the use of the SSOR preconditioner with the choice ω∗ =
2 2 ≈ 1 + 2 sin h/2 1+h
(2.77)
results in the improvement from O(h−2 ) to O(h−1 ). • ADI preconditioner This is a typical and useful preconditioning method for discretized elliptic problems. (For simplicity, we consider the 2D case.) Let now L denote the offdiagonal part of the difference operator (matrix A) acting in the x-direction, LT the same for the y-direction, and diag(A) the diagonal part (as before). Then the matrix 1 diag(A) + L D= ω
−1
1 diag(A) ω
1 diag(A) + LT , ω
ω ∈ (0, 2)
2.3. PRECONDITIONING OF LINEAR ALGEBRAIC SYSTEMS
51
is called alternating direction iterative (ADI) preconditioner. (We recall that L and LT mean different matrices in the ADI and SSOR methods.) In the ADI method we solve alternatively one-dimensional difference equations. If we assume the commutativity of L and LT (which means that the continuous problem has constant coefficients), then the optimal parameter choice is repeatedly ω ∗ , defined in (2.77). • SOR preconditioner This preconditioning method is recommended for non-SPD matrices since the preconditioning matrix is not symmetric. Namely, D = diag(A) + ωL, where L denotes the lower triangular part of the matrix A (see Axelsson [11], Young [288]). A very useful related version for non-symmetric problems is the ADI preconditioning method. We refer to Delong–Ortega for some test problems (convection-diffusion in 2D) and optimal choice of the parameter [80], and for some parallel implementation with the conjugate gradient method [81]. • Multilevel preconditioners For matrices arising in discretized elliptic problems, efficient preconditioning has been developed in connection with algebraic multilevel iterations (AMLI), using block matrix approximate factorization. Constructing a sequence of finite element matrices on hierarchical basis form, one can define block diagonal preconditioners consisting of a perturbed AMLI type preconditioner block for the previous term and an element-by-element approximation of the leading block. In this way optimal and parameter-free condition numbers can be achieved. For this type of preconditioning the reader is referred to Axelsson [13], Axelsson–Margenov [25], Axelsson–Padiy [29], Axelsson–Vassilevski [32]. • Preconditioning for low-frequency eigenvalues Particular preconditioning techniques have been developed to eliminate the effect of a cluster of low-frequency eigenvalues of the matrix A. One of them is the socalled deflation of the components corresponding to such a cluster by using a preconditioner of the form B = I − V (V T AV )−1 V T ,
(2.78)
where the image of the rectangular matrix V contains the mentioned components (Nicolaides [232]). Such matrices also appear in coarse-grid corrections (Hackbusch [144]). An improvement of the preconditioner (2.78) is developed in Axelsson-Neytcheva-Polman [28] and Padiy-Axelsson-Polman [239], which avoids the computation of (V T AV )−1 : introducing an easily invertible approximation BV of (V T AV )−1 , the preconditioner B = I + σV BV−1 V T ,
(2.79)
where σ > 0 is a parameter close to λmax (A), moves the cluster of small eigenvalues to the vicinity of the largest one instead of deflating them.
52
CHAPTER 2. LINEAR ALGEBRAIC SYSTEMS
Chapter 3 Linear elliptic problems In this chapter we briefly summarize some properties related to linear elliptic boundary value problems and their numerical solution. For simplicity we focus on second order problems. The first section gives some background on linear operators in Hilbert space, required both in this chapter and later in the book. Then in section 3.2 the solvability of elliptic problems is discussed, involving both weak solutions and some regularity results. After that we turn to the numerical solution of elliptic problems, focusing on two areas. First, the numerical solution of linear elliptic problems is a highly developed area containing many efficient standard solution methods, including both methods for general problems and fast solvers for problems in special form. We discuss this field briefly in section 3.3, starting with a very short summary on finite element method discretization, and then quoting particular efficient solution methods in two distinct subsections. These include references to methods for general problems and then to fast solvers for important kinds of problems in special form, respectively. Second, section 3.4 points out that the iterative solution of discretized linear elliptic problems may benefit by Sobolev space theory, which both reveals the background of important conditioning properties and suggests an efficient preconditioning approach. The latter has been developed in a series of papers, started by D’yakonov [96], Gunn [142, 143], Concus and Golub [76], and more recently put in an organized form by Faber, Manteuffel and Parter [109] and Manteuffel and Parter [204] who provide a comprehensive and rigorous foundation of the theory of preconditioning operators. Namely, one can define preconditioning matrices as discretizations of suitable linear preconditioning operators. In this way the obtained condition numbers can be estimated in a mesh uniform way by that of the underlying operators. This work gives a basic motivation for the present book to apply a similar operator approach for the preconditioning of nonlinear problems, relying on the mentioned highly developed background of elliptic solvers. In section 3.4 we briefly summarize some of the ideas of the above papers. 53
54
3.1
CHAPTER 3. LINEAR ELLIPTIC PROBLEMS
Some properties of linear operators in Hilbert space
This section contains some properties of linear operators in Hilbert spaces that are required later for boundary value problems. Since the whole discussion serves as background for the nonlinear case, we consider real Hilbert spaces (which give the suitable setting for nonlinear problems) instead of complex ones. For more details on linear operators see [109, 154, 255, 259]. A Hilbert space is typically thought of as infinite-dimensional, the main realization being the Sobolev spaces in which boundary value problems are considered. We also note that Rs is an obvious special case of Hilbert space, and most results in the sequel are direct analogies of the finite-dimensional ones such that the proofs combine the ideas of the latter with Hilbert space technical tools.
3.1.1
Energy spaces
Definition 3.1 Let H be a real Hilbert space and D ⊂ H a dense subspace. The energy space HS of a strictly positive symmetric linear operator S : D → H is defined as the completion of D with respect to the inner product hu, viS ≡ hSu, vi
(u, v ∈ D).
The inner product h., .iS and corresponding norm k.kS are called energy inner product and energy norm, respectively. Remark 3.1 In the sequel we consider strongly positive linear operators, by which we mean operators with a positive lower bound p > 0: hSu, ui ≥ pkuk2
(u ∈ D),
i.e. the strict positivity assumption is slightly strengthened. In this case the energy norm is stronger than the original one: kuk2S ≥ pkuk2
(u ∈ HS ),
(3.1)
and it is easy to verify that the space HS can be represented such that it is contained in H (see e.g. [255]). Theorem 3.1 Let H be a real Hilbert space, D ⊂ H a dense subspace, and S : D → H a strongly positive symmetric linear operator. Then for all g ∈ H the equation Su = g has a unique weak solution u∗ ∈ HS , that is, hu∗ , viS = hg, vi
(v ∈ HS ).
(3.2)
3.1. SOME PROPERTIES OF LINEAR OPERATORS IN HILBERT SPACE
55
Proof. The functional φ : HS → R,
φv ≡ hg, vi
is bounded linear on HS owing to the estimate |φv| = |hg, vi| ≤ kgkkvk ≤ p−1/2 kgkkvkS , which follows from (3.1). Hence the Riesz theorem implies the existence and uniqueness of u∗ .
3.1.2
Spectral equivalence and contractivity
In this subsection we consider bounded linear operators. We formulate some lemmas for operators in a real Hilbert space H satisfying the following property: Spectral equivalence condition. Let A and B be strongly positive self-adjoint linear operators in H and let there exist constants M ≥ m > 0 such that mhBh, hi ≤ hAh, hi ≤ M hBh, hi
(h ∈ H).
(3.3)
Proposition 3.1 Let A and B be strongly positive self-adjoint linear operators in H, satisfying (3.3). Then HA = HB = H and m1/2 khkB ≤ khkA ≤ M 1/2 khkB
(h ∈ H).
(3.4)
Proof. Since A and B are bounded both below and above, it follows from Remark 3.1 that HA = HB = H. Then, by definition, (3.4) follows obviously from (3.3). Lemma 3.1 Let A and B be strongly positive self-adjoint linear operators in H, satisfying (3.3). Then mhA−1 h, hi ≤ hB −1 h, hi ≤ M hA−1 h, hi
(h ∈ H).
(3.5)
Proof. We only prove the right side of (3.5), the left one is similar. Let v ∈ H. Setting h = B −1/2 v, (3.3) yields kA1/2 B −1/2 vk2 = kA1/2 hk2 ≤ M kB 1/2 hk2 = M kvk2 . Hence which implies
kB −1/2 A1/2 k2 = k(B −1/2 A1/2 )∗ k2 = kA1/2 B −1/2 k2 ≤ M,
hB −1 h, hi = kB −1/2 A1/2 A−1/2 hk2 ≤ M kA−1/2 hk2 = M hA−1 h, hi
(h ∈ H).
From Proposition 3.1 and Lemma 3.1 there follows Corollary 3.1 Let A and B be strongly positive self-adjoint linear operators in H, satisfying (3.3). Then m1/2 khkA−1 ≤ khkB −1 ≤ M 1/2 khkA−1
(h ∈ H).
(3.6)
56
CHAPTER 3. LINEAR ELLIPTIC PROBLEMS
Lemma 3.2 Let A and B be strongly positive self-adjoint linear operators in H, satisfying (3.3). Then the following contractivity estimates hold: kI −
2 M −m AB −1 kA−1 ≤ , M +m M +m
(3.7)
M −m 2 B −1 AkB ≤ , M +m M +m
(3.8)
kI −
where I is the identity operator. Proof. Let C =
M +m B. 2 −1
The operator I − AC
We first verify (3.7).
is self-adjoint w.r. to the energy norm of A−1 , since
hAC −1 h, viA−1 = hC −1 h, vi = hh, C −1 vi = hh, AC −1 viA−1
(h, v ∈ H).
Hence kI − AC −1 kA−1 = sup h6=0
Since C −1 =
2 B −1 , M +m
|h(I − AC −1 )h, hiA−1 | |h(A−1 − C −1 )h, hi| . = sup khk2A−1 hA−1 h, hi h6=0
(3.9)
(3.3) and Lemma 3.1 imply
2M 2m hA−1 h, hi ≤ hC −1 h, hi ≤ hA−1 h, hi. M +m M +m Hence for any h ∈ H −
M − m −1 M − m −1 hA h, hi ≤ h(A−1 − C −1 )h, hi ≤ hA h, hi, M +m M +m
i.e. the supremum in (3.9) is indeed at most
M −m . M +m
The proof of (3.8) is analogous to that of (3.7). One proves similarly that the operator I − C −1 A is self-adjoint w.r. to the energy norm of B, and hence kI − C using C −1 = −
−1
2 |h(B − M +m A)h, hi| |h(I − C −1 A)h, hiB | , = sup AkB = sup khk2B hBh, hi h6=0 h6=0
2 B −1 . M +m
(3.10)
Estimating here hAh, hi with (3.3), we obtain
2 M −m M −m hBh, hi ≤ h(B − A)h, hi ≤ hBh, hi M +m M +m M +m
i.e. the supremum in (3.10) is indeed at most
(h ∈ H),
M −m . M +m
The obtained contractivity result enables us to verify directly the convergence of the simple iteration (gradient method) for uniformly positive bounded linear operators. We consider constant stepsize.
3.1. SOME PROPERTIES OF LINEAR OPERATORS IN HILBERT SPACE
57
Theorem 3.2 (Kantorovich-Akilov [163, 165]). Let H be a real Hilbert space and let the bounded self-adjoint linear operator A : H → H satisfy mkhk2 ≤ hAh, hi ≤ M khk2
(u, h ∈ H)
(3.11)
with constants M ≥ m > 0 independent of u, h. Let b ∈ H be arbitrary and denote by u∗ ∈ H the unique solution of the equation Au = b. Then for any u0 ∈ H the sequence un+1 := un −
2 (Aun − b) M +m
(n ∈ N)
(3.12)
converges to u∗ according to the linear estimate 1 M −m kun − u k ≤ kAu0 − bk m M +m
∗
n
(n ∈ N) .
Proof. The well-posedness of equation Au = b follows from (3.11) since 0 is not in the spectrum of A. The construction of the sequence (3.12) implies 2 un+1 − u = I − A (un − u∗ ) M +m ∗
(n ∈ N),
hence, applying Lemma 3.2 with B = I (the identity operator), we obtain kun+1 − u∗ k ≤ By induction kun − u∗ k ≤
M −m kun − u∗ k M +m
M −m M +m
n
(n ∈ N).
ku0 − u∗ k
(n ∈ N)
and here (3.11) implies ku0 − u∗ k ≤ (1/m)kAu0 − bk,
which yield the required estimate.
The above theorem is the extension of the same finite-dimensional result quoted in subsection 2.1.5. One can similarly generalize the thereby mentioned convergence result of the conjugate gradient method to Hilbert spaces (Daniel [79]): the matrix A in the CG sequences (2.57) can be replaced by a linear operator, and if (3.11) holds then the CGM converges with ratio √ √ M− m q=√ √ . M+ m Moreover, if (3.11) is replaced by (3.3), then the same convergence results on the gradient and conjugate gradient methods hold for the operator B −1 A in k.kB norm. Hereby the iterative sequences are the analogues of (2.65) and (2.66), respectively. (We note that the superlinear convergence result in terms of the K-condition number, mentioned at the end of subsection 2.1.5, can also be extended to Hilbert space [22]).
58
CHAPTER 3. LINEAR ELLIPTIC PROBLEMS
3.2
Well-posedness of linear elliptic problems
In this section we consider the second order linear elliptic operator Su ≡ −div (K(x)∇u)
(3.13)
on a bounded domain Ω ⊂ RN , and we assume that S is uniformly elliptic, i.e. the symmetric matrix-valued function K ∈ C 1 (Ω, RN ×N ) satisfies σ(K(x)) ⊂ [m, M ] ⊂ (0, ∞)
3.2.1
(x ∈ Ω).
(3.14)
Weak solutions
The aim of this subsection is to sketch some well-posedness results for boundary value problems and related properties of the differential operators. Detailed investigation of linear elliptic problems is found e.g. in the monographs of Agmon [3], Hackbusch [145], H¨ormander [153], Simon–Baderko [265]. Let us first consider the linear Dirichlet problem Su = g
in Ω
u |∂Ω = 0
(3.15)
on a bounded domain Ω ⊂ RN with the operator (3.13) and some g ∈ L2 (Ω). Theorem 3.3 Problem (3.15) has a unique weak solution u∗ ∈ H01 (Ω), i.e. Z
Ω
Z
K(x) ∇u∗ · ∇v =
Ω
gv
(v ∈ H01 (Ω)).
(3.16)
The standard proof [3, 145] of this theorem relies on the coercivity and boundedness of the bilinear form Z a(u, v) = K(x) ∇u · ∇v (3.17) Ω
on
H01 (Ω).
Now we formulate some properties of the operator S in the corresponding Sobolev spaces. These will be used in the later chapters. On the other hand, we note that the positivity of S also enables us to verify the above well-posedness result. Proposition 3.2 (Sobolev inequality) [77]. There exists ν > 0 such that Z
Ω
|∇u|2 ≥ ν
Z
u2
Ω
(u ∈ H01 (Ω)).
(3.18)
The converse is not true, i.e. there holds sup u∈H01 (Ω)\{0}
R
|∇u|2 = +∞. 2 Ω |u|
Ω
R
(3.19)
3.2. WELL-POSEDNESS OF LINEAR ELLIPTIC PROBLEMS
59
Proposition 3.3 The operator (u ∈ D(S) ≡ H 2 (Ω) ∩ H01 (Ω))
Su ≡ −div (K(x)∇u)
is a symmetric linear operator in L2 (Ω) with some positive lower bound ̺ > 0 . Proof. The condition K ∈ C 1 (Ω, RN ×N ) implies that S maps H 2 (Ω) ∩ H01 (Ω) into L2 (Ω). The symmetry of K and divergence theorem yield that S is symmetric: hSu, vi =
Z
Ω
(u, v ∈ H 2 (Ω) ∩ H01 (Ω)).
K(x) ∇u · ∇v
(3.20)
Hence, using (3.14) and (3.18), hSu, ui ≥ m
Z
2
Ω
|∇u| ≥ mν
Z
Ω
u2 = ̺kuk2L2 ,
with ̺ := mν. The well-posedness result Theorem 3.3 then follows from Theorem 3.1 and Proposition 3.3. The lower bound of S is characterized as follows: Proposition 3.4 The sharp lower bound ̺ > 0 of S is the smallest eigenvalue of S on D(S) = H 2 (Ω) ∩ H01 (Ω). Proof. It follows from the Fourier series expansion. Namely, if u is represented as u=
∞ X
c i ui ,
i=1
where λi and ui are the eigenvalues and corresponding normalized eigenfunctions of S, then ∞ ∞ hSu, ui =
X i=1
λi c2i ≥ λ1
X i=1
c2i = λ1 kuk2L2 ,
and equality holds for the first eigenfunction u = u1 . Remark 3.2 Applying Proposition 3.4 to S = −∆ and using Green’s formula −
Z
Ω
(∆u) u =
Z
Ω
|∇u|2
(u ∈ H 2 (Ω) ∩ H01 (Ω)),
we obtain that the lower bound ν in Proposition 3.2 equals the smallest eigenvalue of −∆ on H 2 (Ω) ∩ H01 (Ω). The well-posedness of problems more general than (3.15) can be proved similarly as mentioned after Theorem 3.3, i.e. using the coercivity and boundedness of the corresponding bilinear forms. First, if a lower order term q(x)u is added to −div (K(x)∇u) in (3.15), then the assumptions q(x) ≥ 0 and q ∈ L∞ (Ω) imply that the coercivity
60
CHAPTER 3. LINEAR ELLIPTIC PROBLEMS
and boundedness, respectively, of the form (3.17) are preserved, using Proposition 3.2. Further, in the case of mixed problems the corresponding boundary conditions yield an additional term in the bilinear form, which can be estimated similarly using the standard boundary integral inequalities [265]. For the similar treatment of higher order problems see e.g. [3]. In the case of Neumann problems, well-posedness holds in the one-codimensional subspace consisting of functions with zero mean. (We note that all these theorems are special cases of the nonlinear well-posedness results in Chapter 6.)
3.2.2
Regularity
We quote some regularity results for problems with the operator (3.13) under different boundary conditions on the bounded domain Ω ⊂ RN . The case of interest for us is to ensure that the solution is in H 2 (Ω). The first result concerns the Dirichlet problem (3.15). Theorem 3.4 (Kadlec [162]). Let Ω be C 2 -diffeomorphic to a convex domain. 2 Then for any g ∈ L (Ω) the unique weak solution u∗ ∈ H01 (Ω) of the problem (
Su = g u|∂Ω = 0
(3.21)
satisfies u∗ ∈ H 2 (Ω). Now let us consider the mixed boundary value problem Su = g
in Ω
∂K(x)·ν u + βu = ϕ on ΓN
u = 0
(3.22)
on ΓD ,
where ∂K(x)·ν u = K(x) ν · ∇u is the conormal derivative of u at x. In the case ∂Ω = ΓN the following result holds: Theorem 3.5 (Grisvard [140]). Let ∂Ω = ΓN ∈ C 1,1 and β ∈ C 0,1 (∂Ω), β > 0 on ∂Ω. Then for any g ∈ L2 (Ω) and ϕ ∈ H 1/2 (∂Ω), the unique weak solution u∗ of (3.22) satisfies u∗ ∈ H 2 (Ω). The corresponding result for Neumann problems (i.e. β ≡ 0) is as follows: Theorem 3.6 R(see e.g. [99]). If ∂Ω ∈ C 2 , then for any g ∈ L2 (Ω)R and ϕ ∈ H 1/2 (∂Ω) R with Ω g dx + ∂Ω ϕ dσ = 0, the unique weak solution u∗ with Ω u∗ dx = 0 of the problem Su = g (3.23) ∂ K(x)·ν u|∂Ω = ϕ
satisfies u∗ ∈ H 2 (Ω).
3.3. STANDARD SOLUTION METHODS FOR SOME LINEAR ELLIPTIC PROBLEMS61 In the case ∂Ω 6= ΓN the analogue of Theorem 3.5 holds for the operator S = −∆ and coefficients β ≡ ϕ ≡ 0 in (3.22) on convex polygonal plane domains Ω ⊂ R2 , provided that ΓN and ΓD consist of entire edges, further, there are acute angles at the vertices that belong to ΓN ∩ ΓD (Grisvard [140]). Further, if Ω is a cube and S has a scalar coefficient K(x) = p(x) · I with some positive function p ∈ C 1 (Ω), then Theorem 3.6 remains true (Faierman [110]).
3.3
Standard solution methods for some linear elliptic problems
The purpose of this section is to point out that for important kinds of linear elliptic problems there exist highly developed standard solution methods. Consequently, the matrices of certain discretized linear elliptic problems serve as efficient preconditioners for other linear problems. (This approach, which gives motivation for the nonlinear case, will be discussed in the next section.) We start this section with a very brief summary on the finite element method, which is the most widespread way of discretization for linear elliptic problems. Then two distinct subsections are devoted to various efficient solution methods, cited first for general problems in subsection 3.3.2 and then for problems in special form in subsection 3.3.3.
3.3.1
Finite element discretization
In this subsection we give a very brief summary of the finite element method. For simplicity we restrict ourselves to the second order operator (3.13); further, we present its FEM discretization for Dirichlet problems (3.16). It is not our aim to give a complete discussion of the FEM, knowledge of which is expected of the reader. Instead, this summary serves convenience concerning the terminology and notations to be used later. For details we refer to the wide literature, e.g. Axelsson–Barker [14], Ciarlet [72], Molchanov–Nikolenko [218], Oden [234], Strang–Fix [269], Temam [276].

The basic idea of the FEM is as follows: a family of finite-dimensional subspaces V_h ⊂ H_0^1(Ω) is constructed and problem (3.16) is replaced by the problem

  find u_h^* ∈ V_h :   ∫_Ω K(x) ∇u_h^* · ∇v_h = ∫_Ω g v_h   (v_h ∈ V_h).     (3.24)
The functions u_h^* are the finite element approximations to the solution of (3.16). In the theoretical foundation of the method the basic question is the construction of V_h (see [14, 269]). This is briefly summarized as follows. The domain Ω is replaced by the union of disjoint subdomains ∪_{l=1}^L Ω_l, and on each of them we define a subspace V_{h,l} of suitable polynomials spanned by local basis functions {φ_r^{(l)}}_{r=1}^{T_l}. The global basis functions {φ_i}_{i=1}^s on Ω are constructed from the local basis functions by their contributions on each element Ω_l. The global basis functions span the subspace V_h, which is assumed to satisfy some global smoothness condition: as a minimum, we require V_h ⊂ H_0^1(Ω) ∩ C(Ω). In the general case the degree of the polynomials on different elements may be different; then the local basis functions on Ω_l are chosen such that the space P_{r_l} of polynomials up to some degree r_l over Ω_l satisfies P_{r_l} ⊂ V_{h,l}. Most often, however, the degree of the polynomials on the different elements is chosen to be the same, i.e. r_l ≡ r.

When the degree of the polynomials in V_h equals r on each element, the main theoretical error estimate for the FEM approximation u_h^* is as follows [269]. Letting u* denote the exact solution of (3.16), in the case u* ∈ H^k(Ω) (k = 1, ..., r+1) there holds

  ‖u* − u_h^*‖_{H_0^1(Ω)} ≤ const. · h^{k−1} ‖u*‖_{H^k(Ω)}.     (3.25)
More generally, if on the left we consider the H^s(Ω) norm of the error (s = 0, 1, ..., k), then the factor h^{k−1} on the right is replaced by h^{k−s}. Further, if the differential operator is of order 2m, then the latter is replaced by h^{k−m} for the H_0^1(Ω) norm of the error.

Looking for u_h^* in the form u_h^* = Σ_{j=1}^s c_j φ_j and setting v_h = φ_i (i = 1, ..., s) in (3.24), we obtain the system of linear algebraic equations

  Σ_{j=1}^s c_j ∫_Ω K(x) ∇φ_j · ∇φ_i = ∫_Ω g φ_i   (i = 1, ..., s).     (3.26)
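To fix ideas, the following minimal sketch assembles a one-dimensional analogue of the system (3.26) with piecewise linear elements; the coefficient K, the load g, the mesh size and the midpoint evaluation of the element integrals are all illustrative choices of ours, not taken from the text.

```python
import numpy as np

# Hypothetical 1D analogue of (3.26): P1 elements on [0,1] for
# -(K u')' = g with u(0) = u(1) = 0; each element integral is
# approximated at the element midpoint.
n = 49; h = 1.0 / (n + 1)
nodes = np.linspace(0.0, 1.0, n + 2)
K = lambda x: 1.0 + x                         # illustrative coefficient
g = lambda x: 1.0                             # illustrative load
A = np.zeros((n, n)); rhs = np.zeros(n)
for l in range(n + 1):                        # element [x_l, x_{l+1}]
    xm = 0.5 * (nodes[l] + nodes[l + 1])
    grads = ((l - 1, -1.0 / h), (l, 1.0 / h)) # slopes of the two hat functions
    for i, gi in grads:
        if 0 <= i < n:
            rhs[i] += 0.5 * h * g(xm)         # phi_i(midpoint) = 1/2
            for j, gj in grads:
                if 0 <= j < n:
                    A[i, j] += h * K(xm) * gi * gj
c = np.linalg.solve(A, rhs)                   # coefficients c_j in (3.26)
```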
In the realization of the method the crucial questions are to ensure V_h ⊂ H_0^1(Ω) and to compute the integrals in (3.26). In typical investigations the convergence and accuracy of the method are studied under the assumptions that the inclusion V_h ⊂ H_0^1(Ω) holds and the integrals in (3.26) are computed exactly. However, sometimes these requirements are violated: either the basis functions do not belong to the space H_0^1(Ω) or the integrals are computed numerically. The first problem can have different causes: sometimes the subdomains Ω_l are polygonal and ∪_{l=1}^L Ω_l only approximates Ω (typically due to the curved boundary of Ω); further, the smoothness of the basis functions in V_h may be lower than required by the space H_0^1(Ω); finally, the principal boundary conditions are sometimes not satisfied. These modifications may cause serious problems, and sometimes even convergence fails. Different techniques can be applied to handle these problems, such as piecewise testing, isoparametric elements and variational crimes. (See e.g. [40, 182, 183, 218, 269].)

A central question in the realization of the method is the computation of the integrals in (3.26). They are generally computed numerically using some quadrature rule Q, that is,

  Q(v_h) ≈ ∫_Ω v_h dΩ,

see e.g. [24]. Namely, first numerical integration formulae (quadrature rules), denoted by Q_l(v_h), are chosen on the closure of each element Ω_l. Such a local quadrature rule Q_l is defined by a set of quadrature points {x_k^{(l)}}_{k=1}^{q_l} and weights w_k^{(l)} ∈ R such that

  Q_l(v_h) ≡ Σ_{k=1}^{q_l} w_k^{(l)} v_h(x_k^{(l)}) ≈ ∫_{Ω_l} v_h dΩ_l,

  Q_l( (∂φ_r^{(l)}/∂x_p)(∂φ_s^{(l)}/∂x_q) ) = ∫_{Ω_l} (∂φ_r^{(l)}/∂x_p)(∂φ_s^{(l)}/∂x_q) dΩ_l     (3.27)

for all v_h ∈ H_0^1(Ω), φ_r^{(l)}, φ_s^{(l)} ∈ V_{h,l}, p, q = 1, ..., N. Then the global quadrature rule Q on Ω is defined by

  Q(v_h) ≡ Σ_{l=1}^L Q_l(v_h).

The quadrature rule Q is called of degree d if it is exact for all polynomials in P_d. The element quadrature rule depends on the degree of the polynomial basis functions in V_l: if V_l = P_{r_l}, then Q_l has to be exact for all polynomials in P_{2(r_l−1)} (cf. (3.27)). Using this quadrature, the numerical computation of the left-hand side of (3.26) yields the so-called stiffness matrix

  B_{ij} = Q(K ∇φ_j · ∇φ_i).

We mention that when a lower order term q(x)u is added to the principal part of the elliptic operator, then the above discretization includes the corresponding additional matrix with elements

  M_{ij} = Q(q φ_j φ_i),

called the mass matrix.

We remark that by a special choice of the numerical integration formula Q, the well-known finite difference method (FDM) can be recovered for the numerical solution of (3.16). For details on the FDM the reader is referred e.g. to Samarskii [263], Young [288]. Finally we note that the so-called primal formulation (3.24) of the FEM can be replaced by a mixed formulation, written as a system with the extra unknown z = ∇u. This is useful when the coefficients have large variation, and is also favourable for sparing cost for fourth order problems (see e.g. Axelsson–Gustafsson [17], Brezzi–Raviart [56]).
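As a small illustration of the notion of quadrature degree introduced above, the following sketch (with a rule chosen by us for illustration) checks numerically that the two-point Gauss rule on [0,1] is exact precisely for polynomials up to degree 3.

```python
import numpy as np

# The two-point Gauss rule on [0,1]: points 1/2 +- 1/(2*sqrt(3)),
# weights 1/2; its degree in the sense above is d = 3.
pts = 0.5 + np.array([-1.0, 1.0]) / (2.0 * np.sqrt(3.0))
w = np.array([0.5, 0.5])
for d in range(6):
    Q = np.sum(w * pts**d)                    # Q(x^d)
    print(d, np.isclose(Q, 1.0 / (d + 1)))    # True for d = 0..3, False after
```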
3.3.2
Efficient solution algorithms for general linear elliptic problems
A large variety of efficient methods have been developed for the numerical solution of general linear elliptic problems. These methods provide the solution of the algebraic systems obtained from the discretization of the problem, and they rely in different ways on the fact that the algebraic system comes from an elliptic problem. The scope of available solution methods also involves fourth order problems and general boundary conditions. We mention some of these methods, clearly without being complete. For detailed summaries the reader is e.g. referred to the monographs of Axelsson and Barker [14], Birkhoff and Lynch [42], Ciarlet and Lions [73], Hackbusch [144, 145]. A widespread approach for solving elliptic problems is using multigrid or multilevel methods, see the summaries of Hackbusch [144] and McCormick [210, 211]. Multigrid methods comprise the combination of smoothers (i.e. scaled iterative methods) with iterative residual correction on coarser grids, where the aim of the latter is to reduce the error on a given finer grid. The smoother is most often a conjugate gradient method. A unified theory for multigrid algorithms is given by Douglas and Douglas [92],
and for variational problems a concise treatment is found in McCormick–Ruge [212]. There exist two principal versions of multigrid methods: the correction schemes in the first one only use the coarser levels for solving the residual correction problems, whereas the second one is composed of nested iteration schemes which also generate an initial guess for the next level using the previous coarser levels. See e.g. Bramble– Pasciak–Xu [54], McCormick [209] for the first and Hackbusch [144] for the second approach; an analysis of the complete algorithm for both versions is found in Douglas [91]. The several modifications of standard multigrid methods include the constructive interference method (Douglas–Miranker [94]) which can decrease costs by eliminating smoothing steps, robust multigrid (Hackbusch [146], Ta’asan [274]) which can handle highly oscillatory problems, and versions to treat nearly singular problems (Cai– Mandel–McCormick [66]). Multigrid methods are also suitable for parallel computer realization (Bramble–Pasciak–Xu [53], Frederickson–McBryan [118]). Multilevel methods form an efficient way of constructing preconditioning matrices for discretized elliptic problems using nested subspaces. The algebraic multilevel preconditioning method (Axelsson–Vassilevski [31, 32]) uses purely algebraic means to construct a preconditioner (namely, as a kind of recursively defined approximate block factorization of the original stiffness matrix), and in this way it avoids regularity assumptions on either the solution or the domain. Multilevel preconditioning methods are able to handle disproportional coefficients via multilevel decomposition (Jung– Nepomnyashchikh [158]). The multilevel approach is also applicable to higher order problems (Bramble–Pasciak–Zhang [55], Ewing–Margenov–Vassilevski [107]). The parallel version of multilevel preconditioning (Bramble–Pasciak–Xu [53, 54]) allows the parallel construction of the preconditioners. As a counterpart of subspace decomposition, the idea of decomposition is also widespread to handle irregularity in the shape of the domain or in the coefficients. Domain decomposition methods have undergone a large evolution since parallel computing has started to develop, see e.g. the monographs of Quarteroni–Valli [250], Smith–Bjorstad–Gropp [266]. The scope of this approach includes the solution of problems with strongly discontinuous coefficients (Graham–Hagger [137], Nepomnyashchikh [225]) via additive Schwarz methods, and can also be combined with higher order finite element methods (Ainsworth [4]). As a matter of course, purely algebraic means are of basic importance not only in the above-mentioned multilevel preconditioning approach but in the general handling of the linear elliptic systems. For instance, incomplete or approximate factorization methods such as incomplete Cholesky or block incomplete factorization of the system matrix are widespread. The study of various algebraic tools in this respect is beyond the scope of the book, the reader is referred to the monograph of Axelsson [11] on these and other techniques for the iterative solution of the discretized systems.
3.3.3
Fast solvers for problems in special form
For certain elliptic equations with special coefficients, there exist particular solution methods in which the number of operations can be reduced by exploiting the special form of the problem. In this way these fast solvers require considerably lower cost than general solution methods in the case of large sized discretized systems.
The simplest elliptic operator is the Laplacian, for which various fast direct solvers were developed already some decades ago. The majority of these fast Poisson solvers are direct methods, developed originally on rectangular domains and later extended to other domains using the fictitious domain approach. Comprehensive summaries on the direct solution of the Poisson equation on a rectangle are found in the papers of Dorr [90] and Swarztrauber [272], which include the method of cyclic reduction, the fast Fourier transform and the FACR (Fourier analysis–cyclic reduction) algorithm. The mentioned papers equally discuss Dirichlet, Neumann and mixed boundary conditions. The parallel implementation of these algorithms is also feasible (Rossi–Toivanen [256], Vassilevski [283, 284]). Another family of fast solvers on rectangles which has undergone recent development is formed by spectral methods (Boyd [50], Funaro [121], Gottlieb–Orszag [136]), which are based on collocation. Some more details and references related to these fast direct Poisson solvers are given in section 9.3; a minimal illustrative sketch of such a solver is also given at the end of this subsection.

Many of the above Poisson solvers were later generalized to problems with separable operators, i.e. operators in which the derivatives w.r. to different variables may contain non-constant coefficients, each depending on that single variable only. These extensions are based on the fact that the methods often use only this separation property instead of the particular form of the Laplacian (Buzbee–Golub–Nielsen [64], Dorr [90], Swarztrauber [272]). These methods include marching algorithms, which are summarized for equations with constant coefficients in Bank–Rose [37] and analysed in comparison with other fast methods in Bank–Rose [36]. The mentioned fast methods require O(n² log(log n)) operations (i.e. floating point multiplications and divisions) in the case of a rectangle, where n denotes the number of unknowns. When these methods are extended to three-dimensional problems, the factor n² is replaced by n³.

The extension of the above methods from rectangular domains to other ones is achieved using the fictitious domain approach. This means that the original domain is embedded into a rectangle, in which an equivalent problem is solved by the above fast solvers (Börgers–Widlund [49], Marchuk–Kuznetsov–Matsokin [205], Rossi–Toivanen [257]). Spectral methods on domains other than rectangles can be treated using the domain decomposition idea (Funaro [122], Funaro–Quarteroni–Zanolli [125]).

Fast solvers have also been developed for the biharmonic problem in the fourth order case (Bjorstad [43], Langer [191]). This includes spectral methods (Bjorstad–Tjostheim [44]) and the reduction of the biharmonic operator to Poisson solvers (Mayo [207]), also amenable to parallelization (Mayo–Greenbaum [208]).

Finally we mention the class of elliptic operators with piecewise constant coefficients, whose efficient solution can be achieved by various methods, including ones designed especially for such problems. We hereby refer to algebraic multilevel preconditioners (Axelsson–Vassilevski [31, 32]), multigrid methods (Khoromskij–Wittum [179]), standard domain decomposition in terms of additive Schwarz methods (Dryja [95], Graham–Hagger [137]), or scaled Laplacian preconditioners (Greenbaum [138]). It is important to underline that the convergence rates of algebraic multilevel methods are essentially independent of the coefficient jumps.
An exact formulation of this independence of the jumps, concerning the finite element error, is found in Knyazev–Widlund [181]: namely, if the constant coefficient value ε in a subdomain tends to zero, then uniform FEM error estimates are proved provided the right-hand side scales with ε in that subdomain.
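As an illustration of the fast solvers cited above, the following minimal sketch (with illustrative grid size, data and function name of our own) solves the discrete Dirichlet-Poisson problem on the unit square by the fast sine transform, which diagonalizes the 5-point Laplacian in O(n² log n) operations.

```python
import numpy as np
from scipy.fft import dstn, idstn

def fast_poisson(f, h):
    """Solve -Delta_h u = f on the unit square, zero Dirichlet data."""
    n = f.shape[0]                                  # interior points per direction
    k = np.arange(1, n + 1)
    lam = (2.0 - 2.0 * np.cos(k * np.pi / (n + 1))) / h**2   # 1D eigenvalues
    fhat = dstn(f, type=1)                          # forward sine transform
    uhat = fhat / (lam[:, None] + lam[None, :])     # divide by eigenvalues
    return idstn(uhat, type=1)                      # back transform

n = 63; h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
X, Y = np.meshgrid(x, x, indexing="ij")
u = fast_poisson(np.sin(np.pi * X) * np.sin(np.pi * Y), h)  # test right-hand side
```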
3.4
Iteration and preconditioning in Sobolev space
In this section we briefly summarize how Sobolev space theory may help the solution of discretized linear elliptic problems. This theory reveals the background of important conditioning properties and also suggests an efficient preconditioning approach, which gives motivation for the nonlinear case. This preconditioning approach has been developed in a series of papers, started by D'yakonov [96], Gunn [142, 143] and Concus and Golub [76], and more recently put in an organized form, based on equivalent operators, by Faber, Manteuffel and Parter [109] and Manteuffel and Parter [204]. It will be summarized in subsection 3.4.2. Briefly, the matrices of certain discretized linear elliptic problems serve as efficient preconditioners for other linear problems. This is due on the one hand to the arising mesh independent conditioning estimates, and on the other hand to the highly developed standard solution methods that can be used for the auxiliary problems, quoted in the previous section. The summary of these results is preceded by a basic notion in subsection 3.4.1 and followed by a brief excursion to two-sided preconditioning in subsection 3.4.3.
3.4.1
The condition number of linear operators
Definition 3.2 Let H be a Hilbert space and S a symmetric, strictly positive linear operator in H. Then the condition number of S is defined as

  cond(S) = Λ(S)/λ(S),

where

  Λ(S) = sup_{x∈D(S)\{0}} ⟨Sx, x⟩/‖x‖²,   λ(S) = inf_{x∈D(S)\{0}} ⟨Sx, x⟩/‖x‖².

We note that in general there holds 0 < Λ(S) ≤ ∞ and 0 ≤ λ(S) < ∞; hence the condition number may equal ∞. The above definition corresponds to the ratio of the extreme eigenvalues in the case of SPD matrices. If an operator A in H is not symmetric as above but still invertible, then we can define cond(A) = ‖A‖ ‖A^{-1}‖ just as for matrices, but in the case of a Hilbert space this may also equal ∞.
Example. Let S be the differential operator defined in (3.13), D(S) = H²(Ω) ∩ H_0^1(Ω). Then

  Λ(S) = sup_{u∈D(S)\{0}} ( ∫_Ω K∇u·∇u / ∫_Ω |u|² ) ≥ m · sup_{u∈D(S)\{0}} ( ∫_Ω |∇u|² / ∫_Ω |u|² ) = ∞,

using (3.19); hence cond(S) = ∞. (The same holds for any unbounded operator as well.)

Remark 3.3 The above infinite condition number underlies the phenomenon that cond(S_h) = O(h^{-2}) is unbounded as h → 0 if S_h arises from some discretization of S. (This phenomenon was quoted in subsection 2.2.1.) Namely, cond(S_h) approaches the infinite condition number of S as the discretization is refined.
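The growth described in Remark 3.3 can be checked numerically; the following small sketch (an illustrative one-dimensional test of our own) computes the condition numbers of the discretized Laplacian on successively refined meshes.

```python
import numpy as np

# cond(S_h) for the 1D discrete Dirichlet Laplacian grows like O(h^{-2}).
for n in [10, 20, 40, 80]:
    h = 1.0 / (n + 1)
    S_h = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    eig = np.linalg.eigvalsh(S_h)
    print(n, eig[-1] / eig[0])    # roughly quadruples as h is halved
```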
3.4.2
Preconditioned gradient type methods and mesh independence
The observation in Remark 3.3 reflects a principle of basic importance in the preconditioning of discretized linear elliptic problems. Namely, the asymptotic conditioning properties of these problems are determined by those of the underlying operators in Sobolev space. Based on this, one can also define preconditioning matrices as discretizations of preconditioning operators. Moreover, the obtained preconditioning properties are estimated by those of the underlying operators in a mesh uniform way, owing to the mentioned asymptotics. This principle has been developed in a series of papers, referred to in the introduction of this section. We will briefly summarize some of these ideas in paragraph (b) of this subsection. Before that, we begin by citing the first result achieved in this field, where the Laplacian was defined as a preconditioner in Sobolev space.

(a) The introduction of the Laplacian preconditioner

The first result on preconditioning in Sobolev space was established in L. Czách's thesis [78] (see also Kantorovich–Akilov [165], Chapter XV). (The result also extends to the analogous Hilbert space setting, for which some related results were given by Kantorovich [164], Petryshyn [240].) Let us consider the linear Dirichlet problem of the form (3.15)

  Lu ≡ −div(K(x)∇u) = g   in Ω,   u|_{∂Ω} = 0     (3.28)

on a bounded domain Ω ⊂ R^N that is C²-diffeomorphic to a convex domain, where K(x) satisfies (3.14) and g ∈ L²(Ω). Further, we define the operator −∆ with domain D(−∆) = H²(Ω) ∩ H_0^1(Ω). (Then by Theorem 3.4 we have R(−∆) = L²(Ω).) The Sobolev space H_0^1(Ω) is endowed with the inner product

  ⟨u, v⟩_{H_0^1} = ∫_Ω ∇u · ∇v.

The existence and uniqueness of a solution u* ∈ H²(Ω) ∩ H_0^1(Ω) is ensured by Theorems 3.3 and 3.4.

Theorem 3.7 (Czách). Let us consider the operator A = (−∆)^{-1}L
as a linear operator in H_0^1(Ω) with domain D(A) = H²(Ω) ∩ H_0^1(Ω) (and hence R(A) = H²(Ω) ∩ H_0^1(Ω)). Then there holds the spectral equivalence condition

  m⟨−∆u, u⟩_{L²} ≤ ⟨Lu, u⟩_{L²} ≤ M⟨−∆u, u⟩_{L²}   (u ∈ H²(Ω) ∩ H_0^1(Ω)),     (3.29)

and hence we obtain

  cond((−∆)^{-1}L) ≤ M/m     (3.30)

in the norm of H_0^1(Ω). Consequently, for any g ∈ L²(Ω) and u^0 ∈ H²(Ω) ∩ H_0^1(Ω) the iteration

  u^{n+1} = u^n − (2/(M+m)) (−∆)^{-1}(Lu^n − g) ∈ H²(Ω) ∩ H_0^1(Ω)     (3.31)

converges to the solution u* of problem (3.28) with convergence quotient

  q = (M−m)/(M+m)

in the H_0^1 norm (and hence also in the L² norm).

Idea of proof. Using Green's formula, (3.29) is written as
  m ∫_Ω |∇u|² ≤ ∫_Ω K∇u·∇u ≤ M ∫_Ω |∇u|²,     (3.32)

and this follows from (3.14). Further, using that ⟨Au, u⟩_{H_0^1} = ⟨Lu, u⟩_{L²}, (3.32) implies that

  m ≤ ⟨Au, u⟩_{H_0^1} / ‖u‖²_{H_0^1} ≤ M   (u ∈ H²(Ω) ∩ H_0^1(Ω), u ≠ 0),     (3.33)
and hence (3.30) is verified. Then A has a bounded self-adjoint extension to H_0^1(Ω), and the convergence result follows from (3.33) by the Hilbert space version of the gradient method in Theorem 3.2.

Remark 3.4 (i) If we only assume u^0 ∈ H_0^1(Ω), then (3.31) is replaced by the weak form

  ⟨u^{n+1}, v⟩_{H_0^1} = ⟨u^n, v⟩_{H_0^1} − (2/(M+m)) ⟨Au^n − b, v⟩_{H_0^1},

where the extended A on H_0^1(Ω) is the weak form of L and ⟨b, v⟩_{H_0^1} = ∫_Ω g v (v ∈ H_0^1(Ω)).

(ii) The extended operator A : H_0^1(Ω) → H_0^1(Ω) is the generalized differential operator corresponding to the weak formulation of problem (3.28). This means that for regular functions the decomposition

  Au = (−∆)^{-1}Lu   (u ∈ H²(Ω) ∩ H_0^1(Ω)),

used in Theorem 3.7, yields a (theoretically constructive) representation of the generalized differential operator.
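As a discrete illustration of Theorem 3.7, the following sketch (a one-dimensional finite difference analogue with a hypothetical coefficient chosen by us) realizes the iteration (3.31) with the discrete Laplacian as preconditioner.

```python
import numpy as np

# L_h discretizes -(k(x)u')' on (0,1) with 0.5 <= k <= 1.5, so the analogue
# of (3.29) holds with m = 0.5, M = 1.5 for the discrete Laplacian A_h.
n = 99; h = 1.0 / (n + 1)
xm = np.linspace(0.5 * h, 1 - 0.5 * h, n + 1)       # element midpoints
k = 1.0 + 0.5 * np.sin(2 * np.pi * xm)              # illustrative coefficient
def stiffness(c):
    return (np.diag(c[:-1] + c[1:]) - np.diag(c[1:-1], 1)
            - np.diag(c[1:-1], -1)) / h**2
L_h, A_h = stiffness(k), stiffness(np.ones(n + 1))
g = np.ones(n); u = np.zeros(n); m, M = 0.5, 1.5
for _ in range(50):                   # quotient q = (M-m)/(M+m) = 1/2
    u -= (2.0 / (M + m)) * np.linalg.solve(A_h, L_h @ u - g)
```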
(b) Preconditioning based on equivalent operators

The discrete analogue of the theoretical result of Theorem 3.7 was established when the fast solvers cited in subsection 3.3.3 were developed. The efficiency of the latter motivated their application as solvers for auxiliary problems in preconditioned iterations. The first such results were presented by D'yakonov [96] and Gunn [142], who investigated the centered finite difference discretization of linear elliptic problems on a rectangle and suggested the same discretization of the Laplacian as preconditioner. The corresponding simple (or gradient/Richardson) iteration, which defines a sequence of linear systems

  −∆_h u^{n+1} = −∆_h u^n − α(L_h u^n − g_h)     (3.34)

(with some constant α > 0), was later termed the D'yakonov–Gunn iteration. (Here the index h indicates the discretization parameter, and the matrices L_h and ∆_h denote the corresponding discretizations of the operator L in (3.28) and of the Laplacian, respectively.) Similar iterations were discussed in Widlund [291]. Then the development of fast solvers for more general problems than the Poisson equation led to iterations where the discrete Laplacian ∆_h is either modified by scaling or by adding a zeroth-order term, or is replaced by a general separable operator (Bank [38], Concus–Golub [76], Goldstein [135], Greenbaum [138], Guillard–Desidéri [141], Widlund [291]). Further, similar ideas were applied to nonsymmetric problems (Elman–Schultz [104], Manteuffel–Otto [203]) and problems with singularities (Nepomnyashchikh [226]), to multilevel preconditioners (Bramble–Pasciak–Vassilevski [52]) and in the context of boundary integral operators (Khoromskij–Wendland [178]), and were used efficiently for Stokes problems (Bègue–Glowinski–Périaux [41]). In many of these works the simple iteration (3.34) was replaced by a conjugate gradient method (see also Carey–Jiang [68], Rossi–Toivanen [257]), following the development of the latter. Besides providing efficiency via the fast solvers, these papers state that such iterations yield mesh independent convergence results.

The above methods are based on the common observation that one can define efficient preconditioning matrices as discretizations of suitable elliptic preconditioning operators. For our symmetric operator L in (3.28), the related simple or gradient iterations can be written as

  S_h u^{n+1} = S_h u^n − α(L_h u^n − g_h)     (3.35)

(with some α > 0), where S is an operator which suitably approximates L and is, at the same time, such that the solution of the linear systems (3.35) can be carried out by some efficient solver. The study of such iterations can be based on the analogue of Theorem 3.7 for two arbitrary spectrally equivalent linear elliptic operators (cf. also D'yakonov [96]). Namely, if L and S are strongly positive elliptic operators satisfying the spectral equivalence condition

  m⟨Su, u⟩_{L²} ≤ ⟨Lu, u⟩_{L²} ≤ M⟨Su, u⟩_{L²}   (u ∈ H²(Ω) ∩ H_0^1(Ω)),     (3.36)

then

  cond(S^{-1}L) ≤ M/m     (3.37)

in the weighted Sobolev norm ‖·‖_S of H_0^1(Ω). Consequently, the simple or gradient iteration can now be applied to the operator S^{-1}L, and Theorem 3.2 yields convergence with ratio

  q = (M−m)/(M+m).

Moreover, one is generally interested in the convergence of the CGM, which is faster. Using the Hilbert space version of the CGM mentioned in subsection 3.1.2, the analogue of the preconditioned CG sequence (2.66) for the operators L and S yields the convergence ratio

  q = (√M − √m)/(√M + √m)

for the CG sequence in H_0^1(Ω), with m and M from (3.36).

The main significance of these Sobolev space results is that they provide asymptotics for the discretized case, which means that the obtained condition numbers and corresponding convergence results are mesh independent. In terms of spectral equivalence, this means that the analogue of (3.36) holds for the families of discretized operators L_h and S_h uniformly in the discrete subspaces V_h, i.e.

  m̃⟨S_h u, u⟩_{L²} ≤ ⟨L_h u, u⟩_{L²} ≤ M̃⟨S_h u, u⟩_{L²}   (u ∈ V_h, h > 0)     (3.38)
with suitable constants M̃ ≥ m̃ > 0 independent of h. (In many cases we have M̃ = M and m̃ = m, e.g. in FEM applications.) Such mesh independence results are observed in many of the above-mentioned papers. We now turn to the organized study of this property and quote very briefly some of these results.

A rigorous study of the mesh independence of conditioning properties has been given in the paper of Faber, Manteuffel and Parter [109] for preconditioning operators, based on Hilbert space theory. Before mentioning some results, we note that the analogue of (3.37) is clearly valid in a general Hilbert space H for two spectrally equivalent linear operators. Namely, if L and S are strongly positive operators such that D = D(L) ∩ D(S) is dense in H and the spectral equivalence condition

  m⟨Su, u⟩ ≤ ⟨Lu, u⟩ ≤ M⟨Su, u⟩   (u ∈ D)     (3.39)

holds, then

  cond(S^{-1}L) ≤ M/m
in the ‖·‖_S norm. In the paper [109] the spectral equivalence condition (3.39) is replaced by the more general condition of equivalence in norm. Based on this, the theory of equivalent operators is developed such that it includes nonsymmetric operators as well. Let S : W → V and L : W → V be linear operators between the Hilbert spaces W and V. For our setting it suffices to consider the case when S and L are one-to-one and D = D(L) ∩ D(S) is dense. The operator L is said to be equivalent in V-norm to S on D if there exist constants M ≥ m > 0 such that

  m ≤ ‖Lu‖_V / ‖Su‖_V ≤ M   (u ∈ D \ {0}).     (3.40)

If (3.40) holds, then under suitable density assumptions on D, the condition number of LS^{-1} in V is bounded by M/m. The V-norm equivalence of S^{-1} and L^{-1} implies this bound similarly for S^{-1}L.

Similarly, condition (3.38) is replaced by uniform norm equivalence as follows. Let us consider families of operators L_h and S_h (indexed by h > 0) such that the pointwise limit operators L and S exist as h → 0. The families L_h and S_h are said to be V-norm uniformly equivalent if there exist constants M ≥ m > 0, independent of h, such that

  m ≤ ‖L_h u‖_V / ‖S_h u‖_V ≤ M   (u ∈ D \ {0}, h > 0).     (3.41)
Analogously to the above, this implies that the condition numbers of the family L_h S_h^{-1} in V are uniformly bounded in h, and the similar equivalence of S_h^{-1} and L_h^{-1} implies that the condition numbers of the family S_h^{-1}L_h in V are uniformly bounded in h.
Using the above notions, the following general results, among others, are established in the cited paper. (For brevity we omit the precise formulations.) First, the V-norm equivalence of L and S is necessary for the V-norm uniform equivalence of the families L_h and S_h. Further, if L and S are self-adjoint, then the converse is also true if the families L_h and S_h are obtained via suitable projections from L and S. That is, in this case (which covers FEM discretizations of self-adjoint elliptic operators) mesh independence for the discretized problems holds if and only if the underlying operators are equivalent. The Hilbert space results in [109] are followed by their detailed application to general elliptic operators involving FEM and FDM discretizations, for which both L²-norm and H¹-norm equivalence are studied. The above results reinforce that the use of preconditioning operators is a natural preconditioning approach; moreover, in the above setting it is the only way in which mesh independent condition numbers can be produced.

A similarly organized study of mesh independence under different boundary conditions is given by Manteuffel and Parter [204]. Since we focus on symmetric operators, where the H¹-condition number is relevant, here we only quote the corresponding H¹ result. This states that the H¹-condition numbers of the family S_h^{-1}L_h are uniformly bounded if and only if L and S have homogeneous Dirichlet boundary conditions on the same portion of the boundary.

Summing up, the cited results put forward the preconditioning operator idea for problem (3.28), in which the preconditioning matrix is the discretization of a preconditioning operator S that is equivalent to L. In particular, using fast solvers, the discretized Laplacian or general separable elliptic operators have proved to be efficient preconditioners when used for other linear elliptic operators; furthermore, discretized preconditioning operators provide the natural class of preconditioners for which mesh independent condition numbers can be obtained.
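The mesh independence discussed above can also be observed numerically; the following sketch (a one-dimensional illustration with hypothetical data of our own) preconditions a variable-coefficient operator L_h by the discrete Laplacian S_h within the conjugate gradient method, and the iteration counts remain essentially constant under mesh refinement.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator, cg, splu

def stiff(c, h):
    # tridiagonal stiffness matrix with midpoint coefficient values c
    return diags([-c[1:-1], c[:-1] + c[1:], -c[1:-1]], [-1, 0, 1]) / h**2

for n in [99, 199, 399]:
    h = 1.0 / (n + 1)
    xm = np.linspace(0.5 * h, 1 - 0.5 * h, n + 1)
    L_h = stiff(1.0 + 0.5 * np.sin(2 * np.pi * xm), h)   # m = 0.5, M = 1.5
    S_h = stiff(np.ones(n + 1), h).tocsc()               # discrete Laplacian
    prec = LinearOperator((n, n), matvec=splu(S_h).solve)
    its = []
    u, info = cg(L_h, np.ones(n), M=prec, callback=lambda xk: its.append(1))
    print(n, len(its))            # iteration count, essentially independent of h
```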
3.4.3
Some remarks on two-sided preconditioning
The analogue of the two-sided preconditioning mentioned at the end of subsection 2.3.2 exists also in the Sobolev space case. For simplicity we only consider the operator
S = −∆. Now we think of −∆ as ∇ᵀ∇, and first we verify that this decomposition is valid in a suitable operator sense.

Definition 3.3 Let

  H¹_rot = {v ∈ H¹(Ω)^N : rot v = 0},   H¹_div = {v ∈ H¹(Ω)^N : div v = 0},   H¹_* = H¹_rot ∩ (H¹_div)^⊥.

Proposition 3.5

(i) The mapping ∇ : H²(Ω) ∩ H_0^1(Ω) → H¹_* is bijective.

(ii) Let K(H¹_*) = {Kv : v ∈ H¹_*}. The mapping −div : K(H¹_*) → L²(Ω) is bijective.

(iii) There holds (−div)*|_{H²(Ω)∩H_0^1(Ω)} = ∇.

Proof.

(i) For any u ∈ H²(Ω) ∩ H_0^1(Ω) we have ∇u ∈ H¹_rot; further, for any z ∈ H¹_div, ⟨∇u, z⟩_{L²(Ω)^N} = ⟨u, −div z⟩_{L²(Ω)} = 0, i.e. ∇u ⊥ H¹_div. Hence R(∇) ⊂ H¹_*. Conversely, for any w ∈ H¹_* the solution u ∈ H²(Ω) ∩ H_0^1(Ω) of the problem ∆u = div w, u|_{∂Ω} = 0 satisfies div(∇u − w) = 0, hence ∇u − w ∈ H¹_div; on the other hand, ∇u ∈ (H¹_div)^⊥ as above and w ∈ (H¹_div)^⊥ by assumption, hence ∇u − w = 0. Thus R(∇) ⊃ H¹_*, and the uniqueness of this u implies that ∇ is bijective.

(ii) For any v ∈ H¹(Ω)^N we have Kv ∈ H¹(Ω)^N, hence R(−div) ⊂ L²(Ω). Conversely, if g ∈ L²(Ω), then by Theorem 3.4 g ∈ R(−div) and is attained exactly once.

(iii) This follows from ⟨∇u, z⟩_{L²(Ω)^N} = ⟨u, −div z⟩_{L²(Ω)} (u ∈ H²(Ω) ∩ H_0^1(Ω), z ∈ H¹(Ω)^N).

Now we can summarize the relation of one- and two-sided preconditioning in the Sobolev space case for the operator L in (3.28). Namely, let K : L²(Ω)^N → L²(Ω)^N, Kz := K·z (pointwise matrix-vector product). Then

  { ⟨Lu, u⟩_{L²(Ω)} / ⟨−∆u, u⟩_{L²(Ω)} : u ∈ H²(Ω) ∩ H_0^1(Ω), u ≢ 0 } = { ⟨Kz, z⟩_{L²(Ω)^N} / ‖z‖²_{L²(Ω)^N} : z ∈ H¹_*, z ≢ 0 },

using the substitution z = ∇u and Proposition 3.5 (i). Hence cond((−∆)^{-1}L) in the H_0^1(Ω) norm coincides with cond(K) in the L²(Ω)^N norm. We note that, defining the operators ∇ and −div with domains as in Proposition 3.5, there holds K|_{H¹_*} = (−div)^{-1}L∇^{-1}, which corresponds to the decomposition B^{-T}AB^{-1} in the discrete case, and we just obtain the analogue of (2.76). Since we have again obtained the equivalence of one- and two-sided preconditioning, it suffices to restrict ourselves to the first one.
Chapter 4

Nonlinear algebraic systems and preconditioning

Nonlinear algebraic systems arise in many different contexts in numerical analysis, one of the most important being the discretization of nonlinear boundary value problems. In this chapter we briefly discuss some basic questions in the numerical solution of nonlinear algebraic systems, focusing on issues related to preconditioning. (More details and proofs will be given in the general Hilbert space setting in section 5.2. For comprehensive summaries on the solution of nonlinear algebraic systems, the reader is referred e.g. to Dennis–Schnabel [84], Kelley [176], Ortega–Rheinboldt [238].) Naturally, the main aspect of our study is the solution of systems obtained from discretized nonlinear elliptic problems.

The crucial point in the iterative solution of nonlinear algebraic systems is most often preconditioning. This principle has turned out to be generally valid in the case of linear algebraic systems (see e.g. Axelsson [11, 30, 248]). Preconditioning essentially means a suitable choice of matrices for the auxiliary linear systems that the iteration consists of, and this concept is equally relevant for handling linear and nonlinear problems if the matrices are allowed to vary stepwise (Nashed [220], Ortega–Rheinboldt [238]). This is a natural generalization of the linear preconditioning theory, since the most widespread iterations for nonlinear problems are Newton-like methods, which similarly consist of a sequence of auxiliary linear systems. We note that convergence results in terms of condition numbers also fit in this framework, which will be developed in section 5.3 (with the main ideas sketched in its introduction).

The above considerations mean that in order to solve a nonlinear algebraic system

  F(x) = b     (4.1)

in R^s, the following kind of sequence will be regarded as a preconditioned iterative sequence:

  x^{(n+1)} = x^{(n)} − (A^{(n)})^{-1}(F(x^{(n)}) − b)     (4.2)

with matrices A^{(n)} (n ∈ N) to be suitably chosen. In particular, in the case of a discretized nonlinear elliptic problem

  T_h(u_h) = g_h     (4.3)

we look for matrices A_h^{(n)} (n ∈ N) such that the preconditioned iterative sequence

  u_h^{(n+1)} = u_h^{(n)} − (A_h^{(n)})^{-1}(T_h(u_h^{(n)}) − g_h)     (4.4)
exhibits suitable convergence to the solution of (4.3).

The difficulty of finding good preconditioners lies in two conflicting requirements: the solution of the auxiliary systems should be relatively simple (i.e. not costly), but at the same time the preconditioners are expected to yield a significant improvement of convergence. In the case of linear systems, roughly speaking, the preconditioners thus have to be found as suitable intermediate matrices between the identity and the original system matrix, these being the extreme cases.

Turning to nonlinear discretized elliptic problems, suitable preconditioning is crucial again, since the condition number of the Jacobians of these systems is very large (in fact, it tends to infinity as the discretization is refined). In this case the matrices of the auxiliary linear systems are most often related to the Jacobians of the nonlinear system, that is, the iteration is some Newton-like method. The need to balance the two conflicting requirements appears again, since now the exact Jacobians are optimal matrices from the point of view of convergence but are usually more costly than other auxiliary matrices. Here we note that for the latter reason several iterations other than Newton's are also used, which are either still Newton-like (inexact or quasi-Newton methods) or are different types of successive approximations such as the frozen coefficient method. Nevertheless, Newton's method with exact Jacobians is widespread, in which case the problem of solving the auxiliary systems is often handled by so-called inner iterations (see [35]) and one encounters the problem of preconditioning again.

To sum up, preconditioning plays a key role for nonlinear problems as well, and there is no general rule as to how the preconditioners should be chosen. This crucial importance of preconditioning for discretized nonlinear elliptic problems is pointed out in this chapter for gradient- and Newton-like methods, and is summarized in its closing subsection. The latter gives the conclusions that form a basic motivation for this book.
4.1
The condition number of nonlinear algebraic systems
Let F : R^s → R^s be a given mapping. Then the general form of nonlinear algebraic systems is

  F(x) = b,     (4.5)

where b ∈ R^s is a given vector. When the nonlinear system arises from the discretization of a differential equation, the coordinate functions of F generally have a special structure, which depends both on the form of the operator and on the type of the discretization. For instance, F typically has a 'band structure', that is, for some integer k ≥ 1 (k < s) the relation

  ∂F_i(x)/∂x_j = 0   (|i − j| > k)     (4.6)

holds. For elliptic partial differential equations with local approximation we obtain such band systems, which are also sparse. We remark that such systems are generally large-sized (i.e. the number s of nonlinear equations is large), and hence it is important to take advantage of the above special properties during the numerical solution of (4.5). It is also typical that these systems have properties similar to those of ill-conditioned systems of linear algebraic equations. (For instance, we meet again the problems of numerical instability, a huge number of iterations in the numerical solution process, etc.)

In order to handle the above phenomenon in the sequel, we introduce the condition number for a class of functions arising in discretized elliptic problems. Let F : R^s → R^s be a strictly monotone nonlinear function, i.e.

  ⟨F(v) − F(u), v − u⟩ > 0   (u ≠ v ∈ R^s).

Then the condition number of F is defined as

  cond(F) = Λ(F)/λ(F),     (4.7)

where

  Λ(F) = sup_{u≠v∈R^s} ⟨F(v) − F(u), v − u⟩/‖v − u‖²,   λ(F) = inf_{u≠v∈R^s} ⟨F(v) − F(u), v − u⟩/‖v − u‖².

(As before, ⟨·,·⟩ denotes the usual Euclidean inner product in R^s, and ‖·‖ is the induced norm.) We can see that (4.7) generalizes the definition of the condition number of matrices given in subsection 2.1.2. Using the mean value theorem, we obtain the following equivalence property. A differentiable function F is said to have symmetric derivatives if the matrices F′(x) ∈ R^{s×s} are symmetric for all x ∈ R^s.

Proposition 4.1 Let F be a differentiable nonlinear function with symmetric derivatives and let M ≥ m > 0 be constants. Then the following statements are equivalent:

  (i)  m‖x − y‖² ≤ ⟨F(x) − F(y), x − y⟩ ≤ M‖x − y‖²   (x, y ∈ R^s);

  (ii)  m‖h‖² ≤ ⟨F′(x)h, h⟩ ≤ M‖h‖²   (x, h ∈ R^s).

Due to the above equivalence, m and M are simultaneously sharp or not in (i) and (ii). We say that the numbers m and M are spectral bounds of F if they are sharp. Clearly, cond(F) ≤ M/m.
Finally, we note that if F has symmetric derivatives and a positive lower bound, then system (4.5) has a unique solution x∗ ∈ Rs [127, 281].
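The equivalence in Proposition 4.1 can be illustrated numerically. In the following sketch the test function F(x) = Ax + 0.5 sin(x) (componentwise), the matrix A and the derived bounds are all hypothetical choices of ours, used here and reused in the later sketches of this chapter.

```python
import numpy as np

# F'(x) = A + 0.5 diag(cos x) is symmetric, so by Weyl's inequality the
# bounds m = lambda_min(A) - 0.5 > 0 and M = lambda_max(A) + 0.5 hold
# in (ii) for all x.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 2.0 * np.eye(5)            # SPD with lambda_min >= 2
lam = np.linalg.eigvalsh(A)
x = rng.standard_normal(5)
eig = np.linalg.eigvalsh(A + 0.5 * np.diag(np.cos(x)))
assert lam[0] - 0.5 <= eig[0] and eig[-1] <= lam[-1] + 0.5
```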
4.2
Some iterative methods for nonlinear algebraic systems
In this section we briefly summarize two important types of numerical methods for the solution of (4.5), namely, gradient and Newton-like methods. We note that the class of quasi-Newton methods at the end of subsection 4.2.2 leads to iterations of the type

  x^{k+1} = x^k − τ_k A_k^{-1}(F(x^k) − b)   (k = 0, 1, 2, ...),     (4.8)

see (4.21), which gives a general form of one-step iterative methods. Many variants of gradient and Newton-like methods, as well as other iterations of the kind (4.8), are discussed in the monograph of Ortega and Rheinboldt [238]. (Some of these variants are mentioned after (4.21). Concerning the one-step sequence (4.8), we note that CG type multistep methods in this context are in general unable to increase the possible order of convergence [33].) The crucial role of the conditioning properties of F in the realization of both gradient and Newton-like methods will be observed.
4.2.1
Gradient type methods
The gradient method, or method of steepest descent, is the analogue of that defined for linear problems in subsection 2.1.5 (see Gajewski–Gröger–Zacharias [127], Kantorovich–Akilov [165], Ortega and Rheinboldt [238]). Similarly to the linear case, its simplest realization is achieved with a stationary stepsize for nonlinear problems (4.5) with spectral bounds m and M. This sequence is as follows:

  x^{k+1} = x^k − (2/(M+m))(F(x^k) − b)   (k = 0, 1, 2, ...),     (4.9)

where x^0 is any given vector. The following theorem gives some sufficient conditions for the convergence of the gradient method.

Theorem 4.1 [127, 165]. Let F : R^s → R^s be continuously differentiable and have symmetric derivatives with spectral bounds m and M as in Proposition 4.1. Then the gradient method (4.9) with any starting vector x^0 converges linearly to the unique solution x*; namely, the estimate

  ‖x^k − x*‖ ≤ (1/m) ((M−m)/(M+m))^k ‖F(x^0) − b‖   (k ∈ N)     (4.10)

holds.

The proof is based on proving the estimate

  ‖x^{k+1} − x*‖ ≤ ((M−m)/(M+m)) ‖x^k − x*‖,     (4.11)

to which end one verifies that J(x) = x − (2/(M+m))(F(x) − b) possesses contraction constant (M−m)/(M+m).

A large number of methods more general than (4.9) have been developed with various choices of descent directions and stepsizes, and the steepest descent direction itself can be coupled with several minimization principles to define the steplength. It is not our purpose to detail these here; for a summary the reader is referred e.g. to [69, 238]. For problems with spectral bounds m and M, the sequence (4.9) yields optimal convergence if the descent is stepwise taken w.r. to the same norm. (Variable norms can also be considered in the framework of (4.8); see some references after (4.21) in the next subsection.)

We note that conjugate gradient methods have been generalized similarly from the linear case to nonlinear systems (Axelsson–Chronopoulos [15], Daniel [79], Fletcher–Reeves [116], Ortega–Rheinboldt [238]), and have become a particularly efficient tool for large-scale problems via suitable parallelization techniques (Navon–Phua–Ramamurthy [222]). The nonlinear conjugate gradient method will be mentioned, together with more details and some relaxed conditions on the gradient method, in the general Hilbert space setting in subsection 5.2.1. Similarly, the ADI method mentioned in subsection 2.3.3 can be extended to nonlinear systems as a Peaceman–Rachford type iteration, see Kellogg [177], Ortega–Rheinboldt [238]. We underline that, clearly, if the condition number M/m is large, then (4.10) provides only very slow convergence.
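As an illustration, the following minimal sketch realizes the stationary-stepsize gradient method (4.9) on the hypothetical test function introduced after Proposition 4.1.

```python
import numpy as np

# Gradient method (4.9) for F(x) = A x + 0.5 sin(x), with spectral bounds
# m = lambda_min(A) - 0.5, M = lambda_max(A) + 0.5 as before.
rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B @ B.T + 2.0 * np.eye(5)
F = lambda x: A @ x + 0.5 * np.sin(x)
lam = np.linalg.eigvalsh(A)
m, M = lam[0] - 0.5, lam[-1] + 0.5
b = rng.standard_normal(5)
x = np.zeros(5)
for k in range(200):               # linear rate (M-m)/(M+m), cf. (4.10)
    x = x - (2.0 / (M + m)) * (F(x) - b)
print(np.linalg.norm(F(x) - b))    # small residual
```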
4.2.2
Newton’s method
The widespread Newton's method defines the following iteration procedure for solving (4.5):

  x^{k+1} = x^k − F′(x^k)^{-1}(F(x^k) − b)   (k = 0, 1, 2, ...)     (4.12)

with some given vector x^0, assuming the existence of F′(x)^{-1} for all x ∈ R^s (or at least in a suitable neighbourhood of the solution). In practice, one of course does not invert F′(x^k) to carry out (4.12) but instead solves the linear system

  F′(x^k)p^k = −(F(x^k) − b)     (4.13)

and adds the 'correction term' p^k to x^k.

We give a brief discussion of Newton's method and its most frequent modifications; a minimal numerical sketch of the steps (4.12)-(4.13) is given after Theorem 4.2 below. Detailed summaries are found e.g. in Dembo–Eisenstat–Steihaug [82], Eisenstat–Walker [101], Galántai [129], Ortega–Rheinboldt [238], Ypma [295]. We note that for discretized elliptic problems, the linearization by Newton's method enables one to apply the multilevel approach (quoted in subsection 3.3.2) in the nonlinear context as well (Axelsson–Kaporin [18], Blaheta [45]), thus allowing efficient parallel realization (Deuflhard–Weiser [85], Heise [149]).

The convergence of Newton's method depends on the choice of the starting vector. If x^0 is in a sufficiently small neighbourhood of a solution of (4.5), then convergence is easily proved under some additional conditions, whereas the study of the iteration from an arbitrary starting point is much more difficult. In the sequel we formulate some convergence results.

Theorem 4.2 [74]. If the convex region D_0 ⊂ R^s contains a solution x* of (4.5), and both F′(x)^{-1} and the second derivatives of F exist and are bounded in D_0, then Newton's method (4.12) converges quadratically for all x^0 sufficiently close to x*.
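The announced sketch of (4.12)-(4.13) follows, again on the hypothetical test function used above.

```python
import numpy as np

# Newton's method: each step solves F'(x^k) p^k = -(F(x^k) - b)
# and updates x^{k+1} = x^k + p^k.
rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
A = B @ B.T + 2.0 * np.eye(5)
F = lambda x: A @ x + 0.5 * np.sin(x)
Fp = lambda x: A + 0.5 * np.diag(np.cos(x))
b = rng.standard_normal(5)
x = np.zeros(5)
for k in range(8):                  # quadratic convergence near x*
    p = np.linalg.solve(Fp(x), -(F(x) - b))
    x = x + p
print(np.linalg.norm(F(x) - b))
```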
The above result on the local convergence of Newton's method assumes that there is a solution. If we want to prove the existence of the solution, more conditions on F are required. The following theorem provides the well-posedness of (4.5) together with the local convergence of Newton's method under global assumptions on the function F, and an estimate is also given.

Theorem 4.3 Let F : R^s → R^s have a derivative satisfying

(i) ‖F′(x)h‖ ≥ λ‖h‖ (x, h ∈ R^s) with some constant λ > 0 independent of x and h;

(ii) ‖F′(x) − F′(y)‖ ≤ L‖x − y‖ (x, y ∈ R^s) with some constant L > 0 independent of x and y.

Then the system (4.5) has a unique solution x* ∈ R^s, and there exists ε > 0 such that for ‖x^0 − x*‖ < ε Newton's method (4.12) converges quadratically to x*, and there holds the quadratic estimate

  ‖x^k − x*‖ ≤ λ^{-1}‖F(x^k) − b‖ ≤ c q^{2^k} → 0     (4.14)

with c = 2λ²/L > 0 and q = (L/(2λ²))‖F(x^0) − b‖ < 1.

(The proof will be given in a general Banach space setting in subsection 5.2.2.) We mention that the weakness of these theorems is the problem of the choice of the starting vector in the iteration, that is, of defining the suitable domain of attraction in which an initial guess x^0 yields convergence of Newton's method. In the sequel we formulate conditions for the suitable choice of x^0.

Theorem 4.4 (Newton-Kantorovich) [165, 236]. Assume that F : R^s → R^s is differentiable on a convex set D_0 ⊂ R^s and that

  ‖F′(x) − F′(y)‖ ≤ L‖x − y‖     (4.15)

for all x, y ∈ D_0. Suppose that there is an x^0 ∈ D_0 such that

  ‖F′(x^0)^{-1}‖ ≤ 1/λ,     (4.16)

  ‖F′(x^0)^{-1}F(x^0)‖ ≤ µ,     (4.17)

and

  θ = Lµ/λ < 1/2.     (4.18)

Let

  S = {x : ‖x − x^0‖ ≤ t*},

where t* = (λ/L)(1 − (1 − 2θ)^{1/2}). Then Newton's method (4.12) is well defined and converges to a solution x* ∈ S of the problem (4.5), and for the rate of convergence the estimate (4.14) holds with q = 2θ.
Remark 4.1 The root x* of equation (4.5) is unique in the ball S. (It can even be shown to be unique in some larger ball centered at x^0, see [238].)

Remark 4.2 In order to analyse the conditions of the theorem, we mention that the Lipschitz continuity (4.15) is simply a smoothness condition on the function F on the convex set D_0. This condition, together with the bound (4.16), guarantees that the Jacobians F′(x) are not singular on S. The condition (4.17) can be written as ‖x^1 − x^0‖ ≤ µ, ensuring that x^1 ∈ S, which is needed to start the inductive proof of the theorem.

With a further, stronger assumption on the function F, namely the convexity requirement, we can prove the global convergence of Newton's method and even give a convenient qualitative characterization of the convergence.

Theorem 4.5 (Newton-Baluev) [236]. Assume that in problem (4.5) F is continuously differentiable and convex on the whole of R^s; further, F′(x) is non-singular and F′(x) ≥ 0 for all x ∈ R^s. Then (4.5) has a unique solution x*, and for any x^0 ∈ R^s Newton's method (4.12) converges to x* with the rate (4.14). Moreover, the convergence is monotone nonincreasing, i.e. the relation x* ≤ x^{k+1} ≤ x^k holds for all k = 1, 2, ... (in the coordinatewise sense).

The Newton-Kantorovich theorem provides sufficient but not necessary conditions for the convergence of Newton's method. (In practice the method may often converge even when the conditions are not satisfied.) The crucial problem in Newton's method is the choice of the starting vector. To define it, we usually have to carry out a lot of computations. This is because it has to be selected from the known domain of attraction, which is a ball whose radius becomes smaller and smaller as the dimension of R^s, that is, the number of equations in the system (4.5), increases.

The problem of local convergence is best handled by damping. Namely, introducing a set of parameters τ_k ∈ (0, 1], we can generalize Newton's method as follows:

  x^{k+1} = x^k − τ_k F′(x^k)^{-1}(F(x^k) − b)   (k = 0, 1, 2, ...),     (4.19)

where x^0 is a given vector. This numerical method is called the damped Newton's method. The involvement of the parameters τ_k reduces a certain norm of the correction terms and enables us to enlarge the radius of the ball from which the starting point can be chosen. The local convergence of Newton's method can be turned into global convergence if the parameters τ_k are chosen sufficiently small. A typical related result will be formulated in the general setting in Theorem 5.11. (Obviously, the parameter choice is hereby a crucial question. This problem is widely investigated, see e.g. [34] and the references therein. In that work one can find references to some further modifications not investigated in this short subsection.)

We note that in practice the auxiliary equations (4.13) are most often not solved exactly, but are generally replaced by the inequality

  ‖F′(x^k)p^k + F(x^k) − b‖ ≤ δ_k‖F(x^k) − b‖     (4.20)
with some given tolerance δ_k > 0 to define p^k. In this way inexact Newton methods are defined (see the summary of Dembo–Eisenstat–Steihaug [82]). Another way of avoiding the formation of exact Jacobians is the construction of a sequence

  x^{k+1} = x^k − τ_k A_k^{-1}(F(x^k) − b)   (k = 0, 1, 2, ...),     (4.21)

where the matrices A_k are suitable approximations of the Jacobians F′(x^k). These methods are called approximate or quasi-Newton methods, see Bank–Rose [35], Dennis–Moré [83] for summaries. We also note that sequences of the type (4.21) give a general form of one-step iterative methods: many such iterations, including the Davidon–Fletcher–Powell method, which uses a special recurrence to define A_k, are discussed in the monograph of Ortega and Rheinboldt [238]. The sequence (4.21) can also be considered in the framework of descent methods if the descent is taken w.r. to stepwise variable norms; this approach was introduced in [220, 287].

Finally, we underline that the conditioning properties of F play an important role in the realization of Newton's and related methods. Namely, if the condition number of F (and, by Proposition 4.1, also that of the Jacobians F′(x)) is large, then one has to solve ill-conditioned systems in the course of the iterations (4.12) and (4.19).
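The following sketch illustrates an inexact Newton iteration in the spirit of (4.20)-(4.21) on the same hypothetical test function; here the inner conjugate gradient solve is simply truncated, while an implementation following (4.20) literally would use a residual tolerance δ_k.

```python
import numpy as np
from scipy.sparse.linalg import cg

# Inexact Newton: the system (4.13) is solved only approximately by an
# inner CG iteration (truncated after a few steps as a simple device).
rng = np.random.default_rng(3)
B = rng.standard_normal((5, 5))
A = B @ B.T + 2.0 * np.eye(5)
F = lambda x: A @ x + 0.5 * np.sin(x)
Fp = lambda x: A + 0.5 * np.diag(np.cos(x))
b = rng.standard_normal(5)
x = np.zeros(5)
for k in range(10):
    p, info = cg(Fp(x), -(F(x) - b), maxiter=3)   # inexact inner solve
    x = x + p
print(np.linalg.norm(F(x) - b))
```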
4.3
Preconditioning for nonlinear algebraic systems
As we have seen, ill-conditioning may cause serious problems in gradient type methods. This has been discussed in sections 2.1-2.2 for linear problems and is analogous for the nonlinear iteration (4.9). Similarly, in Newton's method we may encounter many numerical difficulties during the solution of ill-conditioned auxiliary linear problems. Similarly to the case of linear systems, the remedy for difficulties with ill-conditioned problems is suitable preconditioning. In this section some related procedures are sketched. (Exact formulations and convergence results for these methods will be given in the general setting in section 5.3.)

First the gradient method is discussed, where the tool of preconditioning improves the condition number of the given nonlinear system. Then, using variable preconditioning (i.e. stepwise redefined auxiliary matrices), we can obtain a considerable speed-up of the rate of convergence. This discussion even includes Newton's method, where one usually applies preconditioned inner iterations as well in the steps of the outer Newton iteration.
4.3.1
Preconditioned simple iterations for nonlinear algebraic systems
Assume that D is a given SPD matrix with lower bound p, and let F : R^s → R^s be a C¹-mapping with symmetric derivatives satisfying the property

  m̃⟨D(x − y), x − y⟩ ≤ ⟨F(x) − F(y), x − y⟩ ≤ M̃⟨D(x − y), x − y⟩     (4.22)

with some constants 0 < m̃ ≤ M̃. Then for any vector x^0 ∈ R^s the iteration

  x^{k+1} = x^k − (2/(M̃ + m̃)) z^k,     (4.23)

where

  D z^k = F(x^k) − b,     (4.24)

converges to the unique solution x* of the equation (4.5); further, the estimate

  ‖x^k − x*‖_D ≤ (1/(m̃ p^{1/2})) ((M̃ − m̃)/(M̃ + m̃))^k ‖F(x^0) − b‖   (k ∈ N)     (4.25)

holds. (This result relies on Theorem 4.1 using the D-norm; the exact proof follows from Theorem 5.13.) We remark that here D is a spectrally equivalent preconditioner to F and, clearly, by its suitable choice the estimate (4.25) ensures faster convergence than (4.10). As a consequence of Proposition 4.1, condition (4.22) can be given equivalently as

  m̃⟨Dh, h⟩ ≤ ⟨F′(x)h, h⟩ ≤ M̃⟨Dh, h⟩   (x, h ∈ R^s).     (4.26)
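As an illustration of (4.23)-(4.24), the following sketch uses a fixed diagonal (Jacobi-type) preconditioner D on the hypothetical test function used above; obtaining the bounds of (4.22) from generalized eigenvalue problems is our own illustrative device.

```python
import numpy as np
from scipy.linalg import eigh

# Since A - 0.5 I <= F'(x) <= A + 0.5 I here, the spectral bounds of
# (4.22) relative to D follow from two generalized eigenvalue problems.
rng = np.random.default_rng(4)
B = rng.standard_normal((5, 5))
A = B @ B.T + 2.0 * np.eye(5)
F = lambda x: A @ x + 0.5 * np.sin(x)
D = np.diag(np.diag(A))                                    # Jacobi-type SPD D
mt = eigh(A - 0.5 * np.eye(5), D, eigvals_only=True)[0]    # m~ in (4.22)
Mt = eigh(A + 0.5 * np.eye(5), D, eigvals_only=True)[-1]   # M~ in (4.22)
b = rng.standard_normal(5)
x = np.zeros(5)
for k in range(100):
    z = np.linalg.solve(D, F(x) - b)    # the auxiliary system (4.24)
    x = x - (2.0 / (Mt + mt)) * z
print(np.linalg.norm(F(x) - b))
```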
4.3.2
Variable preconditioning and Newton’s method
A natural generalization of (4.23)-(4.24) is preconditioning at every iteration step, that is, the use of so-called variable preconditioners. This means that we replace the matrix D by a suitable set of matrices D_k during the iteration. In this case we assume that for all k ∈ N the matrices D_k are SPD and each of them is spectrally equivalent to the derivative matrix F′(x^k), that is, the relation

  m̃_k⟨D_k h, h⟩ ≤ ⟨F′(x^k)h, h⟩ ≤ M̃_k⟨D_k h, h⟩   (h ∈ R^s)     (4.27)

holds for all k with some numbers M̃_k ≥ m̃_k > 0. Then the algorithm is as follows:

  x^{k+1} = x^k − (2/(M̃_k + m̃_k)) z^k,     (4.28)

where

  D_k z^k = F(x^k) − b.     (4.29)

We note that the above sequence is of the general form (4.21) of one-step iterative methods. Further, variable preconditioning has also been used in the context of linear systems, see Axelsson–Nikolova [27], Jung–Nepomnyashchikh [158]. The suitable choice of variable preconditioners can speed up the convergence of the iterative method, since one can improve the contractivity (4.11) stepwise. In this way one can achieve the convergence quotient

  q = lim sup_k (M̃_k − m̃_k)/(M̃_k + m̃_k)

instead of that in (4.25) (for the proof see subsection 5.3.3). As we can see, we can even obtain a superlinear rate if q = 0. Moreover, in the optimal case we arrive at Newton's method: namely, the choice D_k = F′(x^k) (with corresponding spectral bounds M̃_k = m̃_k = 1) shows that Newton's method can also be interpreted as a preconditioned iteration with variable preconditioners, where the latter are chosen as the Jacobian matrices at the elements x^k.

We underline that the step (4.29) in the preconditioned iteration means the solution of a system of linear algebraic equations, similarly to the step (4.13) in Newton's method. When the structure of D_k is not simple enough for (4.29) to be solved directly, we generally use some inner iteration, which needs preconditioning again [35]. This is the typical case for Newton's method, where obviously the matrix D_k = F′(x^k) is derived in the course of the iteration instead of being chosen with some prescribed structure. Therefore the results of the linear theory, developed in Chapter 2, are well applicable to the nonlinear case in this way. In particular, when Newton's method is realized by using an inner iteration for the auxiliary equations (4.13), then the same spectral inequality (4.27) as above can be used to define a suitable preconditioner for the inner iteration to find p^k. Namely, if the matrix D_k satisfies (4.27), then (4.13) can be replaced by the preconditioned form

  D_k^{-1} F′(x^k) p^k = −D_k^{-1}(F(x^k) − b),     (4.30)

and the simple or conjugate gradient methods for (4.30) yield linear convergence with quotients

  (M̃ − m̃)/(M̃ + m̃)   and   (√M̃ − √m̃)/(√M̃ + √m̃),

respectively (cf. subsection 2.1.5). In practice generally the latter is chosen. The stopping criterion for the inner iteration is usually of the form (4.20).
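Finally, the following heuristic sketch illustrates variable preconditioning (4.28)-(4.29) on the same hypothetical test function, with D_k rebuilt at each step as the diagonal of the current Jacobian and the bounds of (4.27) estimated locally; this bound estimation is an illustrative choice of ours, not a prescription from the text.

```python
import numpy as np
from scipy.linalg import eigh

# Variable preconditioning: D_k = diag(F'(x^k)); taking D_k = F'(x^k)
# itself (with bounds 1) would recover Newton's method.
rng = np.random.default_rng(5)
B = rng.standard_normal((5, 5))
A = B @ B.T + 2.0 * np.eye(5)
F = lambda x: A @ x + 0.5 * np.sin(x)
Fp = lambda x: A + 0.5 * np.diag(np.cos(x))
b = rng.standard_normal(5)
x = np.zeros(5)
for k in range(100):
    J = Fp(x)
    Dk = np.diag(np.diag(J))                 # variable SPD preconditioner
    lam = eigh(J, Dk, eigvals_only=True)     # local bounds m_k, M_k of (4.27)
    z = np.linalg.solve(Dk, F(x) - b)        # the auxiliary system (4.29)
    x = x - (2.0 / (lam[-1] + lam[0])) * z
print(np.linalg.norm(F(x) - b))
```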
4.3.3
The problem of preconditioning for discretized nonlinear elliptic equations
Let us now assume that the nonlinear algebraic system

  F(x) = b   in R^s     (4.31)

in (4.5) arises from the discretization of an elliptic boundary value problem in R^N. This kind of system is a main example of an ill-conditioned nonlinear problem. Namely, the uniform ellipticity of the nonlinear BVP means that the condition number of F in the discretized problems is proportional to the condition number of the discretizations of a fixed linear BVP. Hence, by the results quoted in subsection 2.2.1, for second order problems there holds cond(F) = O(s^{2/N}) or, in terms of a discretization parameter h,

  cond(F) = O(h^{-2}),     (4.32)

see (2.58)-(2.59). This means that the condition number grows unboundedly as the discretization is refined. Hence the iterative methods of section 4.2 converge slowly, or their numerical realization may even fail, unless suitable preconditioning is used.

The most important ways in which a preconditioner may improve the convergence have been sketched in the preceding two subsections. Accordingly, the crucial point in the iterative solution is to find a suitable matrix D in (4.24), or matrices D_k in (4.29) or (4.30), respectively, that satisfy the requirements of a good preconditioner. That is, similarly to what was mentioned in subsection 2.3.1, the resulting condition number should be considerably better than the original one and, at the same time, the solution of the auxiliary systems containing D or D_k is expected to be inexpensive and/or carried out in a standard way. Since these two requirements are conflicting, there is no general rule as to how these preconditioners should be defined, and hence the construction of the matrices D or D_k often has to be based on individual considerations for the different problems.

The aim of this book in the sequel is to develop an organized way of constructing a certain class of preconditioners for discretized nonlinear elliptic problems. The proposed preconditioning matrices are those arising as the discretizations of suitable linear elliptic boundary value problems.
Part II Theoretical background
85
Chapter 5 Nonlinear equations in Hilbert space This chapter is devoted to the Hilbert space background of the results that will be used for the studied nonlinear elliptic problems. We investigate equations F (u) = b
(5.1)
in a real Hilbert space H with monotone potential operators F . The given abstract results form the basis of the later chapters in the sense that both solvability and the convergence of iterations concerning boundary value problems will be derived from the Hilbert space theorems. The main part of this chapter is section 5.3, which summarizes preconditioning in Hilbert space. It also reflects that the concept of preconditioning is able to provide a general framework to discuss iterative methods. As referred to in Chapter 1, the main common theoretical feature of the studied elliptic problems is as follows. The generalized differential operators corresponding to these problems (except for some of the type in subsection 1.6.2) are monotone potential operators, i.e. the solutions are minimizers of suitable convex functionals. This property underlies the theoretical discussion on solvability and conditioning. Therefore the results related to the monotone potential property are presented with proofs even when cited, in order to provide an arranged theoretical background. (Besides, for the latter reason proofs are also given for Newton-like methods, where monotonicity is not used.) The monotone potential operator property will be in fact used in a uniform kind which also assumes the Gateaux differentiability of the operator. Namely, besides the suitable continuity of F ′ , the operators F ′ (u) are symmetric and satisfy mkhk2 ≤ hF ′ (u)h, hi ≤ M khk2
(u, h ∈ H)
(5.2)
with some constants M ≥ m > 0, sometimes allowing M to depend on kuk. This property (or at least the left side inequality of (5.2)) will be the basic common ingredient in the proofs of well-posedness and preconditioned convergence results in this chapter. Accordingly, it underlies the application of the abstract results to boundary value problems, and will therefore be verified for these problems at the beginning of their study in section 6.1. 87
88
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
More details and general theorems on minimization and monotone operators are found in the summaries of C´ea [69], Gajewski–Gr¨oger–Zacharias [127], Langenbach [190] and Vainberg [281], which also involve iterative methods. The monograph of Ortega and Rheinboldt [238] also includes many results in normed spaces. More general monotone operators are discussed by Francu, see [117] and the references there. In the sequel let H be a real Hilbert space with inner product h ., . i and corresponding norm k . k. We consider equations of the form (5.1) in H with monotone potential operators F . The basic notions to be used in these theorems are given below in section 5.1, see also the Appendix (Chapter 11). For more background on nonlinear functional analysis see Zeidler [296] and the references cited above. The first section of the chapter gives existence and uniqueness results using potentials and monotonicity properties, then section 5.2 summarizes the required results on gradient and Newton-like iterations for differentiable operators. (Newton methods are presented using an in that case simpler Banach space setting.) Using these results, preconditioning in Hilbert spaces is developed in the main section 5.3 including fixed and variable preconditioning operators in a general framework. The latter is completed by some remarks on the connection of preconditioning and gradients in section 5.4.
5.1
Potentials and monotone operators
This section contains existence and uniqueness theorems for (5.1). The main theoretical result that will be relied on in the applications is Theorem 5.1. Its basic assumptions are the symmetry and positivity of F ′ (u), which guarantees that F is a monotone potential operator. Then existence and uniqueness can be obtained via the variational principle by minimizing the convex potential corresponding to the equation (5.1), i.e. the functional φ : H → R satisfying φ′ (u) = F (u) − b
(u ∈ H).
(5.3)
Theorem 5.1 will be followed by some more general results concerning monotone operators, including the non-potential case. For the convenient formulation of the results, we will use the following notion throughout this section: Definition 5.1 The nonlinear operator F : H → H has a bihemicontinuous symmetric Gateaux derivative if (i) F is Gateaux differentiable; (ii) F ′ is bihemicontinuous; (iii) for any u ∈ H the operator F ′ (u) is self-adjoint. Remark 5.1 If F has a bihemicontinuous symmetric Gateaux derivative then F is a potential operator, i.e. there exists a functional Ψ : H → R satisfying Ψ′ (u) = F (u) (u ∈ H) in Gateaux sense. Moreover, for operators with bihemicontinuous Gateaux derivatives the symmetry of the derivatives is necessary and sufficient for the existence of a potential. (See Gajewski–Gr¨oger–Zacharias [127], Zeidler [296].)
5.1. POTENTIALS AND MONOTONE OPERATORS
89
The basic existence and uniqueness theorem is now formulated for operators with a bihemicontinuous symmetric Gateaux derivative. (For analogous results see Gajewski– Gr¨oger–Zacharias [127], Langenbach [190], Vainberg [281].) Theorem 5.1 Let H be a real Hilbert space and let the operator F : H → H have the following properties: (i) F has a bihemicontinuous symmetric Gateaux derivative; (ii) there exist a constant m > 0 such that hF ′ (u)h, hi ≥ mkhk2
(u, h ∈ H).
(5.4)
Then for any b ∈ H the equation F (u) = b has a unique solution u∗ ∈ H, which is also the unique minimizer of satisfying (5.3).
φ : H → R
Proof. Assumption (i) implies that F has a potential Ψ : H → R. Let φ(u) := Ψ(u) − hu, bi
(u ∈ H).
Then φ satisfies (5.3), hence the theorem will be proved by verifying that φ has a unique minimizer u∗ . For any u ∈ H there holds 1 m φ(u) = φ(0) + hφ (0), ui + hφ′′ (θu)u, ui ≥ φ(0) + kuk − kφ′ (0)k kuk. 2 2
′
Hence lim φ(u) = +∞
kuk→∞
(5.5)
and φ is bounded below. Let (un ) ⊂ H such that lim φ(un ) = inf φ(u).
n→∞
u∈H
Then {φ(un )}n∈N is bounded, hence from (5.5) {kun k}n∈N is bounded. Let (ukn ) be a subsequence that converges weakly to some u∗ ∈ H, especially lim hφ′ (u∗ ), ukn i = hφ′ (u∗ ), u∗ i.
n→∞
Assumption (ii) yields that φ is (strictly) convex, which implies that φ(ukn ) ≥ φ(u∗ ) + hφ′ (u∗ ), ukn − u∗ i. Letting n → ∞ , we obtain
inf φ(u) ≥ φ(u∗ ),
u∈H
i.e. φ has a minimum at u∗ . Finally, this minimizer is unique owing to the strict convexity of φ.
90
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Remark 5.2 We note that the growth of φ can be estimated from below quadratically. Namely, using that φ′ (u∗ ) = 0, φ′′ = F ′ and (5.4), the Taylor expansion implies that 1 m φ(u) − φ(u∗ ) = hφ′′ (u∗ + θ(u − u∗ ))(u − u∗ ), u − u∗ i ≥ ku − u∗ k2 . 2 2 Remark 5.3 Theorem 5.1 remains also valid if the inequality in assumption (ii) is replaced by hF ′ (u)h, hi ≥ m(kuk)khk2 (u, h ∈ H) with some function m satisfying limr→∞ r m(r) = ∞. (This is verified using the same way of proof. For related results see Gajewski–Gr¨oger–Zacharias [127], Vainberg [281].) Now Theorem 5.1 will be extended to non-injective operators that can be factorized to a monotone potential operator. Definition 5.2 Let H be a real Hilbert space and H0 ⊂ H a closed subspace. A nonlinear operator F in H is called translation invariant with respect to H0 if for any u ∈ D(F ) and h ∈ H0 we have u + h ∈ D(F ) and F (u + h) = F (u). Remark 5.4 If F is Gateaux differentiable and translation invariant with respect to H0 then for any u ∈ D(F ) the inclusion H0 ⊂ kerF ′ (u) holds. Theorem 5.2 Let H be a real Hilbert space and F : H → H be a nonlinear operator which is translation invariant with respect to some H0 ⊂ H. Assume that (i) F has a bihemicontinuous symmetric Gateaux derivative; (ii) there exists a constant m > 0 such that for all u ∈ H and h ∈ H0⊥ hF ′ (u)h, hi ≥ mkhk2 . (In contrast, by Remark (5.4), we have hF ′ (u)h, hi = 0 for h ∈ H0 .) Then the following assertions hold. (1) R(F ) = F (0) + H0⊥ . (2) For any b ∈ R(F ) there exists a unique u∗ ∈ H0⊥ such that for any h ∈ H0 F (u∗ + h) = b , and thus all solutions of equation F (u) = b are obtained. Proof. The results follow by applying Theorem 5.1 in the space H0⊥ and the translation invariance of F . To help better understanding, the following two examples illustrate the preceding theorems. The application of these and the later results to other problems will be the subject of the further chapters.
5.1. POTENTIALS AND MONOTONE OPERATORS
91
Example 5.1. Let the Hilbert space H = H01 (Ω) be endowed with the inner product Z hu, viH01 = ∇u · ∇v . Ω
Let us consider the operator F : hF (u), viH01 =
H01 (Ω)
Z
Ω
→ H01 (Ω) defined by the equality
f (x, ∇u) · ∇v
(u, v ∈ H01 (Ω)),
(5.6)
where Ω ⊂ RN is a bounded domain, further, the function f ∈ C 1 (Ω × RN , RN ) has (x,η) whose eigenvalues λ satisfy symmetric Jacobians ∂f∂η 0<m≤λ≤M
(5.7)
with constants M ≥ m > 0 independent of (x, η). This operator F is the weak or generalized differential operator corresponding to the boundary value problem (
−div f (x, ∇u) = g(x) u|∂Ω = 0 ,
(5.8)
see the Appendix (Chapter 11), paragraph (c) of A2. Then one can prove that F has a bihemicontinuous symmetric Gateaux derivative. Namely, this derivative is given by 1 1Z hF (u)h, viH01 = lim hF (u+th)−F (u), viH01 = lim ((f (x, ∇u+t∇h)−f (x, ∇u)·∇v t→0 t t→0 t Ω Z ∂f = (x, ∇u)∇h · ∇v (u, h, v ∈ H01 (Ω)). (5.9) Ω ∂η The symmetry of the Jacobians implies that F ′ (u) is self-adjoint. Further, owing to the continuity of the Jacobians, for any u, k, w, h ∈ H01 (Ω) the function ′
(s, t) 7→
∂f (x, ∇u + s∇k + t∇w)∇h ∂η
is continuous from R2 to RN , and from this one can prove that also the mapping (s, t) 7→ F ′ (u + sk + tw)h is continuous from R2 to H01 (Ω). The exact proof of these properties requires suitable integral estimates and is found in detail for a more general case in Theorem 6.1. The relation (5.9) and assumption (5.7) imply mkhk2H01 ≤ hF ′ (u)h, hiH01 ≤ M khk2H01
(u, h ∈ H01 (Ω)).
(5.10)
The left side of this inequality and the properties checked above imply that Theorem 5.1 is valid for F . We note that if H01 (Ω) is replaced by H 1 (Ω) in the definition (5.6), then F becomes the generalized differential operator corresponding to the Neumann problem for the equation −div f (x, ∇u) = g(x), that is, the boundary condition in (5.8) is replaced by ∂u/∂ν = 0. In this case Theorem 5.2 can be applied if the one-dimensional subspace H0 is defined to consist of constant functions on Ω. The above chain of ideas will be followed in Chapter 6 to prove existence and uniqueness for boundary value problems with more general classes of operators.
92
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Remark 5.5 The above operator F has a potential by the general result quoted in Remark 5.1. For this particular example it is easy to give the potential explicitly. Namely, the assumptions on f imply the existence of a C 1 function ψ : Ω×RN → R satisfying ∂ψ (x, η) = f (x, η). ∂η Let us introduce the functional Ψ : H01 (Ω) → R, Ψ(u) ≡
Z
Ω
ψ(x, ∇u) .
(5.11)
Then one can verify (similarly to the differentiability of F above) that ′
hΨ (u)viH01 =
Z
Ω
Z ∂ψ (x, ∇u) · ∇v = f (x, ∇u) · ∇v = hF (u), viH01 ∂η Ω
for all u, v ∈ H01 (Ω), i.e. Ψ′ (u) = F (u). That is, Ψ is a potential of F . The weak solution of problem (5.8) is the minimizer of the functional φ(u) = Ψ(u) −
Z
Ω
gu =
Z
Ω
(ψ(x, ∇u) − gu) .
(5.12)
We note that owing to φ′′ = F ′ the left side of estimate (5.10) expresses the uniform convexity of φ. As the proof of Theorem 5.1 shows, this property ensures the existence of the minimizer in (5.12).
Example 5.2. An obvious example of potential operator is the class of bounded self-adjoint linear operators, where the potential is the quadratic functional (Kantorovich-Akilov [165], Mikhlin [215]). This shows that the above theorems on nonlinear operators are extensions of the linear case. Namely, let A : H → H be a bounded self-adjoint linear operator and b ∈ H. Using that the Hilbert space H is real, it is elementary to verify that the quadratic functional 1 φ(u) = hAu, ui − hb, ui 2
(u ∈ H)
has the Gateaux derivative φ′ (u) = Au − b (cf. [165]). We note that the potential criteria in Remark 5.1 are clearly satisfied in this case, i.e. the operator F (u) = Au − b (u ∈ H) (5.13) has a trivial bihemicontinuous symmetric Gateaux derivative. Namely, for all u ∈ H we have F ′ (u) = A, which is self-adjoint and the constant mapping F ′ ≡ A is bihemicontinuous.
5.1. POTENTIALS AND MONOTONE OPERATORS
93
Finally, if A is also strongly positive, i.e. it satisfies hAh, hi ≥ mkhk2
(u, h ∈ H)
with a constant m > 0 independent of h, then (5.4) holds for (5.13) and hence the well-posedness of equation Au = b is a special case of Theorem 5.1. Now we turn to some results whose scope includes certain non-potential operators. By Remark 5.1 such results are required for nonsymmetric problems. The lack of the potential requires subtler monotonicity conditions to achieve existence results. First some necessary definitions are given. Definition 5.3 Let H be a real Hilbert space. The operator F : H → H is called (i) strongly monotone if there exists m > 0 such that hF (u)−F (v), u−vi ≥ mku−vk2 (u, v ∈ H); (ii) strictly monotone if hF (u) − F (v), u − vi > 0 (u, v ∈ H, u 6= v); (iii) monotone if hF (u) − F (v), u − vi ≥ 0 (u, v ∈ H); (iv) pseudomonotone if the assumptions un → u and lim suphF (un ), un −ui ≤ 0 imply lim infhF (un ), un − vi ≥ hF (u), u − vi for all v ∈ H. Remark 5.6 A monotone operator is pseudomonotone if it is hemicontinuous (see e.g. Francu [117]). Hence the above definitions are written in weakening order. Definition 5.4 Let H be a real Hilbert space. The operator F : H → H is called (i) bounded if there exists an increasing function ϕ : R+ → R+ such that kF (u)k ≤ ϕ(kuk) (u ∈ H); (ii) coercive if
hF (u), ui → ∞ as kuk → ∞. kuk
Remark 5.7 For example, it is easy to see that the operator F in Theorem 5.1 is strongly monotone, and hence coercive; further, if condition (ii) of Theorem 5.1 is completed with the upper estimate hF ′ (u)h, hi ≤ M (kuk)khk2 with some increasing function M , then it also follows that F is bounded. The condition of strong monotonicity can be much weakened under suitable technical assumptions to achieve existence. This is formulated in the following theorem (Francu [117]). Theorem 5.3 Let H be a real Hilbert space and let the operator F : H → H be pseudomonotone, bounded, coercive and continuous in finite dimension. Then (1) for any b ∈ H the equation (5.1) has a solution u∗ ∈ H;
94
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
(2) if strict monotonicity is assumed instead of pseudomonotonicity, then this solution is also unique. Finally, a proposition is quoted from [117] that helps checking pseudomonotonicity for a class of operators. Proposition 5.1 Let H be a real Hilbert space and let the operator F : H → H have the form F (u) = B(u, u), where B : H × H → H has the following properties. (a) Continuity conditions: (a1) u 7→ B(u, v) is hemicontinuous and bounded for each v ∈ H; (a2) v 7→ B(u, v) is hemicontinuous for each u ∈ H;
(a3) un ⇀ u, hB(un , un ) − B(un , u), un − ui → 0 (v ∈ H);
⇒
B(un , v) ⇀ B(u, v)
(a4) un ⇀ u, B(un , v) ⇀ b ⇒ hB(un , v), un i → hb, ui (v ∈ H). (b) Monotonicity in the second argument: hB(u, u) − B(u, v), u − vi ≥ 0
(u, v ∈ H).
Then the operator F is pseudomonotone. We note that Theorem 5.3 and Proposition 5.1 are also valid for operators F : X → X in reflexive Banach spaces X. A typical example of pseudomonotone operator is F : H01 (Ω) → H01 (Ω) defined by ∗
hF (u), viH01 =
Z
Ω
a(x, u)∇u · ∇v
(u, v ∈ H01 (Ω)),
where Ω ⊂ RN is a bounded domain and the function a ∈ C(Ω × R) satisfies 0 < µ1 ≤ a(x, s) ≤ µ2
(x ∈ Ω, s ∈ R)
with constants µ1 , µ2 independent of (x, s). The application of Theorem 5.3 to this operator will be sketched in subsection 6.2.3.
5.2 5.2.1
Iterative methods for smooth mappings Simple iterations (gradient method)
The construction of simple iterations for equations involving a monotone potential operator F can be discussed in the framework of gradient (steepest descent) methods. Namely, let φ : H → R be a potential corresponding to the equation F (u) = b,
5.2. ITERATIVE METHODS FOR SMOOTH MAPPINGS
95
i.e. a functional satisfying φ′ (u) = F (u) − b (u ∈ H). Then a sequence of the form un+1 = un − αφ′ (un ) = un − α(F (un ) − b)
(n ∈ N)
(with some α > 0) defines a steepest descent iteration for the minimization of φ. (We note that we will consider fixed stepsizes α > 0 in the iterations since the available estimates for the rate of convergence are the same as for variable stepsizes.) Gradient-like methods in Hilbert spaces have been thoroughly investigated beginning with the works of Kantorovich. The reader is referred to C´ea [69], D’yakonov [96], Gajewski–Gr¨oger–Zacharias [127], Kantorovich-Akilov [165], Langenbach [190], Poljak [244], Vainberg [281]. The main theorem that will be used for simple iterations is formulated below, using optimal constant stepsize. This result is the Hilbert space analogue of the finite-dimensional case in Theorem 4.1, and is verified under the assumptions of Theorem 5.1 plus the upper side of the spectral estimate for F ′ (u). (We note that the proof uses the fixed point principle, this will be mentioned in Remark 5.8 afterwards. Simple iterations in fixed point context are thoroughly discussed in Browder–Petryshyn [59] and Gajewski–Gr¨oger–Zacharias [127].) Theorem 5.4 (cf. [127, 165]). Let H be a real Hilbert space and let the operator F : H → H have the following properties: (i) F has a bihemicontinuous symmetric Gateaux derivative; (ii) there exist constants M ≥ m > 0 such that for all u, h ∈ H, mkhk2 ≤ hF ′ (u)h, hi ≤ M khk2 .
(5.14)
Let b ∈ H be arbitrary and denote by u∗ ∈ H the unique solution of the equation F (u) = b. Then for any u0 ∈ H the sequence un+1 := un −
2 (F (un ) − b) M +m
(n ∈ N)
(5.15)
converges to u∗ according to the linear estimate M −m 1 kun − u k ≤ kF (u0 ) − bk m M +m ∗
n
(n ∈ N) .
Proof. The existence and uniqueness of u∗ follows from Theorem 5.1. Let us introduce the operator J : H → H defined by J(u) ≡ u −
2 F (u) M +m
(u ∈ H).
Then for all u the Gateaux derivative J ′ (u) exists and J ′ (u)h = h −
2 F ′ (u)h M +m
(u ∈ H),
(5.16)
96
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
hence J ′ (u) is also self-adjoint. In virtue of (5.14) we can apply Lemma 3.2 with A = F ′ (u) and B = I (the identity operator), and we obtain that J ′ (u) is a contraction M −m with constant M . Then J is also a contraction with the same constant since +m kJ(v) − J(u)k = k
Z
1
0
′
J (u + t(v − u))(v − u)dtk ≤
Z
0
1
kJ ′ (u + t(v − u))(v − u)kdt
M −m M −m kv − ukdt = kv − uk. M +m 0 M +m The construction of the sequence (5.15) implies ≤
Z
1
un+1 − u∗ = un − u∗ −
2 (F (un ) − F (u∗ )) = J(un ) − J(u∗ ), M +m
(n ∈ N),
hence we obtain kun+1 − u∗ k ≤ By induction
M −m kun − u∗ k M +m
M −m M +m and here the lower bound m of F implies kun − u∗ k ≤
n
(n ∈ N).
ku0 − u∗ k
(n ∈ N)
ku0 − u∗ k ≤ (1/m)kF (u0 ) − bk, which yields the required estimate. Remark 5.8 The proof shows that the iteration (5.15) can be regarded as a fixed point iteration instead of a minimizing sequence, if the equation F (u) = b is rewritten as 2 (F (u) − b). u=u− M +m Then, clearly, for the proof one only had to verify that the operator J =I−
2 F M +m
(where I is the identity operator) is a contraction with constant
M −m . M +m
Remark 5.9 Under the assumption that F has a bihemicontinuous Gateaux derivative, the symmetry of F ′ (u) and condition (5.14) are respectively equivalent to the following conditions: (a) F is a potential operator; (b) F is Lipschitz continuous and strongly monotone: kF (u) − F (v)k ≤ M ku − vk,
hF (u) − F (v), u − vi ≥ mku − vk2
(u, v ∈ H).
We note that Theorem 5.4 can be verified by assuming the above conditions (a)-(b) only, without the Gateaux differentiability of F . Moreover, one can further generalize
5.2. ITERATIVE METHODS FOR SMOOTH MAPPINGS
97
the result for non-potential operators satisfying (b), but in this case the quotient 1/2
M −m M +m
is replaced by the greater one (1 − (m/M )2 ) . Both generalizations are found in Gajewski–Gr¨oger–Zacharias [127]. However, in most examples the conditions of Theorem 5.4 are satisfied, as is the case for the model problems in Chapter 1. (As will turn out in Chapter 6, the operators corresponding to the model problems are special cases of Theorems 6.1–6.2 which establish the conditions of Theorem 5.4 for general types of elliptic operators.) Another kind of generalization of Theorem 5.4 allows the upper bound in (5.14) to depend on kuk (cf. Gajewski–Gr¨oger–Zacharias [127], Kar´atson [170], Vainberg [281]). The following formulation is quoted from [170], where a convergence estimate analogous to (5.16) is given. Theorem 5.5 Let H be a real Hilbert space and let the operator F : H → H have the following properties: (i) F has a bihemicontinuous symmetric Gateaux derivative; (ii) there exists a constant m > 0 and an increasing function M : [0, ∞) → (0, ∞) such that for all u, h ∈ H mkhk2 ≤ hF ′ (u)h, hi ≤ M (kuk)khk2 .
(5.17)
Let b ∈ H be arbitrary and denote by u∗ ∈ H the unique solution of the equation F (u) = b. Let u0 ∈ H be arbitrary,
M0 := M ku0 k +
1 kF (u0 ) − bk . m
(5.18)
(n ∈ N)
(5.19)
Then the sequence un+1 = un −
2 (F (un ) − b) M0 + m
converges linearly to u∗ , namely, 1 M0 − m kun − u k ≤ kF (u0 ) − bk m M0 + m
∗
n
(n ∈ N) .
(5.20)
Proof. It goes similarly to Theorem 5.4. One can verify by induction that un ∈ B(u0 , r0 )
(n ∈ N) ,
(5.21)
where r0 = (1/m)kF (u0 ) − bk
and B(u0 , r0 ) denotes the ball with radius r0 centered at u0 . Using the estimate kuk ≤ ku0 k + r0
(5.22)
for u ∈ B(u0 , r0 ), the upper bound of F ′ (u) on B(u0 , r0 ) is at most M0 defined in (5.18) owing to (5.17). This bound can be used throughout the iteration by (5.21), hence the proof is completed as in Theorem 5.4 with M0 instead of M . For more details see [170].
98
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Remark 5.10 The following further modifications of Theorem 5.5 use the results of [170]. (Related earlier results when F is locally Lipschitz continuous are found in [221, 281].) (i) In assumption (ii) of Theorem 5.5 the inequality may be replaced by the weaker one m(kuk)khk2 ≤ hF ′ (u)h, hi ≤ M (kuk)khk2 (5.23) with some function m satisfying limr→∞ rm(r) = ∞. Then the constant m in (5.19)(5.20) is also replaced by some m0 depending on u0 . Here one obtains the values m0 = m(r0 + ku0 k),
M0 = M (r0 + ku0 k),
(5.24)
where r0 > 0 is the smallest number such that kF (u0 ) − bk ≤ m(r0 + ku0 k)r0 . We note that the proof goes similarly to Theorem 5.5. One verifies (5.21) again by induction with the present r0 , and by (5.22) applies the proof of Theorem 5.4 with bounds M0 and m0 . (ii) We also note that if the condition (5.23) is replaced by m(ku − u0 k)khk2 ≤ hF ′ (u)h, hi ≤ M (ku − u0 k)khk2 ,
(5.25)
i.e. the bound functions are given depending on u0 , then we simply have M0 = M (2ku0 − u∗ k) ,
m0 = m (2ku0 − u∗ k) .
(Namely, in this case (5.21) is replaced by un ∈ B(u∗ , r0 ) with r0 = ku0 − u∗ k.) The unknown vector u∗ can be avoided by the following estimates, provided that the functions ˜ (t) = tM (t) m(t) ˜ = tm(t), M are strictly increasing. Estimate (5.25) implies ∗ kF (u0 ) − bk ≥ m(ku0 − u∗ k) ku0 − u∗ k = m(ku ˜ 0 − u k) ,
hence
(5.26)
(5.27)
M0 ≤ M 2m ˜ −1 (kF (u0 ) − bk) . Similarly,
˜ −1 (kF (u0 ) − bk) . m0 ≥ m 2M
The following theorem gives convergence for the class of non-injective operators studied in Theorem 5.2. (For details see [168].) Theorem 5.6 Let H be a real Hilbert space and F : H → H be a nonlinear operator which is translation invariant with respect to some H0 ⊂ H. Assume that (i) F has a bihemicontinuous symmetric Gateaux derivative;
5.2. ITERATIVE METHODS FOR SMOOTH MAPPINGS
99
(ii) there exist constants M ≥ m > 0 such that for all u ∈ H and h ∈ H0⊥ mkhk2 ≤ hF ′ (u)h, hi ≤ M khk2 . (In contrast, by Remark 5.4, we have hF ′ (u)h, hi = 0 for h ∈ H0 .) Let b ∈ F (0) + H0⊥ and denote by u∗ ∈ H the unique solution of equation F (u) = b that satisfies u∗ ∈ H0⊥ . Then for any u0 ∈ H0⊥ the sequence un+1 = un −
2 (F (un ) − b) (n ∈ N) M +m
converges to u∗ according to the linear estimate kun − u∗ k ≤
1 M −m kF (u0 ) − bk m M +m
n
(n ∈ N) .
Proof. We apply Theorem 5.4 in H0⊥ . The above defined simple or gradient iterations can be generalized to obtain a kind of conjugate gradient (CG) method: similarly to the linear case, the linear convergence quotient is improved using an appropriate construction of conjugate directions. Such methods have been cited in Chapter 4 for the finite-dimensional case (Daniel [79], Fletcher–Reeves [116], Ortega–Rheinboldt [238]). The extension to Hilbert space below is due to Daniel [79]. First we summarize the assumptions and the construction of the approximating sequence. The CG assumptions. Let H be a real Hilbert space, F : H → H a continuous operator such that (i) F is twice Gateaux differentiable; (ii) the first Gateaux derivative of F is bihemicontinuous, symmetric and satisfies mkhk2 ≤ hF ′ (u)h, hi ≤ M khk2
(u, h ∈ H)
with constants M ≥ m > 0 independent of u, h; (iii) there exist u0 ∈ H and constants R, B > 0 such that for any u ∈ B(u0 , R) := {u ∈ H : ku − u0 k ≤ R} there holds kF ′′ (u)k ≤ B; (iv) let b ∈ H and φ : H → R such that φ′ (u) = F (u) − b. (This φ exists by the previous assumptions.) We assume that {u ∈ H : φ(u) ≤ φ(u0 )} ⊂ B(u0 , R) holds for the level set corresponding to u0 . (We note that the original paper [79] assumes Frˆechet differentiability in (i), but the Gateaux sense suffices if the bihemicontinuity of F ′ in is assumed in (ii) as above.)
100
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Construction of the CG iteration. Let u0 ∈ H be as in assumption (iii), p0 = r0 = b − F (u0 ). For n ∈ N = {0, 1, . . .}, successively, let un+1 := un + cn pn where cn is the smallest positive root of hF (un + cpn ) − b, pn i = 0; set rn+1 := b − F (un+1 ), pn+1 := rn+1 + bn pn , where bn := −hF ′ (un+1 )pn , rn+1 i/hF ′ (un+1 )pn , pn i. We use the following for any n ∈ N let εn := hF ′ (un )−1 rn , rn i1/2 . further notations: √ ηn MB 4mM M −m M Further, let d := MB3 3 + 2m , + dεn , q := M , ηn := 2m 2 εn , σn := (M +m)2 1+η +m n qn := (q 2 + σn )1/2 , Rn := Then there holds
Theorem 5.7 [79].
√
M ε . m(1−qn ) n
Under the above assumptions (i)–(iv) the following hold:
(1) The equation F (u) = b has a unique solution u∗ ∈ H, and the sequence (un ) of the CG iteration converges strongly to u∗ . (2) Let N0 ∈ N be such that RN0 < R and σN0 < 1 − q 2 . Then kun − u∗ k ≤ RN0 · qN0 · qN0 +1 . . . qn−1
(n > N0 ) .
(Note that lim qn = q). (3) Let N0 be as in (2). Then for any m > N0 there exists Nm ∈ N such that εn+m
√ √ !2m M − m ≤ 4 √ + δn εn √ M+ m
(n > Nm )
where lim δn = 0. (Note that εn is equivalent to kF (un ) − bk and kun − u∗ k.)
5.2.2
Newton-like methods
In this subsection we give a brief summary on Newton-like methods for well-posed problems in infinite-dimensional spaces. These results can be formulated in a more general setting than the simple iteration results: no symmetry of the derivatives is involved and the theorems hold in Banach spaces. On the other hand, Lipschitz continuity of the derivatives is required. In the presented theorems our important basic assumption will be the estimate (5.29), which – replacing the uniform ellipticity condition (5.4) in the case of simple iterations – also ensures well-posedness. (We note that this condition is not necessary for the convergence of Newton’s method. In the famous results of Kantorovich who extended Newton’s method to Banach spaces [165], local invertibility conditions are only assumed around the solution, which might enable the method to be applied beyond elliptic problems, see Remark 5.16.) First the local convergence of Newton’s method will be presented. We note that the suitable extensions of the related results in subsection 4.2.2 to infinite-dimensional spaces are also valid (see in Ortega–Rheinboldt [238]). In the sequel we only consider the inexact and damped versions, and the subsection is closed by the damped inexact Newton (DIN) method in Theorem 5.12 and Corollary 5.2, which contain the practically most relevant version and will be used later in the applications. Namely, the DIN method provides global convergence and incorporates the numerical error arising for the auxiliary linear equations.
5.2. ITERATIVE METHODS FOR SMOOTH MAPPINGS
101
For a more detailed discussion of these and related results on Newton’s method in Banach spaces, the reader is referred e.g. to Axelsson [7], Collatz [74], KantorovichAkilov [165], Ortega–Rheinboldt [238]. Further, many of the works cited in subsection 4.2.2 also include extensions to the infinite-dimensional case. In this subsection we consider an operator between two Banach spaces instead of acting in a Hilbert space, since this generality makes no difference in the exposition, moreover, it fits more naturally the setting of these theorems. That is, let X, Y be Banach spaces, F : X → Y and b ∈ Y . We consider the equation F (u) = b.
(5.28)
The following theorem, which follows from the result of Plastock [243], yields wellposedness for (5.28): Theorem 5.8 Let F : X → Y have a derivative satisfying kF ′ (u)hk ≥ λkhk
(u, h ∈ X)
(5.29)
with some λ > 0 independent of u, h. Then for all b ∈ Y , equation (5.28) has a unique solution u∗ ∈ X. Remark 5.11 Condition (5.29) implies the following estimates: (1) kF ′ (u)−1 k ≤
1 λ
(u ∈ X).
(2) kF (u) − F (v)k ≥ λku − vk (u, v ∈ X)). In the theorems to follow, we examine the residual error kF (un ) − bk
(5.30)
for the constructed sequences (un ). Since Remark 5.11 yields kun − u∗ k ≤
1 kF (un ) − bk, λ
(5.31)
the error kun − u∗ k inherits the estimates obtained for (5.30). Let us now consider the Newton iteration in the space X, i.e. the analogue of the sequence (4.12). Under our assumptions, the local convergence of Newton’s method is provided by the following theorem. Theorem 5.9 Let F : X → Y have a derivative satisfying (i) kF ′ (u)hk ≥ λkhk (u, h ∈ X) with some constant λ > 0 independent of u, h; (ii) kF ′ (u) − F ′ (v)k ≤ Lku − vk of u, v.
(u, v ∈ X) with some constant L > 0 independent
102
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Then equation (5.28) has a unique solution u∗ ∈ X, and there exists ε > 0 such that for ku0 − u∗ k < ε, the sequence un+1 = un − F ′ (un )−1 (F (un ) − b)
(n ∈ N)
(5.32)
(n ∈ N),
(5.33)
converges quadratically to u∗ : kF (un+1 ) − bk ≤
L kF (un ) − bk2 2λ2
and hence there holds the weak quadratic estimate n
kun − u∗ k ≤ λ−1 kF (un ) − bk ≤ c q 2 → 0 with c =
2λ L
> 0, q =
L kF (u0 ) 2λ2
(5.34)
− bk < 1.
Remark 5.12 The sequence (5.32) can be rewritten as un+1 = un + pn ,
where
F ′ (u )p = −(F (u ) − b). n n n
(n ∈ N).
(5.35)
Proof of Theorem 5.9. The existence of the unique solution u∗ ∈ X follows from Theorem 5.8. For any n ∈ N there holds F (un+1 ) − b = F (un ) − b + ′
= F (un ) − b + F (un )pn + Hence kF (un+1 ) − bk ≤
Z
1
0
Z
0
1
Z
0
1
F ′ (un + t(un+1 − un ))(un+1 − un )dt
(F ′ (un + t(un+1 − un )) − F ′ (un ))pn dt .
(5.36)
Ltkun+1 − un kkpn kdt ≤ (L/2)kpn k2
= (L/2)kF ′ (un )−1 (F (un ) − b)k2 ≤ (L/2λ2 )kF (un ) − bk2 , i.e. (5.33) holds. Further, by induction we obtain n −1
kF (un ) − bk ≤ (L/2λ2 )2
n
n
kF (u0 ) − bk2 = c q 2
with c = 2λ2 /L and q = (L/2λ2 )kF (u0 ) − bk. If u0 satisfies L kF (u0 ) − bk < 1, 2λ2
(5.37)
then (5.34) follows. Remark 5.13 The initial accuracy ε can be estimated by 2( Lλ )2 . Namely, this implies (5.37) and the proof works thereafter.
5.2. ITERATIVE METHODS FOR SMOOTH MAPPINGS
103
In practice one considers the modification when pn in (5.35) is determined only approximately with some prescribed relative error. That is, the equation F ′ (un )pn + (F (un ) − b) = 0 is replaced by the inequality kF ′ (un )pn + (F (un ) − b)k ≤ δn kF (un ) − bk with some δn > 0. This modified algorithm defines the inexact Newton method. The following theorem gives local convergence for the inexact Newton method. Theorem 5.10 Let F : X → Y satisfy conditions (i)-(ii) of Theorem 5.9, and let u∗ ∈ X be the solution of equation (5.28). Then there exists ε > 0 such that for ku0 − u∗ k < ε, the sequence un+1 = un + pn
(n ∈ N),
where
kF ′ (u )p + (F (u ) − b)k ≤ δ kF (u ) − bk n n n n n
with 0 < δn ≤ δ0 < 1,
(5.38)
converges to u∗ with speed depending on the sequence (δn ) up to quadratic order. Namely, if δn ≡ δ0 < 1, then the convergence of (un ) is linear. Further, if δn ≤ const. · kF (un ) − bkγ with some constant 0 < γ ≤ 1, then the convergence of (un ) is of order 1 + γ: kF (un+1 ) − bk ≤ c1 kF (un ) − bk1+γ
(n ∈ N)
with some constant c1 > 0, yielding also the convergence estimate of weak order 1 + γ kun − u∗ k ≤ λ−1 kF (un ) − bk ≤ d1 q (1+γ)
n
(n ∈ N)
with suitable constants 0 < q < 1, d1 > 0. Proof. We assume without loss of generality that b = 0. Equation (5.36) now implies kF (un+1 )k ≤ δn kF (un )k + (L/2)kpn k2 .
(5.39)
Here (5.38) implies kpn k ≤ kF ′ (un )−1 kkF ′ (un )pn k ≤ kF ′ (un )−1 k(kF ′ (un )pn + F (un )k + kF (un )k) ≤ λ−1 kF (un )k(1 + δn ),
(5.40)
kF (un+1 )k ≤ δn kF (un )k + (L/2λ2 )(1 + δn )2 kF (un )k2 .
(5.41)
hence Let first δn ≡ δ0 < 1. Then by (5.41) kF (un+1 )k ≤ (δ0 + (2L/λ2 )kF (un )k)kF (un )k. If u0 satisfies
̺ := δ0 + (2L/λ2 )kF (u0 )k < 1,
104
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
then kF (u1 )k ≤ ̺kF (u0 )k and by induction kF (un+1 )k ≤ ̺kF (un )k
(n ∈ N+ ),
i.e. linear convergence holds. Now let δn = c kF (un )kγ
(5.42)
with some constants c > 0 and 0 < γ ≤ 1. Then (5.41) and (5.42) imply kF (un+1 )k ≤ ckF (un )k1+γ + (L/2λ2 ) (1 + ckF (un )kγ )2 kF (un )k2
≤ kF (un )k ckF (un )kγ + (L/2λ2 ) (1 + ckF (un )kγ )2 kF (un )k . Let u0 be chosen such that
̺0 := ckF (u0 )kγ + (L/2λ2 ) (1 + ckF (u0 )kγ )2 kF (u0 )k < 1
(5.43)
(5.44)
holds. Then (kF (un )k) is decreasing: namely, by (5.43)–(5.44) kF (u1 )k ≤ ̺0 kF (u0 )k, and, by induction, if for some n we have kF (un )k ≤ ... ≤ kF (u0 )k, then again by (5.43)–(5.44) kF (un+1 )k ≤ ̺0 kF (un )k. (5.45) Let c1 := c + (L/2λ2 )(1 + ckF (u0 )kγ )2 kF (u0 )k1−γ .
(5.46)
The estimates (5.43) and kF (un )k ≤ kF (u0 )k imply
kF (un+1 )k ≤ kF (un )k1+γ c + (L/2λ2 ) (1 + ckF (un )kγ )2 kF (un )k1−γ ≤ c1 kF (un )k1+γ . (5.47) Then by induction (1+γ)n −1 γ
kF (un )k ≤ c1 −1/γ
with d1 = c1
n
kF (u0 )k(1+γ) = d1 q (1+γ)
n
1/γ
and q = c1 kF (u0 )k < 1.
Remark 5.14 An important special case of inexact Newton methods is using an approximation Bn of the derivative F ′ (un ), that is, (5.38) is realized by un+1 = un − Bn−1 (F (un ) − b) kI − F ′ (u )B −1 k ≤ δ n n n
(n ∈ N),
where
with 0 < δn ≤ δ0 < 1
(5.48)
(denoting by I the identity operator in Y ). In fact, then there holds pn = −Bn−1 (F (un )− b), hence kF ′ (un )pn + (F (un ) − b)k = k(−F ′ (un )Bn−1 + I)(F (un ) − b)k ≤ δn kF (un ) − bk.
5.2. ITERATIVE METHODS FOR SMOOTH MAPPINGS
105
The above two theorems yield local convergence only. The widespread way to achieve global convergence is the introduction of damping parameters, i.e. the elements pn are multiplied by suitable constants τn ∈ (0, 1]. First the damped Newton method is established for exact correction terms. Theorem 5.11 Let F : X → Y satisfy conditions (i)-(ii) of Theorem 5.9, and let u∗ ∈ X be the solution of equation (5.28). Let u0 be arbitrary, and let us consider the sequence un+1 = un + τn pn
(n ∈ N),
F ′ (u )p = −(F (un ) − b)
Then
n n n τ = min 1, n
λ2 LkF (un )−bk
where (5.49)
and
o
kun − u∗ k ≤ λ−1 kF (un ) − bk → 0 monotonically
with locally quadratic speed, namely, for some index n0 ∈ N we have kF (un+1 ) − bk ≤ c1 kF (un ) − bk2
(n ≥ n0 )
(5.50)
(with some constant c1 > 0) and the corresponding weak quadratic estimate n
kun − u∗ k ≤ λ−1 kF (un ) − bk ≤ d1 q 2
(n ≥ n0 )
(5.51)
with suitable constants 0 < q < 1, d1 > 0. Proof. We assume without loss of generality that b = 0. Similarly to (5.36) and afterwards, now we have F (un+1 ) = (1−τn )F (un )+τn (F (un )+F ′ (un )pn )+τn
Z
0
1
(F ′ (un +t(un+1 −un ))−F ′ (un ))pn dt (5.52)
and hence kF (un+1 )k ≤ (1 − τn )kF (un )k + τn2 (L/2λ2 )kF (un )k2
= kF (un )k 1 − τn + τn2 (L/2λ2 )kF (un )k .
(5.53)
If (L/2λ2 )kF (u0 )k < 1 then τn ≡ 1 and Theorem 5.9 can be applied. Otherwise we use that for any fixed n the real function ϕ(t) = 1 − t + t2 (L/2λ2 )kF (un )k (t ≥ 0) has its minimum for λ2 t = τn = LkF (un )k with the value λ2 . (5.54) ϕ(τn ) = 1 − 2LkF (un )k Hence by (5.53) and induction, the sequence (kF (un )k) decreases and satisfies the linear estimate !n λ2 kF (u0 )k (n ∈ N). (5.55) kF (un )k ≤ 1 − 2LkF (u0 )k
106
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Consequently, for some n0 ∈ N+ the estimate (L/2λ2 )kF (un0 )k < 1 will hold and then τn ≡ 1 (n ≥ n0 ) and Theorem 5.9 can be applied with initial guess un0 . We note that by modifying the above proof, the norm in the damping parameter can be expressed in terms of pn . Namely, instead of (5.53) we can get the following estimate from (5.52):
kF (un+1 )k ≤ kF (un )k 1 − τn + τn2 (L/2λ)kpn k , in which case the expression in brackets has its minimum for τn =
(5.56)
λ Lkpn k
with the value ϕ(τn ) = 1 −
λ . 2Lkpn k
Here the right side can be estimated above by that of (5.54) in virtue of kpn k ≤ λ−1 kF (un )k, hence the proof of Theorem 5.11 can be continued. That is, there holds Corollary 5.1 Theorem 5.11 remains valid if in (5.49) we redefine τn as (
)
λ τn = min 1, . Lkpn k Remark 5.15 We note that the quadratic convergence factor q in the estimate (5.51) is obtained in the same way as in Theorem 5.9, i.e. we have n
kun − u∗ k1/2 ≤ q =
L kF (un0 ) − bk. 2λ2
(5.57)
The advantages of the inexact and damped Newton methods can be united in order to both achieve global convergence and incorporate the numerical error arising for the auxiliary linear equations. The obtained algorithm yields the practically most relevant version of Newton’s method. The following theorem provides global convergence of the damped inexact Newton method (DIN). Theorem 5.12 Let F : X → Y satisfy conditions (i)-(ii) of Theorem 5.9, and let u∗ ∈ X be the solution of equation (5.28). Let u0 be arbitrary, and let us consider the sequence un+1 = un + τn pn
(n ∈ N),
where
kF ′ (u )p + (F (un ) − b)k ≤ δn kF (un ) − bk with
n n n τ = min 1, n
Then
(1−δn ) λ2 (1+δn )2 LkF (un )−bk
o
0 < δn ≤ δ0 < 1 and
.
kun − u∗ k ≤ λ−1 kF (un ) − bk → 0 monotonically
(5.58)
5.2. ITERATIVE METHODS FOR SMOOTH MAPPINGS
107
with speed depending on the sequence (δn ) up to locally quadratic order. Namely, if δn ≡ δ0 < 1, then the convergence is linear. Further, if δn ≤ const. · kF (un ) − bkγ with some constant 0 < γ ≤ 1, then the convergence is locally of order 1 + γ: kF (un+1 ) − bk ≤ c1 kF (un ) − bk1+γ
(n ≥ n0 )
with some index n0 ∈ N and constant c1 > 0, yielding also the convergence estimate of weak order 1 + γ kun − u∗ k ≤ λ−1 kF (un ) − bk ≤ d1 q (1+γ)
n
(n ∈ N)
with suitable constants 0 < q < 1, d1 > 0. Proof. We proceed similarly as in Theorem 5.11. We assume again without loss of generality that b = 0. Now (5.52) and (5.40) yield kF (un+1 )k ≤ (1 − τn )kF (un )k + τn δn kF (un )k + τn2 (L/2)kpn k2
≤ kF (un )k 1 − τn (1 − δn ) + τn2 (L/2λ2 )(1 + δn )2 kF (un )k . Here the expression in brackets has its minimum for τn = with the value
(5.59)
(1 − δn ) λ2 (1 + δn )2 LkF (un )k
λ2 ϕ(τn ) = 1 − 2LkF (un )k
1 − δn 1 + δn
!2
.
(5.60)
By induction, using δn ≤ δ0 < 1, we have ϕ(τn ) ≤ ϕ(τ0 ) and kF (un )k decreases at least linearly. In particular, for some n0 ∈ N+ (5.44) is satisfied if thereby kF (u0 )k is replaced by kF (un0 )k. Hence Theorem 5.10 can be applied with initial guess un0 . We note that, by modifying the above proof similarly to Corollary 5.1, the norm in the damping parameter can be expressed in terms of pn . Namely, the right-side estimate in (5.59) can be replaced by
kF (un+1 )k ≤ kF (un )k 1 − τn (1 − δn ) + τn2 (L/2λ)(1 + δn )kpn k , in which case the expression in brackets has its minimum for τn =
(5.61)
(1 − δn ) λ (1 + δn ) Lkpn k
with the value ϕ(τn ) = 1 −
λ (1 − δn )2 . 1 + δn 2Lkpn k
Here the right side can be estimated above by that of (5.60) in virtue of kpn k ≤ λ−1 kF (un )k, hence the proof of Theorem 5.12 can be continued. That is, there holds
108
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Corollary 5.2 Theorem 5.12 remains valid if in (5.58) we redefine τn as (
)
(1 − δn ) λ τn = min 1, . (1 + δn ) Lkpn k Remark 5.16 In the famous results of Kantorovich [165], local invertibility conditions are only assumed around the solution instead of the uniform invertibility (5.29), which might enable the method to be applied beyond elliptic problems. The exact formulation of this is the analogue of Theorem 4.4 in the Banach space setting instead of Rs . Remark 5.17 The Lipschitz condition in Theorems 5.9–5.12 can be relaxed in several ways. (a) One can only assume H¨older continuity instead of Lipschitz: kF ′ (u) − F ′ (v)k ≤ Lku − vkα
(u, v ∈ X)
with some constants L > 0, 0 < α < 1 independent of u, v, see e.g. Axelsson [7]. (This restriction is required in certain applications with stronger nonlinearities.) Then the same proofs yield the results of these theorems with quadratic convergence replaced by convergence of order 1 + α: kF (un+1 ) − bk ≤ const. · kF (un ) − bk1+α
(n ∈ N).
(b) It suffices to require Lipschitz continuity of F ′ on a ball with some radius R around u∗ such that λ−1 kF (u0 ) − bk ≤ R.
Namely, in the original theorems the sequence kF (un ) − bk is decreasing, hence the iterative sequence satisfies kun − u∗ k ≤ λ−1 kF (un ) − bk ≤ λ−1 kF (u0 ) − bk ≤ R
(5.62)
and the conditions are only used on the mentioned ball. (c) The theorems also remain true if local Lipschitz continuity of F ′ is required, i.e. ˜ : R+ → R+ such that there exists an increasing function L ˜ kF ′ (u) − F ′ (v)k ≤ L(r)ku − vk
(u, v ∈ X, kuk, kvk ≤ r).
(5.63)
This follows from the previous paragraph (b). Namely, let R0 := 2λ−1 kF (u0 ) − bk + ku0 k. Then F ′ has the Lipschitz constant ˜ 0) L = L(R on the ball B(0, R0 ), which contains the ball with radius λ−1 kF (u0 ) − bk around u∗ since the elements of the latter satisfy kuk ≤ ku − u∗ k + ku∗ − u0 k + ku0 k ≤ 2λ−1 kF (u0 ) − bk + ku0 k.
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
5.3
109
Preconditioning by linear operators in Hilbert space
The conditioning properties of F play a crucial role in the theorems of the preceding section 5.2. This observation is similar to the case of algebraic systems in Chapter 4. Namely, if the condition number of F is large, then so is the condition number of the Jacobians F ′ (u) (see Proposition 5.2 below), hence the simple (gradient type) iterations have a poor convergence quotient, and also one has to solve ill-conditioned equations in course of the Newton-like iterations. Hence, as a remedy, suitable preconditioning is required. The concept of preconditioning is able to provide a general framework to discuss simple iterations and Newton-like methods. This setting is similar to the one used in subsections 4.3.1–4.3.2. Let us sketch how this way of discussion proceeds. First, simple iterations can be accelerated by using a fixed linear preconditioning operator B based on spectral equivalence: mhBh, hi ≤ hF ′ (u)h, hi ≤ M hBh, hi
(u, h ∈ H)
(5.64)
(with some constants M ≥ m > 0), which means that
cond B −1 F ′ (u) ≤
M m
(u ∈ H).
This yields that the preconditioned sequence un+1 = un −
2 B −1 (F (un ) − b) M +m
(n ∈ N)
(5.65)
converges with ratio
M −m . M +m −m If the spectral bounds M and m available in (5.64) are too wide (i.e. M is too M +m large), then one can use variable preconditioners Bn based on spectral equivalence in the distinct steps: q=
mn hBn h, hi ≤ hF ′ (un )h, hi ≤ Mn hBn h, hi
(n ∈ N, h ∈ H)
(5.66)
(with some constants Mn ≥ mn > 0 satisfying 0 < m ≤ mn ≤ Mn ≤ M ), which means that Mn (n ∈ N). cond Bn−1 F ′ (un ) ≤ mn In this case the preconditioned sequence un+1 = un −
2 B −1 (F (un ) − b) Mn + m n n
converges with ratio estimated by q = lim sup
Mn − m n . Mn + m n
(n ∈ N)
(5.67)
110
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
(If the latter is 0 then superlinear convergence is achieved.) On the one hand this variable preconditioning generalizes the preconditioned simple iteration (5.65). On the other, condition (5.66) means that the sequence (5.67) defines an inexact or quasiNewton method, in which Bn is an approximate derivative operator. The price of faster convergence is similar as for Newton’s method, i.e. (5.67) converges only locally unless suitable damping is used. Finally, if one chooses Bn = F ′ (un ) in (5.66) with the corresponding spectral bounds mn = Mn = 1, then one arrives at Newton’s method as an extreme case of (5.67). This shows that (5.67) can be regarded as a general form of one-step iterations. We note that (in contrary to the previous cases) the operators Bn = F ′ (un ) are not really considered as preconditioners since they cannot be chosen to yield suitably simple auxiliary linear equations, as was expected of B and Bn in (5.65) and (5.67). Hence one generally uses some inner iteration for the auxiliary linear problems and applies preconditioning therein. This section follows the above setting in the development of Hilbert space preconditioning. Accordingly, after some preliminaries on the condition number, we first apply preconditioning with fixed linear operators. This is formulated using an analogue of (5.64) that does not assume differentiability of the operator. The discussion thus allows the case when cond(T ) = ∞ and hence the preconditioning operator is chosen unbounded. This scope is motivated by the related conditioning properties of differential operators and is connected to the strong form of the problems. (For notational distinction we use T for operators when non-differentiability or an infinite condition number is allowed, and also in particular for the strong form of elliptic differential operators.) In the next subsection variable preconditioning is discussed using stepwise redefined preconditioning operators. As detailed above, this adaptive approach is able to improve the often limited scope of fixed preconditioning. The setting here relies on the differentiability of the operator F and is hence analogous to the weak formulation of elliptic problems. We note that exact Newton iterations (i.e. the third stage in the above framework) are not detailed here since the corresponding results have been already summarized in subsection 5.2.2. (Inner iterations for the auxiliary linearized equations can be defined via the CGM or other gradient type methods, whose Hilbert space versions have been mentioned in subsection 3.1.2.)
5.3.1
Definition and properties of the condition number
The condition number of nonlinear operators is a direct extension of the finite-dimensional case defined in section 4.1. Definition 5.5 Let H be a real Hilbert space and let F be a nonlinear operator in H (i.e. defined on a subspace of H) which is strictly monotone. Then the condition number of F is defined as Λ(F ) cond(F ) = , λ(F )
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
111
where Λ(F ) =
hF (v) − F (u), v − ui , kv − uk2 u6=v∈D(F ) sup
λ(F ) =
hF (v) − F (u), v − ui . u6=v∈D(F ) kv − uk2 inf
The numbers Λ(F ) and λ(F ) are called the spectral bounds of F. Similarly to the linear case, there holds 0 < Λ(F ) ≤ ∞, 0 ≤ λ(F ) < ∞. We underline the possibility that thus the condition number may be infinite, moreover, this is always the case for differential operators in strong form as well as for any unbounded nonlinear operator. This is illustrated by the following example. Example. The following nonlinear differential operator T satisfies cond(T ) = ∞.
(5.68)
Let Ω ⊂ RN be a bounded domain and let f ∈ C 1 (Ω × RN , RN ) have symmetric (x,η) whose eigenvalues λ satisfy Jacobians ∂f∂η 0 < µ1 ≤ λ ≤ µ 2
(5.69)
with constants µ1 and µ2 independent of (x, η). We define T (u) ≡ −div f (x, ∇u)
(5.70)
with domain D(T ) = H 2 (Ω) ∩ H01 (Ω)
in the real Hilbert space L2 (Ω). Then there holds
hT (v) − T (u), v − ui = Z
Ω
Z
Ω
(f (x, ∇v) − f (x, ∇u)) · (∇v − ∇u) =
Z ∂f (x, ∇u + θ∇(v − u)) (∇v − ∇u) · (∇v − ∇u) ≥ µ1 |∇(v − u)|2 . ∂η Ω
Hence Λ(T ) ≥ µ1
sup u6=v∈D(T )
R
Ω
|∇(v − u)|2 =∞ 2 Ω |v − u|
R
using (3.19), that is, cond(T ) = ∞. We note that this T is the strong form of the weak or generalized differential operator defined in (5.6). From the point of view of conditioning the relevant form is the strong one for the following reason. The infinite condition number of T is the property that underlies the problem of large condition numbers of discretized elliptic problems, quoted in subsection 4.3.3. Namely, if the functions Th arise from some discretizations of T , then cond(Th ) → ∞ as h → 0
since they approach the infinite condition number of T as the discretization is refined. For differentiable operators it is usually more convenient to express the spectral bounds with the derivative operators.
112
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Proposition 5.2 Let F be a Gateaux differentiable nonlinear operator in H, and let M ≥ m > 0 be constants. Then the following statements are equivalent: (i)
mkv − uk2 ≤ hF (v) − F (u), v − ui ≤ M kv − uk2
(ii)
mkhk2 ≤ hF ′ (u)h, hi ≤ M khk2
(u, v ∈ D(F ));
(u, h ∈ D(F )).
Proof. It follows directly from the definition of Gateaux derivative and the mean value theorem. Note that if (i) or (ii) holds above then cond(F ) ≤
M . m
Further, m and M are simultaneously sharp or not in (i) or (ii), owing to the obtained equivalence. If they are sharp then they coincide with the spectral bounds of F . The main property of the condition number for simple iterations is that its estimate M −m above by M implies the same convergence quotient M as in the linear case under m +m suitable technical assumptions. Namely, in virtue of Theorem 5.4 there holds Corollary 5.3 Let F : H → H have a bihemicontinuous symmetric Gateaux derivative. If the spectral bounds of F are between m > 0 and M < ∞, then for any b ∈ H and u0 ∈ H, the sequence un+1 = un −
2 (F (un ) − b) M +m
(n ∈ N)
converges to the unique solution u∗ ∈ H of the equation F (u) = b according to the linear estimate kun − u∗ k ≤
5.3.2
1 M −m kF (u0 ) − bk m M +m
n
(n ∈ N) .
Fixed preconditioning operators
The main theorem of this subsection provides preconditioning of a nonlinear operator T by the linear operator S such that cond(S −1 T ) ≤
M . m
(5.71)
(More precisely, this will hold literally under the extra assumption R(S) ⊃ R(T ) for the operator S −1 T to make sense on D(T ). This can be avoided by defining (5.71) in a generalized sense, which will be discussed in Remark 5.18.) The given theorem uses Gateaux differentiability of the preconditioned operator, which helps to obtain an optimal convergence quotient (cf. Remark 5.9). (Related results use weaker conditions but accordingly obtain larger convergence factors: summaries on this are found in the papers of Petryshyn, e.g. [241, 242]. For other results concerning spectral equivalence we refer to D’yakonov [97].)
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
113
The result includes (and in fact is motivated by) the case cond(T ) = ∞, in which case clearly S must be unbounded to produce (5.71). As the example (5.70) shows, the condition number is infinite for differential operators in strong form since they are unbounded. This explains the phenomenon (quoted in subsection 4.3.3) that cond(Th ) is unbounded as h → 0 if Th arises from some discretization of T . Namely, cond(Th ) approaches the infinite condition number of T as the discretization is refined. The above property shows the importance of finding preconditioners for operators with an infinite condition number. We recall that the energy space HS of a strictly positive symmetric linear operator S : D → H is defined as the completion of D w.r. to the inner product hu, viS ≡ hSu, vi (see subsection 3.1.1). Theorem 5.13 Let H be a real Hilbert space, D ⊂ H a dense subspace, T : D → H a nonlinear operator. Assume that S : D → H is a symmetric linear operator with lower bound p > 0, such that there exist constants M ≥ m > 0 satisfying mhS(v − u), v − ui ≤ hT (v) − T (u), v − ui ≤ M hS(v − u), v − ui
(u, v ∈ D). (5.72)
Then the identity hF (u), viS = hT (u), vi
(u, v ∈ D)
(5.73)
defines an operator F : D → HS . If F can be extended to HS such that it has a bihemicontinuous symmetric Gateaux derivative, then (1) for any g ∈ H the equation T (u) = g has a unique weak solution u∗ ∈ HS , i.e. hF (u∗ ), viS = hg, vi
(v ∈ HS ).
(5.74)
(If g ∈ R(T ) then T (u∗ ) = g.) (2) For any u0 ∈ HS the sequence un+1 = un −
2 z M +m n
,
where hzn , viS = hF (un ), viS − hg, vi (v ∈ HS ),
(5.75)
converges linearly to u∗ , namely, M −m 1 kun − u kS ≤ kF (u0 ) − bkS m M +m ∗
n
(n ∈ N) ,
(5.76)
where hb, viS = hg, vi (v ∈ HS ). (3) Assume that there holds the additional condition R(S) ⊃ R(T ).
(5.77)
If g ∈ R(S) and u0 ∈ D, then for any n ∈ N the element zn in (5.75) can be expressed as zn = S −1 (T (un ) − g),
114
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE that is, the sequence (5.75) can be written as un+1 = un −
2 z M +m n
,
(5.78)
where Szn = T (un ) − g, further, the estimate (5.76) takes the form M −m 1 kT (u0 ) − gk kun − u kS ≤ 1/2 mp M +m
∗
n
(n ∈ N) ,
where p > 0 is the lower bound of S. Proof. Let u ∈ D be fixed. Then the inequality kvk ≤ p−1/2 kvkS
(5.79)
for the energy norm implies |hT (u), vi| ≤ p−1/2 kT (u)kkvkS
(v ∈ D),
hence v 7→ hT (u), vi is a bounded linear functional on D ⊂ HS . It has a unique bounded linear extension Φu : HS → R, hence the Riesz theorem defines a unique vector F (u) ∈ HS that satisfies hF (u), viS = Φu v
(v ∈ HS ).
The latter gives (5.73) for v ∈ D, i.e. F is the required operator. Now it is easy to verify assertions (1)–(3) using Theorem 5.4 in the space HS . Let F be extended to HS such that it has a bihemicontinuous symmetric Gateaux derivative. This extension can be denoted also by F without confusion. Then (5.72) implies mkv − uk2S ≤ hF (v) − F (u), v − uiS ≤ M kv − uk2S
(u, v ∈ HS ),
(5.80)
i.e. the spectral bounds of F are between 0 and M , and thus Corollary 5.3 holds for F in the space HS . Let b ∈ HS be defined by hb, viS = hg, vi
(v ∈ HS ).
(5.81)
The existence of b is provided by the estimate |hg, vi| ≤ p−1/2 kgkkvkS
(v ∈ D)
and the Riesz theorem as above. (1) The equation F (u) = b has a unique solution u∗ ∈ HS , and the equality F (u∗ ) = b coincides with (5.74). If g ∈ R(T ), then (5.74) means hT (u∗ ), vi = hg, vi hence T (u∗ ) = g.
(v ∈ HS ),
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
115
(2) By (5.81) the sequence (5.75) coincides with that in Corollary 5.3, whence the estimate (5.76) follows. (3) In virtue of (5.77) we have hT (u), vi = hS −1 T (u), viS
(u, v ∈ D),
hence (5.73) implies F|D = S −1 T.
(5.82)
Therefore, if un ∈ D, then the auxiliary equation in (5.75) takes the form hzn , viS = hT (un ) − g, vi (v ∈ HS ), and is solved by zn = S −1 (T (un ) − g) ∈ D.
Hence u0 ∈ D implies by induction the sequence (un ) remains in D and zn is as above. Further, g ∈ R(S) and (5.81) yield hb, viS = hg, vi = hS(S −1 g), viS = hS −1 g, viS
(v ∈ HS ),
hence b = S −1 g. Here the energy norm satisfies kwkS ≤ p−1/2 kSwk
(w ∈ D).
(5.83)
Setting w = F (u0 ) − b = S −1 (T (u0 ) − g), (5.76) yields the required estimate. Remark 5.18 The convergence result of Theorem 5.13 is due to the achieved condition number M cond(F ) ≤ (5.84) m in HS . Here F plays the role of the preconditioned operator, but this can only be written in the classical form M cond(S −1 T ) ≤ (5.85) m under the extra assumption R(S) ⊃ R(T ), in which case (5.82) holds and the iteration can be kept in D. If the assumption R(S) ⊃ R(T ) does not hold, then S −1 T is only defined on a subset of D and thus cond(S −1 T ) may be irrelevant. (This difficulty of preconditioning does not arise in the finite dimensional case H = Rk , where the positivity of S implies that R(S) = H.) Hence it is useful to redefine (5.85) in a generalized sense (essentially by (5.84)), such that this covers the setting of Theorem 5.13 without the assumption R(S) ⊃ R(T ). This is achieved by the following definition. Definition 5.6 Assume that H is a real Hilbert space, D(T ) ⊂ H and D(S) ⊂ H are dense, T : D(T ) → H is a nonlinear operator and S : D(S) → H is a strictly positive symmetric linear operator satisfying the following conditions: (1) there exists an operator F : HS → HS such that hF (u), viS = hT (u), vi
(u, v ∈ D(T ));
116
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
(2) cond(F ) = κ with some κ ≥ 1. Then we say that cond(S −1 T ) = κ (in generalized sense) in HS . Remark 5.19 As mentioned earlier, in order to produce the finite condition number (5.71) in the case cond(T ) = ∞, clearly there must hold cond(S) = ∞ as well, i.e. S must be unbounded. Nevertheless, S is expected to yield easily solvable auxiliary equations in (5.78).
If the original operator is Gateaux differentiable, then Theorem 5.13 can be formulated in a simpler way. For distinction we use different notations. Namely, consider a differentiable operator, say, F : H → H, satisfying the spectral equivalence inequality of the type (5.72) w.r. to some linear operator B: mhB(v − u), v − ui ≤ hF (v) − F (u), v − ui ≤ M hB(v − u), v − ui
(u, v ∈ H).
Then, using Proposition 5.2 in the k.kB -norm, this can be rewritten with F ′ , and we obtain equivalence in the form (5.64) which yields
cond B −1 F ′ (u) ≤
M m
(u ∈ H).
In this way we obtain Corollary 5.4 Let H be a real Hilbert space. Let the nonlinear operator F : H → H have a bihemicontinuous symmetric Gateaux derivative. Let b ∈ H and denote by u∗ the unique solution of equation F (u) = b. Assume that B is a bounded self-adjoint linear operator with lower bound p > 0 satisfying mhBh, hi ≤ hF ′ (u)h, hi ≤ M hBh, hi
(u, h ∈ H)
(5.86)
with some constants M ≥ m > 0. Then for any u0 ∈ H, the sequence un+1 = un −
2 B −1 (F (un ) − b) M +m
(n ∈ N)
converges linearly to u∗ according to the estimate 1 M −m kun − u kB ≤ kF (u0 ) − bk 1/2 mp M +m ∗
n
(n ∈ N).
(5.87)
Now we turn to some generalizations of Theorem 5.13 that allow a weaker equivalence condition than (5.72).
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
117
Theorem 5.14 (i) Let D(S) be a proper subset of D = D(T ). If assumption (5.72) is replaced by mkv − uk2S ≤ hT (v) − T (u), v − ui ≤ M kv − uk2S
(u, v ∈ D),
(5.88)
and the other hypotheses of Theorem 5.13 hold, then the assertions (1)-(2) of the theorem remain valid. (ii) If assumption (5.72) is replaced by m hS(v − u), v − ui ≤ hT (v) − T (u), v − ui ≤ M (r) hS(v − u), v − ui (u, v ∈ D, kukS , kvkS ≤ r)
(5.89)
with some increasing function M : R+ → R+ , then the following modification of Theorem 5.13 holds: the constant M in assertions (2)-(3) is replaced by the one depending on u0 1 M0 := M ku0 k + kF (u0 ) − bk . m (iii) Let T be non-injective such that it is translation invariant with respect to some H0 ⊂ H and assumption (5.72) holds if u − v ∈ H0⊥ . Assume that T (0) = 0. If the other hypotheses of Theorem 5.13 hold with H and D replaced by H0⊥ and D ∩ H0⊥ , respectively, then the theorem holds in H0⊥ instead of H. Proof. It goes in the same way as Theorem 5.13. For parts (ii) and (iii) we use Theorems 5.5 and 5.6 instead of Theorem 5.4. Remark 5.20 The analogue of Remark 5.10 is as follows for preconditioned operators. (i) If assumption (5.89) is replaced by m(r) hS(v − u), v − ui ≤ hT (v) − T (u), v − ui ≤ M (r) hS(v − u), v − ui (u, v ∈ D, kukS , kvkS ≤ r)
where m : R+ → R+ is a decreasing function with limr→∞ r m(r) = ∞, then, in virtue of Remark 5.10, assertion (ii) of Theorem 5.14 remains valid if m is also replaced by some m0 depending on u0 . (See (5.24).) (ii) Let us consider a preconditioned differentiable operator F as in Corollary 5.4. If F satisfies m(ku − u0 kS )khk2S ≤ hF ′ (u)h, hi ≤ M (ku − u0 kS )khk2S , i.e. the bound functions are given depending on u0 , then Theorem 5.14 (ii) remains valid if m is also replaced by some m0 depending on u0 , namely, M0 = M (2ku0 − u∗ kS ) ,
m0 = m (2ku0 − u∗ kS ) .
The bounds M0 and m0 can be estimated avoiding u∗ as follows, provided that the functions ˜ (t) = tM (t) m(t) ˜ = tm(t), M are strictly increasing. Using (5.26)–(5.27) in HS and (5.79), we obtain
M0 ≤ M 2m ˜ −1 (p−1/2 kF (u0 ) − bk) ,
˜ −1 (p−1/2 kF (u0 ) − bk) . m0 ≥ m 2M
118
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Remark 5.21 For a given symmetric strictly positive linear operator S : D → H and the corresponding F defined via (5.73), the assumptions in parts (i)-(ii) of Theorem 5.14 can be formulated with the derivative of F . Namely: (i) assumption (5.88) can be written as mkv − uk2S ≤ hF (v) − F (u), v − uiS ≤ M kv − uk2S which, by Proposition 5.2, is equivalent to mkhk2S ≤ hF ′ (u)h, hiS ≤ M khk2S
(u, h ∈ HS );
(5.90)
(ii) assumption (5.89) can be written as mkv − uk2S ≤ hF (v) − F (u), v − uiS ≤ M (r) kv − uk2S
(u, v ∈ HS , kukS , kvkS ≤ r)
which, using Proposition 5.2, is equivalent to mkhk2S ≤ hF ′ (u)h, hiS ≤ M (kukS )khk2S
5.3.3
(u, h ∈ HS ).
(5.91)
Variable preconditioning operators
In this subsection we consider preconditioning using stepwise variable linear operators. An extra assumption for this investigation is the differentiability of the nonlinear operator. Accordingly, the methods to be developed are suitable generalizations of Corollary 5.4 on fixed preconditioners. In order to motivate the necessity of variable preconditioning, let us just consider the preconditioned simple iteration in Corollary 5.4. Such iterations are globally preconditioned in the sense that the preconditioners are the same in each step and rely on the global behaviour of F ′ (u). Whereas this preconditioning may be convenient when the condition number is small, it is often insufficient since the available global convergence quotient may be very poor (e.g. it is almost 1 for the magnetic potential equations, see section 10.2.) This insufficiency demands the stepwise improvement of contractivity, i.e. varying B during the iteration to produce better spectral bounds. This necessarily involves the local properties of the Jacobians during the iteration. The stepwise comparison to F ′ (un ) leads to the framework of Newton-like or inexact Newton methods. Consequently, the variable preconditioners will be constructed as approximate derivatives. More precisely, they will be chosen to be stepwise spectrally equivalent to F ′ (un ). A technical implication of the Newton framework is that the Lipschitz continuity of F ′ is assumed. The first theorem provides local linear convergence using fixed spectral bounds for preconditioning. We note that it illustrates the preconditioning role of inexact Newton methods, since the result is locally an exact analogue of Corollary 5.4. The second theorem will generalize Theorem 5.15: using damped iteration and variable spectral bound preconditioning, it will provide global convergence up to second order. For more details on variable preconditioning see Kar´atson–Farag´o [174].
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
119
We note for notational intelligibility that the original spectral bounds of (5.2) will now be replaced by λ and Λ, since the notations m and M (or, in the variable preconditioning case mn and Mn ) are retained for the bounds in the spectral equivalence inequality for preconditioning. (For instance, the latter is found in (5.92) for Theorem 5.15, and this is the generalization of (5.86).) Theorem 5.15 Let H be a real Hilbert space. Assume that the nonlinear operator F : H → H has a symmetric Gateaux derivative satisfying the following properties: (i) (Ellipticity.) There exist constants Λ ≥ λ > 0 satisfying λkhk2 ≤ hF ′ (u)h, hi ≤ Λkhk2
(u, h ∈ H).
(ii) (Lipschitz continuity.) There exists L > 0 such that kF ′ (u) − F ′ (v)k ≤ Lku − vk
(u, v ∈ H).
Let b ∈ H and denote by u∗ the unique solution of equation F (u) = b. We fix constants M ≥ m > 0. Then there exists a neighbourhood V of u∗ such that for any u0 ∈ V, the sequence un+1 = un −
2 Bn−1 (F (un ) − b) M +m
(n ∈ N),
with properly chosen self-adjoint linear operators Bn satisfying mhBn h, hi ≤ hF ′ (un )h, hi ≤ M hBn h, hi
(n ∈ N, h ∈ H),
(5.92)
converges linearly to u∗ . Namely, M −m kun − u k ≤ C · M +m ∗
n
(n ∈ N)
(5.93)
with some constant C > 0. The proof of Theorem 5.15 is preceded by some required properties. Lemma 5.1 Let the conditions (i)-(ii) of Theorem 5.15 hold. Then for any u, v, h ∈ H, hF ′ (u)h, hi ≤ hF ′ (v)h, hi 1 + Lλ−2 kF (u) − F (v)k . Proof. Assumption (i) implies kF (u) − F (v)k ≥ λku − vk. Hence hF ′ (u)h, hi ≤ hF ′ (v)h, hi+Lku−vkkhk2 ≤ hF ′ (v)h, hi+Lλ−2 kF (u)−F (v)khF ′ (v)h, hi. Applying Lemma 5.1 to u and u∗ , we obtain
120
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Corollary 5.5 If F (u∗ ) = b, then for any fixed u ∈ H there holds 1 hF ′ (u∗ )h, hi ≤ ≤ 1 + µ(u) 1 + µ(u) hF ′ (u)h, hi
(h ∈ H),
where µ(u) = Lλ−2 kF (u) − bk. We introduce the norms khku = hF ′ (u)−1 h, hi1/2
(u, h ∈ H).
(5.94)
Then Corollaries 3.1 and 5.5 imply directly Corollary 5.6 If F (u∗ ) = b, then for any fixed u ∈ H there holds 1 khk2u∗ ≤ 1 + µ(u) ≤ 1 + µ(u) khk2u
(h ∈ H),
where µ(u) is from Corollary 5.5. Proof of Theorem 5.15. We assume without loss of generality that b = 0, i.e. we study the equation F (u) = 0. Assumption (i) and Lemma 3.1 imply that Λ−1 khk2 ≤ hF ′ (u)−1 h, hi ≤ λ−1 khk2 for any u, h ∈ H. Hence the norms (5.94) satisfy λ1/2 khku ≤ khk ≤ Λ1/2 khku
(u, h ∈ H),
(5.95)
and there also holds kF ′ (u)−1/2 k ≤ λ−1/2
(u ∈ H).
(5.96)
Since the assumptions imply that λM −1 khk2 ≤ hBn h, hi for any h ∈ H, we obtain similarly to (5.96) that kBn−1/2 k ≤ λ−1/2 M 1/2 . (5.97) The following norms (special cases of (5.94)) will be used throughout the proof: k . kn = k . kun
(n ∈ N),
k . k∗ = k . ku∗
(5.98)
The Lipschitz continuity of F ′ implies that F (un+1 ) = F (un ) + F ′ (un )(un+1 − un ) + R(un ), where kR(un )k ≤
L kun+1 − un k2 . 2
Here F (un ) + F ′ (un )(un+1 − un ) = F (un ) −
2 F ′ (un )Bn−1 F (un ), M +m
(5.99)
(5.100)
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
121
hence (5.92) and Lemma 3.2 imply that
kF (un )+F (un )(un+1 −un )kn ≤
I − ′
Further, (5.95) and (5.100) yield
kR(un )kn ≤
2 M −m F ′ (un )Bn−1
kF (un )kn ≤ kF (un )kn . M +m M +m n (5.101)
2L kBn−1 F (un )k2 . 1/2 2 λ (M + m)
Here, using (5.97), (5.92) and Lemma 3.1, we have kBn−1 F (un )k2 ≤ kBn−1/2 k2 kBn−1/2 F (un )k2 ≤ M λ−1 hBn−1 F (un ), F (un )i ≤ M 2 λ−1 hF ′ (un )−1 F (un ), F (un )i = M 2 λ−1 kF (un )k2n . Hence kR(un )kn ≤
2LM 2 kF (un )k2n . λ3/2 (M + m)2
(5.102)
Altogether, (5.99), (5.101) and (5.102) yield 2LM 2 M −m + 3/2 kF (un )kn M + m λ (M + m)2
kF (un+1 )kn ≤
!
kF (un )kn .
Finally, using Corollary 5.6 and (5.98), we obtain M −m 2LM 2 kF (un+1 )k∗ ≤ (1+µ(un )) + 3/2 (1 + µ(un ))1/2 kF (un )k∗ 2 M + m λ (M + m)
!
kF (un )k∗ ,
where µ(un ) = LΛ1/2 λ−2 kF (un )k∗ using (5.95). That is, kF (un+1 )k∗ ≤ ϕ(kF (un )k∗ ) kF (un )k∗ ,
(5.103)
where ϕ(t) = (1 + βΛ
1/2
2
−2 1/2
t) Q + M βα λ
1/2
(t/2) 1 + βΛ
1/2
t
(5.104)
and the notations
L M −m M +m , β = 2, Q = 2 λ M +m + + are used. Then ϕ : R → R is a strictly increasing continuous function and ϕ(0) = Q. Estimate (5.103) puts us in the position to prove the required convergence estimate (5.93), provided that the assumption α=
r := ϕ(kF (u0 )k∗ ) < 1
(5.105)
is satisfied for the initial guess. First, we obtain by induction that kF (un+1 )k∗ ≤ rkF (un )k∗
(n ∈ N).
(5.106)
122
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Namely, kF (u1 )k∗ = rkF (u0 )k∗ . Further, the assumption kF (uk+1 )k∗ ≤ rkF (uk )k∗ (k = 0, ..., n − 1) yields kF (un )k∗ < kF (u0 )k∗ , hence kF (un+1 )k∗ ≤ ϕ(kF (un )k∗ ) kF (un )k∗ ≤ ϕ(kF (u0 )k∗ ) kF (un )k∗ = rkF (un )k∗ . Inequality (5.106) implies kF (un )k∗ ≤ rn kF (u0 )k∗ → 0, ϕ(kF (un )k∗ ) → Q and hence kF (un+1 )k∗ lim sup ≤ lim ϕ(kF (un )k∗ ) = Q. kF (un )k∗ From now on we use the notation en = kF (un )k∗ . Then (5.103) implies n−1 Y
n−1 Y
!
!
ϕ(ek ) ϕ(ek ) e0 = en ≤ Qn e0 Q k=0 k=0
(n ∈ N).
(5.107)
Using (5.104) and the notations c = βΛ1/2 , d = (M 2 βα−2 λ1/2 )/2, we have
ϕ(t) = (1 + ct) Q + dt (1 + ct)1/2 . Here d ϕ(ek ) = (1 + cek ) 1 + ek (1 + cek )1/2 Q Q d c ≤ (1 + cek ) 1 + ek 1 + ek Q 2
!
!
d cd c2 d 3 =1+ c+ ek + e2k + e Q Q 2Q k !
d c2 d 3 3k cd ≤1+ c+ er . e0 rk + e20 r2k + Q Q 2Q 0 !
Since for any sequence (ak ) ⊂ R+ there holds P exp( ∞ k=0 ak ), hence we obtain n−1 Y
ϕ(ek ) ≤ exp Q k=0
(
d c+ Q
!
Qn−1
k=0 (1
+ ak ) ≤
e0 cd e20 c2 d e30 + + 1−r Q 1 − r2 2Q 1 − r3
)
Qn−1 k=0
exp(ak ) ≤
=: E .
Therefore (5.107) yields en ≤ e0 E · Qn
(n ∈ N).
Finally, using condition (ii) and (5.95), this implies kun − u∗ k ≤ λ−1 kF (un )k ≤ λ−1 Λ1/2 e0 E · Qn
(n ∈ N),
which coincides with the required convergence estimate with C = λ−1 Λ1/2 e0 E.
(5.108)
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
123
Remark 5.22 The convergence has been proved under the sufficient condition ϕ(kF (u0 ) − bk∗ ) < 1
(5.109)
for the initial guess, with ϕ defined in (5.104). In connection with this we note the following: (a) The condition (5.109) is satisfied if K
L kF (u0 ) − bk∗ < 1 , λ2
n
o
where K = Λ1/2 (M/m) max 1, 2(M − m)−1 (λ/Λ)1/2 (see [174]). Relating this to the well-known sufficient condition 2λL2 kF (u0 ) − bk < 1 of the exact Newton iteration (cf. (5.37)), we observe that the order is similar (although K is obviously somewhat larger than 1/2). (b) The sufficient condition of convergence can be given using the original norm as follows. Since the theoretical norm k . k∗ satisfies kF (u0 )−bk∗ ≤ λ−1/2 kF (u0 )−bk by (5.95), and ϕ increases, therefore we obtain the condition ϕ(λ−1/2 kF (u0 ) − bk) < 1 to be checked for u0 .
Now we turn to the more general version of Theorem 5.15. We recall the following definitions of norms (see (5.98)), where (un ) is an iterative sequence and u∗ is the solution of F (u) = b: khkn = hF ′ (un )−1 h, hi1/2
(n ∈ N),
khk∗ = hF ′ (u∗ )−1 h, hi1/2 .
(5.110)
The following theorem generalizes Theorem 5.15. Using damped iteration and variable spectral bound preconditioning, it provides global convergence up to second order. Theorem 5.16 Let H be a real Hilbert space. Let the operator F : H → H have a symmetric Gateaux derivative satisfying the properties (i)-(ii) of Theorem 5.15. Denote by u∗ the unique solution of equation F (u) = b. For arbitrary u0 ∈ H let (un ) be the sequence defined by un+1 = un −
2τn B −1 (F (un ) − b) Mn + m n n
where the following conditions hold:
(n ∈ N),
(5.111)
124
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
(iii) Mn ≥ mn > 0 and the properly chosen self-adjoint linear operators Bn satisfy mn hBn h, hi ≤ hF ′ (un )h, hi ≤ Mn hBn h, hi
(n ∈ N, h ∈ H),
further, using notation ω(un ) = Lλ−2 kF (un ) − bk, there exist constants K > 1 and ε > 0 such that Mn /mn ≤ 1 + 2/(ε + Kω(un )); (iv) we define τn = min{1,
1 − Qn }, 2ρn
(5.112)
n −mn where Qn = M (1 + ω(un )), ρn = 2LMn2 λ−3/2 (Mn + mn )−2 kF (un ) − bkn (1 + Mn +mn ω(un ))1/2 , ω(un ) is as in condition (iii) and k . kn is defined in (5.110). (This value of τn ensures optimal contractivity in the n-th step in the k . k∗ -norm.)
Then there holds kun − u∗ k ≤ λ−1 kF (un ) − bk → 0, namely, lim sup
Mn − m n kF (un+1 ) − bk∗ ≤ lim sup < 1. kF (un ) − bk∗ Mn + m n
(5.113)
Moreover, if in addition we assume Mn /mn ≤ 1 + c1 kF (un ) − bkγ (n ∈ N) with some constants c1 > 0 and 0 < γ ≤ 1, then kF (un+1 ) − bk∗ ≤ d1 kF (un ) − bk1+γ ∗
(n ∈ N)
(5.114)
with some constant d1 > 0. Owing to the equivalence of the norms k . k and k . k∗ , the orders of convergence corresponding to the estimate (5.114) can be formulated with the original norm: Corollary 5.7 (Rate of convergence in the original norm.) Let Mn /mn ≤ 1 + c1 kF (un − b)kγ with some constants c1 > 0, 0 < γ ≤ 1. Then there holds kF (un+1 ) − bk ≤ d1 kF (un ) − bk1+γ
(n ∈ N),
and consequently kun − u∗ k ≤ λ−1 kF (un ) − bk ≤ const. · ρ(1+γ)
n
with some constant 0 < ρ < 1. Proof of Theorem 5.16. We assume without loss of generality (similarly to Theorem 5.15) that b = 0, i.e. we study the equation F (u) = 0. Using (5.99) and (5.111), we obtain
F (un+1 ) = (1 − τn )F (un ) + τn F (un ) −
2 F ′ (un )Bn−1 F (un ) + R(un ). Mn + m n
5.3. PRECONDITIONING BY LINEAR OPERATORS IN HILBERT SPACE
125
Hence
kF (un+1 )k∗ ≤ (1 − τn )kF (un )k∗ + τn
I −
Here, using Corollary 5.6 and Lemma 3.2,
I−
2 F ′ (un )Bn−1 F (un )
+ kR(un )k∗ . Mn + m n ∗
Mn − m n 2 F ′ (un )Bn−1 F (un )
≤ (1 + µ(un ))1/2 kF (un )kn Mn + m n Mn + m n ∗
Mn − m n kF (un )k∗ , Mn + m n
≤ (1 + µ(un ))
where µ(un ) = Lλ−2 kF (un )k. Further, from (5.95) and (5.100) there follows kR(un )k∗ ≤
2L L kun+1 − un k2 = τn2 1/2 kB −1 F (un )k2 , 1/2 2λ λ (M + m)2 n
hence, using the estimate preceding (5.102) and then Corollary 5.6, we obtain kR(un )k∗ ≤ τn2
2LM 2 2LM 2 2 2 1/2 kF (u )k kF (un )kn kF (un )k∗ ≤ τ (1+µ(u )) n n n n λ3/2 (M + m)2 λ3/2 (M + m)2
Summing up, we obtain kF (un+1 )k∗ ≤
2LM 2 Mn − m n kF (un )kn + τn2 (1 + µ(un ))1/2 3/2 1 − τn + τn (1 + µ(un )) Mn + m n λ (M + m)2 That is,
kF (un+1 )k∗ ≤ 1 − τn (1 − Qn ) + τn2 ρn kF (un )k∗ , where Qn and ρn are as in condition (iv). ˜ < 1 such that There exists Q ˜ Qn ≤ Q
(n ∈ N).
!
kF (un )k∗ . (5.115)
(5.116)
Namely, the assumption Mn /mn ≤ 1 + 2/(ε + Kµ(un )) with K > 1 and ε > 0 implies 1 + ε + Kµ(un ) ≤ 1 + hence
2 Mn + m n = , (Mn /mn ) − 1 Mn − m n
˜ Mn + m n 1 + µ(un ) ≤ Q Mn − m n
˜ := max{1/K, 1/(1 + ε)} < 1. with Q Let us introduce the function p : [0, 1] → R, p(t) := 1 − (1 − Qn )t + ρn t2 . Here ′ p (t) = −(1 − Qn ) + 2ρn t yields that τn defined in (5.112) satisfies p(τn ) = min p(t) < 1, t∈[0,1]
126
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
since p′ (0) = −(1 − Qn ) < 0. Hence from (5.115) kF (un+1 )k∗ ≤ p(τn )kF (un )k∗ < kF (un )k∗ .
(5.117)
Moreover, if τn = 1 (i.e. when 1 ≤ (1 − Qn )/2ρn ), then
˜ p(τn ) = Qn + ρn ≤ Qn + (1 − Qn )/2 = (1 + Qn )/2 ≤ (1 + Q)/2 < 1.
In the case τn = (1 − Qn )/2ρn we have
˜ 2 /(4 sup ρn ) =: Q′ < 1. p(τn ) = 1 − (1 − Qn )2 /(4ρn ) ≤ 1 − (1 − Q) n
The latter holds since by (5.117) kF (un )k∗ is bounded, hence ρn = const. · kF (un )kn (1 + const. · kF (un )k)1/2
(5.118)
is bounded, the three norms being equivalent. Altogether, from (5.117) we obtain kF (un )k∗ ≤ const. · rn → 0
˜ where r = max{(1 + Q)/2, Q′ }. This also implies that ρn → 0 and µ(un ) = Lλ−2 kF (un )k → 0. A brief calculation gives
p(τn ) = Qn + ρn 1 − (1 − τn )2 (for both τn = 1 and τn < 1), hence (5.117) yields lim sup
(5.119)
kF (un+1 )k∗ Mn − m n ≤ lim sup Qn = lim sup . kF (un )k∗ Mn + m n
The bound Mn /mn ≤ 1 + 2/ε in assumption (iv) implies that 1 Mn − m n ≤ < 1. Mn + m n 1+ε
lim sup
Finally, let Mn /mn ≤ 1 + c1 kF (un )kγ with constants c1 > 0, 0 < γ ≤ 1. Then Mn /mn ≤ 1 + c2 kF (un )kγ∗ with c2 = c1 Λ1/2 , hence Mn − m n Mn − m n < ≤ c2 kF (un )kγ∗ , Mn + m n mn
and therefore Qn ≤ c3 kF (un )kγ∗
with c3 = c2 (1 + supn µ(un )). Also,
ρn ≤ c4 kF (un )k∗ with some c4 > 0 since kF (un )k∗ is bounded (cf. (5.118)). Using notation en = kF (un )k∗ , we obtain from (5.117) and (5.119) that en+1 ≤ (Qn + ρn ) en ≤ (Qn + c4 en ) en ≤ ≤ with d1 = c3 + c4 e01−γ .
c3 eγn
+ c4 e0
en e0
γ
c3 eγn
en en + c4 e0 e0
en = d1 e1+γ n
5.4. PRECONDITIONING AND GRADIENTS
127
Remark 5.23 (a) It is worth mentioning that Theorems 5.15-5.16 define descent methods, similarly to the simple iteration. Namely, conditions (i)-(ii) of Theorem 5.15 imply the existence of a potential φ : H → R, i.e. φ′ (u) = F (u) − b (u ∈ H). Then the directions −Bn−1 (F (un ) − b) are descent directions, since their angle is acute with the steepest descent direction −(F (un ) − b) owing to hBn−1 (F (un ) − b), F (un ) − bi > 0 . A more direct connection of Newton iterations with descent methods will be set up in the next section in the context of variable steepest descent. (b) Theorem 5.16 can be generalized by only assuming H¨older continuity instead of Lipschitz: kF ′ (u) − F ′ (v)k ≤ Lku − vkα (u, v ∈ X) with some constants L > 0, 0 < α < 1 independent of u, v. Then the same results hold with 0 < γ ≤ 1 replaced by 0 < γ ≤ α for (5.114), i.e. the fastest feasible convergence is of order 1 + α. (See Remark 5.17.)
5.4
Preconditioning and gradients
The connection of preconditioning to gradients and steepest descent can be established in a way which is related to the famous class of Sobolev gradient methods and also helps to treat gradient and Newton-like methods in some common manner. The Sobolev gradient approach has been developed in a series of publications by Neuberger (see e.g. [227]–[230]), for a summary see his monograph [231]. According to its main theoretical principle, preconditioning can be obtained via a change of inner product in the representation of the operator as the gradient of a suitable functional. For linear operators the same is known with the quadratic functional (cf. [11]). Now we put the results of section 5.3 in this context. The fixed and variable preconditioners are connected with an overall or stepwise change of inner product, respectively. Further, Newton’s method can be regarded as an optimal extreme case of variable preconditioning. (a) Fixed preconditioners Here we analyse Theorem 5.13 in the context of gradients. By Remark 5.1 and the assumptions of Theorem 5.13, the operator F : HS → HS in (5.73) has a potential φS : H → R. That is, φ′S (u) = F (u)
(u ∈ HS )
(5.120)
where φ′S denotes the gradient (derivative) of φ w.r. to the inner product h., .iS . On the other hand, if we consider φ|D as a (densely defined) functional in H, then its gradient w.r. to the original inner product h., .i of H is φ′ (u) = T (u)
(u ∈ D).
(5.121)
128
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
Namely, using the definition of directional derivative and relation (5.73), we obtain hφ′ (u), vi =
∂φ (u) = hφ′S (u), viS = hF (u), viS = hT (u), vi ∂v
(u, v ∈ D).
The Gateaux differentiability of φ w.r. to h., .i follows since for fixed u the linear functional v 7→ hT (u), vi is bounded in H. The steepest descent iteration corresponding to the gradient φ′S (and with constant stepsize) is the preconditioned sequence in (5.75): un+1 = un −
2 (F (un ) − b) . M +m
(5.122)
Using the gradient φ′ one would have the steepest descent iteration un+1 = un − α(T (un ) − g)
(5.123)
with some constant α > 0. (In the motivating case cond(T ) = M = ∞ the analogue of 2/(M + m) does not work to define α.) Altogether, (5.120) and (5.121) mean that the change of the inner product yields the change of the gradient of φ. Further, in the case R(S) ⊃ R(T ) we have F|D = S −1 T
(5.124)
from (5.82), hence (5.120) is replaced by φ′S (u) = S −1 T (u)
(u ∈ D).
(5.125)
That is, the modified gradient is expressed as the formally preconditioned version of the original one. In this case the sequence (5.122) takes the form (5.78): un+1 = un −
2 S −1 (T (un ) − g) . M +m
(5.126)
The preconditioned sequences (5.122) or (5.126) are the steepest descent iterations corresponding to the formulations (5.120) or (5.125), respectively, and yield convergence with ratio M −m q= M +m owing to the achieved condition number (5.71) cond(S −1 T ) ≤
M . m
On the other hand, the iteration (5.123) which is the steepest descent iteration corresponding to (5.121) gives slow or even no convergence, the latter being true for the motivating case cond(T ) = ∞. (5.127)
To sum up, this means that preconditioning by S is equivalent to a change of inner product and a redefined gradient of φ, and it provides considerably better convergence than the original operator.
5.4. PRECONDITIONING AND GRADIENTS
129
By Remark 5.19, in the case cond(T ) = ∞ the finite condition number of S −1 T can be achieved with an unbounded operator S which compensates (5.127). We note that one can further improve the conditioning by varying the operator S within the class of operators equivalent to T in the sense of (5.72). (b) Variable preconditioners In order to discuss the results of subsection 5.3.3 in the context of gradients, let us now consider the case of differentiable operators. That is, let the operator F : H → H have a bihemicontinuous symmetric Gateaux derivative and let φ : H → R be a potential, i.e. φ′ (u) = F (u) (u ∈ H). (5.128) Assume that the nth term of an iterative sequence is constructed and let Bn be a strongly positive self-adjoint linear operator. Then it follows similarly to (5.125) that the gradient of φ w.r. to the inner product h., .iBn is φ′Bn (u) = Bn−1 F (u)
(u ∈ H).
(5.129)
Namely, hφ′Bn (u), viBn =
∂φ (u) = hφ′ (u), vi = hBn−1 F (u), viBn ∂v
(u, v ∈ H).
The relation (5.129) means that the sequence un+1 = un −
2τn B −1 (F (un ) − b) Mn + m n n
(5.130)
in (5.111) is a variable gradient (steepest descent) method corresponding to φ such that in the nth step the gradient of φ is taken w.r. to the inner product h., .iBn . The stepwise spectral equivalence relation mn hBn h, hi ≤ hF ′ (un )h, hi ≤ Mn hBn h, hi
(n ∈ N, h ∈ H)
(5.131)
yields condition numbers
cond Bn−1 F ′ (un ) ≤
Mn mn
(n ∈ N)
and, according to Theorem 5.16, implies convergence with ratio q = lim sup
Mn − m n . Mn + m n
In particular, superlinear convergence can also be obtained when q = 0. (c) The Newton iteration as an optimal variable steepest descent Paragraph (b) allows us to regard Newton’s method as optimal in the context of steepest descent in a more general sense than for a usual gradient method. Namely,
130
CHAPTER 5. NONLINEAR EQUATIONS IN HILBERT SPACE
the latter defines an optimal descent direction when a fixed inner product is used. In contrast, if the search for an optimal descent direction also allows the stepwise change of inner product such that the latter is chosen among energy inner products h., .iBn corresponding to (5.131), then the choice Bn = F ′ (un ) yields optimal spectral bounds mn = Mn = 1 in (5.131), and the corresponding variable gradient iteration (5.130) becomes the damped Newton method with quadratic convergence speed. Roughly speaking, this means that the descents in the gradient method are steepest w.r. to different directions, whereas in Newton’s method they are steepest w.r. to different directions and inner products. Remark 5.24 A different interpretation of Newton’s method in Sobolev gradient context uses minimization subject to constraints, see Neuberger [231], Chapter 7.
In this chapter we have summarized the Hilbert space background of the results that will be used for nonlinear elliptic problems in the sequel. Both the solvability of problems and the convergence of preconditioned iterations concerning boundary value problems will be derived from the Hilbert space theorems of the present chapter. In addition, the obtained results help us to present that the concept of preconditioning is able to provide a general framework to discuss iterative methods.
Chapter 6 Solvability of nonlinear elliptic problems This chapter is devoted to the existence, uniqueness and some properties of the solutions of nonlinear elliptic boundary value problems. The first two sections involve the weak formulation and are based on the abstract setting of section 5.1. Namely, in section 6.1 the uniform monotonicity of the generalized differential operators is established in terms of bihemicontinuous symmetric Gateaux derivatives. Using this, the results of section 5.1 then yield well-posedness results for the weak formulation in section 6.2. In particular, the later discussion motivates the formulation of solvability results for various specialized cases, hence (after a general study in subsection 6.2.1) such results are given in subsection 6.2.2. Then section 6.3 presents some regularity and positivity properties. Finally, as an application of these results, section 6.4 contains existence and uniqueness for the model problems of Chapter 1. The results of this chapter are given in a self-contained form together with those of the previous chapter (except for subsection 6.2.3). Related summaries on similar elliptic problems are contained e.g. in the monographs of Gajewski–Gr¨oger–Zacharias [127], Necas [223], Zeidler [296].
6.1
Some properties of the generalized differential operators
This section serves as a theoretical background for the study of well-posedness in the next section. We prove the existence of bihemicontinuous symmetric Gateaux derivatives and the uniform monotonicity for two general types of generalized differential operators. (The calculations follow the papers [166, 167].) These operators correspond to the boundary value problems of subsection 6.2.1, see Remark 6.3. The generalized differential operators are considered as mappings from a Sobolev space into itself, see the Appendix, paragraph (c) of A2 for some more explanation. First we consider an operator corresponding to a 2n-th order r-term system with Dirichlet boundary conditions on a bounded domain Ω ⊂ RN . The following notations will be used: denote by d the number of multi-indices α = (α1 , . . . , αN ) ∈ NN such that |α| := α1 + . . . + αN ≤ n. We write any ξ ∈ Rrd as ξ = (ξi,α )i=1,..,r, |α|≤n = 131
132
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
(ξ1,α )|α|≤n , .., (ξr,α )|α|≤n with ξi,α ∈ R (i = 1, .., r, |α| ≤ n). Differentiation with respect to ξi,α is denoted by ∂ξi,α (i = 1, .., r, |α| ≤ n). Further, let D(n) u := (∂ α ui )i=1,..,r, |α|≤n for any u = (u1 , .., ur ) ∈ H0n (Ω)r = H0n (Ω) × . . . × H0n (Ω). (See e.g. [2] for the definition of H0n (Ω).) The inner product of the real Hilbert space H0n (Ω) is defined by hw, viH n (Ω) := 0
Z
N X
(∂i1 ..∂in w)(∂i1 ..∂in v)
Ω i1 ,..,in =1
and that of the space H0n (Ω)r by hw, viH n (Ω)r := 0
r X i=1
hwi , vi iH n (Ω) . 0
We introduce other inner products hw, vi∗H n (Ω) 0
Z
:=
Ω
X
α
α
(∂ w)(∂ v)
and
|α|≤n
hw, vi′H n (Ω) 0
:=
Z
Ω
X
(∂ α w)(∂ α v)
|α|=n
on H0n (Ω), which define equivalent norms to the original one, namely, there exist ω1 , ω2 > 0 such that ω1 kwkH0n (Ω) ≤ kwk′H0n (Ω) ≤ kwk∗H0n (Ω) ≤ ω2 kwkH0n (Ω) .
(6.1)
Assumptions 6.1. Let the functions fi,α : Ω × Rrd → R satisfy the following conditions: (i) fi,α ∈ C |α| (Ω × Rrd ), fi,0 ∈ C 1 (Ω × Rrd ) (i = 1, .., r, 0 < |α| ≤ n); (ii) ∂ξi,α fj,β = ∂ξj,β fi,α
(i, j = 1, .., r; |α| , |β| ≤ n) ;
(iii) there exist constants 0 < µ1 ≤ µ2 such that for any (x, ξ) ∈ Ω × Rrd and ζ ∈ Rrd , µ1
r X X
i=1 |α|=n
2
|ζi,α | ≤
r X
X
i,j=1 |α|,|β|≤n
∂ξj,β fi,α (x, ξ)ζj,β ζi,α ≤ µ2
r X X
i=1 |α|≤n
|ζi,α |2 .
Theorem 6.1 Let Assumptions 6.1 hold. Then the formula hF (u), viH n (Ω)r := 0
≡
Z X r X
Z
Ω
f (x, D(n) u) · D(n) v dx
fi,α (x, D(n) u) ∂ α vi dx
Ω i=1 |α|≤n operator F : H0n (Ω)r
defines an Gateaux derivative satisfying
(6.2)
(u, v ∈ H0n (Ω)r )
→ H0n (Ω)r which has a bihemicontinuous symmetric
mkhk2H0n (Ω)r ≤ hF ′ (u)h, hiH n (Ω)r ≤ M khk2H0n (Ω)r 0
(u, h ∈ H0n (Ω)r )
(6.3)
with suitable constants M ≥ m > 0. (Namely, m := µ1 ω12 and M := µ2 ω22 with ω1 and ω2 from (6.1).)
6.1. SOME PROPERTIES OF THE GENERALIZED DIFFERENTIAL OPERATORS133 Proof. First we remark the following facts: assumption (iii) implies that for all i, j = 1, .., r , |α| , |β| ≤ n and (x, ξ) ∈ Ω × Rrd ∂ξj,β fi,α (x, ξ) ≤ µ2 ,
further, Lagrange’s inequality yields that for all i = 1, .., r, |α| ≤ n , (x, ξ) ∈ ∈ Ω×Rrd |fi,α (x, ξ)| ≤ |fi,α (x, 0)| + µ2
r X X
j=1 |β|≤n
|ξj,β | .
We verify that for any i = 1, ..., r the formula hFi (u), viH n (Ω) := 0
Z
Ω
X
fi,α (x, D(n) u)∂ α v dx
(v ∈ H0n (Ω))
|α|≤n
(6.4)
defines an operator Fi : H0n (Ω)r → H0n (Ω). Namely, the modulus of the integral on the right-hand side of (6.4) can be estimated by Z
Ω
≤
√
X
|α|≤n
|fi,α (x, 0)| + µ2
r X X
j=1 |β|≤n
|∂ β uj (x)| |∂ α v(x)|dx ≤
d max kfi,α (idΩ , 0)kL2 (Ω) + µ2 |α|≤n
√
rdkuk∗H0n (Ω)r
!
kvk∗H0n (Ω) ,
i.e. for any fixed u ∈ H0n (Ω)r the discussed integral defines a bounded linear functional in v on H0n (Ω). Hence, using the Riesz theorem, (6.4) defines Fi (u) ∈ H0n (Ω) for arbitrary u ∈ H0n (Ω)r . Consequently, the operator (6.2) with coordinates (6.4) is also well-defined: F (u) := (F1 (u), .., Fr (u)) ∈ H0n (Ω)r
(u ∈ H0n (Ω)r ).
(6.5)
Now we check the desired properties of F , that is, the three requirements of bihemicontinuous symmetric Gateaux derivative (see Definition 5.1) and the estimate (6.3). (i) For any u ∈ H0n (Ω)r let Si (u) ∈ L (H0n (Ω)r , H0n (Ω)) be the bounded linear operator defined by hSi (u)h, viH n (Ω) := 0
Z X r
X
Ω j=1 |α|,|β|≤n
∂ξj,β fi,α (x, D(n) u)(∂ β hj )(∂ α v) dx
(for all h ∈ H0n (Ω)r , v ∈ H0n (Ω)). The √ existence of Si (u) is provided again by Riesz’s theorem, now using the estimate µ2 rdkhk∗H0n (Ω)r kvk∗H0n (Ω) for the right side integral. We will prove that Fi is Gˆateaux differentiable (i = 1, .., r) , namely, Fi′ (u) = Si (u) n
(u ∈ H0n (Ω)r ) . o
Let u, h ∈ H0n (Ω)r and E := v ∈ H0n (Ω) : kvkH0n (Ω) = 1 . Then 1 i δu,h (t) := kFi (u + th) − Fi (u) − tSi (u)hkH0n (Ω) t
(6.6)
134
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS 1 sup hFi (u + th) − Fi (u) − tSi (u)h, viH n (Ω) 0 t v∈E
=
Z X h 1 = sup fi,α x, D(n) u(x) + tD(n) h(x) − fi,α x, D(n) u(x) − t v∈E Ω |α|≤n
−t
r X
X
j=1 |α|,|β|≤n
= sup v∈E
Z
X h
Ω α,β,j
∂ξj,β fi,α x, D(n) u(x) ∂ β hj (x) ∂ α v(x)dx
∂ξj,β fi,α x, D(n) u(x) + tθ(x, t)D(n) h(x) −
i
−∂ξj,β fi,α x, D(n) u(x)
∂ β hj (x)∂ α v(x)dx
X
∂ξj,β fi,α (id, D (n) u+tθD (n) h)−∂ξj,β fi,α (id, D (n) u) ∂ β hj 2 × ≤ sup L (Ω) v∈E α,β,j
o
×k∂ α vkL2 (Ω) .
Here k∂ α vkL2 (Ω) ≤ kvk∗H0n (Ω) ≤ ω2 (|α| ≤ n). Further, |tθ(x, t)D(n) h(x)| ≤ |tD(n) h(x)| → 0 (if t → 0) almost everywhere on Ω , hence the continuity of ∂ξj,β fi,α implies that in each term of the sum the first factor converges a.e. (almost everywhere) to 0 when 2 t → 0. Since the integrands are majorated by 2µ2 |∂ β hj (x)| ∈ L1 (Ω) for any |β| ≤ n and j = 1, .., r, Lebesgue’s theorem yields that the obtained expression tends to 0 when t → 0 , thus i lim δu,h (t) = 0 . t→0
That is, (6.6) holds. Finally, (6.6) and (6.5) imply that F is Gˆateaux differentiable and F ′ (u) = S(u) ≡ (S1 (u), .., Sr (u))
(u ∈ H0n (Ω)r ).
(ii) It is proved similarly to (i) that F ′ is bihemicontinuous. Namely, let u, k, w, h ∈ H0n (Ω)r be fixed functions. Then i ωu,k,w,h (s, t) := kFi′ (u + sk + tw)h − Fi′ (u)hkH0n (Ω) =
= sup hFi′ (u + sk + tw)h − Fi′ (u)h, viH n (Ω) = 0
v∈E
= sup v∈E
Z
X h
Ω α,β,j
∂ξj,β fi,α x, D(n) u(x) + sD(n) k(x) + tD(n) w(x) −
i
−∂ξj,β fi,α x, D(n) u(x)
∂ β hj (x)∂ α v(x)dx .
Using the continuity of the functions ∂ξj,β fi,α and Lebesgue’s theorem, we obtain just as above that i (s, t) = 0 , lim ωu,k,w,h s,t→0
i.e. the mapping (s, t) 7→ Fi′ (u + sk + tw)h
6.1. SOME PROPERTIES OF THE GENERALIZED DIFFERENTIAL OPERATORS135 is continuous from R2 to H0n (Ω). Owing to (6.5) the same holds with Fi′ replaced by F ′. (iii) It follows from assumption (ii) that for any u, h, v ∈ H0n (Ω)r hF ′ (u)h, viH n (Ω)r = 0
= =
r X i=1
hFi′ (u)h, vi iH n (Ω) 0
Z X X
∂ξj,β fi,α (x, D(n) u)(∂ β hj )(∂ α vi )dx
Z X X
∂ξi,α fj,β (x, D(n) u)(∂ β hj )(∂ α vi )dx
Ω i,j |α|,|β|
=
Ω i,j |α|,|β|
r D X
hj , Fj′ (u)v
j=1
E
H0n (Ω)
= hF ′ (u)v, hiH n (Ω)r , 0
i.e. F ′ (u) is self-adjoint. The verified three properties mean that F has a bihemicontinuous symmetric Gateaux derivative. (iv) Setting v = h in the above formula, we have hF ′ (u)h, hiH n (Ω)r = 0
Z XX Ω i,j α,β
∂ξj,β fi,α (x, D(n) u)(∂ β hj )(∂ α hi )dx .
Hence from assumption (iii) we have
µ1 khk′H0n (Ω)r
2
≤ hF ′ (u)h, hiH n (Ω)r ≤ µ2 khk∗H0n (Ω)r 0
therefore, using (6.1),
2
mkhk2H0n (Ω)r ≤ hF ′ (u)h, hiH n (Ω)r ≤ M khk2H0n (Ω)r 0
(h ∈ H0n (Ω)r ) , (h ∈ H0n (Ω)r )
with m = µ1 ω12 and M = µ2 ω22 . That is, (6.3) is satisfied. Remark 6.1 As an important example, the following class of operators forms a special case of (6.2), i.e. Theorem 6.1 holds for them (cf. Langenbach [190], Mikhlin [215]). Let us consider an operator F : H0n (Ω)r → H0n (Ω)r of the form hF (u), viH0n (Ω)r =
Z
Ω
a([u, u]) [u, v]
(u, v ∈ H0n (Ω)r ),
(6.7)
i.e. the integrand of F in (6.2) is of the form f (x, D(n) u) · D(n) v = a([u, u]) [u, v] ,
(6.8)
where the following notations are used. The scalar C 1 function a : R+ → R+ satisfies the condition 0 < λ ≤ a(r) ≤ Λ , (6.9) 0 < λ ≤ (a(r2 )r)′ ≤ Λ
136
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
with suitable constants λ, Λ independent of the variable r > 0, further, [ . , . ] : H0n (Ω)r × H0n (Ω)r → L1 (Ω)
(6.10)
is a given bilinear mapping, which can be expressed as [u, v] = L D(n) u · L D(n) v
(u, v ∈ H0n (Ω)r )
with some givenR nonsingular matrix L ∈ Rrd×rd , using the notations of Theorem 6.1. Note that here Ω [h, h] is an equivalent norm to khk2H0n (Ω)r . Then (6.7) is a special case of (6.2) satisfying (6.3). Namely, let
p(r2 ) = min{a(r2 ), (a(r2 )r)′ },
q(r2 ) = max{a(r2 ), (a(r2 )r)′ }
(r ≥ 0),
(6.11)
which inherit (6.9): 0 < λ ≤ p(r) ≤ q(r) ≤ Λ
(r ≥ 0).
(6.12)
It is easy to check that the derivative of (6.8) satisfies
∂f (x, D(n) u) D(n) h·D(n) v = a([u, u]) [h, v] + 2a′ ([u, u]) [u, h][u, v] ∂ξ
(u, h, v ∈ H0n (Ω)r ).
Hence, using (6.11)–(6.12), λ[h, h] ≤ p([u, u]) [h, h] ≤
∂f (x, D(n) u) D(n) h · D(n) h ≤ q([u, u]) [h, h] ≤ Λ[h, h] (6.13) ∂ξ
which implies that λ
Z
Ω
[h, h] ≤
Z
Ω
′
p([u, u]) [h, h] ≤ hF (u)h, hiH0n (Ω)r ≤
Z
Ω
q([u, u]) [h, h] ≤ Λ
Z
Ω
[h, h]
(6.14) R for all u, h ∈ H0n (Ω)r . Then the equivalence of Ω [h, h] and khk2H0n (Ω)r yields the desired estimate (6.3). Second and fourth order examples for operators of the form (6.7) are given in Remarks 6.5 and 6.6, respectively. For instance, the physical model problems in sections 1.1–1.2 and 1.4 are of this type. Now we consider second order problems with 3rd type boundary conditions and the case of polynomial growth of lower order terms. The following assumptions are made for the coefficients of the operator: Assumptions 6.2. Let the bounded domain Ω ⊂ RN and the functions fi , qi and si (i = 1, ..., r) satisfy the following conditions: (i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary; ΓN , ΓD ⊂ ∂Ω are measurable, ΓN ∩ ΓD = ∅ and ΓN ∪ ΓD = ∂Ω. (ii) The functions fi : Ω × RN r → RN , qi : Ω × Rr → R and si : ΓN × Rr → R are measurable and bounded w.r. to the variable x ∈ Ω (or x ∈ ΓN , resp.) and C 1 in the other variables.
6.1. SOME PROPERTIES OF THE GENERALIZED DIFFERENTIAL OPERATORS137 (iii) Let fi = (fi1 , ..., fiN ) and f = (f11 , ..., f1N , f21 , ..., f2N , ..., fr1 , ..., frN ) : Ω×RN r → RN r . For any (x, η) ∈ Ω × RN r , the matrices {∂ηk fj (x, η)}j,k=1,...,N r are sym(f ) metric and their eigenvalues λj (x, η) (j = 1, ..., rN ) satisfy (f )
µ1 ≤ λj (x, η) ≤ µ2 with constants µ2 ≥ µ1 > 0 independent of (x, η). (iv) Let 2 ≤ p (if N = 2) or 2 ≤ p ≤ N2N (if N > 2). There exist constants ci , di ≥ 0 −2 and 2 ≤ pi ≤ p (i = 1, 2) such that the following holds: for any (x, ξ) ∈ Ω × Rr , (q) the matrices {∂ξk qj (x, ξ)}j,k=1,...,r are symmetric and their eigenvalues λj (x, ξ) (j = 1, ..., r) satisfy r X
(q)
0 ≤ λj (x, ξ) ≤ c1 + c2
j=1
|ξj |p1 −2 ,
further, for any (x, ξ) ∈ ΓN ×Rr the matrices {∂ξk sj (x, ξ)}j,k=1,...,r are symmetric (s) and their eigenvalues λj (x, ξ) (j = 1, ..., r) satisfy (s)
0 ≤ λj (x, ξ) ≤ d1 + d2 (v) Either ΓD 6= ∅, or the function x 7→
inf
r X
j=1
|ξj |p2 −2 .
(s)
ξ∈Rr ,j=1,..,r
λj (x, ξ) is not a.e. constant zero
on ΓN .
We introduce the Hilbert space 1 HD (Ω) := {u ∈ H 1 (Ω) : u|ΓD = 0}
(6.15)
Z
(6.16)
with the inner product hu, viHD1 (Ω) :=
Ω
where β(x) ≡
∇u · ∇v dx + 1 µ1
inf
ξ∈Rr ,j=1,..,r
Z
ΓN
(s)
1 (u, v ∈ HD (Ω)),
βuv dσ
λj (x, ξ)
(x ∈ ΓN )
(6.17)
(s)
with µ1 and λj (x, ξ) from assumptions (iii) and (iv), respectively. (Assumption (v) 1 ensures the positive definiteness of (6.16).) The corresponding inner product in HD (Ω)r is r hu, viHD1 (Ω)r =
X i=1
hui , vi iHD1 (Ω)
1 (u, v ∈ HD (Ω)r ).
Then there hold the Sobolev embeddings and corresponding estimates 1 HD (Ω) ⊂ Lp (Ω),
1 HD (Ω)|ΓN ⊂ Lp (ΓN ),
kukLp (Ω) ≤ Kp,Ω kukHD1 kukLp (ΓN ) ≤ Kp,ΓN kukHD1
1 (u ∈ HD (Ω)),
1 (u ∈ HD (Ω)|ΓN )
(6.18) (6.19)
(see in the Appendix, Chapter 11), where Kp,Ω > 0 and Kp,ΓN > 0 are suitable constants 1 1 and HD (Ω)|ΓN denotes the trace of HD (Ω) on ΓN .
138
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
Theorem 6.2 Let Assumptions 6.2 hold. Then the formula hF (u), viH 1 (Ω)r := D
≡
Z X r
Ω i=1
Z
Ω
(f (x, ∇u) · ∇v + q(x, u) · v) dx +
Z
ΓN
s(x, u) · v dσ
(fi (x, ∇u1 , .., ∇ur ) · ∇vi + qi (x, u1 , .., ur )vi ) dx +
Z X r
ΓN
si (x, u1 , .., ur )vi dσ
i=1
1 (u, v ∈ HD (Ω)r )
(6.20)
1 1 defines an operator F : HD (Ω)r → HD (Ω)r which has a bihemicontinuous symmetric Gateaux derivative satisfying
mkhk2H 1 (Ω)r ≤ hF ′ (u)h, hiH 1 (Ω)r ≤ M (kukHD1 (Ω)r )khk2H 1 (Ω)r D
D
D
1 (u, h ∈ HD (Ω)r ) (6.21)
with the constant m = µ1 > 0 and the increasing function 2 2 M (r) := µ2 +c1 K2,Ω +d1 K2,Γ +c2 µp1 Kpp11,Ω rp1 −2 +d2 Kpp22,ΓN µp2 rp2 −2 N
(r > 0), (6.22)
where K2,Ω , K2,ΓN , Kp1 ,Ω , Kp2 ,ΓN > 0 are the embedding constants in (6.18)–(6.19) and 4−pi
µpi = max{1, r pi −2 }. Proof. The proof goes on similarly to Theorem 6.1 with appropriate modifications. Let us write the operator F in three terms F (u) = A(u) + B(u) + C(u)
1 (u ∈ HD (Ω)r ),
where hA(u), viH 1 (Ω)r = D
Z X r
Ω i=1
fi (x, ∇u1 , .., ∇ur ) · ∇vi dx ,
hB(u), viH 1 (Ω)r = D
Z X r
qi (x, u1 , .., ur )vi dx ,
(6.23)
(6.24)
Ω i=1
hC(u), viH 1 (Ω)r = D
Z X r
ΓN
si (x, u1 , .., ur )vi dσ
(6.25)
i=1
1 (u, v ∈ HD (Ω)r ). (In the sequel we will omit writing dx in the integrals.) We will verify the required properties separately for the three operators and finally sum up. (A) Owing to the uniform ellipticity assumption (iii) on f , the proof for the operator A is a special case of Theorem 6.1. Hereby the Gateaux derivative of A is given by ′
hA (u)h, viH 1 (Ω)r = D
Z
Nr X
Ω j,k=1
∂ηk fj (x, ∇u1 , .., ∇ur ) ∂k v ∂k v ,
(6.26)
r using the notations of assumption (iii) including the indexing {∂k v}N k=1 of ∇v for a 1 function v ∈ HD (Ω)r . Further, the spectral bounds µ1 , µ2 for the eigenvalues of ∂ηk fj together with the reindexing Nr X
j,k=1
|∂k h|2 =
r X i=1
|∇hi |2
1 (h ∈ HD (Ω)r )
6.1. SOME PROPERTIES OF THE GENERALIZED DIFFERENTIAL OPERATORS139 yield that µ1
r Z X
i=1 Ω
2
′
|∇hi | ≤ hA (u)h, hiH 1 (Ω)r ≤ µ2 D
r Z X
i=1 Ω
|∇hi |2
1 (h ∈ HD (Ω)r ).
(6.27)
(B) Following the proof of Theorem 6.1, first we verify that for any i = 1, ..., r the formula hBi (u), viH 1 (Ω) := D
Z
qi (x, u1 , .., ur )vi
Ω
1 1 (u ∈ HD (Ω)r , v ∈ HD (Ω))
(6.28)
1 1 defines an operator Bi : HD (Ω)r → HD (Ω). For this we remark the following facts: assumption (iv) implies that for all i, k = 1, . . . , r and (x, ξ) ∈ Ω × Rr
|∂ξk qi (x, ξ)| ≤ c1 + c2
r X
j=1
|ξj |p1 −2 .
(6.29)
Hence from Lagrange’s inequality we have for all i = 1, . . . , r, (x, ξ) ∈ Ω × Rr
|qi (x, ξ)| ≤ |qi (x, 0)| + c1 + c2
r X
j=1
p1 −2
|ξj |
r X
k=1
|ξk | ≤
c′1
+
c′2
r X
j=1
|ξj |p1 −1
(6.30)
with suitable constants c′1 , c′2 > 0. 1 Let q1 > 1 be defined by p1 −1 + q1 −1 = 1. For any u ∈ HD (Ω)r let Q(u) ≡ c′2
r X
j=1
|uj |p1 −1 = c′2
r X
j=1
|uj |p1 /q1 .
Then we have |qi (x, u)| ≤ c′1 + Q(u) from (6.30) where Q(u) ∈ Lq1 (Ω), hence the right-hand side of (6.28) can be estimated by the expression c′1 |Ω|1/2 kvkL2 (Ω) + kQ(u)kLq1 (Ω) kvkLp1 (Ω) , where |Ω| is the measure of Ω. Using (6.18) with p = p1 we see that for any fixed u ∈ 1 1 HD (Ω)r the right-hand side of (6.28) defines a bounded linear functional on HD (Ω)r . 1 r 1 Hence, using the Riesz theorem, the operator Bi : HD (Ω) → HD (Ω) is well-defined. Now we check that B has a bihemicontinuous symmetric Gateaux derivative. The requirements (i)-(iii) of Definition 5.1 are verified as follows. (i) We prove that B is Gˆateaux differentiable by verifying this for all Bi (i = 1, ..., r). 1 1 1 For any u ∈ HD (Ω)r let Si (u) : HD (Ω)r → HD (Ω) be the bounded linear operator defined by hSi (u)h, viHD1 =
Z X r
∂ξk qi (x, u)hk v
Ω k=1
1 1 (u ∈ HD (Ω)r , v ∈ HD (Ω)).
(6.31)
The existence of Si (u) is provided by the Riesz theorem similarly as above, now having the estimate Z
Ω
|∂ξk qi (x, u)hk v| ≤ c′1 khk kL2 (Ω) kvkL2 (Ω)
r
X
+ c2
|uj |p1 −2
k=1
p1
L p1 −2 (Ω)
khk kLp1 (Ω) kvkLp1 (Ω)
140
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
using (6.29) and that
−1 p1 p1 −2
+ p1 −1 + p1 1 = 1. We prove that
Bi′ (u) = Si (u) n
1 u ∈ HD (Ω)r .
o
1 1 Let u, h ∈ HD (Ω)r , further, E ≡ v ∈ HD (Ω) : kvkHD1 (Ω) = 1 and
1 i δu,h (t) ≡ kBi (u + th) − Bi (u) − tSi (u)hkHD1 (Ω) t =
1 suphBi (u + th) − Bi (u) − tSi (u)h, viHD1 (Ω) . t v∈E
Then we have i δu,h (t)
Z 1 = sup t v∈E
qi (x, u + th) − qi (x, u) − t
Ω
= sup v∈E
≤ sup
r X
v∈E k=1
Z
Ω
Z X r
Ω k=1
r X
!
∂ξk qi (x, u)hk v
k=1
(∂ξk qi (x, u + θth) − ∂ξk qi (x, u)) hk v
|∂ξk qi (x, u + θth) − ∂ξk qi (x, u)|
p1 p1 −2
p1 −2 p1
dx
khk kLp1 (Ω) kvkLp1 (Ω) .
Here kvkLp1 (Ω) ≤ Kp1 ,Ω using (6.18) with p = p1 , further, |θth| ≤ |th| → 0 a.e. on Ω, hence the continuity of ∂ξk qi implies that the integrands converge a.e. to 0 when
p1 p1 −2 r P |uj + t0 hj |p1 −2 ) ≤ t → 0. The integrands are majorated for t ≤ t0 by 2(c1 + c2 j=1 r P p1 1
c˜1 + c˜2
j=1
|uj + t0 hj |
∈ L (Ω), hence, by Lebesgue’s theorem, the obtained expression
tends to 0 when t → 0. Thus
i lim δu,h (t) = 0 , t→0
i.e. Bi is Gˆateaux differentiable. (ii) It follows similarly to (i) that Bi is bihemicontinuous. Namely, let u, k, w ∈ 1 1 HD (Ω)r , h ∈ HD (Ω) be fixed functions. Then ωu,k,w,h (s, t) ≡ kBi′ (u + sk + tw)h − Bi′ (u)hkHD1 (Ω) = suphBi′ (u + sk + tw)h − Bi′ (u)h, viHD1 (Ω) v∈E
= sup v∈E
Z X r
Ω k=1
(∂ξk qi (x, u + sk + tw) − ∂ξk qi (x, u)) hk v .
Then we obtain just as above from the continuity of ∂ξk qi and Lebesgue’s theorem that lim ωu,k,w,h (s, t) = 0 ,
s,t→0
i.e. the mapping (s, t) 7→ Bi′ (u + sk + tw)h
6.1. SOME PROPERTIES OF THE GENERALIZED DIFFERENTIAL OPERATORS141 1 is continuous from R2 to HD (Ω). The obtained bihemicontinuity of Bi is inherited by B. (iii) It follows from Bi′ (u) = Si (u) in (6.31) and from the assumed symmetry of the 1 Jacobians ∂ξ q(x, ξ) that for any u, h, v ∈ HD (Ω)r
hB ′ (u)h, viHD1 (Ω)r =
r X i=1
r X
hBi′ (u)h, vi iHD1 (Ω) =
i=1
hBi′ (u)v, hi iHD1 (Ω)
= hB ′ (u)v, hiHD1 (Ω)r .
(iv) Now we estimate the spectral bounds of B ′ (u). 1 For any u, h ∈ HD (Ω)r we have ′
hB (u)h, hiHD1 (Ω)r =
Z X r
∂ξk qi (x, u)hk hi ,
Ω i,k=1
hence assumption (iv) yields hB ′ (u)h, hiHD1 (Ω)r ≥ 0.
(6.32)
Further, Z
hB ′ (u)h, hiHD1 (Ω)r ≤ Here
r Z X
c1
k=1 Ω
Ω
c 1 + c 2
r X
j=1
|uj |p1 −2
r X
h2k .
(6.33)
k=1
2 h2k ≤ c1 K2,Ω khk2H 1 (Ω)r D
using (6.18) with p = 2. Further, from H¨older’s inequality r Z X
j,k=1 Ω
|uj |p1 −2 h2k
≤
r X
j,k=1
2 kuj kpL1p−2 1 (Ω) khk kLp1 (Ω)
=
r X
j=1
kuj kpL1p−2 1 (Ω)
r X
k=1
An elementary extreme value calculation shows that for x ∈ Rr , the values of µ p1
P
r j=1
P
|xj |2
r j=1
p1 −2 2
r Z X
≤
4−p1
lie between R2 and r p1 −2 R2 , i.e. 4−p1
where µp1 = max{1, r p1 −2 }. Hence
|uj |p1 −2 h2k
j,k=1 Ω
µp1 Kpp11,Ω
|xj |p1 −2
2 p1 −2
r X
j=1
≤ µ p1
j=1
p1 −2 2
kuj k2H 1 D
r X
r X
k=1
p1 −2 2
kuj k2Lp1 (Ω)
khk k2H 1 D
!
r X
k=1
khk k2Lp1 (Ω) Pr
j=1
Pr
khk k2Lp1 (Ω)
j=1
!
.
x2j = R2
|xj |p1 −2 ≤
!
= µp1 Kpp11,Ω kukpH11−2(Ω)r khk2H 1 (Ω)r . D
D
Altogether, (6.33) yields
hB ′ (u)h, hiHD1 (Ω)r ≤ MB (kukHD1 (Ω)r )khk2H 1 (Ω)r D
1 (u, h ∈ HD (Ω)r )
(6.34)
142
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
with the real function p1 −2 2 MB (r) = c1 K2,Ω + c2 µp1 Kpp11,Ω rH 1 (Ω)r D
(r > 0).
Together with (6.32) we obtain 0 ≤ hB ′ (u)h, hiHD1 (Ω)r ≤ MB (kukHD1 (Ω)r )khk2H 1 (Ω)r D
1 (u, h ∈ HD (Ω)r ).
(6.35)
(C) For the operator C, the required properties are proved in the same way as for B via replacing Ω by ΓN and p1 by p2 . (This is because of the similar estimates for qi and si .) A different step is used for the lower bound. That is, C has a bihemicontinuous symmetric Gateaux derivative hC ′ (u)h, viHD1 =
Z
ΓN
r X
1 1 (Ω)r , h, v ∈ HD (Ω)). (u ∈ HD
∂ξk si (x, u)hk v dσ
i,k=1
(6.36)
Now the lower spectral estimate uses assumption (iv) and (6.17): ′
hC (u)h, hi
1 HD
≥
Z
inf
ξ∈Rr ,j=1,..,r
(s) λj (x, ξ)
r Z X
h2k dσ
k=1
ΓN
= µ1
r X
βh2k dσ.
k=1Γ N
The upper estimate is analogous to (6.34) with MB replaced by 2 MC (r) = d1 K2,Γ + d2 µp2 Kpp22,ΓN rp2 −2 N
(r > 0).
Hence we get µ1
r Z X
k=1Γ N
βh2k dσ ≤ hB ′ (u)h, hiHD1 (Ω)r ≤ MC (kukHD1 (Ω)r )khk2H 1 (Ω)r D
1 (u, h ∈ HD (Ω)r ).
(6.37)
Summing up, we obtain that F = A + B + C has a bihemicontinuous symmetric Gateaux derivative, further, from (6.27), (6.35) and (6.37) µ1 khk2H 1 (Ω)r = µ1 D
r Z X
k=1 Ω
|∇hk |2 + µ1
r Z X
k=1Γ N
r Z X
≤ µ2 + MB (kukHD1 (Ω)r ) + MC (kukHD1 (Ω)r )
k=1 Ω
βh2k dσ ≤ hF ′ (u)h, hiH 1 (Ω)r D
|∇hk |2 = M (kukHD1 (Ω)r )khk2H 1 (Ω)r D
with M (r) as in (6.22). That is, (6.21) is satisfied and the proof of Theorem 6.2 is complete.
6.2. WEAK SOLUTIONS
143
Remark 6.2 The operators of the form (6.7) in Remark 6.1 can also be considered 1 1 with mixed boundary conditions, i.e. the following operator F : HD (Ω)r → HD (Ω)r is a special case of the operator F in (6.20): hF (u), viHD1 (Ω)r =
Z
Ω
a([u, u]) [u, v]
1 (u, v ∈ HD (Ω)r )
(6.38)
1 with a(r) and [ . , . ] from (6.9) and (6.10), respectively , replacing H0n (Ω)r with HD (Ω)r . Hence Theorem 6.2 holds for F (with M (r) ≡ M ). The same result remains true if, more generally, a is allowed to depend on x ∈ Ω and in (6.9) we consider derivatives of a(x, r) w.r. to r. Further, the sum of operators as in (6.38) also inherits this property. For instance, the result holds for
hF (u), viHD1 (Ω)r =
Z Ω
a(x, [u, u]) [u, v] + b(x, {u, u}) {u, v} dx
1 (u, v ∈ HD (Ω)r )
(6.39)
under the above conditions. In this case it is enough to require that Z
Ω
([h, h] + {h, h})
(6.40)
is an equivalent norm to khk2H 1 (Ω)r , since the analogue of (6.14) now holds with [h, h] D replaced by [h, h] + {h, h}. Namely, following the notation (6.8), let f (x, ∇u) · ∇v = f1 (x, ∇u) · ∇v + f2 (x, ∇u) · ∇v ≡ a(x, [u, u]) [u, v] + b(x, {u, u}) {u, v} . Let a and b satisfy (6.9) (understanding derivatives w.r. to r as mentioned above) using the same constants λ and Λ for both of them for simplicity. Then, applying (6.13) to f1 and f2 and summing up, we obtain similarly that λ
Z
Ω
([h, h] + {h, h}) ≤
Z
Ω
Z ∂f (x, ∇u) ∇h · ∇h ≤ Λ ([h, h] + {h, h}) , ∂η Ω
(6.41)
which, by the imposed equivalence condition for the norm (6.40), implies the required uniform ellipticity of F similarly as in Remark 6.1.
6.2
Weak solutions
This section provides existence and uniqueness results for the weak formulation of nonlinear boundary value problems. The first subsection involves two general classes of problems. Then from these results we derive solvability for various more specialized cases, motivated by the later discussion.
6.2.1
General well-posedness results
In this subsection we derive existence and uniqueness theorems for two general types of boundary value problems. Namely, the uniform ellipticity results of Theorems 6.1–6.2 mean that the conditions of Theorem 5.1 hold for the corresponding boundary value problems, hence they have a unique solution. First we consider a 2n-th order system, using the notations of section 6.1.
144
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
Theorem 6.3 Let Ω ⊂ RN be a bounded domain and let the functions fi,α : Ω×Rrd → R satisfy Assumptions 6.1. Further, let gi ∈ L2 (Ω) (i = 1, .., r). Then the system X |α| α (n) T (u , .., u ) := (−1) ∂ f (x, D u) = gi (x) i 1 r i,α
(i = 1, .., r)
|α|≤n
α
(6.42)
(i = 1, .., r, |α| ≤ n − 1)
∂ ui|∂Ω = 0
has a unique weak solution u∗ = (u∗1 , .., u∗r ) ∈ H0n (Ω)r , that is, u∗ satisfies Z
Ω
X
fi,α (x, D
(n) ∗
α
u ) ∂ v dx =
|α|≤n
Z
Ω
(i = 1, .., r, v ∈ H0n (Ω)).
gi v dx
(6.43)
Proof. Theorem 6.1 implies that the operator F in (6.2) satisfies the conditions of Theorem 5.1. Further, the mapping r Z X
v 7→
i=1 Ω
gi vi dx
defines a bounded linear functional on H0n (Ω)r since its modulus can be estimated by r X i=1
kgi kL2 (Ω) kvi kL2 (Ω) ≤ K2,Ω
r X i=1
kgi kL2 (Ω) kvi kH0n (Ω) ≤ K2,Ω kgkL2 (Ω)r kvkH0n (Ω)r ,
where K2,Ω > 0 denotes the embedding constant for H0n (Ω) ⊂ L2 (Ω). Hence the Riesz theorem yields the existence of b ∈ H0n (Ω)r such that hb, viH0n (Ω)r =
r Z X
i=1 Ω
(v ∈ H0n (Ω)r ).
gi vi dx
Theorem 5.1 yields that the equation F (u) = b
(6.44)
has a unique solution u∗ ∈ H0n (Ω)r . Here (6.44) equals hF (u), viH0n (Ω)r = hb, viH0n (Ω)r
(v ∈ H0n (Ω)r ),
i.e. Z X r X
Ω i=1 |α|≤n
fi,α (x, D(n) u) ∂ α vi dx =
r Z X
i=1 Ω
gi vi dx,
(v ∈ H0n (Ω)r )
(6.45)
and for u = u∗ this is equivalent to the system (6.43) since in (6.45) the functions vi can be varied independently. Hence u∗ is the required weak solution of (6.42). For second order problems, the following general theorem covers both 3rd type boundary conditions and the case of polynomial growth of lower order terms. Now Theorem 6.2 ensures that the conditions of Theorem 5.1 hold, hence it yields the required well-posedness result.
6.2. WEAK SOLUTIONS
145
Theorem 6.4 Let Ω ⊂ RN be a bounded domain and the functions fi , qi and si (i = 1, ..., r) satisfy Assumptions 6.2. Further, let gi ∈ L2 (Ω) and γi ∈ L2 (ΓN ) (i = 1, .., r). Then the system
Ti (u1 , .., ur ) := − div fi (x, ∇u1 , .., ∇ur ) + qi (x, u1 , .., ur ) = gi (x)
Qi (u1 , .., ur ) ≡
in Ω
fi (x, ∇u1 , .., ∇ur ) · ν + si (x, u1 , .., ur ) = γi (x) on ΓN ui = 0
(6.46)
on ΓD r
(for i = 1, .., r) has a unique weak solution u∗ = (u∗1 , .., u∗r ) ∈ H 1 (Ω) , that is, u∗ satisfies Z
Ω
(fi (x, ∇u∗1 , .., ∇u∗r ) Z
· ∇v +
gi v dx +
Ω
Z
qi (x, u∗1 , .., u∗r )v)
dx +
Z
si (x, u∗1 , .., u∗r )v dσ =
ΓN 1 (i = 1, .., r, v ∈ HD (Ω)).
γi v dσ
ΓN
(6.47)
Proof. Theorem 6.2 implies that the operator F in (6.20) satisfies the conditions of Theorem 5.1. Further, the mapping Z Z r X v 7→ γi vi dσ gi vi dx + i=1
Ω
ΓN
1 defines a bounded linear functional on HD (Ω)r since its modulus can be estimated by r X i=1
kgi kL2 (Ω) kvi kL2 (Ω) + kγi kL2 (ΓN ) kvi kL2 (ΓN )
≤ K2,Ω kgkL2 (Ω)r + K2,ΓN kγkL2 (ΓN )r kvkHD1 (Ω)r as in Theorem 6.3, where K2,Ω > 0 and K2,ΓN > 0 denote the embedding constants for H0n (Ω) ⊂ L2 (Ω) and H0n (Ω) ⊂ L2 (ΓN ), respectively. Hence the Riesz theorem yields 1 the existence of b ∈ HD (Ω)r such that hb, viHD1 (Ω)r
Z Z r X = γi vi dσ gi vi dx + i=1
Ω
ΓN
1 (v ∈ HD (Ω)r ).
Theorem 5.1 yields that the equation F (u) = b
(6.48)
1 (Ω)r . Here (6.48) equals has a unique solution u∗ ∈ HD
hF (u), viHD1 (Ω)r = hb, viHD1 (Ω)r
1 (v ∈ HD (Ω)r ),
146
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
i.e. Z X r
Ω i=1
=
(fi (x, ∇u1 , .., ∇ur ) · ∇vi + qi (x, u1 , .., ur )vi ) dx +
Z X r
Ω i=1
gi v dx +
Z X r
(v ∈
γi v dσ
ΓN i=1
Z X r
ΓN
si (x, u1 , .., ur )vi dσ
i=1
1 HD (Ω)r )
(6.49) and for u = u∗ this is equivalent to the system (6.47) since in (6.49) the functions vi can be varied independently. Hence u∗ is the required weak solution of (6.46). Remark 6.3 As shown by the proof of Theorems 6.3–6.4, the equations (6.43) or (6.47) can be written with the generalized differential operator F as hF (u∗ ), vi = hb, vi
(6.50)
1 (for all v ∈ H0n (Ω)r or v ∈ HD (Ω)r , respectively). If u∗ is regular, i.e.
u∗ ∈ H 2n (Ω)r ∩ H0n (Ω)r
1 u∗ ∈ H 2 (Ω)r ∩ HD (Ω)r ,
or
(6.51)
respectively, then it is a strong solution: T (u∗ ) = g. This is because for functions u with regularity as in (6.51), the divergence theorem yields Z hF (u), vi = T (u)v Ω
1 (for all v ∈ H0n (Ω)r or v ∈ HD (Ω)r , respectively). Some special cases when the regularity (6.51) holds are cited in subsection 6.3.1 for a single operator. On the other hand, in general u∗ may not have the regularity (6.51), in which case it is not a strong solution of the original problem (6.42) or (6.46). In this case the notion of weak solution gives the only way in which an existence result can be obtained.
6.2.2
Various solvability theorems
Now we formulate some solvability theorems for single equations. They are special cases of those in the previous subsection. The distinct formulation of these results is motivated by the later discussion and serves the sake of clearer exposition. First we consider second order problems with Robin (or third type) boundary conditions. Theorem 6.5 Let the problem − div f (x, ∇u) + q(x, u) = g(x)
in Ω
f (x, ∇u) · ν + s(x, u) = γ(x) on ΓN
satisfy the following conditions:
u = 0
on ΓD
(6.52)
6.2. WEAK SOLUTIONS
147
(i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary; ΓN , ΓD ⊂ ∂Ω are measurable, ΓN ∩ ΓD = ∅ and ΓN ∪ ΓD = ∂Ω. (ii) The functions f : Ω × RN → RN , q : Ω × R → R and s : ΓN × R → R are measurable and bounded w.r. to the variable x ∈ Ω (or x ∈ ΓN , resp.) and C 1 in the other variables. (iii) The Jacobian matrices
∂f (x,η) ∂η
are symmetric and their eigenvalues λ satisfy 0 < µ1 ≤ λ ≤ µ 2 < ∞
with constants µ2 ≥ µ1 > 0 independent of (x, η). (iv) Let 2 ≤ p (if N = 2) or 2 ≤ p ≤ N2N (if N > 2). There exist constants −2 ci , di ≥ 0 and 2 ≤ pi ≤ p (i = 1, 2) such that for any x ∈ Ω (or x ∈ ΓN , resp.) and ξ ∈ R, 0 ≤ ∂ξ q(x, ξ) ≤ c1 + c2 |ξ|p1 −2 ,
0 ≤ ∂ξ s(x, ξ) ≤ d1 + d2 |ξ|p2 −2 .
(v) Either ΓD 6= ∅, or x 7→ inf ξ∈R ∂ξ s(x, ξ) is not a.e. constant zero on ΓN . (vi) g ∈ L2 (Ω) and γ ∈ L2 (ΓN ). 1 Then problem (6.52) has a unique weak solution u∗ ∈ HD (Ω). That is, u∗ ∈ H 1 (Ω) satisfies u∗ |ΓD = 0 and
Z
Ω
[f (x, ∇u∗ ) · ∇v + q(x, u∗ )v] + Proof.
Z
s(x, u∗ )v dσ =
ΓN
Z
Ω
gv +
Z
γv dσ
ΓN
1 (v ∈ HD (Ω)).
It follows directly from Theorem 6.4 with r = 1.
Remark 6.4 As a special case of (6.20), the generalized differential operator F : 1 1 HD (Ω) → HD (Ω) corresponding to problem (6.52) is given by hF (u), vi
1 HD
=
Z
Ω
[f (x, ∇u) · ∇v + q(x, u)v] +
Z
ΓN
s(x, u)v dσ
1 (u, v ∈ HD (Ω)).
For the problems to follow in Theorems 6.6–6.8, the generalized differential operator corresponds similarly to the left-hand sides of the equality that defines u∗ . The following special case of Theorem 6.5 is of particular interest in model problems, since it occurs frequently as seen in Chapter 1. Namely, here the operator consists of only the principal part and the nonlinearity appears in a scalar coefficient depending on |∇u|. (Further, the boundary condition is mixed.)
148
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
Theorem 6.6 Let the problem −div (b(x, |∇u|) ∇u) = g(x)
b(x, |∇u|)
∂u ∂ν
in Ω
= γ(x) on ΓN
u = 0
(6.53)
on ΓD
satisfy the following conditions: (i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary; ΓN , ΓD ⊂ ∂Ω are measurable, ΓN ∩ ΓD = ∅ and ΓN ∪ ΓD = ∂Ω. (ii) The scalar-valued function b : Ω × R+ → R is measurable and bounded w.r. to x and C 1 w.r. to the variable r, further, it satisfies 0 < µ1 ≤ b(x, r) ≤
∂ (r b(x, r)) ≤ µ2 ∂r
(x ∈ Ω, r > 0)
(6.54)
with constants µ2 ≥ µ1 > 0 independent of (x, r). (iii) ΓD 6= ∅. (iv) g ∈ L2 (Ω) and γ ∈ L2 (ΓN ). 1 Then problem (6.53) has a unique weak solution u∗ ∈ HD (Ω). That is, u∗ ∈ H 1 (Ω) satisfies u∗ |ΓD = 0 and Z Z Z ∗ ∗ 1 b(x, |∇u |)∇u · ∇v = gv + γv dσ (v ∈ HD (Ω)). Ω
Proof. satisfies
hence
∂f (x,η) ∂η
Ω
Using the notation b′r (x, r) =
ΓN
∂ b(x, r), ∂r
the function f (x, η) = b(x, |η|)η
∂f (x, η) b′ (x, |η|) ξ · ζ = b(x, |η|) (ξ · ζ) + r (η · ξ)(η · ζ), ∂η |η| is symmetric and (6.54) yields
µ1 |ξ|2 ≤ b(x, |η|)|ξ|2 ≤
∂f (x, η) ξ · ξ ≤ (b(x, |η|) + b′r (x, |η|)|η|) |ξ|2 ≤ µ2 |ξ|2 . ∂η
(6.55)
This implies that all the conditions of Theorem 6.5 are satisfied with the special case q ≡ 0 and s ≡ 0. It is worth formulating the even more special case when b(x, r) is independent of x. Further, let us only have Dirichlet boundary condition.
6.2. WEAK SOLUTIONS
149
Corollary 6.1 Let the scalar-valued function a ∈ C 1 (R+ ) satisfy 0 < µ1 ≤ a(r) ≤ (r a(r)) ′ ≤ µ2
(r > 0)
(6.56)
with constants µ2 ≥ µ1 > 0 independent of r. Let us consider the problem −div (a(|∇u|) ∇u) = g(x)
(6.57)
u| = 0 ∂Ω
on a bounded domain Ω ⊂ RN with some g ∈ L2 (Ω). Then (6.57) has a unique weak solution u∗ ∈ H 1 (Ω). Remark 6.5 The function a(r) can be redefined in terms of r2 , i.e. we can have a ˜(|∇u|2 ) = a(|∇u|) in the nonlinearity of (6.57). This turns condition (6.56) into a special case of (6.9) for a ˜. Then the corresponding generalized differential operator F : H01 (Ω) → H01 (Ω) is given by hF (u), viH01 =
Z
Ω
a ˜(|∇u|2 )∇u · ∇v
(u, v ∈ H01 (Ω)),
(6.58)
which is a special case of the operator (6.7) in Remark 6.1. Now we consider Neumann problems. Here the non-injectivity can be avoided by working in the space of functions with zero mean: {u ∈ H 1 (Ω) :
Z
Ω
u dx = 0 } .
(6.59)
Theorem 6.7 Let Ω ⊂ RN be a bounded domain with piecewise smooth boundary and let f satisfy the corresponding conditions in (ii)-(iii) of Theorem 6.5. Further, let Z g ∈ L2 (Ω) and assume that
Ω
g dx = 0. Then the problem
(
− div f (x, ∇u) = g(x) f (x, ∇u) · ν |∂Ω = 0
∗
1
has a unique weak solution u ∈ H (Ω) such that Z
Ω
∗
f (x, ∇u ) · ∇v =
Z
Ω
gv
Z
Ω
(6.60)
u∗ dx = 0. That is, u∗ satisfies
(v ∈ H 1 (Ω)).
The set of solutions is {u∗ + c : c ∈ R}. Further, if assumption to hold then there exists no weak solution.
(6.61) Z
Ω
g dx = 0 fails
Proof. The proof of Theorem 6.5 can be repeated in the space given in (6.59). R The necessity of the condition Ω g dx = 0 follows in the usual way by setting a constant test function in (6.61).
150
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
Now we consider 4th order Dirichlet problems. Of practical interest is the operator consisting of only the principal part. For convenience, we write the nonlinearity in matrix form, using the notations analogous to (1.37) and (1.39), further, we define the elementwise matrix product H · V :=
N X
(H, V ∈ RN ×N ).
Hik Vik
i,k=1
(6.62)
Theorem 6.8 Let the problem T (u) ≡ div2 A(x, D 2 u) = g(x) u |∂Ω =
∂u | = ∂ν ∂Ω
(6.63)
0
satisfy the following conditions:
(i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary. (ii) The matrix-valued function A : Ω × RN ×N → RN ×N is measurable and bounded w.r. to the variable x ∈ Ω and C 1 in the other variables. (iii) The Jacobian arrays ∂A(x, Θ) = ∂Θ
(
∂Ars (x, Θ) ∂Θik
)N
i,k,r,s=1
∈ R(N ×N )
2
are symmetric and their eigenvalues Λ satisfy 0 < µ1 ≤ Λ ≤ µ 2 < ∞
(6.64)
with constants µ2 ≥ µ1 > 0 independent of (x, Θ). (iv) g ∈ L2 (Ω). Then problem (6.63) has a unique weak solution u∗ ∈ H02 (Ω). That is, u∗ satisfies Z
Ω
A(x, D2 u∗ ) · D2 v =
Z
Ω
gv
(v ∈ H02 (Ω)).
(6.65)
Proof. Defining the natural correspondence between the notations ∂xi ∂xj u (i, j = αN 1, ..., N ) and ∂ α u = ∂1α1 ...∂N (|α| = α1 +...αN = 2) of second order partial derivatives, the entries Ai,j of the matrix-valued function A determine corresponding functions fα (|α| = 2). Then our theorem is a special case of Theorem 6.3 with a single equation (r = 1) and zero lower order terms (fα = 0 for |α| ≤ 1). Remark 6.6 (i) An important special case of (6.63) is an analogue of (6.56)–(6.57) in the 4th order case, namely, an operator of the form
˜ 2u , T (u) = div2 a(E(D2 u)) D
(6.66)
6.2. WEAK SOLUTIONS
151
using the notations E(D2 u) =
1 2 2 |D u| + (∆u)2 , 2
˜ 2 u = 1 D2 u + ∆u · I D 2
as in section 1.4, and letting the scalar-valued function a ∈ C 1 (R+ ) satisfy (6.9). (This kind of operator arises for the elasto-plastic plate equation, see (1.40).) Then the corresponding generalized differential operator F : H02 (Ω) → H02 (Ω) is given by hF (u), viH02
1Z = a(E(D2 u)) (D2 u · D2 v + ∆u ∆v) 2 Ω
(v ∈ H02 (Ω)),
(6.67)
(cf. (1.58)), which, similarly to the 2nd order analogue (6.58), is a special case of the operator (6.7) in Remark 6.1. (ii) The analogues of the well-posedness theorem can be formulated if one of the boundary conditions of (6.63) is replaced by a second order one corresponding to the operator T . First, under the boundary conditions u|∂Ω = A(x, D2 u)ν · ν|∂Ω = 0
(6.68)
the unique weak solution is provided by reproducing Theorem 6.8 in the space ˜ 2 (Ω) ≡ {u ∈ H 2 (Ω) : u|∂Ω = 0} . H The boundary conditions ∂u | = ∂ν ∂Ω
A(x, D2 u)ν · ν|∂Ω = 0
(6.69)
lead to a non-injective problem, where the kernel consists of constant functions similarly to second order Neumann problems. Then the analogue of Theorem 6.7 can be formulated in the space H 2 (Ω) with the same zero-mean integral conditions. Remark 6.7 (Potentials). As mentioned in Chapter 5, the solutions of the boundary value problems in this subsection are minimizers of suitable convex functionals. For a second order Dirichlet problem we have defined the latter in (5.12) in Remark 5.5. Now, as an illustration, we give such minimizing functionals φ for some other problems. (Obviously these are unique up to an additive constant.) For the 3rd type boundary value problem (6.52), the minimizing functional φ : 1 HD (Ω) → R is given by φ(u) =
Z
Ω
(ψ(x, ∇u) + Q(x, u)) +
Z
ΓN
S(x, u) dσ −
Z
Ω
gu −
Z
ΓN
γu dσ ,
(6.70)
where the C 1 functions ψ : Ω × RN → R, Q : Ω × R → R and S : ∂Ω × R → R satisfy ∂ψ (x, η) = f (x, η), ∂η
∂Q (x, ξ) = q(x, ξ) and ∂ξ
∂S (x, ξ) = s(x, ξ), ∂ξ
respectively, and the existence of ψ is ensured by the assumed symmetry and continuity of ∂f /∂η.
152
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
For the special case (6.57) let us consider the function a(r) as redefined in terms of r , i.e. we have a ˜(|∇u|2 ) in the equation as in (6.58). Then the minimizing functional 1 φ : H0 (Ω) → R is given by Z 1Z 2 φ(u) = A(|∇u| ) − gu , 2 2
Ω
1
Ω
+
where the C function A : R → R satisfies
A′ (r) = a ˜(r).
The analogue of the above functional in the 4th order case can be given for problem (6.63) containing the operator of the form (6.66). Then φ : H02 (Ω) → R is given by Z 1Z 2 φ(u) = A(E(D u)) − gu , 2 Ω
1
Ω
+
where A ∈ C (R ) is the primitive of a as above. More generally, the minimizing functional for problems with the operator (6.7) in Remark 6.1 is given by Z 1Z φ(u) = A([u, u]) − gu . 2 Ω
Ω
Finally, as a semilinear special case of (6.52) let us consider the radiative cooling problem (1.44). Then the minimizing functional φ : H 1 (Ω) → R contains quadratic terms corresponding to the linear parts, i.e. it is given by Z Z 1Z 1 1 2 5 2 κ(x) |∇u| + σ(x)|u| + φ(u) = α(x)u dσ − α˜ uu dσ . 2 5 2 Ω
∂Ω
∂Ω
(We note that the equation corresponding to this functional in fact contains the nonlinear term σ(x)|u|3 u instead of σ(x)u4 , since φ has to be convex and for the required nonnegative solution u > 0 of (1.44) the two nonlinearities are the same. See more on this change in Proposition 6.4.) Remark 6.8 The left sides of the equations that define the weak solution u∗ in Theorems 6.5–6.8 can be written (using the divergence theorem) as Z
Ω
if u∗ is regular, i.e. if it satisfies in Theorems 6.5–6.7 or
T (u∗ )v
u∗ ∈ H 2 (Ω)
(6.71)
u∗ ∈ H 4 (Ω)
(6.72)
in Theorem 6.8. This implies that in this regular case the function u∗ is the strong solution of the original boundary value problem: T (u∗ ) = g. On the other hand, in general we cannot ensure the regularity (6.71)–(6.72) for u∗ , in which case it is not a strong solution of the original problem. In this case the notion of weak solution gives the only way in which an existence result can be obtained. (Cf. Remark 6.3.)
6.2. WEAK SOLUTIONS
6.2.3
153
Non-potential problems
The following two results are a brief excursion to problems where, in contrast to the main part of our investigations, the generalized differential operator has no potential. Hence the existence results rely on a different theory which uses subtler monotonicity conditions and was sketched at the end of section 5.1. We note that, as mentioned in Remark 5.1, the nonexistence of a potential is equivalent to the lack of symmetry of the derivatives. The latter is caused by the presence of u in the nonlinearity of the principal part, in contrast to a problem like (6.52). The explicit form of the nonsymmetric derivative operator arising for (6.75) is studied in subsection 7.4.4, see e.g. (7.215)–(7.218). The first theorem concerns a nonlinear diffusion type equation. Theorem 6.9 Let Ω ⊂ RN be a bounded domain and a : Ω × R → R be measurable w.r. to x and continuous in s, satisfying 0 < µ1 ≤ a(x, s) ≤ µ2
(x ∈ Ω, s ∈ R)
(6.73)
with constants µ1 , µ2 independent of (x, s). Then for any g ∈ L2 (Ω), the problem
− div (a(x, u) ∇u) = g(x)
in Ω
u|∂Ω = 0
(6.74)
has a unique weak solution u∗ ∈ H01 (Ω). The existence part of this theorem is proved e.g. in Francu [117]. There the conditions of Theorem 5.3 are verified for the generalized differential operator hF (u), vi ≡
Z
Ω
a(x, u)∇u · ∇v
(v ∈ H01 (Ω)),
(6.75)
using Proposition 5.1 to check pseudomonotonicity. The continuity and boundedness assumptions are proved using estimates from the theory of Nemicky operators, for which the reader is referred to Fuˇc´ık–Kufner [120]. The monotonicity in principal part in condition (b) of Proposition 5.1 follows from (6.73): Z
Ω
(a(x, u)∇u − a(x, u)∇v) · (∇u − ∇v) ≥ µ1
and then setting v = 0 yields coercivity: 1 Z a(x, u)∇u · ∇u ≥ µ1 kukH01 → ∞ kukH01 Ω
Z
Ω
|∇u − ∇v|2 > 0,
as kukH01 → ∞
(where the norm kuk2H 1 = |∇u|2 is used). 0
R
Ω
The uniqueness of the solution is proved in Douglas–Dupont–Serrin [93] using the maximum principle. We note that the existence proof of Theorem 6.9 is also valid for mixed boundary conditions. This also follows from the theorem below for systems of more general form. The crucial structural condition is again monotonicity in principal part.
154
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
Theorem 6.10 Let the system − div fi (x, u, ∇u) + f0,i (x, u, ∇u) = gi (x)
in Ω fi (x, u, ∇u) · ν = γi (x) on ΓN
ui = 0
on ΓD
(i = 1, .., r)
(6.76)
with the unknown function u = (u1 , .., ur ) ∈ H 1 (Ω)r satisfy the following conditions: (i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary; ΓN , ΓD ⊂ ∂Ω are measurable, ΓN ∩ ΓD = ∅, ΓN ∪ ΓD = ∂Ω and ΓD 6= ∅. (ii) The functions fi : Ω × R(N +1)r → RN (i = 0, .., r) are measurable w.r. to the variable x ∈ Ω and continuous in the other variables. We use the notations fi = (fi,1 , ..., fi,N ) (i = 0, .., r) and f = (f1,1 , ..., f1,N , f2,1 , ..., f2,N , ..., fr,1 , ..., fr,N ) : Ω × R(N +1)r → RN r . (iii) There exist constants c1 , c2 > 0 such that |fi (x, ξ, η)| ≤ c1 + c2 (|ξ| + |η|)
(i = 0, ..., r, (x, ξ, η) ∈ Ω × RN × RN r ).
(iv) There exist constants c3 , c4 > 0 such that f0 (x, ξ, η) · ξ + f (x, ξ, η) · η ≥ c3 |η|2 − c4 |ξ| (v) (f (x, ξ, η) − f (x, ξ, η˜)) · (η − η˜) > 0
((x, ξ, η) ∈ Ω × RN × RN r ).
(x ∈ Ω, ξ ∈ RN , η, η˜ ∈ RN r ).
Then for any gi ∈ L2 (Ω) and γi ∈ L2 (ΓN ) (i = 1, .., r), problem (6.76) has a weak solution u∗ ∈ H 1 (Ω)r . The proof is analogous to that of Theorem 6.9, see Posp´ıˇsek [246]. Now condition (iv) implies coercivity and condition (v) ensures monotonicity in principal part. We note that the solution is is generally not unique. As a special case, the above theorem holds for coefficients fi (x, u, ∇u) = ai (x, u)∇ui where each ai is as in Theorem 6.9. In particular, the existence of the solution of the system of semiconductor equations (1.52) can be derived from Theorem 6.10 using a suitable truncation. The proof is given in Kˇriˇzek–Neittaanm¨aki [188], Posp´ıˇsek [246] together with the discussion of uniqueness.
6.3
Qualitative properties
6.3.1
Regularity of the solution
In this subsection we quote two regularity results that are relevant for our treatment. First, for Dirichlet problems the following theorem provides both classical and Sobolev regularity under fairly general conditions.
6.3. QUALITATIVE PROPERTIES
155
Theorem 6.11 (Miersemann [213]). Let Ω ⊂ R2 be a bounded domain with piecewise C 2 boundary, locally convex at the corners. Let f ∈ C 1 (R2 , R2 ) such that its Jacobians ∂f (η)/∂η are symmetric and have eigenvalues between two positive constants independent of η. Further, let g ∈ L2 (Ω). Then the weak solution u∗ of the problem
−div f (∇u) = g(x)
in Ω
u|∂Ω = 0
(6.77)
satisfies u∗ ∈ C 1,α (Ω) ∩ H 2 (Ω) with some 0 < α < 1. The same C 1,α -regularity result holds if the Dirichlet boundary conditions are replaced by Neumann or third type: Theorem 6.12 (Lieberman [195]). Let the conditions of Theorem 6.11 hold, and consider equation (6.77) with either (i) Neumann boundary conditions f (∇u) · ν |∂Ω = 0
or
(ii) third boundary conditions (f (∇u) · ν + s(x, u)) |∂Ω = 0 with some s ∈ C 1 (∂Ω × R) for which ∂s/∂x and ∂s/∂u are bounded. Then the weak solutions satisfy u∗ ∈ C 1,α (Ω) with some 0 < α < 1.
Remark 6.9 (a) The local convexity at the corners means equivalently that the corners are so-called angular, i.e. the angles at the corners are less than π. Obviously, as special case these theorems include C 2 domains. (b) Theorem 6.11 can be generalized to RN for domains with conical vertices defined analogously to angular corners (Miersemann [214]), and Theorem 6.12 holds as well in RN for locally convex corners [195]. (c) For Neumann and mixed problems on piecewise smooth domains, the less strong Sobolev regularity u∗ ∈ H r (Ω) holds for all r < 23 , see Ebmeyer [98]. We note that for some special problems with smooth data one may even have classical C 2 -regularity, for instance, for some homogeneous or semilinear problems. For this and the detailed study of regularity the reader is referred to Gilbarg–Trudinger [132].
156
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
6.3.2
Positivity of the solution
When real-life phenomena are modeled by boundary value problems, often the sign of the solution is prescribed in addition to the equation. This is due to the physical meaning and is typically u ≥ 0 (e.g. concentration, temperature). We quote some results that establish the sign of solutions, relying on maximum principles. First we consider a semilinear boundary value problem −div (c(x) ∇u) + q(x, u) = 0 a(x) ∂u + b(x)u| = γ(x) ∂Ω ∂ν
(6.78)
on a bounded smooth domain with the following conditions: (i) c ∈ C 1 (Ω), q ∈ C 1 (Ω × R), a, b, γ ∈ C 1 (∂Ω); (ii) there exists a constant m > 0 such that c(x) ≥ m (x ∈ Ω); (iii) a and b are nonnegative, b(x) 6≡ 0, a(x) + b(x) > 0 (x ∈ ∂Ω). Theorem 6.13 Let the conditions (i)-(iii) hold and assume that u∗ ∈ C 2 (Ω) ∩ C 1 (Ω)
(6.79)
is a solution of (6.78). (1) If q ≤ 0 and γ ≥ 0, then u∗ ≥ 0. (2) If q ≥ 0 and γ ≤ 0, then u∗ ≤ 0. (3) For Dirichlet problems (i.e. a ≡ 0, b ≡ 1) the results (1)-(2) hold for any weak solution without the smoothness assumption (6.79), further, one can allow c ∈ L∞ (Ω), γ ∈ L2 (∂Ω) and q to be C 1 w.r. to u only. Then the sign of u∗ is understood a.e. (4) If in (1) or (2) the inequalities for q and γ are strict, then so is for u∗ . (5) In (1) the inequality q ≤ 0 can be replaced by the weaker one q(x, 0) ≤ 0 (x ∈ Ω), moreover, the existence of a classical solution u∗ ∈ C 2 (Ω) with u∗ ≥ 0 is also ensured provided that ∂Ω ∈ C 2,α and the coefficients in assumption (i) are C 1,α for some α > 0. Proof. Assertions (1)-(4) follow from the standard maximum principles (see Gilbarg–Trudinger [132], Protter–Weinberger [249], Struwe [270]). Assertion (5) is a consequence of a theorem of Keller [175]. We note that the theorem also holds if c(x) is replaced by a uniformly positive definite matrix C(x).
6.4. EXAMPLES
157
From Theorem 6.13 one can easily derive the nonnegativity of the solution for operators with scalar nonlinearity in the principal part. Namely, let us consider Dirichlet problems of the form (6.57): −div (a(|∇u|) ∇u) = g(x)
(6.80)
u| = 0 ∂Ω
on a bounded domain Ω ⊂ R2 with piecewise C 2 boundary with conical vertices as in Remark 6.9 (b), where the scalar-valued function a ∈ C 1 (R+ ) satisfies (6.56): 0 < µ1 ≤ a(r) ≤ (r a(r)) ′ ≤ µ2
(r > 0)
with constants µ2 ≥ µ1 > 0 independent of r, further, we assume g ≥ 0. By Corollary 6.1 the weak solution u∗ ∈ H01 (Ω) of (6.80) exists and is unique. Further, Theorem 6.11 implies ∇u∗ ∈ C α (Ω) with some 0 < α < 1. Then the functions c(x) = a(|∇u∗ (x)|), q(x, s) = −g(x) and γ(x) ≡ 0 are as required in Theorem 6.13, using assertions (1) and (3). Hence we obtain Corollary 6.2 The solution u∗ ∈ C 1,α (Ω) ∩ H 2 (Ω) of (6.80) satisfies u∗ ≥ 0.
6.4
Examples
The scope of the preceding existence and uniqueness theorems is illustrated by considering the model problems from Chapter 1. Proposition 6.1 Let Ω ⊂ R2 be a bounded domain with piecewise C 2 boundary, such that the angles at the corners are less than π. Then (1) the electromagnetic potential equation (1.24) in the device Ω has a unique weak solution u∗ ∈ H01 (Ω), i.e. Z
Ω
b(x, |∇u∗ |)∇u∗ · ∇v =
Z
Ω
ρv
(v ∈ H01 (Ω)).
(2) The elasto-plastic torsion problem (1.17) for the cross-section Ω has a unique weak solution u∗ ∈ H01 (Ω), i.e. Z
Ω
∗
∗
g(|∇u |)∇u · ∇v = 2ω
Z
Ω
v
(v ∈ H01 (Ω)),
for sufficiently small torsion per unit ω on the right-hand side. (For other ω crack occurs, i.e. the cross-section cannot be entirely in elasto-plastic state.) Further, u∗ also satisfies the regularity u∗ ∈ C 1,α (Ω) ∩ H 2 (Ω) with some 0 < α < 1. Proof. (1) Problem (1.24) is a special case of (6.53), hence the statement follows from Theorem 6.6. (2) The function g ∈ C 1 [0, T∗ ] can be extended to [0, ∞) such that it remains C 1 and the inequality (1.13) remains true on [0, ∞) with some µ ˜2 instead of µ2 . Let g˜
158
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
denote this extension of g. Replacing g by g˜ in (1.17), we obtain a special case of problem (6.57), for which Corollary 6.1 yields the unique weak solution u∗ . From Theorem 6.11 this solution satisfies u∗ ∈ C 1,α (Ω), hence |∇u∗ | is bounded, and the corresponding norm estimate [132] for the constant right-hand side ω implies that maxΩ |∇u∗ | ≤ const. · ω. That is, for sufficiently small ω we obtain max |∇u∗ | ≤ T∗ ,
(6.81)
Ω
which means that g˜(|∇u∗ |) = g(|∇u∗ |), i.e. u∗ is the solution of the original problem (1.17) with the nonlinearity g. Remark 6.10 The solution of the equivalent formulation of the elasto-plastic torsion problem (1.17), given in the Neumann problem (1.23) for the distortion w, is unique up to an additive constant. This follows from Theorem 6.7. Proposition 6.2 The nonlinear elasticity system (1.34) has a unique weak solution 1 u∗ = (u∗1 , u∗2 , u∗3 ) ∈ HD (Ω)3 , i.e. Z
Ω
∗
T (x, ε(u )) · ε(v) =
Z
Ω
ϕ·v+
Z
ΓN
τ · v dσ
1 (v ∈ HD (Ω)3 )
or, with the representation (1.32) for T (x, ε(u∗ )), Z Ω
3k(x, |vol ε(u∗ )|2 ) vol ε(u∗ ) · vol ε(v) + 2µ(x, |dev ε(u∗ )|2 ) dev ε(u∗ ) · dev ε(v) dx =
Z
Ω
ϕ · v dx +
Z
ΓN
1 (v ∈ HD (Ω)3 )
τ · v dσ
where ϕ = (ϕ1 , ϕ2 , ϕ3 ) ∈ L2 (Ω)3 and τ = (τ1 , τ2 , τ3 ) ∈ L2 (ΓN )3 . Proof. The generalized differential operator corresponding to (1.34) is hF (u), vi
1 (Ω)3 HD
=
Z X 3
Ω i=1
Ti (x, ε(u)) · ∇vi ≡
Z
Ω
T (x, ε(u)) · ∇v
1 (u, v ∈ HD (Ω)3 ),
(6.82)
which is of the form (6.20) with q ≡ 0, s ≡ 0 and f (x, ∇u) = T (x, ε(u))
(6.83)
if the rows T1 , T2 , T3 of the matrix T are written in a row vector (T1 , T2 , T3 ). The function T (x, ε(u)) is defined via (1.32). To obtain the corresponding integrand in (6.82), we apply Proposition 1.1 and (1.56), by which vol ε(u) · ∇v = vol ε(u) · ε(v) = vol ε(u) · vol ε(v), dev ε(u) · ∇v = dev ε(u) · ε(v) = dev ε(u) · dev ε(v).
Substituting this with (1.32) into (6.82), we obtain hF (u), viHD1 (Ω)3 =
Z
Ω
T (x, ε(u)) · ε(v) =
6.4. EXAMPLES Z
Ω
159
(3k(x, |vol ε(u)|2 ) vol ε(u) · vol ε(v) + 2µ(x, |dev ε(u)|2 ) dev ε(u) · dev ε(v)) dx (6.84)
1 for all u, v ∈ HD (Ω)3 . For brevity, instead of checking the conditions of Theorem 6.4, we use Remark 6.2 to check first that Theorem 6.3 holds for F . Namely, the obtained form (6.84) is a special case of (6.39) with
a(x, s) = 3k(x, s), [u, v] = vol ε(u) · vol ε(v),
b(x, s) = 2µ(x, s), {u, v} = dev ε(u) · dev ε(v).
Here the required inequalities (6.9) follow from assumption (1.33) with λ = min{µ0 , δ0 } 1 and Λ = max{k0 , δ˜0 }, further, for any h ∈ HD (Ω)3 the considerations of Proposition 1.1 imply that Z
Ω
([h, h] + {h, h}) =
Z Ω
2
2
|vol ε(h)| + |dev ε(h)|
=
Z
Ω
|ε(h)|2 .
Hence, using the notation (6.83), by (6.41) λ
Z
Ω
2
|ε(h)| ≤
Z
Ω
Z ∂f (x, ∇u) ∇h · ∇h ≤ Λ |ε(h)|2 . ∂η Ω
(6.85)
Further, there holds the inequality κ
Z X 3
Ω i=1
|∇hi |2 ≤
Z
Ω
|ε(h)|2 ≤ K
Z X 3
Ω i=1
|∇hi |2
(6.86)
with a suitable constant κ > 0 (ensured by Korn’s inequality using ΓD 6= ∅, see Hlav´aˇcek–Neˇcas [152]) and with e.g. K = 1 (obtained trivially even for the integrands, regardless the boundary condition). Hence Z X 3
Z X 3 ∂f |∇hi | ≤ =m |∇hi |2 = M khk2H 1 (Ω)3 (x, ∇u) ∇h·∇h ≤ M D Ω ∂η Ω i=1 Ω i=1 (6.87) with m = κλ and M = KΛ. That is, Theorem 6.2 holds for the operator (6.82), and 1 hence Theorem 6.4 provides the unique weak solution u∗ ∈ HD (Ω)3 .
mkhk2H 1 (Ω)3 D
2
Z
Proposition 6.3 The elasto-plastic bending problem of a clamped plate (1.40) has a unique weak solution u∗ ∈ H02 (Ω), i.e. Z 1Z 2 ∗ 2 ∗ 2 ∗ αv g(E(D u )) (D u · D v + ∆u ∆v) = 2 Ω Ω
(v ∈ H02 (Ω)).
(6.88)
Proof. For any matrices B, C ∈ R2×2 let us introduce the following notations: ˜ = 1 (B + trB · I) B 2
(6.89)
160
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
where I ∈ R2×2 is the identity matrix, {B, C} =
1 (B · C + trB trC) 2
(6.90)
where the elementwise matrix product · is defined as in (6.62), and E(C) = {C, C}. Then (1.40) is a special case of (6.63) with ˜ A(x, Θ) = g(E(Θ)) Θ
(x ∈ Ω, Θ ∈ R2×2 ).
(6.91)
For brevity, instead of checking the conditions of Theorem 6.8, we verify directly via Remark 6.1 that Theorem 6.3 holds for the generalized differential operator hF (u), viH02 =
Z
A(x, D2 u)·D2 v =
Ω
Z
˜ 2 u·D2 v g(E(D2 u)) D
Ω
(u, v ∈ H02 (Ω)) (6.92)
which now corresponds to (6.65). Namely, using the weak form (1.58) and the notation (6.90), we have hF (u), viH02 =
Z
Ω
1Z = g(E(D2 u)) (D2 u · D2 v + ∆u ∆v) 2 Ω
g({D2 u, D2 u}) {D2 u, D2 v}
(u, v ∈ H02 (Ω)).
(6.93)
The obtained form of F is a special case of (6.7) with a(r) = g(r) and [u, v] = {D2 u, D2 v} , where the required inequalities (6.9) follow from assumption (1.38), hence Remark 6.1 yields that Theorem 6.1 with r = 1 and n = 2 holds for the operator (6.93). Then Theorem 6.3 provides the unique weak solution u∗ ∈ H02 (Ω), and now (6.65) coincides with (6.88) owing to (6.92)–(6.93). The same well-posedness result holds for a freely supported plate, i.e. if in (1.40) the boundary condition (∂u/∂ν)|∂Ω = 0 is replaced by the second order one ˜ 2 u) ν · ν |∂Ω = 0 (D
(6.94)
given in (1.41). This follows from part (ii) of Remark 6.6 and the observation that for (6.91) the condition A(x, D2 u)ν · ν|∂Ω = 0 is now equivalent to (6.94): namely, ˜ 2 u)ν · ν = 0 on ∂Ω A(x, D2 u)ν · ν = g(E(D2 u)) (D if and only if (6.94) holds, owing to assumption g > 0 in (1.38).
6.4. EXAMPLES
161
We note that by (6.89) the the boundary condition (6.94) can be written as ∂2u ∂ν 2
+ ∆u|∂Ω = 0 .
The well-posedness of the various semilinear problems in section 1.5 is proved in the following theorem in a much common way, using the results of Theorem 6.13 on the sign of the solution. In addition, the solution is now classical in C 2 -regularity sense. Proposition 6.4 Let ∂Ω ∈ C 2,α . The semilinear problems in section 1.5 have a unique classical solution u∗ ∈ C 2 (Ω). Proof. (a) For the diffusion-kinetic enzyme problem (1.42), following [175], let us extend the function r in (1.43) to (−∞, 0] to remain C 1,α and increasing on R, e.g. by r(ξ) = ξ/(εk) (ξ < 0). The corresponding boundary value problem is a special case of (6.52) with ΓD = ∅, f (x, η) = d(x)η, q(x, ξ) = r(ξ), s(x, ξ) = h(x)ξ and γ(x) = h(x)u0 (x). Hence Theorem 6.5 yields the unique weak solution u∗ for the problem with extended r. Then we can apply statement (5) of Theorem 6.13 (with coefficients c = a = d, b = h and q, γ as above) since q(x, 0) = r(0) = 0 and the coefficients are C 1,α . Hence we obtain that u∗ ∈ C 2 (Ω) and u∗ ≥ 0, i.e. u∗ is the solution of the original problem (1.42) with the nonlinearity (1.43). (b) For the radiative cooling problem (1.44) the proof is the same as above since the problem only differs in the nonlinearity σ(x)u4 from (1.42). Now we replace q(x, ξ) = σ(x)ξ 4 by σ(x)|ξ|3 ξ to have an extension for ξ < 0 that remains C 1,α and increasing w.r. to ξ. (c) For the autocatalytic chemical reaction (1.45) we similarly replace up by |u|p−1 u and apply Theorem 6.5 to have the weak solution u∗ ∈ H01 (Ω). Then we can use statement (5) of Theorem 6.13 with coefficients c ≡ 1, a ≡ 0, b = γ ≡ 1 and q(x, ξ) = |ξ|p−1 ξ to obtain that u∗ ≥ 0, i.e. u∗ is the weak solution of the original problem (1.45). (d) For the electrostatic potential equation (1.46) we define q(x, ξ) as eξ for ξ < 0 and 1 + ξ for ξ > 0 and obtain from Theorem 6.5 the weak solution u∗ ∈ H01 (Ω). Then statement (2) of Theorem 6.13 yields that u∗ ≤ 0, hence u∗ is the weak solution of the original problem (1.46). The regularity result u∗ ∈ C 2 (Ω) for the Dirichlet problems (1.45) and (1.46) follows from [270]. The well-posedness of the two flow problems in subsection 1.6.1 is verified similarly to Proposition 6.1 since both equations contain an operator of the form −div (a(|∇u|) ∇u), just as equation (1.17). The subsonic flow condition maxΩ |∇u| < 1 for problem (1.47) is analogous to (6.81) for the torsion problem, hence similarly the subsonic solutions are only obtained for sufficiently small boundary data γ, v∞ . The uniqueness for the Neumann problem (1.49) holds up to an additive constant, similarly as in Remark 6.10. In the case of the dielectric fluid equation (1.50) the boundedness of |∇u| can be
162
CHAPTER 6. SOLVABILITY OF NONLINEAR ELLIPTIC PROBLEMS
derived from a theorem in Gilbarg–Trudinger [132] under the condition φ0 ∈ C 2,α (∂Ω), which also ensures the existence of the classical solution u∗ ∈ C 2,α (Ω). The question of existence and uniqueness for the non-potential equations of section 1.6.2 has been discussed in subsection 6.2.3. Namely, the well-posedness result for the heat conduction equation (1.51) is directly formulated in Theorem 6.9. Further, the system of semiconductor equations (1.52) has a weak solution but this is generally not unique. (The proof, based on Theorem 6.10 and a suitable truncation, is given in Kˇriˇzek–Neittaanm¨aki [188], Posp´ıˇsek [246] together with the discussion of uniqueness.)
Part III Iterative solution of nonlinear elliptic boundary value problems
163
Chapter 7 Iterative methods in Sobolev space This chapter is devoted to iterative methods that construct sequences of functions on the continuous level, converging to the exact solution of a nonlinear elliptic boundary value problem. As mentioned in the Introduction, the aim of such a theoretical iteration concerning a discretized elliptic problem is to provide a background sequence whose projection to the discretization subspace is a convenient preconditioned iterative sequence for the discretized problem. In fact, the later chapters will show that the derived iterative sequences have favourable properties concerning both their construction and convergence. Namely, roughly speaking, the FEM realization of such a theoretical iteration simply means that we use the same formulas just replacing the Sobolev space by the considered FEM subspace, further, the convergence factors of the theoretical sequence yield analytic bounds for the derived sequence in the discretized case. The Sobolev space background has proved fruitful for a long time in the study of iterative methods, as will also turn out in many cited results later. General results are summarized e.g. in the monographs of Gajewski–Gr¨oger–Zacharias [127], Glowinski [133], Hutson–Pym [154], Kantorovich–Akilov [165], Kluge [180], Langenbach [190], Neuberger [231]. It is useful to relate the content of this chapter to Figure 3, given in the Introduction, which has illustrated the idea of preconditioning operators for one-step iterations. Namely, the Sobolev space iterations to be discussed here are at the upper level of the diagram of Figure 3. A one-step iterative method in Sobolev space defines a sequence of the form un+1 = un − αn Sn−1 (T (un ) − g), (7.1) where αn > 0 is a suitably chosen stepsize. (We note that in Figure 3 the stepsizes αn were for simplicity incorporated into the operator Sn , but now and in the sequel it will be more convenient to consider them separately. Further, for the terms of the iterative sequence we simply write the lower index n instead of the upper index (n), which was used in Figure 3 to avoid double subscripts when indexing with the discretization parameter h.) The formula (7.1) gives a fairly general form of one-step iterative methods (see e.g. Ortega–Rheinboldt [238]). Most of the discussed methods in this chapter are of the form (7.1), including simple (fixed point) iterations and Newton-like methods that are in the focus of our investigations. Moreover, (7.1) implies that preconditioning yields a 165
166
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
general framework for the study of one-step iterative methods for nonlinear problems. Namely, preconditioning can be generally understood as a suitable choice of linear operators such that the iteration consists of corresponding auxiliary linear problems, and in this context it is favourable to allow these operators vary stepwise as in (7.1). Convergence results in terms of condition numbers also fit in this framework, as has been sketched in the introduction of section 5.3. Accordingly, this chapter is built up as follows. We focus on simple and Newton-like iterations, discussed in sections 7.1 and 7.2, respectively. (The term ’simple iteration’ means that in (7.1) we set a fixed preconditioning operator Sn ≡ S throughout the iteration.) The detailed presentation of simple iterations in section 7.1 serves the sake of understanding since for simple iterations, preconditioning operators in Sobolev space can be presented more directly. This is because the background of simple iterations is connected more directly with the theoretical properties, and hence we will apply a similar kind of simple iteration for various types of boundary value problems. Section 7.2 is devoted to Newton-like methods. We start with the general damped Newton method, then two kinds of inexact Newton method are discussed. The first version defines an inner-outer iteration (an idea introduced in this form in Bank–Rose [35]), i.e. the linearized equations in the steps of Newton’s method as outer iteration are solved approximately by inner iterations. This widespread combined approach is beyond the scope of (7.1) which only includes the outer iteration here. In contrast, the second version realizes an inexact or quasi-Newton method in the variable preconditioning framework of (7.1) such that the linear operators in the auxiliary problems are approximations of the derivative operators F ′ (un ). These two versions are detailed for second order mixed problems, whereas for other boundary value problems we only include fourth order Dirichlet problems and leave the other cases to the reader. We note that both in sections 7.1 and 7.2 we consider general preconditioning operators, which in the case of 2nd order Dirichlet problems are typically given as Sn z ≡ −div (Gn (x)∇z) or in weak form hBn z, vi ≡
Z
Ω
Gn (x) ∇z · ∇v
(v ∈ H01 (Ω)),
where Gn ∈ L∞ (Ω, RN ×N ) is a symmetric positive-definite matrix-valued function. (For convenience the above notational distinction between the weak and strong forms of linear operators will be used throughout this part the book.) That is, the coefficient matrix Gn is not specified in the investigations of these two sections, instead, this step is left to Chapter 8 where various possible choices of Gn will be discussed already in the setting of the discretized problems. Section 7.3 gives a brief summary on preconditioning as a common framework for simple and Newton-like methods via the Sobolev gradient idea. The latter means roughly speaking that (7.1) describes descent methods w.r. to fixed or variable inner products. In section 7.4 some other iterative methods than the above two types are discussed in Sobolev space setting. This includes the famous nonlinear conjugate gradient method and frozen coefficient iterations, and also gives two particular preconditioning methods relying on more recent investigations; finally, some generalizations to multistep methods are mentioned.
7.1. SIMPLE ITERATIONS
167
As referred to above, a convenient class of problems where generality does not obscure the exposition of ideas is of the form T (u) ≡ −div f (x, ∇u) = g(x)
in Ω
(7.2)
(that is, it contains a second order operator consisting only of principal part) with Dirichlet or mixed boundary conditions. The presentation of simple iterations is started with this equation with Dirichlet boundary conditions in detail in section 7.1, and problems of more general form are then studied in an analogous way. Newton-like methods are considered in section 7.2 similarly, first developed for equation (7.2) with mixed boundary conditions and then sketched for more general problems. Moreover, since for these other problems the technical tools are the same as with simple iterations, the two realizations of Newton’s method in subsections 7.2.2 and 7.2.3 are entirely left to the reader except for fourth order Dirichlet problems. (We note that the equation (7.2) with mixed boundary conditions will also be convenient later in Chapter 9 to present algorithmic realization and coupling with FEM for the discussed iterative methods. Finally we also mention that the considered boundary conditions are local, but nonlocal boundary conditions might also be treated like in [171].) For all considered problems in this chapter the existence and uniqueness of the weak solution is ensured by the results of Chapter 6 and is hence regarded as known. For convenience of reading, the theorems of sections 7.1 and 7.2 are given in the following structure. The assumptions on the boundary value problem, the construction of the preconditioning operator and the convergence theorem are given in three distinct parts, such that the first two are marked with the same numbers as the convergence theorem.
7.1
Simple iterations
In this section we investigate simple iterations in Sobolev space, that is, we set a fixed preconditioning operator Sn ≡ S throughout the iteration. In the first subsection we consider second order Dirichlet problems. First the nonlinear operator consists only of principal part, since this simplicity of the problem helps further the clearer exposition of ideas. Then a lower order term is also allowed. Nonlinear operators of more general form are studied in the second subsection, the discussion is analogous to that of second order Dirichlet problems. It is worth mentioning as a typical example the Laplacian preconditioning operator S = −∆, considered in such iterations already in many early papers under various conditions (Koshelev [184, 185], Nashed [220], Petryshyn [241]). For other more recent applications see Mahavier [201] and Neuberger [227, 228], for general discussions Gajewski–Gr¨oger–Zacharias [127], Langenbach [190] and Neuberger [231], and for the authors’ related results [20, 21, 167, 168] and [172]–[174].
168
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
7.1.1
Second order Dirichlet problems
Let us consider problems of the form (
T (u) ≡ −div f (x, ∇u) = g(x) u|∂Ω = 0
(7.3)
on a bounded domain Ω ⊂ RN , such that the following assumptions are satisfied: Assumptions 7.1. (i) The function f : Ω × RN → RN is measurable and bounded w.r. to the variable (x,η) x ∈ Ω and C 1 w.r. to the variable η ∈ RN , further, its Jacobians ∂f∂η are symmetric and their eigenvalues λ satisfy 0 < µ1 ≤ λ ≤ µ 2 < ∞ with constants µ2 ≥ µ1 > 0 independent of (x, η). (ii) g ∈ L2 (Ω). Construction 7.1. Let G ∈ L∞ (Ω, RN ×N ) be a symmetric matrix-valued function for which there exist constants M ≥ m > 0 such that mhG(x)ξ, ξi ≤ h
∂f (x, η) ξ, ξi ≤ M hG(x)ξ, ξi ∂η
((x, η) ∈ Ω × RN , ξ ∈ RN ).
(7.4)
We introduce the linear operator Su ≡ −div (G(x)∇u)
(u ∈ H 2 (Ω) ∩ H01 (Ω) with G(x)∇u ∈ H 1 (Ω)),
(7.5)
and the corresponding energy space H01 (Ω) with the inner product hu, viG :=
Z
Ω
G(x) ∇u · ∇v
(7.6)
(equivalent to the usual one). Using this inner product in H01 (Ω), the generalized differential operator F : H01 (Ω) → H01 (Ω), corresponding to T , is given by the equality hF (u), viG =
Z
Ω
f (x, ∇u) · ∇v
(u, v ∈ H01 (Ω)),
(7.7)
and similarly, the weak form of the right-hand side g is defined by the element b ∈ H01 (Ω) for which Z hb, viG = gv (v ∈ H01 (Ω)). (7.8) Ω
(These weak forms follow the usage in Chapter 6, and some further explanation to clarify their meaning will be given in Remark 7.2. See also in the Appendix, paragraph (c) of A2.)
7.1. SIMPLE ITERATIONS
169
Theorem 7.1 Let Assumptions 7.1 be satisfied. Then Construction 7.1 yields
cond S −1 T ≤
M m
(7.9)
and the following corresponding convergence results: (1) for any u0 ∈ H01 (Ω) the sequence (un ) ⊂ H01 (Ω) defined by un+1 = un − where
Z
Ω
G(x) ∇zn · ∇v =
Z
Ω
2 zn , M +m
f (x, ∇un ) · ∇v −
Z
Ω
gv
(v ∈ H01 (Ω)),
(7.10)
converges linearly to the unique weak solution u∗ of (7.3), namely, kun − u∗ kG ≤
1 M −m kF (u0 ) − bkG m M +m
n
(n ∈ N) ,
(7.11)
where F and b are defined in (7.7)–(7.8). (2) Further, if G ∈ C 1 (Ω, RN ×N ) and Ω is C 2 -diffeomorphic to a convex domain, then for any u0 ∈ H 2 (Ω) ∩ H01 (Ω) the auxiliary problems throughout the iteration (7.10) take the strong form (
Szn = T (un ) − g zn|∂Ω = 0
(7.12)
with zn ∈ H 2 (Ω), and (7.11) can be replaced by kun − u∗ kG ≤
1 M −m kT (u0 ) − gkL2 (Ω) 1/2 m̺ M +m
n
(n ∈ N) ,
(7.13)
where ̺ > 0 is the smallest eigenvalue of S on H 2 (Ω) ∩ H01 (Ω). Proof. As a special case of Theorem 6.5, problem (7.3) has a unique weak solution u ∈ H01 (Ω). Let T be the operator in (7.3) with domain ∗
D(T ) = H 2 (Ω) ∩ H01 (Ω)
(7.14)
in the real Hilbert space L2 (Ω). We verify that T and S satisfy the assumptions of part (i) of Theorem 5.14. Therefore we check the conditions of Theorem 5.13, with (5.72) replaced by (5.88). Inequality (7.4) implies that the eigenvalues of the matrices G(x) have a uniform (x,η) . Hence by Proposition 3.3 the positive lower bound similarly to the Jacobians ∂f∂η operator S in (7.5) is a symmetric linear operator in L2 (Ω) with some positive lower bound ̺ > 0. The divergence theorem yields Z
Ω
T (u)v =
Z
Ω
f (x, ∇u) · ∇v
(u, v ∈ H 2 (Ω) ∩ H01 (Ω))
(7.15)
170
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
and condition (7.4) implies m G(x)(∇v − ∇u) · (∇v − ∇u) ≤ (f (x, ∇v) − f (x, ∇u)) · (∇v − ∇u) ≤ M G(x)(∇v − ∇u) · (∇v − ∇u), hence (7.15) gives mkv
− uk2G
≤
Z
Ω
(T (v) − T (u))(v − u) ≤ M kv − uk2G
(u, v ∈ H 2 (Ω) ∩ H01 (Ω)), (7.16)
i.e. (5.88) holds. Further, by (7.15) the operator F defined in (5.73) now takes the form (7.7). Using that f ∈ C 1 , a calculation similar to Theorem 6.1 yields that F is Gateaux differentiable, ′
hF (u)h, viG =
Z
Ω
∂f (x, ∇u) ∇h · ∇v ∂η
(u, h, v ∈ H01 (Ω))
(7.17)
and thus F ′ is bihemicontinuous and symmetric. Hence the conditions of Theorem 5.13 are satisfied. Consequently, by part (2) of Theorem 5.13 the sequence (7.10) converges according to (7.11). Further, if G ∈ C 1 (Ω, RN ×N ) and Ω is C 2 -diffeomorphic to a convex domain, then Theorem 3.4 yields R(S) = L2 (Ω) and hence the condition (5.77) is also satisfied. Then part (3) of Theorem 5.13 yields that for any u0 ∈ H 2 (Ω) ∩ H01 (Ω) the auxiliary problems take the strong form (7.12), further, (7.13) holds with the lower bound ̺ > 0 of S, which by Proposition 3.4 is the smallest eigenvalue of S on H 2 (Ω) ∩ H01 (Ω). That is, assertions (1)-(2) of Theorem 7.1 are verified for (7.3). The corresponding conditioning estimate (7.9) is understood in the sense of Definition 5.6 according to Remark 5.18. Remark 7.1 The main ingredient of the proof of Theorem 7.1 is to define the preconditioning operator S as a linear elliptic operator, equivalent to T in the sense of coefficients in (7.4) or – with the energy norm – in that of (7.16). This has enabled us to apply Theorem 5.14. We note that, using (7.17), (7.16) implies that mkhk2G ≤ hF ′ (u)h, hiG ≤ M khk2G
(u, h ∈ H01 (Ω)),
which shows that the underlying result ensuring linear convergence is Theorem 5.4 (in the space H01 (Ω) with k.kG -norm). Remark 7.2 The generalized differential operator F in (7.7) maps from H01 (Ω) into H01 (Ω), i.e. for any u ∈ H01 (Ω) there exists the function F (u) ∈ H01 (Ω) that defines the equality (7.7). This means that (although it may seem unusual) it is correct to use the same notation h., .iG in (7.7) to denote the sense of duality pairing as was used in (7.6) to denote the weighted H01 (Ω)-inner product. (The underlying property for this coincidence is that the dual of a Hilbert space is itself, i.e. the functionals in the dual space can be given as elements of the Hilbert space.) The fact that the element F (u) is a function in H01 (Ω) can be seen more visually in the regular case, i.e. if G and Ω are regular as in part (2) of Theorem 7.1. Namely,
7.1. SIMPLE ITERATIONS
171
in this case the connection of the generalized differential operator F and the original operator T is established using (5.82) by the decomposition F|H 2 ∩H01 = S −1 T ,
(7.18)
hence for u ∈ H 2 (Ω) ∩ H01 (Ω) the element F (u) ∈ H01 (Ω) is in fact the function S −1 T (u) ∈ H 2 (Ω) ∩ H01 (Ω). The similar form is valid for the weak right-hand side b ∈ H01 (Ω), which was defined in (7.8) as the weak solution of the problem Sb = g: under the above regularity conditions there holds b = S −1 g ∈ H 2 (Ω) ∩ H01 (Ω).
(7.19)
The strong form (7.12) of the auxiliary equations is due to the representations (7.18) and (7.19). (In the proof of Theorem 7.1 it was derived directly from the earlier Theorem 5.13.) Namely, in the general case (i.e. without the regularity assumptions) the problem (7.10) is written as (v ∈ H01 (Ω))
hzn , viG = hF (un ) − b, viG or simply zn = F (un ) − b ,
(7.20)
whereas in the regular case, using the representations (7.18) and (7.19), (7.20) turns into zn = S −1 (T (un ) − g) . Now we consider the modification of problem (7.3) when a lower order term of polynomial growth is added: (
T (u) ≡ −div f (x, ∇u) + q(x, u) = g(x) u|∂Ω = 0.
(7.21)
For this problem Theorem 7.1 is modified as follows. Assumptions 7.2. Let the conditions (i)-(ii) of Theorem 7.1 hold, further, let q : Ω × R → R be measurable and bounded w.r. to the variable x ∈ Ω and C 1 w.r. to the variable ξ ∈ R. Let 2 ≤ p ≤ N2N (if N > 2) or 2 ≤ p (if N = 2); that is, the −2 Sobolev embedding H01 (Ω) ⊂ Lp (Ω) (7.22) holds (see in the Appendix, Chapter 11). Assume that there exist constants c1 , c2 ≥ 0 such that q satisfies 0 ≤ ∂ξ q(x, ξ) ≤ c1 + c2 |ξ|p−2
((x, ξ) ∈ Ω × R).
Construction 7.2. Let G ∈ L∞ (Ω, RN ×N ) be a symmetric matrix-valued function for which there exist constants m′ ≥ m > 0 such that mhG(x)ξ, ξi ≤ h
∂f (x, η) ξ, ξi ≤ m′ hG(x)ξ, ξi ∂η
((x, η) ∈ Ω × RN , ξ ∈ RN ), (7.23)
172
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
and let the corresponding linear operator S and inner product h., .iG be defined as in (7.5)–(7.6). We introduce the function p M (r) = m′ + c1 ̺−1 + c2 Kp,Ω rp−2
(r > 0),
(7.24)
where Kp,Ω is the embedding constant in the estimate (u ∈ H01 (Ω))
kukLp (Ω) ≤ Kp,Ω kukG
(7.25)
corresponding to (7.22), and ̺ > 0 denotes the smallest eigenvalue of the operator S. Now the generalized differential operator F : H01 (Ω) → H01 (Ω) is the appropriate modification of (7.7), i.e. hF (u), viG =
Z
Ω
(u, v ∈ H01 (Ω)).
(f (x, ∇u) · ∇v + q(x, u)v)
(7.26)
The weak right-hand side b ∈ H01 (Ω) is the same as in (7.8). Theorem 7.2 Let Assumptions 7.2 be satisfied. Then Construction 7.2 yields that for any u0 ∈ H01 (Ω), the assertions (1)-(2) of Theorem 7.1 hold with
M0 := M ku0 kG +
1 kF (u0 ) − bkG m
(7.27)
instead of M . Namely, (1) for any u0 ∈ H01 (Ω) the sequence (un ) ⊂ H01 (Ω) defined by un+1 = un − Z
Ω
G(x) ∇zn · ∇v =
Z
Ω
2 zn , M0 + m
(f (x, ∇un ) · ∇v + q(x, un )v) −
where Z
Ω
gv
(7.28)
(v ∈ H01 (Ω)) (7.29)
converges linearly to the weak solution u∗ of (7.21): 1 M0 − m kun − u kG ≤ kF (u0 ) − bkG m M0 + m ∗
n
(n ∈ N) .
(7.30)
(2) If the regularity conditions in part (2) of Theorem 7.1 are satisfied, then (7.27) can be replaced by !
1 kT (u0 ) − gkL2 (Ω) , M0 := M ku0 kG + m̺1/2 and (7.12)–(7.13) hold with this M0 instead of M .
(7.31)
7.1. SIMPLE ITERATIONS
173
Proof. The proof goes on in the same way as for Theorem 7.1, now applying part (ii) of Theorem 5.14. The only difference in checking the conditions of Theorem 5.13 is that now condition (5.72) is replaced by (5.89). In virtue of Remark 5.21, the latter is equivalent to (5.91), which in the present setting means mkhk2G ≤ hF ′ (u)h, hiG ≤ M (kukG )khk2G
(u, h ∈ H01 (Ω))
(7.32)
where F : H01 (Ω) → H01 (Ω) is the generalized differential operator defined by (7.26). The estimate (7.32) follows analogously to Theorem 6.2 by setting the inner product h., .iG in H01 (Ω), namely, it is then a special case of (6.21) with r = 1 and ΓN = ∅, and 2 K2,Ω is replaced by ̺−1 using (11.8). Remark 7.3 The norm kF (u0 )−bkG in (7.27) is to be computed after the first iteration step, since the function z0 = F (u0 ) − b is determined in (7.29) for n = 0. That is, the iteration has to be started in the following way: u0 ∈ H01 (Ω) is arbitrary;
z0 = F (u0 ) − b is determined via (7.29);
M0 = M ku0 kG +
u1 = u 0 −
2 z M0 +m 0
.
1 kz0 kG m
;
(Then for n ≥ 1 the iteration may proceed as in (7.28)–(7.29).) This way of calculating kF (u0 ) − bkG works also for the estimates (7.11) and (7.30). Remark 7.4 Theorem 7.1 can be reformulated and proved in an identical way for mixed boundary conditions, i.e. for problems T (u) ≡ −div f (x, ∇u) = g(x)
Q(u) ≡
in Ω
f (x, ∇u) · ν = γ(x) on ΓN u = 0
(7.33)
on ΓD
(let alone the slightly more special strong form (7.12)–(7.13)). Here the conditions γ ∈ L2 (ΓN ) and ΓD 6= ∅ are assumed. R R R 1 (Ω) Namely, one has to replace the terms gv by gv + γv dσ and H01 (Ω) by HD Ω
Ω
ΓN
1 (defined in (6.15)), respectively. Thus the iterative sequence (un ) ⊂ HD (Ω) is now defined by
un+1 = un − Z
Ω
G(x) ∇zn · ∇v dx =
Z
Ω
2 zn , M +m
where
f (x, ∇un ) · ∇v dx −
Z
Ω
gvdx +
Z
ΓN
γv dσ
(7.34)
1 (for all v ∈ HD (Ω)), and then (un ) satisfies the convergence estimate (7.11). The detailed formulation of this theorem is unnecessary since problem (7.33) is a special case of (7.35). The same holds for the generalization of Theorem 7.2 when a lower order term is added in (7.33).
174
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
Remark 7.5 The simplest initial guess for the iteration (7.10) is clearly u0 ≡ 0. We note that this has some relevance from the point of view of preconditioning as well. (x,η) Namely, (7.4) gives M +m G(x) ≈ ∂f∂η , hence M +m S is a preconditioner of T that 2 2 can be regarded as its linear approximation. Then a natural initial guess would be 2 S −1 g. However, this function just the corresponding approximate solution u0 = M +m equals u1 if u0 ≡ 0 is chosen, i.e. the two choices give the same algorithms (only put in a different way).
7.1.2
Mixed and higher order problems
The results of subsection 7.1.1 are now extended to more general boundary value problems including second order problems of 3rd type, mixed systems and Neumann problems, further, fourth order Dirichlet problems. The discussion follows that of the previous subsection, hence we omit certain details that can be understood in an analogous way as before. (a) Second order problems of 3rd type We consider second order boundary value problems of 3rd type in the form T (u) ≡ −div f (x, ∇u) + q(x, u) = g(x) in Ω
Q(u) ≡
f (x, ∇u) · ν + s(x, u) = γ(x) on ΓN u = 0
(7.35)
on ΓD
1 on a bounded domain Ω ⊂ RN . The space HD (Ω) is defined as in (6.15). Let p be a real number satisfying
2 ≤ p (if N = 2) or 2 ≤ p ≤
2N N −2
(if N > 2).
(7.36)
Then there hold the Sobolev embeddings 1 HD (Ω) ⊂ Lp (Ω),
1 HD (Ω)|ΓN ⊂ Lp (ΓN ),
1 kukLp (Ω) ≤ Kp,Ω kuk (u ∈ HD (Ω)),
1 kukLp (ΓN ) ≤ Kp,ΓN kuk (u ∈ HD (Ω)|ΓN )
(7.37) (7.38)
1 with suitable constants Kp,Ω > 0 and Kp,ΓN > 0, where HD (Ω)|ΓN denotes the trace of 1 1 HD (Ω) on ΓN and k.k denotes a norm in HD (Ω) that is equivalent to the standard H 1 norm. (See (6.18)–(6.19).)
Assumptions 7.3. Let the problem (7.35) satisfy the following conditions: (i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary; ΓN , ΓD ⊂ ∂Ω are measurable, ΓN ∩ ΓD = ∅ and ΓN ∪ ΓD = ∂Ω. (ii) The functions f : Ω × RN → RN , q : Ω × R → R and s : ΓN × R → R are measurable and bounded w.r. to the variable x ∈ Ω (or x ∈ ΓN , resp.) and C 1 in the other variables.
7.1. SIMPLE ITERATIONS (iii) The Jacobian matrices
175 ∂f (x,η) ∂η
are symmetric and their eigenvalues λ satisfy 0 < µ1 ≤ λ ≤ µ 2 < ∞
with constants µ2 ≥ µ1 > 0 independent of (x, η). (iv) There exist constants ci , di ≥ 0 and 2 ≤ pi ≤ p (i = 1, 2) such that for any x ∈ Ω (or x ∈ ΓN , resp.) and ξ ∈ R, 0 ≤ ∂ξ q(x, ξ) ≤ c1 + c2 |ξ|p1 −2 ,
0 ≤ ∂ξ s(x, ξ) ≤ d1 + d2 |ξ|p2 −2
with p defined in (7.36). (v) Either ΓD 6= ∅, or x 7→ inf ξ∈R ∂ξ s(x, ξ) is not a.e. constant zero on ΓN . (vi) g ∈ L2 (Ω) and γ ∈ L2 (ΓN ). Construction 7.3. Let G ∈ L∞ (Ω, RN ×N ) be a symmetric matrix-valued function for which there exist constants m′ ≥ m > 0 such that mhG(x)ξ, ξi ≤ h
∂f (x, η) ξ, ξi ≤ m′ hG(x)ξ, ξi ∂η
((x, η) ∈ Ω × RN , ξ ∈ RN ). (7.39)
Further, let β ∈ L∞ (ΓN ) such that
1 inf ∂ξ s(x, ξ) m ξ∈R We introduce the linear operator β(x) ≤
(x ∈ ΓN ).
Su ≡ −div (G(x)∇u)
(7.40)
for u ∈ H 2 (Ω) with G(x)∇u ∈ H 1 (Ω) and satisfying the boundary conditions Ru ≡ ∂G(x)·ν u + β(x)u = 0 (x ∈ ΓN ),
u = 0 on ΓD
(7.41)
(where ∂G(x)·ν u = G(x) ν · ∇u is the conormal derivative of u at x). The corresponding 1 inner product in the energy space HD (Ω) is denoted by hu, viHD1 (Ω) :=
Z
Ω
G(x) ∇u · ∇v +
Z
ΓN
β(x)uv dσ
1 (u, v ∈ HD (Ω)).
(7.42)
(Condition (v) ensures that (7.42) is positive definite.) For simplicity we will similarly omit writing dx for the integrals on Ω in the sequel. The boundary conditions of (7.35) will be called regular if for any f ∈ L2 (Ω) and 1 ϕ ∈ H 1/2 (ΓN ), the weak solution u ∈ HD (Ω) of the corresponding linear problem Su = f
Ru = ϕ
in Ω
(7.43)
on ΓN
satisfies u ∈ H 2 (Ω). (Sufficient conditions are given in subsection 3.2.2.) Finally, we introduce the function 2 M (r) = m′ + c1 ̺−1 + d1 K2,Γ + c2 Kpp11,Ω rp1 −2 + d2 Kpp22,ΓN rp2 −2 N
(r > 0),
(7.44)
where Kp1 ,Ω and Kp2 ,ΓN are defined according to (7.37)-(7.38) and ̺ > 0 denotes the smallest eigenvalue of the operator S.
176
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
Theorem 7.3 Let Assumptions 7.3 be satisfied. Then Construction 7.3 yields the following convergence results: 1 (1) Let u0 ∈ HD (Ω), and
1 kF (u0 ) − bkHD1 (Ω) , m
M0 := M ku0 kHD1 (Ω) +
(7.45)
where M (r) is from (7.44), F denotes the generalized differential operator corre1 sponding to T and b ∈ HD (Ω) is the element for which hb, viHD1 (Ω) = For n ∈ N let
Z
Ω
gv +
Z
γv dσ
ΓN
1 (v ∈ HD (Ω)).
2 zn , M0 + m
un+1 = un −
(7.46)
1 where zn ∈ HD (Ω) satisfies
Z
G(x) ∇zn · ∇v +
Z
[f (x, ∇un ) · ∇v + (q(x, un ) − g)v] +
Z
Ω
Z
Ω
ΓN
ΓN
β(x)zn v dσ =
(7.47)
(s(x, un ) − γ)v dσ
1 (v ∈ HD (Ω)).
1 Then the sequence (un ) ⊂ HD (Ω) converges linearly to the weak solution u∗ ∈ 1 HD (Ω) of (7.35), namely, ∗
kun − u kHD1 (Ω)
M0 − m 1 kF (u0 ) − bkH 1 (Ω) ≤ D m M0 + m
n
(n ∈ N) .
(7.48)
(2) Assume in addition that G ∈ C 1 (Ω, RN ×N ) and the boundary conditions are regular as defined with (7.43). 1 Then for any u0 ∈ H 2 (Ω) ∩ HD (Ω) the auxiliary problems (7.47) throughout the iteration take the strong form
Szn ≡ −div (G(x) ∇zn )
= T (un ) − g
in Ω
Rzn ≡ ∂G(x)·ν zn + β(x)zn = Q(un ) − γ on ΓN zn = 0
(7.49)
on ΓD
with zn ∈ H 2 (Ω), and (7.48) can be replaced by kun − u∗ kHD1 (Ω) ≤ (for all n ∈ N).
1/2 M − m n 1 0 2 2 kT (u ) − gk + kQ(u ) − γk 0 0 L2 (Ω) L2 (ΓN ) m̺1/2 M0 + m (7.50)
7.1. SIMPLE ITERATIONS
177
Proof. The proof is the suitable modification of that of Theorem 7.2. Now we 1 1 consider the generalized differential operator F : HD (Ω) → HD (Ω) defined by hF (u), viHD1 =
Z
Ω
(f (x, ∇u) · ∇v + q(x, u)v) +
Z
s(x, u)v dσ
(7.51)
1 (u, h ∈ HD (Ω)),
(7.52)
ΓN
1 (u, v ∈ HD (Ω)). Then F satisfies the analogue of (5.91):
mkhk2H 1 ≤ hF ′ (u)h, hiHD1 ≤ M (kukHD1 )khk2H 1 D
D
which is derived from Theorem 6.2 similarly as in Theorem 7.2 by setting the inner 1 product (7.42) in HD (Ω). Using (7.52), Theorem 5.5 yields assertion (1) of Theorem 7.3. The strong form (7.49) is obtained by induction similarly to Theorems 7.1–7.2. Further, (7.51) now yields hF (u), viHD1 = hT (u), viL2 (Ω) + hQ(u), viL2 (ΓN )
1 (u, v ∈ H 2 (Ω) ∩ HD (Ω))
instead of (5.73), hence kT (u0 ) − gkL2 (Ω) in (7.13) is replaced by the product norm of the pair (T (u0 ) − g, Q(u0 ) − γ), which yields (7.50). Remark 7.6 The strong form (7.49) clarifies especially the roles of the operators F , T and S in the auxiliary problems. Namely, let us define the pairs of operators mapping into product spaces
T Q
and
S R
:
2 HD (Ω)
2
→ L (Ω) × H
1/2
(ΓN ),
T (u) := Q
T (u) Q(u)
(7.53)
2 1 and similarly for S and R, where HD (Ω) := H 2 (Ω) ∩ HD (Ω).
S ) is bijective. R Hence, using that (7.47) coincides with (7.49), the iteration (7.46)–(7.47) can be written as −1 2 S T g un+1 = un − (un ) − , (7.54) R Q γ M0 + m If the boundary conditions are regular as defined with (7.43), then (
further, the obtained convergence estimate (7.50) corresponds to the condition number (relative to the initial guess u0 ) cond
S R
−1
T Q
!
≤
M0 . m
(7.55)
Remark 7.7 The norm kF (u0 ) − bkHD1 (Ω) in (7.48) was estimated by the computable norm 1/2 ρ−1/2 kT (u0 ) − gk2L2 (Ω) + kQ(u0 ) − γk2L2 (ΓN )
to yield (7.50) when the given regularity conditions hold. This estimate can also be used in (7.45). Without the regularity conditions one can compute kF (u0 ) − bkHD1 (Ω) similarly to Remark 7.3.
178
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
(b) Second order mixed systems Now we consider second order mixed systems of the form Ti (u1 , .., ur ) ≡ − div fi (x, ∇u1 , .., ∇ur ) = gi (x)
Qi (u1 , .., ur ) ≡
in Ω
fi (x, ∇u1 , .., ∇ur ) · ν = γi (x) on ΓN ui = 0
on ΓD
(i = 1, .., r) (7.56)
on a bounded domain Ω ⊂ RN , such that the following assumptions are satisfied: Assumptions 7.4. (i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary; ΓN , ΓD ⊂ ∂Ω are measurable, ΓN ∩ ΓD = ∅ and ΓN ∪ ΓD = ∂Ω. (ii) The functions fi : Ω × RN r → RN are measurable and bounded w.r. to the variable x ∈ Ω and C 1 in the other variables. (iii) Let fi = (fi1 , ..., fiN ) and f = (f11 , ..., f1N , f21 , ..., f2N , ..., fr1 , ..., frN ) : Ω×RN r → RN r . For any (x, η) ∈ Ω × RN r , the matrices {∂ηk fj (x, η)}j,k=1,...,N r are sym(f ) metric and their eigenvalues λj (x, η) (j = 1, ..., rN ) satisfy (f )
µ1 ≤ λj (x, η) ≤ µ2 with constants µ2 ≥ µ1 > 0 independent of (x, η). (iv) ΓD 6= ∅. (v) gi ∈ L2 (Ω) and γi ∈ L2 (ΓN ) (i = 1, .., r). Construction 7.4. Let G ∈ L∞ (Ω, RN r×N r ) be a symmetric matrix-valued function for which there exist constants M ≥ m > 0 such that mhG(x)ξ, ξi ≤ h
∂f (x, η) ξ, ξi ≤ M hG(x)ξ, ξi ∂η
((x, η) ∈ Ω × RN r , ξ ∈ RN r ). (7.57)
1 1 The corresponding inner product in the energy space HD (Ω)r (with HD (Ω) defined in (6.15)) is now
hu, viHD1 (Ω)r :=
Z
Ω
G(x) ∇u · ∇v =
Z X r
Ω i,j=1
G(ij) (x) ∇uj · ∇vi
1 (u, v ∈ HD (Ω)r ),
(7.58)
where the second form corresponds to the decomposition G(x) = {G(ij) (x)}ri,j=1 to blocks G(ij) ∈ RN ×N . Using this decomposition, condition (7.57) implies m
Z X r
Ω i,j=1
G(ij) (x) ∇hi · ∇hj ≤
Z
Ω
Z X r ∂f G(ij) (x) ∇hi · ∇hj (x, ∇u) ∇h · ∇h ≤ M ∂η i,j=1 Ω
(7.59)
7.1. SIMPLE ITERATIONS
179
1 (for all u, h ∈ HD (Ω)r ). We introduce a preconditioning operator in the form of an r-tuple of operators
S1 .. S= . Sr
as a preconditioner for the original operator
T1 .. T = . . Tr
Namely,
Si u ≡ − div
r X
j=1
G(ij) (x) ∇uj
(i = 1, .., r),
(7.60)
where Si are defined for all u = (u1 , .., ur ) ∈ H 2 (Ω)r with G(x)∇u ∈ H 1 (Ω)r and satisfying the boundary conditions r X
∂G(ij) (x)·ν uj = 0 on ΓN
and ui = 0 on ΓD
(i = 1, .., r)
(7.61)
j=1
(where ∂G(ij) (x)·ν uj = G(ij) (x) ν · ∇uj is the conormal derivative of uj at x). Theorem 7.4 Let Assumptions 7.4 be satisfied. Then Construction 7.4 yields M m and the following corresponding convergence results:
cond S −1 T ≤
1 (1) Let u0 ∈ HD (Ω)r , and for n ∈ N let
un+1 = un −
(7.62)
2 zn , M +m
(7.63)
1 where zn = (zn,1 , .., zn,r ) ∈ HD (Ω)r satisfies
Z X r
G
(ij)
Ω i,j=1
∇zn,j · ∇vi =
Z X r
Ω i=1
[fi (x, ∇un ) · ∇vi − gi vi ] −
Z X r
γi vi dσ
(7.64)
ΓN i=1
1 for all v ∈ HD (Ω)r .
1 Then the sequence (un ) = (un,1 , .., un,r ) ⊂ HD (Ω)r converges linearly to the weak 1 solution u∗ = (u∗1 , .., u∗r ) ∈ HD (Ω)r of the system (7.56), namely, ∗
kun − u kHD1 (Ω)r
M −m 1 kF (u0 ) − bkH 1 (Ω)r ≤ D m M +m
n
(n ∈ N) ,
(7.65)
where F denotes the generalized differential operator corresponding to T = (T1 , ..., Tr ) 1 (cf. (6.20)) and b ∈ HD (Ω)r is the element for which hb, viHD1 (Ω)r =
Z X r Ω i=1
gi v i +
Z
r X
ΓN i=1
γi vi dσ
1 (v ∈ HD (Ω)r ).
(7.66)
180
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
(2) Assume in addition that G ∈ C 1 (Ω, RN r×N r ) and the boundary conditions are regular in the sense of paragraph (a). 1 Then for any u0 ∈ H 2 (Ω)r ∩ HD (Ω)r , the auxiliary problems (7.64) throughout the iteration take the strong form
− div
r P
j=1
G(ij) (x) ∇zn,j r P
j=1
!
= Ti (un,1 , .., un,r ) − gi
in Ω
∂G(ij) (x)·ν zn,j = fi (x, ∇un,1 , .., ∇un,r ) · ν − γi on ΓN zn,i = 0
(7.67)
on ΓD
(for all i = 1, .., r) with zn = (zn,1 , .., zn,r ) ∈ H 2 (Ω)r , and the estimate (7.50) 1 1 holds with HD (Ω) and L2 (Ω) replaced by HD (Ω)r and L2 (Ω)r , respectively. Proof. The proof goes on in the same way as for Theorem 7.3, using product 1 1 spaces. Now the generalized differential operator F : HD (Ω)r → HD (Ω)r is defined by hF (u), viHD1 (Ω)r = (u, v ∈
1 HD (Ω)r )
Z
Ω
f (x, ∇u) · ∇v ≡
Z X r
Ω i=1
fi (x, ∇u) · ∇vi
and satisfies the analogue of (5.90):
mkhk2H 1 (Ω)r ≤ hF ′ (u)h, hiHD1 (Ω)r ≤ M khk2H 1 (Ω)r D
D
1 (u, h ∈ HD (Ω)r ).
1 This follows analogously to Theorem 6.2 by setting the inner product (7.58) in HD (Ω)r , namely, it is then a special case of (6.21) with M (r) ≡ M .
(c) Neumann problems Now we consider Neumann problems of the form (
T (u) ≡ −div f (x, ∇u) = g(x) f (x, ∇u) · ν |∂Ω = 0
(7.68)
on a bounded domain Ω ⊂ RN with piecewise smooth boundary. The iterative method is obtained with a simple modification of Theorem 7.1, using factorization. For this reason we introduce the subspaces H0 := {u ∈ H 1 (Ω) : u(x) ≡ const. on Ω},
H0⊥ = {u ∈ H 1 (Ω) :
Z
Ω
u dx = 0 } .
Theorem 7.5R Let the function f satisfy condition (i) of Theorem 7.1 and let g ∈ L2 (Ω) satisfy Ω g = 0. Let G ∈ L∞ (Ω, RN ×N ) be a symmetric matrix-valued function satisfying (7.4). Then Theorem 7.1 holds for the problem (7.68) in the space H0⊥ instead of H01 (Ω). In particular, the functions zn instead of those in (7.10) are now solutions of the auxiliary problems Z Z Z G(x) ∇z · ∇v = f (x, ∇u ) · ∇v − gv n n Ω
zn ∈ H 1 (Ω),
Ω
Z
Ω
zn dx = 0.
Ω
(v ∈ H01 (Ω)),
(7.69)
7.1. SIMPLE ITERATIONS
181
Further, if G ∈ C 1 (Ω, RN ×N ) and ∂Ω ∈ C 2 , then for any u0 ∈ H 2 (Ω) satisfying R Ω u0 dx = 0, the auxiliary problems (7.69) throughout the iteration take the strong form −div (G(x) ∇zn ) = T (un ) − g Z (7.70) ∂G(x)·ν zn |∂Ω = f (x, ∇un ) · ν, zn dx = 0 Ω
2
with zn ∈ H (Ω) (where ∂G(x)·ν zn = G(x) ν · ∇zn is the conormal derivative of zn at x).
Proof. The proof is the same as for Theorem 7.1 in the space H0⊥ instead of H01 (Ω). That is, now part (iii) of Theorem 5.14 is applied. For the second part the regularity result Theorem 3.6 is used. (d) 4th order Dirichlet problems Now we consider 4th order Dirichlet problems of the form defined in (6.63) with a matrix-valued function A: T (u) ≡ div2 A(x, D 2 u) = g(x) u |∂Ω =
∂u | = ∂ν ∂Ω
(7.71)
0,
such that the following assumptions are satisfied: Assumptions 7.6. (i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary. (ii) The function A : Ω × RN ×N → RN ×N is measurable and bounded w.r. to the variable x ∈ Ω and C 1 in the other variables. (iii) The Jacobian arrays ∂A(x, Θ) = ∂Θ
(
∂Ars (x, Θ) ∂Θik
)N
i,k,r,s=1
∈ R(N ×N )
2
are symmetric and their eigenvalues Λ satisfy 0 < µ1 ≤ Λ ≤ µ 2 < ∞
(7.72)
with constants µ2 ≥ µ1 > 0 independent of (x, Θ). (iv) g ∈ L2 (Ω). 2
Construction 7.6. Let G ∈ L∞ (Ω, R(N ×N ) ) be a symmetric array-valued function for which there exist constants M ≥ m > 0 such that mhG(x)Φ, Φi ≤ h
∂A(x, Θ) Φ, Φi ≤ M hG(x)Φ, Φi ∂Θ
((x, Θ) ∈ Ω × RN ×N , Φ ∈ RN ×N ).
(7.73)
182
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
We introduce the linear operator Su ≡ div2 (G(x)D2 u)
(u ∈ H 4 (Ω) ∩ H02 (Ω) with G(x)D2 u ∈ H 2 (Ω)),
(7.74)
and the corresponding energy space H02 (Ω) with the inner product hu, viG :=
Z
Ω
G(x) D2 u · D2 v .
(7.75)
Theorem 7.6 Let Assumptions 7.6 be satisfied. Then Construction 7.6 yields M m and the following corresponding convergence results:
cond S −1 T ≤
(7.76)
(1) for any u0 ∈ H02 (Ω) the sequence the sequence (un ) ⊂ H02 (Ω) defined by un+1 = un − where
Z
Ω
2
2
G(x) D zn · D v =
Z
2 zn , M +m 2
Ω
2
A(x, D un ) · D v − ∗
converges linearly to the weak solution u ∈
H02 (Ω)
M −m 1 kun − u kG ≤ kF (u0 ) − bkG m M +m
∗
Z
Ω
gv
(v ∈ H02 (Ω)),
(7.77)
of (7.71), namely, n
(n ∈ N) ,
(7.78)
where F is the generalized differential operator corresponding to T (see (7.80)) and Z hb, viG = gv (v ∈ H02 (Ω)). Ω
(2) In particular, if G ∈ C 2 (Ω, RN ×N ) and ∂Ω ∈ C 4 , then for any u0 ∈ H 4 (Ω) ∩ H02 (Ω) the auxiliary problems throughout the iteration (7.77) take the strong form (
Szn = T (un ) − g n zn|∂Ω = ∂z | =0 ∂ν ∂Ω
(7.79)
with zn ∈ H 4 (Ω), and the right-hand side of (7.78) takes the form (7.13), where ̺ > 0 is the smallest eigenvalue of S on H 4 (Ω) ∩ H02 (Ω). Proof. The proof goes on similarly as for Theorem 7.1, now with the generalized differential operator F : H02 (Ω) → H02 (Ω) defined by hF (u), viG =
Z
Ω
A(x, D2 u) · D2 v
(u, v ∈ H02 (Ω)).
(7.80)
In checking the conditions of Theorem 5.14, now the estimate (5.88) is replaced by the equivalent form (5.90): mkhk2G ≤ hF ′ (u)h, hiG ≤ M khk2G
(u, h ∈ H02 (Ω)).
This follows analogously to Theorem 6.1 by setting the inner product h., .iG in H02 (Ω), namely, it is a special case of (6.3) with r = 1. Finally, for assertion (2) a corresponding regularity result in [3] ensures that the auxiliary problems have solutions in H 4 (Ω) for right-sides in L2 (Ω).
7.2. NEWTON-LIKE METHODS
183
Remark 7.8 Theorem 7.6 can be reformulated and proved similarly with the boundary conditions given in part (ii) of Remark 6.6. Further, lower order terms can be added analogously to Theorem 7.2, cf. [172]. Remark 7.9 Similarly as mentioned in Remark 7.1, the main ingredient of the proof of Theorems 7.3–7.6 is to define the preconditioning operator S as a linear elliptic operator equivalent to T . This has enabled us to apply Theorem 5.14. Further, the obtained generalized differential operators satisfy mkhk2G ≤ hF ′ (u)h, hiG ≤ M khk2G
(∀ u, h)
(with M depending on kuk in the presence of lower order terms), which shows that the underlying results ensuring linear convergence are Theorems 5.4 and 5.5. Remark 7.10 The errors in Theorems 7.3–7.6 can be a priori estimated using the initial weak residuals kF (u0 ) − bkG . These residuals can be computed similarly as mentioned in Remark 7.3.
7.2
Newton-like methods
This section is devoted to Newton-like methods in Sobolev space, applying the abstract results of subsections 5.2.2 and 5.3.3 to elliptic boundary value problems. The described methods are mostly developed for second order mixed problems, since this special case allows the clearer exposition of the ideas. Other boundary value problems are referred to briefly. We consider the weak formulation of the problems, the strong form has no practical relevance here in contrary to the simple iterations in the previous section. In the first subsection a general result is given on the convergence of the damped Newton method in Sobolev space when the generalized differential operator has a Lipschitz continuous derivative. The next subsections are devoted to damped inexact Newton methods (DIN). In this part we consider finite-dimensional subspaces, since in this way the Lipschitz continuity of the derivative of the coefficient f itself is sufficient to ensure generally the same for the corresponding generalized differential operator. In subsections 7.2.2–7.2.3 two kinds of DIN sequences are constructed for second order mixed problems. The first version (inner-outer iteration) uses a Newton iteration where inexactness comes from the approximate solution of the theoretically exact Newton equations; the latter is achieved by an inner CG iteration. The second version defines approximations of the derivatives, thus realizing variable preconditioning (described in subsection 5.3.3) which reflects the preconditioning role of Newton’s method and thus the common character of preconditioned simple iterations and inexact (or quasi-) Newton methods. The two versions are strongly related, they involve similarly constructed preconditioners based on variable spectral equivalence. The main differences are as follows. Concerning the approach, the preconditioners act for the inner linear problems in the first version and as preconditioners for the outer nonlinear
184
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
problem in the second version. From practical aspect, the second version is simpler to realize since it comprises only one equation with the same preconditioner to achieve similar order of convergence; on the other hand, an extra condition has to be imposed on the initial spectral bounds, whereas there is no such limitation in the first version. The above DIN iterations for other boundary value problems are referred to in the last subsection. The treatment of these problems is similar to that in the related theorems for simple iterations, and is given in detail for fourth order Dirichlet problems. The discussed two kinds of DIN sequences correspond to the two levels of variable preconditioning described in the introduction of section 5.3. However, the order is reverse, i.e. we now discuss inner-outer iterations first, since this is a more widespread way of realizing Newton’s method. We note that the preconditioning approach applied in these Newton-like methods is a suitable generalization of the preconditioned simple iterations of section 7.1, as outlined at the beginning of section 5.3. Namely, both in subsections 7.2.2 and 7.2.3 the preconditioning operators Bn satisfy
cond Bn−1 F ′ (un ) ≤
Mn mn
(n ∈ N),
(7.81)
and in the variable preconditioning method of subsection 7.2.3 this yields convergence with ratio Mn − m n q = lim sup Mn + m n including superlinear convergence when q = 0. Further, in subsection 7.2.2 the same estimate (7.81) implies the convergence of the inner CG iteration with ratio √ √ Mn − m n √ . √ Mn + m n An important advantage of the discussed Sobolev space Newton methods is that the derivative operators (and later in their numerical realization the corresponding Jacobians) are derived in a straightforward manner, without any further numerical differentiation. This is because they come from the weak formulation instead of studying the actual form of the nonlinear algebraic system corresponding to the discretized operator. The Sobolev space version of Newton’s method has been introduced by Kantorovich as an application of the Newton-Kantorovich method in normed spaces (see [165]), and has been widely used owing to the above advantages. For elliptic problems, summaries involving numerical aspects are given e.g. by Axelsson [7, 10] and Rannacher [251], and some particular applications are found in Chiorescu [71], Vladimirova [285]. An efficient numerical realization of Newton’s method that relies on the Sobolev space background is the class of multilevel Newton iterations, see subsection 9.2.4 and the references there. The above-mentioned idea of inner iterations for the solution of the linearized equations has been introduced in this form in Bank–Rose [35]. The discussion of innerouter iterations and variable preconditioning in subsections 7.2.2–7.2.3 is closely related to the authors’ papers [16, 174].
7.2. NEWTON-LIKE METHODS
7.2.1
185
The general damped Newton algorithm
In this subsection damped Newton iterations are constructed for boundary value problems in the corresponding Sobolev spaces, using the normed space results of Theorem 5.11 and Corollary 5.1. Let us first consider a second order mixed boundary value problem of the form − div f (x, ∇u) = g(x)
The Sobolev space
in Ω
f (x, ∇u) · ν = γ(x) on ΓN u = 0
(7.82)
on ΓD .
1 HD (Ω) := {u ∈ H 1 (Ω) : u|ΓD = 0} ,
(7.83)
defined in (6.15) corresponding to the Dirichlet boundary ΓD , is now endowed with the inner product Z hu, viHD1 :=
Ω
∇u · ∇v
1 (u, v ∈ HD (Ω)).
(7.84)
(Condition ΓD 6= ∅ below in assumption (i) ensures that (7.84) is positive definite.) We consider problem (7.82) under the following assumptions: Assumptions 7.7. (i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary, ΓN , ΓD ⊂ ∂Ω are measurable, ΓN ∩ ΓD = ∅, ΓN ∪ ΓD = ∂Ω and ΓD 6= ∅; (ii) g ∈ L2 (Ω) and γ ∈ L2 (ΓN ); (iii) the function f : Ω × RN → RN is measurable and bounded w.r. to the variable x ∈ Ω and C 1 w.r. to the variable η ∈ RN ; (iv) the Jacobians
∂f (x,η) ∂η
are symmetric and their eigenvalues λ satisfy 0 < µ1 ≤ λ ≤ µ 2 < ∞
(7.85)
with constants µ2 ≥ µ1 > 0 independent of (x, η); 1 1 (v) the generalized differential operator F : HD (Ω) → HD (Ω), defined by
hF (u), viHD1 =
Z
Ω
f (x, ∇u) · ∇v
1 (v ∈ HD (Ω)),
(7.86)
has a Lipschitz continuous derivative. 1 Let b ∈ HD (Ω) be the element defined by
hb, viHD1 =
Z
Ω
gv +
Z
ΓN
γv dσ
1 (v ∈ HD (Ω))
(7.87)
186
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
1 and denote by u∗ ∈ HD (Ω) the weak solution of (7.82), i.e.
hF (u∗ ), viHD1 = hb, viHD1
1 (v ∈ HD (Ω)).
(7.88)
The above weak forms of F and b follow the usage in Chapter 6, and some further explanation to clarify their meaning has been given in Remark 7.2. (See also in the Appendix.) 1 Construction 7.7. Let u0 ∈ HD (Ω) be arbitrary, and let the sequence (un ) be defined by the following iteration. If, for n ∈ N, un is obtained, then
(n ∈ N),
un+1 = un + τn pn
(7.89)
1 where pn ∈ HD (Ω) is the solution of the linear auxiliary problem
hF ′ (un )pn , viHD1 = −hF (un ) − b, viHD1
1 (v ∈ HD (Ω))
or Z
Ω
Z Z ∂f (x, ∇un ) ∇pn · ∇v = − f (x, ∇un ) · ∇v − gv + γv dσ ∂η Ω ΓN
further, τn = min{ 1,
1 (v ∈ HD (Ω)),
µ1 } ∈ (0, 1] Lkpn kHD1
(7.90) (7.91)
where L denotes the Lipschitz constant of F ′ corresponding to assumption (v). Theorem 7.7 Let Assumptions 7.7 be satisfied. Then the sequence (un ) defined by Construction 7.7 satisfies 1 kun − u∗ kHD1 ≤ µ−1 → 0 monotonically 1 kF (un ) − bkHD
(7.92)
with speed of locally quadratic order. Namely, kF (un+1 ) − bkHD1 ≤ c1 kF (un ) − bk2H 1
D
(n ≥ n0 )
with some index n0 ∈ N and constant c1 > 0, which also yields the convergence estimate of weak quadratic order n
kF (un ) − bkHD1 ≤ d1 q 2
(n ≥ n0 )
(7.93)
with suitable constants 0 < q < 1, d1 > 0. Proof. Owing to our assumptions on f , we obtain as a special case of Theorem 6.1 that the operator F has a uniformly positive derivative with lower bound µ1 > 0. This implies (5.29) with λ = µ1 . Together with the Lipschitz continuity of F ′ , this means that the assumptions of Theorem 5.11 on F are satisfied. Then Corollary 5.1 yields the required result. Theorem 7.7 can be reformulated in a similar way for the other boundary value problems considered in section 7.1. For brevity, these results are given containing only the different details explicitly.
7.2. NEWTON-LIKE METHODS
187
Theorem 7.8 Let the items (i)-(iv) of Assumptions 7.7 be replaced by Assumptions 7.3–7.6, respectively, let F and b be as in those theorems and assume that F ′ is Lipschitz continuous. Then Construction 7.7 and the corresponding estimates in Theorem 7.7 remain the same for the problems considered in Theorems 7.3–7.6 if (a) in the case of the 3rd type boundary value problem (7.35), the auxiliary problem (7.90) is replaced by ∂f ∂q (x, ∇un ) ∇pn · ∇v + (x, un ) pn v ∂η ∂ξ
Z
Ω
= −
Z
Ω
!
[f (x, ∇un ) · ∇v + (q(x, un ) − g)v] −
+
Z
ΓN
Z
ΓN
∂s (x, un ) pn v dσ ∂ξ
(s(x, un ) − γ)v dσ
(7.94)
1 (v ∈ HD (Ω));
1 1 (b) in the case of the mixed system (7.56), the space HD (Ω) is replaced by HD (Ω)r and the auxiliary problem (7.90) by
Z X r ∂fi
Ω i,j=1
∂ηj
=−
(x, ∇un,1 , .., ∇un,r ) ∇pn,j · ∇vi Z X r
Ω i=1
(fi (x, ∇un ) · ∇vi − gi vi ) +
Z X r
(7.95) γi vi dσ
ΓN i=1
1 (v ∈ HD (Ω)r )
1 with un = (un,1 , .., un,r ) ∈ HD (Ω)r and using the decomposition
to blocks
∂fi ∂ηj
∂f ∂η
∂fi r = { ∂η } j i,j=1
∈ RN ×N ;
1 (c) in the case of the Neumann problem (7.68), the space HD (Ω) is replaced by H0⊥ = R {u ∈ H 1 (Ω) : Ω u = 0 } and the auxiliary problem (7.90) by
Z
Ω
Z ∂f (x, ∇un ) ∇pn · ∇v = − (f (x, ∇un ) · ∇v − gv) ∂η Ω
(v ∈ H0⊥ );
(7.96)
1 (d) in the case of the 4th order problem (7.71), the space HD (Ω) is replaced by H02 (Ω) and the auxiliary problem (7.90) by
Z
Ω
Z ∂A 2 2 2 (x, D un ) D pn · D v = − A(x, D2 un ) · D2 v − gv ∂Θ Ω
(v ∈ H02 (Ω)). (7.97)
We note that in the above Sobolev space Newton methods the derivative operators (and later in their numerical realization the corresponding Jacobians) are derived in a straightforward manner, without any further numerical differentiation. This is because they come from the weak formulation instead of studying the actual form of the nonlinear algebraic system corresponding to the discretized operator.
188
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
Remark 7.11 Estimates (7.92)–(7.93) imply that the number of Newton iterations to achieve kun − u∗ kHD1 ≤ ε (7.98) for some prescribed tolerance ε > 0 satisfies n = O(log log ε)
as ε → 0.
This is a main advantage of Newton’s method over the simple iterations, for which the same error (7.98) requires the number of iterations n = O(log ε)
as ε → 0.
Remark 7.12 The Lipschitz constant L of the generalized differential operator depends on the actual coefficient. A class of problems when it can be estimated in a general way is that of semilinear equations, for which case the bound is obtained as follows. (i) Let us consider the mixed problem − div (A(x)∇u) + q(x, u) = g(x)
in Ω
∂A(x)·ν u = γ(x) on ΓN
u = 0
(7.99)
on ΓD
with the following assumptions. The domain Ω ⊂ RN is bounded and N = 2 or 3; the matrices A(x) are symmetric and their eigenvalues are between positive constants independent of x; further, the function q ∈ C 1 (Ω × R) has a Lipschitz continuous derivative w.r. to ξ: |∂ξ q(x, ξ1 ) − ∂ξ q(x, ξ2 )| ≤ Cq |ξ1 − ξ2 |
((x, ξ) ∈ Ω × R)
(7.100)
with some constant Cq > 0 independent of x, ξ. The corresponding generalized differential operator is defined by hF (u), viHD1 =
Z
Ω
[A(x) ∇u · ∇v + q(x, u)v]
1 (u, v ∈ HD (Ω)),
and its derivative by ′
hF (u)v, ziHD1 =
Z
Ω
[A(x) ∇v · ∇z + ∂ξ q(x, u) vz]
1 (u, v, z ∈ HD (Ω)),
Then there holds Z 2 |h(F (u) − F (v))z, zi| = (∂ξ q(x, u) − ∂ξ q(x, v)) z ′
′
≤ Cq
Z
Ω
Ω
|u − v| z 2 ≤ Cq ku − vkL2 (Ω) kzk2L4 (Ω)
using H¨older’s inequality. Here by Remark 3.2 ku − vkL2 (Ω) ≤ ̺−1/2 ku − vkHD1 ,
(7.101)
7.2. NEWTON-LIKE METHODS
189
1 where ̺ > 0 denotes the smallest eigenvalue of −∆ on H 2 (Ω) ∩ HD (Ω), further,
kzkL4 (Ω) ≤ K4,Ω kzkHD1 , where K4,Ω > 0 is the Sobolev embedding constant corresponding to the embedding 1 HD (Ω) ⊂ L4 (Ω) for N ≤ 4 (see (7.37)). Hence kF ′ (u) − F ′ (v)kHD1 =
2 sup |h(F ′ (u) − F ′ (v)) z, zi| ≤ Cq ̺−1/2 K4,Ω ku − vkHD1 ,
kzkH 1 =1 D
that is, the Lipschitz constant L of the operator F ′ satisfies 2 L ≤ Cq ̺−1/2 K4,Ω .
In particular, for Dirichlet problems there holds the estimate K4,Ω ≤ (2/̺)1/4 (see (11.9)), which implies L ≤ 21/2 Cq ̺−1 .
A possible estimate for ̺ is given in (11.1). Altogether, the constant L is estimated in an explicit way in terms of the coefficients and domain. (ii) The Lipschitz condition (7.100) can be relaxed such that ∂ξ q is assumed to be only locally Lipschitz continuous w.r. to ξ. Then its growth must be limited by the 1 Sobolev embedding estimates of HD (Ω). Then F ′ is also locally Lipschitz continuous and the local Lipschitz constant can be estimated similarly as above. We sketch this for the same example (7.99). Let condition (7.100) be replaced by |∂ξ q(x, ξ1 ) − ∂ξ q(x, ξ2 )| ≤ Cq (max |ξ1 |, |ξ2 |)p−2 |ξ1 − ξ2 |
((x, ξ) ∈ Ω × R), (7.102)
where 2 ≤ p ≤ 4. In this case (7.101) implies ′
′
≤ Cq k1k
|h(F (u) − F (v))z, zi| ≤ Cq 5 L 4−p (Ω)
≤ Cq |Ω|
4−p 5
Z
Ω
(max |u|, |v|)p−2 |u − v|z 2
max kukL5 (Ω) , kvkL5 (Ω)
p−2
p+1 max kukHD1 , kvkHD1 K5,Ω
p−2
ku − vkL5 (Ω) kzk2L5 (Ω) ku − vkHD1 kzk2H 1
D
where |Ω| is the volume or area of Ω, and K5,Ω is the Sobolev embedding constant 1 corresponding to the embedding HD (Ω) ⊂ L5 (Ω) for N ≤ 4 (see (7.37)). Here H¨older’s inequality was applied, using that (4 − p)/5 + (p − 2)/5 + 1/5 + 2/5 = 1. That is, F ′ satisfies the local Lipschitz continuity condition (5.63), since (taking supremum w.r. to kzkHD1 = 1) the obtained estimate implies ˜ kF ′ (u) − F ′ (v)kHD1 ≤ L(r)ku − vkHD1 with
1 (u, v ∈ HD (Ω), kukHD1 , kvkHD1 ≤ r)
4−p p+1 p−2 ˜ L(r) = Cq |Ω| 5 K5,Ω r .
(7.103)
190
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
7.2.2
Second order problems: inner-outer iterations
In this subsection and the next one we consider the boundary value problem (7.82). Now Newton’s method is realized with an inner-outer iteration, i.e. the solutions of the linearized equations in a damped Newton method are obtained numerically using distinct iterations, hence the latter are considered as inner iterations during the outer Newton iteration. (This approach has been introduced in this form in Bank–Rose [35].) In particular, in the inner iterations we propose a preconditioned conjugate gradient method, which is generally the most efficient way of solving the inner symmetric linear equations. Of course, the convergence results for the outer iteration are independent of the actual choice of the inner solver. 1 The iterative sequence is constructed in a finite-dimensional subspace of HD (Ω), since in this way the Lipschitz continuity of the derivative of the coefficient f itself is sufficient to ensure generally the same for the corresponding generalized differential operator. Assumptions 7.9. Let the items (i)-(iv) of Assumptions 7.7 hold and let the (x,η) Jacobians ∂f∂η be Lipschitz continuous w.r. to η. 1 Let V ⊂ HD (Ω) be a finite-dimensional subspace with the inner product (7.84). Let F : V → V denote the operator defined by
hF (u), viHD1 =
Z
Ω
f (x, ∇u) · ∇v
(v ∈ V )
(7.104)
and b ∈ V the element defined by hb, viHD1 =
Z
Ω
gv +
Z
ΓN
γv dσ
(v ∈ V )
(7.105)
(cf. also Remark 7.2). Denote by u∗ ∈ V the solution of hF (u∗ ), viHD1 = hb, viHD1
(v ∈ V ).
(7.106)
The operator F is Gateaux differentiable (cf. Theorem 6.2), and its derivative is given by Z ∂f ′ (x, ∇u) ∇v · ∇z (u, v, z ∈ V ). hF (u)v, ziHD1 = Ω ∂η Since V is finite-dimensional, the operator F ′ inherits the Lipschitz continuity of ∂f /∂η. Let L denote the Lipschitz constant of F ′ . (An estimate for L will be given after Theorem 7.9 in Remark 7.14.) Construction 7.9. Let u0 ∈ V be arbitrary, and let the sequence (un ) ⊂ V be defined by the following inner-outer iteration. (a) The outer iteration defines the sequence un+1 = un + τn pn
(n ∈ N),
(7.107)
where pn ∈ V is the numerical solution of the auxiliary linear problem hF ′ (un )pn , viHD1 = −hF (un ) − b, viHD1
(v ∈ V )
(7.108)
7.2. NEWTON-LIKE METHODS
191
or Z
Ω
Z Z ∂f (x, ∇un ) ∇pn · ∇v = − f (x, ∇un ) · ∇v − gv + γv dσ ∂η Ω ΓN
(v ∈ V ),
further, δn > 0 is some prescribed constant satisfying 0 < δn ≤ δ0 < 1, τn = min{ 1,
(1−δn ) µ1 (1+δn ) Lkpn kH 1
D
} ∈ (0, 1] .
(7.109)
(b) To determine pn in (7.108), the inner iteration defines a sequence (p(k) n ) ⊂ V
(k ∈ N)
using a preconditioned conjugate gradient method. Namely, we choose constants Mn ≥ mn > 0 and a symmetric matrix-valued function Gn ∈ L∞ (Ω, RN ×N ), satisfying σ(G(x)) ⊂ [µ1 , µ2 ] (x ∈ Ω) with µ1 , µ2 from (7.85), such that there holds mn hGn (x)ξ, ξi ≤ h
∂f (x, ∇un (x))ξ, ξi ≤ Mn hGn (x)ξ, ξi ∂η
(7.110)
(for all x ∈ Ω, ξ ∈ RN ), and let Bn : V → V be the corresponding linear operator defined by Z hBn h, viHD1 =
Ω
Gn (x) ∇h · ∇v
(h, v ∈ V ).
(7.111)
Then we consider the preconditioned form of (7.108): Bn−1 F ′ (un )pn = −Bn−1 (F (un ) − b),
(7.112)
and construct the sequence (p(k) n )k∈N by the standard conjugate gradient method (2.66) for the equation (7.112). (That is, in (2.66) we set the system matrix A = F ′ (un ), the right-hand side −(F (un ) − b) and the preconditioner D = Bn .) For convenience we let p(0) n = 0. n) Finally, pn := p(k ∈ V is defined with the smallest index kn for which there n holds the relative error estimate n) kF ′ (un )p(k + (F (un ) − b)kBn−1 ≤ ̺n kF (un ) − bkBn−1 n
(7.113)
with ̺n = (µ1 /µ2 )1/2 δn and δn > 0 defined in (7.109).
Theorem 7.9 Let Assumptions 7.9 be satisfied. Then Construction 7.9 yields the following convergence results:
192
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
(1) There holds
cond Bn−1 F ′ (un ) ≤
Mn mn
(7.114)
and, accordingly, the inner iteration satisfies √ √ !k Mn − m n ′ (k) kF (un ) − bkBn−1 kF (un )pn + (F (un ) − b)kBn−1 ≤ √ √ Mn + m n
(k ∈ N).
Therefore the number of inner iterations for the nth outer step is at most kn ∈ N determined by the inequality √ √ !k Mn − m n n √ ≤ ̺n (7.115) √ Mn + m n (where ̺n = (µ1 /µ2 )1/2 δn ). (2) The outer iteration (un ) satisfies 1 kun − u∗ kHD1 ≤ µ−1 → 0 monotonically 1 kF (un ) − bkHD
with speed depending on the sequence (δn ) up to locally quadratic order. Namely, if δn ≡ δ0 < 1, then the convergence is linear. Further, if δn ≤ const. · kF (un ) − bkγH 1
D
with some constant 0 < γ ≤ 1, then the convergence is locally of order 1 + γ: kF (un+1 ) − bkHD1 ≤ c1 kF (un ) − bk1+γ H1 D
(n ≥ n0 )
with some index n0 ∈ N and constant c1 > 0, yielding also the convergence estimate of weak order 1 + γ kF (un ) − bkHD1 ≤ d1 q (1+γ)
n
(n ∈ N)
with suitable constants 0 < q < 1, d1 > 0. Proof. For fixed n the estimate (7.114) follows from (7.110)–(7.111), and it yields that the inner CGM iteration for the preconditioned auxiliary problem converges at least linearly with ratio √ √ Mn − m n Qn := √ √ Mn + m n in k.kBn norm (see subsection 2.3.1). Hence, using p(0) n = 0 and (2.67)–(2.68) with ′ matrix A = F (un ), right-hand side −(F (un ) − b) and preconditioner D = Bn , the residuals −1 rn(k) := Bn−1 F ′ (un )p(k) n + Bn (F (un ) − b)
(k = 0, 1, ..., kn )
(7.116)
of equation (7.112) satisfy (k) k (0) k −1 = kr −1 . kF ′ (un )p(k) n + (F (un ) − b)kBn n kBn ≤ Qn krn kBn = Qn kF (un ) − bkBn
7.2. NEWTON-LIKE METHODS
193
That is, (7.113) is achieved if Qknn ≤ ̺n , i.e. assertion (1) of our theorem is proved. To verify assertion (2) we use Theorem 5.12. According to our assumptions on f , we obtain as a special case of Theorem 6.1 that the operator F has a uniformly positive derivative. This implies (5.29), further, F ′ inherits the Lipschitz continuity of ∂f , which mean that the assumptions of Theorem 5.12 on F are satisfied. By the ∂η assumption σ(G(x)) ⊂ [µ1 , µ2 ] (x ∈ Ω) the spectral bounds of Bn are also between µ1 and µ2 . Hence (7.113) and Corollary 3.1 imply kF ′ (un )pn(kn ) + (F (un ) − b)kHD1 ≤ (µ2 /µ1 )1/2 ̺n kF (un ) − bkHD1 = δn kF (un ) − bkHD1 . (7.117) Then, in virtue of (7.107)–(7.109), Corollary 5.2 yields the desired estimates of assertion (2). Remark 7.13 The estimate (7.113) in the algorithm can be checked as follows. According to the proof, (7.113) is equivalent to krn(kn ) kBn ≤ ̺n krn(0) kBn
(7.118)
where the functions rn(k) are defined in (7.116). In order to check (7.113), one therefore has to calculate the integrals krn(k) k2Bn
=
hBn rn(k) , rn(k) iHD1
=
Z
Ω
Gn (x) ∇rn(k) · ∇rn(k) .
(7.119)
However, these are the residuals for (7.112) and by Remark 2.1 they are calculated during the preconditioned CGM iteration, hence it needs no extra work to determine them. Remark 7.14 The explicit construction of the algorithm in Theorem 7.9 requires the Lipschitz constant L of F ′ in step (7.109). We give a possible way of estimating L (x, η). using the Lipschitz constant of the Jacobians ∂f ∂η Let Lf denote the Lipschitz constant of
∂f (x, η) ∂η
w.r. to η:
∂f ∂f (x, η (x, η 1) − 2 ) ≤ Lf |η1 − η2 | ∂η ∂η
((x, η) ∈ Ω × RN ).
The derivative of the operator F is given by hF ′ (u)v, ziHD1 =
Z
Ω
∂f (x, ∇u) ∇v · ∇z ∂η
(u, v, z ∈ V ).
Then there holds Z |h(F ′ (u) − F ′ (v))z, zi| = Ω
!
∂f ∂f (x, ∇u) − (x, ∇v) ∇z · ∇z ∂η ∂η
194
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE ≤ Lf
using (7.84), hence
Z
Ω
|∇u − ∇v| |∇z|2 ≤ Lf k∇u − ∇vkL∞ (Ω) kzk2H 1 , D
kF ′ (u) − F ′ (v)kHD1 =
sup |h(F ′ (u) − F ′ (v)) z, zi| ≤ Lf k∇u − ∇vkL∞ (Ω) .
z∈V, kzk 1 =1 H D
Introducing k∇zkL∞ (Ω) , z∈V \0 k∇zkL2 (Ω)
K(V ) = sup
(7.120)
1 which is finite since V is a finite-dimensional subspace of HD (Ω), we obtain
kF ′ (u) − F ′ (v)kHD1 ≤ Lf K(V ) ku − vkHD1 , that is, the Lipschitz constant L of the operator F ′ satisfies L ≤ Lf K(V ) . In particular, if V = Vh is the FEM subspace consisting of piecewise linear functions, then K(V ) = K(Vh ) has a very simple bound. Namely, let Tj (j = 1, ..., s) denote the elements corresponding to Vh , i.e. Ω = ∪sj=1 Tj is a disjoint subdivision. Then for any z ∈ Vh and j = 1, ..., s |∇z|2 |int Tj ≡ const. , hence, denoting by vol(Tj ) the volume (or area) of Tj , we have 2
2
sup |∇z| = max |∇z| Ω
j
|int Tj
≤
s X
j=1
2
|∇z|
|int Tj
s X 1 ≤ vol(Tj )|∇z|2 |int Tj min vol(Tj ) j=1 j
Z s Z X 1 1 2 |∇z| = |∇z|2 . = min vol(Tj ) j=1 Tj min vol(Tj ) Ω j
j
This yields the estimate K(Vh )2 ≤
1 . min vol(Tj ) j
If, for instance, {Tj }j=1,...,s is a uniform orthogonal isosceles triangulation with parameter h of a domain Ω ⊂ R2 , then vol(Tj ) = h2 /2 for all j and hence √ 2 K(Vh ) ≤ . h In this case
√
2 Lf . h (In general similar estimates for K(Vh ) are found in Ciarlet [72], Chapter 3.2, and for L in elasticity systems in Blaheta [45].) L≤
7.2. NEWTON-LIKE METHODS
195
Remark 7.15 A main advantage of Newton’s method over the simple iterations is quadratic (or order 1 + γ) convergence. Consequently, as mentioned in Remark 7.11, the number of Newton iterations n to achieve kun − u∗ kHD1 ≤ ε
(7.121)
for some prescribed tolerance ε > 0 satisfies n = O(log log ε)
as ε → 0,
(7.122)
whereas for simple iterations the same error (7.98) requires the number of iterations n = O(log ε)
as ε → 0.
(7.123)
According to Theorem 7.9, superlinear convergence can be preserved in an innerouter iteration if δn → 0, and the rate of the latter determines the rate of overall convergence. In particular, if δn ≤ const. · kF (un ) − bkHD1 then the convergence is quadratic: n
kF (un ) − bkHD1 ≤ const. · q 2
(n ∈ N)
(7.124)
with some 0 < q < 1. We observe that the required rate for δn is n
δn ≤ const. · q 2
(n ∈ N).
(7.125)
In the nth outer Newton step the number of inner iterations kn ∈ N is determined by (7.115): it is the smallest integer for which √ √ !k Mn − m n n n √ ≤ (µ1 /µ2 )1/2 δn ≤ const. · q 2 (7.126) √ Mn + m n if we want to satisfy (7.125). It is easy to see that √ √ Mn − m n Mn √ −1 =O √ mn Mn + m n as Mn /mn → 1, hence (7.126) can be replaced by the requirement
kn
Mn −1 mn
n
≤ const. · q 2 .
(7.127)
This implies that assuming Mn n − 1 ≤ const. · Q2 mn
(n ∈ N)
(7.128)
for some 0 < Q < 1, the numbers kn remain bounded as n → ∞. Hence the total number of inner-outer iterations remains proportional to that of the outer ones, and consequently it satisfies (7.122) when a tolerance ε > 0 is prescribed.
196
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
We underline that the converse is also true in the following sense: if Mn − 1 ≥ Q0 > 0 mn
(n ∈ N)
(7.129)
for some Q0 independent of n, then the overall number of iterations in the inner-outer algorithm is as in (7.123) if we want to achieve (7.124) via (7.125). Namely, (7.127) and (7.129) imply n Qk0n ≤ const. · q 2 , i.e. kn ≥ const. · 2n . Hence the total number ntot of inner-outer iterations until the nth outer iteration is ntot =
n X
j=1
kj ≥ const. ·
n X
j=1
2j = O(2n ) .
(7.130)
The number n of outer iterations satisfies (7.122), or equivalently 2n = O(log ε), hence from (7.130) ntot = O(log ε), which is the same as (7.123) for simple iterations. Altogether, we obtain that under the condition (7.129) the total number of innerouter iterations to achieve a prescribed error is of the same magnitude as for simple iterations. (This is the case in particular if Bn ≡ B is a fixed preconditioner for all n.) In other words, the inner-outer Newton iteration yields faster convergence than simple iterations if and only if the spectral bounds mn and Mn in (7.110) satisfy Mn → 1, mn which means that the preconditioners become asymptotically the exact Jacobians as n → ∞.
7.2.3
Second order problems: variable preconditioning
We consider the boundary value problem (7.82) again. Now Newton’s method is realized in the setting of variable preconditioning: the derivative operators are stepwise replaced by their suitable approximations, which still provide appropriate contractivity. (This gives a kind of quasi-Newton method.) The method is the application of the normed space result of subsection 5.3.3. Similarly to the previous subsection, the iterative sequence is constructed in a 1 finite-dimensional subspace of HD (Ω), since in this way the Lipschitz continuity of the derivative of the coefficient f itself is sufficient to ensure generally the same for the corresponding generalized differential operator.
7.2. NEWTON-LIKE METHODS
197
Assumptions 7.10. Let the items (i)-(iv) of Assumptions 7.7 hold and let the (x,η) Jacobians ∂f∂η be Lipschitz continuous in η. 1 Let V ⊂ HD (Ω) be a finite-dimensional subspace, further, let the operator F : V → V , the element b ∈ V and the weak solution u∗ ∈ V be defined as in formulas (7.104)–(7.106) in the previous subsection.
Construction 7.10. Let u0 ∈ V be arbitrary, and let (un ) ⊂ V be the sequence defined as follows. If, for n ∈ N, un is obtained, then we choose constants Mn ≥ mn > 0 and a symmetric matrix-valued function Gn ∈ L∞ (Ω, RN ×N ) for which there holds mn hGn (x)ξ, ξi ≤ h
∂f (x, ∇un (x))ξ, ξi ≤ Mn hGn (x)ξ, ξi ∂η
(x ∈ Ω, ξ ∈ RN ),
(7.131) further, Mn /mn and τn ∈ (0, 1] satisfy the conditions (iii)-(iv) in Theorem 5.16 with λ = µ1 and L denoting the Lipschitz constant of F ′ . We define un+1 = un −
2τn zn , Mn + m n
where zn ∈ V is the solution of Z
Ω
Gn (x) ∇zn · ∇v =
Z Ω
f (x, ∇un ) · ∇v − gv −
Z
ΓN
(7.132)
γv dσ
(v ∈ V ).
(7.133)
Theorem 7.10 Let Assumptions 7.10 be satisfied. Then Construction 7.10 yields
cond Bn−1 F ′ (un ) ≤
Mn mn
(n ∈ N),
(7.134)
where Bn is defined as in (7.111), and accordingly, the sequence (un ) defined by Construction 7.10 converges to u∗ with at least linear speed, namely, lim sup
kF (un+1 ) − bk∗ Mn − m n ≤ lim sup <1 kF (un ) − bk∗ Mn + m n
(7.135)
(with the k.k∗ norm defined as in (5.110)). Moreover, the speed of convergence depends on the sequence (Mn /mn ) up to locally quadratic order: if Mn /mn ≤ 1 + const. · kF (un ) − bkγH 1 D
with some constant 0 < γ ≤ 1, then the convergence of (un ) is locally of order 1 + γ: kF (un+1 ) − bkHD1 ≤ c1 kF (un ) − bk1+γ H1
(n ≥ n0 )
D
with some index n0 ∈ N and constant c1 > 0, yielding also the convergence estimate of weak order 1 + γ (1+γ) 1 ≤ d1 q kun − u∗ kHD1 ≤ µ−1 1 kF (un ) − bkHD
with suitable constants 0 < q < 1, d1 > 0.
n
(n ≥ n0 )
(7.136)
198
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
Proof. Similarly to Theorems 7.7–7.9, we obtain as a special case of Theorem 6.1 that the operator F has a uniformly positive derivative, further, F ′ inherits the Lipschitz continuity of ∂f , which mean that the assumptions of Theorem 5.16 on F ∂η are satisfied. Defining Bn as in (7.111), the estimate (7.134) follows similarly as does (7.114) in Theorem 7.9. Further, Construction 7.10 yields that the sequence (5.111) coincides with (7.132)–(7.133). Then Theorem 5.16 and Corollary 5.7 yield the desired estimates.
7.2.4
Other problems
The realizations of damped Newton iterations, developed in the preceding two subsections, can be generalized from (7.82) to other boundary value problems. This generalization goes in a similar way as has been carried out in detail for the simple iterations in subsection 7.1.2 and for the damped Newton method in Theorem 7.8. Since this is mostly straightforward, it is only given in detail for fourth order Dirichlet problems and left to the reader for the other cases. Let us consider the fourth order Dirichlet problem formulated in (7.71) with a matrix-valued function A: T (u) ≡ div2 A(x, D 2 u) = g(x) u |∂Ω =
∂u | = ∂ν ∂Ω
0.
(7.137)
We develop the discussed two kinds of iterative sequences simultaneously. Assumptions 7.11. Let the items (i)-(iv) of Assumptions 7.6 (made for problem be Lipschitz continuous in Θ. (7.71)) hold, and let the Jacobians ∂A(x,Θ) ∂Θ Let V ⊂ H02 (Ω) be a finite-dimensional subspace with the inner product hu, viH02 :=
Z
Ω
D2 u · D2 v.
(7.138)
Let F : V → V denote the operator defined by hF (u), viH02 =
Z
Ω
A(x, D2 u) · D2 v
(v ∈ V )
(7.139)
and b ∈ V the element defined by hb, viH02 =
Z
Ω
gv
(v ∈ V ).
(7.140)
Denote by u∗ ∈ V the solution of hF (u∗ ), viH02 = hb, viH02
(v ∈ V ).
(7.141)
Construction 7.11. Let u0 ∈ V be arbitrary. We define simultaneously two sequences, both denoted by (un ) (which causes no ambiguity).
7.2. NEWTON-LIKE METHODS
199
Assume that un is constructed for some n ∈ N. We choose constants Mn ≥ mn > 0 and a symmetric array-valued function Gn ∈ L∞ (Ω, RN ×N ), satisfying σ(Gn (x)) ⊂ [µ1 , µ2 ] (x ∈ Ω) with µ1 , µ2 from (7.72), such that there holds mn hG(x)Φ, Φi ≤ h
∂ A(x, D2 un (x)) Φ, Φi ≤ Mn hG(x)Φ, Φi ∂Θ
(x ∈ Ω, Φ ∈ RN ×N )
(7.142)
and let Bn : V → V be the corresponding linear operator defined by hBn h, viH02 =
Z
Ω
Gn (x) D2 h · D2 v
(h, v ∈ V ).
(7.143)
Then un+1 is defined in one of the following two ways. (1) In the first case, an inner-outer iteration is used. (a) The outer iteration step defines the element un+1 = un + τn pn
(n ∈ N),
(7.144)
where pn is the numerical solution of the linear auxiliary problem hF ′ (un )pn , viH02 = −hF (un ) − b, viH02
(v ∈ V ),
(7.145)
further, δn > 0 is some prescribed constant satisfying 0 < δn ≤ δ0 < 1, τn = min{ 1,
(1−δn ) µ1 (1+δn ) Lkpn kH 2 0
} ∈ (0, 1] (7.146) ′
where L denotes the Lipschitz constant of F . (b) To determine pn in (7.145), an inner iteration is defined. Namely, we consider the preconditioned form of (7.145): Bn−1 F ′ (un )pn = −Bn−1 (F (un ) − b),
(7.147)
and construct a sequence (p(k) n )k∈N by the standard conjugate gradient method (2.66) for the equation (7.147). (That is, in (2.66) we set the system matrix A = F ′ (un ), the right-hand side −(F (un )−b) and the preconditioner D = Bn .) For convenience we let p(0) n = 0. n) Then pn := p(k ∈ V is defined with the smallest index kn for which there n holds the relative error estimate n) kF ′ (un )p(k + (F (un ) − b)kBn−1 ≤ ̺n kF (un ) − bkBn−1 n
(7.148)
with ̺n = (µ1 /µ2 )1/2 δn and δn > 0 defined in (7.146). (2) In the second case the operator Bn is used as variable preconditioner for the nth outer step as follows. Let τn ∈ (0, 1], assume that τn and the numbers Mn ,mn
200
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE in (7.142) satisfy the conditions (iii)-(iv) in Theorem 5.16 with λ = µ1 and L denoting the Lipschitz constant of F ′ . We define un+1 = un −
2τn zn , Mn + m n
(7.149)
where zn ∈ V is the solution of Z
Ω
Gn (x) D2 zn · D2 v =
Z
Mn mn
− 1.
In this case we let δn =
Ω
A(x, D2 un ) · D2 v − gv
(v ∈ V ).
(7.150)
Theorem 7.11 Let Assumptions 7.11 be satisfied. Then the sequence (un ) defined by Construction 7.11 satisfies kun − u∗ kH02 ≤ µ−1 1 kF (un ) − bkH02 → 0 monotonically with speed depending on the sequence (δn ) up to locally quadratic order. Namely, if δn ≤ δ0 < 1, then the convergence is at least linear with quotient q = lim sup δn
or
q = lim sup
Mn − m n Mn + m n
in cases (1) or (2), respectively. Further, if δn ≤ const. · kF (un ) − bkγH 2 with some 0 constant 0 < γ ≤ 1, then the convergence is locally of order 1 + γ: kF (un+1 ) − bkH02 ≤ c1 kF (un ) − bk1+γ H2 0
(n ≥ n0 )
with some index n0 ∈ N and constant c1 > 0, yielding also the convergence estimate of weak order 1 + γ kF (un ) − bkH02 ≤ d1 q (1+γ)
n
(n ≥ n0 )
with suitable constants 0 < q < 1, d1 > 0. (Further, there holds (7.115) in the case (1b) for the inner iteration.) Proof. The proof goes in the same way as those of Theorems 7.9–7.10, now replacing the operators (7.104) and (7.111) by (7.139) and (7.143), respectively. Remark 7.16 The norms in (7.148) and the Lipschitz constant L can be computed similarly to Remarks 7.13–7.14.
7.3. PRECONDITIONING AND SOBOLEV GRADIENTS
7.3
201
Preconditioning and Sobolev gradients
In this section we relate the preceding preconditioning methods of this chapter to the famous class of Sobolev gradient methods. This context may help the understanding of these methods on a common basis, and also shows that the concept of preconditioning is able to provide a general framework to discuss iterative methods. We will rely on the theoretical considerations of section 5.4. The Sobolev gradient approach has been developed in a series of publications of Neuberger (see e.g. [227]–[230]), and with focus on weighted Sobolev spaces it is discussed by Mahavier [201]. For a summary the reader is referred to the monograph of Neuberger [231]. In the Sobolev gradient approach the iteration is constructed as a gradient (steepest descent) method for a suitable functional. The main principle of Sobolev gradients is that preconditioning can be obtained via a change of inner product to determine the gradient of the functional. In particular, a sometimes dramatic improvement can be achieved by using the Sobolev inner product instead of the original L2 one. This change appears in the iterative sequence as preconditioning by the (discretization of the) minus Laplacian or, more generally, by the operator Su ≡ −∆u+cu. (We note that a related idea is used in the so-called H 1 -methods, see Carey–Jiang [68], Richardson [254].) The focus in these works is on least-square functionals for general operators; however, here we will consider potential operators.) Now we put the preceding results of this chapter in this context. First, fixed preconditioners are considered. We start with explaining the Sobolev gradient idea with the usual H01 inner product, leading to Laplacian preconditioners. Then the fixed preconditioners of section 7.1 can be regarded as Sobolev gradients w.r. to a fixed weighted inner product in the Sobolev space. Second, more generally, if we allow the stepwise change of inner product then we obtain variable preconditioners, including the Newton iteration, as gradients w.r. to a variable inner product in the Sobolev space. (See Nashed [220], Ortega-Rheinboldt [238] for related ideas.) That is, the preconditioners of section 7.2 are derived as suitable gradients, and in particular Newton’s method can be regarded as an optimal extreme case of variable steepest descent. (a) Fixed preconditioners For ease of presentation we consider the Dirichlet problem (7.3) in subsection 7.1.1: (
T (u) ≡ −div f (x, ∇u) = g(x) u|∂Ω = 0,
(7.151)
and analyse Theorem 7.1 in the context of gradients. The Assumptions 7.1 on (7.3) are also imposed for (7.151), i.e. the C 1 function f has symmetric and uniformly elliptic Jacobians with some uniform spectral bounds µ2 ≥ µ1 > 0. The assumptions on f imply the existence of a function ψ : Ω × RN → R with ∂ψ (x, η) = f (x, η), ∂η
202
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
hence we can introduce the functional φ : H01 (Ω) → R, φ(u) ≡
Z
Ω
ψ(x, ∇u)
(7.152)
as in Remark 5.5. Then the directional derivatives of φ satisfy Z ∂φ (u) = f (x, ∇u) · ∇v ∂v Ω
(u, v ∈ H01 (Ω)).
(7.153)
Using the divergence theorem, we can write (7.153) as Z ∂φ (u) = T (u)v ∂v Ω
(u, v ∈ H 2 (Ω) ∩ H01 (Ω)).
(7.154)
Since for fixed u the linear functional v 7→ Ω T (u)v is bounded in L2 (Ω), we obtain by definition that φ is Gateaux differentiable as a functional from L2 (Ω) to R (with the dense domain D(φ) = D(T ) = H 2 (Ω) ∩ H01 (Ω)). Further, (7.154) gives that the L2 -gradient is φ′ (u) = T (u) (u ∈ H 2 (Ω) ∩ H01 (Ω)). (7.155) R
The L2 -gradient (7.155) will now be replaced by Sobolev gradients in two steps. First, we start with explaining the Sobolev gradient idea with the usual H01 inner product, which leads to Laplacian preconditioners. Then, more generally, the preconditioners in section 7.1 will be discussed as Sobolev gradients w.r. to a fixed weighted inner product in the Sobolev space. Sobolev gradients and Laplacian preconditioners. Let the space H01 (Ω) be endowed with the usual inner product hu, viH01 =
Z
Ω
∇u · ∇v
(7.156)
and let us consider the generalized differential operator F : H01 (Ω) → H01 (Ω) corresponding to (7.151), i.e. hF (u), viH01 =
Z
Ω
f (x, ∇u) · ∇v
(u, v ∈ H01 (Ω))
(7.157)
(cf. also Remark 7.2). Then, using the above arguments, φ is Gateaux differentiable as a functional from H01 (Ω) to R, and by (7.153) and (7.157) the H01 -gradient is φ′H01 (u) = F (u)
(u ∈ H01 (Ω)).
(7.158)
The space H01 (Ω) with the inner product (7.156) is the energy space of the operator −∆ defined on D(−∆) = H 2 (Ω) ∩ H01 (Ω), since by the divergence theorem Z
Ω
∇u · ∇v = −
Z
Ω
(∆u) v
(u ∈ H 2 (Ω) ∩ H01 (Ω)).
Hence, according to the notation (5.120) in section 5.4, we can write (7.158) as φ′H01 (u) = φ′−∆ (u) = F (u)
(u ∈ H01 (Ω)).
(7.159)
7.3. PRECONDITIONING AND SOBOLEV GRADIENTS
203
Further, following part (2) of Theorem 7.1, under the regularity assumption that Ω is C 2 -diffeomorphic to a convex domain, we have the decomposition F|H 2 ∩H01 = (−∆)−1 T
(7.160)
(see also Remark 7.2). Hence (7.159) can be replaced by φ′−∆ (u) = (−∆)−1 T (u)
(u ∈ H 2 (Ω) ∩ H01 (Ω)).
(7.161)
That is, the modified gradient (7.159) is expressed as the formally preconditioned version of the original one (7.155). The steepest descent iteration corresponding to the gradient (7.158) (and with constant stepsize) is the preconditioned sequence in (7.10) with the auxiliary operator S = −∆. Hence Theorem 7.1 yields cond(F ) = cond(−∆−1 T ) ≤
µ2 µ1
(7.162)
and the corresponding convergence quotient q=
µ2 − µ1 , µ2 + µ1
(7.163)
where µ2 ≥ µ1 > 0 are the uniform spectral bounds of the Jacobians of f . On the other hand, the steepest descent iteration corresponding to (7.155) would give no convergence, since cond(T ) = ∞ (see (5.68). The above considerations mean that Sobolev gradient idea, i.e. the transition from L2 -gradient (7.155) to the H01 -gradient (7.158), gives a fundamental improvement in the convergence of the corresponding steepest descent iteration. Remark 7.17 We note that, by Remark 5.19, the condition number cond(T ) = ∞ has to be compensated by an unbounded linear preconditioning operator to achieve a finite condition number for the (therefore bounded) preconditioned operator. Hereby it is natural to involve the Laplacian as an unbounded preconditioner for the elliptic operator T , since it is another elliptic operator of the simplest form. Accordingly, the Sobolev gradient idea is natural in the sense that it uses the inner product in which φ has a bounded gradient. Remark 7.18 There exist (at least) three approaches in literature that lead to (discrete) Laplacian preconditioners. Naturally, these are different formulations of a common feature and all of them appear in the above considerations. Let us list them again: (i) (Sobolev gradients.) The Laplacian generates the energy space H01 (Ω) in which the gradient of the potential φ (corresponding to T ) is bounded. The latter is essentially because φ depends on ∇u. The construction of the H01 -gradient involves the generator −∆. (Cf. Neuberger [231].) (ii) The generalized differential operator F in (7.157) is bounded and differentiable in H01 (Ω), and it has a finite condition number in contrast to T . Hence the simple
204
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
iteration for F yields suitable convergence in H01 (Ω) (Gajewski–Gr¨oger–Zacharias [127], Necas [223]). The constructive form of this iteration involves the decomposition (7.160) and hence contains the Laplacian preconditioner. (iii) In order to compensate the unboundedness of the nonlinear operator T to achieve a finite condition number, the Laplacian is the most natural linear elliptic operator in similar divergence form. (Cf. Axelsson–Gustafsson [17], Carey–Jiang [68].)
General preconditioners as weighted Sobolev gradients . Following subsection 7.1.1, let us now endow the space H01 (Ω) with the weighted inner product hu, viG :=
Z
Ω
G(x) ∇u · ∇v ,
(7.164)
where the matrix-valued function G ∈ L∞ (Ω, RN ×N ) is as in (7.4), i.e. it is uniformly spectrally equivalent to the Jacobians of f . Denote by M ≥ m > 0 these spectral equivalence bounds. (The inner product h., .iG is equivalent to the usual one (7.156)). Let us consider the generalized differential operator F : H01 (Ω) → H01 (Ω) under the inner product (7.164), i.e. hF (u), viG =
Z
Ω
f (x, ∇u) · ∇v
(u, v ∈ H01 (Ω)).
(7.165)
(Cf. also Remark 7.2. We note that here F depends on the choice of G, which is e.g. shown constructively for the strong forms (7.18) and (7.160). Hence it would be more precise to write F(G) instead of F to indicate the dependence. However, since G is fixed in the study, we simply write F which causes no misunderstanding just as in section 7.1.) Then, similarly as above, the functional φ : H01 (Ω) → R in (7.152) is Gateaux differentiable, and the H01 -gradient w.r. to the inner product h., .iG is φ′G (u) = F (u)
(u ∈ H01 (Ω))
(7.166)
with F in (7.165). Now the space H01 (Ω) with the inner product (7.164) is the energy space of the operator Su ≡ −div (G(x)∇u), (7.167) therefore, analogously to the notation (7.159), we can write (7.166) as φ′G (u) = φ′S (u) = F (u)
(u ∈ H01 (Ω)).
(7.168)
Further, under the regularity assumptions in part (2) of Theorem 7.1, we have the decomposition (7.18) analogously to (7.160), and hence (7.168) can be replaced by φ′G (u) = φ′S (u) = S −1 T (u)
(u ∈ D(S)).
(7.169)
That is, the modified H01 -gradient (7.168) is expressed as the formally preconditioned version of the original L2 -gradient (7.155).
7.3. PRECONDITIONING AND SOBOLEV GRADIENTS
205
The steepest descent iteration corresponding to the gradient (7.166) (and with constant stepsize) is the preconditioned sequence in (7.10), hence the achieved condition number and corresponding convergence quotient, respectively, are cond(F ) = cond(S −1 T ) ≤
M , m
q=
M −m M +m
using Theorem 7.1, where M ≥ m > 0 are the uniform spectral bounds of the Jacobians of f w.r. to G(x). This means that (7.162) and (7.163) are further improved if, using the weight G(x), these spectral bounds m and M are better than the original ones µ1 and µ2 . Altogether, we can consider (7.166) or (7.169) as weighted Sobolev gradients. This yields a finite condition number for any weight G(x) in the given class, in contrast to cond(T ) = ∞. Looking for G(x) as a uniform approximation of the Jacobians of f , the preconditioning operator S may lead to better conditioning properties than the special case −∆. (See also Mahavier [201] for a discussion on weighted Sobolev gradients.) Remark 7.19 A special case of the above considerations is when the original operator is also linear. Then the Sobolev gradient idea is related to the quadratic functional. Preconditioning for linear equations has been applied in the inner iteration for Newton’s method in subsection 7.2.2. There the auxiliary linear equation in the nth outer step is F ′ (un )p = −(F (un ) − b). (7.170) The corresponding quadratic functional is 1 φ(p) = hF ′ (un )p, pi + hF (un ) − b, pi. 2 For the Dirichlet problem (7.151) this takes the form Z 1 Z ∂f (x, ∇un ) ∇p · ∇p + f (x, ∇un ) · ∇p − gp φ(p) = 2 ∂η Ω
Ω
(p ∈ H01 (Ω)). (7.171)
The preconditioned form of (7.170) was defined in (7.112) by Bn−1 F ′ (un )p = −Bn−1 (F (un ) − b),
(7.172)
where the operator Bn in (7.111) corresponds to a weight matrix Gn (x). (Note that here n is fixed.) In this case the preconditioned form (7.172) can be achieved as the Sobolev gradient of the quadratic functional (7.171) w.r. to the inner product hu, viGn = in H01 (Ω).
Z
Ω
Gn (x) ∇u · ∇v
206
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
(b) Variable preconditioners Let us consider the Dirichlet problem (7.151) again. As a generalization of paragraph (a), we may allow the stepwise change of inner products (7.164) during the iteration. In this way variable Sobolev gradients can be constructed, and we arrive at the results of subsection 7.2.3 in the context of gradients. We have seen in paragraph (a) that it is natural to use elliptic differential operators as linear preconditioning operators for T , since these are unbounded (which is necessary to compensate cond(T ) = ∞) and are of similar type as T . Hence the choice of preconditioning operator was done in the class of linear elliptic operators by varying the weight G(x). Having fixed this scope, it will be more convenient below to use the weak form of the elliptic operators. Assume that the nth term of an iterative sequence is constructed, and let Gn ∈ L (Ω, RN ×N ) be a matrix-valued function which is spectrally equivalent to the Jacobian ∂f (x, ∇un ). Denote by Mn ≥ mn > 0 these spectral equivalence bounds. The ∂η matrix Gn (x) defines the weighted inner product ∞
hu, viGn :=
Z
Ω
Gn (x) ∇u · ∇v
(7.173)
in H01 (Ω). The relation of (7.173) to the original inner product (7.156) is as follows. Endowing with (7.156) we can define a linear operator Bn : H01 (Ω) → H01 (Ω) by
H01 (Ω)
hBn h, viH01 =
Z
Ω
(h, v ∈ H01 (Ω)).
Gn (x) ∇h · ∇v
Then (7.173) is the energy norm of Bn . Let F denote the generalized differential operator (7.157) in the original inner product. By (7.158), F is the H01 -gradient of φ. Then the gradient φ′Gn in the inner product (7.173) is related to F by the relation φ′Gn (u) = Bn−1 F (u)
(u ∈ H01 (Ω)).
(7.174)
(This product form is the analogue of (7.169) and follows from the general case (5.129).) The relation (7.174) means that the sequence un+1 = un −
2τn Bn−1 (F (un ) − b) Mn + m n
(7.175)
in (7.132)–(7.133) is a variable gradient (steepest descent) method corresponding to φ such that in the nth step the gradient of φ is taken w.r. to the inner product h., .iGn .
According to Theorem 7.10, the assumed stepwise spectral equivalence of Gn (x) (x, ∇un ) yields the following condition numbers and convergence ratio: and ∂f ∂η
cond Bn−1 F ′ (un ) ≤
Mn mn
(n ∈ N),
q = lim sup
Mn − m n . Mn + m n
(7.176)
7.4. SOME MORE ITERATIVE METHODS IN SOBOLEV SPACE
207
In particular, superlinear convergence can be obtained and its speed is determined by the rate as Mn /mn → 1. (We note that conditioning requires us to push Mn /mn close to 1, whereas the final choices of Gn also involve the easy solvability of the auxiliary problems, which is in general an opposite aspect for the choice of preconditioner.) (c) Newton’s method as an optimal variable steepest descent The above discussion on variable preconditioners allows us to regard Newton’s method as optimal in the context of steepest descent in a more general sense than for a usual gradient method. This idea has been suggested in section 5.4 for the Hilbert space setting. We now present it for the Dirichlet problem (7.151). The usual gradient method defines an optimal descent direction when a fixed inner product is used. In contrast, let us now extend the search for an optimal descent direction by allowing the stepwise change of inner product. For the latter the possible choices are as above in the weighted form (7.173), where Gn (x) is a uniformly elliptic matrix-valued function, and hence (7.173) is equivalent to the usual H01 inner product. Further, as mentioned after (7.176), the resulting convergence properties are determined by the quotients Mn /mn (as n → ∞), where Mn ≥ mn > 0 are the spectral equivalence bounds of the Jacobian ∂f (x, ∇un ) w.r. to G(x). ∂η Then the choice ∂f Gn (x) = (x, ∇un ) , ∂η which generates the inner product hv, ziGn :=
Z
Ω
∂f (x, ∇un ) ∇v · ∇z ∂η
for the nth step Sobolev gradient, is an extreme case of variable preconditioner that yields optimal equivalence bounds mn = Mn = 1, and the corresponding variable gradient iteration (7.175) becomes the damped Newton method with quadratic convergence. Roughly speaking, this means that the descents in the gradient method are steepest w.r. to different directions, whereas in Newton’s method they are steepest w.r. to different directions and inner products. Remark 7.20 (i) The Newton iteration is usually coupled with inner iterations for the linearized equations, as is done in section 7.2.3. The Sobolev gradient idea for these inner iterations has been sketched in Remark 7.19. (ii) A different interpretation of Newton’s method in Sobolev gradient context uses minimization subject to constraints, see Neuberger [231], Chapter 7.
7.4 7.4.1
Some more iterative methods in Sobolev space The nonlinear conjugate gradient method
The Sobolev space version of the nonlinear conjugate gradient method is based on the Hilbert space analogue of CG methods, quoted at the end of subsection 5.2.1. We
208
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
present the Sobolev space application of the Daniel iteration [79]. For some other related algorithms we refer to [6, 39]. We consider the boundary value problem (7.82) similarly as in section 7.2, under 1 the items (i)-(iv) of Assumptions 7.7. The corresponding Sobolev space HD (Ω), defined in (7.83), is endowed with the inner product hu, viHD1 :=
Z
Ω
1 (u, v ∈ HD (Ω)).
∇u · ∇v
(7.177)
The Hilbert space version of the method is established in Theorem 5.7. In order to ensure the (thereby assumed) twice differentiability of the generalized differential operator simply by that of the nonlinearity f , we apply Theorem 5.7 in a finite-dimen1 sional subspace V ⊂ HD (Ω), endowed with the same inner product (7.177). Similarly to subsections 7.2.2–7.2.3, let F : V → V denote the operator defined by hF (u), viHD1 =
Z
Ω
f (x, ∇u) · ∇v
(v ∈ V )
(7.178)
and b ∈ V the element defined by hb, viHD1 =
Z
Ω
gv +
Z
ΓN
(v ∈ V )
γv dσ
(7.179)
(cf. also Remark 7.2). Denote by u∗ ∈ V the unique solution of the problem hF (u∗ ), viHD1 = hb, viHD1
(v ∈ V ).
(7.180)
The CG iteration constructs a sequence (un ) ⊂ V together with (pn ) ⊂ V and the residuals rn = b − F (un ) ∈ V as follows. Let u0 ∈ V be arbitrary. Then r0 ∈ V is the solution of the problem Z
Ω
∇r0 · ∇v = −
Z
Ω
(f (x, ∇u0 ) · ∇v − gv) +
Z
ΓN
γv dσ
(v ∈ V )
(7.181)
and p0 = r0 . If, for n ∈ N, un and pn are obtained, then un+1 := un + cn pn , where cn is the smallest positive root of equation hF (un + cpn ) − b, pn iHD1 = 0, further, rn+1 ∈ V is the solution of the problem Z
Ω
∇rn+1 · ∇v = −
Z
Ω
(f (x, ∇un+1 ) · ∇v − gv) +
Z
ΓN
γv dσ
(v ∈ V );
(7.182)
finally, pn+1 := rn+1 + bn pn with bn = −αn /βn where αn =
Z
Ω
∂f (x, ∇un+1 ) ∇pn · ∇rn+1 , ∂η
βn =
Z
Ω
∂f (x, ∇un+1 ) ∇pn · ∇pn . ∂η
7.4. SOME MORE ITERATIVE METHODS IN SOBOLEV SPACE
let
209
The convergence results are formulated using the following notations: for any n ∈ N 1/2
εn := hF ′ (un )−1 rn , rn iH 1 , D
further, let m = µ1 from (7.85), d :=
B M3
M , ηn 2m √ M ε . m(1−qn ) n
3+
qn := (q 2 + σn )1/2 , Rn :=
:=
and √
M = µ2
MB ε , 2m2 n
σn :=
(7.183)
ηn 4mM (M +m)2 1+ηn
+ dεn , q :=
M −m , M +m
Theorem 7.12 Let the items (i)-(iv) of Assumptions 7.7 hold for the boundary value problem (7.82). Denote by u∗ ∈ V the unique weak solution as in (7.180). Further, assume that f has a bounded second derivative, i.e. there exists L > 0 such that the arrays ∂ 2 f (x, η) ∈ RN ×N ×N ∂η 2 (of the corresponding trilinear mappings) satisfy
∂ 2 f (x, η)
≤ L 2
∂η
(x ∈ Ω, η ∈ RN ).
(7.184)
Then for arbitrary u0 ∈ V the above constructed CG sequence satisfies following convergence results: (1) Let N0 ∈ N be such that RN0 < R and σN0 < 1 − q 2 . Then kun − u∗ kHD1 ≤ RN0 · qN0 · qN0 +1 . . . qn−1
(n > N0 ) .
(Note that lim qn = q). (2) Let N0 be as in (1). Then for any m > N0 there exists Nm ∈ N such that √ √ !2m M − m εn+m ≤ 4 √ + δn εn (n > Nm ) √ M+ m
where lim δn = 0. (Note that εn is equivalent to kF (un ) − bkHD1 and kun − u∗ kHD1 .) Proof. The assumptions (i)-(iv) of Theorem 5.7 have to be checked. It follows in the standard way (similarly as e.g. in section 6.1) that F is twice differentiable and there hold hF ′ (u)v, ziHD1 =
Z
Ω
∂f (x, ∇u) ∇v · ∇z , ∂η
(7.185)
∂2f (x, ∇u) (∇v, ∇w, ∇z) Ω ∂η 2 for any u, v, w, z ∈ V , further, F ′ is bihemicontinuous. Then the symmetry and uniform ellipticity assumption (ii) of Theorem 5.7 follows from (7.185) and (7.85). Further, (7.120) in Remark 7.14 and (7.184) yield ′′
hF (u)(v, w), ziHD1 =
Z
|hF ′′ (u)(v, w), ziHD1 | ≤ Lk∇vkL∞ (Ω) k∇wkL2 (Ω) k∇zkL2 (Ω)
210
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
for all u, v, w, z ∈ V , i.e.
≤ LK(V )k∇vkL2 k∇wkL2 k∇zkL2 kF ′′ (u)k ≤ LK(V ) .
Finally, by the assumptions, F has a potential Ψ(u) =
Z
Ω
ψ(x, ∇u)
(u ∈ V ),
where ∂η ψ = f . Then the functional φ(u) = Ψ(u) −
Z
Ω
gu +
Z
ΓN
γu dσ
satisfies φ′ (u) = F (u) − b. The uniform ellipticity of F , verified for assumption (ii), yields that hφ′′ (u)v, viHD1 ≥ mkvk2H 1 (u, v ∈ V ), D
hence lim φ(u) = ∞
kuk→∞
(like e.g. in Theorem 5.4). Hence the level sets of φ are bounded, i.e for any u0 ∈ V they are contained in some ball, which is suitable to satisfy the required assumptions (iii) and (iv) of Theorem 5.7. Remark 7.21 The above CG algorithm can be generalized using an equivalent inner 1 product in HD (Ω) similarly as for the simple iterations in section 7.1. This means that a preconditioned iteration is defined in the Sobolev space. Namely, if G ∈ L∞ (Ω, RN ×N ) is a symmetric matrix-valued function satisfying (7.4) and we use the inner product (7.6), then we have to set G as a weight matrix on the left-hand sides of equations (7.181) and (7.182). That is, (7.182) is replaced by Z
Ω
G(x) ∇rn+1 · ∇v = −
Z
Ω
(f (x, ∇un+1 ) · ∇v − gv) +
Z
ΓN
γv dσ
(v ∈ V ) (7.186)
(and similarly, the rewritten form of (7.181) becomes (7.186) with index 0 instead of n + 1). Then the obtained iteration satisfies Theorem 7.12 with m and M from (7.4) instead of (7.183).
7.4.2
Frozen coefficient iterations
In this subsection we sketch the so-called frozen coefficient algorithm, which can be defined generally for problems where the nonlinearities appear as coefficients of linear expressions of the derivatives. Then one solves successively linear problems where the coefficients come from the previous iteration. The method is also called Kachanov’s method (he introduced it first for problems in plasticity in [159]) or secant method. (See also Axelsson–Gustafsson [17], Fuˇcik-Kratochv´ıl-Neˇcas [119], Hlav´aˇcek–Kr´ıˇzek– Mal´ y [151].) Hereby we will first describe the general construction for second order problems whose nonlinearities are scalar coefficients of ∇u and u. Then convergence theorems are given.
7.4. SOME MORE ITERATIVE METHODS IN SOBOLEV SPACE
211
Let us consider the problem −div (b(x, u, ∇u)∇u) + r(x, u, ∇u)u = g(x)
in Ω
= γ(x) on ΓN b(x, u, ∇u) ∂u ∂ν
u = 0
(7.187)
on ΓD
with the following conditions: (i) Ω ⊂ RN is a bounded domain with piecewise smooth boundary; ΓN , ΓD ⊂ ∂Ω are measurable, ΓN ∩ ΓD = ∅, ΓN ∪ ΓD = ∂Ω and ΓD = 6 ∅. (ii) The functions b, r : Ω × R × RN → R are measurable and bounded w.r. to the variable x ∈ Ω and continuous in the other variables. (iii) For all (x, ξ, η) ∈ Ω × R × RN there holds 0 < β1 ≤ b(x, ξ, η) ≤ β2 ,
0 ≤ r(x, ξ, η) ≤ ρ
(7.188)
with constants β2 ≥ β1 > 0 and ρ > 0 independent of (x, ξ, η). (iv) g ∈ L2 (Ω) and γ ∈ L2 (ΓN ). 1 Turning to the weak formulation, one introduces the space HD (Ω) as in (7.83) and 1 looks for u ∈ HD (Ω) such that
hF (u), vi ≡
Z
Ω
[b(x, u, ∇u)∇u · ∇v + r(x, u, ∇u)uv] =
Z
gv +
Ω
We define B(u; v, z) ≡
Z
Ω
Z
γv dσ
ΓN
1 (v ∈ HD (Ω)).
[b(x, u, ∇u)∇v · ∇z + r(x, u, ∇u)vz] .
Then hF (u), vi = B(u; u, v),
(7.189)
1 further, (v, z) 7→ B(u; v, z) is a symmetric bilinear form on HD (Ω)2 for any fixed 1 u ∈ HD (Ω).
Then the frozen coefficient algorithm is defined as follows: for given un , the next 1 iteration un+1 ∈ HD (Ω) satisfies the linear equation B(un ; un+1 , v) ≡
Z
Ω
[b(x, un , ∇un )∇un+1 · ∇v + r(x, un , ∇un )un+1 v] =
Z
Ω
gv +
Z
ΓN
γv dσ
1 (Ω)). (v ∈ HD
(7.190) The conditions (i)-(iv) imply that in each step problem (7.190) has a unique weak solution, i.e. un+1 is well-defined.
212
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
One requires that problem (7.187) has a unique weak solution and un converges to it. In the following theorems different additional conditions are considered to ensure convergence. The first theorem is formulated generally in Hilbert space (Fuˇcik-Kratochv´ıl-Neˇcas [119]), providing convergence using directly the properties of F . (This result involves potential operators, for which the conditions on the coefficients are as studied in Chapter 6.) Theorem 7.13 Let the following conditions hold besides the given assumptions (i)-(iv) on problem (7.187): the operator F is strongly monotone, i.e. hF (u) − F (v), u − vi ≥ mku − vk2
1 (u, v ∈ HD (Ω))
1 with a suitable constant m > 0, further, F has a potential φ : HD (Ω) → R satisfying
2(φ(v) − φ(u)) ≤ B(u; v, v) − B(u; u, u)
1 (u, v ∈ HD (Ω)).
Then the frozen coefficient iteration converges to the unique weak solution. The above result can also be applied to systems in an analogous way, see the cited paper. In particular, thereby the convergence has been proved for the nonlinear elasticity problem (see section 1.3): Theorem 7.14 The frozen coefficient iteration for the nonlinear elasticity problem converges to the unique weak solution u∗ = (u∗1 , u∗2 , u∗3 ) ∈ H 1 (Ω)3 . The next result for a diffusion type non-potential equation follows from a theorem of Hlav´aˇcek–Kr´ıˇzek–Mal´ y [151]. Theorem 7.15 Let in problem (7.187) there hold b(x, ξ, η) = a(x, ξ) and r(x, ξ, η) ≡ 0, where b satisfies (7.188); further, let ΓD = ∂Ω. That is, we consider problem (6.74). 1 Then for any finite-dimensional subspace V ⊂ HD (Ω), the frozen coefficient iteration in V converges to the projection of the unique weak solution into V . (We note that we can replace the assignment un 7→ un+1 in (7.190) generally by u 7→ S(u) with a suitable mapping S : V → V . Then, owing to (7.189), the original problem can be replaced by a fixed point equation in V and the convergence result achieved via suitable contractions. This approach is used in the paper quoted above.) The result in [151] holds for more general coefficients than in Theorem 7.15 (namely, with r ≥ 0, a matrix instead of b and ΓN 6= ∅). However, in this case uniqueness of the solution is not guaranteed, and the sequence converges to one of the solutions. We note that a favourable property of the frozen coefficient iteration is the symmetry of the equations (7.190). In contrast, if Newton’s method is used for such problems then (although the convergence is faster) the auxiliary equations are non-symmetric, see (7.218). A comparison of the two methods which also involves their suitable combination is given by Lavery [194] for two-point boundary value problems.
7.4. SOME MORE ITERATIVE METHODS IN SOBOLEV SPACE
7.4.3
213
Double Sobolev gradients
In this subsection an iterative method is summarized which is closely related to the frozen coefficient iteration described in the previous subsection. The detailed study of this method is found in Axelsson–Kar´atson [20]. We consider here the Dirichlet problem −div (a(x, u, ∇u)∇u) = g(x) (7.191) u|∂Ω = 0, where the function a : Ω × R × RN → R is piecewise continuous and for all (x, ξ, η) ∈ Ω × R × RN there holds 0 < α1 ≤ a(x, ξ, η) ≤ α2 (7.192)
with constants α2 ≥ α1 > 0 independent of (x, ξ, η). Then problem (7.191) is a special case of (7.187). In course of the frozen coefficient iteration one has to solve auxiliary problems which in strong form are given as −div (a(x, un , ∇un )∇un+1 ) = g(x)
un+1 |∂Ω = 0.
(7.193)
Let us fix n and introduce the notation a(x) = a(x, un , ∇un ), further, for any given function c ∈ W 1,∞ (Ω) we define the operator ˆ (c) u := −div(c(x)∇u) L
ˆ (c) ) = H 2 (Ω) ∩ H01 (Ω). with domain D(L
(7.194)
Then we can write (7.193) briefly as ˆ (a) un+1 = g, L which is equivalent to an iteration step with the following correction term and stepsize 1: un+1 = un − zn ,
ˆ −1 L ˆ (a) un − g . where zn = L (a)
(7.195) (7.196)
ˆ (a) un − g is the residual and L ˆ −1 plays the role of a preconditioner. Related Here L (a) preconditioned iterations are obtained if stepwise some linear operator close to but ˆ −1 is used in (7.196). simpler than L (a) The double Sobolev gradient iteration uses a preconditioned operator which acts as the identity on functions u for which a∇u is in the range of the operator ∇. In the sequel this operator is defined and its conditioning is investigated. The first estimate is related to the distance of a∇u from R(∇), then the second one involves the difference of a from a constant function, which is a priori available in practice.
214
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
The operator applied for preconditioning is defined as ˆ (1/a) (−∆)−1 . (−∆)−1 L
(7.197)
The name ’double Sobolev gradient preconditioning’ is related to the Sobolev gradient technique where the preconditioner involves the solution of a Poisson equation (see section 7.3). Here the latter is done twice. This means that instead of solving one ˆ (a) in (7.196), one has to solve two auxiliary equations with auxiliary equation with L −∆. The latter is favourable if a fast solver for the Laplacian is available on Ω. The preconditioned iteration in strong form using (7.197) is un+1 = un − zn ,
(7.198)
ˆ (1/a) (−∆)−1 L ˆ (a) un − g . where zn = (−∆)−1 L
(7.199)
(We note that, alternatively, we could use (7.197) to construct a preconditioned inner iteration to obtain zn in the frozen coefficient iteration step (7.196). This twofold application of the same preconditioner is similar to the two Newton-like iterations studied in subsections 7.2.2 and 7.2.3.) The preconditioned operator is denoted by ˆ (1/a) (−∆)−1 L ˆ (a) , C = (−∆)−1 L
(7.200)
then (7.199) can be written as zn = Cun − g˜
(7.201)
ˆ (1/a) (−∆)−1 g. The construction of zn is achieved in two steps in with g˜ = (−∆)−1 L weak form: Z
Ω
∇vn · ∇h = Z
Ω
Z
Ω
(a ∇un · ∇h − gh)
(h ∈ H01 (Ω)),
(7.202)
1 ∇vn · ∇h a
(h ∈ H01 (Ω)).
(7.203)
∇zn · ∇h =
Z
Ω
The first observation is that if a∇un is in the range of the operator ∇ : H01 (Ω) → L2 (Ω), then (7.196) and (7.199) act similarly on un , i.e. un = Cun . This follows from the proposition below. Proposition 7.1 Let u ∈ H01 (Ω). If a∇u ∈ R(∇), then Cu = u. Proof. For any u ∈ H01 (Ω) there holds
ˆ (1/a) (−∆)−1 L ˆ (a) u , u − Cu = (−∆)−1 −∆u − L that is,
u − Cu = s
where s ∈ H01 (Ω) is defined by the following equalities:
ˆ (a) u −∆r = L r|∂Ω = 0 ,
(7.204)
7.4. SOME MORE ITERATIVE METHODS IN SOBOLEV SPACE
In weak form:
Z
Ω
Z
Ω
215
ˆ (1/a) r −∆s = −∆u − L s|∂Ω = 0 .
(7.205)
Z
(7.206)
∇r · ∇h =
∇s · ∇h =
Z
Ω
Ω
a ∇u · ∇h ,
∇u · ∇h −
Z
Ω
1 ∇r · ∇h a
(7.207)
for all h ∈ H01 (Ω). Now let a∇u ∈ R(∇), i.e. there exists z ∈ H01 (Ω) such that a∇u = ∇z. Then (7.206) implies that z = r with r defined in (7.204) and (7.206), hence (7.207) yields kskH01 =
sup hs, hi1 =
khkH 1 =1 0
Z
sup
khkH 1 =1 Ω 0
∇s · ∇h =
sup
Z
khkH 1 =1 Ω 0
1 ∇u − ∇r · ∇h = 0, a
(7.208) hence u − Cu = s = 0. The conditioning of the operator C is now studied via estimating its distance from the identity operator in H01 (Ω). The first estimate formulates a continuous dependence type result corresponding to Proposition 7.1, namely, for general u ∈ H01 (Ω) the norm k(I − C)ukH01 is estimated by the distance of a∇u from R(∇). Proposition 7.2 Let u ∈ H01 (Ω). Then 1 dist (a∇u, R(∇)), inf a
k(I − C)ukH01 ≤ where
min ka∇u − ∇vkL2 (Ω) .
dist (a∇u, R(∇)) =
v∈H01 (Ω)
(7.209)
Proof. Let s = (I − C)u and r be as defined in (7.204) and (7.206). Now (7.207) yields Z
Ω
∇s · ∇h =
Z
Ω
∇u · ∇h − ≤
Z
Ω
Z 1 1 ∇r · ∇h = (a∇u − ∇r) · ∇h a Ω a
1 ka∇u − ∇rkL2 (Ω) khkH01 , inf a
hence kskH01 =
sup hs, hi1 =
khkH 1 =1 0
sup
Z
khkH 1 =1 Ω 0
∇s · ∇h ≤
1 ka∇u − ∇rkL2 (Ω) . inf a
(7.210)
Here (7.206) implies that a∇u − ∇r is orthogonal to R(∇) in L2 (Ω), hence ∇r is the projection of a∇u to R(∇), and ka∇u − ∇rkL2 (Ω) equals the distance (7.209). Thus (7.210) yields the required result. The second estimate for I − C in H01 (Ω) involves the difference of a from a constant function, which is a priori available in practice. We can study without loss of generality the distance of a from the constant 1.
216
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
Proposition 7.3 Let g := 1 − a and assume that γ : = kgk∞ < 1. Then kI − Ck ≤
(7.211)
γ . 1−γ
(7.212)
Proof. The estimate follows from Proposition 7.2. Let u ∈ H01 (Ω) be arbitrary. Setting v = u in (7.209), we obtain k(I − C)ukH01 ≤ ≤
1 inf a
min ka∇u − ∇vkL2 (Ω) ≤ 1
v∈H0 (Ω)
1 k(a − 1)∇ukL2 (Ω) inf a
ka − 1k∞ γ k∇ukL2 (Ω) ≤ kukH01 . inf a 1−γ
The condition number of C is kCkkC −1 k directly by definition, since now C is not symmetric and cond(C) was not obtained from spectral equivalence. Now Proposition 7.3 implies the required estimate of cond(C): Proposition 7.4 If γ = ka − 1k∞ < 12 , then kI − Ck < 1, C is invertible and cond(C) = kCkkC −1 k ≤
1 . 1 − 2γ
γ Proof. γ < 12 implies 1−γ < 1, hence by (7.212) kI − Ck < 1 and hence C = I − (I − C) is invertible. (7.212) also implies that for any u ∈ H01 (Ω)
!
!
1 − 2γ γ γ 1 kukH01 = 1 − kukH01 , kukH01 ≤ kCukH01 ≤ 1 + kukH01 = 1−γ 1−γ 1−γ 1−γ which, using kCkH01 = sup u
cond(C) = kCkkC −1 k.
kCukH 1 0 kukH 1 0
, kC −1 kH01 = sup u
kC −1 (Cu)kH 1 0
kCukH 1
, proves the estimate for
0
Remark 7.22 Assumption γ < 12 in Proposition 7.4 means that inf a > 12 , sup a < 23 . Accordingly, if we consider the case a ≈ c1 instead of a ≈ 1, then kI − Ck < 1 remains valid if sup a < 3 inf a. (In this case the analogues of the above estimates involve γ := k ca1 − 1k∞ instead of γ = ka − 1k∞ .) The assumption sup a < 3 inf a can be eliminated or relaxed by changing the Laplacian preconditioner to a suitable piecewise constant variable-coefficient operator. This relies on decomposing the domain Ω in parts Ωi such that on each Ωi , we demand sup a|Ωi < 3 inf a|Ωi . The appropriate construction is also to be used if a has several discontinuities (or sharp gradients), and will be discussed in subsection 8.2.9.
7.4. SOME MORE ITERATIVE METHODS IN SOBOLEV SPACE
7.4.4
217
Symmetric part preconditioning
This subsection is an excursion to the problem of non-symmetry arising in course of Newton’s method in the case of non-potential equations. Namely, the linearized operators in the steps of the iteration are not self-adjoint, as will be seen below. A possible reduction to symmetric problems consists in preconditioning by the symmetric part of the linearized operator for the inner iteration. The symmetric part will be understood in two ways: symmetrization or the leading term of the operator. The linearized problems in the Newton iteration can be handled with conjugate gradient methods, whose scope includes non-symmetric equations as well. For nonsymmetric linear algebraic systems a survey of available methods can be found in Saad–Schultz [260, 261], and general classes of methods using truncation are given by Axelsson [9], Faber and Manteuffel [108]. The paper [9] introduces a generalized conjugate gradient least square method, whose full version uses all previous search directions and yields a linear convergence estimate. Further, it is proved that using the symmetric part of the matrix for preconditioning, the full version coincides with the truncated one which requires only a single, namely the current search direction. In the case of non-selfadjoint elliptic operators many investigations concern the leading (plus sometimes a zeroth-order) term of the operator as preconditioner, see Bramble–Pasciak [51], Elman–Schultz [104], Manteuffel–Otto [203]. The paper Axelsson–Kar´atson [21] uses the symmetrization of the operator, and proves the same result as above on truncation. In the sequel we define two kinds of symmetric part preconditioners, and give estimates on the obtained condition numbers. (A more detailed presentation on the first version is given in the mentioned paper [21].) Let us consider the diffusion type non-potential problem (6.74), which is a special case of (7.191). The weak formulation reads hF (u), vi ≡
Z
Ω
a(x, u)∇u · ∇v =
Z
gv
Ω
(v ∈ H01 (Ω)).
(7.213)
Here we assume that 0 < α ≤ a(x, u) ≤ α ˜
(x ∈ Ω, u ∈ R)
(7.214)
with suitable constants α, α ˜ , further, the partial derivative au (x, u) is bounded and g ∈ L2 (Ω). The space dimension N equals 2 or 3. In the nth step of a Newton-like iteration, the linearized operator F ′ (un ) : H01 (Ω) → 1 H0 (Ω) is defined by ′
hF (un )v, ziH01 =
Z
Ω
(a(x, un ) ∇v · ∇z + v (au (x, un )∇un · ∇z))
(v, z ∈ H01 (Ω)).
(7.215)
Introducing the notation b(un ) = au (x, un )∇un ,
(7.216)
the formula (7.215) reduces to ′
hF (un )v, ziH01 =
Z
Ω
(a(x, un ) ∇v · ∇z + v (b(un ) · ∇z))
(v, z ∈ H01 (Ω)). (7.217)
218
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
The operator F ′ (un ) is not self-adjoint: it is seen immediately that ′
∗
hF (un ) v, ziH01 =
Z
Ω
(v, z ∈ H01 (Ω)), (7.218)
(a(x, un ) ∇v · ∇z + (b(un ) · ∇v) z )
and here the second term is non-symmetric in v and z. (a) Preconditioning by symmetrization Using the symmetrization of F ′ (un ), we can define the preconditioning operator 1 Bn := (F ′ (un ) + F ′ (un )∗ ) 2 for the inner iteration. In the sequel we fix un ∈ H01 (Ω) and we study the operator Bn and the related conditioning properties. Proposition 7.5 (1) The operator Bn satisfies hBn v, ziH01 =
Z Ω
1 a(x, un ) ∇v · ∇z − (divb(un ))vz 2
(v, z ∈ H01 (Ω)),
(7.219)
and there holds F ′ (un ) = Bn + Rn
(7.220)
with Rn defined by hRn v, ziH01
1Z (v (b(un ) · ∇z) − (b(un ) · ∇v) z ) = 2 Ω
(v, z ∈ H01 (Ω)).
(7.221)
(2) If there holds the coercivity condition div b(un ) ≤ 0,
(7.222)
then Bn is strongly positive: hBn v, viH01 ≥ αkvk2H01
(v ∈ H01 (Ω))
with α > 0 from (7.214). Proof. (1) Letting b = b(un ), the relation (7.219) follows from Z
Ω
((b · ∇u)v + u(b · ∇v)) = −
Z
Ω
(div b)uv,
which is a consequence of the formula 0=
Z
∂Ω
buv dσ =
Z
Ω
div (buv) =
Z
Ω
[(div b)uv + (b · ∇u)v + u(b · ∇v)]
for u, v ∈ H01 (Ω). Further, (7.221) follows from (7.218) and that 1 Rn = F ′ (un ) − Bn = (F ′ (un ) − F ′ (un )∗ ). 2
(7.223)
7.4. SOME MORE ITERATIVE METHODS IN SOBOLEV SPACE
219
(2) The positivity and coercivity conditions (7.214) and (7.222) imply that Z
hBn v, viH01 =
Ω
Z 1 a(x, un ) |∇v|2 − (div b(un ))v 2 ≥ α |∇v|2 = αkvk2H01 . 2 Ω
(7.224)
In the sequel we assume as well that the coercivity condition (7.222) is satisfied. Part (2) of Proposition 7.5 implies that we can introduce the energy norm kvkBn = 1/2 hv, viBn . In terms of this we can estimate Rn as follows. Proposition 7.6 There holds |hRn v, zi| ≤ LkvkBn kzkBn
(v, z ∈ H01 (Ω))
(7.225)
with
K4 kau (x, un )kL∞ k∇un kL4 , (7.226) α where K4 > 0 is the embedding constant corresponding to the Sobolev embedding L=
H01 (Ω) ⊂ L4 (Ω),
kvkL4 ≤ K4 kvkH01
(v ∈ H01 (Ω)).
(7.227)
Proof. (7.227) and (7.216) imply Z
Ω
|v(b(un ) · ∇z)| ≤ kau (x, un )kL∞ k∇un kL4 kvkL4 k∇zkL2 ≤ K4 kau (x, un )kL∞ k∇un kL4 kvkH01 kzkH01 ,
and the same is obtained for |z(b(un ) · ∇v)| by exchanging v and z. Hence |hRn v, ziH01 | ≤ K4 kau (x, un )kL∞ k∇un kL4 kvkH01 kzkH01
(v, z ∈ H01 (Ω)),
and together with (7.223) we obtain the required estimate. Proposition 7.7 The operator Bn−1 F ′ (un ) satisfies cond(Bn−1 F ′ (un )) ≤ 1 + L w.r. to the Bn -norm, where L is the constant in (7.226). Proof. For brevity let An = Bn−1 F ′ (un ). We have kvk2Bn = hBn v, vi = hF ′ (un )v, vi = hAn v, viBn ≤ kAn vkBn kvkBn hence kA−1 n kBn ≤ 1.
(v ∈ H01 (Ω)),
220
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
Further, Proposition 7.6 implies kBn−1 Rn vkBn = sup hBn−1 Rn v, ziBn = sup hRn v, zi kzkBn =1
kzkBn =1
≤ sup LkvkBn kzkBn = LkvkBn . kzkBn =1
Since by (7.220) An = I + Bn−1 Rn , we obtain kAn kBn ≤ 1 + kBn−1 Rn kBn ≤ 1 + L. That is, cond(An ) = kAn kBn kA−1 n kBn ≤ 1 + L w.r. to the Bn -norm. Remark 7.23 The value of L in (7.226) can be determined explicitly. Namely, un being a known function, the numbers α, kau (x, un )kL∞ and k∇un kL4 can be calculated directly from the given coefficients and un , further, K4 can be estimated in terms of the domain (see (11.9)–(11.10) in the Appendix). Remark 7.24 Altogether, the obtained condition number determines the linear convergence of the conjugate gradient method for the inner iteration, see Axelsson [9] for the corresponding convergence estimates. Moreover, as proved in [21], the symmetrized preconditioning operator Bn implies that this convergence is achieved by the simpler truncated version which requires only a single, namely the current search direction.
(b) Preconditioning by the leading term As an alternative to symmetrization, we can define the leading term Cn of the operator F ′ (un ) in (7.217) as preconditioner: hCn v, zi
H01
=
Z
Ω
a(x, un ) ∇v · ∇z
(v, z ∈ H01 (Ω)).
(7.228)
This self-adjoint operator is simpler than the symmetrization Bn , since in (7.228) there is no zeroth-order term like in (7.219). On the other hand, for Cn we have no result like in Remark 7.24 on the coincidence of the full and truncated versions of the CGM. The available conditioning estimate for preconditioning with Cn is the same as for preconditioning with Bn above. Namely, the lower estimate (7.224) holds for Cn as well by neglecting the term with v 2 , and the upper estimate (7.225) is the same for the first order term itself as for Rn . Using these estimates, the proof of Proposition 7.7 can be repeated for the operator Cn−1 F ′ (un ) to yield cond(Cn−1 F ′ (un )) ≤ 1 + L w.r. to the Cn -norm with the constant L from (7.226).
7.4. SOME MORE ITERATIVE METHODS IN SOBOLEV SPACE
7.4.5
221
Some remarks on multistep iterations
Following the goals of this book, in this chapter we have focused on one-step iterations, and in particular on simple iterations and Newton-like methods. Still we have also considered some more general iterations in different settings, the most important being inner-outer iterations which combine Newton’s method with inner CG iterations. We may also mention the nonlinear conjugate gradient method in this context. In this subsection we quote a general formulation of multistep iterations, and point out that preconditioning operators can be used here in an analogous way as for onestep methods. We consider a class of multistep methods given in Ortega–Rheinboldt [238]. In order to motivate this, we recall the general form of one-step iterative methods given in (7.1) in the introduction of this chapter. Now we consider the weak form of the operators, which is reflected in the notations Bn and F similarly as before, further, we incorporate the stepsizes αn > 0 into the operators Bn . Then formula (7.1) turns into un+1 = un − Bn−1 (F (un ) − b) , (7.229) which describes one-step iterative methods for a boundary value problem F (u) = b in weak form. We consider the boundary value problems studied in this chapter, and in this setting the operator F is Gateaux differentiable. Following Ortega–Rheinboldt [238], we define the iterative sequence as follows. For given n ∈ N let us assume that the term un is obtained. We introduce the operator Hn = I − Bn−1 F ′ (un ), where I is the identity operator. (In general the contractivity of the operators Hn determines the convergence of the sequence (7.229), analogously to the investigations of subsection 5.3.3.) Now Hn is used to define the next term as un+1 = un −
kn X
j=0
Hnj Bn−1 (F (un ) − b),
(7.230)
where kn is a suitably chosen integer. The construction of the iteration (7.230) is executed with the following algorithm. If un is obtained, then un+1 = un − zn , (7.231) where the correction term zn is defined recursively via a sequence zn(j) (j = 0, ..., kn ) as follows: (7.232) Bn zn(0) = F (un ) − b, for j = 1, ..., kn : Bn wn(j−1) = F ′ (un ) zn(j−1) ,
(7.233)
zn(j) = zn(j−1) − wn(j−1) ,
(7.234)
222
CHAPTER 7. ITERATIVE METHODS IN SOBOLEV SPACE
which means that by induction zn(j) = Hn zn(j−1) = Hnj Bn−1 (F (un ) − b) (j = 1, ..., kn ), and finally kn X
zn =
zn(j) .
(7.235)
j=0
The above iteration can be used for boundary value problems in combination with preconditioning operators in an analogous way as for one-step methods. For ease of exposition we show this for a second order mixed boundary value problem, given similarly as in (7.82) in section 7.2. That is, we consider the problem − div f (x, ∇u) = g(x)
in Ω
f (x, ∇u) · ν = γ(x) on ΓN
u = 0
(7.236)
on ΓD ,
on which we impose the smoothness and ellipticity conditions of Assumptions 7.7 simi1 1 larly as there. The corresponding generalized differential operator F : HD (Ω) → HD (Ω) is defined by Z 1 hF (u), viHD1 = f (x, ∇u) · ∇v (v ∈ HD (Ω)), Ω
1 where the space HD (Ω) is defined in (7.83). Further, the weak form of the right-hand 1 side is defined by the element b ∈ HD (Ω) for which
hb, viHD1 =
Z
Ω
gv +
Z
ΓN
γv dσ
1 (v ∈ HD (Ω)).
Then the weak form of problem (7.236) reads as 1 (v ∈ HD (Ω)).
hF (u), viHD1 = hb, viHD1
(7.237)
In the one-step iterations of this chapter, the preconditioning operators for problem 1 1 (7.237) were defined in weak form as operators Bn : HD (Ω) → HD (Ω) such that hBn h, viHD1 =
Z
Ω
Gn (x) ∇h · ∇v
(h, v ∈ H01 (Ω)),
(7.238)
where Gn ∈ L∞ (Ω, RN ×N ) is a symmetric positive-definite matrix-valued function. Similarly, the multistep algorithm (7.231)–(7.235) can be applied to problem (7.237) 1 as a sequence running in the Sobolev space HD (Ω), and in which the preconditioning operators Bn are taken from (7.238). Then in the problems that contain Bn one has to solve auxiliary linear elliptic problems with coefficient Gn (x). Namely, in step (7.232) 1 one has to find zn(0) ∈ HD (Ω) such that Z
Ω
Gn (x) ∇zn(0) · ∇v =
Z
Ω
f (x, ∇un ) · ∇v −
Z
Ω
gv +
Z
ΓN
γv dσ
1 (Ω)), (v ∈ HD
and similarly, (7.233) is equivalent to the problem Z Z ∂f (j−1) 1 Gn (x) ∇wn · ∇v = (x, ∇un ) ∇zn(j−1) · ∇v (v ∈ HD (Ω)). Ω Ω ∂η Finally we remark that, although the use of multistep methods may yield better convergence factors [238], still one-step iterations in the general form (7.1) are able to provide favourable convergence whose order cannot be increased in general by CG type multistep methods [33].
Chapter 8 Preconditioning strategies for discretized nonlinear elliptic problems based on preconditioning operators A brief summary on the iterative solution of nonlinear algebraic systems has been given in Chapter 4, which has led us to the conclusion that preconditioning plays a key role in these methods. Moreover, we have seen that preconditioners have to satisfy two typically conflicting requirements (namely, simple solvability and improved convergence), hence there is no general rule as to how the preconditioners should be chosen. This observation has motivated the main goal of this book, which is the construction of a class of preconditioners for discretized nonlinear elliptic problems based on preconditioning operators. These preconditioning matrices, obtained as the discretizations of suitable linear elliptic operators, will also be called Sobolev space preconditioners. The Sobolev space results of the previous chapter provide us with a general background for such preconditioners, but do not specify yet the preconditioning operators used in those methods. This chapter is devoted to a collection of examples of Sobolev space preconditioners and, preceding this, a brief description of their general favourable properties which helps the organized study afterwards. In order to summarize the idea of preconditioning operators, let us first consider one-step iterations. For this we can recall Figures 2–3, given as an illustration in the Introduction. Following that line of thought, we start from the general problem of preconditioning for discretized elliptic problems, as described at the beginning of Chapter 4 using (4.4). Namely, let us consider the nonlinear boundary value problem T (u) = g
(8.1)
and its discretization Th (uh ) = gh in some finite-dimensional subspace Vh . Then one (n) looks for symmetric positive definite matrices Ah (n ∈ N) and one defines the iterative (n) sequence {uh }n∈N in Vh with these matrices as (variable) preconditioners. (Here the 223
224CHAPTER 8. PRECONDITIONING STRATEGIES FOR DISCRETIZED NONLINEAR ELLIPT upper index (n) for the iterative sequence, used unlike in the preceding chapters, serves to avoid double subscripts when indexing with h.) We also choose stepsizes α(n) , which in Figures 2–3 were incorporated for simplicity into the operator S (n) , but now and in the sequel it will be more convenient to consider them separately. The described process is shown by Figure 8.1.
T (u) = g
?
Th (uh ) = gh
-
(n+1)
uh
(n)
(n)
(n)
= uh − α(n) (Ah )−1 (Th (uh ) − gh )
Figure 8.1: discretization plus preconditioned iteration The idea of preconditioning operators offers a special choice of preconditioners for the discretized problem. Namely, one first chooses suitable symmetric strictly positive linear elliptic operators S (n) and defines a sequence {u(n) }n∈N with these operators as (variable) preconditioners in the corresponding Sobolev space. Then one proposes the preconditioning matrices (n) (n) A h = Sh (8.2) for the iteration in Vh , which means that the preconditioning matrices are obtained using the same discretization for the operators S (n) as was used used to obtain the system Th (uh ) = gh from problem (8.1). (This way of derivation is indicated in the (n) notation Sh .) This approach yields a choice of preconditioners for the discretized problem such that the obtained iterative sequence is the projection of the theoretical iteration from the Sobolev space into the discretization subspace. This is illustrated by Figure 8.2:
T (u) = g
-
u(n+1) = u(n) − α(n) (S (n) )−1 (T (u(n) ) − g)
? (n+1) uh
=
(n) uh
−α
(n)
(n)
(n)
(Sh )−1 (Th (uh ) − gh )
Figure 8.2: the preconditioning operator idea
225 We note that in Figure 8.2 the discretization parameter h is fixed, but the preconditioning operator idea remains just the same if h is redefined in each step n, i.e. in the setting of a multilevel or projection-iteration method. The above sketch with Figures 8.1–8.2 concerns one-step iterations, which involves both simple iterations (i.e. when S (n) ≡ S is fixed) and Newton-like methods. In the case of the latter the Sobolev space background relies on the weak formulation, using section 7.2, and the preconditioning operators are approximations of the derivative F ′ (un ) of the generalized differential operator. (For distinction, the preconditioning operators in weak form are denoted by B (n) instead of S (n) . Moreover, since it will cause no ambiguity, in section 8.2 we will simply write Bn similarly to Chapter 7.) On the other hand, the same preconditioning operator idea can also be used for multistep iterations in different settings. One can define, for instance, preconditioned nonlinear conjugate gradient methods in this way. The most important considered class of iterations in this context is that of outer-inner iterations, where the outer sequence is constructed by a Newton-like method. Then one normally uses a preconditioned conjugate gradient method for the inner iteration, and in this case preconditioning operators can be relied on again, namely, in the form of operators spectrally equivalent to the derivative F ′ (un ) of the generalized differential operator. As mentioned in the Introduction, the preconditioning operator idea has proved very useful in many applications, and in the case of linear elliptic problems it yields a theoretically well based class of preconditioners. (See the papers of Faber, Manteuffel and Parter [109], Manteuffel and Parter [204], and many others cited in subsection 3.4.2.) Preconditioning operators for nonlinear problems appear in the applications of the Sobolev gradient approach (Neuberger [227]-[231]) and in H 1 -methods (Carey– Jiang [68], Richardson [254]), for some other related numerical methods the reader is referred e.g. to Axelsson–Maubach [24], Lavery [193], Rossi–Toivanen [257]. The advantages of the preconditioning operator idea appear in both areas that are involved in the requirements of good preconditioners, i.e. easy construction and solvability, and convenient conditioning estimates. Namely: • Defining the auxiliary systems as discretizations of suitable linear elliptic operators, one can rely on a highly developed background for their solution. Efficient standard solvers including fast direct methods have been summarized in subsections 3.3.2–3.3.3. • In addition, the construction of a Sobolev space preconditioner is straightforward from the underlying preconditioning operator, and is achieved without studying the actual form of the discretized system. In this way the properties of the original problem can be exploited more directly than with usual preconditioners defined for the discretized system. For instance, this way can help to handle certain difficulties such as discontinuous coefficients or sharp gradients. • The preconditioned discretized problems have mesh independent condition numbers, since their convergence properties are asymptotically determined by the theoretical sequence in the Sobolev space. Hence the conditioning properties
226CHAPTER 8. PRECONDITIONING STRATEGIES FOR DISCRETIZED NONLINEAR ELLIPT exploit the fact that the preconditioning operators are chosen for the boundary value problem itself on the continuous level. • Moreover, one can obtain a priori bounds for the condition numbers by carrying out analytic estimates, without using the discretized systems. In the case of 2nd order Dirichlet problems, the typical preconditioning operators have the form S (n) z ≡ −div (Gn (x)∇z) ,
where Gn ∈ L∞ (Ω, RN ×N ) is a symmetric positive-definite matrix-valued function. (This form follows the setting of Chapter 7.) Then the auxiliary problems in the theoretical Sobolev space iteration are of the type (
or in weak form
Z
Ω
S (n) z ≡ −div (Gn (x)∇z) = r(x) z|∂Ω = 0,
(8.3)
Z
(8.4)
Gn (x) ∇z · ∇v =
Ω
rv
(v ∈ H01 (Ω)),
where r depends on n and denotes a residual coming from the previous iteration step. That is, r = T (u(n) ) − g in the one-step iteration in Figure 8.2, and it comes from the linearized equation in the case of inner Newton iterations. (n) Denote, as in Figure 8.2, by Sh the matrix obtained from the discretization of the operator S (n) . Then the corresponding auxiliary linear algebraic system in the iteration for the discretized problem will be (n)
Sh zh = rh . The emphasis in our investigation is on the finite element discretization of the studied elliptic problems. This will be in focus both in the study of general properties and the detailed preconditioning strategies. (FEM realization will be also detailed in the next chapter on algorithmic realization in section 9.2.) The reason is that by its construction the FEM is the most natural realization of Sobolev space methods. This is clear when the FEM solution zh ∈ Vh of problem (8.4) is looked for in some FEM subspace Vh . Namely, one simply has to replace H01 (Ω) in (8.4) by Vh , i.e. the function zh ∈ Vh has to satisfy Z Z Gn (x) ∇zh · ∇v = rv (v ∈ Vh ). Ω
Ω
(n)
Further, we obtain directly the corresponding preconditioning matrix Sh : n
(n)
Sh
o
i,j
=
Z
Ω
Gn (x) ∇vi · ∇vj
(i, j = 1, ..., k),
(8.5)
where v1 , ..., vk is a basis of Vh . The main goal of this chapter is to give various examples of preconditioners based on preconditioning operators. As seen above, in the case of 2nd order Dirichlet problems this relies on the suitable choice of the weight matrix function Gn . For mixed problems a
8.1. SOME GENERAL PROPERTIES OF THE SOBOLEV SPACE PRECONDITIONERS227 boundary operator is also involved, and for fourth order problems suitable weight array functions are used. We note that the first example of preconditioning operator will be the Laplacian, whose unboundedness already compensates that of the differential operator T and we get a finite (but maybe still large) condition number. On the next level, the choice of the weight function is based on a more accurate approximation of the coefficients of the nonlinear problem, involving several ideas and realizations. The organized study of the conditioning and structure characteristics of Sobolev space preconditioners will be based on a general study of these properties in section 8.1, which shows that this class of preconditioners is able to meet the mentioned requirements of good preconditioners. Then the examples are listed in section 8.2, containing in each case first the definition of the proposed preconditioning operator, then the construction and structure properties of the derived preconditioning matrix, and finally the conditioning estimates of the preconditioners. The vast majority of the discussed preconditioners will be based on spectral equivalence in the corresponding Sobolev space. This suits the main scope of our considered problems whose weak form involves monotone potential operators. (We note that preconditioning by spectral equivalence is also efficient in other contexts, e.g. by using a coarser mesh for the same operator, see Axelsson–Gustafsson [17]. This idea is a kind of opposite to that of preconditioning operators and, together with preconditioning methods using matrix techniques, it shows that the preconditioning operator approach clearly does not include certain other important preconditioning methods.) The construction of the preconditioners uses the strong or weak form of the operator depending on the demands of the clarity of discussion. (For notational distinction, in accordance with Chapter 7, the preconditioning operators in weak form will be denoted by Bn . For the strong form we keep the notation S (n) used in this introduction to avoid double subscripts when indexing with h. We note that in strong form Chapter 7 contained only fixed operators S, and anyway this slight notational incoherence will cause no ambiguity.)
8.1
Some general properties of the Sobolev space preconditioners
As mentioned earlier, the two major requirements of good preconditioners are as follows: • easy and/or standard solvability of the auxiliary equations; • convenient conditioning estimates. In sections 3.3 and 3.4 it has turned out for linear problems that the use of Sobolev space preconditioners is supported by these demands. That is, the discretization of a suitable linear elliptic operator as preconditioner for other linear elliptic operators has favourable properties in both respects. This section is devoted to these two requirements when linear preconditioning operators are used for nonlinear elliptic problems. Clearly, the standard solvability of
228CHAPTER 8. PRECONDITIONING STRATEGIES FOR DISCRETIZED NONLINEAR ELLIPT the auxiliary equations can rely on the same results as before, hence the aim of the first subsection is some further study of the structure of such auxiliary problems. In subsection 8.1.2 on conditioning we verify some general mesh independent convergence estimates which can be used afterwards for the different Sobolev space preconditioners.
8.1.1
Stiffness matrices: solvability, updating and structure characteristics
When Sobolev space preconditioners are used, the auxiliary problems to be solved are discretizations of linear elliptic boundary value problems. Such preconditioners are favourable from the point of view of solvability, since a large variety of efficient standard methods is available for the solution of the auxiliary problems. A brief summary on these elliptic solvers is given in subsection 3.3.2 for general problems. Furthermore, if this is possible, it is advisable to choose the preconditioning operator in some special form such that one may use an even more efficient particular solution method. Such special problems (like constant coefficient or separable operators) and corresponding fast solvers have been mentioned in 3.3.3. Altogether, one can rely on a wide standard background for the auxiliary problems. Besides solution methods, the work of updating and storing of the preconditioning matrices also urges us to find properties of discretized elliptic problems that possibly simplify these tasks. The next part of this subsection is devoted to a favourable structure property of the Sobolev space preconditioners when finite element realization is applied, using the results of Axelsson and Maubach [24]. Namely, if the preconditioning matrix is the FEM discretization of a linear elliptic operator in divergence form, then it admits a convenient factorization. This makes it easier to solve the auxiliary equations, further, to store and (in the case of variable preconditioning) to update these matrices. Matrices corresponding to the FEM discretization of a linear elliptic operator in divergence form are generally called stiffness matrices. To involve the applied quadrature of numerical integration, the following definition will be used: a matrix B is called a stiffness matrix if there exists a symmetric positive definite matrix-valued function G : Ω → RN ×N such that Bij = Q (G∇φj · ∇φi ) ,
where Q is the applied quadrature rule and {φi }si=1 are the global finite element basis functions (see subsection 3.3.1). We note that the original definition also includes a nonlinear case, i.e. when G = G(x, ∇u) depends on the unknown function u. In our case G = G(x) only comes from the linear preconditioning operators. The factorization of stiffness matrices relies on an additional set of Lagrangian basis functions on Ω. These are constructed in two steps: first for each element and each quadrature point a local basis function is defined that equals one in this point and zero in the other quadrature points of the element, then this local function is extended as zero on the complement w.r. to Ω. (For the details see [24].) The set of all global ′ functions is denoted by {ψj }sj=1 , where s′ is the total number of quadrature points on Ω. Then the factorization is formulated as follows.
8.1. SOME GENERAL PROPERTIES OF THE SOBOLEV SPACE PRECONDITIONERS229 Theorem 8.1 (Factorization theorem, Axelsson–Maubach [24].) Let B ∈ Rs×s be a stiffness matrix, defined by Bij = Q (G∇φj · ∇φi ) correspondN ×N ing to a matrix-valued function G = {Gp,q }N , using finite element p,q=1 : Ω → R s basis functions {φi }i=1 , a quadrature formula Q and corresponding Lagrangian basis ′ functions {ψj }sj=1 . Then B can be factorized into B = ZWZ t ,
(8.6)
where • Z is an s by N s′ rectangular block matrix h
Z = Z (1) · · · Z (N )
with blocks [Z
(p)
∂φi ]ij = Q ψj ∂xp
!
(p = 1, .., N, i = 1, .., s, j = 1, .., s′ );
• W is an N s′ by N s′ square block matrix W (1,1) .. W= . W (N,1)
with blocks
i
··· ... ···
2 [W (p,q) ] = diag Q−1 (G−1 p,q ψj )
W (1,N ) .. . (N,N ) W
(p, q = 1, .., N, j = 1, .., s′ )
and Z t denotes the transpose of Z. Corollary 8.1 If in Theorem 8.1 G(x) is diagonal for all x ∈ Ω, then W is diagonal. We underline that in (8.6) the matrix Z comes from the discretization of the operator −div, the matrix Z t comes from the discretization of ∇ and W comes from that of the weight matrix function G(x).
8.1.2
Mesh independent conditioning properties
In this subsection we study the connection between the conditioning properties of the preconditioning operators and those of the corresponding preconditioning matrices. This only relies on the abstract properties of these preconditioners, hence the exposition is clearer using an abstract Sobolev space H and its finite dimensional subspaces. Let us consider a boundary value problem T (u) = g
(8.7)
(including the boundary conditions in T ) whose weak solution lies in a Sobolev space H. We assume that (8.7) is one of the problems in Chapter 6, hence, in particular, it has a unique weak solution, further, the operator T is strictly monotone. We will define discretizations using the following property:
230CHAPTER 8. PRECONDITIONING STRATEGIES FOR DISCRETIZED NONLINEAR ELLIPT Definition 8.1 A family of finite dimensional subspaces Vh ⊂ H (’indexed’ by h > 0) is called a dense approximating family if it satisfies the following condition: for any u ∈ H there exists a family of vectors (uh ) with uh ∈ Vh (∀h > 0) such that uh → u as h → 0. A main example of dense approximating family is formed by FEM subspaces. More specially, the required property holds if ∪Vh is dense in H and (Vh ) is monotone in the sense that V1 ⊂ V2 ⊂ ... as with Ritz methods. Generally speaking, any discretization procedure for (8.7) can be regarded as follows. We define a dense approximating family of finite dimensional subspaces Vh ⊂ H (h > 0) and corresponding discretization mappings Ph such that Th := Ph T is an operator defined on Vh and gh := Ph g ∈ Vh . The main requirement is that limh→0 Th = T in some suitable sense. Then we consider the family of problems Th (uh ) = gh
(8.8)
in Vh (h > 0), where it is convenient to incorporate the boundary conditions in Th . For simplicity let us first consider one discretized problem (8.8), i.e. h is fixed. The Sobolev space preconditioning idea has been summarized in the introduction of this chapter using Figure 8.2. That is, we first choose suitable symmetric strictly positive linear elliptic operators S (n) and stepsizes α(n) > 0 such that the sequence {u(n) }n∈N in the Sobolev space H, defined by u(n+1) = u(n) − α(n) (S (n) )−1 (T (u(n) ) − g)
(n ∈ N),
(8.9)
exhibits suitable convergence to the weak solution of (8.7) in H. Then we define the matrices (n) Sh := Ph S (n) , (8.10) i.e. we apply the same discretization mapping Ph to the operator S (n) as was used to (n) define Th = Ph T from T , and we propose the matrices Sh as preconditioners in the iteration for the discretized problem. That is, for problem (8.8) we define the iterative sequence (n+1) (n) (n) (n) uh = uh − α(n) (Sh )−1 (Th (uh ) − gh ) (n ∈ N). (8.11) The main advantage from conditioning aspect is as follows. If the sequence of subspaces Vh approximates H in a reasonable way, then the sequence (8.11) behaves asymptotically in the same way as the theoretical sequence (8.9) as h → 0. Consequently, if analytic investigations yield usable estimates for the latter in the Sobolev space H, then these are valid for the numerical iterations as well, moreover, the estimates are subspace (mesh) independent, i.e. independent of h. (One of the main goals of this whole chapter is to present various applications of this idea.) Investigations in this topic are found in many important papers. We underline the rigorous results for linear equations based on the theory of equivalent operators in the papers of Faber, Manteuffel and Parter [109], Manteuffel and Parter [204]. For Newton’s method the mesh independence principle has been developed in a general setting by Allgower et al. [5], Popa [245].
8.1. SOME GENERAL PROPERTIES OF THE SOBOLEV SPACE PRECONDITIONERS231 We give more exact formulations of this property in the case of Ritz-Galerkin discretizations, motivated by FEM realization. First we consider simple iterations, then turn to Newton-like methods. (a) Simple iterations Let us consider the case of simple iterations, i.e. when matrices Sh , independent of n, are sought for in the iteration for the discretized problem. That is, instead of (8.11) the required iteration becomes of the form (n+1)
uh
(n)
(n)
= uh − α(n) Sh−1 (Th (uh ) − gh )
(n ∈ N).
(8.12)
In this case the preconditioning operator idea means that we first choose a suitable strictly positive linear elliptic operator S, and define the theoretical sequence u(n+1) = u(n) − α(n) S −1 (T (u(n) ) − g)
(n ∈ N)
(8.13)
instead of (8.9). The related conditioning and convergence properties of the sequence (u(n) ) are found in section 7.1. Then we define Sh := Ph S in the similar way as in (8.10). The subspaces Vh are assumed to be chosen reasonably to satisfy Vh ⊂ D(S) = D(T ). Let the discretization mappings Ph be defined as orthogonal projections into the subspaces Vh ⊂ H. This means that for any uh ∈ Vh the elements Th (uh ) ∈ Vh and Sh uh ∈ Vh are defined by the relations hTh (uh ), vh i = hT (uh ), vh i and hSh uh , vh i = hSuh , vh i
(vh ∈ Vh ),
(8.14)
respectively. Then the condition numbers of the operators Sh−1 Th (h > 0) and S −1 T are related as follows. Theorem 8.2
(1) For any family (Vh )h>0 there holds
cond Sh−1 Th ≤ cond S −1 T
(h > 0).
(2) If (Vh )h>0 is a dense approximating family then
cond Sh−1 Th → cond S −1 T
as h → 0.
Proof. (1) Let D = D(S) = D(T ) as in Theorem 5.13. Then hT (vh ) − T (uh ), vh − uh i hT (v) − T (u), v − ui ≤ sup = Λ(S −1 T ), 2 kvh − uh kS kv − uk2S uh 6=vh ∈Vh u6=v∈D hT (v) − T (u), v − ui hT (vh ) − T (uh ), vh − uh i ≥ inf = λ(S −1 T ), λ(Sh−1 Th ) = inf 2 u6=v∈D uh 6=vh ∈Vh kvh − uh kS kv − uk2S Λ(Sh−1 Th ) =
sup
232CHAPTER 8. PRECONDITIONING STRATEGIES FOR DISCRETIZED NONLINEAR ELLIPT hence
cond Sh−1 Th =
Λ(Sh−1 Th ) Λ(S −1 T ) −1 ≤ = cond S T . λ(S −1 T ) λ(Sh−1 Th )
(2) This follows from the standard continuity argument. Using Theorem 8.2, an estimate of the condition number in Sobolev space implies a subspace independent estimate for the condition numbers in all subspaces. This implies that the number of iterations for prescribed accuracy in a discretized problem is bounded by that for the abstract problem. In terms of spectral equivalence, the above result yields the following conditioning estimate for the discretized problems (8.8): Corollary 8.2 Let the condition mhS(v − u), v − ui ≤ hT (v) − T (u), v − ui ≤ M hS(v − u), v − ui
(u, v ∈ D(T )) (8.15)
be satisfied for S and T , that is cond(S −1 T ) ≤
M , m
with constants M ≥ m > 0 independent of u, v. Then (1) there holds the subspace independent estimate
cond Sh−1 Th ≤
M m
(h > 0);
(2) if in (8.12) we choose the constant stepsize α(n) ≡ linearly with quotient M −m q= M +m independently of h.
2 , M +m
(n)
then uh
converges (8.16)
Remark 8.1 Theorem 8.2 and Corollary 8.2 hold in particular for two linear operators, hence they can also be used for the operators Bn−1 F ′ (un ) in the preconditioned Newtonlike iterations studied in subsections 7.2.2–7.2.3. This means that in the nth step the estimate Mn cond Bn−1 F ′ (un ) ≤ (8.17) mn implies the subspace independent estimate
′ cond (Bn )−1 h Fh (un ) ≤
Mn mn
(h > 0)
for the discretized operators. Analogous to (8.16), the variably preconditioned iteration of Theorem 7.10 then yields the convergence ratio q = lim sup
Mn − m n . Mn + m n
(8.18)
8.1. SOME GENERAL PROPERTIES OF THE SOBOLEV SPACE PRECONDITIONERS233 Further, using this kind of preconditioning in inner iterations as in subsection 7.2.2, if the preconditioned conjugate gradient method is applied for these inner iterations as in Theorem 7.9 then its convergence quotient √ √ Mn − m n Qn = √ √ Mn + m n is also bounded independently of the subspace Vh . Remark 8.2 Corollary 8.2 remains true if M is replaced by some M0 which depends on the initial guess u(0) ∈ H and is valid on a ball around u(0) (that is, in the setting used in Theorems 7.2 and 7.4). Then the conditioning estimate
cond Sh−1 Th ≤
M0 m
(h > 0)
(0)
is valid w.r. to initial guesses uh ∈ Vh that are in this ball too. (The same works if m is also replaced by m0 .)
(b) Newton-like iterations The variable preconditioning realization of the inexact Newton method has been developed in subsection 7.2.3. It has a convergence estimate which includes (8.18), mentioned above in the context of linear convergence, and allows superlinear convergence when q = 0. As seen in subsections 7.2.2–7.2.3, the suitable choice of the variable preconditioners yields convergence up to quadratic order as well as for the damped inexact Newton method. The study of mesh independence in this case relies on different investigations as above, and has been the subject of several papers. The ’mesh independence principle’ for Newton’s method, stating that under suitable conditions the quadratic convergence is subspace (mesh) independent, has been verified under rather general conditions. This includes a normed space setting (see Allgower et al. [5], Popa [245]), and also two-level versions (Axelsson [10], Axelsson–Layton [23]). For a detailed study concerning the standard Newton method the reader is referred to Allgower et al. [5]. Let us consider the weak formulation of problem (8.7), written as F (u) = 0, and use the notation Fh = Ph F . Then the above result [5] implies the following sufficient conditions: the mesh independence principle holds if the discretization mapping Ph is (i) Lipschitz uniform, i.e. there is L > 0 such that kFh′ (u) − Fh′ (v)k ≤ Lku − vk (h > 0, u, v ∈ Vh ); (ii) bounded, i.e. there is q > 0 such that kPh uk ≤ qkuk (h > 0, u ∈ H); (iii) stable, i.e. there is σ > 0 such that kFh′ (Ph u)−1 k ≤ σ (h > 0, u ∈ H);
234CHAPTER 8. PRECONDITIONING STRATEGIES FOR DISCRETIZED NONLINEAR ELLIPT (iv) consistent of order p, i.e. there is c > 0 such that kPh F (u) − Fh (Ph u)k ≤ chp and kPh F ′ (u)v − Fh′ (Ph u)Ph v)k ≤ chp (h > 0, u ∈ W ∩ B, v ∈ W ), where W ⊂ H is a subset containing (un ). These conditions are naturally satisfied in the case of a Ritz-Galerkin type discretization owing to orthogonality, provided that the operator F ′ in H01 (Ω) is Lipschitz continuous and has a bounded inverse. The resulting mesh independent estimates can also be obtained by applying Theorem 7.7 in Vh , since Fh inherits the Lipschitz constant and lower bound of F ′ independently of h. For Galerkin type discretizations the numerical efficiency of Newton’s method can be much increased by a multigrid setting, which can suitably generalize the multigrid methods cited for linear problems in subsection 3.3.2. The two-level Newton method developed in the papers of Axelsson [10], Axelsson–Layton [23]), which uses a very coarse grid and requires only one or two steps on the fine grid, preserves the above mesh independence result. The above conditions (i)-(iv) concern the formally exact Newton method which is is completely defined and hence needs no introduction of preconditioners. The case of interest in our investigations is rather an inexact Newton method, in which most often preconditioners are already required: either for an inner iteration under an outer Newton iteration step, or sometimes as a sequence of variable preconditioners in the role of approximate Jacobians. As we have seen in Remark 8.1, in these methods the Sobolev space background can be used similarly to the case of fixed preconditioners, and yields mesh independent condition numbers. Namely, for any fixed n the conditioning estimate Mn cond Bn−1 F ′ (un ) ≤ mn on the continuous level implies the subspace independent estimate
cond((Bn)h^{−1} Fh′(un)) ≤ Mn/mn   (h > 0)
for the discretized operators.
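To make the subspace independence concrete, the following minimal Python sketch (our own illustration, not taken from the text) discretizes a linearized operator of the form F′(u)z = −(p(x)z′)′ on (0,1) with 0.5 ≤ p ≤ 1.5 by P1 finite elements, and checks that the generalized eigenvalues of the Laplacian-preconditioned stiffness matrix stay in [0.5, 1.5] for every mesh size; the sample coefficient p and the meshes are assumptions of the example.

```python
import numpy as np
from scipy.linalg import eigh

def stiffness(p, n):
    """P1 FEM stiffness matrix of z -> -(p z')' on (0,1) with zero boundary
    values, using the midpoint rule on each of the n elements."""
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h          # element midpoints
    w = p(x) / h                          # elementwise weights
    A = np.zeros((n - 1, n - 1))
    for e in range(n):                    # element e couples interior nodes e-1, e
        for i in (e - 1, e):
            for j in (e - 1, e):
                if 0 <= i < n - 1 and 0 <= j < n - 1:
                    A[i, j] += w[e] * (1 if i == j else -1)
    return A

p = lambda x: 1.0 + 0.5 * np.sin(2 * np.pi * x)     # mu1 = 0.5, mu2 = 1.5
for n in (16, 64, 256):
    A = stiffness(p, n)
    B = stiffness(lambda x: np.ones_like(x), n)     # discrete Laplacian
    lam = eigh(A, B, eigvals_only=True)             # spectrum of B^{-1}A
    print(n, lam.min(), lam.max())                  # stays within [0.5, 1.5]
```

The printed bounds do not deteriorate as n grows, which is exactly the subspace independent behaviour asserted above.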
8.2 Various preconditioning strategies based on the Sobolev space background
This section presents the central part of the applications of Sobolev space preconditioning, namely, we define various preconditioners based on the Sobolev space background. The general approach follows the ideas given in the introduction of this chapter:

• First a preconditioning linear operator is defined in the framework of the theorems of Chapter 7. That is, this operator is determined by the choice of the coefficient matrix (array) G or Gn.

• Then the preconditioning matrix for the discretized problem is obtained by applying the same discretization to this linear operator as the one applied to the boundary value problem. In particular, we focus on FEM discretization, and for each proposed preconditioner we give the construction and, in this case, the structure of the preconditioning (stiffness) matrix. The solution of the corresponding auxiliary linear algebraic systems relies on the solvers cited in subsections 3.3.2–3.3.3.

• The condition number of the preconditioned operator on the continuous level is determined using the results of Chapter 7. Then the conditioning of the discretized operators follows from Theorem 8.2.

The majority of the preconditioners are presented for second order equations (subsections 8.2.1–8.2.12), including some that may be directly extended to other problems. Then subsections 8.2.13–8.2.14 concern fourth order problems and subsection 8.2.15 involves second order systems.
Since the emphasis is on the choice of the coefficient matrices, for convenience most of the preconditioning operators will be discussed under Dirichlet boundary conditions, and in these cases the discretization subspace Vh is appropriately chosen (i.e. for second order problems it is a subspace of H^1_0(Ω)). We note that in general the domain of the preconditioning operator is determined following Chapter 7, and then Vh corresponds to the thereby satisfied boundary conditions. (Incorporating boundary conditions in the preconditioning matrices is considered separately in subsection 8.2.11.)
Accordingly, for many of the proposed strategies we will consider a Dirichlet problem which, for convenience, consists only of a principal part. In the second order case this problem is of the form
T(u) ≡ −div f(x, ∇u) = g(x)   in Ω,
u|∂Ω = 0   (8.19)
on a bounded domain Ω ⊂ R^N. For this problem we will assume that, similarly as before, f : Ω × R^N → R^N is measurable and bounded with respect to the variable x ∈ Ω and C^1 in the other variables; further, the Jacobian matrices ∂f/∂η(x, η) are symmetric and their eigenvalues λ satisfy
0 < µ1 ≤ λ ≤ µ2 < ∞   (8.20)
with constants µ2 ≥ µ1 > 0 independent of (x, η).
In the study of preconditioners we will use the following notations. Similarly to section 8.1, the preconditioning operators used for the differential operator T (in strong form) will be denoted by S or S^(n), depending on whether a fixed operator or variable operators are used, respectively. The discretizations of these operators will be denoted by Sh or Sh^(n), respectively. Further, sometimes it will be more convenient to use the weak form of the preconditioning operator, regarded as a preconditioning operator for the weak form F of the differential operator or, in an inner iteration during an outer Newton step, as a preconditioning operator for F′(un). It is useful to distinguish these cases from the strong form, following Chapter 7, when we wish to use the thereby obtained conditioning estimates. Hence the preconditioning operators in weak form will be denoted by B when a fixed operator is used and, in accordance with section 7.2, by Bn when variable operators are used. The discretizations of these weak operators will be denoted by Bh or (Bn)h, respectively. As a matter of course, the finite element discretizations of the strong and weak forms coincide: for instance, let
Sz ≡ −div(G(x)∇z)   (z ∈ H^2 ∩ H^1_0)  and  ⟨Bz, v⟩ = ∫Ω G(x)∇z · ∇v   (z, v ∈ H^1_0)   (8.21)
be the strong and weak forms of a second order linear operator with Dirichlet boundary conditions, respectively. Then the discretizations of these operators in a FEM subspace Vh yield the same stiffness matrix, given by
(Sh)i,j = (Bh)i,j = ∫Ω G(x)∇vi · ∇vj   (i, j = 1, ..., k),
where v1, ..., vk is a basis of Vh.

Remark 8.3 If a preconditioning operator is used in an inner iteration or as a variable preconditioner (as in subsections 7.2.2–7.2.3), then the resulting condition number is bounded above by the condition number obtained for the same preconditioning operator when used for a simple iteration (as in section 7.1). For instance, for second order equations consisting only of a principal part, this follows from (7.4) and (7.110) (respectively (7.131)), since the latter involve the Jacobian only at η = ∇un(x) instead of all η ∈ R^N, hence mn ≥ m and Mn ≤ M. Consequently, those preconditioning operators in the sequel that are discussed for simple iterations can also be used in inner Newton iterations (or as variable preconditioners) with the same or better condition numbers.
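For readers who want to see the product form Sh = ZWZ^t of Theorem 8.1 in the simplest concrete setting, here is a small Python sketch (our own illustration) of the 1D analogue of (8.21) on a uniform mesh; the names Zt, W and the sample coefficient are assumptions of the example.

```python
import numpy as np

n = 8; h = 1.0 / n                        # n cells on (0,1), interior nodes 1..n-1
Zt = np.zeros((n, n - 1))                 # Z^t ~ grad: nodal values -> cell slopes
for e in range(n):
    if e < n - 1: Zt[e, e] = 1.0 / h
    if e >= 1:    Zt[e, e - 1] = -1.0 / h
Z = Zt.T                                  # Z ~ -div (adjoint of the gradient)
x_mid = (np.arange(n) + 0.5) * h
W = np.diag((2.0 + np.cos(np.pi * x_mid)) * h)   # weights G(x_mid) with quadrature h
Sh = Z @ W @ Zt                           # = Z W Z^t: weighted stiffness matrix
print(np.round(Sh, 2))                    # tridiagonal; for G = I this is -Delta_h
```

With G ≡ 1 the same construction returns the discrete Laplacian −∆h = ZZ^t used in the next subsection.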
8.2.1 Discrete Laplacian preconditioner
(a) The preconditioning operator

The most straightforward preconditioning operator for problem (8.19) is the Laplacian
S = −∆.
This corresponds to the (simplest) coefficient matrix G(x) ≡ I in the general operator Su = −div(G(x)∇u). In the case of Dirichlet boundary conditions, the domain of −∆ is H^2(Ω) ∩ H^1_0(Ω). (For mixed boundary conditions see subsection 8.2.11.)

Remark 8.4 (i) The idea of the discrete Laplacian as preconditioner has been widely applied. Early and later applications for linear problems have been quoted in subsection 3.4.2, such as in the papers of Concus–Golub [76], D'yakonov [96], Gunn [142], Widlund [291], which followed the development of fast Poisson solvers (see subsection 3.3.3). In the nonlinear case we underline the Sobolev gradient approach (Neuberger [231]), in which context the Laplacian preconditioner is discussed in section 7.3. Related convergence results are given in Gajewski–Gröger–Zacharias [127], Koshelev [185], and some more numerical applications for nonlinear problems are found in Carey–Jiang [68], Rossi–Toivanen [257].
(ii) We note that some generalizations of the Laplacian preconditioner have also been developed. First we mention the natural modification Su ≡ −∆u + cu with some constant c > 0. This leads to auxiliary Helmholtz equations whose solution can be achieved via modifications of the fast Poisson solvers; further, sometimes the suitable choice of c may improve the rate of convergence, see Concus–Golub [76], Manteuffel–Otto [203]. Second, the Laplacian can be scaled by a positive C^2 function a : Ω → R^+. Scaling means the transformation
Sa u ≡ −a^{1/2}∆(a^{1/2}u) = −div(a∇u) + qu
with the function q = −a^{1/2}∆(a^{1/2}), which yields that with the changed variable v = a^{1/2}u the Laplacian is equivalent to a more general linear differential operator. This idea has important applications in the linear case (Concus–Golub [76], Greenbaum [138], Guillard–Désidéri [141], Widlund [291]). It can be naturally generalized to nonlinear diffusion operators (i.e. with a depending on the solution) if this a is approximated by a function of x ∈ Ω only, or via frozen coefficient iterations. In the discretized problem the operator Sa has the matrix −D∆h D, where D is a diagonal matrix derived from the function a^{1/2}.

(b) Construction of the preconditioning matrix

The corresponding preconditioner for the discretized problem (8.8) is the discrete Laplacian −∆h, obtained by applying the same discretization for the operator −∆ as the one applied to the boundary value problem in (8.8). In the FEM realization for Dirichlet problems, the auxiliary equations are of the type
find z ∈ Vh :   ∫Ω ∇z · ∇v = ∫Ω r v   (v ∈ Vh)
with some finite-dimensional FEM subspace Vh ⊂ H^1_0(Ω). (The dependence of the elements of Vh on h is not denoted here and in the sequel, in order to have fewer indices.) That is, the preconditioning matrix is given by
(−∆h)i,j = ∫Ω ∇vi · ∇vj   (i, j = 1, ..., k),
where v1, ..., vk is a basis of Vh. Using Theorem 8.1, the preconditioning matrix −∆h has the product form −∆h = ZZ^t, where the matrices Z and Z^t correspond to the discretization of −div and ∇, respectively. We note that the same factorization is valid in the case of FDM discretization, see e.g. Richardson [254].
The solution of the linear algebraic systems containing the preconditioner −∆h relies on the fast Poisson solvers cited in subsection 3.3.3. (Some more details on these solvers will also be given in section 9.3.)

(c) Conditioning

We will determine the condition number of the operator −∆^{−1}T on the continuous level based on Theorem 7.1 in section 7.1. Then the conditioning of the discretized operators will follow from Theorem 8.2. We present preconditioning for a Dirichlet problem. (Boundary conditions of 3rd type will be dealt with in subsection 8.2.11.)
First we consider the problem in (8.19), i.e. an operator consisting only of a principal part. In order to apply Theorem 7.1, we observe that for the operator S = −∆, inequality (7.4) is satisfied with m = µ1 and M = µ2. This holds because the coefficient matrix of S is G(x) ≡ I and (8.20) is equivalent to
µ1|ξ|^2 ≤ ⟨∂f/∂η(x, η) ξ, ξ⟩ ≤ µ2|ξ|^2   ((x, η) ∈ Ω × R^N, ξ ∈ R^N).   (8.22)
Consequently, by (7.9) the Laplacian preconditioner yields
cond(−∆^{−1}T) ≤ µ2/µ1.   (8.23)
Owing to Theorem 8.2, the discretized operators in a FEM subspace Vh ⊂ H^1_0(Ω) inherit the above estimate:
cond(−∆h^{−1}Th) ≤ µ2/µ1,   (8.24)
which is independent of the subspace Vh. If µ2/µ1 is reasonably small, then the discrete Laplacian preconditioner is suitable for the discretized systems. The same estimate holds under mixed boundary conditions, which follows from Remark 7.4.

Remark 8.5 We note that, similarly, the resulting condition number in inner iterations for Newton-like methods can be determined on the continuous level based on Theorems 7.9 and 7.10 in section 7.2. Namely, in this case the numbers µ1, µ2 in (8.22) (and consequently in (8.23)–(8.24) as well) are replaced by the spectral bounds of the matrix ∂f/∂η(x, ∇un(x)), given by (7.110) in the case of Gn(x) ≡ I (the identity matrix). See also Remark 8.3.

Example 1. Assume that ∂f/∂η is uniformly diagonally dominant, i.e. for all i = 1, ..., N
∂fi/∂ηi(x, η) − Σ_{j≠i} |∂fi/∂ηj(x, η)| ≥ µ1 > 0   (x ∈ Ω, η ∈ R^N)   (8.25)
with some constant µ1 independent of x, η. Then the left side of (8.22) is satisfied. Namely,
⟨∂f/∂η(x, η) ξ, ξ⟩ = Σ_{i=1}^N ∂fi/∂ηi(x, η)|ξi|^2 + Σ_{i≠j} ∂fi/∂ηj(x, η) ξi ξj
  ≥ Σ_{i=1}^N ( ∂fi/∂ηi(x, η) − Σ_{j≠i} |∂fi/∂ηj(x, η)| ) |ξi|^2,   (8.26)
hence by (8.25)
⟨∂f/∂η(x, η) ξ, ξ⟩ ≥ µ1 Σ_{i=1}^N |ξi|^2.
If, in addition, we assume that
∂fi/∂ηi ∈ L^∞(Ω)   (8.27)
for all i = 1, ..., N, then the right side of (8.22) is also satisfied: the estimate (8.25) and the upper analogue of (8.26) imply
⟨∂f/∂η(x, η) ξ, ξ⟩ ≤ Σ_{i=1}^N ( 2‖∂fi/∂ηi‖_{L^∞} − µ1 ) |ξi|^2 ≤ µ2 Σ_{i=1}^N |ξi|^2
with
µ2 = 2 max_i ‖∂fi/∂ηi‖_{L^∞} − µ1.   (8.28)
Altogether, (8.25) and (8.27) are computable sufficient conditions for the ellipticity (8.22) of ∂f/∂η.

Example 2. Let T be the operator in (6.57):
T(u) ≡ −div(a(|∇u|)∇u),   (8.29)
where 0 < m ≤ a(r) ≤ (r a(r))′ ≤ M (r > 0) with suitable constants M ≥ m > 0. We note that by Remark 6.7 the operator T corresponds to the potential
ψ(u) = (1/2) ∫Ω A(|∇u|^2),   (8.30)
where the C^1 function A : R^+ → R satisfies
A′(r) = ã(r) = a(r^{1/2}).   (8.31)
Then, as follows from the proof of Theorem 6.6, the function f(x, η) = a(|η|)η satisfies
m|ξ|^2 ≤ a(|η|)|ξ|^2 ≤ ⟨∂f/∂η(x, η) ξ, ξ⟩ ≤ (a(|η|) + a′(|η|)|η|)|ξ|^2 ≤ M|ξ|^2,   (8.32)
that is, T satisfies (8.22) with µ1 = m and µ2 = M. (In terms of the potential, these constants are the uniform convexity bounds of A.) Hence (8.23) and (8.24) hold. If the obtained condition number µ2/µ1 is reasonably small, then the Laplacian preconditioner is proposed owing to the above-mentioned available fast Poisson solvers. An example of this is the elasto-plastic torsion of rods, for which a numerical illustration will be given in section 10.1.
If problem (8.19) is modified such that a lower order term q(x, u) is present in the operator T, then the conditioning estimate cannot be simply obtained from the matrix estimate (8.22) as was the case for (8.23). In this case Theorem 7.2 can be applied, provided that q(x, u) satisfies the estimate
0 ≤ ∂ξ q(x, ξ) ≤ c1 + c2|ξ|^{p−2}   ((x, ξ) ∈ Ω × R)
with suitable p ≥ 2 and c1, c2 > 0. We also assume that Ω is C^2-diffeomorphic to a convex domain. Then (8.23) is replaced by
cond(−∆^{−1}T) ≤ µ̃2/µ1   (8.33)
if, using (7.24) and (7.31), we define
µ̃2 = µ2 + c1 ϱ^{−1} + c2 K^p_{p,Ω} r0^{p−2}
(depending on the initial guess u0), where Kp,Ω is the embedding constant in (7.25), ϱ > 0 denotes the smallest eigenvalue of −∆ on H^2(Ω) ∩ H^1_0(Ω) and
r0 = ‖u0‖_{H^1_0} + (1/(m ϱ^{1/2})) ‖T(u0) − g‖_{L^2(Ω)}.   (8.34)
(Should Ω not be as above, then (7.27) and Remark 7.3 have to be used instead of (7.31).) Further, in virtue of Remark 8.2, the discretized operators −∆h^{−1}Th in a FEM subspace Vh ⊂ H^1_0(Ω) inherit the estimate (8.33), independently of the subspace Vh.
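The following Python sketch (our own 1D finite-difference reconstruction, not from the text) illustrates the Laplacian-preconditioned simple iteration with the optimal steplength 2/(µ1 + µ2), cf. section 7.1, applied to a model problem −(a(|u′|)u′)′ = g; the concrete a(r), mesh and right-hand side are assumptions of the example, and a practical code would replace np.linalg.solve by a fast Poisson solver.

```python
import numpy as np

n = 64; h = 1.0 / n

def a(r):                              # scalar nonlinearity: 1 = m <= a <= c <= M = 2.5
    return 1.0 + r**2 / (1.0 + r**2)

def grad(u):                           # cellwise u' with zero boundary values
    ue = np.concatenate(([0.0], u, [0.0]))
    return np.diff(ue) / h

def residual(u, g):                    # nodal residual of -(a(|u'|)u')' - g
    du = grad(u)
    flux = a(np.abs(du)) * du
    return -np.diff(flux) / h - g

# discrete Laplacian -Delta_h (tridiagonal), built once and reused
L = (np.diag(2.0 * np.ones(n - 1)) - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h**2
g = np.ones(n - 1)
u = np.zeros(n - 1)
m, M = 1.0, 2.5
for it in range(50):                   # u_{k+1} = u_k - 2/(m+M) (-Delta_h)^{-1} r_k
    u -= (2.0 / (m + M)) * np.linalg.solve(L, residual(u, g))
print("final residual norm:", np.linalg.norm(residual(u, g)))
```

The observed linear convergence quotient is at most (M − m)/(M + m), uniformly in the mesh, in line with (8.23)–(8.24).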
8.2.2 Constant coefficient preconditioners
(a) The preconditioning operator

A slight generalization of the Laplacian preconditioning operator is an arbitrary second order linear elliptic operator with constant coefficients. It can be proposed instead of the Laplacian if the ellipticity of the nonlinear operator T includes significant anisotropy in different directions, i.e. the eigenvectors of the Jacobians ∂f/∂η cluster around certain vectors with distant eigenvalues. In this case a suitable coefficient matrix C gives a better uniform approximation of the Jacobians than the identity matrix which defines the Laplacian. The matrix C has to be chosen such that its eigenvectors follow the above described anisotropy. As an illustration of this, we may think of a nonlinear operator T with a matrix-valued diffusion coefficient that is a scalar multiple of some fixed matrix A0 ∈ R^{N×N}, i.e.
T(u) ≡ −div(b(x, ∇u)A0∇u),   (8.35)
where b : Ω × R^N → R is a scalar-valued function. Then the choice C = A0 is a better approximation of the coefficient than the identity matrix, and this also holds for the Jacobians.
The constant coefficient preconditioning operators have the form
Su = −div(C∇u)
with some given symmetric positive definite (SPD) matrix C ∈ R^{N×N}. That is, we set G(x) ≡ C in the general form of preconditioning operators. As is well known, this operator S can be reduced to the Laplacian by a change of variables, which also transforms the domain Ω. Hence S can only be proposed in the case of simple-shaped domains whose transformation is not worth one's while.

(b) Construction of the preconditioning matrix

The corresponding preconditioner Sh for the discretized problem (8.8) is obtained by applying the same discretization for the operator S as the one applied to the boundary value problem in (8.8). In the FEM realization for Dirichlet problems, the auxiliary equations are of the type
find z ∈ Vh :   ∫Ω C∇z · ∇v = ∫Ω r v   (v ∈ Vh)   (8.36)
with some finite-dimensional FEM subspace Vh ⊂ H^1_0(Ω). That is, the preconditioning matrix is given by
(Sh)i,j = ∫Ω C∇vi · ∇vj   (i, j = 1, ..., k),
where v1, ..., vk is a basis of Vh. Using Theorem 8.1, the preconditioning matrix Sh has the product form Sh = ZWZ^t, where the matrices Z and Z^t correspond to the discretization of −div and ∇, respectively, and W is the weight matrix corresponding to the coefficient matrix C.
The solution of the linear algebraic systems containing the preconditioner Sh relies on solution methods designed especially for problems with constant coefficients, cited in subsection 3.3.3.

(c) Conditioning

We will determine the condition number of the operator S^{−1}T on the continuous level based on Theorem 7.1 in section 7.1. Then the conditioning of the discretized operators will follow from Theorem 8.2. (The related estimates in an inner iteration for Newton-like methods can be carried out similarly, as mentioned in Remark 8.5; see also Remark 8.3.)
Let us consider the Dirichlet problem (8.19) again. The conditioning estimates use the norm
|ξ|^2_C = ⟨Cξ, ξ⟩   (8.37)
in R^N. The condition number of the operator S^{−1}T is determined by the bounds
m = inf_{|ξ|_C=1, η∈R^N} ⟨∂f/∂η(x, η) ξ, ξ⟩,   M = sup_{|ξ|_C=1, η∈R^N} ⟨∂f/∂η(x, η) ξ, ξ⟩
(being positive owing to (8.20) and the positivity of C), which yield that
m⟨Cξ, ξ⟩ ≤ ⟨∂f/∂η(x, η) ξ, ξ⟩ ≤ M⟨Cξ, ξ⟩   (8.38)
for all (x, η) ∈ Ω × R^N, ξ ∈ R^N. That is, inequality (7.4) is satisfied for G(x) ≡ C, hence (7.9) yields
cond(S^{−1}T) ≤ M/m.   (8.39)
Owing to Theorem 8.2, the discretized operators in a FEM subspace Vh ⊂ H^1_0(Ω) inherit the above estimate:
cond(Sh^{−1}Th) ≤ M/m,   (8.40)
which is independent of the subspace Vh. The same estimate holds under mixed boundary conditions, which follows from Remark 7.4.

Example. Let E ∈ R^{N×N} be a given SPD matrix, and let us consider the modification of the potential (8.30) when it depends on the E-norm (defined analogously to (8.37)) of ∇u, i.e. in the argument of A we replace |∇u|^2 by |∇u|^2_E. Then the derivative of ψ in strong form is an operator T of the type (8.35) with
b(x, ∇u) = a(|∇u|_E),   A0 = E
and the real function a(s) taken from (8.31). Further, it follows similarly as after (8.30) that the estimate (8.32) holds if |ξ| and |η| are replaced by |ξ|_E and |η|_E, respectively, i.e. we have
m|ξ|^2_E ≤ ⟨∂f/∂η(x, η) ξ, ξ⟩ ≤ M|ξ|^2_E.
Therefore, defining C = E, the operator T satisfies (8.38) and hence (8.39) and (8.40) hold. (We note that in the case of the Laplacian preconditioner for this operator T, we could only estimate the preconditioned condition number by Λ_C M/λ_C m instead of (8.39), where Λ_C and λ_C are the greatest and smallest eigenvalues of C, respectively.)
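The bounds m and M of (8.38) are extreme generalized eigenvalues, which makes them easy to estimate numerically. The Python sketch below (our own toy example) uses a simplified Jacobian family J(η) = b(η)A0, with sample A0 and b as stated in the comments, and the choice C = A0.

```python
import numpy as np
from scipy.linalg import eigh

# Toy Jacobian family J(eta) = b(eta) A0 (a simplification of the Jacobians
# of (8.35); A0 and b are sample data).  With C = A0, the bounds m, M of
# (8.38) are the extreme generalized eigenvalues of (J(eta), C) over eta.
A0 = np.array([[4.0, 1.0],
               [1.0, 2.0]])                      # fixed SPD anisotropy
b = lambda eta: 1.0 + 1.0 / (1.0 + eta @ eta)    # scalar factor in (1, 2]

C = A0
m, M = np.inf, -np.inf
for eta in np.random.default_rng(0).normal(size=(1000, 2)) * 3.0:
    lam = eigh(b(eta) * A0, C, eigvals_only=True)
    m, M = min(m, lam[0]), max(M, lam[-1])
print("m =", m, " M =", M, " bound M/m =", M / m)   # here M/m <= 2
```

For this family the Laplacian preconditioner (C = I) would instead pick up the eigenvalue spread of A0 itself, in line with the closing remark above on Λ_C M/λ_C m.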
8.2.3 Separable preconditioners
(a) The preconditioning operator

Let us consider the Dirichlet problem (8.19) again. We assume that the Jacobians of f are uniformly diagonally dominant, i.e. the sufficient conditions (8.25)–(8.27) of the ellipticity (8.22) are satisfied.
The preconditioning operator S will be a separable elliptic operator, i.e. its coefficients will depend on the distinct single variables xs (s = 1, ..., N). The introduction of such a preconditioning operator is motivated by the available fast solvers for separable problems (see subsection 3.3.3). We give the construction of S as a preconditioning operator for T in a simple iteration; further, we simultaneously indicate the slight changes that turn it into a preconditioning operator for the inner iteration in an outer Newton step un.
We introduce the following notations. For any x ∈ Ω and 1 ≤ s ≤ N let Ωs = {z ∈ Ω : zs = xs} (the set of points in Ω with the same sth coordinate as x) and
as(xs) = inf_{x∈Ωs, η∈R^N} ( ∂fs/∂ηs(x, η) − Σ_{j≠s} |∂fs/∂ηj(x, η)| ),   (8.41)
bs(xs) = sup_{x∈Ωs, η∈R^N} ( ∂fs/∂ηs(x, η) + Σ_{j≠s} |∂fs/∂ηj(x, η)| ).   (8.42)
We note that
as(xs) ≥ µ1,   bs(xs) ≤ µ2   (8.43)
with µ1 and µ2 from (8.25) and (8.28), respectively. Alternatively, if the inner iteration in an outer Newton step un is considered, then in (8.41)–(8.42) the variable η is replaced by ∇un(x) and the infimum and supremum are taken with respect to x ∈ Ωs only.
Then the preconditioning operator is defined as
Su = −Σ_{s=1}^N ∂/∂xs ( as(xs) ∂u/∂xs ).
This corresponds to the coefficient matrix G(x) ≡ diag{as(xs)}_{s=1}^N.

(b) Construction of the preconditioning matrix
The corresponding preconditioner Sh for the discretized problem (8.8) is obtained by applying the same discretization for the operator S as the one applied to the boundary value problem in (8.8). In the FEM realization for Dirichlet problems, the auxiliary equations are of the type
find z ∈ Vh :   ∫Ω Σ_{s=1}^N as(xs) (∂z/∂xs)(∂v/∂xs) = ∫Ω r v   (v ∈ Vh)
with some finite-dimensional FEM subspace Vh ⊂ H^1_0(Ω). That is, the preconditioning matrix is given by
(Sh)i,j = ∫Ω Σ_{s=1}^N as(xs) (∂vi/∂xs)(∂vj/∂xs)   (i, j = 1, ..., k),
where v1, ..., vk is a basis of Vh. Using Theorem 8.1, the preconditioning matrix Sh has the product form Sh = ZWZ^t, where the matrices Z and Z^t correspond to the discretization of −div and ∇, respectively, and W is a diagonal matrix since G is diagonal.
The solution of the linear algebraic systems containing the preconditioner Sh exploits the fact that Sh is the discretization of a separable elliptic operator. Namely, in this case one can use fast solvers developed especially for such elliptic problems. For details see subsection 3.3.3.

(c) Conditioning

We will determine the condition number of the operator S^{−1}T on the continuous level from Theorem 7.1. Then the conditioning of the discretized operators will follow from Theorem 8.2.
The construction of S yields that for all (x, η) ∈ Ω × R^N, ξ ∈ R^N there holds
⟨G(x)ξ, ξ⟩ = Σ_{s=1}^N as(xs)|ξs|^2 ≤ ⟨∂f/∂η(x, η) ξ, ξ⟩ ≤ Σ_{s=1}^N bs(xs)|ξs|^2 ≤ ( sup_{x∈Ω} max_{s=1,..,N} bs(xs)/as(xs) ) ⟨G(x)ξ, ξ⟩,
using (8.26) and the calculation afterwards. That is, the inequality
m⟨G(x)ξ, ξ⟩ ≤ ⟨∂f/∂η(x, η) ξ, ξ⟩ ≤ M⟨G(x)ξ, ξ⟩   ((x, η) ∈ Ω × R^N, ξ ∈ R^N),
required in (7.4), is satisfied with
m = 1,   M = sup_{x∈Ω} max_{s=1,..,N} bs(xs)/as(xs).
Then (7.9) yields
cond(S^{−1}T) ≤ sup_{x∈Ω} max_{s=1,..,N} bs(xs)/as(xs).
Owing to Theorem 8.2, the discretized operators in a FEM subspace Vh ⊂ H^1_0(Ω) inherit the above estimate:
cond(Sh^{−1}Th) ≤ sup_{x∈Ω} max_{s=1,..,N} bs(xs)/as(xs),
which is independent of the subspace Vh. We note that by (8.43) we have M/m ≤ µ2/µ1, hence the separable preconditioner yields the same or a better condition number than the discrete Laplacian. This is naturally due to the fact that (8.41)–(8.42) is a 'variable bound' type improvement of the bounds in (8.25)–(8.27), sensitive to the distinct variables xs. We also remark that the above conditioning estimate is valid simultaneously whether S is a preconditioning operator for T in a simple iteration or for the inner iteration in an outer Newton step un: the difference is already incorporated in the different definitions of the functions as and bs in the two cases.
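In practice the slice bounds (8.41)–(8.42) can be approximated by sampling. The Python sketch below does this for a hypothetical N = 2 Jacobian (everything here, including the Monte Carlo sampling that stands in for the exact infima and suprema, is our own illustrative assumption).

```python
import numpy as np

rng = np.random.default_rng(1)

def J(x, eta):   # hypothetical symmetric, diagonally dominant Jacobian (N = 2)
    t = 1.0 / (1.0 + eta @ eta)
    return np.array([[3.0 + np.sin(4 * x[0]) + t, 0.4],
                     [0.4, 3.0 + np.cos(4 * x[1]) - t]])

def slice_bounds(s, xs, n_samples=300):
    """Sampled version of (8.41)-(8.42): bounds a_s, b_s on the slice
    {x : x_s = xs}, sampling the remaining coordinate and eta."""
    a_s, b_s = np.inf, -np.inf
    for _ in range(n_samples):
        x = rng.random(2); x[s] = xs
        Jm = J(x, rng.normal(size=2) * 3.0)
        off = sum(abs(Jm[s, j]) for j in range(2) if j != s)
        a_s = min(a_s, Jm[s, s] - off)
        b_s = max(b_s, Jm[s, s] + off)
    return a_s, b_s

grid = np.linspace(0.0, 1.0, 21)
ratio = max(b / a for s in (0, 1) for a, b in (slice_bounds(s, t) for t in grid))
print("estimated bound sup_x max_s b_s/a_s:", round(ratio, 3))
```

The printed value estimates the mesh independent condition number bound of the separable preconditioner for this sample Jacobian.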
8.2.4 Linear principal part preconditioner
(a) The preconditioning operator

For semilinear problems, a straightforward fixed preconditioning operator is the linear operator in the principal part. That is, if the semilinear operator is
T(u) = −div(A(x)∇u) + q(x, u)   (8.44)
such that it is a special case of (7.21), then the preconditioning operator is
Su = −div(A(x)∇u).
This corresponds to the coefficient matrix G(x) ≡ A(x). The corresponding iteration for a Dirichlet problem
T(u) = g,  u|∂Ω = 0
is obtained by Theorem 7.2. Since now m = 1, the iteration is given by
un+1 = un − (2/(M0 + 1)) zn,
where
∫Ω A(x)∇zn · ∇v = ∫Ω [A(x)∇un · ∇v + (q(x, un) − g)v]   (v ∈ H^1_0(Ω)).
(The constant M0 is given below in (8.47).) Setting wn := zn − un, the iteration takes the simpler form
un+1 = ((M0 − 1)/(M0 + 1)) un − (2/(M0 + 1)) wn,   (8.45)
where
∫Ω A(x)∇wn · ∇v = ∫Ω (q(x, un) − g)v   (v ∈ H^1_0(Ω)).
Here the strong form of the auxiliary equation is −div(A(x)∇wn) = q(x, un) − g.
We note that the above iteration can be regarded as an improvement of the related 'linearized' iteration
div(A(x)∇un+1) = q(x, un) − g,   (8.46)
which is another frequently used iteration for the discussed semilinear problem. Namely, setting M0 = 1 in (8.45) would give un+1 = −wn, which just yields the linearized iteration (8.46). However, Theorem 7.2 in fact gives M0 > m = 1, that is, un+1 is a proper convex combination of un and −wn such that an optimal linear convergence quotient is achieved.

(b) Construction of the preconditioning matrix

The corresponding preconditioner Sh for the discretized problem (8.8) is obtained by applying the same discretization for the operator S as the one applied to the boundary value problem in (8.8). In the FEM realization for Dirichlet problems, the auxiliary equations are of the type
find z ∈ Vh :   ∫Ω A(x)∇z · ∇v = ∫Ω r v   (v ∈ Vh)
with some finite-dimensional FEM subspace Vh ⊂ H^1_0(Ω). That is, the preconditioning matrix is given by
(Sh)i,j = ∫Ω A(x)∇vi · ∇vj   (i, j = 1, ..., k),
where v1, ..., vk is a basis of Vh. Using Theorem 8.1, the preconditioning matrix Sh has the product form Sh = ZWZ^t, where the matrices Z and Z^t correspond to the discretization of −div and ∇, respectively, and W is the weight matrix corresponding to A.
The solution of the linear algebraic systems containing the preconditioner Sh relies on the solution methods cited in subsection 3.3.2.

(c) Conditioning

We will determine the condition number of the operator S^{−1}T on the continuous level from Theorem 7.2. Then the conditioning of the discretized operators follows from Theorem 8.2.
The assumptions of Theorem 7.2 for (8.44) require that the matrices A(x) are symmetric and their eigenvalues lie between constants µ2 ≥ µ1 > 0 independent of x, and further that the function q(x, u) satisfies the estimate
0 ≤ ∂ξ q(x, ξ) ≤ c1 + c2|ξ|^{p−2}   ((x, ξ) ∈ Ω × R)
with suitable p ≥ 2 and c1, c2 > 0. In addition, let us first assume for convenience that A ∈ C^1(Ω, R^{N×N}) and Ω is C^2-diffeomorphic to a convex domain. Then Theorem 7.2 yields the lower bound
m = 1
and, using (7.24) and (7.31), the upper bound
M0 = 1 + c1 ϱ^{−1} + c2 K^p_{p,Ω} r0^{p−2}   (8.47)
depending on the initial guess u0 and the right-hand side g, namely,
r0 = ‖u0‖_A + ϱ^{−1/2}‖T(u0) − g‖_{L^2(Ω)},
where Kp,Ω is the embedding constant in (7.25) with G = A and ϱ > 0 denotes the smallest eigenvalue of S on H^2(Ω) ∩ H^1_0(Ω). (If A ∉ C^1 or Ω is not as above, then (7.27) and Remark 7.3 have to be used instead of (7.31).) Altogether, we obtain the condition number (relative to the initial guess u0)
cond(S^{−1}T) ≤ M0/m = 1 + c1 ϱ^{−1} + c2 K^p_{p,Ω} r0^{p−2}.
In virtue of Remark 8.2, the discretized operators Sh^{−1}Th in a FEM subspace Vh ⊂ H^1_0(Ω) inherit the above estimate, independently of the subspace Vh.
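A minimal Python sketch of the iteration (8.45) follows, for the 1D model −u″ + q(u) = g with A(x) ≡ I and q(ξ) = ξ³ (both our sample choices); the value M0 = 2.0 is an illustrative stand-in for what (8.47) would give, not a computed constant.

```python
import numpy as np

# 1D sketch of iteration (8.45): -u'' + u^3 = g on (0,1), u(0) = u(1) = 0.
n = 64; h = 1.0 / n
L = (np.diag(2.0 * np.ones(n - 1)) - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h**2     # S_h; factorize once in practice
g = 10.0 * np.ones(n - 1)
M0 = 2.0                                        # illustrative value of (8.47)
u = np.zeros(n - 1)
for k in range(60):
    w = np.linalg.solve(L, u**3 - g)            # -w'' = q(u_k) - g, cf. (8.45)
    u = (M0 - 1) / (M0 + 1) * u - 2.0 / (M0 + 1) * w
print("residual norm:", np.linalg.norm(L @ u + u**3 - g))
```

Note that the auxiliary matrix L never changes, so a single factorization (or a fast Poisson solver) serves all iterations; only the right-hand side q(un) − g is updated.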
8.2.5 Frozen coefficient preconditioner
(a) The preconditioning operator

The frozen coefficient iteration of section 7.4.2 for second order Dirichlet problems can be expressed in terms of variable preconditioning as follows. We consider operators of the form
T(u) = −div(b(x, ∇u)∇u),
where b is continuous and
0 < β1 ≤ b(x, η) ≤ β2
with constants β2 ≥ β1 > 0 independent of (x, η) ∈ Ω × R^N.
In the nth step un of the iteration, we define the 'frozen coefficient' preconditioning operator as
S^(n) z = −div(b(x, ∇un)∇z).
Its weak form is
⟨Bn z, v⟩ = ∫Ω b(x, ∇un)∇z · ∇v   (z, v ∈ H^1_0(Ω)).
This preconditioning operator corresponds to the coefficient matrix Gn(x) ≡ b(x, ∇un(x))·I, where I is the identity matrix. Then it is easy to see that the preconditioned iteration, defined with the operator S^(n) and steplength 1:
un+1 = un − (S^(n))^{−1}(T(un) − g)   (8.48)
is equivalent to the standard frozen coefficient iteration defined in section 7.4.2. Namely, by definition we have S^(n) un = T(un), hence (8.48) is equivalent to
S^(n) un+1 = S^(n) un − T(un) + g = g,
that is,
−div(b(x, ∇un)∇un+1) = g,
or in weak form
∫Ω b(x, ∇un)∇un+1 · ∇v = ∫Ω g v   (v ∈ H^1_0(Ω)),
which coincides with the frozen coefficient iteration (7.190) for our operator T.

(b) Construction of the preconditioning matrix

The corresponding preconditioner Sh^(n) for the discretized problem (8.8) is obtained by applying the same discretization for the operator S^(n) as the one applied to the boundary value problem in (8.8). In the FEM realization for Dirichlet problems, the auxiliary equations are of the type
find z ∈ Vh :   ∫Ω b(x, ∇un)∇z · ∇v = ∫Ω r v   (v ∈ Vh)
with some finite-dimensional FEM subspace Vh ⊂ H^1_0(Ω). That is, the preconditioning matrix is given by
(Sh^(n))i,j = ∫Ω b(x, ∇un)∇vi · ∇vj   (i, j = 1, ..., k),
where v1, ..., vk is a basis of Vh.
Using Theorem 8.1, the preconditioning matrix Sh^(n) has the product form
Sh^(n) = Z Wn Z^t,
where the matrices Z and Z^t correspond to the discretization of −div and ∇, respectively, i.e. they are independent of n and hence need not be updated.
Since the operator S^(n) has the scalar-valued coefficient b(x, ∇un), we obtain that the weight matrix Wn is diagonal.
The solution of the linear algebraic systems containing the preconditioner Sh^(n) relies on the solution methods cited in subsection 3.3.2.

(c) Conditioning

The convergence of the frozen coefficient iteration is established by the results cited in subsection 7.4.2, which are formulated independently of condition numbers. We note that favourable conditioning properties of frozen coefficient iterations are shown by the numerical experiments of Axelsson–Gustafsson [17], Georgiev–Margenov–Neytcheva [130].
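The following 1D finite-difference Python sketch of the iteration (8.48) is our own illustration (the sample b, mesh and right-hand side are assumptions); it shows how only the diagonal weight matrix changes between steps, mirroring the factorization Sh^(n) = Z Wn Z^t.

```python
import numpy as np

n = 64; h = 1.0 / n

def b(du):                              # sample coefficient, 1 <= b <= 2
    return 1.0 + 1.0 / (1.0 + du**2)

def weighted_stiffness(w):              # tridiagonal Z W_n Z^t, cellwise weights w
    A = np.diag((w[:-1] + w[1:]) / h**2)
    A -= np.diag(w[1:-1] / h**2, 1) + np.diag(w[1:-1] / h**2, -1)
    return A

g = np.ones(n - 1)
u = np.zeros(n - 1)
for it in range(30):                    # (8.48): -(b(u_n') u_{n+1}')' = g
    du = np.diff(np.concatenate(([0.0], u, [0.0]))) / h   # frozen gradient
    u_new = np.linalg.solve(weighted_stiffness(b(du)), g)
    if np.linalg.norm(u_new - u) < 1e-12:
        break
    u = u_new
print("steps:", it, " update norm:", np.linalg.norm(u_new - u))
```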
8.2.6 Initial shape preconditioners
As described in the introduction, the coefficient matrix of the preconditioner is expected to be in some sense close to the coefficients of the original operator arising during the iteration. In particular, closeness can be understood as the two coefficients having similar shapes, with special respect to large variations between small and large values. The initial preconditioner of a variable preconditioning procedure may satisfy this requirement. In the case when variable preconditioning seems too costly, one may keep this initial preconditioner for the whole iteration if it comes from a reasonable initial guess. A well-known case of this is the modified Newton method, when F′(u0) is chosen as fixed preconditioner. In this subsection we sketch the properties of two such preconditioners: the modified Newton preconditioner and the initial coefficient preconditioner (which is obtained similarly from the frozen coefficient iteration). We note that the modified Newton iteration in function space was introduced by Kantorovich (see Kantorovich–Akilov [165]), and the discussion of initial coefficient preconditioning is given by Axelsson [8].

(a) The preconditioning operators

The preconditioning operator in the modified Newton method is the initial derivative F′(u0) of the weak form F of the original differential operator T. For second order Dirichlet problems with the operator T(u) = −div f(x, ∇u), this preconditioning operator is defined by the equality
⟨Bz, v⟩_{H^1_0} ≡ ⟨F′(u0)z, v⟩_{H^1_0} = ∫Ω ∂f/∂η(x, ∇u0)∇z · ∇v   (z, v ∈ H^1_0(Ω)),
where ⟨z, v⟩_{H^1_0} = ∫Ω ∇z · ∇v. This corresponds to the coefficient matrix
G(x) ≡ ∂f/∂η(x, ∇u0).
The strong form of the operator is
Sz = −div( ∂f/∂η(x, ∇u0) ∇z )   (z ∈ H^2(Ω) ∩ H^1_0(Ω)).
The initial coefficient preconditioning operator is defined for problems with operators of the form T(u) = −div(b(x, ∇u)∇u). Then the preconditioning operator is defined by the equality
⟨Bz, v⟩_{H^1_0} = ∫Ω b(x, ∇u0)∇z · ∇v   (z, v ∈ H^1_0(Ω)).
In strong form:
Sz = −div(b(x, ∇u0)∇z)   (z ∈ H^2(Ω) ∩ H^1_0(Ω)).
This corresponds to the coefficient matrix G(x) ≡ b(x, ∇u0(x))·I, where I is the identity matrix.
For both preconditioning operators, it is reasonable to choose an initial guess u0 which approximates the solution as much as possible using the available previous information. This obvious requirement has clearly even more importance here than for preconditioners independent of u0, as e.g. in subsections 8.2.1–8.2.4. (Note that if the coefficient is independent of x, then for u0 ≡ 0 the initial shape preconditioners reduce to constant coefficient preconditioners.)

(b) Construction of the preconditioning matrix

The corresponding preconditioner Sh for the discretized problem (8.8) is obtained by applying the same discretization for the operator S as the one applied to the boundary value problem in (8.8). In the FEM realization for Dirichlet problems, the auxiliary equations are of the following type: for the modified Newton preconditioner
find z ∈ Vh :   ∫Ω ∂f/∂η(x, ∇u0)∇z · ∇v = ∫Ω r v   (v ∈ Vh),
and for the initial coefficient preconditioner
find z ∈ Vh :   ∫Ω b(x, ∇u0)∇z · ∇v = ∫Ω r v   (v ∈ Vh)
with some finite-dimensional FEM subspace Vh ⊂ H^1_0(Ω). That is, the preconditioning matrices are given by
(Sh)i,j = ∫Ω ∂f/∂η(x, ∇u0)∇vi · ∇vj   (i, j = 1, ..., k)
and
(Sh)i,j = ∫Ω b(x, ∇u0)∇vi · ∇vj   (i, j = 1, ..., k),
respectively, where v1, ..., vk is a basis of Vh. Using Theorem 8.1, for both preconditioners the matrix Sh has the product form Sh = ZWZ^t, where the matrices Z and Z^t correspond to the discretization of −div and ∇, respectively, and W is the weight matrix. In the case of the (scalar-valued) initial coefficient preconditioner, W is diagonal.
The solution of the linear algebraic systems containing the preconditioner Sh relies on the solution methods cited in subsection 3.3.2.

(c) Conditioning

Since the preconditioning operator relies on the shape of an initial function, the resulting conditioning is better than for preconditioners independent of the coefficients. We give an estimate for the modified Newton preconditioner, which depends on the initial accuracy and is based on spectral equivalence. A study of the initial coefficient preconditioner is found in Axelsson [8].
Similarly as before, the conditioning of the preconditioned operator with the modified Newton preconditioner is first determined on the continuous level; then the conditioning of the discretized operators follows from Theorem 8.2.
The conditioning of the modified Newton preconditioning operator can be estimated using the results in subsection 5.3.2. Let us consider the Dirichlet problem (8.19)–(8.20). Let F be the weak differential operator corresponding to T. Condition (8.20) implies as usual that
µ1‖v‖^2 ≤ ⟨F′(u)v, v⟩ ≤ µ2‖v‖^2   (u, v ∈ H^1_0(Ω)),   (8.49)
where the usual inner product ⟨u, v⟩ = ∫Ω ∇u · ∇v and the corresponding norm are used in H^1_0(Ω). Further, let L denote the Lipschitz constant of F′. Then, using Lemma 5.1, it follows similarly to Corollary 5.5 that for any u ∈ H there holds
1/(1 + µ(u)) ≤ ⟨F′(u)v, v⟩ / ⟨F′(u0)v, v⟩ ≤ 1 + µ(u),   (8.50)
where µ(u) = Lµ1^{−2}‖F(u) − F(u0)‖. In the calculations below, we assume that the initial residual satisfies the accuracy condition
γµ1^{−1/2}‖F(u0) − b‖ < 1
with γ = Lµ1^{−5/2}µ2. (This will justify the achieved estimate (8.51). Otherwise we use that the two sides of (8.50) can be replaced by µ1/µ2 and µ2/µ1, respectively, which follows from (8.49).)
We introduce the norm
‖v‖0 = ⟨F′(u0)v, v⟩^{1/2}   (v ∈ H^1_0(Ω)).
Then
‖v‖0 ≥ µ1^{1/2}‖v‖   (v ∈ H^1_0(Ω)).
We obtain
µ(u) ≤ Lµ1^{−2}µ2‖u − u0‖ ≤ Lµ1^{−5/2}µ2‖u − u0‖0 = γ‖u − u0‖0
with γ = Lµ1^{−5/2}µ2, and hence
(1/(1 + γ‖u − u0‖0))‖v‖0^2 ≤ ⟨F′(u)v, v⟩ ≤ (1 + γ‖u − u0‖0)‖v‖0^2   (u, v ∈ H^1_0(Ω)).
Letting
m(t) = 1/(1 + γt),   M(t) = 1 + γt   (t ≥ 0),
we can apply part (ii) of Remark 5.20 with S = F′(u0). That is, introducing the functions
M̃(t) = tM(t),   m̃(t) = tm(t),
the bounds M0 and m0 (relative to the initial guess u0) can be estimated by
M0 ≤ M( 2m̃^{−1}(µ1^{−1/2}‖F(u0) − b‖) ),   m0 ≥ m( 2M̃^{−1}(µ1^{−1/2}‖F(u0) − b‖) ).
Since now there holds m(t) = 1/M(t), it suffices to estimate M0, since m0 = 1/M0. We have
m̃(t) = t/(1 + γt)  and hence  m̃^{−1}(s) = s/(1 − γs)   (t, s ≥ 0),
therefore M(2m̃^{−1}(s)) = (1 + γs)/(1 − γs), which yields
M0 ≤ (1 + γµ1^{−1/2}‖F(u0) − b‖) / (1 − γµ1^{−1/2}‖F(u0) − b‖).   (8.51)
(By the initial assumption on u0, the denominator in (8.51) is positive. Without this assumption we have µ2/µ1 as a general upper bound for M0.)
Altogether, using m0 = 1/M0 and γ = Lµ1^{−5/2}µ2, we obtain that the condition number (relative to the initial guess u0) satisfies
cond(B^{−1}F) ≤ M0/m0 ≤ ( (1 + γ̃‖F(u0) − b‖) / (1 − γ̃‖F(u0) − b‖) )^2
with γ̃ = Lµ1^{−3}µ2. We note that the effect of the initial residual on the condition number is shown directly by the obtained estimate.
In virtue of Remark 8.2, the discretized operators in a FEM subspace Vh ⊂ H^1_0(Ω) inherit the above estimate:
cond(Bh^{−1}Fh) ≤ ( (1 + γ̃‖F(u0) − b‖) / (1 − γ̃‖F(u0) − b‖) )^2   (8.52)
(relative to the initial guess u0), which is independent of the subspace Vh.
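The bound (8.52) is easy to evaluate once the constants are known. In the snippet below all numbers (the Lipschitz constant, the spectral bounds and the initial residual norm) are made-up sample values, used only to show the arithmetic and the accuracy check that keeps the denominator positive.

```python
# Numerical evaluation of the bound (8.52); L_lip, mu1, mu2 and res0 are
# hypothetical sample values, not constants from the text.
L_lip, mu1, mu2 = 0.8, 1.0, 2.5
res0 = 0.2                                  # ||F(u0) - b||
gamma_t = L_lip * mu1**(-3) * mu2           # gamma-tilde = L mu1^{-3} mu2
assert gamma_t * res0 < 1.0                 # accuracy assumption before (8.51)
bound = ((1.0 + gamma_t * res0) / (1.0 - gamma_t * res0))**2
print("cond(Bh^{-1} Fh) <=", round(bound, 3))   # about 5.44 for these values
```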
8.2.7 Preconditioners using domain decomposition
(a) The preconditioning operator

In this subsection we introduce piecewise constant coefficient preconditioning operators, defined via a domain decomposition that equidistributes the variation in the coefficient. The derived preconditioning matrices will be suitable, among other things, for compensating sharp gradients or discontinuous coefficients. At the same time, by their construction the operators are close to the Laplacian. The introduction of this preconditioner and more details are found in Axelsson–Faragó–Karátson [16]. We remark that this approach is not identical to a standard domain decomposition method, in which distinct problems are solved on the subdomains and the latter are defined by some properties of the shape of the domain Ω.
The discussion is given for Dirichlet problems corresponding to the operator T(u) = −div f(x, ∇u), with the usual ellipticity condition
λ|ξ|^2 ≤ ⟨∂f/∂η(x, η) ξ, ξ⟩ ≤ Λ|ξ|^2   (x ∈ Ω, η, ξ ∈ R^N)   (8.53)
with constants Λ ≥ λ > 0 independent of (x, η). This kind of preconditioner is used naturally in the variable setting: it will be constructed in the nth step of an iteration as a preconditioner for the weak differential operator F′(un) given by
⟨F′(un)z, v⟩ = ∫Ω ∂f/∂η(x, ∇un)∇z · ∇v   (z, v ∈ H^1_0(Ω)).   (8.54)
This is described in the sequel. The general construction and conditioning analysis in paragraphs (a)-(c) will be completed by the study of the case with scalar nonlinearity in paragraph (d).
Let un be fixed. The piecewise constant coefficient operator for the preconditioning of F′(un) is defined as follows. To improve the spectral bounds in (8.53), the domain Ω is decomposed into disjoint subdomains Ωi:
Ω = Ω1 ∪ Ω2 ∪ ... ∪ Ωsn   (8.55)
such that for all i = 1, .., sn
λi|ξ|^2 ≤ ⟨∂f/∂η(x, ∇un(x)) ξ, ξ⟩ ≤ Λi|ξ|^2   (x ∈ Ωi, ξ ∈ R^N),   (8.56)
with λ ≤ λi ≤ Λi ≤ Λ. We introduce a piecewise constant weight function wn such that
wn|Ωi ≡ ci   (i = 1, .., sn),   (8.57)
where λi ≤ ci ≤ Λi. Then the preconditioning operator Bn : H^1_0(Ω) → H^1_0(Ω) is the weak differential operator defined by
⟨Bn z, v⟩ = ∫Ω wn(x)∇z · ∇v   (z, v ∈ H^1_0(Ω)).   (8.58)
This preconditioning operator corresponds to the coefficient matrix Gn(x) ≡ wn(x)·I, where I is the identity matrix. The strong form of the preconditioning operator is given formally by S^(n)u = −div(wn(x)∇u). However, this will not be used, since for general u only the weak formulation makes sense, owing to the discontinuity of wn.
In practice, the domain decomposition during the numerical solution is naturally executed for the arising numerical approximations un. Then a convenient way to carry out the decomposition is to use the nodal values of |∇un| and adapt the refinement of the mesh to the level line structure such that the subdomains consist of entire elements (i.e. the weight function is constant on each element). This can be executed using a standard mesh generation procedure. In this context the practically most convenient elements are piecewise linear functions, since then the contours are polygonal; moreover, the value of |∇un| is constant on each fixed element and hence the decomposition automatically fits the element interfaces. (More generally, one might use isoparametric elements.) We remark that in this way the preconditioning matrix can be regarded as a coarse approximation of the Jacobian using averaging on the subdomains.

(b) Construction of the preconditioning matrix

The corresponding preconditioner (Bn)h for the discretized problem (8.8) is obtained by applying the same discretization for the operator Bn as the one applied to the boundary value problem in (8.8). In the FEM realization for Dirichlet problems, the auxiliary equations are of the type
find z ∈ Vh :   ∫Ω wn(x)∇z · ∇v = ∫Ω r v   (v ∈ Vh)
with some finite-dimensional FEM subspace Vh ⊂ H^1_0(Ω). That is, the preconditioning matrix is given by
((Bn)h)i,j = ∫Ω wn(x)∇vi · ∇vj   (i, j = 1, ..., k),
where v1, ..., vk is a basis of Vh.
Using Theorem 8.1, the preconditioning matrix (Bn)h has the product form (Bn)h = Z Wn Z^t. Since the operator Bn has the scalar-valued coefficient wn, we obtain that Wn is a diagonal matrix. Further, if the mesh fits the decomposition (8.55) such that wn is constant on each element, then Wn contains the constants ci at the entries corresponding to the subdomains Ωi. The matrices Z and Z^t correspond to the discretization of −div and ∇, respectively, i.e. they are independent of n, and hence only the diagonal matrix Wn has to be updated. The above factorization shows that (Bn)h has a convenient structure close to the discrete Laplacian. We note that by a suitably refined decomposition one can limit the size of the jumps of wn between adjacent subdomains.
The solution of the linear algebraic systems containing the preconditioner (Bn)h may rely on many existing efficient methods, including ones designed especially for piecewise constant coefficient problems and cited in subsection 3.3.3. Namely, we refer to algebraic multilevel preconditioners (Axelsson–Vassilevski [31, 32]), multigrid methods (Khoromskij–Wittum [179]), standard domain decomposition in terms of additive Schwarz methods (Dryja [95], Graham–Hagger [137]), or scaled Laplacian preconditioners (Greenbaum [138]).

(c) Conditioning

We will determine the condition number of the operator Bn^{−1}F′(un) in H^1_0(Ω) based on Theorem 7.9 in section 7.1. Then the conditioning of the discretized operators will follow from Theorem 8.2. Introducing
mn := min_i λi/ci  and  Mn := max_i Λi/ci,
the left and right sides of (8.56) can be estimated further by mn wn(x)|ξ|^2 and Mn wn(x)|ξ|^2, respectively. This and (8.54) yield
mn⟨Bn v, v⟩ = mn ∫Ω wn|∇v|^2 ≤ ⟨F′(un)v, v⟩ ≤ Mn ∫Ω wn|∇v|^2 = Mn⟨Bn v, v⟩   (8.59)
for all v ∈ H^1_0(Ω). In other words, (8.58) means that Bn defines the energy inner product
⟨z, v⟩_{Bn} = ∫Ω wn∇z · ∇v   (z, v ∈ H^1_0(Ω)),   (8.60)
and hence (8.59) implies
mn‖v‖^2_{Bn} ≤ ⟨Bn^{−1}F′(un)v, v⟩_{Bn} ≤ Mn‖v‖^2_{Bn}   (v ∈ H^1_0(Ω)).   (8.61)
Consequently, we obtain the estimate
cond(Bn^{−1}F′(un)) ≤ Mn/mn   (8.62)
for the condition number of the operator Bn^{−1}F′(un) in H^1_0(Ω) when the latter is endowed with the inner product (8.60).
Owing to Theorem 8.2 and Remark 8.1, the discretized operators in a FEM subspace Vh ⊂ H^1_0(Ω) inherit the above estimate:
cond((Bn)h^{−1} Fh′(un)) ≤ Mn/mn,   (8.63)
which is independent of the subspace Vh. We remark that, clearly, the condition number can be decreased by a suitable refinement of the decomposition. Moreover, a reasonable choice of ci is some p-adic – especially, arithmetic, geometric or harmonic – mean of λi and Λi, in which case it is easily seen that
Mn/mn = max_i Λi/λi.   (8.64)
Hence, if the decomposition is chosen such that Λi/λi is the same for all i = 1, ..., sn, then the improvement of the condition number will grow rapidly with the number of subdomains.

(d) Decomposition for scalar nonlinearity

The main point in the construction of the proposed preconditioners is to control the bounds λi and Λi and the corresponding subdomains Ωi in (8.55)-(8.56). This is easy to execute when the dependence of the nonlinearity f on ∇u is in fact on |∇u|, since in this case the spectrum of ∂f/∂η(x, ∇un(x)) can be estimated in terms of |∇un(x)|. We present this below for the case of a scalar diffusion-type coefficient, i.e. for nonlinearities of the form
f(x, η) = b(x, |η|)η,
where b : Ω × R^+ → R^+ is differentiable in r and satisfies
0 < λ ≤ b(x, r) ≤ b(x, r) + (∂b(x, r)/∂r) r ≤ Λ   (x ∈ Ω, r ≥ 0).   (8.65)
This is a special case in which (8.53) holds (see e.g. Theorem 6.6), and in fact, f has the above form in many applications, as seen in Chapter 1. For simplicity, let us first neglect the dependence on x and consider
f(x, η) = a(|η|)η.   (8.66)
Then (8.65) reduces to
0 < λ ≤ a(r) ≤ a(r) + a′(r)r ≤ Λ   (r ≥ 0).   (8.67)
A simple calculation yields
a(|∇un(x)|)|ξ|^2 ≤ ⟨∂f/∂η(x, ∇un(x)) ξ, ξ⟩ ≤ (a(|∇un(x)|) + a′(|∇un(x)|)|∇un(x)|)|ξ|^2,   (8.68)
see e.g. (8.32). In order to determine the improved bounds (8.56) for given subdomains Ωi, we use the notation
c(r) := a(r) + a′(r)r   (8.69)
and let
λi = inf_{x∈Ωi} a(|∇un(x)|),   Λi = sup_{x∈Ωi} c(|∇un(x)|).   (8.70)
Alternatively, we can first choose convenient values of λi and Λi, and determine suitable decompositions based on the relations (8.70). For this purpose we study the real function r ↦ a(r), and let the subdomains Ωi be determined as corresponding level sets of |∇un|. Namely, let
κ0 := sup_{r≥0} c(r)/a(r).
First a number
κ > κ0   (8.71)
is fixed such that κ is an a priori acceptable condition number for Bn^{−1}F′(un). Then the interval [0, ∞) is decomposed recursively into subintervals Ji = [ri−1, ri) such that
sup_{r∈Ji} c(r) / inf_{r∈Ji} a(r) = κ   (i = 1, 2, ...).   (8.72)
Since a is increasing and c(r)/a(r) is bounded by Λ/λ, this procedure terminates in a finite number of steps (say s), yielding the intervals Ji, where
0 = r0 < r1 < ... < ri < ... < rs−1 (< rs = ∞).   (8.73)
Then the subdomains Ωi can be defined as corresponding level sets of |∇un|, namely
Ωi := {x ∈ Ω : |∇un(x)| ∈ Ji}   (i = 1, ..., s),
i.e. ri−1 ≤ |∇un(x)| < ri for x ∈ Ωi. Further, by the definition of Ωi and the requirement (8.70), we introduce
λi = inf_{r∈Ji} a(r),   Λi = sup_{r∈Ji} c(r).   (8.74)
The weight function wn is then defined as in (8.57). Using (8.72) and (8.74), the above construction yields
Λi = κ λi   (i = 1, ..., s).   (8.75)
From (8.62) and (8.64) it then follows that the condition number of the preconditioned matrix is at most κ.
(Note that the possible inaccuracy of the estimate (8.68) of the spectrum implies that, by refining the decomposition, κ may only approach κ0 defined in (8.71) instead of 1. However, this is small enough in any practical situation. Even in the extremely ill-conditioned model case (8.76), which yields a condition number close to 10^5, the number κ0 gives a convergence quotient less than 0.5, as we will see in section 10.2.)
We conclude that a prescribed condition number of the preconditioned matrix can be achieved by studying only the real function a.
The same construction can be repeated in the case of an x-dependent coefficient if b(x, |η|) is a different function of |η| on distinct subdomains of Ω, also allowing possible discontinuities on the interfaces. Then we proceed as above on each subdomain. An example of this is a nonlinearity of the form
b(x, r) := a(r) if x ∈ Ω1,   α if x ∈ Ω \ Ω1,
where Ω1 ⊂ Ω, a(r) is as above and α > 0 is a constant. (This coefficient arises for the potential in H-shaped magnets, see Křížek–Neittaanmäki [188].) Then a suitable choice of the weight function wn, which follows the discontinuity of the coefficient, is defined as above for a(r) on Ω1 and as the constant α on Ω \ Ω1.

Example. The following nonlinearity characterizes the reluctance of stator sheets in the cross-sections of an electrical motor in the case of isotropic media [134, 188]:
a(r) = (1/µ0) ( α + (1 − α) r^8/(r^8 + β) )   (r ≥ 0),   (8.76)
where α = 0.0003 and β = 16000; further, µ0 is the vacuum permeability. (See also [26] for an earlier study of this problem and domain decomposition.) The function a varies over several orders of magnitude on the whole domain, hence the corresponding operator T is almost singular. Namely, the bounds in (8.53) are λ = α = 0.0003 and Λ = max(a(r) + a′(r)r) = 2.5313, hence the related condition number and corresponding CG convergence quotient are
Λ/λ = 8437.7,   (√Λ − √λ)/(√Λ + √λ) = 0.9785.
The preconditioning technique of this subsection will be applied to this operator in section 10.2, see (10.26). We will see that a few subdomains suffice to achieve a reasonable convergence quotient. For instance, we can achieve a CG convergence quotient Q = 0.6123 using 9 subdomains. (Further, in this subdivision the jumps of wn between adjacent subdomains are only around 2.)
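The recursion (8.72) is straightforward to implement. The Python sketch below does so for the nonlinearity (8.76), with µ0 normalized to 1 (an assumption consistent with the bound λ = α quoted above); the grid range r_max and the greedy grid sweep replacing exact suprema are further assumptions of the sketch. For κ = 3 it produces a small number of intervals, of the same order as the few subdomains quoted above.

```python
import numpy as np

alpha, beta = 0.0003, 16000.0    # constants of (8.76), mu0 normalized to 1

def a(r):                        # a(r) from (8.76)
    return alpha + (1.0 - alpha) * r**8 / (r**8 + beta)

def c(r):                        # c(r) = a(r) + a'(r) r, cf. (8.69)
    return a(r) + r * (1.0 - alpha) * 8.0 * r**7 * beta / (r**8 + beta)**2

def decompose(kappa, r_max=20.0, n=400000):
    """Greedy grid version of the recursion (8.72): sweep r upwards and start
    a new interval J_i whenever sup_{J_i} c / inf_{J_i} a would exceed kappa.
    Since a is increasing, inf a is attained at the left endpoint."""
    r = np.linspace(0.0, r_max, n)
    cv = c(r)
    breaks, i0, sup_c = [0.0], 0, cv[0]
    for i in range(1, n):
        sup_c = max(sup_c, cv[i])
        if sup_c > kappa * a(r[i0]):
            breaks.append(r[i]); i0, sup_c = i, cv[i]
    return breaks

bp = decompose(kappa=3.0)
print(len(bp), "intervals J_i; breakpoints:", np.round(bp, 2))
```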
8.2.8 Diagonal coefficient preconditioners
(a) The preconditioning operator
In this subsection we discuss preconditioning operators with diagonal (i.e. scalar-valued) coefficients. These operators are a kind of generalization of the ones in the previous subsection. We study again the scalar nonlinearity (8.66)
f(x, η) = a(|η|)η
with some real C^1 function a : R^+ → R^+ satisfying the ellipticity condition (8.67), and consider again Dirichlet problems corresponding to the operator
T(u) = −div(a(|∇u|)∇u).
Similarly to the previous subsection, the suggested preconditioner will be described in the sequel in the variable setting, as a preconditioner for the weak differential operator F′(un) (see (8.54)).
The preconditioning operator is defined as follows. Let, as before,
c(r) := a(r) + a′(r)r   (r ≥ 0).
We choose a real function d : R^+ → R^+ such that
a(r) ≤ d(r) ≤ c(r)   (r ≥ 0).   (8.77)
Then
⟨Bn z, v⟩ = ∫Ω d(|∇un|)∇z · ∇v   (z, v ∈ H^1_0(Ω)).   (8.78)
This preconditioning operator corresponds to the diagonal coefficient matrix Gn(x) ≡ d(|∇un(x)|)·I, where I is the identity matrix. This is the motivation for this kind of preconditioner, which simplifies the operator, since the Jacobian of f(x, η) = a(|η|)η with respect to η is not diagonal:
∂f/∂η(x, ∇un) = a(|∇un|) I + (a′(|∇un|)/|∇un|)(∇un · ∇un^t),
where (∇un · ∇un^t) is the dyadic product matrix of ∇un (see e.g. Theorem 6.6). The diagonal factor also appears in the discretized form (8.79). The strong form of the preconditioning operator is S^(n)z = −div(d(|∇un|)∇z).

(b) Construction of the preconditioning matrix
The corresponding preconditioner (Bn)h for the discretized problem (8.8) is obtained by applying the same discretization for the operator Bn as the one applied to the boundary value problem in (8.8). In the FEM realization for Dirichlet problems, the auxiliary equations are of the type
find z ∈ Vh :   ∫Ω d(|∇un|)∇z · ∇v = ∫Ω r v   (v ∈ Vh)
with some finite-dimensional FEM subspace Vh ⊂ H^1_0(Ω). That is, the preconditioning matrix is given by
((Bn)h)i,j = ∫Ω d(|∇un|)∇vi · ∇vj   (i, j = 1, ..., k),
where v1, ..., vk is a basis of Vh. Using Theorem 8.1, the matrix (Bn)h has the product form
(Bn)h = Z Wn Z^t,   (8.79)
where the matrices Z and Z^t correspond to the discretization of −div and ∇, respectively, i.e. they are independent of n and hence need not be updated. Since the operator Bn has the scalar-valued coefficient d(|∇un|), the weight matrix Wn is diagonal.
The solution of the linear algebraic systems containing the preconditioner (Bn)h relies on the solution methods cited in subsection 3.3.2.

(c) Conditioning

We will determine the condition number of the operator Bn^{−1}F′(un) in H^1_0(Ω) based on Theorem 7.9 in section 7.1. Then the conditioning of the discretized operators will follow from Theorem 8.2.
First, (8.68) implies that
( min_{r≥0} a(r)/d(r) ) ⟨Bn v, v⟩ ≤ ∫Ω a(|∇un|)|∇v|^2 ≤ ⟨F′(un)v, v⟩ ≤ ∫Ω c(|∇un|)|∇v|^2 ≤ ( max_{r≥0} c(r)/d(r) ) ⟨Bn v, v⟩
for all v ∈ H^1_0(Ω). Hence we obtain the estimate
cond(Bn^{−1}F′(un)) ≤ max_{r≥0} c(r)/d(r) · max_{r≥0} d(r)/a(r).   (8.80)
Owing to Theorem 8.2 and Remark 8.1, the discretized operators in a FEM subspace Vh ⊂ H^1_0(Ω) inherit the above estimate:
cond((Bn)h^{−1} Fh′(un)) ≤ max_{r≥0} c(r)/d(r) · max_{r≥0} d(r)/a(r),   (8.81)
which is independent of the subspace Vh.
Two possible choices of the function d are as follows. First, let us observe that
max_{r≥0} c(r)/d(r) · max_{r≥0} d(r)/a(r) ≥ max_{r≥0} c(r)/a(r)
and the latter is achieved when
d(r) = (a(r)c(r))^{1/2}.   (8.82)
Hence this d is the optimal choice concerning the possible condition numbers. Second, a simpler structure of the preconditioner is obtained if d is chosen to be piecewise constant such that d(|∇un|) is the piecewise constant weight function defined in the previous subsection. This shows that the discussed class of diagonal coefficient preconditioners gives a kind of generalization of the domain decomposition preconditioners. We note that the preconditioner using (8.82) can be regarded as a limiting case of the domain decomposition preconditioners as the decomposition is refined. The achieved condition number is thus smaller than by any domain decomposition. (Conversely, the domain decomposition preconditioners can be considered as approximations of the preconditioner corresponding to (8.82), using some coarser numerical integration for d(|∇un|).)
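The bound (8.80) for different choices of d is easy to compare numerically. In the Python sketch below the coefficient a(r) is our own toy choice; it contrasts the optimal d = (ac)^{1/2} of (8.82) with a single constant d (the crudest piecewise constant choice).

```python
import numpy as np

r = np.linspace(0.0, 50.0, 200001)
a = 1.0 + r**2 / (1.0 + r**2)           # increasing sample coefficient
c = np.gradient(r * a, r)               # c = (r a(r))', cf. (8.69)

choices = {"constant d": np.full_like(r, np.sqrt(a.min() * c.max())),
           "d = sqrt(a c)": np.sqrt(a * c)}
for name, d in choices.items():
    bound = (c / d).max() * (d / a).max()   # the bound (8.80)
    print(name, "bound:", round(bound, 4))
```

For this sample, the optimal choice reaches the lower bound max c/a (roughly 1.3), while the constant weight only achieves roughly 2.1.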
8.2.9 Double Sobolev gradient preconditioner
(a) The preconditioning operator

We consider Dirichlet problems with the operator
T(u) = −div(a(x, u, ∇u)∇u).
Following subsection 7.4.3, we impose the conditions assumed for (7.191); further, we introduce the notation
L̂_(c) u := −div(c(x)∇u)  with domain D(L̂_(c)) = H^2(Ω) ∩ H^1_0(Ω)   (8.83)
for any given function c ∈ W^{1,∞}(Ω). The preconditioning operator will be defined in the nth step of a frozen coefficient iteration, i.e. as a preconditioner for the operator L̂_(a), where the notation
a(x) = a(x, un, ∇un)
is used. The preconditioning operator S is defined based on subsection 7.4.3, using a double application of a Laplacian preconditioner:
S = (−∆) L̂_(1/a)^{−1} (−∆),
defined in this strong form on H^2(Ω) ∩ H^1_0(Ω). Unlike in the preceding subsections, S is not of the form (8.21). For certain arguments u, S^{−1} acts as the inverse of L̂_(a) (see Proposition 7.1).

(b) Construction of the preconditioning matrix

First, using Theorem 8.1, the FEM discretization of the operator L̂_(a) yields the factorized stiffness matrix
(L̂_(a))h = Z An Z^t,
where the matrices Z and Z^t correspond to the discretization of −div and ∇, respectively; further, An is the weight matrix corresponding to the function a. Then the corresponding preconditioning matrix is
Sh = ZZ^t (Z An^{−1} Z^t)^{−1} ZZ^t,   (8.84)
where ZZ^t is the discrete Laplacian. Hereby Z and Z^t are independent of n and hence need not be updated during the iteration. This preconditioner has been applied by Axelsson–Gustafsson [17], Elman [103]. The proposed preconditioning is particularly effective when a fast Poisson solver is available on Ω. The solution of the linear algebraic systems containing the preconditioner Sh uses such fast solvers cited in subsection 3.3.3. (Some more details on these solvers will also be given in subsection 9.3.)

(c) Conditioning

The condition number of the preconditioned operator
C = S^{−1}L̂_(a) = (−∆)^{−1} L̂_(1/a) (−∆)^{−1} L̂_(a)   (8.85)
on the continuous level is given by Proposition 7.4. Namely, if γ = ‖a − 1‖∞ < 1/2, then
cond(C) ≤ 1/(1 − 2γ).   (8.86)
The preconditioned matrix
Ch = (ZZ^t)^{−1} Z An^{−1} Z^t (ZZ^t)^{−1} Z An Z^t
(the discretized operator corresponding to C) inherits this estimate. Namely, using the discrete H^1_0(Ω)-norm ⟨u, v⟩_{Z^t} := ⟨Z^t u, Z^t v⟩ (also for the corresponding operator norm), it follows in an analogous way to (8.86) that if γh = ‖An − I‖_{Z^t} < 1/2, then
cond(Ch) ≤ 1/(1 − 2γh),
where cond(Ch) = ‖Ch‖_{Z^t} ‖Ch^{−1}‖_{Z^t}.
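A practical point behind (8.84) is that Sh never has to be formed: by (8.84), applying Sh^{−1} amounts to two Poisson solves and one weighted product. The dense 1D Python sketch below (our own illustration; the coefficient a, the mesh and the quadrature convention are sample assumptions) shows this.

```python
import numpy as np

# S_h^{-1} = (Z Z^t)^{-1} (Z A_n^{-1} Z^t) (Z Z^t)^{-1}: two Poisson solves
# plus one product.  1D construction of Z, Z^t as in earlier sketches.
n = 64; h = 1.0 / n
Zt = np.zeros((n, n - 1))
for e in range(n):
    if e < n - 1: Zt[e, e] = 1.0 / h
    if e >= 1:    Zt[e, e - 1] = -1.0 / h
Z = Zt.T
a_cells = 1.0 + 0.3 * np.sin(np.pi * (np.arange(n) + 0.5) * h)  # sample a(x)
An = np.diag(a_cells * h)            # weight matrix of (L_(a))_h = Z An Z^t
P = Z @ (h * np.eye(n)) @ Zt         # discrete Laplacian (with quadrature weight h)

def apply_S_inv(rhs):
    """Poisson solve, weighted product, Poisson solve; in practice each
    solve with P would use a fast Poisson solver."""
    y = np.linalg.solve(P, rhs)
    return np.linalg.solve(P, Z @ np.diag(1.0 / (a_cells * h)) @ Zt @ y)

print(apply_S_inv(np.ones(n - 1))[:4])
```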
where $w$ is a piecewise constant weight function on $\Omega$, i.e. $w|_{\Omega_i} \equiv \mathrm{const.}$ on certain subdomains $\Omega_i$. Using $\hat L_{(w)}$ instead of $-\Delta$, the new preconditioner in strong form is
\[ C = \hat L_{(w)}^{-1}\,\hat L_{(w^2/a)}\,\hat L_{(w)}^{-1}\,\hat L_{(a)}. \qquad (8.87) \]
The corresponding modified correction term $z_n$ in the iteration (7.198)–(7.199) is obtained by using the weight function $w$ in the weak form (7.202)–(7.203), i.e. now $z_n$ is defined in two steps by
\[ \int_\Omega w\,\nabla v_n\cdot\nabla\varphi = \int_\Omega \big( a\,\nabla u_n\cdot\nabla\varphi - f\varphi \big) \qquad (\varphi\in H_0^1(\Omega)), \qquad (8.88) \]
\[ \int_\Omega w\,\nabla z_n\cdot\nabla\varphi = \int_\Omega \frac{w^2}{a}\,\nabla v_n\cdot\nabla\varphi \qquad (\varphi\in H_0^1(\Omega)). \qquad (8.89) \]
In other words, the original variational problems in (7.202)–(7.203) are now posed using the equivalent norm
\[ \|u\|_w^2 := \int_\Omega w\,|\nabla u|^2 \]
in $H_0^1(\Omega)$.
In order to define the weight function $w$, the domain $\Omega$ is decomposed into parts $\Omega_i$ ($i = 1,\dots,k$) such that on each $\Omega_i$ the oscillation of $a$ is suitably small: namely, introducing the notations
\[ M_i := \sup a|_{\Omega_i}, \qquad m_i := \inf a|_{\Omega_i} \qquad (i = 1,\dots,k), \qquad (8.90) \]
the decomposition is chosen such that
\[ M_i < 3 m_i \qquad (i = 1,\dots,k). \qquad (8.91) \]
Then we define $w$ by
\[ w|_{\Omega_i} \equiv c_i := \tfrac12\,(M_i + m_i) \qquad (i = 1,\dots,k). \qquad (8.92) \]
Then the assumptions imply that
\[ \gamma := \Big\| 1 - \frac{a}{w} \Big\|_\infty < \frac12. \qquad (8.93) \]
Namely, (8.91) and (8.92) yield
\[ \tfrac23\, M_i < c_i < 2 m_i, \qquad (8.94) \]
hence on $\Omega_i$ there holds
\[ -\frac12 < 1 - \frac{M_i}{c_i} \le 1 - \frac{a}{c_i} \le 1 - \frac{m_i}{c_i} < \frac12, \]
that is,
\[ \Big| 1 - \frac{a}{w} \Big| = \Big| 1 - \frac{a}{c_i} \Big| < \frac12 \quad\text{on } \Omega_i. \]
Using (8.93), it is proved analogously to Propositions 7.3 and 7.4 (just by inserting the weight $w$) that $\|I - C\|_w \le \frac{\gamma}{1-\gamma}$, and consequently
\[ \mathrm{cond}(C) \le \frac{1}{1-2\gamma}, \]
where $\mathrm{cond}(C) = \|C\|_w\,\|C^{-1}\|_w$, without any assumption like $\|a-1\|_\infty < \frac12$.

The corresponding preconditioning matrix is the modification of (8.84) such that $ZZ^t$ is replaced by $Z W Z^t$, where $W$ is obtained from the discretization of the weight function $w$:
\[ S_h = Z W Z^t\,(Z A_n^{-1} Z^t)^{-1}\, Z W Z^t. \]
(The updating required for $W$ is not very costly since the value of $w$ on each subdomain $\Omega_i$ is constant.) The preconditioned matrix
\[ C_h = (Z W Z^t)^{-1}\, Z A_n^{-1} Z^t\,(Z W Z^t)^{-1}\, Z A_n Z^t, \]
corresponding to (8.87), inherits the estimate for $C$ similarly as before.

Finally we note that the suggested decomposition of $\Omega$ can be carried out in the following way. We define $[t_0, t_1] = R(a) = \{a(x) : x\in\Omega\}$. By assumption we have $[t_0, t_1]\subset(0,\infty)$. We choose $k\in\mathbb{N}$ satisfying $3^k > \frac{t_1}{t_0}$ and let $\varrho = \big(\frac{t_1}{t_0}\big)^{1/k}\ (< 3)$. We define the level sets
\[ \Omega_i = \{x\in\Omega : \varrho^{i-1} t_0 \le a(x) < \varrho^i t_0\} \quad (i = 1,\dots,k-1), \qquad \Omega_k = \{x\in\Omega : \varrho^{k-1} t_0 \le a(x) \le \varrho^k t_0\}. \]
Then $\Omega = \Omega_1\cup\dots\cup\Omega_k$ and for all $i$
\[ \frac{M_i}{m_i} = \frac{\sup a|_{\Omega_i}}{\inf a|_{\Omega_i}} \le \varrho < 3, \]
i.e. (8.91) holds.
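To make this construction concrete, the following minimal sketch (our illustration, not part of the original text; the argument names are hypothetical) builds the level sets and the piecewise constant weights $c_i = \frac12(M_i + m_i)$ from nodal values of $a$, and applies $S_h^{-1}$ for the modified (8.84) via two weighted-Laplacian solves and one sparse product. The diagonal weight matrices $W$ and $A_n$ are represented by vectors.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def level_set_weights(a_vals):
    """Weights c_i = (M_i + m_i)/2 on the level sets
    Omega_i = {rho^(i-1) t0 <= a < rho^i t0}, with rho = (t1/t0)^(1/k) < 3."""
    t0, t1 = float(a_vals.min()), float(a_vals.max())
    if t1 <= t0 * (1.0 + 1e-12):            # (numerically) constant coefficient
        return np.full_like(a_vals, 0.5 * (t0 + t1))
    k = int(np.ceil(np.log(t1 / t0) / np.log(3.0))) + 1   # ensures 3^k > t1/t0
    rho = (t1 / t0) ** (1.0 / k)
    idx = np.minimum((np.log(a_vals / t0) / np.log(rho)).astype(int), k - 1)
    w = np.empty_like(a_vals)
    for i in range(k):
        mask = (idx == i)
        if mask.any():                       # M_i, m_i of (8.90) on this subset
            w[mask] = 0.5 * (a_vals[mask].max() + a_vals[mask].min())
    return w

def apply_S_inv(Z, w_diag, an_diag, r):
    """S_h^{-1} r for the modified (8.84):
    S_h^{-1} = (Z W Z^t)^{-1} (Z A_n^{-1} Z^t) (Z W Z^t)^{-1}."""
    Lw = (Z @ sp.diags(w_diag) @ Z.T).tocsc()        # weighted discrete Laplacian
    K = (Z @ sp.diags(1.0 / an_diag) @ Z.T).tocsr()  # Z A_n^{-1} Z^t
    y = spla.spsolve(Lw, r)
    return spla.spsolve(Lw, K @ y)
```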
8.2.10 Symmetric part preconditioners
(a) The preconditioning operators

Based on subsection 7.4.4, we now consider two preconditioners for inner Newton iterations for diffusion type problems
\[ -\mathrm{div}\,(a(x,u)\,\nabla u) = g(x) \quad\text{in } \Omega, \qquad u|_{\partial\Omega} = 0 \]
(see (6.74)). Denoting by $F$ the weak form of the differential operator (see (7.213)), the linearized operator $F'(u_n) : H_0^1(\Omega)\to H_0^1(\Omega)$ is given by
\[ \langle F'(u_n)v, z\rangle_{H_0^1} = \int_\Omega \big( a(x,u_n)\,\nabla v\cdot\nabla z + v\,(b(u_n)\cdot\nabla z) \big) \qquad (v,z\in H_0^1(\Omega)), \qquad (8.95) \]
using the notation $b(u_n) = a_u(x,u_n)\,\nabla u_n$.
The operator $F'(u_n)$ is not self-adjoint. Two kinds of self-adjoint preconditioning operators are proposed for the inner iteration. The first one is the symmetric part of $F'(u_n)$: namely, let
\[ B_n := \tfrac12\,\big( F'(u_n) + F'(u_n)^* \big), \]
which satisfies
\[ \langle B_n v, z\rangle_{H_0^1} = \int_\Omega \Big( a(x,u_n)\,\nabla v\cdot\nabla z - \tfrac12\,(\mathrm{div}\,b(u_n))\,v z \Big) \qquad (v,z\in H_0^1(\Omega)). \qquad (8.96) \]
Second, the operator $B_n$ can be simplified by neglecting the zeroth-order term, i.e., we define
\[ \langle C_n v, z\rangle_{H_0^1} = \int_\Omega a(x,u_n)\,\nabla v\cdot\nabla z \qquad (v,z\in H_0^1(\Omega)). \qquad (8.97) \]
These operators are discussed in subsection 7.4.4. In particular, whereas the operator $C_n$ is simpler, the preconditioner $B_n$ implies that the full version of the CGM coincides with the truncated one, which requires only a single search direction. We note that since $F'(u_n)$ is not self-adjoint, this kind of preconditioning does not rely on spectral equivalence, in contrast to the general approach of this chapter. Instead, a direct estimate from Proposition 7.7 will be used for the conditioning.

(b) Construction of the preconditioning matrices

The corresponding preconditioners $(B_n)_h$ and $(C_n)_h$ for the discretized linearized equation are obtained by applying the same discretization to the operators $B_n$ or $C_n$ as the one applied to the boundary value problem. Using the operator $B_n$ for the FEM realization for Dirichlet problems, the auxiliary equations are of the type
\[ \text{find } z\in V_h : \quad \int_\Omega \Big( a(x,u_n)\,\nabla z\cdot\nabla v - \tfrac12\,(\mathrm{div}\,b(u_n))\,z v \Big) = \int_\Omega r v \qquad (v\in V_h) \qquad (8.98) \]
with some finite-dimensional FEM subspace $V_h\subset H_0^1(\Omega)$. That is, the preconditioning matrix is given by
\[ ((B_n)_h)_{i,j} = \int_\Omega \Big( a(x,u_n)\,\nabla v_i\cdot\nabla v_j - \tfrac12\,(\mathrm{div}\,b(u_n))\,v_i v_j \Big) \qquad (i,j = 1,\dots,k), \]
where $v_1,\dots,v_k$ is a basis of $V_h$. The above formulas are similar for $C_n$, neglecting the zeroth-order term. Since now the matrix $(C_n)_h$ is given by
\[ ((C_n)_h)_{i,j} = \int_\Omega a(x,u_n)\,\nabla v_i\cdot\nabla v_j \qquad (i,j = 1,\dots,k), \]
in this case one can use Theorem 8.1 to obtain the factorization
\[ (C_n)_h = Z W Z^t, \]
where the matrices $Z$ and $Z^t$ correspond to the discretization of $-\mathrm{div}$ and $\nabla$, respectively, and $W$ is a diagonal weight matrix. The solution of the linear algebraic systems containing the preconditioners $(B_n)_h$ and $(C_n)_h$ relies on the solution methods cited in subsection 3.3.2.

(c) Conditioning

The condition numbers of the operators $B_n^{-1}F'(u_n)$ and $C_n^{-1}F'(u_n)$ on the continuous level are determined in subsection 7.4.4, and are inherited by the discretized operators. The operator $B_n^{-1}F'(u_n)$ satisfies
\[ \mathrm{cond}\big( B_n^{-1}F'(u_n) \big) \le 1 + L \]
due to Proposition 7.7, where
\[ L = \frac{K_4}{\alpha}\,\|a_u(x,u_n)\|_{L^\infty}\,\|\nabla u_n\|_{L^4}, \]
and $K_4 > 0$ is the Sobolev embedding constant in (7.227). (See also (11.9)–(11.10).) Now let $V_h\subset H_0^1(\Omega)$ be a FEM subspace. Then Propositions 7.6 and 7.7 can be repeated in $V_h$ in place of $H_0^1(\Omega)$, hence the discretized operators in $V_h\subset H_0^1(\Omega)$ inherit the above estimate:
\[ \mathrm{cond}\big( (B_n)_h^{-1} F'_h(u_n) \big) \le 1 + L, \]
which is independent of the subspace $V_h$.

As mentioned in subsection 7.4.4, paragraph (b), the same estimate holds if $B_n$ is replaced by $C_n$.
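Since $F'(u_n)$ is nonsymmetric while $(C_n)_h$ is symmetric positive definite, a natural discrete realization of the inner iteration is a preconditioned Krylov solver. The sketch below is ours: it uses standard preconditioned GMRES from SciPy rather than the specific CG variant discussed above, and the names `Jh`, `Cn_h` are assumptions standing for the discretized $F'(u_n)$ and $C_n$.

```python
import scipy.sparse.linalg as spla

def inner_solve_with_Cn(Jh, Cn_h, rhs):
    """Solve the nonsymmetric linearized system Jh p = rhs, with the
    symmetric matrix Cn_h as preconditioner; a sparse LU factorization
    of Cn_h supplies the action of Cn_h^{-1}."""
    lu = spla.splu(Cn_h.tocsc())
    M = spla.LinearOperator(Jh.shape, matvec=lu.solve)
    p, info = spla.gmres(Jh, rhs, M=M)
    if info != 0:
        raise RuntimeError("GMRES did not converge (info=%d)" % info)
    return p
```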
8.2.11 Incorporating boundary conditions
(a) The preconditioning operator

In the preceding subsections we have been focusing on Dirichlet boundary conditions. Now we consider second order problems with 3rd type boundary conditions
\[
\begin{cases}
\ T(u) \equiv -\mathrm{div}\, f(x,\nabla u) + q(x,u) = g(x) & \text{in } \Omega\\
\ Q(u) \equiv f(x,\nabla u)\cdot\nu + s(x,u) = \gamma(x) & \text{on } \Gamma_N\\
\ u = 0 & \text{on } \Gamma_D
\end{cases} \qquad (8.99)
\]
with the assumptions on (7.35) given in subsection 7.1.2. We note that these conditions ensure well-posedness, hence the non-injective case of Neumann problems requires distinct treatment, which will be given in the next subsection.

We recall that the Sobolev space $H_D^1(\Omega)$ corresponding to the Dirichlet boundary is defined as
\[ H_D^1(\Omega) := \{u\in H^1(\Omega) : u|_{\Gamma_D} = 0\}. \]
The suitable class of preconditioning operators is now studied generally, without specifying the coefficients of the preconditioning operator, but rather with emphasis on the way of incorporating the boundary conditions.

First we consider simple iterations. Relying on Remark 7.6, the strong form of the preconditioning operator is defined as a pair of operators $\binom{S}{R}$, where
\[ S u \equiv -\mathrm{div}\,(G(x)\,\nabla u) \quad (x\in\Omega), \qquad R u \equiv \partial_{G(x)\cdot\nu} u + \beta(x)\,u \quad (x\in\Gamma_N) \qquad (8.100) \]
for $u\in H^2(\Omega)\cap H_D^1(\Omega)$. Here, following Theorem 7.3, $G(x)$ is defined as in (7.39) to be uniformly spectrally equivalent to all Jacobians $\frac{\partial f}{\partial\eta}(x,\eta)$, and $\beta(x)$ satisfies
\[ \beta(x) \le \frac{1}{m}\,\inf_{\xi\in\mathbb{R}} \partial_\xi s(x,\xi) \qquad (x\in\Gamma_N), \]
where $m$ is from (7.39). Further, $\partial_{G(x)\cdot\nu} u = G(x)\,\nu\cdot\nabla u$ denotes the conormal derivative of $u$ at $x$.

Remark 8.6 In the course of a Newton-like linearization, the matrix $G(x)$ and the function $\beta(x)$ in (8.100) are replaced by
\[ \frac{\partial f}{\partial\eta}(x,\nabla u_n) \qquad\text{and}\qquad \partial_\xi s(x,u_n), \]
i.e. by the Jacobian and the derivative of the boundary coefficient at the $n$th iterate $u_n$, respectively. (The obtained pair of operators is the strong form of the derivative operator $F'(u_n)$.) Further, when preconditioning is required in an inner iteration for $F'(u_n)$, the preconditioning operator in strong form is obtained by setting suitable $G_n(x)$ and $\beta_n(x)$ in place of $G(x)$ and $\beta(x)$ in (8.100), respectively. In this case the definition of these coefficients uses an appropriate modification of the ideas given after (8.100): namely, we only require that $G_n$ is equivalent to $\frac{\partial f}{\partial\eta}(x,\nabla u_n)$ and that $\beta_n(x)$ satisfies
\[ \beta_n(x) \le \frac{1}{m}\,\partial_\xi s(x,u_n) \qquad (x\in\Gamma_N). \]

(b) Construction of the preconditioning matrix

The corresponding preconditioner for the discretized problem is obtained by applying the same discretization to the preconditioning operator as the one applied to the boundary value problem. The important point arising with 3rd type boundary conditions is that the representation as a pair of operators has to be taken into account in the discretization.

Let us first consider simple iterations. We write (8.99) as
\[ \binom{T}{Q}(u) = \binom{g}{\gamma}, \]
working in the space $H_D^1(\Omega)$ (i.e. the Dirichlet condition is understood on $\Gamma_D$). Then the discretized problem can be considered in the form
\[ \binom{T}{Q}_h (u_h) = \binom{g}{\gamma}_h \qquad (8.101) \]
in a discretization subspace $V_h\subset H_D^1(\Omega)$, and our preconditioned iteration $(u_h^n)$ in $V_h$ can be written, using (7.54), as
\[ u_h^{n+1} = u_h^n - \frac{2}{M_0 + m}\,\binom{S}{R}_h^{-1} \left[ \binom{T}{Q}_h (u_h^n) - \binom{g}{\gamma}_h \right]. \qquad (8.102) \]
Hence the preconditioning matrix is $\binom{S}{R}_h$. This is the projection of the pair of operators $\binom{S}{R}$ into $V_h$, and it reflects that the preconditioner contains terms both from $\Omega$ and $\partial\Omega$.
In the FEM realization the auxiliary equations are of the following type: find $z\in V_h$ such that
\[ \int_\Omega G(x)\,\nabla z\cdot\nabla v + \int_{\Gamma_N} \beta(x)\,z v\,d\sigma = \int_\Omega r v + \int_{\Gamma_N} \rho v\,d\sigma \qquad (v\in V_h) \qquad (8.103) \]
with some finite-dimensional FEM subspace $V_h\subset H_D^1(\Omega)$. That is, the preconditioning matrix is given by
\[ \left( \binom{S}{R}_h \right)_{i,j} = \int_\Omega G(x)\,\nabla v_i\cdot\nabla v_j + \int_{\Gamma_N} \beta(x)\,v_i v_j\,d\sigma \qquad (i,j = 1,\dots,k), \qquad (8.104) \]
where $v_1,\dots,v_k$ is a basis of $V_h$. In the case of variable preconditioning the construction is the same, replacing $G(x)$ and $\beta(x)$ by $G_n(x)$ and $\beta_n(x)$, respectively.

Note that the preconditioning matrix (8.104) reduces to the stiffness matrix
\[ \left( \binom{S}{R}_h \right)_{i,j} = \int_\Omega G(x)\,\nabla v_i\cdot\nabla v_j \qquad (i,j = 1,\dots,k) \qquad (8.105) \]
if $\beta = 0$, in particular in the case of mixed problems. Then Theorem 8.1 yields that the matrix (8.105) has the product form $S_h = Z W Z^t$, where the matrices $Z$ and $Z^t$ correspond to the discretization of $-\mathrm{div}$ and $\nabla$, respectively, and $W$ is the weight matrix corresponding to $G(x)$.
The solution of the linear algebraic systems containing the preconditioner (8.104) relies on the solution methods cited in subsection 3.3.2.
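For orientation, here is a minimal sketch of the assembly (8.104) in one dimension (our illustration under simplifying assumptions: $\Omega = (0,1)$, $\Gamma_D = \{0\}$, $\Gamma_N = \{1\}$, P1 elements, scalar coefficient $G$ and constant $\beta$; then the boundary integral reduces to a single term at $x = 1$).

```python
import numpy as np

def robin_preconditioner_matrix(nodes, G, beta):
    """1D analogue of (8.104): stiffness matrix for -(G u')' with the
    Dirichlet node x=0 eliminated and the Robin contribution
    beta * v_i(1) * v_j(1) added at the Neumann endpoint x=1."""
    n = len(nodes) - 1                          # number of elements
    A = np.zeros((n, n))                        # unknowns at nodes[1:]
    for e in range(n):
        h = nodes[e + 1] - nodes[e]
        xm = 0.5 * (nodes[e] + nodes[e + 1])    # midpoint quadrature for G
        k = G(xm) / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
        dofs = [e - 1, e]                       # -1 marks the Dirichlet node
        for a_ in range(2):
            for b_ in range(2):
                if dofs[a_] >= 0 and dofs[b_] >= 0:
                    A[dofs[a_], dofs[b_]] += k[a_, b_]
    A[-1, -1] += beta                           # Robin term at x = 1
    return A
```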
(c) Conditioning

The condition number of the operator
\[ \binom{S}{R}^{-1}\binom{T}{Q} \]
can be determined on the continuous level based on Theorem 7.3 and Remark 7.6. Then the conditioning of the discretized operators
\[ \binom{S}{R}_h^{-1}\binom{T}{Q}_h \qquad (8.106) \]
follows from Theorem 8.2. Namely, by (7.55) we have
\[ \mathrm{cond}\left( \binom{S}{R}^{-1}\binom{T}{Q} \right) \le \frac{M_0}{m} \]
with $m$ introduced in (7.39) and $M_0$ defined in (7.45). (Since in the present study the coefficients $G(x)$ and $\beta(x)$ of the preconditioner are not specified, there is no need for a more detailed quotation here.)

In virtue of Remark 8.2, the discretized operators (8.106) in a FEM subspace $V_h\subset H_D^1(\Omega)$ inherit the above estimate, which is independent of the subspace $V_h$. When preconditioning is done in an inner iteration used for an outer Newton step, one can obtain an estimate similar to (7.114) for 3rd type boundary conditions, which can be seen to hold for the corresponding pairs of operators in the same way as above, using Remarks 8.1 and 8.6. (The details are left to the reader.)
8.2.12 Non-injective problems
(a) The preconditioning operator

Let us consider Neumann problems of the form
\[
\begin{cases}
\ T(u) \equiv -\mathrm{div}\, f(x,\nabla u) = g(x)\\
\ f(x,\nabla u)\cdot\nu\,|_{\partial\Omega} = 0
\end{cases} \qquad (8.107)
\]
with $f$ satisfying the usual ellipticity condition (8.20). Then the operator $T$ is non-injective, and the study of preconditioning relies on the corresponding result of Theorem 7.5 in subsection 7.1.2.

The suitable class of preconditioning operators is studied generally (without specifying the coefficient of the preconditioner), with emphasis on avoiding non-injectivity. For this we use the factorization as in Theorem 7.5. Recall the subspaces
\[ H_0 := \{u\in H^1(\Omega) : u(x) \equiv \mathrm{const.}\ \text{on } \Omega\}, \qquad H_0^\perp = \Big\{ u\in H^1(\Omega) : \int_\Omega u\,dx = 0 \Big\}. \]
Choosing some symmetric matrix-valued function $G\in L^\infty(\Omega,\mathbb{R}^{N\times N})$ satisfying (7.4), the preconditioning operator is
\[ S u = -\mathrm{div}(G(x)\nabla u) \qquad (u\in H^2(\Omega)\cap H_0^\perp). \]
In other words, if the operator $L u = -\mathrm{div}(G(x)\nabla u)$ $(u\in H^2(\Omega))$ is equivalent to $T(u) = -\mathrm{div}\, f(x,\nabla u)$ in the sense of (7.4), then $S = L|_{H_0^\perp}$.
(b) Construction of the preconditioning matrix

The corresponding preconditioner $S_h$ for the discretized problem is obtained by applying the same discretization to the operator $S$ as the one applied to the boundary value problem (8.107). The main point is that the discretization subspace $V_h$ must satisfy $V_h\subset H_0^\perp$. In the FEM realization the auxiliary equations are of the type
\[ \text{find } z\in V_h : \quad \int_\Omega G(x)\,\nabla z\cdot\nabla v = \int_\Omega r v \qquad (v\in V_h) \]
with some finite-dimensional FEM subspace $V_h\subset H_0^\perp$. That is, the function $z$ must satisfy the zero-mean integral condition. The preconditioning matrix is given by
\[ (S_h)_{i,j} = \int_\Omega G(x)\,\nabla v_i\cdot\nabla v_j \qquad (i,j = 1,\dots,k), \]
where $v_1,\dots,v_k$ is a basis of $V_h$. Using Theorem 8.1, the matrix $S_h$ has the product form
\[ S_h = Z W Z^t, \]
where the matrices $Z$ and $Z^t$ correspond to the discretization of $-\mathrm{div}$ and $\nabla$, respectively, and $W$ is the weight matrix corresponding to $G(x)$. The solution of the linear algebraic systems containing the preconditioner $S_h$ relies on the solution methods cited in subsections 3.3.2–3.3.3, which cover Neumann problems as well.

(c) Conditioning

The condition number of the operator $S^{-1}T$ on the continuous level can be determined based on Theorem 7.5. Then the conditioning of the discretized operators follows from Theorem 8.2. Namely, since $G(x)$ is chosen to satisfy (7.4), i.e. the Jacobians $\frac{\partial f}{\partial\eta}(x,\eta)$ have uniform spectral bounds $m$ and $M$ w.r.t. $G(x)$, Theorem 7.5 yields
\[ \mathrm{cond}(S^{-1}T) \le \frac{M}{m}. \]
Owing to Theorem 8.2, the discretized operators in a FEM subspace $V_h\subset H_0^\perp$ inherit the above estimate:
\[ \mathrm{cond}(S_h^{-1}T_h) \le \frac{M}{m}, \]
which is independent of the subspace $V_h$.
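Algebraically, the zero-mean condition can be enforced by projecting the nodal coefficient vector onto the discrete counterpart of $H_0^\perp$ after each solve. A minimal sketch (ours), assuming a lumped mass vector `mass` with entries $\int_\Omega v_i$:

```python
import numpy as np

def project_zero_mean(u, mass):
    """Remove the weighted mean of the coefficient vector u, so that the
    represented FEM function sum_i u_i v_i has zero integral over Omega
    (mass[i] ~ integral of the basis function v_i)."""
    return u - (mass @ u) / mass.sum() * np.ones_like(u)
```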
8.2.13 Discrete biharmonic preconditioner
(a) The preconditioning operator

In this subsection and the following one we turn to 4th order Dirichlet problems. We consider problems of the form defined in (7.71) with a matrix-valued function $A$:
\[ T(u) \equiv \mathrm{div}^2 A(x, D^2 u) = g(x), \qquad u|_{\partial\Omega} = \frac{\partial u}{\partial\nu}\Big|_{\partial\Omega} = 0. \qquad (8.108) \]
The simplest preconditioning operator for (8.108) is $S = \Delta^2$. This corresponds to the identity coefficient array
\[ G(x) \equiv I \qquad (8.109) \]
in the general operator
\[ S u \equiv \mathrm{div}^2 (G(x) D^2 u) \]
in (7.74). That is, $G\in\mathbb{R}^{(N\times N)^2}$ now satisfies
\[ G_{i,j,k,l} = \delta_{(i,j),(k,l)} \qquad (i,j,k,l = 1,\dots,N). \]
(b) Construction of the preconditioning matrix

The corresponding preconditioner $\Delta_h^2$ for the discretized problem is obtained by applying the same discretization to the operator $\Delta^2$ as the one applied to the boundary value problem (8.108). The boundary conditions now require that the discretization subspace $V_h$ satisfies $V_h\subset H_0^2(\Omega)$. In the FEM realization the auxiliary equations are of the type
\[ \text{find } z\in V_h : \quad \int_\Omega D^2 z\cdot D^2 v = \int_\Omega r v \qquad (v\in V_h) \]
with some finite-dimensional FEM subspace $V_h\subset H_0^2(\Omega)$. That is, the preconditioning matrix is given by
\[ (\Delta_h^2)_{i,j} = \int_\Omega D^2 v_i\cdot D^2 v_j \qquad (i,j = 1,\dots,k), \]
where $v_1,\dots,v_k$ is a basis of $V_h$. The solution of the linear algebraic systems containing the preconditioner $\Delta_h^2$ relies on the fast biharmonic solvers cited in subsection 3.3.3.

(c) Conditioning
We determine the condition number of the operator $S^{-1}T = \Delta^{-2}T$ on the continuous level based on Theorem 7.6 in section 7.1. Then the conditioning of the discretized operators follows from Theorem 8.2. (The related estimates in an inner iteration for Newton-like methods can be carried out similarly as mentioned in Remark 8.5; see also Remark 8.3.)

Namely, by Assumptions 7.6 and using (8.109), we obtain that $m = \mu_1$ and $M = \mu_2$ can be chosen in (7.72) and (7.73). In this way (7.72) and (7.73) coincide and can be rewritten as
\[ \mu_1\,\|\Phi\|^2 \le \Big\langle \frac{\partial A(x,\Theta)}{\partial\Theta}\,\Phi,\ \Phi \Big\rangle \le \mu_2\,\|\Phi\|^2 \qquad ((x,\Theta)\in\Omega\times\mathbb{R}^{N\times N},\ \Phi\in\mathbb{R}^{N\times N}). \]
Then Theorem 7.6 yields
\[ \mathrm{cond}(\Delta^{-2}T) \le \frac{\mu_2}{\mu_1}. \]
Owing to Theorem 8.2, the discretized operators in a FEM subspace $V_h\subset H_0^2(\Omega)$ inherit the above estimate:
\[ \mathrm{cond}(\Delta_h^{-2}T_h) \le \frac{\mu_2}{\mu_1}, \]
which is independent of the subspace $V_h$. Finally we note that if lower order terms are also included in the operator $T$, then modified estimates can be obtained analogously to (8.33)–(8.34), see [172].
8.2.14 Double diagonal coefficient preconditioners
(a) The preconditioning operator

We consider a special case of the 4th order operator (8.108) with scalar nonlinearity, introduced in Remark 6.6. That is, let $T$ be an operator of the form
\[ T(u) \equiv \mathrm{div}^2\big( a(E(D^2 u))\,\tilde D^2 u \big), \qquad (8.110) \]
where the following notations are used: $D^2 u$ is the Hessian of $u$,
\[ \tilde D^2 u = \tfrac12\big( D^2 u + \Delta u\cdot I \big), \]
where $I$ is the identity matrix, and
\[ E(D^2 u) = \tfrac12\big( |D^2 u|^2 + (\Delta u)^2 \big). \qquad (8.111) \]
(Here $|D^2 u|^2 = \sum_{i,j=1}^N (\partial_i\partial_j u)^2$ as before.) Further, the $C^1$ function $a : \mathbb{R}^+\to\mathbb{R}^+$ satisfies the ellipticity condition
\[ 0 < \lambda \le a(r) \le \Lambda, \qquad 0 < \lambda \le (a(r^2)r)' \le \Lambda \]
with suitable constants $\lambda, \Lambda$ independent of the variable $r > 0$.
Then (8.110) is a special case of the operator in Theorem 7.6 (see Remark 6.6). This kind of operator arises e.g. in elasto-plasticity (see section 1.4). The operator $T(u)$ can be rewritten as the sum
\[ T(u) = \tfrac12\,\mathrm{div}^2\big( a(E(D^2 u))\,D^2 u \big) + \tfrac12\,\Delta\big( a(E(D^2 u))\,\Delta u \big). \]
In weak form: for any $u\in H_0^2(\Omega)$
\[ \langle F(u), v\rangle_{H_0^2} = \frac12\int_\Omega a(E(D^2 u))\,\big( D^2 u\cdot D^2 v + \Delta u\,\Delta v \big) \qquad (v\in H_0^2(\Omega)). \qquad (8.112) \]
We define variable preconditioning operators in the steps of an iteration as preconditioners for $F'(u_n)$. The construction is analogous to subsection 8.2.8. For this we first introduce the functions
\[ p(r^2) = \min\{a(r^2),\ (a(r^2)r)'\}, \qquad q(r^2) = \max\{a(r^2),\ (a(r^2)r)'\} \qquad (r\ge 0). \qquad (8.113) \]
The preconditioning operator is defined as follows. We choose a real function $s : \mathbb{R}^+\to\mathbb{R}^+$ such that
\[ p(r) \le s(r) \le q(r) \qquad (r\ge 0). \qquad (8.114) \]
Then we define the operator
\[ S^{(n)} z = \tfrac12\,\mathrm{div}^2\big( s(E(D^2 u_n))\,D^2 z \big) + \tfrac12\,\Delta\big( s(E(D^2 u_n))\,\Delta z \big). \qquad (8.115) \]
(The name 'double diagonal coefficient preconditioner' expresses that both terms in (8.115) have a diagonal coefficient.) The weak form of the preconditioning operator is given by
\[ \langle B_n z, v\rangle_{H_0^2} = \frac12\int_\Omega s(E(D^2 u_n))\,\big( D^2 z\cdot D^2 v + \Delta z\,\Delta v \big) \qquad (z,v\in H_0^2(\Omega)). \]
This operator corresponds to the coefficient array $G_n(x) \equiv s(E(D^2 u_n(x)))\cdot\tilde G$ in the general operator $S^{(n)} z \equiv \mathrm{div}^2(G_n(x) D^2 z)$, where the constant array $\tilde G\in\mathbb{R}^{(N\times N)^2}$ is defined by
\[
\tilde G_{i,i,i,i} = 1 \quad (i = 1,\dots,N); \qquad
\tilde G_{i,k,i,k} = 1/2 \quad (i,k = 1,\dots,N,\ i\ne k); \]
\[
\tilde G_{i,i,k,k} = 1/2 \quad (i,k = 1,\dots,N,\ i\ne k); \qquad
\tilde G_{i,j,k,l} = 0 \quad\text{otherwise}. \qquad (8.116)
\]

(b) Construction of the preconditioning matrix
The corresponding preconditioner $(B_n)_h$ for the discretized problem is obtained by applying the same discretization to the operator $B_n$ as the one applied to the boundary value problem. In the FEM realization the auxiliary equations are of the type
\[ \text{find } z\in V_h : \quad \frac12\int_\Omega s(E(D^2 u_n(x)))\,\big( D^2 z\cdot D^2 v + \Delta z\,\Delta v \big) = \int_\Omega r v \qquad (v\in V_h) \]
with some finite-dimensional FEM subspace $V_h\subset H_0^2(\Omega)$. That is, the preconditioning matrix is given by
\[ ((B_n)_h)_{i,j} = \frac12\int_\Omega s(E(D^2 u_n(x)))\,\big( D^2 v_i\cdot D^2 v_j + \Delta v_i\,\Delta v_j \big) \qquad (i,j = 1,\dots,k), \]
where $v_1,\dots,v_k$ is a basis of $V_h$. The solution of the linear algebraic systems containing the preconditioner $(B_n)_h$ relies on the solution methods cited in subsection 3.3.2.

(c) Conditioning

We will determine the condition number of the operator $B_n^{-1}F'(u_n)$ in $H_0^2(\Omega)$ on the continuous level similarly as for the second order diagonal coefficient preconditioner in subsection 8.2.8. Then the conditioning of the discretized operators will follow from Theorem 8.2.

In virtue of (8.112), the operator $F$ is of the class given in Remark 6.1 if one uses the bilinear mapping and corresponding quadratic expression
\[ [v,z] = \tfrac12\,\big( D^2 v\cdot D^2 z + \Delta v\,\Delta z \big), \qquad [v,v] = E(D^2 v). \]
Hence (6.14) holds, in which the intermediate inequalities now yield
\[ \int_\Omega p(E(D^2 u_n))\,E(D^2 v) \le \langle F'(u_n)v, v\rangle \le \int_\Omega q(E(D^2 u_n))\,E(D^2 v). \qquad (8.117) \]
Here (8.116) implies
\[ \langle B_n v, v\rangle_{H_0^2} = \int_\Omega s(E(D^2 u_n))\,E(D^2 v) \qquad (v\in H_0^2(\Omega)), \]
hence from (8.117) we obtain
\[ \Big( \min_{r\ge 0}\frac{p(r)}{s(r)} \Big)\,\langle B_n v, v\rangle \le \langle F'(u_n)v, v\rangle \le \Big( \max_{r\ge 0}\frac{q(r)}{s(r)} \Big)\,\langle B_n v, v\rangle \]
for all $v\in H_0^2(\Omega)$. That is, we have the estimate
\[ \mathrm{cond}\big( B_n^{-1}F'(u_n) \big) \le \max_{r\ge 0}\frac{q(r)}{s(r)}\,\max_{r\ge 0}\frac{s(r)}{p(r)}. \qquad (8.118) \]
Owing to Theorem 8.2 and Remark 8.1, the discretized operators in a FEM subspace $V_h\subset H_0^2(\Omega)$ inherit the above estimate:
\[ \mathrm{cond}\big( (B_n)_h^{-1}F'_h(u_n) \big) \le \max_{r\ge 0}\frac{q(r)}{s(r)}\,\max_{r\ge 0}\frac{s(r)}{p(r)}, \qquad (8.119) \]
which is independent of the subspace $V_h$.

We mention two possible choices of the function $s$, similarly to subsection 8.2.8. First, there holds
\[ \max_{r\ge 0}\frac{q(r)}{s(r)}\,\max_{r\ge 0}\frac{s(r)}{p(r)} \ge \max_{r\ge 0}\frac{q(r)}{p(r)}, \]
and the latter is achieved when $s(r) = (p(r)q(r))^{1/2}$. Hence this $s$ is the optimal choice concerning the possible condition numbers. Second, a simpler structure of the preconditioner is obtained if $s$ is chosen to be piecewise constant. Then $s(E(D^2 u_n))$ is a piecewise constant weight function in $H_0^2(\Omega)$, and this preconditioner is the analogue of the second order case described in subsection 8.2.7.
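A small numerical check of the bound (8.118)–(8.119) for the optimal choice $s = (pq)^{1/2}$ can be coded directly; the following sketch is ours, with a sample nonlinearity chosen only for illustration (it satisfies the ellipticity bounds with $\lambda = 15/8$, $\Lambda = 3$).

```python
import numpy as np

def condition_bound(a, da, r_grid):
    """Evaluate p, q of (8.113) on a grid and the bound (8.118) for the
    optimal s = sqrt(p q); a(t) and its derivative da(t) give the scalar
    nonlinearity as a function of t = r^2."""
    t = r_grid**2
    ar = a(t)                        # a(r^2)
    dar = a(t) + 2.0 * t * da(t)     # (a(r^2) r)' = a(r^2) + 2 r^2 a'(r^2)
    p, q = np.minimum(ar, dar), np.maximum(ar, dar)
    s = np.sqrt(p * q)               # optimal s(r) = (p(r) q(r))^(1/2)
    return np.max(q / s) * np.max(s / p)   # equals max(q/p) for this s

# sample: a(t) = 2 + 1/(1+t)
bound = condition_bound(lambda t: 2 + 1 / (1 + t),
                        lambda t: -1 / (1 + t)**2,
                        np.linspace(0.0, 10.0, 2001))
```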
8.2.15 Decoupled Laplacian preconditioners for systems
(a) The preconditioning operator

Decoupled Laplacian preconditioners have been introduced for linear elasticity systems by Axelsson [12], such that the action of the preconditioner involves a separate Laplacian-like system for each displacement component. Besides yielding smaller sized auxiliary problems than the original system, the efficient realization of this preconditioner also relies on the available fast Poisson solvers. In this subsection we sketch the analogous application of this preconditioner to systems of the form (7.56) in paragraph (b) of subsection 7.1.2.

For simplicity we restrict ourselves to Dirichlet boundary conditions, i.e. we consider the system
\[ T_i(u_1,\dots,u_r) \equiv -\mathrm{div}\, f_i(x,\nabla u_1,\dots,\nabla u_r) = g_i(x) \quad\text{in } \Omega, \qquad u_i = 0 \quad\text{on } \partial\Omega \qquad (i = 1,\dots,r) \qquad (8.120) \]
with a bounded domain $\Omega\subset\mathbb{R}^N$ and given functions $f_i : \Omega\times\mathbb{R}^{Nr}\to\mathbb{R}^N$ satisfying Assumptions 7.4. The corresponding nonlinear operator is in fact an $r$-tuple of operators
\[ T = \begin{pmatrix} T_1\\ \vdots\\ T_r \end{pmatrix} \]
defined on the domain $D(T) = \big( H^2(\Omega)\cap H_0^1(\Omega) \big)^r$.
Accordingly, the preconditioning operator is defined as the $r$-tuple of minus Laplace operators acting componentwise:
\[ S z = \begin{pmatrix} -\Delta z_1\\ \vdots\\ -\Delta z_r \end{pmatrix} \]
for any $z = (z_1,\dots,z_r)\in\big( H^2(\Omega)\cap H_0^1(\Omega) \big)^r$. This operator corresponds to the coefficient arrays
\[ G^{(ii)}(x) = I\in\mathbb{R}^{N\times N} \quad (i = 1,\dots,r), \qquad G^{(ij)}(x) = 0\in\mathbb{R}^{N\times N} \quad (i,j = 1,\dots,r,\ i\ne j) \qquad (8.121) \]
in the general definition (7.60), where $I$ denotes the identity matrix and $0$ the matrix with all zero entries.

(b) Construction of the preconditioning matrix

The corresponding preconditioner $S_h$ for the discretized problem is decoupled into discrete minus Laplacians, i.e. it is a block-diagonal matrix
\[ S_h = \begin{pmatrix} -\Delta_h & & 0\\ & \ddots & \\ 0 & & -\Delta_h \end{pmatrix}, \]
where $\Delta_h$ comes from the same discretization of the Laplacian as the one applied to the boundary value problem. In the course of the FEM realization, the auxiliary problems consist of $r$ independent discrete Poisson equations of the type
\[ \text{find } z_i\in V_h : \quad \int_\Omega \nabla z_i\cdot\nabla v = \int_\Omega \varrho_i v \qquad (i = 1,\dots,r,\ v\in V_h) \]
with some finite-dimensional FEM subspace $V_h\subset H_0^1(\Omega)$. (In these problems the residual-like right-hand sides $\varrho_i$ depend on each component; e.g. they are of the form (7.64) in the case of a simple iteration.) That is, the above defined blocks of the preconditioning matrix are given by
\[ (-\Delta_h)_{m,l} = \int_\Omega \nabla v_m\cdot\nabla v_l \qquad (m,l = 1,\dots,k), \]
where $v_1,\dots,v_k$ is a basis of $V_h$. Accordingly, the solution of the linear algebraic systems containing the preconditioner $S_h$ relies on the fast Poisson solvers, cited in subsection 3.3.3 and to be detailed further in section 9.3.

(c) Conditioning

We determine the condition number of the operator $S^{-1}T$ on the continuous level based on Theorem 7.4 in section 7.1. Then the conditioning of the discretized operators
follows from Theorem 8.2. (The derived condition number determines the convergence of the simple iteration in Theorem 7.4. We note that the same preconditioner can also be used for inner iterations in outer Newton-like iterations, similarly as mentioned in Remark 8.5 for a single Laplacian. See also Remark 8.3 on the related conditioning estimates.)

Namely, condition (iii) in Assumptions 7.4 can be written in an equivalent way in terms of the quadratic forms:
\[ \mu_1|\xi|^2 \le \Big\langle \frac{\partial f}{\partial\eta}(x,\eta)\,\xi,\ \xi \Big\rangle \le \mu_2|\xi|^2 \qquad ((x,\eta)\in\Omega\times\mathbb{R}^{Nr},\ \xi\in\mathbb{R}^{Nr}), \qquad (8.122) \]
where the concise notation $f$ is defined via writing $f_i = (f_{i1},\dots,f_{iN})$ and then
\[ f = (f_{11},\dots,f_{1N},\ f_{21},\dots,f_{2N},\ \dots,\ f_{r1},\dots,f_{rN}) : \Omega\times\mathbb{R}^{Nr}\to\mathbb{R}^{Nr}. \qquad (8.123) \]
This means that condition (7.57) is satisfied if we thereby substitute the matrix $G\in\mathbb{R}^{Nr\times Nr}$ defined via (8.121) and set $m = \mu_1$, $M = \mu_2$. Then inequality (7.59) turns into
\[ \mu_1\int_\Omega \sum_{i=1}^r |\nabla h_i|^2 \le \int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla u)\,\nabla h\cdot\nabla h \le \mu_2\int_\Omega \sum_{i=1}^r |\nabla h_i|^2 \qquad (8.124) \]
(for all $u,h\in H_0^1(\Omega)^r$), and Theorem 7.4 implies
\[ \mathrm{cond}\big( S^{-1}T \big) \le \frac{\mu_2}{\mu_1}. \]
Owing to Theorem 8.2, for any FEM subspace $V_h\subset H_0^1(\Omega)$ the discretized operators in $V_h^r$ inherit the above estimate:
\[ \mathrm{cond}\big( S_h^{-1}T_h \big) \le \frac{\mu_2}{\mu_1}, \]
which is independent of the subspace $V_h$.

Example. The nonlinear elasticity system (1.29) involves the nonlinearity
\[ f(x,\nabla u) = 3k(x, |\mathrm{vol}\,\varepsilon(u)|^2)\,\mathrm{vol}\,\varepsilon(u) + 2\mu(x, |\mathrm{dev}\,\varepsilon(u)|^2)\,\mathrm{dev}\,\varepsilon(u) \qquad (8.125) \]
(for $x\in\Omega\subset\mathbb{R}^3$, $u\in H_0^1(\Omega)^3$) in the system (8.120) with the notation (8.123), where
\[ \varepsilon(u) = \tfrac12\big( \nabla u + \nabla u^t \big) \]
and the piecewise $C^1$ functions $k,\mu : \Omega\times\mathbb{R}^+\to\mathbb{R}$ satisfy
\[ 0 < \lambda \le \mu(x,s) < \tfrac32\,k(x,s) \le \Lambda, \qquad 0 < \lambda \le \tfrac{\partial}{\partial s}\big( k(x,s^2)s \big) \le \Lambda, \qquad 0 < \lambda \le \tfrac{\partial}{\partial s}\big( \mu(x,s^2)s \big) \le \Lambda \qquad (8.126) \]
with $\lambda = \min\{\mu_0,\delta_0\}$ and $\Lambda = \max\{k_0,\tilde\delta_0\}$, where the constants $\mu_0, k_0, \delta_0, \tilde\delta_0$ come from (1.33) and are independent of $(x,s)$; further, $\mathrm{vol}\,\varepsilon(u)$ and $\mathrm{dev}\,\varepsilon(u)$ are defined via
(1.30). Under the considered Dirichlet boundary conditions, the corresponding system (8.120) is a pure displacement problem. Then, by (6.87) in Proposition 6.2, inequality (8.124) is satisfied, hence the corresponding conditioning estimates are valid.

In fact, by (6.85) the nonlinearity (8.125) satisfies
\[ \lambda\int_\Omega |\varepsilon(h)|^2 \le \int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla u)\,\nabla h\cdot\nabla h \le \Lambda\int_\Omega |\varepsilon(h)|^2 \qquad (u,h\in H_0^1(\Omega)^3) \qquad (8.127) \]
with $\lambda$ and $\Lambda$ from (8.126); further, by (6.86) there holds the Korn type inequality
\[ \kappa\int_\Omega \sum_{i=1}^3 |\nabla h_i|^2 \le \int_\Omega |\varepsilon(h)|^2 \le K\int_\Omega \sum_{i=1}^3 |\nabla h_i|^2 \qquad (8.128) \]
with suitable constants $K\ge\kappa > 0$ independent of $u,h\in H_0^1(\Omega)^3$. In fact, under the Dirichlet boundary conditions considered for (8.120), we now have the sharp value $\kappa = 1/2$ (see [12]); further, there holds $K = 1$ (obtained trivially even for the integrands, regardless of the boundary condition). Hence (8.124) holds with
\[ \mu_1 = \kappa\lambda = \lambda/2, \qquad \mu_2 = K\Lambda = \Lambda, \]
and accordingly we obtain the estimate
\[ \mathrm{cond}\big( S^{-1}T \big) \le 2\,\frac{\Lambda}{\lambda}, \]
which is inherited by $S_h^{-1}T_h$ independently of the subspace $V_h$. (As mentioned earlier, the obtained estimate determines the convergence of the simple iteration in Theorem 7.4. We note that, analogously to Remark 8.5, the decoupled Laplacian preconditioners can also be used efficiently for inner iterations both in outer Newton-like and in frozen coefficient iterations for this nonlinearity.)
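For completeness, the volumetric/deviatoric split used in (8.125) is straightforward to compute pointwise; a minimal sketch (ours), using the standard continuum-mechanics definitions, which are assumed to agree with (1.30):

```python
import numpy as np

def strain_split(grad_u):
    """Symmetric strain and its volumetric/deviatoric split for a 3x3
    displacement gradient grad_u."""
    eps = 0.5 * (grad_u + grad_u.T)          # epsilon(u) = (grad u + grad u^t)/2
    vol = np.trace(eps) / 3.0 * np.eye(3)    # volumetric part
    dev = eps - vol                          # deviatoric (trace-free) part
    return eps, vol, dev
```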
Chapter 9

Algorithmic realization of iterative methods based on preconditioning operators

This chapter is devoted to the algorithmic realization of numerical methods that use the idea of preconditioning operators developed in the preceding chapters. That is, we consider iterative sequences in which the preconditioning matrices are the discretizations of suitable linear elliptic operators, and the sequence itself is the suitable projection of a Sobolev space iteration into the discretization subspace. The aim of this chapter is twofold:

• We wish to demonstrate that the iterative methods based on preconditioning operators are easy to algorithmize, and that they define a complete procedure if one relies on the linear elliptic solvers quoted in section 3.3. We will see that the construction of the iteration is particularly straightforward in the case of FEM realization. Further, the given constructions include the computation of the arising constants using the coefficients of the problem.

• The convergence of the given methods is established, with attention paid to the numerical error arising in the auxiliary linear problems.

We note that the computer coding of the given algorithms can rely on a standard computational background and various highly developed packages, based on the linear elliptic solvers given in subsections 3.3.2–3.3.3, which frequently allow parallelization as well. The study of different questions related to programming (such as round-off errors etc.) is not the aim of this book.

The derivation of a numerical iteration based on the preconditioning operator idea was described in the introduction of the previous chapter and illustrated in Figure 8.2 for one-step methods. The iterative sequence for the discretized problem is the suitable projection of a theoretical iteration, running in the Sobolev space, into the discretization subspace, as shown again in Figure 9.1. The theoretical iteration on the upper (Sobolev space) level is defined using the methods of Chapter 7. Various choices of preconditioning operators $S^{(n)}$, together with their properties and those of their discretizations $S_h^{(n)}$, have been discussed in Chapter 8.
Figure 9.1: derivation of a numerical iteration using the preconditioning operator idea.
\[ T(u) = g \quad\longrightarrow\quad u^{(n+1)} = u^{(n)} - \alpha^{(n)}\,(S^{(n)})^{-1}\big( T(u^{(n)}) - g \big) \]
\[ \downarrow \]
\[ u_h^{(n+1)} = u_h^{(n)} - \alpha^{(n)}\,(S_h^{(n)})^{-1}\big( T_h(u_h^{(n)}) - g_h \big) \]
This chapter presents the algorithmization of the basic types of such methods that are in the focus of the book, namely, simple iterations and Newton-like methods. The discussion in section 9.1 is motivated by the joint treatment of the two aims mentioned above: it gives algorithms in a way which helps to derive error estimates easily. For this purpose we rewrite the process of Figure 9.1 in appropriate ways. (This also serves greater generality, since we can thus allow the discretization parameter to vary stepwise, i.e. $h = h_n$, or omit the explicit role of $h$ in the study of convergence.) Simple iterations and Newton-like methods are discussed in subsections 9.1.1 and 9.1.2, respectively. In each case we algorithmize the sequences at both levels, i.e. first the theoretical iteration and then the suitably rewritten numerical iteration. For Newton-like methods we also include the realizations of the inexact Newton method with inner-outer iterations or via variable preconditioning. Convergence estimates are given in subsection 9.1.3.

Section 9.2 is devoted to the FEM realization of simple and Newton-like iterations using the Sobolev space background and preconditioning operators. As mentioned earlier, the FEM discretization is in the focus of our investigations since, by its construction, it is the most natural realization of Sobolev space methods. This is particularly clear when the numerical iteration at the lower level of Figure 9.1 is derived from the upper theoretical iteration as its projection to a FEM subspace. Namely, in this case we use the same formulas as in the theoretical iteration, just replacing the Sobolev space by the considered fixed finite dimensional FEM subspace. The FEM realization of simple (gradient type) iterations and Newton-like methods, called gradient– and Newton–finite element methods (GFEM and NFEM), is given in subsections 9.2.1 and 9.2.2, respectively, using a fixed FEM subspace. The corresponding convergence estimates are summarized in subsection 9.2.3, with focus on the mesh independence of the rate of convergence. The GFEM and NFEM are extended to a more general multilevel setting in subsection 9.2.4, in which stepwise varied mesh widths $h_n$ are used.

Finally, section 9.3 contains some remarks on possible usages of Laplacian preconditioners. Following a brief summary on Poisson solvers which completes the references of subsection 3.3.3, this section also includes the direct gradient method, which shows the possibility of literal Sobolev space preconditioning: namely, for some special problems the Laplacian can be inverted exactly, hence discretization is avoided and the iteration is applied directly in the Sobolev space. This method is then generalized by combining it with Fourier series.
The algorithms in this chapter are considered in general form, i.e. the coefficient matrices of the preconditioning operators are not specified. This is both because the two goals, given at the beginning of this chapter, concern general iterations, and because various choices of coefficient matrices have already been discussed in Chapter 8. (Of course, the examples with Laplacian preconditioners in section 9.3 form a special case.)

Similarly to Chapter 7, a convenient class of problems, where generality does not obscure the exposition of ideas, is formed by second order mixed problems for operators consisting of the principal part only. That is, we consider the problem
\[
\begin{cases}
\ -\mathrm{div}\, f(x,\nabla u) = g(x) & \text{in } \Omega\\
\ f(x,\nabla u)\cdot\nu = \gamma(x) & \text{on } \Gamma_N\\
\ u = 0 & \text{on } \Gamma_D
\end{cases} \qquad (9.1)
\]
with Assumptions 9.1 given below. The detailed formulation of methods and results in this chapter will be presented for this class. In addition, the basic ideas and convergence results are also given for general problems
\[ F(u) = b \qquad (9.2) \]
covering the weak form of boundary value problems. The formulation of these results for other specified classes of problems will be referred to; however, the exact details are left to the reader, together with the analogous formulations for the Sobolev space methods of section 7.4.

Concerning problem (9.1), we recall the Sobolev space
\[ H_D^1(\Omega) := \{u\in H^1(\Omega) : u|_{\Gamma_D} = 0\}, \]
defined in (6.15), corresponding to the Dirichlet boundary $\Gamma_D$. Further, similarly as in Chapter 7, we assume that problem (9.1) satisfies the following conditions:

Assumptions 9.1.

(i) $\Omega\subset\mathbb{R}^N$ is a bounded domain with piecewise smooth boundary; $\Gamma_N,\Gamma_D\subset\partial\Omega$ are measurable, $\Gamma_N\cap\Gamma_D = \emptyset$, $\Gamma_N\cup\Gamma_D = \partial\Omega$ and $\Gamma_D\ne\emptyset$;

(ii) the function $f : \Omega\times\mathbb{R}^N\to\mathbb{R}^N$ is measurable and bounded w.r.t. the variable $x\in\Omega$ and $C^1$ w.r.t. the variable $\eta\in\mathbb{R}^N$;

(iii) the Jacobians $\frac{\partial f}{\partial\eta}(x,\eta)$ are symmetric and their eigenvalues $\lambda = \lambda(x,\eta)$ satisfy
\[ 0 < \mu_1 \le \lambda \le \mu_2 < \infty \qquad (9.3) \]
with constants $\mu_2\ge\mu_1 > 0$ independent of $(x,\eta)$;

(iv) $g\in L^2(\Omega)$ and $\gamma\in L^2(\Gamma_N)$.
9.1 General algorithms
In this chapter we give the algorithmic form of simple and Newton-like iterations based on the Sobolev space background in subsections 9.1.1 and 9.1.2, respectively. In each case we algorithmize the sequences at both levels: first the theoretical iteration is given in the Sobolev space, and then the general form of numerical iterations is presented, in which the auxiliary problems are solved numerically, i.e. with some numerical error. (The analogous formulation for the Sobolev space methods of section 7.4 is left to the reader.) In addition to the sake of generality, the described presentation will enable us to derive convergence estimates for the numerical iterations in subsection 9.1.3.

For the above purposes we need to rewrite the process of Figure 9.1 in appropriate steps. We are interested in the way in which the numerical iteration is derived. First, one should indicate that both the theoretical and numerical iterations comprise sequences of auxiliary linear problems: see Figure 9.2.

Figure 9.2:
\[ u^{(n+1)} = u^{(n)} - \alpha^{(n)} z^{(n)}, \quad\text{where } S^{(n)} z^{(n)} = T(u^{(n)}) - g \]
\[ \downarrow \]
\[ u_h^{(n+1)} = u_h^{(n)} - \alpha^{(n)} z_h^{(n)}, \quad\text{where } S_h^{(n)} z_h^{(n)} = T_h(u_h^{(n)}) - g_h \]
Second, in the numerical iteration we can neglect the role of the discretization parameter $h$. Instead, more generally, we can consider the numerical iteration as a sequence in which the auxiliary problems are solved approximately in each step (with some numerical error). In this case, shown by Figure 9.3, we denote the numerical sequence by $(\bar u^{(n)})$; further, the (unknown) exact and the numerical solutions of the auxiliary problems are denoted by $z_*^{(n)}$ and $\bar z^{(n)}$, respectively.

Figure 9.3:
\[ u^{(n+1)} = u^{(n)} - \alpha^{(n)} z^{(n)}, \quad\text{where } S^{(n)} z^{(n)} = T(u^{(n)}) - g \]
\[ \downarrow \]
\[ \bar u^{(n+1)} = \bar u^{(n)} - \alpha^{(n)} \bar z^{(n)}, \quad\text{where } S^{(n)} z_*^{(n)} = T(\bar u^{(n)}) - g \ \text{ and }\ \bar z^{(n)} \approx z_*^{(n)} \]
Finally, we make some notational changes for convenience. First, without the presence of the discretization parameter $h$, it is simpler to write a lower index $n$ for the sequences instead of the upper index $(n)$ that was used before to avoid double subscripts. Further, the discussion of this section is based on the weak form of the operators; hence, in order to be consistent with our previous notations, we denote the weak forms of the linear and nonlinear operators by $B_n$ and $F$ in the diagram, respectively. These changes in the previous diagram yield Figure 9.4.
Figure 9.4:
\[ u_{n+1} = u_n - \alpha_n z_n, \quad\text{where } B_n z_n = F(u_n) - b \]
\[ \downarrow \]
\[ \bar u_{n+1} = \bar u_n - \alpha_n \bar z_n, \quad\text{where } B_n z_n^* = F(\bar u_n) - b \ \text{ and }\ \bar z_n \approx z_n^* \]
In fact, for simple iterations the presence of Bn will be realized by an energy inner product, whereas for Newton-like methods it is the linearized operator or its approximation; further, in both cases Bn is represented by the corresponding weak form using an integral with test functions.
9.1.1 Simple iterations
(a) Theoretical sequences
Simple iterations based on the preconditioning operator theory have been developed in section 7.1. The term 'simple iteration' means that we set a fixed preconditioning operator (independent of $n$) throughout the iteration. The algorithmic form of these iterations for problem (9.1) can be established using Remark 7.4 and Theorem 7.1. Namely, first we choose a symmetric matrix-valued function $G\in L^\infty(\Omega,\mathbb{R}^{N\times N})$ satisfying
\[ m\,\langle G(x)\xi,\xi\rangle \le \Big\langle \frac{\partial f}{\partial\eta}(x,\eta)\,\xi,\xi \Big\rangle \le M\,\langle G(x)\xi,\xi\rangle \qquad ((x,\eta)\in\Omega\times\mathbb{R}^N,\ \xi\in\mathbb{R}^N) \qquad (9.4) \]
with constants $M\ge m > 0$ independent of $x$, $\eta$ and $\xi$. Then the iteration $(u_n)\subset H_D^1(\Omega)$ is defined in the following way:
(a) $u_0\in H_D^1(\Omega)$;

for any $n\in\mathbb{N}$: if $u_n\in H_D^1(\Omega)$ is obtained, then

(b1) $z_n\in H_D^1(\Omega)$ is the solution of the problem
\[ \int_\Omega G(x)\,\nabla z_n\cdot\nabla v = \int_\Omega f(x,\nabla u_n)\cdot\nabla v - \Big( \int_\Omega g v + \int_{\Gamma_N} \gamma v\,d\sigma \Big) \qquad (v\in H_D^1(\Omega)); \]

(b2) $u_{n+1} = u_n - \dfrac{2}{M+m}\,z_n$.  (9.5)
Remark 9.1 We note that the preconditioning operator $S u = -\mathrm{div}(G(x)\nabla u)$, which defines the energy inner product $\langle h,v\rangle_G = \int_\Omega G(x)\,\nabla h\cdot\nabla v$ in $H_D^1(\Omega)$, yields the same spectral bounds $m$ and $M$ for step (b2) of the iteration as in (9.4), since the nonlinear operator consists of the principal part only. In the presence of a lower order term $q(x,u)$ in (9.1), $M$ depends on $u_0$ and is determined using Theorem 7.2.

The algorithm (9.5) can be formulated in a similar way for the boundary value problems considered in Theorems 7.3–7.6. This is sketched below in a general setting, the formulation for the individual cases being left to the reader. First we choose a matrix (or array) $G$ satisfying the inequality corresponding to (9.4) in Theorems 7.3–7.6 w.r.t. the coefficient of the nonlinear differential operator. Let $\langle\cdot,\cdot\rangle_G$ be the energy inner product corresponding to $G$ in the Sobolev space $H$ related to the equation. Let us denote by $F$ the weak form of the nonlinear differential operator w.r.t. $\langle\cdot,\cdot\rangle_G$, i.e. the weak formulation of the problem can be written as
\[ \langle F(u), v\rangle_G = \langle b, v\rangle_G \qquad (v\in H) \qquad (9.6) \]
(cf. also Remark 7.2). The constants $m$ and $M$ are constructed as in Theorems 7.3–7.6. (Similarly as in Remark 9.1, they coincide with the spectral bounds corresponding to $G$ in the case of an operator with principal part only, or depend on $u_0$ in the presence of a lower order term.) Then the iteration $(u_n)\subset H$ is defined in the following way:

(a) $u_0\in H$;

for any $n\in\mathbb{N}$: if $u_n\in H$ is obtained, then
(b1) $z_n\in H$ is the solution of the problem
\[ \langle z_n, v\rangle_G = \langle F(u_n) - b, v\rangle_G \qquad (v\in H); \]

(b2) $u_{n+1} = u_n - \dfrac{2}{M+m}\,z_n$.  (9.7)
(We remark that, clearly, the algorithm (9.7) includes the case (9.5) as well.)
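In a discretization subspace, the algorithm (9.5)/(9.7) takes a simple matrix form; the following sketch is ours, with hypothetical callables `F_h`, `b_h` supplied by the discretization. It factorizes the fixed preconditioner once and iterates with the damping factor $2/(M+m)$.

```python
import numpy as np
import scipy.sparse.linalg as spla

def simple_iteration(u0, F_h, b_h, S_h, m, M, tol=1e-8, maxit=200):
    """Discrete rendering of the simple iteration (9.5)/(9.7):
    u_{n+1} = u_n - 2/(M+m) z_n, where S_h z_n = F_h(u_n) - b_h.
    S_h: sparse SPD preconditioning matrix with spectral bounds m, M as in (9.4)."""
    solve = spla.factorized(S_h.tocsc())   # factor the fixed preconditioner once
    u = u0.copy()
    for _ in range(maxit):
        r = F_h(u) - b_h
        z = solve(r)
        u = u - 2.0 / (M + m) * z
        if np.sqrt(r @ z) < tol:           # residual in the S_h^{-1}-energy norm
            break
    return u
```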
(b) Numerical iterations
Now we define the general algorithm, in which the auxiliary problems in step (b1) of (9.5) are solved with some numerical error $\delta_n$. The iteration is denoted by $(\bar u_n)\subset H_D^1(\Omega)$ for distinction from the previous theoretical sequences. (This will help us to study convergence in subsection 9.1.3 under the natural choice $\bar u_0 = u_0$.) Similarly as for (9.5), first we choose a symmetric matrix-valued function $G\in L^\infty(\Omega,\mathbb{R}^{N\times N})$ satisfying (9.4). Then the sequence $(\bar u_n)\subset H_D^1(\Omega)$ is defined in the following way:
(a) $\bar u_0\in H_D^1(\Omega)$;

for any $n\in\mathbb{N}$: if $\bar u_n\in H_D^1(\Omega)$ is obtained, then

(b1) $z_n^*\in H_D^1(\Omega)$ denotes the exact solution of the problem
\[ \int_\Omega G(x)\,\nabla z_n^*\cdot\nabla v = \int_\Omega f(x,\nabla\bar u_n)\cdot\nabla v - \Big( \int_\Omega g v + \int_{\Gamma_N} \gamma v\,d\sigma \Big) \qquad (v\in H_D^1(\Omega)); \]

(b2) $\delta_n > 0$ is some constant satisfying $\delta_n\le\delta_0$, and $\bar z_n\approx z_n^*$, $\bar z_n\in H_D^1(\Omega)$ is constructed such that $\|z_n^* - \bar z_n\|_G \le \delta_n$;

(b3) $\bar u_{n+1} = \bar u_n - \dfrac{2}{M+m}\,\bar z_n$.  (9.8)
The algorithm (9.8) can be formulated in a similar way for the boundary value problems considered in Theorems 7.3–7.6. Similarly to the setting of (9.7) and using the notations thereby, the iteration is given as follows. (Again, the algorithm (9.9) includes the case (9.8) as well.)
(a) $\bar u_0\in H$;

for any $n\in\mathbb{N}$: if $\bar u_n\in H$ is obtained, then

(b1) $z_n^*\in H$ denotes the exact solution of the problem
\[ \langle z_n^*, v\rangle_G = \langle F(\bar u_n) - b, v\rangle_G \qquad (v\in H); \]

(b2) $\delta_n > 0$ is some constant satisfying $\delta_n\le\delta_0$, and $\bar z_n\approx z_n^*$, $\bar z_n\in H$ is constructed such that $\|z_n^* - \bar z_n\|_G \le \delta_n$;

(b3) $\bar u_{n+1} = \bar u_n - \dfrac{2}{M+m}\,\bar z_n$.  (9.9)
We note that the algorithms (9.8)–(9.9) contain the special case when the auxiliary problems are solved in a fixed subspace $V_h\subset H_0^1(\Omega)$ (respectively $V_h\subset H$) where $h$ is independent of $n$. (Then the numbers $\delta_n$ come from the corresponding discretization errors.) This case corresponds to Figure 9.2 in the introduction of this section.

Remark 9.2 The error estimate $\|\bar z_n - z_n^*\|_G\le\delta_n$ can be rewritten in terms of the relative residual error estimate
\[ \|\bar z_n - (F(\bar u_n) - b)\|_G \le \rho_n\,\|F(\bar u_n) - b\|_G \qquad (9.10) \]
with $\rho_n = \delta_n / \|F(\bar u_n) - b\|_G$.
9.1.2 Newton-like iterations
In the first paragraph (a) of this subsection we give the theoretical damped Newton algorithm in Sobolev space. The corresponding numerical iterations coincide with the damped inexact Newton method, presented in the second paragraph (b). On Figure 9.4, Newton’s method corresponds to the choice Bn = F ′ (un ). In this case the auxiliary problems are determined by the derivative operators at un instead of a priori chosen preconditioning operators, hence their treatment requires further attention. Two particular realizations are presented in paragraphs (c) and (d), following the ideas described in section 7.2. Namely, the auxiliary problems are solved by preconditioned inner iterations or are approximated by other operators to provide a variable preconditioned algorithm, respectively. An important advantage of the Sobolev space based Newton methods in their numerical realization is that the Jacobians are provided in a straightforward manner, without any further numerical differentiation. This is because they come from the weak formulation instead of studying the actual form of the nonlinear algebraic system corresponding to the discretized operator.
(a) The theoretical sequences

First we consider the second order mixed problem (9.1) again. The Sobolev space $H_D^1(\Omega)$ is now endowed with the inner product
\[ \langle u,v\rangle_{H_D^1} := \int_\Omega \nabla u\cdot\nabla v. \]
The generalized differential operator and right-hand side are defined by
\[ \langle F(u), v\rangle_{H_D^1} = \int_\Omega f(x,\nabla u)\cdot\nabla v, \qquad \langle b, v\rangle_{H_D^1} = \int_\Omega g v + \int_{\Gamma_N} \gamma v\,d\sigma \qquad (9.11) \]
(for all $v\in H_D^1(\Omega)$), respectively. (These weak forms of $F$ and $b$ follow the usage in Chapter 6, and some further explanation to clarify their meaning has been given in Remark 7.2. See also the Appendix, paragraph (c) of A2.) We assume that $F'$ is Lipschitz continuous. The damped Newton iteration for (9.1) in Sobolev space is given by Theorem 7.7, and its algorithmic form is as follows.
(a) $u_0\in H_D^1(\Omega)$;

for any $n\in\mathbb{N}$: if $u_n\in H_D^1(\Omega)$ is obtained, then

(b1) $p_n\in H_D^1(\Omega)$ is the solution of the problem $F'(u_n)p_n = -(F(u_n) - b)$, that is,
\[ \int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla u_n)\,\nabla p_n\cdot\nabla v = -\int_\Omega f(x,\nabla u_n)\cdot\nabla v + \int_\Omega g v + \int_{\Gamma_N}\gamma v\,d\sigma \qquad (v\in H_D^1(\Omega)); \]

(b2) $\tau_n = \min\Big\{ 1,\ \dfrac{\mu_1}{L\,\|p_n\|_{H_D^1}} \Big\} \in (0,1]$;

(b3) $u_{n+1} = u_n + \tau_n p_n$.  (9.12)
Remark 9.3 The constants required for $\tau_n$ in step (b2) of (9.12) can be computed as follows: (1) $\mu_1$ is the lower bound from (9.3); (2) the Lipschitz constant $L$ of $F'$ can be estimated using Remarks 7.12 or 7.14.

For the other boundary value problems considered in Chapter 7, the analogue of algorithm (9.12) is now formulated in a general setting, similarly to that of (9.7). (For the different individual problems one can rely on Theorem 7.8.) We denote by $H$ the Sobolev space related to the equation and by $F$ the generalized differential operator in $H$. Then the iteration for the weak form
\[ \langle F(u), v\rangle = \langle b, v\rangle \qquad (v\in H) \qquad (9.13) \]
of the problem is given as follows.

(a) $u_0\in H$;

for any $n\in\mathbb{N}$: if $u_n\in H$ is obtained, then

(b1) $p_n\in H$ is the solution of the problem
\[ \langle F'(u_n)p_n, v\rangle = -\langle F(u_n) - b, v\rangle \qquad (v\in H); \]

(b2) $\tau_n = \min\Big\{ 1,\ \dfrac{\mu_1}{L\,\|p_n\|} \Big\} \in (0,1]$;

(b3) $u_{n+1} = u_n + \tau_n p_n$.  (9.14)
(Note again that the general algorithm (9.14) includes the case (9.12) as well, and the constants in step (b2) can be computed following Remark 9.3.)

(b) Numerical iterations: the general inexact algorithms

The numerical iteration corresponding to (9.12) coincides with the damped inexact Newton method. Namely, the auxiliary problems are solved with some numerical error $\delta_n$ (and the damping parameter is also suited to this step). For the error of the auxiliary problems, in accordance with the usual discussion of inexact Newton methods, we hereby consider the relative residual error estimate
\[ \|F'(\bar u_n)\bar p_n + (F(\bar u_n) - b)\| \le \delta_n\,\|F(\bar u_n) - b\|, \qquad (9.15) \]
instead of the estimate $\|p_n^* - \bar p_n\|\le\delta_n$ used in the simple iterations (9.8)–(9.9). (This corresponds to the formulation (9.10).) We note that, since the exact solution of the auxiliary problem satisfies $\|F'(\bar u_n)p_n^* + (F(\bar u_n) - b)\| = 0$, the lower bound $\mu_1$ of $F'(\bar u_n)$ and the estimate (9.15) imply
\[ \mu_1\,\|p_n^* - \bar p_n\| \le \|F'(\bar u_n)(p_n^* - \bar p_n)\| = \|F'(\bar u_n)\bar p_n + (F(\bar u_n) - b)\| \le \delta_n\,\|F(\bar u_n) - b\|, \]
i.e. the relative error estimate (9.15) implies $\|p_n^* - \bar p_n\|\le\tilde\delta_n$ with $\tilde\delta_n = (\delta_n/\mu_1)\,\|F(\bar u_n) - b\|$.

The inexact Newton algorithms are formulated below, without specifying the way of solving the auxiliary problems. First we consider problem (9.1). The sequence $(\bar u_n)\subset H_D^1(\Omega)$ is defined in the following way:
(a) $\bar u_0\in H_D^1(\Omega)$;

for any $n\in\mathbb{N}$: if $\bar u_n\in H_D^1(\Omega)$ is obtained, then

(b1) $\delta_n > 0$ is some constant satisfying $0 < \delta_n\le\delta_0 < 1$;

(b2) $p_n^*\in H_D^1(\Omega)$ denotes the exact solution of the problem $F'(\bar u_n)p_n^* = -(F(\bar u_n) - b)$, that is,
\[ \int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla\bar u_n)\,\nabla p_n^*\cdot\nabla v = -\int_\Omega f(x,\nabla\bar u_n)\cdot\nabla v + \int_\Omega g v + \int_{\Gamma_N}\gamma v\,d\sigma \qquad (v\in H_D^1(\Omega)); \]

(b3) $\bar p_n\approx p_n^*$, $\bar p_n\in H_D^1(\Omega)$ is constructed such that
\[ \|F'(\bar u_n)\bar p_n + (F(\bar u_n) - b)\|_{H_D^1} \le \delta_n\,\|F(\bar u_n) - b\|_{H_D^1}; \]

(b4) $\tau_n = \min\Big\{ 1,\ \dfrac{(1-\delta_n)}{(1+\delta_n)}\,\dfrac{\mu_1}{L\,\|\bar p_n\|_{H_D^1}} \Big\} \in (0,1]$;

(b5) $\bar u_{n+1} = \bar u_n + \tau_n\bar p_n$.  (9.16)

The constants required for $\tau_n$ in step (b4) can be computed using Remark 9.3.
To involve the other boundary value problems considered in Chapter 7, the algorithm (9.16) is now extended to a general setting, similarly to (9.14). Using the notations thereby, the iteration is given as follows.

(a) $\bar u_0\in H$;

for any $n\in\mathbb{N}$: if $\bar u_n\in H$ is obtained, then

(b1) $\delta_n > 0$ is some constant satisfying $0 < \delta_n\le\delta_0 < 1$;

(b2) $p_n^*\in H$ denotes the exact solution of the problem
\[ \langle F'(\bar u_n)p_n^*, v\rangle = -\langle F(\bar u_n) - b, v\rangle \qquad (v\in H); \]

(b3) $\bar p_n\approx p_n^*$, $\bar p_n\in H$ is constructed such that
\[ \|F'(\bar u_n)\bar p_n + (F(\bar u_n) - b)\| \le \delta_n\,\|F(\bar u_n) - b\|; \]

(b4) $\tau_n = \min\Big\{ 1,\ \dfrac{(1-\delta_n)}{(1+\delta_n)}\,\dfrac{\mu_1}{L\,\|\bar p_n\|} \Big\} \in (0,1]$;

(b5) $\bar u_{n+1} = \bar u_n + \tau_n\bar p_n$.  (9.17)
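A discrete rendering of (9.16)–(9.17) is sketched below (ours; `F_h` and the sparse Jacobian `J_h` are assumptions standing for the discretized operator and its derivative, with $\mu_1$ and $L$ from Remark 9.3). For transparency the inner problem is solved exactly here; an iterative inner solver run to relative residual tolerance $\delta_n$ would realize step (b3) literally, while the damping of step (b4) is kept.

```python
import numpy as np
import scipy.sparse.linalg as spla

def damped_newton(u0, F_h, J_h, b_h, mu1, L_const, delta=0.0,
                  tol=1e-10, maxit=50):
    """Discrete sketch of the damped (inexact) Newton iteration (9.16)-(9.17).
    F_h(u): discrete operator value; J_h(u): sparse Jacobian ~ F'_h(u);
    delta: inner relative residual tolerance delta_n (0 = exact inner solve)."""
    u = u0.copy()
    for _ in range(maxit):
        r = F_h(u) - b_h
        if np.linalg.norm(r) < tol:
            break
        # steps (b2)/(b3): exact solve here; an iterative solver with
        # relative residual tolerance delta gives the inexact variant
        p = spla.spsolve(J_h(u).tocsc(), -r)
        # damping parameter of step (b4)
        tau = min(1.0, (1 - delta) / (1 + delta)
                  * mu1 / (L_const * np.linalg.norm(p)))
        u = u + tau * p
    return u
```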
We note that the algorithms (9.16)–(9.17) contain the special case when the auxiliary problems are solved in a fixed subspace $V_h\subset H_0^1(\Omega)$ (respectively $V_h\subset H$) where $h$ is independent of $n$. (Then the numbers $\delta_n$ come from the corresponding discretization errors.) This case corresponds to Figure 9.2 in the introduction of this section.

(c) Inner-outer iterations

As a special case of the damped inexact Newton iteration (9.16), we consider the realization when the approximate solutions $\bar p_n$ of the auxiliary problems
\[ \int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla u_n)\,\nabla p_n^*\cdot\nabla v = -\int_\Omega f(x,\nabla u_n)\cdot\nabla v + \int_\Omega g v + \int_{\Gamma_N}\gamma v\,d\sigma \qquad (9.18) \]
($v\in H_D^1(\Omega)$) are constructed by inner iterations. This inner-outer iteration method has been introduced in this form in [35], and is defined in Theorem 7.9 for our problem (9.1). We recall that the Jacobians have common spectral bounds $\mu_2\ge\mu_1 > 0$, see (9.3). The method in question is formulated in Theorem 7.9 in a finite-dimensional subspace $V\subset H_D^1(\Omega)$.
The algorithmic form of the inner-outer iteration is as follows. For simplicity, we use the notations $u_n$ and $p_n$ instead of $\bar u_n$ and $\bar p_n$, respectively. The outer iteration is given by the algorithm

(a) $u_0\in V$;

for any $n\in\mathbb{N}$: if $u_n\in V$ is obtained, then

(b1) $\delta_n > 0$ is some constant satisfying $0 < \delta_n\le\delta_0 < 1$;

(b2) $p_n$ is obtained from the inner iteration (9.20);

(b3) $\tau_n = \min\Big\{ 1,\ \dfrac{(1-\delta_n)}{(1+\delta_n)}\,\dfrac{\mu_1}{L\,\|p_n\|_{H_D^1}} \Big\} \in (0,1]$;

(b4) $u_{n+1} = u_n + \tau_n p_n$.  (9.19)

The constants required for $\tau_n$ in step (b3) can be computed using Remark 9.3.

The inner iteration for given $u_n\in V$ is as follows. First of all, we choose constants $M_n\ge m_n > 0$ and a symmetric matrix-valued function $G_n\in L^\infty(\Omega,\mathbb{R}^{N\times N})$ satisfying $\sigma(G_n(x))\subset[\mu_1,\mu_2]$ $(x\in\Omega)$ with $\mu_1,\mu_2$ from (9.3), such that there holds
\[ m_n\,\langle G_n(x)\xi,\xi\rangle \le \Big\langle \frac{\partial f}{\partial\eta}(x,\nabla u_n(x))\,\xi,\xi \Big\rangle \le M_n\,\langle G_n(x)\xi,\xi\rangle \qquad (x\in\Omega,\ \xi\in\mathbb{R}^N). \]
We let
\[ \langle\varrho, v\rangle = -\int_\Omega f(x,\nabla u_n)\cdot\nabla v + \int_\Omega g v + \int_{\Gamma_N}\gamma v\,d\sigma \qquad (v\in V) \]
(the functional corresponding to the right-hand side of (9.18)). The inner iteration defines a sequence $(p^{(k)}) = (p_n^{(k)})_{k\in\mathbb{N}}\subset V$; we will omit the lower index $n$ for notational
simplicity, which causes no ambiguity since now $n$ is fixed. If $k_n\in\mathbb{N}$ is such that the relative residual error of $p^{(k_n)}$ is within the given tolerance, then we accept $p^{(k_n)}$ for $p_n$. The sequence $(p^{(k)})\subset V$ is defined by the preconditioned CGM according to (2.66), simultaneously with sequences $(r^{(k)})\subset V$ and $(d^{(k)})\subset V$. This is given below. Here step (a) defines the initial functions, steps (b1)–(b6) give the iteration cycle, step (c) contains the stopping criterion and step (d) yields the output of the inner iteration to be used in the next outer step.
(a) $p^{(0)}\in V$ is arbitrary; $r^{(0)} = -d^{(0)}\in V$ is the solution of the problem
\[ \int_\Omega G_n(x)\,\nabla r^{(0)}\cdot\nabla v = \int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla u_n)\,\nabla p^{(0)}\cdot\nabla v - \langle\varrho, v\rangle \qquad (v\in V); \]

for any $k = 1, 2, \dots$: if $p^{(k-1)}, r^{(k-1)}, d^{(k-1)}\in V$ are obtained, then

(b1) $\gamma_k = \displaystyle\int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla u_n)\,\nabla d^{(k-1)}\cdot\nabla d^{(k-1)}$, $\qquad \alpha_k = -\dfrac{1}{\gamma_k}\displaystyle\int_\Omega G_n(x)\,\nabla r^{(k-1)}\cdot\nabla d^{(k-1)}$;

(b2) $y^{(k)}\in V$ is the solution of the problem
\[ \int_\Omega G_n(x)\,\nabla y^{(k)}\cdot\nabla v = \int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla u_n)\,\nabla d^{(k-1)}\cdot\nabla v \qquad (v\in V); \]

(b3) $r^{(k)} = r^{(k-1)} + \alpha_k y^{(k)}$;

(b4) $p^{(k)} = p^{(k-1)} + \alpha_k d^{(k-1)}$;

(b5) $\beta_k = \dfrac{1}{\gamma_k}\displaystyle\int_\Omega \frac{\partial f}{\partial\eta}(x,\nabla u_n)\,\nabla d^{(k-1)}\cdot\nabla r^{(k)}$;

(b6) $d^{(k)} = -r^{(k)} + \beta_k d^{(k-1)}$;

until

(c) $\|F'(u_n)p^{(k)} + (F(u_n) - b)\|_{B_n^{-1}} \le \varrho_n\,\|F(u_n) - b\|_{B_n^{-1}}$, where $\varrho_n = (\mu_1/\mu_2)^{1/2}\,\delta_n$;

(d) $k_n$ = the first $k$ for which (c) holds; $p_n = p^{(k_n)}$.  (9.20)
Remark 9.4 According to Remark 7.13, step (c) in (9.20) is checked as follows. The inequality in step (c) is equivalent to
\[ \|r^{(k_n)}\|_{B_n} \le \varrho_n\,\|r^{(0)}\|_{B_n}, \qquad (9.21) \]
and here
\[ \|r^{(k)}\|_{B_n}^2 = \langle B_n r^{(k)}, r^{(k)}\rangle_{H_D^1} = \int_\Omega G_n(x)\,\nabla r^{(k)}\cdot\nabla r^{(k)} \qquad (9.22) \]
($k = 0,\dots,k_n$). Moreover,
\[ \langle B_n r^{(k)}, r^{(k)}\rangle_{H_D^1} = -\langle B_n r^{(k)}, d^{(k)}\rangle_{H_D^1} = -\int_\Omega G_n(x)\,\nabla r^{(k)}\cdot\nabla d^{(k)}, \]
which is determined anyway in the next iteration step for $\alpha_k$ in (b1) as long as $k < k_n$, hence no extra work is required to calculate (9.22).
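In matrix form, the inner iteration (9.20) is a preconditioned conjugate gradient method. The sketch below is ours: it solves $Jp = \varrho$ with $J$ standing for the (symmetric positive definite) discretized $F'(u_n)$ and $B$ for the preconditioner $(B_n)_h$, tracking both the plain residual $s = Jp - \varrho$ and the preconditioned residual $r = B^{-1}s$, and stopping on the $B_n$-norm criterion (9.21).

```python
import numpy as np
import scipy.sparse.linalg as spla

def inner_pcg(J, B, rho, rel_tol, maxit=500):
    """Matrix rendering of (9.20): preconditioned CG for J p = rho."""
    solve_B = spla.factorized(B.tocsc())
    p = np.zeros_like(rho)
    s = J @ p - rho                 # plain residual
    r = solve_B(s)                  # preconditioned residual r = B^{-1} s
    d = -r
    rBr = s @ r                     # ||r||_B^2 = s^t B^{-1} s, cf. (9.22)
    rBr0 = rBr
    for _ in range(maxit):
        if rBr <= rel_tol**2 * rBr0:    # stopping criterion (9.21)
            break
        Jd = J @ d
        gamma = d @ Jd              # gamma_k of step (b1)
        alpha = -(s @ d) / gamma    # alpha_k = -<B r, d> / gamma_k, B r = s
        y = solve_B(Jd)             # step (b2)
        r = r + alpha * y           # (b3)
        s = s + alpha * Jd          # keep the plain residual in sync
        p = p + alpha * d           # (b4)
        beta = (Jd @ r) / gamma     # beta_k of step (b5)
        d = -r + beta * d           # (b6)
        rBr = s @ r
    return p
```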
(d) Variable preconditioning
We consider problem (9.1) again. The realization of the damped inexact Newton iteration (9.16) via variable preconditioning is defined in Theorem 7.10. In this approach the derivative operators are replaced by approximate operators in the steps of the iteration. This method is formulated in Theorem 7.10 in a finite-dimensional subspace $V\subset H_D^1(\Omega)$.
The algorithmic form of the iteration is as follows.
9.1. GENERAL ALGORITHMS
293
(a) u0 ∈ V ; for any n ∈ N : if un ∈ V is obtained, then 1 ; (b1) ω(un ) = Lµ−2 1 kF (un ) − bkHD
(b2) we choose constants Mn ≥ m n > 0
satisfying Mn /mn ≤ 1 + 2/(ε + Kω(un )), and (b3) we choose a symmetric matrix-valued function Gn ∈ L∞ (Ω, RN ×N ) for which there holds mn hGn (x)ξ, ξi ≤ h ∂f (x, ∇un (x))ξ, ξi ≤ Mn hGn (x)ξ, ξi ∂η (b4) we calculate the constants −3/2
ρn = 2LMn2 µ1
Mn −mn (1 Mn +mn
Qn =
(x ∈ Ω, ξ ∈ RN );
+ ω(un )),
(Mn + mn )−2 kF (un ) − bkn (1 + ω(un ))1/2 ;
n }; (b5) τn = min{1, 1−Q 2ρn
(b6) zn ∈ V is the solution of the problem Z
Ω
Gn (x) ∇zn · ∇v =
(b7)
un+1 = un −
Z
Ω
f (x, ∇un ) · ∇v −
Z
Ω
gv +
Z
ΓN
2τn zn . Mn + m n
γv dσ
(v ∈ V );
(9.23)
Remark 9.5 Concerning steps (b4)-(b5) in (9.23), we note that the given value of τn ensures optimal contractivity in the n-th step. The constant ρn can be estimated using the inequality −1/2 kF (un ) − bkn ≤ µ1 kF (un ) − bkHD1
(which follows from (5.95) with λ = µ1 ), hence only the original norm has to be used. The latter can be computed using a basis v1 , ..., vk ∈ V : namely, if c1 , ..., ck are the coefficients of F (un ) − b in the representation F (un ) − b =
k X
then
ci vi ,
i=1
k X
kF (un ) − bk2H 1 =
i,j=1
γij = hvi , vj iHD1 =
Z
D
where
Ω
γij ci cj ,
∇vi · ∇vj
294CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO are the entries of the corresponding Gram matrix. Remark 9.6 The algorithms (9.19)–(9.20) and (9.23) are strongly related. Namely, one can consider the latter as executing one inner iteration in each step of the outer 2 iteration with a simplified steplength α1 = Mn +m . Hence, if in step (b2) of (9.23) the n condition Mn /mn ≤ 1 + 2/(ε + Kω(un )) requires a too small value of Mn /mn , then this can be avoided by preferring the algorithm (9.19)–(9.20) and doing more than one inner step with the same preconditioner.
9.1.3
Some convergence results
The theoretical sequences (9.7) and (9.14), containing the exact solution of the auxiliary problems, are algorithmic reformulations of the iterations in Chapter 7. (The algorithms (9.5) and (9.12) are special cases of these.) Hence their convergence is provided by the corresponding results of Chapter 7. We verify below some convergence results on the general numerical sequences (9.9) and (9.17), in which the auxiliary problems are solved with some numerical error. (Again, the algorithms (9.8) and (9.16) are special cases of these.) First we consider the simple iteration algorithm (9.9) for the problem (9.6). In the setting of this problem the Sobolev space H is endowed with the energy inner product h., .iG . Then the convergence results rely on the property that F has a bihemicontinuous symmetric Gateaux derivative in H, satisfying mkhk2G ≤ hF ′ (u)h, hiG ≤ M khk2G
(u, h ∈ H)
(9.24)
with suitable constants M ≥ m > 0 (see Remarks 7.1 and 7.9.) The first result on simple iterations states that the linear convergence of the theoretical sequence is preserved if the numerical errors tend similarly to zero. Theorem 9.1 Consider problem (9.6) under assumption (9.24), and denote its solution by u∗ . Let in the algorithm (9.9) there hold δn ≤ c1 q n (n ∈ N) with some constants 0 < q < 1, c1 > 0. Then the following linear convergence estimates hold: (a) if q >
M −m M +m
where c2 = (b) if q <
then
αc1 q−Q
M −m M +m
kun − u∗ kG ≤ c2 q n
+
1 kF (u0 ) m
− bkG with Q =
M −m M +m
and α =
2 ; M +m
then ∗
kun − u kG ≤ c3 where c3 =
(n ∈ N)
αc1 r q(1−r)
+
1 kF (u0 ) m
M −m M +m
− bkG with r =
n
(n ∈ N)
q . Q
−m the convergence estimate is faster than const. · sn for any (In the case q = M M +m −m n M −m ) .) s > M +m , but is slower than ( M M +m
9.1. GENERAL ALGORITHMS
295
Proof. The convergence estimates on (un ) are obtained from the investigation of En = kun − un kG where (un ) is the theoretical iteration (9.7) with u0 = u0 . We will use the notations 2 M −m and α = M +m as above. Q= M +m First we verify the recursive estimate En+1 ≤ QEn + αδn
(n ∈ N).
(9.25)
Namely, we have un+1 − un+1 = un − αzn − (un − αz n ) = un − un − α(zn − zn∗ ) − α(zn∗ − z n ). Here zn − zn∗ = F (un ) − F (un ). Let J = I − αF , where I is the identity operator in the Sobolev space H. Then, by Remark 5.8, J is a contraction with constant Q. Hence we obtain un+1 − un+1 = J(un ) − J(un ) − α(zn∗ − z n ), kun+1 − un+1 kG ≤ kJ(un ) − J(un )kG + αkzn∗ − z n kG ≤ Qkun − un kG + αδn ,
i.e. (9.25) is verified. Since the convergence of the theoretical iteration implies
1 kF (u0 ) − bkG Qn , (9.26) m it suffices to verify our estimates (a)-(b) for En = kun − un kG , using (9.25) when δn ≤ c1 q n . kun − u∗ kG ≤ kun − un kG + kun − u∗ kG ≤ kun − un kG +
(a) Let d2 =
αc1 . q−Q
We prove by induction En ≤ d2 q n
(n ∈ N).
(9.27)
Since E0 = ku0 − u0 kG = 0, (9.27) is trivial for n = 0. If (9.27) holds for fixed n ∈ N, then En+1 ≤ QEn +αδn ≤ d2 Qq n +c1 αq n = (b) Let r =
q Q
as above and d3 =
αc1 r . q(1−r)
!
αc1 Q αc1 n+1 + c1 α q n = q = d2 q n+1 . q−Q q−Q
In order to prove
En ≤ d3 Qn
(n ∈ N),
we verify by induction that En ≤ d3 (1 − rn )Qn
(n ∈ N).
(9.28)
For n = 0 (9.28) is again trivial. If (9.28) holds for fixed n ∈ N, then En+1 ≤ QEn + αδn ≤ d3 (1 − rn )Qn+1 + c1 αq n
αc1 r(1 − rn ) n+1 αc1 n+1 n+1 αc1 r 1 − rn Q + r Q = + rn Qn+1 = q(1 − r) q q 1−r αc1 r = (1 − rn+1 )Qn+1 = d3 (1 − rn+1 )Qn+1 . q(1 − r)
296CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO It is also of practical interest to ensure that we only arrive in a prescribed neighbourhood of the solution. Theorem 9.2 Consider problem (9.6) under assumption (9.24), and denote its solution by u∗ . Let in the algorithm (9.9) there hold δn ≡ mε (n ∈ N) with some constant ε > 0. Then M −m n kun − u∗ kG ≤ ε + C · (n ∈ N) M +m with C =
1 kF (u0 ) m
− bkG .
Proof. In virtue of (9.26), it suffices to prove by induction that En ≤ ε
(n ∈ N).
(9.29)
By definition E0 = ku0 − u0 kG = 0. If (9.29) holds for fixed n ∈ N+ , then (9.25) yields En+1 ≤ Qε + αmε = ε . Now we consider the damped inexact Newton algorithm (9.17) for problem (9.13). The result now uses the Lipschitz continuity of F ′ : kF ′ (u) − F ′ (v)k ≤ Lku − vk
(u, v ∈ H),
(9.30)
and that F ′ is uniformly elliptic in the original norm of the Sobolev space H: hF ′ (u)h, hi ≥ λkhk2
(u, h ∈ H).
(9.31)
(Here L > 0 and λ > 0 are fixed constants.) Theorem 9.3 Consider problem (9.13) under assumptions (9.30)–(9.31), and denote its solution by u∗ . Let in the algorithm (9.17) there hold δn ≤ const. · kF (un ) − bkγ with some constant 0 < γ ≤ 1. Then the convergence is locally of order 1 + γ:
kF (un+1 ) − bk ≤ const. · kF (un ) − bk1+γ
(n ≥ n0 )
with some index n0 ∈ N, yielding also the convergence estimate of weak order 1 + γ kun − u∗ k ≤ λ−1 kF (un ) − bk ≤ const. · q (1+γ)
n
(n ∈ N)
with a suitable constant 0 < q < 1. Proof. It follows directly from Theorem 5.12. Finally, for the convergence of the iterations (9.19)–(9.20) and (9.23), we refer to Theorems 7.9 and 7.10.
9.2. FINITE ELEMENT REALIZATION
9.2
297
Finite element realization
This section is devoted to numerical iterations that are obtained by the FEM realization of Sobolev space iterations. This means that in Figure 9.1 the Sobolev space iteration is projected into a FEM subspace Vh . We first investigate in detail the case of a fixed mesh used, i.e. h is fixed during the iteration. Then we also consider stepwise varied meshes. In the case of a fixed FEM subspace Vh , the corresponding iteration in Vh is a suitably preconditioned iteration after discretization where the preconditioning matrices are the discretizations of the proposed preconditioning operators. The two main advantages of Sobolev space preconditioning are presented directly in this approach, which yields easy construction and a priori mesh independent condition numbers. We can exploit that the FEM discretization is the most natural realization of Sobolev space methods. Namely, for the construction we can use the same formulas as in the theoretical iteration, just replacing the Sobolev space by the FEM subspace Vh . Further, the convergence results on the Sobolev space iteration are automatically inherited for the iteration in the discretized case, independently of the mesh used. We will present the FEM realization of simple or gradient iterations (gradient– finite element method) and Newton’s method (Newton–finite element method) in a fixed subspace. First the construction of the algorithms is given, then the convergence properties are established. The analogous formulation of FEM realization for the other Sobolev space methods of section 7.4 is left to the reader. After the discussion with fixed mesh the more general case is studied when a stepwise varied mesh is used, i.e. a kind of multilevel or projection-iteration method is defined. We give exact formulations of these methods for mixed problems of the form − div f (x, ∇u) = g(x)
in Ω
f (x, ∇u) · ν = γ(x) on ΓN u = 0
(9.32)
on ΓD
similarly to (9.1) given at the beginning of this chapter. We assume that problem (9.32) satisfies the conditions (i)-(v) imposed on (9.1). Based on the preceding sections, the algorithms will be given and convergence studied. Other problems will be only referred to, the extension of results being straightforward to formulate. In subsections 9.2.1–9.2.3 we consider the FEM discretization of problem (9.32) in 1 a fixed FEM subspace Vh ⊂ HD (Ω). We denote by uh ∈ Vh the corresponding FEM solution of (9.32): Z
Ω
f (x, ∇uh ) · ∇v =
Z
Ω
gv +
Z
ΓN
γv dσ
(v ∈ Vh ).
(9.33)
(Similarly to v ∈ Vh above, in the sequel we will omit the subscript h for the elements of Vh for notational simplicity.) For details on the discretization procedure the reader is referred to the monographs of Axelsson and Barker [14], Ciarlet [72] and White [289]. The gradient– and Newton–finite element methods are formulated in subsections 9.2.1
298CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO and 9.2.2, respectively, and the mesh independent convergence results are given in subsection 9.2.3. Finally, the case of stepwise varied mesh is discussed in subsection 9.2.4.
9.2.1
The gradient–finite element method
The gradient–finite element method (GFEM) is the FEM realization of the Sobolev space based simple iterations of section 7.1. Its name reflects that simple iterations arise as a gradient method (see section 5.2). For related constructions see Farag´o–Kar´atson [114], Gajewski–Gr¨oger–Zacharias [127]. As mentioned above, in this subsection we consider a fixed mesh, and the case of a varied mesh will be considered in subsection 9.2.4. Following (9.4), first we choose a symmetric matrix-valued function G ∈ L∞ (Ω, RN ×N ) satisfying mhG(x)ξ, ξi ≤ h
∂f (x, η) ξ, ξi ≤ M hG(x)ξ, ξi ∂η
((x, η) ∈ Ω × RN , ξ ∈ RN )
(9.34)
with constants M ≥ m > 0 independent of x, η and ξ. As mentioned before, the GFEM algorithm for (9.33) is obtained from the simple Sobolev space iteration (9.5) in an obvious way. Namely, in the algorithm (9.5) we just 1 have to replace the Sobolev space HD (Ω) by the FEM subspace Vh . Thus we obtain the following algorithm: (a) u0 ∈ Vh ; for any n ∈ N : if un ∈ Vh is obtained, then
(b1) zn ∈ Vh is the solution of the problem
Z Z Z Z γv dσ gv + G(x) ∇zn · ∇v = f (x, ∇un ) · ∇v − ΓN Ω Ω Ω 2 (b2) zn . un+1 = un −
(v ∈ Vh );
M +m
(9.35)
The preconditioning matrix Bh arising in step (b1) is given by (Bh )i,j =
Z
Ω
G(x) ∇vi · ∇vj
(i, j = 1, ..., k),
where v1 , ..., vk is a basis of Vh . Note that Bh is derived from the bilinear form hz, viG =
Z
Ω
G(x) ∇z · ∇v
1 (z, v ∈ HD (Ω)),
(9.36)
1 i.e. from the energy inner product of the operator Sz ≡ −div (G(x)∇z) on HD (Ω), and the iteration (9.35) is the projection of the Sobolev space iteration (9.5) into Vh .
9.2. FINITE ELEMENT REALIZATION
299
Remark 9.7 We note that the above preconditioning operator yields the same bounds m and M for step (b2) in the iteration as in (9.34) since the nonlinear operator consists of principal part only. In the presence of a lower order term q(x, u) in the operator, M depends on u0 and is determined using Theorem 7.2. Remark 9.8 In iterations for Dirichlet, Neumann and certain 3rd boundary value problems with regular coefficients f, G ∈ C 1 , the special case Vh ⊂ H 2 (Ω) is of interest from qualitative aspects. This requires Vh ⊂ C 1 (Ω), i.e. C 1 -elements are used. Namely, for Dirichlet problems the iteration (9.35) is the projection of the Sobolev space iteration with auxiliary equations in strong form (
−div (G(x)∇zn ) = T (un ) − g zn ∈ H 2 (Ω), zn|∂Ω = 0 ,
(9.37)
i.e. the solutions of problem (b1) in (9.35) (with ΓN = ∅) are numerical solutions to (9.37). This strong form of the iteration has the following favourable property from qualitative aspect. Under the rather general conditions of Theorem 6.11 the exact solution u∗ has the smoothness u∗ ∈ C 1 (Ω) ∩ H 2 (Ω). Using C 1 -elements means that the numerical solution is also in C 1 (Ω) ∩ H 2 (Ω), i.e. the smoothness of the exact solution is preserved. In particular, if the corresponding vector field ∇u∗ is looked for, then the strong form ensures that the numerical approximations ∇un are continuous. (For appropriate Neumann and 3rd boundary value problems, similar smoothness of u∗ is ensured by Theorem 6.12.) For the other problems in section 7.1, the GFEM algorithm is derived similarly as (9.35) for problem (9.32). Essentially, in order to obtain zn , the actual Sobolev space in the definition of zn in Theorems 7.3–7.6 has to be replaced by Vh (both for zn and the test functions).
9.2.2
The Newton–finite element method
The Newton–finite element method (NFEM) is the FEM realization of the Sobolev space Newton iterations of section 7.2. Related constructions are summarized e.g. in Axelsson [7, 10], see also Rannacher [251] and Rossi–Toivanen [257]. As mentioned above, in this subsection we consider a fixed mesh, and the case of a varied mesh will be considered in subsection 9.2.4. We define the NFEM in the general form, i.e. the iteration contains the stepwise linearized problems. We note that, as mentioned earlier, the solution of these linear problems requires some kind of further preconditioning, since they are determined by the derivative operators at un instead of a priori chosen preconditioning operators. The two particular algorithms, given in items (c) and (d) of subsection 9.1.2, can be used in the same way in the setting of the NFEM (i.e. with V replaced by Vh ). 1 Let Vh ⊂ HD (Ω) be a suitably chosen FEM subspace. We consider the discretized problem (9.33). As mentioned before, the NFEM algorithm for (9.33) is obtained from the Sobolev space Newton iteration (9.12) in an obvious way. Namely, in the algorithm (9.12) we
300CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO 1 just have to replace the Sobolev space HD (Ω) by the FEM subspace Vh . Thus we obtain the following algorithm:
(a) u0 ∈ Vh ; for any n ∈ N : if un ∈ Vh is obtained, then (b1) pn ∈ Vh is the solution of the problem
Z Z Z Z ∂f (x, ∇un ) ∇pn · ∇v = − f (x, ∇un ) · ∇v + gv + γv dσ Ω ∂η Ω Ω ΓN (b2) τn = min{ 1, Lkpnµk1 1 } ∈ (0, 1] , H D
(b3)
(v ∈ Vh );
un+1 = un + τn pn .
(9.38)
Remark 9.9 The constants required for τn in step (b2) are as follows: µ1 is the lower bound from (9.3) and L is the Lipschitz constant of the generalized differential operator, which can be estimated using Remark 7.14. For the other problems in section 7.2, the algorithm is derived similarly as (9.38) for problem (9.32). Essentially, in order to obtain pn , the actual Sobolev spaces in the definition of pn in Theorem 7.8 have to be replaced by Vh (both for pn and the test functions).
9.2.3
Convergence estimates and mesh independence
In this subsection we present convergence theorems for the studied algorithms for problem (9.32). (For the other boundary value problems the analogous formulation of the theorems is obvious and hence is left to the reader.) (a) Convergence of the GFEM and the NFEM Let uh ∈ Vh be the solution of (9.33). In this paragraph we first study the convergence of the GFEM and the NFEM to uh , i.e. we investigate the error kun − uh kHD1
(n ∈ N).
(9.39)
We note that the convergence of uh to the weak solution u∗ of the original problem 1 (9.32) as h → 0 follows in a standard way using the minimizing functional φ : HD (Ω) → R (see (6.70)). Namely, the continuity of φ and the density of the subspaces Vh imply that φ(uh ) = min φ(u) → min φ(u) = φ(u∗ ), 1 u∈Vh
u∈HD (Ω)
hence by Remark 5.2 kuh − u∗ k2H 1 ≤ (2/m) (φ(uh ) − φ(u∗ )) → 0 D
as h → 0.
9.2. FINITE ELEMENT REALIZATION
301
The rate of this convergence depends on the actual choice of the subspaces Vh , for details see e.g. Axelsson [7], Ciarlet [72]. The following two theorems establish the rate at which the iteration converges to the FEM solution. The focus is on the mesh independence of this rate. Theorem 9.4 Let there hold the conditions (i)-(v) formulated for (9.1). Then the GFEM algorithm (9.35) yields kun − uh kG ≤ C ·
M −m M +m
n
(n ∈ N)
with a constant C > 0 independent of h. Namely, in general we have C = m1 kF (u0 ) − bkG . In particular, for Dirichlet problems satisfying the regularity conditions of part (2) of Theorem 7.1, we may set C = m̺11/2 kT (u0 ) − gkL2 (Ω) , where ̺ > 0 is the smallest eigenvalue of S on H 2 (Ω) ∩ H01 (Ω). 1 Proof. Theorem 7.1 and Remark 7.4 can be repeated in Vh instead of HD (Ω), since the restriction of the differential operator in Vh inherits the original Sobolev space properties with unchanged constants. (See also Corollary 8.2 in subsection 8.1.2.)
The Newton iteration is considered in the setting of Theorem 7.7, using the Lipschitz 1 1 continuity of the derivative of the generalized differential operator F : HD (Ω) → HD (Ω) corresponding to (9.1). (See item (v) of Assumptions 7.7.) Theorem 9.5 Let there hold the conditions (i)-(v) formulated for (9.1), further, assume that F ′ is Lipschitz continuous. Then the NFEM algorithm (9.38) yields that n
kun − uh k ≤ C · q 2
(n ≥ n0 )
with suitable constants 0 < q < 1, C > 0 and integer n0 ∈ N independent of h. 1 Proof. Theorem 7.7 can be repeated in Vh instead of HD (Ω), since the restriction of the differential operator in Vh inherits the original Sobolev space properties with unchanged constants. (Cf. also part (b) of subsection 8.1.2.)
We underline again the mesh independence of the above results: all the constants appearing in the theorems are determined by the original problem before discretization and hence are independent of h. Using Theorems 9.4–9.5, we can also obtain the distance from the exact weak solu1 tion u∗ ∈ HD (Ω) of problem (9.32) directly. Hereby we denote by εh the discretization error εh = ku∗ − uh kHD1 = dist (u∗ , Vh ) w.r. to the fixed subspace Vh . Corollary 9.1 Let there hold the conditions (i)-(v) formulated for (9.1). Then (a) the GFEM algorithm (9.35) yields M −m kun − u kG ≤ εh + C · M +m ∗
n
with a suitable constant C > 0 independent of h;
(n ∈ N)
302CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO (b) assuming further that F ′ is Lipschitz continuous, the NFEM algorithm (9.38) yields n kun − u∗ k ≤ εh + C · q 2 (n ≥ n0 ) with suitable constants 0 < q < 1, C > 0 and index n0 independent of h.
(b) Inner iterations As mentioned in the introduction, an inner iteration is a relevant particular way of finding numerically the solution pn of the linearized equation in step (b2) of the Newton algorithm (9.38). In this paragraph we formulate the mesh uniform convergence of the inner iterations. We use the notations Fh and bh as follows: let Fh : Vh → Vh denote the operator defined by Z hFh (u), viHD1 = f (x, ∇u) · ∇v (v ∈ Vh ) (9.40) Ω
and bh ∈ Vh the element defined by hbh , viHD1 =
Z
Ω
gv +
Z
ΓN
(v ∈ Vh )
γv dσ
(9.41)
(cf. also Remark 7.2). Then (9.33) is written as hFh (uh ), viHD1 = hbh , viHD1
(v ∈ Vh ).
(9.42)
We recall the linearized equation in the algorithm (9.38): Z
Ω
Z ∂f (x, ∇un ) ∇pn · ∇v = − f (x, ∇un ) · ∇v + ∂η Ω
Z
Ω
gv +
Z
ΓN
γv dσ
(9.43)
(for all v ∈ Vh ), or, using the notations (9.40)–(9.41), Fh′ (un )pn = −(Fh (un ) − bh ).
(9.44)
The inner iteration defines a sequence (p(k) ) = (p(k) n )k∈N ⊂ V via a preconditioned conjugate gradient method. This is given in algorithmic form using (9.20) in subsection 9.1.2, thereby (in the setting of the NFEM) replacing V by Vh , further, its convergence follows from Theorem 7.9. (We will omit the lower index n in p(k) for notational simplicity, which causes no ambiguity.) We recall that in order to define the preconditioning matrix for the iteration (9.20), we choose constants Mn ≥ mn > 0 and a symmetric matrix-valued function Gn ∈ L∞ (Ω, RN ×N ) satisfying σ(G(x)) ⊂ [µ1 , µ2 ] (x ∈ Ω) with µ1 , µ2 from (9.3), such that there holds mn hGn (x)ξ, ξi ≤ h
∂f (x, ∇un (x))ξ, ξi ≤ Mn hGn (x)ξ, ξi ∂η
(x ∈ Ω, ξ ∈ RN ).
Then the preconditioning matrix (Bn )h for problem (9.44) is given by ((Bn )h )i,j =
Z
Ω
Gn (x) ∇vi · ∇vj
(i, j = 1, ..., k),
9.2. FINITE ELEMENT REALIZATION
303
where v1 , ..., vk is a basis of Vh . This means that Bh is the projection of the precondi1 1 tioning operator Bn : HD (Ω) → HD (Ω), defined by the bilinear form hBn z, vi =
Z
Ω
Gn (x) ∇z · ∇v
1 (z, v ∈ HD (Ω)),
into Vh . The corresponding preconditioned CGM iteration is given by (9.20) in the subspace V = Vh . In this context, the aim of the inner iteration for the problem (9.44) is to determine an element p(kn ) ∈ Vh that already satisfies the relative residual error estimate kFh′ (un )p(kn ) + (Fh (un ) − bh )kBn−1 ≤ ̺n kFh (un ) − bh kBn−1
(9.45)
with some prescribed tolerance ̺n > 0. We remark that the norms in (9.45) can be calculated as described in Remark 9.4 after (9.20). For simplicity, we let p(0) = 0. Then there holds the following convergence estimate, which is a consequence of part (b) of Theorem 7.9. Corollary 9.2 Let there hold the conditions (i)-(v) formulated for (9.1). Then the inner iteration (9.20) in V = Vh for the problem (9.44) yields kFh′ (un )p(k)
+ (Fh (un ) − bh )kBn−1 ≤
√ √ !k Mn − m n √ kFh (un ) − bh kBn−1 . √ Mn + m n
Corollary 9.2 implies that in order to satisfy (9.45), the number of iterations kn is given by the inequality √ √ !k Mn − m n n √ ≤ ̺n . √ Mn + m n We underline the mesh independence of this result: the constants mn and Mn are independent of h, hence the required number of iterations to satisfy (9.45) is also independent of h. Together with the inner iteration result of Corollary 9.2, the convergence of the overall inner-outer Newton iteration is provided by Theorem 7.9.
9.2.4
Multilevel type iterations
Now we generalize the GFEM and NFEM algorithms (9.35) and (9.38), respectively, such that stepwise varied mesh widths hn are used. Naturally, the focus is on the case hn → 0. An important part of these methods involves the case when one possibly executes more than one iteration on the same level hn until prescribed accuracy. The term ’multilevel iteration’ is usually used for such methods. For Newton-like methods, the construction and convergence of multilevel iterations is presented in detail in Axelsson– Kaporin [18], Blaheta [45], Deuflhard–Weiser [85], and particular applications also involving parallelization are investigated in Heise [149], Posp´ıˇsek [247]. The adaptivity
304CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO of the multilevel approach makes it an efficient realization of Newton’s method, and further improves the advantages of the two-level Newton method cited in paragraph (b) of subsection 8.1.2 (see [45]). Here we only discuss the varied mesh iterations in a general setting as suitable projection-iteration methods, with main interest in the assumption that hn → 0 in a prescribed rate. A detailed discussion of related results on projection-iteration methods is found e.g. in Gajewski–Gr¨oger–Zacharias [127] and Zeidler [296], see also the paper of Gajewski–Kluge [128] for monotone operators. For the study of convergence, in both algorithms the accuracy of solving the auxiliary problem in the nth step is denoted by δn and convergence is formulated according to the rate as δn → 0. We note that the earlier choice of a fixed mesh is a special case of this subsection, then the accuracy δn comes from the discretization error w.r. to Vh . Now the convergence is obtained in a somewhat opposite way: the rate of δn → 0 is prescribed as n → ∞ and one has to define the sequence (Vhn ) with suitably refined mesh widths hn redefined in each step. We verify that the obtained rate of convergence remains the same as that of the theoretical iteration when the rate of δn is the same.
As is clear from the construction, the usage of varied mesh widths enables us to approximate the exact solution. Hence we now study the error
kun − u∗ k 1 (in contrast to (9.39)) where u∗ ∈ HD (Ω) denotes the exact weak solution of (9.32). The theorems are presented for (9.32), for the other boundary value problems the reformulation of the theorems is obvious.
(a) Gradient–finite element method with varied mesh
Similarly to (9.35), first we choose a symmetric matrix-valued function
G ∈ L∞ (Ω, RN ×N ) satisfying (9.34). Then the GFEM algorithm with varied mesh for (9.32) is as follows:
9.2. FINITE ELEMENT REALIZATION
305
1 (a) u0 ∈ HD (Ω); for any n ∈ N : if un is obtained, then 1 (b1) zn∗ ∈ HD (Ω) denotes the exact solution of the problem Z Z Z Z ∗ γv dσ gv + ) · ∇v − f (x, ∇u G(x) ∇z · ∇v = n n ΓN Ω Ω Ω
1 (v ∈ HD (Ω));
1 (b2) Vhn ⊂ HD (Ω) is a FEM subspace, and z n ∈ Vhn is the solution of the problem Z Z Z Z · ∇v = ) · ∇v − f (x, ∇u gv + γv dσ (v ∈ Vhn ) G(x) ∇z n n Ω Ω ΓN Ω such that kzn∗ − z n kHD1 ≤ δn ; 2 (b3) u =u − z . n+1
n
M +m
n
(9.46) The following convergence result is a consequence of Theorem 9.1, since the GFEM is a special realization of the algorithm thereby. Corollary 9.3 Let Assumptions 9.1 hold for problem (9.1). If δn ≤ const.·q n (n ∈ N) with some constant 0 < q < 1, then the GFEM algorithm (9.46) yields the following linear convergence estimates: (a) if q >
M −m M +m
then kun − u∗ kG ≤ const. · q n
(b) if q <
M −m M +m
(n ∈ N);
then M −m kun − u kG ≤ const. · M +m ∗
n
(n ∈ N).
−m the convergence estimate is faster than const. · sn for any (In the case q = M M +m −m M −m n s> M , but is slower than ( M ) .) M +m +m
Remark 9.10 In iterations for Dirichlet problems with regular coefficients f, G ∈ C 1 , the special case Vhn ⊂ H 2 (Ω) (n ∈ N) has importance. As mentioned in Remark 9.8, this requires Vhn ⊂ C 1 (Ω), i.e. C 1 -elements are used. Then the iteration (9.46) is the projection of the Sobolev space iteration with auxiliary equations in strong form (
−div (G(x)∇zn ) = T (un ) − g zn ∈ H 2 (Ω), zn|∂Ω = 0 ,
(9.47)
306CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO i.e. the solutions z n of problem (b2) in (9.46) (with ΓN = ∅) are numerical solutions to (9.47). Besides the qualitative aspects given in Remark 9.8, this strong form of the iteration has advantages in controlling the errors δn in (9.46). Namely, the benefit of higher order elements is that the desired rate of δn → 0 can be obtained directly via the mesh width hn used in step (b2) of (9.46). Using the H 2 error estimate corresponding to (b2) and the Bernstein inequality corresponding to (9.47), we obtain the estimate
kzn − zn∗ kH01 (Ω) ≤ Chn kzn∗ kH 2 (Ω) ≤ C ′ hn kT (un ) − gkL2 (Ω) with suitable constants C, C ′ > 0, where zn∗ ∈ H 2 (Ω) ∩ H01 (Ω) is the exact solution of (9.47). The obtained bounds C ′ hn kT (un ) − gkL2 (Ω) play the role of δn : if sup{kT (un ) − gkL2 (Ω) : n ∈ N} < +∞ (which can be assumed since (un ) is constructed to converge to the solution of equation T (u) = g), then δn can be controlled by the choice of hn , i.e. (instead of estimating δn in the steps) the suitably prescribed refinement of the mesh yields the required order estimate of the convergence of δn . In particular, if (hn ) → 0 is chosen a geometric sequence then the conditions and hence the obtained estimates of Corollary 9.3 are satisfied. (The same argument can also be used for other boundary conditions.)
For the other problems in section 7.1, the GFEM algorithm with varied mesh is derived similarly as (9.46) for problem (9.32). Essentially, in order to obtain z n , the actual Sobolev space in the definition of zn in Theorems 7.3–7.6 has to be replaced by Vhn (both for zn and the test functions). Then the convergence result similar to Corollary 9.3 follows in the same way from Theorem 9.1.
(b) Newton–finite element method with varied mesh
For the NFEM, we use the notations F and b as in (9.11). The NFEM algorithm with varied mesh for (9.32) is as follows:
9.2. FINITE ELEMENT REALIZATION
307
1 (Ω); (a) u0 ∈ HD for any n ∈ N : if un is obtained, then 1 (b1) p∗n ∈ HD (Ω) denotes the exact solution of the problem Z Z Z Z ∂f (x, ∇un ) ∇p∗n · ∇v = − f (x, ∇un ) · ∇v + γv dσ gv + ∂η Ω Ω ΓN Ω
1 (v ∈ HD (Ω))
(b2) Vh ⊂ H 1 (Ω) is a FEM subspace, and p ∈ Vh is the solution of
n n n D Z Z Z Z ∂f γv dσ (x, ∇un ) ∇pn · ∇v = − f (x, ∇un ) · ∇v + gv + ∂η Ω ΓN Ω Ω ′ such that kF (un )pn + (F (un ) − b)kHD1 ≤ δn kF (un ) − bkHD1 ; µ1 n) (b3) τn = min{ 1, (1−δ } ∈ (0, 1] , (1+δn ) Lkpn kH 1 D
(b4)
(v ∈ Vhn )
un+1 = un + τn pn .
(9.48) Similarly to (9.38), the constants required for τn in step (b3) can be computed following Remark 9.3.
The following convergence result is a consequence of Theorem 9.3, since the NFEM is a special realization of the algorithm thereby. Corollary 9.4 Let there hold the conditions (i)-(v) formulated for (9.1), further, assume that F ′ is Lipschitz continuous. If δn ≤ const. · kF (un ) − bkγH 1 with some D constant 0 < γ ≤ 1, then the NFEM algorithm (9.48) yields convergence locally of order 1 + γ: kF (un+1 ) − bkHD1 ≤ const. · kF (un ) − bk1+γ H1
(n ≥ n0 )
D
with some index n0 ∈ N, and consequently of weak order 1 + γ: kun − u∗ kHD1 ≤ λ−1 kF (un ) − bkHD1 ≤ const. · q (1+γ)
n
(n ∈ N)
with a suitable constant 0 < q < 1. For the other problems in section 7.2 the algorithm is derived similarly as (9.48) for problem (9.32). Essentially, in order to obtain pn , the actual Sobolev spaces in the definition of pn in Theorem 7.8 have to be replaced by Vhn (both for pn and the test functions). Then the convergence result similar to Corollary 9.4 follows in the same way from Theorem 9.3.
308CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO
9.3
Some remarks on the use of Laplacian preconditioners
(a) Fast solvers for the Poisson equation In the class of preconditioning elliptic operators that yield efficient preconditioning matrices through their discretizations, a special role is played by the Laplacian. Namely, for Poisson equations there exists a variety of fast solvers developed in the past decades. The presence of these methods particularly suggest the discrete Laplacian as preconditioner for other elliptic operators, since the easy invertibility as the first requirement of an efficient preconditioner is thus satisfied. In this paragraph we give some further references to such fast solvers, completing those mentioned in subsection 3.3.3. The majority of these fast methods is a direct solver. They are basically developed on rectangular domains, and can be extended to other domains using the fictitious domain approach. Further, some iterative methods are also particularly efficient for Poisson equations. A family of fast solvers on rectangles which has undergone recent development is formed by spectral methods (Boyd [50], Funaro [121], Gottlieb-Orszag [136]). The basic idea is the polynomial approximation of the solution using collocation, which consists in finding a polynomial that satisfies the differential equation in the nodes of a prescribed orthogonal grid and the boundary conditions in the nodes of the boundary. The most widespread related methods are Chebyshev and Legendre collocation methods, and the related algebraic systems are most often solved iteratively using suitable finite difference preconditioning (Funaro [123], Orszag [235]). An alternative is offered by finite element preconditioners (Deville [86], Deville–Mund [87]). Spectral methods can be also used for non-symmetric equations, i.e. in the presence of first order advective terms, using suitably modified collocation grids, see the survey of Funaro [124]. For other applications related to physical problems see Canuto [67]. Domains other than rectangles can be treated using the domain decomposition idea, also allowing domains with complex geometries (Funaro [122], Funaro–Quarteroni–Zanolli [125]). Among more classical direct solvers an efficient fast solution method is the cyclic reduction, which can also be used in the non-symmetric case (i.e. in the presence of first order terms). A survey on cyclic reduction is found in the papers of Dorr [90] and Swarztrauber [272], and its version for the non-symmetric case in Swarztrauber [271]. Further, one can use the fast Fourier transform for the Poisson equation: the papers of Dorr [90] and Swarztrauber [272] also survey the Fourier method and the latter couples Fourier analysis with the realization of the FACR (fast cyclic reduction) algorithm. Marching algorithms are summarized in Bank–Rose [37]. These methods require O(n2 log(log n)) operations (i.e. floating point multiplications and divisions), where n denotes the number of unknowns. (In the case of the generalization to nonsymmetric problems, this count is O(n2 log n).) The parallel implementation of these algorithms is also feasible, see Vassilevski [283] and for a more recent version that uses the PSCR (partial solution variant of the cyclic reduction) method for data partitioning, Rossi–Toivanen [256].
9.3. SOME REMARKS ON THE USE OF LAPLACIAN PRECONDITIONERS 309 The above methods are developed on rectangular domains. They can be extended to other domains using the fictitious domain approach. The basic idea is to embed the original domain into a larger simple-shaped domain in which an equivalent problem can be solved more efficiently. The larger domain is typically a rectangle, in which the above fast solvers can be applied (B¨orgers–Widlund [49], Marchuk–Kuznetsov– Matsokin [205], Rossi–Toivanen [257]). We note that in these methods the Laplacian can be usually replaced by separable operators, i.e. in which the derivatives w.r. to different variables are separated (Bank– Rose [37], Dorr [90], Swarztrauber [271]). Further, besides direct solvers, there exist important classes of iterative solution methods that are particularly efficient for the Poisson equation, above all multigrid methods. These methods have been dealt with in subsection 3.3.2. For multigrid methods we recall the surveys of Hackbusch [144] and McCormick [210, 211], further, we note that the multilevel results can be strengthened in a two-level setting using spectral equivalence (Bramble–Pasciak–Zhang [55]) and the iterative solution methods may rely on incomplete factorization preconditioners (Elman [102], Juncu-Popa [157], Magolu [200]). Remark 9.11 For nonlinear problems, discrete Laplacian preconditioners (as a special case of the preconditioning operator classes used in Chapter 7) can arise both in simple iterations and in inner iterations for an outer Newton method. Such preconditioners have been used efficiently for successive approximations e.g. in Carey–Jiang [68] and in inner Newton iterations in Juncu-Popa [157], Rossi–Toivanen [257]. To compare these settings, we note that the latter method yields more freedom than simple iterations since one can choose independently the number of inner iterations for each different outer step. Further, the outer Newton sequence converges faster than a simple iteration. However, by Remark 7.15 the total number n of inner-outer iterations to achieve a prescribed error ε is of the same magnitude as for simple iterations, namely, there holds n = O(log ε) as ε → 0 for an inner-outer iteration with inner Laplacian preconditioner as well as for as for a simple iteration. (b) The direct gradient method For some special problems, Sobolev space preconditioning by the Laplacian can be executed literally in the sense that discretization is avoided and the iteration is applied directly in the Sobolev space. This direct realization, called direct gradient method, is available for special domains and coefficients where the Poisson equations can be solved exactly. The direct gradient method is described briefly below, more details are found in a paper of Kar´atson [169]. The main idea is the following: on special domains, first approximating the coefficients and right-hand sides of a semilinear problem by appropriate algebraic/trigonometric polynomials, the Sobolev space iteration can also be kept in a suitable class of algebraic/trigonometric polynomials. Hence the solution of the auxiliary Poisson equations can be achieved directly by executing a linear combination or by solving a
310CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO simply structured linear system of algebraic equations to obtain the coefficients of the required polynomial. We consider a semilinear operator with polynomial nonlinearity, which is a special case of (7.35), with linear boundary conditions: s X cj (x)uj = g(x) T (u) ≡ −div (a(x)∇u) + j=0
Qu ≡ α(x)u + β(x) ∂u ∂ν
|∂Ω
(9.49)
= 0.
The following conditions hold: a ∈ C 1 (Ω), 0 < m ≤ a(x) ≤ m′ , cj ∈ C(Ω); s ≤ p − 1, P if N > 2; q(x, u) := sj=0 cj (x)uj satisfies where 2 ≤ p if N = 2 or 2 ≤ p ≤ N2N −2 0 ≤ ∂q/∂u; further, α, β ∈ P C(∂Ω, R+ ), α + β > 0 a.e. (This type of equation occurs e.g. in steady states of autocatalytic reaction-diffusion equations. We note that for non-polynomial nonlinearities q(x, u) the method can be used if q(x, u) can be suitably approximated by polynomials in u). Approximating the coefficients and right-sides by suitable polynomials of x up to prescribed accuracy, the solution of the approximate system remains appropriately close to the solution of (9.49). The exact formulation of this is left to the reader. In the sequel we may consider (9.49) as already having polynomial coefficients, i.e. being replaced by the approximate system. The simple or gradient iteration for (9.49) in strong form is a special case of (7.46) and (7.49), for given un defined by the algorithm g = T (un ) − g n
−∆z = g , Qzn |∂Ω = 0 (zn ∈ H 2 (Ω)) 2 z . M0 +m n
n n u n+1 = un −
(9.50)
Its convergence is given by Theorem 7.3. (For Neumann problems, now constant times identity is added to the Laplacian for injectivity.) For the direct gradient method one has to find a function class P (as a subspace of H (Ω)) such that if u0 ∈ P and g ∈ P, then un ∈ P implies T (un ) ∈ P and gn ∈ P implies zn ∈ P. Then by induction the iteration runs in P. Further, we require that the Poisson equations in (9.50) with gn ∈ P can be solved exactly in an elementary way. 2
We now give some special domains and corresponding function classes when the above requirements hold. For simplicity of presentation, we only consider the two-dimensional case Ω ⊂ R2 (the analogies being straightforward). (i) Rectangle We investigate the case when Ω ⊂ R2 is a rectangle. It can be assumed that Ω = I ≡ [0, π]2 (if not, a linear transformation is done).
9.3. SOME REMARKS ON THE USE OF LAPLACIAN PRECONDITIONERS 311 Denote by Ps , Pc and Psc the set of sine, cosine and mixed trigonometric polynomials l P
Ps = { Pc = {
n,m=1 l P
σnm sin nx sin my : l ∈ N+ , σnm ∈ R (n, m = 1, . . . , l)},
σnm n,m=0 l l P P
Psc = {
n=1 m=0
cos nx cos my : l ∈ N, σnm ∈ R (n, m = 0, . . . , l)} and
σnm sin nx cos my : l ∈ N+ , σnm ∈ R (n, m = 1, . . . , l)},
respectively. The coefficients a and cj of (9.49) can be approximated by cosinepolynomials to any prescribed accuracy, hence (as suggested above) we assume that (9.49) already fulfils a, cj ∈ Pc (j = 0, ..., s). Dirichlet boundary conditions We consider the case when j is odd in each term in (9.49) and assume g ∈ Ps . Then the operator T is invariant on Ps . Further, for any h ∈ Ps the solution of −∆z = h, satisfies z ∈ Ps , namely, if h(x, y) = z(x, y) =
l P
n,m=1
l X
n,m=1
z|∂I = 0 σnm sin nx sin my, then
σnm sin nx sin my . + m2
(9.51)
n2
These imply that if (9.50) starts with u0 ∈ Ps then we have un ∈ Ps throughout the iteration, and the auxiliary problems are solved in the trivial way (9.51). Neumann boundary conditions This is similar to the Dirichlet case, now letting g ∈ Pc . Here T is invariant on Pc .
Further, for any h ∈ Pc , h(x, y) =
l P
n,m=0
σnm cos nx cos my, the solution of
−∆z + kz = h,
∂ν z|∂I = 0
(with k ∈ R+ ) satisfies z ∈ Pc , namely, z(x, y) =
l P
n,m=0
σnm n2 +m2 +k
cos nx cos my .
Mixed boundary conditions Denote by Γi (i = 1, . . . , 4) the boundary portions [0, π) × {0}, {π} × [0, π), (0, π] × {π}, {0} × (0, π], respectively. First, let α(x) ≡ χ{Γ2 ∪Γ4 } , β(x) ≡ χ{Γ1 ∪Γ3 } , i.e. we have u|Γ2 ∪Γ4 ≡ 0, ∂ν u|Γ1 ∪Γ3 ≡ 0. Then the above method works on Psc . We can proceed similarly for other edgewise complementary characteristic functions α and β, using sin(n + 12 )x type terms for mixed endpoint conditions.
312CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO (ii) Disc We investigate the case when Ω ⊂ R2 is the unit disc B = B1 (0). Now a, cj and g are assumed to be algebraic polynomials. (For this we use the notation a, cj , g ∈ Palg ). Then T is invariant on Palg . Dirichlet boundary conditions If h ∈ Palg , then the solution of −∆z = h,
z|∂B = 0
(9.52)
can be found by looking for z in the form z(x, y) = (x2 + y 2 − 1)q(x, y), where q ∈ Palg has the same degree as h (cf. [165], Chapter XV). Then the coefficients of q can be determined by solving a a simply structured linear system of algebraic equations, obtained from equating the coefficients of −∆z and h. The matrix of this system has three diagonals and two other non-zero elements in each row. 3rd boundary conditions We examine the case α(x) ≡ α > 0, β(x) ≡ β > 0. If h ∈ Palg then the solution of −∆z = h,
(αz + β∂ν z)|∂B = 0
can be determined similarly to the Dirichlet case. Namely, let q(x, y) be a polynomial with unknown coefficients {anm } and of the same degree as h. Let p(x, y) ≡ (x2 + y 2 − 1)q(x, y) =
l X X
bnm xn y m ,
s=0 n+m=s
here the bnm ’s are linear combinations of the anm ’s. We look for z as z(x, y) =
l X X
bnm n m x y . s=0 n+m=s α + βs
Equating the coefficients of −∆z and h leads again to a linear system of algebraic equations for {anm }, from which we then determine {bnm }. Then z satisfies the boundary condition, since on ∂B we have ∂ν z = x∂x z + y∂y z and αz + β(x∂x z + y∂y z) =
l X X
bnm (α + β(n + m))xn y m = p(x, y) = 0 . s=0 n+m=s α + βs
Radial problems A special case of the methods above occurs when the algebraic polynomials a, cj and g are radially symmetric (notation: a, cj , g ∈ Prad ). Then T is invariant on Prad .
9.3. SOME REMARKS ON THE USE OF LAPLACIAN PRECONDITIONERS 313 Letting r = (x2 + y 2 )1/2 , there holds ∆z = r−1 ∂r (r∂r z) for radial functions z. Hence, if in (9.52) we have h ∈ Prad : h(r) =
l X
am r2m
m=0
(r ∈ [−R, R]),
(9.53)
then the solution of (9.52) is z(r) =
l X
am (R2m+2 − r2m+2 ) . 2 (2m + 2) m=0
(9.54)
(iii) Other domains The methods given for a rectangle can be extended for domains where the eigenfunctions of the Laplacian are known explicitly. Then the terms of the polynomials in Ps , Pc and Psc have to be replaced by the actual eigenfunctions. These are known e.g. for rectangular triangles and regular hexagons (Makai [202], Riesz–Sz.-Nagy [255], Simon–Baderko [265]). Further, if a domain is diffeomorphic to one of the above special domains, then (9.49) can be transformed to the special domain such that uniform monotonicity is preserved. Then the described direct realization is available for the transformed equation. Finally we note that the small enough high-index coefficients of the polynomials in the approximating sequence can be dropped within given accuracy in order to have a favourable bound of the degree of the polynomials. For further details see [169]. (c) Gradient-Fourier method The coupling of the simple or gradient iteration with the Fourier method is a kind of generalization of the above described direct gradient method from semilinear problems to general equations. The main idea is that in each step of the iteration the right-hand side of the auxiliary problem is approximated by a partial Fourier sum, and then the modified auxiliary problem can be solved exactly. The details are found in the papers of L´oczi [198, 199]. The method works when the eigenvalues λi and eigenfunctions ei (i ∈ N+ ) of the Laplacian are known for the given domain and boundary conditions. (This is the case, for instance, on rectangles, balls, rectangular triangles or regular hexagons, see Makai [202], Riesz–Sz.-Nagy [255], Simon–Baderko [265]. For Neumann boundary conditions one works in the subspace orthogonal to constants and uses the positive eigenvalues only.) For simplicity, here we sketch the method for Dirichlet boundary conditions, i.e. we consider problem (7.21): T (u) ≡ −div f (x, ∇u) + q(x, u) = g(x) u |∂Ω = 0 .
The coefficients satisfy the uniform monotonicity properties assumed for (7.21).
314CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO The construction of the method is as follows. The iterative sequence is un+1 = un −
2 zn , M +m
where z n is an approximate solution of the auxiliary problem −∆zn = rn ≡ T (un ) − g z n|∂Ω = 0
using the Fourier method. Namely, rn is approximated by a partial Fourier sum rn =
ln X
ci,n ei
with ci,n =
i=1
Z
Ω
rn ei
where ln is chosen such that krn − rn k2L2 (Ω) = krn k2L2 (Ω) −
ln X i=1
c2i,n ≤ ωn2
with some prescribed tolerance ωn > 0. Here ωn also contains the error of numerical integration. Then we let z n be the solution of −∆z n = r n
which is obtained explicitly:
z n|∂Ω = 0,
zn =
ln X ci,n i=1
λi
ei .
The convergence of the method can be achieved by the proper choice of the tolerances ωn . Namely, there holds −1/2
kzn − z n kH01 (Ω) ≤ λ1
−1/2
krn − rn kL2 (Ω) ≤ λ1
ωn
(9.55)
(where λ1 > 0 is the first eigenvalue of −∆ on H 2 (Ω) ∩ H01 (Ω) and the estimate (5.83) −1/2 is used). Hence Theorems 9.1 and 9.2 can be applied with δn = λ1 ωn .
9.4
On time-dependent problems
Although the presentation of preconditioning operators is limited to elliptic problems in this book, the preconditioning operator idea can be also applied to time-dependent problems, using the widespread technique of time discretization which reduces timedependent problems to sequences of elliptic ones. This approach is highly developed in the linear case, and is also applicable for many nonlinear problems. It has a vast literature, see e.g. Kacur [160], Ladyzenskaya [189], Rektorys [252]. It is not our purpose to detail this area, which is beyond the main scope and length of this book.
9.4. ON TIME-DEPENDENT PROBLEMS
315
Instead, we very briefly refer to some possibilities of using the preconditioning operator idea in this context. In order to illustrate the time discretization method, let us first consider the linear evolution equation
∂u ∂t
− ∆u = g(x) on Ω × (0, T )
u = 0 on ∂Ω × (0, T ) and u(x, 0) = u (x) on Ω 0
(9.56)
where Ω is a bounded domain with a sufficiently smooth boundary ∂Ω. We divide the time interval [0, T ] into subintervals [ti−1 , ti ], i = 1, . . . s. Then, successively, for i = 1, . . . s we solve the elliptic equations
ui (x)−ui−1 (x) τi
− ∆ui (x) = g(x) on Ω
u = 0 on ∂Ω i
(9.57)
where τi = ti − ti−1 and ui (x) is the approximation to the solution u(x, t) at the time level t = ti . Here we recursively obtain that ui−1 (x) is already known since u0 (x) is taken from (9.56). Finally, we construct Rothe’s function us (x, t) = ui−1 (x) + (t − ti−1 )
ui (x) − ui−1 (x) , τi
t ∈ [ti−1 , ti ],
i = 1, 2, . . . , n (9.58)
which is an approximate solution of (9.56) in the sense that us tends to the solution u in suitable functional spaces. The approximation of (9.56) by the sequence of elliptic equations (9.57) was introduced by Rothe [258]. (Due to the construction of the Rothe’s function, the introduced method is often called the method of lines, or is termed as the method of semidiscretization or the method of discretization in time.) The method has proved to be applicable to a wide range of evolution problems (including non-traditional integrodifferential equations and problems with integral conditions, describing complicated processes in the theory of heat transfer) and hyperbolic problems , and is also applicable to rheology (Jerome [156], Kacur [160], Martensen [206], Rektorys [252]). This approach can be generalized in a straightforward way to nonlinear parabolic problems of the form ∂u + T (u) = g(x) (9.59) ∂t with proper initial and boundary conditions [161]. Then a solution scheme consists of the recursive solution of elliptic problems on each time level
ui (x)−ui−1 (x) τi
+ T (ui )(x) = g(x) on Ω
u = 0 on ∂Ω. i
(9.60)
Numerical as well as theoretical aspects of this method have been examined in Rektorys [252] (convergence questions, including those when the elliptic problems generated by the method are solved numerically, error estimates with tests of their practical efficiency, existence theorems, regularity of the weak solutions, etc). Namely, for linear problems, under the usual assumptions of uniform ellipticity and boundedness of
316CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO the elliptic operators in weak form, one can prove the existence and uniqueness of the weak solution and the convergence of the numerical solutions to the weak solution in different norms. A relatively sharp error estimate is derived at the points of divisions tj . The convergence of the ’Ritz-Rothe’ method is proved, i.e. when the elliptic problems are solved approximately by the Ritz or a similar method (like the finite element method). Further, regularity questions are discussed w.r. to both time and space variables. These results can be extended to the case of nonhomogoneneous initial and boundary conditions, and also to problems with nonlinear elliptic operators. We remark that in the problems of type (9.60) the elliptic operator T can be more generally approximated by a weighted approximation using both time levels t = ti−1 and t = ti . In any case the input and output data of the problem on a given time level come from the known previous and the unknown actual levels, respectively. Hence, one of the main questions in these methods related to the time dependence is the actual choice of time discretization, determined by the extent in which the discretization scheme is implicit. This means that one looks for a right balance between the values used from the known previous and the unknown actual time level. Roughly speaking, more values from the previous level lead to easier elliptic subproblems, whereas more values from the unknown actual level increase the stability of the method. Besides, if one solves the discretized elliptic subproblems iteratively, then in the overall algorithm one also has to balance the time steps and space iterations to preserve both the accuracy and low costs. Various schemes and related results on these questions are found in the works of Samarskii [263], Thom´ee [277, 278] and many others. The need of balance between the time and space discretization parameters also appears in relation with qualitative aspects. Namely, in order to preserve certain properties of the initial function such as nonnegativity or concavity, these parameters have to satisfy suitable conditions. For instance, in the case of the heat conduction equation the ratio τ /h2 has to remain bounded even for the the unconditionally stable methods, where τ = τi is as above and h is the space discretization parameter (Farag´o [111, 112]). In the sequel we briefly sketch some possible ways of using the operator background in the context of time-dependent problems. One can both use the properties of the involved operators on the continuous level, and apply preconditioning operators. First, let us consider an operator T in (9.59) in the form as in (6.53): T (u) ≡ −div (b(x, |∇u|) ∇u) .
(9.61)
Then one can propose the following scheme of time discretization: the nonlinear coefficients are evaluated on the previous time level, hence one has to solve linear elliptic problems one the time levels and the nonlinearity only enters into the known input data. Namely, on the ith time level we define the elliptic subproblem ui (x) − ui−1 (x) − div (b(x, |∇ui−1 |) ∇ui ) = g(x). τi (We note that this idea is similar to the frozen coefficient method.)
(9.62)
9.4. ON TIME-DEPENDENT PROBLEMS
317
The iterative solution of the elliptic problems (9.62) can be achieved using a preconditioning operator. Since the function b is scalar-valued, the same is reasonably assumed of the preconditioning operator, hence it should have a scalar coefficient wi . In particular, one may choose a constant or piecewise constant coefficient, and rely on subsections 3.3.2 and 3.3.3 for related solvers for the auxiliary problems. Here the latter problems are of the form Z
Ω
(n) wi (x) ∇zi
· ∇v =
Z
(n)
Ω
(n) b(x, |∇ui−1 |)∇ui
(v ∈
· ∇v +
(n) τi−1 ui v
−
Z Ω
g + τi−1 ui−1 v
(9.63)
H01 (Ω)),
where the sequence ui (n = 0, 1, ...) is constructed to converge to ui on the ith (n) time level, and the unknown functions zi in (9.63) define the correction terms in the iteration. Another area where the continuous level is involved is the method of splitting. This class of methods forms an efficient way of handling certain complex physical phenomena, such as reaction-diffusion problems or air pollution. The latter lead to models including diffusion, advection and chemical reaction terms together. These models usually involve a system of semilinear equations in which the nonlinearity appears in the zeroth-order reaction term: ∂u − div (k(x) ∇u) + b(x) · ∇u + R(x, u) = 0 . ∂t
(9.64)
Here the coordinates of the unknown vector-function u(x, t) ∈ Rm mean the concentration of the l-th pollutant (l = 1, 2, . . . m), k(x) is the diffusion coefficient matrix, b(x) ∈ R3 is the velocity field (defined mainly by the wind) and R(x, u) describes the chemical reaction. (For some further details, see Zlatev [298].) The main idea of splitting is to divide the problem into three parts, namely, on each time level one solves consecutively the problems in which only one of the three time-independent terms is present. The solution of the obtained problem and that of the original one are connected in a way which is characterized by the so-called splitting error (Lanser [192], Sanz-Serna [264], Strang [268]). In the study of the splitting error the continuous level plays an important role. Under certain conditions on the commutativity of the involved elliptic operators, one can even achieve zero error. Further, if these conditions are not exactly satisfied, then more generally one obtains corresponding error estimates. The splitting method may also make it easier to find suitable preconditioners, since the different time-independent operators need different treatment. The preconditioning operators discussed in subsection 3.4.2 may provide efficient preconditioning for the step containing the second order elliptic term. For related results on splitting and the commutativity of the involved operators, the reader is referred to Farag´o-Havasi [113], Havasi–Bartholy–Farag´o [147].
318CHAPTER 9. ALGORITHMIC REALIZATION OF ITERATIVE METHODS BASED ON PRECO
Chapter 10 Some numerical algorithms for nonlinear elliptic problems in physics In this chapter we illustrate the application of the preconditioned iterative methods elaborated in Chapters 7–9. The algorithms are formulated for some of the model problems of Chapter 1. In accordance with Chapter 8, the emphasis is on FEM realization which is a natural setting for methods with Sobolev space background. Hence all the given algorithms, with the exception of that in section 10.6, will couple the proposed iterative methods and preconditioners with FEM discretization. Besides, for ease of exposition we consider a fixed mesh in the FEM realizations and construct the algorithms to approach the FEM solution uh . (We only note that on the next level, by letting h → 0, the convergence of uh to the weak solution u∗ of the original problem is ensured by the uniform convexity of the corresponding energy functionals, as was pointed out at the beginning of subsection 9.2.3.) The given FEM algorithms demonstrate that the FEM realization of Sobolev space methods with preconditioning operators can be derived in a straightforward way from the theoretical iteration. Further, in most cases the mesh independence of the convergence of the iterations follows from the preceding results of this book, which will be illustrated in some of the examples by the a priori calculation of analytic bounds for these convergence estimates. The computer coding of the given algorithms relies on various standard FEM realizations and packages, based on the linear elliptic solvers given in subsections 3.3.2– 3.3.3. The first two sections contain examples with concrete nonlinearities that illustrate some (mostly qualitative) aspects of the proposed methods. The coding of the algorithms in the other sections is similar to these, and its execution is not the aim of this summary. For similar reasons we do not include all the examples of Chapter 1 either, but instead we wish to illustrate the use of preconditioning operators in typical situations. The structure of the sections is as follows: • The problem is posed. 319
320CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS • The proposed method is sketched with reference to the previous subsections, containing the chosen kind of iteration and preconditioner. • The algorithm is given and the corresponding convergence result is quoted. (In sections 10.1–10.2 and 10.6 numerical illustration is also included.) • Conclusions are given that briefly summarize the main features and advantages of the proposed method.
10.1
Elasto-plastic torsion of rods
(a) The problem Let us consider problem (1.17): −div (g(|∇u|)∇u) = 2ω
in Ω ⊂ R2
u|∂Ω = 0 .
(10.1)
Here g ∈ C 1 [0, T∗ ] is a strain-stress function satisfying 0 < µ1 ≤ g(T ) ≤ (g(T )T )′ ≤ µ2
(T ∈ [0, T∗ ])
(10.2)
with suitable constants µ1 , µ2 independent of T , and the constant ω>0 is the torsion per unit. (See section 1.1). For T > T∗ we define for simplicity g(T ) = g(T∗ ). For sufficiently small ω the material is in elasto-plastic state, and by proposition 6.1 problem (10.1) has a unique weak solution u∗ ∈ H01 (Ω), i.e. Z
Ω
g(|∇u∗ |)∇u∗ · ∇v = 2ω
Z
Ω
(v ∈ H01 (Ω)).
v
(b) The proposed method We consider a fixed FEM subspace Vh ⊂ H01 (Ω) and look for the solution uh ∈ Vh of the discretized problem Z
Ω
g(|∇uh |)∇uh · ∇v = 2ω
Z
Ω
v
(v ∈ Vh ).
(10.3)
(The dependence of the test functions v ∈ Vh on h is not denoted here and in the sequel either in order to have less indices.) The proposed method is the gradient–finite element method (GFEM) combined with the discrete Laplacian preconditioner. The GFEM has been defined in subsection 9.2.1, and the Laplacian preconditioner is discussed in subsection 8.2.1. The GFEM defines a simple iteration which, in the case of Laplacian preconditioner, consists of the numerical solution of auxiliary Poisson equations in each step of the iteration using a
10.1. ELASTO-PLASTIC TORSION OF RODS
321
suitable FEM. We will consider both the weak and strong forms of the algorithm. The discrete Laplacian preconditioning matrix is given by (−∆h )i,j =
Z
Ω
∇vi · ∇vj
(i, j = 1, ..., k),
where v1 , ..., vk is a basis of Vh . The application of the Laplacian preconditioner is motivated by the available fast Poisson solvers, quoted in paragraph (a) of section 9.3. We note that one might similarly propose the Laplacian preconditioner in inner iterations for an outer Newton iteration if the Newton–finite element method (NFEM) were applied instead of the GFEM. However (as pointed out in the mentioned section 9.3 in Remark 9.11), with the Laplacian preconditioner the total number n of inner-outer iterations to achieve a prescribed error ε is of the same magnitude as for a simple GFEM iteration, namely, n = O(log ε)
as ε → 0
(10.4)
for both methods. Moreover, the GFEM produces this order for the number of iterations by a simple iteration which does not require the construction of the Jacobians as does the NFEM. Hence this simplicity of the GFEM justifies its usage whenever its linear convergence quotient is reasonably small. The latter will be illustrated by a real-life example of g in paragraph (e). We define the bounds m = g(0),
M = max (g(T )T ) ′ .
(10.5)
0≤T ≤T∗
(If µ1 , µ2 are sharp in (10.2), then m = µ1 , M = µ2 .) For simplicity, we pick u0 ≡ 0 for the initial guess (cf. also Remark 7.5). (c) The algorithm In this paragraph we give the algorithmic form of the proposed method. We consider a fixed mesh, i.e. the algorithm is given in a fixed FEM subspace Vh . First we define the algorithm in general in weak form, then the particular case is given when the auxiliary problems are in strong form. •
In general, we consider the algorithm in weak form and fix a FEM subspace Vh ⊂ H01 (Ω).
The GFEM algorithm for (10.1) is as follows:
(a) u0 ≡ 0; for any n ∈ N : if un ∈ Vh is obtained, then (b1) z ∈ V is the solution of n h Z Z Z ∇zn · ∇v = g(|∇un |)∇un · ∇v − 2ω v Ω Ω Ω 2 zn . un+1 = un − (b2)
M +m
(10.6) (v ∈ Vh );
322CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS Here the constants m and M are taken from (10.5), further, the auxiliary linear algebraic systems in step (b1) are solved by a fast Poisson solver quoted in section 9.3. •
The algorithm in strong form is the same as (10.6) with a fixed FEM subspace satisfying Vh ⊂ H 2 (Ω) ∩ H01 (Ω). (10.7) In this case the function zn in step (b1) satisfies zn ∈ H 2 (Ω), i.e. it is the approximate solution of the strong auxiliary problem (
−∆z = − div (g(|∇un |)∇un ) − 2ω z|∂Ω = 0
(10.8)
in Vh . The convergence of the algorithm (10.6) is given by Theorem 9.4. Namely, if uh ∈ Vh denotes the FEM solution (10.3), then there holds kun − uh kH01 ≤ C · with C=
M −m M +m
n
(n ∈ N)
2 kωkL2 (Ω) , m̺1/2
where m and M are from (10.5), and ̺ > 0 is the smallest eigenvalue of −∆ on H 2 (Ω) ∩ H01 (Ω). For the constant C, we note that kωkL2 (Ω) = ω|Ω|1/2 since ω is constant, and one can use the estimate ̺ ≥ 2π 2 /diam(Ω)2 from (11.1) where |Ω| and diam(Ω) denote the area and diameter of Ω, respectively. Hence C≤
diam(Ω) (2|Ω|)1/2 ω mπ
with m from (10.5). Note that the obtained convergence estimate is mesh independent, since it only contains data from the original problem before discretization. Concerning the linear convergence of the method, we also refer to the remark on comparison with the Newton–finite element method in paragraph (b) involving the order (10.4). (d) The weak and strong forms: realization and qualitative aspects The straightforward way of realizing the FEM solution for the auxiliary problems in (10.6) is the use of C 0 -elements, since the algorithm in weak form requires that zn ∈ H01 (Ω). The simplest elements here are piecewise linear functions, which in our 2D case are simply determined by elementwise three coefficients coming from the values at the vertices (Strang-Fix [269]). The functions un are stored as the suitable arrays
10.1. ELASTO-PLASTIC TORSION OF RODS
323
of the coefficients in a standard way, further, the numerical integration on the rightside of the auxiliary problem is achieved easily since ∇un is a constant vector on each element. Accordingly, the obtained field ∇u of the numerical solution u of problem (10.1) is constant on each element. The obtained auxiliary linear algebraic systems are solved by a fast Poisson solver quoted in paragraph (a) of section 9.3. The strong form of (10.6), given via (10.7) and (10.8), can be realized using C 1 elements, since the condition Vh ⊂ H 2 (Ω) requires Vh ⊂ C 1 (Ω). The simplest C 1 elements for the 2D problem (10.8) are standard full quintic finite element approximations on each triangle. Then the 21 coefficients of the polynomials of degree 5 are determined such that 18 of them come from the values at the vertices and three from the normal derivatives at the midpoint of each edge [269]. The use of such a higher order FEM requires a much larger number of arithmetic operations, and therefore it is not widespread. However, the reasonability of its usage is justified in literature (see e.g. [48, 155, 297]) and, in particular, it is also a basis for the hp-version by Szab´o and Babuˇska [273]. The C 1 -elements lead to higher order error estimates [269], therefore a given accuracy requires smaller h than with lower degree elements and hence the arising matrix sizes are not much larger. The C 1 -elements have an important advantage for our problem connected to the qualitative aspect in Remark 9.8. Namely, the exact solution u∗ of (10.1) has the smoothness u∗ ∈ C 1 (Ω) ∩ H 2 (Ω), which is preserved by using C 1 -elements. This means in particular that the continuity of the tangential stress field τ = (τx , τy ) =
∂u ∂u ,− ∂y ∂x
!
is thus reproduced by the numerical approximations, and also the level contours of τ are connected. (e) An example We enclose the numerical results from [115] for problem (10.1) with a copper rod. We consider the copper rod with a square cross-section 10 mm × 10 mm. The material was heat treated at the temperature 600◦ C for 1 hour, and the corresponding strain-stress function g is then determined using the following data, obtained from the measurements [286]. (We use N for force and, for convenience, mm for length throughout the experiment.) T g(T ) T g(T ) T g(T )
0 1.0840 2.5097 1.4329 4.5887 1.8486
1.0779 1.0840 2.6786 1.4650 4.9247 1.9398
1.2962 1.1479 3.4842 1.6292 4.9866 1.9811
1.5238 1.2160 3.6339 1.6614 5.1473 2.0541
1.7395 1.2754 4.0616 1.7462 5.3245 2.1259
1.9293 1.3201 4.4678 1.8166
Table 10.1. The values of T mean 102 × N/mm2 and those of g(T ) mean 10−4 × mm2 /N .
324CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS The values of g(T ) are determined by suitable interpolation using Table 10.1. Letting T∗ = 5.3245, the validity interval is [0, T∗ ]. The two cases 0 ≤ T ≤ 1.0779 ( with g(T ) ≡ 1.0840) and 1.0779 ≤ T ≤ 5.3245 correspond to the elastic and plastic state, respectively. According to (10.5), this strain-stress function gives the ellipticity bounds m = 1.0840,
M = 4.6861.
From this the stepsize and the convergence quotient estimate are 2 = 0.3466, M +m
M −m = 0.6243. M +m
We apply the algorithm (10.6) in strong form, i.e. we use C 1 -elements described in part (d). This is motivated by qualitative aspects, since the continuity of the tangential stress field τ is thus reproduced by the numerical approximations. In [115] we have determined ω = 0.3613 mm−1 as a value of the torsion per unit for which the solution of the problem (10.1) slightly increases above the critical state. This means that the stress intensity T = |∇u| slightly exceeds the maximum T∗ = 5.3245 of the validity interval in some points of the cross-section, i.e. crack occurs. Hence the cross-section can be divided into three parts corresponding to each possible behaviour of the material: elastic state, plastic state and where the crack occurs. Numerical results. The numerical results from [115] corresponding to problem (10.1) with ω = 0.3613 and a copper rod having square cross-section 10 mm × 10 mm are given below. The aim is to determine the tangential stress field ∇u. Our principle for the stopping criterion in the algorithm (10.6) relies, as usual, on the difference of consecutive terms. Namely, in each step we compute the nodal errors εn , which we define as the difference of the derivatives with respect to the mesh points. (Computing the nodal error requires no extra work since the used values of derivatives appear during the FEM calculations.) When εn decreases below 10−4 , we also compute the error en = kun − un−1 kH01 (Ω) with numerical integration of suitably higher accuracy than for εn . The computations are executed up to accuracy 10−4 . The FEM error estimate (3.25) shows that even h = 2.5 mm is a reasonable choice for this purpose. The convenience of this coarse mesh is due to the use of C 1 -elements. In Table 10.2 we summarize the results of the computations with ω = 0.3613 mm−1 and h = 2.5 mm. The number of iterations n, the computed stress intensities Tn = |∇un | and the nodal errors εn are given. The required stopping criterion is εn ≤ 10−4 .
10.1. ELASTO-PLASTIC TORSION OF RODS n 1 Tn 3.4402 εn 1.6552 n 9 Tn 5.4237 εn 0.0033 Table 10.2.
2 5.2090 0.8993 10 5.4210 0.0018
3 5.4735 0.2487 11 5.4226 0.0014
4 5.3440 0.0748 12 5.4217 0.0010
5 5.4500 0.0423 13 5.4223 0.0007
325 6 5.4040 0.0143 14 5.4216 0.0004
7 5.4325 0.0088 15 5.4219 0.0002
8 5.4174 0.0047 16 5.4217 0.0001
After step 16 we computed e16 , using numerical integration of the gradients on a 20 × 20 mesh, and obtained e16 = 0.000086. Further refinement to 40 × 40 yielded e16 = 0.000089. The obtained values strengthen the reliability of the nodal stopping criterion. Consequently, we accept u˜ = u16 as the numerical solution. The surface and contours of the obtained tangential stress intensity are plotted in Figures 10.1 and 10.2, respectively. Here the cross-section can be divided into three parts: the corners and a small central part are in elastic state, in the middle of the edges crack occurs, and the intermediate region is in plastic state (see Figure 10.3.) Owing to the used C 1 -elements, the numerically computed stress intensity T˜ = |∇˜ u| is continuous, hence its level contours are connected, as well as those belonging to the exact solution. This would not be satisfied if lower order elements had been used. (We note that T˜ is not everywhere differentiable since u˜ is not C 2 , hence the contours are not everywhere smooth.)
Figure 10.1: The surface of the tangential stress intensity (f ) Conclusions The gradient–finite element method has been combined with the discrete Laplacian preconditioner for the solution of problem (10.1). The efficiency of the Laplacian preconditioner is provided by the application of fast Poisson solvers. The main advantage
326CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS
Figure 10.2: The contours of the tangential stress intensity
Figure 10.3: The regions of elastic state, plastic state and crack of the GFEM is that it defines the very simple algorithm (10.6) which, however, produces the same order for the number of iterations to achieve a prescribed error (without the need to construct the Jacobians) as would the Newton–finite element method if the Laplacian preconditioner had been applied in inner iterations for the outer Newton steps. The proposed method provides mesh independent linear convergence, for which a priori analytic bounds have been calculated. Numerical illustration has been given
10.2. ELECTROMAGNETIC FIELD EQUATION
327
using a real-life example of g which produces a reasonably small linear convergence quotient for the GFEM. For qualitative reasons the FEM realization has been chosen to preserve the smoothness properties of the exact solution.
10.2
Electromagnetic field equation
(a) The problem The electromagnetic field equation in terms of the potential u has been given in section 1.2 in equation (1.24): −div (b(x, |∇u|) ∇u) = ρ(x)
u|∂Ω = 0 ,
in Ω ⊂ R2
(10.9)
where the scalar-valued function b(x, r) describes magnetic reluctance and ρ is the electric current density in the device Ω. (See section 1.2). By Proposition 6.1 problem (10.9) has a unique weak solution u∗ ∈ H01 (Ω), i.e. Z
Ω
∗
∗
b(x, |∇u |)∇u · ∇v =
Z
Ω
ρv
(v ∈ H01 (Ω)).
For simplicity we develop a detailed algorithm for the special case when b(x, r) = a(r).
(10.10)
(Dependence on x will be referred to in Remark 10.1.) By (1.25), the function a ∈ C 1 (R+ ) satisfies 0 < λ ≤ a(r) ≤ c(r) ≤ Λ (r ≥ 0) (10.11)
with constants Λ ≥ λ > 0 independent of r, where we denote c(r) = a(r) + a′ (r)r
(10.12)
as in (8.69). We note that, in contrast to the torsion problem (10.1), this nonlinearity may vary in several magnitudes over the whole domain [188]: typical values are Λ/λ = O(105 ), as is the case e.g. in the example at the end of this section. (b) The proposed method We consider a fixed FEM subspace Vh ⊂ H01 (Ω) and look for the solution uh ∈ Vh of the discretized problem with the nonlinearity (10.10): Z
Ω
a(|∇uh |)∇uh · ∇v =
Z
Ω
ρv
(v ∈ Vh ).
(10.13)
The proposed method is the damped inexact Newton–finite element method (NFEM), realized by an inner-outer iteration and using domain decomposition preconditioners in the inner iteration. The ingredients of the proposed method are taken from the following parts of the book. The NFEM has been defined in subsection 9.2.2. The damped
328CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS inexact version of Newton’s method using inner-outer iterations is given in subsection 9.1.2 (see (9.19)–(9.20)). Finally, preconditioners using domain decomposition have been introduced in subsection 8.2.7. The preconditioning matrices that use the domain decomposition procedure of subsection 8.2.7 are the discretizations of piecewise constant coefficient elliptic operators. That is, such a preconditioning matrix Bn(h) is given by n
o
Bn(h) i,j
=
Z
Ω
wn (x) ∇vi · ∇vj
(i, j = 1, ..., k),
(10.14)
where wn is a piecewise constant function on Ω (redefined in each outer Newton step), and v1 , ..., vk is a basis of Vh . This means that the inner iterations consist of the numerical solution of auxiliary linear elliptic problems with piecewise constant coefficients, using a suitable FEM. The solution of the auxiliary linear problems relies on methods designed especially for piecewise constant coefficient problems and cited in subsection 8.2.7, paragraph (b). The application of such preconditioners is motivated by the usually large variation of the coefficient a in problem (10.13) (which is a basic difference of the electromagnetic field equation from the formally similar elasto-plastic torsion problem (10.3)). Namely, as pointed out in subsection 8.2.7, such preconditioning matrices are suitable for compensating the sharp gradients and at the same time their structure remains close to the discrete Laplacian. A real-life example in paragraph (e) will illustrate that, even in an extremely ill-conditioned case, it suffices to use few subdomains to achieve a reasonable condition number. (We note that the applied decomposition approach is not identical to a standard domain decomposition method, in which distinct problems are solved on the subdomains and the latter are defined via the shape of the domain Ω. Now in each Newton step the decomposition yields one piecewise constant coefficient problem, and the subdomains are defined to follow the variation in the coefficient. We further remark that standard DD methods can also be useful for problem (10.9) if the device Ω has a complicated shape, see e.g. [149].) The proposed algorithm for (10.9) will be constructed in paragraph (c) in three steps: (i) The outer Newton iteration is given based on the algorithm (9.19) in the subspace Vh . (ii) The preconditioner for the inner iteration is constructed using the domain decomposition procedure in subsection 8.2.7. This construction results in a piecewise constant weight function wn to be used in the inner iteration in item (iii). The obtained preconditioning matrix Bn(h) is given by (10.14). This matrix is the projection of the linear elliptic preconditioning operator Bn : H01 (Ω) → H01 (Ω), defined by Z hBn v, zi =
Ω
wn (x) ∇v · ∇z
(v, z ∈ H01 (Ω)),
(10.15)
10.2. ELECTROMAGNETIC FIELD EQUATION
329
into the subspace Vh . The operator F ′ (un ) has spectral equivalence bounds mn and Mn with respect to Bn such that the quotient Mn /mn can be prescribed. This quotient yields a mesh independent estimate for the condition number of the inner iteration. (iii) The inner iteration is constructed for the linearized equations Fh′ (un )pn = −(Fh (un ) − bh )
(10.16)
in the steps of the outer iteration (cf. (9.43)–(9.44)). It defines a sequence (p(k) ) = (p(k) n )k∈N ⊂ Vh using a preconditioned conjugate gradient method as in (9.20). The used stopping criterion is the relative residual error of p(kn ) : if kn ∈ N is such that this residual error is within the given tolerance, then we accept p(kn ) for pn . We recall that the mesh independent conditioning estimate Mn /mn in step (ii) follows simultaneously with the construction in the used domain decomposition procedure in paragraph (d) of subsection 8.2.7. We quote the corresponding results. As we will summarize in the algorithm (10.23), the procedure starts from a prescribed condition number for the inner iteration and yields a suitably constructed decomposition Ω = Ω1 ∪ Ω2 ∪ ... ∪ Ωsn . Let us introduce the constants Λi = sup c(|∇un (x)|)
λi = inf a(|∇un (x)|), x∈Ωi
(i = 1, ..., sn )
x∈Ωi
(10.17)
where c(r) is from (10.12), and mn := min λi /ci , i
Mn := max Λi /ci i
where ci = wn |Ωi . In particular, we let ci = 12 (λi + Λi ). Then a brief calculation yields that the operator in (10.15) satisfies mn hBn v, viH01 ≤ hF ′ (un )v, viH01 ≤ Mn hBn v, viH01 hence
cond Bn−1 F ′ (un ) ≤
(v ∈ H01 (Ω)),
Mn , mn
(10.18)
and by (8.63) the same holds with the preconditioning matrix Bn(h) in (10.14):
cond (Bn(h) )−1 Fh′ (un ) ≤
Mn mn
(10.19)
independently of the mesh width h. The corresponding convergence quotient of the inner CGM iteration is √ √ Mn − m n . Qn := √ √ Mn + m n
330CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS Concerning the linearized equation (10.16), we note that now f (x, η) = a(|η|)η satisfies ∂f (x, η) a′ (|η|) = a(|η|) · I + (η · η t ) (x ∈ Ω, η ∈ RN ) (10.20) ∂η |η|
(where I ∈ RN ×N is the identity matrix and η · η t is the diadic product matrix of η and its transpose). Using this and (9.43), equation (10.16) takes the form a′ (|∇un |) a(|∇un |) · I + (∇un · ∇utn ) ∇pn · ∇v |∇un | Z Z = − a(|∇un |)∇un · ∇v + gv
Z ( Ω
)
Ω
Ω
(v ∈ Vh ),
and similar integrals will appear in the corresponding steps of the inner iteration. (c) The algorithm In this paragraph we give the algorithmic form of the proposed method. We consider a fixed mesh, i.e. the algorithm is given in a fixed FEM subspace Vh ⊂ H01 (Ω). The proposed algorithm for (10.9) is given in three steps (i)-(iii) as described in paragraph (b). (i) The outer Newton iteration (un ) is given by the algorithm (a) u0 ∈ Vh ; for any n ∈ N : if un is obtained, then (b1) δn > 0 is some constant satisfying 0 < δn ≤ δ0 < 1, (b2) (b3)
(b4)
pn is obtained from the inner iteration (10.23); τn = min{ 1,
(1−δn ) λ (1+δn ) Lkpn kH 1
D
} ∈ (0, 1] ,
un+1 = un + τn pn . (10.21)
Here the constants required for τn in step (b3) are as follows: – λ is the lower bound from (10.11), – L is the Lipschitz constant of Fh′ calculated using Remark 7.14. Further, for δn we propose in particular that δn ≤ C · kFh (un ) − bh kH01 with some constant C > 0 independent of n. (ii) The piecewise constant weight function wn for the preconditioning of the inner iteration is constructed as follows. We define a decomposition Ω = Ω1 ∪ Ω2 ∪ ... ∪ Ωsn and constants ci (i = 1, ..., sn ) by the algorithm below. We use the notation c(r) := a(r) + a′ (r)r, further (since n is fixed in the procedure) for simplicity we write s instead of sn .
10.2. ELECTROMAGNETIC FIELD EQUATION
(a) (b) (c) (d) (e)
c(r) ; κ0 := sup a(r) r≥0
331
we fix a number κ > κ0 ;
we define recursively subintervals Ji = [ri−1 , ri ) with 0 = r0 < r1 < ... < ri < ... < rs−1 < rs = ∞ sup c(r) such that
r∈Ji
inf a(r)
=κ
(i = 1, ..., s);
(10.22)
r∈Ji
Ωi := {x ∈ Ω : |∇un (x)| ∈ Ji }
(i = 1, ..., s),
Λi = sup c(r)
(i = 1, ..., s);
λi = inf a(r), r∈Ji
ci = 12 (λi + Λi )
r∈Ji
(i = 1, .., s).
Then the function wn is defined by
wn |Ωi ≡ ci
(i = 1, ..., sn ).
(iii) The inner CG iteration yields pn through the following sequence (p(k) ) = (p(k) n )k=1,...,kn ⊂ Vh .
(We omit the lower index n for notational simplicity, which causes no ambiguity since now n is fixed. The sequence (p(k) ) ⊂ Vh is defined simultaneously with sequences (r(k) ) ⊂ Vh and (d(k) ) ⊂ Vh .)
332CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS
(a) p(0) ∈ Vh is arbitrary; r(0) = −d(0) ∈ Vh is the solution of the problem ) Z Z ( a′ (|∇un |) (0) t wn (x) ∇r · ∇v = a(|∇un |) · I + (∇un · ∇un ) ∇p(0) · ∇v |∇u | Ω Ω n Z Z + a(|∇u |)∇u · ∇v − ρv (v ∈ Vh ); n n Ω Ω for any k = 1, 2, ... : if p(k−1) , r(k−1) , d(k−1) are obtained, then o Rn a′ (|∇un |) t (b1) γ = a(|∇u |) · I + (∇u ·∇u ) ∇d(k−1) ·∇d(k−1) , k n n n |∇un | Ω R αk = − γ1k wn (x) ∇r(k−1) · ∇d(k−1) ; Ω (b2) y (k) ∈ Vh is the solution of the problem ) Z Z ( a′ (|∇un |) t (k) (∇un · ∇un ) ∇d(k−1) · ∇v wn (x) ∇y · ∇v = a(|∇un |) · I + |∇un | Ω Ω
(v ∈ Vh );
(b3)
r(k) = r(k−1) + αk y (k) ;
(b4)
p(k) = p(k−1) + αk d(k−1) ;
(b5)
βk =
1 γk
Rn
a(|∇un |) · I +
Ω
a′ (|∇un |) (∇un |∇un |
o
· ∇utn ) ∇d(k−1) · ∇r(k) ;
d(k) = −r(k) + βk d(k−1)
(b6) until (c)
kFh′ (un )p(k) + (Fh (un ) − bh )kBn−1 ≤ ̺n kFh (un ) − bh kBn−1 where ̺n = (µ1 /µ2 )1/2 δn ;
(d)
kn = the first k for which (c) holds; pn = p(kn ) .
(10.23) The auxiliary linear problems in steps (a) and (b2) are solved by one of the methods designed for piecewise constant coefficient problems and quoted in subsection 8.2.7, paragraph (b). Further, we note that by Remark 9.4, the inequality to be checked in (c) is now equivalent to Z
Ω
and, moreover, here Z
Ω
wn (x) |∇r
wn (x) |∇r
(k) 2
| ≤
(k) 2
| =−
̺2n
Z
wn (x) |∇r(0) |2
(10.24)
Z
wn (x) ∇r(k) · ∇d(k)
(10.25)
Ω
Ω
10.2. ELECTROMAGNETIC FIELD EQUATION
333
which is determined anyway in the next iteration step for αk in (b1) as long as k < kn , hence no extra work is required to calculate (10.25). The convergence of the algorithm (10.21)–(10.23) is given by Theorem 7.9. Namely, • the inner iteration satisfies kFh′ (un )p(k)
+ (Fh (un ) − bh )kBn−1 ≤
√ √ !k Mn − m n √ kFh (un ) − bh kBn−1 , √ Mn + m n
hence the number of inner iterations for the nth outer step is at most kn ∈ N, determined by the inequality √ √ !k Mn − m n n √ ≤ δn . √ Mn + m n This means that kn only depends on mn and Mn in (10.18), i.e. it is independent of the mesh width h. (See also Corollary 9.2.) • in the case of the proposed choice δn ≤ const. · kFh (un ) − bh kH01 , the outer iteration satisfies the quadratic estimate kFh (un+1 ) − bh kH01 ≤ const. · kFh (un ) − bh k2H01 , and, consequently, there holds n
kun − uh kH01 ≤ λ−1 kFh (un ) − bh kH01 ≤ const. · q 2
(n ∈ N)
with a suitable constant 0 < q < 1, where uh ∈ Vh denotes the FEM solution (10.13). Remark 10.1 The same kind of algorithm can be repeated in the case of an xdependent coefficient if b(x, |η|) is a different function of |η| on distinct subdomains of Ω. Then the domain decomposition procedure for preconditioning is the same as above on each subdomain. This kind of coefficient arises for the potential in H-shaped magnets [188], mentioned in section 1.2. Thereby the function b : Ω × R+ → R+ in (10.9) is given by b(x, r) :=
˜ a(r) if x ∈ Ω α if x ∈ Ω0 ,
˜ and Ω0 are given disjoint subdomains of Ω (corresponding to ferromagnetic and where Ω other media, respectively), the function a ∈ C 1 (R+ ) satisfies (10.11), and α ∈ (λ, Λ) is a constant. Then a suitable choice of the weight function wn is defined as previously ˜ (via suitable subdomains Ω1 , ..., Ωs ) and as constant α on Ω0 , yielding a for a(r) on Ω decomposition Ω = Ω0 ∪ Ω1 ∪ ... ∪ Ωs . Further, (10.20) is replaced by
a(|η|) · I + ∂f (x, η) = α·I ∂η
a′ (|η|) (η |η|
· ηt)
˜ if x ∈ Ω,
if x ∈ Ω0 ,
334CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS and hence in the inner iteration (10.23), in all the integrals of the form a′ (|∇un |) a(|∇un |) · I + (∇un · ∇utn ) |∇un |
Z ( Ω
)
∇p · ∇v,
˜ and to add a term α R ∇p · ∇v. one has to replace Ω by Ω Ω0 (d) An example
The following nonlinearity, which varies in several magnitudes over the whole domain, has been cited in subsection 8.2.7. It characterizes the reluctance of stator sheets in the cross-sections of an electrical motor in the case of isotropic media: r8 1 α + (1 − α) 8 a(r) = µ0 r +β
!
(r ≥ 0),
(10.26)
where α = 0.0003 and β = 16000, further, µ0 is the vacuum permeability [188]. For simplicity we can consider µ0 = 1 since µ0 does not affect the conditioning. Then supr>0 a(r) = limr→∞ a(r) = 1. A relevant part of the graph of the function a is shown in Figure 10.4.
Figure 10.4: The graph of the function a(r) With this nonlinearity, the problem (10.9) is almost singular: namely, λ = α = 0.0003 and Λ = max(a(r) + a′ (r)r) = 2.5313, hence the related original condition number and corresponding convergence quotient are √ √ Λ− λ Λ √ = 0.9785. √ = 8437.7, λ Λ+ λ
10.2. ELECTROMAGNETIC FIELD EQUATION
335
That is, suitable preconditioning is inevitable for reasonable convergence of the inner iteration. Now we will examine the proposed decomposition procedure, and will observe that few subdomains are able to yield a reasonable convergence quotient. First we note that the lower bound in step (a) of (10.22) for the condition numbers now equals κ0 = 8.9344, hence the value of the possible inner CGM convergence quotients Q with proper refinement is at least √ κ0 − 1 Q0 := √ = 0.4976. κ0 + 1 For a prescribed number of subdomains, one can calculate from (10.22) and (10.26) the smallest convergence quotient corresponding to the decomposition. Some values are given in Table 10.3 below. s Q s Q
4 5 6 7 8 9 0.7449 0.7025 0.6711 0.6467 0.6281 0.6123 10 11 12 13 14 15 0.6008 0.5903 0.5817 0.5737 0.5673 0.5623
Table 10.3.
s = number of subdomains; Q = corresponding convergence quotient.
We may conclude that it suffices with few subdomains to achieve a reasonable convergence quotient. The shapes of the arising subdomains can be illustrated by the following example. We define the model domain Ω = [0, 1] × [0, 1] ⊂ R2 and the test solution u0 (x, y) = 14x(1 − x)y(1 − y). (The factor 14 ensures that the range of b(|∇u0 |) coincides with that of b.) Hereby we study the decomposition corresponding to the exact function u0 when 9 level sets are chosen to have a convergence quotient 0.6123. The level contours of |∇u0 | are the curves shown in Figure 10.5. The level values (the numbers ri defined in (8.73)) are indicated on the lines. The sets Ωi (i = 1, ..., 9) are not connected, except for i = 5 corresponding to J5 = [1.8962, 2.0977). This structure is better seen on the filled contour plot (Figure 10.6), where the components of the same level set have the same shade. (Figures 10.4-10.6 have been produced using Matlab.) It is obvious for qualitative reasons that the subdomains are generally disconnected in a similar way for any other example satisfying the Dirichlet boundary condition. The arising greater number of connected subdomains does not increase the work of updating the weight matrix Wn , since the latter only depends on the number s of the constants ci . We hereby note that in spite of the large variation of the original coefficient a, the ratios of the constants ci on adjacent subdomains are around 2 in our test problem. (e) Conclusions The Newton–finite element method has been realized for the solution of problem (10.9) by an inner-outer iteration and with preconditioners using domain decomposition
336CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS
Figure 10.5: Level contours of |∇u0 |
Figure 10.6: Filled contour plot of |∇u0 | in the inner iteration. The main advantage of the outer Newton iteration is quadratic convergence, which is achieved by the sufficiently accurate solution of the linearized equations via the inner iterations. The latter are preconditioned CGM iterations, where for preconditioners we have introduced the discretizations of suitably chosen piecewise constant coefficient elliptic operators. These preconditioners are suitable for compensating the sharp gradients via the proposed domain decomposition procedure.
10.3. ELASTO-PLASTIC BENDING OF CLAMPED PLATES
337
At the same time, their structure remains close to the discrete Laplacian and hence efficient solution methods are available for the auxiliary problems. In comparison with the previous section 10.1, we have observed that the usually large variation of the coefficient a in problem (10.13) is a basic difference of the electromagnetic field equation from the formally similar elasto-plastic torsion problem (10.3). Consequently, the thereby efficient Laplacian preconditioner would now be unable to provide reasonable convergence. The effect of domain decomposition procedure has been studied on an extremely ill-conditioned real-life example of a. Here the condition number without domain decomposition (i.e. corresponding to the Laplacian preconditioner) is O(105 ), with a corresponding convergence quotient almost equal to 1, whereas using not more then ten subdomains we have decreased the convergence quotient to 0.6.
10.3
Elasto-plastic bending of clamped plates
(a) The problem The equation of elasto-plastic bending of clamped plates has been given in (1.35): ˜ 2 u = α(x) div2 g(E(D 2 u)) D
where
u |∂Ω =
∂u | ∂ν ∂Ω
E(D2 u) = ˜ 2u = D
in Ω ⊂ R2
= 0,
∂2u ∂x2
2
+
2 ∂2u + 21 ∂∂yu2 ∂x2 1 ∂2u 2 ∂x∂y
∂2u ∂2u ∂x2 ∂y 2
+
∂2u ∂y 2
1 ∂2u 2 ∂x∂y 2 ∂2u + 21 ∂∂xu2 ∂y 2
2
!
+
∂2u ∂x∂y
2
(10.27)
, (10.28)
,
g depends on the given material, further, α(x) is proportional to the external normal load per unit area. The material function g satisfies the inequalities 0 < µ1 ≤ g(r) ≤ µ2 ,
(10.29)
0 < µ1 ≤ (g(r2 )r)′ ≤ µ2
with suitable constants µ1 , µ2 independent of the variable r > 0. (See section 1.4.) By Proposition 6.3 problem (10.27) has a unique weak solution u∗ ∈ H02 (Ω): Z 1Z αv g(E(D2 u∗ )) (D2 u∗ · D2 v + ∆u∗ ∆v) = 2 Ω Ω
(v ∈ H02 (Ω)).
(b) The proposed method We consider a fixed FEM subspace Vh ⊂ H02 (Ω) and look for the solution uh ∈ Vh of the discretized problem Z 1Z 2 2 2 g(E(D uh )) (D uh · D v + ∆uh ∆v) = αv 2 Ω Ω
(v ∈ Vh ).
338CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS The proposed method is the variable preconditioning formulation of the Newton– finite element method (NFEM), using a double diagonal coefficient preconditioner with piecewise constant coefficient. The algorithm of variable preconditioning has been given in subsection 9.1.2, based on the results of subsections 7.2.3 and 7.2.4. In the context of the NFEM (as mentioned in the introduction of subsection 9.2.2), one simply has to apply that algorithm in a FEM subspace Vh . Further, double diagonal coefficient preconditioners have been introduced in subsection 8.2.14. As a special case, one can construct the coefficient function as piecewise constant in a similar way as for second order problems. This construction, using the domain decomposition procedure of subsection 8.2.7, has been already applied in section 10.2 and is therefore easy to adapt to the present fourth order case. In the method of variable preconditioning the derivatives are replaced by approximate Jacobians as variable preconditioners in the steps of the iteration, constructed as discretized spectrally equivalent preconditioning operators. To define the latter, we choose a suitable domain decomposition procedure. This construction provides a decomposition Ω = Ω1 ∪ Ω2 ∪ ... ∪ Ωsn and constants ci (i = 1, ..., sn ) to produce a piecewise constant weight function wn (similarly as in section 10.2), and results in a preconditioning matrix Bn(h) given by 1Z = wn (x) (D2 vi · D2 vj + ∆vi ∆vj ) (i, j = 1, ..., k), (10.30) 2 Ω where v1 , ..., vk is a basis of Vh . This matrix is the projection of the linear elliptic preconditioning operator Bn : H02 (Ω) → H02 (Ω), defined by n
o
Bn(h) i,j
1Z wn (x) (D2 v · D2 z + ∆v ∆z) (v, z ∈ H02 (Ω)), (10.31) 2 Ω into the subspace Vh . This means that the inner iterations consist of the numerical solution of auxiliary linear elliptic problems containing the operator Bn , using a suitable FEM. Similarly to the second order case, one can bound the condition numbers of the inner iterations by suitably refined decompositions adapted to the variation of the nonlinearity g, and the structure of the matrix Bn(h) now remains close to the discrete biharmonic operator. The solution of the auxiliary linear problems relies on the methods cited in section 3.3. In each outer Newton step n the linearized operator F ′ (un ) has spectral equivalence bounds mn and Mn with respect to Bn such that the quotient Mn /mn can be prescribed. The bounds mn and Mn are used to define the steplength for the next iterate un+1 , and (as n → ∞) the quotient Mn /mn controls the rate of convergence. hBn v, zi =
(c) The algorithm In this paragraph we give the algorithmic form of the proposed method. We consider a fixed mesh, i.e. the algorithm is given in a fixed FEM subspace Vh ⊂ H02 (Ω). We fix constants K > 1 and ε > 0 in advance for the stepwise definition of the forcing terms. Further, we introduce the functions p(r) := min{g(r), (g(r2 )r)′ },
q(r) := max{g(r), (g(r2 )r)′ }
(r > 0),
10.3. ELASTO-PLASTIC BENDING OF CLAMPED PLATES
339
which, by (10.29), satisfy
µ1 ≤ p(r) ≤ q(r) ≤ µ2
(r > 0).
The algorithmic form of the iteration is as follows. In the nth iteration step, besides required parameters, the piecewise constant coefficient wn is constructed in (d1)-(d5), then the steplength τn and correction term zn are determined in (e1)-(e2) and (f), respectively, finally the next iterate un+1 is defined in step (g).
340CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS
(a) u0 ∈ Vh ; for any n ∈ N : if un is obtained, then (b) ω(un ) = Lµ−2 1 kFh (un ) − bh kH02 ; q(r) (c) κ0 := sup p(r) , r≥0
we fix a number κ ∈ (κ0 , 1 + 2/(ε + Kω(un ))); (d1) we define recursively subintervals Ji = [ri−1 , ri ) with 0 = r0 < r1 < ... < ri < ... < rs−1 < rsn = ∞ sup q(r) such that
r∈Ji
inf p(r)
=κ
(i = 1, ..., sn );
r∈Ji
(d2) Ωi := {x ∈ Ω : E(D2 un (x)) ∈ Ji } Λi = sup q(r)
(d3) λi = inf p(r), r∈Ji
r∈Ji
(d4) ci = 21 (λi + Λi ),
(i = 1, ..., sn ), (i = 1, ..., sn );
(i = 1, .., sn )
(d5) we let wn : Ω → R, wn |Ωi ≡ ci (e1) mn := mini λi /ci ,
(i = 1, ..., sn );
Mn := maxi Λi /ci ;
(e2) we calculate the constants
Qn =
Mn −mn (1 Mn +mn
+ ω(un )),
−2 1/2 ρn = 2LMn2 µ−2 ; 1 (Mn + mn ) kFh (un ) − bh kH02 (1 + ω(un )) n τn = min{1, 1−Q }; 2ρn
(f ) zn ∈ Vh is the solution of the linear problem 1Z wn (x) (D2 zn · D2 v + ∆zn ∆v) 2 Ω Z 1Z 2 2 2 g(E(D un )) (D un · D v + ∆un ∆v) − αv = 2 Ω Ω (g)
un+1 = un −
2τn zn . Mn + m n
(v ∈ Vh );
(10.32) The auxiliary linear problems in step (f) are solved by one of the linear methods quoted in section 3.3. Further, the constants required for ω(un ) in step (b) are obtained as follows:
10.3. ELASTO-PLASTIC BENDING OF CLAMPED PLATES
341
– µ1 is the lower bound from (10.29), – the Lipschitz constant L of Fh′ is calculated following Remarks 7.14–7.16, – kFh (un ) − bh kH02 can be computed using the considered finite element basis functions v1 , ..., vk ∈ Vh (see Remark 9.5). Namely, if c1 , ..., ck are the coefficients of Fh (un ) − bh in the representation k X
Fh (un ) − bh = then bh k2H02
=
γij = hvi , vj iH02 =
Z
kFh (un ) − where
ci vi ,
i=1
k X
γij ci cj ,
i,j=1
Ω
D 2 vi · D 2 vj
are the entries of the corresponding Gram matrix. For Qn , ρn and τn , one only needs previously calculated constants. (We note that we have used Remark 9.5 for defining ρn in step (e2), and have −1/2 replaced the norm kFh (un ) − bh kn in (9.23) by µ1 kFh (un ) − bh kH02 .) If the condition κ0 ≤ κ ≤ 1 + 2/(ε + Kω(un )) in step (c) is too restrictive, then, using Remark 9.6, one may prefer an inner-outer iteration (the algorithm (9.19)–(9.20), similarly as applied e.g. in section 10.2) instead of (10.32) with the same kind of preconditioner. The convergence of the algorithm (10.32) is given by Theorem 7.11, it depends on the quotient Mn /mn as n → ∞. (We note that the number κ in step (c) equals Mn /mn , which follows from the construction of wn as in subsection 8.2.7.) Namely, there holds n kun − uh kH02 ≤ µ−1 1 kFh (un ) − bh kH02 ≤ const. · Q
with Q = lim sup
(n ∈ N)
Mn − m n < 1. Mn + m n
Moreover, if Mn /mn ≤ 1 + const. · kFh (un ) − bh kH02 , then kFh (un+1 ) − bh kH02 ≤ const. · kFh (un ) − bh k2H02
(n ∈ N),
and, consequently, there holds n
kFh (un ) − bh kH02 ≤ const. · q 2 with a suitable constant 0 < q < 1. (d) Conclusions
(n ∈ N)
342CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS The damped inexact Newton–finite element method has been realized for the solution of problem (10.27) using the variable preconditioning formulation of inexact Newton methods, further, applying a double diagonal coefficient preconditioner with piecewise constant coefficient in the inner iteration. Similarly to the previous section, such preconditioners are able to produce low condition numbers via the domain decomposition procedure and, at the same time, their structure now remains close to the discrete biharmonic operator which yields efficient solution of the auxiliary problems. In comparison with the previous section 10.2, the main advantage of the variable preconditioning formulation of the inexact Newton method is that it does not require the construction of the Jacobians: instead, the Jacobians are replaced by matrices with the above described simple structure. This difference compensates the larger freedom that inner-outer iterations might provide in error control, since the lack of constructing Jacobians spares particular cost in the considered 4th order case.
10.4
Nonlinear elasticity
(a) The problem Let us consider the system (1.34) for the unknown displacement vector u = (u1 , u2 , u3 ) : Ω → R3 : in Ω ⊂ R3
− div Ti (x, ε(u)) = ϕi (x)
with
Ti (x, ε(u)) · ν = τi (x)
on ΓN
ui = 0
(i = 1, 2, 3),
(10.33)
on ΓD
T (x, Θ) = 3k(x, |vol Θ|2 ) vol Θ + 2µ(x, |dev Θ|2 ) dev Θ
(x ∈ Ω, Θ ∈ R3×3 ), (10.34) where k(x, s) is the bulk modulus of the material, µ(x, s) is Lam´e’s coefficient, vol Θ and dev Θ are defined in (1.30), further, ε(u) = 12 (∇u + ∇ut ) denotes the strain tensor corresponding to the displacement vector u. (See section 1.3). 1 We recall that the Sobolev space HD (Ω) corresponding to the Dirichlet boundary is defined as 1 HD (Ω) := {u ∈ H 1 (Ω) : u|ΓD = 0},
and owing to ΓD 6= ∅ we can introduce the inner product hu, viHD1 (Ω) :=
Z
Ω
1 (u, v ∈ HD (Ω)).
∇u · ∇v
By Proposition 6.2 problem (10.33) has a unique weak solution u∗ = (u∗1 , u∗2 , u∗3 ) ∈ 1 HD (Ω)3 , i.e. Z
Ω
T (x, ε(u∗ )) · ε(v) =
Z
Ω
ϕ·v+
Z
ΓN
τ · v dσ
1 (v ∈ HD (Ω)3 )
10.4. NONLINEAR ELASTICITY
343
or, in virtue of (10.34), Z n
3k(x, |vol ε(u∗ )|2 ) vol ε(u∗ ) · vol ε(v) + 2µ(x, |dev ε(u∗ )|2 ) dev ε(u∗ ) · dev ε(v)
Ω
=
Z
Ω
ϕ·v+
Z
ΓN
τ · v dσ
o
1 (v ∈ HD (Ω)3 )
where ϕ = (ϕ1 , ϕ2 , ϕ3 ) ∈ L2 (Ω)3 and τ = (τ1 , τ2 , τ3 ) ∈ L2 (ΓN )3 . (b) The proposed method 1 We consider a fixed FEM subspace Vh ⊂ HD (Ω) and we look for the FEM solution 3 uh = (uh,1 , uh,2 , uh,3 ) ∈ Vh of the system (10.33) in Vh3 , which is given by
Z n Ω
3k(x, |vol ε(uh )|2 ) vol ε(uh ) · vol ε(v) + 2µ(x, |dev ε(uh )|2 ) dev ε(uh ) · dev ε(v) =
Z
Ω
ϕ·v+
Z
ΓN
τ · v dσ
o
(v ∈ Vh3 ).
The proposed method is the frozen coefficient iteration given in subsection 7.4.2, which corresponds to the variable preconditioners defined in subsection 8.2.5, now in the setting of systems. The frozen coefficient algorithm consists of the successive solution of linear problems where the nonlinear coefficients come from the previous iteration. The frozen coefficient iteration for (10.33) is motivated by the simplicity of the algorithm, e.g. in comparison with Newton’s method. (See also the remark at the end of subsection 7.4.2. We note on the other hand that Newton’s method has also efficient realizations for elasticity problems using multilevel versions, see Axelsson–Kaporin [18], Blaheta [45, 46].) (c) The algorithm In this paragraph we give the algorithmic form of the proposed method. We 1 consider a fixed FEM subspace Vh ⊂ HD (Ω) and define the iteration (un ) = (un,1 , un,2 , un,3 )n∈N in the product space 1 Vh3 ⊂ HD (Ω)3 .
The frozen coefficient algorithm is as follows: (a) u0 ∈ Vh3 ; for any n ∈ N : if un ∈ Vh3 is obtained, then
(b) un+1 ∈ Vh3 is the solution of the problem
Z n o 2 2 3k(x, |vol ε(u )| ) vol ε(u ) · vol ε(v) + 2µ(x, |dev ε(u )| ) dev ε(u ) · dev ε(v) n n+1 n n+1 Ω Z Z τ · v dσ (v ∈ Vh3 ). = ϕ·v+ Ω
ΓN
(10.35)
344CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS The auxiliary linear problems in step (b) are solved by one of the methods quoted in section 3.3. (In particular, one may apply a preconditioned CGM with decoupled Laplacian preconditioners defined in subsection 8.2.15.) The convergence of the algorithm is ensured by Theorem 7.14, see also paragraph (c) of subsection 8.2.5. (d) Conclusions The frozen coefficient iteration for problem (10.33) consists of the successive solution of linear systems where the nonlinear coefficients come from the previous iteration, hence the FEM realization of the method is straightforward. The main advantage of the proposed method is the special simplicity of the algorithm (10.35). This simplicity is a particular difference from Newton’s method, where for the considered three-term system the construction of Jacobians would require extra cost.
10.5
Radiative cooling
(a) The problem Let us consider the semilinear problem (1.44) describing radiative cooling: −div (κ(x) ∇u) + σ(x)u4 = 0
κ(x) ∂u + α(x) (u − u ˜(x)) |∂Ω = 0 , ∂ν
(10.36)
where u > 0 is the unknown temperature in a plane plate Ω = [0, a] × [0, b] ⊂ R2 . (See section 1.5). For simplicity we normalize κ such that inf Ω κ = 1. Proposition 6.4 yields that the positive solution of (10.36) exists and is unique. (b) The proposed method We will discuss two iterative algorithms below for problem (10.36), and present how the 3rd type boundary conditions can be incorporated into the methods. In this paragraph we describe in detail the gradient–finite element method (GFEM) combined with a suitably modified linear principal part preconditioner. The GFEM, defined in subsection 9.2.1, is adapted to realize Theorem 7.3 such that the linear principal part preconditioner from subsection 8.2.4 is used and the 3rd type boundary conditions are incorporated in the preconditioner as proposed in subsection 8.2.11. Finally, we will also refer to the NFEM in this context in Remark 10.2, in which case the linear principal part is completed by the derivatives of the lower order and boundary terms in the auxiliary problems. Formulation of the gradient–finite element method with linear principal part preconditioner. We consider a fixed FEM subspace Vh ⊂ H 1 (Ω)
10.5. RADIATIVE COOLING
345
and look for the solution uh ∈ Vh of the discretized problem. The linear principal part preconditioner for the semilinear operator in problem (10.36) is defined in subsection 8.2.4 as the principal part Su ≡ −div (κ(x) ∇u). Under the 3rd type boundary conditions of problem (10.36), we will modify this preconditioning operator according to Theorem 7.3 by adding a suitably defined boundary term in the weak form of the operator. In order to apply Theorem 7.3, first the problem (10.36) has to be rewritten. Namely, since the solution satisfies u > 0, (10.36) can be written equivalently as −div (κ(x) ∇u) + σ(x)|u|3 u = 0
κ(x) ∂u + α(x) (u − u ˜(x)) |∂Ω = 0 , ∂ν
or in weak form Z Ω
3
κ(x) ∇u · ∇v + σ(x)|u| uv +
Z
∂Ω
α(x)uv dσ =
Z
∂Ω
α(x)˜ uv dσ
(v ∈ H 1 (Ω)).
This means that the nonlinearity q(x, ξ) = σ(x)|ξ|3 ξ is increasing w.r. to ξ as required in Theorem 7.3. The discretization of the rewritten problem in the considered fixed FEM subspace Vh ⊂ H 1 (Ω) reads as follows: we look for the solution uh ∈ Vh of the problem Z Ω
κ(x) ∇uh · ∇v + σ(x)|uh |3 uh v +
Z
∂Ω
α(x)uh v dσ =
Z
∂Ω
α(x)˜ uv dσ
(v ∈ Vh ).
(10.37) Using the notations of Theorem 7.3, besides the above defined q(x, ξ) = σ(x)|ξ|3 ξ the other coefficients are f (x, η) = κ(x)η, s(x, ξ) = α(x)ξ, g(x) = 0 and γ(x) = α(x)˜ u(x), further, the required constants are p1 = 5, p2 = 2, c1 = d1 = 0, c2 = 3 supΩ σ, d2 = sup∂Ω α. Using the linear principal part preconditioner, we have G(x) = κ(x) · I, and by Construction 7.3 we obtain m = m′ = 1 and β(x) = α(x). The boundary portions are now ΓN = ∂Ω, ΓD = ∅, hence the 1 corresponding Sobolev space is HD (Ω) = H 1 (Ω) with the inner product hu, viH 1 (Ω) :=
Z
Ω
κ(x) ∇u · ∇v +
Z
∂Ω
α(x)uv dσ .
The modified preconditioning operator is the one that generates the above inner product, and then the proposed preconditioning matrix is its discretization in the subspace Vh given by (Sh )i,j =
Z
Ω
κ(x) ∇vi · ∇vj +
Z
∂Ω
α(x)vi vj dσ
(i, j = 1, ..., k),
where v1 , ..., vk is a basis of Vh . The iterative sequence in Vh with the preconditioner Sh is defined as the projection of the Sobolev space sequence, constructed by Theorem 7.3 in the above setting, into Vh . This projection is described in subsection 9.2.1 for the GFEM. Namely, it is executed
346CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS in a trivial way such that we just replace H 1 (Ω) by Vh in the iteration given by (7.46)– (7.47) in Theorem 7.3. Therefore our only task left is to complete the construction of the sequence (7.46)–(7.47), which requires the calculation of the constant M0 involved in (7.46). Besides, using that now the preconditioner is the linear principal part of the operator, the sequence can be simplified similarly as done in (8.45) for Dirichlet problems. This simpler form will be used in the algorithm. Calculation of the constant M0 . takes the form
Using the above coefficients, formula (7.44) now
5 2 M (r) = 1 + 3 sup σ K5,Ω r3 + sup α K2,∂Ω Ω
(r > 0),
(10.38)
∂Ω
where the embedding constants K5,Ω and K2,∂Ω are defined as in (7.37)-(7.38). The embedding constants can be estimated using Lemma 11.1, owing to the norR malization inf Ω κ = 1. First, using ∂Ω α(x)u2 dσ ≤ kuk2H 1 (Ω) and the Cauchy-Schwarz inequality, we obtain 2 K2,∂Ω ≤ 1/ inf α, ∂Ω
K1,∂Ω ≤ kα−1/2 kL2 (∂Ω) .
Introducing the notation 2 Kα = K2,∂Ω + 2K2,Ω ,
Lemma 11.1 yields recursively 2 K2,Ω ≤ 12 (K1,∂Ω + 1)2 , 4 K4,Ω ≤ 21 Kα2 ,
√ 2 3 3 , K3,∂Ω ≤ a2 K3,Ω + 3 2K2,Ω
3 K3,Ω ≤ 21 Kα (K1,∂Ω + 1) ,
5 3 2 K5,Ω ≤ 21 Kα K3,∂Ω + 3K4,Ω .
That is, introducing Cα = kα−1/2 kL2 (∂Ω) + 1 we obtain 5 K5,Ω
˜ := Kα ≤K 2
!
Kα Cα 3 + √ (Kα + Cα2 ) . a 2
(10.39)
Now let u0 ≡ 0. Then, using (7.45), the upper ellipticity bound is M0 := M kbkH 1 (Ω) , where the function M (r) is from (10.38) and b ∈ H 1 (Ω) is the element for which hb, viH 1 (Ω) =
Z
∂Ω
(v ∈ H 1 (Ω)).
α(x)˜ uv dσ
Here we have kbkH 1 (Ω) = ≤
sup kvkH 1 (Ω) =1
sup kvkH 1 (Ω) =1
R
(
hb, viH 1 (Ω) = 1/2
∂Ω
α˜ u2 dσ)
R
(
sup kvkH 1 (Ω) =1 1/2
∂Ω
αv 2 dσ)
R
∂Ω
≤(
α˜ uv dσ
R
1/2
∂Ω
α˜ u2 dσ)
= kα1/2 u˜kL2 (∂Ω) .
(10.40) To sum up, we will altogether use the following value of M0 in the iteration: ˜ 1/2 u˜k3L2 (∂Ω) + sup∂Ω α , M0 = 1 + 3 sup σ Kkα inf ∂Ω α Ω
(10.41)
10.5. RADIATIVE COOLING
347
˜ is calculated from (10.39) with Cα = kα−1/2 kL2 (∂Ω) +1 and Kα = (1/ inf ∂Ω α)+ where K √ 2Cα . (c) The algorithm In this paragraph we give the algorithmic form of the proposed method. We consider a fixed mesh, i.e. the algorithm is given in a fixed FEM subspace Vh ⊂ H 1 (Ω). Let M0 be defined by (10.41). Then the algorithm is as follows: (a) u0 ≡ 0; for any n ∈ N : if un ∈ Vh is obtained, then (b1) wn ∈ Vh is the solution of the problem Z Z κ(x) ∇w · ∇v + α(x)wn v dσ n Ω ∂Ω Z Z 3 = σ(x)|un | un v − α(x)˜ uv dσ Ω ∂Ω M0 − 1 2 un+1 = un − wn . (b2)
M0 + 1
(10.42) (v ∈ Vh );
M0 + 1
The auxiliary linear problems in step (b1) are solved by one of the methods quoted in section 3.3. The convergence of the algorithm is given by the estimate (7.48) in Theorem 7.3, which, in virtue of Remark 8.2, is valid for (10.42) independently of the subspace Vh . Using also the estimate (10.40), we obtain kun − uh kH 1 (Ω) ≤ kα1/2 u˜kL2 (∂Ω)
M0 − 1 M0 + 1
n
(n ∈ N)
(10.43)
with M0 defined in (10.41). The obtained convergence estimate is mesh independent, since it only contains data from the original problem before discretization. Remark 10.2 If we apply Newton’s method instead of the above simple iteration, then the 3rd type boundary conditions can be incorporated in the preconditioner in a similar way. In this case the modification of the algorithm (9.38) is used such that the auxiliary problem in step (b1) is Z h Ω
i
κ(x) ∇pn · ∇v + 3σ(x)|un |3 pn v +
= −
Z h Ω
3
Z
∂Ω
i
κ(x) ∇un · ∇v + σ(x)|un | un v −
α(x)pn v dσ Z
∂Ω
α(x)(un − u˜)v dσ
(v ∈ Vh ).
348CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS The Newton iteration admits a mesh independent quadratic convergence estimate in the neighbourhood of the solution. Namely, the nonlinearity q(x, ξ) = σ(x)|ξ|3 ξ satisfies (7.102) in Remark 7.12 with constants Cq = 3 supΩ σ, p = 4, hence, using (7.103), F ′ is locally Lipschitz continuous according to the definition (5.63) with 5 ˜ L(r) = 3 sup σ K5,Ω r2 . Ω
By Remark 5.17, paragraph (c), and in virtue of the present lower bound λ = m = 1, this gives the Lipschitz constant ˜ (2kF (u0 ) − bkH 1 (Ω) + ku0 kH 1 (Ω) ) L=L w.r. to the initial guess u0 . Now with our choice u0 = 0 and using (10.39)–(10.40), we obtain ˜ (2kbkH 1 (Ω) ) ≤ L ˜ (2kα1/2 u˜kL2 (∂Ω) ) ≤ 12 sup σ K ˜ kα1/2 u˜k2L2 (∂Ω) . L=L Ω
Then Remark 5.15 yields that now the Newton iteration has the quadratic convergence factor L ˜ kα1/2 u˜k2L2 (∂Ω) kF (un0 ) − bkH 1 (Ω) , q = kF (un0 ) − bkH 1 (Ω) ≤ 6 sup σ K (10.44) 2 Ω where n0 is the index for which the linearly convergent initial part of the damped Newton sequence arrives in the neighbourhood of quadratic convergence. This linear estimate can be done using Theorem 5.11 similarly as above for the simple iteration, and is left to the reader. We only illustrate here the special case when the coefficients are small enough, namely, when the right side estimate in (10.45) holds. Namely, in this case we have n0 = 0 in (10.44) and hence, using (10.40), we obtain ˜ kα1/2 u˜k3 2 q ≤ 6 sup σ K L (∂Ω) < 1 .
(10.45)
Ω
That is, by (5.57) we obtain the convergence estimate 1/2n ˜ kα1/2 u˜k3L2 (∂Ω) kun − u∗ kH 1 (Ω) ≤ 6 sup σ K Ω
for the Sobolev space Newton method. By Theorem 9.5, this gives a computable mesh independent estimate for Newton’s method in an arbitrary FEM subspace Vh . (d) Conclusions We have considered two iterative algorithms for problem (10.36), which show how the boundary conditions can be incorporated into the preconditioner via suitable boundary terms. We have discussed in detail the gradient–finite element method (GFEM) combined with a suitably modified linear principal part preconditioner. The main advantage of this algorithm is that it uses a fixed preconditioner which has not to be updated, hence its construction (involving boundary integration as well) is executed only once. Nevertheless, it achieves mesh independent linear convergence which can be a priori estimated using analytic constants calculated from the coefficients. We have also referred to the NFEM in this context, in which case the a priori mesh independent estimation of the quadratic convergence factor has also been illustrated.
10.6. ELECTROSTATIC POTENTIAL IN A BALL
10.6
349
Electrostatic potential in a ball
(a) The problem We consider problem (1.46): −∆u + eu = 0 u |∂B = 0
(10.46)
on the ball B = B(0, R) ⊂ R3 with radius R. The function u describes the electrostatic potential (see section 1.5). By Proposition 6.4, problem (10.46) has a unique classical solution u∗ ∈ C 2 (B). Moreover, u∗ is radially symmetric [131]. (b) The proposed method In contrast to the previous sections, this last example uses no discretization. Namely, the proposed method for (10.46) is the direct gradient method for the Laplacian preconditioner, defined in section 9.3, part (b). This method realizes literal Sobolev space preconditioning, using the special form of the problem. That is, the Laplacian can be inverted exactly for radially symmetric polynomials on B, hence discretization is avoided and the iteration is applied directly in the Sobolev space H01 (B). The direct realization is due to keeping the iteration in the class of radially symmetric polynomials P = {
l X
m=0
am r2m : l ∈ N, am ∈ R},
where r = |x| for x ∈ B,
using stepwise polynomial approximation of eun . Let us first define the iteration without the polynomial approximation of eun . For this we introduce eu (u ≤ 0) f (u) := 1 + u (u > 0). If u solves (10.46), then ∆u ≥ 0 and the maximum principle [249] implies that u ≤ 0. Hence (10.46) is equivalent to −∆u + f (u) = 0 u |∂B = 0.
(10.47)
In this problem the inequality 0 ≤ f ′ (u) ≤ 1 implies by Theorem 7.2 that we obtain the ellipticity bounds m = 1, M = 1 + ̺−1 (10.48) for the Laplacian preconditioner, independently of the initial guess, where ̺ > 0 is the smallest eigenvalue of −∆ on H 2 (B) ∩ H01 (B). The latter is known explicitly: π ̺= R
2
,
(10.49)
350CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS since the first positive root of the spherical Bessel function j0 (r) = sin r/r is π (see 2̺ 2 [1]). Then M +m = 2̺+1 . Hence the corresponding iteration in strong form is given by un+1 = un −
2̺ zn , 2̺ + 1
where − ∆zn = −∆un + f (un ),
(10.50)
zn|∂B = 0.
Setting wn := zn − un as in (8.45), the iteration takes the simpler form un+1 =
1 2̺ un − wn , 2̺ + 1 2̺ + 1
where − ∆wn = f (un ),
wn|∂B = 0.
(10.51) (10.52)
Using the maximum principle again, we obtain that wn ≥ 0. Hence, letting u0 ≤ 0, we have by induction that un ≤ 0 for all n ∈ N. Hence f (un ) can be replaced again by the original eun . (We note that, similarly to section 10.1, the above simple iteration produces the same order for the number of iterations to achieve a prescribed error as Newton’s method if the Laplacian preconditioner were applied in inner iterations for the outer Newton steps. See also Remark 9.11.) Now we turn to the approximated algorithm using radially symmetric polynomials. Namely, in each step of (10.52) we approximate f (un ) = eun by a suitable Taylor polynomial kn X ujn p(un ) = , (10.53) j=0 j! and define the corresponding next iterate by un+1 =
1 2̺ un − wn , 2̺ + 1 2̺ + 1
where − ∆wn = p(un ),
wn|∂B = 0.
(10.54) (10.55)
In this case, if un is a radial polynomial (i.e. un ∈ P), then p(un ) ∈ P. Further, wn ∈ P and it is elementary to determine wn : if p(un ) =
ln X
am r2m
m=0
(r ∈ [−R, R]),
(10.56)
then (10.55) is equivalent to 1 ∂ ∂wn − 2 r2 r ∂r ∂r
!
=
ln X
am r2m ,
wn (−R) = wn (R) = 0
m=0
and its solution wn ∈ P is wn (r) =
ln X
am (R2m+2 − r2m+2 ) . m=0 (2m + 3)(2m + 2)
(10.57)
10.6. ELECTROSTATIC POTENTIAL IN A BALL
351
Then, by induction, if u0 ∈ P, then for all n ∈ N we have un ∈ P and the Poisson equations (10.55) are solved by (10.57). The theoretical iteration (10.51)–(10.52) converges according to the estimate (7.13). −m 1 Since now (10.48) gives M = 2̺+1 , we obtain M +m kun − u∗ kH01 (B) ≤ ̺−1/2 k − ∆u0 + eu0 kL2 (B)
1 2̺ + 1
!n
(n ∈ N) .
(10.58)
For simplicity we choose u0 ≡ 0, in which case k − ∆u0 + eu0 kL2 (B) = |B|1/2 , where |B| = 4R3 π/3 is the volume of B.
The convergence of the iteration (10.54)–(10.55) is achieved by the suitable choice of p(un ). Namely, for given un let wn∗ and wn denote the solution of (10.52) and (10.55), respectively. Then we have the estimate kwn∗ −wn kH01 (B)
≤̺
−1/2
un
ke −p(un )kL2 (B)
|B| 1/2 kun kk∞n +1 |B| 1/2 un ) ke −p(un )k∞ ≤ ( ) . ≤( ̺ ̺ (kn + 1)!
Further, it is easy to see that kzn∗ − zn kH01 (B) = kwn∗ − wn kH01 (B) , where zn∗ and zn are the solutions of (10.50) and its polynomial approximation, respectively, since un = zn − wn is already a polynomial. Hence kzn∗
− zn kH01 (B)
|B| 1/2 kun kk∞n +1 ) . ≤( ̺ (kn + 1)!
(10.59)
Then we can use Theorem 9.2 to ensure that the iteration arrives in a prescribed neighbourhood of the solution. Namely, let the indices kn be chosen such that the right-hand side of (10.59) is bounded by some fixed ε (independent of n) throughout the iteration. Then, using that m = 1, the iteration (10.54)–(10.55) satisfies (10.58) up to accuracy ε. Remark 10.3 We note that the degrees of the polynomials un may grow very fast in the above iteration, and at the same time the high-index coefficients will be very small. It may spare memory to drop the small enough terms of high index within some given accuracy δ > 0, in which case the latter is added the value of ε in the final accuracy. Namely, if the nth iterate un is the polynomial un (r) =
sn X
am r2m ,
(10.60)
m=0
then for any index tn ≤ sn we have the estimate k
sn X
m=tn +1
am r2m k2H01 (B) ≤ 4πsn
sn X
m=tn +1
4m2 a2m
R4m+1 4m + 1
(10.61)
352CHAPTER 10. SOME NUMERICAL ALGORITHMS FOR NONLINEAR ELLIPTIC PROBLEMS (obtained from elementary integration). Hence, letting tn ≤ sn be the smallest index for which sn 2 X R 4πsn 2mam R2m ≤ δ2 (10.62) 4m + 1 m=tn +1 and defining tn X
am r2m ,
(10.63)
kun − un kH01 (B) ≤ δ.
(10.64)
un (r) =
m=0
we obtain the estimate
(c) The algorithm

In this paragraph we give the algorithmic form of the proposed method. First we define

    \varrho = \left( \frac{\pi}{R} \right)^2        (10.65)

from (10.49). We fix a tolerance ε > 0, which will be the accuracy of the algorithm in H^1_0(B) norm, and let

    \omega = \varepsilon \left( \frac{\varrho}{|B|} \right)^{1/2},        (10.66)

where |B| = 4R^3\pi/3 is the volume of B.

The proposed method constructs a sequence of radial polynomials via (10.53)–(10.55), in which the indices kn are chosen such that in each step the right-hand side of (10.59) is bounded by ε. Then the algorithm reads as follows:

    (a)    u_0 \equiv 0;

    (b1)   for any n \in \mathbb{N}: if u_n \in P is obtained, then let
           \mu_n = \max_B |u_n|;  k_n \in \mathbb{N} the smallest number such that \frac{\mu_n^{k_n+1}}{(k_n+1)!} \le \omega;

    (b2)   p(u_n)(r) = \sum_{j=0}^{k_n} \frac{u_n(r)^j}{j!}   \qquad (r \in [-R, R]);

    (b3)   w_n \in P is the solution of the problem -\Delta w_n = p(u_n), \ w_n|_{\partial B} = 0,
           using formula (10.57);

    (b4)   u_{n+1} = \frac{1}{2\varrho+1} \bigl( u_n - 2\varrho\, w_n \bigr).        (10.67)
We emphasize that the solution of the auxiliary Poisson equations in step (b3) is achieved exactly, using formula (10.57).

The convergence of the algorithm is given by Theorem 9.2. Namely, step (b1) in (10.67) and the value of ω in (10.66) have been defined such that the right-hand side of (10.59) is bounded by the prescribed number ε. Then Theorem 9.2 yields that the sequence (10.67) follows the estimate (10.58) of the theoretical iteration (10.51)–(10.52) up to accuracy ε:

    \| u_n - u^* \|_{H^1_0(B)} \le \left( \frac{|B|}{\varrho} \right)^{1/2} \left( \frac{1}{2\varrho+1} \right)^n + \varepsilon   \qquad (n \in \mathbb{N}).        (10.68)
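As a quick sanity check (our computation, using the data of the example in paragraph (d) below, where (|B|/\varrho)^{1/2} \approx 3.6853 and \varepsilon = 10^{-6}): the first term of (10.68) drops below \varepsilon as soon as

    n \ \ge\ \frac{\ln\bigl( (|B|/\varrho)^{1/2} / \varepsilon \bigr)}{\ln (2\varrho + 1)} = \frac{\ln (3.6853 \cdot 10^{6})}{\ln 5.9348} \approx 8.5,

i.e. after about nine steps, in accordance with the residual errors reported in Table 10.4.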
As noted in Remark 10.3, the polynomials un may contain a large number of high-index terms whose coefficients are almost zero. These terms are dropped automatically when their coefficients are below the roundoff accuracy. However, we may spare memory by also dropping some of the small terms larger than the roundoff accuracy. This procedure can be achieved as described in Remark 10.3 within some given tolerance δ > 0, using the estimate (10.62) to define in each step a truncation of the polynomial un in the form (10.63). In this case the algorithm (10.67) is completed by a step (b5) which finds the decreased index t_{n+1} and by a step (b6) which redefines u_{n+1} with terms up to index t_{n+1} only. This completion is given below, using the polynomial form of u_{n+1} in step (b4), and for simplicity using the same notation u_{n+1} for the redefined function in step (b6).

    (b4)   u_{n+1}(r) = \sum_{m=0}^{s_{n+1}} a_m r^{2m};

    (b5)   t_{n+1} \in \mathbb{N} is the smallest index such that

              4\pi s_{n+1} \sum_{m=t_{n+1}+1}^{s_{n+1}} \bigl( 2 m a_m R^{2m} \bigr)^2 \frac{R}{4m+1} \le \delta^2;        (10.69)

    (b6)   u_{n+1}(r) = \sum_{m=0}^{t_{n+1}} a_m r^{2m}.

The corresponding error estimate is obtained by adding δ to the right-hand side of (10.68).
The corresponding error estimate is obtained by adding δ to the right-hand side of (10.68). (d) An example We present a test result for problem (10.46) on the ball B = B(0, 2) ⊂ R3 with radius R = 2, and in the experiment we set ε = 10−6 as the prescribed tolerance in the algorithm. We consider the truncated version of the algorithm with truncation accuracy δ = ε = 10−6 .
The numbers required in (10.65) and (10.66) are now

    \rho = \left( \frac{\pi}{2} \right)^2 \approx 2.4674,  \qquad  \omega = \varepsilon \left( \frac{3\rho}{32\pi} \right)^{1/2} \approx 2.7135 \cdot 10^{-7}.
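These values can be reproduced directly (our snippet; the ASCII names rho, eps and omega stand for the symbols above):

    (* Evaluating (10.65) and (10.66) for R = 2 and eps = 10^-6. *)
    R = 2; eps = 10^-6;
    rho = N[(Pi/R)^2]                    (* 2.4674 *)
    omega = N[eps Sqrt[3 rho/(32 Pi)]]   (* 2.7135*10^-7 *)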
Hence the convergence factor in the estimate (10.68) is

    \frac{1}{2\rho + 1} \approx 0.168498.

Further, with the proposed choice u0 ≡ 0, the values in step (b1) of algorithm (10.67) for n = 0 are µ0 = k0 = 0. To avoid the powers 0^0, these are defined before starting the iteration together with the explicitly given functions w0(r) = (R^2 − r^2)/6 and u1(r) = −(2ρ/(2ρ+1)) w0(r). The sequence (un) is constructed via the described truncation procedure, i.e. the algorithm consists of the steps (b1)–(b6) given in (10.67) and (10.69). Using the above data, the algorithm is completely defined.

The experiment has been carried out using Mathematica¹ and was provided by L. Lóczi (ELTE University, Budapest). The brevity and simplicity of the code show that the direct gradient method involves no difficulty in the implementation, and therefore the code is worth including here.

(a) The initial data for the input in the code are the following:

    u_0[r_] = 0,  µ_0 = 0,  k_0 = 0;
    Pu_0[r_] = 1,  w_0[r_] = (1/6)(R^2 - r^2),  u_1[r_] = 1/(2ρ+1) (u_0[r] - 2ρ w_0[r]).

¹ Copyright 1988-2000 Wolfram Research, Inc.
(b) The iteration is coded as follows:

    Do[
      µ_iter = -FindMinimum[-Abs[u_iter[r]], {r, R/100, -R/200, -R, R}][[1]];
      k_iter = 0;  While[µ_iter^(k_iter+1)/(k_iter+1)! > ω, k_iter++];
      Pu_iter[r_] = Sum[(u_iter[r])^j/j!, {j, 0, k_iter}];
      coefflist = CoefficientList[Pu_iter[r], r^2]//N;
      length = Length[coefflist];
      w_iter[r_] = -Sum[coefflist[[m+1]]/((2m+3)(2m+2)) (r^(2m+2) - R^(2m+2)),
        {m, 0, length-1}];
      u_{iter+1}[r_] = 1/(2ρ+1) (u_iter[r] - 2ρ w_iter[r])//Expand;
      chopcoefflist = CoefficientList[u_{iter+1}[r], r^2]//N;
      choplength = Length[chopcoefflist];
      δ = ε;  counter = choplength-1;  tailsum = 0;
      While[counter ≥ 0 && 4π choplength tailsum ≤ δ^2,
        tailsum += (2 counter chopcoefflist[[counter+1]] R^(2 counter))^2 R/(4 counter+1);
        counter--];
      counter += 2;
      u_{iter+1}[r_] = u_{iter+1}[r][[Range[counter]]],
      {iter, 1, n_max}]                                            (10.70)
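The first step of the loop can be checked against the closed form u1 = −(2ρ/(2ρ+1)) w0 given above (our verification snippet, with ASCII names):

    (* Coefficients of u_1 in powers of r^2; compare Table 10.5 below. *)
    R = 2; rho = (Pi/R)^2;
    u1[r_] = -(2 rho/(2 rho + 1)) (R^2 - r^2)/6;
    CoefficientList[u1[r], r^2] // N    (* {-0.554335, 0.138584} *)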
In (10.70), n_max is the maximal number of iterations allowed. The actual number of iterations is determined by the measure of error. For this we estimate the theoretical errors E_n = \| u_n - u^* \|_{H^1_0(B)} in two ways. First, the a priori estimate yields

    E_n \le e_n \equiv \left( \frac{|B|}{\varrho} \right)^{1/2} \left( \frac{1}{2\varrho+1} \right)^n + 2\varepsilon,

obtained from adding the truncation error δ = ε to (10.68). Further, using the lower ellipticity bound m = 1 of the operator T(u) = -\Delta u + e^u, an estimate analogous to (9.55) implies that

    \| u_n - u^* \|_{H^1_0(B)} \le r_n \equiv \varrho^{-1/2} \| -\Delta u_n + e^{u_n} \|_{L^2(B)},

which is the a posteriori residual estimate for E_n. Using the data of our example, we obtain

    E_n \le \min \{ e_n, r_n \}   \qquad (n \in \mathbb{N})

with

    e_n = 3.6853 \cdot 0.168498^n + 2 \cdot 10^{-6}        (10.71)

and

    r_n = 0.6366 \cdot \| -\Delta u_n + e^{u_n} \|_{L^2(B)}.        (10.72)

We also calculate

    d_n \equiv \| u_{n+1} - u_n \|_{H^1_0(B)},        (10.73)

i.e. the difference between consecutive terms, as usual. The following table presents all three error estimates in the iteration. We observe that the residual error r_n is smaller than the a priori estimate e_n; it first decreases below the accuracy ε = 10^{-6} in step 9, and then stabilizes slightly below that value.
    n      0          1          2          3          4          5             6
    e_n    3.6853     0.6209     0.1046     0.01763    0.002972   5.025·10^-4   8.634·10^-5
    r_n    3.6853     0.4298     0.0505     0.00696    0.001082   1.717·10^-4   2.826·10^-5
    d_n    2.4856     0.2740     0.0194     0.00168    0.000179   2.323·10^-5   3.319·10^-6

    n      7            8            9            10           11            12
    e_n    1.621·10^-5  4.394·10^-6  2.403·10^-6  2.067·10^-6  2.011·10^-6   2.001·10^-6
    r_n    5.183·10^-6  1.457·10^-6  8.551·10^-7  7.573·10^-7  7.413·10^-7   7.387·10^-7
    d_n    4.953·10^-7  7.549·10^-8  1.165·10^-8  1.813·10^-9  2.839·10^-10  –
Table 10.4. The errors e_n, r_n and d_n, defined in (10.71), (10.72), (10.73), respectively.

The iterative sequence (un) can be presented in a very simple way, since it consists of polynomials of r which are defined by their coefficients. These coefficients in the radial polynomial form

    u_n(r) = \sum_{m=0}^{s_n} a_m r^{2m}   \qquad (r \in [-R, R])

are given in Table 10.5 for n = 1, 2 and n = 11, 12, respectively. In addition, Figure 10.7 contains the graphs of some of the first terms of the sequence in terms of the variable r. In fact, for the sake of positivity we plot the functions −un. (The rapid convergence is clearly visible.)
            u_1          u_2             ...    u_11            u_12
    a_0     −0.554335    −0.472618       ...    −0.475685       −0.475685
    a_1      0.138584     0.102961       ...     0.103577        0.103577
    a_2                   0.003309       ...     0.003218        0.003218
    a_3                   0.000109       ...     0.000127        0.000127
    a_4                   2.944·10^-6    ...     5.572·10^-6     5.572·10^-6
    a_5                   6.632·10^-8    ...     2.596·10^-7     2.596·10^-7
    a_6                   1.360·10^-9    ...     1.258·10^-8     1.258·10^-8
    a_7                   1.736·10^-11   ...     6.264·10^-10    6.264·10^-10
    a_8                   5.954·10^-13   ...     3.181·10^-11    3.181·10^-11
    a_9                                  ...     1.641·10^-12    1.641·10^-12
    a_10                                 ...     8.576·10^-14    8.576·10^-14
    a_11                                 ...     4.524·10^-15    4.524·10^-15

Table 10.5. The coefficients of the iterative sequence.
We observe that the terms u11 and u12 of the sequence coincide up to the accuracy ε = 10^{-6}. This, together with the corresponding residual error r12 < 10^{-6} in Table 10.4, suggests that we accept u12 ≈ u* as the numerical solution. In order to visualize it, we plot the surface of the two-dimensional function which assigns the same values to the radius as u12. In fact, for the sake of positivity again, we plot this graph for the function −u12 in Figure 10.8.

(e) Conclusions

In this last example we have realized the direct gradient method with the Laplacian preconditioner for problem (10.46). In contrast to the previous sections, in this simple special case the proposed method uses no discretization but instead realizes literal Sobolev space preconditioning for the iteration directly in the space H^1_0(B). This is due to keeping the iteration in the class of radially symmetric polynomials, on which the Laplacian can be inverted exactly. The main advantage of this method is the simplicity of the algorithm (10.67) and (10.69): its straightforward coding has been enclosed above, and the resulting fast linear convergence has been presented in a numerical test example.
Figure 10.7: Some of the first terms of the sequence −un .
Figure 10.8: The graph of the modulus of the numerical solution u12 .
Chapter 11

Appendix

We give some definitions and properties in general normed and Sobolev spaces that are used in the book. Most of the related background is contained in the references given in the corresponding parts of the book. The definitions below are given to fix the terminology unambiguously, and the properties are quoted for the convenience of the reader.
A1. Some definitions in normed spaces
We give some notions of convergence, continuity and differentiability. For more details see e.g. [165, 296].

Definition 11.1 Let H be a real Hilbert space. The sequence (un) ⊂ H converges weakly to u ∈ H if ⟨un, v⟩ → ⟨u, v⟩ for all v ∈ H. We write un ⇀ u.

Definition 11.2 Let H be a real Hilbert space and F : H → H a nonlinear operator.

(a) F is called continuous in finite dimension if for any finite dimensional subspace V ⊂ H, the mapping F|V is continuous.

(b) F is called hemicontinuous if for any u, w, h ∈ H the mapping t ↦ ⟨F(u + tw), h⟩ is continuous from R to R.

(c) Let F be Gateaux differentiable. F′ is called bihemicontinuous if for any u, k, w, h ∈ H the mapping (s, t) ↦ F′(u + sk + tw)h is continuous from R² to H.

Definition 11.3 Let X, Y be Banach spaces, D ⊂ X a subspace, F : D → Y a nonlinear operator.

(a) The operator F is called Gateaux differentiable at u ∈ D if

(i) for any v ∈ D there exists the directional derivative

    \partial_v F(u) = \lim_{t \to 0} \frac{1}{t} \bigl( F(u + tv) - F(u) \bigr) \in Y;

(ii) the mapping v ↦ ∂_v F(u) is a bounded linear operator from D to Y.

(b) The operator F is called Fréchet differentiable (or simply differentiable) at u ∈ D if there exists a bounded linear operator A from D to Y such that

    \lim_{v \to 0} \frac{1}{\|v\|} \, \| F(u + v) - F(u) - Av \| = 0.
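A standard illustration of Definition 11.1 (our example, not from the book): in H = L²(0, π) the sequence u_n(x) = sin(nx) converges weakly but not strongly to 0, since

    \langle u_n, v \rangle = \int_0^\pi v(x) \sin(nx)\, dx \to 0   \qquad (v \in L^2(0,\pi))

by the Riemann–Lebesgue lemma, while \| u_n \|_{L^2(0,\pi)} = (\pi/2)^{1/2} \not\to 0.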
A2. Some properties of Sobolev spaces and elliptic operators

(a) Lower bounds of linear elliptic operators

When one needs a lower bound for a linear elliptic operator S, by Proposition 3.4 one has to estimate the smallest eigenvalue of S. A simple estimate may be achieved via Sobolev type inequalities like (3.18). We give such a result for the Laplacian. For more subtle ways of estimation see e.g. [100, 253].

Proposition 11.1 [265]. Let Ω ⊂ R^N be a bounded domain and ̺ > 0 be the smallest eigenvalue of −∆ on H²(Ω) ∩ H¹₀(Ω). Then

    \varrho \ge N\pi^2 / \mathrm{diam}(\Omega)^2.        (11.1)
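For instance (our check): for the ball B = B(0, 2) ⊂ R³ used in Section 10.6 we have N = 3 and diam(B) = 4, and (11.1) is consistent with the exact smallest eigenvalue (π/R)² used there:

    (* Bound (11.1) vs. the exact eigenvalue for B(0,2) in R^3. *)
    N[3 Pi^2/4^2]   (* 1.85055, the lower bound from (11.1) *)
    N[(Pi/2)^2]     (* 2.4674, the exact smallest eigenvalue *)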
(b) Sobolev embeddings

A detailed description of the embeddings of Sobolev spaces into other Sobolev, L^p or C^k spaces is given in the monograph of Adams [2]. Here we formulate embeddings of H¹(Ω) into L^p spaces, then quote some estimates for the embedding constants.

Let p be a real number satisfying

    2 \le p \ \ (\text{if } N = 2)  \qquad \text{or} \qquad  2 \le p \le \frac{2N}{N-2} \ \ (\text{if } N > 2).        (11.2)

Then there hold the Sobolev embeddings and corresponding estimates

    H^1(\Omega) \subset L^p(\Omega),  \qquad  \| u \|_{L^p(\Omega)} \le K_{p,\Omega} \| u \|_{H^1(\Omega)}   \quad (u \in H^1(\Omega)),        (11.3)

    H^1(\Omega)|_\Gamma \subset L^p(\Gamma),  \qquad  \| u \|_{L^p(\Gamma)} \le K_{p,\Gamma} \| u \|_{H^1(\Omega)}   \quad (u \in H^1(\Omega)|_\Gamma),        (11.4)

where Γ ⊂ ∂Ω is any measurable subsurface, K_{p,Ω} > 0 and K_{p,Γ} > 0 are suitable constants, and H¹(Ω)|_Γ denotes the trace of H¹(Ω) on Γ. For many cases of bounded domains the exact embedding constant has been determined, see [62, 196, 275]. The answer is much more complete for three (and more) dimensions than for two. Here we enclose some estimates in two dimensions that take into account the boundary values of the functions. These two lemmas are found in [169]. In them we assume that the spaces H¹(Ω) and H¹₀(Ω), respectively, are endowed with a norm ‖·‖ satisfying

    \int_\Omega |\nabla u|^2 \le \| u \|^2.        (11.5)
Lemma 11.1 Let Ω = [a, b] × [c, d] ⊂ R² and let the boundary ∂Ω be decomposed into Γ₁ = {a, b} × [c, d] and Γ₂ = [a, b] × {c, d}. Let (11.5) hold. Then for any p_i ≥ 1 (i = 1, 2) there holds

    K_{p_1+p_2,\Omega}^{p_1+p_2} \le \frac{1}{2} \Bigl( K_{p_1,\Gamma_1}^{p_1} + p_1 K_{2(p_1-1),\Omega}^{p_1-1} \Bigr) \Bigl( K_{p_2,\Gamma_2}^{p_2} + p_2 K_{2(p_2-1),\Omega}^{p_2-1} \Bigr),

and for any p ≥ 1

    K_{p,\Gamma_i}^{p} \le \frac{2}{b-a} K_{p,\Omega}^{p} + p \sqrt{2}\, K_{2(p-1),\Omega}^{p-1}.

Lemma 11.2 Let Ω ⊂ R² be a bounded domain with piecewise smooth boundary. Let (11.5) hold and let us consider the Sobolev embedding and corresponding estimate under Dirichlet boundary conditions:

    H^1_0(\Omega) \subset L^p(\Omega),  \qquad  \| u \|_{L^p(\Omega)} \le K_{p,\Omega} \| u \|   \quad (u \in H^1_0(\Omega)).        (11.6)

Then

    K_{p_1+p_2,\Omega}^{p_1+p_2} \le \frac{p_1 p_2}{2}\, K_{2(p_1-1),\Omega}^{p_1-1} K_{2(p_2-1),\Omega}^{p_2-1}.        (11.7)
Examples. We give estimates of the embedding constants K_{p,Ω} of H¹₀(Ω) in (11.6) for two special cases.

(a) (p = 2). Let ̺ > 0 be the smallest eigenvalue of −∆ on H²(Ω) ∩ H¹₀(Ω). Then Remark 3.2 implies that

    K_{2,\Omega}^2 \le \frac{1}{\varrho}.        (11.8)

(b) (p = 4). Lemma 11.2 with p₁ = p₂ = 2 and the estimate (11.8) imply

    K_{4,\Omega}^4 \le \frac{2}{\varrho}.        (11.9)

In particular, using (11.1), we obtain

    K_{4,\Omega}^4 \le 2\, \mathrm{diam}(\Omega)^2 / (N\pi^2).        (11.10)
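As a numerical illustration (ours, not from the book): on the unit square the smallest Dirichlet eigenvalue of −∆ is ̺ = 2π², so (11.9) yields an explicit bound:

    (* Bound (11.9) for the unit square, where rho = 2 Pi^2. *)
    rho = 2 Pi^2;
    N[(2/rho)^(1/4)]    (* K_{4,Omega} <= 0.56419 *)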
(c) Generalized differential operators

The notion of generalized differential operator appears throughout this book and is used for the weak form of a differential operator. For convenience we briefly summarize its introduction and give some explanation. We note that its meaning is further clarified in Remark 7.2 in subsection 7.1.1. (General properties of generalized differential operators, required for the theoretical background, are discussed in section 6.1. Further, see [127, 190, 296] for a related setting.)

For simplicity, let us consider here a Dirichlet problem

    T(u) \equiv -\mathrm{div}\, f(x, \nabla u) = g(x),  \qquad  u|_{\partial\Omega} = 0        (11.11)

with the usual conditions, i.e. f : Ω × R^N → R^N is C¹ w.r. to η ∈ R^N such that its Jacobians ∂f(x, η)/∂η are symmetric and have eigenvalues between two positive constants independent of η; further, let g ∈ L²(Ω). (See e.g. Assumptions 7.1 in subsection 7.1.1.)

The corresponding generalized differential operator F will be defined such that the weak formulation of problem (11.11) can be written as

    \langle F(u), v \rangle = \int_\Omega g v   \qquad (v \in H^1_0(\Omega)).        (11.12)

This operator F will act in the space H¹₀(Ω), which we endow with the inner product

    \langle u, v \rangle_{H^1_0} = \int_\Omega \nabla u \cdot \nabla v   \qquad (u, v \in H^1_0(\Omega)).        (11.13)

One can prove the existence of the operator F : H¹₀(Ω) → H¹₀(Ω) defined by the equality

    \langle F(u), v \rangle_{H^1_0} = \int_\Omega f(x, \nabla u) \cdot \nabla v   \qquad (u, v \in H^1_0(\Omega)).        (11.14)

Namely, let u ∈ H¹₀(Ω) be fixed. Then the integral on the right-hand side of (11.14) defines a bounded linear functional in v on H¹₀(Ω). (This is verified e.g. in Theorem 6.1 in a more general setting.) Hence, by the Riesz representation theorem, there exists an element of H¹₀(Ω), denoted by F(u), which represents this bounded linear functional w.r. to the inner product (11.13). That is, (11.14) holds. Therefore it is correct to give the following

Definition 11.4 The generalized differential operator corresponding to problem (11.11) is the operator F : H¹₀(Ω) → H¹₀(Ω) defined by the equality (11.14).

Then problem (11.12) is in fact written as

    \langle F(u), v \rangle_{H^1_0} = \int_\Omega g v   \qquad (v \in H^1_0(\Omega)).        (11.15)

We note that it may seem unusual to use the same notation ⟨·, ·⟩_{H¹₀} in (11.14) to denote the duality pairing as was used in (11.13) to denote the H¹₀(Ω) inner product. (A more widespread way of representing functionals on H¹₀(Ω) is the space H⁻¹(Ω).) It is nevertheless correct to do this, as shown by the above sketch of the proof of the existence of the operator F. In other words, the underlying property for this coincidence is that the dual of a real Hilbert space is the space itself, i.e. the functionals in the dual space can be given as elements of the Hilbert space. (The fact that the element F(u) is a function in H¹₀(Ω) can be seen more visually in the regular case, which is discussed in Remark 7.2.)
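To make the Riesz representation above concrete in the discrete setting (our illustration, with standard finite element notation not introduced in this appendix): if V_h = span{φ₁, ..., φ_N} ⊂ H¹₀(Ω) is a finite element subspace, then the V_h-projection F_h(u) = Σ_j c_j φ_j of F(u) is obtained from the linear system

    \sum_{j=1}^{N} \Bigl( \int_\Omega \nabla \varphi_j \cdot \nabla \varphi_i \Bigr) c_j = \int_\Omega f(x, \nabla u) \cdot \nabla \varphi_i   \qquad (i = 1, \dots, N),

whose matrix is the stiffness (Gram) matrix of the inner product (11.13). Evaluating the generalized differential operator thus amounts to one auxiliary Poisson-type solve, which is the discrete counterpart of the preconditioned formulations used throughout the book.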
Bibliography

[1] Abramowitz, M., Stegun, I. A. (eds.), "Spherical Bessel Functions", §10.1 in Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing, Dover, New York, pp. 437-442, 1972.

[2] Adams, R.A., Sobolev Spaces, Academic Press, 1975.

[3] Agmon, S., Lectures on elliptic boundary value problems, D. van Nostrand Co., 1965.

[4] Ainsworth, M., A preconditioner based on domain decomposition for h-p finite element approximation on quasi-uniform meshes, SIAM J. Numer. Anal., 33 (1996), 1358-1376.

[5] Allgower, E. L., Böhmer, K., Potra, F. A., Rheinboldt, W. C., A mesh-independence principle for operator equations and their discretizations, SIAM J. Numer. Anal. 23 (1986), no. 1, 160-169.

[6] Axelsson, O., On some numerical methods for nonlinear field problems, Numerical Methods in Electrical and Magnetic Field Problems, ICCAD, S. Margherita (Italy), June 1-4, 1976.

[7] Axelsson, O., On global convergence of iterative methods, in: Iterative solution of nonlinear systems of equations, pp. 1-19, Lecture Notes in Math. 953, Springer, 1982.

[8] Axelsson, O., A mixed variable finite element method for the efficient solution of nonlinear diffusion and potential flow equations, in: Advances in multi-grid methods, Notes on numerical fluid mechanics, Vol. 11 (eds. Braess, D. et al.), pp. 1-11, Braunschweig, 1985.

[9] Axelsson, O., A generalized conjugate gradient least square method, Numer. Math. 51 (1987), 209-227.

[10] Axelsson, O., On mesh independence and Newton-type methods, Appl. Math. 38 (1993), no. 4-5, 249-265.

[11] Axelsson, O., Iterative solution methods, Cambridge Univ. Press, 1994.

[12] Axelsson, O., On iterative solvers in structural mechanics; separate displacement orderings and mixed variable methods, Math. Comput. Simulation 50 (1999), no. 1-4, 11-30.

[13] Axelsson, O., Optimal preconditioners based on the rate of convergence estimates of the conjugate gradient method, Numer. Funct. Anal. Optim. 22 (2001), no. 3-4, 277-302.
[14] Axelsson, O., Barker, V.A., Finite Element Solution of Boundary Value Problems, Academic Press, 1984. [15] Axelsson, O., Chronopoulos, A.T., On nonlinear generalized conjugate gradient methods, Numer. Math. 69 (1994), No. 1, 1-15. ´ I., Kara ´tson J., Sobolev space preconditioning for Newton’s [16] Axelsson, O., Farago method using domain decomposition, Numer. Lin. Alg., 9 (2002). [17] Axelsson, O., Gustafsson, I., An efficient finite element method for nonlinear diffusion problems, Bull. Greek Math. Soc., 32 (1991), pp. 45-61. [18] Axelsson, O., Kaporin, I., Minimum residual adaptive multilevel finite element procedure for the solution of nonlinear stationary problems, SIAM J. Numer. Anal. 35 (1998), no. 3, 1213–1229 (electronic). [19] Axelsson, O., Kaporin, I., On the sublinear and superlinear rate of convergence of conjugate gradient methods. Mathematical journey through analysis, matrix theory and scientific computation (Kent, OH, 1999), Numer. Algorithms 25 (2000), no. 1-4, 1–22. ´tson J., Double Sobolev gradient preconditioning for elliptic [20] Axelsson, O., Kara problems, Report 0016, Dept. Math., University of Nijmegen, April 2000. ´tson J., Symmetric part preconditioning for the conjugate gra[21] Axelsson, O., Kara dient method in Hilbert space, Report 0204, Dept. Math., Univ. Nijmegen, February 2002. ´tson J., On the rate of convergence of the conjugate gradient [22] Axelsson, O., Kara method for linear operators in Hilbert space, to appear in Numer. Funct. Anal. [23] Axelsson, O., Layton, W., A two-level discretization of nonlinear boundary value problems, SIAM J. Numer. Anal. 33 (1996), no. 6, 2359–2374. [24] Axelsson, O., Maubach, J., On the updating and assembly of the Hessian matrix in finite element methods, Comp. Meth. Appl. Mech. Engrg., 71 (1988), pp. 41-67. [25] Axelsson, O., Margenov, S., An optimal order multilevel preconditioner with respect to problem and discretization paramaters, Report 0015, Dept. Math., University of Nijmegen, April 2000. ¨vert, U., On a graphical package for nonlinear partial differential [26] Axelsson, O., Na equation problems, Information processing 77 (ed. B. Gilchrist), 103-108, IFIP, NorthHolland, 1977. [27] Axelsson, O., Nikolova, M., A generalized conjugate gradient minimum residual method (GCG-MR) with variable preconditioners and a relation between residuals of the GCG-MR and GCG-OR methods, Commun. Appl. Anal. 1 (1997), no. 3, 371–388. [28] Axelsson, O., Neytcheva, M., Polman, B., The bordering method as a preconditioning method (in Russian), Vestnik Moskov. Univ. Ser. XV Vychisl. Mat. Kibernet. 1996, , no. 1, 3–25, 81; translation in Moscow Univ. Comput. Math. Cybernet. 1996, no. 1, 1–22
[29] Axelsson, O., Padiy, A., On the additive version of the algebraic multilevel iteration method for anisotropic elliptic problems, SIAM J. Sci. Comput. 20 (1999), no. 5, 1807– 1830 (electronic). [30] Axelsson, O., Vassilevski, P.S., A survey of multilevel preconditioned iterative methods, BIT 29 (1989), no. 4, 769–793. [31] Axelsson, O., Vassilevski, P.S., Algebraic multilevel preconditioning methods I., Numer. Math. 56 (1989), pp. 157-177. [32] Axelsson, O., Vassilevski, P.S., Algebraic multilevel preconditioning methods II., SIAM J. Numer. Anal. 27 (1990), pp. 1569-1590. [33] Al-Baali, M., Fletcher, R., On the order of convergence of preconditioned nonlinear conjugate gradient methods, SIAM J. Sci. Comput., 17 (1996), No.3, 658-665. [34] Baker, C.T., Phillips, A., The numerical solution of nonlinear problems, Clanderon Press, Oxford, 1981. [35] Bank, R.E., Rose, D.J., Global approximate Newton methods, Numer. Math. 37 (1981), 279-295. [36] Bank, R.E., Rose, D.J., An O(n2 ) method for solving constant coefficient boundary value problems in two dimensions, SIAM J. Numer. Anal. 12 (1975), no. 4, 529–540. [37] Bank, R.E., Rose, D.J., Marching algorithms for elliptic boundary value problems. I. The constant coefficient case, SIAM J. Numer. Anal. 14 (1977), no. 5, 792–829. [38] Bank, R.E., Marching algorithms for elliptic boundary value problems. II. The variable coefficient case, SIAM J. Numer. Anal. 14 (1977), no. 5, 950–970. [39] Bartels, R., Daniel, J. W., A conjugate gradient approach to nonlinear elliptic boundary value problems in irregular regions, in Conference on the Numerical Solution of Differential Equations (Univ. Dundee, Dundee, 1973), pp. 1–11. Lecture Notes in Math., Vol. 363, Springer, Berlin, 1974. [40] Bazeley, G.P., Cheung, Y.K., Irons, B.M., Zienkiewicz, O.C., Triangular elements in plate bending, Wright Paterson, I., 1965. `gue, C., Glowinski, R., Pe ´riaux, J., D´etermination d’un op´erateur de [41] Be pr´econditionnement pour la r´esolution it´erative du probleme de Stokes dans la formulation d’Helmholtz (in French, English summary), [A preconditioning operator for the iterative solution of the Stokes problem via the Helmholtz formulation], C. R. Acad. Sci. Paris S´er. I Math. 306 (1988), no. 5, 247–252. [42] Birkhoff, G., Lynch, R.E., Numerical solution of elliptic problems, SIAM Studies in Applied Mathematics 6, SIAM, Philadelphia, 1984. [43] Bjorstad, P. E., Fast numerical solution of the biharmonic Dirichlet problem on rectangles, SIAM J. Numer. Anal. 20 (1983), no. 1, 59–71. [44] Bjorstad, P. E., Tjostheim, B. P., Efficient algorithms for solving a fourth-order equation with the spectral-Galerkin method, SIAM J. Sci. Comput. 18 (1997), no. 2, 621–632.
[45] Blaheta, R., Multilevel Newton methods for nonlinear problems with applications to elasticity, Copernicus 940820, Technical report. [46] Blaheta, R., Adaptive composite grid methods for problems of plasticity, Modelling ’98 (Prague), Math. Comput. Simulation 50 (1999), no. 1-4, 123–134. [47] Blaheta, R., Numerical methods in elasto-plasticity, Documenta Geonica, Peres, Prague, 1999. [48] Bornemann, F.A., Erdmann, B., Kornhuber, R., A posteriori error estimates for elliptic problems in two and three space dimensions, SIAM J. Numer. Anal. 33 (1996), no. 3, 1188–1204. ¨ rgers, C., Widlund, O. B., On finite element domain imbedding methods, SIAM [49] Bo J. Numer. Anal. 27 (1990), no. 4, 963–978. [50] Boyd, J. P., Chebyshev and Fourier spectral methods (2nd edition), Dover Publications, Mineola, 2001. [51] Bramble, J. H., Pasciak, J.E., Preconditioned iterative methods for nonselfadjoint or indefinite elliptic boundary value problems, in Unification of finite element methods, 167–184, North-Holland Math. Stud. 94, North-Holland, Amsterdam, 1984. [52] Bramble, J. H., Pasciak, J.E., Vassilevski, P. S., Computational scales of Sobolev norms with application to preconditioning, Math. Comp. 69 (2000), no. 230, 463–480. [53] Bramble, J. H., Pasciak, J.E., Xu, J., Parallel multilevel preconditioners, Math. Comp. 55 (1990), no. 191, 1–22. [54] Bramble, J. H., Pasciak, J.E., Xu, J., The analysis of multigrid algorithms with nonnested spaces or noninherited quadratic forms, Math. Comp. 56 (1991), no. 193, 1–34. [55] Bramble, J. H., Pasciak, J.E., Zhang, X., Two-level preconditioners for 2mth order elliptic finite element problems, East-West J. Numer. Math. 4 (1996), no. 2, 99–120. [56] Brezzi, F., Raviart, P.A., Mixed Finite Element Method for 4th Order Elliptic Equations, in: Topics in Numerical Analysis III (ed.: J.Miller), Academic Press, 1998. ´riaux, J., Perrier, P., Pironneau, O., [57] Bristeau, M. O., Glowinski, R., Pe Poirier, G., Finite element methods for transonic flow calculations, in Advances in computational transonics, 703–731, Recent Adv. Numer. Methods Fluids 4, Pineridge, Swansea, 1985. [58] Britton, N. F., Reaction-diffusion equations and their applications to biology, Academic Press, London, 1986. [59] Browder, F. E., Petryshyn, W. V., Construction of fixed points of nonlinear mappings in Hilbert space, J. Math. Anal. Appl. 20 (1967), 197–228. [60] Bruaset, A. M., A survey of preconditioned iterative methods, Pitman Research Notes in Mathematics Series 328, Longman Scientific and Technical, Harlow, 1995.
[61] Brugnano, L., Marrone, M., Vectorization of some block preconditioned conjugate gradient methods, Parallel Comput. 14 (1990), no. 2, 191–198. [62] Burenkov, V.I., Gusakov, V.A., On exact constants in Sobolev embeddings III., Proc. Stekl. Inst. Math. 204 (1993), No. 3., 57-67. [63] Busuioc, V., Cioranescu, D., On a class of electrorheological fluids. Contributions in honor of the memory of Ennio De Giorgi, Ricerche Mat. 49 (2000), suppl., 29–60. [64] Buzbee, B. L., Golub, G. H., Nielson, C. W., On direct methods for solving Poisson’s equations, SIAM J. Numer. Anal. 7 (1970) 627–656. [65] Cai, X., Dryja, M., Domain decomposition methods for monotone nonlinear elliptic problems, Domain decomposition methods in scientific and engineering computing, 2127, Contemp. Math. 180, AMS, Providence, 1994. [66] Cai, Z., Mandel, J., McCormick, S., Multigrid methods for nearly singular linear equations and eigenvalue problems, SIAM J. Numer. Anal. 34 (1997), no. 1, 178–200. [67] Canuto, C., Hussaini, M. Y., Quarteroni, A., Zang, T. A., Spectral methods in fluid dynamics, Springer Series in Computational Physics, Springer, 1988. [68] Carey, G.F., Jiang, B.-N., Nonlinear preconditioned conjugate gradient and leastsquares finite elements, Comp. Meth. Appl. Mech. Engrg., 62 (1987), pp. 145-154. ´a, J. Optimization - theory and algorithms, Bombay, 1978. [69] Ce [70] Chen, Z., Ewing, R., Lazarov, R. D., Maliassov, S., Kuznetsov, Y.A., Multilevel preconditioners for mixed methods for second order elliptic problems, Numer. Linear Algebra Appl. 3 (1996), no. 5, 427–453. [71] Chiorescu, Gh., Plane boundary value problems in the nonlinear generalized theory of rods II. The resolution of a boundary value problem by iterative Newton’s method, Bul. Inst. Politehn. Iasi Sect. I 27 (31) (1981), no. 1-2, 89–95. [72] Ciarlet, Ph., The finite element method for elliptic problems, North-Holland, Amsterdam, 1978 [73] Ciarlet, Ph. G., Lions J.-L. (eds.), Handbook of numerical analysis, Vol. II. Finite element methods, Part 1, North-Holland, Amsterdam, 1991. [74] Collatz, L., Functional Analysis and Numerical Mathematics, Academic Press, N.Y., 1966. [75] Concus, P., Numerical solution of the nonlinear magnetostatic field equation in two dimensions, J. Comput. Phys. 1 (1967), 330-342. [76] Concus, P., Golub, G.H., Use of fast direct methods for the efficient numerical solution of nonseparable elliptic equations, SIAM J. Numer. Anal. 10 (1973), 1103– 1120. [77] Courant, R, Hilbert, D., Methods of Mathematical Physics II., Wiley Classics Library, J. Wiley & Sons, 1989.
´ch, L., The steepest descent method for elliptic differential equations (in Russian), [78] Cza C.Sc. thesis, 1955. [79] Daniel, J.W., The conjugate gradient method for linear and nonlinear operator equations, SIAM J. Numer. Anal., 4 (1967), 10-26. [80] Delong, M.A., Ortega, J.M., SOR as a preconditioner, Appl. Num. Math. 181, 1995, 431-440. [81] Delong, M.A., Ortega, J.M. SOR as a parallel preconditioner, in Linear and nonlinear conjugate gradient-related methods, Adams, L., Nazareth, J. (eds.), 143-148, SIAM, 1996. [82] Dembo, R.S., Eisenstat, S.C., Steihaug, T., Inexact Newton methods, SIAM J. Numer. Anal., 19 (1982), No.2, 400-408. ´, J.J., Quasi-Newton methods, motivation and theory, SIAM [83] Dennis, J.E. Jr., More Rev., 19 (1977), 46-89. [84] Dennis, J.E. Jr., Schnabel, R. B., Numerical methods for unconstrained optimization and nonlinear equations (corrected reprint of the 1983 original), Classics in Applied Mathematics 16, SIAM, Philadelphia, 1996. [85] Deuflhard, P., Weiser, M., Global inexact Newton multilevel FEM for nonlinear elliptic problems, Multigrid Methods V., 71-89, Lecture Notes Comput. Sci. Eng. 3, Springer, 1998. [86] Deville, M. O., Chebyshev collocation solutions of flow problems. Spectral and high order methods for partial differential equations (Como, 1989), Comput. Methods Appl. Mech. Engrg. 80 (1990), no. 1-3, 27–37. [87] Deville, M. O., Mund, E., Chebyshev pseudospectral solution of second-order elliptic equations with finite element preconditioning, J. Comput. Phys. 60 (1985), no. 3, 517–533. [88] D´ıaz, J. I., Applications of symmetric rearrangement to certain nonlinear elliptic equations with a free boundary, Nonlinear differential equations (Granada, 1984), 155– 181, Res. Notes in Math., 132, Pitman, 1985. [89] Dongarra, J. J., Duff, I. S., Sorensen, D. C., van der Vorst, H. A., Numerical linear algebra for high-performance computers. Software, Environments, and Tools, SIAM, Philadelphia, 1998. [90] Dorr, F. W., The direct solution of the discrete Poisson equation on a rectangle, SIAM Rev. 12 (1970), 248–263. [91] Douglas, C. C., Multigrid algorithms with applications to elliptic boundary value problems, SIAM J. Numer. Anal. 21 (1984), no. 2, 236–254. [92] Douglas, C. C., Douglas, J. Jr., A unified convergence theory for abstract multigrid or multilevel algorithms, serial and parallel, SIAM J. Numer. Anal. 30 (1993), no. 1, 136–158.
[93] Douglas, J. Jr., Dupont, T., Serrin, J., Uniqueness and comparison theorems for nonlinear elliptic equations in divergence form, Arch. Rational Mech. Anal. 42 (1971), 157–168. [94] Douglas, C. C., Miranker, W. L., Constructive interference in parallel algorithms, SIAM J. Numer. Anal. 25 (1988), no. 2, 376–398. [95] Dryja, M., An iterative substructuring method for elliptic mortar finite element problems with discontinuous coefficients. Domain decomposition methods, 10 (Boulder, CO, 1997), 94–103, Contemp. Math., 218, Amer. Math. Soc., Providence, RI, 1998 [96] D’yakonov, E. G., On an iterative method for the solution of finite difference equations (in Russian), Dokl. Akad. Nauk SSSR 138 (1961), 522–525. [97] D’yakonov, E. G., The construction of iterative methods based on the use of spectrally equivalent operators, USSR Comput. Math. and Math. Phys., 6 (1965), pp. 14-46. [98] Ebmeyer, C., Mixed boundary value problems for nonlinear elliptic systems in ndimensional Lipschitzian domains. Z. Anal. Anwendungen 18 (1999), no. 3, 539–555. [99] Egorov, Yu.V., Shubin, M.A., Encyclopedia of Mathematical Sciences, Partial Differential Equations I., Springer, 1992. [100] Egorov, Yu.V., Kondratiev, V., On spectral theory of elliptic operators. Operator Theory: Advances and Applications, 89. Birkh¨ auser Verlag, Basel, 1996. [101] Eisenstat, S. C., Walker, H. F., Globally convergent inexact Newton methods, SIAM J. Optim. 4 (1994), no. 2, 393–422. [102] Elman, H.C., A stability analysis of incomplete LU factorizations, Math. Comp. 47 (1986), no. 175, 191–217. [103] Elman, H.C., Preconditioning for the steady-state Navier-Stokes equations with low viscosity, SIAM J. Sci. Comput., 20 (1999), No.4, pp. 1299-1316. [104] Elman, H.C., Schultz. M.H., Preconditioning by fast direct methods for nonselfadjoint nonseparable elliptic equations, SIAM J. Numer. Anal., 23 (1986), 44-57. [105] Evans D. J., The use of preconditioning in iterative methods for solving linear equations with symmetric positive definite matrices, J. Inst. Math. Appl. 4 (1968) 295–314. [106] Evans D. J. (ed.), Preconditioned iterative methods, Topics in Computer Mathematics 4, Gordon and Breach Science Publishers, Lausanne, 1994. [107] Ewing, R. E., Margenov, S. D., Vassilevski, P. S., Preconditioning the biharmonic equation by multilevel iterations, Math. Balkanica (N.S.) 10 (1996), no. 1, 121–132. [108] Faber, V., Manteuffel, T., Necessary and sufficient conditions for the existence of a conjugate gradient method, SIAM J. Numer. Anal., 21 (1984), 352-362. [109] Faber, V., Manteuffel, T., Parter, S.V., On the theory of equivalent operators and application to the numerical solution of uniformly elliptic partial differential equations, Adv. in Appl. Math., 11 (1990), 109-163.
[110] Faierman, M., Regularity of solutions of an elliptic BVP in a rectangle, Comm. PDE, Vol. 12 (1987), 285-305.

[111] Faragó, I., Qualitative properties of the numerical solution of linear parabolic problems with nonhomogeneous boundary conditions, Comput. Math. Appl., 31 (1996), pp. 143-150.

[112] Faragó, I., Nonnegativity of the difference schemes, Pure Math. Appl., 6 (1996), pp. 147-159.

[113] Faragó, I., Havasi, Á., The mathematical background of operator splitting and the effect of non-commutativity, in: Large-Scale Scientific Computing, S. Margenov, J. Wasniewski, P. Yalamov (eds.), Lect. Notes Comp. Sci., Springer-Verlag, 2002, pp. 264-271.

[114] Faragó, I., Karátson, J., The gradient-finite element method for elliptic problems, Comp. Math. Appl. 42 (2001), 1043-1053.

[115] Faragó, I., Karátson, J., Gradient-finite element method for the Saint Venant model of elasto-plastic torsion in the hardening state, Publ. Appl. Anal., ELTE University (Budapest), May 2000.

[116] Fletcher, R., Reeves, C.M., Function minimization by conjugate gradients, Comput. J. 7 (1964), 149-154.

[117] Francu, J., Monotone operators. A survey directed to applications to differential equations, Apl. Mat. 35 (1990), no. 4, 257-301.

[118] Frederickson, P.O., McBryan, O. A., Parallel superconvergent multigrid, in Multigrid methods (Copper Mountain, 1987), 195-210, Lecture Notes in Pure and Appl. Math. 110, Dekker, New York, 1988.

[119] Fučík, S., Kratochvíl, A., Nečas, J., Kačanov's method and its application, Rev. Roumaine Math. Pures Appl. 20 (1975), no. 8, 907-916.

[120] Fučík, S., Kufner, A., Nonlinear differential equations, Studies in Applied Mechanics 2, Elsevier, 1980.

[121] Funaro, D., Polynomial approximation of differential equations, Lecture Notes in Physics, New Series, Monographs 8, Springer, 1992.

[122] Funaro, D., Spectral elements in the approximation of boundary-value problems in complex geometries, in Innovative methods in numerical analysis (Bressanone, 1992), Appl. Numer. Math. 15 (1994), no. 2, 201-205.

[123] Funaro, D., A fast solver for elliptic boundary-value problems in the square, Comput. Methods Appl. Mech. Engrg. 116 (1994), no. 1-4, 253-255.

[124] Funaro, D., Spectral elements for transport-dominated equations, Lecture Notes in Computational Science and Engineering 1, Springer, 1997.

[125] Funaro, D., Quarteroni, A., Zanolli, P., An iterative procedure with interface relaxation for domain decomposition methods, SIAM J. Numer. Anal. 25 (1988), no. 6, 1213-1236.
¨rtner, K., On the discretization of van Roosbroeck’s equations [126] Gajewski, H., Ga with magnetic field, Z. Angew. Math. Mech. 76 (1996), no. 5, 247–264. ¨ ger, K., Zacharias, K., Nichtlineare Operatorgleichungen und [127] Gajewski, H., Gro Operatordifferentialgleichungen, Akademie-Verlag, Berlin, 1974 [128] Gajewski, H., Kluge, R., Projektions-Iterationsverfahren und nichtlineare Probleme mit monotone Operatoren (in German), Monatsb. Deutsch. Akad. Wiss. Berlin 12 (1970) 98–115. ´ntai, A., The theory of Newton’s method. Numerical analysis 2000, Vol. IV, [129] Gala Optimization and nonlinear equations, J. Comput. Appl. Math. 124 (2000), no. 1-2, 25–44. [130] Georgiev, A., Margenov, S., Neytcheva, M., Multilevel algorithms for 3D simulation of nonlinear elasticity problems. Modelling ’98 (Prague), Math. Comput. Simulation 50 (1999), no. 1-4, 175–182. [131] Gidas, B., Ni, W.N., Nirenberg, L., Symmetry and related properties via the maximum principle, Commun. Math. Phys. 68 (1979), 209-243. [132] Gilbarg, D., Trudinger, N. S., Elliptic partial differential equations of second order (2nd edition), Grundlehren der Mathematischen Wissenschaften 224, Springer, 1983. [133] Glowinski, R., Numerical methods for nonlinear variational problems. Springer Series in Computational Physics. Springer-Verlag, New York, 1984. [134] Glowinski, R., Marrocco, A., Analyse num´erique du champ magn´etique d’un alternateur par ´el´ements finis et sur-relaxation ponctuelle non lin´eaire, Comput. Methods Appl. Mech. Engrg. 3 (1974), no. 1, 55–85. [135] Goldstein, C. I., Preconditioning singularity perturbed elliptic and parabolic problems, SIAM J. Numer. Anal. 28 (1991), no. 5, 1386–1418. [136] Gottlieb, D., Orszag, S. A., Numerical analysis of spectral methods: theory and applications. CBMS-NSF Regional Conference Series in Applied Mathematics, No. 26, SIAM, Philadelphia, 1977. [137] Graham, I. G., Hagger, M. J., Unstructured additive Schwarz-conjugate gradient method for elliptic problems with highly discontinuous coefficients, SIAM J. Sci. Comput. 20 (1999), 2041–2066 (electronic). [138] Greenbaum, A., Diagonal scalings of the Laplacian as preconditioners for other elliptic differential operators, SIAM J. Matrix Anal. Appl., 13 (1992), 826-846. [139] Grindrod, P., The theory and applications of reaction-diffusion equations. Patterns and waves (2nd edition), Oxford Applied Mathematics and Computing Science Series, The Clarendon Press, Oxford University Press, New York, 1996. [140] Grisvard, P., Elliptic Problems in Nonsmooth Domains, Pitman, 1985. ´side ´ri, J.-A., Iterative methods with spectral preconditioning for [141] Guillard, H., De elliptic equations, Comput. Methods Appl. Mech. Engrg. 80 (1990), no. 1-3, 305–312.
[142] Gunn, J. E., The numerical solution of ∇ · a∇u = f by a semi-explicit alternating direction iterative method, Numer. Math. 6 (1964), 181-184. [143] Gunn, J. E., The solution of elliptic difference equations by semi-explicit iterative techniques, J. Soc. Indust. Appl. Math. Ser. B Numer. Anal. 2 (1965), 24–45. [144] Hackbusch, W., Multigrid methods and applications, Springer Series in Computational Mathematics 4, Springer, Berlin, 1985. [145] Hackbusch, W., Elliptic differential equations. Theory and numerical treatment, Springer Series in Computational Mathematics 18, Springer, Berlin, 1992. [146] Hackbusch, W., A new approach to robust multigrid solvers, in ICIAM ’87: Proceedings of the First International Conference on Industrial and Applied Mathematics (Paris, 1987), 114–126, SIAM, Philadelphia, 1988. ´ Bartholy, J., Farago ´ , I., Splitting method and its application in air [147] Havasi, A., pollution modelling, Id˝ oj´ ar´ as, 105 (2001) pp. 39-58. [148] Hayes, R.M., Iterative methods of solving linear problems in Hilbert space, Nat. Bur. Standards Appl. Math. Ser, 39 (1954), 71-103. [149] Heise, B., Nonlinear field simulation with finite element domain decomposition methods on massively parallel computers, Surveys Math. Indust. 6 (1997), no. 4, 267–287. [150] Hestenes, M.R., Stiefel, E., Methods of conjugate gradients for solving linear systems, J. Res. Nat. Bur. Standards, Sect. B, 49 (1952) No.6., 409-436. ´c ˇek, I., Kr´ıˇ ´, J., On Galerkin approximations of a quasilinear [151] Hlava zek, M., Maly nonpotential elliptic problem of a nonmonotone type, J. Math. Anal. Appl. 184 (1994), no. 1, 168–189. ´c ˇek, I., Nec ˇas, J., On inequalities of Korn’s type, II. Applications to linear [152] Hlava elasticity, Arch. Rational Mech. Anal. 36 (1970) 312–334. ¨ rmander, L., Linear partial differential operators, Springer, 1976. [153] Ho [154] Hutson, V., Pym, J. S., Applications of functional analysis and operator theory. Mathematics in Science and Engineering 146, Academic Press, New York-London, 1980. [155] Ihlenburg, F., Babuˇ ska, I., Dispersion analysis and error estimation of Galerkin finite element methods for the Helmholtz equation, Internat. J. Numer. Methods Engrg. 38 (1995), no. 22, 3745–3774. [156] Jerome, J.W., The method of lines and the nonlinear Klein-Gordon equations, J. Diff. Equations, 30 (1978) pp. 20-31. [157] Juncu, Gh., Popa, C., Preconditioning by approximations of the Gram matrix for convection-diffusion equations, Math. Comput. Simulation 48 (1998), no. 2, 225–233. [158] Jung, M., Nepomnyashchikh, S. V., Multilevel preconditioning procedures for elliptic problems, in Large-scale scientific computations of engineering and environmental problems (Varna, 1997), 78–90, Notes Numer. Fluid Mech., 62, Vieweg, Braunschweig, 1998.
[159] Kachanov, L.M., Foundations of the theory of plasticity, North-Holland, 1971.

[160] Kačur, J., Method of Rothe in evolution equations, Teubner-Texte zur Mathematik 80, Leipzig, 1985.

[161] Kačur, J., Wawruch, A., On an approximate solution for quasilinear parabolic equations, Czech. Math. J., 27 (1977), pp. 220-241.

[162] Kadlec, J., On the regularity of the solution of the Poisson problem on a domain with boundary locally similar to the boundary of a convex open set, Czechosl. Math. J., 14 (89), (1964), pp. 386-393.

[163] Kantorovich, L.V., On an effective method of the solution of the extremal problem of the quadratic functional (in Russian), Dokl. Akad. Nauk. SSSR, 48 (1945), 483-487.

[164] Kantorovich, L.V., Functional analysis and applied mathematics, NBS Rep. 1509, U.S. Department of Commerce, National Bureau of Standards, Los Angeles, 1952.

[165] Kantorovich, L.V., Akilov, G.P., Functional Analysis, Pergamon Press, 1982.

[166] Karátson, J., The gradient method for a class of nonlinear operators in Hilbert space and applications to quasilinear differential equations, Pure Math. Appl., 6 (1995), No. 2, 191-201.

[167] Karátson, J., The gradient method for non-differentiable operators in product Hilbert spaces and applications to elliptic systems of quasilinear differential equations, J. Appl. Anal., 3 (1997), No. 2, pp. 205-217.

[168] Karátson, J., Gradient method for non-injective operators in Hilbert space with application to Neumann problems, Appl. Math., 26 (1999), No. 3, 333-346.

[169] Karátson, J., Gradient method for semilinear elliptic systems via preconditioning in Sobolev space, Publ. Appl. Anal., ELTE University (Budapest), November 1999.

[170] Karátson, J., Gradient method for non-uniformly convex functionals in Hilbert space, Pure Math. Appl., Vol. 11 (2000), No. 2, 309-316.

[171] Karátson, J., Gradient method in Sobolev space for nonlocal boundary value problems, Electron. J. Diff. Eqns., Vol. 2000 (2000), No. 51, pp. 1-17.

[172] Karátson, J., Sobolev space preconditioning of strongly nonlinear 4th order elliptic problems, in: Numerical Analysis and Its Applications, Sec. Int. Conf. NAA 2000 (Rousse, Bulgaria), eds. L. Vulkov, J. Wasniewski, P. Yalamov, pp. 459-466, Lecture Notes Comp. Sci. Vol. 1988, Springer, 2001.

[173] Karátson, J., Faragó, I., Sobolev space preconditioning for nonlinear mixed boundary value problems, in: Large-scale Scientific Computing, LSSC 2001, eds. S. Margenov, J. Wasniewski, P. Yalamov, pp. 104-112, Lecture Notes Comp. Sci. Vol. 2179, Springer, 2001.

[174] Karátson, J., Faragó, I., Variable preconditioning via inexact Newton methods for nonlinear problems in Hilbert space, Publ. Appl. Anal., ELTE University (Budapest), 2001/2.
[175] Keller, H. B., Elliptic boundary value problems suggested by nonlinear diffusion processes, Arch. Rational Mech. Anal. 35 (1969), 363–381. [176] Kelley, C.T., Iterative methods for linear and nonlinear equations, Frontiers in Appl. Math., SIAM, Philadelphia, 1995. [177] Kellogg, R. B., A nonlinear alternating direction method, Math. Comp. 23 (1969), 23–27. [178] Khoromskij, B. N., Wendland, W. L., Spectrally equivalent preconditioners for boundary equations in substructuring techniques, East-West J. Numer. Math. 1 (1993), no. 1, 1–26. [179] Khoromskij, B. N., Wittum, G., Robust interface reduction for highly anisotropic elliptic equations. Multigrid methods V (Stuttgart, 1996), 140–156, Lect. Notes Comput. Sci. Eng., 3, Springer, Berlin, 1998. [180] Kluge, R., Nichtline¨ are Variationsungleichungen und Extremalaufgaben. Theorie und N¨ aherungsverfahren. Mathematische Monographien 12, Berlin, 1979. [181] Knyazev, A.W., Widlund, O., Uniform finite element error estimates for differential equations with jumps in the coefficients, to appear in Math. Comp. ˇiˇ [182] Korotov, S., Kr zek, M., Finite element analysis of variational crimes for a nonlinear heat conduction problem in three-dimensional space, ENUMATH 97 (Heidelberg), 421-428, World Sci. Publ., 1998. ˇiˇ [183] Korotov, S., Kr zek, M., Finite element analysis of variational crimes for a quasilinear elliptic problem in 3D, Numer. Math., 84 (2000), No. 4, 549-576. [184] Koshelev, A., Convergence of the method of successive approximations for quasilinear elliptic equations, Dokl. Akad. Nauk. USSR 142 (1962), No. 5, 1007-1010. [185] Koshelev, A., Regularity problem for quasilinear elliptic and parabolic systems. Lecture Notes in Mathematics 1614, Springer-Verlag, Berlin, 1995. [186] Kuznetsov, Yu. A., Efficient iterative solvers for elliptic finite element problems on nonmatching grids, Russian J. Numer. Anal. Math. Modelling 10 (1995), no. 3, 187–211. [187] Kuznetsov, Yu. A., Two-level preconditioners with projectors for unstructured grids, Russian J. Numer. Anal. Math. Modelling 15 (2000), no. 3-4, 247–255. ˇiˇ ¨ki, P., Mathematical and numerical modelling in electrical [188] Kr zek, M., Neittaanma engineering: theory and applications, Kluwer Academic Publishers, 1996. [189] Ladyzenskaya. O.A, Solonnikov, V.A., Uralceva, N.N., Linear and quasilinear equations of parabolic type, Translations of Mathematical Monographs, Vol. 23, AMS, 1967. [190] Langenbach, A., Monotone Potentialoperatoren in Theorie und Anwendung, Springer, 1977. [191] Langer, U., A fast iterative method for solving the first boundary value problem for the biharmonic equation (in Russian), Zh. Vychisl. Mat. i Mat. Fiz. 28 (1988), no. 2, 209–223, 302.
[192] Lanser, D., Verwer, J.G., Analysis of operators splitting for advection- diffusionreaction problems in air pollution modelling J. Comput. Appl. Math., 111 (1999), pp. 201-216 [193] Lavery, J. E., Local convergence of the method of pseudolinear equations for quasilinear elliptic boundary value problems, J. Comput. Appl. Math. 11 (1984), no. 1, 69–82. [194] Lavery, J. E., A comparison of the method of frozen coefficients with Newton’s method for quasilinear two-point boundary-value problems, J. Math. Anal. Appl. 123 (1987), no. 2, 415–428. [195] Lieberman., G.M., The conormal derivative problem for equations of variational type in nonsmooth domains, Transactions of the AMS, Vol. 330 (1992), 41-67. [196] Lions, J. L., Quelques M´ethodes de R´esolution des Probl`emes aux Limites Non Lin´eaires, Dunod, Gauthier-Villars, Paris, 1969. [197] Lions. P.-L., Pacella, F., Tricario, M., Best constants in Sobolev inequalities for functions vanishing on some part of the boundary, Indiana Univ. Math. J., 37 (1988), No. 2., pp. 301-324. ´ czi, L., The Gradient-Fourier method for nonlinear elliptic partial differential equa[198] Lo tions in Sobolev space, Annales Univ. Sci. ELTE, 43 (2000), 139-149. ´ czi, L., The Gradient-Fourier method for nonlinear Neumann boundary value prob[199] Lo lems and its algorithmic realization in Mathematica, to appear in Pure Math. Appl. [200] Magolu, M.-M., Analytical bounds for block approximate factorization methods, Linear Algebra Appl. 179 (1993), 33–57. [201] Mahavier, W. T. A convergence result for discrete steepest descent in weighted Sobolev spaces, Abstr. Appl. Anal., 2 (1997), no. 1-2, 67–72. [202] Makai, E., Complete systems of eigenfunctions of the wave equation in some special case, Studia Sci. Math. Hung., 11 (1976), 139–144. [203] Manteuffel, T., Otto, J., Optimal equivalent preconditioners, SIAM J. Numer. Anal., 30 (1993), 790-812. [204] Manteuffel, T., Parter, S. V., Preconditioning and boundary conditions, SIAM J. Numer. Anal. 27 (1990), no. 3, 656–694. [205] Marchuk, G. I., Kuznetsov, Yu. A., Matsokin, A. M., Fictitious domain and domain decomposition methods, Soviet J. Numer. Anal. Math. Modelling 1 (1986), no. 1, 3–35. [206] Martensen, E., The convergence of the horizontal line method for Maxwell’s equations, Math. Mech. in Appl. Sci., 1 (1979) pp. 101-113. [207] Mayo, A., The fast solution of Poisson’s and the biharmonic equations on irregular regions, SIAM J. Numer. Anal. 21 (1984), no. 2, 285–299. [208] Mayo, A., Greenbaum, A., Fast parallel iterative solution of Poisson’s and the biharmonic equations on irregular regions, SIAM J. Sci. Statist. Comput. 13 (1992), no. 1, 101–118.
[209] McCormick, S.F., Multigrid methods for variational problems: general theory for the V -cycle, SIAM J. Numer. Anal. 22 (1985), no. 4, 634–643. [210] McCormick, S.F. (ed.), Multigrid methods, Frontiers in Applied Mathematics 3, SIAM, Philadelphia, 1987. [211] McCormick, S.F., Multilevel adaptive methods for partial differential equations, Frontiers in Applied Mathematics 6, SIAM, Philadelphia, 1989. [212] McCormick, S.F., Ruge, J.W., Multigrid methods for variational problems, SIAM J. Numer. Anal. 19 (1982), no. 5, 924–929. [213] Miersemann, E., Zur Regularit¨ at verallgemeinerter L¨ osungen von quasilinearen elliptischen Differentialgleichungen zweiter Ordnung in Gebieten mit Ecken, Z. Anal. Anw. 1 (1982), no. 4, 59–71. [214] Miersemann, E., Quasilineare elliptische Differentialgleichungen zweiter Ordnung in mehrdimensionalen Gebieten mit Kegelspitzen, Z. Anal. Anw., 2 (1983), no. 4, 361-365. [215] Mikhlin, S.G., Noordhoff, 1971
The Numerical Performance of Variational Methods,
Walters-
[216] Mises, R. von, Mechanik der festen K¨ orper in plastisch-deformabilen Zustand, Nachr. K¨ on. Ges. Wiss. G¨ ottingen 4 (1913) [217] Molchanov, I.N, Computer solution methods of applied problems. Algebra, Kiev, Naukova Dumka, 1987 (in Russian). [218] Molchanov, I.N, Nikolenko, L.D., Introduction to the Finite Element Method, Kiev, Naukova Dumka, 1989 (in Russian). [219] Murray, J.D., A simple method for obtaining approximate solutions for a large class of diffusion-kinetic enzyme problems, Math. Biosciences, 2 (1968), 379-411. [220] Nashed, M. Z., The convergence of the method of steepest descents for nonlinear equations with variational or quasi-variational operators, J. Math. Mech. 13 (1964), 765–794. [221] Nashed, M. Z., A decomposition relative to convex sets, Proc. Amer. Math. Soc. 19 (1968), 781-786. [222] Navon, I.M., Phua, P.K.H., Ramamurthy, M., Vectorization of conjugate gradient methods for large-scale minimization in meteorology, J. Optim. Theory and Appl. 66 (1990) No.1., 71-93. ˇas, J., Introduction to the theory of nonlinear elliptic equations, J. Wiley & Sons, [223] Nec Chichester, 1986. ˇas, J., Hlava ´c ˇek, I., Mathematical theory of elastic and elasto-plastic bodies: [224] Nec an introduction, Studies in Applied Mechanics 3, Elsevier Scientific Publishing Co., Amsterdam-New York, 1980. [225] Nepomnyashchikh, S. V., Domain decomposition for elliptic problems with large condition numbers, in Domain decomposition methods in scientific and engineering computing (University Park, PA, 1993), 75–85, Contemp. Math., 180, Amer. Math. Soc., Providence, RI, 1994.
[226] Nepomnyashchikh, S. V., Preconditioning operators for elliptic problems with degenerate coefficients, Preprint 1035/1995, Ross. Akad. Nauk Sibirsk. Otdel., Vychisl. Tsentr, Novosibirsk, 1995. [227] Neuberger, J. W., Steepest descent for general systems of linear differential equations in Hilbert space, Lecture Notes in Math., No. 1032, Springer, 1983. [228] Neuberger, J. W., Some global steepest descent results for nonlinear systems. Trends in theory and practice of nonlinear differential equations (Arlington, Tex., 1982), 413– 418, Lecture Notes in Pure and Appl. Math., 90, Dekker, New York, 1984. [229] Neuberger, J. W., Renka, R. J., Minimal surfaces and Sobolev gradients. SIAM J. Sci. Comput. 16 (1995), no. 6, 1412–1427. [230] Neuberger, J. W., Renka, R. J., Numerical calculation of singularities for Ginzburg-Landau functionals, Electron. J. Diff. Eq., No. 10 (1997). [231] Neuberger, J. W., Sobolev gradients and differential equations, Lecture Notes in Math., No. 1670, Springer, 1997. [232] Nicolaides, R. A., Deflation of conjugate gradients with applications to boundary value problems, SIAM J. Numer. Anal. 24 (1987), no. 2, 355–365. [233] Nonlinear diffusion equations and their equilibrium states 3., Proceedings of the conference held in Gregynog, August 20–29, 1989, N. G. Lloyd, W.-M. Ni, L. A. Peletier and J. Serrin (eds.), Progress in Nonlinear Differential Equations and their Applications 7, Birkh¨ auser, Boston, 1992. [234] Oden, J. T., Finite elements, Vols. I-VI. (in collaboration with G. F. Carey et al.), The Texas Finite Element Series, Prentice Hall, Englewood Cliffs, 1981-1986. [235] Orszag, S. A. Spectral methods for problems in complex geometries, J. Comput. Phys. 37 (1980), no. 1, 70–92. [236] Ortega, J.M., Numerical Analysis. A second course, Academic Press, N.Y., 1972. [237] Ortega, J.M , Poole, W.G., An Introduction to Numerical Methods for Differential Equations, Pittman Publ., 1981. [238] Ortega, J.M., Rheinboldt, W.C., Iterative solutions for nonlinear equations in several variables, Academic Press, 1970. [239] Padiy, A., Axelsson, O., Polman, B., Generalized augmented matrix preconditioning approach and its application to iterative solution of ill-conditioned algebraic systems, SIAM J. Matrix Anal. Appl. 22 (2000), no. 3, 793–818 (electronic). [240] Petryshyn, W. V., On two variants of a method for the solution of linear equations with unbounded operators and their applications, J. Math. and Phys. 44 (1965) 297– 312. [241] Petryshyn, W. V., On the extension and the solution of nonlinear operator equations, Illinois J. Math. 10 (1966) 255–274. [242] Petryshyn, W. V., On the iteration, projection and projection-iteration methods in the solution of nonlinear functional equations, J. Math. Anal. Appl., 21 (1968), 575–607.
[243] Plastock, R., Homeomorphisms between Banach spaces, Trans. Amer. Math. Soc. 200 (1974), 169–183. [244] Poljak, B.T., Gradient methods for solving equations and inequalities (in Russian), Z. Vyˇcisl. Mat. i Mat. Fiz., 4 (1964), 995-1005. [245] Popa, C., Mesh independence principle for nonlinear equations on Hilbert spaces by preconditioning, Int. J. Comput. Math. 69 (1998), no. 3-4, 295–318. [246] Posp´ıˇ sek, M., Nonlinear boundary value problems with application to semiconductor device equations, Appl. Math. 39 (1994), no. 4, 241–258. [247] Posp´ıˇ sek, M., Convergent algorithms suitable for the solution of the semiconductor device equations, Appl. Math. 40 (1995), no. 2, 107–130. [248] Preconditioned conjugate gradient methods. Proceedings of the International Conference held at the University of Nijmegen, Nijmegen, June 19–21, 1989, edited by O. Axelsson and L. Yu. Kolotilina, Lecture Notes in Mathematics 1457, Springer, 1990 [249] Protter, M. H., Weinberger, H. F., Maximum principles in differential equations (corrected reprint of the 1967 original), Springer-Verlag, New York, 1984. [250] Quarteroni, A., Valli, A., Domain decomposition methods for partial differential equations. Numerical Mathematics and Scientific Computation, Oxford Science Publications, The Clarendon Press, Oxford University Press, New York, 1999. [251] Rannacher, R., On the convergence of the Newton-Raphson method for strongly nonlinear problems, in: Nonlinear Computational Mechanics, (eds. Wriggers, P., Wagner, W.,) Springer, 1991. [252] Rektorys, K., The method of discretization in time and partial differential equations, Dortrecht-Boston, Reidel, 1982. [253] Rempel, S., Schmidt, G., Eigenvalues for spherical domains with corners via boundary integral equations, Integral Eqns. Operator Theory 14 (1991), no. 2, 229–250. [254] Richardson, W.B. JR, Sobolev Gradient Preconditioning for PDE Applications, in: Iterative Methods in Scientific Computation IV. (Kincaid, D.R., Elster, A.C., eds.), pp. 223-234, IMACS, New Jersey, 1999. [255] Riesz F., Sz.-Nagy B., Vorlesungen u ¨ber Funktionalanalysis, Verlag H. Deutsch, 1982. [256] Rossi, T., Toivanen, J., A parallel fast direct solver for block tridiagonal systems with separable matrices of arbitrary dimension, SIAM J. Sci. Comput. 20 (1999), no. 5, 1778–1796 (electronic). [257] Rossi, T., Toivanen, J., Parallel fictitious domain method for a non-linear elliptic Neumann boundary value problem, Czech-US Workshop in Iterative Methods and Parallel Computing, Part I (Milovy, 1997), Numer. Linear Algebra Appl. 6 (1999), no. 1, 51–60. [258] Rothe, E., Zweidimensionale parabolische Randwetausgaben als Grenzfall eindimensionaler Randwertausgaben, Math. Ann., 102 (1930), 650-670.
[259] Rudin, W., Functional Analysis (2nd edition), International Series in Pure and Applied Mathematics, McGraw-Hill, New York, 1991.
[260] Saad, Y., Schultz, M. H., Conjugate gradient-like algorithms for solving nonsymmetric linear systems, Math. Comput. 44 (1985), 417–424.
[261] Saad, Y., Schultz, M. H., GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput. 7 (1986), 856–869.
[262] Saint-Venant, B., Sur l'établissement des équations des mouvements intérieurs, Comp. Rend. Acad. Sci. 70 (1870).
[263] Samarskii, A. A., Theory of Difference Schemes (in Russian), Nauka, Moscow, 1977.
[264] Sanz-Serna, J. M., Calvo, M., Numerical Hamiltonian Problems, Chapman and Hall, 1994.
[265] Simon, L., Baderko, E., Linear Partial Differential Equations of Second Order (in Hungarian), Tankönyvkiadó, Budapest, 1983.
[266] Smith, B. F., Bjørstad, P. E., Gropp, W. D., Domain Decomposition. Parallel Multilevel Methods for Elliptic Partial Differential Equations, Cambridge University Press, Cambridge, 1996.
[267] Smoller, J., Shock Waves and Reaction-Diffusion Equations (2nd edition), Grundlehren der Mathematischen Wissenschaften 258, Springer, 1994.
[268] Strang, G., On the construction and comparison of difference schemes, SIAM J. Numer. Anal. 5 (1968), 505–517.
[269] Strang, G., Fix, G. J., An Analysis of the Finite Element Method, Prentice Hall, Englewood Cliffs, 1973.
[270] Struwe, M., Variational Methods. Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems, Springer-Verlag, Berlin, 1990.
[271] Swarztrauber, P. N., A direct method for the discrete solution of separable elliptic equations, SIAM J. Numer. Anal. 11 (1974), 1136–1150.
[272] Swarztrauber, P. N., The methods of cyclic reduction, Fourier analysis and the FACR algorithm for the discrete solution of Poisson's equation on a rectangle, SIAM Rev. 19 (1977), no. 3, 490–501.
[273] Szabó, B., Babuška, I., Finite Element Analysis, J. Wiley and Sons, 1991.
[274] Ta'asan, S., Multigrid Methods for Highly Oscillatory Problems, PhD thesis, Weizmann Institute of Science, Rehovot, Israel, 1984.
[275] Talenti, G., Best constants in Sobolev inequality, Ann. Mat. Pura Appl., Ser. 4, 110 (1976), 353–372.
[276] Temam, R., Survey of the status of finite element methods for partial differential equations, in: Finite Elements (Hampton, VA, 1986), 1–33, ICASE/NASA LaRC Ser., Springer, New York-Berlin, 1988.
[277] Thomée, V., Finite Difference Methods for Linear Parabolic Equations, Elsevier, North-Holland, 1990.
[278] Thomée, V., Galerkin Finite Element Methods for Parabolic Problems, Springer, Berlin, 1997.
[279] Tichonov, A. N., Stability of solution algorithms for singular systems of algebraic equations (in Russian), Zh. Vychisl. Mat. i Mat. Fiz. 4 (1965), 718–722.
[280] Tichonov, A. N., Arsenin, V. J., Methods for the Solution of Ill-Posed Problems (in Russian), Nauka, Moscow, 1979.
[281] Vainberg, M., Variational Method and the Method of Monotone Operators in the Theory of Nonlinear Equations, J. Wiley, New York, 1973.
[282] Varga, R. S., Matrix Iterative Analysis (2nd revised and expanded edition), Springer Series in Computational Mathematics 27, Springer-Verlag, Berlin, 2000.
[283] Vassilevski, P. S., Fast algorithm for solving a linear algebraic problem with separable variables, C. R. Acad. Bulgare Sci. 37 (1984), no. 3, 305–308.
[284] Vassilevski, P. S., Lazarov, R. D., Margenov, S. D., Vector and parallel algorithms in iteration methods for elliptic problems, in: Mathematics and Mathematical Education (Albena, 1989), 40–51, Bulgar. Akad. Nauk, Sofia, 1989.
[285] Vladimirova, N. M., Convergence of the Newton-Kantorovich method for quasilinear elliptic equations (in Russian), Izv. Vyssh. Uchebn. Zaved. Mat. 9 (1987), 13–25, 81.
[286] Vörös, G., personal communication.
[287] Yakovlev, M. N., On the solution of non-linear equations by iterations (in Russian), Dokl. Akad. Nauk SSSR 156 (1964), 522–524.
[288] Young, D. M., Iterative Solution of Large Linear Systems, Academic Press, New York-London, 1971.
[289] White, R. E., An Introduction to the Finite Element Method with Applications to Nonlinear Problems, John Wiley and Sons, Inc., New York, 1985.
[290] Widlund, O., A Lanczos method for a class of non-symmetric systems of linear equations, SIAM J. Numer. Anal. 15 (1978), 801–812.
[291] Widlund, O., On the use of fast methods for separable finite difference equations for the solution of general elliptic problems, in: Sparse Matrices and Their Applications, D. J. Rose and R. A. Willoughby (eds.), Plenum Press, New York, 1972, pp. 121–134.
[292] Wilkinson, J. H., Ill-condition in numerical linear algebra, Congr. Numer. 51 (1986), 59–81.
[293] Wilkinson, J. H., Rounding Errors in Algebraic Processes, Notes in Appl. Sci. 32, London, 1963.
[294] Winther, R., Some superlinear convergence results for the conjugate gradient method, SIAM J. Numer. Anal. 17 (1980), 14–17.
[295] Ypma, T. J., Local convergence of inexact Newton methods, SIAM J. Numer. Anal. 21 (1984), no. 3, 583–590.
[296] Zeidler, E., Nonlinear Functional Analysis and its Applications, Springer, 1986.
[297] Ženíšek, A., Nonlinear Elliptic and Evolution Problems and their Finite Element Approximations, Computational Mathematics and Applications, Academic Press, Inc., London, 1990.
[298] Zlatev, Z., Computer Treatment of Large Air Pollution Models, Kluwer Academic Publishers, Dordrecht-Boston-London, 1995.