Series in Contemporary Applied Mathematics CAM 6
Tatsien Li • Pingwen Zhang
Frontiers and Prospects of Contemporary Applied Mathematics
Series in Contemporary Applied Mathematics
CAM
Honorary Editor: Chao-Hao Gu (Fudan University) Editors: P. G. Ciarlet (City University of Hong Kong), Tatsien Li (Fudan University)
1. Mathematical Finance
Theory and Practice
(Eds. Yong Jiongmin, Rama Cont)
2. New Advances in Computational Fluid Dynamics Theory, Methods and Applications (Eds. F. Dubois, Wu Huamo)
3. Actuarial Science
Theory and Practice
(Eds. Hanji Shang, Alain Tosseti)
4. Mathematical Problems in Environmental Science and Engineering (Eds. Alexandre Ern, Liu Weiping)
5. Ginzburg-Landau Vortices (Eds. Haim Brezis, Tatsien Li)
6. Frontiers and Prospects of Contemporary Applied Mathematics (Eds. Tatsien Li, Pingwen Zhang)
Frontiers and Prospects of Contemporary Applied Mathematics
editors
Tatsien Li Fudan University, China
Pingwen Zhang Peking University, China
Higher Education Press
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Tatsien Li
Department of Mathematics
Fudan University
Shanghai, 200433
China

Pingwen Zhang
School of Mathematical Sciences
Peking University
Beijing, 100871
China
Copyright © 2005 by
Higher Education Press, 4 Dewai Dajie, Beijing 100011, P. R. China, and
World Scientific Publishing Co Pte Ltd, 5 Toh Tuck Link, Singapore 596224

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without permission in writing from the Publisher.

ISBN 7-04-018575-X
Printed in P. R. China
Contents

Preface

Invited Talks
Jin Cheng, Mourad Choulli, Xin Yang: An Iterative BEM for the Inverse Problem of Detecting Corrosion in a Pipe, 1
Weinan E, Pingbing Ming: Analysis of the Local Quasicontinuum Method, 18
Houde Han: The Artificial Boundary Method - Numerical Solutions of Partial Differential Equations on Unbounded Domains, 33
Kerstin Hesse, Ian H. Sloan: Optimal order integration on the sphere, 59
Jialin Hong: A Survey of Multi-symplectic Runge-Kutta Type Methods for Hamiltonian Partial Differential Equations, 71
Ming Jiang, Yi Li, Ge Wang: Inverse Problems in Bioluminescence Tomography, 114
Fangting Li, Ying Lu, Chao Tang, Qi Ouyang: Global Dynamic Properties of Protein Networks, 149
Wei Li, Yunqing Huang: A Modified Adaptive Algebraic Multigrid Algorithm for Elliptic Obstacle Problems, 160
Zeyao Mo: Parallel Algorithms and Implementation Techniques for Terascale Numerical Simulations of Typical Applications, 179
Yaguang Wang: Long Time Behaviour of Solutions to Linear Thermoelastic Systems with Second Sound, 191

Contributed Talks
Hongxuan Huang, Changjun Wang: Distance Geometry Problem and Algorithm Based on Barycentric Coordinates, 208
Jijun Liu: On Ill-Posedness and Inversion Scheme for 2-D Backward Heat Conduction, 227
Yirang Yuan, Ning Du, Yuji Han: Careful Numerical Simulation and Analysis of Migration-Accumulation, 242
Rongxian Yue: Error Analysis on Scrambled Quasi-Monte Carlo Quadrature Rules Using Sobol Points, 254
Preface

The Symposium on Frontiers and Prospects of Contemporary Applied Mathematics was held during the 8th Annual Conference of the China Society for Industrial and Applied Mathematics (CSIAM), which took place on August 24-30, 2004 in Xiangtan, Hunan Province, China. About 300 representatives from over 100 domestic universities, scientific research institutions and enterprises, as well as from abroad, attended the conference. At the symposium, a number of Chinese and foreign scholars and experts were invited to give plenary lectures. They introduced current progress and gave their views on the prospects of some important topics in industrial and applied mathematics. In addition, many participants gave academic reports at the section meetings.

Because these plenary lectures are representative, forward-looking and of high academic value, we have collected them in this volume. A small selection of the academic reports presented in the sections is also included. We hope that this book will help readers understand the current state of industrial and applied mathematics and the active issues in this area, and that it will help to push industrial and applied mathematics forward.

We would like to take this opportunity to express our heartfelt thanks to all of the speakers at the symposium for their great support, especially to those speakers who wrote papers for this book. We would also like to express our sincere thanks and respect to the National Natural Science Foundation of China, the Mathematical Center of the Ministry of Education of China and Xiangtan University for their financial help and support, and to Higher Education Press and World Scientific Publishing Company for their hard work and efforts in publishing this book.
Li Tatsien October 2005
An Iterative BEM for the Inverse Problem of Detecting Corrosion in a Pipe*

Jin Cheng
School of Mathematical Sciences, Fudan University, Shanghai 200433, China.
E-mail: [email protected]

Mourad Choulli
Département de Mathématiques, Université de Metz, Ile du Saulcy, 57045 Metz cedex, France.
E-mail: [email protected]

Xin Yang
Department of Computer Science and Engineering, Penn State University, University Park, PA 16802, USA.
E-mail: [email protected]

Abstract
In this paper, we consider an inverse problem of determining the corrosion occurring in an inaccessible interior part of a pipe from measurements on the outer boundary. The problem is modelled by the Laplace equation with an unknown coefficient $\gamma$ in the boundary condition on the inner boundary. Based on the Maz'ya iterative algorithm, a regularized boundary element method (BEM) is proposed for obtaining approximate solutions of this inverse problem. The numerical results show that our method is easy to implement and is quite effective.
1
Introduction
Detecting the corrosion inside a pipe is one of the most important topics in engineering, especially in the safety administration of nuclear power stations. There are several ways to do this. In this paper, we discuss the mathematical theory and a numerical algorithm for a method of detecting the corrosion by electrical fields. More exactly, we consider an inverse problem of determining the corrosion occurring in an inaccessible interior part of a pipe from the measurements on the outer boundary. Our goal is to determine information about the corrosion that possibly occurs on the interior surface of the pipe, which is the 'inaccessible' part, from electrostatic data collected on a part of the exterior surface of the pipe, which is the 'accessible' part.

In the case that the thickness of the pipe is sufficiently small compared with the radius of the pipe and the Cauchy data are given on the whole outer boundary, this inverse problem can be treated by the Thin Plate Approximation method (TPA). The algorithm and its numerical analysis can be found in [7]. But this algorithm works only under the assumption that the thickness is small enough compared with the radius of the pipe. The case in which the Cauchy data are given only on part of the outer boundary and the smallness assumption is dropped has not been studied, although it is clearly of great importance for practical problems.

The main difficulty of this inverse problem is its ill-posedness. The measured data are given only on part of the outer boundary and we want to determine an unknown function on the inner boundary. Because of the ill-posedness, the errors in the measured data will be amplified in the numerical treatment if they are not handled suitably.

In this paper, based on the Maz'ya iterative method, we propose a new BEM algorithm for this inverse problem. It is easy to implement, and the numerical results show the efficiency of the method.

This paper is organized as follows: 1. Formulation of the inverse problem, 2. The iterative boundary element method, 3. Numerical examples, 4. Conclusions.

*The authors are partly supported by NNSF of China (No. 10271032 and No. 10431030) and Shuguang Project of Shanghai Municipal Education Commission (N.E03004).
2
Formulation of the inverse problem
Suppose a domain $\Omega = \{x \mid r_1 < |x| < r_2\} \subset \mathbb{R}^2$ (see Figure 2.1) with the boundaries $\Gamma_1 = \{x \mid |x| = r_1\}$ and $\Gamma_2 = \{x \mid |x| = r_2\}$.

Assume that $\Omega$ is a metallic body with constant conductivity. In the domain $\Omega$, we consider an electrostatic field. The electric potential $u$ satisfies Laplace's equation in $\Omega$, i.e.,
$$\Delta u = 0 \quad \text{in } \Omega. \tag{2.1}$$
Let $\Gamma_0$ be an open subset of the outer boundary $\Gamma_2$ of $\Omega$, which is the 'accessible' part. On $\Gamma_0$, the Dirichlet data and the Neumann data of the electric potential $u$ are given, i.e.,
$$u(x) = \phi(x), \quad x \in \Gamma_0, \tag{2.2}$$
$$u_\nu(x) = \psi(x), \quad x \in \Gamma_0, \tag{2.3}$$
where $u_\nu$ is the outer normal derivative of $u$ on the boundary. We denote the remaining part of the exterior boundary of $\Omega$ by $\tilde\Gamma_2$, i.e., $\tilde\Gamma_2 = \Gamma_2 \setminus \Gamma_0$. We assume that the corrosion occurs only on the interior boundary of the domain $\Omega$ and that it can be described by a non-negative function $\gamma$ in the boundary condition on the interior boundary. That is,
$$u_\nu + \gamma u = 0 \quad \text{on } \Gamma_1, \tag{2.4}$$
where $\gamma \ge 0$ represents the corrosion damage. The inverse problem we discuss in this paper is to find the unknown coefficient $\gamma$ from the Cauchy data $\phi$ and $\psi$ on $\Gamma_0$. We will treat this inverse problem in the following steps:

Step 1: Get the Cauchy data on the interior circle by solving the Cauchy problem for Laplace's equation. We use the iterative boundary element method to solve the Cauchy problem
$$\begin{cases} \Delta u(x) = 0, & x \in \Omega, \\ u(x) = \phi(x), & x \in \Gamma_0, \\ u_\nu(x) = \psi(x), & x \in \Gamma_0. \end{cases} \tag{2.5}$$
Our goal is to get the Cauchy data on $\Gamma_1$:
$$u(x) = \phi_1(x), \quad x \in \Gamma_1; \qquad u_\nu(x) = \psi_1(x), \quad x \in \Gamma_1.$$

Step 2: Get the impedance $\gamma$ from the Cauchy data on the interior circle. For the boundary condition
$$u_\nu + \gamma u = 0, \quad x \in \Gamma_1,$$
$\gamma$ can be obtained by
$$\gamma = -\frac{u_\nu}{u} = -\frac{\psi_1}{\phi_1}, \quad \text{if } \phi_1 \neq 0.$$
Remark 2.1. It can be proved that the zero set $\{\phi_1 = 0\}$ has measure zero. Therefore our method, which requires $\phi_1 \neq 0$, determines $\gamma$ almost everywhere on $\Gamma_1$.
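Step 2 amounts to a pointwise division, which can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: the function name `recover_gamma`, the threshold `tol` used to skip nodes where $\phi_1$ is (numerically) zero, and the synthetic data are all hypothetical and not taken from the paper.

```python
import numpy as np

def recover_gamma(phi1, psi1, tol=1e-8):
    """Return gamma = -psi1/phi1 where |phi1| > tol, and NaN elsewhere.

    phi1, psi1: Dirichlet and Neumann data on the inner circle (hypothetical inputs).
    tol: assumed threshold below which phi1 is treated as zero.
    """
    phi1 = np.asarray(phi1, dtype=float)
    psi1 = np.asarray(psi1, dtype=float)
    gamma = np.full(phi1.shape, np.nan)
    ok = np.abs(phi1) > tol          # nodes where the division is safe
    gamma[ok] = -psi1[ok] / phi1[ok]
    return gamma

# Synthetic check: choose gamma and phi1, build psi1 from u_nu + gamma*u = 0.
theta = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
gamma_true = 1.0 + 0.5 * np.sin(theta)
phi1 = 2.0 + np.cos(theta)
psi1 = -gamma_true * phi1
gamma = recover_gamma(phi1, psi1)
```

In practice the data $\phi_1$, $\psi_1$ come from Step 1 and carry errors, so nodes where $\phi_1$ is small would amplify those errors; the threshold is one simple way to flag them.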
3
The iterative boundary element method for this Cauchy problem
In this section we will give the iterative boundary element method (see [9], [10],[11]) for the Cauchy problem in Step 1. We will prove the convergence rate only under the regularity assumption. Some numerical simulation results for the Cauchy problem are also presented.
3.1
Description of the algorithm
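In [11], V. A. Kozlov, V. G. Maz'ya and A. V. Fomin proposed the following algorithm:

1. Specify an initial boundary guess $u_0$ on $\Gamma_1 \cup \tilde\Gamma_2$.

2. Solve the well-posed mixed boundary value problem
$$\begin{cases} \Delta U^{(0)}(x) = 0, & x \in \Omega, \\ U^{(0)}_\nu(x) = \psi(x), & x \in \Gamma_0, \\ U^{(0)}(x) = u_0(x), & x \in \Gamma_1 \cup \tilde\Gamma_2, \end{cases} \tag{3.1}$$
to determine $U^{(0)}(x)$ for $x \in \Omega$ and $q_0 = U^{(0)}_\nu(x)$ for $x \in \Gamma_1 \cup \tilde\Gamma_2$.

3. (i) Suppose that the approximation $q_k$ has been obtained. Solve the mixed boundary value problem
$$\begin{cases} \Delta U^{(2k+1)}(x) = 0, & x \in \Omega, \\ U^{(2k+1)}(x) = \phi(x), & x \in \Gamma_0, \\ U^{(2k+1)}_\nu(x) = q_k(x), & x \in \Gamma_1 \cup \tilde\Gamma_2, \end{cases} \tag{3.2}$$
to determine $U^{(2k+1)}(x)$ for $x \in \Omega$ and $u_{k+1} = U^{(2k+1)}(x)$ for $x \in \Gamma_1 \cup \tilde\Gamma_2$.
(ii) With $u_{k+1}$, obtain $U^{(2k+2)}(x)$ for $x \in \Omega$ and $q_{k+1} = U^{(2k+2)}_\nu(x)$ for $x \in \Gamma_1 \cup \tilde\Gamma_2$ by solving the mixed boundary value problem
$$\begin{cases} \Delta U^{(2k+2)}(x) = 0, & x \in \Omega, \\ U^{(2k+2)}_\nu(x) = \psi(x), & x \in \Gamma_0, \\ U^{(2k+2)}(x) = u_{k+1}(x), & x \in \Gamma_1 \cup \tilde\Gamma_2. \end{cases} \tag{3.3}$$

4. Repeat step 3 for $k \ge 0$ until a prescribed stopping criterion is satisfied. The stopping criterion we use in this paper is $\|u_{k+1} - u_k\|_{L^2(\Gamma_1 \cup \tilde\Gamma_2)} \le \varepsilon$, where $\varepsilon$ is a small positive number.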
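The alternating structure of step 3 can be illustrated without any boundary element code in a much simpler setting. The sketch below is an assumption-laden toy, not the authors' implementation: it takes the annulus $1 < r < R$, assumes the Cauchy data are given on the whole outer circle, and restricts everything to a single Fourier mode $u = (a r^j + b r^{-j})\sin(j\theta)$, so that each mixed problem reduces to a $2 \times 2$ linear system. It also starts from a Neumann guess on the inner circle rather than the Dirichlet guess of step 1. The function name `kmf_single_mode` and all parameter values are hypothetical.

```python
import numpy as np

def kmf_single_mode(j, R, f, g, q0=0.0, n_iter=200):
    """Alternating iteration for u = (a r^j + b r^-j) sin(j theta) on 1 < r < R.

    f, g: Dirichlet and Neumann coefficients of the data on r = R.
    Returns the iterates of the Neumann coefficient on r = 1.
    """
    q = q0
    history = [q]
    for _ in range(n_iter):
        # Step (i): Dirichlet data f on r = R, Neumann guess q on r = 1.
        # The outer normal on r = 1 is -e_r, so u_nu = -j (a - b) there.
        M1 = np.array([[R**j, R**(-j)], [-j, j]], dtype=float)
        a, b = np.linalg.solve(M1, np.array([f, q], dtype=float))
        d = a + b                      # Dirichlet trace on r = 1
        # Step (ii): Neumann data g on r = R, Dirichlet value d on r = 1.
        M2 = np.array([[j * R**(j - 1), -j * R**(-j - 1)], [1.0, 1.0]])
        a, b = np.linalg.solve(M2, np.array([g, d], dtype=float))
        q = -j * (a - b)               # updated Neumann trace on r = 1
        history.append(q)
    return np.array(history)

# Data generated from a = 1, b = 0.5, j = 2, R = 2; the exact Neumann
# coefficient on r = 1 is then -j (a - b) = -1.
history = kmf_single_mode(j=2, R=2.0, f=4.125, g=3.875)
print(history[-1])
```

In this toy setting the error in the Neumann coefficient is multiplied by $\left(\frac{R^j - R^{-j}}{R^j + R^{-j}}\right)^2$ in every sweep, so high-frequency modes converge slowly, which is one way to see why a stopping rule is needed for noisy data.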
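Remark 3.1. The mixed boundary value problems (3.2) and (3.3) are well-posed problems.

We solve the mixed boundary value problems (3.2) and (3.3) by the boundary element method, which is described in many textbooks on the subject, for example [1]. In the following, we give only an outline of the iterative BEM formulation. Consider the following mixed boundary value problem in the two-dimensional case:
$$\begin{cases} \Delta u = 0 & \text{in } \Omega, \\ u = f & \text{on } \Gamma_D, \\ u_\nu = g & \text{on } \Gamma_N. \end{cases} \tag{3.4}$$
As is well known, a harmonic function $u$ satisfies the fundamental integral formula
$$u(M_i) = \int_{\partial\Omega} \left( u^* \frac{\partial u}{\partial \nu} - u \frac{\partial u^*}{\partial \nu} \right) d\Gamma, \quad M_i \in \Omega, \tag{3.5}$$
where $u^* = \frac{1}{2\pi} \ln \frac{1}{r}$ is the fundamental solution of Laplace's equation and $r$ is the distance from $M_i$ to the integration point. The boundary integral formula is
$$c_i\, u(M_i) = \int_{\partial\Omega} \left( u^* \frac{\partial u}{\partial \nu} - u \frac{\partial u^*}{\partial \nu} \right) d\Gamma, \quad M_i \in \partial\Omega, \tag{3.6}$$
where $c_i = 1/2$ at points where the boundary is smooth. Writing $q = \partial u / \partial \nu$ and $q^* = \partial u^* / \partial \nu$, equation (3.6) can be discretized on the boundary elements $\Gamma_j$, $j = 1, \dots, N$, as follows:
$$c_i u_i + \sum_{j=1}^{N} \int_{\Gamma_j} u\, q^* \, d\Gamma - \sum_{j=1}^{N} \int_{\Gamma_j} u^* q \, d\Gamma = 0. \tag{3.7}$$
The values of $u$ and $q$ in the integrands of (3.7) are constant within each element, so $u$ and $q$ can be taken out of the integrals. This gives
$$c_i u_i + \sum_{j=1}^{N} \left( \int_{\Gamma_j} q^* \, d\Gamma \right) u_j - \sum_{j=1}^{N} \left( \int_{\Gamma_j} u^* \, d\Gamma \right) q_j = 0. \tag{3.8}$$
With the given boundary conditions, we can rearrange equation (3.8) with all the unknowns on the left-hand side and, on the right-hand side, a vector obtained by multiplying matrix elements with the known values. If $u$ is prescribed on the elements $j = 1, \dots, m$ and $q$ on the elements $j = m+1, \dots, N$, this gives
$$c_i u_i + \sum_{j=m+1}^{N} \left( \int_{\Gamma_j} q^* \, d\Gamma \right) u_j - \sum_{j=1}^{m} \left( \int_{\Gamma_j} u^* \, d\Gamma \right) q_j = -\sum_{j=1}^{m} \left( \int_{\Gamma_j} q^* \, d\Gamma \right) u_j + \sum_{j=m+1}^{N} \left( \int_{\Gamma_j} u^* \, d\Gamma \right) q_j. \tag{3.9}$$
The whole set of equations can be expressed in matrix form as
$$A \begin{pmatrix} q_D \\ u_N \end{pmatrix} = B \begin{pmatrix} u_D \\ q_N \end{pmatrix},$$
where the subscripts $D$ and $N$ refer to the parts of the boundary carrying Dirichlet and Neumann data, respectively. In the iteration, (i) with $q_k$ we solve a system of this form whose known vector contains $q_k$ and whose unknown vector contains $u_{k+1}$, and get $u_{k+1}$, which is needed in the next system. (ii) With $u_{k+1}$, we get $q_{k+1}$ by solving the analogous system whose known vector contains $u_{k+1}$ and whose unknown vector contains $q_{k+1}$.

Each iteration of our boundary element method therefore requires solving two linear systems, which is easy to implement with standard matrix computations.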
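The rearrangement of the discretized equations into a linear system for the unknown boundary values can be sketched as follows. This is a generic illustration under assumptions, not the authors' code: the influence matrices `H` (containing the $q^*$ integrals and the $c_i$ terms) and `G` (containing the $u^*$ integrals) are taken as given, so that the discrete equations read `H @ u = G @ q`; the function name `solve_mixed` and the boolean mask `is_dirichlet` are hypothetical.

```python
import numpy as np

def solve_mixed(H, G, is_dirichlet, u_known, q_known):
    """Solve H u = G q when u is known on Dirichlet nodes and q on the others.

    Unknowns are q on Dirichlet nodes and u on Neumann nodes; the known
    values are moved to the right-hand side.
    """
    D = np.asarray(is_dirichlet, dtype=bool)
    Nn = ~D
    n = H.shape[0]
    A = np.empty((n, n))
    A[:, D] = -G[:, D]        # columns multiplying the unknown q values
    A[:, Nn] = H[:, Nn]       # columns multiplying the unknown u values
    rhs = G[:, Nn] @ q_known[Nn] - H[:, D] @ u_known[D]
    x = np.linalg.solve(A, rhs)
    u = np.where(D, u_known, x)   # fill in u on Neumann nodes
    q = np.where(D, x, q_known)   # fill in q on Dirichlet nodes
    return u, q
```

In the iteration of Section 3.1 this routine would be called twice per sweep, once with Dirichlet data on $\Gamma_0$ and Neumann data on the rest of the boundary, and once with the roles exchanged; only the mask and the known vectors change, while `H` and `G` can be assembled once.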
3.2
Convergence analysis
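In this section we give the convergence analysis under a regularity assumption on the unknown potential $u$. First of all, we simplify the subproblem of Step 1 to the following Cauchy problem for the Laplace equation. Let $\Omega \subset \mathbb{R}^2$ be an open bounded set and let $\Gamma_1$, $\Gamma_2$ be two parts of the boundary $\partial\Omega$ satisfying $\Gamma_1 \cup \Gamma_2 = \partial\Omega$ and $\Gamma_1 \cap \Gamma_2 = \emptyset$. Consider
$$\begin{cases} \Delta u = 0 & \text{in } \Omega, \\ u = f & \text{on } \Gamma_1, \\ u_\nu = g & \text{on } \Gamma_1, \end{cases} \tag{3.10}$$
where $\nu$ is the unit outer normal vector. Given the Cauchy data $(f, g) \in H^{1/2}(\Gamma_1) \times H^{1/2}_{00}(\Gamma_1)'$, we assume that there exists an $H^1$-solution of problem (3.10). We are mainly interested in the determination of the Neumann trace.

Following [8], we now introduce an operator $T : H^{1/2}_{00}(\Gamma_2)' \to H^{1/2}_{00}(\Gamma_2)'$ that represents the above iteration. We can write our iterative method in terms of the two mixed problems
$$\Delta w = 0 \ \text{in } \Omega; \quad w|_{\Gamma_1} = f; \quad w_\nu|_{\Gamma_2} = \phi,$$
$$\Delta v = 0 \ \text{in } \Omega; \quad v_\nu|_{\Gamma_1} = g; \quad v|_{\Gamma_2} = \psi.$$
We define the operators $L_n : H^{1/2}_{00}(\Gamma_2)' \to H^1(\Omega)$ and $L_d : H^{1/2}(\Gamma_2) \to H^1(\Omega)$ by
$$L_n(\phi) := w \in H^1(\Omega), \qquad L_d(\psi) := v \in H^1(\Omega).$$
Define the Neumann trace operator $\gamma_n : H^1(\Omega) \to H^{1/2}_{00}(\Gamma_2)'$, $\gamma_n(u) := u_\nu|_{\Gamma_2}$, and the Dirichlet trace operator $\gamma_d : H^1(\Omega) \to H^{1/2}(\Gamma_2)$, $\gamma_d(u) := u|_{\Gamma_2}$. Then we can rewrite the iteration as
$$w = L_n(\phi_k), \quad \psi_k = \gamma_d(w), \quad v = L_d(\psi_k), \quad \phi_{k+1} = \gamma_n(v).$$
If we define $T := \gamma_n \circ L_d \circ \gamma_d \circ L_n$, we conclude that $T$ is an affine operator on $H^{1/2}_{00}(\Gamma_2)'$, which satisfies
$$\phi_{k+1} = T(\phi_k) = T^{k+1}(\phi_0).$$
The operators $L_n$ and $L_d$ can be split into linear and constant parts,
$$L_n(\cdot) = L_n^l(\cdot) + w_f, \qquad L_d(\cdot) = L_d^l(\cdot) + v_g,$$
where the $H^1(\Omega)$-functions $w_f$ and $v_g$ depend only on $f$ and $g$, respectively. With these definitions, and writing $T_l := \gamma_n \circ L_d^l \circ \gamma_d \circ L_n^l$ and $z_{f,g} := \gamma_n \circ L_d^l \circ \gamma_d(w_f) + \gamma_n(v_g)$, we have
$$\phi_{k+1} = T(\phi_k) = \gamma_n \circ L_d^l \circ \gamma_d \circ L_n^l(\phi_k) + \gamma_n \circ L_d^l \circ \gamma_d(w_f) + \gamma_n(v_g) = T_l^{k+1}(\phi_0) + \sum_{j=0}^{k} T_l^{j}(z_{f,g}).$$
From [8], we know that the operator $T_l$ is positive, self-adjoint, injective, regularly asymptotic in $H^{1/2}_{00}(\Gamma_2)'$ and non-expansive.

In [8] the convergence of this iterative method is proved under a source condition, which is not easy for engineers to interpret. Here we use only regularity assumptions in the convergence analysis. Since our problem is posed in an annular domain, the following theorems are stated for the annular domain, but the results can be extended to a general domain.

Firstly, we define the Sobolev spaces of periodic functions
$$H^s_{per}(-\pi, \pi) := \Big\{ \varphi(y) = \sum_{j \in \mathbb{Z}} \varphi_j e^{ijy} \ \Big|\ \sum_{j \in \mathbb{Z}} (1 + j^2)^s \varphi_j^2 < \infty \Big\}, \quad s \in \mathbb{R}. \tag{3.11}$$
Before we give the theorems, we introduce the following logarithmic-type source conditions:
$$f(\lambda) = \begin{cases} \big(\ln(e \lambda^{-1})\big)^{-p}, & \lambda > 0, \\ 0, & \lambda = 0. \end{cases} \tag{3.12}$$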
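Theorem 3.2. Let $\Omega \subset \mathbb{R}^2$ be an annular domain. Let $(f, g)$ be consistent Cauchy data and assume that the solution $\phi$ of the Cauchy problem (3.10) satisfies $\phi - \phi_0 \in H^1_{per}$, where $\phi_0 \in H$ is some initial guess. Let $\mu > 2$, let $(f_\varepsilon, g_\varepsilon)$ be some given noisy data with $\|z_\varepsilon - z_{f,g}\| \le \varepsilon$, $\varepsilon > 0$, and let $k(\varepsilon, z_\varepsilon)$ be the stopping rule determined by the discrepancy principle
$$k(\varepsilon, z_\varepsilon) = \min\{k \in \mathbb{N} \mid \|z_\varepsilon - (I - T_l)\phi_k^\varepsilon\| \le \mu\varepsilon\}. \tag{3.13}$$
Then there exists a constant $C$, depending on $\phi_0$ only, such that
i) $\|\phi - \phi_k^\varepsilon\| \le C (\ln k)^{-1}$,
ii) $\|z_\varepsilon - (I - T_l)\phi_k^\varepsilon\| \le C k^{-1} (\ln k)^{-1}$,
for all iteration indices $k$ satisfying $1 \le k \le k(\varepsilon, z_\varepsilon)$.

Theorem 3.3. Set $k_\varepsilon = k(\varepsilon, z_\varepsilon)$. Under the assumptions of Theorem 3.2 we have
i) $k_\varepsilon \ln(k_\varepsilon) = O(\varepsilon^{-1})$,
ii) $\|\phi - \phi_{k_\varepsilon}^\varepsilon\| \le O\big((-\ln \sqrt{\varepsilon})^{-1}\big)$.

The next lemma is the key to the proof of the theorems.

Lemma 3.4. Let $\Omega \subset \mathbb{R}^2$ be an annular domain. Assume that the solution $\phi$ of the Cauchy problem (3.10) in this domain satisfies
$$\phi - \phi_0 \in H^1_{per}, \tag{3.14}$$
where $\phi_0 \in H$ is some initial guess and $H^1_{per}$ is the Sobolev space of periodic functions defined in (3.11). This regularity assumption is equivalent to the existence of some $\psi \in H^0_{per}$ satisfying
$$\phi - \phi_0 = f(I - T_l)\psi,$$
where $f$ is the logarithmic-type source function (3.12) with $p = 1$.

Proof. For simplicity, we consider the Cauchy problem (3.10) in the annular domain with
$$\Gamma_1 = \{(R, \theta);\ \theta \in (-\pi, \pi)\},\ R > 1, \qquad \Gamma_2 = \{(1, \theta);\ \theta \in (-\pi, \pi)\},$$
where $f(\theta) = \sum_{j=1}^{N} a_j \sin(j\theta)$ and $g(\theta) = \sum_{j=1}^{N} b_j \sin(j\theta)$. Given the Neumann data
$$\phi_0(\theta) = \sum_{j=1}^{N} c_j \sin(j\theta),$$
we can get
$$T_l \phi_0(\theta) = \sum_{j=1}^{N} \lambda_j c_j \sin(j\theta),$$
where
$$\lambda_j = \frac{(R^j - R^{-j})^2}{(R^j + R^{-j})^2}.$$
For $\phi - \phi_0 \in H^1_{per}$, there exist $a_j$ ($j = 1, \dots, N$) satisfying $\sum_{j=1}^{N} a_j^2 < \infty$ and
$$\phi - \phi_0 = \sum_{j=1}^{N} a_j j^{-1} \sin(jy).$$
So we get
$$\sum_{j=1}^{N} (1 + j^2)\, a_j^2\, j^{-2} < \infty.$$
For the logarithmic-type source function (3.12), the source condition requires some $\psi \in H^0_{per}$ satisfying
$$\phi - \phi_0 = f(I - T_l)\psi.$$
So our problem reduces to finding this $\psi$. Set $\psi = \sum_{j=1}^{N} b_j \sin(jy)$; then
$$b_j = \frac{a_j}{j\, f(1 - \lambda_j)}.$$
Since $1 - \sqrt{\lambda_j} = \dfrac{2R^{-j}}{R^j + R^{-j}}$ and $0 < \ln(1 + \sqrt{\lambda_j}) < \ln 2$, we have the estimates
$$\ln\Big(\frac{e}{1 - \lambda_j}\Big) > 1 - \ln 2 + \ln\Big(\frac{R^j + R^{-j}}{2R^{-j}}\Big) > 2j \ln R - 1,$$
$$\ln\Big(\frac{e}{1 - \lambda_j}\Big) < 1 + \ln\Big(\frac{1}{1 - \sqrt{\lambda_j}}\Big) = 1 + \ln\Big(\frac{R^j + R^{-j}}{2R^{-j}}\Big) < 2j \ln R + 1,$$
and therefore
$$2j \ln R - 1 < \frac{1}{f(1 - \lambda_j)} < 2j \ln R + 1.$$
Together with $\sum_{j=1}^{N} a_j^2 < \infty$, we obtain $\sum_{j=1}^{N} b_j^2 < \infty$, i.e., $\psi \in H^0_{per}$. $\Box$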
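The logarithmic growth used in this proof is easy to check numerically. The following sketch is only an illustration: it assumes $p = 1$ in (3.12), so that $1/f(1-\lambda_j) = \ln\big(e/(1-\lambda_j)\big)$, uses the identity $1 - \lambda_j = 4/(R^j + R^{-j})^2$, and tests the lower bound $2j\ln R - 1$ together with the slightly looser upper bound $2j\ln R + 1$. The function names and the sample value of $R$ are hypothetical.

```python
import math

def one_minus_lambda(j, R):
    """1 - lambda_j, using 1 - lambda_j = 4 / (R^j + R^-j)^2 to avoid cancellation."""
    return 4.0 / (R**j + R**(-j))**2

def inv_f(j, R):
    """1 / f(1 - lambda_j) for p = 1, i.e. ln(e / (1 - lambda_j))."""
    return math.log(math.e / one_minus_lambda(j, R))

R = 2.0
for j in range(1, 11):
    lower = 2 * j * math.log(R) - 1
    upper = 2 * j * math.log(R) + 1
    print(j, lower, inv_f(j, R), upper)
```

Since $1/f(1-\lambda_j)$ grows like $2j\ln R$, the coefficients $b_j = a_j/(j f(1-\lambda_j))$ stay comparable to $a_j$, which is the content of the last step of the proof.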
Lemma 3.5. Let $(f, g)$ be consistent Cauchy data and assume that the solution $\phi$ of the fixed point equation satisfies the source condition
$$\phi - \phi_0 = f(I - T_l)\psi \quad \text{for some } \psi \in H,$$
where $\phi_0 \in H$ is some initial guess and $f$ is the function defined in (3.12) with $p \ge 1$. Let $\mu > 2$, let $(f_\varepsilon, g_\varepsilon)$ be some given noisy data with $\|z_\varepsilon - z_{f,g}\| \le \varepsilon$, $\varepsilon > 0$, and let $k(\varepsilon, z_\varepsilon)$ be the stopping rule determined by the discrepancy principle. Then there exists a constant $C$, depending on $p$ and $\|\psi\|$ only, such that
i) $\|\phi - \phi_k^\varepsilon\| \le C (\ln k)^{-p}$,
ii) $\|z_\varepsilon - (I - T_l)\phi_k^\varepsilon\| \le C k^{-1} (\ln k)^{-p}$,
for all iteration indices $k$ satisfying $1 \le k \le k(\varepsilon, z_\varepsilon)$.

Lemma 3.6. Set $k_\varepsilon = k(\varepsilon, z_\varepsilon)$. Under the assumptions of Lemma 3.5 we have
i) $k_\varepsilon (\ln(k_\varepsilon))^p = O(\varepsilon^{-1})$,
ii) $\|\phi - \phi_{k_\varepsilon}^\varepsilon\| \le O\big((-\ln \sqrt{\varepsilon})^{-p}\big)$.

The proofs of Lemma 3.5 and Lemma 3.6 can be found in [4]. With all the lemmas above, the theorems follow easily: Theorem 3.2 is deduced from Lemma 3.4 and Lemma 3.5, and Theorem 3.3 is deduced from Lemma 3.4 and Lemma 3.6.
3.3
Numerical experiment for the Maz'ya iteration
In this section, we test the previous algorithm on a few examples with Matlab. For simplicity, we take the domain $\Omega$ with interior radius 1 and outer radius $1 + b$ in the following experiments. The number of boundary elements is $n$. Since we use quadratic elements, we take $n$ nodes on the outer circle and also $n$ nodes on the interior circle. The number of nodes at which data are given is $m$. We consider the harmonic function
$$u(x, y) = \log[(x - 0.5)^2 + (y - 0.5)^2].$$
We use the algorithm described above to get the unknown data on the boundary, and then use the integral formula (3.5) for harmonic functions to calculate the data on the circle of radius $1 + a$ ($a < b$). In the following numerical experiments the noise level is $\delta$. The figures on the left compare the exact solution with the approximate solution: the dotted line represents the approximate solution and the solid line represents the exact solution. The figures on the right show the absolute errors. We use the stopping rule $\|u_{k+1} - u_k\|_{L^2(\Gamma_1 \cup \tilde\Gamma_2)} \le 10^{-3}$.

Example 1. In this experiment we take n = 100, 200, m = 50, 100, b = 1, a = 0.5 and $\delta$ = 0.01, respectively.
n=100, m=50:
Figure 3.1
Figure 3.2
n=200, m=100:
Figure 3.3
Figure 3.4
Thus higher precision requires more elements in the iteration.
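Synthetic Cauchy data for an experiment of this kind can be generated along the following lines. This is a sketch under assumptions: the paper reports Matlab experiments and does not specify the noise model, so the additive uniform noise in $[-\delta, \delta]$, the equally spaced nodes and the function names used here are all hypothetical choices made for illustration.

```python
import numpy as np

def u_exact(x, y):
    """Test function u(x, y) = log[(x - 0.5)^2 + (y - 0.5)^2]."""
    return np.log((x - 0.5)**2 + (y - 0.5)**2)

def cauchy_data_on_circle(radius, n_nodes, delta=0.0, seed=0):
    """Dirichlet and Neumann data of u_exact at n_nodes equally spaced nodes.

    The normal derivative is taken along the outward radial direction.
    delta: amplitude of the (assumed) additive uniform noise.
    """
    theta = 2.0 * np.pi * np.arange(n_nodes) / n_nodes
    x, y = radius * np.cos(theta), radius * np.sin(theta)
    d2 = (x - 0.5)**2 + (y - 0.5)**2
    u = np.log(d2)
    # grad u = 2 (x - 0.5, y - 0.5) / d2, projected on (cos theta, sin theta)
    un = (2.0 * (x - 0.5) * np.cos(theta) + 2.0 * (y - 0.5) * np.sin(theta)) / d2
    rng = np.random.default_rng(seed)
    u_noisy = u + delta * rng.uniform(-1.0, 1.0, n_nodes)
    un_noisy = un + delta * rng.uniform(-1.0, 1.0, n_nodes)
    return theta, u_noisy, un_noisy

# Outer circle of radius 1 + b with b = 1, n = 100 nodes, noise level 0.01.
theta, phi, psi = cauchy_data_on_circle(2.0, 100, delta=0.01)
```

Only the first $m$ of these nodal values would be passed to the iteration when the data are given on part of the outer boundary.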
Example 2. In this experiment we set n = 100, m = 30, b = 1, a = 0.5 and $\delta$ = 0.01, 0.001, respectively.
n=100, m=30, $\delta$ = 0.01:
Figure 3.5
Figure 3.6
n=100, m=30, $\delta$ = 0.001:
Figure 3.7
Figure 3.8
The numerical results show that subproblem 1 is ill-posed in the Hadamard sense, i.e., the solution does not depend continuously on the data: small errors in the measurement of the voltages on the boundary can produce arbitrarily large errors in the solution.

Example 3. In this experiment we set n = 100, m = 50, a = 0.1, 0.25, 0.5, b = 1 and $\delta$ = 0.01 (a = 0.5 is shown in the previous example).
a=0.25:
Figure 3.11
Figure 3.12
From the numerical simulation, it can be seen that the precision decreases as a decreases.
4
Numerical results for the inverse problem
In this section we will use the iterative algorithm to treat our inverse problem, and give some numerical examples. In the following test, we choose the ring domain as
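$$\Omega = \{(x, y) \mid 1 < \sqrt{x^2 + y^2} < 2\}.$$
Example 1. In this test we recover the continuous piecewise linear function
$$\gamma(\theta) = \begin{cases} 1 & \text{when } \theta < 1, \\ 4\theta - 3 & \text{when } 1 < \theta < 1.5, \\ -2\theta + 6 & \text{when } 1.5 < \theta < 2.5, \\ 1 & \text{when } 2.5 < \theta < 4.5, \\ 3\theta - \frac{25}{2} & \text{when } 4.5 < \theta < 5.5, \\ -6\theta + 37 & \text{when } 5.5 < \theta < 6, \\ 1 & \text{otherwise.} \end{cases}$$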
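The coefficient of Example 1 can be written down as a small function, which also makes its continuity easy to check. The sketch below assumes that the seven branches are $1$, $4\theta - 3$, $-2\theta + 6$, $1$, $3\theta - 25/2$, $-6\theta + 37$ and $1$ on the breakpoints $1$, $1.5$, $2.5$, $4.5$, $5.5$ and $6$; the constant $25/2$ is the value that makes the function continuous at $\theta = 4.5$ and $\theta = 5.5$. The function name `gamma_piecewise` is hypothetical.

```python
import numpy as np

def gamma_piecewise(theta):
    """Continuous piecewise linear coefficient gamma(theta), theta taken mod 2*pi."""
    theta = np.mod(np.atleast_1d(np.asarray(theta, dtype=float)), 2.0 * np.pi)
    conditions = [
        (theta > 1.0) & (theta <= 1.5),
        (theta > 1.5) & (theta <= 2.5),
        (theta > 4.5) & (theta <= 5.5),
        (theta > 5.5) & (theta <= 6.0),
    ]
    branches = [
        4.0 * theta - 3.0,
        -2.0 * theta + 6.0,
        3.0 * theta - 12.5,
        -6.0 * theta + 37.0,
    ]
    # gamma = 1 outside the four sloped intervals
    return np.select(conditions, branches, default=1.0)

values = gamma_piecewise([1.0, 1.5, 2.5, 3.0, 4.5, 5.5, 6.0])
```

With these branches the coefficient rises from 1 to 3 and back on $[1, 2.5]$, and from 1 to 4 and back on $[4.5, 6]$, so it models two separate damaged regions on the inner circle.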
We get the data $\phi$, $\psi$ from the solution of the direct problem by the boundary element method:
$$\begin{cases} \Delta U = 0, & x \in \Omega, \\ U_\nu = -1, & x \in \Gamma_0 \cup \tilde\Gamma_2, \\ U_\nu + \gamma U = 0, & x \in \Gamma_1. \end{cases}$$
The following figures show the results.
(i) Set m=100, n=100
Figure 4.1
Figure 4.2
(ii) Set m=50, n=100
Figure 4.3
Figure 4.4
Example 2. We consider the harmonic function
$$u(x, y) = y^3 - x^2 y + x^2 - y^2 + 6,$$
whose form in polar coordinates is
$$u(r, \theta) = r^3(\sin^3\theta - \cos^2\theta \sin\theta) + r^2 \cos 2\theta + 6.$$
It is easy to see that the coefficient $\gamma$ on the inner circle is
$$\gamma(\theta) = \frac{3(\sin^3\theta - \cos^2\theta \sin\theta) + 2\cos 2\theta}{\sin^3\theta - \cos^2\theta \sin\theta + \cos 2\theta + 6}.$$
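The closed form of $\gamma$ can be checked against the definition $\gamma = -u_\nu / u$ on the inner circle. The sketch below assumes the inner radius is 1, as in the ring domain of this section, so that the outer normal of the annulus there is $-e_r$ and $\gamma = u_r / u$; the radial derivative is approximated by a central finite difference. The function names and the step size `h` are hypothetical.

```python
import numpy as np

def u_polar(r, t):
    """u(r, theta) of Example 2 in polar coordinates."""
    return r**3 * (np.sin(t)**3 - np.cos(t)**2 * np.sin(t)) + r**2 * np.cos(2 * t) + 6

def gamma_closed_form(t):
    """Closed-form coefficient on the inner circle r = 1."""
    s = np.sin(t)**3 - np.cos(t)**2 * np.sin(t)
    return (3 * s + 2 * np.cos(2 * t)) / (s + np.cos(2 * t) + 6)

def gamma_finite_difference(t, h=1e-6):
    """gamma = -u_nu / u with u_nu = -du/dr on r = 1 (outer normal is -e_r)."""
    du_dr = (u_polar(1 + h, t) - u_polar(1 - h, t)) / (2 * h)
    return du_dr / u_polar(1.0, t)

t = np.linspace(0.0, 2.0 * np.pi, 50)
max_diff = np.max(np.abs(gamma_closed_form(t) - gamma_finite_difference(t)))
```

The denominator $\sin^3\theta - \cos^2\theta\sin\theta + \cos 2\theta + 6$ is bounded below by 4, so the division is safe for every $\theta$.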
The figure on the left is a comparison between the approximate $\gamma$ and the exact $\gamma$. The figure on the right is the absolute error curve.
(i) Set m=100, n=100
Figure 4.5
Figure 4.6
(ii) Set m=50, n=100
Figure 4.7
Figure 4.8
(iii) Set m=100, n=100 and add 5% random noise to the Cauchy data.
It can be seen from the numerical results that there is a lot of noise in the numerical solution, which means that our method for this inverse problem is very sensitive to errors in the data. Even a small error in the data may prevent the iterative method from converging. Moreover, if the Cauchy data are given only on part of the boundary, we can only obtain a local solution.
Figure 4.9
Figure 4.10

5
Conclusions
In this paper, we have investigated an inverse problem of detecting corrosion in a pipe. The problem has been modelled by the Laplace equation with an unknown coefficient in the boundary condition. We have derived a numerical method to solve it and tested it with numerical experiments.
References
[1] C. A. Brebbia, J. C. F. Telles and L. C. Wrobel, Boundary Element Techniques: Theory and Applications in Engineering, Springer-Verlag, Berlin, 1984.
[2] J. Cheng, Y. C. Hon, T. Wei and M. Yamamoto, Numerical computation of a Cauchy problem for Laplace's equation, ZAMM Z. Angew. Math. Mech., 81 (2001), No. 10, 665-674.
[3] M. Choulli, Stability estimates for an inverse elliptic problem, J. Inverse Ill-posed Probl., 10 (2002), No. 6, 601-610.
[4] H. W. Engl and A. Leitao, A Mann iterative regularization method for elliptic Cauchy problems, Numer. Funct. Anal. Optim., 22 (2001), No. 7-8, 861-884.
[5] D. Fasino and G. Inglese, Discrete methods in the study of an inverse problem for Laplace's equation, IMA Journal of Numerical Analysis, 19 (1999), 105-118.
[6] Y. C. Hon and T. Wei, Backus-Gilbert algorithm for the Cauchy problem of the Laplace equation, Inverse Problems, 17 (2001), No. 2, 261-271.
[7] G. Inglese, An inverse problem in corrosion detection, Inverse Problems, 13 (1997), 977-994.
[8] A. Leitao, An iterative method for solving elliptic Cauchy problems, Numer. Funct. Anal. Optim., 21 (2000), No. 5-6, 715-742.
[9] D. Lesnic, L. Elliott and D. B. Ingham, An iterative boundary element method for solving numerically the Cauchy problem for the Laplace equation, Engineering Analysis with Boundary Elements, 20 (1997), No. 9, 123-133.
[10] M. Jourhmane and A. Nachaoui, An alternating method for an inverse Cauchy problem, Numerical Algorithms, 21 (1999), 247-260.
[11] V. A. Kozlov, V. G. Maz'ya and A. V. Fomin, An iterative method for solving the Cauchy problem for elliptic equations, Comput. Math. Math. Phys., 31(1) (1991), 45-52.
[12] T. Wei, Y. C. Hon and J. Cheng, Computation for multidimensional Cauchy problem, SIAM J. Control Optim., 42 (2003), No. 2, 381-396 (electronic).
Analysis of the Local Quasicontinuum Method

Weinan E
Department of Mathematics and PACM, Princeton University and School of Mathematical Sciences, Peking University.
E-mail: [email protected]

Pingbing Ming
Institute of Computational Mathematics and Scientific/Engineering Computing, AMSS, No. 55, Zhong-Guan-Cun East Road, Chinese Academy of Sciences.
E-mail: [email protected]

Abstract
We analyze the stability and accuracy of the local quasicontinuum method. Optimal estimates are obtained for the error between the quasicontinuum solution and the macroscopic model solution.
1
Introduction
This is the first of a series of papers devoted to the analysis of the quasicontinuum method (QC), which is becoming a popular multiscale technique for simulating the static properties of crystalline materials. Since QC is a computational method that couples the atomistic models of crystals with continuum models, the analysis naturally touches upon the important issue of how these different levels of models are related to each other. In the present paper, we study the simplest situation, in which classical potentials are used in the atomistic models and there are no defects in the crystal.

Consider the following type of atomistic model of crystal deformation under an applied force:
$$E\{y_1, \cdots, y_N\} = V(y_1, \cdots, y_N) - \sum_{i=1}^{N} f(x_i) \cdot y_i, \tag{1.1}$$
where $V$ is the interaction potential between atoms, $f$ is the external force, $y_i$ is the deformed position of the $i$-th atom and $x_i$ denotes its undeformed position. Let $\Omega$ be a sufficiently smooth open set representing the region occupied by the material in the undeformed (reference) configuration. We have introduced in $V$ an explicit parameter $\epsilon$ for the lattice constant. Naturally we are interested in the situation when $\epsilon$ is much smaller than the size of $\Omega$, which is $O(1)$.

In the atomistic model, the deformation of the crystal is described by the displaced position of each atom. The positions $\{y_1, \cdots, y_N\}$ are computed by minimizing the energy functional (1.1) subject to certain boundary conditions. In contrast, in the continuum regime, the deformation is described by the displacement field $u$, and $u(x_i) = y_i - x_i$ is the displacement of the $i$-th atom. The vector field $u$ is computed by minimizing a continuous functional of the type
$$\int_\Omega W(\nabla u) \, dx - \int_\Omega f(x) \cdot u(x) \, dx, \tag{1.2}$$
subject to certain boundary conditions. Here $f$ is again the external force, and $W$ is the stored energy functional of the material.

A very important practical question is how one gets $W$. In the linear elastic response regime, i.e., when the displacement is infinitesimal and $W$ can be approximated by a quadratic function of $\nabla u$, the coefficients in this quadratic form can be obtained from $V$ by linearizing the total potential energy at the equilibrium (undeformed) position. The details of this procedure can be found in [4]. At finite deformation, it becomes less clear what form of $W$ one should take. One common proposal is to use the Cauchy-Born (CB) rule. But straightforward application of the CB rule often leads to variational problems that are badly behaved [2]. For the simple lattice, $W_{CB}$ is defined as
$$W_{CB}(A) = \lim_{k \to \infty} \frac{\sum V(y_i, y_j, y_k)}{|kD|}, \tag{1.3}$$
where $D$ is an open domain in $\mathbb{R}^d$ and $L$ denotes the lattice. As to the complex lattice, $W_{CB}$ is defined as
$$W_{CB}(A) = \min_p W(A, p), \tag{1.4}$$
where the summation is carried out for $y_i, y_j, y_k \in (I + A)L \cap kD$.

The quasicontinuum method put forward by Tadmor, Ortiz and Phillips [15] is a procedure for modelling the deformation of crystalline material by using atomistic models directly. We refer to [11] for an updated review of QC. The deformation of the crystal is represented by a collection of representative atoms (repatoms) on an adaptively generated finite element mesh that resolves but does not over-resolve the variations of the displacement field. The repatoms can be either on the vertices of the mesh or at the centers of the elements. Once the repatoms are selected, the displacement of the remaining atoms can be approximated via a linear interpolation:
$$u_i = \sum_{\alpha=1}^{N_{rep}} S_\alpha(x_i)\, u_\alpha,$$
where the subscript $\alpha$ identifies the representative atoms and $N_{rep}$ is the number of the repatoms involved. As usual, we use $x_i$ to denote the position of the $i$-th atom in the undeformed configuration, and $u_i = y_i - x_i$ to denote the displacement of the $i$-th atom. $S_\alpha$ is an appropriate weight function. This step reduces the number of degrees of freedom. But to compute the total energy, we still need to visit every atom. To reduce the computational complexity in this step, several summation rules have been introduced. The simplest one is to assume that the deformation gradient $A = \partial u / \partial x$ is uniform within each element; therefore, the Cauchy-Born rule holds true [8]. Denote by $E(A)$ the strain energy density obtained from the Cauchy-Born rule. The strain energy in element $K$ can be approximated by $E(A_K)|K|$, where $|K|$ is the volume of the element $K$ and $A_K$ is the deformation gradient of the element $K$. With this approximation, the evaluation of the total energy is reduced to a summation over the elements:
$$E_{tot} \approx \sum_{K \in \mathcal{T}_H} E(A_K)|K|.$$
This version is called the local QC. In the presence of defects, the deformation is non-smooth and the local QC may not be accurate enough. A nonlocal version of QC has been developed in which the energy is computed by
$$E \approx \sum_{\alpha=1}^{N_{rep}} n_\alpha E_\alpha(u_\alpha).$$
Here the energy $E_\alpha$ from each repatom is computed by visiting its neighboring atoms, whose positions are generated by using the local deformation, and $n_\alpha$ is a set of suitably chosen weights. There are several approaches to determining $n_\alpha$, all of which can be reformulated as certain summation rules. We refer to [14] and [9] for different types of summation rules, and we will analyze a special one in the last section.
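The local QC summation over elements can be illustrated in the simplest possible setting. The sketch below makes several assumptions that are not in the text: a one-dimensional chain, nearest-neighbour interactions only, a Lennard-Jones pair potential with its minimum at the lattice constant, and mesh nodes that coincide with lattice sites. Under these assumptions the Cauchy-Born energy density is $W(A) = V_0((1 + A)\epsilon)/\epsilon$, and the element sum reproduces the full atomistic sum for a piecewise linear displacement. All function names are hypothetical.

```python
import numpy as np

def lj(r, r0=1.0):
    """Lennard-Jones pair potential with minimum at r = r0 (illustrative choice)."""
    s = r0 / r
    return s**12 - 2.0 * s**6

def local_qc_energy(nodes, u_nodes, eps):
    """Sum over elements of W(A_K) |K| for a 1D nearest-neighbour chain."""
    h = np.diff(nodes)                       # element sizes |K|
    A = np.diff(u_nodes) / h                 # displacement gradient per element
    W = lj((1.0 + A) * eps, r0=eps) / eps    # Cauchy-Born energy density
    return np.sum(W * h)

def atomistic_energy(nodes, u_nodes, eps):
    """Full nearest-neighbour sum, atoms displaced by linear interpolation."""
    first, last = round(nodes[0] / eps), round(nodes[-1] / eps)
    x = np.arange(first, last + 1) * eps     # undeformed atom positions
    y = x + np.interp(x, nodes, u_nodes)     # deformed positions
    return np.sum(lj(np.diff(y), r0=eps))

eps = 0.01
nodes = np.array([0.0, 0.2, 0.5, 1.0])       # repatoms on lattice sites
u_nodes = np.array([0.0, 0.004, 0.006, 0.02])
e_qc = local_qc_energy(nodes, u_nodes, eps)
e_full = atomistic_energy(nodes, u_nodes, eps)
```

The element sum touches 3 elements while the atomistic sum touches 100 bonds; with longer-range interactions or mesh nodes off the lattice the two energies would no longer agree exactly, which is where error estimates of the kind discussed in this paper come in.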
Another version of QC, which is based on the force balance, was proposed in [9]. The method generates clusters around the repatoms and performs the force calculation by using the atoms within the cluster (see Fig. 1.1 below).

Figure 1.1 Schematic demonstration for the cluster-based nonlocal QC (courtesy of M. Ortiz)
There is very little existing work on error estimates for QC. P. Lin [10] analyzed QC in the absence of external forces (hence no deformation). When deformation is present, the situation becomes quite different. Naively one might expect to prove a result stating that the global minimizers of the atomistic model (1.1) can be approximated to good accuracy by QC solutions. Such a result is in general false. In fact, it has been realized for some time that the global minimizers of the atomistic model do not support extensional stress [16]. This can be seen from the simple one-dimensional model in [6], which shows that a fractured state has less energy than a uniformly deformed state. A comprehensive analysis of one-dimensional QC with external forces was recently carried out by Blanc, Le Bris and Legoll [3]. Define
$$e(QC) := \max_{K \in \mathcal{T}_H} \bigl|\, 1 - n_K/|K| \,\bigr|. \tag{1.5}$$
These considerations motivate the following theorem:
Theorem 1.1. Assume $U_{CB} \in W^{2,\infty}(\Omega;\mathbb{R}^d)$ is the solution of (2.2). There exist two constants $H_0$ and $M_1$ such that for any $0 < H < H_0$ and $e(QC) < M_1$, there exists a locally unique QC solution $U_{QC}$ satisfying (3.1), and for $d = 1, 3$,
$$\|U_{QC} - U_{CB}\|_1 \le C(H + e(QC)), \qquad \|U_{QC} - U_{CB}\|_{1,\infty} \le C(H + e(QC)). \tag{1.6}$$
Weinan E, Pingbing Ming
Moreover, let $y_{QC} = x + U_{QC}(x)$. There exists a local minimizer $y$ of the full atomistic model such that
$$\|y - y_{QC}\|_d \le C(\epsilon + H + e(QC)), \tag{1.7}$$
where $\|\cdot\|_d$ is defined in (2.6). For the case $d = 2$, the above two estimates remain true except that $e(QC)$ in $(1.6)_2$ and (1.7) should be replaced by $e(QC)|\ln H|$.
It remains to estimate $e(QC)$. As to the local QC [15],
$$e(QC) = 0, \tag{1.8}$$
while there is no general estimate for the nonlocal QC. For the special case when the cluster-based summation rule [9] is employed, we have
$$e(QC) \le C\,\frac{\epsilon}{r}, \tag{1.9}$$
where $r$ is the cluster size.
Remark 1.2. Theorem 1.1 is only valid for perfect crystalline solids without defects. Therefore, it is not surprising that the local QC is more accurate than the nonlocal QC in this ideal case.
Throughout this paper, the constant $C$ is assumed to be independent of $\epsilon$ and $H$. The main results of this paper have been announced in [5].
2 The existence theorem for the continuum and the atomistic models
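Let $\Omega$ be a bounded cube. For any positive integers $m$ and $k$, we denote by $W^{k,p}(\Omega;\mathbb{R}^m)$ the Sobolev space of mappings $y: \Omega \to \mathbb{R}^m$ such that $\|y\|_{k,p} < \infty$ (see [1] for the definition). We write $W^{1,p}(\Omega)$ for $W^{1,p}(\Omega;\mathbb{R}^1)$ and $H^1(\Omega)$ for $W^{1,2}(\Omega)$. In particular, $W^{1,p}_{\#}(\Omega;\mathbb{R}^m)$ denotes the subspace of $W^{1,p}(\Omega;\mathbb{R}^m)$ whose elements have the same trace on opposite faces of $\partial\Omega$. The summation convention will be used. We will use $|\cdot|$ to denote the absolute value of a scalar quantity, the Euclidean norm of a vector and the volume of a set. In several places, we denote by $|\cdot|_{\ell_2}$ the $\ell_2$ norm of a vector to avoid confusion. For a vector $v$, $\nabla v$ is the tensor with components $(\nabla v)_{ij} = \partial_j v_i$; for a tensor field $S$, $\mathrm{div}\, S$ is the vector with components $\partial_j S_{ij}$. Given any function $W: M^{d\times d} \to \mathbb{R}$, we define
$$D_A W(A) = \Bigl(\frac{\partial W}{\partial A_{ij}}\Bigr), \qquad D_A^2 W(A) = \Bigl(\frac{\partial^2 W}{\partial A_{ij}\,\partial A_{kl}}\Bigr),$$
where $M^{m\times n}$ denotes the set of real $m \times n$ matrices. For any $p > d$ and $m > 0$, define
$$X := W^{m+2,p}(\Omega;\mathbb{R}^d) \cap W^{1,p}_{\#}(\Omega;\mathbb{R}^d) \quad\text{and}\quad Y := W^{m,p}(\Omega;\mathbb{R}^d).$$
Given the total energy functional
$$I(v) := \int_\Omega \bigl(W_{CB}(\nabla v(x)) - f(x)\cdot v(x)\bigr)\,dx, \tag{2.1}$$
where $W_{CB}(\nabla v)$ is given by (1.3) or (1.4) with $A = \nabla v$, we seek a solution $u$ with $u - B\cdot x \in X$ such that
$$I(u) = \min_{v - B\cdot x \in X} I(v).$$
The Euler-Lagrange equation of the above minimization problem is:
$$\begin{cases} \mathcal{L}(v) := -\mathrm{div}\bigl(D_A W_{CB}(\nabla v)\bigr) = f & \text{in } \Omega,\\ v - B\cdot x \text{ is periodic} & \text{on } \partial\Omega. \end{cases} \tag{2.2}$$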
As to the atomistic model, we consider the following minimization problem: E m o • ^i1 , ^ ,VN}(2-3) n iVi> -y—x—Bxis
periodic and 2_,4 Vi=0
The existence result is based upon the following two assumptions:
Assumption A: $W(A,p)$ satisfies the generalized Legendre-Hadamard condition at the undeformed configuration: there exist two constants $\Lambda_1$ and $\Lambda_2$, independent of $\epsilon$, such that for all $\xi, \eta, \zeta \in \mathbb{R}^d$, there holds
$$(\xi\otimes\eta,\ \zeta) \begin{pmatrix} D_A^2 W(0,p_0) & D_{Ap} W(0,p_0) \\ D_{pA} W(0,p_0) & D_p^2 W(0,p_0) \end{pmatrix} \begin{pmatrix} \xi\otimes\eta \\ \zeta \end{pmatrix} \ge \Lambda_1 |\xi|^2|\eta|^2 + \Lambda_2|\zeta|^2,$$
where $p_0$ is the shift at the undeformed configuration.
The second assumption is:
Assumption B: There exist two constants $\Lambda_1$ and $\Lambda_2$ such that the acoustic branch and the optical branch of the phonon spectrum satisfy
$$\omega_a(k) \ge \Lambda_1 |k| \quad\text{and}\quad \omega_o(k) \ge \Lambda_2/\epsilon, \tag{2.4}$$
respectively, where $k$ belongs to the first Brillouin zone, and $\omega_a(k)$, $\omega_o(k)$ are respectively the acoustic and the optical branches of the phonon spectrum.
Theorem 2.1. [6, Theorem 2.1, Theorem 2.2] If Assumption A holds and $p > d$, $m > 0$, then there exist three constants $\kappa_1$, $\kappa_2$ and $\delta$ such that for any $B \in M_+^{d\times d}$ with $\|B\| < \kappa_1$ and for any $f \in Y$ with $\|f\|_{W^{m,p}(\Omega)} < \kappa_2$, problem (2.2) has a unique solution $U_{CB}$ that satisfies $\|U_{CB} - B\cdot x\|_{W^{m+2,p}(\Omega)} < \delta$, and $U_{CB}$ is a $W^{1,\infty}$-local minimizer of the total energy functional (2.1).
Moreover, if Assumption B holds and $p > d$, $m > 6$, then there exist two constants $M_1$ and $M_2$ such that for any $B \in M_+^{d\times d}$ satisfying $\|B\| < M_1$ and for any $f \in Y$ with $\|f\|_{W^{m,p}(\Omega)} \le M_2$, problem (2.3) has a local minimizer $y$ that satisfies
$$\|y - y_{CB}\|_d \le C\epsilon, \tag{2.5}$$
where $y_{CB} = x + U_{CB}(x)$. The norm $\|\cdot\|_d$ is defined for any $z \in \mathbb{R}^{N\times d}$ with $\sum_{i=1}^N z_i = 0$ as
II* U = e^i^noz)1/2,
(2.6)
where $H_0 = \nabla^2 V(x)$ is the Hessian of $V$ at the undeformed state and $\|\cdot\|_d$ is a discrete analog of the $H^1$-norm.
The linearized operator of $\mathcal{L}$ at $u \in X$ is defined for any $v \in X$ as
$$\mathcal{L}_{lin}(u)v = -\mathrm{div}\bigl(D_A^2 W(\nabla u)\nabla v\bigr).$$
We associate $\mathcal{L}_{lin}$ with a bilinear form $A$ defined for any $v, w \in X$ by
$$A(u; v, w) := \int_\Omega \nabla w \cdot D_A^2 W(\nabla u)\nabla v \, dx.$$
A direct consequence of Theorem 2.1 is:
Corollary 2.2. [6, Lemma 4.1] For any $p > d$, there exist two constants $\kappa > 0$ and $\Lambda > 0$ such that for any $\|f\|_{L^p(\Omega)} < \kappa$,
$$A(U_{CB}; v, v) \ge \Lambda \|v\|_1^2 \quad\text{for all } v \in X.$$
3 Local quasicontinuum method
The original local QC [15] is based on the Cauchy-Born rule, which can be formulated as
Problem 3.1. Find $U_H \in X_H$ such that
$$I_H(U_H) = \min_{V \in X_H} I_H(V),$$
where
$$I_H(V) := W_{QC}(\nabla V) - \int_\Omega f(x)\cdot V(x)\,dx \quad\text{with}\quad W_{QC}(\nabla V) = \sum_{K \in \mathcal{T}_H} n_K W_{CB}(\nabla V).$$
The Euler-Lagrange equation associated with the above minimization problem is of the form: Find $U_H \in X_H$ such that
$$A_H(U_H, V) = (f, V) \quad\text{for all } V \in X_H, \tag{3.1}$$
where $A_H$ is defined for all $V, W \in X_H$ as
$$A_H(V, W) := \sum_{K \in \mathcal{T}_H} \frac{n_K}{|K|} \int_K D_A W_{CB}(\nabla V)\nabla W \, dx.$$
For any $v, v_H, w \in X$, define
$$R(v, v_H, w) := A(v_H, w) - A(v, w) - A(v; v_H - v, w). \tag{3.2}$$
Here $R$ satisfies, for $e_H := v - v_H$ and $1/p + 1/q = 1$ $(p, q > 1)$,
$$|R(v, v_H, w)| \le C(M)\,\|\nabla e_H\|_{0,2p}^2\, \|\nabla w\|_{0,q}, \tag{3.3}$$
for any $v$ and $v_H$ satisfying $\|v\|_{1,\infty} + \|v_H\|_{1,\infty} \le M$.
The existence and the local uniqueness of the solutions of (3.1) are established in the following lemma, which is similar to [7, Theorem 5.1]. We only give the proof for the case $d = 2$. The other cases are the same except that the estimate for the discrete Green's function changes into:
$$\|G^H\|_{1,1} \le C, \qquad d = 1, 3.$$
Lemma 3.2. Assume that $U_{CB} \in W^{2,p}(\Omega)$ with $p > d$ is the solution of (2.2). There exists a constant $H_0$ such that for all $0 < H < H_0$, problem (3.1) has a solution $U_H$ satisfying
$$\|U_H - P_H U_{CB}\|_{1,\infty} \le e(QC)^{1/2} + H^{1-d/p}, \qquad \|U_{CB} - U_H\|_{1,\infty} \le C\bigl(e(QC)^{1/2} + H^{1-d/p}\bigr), \tag{3.4}$$
where $P_H U_{CB}$ is defined as
$$A(U_{CB}; P_H U_{CB}, V) = A(U_{CB}; U_{CB}, V) \quad\text{for all } V \in X_H. \tag{3.5}$$
Moreover, if there exists a constant $\eta(M)$ with $0 < \eta(M) < 1$ such that
$$e(QC) \sum_{K \in \mathcal{T}_H} \int_K \bigl(D_A W_{CB}(\nabla V) - D_A W_{CB}(\nabla W)\bigr)\nabla Z \, dx \le \eta(M)\, \|\nabla(V - W)\|_0\, \|\nabla Z\|_0 \tag{3.6}$$
for all $V, W \in X_H \cap W^{1,\infty}(\Omega;\mathbb{R}^d)$ and $Z \in X_H$, then the QC solution $U_H$ satisfying (3.1) is locally unique.
Proof. In view of Corollary 2.2, for sufficiently small $\kappa$, $A$ is coercive at $U_{CB}$. Using Schatz's argument [13], we infer that there exists a constant $H_0 > 0$ such that for $0 < H < H_0$,
$$\sup_{W \in X_H} \frac{A(U_{CB}; V, W)}{\|W\|_1} \ge C\|V\|_1 \quad\text{for all } V \in X_H. \tag{3.7}$$
Hence there is a unique solution $P_H U_{CB}$ satisfying (3.5) and
$$\|U_{CB} - P_H U_{CB}\|_{1,\infty} \le C H^{1-d/p}. \tag{3.8}$$
Define a nonlinear mapping $T: X_H \to X_H$ by
$$A(U_{CB}; T(V), W) = A(U_{CB}; U_{CB}, W) - R(U_{CB}, V, W) + A(V, W) - A_H(V, W)$$
for any $W \in X_H$. Obviously $T$ is continuous. Define the set
$$B := \{ V \in X_H \mid \|V - P_H U_{CB}\|_{1,\infty} \le e(QC)^{1/2} + H^{1-d/p} \}.$$
We claim that there exists a constant $H_0 > 0$ such that for all $0 < H < H_0$, $T(B) \subset B$. Notice that
$$A(U_{CB}; T(V) - P_H U_{CB}, W) = -R(U_{CB}, V, W) + A(V, W) - A_H(V, W).$$
Taking $W = G^H$, where $G^H$ is the discrete regularized Green's function [12], and using the classical estimate for the Green's function [12], we obtain
$$\begin{aligned} \|T(V) - P_H U_{CB}\|_{1,\infty} &\le C|\ln H|\, \|U_{CB} - V\|_{1,\infty}^2 + C e(QC)|\ln H| \\ &\le C\bigl(\|U_{CB} - P_H U_{CB}\|_{1,\infty}^2 + \|P_H U_{CB} - V\|_{1,\infty}^2\bigr)|\ln H| + C e(QC)|\ln H| \\ &\le C\bigl(e(QC) + H^{2-2d/p} + H\bigr)|\ln H| \\ &\le e(QC)^{1/2} + H^{1-d/p}. \end{aligned}$$
An application of Brouwer's fixed point theorem gives the existence of $U_H \in B$ such that $T(U_H) = U_H$. By definition, $U_H$ satisfies $(3.4)_1$. An application of the triangle inequality and (3.8) yields $(3.4)_2$.
Suppose that both $U_H$ and $\tilde U_H$ are solutions of (3.1). Then we have
$$C\|U_H - \tilde U_H\|_1 \le \sup_{W \in X_H} \frac{\int_0^1 A(U_H^t; U_H - \tilde U_H, W)\,dt}{\|W\|_1} = \sup_{W \in X_H} \frac{|A(U_H, W) - A(\tilde U_H, W)|}{\|W\|_1},$$
where $U_H^t = (1-t)\tilde U_H + t U_H$. Note that
$$A(U_H, W) - A(\tilde U_H, W) = \bigl(A(U_H, W) - A_H(U_H, W)\bigr) - \bigl(A(\tilde U_H, W) - A_H(\tilde U_H, W)\bigr).$$
We have
$$\|U_H - \tilde U_H\|_1 \le \eta(M)\,\|U_H - \tilde U_H\|_1.$$
So if $\eta(M) < 1$, we must have $U_H = \tilde U_H$. $\square$
Based on the above lemma, we can derive the final error bounds.
Lemma 3.3. Assume that $U_{CB} \in W^{2,\infty}(\Omega;\mathbb{R}^d)$. There exist $C_0$ and $H_0$ such that if
$$e(QC) < C_0, \tag{3.9}$$
then for $0 < H < H_0$, we have
$$\|U_{CB} - U_H\|_1 \le C(H + e(QC)), \tag{3.10}$$
||t/ C B-U'H||i,oo
and from (3.7), we have (3.12)
\\UCB-PHUCB\\I,OO
Using (3.3) with V = PHUCB \\PHUCB
(3.11)
- UH and invoking (3.7) we obtain
- UH\\i < C\\UCB
- UH\\U + Ce(QC)
< C\\PHUGB - UH\\\A + Ce(QC) + CH2. Using the interpolation inequality we have \\PHUCB
- UH\\lA < \\PHUCB
- VH\\I\\PHUCB
- UH\\lt00.
There exist d = 1/(4C 2 ) and Hx = ( l / ^ C 2 ) ) 1 ^ - ^ ) e(QC) < d and H < Hi, we have e(QC) 1 / 2 + H1-*'" < 1/(2C).
such
that if
Therefore, using $(3.4)_1$, we have
$$\begin{aligned} \|P_H U_{CB} - U_H\|_1 &\le C\bigl(e(QC)^{1/2} + H^{1-d/p}\bigr)\|P_H U_{CB} - U_H\|_1 + C e(QC) + C H^2 \\ &\le \tfrac{1}{2}\|P_H U_{CB} - U_H\|_1 + C e(QC) + C H^2, \end{aligned}$$
which gives
$$\|P_H U_{CB} - U_H\|_1 \le C\bigl(e(QC) + H^2\bigr).$$
This inequality together with (3.12) yields (3.10). Putting $V = G^H$ in (3.3) and repeating the above procedure, we obtain that there exist $C_2$ and $H_2$ such that if $e(QC) < C_2$ and $H < H_2$, we come to (3.11). Finally, letting $C_0 = \min(C_1, C_2)$ and $H_0 = \min(H_1, H_2)$, we finish the proof. $\square$
4 Estimate of e(QC)
It remains to estimate $e(QC)$. For the local QC of [11], we obviously have $e(QC) = 0$. However, it is difficult to give a general estimate of $e(QC)$ for the nonlocal QC. We will give an estimate of $e(QC)$ for the nonlocal QC that employs the summation rule of Knap and Ortiz [9].
Define a discrete inner product as
$$(\phi, \psi)_L := \sum_{x_i \in L \cap D} \phi(x_i)\psi(x_i).$$
For each node $x$ of $\mathcal{T}_H$, define a cluster $B_r(x) := \{x_i \in L \mid |x_i - x| \le r\}$. For any domain $D_1$, $\chi_{D_1}$ denotes its characteristic function. We denote all the nodes by $\{x_i\}_{i=1}^M$ and the corresponding basis functions by $\{\phi_i\}_{i=1}^M$. The weight associated with the node $x_i$ is denoted by $n_i$, and we let $n = (n_1, \cdots, n_M)^T$. The cluster summation rule can be formulated as
$$Bn = g, \tag{4.1}$$
where $B$ is an $M \times M$ matrix with $B_{ij} = (\phi_i, \chi_{B_r(x_j)})_L$ and the $M \times 1$ vector $g$ is defined as $g_i = (\phi_i, 1)_L$.
To get the weights we have to solve a system of $M \times M$ linear algebraic equations, which is very expensive, in particular for big $N$. Therefore, mass lumping is commonly employed in practice, which amounts to assembling all entries on each row of $B$ into the diagonal entry; namely, we need to solve the following simple linear equations:
$$\bar B n = g, \tag{4.2}$$
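with $\bar B_{ii} = \sum_{j=1}^M B_{ij}$ and $\bar B_{ij} = 0$ for $i \ne j$, that is, $n_i = g_i / \bigl(\sum_{j=1}^M B_{ij}\bigr)$. With the above consideration, the energy $I_H$ is defined as
$$I_H(\nabla V) = \sum_{i=1}^M n_i W_i(\nabla V), \tag{4.3}$$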
where Wi(VV) = ^
J^
IBrix^DKjlWcB^Vi)
is the energy associated with the i—th node, where 3A/3/(27T) is a scaling factor. Here Mi is the set of elements sharing the common node ajj. For any element K e T g , assembling the energy contribution of each vertices in K, we rewrite (4.3) into IH(VV)
Yl
= ^ n
^nKABrixJnKlWcBiWi),
KeTH i=i
where {«K,i}f=i denotes three weights associated with three vertices of the element K. If we define *K = ^Y,nK,i\Br{xi)r\K\l\K\,
(4.4)
i=l
then the energy can be rewritten as
$$I_H(\nabla V) = \sum_{K \in \mathcal{T}_H} n_K W_{CB}(\nabla V).$$
This is similar to the original QC formulation. In what follows, we estimate $e(QC)$ for the case when all elements $K$ are equal and the lattice summation rule in [9] is employed (see Fig. 4.1). We define $L_0$ to be the number of atoms over each edge and $r_0$ the cluster radius.
Lemma 4.1. If all elements $K \in \mathcal{T}_H$ are equal and the first order lattice summation rule of Knap and Ortiz [9] is employed, then
Proof. Let $\phi$ be the linear basis function associated with the center of the hexagon. A direct calculation gives
$$\sum_{x \in L \cap M} \phi(x) = 1 + 6\sum_{i=1}^{L_0} (i-1)(L_0 + 1 - i)/L_0 = L_0^2.$$
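The lattice sum above, and the cluster size $1 + 3r_0(r_0+1)$ used later in the proof, can be checked by brute force. The sketch below is only an illustration under stated assumptions: the lattice is taken to be triangular with axial coordinates $(a, b)$, the hexagonal distance is $\max(|a|, |b|, |a+b|)$, the basis function is assumed to be $1 - \mathrm{dist}/L_0$ on the hexagonal patch, and the cluster is assumed to consist of complete hexagonal rings. All function names are hypothetical.

```python
from fractions import Fraction


def hex_distance(a, b):
    """Ring index of the triangular-lattice site (a, b) around the origin (axial coordinates)."""
    return max(abs(a), abs(b), abs(a + b))


def hat_sum_closed_form(L0):
    """1 + 6 * sum_{i=1}^{L0} (i - 1)(L0 + 1 - i) / L0, as in the displayed formula."""
    return 1 + 6 * sum(Fraction((i - 1) * (L0 + 1 - i), L0) for i in range(1, L0 + 1))


def hat_sum_brute_force(L0):
    """Sum of the assumed hat function 1 - dist/L0 over all lattice sites of the hexagon."""
    total = Fraction(0)
    for a in range(-L0, L0 + 1):
        for b in range(-L0, L0 + 1):
            dist = hex_distance(a, b)
            if dist <= L0:
                total += 1 - Fraction(dist, L0)
    return total


def cluster_size(r0):
    """Number of lattice sites within hexagonal distance r0 of a node."""
    return sum(
        1
        for a in range(-r0, r0 + 1)
        for b in range(-r0, r0 + 1)
        if hex_distance(a, b) <= r0
    )


def vertex_weight(L0, r0):
    """Weight at a vertex, L0^2 / (1 + 3 r0 (r0 + 1))."""
    return Fraction(L0 ** 2, 1 + 3 * r0 * (r0 + 1))
```

With $L_0 = 1$ and $r_0 = 0$ the weight reduces to 1, which corresponds to the full atomistic model.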
Figure 4.1 A special cluster-based summation rule in 2-D. Here $L_0 = 4$ and $r_0 = 1$.
PXBr(xi)(x)- The contribution of the local sums
We calculate ^xecriM at each vertices is 6
r
and the contribution of the sum at the center is r
Therefore, the overall contribution of the cluster summation is $I_1 + I_2 = 1 + 3r_0(r_0 + 1)$. Thus the weight at the vertices is $n = L_0^2/(1 + 3r_0(r_0 + 1))$. Using (4.4), we get the equivalent weight of each element (4.5). $\square$
Remark 4.2. For the full atomistic model, the weight at the vertices reduces to 1. Indeed, $L_0 = 1$ and $r_0 = 0$ in this case.
A straightforward calculation gives that e(QC)
Zrl l + 3r 0 (r 0 + l)
< _
_ 4 _ _ 4e 3r 0 ~ 3r'
This proves (1.9).
Proof of Theorem 1.1. By the estimates (1.8) and (1.9) for $e(QC)$, if $\epsilon/r$ is sufficiently small, then $e(QC)$ can be made smaller than any given threshold; this verifies (3.6). Therefore, the estimate (1.6) follows from Lemma 3.3. Let $y$ be the local minimizer of the full atomistic model obtained in Theorem 2.1. Using (2.5) and (1.6), we obtain
$$\|y - y_{QC}\|_d \le \|y - y_{CB}\|_d + \|y_{CB} - y_{QC}\|_d \le C\epsilon + C\|U_{CB} - U_{QC}\|_{1,\infty} \le C(\epsilon + H + e(QC)),$$
which gives (1.7). $\square$
References
[1] R. A. Adams and J. J. F. Fournier, Sobolev Spaces, Academic Press, second edition, 2003.
[2] J. M. Ball and R. D. James, Proposed experimental tests of a theory of fine microstructure and the two-well problem, Phil. Trans. Roy. Soc. Lond. A, 338 (1992), 389-450.
[3] X. Blanc, C. Le Bris and F. Legoll, Analysis of a prototypical multiscale method coupling atomistic and continuum mechanics, to appear in Math. Modelling and Numer. Anal.
[4] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Oxford University Press, 1954.
[5] W. E and P. B. Ming, Analysis of the multiscale methods, J. Comp. Math., 19 (2004), 209-220.
[6] W. E and P. B. Ming, Cauchy-Born rule and the stability of crystals: static problem, 2005, preprint.
[7] W. E, P. B. Ming and P. W. Zhang, Analysis of the heterogeneous multiscale method for elliptic homogenization problems, J. Amer. Math. Soc., 18 (2005), 121-156.
[8] J. L. Ericksen, The Cauchy and Born hypotheses for crystals, In: Phase Transformations and Material Instabilities in Solids (M. Gurtin ed.), Academic Press, 1984, 61-77.
[9] J. Knap and M. Ortiz, An analysis of the quasicontinuum method, J. Mech. Phys. Solids, 49 (2001), 1899-1923.
[10] P. Lin, Theoretical and numerical analysis for the quasi-continuum approximation of a material particle model, Math. Comp., 72 (2003), 657-675.
[11] R. Miller and E. B. Tadmor, The quasicontinuum method: overview, applications and current directions, J. Computer-aided Material Design, 9 (2002), 203-239.
[12] R. Rannacher and R. Scott, Some optimal error estimates for piecewise linear finite element approximation, Math. Comp., 38 (1982), 437-445.
[13] A. Schatz, An observation concerning Ritz-Galerkin methods with indefinite bilinear forms, Math. Comp., 28 (1974), 959-962.
[14] V. B. Shenoy, R. Miller, E. B. Tadmor, R. Phillips and M. Ortiz, An adaptive finite element approach to atomic scale mechanics-the quasicontinuum method, J. Mech. Phys. Solids, 47 (1999), 611-642.
[15] E. B. Tadmor, M. Ortiz and R. Phillips, Quasicontinuum analysis of defects in solids, Phil. Mag., A73 (1996), 1529-1563.
[16] L. Truskinovsky, Fracture as a phase transition, In: Contemporary Research in the Mechanics and Mathematics of Materials (R. C. Batra and M. F. Beatty ed.), CIMNE, Barcelona, 1996, 322-332.
The Artificial Boundary Method
Numerical Solutions of Partial Differential Equations on Unbounded Domains*
Houde Han
Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China.
1 Introduction
The aim of this paper is to introduce the artificial boundary method, which over the last twenty-five years has been established as a powerful and effective technique for obtaining numerical solutions of partial differential equations on unbounded domains. Many problems arising in science and engineering lead to boundary value problems of partial differential equations on unbounded domains, such as the stress analysis of a dam with infinite foundation (see Fig. 1.1), fluid flow around an obstacle (see Fig. 1.2) and fluid flow in an infinite channel (see Fig. 1.3). The main new difficulty in finding numerical solutions of these problems is the unboundedness of the physical domain. Therefore the finite element method and the finite difference method cannot be used for these problems in a straightforward manner. In the early engineering literature, the approach is to introduce an artificial boundary that reduces these problems to a bounded computational domain, to set up a suitable boundary condition (such as a Neumann or Dirichlet boundary condition for the dependent variables) at the artificial boundary, and then to solve the reduced problem on the bounded computational domain. In general, the above boundary conditions are only very rough approximations of the exact boundary condition at the artificial boundary. When high accuracy is required, the bounded computational domain must be quite large and the cost of the computation increases. In practical computation, in order to limit the computational cost, the artificial boundary should be chosen not far from the region we are interested in, so that the bounded computational domain is as small as possible. Therefore how to design suitable boundary conditions with high accuracy on a given artificial boundary for problems on unbounded domains, or how to solve partial differential equations on unbounded domains numerically, has attracted the attention of many engineers and mathematicians.
*This work was supported by National Natural Science Foundation of China under Grant No. 10471073.
Over the last twenty-odd years, a large number of mathematicians and engineers have been involved in this subject and have studied various problems from science and engineering by different approaches. Engquist and Majda (1977) [12] constructed the absorbing boundary conditions for the numerical simulation of waves. Feng (1980) [13] and Han and Ying (1980) [54] studied the exterior problem of the Laplace equation in $\mathbb{R}^2$ and introduced a circular artificial boundary, on which the Steklov-Poincare mapping is given as an exact boundary condition by different approaches; the exterior problem is then reduced to a problem on the bounded computational domain. Goldstein (1982) [23] studied Helmholtz-type equations in a wave guide. Feng & Yu (1982,1983) [14,15,17,70,71] found the exact boundary conditions for the exterior problems of the Laplace equation and the Helmholtz equation [16] on a circular artificial boundary. Han & Wu (1985,1992) [50-52] presented the exact boundary conditions at an artificial boundary for the Laplace equation, the linear elasticity equations and the Stokes equations in the two dimensional case; moreover, high order global and local artificial boundary conditions were given. Yu (1985) [72] also gave the exact boundary condition and artificial boundary conditions for the exterior problem of the Laplace equation in the two dimensional case. Hagstrom & Keller (1986) [27] obtained the exact boundary condition on an artificial boundary for partial differential equations in a cylindrical domain; shortly after (1987) [28], they used this approach to solve a nonlinear problem.
Halpern & Schatzman (1989) [29] developed a family of artificial boundary conditions for the unsteady Oseen equations in the velocity-pressure formulation with small viscosity. Nataf (1989) [65] designed an open boundary condition for the steady Oseen equations in the stream-function vorticity formulation. Keller & Givoli (1989) [19,61] studied the exterior problems of the Laplace equation and the Helmholtz equation, and obtained the artificial boundary conditions for a given artificial boundary. In recent years, more and more mathematicians and engineers have joined this research area, and the artificial boundary method has been applied successfully in many fields of science and engineering [1-5,9,11,18,20-22,24-26,31,32,37,40,41,45,46,48,49,53,57,62-64,66-69,73-77,79]. Two typical problems given in the next section are used in this paper to explain the basic idea and approaches of the artificial boundary method.
Figure 1.1
Figure 1.2 (figure labels: Incoming Flow, Exterior Domain, Airfoil Surface)
Figure 1.3
2 Two typical problems
2.1 Problem (P)
Suppose that $\Omega \subset \mathbb{R}^2$ is an exterior domain with boundary $\Gamma_i$, $f(x)$ and $g(x)$ are given functions on $\Omega$ and $\Gamma_i$, respectively, and the support of $f(x)$ is compact; namely, there is a positive constant $R_0 > 0$ such that $\mathrm{supp}\{f(x)\} \subset B_{R_0} = \{x \mid |x| \le R_0\}$. We now consider the following problem (P):
$$-\Delta u = f(x) \quad\text{in } \Omega, \tag{2.1}$$
$$u|_{\Gamma_i} = g(x), \tag{2.2}$$
$$u \text{ is bounded when } |x| \to +\infty. \tag{2.3}$$
The partial differential equation (2.1) is called the Poisson equation. Problem (P) is the mathematical model for many important physical phenomena, such as irrotational perfect fluid flow, the gravitational field and electromagnetism.
2.2 Problem (N-S)
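We consider the numerical simulation of a steady incompressible viscous flow around a body (domain $\Omega_0$) in a no-slip channel defined by $\mathbb{R} \times [0, L]$. Then the velocity $(u, v)$ and pressure $p$ satisfy the following problem (N-S) in the domain $\Omega = \mathbb{R} \times [0, L] \setminus \bar\Omega_0$:
$$u\frac{\partial u}{\partial x_1} + v\frac{\partial u}{\partial x_2} + \frac{\partial p}{\partial x_1} = \nu\Delta u \quad\text{in } \Omega, \tag{2.4}$$
$$u\frac{\partial v}{\partial x_1} + v\frac{\partial v}{\partial x_2} + \frac{\partial p}{\partial x_2} = \nu\Delta v \quad\text{in } \Omega, \tag{2.5}$$
$$\frac{\partial u}{\partial x_1} + \frac{\partial v}{\partial x_2} = 0 \quad\text{in } \Omega, \tag{2.6}$$
$$u|_{x_2=0,L} = v|_{x_2=0,L} = 0, \quad -\infty < x_1 < +\infty, \tag{2.7}$$
$$u|_{\partial\Omega_0} = v|_{\partial\Omega_0} = 0, \tag{2.8}$$
$$u \to u_\infty(x_2) = \alpha x_2(L - x_2) \quad\text{as } x_1 \to \pm\infty, \quad 0 < x_2 < L, \tag{2.9}$$
$$v \to v_\infty = 0 \quad\text{as } x_1 \to \pm\infty, \quad 0 < x_2 < L, \tag{2.10}$$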
where $\nu$ is the kinematic viscosity, $\alpha > 0$ is a given constant, and $(u_\infty(x_2), v_\infty)$ is the velocity at infinity of the channel.
The physical domains in problem (P) and problem (N-S) are unbounded. How can problem (P) and problem (N-S) be solved numerically by the artificial boundary method? We will answer this question in this paper. The main steps of the artificial boundary method are:
(1) Introduce an artificial boundary dividing the physical domain $\Omega$ into two parts: the bounded computational domain $\Omega_i$ and an unbounded part $\Omega_e = \Omega \setminus \bar\Omega_i$.
(2) Find the exact boundary condition or design a suitable artificial boundary condition on the given artificial boundary. Then the original problem is reduced to a problem on the bounded computational domain.
(3) Solve the problem on the bounded computational domain numerically.
Step (2) is the core of the artificial boundary method.
3 The global (nonlocal) artificial boundary conditions
3.1 The Steklov-Poincare mapping for exterior problem of Laplace equation
We now return to the typical problem (P). Introduce an artificial boundary
$$\Gamma_R = \{x \mid |x| = R,\ R > R_0\} \quad\text{and}\quad \Gamma_R \subset \Omega.$$
$\Gamma_R$ divides the physical domain $\Omega$ into two parts: the bounded computational domain $\Omega_i$ with boundary $\Gamma_i \cup \Gamma_R$ and the unbounded part $\Omega_e = \Omega \setminus \bar\Omega_i$. Consider the restriction of $u$, the solution of problem (P), to the domain $\Omega_e$. $u$ satisfies:
$$\Delta u = 0 \quad\text{in } \Omega_e, \tag{3.1}$$
$$u \text{ is bounded when } |x| \to \infty, \tag{3.2}$$
but on the boundary $\Gamma_R$, $u|_{\Gamma_R} = u(R,\theta)$ is unknown. Problem (3.1)-(3.2) is an incomplete problem. If the function $u(R,\theta)$ is given, then problem (3.1)-(3.2) has a unique solution. For the given $u(R,\theta)$, the solution of problem (3.1)-(3.2) can be written as:
$$u(r,\theta) = \frac{a_0}{2} + \sum_{m=1}^{\infty} \Bigl(\frac{R}{r}\Bigr)^m (a_m \cos m\theta + b_m \sin m\theta), \tag{3.3}$$
$$\frac{\partial u(R,\theta)}{\partial r} = \sum_{m=1}^{\infty} \Bigl(-\frac{m}{R}\Bigr)(a_m \cos m\theta + b_m \sin m\theta). \tag{3.4}$$
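By equation (3.3), we obtain the Fourier coefficients $\{a_m, m = 0, 1, \cdots\}$ and $\{b_m, m = 0, 1, \cdots\}$:
$$a_m = \frac{1}{\pi}\int_0^{2\pi} u(R,\phi)\cos m\phi\, d\phi, \qquad b_m = \frac{1}{\pi}\int_0^{2\pi} u(R,\phi)\sin m\phi\, d\phi, \tag{3.5}$$
or, for $m \ge 1$,
$$a_m = -\frac{1}{\pi m}\int_0^{2\pi} \frac{\partial u(R,\phi)}{\partial\phi}\sin m\phi\, d\phi, \qquad b_m = \frac{1}{\pi m}\int_0^{2\pi} \frac{\partial u(R,\phi)}{\partial\phi}\cos m\phi\, d\phi, \tag{3.6}$$
or
$$a_m = -\frac{1}{\pi m^2}\int_0^{2\pi} \frac{\partial^2 u(R,\phi)}{\partial\phi^2}\cos m\phi\, d\phi, \qquad b_m = -\frac{1}{\pi m^2}\int_0^{2\pi} \frac{\partial^2 u(R,\phi)}{\partial\phi^2}\sin m\phi\, d\phi. \tag{3.7}$$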
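Substituting (3.5) (or (3.6), or (3.7)) into equation (3.4), we obtain
$$\frac{\partial u(R,\theta)}{\partial r} = -\frac{1}{\pi R}\sum_{m=1}^{\infty} m \int_0^{2\pi} u(R,\phi)\cos m(\theta - \phi)\, d\phi \equiv S_1(u|_{\Gamma_R}), \tag{3.8}$$
$$\frac{\partial u(R,\theta)}{\partial r} = -\frac{1}{\pi R}\sum_{m=1}^{\infty} \int_0^{2\pi} \frac{\partial u(R,\phi)}{\partial\phi}\sin m(\theta - \phi)\, d\phi \equiv S_2(u|_{\Gamma_R}), \tag{3.9}$$
$$\frac{\partial u(R,\theta)}{\partial r} = \frac{1}{\pi R}\sum_{m=1}^{\infty} \frac{1}{m}\int_0^{2\pi} \frac{\partial^2 u(R,\phi)}{\partial\phi^2}\cos m(\theta - \phi)\, d\phi \equiv S_3(u|_{\Gamma_R}). \tag{3.10}$$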
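Equalities (3.8), (3.9) and (3.10) are equivalent, and are the exact boundary conditions satisfied by the solution $u$ of the original problem (P). By the following equality ([24], p. 44)
$$2\sum_{m=1}^{\infty} \frac{\cos m\theta}{m} = -\ln\bigl(2(1 - \cos\theta)\bigr) \equiv f_0(\theta), \tag{3.11}$$
we have
$$\frac{d f_0(\theta)}{d\theta} = -\cot\frac{\theta}{2} \equiv f_1(\theta), \tag{3.12}$$
$$\frac{d^2 f_0(\theta)}{d\theta^2} = \frac{1}{2\sin^2\frac{\theta}{2}} \equiv f_2(\theta). \tag{3.13}$$
Therefore it follows from equality (3.11) that the exact boundary condition (3.10) can be rewritten as:
$$\frac{\partial u(R,\theta)}{\partial r} = \frac{1}{2\pi R}\int_0^{2\pi} f_0(\theta - \phi)\frac{\partial^2 u(R,\phi)}{\partial\phi^2}\, d\phi \equiv S_4(u|_{\Gamma_R}). \tag{3.14}$$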
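Furthermore, by integrating by parts, we arrive at:
$$\frac{\partial u(R,\theta)}{\partial r} = \frac{1}{2\pi R}\int_0^{2\pi} f_1(\theta - \phi)\frac{\partial u(R,\phi)}{\partial\phi}\, d\phi \equiv S_5(u|_{\Gamma_R}), \tag{3.15}$$
$$\frac{\partial u(R,\theta)}{\partial r} = \frac{1}{2\pi R}\int_0^{2\pi} f_2(\theta - \phi)\, u(R,\phi)\, d\phi \equiv S_6(u|_{\Gamma_R}). \tag{3.16}$$
In fact, equations (3.8)-(3.10) and (3.14)-(3.16) are six equivalent representations of the Steklov-Poincare mapping: namely, for any given $u(R,\theta) \in H^{1/2}(\Gamma_R)$, solving the Laplace equation on $\Omega_e$ with condition (3.2), we obtain
$$\frac{\partial u}{\partial n}\Big|_{\Gamma_R} = \frac{\partial u(R,\theta)}{\partial r} = S_j(u|_{\Gamma_R}) \in H^{-1/2}(\Gamma_R) \quad\text{for } j = 1, 2, \cdots, 6. \tag{3.17}$$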
The exact boundary condition (3.16) was obtained by Feng (1980) [13]. The exact boundary conditions (3.14) and (3.15) were given by Han and Ying [54] in 1980, using Hilbert transformations. Formulations (3.8)-(3.10) were discovered by Han and Wu (1985) [50]. The above discussion of the Steklov-Poincare mapping is mainly taken from the work of Han and Wu (1985) [50]. At the same time, formulation (3.8) was proposed by Yu (1985) [72]. In 1989, formulation (3.8) appeared in the paper [19] by Givoli and Keller, in which it is called the DtN mapping. The Steklov-Poincare mappings (3.8)-(3.10) and (3.14)-(3.16) are exact boundary conditions on the given artificial boundary $\Gamma_R$, and they are global.
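The series form of the Steklov-Poincare mapping can be checked numerically on a known exterior harmonic function. The following is a minimal sketch rather than a definitive implementation: it assumes the boundary data are sampled on a uniform grid, computes the Fourier coefficients (3.5) with the periodic trapezoidal rule, and truncates the series (3.4) after `N` terms. The function name `dtn_flux`, the quadrature size and the test function are illustrative choices, not part of the paper.

```python
import math


def dtn_flux(u_boundary, R, N, n_quad=256):
    """Return theta -> du/dr on |x| = R from boundary values, using (3.4)-(3.5) truncated at N."""
    h = 2.0 * math.pi / n_quad
    phis = [j * h for j in range(n_quad)]
    vals = [u_boundary(p) for p in phis]
    # Fourier coefficients a_m, b_m of u(R, .) by the periodic trapezoidal rule.
    a = [h / math.pi * sum(v * math.cos(m * p) for v, p in zip(vals, phis))
         for m in range(N + 1)]
    b = [h / math.pi * sum(v * math.sin(m * p) for v, p in zip(vals, phis))
         for m in range(N + 1)]

    def flux(theta):
        return sum(-(m / R) * (a[m] * math.cos(m * theta) + b[m] * math.sin(m * theta))
                   for m in range(1, N + 1))

    return flux


R = 2.0


def u_on_boundary(theta):
    """Trace on |x| = R of u = 3 + (R/r)^2 cos(2 theta) - 0.5 (R/r)^5 sin(5 theta)."""
    return 3.0 + math.cos(2 * theta) - 0.5 * math.sin(5 * theta)


def exact_flux(theta):
    """Radial derivative of the same function at r = R."""
    return -(2.0 / R) * math.cos(2 * theta) + 0.5 * (5.0 / R) * math.sin(5 * theta)


flux_full = dtn_flux(u_on_boundary, R, N=8)       # all active modes retained
flux_truncated = dtn_flux(u_on_boundary, R, N=3)  # mode m = 5 is dropped
```

Truncating at `N = 3` keeps only the $m = 2$ mode of this example, which illustrates how taking the first few terms of the series yields an approximate, still global, boundary condition.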
3.2 The reduced problem on the bounded computational domain $\Omega_i$
Any one of the six equivalent Steklov-Poincare mappings can be used to reduce the original problem (P) to a problem on the bounded computational domain $\Omega_i$:
$$(P_j)\quad \begin{cases} -\Delta u = f(x) \quad\text{in } \Omega_i, & (3.18)\\ u|_{\Gamma_i} = g(x), & (3.19)\\ \dfrac{\partial u}{\partial n}\Big|_{\Gamma_R} = S_j(u|_{\Gamma_R}) & (3.20) \end{cases}$$
for $j = 1, 2, \cdots, 6$. For $j = 1, 2, \cdots, 6$, problem $(P_j)$ is defined on the bounded computational domain $\Omega_i$ with the exact global boundary condition (3.20). It is straightforward to check that problem $(P_j)$ is equivalent to the original problem (P) in the following sense. If $u$ is the solution of the original problem (P), then the restriction of $u$ to $\Omega_i$ is the solution of problem $(P_j)$; conversely, if $u$ is the solution of problem $(P_j)$, then $u$ is the restriction of the solution of problem (P). Therefore we only need to solve problem $(P_j)$ on the bounded computational domain $\Omega_i$ in order to obtain the solution of the original problem (P). But we need to pay the added expense of computing the singular integrals or infinite series that come from the exact global boundary conditions (3.20). In practical applications, we take the first few terms of the series in the exact boundary conditions (3.8)-(3.10), and thus obtain a sequence of approximate artificial boundary conditions on $\Gamma_R$, which are also global.
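$$\frac{\partial u}{\partial n}\Big|_{\Gamma_R} = -\frac{1}{\pi R}\sum_{m=1}^{N} m\int_0^{2\pi} u(R,\phi)\cos m(\theta - \phi)\, d\phi \equiv S_1^N(u|_{\Gamma_R}), \tag{3.21}$$
$$\frac{\partial u}{\partial n}\Big|_{\Gamma_R} = -\frac{1}{\pi R}\sum_{m=1}^{N} \int_0^{2\pi} \frac{\partial u(R,\phi)}{\partial\phi}\sin m(\theta - \phi)\, d\phi \equiv S_2^N(u|_{\Gamma_R}), \tag{3.22}$$
$$\frac{\partial u}{\partial n}\Big|_{\Gamma_R} = \frac{1}{\pi R}\sum_{m=1}^{N} \frac{1}{m}\int_0^{2\pi} \frac{\partial^2 u(R,\phi)}{\partial\phi^2}\cos m(\theta - \phi)\, d\phi \equiv S_3^N(u|_{\Gamma_R}), \tag{3.23}$$
for $N = 0, 1, 2, \cdots$. By the approximate artificial boundary conditions (3.21)-(3.23), the original problem (P) can be reduced to the approximate problem $(P_j^N)$ on the bounded computational domain $\Omega_i$:
$$(P_j^N)\quad \begin{cases} -\Delta u^N = f(x) \quad\text{in } \Omega_i, & (3.24)\\ u^N|_{\Gamma_i} = g(x), & (3.25)\\ \dfrac{\partial u^N}{\partial n}\Big|_{\Gamma_R} = S_j^N(u^N|_{\Gamma_R}) & (3.26) \end{cases}$$
for $j = 1, 2, 3$. Problem (3.24)-(3.26) is a well-posed problem for $N = 0, 1, 2, \cdots$ and $j = 1, 2, 3$ (see Han and Wu [50]). In fact, problems $(P_j^N)$ are equivalent for $j = 1, 2, 3$. Therefore, in the following the index $j$ is omitted and problem $(P_j^N)$ is called problem $(P^N)$. We hope that the solution $u^N$ of problem $(P^N)$ is a "good" approximation of $u$, the solution of problem (P), on the bounded computational domain. In the following subsection we discuss the numerical solution of problem $(P^N)$.
3.3 Finite element approximation of problem $(P^N)$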
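In this subsection, let $g(x) = 0$ on $\Gamma_i$. Introduce the subspace of the Sobolev space $H^1(\Omega_i)$:
$$V_0 = \{v \mid v \in H^1(\Omega_i),\ v|_{\Gamma_i} = 0\}. \tag{3.27}$$
Then the equivalent variational problem of problem $(P^N)$ is: Find $u^N \in V_0$ such that
$$a(u^N, v) + b_N(u^N, v) = f(v), \quad \forall v \in V_0, \tag{3.28}$$
where
$$a(u, v) = \int_{\Omega_i} \nabla u \cdot \nabla v\, dx, \qquad f(v) = \int_{\Omega_i} f v\, dx,$$
$$b_N(u, v) = \sum_{m=1}^{N} \frac{1}{\pi m}\int_0^{2\pi}\!\!\int_0^{2\pi} \cos[m(\theta - \phi)]\,\frac{\partial u(R,\phi)}{\partial\phi}\,\frac{\partial v(R,\theta)}{\partial\theta}\, d\theta\, d\phi.$$
There exist two constants $\alpha > 0$ and $M > 0$, independent of $N$, such that [50]
$$|a(u, v)| \le M\|u\|_{1,\Omega_i}\|v\|_{1,\Omega_i}, \quad \forall u, v \in V_0,$$
$$a(v, v) \ge \alpha\|v\|_{1,\Omega_i}^2, \quad \forall v \in V_0,$$
$$|b_N(u, v)| \le M\|u\|_{1,\Omega_i}\|v\|_{1,\Omega_i}, \quad \forall u, v \in V_0,$$
$$b_N(v, v) \ge 0, \quad \forall v \in V_0.$$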
From the Lax-Milgram theorem we obtain the following result [50]:
Theorem 3.1 For given $f \in H^{-1}(\Omega_i)$ and any integer $N \ge 0$, the reduced problem (3.28) has a unique solution $u^N \in V_0$.
We now consider the finite element approximation of the reduced problem (3.28) on the bounded computational domain $\Omega_i$. Let $V_h$ be a finite element subspace of $V_0$ with $h > 0$. For example, assume that $\Gamma_i$ is a polygonal line. Then $V_h$ can be given by linear triangular elements. The numerical approximate problem is:
$$(P_h^N)\quad \text{Find } u_h^N \in V_h \text{ such that } a(u_h^N, v_h) + b_N(u_h^N, v_h) = f(v_h), \quad \forall v_h \in V_h. \tag{3.29}$$
After solving problem (3.29), we obtain the numerical approximation $u_h^N$. Furthermore, we obtain the following error estimates for the finite element approximation $u_h^N$ [35,50].
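The boundary bilinear form $b_N$ in (3.28) can be evaluated cheaply because its kernel is separable. The sketch below is hedged: it assumes the form $b_N(u,v) = \sum_{m=1}^{N} \frac{1}{\pi m}\int\!\!\int \cos[m(\theta-\phi)]\,\partial_\phi u\,\partial_\theta v\, d\theta\, d\phi$ obtained from (3.23) by integration by parts, takes the tangential derivatives of $u$ and $v$ on $\Gamma_R$ as plain Python callables, and uses the periodic trapezoidal rule. The names `b_N`, `du` and `dv` are hypothetical.

```python
import math


def b_N(du, dv, N, n_quad=256):
    """Evaluate the truncated boundary form from tangential derivatives du, dv on Gamma_R."""
    h = 2.0 * math.pi / n_quad
    grid = [j * h for j in range(n_quad)]
    total = 0.0
    for m in range(1, N + 1):
        # cos(m(theta - phi)) = cos(m theta) cos(m phi) + sin(m theta) sin(m phi),
        # so the double integral splits into products of single integrals.
        cu = h * sum(du(p) * math.cos(m * p) for p in grid)
        su = h * sum(du(p) * math.sin(m * p) for p in grid)
        cv = h * sum(dv(t) * math.cos(m * t) for t in grid)
        sv = h * sum(dv(t) * math.sin(m * t) for t in grid)
        total += (cu * cv + su * sv) / (math.pi * m)
    return total


def d_cos3(theta):
    """Tangential derivative of the boundary trace cos(3 theta)."""
    return -3.0 * math.sin(3.0 * theta)


def d_mixed(theta):
    """Tangential derivative of the boundary trace sin(theta) + 0.2 cos(4 theta)."""
    return math.cos(theta) - 0.8 * math.sin(4.0 * theta)
```

For the trace $\cos 3\theta$ the form returns $3\pi$ once $N \ge 3$ and $0$ for $N < 3$, consistent with the non-negativity $b_N(v,v) \ge 0$ used in the Lax-Milgram argument.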
Theorem 3.2 Suppose that u is the solution of problem (P) and u^ is the solution of problem (P^). Then the following error estimates hold: JV, -~ 1 l«-«fc|i,n, 1,
(3-30)
where C\ > 0 is a constant independent of h, N, and \u - t # | 1 A < C2{h\u\2,ni
+ ^lI(^)^+>|
3 A r o
},
(3.31)
where $C_2 > 0$ is a constant independent of $h$, $N$ and $R$. The error estimate (3.31) shows how the error depends on the mesh size $h$, the accuracy $N$ of the artificial boundary condition and the location $R$ of the given artificial boundary. Following the direction given by Feng [13], Han and Wu [50] and Yu [72], global artificial boundary conditions have been widely applied to various problems on unbounded domains, for example, the exterior problems of the Poisson equation in three dimensions [38,78], the exterior problem of the elasticity equations [36,50,52,78], the incompressible material problem on unbounded domains [8,33,78], and the exterior problem of the Stokes equations in three dimensions [80]. Recently, for time dependent problems (heat equation, wave equation and Schrodinger equation), the exact (global) boundary conditions and a series of global artificial boundary conditions have been obtained [43,44,58,59,69].
4 Local boundary conditions
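We now return to the typical problem (P) and consider the restriction of $u$, the solution of problem (P), to the domain $\Omega_e$:
$$\Delta u = 0 \ \text{in } \Omega_e, \qquad u \text{ is bounded when } |x| \to \infty. \tag{4.1}$$
On $\Omega_e$, $u$ can be written as
$$u(r,\theta) = \frac{a_0}{2} + \sum_{m=1}^{\infty}\Bigl(\frac{R}{r}\Bigr)^m (a_m\cos(m\theta) + b_m\sin(m\theta)), \tag{4.2}$$
where the constants $\{a_m, m = 0, 1, \cdots\}$ and $\{b_m, m = 0, 1, \cdots\}$ are given by (3.5). On the artificial boundary $\Gamma_R$, we have
$$u(R,\theta) = \frac{a_0}{2} + \sum_{m=1}^{\infty} (a_m\cos(m\theta) + b_m\sin(m\theta)), \tag{4.3}$$
$$\frac{\partial u}{\partial n}\Big|_{\Gamma_R} = \frac{\partial u(R,\theta)}{\partial r} = \sum_{m=1}^{\infty}\Bigl(-\frac{m}{R}\Bigr)(a_m\cos(m\theta) + b_m\sin(m\theta)), \tag{4.4}$$
$$\frac{\partial^2 u(R,\theta)}{\partial\theta^2} = \sum_{m=1}^{\infty}(-m^2)(a_m\cos(m\theta) + b_m\sin(m\theta)), \tag{4.5}$$
$$\frac{\partial^{2k} u(R,\theta)}{\partial\theta^{2k}} = \sum_{m=1}^{\infty}(-1)^k m^{2k}(a_m\cos(m\theta) + b_m\sin(m\theta)). \tag{4.6}$$
Consider the following summation
$$\frac{\partial u(R,\theta)}{\partial r} + \frac{1}{R}\sum_{k=1}^{N}(-1)^k\alpha_k^N\frac{\partial^{2k}u(R,\theta)}{\partial\theta^{2k}} = \frac{1}{R}\sum_{m=1}^{\infty}\Bigl\{\Bigl(-m + \sum_{k=1}^{N} m^{2k}\alpha_k^N\Bigr)(a_m\cos(m\theta) + b_m\sin(m\theta))\Bigr\}, \tag{4.7}$$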
where A^ is a given positive integer, and {a^, k = 1,2, • • • , N} are constants to be determined. Take {&%!} such that the first N terms in the right side of (4.7) equal to zero. Then {a^} satisfies
f>2*af
™ m = l,2,...,iV.
fc=i
Namely, / l 2 l4 22 2 4
1™\ 2 2JV
A\
/ < \ a$
1_
(4.8)
~R V^2
AT4 ••• N2N)
W
\<>NN)
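This linear system has a unique solution for any positive integer $N$ because its coefficient matrix is of Vandermonde type: row $m$ equals $m^2 (1, m^2, \dots, m^{2(N-1)})$, and the nodes $1^2, 2^2, \dots, N^2$ are distinct. After solving the linear system (4.8), we obtain the constants $\{\alpha_k^N,\ k = 1,2,\dots,N\}$. Substituting $\{\alpha_k^N,\ k = 1,2,\dots,N\}$ into (4.7) we arrive at
$$\frac{\partial u(R,\theta)}{\partial r} + \frac{1}{R}\sum_{k=1}^{N} (-1)^k \alpha_k^N \frac{\partial^{2k} u(R,\theta)}{\partial \theta^{2k}} = \sum_{m=N+1}^{\infty} \Big\{ \Big( -\frac{m}{R} + \frac{1}{R}\sum_{k=1}^{N} m^{2k} \alpha_k^N \Big) \big(a_m \cos(m\theta) + b_m \sin(m\theta)\big) \Big\}.$$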
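Neglecting the right-hand side of the above equality, we obtain the following local artificial boundary condition:
$$\frac{\partial u}{\partial n} = \frac{\partial u(R,\theta)}{\partial r} = -\frac{1}{R}\sum_{k=1}^{N} (-1)^k \alpha_k^N \frac{\partial^{2k} u(R,\theta)}{\partial \theta^{2k}}. \tag{4.9}$$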
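As a small worked illustration (a sketch written out by hand, not taken from the source), the two lowest-order cases of the condition $\partial u/\partial n = -\frac{1}{R}\sum_{k=1}^{N}(-1)^k\alpha_k^N\,\partial^{2k}u/\partial\theta^{2k}$ follow directly from the requirement $\sum_{k=1}^{N} m^{2k}\alpha_k^N = m$, $m = 1,\dots,N$. The superscript on $\alpha$ is the index $N$, not a power.

```latex
% N = 1: the single equation 1^2 \alpha_1^1 = 1
\alpha_1^1 = 1
\quad\Longrightarrow\quad
\frac{\partial u}{\partial n}
  = \frac{1}{R}\,\frac{\partial^2 u(R,\theta)}{\partial\theta^2}.

% N = 2: the equations for m = 1 and m = 2
\alpha_1^2 + \alpha_2^2 = 1, \qquad 4\alpha_1^2 + 16\alpha_2^2 = 2
\quad\Longrightarrow\quad
\alpha_1^2 = \tfrac{7}{6}, \quad \alpha_2^2 = -\tfrac{1}{6},

\frac{\partial u}{\partial n}
  = \frac{1}{R}\left(
      \frac{7}{6}\,\frac{\partial^2 u(R,\theta)}{\partial\theta^2}
    + \frac{1}{6}\,\frac{\partial^4 u(R,\theta)}{\partial\theta^4}
    \right).

% Check on the mode cos(m\theta): the right-hand side gives
% \frac{1}{R}\,\frac{m^4 - 7m^2}{6}\cos(m\theta),
% which equals -\frac{m}{R}\cos(m\theta) for m = 1 and m = 2.
```

For $N = 2$ the condition is therefore exact for the first two Fourier modes and only approximate for $m \ge 3$.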
The high-order local artificial boundary condition (4.9) was given by Han and Wu (1985) [50]. The coefficients $\{\alpha_k^N,\ k = 1,2,\dots,N\}$ (for $N = 1,2,\dots,5$) are given in the following table:

N    $\alpha_1^N$   $\alpha_2^N$   $\alpha_3^N$   $\alpha_4^N$   $\alpha_5^N$
1    1
2    7/6            -1/6
3    74/60          -15/60         1/60
4    533/420        -43/144        11/360         -1/1008
5    4881/3780      -214/648       71/1728        -13/6048       1/25920

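The coefficients can be reproduced by solving the system $\sum_{k=1}^{N} m^{2k}\alpha_k^N = m$, $m = 1,\dots,N$, in exact rational arithmetic. The following is a minimal sketch; the function names `local_abc_coefficients` and `residual_symbol` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def local_abc_coefficients(N):
    """Solve sum_{k=1}^N m^(2k) * alpha_k = m, m = 1..N, exactly."""
    # Augmented matrix [A | b] with A[m-1][k-1] = m^(2k) and b[m-1] = m.
    rows = [[Fraction(m) ** (2 * k) for k in range(1, N + 1)] + [Fraction(m)]
            for m in range(1, N + 1)]
    # Gauss-Jordan elimination. A nonzero pivot always exists because the
    # matrix is a row-scaled Vandermonde matrix with distinct nodes 1, 4, ..., N^2.
    for col in range(N):
        pivot = next(r for r in range(col, N) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(N):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[k][N] for k in range(N)]


def residual_symbol(alpha, m):
    """Return -m + sum_k m^(2k) * alpha_k, i.e. R times the factor of mode m in (4.7)."""
    return -m + sum(a * Fraction(m) ** (2 * (k + 1)) for k, a in enumerate(alpha))


if __name__ == "__main__":
    for N in range(1, 6):
        print(N, [str(a) for a in local_abc_coefficients(N)])
```

`Fraction` prints values in lowest terms, so 74/60 appears as 37/30 and 4881/3780 as 1627/1260. `residual_symbol` vanishes for $m \le N$ and is nonzero for $m > N$, which is exactly the part of (4.7) that the local condition neglects.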
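By the local artificial boundary condition (4.9), the typical problem (P) can be reduced to problem $(P_R^N)$ on the bounded computational domain:
$$(P_R^N) \quad \begin{cases} -\Delta u^N = f(x), & \text{in } \Omega_i, \\ u^N|_{\Gamma_i} = 0, \\ \dfrac{\partial u^N}{\partial n} = -\dfrac{1}{R}\displaystyle\sum_{k=1}^{N} (-1)^k \alpha_k^N \dfrac{\partial^{2k} u^N(R,\theta)}{\partial \theta^{2k}}, & \text{on } \Gamma_R. \end{cases}$$
In the following we assume that $g(x) = 0$ on $\Gamma_i$. Define the space
$$V_N = \{ v \in H^1(\Omega_i) \mid v|_{\Gamma_i} = 0,\ v|_{\Gamma_R} \in H^N(\Gamma_R) \}.$$
Then the boundary value problem $(P_R^N)$ is equivalent to the variational problem: Find $u^N \in V_N$ such that
$$a(u^N, v) + b_N(u^N, v) = f(v), \qquad \forall v \in V_N, \tag{4.10}$$
with
$$b_N(u, v) = \sum_{k=1}^{N} \alpha_k^N \int_0^{2\pi} \frac{\partial^k u(R,\theta)}{\partial \theta^k} \frac{\partial^k v(R,\theta)}{\partial \theta^k} \, d\theta.$$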
Suppose that $V_N^h$ is a finite-dimensional subspace of $V_N$. A family of such subspaces was introduced by D. Givoli and J. B. Keller [21]. We now consider the numerical approximation of problem $(P_R^N)$: Find $u_h^N \in V_N^h$ such that
$$a(u_h^N, v) + b_N(u_h^N, v) = f(v), \qquad \forall v \in V_N^h. \tag{4.11}$$
For problem (4.11) we have the following result [7,55].

Theorem 4.1 Suppose that $f \in L^2(\Omega_i)$ and $u|_{\Gamma_0} \in H^N(\Gamma_0)$ for an odd integer $N$ ($1 \le N \le 20$) or $N = 0$. Then problem (4.11) has a unique solution $u_h^N$. Furthermore, the following error estimate holds:
$$\|u - u_h^N\|_* \le C_N \Big\{ \inf_{v \in V_N^h} \|u - v\|_* + \frac{1}{R^{N+1}} |u|_{N,\Gamma_0} \Big\},$$
where $C_N$ is a constant independent of $h$ and $R$, and
$$\|v\|_*^2 = |v|_{1,\Omega_i}^2 + |v|_{N,\Gamma_R}^2.$$
When the high-order local artificial boundary conditions are used to reduce problem (P), high-order derivatives of the unknown function $u$ appear in the reduced problem $(P_R^N)$, which makes its numerical solution more difficult. Han and Zheng [56] proposed a mixed finite element method to overcome this difficulty.
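
5 Discrete artificial boundary conditions

We now return to the typical problem (N-S). The physical domain $\Omega = \mathbb{R} \times [0, L] \setminus \Omega_0$ is an unbounded domain. Introduce two artificial boundaries
$$\Gamma_b = \{x_1 = b,\ 0 \le x_2 \le L\}, \qquad \Gamma_c = \{x_1 = c,\ 0 \le x_2 \le L\}$$
with $c > 0$ and $b < c$ such that $\Omega_0 \subset [b, c] \times [0, L]$. Then $\Omega$ is divided into three parts (see Fig. 5.1):
$$\Omega_b = \{(x_1, x_2) \mid -\infty < x_1 < b,\ 0 < x_2 < L\},$$
$$\Omega_i = \{(x_1, x_2) \mid b < x_1 < c,\ 0 < x_2 < L\} \setminus \Omega_0,$$
$$\Omega_c = \{(x_1, x_2) \mid c < x_1 < +\infty,\ 0 < x_2 < L\}.$$
Is it possible to compute problem (N-S) only on the bounded computational domain $\Omega_i$? The key to answering this question is to find two "suitable artificial boundary conditions" on the given artificial boundaries $\Gamma_b$ and $\Gamma_c$.

Figure 5.1

Let $\psi$ and $\omega$ denote the stream function and the vorticity. Then
$$\frac{\partial \psi}{\partial x_2} = u, \qquad \frac{\partial \psi}{\partial x_1} = -v, \tag{5.1}$$
$$\omega = \frac{\partial v}{\partial x_1} - \frac{\partial u}{\partial x_2}. \tag{5.2}$$
The typical problem (N-S) is reduced to
$$\frac{\partial \psi}{\partial x_2}\frac{\partial \omega}{\partial x_1} - \frac{\partial \psi}{\partial x_1}\frac{\partial \omega}{\partial x_2} - \nu \Delta \omega = 0, \quad \text{in } \Omega, \tag{5.3}$$
$$\Delta \psi + \omega = 0, \quad \text{in } \Omega, \tag{5.4}$$
with boundary conditions
$$\psi|_{x_2 = 0} = 0, \quad -\infty < x_1 < +\infty, \tag{5.5}$$
$$\psi|_{x_2 = L} = \psi_L \equiv \int_0^L u_\infty(s)\, ds, \quad -\infty < x_1 < +\infty, \tag{5.6}$$
$$\frac{\partial \psi}{\partial x_2}\Big|_{x_2 = 0, L} = 0, \quad -\infty < x_1 < +\infty, \tag{5.7}$$
$$\psi = \text{const}, \quad \frac{\partial \psi}{\partial n} = 0, \quad \text{on } \partial\Omega_0, \tag{5.8}$$
$$\psi \to \psi_\infty(x_2) = \int_0^{x_2} u_\infty(s)\, ds, \quad \text{when } |x_1| \to +\infty, \tag{5.9}$$
$$\omega \to \omega_\infty(x_2) = -u_\infty'(x_2), \quad \text{when } |x_1| \to +\infty. \tag{5.10}$$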
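When $|b|$ and $c$ are large, the flow in the domains $\Omega_b$ and $\Omega_c$ is close to Poiseuille flow, and the N-S equations can be linearized on the domain $\Omega_c$ (or $\Omega_b$) as
$$u_\infty(x_2)\frac{\partial u}{\partial x_1} + \frac{\partial p}{\partial x_1} = \nu \Delta u, \quad \text{in } \Omega_c, \tag{5.11}$$
$$u_\infty(x_2)\frac{\partial v}{\partial x_1} + \frac{\partial p}{\partial x_2} = \nu \Delta v, \quad \text{in } \Omega_c, \tag{5.12}$$
$$\frac{\partial u}{\partial x_1} + \frac{\partial v}{\partial x_2} = 0, \quad \text{in } \Omega_c. \tag{5.13}$$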
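Therefore the linear N-S equations (5.11)-(5.13), together with the boundary conditions, take the following form in stream function-vorticity variables:
$$u_\infty'(x_2)\frac{\partial^2 \psi}{\partial x_1 \partial x_2} - u_\infty(x_2)\frac{\partial \omega}{\partial x_1} + \nu \Delta \omega = 0, \quad \text{in } \Omega_c, \tag{5.14}$$
$$\Delta \psi + \omega = 0, \quad \text{in } \Omega_c, \tag{5.15}$$
$$\psi|_{x_2 = 0} = 0, \quad \psi|_{x_2 = L} = \psi_L, \quad c < x_1 < +\infty, \tag{5.16}$$
$$\frac{\partial \psi}{\partial x_2}\Big|_{x_2 = 0, L} = 0, \quad c < x_1 < +\infty, \tag{5.17}$$
$$\psi \to \psi_\infty(x_2), \quad \omega \to \omega_\infty(x_2), \quad \text{when } x_1 \to +\infty. \tag{5.18}$$
Since the boundary conditions on the artificial boundary $\Gamma_c$ are unknown, problem (5.14)-(5.18) is an incompletely posed problem and cannot be solved independently. Let
$$\psi|_{x_1 = c} = \psi_c(x_2), \quad 0 < x_2 < L, \tag{5.19}$$
$$\omega|_{x_1 = c} = \omega_c(x_2), \quad 0 < x_2 < L. \tag{5.20}$$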
For given functions $\psi_c(x_2)$, $\omega_c(x_2)$, we cannot directly use the techniques of Section 3 and Section 4 to find the exact boundary conditions and approximate artificial boundary conditions of the linear problem (5.14)-(5.20). One reason is that the coefficients of equation (5.14) are not constants but functions of the variable $x_2$. In this section a new class of artificial boundary conditions is proposed, called discrete artificial boundary conditions. For given $\psi_c(x_2)$, $\omega_c(x_2)$, we consider the infinite difference approximation of problem (5.14)-(5.20). Let $\delta_1 > 0$ and $\delta_2 = L/N$ be two mesh sizes, where $N$ is a positive integer. The domain $\Omega_c$ is discretized by the mesh points
$$x^{j,k} = (x_1^j, x_2^k), \quad j = 0,1,2,\dots, \quad k = 0,1,2,\dots,N,$$
with
$$x_1^j = c + j\delta_1, \quad j = 0,1,2,\dots, \qquad x_2^k = k\delta_2, \quad k = 0,1,2,\dots,N.$$
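Using the following difference approximations:
$$u_\infty'(x_2)\frac{\partial^2 \psi}{\partial x_1 \partial x_2}\Big|_{x^{j,k}} \approx \frac{u_\infty'(x_2^k)}{4\delta_1\delta_2}\big[\psi_{j+1,k+1} - \psi_{j+1,k-1} - \psi_{j-1,k+1} + \psi_{j-1,k-1}\big],$$
$$u_\infty(x_2)\frac{\partial \omega}{\partial x_1}\Big|_{x^{j,k}} \approx \frac{u_\infty(x_2^k)}{2\delta_1}\big[\omega_{j+1,k} - \omega_{j-1,k}\big],$$
$$\Delta \omega|_{x^{j,k}} \approx \frac{\omega_{j+1,k} - 2\omega_{j,k} + \omega_{j-1,k}}{\delta_1^2} + \frac{\omega_{j,k+1} - 2\omega_{j,k} + \omega_{j,k-1}}{\delta_2^2},$$
$$\Delta \psi|_{x^{j,k}} \approx \frac{\psi_{j+1,k} - 2\psi_{j,k} + \psi_{j-1,k}}{\delta_1^2} + \frac{\psi_{j,k+1} - 2\psi_{j,k} + \psi_{j,k-1}}{\delta_2^2},$$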
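then we obtain the infinite difference scheme:
$$\frac{u_\infty'(x_2^k)}{4\delta_1\delta_2}\big[\psi_{j+1,k+1} - \psi_{j+1,k-1} - \psi_{j-1,k+1} + \psi_{j-1,k-1}\big] - \frac{u_\infty(x_2^k)}{2\delta_1}\big[\omega_{j+1,k} - \omega_{j-1,k}\big] + \nu\Big[\frac{\omega_{j+1,k} - 2\omega_{j,k} + \omega_{j-1,k}}{\delta_1^2} + \frac{\omega_{j,k+1} - 2\omega_{j,k} + \omega_{j,k-1}}{\delta_2^2}\Big] = 0, \tag{5.21}$$
$$\frac{\psi_{j+1,k} - 2\psi_{j,k} + \psi_{j-1,k}}{\delta_1^2} + \frac{\psi_{j,k+1} - 2\psi_{j,k} + \psi_{j,k-1}}{\delta_2^2} + \omega_{j,k} = 0, \tag{5.22}$$
for $j = 1,2,\dots$, $k = 1,\dots,N-1$, with boundary conditions
$$\psi_{j,0} = 0, \quad \psi_{j,N} = \psi_L, \quad j = 0,1,2,\dots, \tag{5.23}$$
two discrete wall conditions (5.24) and (5.25), which express the wall vorticities $\omega_{j,0}$ and $\omega_{j,N}$, $j = 0,1,2,\dots$, through neighbouring mesh values, and
$$\lim_{j \to +\infty} \psi_{j,k} = \psi_\infty(x_2^k), \quad \lim_{j \to +\infty} \omega_{j,k} = \omega_\infty(x_2^k), \tag{5.26}$$
$$\psi_{0,k} = \psi_c(x_2^k), \quad \omega_{0,k} = \omega_c(x_2^k), \quad k = 1,2,\dots,N-1. \tag{5.27}$$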
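The interior equations of the scheme can be checked on the limiting Poiseuille state. The sketch below evaluates the two residuals
$\frac{u_\infty'}{4\delta_1\delta_2}[\psi_{j+1,k+1} - \psi_{j+1,k-1} - \psi_{j-1,k+1} + \psi_{j-1,k-1}] - \frac{u_\infty}{2\delta_1}[\omega_{j+1,k} - \omega_{j-1,k}] + \nu\Delta_h\omega$ and $\Delta_h\psi + \omega$ at interior mesh points. It assumes a parabolic profile $u_\infty(x_2) = 4U x_2 (L - x_2)/L^2$, which is an assumption made only for this illustration, as are all function names and parameter values. For this profile $\psi_\infty$ is cubic and $\omega_\infty$ is linear in $x_2$, so central differences are exact and both residuals vanish up to rounding.

```python
def poiseuille(L=1.0, U=1.0):
    """Assumed parabolic profile u_inf and the derived psi_inf, omega_inf."""
    u = lambda y: 4.0 * U * y * (L - y) / L**2                       # u_inf(x2)
    du = lambda y: 4.0 * U * (L - 2.0 * y) / L**2                     # u_inf'(x2)
    psi = lambda y: 4.0 * U * (L * y**2 / 2.0 - y**3 / 3.0) / L**2    # integral of u_inf from 0 to x2
    omega = lambda y: -du(y)                                          # omega_inf = -u_inf'
    return u, du, psi, omega


def scheme_residuals(psi, om, j, k, d1, d2, nu, u_k, du_k):
    """Residuals of the two interior difference equations at mesh point (j, k)."""
    cross = (psi[j + 1][k + 1] - psi[j + 1][k - 1]
             - psi[j - 1][k + 1] + psi[j - 1][k - 1]) / (4.0 * d1 * d2)
    conv = (om[j + 1][k] - om[j - 1][k]) / (2.0 * d1)
    lap_om = ((om[j + 1][k] - 2.0 * om[j][k] + om[j - 1][k]) / d1**2
              + (om[j][k + 1] - 2.0 * om[j][k] + om[j][k - 1]) / d2**2)
    lap_psi = ((psi[j + 1][k] - 2.0 * psi[j][k] + psi[j - 1][k]) / d1**2
               + (psi[j][k + 1] - 2.0 * psi[j][k] + psi[j][k - 1]) / d2**2)
    r_vorticity = du_k * cross - u_k * conv + nu * lap_om   # vorticity transport equation
    r_stream = lap_psi + om[j][k]                            # stream function equation
    return r_vorticity, r_stream


def max_residual_for_poiseuille(N=8, J=4, L=1.0, d1=0.5, nu=0.1):
    """Largest residual when psi and omega equal the Poiseuille state on every column."""
    u, du, psi_inf, om_inf = poiseuille(L)
    d2 = L / N
    x2 = [k * d2 for k in range(N + 1)]
    psi = [[psi_inf(y) for y in x2] for _ in range(J + 1)]
    om = [[om_inf(y) for y in x2] for _ in range(J + 1)]
    worst = 0.0
    for j in range(1, J):
        for k in range(1, N):
            r1, r2 = scheme_residuals(psi, om, j, k, d1, d2, nu, u(x2[k]), du(x2[k]))
            worst = max(worst, abs(r1), abs(r2))
    return worst


if __name__ == "__main__":
    print(max_residual_for_poiseuille())
```

That the constant-in-$j$ state solves the interior equations is the discrete counterpart of requiring the far-field limit to be a fixed point of the recurrence in $j$.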
Let
$$X_j = [\omega_{j,1}, \dots, \omega_{j,N-1};\ \psi_{j,1}, \dots, \psi_{j,N-1}]^T \in \mathbb{R}^{2N-2},$$
$$X_\infty = [\omega_\infty(x_2^1), \dots, \omega_\infty(x_2^{N-1});\ \psi_\infty(x_2^1), \dots, \psi_\infty(x_2^{N-1})]^T \in \mathbb{R}^{2N-2}.$$
The infinite difference equations (5.21)-(5.27) are equivalent to the following system of linear algebraic equations with infinitely many unknowns $X_1, X_2, \dots$:
$$(P_D) \quad \begin{cases} \text{For given } X_0, X_\infty \in \mathbb{R}^{2N-2}, \text{ find } \{X_1, X_2, \dots\} \text{ such that} \\ A_0 X_{j-1} + B_0 X_j + C_0 X_{j+1} = D_0, \quad j = 1,2,\dots, \\ \lim_{j \to \infty} X_j = X_\infty, \end{cases}$$
where $A_0, B_0, C_0$ are three $(2N-2) \times (2N-2)$ matrices with constant elements and $D_0 \in \mathbb{R}^{2N-2}$. $A_0, B_0, C_0$ and $D_0$ are obtained from the difference equations (5.21)-(5.22), and we know that [47]
$$(A_0 + B_0 + C_0) X_\infty = D_0. \tag{5.28}$$
Let $Y_j = X_j - X_\infty$ for $j = 0,1,2,\dots$. Then, by (5.28), $\{Y_j,\ j = 0,1,\dots\}$ satisfies:
$$\begin{cases} \text{For given } Y_0 \in \mathbb{R}^{2N-2}, \text{ find } \{Y_j,\ j = 1,2,\dots\} \text{ such that} \\ A_0 Y_{j-1} + B_0 Y_j + C_0 Y_{j+1} = 0, \quad j = 1,2,\dots, \\ \lim_{j \to \infty} Y_j = 0. \end{cases}$$
This problem can be solved numerically by a fast iteration method [47], and we obtain $Y_1 \approx -T Y_0$, where $T \in \mathbb{R}^{(2N-2) \times (2N-2)}$.
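Returning to the vectors $X_0$, $X_1$, we have $X_1 \approx -T X_0 + (I + T) X_\infty$. Let
$$W = \Big[\frac{\partial \omega(c, x_2^1)}{\partial x_1}, \dots, \frac{\partial \omega(c, x_2^{N-1})}{\partial x_1};\ \frac{\partial \psi(c, x_2^1)}{\partial x_1}, \dots, \frac{\partial \psi(c, x_2^{N-1})}{\partial x_1}\Big]^T \in \mathbb{R}^{2N-2}.$$
Then approximately we obtain
$$W \approx \frac{X_1 - X_0}{\delta_1}.$$
Finally, on the artificial boundary $\Gamma_c$ we have the discrete artificial boundary condition
$$W = -\frac{1}{\delta_1}(T + I)(X_0 - X_\infty). \tag{5.29}$$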
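To make the role of $T$ concrete: if the decaying solution of $A_0 Y_{j-1} + B_0 Y_j + C_0 Y_{j+1} = 0$ has the form $Y_j = (-T)^j Y_0$, then $T$ must satisfy $A_0 - B_0 T + C_0 T^2 = 0$, that is, $(B_0 - C_0 T)\,T = A_0$. The sketch below solves this with a plain fixed-point iteration $T \leftarrow (B_0 - C_0 T)^{-1} A_0$ started from $T = 0$. This simple iteration is a stand-in chosen for illustration and is not claimed to be the fast iteration method of [47]; the small test matrices and all function names are hypothetical, and convergence is only expected when the recurrence has a well-separated decaying solution.

```python
def matmul(A, B):
    """Product of two dense matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def matsub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


def decay_matrix(A0, B0, C0, iters=200):
    """Fixed-point iteration T <- (B0 - C0 T)^{-1} A0, started from T = 0."""
    n = len(A0)
    T = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        T = solve(matsub(B0, matmul(C0, T)), A0)
    return T


def discrete_abc(T, X0, Xinf, d1):
    """Evaluate W = -(T + I)(X0 - Xinf) / d1 for vectors given as lists."""
    n = len(X0)
    diff = [x - y for x, y in zip(X0, Xinf)]
    return [-(sum(T[i][j] * diff[j] for j in range(n)) + diff[i]) / d1
            for i in range(n)]


if __name__ == "__main__":
    minus_I = [[-1.0, 0.0], [0.0, -1.0]]
    B0 = [[2.5, 0.0], [0.0, 10.0 / 3.0]]
    T = decay_matrix(minus_I, B0, minus_I)
    print(T)
    print(discrete_abc(T, [1.0, 1.0], [0.0, 0.0], 0.1))
```

In the diagonal example each component obeys $-y_{j-1} + b\,y_j - y_{j+1} = 0$, whose decaying root is $(b - \sqrt{b^2 - 4})/2$, so the diagonal entries of $T$ should approach $-1/2$ and $-1/3$.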
On the artificial boundary $\Gamma_b$ a similar discrete artificial boundary condition can be obtained, and the original problem is then reduced to the bounded computational domain in a finite difference formulation. Furthermore, discrete artificial boundary conditions have been derived on a polygonal artificial boundary for the exterior problem of the Poisson equation [34], the problem of an infinite elastic foundation [6] and problems with interfaces [39,42].

6 Implicit boundary conditions

We now consider the typical problem (P) again:
$$(P) \quad \begin{cases} -\Delta u = f(x), & \text{in } \Omega, \\ u|_{\Gamma_i} = g(x), \\ u \text{ is bounded when } |x| \to \infty. \end{cases}$$
Introduce an artificial boundary $\Gamma \subset \Omega$ of arbitrary shape (see Fig. 6.1) such that $\Gamma$ divides the domain $\Omega$ into two parts, the bounded part $\Omega_i$ and the unbounded part $\Omega_e$, and
$$f(x) = 0, \quad \forall x \in \Omega_e.$$

Figure 6.1

On the artificial boundary $\Gamma$, the exact boundary condition is the Steklov-Poincaré mapping; namely, for given $u|_\Gamma$ we solve the following problem:
$$\Delta u = 0, \quad \text{in } \Omega_e, \tag{6.1}$$
$$u = u|_\Gamma, \quad \text{on } \Gamma, \tag{6.2}$$
$$u \text{ is bounded when } |x| \to \infty. \tag{6.3}$$
By the solution $u$ of problem (6.1)-(6.3), the Steklov-Poincaré mapping is given by
$$\frac{\partial u}{\partial n}\Big|_\Gamma = S(u|_\Gamma). \tag{6.4}$$
Unfortunately, since the shape of the artificial boundary $\Gamma$ is arbitrary, an explicit formulation of the Steklov-Poincaré mapping cannot be found in general. In this section we discuss implicit boundary conditions on an artificial boundary $\Gamma$ of any shape. Let $\lambda = \frac{\partial u}{\partial n}\big|_\Gamma$, where $n$ denotes the outward unit normal to $\Gamma = \partial\Omega_e$, and let the fundamental solution of the Laplace equation be given by
$$G(x, y) = \frac{1}{2\pi} \log|x - y|. \tag{6.5}$$
Using Green's formula, we obtain
$$u(x) = \int_\Gamma \frac{\partial G(x, y)}{\partial n_y} u(y)\, ds_y - \int_\Gamma G(x, y) \lambda(y)\, ds_y + a, \quad \forall x \in \Omega_e,$$
where $a$ is a constant and
$$\int_\Gamma \lambda(x)\, ds_x = 0. \tag{6.6}$$
On the boundary $\Gamma$, we have [60]
$$\frac{1}{2} u(x) = \int_\Gamma \frac{\partial G(x, y)}{\partial n_y} u(y)\, ds_y - \int_\Gamma G(x, y) \lambda(y)\, ds_y + a, \quad \forall x \in \Gamma. \tag{6.7}$$
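Furthermore [31]
$$\frac{1}{2} \lambda(x) = \int_\Gamma \frac{\partial^2 G(x, y)}{\partial n_x \partial n_y} u(y)\, ds_y - \int_\Gamma \frac{\partial G(x, y)}{\partial n_x} \lambda(y)\, ds_y, \quad \forall x \in \Gamma. \tag{6.8}$$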
The equalities (6.7) and (6.8) are two implicit boundary conditions on the given artificial boundary $\Gamma$, and both are satisfied by $u(x)$, the solution of problem (P). By combining the implicit boundary conditions (6.7) and (6.8), the typical problem (P) can be reduced to the bounded computational domain $\Omega_i$ with unknown functions $u$, $\lambda$ and an unknown constant $a$ [10,30]. This approach is known as the symmetric coupling method of finite elements and boundary elements, which is one of the two principal classes of FEM-BEM formulations [64] and was introduced independently by Costabel [10] and Han [30].

7 Conclusions

The artificial boundary method has been established for computing numerical solutions of partial differential equations on unbounded domains. The key to this method is to find the exact boundary conditions or approximate artificial boundary conditions on the given artificial boundary for various problems arising in many fields of science and engineering. In general, artificial boundary conditions can be classified into implicit boundary conditions and explicit boundary conditions, the latter including global artificial boundary conditions, local artificial boundary conditions and discrete artificial boundary conditions. The explicit artificial boundary conditions are more convenient in applications, but the implicit artificial boundary conditions can handle an artificial boundary of any shape. This method has been applied successfully in many fields of science and engineering, and its range of applications continues to widen. Many problems in this field remain open, for example how to solve nonlinear partial differential equations on unbounded domains numerically. This is an interesting and important problem that awaits solution.

References

[1] B. Alpert, L. Greengard and T. Hagstrom. Nonreflecting boundary conditions for the time-dependent wave equation. J. Comput. Phys., 180:270-296, 2002.
[2] W. Z. Bao. The approximations of the exact boundary condition at an artificial boundary for linearized incompressible viscous flows. J. Comput. Math., 16:239-256, 1998.
[3] W. Z. Bao. Artificial boundary conditions for incompressible Navier-Stokes equations: A well-posed result. Comput. Methods Appl. Mech. Engrg., 188:595-611, 2000.
[4] W. Z. Bao and H. D. Han. Nonlocal artificial boundary conditions for the incompressible viscous flow in a channel using spectral techniques. Journal of Computational Physics, 126:52-63, 1996.
[5] W. Z. Bao and H. D. Han. Local artificial boundary conditions for the incompressible viscous flow in a slip channel. Journal of Computational Mathematics, 15:335-343, 1997.
[6] W. Z. Bao and H. D. Han. The direct method of lines for the problem of infinite elastic foundation. Comput. Methods Appl. Mech. Engrg., 175:157-173, 1999.
[7] W. Z. Bao and H. D. Han. High-order local artificial boundary conditions for problems in unbounded domains. Comput. Methods Appl. Mech. Engrg., 188:455-471, 2000.
[8] W. Z. Bao and H. D. Han. Error bounds for the finite element approximation of an incompressible material in an unbounded domain. Numerische Mathematik, 93:415-444, 2003.
[9] W. Z. Bao, H. D. Han and Z. Y. Huang. Numerical simulations of fracture problems by coupling the FEM and the direct method of lines. Comput. Methods Appl. Mech. Engrg., 190:4831-4846, 2001.
[10] M. Costabel. Symmetric methods for the coupling of finite elements and boundary elements. In Boundary Elements IX, volume 1. Springer-Verlag, Berlin, 1987.
[11] Q. Du and X. N. Wu. Numerical solution for the three-dimensional Ginzburg-Landau models using artificial boundary. SIAM J. Numer. Anal., 36:1482-1506, 1999.
[12] B. Engquist and A. Majda. Absorbing boundary conditions for the numerical simulation of waves. Math. Comput., 31:629-651, 1977.
[13] K. Feng. Differential vs. integral equations and finite vs. infinite elements. Math. Numer. Sinica, 2:1:100-105, 1980.
[14] K. Feng. Canonical boundary reduction and finite element method. In Proceedings of International Invitational Symposium on the Finite Element Method, Hefei, 1981. Science Press, Beijing, 1982.
[15] K. Feng. Finite element method and natural boundary reduction. In Proceedings of the International Congress of Mathematicians, pages 1439-1453, Warszawa, 1983.
[16] K. Feng. Asymptotic radiation conditions for reduced wave equation. J. Comput. Math., 2:2:130-138, 1984.
[17] K. Feng and D. H. Yu. Canonical integral equations of elliptic boundary value problems on the finite element method. In Proceedings of International Invitational Symposium on the Finite Element Method, pages 211-252, Beijing, 1982. Science Press, Beijing, 1983.
[18] G. N. Gatica, L. F. Gatica and E. P. Stephan. A FEM-DtM formulation for a non-linear exterior problem in incompressible elasticity. Math. Meth. Appl. Sci., 26:151-170, 2003.
[19] D. Givoli and J. B. Keller. A finite element method for large domains. Comput. Methods Appl. Mech. Engrg., 76:41-66, 1989.
[20] D. Givoli and J. B. Keller. Nonreflecting boundary conditions for elastic waves. Wave Motion, 12:261-279, 1990.
[21] D. Givoli and J. B. Keller. Special finite elements for use with high-order boundary conditions. Comput. Methods Appl. Mech. Engrg., 119:199-213, 1994.
[22] D. Givoli, I. Patlashenko and J. B. Keller. High-order boundary conditions and finite elements for infinite domains. Comput. Methods Appl. Mech. Engrg., 143:13-39, 1997.
[23] C. I. Goldstein. A finite element method for solving Helmholtz type equations in waveguides and other unbounded domains. Math. Comput., 39:309-324, 1982.
[24] I. S. Gradshteyn and I. M. Ryzhik. Tables of Integrals, Series and Products, Sixth Edition. Academic Press, 2000.
[25] M. J. Grote and J. B. Keller. On nonreflecting boundary conditions. J. Comput. Phys., 122:231-243, 1995.
[26] T. Hagstrom, S. I. Hariharan and D. Thompson. High-order radiation boundary conditions for the convective wave equation in exterior domains. SIAM J. Sci. Comput., 25:1088-1101, 2003.
[27] T. Hagstrom and H. B. Keller. Exact boundary conditions at an artificial boundary for partial differential equations in cylinders. SIAM J. Math. Anal., 17:322-341, 1986.
[28] T. Hagstrom and H. B. Keller. Asymptotic boundary conditions and numerical methods for nonlinear elliptic problems on unbounded domains. Math. Comput., 48:449-470, 1987.
[29] L. Halpern and M. Schatzman. Artificial boundary conditions for incompressible viscous flows. SIAM J. Math. Anal., 20:308-353, 1989.
[30] H. D. Han. A new class of variational formulations for the coupling of finite and boundary element methods. Journal of Computational Mathematics, 8:223-232, 1990.
[31] H. D. Han. Boundary integro-differential equations of elliptic boundary value problems and their numerical solutions. Scientia Sinica, 31:1153-1165, 1998.
[32] H. D. Han and W. Z. Bao. An artificial boundary condition for the incompressible viscous flows in a no-slip channel. Journal of Computational Mathematics, 13:51-63, 1995.
[33] H. D. Han and W. Z. Bao. The artificial boundary conditions for incompressible materials on an unbounded domain. Numerische Mathematik, 77:347-363, 1997.
[34] H. D. Han and W. Z. Bao. The discrete artificial boundary condition on a polygonal artificial boundary for the exterior problem of Poisson equation by using the direct method of lines. Comput. Methods Appl. Mech. Engrg., 179:345-360, 1999.
[35] H. D. Han and W. Z. Bao. Error estimates for the finite element approximation of problems in unbounded domains. SIAM J. Numer. Anal., 37:1101-1119, 2000.
[36] H. D. Han and W. Z. Bao. Error estimates for the finite element approximation of linear elastic equations in an unbounded domain. Mathematics of Computation, 70:1437-1459, 2001.
[37] H. D. Han, W. Z. Bao and T. Wang. Numerical simulation for the problem of infinite elastic foundation. Computer Methods in Applied Mechanics and Engineering, 147:369-385, 1997.
[38] H. D. Han, C. H. He and X. N. Wu. Analysis of artificial boundary conditions for exterior boundary value problems in three dimensions. Numer. Math., 85:367-386, 2000.
[39] H. D. Han and Z. Y. Huang. The direct method of lines for the numerical solutions of interface problem. Comput. Methods Appl. Mech. Engrg., 171:61-75, 1999.
[40] H. D. Han and Z. Y. Huang. A semi-discrete numerical procedure for composite material problems. Mathematical Sciences and Applications, 12:35-44, 1999.
[41] H. D. Han and Z. Y. Huang. The direct method of lines for incompressible material problems on polygon domains. In T. Chan, T. Kako, H. Kawarada and O. Pironneau, editors, 12th International Conference on Domain Decomposition Methods, pages 125-132, 2001.
[42] H. D. Han and Z. Y. Huang. The discrete method of separation of variables for composite material problems. International Journal of Fracture, 112:379-402, 2001.
[43] H. D. Han and Z. Y. Huang. A class of artificial boundary conditions for heat equation in unbounded domains. Computers & Mathematics with Applications, 43:889-900, 2002.
[44] H. D. Han and Z. Y. Huang. Exact and approximating boundary conditions for the parabolic problems on unbounded domains. Computers & Mathematics with Applications, 44:655-666, 2002.
[45] H. D. Han and Z. Y. Huang. Exact artificial boundary conditions for Schrödinger equation in R2. Comm. Math. Sci., 2:79-94, 2004.
[46] H. D. Han, Z. Y. Huang and W. Z. Bao. The discrete method of separation of variables for computation of stress intensity factors. Chinese J. Comput. Phy., 17:483-496, 2000.
[47] H. D. Han, J. F. Lu and W. Z. Bao. A discrete artificial boundary condition for steady incompressible viscous flows in a no-slip channel using a fast iterative method. Journal of Computational Physics, 114:201-208, 1994.
[48] H. D. Han and X. Wen. The local artificial boundary conditions for numerical simulations of the flow around a submerged body. Journal of Scientific Computation, 16:263-286, 2001.
[49] H. D. Han and X. Wen. The global artificial boundary conditions for numerical simulations of the 3d flow around a submerged body. Journal of Computational Mathematics, 21:435-450, 2003.
[50] H. D. Han and X. N. Wu. Approximation of infinite boundary condition and its applications to finite element methods. Journal of Computational Mathematics, 3:179-192, 1985.
[51] H. D. Han and X. N. Wu. The mixed finite element method for Stokes equations on unbounded domains. Journal of Systems Sci. and Mathematical Sci., 5:121-132, 1985.
[52] H. D. Han and X. N. Wu. The approximation of the exact boundary conditions at an artificial boundary for linear elastic equations and its application. Mathematics of Computation, 59:21-37, 1992.
[53] H. D. Han and X. N. Wu. A fast numerical method for the Black-Scholes equation of American options. SIAM J. Numer. Anal., 41:2081-2095, 2003.
[54] H. D. Han and L. A. Ying. Large elements and the local finite element method. Acta Mathematicae Applicatae Sinica, 3:237-249, 1980.
[55] H. D. Han and C. X. Zheng. High-order local artificial boundary conditions of the exterior problem of Poisson equations in 3-d space. Numer. Math. (A Journal of Chinese Universities), 23:290-304, 1999.
[56] H. D. Han and C. X. Zheng. Mixed finite element and high-order local artificial boundary conditions of elliptic equation. Comput. Methods Appl. Mech. Engrg., 191:2011-2027, 2002.
[57] H. D. Han and C. X. Zheng. Mixed finite element method and higher-order local artificial boundary conditions for exterior 3-d Poisson equation. Tsinghua Science and Technology, 7:228-234, 2002.
[58] H. D. Han and C. X. Zheng. Exact nonreflecting boundary conditions for acoustic problem in three dimensions. Journal of Computational Mathematics, 21:15-24, 2003.
[59] S. D. Jiang and L. Greengard. Fast evaluation of nonreflecting boundary conditions for the Schrödinger equations in one dimension, journal.
[60] C. Johnson and J. C. Nedelec. On the coupling of boundary integral and finite element methods. Math. Comp., 35:1063-1079, 1980.
[61] J. B. Keller and D. Givoli. Exact nonreflecting boundary conditions. J. Comput. Phys., 82:172-192, 1989.
[62] A. Kirsch and P. Monk. A finite element method for approximating electromagnetic scattering from a conducting object. Numer. Math., 92:501-534, 2002.
[63] Z. P. Li and X. N. Wu. Multi-atomic Young measure and artificial boundary in approximation of micromagnetics. Appl. Numer. Math., 51:69-88, 2004.
[64] S. Meddahi, M. Gonzalez, and P. Perez. On a FEM-BEM formulation for an exterior quasilinear problem in the plane. SIAM J. Numer. Anal., 37:1820-1837, 2000.
[65] F. Nataf. An open boundary condition for the computation of the steady incompressible Navier-Stokes equations. J. Comput. Phys., 85:104-129, 1989.
[66] T. Ushijima. An FEM-CSM combined method for planar exterior Laplace problems. Japan J. Indust. Appl. Math., 18:359-382, 2001.
[67] X. N. Wu and H. D. Han. A finite-element method for Laplace and Helmholtz type boundary value problems with singularities. SIAM J. Numer. Anal., 34:1037-1050, 1997.
[68] X. N. Wu and H. D. Han. Discrete boundary conditions for problems with interface. Comput. Methods Appl. Mech. Engrg., 190:4987-4998, 2001.
[69] X. N. Wu and Z. Z. Sun. Convergence of difference scheme for heat equation in unbounded domains using artificial boundary conditions. Appl. Numer. Math., 50:261-277, 2004.
[70] D. H. Yu. Canonical integral equations of biharmonic elliptic boundary value problems. Math. Numer. Sinica, 4:330-336, 1982.
[71] D. H. Yu. Numerical solutions of harmonic and biharmonic canonical integral equations in interior or exterior circular domains. J. Comput. Math., 1:52-62, 1983.
[72] D. H. Yu. Approximation of boundary conditions at infinity for a harmonic equation. J. Comput. Math., 3:219-227, 1985.
[73] D. H. Yu. Canonical integral equations of Stokes problem. J. Comput. Math., 4:62-73, 1986.
[74] D. H. Yu. The approximate computation of hypersingular integrals on interval. Numer. Math. J. Chinese Univ., 1:114-127, 1992.
[75] D. H. Yu. The coupling of natural BEM and FEM for Stokes problem on unbounded domain. Math. Numer. Sinica, 14:371-378, 1992.
[76] D. H. Yu. The mathematical theory of the natural boundary element method. In Monograph on pure mathematics and applied mathematics, No. 26. Science Press, Beijing, 1993.
[77] D. H. Yu. The computation of hypersingular integrals on a circle and its error estimates. Numer. Math. J. Chinese Univ., 16:332-337, 1994.
[78] D. H. Yu. Natural Boundary Method and Its Applications. Kluwer Academic Publishers and Science Press, 2002.
[79] C. X. Zheng and H. D. Han. High-order artificial boundary conditions for ideal axisymmetric irrotational flow around 3d obstacles. International Journal of Numerical Methods in Engineering, 54:1195-1208, 2002.
[80] C. X. Zheng and H. D. Han. Artificial boundary method for exterior Stokes flow in three dimensions. International Journal for Numerical Method in Fluids, 41:537-549, 2003.

Optimal order integration on the sphere
Kerstin Hesse, Ian H. Sloan
School of Mathematics, The University of New South Wales, Sydney NSW 2052, Australia.
Abstract This paper reviews some recent developments in cubature over the sphere $S^2$ for functions in Sobolev spaces. More precisely, for an $m$-point cubature rule $Q_m$ we consider the worst-case (cubature) error, denoted by $E(Q_m; H^s)$, of functions in the unit ball of the Sobolev space $H^s = H^s(S^2)$ with $s > 1$. The following recent results are reviewed in this paper: For any sequence $(Q_{m(n)})_{n \in \mathbb{N}}$ of positive weight $m(n)$-point cubature rules $Q_{m(n)}$, where $Q_{m(n)}$ integrates all spherical polynomials of degree $\le n$ exactly, the worst-case error in $H^s$ satisfies the estimate $E(Q_{m(n)}; H^s) \le c_s n^{-s}$ with a universal constant $c_s > 0$. Whenever $m(n) = O(n^2)$ we deduce $E(Q_{m(n)}; H^s) \le c_s\, m(n)^{-s/2}$, where the constant $c_s$ now depends on the constant in $m(n) = O(n^2)$. This rate of convergence is optimal since it has also been shown that there exists a universal constant $c_s > 0$ such that for any $m$-point cubature rule $Q_m$, the worst-case error in $H^s$ with $s > 1$ satisfies $E(Q_m; H^s) \ge c_s\, m^{-s/2}$. For example, sequences $(Q_{m(n)})$ of positive weight product rules with $m(n) = O(n^2)$ achieve the optimal order of convergence $O(m(n)^{-s/2})$. So too, if the weights are all positive, do sequences $(Q_{m(n)})$ of interpolatory cubature rules based on extremal fundamental systems.

1 Introduction

For a continuous function $f$ on the sphere $S^2 \subset \mathbb{R}^3$,
$$S^2 := \{ x = (x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 + x_3^2 = 1 \},$$
the integral
$$If = \int_{S^2} f(x)\, d\omega(x)$$
can be approximated with an $m$-point cubature (or numerical integration) rule
$$Q_m f := \sum_{j=1}^{m} w_j f(x_j)$$
with points $x_1, \dots, x_m \in S^2$ and weights $w_1, \dots, w_m \in \mathbb{R}$. (Here $d\omega(x)$ denotes the usual surface measure on $S^2$.) The quality of such a cubature rule $Q_m$ for functions in a space $H$ of continuous functions on $S^2$, with norm $\|\cdot\|$, is measured by the worst-case (cubature) error
$$E(Q_m; H) := \sup_{f \in H,\ \|f\| \le 1} |Q_m f - If|.$$
Interesting questions in this context are the following: For a sequence $(Q_{m(n)})_{n \in \mathbb{N}}$ of $m(n)$-point cubature rules $Q_{m(n)}$ with the property $\lim_{n \to \infty} m(n) = \infty$, what is the order in terms of $m(n)$ of the worst-case error $E(Q_{m(n)}; H)$? What is the best possible order in terms of $m$ of the worst-case error $E(Q_m; H)$ among all choices of $m$-point cubature rules $Q_m$? And how may this best possible order be achieved? More precisely, we want to find a suitable lower bound for the worst-case error $E(Q_m; H)$ for any $m$-point cubature rule $Q_m$ in terms of powers of $m$, and we want to identify sequences of cubature rules that achieve this order of convergence. The results when put together then show that each one is an optimal result. In this paper we review such results from [5-7] for the Sobolev (Hilbert) space $H^s = H^s(S^2)$ with $s > 1$ arbitrary. The space $H^s$ is roughly the space of those continuous functions whose generalized (distributional) derivatives of order $\le s$ are square-integrable. In the next section we discuss interpolatory cubature based on extremal fundamental systems (see [14]), which provided the initial motivation for the work reviewed in this paper. We compare and contrast such rules with product rules. In the subsequent section we introduce the Sobolev space $H^s$, formulate the results and sketch some of the ideas behind the proof of the lower bound result (Theorem 3.2). In the last section we conclude with an example that illustrates the good performance of interpolatory cubature based on extremal fundamental systems.

2 Extremal fundamental systems versus product rules

Writing the integral $If$ of a continuous function $f$ on $S^2$ in terms of the coordinates $(t, \phi) \in [-1, 1] \times [0, 2\pi)$ gives
$$If = \int_0^{2\pi} \int_{-1}^{1} F(t, \phi)\, dt\, d\phi,$$
where $F(t, \phi) := f(\sqrt{1 - t^2} \cos\phi, \sqrt{1 - t^2} \sin\phi, t)$. The most natural integration rules that come to mind are then the so-called product rules, the 'product' of two one-dimensional integration rules. (The substitution $t = \cos\theta$, $\theta \in [0, \pi]$, gives the usual polar coordinates.) For our purposes product rules of the following type are of interest:
$$Q_{m(n)} f := \sum_{i=1}^{n'} \sum_{j=0}^{n} \mu_i \frac{2\pi}{n+1} F\Big(t_i, \frac{2\pi j}{n+1}\Big), \tag{2.1}$$
where $Q_{m(n)}$ is the product of the rectangle rule
$$\frac{2\pi}{n+1} \sum_{j=0}^{n} g\Big(\frac{2\pi j}{n+1}\Big) \approx \int_0^{2\pi} g(\phi)\, d\phi,$$
which integrates all trigonometric polynomials of degree $\le n$ exactly, and a rule
$$\sum_{i=1}^{n'} \mu_i h(t_i) \approx \int_{-1}^{1} h(t)\, dt, \tag{2.2}$$
which is assumed to have positive weights $\mu_i$, $i = 1, \dots, n'$, and to integrate all algebraic polynomials of degree $\le n$ exactly. We also demand that $n' = O(n)$ in order that $m(n) = (n+1) n' = O(n^2)$. There are many possible choices of rule (2.2) with the mentioned properties (see for example [2,15]). The product rule $Q_{m(n)}$ uses $m(n) = (n+1) n' = O(n^2)$ points, and the properties of the two one-dimensional rules imply that the product rule (2.1) has positive weights, and that it integrates any spherical polynomial of degree $\le n$ exactly. More precisely,
$$Q_{m(n)} p = Ip \quad \forall p \in \mathbb{P}_n,$$
where $\mathbb{P}_n$ is the space of all spherical polynomials of degree $\le n$, that is, the restrictions to $S^2$ of all polynomials on $\mathbb{R}^3$ of degree $\le n$. The dimension of $\mathbb{P}_n$ is $d_n := \dim(\mathbb{P}_n) = (n+1)^2$. An unattractive feature of product rules is that they have an uneven point distribution: the points cluster at the poles. In contrast to the point sets of product rules, so-called extremal fundamental systems possess a much nicer geometric point distribution. Let $\{\Phi_j \mid j = 1, \dots, d_n\}$ be any basis of $\mathbb{P}_n$. A fundamental system of degree $n \in \mathbb{N}_0$ is a point set $\{x_i \mid i = 1, \dots, d_n\}$ of $d_n$ points for which
$$\det [\Phi_j(x_i)]_{i,j=1,\dots,d_n} \ne 0. \tag{2.3}$$
The interpolation problem based on a fundamental system is: given a continuous function $f$, find $L_n f \in \mathbb{P}_n$ such that
$$L_n f(x_i) = f(x_i), \quad i = 1, \dots, d_n. \tag{2.4}$$
Kerstin Hesse, Ian H. Sloan
It follows from (2.3) that this problem always has a unique solution. (Note that Ln : C(S2) -» P„ is a projector, that is, L„ = Ln, and in particular Lnp = p for all p e P n .) Of course, the interpolation problem (2.4) can be arbitrarily badly conditioned if the condition number of the matrix [$j(x»)]i,j=i,...,d„ is large. A fundamental system {xt \i = 1,... ,dn} (of degree n) is called an extremal fundamental system (of degree n) if it maximizes the determinant of the interpolation matrix, that is, if det [$j(xj)l. . ,
. =
max
det [,• (y,)]. . ,
. . (2.5)
It should be noted that both (2.3) and (2.5) do not depend on the choice of the basis {$j | j = 1 , . . . , dn} of P„. The interpolatory cubature rule with respect to an (extremal) fundamental system of degree n is defined by integrating the interpolating polynomial Lnf e P„ of a continuous function / exactly, that is, Qm(n)f-= ! Lnf{x)du>(x). JS2
(2.6)
The interpolant Lnf S P n can be represented with the help of the Lagrange polynomials lj e P n ) j = 1, • • • , 4 , defined by £j(xi) = 5ji for % ^ 1 , . . . , dni as
Lnf = JT/f(xj)ej.
(2.7)
Substituting (2.7) into (2.6) yields Qm(n)f = Y]wjf(xj)
with
Wj ••= £j(x)dw(x).
(2.8)
Prom the definition of the weights in (2.8) it is not at all clear whether the weights are non-negative. However for the computed extremal fundamental systems up to degree n = 50 the computed weights uij are all positive and lie in the interval [\ j 1 , | j1-] (see [14]). This is strong numerical evidence that interpolatory cubature rules with respect to extremal fundamental systems have positive weights, although we do not have a proof. The cubature rule Q m ( n ), given by (2.8), uses m(n) = (n+ l ) 2 = 0(n2) points, and from (2.6) it is clear that it integrates all p e P n exactly. Extremal fundamental systems lead to a very well-conditioned polynomial interpolation problem (2.4). They can also possess a very nice geometric point distribution, as shown by the following two results. In [11, Theorem 5.1] Reimer, generalizing an argument from [17], showed that
Optimal order integration on the sphere
Figure 2.1 Gauß-Legendre product rule points (left) and extremal fundamental system (right) for degree n = 41.
for any cubature rule that is exact for polynomials of degree $\le n$, and that has points $\{x_j \mid j = 1,\dots,m\}$ and positive weights,
\[ \max_{x \in S^2}\ \min_{j=1,\dots,m} \arccos(x \cdot x_j) \le \arccos(z_n), \tag{2.9} \]
where $z_n$ is the largest zero of the Legendre polynomial $P_{\lfloor n/2 \rfloor + 1}$. The expression on the left of (2.9) is called the mesh norm. From the known asymptotics of the zeros of the Legendre polynomials we have
\[ \arccos(z_n) \sim \frac{4.8096}{n}. \]
Essentially (2.9) tells us that there are no 'large holes' in extremal fundamental systems if the corresponding weights are positive. The other result (due to [10], using a result of [12]) asserts that the points of an extremal fundamental system are well separated, in that for an extremal fundamental system $\{x_i \mid i = 1,\dots,d_n\}$ of degree $n$,
\[ \arccos(x_i \cdot x_j) \ge \frac{\pi}{2n} \quad\text{for } i,j = 1,\dots,d_n \text{ with } i \ne j. \]
Figure 2.1 illustrates that extremal fundamental systems have a much more uniform geometric point distribution than the point sets of product rules. In terms of the number of points $m(n) = O(n^2)$ and the degree of polynomial exactness $n$, a product rule (2.1) and the interpolatory cubature rule (2.8) with respect to an extremal fundamental system (of degree $n$) have comparable properties.
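The separation result can be checked by hand for a very small case. The sketch below computes the smallest pairwise geodesic distance of a point set and compares it with $\pi/(2n)$; the regular tetrahedron is used as a degree-1 example on the assumption that it is extremal for $n = 1$ (for the basis $1, x, y, z$ the determinant is proportional to the volume spanned by the four points). The function name `min_separation` is hypothetical.

```python
import math
from itertools import combinations

def min_separation(points):
    """Smallest geodesic distance arccos(x_i . x_j) over all pairs i != j."""
    def angle(p, q):
        dot = sum(a * b for a, b in zip(p, q))
        # Clamp to [-1, 1] to guard against rounding before calling acos.
        return math.acos(max(-1.0, min(1.0, dot)))
    return min(angle(p, q) for p, q in combinations(points, 2))

c = 1.0 / math.sqrt(3.0)
tetrahedron = [(c, c, c), (c, -c, -c), (-c, c, -c), (-c, -c, c)]

n = 1                                  # degree associated with the 4 points
separation = min_separation(tetrahedron)
bound = math.pi / (2 * n)              # lower bound from the separation result
```

Here every pair of points has inner product $-1/3$, so the separation is $\arccos(-1/3)$, which is larger than $\pi/2$.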
3
Kerstin Hesse, Ian H. Sloan
Optimal order estimates for cubature in Sobolev spaces
To formulate and explain our cubature results we need some notation. For more details the reader is referred to [3,8,10]. The restriction to $S^2$ of any homogeneous harmonic polynomial of exact degree $\ell$ is called a spherical harmonic of degree $\ell$. The space of all spherical harmonics of degree $\ell$ has the dimension $2\ell + 1$, and we choose an orthonormal set $\{Y_{\ell k} \mid k = 1,\dots,2\ell+1\}$ of spherical harmonics of degree $\ell$ with respect to the $L_2(S^2)$ inner product
\[ (f,g)_{L_2} := \int_{S^2} f(x)\,g(x)\, d\omega(x). \]
Then $\bigcup_{\ell=0}^{\infty} \{Y_{\ell k} \mid k = 1,\dots,2\ell+1\}$ is a complete orthonormal set in $L_2(S^2)$, and
\[ \mathbb{P}_n = \operatorname{span} \bigcup_{\ell=0}^{n} \{Y_{\ell k} \mid k = 1,\dots,2\ell+1\}. \]
Any $f \in L_2(S^2)$ can be expanded into a Fourier series (or Laplace series) with respect to this complete orthonormal set of spherical harmonics:
\[ f = \sum_{\ell=0}^{\infty} \sum_{k=1}^{2\ell+1} \hat f_{\ell k}\, Y_{\ell k}, \tag{3.1} \]
with the Fourier coefficients given by
\[ \hat f_{\ell k} := \int_{S^2} f(x)\,Y_{\ell k}(x)\, d\omega(x). \]
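The equality in (3.1) is to be understood in the $L_2(S^2)$-sense. The Sobolev space $H^s = H^s(S^2)$, $s \ge 0$, is now defined as the completion of $\operatorname{span} \bigcup_{n=0}^{\infty} \mathbb{P}_n$ with respect to the norm
\[ \|f\|_s := \Big( \sum_{\ell=0}^{\infty} \sum_{k=1}^{2\ell+1} \big(\ell + \tfrac12\big)^{2s}\, |\hat f_{\ell k}|^2 \Big)^{1/2}. \tag{3.2} \]
The space $H^s$ is a Hilbert space with the inner product
\[ (f,g)_s := \sum_{\ell=0}^{\infty} \sum_{k=1}^{2\ell+1} \big(\ell + \tfrac12\big)^{2s}\, \hat f_{\ell k}\, \hat g_{\ell k}, \]
which induces the norm (3.2). Clearly, $H^0 = L_2(S^2)$. For $s > 1$ the space $H^s$ is embedded into the space $C(S^2)$ of continuous functions on $S^2$,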
endowed with the supremum norm, and in this situation the Sobolev space $H^s$ is also a reproducing kernel Hilbert space, a fact which plays an important role in the proof (see [5,6]) of the upper bound of the worst-case cubature error (see Theorem 3.1 below). Numerical experiments (see [14]) showed that for typical $n$ up to degree 191
\[ E(Q_{m(n)}; H^{3/2}) \approx 3.217\,(n+1)^{-1.5006} = 3.217\, m(n)^{-0.7503}, \]
where $Q_{m(n)}$ is the interpolatory cubature rule with respect to an extremal fundamental system of degree $n$. Note that $1.5006 \approx 3/2$. That experimental result motivated first the conjecture and then the proof of the following theorem from [5,6]:
Theorem 3.1. For each $s > 1$, there exists a constant $c_s > 0$ such that for any $m$-point cubature rule $Q_m$ which has positive weights and satisfies $Q_m p = Ip$ for all $p \in \mathbb{P}_n$, the worst-case cubature error in $H^s$ satisfies the estimate
\[ E(Q_m; H^s) \le c_s\, n^{-s}. \tag{3.3} \]
Examples of cubature rules that satisfy the assumptions of Theorem 3.1 are positive weight product rules (2.1), and also, under the assumption that the weights are positive, interpolatory cubature rules based on extremal fundamental systems. For a sequence $(Q_{m(n)})$ of cubature rules with $Q_{m(n)} p = Ip$ for all $p \in \mathbb{P}_n$, the estimate (3.3) also holds if the assumption of the positivity of the weights in Theorem 3.1 is replaced by the assumption that the sequence $(Q_{m(n)})$ satisfies a certain local regularity property (see [5,6] for the details). In this case the constant $c_s$ may depend on the sequence of cubature rules. Theorem 3.1 was first proved in [5] for the special case $s = 3/2$, for which we 'knew' the 'correct' order $O(n^{-3/2})$ from the persuasive numerical evidence presented in [14]. An essential part of the proof in [5] is a surprising representation of the tail of a certain Legendre series. The proof of this representation does not generalize in an obvious way to arbitrary $s > 1$, but using the knowledge of this representation we conjectured in [6] an analogous result for arbitrary $s > 1$, and then verified that result, enabling us in [6] to prove Theorem 3.1 for general $s > 1$. For a sequence $(Q_{m(n)})$ of $m(n)$-point cubature rules with the property $Q_{m(n)} p = Ip$ for all $p \in \mathbb{P}_n$ we always have $m(n) \ge c\,n^2$ for some constant $c > 0$, that is, $n^2$ is the lowest possible order of $m(n)$. If $m(n) = O(n^2)$ for a sequence $(Q_{m(n)})$ of rules satisfying the conditions in Theorem 3.1 then
\[ E(Q_{m(n)}; H^s) \le c_s\, m(n)^{-s/2}, \tag{3.4} \]
where the constant $c_s > 0$ depends on the constant in $m(n) = O(n^2)$. The following result, from [7], shows that (3.4) is indeed the optimal order of convergence.
Theorem 3.2. For each $s > 1$, there exists a constant $c_s > 0$ such that for any $m$-point cubature rule $Q_m$ on $S^2$,
\[ E(Q_m; H^s) \ge c_s\, m^{-s/2}. \tag{3.5} \]
Theorem 3.2 is a 'negative' result that shows the limitations of $m$-point cubature rules in $H^s$. The estimate (3.5) is sharp (or optimal) because Theorem 3.1 with the additional assumption $m(n) = O(n^2)$ identifies sequences of cubature rules that achieve this optimal rate of convergence. We now give a brief sketch of the proof of Theorem 3.2 in [7]. The proof was inspired by the method of proof for lower bounds for the worst-case cubature error in certain spaces of continuous functions on the unit cube $[0,1]^2$ (see [1,9]).
Sketch of the proof of Theorem 3.2. The idea is to construct for each $m$-point cubature rule $Q_m$ a function $f_m \in H^s$ such that
\[ \frac{|Q_m f_m - I f_m|}{\|f_m\|_s} \ge c_s\, m^{-s/2} \tag{3.6} \]
with a constant $c_s > 0$ that is independent of $Q_m$ and $m$ but depends on $s$. As the cubature error of $f_m / \|f_m\|_s$ is a lower bound for $E(Q_m; H^s)$, (3.6) implies (3.5). First, we pack the sphere $S^2$ with $2m$ spherical caps
\[ S(y_j, \alpha_m) := \{ z \in S^2 \mid \arccos(z \cdot y_j) \le \alpha_m \}, \quad j = 1,\dots,2m, \]
where the spherical angle $\alpha_m$ satisfies
\[ c_1\,(2m)^{-1/2} \le \alpha_m \le c_2\,(2m)^{-1/2} \tag{3.7} \]
with constants $c_1, c_2$ being positive and independent of $m$. A packing of the sphere means that the caps touch at most at their boundaries and are not allowed to overlap. That such centers $y_1,\dots,y_{2m} \in S^2$ and a spherical angle $\alpha_m$ satisfying (3.7) exist follows from results in [4] (see also [13]). As the $2m$ caps of our sphere packing touch at most at their boundaries and as there are only $m$ cubature points, there are at most $m$ such caps that contain a cubature point in the interior. Consequently there are at least $m$ caps that contain no cubature point in their interior. After relabeling, we may assume that $S(y_j, \alpha_m)$, $j = 1,\dots,m$, contain no cubature point in their interior.
Figure 3.1 A packing of $S^2$ with the 100 points $y_1,\dots,y_{100}$ of an extremal fundamental system of degree n = 9 as centers.
Now we construct a function $f_m$ such that the following properties are satisfied: (i) $f_m$ is infinitely differentiable, (ii) the support of $f_m$ is contained in $\bigcup_{j=1}^{m} S(y_j, \alpha_m)$ (and therefore $f_m$ vanishes on the boundaries of these caps), and (iii) $f_m$ restricted to the cap $S(y_j, \alpha_m)$, $j \in \{1,\dots,m\}$, looks exactly the same for each cap. Concerning (iii), we choose $f_m$ such that $f_m|_{S(y_j,\alpha_m)}$ is rotationally symmetric with respect to $y_j$ and therefore effectively a function of only one variable $t := x \cdot y_j$. We use the same function of one variable for the construction of $f_m$ for any $Q_m$ and just scale its argument accordingly. This is crucial to the proof. The fact that $f_m \in C^\infty(S^2)$ implies that $f_m \in H^s$ for any $s > 0$; so we can use the same function $f_m$ for the proof for all $s > 1$. From the construction, it is immediately clear that
\[ Q_m f_m = 0, \]
and the integral $I f_m$ can be easily computed and expressed in terms of orders of $m$. The difficult step in estimate (3.6) is to obtain a suitable upper bound of $\|f_m\|_s$ in terms of the orders of $m$. To obtain an upper bound for $\|f_m\|_s$ we initially consider only even $s > 0$ and make use of the fact that for the $C^\infty$-function $f_m$
\[ \|f_m\|_s = \Big( \int_{S^2} \big( (\tfrac14 - \Delta^*)^{s/2} f_m(x) \big)^2\, d\omega(x) \Big)^{1/2}, \tag{3.8} \]
where $\Delta^*$ is the Beltrami operator, the angular part of the Laplace operator. Since for even $s$ the operator $(\tfrac14 - \Delta^*)^{s/2}$ may be defined through differentiation in the classical sense, we can make use of the fact that $f_m$
is essentially a sum of $m$ functions, each of which has its support on one of the (non-overlapping) caps $S(y_j, \alpha_m)$, because the same property is inherited by $(\tfrac14 - \Delta^*)^{s/2} f_m$. An estimate of (3.8) yields an upper bound of $\|f_m\|_s$, which provides the desired orders of $m$. For $s > 0$ but not an even integer we can interpolate between the results for the even integer cases with the help of Hölder's inequality. For interpolation between the even integer cases it is convenient that the same function $f_m$ is used for all $s > 1$. We refer to [7] for the details of the proof. □
4
Numerical example
As an illustration of the good performance of interpolatory cubature we plot the cubature error for a 'test' function / which is only Lipschitz continuous, namely, (4.1)
/(x):=H^-3fl(x) where p is a polynomial in P 3 and given by p(x) := xf + 2x\ + \x\ + 3xia;2a;3 + lx\x3
+ xx + x2 + x3 + 1
and where ff(a
*f=JT
{»
for
for
#<x0-x
with R := 2/3 and x 0 := ( c o s ^ sin f,sin ^f sin | , c o s f ) . The function / is shown in the right picture in Figure 4.1 as a 3-dimensional graph on (0, n) x (0,2n). 3d — plot of the polynomial p/3
Figure 4.1 The polynomial p/3 (left) and the 'test' function f = p/3 - 3g (right).
Estimation of the Legendre coefficients of $g$ as a function of $t := x_0 \cdot x$ (see [3, Subsection 5.8.2]) shows that $g$ and hence $f$ is in $H^s$ when $s < 3/2$. However $f$ cannot be in $H^s$ with $s > 2$ as this would imply (from the Sobolev embedding theorem, see [3, Lemma 5.2.3]) that $f$ is in $C^1(S^2)$. Figure 4.2 plots the cubature error $|Q_{m(n)} f - I f|$ of $f$ as a function of $n$, where $Q_{m(n)}$ is the interpolatory cubature rule with respect to an extremal fundamental system of degree $n$. The apparent rate of convergence is approximately $O(n^{-2.4})$, somewhat better than that predicted by the upper bound, and therefore consistent with it. (Note that the lower bound is on the worst-case error, but not on the error for a particular function.)
(Plot: cubature error and the fitted curve $1.36\,n^{-2.37}$ against the degree of exactness $n$.)
Figure 4.2 Cubature error for (4.1) using interpolatory cubature with extremal fundamental system for degree n < 100.
Acknowledgment The authors acknowledge the support of the Australian Research Council under its Centres of Excellence Program.
References
[1] N.S. Bakhvalov, On approximate computation of integrals. (Russian) Vestnik MGU, Ser. Math. Mech. Astron. Phys. Chem., 4 (1959), 3-18.
[2] P.J. Davis and P. Rabinowitz, Methods of Numerical Integration. Academic Press, Orlando, 1984.
[3] W. Freeden, T. Gervens and M. Schreiner, Constructive Approximation on the Sphere with Applications to Geomathematics. Oxford University Press, Oxford, 1998.
[4] W. Habicht and B.L. van der Waerden, Lagerung von Punkten auf der Kugel. Math. Ann., 123 (1951), 223-234.
[5] K. Hesse and I.H. Sloan, Worst-case errors in a Sobolev space setting for cubature over the sphere S2. Bull. Austral. Math. Soc., 71 (2005), 81-105.
[6] K. Hesse and I.H. Sloan, Cubature over the sphere S2 in Sobolev spaces of arbitrary order. Applied Mathematics Report AMR04/27, The University of New South Wales, August 2004 (submitted).
[7] K. Hesse and I.H. Sloan, Optimal lower bounds for cubature error on the sphere S2. J. Complexity, 21 (2005), 790-803.
[8] C. Müller, Spherical Harmonics. Lecture Notes in Mathematics, 17, Springer-Verlag, Berlin, 1966.
[9] E. Novak, Deterministic and Stochastic Error Bounds in Numerical Analysis. Lecture Notes in Mathematics, 1349, Springer-Verlag, Berlin, Heidelberg, 1988.
[10] M. Reimer, Constructive Theory of Multivariate Functions. B.I. Wissenschaftsverlag, Mannheim, Wien, Zürich, 1990.
[11] M. Reimer, Spherical polynomial approximation: A survey. In: Advances in Multivariate Approximation, W. Haußmann, K. Jetter, M. Reimer, eds., Wiley, Berlin, 1999, 231-252.
[12] M. Riesz, Eine trigonometrische Interpolationsformel und einige Ungleichungen für Polynome. Jahresber. Deutsch. Math.-Verein., 23 (1914), 354-368.
[13] E.B. Saff and A.B.J. Kuijlaars, Distributing many points on a sphere. Math. Intelligencer, 19 (1997), 5-11.
[14] I.H. Sloan and R.S. Womersley, Extremal systems of points and numerical integration on the sphere. Adv. Comput. Math., 21 (2004), 107-125.
[15] A.H. Stroud, Approximate Calculation of Multiple Integrals. Prentice Hall, Englewood Cliffs, N.J., 1971.
[16] G. Szegő, Orthogonal Polynomials. American Mathematical Society Colloquium Publications, 23, American Mathematical Society, Providence, 4th edn., 1975.
[17] V.A. Yudin, Covering a sphere and extremal properties of orthogonal polynomials. Discrete Math. Appl., 5 (1995), 371-379.
A Survey of Multi-symplectic Runge-Kutta Type Methods for Hamiltonian Partial Differential Equations*
Jialin Hong
State Key Laboratory of Scientific and Engineering Computing, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and System Sciences (AMSS), Chinese Academy of Sciences (CAS), P.O.Box 2719, Beijing 100080, China.
Abstract Mathematicians and scientists have paid much more attention to structure-preserving algorithms for dynamical systems since Professor Kang Feng proposed and systematically developed the so-called symplectic algorithms for Hamiltonian systems in the late 1980s. Multi-symplectic numerical methods for infinite-dimensional Hamiltonian systems, such as Schrödinger equations, play an important role in scientific and engineering computing. This paper is a survey of multi-symplectic Runge-Kutta type methods for Hamiltonian partial differential equations. We summarize the results on the multi-symplecticity of Runge-Kutta type methods, including partitioned Runge-Kutta methods and Nyström methods, for Hamiltonian partial differential equations, and we give a novel result on the multi-symplecticity of partitioned Runge-Kutta methods for Schrödinger equations. We present some applications of these methods to Schrödinger equations and Dirac equations in quantum physics, and investigate the conservation properties of energy and momentum for these methods, in particular the corresponding energy and momentum analysis.
1
Introduction
*This work is supported by the Director Innovation Foundation of ICMSEC and AMSS, the Foundation of CAS, the NNSFC (No. 19971089 and 10371128) and the Special Funds for Major State Basic Research Projects of China (G1999032804).
In [25, 32, 38], the authors independently established the condition of symplecticity of Runge-Kutta methods, which plays an important role in the
study and the applications of structure-preserving algorithms for Hamiltonian systems, and even for general dynamical systems. The condition is closely related to the algebraic stability of Runge-Kutta methods for ordinary differential equations in [5, 9, 14, 22, 34]. The symplectic condition of partitioned Runge-Kutta methods for Hamiltonian systems was given in [33, 36, 39], which shows that it is possible to construct explicit symplectic methods for some special Hamiltonian systems (e.g., separable Hamiltonian systems). A remarkable theoretical analysis of symplectic Runge-Kutta methods and symplectic partitioned Runge-Kutta methods was presented in [14, 34] and references therein. More information on the sufficient and necessary conditions of symplecticity of Runge-Kutta type methods for Hamiltonian ODEs can be found in [14, 34]. S. Reich in [31] showed that concatenating Gauss-Legendre methods in the temporal and spatial directions leads to a multi-symplectic integrator for scalar wave equations (and also for nonlinear Schrödinger equations). This motivates us to consider the condition of multi-symplecticity of Runge-Kutta type discretizations for general Hamiltonian partial differential equations. In [19], the authors discussed and established the condition of multi-symplecticity of Runge-Kutta type methods for Hamiltonian partial differential equations. The paper [16] gave the condition of multi-symplecticity of Runge-Kutta-Nyström methods for the nonlinear Schrödinger equations. The paper [20] investigated the energy and momentum analysis of multi-symplectic Runge-Kutta methods for one-dimensional nonlinear Dirac equations. More details and information on multi-symplectic methods for Hamiltonian PDEs can be found in [4, 6, 7, 8, 15, 16, 17, 18, 19, 20, 21, 23, 24, 27, 29, 30, 31, 37, 40, 41]. This paper is a survey of the multi-symplectic Runge-Kutta type methods. We summarize the conditions of multi-symplecticity for Runge-Kutta methods and partitioned Runge-Kutta methods for general Hamiltonian partial differential equations, and for Runge-Kutta-Nyström methods for Schrödinger equations. We not only summarize results from papers written with collaborators, but also present some novel results on the topic. For the convenience of readers, we give some technical details of the proofs of the main results. We also investigate the classical conservation properties of multi-symplectic Runge-Kutta type methods, which include charge, energy and momentum conservation laws. This paper is organized as follows. In the rest of this section, we recall some basic facts on symplectic conditions of Runge-Kutta type methods for finite dimensional Hamiltonian systems, and then give the definition of multi-symplecticity for Hamiltonian partial differential equations, and the conservation properties of local energy and momentum. In section 2, we restate some results on the multi-symplecticity of Runge-Kutta methods and partitioned Runge-Kutta methods applied to general Hamiltonian partial differential equations. In section 3, we present the condition
of multi-symplecticity of partitioned Runge-Kutta methods and restate the multi-symplecticity of Runge-Kutta-Nyström methods applied to nonlinear Schrödinger equations. In section 4, we discuss the energy and momentum analysis for multi-symplectic Runge-Kutta methods for one-dimensional nonlinear Dirac equations in quantum physics. We end the survey with the conclusion in section 5.
1.1
A brief review on symplecticity of Runge-Kutta type methods for the finite dimensional Hamiltonian systems
Consider the Hamiltonian system
\[ \dot p_k = -\frac{\partial H}{\partial q_k}(p,q), \qquad \dot q_k = \frac{\partial H}{\partial p_k}(p,q), \qquad k = 1,2,\dots,d, \tag{1.1} \]
where $q = (q_1,q_2,\dots,q_d)^T$ (the generalized coordinates), $p = (p_1,p_2,\dots,p_d)^T$ (the conjugate momenta), $d$ is a positive integer (the number of degrees of freedom), $\cdot^T$ denotes the transpose of vectors (or matrices) and $H$ is the Hamiltonian of the system. The Poincaré (1899) theorem states that if $H(p,q)$ is a twice continuously differentiable function on $U \subset \mathbb{R}^{2d}$, then for each fixed $t$, the flow $\varphi_t$ is a symplectic transformation wherever it is defined. For long-time numerical computation of Hamiltonian systems, it is a crucial idea, due to K. Feng (see [11, 12]), that numerical schemes should preserve as much as possible the characteristic properties and inner symmetries of the original system. A numerical method for (1.1) is called symplectic if it preserves the symplecticity of the flow of (1.1) in the sense of the corresponding numerical discretization. Many symplectic methods and ways of constructing them have been given. Here we focus on the condition of symplecticity of Runge-Kutta type methods when applied to (1.1). As is well known, an $s$-stage Runge-Kutta method for the solution of the differential equation
\[ \dot y = f(t,y), \qquad y(t_0) = y_0 \]
is given by
\[ k_i = f\Big(t_0 + c_i h,\ y_0 + h \sum_{j=1}^{s} a_{ij} k_j\Big), \quad i = 1,\dots,s, \]
\[ y_1 = y_0 + h \sum_{i=1}^{s} b_i k_i, \]
where $b_i$, $a_{ij}$ ($i,j = 1,\dots,s$) are real numbers. For more information, see [5, 14, 34].
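The two formulas above translate almost literally into code when the method is explicit, that is, when $a_{ij} = 0$ for $j \ge i$. The following minimal sketch makes that assumption (implicit methods such as Gauss-Legendre would need a nonlinear solve for the stages) and uses the classical fourth-order tableau as an example; the function name `rk_step` is hypothetical.

```python
import math

def rk_step(f, t0, y0, h, a, b, c):
    """One step of an explicit s-stage Runge-Kutta method.

    Assumes a is strictly lower triangular, so stage i only needs k_1..k_{i-1}.
    """
    s = len(b)
    k = []
    for i in range(s):
        yi = y0 + h * sum(a[i][j] * k[j] for j in range(i))
        k.append(f(t0 + c[i] * h, yi))
    return y0 + h * sum(b[i] * k[i] for i in range(s))

# Classical fourth-order Runge-Kutta coefficients.
a = [[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]]
b = [1 / 6, 1 / 3, 1 / 3, 1 / 6]
c = [0, 0.5, 0.5, 1]

# Test problem y' = y, y(0) = 1, whose exact solution is exp(t).
y1 = rk_step(lambda t, y: y, 0.0, 1.0, 0.1, a, b, c)
```

For this test problem one step with $h = 0.1$ agrees with $e^{0.1}$ to about seven digits.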
Assume that $b_i$, $a_{ij}$ and $\hat b_i$, $\hat a_{ij}$ are the coefficients of two Runge-Kutta methods, respectively. A partitioned Runge-Kutta method for the solution of the differential equations
\[ \dot y = f(y,z), \qquad \dot z = g(y,z) \]
is given by
\[ k_i = f\Big(y_0 + h \sum_{j=1}^{s} a_{ij} k_j,\ z_0 + h \sum_{j=1}^{s} \hat a_{ij} l_j\Big), \]
\[ l_i = g\Big(y_0 + h \sum_{j=1}^{s} a_{ij} k_j,\ z_0 + h \sum_{j=1}^{s} \hat a_{ij} l_j\Big), \quad i = 1,\dots,s, \]
\[ y_1 = y_0 + h \sum_{i=1}^{s} b_i k_i, \qquad z_1 = z_0 + h \sum_{i=1}^{s} \hat b_i l_i. \]
For more details of the method, see [14, 34] and references therein. A Nyström method for the solution of the second-order differential equation $\ddot y = g(t, y, \dot y)$ is given by
\[ k_i = g\Big(t_0 + c_i h,\ y_0 + c_i h \dot y_0 + h^2 \sum_{j=1}^{s} \bar a_{ij} k_j,\ \dot y_0 + h \sum_{j=1}^{s} a_{ij} k_j\Big), \]
\[ y_1 = y_0 + h \dot y_0 + h^2 \sum_{i=1}^{s} \bar b_i k_i, \qquad \dot y_1 = \dot y_0 + h \sum_{i=1}^{s} b_i k_i, \]
where $c_i$, $b_i$, $a_{ij}$ and $\bar b_i$, $\bar a_{ij}$ are real numbers. For more information, see [5, 14, 34]. In [25, 32, 38], the authors independently established the following result:
Proposition 1.1. If the coefficients of a Runge-Kutta method satisfy
\[ b_i a_{ij} + b_j a_{ji} - b_i b_j = 0 \quad\text{for all } i,j = 1,\dots,s, \tag{1.2} \]
then it is symplectic.
This condition coincides with the one for the preservation of quadratic invariants of differential equations under Runge-Kutta discretization, and it is also closely related to the condition of algebraic stability of Runge-Kutta methods (see [5, 9, 14, 22, 34]). The condition (1.2) gives a useful way to construct symplectic Runge-Kutta methods for Hamiltonian systems [35]. The condition of symplecticity of partitioned Runge-Kutta methods was obtained independently in [33, 36, 39].
Proposition 1.2. If the coefficients of a partitioned Runge-Kutta method satisfy
\[ b_i \hat a_{ij} + \hat b_j a_{ji} = b_i \hat b_j \quad\text{for } i,j = 1,\dots,s, \tag{1.3} \]
\[ b_i = \hat b_i \quad\text{for } i = 1,\dots,s, \tag{1.4} \]
then it is symplectic. In particular, the condition (1.3) is sufficient for the symplecticity of the numerical flow if the Hamiltonian is of the form $H(p,q) = T(p) + U(q)$, i.e., it is separable. In this case, it is possible to obtain some explicit symplectic schemes (see [14, 36]). Some related results on variational integrators can be found in [26].
Proposition 1.3. If the coefficients of the Nyström method, applied to the differential equation $\ddot y = g(y)$,
\[ l_i = g\Big(y_0 + c_i h \dot y_0 + h^2 \sum_{j=1}^{s} a_{ij} l_j\Big), \]
\[ y_1 = y_0 + h \dot y_0 + h^2 \sum_{i=1}^{s} \beta_i l_i, \qquad \dot y_1 = \dot y_0 + h \sum_{i=1}^{s} b_i l_i, \]
satisfy
\[ \beta_i = b_i (1 - c_i) \quad\text{for } i = 1,\dots,s, \tag{1.5} \]
\[ b_i (\beta_j - a_{ij}) = b_j (\beta_i - a_{ji}) \quad\text{for } i,j = 1,\dots,s, \tag{1.6} \]
then it is symplectic. This result was originally obtained by Suris (see [39]). The above results pave the way for the investigation of multi-symplectic Runge-Kutta type methods for Hamiltonian partial differential equations. For more details of symplectic methods for Hamiltonian systems, see [11, 12, 14, 34] and references therein.
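The algebraic conditions of Propositions 1.1 and 1.2 are easy to test numerically for a given tableau. The sketch below evaluates the largest violation of (1.2) for the 2-stage Gauss-Legendre method and for explicit Euler, and of the partitioned conditions, written here in the standard form $b_i \hat a_{ij} + \hat b_j a_{ji} = b_i \hat b_j$, $b_i = \hat b_i$ of [14], for the symplectic Euler pair (implicit Euler combined with explicit Euler). The function names and the hat notation for the second set of coefficients are assumptions of this example.

```python
import math

def rk_defect(a, b):
    """Largest |b_i a_ij + b_j a_ji - b_i b_j|; zero means condition (1.2) holds."""
    s = len(b)
    return max(abs(b[i] * a[i][j] + b[j] * a[j][i] - b[i] * b[j])
               for i in range(s) for j in range(s))

def prk_defect(a, b, a_hat, b_hat):
    """Largest violation of the two partitioned conditions (1.3) and (1.4)."""
    s = len(b)
    d13 = max(abs(b[i] * a_hat[i][j] + b_hat[j] * a[j][i] - b[i] * b_hat[j])
              for i in range(s) for j in range(s))
    d14 = max(abs(b[i] - b_hat[i]) for i in range(s))
    return max(d13, d14)

# 2-stage Gauss-Legendre method (order 4).
r3 = math.sqrt(3.0)
gauss2_a = [[0.25, 0.25 - r3 / 6], [0.25 + r3 / 6, 0.25]]
gauss2_b = [0.5, 0.5]
gauss2_defect = rk_defect(gauss2_a, gauss2_b)

# Explicit Euler: a = 0, b = 1, which violates (1.2).
euler_defect = rk_defect([[0.0]], [1.0])

# Symplectic Euler as a partitioned pair: implicit Euler with explicit Euler.
symplectic_euler_defect = prk_defect([[1.0]], [1.0], [[0.0]], [1.0])
```

The Gauss-Legendre defect vanishes up to rounding, the explicit Euler defect equals 1, and the symplectic Euler pair satisfies both partitioned conditions exactly.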
1.2
Multi-symplecticity of Hamiltonian partial differential equations
Now we turn our attention to Hamiltonian partial differential equations. Consider the Hamiltonian partial differential equation (H-PDE)
\[ M z_t + K z_x = \nabla_z S(z), \qquad z \in \mathbb{R}^n,\ n \ge 3, \tag{1.7} \]
where $M$, $K$ are $n \times n$ skew-symmetric matrices, $S : \mathbb{R}^n \to \mathbb{R}$ is a smooth function (at least twice continuously differentiable), and the gradient of $S$ is taken with respect to an inner product on $\mathbb{R}^n$ which will be denoted by $(\cdot,\cdot)$ throughout the paper. Some important partial differential equations, such as Schrödinger equations, Dirac equations in quantum physics and the wave equation, can be written in the form (1.7). Equation (1.7) has an intrinsic conservation law as follows:
\[ \frac{\partial \omega}{\partial t} + \frac{\partial \kappa}{\partial x} = 0, \tag{1.8} \]
the so-called multi-symplectic conservation law (see [6, 7, 15, 16, 17, 18, 19, 20, 23, 24, 29, 30]), where the differential 2-forms are
\[ \omega(x,t) = U(x,t)^T M^T V(x,t), \qquad \kappa(x,t) = U(x,t)^T K^T V(x,t), \]
and $U(x,t)$ and $V(x,t)$ are solutions of the variational equation
\[ M\,dz_t + K\,dz_x = D_{zz}S(z)\,dz \tag{1.9} \]
corresponding to (1.7). $\omega$ and $\kappa$ are also called pre-symplectic forms. Equivalently, we have
\[ \omega = \tfrac12\, dz \wedge M\,dz, \qquad \kappa = \tfrac12\, dz \wedge K\,dz. \tag{1.10} \]
The equation (1.7) has a local energy conservation law (ECL)
\[ \frac{\partial E}{\partial t} + \frac{\partial F}{\partial x} = 0 \tag{1.11} \]
with energy density
\[ E = S(z) - \tfrac12 z^T K z_x \]
and energy flux
\[ F = \tfrac12 z^T K z_t. \]
It also has a local momentum conservation law (MCL)
\[ \frac{\partial I}{\partial t} + \frac{\partial G}{\partial x} = 0 \tag{1.12} \]
with momentum density
\[ I = \tfrac12 z^T M z_x \]
and momentum flux
\[ G = S(z) - \tfrac12 z^T M z_t. \]
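The energy law (1.11) can be checked directly from (1.7) and the skew-symmetry of $M$ and $K$. The short computation below is a sketch added for the reader's convenience; it uses only the definitions of $E$ and $F$ given above and assumes that the gradient is taken with respect to the Euclidean inner product.

```latex
\begin{aligned}
\partial_t E
  &= z_t^T \nabla_z S(z) - \tfrac12 z_t^T K z_x - \tfrac12 z^T K z_{xt} \\
  &= z_t^T (M z_t + K z_x) - \tfrac12 z_t^T K z_x - \tfrac12 z^T K z_{xt}
     && \text{by (1.7)} \\
  &= \tfrac12 z_t^T K z_x - \tfrac12 z^T K z_{xt}
     && \text{since } z_t^T M z_t = 0, \\
\partial_x F
  &= \tfrac12 z_x^T K z_t + \tfrac12 z^T K z_{xt}
   = -\tfrac12 z_t^T K z_x + \tfrac12 z^T K z_{xt}
     && \text{since } K^T = -K, \\
\partial_t E + \partial_x F &= 0 .
\end{aligned}
```

The momentum law (1.12) follows in the same way, with the roles of $M$ and $K$ and of $t$ and $x$ interchanged.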
The local energy conservation law and local momentum conservation law associated with the multi-symplectic conservation law (1.8) were also investigated in [7, 23, 20]. For the purpose of numerical computation, following [7] (also [4, 8, 15, 16, 17, 18, 19, 20, 23, 24, 29, 30, 31]), we introduce a uniform grid $(x_j, t_k) \in \mathbb{R}^2$ in the $(x,t)$ plane with mesh-length $\Delta t$ in the temporal direction and mesh-length $\Delta x$ in the spatial direction. The value of $z(x,t)$ at the mesh point $(x_j, t_k)$ is denoted by $z_{j,k}$. A numerical discretization of (1.7), (1.8) and (1.9) can be written, respectively, as
\[ M \partial_t^{j,k} z_{j,k} + K \partial_x^{j,k} z_{j,k} = (\nabla_z S_{j,k})_{j,k}, \tag{1.13} \]
\[ \partial_t^{j,k} \omega_{j,k} + \partial_x^{j,k} \kappa_{j,k} = 0, \tag{1.14} \]
\[ M \partial_t^{j,k} (dz)_{j,k} + K \partial_x^{j,k} (dz)_{j,k} = (D_{zz} S_{j,k})\,(dz)_{j,k}, \tag{1.15} \]
where $S_{j,k} = S(z_{j,k}, x_j, t_k)$, $\omega_{j,k} = (M U_{j,k}, V_{j,k})$, $\kappa_{j,k} = (K U_{j,k}, V_{j,k})$, $U_{j,k}$ and $V_{j,k}$ are solutions of (1.15), and $\partial_t^{j,k}$, $\partial_x^{j,k}$ are discretizations of the derivatives $\partial_t$ and $\partial_x$, respectively. The following definition comes from [7]:
Definition. The numerical scheme (1.13) is called a multi-symplectic integrator of the system (1.7) if (1.14) is a discrete conservation law of (1.13).
This concept can be extended easily to the cases of variable coefficients or higher spatial dimensions (see [15, 21]). Based on the definition, a number of contributions to multi-symplectic methods have been made. Multi-symplectic methods for Schrödinger equations, Dirac equations, KdV equations and the wave equations have been constructed (see [4, 8, 15, 16, 17, 18, 19, 20, 21, 23, 24, 27, 29, 30, 31, 41] and references therein).
2
Multi-symplecticity of Runge-Kutta type methods for the general Hamiltonian partial differential equations
In this section, we summarize results on the multi-symplecticity of Runge-Kutta type methods for general Hamiltonian partial differential equations. These results were given in [19]. We consider (1.7). To simplify notation, we take the starting point $(x_0, t_0) = (0,0)$ in all numerical methods proposed in this paper. We assume that all numerical methods presented in this paper are solvable.
2.1
Runge-Kutta methods
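The Runge-Kutta method for equation (1.7) is the following:
\[ Z_{mk} = z_m^0 + \Delta t \sum_{j=1}^{r} a_{kj}\, \partial_t Z_{mj}, \tag{2.1} \]
\[ z_m^1 = z_m^0 + \Delta t \sum_{k=1}^{r} b_k\, \partial_t Z_{mk}, \tag{2.2} \]
\[ Z_{mk} = z_0^k + \Delta x \sum_{n=1}^{s} \tilde a_{mn}\, \partial_x Z_{nk}, \tag{2.3} \]
\[ z_1^k = z_0^k + \Delta x \sum_{m=1}^{s} \tilde b_m\, \partial_x Z_{mk}, \tag{2.4} \]
\[ M \partial_t Z_{mk} + K \partial_x Z_{mk} = \nabla_z S(Z_{mk}), \tag{2.5} \]
where the following notations are used: $Z_{mk} \approx z(c_m \Delta x, d_k \Delta t)$, $z_m^0 \approx z(c_m \Delta x, 0)$, $\partial_t Z_{mk} \approx \partial_t z(c_m \Delta x, d_k \Delta t)$, $\partial_x Z_{mk} \approx \partial_x z(c_m \Delta x, d_k \Delta t)$, $z_m^1 \approx z(c_m \Delta x, \Delta t)$, $z_0^k \approx z(0, d_k \Delta t)$, $z_1^k \approx z(\Delta x, d_k \Delta t)$, and
\[ c_m = \sum_{n=1}^{s} \tilde a_{mn}, \qquad d_k = \sum_{j=1}^{r} a_{kj}. \]
The variational equations corresponding to (2.1)-(2.5) are, respectively,
\[ dZ_{mk} = dz_m^0 + \Delta t \sum_{j=1}^{r} a_{kj}\, d(\partial_t Z_{mj}), \tag{2.6} \]
\[ dz_m^1 = dz_m^0 + \Delta t \sum_{k=1}^{r} b_k\, d(\partial_t Z_{mk}), \tag{2.7} \]
\[ dZ_{mk} = dz_0^k + \Delta x \sum_{n=1}^{s} \tilde a_{mn}\, d(\partial_x Z_{nk}), \tag{2.8} \]
\[ dz_1^k = dz_0^k + \Delta x \sum_{m=1}^{s} \tilde b_m\, d(\partial_x Z_{mk}), \tag{2.9} \]
\[ M\, d(\partial_t Z_{mk}) + K\, d(\partial_x Z_{mk}) = D_{zz}S(Z_{mk})\, dZ_{mk}, \tag{2.10} \]
where $D_{zz}S(Z_{mk})$ is a symmetric matrix.
Theorem 2.1.[19] If in the method (2.1)-(2.5),
\[ \tilde b_m \tilde b_n - \tilde b_m \tilde a_{mn} - \tilde b_n \tilde a_{nm} = 0, \tag{2.11} \]
\[ b_k b_j - b_k a_{kj} - b_j a_{jk} = 0 \tag{2.12} \]
hold for $k,j = 1,2,\dots,r$ and $m,n = 1,2,\dots,s$, then the method (2.1)-(2.5) is multi-symplectic with the conservation law
\[ \Delta x \sum_{m=1}^{s} \tilde b_m \Big( (d\bar z_m^1)^T M^T (dz_m^1) - (d\bar z_m^0)^T M^T (dz_m^0) \Big) + \Delta t \sum_{k=1}^{r} b_k \Big( (d\bar z_1^k)^T K^T (dz_1^k) - (d\bar z_0^k)^T K^T (dz_0^k) \Big) = 0, \tag{2.13} \]
where
\[ \{dz_m^0, dz_m^1, dz_0^k, dz_1^k, dZ_{mk}, d(\partial_x Z_{mk}), d(\partial_t Z_{mk})\} \quad\text{and}\quad \{d\bar z_m^0, d\bar z_m^1, d\bar z_0^k, d\bar z_1^k, d\bar Z_{mk}, d(\partial_x \bar Z_{mk}), d(\partial_t \bar Z_{mk})\} \]
are solutions of the variational equations (2.6)-(2.10).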
The outline of the proof. Let
\[ \{dz_m^0, dz_m^1, dz_0^k, dz_1^k, dZ_{mk}, d(\partial_x Z_{mk}), d(\partial_t Z_{mk})\} \quad\text{and}\quad \{d\bar z_m^0, d\bar z_m^1, d\bar z_0^k, d\bar z_1^k, d\bar Z_{mk}, d(\partial_x \bar Z_{mk}), d(\partial_t \bar Z_{mk})\} \]
be solutions of the variational equations (2.6)-(2.10). It follows from (2.1)-(2.5) and (2.6)-(2.10) that
\[ (d\bar z_m^1)^T M^T (dz_m^1) - (d\bar z_m^0)^T M^T (dz_m^0) = \Delta t \sum_{k=1}^{r} b_k \Big( d(\partial_t \bar Z_{mk})^T M^T (dZ_{mk}) + (d\bar Z_{mk})^T M^T d(\partial_t Z_{mk}) \Big) \tag{2.14} \]
and
\[ (d\bar z_1^k)^T K^T (dz_1^k) - (d\bar z_0^k)^T K^T (dz_0^k) = \Delta x \sum_{m=1}^{s} \tilde b_m \Big( d(\partial_x \bar Z_{mk})^T K^T (dZ_{mk}) + (d\bar Z_{mk})^T K^T d(\partial_x Z_{mk}) \Big). \tag{2.15} \]
Using (2.10) and the symmetry of the matrix $D_{zz}S(Z_{mk})$ produces
\[ d(\partial_t \bar Z_{mk})^T M^T (dZ_{mk}) + (d\bar Z_{mk})^T M^T d(\partial_t Z_{mk}) + d(\partial_x \bar Z_{mk})^T K^T (dZ_{mk}) + (d\bar Z_{mk})^T K^T d(\partial_x Z_{mk}) = 0. \tag{2.16} \]
Combining (2.14), (2.15) and (2.16), the proof of the theorem is completed.
Remark 2.1. This theorem can be extended to the Hamiltonian partial differential equation with varying coefficients
\[ M(x) z_t + K(t) z_x = \nabla_z S(z, x, t) \tag{2.17} \]
and to the case of higher spatial dimension
\[ M(x) z_t + \sum_{\tau=1}^{\gamma} K_\tau(t)\, z_{x_\tau} = \nabla_z S(z, x_1, \dots, x_\gamma, t), \]
where $M(x)$, $K(t)$ and $K_\tau(t)$ ($\tau = 1,\dots,\gamma$) are skew-symmetric matrices and smooth in $x$ and $t$, respectively, and $S(z,x,t)$ is a smooth real function. The following corollary is a natural extension of the result in [31]:
Corollary 2.1.[19] If in the method (2.1)-(2.5), the method applied to both the temporal direction and the spatial direction is of Gauss-Legendre type, then the method (2.1)-(2.5) is a multi-symplectic integrator.
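The identity behind (2.14) is purely algebraic: with $W_k$ standing for $d(\partial_t Z_{mk})$, the difference of its two sides equals $(\Delta t)^2 \sum_{k,j} (b_k b_j - b_k a_{kj} - b_j a_{jk})\, \bar W_j^T M^T W_k$, which vanishes under the coefficient condition of Theorem 2.1. The sketch below checks this in the temporal direction for the 2-stage Gauss-Legendre coefficients and shows that it fails for explicit Euler. The stage increments are arbitrary fixed vectors rather than solutions of a PDE, and the function names and test data are assumptions of this example.

```python
import math

def bilinear(u, v, m):
    """Return u^T M^T v = sum over i, j of u_i * M[j][i] * v_j."""
    n = len(u)
    return sum(u[i] * m[j][i] * v[j] for i in range(n) for j in range(n))

def combine(base, dt, coeffs, vectors):
    """Return base + dt * sum_k coeffs[k] * vectors[k], componentwise."""
    return [base[i] + dt * sum(c * w[i] for c, w in zip(coeffs, vectors))
            for i in range(len(base))]

def identity_defect(a, b, dt, m, z0, zb0, w, wb):
    """|left side minus right side| of the (2.14)-type identity for given data."""
    r = len(b)
    stages = [combine(z0, dt, a[k], w[:r]) for k in range(r)]      # dZ_k
    stages_b = [combine(zb0, dt, a[k], wb[:r]) for k in range(r)]  # barred dZ_k
    z1 = combine(z0, dt, b, w[:r])
    zb1 = combine(zb0, dt, b, wb[:r])
    lhs = bilinear(zb1, z1, m) - bilinear(zb0, z0, m)
    rhs = dt * sum(b[k] * (bilinear(wb[k], stages[k], m)
                           + bilinear(stages_b[k], w[k], m)) for k in range(r))
    return abs(lhs - rhs)

# Fixed illustrative data: a 2x2 skew-symmetric M and arbitrary vectors.
M = [[0.0, 1.0], [-1.0, 0.0]]
z0, zb0 = [1.0, 2.0], [-1.0, 0.5]
w = [[0.3, -0.7], [1.1, 0.2]]
wb = [[0.5, 0.9], [0.6, -1.3]]

r3 = math.sqrt(3.0)
gauss2_a = [[0.25, 0.25 - r3 / 6], [0.25 + r3 / 6, 0.25]]
gauss2_b = [0.5, 0.5]

gauss_defect = identity_defect(gauss2_a, gauss2_b, 0.1, M, z0, zb0, w, wb)
euler_defect = identity_defect([[0.0]], [1.0], 0.1, M, z0, zb0, w, wb)
```

For the Gauss-Legendre coefficients the defect is zero up to rounding, in line with Corollary 2.1; for explicit Euler it equals $(\Delta t)^2\,|\bar W_1^T M^T W_1| = 0.0062$ for the data above.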
2.2
Partitioned Runge-Kutta methods
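We consider the blocked Hamiltonian partial differential equation
\[ \begin{pmatrix} M_1 & M_0 \\ -M_0^T & M_2 \end{pmatrix} \begin{pmatrix} p_t \\ q_t \end{pmatrix} + \begin{pmatrix} K_1 & K_0 \\ -K_0^T & K_2 \end{pmatrix} \begin{pmatrix} p_x \\ q_x \end{pmatrix} = \begin{pmatrix} \nabla_p S(p,q) \\ \nabla_q S(p,q) \end{pmatrix}, \tag{2.18} \]
where $M_\tau$, $K_\tau$ ($\tau = 1,2$) are $a \times a$ skew-symmetric matrices, $M_0$, $K_0$ are $a \times a$ matrices and $S(p,q)$ is a smooth real function in $p = (p_1,p_2,\dots,p_a)^T$ and $q = (q_1,q_2,\dots,q_a)^T$. The corresponding multi-symplectic conservation law is
\[ \frac{\partial \omega(U,V)}{\partial t} + \frac{\partial \kappa(U,V)}{\partial x} = 0, \tag{2.19} \]
where $\omega(U,V) = U^T M^T V$ and $\kappa(U,V) = U^T K^T V$, with
\[ M = \begin{pmatrix} M_1 & M_0 \\ -M_0^T & M_2 \end{pmatrix}, \qquad K = \begin{pmatrix} K_1 & K_0 \\ -K_0^T & K_2 \end{pmatrix}, \]
and $U(x,t)$ and $V(x,t)$ are solutions of the variational equation
\[ M\,dz_t + K\,dz_x = D_{zz}S(z)\,dz \tag{2.20} \]
and $z = (p_1,p_2,\dots,p_a,q_1,q_2,\dots,q_a)^T$. Now we apply the partitioned Runge-Kutta method to equation (2.18).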
and z = (pi,p2i ••• jPa, 9i)92, ••• ,qa)T • Now we apply the partitioned Runge-Kutta method to equation (2.18). -* mfc =
PS, + A*EJ=l0i?ft^mi,
= 9m+AiE;=i4i)^m,-, 1) P m == pSl + At££=1&< $Pmfc,
tymk
9m == -< mfc
=
9m + AiELi4 2 ) ^Q m f c , P o ^" A s ; 2 ^ n _ i
Omn<JxPnk, a
Wrote = 9o "I" ^ ^ 2—in=\
Pi =
Po +
^J2 m=1b^dxpmk,
9i = 9o "1" Aa^2_(m=l f
, /
#1
mnOxQnk
s
TO
OxQmki
Mi M 0 \ /a 4 p m f c \ (K-M^M2J \dtQmk) \(d P \ [VpS(Pmk, tymk) \ ^0 x mk J \dxQmk) \VqS{Pmk, ^Cmk) J
A Survey of Multi-symplectic Runge-Kutta • • •
81
where we make use of notations Pm ~ P(CmA>X, 0),
pm ~ P{Cm&X, At),
p£ « p ( 0 , 4 A t ) , 9m ~ q(cmAx,Q), 9o ~ 9 ( 0 . 4 A t ) , ^mfc ~ p(c m Aa;, dfeAt),
p\ » p(Ax, 4 At), qm « g(cmAa;, At), q\ » g(Ax, d fc At), Q mfc « q(cmAx, dkAt),
dtPmk « §2f(cmAa;, 4 At),
9 x P mfc « — ( c m A : r , 4 A t ) ,
dtQmk « § f ( c m A x , 4 A i ) ,
ax<5TOfc « —
(cmAx,dkAt)
under the assumption that
£ aw = £ a® = cm,
£ «SJ = E 4? = 4-
n=l
j'=l
n=l
(2.2i)
j=l
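We denote this method by (PRK). The variational equations of this method are
\[ dP_{mk} = dp_m^0 + \Delta t \sum_{j=1}^{r} a_{kj}^{(1)}\, d\partial_t P_{mj}, \tag{2.22} \]
\[ dQ_{mk} = dq_m^0 + \Delta t \sum_{j=1}^{r} a_{kj}^{(2)}\, d\partial_t Q_{mj}, \tag{2.23} \]
\[ dp_m^1 = dp_m^0 + \Delta t \sum_{k=1}^{r} b_k^{(1)}\, d\partial_t P_{mk}, \tag{2.24} \]
\[ dq_m^1 = dq_m^0 + \Delta t \sum_{k=1}^{r} b_k^{(2)}\, d\partial_t Q_{mk}, \tag{2.25} \]
\[ dP_{mk} = dp_0^k + \Delta x \sum_{n=1}^{s} \tilde a_{mn}^{(1)}\, d\partial_x P_{nk}, \tag{2.26} \]
\[ dQ_{mk} = dq_0^k + \Delta x \sum_{n=1}^{s} \tilde a_{mn}^{(2)}\, d\partial_x Q_{nk}, \tag{2.27} \]
\[ dp_1^k = dp_0^k + \Delta x \sum_{m=1}^{s} \tilde b_m^{(1)}\, d\partial_x P_{mk}, \tag{2.28} \]
\[ dq_1^k = dq_0^k + \Delta x \sum_{m=1}^{s} \tilde b_m^{(2)}\, d\partial_x Q_{mk}, \tag{2.29} \]
\[ M\, d(\partial_t Z_{mk}) + K\, d(\partial_x Z_{mk}) = A_{mk}\, d(Z_{mk}), \tag{2.30} \]
where $M$ and $K$ are the block matrices of (2.18),
\[ d(Z_{mk}) = \begin{pmatrix} dP_{mk} \\ dQ_{mk} \end{pmatrix}, \qquad d(\partial_t Z_{mk}) = \begin{pmatrix} d\partial_t P_{mk} \\ d\partial_t Q_{mk} \end{pmatrix}, \qquad d(\partial_x Z_{mk}) = \begin{pmatrix} d\partial_x P_{mk} \\ d\partial_x Q_{mk} \end{pmatrix}, \]
\[ A_{mk} = \begin{pmatrix} D_{pp}S(P_{mk},Q_{mk}) & D_{pq}S(P_{mk},Q_{mk}) \\ D_{qp}S(P_{mk},Q_{mk}) & D_{qq}S(P_{mk},Q_{mk}) \end{pmatrix}. \]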
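Obviously, $A_{mk}$ is a symmetric matrix. Now we let
$$\{dp_m^0, dp_m^1, dp_0^k, dp_1^k, dP_{mk}, d\partial_tP_{mk}, d\partial_xP_{mk}, dq_m^0, dq_m^1, dq_0^k, dq_1^k, dQ_{mk}, d\partial_tQ_{mk}, d\partial_xQ_{mk}\}$$
and
$$\{\overline{dp}_m^0, \overline{dp}_m^1, \overline{dp}_0^k, \overline{dp}_1^k, \overline{dP}_{mk}, \overline{d\partial_tP}_{mk}, \overline{d\partial_xP}_{mk}, \overline{dq}_m^0, \overline{dq}_m^1, \overline{dq}_0^k, \overline{dq}_1^k, \overline{dQ}_{mk}, \overline{d\partial_tQ}_{mk}, \overline{d\partial_xQ}_{mk}\}$$
be two solutions of the variational equations (2.22)-(2.30), and
$$\delta_t\omega_m = \big((\overline{dp}_m^1)^T, (\overline{dq}_m^1)^T\big) M^T \begin{pmatrix} dp_m^1 \\ dq_m^1 \end{pmatrix} - \big((\overline{dp}_m^0)^T, (\overline{dq}_m^0)^T\big) M^T \begin{pmatrix} dp_m^0 \\ dq_m^0 \end{pmatrix},$$
$$\delta_x\kappa_k = \big((\overline{dp}_1^k)^T, (\overline{dq}_1^k)^T\big) K^T \begin{pmatrix} dp_1^k \\ dq_1^k \end{pmatrix} - \big((\overline{dp}_0^k)^T, (\overline{dq}_0^k)^T\big) K^T \begin{pmatrix} dp_0^k \\ dq_0^k \end{pmatrix}.$$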
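By a straightforward calculation, we have
$$\begin{aligned}
\delta_t\omega_m ={}& \Delta t \sum_{k=1}^{r}\Big[ \big(\overline{dP}_{mk}^T, \overline{dQ}_{mk}^T\big) M^T \begin{pmatrix} b_k^{(1)} d\partial_tP_{mk} \\ b_k^{(2)} d\partial_tQ_{mk} \end{pmatrix} + \big(b_k^{(1)} \overline{d\partial_tP}_{mk}^T, b_k^{(2)} \overline{d\partial_tQ}_{mk}^T\big) M^T \begin{pmatrix} dP_{mk} \\ dQ_{mk} \end{pmatrix}\Big]\\
&+ (\Delta t)^2 \sum_{k,j=1}^{r}\Big[ \big(b_k^{(1)}a_{kj}^{(1)} + b_j^{(1)}a_{jk}^{(1)} - b_k^{(1)}b_j^{(1)}\big)\, \overline{d(\partial_tP_{mj})}^T M_1\, d(\partial_tP_{mk})\\
&\qquad + \big(b_k^{(2)}a_{kj}^{(2)} + b_j^{(2)}a_{jk}^{(2)} - b_k^{(2)}b_j^{(2)}\big)\, \overline{d(\partial_tQ_{mj})}^T M_2\, d(\partial_tQ_{mk})\\
&\qquad + \big(b_k^{(2)}a_{kj}^{(1)} + b_j^{(1)}a_{jk}^{(2)} - b_k^{(2)}b_j^{(1)}\big)\, \overline{d(\partial_tP_{mj})}^T M_0\, d(\partial_tQ_{mk})\\
&\qquad + \big(b_j^{(2)}b_k^{(1)} - b_j^{(2)}a_{jk}^{(1)} - b_k^{(1)}a_{kj}^{(2)}\big)\, \overline{d(\partial_tQ_{mj})}^T M_0^T\, d(\partial_tP_{mk}) \Big] \qquad (2.31)
\end{aligned}$$
and
$$\begin{aligned}
\delta_x\kappa_k ={}& \Delta x \sum_{m=1}^{s}\Big[ \big(\overline{dP}_{mk}^T, \overline{dQ}_{mk}^T\big) K^T \begin{pmatrix} \tilde b_m^{(1)} d\partial_xP_{mk} \\ \tilde b_m^{(2)} d\partial_xQ_{mk} \end{pmatrix} + \big(\tilde b_m^{(1)} \overline{d\partial_xP}_{mk}^T, \tilde b_m^{(2)} \overline{d\partial_xQ}_{mk}^T\big) K^T \begin{pmatrix} dP_{mk} \\ dQ_{mk} \end{pmatrix}\Big]\\
&+ (\Delta x)^2 \sum_{m,n=1}^{s}\Big[ \big(\tilde b_m^{(1)}\tilde a_{mn}^{(1)} + \tilde b_n^{(1)}\tilde a_{nm}^{(1)} - \tilde b_m^{(1)}\tilde b_n^{(1)}\big)\, \overline{d(\partial_xP_{nk})}^T K_1\, d(\partial_xP_{mk})\\
&\qquad + \big(\tilde b_m^{(2)}\tilde a_{mn}^{(2)} + \tilde b_n^{(2)}\tilde a_{nm}^{(2)} - \tilde b_m^{(2)}\tilde b_n^{(2)}\big)\, \overline{d(\partial_xQ_{nk})}^T K_2\, d(\partial_xQ_{mk})\\
&\qquad + \big(\tilde b_m^{(2)}\tilde a_{mn}^{(1)} + \tilde b_n^{(1)}\tilde a_{nm}^{(2)} - \tilde b_m^{(2)}\tilde b_n^{(1)}\big)\, \overline{d(\partial_xP_{nk})}^T K_0\, d(\partial_xQ_{mk})\\
&\qquad + \big(\tilde b_n^{(2)}\tilde b_m^{(1)} - \tilde b_n^{(2)}\tilde a_{nm}^{(1)} - \tilde b_m^{(1)}\tilde a_{mn}^{(2)}\big)\, \overline{d(\partial_xQ_{nk})}^T K_0^T\, d(\partial_xP_{mk}) \Big]. \qquad (2.32)
\end{aligned}$$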
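If, for $k = 1,2,\dots,r$ and $m = 1,2,\dots,s$,
$$b_k^{(1)} = b_k^{(2)} = b_k, \qquad \tilde b_m^{(1)} = \tilde b_m^{(2)} = \tilde b_m, \qquad (2.33)$$
then the corresponding multi-symplectic conservation law of the method (PRK) is
$$\Delta x \sum_{m=1}^{s} \tilde b_m\, \delta_t\omega_m + \Delta t \sum_{k=1}^{r} b_k\, \delta_x\kappa_k = 0. \qquad (2.34)$$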
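Consequently, in this case, a sufficient condition for (2.34) to hold is that
$$I_1 = 0 \quad\text{and}\quad I_2 = 0, \qquad (2.35)$$
where
$$\begin{aligned}
I_1 = (\Delta t)^2 \sum_{k,j=1}^{r}\Big[ &\big(b_k^{(1)}a_{kj}^{(1)} + b_j^{(1)}a_{jk}^{(1)} - b_k^{(1)}b_j^{(1)}\big)\, \overline{d(\partial_tP_{mj})}^T M_1\, d(\partial_tP_{mk})\\
&+ \big(b_k^{(2)}a_{kj}^{(2)} + b_j^{(2)}a_{jk}^{(2)} - b_k^{(2)}b_j^{(2)}\big)\, \overline{d(\partial_tQ_{mj})}^T M_2\, d(\partial_tQ_{mk})\\
&+ \big(b_k^{(2)}a_{kj}^{(1)} + b_j^{(1)}a_{jk}^{(2)} - b_k^{(2)}b_j^{(1)}\big)\, \overline{d(\partial_tP_{mj})}^T M_0\, d(\partial_tQ_{mk})\\
&+ \big(b_j^{(2)}b_k^{(1)} - b_j^{(2)}a_{jk}^{(1)} - b_k^{(1)}a_{kj}^{(2)}\big)\, \overline{d(\partial_tQ_{mj})}^T M_0^T\, d(\partial_tP_{mk}) \Big] \qquad (2.36)
\end{aligned}$$
and
$$\begin{aligned}
I_2 = (\Delta x)^2 \sum_{m,n=1}^{s}\Big[ &\big(\tilde b_m^{(1)}\tilde a_{mn}^{(1)} + \tilde b_n^{(1)}\tilde a_{nm}^{(1)} - \tilde b_m^{(1)}\tilde b_n^{(1)}\big)\, \overline{d(\partial_xP_{nk})}^T K_1\, d(\partial_xP_{mk})\\
&+ \big(\tilde b_m^{(2)}\tilde a_{mn}^{(2)} + \tilde b_n^{(2)}\tilde a_{nm}^{(2)} - \tilde b_m^{(2)}\tilde b_n^{(2)}\big)\, \overline{d(\partial_xQ_{nk})}^T K_2\, d(\partial_xQ_{mk})\\
&+ \big(\tilde b_m^{(2)}\tilde a_{mn}^{(1)} + \tilde b_n^{(1)}\tilde a_{nm}^{(2)} - \tilde b_m^{(2)}\tilde b_n^{(1)}\big)\, \overline{d(\partial_xP_{nk})}^T K_0\, d(\partial_xQ_{mk})\\
&+ \big(\tilde b_n^{(2)}\tilde b_m^{(1)} - \tilde b_n^{(2)}\tilde a_{nm}^{(1)} - \tilde b_m^{(1)}\tilde a_{mn}^{(2)}\big)\, \overline{d(\partial_xQ_{nk})}^T K_0^T\, d(\partial_xP_{mk}) \Big]. \qquad (2.37)
\end{aligned}$$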
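We let
$$\begin{aligned}
(\mu_1)_{kj} &= b_k^{(1)}a_{kj}^{(1)} + b_j^{(1)}a_{jk}^{(1)} - b_k^{(1)}b_j^{(1)}, &
(\nu_1)_{mn} &= \tilde b_m^{(1)}\tilde a_{mn}^{(1)} + \tilde b_n^{(1)}\tilde a_{nm}^{(1)} - \tilde b_m^{(1)}\tilde b_n^{(1)},\\
(\mu_2)_{kj} &= b_k^{(2)}a_{kj}^{(2)} + b_j^{(2)}a_{jk}^{(2)} - b_k^{(2)}b_j^{(2)}, &
(\nu_2)_{mn} &= \tilde b_m^{(2)}\tilde a_{mn}^{(2)} + \tilde b_n^{(2)}\tilde a_{nm}^{(2)} - \tilde b_m^{(2)}\tilde b_n^{(2)},\\
(\mu_3)_{kj} &= b_k^{(2)}a_{kj}^{(1)} + b_j^{(1)}a_{jk}^{(2)} - b_k^{(2)}b_j^{(1)}, &
(\nu_3)_{mn} &= \tilde b_m^{(2)}\tilde a_{mn}^{(1)} + \tilde b_n^{(1)}\tilde a_{nm}^{(2)} - \tilde b_m^{(2)}\tilde b_n^{(1)}.
\end{aligned}$$
Then the following result is concluded: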
Theorem 2.2.[19] In the method (PRK), suppose that (2.21) and (2.33) hold. The method (PRK) is multi-symplectic with the discrete multi-symplectic law (2.34) if one of the following conditions holds (throughout, $k,j = 1,2,\dots,r$ and $m,n = 1,2,\dots,s$):
1. $(\mu_\tau)_{kj} = 0$ and $(\nu_\tau)_{mn} = 0$ for $\tau = 1,2,3$, (2.38) when $M_\lambda \neq 0$, $K_\lambda \neq 0$ ($\lambda = 1,2$), $M_0 \neq 0$ and $K_0 \neq 0$.
2. $(\mu_1)_{kj} = (\mu_2)_{kj} = 0$ (resp. $(\mu_\tau)_{kj} = 0$ for $\tau = 1,2,3$) and $(\nu_\tau)_{mn} = 0$ for $\tau = 1,2,3$ (resp. $(\nu_1)_{mn} = (\nu_2)_{mn} = 0$), when $M_0 = 0$ (resp. $K_0 = 0$).
3. $(\mu_\tau)_{kj} = 0$ and $(\nu_\tau)_{mn} = 0$ for $\tau = 1,2$, when $M_0 = 0$ and $K_0 = 0$.
4. $(\mu_3)_{kj} = (\nu_3)_{mn} = 0$, when $M_\tau = K_\tau = 0$ for $\tau = 1,2$ (this is a typical multi-symplectic partitioned condition).
5. $(\mu_3)_{kj} = (\nu_\tau)_{mn} = 0$ for $\tau = 1,2$, when $M_\sigma = K_0 = 0$ for $\sigma = 1,2$.
6. $(\mu_\tau)_{kj} = (\nu_3)_{mn} = 0$ for $\tau = 1,2$, when $M_0 = K_\sigma = 0$ for $\sigma = 1,2$.
7. $(\mu_1)_{kj} = (\nu_3)_{mn} = 0$, when $M_0 = M_2 = K_\sigma = 0$ for $\sigma = 1,2$.
8. $(\mu_3)_{kj} = (\nu_1)_{mn} = 0$, when $M_\sigma = K_0 = K_2 = 0$ for $\sigma = 1,2$.

Now we give some remarks.

Remark 2.2. In Theorem 2.2 we list only eight conditions for multi-symplecticity of the partitioned Runge-Kutta method (PRK). By using $I_1 = 0$ and $I_2 = 0$, we can derive further conditions for multi-symplectic partitioned Runge-Kutta methods. This theorem can be extended naturally to the case of Hamiltonian partial differential equations with varying coefficients. It is very useful for constructing multi-symplectic methods for Hamiltonian PDEs, as in the finite-dimensional case.
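The coefficient conditions of Theorem 2.2 are easy to test for a concrete pair of tableaux. The sketch below is only an illustration: it assumes the cross defect has the standard partitioned form $(\mu_3)_{kj} = b_k^{(2)}a_{kj}^{(1)} + b_j^{(1)}a_{jk}^{(2)} - b_k^{(2)}b_j^{(1)}$ and that $\mu_1$, $\mu_2$ are the same expression with both coefficient sets equal. The function names and the choice of the two-stage Lobatto IIIA/IIIB pair and the implicit midpoint rule are ours, not taken from the text.

```python
from fractions import Fraction as Fr

def mu3(A1, b1, A2, b2):
    """Defect (mu_3)_{kj} = b2[k]*A1[k][j] + b1[j]*A2[j][k] - b2[k]*b1[j]."""
    r = len(b1)
    return [[b2[k] * A1[k][j] + b1[j] * A2[j][k] - b2[k] * b1[j]
             for j in range(r)] for k in range(r)]

def mu_single(A, b):
    """mu_1 (or mu_2): the same defect with both coefficient sets equal."""
    return mu3(A, b, A, b)

def is_zero(D):
    """True if every entry of the defect matrix vanishes."""
    return all(x == 0 for row in D for x in row)

half = Fr(1, 2)
# Two-stage Lobatto IIIA / IIIB tableaux (illustrative choice).
A_IIIA = [[Fr(0), Fr(0)], [half, half]]
A_IIIB = [[half, Fr(0)], [half, Fr(0)]]
b_lob = [half, half]
# One-stage Gauss method (implicit midpoint rule).
A_mid, b_mid = [[half]], [Fr(1)]

print(is_zero(mu3(A_IIIA, b_lob, A_IIIB, b_lob)))  # cross condition for the pair
print(is_zero(mu_single(A_IIIA, b_lob)))           # IIIA on its own
print(is_zero(mu_single(A_mid, b_mid)))            # midpoint rule on its own
```

In this sketch the Lobatto pair satisfies only the cross condition, which is the situation covered by the purely partitioned cases of the theorem, while the midpoint rule satisfies the single-method condition.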
Remark 2.3. It is straightforward to extend Theorem 2.2 to the Hamiltonian partial differential equation with higher spatial dimension
$$M z_t + \sum_{\tau=1}^{\iota} K_\tau z_{x_\tau} = \nabla_z S(z), \qquad (2.39)$$
where $\iota \ge 2$, $M$ and $K_\tau$ ($\tau = 1,2,\dots,\iota$) are skew-symmetric matrices, and $S$ is a smooth function. The extension will be useful in computational BEC and other scientific and engineering fields.

Remark 2.4. In Theorem 2.2, condition 1 implies $a_{kj}^{(1)} = a_{kj}^{(2)}$ for $k,j = 1,2,\dots,r$ and $\tilde a_{mn}^{(1)} = \tilde a_{mn}^{(2)}$ for $m,n = 1,2,\dots,s$. In fact, in this case only one symplectic Runge-Kutta method is applied in each direction.

Remark 2.5. Consider the nonlinear Schrödinger equation (2.40). Let $\psi(x,t) = u(x,t) + iv(x,t)$. Then equation (2.40) reads as (2.41). We take $z = (u, v, u_x, v_x)^T$, and then equation (2.40) can be rewritten as
$$M\frac{\partial z}{\partial t} + K\frac{\partial z}{\partial x} = \nabla_z S(z,t), \qquad (2.42)$$
where
$$M = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \qquad S(z,t) = -\frac{1}{4}(u^2 + v^2)^2 - \frac{1}{2}(u_x^2 + v_x^2).$$
Equation (2.42) falls under condition 7 of Theorem 2.2. Thus the partitioned Runge-Kutta method (PRK) can be applied to equation (2.42). Numerical experiments with multi-symplectic methods applied to equation (2.40) were given in [4, 8, 17, 18, 23, 24] and some references therein. In [18], the authors investigated the conservative properties and error analysis of a multi-symplectic scheme for Schrödinger equations with variable coefficients. We will give some further results on multi-symplectic Runge-Kutta type methods for equation (2.40) in Section 3.
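To see why condition 7 is the relevant case, one can split $M$ and $K$ into the blocks $M_1, M_0, M_2$ and $K_1, K_0, K_2$ used in Theorem 2.2, with $p = (u, v)$ and $q = (u_x, v_x)$. The following minimal sketch does this for the matrices of (2.42) as transcribed here; the helper names are ours and purely illustrative.

```python
# Matrices of (2.42) for z = (u, v, u_x, v_x)^T, as transcribed in this sketch.
M = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
K = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]

def is_skew(A):
    """Check A^T = -A entrywise."""
    n = len(A)
    return all(A[i][j] == -A[j][i] for i in range(n) for j in range(n))

def blocks(A, a):
    """Split a 2a x 2a matrix into (A1, A0, A2) = (top-left, top-right, bottom-right)."""
    A1 = [row[:a] for row in A[:a]]
    A0 = [row[a:] for row in A[:a]]
    A2 = [row[a:] for row in A[a:]]
    return A1, A0, A2

def is_zero(B):
    return all(x == 0 for row in B for x in row)

M1, M0, M2 = blocks(M, 2)
K1, K0, K2 = blocks(K, 2)
print(is_skew(M), is_skew(K))                                # both skew-symmetric
print(is_zero(M0), is_zero(M2), is_zero(K1), is_zero(K2))    # blocks that vanish
print(is_zero(M1), is_zero(K0))                              # blocks that do not
```

Only $M_1$ and $K_0$ survive, so the coefficient conditions reduce to $(\mu_1)_{kj} = (\nu_3)_{mn} = 0$.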
Remark 2.6. We consider the nonlinear Dirac equation (2.43), where $\psi = (\psi_1, \psi_2)^T$, $i = \sqrt{-1}$, $f(s)$ is a real function of a real variable $s$, and the matrices $A$ and $B$ are
$$A = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},$$
respectively, and
$$M z_t + K z_x = \nabla_z S(z), \qquad (2.44)$$
where
$$M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix},$$
and $S(z) = \frac{1}{2}F(u_1^2 + v_1^2 - u_2^2 - v_2^2)$, where the real smooth function $F(\zeta)$ satisfies $\frac{d}{d\zeta}F(\zeta) = f(\zeta)$. Equation (2.44) falls under condition 6 of Theorem 2.2. Now we denote $z = (u_1, u_2, v_1, v_2)^T$, and equation (2.43) can be rewritten as
$$M z_t + K z_x = \nabla_z S(z), \qquad (2.45)$$
where
$$M = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}.$$
Equation (2.45) falls under condition 4 of Theorem 2.2. It is thus possible to construct more multi-symplectic methods for equation (2.43). We will introduce some results on the energy and momentum conservation properties of Runge-Kutta methods for equation (2.44) in Section 4. The stability and convergence of multi-symplectic Runge-Kutta type methods for general Hamiltonian PDEs are still open problems. It will be very interesting to study the backward error analysis of multi-symplectic Runge-Kutta methods for the general case. The analysis of energy and momentum conservation is also very important for some
practical situations. The combination of symplectic Runge-Kutta type methods in the temporal direction and spectral methods in the spatial directions for Hamiltonian PDEs will be very powerful. The study of local or global error for multi-symplectic methods is significant.
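3 Multi-symplectic Runge-Kutta type methods for Schrödinger equations

In this section we investigate the conditions for multi-symplecticity of Runge-Kutta type methods for the Schrödinger equation
$$i\partial_t\psi = \partial_{xx}\psi + V'(|\psi|^2)\psi, \qquad (x,t) \in U \subset \mathbb{R}^2, \qquad (3.1)$$
where $V: \mathbb{R} \to \mathbb{R}$ is a smooth function. Equation (2.40) in Section 2 is a special case of (3.1). Let $\psi = q + ip$. Then equation (3.1) can be rewritten as
$$\partial_t q = \partial_{xx}p + V'(q^2 + p^2)p, \qquad (3.2)$$
$$\partial_t p = -\partial_{xx}q - V'(q^2 + p^2)q. \qquad (3.3)$$
We introduce a pair of conjugate momenta $v = q_x$, $w = p_x$ and obtain
$$M\partial_t z + K\partial_x z = \nabla_z S(z), \qquad (3.4)$$
where $z = (q, p, v, w)^T$, $M$ and $K$ are skew-symmetric matrices,
$$M = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix},$$
and the smooth function $S(z) = \frac{1}{2}\big(v^2 + w^2 + V(q^2 + p^2)\big)$. (3.4) has a multi-symplectic conservation law
$$\frac{\partial\omega}{\partial t} + \frac{\partial\kappa}{\partial x} = 0, \qquad (3.5)$$
where $\omega$ and $\kappa$ are pre-symplectic forms,
$$\omega = \frac{1}{2}dz \wedge M dz \quad\text{and}\quad \kappa = \frac{1}{2}dz \wedge K dz. \qquad (3.6)$$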
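The corresponding equations for the differential form $dz = (dq, dp, dv, dw)^T$ are given by
$$\begin{aligned}
\partial_t dq - \partial_x dw &= V'(q^2 + p^2)dp + (2pq\,dq + 2p^2\,dp)V''(q^2 + p^2),\\
-\partial_t dp - \partial_x dv &= V'(q^2 + p^2)dq + (2pq\,dp + 2q^2\,dq)V''(q^2 + p^2),\\
\partial_x dq &= dv, \qquad \partial_x dp = dw,
\end{aligned}$$
where we use the fact that the exterior derivative operator $d$ commutes with the partial derivative operators $\partial_t$ and $\partial_x$. It follows that
$$\partial_t dq \wedge dp - \partial_x dw \wedge dp = 2qpV''(q^2 + p^2)\,dq \wedge dp, \qquad -\partial_t dp \wedge dq - \partial_x dv \wedge dq = 2qpV''(q^2 + p^2)\,dp \wedge dq.$$
By the observations that
$$\partial_t(dp \wedge dq) = \partial_t dp \wedge dq + dp \wedge \partial_t dq = -\partial_t dq \wedge dp + \partial_t dp \wedge dq$$
and
$$\partial_x(dv \wedge dq + dw \wedge dp) = \partial_x dv \wedge dq + dv \wedge \partial_x dq + \partial_x dw \wedge dp + dw \wedge \partial_x dp = \partial_x dv \wedge dq + dv \wedge dv + \partial_x dw \wedge dp + dw \wedge dw = \partial_x dv \wedge dq + \partial_x dw \wedge dp,$$
adding the two wedge identities above yields the multi-symplectic conservation law
$$\partial_t(dp \wedge dq) + \partial_x(dv \wedge dq + dw \wedge dp) = 0, \qquad (3.7)$$
which is equivalent to (3.5) and very useful for validating the multi-symplecticity of numerical methods. (3.7) has been used to verify the multi-symplecticity of numerical methods for Schrödinger equations (see [8, 16, 23, 31]).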
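The equivalence between (3.5)-(3.6) and the component form of the conservation law can be checked by evaluating the two-forms on pairs of tangent vectors. The sketch below assumes the convention $(dz_i \wedge dz_j)(U,V) = U_iV_j - U_jV_i$ and uses the matrices of (3.4) for $z = (q,p,v,w)^T$ as transcribed here; the function and index names are ours.

```python
import random
from fractions import Fraction

# Matrices of (3.4) for z = (q, p, v, w)^T, as transcribed in this sketch.
M = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
K = [[0, 0, -1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]]
IQ, IP, IV, IW = 0, 1, 2, 3   # positions of q, p, v, w in z

def wedge(i, j, U, V):
    """(dz_i ^ dz_j)(U, V) = U_i V_j - U_j V_i."""
    return U[i] * V[j] - U[j] * V[i]

def half_form(A, U, V):
    """Evaluate (1/2) dz ^ A dz = (1/2) sum_ij A_ij dz_i ^ dz_j on (U, V)."""
    n = len(U)
    total = sum(A[i][j] * wedge(i, j, U, V) for i in range(n) for j in range(n))
    return Fraction(total, 2)

def omega(U, V):
    return wedge(IP, IQ, U, V)                         # dp ^ dq

def kappa(U, V):
    return wedge(IV, IQ, U, V) + wedge(IW, IP, U, V)   # dv ^ dq + dw ^ dp

random.seed(0)
agree = True
for _ in range(100):
    U = [random.randint(-9, 9) for _ in range(4)]
    V = [random.randint(-9, 9) for _ in range(4)]
    agree = agree and half_form(M, U, V) == omega(U, V)
    agree = agree and half_form(K, U, V) == kappa(U, V)
print(agree)
```

With these matrices, $\omega$ evaluates to $dp \wedge dq$ and $\kappa$ to $dv \wedge dq + dw \wedge dp$, the combination whose $x$-derivative is computed in the derivation above.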
3.1 Multi-symplecticity of partitioned Runge-Kutta methods for the Schrödinger equations

Now we make use of partitioned Runge-Kutta methods in both the temporal and spatial directions to discretize equations (3.2) and (3.3). The method is as follows:
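$$\begin{aligned}
Q_{mk} &= q_m^0 + \Delta t \sum_{j=1}^{r} a^{(1)}_{kj}\,\partial_t Q_{mj}, &(3.8)\\
q_m^1 &= q_m^0 + \Delta t \sum_{k=1}^{r} b^{(1)}_{k}\,\partial_t Q_{mk}, &(3.9)\\
P_{mk} &= p_m^0 + \Delta t \sum_{j=1}^{r} a^{(2)}_{kj}\,\partial_t P_{mj}, &(3.10)\\
p_m^1 &= p_m^0 + \Delta t \sum_{k=1}^{r} b^{(2)}_{k}\,\partial_t P_{mk}, &(3.11)\\
Q_{mk} &= q_0^k + \Delta x \sum_{n=1}^{s} \tilde a^{(1)}_{mn}\,\partial_x Q_{nk}, &(3.12)\\
q_1^k &= q_0^k + \Delta x \sum_{m=1}^{s} \tilde b^{(1)}_{m}\,\partial_x Q_{mk}, &(3.13)\\
P_{mk} &= p_0^k + \Delta x \sum_{n=1}^{s} \tilde a^{(2)}_{mn}\,\partial_x P_{nk}, &(3.14)\\
p_1^k &= p_0^k + \Delta x \sum_{m=1}^{s} \tilde b^{(2)}_{m}\,\partial_x P_{mk}, &(3.15)\\
V_{mk} &= v_0^k + \Delta x \sum_{n=1}^{s} \bar a^{(1)}_{mn}\,\partial_x V_{nk}, &(3.16)\\
v_1^k &= v_0^k + \Delta x \sum_{m=1}^{s} \bar b^{(1)}_{m}\,\partial_x V_{mk}, &(3.17)\\
W_{mk} &= w_0^k + \Delta x \sum_{n=1}^{s} \bar a^{(2)}_{mn}\,\partial_x W_{nk}, &(3.18)\\
w_1^k &= w_0^k + \Delta x \sum_{m=1}^{s} \bar b^{(2)}_{m}\,\partial_x W_{mk}, &(3.19)\\
-\partial_t P_{mk} - \partial_x V_{mk} &= V'(Q_{mk}^2 + P_{mk}^2)\,Q_{mk}, &(3.20)\\
\partial_t Q_{mk} - \partial_x W_{mk} &= V'(Q_{mk}^2 + P_{mk}^2)\,P_{mk}, &(3.21)\\
\partial_x Q_{mk} &= V_{mk}, &(3.22)\\
\partial_x P_{mk} &= W_{mk}. &(3.23)
\end{aligned}$$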
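This method (3.8)-(3.23) is denoted by (PRK1). The following result is novel:

Theorem 3.1. In the above method, if
$$b_k^{(1)}b_j^{(2)} - b_k^{(1)}a_{kj}^{(2)} - b_j^{(2)}a_{jk}^{(1)} = 0, \qquad (3.24)$$
$$\tilde b_m^{(1)}\bar b_n^{(1)} - \tilde b_m^{(1)}\bar a_{mn}^{(1)} - \bar b_n^{(1)}\tilde a_{nm}^{(1)} = 0, \qquad (3.25)$$
$$\tilde b_m^{(2)}\bar b_n^{(2)} - \tilde b_m^{(2)}\bar a_{mn}^{(2)} - \bar b_n^{(2)}\tilde a_{nm}^{(2)} = 0, \qquad (3.26)$$
$$\tilde b_m^{(1)} = \tilde b_m^{(2)} = \bar b_m^{(1)} = \bar b_m^{(2)} = \tilde b_m, \qquad (3.27)$$
$$b_k^{(1)} = b_k^{(2)} = b_k, \qquad (3.28)$$
for all $k,j = 1,\dots,r$ and $m,n = 1,\dots,s$, then the method (PRK1) is multi-symplectic with the discrete conservation law
$$\Delta x \sum_{m=1}^{s} \tilde b_m\big(dq_m^1 \wedge dp_m^1 - dq_m^0 \wedge dp_m^0\big) + \Delta t \sum_{k=1}^{r} b_k\big(dq_1^k \wedge dv_1^k - dq_0^k \wedge dv_0^k + dp_1^k \wedge dw_1^k - dp_0^k \wedge dw_0^k\big) = 0. \qquad (DCL1)$$
The outline of the proof. The method (PRK1) implies
$$d(\partial_t Q_{mk}) \wedge dP_{mk} + dQ_{mk} \wedge d(\partial_t P_{mk}) + dQ_{mk} \wedge d(\partial_x V_{mk}) + dP_{mk} \wedge d(\partial_x W_{mk}) = 0.$$
After we write down the variational equations, a straightforward calculation leads to
$$\begin{aligned}
dq_m^1 \wedge dp_m^1 - dq_m^0 \wedge dp_m^0 ={}& \Delta t \sum_{k=1}^{r}\big(b_k^{(1)}\, d(\partial_t Q_{mk}) \wedge dP_{mk} + b_k^{(2)}\, dQ_{mk} \wedge d(\partial_t P_{mk})\big)\\
&+ (\Delta t)^2 \sum_{k,j=1}^{r}\big(b_k^{(1)}b_j^{(2)} - b_k^{(1)}a_{kj}^{(2)} - b_j^{(2)}a_{jk}^{(1)}\big)\, d(\partial_t Q_{mk}) \wedge d(\partial_t P_{mj}).
\end{aligned}$$
Similarly, using (3.22) and (3.23), we have
$$dq_1^k \wedge dv_1^k - dq_0^k \wedge dv_0^k = \Delta x \sum_{m=1}^{s} \bar b_m^{(1)}\, dQ_{mk} \wedge d(\partial_x V_{mk}) + (\Delta x)^2 \sum_{m,n=1}^{s}\big(\tilde b_n^{(1)}\bar b_m^{(1)} - \tilde b_n^{(1)}\bar a_{nm}^{(1)} - \bar b_m^{(1)}\tilde a_{mn}^{(1)}\big)\, d(\partial_x Q_{nk}) \wedge d(\partial_x V_{mk})$$
and
$$dp_1^k \wedge dw_1^k - dp_0^k \wedge dw_0^k = \Delta x \sum_{m=1}^{s} \bar b_m^{(2)}\, dP_{mk} \wedge d(\partial_x W_{mk}) + (\Delta x)^2 \sum_{m,n=1}^{s}\big(\tilde b_n^{(2)}\bar b_m^{(2)} - \tilde b_n^{(2)}\bar a_{nm}^{(2)} - \bar b_m^{(2)}\tilde a_{mn}^{(2)}\big)\, d(\partial_x P_{nk}) \wedge d(\partial_x W_{mk}).$$
Therefore, the conditions (3.24)-(3.28) imply the discrete conservation law (DCL1). This completes the proof.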
3.2 Multi-symplecticity of Runge-Kutta-Nyström methods for the Schrödinger equations
In this subsection we restate the main results in [16]. For (3.2)-(3.3), we apply in the $x$-direction an $s$-stage Nyström method over $[-L, L]$ with coefficients $\{a_{mj}\}$, $\{b_m\}$, $\{\beta_m\}$ and $c_m = \sum_{j=1}^{s} a_{mj}$, and in the $t$-direction an $r$-stage Runge-Kutta method over $[0, \Delta t]$ with coefficients $\{\tilde a_{ki}\}$, $\{\tilde b_k\}$ and $d_k = \sum_{i=1}^{r}\tilde a_{ki}$. This gives
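$$\begin{aligned}
Q^k_{l,m} &= q^k_l + c_m\Delta x\, v^k_l + \Delta x^2 \sum_{j=1}^{s} a_{mj}\Big(-\partial_t P^k_{l,j} - V'\big((Q^k_{l,j})^2 + (P^k_{l,j})^2\big)Q^k_{l,j}\Big),\\
P^k_{l,m} &= p^k_l + c_m\Delta x\, w^k_l + \Delta x^2 \sum_{j=1}^{s} a_{mj}\Big(\partial_t Q^k_{l,j} - V'\big((Q^k_{l,j})^2 + (P^k_{l,j})^2\big)P^k_{l,j}\Big),\\
v^k_{l+1} &= v^k_l + \Delta x \sum_{m=1}^{s} b_m\Big(-\partial_t P^k_{l,m} - V'\big((Q^k_{l,m})^2 + (P^k_{l,m})^2\big)Q^k_{l,m}\Big),\\
w^k_{l+1} &= w^k_l + \Delta x \sum_{m=1}^{s} b_m\Big(\partial_t Q^k_{l,m} - V'\big((Q^k_{l,m})^2 + (P^k_{l,m})^2\big)P^k_{l,m}\Big),\\
q^k_{l+1} &= q^k_l + \Delta x\, v^k_l + \Delta x^2 \sum_{m=1}^{s} \beta_m\Big(-\partial_t P^k_{l,m} - V'\big((Q^k_{l,m})^2 + (P^k_{l,m})^2\big)Q^k_{l,m}\Big),\\
p^k_{l+1} &= p^k_l + \Delta x\, w^k_l + \Delta x^2 \sum_{m=1}^{s} \beta_m\Big(\partial_t Q^k_{l,m} - V'\big((Q^k_{l,m})^2 + (P^k_{l,m})^2\big)P^k_{l,m}\Big),\\
Q^k_{l,m} &= q^0_{l,m} + \Delta t \sum_{i=1}^{r} \tilde a_{ki}\,\partial_t Q^i_{l,m}, \qquad
P^k_{l,m} = p^0_{l,m} + \Delta t \sum_{i=1}^{r} \tilde a_{ki}\,\partial_t P^i_{l,m},\\
q^1_{l,m} &= q^0_{l,m} + \Delta t \sum_{k=1}^{r} \tilde b_k\,\partial_t Q^k_{l,m}, \qquad
p^1_{l,m} = p^0_{l,m} + \Delta t \sum_{k=1}^{r} \tilde b_k\,\partial_t P^k_{l,m}.
\end{aligned}$$
The method above is denoted by (RKN1). The notations are used in the following sense: $Q^k_{l,m} \approx q((l + c_m)\Delta x, d_k\Delta t)$, $q^k_l \approx q(l\Delta x, d_k\Delta t)$, $\partial_t Q^k_{l,m} \approx \partial_t q((l + c_m)\Delta x, d_k\Delta t)$, $P^k_{l,m} \approx p((l + c_m)\Delta x, d_k\Delta t)$, $p^k_l \approx p(l\Delta x, d_k\Delta t)$, $\partial_t P^k_{l,m} \approx \partial_t p((l + c_m)\Delta x, d_k\Delta t)$, $v^k_l \approx \partial_x q(l\Delta x, d_k\Delta t)$.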
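The variational equations corresponding to (RKN1) are
$$\begin{aligned}
dQ^k_{l,m} &= dq^k_l + c_m\Delta x\, dv^k_l + \Delta x^2 \sum_{j=1}^{s} a_{mj}\, dU^k_{l,j}, &(3.29)\\
dP^k_{l,m} &= dp^k_l + c_m\Delta x\, dw^k_l + \Delta x^2 \sum_{j=1}^{s} a_{mj}\, dV^k_{l,j}, &(3.30)\\
dv^k_{l+1} &= dv^k_l + \Delta x \sum_{m=1}^{s} b_m\, dU^k_{l,m}, &(3.31)\\
dw^k_{l+1} &= dw^k_l + \Delta x \sum_{m=1}^{s} b_m\, dV^k_{l,m}, &(3.32)\\
dq^k_{l+1} &= dq^k_l + \Delta x\, dv^k_l + \Delta x^2 \sum_{m=1}^{s} \beta_m\, dU^k_{l,m}, &(3.33)\\
dp^k_{l+1} &= dp^k_l + \Delta x\, dw^k_l + \Delta x^2 \sum_{m=1}^{s} \beta_m\, dV^k_{l,m}, &(3.34)\\
dQ^k_{l,m} &= dq^0_{l,m} + \Delta t \sum_{i=1}^{r} \tilde a_{ki}\, d(\partial_t Q^i_{l,m}), &(3.35)\\
dP^k_{l,m} &= dp^0_{l,m} + \Delta t \sum_{i=1}^{r} \tilde a_{ki}\, d(\partial_t P^i_{l,m}), &(3.36)\\
dq^1_{l,m} &= dq^0_{l,m} + \Delta t \sum_{k=1}^{r} \tilde b_k\, d(\partial_t Q^k_{l,m}), &(3.37)\\
dp^1_{l,m} &= dp^0_{l,m} + \Delta t \sum_{k=1}^{r} \tilde b_k\, d(\partial_t P^k_{l,m}). &(3.38)
\end{aligned}$$
Here
$$\begin{aligned}
dU^k_{l,j} &= -d(\partial_t P^k_{l,j}) - V''\big((Q^k_{l,j})^2 + (P^k_{l,j})^2\big)\big(2Q^k_{l,j}\,dQ^k_{l,j} + 2P^k_{l,j}\,dP^k_{l,j}\big)Q^k_{l,j} - V'\big((Q^k_{l,j})^2 + (P^k_{l,j})^2\big)\,dQ^k_{l,j},\\
dV^k_{l,j} &= d(\partial_t Q^k_{l,j}) - V''\big((Q^k_{l,j})^2 + (P^k_{l,j})^2\big)\big(2Q^k_{l,j}\,dQ^k_{l,j} + 2P^k_{l,j}\,dP^k_{l,j}\big)P^k_{l,j} - V'\big((Q^k_{l,j})^2 + (P^k_{l,j})^2\big)\,dP^k_{l,j}.
\end{aligned}$$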
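Theorem 3.2.[16] In the method (RKN1), if
$$\beta_m = b_m(1 - c_m), \qquad b_m(\beta_j - a_{mj}) = b_j(\beta_m - a_{jm}), \qquad \text{for } m,j = 1,2,\dots,s,$$
and $\tilde b_k\tilde b_i = \tilde b_k\tilde a_{ki} + \tilde b_i\tilde a_{ik}$ for $i,k = 1,2,\dots,r$, then the method (RKN1) is multi-symplectic with the discrete multi-symplectic conservation law
$$\Delta t \sum_{k=1}^{r} \tilde b_k\big[dq^k_{l+1} \wedge dv^k_{l+1} - dq^k_l \wedge dv^k_l + dp^k_{l+1} \wedge dw^k_{l+1} - dp^k_l \wedge dw^k_l\big] + \Delta x \sum_{m=1}^{s} b_m\big[dq^1_{l,m} \wedge dp^1_{l,m} - dq^0_{l,m} \wedge dp^0_{l,m}\big] = 0.$$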
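The Nyström conditions of Theorem 3.2 can be checked numerically for a concrete coefficient set. The sketch below is an illustration under our own assumptions: it builds the Nyström method induced by the two-stage Gauss Runge-Kutta method (so that $a^{\mathrm{RKN}} = A^2$ and $\beta_m = \sum_j b_j A_{jm}$), which is a standard construction but is not stated in the text, and then measures the defects of $\beta_m = b_m(1 - c_m)$ and $b_m(\beta_j - a_{mj}) = b_j(\beta_m - a_{jm})$. All names are hypothetical.

```python
import math

# Two-stage Gauss Runge-Kutta tableau (illustrative choice).
s3 = math.sqrt(3) / 6
A = [[0.25, 0.25 - s3], [0.25 + s3, 0.25]]
b = [0.5, 0.5]
c = [0.5 - s3, 0.5 + s3]
s = len(b)

# Nystrom coefficients induced by applying the RK method to q' = v, v' = g(q):
# abar = A @ A and beta_m = sum_j b_j A[j][m].
abar = [[sum(A[m][i] * A[i][j] for i in range(s)) for j in range(s)] for m in range(s)]
beta = [sum(b[j] * A[j][m] for j in range(s)) for m in range(s)]

def rkn_defects(abar, b, beta, c):
    """Largest violations of beta_m = b_m(1-c_m) and of the symmetry condition."""
    n = len(b)
    d1 = max(abs(beta[m] - b[m] * (1 - c[m])) for m in range(n))
    d2 = max(abs(b[m] * (beta[j] - abar[m][j]) - b[j] * (beta[m] - abar[j][m]))
             for m in range(n) for j in range(n))
    return d1, d2

d1, d2 = rkn_defects(abar, b, beta, c)
print(d1 < 1e-12, d2 < 1e-12)
```

Both defects vanish up to rounding for this coefficient set, so it is an admissible choice for the $x$-direction in the sense of the theorem.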
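Proof. It follows from the variational equations that
$$\begin{aligned}
dq^k_{l+1} \wedge dv^k_{l+1} ={}& dq^k_l \wedge dv^k_l + \Delta x \sum_{m=1}^{s} b_m\, dq^k_l \wedge dU^k_{l,m} + \Delta x\, dv^k_l \wedge dv^k_l + \Delta x^2 \sum_{m=1}^{s} b_m\, dv^k_l \wedge dU^k_{l,m}\\
&+ \Delta x^2 \sum_{m=1}^{s} \beta_m\, dU^k_{l,m} \wedge dv^k_l + \Delta x^3 \sum_{m=1}^{s} \beta_m \sum_{j=1}^{s} b_j\, dU^k_{l,m} \wedge dU^k_{l,j}.
\end{aligned}$$
We solve for $dq^k_l$ from (3.29) and insert it into the second term of the above expression. This yields
$$\begin{aligned}
dq^k_{l+1} \wedge dv^k_{l+1} ={}& dq^k_l \wedge dv^k_l + \Delta x \sum_{m=1}^{s} b_m\, dQ^k_{l,m} \wedge dU^k_{l,m} + \Delta x^2 \sum_{m=1}^{s}(-b_mc_m + b_m - \beta_m)\, dv^k_l \wedge dU^k_{l,m}\\
&+ \Delta x^3 \sum_{m,j=1}^{s} b_m(\beta_j - a_{mj})\, dU^k_{l,j} \wedge dU^k_{l,m}.
\end{aligned}$$
By using the conditions $\beta_m = b_m(1 - c_m)$ and $b_j(\beta_m - a_{jm}) = b_m(\beta_j - a_{mj})$, we obtain
$$dq^k_{l+1} \wedge dv^k_{l+1} = dq^k_l \wedge dv^k_l + \Delta x \sum_{m=1}^{s} b_m\, dQ^k_{l,m} \wedge dU^k_{l,m}.$$
Similarly, it can be concluded that
$$dp^k_{l+1} \wedge dw^k_{l+1} = dp^k_l \wedge dw^k_l + \Delta x \sum_{m=1}^{s} b_m\, dP^k_{l,m} \wedge dV^k_{l,m}.$$
From the combination of the above two equations, it follows that
$$dq^k_{l+1} \wedge dv^k_{l+1} + dp^k_{l+1} \wedge dw^k_{l+1} = dq^k_l \wedge dv^k_l + dp^k_l \wedge dw^k_l + \Delta x \sum_{m=1}^{s} b_m\big(dQ^k_{l,m} \wedge dU^k_{l,m} + dP^k_{l,m} \wedge dV^k_{l,m}\big).$$
Recalling the definitions of $dU^k_{l,m}$ and $dV^k_{l,m}$, we obtain the corresponding semi-discretized conservation law
$$\big(dq^k_{l+1} \wedge dv^k_{l+1} - dq^k_l \wedge dv^k_l\big) + \big(dp^k_{l+1} \wedge dw^k_{l+1} - dp^k_l \wedge dw^k_l\big) = \Delta x \sum_{m=1}^{s} b_m\big(d(\partial_t P^k_{l,m}) \wedge dQ^k_{l,m} + dP^k_{l,m} \wedge d(\partial_t Q^k_{l,m})\big).$$
By applying similar arguments as before, it follows from (3.35)-(3.38) that
$$dq^1_{l,m} \wedge dp^1_{l,m} = dq^0_{l,m} \wedge dp^0_{l,m} + \Delta t \sum_{k=1}^{r} \tilde b_k\big(d(\partial_t Q^k_{l,m}) \wedge dP^k_{l,m} + dQ^k_{l,m} \wedge d(\partial_t P^k_{l,m})\big) + \Delta t^2 \sum_{k,i=1}^{r}\big(\tilde b_k\tilde b_i - \tilde b_i\tilde a_{ik} - \tilde b_k\tilde a_{ki}\big)\, d(\partial_t Q^k_{l,m}) \wedge d(\partial_t P^i_{l,m}).$$
In terms of the condition $\tilde b_k\tilde b_i - \tilde b_i\tilde a_{ik} - \tilde b_k\tilde a_{ki} = 0$, the corresponding conservation property is obtained:
$$dq^1_{l,m} \wedge dp^1_{l,m} - dq^0_{l,m} \wedge dp^0_{l,m} = -\Delta t \sum_{k=1}^{r} \tilde b_k\big(d(\partial_t P^k_{l,m}) \wedge dQ^k_{l,m} + dP^k_{l,m} \wedge d(\partial_t Q^k_{l,m})\big).$$
Multiplying the first relation by $\Delta t\,\tilde b_k$ and summing over $k$, multiplying the second by $\Delta x\, b_m$ and summing over $m$, and adding the results yields the discretized multi-symplectic conservation law
$$\Delta t \sum_{k=1}^{r} \tilde b_k\big[dq^k_{l+1} \wedge dv^k_{l+1} - dq^k_l \wedge dv^k_l + dp^k_{l+1} \wedge dw^k_{l+1} - dp^k_l \wedge dw^k_l\big] + \Delta x \sum_{m=1}^{s} b_m\big[dq^1_{l,m} \wedge dp^1_{l,m} - dq^0_{l,m} \wedge dp^0_{l,m}\big] = 0,$$
which is what we want.

The following result reveals an interesting intrinsic character of (RKN1):

Theorem 3.3.[16] In the method (RKN1), assume that
$$\beta_m = b_m(1 - c_m), \qquad b_m(\beta_j - a_{mj}) = b_j(\beta_m - a_{jm}), \qquad \text{for } m,j = 1,2,\dots,s,$$
and $\tilde b_k\tilde b_i = \tilde b_k\tilde a_{ki} + \tilde b_i\tilde a_{ik}$ for $i,k = 1,2,\dots,r$. Then for equation (3.1) with periodic boundary conditions or zero boundary conditions, i.e., $\psi^k_N = \psi^k_0$, $\partial_x\psi^k_N = \partial_x\psi^k_0$ or $\psi^k_N = \psi^k_0 = 0$, the method satisfies the discrete charge conservation law, that is,
$$\sum_{l=0}^{N-1}\sum_{m=1}^{s} b_m|\psi_{l,m}|^2 = \text{const}.$$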
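Proof. The method (RKN1) implies
$$\begin{aligned}
|\psi^1_{l,m}|^2 - |\psi^0_{l,m}|^2 &= \psi^1_{l,m}\bar\psi^1_{l,m} - \psi^0_{l,m}\bar\psi^0_{l,m}\\
&= \Delta t \sum_{k=1}^{r} \tilde b_k\Big(\Psi^k_{l,m}\,\overline{\partial_t\Psi^k_{l,m}} + \partial_t\Psi^k_{l,m}\,\bar\Psi^k_{l,m}\Big) + \Delta t^2 \sum_{i,k=1}^{r}\big(\tilde b_k\tilde b_i - \tilde b_k\tilde a_{ki} - \tilde b_i\tilde a_{ik}\big)\,\partial_t\Psi^i_{l,m}\,\overline{\partial_t\Psi^k_{l,m}}.
\end{aligned}$$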
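By using $\tilde b_k\tilde b_i = \tilde b_k\tilde a_{ki} + \tilde b_i\tilde a_{ik}$ and taking the sum of the above equation over the spatial grid points, we get
$$\sum_{l=0}^{N-1}\sum_{m=1}^{s} b_m\big(|\psi^1_{l,m}|^2 - |\psi^0_{l,m}|^2\big) = \Delta t \sum_{k=1}^{r} \tilde b_k \sum_{l=0}^{N-1}\sum_{m=1}^{s} b_m\Big(\Psi^k_{l,m}\,\overline{\partial_t\Psi^k_{l,m}} + \partial_t\Psi^k_{l,m}\,\bar\Psi^k_{l,m}\Big).$$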
On the other hand, it follows that
Wf+1)tf+i - Wf)tf
bxtftf + AxJ2 M*f, Jd***t m=l
+Az 2 ^
/M0***?, m )^ + Ax2 £
m=l
+Az3 £
&m(l -
cjtfd^
m=l
M&-amj)(&**f,jR**tn.
m,j=l
By using /3m = bm(l - cm), we have M* + i)tf + i - ( # ) t f = Ax|^|2 + A x ^ 6 m ( * f ) m ) a x x * t m=l
+2A* 2 J ] A„»((a i x ** m )0f) m—1
+Ax 3 Y^ bm(Pj ~
amj){dxx*?j)dxx*?im,
m,j=l
where $\Re(u)$ denotes the real part of the complex number $u$.
Secondly, we can get
M+i)#+i - M*)# = W+i)ti+1 - Wfk\Ak M
Axltff +
Ax^b^lJd^l m=l
+2Az2 £ /^((Sx**?, J $ ) m=\
+Ax3 £
bjiPm-ajmXd^l^y?^.
m,j=l
Subtracting (3.6) from (3.5), summing the results over the spatial grid points and use the condition bj([3m — o-jm) = bm(0j — amj), we obtain N-l (=0 N-l
s
=E E
bmm,jdxx*im-(dxxytj*tj.
i=0 m = l
We know that <3>*m satisfy the nonlinear Schrodinger equations lu
t^l,m
u
xx^ltm
^
v
)vl,m-
\\^l,m\
It follows from the above equalities that
Combining the above equalities leads to
E E Mltf,Ja " Kmf) r
= -iAt^hl^N
-i%4)
- (^N
- <%$)]•
fc=l
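If the boundary conditions satisfy
$$\psi^k_N = \psi^k_0, \quad \partial_x\psi^k_N = \partial_x\psi^k_0 \qquad\text{or}\qquad \psi^k_N = \psi^k_0 = 0,$$
then
$$\sum_{l=0}^{N-1}\sum_{m=1}^{s} b_m\big(|\psi^1_{l,m}|^2 - |\psi^0_{l,m}|^2\big) = 0.$$
This completes the proof. This result shows that the multi-symplectic Runge-Kutta-Nyström methods preserve exactly the normalization that is important in quantum physics.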
4 Multi-symplectic Runge-Kutta methods for one dimensional Dirac equations
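In this section, we consider the one-dimensional nonlinear Dirac equation (2.43). Under appropriate conditions, (2.43) has the conservation of the charge $Q$, the linear momentum $P$ and the energy $E$, where
$$\begin{cases}
Q(\psi)(t) = \displaystyle\int_{\mathbb{R}}\big(|\psi_1(x,t)|^2 + |\psi_2(x,t)|^2\big)\,dx,\\[4pt]
P(\psi)(t) = \displaystyle\int_{\mathbb{R}}\Im\big(\bar\psi_1\partial_x\psi_1 + \bar\psi_2\partial_x\psi_2\big)\,dx,\\[4pt]
E(\psi)(t) = \displaystyle\int_{\mathbb{R}}\Big(\Im\big(\bar\psi_1\partial_x\psi_2 + \bar\psi_2\partial_x\psi_1\big) + \tilde f\big(|\psi_1|^2 - |\psi_2|^2\big)\Big)\,dx,
\end{cases} \qquad (4.1)$$
where $\Im(u)$ and $\bar u$ denote the imaginary part and the conjugate of the complex number $u$, respectively, and $\tilde f$ is a primitive function of $f$, namely,
$$\tilde f(s) = \int_0^s f(\tau)\,d\tau.$$
In the sequel, we will focus on an important particular case of (2.43),
$$\begin{aligned}
\frac{\partial\psi_1}{\partial t} + \frac{\partial\psi_2}{\partial x} + im\psi_1 + 2i\lambda\big(|\psi_2|^2 - |\psi_1|^2\big)\psi_1 &= 0,\\
\frac{\partial\psi_2}{\partial t} + \frac{\partial\psi_1}{\partial x} - im\psi_2 + 2i\lambda\big(|\psi_1|^2 - |\psi_2|^2\big)\psi_2 &= 0,
\end{aligned} \qquad (4.2)$$
that is, $f(s) = m - 2\lambda s$ in (2.43), where $m$ and $\lambda$ are real constants. The results obtained can be easily extended to the general case (2.43).

Proposition 4.1 [20] If the solution $\psi$ of the Dirac equation (4.2) satisfies
$$\lim_{|x| \to +\infty}|\psi(x,t)| = 0, \quad \text{uniformly for } t \in \mathbb{R}, \qquad (4.3)$$
then
$$Q(\psi)(t) = Q(\psi)(0), \qquad (4.4)$$
where
$$Q(\psi)(t) = \int_{\mathbb{R}}\big(|\psi_1(x,t)|^2 + |\psi_2(x,t)|^2\big)\,dx. \qquad (4.5)$$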
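Remark 4.1 The Dirac equation can be deduced from the time-dependent Schrödinger equation; $|\psi_1|^2$ and $|\psi_2|^2$ represent the probability densities of the particle in two states, respectively, and the charge conservation law represents the conservation of probability, so it is an important quantity in physical processes. Therefore, in the numerical methods below we emphasize not only the discrete geometric structure, but also the discrete charge conservation law.

Now we apply $r$-stage and $s$-stage Runge-Kutta methods to the $t$-direction and the $x$-direction in the multi-symplectic system (2.44), respectively, and obtain
$$\begin{aligned}
Z_m^k &= z_m^0 + \Delta t \sum_{j=1}^{r} a_{kj}\,\partial_t Z_m^j,\\
z_m^1 &= z_m^0 + \Delta t \sum_{k=1}^{r} b_k\,\partial_t Z_m^k,\\
Z_m^k &= z_0^k + \Delta x \sum_{n=1}^{s} \tilde a_{mn}\,\partial_x Z_n^k,\\
z_1^k &= z_0^k + \Delta x \sum_{m=1}^{s} \tilde b_m\,\partial_x Z_m^k,\\
M\partial_t Z_m^k + K\partial_x Z_m^k &= \nabla_z S(Z_m^k),
\end{aligned} \qquad (4.6)$$
where we have made use of the following notations: $Z_m^k \approx z(c_m\Delta x, d_k\Delta t)$, $z_m^0 \approx z(c_m\Delta x, 0)$, $\partial_t Z_m^k \approx \partial_t z(c_m\Delta x, d_k\Delta t)$, $\partial_x Z_m^k \approx \partial_x z(c_m\Delta x, d_k\Delta t)$, $z_m^1 \approx z(c_m\Delta x, \Delta t)$, $z_0^k \approx z(0, d_k\Delta t)$, $z_1^k \approx z(\Delta x, d_k\Delta t)$, and
$$c_m = \sum_{n=1}^{s} \tilde a_{mn}, \qquad d_k = \sum_{j=1}^{r} a_{kj}.$$
The variational equation corresponding to (4.6) is
$$\begin{aligned}
dZ_m^k &= dz_m^0 + \Delta t \sum_{j=1}^{r} a_{kj}\, d(\partial_t Z)_m^j,\\
dz_m^1 &= dz_m^0 + \Delta t \sum_{k=1}^{r} b_k\, d(\partial_t Z)_m^k,\\
dZ_m^k &= dz_0^k + \Delta x \sum_{n=1}^{s} \tilde a_{mn}\, d(\partial_x Z)_n^k,\\
dz_1^k &= dz_0^k + \Delta x \sum_{m=1}^{s} \tilde b_m\, d(\partial_x Z)_m^k,\\
M\, d(\partial_t Z)_m^k + K\, d(\partial_x Z)_m^k &= D_{zz}S(Z_m^k)\, dZ_m^k,
\end{aligned} \qquad (4.7)$$
where we use the abbreviations: $dZ_m^k$ denotes $(dZ)_m^k$, $d(\partial_t Z)_m^k$ denotes $(d(\partial_t Z))_m^k$, and so on.

According to Theorem 2.1, we have the following basic result:

Theorem 4.1. [20] If the method (4.6) satisfies the following coefficient conditions:
$$\begin{cases}
b_kb_j - b_ka_{kj} - b_ja_{jk} = 0,\\
\tilde b_m\tilde b_n - \tilde b_m\tilde a_{mn} - \tilde b_n\tilde a_{nm} = 0,
\end{cases} \qquad (4.8)$$
for all $k,j = 1,2,\dots,r$ and $m,n = 1,\dots,s$, then (4.6) is multi-symplectic with the discrete multi-symplectic conservation law
$$\Delta x \sum_{m=1}^{s} \tilde b_m\big(\omega_m^1 - \omega_m^0\big) + \Delta t \sum_{k=1}^{r} b_k\big(\kappa_1^k - \kappa_0^k\big) = 0, \qquad (4.9)$$
where $\omega_m^1 = \frac{1}{2}dz_m^1 \wedge M dz_m^1$, $\kappa_1^k = \frac{1}{2}dz_1^k \wedge K dz_1^k$, $\omega_m^0 = \frac{1}{2}dz_m^0 \wedge M dz_m^0$, and $\kappa_0^k = \frac{1}{2}dz_0^k \wedge K dz_0^k$.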
This is a natural consequence of Theorem 2.1. However, its proof is based on (1.8) together with (1.10), and uses exterior differentials and wedge products. Similar proofs can be found in [23, 31] and references therein. Such mathematical tools are closely related to differential geometry and are able to reveal more intrinsic characters of the original systems.
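Each line of the coefficient condition in Theorem 4.1 is the classical symplecticity condition for a Runge-Kutta method, which in the ODE setting is known to give exact conservation of quadratic invariants. The sketch below illustrates this on a toy problem of our own choosing (not from the text): the implicit midpoint rule applied to $q' = p$, $p' = -q$, with exact rational arithmetic so that conservation of $q^2 + p^2$ can be tested for equality. Function names are hypothetical.

```python
from fractions import Fraction as Fr

def symplecticity_defect(A, b):
    """Matrix with entries b_k b_j - b_k a_kj - b_j a_jk."""
    n = len(b)
    return [[b[k] * b[j] - b[k] * A[k][j] - b[j] * A[j][k]
             for j in range(n)] for k in range(n)]

def midpoint_step(q, p, h):
    """One implicit midpoint step for q' = p, p' = -q, solved in closed form."""
    d = 1 + h * h / 4
    q_new = ((1 - h * h / 4) * q + h * p) / d
    p_new = (-h * q + (1 - h * h / 4) * p) / d
    return q_new, p_new

# Implicit midpoint rule (one-stage Gauss) versus explicit Euler.
print(symplecticity_defect([[Fr(1, 2)]], [Fr(1)]))   # vanishes
print(symplecticity_defect([[Fr(0)]], [Fr(1)]))      # does not vanish

q, p, h = Fr(1), Fr(0), Fr(1, 10)
energy0 = q * q + p * p
for _ in range(100):
    q, p = midpoint_step(q, p, h)
print(q * q + p * p == energy0)   # quadratic invariant conserved exactly
```

The same defect function can be applied separately to the temporal and the spatial tableau of a candidate method.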
4.1 Energy analysis of multi-symplectic Runge-Kutta methods
In this subsection we investigate the conservation properties of the local energy, which are crucial in some scientific and engineering fields. Since symplectic Runge-Kutta methods conserve quadratic invariants of ODEs exactly [14, 34], multi-symplectic Runge-Kutta methods preserve the energy and momentum conservation laws exactly if the multi-symplectic Hamiltonian $S(z)$ is quadratic or linear. For Dirac equations, however, the Hamiltonian $S(z)$ in the multi-symplectic form (2.44) is neither quadratic nor linear, so the scheme (4.6) in general does not possess discrete energy and momentum conservation laws. We denote the total energy by
$$\mathcal{E}_L(t) = \int_{-L/2}^{L/2} E(z(x,t))\,dx \qquad (4.10)$$
and the total momentum by
$$\mathcal{I}_L(t) = \int_{-L/2}^{L/2} I(z(x,t))\,dx, \qquad (4.11)$$
where the notations are the same as in (1.11) and (1.12). When we integrate the energy conservation law over the local domain, the resulting identity (4.12) has the following form without derivative symbols in the integrands:
$$\int_0^h\big[E(z(x,\tau)) - E(z(x,0))\big]\,dx + \int_0^\tau\big[F(z(h,t)) - F(z(0,t))\big]\,dt = 0. \qquad (4.13)$$
Corresponding to the RK method (4.6), we use the discrete form
$$E_{le} = h \sum_{m=1}^{s} \tilde b_m\big(E(z_m^1) - E(z_m^0)\big) + \tau \sum_{k=1}^{r} b_k\big(F(z_1^k) - F(z_0^k)\big) \qquad (4.14)$$
to approximate the left side of (4.13), where $\tau = \Delta t$ and $h = \Delta x$. One remarkable feature is that the energy and momentum densities and the fluxes are not algebraic functions of $z$: they depend on derivatives. In the numerical methods, we therefore face the question of how to express these derivatives through algebraic equations. To deal with this, we introduce the following auxiliary system:
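$$\begin{aligned}
\partial_x Z_m^k &= (\partial_x z)_m^0 + \tau \sum_{j=1}^{r} a_{kj}\,(\partial_t\partial_x Z)_m^j,\\
(\partial_x z)_m^1 &= (\partial_x z)_m^0 + \tau \sum_{k=1}^{r} b_k\,(\partial_t\partial_x Z)_m^k,\\
\partial_t Z_m^k &= (\partial_t z)_0^k + h \sum_{n=1}^{s} \tilde a_{mn}\,(\partial_x\partial_t Z)_n^k,\\
(\partial_t z)_1^k &= (\partial_t z)_0^k + h \sum_{m=1}^{s} \tilde b_m\,(\partial_x\partial_t Z)_m^k,
\end{aligned} \qquad (4.15)$$
where $(\partial_x z)_m^0$ and $(\partial_t z)_0^k$ satisfy
$$z_m^0 = z_0^0 + h \sum_{n=1}^{s} \tilde a_{mn}\,(\partial_x z)_n^0, \qquad (4.16)$$
$$z_0^k = z_0^0 + \tau \sum_{j=1}^{r} a_{kj}\,(\partial_t z)_0^j, \qquad (4.17)$$
respectively, and
$$(\partial_t\partial_x Z)_m^k \approx \partial_{tx}z(c_mh, d_k\tau), \qquad (\partial_x\partial_t Z)_m^k \approx \partial_{xt}z(c_mh, d_k\tau).$$
Assume that the matrices $A = (a_{kj})_{r \times r}$ and $\tilde A = (\tilde a_{mn})_{s \times s}$ are invertible. Then we have
$$(\partial_t\partial_x Z)_m^k = (\partial_x\partial_t Z)_m^k. \qquad (4.18)$$
In fact, the first equation of (4.6), the third equation of (4.15) and (4.17) imply that
$$Z_m^k = z_m^0 + z_0^k - z_0^0 + \tau h \sum_{j=1}^{r}\sum_{n=1}^{s} a_{kj}\tilde a_{mn}\,(\partial_x\partial_t Z)_n^j. \qquad (4.19)$$
Similarly, the third equation of (4.6), the first equation of (4.15) and (4.16) imply that
$$Z_m^k = z_0^k + z_m^0 - z_0^0 + h\tau \sum_{j=1}^{r}\sum_{n=1}^{s} a_{kj}\tilde a_{mn}\,(\partial_t\partial_x Z)_n^j. \qquad (4.20)$$
From (4.19) and (4.20), we conclude that (4.18) holds for $m = 1,\dots,s$ and $k = 1,\dots,r$. From now on we do not distinguish between the notations $(\partial_t\partial_x Z)_m^k$ and $(\partial_x\partial_t Z)_m^k$, and use only one notation, $Y_m^k$, to denote them.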
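Theorem 4.2.[20] If the matrices of the Runge-Kutta methods in (4.6) satisfying (4.8) are invertible, then for the method (4.6), the error of the discrete energy conservation law satisfies
$$|E_{le}| \le C\tau^3h \qquad (4.21)$$
for sufficiently small $\tau$ and $h$, where the constant $C$ is independent of $\tau$ and $h$.

Proof. Because
$$E(z_m^1) - E(z_m^0) = S(z_m^1) - S(z_m^0) - \frac{1}{2}\big[(z_m^1)^TK\partial_x z_m^1 - (z_m^0)^TK\partial_x z_m^0\big], \qquad (4.22)$$
$$F(z_1^k) - F(z_0^k) = \frac{1}{2}\big[(z_1^k)^TK\partial_t z_1^k - (z_0^k)^TK\partial_t z_0^k\big], \qquad (4.23)$$
by using the second equation of (4.6) and the second equation of (4.15), we have
$$\begin{aligned}
(z_m^1)^TK\partial_x z_m^1 - (z_m^0)^TK\partial_x z_m^0 &= (z_m^1 - z_m^0)^TK\partial_x z_m^1 + (z_m^0)^TK(\partial_x z_m^1 - \partial_x z_m^0)\\
&= \Big(\tau \sum_{k=1}^{r} b_k\partial_t Z_m^k\Big)^TK\Big(\partial_x z_m^0 + \tau \sum_{k=1}^{r} b_kY_m^k\Big) + \tau(z_m^0)^T \sum_{k=1}^{r} b_kKY_m^k,
\end{aligned} \qquad (4.24)$$
and by applying the first equation of (4.6) and the first equation of (4.15) to equation (4.24), (4.24) can be written as
$$\begin{aligned}
&\tau \sum_{k=1}^{r} b_k\big[(\partial_t Z_m^k)^TK\partial_x Z_m^k + (Z_m^k)^TKY_m^k\big] + \tau^2 \sum_{k=1}^{r}\sum_{j=1}^{r}\big(b_kb_j - b_ka_{kj} - b_ja_{jk}\big)(\partial_t Z_m^k)^TKY_m^j\\
&\quad = \tau \sum_{k=1}^{r} b_k\big[(\partial_t Z_m^k)^TK\partial_x Z_m^k + (Z_m^k)^TKY_m^k\big].
\end{aligned} \qquad (4.25)$$
Similarly, it follows that
$$(z_1^k)^TK\partial_t z_1^k - (z_0^k)^TK\partial_t z_0^k = h \sum_{m=1}^{s} \tilde b_m\big[(\partial_x Z_m^k)^TK\partial_t Z_m^k + (Z_m^k)^TKY_m^k\big]. \qquad (4.26)$$
And (4.22)-(4.26) imply that
$$E_{le} = h \sum_{m=1}^{s} \tilde b_m\Big[S(z_m^1) - S(z_m^0) - \tau \sum_{k=1}^{r} b_k(\partial_t Z_m^k)^T\nabla_z S(Z_m^k)\Big]. \qquad (4.27)$$
For simplicity, we present the following identity:
$$S(z_m^1) - S(z_m^0) - \tau \sum_{k=1}^{r} b_k(\partial_t Z_m^k)^T\nabla_z S(Z_m^k) = S\Big(z_m^0 + \tau \sum_{k=1}^{r} b_k\partial_t Z_m^k\Big) - S(z_m^0) - \tau \sum_{k=1}^{r} b_k(\partial_t Z_m^k)^T\nabla_z S(Z_m^k). \qquad (4.28)$$
Notice that $S(z)$ is a multi-variable polynomial function of degree 4. Then the expansion of $S(z + \tau y)$ is
$$S(z + \tau y) = S(z) + \tau S^{(1)}(z)(y) + \frac{\tau^2}{2}S^{(2)}(z)(y,y) + \frac{\tau^3}{6}S^{(3)}(z)(y,y,y) + \frac{\tau^4}{24}S^{(4)}(z)(y,y,y,y), \qquad (4.29)$$
where the notation $S^{(1)}(z)$ is the first order derivative with respect to $z$ as a linear map (the gradient $\nabla_z S(z)$), $S^{(2)}(z)$ the second order derivative as a bilinear map (the second order derivative matrix $D_{zz}S(z)$), and similarly for higher order derivatives. Since the degree of the polynomial is 4, $S^{(k)}(z) = 0$ for $k \ge 5$. Now we introduce two new notations
$$\bar z_m = \sum_{k=1}^{r} b_k\,\partial_t Z_m^k \quad\text{and}\quad \bar z_m^k = \sum_{j=1}^{r} a_{kj}\,\partial_t Z_m^j.$$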
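The fact that expansion (4.29) terminates at fourth order can be checked directly for the Dirac Hamiltonian. The sketch below assumes $z = (u_1, v_1, u_2, v_2)$, $S(z) = \frac{1}{2}F(u_1^2 + v_1^2 - u_2^2 - v_2^2)$ and $F(\zeta) = m\zeta - \lambda\zeta^2$ (so that $F' = f$ with $f(s) = m - 2\lambda s$); the parameter values, sample vectors and function names are hypothetical. It verifies that $\tau \mapsto S(z + \tau y)$ is a polynomial of degree at most 4 by showing that its fifth forward difference vanishes.

```python
from fractions import Fraction as Fr
from math import comb

m, lam = Fr(1), Fr(1, 2)   # hypothetical parameter values

def rho(z):
    u1, v1, u2, v2 = z
    return u1 * u1 + v1 * v1 - u2 * u2 - v2 * v2

def S(z):
    """S = F(rho)/2 with F(zeta) = m*zeta - lam*zeta**2."""
    r = rho(z)
    return (m * r - lam * r * r) / 2

def forward_difference(g, order, h):
    """order-th forward difference of g at 0 with step h."""
    return sum((-1) ** (order - i) * comb(order, i) * g(i * h)
               for i in range(order + 1))

z = [Fr(1), Fr(2), Fr(-1), Fr(3)]
y = [Fr(2), Fr(-1), Fr(1), Fr(1, 2)]

def g(tau):
    return S([zi + tau * yi for zi, yi in zip(z, y)])

h = Fr(1, 3)
d5 = forward_difference(g, 5, h)
d4 = forward_difference(g, 4, h)
print(d5 == 0)                                             # no terms beyond tau^4
print(d4 == 24 * h ** 4 * (-lam / 2) * rho(y) ** 2)        # tau^4 coefficient
```

The second check identifies the $\tau^4$ coefficient, which corresponds to the term $\frac{1}{24}S^{(4)}(z)(y,y,y,y)$ in (4.29).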
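By using the expansion (4.29), (4.28) can be written as
$$\begin{aligned}
&S(z_m^0 + \tau\bar z_m) - S(z_m^0) - \tau \sum_{k=1}^{r} b_k(\partial_t Z_m^k)^T\nabla_z S(Z_m^k)\\
&\quad = \tau S^{(1)}(z_m^0)(\bar z_m) - \tau \sum_{k=1}^{r} b_k(\partial_t Z_m^k)^T\nabla_z S(Z_m^k) + \frac{\tau^2}{2}S^{(2)}(z_m^0)(\bar z_m, \bar z_m) + C_m\tau^3 + D_m\tau^4,
\end{aligned} \qquad (4.30)$$
where $C_m = \frac{1}{6}S^{(3)}(z_m^0)(\bar z_m, \bar z_m, \bar z_m)$, $D_m = \frac{1}{24}S^{(4)}(z_m^0)(\bar z_m, \bar z_m, \bar z_m, \bar z_m)$ and
$$S^{(1)}(z_m^0)(\bar z_m) = [\nabla_z S(z_m^0)]^T\bar z_m = (\bar z_m)^T\nabla_z S(z_m^0), \qquad S^{(2)}(z_m^0)(\bar z_m, \bar z_m) = (\bar z_m)^T\big(D_{zz}S(z_m^0)\big)\bar z_m. \qquad (4.31)$$
From (4.30) and (4.31), it follows that
$$\begin{aligned}
&\tau \sum_{k=1}^{r} b_k(\partial_t Z_m^k)^T\big(\nabla_z S(z_m^0) - \nabla_z S(Z_m^k)\big) + \frac{\tau^2}{2}\sum_{k=1}^{r}\sum_{j=1}^{r} b_kb_j(\partial_t Z_m^k)^T\big(D_{zz}S(z_m^0)\big)\partial_t Z_m^j + C_m\tau^3 + D_m\tau^4\\
&\quad = \frac{\tau^2}{2}\sum_{k=1}^{r}\sum_{j=1}^{r}\big(b_kb_j - b_ka_{kj} - b_ja_{jk}\big)(\partial_t Z_m^k)^T\big(D_{zz}S(z_m^0)\big)\partial_t Z_m^j + C_m\tau^3 + D_m\tau^4 + \tilde C_m\tau^3 + \tilde D_m\tau^4\\
&\quad = (C_m + \tilde C_m)\tau^3 + (D_m + \tilde D_m)\tau^4,
\end{aligned} \qquad (4.32)$$
where, using $Z_m^k = z_m^0 + \tau\bar z_m^k$,
$$\tilde C_m = -\frac{1}{2}\sum_{k=1}^{r} b_k\,S^{(3)}(z_m^0)(\partial_t Z_m^k, \bar z_m^k, \bar z_m^k) \quad\text{and}\quad \tilde D_m = -\frac{1}{6}\sum_{k=1}^{r} b_k\,S^{(4)}(z_m^0)(\partial_t Z_m^k, \bar z_m^k, \bar z_m^k, \bar z_m^k).$$
We assume that $z$, $\partial_t z$, $\partial_x z$ are bounded in the considered domain, namely, there is an $M^* > 0$ such that
$$|z| \le M^*, \qquad |\partial_t z| \le M^*, \qquad |\partial_x z| \le M^*$$
for all $(x,t)$ in the domain. Since $S(z)$ is a polynomial function of degree 4, $S(z)$ and its derivatives with respect to $z$ are all bounded. With these assumptions and the brief analysis above, combining (4.27) and (4.32), we have
$$|E_{le}| \le Ch\tau^3 \qquad (4.33)$$
for sufficiently small $\tau$ and $h$, where the constant $C$ is independent of $z$, $S$, $\tau$ and $h$. This completes the proof.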
Remark 4.2 $E_{le}$ is a discrete approximation of the local integration (4.13) of the energy conservation law. Similarly, we can use
$$E_* = \frac{1}{\tau}\sum_{m=1}^{s} \tilde b_m\big(E(z_m^1) - E(z_m^0)\big) + \frac{1}{h}\sum_{k=1}^{r} b_k\big(F(z_1^k) - F(z_0^k)\big) \qquad (4.34)$$
to approximate the energy conservation law (1.11); under the assumptions of Theorem 4.2, we have
$$|E_*| \le C\tau^2. \qquad (4.35)$$
Remark 4.3 The calculation of the estimate (4.33) or (4.35) can be extended to general multi-symplectic systems, provided they satisfy the regularity conditions mentioned in the above theorem.

Integrating the local energy conservation law (1.11) over the whole considered spatial interval $[-L/2, L/2]$ leads to
$$0 = \int_{-L/2}^{L/2}\big(\partial_t E + \partial_x F\big)\,dx = \int_{-L/2}^{L/2}\partial_t E\,dx + F(L/2,t) - F(-L/2,t) = \frac{d}{dt}\int_{-L/2}^{L/2} E\,dx, \qquad (4.36)$$
where the last equality derives from the periodic boundary condition. The calculation above implies the total energy conservation law
$$\frac{d}{dt}\mathcal{E}_L(t) = 0. \qquad (4.37)$$
Now, from the former discussion on the discrete approximation of the local energy conservation law, we can define the discrete total energy at time $t_i$ as
$$(\mathcal{E}_L^d)^i = h \sum_{l=0}^{N-1}\sum_{m=1}^{s} \tilde b_m E(z^i_{l,m}), \qquad (4.38)$$
where $z^i_{l,m} \approx z(lh + c_mh, i\tau)$ and $i$ is a non-negative integer.

Theorem 4.3.[20] Under the assumption of Theorem 4.2, assume that $z$, $S(z)$ and their derivatives satisfy the regularity conditions. Then the local error of the discrete total energy conservation law satisfies
$$\big|(\mathcal{E}_L^d)^1 - (\mathcal{E}_L^d)^0\big| \le C\tau^3 \qquad (4.39)$$
for sufficiently small $\tau$ and $h$, where the constant $C$ is independent of $z$, $S$, $\tau$ and $h$.

The outline of the proof. With a calculation similar to that in Theorem 4.2, it is deduced that
$$\big|(\mathcal{E}_L^d)^1 - (\mathcal{E}_L^d)^0\big| \le \sum_{l=0}^{N-1}\Big|\sum_{m=1}^{s} \tilde b_m h\big(E(z^1_{l,m}) - E(z^0_{l,m})\big)\Big| \le \sum_{l=0}^{N-1} Ch\tau^3 = CNh\tau^3 = CL\tau^3 = C\tau^3, \qquad (4.40)$$
where $C$ is a constant as stated above.

When multi-symplectic numerical methods are applied to multi-symplectic systems, a discrete total energy conservation law does not hold in general. However, the numerical experiments [18, 16, 20] show that the discrete total energy oscillates near its initial value over long-time evolution. Hence, one says that multi-symplectic numerical methods give rise to good total energy conservation, with essentially no accumulation of errors over quite a long time. Here we give a theoretical result.

Theorem 4.4.[20] Under the assumptions of Theorem 4.3, for $T > 0$, there exists $\tau_0 > 0$ such that for $\tau < \tau_0$,
$$\big|(\mathcal{E}_L^d)^n - (\mathcal{E}_L^d)^0\big| \le C\tau^2, \quad \text{uniformly for } n\tau \le T, \qquad (4.41)$$
where the constant $C$ is independent of $z$, $S$, $\tau$ and $h$.

Proof.
$$\begin{aligned}
\big|(\mathcal{E}_L^d)^n - (\mathcal{E}_L^d)^0\big| &= \big|\big((\mathcal{E}_L^d)^n - (\mathcal{E}_L^d)^{n-1}\big) + \cdots + \big((\mathcal{E}_L^d)^1 - (\mathcal{E}_L^d)^0\big)\big|\\
&\le \big|(\mathcal{E}_L^d)^n - (\mathcal{E}_L^d)^{n-1}\big| + \cdots + \big|(\mathcal{E}_L^d)^1 - (\mathcal{E}_L^d)^0\big|\\
&\le \sum_{i=1}^{n} C\tau^3 = n\tau C\tau^2 \le TC\tau^2 = C\tau^2,
\end{aligned} \qquad (4.42)$$
where $C$ is a constant as stated in (4.41).
Remark 4.4 For any multi-symplectic system, if the regularity conditions as mentioned in Remark 4.3 hold, then the estimates (4.39) and (4.41) can be obtained too.
4.2 Momentum analysis of multi-symplectic Runge-Kutta methods
As discussed in the energy analysis, we make use of the integral form of the local momentum conservation law (1.12), namely,
$$\int_0^h\big[I(z(x,\tau)) - I(z(x,0))\big]\,dx + \int_0^\tau\big[G(z(h,t)) - G(z(0,t))\big]\,dt = 0. \qquad (4.44)$$
We define
$$M_{le} = h \sum_{m=1}^{s} \tilde b_m\big(I(z_m^1) - I(z_m^0)\big) + \tau \sum_{k=1}^{r} b_k\big(G(z_1^k) - G(z_0^k)\big) \qquad (4.45)$$
as the discrete form of the left side of (4.44), namely, of the local momentum conservation law. Since the Dirac equations we consider are nonlinear, they do not possess a discrete local momentum conservation law, but we can still obtain a local error estimate for the discrete form of the conservation law.

Theorem 4.5. [20] Under the assumptions of Theorem 4.2, the estimate
$$|M_{le}| \le C\tau h^3 \qquad (4.46)$$
holds for sufficiently small $\tau$ and $h$, where the positive constant $C$ is independent of $z$, $\partial_t z$, $\partial_x z$ and $S$.

The outline of the proof. From (4.15)-(4.27), similarly, it can be deduced that
$$M_{le} = h \sum_{m=1}^{s} \tilde b_m\big(I(z_m^1) - I(z_m^0)\big) + \tau \sum_{k=1}^{r} b_k\big(G(z_1^k) - G(z_0^k)\big) = \tau \sum_{k=1}^{r} b_k\Big[S(z_1^k) - S(z_0^k) - h \sum_{m=1}^{s} \tilde b_m(\partial_x Z_m^k)^T\nabla_z S(Z_m^k)\Big]. \qquad (4.47)$$
Following the second half of the proof of Theorem 4.2 leads to
$$\Big|S(z_1^k) - S(z_0^k) - h \sum_{m=1}^{s} \tilde b_m(\partial_x Z_m^k)^T\nabla_z S(Z_m^k)\Big| \le Ch^3. \qquad (4.48)$$
According to (4.47) and (4.48), we have
$$|M_{le}| \le C\tau h^3, \qquad (4.49)$$
where $C$ is a constant as stated above.

Remark 4.5. As stated in Remark 4.2, $M_{le}$ is the discrete approximation of the integration (4.44) of the local momentum conservation law. Similarly, we make use of
$$M_* = \frac{1}{\tau}\sum_{m=1}^{s} \tilde b_m\big(I(z_m^1) - I(z_m^0)\big) + \frac{1}{h}\sum_{k=1}^{r} b_k\big(G(z_1^k) - G(z_0^k)\big) \qquad (4.50)$$
to approximate the momentum conservation law (1.12), and under the assumptions of Theorem 4.5, we have
$$|M_*| \le Ch^2, \qquad (4.51)$$
where $C$ is a constant of the same kind as in (4.46) or (4.49).

Remark 4.6. As stated in Remark 4.3, the local error estimates (4.46) and (4.51) can be extended to any multi-symplectic system satisfying the regularity conditions required in Remark 4.3.

Now we turn to the discussion of the total momentum. First, integrating the local momentum conservation law (1.12) over the spatial interval $[-L/2, L/2]$ gives
$$0 = \int_{-L/2}^{L/2}\big(\partial_t I + \partial_x G\big)\,dx = \int_{-L/2}^{L/2}\partial_t I\,dx + G(L/2,t) - G(-L/2,t) = \frac{d}{dt}\int_{-L/2}^{L/2} I\,dx, \qquad (4.52)$$
where we make use of the periodic boundary conditions. It implies that the total momentum is conserved in the continuous case. In our Runge-Kutta methods, we define the total momentum at time $t_i$ as
$$(\mathcal{I}_L^d)^i = h \sum_{l=0}^{N-1}\sum_{m=1}^{s} \tilde b_m I(z^i_{l,m}), \qquad (4.53)$$
where $z^i_{l,m}$ and $i$ have the same meaning as before.

Theorem 4.6. [20] Under the assumptions of Theorem 4.5 and with the periodic boundary condition, if $\partial_x z$ is a periodic function on the spatial
interval $[-L/2, L/2]$, namely, $\partial_x z(-L/2, t) = \partial_x z(L/2, t)$ for all $t$, then we have the following discrete total momentum conservation law:
$$(\mathcal{I}_L^d)^1 = (\mathcal{I}_L^d)^0. \qquad (4.54)$$
Proof. From (4.15)-(4.20), with the similar calculation of (4.22)-(4.27), we get
I(zlJ-I(z°J
= \[{ziJTMdxzlm
(zlJTMdxz°m
-
= I E hUdtZ^fMd^^
+
iZ^fMY^
fc=i
(4.55)
where Ytkm denotes (dtdxZ)lm or From (4.55), it follows that
(dxdtZ)lm.
= f E &*{ " E 1 E bm Udtzkm)TMdxzkm + (zkm)TMYkm]}. fc=l
L
I. Z=0 m = l
-> J
(4.56) On the other hand, it is deduced that (zk+1)TMdxzf+1
bm[(dtZkm)TMdxZkm+(Zkm)TMYkm
= h± L
m=l
(zk)TMdxzk
-
(4.57)
Combining (4.56) and (4.57), we have
{nf - pi? = § E h \ N£ k=i
L i=o
k
T
\(z?+1)TMdxz*+1
L
= E bk\{z N) Mdxz% fe=i
-
-
(zk)TMdxzk
(4.58)
k T
(z ) Mdxz$
L
= 0, where the last equality comes from the periodic boundary conditions of z and dxz. The proof is finished. Remark 4.7. For any multi-symplectic system if the phase variable z and the first order derivatives of z with respect to spatial variables are periodic in the spatial domain, then applying the multi-symplectic RK methods to this system, we can get the discrete total momentum
conservation law of the same form as (4.54). For the Runge-Kutta discretization of the nonlinear Dirac equation, symplecticity in both the temporal and spatial directions implies the multi-symplecticity of the integrator. The preservation of the charge, energy and momentum conservation laws is very important in structure-preserving discretization. A known result, that a multi-symplectic integrator preserves the local energy and momentum exactly if the multi-symplectic Hamiltonian is quadratic, is contained in our energy and momentum analysis. In particular, from the main results in this section, it follows that, under given conditions, there exists a constant C > 0 such that for sufficiently small $\tau$ and h, we have \Eie + Mle\
+ h2),
which shows the local symmetry of energy and momentum under the discretization by multi-symplectic Runge-Kutta methods. These results tell us that multi-symplectic Runge-Kutta methods are stable and convergent in the sense of the energy and momentum conservation laws. The numerical experiments in [20] illustrate the theoretical results, and the numerical results agree with our theoretical analysis. Our work shows that the traditional methods of numerical analysis can be brought into geometric numerical methods, but that they require new development. In this sense, our work is important but only a beginning.
5 Conclusions
The Runge-Kutta discretization is an important way to devise finite difference schemes for partial differential equations. The results reviewed or presented in this paper concern the multi-symplecticity of Runge-Kutta methods for Hamiltonian partial differential equations, including nonlinear Schrödinger equations and wave equations, and the analysis of local energy and local momentum for multi-symplectic Runge-Kutta methods applied to the Dirac equations of quantum physics. Concatenating symplectic Runge-Kutta methods in the temporal direction and the spatial direction leads to a multi-symplectic integrator for Hamiltonian PDEs. The similarity and comparability of the conditions in the main results on multi-symplecticity remind us that 'All happy families resemble one another, but each unhappy family is unhappy in its own way'. This survey leaves some open problems, which are challenging and significant in both the theory and the applications of structure-preserving algorithms. The following are just some of these problems:
• The solvability of multi-symplectic Runge-Kutta methods: for what scales of the time step-size and space step-size is the nonlinear algebraic system determined by multi-symplectic Runge-Kutta methods solvable?
• When the above system is solvable, it is usually changed into a finite difference scheme, in a compact form, to implement the numerical computation. For higher order methods, we hope that such a 'change' can be done.
• Up to now there has not been any analysis of the stability and convergence of multi-symplectic Runge-Kutta type methods for general Hamiltonian partial differential equations.
• Backward error analysis has proved to be powerful for demonstrating the long-time computational advantage of symplectic methods for Hamiltonian systems. In [29, 30], related work was done for some multi-symplectic methods for Hamiltonian PDEs. It will be very interesting if a backward error analysis can be devised for all kinds of multi-symplectic Runge-Kutta type methods for general Hamiltonian PDEs. In fact, the numerical computational advantages of multi-symplectic methods have not been revealed completely.
• The study of multi-symplectic Runge-Kutta type methods for stochastic Hamiltonian partial differential equations will be very useful.
• It would be very interesting to investigate the conditions for multi-symplecticity of Runge-Kutta type methods for Hamiltonian PDEs by means of variational integrators.
The above list means that we have just commenced the research work on the topic. Some more important problems are probably beyond the list. Multi-symplectic methods should have more important applications in science and technology.
References
[1] A. Alvarez, Linear Crank-Nicholson scheme for nonlinear Dirac equations, J. Comput. Phys. 99 (1992) 348-350.
[2] A. Alvarez and B. Carreras, Interaction dynamics for the solitary waves of a nonlinear Dirac model, Phys. Lett. 86A (1981) 327-332.
[3] A. Alvarez, P. Kuo and L. Vazquez, The numerical study of a nonlinear one-dimensional Dirac equation, Appl. Math. Comput. 13 (1983) 1-15.
[4] U. Ascher and R. McLachlan, Multisymplectic box schemes and the Korteweg-de Vries equation, Appl. Numer. Math. 48 (2004) 255-269.
[5] J. C. Butcher, The Numerical Analysis of Ordinary Differential Equations. Runge-Kutta and General Linear Methods, John Wiley & Sons, Chichester, 1987.
[6] T. J. Bridges, Multi-symplectic structures and wave propagation, Math. Proc. Camb. Phil. Soc. 121 (1997) 147-190.
[7] T. J. Bridges and S. Reich, Multi-symplectic integrator: numerical schemes for Hamiltonian PDEs that conserve symplecticity, Physics Letters A 284 (2001) 184-193.
[8] J. Chen, New schemes for the nonlinear Schrödinger equation, Appl. Math. Comput. 124 (2001) 371-379.
[9] G. J. Cooper, Stability of Runge-Kutta methods for trajectory problems, IMA J. Numer. Anal. 7 (1987) 1-13.
[10] D. B. Duncan, Symplectic finite difference approximations of the nonlinear Klein-Gordon equation, SIAM J. Numer. Anal. 34 (1997) 1742-1760.
[11] K. Feng, Collected Works of Feng Kang (II), National Defence Press, Beijing, 1995.
[12] K. Feng and M. Qin, Symplectic Geometric Algorithms for Hamiltonian Systems (in Chinese), Zhejiang Science & Technology Press, Hangzhou, 2003.
[13] J. de Frutos and J. M. Sanz-Serna, Split-step spectral schemes for nonlinear Dirac systems, J. Comp. Phys. 83 (1989) 407-423.
[14] E. Hairer, C. Lubich and G. Wanner, Geometric Numerical Integration. Structure-preserving algorithms for ordinary differential equations, Springer Series in Computational Mathematics 31, Springer, Berlin, 2002.
[15] J. Hong and Y. Liu, Multisymplecticity of the central box scheme for a class of Hamiltonian PDEs and an application to quasi-periodic solitary waves, Math. Comput. Modelling 39 (2004) 1035-1047.
[16] J. Hong, X. Liu and C. Li, Multi-symplecticity of Runge-Kutta-Nyström methods for nonlinear Schrödinger equations, preprint, 2004.
[17] J. Hong and Y. Liu, A novel numerical approach to simulating nonlinear Schrödinger equation with varying coefficients, Appl. Math. Lett. 16(5) (2003) 759-765.
[18] J. Hong, Y. Liu, Hans Munthe-Kaas and Antonella Zanna, Globally conservative properties and error estimation of a multisymplectic scheme for Schrödinger equations with variable coefficients, preprint, 2002.
[19] J. Hong, H. Liu and G. Sun, The multi-symplecticity of partitioned Runge-Kutta methods for Hamiltonian PDEs, preprint, 2003.
[20] J. Hong and C. Li, Some properties of multi-symplectic Runge-Kutta methods for nonlinear Dirac equations, preprint, 2004.
[21] J. Hong and M. Qin, Multi-symplecticity of the centred box discretization for Hamiltonian PDEs with m > 2 space dimensions, Appl. Math. Lett. 15 (2002) 1005-1011.
[22] A. Iserles, A First Course in the Numerical Analysis of Differential Equations, Cambridge University Press, Cambridge, 1996.
[23] A. L. Islas, D. A. Karpeev and C. M. Schober, Geometric integrators for the nonlinear Schrödinger equation, J. Comput. Phys. 173 (2001) 116-148.
[24] A. L. Islas and C. M. Schober, On the preservation of phase space structure under multisymplectic discretization, J. Comput. Phys. 197 (2004) 585-609.
[25] F. M. Lasagni, Canonical Runge-Kutta methods, ZAMP 39 (1988) 952-953.
[26] J. E. Marsden and M. West, Discrete mechanics and variational integrators, Acta Numerica 10 (2001) 1-158.
[27] J. E. Marsden, S. Pekarsky, S. Shkoller and M. West, Variational methods, multi-symplectic geometry and continuum mechanics, J. Geom. and Phys. 38 (2001) 253-284.
[28] R. I. McLachlan, Symplectic integration of Hamiltonian wave equations, Numer. Math. 66 (1994) 465-492.
[29] B. Moore and S. Reich, Multi-symplectic integration methods for Hamiltonian PDEs, Future Gener. Comput. Syst. 19 (2003) 395-402.
[30] B. Moore and S. Reich, Backward error analysis for multi-symplectic integration methods, Numer. Math. 95 (2003) 625-652.
[31] S. Reich, Multi-symplectic Runge-Kutta collocation methods for Hamiltonian wave equations, J. Comp. Phys. 157 (2000) 473-499.
[32] J. M. Sanz-Serna, Runge-Kutta schemes for Hamiltonian systems, BIT 28 (1988) 877-883.
[33] J. M. Sanz-Serna, The numerical integration of Hamiltonian systems, In: Computational Ordinary Differential Equations, ed. by J. R. Cash and I. Gladwell, Clarendon Press, Oxford, 1992, 437-449.
[34] J. M. Sanz-Serna and M. P. Calvo, Numerical Hamiltonian Systems, Chapman & Hall, London, 1994.
[35] G. Sun, A simple way of constructing symplectic Runge-Kutta methods, J. Comput. Math. 18 (2000) 61-68.
[36] G. Sun, Symplectic partitioned Runge-Kutta methods, J. Comput. Math. 11 (1993) 365-372.
[37] Y. Sun and M. Qin, Construction of multi-symplectic schemes of any finite order for modified wave equation, J. Math. Phys. 41 (2000) 7854-7868.
[38] Y. B. Suris, On the conservation of the symplectic structure in the numerical solution of Hamiltonian systems (in Russian), In: Numerical Solution of Ordinary Differential Equations, ed. by S. S. Filippov, Keldysh Institute of Applied Mathematics, USSR Academy of Sciences, Moscow, 1988, 148-160.
[39] Y. B. Suris, Hamiltonian methods of Runge-Kutta type and their variational interpolation (in Russian), Math. Model. 2 (1990) 78-87.
[40] Y. Wang, B. Wang and M. Qin, Numerical implementation of the multi-symplectic Preissmann scheme and its equivalent schemes, Appl. Math. Comput. 149 (2004) 299-326.
[41] P. Zhao and M. Qin, Multi-symplectic geometry and multi-symplectic Preissmann scheme for the KdV equation, J. Phys. A: Math. Gen. 33 (2000) 3613-3626.
Inverse Problems in Bioluminescence Tomography
Ming Jiang*
LMAM, School of Mathematical Sciences, Peking University, Beijing 100871, China. E-mail: [email protected]
Yi Li†
Department of Mathematics, Hunan Normal University, Changsha, China. E-mail: [email protected]
Ge Wang
E-mail: [email protected]
1 Introduction
Gene therapy is a breakthrough in modern medicine, which promises to cure diseases by modifying gene expression. A key to the development of gene therapy is monitoring the in vivo gene transfer and its efficacy in the mouse model. Traditional biopsy methods are invasive, insensitive, inaccurate, inefficient and limited in extent. To map the distribution of the administered gene, reporter genes such as those producing luciferase are being used to generate light signals within a living mouse, which can be externally measured. A highly sensitive CCD camera has been built to take a 2D external view of the bioluminescent signal [1]. Such a 2D image of photon emission is then registered with a 2D visible light picture of the mouse for the localization of the reporter gene activity. In addition to gene therapy, this new imaging tool has great potential in various other biomedical applications as well [2-6].
However, this 2D bioluminescence imaging technique, like traditional radiography, is incapable of 3D characterization of internal source features of interest. To address the need for 3D localization and quantification of a bioluminescent source distribution in a small animal, a bioluminescence tomography (BLT) system is being developed [7-10]. Mathematically, BLT is an inverse problem: to recover an internal bioluminescent source distribution subject to Cauchy data for the diffusion equation. Traditionally, optical diffuse tomography utilizes incoming visible or near infra-red light to probe a scattering object, and reconstructs the 3D distribution of internal optical properties, such as one or both of the absorption and scattering coefficients [11,12]. In contrast to this active imaging mode, BLT reconstructs an internal bioluminescent source distribution, generated by luciferase induced by reporter genes, from external optical measurement. In BLT, complete knowledge of the optical properties of the anatomical structures of the mouse is established from an independent tomographic scan, such as a CT/micro-CT scan, by image segmentation and optical property mapping. That is, we can segment the CT/micro-CT image volume into a number of anatomical structures, and assign optical property values to each structure by using a database of optical properties compiled for this purpose [7,8].
The organization of the paper is as follows. In §2 we introduce the forward process for BLT by the radiative transfer equation and its diffusion approximation. In §3 we briefly discuss the optical diffuse tomography problem. The BLT problem is formulated in §4. In §5, we reformulate the BLT problem in an abstract operator form by the Dirichlet-to-Neumann map. In §6 we review recent results on the uniqueness of the solution for BLT and demonstrate that the minimal norm solution is not physically favorable for BLT. We give a detailed analysis of the structure of the solution for BLT in §7 for sources that are linear combinations of point or ball-like sources or general radial basis functions. Then, we propose several iterative reconstruction algorithms for BLT in §8. Finally, we discuss remaining issues and future directions in §9.
*M. Jiang and G. Wang are at CT/Micro-CT Laboratory, Department of Radiology, University of Iowa, Iowa City, IA 52242, USA.
†Y. Li and G. Wang are at Department of Mathematics, University of Iowa, Iowa City, IA 52242, USA.
2 Radiative transfer equation and diffusion approximation
2.1 Radiative transfer equation (RTE)
Let $\Omega$ be a domain in the three-dimensional Euclidean space $\mathbb{R}^3$ that contains the object to be imaged. Let $u(x,\theta,t)$ be the light flux in direction $\theta \in S^2$ at $x \in \Omega$, where $S^2$ is the unit sphere. A general model for light migration in a random medium is the radiative transfer equation, or Boltzmann equation [11-17]:
$$\frac{1}{c}\frac{\partial u}{\partial t}(x,\theta,t) + \theta\cdot\nabla_x u(x,\theta,t) + \mu(x)\,u(x,\theta,t) = \mu_s(x)\int_{S^2}\eta(\theta\cdot\theta')\,u(x,\theta',t)\,d\theta' + q(x,\theta,t) \tag{2.1}$$
for $t > 0$ and $x \in \Omega$, where $c$ denotes the particle speed, $\mu = \mu_a + \mu_s$ with $\mu_a$ and $\mu_s$ being the absorption and scattering coefficients respectively, the scattering kernel $\eta$ is normalized such that $\int_{S^2}\eta(\theta\cdot\theta')\,d\theta' = 1$, and $q$ is the internal light source. In (2.1), the radiance $u(x,\theta,t)$ is in units of W cm$^{-2}$ sr$^{-1}$, the source term $q(x,\theta,t)$ is in units of W cm$^{-3}$ sr$^{-1}$, the scattering coefficient $\mu_s$ and the absorption coefficient $\mu_a$ are both given in units of cm$^{-1}$, and the scattering phase function $\eta$ is in units of sr$^{-1}$ [18]. One widely assumed kernel is the Henyey-Greenstein scattering kernel [11,12],
$$\eta(\theta\cdot\theta') = \frac{1}{4\pi}\,\frac{1-g^2}{\left(1+g^2-2g\,\theta\cdot\theta'\right)^{3/2}}. \tag{2.2}$$
The parameter $g \in (-1,1)$ is a measure of anisotropy, with $g = 0$ corresponding to isotropic scattering. Other scattering kernels can be found in [15,16]. The initial condition for $u$ is
$$u(x,\theta,0) = 0, \quad x \in \Omega,\ \theta \in S^2. \tag{2.3}$$
The boundary condition for $u$ represents the incoming flux $g^-$:
$$u(x,\theta,t) = g^-(x,\theta,t), \quad x \in \partial\Omega,\ \theta \in S^2,\ \nu(x)\cdot\theta < 0,\ t > 0, \tag{2.4}$$
where $\nu$ is the exterior normal on $\partial\Omega$. The problem (2.1), (2.3) and (2.4) admits a unique solution under appropriate assumptions on $\mu$, $\mu_a$ and $\eta$ [17]. The homogeneous condition $g^-(x,\theta,t) = 0$ specifies that no photons travel in an inward direction at the boundary, except for the source terms [11, p. R50].
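The normalization of the scattering kernel can be checked numerically. The following is a minimal sketch, assuming the standard form of the Henyey-Greenstein kernel with the prefactor 1/(4π); the function names `henyey_greenstein` and `sphere_integral` are illustrative choices, not notation from the paper. Because the kernel depends only on the cosine of the scattering angle, the integral over the unit sphere reduces to 2π times a one-dimensional integral over [-1, 1], evaluated here with Gauss-Legendre quadrature.

```python
import numpy as np

def henyey_greenstein(cos_angle, g):
    """Henyey-Greenstein kernel as a function of cos_angle = theta . theta' (units sr^-1)."""
    return (1.0 - g**2) / (4.0 * np.pi * (1.0 + g**2 - 2.0 * g * cos_angle) ** 1.5)

def sphere_integral(f, n=200):
    """Integrate an axially symmetric function f(cos_angle) over the unit sphere."""
    t, w = np.polynomial.legendre.leggauss(n)  # nodes and weights on [-1, 1]
    return 2.0 * np.pi * np.sum(w * f(t))

# g = 0 is the isotropic case; the other values are arbitrary illustrative choices.
for g in (0.0, 0.5, 0.9):
    total = sphere_integral(lambda t: henyey_greenstein(t, g))
    print(f"g = {g:.1f}: integral of the kernel over S^2 = {total:.8f}")
```

For g = 0 the kernel reduces to the constant 1/(4π), which is the isotropic case mentioned in the text.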
2.2 Diffusion approximation
Typical values of $\mu_a$ and $\mu_s$ in optical tomography for biological tissues are $\mu_a = 0.1 \sim 1.0$ mm$^{-1}$ and $\mu_s = 100 \sim 200$ mm$^{-1}$, respectively. This means that the mean free path of the particles is between 0.005 and 0.01 mm, which is very small compared to a typical object. Thus, the predominant phenomenon in optical tomography is scatter rather than transport. Therefore, one can replace the transport equation (2.1) by a much simpler diffusion equation [12]. The diffusion theory, therefore, becomes an appropriate approximation for many biomedical applications [19]. In the following we present the derivation of the diffusion approximation from the RTE under the above conditions, following [12]. Other approaches can be found in [11,13,16]. Due to the prevalence of scatter, the flux is essentially isotropic within a small distance away from the sources, i.e., it depends only linearly on $\theta$. Thus we may describe the process adequately by the first few moments
$$u_0(x,t) = \frac{1}{4\pi}\int_{S^2} u(x,\theta,t)\,d\theta, \tag{2.5}$$
$$u_1(x,t) = \frac{1}{4\pi}\int_{S^2} \theta\,u(x,\theta,t)\,d\theta, \tag{2.6}$$
$$u_2(x,t) = \frac{1}{4\pi}\int_{S^2} \theta\theta^{T}u(x,\theta,t)\,d\theta \tag{2.7}$$
of $u$. Note that $u_0(x,t)$ is the photon density and $u_1(x,t)$ is the photon current, which define the measurability [11, p. R46]. Integrating (2.1) over $S^2$ and using the normalization of $\eta$, we obtain
$$\frac{1}{c}\frac{\partial u_0}{\partial t}(x,t) + \nabla\cdot u_1(x,t) + \mu_a(x)\,u_0(x,t) = q_0(x,t), \tag{2.8}$$
where
$$q_0(x,t) = \frac{1}{4\pi}\int_{S^2} q(x,\theta,t)\,d\theta. \tag{2.9}$$
$q_1$ and $q_2$ are defined similarly to the moments of $u$. Similarly, multiplying (2.1) by $\theta$ and integrating over $S^2$ yields
$$\frac{1}{c}\frac{\partial u_1}{\partial t}(x,t) + \nabla\cdot u_2(x,t) + \mu(x)\,u_1(x,t) = \bar\eta\,\mu_s(x)\,u_1(x,t) + q_1(x,t), \tag{2.10}$$
where $\bar\eta$ is the mean scattering cosine
$$\bar\eta = \frac{1}{4\pi}\int_{S^2} \theta\cdot\theta'\,\eta(\theta\cdot\theta')\,d\theta', \tag{2.11}$$
which does not depend on $\theta$ and equals $g$ for the Henyey-Greenstein scattering kernel in (2.2), and
$$q_1(x,t) = \frac{1}{4\pi}\int_{S^2} \theta\,q(x,\theta,t)\,d\theta. \tag{2.12}$$
Introducing the reduced scattering coefficient
$$\mu_s' = (1-\bar\eta)\,\mu_s, \tag{2.13}$$
we can write the equations for $u_0$ and $u_1$ in the more concise forms
$$\frac{1}{c}\frac{\partial u_0}{\partial t}(x,t) + \nabla\cdot u_1(x,t) + \mu_a(x)\,u_0(x,t) = q_0(x,t), \tag{2.14}$$
$$\frac{1}{c}\frac{\partial u_1}{\partial t}(x,t) + \nabla\cdot u_2(x,t) + \left(\mu_a(x)+\mu_s'(x)\right)u_1(x,t) = q_1(x,t). \tag{2.15}$$
Now we assume that $u$ depends only linearly on $\theta$,
$$u(x,\theta,t) = \alpha\,u_0(x,t) + \beta\,\theta\cdot u_1(x,t). \tag{2.16}$$
For the constants $\alpha$ and $\beta$, we easily obtain $\alpha = 1$ and $\beta = 3$ by computing the moments. By expressing $u_2$ by (2.16) in terms of $u_0$ and $u_1$, it follows that $\nabla\cdot u_2 = \frac{1}{3}\nabla u_0$. Eliminating $\nabla\cdot u_2$ from (2.15), we obtain a closed system for $u_0$ and $u_1$:
$$\frac{1}{c}\frac{\partial u_0}{\partial t}(x,t) + \nabla\cdot u_1(x,t) + \mu_a(x)\,u_0(x,t) = q_0(x,t), \tag{2.17}$$
$$\frac{1}{c}\frac{\partial u_1}{\partial t}(x,t) + \frac{1}{3}\nabla u_0(x,t) + \left(\mu_a(x)+\mu_s'(x)\right)u_1(x,t) = q_1(x,t). \tag{2.18}$$
To obtain the diffusion approximation, we go one step further, assuming that $u_1$ is almost stationary in the sense that $\frac{1}{c}\frac{\partial u_1}{\partial t}(x,t)$ is negligible in (2.18) and that $q_1 = 0$. We obtain approximately
$$u_1 = -D\nabla u_0, \tag{2.19}$$
where
$$D(x) = \frac{1}{3\left(\mu_a(x)+\mu_s'(x)\right)}. \tag{2.20}$$
This is called Fick's law, and $D$ is the diffusion coefficient. Inserting (2.19) into (2.17), we finally arrive at the diffusion approximation
$$\frac{1}{c}\frac{\partial u_0}{\partial t} - \nabla\cdot(D\nabla u_0) + \mu_a u_0 = q_0. \tag{2.21}$$
This is the most commonly used forward model for photon migration in tissue. It is used almost exclusively in optical tomography and is also called the $P_1$-approximation [11,12,20-22]. The initial and boundary conditions for (2.21) can be obtained similarly. From (2.3), we immediately obtain
$$u_0(x,0) = 0, \quad x \in \Omega. \tag{2.22}$$
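The orders of magnitude quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the mean free path is taken as 1/μ_s, which matches the range 0.005 to 0.01 mm given in the text, and the mean scattering cosine 0.9 is an illustrative value chosen here, not one stated in the paper. The helper names are hypothetical.

```python
def mean_free_path(mu_s):
    """Scattering mean free path 1/mu_s, in the length unit of 1/mu_s."""
    return 1.0 / mu_s

def diffusion_coefficient(mu_a, mu_s, mean_cosine):
    """D = 1 / (3 (mu_a + mu_s')), with mu_s' = (1 - mean_cosine) mu_s as in (2.13), (2.20)."""
    mu_s_reduced = (1.0 - mean_cosine) * mu_s
    return 1.0 / (3.0 * (mu_a + mu_s_reduced))

# Coefficients in mm^-1, taken from the typical ranges quoted in the text.
for mu_s in (100.0, 200.0):
    print(f"mu_s = {mu_s:5.1f} mm^-1 -> mean free path = {mean_free_path(mu_s):.4f} mm")

# Assumed anisotropy 0.9 (illustrative only).
print("D =", diffusion_coefficient(mu_a=0.1, mu_s=100.0, mean_cosine=0.9), "mm")
```

Note how the reduced scattering coefficient, rather than μ_s itself, sets the size of D: strongly forward-peaked scattering increases the diffusion coefficient.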
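From the boundary condition (2.4), we obtain
$$\nu(x)\cdot\int_{\nu(x)\cdot\theta<0}\theta\,u(x,\theta,t)\,d\theta = \nu(x)\cdot\int_{\nu(x)\cdot\theta<0}\theta\,g^-(x,\theta,t)\,d\theta$$
for $x \in \partial\Omega$ and $t > 0$. From (2.16) and (2.19), we get
$$u(x,\theta,t) = u_0(x,t) - 3\,\theta\cdot D(x)\nabla u_0(x,t).$$
Observing that
$$\nu(x)\cdot\int_{\nu(x)\cdot\theta<0}\theta\,d\theta = -\pi, \tag{2.23}$$
$$\int_{\nu(x)\cdot\theta<0}\theta_i\theta_j\,d\theta = \frac{2\pi}{3}\,\delta_{ij}, \tag{2.24}$$
where $\delta_{ij}$ is the Kronecker symbol, we obtain
$$-\pi u_0(x,t) - 3\,\frac{2\pi}{3}\,D(x)\,\nu(x)\cdot\nabla u_0(x,t) = \nu(x)\cdot\int_{\nu(x)\cdot\theta<0}\theta\,g^-(x,\theta,t)\,d\theta, \tag{2.25}$$
i.e.,
$$u_0(x,t) + 2D(x)\frac{\partial u_0}{\partial\nu}(x,t) = -\frac{1}{\pi}\,\nu(x)\cdot\int_{\nu(x)\cdot\theta<0}\theta\,g^-(x,\theta,t)\,d\theta. \tag{2.26}$$
For an isotropic distribution $g^-(x,\theta,t) = g^-(x,t)$, it follows that
$$u_0(x,t) + 2D(x)\frac{\partial u_0}{\partial\nu}(x,t) = g^-(x,t), \quad x \in \partial\Omega,\ t > 0. \tag{2.27}$$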
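The two hemispherical integrals used in the boundary derivation, (2.23) and (2.24), can be verified numerically. The sketch below assumes, without loss of generality, that the exterior normal is the z-axis, so that the inward hemisphere is cos(polar angle) in [-1, 0]. It uses Gauss-Legendre nodes in the cosine and a uniform periodic grid in the azimuth; the function name `hemisphere_moments` and the grid sizes are illustrative choices.

```python
import numpy as np

def hemisphere_moments(n_mu=64, n_phi=128):
    """First and second moments of theta over the hemisphere nu . theta < 0, nu = e_z."""
    x, w = np.polynomial.legendre.leggauss(n_mu)
    mu = 0.5 * (x - 1.0)               # map nodes from [-1, 1] to [-1, 0]
    w_mu = 0.5 * w
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    w_phi = 2.0 * np.pi / n_phi        # periodic trapezoidal weight
    MU, PHI = np.meshgrid(mu, phi, indexing="ij")
    s = np.sqrt(1.0 - MU**2)
    theta = np.stack([s * np.cos(PHI), s * np.sin(PHI), MU], axis=-1)
    W = w_mu[:, None] * w_phi * np.ones_like(MU)
    first = np.einsum("ab,abi->i", W, theta)
    second = np.einsum("ab,abi,abj->ij", W, theta, theta)
    return first, second

nu = np.array([0.0, 0.0, 1.0])
first, second = hemisphere_moments()
print("nu . integral of theta      :", nu @ first, "(expected -pi)")
print("integral of theta_i theta_j :")
print(second, "(expected (2 pi / 3) times the identity)")
```

With these two values, inserting the linear ansatz for u into the boundary integral gives the factor -π in front of u_0 and -2πD in front of the normal derivative, which is how the Robin condition (2.27) arises.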
The initial boundary value problem (2.21), (2.22) and (2.27) admits a unique solution $u_0$ under appropriate assumptions [12]. When no photons travel in an inward direction at the boundary, the boundary condition (2.27) becomes the homogeneous Robin condition
$$u_0(x,t) + 2D(x)\frac{\partial u_0}{\partial\nu}(x,t) = 0, \quad x \in \partial\Omega,\ t > 0. \tag{2.28}$$
Please refer to [16,23] for other boundary conditions. Other approaches to approximate the RTE are discussed in §9. Remark 2.1. To incorporate diffuse boundary reflection arising from a refractive index mismatch between $\Omega$ and the surrounding medium, the boundary condition (2.27) becomes
$$u_0(x,t) + 2AD(x)\frac{\partial u_0}{\partial\nu}(x,t) = g^-(x,t), \quad x \in \partial\Omega,\ t > 0, \tag{2.29}$$
where $A = \frac{1+R}{1-R}$ and $R$ depends on the refraction properties of the medium [11, p. R50].
3 Optical tomography
Optical tomography utilizes low-energy visible or near infra-red light to probe highly scattering media in order to derive qualitative or quantitative images of the optical properties of these media, namely the parameters $\mu_a$ and $\mu_s$ (or one of them), based on the forward process modeled by the RTE or the diffusion approximation [11]. As an emerging medical imaging modality, optical tomography is expected to become a low-cost alternative or complement to existing medical technology. Here we present a brief introduction to this technique. Please refer to [11,24-28] for details and recent developments. The primary approach is based on the RTE. The problem is to recover the optical parameters $\mu_a$ and $\mu_s$ (or one of them) from the measurement of the outgoing flux. The measured quantity in optical tomography is the emittance through a unit area at $x \in \partial\Omega$ perpendicular to the exterior normal $\nu(x)$ on $\partial\Omega$ ([16, § 7.1], [29, p. 843]):
$$g(x,t) = \int_{S^2}\nu(x)\cdot\theta\,u(x,\theta,t)\,d\theta, \quad x \in \partial\Omega,\ t > 0. \tag{3.1}$$
Because the RTE is computationally intensive and the mean free path of the particle is small compared to a typical object size in this context, the diffusion approximation is widely utilized in optical tomography, and the resulting technique is called diffuse optical tomography. For the diffusion model (2.21), the measurement equation (3.1) reads, by (2.6) and (2.19),
$$g(x,t) = \nu(x)\cdot u_1(x,t) = -D(x)\,\nu(x)\cdot\nabla u_0(x,t). \tag{3.2}$$
Hence
$$g(x,t) = -D(x)\frac{\partial u_0}{\partial\nu}(x,t), \quad x \in \partial\Omega,\ t > 0. \tag{3.3}$$
Please refer to [11] for other types of measurements. The optical tomography problem in the diffusion approximation calls for the determination of $D$ and $\mu_a$ from the value of $g$ for all incoming isotropic light distributions $g^-$. This is an inverse problem of partial differential equations [30].
4 Bioluminescent tomography (BLT)
The goal of BLT is to reconstruct the source $q$ from measurements on the boundary of a region $\Omega$ enclosing the source, assuming that the optical parameters of the underlying medium are known through other approaches; e.g., the optical parameters can be established point-wise from a parameters-requisite tomographic scan, such as a CT/micro-CT scan [7,8]. The measured quantity is as in (3.1) or (3.3) for the RTE or the diffusion approximation, respectively. For the same reasons as in optical tomography with the RTE, we utilize the diffusion approximation for BLT. The internal bioluminescence distribution induced by reporter genes is relatively stable [1-6], so we can use the stationary version of equations (2.21) and (2.27) as the forward model for bioluminescence tomography. By discarding all the time dependent terms in (2.21) and (2.27), the stationary forward model is given by the following boundary value problem (BVP):
$$-\nabla\cdot(D\nabla u_0) + \mu_a u_0 = q_0, \quad x \in \Omega, \tag{4.1}$$
$$u_0(x) + 2D(x)\frac{\partial u_0}{\partial\nu}(x) = g^-(x), \quad x \in \Gamma, \tag{4.2}$$
where $\Gamma = \partial\Omega$ denotes the boundary of $\Omega$. The stationary measurement equation reads, by (3.3),
$$g(x) = -D(x)\frac{\partial u_0}{\partial\nu}(x), \quad x \in \Gamma. \tag{4.3}$$
Given the measurement (4.3), it follows that the boundary value of $u_0(x)$ can be obtained according to (4.2) as follows:
$$u_0(x) = g^-(x) + 2g(x), \quad x \in \Gamma. \tag{4.4}$$
Hence, $u_0$ satisfies the following Cauchy condition on the boundary $\Gamma$ [31]:
$$u_0(x) = g^-(x) + 2g(x), \quad x \in \Gamma, \tag{4.5}$$
$$D(x)\frac{\partial u_0}{\partial\nu}(x) = -g(x), \quad x \in \Gamma. \tag{4.6}$$
Therefore, the bioluminescence tomography (BLT) problem is to reconstruct the source $q_0$ in (4.1) from given $u_0(x)$ and $D(x)\frac{\partial u_0}{\partial\nu}(x)$ for $x \in \Gamma$, under the governing diffusion equation (4.1). This is an inverse source problem of partial differential equations [30]. In summary, the bioluminescence tomography problem can be stated as follows: Given the incoming flux $g^-$ and outgoing flux $g$ for $x \in \Gamma$, find a source $q_0$ with one corresponding light flux $u_0$ to satisfy
$$\text{(BLT)}\quad \begin{cases} -\nabla\cdot(D\nabla u_0) + \mu_a u_0 = q_0, & x \in \Omega,\\ u_0 + 2D\dfrac{\partial u_0}{\partial\nu} = g^-, & x \in \Gamma,\\ D\dfrac{\partial u_0}{\partial\nu} = -g, & x \in \Gamma. \end{cases} \tag{4.7}$$
We have $g^- = 0$ in a typical bioluminescence tomography configuration because there is no incoming light. In practice, it is difficult to obtain measurements along the whole boundary $\Gamma$. We consider the case in which the measurement can only be conducted on some disjoint closed connected parts $\Gamma_j \subset \Gamma$, for $j = 1, \dots, J$. Let $\Gamma_P = \bigcup_{j=1}^{J}\Gamma_j$. The BLT problem then becomes an image reconstruction problem from partial data: Given the incoming flux $g^-$ and outgoing flux $g$ for $x \in \Gamma_P$, find a source $q_0$ with one corresponding photon flux $u_0$ to satisfy
$$\text{(BLT(P))}\quad \begin{cases} -\nabla\cdot(D\nabla u_0) + \mu_a u_0 = q_0, & x \in \Omega,\\ u_0 + 2D\dfrac{\partial u_0}{\partial\nu} = g^-, & x \in \Gamma,\\ D\dfrac{\partial u_0}{\partial\nu} = -g, & x \in \Gamma_P. \end{cases} \tag{4.8}$$
When $\Gamma_P = \Gamma$, the (BLT(P)) problem (4.8) reduces to the (BLT) problem (4.7).
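To make the stationary forward model concrete, here is a minimal one-dimensional sketch, not the authors' implementation. It assumes Ω = (0, 1), constant D and μ_a with illustrative values, a second-order finite difference scheme with ghost points for the Robin condition (4.2), and a dense linear solve. The function names are hypothetical. The outgoing flux is recovered from the Robin condition, which is exactly relation (4.4), and the discrete solution is compared against the closed-form solution for a constant source.

```python
import numpy as np

def solve_forward_1d(q, D, mu_a, g_minus=0.0, length=1.0):
    """Solve -(D u')' + mu_a u = q on (0, length) with u + 2 D du/dnu = g_minus at both ends."""
    n = len(q) - 1
    h = length / n
    A = np.zeros((n + 1, n + 1))
    rhs = np.array(q, dtype=float)
    for i in range(1, n):                      # interior nodes: central differences
        A[i, i - 1] = A[i, i + 1] = -D / h**2
        A[i, i] = 2.0 * D / h**2 + mu_a
    for i, j in ((0, 1), (n, n - 1)):          # boundary nodes: ghost point eliminated via Robin
        A[i, i] = 2.0 * D / h**2 + 1.0 / h + mu_a
        A[i, j] = -2.0 * D / h**2
        rhs[i] += g_minus / h
    return np.linalg.solve(A, rhs)

def outgoing_flux(u, g_minus=0.0):
    """g = -D du/dnu at both end points, obtained from the Robin condition, i.e. (4.4)."""
    return 0.5 * (np.array([u[0], u[-1]]) - g_minus)

# Illustrative parameters (assumed, not from the paper); g_minus = 0 as in a typical BLT setup.
D, mu_a, n = 0.03, 0.1, 400
x = np.linspace(0.0, 1.0, n + 1)
u = solve_forward_1d(np.ones(n + 1), D, mu_a)

# Closed-form solution for the constant source q = 1.
k = np.sqrt(mu_a / D)
c = -(1.0 / mu_a) / (np.cosh(k / 2) + 2.0 * D * k * np.sinh(k / 2))
u_exact = 1.0 / mu_a + c * np.cosh(k * (x - 0.5))

print("max deviation from the closed-form solution:", np.max(np.abs(u - u_exact)))
print("outgoing flux g at the two end points:", outgoing_flux(u))
```

In one dimension the boundary consists of only two points, so the measurement g carries two numbers; this already hints at why the source cannot be determined uniquely from boundary data alone.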
5 Reformulation of BLT
In [9,10], the (BLT) problem (4.7) is reformulated as a linear operator equation by the Dirichlet-to-Neumann map and solved with the proposed EM-like iterative algorithm. The uniqueness properties of the (BLT) problem were studied in [32]. For the (BLT(P)) problem (4.8), we follow the same treatment and analyze the uniqueness properties of its solution in the following. The reader should be warned that a mathematically accurate presentation requires rather technical and tedious assumptions on the domain and the coefficients of the BLT problem. In the following, we try to make the mathematical presentation as precise as possible while keeping it concise and readable. For details, please see [9,32] and the references therein. We assume that $\Omega$ is a bounded smooth domain of $\mathbb{R}^N$, although the case of our main interest is $N = 3$, and that each partial boundary $\Gamma_j \subset \Gamma$ is disjoint, closed and connected for $j = 1, \dots, J$. We always assume that the parameters $D > D_0 > 0$, for some positive constant $D_0$, and $\mu_a > 0$ are bounded functions. We further assume that $D$ is sufficiently regular near $\Gamma$, e.g., $D$ is equal to a constant near $\Gamma$. We need the following notation from functional analysis [33]. Let $A$ be a linear operator from a Banach space $X$ to a Banach space $Y$. The kernel or null space of $A$ is defined as $\mathcal{N}[A] = \{x \in X : A[x] = 0\}$, and the range of $A$ as $\mathcal{R}[A] = \{y \in Y : y = A[x] \text{ for some } x \in X\}$. For a subspace $M$ of a Hilbert space $H$, $M^{\perp}$ is the set of all $y \in H$ such that $(y, x) = 0$ for all $x \in M$.
Let $\gamma_0$ and $\gamma_1$ be the boundary value maps
$$\gamma_0[u] = u\big|_{\Gamma} \quad\text{and}\quad \gamma_1[u] = D\frac{\partial u}{\partial\nu}\Big|_{\Gamma}, \tag{5.1}$$
and $L$ be the differential operator
$$L[u] = -\nabla\cdot(D\nabla u) + \mu_a u. \tag{5.2}$$
Given $f \in H^{1/2}(\Gamma_P)$, let $w_1 \in H^1(\Omega)$ be the solution of the following mixed boundary value problem (MBVP) [34,35]:
$$L[w_1] = 0, \quad \text{in } \Omega, \tag{5.3}$$
$$\gamma_0[w_1] = f, \quad \text{on } \Gamma_P, \tag{5.4}$$
$$\gamma_0[w_1] + 2\gamma_1[w_1] = g^-, \quad \text{on } \Gamma\setminus\Gamma_P. \tag{5.5}$$
We define a linear operator $N_{\Gamma_P}$ from $H^{1/2}(\Gamma_P)$ to $H^{-1/2}(\Gamma_P)$ by
$$N_{\Gamma_P}[f] = \gamma_1[w_1]\big|_{\Gamma_P}. \tag{5.6}$$
$N_{\Gamma_P}$ is an extension of the well-known Dirichlet-to-Neumann (or Steklov-Poincaré) map [30]. On the other hand, for $q_0 \in L^2(\Omega)$, we consider the following MBVP:
$$L[w_2] = q_0, \quad \text{in } \Omega, \tag{5.7}$$
$$\gamma_0[w_2] = 0, \quad \text{on } \Gamma_P, \tag{5.8}$$
$$\gamma_0[w_2] + 2\gamma_1[w_2] = 0, \quad \text{on } \Gamma\setminus\Gamma_P, \tag{5.9}$$
and define another linear operator $A_{\Gamma_P}$ from $L^2(\Omega)$ to $H^{1/2}(\Gamma_P)$ by
$$A_{\Gamma_P}[q_0] = -\gamma_1[w_2]\big|_{\Gamma_P}. \tag{5.10}$$
In terms of $\gamma_0$ and $\gamma_1$, the (BLT(P)) problem is to find $q_0$ such that
$$L[u] = q_0, \quad \text{in } \Omega, \tag{5.11}$$
$$\gamma_0[u] + 2\gamma_1[u] = g^-, \quad \text{on } \Gamma, \tag{5.12}$$
$$\gamma_1[u] = -g, \quad \text{on } \Gamma_P \tag{5.13}$$
for some light flux $u$, given the observed $g$ on $\Gamma_P$ and the assumed $g^-$ on $\Gamma$. Assume that such a source $q_0$ and radiance flux $u$ exist. Then, $u$ is a solution of the following MBVP [34]:
$$L[u] = q_0, \quad \text{in } \Omega, \tag{5.14}$$
$$\gamma_0[u] = g^- + 2g, \quad \text{on } \Gamma_P, \tag{5.15}$$
$$\gamma_0[u] + 2\gamma_1[u] = g^-, \quad \text{on } \Gamma\setminus\Gamma_P. \tag{5.16}$$
Let $w_1$ be defined as in (5.3) - (5.4) with $f = g^- + 2g$, and $w_2$ be defined as in (5.7) - (5.8). Let $v = w_1 + w_2$. Then we have
$$L[v] = q_0, \quad \text{in } \Omega, \tag{5.17}$$
$$\gamma_0[v] = g^- + 2g, \quad \text{on } \Gamma_P, \tag{5.18}$$
$$\gamma_0[v] + 2\gamma_1[v] = g^-, \quad \text{on } \Gamma\setminus\Gamma_P. \tag{5.19}$$
By the uniqueness of the above MBVP [34,36], it follows that $v$ satisfies (5.14) - (5.16). Hence, $u = v$ is the required light flux $u$ that generates the measurement on $\Gamma_P$. The measurement equation implies that, on $\Gamma_P$,
$$-g = \gamma_1[u] = \gamma_1[w_1] + \gamma_1[w_2] = N_{\Gamma_P}[g^- + 2g] - A_{\Gamma_P}[q_0]. \tag{5.20}$$
Hence, $q_0$ satisfies the following equation in operator form:
$$A_{\Gamma_P}[q_0] = N_{\Gamma_P}[g^- + 2g] + g, \quad \text{on } \Gamma_P. \tag{5.21}$$
Conversely, if there exists $q_0$ satisfying (5.21), we can construct $u = v$ as indicated above. It follows easily that $u$ satisfies the forward model and the measurement equation. In summary, we have
Proposition 5.1. $q_0$ is a solution to the (BLT(P)) problem (4.8) if and only if it is a solution to (5.21).
6 Uniqueness of BLT
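Given its physical meaning, BLT must have at least one solution. Therefore, in this section we will not discuss the existence of the BLT solution, and primarily focus on the uniqueness property of BLT. The uniqueness property for BLT was studied in [32]. We need to determine the kernel $\mathcal{N}[A_{\Gamma_P}]$ of the operator $A_{\Gamma_P} : L^2(\Omega) \to H^{1/2}(\Gamma_P) \subset L^2(\Gamma_P)$ to characterize the uniqueness property of BLT. We need Green's formula in the following. For $v$ and $w$ in appropriate function spaces, the following Green's formula is well-known [34,35]:
$$\int_{\Omega}\big[v\cdot L[w] - w\cdot L[v]\big]\,dx = -\int_{\Gamma}\big[v\gamma_1[w] - w\gamma_1[v]\big]\,d\Gamma. \tag{6.1}$$
For $\psi \in H^{1/2}(\Gamma_P)$, let $T_{\Gamma_P}$ be defined by $\phi = T_{\Gamma_P}[\psi]$ as the unique solution in $H^1(\Omega) \subset L^2(\Omega)$ of the following MBVP:
$$L[\phi] = 0, \quad \text{in } \Omega, \tag{6.2}$$
$$\gamma_0[\phi] = \psi, \quad \text{on } \Gamma_P, \tag{6.3}$$
$$\gamma_0[\phi] + 2\gamma_1[\phi] = 0, \quad \text{on } \Gamma\setminus\Gamma_P. \tag{6.4}$$
Then, by Green's formula (6.1), (5.7) - (5.8), (5.10), (6.2) and (6.3),
$$\begin{aligned}\int_{\Omega} q_0\cdot\phi\,dx &= \int_{\Omega} L[w_2]\cdot\phi\,dx\\ &= -\int_{\Gamma}\big[\phi\gamma_1[w_2] - w_2\gamma_1[\phi]\big]\,d\Gamma + \int_{\Omega} w_2 L[\phi]\,dx\\ &= -\int_{\Gamma_P}\big[\phi\gamma_1[w_2] - w_2\gamma_1[\phi]\big]\,d\Gamma - \int_{\Gamma\setminus\Gamma_P}\big[\phi\gamma_1[w_2] - w_2\gamma_1[\phi]\big]\,d\Gamma\\ &= \int_{\Gamma_P}\psi\,A_{\Gamma_P}[q_0]\,d\Gamma,\end{aligned}$$
because
$$\int_{\Gamma\setminus\Gamma_P}\big[\phi\gamma_1[w_2] - w_2\gamma_1[\phi]\big]\,d\Gamma = \frac{1}{2}\int_{\Gamma\setminus\Gamma_P}\big[(w_2 + 2\gamma_1[w_2])\,\phi - w_2\,(\phi + 2\gamma_1[\phi])\big]\,d\Gamma = 0 \tag{6.5}$$
by (5.9) and (6.4). Thus, for the operators $A_{\Gamma_P} : L^2(\Omega) \to H^{1/2}(\Gamma_P) \subset L^2(\Gamma_P)$ and $T_{\Gamma_P} : H^{1/2}(\Gamma_P) \subset L^2(\Gamma_P) \to L^2(\Omega)$, we have
$$(q_0, T_{\Gamma_P}[\psi])_{L^2(\Omega)} = (A_{\Gamma_P}[q_0], \psi)_{L^2(\Gamma_P)}, \tag{6.6}$$
i.e., they are dual to each other,
$$A_{\Gamma_P}^{*} = T_{\Gamma_P}. \tag{6.7}$$
Then, the kernel of $A_{\Gamma_P}$ is [33]
$$\mathcal{N}[A_{\Gamma_P}] = \mathcal{R}[A_{\Gamma_P}^{*}]^{\perp} = \mathcal{R}[T_{\Gamma_P}]^{\perp}. \tag{6.8}$$
A detailed characterization can be obtained by letting
$$H^2_{0,\Gamma_P}(\Omega) = \big\{p \in H^2(\Omega) : \gamma_0[p]\big|_{\Gamma_P} = 0,\ \gamma_1[p]\big|_{\Gamma_P} = 0 \text{ and } \gamma_0[p] + 2\gamma_1[p]\big|_{\Gamma\setminus\Gamma_P} = 0\big\}. \tag{6.9}$$
Then, we have
Proposition 6.1.
$$\mathcal{R}[T_{\Gamma_P}]^{\perp} = L[H^2_{0,\Gamma_P}(\Omega)]. \tag{6.10}$$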
126
Ming Jiang, Yi Li, Ge Wang
Proof. If $q \in L[H^2_{0,\Gamma_P}(\Omega)]$ with $q = L[p]$ for some $p \in H^2_{0,\Gamma_P}(\Omega)$, then for $v = T_{\Gamma_P}[\psi] \in \mathcal{R}[T_{\Gamma_P}]$, by Green's formula (6.1),
\[ (q, v)_{L^2(\Omega)} = \int_\Omega q \cdot v\, dx = \int_\Omega v \cdot L[p]\, dx = \int_\Gamma \left[ p\,\gamma_1[v] - v\,\gamma_1[p] \right] d\Gamma + \int_\Omega L[v] \cdot p\, dx = 0 \]
because $\gamma_0[p]|_{\Gamma_P} = 0$, $\gamma_1[p]|_{\Gamma_P} = 0$, $L[v] = 0$, and the boundary integral on $\Gamma \setminus \Gamma_P$ is equal to zero as in (6.5). Hence, $q \perp \mathcal{R}[T_{\Gamma_P}]$. Therefore, $L[H^2_{0,\Gamma_P}(\Omega)] \subset \mathcal{R}[T_{\Gamma_P}]^{\perp}$.

Conversely, assume that $q \in \mathcal{R}[T_{\Gamma_P}]^{\perp} = \mathcal{N}[A_{\Gamma_P}]$. By (5.7)-(5.10), there exists $w_2$ such that
\[ L[w_2] = q \ \text{in } \Omega, \quad \gamma_0[w_2] = 0 \ \text{on } \Gamma_P, \quad \gamma_0[w_2] + 2\gamma_1[w_2] = 0 \ \text{on } \Gamma \setminus \Gamma_P, \quad \gamma_1[w_2] = 0 \ \text{on } \Gamma_P. \]
We have $w_2 \in H^2(\Omega)$ by the regularity theory for second order elliptic partial differential equations [34,35]. The above boundary conditions imply that $w_2 \in H^2_{0,\Gamma_P}(\Omega)$. Hence, $q = L[w_2] \in L[H^2_{0,\Gamma_P}(\Omega)]$. The conclusion follows immediately. $\square$

By Proposition 5.1, the BLT(P) problem is equivalent to the linear equation (5.21) with $q_0$ as the unknown to be found. All the solutions $q_0$ to (5.21) form a convex set in $L^2(\Omega)$. Hence, there exists a unique solution of minimal $L^2$-norm among those solutions [33], denoted by $q_H$. Then, all the solutions can be expressed as $q_H + \mathcal{N}[A_{\Gamma_P}]$. We summarize the above results in the following theorem.

Theorem 6.2. Assume that the (BLT(P)) problem is solvable. For any couple $(g^-, g)$ such that
\[ N_{\Gamma_P}[g^- + 2g] + g \in H^{1/2}(\Gamma_P), \tag{6.11} \]
there is one special solution $q_H$ for the (BLT(P)) problem (4.8), which is of minimal $L^2$-norm among all the solutions. Then, any solution can be expressed as $q_0 = q_H + L[p]$ for some $p \in H^2_{0,\Gamma_P}(\Omega)$.

Remark 6.3. Naturally, condition (6.11) for $(g^-, g)$ is automatically satisfied when $g$ is a normal trace $\gamma_1[u]$, where $u$ is a solution of the forward model (4.1) and (4.2) for $q_0 \in L^2(\Omega)$ [37].
Because there is no unique solution to the (BLT(P)) problem in the general case by Theorem 6.2, one may consider using the unique minimal norm solution $q_H$ as the solution of the BLT problem. The minimal norm solution $q_H$ is also called the minimal energy solution, and is advocated in other fields [38,39]. However, we will demonstrate below that the minimal energy source solution is not physically favorable for BLT in general. Since sources of compact support are commonly encountered in practice, we will demonstrate that such sources cannot be found as the minimal norm solution for the BLT problem.

With the minimal norm source solution $q_H \perp L[H^2_{0,\Gamma_P}(\Omega)]$, we have, by Green's formula (6.1), for any $v \in H^2_0(\Omega) \subset H^2_{0,\Gamma_P}(\Omega)$,
\[ \int_\Omega q_H \cdot L[v]\, dx = \int_\Omega v \cdot L[q_H]\, dx = 0. \tag{6.12} \]
It follows that $q_H$ must satisfy the following equation
\[ L[q_H] = 0. \tag{6.13} \]
Hence, we have the following theorem:

Theorem 6.4. The minimal energy solution $q_H$ cannot possess a compact support within $\Omega$ unless it is zero.

Proof. If $q_H$ is of compact support within $\Omega$, it satisfies the partial differential equation (6.13) and the following boundary condition: $\gamma_0[q_H] = 0$ on $\Gamma$. Because $D > 0$ and $\mu_a > 0$, it follows that $q_H = 0$ by the uniqueness for the Dirichlet problem of elliptic partial differential equations of second order [34,36]. $\square$

Sources of the form $q_0 = q_H + q^*$ for any $q^* \in L[H^2_{0,\Gamma_P}(\Omega)]$ will generate the same measurements. Therefore, $q^*$ is called a non-radiating part [39], for it contributes nothing to the measurement and hence is unobservable. In other words, we have arrived at the same conclusion as that for the inverse source problem of the Helmholtz equation reported in [39].

Theorem 6.5. If $q_0 \neq 0$ with a compact support within $\Omega$ is a solution to the BLT problem, then $q_0$ must have a non-radiating, i.e., unobservable part.

Proof. If $q_0$ does not have a non-radiating part, then $q_0 = q_H$ is the minimal norm source and satisfies (6.13). Hence, it follows that $q_0 = 0$ as in the proof of the above theorem. $\square$

When $\Gamma_P = \Gamma$, the (BLT(P)) problem (4.8) reduces to the (BLT) problem (4.7). The results in this section generalize those for the (BLT) problem in [32].
Remark 6.6. Although the minimal norm solution is advocated in other fields [38,39], it is not appropriate for BLT, as indicated in Theorem 6.5. Compactly supported sources occur frequently in BLT practice. One kind of such sources, combinations of radial basis functions as in (7.1) and (7.2), is studied in Theorem 7.2 in the next section. Such sources cannot be reconstructed as the minimal norm solution.
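The effect described in Theorems 6.4 and 6.5 can be illustrated with a finite-dimensional sketch. Here a random matrix `A` is only a hypothetical stand-in for the operator $A_{\Gamma_P}$ (few measurements, many source cells); it is not a discretization of the diffusion model. The pseudoinverse solution plays the role of $q_H$, and the difference between the true compact source and $q_H$ plays the role of the non-radiating part.

```python
import numpy as np

def minimal_norm_solution(A, b):
    """Minimum 2-norm solution of A q = b, computed with the pseudoinverse."""
    return np.linalg.pinv(A) @ b

def split_radiating(A, q):
    """Split q into its part in range(A^T) and its part in null(A)."""
    q_rad = np.linalg.pinv(A) @ (A @ q)   # orthogonal projection onto range(A^T)
    return q_rad, q - q_rad

rng = np.random.default_rng(0)
A = rng.random((5, 40))        # hypothetical forward matrix: 5 detectors, 40 cells
q_true = np.zeros(40)
q_true[18:22] = 1.0            # compactly supported source (4 active cells)

b = A @ q_true
q_H = minimal_norm_solution(A, b)
q_rad, q_null = split_radiating(A, q_true)

# q_H reproduces the data but is spread over all cells; q_null is invisible in b.
print(np.count_nonzero(np.abs(q_H) > 1e-8), np.linalg.norm(A @ q_null))
```

In this toy setting the minimal norm solution matches the data exactly yet loses the compact support, and the true source differs from it by a component that the measurements cannot see.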
7 Uniqueness results for radial sources
Given the difficulty that there is no unique solution to BLT in the general case by Theorem 6.2, we must restrict the solution space to a sub-space of bioluminescent source distributions so that the uniqueness of the solution may be established in that specific case. There are some results on sources with specific structures as reviewed in [32]. We will report our recent results on sources as combinations of radial basis functions, which generalize the results in [40]. Note that any function can be approximated by a radial combination [41,42]. Hence, this kind of source represents a quite general class of sources.

First, we consider the case of a linear combination of bioluminescent impulses or point sources,
\[ q_0(y) = \sum_{i=1}^{m} a_i\, \delta(y - y_i), \tag{7.1} \]
where each $a_i$ is a constant coefficient, and $y_i$ the location of a point source inside $\Omega$ for $i = 1, \ldots, m$. For the uniqueness result, the conditions on $\Omega$, $D$, $\mu_a$ and $q_0$ are as follows.

C1: $\Omega$ is a bounded $C^2$ domain of $\mathbb{R}^N$ and partitioned into non-overlapping sub-domains $\Omega_i$, $i = 1, 2, \ldots, I$;
C2: Each $\Omega_i$ is connected with a piecewise smooth boundary $\Gamma_i$;
C3: $D$ and $\mu_a$ are $C^2$ near the boundary of each sub-domain;
C4: $D > D_0 > 0$ for some positive constant $D_0$ is Lipschitz on each sub-domain; $\mu_a > 0$ and $\mu_a \in L^p(\Omega)$ for some $p > N/2$.

The following theorem is established in [32] and extends the result in [40].

Theorem 7.1. ([32]) Assume the conditions C1 - C4 hold. If $q_0(y) = \sum_{i=1}^{m} a_i\, \delta(y - y_i)$ and $Q_0(y) = \sum_{j=1}^{M} A_j\, \delta(y - Y_j)$ are two solutions to the BLT problem (4.8), then $m = M$ and there is a permutation $\tau$ of $[1, m]$ such that $a_i = A_{\tau(i)}$ and $y_i = Y_{\tau(i)}$.
Then, let us consider the case of a linear combination of radial basis functions
\[ q_0(y) = \sum_{i=1}^{m} g_i(\|y - x_i\|)\, \chi_{B_{r_0^i, r_1^i}(x_i)} \tag{7.2} \]
for the more general (BLT(P)) problem (4.8), which covers the BLT problem as a special case. To present the result in this case, we need the following notation. For each $0 \le r_0 < r_1 < \infty$ and $x_0 \in \mathbb{R}^N$, let $B_{r_0, r_1}(x_0)$ denote a hollow ball specified by $r_0 < |x - x_0| < r_1$ for $r_0 > 0$ and a solid ball specified by $|x - x_0| < r_1$ for $r_0 = 0$.

C4*: There exist constants $D_1, \ldots, D_I > 0$ and $\mu_1, \ldots, \mu_I > 0$ such that $D(x) = D_i$ and $\mu_a(x) = \mu_i$, $\forall x \in \Omega_i$. Note that condition C4* is a special case of condition C4.
C5: There exists a $C^2$ patch $P_0$ of $\Gamma$;
C6: For each sub-domain $\Omega_m$, there exists a sequence of indices $i_1, i_2, \ldots, i_k \in [1, I]$ with the following connectivity property: the intersection $P_0 \cap \Gamma_{i_1}$ contains a smooth $C^2$ open patch, $\Gamma_{i_j} \cap \Gamma_{i_{j+1}}$ contains a smooth $C^2$ open patch for $j = 1, \ldots, k-1$, and $\Omega_{i_k} = \Omega_m$;
C7: $q_0$ is of the following form
\[ q_0(y) = \sum_{i=1}^{m} g_i(\|y - x_i\|)\, \chi_{B_{r_0^i, r_1^i}(x_i)} \tag{7.3} \]
where each $g_i \neq 0$, $g_i \in L^2(B_{r_0^i, r_1^i}(x_i))$ is continuous, the source centers $\{x_i\}$ are distinct, and each source support $B_{r_0^i, r_1^i}(x_i) \subset\subset \Omega_k$ for some $k \in [1, I]$.

Theorem 7.2. Assume the conditions C1 - C4*, C5 - C7 hold. If $q_1(y) = \sum_{i=1}^{m} g_i(\|y - x_i\|)\, \chi_{B_{r_0^i, r_1^i}(x_i)}$ and $q_2(y) = \sum_{j=1}^{M} G_j(\|y - X_j\|)\, \chi_{B_{R_0^j, R_1^j}(X_j)}$ are two solutions to the (BLT(P)) problem (4.8), then $m = M$ and there exist a permutation $\tau$ of $[1, m]$ and a map $C: [1, m] \to [1, I]$ such that $x_i = X_{\tau(i)} \in \Omega_{C(i)}$ and
\[ \int_{r_0^i}^{r_1^i} r^{N-1}\, \psi_{C(i)}(r)\, g_i(r)\, dr = \int_{R_0^{\tau(i)}}^{R_1^{\tau(i)}} r^{N-1}\, \psi_{C(i)}(r)\, G_{\tau(i)}(r)\, dr, \quad \text{for } i = 1, \ldots, m, \tag{7.4} \]
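where
\[ \psi_j(0) = 1, \quad \psi_j'(0) = 0. \tag{7.6} \]

Remark 7.3. ([32]) The unique positive radial solution of
\[ -D \Delta \psi + \mu_a \psi = 0 \tag{7.7} \]
satisfying (7.6) is, for $\mu_a = 0$,
\[ \psi(r) = 1, \tag{7.8} \]
and for $\mu_a > 0$,
\[ \psi(r) = \begin{cases} \mathrm{BesselI}\left(0, \sqrt{\mu_a / D}\, r\right), & N = 2, \\[4pt] \dfrac{\sinh\left(\sqrt{\mu_a / D}\, r\right)}{\sqrt{\mu_a / D}\, r}, & N = 3, \end{cases} \tag{7.9} \]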
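where $\mathrm{BesselI}$ is the modified Bessel function of the first kind.

Remark 7.4. For $i = 1, \ldots, m$, let $j = C(i)$ and
\[ Z_i = \omega_N \int_{r_0^i}^{r_1^i} r^{N-1}\, \psi_j(r)\, g_i(r)\, dr. \tag{7.10} \]
Let $F_j(x, y)$ be the fundamental solution of $-\nabla \cdot (D_j \nabla u) + \mu_j u$ with the Dirichlet condition at $\infty$, that is,
\[ -\nabla_y \cdot (D_j \nabla_y F_j(x, y)) + \mu_j F_j(x, y) = \delta(x - y), \quad y \in \mathbb{R}^N. \tag{7.11} \]
By the mean value property [32], for $x \notin B_{r_0^i, r_1^i}(x_i)$,
\begin{align}
\int_{B_{r_0^i, r_1^i}(x_i)} F_j(x, y)\, g_i(\|y - x_i\|)\, dy \tag{7.12} \\
&= \int_{r_0^i}^{r_1^i} g_i(r)\, dr \int_{\|y - x_i\| = r} F_j(x, y)\, dS(y) \tag{7.13} \\
&= \int_{r_0^i}^{r_1^i} \omega_N\, r^{N-1}\, g_i(r)\, F_j(x, x_i)\, \psi_j(r)\, dr \tag{7.14} \\
&= \omega_N F_j(x, x_i) \int_{r_0^i}^{r_1^i} r^{N-1}\, \psi_j(r)\, g_i(r)\, dr = Z_i F_j(x, x_i). \tag{7.15}
\end{align}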
If the source $g_i(\|y - x_i\|)$ inside $B_{r_0^i, r_1^i}(x_i)$ is in a homogeneous medium filling the whole space $\mathbb{R}^N$ with corresponding parameters $D_j$ and $\mu_j$, then $Z_i F_j(x, x_i)$ is the light flux induced by it outside $B_{r_0^i, r_1^i}(x_i)$, which is the same as the light flux induced by a point source at $x_i$ with $Z_i$ being the point source intensity. Hence, if the light fluxes in a homogeneous medium outside $B_{r_0^i, r_1^i}(x_i)$ generated by two source terms of the form $g_i(\|y - x_i\|)$ with supports inside $B_{r_0^i, r_1^i}(x_i)$ are the same, they are indistinguishable from the measurements and are equivalent solutions to the BLT problem. On the other hand, Theorem 7.2 guarantees that the number $m$ of the sources, the source centers $\{x_i\}$ and the overall source intensities $Z_i$ can be uniquely determined. These quantities are what we are able to reconstruct from the measured data without further knowledge, and they are of significant interest in practical applications.
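The equivalent point-source intensity of Remark 7.4 and the mean value property behind it can be checked numerically. The sketch below is illustrative only: the function names, the Gauss-Legendre quadrature and the parameter values are assumptions of this example, and the closed form $e^{-kr}/(4\pi D r)$ with $k = \sqrt{\mu/D}$ is the standard whole-space fundamental solution in $N = 3$ for constant coefficients.

```python
import numpy as np

def psi(r, D, mu, N):
    """Radial profile with psi(0) = 1: constant for mu = 0, I_0 (N=2) or sinh(x)/x (N=3)."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    if mu == 0.0:
        return np.ones_like(r)
    x = np.sqrt(mu / D) * r
    if N == 2:
        return np.i0(x)                      # modified Bessel function, order 0
    out = np.ones_like(x)                    # N == 3: sinh(x)/x, with value 1 at x = 0
    nz = x != 0
    out[nz] = np.sinh(x[nz]) / x[nz]
    return out

def equivalent_intensity(g, r0, r1, D, mu, N, n_nodes=64):
    """Z = omega_N * int_{r0}^{r1} r^(N-1) psi(r) g(r) dr (Gauss-Legendre quadrature)."""
    omega = {2: 2.0 * np.pi, 3: 4.0 * np.pi}[N]
    t, w = np.polynomial.legendre.leggauss(n_nodes)
    r = 0.5 * (r1 - r0) * t + 0.5 * (r1 + r0)
    return omega * 0.5 * (r1 - r0) * np.sum(w * r ** (N - 1) * psi(r, D, mu, N) * g(r))

def fundamental_3d(dist, D, mu):
    """Whole-space fundamental solution of -D*Laplace(u) + mu*u in three dimensions."""
    return np.exp(-np.sqrt(mu / D) * dist) / (4.0 * np.pi * D * dist)

def sphere_average_3d(d, r, D, mu, n_nodes=64):
    """Average of F over a sphere of radius r whose center lies at distance d > r."""
    t, w = np.polynomial.legendre.leggauss(n_nodes)      # t = cosine of the polar angle
    s = np.sqrt(d * d + r * r - 2.0 * d * r * t)
    return 0.5 * np.sum(w * fundamental_3d(s, D, mu))

D, mu, d, r = 0.03, 0.02, 3.0, 1.0                        # hypothetical values
lhs = sphere_average_3d(d, r, D, mu)
rhs = fundamental_3d(d, D, mu) * psi(r, D, mu, 3)[0]      # F(x, x_i) * psi(r)
Z_uniform = equivalent_intensity(lambda s: np.ones_like(s), 0.0, 2.0, D, 0.0, 3)
```

With `mu = 0` and a uniform profile, `Z_uniform` reduces to the volume of the ball of radius 2, which gives a quick sanity check of the quadrature.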
8 Reconstruction methods
Based on the iterative methods for image reconstruction, we propose four iterative algorithms for the (BLT(P)) problem. The first two are based on the EM algorithm for emission CT [43] and its ordered-subset (OS) variant, in which the source is constrained to be nonnegative, while the latter two are based on the constrained Landweber scheme and its OS variant [9,10]. Several issues related to these algorithms will be discussed in §8.5. Let $b = N_{\Gamma_P}[g^- + 2g] + g \in H^{1/2}(\Gamma_P)$. Then the BLT problem is to find a solution to the following linear operator equation:
\[ A_{\Gamma_P}[q_0] = b, \quad \text{on } \Gamma_P. \tag{8.1} \]

8.1 EM method

Based on the formulation in [12], let us define
\[ F[q_0] = \int_{\Gamma_P} \left\{ b \log A_{\Gamma_P}[q_0] - A_{\Gamma_P}[q_0] \right\} d\Gamma, \tag{8.2} \]
which is a generalized form of the log likelihood function when the measured data $b$ is subject to a Poisson distribution. By the maximum principle of elliptic partial differential equations [34], it follows that $A_{\Gamma_P}[q_0] \ge 0$ if $q_0 \ge 0$. Hence, if there is a nonnegative solution to the (BLT(P)) problem (5.21), we must have $b \ge 0$. Therefore, we assume that $b \ge 0$. We try to find a solution for the (BLT(P)) problem by performing the following optimization
\[ \operatorname*{argmax}_{q_0 \ge 0} F[q_0]. \tag{8.3} \]
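We first assume that $q_0 > 0$ is a maximizer of $F$. The case of $q_0 \ge 0$ can be handled similarly as the limiting case. We need to find the Fréchet derivative of $F$. Let
\[ f(t) = F[q_0 + tv], \quad \text{for } t \text{ around } 0, \tag{8.4} \]
where $v$ is an arbitrary bounded function of $L^2(\Omega)$, and compute
\[ \frac{d}{dt} f(t) \Big|_{t=0} = \int_{\Gamma_P} \left\{ \frac{b}{A_{\Gamma_P}[q_0]} - 1 \right\} A_{\Gamma_P}[v]\, d\Gamma = \int_{\Omega} A_{\Gamma_P}^* \left[ \frac{b}{A_{\Gamma_P}[q_0]} - 1 \right] v\, dx. \]
Hence, the Fréchet derivative of $F$ is
\[ F'[q_0] = A_{\Gamma_P}^* \left[ \frac{b}{A_{\Gamma_P}[q_0]} - 1 \right] \in L^2(\Omega). \tag{8.5} \]
If $q_0 > 0$ is a solution of (8.3), it follows that $F'[q_0] = 0$. The general case of $q_0 \ge 0$ is given by the Kuhn-Tucker condition [12]:
\[ q_0 \cdot A_{\Gamma_P}^* \left[ \frac{b}{A_{\Gamma_P}[q_0]} - 1 \right] = 0. \tag{8.6} \]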
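Let $\varphi_1 = A_{\Gamma_P}^*[1] = T_{\Gamma_P}[1]$, i.e., the solution of the following MBVP by (6.7):
\[ L[\varphi_1] = 0, \quad \text{in } \Omega, \tag{8.7} \]
\[ \gamma_0[\varphi_1] = 1, \quad \text{on } \Gamma_P, \tag{8.8} \]
\[ \gamma_0[\varphi_1] + 2\gamma_1[\varphi_1] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.9} \]
It follows from the maximum principle of elliptic partial differential equations that $0 < \varphi_1 < 1$ [36]. The Kuhn-Tucker condition (8.6) can be rewritten as
\[ q_0 = \frac{1}{\varphi_1}\, q_0 \cdot T_{\Gamma_P} \left[ \frac{b}{A_{\Gamma_P}[q_0]} \right]. \tag{8.10} \]
Then, we obtain the following EM formula [12]:
\[ q_0^{(n+1)} = \frac{1}{\varphi_1}\, q_0^{(n)} \cdot T_{\Gamma_P} \left[ \frac{b}{A_{\Gamma_P}[q_0^{(n)}]} \right]. \tag{8.11} \]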
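A finite-dimensional analogue of the update (8.11) may help fix ideas. In the sketch below a positive matrix `A` is a hypothetical stand-in for $A_{\Gamma_P}$, its transpose stands in for $T_{\Gamma_P}$, and `A.T @ 1` stands in for $\varphi_1$; none of this replaces the boundary value problems of the continuous algorithm.

```python
import numpy as np

def em_step(A, b, q):
    """One multiplicative EM update: q <- q * A^T(b / (A q)) / (A^T 1)."""
    sens = A.T @ np.ones(A.shape[0])     # discrete analogue of phi_1 = T[1]
    ratio = b / (A @ q)                  # discrete analogue of b / A[q]
    return q * (A.T @ ratio) / sens

def poisson_loglik(A, b, q):
    """Discrete analogue of F in (8.2): sum of b*log(Aq) - Aq."""
    Aq = A @ q
    return float(np.sum(b * np.log(Aq) - Aq))

rng = np.random.default_rng(1)
A = rng.random((30, 10)) + 0.1           # hypothetical positive forward matrix
b = A @ (rng.random(10) + 0.5)           # noise-free, strictly positive data

q = np.ones(10)                          # positive initial guess
history = [poisson_loglik(A, b, q)]
for _ in range(200):
    q = em_step(A, b, q)
    history.append(poisson_loglik(A, b, q))
```

The multiplicative form keeps a positive initial guess positive, and the recorded values of the log likelihood do not decrease along the iteration.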
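Based on the above analysis, we can formulate the EM algorithm as follows.

Algorithm 8.1. EM Algorithm for BLT:

Step 1: Initialization.
1. Solve the following MBVP:
\[ L[w_1] = 0, \quad \text{in } \Omega, \tag{8.12} \]
\[ \gamma_0[w_1] = g^- + 2g, \quad \text{on } \Gamma_P, \tag{8.13} \]
\[ \gamma_0[w_1] + 2\gamma_1[w_1] = g^-, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.14} \]
Set $b = \gamma_1[w_1]|_{\Gamma_P} + g$. When $g^- = 0$, $b = \gamma_1[w_1] + \frac{1}{2}\gamma_0[w_1]$, restricted to $\Gamma_P$.
2. Choose an initial guess $q_0^{(0)}$.

Step 2: For $n \ge 0$, do the following iteration until the convergence criteria are satisfied.
1. Solve the following MBVP:
\[ L[w_2^{(n)}] = q_0^{(n)}, \quad \text{in } \Omega, \tag{8.15} \]
\[ \gamma_0[w_2^{(n)}] = 0, \quad \text{on } \Gamma_P, \tag{8.16} \]
\[ \gamma_0[w_2^{(n)}] + 2\gamma_1[w_2^{(n)}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.17} \]
2. Set
\[ p^{(n)} = \frac{b}{-\gamma_1[w_2^{(n)}]} \Big|_{\Gamma_P}. \]
When $g^- = 0$,
\[ p^{(n)} = \frac{\gamma_1[w_1] + \frac{1}{2}\gamma_0[w_1]}{-\gamma_1[w_2^{(n)}]} \Big|_{\Gamma_P}. \]
3. Solve the following MBVP:
\[ L[\varphi^{(n)}] = 0, \quad \text{in } \Omega, \tag{8.18} \]
\[ \gamma_0[\varphi^{(n)}] = p^{(n)}, \quad \text{on } \Gamma_P, \tag{8.19} \]
\[ \gamma_0[\varphi^{(n)}] + 2\gamma_1[\varphi^{(n)}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.20} \]
4. Set
\[ q_0^{(n+1)} = q_0^{(n)}\, \frac{\varphi^{(n)}}{\varphi_1}. \]

Step 3: The reconstructed source is given by $q_0^{(n+1)}$.

8.2 OS-EM method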
The success of the EM algorithm has been well known [43-45] and significantly magnified by the ordered-subset (OS) technique [46-50]. The OS technique is also commonly called the block-iterative method [51]. Here, we derive the OS version of the EM algorithm for the (BLT(P)) problem. The OS version can be formulated within the framework of the incremental subgradient method studied in [52].

Recall that the measurement is taken on some disjoint closed connected parts $\Gamma_j \subset \Gamma$ for $j = 1, \ldots, J$. Let $\Gamma_P = \bigcup_{j=1}^{J} \Gamma_j$. Let
\[ b_j = b|_{\Gamma_j}, \tag{8.21} \]
be the parts of $b$ on $\Gamma_j$, which constitute an ordered subset of the data $b$. One may choose other ordered subsets for the data $b$. The log likelihood function can then be written as
\[ F[q_0] = \sum_{j=1}^{J} \int_{\Gamma_j} \left\{ b_j \log A_{\Gamma_P}[q_0] - A_{\Gamma_P}[q_0] \right\} d\Gamma. \tag{8.22} \]
To obtain the OS version of the EM formula (8.11), we note that the EM formula (8.11) is a one-step steepest descent iteration:
\[ q_0^{(n+1)} = q_0^{(n)} + \lambda^{(n)} \cdot F'[q_0^{(n)}] \tag{8.23} \]
with $\lambda^{(n)} = q_0^{(n)} / \varphi_1$. Let
\[ F_j[q_0] = \int_{\Gamma_j} \left\{ b_j \log A_{\Gamma_P}[q_0] - A_{\Gamma_P}[q_0] \right\} d\Gamma, \tag{8.24} \]
for $j = 1, \ldots, J$. For $n \ge 0$, let $[n] = n \pmod{J} + 1$. The incremental subgradient method for solving the optimization problem (8.3) is
\[ q_0^{(n+1)} = q_0^{(n)} + \lambda^{(n)} \cdot F'_{[n]}[q_0^{(n)}]. \tag{8.25} \]
For the OS version, we need to find the Fréchet derivative of $F_j$ for $j = 1, \ldots, J$ and the relevant relaxation parameter $\lambda^{(n)}$. As in the previous section, we first assume that $q_0 > 0$ is a maximizer of $F$. Let
\[ f(t) = F_j[q_0 + tv], \quad \text{for } t \text{ around } 0, \]
where $v$ is an arbitrary bounded function of $L^2(\Omega)$, and compute
\[ \frac{d}{dt} f(t) \Big|_{t=0} = \int_{\Gamma_j} \left\{ \frac{b_j}{A_{\Gamma_P}[q_0]} - 1 \right\} A_{\Gamma_P}[v]\, d\Gamma. \tag{8.26} \]
Let $R_j : L^2(\Gamma_P) \to L^2(\Gamma_P)$ be defined by
\[ R_j[p](x) = \begin{cases} p(x), & x \in \Gamma_j, \\ 0, & x \notin \Gamma_j. \end{cases} \tag{8.27} \]
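Then
\[ \frac{d}{dt} f(t) \Big|_{t=0} = \int_{\Gamma_P} R_j \left[ \frac{b}{A_{\Gamma_P}[q_0]} - 1 \right] A_{\Gamma_P}[v]\, d\Gamma = \int_{\Omega} A_{\Gamma_P}^* \left\{ R_j \left[ \frac{b}{A_{\Gamma_P}[q_0]} - 1 \right] \right\} v\, dx. \]
Hence, the Fréchet derivative of $F_j$ is
\[ F_j'[q_0] = A_{\Gamma_P}^* \left\{ R_j \left[ \frac{b}{A_{\Gamma_P}[q_0]} - 1 \right] \right\} \in L^2(\Omega). \tag{8.28} \]
Therefore, the formula (8.25) can be written as
\[ q_0^{(n+1)} = q_0^{(n)} + \lambda^{(n)} \cdot A_{\Gamma_P}^* \left\{ R_{[n]} \left[ \frac{b}{A_{\Gamma_P}[q_0^{(n)}]} - 1 \right] \right\}. \tag{8.29} \]
Let $\varphi_{1,j} = A_{\Gamma_P}^* \{ R_j[1] \} = T_{\Gamma_P} \{ R_j[1] \}$, i.e., the solution of the following MBVP by (6.7) and the definition (8.27) of $R_j$:
\[ L[\varphi_{1,j}] = 0, \quad \text{in } \Omega, \tag{8.30} \]
\[ \gamma_0[\varphi_{1,j}] = 1, \quad \text{on } \Gamma_j, \tag{8.31} \]
\[ \gamma_0[\varphi_{1,j}] = 0, \quad \text{on } \Gamma_P \setminus \Gamma_j, \tag{8.32} \]
\[ \gamma_0[\varphi_{1,j}] + 2\gamma_1[\varphi_{1,j}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P \tag{8.33} \]
for $j = 1, \ldots, J$. It follows from the maximum principle of elliptic partial differential equations that $0 < \varphi_{1,j} < \varphi_1 < 1$ [36]. Let
\[ \lambda^{(n)} = \frac{q_0^{(n)}}{\varphi_{1,[n]}}. \tag{8.34} \]
Then, we obtain the following OS-EM formula:
\[ q_0^{(n+1)} = \frac{1}{\varphi_{1,[n]}}\, q_0^{(n)} \cdot T_{\Gamma_P} \left\{ R_{[n]} \left[ \frac{b}{A_{\Gamma_P}[q_0^{(n)}]} \right] \right\}. \tag{8.35} \]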
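The structure of (8.35) can be sketched in finite dimensions by letting each ordered subset be a block of rows of a matrix. As before, the matrix `A`, the row blocks in `subsets` and the handling of cells with zero subset sensitivity (they are simply left unchanged here) are assumptions of this illustration and not part of the continuous formulation.

```python
import numpy as np

def os_em(A, b, q0, subsets, n_iter):
    """Discrete OS-EM sketch: one multiplicative update per row subset, cycled in order."""
    q = np.asarray(q0, dtype=float).copy()
    J = len(subsets)
    for n in range(n_iter):
        rows = subsets[n % J]                  # zero-based form of [n] = n (mod J) + 1
        A_s, b_s = A[rows], b[rows]
        sens = A_s.sum(axis=0)                 # analogue of phi_{1,[n]} = T{R_[n][1]}
        back = A_s.T @ (b_s / (A_s @ q))       # analogue of T{R_[n][b / A[q]]}
        factor = np.ones_like(q)               # zero-sensitivity cells stay unchanged
        seen = sens > 0
        factor[seen] = back[seen] / sens[seen]
        q = q * factor
    return q

rng = np.random.default_rng(3)
A = rng.random((12, 6)) + 0.1                  # hypothetical positive forward matrix
b = A @ (rng.random(6) + 0.5)
subsets = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

q_os = os_em(A, b, np.ones(6), subsets, n_iter=7)
q_em = os_em(A, b, np.ones(6), [np.arange(12)], n_iter=1)   # single subset: plain EM step
```

With a single subset containing all rows, one pass of `os_em` coincides with one plain EM update, mirroring the way (8.35) reduces to (8.11).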
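Based on the above analysis, we can formulate the OS-EM algorithm as follows.

Algorithm 8.2. OS-EM Algorithm for BLT:

Step 1: Initialization.
1. Solve the following MBVP:
\[ L[w_1] = 0, \quad \text{in } \Omega, \tag{8.36} \]
\[ \gamma_0[w_1] = g^- + 2g, \quad \text{on } \Gamma_P, \tag{8.37} \]
\[ \gamma_0[w_1] + 2\gamma_1[w_1] = g^-, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.38} \]
Set $b = \gamma_1[w_1]|_{\Gamma_P} + g$. When $g^- = 0$, $b = \gamma_1[w_1] + \frac{1}{2}\gamma_0[w_1]$, restricted to $\Gamma_P$.
2. Choose an initial guess $q_0^{(0)}$.

Step 2: For $n \ge 0$, do the following iteration until the convergence criteria are satisfied.
1. Solve the following MBVP:
\[ L[w_2^{(n)}] = q_0^{(n)}, \quad \text{in } \Omega, \tag{8.39} \]
\[ \gamma_0[w_2^{(n)}] = 0, \quad \text{on } \Gamma_P, \tag{8.40} \]
\[ \gamma_0[w_2^{(n)}] + 2\gamma_1[w_2^{(n)}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.41} \]
2. Set
\[ p^{(n)} = R_{[n]} \left[ \frac{b}{-\gamma_1[w_2^{(n)}]} \right]. \tag{8.42} \]
When $g^- = 0$,
\[ p^{(n)} = R_{[n]} \left[ \frac{\gamma_1[w_1] + \frac{1}{2}\gamma_0[w_1]}{-\gamma_1[w_2^{(n)}]} \right]. \tag{8.43} \]
3. Solve the following MBVP:
\[ L[\varphi^{(n)}] = 0, \quad \text{in } \Omega, \tag{8.44} \]
\[ \gamma_0[\varphi^{(n)}] = p^{(n)}, \quad \text{on } \Gamma_{[n]}, \tag{8.45} \]
\[ \gamma_0[\varphi^{(n)}] = 0, \quad \text{on } \Gamma_P \setminus \Gamma_{[n]}, \tag{8.46} \]
\[ \gamma_0[\varphi^{(n)}] + 2\gamma_1[\varphi^{(n)}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.47} \]
4. Set
\[ q_0^{(n+1)} = q_0^{(n)}\, \frac{\varphi^{(n)}}{\varphi_{1,[n]}}. \]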
Step 3: The reconstructed source is given by $q_0^{(n+1)}$.

Comparing with the EM formula (8.11), we would like to make the following remarks:
1. The OS-EM formula reduces to (8.11) when the datum $b$ is not ordered into subsets.
2. Because $\varphi_{1,[n]} = 0$ on $\Gamma_P \setminus \Gamma_{[n]}$, the quotient in (8.35) does not make sense there when $q_0^{(n)} \neq 0$. This can be remedied by requiring that the source to be reconstructed is of compact support and adopting the convention for the EM algorithm: $0/0 = 0$ in (8.35);
3. Because 0 < ij <(j>i
8.3 Constrained Landweber method
Assume that we have some prior knowledge about the source represented as a convex set:
\[ C = \{ q_0 : q_0 \text{ satisfies some convex constraints} \}, \tag{8.48} \]
which is a closed convex subset of $L^2(\Omega)$. For example, nonnegativity is a convex constraint. Other forms of constraints will be discussed in §9. By (5.21), $q_0$ satisfies the following conditions:
\[ A_{\Gamma_P}[q_0] = N_{\Gamma_P}[g^- + 2g] + g, \quad \text{and} \quad q_0 \in C. \tag{8.49} \]
Let $P_C$ be the orthogonal projection operator from $L^2(\Omega)$ to $C$. Then, the constrained Landweber scheme or projected Landweber scheme is given as follows:
\[ q_0^{(n+1)} = P_C \left\{ q_0^{(n)} + \lambda_n A_{\Gamma_P}^* \left[ b - A_{\Gamma_P}[q_0^{(n)}] \right] \right\}, \tag{8.50} \]
where $\lambda_n$ is a relaxation parameter. The convergence property of the constrained Landweber scheme was studied in [53,54] and improved in [38,55]. The limit of $q_0^{(n)}$ is a solution to the constrained least-squares problem
\[ \operatorname*{argmin}_{q_0 \in C} \frac{1}{2} \left\| b - A_{\Gamma_P}[q_0] \right\|^2_{L^2(\Gamma_P)}. \tag{8.51} \]
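The projected Landweber update (8.50) has a direct matrix analogue, sketched below with nonnegativity as the convex constraint. The matrix `A`, the constant step `lam` chosen as $1/\|A\|^2$ (inside the usual range $0 < \lambda < 2/\|A\|^2$) and the iteration count are assumptions of this example.

```python
import numpy as np

def project_nonnegative(q):
    """Orthogonal projection onto the convex cone {q >= 0}."""
    return np.maximum(q, 0.0)

def projected_landweber(A, b, q0, lam, n_iter, project=project_nonnegative):
    """Iterate q <- P_C(q + lam * A^T (b - A q)) with a constant relaxation parameter."""
    q = project(np.asarray(q0, dtype=float))
    for _ in range(n_iter):
        q = project(q + lam * (A.T @ (b - A @ q)))
    return q

rng = np.random.default_rng(2)
A = rng.random((30, 10))                       # hypothetical forward matrix
b = A @ rng.random(10)                         # consistent data from a nonnegative source
lam = 1.0 / np.linalg.norm(A, 2) ** 2          # step inside (0, 2 / ||A||^2)

q = projected_landweber(A, b, np.zeros(10), lam, n_iter=500)
residual = np.linalg.norm(b - A @ q)
```

Other convex constraints can be tried by passing a different `project` callable, provided it is the orthogonal projection onto a closed convex set.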
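The precise condition for the relaxation depends on the operator norm $\|A_{\Gamma_P}^* A_{\Gamma_P}\| = \|A_{\Gamma_P}\|^2$, e.g., $0 < \epsilon \le \lambda_n \le \frac{2}{\|A_{\Gamma_P}\|^2} - \epsilon$. Please refer to [38, 53-55] for more details. The algorithm can be formulated as follows.

Algorithm 8.3. Constrained Landweber BLT Algorithm.

Step 1: Initialization.
1. Choose an appropriate relaxation strategy $\{\lambda_n\}$.
2. Solve the following MBVP:
\[ L[w_1] = 0, \quad \text{in } \Omega, \tag{8.52} \]
\[ \gamma_0[w_1] = g^- + 2g, \quad \text{on } \Gamma_P, \tag{8.53} \]
\[ \gamma_0[w_1] + 2\gamma_1[w_1] = g^-, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.54} \]
Set $b = \gamma_1[w_1]|_{\Gamma_P} + g$. When $g^- = 0$, $b = \gamma_1[w_1] + \frac{1}{2}\gamma_0[w_1]$, restricted to $\Gamma_P$.
3. Choose an initial guess $q_0^{(0)}$.

Step 2: For $n \ge 0$, do the following iteration until the convergence criteria are satisfied.
1. Solve the following MBVP:
\[ L[w_2^{(n)}] = q_0^{(n)}, \quad \text{in } \Omega, \tag{8.55} \]
\[ \gamma_0[w_2^{(n)}] = 0, \quad \text{on } \Gamma_P, \tag{8.56} \]
\[ \gamma_0[w_2^{(n)}] + 2\gamma_1[w_2^{(n)}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.57} \]
2. Set $p^{(n)} = \left( b + \gamma_1[w_2^{(n)}] \right)\big|_{\Gamma_P}$. When $g^- = 0$, $p^{(n)} = \gamma_1[w_1] + \frac{1}{2}\gamma_0[w_1] + \gamma_1[w_2^{(n)}]$, restricted to $\Gamma_P$.
3. Solve the following MBVP:
\[ L[\varphi^{(n)}] = 0, \quad \text{in } \Omega, \tag{8.58} \]
\[ \gamma_0[\varphi^{(n)}] = p^{(n)}, \quad \text{on } \Gamma_P, \tag{8.59} \]
\[ \gamma_0[\varphi^{(n)}] + 2\gamma_1[\varphi^{(n)}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.60} \]
4. Set $\tilde{q}_0^{(n+1)} = q_0^{(n)} + \lambda_n \varphi^{(n)}$.
5. Compute the projection $q_0 = P_C[\tilde{q}_0^{(n+1)}]$. Set $q_0^{(n+1)} = q_0$.

Step 3: The reconstructed source is given by $q_0^{(n+1)}$.

8.4 OS constrained Landweber method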
Similarly, the OS version of the constrained Landweber method can be formulated as follows:
\[ q_0^{(n+1)} = P_C \left\{ q_0^{(n)} + \lambda_n A_{\Gamma_P}^* \left\{ R_{[n]} \left[ b - A_{\Gamma_P}[q_0^{(n)}] \right] \right\} \right\}, \tag{8.61} \]
where $\lambda_n$ is a relaxation parameter. The convergence property of this OS constrained Landweber scheme was studied in [56]. The algorithm can be formulated as follows.
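Algorithm 8.4. OS Constrained Landweber BLT Algorithm.

Step 1: Initialization.
1. Choose an appropriate relaxation strategy $\{\lambda_n\}$.
2. Solve the following MBVP:
\[ L[w_1] = 0, \quad \text{in } \Omega, \tag{8.62} \]
\[ \gamma_0[w_1] = g^- + 2g, \quad \text{on } \Gamma_P, \tag{8.63} \]
\[ \gamma_0[w_1] + 2\gamma_1[w_1] = g^-, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.64} \]
Set $b = \gamma_1[w_1]|_{\Gamma_P} + g$. When $g^- = 0$, $b = \gamma_1[w_1] + \frac{1}{2}\gamma_0[w_1]$, restricted to $\Gamma_P$.
3. Choose an initial guess $q_0^{(0)}$.

Step 2: For $n \ge 0$, do the following iteration until the convergence criteria are satisfied.
1. Solve the following MBVP:
\[ L[w_2^{(n)}] = q_0^{(n)}, \quad \text{in } \Omega, \tag{8.65} \]
\[ \gamma_0[w_2^{(n)}] = 0, \quad \text{on } \Gamma_P, \tag{8.66} \]
\[ \gamma_0[w_2^{(n)}] + 2\gamma_1[w_2^{(n)}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.67} \]
2. Set
\[ p^{(n)} = R_{[n]} \left[ b + \gamma_1[w_2^{(n)}] \right]. \tag{8.68} \]
When $g^- = 0$,
\[ p^{(n)} = R_{[n]} \left[ \gamma_1[w_1] + \tfrac{1}{2}\gamma_0[w_1] + \gamma_1[w_2^{(n)}] \right]. \tag{8.69} \]
3. Solve the following MBVP:
\[ L[\varphi^{(n)}] = 0, \quad \text{in } \Omega, \tag{8.70} \]
\[ \gamma_0[\varphi^{(n)}] = p^{(n)}, \quad \text{on } \Gamma_{[n]}, \tag{8.71} \]
\[ \gamma_0[\varphi^{(n)}] = 0, \quad \text{on } \Gamma_P \setminus \Gamma_{[n]}, \tag{8.72} \]
\[ \gamma_0[\varphi^{(n)}] + 2\gamma_1[\varphi^{(n)}] = 0, \quad \text{on } \Gamma \setminus \Gamma_P. \tag{8.73} \]
4. Set $\tilde{q}_0^{(n+1)} = q_0^{(n)} + \lambda_n \varphi^{(n)}$.
5. Compute the projection $q_0 = P_C[\tilde{q}_0^{(n+1)}]$. Set $q_0^{(n+1)} = q_0$.

Step 3: The reconstructed source is given by $q_0^{(n+1)}$.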
8.5 Relevant issues
Choice of $q_0^{(0)}$. We propose the following strategy for choosing an initial guess $q_0^{(0)}$. By Green's formula, letting $v = u$ and $p = 1$ in (6.1), we have
\[ \int_\Omega [\mu_a u - q_0]\, dx = -\int_\Gamma g\, d\Gamma. \tag{8.74} \]
If we replace $q_0$ with $q_0^{(0)}$, we obtain
\[ \int_\Omega q_0^{(0)}\, dx = \int_\Gamma g\, d\Gamma + \int_\Omega \mu_a u\, dx. \tag{8.75} \]
Note that $u \ge w_1$ by the maximum principle of elliptic partial differential equations [34]. We obtain
\[ \int_\Omega q_0^{(0)}\, dx \ge \int_\Gamma g\, d\Gamma + \int_\Omega \mu_a w_1\, dx. \tag{8.76} \]
In practical applications, the source $q_0$ is usually of compact support inside $\Omega$. Our choice of $q_0^{(0)}$ should take this fact into account. In this work, we always use a $q_0^{(0)}$ that is constant on its support $\Omega_0$, where $\Omega_0 \subset\subset \Omega$, i.e.,
\[ q_0^{(0)} = Q_0\, \chi_{\Omega_0} \tag{8.77} \]
where $Q_0$ is a constant. Based on the above analysis, we need to choose $Q_0$ and $\Omega_0$ such that
\[ Q_0 |\Omega_0| \ge \int_\Gamma g\, d\Gamma + \int_\Omega \mu_a w_1\, dx, \tag{8.78} \]
where $|\Omega_0|$ is the volume of $\Omega_0$. Because only the datum $g$ on $\Gamma_P$ is available, the final estimate is
\[ Q_0 |\Omega_0| \ge \int_{\Gamma_P} g\, d\Gamma + \int_\Omega \mu_a w_1\, dx. \tag{8.79} \]
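Once the two integrals in (8.79) have been approximated by quadrature, the smallest admissible constant $Q_0$ follows from a single division. The sketch below assumes the integrals are available as weighted sums; the function name, its arguments and the sample numbers are hypothetical.

```python
import numpy as np

def initial_source_level(g_vals, bdry_weights, mu_a_vals, w1_vals, vol_weights,
                         support_volume):
    """Smallest Q0 with Q0*|Omega_0| >= int_{Gamma_P} g dGamma + int_Omega mu_a*w1 dx."""
    flux = float(np.dot(bdry_weights, g_vals))                  # boundary quadrature of g
    absorbed = float(np.dot(vol_weights, mu_a_vals * w1_vals))  # volume quadrature of mu_a*w1
    return (flux + absorbed) / support_volume

# Toy numbers only: two boundary nodes, two volume cells, |Omega_0| = 0.7.
Q0 = initial_source_level(
    g_vals=np.array([1.0, 2.0]), bdry_weights=np.array([0.5, 0.5]),
    mu_a_vals=np.array([0.1, 0.1]), w1_vals=np.array([2.0, 4.0]),
    vol_weights=np.array([1.0, 1.0]), support_volume=0.7,
)
```

Any larger constant also satisfies (8.79); the quotient above is simply the value at which the inequality becomes an equality.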
Estimation of $\lambda_n$. The relaxation parameter for the constrained Landweber method depends on the operator norm $\|A_{\Gamma_P}^* A_{\Gamma_P}\| = \|A_{\Gamma_P}\|^2$, which is equivalent to finding the maximal eigenvalue of $A_{\Gamma_P}^* A_{\Gamma_P}$ or the maximal singular value of $A_{\Gamma_P}$. This can be reduced to a boundary eigenvalue problem of partial differential equations.

Convergence Criteria. The convergence criteria for these algorithms may include (1) when the iteration number $n$ reaches an assumed maximum number; (2) when the successive increment $\|q_0^{(n+1)} - q_0^{(n)}\|$ is smaller than an assumed error level.
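For a discretized operator, the squared norm that bounds the relaxation parameter can be estimated by power iteration, and the two stopping rules above translate into a short predicate. This is a sketch under the assumption that the operator is available as a matrix `A`; the function names and tolerances are illustrative.

```python
import numpy as np

def estimate_norm_squared(A, n_iter=500, seed=0):
    """Power iteration on A^T A; returns an estimate of ||A||^2 (largest eigenvalue)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    est = 0.0
    for _ in range(n_iter):
        w = A.T @ (A @ v)
        est = np.linalg.norm(w)          # converges to the largest eigenvalue of A^T A
        v = w / est
    return float(est)

def should_stop(q_new, q_old, n, max_iter, tol):
    """Stop when the iteration budget is used up or the increment falls below tol."""
    return (n + 1 >= max_iter) or (np.linalg.norm(q_new - q_old) < tol)

rng = np.random.default_rng(4)
A = rng.random((20, 8))                  # hypothetical discretized operator
norm_sq = estimate_norm_squared(A)
lam = 1.0 / norm_sq                      # a conservative choice below 2 / ||A||^2
```

Since power iteration approaches the largest eigenvalue from below, leaving a safety margin in the step (as in the choice of `lam` above) avoids overshooting the admissible range.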
9 Discussions
This BLT problem is ill-posed and does not have a unique solution. To obtain a physically favorable unique solution, adequate prior knowledge must be utilized. The constrained iterative approach provides a mechanism for incorporating prior knowledge as constraints and has been widely used in practice. Here we propose several iterative algorithms for BLT. For experimental results of the algorithms, please refer to [9]. The numerical results in [9] demonstrate that the image resolution of BLT would be in the millimeter order. As discussed in Remark 7.4, reconstructed results are subject to deviations in terms of intensity and support size. Our preliminary results in [9] are in accordance with the theoretical conclusion. This observation should be instructive for BLT applications, where the number of sources, their centers and the intensity features Zi are of practical value. If additional prior information about the source distribution is available, we can surely achieve a more specific BLT reconstruction. Given the difficulty that there is no unique solution to BLT in the general case by Theorem 6.2, we must restrict the solution space so that the uniqueness of the solution may be established in a specific context. We have not utilized more constraints in experiments other than the nonnegativity, but other constraints such as the source intensity, size and location may be very effective as well. Also, the requirement that the source distribution is of the form (7.2) constitutes a convex constraint C. Note that any function can be approximated by a radial combination [41]. Hence, sources of the above form represent a quite general class of sources. Given the basis functions
low scattering and low absorbing areas are present in the cerebrospinal fluid of the brain, the synovial fluid of human finger joints, or the amniotic fluid in the female uterus. Examples for highly absorbing regions in the body are hematoma or liver tissue [64]. High order spherical harmonics or solid harmonics can be applied to find approximation of the RTE [11,12,20-22]. Anisotropic scattering effects in highly scattering media were studied in [67]. The Fokker-Planck equation is used as an approximation to the RTE when scattering is sharply peaked [68,69]. For media with a varying refractive index, the corresponding RTE and its diffusion approximation were recently established in [70,71]. Efficient approximation and computation of the RTE are of importance in further investigation. Acknowledgments M. Jiang was supported in part by the National Basic Research Program of China under Grant 2003CB716101 and National Natural Science Foundation of China under Grant 60325101, 60272018 and 60372024. Yi Li was supported in part by a Xiao-Xiang Grant of Hunan Normal University and the National Natural Science Foundation of China under Grant 10471052. M. Jiang, Y. Li and G. Wang were supported in part by NIH/NIBIB grants (EB001685 and EB002667).
References [1] B. W. Rice, M. D. Cable and M. B. Nelson, "In vivo imaging of light-emitting probes," J. Biomed. Opt, vol. 6, pp. 432 - 440, 2001. [2] A. McCaffrey, M. A. Kay and C. H. Contag, "Advancing molecular therapies through in vivo bioluminescent imaging," Mol. Imaging, vol. 2, pp. 75 - 86, 2003. [3] A. Soling and N. G. Rainov, "Bioluminescence imaging in vivo application to cancer research," Expert Opin. Biol. Ther., vol. 3, pp. 1163 - 1172, 2003. [4] J. C. Wu, I. Y. Chen, G. Sundaresan, J. J. Min, A. De, J. H. Qiao, M. C. Fishbein and S. S. Gambhir, "Molecular imaging of cardiac cell transplantation in living animals using optical bioluminescence and positron emission tomography," Circulation, vol. 108, pp. 1302 - 1305, 2003. [5] C. H. Contag and B. D. Ross, "It's not just about anatomy: in vivo bioluminescence imaging as an eyepiece into biology," J. Magn. Reson. Imaging, vol. 16, pp. 378 - 387, 2002. [6] A. Rehemtulla, L. D. Stegman, S. J. Cardozo, S. Gupta, D. E. Hall, C. H. Contag and B. D. Ross, "Rapid and quantitative assessment of
cancer treatment response using in vivo bioluminescence imaging," Neoplasia, vol. 2, pp. 491 - 495, 2002.
[7] G. Wang, E. A. Hoffman and G. McLennan, "Bioluminescent CT method and apparatus," March 2003, US provisional patent application.
[8] G. Wang et al, "Development of the first bioluminescent tomography system," Radiology Suppl. (Proceedings of the RSNA), December 2003.
[9] M. Jiang and G. Wang, "Image reconstruction for bioluminescence tomography," Proceedings of SPIE, vol. 5535, pp. 335 - 351, 2004.
[10] M. Jiang and G. Wang, "Image reconstruction for bioluminescence tomography," Proceedings of the RSNA, December 2004.
[11] S. R. Arridge, "Optical tomography in medical imaging," Inverse Problems, vol. 15, pp. R41 - R93, 1999.
[12] F. Natterer and F. Wübbeling, Mathematical Methods in Image Reconstruction. Philadelphia, PA: SIAM, 2001.
[13] K. M. Case and P. F. Zweifel, Linear transport theory. Addison-Wesley Publishing Co., Reading, Mass.-London-Don Mills, Ont., 1967.
[14] C. Cercignani, The Boltzmann equation and its applications, ser. Applied Mathematical Sciences. New York: Springer-Verlag, 1988, vol. 67.
[15] S. Chandrasekhar, Radiative transfer. New York: Dover Publications Inc., 1960.
[16] A. Ishimaru, Wave Propagation and Scattering in Random Media. New York: IEEE Press, 1997.
[17] D. S. Anikonov, A. E. Kovtanyuk and I. V. Prokhorov, Transport equation and tomography, ser. Inverse and Ill-posed Problems Series. Utrecht: VSP, 2002.
[18] A. D. Klose and A. H. Hielscher, "Quasi-Newton methods in optical tomographic image reconstruction," Inverse Problems, vol. 19, no. 2, pp. 387 - 409, 2003.
[19] E. D. Aydin, C. R. E. de Oliveira and A. J. H. Goddard, "A comparison between transport and diffusion calculations using a finite element-spherical radiation transport method," Medical Physics, vol. 29, no. 9, pp. 2013 - 2023, September 2002.
[20] R. T. Ackroyd, C. R. E. de Oliveira, A. Zolfaghari and A. J. H. Goddard, "On a rigorous resolution of the transport equation into a system of diffusion-like equations," Progress in Nuclear Energy, vol. 35, pp. 1 - 64, 1999.
[21] R. T. Ackroyd, C. R. E. de Oliveira, A. Zolfaghari and A. J. H. Goddard, "On the exact resolution of the transport equation for an anisotropic scattering medium into a system of diffusive equations," Annals of Nuclear Energy, vol. 26, pp. 729 - 755, 1999.
[22] E. W. Larsen, G. Thommes, A. Klar, M. Seaid and T. Gotz, "Simplified pn approximations to the equations of radiative heat transfer and applications," Journal of Computational Physics, vol. 183, pp. 652 - 675, 2002. [23] M. Schweiger, S. R. Arridge, M. Hiraoka and D. T. Delpy, "The finite element method for the propagation of light in scattering media: Boundary and source conditions," Medical Physics, vol. 22, no. 11, pp. 1779 - 1792, September 1995. [24] D. A. Boas, D. H. Brooks, E. L. Miller, C. A. DiMarzio, M. Kilmer, R. J. Gaudette and Q. Zhang, "Imaging the body with diffuse optical tomography," IEEE Signal Processing Magazine, vol. 18, pp. 57 - 75, 2001. [25] V. A. Markel and J. C. Schotland, "Inverse problem in optical diffusion tomography. I. Fourier-Laplace inversion formulas," Journal of the Optical Society of America, A, vol. 18, no. 6, pp. 1336 - 1347, June 2001. [26]
, "Inverse problem in optical diffusion tomography. II. Role of boundary conditions," Journal of the Optical Society of America, A, vol. 19, no. 3, pp. 558 - 566, March 2002.
[27] V. A. Markel, V. Mital and J. C. Schotland, "Inverse problem in optical diffusion tomography. III. Inversion formulas and singularvalue decomposition," Journal of the Optical Society of America, A, vol. 20, no. 5, pp. 890 - 902, May 2003. [28] V. A. Markel, J. A. O'Sullivan and J. C. Schotland, "Inverse problem in optical diffusion tomography. III. Nonlinear inversion formulas," Journal of the Optical Society of America, A, vol. 20, no. 5, pp. 903 - 912, May 2003. [29] S. R. Arridge and J. C. Hebden, "Optical imaging in medicine. II. Modelling and reconstruction," Physics in Medicine and Biology, vol. 42, pp. 841 - 853, 1997. [30] V. Isakov, Inverse Problems for Partial Differential Equations, ser. Applied Mathematical Series. New York-Berlin-Heidelberg: Springer, 1998, vol. 127. [31] F. John, Partial Differential Equations.
New York: Springer, 1982.
[32] G. Wang, Y. Li and M. Jiang, "Uniqueness theorems for bioluminescent tomography," Medical Physics, vol. 31, no. 8, pp. 2289 2299, 2004.
[33] W. Rudin, Functional analysis, 2nd ed., ser. International Series in Pure and Applied Mathematics. New York: McGraw-Hill, 1991. [34] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order, ser. Grundlehren der Mathematischen Wissenschaften. Berlin-Heidelberg-New York: Springer-Verlag, 1983, vol. 224. [35] R. Dautray and J. L. Lions, Mathematical Analysis and Numerical Methods for Science and Technology. Berlin: Springer-Verlag, 1990, vol. I. [36] M. H. Protter and H. F. Weinberger, Maximum Principles in Differential Equations. Englewood Cliffs, N. J.: Prentice-Hall, 1967. [37] A. E. Badia and T. H. Duong, "Some remarks on the problem of source identification from boundary measurements," Inverse Problems, vol. 14, pp. 883 - 891, 1998. [38] A. Sabharwal and L. C. Potter, "Convexly constrained linear inverse problems: iterative least-squares and regularization," IEEE Transactions on Signal Processing, vol. 46, pp. 2345 - 2352, 1998. [39] E. A. Marengo, A. J. Devaney and R. W. Ziolkowski, "Inverse source problem and minimum-energy sources," Journal of the Optical Society of America, A, vol. 17, no. 1, pp. 34 - 45, January 2000. [40] A. E. Badia and T. H. Duong, "An inverse source problem in potential analysis," Inverse Problems, vol. 16, pp. 651 - 663, 2000. [41] M. J. D. Powell, "The theory of radial basis function approximation in 1990," in Advances in Numerical Analysis, Vol. II (Lancaster, 1990), ser. Oxford Sci. Publ. New York: Oxford Univ. Press, 1992, pp. 105-210. [42] M. D. Buhmann, Radial basis functions: Theory and implementations, ser. Cambridge Monographs on Applied and Computational Mathematics. Cambridge: Cambridge University Press, 2003, vol. 12. [43] L. A. Shepp and Y. Vardi, "Maximum likelihood restoration for emission tomography," IEEE Trans. Med. Imaging, vol. 1, pp. 113 - 122, 1982. [44] D. L. Snyder, T. J. Schulz and J. A. O'Sullivan, "Deblurring subject to nonnegativity constraints," IEEE Trans. 
Signal Processing, vol. 40, pp. 1143 - 1150, 1992. [45] G. Wang, D. L. Snyder, J. A. O'Sullivan and M. W. Vannier, "Iterative deblurring for CT metal artifact reduction," IEEE Trans. Medical Imaging, vol. 15, pp. 657 - 664, 1996.
[46] H. M. Hudson and R. S. Larkin, "Accelerated image reconstruction using ordered subsets of projection data," IEEE Trans. Medical Imaging, vol. 13, no. 4, pp. 601-609, December 1994.
[47] J. Browne and A. R. D. Pierro, "A row-action alternative to the EM algorithm for maximizing likelihood in emission tomography," IEEE Trans. Medical Imaging, vol. 15, no. 5, pp. 687-699, 1996.
[48] C. L. Byrne, "Block-iterative methods for image reconstruction from projections," IEEE Trans. on Image Processing, vol. 15, no. 5, pp. 792-794, 1996.
[49] G. Wang, G. D. Schweiger, and M. W. Vannier, "An iterative algorithm for X-ray CT fluoroscopy," IEEE Trans. Medical Imaging, vol. 17, pp. 853-856, 1998.
[50] M. Jiang and G. Wang, "Convergence studies on iterative algorithms for image reconstruction," IEEE Trans. Medical Imaging, vol. 22, no. 5, pp. 569-579, May 2003.
[51] Y. Censor and S. A. Zenios, Parallel Optimization: Theory, Algorithms, and Applications. New York: Oxford University Press, 1997.
[52] A. Nedic and D. P. Bertsekas, "Incremental subgradient methods for nondifferentiable optimization," SIAM J. on Optimization, vol. 12, pp. 109-138, 2001.
[53] B. Eicke, "Konvex-restringierte schlechtgestellte Probleme und ihre Regularisierung durch Iterationsverfahren," 1991, thesis, Technische Universität Berlin.
[54] B. Eicke, "Iteration methods for convexly constrained ill-posed problems in Hilbert space," Numer. Funct. Anal. Optim., vol. 13, pp. 413-429, 1992.
[55] M. Piana and M. Bertero, "Projected Landweber method and preconditioning," Inverse Problems, vol. 13, pp. 441-463, 1997.
[56] M. Jiang and G. Wang, "Constrained block-iterative Landweber scheme for image reconstruction," Proceedings of SPIE, vol. 5535, pp. 335-351, 2004.
[57] M. Griebel and M. A. Schweitzer, Eds., Meshfree methods for partial differential equations, ser. Lecture Notes in Computational Science and Engineering. Berlin: Springer-Verlag, 2003, vol. 26, papers from the workshop held at the Rheinische Friedrich-Wilhelms Universität Bonn, Bonn, September 11-14, 2001.
[58] M. Jiang and G. Wang, "Convergence of the simultaneous algebraic reconstruction technique (SART)," IEEE Trans. Image Processing, vol. 12, no. 8, pp. 957-961, 2003.
[59] M. Jiang and G. Wang, "Development of iterative algorithms for image reconstruction," Journal of X-ray Science and Technology, vol. 10, no. 1-2, pp. 77-86, 2002, invited review.
[60] L. M. Bregman, Y. Censor, S. Reich, and Y. Zepkowitz-Malachi, "Finding the projection of a point onto the intersection of convex sets via projections onto half-spaces," J. Approx. Theory, vol. 124, no. 2, pp. 194-218, 2003.
[61] S. Firbank, S. R. Arridge, M. Schweiger, and D. T. Delpy, "An investigation of light transport through scattering bodies with nonscattering regions," Physics in Medicine and Biology, vol. 41, pp. 767-783, 1996.
[62] F. Martelli, D. Contini, A. Taddeucci, and G. Zaccanti, "Photon migration through a turbid slab described by a model based on diffusion approximation. 2. Comparison with Monte Carlo results," Applied Optics, vol. 36, no. 19, pp. 4600-4612, Jul 1997.
[63] A. H. Hielscher, R. E. Alcouffe, and R. L. Barbour, "Comparison of finite-difference transport and diffusion calculations for photon migration in homogeneous and heterogeneous tissues," Physics in Medicine and Biology, vol. 43, pp. 1285-1302, 1998.
[64] A. D. Klose and A. H. Hielscher, "Iterative reconstruction scheme for optical tomography based on the equation of radiative transfer," Medical Physics, vol. 26, no. 8, pp. 1698-1707, 1999.
[65] A. D. Klose, V. Ntziachristos, and A. H. Hielscher, "The inverse source problem based on the radiative transfer equation in optical molecular imaging," Journal of Computational Physics, vol. 202, no. 1, pp. 323-345, 2005.
[66] H. Dehghani, D. T. Delpy, and S. R. Arridge, "Photon migration in non-scattering tissue and the effects on image reconstruction," Physics in Medicine and Biology, vol. 44, pp. 2897-2906, 1999.
[67] J. Heino, S. Arridge, J. Sikora, and E. Somersalo, "Anisotropic effects in highly scattering media," Physical Review E, vol. 68, 2003, Art. No. 031908637.
[68] A. D. Kim and J. B. Keller, "Light propagation in biological tissue," Journal of the Optical Society of America, A, vol. 20, no. 1, pp. 92-98, January 2003.
[69] A. D. Kim, "Transport theory for light propagation in biological tissue," Journal of the Optical Society of America, A, vol. 21, no. 5, pp. 820-827, May 2004.
[70] J. M. Tualle and E. Tinet, "Derivation of the radiative transfer equation for scattering media with a spatially varying refractive index," Optics Communications, vol. 228, no. 1-3, pp. 33-38, Dec 2003.
[71] M. L. Shendeleva, "Radiative transfer in a turbid medium with a varying refractive index: comment," Journal of the Optical Society of America, A, vol. 21, no. 12, pp. 2464-2467, Dec 2004.
Global Dynamic Properties of Protein Networks

Fangting Li, Ying Lu, Tao Long, Qi Ouyang
Center for Theoretical Biology, Department of Physics, Peking University, Beijing 100871, China. E-mail: [email protected]

Chao Tang
California Institute for Quantitative Biomedical Research, Departments of Biopharmaceutical Sciences and Biochemistry and Biophysics QB3, University of California, San Francisco, USA. E-mail: [email protected]

Abstract
The interactions between proteins, DNA, and RNA in living cells constitute molecular networks that govern various cellular functions. To investigate the dynamical properties and stabilities of such networks, we studied the cell-cycle and the life-cycle networks of the budding yeast. With the use of a simple dynamical model, it was demonstrated that the cell-cycle network is extremely stable and robust. The biological state (the G1 stationary state) is a global attractor of the dynamics, attracting almost all initial protein states. Furthermore, the biological pathway (the cell-cycle sequence of protein states) is a globally attracting trajectory of the dynamics. These properties are largely preserved with respect to small perturbations to the network. Similar findings were obtained for the life-cycle network of the budding yeast. These results suggest that cellular protein networks are robustly designed for their functions.
1 Introduction
Despite the complex environment in and outside of the cell, the underlying bio-molecular networks carry out various cellular functions reliably. How is the stability of a cell state achieved? How can a biological pathway take the cell from one state to another reliably? Evolution must have played a crucial role in the selection of the architectures of these networks to have such a remarkable property. Much attention has recently
been focused on the "topological" properties of large-scale networks [1-5]. It was argued that a power-law distribution of connectivity, which is apparent in some bionetworks [2,4], is more tolerant of random failure [1]. Here we address this question from a dynamical-systems point of view. We study the networks regulating the cell cycle and the life cycle of the budding yeast, and investigate their global dynamical properties and stabilities. We find that the stability of a cell state is achieved by the state being a global attractor of the dynamics: almost all initial protein states flow to the biological stationary states. The reliability of a biological pathway is achieved by the pathway being an attracting trajectory of the dynamics: the flows of the protein states to a final biological stationary state converge onto the trajectory corresponding to the biological pathway.

In the next section, we discuss the network of the cell cycle in the budding yeast Saccharomyces cerevisiae and identify the principal players in the network dynamics. This work allows us to build a simple protein network for the budding yeast cell cycle. We then introduce our model to study the dynamic properties of the network, and compare our results with experimental observations. In Section 4, we compare the dynamics of the cell-cycle network with that of random networks; this comparison reveals the specificity of biological networks. Finally, we give another example of this dynamic study to show that the dynamic properties found in this study could be universal in molecular biological networks.
2 The cell cycle network in yeast
The cell cycle process, by which one cell grows and divides into two daughter cells, is a vital biological process, the regulation of which is highly conserved among the eukaryotes [6]. The process consists of four phases: G1 (in which the cell grows and, under appropriate conditions, commits to division), S (in which the DNA is synthesized and the chromosomes replicated), G2 (a "gap" between S and M), and M (in which chromosomes are separated and the cell is divided into two). After the M phase, the cell enters the G1 phase, hence completing a "cycle". The process has been studied in great detail in the budding yeast Saccharomyces cerevisiae, a single-cell model eukaryotic organism. There are about 800 genes involved in the cell cycle process of the budding yeast [7]. However, the number of key regulators that are responsible for the control and regulation of this complex process is much smaller. Based on extensive literature studies, we have constructed a network of the key regulators that are known so far, as shown in Fig. 1A.
Figure 1 (A) Cell-cycle network. (B) Simplified cell-cycle network. (C) Dynamical trajectories of the 1986 protein states (green nodes) flowing to the G1 fixed point (blue node). Arrows between states indicate the direction of dynamic flow from one state to another. The cell cycle sequence is colored blue. The size of a node and the thickness of an arrow are proportional to the logarithm of the traffic flow passing through them.

There are four classes of members in this regulatory network: cyclins (Cln1,2,3 and Clb1,2,5,6, which bind to the kinase Cdc28), the inhibitors, degraders, and competitors of the cyclin/Cdc28 complexes (Sic1, Cdh1, Cdc20, Cdc14), transcription factors (SBF, MBF, Mcm1/SFF, Swi5, Ace2), and checkpoints (cell size, and the DNA replication and damage checkpoints). The green arrows in Fig. 1A represent positive regulations. For example, under rich nutrient conditions and when the cell grows large enough, the Cln3/Cdc28 will be "activated", which in turn activates (by
phosphorylation) a pair of transcription factor groups SBF and MBF, which transcribe the genes of the cyclins Cln1,2 and Clb5,6, respectively. Red arrows in Fig. 1A represent "deactivations" (inhibition, repression, or degradation). For example, the protein Sic1 can bind to the Clb/Cdc28 complexes to inhibit their functions, Clb1,2 phosphorylates Swi5 to prevent its entry into the nucleus, and Cdh1 targets Clb1,2 for degradation. The cell-cycle sequence starts when the cell commits to division by activating Cln3, driving the cell into the "excited" G1 state. The subsequent activation of Clb5,6 drives the cell into the S phase. The entry into and exit from the M phase are controlled by the activation and degradation of Clb1,2. After the M phase, the cell comes back to the stationary G1 phase, waiting for the signal for another round of division. Thus the cell cycle process starts with the "excitation" from the stationary G1 state by the signal and evolves back to the stationary G1 state through a well-defined sequence of states.
3 Model and simulation of the cell cycle network
In principle, the arrows in the network have very different time scales of action, and a dynamic model would involve various binding constants and rates [8,9]. However, since we are mainly concerned here with the overall dynamic properties and the stability of the network, we use a simplified dynamics on the network, which treats the nodes and arrows as logic-like operations¹. Thus, in the model each node $i$ has only two states, $S_i = 1$ and $S_i = 0$, representing the active and the inactive states of the protein, respectively. The protein states in the next time step are determined by the protein states in the present time step via the following rule²:

$$S_i(t+1) = \begin{cases} 1, & \sum_j a_{ij} S_j(t) > 0, \\ 0, & \sum_j a_{ij} S_j(t) < 0, \end{cases} \qquad (3.1)$$

where $a_{ij} = 1$ for a green arrow from protein $j$ to protein $i$ and $a_{ij} = -\infty$ (or any large negative number) for a red arrow from $j$ to $i$.

¹ This is by no means automatically a good approximation. In particular, making the time constants of all arrows the same could have disastrous consequences for the network dynamics. However, we are saved for this particular network because of the intrinsic sequential nature of the dynamics. We have tested the dynamics with varied time scales of action for different arrows and obtained similar results.
² Except at the start, where Cln3 is activated by Swi5 (or Ace2) and the cell size signal.

With the discrete dynamics of Eq. (3.1), the network itself can be simplified without changing its overall dynamic properties. Since simplification also depends on the biological details of each node and link, we do not follow any exact rule but some general guidelines:
(1) A node with only incoming arrows can be deleted, for its state is determined by the state of other nodes and it cannot affect any other node.
(2) A node with only outgoing arrows can be deleted or combined with other nodes. For example, in Fig. 1B, we have removed two checkpoints, assuming the system will always pass the "tests" at the two checkpoints.
(3) A node with only one incoming and one outgoing red arrow is deleted, and the two red arrows are replaced by one green arrow.
(4) A linear chain of nodes connected by unidirectional green arrows is combined into one node.
After simplification, the main logic relationships between nodes are kept, while the network is much simpler to simulate. We have checked that the main dynamic features, such as the major fixed points and the properties of the major pathways, are consistent between the original and the simplified networks. The simplified cell-cycle network is shown in Fig. 1B with 11 nodes (plus a signal node). We have also added "self-degradation" (yellow loops) to those nodes that are not negatively regulated by others. The degradation is modelled as a time-delayed interaction: if a protein with a yellow self-arrow is activated at time $t$ ($S_i(t) = 1$) and if it receives no further positive inputs from $t + 1$ to $t + t_d$, it will be degraded at time $t + t_d$, i.e., $S_i(t + t_d) = 0$³.

³ For simplicity, we use the same lifetime $t_d = 4$ for all proteins with a self-loop. The results are essentially the same for $t_d = 2$.

Using the dynamic model described above, we first start at the beginning of the cell cycle by "exciting" the G1 stationary state with the cell size signal, and observe the system return to the G1 stationary state. The temporal evolution of the protein states, presented in Table 1, indeed follows the cell-cycle sequence observed in the experiments. The system goes from the excited G1 state to the S phase, the G2 phase, the M phase, and finally to the stationary G1 state. This is the biological trajectory or pathway of the cell cycle network. We thus conclude that our model captures the major features of the dynamics of the cell cycle network.

Table 1 Temporal evolution of protein states for the simplified cell-cycle network of Fig. 1B. The right column indicates the cell cycle phase. Note that the number of time steps in each phase does not reflect its actual duration. Also note that while the on/off of certain nodes sets the start or end of certain cell-cycle phases, for other nodes the precise duration for which they are turned on or off does not have an absolute meaning in this simple model.

Step  Cln3  SBF  MBF  Cln1,2  Cdh1  Swi5  Cdc20  Clb5,6  Sic1  Clb1,2  Mcm1  Phase
1     1     0    0    0       1     0     0      0       1     0       0     Excited G1
2     1     1    1    0       1     0     0      0       1     0       0     G1
3     1     1    1    1       1     0     0      0       1     0       0     G1
4     1     1    1    1       0     0     0      0       0     0       0     G1
5     0     1    1    1       0     0     0      1       0     0       0     S
6     0     1    1    1       0     0     0      1       0     1       1     G2
7     0     0    0    1       0     0     1      1       0     1       1     M
8     0     0    0    1       0     0     1      0       0     1       1     M
9     0     0    0    0       0     0     1      0       0     1       1     M
10    0     0    0    0       1     0     1      0       0     1       1     M
11    0     0    0    0       1     0     1      0       0     0       1     M
12    0     0    0    0       1     1     1      0       1     0       1     M
13    0     0    0    0       1     1     1      0       1     0       0     M
14    0     0    0    0       1     1     0      0       1     0       0     M
15    0     0    0    0       1     0     0      0       1     0       0     Ground G1

Next, we study the attractors of the network dynamics by starting from each of the $2^{11} = 2048$ initial states in the 11-node network of Fig. 1B. We find that all of the initial states eventually flow into one of 7 stationary states (fixed points), shown in Table 2. Among the 7 fixed points, there is a global fixed point that attracts 1973, or over 96%, of the protein states. Remarkably, this super-stable state is the biological G1 stationary state. The advantage for a cell's stationary state to be a big attractor of the network is obvious: the stability of the cell state is guaranteed.

To investigate the dynamical stability of the biological pathway, we study how the 1973 initial protein states flow to their final attractor, the G1 stationary state. In Fig. 1C, each of these protein states is represented by a green dot, with the arrows between them indicating dynamic flows from one state to another. The biological pathway is labelled in blue. We see that the dynamic flow of protein states converges onto the biological pathway, making the pathway an attracting trajectory of the dynamics. With such a topological structure of the phase diagram of protein states, the cell cycle pathway is a very stable trajectory: it is very unlikely for a sequence of events, starting at the beginning of the cell cycle process, to deviate from the cell cycle pathway.
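The threshold rule of Eq. (3.1) and the exhaustive scan over all initial states can be illustrated with a minimal sketch. It uses a hypothetical three-node network, not the yeast network of Fig. 1B, whose wiring is given only in the figure. The names `WEIGHTS`, `RED`, `step`, `attractor_of` and `basin_sizes` are introduced here for illustration. Two further assumptions are made: a node whose total input is exactly zero keeps its current state, and the time-delayed self-degradation is left out.

```python
from collections import Counter
from itertools import product

# Hypothetical 3-node toy network (an assumption, NOT the network of Fig. 1B).
# WEIGHTS[i][j] plays the role of a_ij: +1 for a green (activating) arrow
# j -> i, RED for a red (inhibiting) arrow j -> i, and 0 for no arrow.
RED = -100  # stands in for "any large negative number"
WEIGHTS = [
    [0, 0, RED],  # node 0 is inhibited by node 2
    [1, 0, 0],    # node 1 is activated by node 0
    [0, 1, 0],    # node 2 is activated by node 1
]


def step(state, weights=WEIGHTS):
    """Apply one synchronous update of the threshold rule to all nodes."""
    new_state = []
    for i, row in enumerate(weights):
        total = sum(a * s for a, s in zip(row, state))
        if total > 0:
            new_state.append(1)
        elif total < 0:
            new_state.append(0)
        else:
            # Assumption of this sketch: zero total input keeps the state.
            new_state.append(state[i])
    return tuple(new_state)


def attractor_of(state, weights=WEIGHTS):
    """Follow the trajectory until a state repeats; return the attractor set."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, weights)
    # Everything from the first repeated state onward lies on the attractor.
    return frozenset(seen[seen.index(state):])


def basin_sizes(weights=WEIGHTS):
    """Count, for every attractor, how many initial states flow into it."""
    n = len(weights)
    return Counter(attractor_of(s, weights) for s in product((0, 1), repeat=n))
```

For this toy network the scan over all $2^3 = 8$ states finds three fixed points, one of which, (0, 1, 1), attracts 6 of the 8 states. The same loop applied to an 11-node weight matrix would enumerate the $2^{11}$ states discussed above.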
Table 2 The fixed points of the cell cycle network. Each fixed point is represented in a row. The left column is the size of the basin of attraction of the fixed point; the other 11 columns show the protein states of the fixed point. The protein states of the biggest fixed point correspond to those of the G1 stationary state. The on or off does not have an absolute meaning in this simple model.

Basin  MBF  SBF  Cln3  Cln1,2  Clb5,6  Sic1  Cdh1  Clb1,2  Cdc20  Swi5  Mcm1
1973   0    0    0     0       0       1     1     0       0      0     0
32     0    1    0     1       0       0     0     0       0      0     0
20     1    0    0     0       0       1     1     0       0      0     0
11     0    0    0     0       0       1     0     0       0      0     0
9      0    0    0     0       0       0     0     0       0      0     0
2      1    0    0     0       0       1     0     0       0      0     0
1      0    0    0     0       0       0     1     0       0      0     0

4 Comparison with random networks
To investigate how likely it is that a big fixed point and a converging pathway arise by chance, we study an ensemble of random networks that have the same numbers of nodes and of links of each color as the cell cycle network. We find that random networks typically have many more attractors (fixed points and limit cycles), with the average number being 20.65. The sizes of the basins of attraction in the random networks are typically small, with a distribution having a power-law decay, as shown in Fig. 2A. The probability for a random network to have an attractor with a basin size $B$ equal to or larger than that of the cell cycle network ($B \ge 1973$) is 0.02.

To quantify the "convergence" of pathways, we define a quantity $w_n$ for each protein state $n$ that measures the overlap of its trajectory with all other trajectories. Denote by $T_{j,k}$ the total traffic flow through the arrow $A_{j,k}$ that takes protein state $j$ to $k$ in one time step, i.e., $T_{j,k}$ is the total number of trajectories, starting from all protein states, that pass through $A_{j,k}$. If the trajectory from $n$ to its attractor has $L_n$ steps, so that it consists of $L_n$ arrows $A_{k-1,k}$, $k = 1, 2, \ldots, L_n$, then $w_n = \sum_{k=1}^{L_n} T_{k-1,k} / L_n$. The overall overlap of all trajectories in a network can be measured by $W = \langle w_n \rangle$, where the average is over all protein states. The normalized histogram of $w_n$ for all protein states is shown in Fig. 2B, for both the cell cycle and the random networks. Without any significant overlap or convergence of trajectories, the random networks have their $w$-distribution peaked at small $w$, with an average $W = 340$. For the cell cycle network, in contrast, the distribution is peaked at very large numbers ($W = 1230$), indicating a significant convergence of trajectories. The probability for a random network to have $W > 1230$ is 0.057. However, the probability that a random network has both $B \ge 1973$ and $W > 1230$ is less than 0.0001.

Figure 2 Comparison of the cell-cycle network (cell-cycle) with the random networks (random). (A) Attractor size distribution of random networks. (B) $w$-value distribution. (C) The relative stability of the biggest fixed point with respect to network perturbations. (D) The relative change of the overall convergence of trajectories with respect to network perturbations. (To generate statistics for random networks, 10000 random networks were used in A and B, and 1000 random networks were used in C and D.)

We see that the cell cycle network has two distinct dynamic properties compared with random networks: it has a super fixed point and it has a converging pathway. What effects would perturbations of the network have on these properties? We perturbed a network by deleting an interaction arrow, adding a green or red arrow between nodes that are not linked by an arrow, or switching a green arrow to red and vice versa. The change in the size of the basin of attraction of the biggest attractor, $\Delta B$, and the change in $W$, $\Delta W$, were then measured as a result of the perturbation. The distributions of $\Delta B/B$ and $\Delta W/W$ are plotted in Fig. 2C and Fig. 2D, respectively, for the cell cycle network along with those of random networks. We observe that about 80% of the perturbations of the cell cycle network change these two quantities by less than half. However, this particular property seems to be generic: it is shared by random networks.
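The overlap measure $w_n$ and its average $W$ defined in this section can be computed directly from the one-step map of a network. The sketch below is illustrative only: the dictionary `succ` (state to next state) and the function names are assumptions, and a state that already lies on its attractor ($L_n = 0$) is given $w_n = 0$, a convention the text does not specify.

```python
from collections import Counter


def path_to_attractor(start, succ):
    """Return the arrows (j, k) on the trajectory from `start` to its attractor."""
    seen = {}
    path = []
    state = start
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = succ[state]
    # `state` is the first repeated state, i.e. where the attractor is entered.
    transient = path[:seen[state] + 1]
    return list(zip(transient[:-1], transient[1:]))


def overlap(succ):
    """Compute w_n for every state n and the network average W."""
    arrows = {n: path_to_attractor(n, succ) for n in succ}
    # T_{j,k}: number of trajectories, over all start states, using arrow j -> k.
    traffic = Counter(a for path in arrows.values() for a in path)
    w = {}
    for n, path in arrows.items():
        # Assumption: states with L_n = 0 contribute w_n = 0.
        w[n] = sum(traffic[a] for a in path) / len(path) if path else 0.0
    W = sum(w.values()) / len(w)
    return w, W


# Hypothetical four-state example: a and b both flow through c into the
# fixed point d, so the arrow c -> d carries the traffic of three trajectories.
toy_succ = {"a": "c", "b": "c", "c": "d", "d": "d"}
toy_w, toy_W = overlap(toy_succ)
```

In the four-state example the shared arrow from c to d raises $w_n$ for every state upstream of it, which is the sense in which a large $W$ signals converging trajectories.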
5 The life cycle network
To verify that the observed dynamics of the cell cycle network is not an isolated case among biological networks, but carries certain universal properties, we now turn our attention to a different protein network: the one regulating the life cycle of the budding yeast. Yeast cells can exist in either a diploid or a haploid genetic state. Triggered by starvation, cells in diploid form undergo meiosis to give rise to spores, haploid cells in a dormant state that are resistant to harsh environmental conditions. Under appropriate conditions, two haploid cells can be stimulated by pheromone to fuse, forming a diploid cell. Fig. 3A shows the protein interaction network governing the life cycle processes in budding yeast, which we have obtained through extensive literature studies.

We use the same kind of discrete dynamic model, Eq. (3.1), to study the life cycle network shown in Fig. 3A. Again, under this evolution rule, we simplify the network by removing and reconnecting nodes in a way that does not affect the network dynamics. After the simplification process, the network becomes that shown in Fig. 3B. Notice that the cell cycle network in Fig. 1A reduces to a single node of G1 cyclins in the life cycle network. The dynamic rule on the simplified network is still Eq. (3.1)⁴. To keep certain nodes on in the normal cell state after being transiently turned off, we have introduced a housekeeping node that is always kept on.

⁴ There are green arrows in Fig. 3A that should have the "AND" logic. However, in Fig. 3B, these arrows are combined or removed so that Eq. (3.1) is appropriate for the dynamics.

Starting from all $2^{14}$ initial states, we trace their evolution trajectories. There are 7 stationary states and no limit cycle (compared with an average of 67 attractors in random networks of the same size). The stationary states of the haploid and the diploid forms are the only two global attractors, attracting 34.7% and 58.7% of all initial states, respectively. The two biological pathways, sporulation and mating, which turn diploids into haploids and vice versa, are both attracting trajectories. The trajectories from 9000 randomly chosen initial states are shown in Fig. 3C. Unlike the cell cycle network, the life cycle network shows the additional property of bistability.

Figure 3 (A) Life-cycle network. (B) Simplified life-cycle network. (C) Dynamical trajectories of 9000 protein states randomly selected from the basins of attraction of the two big fixed points (blue nodes). The two fixed points represent the biological stationary states in the haploid (right) and the diploid (left) forms, respectively. Diploid protein states are colored green; haploid protein states are colored yellow. The two biological pathways (sporulation and mating) are colored blue. The size of a node and the thickness of an arrow are proportional to the logarithm of the traffic flow passing through them. Arrows with traffic flow less than 3 (and their corresponding nodes) are omitted for clarity.

6 Conclusion

The idea that aspects of biological systems can be modelled as dynamical systems, and that biological states can be interpreted as attractors, has a long history, with examples in neural networks [10,11], immune systems [12,13], genetic networks [14,15], cell regulatory networks [16], and ecosystems [17]. Our study of actual yeast cellular networks lends support to this idea. Furthermore, our results suggest not only that biological states correspond to big fixed points but also that the biological pathways are robust. Functional robustness has been found in other biological networks, e.g., in the chemotaxis of E. coli (in the response to external stimuli) [18] and in the gene network setting up the segment polarity in insect development (with respect to parameter changes) [19]. It has also been found at the single-molecule level, in the mutational stability of proteins [20]. Indeed, robustness may provide us with a window through which we can gain some understanding of the profound driving
force of evolution.

Acknowledgments
This work was partly supported by the National Key Basic Research Project of China (No. 2003CB715900). We thank T. Hua, H. Li, and L. H. Tang for helpful discussions. The networks and dynamic trajectories are drawn with Pajek (http://vlado.fmf.uni-lj.si/pub/networks/pajek/).
References
[1] R. Albert, H. Jeong, A.-L. Barabási, Nature 406, 378 (2000).
[2] H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, A.-L. Barabási, Nature 407, 651 (2000).
[3] H. Jeong, S. P. Mason, A.-L. Barabási, Z. N. Oltvai, Nature 411, 41 (2001).
[4] S. Maslov, K. Sneppen, Science 296, 910 (2002).
[5] R. Milo et al., Science 298, 824 (2002).
[6] A. Murray, T. Hunt, The Cell Cycle (Oxford University Press, New York, 1993).
[7] P. T. Spellman et al., Mol. Biol. Cell 9, 3273 (1998).
[8] K. Chen et al., Mol. Biol. Cell 11, 369 (2000).
[9] J. J. Tyson, K. Chen, B. Novak, Nature Reviews Molecular Cell Biology 2, 908 (2001).
[10] W. S. McCulloch, W. Pitts, Bull. Math. Biophys. 5, 115 (1943).
[11] J. J. Hopfield, Proc. Natl. Acad. Sci. U.S.A. 79, 2554 (1982).
[12] N. K. Jerne, Ann. Immunol. (Paris) 125C, 373 (1974).
[13] G. Parisi, Proc. Natl. Acad. Sci. U.S.A. 87, 429 (1990).
[14] S. A. Kauffman, J. Theoret. Biol. 22, 437 (1969).
[15] S. A. Kauffman, The Origins of Order (Oxford University Press, 1993).
[16] S. Huang, D. E. Ingber, Exp. Cell Res. 261, 91 (2000).
[17] R. M. May, Science 186, 645 (1974).
[18] U. Alon, M. G. Surette, N. Barkai, S. Leibler, Nature 397, 168 (1999).
[19] G. von Dassow, E. Meir, E. M. Munro, G. M. Odell, Nature 406, 189 (2000).
[20] H. Li, R. Helling, C. Tang, N. Wingreen, Science 273, 666 (1996).
A Modified Adaptive Algebraic Multigrid Algorithm for Elliptic Obstacle Problems

Wei Li
Institute for Computational and Applied Mathematics, Xiangtan University, Xiangtan 411105, China. E-mail: [email protected]

Yunqing Huang
Institute for Computational and Applied Mathematics, Xiangtan University, Xiangtan 411105, China. E-mail: [email protected]

Abstract
A modified algebraic multigrid (AMG) algorithm is presented for solving discrete obstacle problems. The Gauss-Seidel smoother of the general algebraic multigrid algorithm is combined with a post-processing step so that every entry of the solution satisfies the inequality constraint. Numerical experiments illustrate the efficiency of the algorithm for obstacle problems discretized on a uniform mesh, whereas for obstacle problems discretized on an h-adaptive mesh the numerical solution did not converge to the exact solution. A further projection, similar to an active-set strategy, is introduced to solve the problem on the h-adaptive mesh. The numerical results show that the proposed algorithm is efficient and robust.
1 Introduction
Obstacle problems are a type of free boundary problem. After discretization, the resulting finite-dimensional obstacle problem has mathematical features similar to those of a constrained quadratic programming problem. Standard numerical methods for optimization problems are often adopted to solve discrete obstacle problems (e.g. [2]-[9]). Among those methods, the comparatively efficient ones are projection methods and their variants, linear approximation methods, relaxation methods and penalty function techniques. However, as pointed out by Hoppe and Kornhuber in [20], if the obstacle problem is discretized in space by continuous and piecewise linear
finite elements with respect to a triangulation of the domain $\Omega$, the above-mentioned iterative methods typically suffer from rapidly deteriorating convergence rates as the triangulation is refined, which renders them inefficient from a numerical point of view. Using multilevel techniques with respect to a hierarchy of triangulations can overcome this drawback. Domain decomposition and multigrid approaches to obstacle problems have therefore been studied extensively in recent years ([10]-[16], [20]-[30]). To the authors' knowledge, the algebraic multigrid method has not been applied to obstacle problems in the existing literature on multigrid-type iterative methods ([17]-[19]).

As an alternative to projected relaxation for obstacle-type problems, linearization techniques based on active set strategies were used in [25]-[28]. Their main idea is to pre-specify a set of active constraints at each iteration and then to solve a linear subproblem to compute the new iterate. The multigrid techniques used in [25]-[28] consist of outer and inner iterations, where the outer iteration is an active set strategy and the inner iterations are multigrid iterations for the approximate solution of the auxiliary problems. Since the coefficient matrices of the auxiliary systems are symmetric positive definite, Hoppe and Kornhuber [20] adopted the preconditioned conjugate gradient (PCG) method to replace the inner iterations in [25]-[28].

AMG has been shown to be robust and flexible for positive definite linear systems [17]. In this paper, we first develop an AMG algorithm to solve the discrete obstacle problem on a uniform mesh. This algorithm differs from the AMG algorithm for linear systems in two respects. First, a post-processing step is applied after every update of a single entry of the solution in the smoother, so that the solution stays in the feasible set during the whole smoothing process. Second, a heuristic projection is applied to the inequality constraint so that the residual problems on every grid level have the same form, and thus the algorithm can be exactly the same on all grid levels. Numerical results show the efficiency of the proposed algorithm for the discrete obstacle problem on the uniform mesh. The numerical results also show that this algorithm is not suitable for the obstacle problem discretized on an h-adaptive mesh: the numerical error always remained very large along the free boundary. Inspired by the active set method introduced in [20], we apply the modified algorithm for the obstacle problem only on the finest mesh. After smoothing on the finest mesh, an active set can be obtained from the current solution. The linear subproblem on the inactive set is then solved by the standard AMG method for linear systems. This is equivalent to replacing PCG as the inner iteration of the active set method. It can be shown that the modified AMG
is a fast converging iterative method for solving the obstacle problem discretized on an h-adaptive mesh.

The rest of this paper is organized as follows. In Section 2, we introduce the discrete obstacle problem. In Section 3, an AMG algorithm for the model obstacle problem is developed. Comparing the numerical results on uniform meshes and h-adaptive meshes motivates us to modify AMG for discrete obstacle problems on h-adaptive meshes. The modified AMG and the corresponding numerical experiments are given in Section 4. Some concluding remarks are presented in Section 5.
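The post-processing idea described in the introduction, in which each single-entry update of the smoother is followed by a correction that restores the inequality constraint, is commonly realized as a projected Gauss-Seidel sweep. The sketch below shows that common realization only; it is not claimed to be the authors' exact smoother. The names `A`, `b`, `psi` and `projected_gauss_seidel` are assumptions, and the sketch assumes the problem is written as minimizing $\frac{1}{2} u^T A u - b^T u$ subject to $u \ge \psi$ entrywise, with `A` symmetric positive definite.

```python
def projected_gauss_seidel(A, b, psi, u, sweeps=200):
    """Approximately minimize 0.5*u^T A u - b^T u subject to u >= psi.

    A is a dense symmetric positive definite matrix given as a list of rows;
    b, psi and u are lists of the same length. All names are illustrative.
    """
    n = len(b)
    u = list(u)
    for _ in range(sweeps):
        for i in range(n):
            # Ordinary Gauss-Seidel update of entry i ...
            sigma = sum(A[i][j] * u[j] for j in range(n) if j != i)
            candidate = (b[i] - sigma) / A[i][i]
            # ... followed by the post-processing step: if the update
            # violates the constraint, move the entry back onto the obstacle.
            u[i] = max(candidate, psi[i])
    return u


# Hypothetical 2x2 example: without the constraint the minimizer is (0, 0);
# the obstacle forces the second entry up to 0.5.
A_toy = [[2.0, -1.0], [-1.0, 2.0]]
b_toy = [0.0, 0.0]
psi_toy = [-1.0, 0.5]
u_toy = projected_gauss_seidel(A_toy, b_toy, psi_toy, [0.0, 0.0])
```

In the example the constraint is active only in the second entry, so the entries where `u[i] == psi[i]` form the kind of active set that the introduction proposes to read off from the smoothed solution on the finest mesh.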
2
Model problem and finite element discretization
Let $\Omega$ be a bounded convex polygon in $\mathbb{R}^2$ with $\partial\Omega$ as its boundary. We consider the obstacle problem: find $u^* \in K$ such that
$$J(u^*) = \min_{u \in K} J(u), \tag{2.1}$$
where
$$J(u) = \frac{1}{2} a(u,u) - (f,u), \qquad a(u,v) = \int_\Omega \nabla u \cdot \nabla v \, dx, \qquad (f,u) = \int_\Omega f u \, dx,$$
$$K = \{u \in H_0^1(\Omega) : u \ge u_0 \ \text{a.e. in } \Omega\}.$$
This problem is equivalent to the following variational inequality: find $u^* \in K$ such that
$$a(u^*, v - u^*) \ge (f, v - u^*), \qquad \forall v \in K. \tag{2.2}$$
The boundary of the active set $I = \{x \in \Omega : u(x) = u_0(x)\}$ is called the free boundary [1], which is unknown. We shall only discuss the piecewise linear element, due to the limited higher-order regularity of the solution of a variational inequality. Suppose $\mathcal{T}_h$ is a regular triangulation of $\Omega$ with triangular elements, $\Omega = \bigcup_i \Omega_i$, where $\Omega_i$ are the elements of the triangulation. Let $\Sigma_h$ be the set of all vertices of triangles in $\mathcal{T}_h$ and $P_1$ be the set of all polynomials in two variables of degree $\le 1$. The piecewise linear finite element space $V_h$ and its subset $K_h$ are defined as
$$V_h = \{u_h \in C^0(\Omega),\ u_h|_{\partial\Omega} = 0,\ u_h|_{\Omega_i} \in P_1(\Omega_i),\ \forall i\},$$
$$K_h = \{u_h \in V_h,\ u_h \ge u_0 \ \text{on } \Sigma_h\}.$$
Then the approximation of problem (2.1) in the above finite element space is given as follows: find $u_h^* \in K_h$ such that
$$J(u_h^*) = \min_{u_h \in K_h} J(u_h). \tag{2.3}$$
Correspondingly, the finite element approximation of (2.2) amounts to the computation of an element $u_h^* \in K_h$ satisfying
$$a(u_h^*, v_h - u_h^*) \ge (f, v_h - u_h^*), \qquad \forall v_h \in K_h. \tag{2.4}$$
Let $\dim(V_h) = N$ and let $\phi_i$, $1 \le i \le N$, be the basis functions of $V_h$. Any function $u_h \in V_h$ then has the form
$$u_h = \sum_{i=1}^{N} u_i \phi_i, \tag{2.5}$$
and thus $V_h$ and $\mathbb{R}^N$ are isomorphic. Therefore, corresponding to $K_h$, the closed convex set in $\mathbb{R}^N$ is
$$K_N = \{\mathbf{u} \in \mathbb{R}^N,\ u_i \ge u_{0,i},\ 1 \le i \le N\},$$
where $u_{0,i}$ denotes the value of $u_0$ at the $i$-th point. Thanks to (2.5), $J(u_h)$ is then written as
$$J(u_h) = \frac{1}{2} a\Big(\sum_{i=1}^{N} u_i \phi_i, \sum_{j=1}^{N} u_j \phi_j\Big) - \Big(f, \sum_{i=1}^{N} u_i \phi_i\Big) = \frac{1}{2} \sum_{i,j=1}^{N} a(\phi_i, \phi_j)\, u_i u_j - \sum_{i=1}^{N} (f, \phi_i)\, u_i.$$
The discrete minimization problem (2.3) can be reformulated as follows: find $\mathbf{u}^* \in K_N$ such that
$$J(\mathbf{u}^*) = \min_{\mathbf{u} \in K_N} \frac{1}{2} \mathbf{u}^T A \mathbf{u} + \mathbf{F}^T \mathbf{u}, \tag{2.7}$$
where
$$A = (a_{ij})_{i,j=1,\dots,N}, \quad a_{ij} = a(\phi_i, \phi_j), \quad \mathbf{u} = (u_1, \dots, u_N)^T, \quad \mathbf{F} = \big(-(f,\phi_1), \dots, -(f,\phi_N)\big)^T.$$
Then (2.4) is equivalent to the discrete inequality: find $\mathbf{u}^* \in K_N$ such that
$$(\mathbf{v} - \mathbf{u}^*)^T A \mathbf{u}^* \ge (\mathbf{v} - \mathbf{u}^*)^T (-\mathbf{F}), \qquad \forall \mathbf{v} \in K_N. \tag{2.8}$$
As an algebraic multigrid method, our algorithm starts from this discretized form.
3
AMG algorithm on uniform mesh
In this section we introduce an AMG algorithm to solve (2.7) discretized on uniform meshes. The algorithm does not depend on the fact that the discrete problem is obtained by discretizing (2.1) on uniform meshes, but it is only efficient for such discrete problems. Our algorithm is a standard V-cycle AMG algorithm [17], which consists of two ingredients: smoothing and coarse grid correction. The residual problem is required before correction. Problem (2.7) differs from a linear system by its inequality constraints. This leads to two main differences between our algorithm and the AMG algorithm for a linear system. The first is that, to obtain the defect equation on the coarse grid, the inequality constraint must be projected along with the objective. The second is that the smoother can no longer be a standard Gauss-Seidel-like sweep: a post-processing step is applied to satisfy the inequality constraint after each unknown is updated by the Gauss-Seidel-like sweep. For convenience, the standard V-cycle AMG algorithm with iterative implementation is described as follows:

1. Give an initial guess $\mathbf{u}^{(0)}$;
2. For i from 0 to L - 1, do
   (a) Smoothing on grid level i to update $\mathbf{u}^{(i)}$;
   (b) Get the residual problem on grid level i + 1;
   end do
3. Solve the problem on grid level L exactly enough;
4. For i from L - 1 to 0, do
   (a) Update the correction on grid level i + 1 to $\mathbf{u}^{(i)}$;
   (b) Smoothing on grid level i to update $\mathbf{u}^{(i)}$;
   end do
5. Check if the stopping criterion is met: yes, stop; no, go to step 2.

Step 2 and step 4 are the so-called pre-smoothing and post-smoothing, respectively. Step 3 solves a very small problem of the same form as the original problem, which is assumed to be solvable at comparatively low cost. In the following, we describe the steps of the above algorithm in detail.
3.1
Project the constraint
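Consider (2.7) and assume that there is an initial guess $\mathbf{u}^{(0)}$. Let $\mathbf{u}^{(1)}$ be a new approximation after the coarse grid correction. The residual problem of (2.7) is then written as:
$$\min \frac{1}{2} (\mathbf{u}^{(1)} - \mathbf{u}^{(0)})^T A (\mathbf{u}^{(1)} - \mathbf{u}^{(0)}) + \mathbf{F}^T (\mathbf{u}^{(1)} - \mathbf{u}^{(0)}), \qquad \text{s.t. } (\mathbf{u}^{(1)} - \mathbf{u}^{(0)}) \ge \mathbf{u}_0. \tag{3.1}$$
It can be rewritten as
$$\min \frac{1}{2} \mathbf{u}^{(1)T} A \mathbf{u}^{(1)} + \tilde{\mathbf{F}}^T \mathbf{u}^{(1)}, \qquad \text{s.t. } \mathbf{u}^{(1)} \ge \tilde{\mathbf{u}}_0, \tag{3.2}$$
where $\tilde{\mathbf{F}} = \mathbf{F} - A\mathbf{u}^{(0)}$ and $\tilde{\mathbf{u}}_0 = \mathbf{u}_0 + \mathbf{u}^{(0)}$. Assume that a projection matrix $P$ is obtained from the information retrieved from the matrix $A$. Using this projection matrix, we can get a projected optimization problem of (3.2) on the coarse grid level as
$$\min \frac{1}{2} \mathbf{u}^{(1)T} A^{(1)} \mathbf{u}^{(1)} + \mathbf{F}^{(1)T} \mathbf{u}^{(1)}, \qquad \text{s.t. } \mathbf{u}^{(1)} \ge \tilde{\mathbf{u}}_0^{(1)}, \tag{3.3}$$
where
$$A^{(1)} = P A P^T, \qquad \mathbf{F}^{(1)} = P \tilde{\mathbf{F}}, \qquad \tilde{\mathbf{u}}_0^{(1)} = P \tilde{\mathbf{u}}_0. \tag{3.4}$$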
The projection matrix $P$ is, in fact, the restriction operator in a two-level correction from the viewpoint of AMG. Once the coarse grid is chosen, the interpolation operator can be constructed, and the restriction operator is taken as the transpose of the interpolation. Then the coarse-level Galerkin operator and the residual problem (3.3) can be obtained from (3.4). Hence, coarsening and interpolation are essential components of the setup phase of AMG. In the following, we first describe the C/F splitting and then the interpolation. The approach for constructing the splitting and the interpolation is the same for all levels of the AMG hierarchy, so the following description applies to any fixed level. Clearly, all of the following must be repeated recursively for each level until a level is reached that contains sufficiently few variables to allow a direct solution.
3.2
Standard coarsening
For convenience, we regard the set of grid points on the $l$-th grid level as the index set $\mathcal{N}^l = \{1, 2, \dots, n_l\}$, where $n_l$ is the number of unknowns on the $l$-th level. Thus referring to a point $i \in \mathcal{N}^l$ means nothing other than referring to the $i$-th point $(x_i, y_i)$. As we know, the coarsening process of AMG from fine level $l$ to coarse level $l+1$ is, in fact, a splitting of $\mathcal{N}^l$ into two disjoint subsets $\mathcal{N}^l = C^l \cup F^l$, with $C^l$ representing those variables which are to be contained in the coarse level (C-variables) and $F^l$ being the complementary set (F-variables). To simplify notation, we usually omit the index $l$ in the following; for instance, we write $\mathcal{N}$, $C$, $F$ instead of $\mathcal{N}^l$, $C^l$, $F^l$.

Before defining the C/F splitting, we need to introduce the sets which characterize the connections between grid points. The first set is the direct neighborhood of a point $i$, defined as
$$N_i = \{j \in \mathcal{N} : j \ne i,\ a_{ij} \ne 0\} \qquad (i \in \mathcal{N}),$$
where $\mathcal{N}$ is the index set. Next, a point $i$ is said to be strongly negatively coupled (or connected) to another point $j$ if
$$-a_{ij} \ge \varepsilon_{str} \max_{a_{ik} < 0} |a_{ik}|, \qquad 0 < \varepsilon_{str} < 1,$$
where $\varepsilon_{str}$ is a constant which defines strong negative connections. According to [17], $\varepsilon_{str}$ is set to 0.25 as a default value. We denote the set of all strong negative connections of point $i$ by $S_i$,
$$S_i = \{j \in N_i : i \text{ is strongly negatively coupled to } j\}.$$
Also, the set of points that are strongly negatively coupled to $i$ is defined as
$$S_i^T = \{j \in \mathcal{N} : i \in S_j\}.$$
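The strong-coupling sets defined above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `strong_couplings`, the dense NumPy matrix and the small 3 x 3 example matrix are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def strong_couplings(A, eps_str=0.25):
    """Return (S, S_T) for a dense matrix A (illustrative helper, not the paper's code).

    S[i]   : points j to which i is strongly negatively coupled
    S_T[i] : points that are strongly negatively coupled to i
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    S = [set() for _ in range(n)]
    S_T = [set() for _ in range(n)]
    for i in range(n):
        off_diag = np.delete(A[i], i)
        negatives = off_diag[off_diag < 0.0]
        if negatives.size == 0:
            continue  # row i has no negative couplings at all
        threshold = eps_str * np.max(np.abs(negatives))
        for j in range(n):
            if j != i and A[i, j] < 0.0 and -A[i, j] >= threshold:
                S[i].add(j)
                S_T[j].add(i)
    return S, S_T

# Hypothetical example: the weak couplings of size 0.1 are filtered out.
A_example = np.array([[4.0, -2.0, -0.1],
                      [-2.0, 4.0, -2.0],
                      [-0.1, -2.0, 4.0]])
S, S_T = strong_couplings(A_example)
```

A sparse matrix format would be used in practice; the dense loop is kept here only to stay close to the definitions.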
The simple splitting algorithm used in our paper was presented by Ruge and Stüben (see [17]). Essentially, some first point $i$ is chosen to become a C-variable. Then all points $j \in S_i^T$ become F-variables. Next, from the remaining undecided points, another one is chosen to become a C-variable, and all points which are strongly negatively coupled to it (and which have not yet been decided upon) become F-variables. This process is repeated until all variables have been taken care of. However, in order to avoid randomly distributed C/F-patches and instead obtain reasonably uniform distributions of C- and F-variables, we introduce a "measure of importance", $\lambda_i$, of any undecided variable $i$ to become the next C-variable. $\lambda_i$ is defined as
$$\lambda_i = |S_i^T \cap D| + 2\,|S_i^T \cap F| \qquad (i \in D),$$
where $D$, at any stage of the algorithm, denotes the current set of undecided points, and $|\cdot|$ denotes the number of elements a set contains. The complete standard coarsening algorithm is described as follows [17]:

1. set $F := \emptyset$, $C := \emptyset$, $D := \mathcal{N}$;
2. compute $\lambda_i := |S_i^T \cap D| + 2\,|S_i^T \cap F|$ for any $i \in D$;
3. pick $i \in D$ with $\max\{\lambda_i\}$, and set $C := C \cup \{i\}$, $D := D \setminus \{i\}$;
4. for all $j \in S_i^T \cap D$: set $F := F \cup \{j\}$, $D := D \setminus \{j\}$;
5. if $D = \emptyset$, stop; else go to step 2.
In step 2, $\lambda_i$ has to be computed globally only once at the beginning of the algorithm. At later stages, we just need to update it locally. From the definition of $\lambda_i$, we can see that the above algorithm proceeds in the following order: initially, points with many other points strongly negatively coupled to them become C-variables, while later the tendency is to pick, as C-variables, those points on which many F-variables strongly depend. It is clear that this simple algorithm ensures the strong F-to-C connectivity which is required for good interpolation [17].
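The five coarsening steps can be sketched in Python as follows. This is a minimal sketch under assumptions: the name `standard_coarsening` is hypothetical, the sets $S_i^T$ are passed as a list of Python sets, ties in step 3 are broken by the lowest index (the text does not specify a tie-breaking rule), and $\lambda_i$ is recomputed globally in each pass for clarity rather than updated locally.

```python
def standard_coarsening(S_T):
    """C/F splitting following steps 1-5 (illustrative sketch, not the authors' code).

    S_T[i] is the set of points strongly negatively coupled to point i.
    """
    n = len(S_T)
    C, F, D = set(), set(), set(range(n))            # step 1
    while D:                                         # step 5: loop until D is empty
        # step 2: measure of importance for every undecided point
        lam = {i: len(S_T[i] & D) + 2 * len(S_T[i] & F) for i in D}
        # step 3: pick the undecided point with the largest measure
        # (assumption: ties are broken by the lowest index)
        i = max(sorted(D), key=lambda p: lam[p])
        C.add(i)
        D.remove(i)
        # step 4: undecided points strongly coupled to i become F-variables
        for j in S_T[i] & D:
            F.add(j)
            D.remove(j)
    return C, F

# Hypothetical example: a 1D chain of five points, each coupled to its neighbours.
chain = [{1}, {0, 2}, {1, 3}, {2, 4}, {3}]
C_pts, F_pts = standard_coarsening(chain)
# C_pts == {1, 3}, F_pts == {0, 2, 4}
```

On the chain, every F-point ends up adjacent to at least one C-point, which is the strong F-to-C connectivity mentioned above.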
3.3
Direct interpolation
Assume that a coarse grid $\mathcal{N}^{l+1}$ has been constructed by means of the standard coarsening described in section 3.2. Then we define the interpolation weights so that the interpolation operator $I_{l+1}^{l}$ and the restriction operator $I_{l}^{l+1}$ can be defined. The interpolation approach used in our paper was given in [17]. The $i$-th component of the error $\mathbf{e}$ at the fine level $l$ is given by
$$e_i^l = (I_{l+1}^{l} \mathbf{e}^{l+1})_i = \begin{cases} e_i^{l+1}, & i \in C^l, \\ \sum_{k \in P_i^l} w_{ik}^l\, e_k^{l+1}, & i \in F^l, \end{cases} \tag{3.5}$$
where $P_i^l = C^l \cap S_i^l$ is called the set of interpolation points to $i$. In the following, we also omit the index $l$ for convenience. Moreover, instead of (3.5), we simply write
$$e_i = \sum_{k \in P_i} w_{ik} e_k, \qquad i \in F. \tag{3.6}$$
Our goal is to define the interpolation weights $w_{ik}$ in (3.6) so that (3.5) yields a reasonable approximation for any algebraically smooth error $\mathbf{e}$, which approximately satisfies
$$a_{ii} e_i + \sum_{j \in N_i} a_{ij} e_j = 0, \qquad i \in F. \tag{3.7}$$
As we know, algebraically smooth error varies slowly in the direction of strong connections [17]. That is, the error at a point $i$ is essentially determined by a weighted average of the error at its strong neighbors. Consequently, assuming $\emptyset \ne P_i \subseteq C \cap N_i$, the more strong connections of any F-variable $i$ are contained in $P_i$, the better
$$\frac{1}{\sum_{k \in P_i} a_{ik}} \sum_{k \in P_i} a_{ik} e_k \approx \frac{1}{\sum_{j \in N_i} a_{ij}} \sum_{j \in N_i} a_{ij} e_j \tag{3.8}$$
will be satisfied for smooth error. This suggests approximating (3.7) by
$$a_{ii} e_i + \alpha_i \sum_{k \in P_i} a_{ik} e_k = 0 \qquad \text{with} \qquad \alpha_i = \frac{\sum_{j \in N_i} a_{ij}}{\sum_{k \in P_i} a_{ik}}, \tag{3.9}$$
which leads to an interpolation formula (3.6) with matrix-dependent, positive weights:
$$w_{ik} = -\alpha_i a_{ik} / a_{ii}, \qquad i \in F,\ k \in P_i. \tag{3.10}$$

3.4
Smoothing
The smoothing operation in our algorithm is slightly different from that for a linear system. Assume that a point-by-point iteration method such as the Gauss-Seidel sweep is adopted. Our algorithm adds a post-processing step to satisfy the inequality constraint. The following is the pseudocode of such an algorithm:

for i from 1 to N, do
    $u_i = \big(-F_i - \sum_{j \ne i} a_{ij} u_j\big) / a_{ii}$,
    $u_i = \max(u_i, u_{0,i})$
end do

We have not tested any iteration method that is not point-by-point, such as the conjugate gradient method, as a smoother. It is clear that the post-processing operation enforcing the inequality constraint can be added to that kind of smoother, too.
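The smoother with post-processing can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function name `projected_gauss_seidel`, the dense NumPy storage and the 2 x 2 example data are assumptions made here.

```python
import numpy as np

def projected_gauss_seidel(A, F, obstacle, u, sweeps=1):
    """Gauss-Seidel sweeps for A u + F = 0 with a projection onto u >= obstacle
    after each single-entry update (illustrative sketch)."""
    u = np.array(u, dtype=float)
    for _ in range(sweeps):
        for i in range(len(u)):
            # sum over j != i of a_ij * u_j, using the most recent values
            sigma = A[i] @ u - A[i, i] * u[i]
            u[i] = (-F[i] - sigma) / A[i, i]   # plain Gauss-Seidel update
            u[i] = max(u[i], obstacle[i])      # post-processing: stay feasible
    return u

# Hypothetical 2 x 2 example with obstacle u_0 = 0.
A_ex = np.array([[2.0, -1.0], [-1.0, 2.0]])
F_ex = np.array([-1.0, 1.0])
u_ex = projected_gauss_seidel(A_ex, F_ex, np.zeros(2), np.zeros(2), sweeps=1)
# u_ex is [0.5, 0.0]: the second unknown is clipped to the obstacle.
```

In this small example the unprojected update of the second unknown would be -0.25, so the projection is what keeps the iterate in the feasible set.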
3.5
Numerical experiments
3.5.1
On uniform meshes
To compare the efficiency of the standard AMG method on uniform meshes and h-adaptive meshes, we have performed a numerical experiment. In both cases, three pre-smoothing and three post-smoothing steps are used, and the iteration stops when the $L^2$ norm of the difference of two successive numerical solutions is less than $10^{-6}$. The setup of our numerical example is
$$\Omega = [-1,1] \times [-1,1], \qquad f(x,y) = -2, \qquad K = \{v \in H_0^1(\Omega) : v \ge 0 \ \text{a.e. in } \Omega\}.$$
The exact solution is
$$u = \begin{cases} \dfrac{r^2}{2} - \ln(r) + \ln(1) - \dfrac{1}{2}, & \text{if } r \ge 1, \\ 0, & \text{if } r < 1, \end{cases}$$
where $r = \sqrt{x^2 + y^2}$. The free boundary is the circle $r = 1$, along which $u(x,y) = 0$. The domain $\Omega$ is triangulated into a mesh $\mathcal{T}_h$ with triangular elements by choosing the step length $h = 0.0625$ in both directions (as shown in Figure 3.1 (left)).

Figure 3.1 Uniform (left) and non-uniform (right) triangulation on the domain $\Omega$.

We solve the example problem by using the standard AMG V-cycle as the iterator with the initial guess $\mathbf{u}^{(0)} = 0$. After 12 iteration steps the V-cycle stops. The convergence history and the $L^2$-error of the resulting finite element solution are shown in Table 3.1.

Table 3.1 AMG V-cycle step m and L2-errors on uniform meshes.

  m    ||u^m - u^(m-1)||_L2      m    ||u^m - u^(m-1)||_L2
  1    0.442910                  7    0.000267765
  2    0.114450                  8    8.14424e-05
  3    0.0329830                 9    1.78432e-05
  4    0.00967967               10    6.53749e-06
  5    0.00291759               11    1.09787e-06
  6    0.000882699              12    7.32691e-07

  ||u - u_h||_L2 = 0.00154362

3.5.2
On h-adaptive meshes
The above step shows the efficiency of the standard AMG solver to the obstacle problem with uniform meshes. The next natural step is to combine the AMG solver and mesh adaptation to solve the same example.
As a parameter we have chosen the ratio between the largest and the smallest diameters of triangles in the given mesh. We have tested the case $h_{max}/h_{min} \approx 11$. Starting with the initial triangulation $\mathcal{T}_0$ depicted in Figure 3.1 (right), we obtained a sequence of triangulations by adopting the standard h-adaptive procedure in AFEPACK [31] provided by Ruo Li. The h-refinement is stopped if the $L^2$-norm of the difference between the solutions for the example is less than $10^{-6}$ and the fineness of the initial mesh is below $10^{-5}$. On each refinement level, we obtained an approximation error from the full AMG V-cycle iteration, which provides an a posteriori error estimator for local mesh modification. Table 3.2 presents the number $m_k$ of AMG V-cycle iterations needed to reduce the initial error on each adaptation level $k$ and the corresponding $L^2$-error after the V-cycle stops.

Table 3.2 Iteration steps of AMG V-cycle and L2-errors on h-adaptive meshes.

  adaptation level k    m_k    number of unknowns    ||u - u_h||_L2
  0                      9        955                0.00130697
  1                     21      2 411                0.00469088
  2                     28      6 992                0.0112079
  3                     59     17 010                0.0156647
  4                     30     16 809                0.0184493
  5                     35     16 457                0.0188669
From the last column of Table 3.2, we can see that the $L^2$-error becomes larger and larger during the h-refinement process, which indicates that the standard AMG solver on the h-adaptive mesh is not convergent. This can be explained by the following lemma from [1].

Lemma 3.5.2 [1] An element $\mathbf{u}^* \in K_N$ is the solution of (2.8) if and only if there exists $\mathbf{q} \in K_N$ such that
$$A\mathbf{u}^* + \mathbf{F} = \mathbf{q}, \qquad (\mathbf{u}^*)^T \mathbf{q} = 0, \tag{3.11}$$
where $K_N = \{\mathbf{u} \in \mathbb{R}^N,\ u_i \ge 0,\ 1 \le i \le N\}$.

Proof. If $\mathbf{u}^* \in K_N$ is the solution of (2.8), then we have $\mathbf{z}^T (A\mathbf{u}^* + \mathbf{F}) \ge 0$, which can be deduced by choosing $\mathbf{v} = \mathbf{u}^* + \mathbf{z}$ in (2.8) with arbitrarily given $\mathbf{z} \in K_N$. Since $\mathbf{u}^* \in K_N$, we have
$$(\mathbf{u}^*)^T (A\mathbf{u}^* + \mathbf{F}) \ge 0. \tag{3.12}$$
On the other hand, taking $\mathbf{v} = 0 \in K_N$ in (2.8) gives $(-\mathbf{u}^*)^T A \mathbf{u}^* \ge (-\mathbf{u}^*)^T (-\mathbf{F})$, which implies
$$(\mathbf{u}^*)^T (A\mathbf{u}^* + \mathbf{F}) \le 0. \tag{3.13}$$
Hence, by (3.12) and (3.13) it follows that $(\mathbf{u}^*)^T (A\mathbf{u}^* + \mathbf{F}) = 0$, which proves (3.11). We also observe that $\mathbf{e}_i^T \mathbf{q} = q_i \ge 0$ ($1 \le i \le N$) for any unit vector $\mathbf{e}_i \in \mathbb{R}^N$, which proves $\mathbf{q} \in K_N$. The converse statement is obvious, and the desired result follows. □

Problem (3.11) is known as the linear complementarity problem associated with (2.8). For ease of exposition we only consider the special obstacle problem with $u_0 = 0$; more general problems can be transformed into this form by the substitution $\tilde{u} = u - u_0$. From the proof of Lemma 3.5.2, since $\mathbf{u}^* \ge 0$, $\mathbf{q} \ge 0$ and $(\mathbf{u}^*)^T \mathbf{q} = 0$, for any $1 \le i \le N$, $u_i^* > 0$ and $q_i > 0$ cannot hold at the same time. Naturally, the index set $\mathcal{N} = \{1, 2, \dots, N\}$ can be split into two disjoint subsets
$$\mathcal{N} = \mathcal{N}_a \cup \mathcal{N}_b, \tag{3.14}$$
where
$$\mathcal{N}_a = \{i \in \mathcal{N};\ u_i^* > 0\}, \qquad \mathcal{N}_b = \{i \in \mathcal{N};\ u_i^* = 0\}.$$
The set $\mathcal{N}_b$ is often called active since, in view of $u_i^* = 0$, $i \in \mathcal{N}_b$, it contains the nodal points where the obstacle is active. Correspondingly, $\mathcal{N}_a$ is said to be the inactive set. According to this split, we obtain
$$u_i^* > 0,\ q_i = 0 \ \text{for any } i \in \mathcal{N}_a, \qquad u_i^* = 0,\ q_i \ge 0 \ \text{for any } i \in \mathcal{N}_b. \tag{3.15}$$
Then there exists an elementary transformation matrix $P$ such that
$$P A P^T = \begin{pmatrix} A_{aa} & A_{ab} \\ A_{ba} & A_{bb} \end{pmatrix}, \tag{3.16}$$
with the vectors $\mathbf{u}^*$, $\mathbf{F}$ and $\mathbf{q}$ partitioned accordingly. Under this elementary transformation, (3.11) can be rewritten as
$$\begin{pmatrix} A_{aa} & A_{ab} \\ A_{ba} & A_{bb} \end{pmatrix} \begin{pmatrix} \mathbf{u}_a \\ 0 \end{pmatrix} + \begin{pmatrix} \mathbf{F}_a \\ \mathbf{F}_b \end{pmatrix} = \begin{pmatrix} 0 \\ \mathbf{q}_b \end{pmatrix}. \tag{3.17}$$
By exactly solving (3.17), we have
$$\mathbf{u}_a = -A_{aa}^{-1} \mathbf{F}_a, \qquad \mathbf{q}_b = A_{ba} \mathbf{u}_a + \mathbf{F}_b, \tag{3.18}$$
where $\mathbf{u}_a$ is the formal solution to (2.8). Based on the above analysis, for any splitting of $\mathcal{N}$ according to (3.14), the formal solution (3.18) naturally satisfies the necessary condition (3.11). This implies that the key to solving (3.11) is to find a suitable splitting of the index set $\mathcal{N}$ such that $\mathbf{u}_a > 0$ and $\mathbf{q}_b \ge 0$ hold at the same time. In other words, capturing the free boundary is the key to solving obstacle problems well. In the AMG algorithm above, which does not split $\mathcal{N}$, the active points are not separated from the inactive points. It therefore cannot be ensured that $\mathbf{u}_a > 0$ and $\mathbf{q}_b \ge 0$ hold at the same time, and the resulting solution, with its large errors, does not provide a reliable error estimator for h-refinement. Consequently the method fails to converge to the exact solution. Therefore, we need to modify the standard AMG. In the following we state how to do this.
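The role of the splitting can be made concrete with a small Python sketch of the formal solution (3.18) for the case $u_0 = 0$. The function name `formal_solution`, the dense direct solve and the 2 x 2 data are assumptions introduced for illustration only. The sketch shows that any splitting yields a pair $(\mathbf{u}, \mathbf{q})$ satisfying (3.11) formally, but only a correct splitting also satisfies the sign conditions.

```python
import numpy as np

def formal_solution(A, F, inactive):
    """Formal solution (3.18) for a given splitting, with obstacle u_0 = 0.

    inactive[i] is True for i in N_a and False for i in N_b.
    Returns (u, q, admissible), where admissible checks u_a > 0 and q_b >= 0.
    Illustrative sketch with a dense direct solve, not the authors' code.
    """
    inactive = np.asarray(inactive, dtype=bool)
    a = np.flatnonzero(inactive)       # indices of the inactive set N_a
    b = np.flatnonzero(~inactive)      # indices of the active set N_b
    u = np.zeros(len(F))
    q = np.zeros(len(F))
    if a.size:
        u[a] = np.linalg.solve(A[np.ix_(a, a)], -F[a])   # u_a = -A_aa^{-1} F_a
    q[b] = A[np.ix_(b, a)] @ u[a] + F[b]                 # q_b = A_ba u_a + F_b
    admissible = bool(np.all(u[a] > 0.0) and np.all(q[b] >= 0.0))
    return u, q, admissible

# Hypothetical 2 x 2 example.
A_ex = np.array([[2.0, -1.0], [-1.0, 2.0]])
F_ex = np.array([-1.0, 1.0])
u_good, q_good, ok_good = formal_solution(A_ex, F_ex, [True, False])
u_bad, q_bad, ok_bad = formal_solution(A_ex, F_ex, [True, True])
# The first splitting gives u = [0.5, 0], q = [0, 0.5] and is admissible;
# the second gives u = [1/3, -1/3], which violates u >= 0.
```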
4
Modified AMG algorithm and implementation
4.1
Modified AMG algorithm
As discussed in section 3.5.2, to design an efficient AMG solver for the obstacle problem on h-adaptive meshes, we modify AMG by adopting an active-set strategy as presented in [26], [27]. The complete modified V-cycle AMG algorithm with iterative implementation is described as follows:

1. Choose an initial guess $\mathbf{u}^{(0)}$;
2. Given $\mathbf{u}^{(m)} \in \mathbb{R}^N$, $m \ge 0$, smooth on the finest grid level L to update $\mathbf{u}^{(m)}$;
3. From $\mathbf{u}^{(m)}$, determine $\mathcal{N}_a \subset \mathcal{N}$ as the set of indices $i \in \mathcal{N}$ such that $u_i > 0$, $\forall i \in \mathcal{N}_a$, and set $\mathcal{N}_b := \mathcal{N} \setminus \mathcal{N}_a$;
4. Construct the linear subproblem
$$A\mathbf{u}_a = -A\mathbf{u}_b - \mathbf{F}, \tag{4.1}$$
   where $(\mathbf{u}_b)_i = u_{0,i}$ for $i \in \mathcal{N}_b$, $(\mathbf{u}_b)_i = 0$ for $i \in \mathcal{N}_a$, and $\mathbf{u}_a$ satisfies $(\mathbf{u}_a)_i = 0$ for $i \in \mathcal{N}_b$;
5. Solve subproblem (4.1) by using the standard AMG V-cycle for the linear system as an iterator;
6. Compute $\mathbf{u}^{(m+1)} = \mathbf{u}_a + \mathbf{u}_b$;
7. Post-smooth;
8. Check if the stopping criterion is met: yes, stop; no, go to step 2.

From the algorithm above, it is clear that the modified AMG V-cycle in fact consists of outer and inner iterations, where the outer iteration is the active-set strategy and the inner iteration is the standard AMG V-cycle for the linear system. Each step of the outer iteration requires the solution of the linear subproblem (4.1), which is performed iteratively by standard AMG V-cycle iterations. Note that (4.1) is actually a "reduced", i.e., lower-dimensional, linear system on the inactive set, which is a projected relaxation. If the subproblem is solved exactly and $A$ is an M-matrix, then for an arbitrarily given initial guess $\mathbf{u}^{(0)}$ the sequence $\mathbf{u}^{(m)}$, $m \ge 1$, of iterates is monotonically decreasing and converges to the unique solution $\mathbf{u}^*$ of (2.8). See [20], [26], [27] for details. Since the standard AMG algorithm has been verified to be robust for elliptic PDEs [17], we believe that the modified AMG can converge with a satisfactory resolution, which will be confirmed by the following numerical experiments.
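The outer active-set iteration of steps 1-8 can be sketched in Python as follows. This is a sketch under loudly stated assumptions rather than the authors' implementation: the inner AMG V-cycle of step 5 is replaced by a dense direct solve, the obstacle is $u_0 = 0$ (so (4.1) reduces to $A_{aa}\mathbf{u}_a = -\mathbf{F}_a$), the initial guess is zero, and the names `projected_gs` and `active_set_iteration` as well as the 1D test problem are hypothetical.

```python
import numpy as np

def projected_gs(A, F, u, sweeps):
    """Projected Gauss-Seidel sweeps for the obstacle u_0 = 0."""
    u = u.copy()
    for _ in range(sweeps):
        for i in range(len(u)):
            sigma = A[i] @ u - A[i, i] * u[i]
            u[i] = max((-F[i] - sigma) / A[i, i], 0.0)
    return u

def active_set_iteration(A, F, sweeps=3, tol=1e-6, max_outer=100):
    """Outer active-set loop; a direct solve stands in for the inner AMG V-cycle."""
    u = np.zeros(len(F))                                   # step 1
    for m in range(1, max_outer + 1):
        v = projected_gs(A, F, u, sweeps)                  # step 2: pre-smooth
        a = np.flatnonzero(v > 0.0)                        # step 3: inactive set N_a
        w = np.zeros_like(v)                               # u_b = 0 since u_0 = 0
        if a.size:                                         # steps 4-5: reduced system
            w[a] = np.linalg.solve(A[np.ix_(a, a)], -F[a])
        u_new = projected_gs(A, F, w, sweeps)              # steps 6-7: post-smooth
        if np.linalg.norm(u_new - u) < tol:                # step 8
            return u_new, m
        u = u_new
    return u, max_outer

# Hypothetical 1D test problem: -u'' = f on (0, 1), u >= 0, u(0) = u(1) = 0,
# with an upward load on the left half and a downward load on the right half.
n = 19
h = 1.0 / (n + 1)
A_1d = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
x = h * np.arange(1, n + 1)
F_1d = -h * np.where(x < 0.5, 50.0, -50.0)
u_1d, outer_steps = active_set_iteration(A_1d, F_1d, tol=1e-12)
q_1d = A_1d @ u_1d + F_1d
```

The returned pair can be checked against the complementarity conditions (3.11): $\mathbf{u} \ge 0$, $\mathbf{q} \ge 0$ and $\mathbf{u}^T\mathbf{q} = 0$ up to rounding.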
4.2
Numerical experiment
In this section, we present a numerical experiment in order to demonstrate the efficiency of the modified AMG algorithm compared to the standard AMG algorithm. The example we consider is the same as the example presented in section 3.5. Again, we compare the performance of the computations on the uniform meshes and h-adaptive meshes shown in Figure 3.1. In our algorithm, the subproblems are solved by the standard AMG V-cycle iteration to an accuracy of $10^{-6}$. The whole iteration stops if the $L^2$-norm of the difference between successive iterative solutions is less than $10^{-6}$.

4.2.1
On uniform meshes

Observations similar to those of Table 3.1 are made in Table 4.3. Comparing Table 3.1 with Table 4.3 shows that the modified AMG algorithm attains a smaller error than the standard AMG algorithm, which shows that the modified AMG is more efficient for the obstacle problem on uniform meshes.

Table 4.3 Modified AMG V-cycle steps and L2-errors on uniform meshes.

  m    ||u^m - u^(m-1)||_L2      m    ||u^m - u^(m-1)||_L2
  1    2.14152                   8    0.0319583
  2    1.10587                   9    0.0103297
  3    0.574719                 10    0.00209010
  4    0.399737                 11    9.92716e-05
  5    0.256049                 12    4.48536e-06
  6    0.144496                 13    1.29108e-06
  7    0.0698050                14    3.80912e-07

  ||u - u_h||_L2 = 4.45456e-04

4.2.2
On h-adaptive meshes

Starting with the initial triangulation $\mathcal{T}_0$ depicted in Figure 3.1 (right), we still adopt the standard h-adaptive procedure in AFEPACK provided by Dr. Ruo Li to refine the mesh. The stopping criterion of the refinement is the same as in section 3.5.2, i.e., $\| u - u_h \|_{L^2} < 10^{-6}$ and the mesh fineness $< 10^{-5}$. On each refinement level, we apply the modified AMG V-cycle algorithm given in section 4.1 to solve the example problem. In this case, AMG converges to a desirable solution up to the 3rd refinement level. The number $m_k$ of AMG V-cycle iterations needed on each adaptation level $k$ and the $L^2$-errors are shown in Table 4.4.

Table 4.4 Iteration steps of the modified AMG V-cycle and L2-errors on h-adaptive meshes.

  adaptation level k    m_k    number of unknowns    ||u - u_h||_L2
  0                      7        955                0.000642307
  1                     10      2 689                0.000152573
  2                     18      8 951                3.68223e-05
  3                     33     30 674                3.28434e-05
The modified AMG method is superior to the standard AMG method. Tables 3.2 and 4.4 clearly demonstrate that, where the standard AMG fails to converge for the example problem on h-adaptive meshes, the modified AMG converges with a satisfactory resolution. Furthermore, the approximation error is substantially reduced, as seen in Table 4.3 and Table 4.4. In Table 4.3, the $L^2$-norm of $u - u_h$ is 4.45456e-04 on the uniform mesh, but it is reduced by about a factor of 10 to 3.28434e-05 on the adaptive mesh. It is found that one would need a 100 x 100 uniform mesh to produce such an error reduction. Thus the modified h-adaptive AMG can indeed boost the accuracy very effectively.

In Table 4.5, we report the number of unknowns on the inactive set after splitting the index set $\mathcal{N}$ on the finest level of the 0th adaptation level. The original number of unknowns before splitting is 955. The first column lists the V-cycle iteration steps. The number of unknowns on the inactive set is shown in the 3rd column. The final column gives the $L^2$-norm of $u - u_h$ after a V-cycle. Comparing the 3rd column with the 2nd column shows that the number of inactive points is about half of the original number, which implies that the scale of the original problem (3.11) is "reduced" by about 50%. Therefore, the modification on the finest mesh level can reduce the scale of the original problem to some extent. This helps to accelerate the AMG V-cycle iteration.

Table 4.5 Number of the inactive grid points and L2-errors of the AMG V-cycle on the 0th mesh adaptation level.

  iteration step    number of grid     number of grid      ||u_m - u_(m-1)||_L2
  of V-cycle        points in N        points in N_a
  1                 955                402                 0.931224
  2                 955                501                 0.194768
  3                 955                545                 0.0411459
  4                 955                554                 0.00216290
  5                 955                556                 8.71285e-05
  6                 955                556                 2.76686e-06
  7                 955                556                 1.57772e-07

The resulting adaptive mesh and the solution error are plotted in Figure 4.1. Note that, for visual clarity, the error values in Figure 4.1 (right) have been magnified to 1000 times their actual values. It is observed from Figure 4.1 that the adaptive mesh refinement has been carried out efficiently near the free boundary and that the large error only occurs on the free boundary. This shows that our modified AMG can capture the free boundary very well.

Figure 4.1 Resulting adaptive mesh (left) and solution errors for the example (right).

5
Conclusion

In this paper, we presented an AMG algorithm for a discrete obstacle problem defined by a piecewise linear finite element discretization of a continuous problem. Numerical experiments illustrated the efficiency of the algorithm when the finite element discretization is taken on a uniform mesh, but the algorithm is not suitable for the problem discretized on an h-adaptive mesh. A modified AMG algorithm in the spirit of an active set method was therefore developed to solve the problem discretized on an h-adaptive mesh. Numerical results have shown that the proposed algorithm is effective for adaptive mesh refinement in elliptic obstacle problems where the previous AMG method fails to converge. This is an extension of the application of AMG to other fields. The theoretical study and convergence analysis of the modified AMG method will be our future work.

Acknowledgments
The authors express their appreciation to Dr. Ruo Li for providing the code of the standard AMG method. The authors also thank Mr. Ge Cheng for the discussions on the implementation issues on this topic. This research was supported in part by the National Natural Science Foundation of China and the Research Fund for Doctoral Program of Chinese Ministry of Education.
References

[1] D. Kinderlehrer and G. Stampacchia, Variational inequality and application
[2] R. Scholz, Numerical solution of the obstacle problem by penalty method, Computing, 32(1984), 297-306
[3] G. Strehlau, L∞-error estimate for the numerical treatment of the obstacle problem by the penalty method, Numer. Funct. Anal. and Optimiz., 10(1/2)(1989), 185-198
[4] R. Glowinski, Y. A. Kuznetsov and T.-W. Pan, A penalty / Newton / conjugate gradient method for solution of obstacle problems, C. R. Acad. Sci. Paris, Ser. I, 336(2003), 435-440
[5] J. P. Zeng and S. Z. Zhou, Block monotone iterative methods for elliptic variational inequalities, Appl. Math. and Comp., 128(2002), 109-127
[6] J. L. Lions and G. Stampacchia, Variational inequality, Comm. Pure Appl. Math., 20(1967), 493-519
[7] J.-F. Rodrigues, Obstacle problems in mathematical physics, North-Holland Math. Stud., 134 (North-Holland, Amsterdam, 1987)
[8] R. Glowinski, Numerical methods for nonlinear variational problems, Springer-Verlag New York Inc. (1984)
[9] T. S. Zheng, I. Li and Q. Y. Xu, An iterative method for the discrete problems of a class of elliptical variational inequality, Appl. Math. and Mech., vol. 16, No. 4(1995), 329-335
[10] X. J. Xu and S. M. Shen, Domain decomposition method for obstacle problem, J. Comp. Math. in Advanced University, 2(1994), 186-194
[11] J. P. Zeng, D. H. Li and S. Z. Zhou, Nonoverlapping domain decomposition method for solving single obstacle problem (I): two subdomain case, Journal of Hunan University, vol. 24, No. 3(1997), 1-6
[12] S. Z. Zhou, J. P. Zeng and X. Tang, Generalized Schwarz algorithm for obstacle problems, Comp. and Math. with Appl., 38(1999), 263-271
[13] R. H. W. Hoppe, Multigrid algorithms for variational inequalities, SIAM J. Numer. Anal., 24(5), (1987)
[14] K. Stüben, Algebraic multigrid (AMG): Experience and comparisons, Appl. Math. Comput., 13(1983), 419-451
[15] J. P. Zeng and J. T. Ma, Cascadic multigrid method for a kind of one dimensional elliptic variational inequality, Journal of Hunan University, vol. 28, No. 5(2001), 1-5
[16] Y. M. Zhang, Multilevel projection algorithm for solving obstacle problems, Comp. and Math. with Appl., 41(2001), 1505-1513
[17] U. Trottenberg, C. W. Oosterlee, A. Schüller, A. Brandt, P. Oswald and K. Stüben, Multigrid, Academic Press, 2001
[18] K. Stüben, A review of algebraic multigrid, J. Comp. and Appl. Math., 128(2001), 281-309
[19] C. Iwamura, F. S. Costa, I. Sbarski, A. Easton and N. Li, An efficient algebraic multigrid preconditional conjugate gradient solver, Comput. Methods Appl. Mech. Engrg., 192(2003), 2299-2318
[20] R. H. W. Hoppe and R. Kornhuber, Adaptive multilevel methods for obstacle problems, SIAM J. Numer. Anal., 31(1994), 301-323
[21] R. Kornhuber, Monotone multigrid methods for elliptic variational inequalities I., Numer. Math., 69(1994), 167-184
[22] R. Kornhuber, Monotone multigrid methods for elliptic variational inequalities II., Numer. Math., 72(1996), 481-499
[23] A. Brandt and C. W. Cryer, Multigrid algorithms for the solution of linear complementarity problems arising from free boundary problems, SIAM J. Sci. Statist. Comput., 4(1983), 655-684
[24] R. Glowinski, J. L. Lions and R. Tremolieres, Numerical Analysis of Variational Inequalities, North-Holland, Amsterdam 1981
[25] W. Hackbusch and H. D. Mittelmann, On multigrid methods for variational inequalities, Numer. Math., 42(1983), 65-76
[26] R. H. W. Hoppe, Multigrid algorithms for variational inequalities, SIAM J. Numer. Anal., 24(1987), 1046-1065
[27] -, Two-sided approximations for unilateral variational inequalities by multigrid methods, Optimization, 18(1987), 867-881
[28] -, Une methode multigrid pour la solution des problemes d'obstacle, Math. Model. Numer. Anal., 24(1990), 711-736
[29] J. Mandel, Etude algebrique d'une methode multigrille pour quelques problemes de frontiere libre, C. R. Acad. Sci. Paris, Ser. I, 298(1984), 469-472
[30] -, A multilevel iterative method for symmetric, positive definite linear complementarity problems, Appl. Math. Optimization, 11(1984), 77-95
[31] R. Li and W.-B. Liu, http://circus.math.pku.edu.cn/AFEPack
Parallel Algorithms and Implementation Techniques for Terascale Numerical Simulations of Typical Applications*
Zeyao Mo
Institute of Applied Physics and Computational Mathematics, P.O.Box, Beijing 100088, China
E-mail: [email protected]

Abstract
This paper summarizes our recent work on parallel computing for larger scale numerical simulations of some typical realistic applications on hundreds of processors with a peak performance larger than one Teraflops. The applications introduced here include numerical solutions of multi-material radiation hydrodynamic equations coupled with particle transport in two dimensions, and three dimensional numerical simulations of the laser plasma interaction and of the interface instability driven by laser ablation for Inertial Confinement Fusion (ICF). Primarily, scalable parallel algorithms and implementation techniques are summarized to explain why and how such Terascale parallel numerical simulations can be organized.
1
Introduction
Larger scale numerical simulations have become more and more important tools to accelerate scientific researches especially on high energy density plasma physics where testing experiments are very expensive or impossible to perform [1]. As a challenging application of researches on high energy density plasma physics, Inertial Confine Fusion (ICF) [2] is more and more fascinating scientific computing experts because of its requirements for larger scale numerical simulation for strongly nonlinear multi-physics phenomena especially for multi-materials radiation hydrodynamics. It is well known that as the base to enable such simulations -This work is supported by NNSFC for DYS No.60425205 and NNSFC No.60273030 and Funds of CAEP.
180
Zeyao Mo
on high performance massively parallel computers, researches on parallel computing, particularly the parallel algorithms and implementation techniques, are exclusively significant. For example, for the solution of a two dimensional radiation hydrodynamics equation by using our canonical codes built on well known Lagrangian numerical methods [3], the scale of eight thousands of discrete zones will serially expense about seven days on one microprocessor with the peak performance of 1GFLOPS because of its long integral formulation with tiny time step constrained by the strong nonlinearity of radiation opacity, sharp shock and interface discontinuousness, and also the grid deformation owing to the shear moving of fluids. Moreover, for the solution of a usual multi-groups particle, particularly neutron and photon, transport equation with our codes built on discontinuous finite element discrete ordinates numerical method [4], the scale of 2536 finite elements with the addition of 44 groups and 16 angles will serially expense 240 days. Nevertheless, numerical simulations for higher accuracy and higher resolution still require improving the number of the grid zones nowadays by two magnitudes, up to several millions. Besides from requirements listed above, high accurate simulations within the scales of the laboratory in three dimensions for local plasma phenomena interested, such as laser plasma interactions [5] and interface instability for laser ablation [6], also require that the number of discrete zones or particles should be no less than 10 Giga. For any cases, it is inevitable that thousands of processors must be used. Actually, such numerical simulations will never come to reality if we have no advanced researches on parallel computing, especially on scalable parallel algorithms and implementation techniques [7]. 
In this paper, we survey our main progress in recent years on parallel algorithms and implementation techniques for the numerical simulation of typical ICF applications. In Section 2, requirements and solutions are concisely outlined for the two-dimensional solution of the multi-material radiation hydrodynamics equation coupled with the particle transport equation. Similar outlines for three-dimensional simulations of laser plasma interactions and interface instability are given in Section 3. Lastly, we summarize some related work and outline the prospects of our recent research.
2
Progress in two dimensions
In two-dimensional cases, the numerical solution of the multi-material radiation hydrodynamics equation coupled with the particle transport equation dominates the elapsed time of realistic ICF applications [3,6,8]. Usually, the radiation hydrodynamics equation consists of three components for mass conservation, momentum conservation and energy conservation. The latter can be further approximated by three diffusion equations for the electron, ion and photon temperatures. The three temperatures are tightly and nonlinearly coupled: energy is first exchanged from photons to electrons and then transferred from electrons to ions. The diffusion coefficient of each equation is a strongly nonlinear function of the temperatures with a power of 4, owing especially to the flux-limited radiation opacity of the materials. In special cases where non-equilibrium radiation occurs and the mean free path of photons is long, the photon diffusion equation should be replaced by the photon transport equation to obtain a better approximation [6]. Besides radiation photon transport, neutron or other particle transport should also be considered, to supply the energy source on the right-hand side of the energy conservation equation within a suitable period of the simulation cycle. For simplicity, the particle transport equation can again be approximated by a group of coupled multi-group transport equations, which can be solved efficiently by discrete ordinates methods [4,6].
The multi-material radiation hydrodynamics equation coupled with particle transport can usually be split, by operator splitting of the multi-physics [3,6,8,10], into three systems that are solved in sequence. Firstly, the well-known Euler hydrodynamics equation is solved explicitly for a time step constrained by the CFL stability requirement. Secondly, the three tightly coupled energy diffusion equations for the electron, ion and photon temperatures are solved implicitly within this time step. Lastly, the particle transport equation is also solved implicitly to update the source term for the next time step.
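The three-stage splitting described above can be sketched in a few lines of Python. This is a minimal illustration and not the codes surveyed here: the function names, the dictionary-based state and the toy stage bodies are all assumptions made for the example; only the order of the stages and the CFL-limited explicit step come from the text.

```python
def cfl_time_step(zone_widths, speeds, sound_speeds, cfl=0.5):
    """Explicit step bound dt <= cfl * min(dx / (|u| + c)) over all zones."""
    return cfl * min(dx / (abs(u) + c)
                     for dx, u, c in zip(zone_widths, speeds, sound_speeds))


def advance_one_step(state, dt, stages):
    """Apply the physics stages one after another (first-order splitting)."""
    for stage in stages:
        state = stage(state, dt)
    return state


# Hypothetical stand-ins for the three solvers; each maps (state, dt) -> state.
def hydro_explicit(state, dt):
    # Placeholder for the explicit Euler hydrodynamics update.
    return {**state, "time": state["time"] + dt}


def diffusion_implicit(state, dt):
    # Backward Euler for dT/dt = -k*T: T_new = T / (1 + k*dt), stable for any dt.
    return {**state, "temperature": state["temperature"] / (1.0 + state["k"] * dt)}


def transport_implicit(state, dt):
    # Placeholder: refresh the source term from the updated temperature.
    return {**state, "source": 0.1 * state["temperature"]}


state = {"time": 0.0, "temperature": 1.0, "k": 50.0, "source": 0.0}
dt = cfl_time_step([0.1, 0.2], [1.0, -3.0], [1.0, 1.0])
state = advance_one_step(
    state, dt, [hydro_explicit, diffusion_implicit, transport_implicit])
```

The point of the sketch is the control flow: the explicit stage fixes the time step, and the two implicit stages reuse it without tightening it further.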
To distinguish the interface between two neighboring materials, a Lagrangian grid is usually used to discretize the computational domain, or an Arbitrary Lagrangian-Eulerian (ALE) grid is used to improve the smoothness of the Lagrangian grid when it deforms under the shear motion of fluids [8,10]. However, the Euler hydrodynamics equation, the energy diffusion equation and the particle transport equation should be treated independently for parallel computing, because they usually use different numerical methods with different numerical characteristics and granularity of parallelism, and therefore require different parallel algorithms and implementation techniques.
2.1
Progress for solution of multi-materials Euler hydrodynamics equation
For numerical methods with explicit temporal discretization of the single-material Euler hydrodynamics equation, designing parallel algorithms and implementation techniques with the Message Passing Interface (MPI) [11,12] is straightforward: the computational grid is partitioned into P subdomains distributed to P processors by any of the well-known grid partitioning methods, such as blocking methods for structured grids or graph partitioning methods for unstructured grids [7]. The parallel performance scales well provided that enough zones are distributed to each processor with a well-balanced communication-to-calculation ratio. However, dynamic load imbalance severely violates this assumption for multi-material Euler hydrodynamics on a Lagrangian grid [13]. The imbalance mainly comes from the dynamic variation of the numerical work for zones that contain different materials or undergo different types of deformation. Moreover, these dynamic variations perturb the cache hit ratio and thereby further degrade the elapsed simulation time. Typical test experiments in our work [13] show a loss of parallel efficiency of 35%.
Unfortunately, the static load balancing methods surveyed in [7] cannot be applied directly to our case, because they require the computational cost of each zone to be known before the load is adjusted. This is impossible for us, since the cost of a zone cannot be predicted statically when complex equations of state are solved and various grid deformations are processed. In [14], we presented a dynamic load balancing method, named multilevel averaging weights (MAW), for load imbalance problems in one dimension. It is an iterative method based on the assumption that all zones on a processor have the same weight in terms of consumed CPU cycles. Once one execution of the parallel code is finished, the weight of each processor is known as the ratio of its local CPU time to the total CPU time of all processors, and the weight of each zone is then the ratio of the weight of its processor to the number of local zones.
Based on these weights, we rebalance the zones among processors with the requirement that the weights be evenly distributed over the processors. Therefore, after a sufficient number of iterations, we obtain a well-balanced load distribution. To improve the convergence, a multilevel architecture of processors is formed by recursively binding two processors into a single supernode, within which the iterations are organized from the top level to the bottom level. Theoretical analysis and many tests have shown that a number of multilevel iterations proportional to log P is sufficient for a well-balanced load distribution, where the proportionality factor is the number of iterations required for the case of two processors. The performance of the parallel solution of the multi-material Euler hydrodynamics equation has been improved by 25% through successful application of the MAW method to the dynamic load imbalance problem on tens of processors [14]. Besides, the MAW method extends easily to other one-dimensional load imbalance problems, such as molecular dynamics simulations on 64 processors [15]. With the aid of the well-known space-filling curves [16], which index the zones of a higher-dimensional space within a one-dimensional index space, the MAW method can also be applied to higher-dimensional dynamic load imbalance problems. In fact, we have applied it to three-dimensional molecular dynamics simulations, where severe load imbalance usually occurs dynamically because of non-uniform density distributions and dynamic concentrations of molecules. For some realistic simulations with 0.21 Giga molecules on 500 processors, the MAW method improved the speedup from 315 to 420 relative to the usual dynamic load balancing method, which partitions molecules evenly using space-filling curves [17].
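As an illustration of the weight-averaging idea, the sketch below performs one rebalancing pass for a one-dimensional chain of zones. It is a simplified single-level reading of the description above, not the MAW implementation of [14]: the function name `rebalance`, its arguments and the greedy prefix cut are assumptions, and the multilevel supernode hierarchy is omitted.

```python
def rebalance(zone_counts, cpu_times):
    """One averaging-weights pass for a 1-D chain of zones.

    zone_counts[p]: zones currently owned by processor p (contiguous blocks,
                    each count assumed positive)
    cpu_times[p]:   CPU time measured on processor p during the last run
    Returns the new number of zones assigned to each processor.
    """
    total_time = sum(cpu_times)
    # Every zone on processor p is assumed to carry an equal share of p's weight.
    zone_weights = []
    for count, time in zip(zone_counts, cpu_times):
        zone_weights.extend([time / total_time / count] * count)

    nprocs = len(zone_counts)
    target = 1.0 / nprocs              # equal weight per processor
    new_counts = [0] * nprocs
    accumulated, proc = 0.0, 0
    for weight in zone_weights:
        # Move on once the midpoint of this zone passes the current cut.
        if proc < nprocs - 1 and accumulated + 0.5 * weight >= (proc + 1) * target:
            proc += 1
        new_counts[proc] += 1
        accumulated += weight
    return new_counts


# Processor 0 needed three times as long as processor 1 for the same zone count.
new_counts = rebalance([100, 100], [3.0, 1.0])
```

In this example the expensive zones of processor 0 are partly handed to processor 1, so that both end up with roughly half of the measured weight. Because the per-zone weights are only estimates, the pass would be repeated after the next run, which is the iterative character mentioned in the text.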
2.2
Progress for solution of energy diffusion equation
After the multi-material Euler hydrodynamics equation has been advanced explicitly by one step from time $t_n$ to time $t_{n+1}$ on the Lagrangian or ALE grid, the energy diffusion equation coupling the three temperatures is solved on the same grid with a fully implicit finite volume scheme [7,18]. A fully implicit temporal discretization is indispensable here, because the von Neumann stability condition would restrict the time step of an explicit scheme far more severely than the CFL constraint restricts the Euler hydrodynamics equation [7,18]. Apart from the straightforward parallel implementation with MPI, the kernel of the parallel algorithms is the iterative solver for the resulting strongly nonlinear algebraic equations, based on domain decomposition of the Lagrangian or ALE grid. Many works have focused on scalable parallel iterative solvers for these nonlinear equations and have shown that inexact or Jacobian-free Newton iteration [19] is very efficient, provided that a proper problem-specific linear system, containing the main numerical characteristics of the nonlinear system, is found to precondition the complex linear system arising from the Newton linearization. Naturally, this linear system should itself be solved iteratively by a robust linear solver. Krylov subspace iterative methods preconditioned by parallel multigrid [20], BILUT [21] or diagonal scaling [21] can serve this role in most cases [21]. Papers [22,23,24] compared and presented many serial or parallel solvers for the radiation diffusion equation. We applied them to the energy diffusion equations with the three temperatures of electrons, ions and photons [25] on an adaptive unstructured grid based on the UG framework [26] using tens of processors, and again on a Lagrangian quadrilateral grid for radiation hydrodynamics codes [27,28,29] on hundreds of processors.
In particular, our realistic radiation hydrodynamics applications coupled with the energy diffusion equation have shown that a speedup of 360 can be obtained on 512 processors when the number of zones exceeds 0.6 million [29]. Two important conclusions have matured from the above research. One is that better convergence is always easier to achieve on an ALE grid than on a Lagrangian grid because the ALE grid is smoother, whether symmetric or non-symmetric discrete stencils are used. The other is that a multigrid preconditioner is always essential for the larger-scale (for example, ten thousand zones) parallel solution of the problem-specific linear system, provided that this system properly approximates the complex linear system from the inexact Newton linearization [19,23,25].
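The Jacobian-free ingredient of the Newton iteration cited from [19] can be illustrated with a short sketch: a Krylov solver only needs products of the Jacobian with a vector, and these can be approximated by a finite difference of the nonlinear residual. The residual below is a hypothetical two-unknown toy with a fourth-power term, chosen only to echo the temperature nonlinearity; it is not the discretization used in the codes above, and the step-size rule is one common choice among several.

```python
import math


def jacobian_free_product(residual, u, v, eps=1e-7):
    """Approximate J(u) v by (F(u + h v) - F(u)) / h without forming J."""
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    h = eps / norm_v                     # scale the step to the size of v
    f0 = residual(u)
    f1 = residual([ui + h * vi for ui, vi in zip(u, v)])
    return [(b - a) / h for a, b in zip(f0, f1)]


def toy_residual(T):
    """Hypothetical nonlinear residual F(T) with T**4 terms."""
    return [T[0] ** 4 + T[1] - 2.0, T[0] - T[1] ** 4]


u = [1.0, 1.0]
v = [1.0, 0.0]
# The exact Jacobian at u is [[4, 1], [1, -4]], so J v should be close to [4, 1].
jv = jacobian_free_product(toy_residual, u, v)
```

Each product costs one extra residual evaluation, which is why the approach fits codes where assembling the full Jacobian of the coupled three-temperature system would be expensive.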
2.3
Progress for solution of transport equation
If an energy source term from neutron transport is imposed on the radiation hydrodynamics equation, or the radiation diffusion has been replaced by photon transport, then an additional particle transport equation must be solved within each time step to update the source term or the radiation energy at time $t_{n+1}$ [4]. It is well known that the transport equation here is a typical six-dimensional numerical problem, defined on a two-dimensional geometry together with a three-dimensional velocity space and a one-dimensional temporal space. For simplicity, the particle energy can be grouped into multiple non-overlapping intervals, and a transport equation can be formulated for each group. The original transport equation can then be replaced by the multi-group particle transport equations, with the benefit that the velocity space is reduced from three dimensions to two, defined by unit vectors on the surface of the unit ball [4]. As before, fully implicit schemes in time are indispensable for the solution of the multi-group transport equations. Moreover, discontinuous finite element or finite difference schemes in space, combined with the discrete ordinates ($S_n$) method for the velocity unit vectors, have proven to be among the most effective numerical methods, though some issues such as ray effects still await a better solution [4]. Similarly, strongly nonlinear algebraic equations result from these discrete schemes, especially for the photon transport equation. However, these nonlinear algebraic equations differ from those of the diffusion equations because they are usually written in integral form. Source term iteration associated with flux sweeping for each of the $n(n/2+2)$ angles covering the surface of the unit ball in $S_n$ methods is one of the most popular methods for solving such equations, especially when the number of zones is less than several thousand [4].
However, inexact Newton iterations with Semi-Multigrid-SGA preconditioned Krylov subspace iterative methods, associated with the same flux sweeping, are more robust for such equations when the number of zones exceeds ten thousand [31]. In any case, the kernel of the parallel algorithms is the parallelization of flux sweeping over the zones for each discrete $S_n$ angle, based on a domain decomposition of the geometrical grid. Unfortunately, parallelizing flux sweeping is not trivial because of the strict data dependence among neighboring zones as the flux is updated across the computational domain for each angle. The data dependence among zones for one angle can be represented by a directed graph in which each node represents a zone and each directed edge between two neighboring nodes represents the data dependence between the two zones. The flux must be updated zone by zone, strictly following the data dependence indicated by the directed edges. The direction of the edge between zone A and zone B is determined uniquely by the sign of the inner product between the unit vector representing this angle and the outward normal vector of the shared side with respect to zone B. If the product is positive, the edge points from zone B to zone A, meaning that the flux of zone B must be updated before that of zone A; afterwards, the flux of zone A can be updated using the updated outgoing flux across the shared side from zone B. Otherwise, the reverse holds. In one-dimensional geometry, flux sweeping obviously cannot be parallelized by domain decomposition. In higher dimensions, however, the well-known pipelining or parallel dataflow techniques can extract the potential parallelism hidden in each directed graph. Many works have addressed such parallel algorithms for higher-dimensional structured grids on thousands of processors [32-34]. However, the corresponding parallel algorithms for unstructured grids share hardly any characteristics with them.
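The dependence rule above can be made concrete with a small sketch. For simplicity it uses a hypothetical nx-by-ny structured grid of square zones rather than the unstructured Lagrangian grids of the text, and the function name `sweep_levels` is an assumption. It builds the directed graph from the sign of the inner product between the angle and the outward normals, and then groups zones into levels whose members have no mutual dependence and could therefore be swept concurrently.

```python
def sweep_levels(nx, ny, omega):
    """Group the zones of an nx-by-ny grid into wavefronts for direction omega.

    For a zone B and its outward normal n on a shared side, omega . n > 0
    means flux leaves B through that side, giving a directed edge B -> A.
    """
    normals = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    zones = [(i, j) for i in range(nx) for j in range(ny)]
    downstream = {zone: [] for zone in zones}
    indegree = {zone: 0 for zone in zones}
    for b in zones:
        for n in normals:
            a = (b[0] + n[0], b[1] + n[1])   # neighbour across this side
            if a in downstream and omega[0] * n[0] + omega[1] * n[1] > 0:
                downstream[b].append(a)      # B must be updated before A
                indegree[a] += 1

    # Level-by-level topological ordering of the directed graph.
    levels = []
    ready = [zone for zone in zones if indegree[zone] == 0]
    while ready:
        levels.append(sorted(ready))
        next_ready = []
        for b in ready:
            for a in downstream[b]:
                indegree[a] -= 1
                if indegree[a] == 0:
                    next_ready.append(a)
        ready = next_ready
    return levels


levels_2d = sweep_levels(3, 2, (0.6, 0.8))   # diagonal wavefronts
levels_1d = sweep_levels(4, 1, (1.0, 0.0))   # one zone per level
```

On the two-dimensional grid several zones share a level, which is the parallelism that pipelining exploits; on the one-dimensional chain every level holds a single zone, matching the remark that one-dimensional sweeps cannot be parallelized by domain decomposition.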
Plimpton, Hendrickson, et al. [35] first addressed this problem for the three-dimensional two-group radiation transport Boltzmann equation and achieved a parallel efficiency of 80% on 256 dual nodes of ASCI Red for 0.1 million zones and 80 angles. In 2002, we addressed this problem and presented in [36] a parallel flux sweeping algorithm for the two-dimensional cylindrically symmetric multi-group neutron transport equation, based on domain decomposition of the unstructured Lagrangian grid. One test with 2536 zones, 44 groups and 16 angles showed that our parallel algorithm achieves a speedup of 72 on 92 processors of machine A, whose MPI latency is less than 2 microseconds, and a speedup of 80 on 256 processors of machine B, whose MPI latency equals 10 microseconds. Another, larger application further showed a speedup of 254 on 500 processors of machine B [37]. The scalability analysis in [38] showed that the performance of such parallel algorithms is sensitive to MPI latency, i.e., the parallel solution of neutron transport equations needs high performance machines with low MPI latency. Furthermore, last year we also addressed such problems for the two-dimensional cylindrically symmetric multi-group radiation transport equation on a staggered unstructured Lagrangian grid [39]. We obtained a speedup of 60 on 128 processors of machine B for an application with 6400 zones, 20 groups and 40 angles.
3
Progress in three dimensions
Though full three-dimensional numerical simulations for ICF research are currently not possible, owing to the lack of powerful computing resources and robust numerical methods, three-dimensional simulations are still essential for some significant physical principles of local interest. In recent years, we have focused on the parallelization of two codes for this purpose. One is designed for simulations of laser plasma interaction based on Cloud-in-Cell (CIC) methods [5], which couple plasma particle behavior with Maxwell electromagnetic fields; the other is designed for simulations of interface instability driven by laser ablation, based on Euler numerical methods coupled with alternating direction implicit (ADI) iterative solvers for the electron heat transfer equation.
We organize the parallelization of the CIC numerical methods [40] as follows. Firstly, we decompose the structured grid into regular subdomains distributed to different processors. Explicit finite difference schemes for the discretization of the Maxwell equations can be solved efficiently on these subdomains in parallel. The Poisson equation for the correction of the electric potential can also be solved easily in parallel by the CG iterative method preconditioned by BILU(0) [21]. Secondly, we distribute the particles (electrons and ions) located in each subdomain to the processor owning this subdomain. Each processor updates all of its particles in parallel at each time step. Thirdly, after each time step we pack all particles that have left each subdomain and transfer them individually to the neighboring subdomain into which they have moved. Numerical applications in [40] showed that we can achieve a parallel efficiency of 82%.
We organize the parallelization of the simulation of the interface instability [41] as follows. Firstly, we decompose the structured grid into regular subdomains distributed to different processors.
Explicit finite difference schemes for the discretization of the Euler hydrodynamics equation can be solved efficiently on these subdomains in parallel. In particular, we use the alternating plane communication scheme to suppress the communication overhead. Secondly, we use parallel pipelining techniques to solve the tridiagonal linear systems arising in the ADI iterative solver for the implicit discretization of the electron heat transfer equation. Numerical applications in [41] also showed that we can achieve a parallel efficiency of 90%.
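The tridiagonal systems mentioned above are typically solved by the Thomas algorithm, whose serial form is sketched below. This is a generic textbook routine given for orientation, not the pipelined implementation of [41]; the function name and argument layout are assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].

    lower[0] and upper[-1] are ignored. No pivoting is done, so the matrix is
    assumed to be diagonally dominant, as is usual for implicit diffusion steps.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):                 # forward elimination: needs row i-1
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):        # back substitution: needs row i+1
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Rows of -x[i-1] + 2*x[i] - x[i+1] whose solution is [1, 2, 3].
x = solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                      [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])
```

Both loops carry a dependence on the neighboring row, so a single line cannot be split across subdomains without waiting. With many independent grid lines per ADI direction, a processor can hand a finished line on to its neighbor and start the next one, which is the essence of pipelining.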
4
Related works
Besides the works surveyed above, we have summarized our practical parallelization and performance optimization techniques in [42]. Recently, we have focused on parallel algorithms for efficiently concatenating multiple codes, each designed for the numerical solution of different physics, so that they execute together as one full parallel simulation [43]. In the future, we will focus our research on scalable parallel algorithms and implementation techniques for more complex applications, and we will try to build a software framework that shortens the development cycle of application codes. Moreover, we will pay attention to parallel adaptive computing on structured grids for the numerical solution of the higher-dimensional radiation hydrodynamics equation.
Acknowledgments
These works were completed with the help of Associate Professor Xiaolin Cao, Fengli Zuo and Dr. Aiqing Zhang of the High Performance Computing Center in IAPCM.
References
[1] Committee on High Energy Density Plasma Physics, et al., Frontiers in High Energy Density Physics: The X-Games of Contemporary Science, 2003, NAS, USA.
[2] National Ignition Facility Program, available at http://www.llnl.gov/nif.
[3] Deyuan Zi, et al., Numerical methods for two dimensional nonsteady hydrodynamics, Chinese Academic Press, Beijing, 1998.
[4] E. E. Lewis and W. F. Miller, Computational Methods of Neutron Transport, John Wiley & Sons Publisher, 1984.
[5] Chang Tianqiang, et al., Laser plasma interactions and laser fusion, Hunan Academic Publisher, 1991.
[6] R. L. Bowers and J. R. Wilson, Numerical modeling in applied physics and astrophysics, Jones and Bartlett Publishers, 1991.
[7] J. Dongarra, et al., Sourcebook of Parallel Computing, Morgan Kaufmann Publishers, 2003.
[8] M. L. Wilkins, Computer Simulation of Dynamic Phenomena, Springer, Berlin, 1999.
[9] Francis H. Harlow, Fluid dynamics in Group T-3 Los Alamos National Laboratory, J. Comp. Phys., 195(2004):414-433.
[10] Pavel Vachal, Rao V. Garimella and Mikhail J. Shashkov, Untangling of 2D meshes in ALE simulations, J. Comp. Phys., 196(2004):627-644.
[11] Z. Mo and G. Yuan, Parallel programming with message passing interface MPI, Chinese Academic Publisher, 2001. (In Chinese)
[12] W. Gropp, E. Lusk and A. Skjellum, Using MPI: portable parallel programming with the message-passing interface, 2nd edition, MIT Press, Cambridge, MA, 1999.
[13] Z. Mo and S. Fu, Parallelization of 2-dimensional hydrodynamics equation concerned with three temperatures, Chinese J. Comp. Phys., 17(2000):625-632.
[14] Z. Mo and B. Zhang, Multilevel averaging weight method for dynamic load imbalance problems, Intern. J. Computer Math., 76(2001):463-477.
[15] Z. Mo, J. Zhang and Q. Cai, Dynamic load balancing for short-range parallel molecular dynamics simulations, Intern. J. Computer Math., 79(2002):165-177.
[16] H. Sagan, Space-filling curves, Springer-Verlag, 1994.
[17] X. Cao and Z. Mo, A new scalable parallel method for molecular dynamics based on Cell-Block data structure, In: Proceedings of ISPA 2004, Hong Kong, J. Cao, L. T. Yang and F. Lau (eds.), Lecture Notes in Computer Science, 3358(2004):757-764.
[18] Peter N. Brown, Dana E. Shumaker and Carol S. Woodward, Fully implicit solution of large-scale non-equilibrium radiation diffusion with high order time integration, J. Comp. Phys., 204(2005):760-783.
[19] D. A. Knoll and D. E. Keyes, Jacobian-free Newton-Krylov methods: a survey of approaches and applications, J. Comp. Phys., 193(2004):357-397.
[20] U. Trottenberg, C. Oosterlee and A. Schüller, Multigrid, Academic Press, 2001.
[21] Y. Saad, Iterative methods for sparse linear systems, 2nd edition, SIAM, Philadelphia, 2003.
[22] W. J. Rider, D. A. Knoll and G. L. Olson, A multigrid Newton-Krylov method for multimaterial equilibrium radiation diffusion, J. Comp. Phys., 152(1999):164.
[23] C. Baldwin, P. N. Brown, R. Falgout, F. Graziani and J. Jones, Iterative linear solvers in a 2D radiation-hydrodynamics code: methods and performance, J. Comp. Phys., 154(1999):1.
[24] S. Schaffer, A semi-coarsening multigrid method for elliptic partial differential equations with highly discontinuous and anisotropic coefficients, SIAM J. Sci. Stat., 20(1)(1998):228-242.
[25] Z. Mo, L. Shen and G. Wittum, Parallel adaptive multigrid algorithm for 2-D 3-T diffusion equations, Intern. J. Computer Math., 81(2004):361-374.
[26] UG: A flexible software toolbox for solving partial differential equations, http://cox.iwr-heidelberg.de/ug/index.html.
[27] Z. Mo and S. Fu, Krylov subspace iterative methods for solution of two-dimensional energy equations with three temperatures, Chinese J. Numer. Math. & Appl., 24(2)(2003):133-143.
[28] Y. Xiao, S. Shu, P. Zhang, Z. Mo and J. Xu, Semicoarsening Algebraic multigrid solver for solution of 2-D 3-T energy equation, Chinese J. Numer. Math. & Appl., 24(4)(2003):293-303.
[29] Z. Mo, Parallelization and optimization of energy diffusion equation and neutron transport equation, GF Report ZW-J-2003233, IAPCM, 2003.
[30] T. A. Wareing, J. M. McGhee, J. E. Morel and S. D. Pautz, Discontinuous Finite Element Sn Methods on 3-D unstructured Grids, In: Proceedings of International Conference on Mathematics and Computation, Reactor Physics and Environment Analysis in Nuclear Applications, 1999, Madrid, Spain.
[31] Ardra: Scalable parallel code system to perform neutron and radiation transport calculations, http://www.llnl.gov/ardra.
[32] SWEEP 3D Discrete Ordinates Neutron Transport Benchmark Codes, http://www.llnl.gov/asci_benchmarks/asci/limited/sweep3d/sweep3d_readme.html.
[33] R. S. Baker and R. E. Alcouffe, Parallel 3-d Sn Performance for MPI on Cray-T3D, In: Proceedings of Joint International Conference on Mathematics Methods and Supercomputing for Nuclear Applications, Vol. 1, pp. 377-393, 1997.
[34] R. S. Baker and K. R. Koch, An Sn algorithm for the massively parallel CM-200 computer, Nucl. Sci. Eng., Vol. 128, 1998, pp. 312-320.
[35] S. Plimpton, B. Hendrickson, S. Burns and W. McLendon, Parallel algorithms for radiation transport on unstructured grids, In: Proceedings of Super Computing, 2000.
[36] Z. Mo and X. Fu, Parallel flux sweeping algorithm for neutron transport on unstructured grid, Journal of Supercomputing, 30(1)(2004):5-17.
[37] J. Wei and L. Pu, Applications of parallel flux sweeping algorithm for solution of steady neutron transport equation, GF Report, ZW-J-2004126, IAPCM, 2004.
[38] Z. Mo, Parallel pipelining algorithm for neutron transport on unstructured grid, Chinese J. Comput., 27(5)(2004):587-595.
[39] A. Zhang, Z. Mo, et al., Parallel implementations of LARED-I codes, GF Report, ZW-J-2004061, IAPCM, 2004.
[40] A. Zhang, Z. Mo, et al., Parallelization of LARED-P codes for simulation of laser plasma interactions, GF Report, ZW-J-2002045, IAPCM, 2002.
[41] F. Zuo, Z. Mo, et al., Alternating plane parallel algorithms for numerical simulation of interface instability driven by laser ablation, Chinese J. Numer. Math. & Appl., 2004.
[42] Z. Mo, Research on key techniques for parallelization and optimization of applied codes, Chinese J. Numer. Math. & Appl., Vol. 24, No. 2, 2002, pp. 45-58.
[43] Z. Mo, Concatenation algorithm for parallel numerical simulation of hydrodynamics coupled with neutron transport, Intern. J. Parallel Programming, 33(2005):57-71.
Long Time Behaviour of Solutions to Linear Thermoelastic Systems with Second Sound
Yaguang Wang
Department of Mathematics, Shanghai Jiao Tong University
Shanghai, 200240 China. E-mail: [email protected]
1
Introduction
The thermoelastic systems describe the elastic and thermal behaviour of elastic heat conductive media, in particular the reciprocal actions between elastic stresses and temperature differences. They are hyperbolic-parabolic coupled systems if the heat conduction is assumed to be governed by the Fourier law (see e.g. [3,4,17]). To remove the physical paradox of infinite speed of propagation for thermal disturbances in classical thermoelasticity, one idea is to use the Cattaneo law instead of the Fourier law for heat conduction, which makes the thermoelastic systems purely hyperbolic (see [1,2]) and transmits thermal disturbances as wave-like pulses travelling at finite speeds. This is known as second sound. There have been some interesting results concerning the existence and stability of solutions to both Cauchy problems and initial-boundary value problems of the thermoelastic system with second sound (cf. [5,9] and references therein). The $L^p$-$L^q$ decay estimates of solutions to linearized problems are an important feature of the problems under consideration. Meanwhile, it is well known that establishing the $L^p$-$L^q$ decay estimates is the crucial step in studying the existence of global solutions with small initial data for nonlinear problems (see e.g. [3,4,17]).
In this note, we shall survey some recent observations with collaborators on the asymptotic behaviour of solutions to the Cauchy problem of the thermoelastic system with second sound. In §2, we shall derive the $L^p$-$L^q$ decay estimates of solutions to a linear system in one space variable, and the three space variable case will be investigated in §3. Finally, in §4 we study the relaxation limit of discontinuous solutions.
The main idea in deriving the $L^p$-$L^q$ decay estimates is to diagonalize the linear thermoelastic system in regions of large and small frequencies in order to obtain the asymptotic information of the characteristic roots of the system. This idea has been applied successfully in studying the propagation of singularities of solutions to hyperbolic-parabolic coupled systems ([10,11]), the $L^p$-$L^q$ decay rates of solutions to the linear thermoelastic system of type III ([7,16]), the $L^p$-$L^q$ decay rates of solutions to the linear thermoelastic system of hyperbolic-parabolic coupled type with time-dependent coefficients ([12,13]), and the diffusive structure of the classical thermoelastic system as $t \to +\infty$ ([8]).
2
$L^p$-$L^q$ decay estimates in linear thermoelastic system with second sound in 1-d
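To explain how to derive $L^p$-$L^q$ decay estimates by frequency analysis, let us first consider the following Cauchy problem for the linear thermoelastic system with second sound in one space variable:
$$
\begin{cases}
u_{tt} - \alpha^2 u_{xx} + \beta\theta_x = 0,\\
\theta_t + \gamma q_x + \delta u_{xt} = 0,\\
\tau q_t + q + \kappa\theta_x = 0,\\
u(0,x) = u_0(x),\quad u_t(0,x) = u_1(x),\quad \theta(0,x) = \theta_0(x),\quad q(0,x) = q_0(x),
\end{cases}
\tag{2.1}
$$
where $u$ and $\theta$ are the displacement and the temperature difference of the elastic media, $q$ is the heat flux, and $\alpha$, $\beta$, $\gamma$, $\delta$, $\tau$ and $\kappa$ are all positive constants. Denote by
$$
u^+ = \hat u_t + i\alpha\xi\hat u, \qquad u^- = \hat u_t - i\alpha\xi\hat u,
$$
where $\hat u$ is the Fourier transform of $u$ with respect to $x$. From (2.1) we know that $V = (u^+, u^-, \hat\theta, \hat q)^T$ satisfies the following Cauchy problem:
$$
\begin{cases}
\partial_t V + i\xi A_1 V + A_0 V = 0,\\
V(0,\xi) := V_0(\xi) = (\hat u_1 + i\alpha\xi\hat u_0,\ \hat u_1 - i\alpha\xi\hat u_0,\ \hat\theta_0,\ \hat q_0)^T,
\end{cases}
\tag{2.2}
$$
where
$$
A_1 = \begin{pmatrix} -\alpha & 0 & \beta & 0\\ 0 & \alpha & \beta & 0\\ \frac{\delta}{2} & \frac{\delta}{2} & 0 & \gamma\\ 0 & 0 & \frac{\kappa}{\tau} & 0 \end{pmatrix},
\qquad
A_0 = \mathrm{diag}\Big\{0, 0, 0, \frac{1}{\tau}\Big\}.
$$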
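As a brief check of the first row of $A_1$, the following sketch uses only (2.1) and the definition of $u^\pm$; hats denote the Fourier transform in $x$, and the constants of (2.1) are written as $\alpha, \beta, \gamma, \delta, \tau, \kappa$. Differentiating $u^+$ in time and substituting the transformed first equation $\hat u_{tt} + \alpha^2\xi^2\hat u + i\beta\xi\hat\theta = 0$ gives

$$
\partial_t u^+ = \hat u_{tt} + i\alpha\xi\,\hat u_t
= i\alpha\xi\,(\hat u_t + i\alpha\xi\,\hat u) - i\beta\xi\,\hat\theta
= i\alpha\xi\,u^+ - i\beta\xi\,\hat\theta,
$$

that is, $\partial_t u^+ + i\xi(-\alpha u^+ + \beta\hat\theta) = 0$. The same computation for $u^-$ gives $\partial_t u^- + i\xi(\alpha u^- + \beta\hat\theta) = 0$. The identity $\hat u_t = (u^+ + u^-)/2$ turns the second equation into $\partial_t\hat\theta + i\xi\big(\tfrac{\delta}{2}u^+ + \tfrac{\delta}{2}u^- + \gamma\hat q\big) = 0$, and dividing the third equation by $\tau$ gives $\partial_t\hat q + i\xi\tfrac{\kappa}{\tau}\hat\theta + \tfrac{1}{\tau}\hat q = 0$, which accounts for the last row of $A_1$ and the only nonzero entry of $A_0$.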
We are going to diagonalize the coefficient matrix $i\xi A_1 + A_0$ given in (2.2) in regions of large and small frequencies in order to obtain the asymptotic behaviour of characteristic roots. This is crucial to derive decay estimates of the solution.
2.1
In the region $|\xi| < a$ for small $a > 0$
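In this region, the dominant matrix $A_0$ is diagonal already. Let us diagonalize $A_1$. Letting $V^1 = (I + K_1\xi)V$ with $K_1$ being a constant matrix, we have
$$
\partial_t V^1 + A_0 V^1 + (i\xi A_1 - \xi[A_0, K_1])V^1 + A^1_2(\xi)V^1 = 0, \tag{2.3}
$$
where $[\cdot,\cdot]$ means the commutator of two related matrices and $A^1_2(\xi) = (c_{ij})_{4\times 4}$ with $c_{ij} = O(|\xi|^2)$ at least when $\xi \to 0$. Obviously, by choosing
$$
K_1 = i\begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -\tau\gamma\\ 0 & 0 & \kappa & 0 \end{pmatrix}
$$
we have
$$
\xi[K_1, A_0] + i\xi A_1 = i\xi A^1_1, \qquad
A^1_1 := \begin{pmatrix} -\alpha & 0 & \beta & 0\\ 0 & \alpha & \beta & 0\\ \frac{\delta}{2} & \frac{\delta}{2} & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix},
$$
and $V^1$ satisfies
$$
\partial_t V^1 + A_0 V^1 + i\xi A^1_1 V^1 + A^1_2(\xi)V^1 = 0. \tag{2.4}
$$
Let $D$ be the left upper $(3\times 3)$-block of $A^1_1$. It is easy to know that its eigenvalues are
$$
\lambda_1 = 0, \qquad \lambda_{2/3} = \pm\sqrt{\beta\delta + \alpha^2}.
$$
Denote by $l_k$ and $r_k$ the left and right eigenvectors of $D$ with respect to $\lambda_k$ for $1 \le k \le 3$ such that the normalization $l_j r_k = \delta_{jk}$ holds. Let
$$
K_2 = \begin{pmatrix} \begin{pmatrix} l_1\\ l_2\\ l_3 \end{pmatrix}_{3\times 3} & (0)_{3\times 1}\\ (0)_{1\times 3} & 1 \end{pmatrix},
$$
and then $V^2 = K_2 V^1$ satisfies
$$
\partial_t V^2 + A^2_0 V^2 + i\xi A^2_1 V^2 + A^2_2(\xi)V^2 = 0, \tag{2.5}
$$
where $A^2_0 = A_0$, $A^2_1 = \mathrm{diag}\{0, \sqrt{\beta\delta + \alpha^2}, -\sqrt{\beta\delta + \alpha^2}, 0\}$ and $A^2_2(\xi) = (c_{ij})_{4\times 4}$ with $c_{ij} = O(|\xi|^2)$ at least when $\xi \to 0$. It is not difficult to find two constant matrices $K_3$ and $K_4$ such that $V^3 = (I + K_4\xi)(I + K_3\xi^2)V^2$ satisfies
$$
\partial_t V^3 + A^3_0 V^3 + i\xi A^3_1 V^3 + \xi^2 A^3_2 V^3 + A^3_3(\xi)V^3 = 0, \tag{2.6}
$$
where $A^3_0 = A^2_0$, $A^3_1 = A^2_1$, $A^3_3(\xi) = (c_{ij})_{4\times 4}$ with $c_{ij} = O(|\xi|^3)$ when $\xi \to 0$, and
$$
A^3_2 = \mathrm{diag}\{C_1, C_2, C_3, -\gamma\kappa\}
$$
with $C_j$ ($j = 1, 2, 3$) being positive constants. Thus, we conclude
Proposition 2.1. (1) The characteristic roots $\nu_k = \nu_k(\xi)$ ($k = 1, 2, 3, 4$) of the matrix $A_0 + i\xi A_1$ given in (2.2) behave for $|\xi| < a \ll 1$ as
$$
\begin{aligned}
\nu_1 &= C_1\xi^2 + O(|\xi|^3),\\
\nu_2 &= i\sqrt{\beta\delta + \alpha^2}\,\xi + C_2\xi^2 + O(|\xi|^3),\\
\nu_3 &= -i\sqrt{\beta\delta + \alpha^2}\,\xi + C_3\xi^2 + O(|\xi|^3),\\
\nu_4 &= \frac{1}{\tau} - \gamma\kappa\,\xi^2 + O(|\xi|^3)
\end{aligned}
$$
with $C_j > 0$ being given as above.
(2) The solution to the Cauchy problem (2.2) has in $|\xi| < a \ll 1$ the following representation:
$$
V(t,\xi) = Q_1^{-1}\,\mathrm{diag}\{\exp(-\nu_1 t), \exp(-\nu_2 t), \exp(-\nu_3 t), \exp(-\nu_4 t)\}\,Q_1 V_0(\xi),
$$
where $Q_1 := M_1(I + K_4\xi)(I + K_3\xi^2)K_2(I + K_1\xi)$ with $K_j$ ($j = 1, 2, 3, 4$) given as above, and $M_1 = (l^1_{jk}(\xi))_{j,k=1}^4$ with $l^1_{kk} = 1$ and $l^1_{jk} = O(|\xi|^2)$ for $|\xi| \to 0$ and $j \ne k$.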
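The leading terms of $\nu_4$ can be cross-checked independently of the chain of transformations. The following is a sketch that assumes only the standard second-order perturbation formula for a simple eigenvalue, applied to the eigenvalue $1/\tau$ of $A_0 = \mathrm{diag}\{0, 0, 0, 1/\tau\}$ perturbed by $i\xi A_1$, with $(A_1)_{jk}$ denoting the entries of $A_1$ in (2.2): $(A_1)_{44} = 0$, $(A_1)_{43} = \kappa/\tau$, $(A_1)_{34} = \gamma$ and $(A_1)_{14} = (A_1)_{24} = 0$.

$$
\nu_4 = \frac{1}{\tau} + i\xi\,(A_1)_{44} + (i\xi)^2 \sum_{k \ne 4} \frac{(A_1)_{4k}(A_1)_{k4}}{\frac{1}{\tau} - 0} + O(|\xi|^3)
= \frac{1}{\tau} - \xi^2\,\tau\cdot\frac{\kappa}{\tau}\cdot\gamma + O(|\xi|^3)
= \frac{1}{\tau} - \gamma\kappa\,\xi^2 + O(|\xi|^3).
$$

A second consistency check comes from the trace: since $\mathrm{tr}(A_0 + i\xi A_1) = 1/\tau$ for every $\xi$, the four roots must sum to $1/\tau$, so the $\xi^2$ coefficients of the other three roots add up to $\gamma\kappa$.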
2.2
In the region $|\xi| > N$ for large $N > 0$
Let us first diagonalize i£A\ of (2.2) in this region. It is easy to know that A\ has four distinct real characteristic roots:
Ai/2 = ±y—^—, A3/4 = ± y — — , where
T
V
T
r
Denote by $l_k$ and $r_k$ the left and right eigenvectors of $A_1$ with respect to $\lambda_k$ $(1 \le k \le 4)$ such that the normalization $l_j r_i = \delta_{ij}$ holds. Let $L_1 = (l_1, l_2, l_3, l_4)^T$. Then from (2.2), $V^1 = L_1 V$ satisfies
$$\partial_t V^1 + i\xi\Lambda V^1 + A_0^1 V^1 = 0, \tag{2.8}$$
where $\Lambda = \mathrm{diag}\{\lambda_1, \lambda_2, \lambda_3, \lambda_4\}$ and $A_0^1 = (a_{jk})_{4\times 4}$ is a constant matrix with $a_{jj} > 0$ for any $1 \le j \le 4$. One can find a constant matrix $L_2$ such that $V^2 = (I + L_2\xi^{-1})V^1$ satisfies
$$\partial_t V^2 + i\xi\Lambda V^2 + A_0^2 V^2 + A^2_{-1}(\xi) V^2 = 0,$$
where $A_0^2 = \mathrm{diag}\{a_{11}, a_{22}, a_{33}, a_{44}\}$ and $A^2_{-1}(\xi) = (c_{ij})_{4\times 4}$ with $c_{ij} = O(|\xi|^{-1})$ when $\xi \to \infty$. Therefore, we get

Proposition 2.2. (1) The characteristic roots $\nu_k$ $(k = 1,2,3,4)$ of the matrix $i\xi A_1 + A_0$ behave for $|\xi| \ge N \gg 1$ as
$$\nu_k = \nu_k(\xi) = a_{kk} + i\xi\lambda_k + O(|\xi|^{-1}),$$
where $a_{kk}$ are the entries of $A_0^2$ and $\lambda_k$ are the characteristic roots of $A_1$ in (2.2).
(2) The solution to the Cauchy problem (2.2) has the following representation as $|\xi| \ge N \gg 1$:
$$V(t,\xi) = Q_2^{-1}\,\mathrm{diag}\{\exp(-\nu_1 t), \exp(-\nu_2 t), \exp(-\nu_3 t), \exp(-\nu_4 t)\}\,Q_2 V_0(\xi),$$
where $Q_2 = M_2(I + L_2\xi^{-1})L_1$ with $L_i$ $(i = 1,2)$ given as above and $M_2 = (l^2_{jk}(\xi))_{j,k=1}^4$ with $l^2_{kk} = r^2_{kk} = 1$ and $r^2_{jk}(\xi) = O(|\xi|^{-1})$, $l^2_{jk}(\xi) = O(|\xi|^{-1})$ for $|\xi| \to \infty$ and $j \ne k$.
2.3 In the region $\sigma \le |\xi| \le N$

Letting $\nu_k$ $(k = 1,2,3,4)$ denote the characteristic roots of $i\xi A_1 + A_0$, from Propositions 2.1, 2.2 and the following lemma we know that there is a constant $C$ such that
$$\mathrm{Re}\,\nu_k(\xi) \ge C > 0 \qquad (k = 1,2,3,4)$$
for all $\xi \in \{\sigma \le |\xi| \le N\}$.
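The uniform lower bound on the real parts in the intermediate region can be explored numerically. The following sketch scans a grid of frequencies; it again assumes the symbol derived from the linear version of system (4.1) with unit constants, so the matrices, the function name `min_real_part` and the sample values of $\sigma$ and $N$ are illustrative assumptions only. Only $\xi > 0$ is scanned, because the roots at $-\xi$ are the complex conjugates of those at $\xi$.

```python
import numpy as np

def min_real_part(sigma, N, num=2000, alpha=1.0, beta=1.0, gamma=1.0,
                  delta=1.0, kappa=1.0, tau=1.0):
    """Smallest Re(nu_k(xi)) over a grid of xi in [sigma, N] for the assumed symbol."""
    A0 = np.diag([0.0, 0.0, 0.0, 1.0 / tau])
    A1 = np.array([
        [-alpha, 0.0, beta, 0.0],
        [0.0, alpha, beta, 0.0],
        [delta / 2.0, delta / 2.0, 0.0, gamma],
        [0.0, 0.0, kappa / tau, 0.0],
    ])
    grid = np.linspace(sigma, N, num)
    return min(np.linalg.eigvals(A0 + 1j * xi * A1).real.min() for xi in grid)

lower_bound = min_real_part(0.1, 50.0)
print("min Re nu on the grid:", lower_bound)
```

A grid scan is of course no proof; it only illustrates the constant $C$ whose existence follows from Lemma 2.1 and compactness.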
Lemma 2.1. The matrix $i\xi A_1 + A_0$ in (2.2) has no purely imaginary eigenvalue $\nu = ia$ with $a \in \mathbb{R}$ and $a \ne 0$ in $\{\sigma \le |\xi| \le N\}$.

It is not difficult to verify this lemma.

Proposition 2.3. There exist two positive constants $C_1$ and $C_2$ such that the solution $V = V(t,\xi)$ to the Cauchy problem (2.2) satisfies in $\{\sigma \le |\xi| \le N\}$ the following estimate:
$$|V(t,\xi)| \le C_1 e^{-C_2 t}|V_0(\xi)|.$$
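Proof. From above we know $\mathrm{Re}\,\nu_k \ge C > 0$ for all $\xi \in \{\sigma \le |\xi| \le N\}$. For the equation $\partial_t V + (i\xi A_1 + A_0)V = 0$, there exists a regular matrix $L = L(\xi)$ such that $L(i\xi A_1 + A_0)L^{-1} = \Lambda$ is a Jordan matrix. Thus $W = LV$ satisfies $\partial_t W + \Lambda W = 0$, where the Jordan form of $\Lambda$ depends on the multiplicity of the characteristic roots $\nu_k(\xi)$ in $\{\sigma \le |\xi| \le N\}$. If all roots are simple, then this proposition follows immediately. Otherwise, for example, let us consider the case that the multiplicity of the characteristic roots $\nu_1 = \nu_2$, $\nu_3 = \nu_4$ is two, but $\nu_1 \ne \nu_3$ at a point $\xi_0 \in \{\sigma \le |\xi| \le N\}$ and
$$\Lambda(\xi_0) = \begin{pmatrix} \nu_1(\xi_0) & 0 & 0 & 0\\ 1 & \nu_1(\xi_0) & 0 & 0\\ 0 & 0 & \nu_3(\xi_0) & 0\\ 0 & 0 & 1 & \nu_3(\xi_0) \end{pmatrix}.$$
In this case, the solution $W(t,\xi_0) = (W_1, W_2, W_3, W_4)^T$ can be expressed as:
$$W_1(t,\xi_0) = W_1(0,\xi_0)e^{-\nu_1(\xi_0)t}, \qquad W_3(t,\xi_0) = W_3(0,\xi_0)e^{-\nu_3(\xi_0)t},$$
$$W_2(t,\xi_0) = \bigl(W_2(0,\xi_0) - tW_1(0,\xi_0)\bigr)e^{-\nu_1(\xi_0)t}, \qquad W_4(t,\xi_0) = \bigl(W_4(0,\xi_0) - tW_3(0,\xi_0)\bigr)e^{-\nu_3(\xi_0)t},$$
which decay exponentially since $\mathrm{Re}\,\nu_k(\xi_0) > 0$. Using a perturbation argument, one deduces that there is $\varepsilon > 0$ such that when $|\xi - \xi_0| < \varepsilon$, $|V(t,\xi)| \le C_1 e^{-C_2 t}|V_0(\xi)|$.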
(2-9)
ll^-H^O^Ci.O)!^-^) < C7(l + *)-i||^-1(0(O^o(O)llx.H«) (2-10) os |£| < a «
1 and
< cexp(-c2t)ii^-H(i - mm2v0(0)h^R)
[2Al)
as |£| >CTfor a cutoff function >(£) € C°°(.R) satisfying
^
\ 0 , |C| > a.
•
By using an interpolation between (2.9) and (2.10), (2.11), and going back from $V$ to the solution $(u,\theta,q)$ of (2.1), we obtain

Theorem 2.1. Let $(u,\theta,q)$ be the unique solution to the problem (2.1). Then the following estimates hold:
$$\|(u_t, u_x, \theta, q)(t,\cdot)\|_{L^q(\mathbb{R})}$$
where $\frac{1}{p} + \frac{1}{q} = 1$, $2 \le q \le \infty$.

The details of this section can be found in [15].
3 $L^p$-$L^q$ decay estimates in linear thermoelastic system with second sound in 3-d

In this section, we study the long time behaviour of solutions to the Cauchy problem of the linear thermoelastic system with second sound in three space variables. Consider the following Cauchy problem in $\{t > 0, x \in \mathbb{R}^3\}$:
$$\begin{cases}
y_{tt} + \alpha_2^2\,\nabla\times\nabla\times y - \alpha_1^2\,\nabla\nabla' y + \gamma\nabla\theta = 0,\\
\theta_t + \beta\nabla' q + \delta\nabla' y_t = 0,\\
\tau q_t + q + \kappa\nabla\theta = 0,\\
y(0,x) = y_0(x),\ y_t(0,x) = y_1(x),\ \theta(0,x) = \theta_0(x),\ q(0,x) = q_0(x),
\end{cases} \tag{3.1}$$
where $\alpha_1, \alpha_2$ are positive constants with $\alpha_1 < \alpha_2$, $\Delta$, $\nabla$ and $\nabla'$ represent the Laplace, gradient and divergence operators in space variables, respectively, and all other notations are the same as in (2.1).
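As usual, decompose the vector $y$ into $y = y^p + y^s$ with $y^p$ being the potential part, i.e., $\nabla\times y^p = 0$, and $y^s$ being the solenoidal part, i.e., $\nabla' y^s = 0$. From (3.1) we obtain that $y^s$ and $(y^p,\theta,q)$ satisfy the following problems
$$\begin{cases}
y^s_{tt} - \alpha_2^2\Delta y^s = 0,\\
y^s(0,x) = y^s_0(x),\ y^s_t(0,x) = y^s_1(x)
\end{cases} \tag{3.2}$$
and
$$\begin{cases}
y^p_{tt} - \alpha_1^2\,\nabla\nabla' y^p + \gamma\nabla\theta = 0,\\
\theta_t + \beta\nabla' q + \delta\nabla' y^p_t = 0,\\
\tau q_t + q + \kappa\nabla\theta = 0,\\
y^p(0,x) = y^p_0(x),\ y^p_t(0,x) = y^p_1(x),\\
\theta(0,x) = \theta_0(x),\ q(0,x) = q_0(x),
\end{cases} \tag{3.3}$$
respectively. For problem (3.2), the following estimates are well-known: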
||(Vi/8,y|)(t,.)IU.(fl3)
(3.4)
where $N > 3(1 - \frac{2}{q})$, $\frac{1}{p} + \frac{1}{q} = 1$, $2 \le q \le \infty$ and $C$ is a positive constant. It remains to investigate problem (3.3) for the unknowns $(y^p,\theta,q)$. Denote by $\hat y^p$, $\hat\theta$ and $\hat q$ the Fourier transforms of $y^p$, $\theta$ and $q$ with respect to the space variable $x \in \mathbb{R}^3$, respectively, and introduce
^±=^±^2^1^From (3.3), we know that V = {yP_,yp_,6,q) satisfies the following problem: f dtV + A0V + AiV = 0,
1^(0,0 = ^(0, where Vo(Z) = (fi + i ^ m , Hi - ia2\£\i%, 9o, qo?
(-ia-AQh Ai =
03X3 \
03x3
03x3
fy£0 3 x 3 ^
^ 2 ^ 1 - ^ 3 «7C 0 3 X 3 03X3
^
03X3/
and $A_0 = \mathrm{diag}\{0,\dots,0,\frac{1}{\tau},\frac{1}{\tau},\frac{1}{\tau}\}$. First, by a simple computation we have the following result:
(3.5)
Lemma 3.1. For any $\xi \in \mathbb{R}^3\setminus\{0\}$, $A_0 + A_1$ has eigenvalues $\pm i\alpha_2|\xi|$ of multiplicity two. Moreover, the corresponding left and right eigenvectors are as follows:
$$\begin{aligned}
l_1 = r_1^T &= (a_1, b_1, c_1, 0, \dots, 0),\\
l_2 = r_2^T &= (a_2, b_2, c_2, 0, \dots, 0),\\
l_3 = r_3^T &= (0, 0, 0, a_1, b_1, c_1, 0, 0, 0, 0),\\
l_4 = r_4^T &= (0, 0, 0, a_2, b_2, c_2, 0, 0, 0, 0),
\end{aligned}$$
where $\{(a_k, b_k, c_k)\}_{k=1}^2$ are two unit vectors satisfying $a_k\xi_1 + b_k\xi_2 + c_k\xi_3 = 0$ and $a_1a_2 + b_1b_2 + c_1c_2 = 0$.

To obtain the asymptotic behavior of the other eigenvalues one needs to diagonalize the matrix $A_0 + A_1$ in regions of small and large frequencies. With an idea similar to that given in §2.1, in the region $|\xi| \le \sigma$ for a small $\sigma > 0$, we can construct matrices $K_j$ $(1 \le j \le 5)$ homogeneous of order zero in $\xi$ with $K_2$, $K_4$ being invertible such that
$$V^1 = (I + K_5|\xi|)K_4(I + K_3|\xi|^2)K_2(I + K_1|\xi|)V$$
3tV1 + X ; ^ ( 0 ^ 1 = 0,
(3-6)
3=0
where A\ = AQ, A\ = diag{-ia 2 |£|,
-ia2\£\,ia2\£\2io^\, 0 , - i v ^ + ^ | £ | , * V ^ 2 + 7«^l,0,0},
Ai=diag{O4x4,C 5 |£| 2 ,C 6 |£| 2 ,C 7 |£| 2 ,-/? K |£| 2 ,0,0} for positive constants Ck > 0 (fc = 5,6,7), and
A
i® = ((cT
)
\{^ij)6xW
J
with $c_{ij} = O(|\xi|^3)$ at least as $|\xi| \to 0$.

Proposition 3.1. Letting $V(t,\xi) = (V_1,\dots,V_{10})^T$ be the solution to the Cauchy problem (3.5) and $\mathcal{F}^{-1}(U_0^i(\xi))$ $(i = 1,2)$ with $U_0^1(\xi) = (V_{0,1}, V_{0,2}, V_{0,3})^T$ and $U_0^2(\xi) = (V_{0,4}, V_{0,5}, V_{0,6})^T$ being rotation-free, then we have in $|\xi| \le \sigma \ll 1$ the following representation:
10
r = 5 Z=l
where
v5 = c5\e+om3), ( ^ = ^0 = 1 + 0 ( ^ 3 ) ,
and Crik(£) tend to constants C®lk as |£| —> 0. Proof. Prom Lemma 3.1 and (3.6), we know that there is an invertible matrix L(£) such that D := L(£)(A0 + Ai)L _ 1 (£) = diag{i/i,...,i/ 10 }, where ' V! = V2 = -«|CI«2, i/3 = i/4 = i|f ja 2 ,
^ = C5|^|2 + Q(ig|3),
v6 = i^T7m\+c6\tf+om% v7 = -im\z\+c 7\e+o(\e), 2 vs = -Km\ + ;+o(\m, Moreover, we may choose L(£) = (li,...,lio)T such that k are the linearly independent left eigenvectors corresponding to v\ (1 < i < 4), respectively. Thus from the rotation-free property O{J7"1(UQ) (i = 1,2), we have h-V0(Z)=0, (l
satisfies
f ftW + DW = 0, \ ^ o = (0,0,0,0,^5(0),...,^10(0))T with Wi(0) (5 < « < 10) from L(£)V0(Q, which immediately yields the conclusion. • In the region of large frequencies, let us first diagonalize the matrix Ax. By a direct computation, we obtain that eigenvalues of A\ are J Ai/2 = -«*2l£|, ^3/4 = ia2.\(\, A 5/6 = 0,
1 \7/s = ±mJ^,
\9/w =
±m^
with d = 81 + a\ + & and c = y/(6-y + o% + f)2 - *$&-. Letting Lj = (Zi,/2, •••,^io)T with Zj being the left eigenvectors of A\ with respect to \j {{h,l2,l3,U} are the same as in Lemma 3.1) for 1 < j < 10, we know that V1 = L{V satisfies dtV1 + A\VX + AlV1 = 0,
where $A_1^1 = \mathrm{diag}\{\lambda_1,\dots,\lambda_{10}\}$. As in §2.2, one can construct a matrix $L_2$ homogeneous of order zero in $\xi$ such that $V^2 = (I + |\xi|^{-1}L_2)V^1$ satisfies
$$\partial_t V^2 + A_1^1 V^2 + A_0^2 V^2 + A^2_{-1}(\xi)V^2 = 0, \tag{3.8}$$
where Al = diag{0 4X 4, -,-,
T
for positive constants dj > 0 and
A
-^ = ((cT\)
with $c_{ij} = O(|\xi|^{-1})$ at least as $|\xi| \to \infty$. Thus, similar to the proof of Proposition 3.1, we can obtain

Proposition 3.2. Letting $V(t,\xi) = (V_1, V_2,\dots,V_{10})^T$ be the solution to the Cauchy problem (3.5), and $\mathcal{F}^{-1}(U_0^i(\xi))$ $(i = 1,2)$ with $U_0^1(\xi) = (V_{0,1}, V_{0,2}, V_{0,3})^T$ and $U_0^2(\xi) = (V_{0,4}, V_{0,5}, V_{0,6})^T$ being rotation-free, then we have in $|\xi| \ge N \gg 1$ the following representation:
10
r=5 1=1
where
«* = - + o ^ r 1 ) , u6 = -+odcr1), VJ = dj+oar1) T
(7 < * < m
T
and $C_{rlk}(\xi)$ tend to constants $C^\infty_{rlk}$ as $|\xi| \to \infty$.

Now, let us investigate the property of eigenvalues of $A_0 + A_1$ in the region $\sigma \le |\xi| \le N$. First, by a direct computation, we have

Lemma 3.2. The matrix $A_1 + A_0$ in (3.5) has no other purely imaginary eigenvalues $\nu = ia$ with $a \in \mathbb{R}$ and $a \ne 0$ in $\{\sigma \le |\xi| \le N\}$ besides $\nu_{1/2} = -i\alpha_2|\xi|$ and $\nu_{3/4} = i\alpha_2|\xi|$.

Denote by $\nu_k$ $(1 \le k \le 10)$ the eigenvalues of $A_1 + A_0$. Together with (3.6), (3.8) and Lemma 3.2, it follows for all $\xi \in \{\sigma \le |\xi| \le N\}$ that
$$\mathrm{Re}\,\nu_i(\xi) = 0 \ \ (1 \le i \le 4) \quad\text{and}\quad \mathrm{Re}\,\nu_k(\xi) \ge C > 0 \ \ (5 \le k \le 10) \tag{3.9}$$
by using the compactness of $\{\sigma \le |\xi| \le N\}$. Thus, in a way similar to Proposition 2.3, we can obtain
Proposition 3.3. Letting $V(t,\xi) = (V_1,\dots,V_{10})^T$ be the solution to the Cauchy problem (3.5) and $\mathcal{F}^{-1}(U_0^i(\xi))$ $(i = 1,2)$ with $U_0^1(\xi) = (V_{0,1}, V_{0,2}, V_{0,3})^T$ and $U_0^2(\xi) = (V_{0,4}, V_{0,5}, V_{0,6})^T$ being rotation-free, then there exist two positive constants $C_1$ and $C_2$ such that we have in $\{\sigma \le |\xi| \le N\}$ the following decay estimate:
$$|V(t,\xi)| \le C_1 e^{-C_2 t}|V_0(\xi)|,$$
where $C_2 \in (0, C)$ with $C > 0$ given in (3.9).

Theorem 3.1. (1) Let $(y,\theta,q)$ be the unique solution to the Cauchy problem (3.1). Then the following $L^p$-$L^q$ decay estimates hold:
ll(l/t,Vi/,^g)(<,.)IU«(K»)
,Qo)\\wN-p(R3)
(3.10) for all $1 \le p \le 2$, $2 \le q \le \infty$ satisfying $\frac{1}{p} + \frac{1}{q} = 1$, where $N > 3(1 - \frac{2}{q})$ and $C$ is a positive constant. (2) Furthermore, if the initial data are rotation-free, $\mathrm{rot}\,y_0 = \mathrm{rot}\,y_1 = 0$, then the above estimates can be improved as: $\|(y_t, \nabla y, \theta, q)(t,\cdot)\|_{L^q(\mathbb{R}^3)}$
+
t)-^p-^\\(yi,Vy0,eo (3.11)
Summarizing Propositions 3.1, 3.2 and 3.3, we obtain the $L^2$-$L^2$ and $L^1$-$L^\infty$ estimates for the problem (3.3), which imply (3.11) by the usual interpolation argument. From (3.4) and (3.11), one deduces the conclusion (3.10). The details of this section can be found in [14].
4 Asymptotic behaviour of discontinuities in thermoelasticity

There have already been some results on the propagation of weak singularities in classical thermoelasticity of hyperbolic-parabolic coupled type (see [10,11] and references therein). However, the propagation of strong singularities, e.g., discontinuities, in thermoelasticity of hyperbolic-parabolic coupled type remains open and is of obvious interest. One approach is to study the asymptotic behaviour of discontinuities of solutions to the thermoelastic system with second sound when the relaxation parameter $\tau$ goes to zero. This is the purpose of this section.
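Consider the following problem for the semilinear thermoelastic system with second sound in $\{t > 0, x \in \mathbb{R}\}$:
$$\begin{cases}
u_{tt} - \alpha^2 u_{xx} + \beta\theta_x = f(u,\theta),\\
\theta_t + \gamma q_x + \delta u_{tx} = g(u,\theta),\\
\tau q_t + q + \kappa\theta_x = 0,\\
t = 0:\ u = u_0(x),\ u_t = u_1(x),\ \theta = \theta_0(x),\ q = q_0(x),
\end{cases} \tag{4.1}$$
where $f$ and $g$ are smooth in their arguments, and all other notations are the same as in (2.1). Obviously, when $\tau \to 0$, the Cattaneo law $(4.1)_3$ turns into the Fourier law, which makes the system in (4.1) become the classical one in thermoelasticity. Assuming that $u_0', u_1, \theta_0, q_0$ have jumps at $x = 0$, we are going to study the asymptotic behaviour of discontinuities in problem (4.1) with respect to $\tau \to 0$.

Let $u_\pm = u_t \pm \alpha u_x$. From (4.1), we know that $U = (u, u_+, u_-, \theta, q)^T$ satisfies
$$\begin{cases}
\partial_t U + A_1\partial_x U + A_0 U = F(U),\\
U(0,x) = U^{(0)}(x) := (u_0,\ u_1 + \alpha u_0',\ u_1 - \alpha u_0',\ \theta_0,\ q_0)^T,
\end{cases} \tag{4.2}$$
where
$$A_1 = \begin{pmatrix}
\alpha & 0 & 0 & 0 & 0\\
0 & -\alpha & 0 & \beta & 0\\
0 & 0 & \alpha & \beta & 0\\
0 & \frac{\delta}{2} & \frac{\delta}{2} & 0 & \gamma\\
0 & 0 & 0 & \frac{\kappa}{\tau} & 0
\end{pmatrix}, \qquad
A_0 = \begin{pmatrix}
0 & -1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & \frac{1}{\tau}
\end{pmatrix}$$
and $F(U) = (0, f, f, g, 0)^T$. Denote by $\{\lambda_j\}_{j=0}^4$ the eigenvalues of $A_1$, and by $\{l_j = (l_{j0},\dots,l_{j4})\}_{j=0}^4$ ($\{r_j = (r_{0j},\dots,r_{4j})^T\}_{j=0}^4$) the associated left (right) eigenvectors with normalization $l_j\cdot r_k = \delta_{jk}$. Let $L = (l_0,\dots,l_4)^T$ and $R = (r_0,\dots,r_4)$. From (4.2), $V = LU$ satisfies
$$\begin{cases}
\partial_t V + \Lambda\partial_x V + \tilde A_0 V = \tilde F(V),\\
V(0,x) = LU^{(0)}(x),
\end{cases} \tag{4.3}$$
with $\Lambda = \mathrm{diag}\{\lambda_0,\dots,\lambda_4\}$,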
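where $\tilde A_0 = LA_0R$ and $\tilde F(V) = LF(RV)$. By a simple computation, we have
$$\lambda_0 = \alpha, \qquad \lambda_{1,2} = \mp\alpha\Bigl(1 - \frac{\beta\delta}{2\kappa\gamma}\,\tau\Bigr) + O(\tau^2), \qquad \lambda_{3,4} = \mp\sqrt{\frac{\kappa\gamma}{\tau}}\,\bigl(1 + O(\tau)\bigr).$$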
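Denote by
$$\Sigma_k := \{(t,x) \mid x - \lambda_k t = 0\}, \qquad 0 \le k \le 4,$$
the characteristic lines for (4.3), and by $[V_i]_{\Sigma_k}$ the jump of $V_i$ on $\Sigma_k$, i.e., at $(t^*,x^*)$ with $x^* = \lambda_k t^*$,
$$[V_i]_{\Sigma_k}(t^*,x^*) = \lim_{\substack{(t,x)\to(t^*,x^*)\\ x > \lambda_k t}} V_i(t,x) - \lim_{\substack{(t,x)\to(t^*,x^*)\\ x < \lambda_k t}} V_i(t,x).$$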
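From the first equation in (4.3), we immediately have that $V_0 = u$ does not jump at all if $u_0$ is continuous. Next we claim that for any $j \ne k$, we have
$$[V_j]_{\Sigma_k} = 0. \tag{4.4}$$
Indeed, for any $j \in \{1,2,3,4\}$, from (4.3), we know that
$$(\partial_t + \lambda_j\partial_x)V_j = -\frac{1}{\tau}\sum_{i=0}^{4} l_{j4}r_{4i}V_i + \tilde F_j(V) \tag{4.5}$$
should be locally bounded everywhere, and for any $j \ne k$, $X_j = \partial_t + \lambda_j\partial_x$ is transversal to $\Sigma_k$. The assertion (4.4) follows. From (4.3), one has
$$[V_k]_{\Sigma_k}(t) = [V_k]_{\{0\}}\,e^{-\frac{1}{\tau}l_{k4}r_{4k}t} + \int_0^t [\tilde F_k(V)]_{\Sigma_k}(s)\,e^{-\frac{1}{\tau}l_{k4}r_{4k}(t-s)}\,ds. \tag{4.6}$$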
o On the other hand, we have i\[Fk(V)Uk{s)\
< CT\[Vk]Ek{s)\, fc = l,2,
\\[Fk(V)hkis)\
k = 3,4
for a constant c > 0. Thus, we obtain
flim T _ 0 \[Vk]Sh\ = lim r _ 0 U V ^ I e - ^ * , \\[Vkhk\
< \[Vk]{0}\e-^+°^+ct,
k = 1,2,
k = 3,4
by using Uk4r4k = i^T + 0(T2), fc = l,2, \/fc4r4fc = 5 + 0 ( r ) ,
A; = 3,4.
Thus, when $\tau \to 0$, $[V_k]_{\Sigma_k}$ $(k = 3,4)$ decay exponentially fast, while $[V_1]_{\Sigma_1}$ and $[V_2]_{\Sigma_2}$ persist and decay exponentially as $t \to \infty$, and more rapidly for small heat conduction coefficient $\kappa\gamma$.
By returning to the variables $(u,\theta,q)$,
'ut + aux = Vx + 0(T)V2 + 0(^)V3 + ut - aux = 0(T)VI + V2 + 0(y/¥)V3 + , 0 = 0{T)VX
q= -Ul
-yj${l
+ 0{T)V2
0(T))V3
0(^)V4, 0{^)V4,
+ V3 + VA,
+ 0(r))Vi - £ ( 1 +
+
0(T))V2
+ y f (1 +
0(T))V4,
we obtain

Theorem 4.1. Let $(u,\theta,q)$ be the solution to the Cauchy problem (4.1) with the initial data satisfying that $u_0$ is smooth, and $(u_0', u_1, \theta_0, q_0)$ are piecewise smooth with a possible jump at $x = 0$. Then, along the characteristic curves as $\tau \to 0$, we have $[u_t \pm \alpha u_x, \theta, q]_{\Sigma_k} \to 0$, $k = 3,4$,
exponentially, ([ut + aux]^1} [ut - a u x ] s 2 ,
[0]E 1 | 2 )
—> 0
of order 0(T), and lim r ^o[u t ± aux]-EH2) = [iti ±
OT^oje"^',
lim r ^o[g]s li2 = ~ 2 ^ K ±awo]{o}e" 2 "'' .

In the case that system (4.1) is linear, one can deduce the asymptotic behaviours of [c^tf]^ similar to those of $q$ given above. The details can be found in [6]. We have also studied the case that $f$ and $g$ depend on $(u_t, u_x)$ in [6], and a result similar to Theorem 4.1 has been obtained under certain growth restrictions of $f$ and $g$ on $(u_t, u_x)$.

Acknowledgments The results on $L^p$-$L^q$ estimates were obtained in collaboration with Lin Yang, and the results on the asymptotic behaviour of the relaxation limit of discontinuous solutions were obtained in collaboration with Reinhard Racke. This research was partially supported by the NSFC grant 10131050, the Ministry of Education of China and Shanghai Science and Technology Committee grant 03QMH1407.
References

[1] D. S. Chandrasekharaiah, Thermoelasticity with second sound: a review, Appl. Mech. Rev., 39(1986), 355-376.
[2] D. S. Chandrasekharaiah, Hyperbolic thermoelasticity: a review of recent literature, Appl. Mech. Rev., 51(1998), 705-729.
[3] S. Jiang and R. Racke, Evolution Equations in Thermoelasticity, Chapman & Hall/CRC Monographs and Surveys in Pure and Appl. Math., Vol. 112, Chapman & Hall/CRC, 2000.
[4] R. Racke, Lectures on Nonlinear Evolution Equations, Initial Value Problems, Vieweg & Sohn, Braunschweig/Wiesbaden, 1992.
[5] R. Racke, Thermoelasticity with second sound - exponential stability in linear and nonlinear 1-d, Math. Meth. Appl. Sci., 25(2002), 409-441.
[6] R. Racke and Y. G. Wang, Asymptotic behavior of discontinuous solutions to thermoelastic systems with second sound, Submitted for publication.
[7] M. Reissig and Y. G. Wang, Linear thermoelastic systems of type III in 1-D, Submitted for publication.
[8] M. Reissig and Y. G. Wang, In preparation.
[9] M. A. Tarabek, On existence of smooth solutions in one-dimensional nonlinear thermoelasticity with second sound, Quart. Appl. Math., 50(1992), 727-742.
[10] Y. G. Wang, Microlocal analysis in nonlinear thermoelasticity, Nonlinear Anal., 54(2003), 683-705.
[11] Y. G. Wang, A new approach to study hyperbolic-parabolic coupled systems, In "Evolution Equations" (R. Picard, M. Reissig and W. Zajaczkowski eds.), Banach Center Publications, 60(2003), 227-236.
[12] Y. G. Wang and M. Reissig, Parabolic type decay rates for 1-D thermoelastic systems with time-dependent coefficients, Monatsh. Math., 138(2003), 239-259.
[13] Y. G. Wang and M. Reissig, Influence of the hyperbolic part on decay rates in 1-d-thermoelasticity, In "Hyperbolic Differential Operators and Related Problems" (V. Ancona and J. Vaillant eds.), Lect. Notes Pure Appl. Math., 233(2003), Marcel Dekker, Inc., 89-108.
[14] Y. G. Wang and L. Yang, $L^p$-$L^q$ decay estimates for Cauchy problems of linear thermoelastic systems with second sound in 3-d, Preprint.
[15] L. Yang and Y. G. Wang, $L^p$-$L^q$ decay estimates for the Cauchy problem of linear thermoelastic systems with second sound in one space variable, Submitted for publication.
[16] L. Yang and Y. G. Wang, Well-posedness and decay estimates for Cauchy problems of linear thermoelastic systems of type III in 3-d, Submitted for publication.
[17] S. M. Zheng, Nonlinear Parabolic Equations and Hyperbolic-Parabolic Coupled Systems, Pitman Mono. Surv. in Pure Appl. Math., Vol. 76, Longman Sci. & Tech., John Wiley & Sons Inc., New York, 1995.
Distance Geometry Problem and Algorithm Based on Barycentric Coordinates*

Hongxuan Huang, Changjun Wang
Department of Industrial Engineering, Tsinghua University, Beijing 100084, China.
Abstract It is well known in biology that the function of a protein depends on its structure. To understand the structure of a protein, a common approach is to measure the distances among some pairs of atoms experimentally by using NMR spectroscopy or X-ray crystallography, and then to reconstruct the structure of the protein with the aid of theoretical techniques such as the principle of distance geometry. This paper discusses the Euclidean distance geometry problem related to the structure of a protein. First, we summarize some properties of the complete Euclidean distance matrix which establish the relationship between a distance matrix and its corresponding semidefinite positive (or negative) matrix. The complexity and the uniqueness of configuration of the distance geometry problem are also discussed from the viewpoint of realizations of a distance matrix. Finally, we present a new algorithm based on barycentric coordinates for solving the basic Euclidean distance geometry problem.
1 Introduction

Proteins are complex biopolymers composed primarily of atoms including carbon, hydrogen, oxygen and nitrogen with an occasional sulfur, phosphorus, halogen or transition metal ion thrown in. These atoms constitute 20 different amino acids, each having an acid group, an amino group and a side chain. A polypeptide chain is made up of many amino acids through the peptide bonds among them. All amino acids are referred to as the basic units of a protein. Usually, the sequence of amino acids in a protein is called the primary structure of the protein, in which a local spatial structure related to a piece of peptide chain is called the secondary structure of the protein. The tertiary structure of a protein is said to be the conformation of all atoms in the three-dimensional space [5,31].

*This research was supported by Scientific Foundation for Returned Overseas Chinese Scholars, Ministry of Education, and by the National Natural Science Foundation of China under project 10201017.

The structure of a protein specifies its function. One of the challenging problems faced by molecular biologists is to determine the structure of a protein and its function. The process of generating a protein from its sequence of amino acids to its three-dimensional structure is referred to as protein folding. There are two fundamental approaches to study the process of protein folding in order to determine the three-dimensional structure of a protein. One is to analyze the basic configuration of a protein by using kinetics of protein folding [16], and the other is to measure the distances among some pairs of atoms experimentally via NMR spectroscopy or X-ray crystallography and reconstruct the structure of a protein by using theoretical techniques such as the principle of distance geometry. For surveys and reviews of work in this area, see refs. [5,7-9,13,14].

A matrix $D = (d_{ij}) \in \mathbb{R}^{n\times n}$ is called a Euclidean distance matrix if there exist vectors $x^1,\dots,x^n \in \mathbb{R}^k$ (for some $k \ge 1$) such that $\|x^i - x^j\| = d_{ij}$ for all $i,j \in N = \{1,\dots,n\}$, where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^k$. The set of vectors $X = \{x^i \mid i \in N\}$ is called a realization of the Euclidean distance matrix $D$. Let $EDM_n$ denote the set of all Euclidean distance matrices in $\mathbb{R}^{n\times n}$.

The Euclidean distance geometry problem in protein folding is stated as follows: Given all (or partial) Euclidean distances among pairs of atoms in a protein by a matrix $D \in \mathbb{R}^{n\times n}$, how to determine the relative positions of all atoms in a protein molecule? For the Euclidean distance geometry problem, let us denote the number of atoms by $n$. The problem is in general called the molecular distance geometry problem.
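The definition of a Euclidean distance matrix can be made concrete with a small sketch. It builds $D$ from a given realization and illustrates that $D$ does not change when the points are rotated and translated. The function name `distance_matrix` and the sample points are our own illustrative choices, not part of the paper.

```python
import numpy as np

def distance_matrix(points):
    """Return D with d_ij = ||x^i - x^j|| for points given as rows of an (n, k) array."""
    X = np.asarray(points, dtype=float)
    diff = X[:, None, :] - X[None, :, :]      # pairwise differences x^i - x^j
    return np.sqrt((diff ** 2).sum(axis=-1))

# Four corners of the unit square as a realization in R^2.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D = distance_matrix(square)

# A rigid motion (rotation by 30 degrees plus a shift) gives another realization of the same D.
angle = np.pi / 6
R = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
moved = square @ R.T + np.array([2.0, -1.0])
D_moved = distance_matrix(moved)
print(D)
```

This invariance is why a realization can only be determined up to rigid motions and reflections.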
If the distance between atoms $i$ and $j$ is known, then the element $d_{ij}$ in a matrix $D = (d_{ij}) \in \mathbb{R}^{n\times n}$ is specified and equal to the element $d_{ji}$. Otherwise, $d_{ij}$ and $d_{ji}$ are not specified. If all elements in the matrix $D$ are specified, then $D$ is said to be a complete Euclidean distance matrix, otherwise a partial Euclidean distance matrix. As a matter of convenience, any of these matrices is called a distance matrix. Based on the number of elements specified with certain accuracy in a distance matrix, the Euclidean distance geometry problem can be classified into two categories:

The basic Euclidean distance geometry problem: Given a matrix $D \in \mathbb{R}^{n\times n}$, is it a complete Euclidean distance matrix, i.e., $D \in EDM_n$? If yes, how to find one of its realizations?

The general Euclidean distance geometry problem: Given an index set $I \subset \{(i,j) \mid i,j = 1,\dots,n\}$ and two real sets $\{l_{ij} \mid (i,j) \in I\}$ and $\{u_{ij} \mid (i,j) \in I\}$, is there a Euclidean distance matrix $D \in EDM_n$ such that, for any $(i,j) \in I$, $l_{ij} \le d_{ij} \le u_{ij}$? If yes, how to find a set of points $\{x^1,\dots,x^n\} \subset \mathbb{R}^k$ (for some $k$) such that for all $(i,j) \in I$, $l_{ij} \le \|x^i - x^j\| \le u_{ij}$?
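For the general problem, checking whether a candidate set of points respects the given bounds is straightforward. The sketch below uses hypothetical names (`satisfies_bounds`, dictionaries `lower` and `upper` keyed by zero-based index pairs in $I$) chosen for illustration; it verifies a candidate and does not solve the problem.

```python
import numpy as np

def satisfies_bounds(points, lower, upper):
    """Check l_ij <= ||x^i - x^j|| <= u_ij for every pair (i, j) listed in `lower`."""
    X = np.asarray(points, dtype=float)
    for (i, j), l_ij in lower.items():
        dist = np.linalg.norm(X[i] - X[j])
        if not (l_ij <= dist <= upper[(i, j)]):
            return False
    return True

# Three points on a line; bounds are given only for two of the three pairs.
points = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
lower = {(0, 1): 0.9, (1, 2): 1.5}
upper = {(0, 1): 1.1, (1, 2): 2.5}
feasible = satisfies_bounds(points, lower, upper)
print(feasible)
```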
Typical approaches to the distance geometry problem can be classified into two categories: (1) Based on the principle of determining the position of a point via its distances to other points, a realization can be constructed sequentially from a distance matrix. This kind of algorithm includes the linear build-up algorithm [5,8,9]. In addition, the singular value decomposition [29] can be used for a complete distance matrix in order to find one of its realizations in rectangular coordinates, which requires $O(n^3)$ arithmetic operations. Note that the linear build-up algorithm runs in time $O(n)$, but it may suffer from error propagation [30]. (2) For a partial distance matrix, we construct a corresponding energy function with respect to upper and lower bounds on the specified elements. Then, we try to find a global minimum point of the energy function and obtain a realization of the partial distance matrix [1,5,14,20-22,27]. Certainly, there are many kinds of energy functions, for example, the sum of the absolute differences of the squared distances and the square sum of distance differences, which correspond to different algorithms [7,17,18,26]. In the field of social science the distance geometry problem is usually called the multidimensional scaling problem [3,10], and the square sum of distance differences is called the stress function. Furthermore, some approximate algorithms have been proposed to minimize the stress function, e.g., the majorization method [6]. For the convenience of mathematical analysis, the square sum of the differences of the squared distances
$$\sum_{(i,j)\in I}\bigl(\|x^i - x^j\|^2 - d_{ij}^2\bigr)^2$$
is a preferred energy function, where $I$ is an index set, and $d_{ij} = l_{ij} = u_{ij}$ $(\forall (i,j) \in I)$ is specified. Most current algorithms are proposed with respect to such energy functions, e.g., semidefinite programming [1], the EMBED algorithm [5], the ABBIE algorithm [14], global continuation [21,22], continuation based on a smoothing technique and D.C. programming [20], and the stochastic/perturbation method [27].
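The preferred energy function can be evaluated directly. The following sketch is a plain evaluation only, with hypothetical names (`energy`, `pairs`) and zero-based indices; none of the cited minimization algorithms is implemented here.

```python
import numpy as np

def energy(points, D, pairs):
    """Sum over (i, j) in `pairs` of (||x^i - x^j||^2 - d_ij^2)^2."""
    X = np.asarray(points, dtype=float)
    total = 0.0
    for i, j in pairs:
        diff = X[i] - X[j]
        total += (diff @ diff - D[i, j] ** 2) ** 2
    return total

# Distances of a 3-4-5 right triangle; all three pairs are specified.
true_points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
D = np.array([[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]])
pairs = [(0, 1), (0, 2), (1, 2)]

e_true = energy(true_points, D, pairs)                       # exact realization
e_off = energy(true_points + [[0, 0], [1, 0], [0, 0]], D, pairs)  # second point moved
print(e_true, e_off)
```

The energy vanishes exactly at realizations of the specified distances, so any global minimizer with value zero solves the problem for the index set $I$.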
From the definition of the distance geometry problem, it can be seen that a distance matrix is independent of the particular coordinates used in a realization. The main purpose of this paper is to study some properties of a realization with respect to a distance matrix which are independent of the external coordinates, and to present an algorithm based on barycentric coordinates (these coordinates are affine invariant [28]). Preliminary experiments indicate that the new algorithm provides an opportunity to decrease the error propagation [30]. The paper is organized as follows: We first summarize some characteristics of a complete distance matrix related to semidefinite matrices. In Section 3 the complexity and the uniqueness of the configuration are discussed, together with a necessary condition satisfied by any realization of a distance matrix. The principle of barycentric coordinates and an algorithm based on such coordinates are presented in Section 4.

2 Complete distance matrix and semidefinite matrix
Suppose that a complete distance matrix $D \in \mathbb{R}^{n\times n}$ has a realization $X = \{x^1, x^2, \dots, x^n\}$ in $\mathbb{R}^p$ (for some $p \le n-1$), but does not have any realization in $\mathbb{R}^{p-1}$. That is, $p$ is the minimal dimension of the space in which there is a realization for the complete distance matrix $D$. Such a realization of the distance matrix $D$ in $\mathbb{R}^p$ is said to be irreducible, i.e., the complete distance matrix $D$ is irreducibly embeddable in $\mathbb{R}^p$. In this case, we briefly denote the dimension of $D$ by $p$.

Definition 2.1. [4] Given a matrix
$$M = \begin{pmatrix} A & B\\ C & D \end{pmatrix} \in \mathbb{R}^{(m+n)\times(m+n)},$$
where the matrix $A \in \mathbb{R}^{m\times m}$ is nonsingular, the matrix
$$S = D - CA^{-1}B \in \mathbb{R}^{n\times n}$$
is called the Schur complement matrix of $A$ with respect to $M$.

The Schur complement matrix $S$ has some properties [4]. For example, $\mathrm{rank}(M) = \mathrm{rank}(A) + \mathrm{rank}(S)$, and the determinant $|M| = |A|\cdot|S|$. In addition, when $A$ is singular, the definition of the Schur complement matrix can be generalized as follows: any matrix of the form $S = D - CA^{-}B \in \mathbb{R}^{n\times n}$ is said to be the Schur complement matrix of $A$ with respect to $M$, where $A^{-} \in A\{1\} = \{G \mid AGA = A\}$ is a generalized inverse of $A$. It is obvious that the Schur complement matrix may not be unique for a singular matrix $A$.

Next we summarize the necessary and sufficient conditions for a complete distance matrix, which are based on the relationship between a complete distance matrix and its corresponding semidefinite positive (or negative) matrix.

Theorem 2.1. The following assertions are equivalent:
(i) A real matrix $D \in \mathbb{R}^{n\times n}$ is a complete distance matrix, i.e., $D \in EDM_n$.
(ii) Suppose that a matrix $P = (p_{ij})$ is defined as follows: For $i,j = 1,\dots,n-1$, let
$$p_{ij} = d_{i+1,1}^2 + d_{j+1,1}^2 - d_{i+1,j+1}^2, \tag{2.1}$$
where $d_{i+1,1}$, $d_{j+1,1}$ and $d_{i+1,j+1}$ are elements in $D$. Then $P$ is a semidefinite positive matrix.
(iii) Let a matrix $D_s = (d_{ij}^2)$ and the bordered matrix
$$\bar D = \begin{pmatrix} 0 & e^T\\ e & -D_s \end{pmatrix}. \tag{2.2}$$
The Schur complement matrix of the upper left 2-by-2 principal submatrix $\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}$ is semidefinite positive.
(iv) The matrix $D_s = (d_{ij}^2)$ is semidefinite negative on the orthogonal complement subspace of the vector $e = (1,\dots,1)^T \in \mathbb{R}^n$, i.e., the hyperplane $e^Tx = 0$.
(v) Given a vector $h \in \mathbb{R}^n$ such that $e^Th = 1$, the matrix
$$E_h = (I - eh^T)D_s(I - he^T) \tag{2.3}$$
is semidefinite negative.
(vi) Let $e = (1,\dots,1)^T \in \mathbb{R}^n$; the bordered matrix $\bar D$ defined by (2.2) has a unique negative eigenvalue.

Proof. (i) $\Leftrightarrow$ (ii). According to Theorem 43.1 in ref. [2], the distance matrix $D$ can be irreducibly embedded into a space $\mathbb{R}^r$ $(r \le n-1)$ if and only if the quadratic form $F(y) = \frac{1}{2}y^TPy$ is a non-negative function, where the Hessian matrix $P$ is defined in (2.1) and $y \in \mathbb{R}^{n-1}$. According to the definition of irreducible embeddability and the property of a non-negative quadratic function, we conclude that the assertion (i) is equivalent to (ii).
(ii) $\Leftrightarrow$ (iii). First, we compute the Schur complement matrix of the upper left $2\times 2$ submatrix $\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}$ in $\bar D$ defined by (2.2). If we partition the matrix $D_s$ as $D_s = \begin{pmatrix} 0 & d^T\\ d & \tilde D \end{pmatrix}$, where $d = (d_{21}^2,\dots,d_{n1}^2)^T$ and $\tilde D$ is the lower right $(n-1)\times(n-1)$ submatrix in $D_s$, then the bordered matrix reads
$$\bar D = \begin{pmatrix} 0 & 1 & e^T\\ 1 & 0 & -d^T\\ e & -d & -\tilde D \end{pmatrix},$$
and the Schur complement matrix of $\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}$ in $\bar D$ is
$$-\tilde D - (e \ \ {-d})\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}^{-1}\begin{pmatrix} e^T\\ -d^T \end{pmatrix} = -\tilde D + de^T + ed^T.$$
The Schur complement matrix above is just the matrix $P$ defined by (2.1). Therefore, the assertion (ii) is equivalent to (iii).

(i) $\Leftrightarrow$ (iv). The assertion (iv) describes the property of a matrix composed of squared distances, which can be used conveniently. In fact, the equivalence of the assertions (i) and (iv) comes from the equivalence of the assertions (i) and (ii). It is easy to check that the quadratic form $Q(x) = x^TD_sx$, $x \in \mathbb{R}^n$, can be expressed equivalently by
$$Q(x) = \sum_{i,j=1}^n d_{ij}^2x_ix_j = 2x_1\sum_{j=2}^n d_{1j}^2x_j + \sum_{i,j=2}^n d_{ij}^2x_ix_j.$$
For any vector $x \in \mathbb{R}^n$ satisfying $e^Tx = 0$, let us substitute $x_1$ in the function $Q(x)$ by $x_1 = -\sum_{j=2}^n x_j$; we obtain
$$\begin{aligned}
Q(x) &= -2\Bigl(\sum_{j=2}^n x_j\Bigr)\Bigl(\sum_{j=2}^n d_{1j}^2x_j\Bigr) + \sum_{i,j=2}^n d_{ij}^2x_ix_j\\
&= -\sum_{i,j=2}^n (d_{1i}^2 + d_{1j}^2)x_ix_j + \sum_{i,j=2}^n d_{ij}^2x_ix_j\\
&= \sum_{i,j=2}^n (d_{ij}^2 - d_{1i}^2 - d_{1j}^2)x_ix_j\\
&= -y^TPy,
\end{aligned}$$
where $y = (x_2,\dots,x_n)^T \in \mathbb{R}^{n-1}$.

(i) $\Leftrightarrow$ (v). For the assertion (v), Schoenberg [25] once proved that if the vector $h$ is set to be $e/n$, or the $i$-th unit vector $e_i$ (i.e., all components are zero except that the $i$-th component is 1), then the matrix $D \in EDM_n$ if and only if the matrix $E_h$ in (2.3) is semidefinite negative, and the dimension of the complete distance matrix $D$ is $p = \mathrm{rank}(E_h)$. Gower [11] extended this result later. Now we summarize briefly their proofs here as follows: For any vector $h \in \mathbb{R}^n$ such that $e^Th = 1$, we can get the following equalities:
$$(I - eh^T)\Bigl(I - \frac{ee^T}{n}\Bigr) = I - eh^T, \qquad (I - he^T)\Bigl(I - \frac{ee^T}{n}\Bigr) = I - \frac{ee^T}{n}.$$
Therefore, the matrices $E_{e/n}$ and $E_h$ defined by (2.3) satisfy the following properties:
$$E_{e/n} = \Bigl(I - \frac{ee^T}{n}\Bigr)D_s\Bigl(I - \frac{ee^T}{n}\Bigr) = \Bigl(I - \frac{ee^T}{n}\Bigr)E_h\Bigl(I - \frac{ee^T}{n}\Bigr), \tag{2.4}$$
$$E_h = (I - eh^T)E_{e/n}(I - he^T). \tag{2.5}$$
This indicates that the matrix $E_{e/n}$ is semidefinite negative if and only if $E_h$ is semidefinite negative, and
$$\mathrm{rank}(E_{e/n}) = \mathrm{rank}(E_h).$$
Thus, the result in [25] implies that the assertion (i) is equivalent to (v).

(i) $\Leftrightarrow$ (vi). Based on the symmetry of the bordered matrix $\bar D$ and properties of the Schur complement matrix [4], we know that the index of inertia of the bordered matrix $\bar D$ satisfies
$$\mathrm{In}(\bar D) = \mathrm{In}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} + \mathrm{In}(P),$$
where the matrix $P = (p_{ij})$ is the Schur complement matrix of $\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}$ with respect to $\bar D$, and the element $p_{ij}$ is defined by (2.1). Since the eigenvalues of the matrix $\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}$ are $\pm 1$, the bordered matrix $\bar D$ has a unique negative eigenvalue if and only if the Schur complement matrix $P$ is semidefinite positive. Therefore, the equivalence of the assertions (i) and (ii) implies the equivalence of the assertions (i) and (vi). $\square$
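The criteria of Theorem 2.1 lend themselves to a numerical test of whether a given symmetric matrix with zero diagonal is a complete distance matrix. The sketch below implements assertions (ii), (v) and (vi) with eigenvalue tolerances; the function names and the tolerance are our assumptions, and the bordered matrix is taken with the block $-D_s$, which is the sign convention under which its Schur complement equals $P$.

```python
import numpy as np

def _is_psd(M, tol=1e-9):
    """Positive semidefiniteness up to a relative eigenvalue tolerance."""
    w = np.linalg.eigvalsh((M + M.T) / 2.0)
    return w.min() >= -tol * max(1.0, np.abs(w).max())

def criterion_ii(D):
    """P with p_ij = d_{i+1,1}^2 + d_{j+1,1}^2 - d_{i+1,j+1}^2 is semidefinite positive."""
    Ds = np.asarray(D, dtype=float) ** 2
    d = Ds[1:, 0]
    return _is_psd(d[:, None] + d[None, :] - Ds[1:, 1:])

def criterion_v(D, h=None):
    """E_h = (I - e h^T) D_s (I - h e^T) is semidefinite negative, with e^T h = 1."""
    Ds = np.asarray(D, dtype=float) ** 2
    n = Ds.shape[0]
    e = np.ones(n)
    h = e / n if h is None else np.asarray(h, dtype=float)
    J = np.eye(n) - np.outer(e, h)
    return _is_psd(-(J @ Ds @ J.T))

def criterion_vi(D, tol=1e-9):
    """The bordered matrix [[0, e^T], [e, -D_s]] has exactly one negative eigenvalue."""
    Ds = np.asarray(D, dtype=float) ** 2
    n = Ds.shape[0]
    B = np.zeros((n + 1, n + 1))
    B[0, 1:] = 1.0
    B[1:, 0] = 1.0
    B[1:, 1:] = -Ds
    w = np.linalg.eigvalsh(B)
    return int((w < -tol * np.abs(w).max()).sum()) == 1

pts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.3]])
good = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))   # a true distance matrix
bad = np.array([[0.0, 1.0, 10.0], [1.0, 0.0, 1.0], [10.0, 1.0, 0.0]])  # violates the triangle inequality
print(criterion_ii(good), criterion_v(good), criterion_vi(good))
print(criterion_ii(bad), criterion_v(bad), criterion_vi(bad))
```

The three tests agree on both examples, as the equivalences predict; in floating point the answer near the boundary of the semidefinite cone depends on the chosen tolerance.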
3 Complexity and uniqueness of configuration

3.1 Complexity of the basic distance geometry problem
Suppose that $D = (d_{ij}) \in \mathbb{R}^{n \times n}$ is a complete distance matrix. According to the relationship between $D$ and its corresponding positive semidefinite matrix $P$, or the negative semidefinite matrix $E_h$ associated with $e^T h = 1$, a realization of $D$ in three-dimensional space can be found by decomposing the semidefinite matrix when the dimension of $D$ is 3 (see Proposition 6.12 in ref. [5]). In this paper, we generalize the method used in [5] from another viewpoint and derive a necessary condition satisfied by any realization of a complete distance matrix. Given a complete distance matrix $D \in \mathrm{EDM}_n$ and a vector $h \in \mathbb{R}^n$ satisfying $e^T h = 1$, the matrix $E_h$ defined by (2.3) is negative semidefinite by assertion (v) in Theorem 2.1. Let
$$F_h = -\tfrac{1}{2} E_h = -\tfrac{1}{2}\left(I - e h^T\right) D_s \left(I - h e^T\right),$$
where $D_s = (d_{ij}^2)$. Then $F_h$ is positive semidefinite for any $h$ with $e^T h = 1$. For the matrix $F_h = (f_{ij})$, it is easy to check that
$$f_{ij} = -\tfrac{1}{2}\left(d_{ij}^2 - (D_s h)_i - (D_s h)_j + h^T D_s h\right) = \sum_{l=1}^{n} h_l \cdot \tfrac{1}{2}\left(d_{il}^2 + d_{jl}^2 - d_{ij}^2\right) - \tfrac{1}{2} h^T D_s h, \tag{3.1}$$
where $e^T h = 1$ is used in the last equality. In particular, the diagonal entries of $F_h$ form the vector
$$\operatorname{diag}(F_h) = D_s h - \tfrac{1}{2}\left(h^T D_s h\right) e. \tag{3.2}$$
Suppose that the set of points $\{y_1, \dots, y_n\} \subset \mathbb{R}^k$ is a realization of a complete distance matrix $D$, i.e., $\|y_i - y_j\| = d_{ij}$ for all $i, j = 1, 2, \dots, n$. What, then, is the relationship between the matrix $F_h$ and $Y = (y_1, \dots, y_n)$? First of all, by the law of cosines we have
$$\tfrac{1}{2}\left(d_{il}^2 + d_{jl}^2 - d_{ij}^2\right) = (y_i - y_l)^T (y_j - y_l) = y_i^T y_j - y_i^T y_l - y_l^T y_j + \|y_l\|^2.$$
Then, based on equality (3.1) and the condition $e^T h = 1$, we get
$$f_{ij} = y_i^T y_j - y_i^T Y h - (Y h)^T y_j + \sum_{l=1}^{n} h_l \|y_l\|^2 - \tfrac{1}{2} h^T D_s h.$$
Therefore, the relationship between the matrices $F_h$ and $Y$ can be described as follows:
$$\begin{aligned} F_h &= Y^T Y - Y^T (Y h) e^T - e (Y h)^T Y + \left(\sum_{l=1}^{n} h_l \|y_l\|^2 - \tfrac{1}{2} h^T D_s h\right) e e^T \\ &= \left(Y - Y h e^T\right)^T \left(Y - Y h e^T\right) + \left(\sum_{l=1}^{n} h_l \|y_l - Y h\|^2 - \tfrac{1}{2} h^T D_s h\right) e e^T. \end{aligned} \tag{3.3}$$
Finally, we present a property associated with any realization of a complete distance matrix $D$:
Theorem 3.1. Let $F_h = X^T X$ be any decomposition of the matrix $F_h$, and let $k$ denote the number of rows of $X$. Then $X h = 0$, and the column vectors of $X$ give a realization of the complete distance matrix $D$ in $\mathbb{R}^k$. Conversely, if $\{y_1, \dots, y_n\} \subset \mathbb{R}^k$ is any realization of the matrix $D$, then
$$F_h = \left(Y - Y h e^T\right)^T \left(Y - Y h e^T\right). \tag{3.4}$$
In particular, the following relations hold:
$$\sum_{l=1}^{n} h_l \|y_l - Y h\|^2 = \tfrac{1}{2} h^T D_s h, \tag{3.5}$$
$$(D_s h)_i - \|y_i - Y h\|^2 = \tfrac{1}{2} h^T D_s h. \tag{3.6}$$
Proof. Since the vector $h \in \mathbb{R}^n$ satisfies $e^T h = 1$, we have $(I - h e^T) h = 0$, and hence $F_h h = 0$. If there exists a decomposition $F_h = X^T X$, where $X = (x^1, \dots, x^n) \in \mathbb{R}^{k \times n}$, then $\|X h\|^2 = h^T (X^T X) h = h^T F_h h = 0$, that is, $X h = 0$. Based on $F_h = X^T X$ and (3.1), we know
$$\|x^i\|^2 = f_{ii} = \sum_{l=1}^{n} d_{il}^2 h_l - \tfrac{1}{2} h^T D_s h, \tag{3.7}$$
$$\|x^j\|^2 = f_{jj} = \sum_{l=1}^{n} d_{jl}^2 h_l - \tfrac{1}{2} h^T D_s h, \tag{3.8}$$
and the square of the distance between $x^i$ and $x^j$ is
$$\|x^i - x^j\|^2 = \|x^i\|^2 + \|x^j\|^2 - 2 (x^i)^T x^j = f_{ii} + f_{jj} - 2 f_{ij} = d_{ij}^2.$$
This indicates that $\{x^1, \dots, x^n\} \subset \mathbb{R}^k$ is a realization of $D$. Conversely, multiplying (3.3) on the left by $h^T$ and on the right by $h$, we get
$$\sum_{l=1}^{n} h_l \|y_l - Y h\|^2 - \tfrac{1}{2} h^T D_s h = 0,$$
where $e^T h = 1$ is also used here. That is, the conclusion (3.5) holds, and substituting (3.5) into (3.3) gives (3.4). In addition, based on (3.2) and (3.4), the conclusion (3.6) follows directly. □
Theorem 3.1 indicates that for any vector $h \in \{x \mid e^T x = 1\}$ and any realization $X = \{x^1, \dots, x^n\}$ of $D$ obtained from a decomposition $F_h = X^T X$, the barycenter of the realization $X$ with respect to $h$ is the origin of coordinates. On the one hand, the vector $h$ essentially determines the origin of coordinates relative to which a realization is located. On the other hand, for any realization $\{y_1, \dots, y_n\} \subset \mathbb{R}^k$ of a complete distance matrix $D$, the weighted sum of the squared distances between these points and their barycenter with respect to $h$ is $\tfrac{1}{2} h^T D_s h$.
Based on Theorem 3.1, when we construct a realization of $D$ from the decomposition $X^T X$ of the matrix $F_h$, the singular value decomposition, whose principle can be found in ref. [29], may be adopted to embed a complete distance matrix $D$ irreducibly into $\mathbb{R}^p$, where $p = \operatorname{rank}(F_h)$. Since the singular value decomposition requires at most $O(n^3)$ arithmetic operations, the basic distance geometry problem can be solved in polynomial time, i.e., one can check in polynomial time whether or not a semi-metric matrix is a distance matrix and, when $D \in \mathrm{EDM}_n$, find one of its realizations.
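The construction in Theorem 3.1 can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it builds $F_h$ entry by entry from (3.1) and factors it as $X^T X$ with a pivoted Cholesky factorization, which is substituted here for the singular value decomposition mentioned above only to keep the example short. The function names `gram_from_distances`, `pivoted_cholesky` and `realize`, and the tolerance `tol`, are hypothetical choices made for this sketch.

```python
import math

def gram_from_distances(D, h):
    """Return F_h = -1/2 (I - e h^T) D_s (I - h e^T), entry by entry as in (3.1)."""
    n = len(D)
    Ds = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]         # squared distances
    Dsh = [sum(Ds[i][l] * h[l] for l in range(n)) for i in range(n)]  # D_s h
    hDsh = sum(h[i] * Dsh[i] for i in range(n))                       # h^T D_s h
    return [[-0.5 * (Ds[i][j] - Dsh[i] - Dsh[j] + hDsh) for j in range(n)]
            for i in range(n)]

def pivoted_cholesky(F, tol=1e-9):
    """Factor a positive semidefinite F as X^T X; return the rows of X."""
    n = len(F)
    R = [row[:] for row in F]   # residual: F minus the part already factored
    rows = []
    for _ in range(n):
        p = max(range(n), key=lambda i: R[i][i])   # largest remaining diagonal entry
        if R[p][p] <= tol:
            break                                  # residual is numerically zero
        s = math.sqrt(R[p][p])
        r = [R[p][j] / s for j in range(n)]
        rows.append(r)
        for i in range(n):
            for j in range(n):
                R[i][j] -= r[i] * r[j]
    return rows

def realize(D, h):
    """Return points x^1..x^n (the columns of X) realizing D, with X h = 0."""
    X = pivoted_cholesky(gram_from_distances(D, h))
    return [[row[j] for row in X] for j in range(len(D))]
```

Calling `realize(D, h)` with the pairwise distances of five points in general position in space and `h = [0.2] * 5` should return five points with three coordinates each; their mutual distances reproduce `D`, and their barycenter with respect to `h` is the origin, as Theorem 3.1 states.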
3.2 Complexity of the generalized distance geometry problem
Suppose we are given $n$ integers $a_1, \dots, a_n$ and a partial distance matrix $D = (d_{ij}) \in \mathbb{R}^{n \times n}$, where only the $n$ elements $d_{ik} = d_{ki} = a_i$ ($i = 1, 2, \dots, n$, $k = \mathrm{mod}(i + 1, n)$) are specified. Let us consider an instance of the generalized distance geometry problem in one-dimensional space: how can we find a realization $X = \{x^i \mid i = 1, 2, \dots, n\} \subset \mathbb{R}$ of $D$ such that $|x^k - x^i| = d_{ik}$ (if it exists), where $k = \mathrm{mod}(i + 1, n)$, $i = 1, 2, \dots, n$?
For the instance above, if there exists a realization $X = \{x^i \mid i = 1, 2, \dots, n\} \subset \mathbb{R}$ of $D$, then we can construct an index set $S := \{i \mid a_i = x^k - x^i,\ k = \mathrm{mod}(i + 1, n)\}$ such that $\sum_{i \in S} a_i = \sum_{i \notin S} a_i$, because the signed differences $x^k - x^i$ around the cycle sum to zero. The converse also holds for this instance. This indicates that the above instance of the generalized distance geometry problem is equivalent to an integer partition problem. It is well known that the latter is NP-complete [12]. Hence, the generalized distance geometry problem in one-dimensional space is NP-complete. The idea above can be extended to analyze the complexity of the generalized distance geometry problem in $\mathbb{R}^m$ [5]. Since the embeddability of weighted graphs in $k$-space is a strongly NP-hard problem [24], the problem of finding a realization of any partial distance matrix $D$ in $\mathbb{R}^k$ ($k > 1$) is also NP-hard [5,13,19]. Furthermore, it is known that the problem of finding an approximate realization, or of finding another realization not equivalent to a given realization, is also NP-hard [5,24].
It should be mentioned that there is an interesting phenomenon in the distance geometry problem: for the problem of finding a realization of a partial distance matrix in $\mathbb{R}^m$ ($m > 1$), the instances that are used to prove the complexity can be solved easily in $\mathbb{R}^{m+1}$ [15]. According to the above analysis, the complexity of the generalized distance geometry problem is still not clear if we do not restrict the dimension of the space in which a realization is located. However, polynomial-time algorithms may exist for partial distance matrices with special structures. For example, if the graph corresponding to the elements specified in $D$ is a chordal graph (every cycle of length at least 4 has a chord in the graph) and all specified distances are exact, then such a generalized distance geometry problem can be solved in polynomial time [19].
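The equivalence between the one-dimensional cycle instance and the partition problem can be made concrete with a small brute-force sketch. It is illustrative only: the function name `realize_cycle_on_line` is hypothetical, indices are 0-based, and the search over all sign patterns takes exponential time, which is in keeping with the NP-completeness noted above.

```python
from itertools import product

def realize_cycle_on_line(a):
    """Place n points on the real line so that consecutive points (cyclically)
    are at distances a[0], ..., a[n-1]; return None if no placement exists."""
    for signs in product((1, -1), repeat=len(a)):
        # signs[i] = +1 means x^{i+1} - x^i = +a[i]; indices with sign +1 play
        # the role of the set S. The cycle closes iff the signed sum is zero,
        # i.e. iff the two parts of the partition have equal sums.
        if sum(s * ai for s, ai in zip(signs, a)) == 0:
            xs = [0]
            for s, ai in zip(signs[:-1], a[:-1]):
                xs.append(xs[-1] + s * ai)
            return xs
    return None
```

For instance, `realize_cycle_on_line([1, 2, 3])` finds a placement because $1 + 2 = 3$, whereas `realize_cycle_on_line([1, 1, 3])` returns `None` because no equal-sum partition exists.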
3.3 Uniqueness of configuration
A rigid motion in Euclidean space refers to a translation, a rotation or a reflection.
Definition 3.1. A realization $X = (x^1, \dots, x^n)$ of a matrix $D \in \mathrm{EDM}_n$ is said to be congruent to another realization $Y = (y^1, \dots, y^n)$ if $X$ can be obtained from $Y$ by a rigid motion or a composition of rigid motions.
Definition 3.2. Given a finite set of points $X = \{x^i \mid i = 0, 1, \dots, k\} \subset \mathbb{R}^m$, let
$$S_X = \left\{ \sum_{i=0}^{k} \lambda_i x^i \;\middle|\; \sum_{i=0}^{k} \lambda_i = 1,\ x^i \in X,\ \lambda_i \in \mathbb{R},\ i = 0, \dots, k \right\}$$
denote the linear manifold generated by $X$. If the dimension $\dim(S_X)$ of $S_X$ is $|X| - 1$, then $X$ is said to be a referential coordinate set [15].
Given a referential coordinate set $X$ and a nonnegative vector $d \in \mathbb{R}_+^{|X|}$, can we determine a point $x$ in $S_X$ such that the distance between $x$ and each $x^i \in X$ is equal to $d_i$? We now present a principle for determining such a point from a nonnegative vector (or distance vector) $d$; its proof can be found in ref. [15]:
Theorem 3.2. [15] Let $X = \{x^i \mid i = 0, 1, \dots, k\} \subset \mathbb{R}^m$ be a referential coordinate set and $d = (d_0, d_1, \dots, d_k)^T \in \mathbb{R}_+^{k+1} = \{x \in \mathbb{R}^{k+1} \mid x \ge 0\}$ a nonnegative vector. Let $S_X$ be the linear manifold corresponding to $X$ and $I_k = \{0, 1, \dots, k\}$ the index set. Then one and only one of the following assertions (i) and (ii) holds:
(i) There exists a unique point $x^* \in S_X$ such that, for any $i \in I_k$, the Euclidean distance between $x^*$ and $x^i$ is equal to $d_i$, i.e., $\|x^* - x^i\| = d_i$.
(ii) For any point $x \in S_X$, there exists a certain $i \in I_k$ such that $\|x - x^i\| \ne d_i$.
Moreover, let us consider the following linear system corresponding to $X$:
$$A^T x = b, \tag{3.9}$$
where
$$A = (x^1 - x^0, \dots, x^k - x^0), \quad b = (b_1, \dots, b_k)^T, \quad b_i = \left(d_0^2 - d_i^2 - \|x^0\|^2 + \|x^i\|^2\right)/2, \quad i = 1, \dots, k.$$
If $x^* \in S_X$ is a solution to the linear system (3.9), then the following assertions (iii) and (iv) hold:
(iii) There exists a constant $c$ such that
$$d_i^2 - \|x^* - x^i\|^2 = c, \quad \forall i \in I_k;$$
(iv) The matrix $D = \begin{pmatrix} D_0 & d \\ d^T & 0 \end{pmatrix}$ is a complete Euclidean distance matrix, i.e., $D \in \mathrm{EDM}_n$, if and only if the constant $c \ge 0$ (where $D_0 \in \mathbb{R}^{(k+1) \times (k+1)}$ is the distance matrix corresponding to $X$). In particular, when $k < m$, $c > 0$, or $k = m$, $c = 0$, there exists a realization of the Euclidean distance matrix $D$ in $\mathbb{R}^m$; when $k = m$, $c > 0$, there exists a realization in $\mathbb{R}^{m+1}$.
Based on the above Theorem 3.2, we can prove the uniqueness of configuration associated with a complete distance matrix by induction, that is, all realizations are congruent to each other. Moreover, if a partial distance matrix has ^/-properties and a unique configuration, then the minimal number of specified elements should satisfy certain necessary conditions [15]. In general, the uniqueness of configuration is dependent not only on the number of elements specified, but also on the distribution of values and positions of the specified elements.
4 Algorithm based on coordinates
4.1 The linear build-up algorithm
The linear build-up algorithms, which use rectangular coordinates to solve the three-dimensional distance geometry problem, run in $O(n)$ time, where $n$ is the number of atoms. They can be applied to problems in which all distances are exact and the number of specified elements is sufficient [8,9]. Using a referential coordinate set, the position of another atom can be determined from a distance vector whose components are the distances between that atom and each of the points in the referential set. In particular, the points in a referential set are the vertices of a tetrahedron in three-dimensional space.
The main idea of the linear build-up algorithms based on rectangular coordinates can be described as follows. First, we choose a referential set composed of four non-coplanar atoms. One of the atoms is placed at the origin, and another atom is placed on the $x$-axis such that its distance to the origin is one of the specified values. The third atom is placed in the $xy$-plane so that its distances to the first two atoms are the specified values. The position of the last atom in the referential set is determined by its distance vector to the other atoms in the same set. Thus, the coordinates of the points in the referential set have the following patterns:
$$x^0 = (0, 0, 0)^T, \quad x^1 = (*, 0, 0)^T, \quad x^2 = (*, *, 0)^T, \quad x^3 = (*, *, *)^T,$$
where $*$ indicates that the corresponding component may be nonzero. Then, using methods similar to those in Section 3, we can determine sequentially the positions of the other atoms by solving linear systems such as (3.9). The parameter $k$ in (3.9) is set to 3, and $d_i$ is the distance from atom $l$ (where $l = 5, 6, \dots$) to atom $i + 1$ ($i = 0, 1, 2, 3$), respectively [5,8,9].
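A single build-up step can be sketched as follows, assuming exact distances and a non-coplanar referential set. The code solves the linear system (3.9) with $k = m = 3$ by Gaussian elimination and also reports the constant $c$ of assertion (iii) in Theorem 3.2, which should vanish for consistent data. The helper names `solve` and `locate_point` and the sample coordinates are hypothetical and serve only as an illustration.

```python
import math

def solve(M, rhs):
    """Solve the square system M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    A = [list(row) + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            raise ValueError("singular system: reference points are degenerate")
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n + 1):
                A[r][j] -= f * A[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (A[i][n] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def locate_point(refs, d):
    """refs: four non-coplanar points x^0..x^3 in R^3; d: distances d_0..d_3
    from the unknown point to them. Returns (x, c) with c as in assertion (iii)."""
    x0 = refs[0]
    n0 = sum(v * v for v in x0)
    rows, b = [], []
    for xi, di in zip(refs[1:], d[1:]):
        rows.append([xi[j] - x0[j] for j in range(3)])       # one row of A^T
        ni = sum(v * v for v in xi)
        b.append((d[0] ** 2 - di ** 2 - n0 + ni) / 2.0)      # b_i in (3.9)
    x = solve(rows, b)
    c = d[0] ** 2 - sum((x[j] - x0[j]) ** 2 for j in range(3))
    return x, c

# Example: recover a point from its distances to four reference atoms.
refs = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
target = (0.5, -1.0, 2.0)
dists = [math.dist(target, r) for r in refs]
point, c = locate_point(refs, dists)
```

In the example, `point` should agree with `target` up to rounding and `c` should be close to zero; a clearly positive `c` would signal, by assertion (iv), that the distances can only be realized in one more dimension.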
4.2 An algorithm based on barycentric coordinates
A disadvantage of the linear build-up algorithm based on rectangular coordinates is its poor numerical stability. In addition, any realization depends on the coordinates of the points in a referential set: after a rigid motion, all coordinates of the atoms change accordingly even though the configuration is unique. In order to overcome these disadvantages, we propose here an algorithm based on barycentric coordinates. Coordinates of this kind depend only on the configuration and remain unchanged under a rigid motion.
Definition 4.1. [23,28] Given four non-coplanar points $x^1, x^2, x^3, x^4 \in \mathbb{R}^3$ and any point $x \in \mathbb{R}^3$, if there exist unique values $\lambda_i \in \mathbb{R}$ ($i = 1, 2, 3, 4$) such that $\sum_{i=1}^{4} \lambda_i = 1$ and $x = \sum_{i=1}^{4} \lambda_i x^i$, then the values $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are referred to as the barycentric coordinates of $x$ with respect to $x^1, x^2, x^3, x^4$.
For the above definition, let $X = (x^1, x^2, x^3, x^4)$ and $\lambda = (\lambda_1, \lambda_2, \lambda_3, \lambda_4)^T$; then $x = X \lambda$, where the barycentric coordinates satisfy $e^T \lambda = 1$. Barycentric coordinates have two advantages: (1) the coordinates are independent of the origin and of rigid motions, and describe the relative position of a point $x$ with respect to $X$; (2) when we determine the structure of a protein (i.e., find a realization of a distance matrix), a strategy based on such coordinates helps to implement the algorithm in parallel: the whole peptide chain can be divided into several sub-chains, and the secondary structures are then determined piece by piece. Finally, the different peptide sub-chains are integrated to generate the tertiary structure of the protein.
In the following part of the paper, we discuss the principle of determining the position of a point based on barycentric coordinates. From $x = X \lambda$ and $e^T \lambda = 1$, we know
$$x - x^1 = \sum_{j=2}^{4} \lambda_j (x^j - x^1) = B \bar{\lambda}, \tag{4.1}$$
where $B = (x^2 - x^1, x^3 - x^1, x^4 - x^1)$ and $\bar{\lambda} = (\lambda_2, \lambda_3, \lambda_4)^T$. Multiplying both sides of (4.1) from the left by $B^T$, we get a linear system in the unknown vector $\bar{\lambda}$:
$$B^T (x - x^1) = B^T B \bar{\lambda}. \tag{4.2}$$
Since $x^1, x^2, x^3, x^4$ are affinely independent (i.e., non-coplanar), the matrix $B$ has full rank. In particular, the coefficient matrix $B^T B$ is nonsingular [23].
According to the law of cosines, the coefficient matrix and the constants in the linear system (4.2) can be computed from the distance matrix $D = (d_{ij}) = (\|x^i - x^j\|)$ and the distances $d_{0i}$ ($i = 1, 2, 3, 4$) between the point $x$ and $x^i$. Therefore, the solution $\bar{\lambda}$ of system (4.2) is independent of the rectangular coordinates and depends only on the distances. After $\bar{\lambda}$ is computed, we get
$$\lambda_1 = 1 - \lambda_2 - \lambda_3 - \lambda_4. \tag{4.3}$$
That is, we obtain the barycentric coordinate $\lambda = (\lambda_1, \lambda_2, \lambda_3, \lambda_4)^T$ of $x$ with respect to $X$. It should be mentioned that the coefficient matrix $B^T B$ in (4.2) coincides, up to a constant factor, with the positive semidefinite matrix $P$ defined by (2.1) and related to the distance matrix $D$ (see assertion (ii) in Theorem 2.1). Although the above results come from the linear system (4.1), the barycentric coordinate $\lambda$ is independent of the choice of $x^1$ in (4.1). We now present the following equivalence theorem:
Theorem 4.1. For the barycentric coordinate $\lambda$ of $x$ with respect to $x^1, x^2, x^3, x^4$, any of the following strategies gives the same result: for any index $i$, we construct a linear system similar to (4.2) by using the relation
$$x - x^i = \sum_{j \ne i} \lambda_j (x^j - x^i).$$
Proof. Without loss of generality, we choose the index $i = 2$ and compute the barycentric coordinates again. Denote
$$C = (x^1 - x^2, x^3 - x^2, x^4 - x^2), \quad \tilde{\lambda} = (\lambda_1, \lambda_3, \lambda_4)^T.$$
We have a linear system similar to (4.2) as follows:
$$C^T (x - x^2) = C^T C \tilde{\lambda}. \tag{4.4}$$
Note that $x^j - x^2 = (x^j - x^1) + (x^1 - x^2)$ for $j = 3, 4$, and that $\lambda_1 + \lambda_3 + \lambda_4 = 1 - \lambda_2$. Hence, we get
$$x - x^2 = C \tilde{\lambda} = B \bar{\lambda} + (x^1 - x^2),$$
i.e., $x - x^1 = B \bar{\lambda}$.
This indicates the equivalence of the linear systems (4.2) and (4.4) under the condition $e^T \lambda = 1$. In other words, the two strategies give the same barycentric coordinate $\lambda$. □
A new algorithm based on barycentric coordinates comes from integrating the linear build-up strategy with the above principle about $\lambda$; it can be used to analyze the basic distance geometry problem. The main idea of the new algorithm is as follows. After an initial referential coordinate set $\{x^{1,0}, x^{2,0}, x^{3,0}, x^{4,0}\}$ is chosen, we compute sequentially the barycentric coordinates of the other points with respect to one of the referential coordinate sets $\{x^{1,k}, x^{2,k}, x^{3,k}, x^{4,k}\}$ ($k = 0, 1, \dots$) based on the linear system (4.2) and (4.3). Note that the referential coordinate set can be adjusted during the computation. The whole realization is constructed from the barycentric coordinates of the points together with the corresponding referential coordinate sets. Finally, we present a framework of the algorithm based on barycentric coordinates (we assume that all elements in the distance matrix are specified and exact):
Algorithm 1.
BEGIN
Step 1: Given an initial referential coordinate set $X_0 = \{x^{1,0}, x^{2,0}, x^{3,0}, x^{4,0}\}$ (the four atoms in $X_0$ are non-coplanar), let $U = \{x^{1,0}, x^{2,0}, x^{3,0}, x^{4,0}\}$ denote the set of atoms whose coordinates are known, and let the set $V$ contain all other atoms. Set $k = 0$.
Step 2: If $V = \emptyset$, then all atoms have barycentric coordinates and corresponding referential sets, and the algorithm stops; otherwise, go to Step 3.
Step 3: For a certain atom $x \in V$, choose four non-coplanar atoms in $U$ such that the distances from them to $x$ are specified. These four atoms act as a new referential coordinate set, denoted by $X_{k+1} = \{x^{1,k+1}, x^{2,k+1}, x^{3,k+1}, x^{4,k+1}\}$. Determine the barycentric coordinates of $x$ with respect to $X_{k+1}$ by solving the linear system (4.2) and (4.3). Set $\{x\} \cup U \to U$, $V \setminus \{x\} \to V$, $k + 1 \to k$ and go to Step 2.
END
Note that the new algorithm based on the barycentric coordinates is helpful to solve the basic distance geometry problem in parallel and improve the stability of computation. If the number of elements specified in a partial distance matrix is enough, then it is possible to solve such a generalized distance geometry problem by using the above algorithm. The analysis of the algorithm in detail and some numerical results can be found in ref. [30].
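The core computation of Step 3, solving (4.2) and (4.3) from distances alone, can be sketched as below. This is a hedged illustration rather than the implementation analyzed in ref. [30]: the names `solve` and `barycentric_from_distances` are hypothetical, indices are 0-based, and the entries of $B^T B$ and $B^T (x - x^1)$ are obtained from the law of cosines as described above.

```python
import math

def solve(M, rhs):
    """Solve the square system M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    A = [list(row) + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            raise ValueError("singular system: reference atoms are coplanar")
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n + 1):
                A[r][j] -= f * A[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (A[i][n] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def barycentric_from_distances(D, d0):
    """D: 4x4 distances among the reference atoms x^1..x^4 (indices 0..3);
    d0: distances from the unknown atom x to x^1..x^4.
    Returns (lambda_1, lambda_2, lambda_3, lambda_4)."""
    sq = lambda t: t * t
    # (B^T B)_{ab} = (x^a - x^1)^T (x^b - x^1), by the law of cosines
    G = [[0.5 * (sq(D[a][0]) + sq(D[b][0]) - sq(D[a][b])) for b in (1, 2, 3)]
         for a in (1, 2, 3)]
    # (B^T (x - x^1))_a = (x^a - x^1)^T (x - x^1), by the law of cosines
    r = [0.5 * (sq(D[a][0]) + sq(d0[0]) - sq(d0[a])) for a in (1, 2, 3)]
    l2, l3, l4 = solve(G, r)          # system (4.2)
    return (1.0 - l2 - l3 - l4, l2, l3, l4)   # relation (4.3)

# Example: reference tetrahedron and one further atom, given only by distances.
refs = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
atom = (0.2, 0.3, 0.4)
D = [[math.dist(p, q) for q in refs] for p in refs]
lam = barycentric_from_distances(D, [math.dist(atom, p) for p in refs])
```

Because only distances enter the computation, translating, rotating or reflecting all five points together leaves `lam` unchanged, which is the invariance property that motivates the algorithm.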
References
[1] Alfakih, A.Y., Khandani, A. and Wolkowicz, H., Solving Euclidean distance matrix completion problems via semidefinite programming, Computational Optimization and Applications, 12: 13-30, 1999.
[2] Blumenthal, L.M., Theory and Applications of Distance Geometry, London: Oxford University Press, 1953; New York, NY: Chelsea Publishing Co., 1970.
[3] Borg, I. and Groenen, P., Modern Multidimensional Scaling: Theory and Applications, New York, NY: Springer-Verlag, 1997.
[4] Carlson, D., What are Schur complements, anyway? Linear Algebra and Its Applications, 74: 257-275, 1986.
[5] Crippen, G.M. and Havel, T.F., Distance Geometry and Molecular Conformation, England: Research Studies Press, New York, NY: John Wiley and Sons, 1988.
[6] De Leeuw, J., Convergence of the majorization method for multidimensional scaling, Journal of Classification, 5: 163-180, 1988.
[7] Glunt, W., Hayden, T.L. and Raydan, M., Molecular conformations from distance matrices, Journal of Computational Chemistry, 14(1): 114-120, 1993.
[8] Dong, Q.F. and Wu, Z.J., A linear-time algorithm for solving the molecular distance geometry problem with exact inter-atomic distances, Journal of Global Optimization, 22: 365-375, 2002.
[9] Dong, Q.F. and Wu, Z.J., A geometric build-up algorithm for solving the molecular distance geometry problem with sparse distance data, Journal of Global Optimization, 26(3): 321-333, 2003.
[10] Gower, J.C., Distance matrices and their Euclidean approximation, In: Third International Conference on Data Analysis and Informatics, Diday, E. (ed.), INRIA, Versailles, pp. 1-19, 1983.
[11] Gower, J.C., Properties of Euclidean and non-Euclidean distance matrices, Linear Algebra and Its Applications, 67: 81-97, 1985.
[12] Karp, R.M., On the complexity of computational problems, Networks, 5: 45-68, 1975.
[13] Hendrickson, B., The Molecular Problem: Determining Conformation from Pairwise Distances, Ph.D. thesis, Department of Computer Science, Cornell University, 1990.
[14] Hendrickson, B., The molecule problem: exploiting structure in global optimization, SIAM Journal on Optimization, 5(4): 835-857, 1995.
[15] Huang, H.X., Liang, Z.A. and Pardalos, P.M., Some Properties for the Euclidean Distance Matrix and Positive Semidefinite Matrix Completion Problems, Journal of Global Optimization, 25(1): 3-21, 2003.
[16] Huang, K., Lectures on Statistical Physics and Protein Structure, MIT and ZCAM, 2004.
[17] Kruskal, J.B., Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, Psychometrika, 29: 1-27, 1964.
[18] Kruskal, J.B., Nonmetric multidimensional scaling: a numerical method, Psychometrika, 29: 115-129, 1964.
[19] Laurent, M., A Tour d'Horizon on Positive Semidefinite and Euclidean Distance Matrix Completion Problems, In: Topics in Semidefinite and Interior-Point Methods, Pardalos, P.M., and Wolkowicz, H. (eds.), Fields Institute Communications vol. 18, American Mathematical Society, pp. 51-76, 1998.
[20] Le Thi Hoai An, Solving large scale molecular distance geometry problems by a smoothing technique via the Gaussian transform and D.C. programming, Journal of Global Optimization, 27(4): 375-397, 2003.
[21] Moré, J.J. and Wu, Z.J., Global continuation for distance geometry problems, SIAM Journal on Optimization, 7(3): 814-836, 1997.
[22] Moré, J.J. and Wu, Z.J., Distance geometry optimization for protein structures, Journal of Global Optimization, 15: 219-234, 1999.
[23] Rockafellar, R.T., Convex Analysis, Princeton, New Jersey: Princeton University Press, 1970.
[24] Saxe, J.B., Embeddability of weighted graphs in k-space is strongly NP-hard, In: Proceedings of the 17th Allerton Conference in Communications, Control and Computing, pp. 480-489, 1979.
[25] Schoenberg, I.J., Remarks to Maurice Fréchet's article "Sur la définition axiomatique d'une classe d'espaces vectoriels distanciés applicables vectoriellement sur l'espace de Hilbert", Annals of Mathematics, 36: 724-732, 1935.
[26] Yajima, Y., Positive Semidefinite Relaxations for Distance Geometry Problems, manuscript, 2001.
[27] Zou, Z., Bird, R.H. and Schnabel, R.B., A stochastic/perturbation global optimization algorithm for distance geometry problems, Journal of Global Optimization, 11: 91-105, 1997.
[28] Hazewinkel, M., Encyclopaedia of Mathematics, Dordrecht: Kluwer Academic, 1988.
[29] Cai, D.Y., Numerical Algebra, Beijing: Tsinghua University Press, 1987.
[30] Wang, C.J., Studies of the algorithm based on barycentric coordinates for the distance geometry problem, Technical Report, Department of Mathematical Sciences, Tsinghua University, 2004.
[31] Zhou, A.R. (ed.), Biochemistry (5th Edition), Beijing: People's Medical Publishing House, 2002.
On Ill-Posedness and Inversion Scheme for 2-D Backward Heat Conduction
Jijun Liu
Department of Mathematics, Southeast University
Nanjing 210096, China. E-mail: [email protected]
Abstract
The aim of this paper is to present an inversion scheme for the 2-D backward heat problem in a general domain $D \subset \mathbb{R}^2$. Such problems arise in many engineering areas such as archaeology and reaction-diffusion processes. Physically, the task is to determine the initial field distribution from its final measurement given at some time $T > 0$. Mathematically, this problem belongs to the category of inverse problems for parabolic equations. The ill-posedness and the convergence rate of the regularizing solution are analyzed theoretically. Based on the fundamental solution to the parabolic equation, we propose an implementable inversion scheme for this severely ill-posed problem by constructing a regularizing system for a set of coupled ill-posed integral equations. Numerical results are given to show the validity of this regularizing scheme.
1 Introduction
For a given initial (and/or boundary) status of some physical field at t = 0, we can determine its status at t > 0 from the governing law such as a parabolic equation or hyperbolic equation. These kinds of direct problems for partial differential equations have been studied thoroughly. In terms of the language used in the engineering community, we want to determine the output for a given system from its input data. Corresponding to these well-known direct problems, we are often faced with the so-called inverse problems, namely, recovery of system parameters or initial/boundary data from some information about the solution. The reconstruction of system parameters (coefficients in PDEs) implies the identification of an unknown system, while the determination of initial/boundary status means the controllability for a given system. That is, to obtain the desired output, what kind of signals should be
input to a given system. Obviously, such problems arise in a number of applications of interest to science and technology, such as the construction of composite materials and geophysical exploration. In this paper, we consider a controllability problem arising in reaction-diffusion processes or archaeology, which aims to determine the initial status of a physical field governed by a parabolic equation from its measurement given at some final time $T > 0$. Our model is the inverse heat conduction problem (IHCP). There has been much research on this topic. Work on the case of one spatial variable can be found in [1], [4], [7]. However, the techniques used for 1-D IHCPs cannot be generalized easily to the higher-dimensional case [7], and new inversion schemes should be introduced for higher-dimensional IHCPs. Moreover, this problem is severely ill-posed. In the 2-dimensional case, IHCPs for determining the boundary surface heat flux can be found in [3], [12], while the backward heat problem for a rectangle $D$ was treated in [8], where the problem was transformed directly into an integral equation of the first kind, based on the explicit expression of the heat field. However, for a general 2-dimensional domain $D \subset \mathbb{R}^2$, we cannot give the Dirichlet eigenfunctions of the Laplacian operator explicitly. Therefore, we should find a new inversion method for the backward heat conduction problem in a general 2-dimensional domain. Let the temperature field $u(x, t)$ satisfy the following direct heat problem:
$$\begin{cases} u_t = \Delta u, & (x, t) \in D \times (0, T], \\ u(x, t) = 0, & (x, t) \in \partial D \times [0, T], \\ u(x, 0) = u_0(x), & x \in D, \end{cases} \tag{1.1}$$
where $D \subset \mathbb{R}^2$ and $u_0(x)$ should satisfy the compatibility condition
$$u_0(\cdot) = 0 \quad \text{on } \partial D. \tag{1.2}$$
The inverse problem considered here is to estimate the initial temperature $u_0(x)$ from the measurement $u_m(x, T)$ of $u(x, T)$ satisfying
$$\|u_m(\cdot, T) - u(\cdot, T)\| \le \delta \tag{1.3}$$
for some known error level $\delta > 0$. In this paper we focus on explaining the ill-posedness of this problem and our inversion scheme based on the potential theory for the parabolic equation. Some theoretical details will be given elsewhere. For example, the zero boundary condition $u(x, t)|_{\partial D} = 0$ considered here is without loss of generality; the general case $u(x, t) = f(x, t)$ on $\partial D$ for some known function $f(x, t)$ can be transformed into this simple case by the potential theory ([6], Theorem 9.8).
This paper is organized as follows. First, the integral equations corresponding to the inverse problem are set up based on the potential theory. Then the ill-posedness and the convergence rate for some theoretical regularizing solutions are analyzed in section 3. In section 4, we propose to solve this ill-posed problem by regularizing the equivalent integral equations. The information about the regularizing parameter obtained in section 3 helps us choose the regularizing parameter for the general ill-posed integral equations. Numerical implementations are presented in section 5 to demonstrate the efficiency of the inversion method.
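2 Equivalent integral equations
The function
$$G(x, t; y, \tau) := \frac{1}{4\pi (t - \tau)} \exp\left(-\frac{|x - y|^2}{4 (t - \tau)}\right), \quad t > \tau \tag{2.1}$$
is called the fundamental solution of the 2-D heat equation. First we state the classical potential theory for the heat conduction problem ([6], Theorem 9.8), which is essential for our regularizing inversion scheme.
Lemma 2.1. Let $f(x, 0)|_{x \in \partial D} = 0$. If the function $\varphi(x, t) \in C(\partial D \times (0, T])$ satisfies
$$f(x, t) = \int_0^t \int_{\partial D} \frac{\partial G(x, t; y, \tau)}{\partial \nu(y)} \varphi(y, \tau)\, ds(y)\, d\tau - \frac{1}{2} \varphi(x, t) \tag{2.2}$$
for $(x, t) \in \partial D \times (0, T]$, then the double-layer potential given by
$$v(x, t) = \int_0^t \int_{\partial D} \frac{\partial G(x, t; y, \tau)}{\partial \nu(y)} \varphi(y, \tau)\, ds(y)\, d\tau, \quad (x, t) \in D \times (0, T] \tag{2.3}$$
solves the heat conduction problem
$$\begin{cases} v_t = \Delta v, & (x, t) \in D \times (0, T], \\ v(x, t) = f(x, t), & (x, t) \in \partial D \times [0, T], \\ v(x, 0) = 0, & x \in D. \end{cases} \tag{2.4}$$
Moreover, it follows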
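Theorem 2.2. The solution of (1.1) and (1.2) at $t = T$ meets
$$W(x, t) = \frac{1}{4\pi t} \int_D \exp\left(-\frac{|x - y|^2}{4t}\right) u_0(y)\, dy, \quad (x, t) \in \overline{D} \times (0, T],$$
$$\varphi(x, t) = 2 W(x, t) + 2 \int_0^t \int_{\partial D} \frac{\partial G(x, t; y, \tau)}{\partial \nu(y)} \varphi(y, \tau)\, ds(y)\, d\tau, \quad (x, t) \in \partial D \times (0, T],$$
$$u(x, T) = W(x, T) + \int_0^T \int_{\partial D} \frac{\partial G(x, T; y, \tau)}{\partial \nu(y)} \varphi(y, \tau)\, ds(y)\, d\tau, \quad x \in D.$$
Proof. It follows from the standard result ([6], Theorem 9.4) that $W(x, t)$ solves the heat equation with
$$W(x, 0) = \lim_{t \to 0} \frac{1}{4\pi t} \int_D \exp\left(-\frac{|x - y|^2}{4t}\right) u_0(y)\, dy = u_0(x).$$
Notice that $u_0(x)|_{\partial D} = 0$ means we can extend $u_0$ to $\mathbb{R}^2$ continuously by defining $u_0(x) = 0$ for $x \notin D$. Now we make the transform
$$u(x, t) = W(x, t) + V(x, t). \tag{2.5}$$
Then $V(x, t)$ solves
$$\begin{cases} V_t = \Delta V, & (x, t) \in D \times (0, T], \\ V(x, t) = -W(x, t), & (x, t) \in \partial D \times [0, T], \\ V(x, 0) = 0, & x \in D. \end{cases} \tag{2.6}$$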
Now the application of Lemma 2.1 to (2.6) completes the proof immediately. □
For a given initial temperature $u_0$, this theorem gives a way to determine $u(x, T)$ step by step. This procedure is a typical layer-stripping technique along the time direction. The advantage of this method is that we only need to solve a 2-dimensional problem at each step, while general numerical methods such as finite difference schemes need to solve the problem in the 3-dimensional space $D \times (0, T]$. So this theorem in fact provides another possible numerical approach to the direct heat problem for a general 2-dimensional domain. For the direct heat problem, we can clearly see the well-posedness of determining $u(x, T)$ from $u_0$ in terms of Theorem 2.2. However, for the backward heat problem, recovery of $u_0$ from $u(x, T)$ is ill-posed. To understand the degree of the ill-posedness, we consider this problem in
the frequency domain temporarily. Denote by $\hat{u}(\omega, t)$ for $\omega \in \mathbb{R}^2$ the Fourier transform of $u(x, t)$ with respect to $x \in \mathbb{R}^2$. It is easy to see that
$$\hat{u}_0(\omega) = \exp(|\omega|^2 T)\, \hat{u}(\omega, T) \tag{2.7}$$
from (1.1). This relation means that, in recovering $\hat{u}_0$, the high-frequency components are amplified by $\exp(|\omega|^2 T)$. This factor increases rapidly as $|\omega| \to \infty$ compared with those for some known ill-posed problems. For example, the differential operation has the estimate $|\widehat{Dp}| \le |\omega|\, |\hat{p}|$ for a one-dimensional differentiable function $p(x)$, while the determination of the boundary heat status in the 2-D case has an amplifier given by the exponential of a square root of an expression in the frequency variables ([11], Chapter 4), which is roughly the same as $\exp(|\omega|)$. So the backward heat problem is severely ill-posed. For this problem, we generally cannot restore the stability on the initial temperature. But we can obtain conditional stability near the initial time. Such an estimate leads to the convergence rate of some regularizing solution, which shows the ill-posedness of this inverse problem from another point of view. This is the topic of the next section.
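The comparison of amplification factors can be made tangible with a few lines of Python. The sketch below is purely illustrative: the function names and dictionary keys are hypothetical, and $\exp(|\omega|)$ is used as the rough stand-in for the boundary-status amplifier, as in the text.

```python
import math

def amplification(omega, T):
    """Amplification factors at frequency magnitude |omega| = omega."""
    return {
        "differentiation": omega,                    # |omega|
        "boundary_status": math.exp(omega),          # roughly exp(|omega|)
        "backward_heat": math.exp(omega ** 2 * T),   # exp(|omega|^2 T), from (2.7)
    }

def backward_mode(u_hat_T, omega, T):
    """Recover one Fourier mode of u_0 from the same mode of u(., T) via (2.7)."""
    return math.exp(omega ** 2 * T) * u_hat_T
```

Even a data error as small as $10^{-10}$ in a single mode with $|\omega| = 6$ and $T = 1$ is blown up to order $10^5$ by `backward_mode`, which is why regularization is indispensable here.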
3 Regularization solution
Suppose that an upper bound on the exact $u_0$ is known. More precisely, we assume that
$$\|u_0\| + \|\Delta u_0\| \le M_0 \tag{3.1}$$
holds for some known constant $M_0 > 0$, where $\|\cdot\|$ is the $L^2$-norm in $D$. We will approximate $u_0$ in some admissible set of initial temperatures. First, by the standard logarithmic convexity method ([5], [9]), we can prove that the solution of (1.1) with $(u, u_0)$ replaced by $(w, w_0)$ has the estimate
$$\|w(\cdot, t)\| \le \|w_0\|^{1 - t/T}\, \|w(\cdot, T)\|^{t/T}. \tag{3.2}$$
Please notice that this estimate is true but trivial at $t = 0$. Now we take $t = \alpha$ for some small time $\alpha > 0$. Then we get
$$\|w(\cdot, \alpha)\| \le \|w_0\|^{1 - \alpha/T}\, \|w(\cdot, T)\|^{\alpha/T}. \tag{3.3}$$
Let us assume that the admissible set of $w_0(x)$ is
$$S_1 := \{h(x) \in C(D) : \|h\| \le 2 M_0\}. \tag{3.4}$$
For two initial temperature fields $w_0^1, w_0^2 \in S_1$, the corresponding temperature fields meet
$$\|w^1(\cdot, \alpha) - w^2(\cdot, \alpha)\| \le \|w_0^1 - w_0^2\|^{1 - \alpha/T}\, \|w^1(\cdot, T) - w^2(\cdot, T)\|^{\alpha/T} \le 4 M_0 \|w^1(\cdot, T) - w^2(\cdot, T)\|^{\alpha/T} \tag{3.5}$$
from (3.3). Now consider the difference $w(x, \alpha) - w_0(x)$. Obviously it follows
$$\|w(\cdot, \alpha) - w_0(\cdot)\|^2 = \int_D |w(x, \alpha) - w(x, 0)|^2\, dx = \alpha^2 \int_D \left| w_t(x, t)|_{t = \xi(x)} \right|^2 dx$$
with $\xi(x) \in (0, \alpha)$. On the other hand, the continuity of $w(x, t)$ up to $t = 0$ says
$$\lim_{t \to 0} w_t = \Delta\left(\lim_{t \to 0} w\right) = \Delta w_0.$$
Therefore, if $w_0(x)$ is smooth enough, such as in $C^2(D)$, $w_t(x, t)$ can be uniformly bounded by $\Delta w_0$ in $D \times (0, \alpha)$ for $\alpha$ small enough. So we get
$$\|w(\cdot, \alpha) - w_0(\cdot)\|^2 \le C^2 \alpha^2 \int_D |\Delta w_0|^2\, dx$$
with some constant $C > 0$ independent of $\alpha$, which yields
$$\|w(\cdot, \alpha) - w_0(\cdot)\|_{L^2(D)} \le 2 C M_0 \alpha \tag{3.6}$$
for initial temperature $w_0$ in the admissible set
$$S_2 := \{h(x) \in C^2(D) : \|\Delta h\| \le 2 M_0\}. \tag{3.7}$$
Based on (3.4)-(3.7), we can give a regularization strategy for the approximation of uo(x) from the noisy data of u(x, T). Theorem 3.1. Let the admissible set for UQ be Si P ) ^ - For given noisy data um(x,T) satisfying (1.3), there is a regularization strategy to construct w(x,a), which is an approximation to UQ for small a. Moreover, the convergence rate is
|K.,a(J))-«0(-)ll
(3.8)
for the regularizing parameter a(5) — C n_~ln"a ' as 5 —> 0. Proof. Since u(x, T) is the final value corresponding to the exact initial value uo(x) £ Si(~|S2 and um(-,T) is the approximation of u(x,T), it follows from the continuous dependence of the direct heat problem that there exists w(x, t) solving the heat problem with initial value WQ € Si f]S2 such that \\w(-,T) - um(-,T)\\ < 5, which means
$$\|w(\cdot,T)-u(\cdot,T)\| \le 2\delta \tag{3.9}$$
due to (1.3), and $w$ depends on $u^m(\cdot,T)$ but may be non-unique.
Now for two functions $w, u$ with initial values $w_0, u_0 \in S_1 \cap S_2$, we have
$$\|w(\cdot,\alpha) - u_0\| \le \|u(\cdot,\alpha) - u_0\| + \|w(\cdot,\alpha) - u(\cdot,\alpha)\| \le C_1\alpha + 4M_0\|w(\cdot,T) - u(\cdot,T)\|^{\alpha/T} \le C_1\alpha + 8M_0\delta^{\alpha/T} := f(\alpha) \tag{3.10}$$
from (3.5), (3.6) and (3.9). The first term of $f(\alpha)$ is the error of approximating $u_0$ by $u(\cdot,\alpha)$, while the second one is the error of approximating $u(\cdot,\alpha)$ from the noisy data, since $w$ comes from $u^m(\cdot,T)$. Now we choose $\alpha$ to minimize $f(\alpha)$. It is easy to see from $f'(\alpha) = 0$ that $\min_{\alpha>0} f(\alpha) = f(\alpha(\delta))$ with
$$\alpha(\delta) = \frac{T\left(\ln\frac{C_1 T}{8M_0} - \ln(-\ln\delta)\right)}{\ln\delta}. \tag{3.11}$$
We can also check that the convergence rate of $f(\alpha(\delta))$ is the same as that of $\alpha(\delta)$ given by (3.11). The proof is complete. $\Box$
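The choice of the regularizing parameter in the proof can be verified numerically. The sketch below assumes the bound $f(\alpha) = C_1\alpha + 8M_0\delta^{\alpha/T}$ from (3.10); the stationary point is obtained by solving $f'(\alpha) = 0$ directly, and the constants `C1`, `M0`, `T` and the noise level are hypothetical test values.

```python
import math

def f(alpha, delta, T, C1, M0):
    """Total error bound: approximation error plus data error."""
    return C1 * alpha + 8.0 * M0 * delta ** (alpha / T)

def alpha_star(delta, T, C1, M0):
    """Stationary point of f, from C1 + 8*M0*(ln(delta)/T)*delta**(alpha/T) = 0."""
    return T * (math.log(C1 * T / (8.0 * M0))
                - math.log(-math.log(delta))) / math.log(delta)

# Assumed test values (not from the paper).
delta, T, C1, M0 = 1e-6, 1.0, 1.0, 1.0
a = alpha_star(delta, T, C1, M0)

# f is convex in alpha, so the stationary point is the global minimizer;
# compare against a coarse grid search as a sanity check.
grid = [k * 1e-3 for k in range(1, 1001)]
best_on_grid = min(f(g, delta, T, C1, M0) for g in grid)
print(a, f(a, delta, T, C1, M0), best_on_grid)
```

As the noise level tends to zero the minimizer tends to zero only at a logarithmic rate, which reflects the severe ill-posedness discussed in the text.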
Remark 3.2. The argument used here is that although we cannot generally choose $w_0(x)$ such that $w(x,T) = u^m(x,T)$ for noisy data $u^m(x,T)$, we can take some $w_0$ such that $w(x,T)$ is close to $u^m$. For example, $w_0 = u_0$ satisfies this condition. This technique is somewhat similar to that used in [2] to obtain an approximate minimizer. Although we do not know how to choose $w_0 \in S_1 \cap S_2$, the importance of this theorem is that we give the convergence rate for this severely ill-posed problem. The approximation of $u_0$ by another regularization method will be constructed from Theorem 2.2 in the next section.
4
Integral equation method for inversion
Theorem 2.2 gives a relation between u(x, T) and uo(x). We will use this relation to approximate uo(x) from the noisy data of u(x, T) numerically. Define (K\u0)(x)
= ±-J^exp
( - ^ ^ )
My)dy,
x G D,
(4.1)
= 2 f f dG{*>fy>T)cf>(y,T)ds(y)dT,xedD (4.2) av Jo JdD \y) for t G (0,T]. Then by eliminating W(x,t) in Theorem 2.2, we get (K^)(x)
((KTUo)(x) + (KT(j>)(x)=2u(x,T), \){x) = 0,
x€D, (x,t) edDx(0,T]
[
' '
with respect to $u_0, \phi$. Essentially, it is an integral equation of the first kind with respect to one unknown $u_0$. The solvability of (4.3) for the exact temperature field comes from the following argument. The existence of $u_0(x)$ is obvious. Since the eigenfunctions $\{u_n(x)\}$ for the Laplacian operator with Dirichlet boundary condition are complete in $L^2(D)$, we can expand the solution to (1.1) as
$$u(x,t) = \sum_{n=1}^{\infty} c_n e^{-a_n t} u_n(x),$$
where the coefficient $c_n$ is determined from the condition on $u(x,t)$ with respect to the time variable $t$. The normalized eigenfunctions $\{u_n(x)\}$ satisfy
$$\begin{cases} \Delta u_n = -a_n u_n, & x \in D,\\ u_n = 0, & x \in \partial D. \end{cases}$$
It is easy to see that
$$u_0(x) = u(x,0) = \sum_{n=1}^{\infty}\left(\int_D u(y,T)u_n(y)\,dy\right) e^{a_n T} u_n(x),$$
which means that $u_0(x)$ is unique for the given exact field $u(x,T)$. Now the second equation of (4.3) yields
$$\phi(x,t) - (K_2^t\phi)(x) = 0, \quad (x,t) \in \partial D \times (0,T]$$
for $u_0(x) = 0$. The fact that this equation has only the trivial solution can be established by a standard argument (see the proof of Theorem 9.9 in [6], [10]). Finally, the existence of $\phi(x,t)$ for any given $u_0$ can be obtained from the Riesz theorem since the operator $K_2^t$ is compact from $C(\partial D \times [0,T])$ to $C(\partial D \times [0,T])$. Now we construct the Tikhonov regularization approximation of (4.3) for the noisy data $u^m(x,T)$ by
$$\begin{cases} \beta g(x) + (K_1^{T*}K_1^T g)(x) + (K_1^{T*}K_2^T\phi)(x) = 2(K_1^{T*}u^m)(x), & x \in D,\\ \phi(x,t) - (K_1^t g)(x) - (K_2^t\phi)(x) = 0, & (x,t) \in \partial D \times (0,T] \end{cases} \tag{4.4}$$
with some regularization parameter $\beta > 0$, where $K_1^{T*} = (K_1^T)^*$ is the adjoint of $K_1^T$. Denote the solution of (4.4) by $(g_\beta^\delta, \phi_\beta^\delta)$. We hope $g_\beta^\delta \to u_0$ for some $\beta = \beta(\delta)$ as $\delta \to 0$. The solvability of (4.4) for given $\beta > 0$ and the convergence property of the regularizing solution will be proven theoretically in [10]. If we want to give a strategy for the choice of $\beta(\delta)$ in terms of the standard Tikhonov theory, it is necessary but not easy to analyze the properties of the coefficient operators in (4.4) and to make some smoothness assumptions on the exact solution $u_0$. However, the regularization convergence rate shown in Section 3 gives us some information about the choice of $\beta$, though that convergence rate is obtained for a different regularization method. Now we derive the discrete form of (4.4). The operator $K_1^T$ is standard with a smooth kernel for $x \in D$. Let us consider $K_2^t$. For $x, y \in \partial D$, $x \ne y$, we can rewrite (4.2) as
(K^Xs) = /
V{y
} '{X~V)Mx,y,t)ds(y),
JdD
(4.5)
\x-y\
**>*') = 2i* - »i a [ s ^ h r ? e x p (- lwS)) <*»>T)dT-
(46)
The improper integral at r = t of the right-hand side is obviously conver\x-y\2
gent since limT _•(_(£ — r) e •4(*--T) = 0 for x ^ y. Moreover, it follows that ]imil>(x,y,t) = -(y,t) (4.7) (see the proof of Theorem 9.5 in [6] for the above argument). Therefore we consider ip(x, y, t) is a continuous function for (x, y) e dD x dD by defining tp(y,y,t) = ^(f>(y,t) for y 6 dD. Now we compute the weak singular integral (A.bfioi i/)(x,y,t) € C{dD x dD x (0,T]). Assume that the parameterization expression of dD is dD := {r(s) := (ri(s),r2(s)),s
G [0,2TT]}
with 27T—period smooth function r(s). Denote by ip(p,q,t) := ii>{r{p), r(q),t) for (x,y) = (r(p),r(q)) € dD x dD. Since "(l/) = (r>2{q), -r[(q))/\r'(q)\
:= r ^ / K f a ) ! ,
we get
(K^)(x) = / ^ ^ ^ I I ^ , , , ^ Vo IK?)-np)r for xj^y.
On the other hand, p—g
r'(9)MKp)-r(
r'(q)^ • r"(g) 2|r'(g)| 2
from the L'Hospital rule. So we can compute (4.2) by fl-K f-2-K
(Kl(j))(x)=
k2(p,q)^(p,q,t)dq /0 Jo
(4.8)
for (x,y) = (r(p),r(q))
€ dD x dD with the kernel function (r'(q)x-(r{p)-r(q))
I
,
2\r>{q)f'
P = 1-
We conclude the above analysis as the following Theorem 4.1. For x = r(p) £ dD, the operator K | has the expression
J fort£(0,T\
/-27T
' o
with \ f\r{p)-r(q)\*
I
4*(t-r)>
i>(p,q,t) = '
eXP
k2(p,q)iP(p,q,t)dq
\r(p)-r(q)\2\^
(
(
(4.10)
4(*-r)
)
MlW'' p^q,
(4.H)
-4>{r{q),t), p = q, (. IT
while for x € D, 2v
(Kl»(a;) "Jo f
fTr'{q)^-{x-r{q)) Jo MT-r)> 4>(r(q),T)drdq.
( C
^{
\x-r(q)\2\ 4(T-r) J
(4.12)
Divide $[0,2\pi]$, $[0,T]$ by the grid points $s_i = i \times \pi/N_1$, $t_j = j \times T/(2N_2)$ for $i = 0,1,\dots,2N_1$, $j = 0,1,\dots,2N_2$. For the 2-dimensional domain $D$, we construct the grids under polar coordinates. Assume that $(0,0) \in D$. For $r(s) \in \partial D$, we divide $[0,|r(s)|]$ by $\rho_k(s) = k \times |r(s)|/(2N_3)$ for $k = 0,1,\dots,2N_3$ at each fixed $s$. For $x^{i,k} = \rho_k(s_i)(\cos s_i, \sin s_i) \in D$, by noticing $\phi(r(s_i),0) = 0$, the discrete unknowns in (4.4) are
1. $g(x^{i,k})$ for $i = 0,1,\dots,2N_1-1$, $k = 0,1,\dots,2N_3$;
2. $\phi(r(s_i),t_m)$ for $i = 0,1,\dots,2N_1-1$, $m = 1,\dots,2N_2$.
Since the regularizing parameter $\beta > 0$ should be small, the integrals in (4.4) must be computed with sufficient accuracy; otherwise the influence of $\beta$ will be covered by the truncation error. All integrals in (4.4) are essentially 1-dimensional, and we compute them numerically by the composite Simpson's rule. By lengthy but trivial computations, the discrete form of (4.4) based on Theorem 4.1 is
$$\beta g(x^{l,k}) + \sum_{i=0}^{2N_1-1}\sum_{\tilde{k}=0}^{2N_3} K_1(i,\tilde{k},l,k)\,g(x^{i,\tilde{k}}) + \sum_{i=0}^{2N_1-1}\sum_{m=1}^{2N_2} K_2(i,m,l,k)\,\phi(r(s_i),t_m) = f^T(x^{l,k}), \tag{4.13}$$
$$\phi(r(s_l),t_m) - \sum_{i=0}^{2N_1-1}\sum_{k=0}^{2N_3} D(i,k,l,m)\,g(x^{i,k}) - \sum_{i=0}^{2N_1-1}\sum_{\tilde{m}=1}^{m} C(\tilde{m},m,i,l)\,\phi(r(s_i),t_{\tilde{m}}) = 0 \tag{4.14}$$
for $l = 0,1,\dots,2N_1-1$, $k = 0,1,\dots,2N_3$, $m = 1,\dots,2N_2$. All the coefficients here can be written out explicitly ([10]), but we omit their concrete expressions. Introduce the $[4N_1(N_2+N_3)+2N_1]$-dimensional vector $\mathbf{x}$ and define
$$\mathbf{x}((2N_3+1)\times i + (k+1)) := g(x^{i,k}), \quad i = 0,\dots,2N_1-1,\ k = 0,\dots,2N_3,$$
$$\mathbf{x}(4N_1N_3 + 2N_1 + 2N_2\times i + m) := \phi(r(s_i),t_m), \quad i = 0,\dots,2N_1-1,\ m = 1,\dots,2N_2.$$
Then (4.13) and (4.14) constitute a system of linear equations with respect to the unknowns $\mathbf{x}(l)$ for index $l = 1,\dots,4N_1(N_2+N_3)+2N_1$.
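The ordering of the unknowns can be sketched in a few lines. The function names below are hypothetical; the index formulas are the ones stated above, with 1-based positions as in the text. The check confirms that the two families of unknowns fill the positions $1,\dots,4N_1(N_2+N_3)+2N_1$ exactly once.

```python
def g_index(i, k, N3):
    """1-based position of g(x^{i,k}), i = 0..2*N1-1, k = 0..2*N3."""
    return (2 * N3 + 1) * i + (k + 1)

def phi_index(i, m, N1, N2, N3):
    """1-based position of phi(r(s_i), t_m), i = 0..2*N1-1, m = 1..2*N2."""
    return 4 * N1 * N3 + 2 * N1 + 2 * N2 * i + m

def all_indices(N1, N2, N3):
    """Positions of every discrete unknown, g-block first, then phi-block."""
    idx = [g_index(i, k, N3)
           for i in range(2 * N1) for k in range(2 * N3 + 1)]
    idx += [phi_index(i, m, N1, N2, N3)
            for i in range(2 * N1) for m in range(1, 2 * N2 + 1)]
    return idx

def total_unknowns(N1, N2, N3):
    return 4 * N1 * (N2 + N3) + 2 * N1

N1 = N2 = N3 = 10  # the grid sizes used in Example 1
indices = all_indices(N1, N2, N3)
print(len(indices), total_unknowns(N1, N2, N3))
```

For $N_1 = N_2 = N_3 = 10$ this gives 820 unknowns, and the count grows like $4N_1(N_2+N_3)$ as the grids are refined.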
5
Numerical performance
In all our numerical tests, we generate the noisy data $u^m(\cdot,T)$ by
$$u^m(x,T) = [1 + \delta \times \mathrm{ran}(x)]\,u(x,T) \tag{5.1}$$
where $\mathrm{ran}(x)$ is a random number distributed in $[-1,1]$, and $u(x,T)$ is the exact final value obtained by solving our model problem. Here we consider a model problem for $x = r(\cos\theta,\sin\theta) \in D = B(0,R_0)$ with initial value independent of $\theta$. Then $u = u(r,t)$ satisfies
$$\begin{cases} u_t = \dfrac{\partial^2 u}{\partial r^2} + \dfrac{1}{r}\dfrac{\partial u}{\partial r}, & r \in (0,R_0),\ t > 0,\\ u(R_0,t) = 0,\ |u(0,t)| < \infty, & t > 0,\\ u(r,0) = u_0(r), & r \in [0,R_0], \end{cases} \tag{5.2}$$
where $u_0(R_0) = 0$. By a standard argument of variable separation, if we take $u_0(r) = J_0(\sqrt{\lambda_0}\,r)$ with $\sqrt{\lambda_0}R_0$ being a zero point of $J_0(x)$, where $J_0$ is the Bessel function of zero order, then the solution to (5.2) is
$$u(r,t) = e^{-\lambda_0 t}J_0(\sqrt{\lambda_0}\,r), \tag{5.3}$$
from which we can obtain the exact final value $u(r,T)$. We check our inversion method by comparing the inversion result with the exact initial temperature $J_0(\sqrt{\lambda_0}\,r)$.
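The test data described by (5.1) and (5.3) can be generated as follows. This is a minimal sketch under stated assumptions: $J_0$ is evaluated by its power series (adequate for the small arguments used here), the random factor is drawn uniformly from $[-1,1]$, which is one possible reading of "random number distributed in $[-1,1]$", and all function names are hypothetical.

```python
import math
import random

def bessel_j0(x, terms=40):
    """Power series J0(x) = sum_k (-1)^k (x^2/4)^k / (k!)^2."""
    total, term = 0.0, 1.0
    q = x * x / 4.0
    for k in range(terms):
        total += term
        term *= -q / ((k + 1) ** 2)
    return total

def exact_final_value(r, T, sqrt_lambda0):
    """u(r,T) = exp(-lambda0*T) * J0(sqrt(lambda0)*r), as in (5.3)."""
    return math.exp(-sqrt_lambda0 ** 2 * T) * bessel_j0(sqrt_lambda0 * r)

def noisy_final_value(r, T, sqrt_lambda0, delta, rng):
    """u^m(r,T) = [1 + delta*ran] * u(r,T), as in (5.1); ran assumed uniform."""
    return (1.0 + delta * rng.uniform(-1.0, 1.0)) * exact_final_value(r, T, sqrt_lambda0)

rng = random.Random(0)            # fixed seed, for reproducibility only
first_zero = 2.404825557695773    # first zero of J0
for r in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(r, exact_final_value(r, 0.001, first_zero),
          noisy_final_value(r, 0.001, first_zero, 0.05, rng))
```

Because the noise is multiplicative, the perturbation vanishes wherever the exact final value vanishes, in particular on the boundary $r = R_0$.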
Example 1: Choose the parameters
$$R_0 = 1,\quad \lambda_0 = 2.4048^2,\quad T = 0.001,\quad N_1 = N_2 = N_3 = 10,\quad \beta = 0.01,\quad \delta = 0,$$
where 2.4048 is the first zero point of $J_0(x)$. The numerical results are given in Figure 5.1. Although there are some oscillations, the inversion results are still satisfactory.
Figure 5.1 Inversion with Exact Input Data (panels: exact initial data; inversion initial data)
Example 2: Keep the other parameters of Example 1 unchanged and adjust $\lambda_0 = 8.654^2$, where 8.654 is the third zero point of $J_0(x)$. In this case, the initial data oscillate. The recovery results are shown in Figure 5.2. The oscillation property of the exact initial data is revealed clearly. We can see from these figures that the reconstruction results reveal the overall behavior. Considering the severe ill-posedness of this inverse problem, we think the results are still satisfactory. On the other hand, since we simulate the inversion input data by $e^{-\lambda_0 T}J_0(\sqrt{\lambda_0}\,r)$, regrettably we cannot choose $T$ sufficiently large in our numerical examples; otherwise $e^{-\lambda_0 T} \approx 0$ for $\sqrt{\lambda_0}$ being a zero point of $J_0(x)$. Of course, if we take other simulation input data with $u(x,T)$ not decaying so rapidly, we can consider large $T$ for $N_2$ large enough. However, the number of unknowns will increase rapidly.
Figure 5.2 Inversion with Exact Input Data (panels: exact initial data; inversion initial data)
Finally we consider noisy input data. For our second example we add 5% random noise to the input data (i.e., $\delta = 0.05$) and take $\beta = 0.001$. The results are given in Figure 5.3. We can also see some numerical oscillations, which are common for ill-posed problems. However, the main characteristics are still captured. On the other hand, the inversion method proposed in this paper can be used to generate a good initial guess for $u_0(x)$. Based on this initial guess, some iteration procedure can be applied efficiently to generate a more accurate recovery result for the backward heat problem. We can use Theorem 2.2 as the solver for the direct problem in the iteration procedure. The related work is in progress.
Figure 5.3 Inversion with Noisy Input Data (panels: exact initial data; inversion initial data)
In this paper, the backward heat conduction problem is considered. For this severely ill-posed problem, we propose to recover the initial temperature by the integral equation method. One of the advantages of this method is the relatively small amount of computation, with $O(4N_1(N_2+N_3))$ unknowns compared with the direct difference method with $O(8N_1N_2N_3)$ unknowns. In our method, the problem is essentially to solve a 2-D integral equation of the first kind. Therefore some regularization method is needed. However, to make the regularizing term really work, we should compute the integrals with sufficient accuracy. For our higher dimensional problem, this accuracy requirement leads to a rapid growth in the number of unknowns. For example, we should solve a linear system with about 3200 unknowns for $N_1 = N_2 = N_3 = 20$. So an efficient solver for the linear equations, as well as the relations between the regularizing parameter $\beta$ and $(T, \mathrm{size}(D), N_1, N_2, N_3, \delta)$, should be discussed, noticing that the generated coefficient matrix of the linear equations does not have any special property such as symmetry for a general domain $D$. On the other hand, the dependence of the numerical error on the input error level $\delta$ should also be analyzed. All these remaining tasks, which need a large amount of computation, are worth studying in the future.
Acknowledgments This work is supported by NSFC (No. 10371018).
References
[1] J.R.Cannon, The One-dimensional Heat Equation, Addison-Wesley, Menlo Park, CA, 1984.
[2] J.Cheng and M.Yamamoto, One new strategy for a-priori choice of regularizing parameters in Tikhonov regularization, Inverse Problems, Vol.16, No.4, L31-L38, 2000.
[3] L.Guo and D.A.Murio, A mollified space-marching finite-difference algorithm for the two-dimensional inverse heat conduction problem with slab symmetry, Inverse Problems, Vol.7, No.2, 247-259, 1991.
[4] P.Jonas and A.K.Louis, Approximate inverse for a one-dimensional inverse heat conduction problem, Inverse Problems, Vol.16, No.1, 175-185, 2000.
[5] V.Isakov, Inverse Problems of Partial Differential Equations, Springer-Verlag, 1998.
[6] R.Kress, Linear Integral Equations, 2nd ed., Springer-Verlag, Berlin, Heidelberg, 1999.
[7] D.Lesnic and L.Elliott, The decomposition approach to inverse heat conduction, J. Math. Anal. Appl., Vol.232, No.1, 82-98, 1999.
[8] J.J.Liu, Numerical solution of forward and backward problem for 2-D heat conduction problem, J. Comput. Appl. Maths., Vol.145, No.2, 459-482, 2002.
[9] J.J.Liu and D.J.Lou, On stability and regularization for backward heat equation, Chinese Annals of Mathematics, Ser.B, Vol.24, No.1, 35-44, 2003.
[10] J.J.Liu, Solving 2-dimensional backward data problem by potential method, submitted to J. Computational and Applied Mathematics.
[11] D.A.Murio, The Mollification Method and the Numerical Solution of Ill-Posed Problems, John Wiley & Sons, Inc., New York, 1993.
[12] A.Shidfar and A.Neisy, A two-dimensional inverse heat conduction problem for estimating heat flux, Far East J. Appl. Math., Vol.10, No.2, 145-150, 2003.
Careful Numerical Simulation and Analysis of Migration-Accumulation
Yirang Yuan, Ning Du
Institute of Mathematics, Shandong University, Jinan 250100, China.
Yuji Han
Physical Exploration Institute of Shengli Petroleum Administration, Dongying 257022, China.
Abstract This paper considers numerical simulation of careful parallel arithmetic of oil resources migration-accumulation. It puts forward careful parallel operator splitting-up implicit iterative scheme, parallel arithmetic program, parallel arithmetic information and alternating-direction mesh subdivision. Parallel arithmetic and analysis of different CPU combinations have been done. This numerical simulation test and the actual conditions are basically coincident. The convergence estimation of the model problem has successfully solved the difficult problem in the fields of permeation fluid mechanics, computational mathematic and petroleum geology.
1
Introduction
The oil formation in sediment basins, its displacement, transport and accumulation, and the final formation of oil deposits have been one of the key problems in the exploration of oil-gas resources. How has oil been accumulated in the present loop according to the mechanics of immiscible flow? How is oil distributed in basins? All these are what the numerical simulation of secondary-accumulation of oil resources mainly studies[l-4]. With the exploration of oilfields, efforts have been made to find covered and "potato piece" oil deposits, so basin simulation must be more and more careful and exact. Studies of the basin simulation, the secondary migration-accumulation in particular, with the help of traditional serial computers can hardly solve this problem. The fluid dynamics model of the secondary migration-accumulation has strong hyperbolic characteristics. Therefore, the numerical method
is very difficult in mathematics and mechanics [4-8]. Up to the present, there have been only a few preliminary numerical results for two-dimensional cross-section problems, and three-dimensional problems have not been touched yet [4,5]. For highly accurate and careful parallel numerical simulation of oil resources migration-accumulation (Tanhai region, three-layer), we put forward in this paper a careful parallel operator splitting-up implicit iterative scheme, a parallel arithmetic program, parallel information transmission and alternating-direction mesh subdivision. Making use of a high-performance minicomputer cluster composed of thirty-two CPUs, running the Turbolinux operating system with an MPI message-passing system, we have conducted parallel computation for the "quantitative simulation techniques of oil resources secondary migration". We have also carried out parallel computation and analysis for different CPU combinations. Our results agree with the actual situation, successfully realizing high-accuracy numerical simulation when the length of the simulation meshes is reduced from thousands of meters to hundreds of meters. The convergence estimation of the model problem has successfully solved a difficult problem in the fields of permeation fluid mechanics, computational mathematics and petroleum geology [6-8]. This paper discusses the numerical simulation of secondary migration-accumulation, the most difficult part of basin simulation, which is important in the rational evaluation of oil resources exploration and oil deposit location.
2
The mathematical model and numerical method
For the numerical simulation of secondary three-layer oil migration in porous media, the flow in the first, third and fifth layers is considered as horizontal and the one between them (the weak percolation layer) as vertical. After careful analysis of the model and the scientific numerical test, we propose a creative and rational numerical model. The mathematical model of three-layer migration-accumulation^6'11,12) is as follows: (first layer)
V • ( t f x ^ W . ) + Boq - (K2^)z=Hl Ho x = {x, y)T
V • (Kl—ViM
= -*s&
Ho oz e fii, t G J = (0, T],
+ Bwq - (K2 ——)z=Hl x
G
fix, teJ,
at
°£), dt (la)
= <M— - _ ) , (lb)
Figure 1 Three-layer sketch map
(second layer)
x = (x, y, z)T e O2, t e J, d
{K
krw d
Tz ^Tz^
K-fdiPo = $s(
(2a)
dtpw. } x 6 fi2
~aT - ^ T '
' ' G J'
(2b)
(third layer) T-7
/ Tr kro r-! 1 \ 1 n
1 I rs " ' r o &YO s
/ r ^ " T O Oip0.
V • (tf8 — W O + 5 0 , + ( # , — - 5 - ) ^ - ( * 4 — - ^ W ,
(3a)
=" * * ( ^ - ^ ) . • (#3
W W + Bwq + {K2 Hw
x=( 3 ! ,i,) r 6fi 1 ,tGJ ) ^—)z=H1 ~ {K4 -^—)z=H2 Hw oz fiw dz (3b)
(fourth layer) oz
Ho dz
at
at
x = (x, y, z)T e 0 3 , t£ J, -^-{Ki — ^ ) = *s(-soz nw oz at (fifth layer)
— , x e Q 3 , < € J, at
(4a) (4b)
V • (K5
Hw
x = (a;, y)T e Hi, t £ J,
(5a)
vipw) + Bwq + (K 4
— ) 2 = H 3 = $s(
fj,w oz
at
at
x e ill, teJ,
(5b)
where tp0 and ipw are the potential functions; kro and fcru) are the relative permeabilities for the oil and water phases, respectively; Ki(i = 1 ~ 5) are the absolute permeabilities in respective layers; /x0 and fiw are the viscosities for the oil and water phases; s = j 2 - , where s is the water concentration, and pc is the capillary pressure; B0 and Bw are the flow coefficients: B0 = ^ ( ^ + ^ L ) - 1 fi = fcueVkfl. + hu„.\-i. q(x, t) is the source (sink) function.
By Darcy law: — K^2-^-
=
=
qh, 0) — K^^dT Qh, w The initial conditions and boundary conditions are given. If the actual thickness of the carrying bed (first, third and fifth layers) is much smaller than the size of the horizontal simulation area, we propose the solution by reducing it to a two-dimensional problem in the following way. So it can also be called a quasi-three-dimensional problem. We put forward a kind of modified method of second order splitting-up implicit iterative scheme: x direction: As{AxoAxro)
+ Av(AywAyipW)
= Hl+i(EA0)(r0 Ax(AxwAxrw)
~ W)
- B™qm+1 + GVC - G < \
+ Ay(AywAy^)
= Hl+i(XAw)(rw
+ Grw ~ G % (6a)
- Ghfil + G^*
- $J>) - B™qm+1 - G C + GV-0m;
(6b)
y direction: Ax(AxoAxro) =
+ Ay(AyoAyi;il+V)
ffJ+1(E4,)ty#+1)
Ax(AxwAxrj = Hl+i{Y,Aw){^
+G ^
-
G
^+1)
- ro) - B™qm+1 + GiP™ - G C ,
+ Ay{AvwAy^+1))
(7a)
- G^+1) + G^+1)
- rw) ~ B™qm+1 - G C + G^,
(7b)
where Ax{AxAxip) = Ax,i+i/2,j{ipi+i,j -ipij) - AxA_i/2, j{ipi,j -ipi-i,j), Ax,i+i/2,j = (-^"rf Ax)i+i/2,j- Take the value of kr according to the partial upper reaches principle, and the other terms can be defined similarly. G = -Vp$s/At, Vp = AxAy, Hi+i is the iterative factor, ZJAW = Awi+i/2j + J4«J,J-I/2,J + • • • + AWtij, TiA0 = • • •.
As for the second and fourth (weak percolation) layers, their thicknesses H2 — Hi and H$ — H2 are very small, so they can be replaced by Darcy law and coupled with the first, third and fifth layers to replace differential eqs. (2) and (4). Here it must be noted that -{K^^)z=Hi and —(K~w--^-)z=Hi are Darcy velocities of oil phase and water phase, respectively. Since H2 — H\ and H$ — H2 are very small, they can be replaced by differential formulae:
\X0
UZ
fJ/0
A
1*0
Az = H2-Hu
_Rk
d^ ^ _ hKxfisL)x
(8a)
+ K2(^h}(^,
2
- ik, i)/Az,
Az = H2-Hi.
(8b)
Thus (8) is used to replace (2) in the second layer and coupled with the first and third layers. Similarly, the third and fifth layers are coupled by Darcy formula. The (I + 1) times iterative computational formula of s: ,(0 _
S^+1)=MS-jr^-)
sm
+ (l-oJl)sW,
(9)
PP-P2 where I is the iterative time, and 0 < wi < 1 is the mean factor. Suppose that we can find out {ip™+1, VC + 1 } at tm+1, the concentration of water can be attained by using the following formula: gm+1
3
=Sm
+
tym+1
_ ^ m _ ^m+1
+
^my
(1Q)
Parallel arithmetic and parallel program
Noting that the alternating-direction method has parallel arithmetic characteristics, we put forward the alternating-direction one-dimensional strip decomposition computation. This is our region decomposition method, which has been applied in our program design and has given desirable results. Now we introduce the parallel computation according to the strip region decomposition. Consider eq. (6), when $\psi_w$ and $\psi_o$ are known. On every x-direction line, eq. (6) is a second order block triangular equation. The unknown functions $\psi_o^*$ and $\psi_w^*$ can be derived from the speedup implicit method. Each strip line is independent of the others and can be dealt with in parallel. For instance, if there are fifty lines and each of them is assigned a processor, the parallel computation is performed by 50 processors. However, we do not have so many processors at present. If five are used, we must divide the region into five strips, and each group of ten lines is assigned to one processor. After eq. (6) has been dealt with, the computation is switched to eq. (7) in the y direction. At this moment, the functions $\psi_w$ and $\psi_o$ are unknown, with $\psi_o^*$ and $\psi_w^*$ being known. However, the known values are all stored along the x direction, in the horizontal strip held by each processor. Now, when $\psi_w$ and $\psi_o$ are computed by each processor in vertical strips in the y direction, $\psi_o^*$ and $\psi_w^*$ must be transmitted between the related processors. See Figure 2.
Figure 2 Decomposed computation of a stripped region
The region is divided into five parts, assigned to processors 0, 1, 2, 3 and 4, respectively. Take processor 2 as an example. It is placed in the center of the region. After the computation in the x direction is finished, strip a is sent to processor 0, strip b to processor 1, strip d to processor 3, and strip e to processor 4. At the same time, processor 2 accepts strip A from processor 0, strip B from processor 1, strip D from processor 3, and strip E from processor 4. In this way, $\psi_o^*$ and $\psi_w^*$ in each processor are prepared, and eq. (7) can be solved. Of course, after the computation in the y direction has been finished, the processors must exchange their information again to get ready for the computation in the x direction. Thus, one iterative computation is completed.
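The exchange between horizontal and vertical strips can be illustrated by a small single-process simulation. This sketch is not the authors' MPI program: plain Python lists stand in for the per-processor storage, the grid size and processor count are example values, and the function names are hypothetical.

```python
def horizontal_strips(grid, P):
    """Split the rows of a square grid into P equal horizontal strips."""
    b = len(grid) // P
    return [grid[p * b:(p + 1) * b] for p in range(P)]

def exchange_to_vertical(h_strips):
    """Simulate the exchange: processor q collects, from every processor p,
    the block of p's rows restricted to q's columns (p == q needs no transfer)."""
    P = len(h_strips)
    ncols = len(h_strips[0][0])
    w = ncols // P                      # columns per vertical strip
    v_strips = []
    for q in range(P):                  # receiving processor
        rows = []
        for p in range(P):              # sending processor
            for row in h_strips[p]:
                rows.append(row[q * w:(q + 1) * w])
        v_strips.append(rows)
    return v_strips

n, P = 10, 5                            # example sizes, for illustration only
grid = [[i * n + j for j in range(n)] for i in range(n)]
h = horizontal_strips(grid, P)          # layout after the x-direction sweep
v = exchange_to_vertical(h)             # layout needed for the y-direction sweep
messages = P * (P - 1)                  # blocks actually sent between processors
print(messages, v[2][0], v[2][-1])
```

With five processors each one sends four blocks and receives four, which matches the strips a, b, d, e and A, B, D, E described for processor 2. The number of messages grows quadratically with the number of processors, which is one reason communication time eventually dominates.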
4
Validity analysis of parallel arithmetic
We adopt the geological parameters of the Tanhai region. Simulation region: Tanhai region, earth coordinates (m) (20 611 700.00, 4 169 000.00) and (20 717 000.00, 4 253 000.00), horizontal area = 8 845.2 km². The simulation includes three layers, that is, the Sand third lower section, Sand third middle section and Sand third upper section. According to the structure of the Tanhai region, the Chengzikou-Qingyun ridge, Yihezhuang-Wudiningjin ridge, Chenjiazhuang-Binxian ridge and Qingtuozi-Kendong ridge are located from northwest to southeast. In between, horizontally located, are the Chengbei hollow, Huanghekou hollow, Bonan hollow, Gunan hollow and other oil-bearing hollows.
Example 1: In the x direction the mesh step length is 1 620 m, and there are 65 meshes; in the y direction the mesh step length is 1 680 m, and there are 50 meshes. So on the plane of each layer there are 3 250 meshes.
Example 2: Each mesh in Example 1 is further divided into four. Thus in the x direction the number of meshes is 130, and the step length is 810 m. In the y direction we have 100 meshes, and the step length of each is 840 m. Each layer has 13 000 meshes, while the three layers together have 39 000.
Simulation begins with the computation of the Dongying group, and continues through the sediment interruption of the upper and lower third systems, the Guantao group, the Minghuazhen group, and finally to the present fourth system, covering thirty million geological years. Thus a precise parallel numerical simulation has been completed. Table 1 shows that Example 1, starting with the hiatus, continues for 26 million years. We perform the computation with different combinations of CPUs; the relationship between the operation time of the computers and the combination of CPUs is as follows:
Table 1:
CPU number              1      3      6      9      12     18     24     30
time (h:m)              11:01  6:07   4:48   5:04   5:28   6:16   7:11   8:18
Sp acceleration rate    1.0    1.80   2.30   2.17   2.02   1.76   1.53   1.33
It can be seen from Table 1 that when six processors are used, the computation takes the least time and the acceleration rate ($S_p = T_1/T_p$, where $T_1$ is the time of serial computation and $T_p$ is the time of parallel computation) is the largest (2.30). The situations are similar when three and eighteen processors are used. Beyond six processors the computation time tends to grow with the number of processors. This is because, although parallel computation speeds up the arithmetic, the larger the number of processors, the more time is needed for information transmission between nodes.
Table 2:
CPU number                          1      3      6      9      18     30
Example 2 operation time (h:m)      69:28  26:12  18:59  16:28  15:26  20:34
Example 1 operation time (h:m)      6:13   3:23   2:35   2:34   3:05   4:59
Sp acceleration rate (Example 2)    1      2.65   3.66   4.22   4.50   3.38
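The acceleration rates in Tables 1 and 2 can be recomputed from the reported running times with $S_p = T_1/T_p$. The helper names below are hypothetical; the times are those listed in the tables.

```python
def to_minutes(hm):
    """Convert an 'h:m' string such as '11:01' to minutes."""
    hours, minutes = hm.split(":")
    return int(hours) * 60 + int(minutes)

def speedups(times):
    """Acceleration rate S_p = T_1 / T_p, with the first entry as serial time."""
    t1 = to_minutes(times[0])
    return [round(t1 / to_minutes(t), 2) for t in times]

table1_times = ["11:01", "6:07", "4:48", "5:04", "5:28", "6:16", "7:11", "8:18"]
table2_example2_times = ["69:28", "26:12", "18:59", "16:28", "15:26", "20:34"]

print(speedups(table1_times))
print(speedups(table2_example2_times))
```

The recomputed values agree with the tabulated rates, and confirm that the acceleration rates in Table 2 refer to the refined mesh of Example 2.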
The computation results show:
1) The number of meshes increases four times, but the computation time increases nonlinearly: the time increases about eight times when fewer than nine processors are used, and about five times when more than eighteen processors are used.
2) The least computation time is needed in the range from six to eighteen processors. This is because computation time increases faster than the time spent on information transmission.
3) Serial computation for the unrefined meshes takes 6:13 (373 min), and for the refined meshes it takes 69:28 (4 168 min). The number of meshes increases four times, while the computation time increases about eleven times. So it can be concluded that the finer the meshes, the longer the computation takes, and the more difficult it is to complete a careful simulation of large-scale migration-accumulation.
4) The acceleration for the large meshes is relatively small (2.30), while that for the refined meshes is larger (4.50).
Therefore, we can come to the following conclusions:
1) Parallel computation can increase computation efficiency for large meshes.
2) Parallel computation can enlarge the computation scale, making careful simulation of large-scale migration-accumulation practicable. For example, tens of thousands or millions of nodes cannot be computed with a single processor because of the extremely long computation time, but this can be done with parallel processors. When there are many mesh nodes, the mesh step length might be 500 m or even less than 100 m. Three-dimensional seismic interpretations should be fully used so as not to miss even a small trap when careful simulation of migration-accumulation is performed.
Figure 3 illustrates the oil concentration distribution in three layers (Sand third upper, Sand third middle and Sand third lower). The results are identical with the parallel computation. This proves that the computation procedure is correct and the results are reliable.
Figure 3a Sand third upper oil saturation isogram (Tanhai)
Figure 3b Sand third middle oil saturation isogram (Tanhai)
Figure 3c Sand third lower oil saturation isogram (Tanhai)
5
Numerical analysis of the model problem
As for the numerical method of secondary oil migration of the multilayer in porous media, for the sake of brevity we consider only one model problem, the nonstationary flow computation of mutilayer fluid dynamics in porous media, where the flow in the first and third layers is considered to be horizontal and the flow between them (the weak percolation layers) to be vertical. We have to find out the following convection-dominated diffusion coupling systems with initial-boundary value problem' 1 1 - 1 4 ^ $i(x, V)-Q7 + a(x, y, t) • Vu - V • (Ki(x, y)Wu) + K2(x, y, = Q1(x,y,t,u),
(x,y)T£Qlt
teJ=(0,T],
z)-^\z=i (11a)
Careful Numerical Simulation and • • • $2(x,y,z)-^®z{x, y)-^
= — (K2{x,y,z)-^),
{x,y,z)T
+ b{x, j , , t) • Vt> - V • (K3(x, y)W)
= Q3(x,y,t,v),
(x,y)T eQi,
e CI, t e J,
(lib)
- K2{x, y, z)-^
\x=0
teJ,
(lie)
where Q = {(x,y,z)\0 < x < 1, 0 < y < 1, 0 < z < 1}, fii = {(x,y) |0 < x < 1, 0 < y < 1} . For eq. (Ha) the characteristic finite difference fractional steps scheme is given by jjn+l/2
__ jjn
+ Q(*i>Vv*n>U%),
-Hw-H&Ww jjn+1 _
(12a)
jjn+l/2
ij
Kij
l
ij
= 5y(K16y(Un+1-Un))ij,
At
l<j
(12b)
where $i(x,y,h) = $i(x,y) + %$2(x,y,l). We interpret Un(X) as the piecewise biquadratic interpolation, where Ufj = Un(X\tij), X\^j = n+l
*•
For eq. (lib) the finite difference scheme: l3h
$2, ijk
ljk Af
= 6E(K26zWn+%k,
l
(13)
For eq. (He) the characteristic finite difference fractional steps scheme is given by yn+l/2 _ yn **,V
ij
ij
UK35xVn+1^)ij+Sy(K3SyVn)ij+
=
At
+K2,ij,1/26ZW%+1
+ Q(Xi,Vj,tn,V8),
l
(14a)
yn+1 _ yn+l/2 *3,i3-
ij
ij At
=6y(K3Sy(Vn+1-Vn))ij,
where we interpret Vn(X) n
where % = V (X2f tj),
l<j
(14b)
as the piecewise biquadratic interpolation, X2,
y
= Xtj - f£\t/*3,
ij.
Theorem Suppose that the exact solution of problem (11) satisfies the smooth condition. Adopt the modified characteristic finite difference fractional steps schemes (12)~(14). The following error estimate holds:
\\u - U\\laB(Jihl) + \\v - V\\loa(JM) + \\w - W\\LcB(Jihl) + \\dt(u-U)\\L2w2) + \\dt(v-V)\\L2W2) + \\dt(w-W)\\LW) < M{h2 + At},
^
where \\9k-V;X) K
SU
= '
P l l s l x . \\9nh^jiX)=
nAt
K
{E,\\9n\\xAt}i.
sup '
NAt
„
— n=U
References

[1] Dembicki H Jr., Secondary migration of oil experiments supporting efficient movement of separate, buoyant oil phase along limited conduits, AAPG Bull, 1989, 73(8): 1018-1021.
[2] Catalan L, An experimental study of secondary oil migration, AAPG Bull, 1992, 76(5): 638-650.
[3] Allen P A and Allen J R, Basin Analysis: Principles and Application, Beijing: Petroleum Press, 1995.
[4] Wang Jie and Guan Defan, The Model Study of Oil-Gas Migration-accumulation, Beijing: Petroleum Press, 1999.
[5] Zhang Hou-fu, Review and prospect of oil-gas migration, In: Zhang Hou-fu ed., Oil-Gas Migration Collected Works, Dongying, Shandong: Petroleum University Press, 1995, 3-6.
[6] Ewing R E, The Mathematics of Reservoir Simulation, Philadelphia: SIAM, 1983.
[7] Ungerer P, Burrus J, Doligez B, et al., A 2-D model of basin petroleum by two-phase fluid flow, application to some case studies, In: Doligez ed., Migration of Hydrocarbon in Sedimentary Basins, Paris: Editions Technip, 1987, 414-455.
[8] Ungerer P, Fluid flow, hydrocarbon generation, and migration, AAPG Bull, 1990, 74(3): 309-335.
[9] Welte D H and Yukler M A, Petroleum origin and accumulation in basin evolution: a quantitative model, AAPG Bull, 1981, 65(8): 1387-1396.
[10] Cha Ming, Secondary Hydrocarbon Migration and Accumulation, Beijing: Geology Press, 1997.
[11] Yuan Yirang, Zhao Wei-dong, Cheng Ai-jie and Han Yu-ji, Numerical simulation analysis for migration-accumulation of oil and water, Applied Mathematics and Mechanics, 1999, 4(20): 386-392.
[12] Yuan Yirang, Zhao Wei-dong, Cheng Ai-jie and Han Yu-ji, Simulation and applications of three-dimensional migration-accumulation of oil and water, Applied Mathematics and Mechanics, 1999, 9(20): 933-942.
[13] Yuan Yirang, The characteristic finite difference fractional steps method for compressible two-phase displacement problems, Science in China, Series A, 1999, 1(42): 48-57.
[14] Yuan Yirang, The upwind finite difference fractional steps method for combinatorial system of dynamics of fluids in porous media and its application, Science in China, Series A, 2002, 45(5): 578-593.
Error Analysis on Scrambled Quasi-Monte Carlo Quadrature Rules Using Sobol Points*

Rongxian Yue

Division of Scientific Computation, E-Institute of Shanghai Universities, and Department of Applied Mathematics, Shanghai Normal University, 100 Guilin Road, Shanghai 200234, China.
E-mail: [email protected].
Abstract

We study the worst-case error and random-case error of scrambled quasi-Monte Carlo quadrature using Sobol points. The function spaces considered in this article are the weighted Hilbert spaces $\mathcal{H}_s$ generated by Haar wavelets with weights $\beta_j > 0$ and a parameter $\eta > 0$ which reflects the smoothness of the spaces. Conditions are found under which multivariate integration using the scrambled Sobol points is strongly tractable in the worst-case and random-case settings, respectively. The worst-case results improve upon those of Wang (2003). The random-case results give weaker conditions for strong tractability than in the worst-case setting.
1 Introduction
Base $b$ scrambling quadrature, proposed by Owen (1995), is a hybrid of the Monte Carlo and quasi-Monte Carlo methods of integration, based on random permutations of the digits of the points in a net or a sequence. A scrambling maps every point $\mathbf{a} = (a_1, \ldots, a_s)$ in the $s$-dimensional unit cube $[0,1)^s$ into some other point $\mathbf{x} = (x_1, \ldots, x_s)$ by randomly scrambling the digits of $\mathbf{a}$. Suppose that $\{\mathbf{a}_i\}_{i=1}^{n}$ is a set of $n$ points in $[0,1)^s$. Write the components of $\mathbf{a}_i$ as $a_{ij} = \sum_{k=1}^{\infty} a_{ijk} b^{-k}$. For $i = 1, \ldots, n$, let $\mathbf{x}_i = (x_{i1}, \ldots, x_{is})$ with $x_{ij} = \sum_{k=1}^{\infty} x_{ijk} b^{-k}$, where $x_{ijk}$ is obtained by applying a random permutation to $a_{ijk}$. The $\mathbf{x}_i$'s satisfy the following rules:

*This work was partially supported by NSFC grant 10271078, and the Special Funds for Major Specialties of the Shanghai Education Committee.
(1) Each digit $x_{ijk}$ is uniformly distributed on the set $\{0, 1, \ldots, b-1\}$;
(2) For any two points $\mathbf{x}_i$ and $\mathbf{x}_{i'}$, the $s$ pairs $(x_{i1}, x_{i'1}), \ldots, (x_{is}, x_{i's})$ are mutually independent;
(3) If $a_{ij}$ and $a_{i'j}$ share the same first $k$ digits, but their $(k+1)$st digits are different, then
(a) $x_{ijh} = x_{i'jh}$ for $h = 1, \ldots, k$;
(b) the pair $(x_{ij,k+1}, x_{i'j,k+1})$ is uniformly distributed on the set $\{(d_i, d_{i'}) : d_i \ne d_{i'},\ d_i, d_{i'} \in \{0, 1, \ldots, b-1\}\}$; and
(c) $x_{ij,k+2}, x_{ij,k+3}, \ldots, x_{i'j,k+2}, x_{i'j,k+3}, \ldots$ are mutually independent.

We call this a base $b$ scrambling scheme and call the sequence $\{\mathbf{x}_i\}_{i=1}^{n}$ a scrambled version of $\{\mathbf{a}_i\}_{i=1}^{n}$.

The following geometrical description of this scheme may help to visualize the randomization. Begin by partitioning the unit cube $[0,1)^s$ along the $x_1$ axis into $b$ parallel $b$-boxes of the form $[\ell/b, (\ell+1)/b) \times [0,1)^{s-1}$ for $\ell = 0, \ldots, b-1$. Then randomly permute those $b$-boxes and replace them in one of the $b!$ possible orders, each such order having probability $1/b!$. Next take each such $b$-box in turn, partition it into $b$ congruent $b$-boxes of volume $b^{-2}$ along the $x_1$ axis and randomly permute those $b$-boxes. Then repeat this process on the $b^2$ $b$-boxes of volume $b^{-3}$, the $b^3$ $b$-boxes of volume $b^{-4}$, and so on. In practice this can stop when the $b$-boxes are narrow enough compared to the machine precision. The full scrambling involves applying the above operations along the other $s-1$ axes $x_2, \ldots, x_s$ as well. All of the many permutations used are to be statistically independent.

Two important properties of this randomization are: (i) a scrambled version of a point in the unit cube has the uniform distribution on the unit cube; (ii) a scrambled $(t, m, s)$-net or $(t, s)$-sequence in base $b$ is still a $(t, m, s)$-net or a $(t, s)$-sequence with probability one. For $(t, m, s)$-nets and $(t, s)$-sequences in base $b$ we refer to Niederreiter (1992). Therefore, this method provides an unbiased estimate of the integral over the unit cube $[0,1)^s$,

$$I_s(f) = \int_{[0,1)^s} f(\mathbf{x})\, d\mathbf{x},$$

by the quadrature rule of the form

$$Q_{n,s}(f) = \frac{1}{n} \sum_{i=1}^{n} f(\mathbf{x}_i).$$

This method can achieve the superior accuracy of quasi-Monte Carlo methods, while allowing estimation of the accuracy by replication as in Monte Carlo methods. Previous work has investigated the variance of the estimate (Owen (1997a, 1997b, 1998), Yue (1999), and Yue and Mao (1999)),
the root mean square discrepancies of the scrambled nets and $(t, s)$-sequences (Hickernell and Yue (2000)), and the error analysis of scrambled quasi-Monte Carlo quadrature rules in the worst-case, random-case, and average-case settings (Heinrich, Hickernell and Yue (2004)). Recently, the tractability problems of integration and approximation based on scrambled quasi-Monte Carlo algorithms have been considered in Yue and Hickernell (2001) for the Hilbert space spanned by the Haar wavelets. Conditions are found there under which integration and approximation are tractable. Unfortunately, the approach used there is non-constructive.

The main purpose of this article is to find conditions under which integration using scrambled Sobol points (Sobol, 1967) is strongly tractable in the worst-case and random-case settings, respectively. The class of integrands is the weighted Hilbert space spanned by multidimensional Haar wavelet series, $\mathcal{H}_s$, which will be described in the next section. The tractability problem for quasi-Monte Carlo quadrature in some other spaces has been studied by other authors, such as Hickernell and Wozniakowski (2000), Sloan and Wozniakowski (2001), and Wang (2002, 2003). We believe that it is interesting to study the tractability problem for integration using the scrambled Sobol points in the space $\mathcal{H}_s$.

We now define the tractability problems as follows. Let $\mathcal{H}_s$ be a Hilbert space of functions defined on $[0,1)^s$, whose reproducing kernel is $K_s(\mathbf{x}, \mathbf{y})$. Let $P_{sc,n} = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$ be a scrambled version of $P_n = \{\mathbf{a}_1, \ldots, \mathbf{a}_n\} \subset [0,1)^s$ according to the scrambling scheme of Owen (1995). The worst-case error of a scrambled QMC quadrature rule $Q_{n,s}$ over the unit ball of $\mathcal{H}_s$ is defined as

$$e^{w}(P_n, K_s) = \sqrt{E_{sc}\Big[ \sup_{f \in \mathcal{H}_s,\ \|f\| \le 1} |I_s(f) - Q_{n,s}(f, P_{sc,n})|^2 \Big]}, \tag{1.1}$$
where the expectation $E_{sc}$ is taken with respect to the scrambled sample points $\mathbf{x}_i$. For $n = 0$ we do not sample the function and the initial error is $e^{w}(0, K_s) = \|I_s\|$. We study, for any given $\varepsilon \in (0,1)$, the cost of integration

$$n^{w}(\varepsilon, K_s) = \min\{n : \exists P_n \text{ such that } e^{w}(P_n, K_s) \le \varepsilon\, e^{w}(0, K_s)\}.$$

If there exist non-negative $C(s)$ and $p$ such that

$$n^{w}(\varepsilon, K_s) \le C(s)\, \varepsilon^{-p}, \qquad \forall \varepsilon \in (0,1),\ \forall s \ge 1,$$

we then say that the multivariate integration in the space $\mathcal{H}_s$ is tractable. The infimum, $p^{w}$, of all possible $p$ is called the $\varepsilon$-exponent of tractability. Strong tractability means that $C(s)$ can be made independent of $s$.
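As a hedged illustration of how an error bound feeds into these definitions, suppose (purely as an assumption for this sketch) that there are constants $C > 0$ and $\alpha > 0$, both independent of $s$, and point sets $P_n$ with $e^{w}(P_n, K_s) \le C n^{-\alpha} e^{w}(0, K_s)$ for all $n \ge 1$. The symbols $C$ and $\alpha$ are introduced here only for illustration. Then the cost of integration can be bounded as follows:

```latex
% Any n with C n^{-\alpha} \le \varepsilon suffices, i.e. n \ge (C/\varepsilon)^{1/\alpha}.
\begin{aligned}
n^{w}(\varepsilon, K_s)
  &\le \big\lceil (C/\varepsilon)^{1/\alpha} \big\rceil
   \le (C/\varepsilon)^{1/\alpha} + 1 \\
  &\le \big(C^{1/\alpha} + 1\big)\, \varepsilon^{-1/\alpha},
   \qquad \varepsilon \in (0,1),
\end{aligned}
```

where the last step uses $\varepsilon^{-1/\alpha} > 1$. Under this assumption the problem is strongly tractable with $\varepsilon$-exponent at most $1/\alpha$, since the constant $C^{1/\alpha} + 1$ does not depend on $s$.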
Note that

$$[e^{w}(P_n, K_s)]^2 = E_{sc}[D(P_{sc,n}, K_s)]^2, \tag{1.2}$$

where $D(P, K_s)$ is the discrepancy of a point set $P$ based on the kernel $K_s$. For a given point set $P_n = \{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ the discrepancy has the following explicit form:

$$[D(P_n, K_s)]^2 = \int_{[0,1)^{2s}} K_s(\mathbf{x}, \mathbf{y})\, d\mathbf{x}\, d\mathbf{y} - \frac{2}{n} \sum_{i=1}^{n} \int_{[0,1)^s} K_s(\mathbf{a}_i, \mathbf{y})\, d\mathbf{y} + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{i'=1}^{n} K_s(\mathbf{a}_i, \mathbf{a}_{i'}).$$

For the tractability in the random-case setting over a space $\mathcal{H}_s$, we define the random-case error of the scrambled quasi-Monte Carlo quadrature as

$$e^{r}(P_n, K_s) = \sup_{f \in \mathcal{H}_s,\ \|f\| \le 1} \sqrt{E_{sc} |I_s(f) - Q_{n,s}(f, P_{sc,n})|^2}. \tag{1.3}$$

The cost of integration, $n^{r}(\varepsilon, K_s)$, and the $\varepsilon$-exponent, $p^{r}$, can be defined by analogy with those in the worst-case setting.

The main results of this article are Theorems 1 and 2, in which we give the conditions on strong tractability and the $\varepsilon$-exponents for the weighted Haar wavelet space $\mathcal{H}_s$. The worst-case results improve upon those of Wang (2003). The random-case results give weaker conditions for strong tractability than those in the worst-case setting.

In the next section we first define the weighted Hilbert space $\mathcal{H}_s$ and briefly review the definition of the Sobol sequence, and then derive the condition of strong tractability of integration in the worst-case setting. In Section 3 we derive the condition of strong tractability in the random-case setting over the space $\mathcal{H}_s$.
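The base $b$ scrambling scheme described at the start of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by the author: the digits are truncated after `ndigits` places (the text notes that the recursion can stop once the boxes are narrow compared with machine precision), and the names `make_scrambler` and `scramble_points` are hypothetical. The permutation applied to a digit depends on all preceding digits of the input, which is what makes points that share their first $k$ digits keep sharing them after scrambling.

```python
import random


def make_scrambler(b, ndigits, rng):
    """Return a function that scrambles the first `ndigits` base-b digits
    of one coordinate in [0, 1) by nested random digit permutations."""
    perms = {}  # maps a tuple of leading input digits to a permutation of 0..b-1

    def scramble(a):
        prefix = ()
        x = 0.0
        scale = 1.0 / b
        for _ in range(ndigits):
            a *= b
            digit = int(a)           # next base-b digit of the input
            a -= digit
            if prefix not in perms:  # draw the permutation for this box lazily
                perm = list(range(b))
                rng.shuffle(perm)
                perms[prefix] = perm
            x += perms[prefix][digit] * scale
            prefix += (digit,)
            scale /= b
        return x

    return scramble


def scramble_points(points, b=2, ndigits=16, seed=0):
    """Scramble every coordinate axis independently (one scrambler per axis)."""
    rng = random.Random(seed)
    s = len(points[0])
    scramblers = [make_scrambler(b, ndigits, rng) for _ in range(s)]
    return [tuple(sc(a_j) for sc, a_j in zip(scramblers, a)) for a in points]
```

For example, `scramble_points([(k / 8, ((3 * k) % 8) / 8) for k in range(8)], b=2, ndigits=3)` returns eight points whose coordinates are again the multiples of 1/8 in each axis, in a randomly rearranged nested order.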
2 Strong tractability in worst-case setting

2.1 The weighted Hilbert space of Haar wavelets

We first define the weighted Hilbert space $\mathcal{H}_s$ of integrands. Define a basis that is a tensor product of Haar wavelets. Let

$$\psi(x) = 2 \times 1_{\{\lfloor 2x \rfloor = 0\}} - 1_{\{\lfloor x \rfloor = 0\}},$$

which is called the mother wavelet. Here $1_{\{\cdot\}}$ denotes the characteristic function, and $\lfloor x \rfloor$ denotes the floor function of $x$, i.e. the greatest integer less than or equal to $x$. For integers $k \ge 0$ and $0 \le r < 2^k$ the dilated and translated versions of the mother wavelet are

$$\psi_{kr}(x) = 2^{k/2} \psi(2^k x - r) = 2^{k/2} \big( 2 \times 1_{\{\lfloor 2^{k+1} x \rfloor = 2r\}} - 1_{\{\lfloor 2^k x \rfloor = r\}} \big).$$
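For any subset $u \subseteq \{1, \ldots, s\}$, let $|u|$ denote the cardinality of $u$. For each $j \in u$ let $k_j$ and $r_j$ be integers satisfying $k_j \ge 0$ and $0 \le r_j < 2^{k_j}$. Let $\mathbf{k}$ denote the $|u|$-vector of the $k_j$, and let $\mathbf{r}$ be the $|u|$-vector of the $r_j$, $j \in u$. For any point $\mathbf{x} = (x_1, \ldots, x_s) \in [0,1)^s$ let

$$\psi_{u\mathbf{k}\mathbf{r}}(\mathbf{x}) = \prod_{j \in u} \psi_{k_j r_j}(x_j) = 2^{|\mathbf{k}|/2} \prod_{j \in u} \big( 2 \times 1_{\{\lfloor 2^{k_j+1} x_j \rfloor = 2 r_j\}} - 1_{\{\lfloor 2^{k_j} x_j \rfloor = r_j\}} \big),$$

where $|\mathbf{k}|$ denotes the sum of the $k_j$, $j \in u$, i.e., $|\mathbf{k}| = \sum_{j \in u} k_j$. For $u = \emptyset$ we take by convention $\psi_{u\mathbf{k}\mathbf{r}}(\mathbf{x}) = \psi_{\emptyset}(\mathbf{x}) = 1$. It is known that these wavelets form an orthonormal basis with respect to the $L^2$-norm on $[0,1)^s$. Let $\omega_{u\mathbf{k}}$ be positive weights satisfying the summability condition

$$\sum_{u, \mathbf{k}} 2^{|\mathbf{k}|} \omega_{u\mathbf{k}} < \infty. \tag{2.1}$$

Define the Hilbert space $\mathcal{H}_s$ as consisting of all wavelet series

$$f(\mathbf{x}) = \sum_{u, \mathbf{k}, \mathbf{r}} \hat f_{u\mathbf{k}\mathbf{r}}\, \psi_{u\mathbf{k}\mathbf{r}}(\mathbf{x}),$$

whose series coefficients, $\hat f_{u\mathbf{k}\mathbf{r}}$, satisfy the summability condition

$$\sum_{u, \mathbf{k}, \mathbf{r}} \omega_{u\mathbf{k}}^{-1} |\hat f_{u\mathbf{k}\mathbf{r}}|^2 < \infty. \tag{2.2}$$

The inner product for $\mathcal{H}_s$ is defined as

$$\langle f, g \rangle_{\mathcal{H}_s} = \sum_{u, \mathbf{k}, \mathbf{r}} \omega_{u\mathbf{k}}^{-1} \hat f_{u\mathbf{k}\mathbf{r}}\, \hat g_{u\mathbf{k}\mathbf{r}}.$$

Here the summation $\sum_{u, \mathbf{k}, \mathbf{r}}$ is taken over all subsets $u \subseteq \{1, \ldots, s\}$, all $|u|$-vectors $\mathbf{k}$, and all $|u|$-vectors $\mathbf{r}$ defined above. The summability condition on the series coefficients, (2.2), implies that the inner product is finite for all $f, g \in \mathcal{H}_s$. Condition (2.2) together with the summability condition (2.1) implies that the wavelet series is absolutely summable for any $\mathbf{x} \in [0,1)^s$. The reproducing kernel for this Hilbert space is (Hickernell and Yue, 2000):

$$K_{sc,s}(\mathbf{x}, \mathbf{y}) = \sum_{u, \mathbf{k}, \mathbf{r}} \omega_{u\mathbf{k}}\, \psi_{u\mathbf{k}\mathbf{r}}(\mathbf{x})\, \psi_{u\mathbf{k}\mathbf{r}}(\mathbf{y}). \tag{2.3}$$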
Summability condition (2.1) implies that this sum is well-defined. Reproducing kernels of the form (2.3) have the property that their values do not change under scrambling. Such kernels are defined to be scramble-invariant. In other words, if $\mathbf{x}_i$ and $\mathbf{x}_{i'}$ are the scrambled versions of two points $\mathbf{a}_i$ and $\mathbf{a}_{i'}$ in $[0,1)^s$, then $K_{sc,s}(\mathbf{x}_i, \mathbf{x}_{i'}) = K_{sc,s}(\mathbf{a}_i, \mathbf{a}_{i'})$ with probability one (Hickernell and Yue, 2000). It follows from (1.2) that for the space $\mathcal{H}_s$ with reproducing kernel $K_{sc,s}$ we have

$$e^{w}(P_n, K_{sc,s}) = D(P_n, K_{sc,s}). \tag{2.4}$$

Hereafter, it will be assumed that

$$0 < \omega_{u\mathbf{k}} \le \beta_u\, 2^{-(\eta+1)|\mathbf{k}|}, \qquad \beta_u = \prod_{j \in u} \beta_j, \tag{2.5}$$

where $\beta_1 \ge \beta_2 \ge \cdots > 0$ and $\eta > 0$. This form of $\omega_{u\mathbf{k}}$ automatically satisfies the summability condition (2.1).
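The Haar wavelets of this subsection are simple enough to evaluate directly, and the following Python sketch does so from the floor-function formulas above. It is an illustration only: the function names are hypothetical, the subset $u$ is represented by 0-based coordinate indices (an indexing assumption of this sketch), and the one-dimensional inner product is computed by a midpoint rule that is exact, up to rounding, for functions that are constant on dyadic cells of width $2^{-\text{levels}}$.

```python
from math import floor


def haar(k, r, x):
    """psi_{kr}(x) = 2^{k/2} * (2*1{floor(2^{k+1} x) = 2r} - 1{floor(2^k x) = r})."""
    in_left_half = floor(2 ** (k + 1) * x) == 2 * r
    in_support = floor(2 ** k * x) == r
    return 2 ** (k / 2) * (2 * in_left_half - in_support)


def haar_tensor(u, k, r, x):
    """psi_{ukr}(x): product of one-dimensional wavelets over the coordinates in u.
    The empty set u gives the constant function 1."""
    value = 1.0
    for j, kj, rj in zip(u, k, r):
        value *= haar(kj, rj, x[j])
    return value


def l2_inner_1d(f, g, levels=6):
    """Midpoint-rule approximation of the L2 inner product on [0, 1)."""
    n = 2 ** levels
    return sum(f((i + 0.5) / n) * g((i + 0.5) / n) for i in range(n)) / n
```

With these helpers, `l2_inner_1d(lambda x: haar(2, 1, x), lambda x: haar(2, 1, x))` evaluates the squared norm of $\psi_{2,1}$, and replacing one argument by a different wavelet gives a numerical check of orthogonality.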
2.2 Integration using scrambled Sobol points

In this subsection, we first briefly review the construction of the Sobol sequence, and then deal with the tractability problem of multivariate integration using scrambled Sobol points in the worst-case setting over the space $\mathcal{H}_s$.

The Sobol sequence, $\{\mathbf{a}_0, \mathbf{a}_1, \ldots\}$, is a $(t, s)$-sequence in base 2. By definition, for all integers $k \ge 0$ and $m > t$, the set of $2^m$ points, $\{\mathbf{a}_i \mid k 2^m \le i < (k+1) 2^m\}$, forms a $(t, m, s)$-net in base 2, i.e., every elementary interval of the form

$$\prod_{j=1}^{s} \Big[ \frac{r_j}{2^{k_j}}, \frac{r_j + 1}{2^{k_j}} \Big), \qquad k_j \ge 0,\ 0 \le r_j < 2^{k_j},$$

of volume $2^{t-m}$ contains exactly $2^t$ points of the net.

The generation of the Sobol sequence can be briefly described as follows (Sobol, 1967). List all primitive polynomials over the field $\mathbb{F}_2$ in a sequence arranged in the order of nondecreasing degrees. Let $p_1, p_2, \ldots, p_s$ be the first $s$ primitive polynomials. Each component of an $s$-dimensional Sobol sequence is based on one primitive polynomial. Let

$$p_j(x) = x^d + c_1 x^{d-1} + \cdots + c_{d-1} x + 1$$

be the $j$-th primitive polynomial of degree $d = \deg(p_j)$, where each $c_k$ is 0 or 1. We use its coefficients to define a sequence $\{m_\gamma\}$ by the recurrence

$$m_\gamma = 2 c_1 m_{\gamma-1} \oplus 2^2 c_2 m_{\gamma-2} \oplus \cdots \oplus 2^{d-1} c_{d-1} m_{\gamma-d+1} \oplus 2^d m_{\gamma-d} \oplus m_{\gamma-d}$$
for $\gamma = d+1, d+2, \ldots$, where $\oplus$ is the bit-by-bit exclusive-or operator. The initial values of $m_1, m_2, \ldots, m_d$ can be chosen freely provided that each $m_\gamma$ is odd and $m_\gamma < 2^\gamma$. Then we define the direction numbers by

$$v_\gamma = m_\gamma / 2^\gamma, \qquad \gamma = 1, 2, \ldots$$

The $j$-th component of the $s$-dimensional Sobol sequence, denoted by $V = \{\mathbf{a}_0, \mathbf{a}_1, \ldots\}$, is defined by

$$a_{ij} = i_1 v_1 \oplus i_2 v_2 \oplus i_3 v_3 \oplus \cdots, \qquad i = 0, 1, 2, \ldots,$$

where $\cdots i_3 i_2 i_1$ is the binary representation of $i$, and then

$$\mathbf{a}_i = (a_{i1}, a_{i2}, \ldots, a_{is}), \qquad i = 0, 1, 2, \ldots$$

Note that the $s$-dimensional Sobol sequence is a $(t, s)$-sequence in base 2, where the quality parameter $t$ is determined by the degrees of the primitive polynomials $p_j$:

$$t = \sum_{j=1}^{s} [\deg(p_j) - 1].$$

An important property of the Sobol sequence is as follows. For a nonempty subset $u \subseteq \{1, \ldots, s\}$, let $V_u$ be the projection of the $s$-dimensional Sobol sequence $V$ onto $[0,1)^u$. Thus $V_u$ is a $|u|$-dimensional Sobol sequence based on the primitive polynomials $p_j$ with indices $j \in u$. Therefore, $V_u$ is a $(t_u, |u|)$-sequence in base 2 with the quality parameter

$$t_u = \sum_{j \in u} [\deg(p_j) - 1]. \tag{2.6}$$
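The construction above can be turned into a short Python sketch. It is an illustration under stated assumptions rather than a reference generator: direction numbers are stored as integers $V_\gamma = v_\gamma 2^{\text{nbits}}$ so that the exclusive-or acts on machine integers, only indices $i < 2^{\text{nbits}}$ are supported, and the function names and the initial values in the usage note are chosen for this example only.

```python
def direction_integers(coeffs, m_init, nbits):
    """Direction numbers v_gamma = m_gamma / 2^gamma, returned as integers
    V_gamma = v_gamma * 2^nbits for gamma = 1..nbits.

    coeffs = [c_1, ..., c_{d-1}] are the inner coefficients of the primitive
    polynomial x^d + c_1 x^{d-1} + ... + c_{d-1} x + 1, and
    m_init = [m_1, ..., m_d] are odd initial values with m_gamma < 2^gamma.
    """
    d = len(m_init)
    assert len(coeffs) == d - 1
    m = list(m_init)
    for g in range(d, nbits):             # m[g] stores m_{g+1}
        new = (m[g - d] << d) ^ m[g - d]  # 2^d m_{gamma-d} XOR m_{gamma-d}
        for k, c in enumerate(coeffs, start=1):
            if c:
                new ^= m[g - k] << k      # 2^k c_k m_{gamma-k}
        m.append(new)
    return [m[g] << (nbits - 1 - g) for g in range(nbits)]


def sobol_coordinate(i, V):
    """XOR the direction integers selected by the binary digits ... i_3 i_2 i_1 of i."""
    x = 0
    g = 0
    while i:
        if i & 1:
            x ^= V[g]
        i >>= 1
        g += 1
    return x


def sobol_points(n, dims, nbits):
    """First n points (n <= 2^nbits); dims is a list of (coeffs, m_init) pairs,
    one pair per coordinate."""
    tables = [direction_integers(c, m0, nbits) for c, m0 in dims]
    scale = float(2 ** nbits)
    return [tuple(sobol_coordinate(i, V) / scale for V in tables) for i in range(n)]
```

For instance, `sobol_points(8, [([], [1]), ([0, 1], [1, 3, 7])], nbits=3)` uses the polynomials $x + 1$ and $x^3 + x + 1$ with illustrative initial values. Because every $m_\gamma$ is odd, the first $2^m$ values of each coordinate (with `nbits = m`) are a rearrangement of the multiples of $2^{-m}$.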
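We also note from Sobol (1969) that the degree of the primitive polynomials $p_j$ can be bounded by

$$\deg(p_j) \le \log_2 j + \log_2 \log_2 (j+1) + \log_2 \log_2 \log_2 (j+3) + \tau \tag{2.7}$$

for all integers $j \ge 1$, where $\tau$ is a constant independent of $j$ and $s$. Note from Hickernell and Yue (2000) that for the base 2 scrambling for the set $P_n = \{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$,

$$[e^{w}(P_n, K_{sc,s})]^2 = [D(P_n, K_{sc,s})]^2 = \frac{1}{n} \sum_{|u| > 0} \sum_{\mathbf{k}} \omega_{u\mathbf{k}}\, 2^{|\mathbf{k}|}\, \Gamma_{u\mathbf{k}}(P_n),$$

where the $\Gamma_{u\mathbf{k}}(P_n)$ are given by

$$\Gamma_{u\mathbf{k}}(P_n) = \frac{1}{n} \sum_{i=1}^{n} \sum_{i'=1}^{n} \prod_{j \in u} \Big( 2 \times 1_{\{\lfloor 2^{k_j+1} a_{ij} \rfloor = \lfloor 2^{k_j+1} a_{i'j} \rfloor\}} - 1_{\{\lfloor 2^{k_j} a_{ij} \rfloor = \lfloor 2^{k_j} a_{i'j} \rfloor\}} \Big) \tag{2.8}$$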
and are called gain coefficients under scrambling for the set $P_n$ (Owen, 1997a). The following lemma gives upper bounds on $\Gamma_{u\mathbf{k}}$ for the Sobol nets and sequences, which are special cases of the results given in Owen (1998) and Hickernell and Yue (2000).

Lemma 1. Let $P_n^{net}$ be a $(t, m, s)$-net drawn from the Sobol $(t, s)$-sequence. Then

$$\Gamma_{u\mathbf{k}}(P_n^{net}) \begin{cases} = 0, & |\mathbf{k}| \le m - t_u - |u|, \\ \le 3^{|u|}\, 2^{t_u}, & |\mathbf{k}| > m - t_u - |u|. \end{cases} \tag{2.9}$$

Let $P_n^{seq}$ be the set of the first $n$ points of the Sobol $(t, s)$-sequence. Then

$$\Gamma_{u\mathbf{k}}(P_n^{seq}) \le [\ldots]\; 3^{|u|+1}\, 2^{t_u+1}, \tag{2.10}$$

where the leading factor (a fraction with denominator $n$) is not legible in the source.

The following lemma gives an upper bound for the worst-case error of integration using the Sobol points over the space $\mathcal{H}_s$ with weights as in (2.5).

Lemma 2. Let $K_{sc,s}$ be a scramble-invariant kernel as described in (2.3) whose coefficients $\omega_{u\mathbf{k}}$ satisfy the upper bound (2.5). Define $\nu = 3 \cdot 2^{\eta} / \log 2$. Let $P_n^{net}$ be a $(t, m, s)$-net drawn from the Sobol $(t, s)$-sequence. Then for large $n = 2^m$,

$$[e^{w}(P_n^{net}, K_{sc,s})]^2 \le C_{w1}\, n^{-1-\eta} \sum_{u \ne \emptyset} \frac{\nu^{|u|-1}}{(|u|-1)!} (\log n)^{|u|-1}\, 2^{(1+\eta) t_u} \beta_u, \qquad \eta > 0, \tag{2.11}$$

where $C_{w1}$ is a constant independent of $u$ and $n$. Let $P_n^{seq}$ denote the set of the first $n$ points of the Sobol $(t, s)$-sequence. Then for large $n$, without requiring $n = 2^m$,

$$[e^{w}(P_n^{seq}, K_{sc,s})]^2 \le \begin{cases} C_{w2}\, n^{-1-\eta} \displaystyle\sum_{u \ne \emptyset} \frac{\nu^{|u|-1}}{(|u|-1)!} (\log n)^{|u|-1}\, 2^{(1+\eta) t_u} \beta_u, & 0 < \eta < 1, \\[2ex] C_{w2}\, n^{-2} \displaystyle\sum_{u \ne \emptyset} \frac{\nu^{|u|-1}}{(|u|-1)!} (\log n)^{|u|}\, 2^{2 t_u} \beta_u, & \eta = 1, \\[2ex] C_{w2}\, n^{-2} \displaystyle\sum_{u \ne \emptyset} \frac{\nu^{|u|-1}}{(|u|-1)!} (\log n)^{|u|-1}\, 2^{(1+\eta) t_u} \beta_u, & \eta > 1, \end{cases} \tag{2.12}$$

where the $C_{w2}$ are some constants independent of $u$ and $n$.
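The gain coefficients in (2.8) are easy to evaluate by brute force for a small point set, which gives a quick numerical check of the vanishing range stated in Lemma 1. The following Python sketch is an illustration only: the function name is hypothetical, the subset $u$ is given as 0-based coordinate indices, and the double loop costs $O(n^2)$, so it is meant for small $n$.

```python
from math import floor, prod


def gain_coefficient(points, u, k):
    """Gamma_{uk}(P_n) from (2.8); u holds 0-based coordinate indices and
    k[idx] is the level k_j belonging to coordinate u[idx]."""
    n = len(points)
    total = 0
    for a in points:
        for a2 in points:
            total += prod(
                2 * (floor(2 ** (kj + 1) * a[j]) == floor(2 ** (kj + 1) * a2[j]))
                - (floor(2 ** kj * a[j]) == floor(2 ** kj * a2[j]))
                for j, kj in zip(u, k)
            )
    return total / n


# The eight points k/8 form a (0, 3, 1)-net in base 2 (as a set, they are the
# first eight points of a one-dimensional base-2 digital sequence).
net = [(i / 8,) for i in range(8)]
gains = [gain_coefficient(net, [0], [k]) for k in range(5)]
```

For this net, $m = 3$, $t_u = 0$ and $|u| = 1$, so Lemma 1 says that the gain coefficient vanishes for $k \le 2$ and is at most $3$ otherwise; the list `gains` can be inspected to confirm this.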
Proof. By the property of the Sobol sequence, a $(t, m, s)$-net drawn from the Sobol sequence is a $((t_u), m, s)$-net in base $b = 2$ as defined in Hickernell and Yue (2000), where $(t_u)$ is a $(2^s - 1)$-dimensional vector and $t = \max_u t_u$. Therefore, the results (2.11) and (2.12) can be proved by arguments similar to those used in the proof of Theorem 14 in Hickernell and Yue (2000). □

Comparing results (2.11) and (2.12) in the lemma above, one can see that the superiority of a Sobol net over the first $n$ points of a Sobol sequence depends on the rate of decay of the coefficients $\omega_{u\mathbf{k}}$ that define the scramble-invariant kernel. The following theorem gives tractability conditions of integration for the space $\mathcal{H}_s$ when Sobol points are used in the scrambled quadrature rule.

Theorem 1. Let $K_{sc,s}$ be a scramble-invariant kernel as described in (2.3) whose coefficients $\omega_{u\mathbf{k}}$ satisfy the upper bound (2.5).

(i) Assume that $P_n^{net}$ is a Sobol $(t, m, s)$-net. If the $\beta_j$ in (2.5) satisfy

$$\sum_{j=1}^{\infty} \beta_j\, (j \log_2 j\, \log_2 \log_2 j)^{1+\eta} < \infty, \tag{2.13}$$

then for any $\delta > 0$, there exists a constant $C_w$ such that

$$e^{w}(P_n^{net}, K_{sc,s}) \le C_w\, n^{-(1+\eta)/2 + \delta}, \qquad \eta > 0.$$

Consequently, multivariate integration in $\mathcal{H}_s$ using the Sobol nets is strongly tractable in the worst-case setting.

(ii) Assume that $P_n^{seq}$ is the set of the first $n$ points of the Sobol $(t, s)$-sequence. If the $\beta_j$ in (2.5) satisfy

$$\sum_{j=1}^{\infty} \beta_j\, (j \log_2 j\, \log_2 \log_2 j)^{1+\min(\eta,1)} < \infty, \tag{2.14}$$

then for any $\delta > 0$, there exists a constant $C_w$ such that

$$e^{w}(P_n^{seq}, K_{sc,s}) \le C_w\, n^{-[1+\min(\eta,1)]/2 + \delta}, \qquad \eta > 0.$$

Consequently, multivariate integration in $\mathcal{H}_s$ using the Sobol sequences is strongly tractable in the worst-case setting.

Proof. For the Sobol $(t, m, s)$-net, combining (2.6) and (2.7) gives that

$$2^{t_u} \le 2^{(\tau-1)|u|} \prod_{j \in u} [j \log_2 (j+1) \log_2 \log_2 (j+3)].$$
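It follows from (2.11) that

$$[e^{w}(P_n^{net}, K_{sc,s})]^2 \le C_{w*}\, n^{-1-\eta} \sum_{|u| > 0} \beta_u \prod_{j \in u} \big\{ [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} \log n \big\},$$

where $C_{w*}$ is a constant independent of $u$ and $n$. If the $\beta_j$ satisfy condition (2.13), then for any fixed $\delta > 0$ there exists an integer $\ell > 0$ such that

$$\sum_{j=\ell+1}^{\infty} \beta_j\, [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} < \delta.$$

Define

$$C_{w**} = \min\Big\{ 1,\ \delta \Big/ \sum_{j=1}^{\ell} \beta_j\, [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} \Big\}, \qquad \alpha_j = \begin{cases} C_{w**}\, \beta_j, & j = 1, \ldots, \ell, \\ \beta_j, & j = \ell+1, \ell+2, \ldots \end{cases}$$

Since $C_{w**} \le 1$ and at most $\ell$ factors of $\alpha_u$ carry the constant $C_{w**}$, we have $\alpha_u = \prod_{j \in u} \alpha_j \ge C_{w**}^{\ell}\, \beta_u$, and

$$\sum_{j=1}^{\infty} \alpha_j\, [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} = C_{w**} \sum_{j=1}^{\ell} \beta_j\, [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} + \sum_{j=\ell+1}^{\infty} \beta_j\, [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} < 2\delta.$$

Put $C_w = \sqrt{C_{w*}\, C_{w**}^{-\ell}}$. It follows that

$$\begin{aligned} [e^{w}(P_n^{net}, K_{sc,s})]^2 &\le C_w^2\, n^{-1-\eta} \sum_{|u| > 0} \alpha_u \prod_{j \in u} \big\{ [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} \log n \big\} \\ &\le C_w^2\, n^{-1-\eta} \prod_{j=1}^{s} \big\{ 1 + \alpha_j\, [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} \log n \big\} \\ &\le C_w^2\, n^{-1-\eta} \exp\Big\{ \sum_{j=1}^{\infty} \alpha_j\, [j \log_2 (j+1) \log_2 \log_2 (j+3)]^{1+\eta} \log n \Big\} \\ &\le C_w^2\, n^{-1-\eta} \exp\{ 2\delta \log n \} = C_w^2\, n^{-1-\eta+2\delta}. \end{aligned}$$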
This completes the proof of (i) under condition (2.13). The proof for (ii) under condition (2.14) is similar to the above proof.
□

3 Tractability in random-case setting

In this section we consider the tractability problems in the random-case setting for the Hilbert space $\mathcal{H}_s$ with reproducing kernel $K_{sc,s}$ defined in (2.3). Note from Heinrich, Hickernell and Yue (2004) that the random-case error $e^{r}(P_n, K_{sc,s})$ can be expressed as

$$e^{r}(P_n, K_{sc,s}) = \sqrt{\max_{u, \mathbf{k}} \{ n^{-1} \omega_{u\mathbf{k}}\, \Gamma_{u\mathbf{k}}(P_n) \}},$$

where $\Gamma_{u\mathbf{k}}$ is the gain coefficient defined by (2.8). The following theorem gives tractability conditions on integration in the random-case setting.

Theorem 2. Suppose that $K_{sc,s}$ is a scramble-invariant kernel defined by (2.3) with coefficients $\omega_{u\mathbf{k}}$ satisfying condition (2.5).

(i) Assume that the quadrature employs the scrambled Sobol nets and $P_n^{net}$ is a Sobol $(t, m, s)$-net. If the $\beta_j$ in (2.5) satisfy

$$\sum_{j=1}^{\infty} \log\big[ \mu_1 \beta_j\, (j \log_2 j\, \log_2 \log_2 j)^{2+\eta} \big] < \infty, \tag{3.1}$$

where $\mu_1 = 24 \cdot 4^{\eta}$, then there exists a constant $C_{r1}$ such that

$$e^{r}(P_n^{net}, K_{sc,s}) \le C_{r1}\, n^{-(2+\eta)/2}, \qquad \eta > 0.$$

Consequently, multivariate integration in $\mathcal{H}_s$ is strongly tractable in the random-case setting.

(ii) Assume that the quadrature employs the first $n$ points of the scrambled Sobol $(t, s)$-sequence and $P_n^{seq}$ is the set of the first $n$ points of the Sobol $(t, s)$-sequence. If the $\beta_j$ in (2.5) satisfy

$$\sum_{j=1}^{\infty} \log\big[ \mu_2 \beta_j\, (j \log_2 j\, \log_2 \log_2 j)^{1+\eta} \big] < \infty, \tag{3.2}$$

where $\mu_2 = 12 \cdot 4^{\eta}$, then there exists a constant $C_{r2}$ such that

$$e^{r}(P_n^{seq}, K_{sc,s}) \le C_{r2}\, n^{-1}, \qquad \eta > 0.$$

Consequently, multivariate integration in $\mathcal{H}_s$ is strongly tractable in the random-case setting.
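Proof. For the Sobol $(t, m, s)$-net $P_n^{net}$, making use of (2.5), (2.9) and (2.6), we have

$$\begin{aligned} \max_{u, \mathbf{k}} \{ n^{-1} \omega_{u\mathbf{k}}\, \Gamma_{u\mathbf{k}} \} &\le \max_{u} \max_{|\mathbf{k}| \ge m - t_u - |u| + 1} \big\{ n^{-1} \beta_u\, 3^{|u|}\, 2^{-(1+\eta)|\mathbf{k}| + t_u} \big\} \\ &\le n^{-(2+\eta)}\, 2^{-1-\eta} \max_{u} \big\{ \beta_u\, (3 \cdot 2^{1+\eta})^{|u|}\, 2^{(2+\eta) t_u} \big\} \\ &\le n^{-(2+\eta)}\, 2^{-1-\eta} \max_{u} \Big\{ \prod_{j \in u} \big[ \mu_1 \beta_j\, (j \log_2 j\, \log_2 \log_2 (j+3))^{2+\eta} \big] \Big\}, \end{aligned}$$

where $\mu_1 = 24 \cdot 4^{\eta}$. If condition (3.1) holds then we have

$$\prod_{j=1}^{\infty} \big[ \mu_1 \beta_j\, (j \log_2 j\, \log_2 \log_2 (j+3))^{2+\eta} \big] < \infty.$$

It follows that there exists some constant $C_{r*}$ such that

$$\prod_{j \in u} \big[ \mu_1 \beta_j\, (j \log_2 j\, \log_2 \log_2 (j+3))^{2+\eta} \big] \le C_{r*}, \qquad \forall u.$$

Put $C_{r1} = \sqrt{C_{r*}\, 2^{-1-\eta}}$. Therefore, we have

$$e^{r}(P_n^{net}, K_{sc,s}) \le C_{r1}\, n^{-(2+\eta)/2}.$$

This completes the proof under condition (3.1).

For the set $P_n^{seq}$ of the first $n$ points of the Sobol $(t, s)$-sequence, by making use of (2.5), (2.10) and (2.6), we have

$$\max_{u, \mathbf{k}} \{ n^{-1} \omega_{u\mathbf{k}}\, \Gamma_{u\mathbf{k}} \} \le 3 n^{-2} \max_{u} \big\{ \beta_u\, (3 \cdot 2^{1+\eta})^{|u|}\, 2^{(1+\eta) t_u} \big\} \le 3 n^{-2} \max_{u} \Big\{ \prod_{j \in u} \big[ \mu_2 \beta_j\, (j \log_2 j\, \log_2 \log_2 (j+3))^{1+\eta} \big] \Big\},$$

where $\mu_2 = 12 \cdot 4^{\eta}$. If condition (3.2) holds then we have

$$\prod_{j=1}^{\infty} \big[ \mu_2 \beta_j\, (j \log_2 j\, \log_2 \log_2 (j+3))^{1+\eta} \big] < \infty.$$

It follows that there exists some constant $C_{r**}$ such that

$$\prod_{j \in u} \big[ \mu_2 \beta_j\, (j \log_2 j\, \log_2 \log_2 (j+3))^{1+\eta} \big] \le C_{r**}, \qquad \forall u.$$

Put $C_{r2} = \sqrt{3 C_{r**}}$, and then

$$e^{r}(P_n^{seq}, K_{sc,s}) \le C_{r2}\, n^{-1}.$$

This completes the proof under condition (3.2). □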
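The exponent bookkeeping behind the second inequality of the net case in the proof of Theorem 2 can be spelled out as follows. This is a sketch that uses only $n = 2^m$, the weight bound (2.5) and Lemma 1: since $2^{-(1+\eta)|\mathbf{k}|}$ decreases in $|\mathbf{k}|$, the maximum over $|\mathbf{k}| \ge m - t_u - |u| + 1$ is attained at the smallest admissible value.

```latex
\begin{aligned}
n^{-1} \beta_u\, 3^{|u|}\, 2^{t_u}\, 2^{-(1+\eta)(m - t_u - |u| + 1)}
 &= n^{-1}\, 2^{-(1+\eta) m}\, 2^{-(1+\eta)}\, \beta_u\, 3^{|u|}\,
    2^{(1+\eta)|u|}\, 2^{t_u + (1+\eta) t_u} \\
 &= n^{-(2+\eta)}\, 2^{-1-\eta}\, \beta_u\, (3 \cdot 2^{1+\eta})^{|u|}\,
    2^{(2+\eta) t_u}.
\end{aligned}
```

The second line uses $2^{-(1+\eta) m} = n^{-(1+\eta)}$, which is where the rate $n^{-(2+\eta)}$ for the squared random-case error of a net comes from.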
References

[1] Heinrich, S., Hickernell, F. J. and Yue, R. X., Optimal quadrature for Haar wavelet spaces, Math. Comp., 73(2004), 259-277.
[2] Hickernell, F. J. and Wozniakowski, H., Integration and approximation in arbitrary dimensions, Adv. Comput. Math., 12(2000), 25-58.
[3] Hickernell, F. J. and Yue, R. X., The mean square discrepancy of scrambled (t, s)-sequences, SIAM J. Numer. Anal., 38(2000), 1089-1112.
[4] Niederreiter, H., Low-discrepancy and low-dispersion sequences, J. Number Theory, 30(1988), 51-70.
[5] Niederreiter, H., Random Number Generation and Quasi-Monte Carlo Methods, SIAM, Philadelphia, 1992.
[6] Owen, A. B., Randomly permuted (t, m, s)-nets and (t, s)-sequences, In: Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing (H. Niederreiter and P. J. S. Shiue, eds.), Lecture Notes in Statistics, Vol. 106, Springer-Verlag, Berlin, 1995, 299-317.
[7] Owen, A. B., Monte Carlo variance of scrambled equidistribution quadrature, SIAM J. Numer. Anal., 34(1997a), 1884-1910.
[8] Owen, A. B., Scrambled net variance for integrals of smooth functions, Ann. Statist., 25(1997b), 1541-1562.
[9] Owen, A. B., Scrambled Sobol and Niederreiter-Xing points, J. Complexity, 14(1998), 466-489.
[10] Sloan, I. H. and Wozniakowski, H., Tractability of multivariate integration for weighted Korobov classes, J. Complexity, 17(2001), 697-721.
[11] Sobol', I. M., The distribution of points in a cube and the accurate evaluation of integrals, Zh. Vychisl. Mat. i Mat. Phys., 7(1967), 784-802 (Russian).
[12] Sobol', I. M., Multidimensional Quadrature Formula and Haar Functions, Izdat. Nauka, Moscow, 1969 (Russian).
[13] Wang, X., A constructive approach to strong tractability using quasi-Monte Carlo algorithms, J. Complexity, 18(2002), 683-701.
[14] Wang, X., Strong tractability of multivariate integration using quasi-Monte Carlo algorithms, Math. Comp., 72(2003), 823-838.
[15] Yue, R. X. and Hickernell, F. J., Integration and approximation based on scrambled sampling in arbitrary dimensions, J. Complexity, 17(2001), 897.
[16] Yue, R. X., Variance of quadrature over scrambled unions of nets, Statistica Sinica, 9(1999), 451-473.
[17] Yue, R. X. and Mao, S. S., On the variance of quadrature over scrambled nets and sequences, Stat. and Prob. Letters, 44(1999), 267-280.
Frontiers and Prospects of Contemporary Applied Mathematics
Series in Contemporary Applied Mathematics CAM 6
This collection of articles covers the hottest topics in contemporary applied mathematics. Multiscale modeling, material computing, symplectic methods, parallel computing, mathematical biology, applied differential equations and engineering computing problems are all included. The book contains the latest results of many leading scientists and provides a window on new trends in research in the field.
Higher Education Press www.hep.com.cn
World Scientific www.worldscientific.com