Lecture Notes in Computational Science and Engineering
Editors
M. Griebel, Bonn
D. E. Keyes, Norfolk
R. M. Nieminen, Espoo
D. Roose, Leuven
T. Schlick, New York
12
Ursula van Rienen
Numerical Methods in Computational Electrodynamics
Linear Systems in Practical Applications
With 173 Figures, 65 in Colour
Springer
Ursula van Rienen
Fachbereich Elektrotechnik und Informationstechnik
Universität Rostock
18051 Rostock, Germany
e-mail:
[email protected]
Cataloging-in-Publication Data applied for
Die Deutsche Bibliothek - CIP-Einheitsaufnahme
Rienen, Ursula van: Numerical methods in computational electrodynamics: linear systems in practical applications / Ursula van Rienen. - Berlin; Heidelberg; New York; Barcelona; Hong Kong; London; Milan; Paris; Singapore; Tokyo: Springer, 2001 (Lecture notes in computational science and engineering; 12) ISBN 3-540-67629-5
Front cover: High-voltage engineering. Epoxy resin specimen with layer of water drops on the surface. Shown is a vector representation of the electro-quasistatic field.
Mathematics Subject Classification (2000): 65C20, 65F05, 65F10, 65F50, 65N06, 65N12, 65N22, 65N25, 65N50, 65N55, 78-02, 78-04, 78-08, 78A25, 78A30, 78A35, 78A40, 78A45
ISSN 1439-7358
ISBN 3-540-67629-5 Springer-Verlag Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.
Springer-Verlag Berlin Heidelberg New York
a member of BertelsmannSpringer Science+Business Media GmbH
© Springer-Verlag Berlin Heidelberg 2001
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
Cover Design: Friedhelm Steinen-Broo, Estudio Calamar, Spain
Cover production: design & production GmbH, Heidelberg
Typeset by the author using a Springer TeX macro package
Printed on acid-free paper
SPIN 10653083
46/3142/LK - 5 43210
To my children
Jan and Viola
Contents

Acknowledgements
Overview
Introduction

1. Classical Electrodynamics
  1.1 Maxwell's Equations
  1.2 Energy Flow and Processes of Thermal Conduction
    1.2.1 Energy and Power of Electromagnetic Fields
    1.2.2 Thermal Effects
  1.3 Classification of Electromagnetic Fields
    1.3.1 Stationary Fields
    1.3.2 Quasistatic Fields
    1.3.3 General Time-Dependent Fields and Electromagnetic Waves
    1.3.4 Overview and Solution Methods
  1.4 Analytical Solution Methods
    1.4.1 Potential Theory
    1.4.2 Decoupling by Differentiation
    1.4.3 Method of Separation
  1.5 Boundary Value Problems
    1.5.1 Boundary Value Problems of the Potential Theory
    1.5.2 Further Boundary Conditions
    1.5.3 Complete Systems of Orthogonal Functions
  1.6 Bibliographical Comments

2. Numerical Field Theory
  2.1 Mode Matching Technique
    2.1.1 Mathematical Treatment of the Field Problem
    2.1.2 Scattering Matrix Formulation
    2.1.3 Standing Waves and Traveling Waves
    2.1.4 Convergence and Error Investigations
  2.2 Finite Element Method
    2.2.1 General Outline of the Finite Element Approach
    2.2.2 Weighted Residual Method; Galerkin Approach
    2.2.3 Duality Methods
    2.2.4 Finite Element Discretizations of Maxwell's Equations
    2.2.5 Synthesis Between FEM with Whitney Forms and Finite Integration Technique
  2.3 Finite Integration Technique
    2.3.1 FIT Discretization of Maxwell's Equations
    2.3.2 Stationary Fields
    2.3.3 Quasistatic Fields
    2.3.4 General Time-Dependent Fields and Electromagnetic Waves
  2.4 Resulting Linear Systems
    2.4.1 Special Properties of Complex Matrices
    2.4.2 Mode Matching Technique
    2.4.3 Finite Integration Technique
  2.5 Bibliographical Comments

3. Numerical Treatment of Linear Systems
  3.1 Direct Solution Methods
    3.1.1 LU-decomposition; Gaussian Elimination
  3.2 Classical Iteration Methods
    3.2.1 Practical Use of Iterative Methods: Stopping Criteria
    3.2.2 Gauss-Seidel and SOR
    3.2.3 SGS and SSOR Algorithms
    3.2.4 The Kaczmarz Algorithm
  3.3 Chebyshev Iteration
  3.4 Krylov Subspace Methods
    3.4.1 The CG Algorithm
    3.4.2 Algorithms of Lanczos Type
    3.4.3 Look-Ahead Lanczos Algorithm
    3.4.4 CG Variants for Non-Hermitian or Indefinite Systems
  3.5 Minimal Residual Algorithms and Hybrid Algorithms
    3.5.1 GMRES Algorithm (Generalized Minimal Residual)
    3.5.2 Hybrid Methods
    3.5.3 GCG-LS(s) Algorithm (Generalized Conjugate Gradient, Least Square)
    3.5.4 Overview of BiCG-like Solvers
  3.6 Multigrid Techniques
    3.6.1 Smoothing and Local Fourier Analysis
    3.6.2 The Two-Grid Method
    3.6.3 The Multigrid Technique
    3.6.4 Embedding of the Multigrid Method into a Problem Solving Environment
  3.7 Special MG-Algorithm for Non-Hermitian Indefinite System
    3.7.1 Peculiarities of the Special Problem and Corresponding Measures
    3.7.2 The Multigrid Algorithm; Properties of the Linear System and its Solution
    3.7.3 Grid Transfers for Vector Fields
    3.7.4 The Relaxation
    3.7.5 The Choice of the Cycles in the FMG Approach
    3.7.6 The Solution Method on the Coarsest Grid
    3.7.7 Concluding Remarks on the Multigrid Algorithm and Possible Outlook
  3.8 Preconditioning
    3.8.1 Incomplete LU Decompositions
    3.8.2 Iteration Methods
    3.8.3 Polynomial Preconditioning
    3.8.4 Multigrid Methods
  3.9 Real-Valued Iteration Methods for Complex Systems
    3.9.1 Axelsson's Reduction of a Complex Linear System to Real Form
    3.9.2 Efficient Preconditioning of the C-to-R Method
    3.9.3 C-to-R Method and Electro-Quasistatics
  3.10 Convergence Studies for Selected Solution Methods
    3.10.1 Real Symmetric Positive Definite Matrices
    3.10.2 Complex Symmetric Positive Stable Matrices
    3.10.3 Complex Indefinite Matrices
  3.11 Bibliographical Comments

4. Applications from Electrical Engineering
  4.1 Electrostatics
    4.1.1 Plug
  4.2 Magnetostatics
    4.2.1 C-Magnet
    4.2.2 Current Sensor
    4.2.3 Velocity Sensor
    4.2.4 Nonlinear C-magnet
  4.3 Stationary Currents; Coupled Problems
    4.3.1 Hall Element
    4.3.2 Semiconductor
    4.3.3 Circuit Breaker
  4.4 Stationary Heat Conduction; Coupled Problems
    4.4.1 Temperature Distribution on a Board
  4.5 Electro-Quasistatics
    4.5.1 High Voltage Insulators with Contaminations
    4.5.2 Surface Contaminations
    4.5.3 Fields on High Voltage Insulators
    4.5.4 Outlook
  4.6 Magneto-Quasistatics
    4.6.1 TEAM Benchmark Problem
  4.7 Time-Harmonic Problems
    4.7.1 3 dB Waveguide Coupler
    4.7.2 Microchip
  4.8 General Time-Dependent Problems
  4.9 Bibliographical Comments

5. Applications from Accelerator Physics
  5.1 Acceleration of Elementary Particles
  5.2 Linear Colliders
    5.2.1 Actual Linear Collider Studies
    5.2.2 Acceleration in Linear Colliders
    5.2.3 The S-Band Linear Collider Study
  5.3 Beam Dynamics in a Linear Collider
    5.3.1 Emittance
    5.3.2 Wake Fields and Wake Potential
    5.3.3 Single Bunch and Multibunch Instabilities
  5.4 Numerical Analysis of Higher Order Modes
    5.4.1 Computation of the First Dipole Band of the S-Band Structure with 30 Homogeneous Sections
    5.4.2 Developments That Followed the ORTHO Studies
    5.4.3 Geometry and Convergence Studies of Trapped Modes
    5.4.4 Comparison with the Coupled Oscillator Model COM
    5.4.5 Comparison with Measurements for the LINAC II Structure at DESY
  5.5 36-Cell Experiment on Higher Order Modes
    5.5.1 Design
    5.5.2 Numerical Results for the First Dipole Band
    5.5.3 Measurement Methods
    5.5.4 Bead Pull Measurements
    5.5.5 Comparison of Measurement and Simulation
    5.5.6 Measurement with Local Damping
    5.5.7 Comments and Outlook
    5.5.8 Suppression of Parasitic Modes
    5.5.9 Design of the Damped SBLC Structure
    5.5.10 Concluding Remarks about the Linear Collider Studies
  5.6 Coupled Temperature Problems
    5.6.1 Inductive Soldering of a Traveling Wave Tube
    5.6.2 Temperature Distribution in Accelerating Structures
    5.6.3 RF-Window
    5.6.4 Waveguide with a Load
  5.7 Bibliographical Comments

Summary
References
Symbols
Index
Acknowledgements
The author wishes to thank all who contributed to the successful completion of this book¹. The author is particularly grateful to:

- Prof. Dr.-Ing. Thomas Weiland for his encouragement to undertake this venture, for many fruitful discussions and ideas, and for the stimulating working environment and atmosphere.
- Prof. Dr. Peter Rentrop for his strong interest in the project and for taking care of the second referee's report on the postdoctoral thesis.
- Prof. Dr. Willi Törnig for his encouragement to start a postdoctoral thesis.
- My colleagues of the study group Theory of Electromagnetic Fields at the Darmstadt University of Technology for good cooperation. My thanks go especially to Michael Bartsch, Dr. Ulrich Becker, Dr. Markus Clemens, Dr. Micha Dehler, Dr. Peter Hahne, Dr. Bernd Krietenstein, Dr. Philipp Pinder, Oliver Podebrad, Dr. Brigitte Schillinger, Dr. Rolf Schuhmann, Dr. Klaus Steinigke, Dr. Bernhard Wagner, and Dr. Heike Wolter, who shared their ideas with the author in valuable discussions. Dr. Markus Clemens also allowed me to use some magneto-quasistatic examples from his research in this book.
- My colleague Dr. Alfons Langstrof deserves very special thanks, since he made it possible for me to carry out a substantial part of the research at home² by helping with the installation of a workstation and a PC.
- My students who worked on senior or diploma theses, thus making essential contributions to the success of my postdoctoral thesis: Ralf Ehmann, Michael Hilgner, Dr. Bernd Krietenstein, Jürgen Nahr, Dr. Philipp Pinder, Oliver Podebrad, Michael Sommer, Dr. Martin Witting. Working with them was a real pleasure and gave me important motivation to start a university career.
- Dr. Oliver Claus and Dr. Hans-Joachim Kloes from the Institute of High Voltage Engineering of the Darmstadt University of Technology for a fruitful cooperation in the research on contaminated high voltage insulators.
- My colleagues from DESY³ in Hamburg for valuable discussions and for letting me use several drawings concerning the SBLC study: Dr. Michael Drevlak, Dr. Norbert Holtkamp, Dr. Martin Dohlus, and Dr. Rainer Wanzenberg.
- My colleagues Dr. Peter Hülsmann, Dr. Martin Kurz, and Wolfgang Müller from Frankfurt University for good cooperation in the course of the 36-cell experiment.
- Dr. Bernhard Steffen from the Forschungszentrum Jülich for important ideas and discussions on the development of the presented multigrid algorithm.
- Prof. Dr. Owe Axelsson and Dr. Maya Neytcheva for their interest in my problems and for a lot of information on the solution of complex linear systems. The author is indebted to them for the material on the algorithm which is called 'C-to-R algorithm' in this book.
- The reviewers and the language editor of this book, Ms. Olga Holtz, as well as Prof. Tasche, Lübeck University, and Dr. Gisela Pöplau, Rostock University, for valuable comments.
- Dr. Dirk Hecht, Rostock University, for careful reading of the manuscript.
- Mrs. Brigitte Lalk from Rostock University for redrawing many figures in the book.
- The "Deutsche Forschungsgemeinschaft" (DFG) for a postdoctoral scholarship, which enabled me to have optimal working conditions. In this context, the author would also like to thank her student assistants Christian Wengerter and Michael Hilgner for their enthusiasm and their work.
- Finally, the author wishes to thank all who created a suitable work climate in her private life. In particular, thanks are due to my friend Claudia Dicken-Hahne, who often helped out and lovingly cared for my children.
- Very special thanks go to my children Jan and Viola for many weekends away from their mother and to my husband Gereon for his support in every respect. My parents deserve thanks for my education, which let me undertake unusual journeys through life.

¹ The German version of this book was the author's Habilitationsschrift (the postdoctoral thesis required for qualification as a university lecturer).
² being able to take care of the children
³ Deutsches Elektronen-Synchrotron
Overview
[Figure 0.1. Organization of the book. Diagram blocks: 1. Classical Electrodynamics (Maxwell's equations and their analytical solution); 2. Numerical Field Theory (Maxwell's equations and their numerical solution), with the Mode-Matching Method as a representative of semi-analytical methods and the Finite Element Method and the Finite Integration Technique as representatives of discretization methods; 3. Numerical Treatment of Linear Systems (direct methods, stationary methods, instationary methods); Applications from Electrical Engineering; Applications from Accelerator Physics.]
Introduction
Linear systems arise in every scientific discipline as soon as a linear relation between a set of unknowns is formulated. In addition, nonlinear relations are often linearized in order to simplify their solution.

Classical electrodynamics deals with macroscopic electric and magnetic phenomena. These experimentally observed phenomena were formally described by James Clerk Maxwell (1831-1879) in Maxwell's equations. Time-varying electric fields cause magnetic fields and vice versa; therefore the general term electromagnetic fields is used. Maxwell's equations form the axiomatic basis of electrodynamics, analogously to Newton's axioms for mechanics. For very simple geometrical structures, Maxwell's equations can be solved analytically. (Analytical methods are reviewed in section 1 "Classical Electrodynamics".)

The goal of this book is to bring together demanding applications from numerical field theory and modern methods from numerical mathematics in order to make the solution of the problems as efficient as possible. Furthermore, practical properties of the numerical methods are studied for examples of practical relevance. For this purpose, various field problems are solved. Some of them are parts of bigger future projects that require semi-analytical methods on the one hand and a discretization method on the other.

When choosing a method for the solution of linear systems, one cannot directly transfer theoretically derived convergence properties to practice, since practical problems generally show a diversity of additional difficulties that are simply not typical for model problems. We will see examples of this. Moreover, complex, non-Hermitian systems, often indefinite or even nearly singular, are typical for field theory. Some recently developed methods for systems of this kind have been studied from the point of view of their practical suitability for field theoretical problems.
Calculation of Electromagnetic Fields

A variety of methods has been developed in the past to compute electromagnetic fields in practical applications. Some representatives of these methods are described in section 2 "Numerical Field Theory": The mode matching method, the Finite Element Method, and the Finite Integration Technique are treated in more detail. They are just specimens of larger classes of schemes. Essentially, we have to distinguish between semi-analytical methods, discretization methods, and lumped circuit models. The semi-analytical methods and the discretization methods start directly from Maxwell's equations.

Semi-analytical methods concentrate on the analytical level: They use a computer only to evaluate expressions and to solve the resulting linear algebraic problems. The best known semi-analytical methods are the mode matching method, which is described in subsection 2.1, the method of integral equations, and the method of moments. In the method of integral equations, the given boundary value problem is transformed into an integral equation with the aid of a suitable Green's function. In the method of moments, which includes the mode matching method as a special case, the solution function is represented by a linear combination of appropriately weighted basis functions. For these methods, the treatment of complex geometrical structures is very difficult or only possible after geometric simplifications: In the method of integral equations, the Green's function has to satisfy the boundary conditions. In the mode matching method, it must be possible to decompose the domain into subdomains in which the problem can be solved analytically, so that the basis functions can be found. Nevertheless, there are some applications for which the semi-analytical methods are the best suited solution methods. For example, an application from accelerator physics uses the mode matching technique (see subsection 5.4). This method leads to full complex matrices whose dimension is of the order of one to two hundred (compare subsection 2.4), so the resulting systems can be solved very well by means of the direct solution methods described in subsection 3.1.

The second class of methods employs local difference equations obtained after suitable discretization.
The best known discretization methods are the Finite Element Method and the Finite Difference Method. The Finite Element Method is treated briefly in subsection 2.2. For numerical field computation, various Finite Element formulations are used. Some of these methods start from the Poisson, wave, or Helmholtz equation. Yet, these partial differential equations of second order follow only if one imposes restrictions on the material parameters ε, μ, and κ. Therefore, numerical solution methods that are based on these partial differential equations can only be applied to domains composed of a few subdomains, each with linear, homogeneous, and isotropic material. Then the differential equations are solved for a vector potential and/or a scalar potential or for the vector fields E or H. The domain decomposition, which is performed to allow the treatment of piecewise homogeneous material filling, requires a complicated coupling of the equations at all junctions. A very adequate Finite Element formulation for electromagnetic problems is given by the edge element technique.

A method that is particularly well suited for field theoretical problems is the Finite Integration Technique (FIT for short), which was developed by Weiland and is directly based on Maxwell's equations in integral form. The Finite Integration Technique is described in subsection 2.3. It consistently transforms Maxwell's equations into a system of linear algebraic equations, the so-called Maxwell Grid Equations. Depending on the problem type, linear systems or eigenvalue problems have to be solved or a time-domain integration has to be performed. Subsection 2.4 describes the resulting linear systems. The matrices are sparse with fixed band structure. Their dimension is typically of the order of several hundred thousand to several million. Besides real positive semi-definite matrices, complex non-Hermitian, partly indefinite systems also have to be solved. In particular, complex symmetric matrices are typical for field theoretical problems. Generally, the nature of the chosen discretization directly implies special characteristics of the resulting matrices, such as their sparsity, the special fill pattern (e.g., the band structure in FIT), etc. Thus, looking at the difference equations, i.e., the connections among the unknowns in the underlying grid, already gives an idea of the resulting matrix structure.
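How the couplings in a grid determine the fill pattern of the matrix can be illustrated with a small sketch. It is not the FIT discretization of Maxwell's equations; as a stand-in it assumes a scalar 7-point difference stencil on a regular nx x ny x nz grid with lexicographic numbering of the unknowns, and the function names `grid_laplacian_3d` and `occupied_diagonals` are hypothetical, chosen only for this illustration.

```python
import numpy as np
import scipy.sparse as sp


def grid_laplacian_3d(nx, ny, nz):
    """Assemble a 7-point difference operator on an nx x ny x nz grid.

    Assumption of this sketch: unknowns are numbered lexicographically,
    with x running fastest, then y, then z.
    """
    def second_difference(n):
        # 1D stencil [-1, 2, -1]: each unknown couples to its two neighbours.
        return sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n))

    ix, iy, iz = sp.identity(nx), sp.identity(ny), sp.identity(nz)
    # Kronecker products place the 1D couplings at strides 1, nx and nx*ny.
    couple_x = sp.kron(iz, sp.kron(iy, second_difference(nx), format="csr"), format="csr")
    couple_y = sp.kron(iz, sp.kron(second_difference(ny), ix, format="csr"), format="csr")
    couple_z = sp.kron(second_difference(nz), sp.kron(iy, ix, format="csr"), format="csr")
    return (couple_x + couple_y + couple_z).tocsr()


def occupied_diagonals(matrix):
    """Return the sorted offsets (column - row) of all nonzero entries."""
    coo = matrix.tocoo()
    mask = coo.data != 0
    return sorted(set((coo.col[mask] - coo.row[mask]).tolist()))


if __name__ == "__main__":
    a = grid_laplacian_3d(5, 4, 3)
    print(a.shape, occupied_diagonals(a))
```

For the 5 x 4 x 3 grid in the usage example, the nonzero entries lie on the main diagonal and on the side diagonals at offsets ±1, ±5 (= nx), and ±20 (= nx·ny). The band positions follow from the neighbour relations in the grid alone, before any matrix entry is computed.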
Modeling of Numerical Field Problems

The application of the Finite Integration Technique to field theoretical problems is briefly described in subsection 2.3. In the course of this, the modeling of stationary current fields, stationary temperature problems, electroquasistatics, and problems with time-harmonic excitation arises as a new area of application of the Finite Integration Technique. Electroquasistatics is generally little known. Its area of applicability and its modeling are therefore treated in some more detail. Also new is the discussed possibility to consistently solve coupled temperature problems by the Finite Integration Technique. Analogously to electrostatics, the stationary current and temperature problems lead to Poisson's equation and to real positive semi-definite matrices. A formulation with a complex scalar potential is used for electroquasistatics. This yields Poisson's equation with complex quantities and complex symmetric matrices. Problems with time-harmonic excitation yield complex indefinite systems, which may even become singular, depending on the excitation frequency. After suitable modeling, the numerical problems from field theory thus often reduce to solving large sparse linear systems of equations

Ax = b.

The type of the system matrix A varies from real, symmetric, positive definite to complex, non-Hermitian, indefinite, and nearly singular (see subsection 2.4). The matrices are usually sparse, but some methods in numerical field theory give rise to full matrices. Complex matrices may be rewritten as real matrices of twice the size in order to apply methods for real linear systems. However, this procedure cannot be recommended from the numerical point of view, since the condition of the corresponding real matrix is much worse than that of the complex matrix. The complex non-Hermitian system matrices are particularly typical for electrical engineering, while they rarely occur in other fields.

Numerical Treatment of Large Linear Systems - Theory and Practice
The most important numerical methods for the solution of linear systems of equations are briefly introduced in section 3 "Numerical Treatment of Linear Systems". The focus of that section is on several more recent iterative methods. The direct methods such as Gaussian elimination are only briefly presented in subsection 3.1, since they belong to the "classics", which can be found in most elementary textbooks on numerical mathematics. They are appropriate for full matrices, like those occurring, for example, in the mode matching technique. Iterative methods are applied to large sparse matrices, like those typical for discretization methods.

The iterative methods may be divided⁴ into two groups: the classical iterative methods, which are described in subsection 3.2, and the Krylov subspace methods described in subsection 3.4. In addition, there are methods such as the multigrid methods, which are described in subsection 3.6. The Jacobi, Gauss-Seidel, and SOR methods are representatives of the classical iteration methods. They are easy to understand and implement. The modern Krylov subspace methods and the multigrid methods are significantly more efficient, yet their theoretical analysis is significantly more difficult.

⁴ Another possible division, into stationary and non-stationary methods, is given in section 3.

The Krylov subspace methods described in subsection 3.4 form a group of closely related methods. Historically, the Lanczos method published in 1950 and the conjugate gradient method (cg method for short) published by Hestenes and Stiefel in 1952 form the basis of the subsequent evolution of the Krylov subspace methods. With these algorithms, the solution of a linear system is achieved by minimization of a residual functional. The iterates are related to the initial residual by multiplication by a polynomial in the system matrix, i.e., the minimization takes place over special vector spaces, the so-called Krylov spaces. They give rise to a sequence of orthogonal or conjugate vectors. Conjugacy means orthogonality with respect to an inner product with a weighting matrix.

The important advantages of the Krylov subspace methods are their relatively high convergence rate, which may be increased further by various preconditioning techniques; their independence of any parameters, which makes estimates such as those needed for the relaxation parameter in the SOR method unnecessary; their acceptable storage requirements; their low computing time per iteration; and their good rounding error properties. Yet, for indefinite or non-symmetric matrices, these methods may become unstable. Therefore, generalized cg methods in many different versions have been developed since the end of the 1970s that are also applicable to non-symmetric and/or indefinite systems. Often, only a clever combination of preconditioning and a generalized cg method yields the desired robustness. The Krylov subspace methods are still an active area of research. There is a large number of recent publications, which in particular deal with the application of these methods to non-Hermitian linear systems. Unfortunately, for non-Hermitian systems Ax = b with A ∈ C^{n×n}, A ≠ A^H, the robustness of the Krylov subspace methods is not sufficient to use them as black box solvers. There are enough examples of system matrices A for which the presented methods either do not reach the prescribed accuracy or do not converge at all. Some of these examples are of great importance in practice, so thorough numerical experiments are always needed in order to decide which solution method is suitable for a given problem.

Another important class of methods are the Minimal Residual methods described in subsection 3.5, which are more stable than the cg-like methods and in particular show monotone convergence. But, to compete with the cg-like methods, they usually require (even in versions with restarts) the storage of a large number of basis vectors. Taking into account the typical size of the applications, the use of the Minimal Residual methods often makes no sense because of their poor storage efficiency. Hybrid methods combine the cg or BiCG method or the Look-Ahead Lanczos method with a Minimal Residual formulation, especially with the GMRES algorithm. This way, the advantages of the short recursions in the cg and Lanczos methods are combined with the stable and monotone behaviour of the Minimal Residual methods.
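The short recursions of the cg method mentioned above can be made concrete with a minimal sketch of the classical Hestenes-Stiefel iteration. It assumes a Hermitian positive definite matrix and uses no preconditioning; the function name `conjugate_gradient`, its arguments, and the stopping rule on the relative residual norm are choices made for this illustration, not the book's implementation.

```python
import numpy as np


def conjugate_gradient(a, b, tol=1e-10, max_iter=None):
    """Solve a x = b for a Hermitian positive definite matrix a.

    Returns the approximate solution and the history of residual norms.
    Only three vectors (x, r, p) are updated per step: the short recursion.
    """
    n = b.shape[0]
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n, dtype=np.result_type(a.dtype, b.dtype, np.float64))
    r = b - a @ x                  # initial residual
    p = r.copy()                   # first search direction
    rho = np.vdot(r, r).real
    history = [np.sqrt(rho)]
    for _ in range(max_iter):
        if history[-1] <= tol * history[0]:
            break                  # relative residual small enough
        q = a @ p                  # the only matrix-vector product per step
        alpha = rho / np.vdot(p, q).real
        x = x + alpha * p
        r = r - alpha * q
        rho_new = np.vdot(r, r).real
        p = r + (rho_new / rho) * p    # new direction, conjugate to the old ones
        rho = rho_new
        history.append(np.sqrt(rho))
    return x, history


if __name__ == "__main__":
    n = 50
    a = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    x, history = conjugate_gradient(a, np.ones(n), max_iter=500)
    print(len(history) - 1, "iterations, final residual norm", history[-1])
```

The k-th iterate lies in the Krylov space spanned by the initial residual and its first k-1 products with the matrix, which is why each step needs only one matrix-vector product and a fixed number of vectors. For non-Hermitian or indefinite matrices this simple recursion is not applicable as written.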
These methods are briefly treated in subsection 3.5. The resulting algorithms are particularly well suited for the solution of complex non-Hermitian systems of equations. Originally, the multigrid methods, which are described in subsection 3.6, were developed as a construction principle for fast solvers of Poisson's equation. The evolution of the multigrid methods began in the early 60s with publications of Fedorenko and Bakhalov. Based on these publications, Brandt, who recognized the great efficiency of the multigrid methods, started his investigations in 1972. Independently of these studies, Hackbusch developed his multigrid algorithms, which he first published in 1976. Recently, the multigrid methods acquired additional importance as the so-called multilevel preconditioners. A variety of different views and derivations appeared. Recent noteworthy developments are Griebel's representation of the multigrid methods and multilevel preconditioners as classical iterative methods over generating systems and the cascade methods of Deuflhard. The observation of the following typical properties of the classical iterative methods applied to linear systems as they arise as a result of the
discretization of partial differential equations governed the development of the multigrid methods: The classical iterative methods provide a very good smoothing of the error in only a few iteration steps. But with increasingly finer discretization ($h \to 0$), their rate of convergence decreases, and their total error decreases only insignificantly after the smoothing of the high frequency error components. The multigrid methods also reduce the low frequency error components and have a rate of convergence which is nearly independent of the step size $h$ of the discretization. In general, multigrid methods are especially well suited for linear boundary value problems of elliptic partial differential equations as well as for elliptic eigenvalue problems. Their typical advantages, which make them superior to other methods, are their speed and robustness. Axelsson developed a special method, described in subsection 3.9, which is based on an equivalent real formulation for complex linear systems. This method does not use the well known procedure of solving a real system twice as large as the complex system, where the condition number of the real system equals the square of that of the original system. An efficient preconditioning and an iterative method for the solution of the real subsystem are recommended. In this context, the Chebyshev method is used, which is an acceleration method for classical fixed point methods. Instead of using only the information of the last iteration step, a linear combination of the already computed approximate solutions is used. The coefficients are chosen so that faster convergence is reached than for the original sequence of approximate solutions. In this process, the minimax properties of the Chebyshev polynomials are used. For the iterative methods, some convergence studies from numerical experiments are presented in subsection 3.10.
In particular, results for real applications are presented in addition to some purely academic examples, which are still relatively closely related to the model problems that are usually used for theoretical convergence studies. The central question, to what extent the theoretical convergence results may be transferred to practical applications, is answered for several examples. Often, these practical problems are very large and have geometrical singularities. Not rarely, the methods show a different convergence behaviour for these practical applications than could be expected theoretically: For special problems, even hybrid methods, for example the BiCGSTAB method, demonstrate significantly worse convergence than what could be expected; they even stagnate quite often. Another example of this fact is given by the multigrid methods: In theory, they provide an excellent method for the iterative solution of large linear systems. Yet, the theoretical convergence studies are almost all limited to Poisson's equation on the unit square or similar very simple domains. For the practical application to other partial differential equations and/or to domains with arbitrary convex edges, some problems may arise which require a creative solution before the properties of the multigrid algorithm can be reached.
Sometimes, these then are only roughly comparable with theory, as is shown in subsection 3.10.
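The smoothing property of the classical iterative methods mentioned above can be illustrated with a small numerical experiment. The following is only a minimal sketch under stated assumptions, not one of the experiments of subsection 3.10: it takes the 1D model matrix tridiag(-1, 2, -1), a damped Jacobi iteration with damping factor 2/3, and two sine-shaped error modes. The grid size, the mode numbers, and the function names are chosen for illustration only.

```python
import math


def jacobi_sweeps(error, omega, sweeps):
    """Apply damped Jacobi sweeps to the error of the 1D model problem.

    For A = tridiag(-1, 2, -1) and a zero right-hand side, the iterate is
    the error itself, and one damped Jacobi sweep reads
    e_i <- (1 - omega) * e_i + omega * (e_{i-1} + e_{i+1}) / 2.
    """
    e = list(error)
    n = len(e)
    for _ in range(sweeps):
        left = [0.0] + e[:-1]    # homogeneous Dirichlet boundary values
        right = e[1:] + [0.0]
        e = [(1.0 - omega) * e[i] + omega * 0.5 * (left[i] + right[i])
             for i in range(n)]
    return e


def max_norm(v):
    return max(abs(x) for x in v)


n = 63                   # interior grid points (illustrative choice)
h = 1.0 / (n + 1)        # step size of the discretization
smooth = [math.sin(1 * math.pi * (i + 1) * h) for i in range(n)]   # low frequency
rough = [math.sin(48 * math.pi * (i + 1) * h) for i in range(n)]   # high frequency

smooth_after = jacobi_sweeps(smooth, omega=2.0 / 3.0, sweeps=5)
rough_after = jacobi_sweeps(rough, omega=2.0 / 3.0, sweeps=5)

smooth_ratio = max_norm(smooth_after) / max_norm(smooth)
rough_ratio = max_norm(rough_after) / max_norm(rough)
print(f"low-frequency error reduced to  {smooth_ratio:.4f} of its size")
print(f"high-frequency error reduced to {rough_ratio:.6f} of its size")
```

After five sweeps the high frequency mode has almost vanished, while the low frequency mode is nearly unchanged. This is the behaviour that the coarse grid correction of a multigrid method is designed to compensate.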
Practical Applications in Electrical Engineering and Accelerator Physics

The studies on the solution of linear systems in numerical field computation were carried out for examples relevant for practice, as is described in section 4 "Applications from Electrical Engineering" and section 5 "Applications from Accelerator Physics". In the course of this, we went far beyond the solution of a linear system. In high-voltage engineering, the fields on contaminated high-voltage insulators were modelled. In accelerator physics, the physical problem of parasitic fields in accelerator structures was studied. Besides these two very comprehensive problems, a series of other applications was investigated. In particular, computations for coupled temperature problems were also presented.
Field Computation for Various Applications from Electrical Engineering

In section 1, a classification of electromagnetic field problems is given and taken up again in subsection 2.3. In section 4, several application problems from these problem classes are investigated: As an electrostatic example, the electric field in a plug is calculated with two different solvers. Results are shown in subsection 4.1. For both magnetostatic examples which are treated in subsection 3.10, the C-magnet and the current sensor, field representations are shown in subsection 4.2. For a velocity sensor, different solvers are compared, including a black box multigrid solver. Results for a nonlinear calculation of another C-magnet are also described there, showing the necessity of fast solvers. As examples for stationary current and coupled problems, a simple Hall element, a semi-conducting cube, and a circuit breaker are presented. The temperature distribution on a board serves as an example of stationary computation. Furthermore, four examples of coupled temperature computation are presented in subsection 5.6: inductive soldering and temperature distribution in an rf-cavity, an rf-window, and a waveguide with load. Some test specimens of humid high voltage insulators are considered as typical examples for electro-quasistatics (also see remarks below). As an example of magneto-quasistatics, simulation results for a TEAM benchmark problem are shown. Time-harmonic problems can be divided into problems with excitation and eigenvalue problems. Simulation results are shown for the two examples which are studied in subsection 3.10: the 3dB waveguide coupler and the microchip. Another application problem is given by the inductive soldering, an
eddy current problem with coupled thermal computation which is presented in subsection 5.6. Eigenvalue problems are treated in section 5, especially in subsection 5.4. Some of the above-mentioned examples could also be solved as general time-dependent problems. Other examples of those are the rf-window and the waveguide with load shown in subsection 5.6.

Field Simulation for High Voltage Insulators under Environmental Damage

Field theoretical simulation intended to optimize the design of high voltage insulators should also include environmental influences which affect the insulator material. Modeling and implementation of electro-quasistatics with the Finite Integration Technique is described in subsection 2.3. It opens the possibility to simulate humid or contaminated high voltage insulators. Discharges may occur on such insulators. Hitherto, the electrostatic simulation was mainly used to solve this problem. Yet, it is easy to show that a significant difference exists between the results of the two formulations. Electro-quasistatics is the appropriate model, as is also supported by computations for examples treated in subsection 4.5.

A Special Application: Modes in Accelerating Structures for Linear Colliders

In the future, such high energies (500 GeV to 1.5 TeV) will be needed in $e^+e^-$ physics that storage rings will no longer be a realistic possibility because of their energy losses from synchrotron radiation. As a result, several linear collider projects are currently being pursued worldwide. They are discussed in subsection 5.2. For the S-band $2 \times 250$ GeV linear collider project SBLC, some investigations were carried out. The SBLC project proposes 2452 so-called constant gradient structures to accelerate the elementary particles. These aperiodic traveling wave tubes shall have 180 cells and an accelerating gradient of 17 MV/m. A so-called bunch train of 125 bunches is proposed with a distance of 16 ns from bunch to bunch.
In order to reach as high luminosity as possible, each bunch must be prevented from spreading. The beam dynamics is treated in subsection 5.3: Effects of scattered fields which are caused by parasitic modes are among the main reasons for the bunch spreading. Consequently, the suppression of these modes, the so-called Higher Order Modes (HOMs), is of fundamental importance for the actual collider design. The intensity of interaction of the particles with the higher order modes differs a lot. It can be expressed by the so-called loss parameter. For reasons clear from experience and some preliminary theoretical and numerical investigations, it could be assumed that the modes of the first dipole band would cause the worst blow-up effects. Therefore, of main interest were computations for this dipole band.
The development of a program based on the mode matching technique is briefly described in subsection 2.1 and subsection 5.4. In particular, it can be used for the field computation of parasitic modes in long accelerating structures. Results from field computations for a typical accelerating structure are presented. The investigated structure showed many modes which strongly interact with the particle beam and which are trapped inside the structure. Such unwelcome modes have to be suppressed as much as possible in order to maintain the stability of the beam. The "36-cell experiment" described in subsections 5.5 to 5.5.4 was done with a relatively short structure for measurement studies of higher order modes and their damping.

Coupled Temperature Problems in Accelerators

The design of many technical components requires the investigation of their electro- or magneto-thermal behaviour. The Finite Integration Technique (FIT) has proved to be a consistent numerical method for the computation of electromagnetic fields. In order to allow the simulation of coupled thermal problems, FIT was applied to the calculation of stationary temperature problems, as is described in subsection 2.3. This procedure guarantees the consistency of coupled calculations. For this purpose, a material of prescribed temperature can serve as a heat source, or a heat source with given density or emission can be chosen. Thus, in particular, the heating resulting from wall losses of modes and the Joule heating caused by eddy currents can be evaluated completely inside a modular program package which is based on FIT. Several examples of practical applications from accelerator physics are presented in subsection 5.6.
1. Classical Electrodynamics
Classical electrodynamics treats macroscopic electrical and magnetic phenomena. These experimentally observed phenomena were formally described by James Clerk Maxwell (1831 - 1879) in Maxwell's equations [172]. Maxwell's equations form the axiomatic basis of electrodynamics, analogously to Newton's axioms for mechanics. This chapter discusses Maxwell's equations and means for their analytical solution. Rather than aspiring to a comprehensive treatise, it discusses only those topics which are important for the main themes of this book. Extensive treatment can be found in many textbooks, e.g., [142], [251], or [168].
1.1 Maxwell's Equations

Maxwell's equations are the fundamental equations of electrodynamics. Time-varying electric fields cause magnetic fields and vice versa. Therefore, one uses the term electromagnetic fields. Their macroscopic behaviour is described by Maxwell's equations.
Maxwell's Equations in Differential Form. Maxwell's equations [172] reflect the relations between the four characteristic quantities of electromagnetic fields. These quantities are:

$\mathbf{E}$    [V/m]    electric field strength
$\mathbf{H}$    [A/m]    magnetic field strength
$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}$    [C/m$^2$] = [As/m$^2$]    electric flux
$\mathbf{B} = \mu_0 (\mathbf{H} + \mathbf{M})$    [T] = [Vs/m$^2$]    magnetic flux
Often the electric flux is also referred to as the electric displacement and the magnetic flux as the magnetic induction. $\mathbf{P}$ is called the electric polarization density (polarization for short). It is connected to certain charge displacements in the molecules of the considered material and a resulting change of the field. The polarization describes the vector sum of all dipoles with respect to the volume in the presence of exterior fields (macroscopically averaged electric dipole density of the material system). Strictly speaking, the electric quadrupole moment and higher order terms have to be added, but in most media they
are negligible [142]. $\mathbf{M}$ is called magnetization and describes the macroscopically averaged magnetic "dipole density" of the material system. In homogeneously polarized materials, $\mathbf{P} = \varepsilon_0 \chi_e \mathbf{E}$, where $\chi_e$ is the dielectric susceptibility. For linear materials, $\mathbf{D} = \varepsilon \mathbf{E}$ with $\varepsilon = \varepsilon_0 (1 + \chi_e)$. Here the dielectric constant of vacuum (also influence constant) equals $\varepsilon_0 = 8.854187 \cdot 10^{-12}$ As/Vm, and $\varepsilon_r := 1 + \chi_e$ is referred to as relative permittivity. In diamagnetic media, the relation $\mathbf{M} = \chi_m \mathbf{H}$ holds, with the magnetic susceptibility $\chi_m$. In magnetically linear materials, we can write $\mathbf{B} = \mu \mathbf{H}$ with $\mu = \mu_0 (1 + \chi_m)$. Here $\mu_0 = 4\pi \cdot 10^{-7}$ Vs/Am $\approx 1.2566 \cdot 10^{-6}$ Vs/Am is the permeability of vacuum (also induction constant), and $\mu_r := 1 + \chi_m$ is called relative permeability. In general, the permittivity $\varepsilon = \varepsilon_r \varepsilon_0$ and the permeability $\mu = \mu_r \mu_0$ are tensors depending on time, space, and field. They are scalars in linear isotropic media. In vacuum, they are constant (see above) and satisfy the condition $c = 1/\sqrt{\varepsilon_0 \mu_0}$, where $c = 2.99792458 \cdot 10^{8}$ m/s is the velocity of light. Another field quantity is the current density $\mathbf{J}$ with unit A/m$^2$:

$$\mathbf{J} = \mathbf{J}_L + \mathbf{J}_E + \mathbf{J}_K = \kappa \mathbf{E} + \mathbf{J}_E + \delta \operatorname{grad} \rho.$$

$\mathbf{J}_L = \kappa \mathbf{E}$ is the conduction current density arising in materials with electric conductivity $\kappa$ (unit 1/$\Omega$m) from the existing field strength. $\mathbf{J}_E$ represents an impressed current density which is independent of all field forces. The convection current density $\mathbf{J}_K = \delta \operatorname{grad} \rho$ is the density of a current of free electrical charges with the electric charge density $\rho$ (unit As/m$^3$). The proportionality constant $\delta$ is called diffusion constant. Thus all quantities appearing in Maxwell's equations are introduced. Maxwell's equations for static media are then given by
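$$\operatorname{curl} \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{1.1}$$
$$\operatorname{curl} \mathbf{H} = \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J} \tag{1.2}$$
$$\operatorname{div} \mathbf{D} = \rho \tag{1.3}$$
$$\operatorname{div} \mathbf{B} = 0. \tag{1.4}$$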
So, they are coupled partial differential equations of first order.

Maxwell's Equations in Integral Form. Equivalent to Maxwell's equations in the form of partial differential equations (1.1) - (1.4) is their representation in integral form. Let $A$ be a 2-dimensional domain with boundary $\partial A$. Upon integrating (1.1) and (1.2) over $A$ and using Stokes' theorem, we can rewrite the first two Maxwell's equations as
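$$\oint_{\partial A} \mathbf{E} \cdot d\mathbf{s} = -\int_A \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{A} \tag{1.5}$$
$$\oint_{\partial A} \mathbf{H} \cdot d\mathbf{s} = \int_A \left( \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J} \right) \cdot d\mathbf{A}. \tag{1.6}$$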
Now let $V$ be a 3-dimensional domain enclosed by the surface $\partial V$. Integrating (1.3) and (1.4) over $V$ and using Gauss' theorem, we rewrite the third and fourth Maxwell's equations as

$$\oint_{\partial V} \mathbf{D} \cdot d\mathbf{A} = \int_V \rho \, dV \tag{1.7}$$
$$\oint_{\partial V} \mathbf{B} \cdot d\mathbf{A} = 0. \tag{1.8}$$
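The continuity equation and the theorem on the conservation of charges

$$\operatorname{div} \mathbf{J} + \frac{\partial \rho}{\partial t} = 0, \qquad \oint_{\partial V} \mathbf{J} \cdot d\mathbf{A} + \frac{\partial}{\partial t} \int_V \rho \, dV = 0 \tag{1.9}$$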
follow from Maxwell's equations. To get the continuity equation, apply div to both sides of (1.2), take into account that $\operatorname{div} \operatorname{curl} \mathbf{H} \equiv 0$, interchange the order in the mixed derivative with respect to space and time, and use (1.3). The conservation law for charges follows from the continuity equation by integration and Gauss' theorem. This law states that the total charge in a volume decreases if and only if the corresponding current flows out at the same time. The solution of Maxwell's equations depends on the problem. The different permissible assumptions lead to a whole series of different types of differential equations and hence to different solution approaches, which will be discussed later.
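The derivation of the continuity equation outlined above can be written out step by step. The following lines are a sketch in the notation of (1.2) and (1.3), with $\mathbf{J}$ denoting the total current density as introduced before:

```latex
\begin{align*}
0 &= \operatorname{div} \operatorname{curl} \mathbf{H}
   = \operatorname{div} \left( \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J} \right)
   && \text{by (1.2)} \\
  &= \frac{\partial}{\partial t} \operatorname{div} \mathbf{D} + \operatorname{div} \mathbf{J}
   && \text{interchanging space and time derivatives} \\
  &= \frac{\partial \rho}{\partial t} + \operatorname{div} \mathbf{J}
   && \text{by (1.3)}
\end{align*}
```

Integrating the last line over a volume $V$ and applying Gauss' theorem to the divergence term turns it into a balance between the current through $\partial V$ and the rate of change of the enclosed charge.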
1.2 Energy Flow and Processes of Thermal Conduction

The energy conservation law states that energy may neither be generated nor annihilated. However, energy can be transformed. All forms of energy (heat, work, mechanical energy, electrical energy, chemical energy, sound, and light) are equivalent to each other. In our context, we are mainly interested in the transformation of electrical energy into heat.

1.2.1 Energy and Power of Electromagnetic Fields
Electromagnetic energy is transported during the propagation of electromagnetic fields. Its spatial distribution in static, isotropic, electrically and magnetically linear media can be represented by the electric and magnetic energy densities

$$w_e = \frac{1}{2} \mathbf{E} \cdot \mathbf{D}, \qquad w_m = \frac{1}{2} \mathbf{H} \cdot \mathbf{B}. \tag{1.10}$$

Electric energy and magnetic energy are coupled with each other and cause an energy flux. The density of this energy current is represented by the vector
$$\mathbf{S} = \mathbf{E} \times \mathbf{H}.$$

The vector $\mathbf{S}$, measured in W/m$^2$, is referred to as the Poynting vector. Instead of the term 'energy flux', one often uses 'power density'. The conservation of energy is described by Poynting's theorem, which follows from Maxwell's equations:

$$\operatorname{div} \mathbf{S} = -\mathbf{E} \cdot \mathbf{J} - \mathbf{E} \cdot \frac{\partial \mathbf{D}}{\partial t} - \mathbf{H} \cdot \frac{\partial \mathbf{B}}{\partial t}.$$

In case of isotropic, linear, homogeneous media, the last formula reduces, by (1.10), to

$$\frac{\partial w_e}{\partial t} + \frac{\partial w_m}{\partial t} + \operatorname{div} \mathbf{S} = -(\mathbf{J}_L + \mathbf{J}_E) \cdot \mathbf{E}.$$

The rate of change of the electromagnetic energy plus the power flowing out equals the negative of the work performed per unit time. The work performed per unit of space and time, $(\mathbf{J}_L + \mathbf{J}_E) \cdot \mathbf{E}$, describes the conversion of electromagnetic energy into thermal energy and is called Joule's heat. Therefore, thermal effects are also of interest in connection with the conversion of electrical power into Joule's heat.
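As a small numerical illustration of the Poynting vector and of Joule's heat, the following sketch evaluates both for made-up field values. All numbers are assumptions chosen for illustration only: a plane wave in vacuum with a field strength of 100 V/m, and a conductor with roughly the conductivity of copper and no impressed current density.

```python
import math

EPS0 = 8.854187e-12        # As/Vm, permittivity of vacuum
MU0 = 4.0e-7 * math.pi     # Vs/Am, permeability of vacuum


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


# Plane wave in vacuum (assumed example): E along x, H along y, |H| = |E| / Z0.
Z0 = math.sqrt(MU0 / EPS0)            # wave impedance of vacuum in Ohm
E_wave = (100.0, 0.0, 0.0)            # V/m
H_wave = (0.0, 100.0 / Z0, 0.0)       # A/m
S = cross(E_wave, H_wave)             # Poynting vector S = E x H in W/m^2

# Joule's heat (J_L + J_E) . E in a conductor, here with J_E = 0 (assumption).
kappa = 5.8e7                         # 1/(Ohm m), roughly copper
E_cond = (0.0, 0.0, 1.0e-3)           # V/m
J_L = tuple(kappa * component for component in E_cond)
joule_heat = dot(J_L, E_cond)         # W/m^3

print("Poynting vector [W/m^2]:", S)
print("Joule heat density [W/m^3]:", joule_heat)
```

The Poynting vector of the plane wave points along the direction of propagation (here the z-axis), perpendicular to both field vectors.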
1.2.2 Thermal Effects

In general, there are three ways thermal energy can be transmitted from one thermodynamic system to another: thermal flow (also called thermal conduction), convection, and thermal radiation. In gases or liquids, even though thermal conduction takes place, convection plays the main role in the thermal transport. In the following, we are mainly interested in thermal conduction.
Thermal Conduction. Transport of thermal energy by thermal conduction happens through intra-molecular momentum exchange. In isotropic materials, the thermal flux $\mathbf{J}_w$ (unit W/m$^2$) is described by Fourier's law (footnote 1):

$$\mathbf{J}_w = -\kappa_T \operatorname{grad} T. \tag{1.11}$$

Fourier's law states that the thermal current always flows in the direction of decreasing temperature, namely perpendicularly to the isothermal lines (footnote 2).

Footnote 1: This relation is only an approximation formula, but for some important materials such as metals it is sufficiently good.
Footnote 2: Isotherms are lines connecting neighbouring points of equal temperature (cf. contour lines).

The proportionality factor $\kappa_T$ has the unit W/(m·K) and is the thermal conductivity of the material (the notation $\lambda$ is often used instead of $\kappa_T$). In general, the thermal conductivity depends on the temperature $T$ and the pressure of the material. For good electric conductors, the thermal conductivity $\kappa_T$ is proportional to the electric conductivity $\kappa_E$:

$$\frac{\kappa_T}{\kappa_E} = 2 \, \frac{k^2}{e^2} \, T. \tag{1.12}$$
This relation is called Wiedemann-Franz' law; here $k = 1.38 \cdot 10^{-23}$ J/K denotes the Boltzmann constant and $e = 1.60 \cdot 10^{-19}$ As the charge of the electron. Now, consider a heat source inside an infinitesimally small volume element $dV$. The first law of thermodynamics states that the change in inner energy $u(T)$ equals the energy supplied by the internal heat source with density $w$ diminished by the thermal flow through the surface of the volume element:

$$\frac{\partial u}{\partial t} \, dV = w \, dV - \mathbf{J}_w \cdot d\mathbf{A}. \tag{1.13}$$

By a Taylor expansion of first order and use of Fourier's law, we get

$$\frac{\partial u}{\partial t} = w + \operatorname{div} (\kappa_T \operatorname{grad} T). \tag{1.14}$$

For a stationary heat flow, i.e., if constant sources have produced an equilibrium state, (1.14) reduces to

$$\operatorname{div} (\kappa_T \operatorname{grad} T) = -w. \tag{1.15}$$

The divergence of the temperature gradient weighted with the thermal conductivity is therefore proportional to the density of the heat source. Since the gradient of the temperature field is a priori curl-free ($\operatorname{curl} \operatorname{grad} \equiv 0$), the case of stationary temperature fields presents a problem which is formally identical to the problem of electrostatics, which is treated in the next subsection.
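To make the stationary heat conduction equation concrete, the following sketch solves its one-dimensional form, $-\kappa_T T'' = w$, on a rod with both ends held at a fixed temperature. It uses plain finite differences and the Thomas algorithm rather than the Finite Integration Technique of this book, and all material data, dimensions, and function names are illustrative assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused),
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def stationary_temperature(kappa_t, w, length, t_boundary, n):
    """Solve -kappa_t * T'' = w on (0, length) with T = t_boundary at both ends."""
    h = length / (n + 1)
    coeff = kappa_t / h**2
    lower = [-coeff] * n
    diag = [2.0 * coeff] * n
    upper = [-coeff] * n
    rhs = [w] * n
    rhs[0] += coeff * t_boundary      # left Dirichlet value
    rhs[-1] += coeff * t_boundary     # right Dirichlet value
    return solve_tridiagonal(lower, diag, upper, rhs)


# Illustrative data: a 0.1 m rod with uniform heat source density.
kappa_t = 400.0      # W/(m K), thermal conductivity
w = 1.0e6            # W/m^3, heat source density
length = 0.1         # m
t_boundary = 300.0   # K
n = 49               # interior grid points

temps = stationary_temperature(kappa_t, w, length, t_boundary, n)
h = length / (n + 1)
exact = [t_boundary + w * x * (length - x) / (2.0 * kappa_t)
         for x in ((i + 1) * h for i in range(n))]
max_error = max(abs(a - b) for a, b in zip(temps, exact))
print("peak temperature [K]:", max(temps))
print("max deviation from analytic solution [K]:", max_error)
```

For a constant source density the exact solution is a parabola, which the second-order difference stencil reproduces up to rounding errors, so the comparison mainly checks the assembly of the system.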
1.3 Classification of Electromagnetic Fields

The problems on electromagnetic fields can be divided into several classes: electrostatics, magnetostatics, stationary currents, quasistatics, and fast varying fields. An important special case of time varying fields is the case of harmonic time dependence.

1.3.1 Stationary Fields
Stationary fields are time-independent electromagnetic fields $\mathbf{E}$, $\mathbf{H}$, $\mathbf{B}$, $\mathbf{D}$ and $\mathbf{J}$. Their field quantities depend on space only. They are generated by static or uniformly moving charges. The magnetic and electric fields decouple in the stationary case. Because of the independence of time, we get two completely independent systems of differential equations for them.
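Electrostatics. Electrostatic fields only exist in non-conductive areas. Since $\mathbf{J}_E = 0$, $\mathbf{J}_K = 0$, and

$$\kappa = 0 \;\Rightarrow\; \mathbf{J}_L = \kappa \mathbf{E} = 0,$$

electric and magnetic field are decoupled. Electrostatics considers the system

$$\operatorname{curl} \mathbf{E} = 0 \tag{1.16}$$
$$\operatorname{div} \mathbf{D} = \rho. \tag{1.17}$$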
Equation (1.16) expresses the fact that electrostatic fields are curl-free (irrotational). Equation (1.17) gives the source density of the generating electric charges. In linear media, we have $\mathbf{D} = \varepsilon \mathbf{E}$. In general, $\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}$. This material equation is also valid, e.g., for permanently polarized media. Many dielectrics are not isotropic, i.e., the polarization depends on how the direction of the applied electric field relates to the preferential direction of the dielectric. Then the permittivity $\varepsilon$ is a symmetric tensor of second order and has nine components (only six of which are independent).
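The effect of a tensor-valued permittivity can be seen in a few lines. The sketch below uses a made-up symmetric relative permittivity tensor (it does not describe any particular material) and shows that $\mathbf{D} = \varepsilon \mathbf{E}$ is then in general not parallel to $\mathbf{E}$.

```python
EPS0 = 8.854187e-12   # As/Vm, permittivity of vacuum


def matvec(matrix, vector):
    """Product of a 3x3 matrix (tuple of rows) with a 3-vector."""
    return tuple(sum(matrix[i][j] * vector[j] for j in range(3))
                 for i in range(3))


# Hypothetical relative permittivity tensor of an anisotropic dielectric.
eps_r = ((4.0, 0.5, 0.0),
         (0.5, 2.0, 0.0),
         (0.0, 0.0, 2.0))

# A symmetric 3x3 tensor has nine entries, but only six independent ones.
is_symmetric = all(eps_r[i][j] == eps_r[j][i]
                   for i in range(3) for j in range(3))
independent_entries = len({(min(i, j), max(i, j))
                           for i in range(3) for j in range(3)})

E = (1.0, 0.0, 0.0)                                  # V/m, field along x
D = tuple(EPS0 * component for component in matvec(eps_r, E))

print("symmetric:", is_symmetric, "| independent entries:", independent_entries)
print("D / eps0 =", tuple(component / EPS0 for component in D))
```

Although the applied field points along the x-axis, the resulting flux has a y-component as well, caused by the off-diagonal entry of the tensor.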
Magnetostatics. Magnetostatics treats time-independent magnetic fields:

$$\operatorname{curl} \mathbf{H} = \mathbf{J} \tag{1.18}$$
$$\operatorname{div} \mathbf{B} = 0. \tag{1.19}$$
Equation (1.18) gives the curl density of the generating direct current or, equivalently, of moving electrical charges. Equation (1.19) expresses the fact that there are no magnetic charges. Recall that $\mathbf{B} = \mu \mathbf{H}$ and that $\mu$ is a symmetric tensor for linear but non-isotropic media. Non-linear materials, which satisfy the equation $\mathbf{B} = \mu_0 (\mathbf{H} + \mathbf{M})$, are often considered in magnetostatics. For ferromagnetic media such as iron, the relation between $\mathbf{M}$ and $\mathbf{H}$ depends on the history of the medium and on its special kind. It has to be found by measurements. The relation between $\mathbf{M}$ and $\mathbf{H}$ (or between $\mathbf{B}$ and $\mathbf{H}$) is displayed by the so-called hysteresis curves. Increasing $\mathbf{H}$ from its initial value to the desired value, we obtain the first curve of the hysteresis; decreasing $\mathbf{H}$ from the obtained value to the initial value leads to a different curve. The result is the hysteresis loop, whose area describes the losses for re-magnetization. Media with wide hysteresis loops display large losses for the re-magnetization and are called "hard materials". A "soft material" with a narrow hysteresis loop is appropriate for transformers.
Stationary Current Fields. Stationary current fields arise in conductors where direct voltage is applied. The system

$$\operatorname{curl} \mathbf{E} = 0 \tag{1.20}$$
$$\operatorname{div} \mathbf{J}_L = 0 \tag{1.21}$$
applies to the direct currents $\mathbf{J}_L = \kappa \mathbf{E}$. For the stationary case, equation (1.21) follows directly from the charge conservation law (1.9) with $\mathbf{J} = \mathbf{J}_L$. Equations (1.20) and (1.21) demonstrate that the stationary current field is curl-free and source-free.

1.3.2 Quasistatic Fields
One can divide time-dependent fields into "slowly" and "fast" varying fields. The quasistatic laws for "slowly" varying fields result from Maxwell's equations if one neglects the magnetic induction or the electric displacement current. Thus, the electromagnetic waves which result from the coupling of magnetic induction and electric displacement current are neglected in electro- and magneto-quasistatics. No derivatives with respect to time occur in the quasistatic equations. However, this does not mean that the sources and therefore the fields themselves are time-independent. On the contrary, the fields are determined by their sources at a given time independently of the state of the sources a moment earlier. In general, the sources are not given but rather determined by the fields themselves via the Lorentz force which acts on the particles present in the field. In this context, the time dependencies have to be taken into account again. The "secondary" fields $\mathbf{H}$ in electro-quasistatics (EQS for short) and $\mathbf{E}$ in magneto-quasistatics (MQS for short) may be obtained from the time-dependent equations: the continuity equation in electro-quasistatics and the first Maxwell's equation (1.1) in magneto-quasistatics.

Electro-Quasistatics. Electro-quasistatics gives a reasonable approximation for low frequency fields which can be regarded as irrotational (i.e., magnetic induction may be neglected), while the effects of the displacement current play an important role. So, we assume

$$\frac{\partial \mathbf{B}}{\partial t} = 0, \qquad \frac{\partial \mathbf{D}}{\partial t} \neq 0.$$

Then Maxwell's equations amount to

$$\operatorname{curl} \mathbf{E} = 0$$
$$\operatorname{curl} \mathbf{H} = \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J}$$
$$\operatorname{div} \mathbf{D} = \rho$$
$$\operatorname{div} \mathbf{B} = 0$$

with $\mathbf{J} = \mathbf{J}_L + \mathbf{J}_E$, i.e., assuming no electric charges in motion. For harmonic time dependence, $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}(\mathbf{r}) \cos(\omega t + \varphi)$, we can use the ansatz

$$\mathbf{E}(\mathbf{r}, t) = \operatorname{Re}\left( \underline{\mathbf{E}}(\mathbf{r}) e^{i \omega t} \right), \qquad \mathbf{H}(\mathbf{r}, t) = \operatorname{Re}\left( \underline{\mathbf{H}}(\mathbf{r}) e^{i \omega t} \right) \tag{1.22}$$
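with the complex amplitudes $\underline{\mathbf{E}}(\mathbf{r}) = \mathbf{E}(\mathbf{r}) e^{i \varphi}$ and $\underline{\mathbf{H}}(\mathbf{r})$, where $\varphi$ is the phase angle of the cosine function. $\underline{\mathbf{E}}$ and $\underline{\mathbf{H}}$ are often also called phasors. Differentiating and taking out the term $e^{i \omega t}$, we get

$$\operatorname{curl} \underline{\mathbf{E}} = 0 \tag{1.23}$$
$$\operatorname{curl} \underline{\mathbf{H}} = i \omega \underline{\mathbf{D}} + \kappa \underline{\mathbf{E}} + \underline{\mathbf{J}}_E \tag{1.24}$$
$$\operatorname{div} \underline{\mathbf{D}} = \underline{\rho} \tag{1.25}$$
$$\operatorname{div} \underline{\mathbf{B}} = 0. \tag{1.26}$$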
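Equations (1.23) and (1.25) are sufficient to determine $\underline{\mathbf{E}}$ uniquely and are, therefore, the fundamental equations of electro-quasistatics.

Magneto-Quasistatics. For slowly varying fields in which the main role is played by the magnetic flux density, one can neglect the contribution of the displacement currents, i.e.,

$$\max_{\mathbf{r} \in \mathbb{R}^3} \left| \frac{\partial \mathbf{D}}{\partial t} \right| \ll \max_{\mathbf{r} \in \mathbb{R}^3} |\mathbf{J}|,$$

where $\mathbf{J} = \mathbf{J}_L + \mathbf{J}_E = \kappa \mathbf{E} + \mathbf{J}_E$. The resulting system of equations (footnote 3) is then given by

$$\operatorname{curl} \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$$
$$\operatorname{curl} \mathbf{H} = \mathbf{J}$$
$$\operatorname{div} \mathbf{D} = \rho$$
$$\operatorname{div} \mathbf{B} = 0.$$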
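For harmonic time dependence, the ansatz (1.22) may be used. Upon differentiating and cancelling $e^{i \omega t}$, Maxwell's equations yield

$$\operatorname{curl} \underline{\mathbf{E}} = -i \omega \underline{\mathbf{B}} \tag{1.27}$$
$$\operatorname{curl} \underline{\mathbf{H}} = \underline{\mathbf{J}} \tag{1.28}$$
$$\operatorname{div} \underline{\mathbf{D}} = \underline{\rho} \tag{1.29}$$
$$\operatorname{div} \underline{\mathbf{B}} = 0. \tag{1.30}$$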
(1.28) and (1.30) are sufficient to determine $\underline{\mathbf{H}}$ uniquely and are, therefore, the fundamental equations of magneto-quasistatics. The continuity condition $\operatorname{div} \underline{\mathbf{J}} = 0$ follows from (1.28).

Footnote 3: Magneto-quasistatics is often referred to simply as quasistatics in the literature.
Footnote 4: The discretization of electro-quasistatics is described in section 2.3. Furthermore, electro-quasistatics is one main topic of subsections 3.10 and 4.5. Electro-quasistatics and the suitability of the quasistatic approximation for certain problems are mostly discussed very briefly in the literature; electro-quasistatics is often not remarked upon at all. The argumentation here follows [120], where electro- and magneto-quasistatics are treated in great detail.

Conditions for Quasistatic Fields. To determine whether the quasistatic approximation is a suitable model for a given problem (footnote 4), we may study the error fields (satisfying the equations obtained by subtracting the simplified Maxwell's equations from the full ones). The error fields should be small compared to the quasistatic field. In order to facilitate discussion of the orders of magnitude, we assume that all space quantities do not change by more than a factor of two. In this case, we can speak of a typical length $L$ of the setup. Here are two examples. For electro-quasistatics, consider a pair of ideally conducting spheres with radii and distance between the spheres of order $L$ which are excited by a voltage source. For magneto-quasistatics, consider an ideally conducting loop with alternating current applied to it. The radius of the loop and the diameter of its cross-section are of order $L$. If we think of a medium as a combination of an "ideal conductor" and an "ideal insulator", we may use the following rule of thumb to decide if a problem is electro- or magneto-quasistatic. The frequency of the driving source should be decreased so that the fields become static. If, at this point, the magnetic field vanishes, the field is electro-quasistatic. If the electric field vanishes, the field is magneto-quasistatic. Since many metals are very good conductors and many gases, liquids, and solids are very good insulators, this rule of thumb is not too unrealistic. To estimate the magnitude of the fields, we let $L$ be the typical length for the setup in question. Then the spatial derivatives of the curl and divergence operators can be replaced by $1/L$, which implies
$$E = \frac{\rho L}{\varepsilon_0} \ \text{for EQS} \qquad \text{and} \qquad H = J L \ \text{for MQS} \tag{1.31}$$
where $E$, $H$, $J$ stand for typical values of $\mathbf{E}$, $\mathbf{H}$, $\mathbf{J}$. For a sine-like excitation, the characteristic time $\tau$ for the sine-like solution to the oscillation equation is given by the inverse of the frequency $\omega$. In case of electro-quasistatics, a time-varying charge causes a current, which in turn induces an $\mathbf{H}$-field. In case of magneto-quasistatics, the time-varying currents cause a time-varying $\mathbf{H}$-field, which causes an $\mathbf{E}$-field. Using (1.31), we obtain

$$H = \frac{\varepsilon_0 E L}{\tau} = \frac{L^2 \rho}{\tau} \ \text{for EQS} \qquad \text{and} \qquad E = \frac{\mu_0 H L}{\tau} = \frac{\mu_0 J L^2}{\tau} \ \text{for MQS}.$$
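To get an estimate for the error caused by neglecting magnetic induction (or displacement current), the corresponding estimates are substituted into the full Maxwell's equations. Then we get the following estimates for the error fields:

$$E_{\text{error}} = \frac{\mu_0 \rho L^3}{\tau^2} \ \text{for EQS} \qquad \text{and} \qquad H_{\text{error}} = \frac{\varepsilon_0 \mu_0 J L^3}{\tau^2} \ \text{for MQS}.$$

An application of (1.31) gives

$$\frac{E_{\text{error}}}{E} = \frac{\mu_0 \varepsilon_0 L^2}{\tau^2} \ \text{for EQS} \qquad \text{and} \qquad \frac{H_{\text{error}}}{H} = \frac{\varepsilon_0 \mu_0 L^2}{\tau^2} \ \text{for MQS}$$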
as an estimate for the relative error caused by using the quasistatic equations instead of the full Maxwell's equations. Electro-quasistatics as well as magneto-quasistatics are based on the assumption of sufficiently slow time-variation (low frequencies) and sufficiently small dimensions so that

$$\frac{L}{c} \ll \tau$$

(recall that $c = 1/\sqrt{\varepsilon_0 \mu_0}$ is the speed of light). The quasistatic approximation is therefore valid if an electromagnetic wave can propagate over the characteristic length of the setup in a time which is small compared to the time $\tau$.
One decides which of the two quasistatic approximations to use by comparing the given fields with the fields which would exist in the static case. In our example with the spheres, if the system is excited by a constant voltage source, the spheres are charged and the charge causes an electric field. But in this static borderline case there exist no currents and hence no magnetic field. Therefore, the static system is dominated by the electric field, so it is appropriate to assume the setup is electro-quasistatic even when the excitation varies with time. If a direct current is applied to the loop in the second example, the circulating current will cause a magnetic field, but there exist no charges and therefore no electric field. Thus, magneto-quasistatics is the appropriate approximation. In [120], a circular plate capacitor is given as an example of an electro-quasistatic setup. If this plate capacitor is excited with a frequency of 1 MHz, the quasistatic equations give a good approximation for the actual field, as long as the radius of the plate is much smaller than 300 m. The same book contains an overview of other practical applications of quasistatics. Here is a list of some of them: The skin effect at transmission lines is magneto-quasistatic. The processes in transistors and in the picture tube of a television set, which converts signals into picture and sound, are electro-quasistatic. Electric currents in the nerve lines and other electric activities in the brain are electro-quasistatic. Electrical power supply systems give more examples of electro- and magneto-quasistatics. For instance, the generator fields in a power plant are magneto-quasistatic, while most electronics in the control room are electro-quasistatic. The high voltage power transmission system may be regarded as electro-quasistatic. The specification of the insulator function starts off with EQS approximations.
However, after an electric breakdown there flow enough error currents to make an MQS approximation more appropriate. Many aspects of an overland line are electro- or magneto-quasistatic. To lightnings, however, these approximations are no longer applicable. We shall study humid high-voltage insulators in subsection 4.5 as an important example of applications of electro-quasistatics.
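The validity criterion (transit time over the characteristic length small compared to the time scale T) can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names and the safety factor `margin` are illustrative assumptions and do not come from the text.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def transit_time(length_m: float) -> float:
    """Time an electromagnetic wave needs to cross the characteristic length."""
    return length_m / C0


def is_quasistatic(length_m: float, frequency_hz: float, margin: float = 0.1) -> bool:
    """True if L/c is below `margin` times the period T = 1/f.

    `margin` is an assumed, illustrative threshold for "small compared to T".
    """
    period = 1.0 / frequency_hz
    return transit_time(length_m) < margin * period


# Free-space wavelength at 1 MHz: the 300 m scale quoted for the plate capacitor.
wavelength_1mhz = C0 / 1.0e6
print(f"wavelength at 1 MHz: {wavelength_1mhz:.1f} m")
print(is_quasistatic(0.1, 1.0e6))    # plate radius of 10 cm
print(is_quasistatic(500.0, 1.0e6))  # structure larger than the wavelength
```

With these assumed numbers, a 10 cm capacitor at 1 MHz passes the test by several orders of magnitude, while a 500 m structure does not.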
1.3.3 General Time-Dependent Fields and Electromagnetic Waves

In the case of fast-varying fields, where neither the time variation of the magnetic induction, $\partial B/\partial t$, nor that of the density of the displacement current, $\partial D/\partial t$, may be neglected, the full set of Maxwell's equations has to be solved. Maxwell's equations state that the electric and magnetic fields are interrelated. The time-dependent change of the fields propagates with finite velocity through space. It is called an electromagnetic wave^5. In engineering, electromagnetic waves are generated for the purpose of energy and signal transmission.

General Time-Dependent Fields. For fields with general time dependence, where no terms in Maxwell's equations may be neglected, it is convenient to rewrite Maxwell's equations as an initial-boundary value problem for a system of differential equations. For this purpose, the field quantities E and H first have to be normalized appropriately. We use the normalizations

$$E = \sqrt{Z_0}\,E' \tag{1.32}$$
$$H = \sqrt{Y_0}\,H' \tag{1.33}$$

with the square roots of the wave impedance $Z_0 = \sqrt{\mu_0/\varepsilon_0}$ and of the admittance $Y_0 = 1/Z_0$, and with $\varepsilon = \varepsilon_r\varepsilon_0$, $\mu = \mu_r\mu_0$. This normalization results in E' and H' having the same dimension. Introducing an unknown function

$$u(t) := \begin{pmatrix} E' \\ H' \end{pmatrix}$$

and an excitation function

$$q(t) = \begin{pmatrix} -\dfrac{\rho v}{\varepsilon\sqrt{Z_0}} \\ 0 \end{pmatrix}$$

for the case of moving charges, we can rewrite the first two Maxwell's equations as

$$\dot{u}(t) = L\,u(t) + q(t) \tag{1.34}$$

where

$$L := \begin{pmatrix} -\dfrac{\kappa}{\varepsilon} & \dfrac{c_0}{\varepsilon_r}\operatorname{curl} \\ -\dfrac{c_0}{\mu_r}\operatorname{curl} & 0 \end{pmatrix}, \qquad c_0 = \frac{1}{\sqrt{\varepsilon_0\mu_0}}.$$

Adding the initial conditions

$$u(t_0) = \begin{pmatrix} E'(t_0) \\ H'(t_0) \end{pmatrix},$$

we get an initial-value problem. Its formal solution is given by

$$u(t) = u(t_0) + \int_{t_0}^{t} \bigl(L\,u(t') + q(t')\bigr)\,dt'. \tag{1.35}$$

^5 The propagation speed in vacuum is just the velocity of light.
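As a plausibility check of the normalization, the first-order system can be recovered in two lines. The sketch below assumes constant scalar $\varepsilon$, $\mu$, $\kappa$ and a current density $J = \kappa E + \rho v$ (conduction plus moving charges); these assumptions and the abbreviation $c_0 = 1/\sqrt{\varepsilon_0\mu_0}$ are introduced here for illustration only.

```latex
\begin{align*}
\varepsilon\,\partial_t E = \operatorname{curl} H - \kappa E - \rho v
\;&\Longrightarrow\;
\partial_t E' = \frac{Y_0}{\varepsilon}\operatorname{curl} H'
              - \frac{\kappa}{\varepsilon}\,E'
              - \frac{\rho v}{\varepsilon\sqrt{Z_0}},
&\frac{Y_0}{\varepsilon} &= \frac{c_0}{\varepsilon_r},\\[4pt]
\mu\,\partial_t H = -\operatorname{curl} E
\;&\Longrightarrow\;
\partial_t H' = -\frac{Z_0}{\mu}\operatorname{curl} E',
&\frac{Z_0}{\mu} &= \frac{c_0}{\mu_r}.
\end{align*}
```

Under these assumptions both curl terms carry the same factor $c_0$, which reflects the fact that $E'$ and $H'$ have the same dimension after normalization.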
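Time-harmonic Oscillations and Waves. Electromagnetic waves with periodic time dependence are of special importance. For harmonic time dependence, we can use the ansatz (1.22). Upon differentiating and cancelling the factor $e^{i\omega t}$ in Maxwell's equations, we get

$$\operatorname{curl}\underline{E} = -i\omega\underline{B}, \qquad \operatorname{curl}\underline{H} = i\omega\underline{D} + \underline{J}_L + \underline{J}_E,$$
$$\operatorname{div}\underline{D} = \underline{\rho}, \qquad \operatorname{div}\underline{B} = 0.$$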
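1.3.4 Overview and Solution Methods

Stationary Fields
  Electrostatics:              $\operatorname{curl} E = 0$,  $\operatorname{div} D = \rho$
  Magnetostatics:              $\operatorname{curl} H = J_E$,  $\operatorname{div} B = 0$
  Stationary Currents:         $\operatorname{curl} E = 0$,  $\operatorname{div} J_L = 0$

Quasistatic Fields
  Electro-Quasistatics:        $\operatorname{curl} E = 0$,  $\operatorname{curl} H = \partial D/\partial t + J$,  $\operatorname{div} D = \rho$,  $\operatorname{div} B = 0$
  Magneto-Quasistatics:        $\operatorname{curl} E = -\partial B/\partial t$,  $\operatorname{curl} H = J$,  $\operatorname{div} D = \rho$,  $\operatorname{div} B = 0$

General Time-Dependent Fields and Electromagnetic Waves
  Time-Harmonic Oscillations:  $\operatorname{curl}\underline{E} = -i\omega\underline{B}$,  $\operatorname{curl}\underline{H} = i\omega\underline{D} + \underline{J}_L + \underline{J}_E$,  $\operatorname{div}\underline{D} = \underline{\rho}$,  $\operatorname{div}\underline{B} = 0$
  General Case:                $\operatorname{curl} E = -\partial B/\partial t$,  $\operatorname{curl} H = \partial D/\partial t + J$,  $\operatorname{div} D = \rho$,  $\operatorname{div} B = 0$

Table 1.1. Classification of electromagnetic fields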
Table 1.1 gives a summary of the classification of electromagnetic fields. Typical solution methods (the choice depends on the type of the problem) are analytical solution, numerical methods with field-theoretical orientation, and lumped circuit formalisms. In the sequel, we discuss the usual analytical solution methods. Some special numerical methods of field computation are described in subsections 2.1 and 2.3. Often one has to resort to the application of appropriate lumped circuits.
1.4 Analytical Solution Methods

Now we discuss some analytical solution methods, namely the potential method (for electro- and magnetostatics, stationary current fields, electro-quasistatics, and the wave equation), the decoupling of Maxwell's equations by differentiation, and the separation method (for the Helmholtz equation and waves in circular waveguides). We assume that $\varepsilon$, $\mu$ and $\kappa$ are constants.

1.4.1 Potential Theory

To solve Maxwell's equations, one frequently uses potentials. Potential theory is especially useful if the fields are static.

Electrostatics. Since the electrostatic field is irrotational, we have $\operatorname{curl} E = 0$, so the field can be described uniquely by a scalar potential function (the potential, for short) $\varphi$:

$$E(r) = -\operatorname{grad}\varphi(r) = -\nabla\varphi$$

since $\operatorname{curl}\operatorname{grad}\varphi \equiv 0$. For a linear isotropic material, we have $D = \varepsilon E$. Substituting this into the divergence equation, we get the Poisson equation (also called the potential equation) for a homogeneous material ($\varepsilon = \text{const.}$)

$$\Delta\varphi = -\frac{\rho}{\varepsilon}.$$

Recall that the Laplace operator $\Delta$ equals $\nabla^2$. In charge-free space, $\rho = 0$, so the potential satisfies the Laplace equation

$$\Delta\varphi = 0.$$

The Laplace (and Poisson) equations are elliptic differential equations.

Magnetostatics. Since $\operatorname{div} B = 0$, the vector B is source-free and hence may be expressed as the curl of some vector potential A:

$$B = \operatorname{curl} A. \tag{1.36}$$

This vector potential is unique up to the gradient of some scalar $\varphi$. Choose $\varphi$ so as to obtain the so-called Coulomb gauge $\operatorname{div} A = 0$. This is very suitable for static problems. In a linear isotropic medium, we have

$$B = \mu H. \tag{1.37}$$
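Substituting (1.36) and (1.37) in (1.18) yields

$$\operatorname{curl}\Bigl(\frac{1}{\mu}\operatorname{curl} A\Bigr) = J_E.$$

For homogeneous media, this is equivalent to

$$\frac{1}{\mu}\bigl(\operatorname{grad}\operatorname{div} A - \Delta A\bigr) = J_E,$$

which then leads (with Coulomb gauge) to

$$\Delta A = -\mu J_E.$$

Stationary Current Fields. As in electrostatics, the electric field can be uniquely described by a scalar potential $\varphi$ because $\operatorname{curl} E = 0$:

$$E = -\operatorname{grad}\varphi.$$

Since $J_L = \kappa E$ and $\kappa = \text{const}$, we obtain

$$\kappa\operatorname{div} E = -\kappa\operatorname{div}\operatorname{grad}\varphi = 0,$$

i.e., the Laplace equation

$$\Delta\varphi = 0.$$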
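Both the charge-free electrostatic potential and the potential of a stationary current field satisfy the Laplace equation. As a small numerical illustration, the sketch below checks that the potential $1/r$ of a point source (constant prefactor dropped) is harmonic away from the source. The central-difference Laplacian and the sample point are choices made for this example only.

```python
import math


def coulomb_potential(x, y, z):
    """Potential 1/r of a point source at the origin (constant prefactor dropped)."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)


def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference approximation of the Laplace operator."""
    centre = f(x, y, z)
    return (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
        - 6.0 * centre
    ) / (h * h)


# Evaluate at an arbitrary point away from the singularity at r = 0.
residual = laplacian(coulomb_potential, 0.7, -0.4, 1.1)
print(f"discrete Laplacian of 1/r: {residual:.2e}")
```

The printed residual is small compared with the individual second derivatives, which is what the Laplace equation predicts up to the discretization error of order h squared.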
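Electro-Quasistatics. Since the electro-quasistatic field is irrotational, it is uniquely defined by some scalar potential function. In the time-harmonic case, $\operatorname{curl}\underline{E} = 0$ with the complex amplitude $\underline{E}$. As in the real case, we choose a complex scalar potential $\underline{\varphi}$:

$$\underline{E} = -\operatorname{grad}\underline{\varphi}. \tag{1.38}$$

Since $\operatorname{div}\operatorname{curl} \equiv 0$, the system (1.23)-(1.26) is equivalent to the equation

$$\operatorname{div}\bigl((i\omega\varepsilon + \kappa)\underline{E} + \underline{J}_E\bigr) = 0 \tag{1.39}$$
$$\Leftrightarrow\quad \operatorname{div}\bigl((i\omega\varepsilon + \kappa)\underline{E}\bigr) = -\operatorname{div}\underline{J}_E. \tag{1.40}$$

Substituting (1.38) into (1.40), we get

$$\operatorname{div}\bigl((i\omega\varepsilon + \kappa)\operatorname{grad}\underline{\varphi}\bigr) = \operatorname{div}\underline{J}_E.$$

In the case of homogeneous isotropic media, this implies the following Poisson equation

$$\Delta\underline{\varphi} = (i\omega\varepsilon + \kappa)^{-1}\operatorname{div}\underline{J}_E,$$

which is formally identical to the fundamental equation of electrostatics, but with complex potential and complex right-hand side.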
Wave Equation. Starting from Maxwell's equations (1.1)-(1.4), one obtains the wave equations under the assumption of a homogeneous material, i.e., if $\varepsilon$, $\mu$, and $\kappa$ are assumed to be constant in the whole domain. Because B is source-free (by (1.4)), we can again choose a vector potential A so that

$$B = \operatorname{curl} A.$$

With this ansatz,

$$\operatorname{curl} E = -\operatorname{curl}\frac{\partial A}{\partial t}$$

results from (1.1). In other words, the vector $E + \partial A/\partial t$ is irrotational and may be written as the gradient of a scalar potential:

$$E + \frac{\partial A}{\partial t} = -\operatorname{grad}\varphi.$$

The electric field strength E and the magnetic flux density B can therefore be computed from A and $\varphi$:

$$E = -\operatorname{grad}\varphi - \frac{\partial A}{\partial t}, \tag{1.41}$$
$$B = \operatorname{curl} A. \tag{1.42}$$
Like the vector potential A, the scalar potential
+ p,E ~~ = O.
If
Avoiding the potential method, one can solve Maxwell's equations (1.1)-(1.4) for the electromagnetic field quantities in the general case directly analytically. Again, $\varepsilon$, $\mu$ and $\kappa$ are assumed to be constant. Furthermore, let $J = J_L$.
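Apply curl to (1.1), differentiate (1.2) with respect to time, and use the expressions for H and D together with the definition of $J_L$. We get

$$\operatorname{curl}\operatorname{curl} E = -\varepsilon\mu\frac{\partial^2 E}{\partial t^2} - \kappa\mu\frac{\partial E}{\partial t}. \tag{1.43}$$

Analogously,

$$\operatorname{curl}\operatorname{curl} H = -\varepsilon\mu\frac{\partial^2 H}{\partial t^2} - \kappa\mu\frac{\partial H}{\partial t}. \tag{1.44}$$

For any vector field F,

$$\operatorname{curl}\operatorname{curl} F = -\nabla^2 F + \operatorname{grad}\operatorname{div} F.$$

This expression (with F = E, H) is substituted into (1.43) and (1.44). Note that $\operatorname{grad}\operatorname{div} H = 0$ because of the condition (1.4) and $\mu = \text{const}$. If the charge density $\rho$ equals zero, then $\operatorname{grad}\operatorname{div} E = 0$ as well because of (1.3). Therefore, F = E and F = H each satisfy the equation of a damped wave:

$$\nabla^2 F = \varepsilon\mu\frac{\partial^2 F}{\partial t^2} + \kappa\mu\frac{\partial F}{\partial t}. \tag{1.45}$$

Thus, for materials with $\kappa = 0$, F = E and F = H satisfy the wave equation

$$\nabla^2 F = \varepsilon\mu\frac{\partial^2 F}{\partial t^2}$$

with propagation speed $1/\sqrt{\varepsilon\mu}$; in vacuum it equals the speed of light. In the case of excitation ($\rho \neq 0$), we get the inhomogeneous wave equation for the electric field E:

$$\nabla^2 E - \varepsilon\mu\frac{\partial^2 E}{\partial t^2} - \kappa\mu\frac{\partial E}{\partial t} = \frac{\operatorname{grad}\rho}{\varepsilon},$$

while H still satisfies the wave equation (1.45). The wave equations comprise a pair of decoupled hyperbolic differential equations of second order.

1.4.3 Method of Separation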
An important yet simple method of solution of linear differential equations in several variables is the separation of variables. In the following, we will start with the wave equation and separate: first, time from spatial parameters, then individual coordinates. These arguments will be important later in the book. Separation of variables is good for solving various other problems, which are not discussed in the book.
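Helmholtz Equation. The ansatz

$$F(r, t) = \underline{F}(r)\cdot\underline{T}(t); \qquad \underline{F}, \underline{T} \text{ complex functions};$$

implies the Helmholtz equation for $\underline{F}(r)$ and the equation of a damped oscillator with the solution

$$\underline{T}(t) = \underline{T}_0\,e^{\pm i\underline{\omega} t}$$

for $\underline{T}(t)$. Here $\underline{\omega}$ and the wave number $\underline{k}$ are complex quantities which depend on each other in accordance with $\underline{k}^2 = \underline{\omega}^2\varepsilon\mu \mp i\underline{\omega}\mu\kappa$. The solution $\underline{T}(t)$ describes a harmonic oscillation. Consequently, this allows the following ansatz for the electric and magnetic field:

$$F(r, t) = \operatorname{Re}\bigl(\underline{F}(r)\,e^{i\underline{\omega} t}\bigr).$$

Thus, Maxwell's equations for time-harmonic fields with $\underline{J}_E = 0$ can be rewritten as (equations (1.46)-(1.49))

$$\operatorname{curl}\underline{H} = i\omega\varepsilon\underline{E} + \kappa\underline{E}, \qquad \operatorname{curl}\underline{E} = -i\omega\mu\underline{H},$$
$$\operatorname{div}\underline{E} = 0, \qquad \operatorname{div}\underline{H} = 0.$$

For time-harmonic fields, the wave equations become the homogeneous and inhomogeneous Helmholtz equations

$$(\nabla^2 + \underline{k}^2)\,\underline{E}(r) = 0, \qquad (\nabla^2 + \underline{k}^2)\,\underline{H}(r) = 0$$

and

$$(\nabla^2 + \underline{k}^2)\,\underline{E}(r) = \frac{\operatorname{grad}\underline{\rho}(r)}{\varepsilon}, \qquad (\nabla^2 + \underline{k}^2)\,\underline{H}(r) = 0.$$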
If the boundary curves of the setup coincide with coordinate planes of the given coordinate system, it is possible to separate those coordinates from the remaining ones in the Helmholtz equations.
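The relation between frequency and complex wave number stated for the Helmholtz equation, $k^2 = \omega^2\varepsilon\mu - i\omega\mu\kappa$ for the time dependence $e^{+i\omega t}$, is easy to evaluate numerically. The following sketch uses the principal complex square root; the frequency and the conductivity are hypothetical values chosen only for illustration.

```python
import cmath
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU0 = 4.0e-7 * math.pi    # vacuum permeability, H/m (classical value)


def wave_number(omega, eps, mu, kappa):
    """Complex wave number k with k^2 = omega^2*eps*mu - i*omega*mu*kappa.

    The minus sign belongs to the time dependence exp(+i*omega*t).
    cmath.sqrt returns the principal root (non-negative real part).
    """
    return cmath.sqrt(omega * omega * eps * mu - 1j * omega * mu * kappa)


omega = 2.0 * math.pi * 1.0e9                    # 1 GHz, hypothetical
k_vacuum = wave_number(omega, EPS0, MU0, 0.0)    # lossless: k = omega / c
k_lossy = wave_number(omega, EPS0, MU0, 0.01)    # kappa = 0.01 S/m, hypothetical

print(f"k (vacuum): {k_vacuum.real:.4f} rad/m")
print(f"k (lossy):  {k_lossy.real:.4f} {k_lossy.imag:+.4f}i rad/m")
```

For nonzero conductivity the imaginary part of k is negative, so a wave of the form $e^{i(\omega t - kz)}$ decays in the direction of propagation, in line with the damped wave equation.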
Waves in Circular Cylindrical Waveguides. Waves in circular cylindrical waveguides propagate in z-direction (in (r,
$$\underline{E}^{(H)} = \operatorname{curl}\underline{A}^{(H)} \qquad (H_z\text{-waves}),$$
$$\underline{H}^{(E)} = \operatorname{curl}\underline{A}^{(E)} \qquad (E_z\text{-waves}).$$
These vector potentials each satisfy Helmholtz equation
In many applications, one solves the Helmholtz equation by separation of variables. The solutions depend on the given coordinate system. Often the Cartesian, the circular cylindrical and the spherical coordinate systems are used. The cylindrical coordinate system (r,
$$\nabla^2\underline{A}_z^{(E,H)} + k_0^2\,\underline{A}_z^{(E,H)} = 0, \qquad k_0^2 = \omega^2\mu_0\varepsilon_0. \tag{1.50}$$
For the separation of variables, we choose the product ansatz of Bernoulli
Now apply the \7-operator in cylindrical coordinates. To obtain g(
whose most general solution is
In general, $Z_\mu$ is a combination of a Neumann function and a Bessel function:
In the case of homogeneous circular waveguides, $Z_\mu$ equals the $\mu$-th Bessel function $J_\mu$. The solution of the scalar Helmholtz equation can be completely described by the vector potential
(1.51)
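with the separation equation

$$K^2 + k_z^2 = k_0^2. \tag{1.52}$$

The eigenvalue K is given by $K^{(H)}_{\mu n} = j'_{\mu n}/r_0$ or $K^{(E)}_{\mu n} = j_{\mu n}/r_0$, where $r_0$ is the radius of the waveguide and $j_{\mu n}$, $j'_{\mu n}$ denote the n-th zeros of $J_\mu$ and of its derivative $J'_\mu$, respectively. $K^{(H)}_{\mu n}$ and $K^{(E)}_{\mu n}$ are determined by the boundary conditions on the wall of the waveguide: for perfectly conducting walls they are $n \times E = 0$, $n \cdot H = 0$. The functions $T^{(E,H)}(r, \varphi)$ are called cross-sectional eigenfunctions of the $E_z$- and $H_z$-waves.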
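The eigenvalues K of a circular waveguide follow from zeros of Bessel functions, and the separation equation then gives the cutoff frequency of each mode ($k_z = 0$, hence $k_0 = K$). The sketch below uses only the standard library: $J_\mu$ is evaluated from its integral representation with the trapezoidal rule, and zeros are located by a sign-change scan with bisection. The radius `r0`, the helper names, and the choice of modes are assumptions made for illustration.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def bessel_j(order: int, x: float, samples: int = 400) -> float:
    """J_order(x) = (1/pi) * int_0^pi cos(order*t - x*sin(t)) dt (trapezoidal rule)."""
    h = math.pi / samples
    total = 0.5 * (1.0 + math.cos(order * math.pi))  # half-weighted end points t = 0, pi
    for i in range(1, samples):
        t = i * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def bessel_j_prime(order: int, x: float) -> float:
    """Derivative via the recurrence J'_n = (J_{n-1} - J_{n+1}) / 2."""
    return 0.5 * (bessel_j(order - 1, x) - bessel_j(order + 1, x))


def positive_zeros(func, count, x_max=40.0, step=0.05):
    """First `count` positive zeros of func: sign-change scan followed by bisection."""
    found = []
    left, f_left = step, func(step)
    while len(found) < count and left < x_max:
        right = left + step
        f_right = func(right)
        if f_left * f_right < 0.0:
            lo, hi, f_lo = left, right, f_left
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = func(mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            found.append(0.5 * (lo + hi))
        left, f_left = right, f_right
    return found


r0 = 0.05  # waveguide radius in metres (hypothetical)

j_01 = positive_zeros(lambda x: bessel_j(0, x), 1)[0]              # E-wave, mu = 0
j_11_prime = positive_zeros(lambda x: bessel_j_prime(1, x), 1)[0]  # H-wave, mu = 1

K_E, K_H = j_01 / r0, j_11_prime / r0
f_cut_E = C0 * K_E / (2.0 * math.pi)   # cutoff: k_z = 0, so k_0 = K
f_cut_H = C0 * K_H / (2.0 * math.pi)
print(f"j_01 = {j_01:.6f}, j'_11 = {j_11_prime:.6f}")
print(f"cutoff E_01: {f_cut_E / 1e9:.3f} GHz, cutoff H_11: {f_cut_H / 1e9:.3f} GHz")
```

In practice a library routine for Bessel zeros would normally be preferred; the hand-written version is used here only to keep the example self-contained.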
1.5 Boundary Value Problems

In practice, it is not the general solution of the field problem which is of interest; rather, the electromagnetic fields should be determined for a special subdomain of space. Then the fields have to satisfy given boundary conditions, as we already saw in the case of the circular cylindrical waveguide.

1.5.1 Boundary Value Problems of the Potential Theory

In potential theory, the following kinds of boundary value problems for the Poisson equation are considered most often:
- The problem '$\Delta\varphi = \rho$ in a domain G, $\varphi = \varphi_1$ on the boundary $\partial G$' is called 'the first boundary value problem' or the Dirichlet (boundary value) problem; $\varphi_1$ is some function of r defined on $\partial G$.
- The problem '$\Delta\varphi = \rho$ in a domain G, $\partial\varphi/\partial n = \varphi_2$ on the boundary $\partial G$' is called 'the second boundary value problem' or the Neumann (boundary value) problem; $\varphi_2$ is some function of r defined on $\partial G$. For this problem to be solvable, we have to demand $\int_{\bar{G}} \rho\,dV = \oint_{\partial G} \varphi_2\,dA$, where $\bar{G} = G \cup \partial G$.^6
- The problem '$\Delta\varphi = \rho$ in a domain G, $a\varphi + b\,\partial\varphi/\partial n = \varphi_3$ on the boundary $\partial G$' is called 'the third boundary value problem' or Newton's or mixed boundary value problem; $\varphi_3$ is some function of r defined on $\partial G$.

All of these boundary value problems are uniquely solvable under certain supplementary conditions (on the boundary $\partial G$ of the domain G). This topic is treated extensively in textbooks, e.g., [168]. Newton's boundary condition is especially important in thermal radiation. All the boundary conditions are, first of all, physically justified. Neumann's boundary condition may, however, also be used as a symmetry condition in a symmetric domain and can therefore also have geometrical reasons.

^6 This condition is an immediate consequence of the Gaussian integral theorem.
1.5.2 Further Boundary Conditions

Periodic Boundary Condition. Geometrical reasons justify the periodic boundary condition, which can be used for domains with periodicity in one or more coordinates. The reader may think, for example, of an iris-loaded waveguide as shown in subsection 2.1, Fig. 2.1, or of a ridged waveguide as is common in microwave engineering. The waves in those so-called slow wave structures propagate with lower phase velocity than in a plain waveguide. The Floquet theorem [66] for propagating waves in periodic structures states that the waves can be described as modes for which the fields at the distance of one period only differ by a constant complex phase factor, i.e., the formulas

$$\underline{E}(z + L) = \underline{E}(z)\,e^{i\beta}, \qquad \underline{H}(z + L) = \underline{H}(z)\,e^{i\beta}$$

hold in the case of longitudinal periodicity. Correspondingly,

$$\underline{E}(\varphi + L) = \underline{E}(\varphi)\,e^{i\beta}, \qquad \underline{H}(\varphi + L) = \underline{H}(\varphi)\,e^{i\beta}$$

in the case of azimuthal periodicity. Corresponding expressions may be derived for the energy density. Thus, a prescribed fixed phase advance can be used as boundary condition in time-harmonic field simulations. Periodic boundary conditions for static problems can be derived similarly. They are used, for example, in field simulations for azimuthally periodic motors such that only a fraction of the periodic structure needs to be discretized.

Open Boundary Condition. For the numerical solution of many problems it is necessary to restrict a domain which would in fact extend to infinity in one or more, possibly all, coordinate directions to some finite domain. If the computational domain does not extend so far that the effect of the boundaries on the fields inside and around the computed structure can be neglected, some special type of boundary condition has to be introduced which reflects the physical effects. In this context, we speak of an open or absorbing boundary condition (ABC). The latter term is easily misleading, however.

First, we describe the static field case. Assume that the computational domain $\Omega$ contains all source terms. The coupling between the fields in the outer domain and the computational domain $\Omega$ uses the boundary potential $\varphi_\Gamma$ and the electric displacement $D_\Gamma$. The boundary potential $\varphi_\Gamma$ also gives the tangential electric field components on $\Gamma = \partial\Omega$. The displacement $D_\Gamma$ is normal to $\Gamma$. It can be shown that $D_\Gamma$ is uniquely determined by the boundary potential $\varphi_\Gamma$ via the linear boundary operator

$$L: \varphi_\Gamma \to D_\Gamma.$$
The boundary operator L describes the influence of the outer domain. The field computation can then be confined to the interior of the computational domain. This corresponds to Robin's boundary condition:

$$n_\Gamma \cdot D - L\varphi = 0 \quad \text{on } \Gamma = \partial\Omega.$$

For details, the reader is referred to [72], where an implementation of the first and second order approximations is also described. Analogously to the static case, the existence and uniqueness of an open boundary operator can be derived for time-dependent fields. A linear and bijective boundary operator can be found from the solution of Maxwell's equations for the unbounded domain. This operator then maps the tangential electric field at the boundary onto the tangential magnetic field such that the computation can be confined to a finite computational domain with the corresponding Robin boundary condition. As in the static case, one uses approximations, since the discretization of the exact operator would need a lot of computational resources compared to the discretization used inside the computational domain. One of the most common boundary operators is based on the factorization of the wave equation. For details on the open and absorbing boundary conditions, the reader is referred to [27], [72], [76], [90], [177], [180], [256], and [268]. Very good approximations to the open boundary operator were introduced in [29] and used, e.g., in [261].

Waveguide Boundary Condition. In technical applications, the solution domain often includes one or more waveguides. At a certain distance, the studied object no longer influences the fields in the waveguide. Therefore it is useful and permitted to cut the computational domain accordingly. The so-called waveguide boundary condition is then used where the boundary of the domain passes through a waveguide. It is a specific case of an open boundary condition. In the case of a monofrequent excitation, an exact open boundary condition can be given [76]. Generally, the field in a longitudinally homogeneous waveguide can be written as an infinite series of transverse eigenmodes. The idea is to separate the transverse electric fields in grid planes near the boundary, for each mode, into "incident" and "reflected" portions of the wave [76]. Then, the field can be determined at any place in the waveguide from the field distribution in a transverse plane (see, e.g., [277] or [72]).

1.5.3 Complete Systems of Orthogonal Functions

Under appropriate conditions, the solution of boundary value problems gives a linear subspace F of $L^2(G)$ and an orthogonal basis of that space F (a complete system of orthogonal functions). Then any function in F can be written as an (infinite) linear combination of the basis functions. Let {
$$\int_a^b f(x)\,g(x)\,dx =: (f, g)$$

is the scalar product on F. A function f may then be represented as

$$f(x) = \sum_{n=1}^{\infty} a_n\,\varphi_n(x)$$

where $(\varphi_n)$ is an orthonormal basis of F. Since

$$(\varphi_n, \varphi_m) = \delta_{mn},$$

the coefficients in the expansion of f equal $a_n = (f, \varphi_n)$.
For the Laplace and Helmholtz equations, which are each derived from Maxwell's equations, complete systems of orthogonal functions arise as solutions of the problem with appropriate boundary conditions. It is customary to study the Laplace equation

$$\nabla^2\psi = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} = 0$$

on a rectangular domain $[0, a] \times [0, b]$; this is a particularly simple setting (cf. [178]). For homogeneous Dirichlet boundary conditions at x = 0, x = a, y = b and an inhomogeneous boundary condition at y = 0, the most general solution can be written as

$$\psi = \sum_{n=1}^{\infty} A_n \sinh\Bigl[\frac{\pi n}{a}\,(b - y)\Bigr] \sin\Bigl(\frac{\pi n x}{a}\Bigr).$$
The values of the coefficients are not given here (see, e.g., [178]). The functions $\sin(\pi n x/a)$, where n is an integer, are eigenfunctions for the above problem. To expand a solution into a series of eigenfunctions, we separate variables in the equation according to the coordinates $\xi_s$ such that the boundary includes the coordinate plane $\xi_s = \text{const}$. The determination of suitable eigenfunctions and of the values of the separation constants which satisfy the boundary conditions (the eigenvalues) is the main task of this method. The expansion into a series of eigenfunctions then produces the solution of the boundary value problem. If $(\psi_n(z))$ is the obtained sequence of eigenfunctions, then every piecewise continuous function F(z) can be approximated by a series

$$F(z) = \sum_{n=0}^{\infty} A_n\,\psi_n(z)$$

between the boundary points a and b. Furthermore, one can show that a function with a finite number of discontinuities in (a, b) can be represented by a series of eigenfunctions which solves the linear least squares problem

$$\int_a^b \Bigl[F(z) - \sum_{n=0}^{\infty} A_n\,\psi_n(z)\Bigr]^2 dz = 0 \tag{1.53}$$

(least squares fit). However, at the discontinuities of the solution, partial sums of the series that represents the solution will have peaks of width decreasing to zero and of bounded height. Such series are obviously not differentiable and in general not uniformly convergent. But they are integrable, and the series obtained by integration converge uniformly. Therefore, a certain class of non-analytic functions may be represented by an infinite sum of eigenfunctions. If the boundary conditions at b are fairly "reasonable" (for details, see [178]), it is possible to show that the resulting eigenfunctions are mutually orthogonal:
Once they are normalized, we get the orthonormal sequence
(L 1/Jn) :
Here
A part of the next section covers the semi-analytical mode matching technique. Using this method, one determines the electromagnetic fields by orthonormal series expansion.
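The series solution of the Laplace equation on the rectangle, discussed earlier in this subsection, can be evaluated numerically once the boundary data at y = 0 are expanded in the eigenfunctions $\sin(\pi n x/a)$. The sketch below projects a boundary function g onto these eigenfunctions (midpoint rule) and sums the series; the coefficient $A_n$ of the text corresponds to $b_n/\sinh(\pi n b/a)$. The boundary value g = 1, the number of terms, and all function names are assumptions made for this example.

```python
import math


def sine_coefficients(g, a, n_terms, n_quad=2000):
    """b_n = (2/a) * int_0^a g(x) sin(n*pi*x/a) dx, evaluated by the midpoint rule."""
    h = a / n_quad
    coeffs = []
    for n in range(1, n_terms + 1):
        s = sum(
            g((i + 0.5) * h) * math.sin(n * math.pi * (i + 0.5) * h / a)
            for i in range(n_quad)
        )
        coeffs.append(2.0 * s * h / a)
    return coeffs


def sinh_ratio(alpha, y, b):
    """sinh(alpha*(b - y)) / sinh(alpha*b), written with decaying exponentials
    so that large alpha does not overflow."""
    return (
        math.exp(-alpha * y)
        * (1.0 - math.exp(-2.0 * alpha * (b - y)))
        / (1.0 - math.exp(-2.0 * alpha * b))
    )


def psi(x, y, a, b, coeffs):
    """Partial sum of the series; equals g(x) at y = 0 and vanishes at y = b."""
    total = 0.0
    for n, b_n in enumerate(coeffs, start=1):
        alpha = n * math.pi / a
        total += b_n * sinh_ratio(alpha, y, b) * math.sin(alpha * x)
    return total


a = b = 1.0
coeffs = sine_coefficients(lambda x: 1.0, a, n_terms=49)   # g(x) = 1 on y = 0
centre = psi(0.5, 0.5, a, b, coeffs)
print(f"psi at the centre of the unit square: {centre:.6f}")
```

For the unit square with potential 1 on one side and 0 on the other three, symmetry requires the value 1/4 at the centre (superposing the four rotated problems gives the constant solution 1), which provides a simple check of the partial sum.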
1.6 Bibliographical Comments

Since the topic of this section is a classical subject of theoretical physics and electrical engineering, no attempt will be made to give a complete list of relevant literature. A short discussion of the cited literature and some hints for further reading follow: The earliest reference is, of course, the book "A Treatise on Electricity and Magnetism" [172] by James Clerk Maxwell. The book by Jackson [142] (or [143]), "Classical Electrodynamics", is a classic itself. It covers all the basic and advanced topics and is much more precise than many other textbooks. It is the author's favorite for looking up something in this field of physics. Another classical reference is Stratton's book "Electromagnetic Theory" [251]. A very detailed book is "Electromagnetic Fields and Energy" by Haus and Melcher [120]. This book is, to the author's knowledge, the only book which also covers electro-quasistatics. It is carefully written and differs somewhat in its concept from that of usual textbooks. The book contains many practical examples to enhance the reader's physical intuition. The book by Morse and Feshbach [178], "Methods of Theoretical Physics", is another classical textbook. For students wishing to get a better understanding of electromagnetism, the author recommends Volume 2 of "The Feynman Lectures on Physics" [95] by Feynman, Leighton and Sands (mainly treating electromagnetism and matter). The book "Electromagnetics" [85] from Schaum's Outline Series offers many exercises. For readers interested primarily in microwave theory and technology, Collin's "Foundations for Microwave Engineering" [67] and "Field Theory of Guided Waves" [66] may be recommended. Finally, the book [168] by Lehner, a German textbook written for electrical engineers, is suggested because of its thorough treatment of the material.
2. Numerical Field Theory
Except for very simple geometric structures, Maxwell's equations cannot be solved analytically. A variety of methods for the evaluation of electromagnetic fields has been developed over time. In essence, all the methods can be divided into two classes: semi-analytical methods and discretization methods. In the following, we will discuss at least one method from each group and the properties of the related linear systems.

The best known semi-analytical methods are the mode matching technique, the method of integral equations, and the method of moments. In the method of integral equations, the given boundary value problem is transformed via an appropriate Green's function into an integral equation. In the method of moments, which includes the mode matching technique as a special case, the solution function is expressed as a linear combination of adequate basis functions. The treatment of complex geometrical structures is very difficult for these methods and often requires geometrical simplifications: in the method of integral equations, the Green's function has to satisfy the boundary conditions. In the mode matching technique, there must be a decomposition of the domain into subdomains such that the problem can be solved analytically in these subdomains and thus the basis functions can be given. Nevertheless, there exist applications where the semi-analytical solution methods are the best ones. For this reason, the mode matching technique is briefly discussed in the sequel.

The best known discretization methods are the finite element and the finite difference methods. These methods are applied to Poisson's equation, the wave equation, or the Helmholtz equation to compute the field numerically. But these partial differential equations of second order result only after the material parameters $\varepsilon$, $\mu$, and $\kappa$ are assumed to be constant. Therefore, numerical methods which are based on the partial differential equations can only be used in domains which are composed of a few subdomains, each with linear, homogeneous, and isotropic material. Then the differential equations are solved for a vector potential and/or a scalar potential or for E and/or H. Here the finite element and finite difference methods differ in their solution ansatz: The Finite Difference method (FD) is applied to the basic equation itself [174] (e.g., Poisson's equation). Choosing a difference method, one replaces all derivatives in the differential equations by some difference quotient. The Finite Element Method (FEM) approximates the solution by some function in a finite dimensional space. Polynomials with local support are used as a basis for this function space. A variational problem is solved [174], [265] instead of the basic differential equation. A short description of the Finite Element Method follows in subsection 2.2.

A drawback of these methods is that they are based on equations which hold only in homogeneous media, e.g., the wave equation. The decomposition into subdomains, which is undertaken in order to treat problems with piecewise homogeneous material filling, results in a complicated coupling of the equations at the interfaces. Therefore, this formulation is not always practical. A method which is especially well suited for the problems of field theory is the finite integration technique (FIT for short) [296], which is described in subsection 2.3.
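To make the finite difference idea concrete (every derivative replaced by a difference quotient), the following sketch solves the Poisson-type problem $\Delta\varphi = f$ on the unit square with $\varphi = 0$ on the boundary, using the standard five-point stencil. The Gauss-Seidel iteration, the grid size, and the manufactured test solution are choices made for this illustration and are not taken from the text.

```python
import math


def solve_poisson_dirichlet(f, n, sweeps=1000):
    """Gauss-Seidel solution of Laplace(phi) = f on the unit square, phi = 0 on the boundary.

    n is the number of grid intervals per direction (mesh width h = 1/n).
    Five-point stencil: (phi_E + phi_W + phi_N + phi_S - 4*phi_P) / h^2 = f_P.
    """
    h = 1.0 / n
    phi = [[0.0] * (n + 1) for _ in range(n + 1)]
    rhs = [[f(i * h, j * h) for j in range(n + 1)] for i in range(n + 1)]
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                phi[i][j] = 0.25 * (
                    phi[i + 1][j] + phi[i - 1][j]
                    + phi[i][j + 1] + phi[i][j - 1]
                    - h * h * rhs[i][j]
                )
    return phi


def exact(x, y):
    """Manufactured solution that vanishes on the boundary of the unit square."""
    return math.sin(math.pi * x) * math.sin(math.pi * y)


def source(x, y):
    """Right-hand side belonging to `exact`: Laplace(exact) = -2*pi^2*exact."""
    return -2.0 * math.pi ** 2 * exact(x, y)


n = 16
phi = solve_poisson_dirichlet(source, n)
max_error = max(
    abs(phi[i][j] - exact(i / n, j / n))
    for i in range(n + 1)
    for j in range(n + 1)
)
print(f"maximum error on a {n} x {n} grid: {max_error:.2e}")
```

The remaining error is the discretization error of the stencil, which decreases proportionally to the square of the mesh width; the linear system behind the iteration is sparse, which is the property that makes iterative solvers attractive for such problems.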
2.1 Mode Matching Technique

In this subsection, the mode matching technique is treated as a representative of the larger class of semi-analytical methods. The subsection gives a brief introduction to the mode matching technique. It is a method (which can be called classical in field theory [200], [273]) that uses expansions into a series of eigenfunctions. The eigenfunctions, which satisfy the given boundary conditions, are found by separation of variables in the differential equation. The mode matching technique can be used to compute fields in structures which can be decomposed into subdomains such that, for each subdomain, the analytical solution of Maxwell's equations can be given as a Fourier series or a Fourier-Bessel series. The solution for the complete structure can then be obtained by continuous matching of the fields at the interfaces of the subdomains. The assembling of these interfaces depends on the problem. The rest of this subsection is mostly devoted to cylindrically symmetric structures. Figure 2.1 presents an example of such a cylindrically symmetric structure, viz. an iris-loaded waveguide. Figure 2.2 shows some frequently used types of subdivision for the cylindrically symmetric case. For a more general and more detailed description of the mode matching technique, see, e.g., [200].

In many cases the mode matching technique is applied to problems where the structure first has to be simplified geometrically for the decomposition into subdomains with analytic solutions to be possible. A good example is the accelerating structure studied in subsection 5, which is used to accelerate elementary particles. In this example, all roundings have to be neglected in order to stay within computational limits. Nevertheless, the obtained solution makes sense. That example will be treated in more detail in subsection 5. First, all rounded edges will be approximated by rectangular edges. Next, the structure will be subdivided by transverse interfaces into subdomains which have a constant circular cross-section.
Figure 2.1. Circular cylindrical iris-loaded waveguide.

Figure 2.2. Frequently used types of subdivision of a cylindrically symmetric structure into subdomains which can be treated analytically. The upper half of the cross-section of the structure is shown.
2.1.1 Mathematical Treatment of the Field Problem

In subsection 1.4.3, the analytical solution for waves in circular cylindrical waveguides has been discussed: The solution of the scalar Helmholtz equation (1.50) can be completely described by the vector potential given in (1.51):

$$A_z^{(E,H)}(r, \varphi, z) = C\,J_\mu(Kr)\,e^{\pm j\mu\varphi}\,e^{\pm jk_z z} =: T^{(E,H)}(r, \varphi)\,e^{\pm jk_z z}$$

with the Bessel function $J_\mu(Kr)$ and the separation equation (1.52)

$$k_0^2 = K^2 + k_z^2.$$

These solutions are now used to evaluate the fields in the subdomains. Using an appropriate definition of functions $(e_n)$ and $(h_n)$ as differential functions of the cross-sectional eigenfunctions $T^{(E,H)}(r, \varphi)$, the transverse fields can be written as

$$E_t(r, \varphi, z) = \sum_{n=1}^{\infty} U_n(z)\,e_n(r, \varphi), \tag{2.1}$$
$$H_t(r, \varphi, z) = \sum_{n=1}^{\infty} I_n(z)\,h_n(r, \varphi). \tag{2.2}$$
$e_n(r, \varphi)$ and $h_n(r, \varphi)$ are referred to as field eigenfunctions of the electric and magnetic field. The properties of Fourier-Bessel series shall not be described here; they can be found in any good textbook on mathematical physics or electromagnetic field theory, e.g., see [168]. The field eigenfunctions of the $E_z$- and $H_z$-waves in a cylindrical waveguide form a complete orthogonal system. The discrete eigenfunctions correspond to different waveforms and are usually called modes, more precisely, monopole modes, dipole modes, quadrupole modes and so on according to their azimuthal dependence $\mu = 0, 1, 2, \ldots$. The transverse field components of an individual mode are given by

$$E_{n,t}(r, \varphi, z) = U_n(z)\cdot d\cdot\Bigl[\sqrt{Z_H}\,\bigl(e_z \times \nabla_t T_H(r, \varphi)\bigr) + \sqrt{Z_E}\,\nabla_t T_E(r, \varphi)\Bigr],$$
$$H_{n,t}(r, \varphi, z) = I_n(z)\cdot d\cdot\Bigl[-\sqrt{Y_H}\,\nabla_t T_H(r, \varphi) + \sqrt{Y_E}\,\bigl(e_z \times \nabla_t T_E(r, \varphi)\bigr)\Bigr]$$

with

$$Z_E = \frac{1}{Y_E} = \frac{k_z}{\omega\varepsilon} \qquad\text{and}\qquad Z_H = \frac{1}{Y_H} = \frac{\omega\mu}{k_z}.$$

The longitudinal field components are given by

$$E_{n,z}(r, \varphi, z) = I_n(z)\cdot d\cdot\sqrt{Y_E}\cdot\frac{jK^2}{\omega\varepsilon}\cdot T_E(r, \varphi),$$
$$H_{n,z}(r, \varphi, z) = -U_n(z)\cdot d\cdot\sqrt{Z_H}\cdot\frac{jK^2}{\omega\mu}\cdot T_H(r, \varphi).$$

The cross-sectional eigenfunctions $T^{(E,H)}(r, \varphi)$ and the eigenvalues K, which are determined by the zeros of the Bessel functions, have been introduced in subsection 1.4.3.

2.1.2 Scattering Matrix Formulation
The next step after finding the solution of Maxwell's equations in a homogeneous cylindrical waveguide is the continuous matching of the solutions at the cross-sections of neighbouring homogeneous sub domains of some cylindrically symmetric waveguide structure. Besides the formulation via scattering matrices, which is given here, there is a number of other formulations of the mode matching technique. The derivation of the scattering matrix formulation is described in detail by Piefke in [200]. Other variants of the mode matching technique are described in [141]. For accelerating structures, the
2.1 Mode Matching Technique
39
mode matching technique has been used, e.g., in [115], [17], [152]' and [125]. In connection with the computation of accelerating and parasitic modes, the mode matching technique has been presented in [278], [279], and other formulations in [121], [122] and [322]. See also subsection 5 of this book. Simple Step in a Cylindrical Waveguide. The simplest example is a step in a cylindrical waveguide (shown in Fig. 2.3). The Fourier coefficients Un(z) and In(z), the so-called voltage and current amplitudes, are related to the amplitudes of the natural waves in positive and negative z-direction. In the subdomains I and II, they are given by
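$$I_n^{I} = \sqrt{Y_n^{I}} \left( a_n^{I}\, e^{-j k_{z,n}^{I} z} - b_n^{I}\, e^{j k_{z,n}^{I} z} \right),$$
$$I_m^{II} = \sqrt{Y_m^{II}} \left( b_m^{II}\, e^{-j k_{z,m}^{II} z} - a_m^{II}\, e^{j k_{z,m}^{II} z} \right).$$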
The transverse fields in the cross-sectional areas between the two volumes I and II are given by equations (2.1) and (2.2) with an appropriately chosen local coordinate z.

Figure 2.3. A simple step in a cylindrical waveguide with amplitudes of incident and reflected or transmitted waves.
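The continuity condition on the tangential fields $E_t^{I,II}$ (2.1) and $H_t^{I,II}$ (2.2) in the common cross-section $A^{II}$ yields
$$\sum_{n=1}^{\infty} U_n^{I}\, e_n^{I} = \sum_{m=1}^{\infty} U_m^{II}\, e_m^{II},$$
$$\sum_{n=1}^{\infty} I_n^{I}\, h_n^{I} = \sum_{m=1}^{\infty} I_m^{II}\, h_m^{II}.$$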
By the so-called "expansion in opposite direction" at the common cross-section (i.e., the tangential electric field $E_t^{II}$ of volume element II is expanded in a series of the field eigenfunctions $h^{I}$ of volume element I, and the tangential magnetic field $H_t^{I}$ of volume element I is expanded in a series of the field eigenfunctions $e^{II}$ of volume element II; see [200]), it is guaranteed that the power flows from both sides onto the common cross-sectional area are equal. Furthermore, this procedure guarantees that only known boundary values are involved [102]. By the orthonormality of the eigenfunctions, we get two matrix equations which express the relation between the coefficients U and I via the coupling matrices C and D:
$$U^{I} = C\, U^{II}, \qquad (2.3)$$
$$I^{II} = D\, I^{I}. \qquad (2.4)$$
The voltage and current amplitudes are written in vector form. Note that these are linear systems of infinite dimension. The coupling matrices depend only on the geometry of the step.¹ The formulation via coupling matrices is especially well suited for periodic problems. However, for studies of the transmission and reflection behaviour of waveguides with aperiodic changes in diameter, the formulation via scattering matrices is more appropriate. The scattering matrices can be derived from the coupling matrices by some transformations (for details, see [87]). The scattering matrix can be determined by combining the linear systems (2.3) and (2.4):
$$S = \begin{pmatrix} D N C - E & D N \\ N C & N - E \end{pmatrix}, \qquad (2.5)$$
where the auxiliary matrix $N$ contains the inverse of $E + CD$. The identity matrix is denoted by $E$ in this subsection in order to avoid confusion with the current amplitudes $I$. Equation (2.5) shows that it is necessary to invert a full-rank complex matrix $E + CD$ to compute the scattering matrix.

¹ The coupling matrix of a waveguide with constant cross-section is the identity matrix.
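The scattering matrix relates the wave amplitudes $a$ and $b$ of incident and reflected or transmitted waves:
$$\begin{pmatrix} b^{I} \\ b^{II} \end{pmatrix} = \begin{pmatrix} S_{I,I} & S_{I,II} \\ S_{II,I} & S_{II,II} \end{pmatrix} \begin{pmatrix} a^{I} \\ a^{II} \end{pmatrix}.$$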
Assume that mode n with amplitude 1 is incident from the left end into subdomain I. Then the complex amplitude of the m-th mode scattered and reflected in subdomain I is given by $S_{I,I}(m,n)$, while the amplitude of the p-th mode scattered and transmitted to subdomain II is given by $S_{II,I}(p,n)$. The scattered modes do not necessarily propagate. The scattering matrix depends on the geometry and on the frequency. The transmission and reflection coefficients are given explicitly as elements of the scattering matrix. The matrix is symmetric, involutory, and orthogonal ([200], see also subsection 2.4).
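As an illustration of how the coupling relations (2.3) and (2.4) lead to a scattering matrix, the following sketch eliminates the voltage and current amplitudes numerically. It rests on assumptions that are not taken from the text: truncated real coupling matrices with $D = C^T$, amplitudes normalized at $z = 0$ such that $U^{I} = a^{I} + b^{I}$, $I^{I} = a^{I} - b^{I}$, $U^{II} = b^{II} + a^{II}$, $I^{II} = b^{II} - a^{II}$, and the hypothetical helper name `scattering_from_coupling`. The block ordering and signs therefore need not coincide with (2.5); only the structure (a single inversion of $E + CD$) and the stated matrix properties are mirrored.

```python
import numpy as np

def scattering_from_coupling(C, D):
    """Scattering matrix of a single step from truncated coupling matrices.

    Assumed conventions (illustrative, not taken from the text):
        U_I  = C @ U_II,     I_II = D @ I_I        (cf. (2.3), (2.4))
        U_I  = a_I + b_I,    I_I  = a_I - b_I
        U_II = b_II + a_II,  I_II = b_II - a_II
    C has shape (M, N), D has shape (N, M).
    """
    M, N = C.shape
    E_M, E_N = np.eye(M), np.eye(N)
    # The only inversion needed: the full matrix E + C D.
    N_aux = 2.0 * np.linalg.inv(E_M + C @ D)
    S_I_I = E_M - N_aux            # reflection back into subdomain I
    S_I_II = N_aux @ C             # transmission from II to I
    S_II_I = D @ N_aux             # transmission from I to II
    S_II_II = E_N - D @ N_aux @ C  # reflection back into subdomain II
    return np.block([[S_I_I, S_I_II], [S_II_I, S_II_II]])

# Example with 3 modes in subdomain I and 5 modes in subdomain II.
rng = np.random.default_rng(0)
C = rng.standard_normal((3, 5))
S = scattering_from_coupling(C, C.T)
print(np.allclose(S, S.T), np.allclose(S @ S, np.eye(8)))
```

Under these assumptions the result is symmetric and involutory, in line with the properties quoted above, and for $C = D = E$ (constant cross-section, cf. footnote 1) it reduces to pure transmission.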
Concatenation of Scattering Matrices. For structures with multiple changes in cross-section, the scattering matrices of all individual subdomains are determined first. The scattering matrix of the complete structure, or of parts of it, is then obtained by concatenation of these matrices. We first give the concatenation rule for a structure with one change in cross-sectional diameter. Let $S^{(1)}$ and $S^{(2)}$ be the two scattering matrices to be combined. Then the complete scattering matrix is given by²
$$S^{(1,2)} = S^{(1)} \odot S^{(2)} = \begin{pmatrix} S^{(1)}_{I,I} + S^{(1)}_{I,II}\, R\, S^{(2)}_{I,I}\, S^{(1)}_{II,I} & S^{(1)}_{I,II}\, R\, S^{(2)}_{I,II} \\ S^{(2)}_{II,I}\, R^{T} S^{(1)}_{II,I} & S^{(2)}_{II,II} + S^{(2)}_{II,I}\, R^{T} S^{(1)}_{II,II}\, S^{(2)}_{I,II} \end{pmatrix}$$
with $R = \left( E - S^{(2)}_{I,I}\, S^{(1)}_{II,II} \right)^{-1}$ [246]. Thus, for each concatenation, the full-rank complex matrix $E - S^{(2)}_{I,I}\, S^{(1)}_{II,II}$ has to be inverted. The emerging amplitudes $b^{I}$ and $b^{II}$ are calculated from the scattering matrix of the complete structure. The scattering matrix of a structure with $n - 1$ changes in diameter is obtained by $n - 1$ applications of the above concatenation rule:
$$S = S^{(1)} \odot S^{(2)} \odot \cdots \odot S^{(n)}.$$
The wave amplitudes for an arbitrary subdomain $\nu$ of the structure can be obtained from the incident (outer) amplitudes and the scattering matrices $S^{(1)} \odot S^{(2)} \odot \cdots \odot S^{(\nu)}$ and $S^{(\nu)} \odot S^{(\nu+1)} \odot \cdots \odot S^{(n)}$ for the parts of the structure to the left and to the right of subdomain $\nu$. The wave amplitudes $a^{\nu}$ and $b^{\nu}$ are also called inner amplitudes. As soon as the inner amplitudes are known, all the field components of the electric and magnetic field, and all the quantities derived from them, can be determined directly.

² The individual chain matrices can be used instead of the individual scattering matrices. The chain matrix for the complete structure could then be obtained by simple multiplication. However, for structures with very many subdomains this is numerically disadvantageous compared to the use of the scattering matrix.
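The concatenation rule can be sketched in a few lines. The function names `split_blocks` and `concatenate` and the way the port sizes are passed are assumptions made for this illustration. The second inverse is written as $(E - S^{(1)}_{II,II} S^{(2)}_{I,I})^{-1}$, which coincides with $R^T$ whenever the two blocks involved are symmetric, so that the sketch also works for test matrices without that symmetry.

```python
import numpy as np

def split_blocks(S, n_left):
    """Return the blocks S_I,I, S_I,II, S_II,I, S_II,II of a scattering matrix."""
    return (S[:n_left, :n_left], S[:n_left, n_left:],
            S[n_left:, :n_left], S[n_left:, n_left:])

def concatenate(S1, S2, n_left, n_mid):
    """Combine two scattering matrices sharing n_mid modes at their common port.

    S1 has n_left modes at its left port and n_mid modes at its right port;
    S2 has n_mid modes at its left port.
    """
    A11, A12, A21, A22 = split_blocks(S1, n_left)
    B11, B12, B21, B22 = split_blocks(S2, n_mid)
    E = np.eye(n_mid)
    R = np.linalg.inv(E - B11 @ A22)    # the matrix inverted in each concatenation
    Rt = np.linalg.inv(E - A22 @ B11)   # equals R.T if A22 and B11 are symmetric
    top = np.hstack([A11 + A12 @ R @ B11 @ A21, A12 @ R @ B12])
    bottom = np.hstack([B21 @ Rt @ A21, B22 + B21 @ Rt @ A22 @ B12])
    return np.vstack([top, bottom])

# A section of constant cross-section and zero length transmits everything.
E2, Z2 = np.eye(2), np.zeros((2, 2))
through = np.block([[Z2, E2], [E2, Z2]])
rng = np.random.default_rng(0)
S = 0.3 * rng.standard_normal((4, 4))
print(np.allclose(concatenate(S, through, 2, 2), S))
```

A structure with n subdomains is then handled by applying `concatenate` n - 1 times from left to right; since the operation is associative, partial products for segments of the structure can be stored and reused.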
2.1.3 Standing Waves and Traveling Waves

The linear system used to determine the resonant frequencies of the shielded assembly can be constructed from the scattering matrix of a waveguide structure which is open in the z-direction and has length $l$, by introducing perfectly magnetic or perfectly electric terminations at $z = 0$ and $z = l$. The resonant frequencies are then the zeros of the determinant of the matrix
$$A = \begin{pmatrix} S_{I,I} \pm E & S_{I,II} \\ S_{II,I} & S_{II,II} \pm E \end{pmatrix}.$$
The perfect electric or magnetic boundary condition is reflected by the sign of the identity matrix $E$. The determinant of the above matrix is a function of frequency, since the transmission matrices $S_{I,II}$ and $S_{II,I}$ and the reflection matrices $S_{I,I}$ and $S_{II,II}$ depend on frequency. The determinant vanishes if and only if the circular frequency $\omega$ equals some resonant (natural) frequency $\omega_r$ of the structure. Note that, with the mode matching technique as described above, the natural frequencies are not available as solutions of a linear algebraic eigenvalue problem. Rather, they are determined as zeros of $\det(A)$. The matrix $A$ is complex symmetric (compare the properties of the scattering matrix). Its determinant is calculated by LU decomposition of $A$, as described in subsection 3. For each resonant frequency, the outer amplitudes at both magnetic (electric) boundaries are computed from the linear system
$$\begin{pmatrix} S_{I,I} \pm E & S_{I,II} \\ S_{II,I} & S_{II,II} \pm E \end{pmatrix} \begin{pmatrix} a^{I} \\ a^{II} \end{pmatrix} = 0.$$
For each resonant frequency $\omega_r$, the electromagnetic fields of the free oscillation can then be computed via all the inner amplitudes, which themselves can be determined by choosing the outer amplitudes as excitation.

The computation of traveling waves is based on the following consideration: a traveling wave can be decomposed into two standing waves, one with a magnetic and the other with an electric short circuit at both ends.³ Thus, appropriate amplitudes $a^{I}$ and $a^{II}$ can be found at both ends of the waveguide. Let $a^{(magn.)}$ and $a^{(el.)}$ be the excitation amplitudes⁴ of the two standing waves forming a particular mode. Then
$$a^{(trav.w.)} = \pm \left( a^{(magn.)} \pm j\, \eta\, a^{(el.)} \right)$$
yields the amplitudes of the traveling wave, where the signs are chosen so that the wave travels in the desired direction. The normalizing factor $\eta$ is chosen so that both standing waves have the same stored energy $W_s$ (compare the definition in subsection 5.2.2):
$$\eta = \sqrt{W_s^{(magn.)} / W_s^{(el.)}}.$$
It is clear that $a^{(magn.)}$ and $a^{(el.)}$ have to belong to modes with the same frequency. Further details and a study of an aperiodic traveling wave tube can be found in [318].

³ In general, the superposition of standing waves also yields a contribution to a reflected wave. But it is possible to achieve destructive interference of the unwanted reflected contributions.
⁴ I.e., either $a^{I}$ or $a^{II}$ with the corresponding boundary condition.

Computation of Locally Concentrated Fields. For aperiodic waveguides, as they are often used for the acceleration of elementary particles (see subsection 5.2.2), so-called trapped modes may exist. These are standing waves for which the electric field is completely trapped in the inner part of the waveguide, i.e., there is no contact with the boundaries in the direction of propagation (cf. subsection 5.4). Consequently, such a mode cannot be excited via outer amplitudes. For the computation of such modes it is necessary to "split up" the structure in some plane inside the part of the waveguide which is filled with electric field: let $S^{(1)}$ and $S^{(2)}$ be the two scattering matrices for the resulting two parts of the structure. In the separating plane, the wave amplitudes travelling in the same direction have to be equal for both parts. It follows that the resonant frequencies of the complete closed structure are found by determining the zeros of
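$$\det \begin{pmatrix} S^{(1)}_{I,I} \pm E & S^{(1)}_{I,II} & 0 & 0 \\ S^{(1)}_{II,I} & S^{(1)}_{II,II} & -E & 0 \\ 0 & -E & S^{(2)}_{I,I} & S^{(2)}_{I,II} \\ 0 & 0 & S^{(2)}_{II,I} & S^{(2)}_{II,II} \pm E \end{pmatrix}. \qquad (2.6)$$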
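Both (2.6) and the matrix $A$ of this subsection lead to the same numerical task: scanning the frequency for zeros of a determinant. The sketch below illustrates this for a deliberately simple stand-in, a single-mode, lossless, uniform waveguide section of length $l$ that is short-circuited at both ends, for which the zeros are known analytically ($k_z l = p\pi$). The single-mode S-matrix, the function names, and the scan-and-refine strategy are assumptions made for illustration; they are not the procedure of the cited studies.

```python
import math
import numpy as np

C0 = 299_792_458.0  # speed of light in vacuum [m/s]

def det_A(omega, K, length, sign=1.0):
    """det(S + sign*E) for a single-mode uniform section (toy model, above cutoff)."""
    kz = math.sqrt((omega / C0) ** 2 - K ** 2)   # propagation constant
    t = np.exp(-1j * kz * length)                # transmission through the section
    S = np.array([[0.0, t], [t, 0.0]])
    # numpy evaluates the determinant via an LU factorization (LAPACK).
    return np.linalg.det(S + sign * np.eye(2))

def resonances(K, length, omega_min, omega_max, n_scan=400, tol=1e-12):
    """Scan |det A| on a grid and refine each local minimum by golden-section search."""
    f = lambda w: abs(det_A(w, K, length))
    grid = np.linspace(omega_min, omega_max, n_scan)
    vals = np.array([f(w) for w in grid])
    g = (math.sqrt(5.0) - 1.0) / 2.0
    roots = []
    for i in range(1, n_scan - 1):
        if vals[i] < vals[i - 1] and vals[i] < vals[i + 1]:
            lo, hi = grid[i - 1], grid[i + 1]
            while (hi - lo) > tol * hi:
                x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
                if f(x1) < f(x2):
                    hi = x2
                else:
                    lo = x1
            roots.append(0.5 * (lo + hi))
    return roots

# Lowest monopole eigenvalue K = j01 / a of a circular waveguide with radius a.
a, l = 0.1, 0.2
K = 2.404825557695773 / a
print(resonances(K, l, 1.05 * C0 * K, 1.5 * C0 * K))
```

In a real computation each evaluation of the determinant requires rebuilding and concatenating the frequency-dependent scattering matrices, so the number of frequency samples dominates the cost.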
For the individual modes, the separating plane should be as close to the field maximum as possible.

Speeding Up the Method by Interpolation. Subsection 5.4 contains studies of the already mentioned circular cylindrical structure with 180 cells. In this structure, all the cell and iris radii are tapered linearly, which means that the radius diminishes along the length of the structure. This structure can be composed of 360 homogeneous subsections if all roundings are neglected. The focus of the study was on evaluating the 180 resonant frequencies and their field distributions. For a structure with so many subdomains, the construction of the individual scattering matrices and their concatenation to the complete scattering matrix (or to the scattering matrix of a segment of the structure) takes the largest part of the computing time. For this reason, a method to speed up the computations by interpolation of the matrix elements of the cells has been developed and studied. Furthermore, a database of frequencies for each section has been built up in order to allow interpolation for the sections in between. The representation with the purely imaginary impedance matrix (Z-matrix) was chosen, since for the complex scattering matrix (S-matrix) both the real and the imaginary part would have needed interpolation, which does not necessarily preserve losslessness. The interpolated Z-matrix, however, stays purely imaginary and thus loss free. By reciprocity, it is symmetric (like the S-matrix, cf. subsection 2.4). This procedure yields very good approximations to the resonant frequencies. However, it was observed that the computing time can be affected very badly by an unsuitable choice of subdivision, so that the 'accelerated' computation may even take more time. A detailed description of the method as well as results of the studies for the 180-cell structure can be found in [87].

2.1.4 Convergence and Error Investigations

The mode matching technique approximates the solution of Maxwell's equations in the least squares sense, with respect to the integral, not the sum (cf. (1.53)). At the junctions, the infinite series that represents the solution may have discontinuities. This is called Gibbs' phenomenon [178]. To use the series for numerical calculations of the field, one has to use partial sums of the infinite Fourier-Bessel series (2.1) and (2.2), i.e., the series have to be truncated after a finite number of summands (modes). There are some well-known convergence criteria, like the edge condition [175], [290] and the geometrical mode ratio, which can be used in the subdomains.

Gibbs' Phenomenon. At the junctions of the structure, the infinite series converges in the least squares ($L^2$) sense. In practical applications, the Fourier series is cut off after a finite number n of modes, which further increases the error at the cross-sectional junctions. Therefore the field strength shows discontinuities at the interface between neighbouring subdomains. Such overshooting is well known from theoretical physics and system theory and occurs when a discontinuous function is represented by its Fourier series. The overshooting cannot be eliminated by taking more modes into account. The limit for $n \to \infty$ itself has 'flanges' at the ends of the discontinuity (cf. [178]).
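The persistence of the overshoot can be reproduced with the textbook example of a square wave. This is a stand-in chosen for brevity: it uses an ordinary Fourier sine series rather than the Fourier-Bessel series of the mode matching technique, and the function names are illustrative.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Partial sum of the Fourier series of a square wave jumping from -1 to +1 at x = 0."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(n_terms)
    )

def first_peak(n_terms):
    """Value of the partial sum at its first maximum right of the jump, x = pi / (2 n)."""
    return square_wave_partial_sum(math.pi / (2 * n_terms), n_terms)

for n in (10, 100, 1000):
    print(n, round(first_peak(n), 4))
```

The peak value stays close to 1.18 however many terms are used, i.e., the overshoot remains about 9 % of the jump height; only the position of the peak moves towards the discontinuity.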
Theoretically, this phenomenon should not cause any problems for integration, since the region of overshooting has negligibly small width. Thus, with exact calculations, integration over such a discontinuity would yield correct results. In the numerical calculations for the studies presented later, however, the overshooting had a width which was not negligible. In particular, if one needs to integrate over a large number of discontinuities, the errors can no longer be neglected, as the studies described in subsection 5.4 show. Also, the height of the overshooting appeared to be too large to be explained by Gibbs' phenomenon alone. The usual remedy is to use a filter to weight either the eigenfunctions or the coupling matrix. A cosine filter for the coupling matrix $C = (c_{mn})$ was used below:
$$c'_{mn} = c_{mn} \cdot \cos\left( \frac{m-1}{M-1} \cdot \frac{\pi}{2} \right) \cdot \cos\left( \frac{n-1}{N-1} \cdot \frac{\pi}{2} \right).$$
Fig. 2.4 illustrates the effect of this filter on the matrix entries.
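Applied to a truncated coupling matrix, the cosine filter amounts to an elementwise product with an outer product of two cosine tapers. The following is a minimal sketch; the function name `cosine_filter` is an assumption, and at least two modes per subdomain are required so that the denominators M - 1 and N - 1 do not vanish.

```python
import numpy as np

def cosine_filter(C):
    """Weight the entries c_mn of an M x N coupling matrix with the cosine filter."""
    M, N = C.shape
    m = np.arange(M)            # corresponds to m - 1 for m = 1, ..., M
    n = np.arange(N)            # corresponds to n - 1 for n = 1, ..., N
    w_m = np.cos(m / (M - 1) * np.pi / 2)
    w_n = np.cos(n / (N - 1) * np.pi / 2)
    return C * np.outer(w_m, w_n)

# 5 and 25 modes, as in the example of Fig. 2.4 (matrix entries here are dummies).
C = np.ones((5, 25))
C_filtered = cosine_filter(C)
print(C_filtered[0, 0], abs(C_filtered[-1, :]).max())
```

The lowest modes are left untouched while the weight falls smoothly to zero for the highest retained mode in each subdomain.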
Figure 2.4. Coupling matrix for a structure with two cells with (right) and without filter (left). The absolute values of the matrix entries are shown. 5 and 25 modes have been used in the subdomains. Illustration from the diploma thesis of Nahr [183].

Figure 2.5. Radial electric field strength of the accelerating $2\pi/3$-monopole mode with (right) and without filter (left) at the transition from iris to cell. 10 modes were used inside the iris for the series expansion. Graph from the diploma thesis of Nahr [183].

Figure 2.6. Longitudinal electric field strength of the accelerating $2\pi/3$-monopole mode with (right) and without filter (left) for a 9-cell structure whose cell and iris radii are linearly tapered. In the smallest iris, 10 modes have been used for the series expansion.
As expected, the behaviour of the transverse and longitudinal field strength of the monopole modes became much smoother, as Fig. 2.5 and Fig. 2.6 show. However, systematic studies indicated that the use of a cosine filter for the coupling matrix is not sufficient for the dipole modes (in order to keep the overshooting small enough). That is why further studies with other types of filters were carried out [239], but they hardly showed any improvement. Next, Sommer studied the effect of nearly virtual intermediate steps (auxiliary transitions)⁵ at each cross-sectional step. Nevertheless, with these intermediate steps alone, it was not yet possible to reach a sufficient suppression of the overshooting. Finally, the cosine filter for the coupling matrix was combined with the intermediate steps. This gave excellent suppression of the overshooting except at points very close to the axis or to sharp edges. Figure 2.7 compares the field behaviour with filter and auxiliary transitions and with auxiliary transitions only. These measures also have a remarkable influence on the quality of secondary quantities such as the loss parameter, which will be introduced in subsection 5.3. Subsection 5.4 briefly discusses further results from [239]. In [146], [147], an algorithm is used which is also based on the scattering matrix formulation but uses another strategy to suppress Gibbs' phenomenon, namely weighting the amplitudes $a^{\nu}$, $b^{\nu}$ instead of the coupling matrix. This kind of weighting seems to be advantageous compared to the combination of matrix weighting and auxiliary transitions. Meaningful systematic comparisons are yet to be done, however.

⁵ These intermediate steps are very short compared to the other sections, i.e., their length nearly vanishes. This technique is similar to Piefke's "Zwischenmediumsmethode" [199] (compare also the extensive analysis in [213]). However, Sommer developed this idea independently.

Figure 2.7. Longitudinal electric field strength of the lowest dipole mode as a function of radius. The right- and left-hand expansions are plotted at a cross-sectional step. 27 modes were used inside the iris, 56 inside the cell. The results of the calculations without filter but with auxiliary transition (on the right) as well as with filter and auxiliary transition (on the left) are shown. Graph from the diploma thesis of Sommer [239].

Edge Condition. The edge condition [175] was found in the solution of scattering problems. The electromagnetic field can blow up in the vicinity of a sharp edge of the scattering object. The edge condition provides a statement about the order of this singularity. It states that the electromagnetic energy density has to be integrable over any finite domain, even if this domain contains a singularity. In the case of a perfectly conducting surface with an edge, the edge condition allows the conclusion that the singular components of the electric and magnetic field near the edge are of order $d^{-1/2}$, where d is the distance from the edge. The field components which are parallel to the edge are always finite. The waveguide junction shown in Fig. 2.3 produces a sharp corner where the waveguide diameter changes. The electromagnetic fields show a singularity at this location. The solution of the boundary value problem is unique only if the so-called edge condition is included. The edge condition then provides knowledge about the asymptotic decrease of the modal amplitudes for a discontinuity problem in the waveguide. This knowledge is lost for the modal coefficients of higher order when one retains only a finite number of unknown modal coefficients in the field expansions. Vassallo [290] discusses computations where this knowledge was taken into account. However, this method was not used there, and the reported gain was rather moderate compared to the computational effort, even though the results are interesting for calculations of the complete electromagnetic field. In our context, however, the quantity of interest is the integral of the voltage along a radius that lies between the symmetry axis and the smallest iris radius. Thus, only the longitudinal component of the electric field at this particular radius is of importance. The radius was chosen far enough from the irises and hence from the edges of the structure. Many studies of convergence were carried out and are described in much more detail in [183], [239].

Geometrical Mode Ratio. Consider again a simple step in a cylindrical waveguide, as shown in Fig. 2.3. The continuity at this step is evidently determined by the eigenvalues $K^{I,(E,H)}$ and $K^{II,(E,H)}$, because they characterize the transverse field. Suppose M eigenfunctions are taken in section I and N eigenfunctions in section II.
The entries of the coupling matrix measure the strength of the coupling between the modes in both sections and thus the reciprocal influence on the wave amplitudes. In order to take into account the maximum of information for a selected dimension (M or N given), the following geometric criterion for the mode ratio can be used: the best convergence is reached if the numbers of eigenfunctions satisfy
$$\frac{M}{N} = \frac{a}{b},$$
where a and b are the radii of the waveguide to the left and right of the step. This well-known criterion could be verified for the convergence of several examples. Numerical experiments in [183] were carried out for tapered iris-loaded waveguides. These experiments with a weighting factor w for the ratio a/b showed the best convergence for w = 1, i.e., for the geometrical mode ratio.
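In practice the criterion is used to pick the truncation in one section once the other has been fixed. A minimal sketch, in which the function name and the rounding to the nearest integer are assumptions:

```python
def modes_for_section_II(M, a, b):
    """Number of modes N in section II so that M / N is as close as possible to a / b."""
    if M < 1 or a <= 0 or b <= 0:
        raise ValueError("M must be >= 1 and the radii must be positive")
    return max(1, round(M * b / a))

# 27 modes in an iris whose radius is half the cell radius.
print(modes_for_section_II(27, 1.0, 2.0))
```

For comparison, the caption of Fig. 2.7 quotes 27 modes inside the iris and 56 inside the cell, i.e., a mode ratio close to one half.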
2.2 Finite Element Method

The first publication about this method goes back to 1943, when R. Courant published the paper "Variational Methods for the Solution of Problems of Equilibrium and Vibrations" [68] in the Bulletin of the American Mathematical Society [54]. Yet the importance of this method was not recognized at that time. Thus it came about that engineers re-invented the method independently in the early fifties: the Finite Element Method (FEM) has its practical origin in civil engineering, where it was used for mechanical applications. The first publication in this field was by Argyris [2] in 1954/55, the next by Turner, Clough, Martin and Topp in 1956 [272]. Clough [63] proposed the name of the method. The intent was to overcome difficulties of the Finite Difference Method by using other methods such as numerical evaluation of the variational integral, following the approach of Ritz and Galerkin but using local basis functions defined over single elements of the solution domain. Only a decade later was this numerical method first used in electrical engineering. Several papers were published around 1970, see, e.g., [236]. Initially, the Finite Element Method was developed to handle problems where the desired configuration is obtained by minimizing the potential energy, i.e., the variational integral equivalent to the boundary value problem. However, it was quickly generalized to other problems where such a minimizing principle does not hold because the equivalent variational integral is not known; one then uses the method of weighted residuals or applies the Galerkin approach. Meanwhile, it is used in different versions in many fields, among them computational electrodynamics. It is well established, especially for static problems. During the last decade, important progress has been achieved in Finite Element Methods with regard to their application to electromagnetic field simulation.
FEM with nodal elements had shown some drawbacks when applied to the solution of Maxwell's equations, e.g., the existence of the so-called ghost modes in the time-harmonic case. The goal of the developments which started in the 1980s with the introduction of more appropriate element types [185], [186] was to overcome those drawbacks by fitting the FEM formulation to the electrodynamic background. The main idea behind the numerical solution of differential equations by the Finite Element Method is the approximation of the solution by a linear combination of basis functions. The coefficients of the linear combination are obtained from a variational problem, which in the case of symmetry is equivalent to a minimization problem. As for other methods like the Finite Difference Method or the Finite Integration Technique, which is described in the next subsection, the solution domain $\Omega$ has to be discretized, i.e., decomposed into smaller subdomains which are geometrically simpler. These subdomains are then called finite elements. The process of decomposition into finite elements is often referred to as mesh generation. The Finite Element Method is very flexible in its discretization. For two-dimensional problems, it is possible to combine convex quadrangles with triangles; for three-dimensional problems, different types of geometrical bodies can be combined. But since all permissible decompositions have to satisfy certain geometrical and topological conditions, the mesh generation can be very difficult and time consuming for complex applications. One of the main drawbacks of FEM is that the effort (numerical and often engineering) for the mesh generation is very high. Standard Finite Element Methods are thoroughly explained in [54], [212] or [113], among others. Since there exist many textbooks and articles on the Finite Element Method (see, e.g., [47], [134], [209], [215], [249], [265], [45]), and since this method was not used by the author for the field simulations treated in this book, only a rough idea of the method and some specific aspects of it are described in what follows.

2.2.1 General Outline of the Finite Element Approach

(following closely [216]; an extensive and mathematically profound presentation can be found, e.g., in [113])

The first step is the decomposition of the structure or domain of interest into a finite number of elements. In this step the number, type, size and arrangement of the elements have to be decided. Sophisticated algorithms automatically adapt the elements to better fit the solution. Depending upon dimension, the main element shapes are as follows:
(1) One-dimensional: The elements are line segments. The number of nodes and associated variables assigned to each element depends on the type of the interpolation function and the degree of continuity required.
(2) Two-dimensional: Triangular elements are usually used, as they are most easily adaptable. Often the nodes are the vertices of the triangle, with the unknowns being the values of the solution at these nodes, but sometimes more nodal variables are assigned to each triangle (cf. the paragraph on mixed FEM). Quadrilateral elements are sometimes chosen.
(3) Three-dimensional: The four-node tetrahedral element is common, but the right prism and a general hexahedron are sometimes used.
In axisymmetric problems, curved ring-type elements with axial symmetry are used. Similarly, near curved boundaries, elements with curved edges may be introduced at the cost of complicating the derivation of the element properties.

Remark: In general, it is best that the elements in a finite element solution be neither long and thin nor short and fat; the performance of the method is best if the "aspect ratio" of the elements, the "width:height", is roughly one. Let K be a finite element and
$$h_K := \operatorname{diam} K, \qquad \rho_K := \sup \{ \operatorname{diam} S;\ S \text{ ball with } S \subseteq K \};$$
then
$$\frac{h_K}{\rho_K} \le \sigma < \infty$$
is required for all finite elements. In this context, "roughly one" means that the ratio could perhaps range from, say, 0.1 to 10, but that wider variations are strongly discouraged. The same also holds for triangular grids applied in Finite Volume Methods, see, e.g., the triangular grids chosen in the Finite Integration Technique described in section 2.3.

In the second step, the interpolation is selected. The solution is typically determined in terms of the unknown at a set of nodes on each element. Then, to determine the solution field, that is, the values of all the physical quantities of interest, within each element, it is necessary to interpolate from the nodal values of that element. The number of nodes per element and the choice of interpolation are dictated by the order of accuracy required and by the degree of continuity needed for the governing equations. Continuity may mean continuity of the solution itself or of its first derivative, etc.

In the third step, the element properties are derived. The contribution of each element to the governing equations is determined. In the case of minimizing the potential energy, this only involves finding the potential energy of each element in terms of its unknown nodal values. In the case of the weighted residual method, it involves determining the contribution of each element to the coefficients of the nodal variables in the resulting system of equations.

The next step is to assemble the system. To find the properties of the whole system modeled by the assemblage of elements, all the element properties have to be assembled into one set of equations. These equations need to be supplemented to account for the boundary conditions of the problem.

Then the system is solved. The assembled matrix gives a system of equations which can be solved for the unknown nodal variables.
In a linear problem, these equations are linear and result in a large sparse system of linear equations, which is solved by some iterative method. In a nonlinear problem, some Newton-like procedure needs to be applied. This yields a sequence of linear systems which are again large and sparse, and thus solved iteratively. Once the nodal variables are known, any supplementary quantities of interest may be computed via the interpolation scheme.

2.2.2 Weighted Residual Method; Galerkin Approach
(again following [216]; an extensive and mathematically profound presentation can be found, e.g., in [113])

The method of weighted residuals is a technique for obtaining approximate solutions to linear and nonlinear differential equations. It is a very general method. Combined with the Finite Element Method, it allows one to solve physical problems for which there is no "energy" integral to be minimized. Galerkin's method is based on a variational principle expressing an equilibrium (saddle point) condition rather than on a minimization principle. Consider a mixed problem for an elliptic linear differential equation:
$$\mathcal{L}u = f \quad \text{in a domain } \Omega,$$
$$u = 0 \quad \text{on some part } D \text{ of } \partial\Omega,$$
$$\frac{\partial u}{\partial n} = g \quad \text{on some part } N \text{ of } \partial\Omega,$$
with the unknown function $u(x)$, a given right-hand side $f(x)$, a function $g(x)$ defined on $N \subseteq \partial\Omega$, and an open subset $\Omega \subset \mathbb{R}^n$ with Lipschitz-continuous boundary. First, the solution is approximated by a sum
$$u(x) = \sum_{i=1}^{N} U_i\, \phi_i(x),$$
where $U_i$ are unknown coefficients and $\phi_i$ are the basis functions. The residual $\mathcal{L}u - f$ is then weighted with the basis functions and required to vanish in the integral sense:
$$\int_\Omega \phi_i\, (\mathcal{L}u - f)\, d\Omega = 0.$$
When $\mathcal{L}$ is a linear differential operator, this equation can be rewritten as
$$0 = \sum_{j=1}^{N} k_{ij}\, U_j - F_i,$$
where the coefficients are
$$k_{ij} = \int_\Omega \phi_i\, \mathcal{L}\phi_j\, d\Omega, \qquad F_i = \int_\Omega \phi_i\, f\, d\Omega.$$
Since this equation has to be satisfied for all $i = 1, \ldots, N$, these form a system of linear equations, $KU = F$, to be solved for the unknowns $U = (U_1, \ldots, U_N)^T$. Typically, most of the interaction coefficients or connection coefficients $k_{ij}$ are zero. This is because the basis functions are nonzero only in the small number of elements around a given node.
Another issue is that, in order to compute the interaction coefficients by the above integral, basis functions with a sufficiently high degree of continuity are needed for $\mathcal{L}\phi_j$ to be bounded, which is not easy to achieve. To circumvent this difficulty, one usually integrates by parts in order to reduce the order of the derivatives of $\phi_j$. This is done at the expense of differentiating $\phi_i$; an almost even division is achieved when the orders of the derivatives of $\phi_i$ and of $\phi_j$ differ by at most one. For the application of this method, see, for example, [134].
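The steps of this subsection can be followed on the smallest possible example. The sketch below assumes the one-dimensional model problem $-u'' = f$ on $(0, 1)$ with $u(0) = 0$ and $u'(1) = g$, piecewise linear "hat" basis functions on a uniform mesh, and a constant $f$; none of these choices is taken from the text. After integration by parts the interaction coefficients become $k_{ij} = \int \phi_i' \phi_j' \, dx$, and the Neumann datum enters the right-hand side as a boundary term.

```python
import numpy as np

def solve_poisson_1d(n_elements, f=1.0, g=0.0):
    """Galerkin FEM with linear elements for -u'' = f on (0, 1), u(0) = 0, u'(1) = g.

    f is a constant here, so the element load integrals are exact.
    Returns the node positions and the nodal values (including u(0) = 0).
    """
    h = 1.0 / n_elements
    n = n_elements                 # unknowns U_1 ... U_n; node 0 is fixed by u(0) = 0
    K = np.zeros((n, n))           # dense storage for brevity; K is tridiagonal
    F = np.zeros(n)
    k_el = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h   # integral of phi_i' phi_j' per element
    f_el = np.array([0.5, 0.5]) * f * h               # integral of phi_i f per element
    for e in range(n_elements):    # element e joins the nodes e and e + 1
        for a_loc in range(2):
            i = e + a_loc - 1      # global unknown index; -1 marks the Dirichlet node
            if i < 0:
                continue
            F[i] += f_el[a_loc]
            for b_loc in range(2):
                j = e + b_loc - 1
                if j >= 0:
                    K[i, j] += k_el[a_loc, b_loc]
    F[-1] += g                     # Neumann boundary term from the integration by parts
    U = np.linalg.solve(K, F)
    x = np.linspace(0.0, 1.0, n_elements + 1)
    return x, np.concatenate(([0.0], U))

x, u = solve_poisson_1d(8)
print(np.max(np.abs(u - (x - 0.5 * x ** 2))))
```

Each row of K has at most three nonzero entries because every hat function overlaps only with its two neighbours, which is the sparsity referred to above. A direct dense solve is used here only because the example is tiny.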
2.2.3 Duality Methods

Various reasons, such as the existence of a constraint, described, e.g., in [47] (see also the references therein), justified the introduction of duality methods. These methods use different variational formulations and consequently different Finite Element approximations. Mixed Finite Element Methods, which play an important role in the solution of Maxwell's equations with FEM, rely on basic principles of duality theory. Standard Finite Element approximation takes place in Sobolev spaces. Sobolev spaces are based on
$$L^2(\Omega) = \left\{ v \;\middle|\; \int_\Omega |v|^2\, dx < \infty \right\},$$
the space of square integrable functions on $\Omega$. Then, in general, for any integer $m \ge 0$ the Sobolev spaces are defined as
$$H^m(\Omega) = \left\{ v \;\middle|\; D^\alpha v \in L^2(\Omega) \ \text{for all } |\alpha| \le m \right\},$$
where
$$D^\alpha v = \frac{\partial^{|\alpha|} v}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \qquad |\alpha| = \alpha_1 + \cdots + \alpha_n,$$
these derivatives being taken in the sense of distributions. Practically, the most important of these spaces are H1 ({}) and, for fourth-order problems, H2({}). Duality theory is based on the classical principle of Legendre's transformation. The duality theory is abstract. For a convex function f(x) defined on a space V, a relation to its conjugate function f*(x*) on the dual space VI of V is obtained. With that, a given minimization problem of the form inf g(x)
xEV
+ f(x)
can be transformed to a saddle point problem inf { sup g(x)
xEV
x.EV'
+ (x*,x) - f*(x*)},
2.2 Finite Element Method
53
instead of which, under simple regularity assumptions, the dual problem

$$\sup_{x^* \in V'} \Big\{ \inf_{x \in V} \big[ g(x) + \langle x^*, x \rangle - f^*(x^*) \big] \Big\}$$
can be considered. Several examples of the application of this method are given in [47]. Employing a symmetric bilinear form and a second (Hilbert) space $Q$ with its scalar product, one derives a system of equations (see, e.g., [47]) which are the optimality conditions of some saddle point problem and thus allow one to prove existence and uniqueness of the solution to that saddle point problem. Next, the problem (i.e., the optimality condition) is approximated on finite dimensional subspaces $V_h$ of $V$ and $Q_h$ of $Q$. The index $h$ refers to a mesh from which these approximations are derived. Then, a matrix form can be derived for this approximation of the problem. In this procedure, several difficulties may arise which are not described here; they are treated carefully in [47].

The space which is especially adapted for the study of mixed and hybrid methods is $H(\mathrm{div}; \Omega) = \{ v \mid v \in L^2(\Omega); \ \mathrm{div}\, v \in L^2(\Omega) \}$. Two additional spaces containing functions which satisfy Dirichlet's boundary condition (or Neumann's boundary condition) are defined (see, e.g., [47]). Let the solution domain $\Omega$ be partitioned into subdomains (elements): $\Omega = \bigcup_{r=1}^{m} K_r$. Given such a partition of the domain $\Omega$, an approximation of the Sobolev spaces $H^1(\Omega)$ and $H^2(\Omega)$ is needed next. Standard approximations of Sobolev spaces can be divided into two classes: conforming and non-conforming methods.

Conforming methods are the most natural of Finite Element Methods. They yield internal approximations in the sense that finite dimensional subspaces of the Sobolev space that is to be approximated can be built. For a given partition of $\Omega$, a conforming approximation of $H^1(\Omega)$ is a space of continuous functions defined by a finite number of parameters (degrees of freedom). This is usually achieved by using a space of piecewise polynomial functions on the elements $K$ of the partition of $\Omega$. The degrees of freedom are then a set of linear forms on the set of polynomials on $K$.
For the approximation of $H^1(\Omega)$, Lagrange type elements, i.e., elements with point values as degrees of freedom, are sufficient, whereas approximating $H^2(\Omega)$ requires Hermite type elements, i.e., elements with degrees of freedom also involving derivatives. Several examples of element spaces for conforming approximations are given in [47].

Non-conforming methods are used in connection with hybrid Finite Element Methods. They yield external approximations in the following sense: Given a variational problem

$$a(u, v) = \langle f, v \rangle_{V' \times V} \quad \text{for all } v \in V, \ u \in V,$$

with $f \in V'$, a Hilbert space $V$, and $a(u, v)$ a bilinear (coercive) form on $V \times V$, an external approximation $V_h$ to $V$ is given by a family of finite-dimensional subspaces of a space $S$ with $V \subset S$ and some canonical extension $a(\cdot, \cdot)$ to $S \times S$ such that

$$v = \lim_{h \to 0} v_h \ \Rightarrow \ v \in V.$$
Then an approximation to the variational problem can be formulated and a result about the approximation error in $u$, Strang's lemma [250], can be derived. Usually the Hilbert space $V$ is $V = H^1(\Omega)$ or $V = H^2(\Omega)$. Again, the reader is referred, for example, to [47]. Techniques for approximations used for $H(\mathrm{div}; \Omega)$ are based on the original work of Raviart-Thomas [211] and Thomas [263], which was later generalized and extended to the three-dimensional case by Nédélec [185] using

$$H(\mathrm{curl}; \Omega) = \{ v \mid v \in L^2(\Omega); \ \mathrm{curl}\, v \in L^2(\Omega) \}.$$

More details on approximations in $H(\mathrm{div}; \Omega)$ and $H(\mathrm{curl}; \Omega)$ can be found in [47] and the references therein.
2.2.4 Finite Element Discretizations of Maxwell's Equations

For the analytic solution of Maxwell's equations (1.1) - (1.4), one determines to which class of problems (electrostatics, magnetostatics, stationary currents, quasistatics, or fast varying fields, including time-harmonic fields) the problem at hand belongs; this gives substantial simplifications of the model. A usual approach is the conversion of Maxwell's equations into variational equations in suitable function spaces, the so-called weak formulation [28]. The function spaces are given by vector fields of finite energy. This formulation uses two different spaces and can therefore be called the dual formulation; it is more appropriate for the discretization of Maxwell's equations than the usual nodal formulation. In fact, the key point in a consistent discretization of Maxwell's equations is the need to use a pair of dual meshes. This breakthrough in understanding was achieved in the mixed Finite Element Method (mixed FEM for short), or FEM with Whitney forms on a primal and dual mesh (the terminology differs between authors), see, e.g., [211], [185], [186], [36]. Another common name is the edge element formulation (see, e.g., [28], [158]). Before discussing the idea behind this approach, let us touch upon some often discussed problems in electromagnetic field computation.

(i) With edge elements, the tangential field component satisfies the continuity condition at element boundaries. Yet, the normal component may be discontinuous. Some authors (see, e.g., [294]) propose to enforce continuity explicitly.

(ii) Sharp, perfectly conducting edges need a much finer discretization in their surroundings or the use of special basis functions which are singular like the field itself in these locations, because the electric field goes to infinity at the edge. This is the case for nodal as well as for edge elements. At sharp edges, the electric field has no definite direction. Edge elements, but not nodal elements, make a change in direction possible at the edge. Note that the Finite Integration Technique, which is introduced in subsection 2.3, does not permit any vector field allocation at the sharp corners of the boundary. Thus, the Finite Integration Technique not only guarantees continuity of all electromagnetic field quantities but also ensures convergence near sharp edges. The order of convergence is reduced to one in the vicinity of sharp edges [295].

(iii) As already noted, the appearance of spurious modes, which arise in the discretization of eigenvalue problems for cavities and waveguides, is a well-known problem arising when nodal elements are treated by penalty methods. Many authors (see, e.g., [294]) state that spurious modes do not occur with edge elements. But according to Mur [181], [182], there exist counterexamples.

Two Hilbert spaces are chosen: the first for the electric field $E$ and magnetic field $H$, denoted by

$$H(\mathrm{curl}; \Omega) := \{ v \in L^2(\Omega); \ \mathrm{curl}\, v \in L^2(\Omega) \},$$

the second for the electric displacement $D$ and the magnetic induction $B$:

$$H(\mathrm{div}; \Omega) := \{ v \in L^2(\Omega); \ \mathrm{div}\, v \in L^2(\Omega) \}.$$
An important point is the representation of the material equations (cf. section 1.1), which link functions from both spaces. Two formulations are possible. [28] uses the one introduced by Bossavit [34], [35]:

$$(\varepsilon^{-1} D, v)_{0;\Omega} = (E, v)_{0;\Omega} \quad \text{for all } v \in H(\mathrm{div}; \Omega),$$

$$(D, \eta)_{0;\Omega} = (\varepsilon E, \eta)_{0;\Omega} \quad \text{for all } \eta \in H(\mathrm{curl}; \Omega)$$
with the $L^2(\Omega)$ inner product $(\cdot, \cdot)_{0;\Omega}$. Thus, a formal consistency within the framework of differential geometry is achieved (see, e.g., [34], [35]): The fields can be interpreted as differential forms, products of 1-forms giving 2-forms which yield electric and magnetic energy, respectively. The material coefficients yield the so-called mass matrices in the edge element formulation, which are operators transforming differential forms of different orders, the so-called Hodge or $*$ operators. The Hodge operator maps $p$-forms into $(n-p)$-forms; so, for $n = 3$ and $p = 2$, it maps 2-forms into 1-forms.

For the Finite Element Method, the resulting linear systems are large and sparse, thus enabling one to use iterative solvers. Yet, the usual irregularity of the grid is of course reflected in the matrix structure, i.e., the matrix entries are more or less scattered, which requires special effort for efficient storage. The reader who is interested in more details on the Finite Element discretizations of Maxwell's equations is referred to the articles cited above and the references therein.
2.2.5 Synthesis Between FEM with Whitney Forms and Finite Integration Technique

Using lumped inner products, one can regard FIT as an edge element scheme [37], [64]. The key element in finding a synthesis between methods like the Finite Integration Technique (FIT for short) [296], which is described in detail in the next subsection, on the one hand and FEM with Whitney forms on the other hand is a diagonal Hodge operator [33]. Some important aspects of the procedure in the context of a comparison between FIT and FEM can be summarized as follows ([259], [33]): Like the discrete FIT operators $S$, $C$, ..., the connectivity matrices on a simplicial mesh are discrete analogs of the divergence, curl, and gradient operators. The mass matrices of edge elements are in some sense analogous to the material operators $D_\varepsilon$, $D_\mu$, $D_\kappa$ of FIT. As shown in [259], the mass matrices can also be regarded as discrete analogs of the Hodge operator for differential forms. This terminology comes from differential geometry. In FIT, the state variables like $d$ defined over surfaces may be regarded as 2-forms, while the other state variables like $e$ defined along some path may be regarded as 1-forms, and the material operators are the corresponding tensors. In FEM, the Galerkin method can be interpreted as a realization of the discrete Hodge operator [259]. A close link of FIT with mixed FEM may well be suspected, yet a theoretical interpretation from the point of view of differential geometry is still open for research.
2.3 Finite Integration Technique

The most important class of methods for numerical field calculation deals with local difference equations obtained after suitable discretization. In the last subsection, we briefly discussed one such method, viz. the Finite Element Method. This subsection describes another, the Finite Integration Technique. This discretization method is chosen throughout this book to solve Maxwell's equations. It presents a discretization consistent with Maxwell's equations, i.e., the resulting discrete solutions reflect the analytical properties of the continuous solutions. The Finite Integration Technique can best be described as a Finite Volume method.

2.3.1 FIT Discretization of Maxwell's Equations

The Finite Integration Technique [296], [303], [304] (FIT for short) has been developed specifically for the solution of Maxwell's equations. The goal of this development was the ability to solve numerically the complete system of Maxwell's equations in full generality. The Finite Integration Technique presents a transformation of Maxwell's equations in integral form

$$\begin{aligned}
\oint_{\partial A} E \cdot ds &= -\int_A \frac{\partial B}{\partial t} \cdot dA, \\
\oint_{\partial A} H \cdot ds &= \int_A \Big( \frac{\partial D}{\partial t} + J \Big) \cdot dA, \\
\oint_{\partial V} D \cdot dA &= \int_V \rho \, dV, \\
\oint_{\partial V} B \cdot dA &= 0
\end{aligned}$$
onto a grid pair $(G, \tilde G)$. Yee's FDTD method [323] (1966) with the so-called "leap frog" scheme is a predecessor of FIT. FDTD stands for Finite Difference Time Domain. It is a Finite Difference method for the solution of Maxwell's equations in the time domain. The feature of Yee's method is that it evaluates Maxwell's equations for both electromagnetic fields (not just the electric field) and that the electric and magnetic field components are allocated on two staggered grids. In 1977, Weiland [296] generalized the FDTD method to a general numerical method for all electrodynamics. In fact, the Finite Integration Technique is not restricted to electrodynamics but is also a suitable numerical method for other subjects. For example, the Finite Integration Technique was recently applied in acoustics [321], elastodynamics [94], and to temperature problems [283], [203]. The Finite Integration Technique makes only some usual idealizations concerning the materials: The materials of the given objects have to be piecewise linear, homogeneous, and isotropic so that the subdomains with constant material parameters ($\varepsilon$, $\mu$, $\kappa$) are at least as big as the elementary volumes used.

The FIT Grid. The discretization of Maxwell's equations, i.e., the field computation in a finite number of discrete points, gives a decomposition of the solution space into grid cells. The first step towards that goal is to define a finite volume $\Omega$, the calculation domain. Now, cover $\Omega$ with a grid $G$. As an example, we can consider a simple Cartesian coordinate grid. It has to be stressed at this point, however, that a FIT grid $G$ is defined in much more generality and that its definition also includes non-coordinate grids as well as non-orthogonal grids. For the simple Cartesian coordinate grid, we will explain the derivation of the Maxwell Grid Equations: The grid is composed of the so-called elementary volumes $V_i$ or FIT cells. Each FIT cell is filled with a homogeneous material.
The intersection of two elementary volumes is an elementary area $A_i$, the intersection of two elementary areas an elementary line $L_i$, on which the unknown state variables are allocated. Definition 2.3.1 gives a formal description of the FIT grids:
Definition 2.3.1. A FIT grid $G$ is defined as:
- $G \in \mathbb{R}^3$ ($\mathbb{R}^2$) simply connected
- elementary volumes $V = \{V_1, \ldots, V_{n_V}\}$ with $G = \bigcup V_i$, $V_i \neq \emptyset$
- elementary areas $A = \{A_1, \ldots, A_{n_A}\}$ with $\{A_i\} := \bigcap V_i$
- elementary lines $L = \{L_1, \ldots, L_{n_L}\}$ with $\{L_i\} := \bigcap A_i$
- points $P = \{P_1, \ldots, P_{n_P}\}$ with $\{P_i\} := \bigcap L_i$

Figure 2.8. A simple example of an FIT grid and its elementary parts (b: magnetic flux, e: electric voltage).
In the next step, Maxwell's equations will be transferred to the FIT grid G. First, we examine the first of Maxwell's equations, viz. the induction law:
$$\oint_{\partial A} E \cdot ds = -\int_A \frac{\partial B}{\partial t} \cdot dA.$$

An elementary volume $V_i$ for the simplest case is shown in Fig. 2.9. The left-hand side of the equation is treated as follows: Consider the integral over the boundary of the elementary area $A_i$ and introduce the electric voltage along the elementary line $L_i$ as a state variable on the grid $G$:
Definition 2.3.2. The electric (grid) voltage $e_i$ is assigned to each elementary line $L_i$ as a state variable:

$$e_i := \int_{L_i} E \cdot ds.$$
Therefore, the contour integral $\oint_{\partial A_i} E \cdot ds$ reduces to the signed sum

$$e_i + e_j - e_k - e_l.$$
Thus, the allocation of state variables is carried out in a very natural way. It is sufficient to examine the integral over one elementary area. Suppose the integration area $A$ had the shape shown in Fig. 2.10. Obviously, the total integral can be written as the sum of integrals over the elementary areas $A_i$. The surface integral

$$-\frac{\partial}{\partial t} \int_A B \cdot dA$$

on the right-hand side of the induction law has to be transferred appropriately to the grid $G$. The elementary area $A_i$ has already been chosen as the area of integration. Therefore, the direction of $dA$ is already determined. Analogously, a state variable on $G$ is now also assigned to the magnetic flux density (cf. Fig. 2.9):

Figure 2.9. Elementary volume of a Cartesian FIT grid $G$ with the allocated state variables electric grid voltage (left) and magnetic grid flux (right).

Figure 2.10. Allocation of electric grid voltages on an integration area which is composed of several elementary areas.

Definition 2.3.3. The magnetic (grid) flux $b_i$ normal to the elementary area $A_i$ is assigned as a state variable to each elementary area $A_i$:

$$b_i = \int_{A_i} B \cdot dA.$$
Thus, the induction law in discrete form is given by

$$e_i + e_j - e_k - e_l = -\frac{\partial}{\partial t} b_i.$$

After appropriate numbering of the points of the grid, the state variables $e_i$ and $b_i$, $i = 1, \ldots, 3N$, can be stored in vectors $e$ and $b$. The factors $\{-1, 1\}$ are stored in a matrix of size $3N \times 3N$ which reflects the topology of the grid and will be denoted by $C$.
$C$ is a block matrix with blocks $\pm P_x$, $\pm P_y$, $\pm P_z$ and $0$:

$$C = \begin{pmatrix} 0 & -P_z & P_y \\ P_z & 0 & -P_x \\ -P_y & P_x & 0 \end{pmatrix},$$

where $x$, $y$, $z$ are the three coordinate directions. The blocks $P_x$, $P_y$, $P_z$ each have $-1$'s on their main diagonals and $1$'s on some sub- (or super-)diagonal whose distance to the main diagonal agrees with the chosen numbering. On dual-orthogonal FIT grids ($G \perp \tilde G$, cf. Def. 2.3.4), we have

$$P_x \,\hat{=}\, \frac{\partial}{\partial x}, \qquad P_y \,\hat{=}\, \frac{\partial}{\partial y}, \qquad P_z \,\hat{=}\, \frac{\partial}{\partial z}.$$

Further details can be found in [305]. Thus, the discrete form of the induction law is

$$C e = -\dot b.$$
Next, the fourth of Maxwell's equations

$$\oint_{\partial V} B \cdot dA = 0$$

is transferred to the grid $G$. Again, the total integral can be written as the sum of integrals over the components. Consequently, the surface integrals are taken over the surfaces of the elementary volumes $V_i$, and the magnetic fluxes $b_i$ are used again. Fig. 2.9 shows the allocation of the magnetic flux on $V_i$. This yields the difference equation

$$b_i + b_j + b_k - b_l - b_m - b_n = 0.$$
Introduce a matrix $S$ of size $N \times 3N$ with elements $\{-1, 1\}$ corresponding to the topology of the grid:

$$S := (P_x \mid P_y \mid P_z).$$

So, the discrete form of the fourth Maxwell equation is $Sb = 0$. It remains to transfer the second and third of Maxwell's equations. This is completely analogous to the transfer of the first and fourth of Maxwell's equations. For that, the so-called dual grid $\tilde G$ is introduced. For a Cartesian grid, it equals the grid $G$ shifted by half a cell length. Again, the definition of the dual FIT grid is much more general than this example may suggest.
Figure 2.11. Elementary volume of the dual grid $\tilde G$ for the Cartesian FIT grid $G$ and the allocated state variables 'magnetic grid voltage' and 'electric grid flux' (d: electric flux, h: magnetic voltage).
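Definition 2.3.4. The dual FIT grid $\tilde G$ can be formally described as
- $\tilde G$ as $G$ with $\tilde V$, $\tilde A$, $\tilde L$, $\tilde P$
- $\forall \tilde V_j \ \exists P_i$ with $P_i \in \tilde V_j$, $\forall V_j \ \exists \tilde P_i$ with $\tilde P_i \in V_j$
- $\forall \tilde A_j \ \exists L_i$ with $L_i \cap \tilde A_j \neq \emptyset$, $\forall A_j \ \exists \tilde L_i$ with $\tilde L_i \cap A_j \neq \emptyset$.

The dual-orthogonal FIT grid is defined by the relations $L \perp \tilde A$ and $\tilde L \perp A$.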
The state variables $h$ and $d$ are introduced analogously to $e$ and $b$:

Definition 2.3.5. The magnetic voltage $h_i$ is assigned as a state variable to each dual elementary line $\tilde L_i$:

$$h_i := \int_{\tilde L_i} H \cdot ds.$$

The electric flux $d_i$ normal to the dual grid surface $\tilde A_i$ is assigned as a state variable to each dual elementary area $\tilde A_i$:

$$d_i = \int_{\tilde A_i} D \cdot dA.$$

Furthermore, the total electric current $j_i$ normal to the dual grid surface is assigned as a state variable to each dual elementary area $\tilde A_i$:

$$j_i = \int_{\tilde A_i} J \cdot dA.$$

The discrete charges $q_i$ (allocated in $P_i$ on $G$) are assigned as state variables to each dual elementary volume $\tilde V_i$:

$$q_i = \int_{\tilde V_i} \rho \, dV.$$
The topology of the dual grid is stored in topological matrices $\tilde C$ and $\tilde S$, analogous to $C$ and $S$. Thus, the discrete form of Ampere's law is

$$\tilde C h = \dot d + j$$

and the third of Maxwell's equations is

$$\tilde S d = q.$$

In electrostatics and for stationary current problems, the electric field $E$ can be expressed as the gradient of the scalar potential $\varphi$; in electro-quasistatics, the complex electric phasor $\underline{E}$ can be represented as the gradient of a complex scalar potential $\underline{\varphi}$. In magnetostatics, one uses a vector potential, but, as will be shown in subsection 2.3.2, it is possible to reduce the problem to a scalar potential problem. The FIT formulation for statics and electro-quasistatics uses the obvious idea of allocating the potentials directly on the grid $G$, in the points $P_i \in G$. The electric and magnetic fields are then each allocated on the elementary lines of the FIT grid $G$. The integral $\int E \cdot ds$, e.g., then simply corresponds to the difference of the potentials at these grid points.
Definition 2.3.6. In statics (electro-quasistatics), a real (complex) scalar potential $\Phi_{E,i}$, $\Phi_{M,i}$ ($\underline{\Phi}_i$) is allocated in each point $P_i \in G$. Let $L_i$ be the elementary line between two points $P_a$ and $P_b$ of $G$. Then the electric gradient $e_i$ in statics ($\underline{e}_i$ in electro-quasistatics) allocated on $L_i$ is defined as

$$e_i := \Phi_{E,b} - \Phi_{E,a} \qquad (\underline{e}_i := \underline{\Phi}_b - \underline{\Phi}_a, \ \text{respectively})$$

and the magnetic gradient of statics is defined as

$$h_i := \Phi_{M,b} - \Phi_{M,a}.$$
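A small sketch may illustrate why allocating potentials in the grid points is consistent with the discrete induction law in statics: when the grid voltages are potential differences, the signed sum $e_i + e_j - e_k - e_l$ around an elementary area vanishes for any choice of potentials. The corner labels `phi00` to `phi11`, the edge orientations, and the function name `face_circulation` are assumptions made for this illustration only.

```python
def face_circulation(phi00, phi10, phi11, phi01):
    """Return e_i + e_j - e_k - e_l for one rectangular elementary area.

    The four arguments are the potentials in the corners P00, P10, P11, P01
    (hypothetical labels). Each grid voltage is the potential difference
    along its edge, taken in the positive coordinate direction.
    """
    e_i = phi10 - phi00  # bottom edge, P00 -> P10
    e_j = phi11 - phi10  # right edge,  P10 -> P11
    e_k = phi11 - phi01  # top edge,    P01 -> P11
    e_l = phi01 - phi00  # left edge,   P00 -> P01
    return e_i + e_j - e_k - e_l


print(face_circulation(3, 7, -2, 5))
```

All corner potentials cancel pairwise in the sum, so the result does not depend on the values passed in.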
Remark 2.3.1. The grid pair $(G, \tilde G)$ is
- not necessarily parallel to some coordinate axis,
- not necessarily orthogonal (see footnote 6),
- not necessarily regular.

For a Cartesian grid, the state variables $h$ and $d$ are allocated in $\tilde G$ in the same way as $e$ and $b$ in $G$. Such grids are also referred to as "staggered grids" (Yee). For many but not all applications, it is not important on which of the two grids the first or second Maxwell equation is discretized.

Footnote 6. An angle to the normal of an elementary line is allowed if $\mu$ or $\varepsilon$ and $\kappa$ do not differ in the adjacent elementary volumes.
The Maxwell Grid Equations. To summarize the above, let us formally define the Maxwell Grid Equations. It should be emphasized that no approximations have been made so far. Only when the material equations are transferred to the grid space, FIT will require some approximations. Definition 2.3.7. The following discretization method for Maxwell's equations is called Finite Integration Technique (shortly FIT). 1. A FIT grid G is chosen for the given solution domain. The grid points are assumed to be numbered appropriately. The following state variables are introduced:
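- $e_i = \int_{L_i} E \cdot ds$: electric (grid) voltage
- $b_i = \int_{A_i} B \cdot dA$: magnetic (grid) flux
- $h_i = \int_{\tilde L_i} H \cdot ds$: magnetic (grid) voltage
- $d_i = \int_{\tilde A_i} D \cdot dA$: electric (grid) flux
- $j_i = \int_{\tilde A_i} J \cdot dA$: electric total current
- $q_i = \int_{\tilde V_i} \rho \, dV$: discrete (grid) charges (allocated in $P_i$ on $G$)

2. The discrete analogue to Maxwell's equations is given by

$$\begin{aligned}
\sum_k c_{ik} e_k &= -\frac{\partial b_i}{\partial t}, \\
\sum_k \tilde c_{ik} h_k &= \frac{\partial d_i}{\partial t} + j_i, \\
\sum_k s_{ik} b_k &= 0, \\
\sum_k \tilde s_{ik} d_k &= q_i,
\end{aligned}$$

3. which yields the following system of linear equations

$$\begin{aligned}
C e &= -\dot b, \\
\tilde C h &= \dot d + j, \\
S b &= 0, \\
\tilde S d &= q.
\end{aligned}$$

Because of its consistency and generality, this system is referred to as Maxwell Grid Equations.

4. The operators $C$, $\tilde C$, $S$, and $\tilde S$ can be interpreted as follows:

$$C, \tilde C \equiv \text{curl}, \qquad S, \tilde S \equiv \text{divergence}, \qquad -S^T, -\tilde S^T \equiv \text{gradient}.$$

Therefore, they are called the discrete curl operator, the discrete divergence operator, and the discrete gradient operator.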
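Remark 2.3.2. The replacement of field components by state variables in the form used here and in an analogous form was made in [297] and was taken up again in [76] and [116]; also see [72]. This notation requires fewer parameters and appears more elegant. However, equivalences such as $ds \,\hat{=}\, D_s$ get lost in the notation used in Def. 2.3.7. In the alternative notation, the field components are stored in vectors $\mathbf{e}$, $\mathbf{b}$, $\mathbf{h}$, $\mathbf{d}$, $\mathbf{j}$ (set in bold here to distinguish them from the state variables), and the voltages and fluxes from Def. 2.3.7 can be obtained by multiplication by matrices $D_s$, $D_A$, $\tilde D_s$, $\tilde D_A$, which contain the lengths of the elementary line segments and the areas of the FIT grids $G$ and $\tilde G$: $e = D_s \mathbf{e}$, $b = D_A \mathbf{b}$, $h = \tilde D_s \mathbf{h}$, $d = \tilde D_A \mathbf{d}$ and $j = \tilde D_A \mathbf{j}$. In this notation, the equivalence to Maxwell's equations in integral form becomes more obvious:

$$\begin{aligned}
C D_s \mathbf{e} &= -D_A \dot{\mathbf{b}}, \\
\tilde C \tilde D_s \mathbf{h} &= \tilde D_A (\dot{\mathbf{d}} + \mathbf{j}), \\
S D_A \mathbf{b} &= 0, \\
\tilde S \tilde D_A \mathbf{d} &= q.
\end{aligned}$$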
In the following, we use only the state variables from Def. 2.3.7. The most important characteristics of the Finite Integration Technique are briefly summarized next. At this point, no detailed proofs will be given, since they can be found in the literature ([296], [302], [304], [305], [306], [79]).

Lemma 2.3.1. By the duality of the grids $G$ and $\tilde G$, the equality

$$C^T = \tilde C$$

holds for the curl matrices.

The elements of the matrices $C$, $\tilde C$, $S$, and $\tilde S$ may take only three values: $0$, $+1$, and $-1$. The matrices are also banded with only a few non-vanishing bands.

Theorem 2.3.1. The Finite Integration Technique represents a consistent method of the first order for the solution of Maxwell's equations.
The integrals arising in the Finite Integration Technique represent Maxwell's equations in a very natural way. This is due to a special allocation of the field quantities and suitably chosen rules of integration. The conservation of the relations between the different integrals in Maxwell's equations guarantees the consistency. It can be shown (cf. [304]) that the analytic properties of Maxwell's equations have their counterparts for the discrete solutions. In particular, the property that the curl fields are free of sources is preserved [304], as stated in the following lemma.

Lemma 2.3.2. The Finite Integration Technique preserves the vector analytical relations between curl and div. In particular,

$$\mathrm{div}\,\mathrm{curl} \equiv 0 \quad \Longleftrightarrow \quad S C = 0 \ \text{ and } \ \tilde S \tilde C = 0.$$

Therefore, the discrete third and fourth of Maxwell's equations are satisfied.
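The relation $SC = 0$ can be checked directly on a small example. The following sketch builds the blocks $P_x$, $P_y$, $P_z$ for a Cartesian grid with canonical point numbering and assembles $C$ and $S$ as dense lists. It rests on assumptions that are not taken from the text: the block layout of $C$ used below (the usual curl pattern), the plain banded form of the blocks without any special treatment of boundary rows, and all function names.

```python
def difference_block(n_points, offset):
    """Return P with -1 on the main diagonal and +1 on the super-diagonal at `offset`."""
    p = [[0] * n_points for _ in range(n_points)]
    for row in range(n_points):
        p[row][row] = -1
        if row + offset < n_points:
            p[row][row + offset] = 1
    return p


def negate(m):
    return [[-v for v in row] for row in m]


def assemble_blocks(block_rows):
    """Glue a 2D list of equally sized square blocks into one dense matrix."""
    out = []
    for blocks in block_rows:
        for r in range(len(blocks[0])):
            out.append([v for blk in blocks for v in blk[r]])
    return out


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def build_operators(nx, ny, nz):
    """Return (C, S) for an nx * ny * nz point grid, numbered x-fastest."""
    n = nx * ny * nz
    px = difference_block(n, 1)        # neighbour in x: index + 1
    py = difference_block(n, nx)       # neighbour in y: index + nx
    pz = difference_block(n, nx * ny)  # neighbour in z: index + nx * ny
    zero = [[0] * n for _ in range(n)]
    # Assumed block layout of the discrete curl.
    curl = assemble_blocks([
        [zero, negate(pz), py],
        [pz, zero, negate(px)],
        [negate(py), px, zero],
    ])
    div = assemble_blocks([[px, py, pz]])
    return curl, div


C, S = build_operators(3, 2, 2)
SC = matmul(S, C)
print(all(v == 0 for row in SC for v in row))
```

The product vanishes because the banded blocks commute with each other, so every block of $SC$ is a difference of two equal products.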
This property distinguishes the Finite Integration Technique from a wealth of other possible discretization methods for Maxwell's equations. Because of this property, there is a unique possibility to check the numerical solution for correctness and precision. For static fields, one uses a potential formulation, i.e., the fields are represented as the gradients of potentials. The Finite Integration Technique preserves the analytical property that gradient fields are irrotational:

Lemma 2.3.3. The Finite Integration Technique preserves the vector analytical relations between curl and grad. In particular,

$$\mathrm{curl}\,\mathrm{grad} \equiv 0 \quad \Longleftrightarrow \quad \tilde C S^T = 0 \ \text{ and } \ C \tilde S^T = 0.$$
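The statement of Lemma 2.3.3 can be traced back to Lemmas 2.3.1 and 2.3.2 by transposition. The following short derivation is a sketch under the assumption that the two relations of the lemma read $\tilde C S^T = 0$ and $C \tilde S^T = 0$; it uses only $C^T = \tilde C$, $SC = 0$, and $\tilde S \tilde C = 0$:

```latex
\begin{aligned}
\tilde C S^T &= C^T S^T = (S C)^T = 0, \\
C \tilde S^T &= \tilde C^T \tilde S^T = (\tilde S \tilde C)^T = 0.
\end{aligned}
```

In this reading, the discrete counterpart of curl grad $\equiv 0$ needs no separate proof once the duality of the grids and the discrete counterpart of div curl $\equiv 0$ are established.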
Another analytic property which is also preserved for the electromagnetic fields calculated using the Finite Integration Technique is the principle of energy conservation:

Lemma 2.3.4. In the FIT grid space, the total electromagnetic energy is given by the sum of the individual field energies

$$\frac{\partial W}{\partial t} = \frac{\partial W_e}{\partial t} + \frac{\partial W_m}{\partial t} = -j^T e^*.$$

The so-called "ghost modes" (see, e.g., [197]) can be excluded a priori in a very simple way for the Finite Integration Technique computation of steady-state electromagnetic fields with harmonic time dependence.

Theorem 2.3.2. The solution space $\Omega_{\omega^2}$ of the eigenvalue problems for time-harmonic free oscillations (modes) can be written as the direct sum of the space $\Omega_s$ of the physical modes and the space $\Omega_w$ of the so-called "ghost modes":

$$\Omega_{\omega^2} = \Omega_s \oplus \Omega_w.$$
If the curl fields are not free of sources in the grid space, i.e., if div curl $\equiv 0$ does not hold there, the distinction between the physical modes and the ghost modes is not so obvious. Methods such as the Finite Element method often use penalties [197]; in particular, the grad div term in the eigenvalue equation usually has some penalty factor $p$ assigned to it. Upon solving the equations with different factors $p$, the desired eigenfunctions can be recognized since they do not show any dependence on $p$. Note that the matrices arising in the Finite Integration Technique can also be obtained by some other appropriately chosen discretization method for partial differential equations [304]. However, the methodology described here stands out by the apparently more natural derivation of the appropriate discretization.
Approximation of Material Properties. Until now, no approximations have been used at all. They become necessary when the material equations are transferred to the grid space. On the FIT grid, the following state variables are defined:

$$d_i = \int_{\tilde A_i} D \cdot dA, \quad e_i = \int_{L_i} E \cdot ds, \quad b_i = \int_{A_i} B \cdot dA, \quad h_i = \int_{\tilde L_i} H \cdot ds, \quad j_i = \int_{\tilde A_i} J \cdot dA, \quad q_i = \int_{\tilde V_i} \rho \, dV.$$

The state variables $d_i$ and $e_i$ ($b_i$ and $h_i$, respectively) are each allocated at the same points. There is an analogue of the material equations relating them to each other. In order to find that analogue of the material equations, the grid flux is divided by the grid voltage. The ratios $d_i / e_i$ and $b_i / h_i$ ($j_i / e_i$, respectively) are then approximated by averaging the corresponding material parameters. The averaged quantities are combined in the so-called material matrices $D_\varepsilon$, $D_\mu$ and $D_\kappa$. Thus, the following transfer of the electromagnetic material equations to the grid space results:

$$\begin{aligned}
D = \varepsilon E \quad &\longrightarrow \quad d = D_\varepsilon e, \\
B = \mu H \quad &\longrightarrow \quad b = D_\mu h, \\
J_l = \kappa E \quad &\longrightarrow \quad j_l = D_\kappa e.
\end{aligned}$$
A more detailed description of this approximation can be found in [305].

Discretization of Integrals. The state variables are defined as surface or line integrals over elementary areas or elementary lines. One needs a one-to-one correspondence between the discrete field quantities and the state variables. To obtain such a correspondence, we use the definition of an integral as a limit. This is analogous to the discretization of differential equations in the Finite Difference methods. The simplest numerical integration over an interval uses one supporting point in the middle of the interval:

$$\int_{x_0}^{x_0 + \Delta} f(x) \, dx = \Delta \cdot f\Big(x_0 + \frac{\Delta}{2}\Big) + O(\Delta^3). \qquad (2.7)$$

The simplest integration formula for a surface integral is

$$\int_{x_0}^{x_0 + \Delta} \int_{y_0}^{y_0 + \Delta} f(x, y) \, dx \, dy = \Delta^2 \cdot f\Big(x_0 + \frac{\Delta}{2}, \, y_0 + \frac{\Delta}{2}\Big) + O(\Delta^4). \qquad (2.8)$$
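The orders of the remainders in (2.7) and (2.8) can be checked numerically. The sketch below is an illustration under assumptions of our own choosing: the test functions $f(x) = e^x$ and $f(x, y) = e^{x+y}$, the starting point $x_0 = y_0 = 0$, and the function names. Halving $\Delta$ should reduce the error by a factor of about $2^3 = 8$ in the one-dimensional case and about $2^4 = 16$ in the two-dimensional case.

```python
import math


def midpoint_error_1d(delta):
    """Error of rule (2.7) for f(x) = exp(x) on [0, delta]."""
    exact = math.expm1(delta)  # integral of exp(x) over [0, delta]
    return abs(exact - delta * math.exp(delta / 2))


def midpoint_error_2d(delta):
    """Error of rule (2.8) for f(x, y) = exp(x + y) on [0, delta]^2."""
    exact = math.expm1(delta) ** 2  # the double integral factorizes
    return abs(exact - delta ** 2 * math.exp(delta))


delta = 0.1
ratio_1d = midpoint_error_1d(delta) / midpoint_error_1d(delta / 2)
ratio_2d = midpoint_error_2d(delta) / midpoint_error_2d(delta / 2)
print(ratio_1d, ratio_2d)
```

The ratios only approach 8 and 16 as $\Delta$ tends to zero; for a finite $\Delta$ the higher-order terms of the remainder shift them slightly.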
This approximation formula is used in the Finite Integration Technique. The field quantities are allocated on the grid pair $(G, \tilde G)$ in the same way as the state variables. Relative to the leading term, the remainder in the integral formulas has order $O(\Delta^2)$.

Special Properties of Dual-orthogonal FIT Grids. Finally, for dual-orthogonal FIT grids, still another vector analytic property of electromagnetic fields is transferred to the algebraic solutions:

Lemma 2.3.5. On dual-orthogonal FIT grids, the continuity of $E_\parallel$ and $B_\perp$ holds at interfaces of different materials.
For the proof, consider a grid distribution with $e$ and $b$. (i) The electric field $E$ is only represented by components tangential to the elementary volumes. Therefore, the computed field strength is continuous even if the material filling of adjacent cells differs in $\varepsilon$ or $\kappa$ [142]. (ii) Continuity also holds for adjacent elementary volumes with different $\mu$, since one computes only the components of the magnetic flux density normal to the material surface [142]. This shows clearly the main advantage of the Finite Integration Technique over the methods that allocate all the field components in one point and consequently cannot guarantee the continuity of the computed field quantities. But there exist variants of the Finite Element method allocating the field components in a very similar way to the Finite Integration Technique, e.g., [46], [211]. They have briefly been described in section 2.2.

Remark 2.3.3. On a FIT grid, sharp, perfectly conducting edges do not lead to dramatic convergence degradation since no vector field is allocated in the corners of sharp edges, which constitute places of singularity - the convergence is still of first order [295]. To get an improved field representation in the vicinity of a corner, an edge correction may be implemented [118].
For the unique solvability of Maxwell's equations and thus of the matrix equations, the boundary conditions have to be satisfied on the outermost surface of the given object. Those boundary planes may not necessarily coincide with the boundary of the grid. Remark 2.3.4. On a FIT grid, the boundary conditions can be incorporated in a natural way during the setup of the difference equations, so that they will be satisfied automatically.
On a perfectly conducting surface, part of the boundary condition is the vanishing of the tangential components of the electric field. In addition, the normal components of the magnetic field must vanish on that surface. The perfectly conducting surface is approximated in the FIT grid by the surfaces of some elementary volumes of the FIT grid. The state variables $e_i$ and $b_i$ are allocated in such a way that the field components of $E$ are tangential and those of $B$ are normal to the surface. Therefore, the boundary conditions can already be included into the set of difference equations by just setting the corresponding state variables to zero. As already noted, the Finite Integration Technique makes it possible to choose different coordinate systems and different grid types. The coordinate system is chosen so as to get the best possible discretization of the material distribution for the problem at hand. It is self-evident that this becomes much easier if at least a part of the boundary coincides with some coordinate planes. Besides the coordinate system, an appropriate grid type can also be chosen. Figures 2.12 and 2.13 show two typical grid types.
Figure 2.12. Elementary volume of a three-dimensional Cartesian grid with dual grid $\tilde G$, material filling and allocation of the components of E and B.
Irregular but orthogonal FIT grids correspond to the so-called box schemes for Finite Difference methods. The grid $G$ corresponds to a primary decomposition, the dual grid $\tilde G$ to a secondary decomposition. The dual-orthogonal FIT grid $(G, \tilde G)$, with the triangular grid $G$ and the dual hexagonal grid $\tilde G$ obtained by taking the perpendicular bisectors of the grid lines of $G$, is called a net of weakly acute type [123] in the context of box schemes. To summarize, let us emphasize that the area of application of the Finite Integration Technique completely agrees with that of Maxwell's equations. For the possible grid types and the variety of possible applications, we refer the reader to the existing literature. A characteristic of the Finite Integration Technique which should be stressed is its applicability to irregular grids. [277], [286] discuss triangular grids, [32] three-dimensional dual non-orthogonal grids. [302] and [304] give very detailed surveys. Next, we apply the Finite Integration Technique to several classes of problems in field theory.
Figure 2.13. Two-dimensional triangular grid and computed fields for a cylindrically symmetric structure (cavity filled with ferrite and ceramics). The illustration at the top left shows the triangular grid $G$; the corresponding dual grid $\tilde{G}$ is shown at the top right. The electric field E is displayed at the bottom left, the isometric lines of H at the bottom right. Material labels in the figure: μ = 2.0 and ε = 1.5.
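2.3.2 Stationary Fields

Electrostatics. If the Finite Integration Technique is used to solve Maxwell's equations, the following equivalence between the continuous and the discrete equations results in the case of electrostatics:

\[
\begin{aligned}
\operatorname{curl} E &= 0 &&\iff& C\,\mathbf{e} &= 0,\\
\operatorname{div} D &= \rho &&\iff& \tilde{S}\,\mathbf{d} &= \mathbf{q}.
\end{aligned}
\]

As in (1.41), a scalar potential $\varphi$ with $E = -\operatorname{grad}\varphi$ is introduced, which leads to

\[
\operatorname{div}(\varepsilon \operatorname{grad}\varphi) = -\rho
\quad\iff\quad
\tilde{S} D_\varepsilon \tilde{S}^T \boldsymbol{\Phi} = \mathbf{q}.
\]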
The boundary conditions can be of any of the four types we introduced earlier. For the Dirichlet boundary conditions, i.e., if
Magnetostatics. If the Finite Integration Technique is used to solve Maxwell's equations, the following equivalence between the continuous and the discrete equations results in the case of magnetostatics:

\[
\begin{aligned}
\operatorname{curl} H &= J_E &&\iff& \tilde{C} D_\mu^{-1}\mathbf{b} &= \mathbf{j}_e,\\
\operatorname{div} B &= 0 &&\iff& S\,\mathbf{b} &= 0.
\end{aligned}
\]

It is not always necessary to choose the vector potential formulation, as described in subsection 1.4.1. For problems which can be described using two dimensions, i.e., in cylindrically symmetric or longitudinally invariant structures, a vector potential $A = (0, 0, A_z)$ with only one component in longitudinal direction is sufficient:

\[
B = \operatorname{curl} A \quad\iff\quad \mathbf{b} = C\,\mathbf{a}.
\]

The following continuous (discrete) equation has to be solved:

\[
\operatorname{curl}\Bigl(\frac{1}{\mu}\operatorname{curl} A\Bigr) = J_E
\quad\iff\quad
\tilde{C} D_\mu^{-1} C\,\mathbf{a} = \mathbf{j}_e.
\]
The properties of the matrix C and the subsequent system matrices are treated in subsection 2.4. In the three-dimensional case, following the formulation for the analytic solution of differential equations, a general homogeneous solution is combined with a special inhomogeneous solution. In this way, it is possible to work
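with a scalar potential also in magnetostatics. Thus, the magnetic field H is decomposed into a homogeneous and an inhomogeneous part:

\[
H := H_h + H_i.
\]

These parts of the solution have to satisfy the following equations:

\[
\operatorname{curl} H_h = 0, \qquad \operatorname{curl} H_i = J_E.
\]

Using de facto non-physical magnetic charges as an auxiliary quantity, one obtains the linear system

\[
\operatorname{div}(\mu \operatorname{grad}\varphi) = \operatorname{div}(\mu H_i),
\qquad\text{with}\qquad
\rho_m := \operatorname{div}(\mu H_i).
\]

This can be interpreted as a divergence equation with a homogeneous part determined by a scalar potential, $H_h = -\operatorname{grad}\varphi$:

\[
\operatorname{div}(\mu \operatorname{grad}\varphi) = \rho_m.
\]

This expression is identical in form to the fundamental equation of electrostatics. It yields a discrete grid equation with the system matrix $\tilde{S} D_\mu \tilde{S}^T$. The total field is then

\[
H = H_i + H_h = H_i - \operatorname{grad}\varphi.
\]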
Again, it should be noted that the scalar potential
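For stationary current fields, the equations read

\[
\begin{aligned}
\operatorname{curl} E &= 0 &&\iff& C\,\mathbf{e} &= 0,\\
\operatorname{div}(\kappa E) &= 0 &&\iff& \tilde{S} D_\kappa\,\mathbf{e} &= 0.
\end{aligned}
\]

The potential formulation for the electric field E is identical to that of electrostatics. The potential equation to be solved is given by

\[
\operatorname{div}(\kappa \operatorname{grad}\varphi) = 0
\quad\iff\quad
\tilde{S} D_\kappa \tilde{S}^T \boldsymbol{\Phi} = 0.
\]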
Again, there may be four types of boundary value problems. Dirichlet's and Neumann's boundary conditions are the same as in electrostatics; open and periodic boundary conditions can be used as well.
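The potential equations of this subsection all lead to systems of the form $\tilde{S} D \tilde{S}^T \boldsymbol{\Phi} = \mathbf{q}$. As a minimal sketch, assume a one-dimensional chain of grid nodes with unit spacing, one material value per edge, and Dirichlet values at both ends. All function names are hypothetical, and the tridiagonal (Thomas) solver is a choice made for this sketch only.

```python
def divergence_matrix(n):
    """Discrete divergence for a 1D chain: n interior nodes, n + 1 edges."""
    S = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        S[i][i] = -1.0       # edge entering interior node i
        S[i][i + 1] = 1.0    # edge leaving interior node i
    return S


def system_matrix(S, material):
    """A = S D S^T with the diagonal material matrix D = diag(material)."""
    n, m = len(S), len(material)
    return [[sum(S[i][k] * material[k] * S[j][k] for k in range(m))
             for j in range(n)] for i in range(n)]


def solve_tridiagonal(A, b):
    """Thomas algorithm for a tridiagonal, positive definite matrix A."""
    n = len(b)
    diag = [A[i][i] for i in range(n)]
    rhs = list(b)
    for i in range(1, n):
        m = A[i][i - 1] / diag[i - 1]
        diag[i] -= m * A[i - 1][i]
        rhs[i] -= m * rhs[i - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (rhs[i] - A[i][i + 1] * x[i + 1]) / diag[i]
    return x


def solve_potential(material, q, phi_left, phi_right):
    """Solve S D S^T phi = q with fixed potentials at both ends of the chain."""
    n = len(q)
    S = divergence_matrix(n)
    A = system_matrix(S, material)
    # Edge voltages are e = S^T phi + e_bc; e_bc carries the known boundary values.
    e_bc = [0.0] * (n + 1)
    e_bc[0] = phi_left
    e_bc[n] = -phi_right
    b = [q[i] - sum(S[i][k] * material[k] * e_bc[k] for k in range(n + 1))
         for i in range(n)]
    return solve_tridiagonal(A, b)


# Homogeneous material, no charges: the potential varies linearly between the ends.
phi = solve_potential([1.0] * 5, [0.0] * 4, phi_left=0.0, phi_right=1.0)
```

For a homogeneous material the matrix reduces to the familiar tridiagonal pattern (-1, 2, -1), which is symmetric and positive definite, in line with the properties discussed in subsection 2.4.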
Stationary Temperature Fields. The equations of the electrostatic field problem are formally identical to the governing equation for the stationary temperature distribution in a thermal process dominated by thermal conduction. Such questions arise in theoretical electrical engineering, e.g., in connection with wall losses in rf structures. The formal analogy of the differential equations makes it reasonable to use the same numerical method, especially in view of coupled field and temperature calculations. For stationary temperature problems, the continuous equation treated with the Finite Integration Technique is

\[
\operatorname{div}(\kappa_T \operatorname{grad} T) = -w.
\]

Now, the Dirichlet boundary condition means that a fixed temperature $T = \text{const}$ is prescribed on the boundary. The Neumann boundary condition $\partial T/\partial n = 0$ means that the field of the temperature gradient has a vanishing normal component. In this case, open and periodic boundary conditions are also allowed. In addition, mixed boundary conditions can be imposed.
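2.3.3 Quasistatic Fields

Electro-Quasistatics. If the Finite Integration Technique is used to solve Maxwell's equations, the following equivalence between the continuous and the discrete equations results for slowly varying fields which mainly depend on the displacement current:

\[
\begin{aligned}
\operatorname{curl} E &= 0 &&\iff& C\,\mathbf{e} &= 0 &\qquad (2.9)\\
\operatorname{curl} H &= (i\omega\varepsilon + \kappa) E + J_0 &&\iff& \tilde{C} D_\mu^{-1}\mathbf{b} &= (i\omega D_\varepsilon + D_\kappa)\,\mathbf{e} + \mathbf{j}_0 &\qquad (2.10)\\
\operatorname{div} D &= \rho &&\iff& \tilde{S} D_\varepsilon\,\mathbf{e} &= \mathbf{q} &\qquad (2.11)\\
\operatorname{div} B &= 0 &&\iff& S\,\mathbf{b} &= 0 &\qquad (2.12)
\end{aligned}
\]

Because of (2.9) the electric field E can be represented as the gradient of a complex scalar potential:

\[
E = -\operatorname{grad}\varphi_E
\quad\iff\quad
\mathbf{e} = \tilde{S}^T \boldsymbol{\Phi}_E. \qquad (2.13)
\]

Substituting (2.13) in (2.10) yields

\[
\operatorname{div}\bigl((i\omega\varepsilon + \kappa)\operatorname{grad}\varphi_E\bigr) = \operatorname{div}(J_L + J_E),
\]

and, on the grid, a linear system for $\boldsymbol{\Phi}_E$ with the matrix $\tilde{S}(D_\kappa + i\omega D_\varepsilon)\tilde{S}^T$, whose right-hand side is obtained by applying $\tilde{S}$ to the discrete current vectors. Let us introduce the following notation:

\[
A_\kappa := \tilde{S} D_\kappa \tilde{S}^T, \qquad A_\varepsilon := \tilde{S} D_\varepsilon \tilde{S}^T,
\]

together with an abbreviation for this right-hand side.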
The comparison with electrostatics and stationary current fields shows that the matrix of the discrete problem is the sum of the "real part" $\tilde{S} D_\kappa \tilde{S}^T$, which is just the system matrix $A_\kappa$ for stationary current fields, and the "imaginary part" $\omega \tilde{S} D_\varepsilon \tilde{S}^T$, which is just the system matrix $A_\varepsilon$ of electrostatics scaled by the frequency $\omega$. Thus (2.14) has to be solved. The system matrix $A = A_\kappa + i\omega A_\varepsilon$ is complex symmetric and, as shown in subsection 2.4, positive stable. For the Dirichlet boundary condition, i.e., for the fixed potential $\Phi_E = \text{const}$ on the boundary, we set $E_\parallel \equiv 0$ because the tangential derivative of $\Phi_E$ vanishes and $E = -\operatorname{grad}\Phi_E$. The Neumann boundary condition $\partial\Phi_E/\partial n = 0$ yields $E_\perp \equiv 0$.

Magneto-Quasistatics. If the Finite Integration Technique is used to solve Maxwell's equations and the material parameters $\varepsilon$, $\mu$ and $\kappa$ do not depend on time, the following equivalence between the continuous and the discrete equations results for slowly varying fields with general time dependence:

\[
\begin{aligned}
\operatorname{curl} E &= -\frac{\partial B}{\partial t} &&\iff& C\,\mathbf{e} &= -\dot{\mathbf{b}},\\
\operatorname{curl} H &= J &&\iff& \tilde{C} D_\mu^{-1}\mathbf{b} &= \mathbf{j} + D_\kappa\,\mathbf{e},\\
\operatorname{div} J &= 0 &&\iff& \tilde{S}\,\mathbf{j} &= 0,\\
\operatorname{div} B &= 0 &&\iff& S\,\mathbf{b} &= 0.
\end{aligned}
\]

Differentiating $\tilde{C} D_\mu^{-1}\mathbf{b} = \mathbf{j} + D_\kappa\mathbf{e}$ with respect to time and substituting $C\mathbf{e} = -\dot{\mathbf{b}}$ yields

\[
\tilde{C} D_\mu^{-1} C\,\mathbf{e} + D_\kappa\,\dot{\mathbf{e}} = -\dot{\mathbf{j}}. \qquad (2.15)
\]

Setting $y(t) := \mathbf{e}$ makes (2.15) into a first-order differential equation

\[
\dot{y}(t) = L\,y(t) + r(t),
\]

with $L = -D_\kappa^{-1}\tilde{C} D_\mu^{-1} C$ and $r(t) = -D_\kappa^{-1}\dot{\mathbf{j}}$. Therefore, an initial value problem has to be solved in the general time-dependent case. For stability reasons this problem cannot be solved by explicit schemes, so it is necessary to use implicit or semi-implicit methods. The algorithm requires linear systems with the above matrix to be solved repeatedly.

For harmonic time dependence, we have $E(r, t) = \operatorname{Re}\bigl(E(r)\,e^{i\omega t}\bigr)$. After differentiating with respect to time and cancelling $\exp(i\omega t)$, we obtain the discrete equation

\[
\bigl(\tilde{C} D_\mu^{-1} C + i\omega D_\kappa\bigr)\,\mathbf{e} = -i\omega\,\mathbf{j},
\]

i.e., a boundary value problem.
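The need for implicit time stepping can be made concrete with a small sketch. It assumes a hypothetical 2 x 2 stand-in for the matrix $L$ with eigenvalues -2 and -2000 (not a FIT matrix) and compares the explicit Euler scheme with the implicit Euler scheme, in which a linear system with the matrix $I - \Delta t\,L$ is solved in every step. All names are illustrative.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def explicit_euler(L, r, y0, dt, steps):
    """y_{n+1} = y_n + dt * (L y_n + r(t_n)); no linear system is solved."""
    n = len(y0)
    y = list(y0)
    for step in range(steps):
        rv = r(step * dt)
        y = [y[i] + dt * (sum(L[i][j] * y[j] for j in range(n)) + rv[i])
             for i in range(n)]
    return y


def implicit_euler(L, r, y0, dt, steps):
    """(I - dt L) y_{n+1} = y_n + dt * r(t_{n+1}); one linear solve per step."""
    n = len(y0)
    lhs = [[(1.0 if i == j else 0.0) - dt * L[i][j] for j in range(n)]
           for i in range(n)]
    y = list(y0)
    for step in range(1, steps + 1):
        rv = r(step * dt)
        y = solve_dense(lhs, [y[i] + dt * rv[i] for i in range(n)])
    return y


# Hypothetical stiff test matrix with eigenvalues -2 and -2000.
L_test = [[-1001.0, 999.0], [999.0, -1001.0]]
no_source = lambda t: [0.0, 0.0]
y_implicit = implicit_euler(L_test, no_source, [1.0, 0.0], dt=0.01, steps=100)
y_explicit = explicit_euler(L_test, no_source, [1.0, 0.0], dt=0.01, steps=100)
```

With this step size the explicit iterate grows without bound, because the amplification factor of the fast mode is 1 + dt * (-2000) = -19, while the implicit iterate decays as the exact solution does.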
2.3.4 General Time-Dependent Fields and Electromagnetic Waves

General Time-Dependent Fields. For general time-dependent fields it is recommended to use the so-called mean-value state variables [76] for the formulation of the discrete initial value problem rather than the integral state variables used before. Let $\mathbf{e}' = [E']$ be the vector of all mean values of the normalized electric field along the grid lines and $\mathbf{h}' = [H']$ the corresponding vector for the magnetic field (cf. (1.32)). With $\mathbf{u}(t) := (\mathbf{e}', \mathbf{h}')^T$, the operator $L$ has the following discrete analogue:

\[
L = \begin{pmatrix}
-\dfrac{\kappa}{\varepsilon} & \dfrac{1}{\sqrt{\varepsilon_0\mu_0}}\,\dfrac{1}{\varepsilon_r}\operatorname{curl}\\[2ex]
-\dfrac{1}{\sqrt{\varepsilon_0\mu_0}}\,\dfrac{1}{\mu_r}\operatorname{curl} & 0
\end{pmatrix}
\quad\iff\quad
\begin{pmatrix}
-D_\varepsilon^{-1} D_\kappa & \dfrac{1}{\sqrt{\varepsilon_0\mu_0}}\, D_{\varepsilon_r}^{-1}\tilde{D}_A^{-1}\tilde{C}\tilde{D}_S\\[2ex]
-\dfrac{1}{\sqrt{\varepsilon_0\mu_0}}\, D_{\mu_r}^{-1} D_A^{-1} C D_S & 0
\end{pmatrix}
\]

The mean-value state variables and the integral state variables are connected via the operators $D_S$, $\tilde{D}_S$, $D_A$, and $\tilde{D}_A$ for the line (surface) integrals on the primary (dual) grid. For instance, $\mathbf{e}$ is obtained from $\mathbf{e}'$ by applying $D_S$ together with the normalization factors.

The initial value problem can be solved by a one-step or multi-step algorithm. Detailed information about the standard methods for the numerical solution of ordinary differential equations can be found in textbooks. [76] is devoted specifically to the initial value problem for high frequency fields described above. For stability reasons, implicit methods are chosen for slowly varying fields. Then it is necessary to solve a linear system with the matrix above in each step of the numerical integration. Since this matrix is very ill-conditioned [207], this presents an extremely demanding problem. Recently, some progress has been achieved in solving this problem [57].

Harmonic Oscillations. Harmonic oscillations such as eddy current problems are problems in the frequency domain. Using the Finite Integration Technique for the solution of Maxwell's equations results in the following equivalence between continuous and discrete equations in the case of excited time-harmonic oscillations:

\[
\begin{aligned}
\operatorname{curl} E &= -i\omega B &&\iff& C\,\mathbf{e} &= -i\omega\,\mathbf{b},\\
\operatorname{curl} H &= i\omega D + J &&\iff& \tilde{C}\,\mathbf{h} &= i\omega\,\mathbf{d} + \mathbf{j},\\
\operatorname{div} D &= \rho &&\iff& \tilde{S}\,\mathbf{d} &= \mathbf{q},\\
\operatorname{div} B &= 0 &&\iff& S\,\mathbf{b} &= 0.
\end{aligned}
\]
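The relevant continuous and discrete material equations are given by

\[
\begin{aligned}
D &= \varepsilon E &&\iff& \mathbf{d} &= D_\varepsilon\,\mathbf{e},\\
B &= \mu H &&\iff& \mathbf{b} &= D_\mu\,\mathbf{h},\\
J &= J_L + J_E &&\iff& \mathbf{j} &= D_\kappa\,\mathbf{e} + \mathbf{j}_e.
\end{aligned}
\]

The fields are excited by an electric current $\mathbf{j}_e$ which flows in a closed current loop or a current-carrying wire. The excitation current can also be the Fourier transform of an elementary particle moving along some trajectory. Substituting the equations into one another yields the so-called discrete curl-curl equation or discrete Helmholtz equation:

\[
\Bigl(\operatorname{curl}\frac{1}{\mu}\operatorname{curl} - \omega^2\varepsilon'\Bigr) E = -i\omega J_E
\quad\iff\quad
\bigl(\tilde{C} D_\mu^{-1} C - \omega^2 D_{\varepsilon'}\bigr)\,\mathbf{e} = -i\omega\,\mathbf{j}_e.
\]

The right-hand side $-i\omega\,\mathbf{j}_e$ represents the impressed current excitation. $\varepsilon'$ combines the conductivity and the permittivity into one complex quantity.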
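The substitution behind the curl-curl equation can be written out in a few lines. The sketch below uses only the discrete Maxwell and material equations stated above. The explicit expression for the complex permittivity in the last line is what these equations imply; it is given as a derived reading rather than as a quotation.

```latex
\begin{align*}
C\mathbf{e} = -i\omega D_\mu \mathbf{h}
  &\;\Longrightarrow\;
  \mathbf{h} = -\frac{1}{i\omega}\, D_\mu^{-1} C\,\mathbf{e},\\
\tilde{C}\mathbf{h} = i\omega D_\varepsilon \mathbf{e} + D_\kappa \mathbf{e} + \mathbf{j}_e
  &\;\Longrightarrow\;
  -\frac{1}{i\omega}\,\tilde{C} D_\mu^{-1} C\,\mathbf{e}
  = (i\omega D_\varepsilon + D_\kappa)\,\mathbf{e} + \mathbf{j}_e,\\
\text{multiplication by } -i\omega
  &\;\Longrightarrow\;
  \tilde{C} D_\mu^{-1} C\,\mathbf{e}
  - \omega^2 \Bigl(D_\varepsilon - \frac{i}{\omega} D_\kappa\Bigr)\mathbf{e}
  = -i\omega\,\mathbf{j}_e,\\
\text{so that}\quad
  & D_{\varepsilon'} = D_\varepsilon - \frac{i}{\omega}\, D_\kappa,
  \qquad \varepsilon' = \varepsilon - \frac{i\kappa}{\omega}.
\end{align*}
```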
A Special Cylindrically Symmetric Problem. For the cylindrically symmetric case without azimuthal dependence (monopole fields) and excitation on the axis, a special multigrid algorithm is presented in subsection 3.7. It has been developed in [277]. Again, normalized fields E' and H' are used. Then Maxwell's equations in integral form are
\[
\begin{aligned}
\oint_{\partial A} E' \cdot ds &= -i\frac{\omega}{c}\int_A \mu_r H' \cdot dA,\\
\oint_{\partial A} H' \cdot ds &= i\frac{\omega}{c}\int_A \varepsilon_r E' \cdot dA + I',\\
\oint_{\partial V} \varepsilon_r E' \cdot dA &= c\,q',\\
\oint_{\partial V} \mu_r H' \cdot dA &= 0.
\end{aligned}
\]

Here $I'$ represents the beam current (cf. section 5) of frequency $\omega$ in some rf structure for the acceleration of elementary particles. This current is concentrated on the axis: $I'(\omega, r, z)$ is proportional to $q\sqrt{Z_0}\,e^{-ik(z-z_0)}$ for $r = 0$ and vanishes otherwise, with the wave number $k = \omega/c$.

As the general solution of an inhomogeneous partial differential equation is the sum of a special solution of the inhomogeneous equation and the general solution of the homogeneous equation, the magnetic field $H'$ is decomposed into an inhomogeneous part $H^*$ (the source field), which is caused by the current $I'$, and the homogeneous part $H^0$ (source-free field):

\[
H' = H^0 + H^*.
\]

Then the solution for the inhomogeneous problem in free space is chosen to be the inhomogeneous part. The current $I'$ induces an azimuthal magnetic field whose amplitude is set by the beam current:

\[
H^*(\omega, r, z) \propto \frac{1}{2\pi r}\, e^{-ik(z-z_0)}.
\]
Next, Maxwell's equations are discretized using mean values of the normalized field and applying an open boundary condition, which was first introduced in [277]. As materials, only ideal conductors and vacuum are assumed, hence the material constants may be omitted in the resulting equations ($\mu_r = \varepsilon_r = 1$ in vacuum). Details about the discretization can be found in [277]. Finally, rewriting the difference equations such that the homogeneous azimuthal field components remain as unknowns, one obtains a complex linear system. The solution $\mathbf{h}^0 = (H^0_1, \ldots, H^0_N)$ includes the unknown azimuthal magnetic field components, while the inhomogeneous part $\mathbf{h}^* = (H^*_1, \ldots, H^*_N)$ goes to the right-hand side of the resulting system (2.16).

The system matrix $M = A + ik'D - k^2 I$ (cf. subsection 2.4.3) has size $N \times N$, where $N$ is the number of grid points. The matrices $A$, $D$, and $I$ are real; $I$ denotes the identity matrix. The matrix $A$ is the same as the matrix arising in the cylindrically symmetric case for the eigenvalue problem and giving the resonant monopole modes in the closed structure; it has only four side bands (cf. [300]). The matrix $D$ is a diagonal matrix expressing the open boundary condition. Again, $k$ stands for the wave number of the chosen excitation frequency, and $k'$ stands for the wave number of the fields propagating into the beam tube (open boundary condition).

Resonant Oscillations. Using the Finite Integration Technique leads to a linear algebraic eigenvalue problem in the case of resonant oscillations. This problem type is treated in detail, e.g., in [302] and [231] for Cartesian FIT grids, in [72] for circular cylindrical FIT grids, and in [277] and [32] for non-orthogonal FIT grids.
2.4 Resulting Linear Systems

Here we give an overview of the linear algebraic systems which arise if one applies the mode matching technique and the Finite Integration Technique to the problems discussed above. Real or complex, sparse or full matrices may occur depending on the application.

2.4.1 Special Properties of Complex Matrices

Here we recall some facts from linear algebra which will be needed later. All the matrices are complex, i.e., with elements in $\mathbb{C}$, unless indicated otherwise.
Definition 2.4.1. A matrix $A$ is called positive definite (positive semi-definite) in $\mathbb{C}^n$ if its quadratic form $(Ax, x)$ is real and positive (nonnegative) for all $x \neq 0$, $x \in \mathbb{C}^n$.

The following theorem gives a necessary and sufficient condition for a matrix to be positive definite.

Theorem 2.4.1. A matrix $A$ is positive definite (positive semi-definite) over $\mathbb{C}^n$ if and only if $A$ is Hermitian and all its eigenvalues are positive (nonnegative). This implies, in particular, that a positive definite matrix is regular (non-singular).

Proof: Axelsson [9]

Matrices all of whose eigenvalues have positive real part are important in many applications. Therefore they deserve a name:

Definition 2.4.2. A matrix $A$ is called positive stable if all its eigenvalues have positive real part.

Further important terms are the following.

Definition 2.4.3. (i) $B := \frac{1}{2}(A + A^H)$ is called the Hermitian part of $A$, (ii) $C := \frac{1}{2}(A - A^H)$ is called the anti- or skew-Hermitian part of $A$.

The identities $B^H = B$ and $C^H = -C$ follow directly from the definitions. Hence the following lemma.

Lemma 2.4.1. A matrix $A$ is positive stable if the Hermitian part of $A$ is positive definite.

Proof: Axelsson [9]

Finally the following term is introduced.

Definition 2.4.4. A matrix $A$ is called quasi-symmetric if it is similar to a symmetric matrix.

Next, we describe the properties of the matrices arising in the mode matching method and the Finite Integration Technique.

2.4.2 Mode Matching Technique

The system matrices are each complex and full (not sparse). Their rank is of order $10^2$. Before the original linear system with matrix

\[
A = \begin{pmatrix}
\underline{S}_{I,I} \pm \underline{E} & \underline{S}_{I,II}\\
\underline{S}_{II,I} & \underline{S}_{II,II} \pm \underline{E}
\end{pmatrix}
\]

can be solved, some other matrices have to be inverted and their determinants have to be evaluated.
The block matrices $\underline{S}_{i,j}$, $i, j = I, II$, are scattering matrices which were constructed by concatenation. Recall that the concatenation of scattering matrices requires the inversion of a matrix of the form $\underline{E}$ minus a product of two blocks of the scattering matrices being concatenated. To determine the scattering matrices from the coupling matrices $C$ and $D$, the full complex matrix $\underline{E} + CD$ has to be inverted. The scattering matrix possesses the following properties [200]:
- The scattering matrix is symmetric: $\underline{S}^T = \underline{S}$
- The scattering matrix is involutory: $\underline{S}\,\underline{S} = \underline{E}$
- The scattering matrix is orthogonal: $\underline{S}\,\underline{S}^T = \underline{E}$
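The three properties listed above can be checked numerically for a simple example. The sketch below uses a real 2 x 2 reflection matrix as a hypothetical stand-in for a scattering matrix; it is not a matrix from the mode matching technique, and all names are illustrative.

```python
import math


def matmul(X, Y):
    """Product of two small dense matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def is_close(X, Y, tol=1e-12):
    return all(abs(x - y) <= tol
               for row_x, row_y in zip(X, Y) for x, y in zip(row_x, row_y))


def reflection_matrix(theta):
    """Hypothetical example matrix: a planar reflection."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]


S = reflection_matrix(0.3)
identity = [[1.0, 0.0], [0.0, 1.0]]
symmetric = is_close(S, transpose(S))                      # S^T = S
involutory = is_close(matmul(S, S), identity)              # S S = E
orthogonal = is_close(matmul(S, transpose(S)), identity)   # S S^T = E
```

Note that the properties are not independent: a matrix that is symmetric and involutory is automatically orthogonal in the sense $S S^T = E$.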
2.4.3 Finite Integration Technique

The matrices are each sparse with band structure. Precisely, for the lexicographic order of grid points, the matrices are banded with three nonzero sub- and super-diagonals. Their rank is typically of order $10^6$.

Stationary Fields. In magnetostatics, the system matrix $\tilde{S} D_\mu \tilde{S}^T$ corresponding to the three-dimensional case is real, symmetric, and positive definite. For lexicographic order of the grid points, the matrix is banded (with three bands on each side of the main diagonal). E.g., for a three-dimensional Cartesian FIT grid with $I$ steps in x-, $J$ steps in y- and $K$ steps in z-direction, the following holds: the main diagonal entries $\alpha_i$ are accompanied by side-diagonal entries $\beta_i$, $\gamma_i$ and $\delta_i$ whose distances to $\alpha_i$ are

\[
\beta_i: \pm 1, \qquad \gamma_i: \pm I, \qquad \delta_i: \pm IJ.
\]

It can be easily shown (see, e.g., [202]) that they are M-matrices and, because of their symmetry, positive definite Stieltjes matrices.
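The band structure described above can be reproduced with a small model problem. The sketch assumes a 7-point stencil on an I x J x K grid with unit material coefficients and lexicographic numbering; the matrix is a hypothetical stand-in for a FIT system matrix, stored as a dictionary of nonzero entries.

```python
def seven_point_matrix(I, J, K):
    """Sparse 7-point model matrix as {(row, col): value}."""
    def index(i, j, k):
        return i + I * (j + J * k)       # lexicographic numbering

    A = {}
    for k in range(K):
        for j in range(J):
            for i in range(I):
                row = index(i, j, k)
                A[(row, row)] = 6.0
                # couple to the next neighbour in x, y and z (if it exists)
                for di, dj, dk in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
                    ni, nj, nk = i + di, j + dj, k + dk
                    if ni < I and nj < J and nk < K:
                        col = index(ni, nj, nk)
                        A[(row, col)] = -1.0
                        A[(col, row)] = -1.0
    return A


def band_offsets(A):
    """Distances of all nonzero diagonals from the main diagonal."""
    return sorted({col - row for (row, col) in A})


def is_symmetric(A):
    return all(A.get((col, row)) == value for (row, col), value in A.items())


def has_m_matrix_sign_pattern(A):
    """Positive diagonal, non-positive off-diagonal entries."""
    return all(value > 0 if row == col else value <= 0
               for (row, col), value in A.items())


A = seven_point_matrix(4, 3, 2)
offsets = band_offsets(A)     # side diagonals at distances 1, I and I*J
```

The sign pattern checked here is only a necessary ingredient of the M-matrix property; the full argument is given in the literature cited in the text.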
The system matrix $\tilde{C} D_\mu^{-1} C$ corresponding to the two-dimensional case is also real, symmetric and positive definite. As in magnetostatics, the system matrices $\tilde{S} D_\varepsilon \tilde{S}^T$, $\tilde{S} D_\kappa \tilde{S}^T$ and SDI
is indefinite. The matrix CDC is positive stable. When the conductivity increases, the eigenvalues move from the real axis towards the imaginary axis (depending on the oscillatory mode) [116]. Zero occurs as a multiple eigenvalue if the computational domain includes loss-free areas. For waveguide boundaries the matrices become non-symmetric and indefinite.

In [277], the special cylindrically symmetric case of an elementary particle passing a cavity on the axis and thus exciting fields inside the cavity was considered. The system matrix $M = A + ik'D - k^2 I$ is complex quasi-symmetric and indefinite above the lowest resonant frequency of the cavity, since $A$ is the matrix of the corresponding eigenvalue problem. $D$ is a diagonal matrix which expresses the open boundary condition (waveguide boundary). $k$ stands for the wave number of the excitation frequency, $k'$ for the wave number of the fields propagating into the tube (actually, only one wave was used). At the eigenvalues of $A$, i.e., at the resonant frequencies of the closed cavity, the matrix $M$ becomes singular. The right-hand side $b$ of the system is frequency dependent. Therefore both $b$ and the solution $x$ shift towards higher frequencies as the excitation frequency increases.
The next section introduces numerical methods for these linear systems. Recently, a number of iterative methods have been developed for complex (non-Hermitian) matrices, whereas such problems could previously be treated hardly at all or only very inefficiently. The algorithms are described in the next chapter. Some of these methods have been applied to the system matrices of the Finite Integration Technique. Results of convergence studies are discussed in section 3.10.
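Lemma 2.4.1 offers a practical test for positive stability that avoids computing eigenvalues: form the Hermitian part and check that it is positive definite. The sketch below applies this to a small matrix of the form $A = A_\kappa + i\omega A_\varepsilon$. The 3 x 3 matrices are hypothetical stand-ins, and the Cholesky-based test is a choice made for this illustration.

```python
def hermitian_part(A):
    """B = (A + A^H) / 2 for a square matrix given as nested lists."""
    n = len(A)
    return [[0.5 * (A[i][j] + A[j][i].conjugate()) for j in range(n)]
            for i in range(n)]


def is_positive_definite(B, tol=1e-12):
    """Cholesky test for a Hermitian matrix: all pivots must be positive."""
    n = len(B)
    C = [[0j] * n for _ in range(n)]
    for j in range(n):
        d = (B[j][j] - sum(C[j][k] * C[j][k].conjugate() for k in range(j))).real
        if d <= tol:
            return False
        C[j][j] = complex(d ** 0.5)
        for i in range(j + 1, n):
            s = B[i][j] - sum(C[i][k] * C[j][k].conjugate() for k in range(j))
            C[i][j] = s / C[j][j]
    return True


def is_complex_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


# Hypothetical model matrices: real, symmetric, positive definite.
A_kappa = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
A_eps = [[1.0, -0.5, 0.0], [-0.5, 1.0, -0.5], [0.0, -0.5, 1.0]]
omega = 3.0
A = [[A_kappa[i][j] + 1j * omega * A_eps[i][j] for j in range(3)]
     for i in range(3)]

B = hermitian_part(A)                       # equals A_kappa here
positive_stable = is_positive_definite(B)   # sufficient by Lemma 2.4.1
```

Since $A_\varepsilon$ is real and symmetric, the term $i\omega A_\varepsilon$ is skew-Hermitian and drops out of the Hermitian part, so the test reduces to the positive definiteness of $A_\kappa$.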
2.5 Bibliographical Comments

In this section three numerical schemes used to solve Maxwell's equations have been treated, viz. the Mode Matching Method, an example of a semi-analytic scheme, and the Finite Element Method and the Finite Integration Technique, two examples of discretization schemes.

General Textbooks

The book [174] discusses several numerical methods for partial differential equations.

Mode Matching Method

The Mode Matching Method was used by the author to solve some accelerator problems, which are discussed later in the book. It is almost a classical method and is described in many textbooks. The literature cited here cannot possibly cover them all. The derivation of the scattering matrix formulation is described in detail by Piefke in [200]. This is a German textbook. Another German textbook treating this subject is [273]. In other formulations, the mode matching technique is described, e.g., in [141]. Several problems arise in the application of the Mode Matching Method. Gibbs' phenomenon is described, for example, in [178]. Problems of convergence near geometrical singularities may be cured using the edge condition described, e.g., in [175] and [290]. More details on the particular formulation used in this book are described in [199] and in several Ph.D. theses which were supervised by Piefke, see, e.g., [102], [213], [246]. Others were studied in several master's theses supervised by the author: see [87], [183], [239], [318]. In [146], an algorithm is used which is also based on the scattering matrix formulation but uses another strategy to suppress Gibbs' phenomenon. For accelerating structures, the mode matching technique has also been used, e.g., in [115], [17], [152], and [125]. In connection with the computation of accelerating and parasitic modes, the mode matching technique has been presented in [278], [279], other formulations in [121], [122], and [322].
Finite Element Method

The first publication on the Finite Element Method by Clough [63] goes back to 1960. In the field of electrical engineering, several papers were published around 1970, see, e.g., [236]. Standard Finite Element Methods are thoroughly explained in [54] or [212], and other books. By now there exist many textbooks and articles devoted to the Finite Element Method: see, e.g., [47], [134], [209], [215], [249], [265]. General outlines of the Finite Element approach can even be found on the World Wide Web: see, e.g., [216]. An extensive and mathematically sophisticated presentation can be found, e.g., in [113] or [45]. Special aspects are discussed, e.g., in [181], [182], [294], and [197]. Some important aspects of non-conforming elements are treated, e.g., in [250] and [47]. Developments starting in the 1980s introduced more appropriate element types to solve Maxwell's equations. The techniques are based on the original work of Raviart-Thomas [211] and Thomas [263], which was later generalized and extended to the three-dimensional case by Nedelec [185]. Further relevant literature in this context is [37], [36], [34], [35], [47], [64], [259], [46], and [186]. Also, [28] discusses the usual approach to convert Maxwell's equations into variational equations in suitable function spaces. An example of a German textbook treating this subject is [158].
Finite Integration Technique

There were two reasons to devote a section to the Finite Integration Technique. First, even today it is rarely described in textbooks; secondly, it was the main method used by the author to solve problems which are discussed in the next sections. Yee's Finite Difference Method from 1966 [323] for general time-dependent problems is the predecessor of the Finite Integration Technique. It is the main topic of Taflove's book [255] from 1995. The publications [296], [303], [304] contain elementary descriptions of the Finite Integration Technique, which was introduced by Weiland in 1977. A detailed description may be found in [305]. Also, some special characteristics are described, e.g., in [297], [302], [306], and [79]. Many dissertations were written under the supervision of Weiland: see, e.g., [32], [57], [72], [76], [116], [207], [277], [231]. Applications outside the field of electrical engineering are described, e.g., in [321] (acoustics), [94] (elastodynamics) or [283], [203] (temperature problems). Other relevant publications in the context of this book are, e.g., [286], [300], [301], and [123].
3. Numerical Treatment of Linear Systems

In the previous sections it became evident that numerical problems in electrical engineering often reduce, after appropriate modelling, to the solution of large sparse linear systems of equations (see footnote 1)

\[
Ax = b.
\]

The character of the system matrix $A$ varies from real, symmetric and positive definite to complex, non-Hermitian and nearly singular. Predominantly the matrices are sparse. However, there exist some methods in numerical field theory which lead to full matrices. Complex matrices can often be rewritten as real matrices which are twice as large in order to apply algorithms for real linear systems. But this procedure cannot be recommended from a numerical point of view, because the condition of the corresponding real matrix is much worse than that of the complex matrix. In particular, complex non-Hermitian system matrices are typical for electrical engineering while they are rare in other fields. Their numerical treatment is thoroughly studied subsequently.

In this section, the most important numerical methods for the solution of linear systems are introduced in a short outline. In the course of this outline, the main emphasis is laid on recent iterative methods. Direct methods such as Gaussian elimination are only introduced briefly since they belong to the "classics" and can be found in every textbook on numerical mathematics.

The iterative methods are often classified on the basis of various criteria. Figures 3.1-3.3 show one usual classification. They give an overview of the most important iterative methods. Iterative methods can be expressed in the simple form $x_{k+1} = M x_k + c$, with the new iterate $x_{k+1}$, the previous iterate $x_k$, and the so-called iteration matrix $M$. There are two main types of iterative methods, stationary iterative methods and non-stationary iterative methods, depending on the nature of $M$ and $c$ in the iteration process. In stationary methods, the iteration matrix $M$ and the vector $c$ remain constant throughout the iteration, while a new iteration matrix $M$ or vector $c$ is generated in every step of the non-stationary iterative methods.

1 Nonlinear problems are also very typical. Yet, these are usually solved using the Newton-Raphson method (see e.g. [191]), which then yields a sequence of linear problems to be solved.
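The generic form $x_{k+1} = M x_k + c$ of a stationary iteration can be illustrated with the Jacobi method, for which $M = -D^{-1}(A - D)$ and $c = D^{-1} b$ with $D$ the diagonal of $A$. The sketch below is a minimal pure-Python illustration; the function names and the small test matrix are assumptions of this example.

```python
def jacobi_splitting(A, b):
    """Return the Jacobi iteration matrix M and vector c for A x = b."""
    n = len(b)
    M = [[0.0 if i == j else -A[i][j] / A[i][i] for j in range(n)]
         for i in range(n)]
    c = [b[i] / A[i][i] for i in range(n)]
    return M, c


def stationary_iteration(M, c, x0, steps):
    """Apply x_{k+1} = M x_k + c with fixed M and c."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        x = [sum(M[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
    return x


# Diagonally dominant example, so the Jacobi iteration converges.
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
M, c = jacobi_splitting(A, b)
x = stationary_iteration(M, c, [0.0, 0.0, 0.0], steps=100)
```

Because $M$ and $c$ are computed once and never change, the method is stationary in the sense defined above; a non-stationary method would recompute its update rule in every step.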
Figure 3.1. Overview of stationary iterative methods:
- One-level methods: Richardson, Jacobi, Gauss-Seidel, and generalizations such as SOR and SSOR
- Chebyshev iteration: acceleration of the one-level (fixed-point) iterations
- Multilevel methods: geometrical (grid-oriented) and algebraic (matrix-oriented)
- Row-projection methods (e.g. Kaczmarz)
- ADI method of Peaceman-Rachford (2D elliptic PDEs)
- Methods with incomplete factorization (e.g. the Strongly Implicit Procedure of Stone)
- A posteriori iteration: accuracy improvement for direct solution methods
Figure 3.2. Overview of non-stationary methods, part I: symmetric/Hermitian matrices. The figure lists the projection methods for symmetric/Hermitian matrices: cg, Lanczos, SYMMLQ, MinRes, and CR.

Figure 3.1 presents an overview of the most important stationary methods. Traditional stationary iterative methods include Richardson, Jacobi, Gauss-Seidel, and generalizations such as SOR and SSOR. These are all one-level methods in the sense that they work on the original problem. The Chebyshev iteration and multilevel methods are iterative schemes which make use of these traditional stationary iterative methods. The Chebyshev iteration is an acceleration scheme for the one-level methods. Multilevel methods can be divided into geometrical multilevel methods and algebraic multilevel methods. The geometrical multilevel methods (often referred to as multigrid methods) are grid-oriented in the sense that an underlying partial differential equation is discretized on grids refined in different ways, thus resulting in linear systems of different dimensions. Algebraic multilevel methods work independently of any discretization scheme; they work on a sequence of matrices of distinct dimensions "condensed" in a certain way. The underlying idea of the above-mentioned methods will be described in this section. Other stationary iterative methods are the row-projection methods such as the Kaczmarz method, which is also described here, the ADI method, methods with incomplete factorization, and a posteriori iteration.

The non-stationary iterative methods should be divided into two subgroups: methods for symmetric or Hermitian matrices, which are presented in Fig. 3.2, and methods for general matrices, presented in Fig. 3.3. The non-stationary methods are projection methods. The best known representative of non-stationary methods for symmetric positive definite problems is the conjugate gradient method (cg). In every iteration of the algorithm, the approximate solution $x_k$ is updated by adding a search direction $p_k$ multiplied by a constant $\alpha_k$: $x_{k+1} = x_k + \alpha_k p_k$.
Minimization of the error leads to a specific choice of $\alpha_k$ and $p_k$ obtained from the residual $r_k$ and the system matrix $A$. Hence, the conjugate gradient method actually generates a sequence of approximate solutions in terms of multiples of the residual $r$ and matrix $A$. The approximate solutions all lie in the span of $\{r_0, A r_0, A^2 r_0, \ldots, A^n r_0\}$. This set forms a basis for the Krylov subspace, and thus $x_k$ belongs to the Krylov subspace. In every iteration, two inner products and a matrix-vector product are calculated. The approximate solution is obtained through a three-term recurrence of the residual vector which provides a better iterate in the conjugate direction of $A$. Other methods of this subgroup are the Lanczos algorithm for linear systems, SYMMLQ, the minimal residual method, and the conjugate residual method (CR).

Non-stationary iterative methods for general matrices can be split again into two subgroups as displayed in Fig. 3.3: Bi-Lanczos or cg-like methods and Arnoldi methods. The Bi-Lanczos methods are skew projection methods with a basis pair (two Krylov subspaces); they lead to short recursions as in the cg method. Arnoldi methods are based on orthogonal projections which in the case of general matrices lead to long recursions. First, there are Bi-Lanczos, BiCG, and SCBiCG methods for complex symmetric system matrices $A$, which are strongly related to BiCG methods with internal LU decomposition, the generalized CGS methods for which CGS and CGS2 serve as representatives. Secondly, there exist BiCG methods with internal least-squares fit; they belong to the QMR family: QMR and TFQMR. Bi-Lanczos type methods will be described in this section. The second subgroup of non-stationary methods are the Arnoldi methods with the Arnoldi implementations FOM and GMRES and related methods such as GCR, GMOrthomin, and Axelsson's. Some of those will be described in this section. Finally, there are hybrid non-stationary methods which combine the Bi-Lanczos scheme with the Arnoldi scheme. These are BiCGSTAB, BiCGSTAB2, and the more general BiCGSTAB(l) method. They are described in this section, too.

Figure 3.3. Overview of non-stationary methods, part II: projection methods for general matrices:
- Bi-Lanczos methods (skew projection with basis pair; tridiagonal matrix, short vector recursion): Bi-Lanczos and BiCG; for complex symmetric $A$: Bi-Lanczos, BiCG, SCBiCG (COCG, BiCGCR, ...); BiCG with internal LU decomposition (GCGS): CGS, CGS2; BiCG with internal least-squares fit (QMR family): QMR, TFQMR
- Hybrid methods: BiCGStab, BiCGStab2, BiCGStab(l)
- Arnoldi methods (orthogonal projection; upper Hessenberg matrix, long vector recursion): Arnoldi implementations FOM and GMRES; Arnoldi "relatives" GCR, GMOrthomin, Axelsson's methods
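The conjugate gradient update $x_{k+1} = x_k + \alpha_k p_k$ described in this overview can be sketched in a few lines. The version below is the common textbook formulation for real symmetric positive definite matrices, written in pure Python for illustration; the stopping tolerance and iteration limit are assumptions of this sketch.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-12, max_iter=1000):
    """Solve A x = b for real symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)                  # residual r_0 = b - A x_0 with x_0 = 0
    p = list(r)                  # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 <= tol:
            break
        Ap = matvec(A, p)                    # the one matrix-vector product
        alpha = rr / dot(p, Ap)              # first inner product
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)                   # second inner product
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [2.0, -2.0, 4.0]
x = conjugate_gradient(A, b)
```

Each pass through the loop uses exactly one matrix-vector product and two inner products, matching the operation count given in the text.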
3.1 Direct Solution Methods

The use of direct solution methods can be recommended for systems $Ax = b$ with dense and unstructured system matrix $A$. In this book, Gaussian elimination is used in the framework of the mode matching technique for determinant computation, matrix inversion, and solution of linear systems. The basic idea of the direct solution of a linear system of equations is the transformation of the system into an equivalent system having triangular form. To achieve stability of the algorithms, it is necessary to apply some scaling (equilibration) and some pivoting.
3.1.1 LU-decomposition, Gaussian Elimination
Consider a linear system of equations $Ax = b$ with regular matrix $A \in \mathbb{C}^{n \times n}$ and $x, b \in \mathbb{C}^n$. The so-called pivot elements $a_{kk}$ play a central role in the LU decomposition (Gaussian elimination) since they are used as divisors. Their magnitude compared to that of the remaining matrix elements is of crucial importance for the stability of the algorithm. Yet, instability can be avoided by appropriate permutations of rows and columns during the elimination. The permutations can be expressed in matrix notation as follows:
Definition 3.1.1. Let $s = (s_1, \dots, s_n) \in S_n$ be a permutation of $(1, 2, \dots, n)$ and let $e_i = (\delta_{1i}, \dots, \delta_{ni})^T$, $i = 1, \dots, n$, be unit vectors. Then a matrix $P \in \mathbb{C}^{n \times n}$ with $P = [e_{s_1}, \dots, e_{s_n}]$ is called a permutation matrix.
In practice, partial pivoting is used: in each step, the row containing the element with the largest absolute value in the pivoting column is moved to the pivot position by a permutation. This pivoting strategy needs at most $O(n^2)$ additional operations. The algorithm for Gaussian elimination with partial pivoting goes as follows:
Algorithm 3.1.1.1 (LU Decomposition with Partial Pivoting)
Given a regular matrix $A \in \mathbb{C}^{n \times n}$.
For $k = 1, \dots, n-1$
    Determine $p \in \{k, k+1, \dots, n\}$ with $|a_{pk}^{(k)}| = \max_{k \le i \le n} |a_{ik}^{(k)}|$
    $r(k) := p$
    Exchange $a_{kj}^{(k)}$ and $a_{pj}^{(k)}$, $j = k, \dots, n$.
    For $i = k+1, \dots, n$
        $l_{ik} := a_{ik}^{(k)} / a_{kk}^{(k)}$
        $a_{ik}^{(k+1)} := l_{ik}$
        $a_{ij}^{(k+1)} := a_{ij}^{(k)} - l_{ik} a_{kj}^{(k)}$ for $j = k+1, \dots, n$
The permutations $P_1, \dots, P_{n-1}$ are described by the integer vector $(r(1), \dots, r(n-1))$. $P_k$ denotes the exchange of the rows $k$ and $r(k)$. However, partial pivoting only makes sense if the matrix has been scaled before. Otherwise it can lead to an inadequate choice of the pivot element. Row scaling means multiplying each row of the matrix $A$ by an appropriate factor so that the norm $\|\cdot\|_\infty$ of all rows of the new system matrix $D^{-1}A$ finally equals a fixed constant^2.
Remark 3.1.1. Let $P_1, \dots, P_{n-1}$ be the permutations in Algorithm 3.1.1.1. With $P = P_{n-1} \cdots P_1$, we get $PA = LU$. Here $L$ is a lower triangular matrix and $U$ the upper triangular matrix resulting from the decomposition. The standard algorithm for the direct solution of a linear system is Gaussian elimination with row scaling and partial pivoting. We usually assume in the sequel that $L$ is the product of the elementary lower triangular matrices and the permutation matrices, so that we can write $A = LU$.
^2 Row scaling lowers the probability that a very small number is added to a very large one during the elimination.
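The elimination loop of Algorithm 3.1.1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lu_partial_pivoting` and the list-of-lists matrix storage are choices made here, not part of the text, and row scaling is omitted. Unlike the algorithm above, which exchanges only the columns $j = k, \dots, n$, the sketch exchanges whole rows including the multipliers already stored, so that $PA = LU$ holds directly for the packed factors.

```python
def lu_partial_pivoting(a):
    """Return (lu, r): packed LU factors and the pivot vector r(k).

    lu holds U on and above the diagonal and the multipliers l_ik below it.
    r[k] is the (0-based) row exchanged with row k in elimination step k.
    """
    n = len(a)
    lu = [list(row) for row in a]        # work on a copy of A
    r = []
    for k in range(n - 1):
        # pivot search: row p >= k with the largest |a_pk| in column k
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ValueError("zero pivot column: matrix is singular")
        r.append(p)
        lu[k], lu[p] = lu[p], lu[k]      # exchange rows k and p (whole rows)
        for i in range(k + 1, n):
            l_ik = lu[i][k] / lu[k][k]   # multiplier l_ik
            lu[i][k] = l_ik              # stored in place of the eliminated entry
            for j in range(k + 1, n):
                lu[i][j] -= l_ik * lu[k][j]
    return lu, r


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
lu, r = lu_partial_pivoting(A)
print("pivot vector:", r)
```

Storing the multipliers in the lower triangle reflects the remark that $L$ need not be kept as a separate matrix.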
Algorithm 3.1.1.2 (Gaussian Elimination)
- $A = LU$, Gaussian triangular decomposition ($L$ lower, $U = A^{(n)} = L^{-1}A$ upper triangular matrix)
- $Ly = b$, forward substitution
- $Ux = y$, backward substitution
Remark 3.1.2. The number of coefficients to be stored in Gaussian elimination equals $n(n+1)$, i.e., exactly the number of input coefficients (matrix $A$ and the vector $b$). Regarding the computational effort, counted in multiplications, the following holds for Gaussian elimination without pivoting and scaling:
- $\sum_{k=1}^{n-1} k^2 \doteq \frac{n^3}{3}$ for the LU-decomposition,
- $\sum_{k=1}^{n-1} k \doteq \frac{n^2}{2}$, each, for the forward and backward substitution.
The computational effort $O(n^2)$ for the partial pivoting has to be added. For the mode matching technique, Gaussian elimination is not only used for the solution of linear systems of equations but also for matrix inversion and determinant computation [246], [279]. The relation $A^{-1} = U^{-1}L^{-1}$ is exploited for matrix inversion. The matrix $L$ is not stored explicitly: only the multipliers $l_{ik}$ below the main diagonal of $A$ are stored. For the pivoting, the pivot vector $(r(1), \dots, r(n-1))$ is constructed. The matrices $L_k$ are Frobenius matrices and therefore have the property that the inverse $L_k^{-1}$ results from $L_k$ by changing the sign of the elements $l_{ik}$. Thus we get $L^{-1} = L_{n-1}P_{n-1} \cdots L_2 P_2 L_1 P_1$. The algorithm for matrix inversion consists of two steps:
Algorithm 3.1.1.3 (Matrix inversion with Gaussian Elimination)
1. Replace $U$ by $U^{-1}$ with the help of a column-oriented algorithm.
2. Determine the unknown inverse $A^{-1}$ according to $A^{-1} = U^{-1} L_{n-1}P_{n-1} \cdots L_2 P_2 L_1 P_1$.
After LU decomposition of $A$, the determinant computation with the help of Gaussian elimination reduces to evaluating the product of the diagonal elements of $U$, since the relation $\det(A) = \det(L)\det(U)$ gives
$$\det(A) = (-1)^p \prod_{k=1}^{n} u_{kk},$$
where $p$ stands for the number of permutations.
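As a hedged illustration of how the three steps (decomposition, forward substitution, backward substitution) and the determinant formula fit together, the following Python sketch solves $Ax = b$ and returns $\det(A)$ as a by-product. The name `gauss_solve_det` is hypothetical; the forward substitution is applied to the right-hand side on the fly instead of storing $L$, and $p$ is counted as the number of row exchanges actually performed.

```python
def gauss_solve_det(a, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Returns (x, det) where det = (-1)^p * prod(u_kk) and p is the number
    of row exchanges carried out during the elimination.
    """
    n = len(a)
    u = [list(row) for row in a]         # becomes the upper triangular U
    y = list(b)                          # becomes y with L y = P b
    swaps = 0
    for k in range(n - 1):
        p = max(range(k, n), key=lambda i: abs(u[i][k]))
        if u[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            u[k], u[p] = u[p], u[k]
            y[k], y[p] = y[p], y[k]
            swaps += 1
        for i in range(k + 1, n):
            l_ik = u[i][k] / u[k][k]
            y[i] -= l_ik * y[k]          # forward substitution step
            for j in range(k, n):
                u[i][j] -= l_ik * u[k][j]
    if u[n - 1][n - 1] == 0:
        raise ValueError("matrix is singular")
    x = [0.0] * n
    for i in range(n - 1, -1, -1):       # backward substitution U x = y
        s = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / u[i][i]
    det = (-1.0) ** swaps
    for k in range(n):
        det *= u[k][k]
    return x, det


x, det = gauss_solve_det([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print(x, det)
```

For dense matrices of moderate size, a library routine would normally be preferred over such a hand-written loop; the sketch only mirrors the structure of the text.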
Remark 3.1.3. Basically, there are two reasons why direct methods are not well suited for large sparse systems resulting from the discretization of two- and three-dimensional problems:
1. Direct methods which are based on an LU decomposition produce a fill-in with non-zero entries in the major part of the bandwidth of matrix $A$ and therefore cost a lot of storage space. The matrix then has to be stored peripherally for large problems, and the I/O costs, i.e., writing to and reading from disc, start to dominate.
2. A common serious drawback of the direct methods is that rounding errors and errors in the data tend to grow faster than the condition number. For elliptic partial differential equations, they grow as $O(h^{-3})$, where $h$ stands for the step size of the discretization. Therefore, at least a double precision calculation is necessary, which has an even stronger impact on the storage requirements.
For full matrices, e.g., the coupling and scattering matrices of the mode matching technique (also for matrices from many other applications), Gaussian elimination remains the method of choice. Many developments have been made to cure the sensitivity of the algorithm to rounding errors. Examples are interval-analytical variants of Gaussian elimination or the use of special number representations^3. The division-free Gaussian algorithm is of great importance for contemporary vector computers, since a division operation takes much more time on those computers than a multiplication [9]. Iterative methods do not have to fight the problem of fill-in, and appropriate preconditioning and accelerated methods make it possible to obtain algorithms of nearly optimal order of computational complexity. Compared to direct methods, the influence of rounding errors can be reduced by several orders of magnitude by thorough evaluation of the residuals inside the iterative algorithms [7].
Some modern developments (which cannot be treated here) are worth mentioning. For some special applications, e.g., two-dimensional problems with a system matrix $A(p)$ that depends continuously on some parameter $p$, direct methods on bandwidth-limited RISC machines (PCs) can even handle badly conditioned cases. The so-called "Uni-/Multifrontal Direct Solvers" [254] use sophisticated algorithms to reorder the unknowns. Additionally, they use BLAS Level 2 and Level 3 routines, which reach nearly peak performance and are fitted to the cache size. For matrices $A(p)$ which do not change their filling pattern with $p$, a subsequent factorization then only requires 10% of the first LU decomposition. Nevertheless, the asymptotic complexity of the direct solvers becomes evident for really large problems: the complexity of the best direct solvers for general matrices reaches $O(N^{2.5})$ [56].
^3 In [104], for example, integer systems of equations resulting from system analysis (studies concerning Petri nets) were solved "exactly". In this context, a division-free Gaussian algorithm with special number representation was developed and interval-analytical inclusion methods were applied.
3.2 Classical Iteration Methods
Iteration methods for the solution of linear systems start with an initial approximation to the solution vector, from which they build a sequence of approximate solutions, which (under certain conditions) converges to the exact solution of the system of equations. A special advantage with respect to sparse matrices lies in the simplicity of these methods. Essentially, only matrix-vector multiplications and vector additions have to be carried out. It should be noted that the effort for a matrix-vector multiplication with a sparse matrix usually is of order $O(n)$ rather than $O(n^2)$ as for full matrices. The reason is that the number of non-vanishing elements in each row is usually small and independent of the dimension $n$ of the matrix. These methods can also offer an enormous saving of storage space compared with direct methods, since usually only the non-vanishing elements of the system matrix, the solution vector, and a few additional vectors have to be stored. Their disadvantage compared to direct methods is slow convergence or even divergence. In addition, they require an appropriate stopping criterion.
As already noted, the iterative methods can be split into two groups, viz. stationary and non-stationary iteration methods. "Classical" iterative methods like Jacobi, Gauss-Seidel, and SOR are stationary. They are easy to understand and to implement. Much more efficient, however, are the multigrid methods, which internally use stationary methods like Gauss-Seidel, and non-stationary methods like the Krylov subspace methods. Yet, their analysis is also much more difficult. But practical applications require fast (and robust) solvers. Thus, engineers can hardly wait for a decent analysis but start using the methods (recall FEM, which was used by engineers for a long time before the mathematical theory was fully developed).
Therefore, it is important to document systematic experimental convergence studies until theoretical convergence results are available. An attempt at such a documentation based on typical examples will be given in section 3.10. Since there exists a whole host of literature on this topic, of which [9] and [103] are recommended examples, the well-known classical methods (together with their symmetric variants) and the Kaczmarz algorithm are treated only briefly in what follows. There is an overview in [21], which is also available on the World Wide Web. The non-stationary methods, especially their modern variants, will be treated to a greater extent in section 3.4, since they are still relatively unknown outside the mathematical community.
3.2.1 Practical Use of Iterative Methods: Stopping Criteria
Using iterative methods in practice, we need to implement some stopping criterion based on the size of some available quantity. Available quantities in all iterative methods are the initial vector $x_0$ and the iterates $x_k$, $x_{k+1}$, besides the matrix $A$ and the right-hand side $b$. The residual $r_k = b - Ax_k$ is easily computed but needs one additional matrix-vector multiplication. Non-stationary methods like the Krylov subspace methods recursively calculate a "residual" which can be used for a stopping criterion, but the true residual should also be computed from time to time. Usual criteria, for a given tolerance $\varepsilon$, are as follows:
- an absolute criterion depending on the residual: $\|r_k\| \le \varepsilon$,
- an absolute criterion depending on the iteration error $\|x^* - x_k\|$, with $x_{k+1}$ taken as an approximation for the unknown exact solution $x^*$: $\|x_{k+1} - x_k\| \le \varepsilon$,
- a relative criterion depending on the relative residual, i.e., on the ratio of the actual residual and the initial residual: $\|r_k\| / \|r_0\| \le \varepsilon$,
- a relative criterion depending on the iterates and the initial vector, again taking $x_{k+1}$ as an approximation for the unknown exact solution $x^*$ and thus approximating the relative iteration error: $\|x_{k+1} - x_k\| / \|x_{k+1} - x_0\| \le \varepsilon$.
The initial vector $x_0$ is usually either set to zero or chosen to be a random vector. In case $x_0 = 0$ we have $r_0 = b$ in the relative residual. The reader who would like more details about stopping criteria and their relation to the iteration error itself is referred, e.g., to [9].
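The four kinds of criteria (absolute and relative residual, absolute and relative change of the iterates) can be evaluated side by side. The following Python sketch is an illustration only: the function names are hypothetical, the Euclidean norm is an assumption, and the relative step criterion is taken in the form $\|x_{k+1} - x_k\| \le \varepsilon \|x_{k+1} - x_0\|$. The relative tests are written without division so that a zero denominator does not cause an exception.

```python
import math


def norm2(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(abs(t) ** 2 for t in v))


def residual(a, x, b):
    """True residual r = b - A x (costs one matrix-vector product)."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x))
            for row, bi in zip(a, b)]


def stopping_tests(a, b, x0, x_k, x_next, eps):
    """Evaluate the four stopping criteria for one iteration step."""
    r0 = residual(a, x0, b)
    rk = residual(a, x_k, b)
    step = norm2([p - q for p, q in zip(x_next, x_k)])
    dist0 = norm2([p - q for p, q in zip(x_next, x0)])
    return {
        "abs_residual": norm2(rk) <= eps,
        "abs_step": step <= eps,
        "rel_residual": norm2(rk) <= eps * norm2(r0),
        "rel_step": step <= eps * dist0,
    }


checks = stopping_tests([[2.0, 0.0], [0.0, 2.0]], [2.0, 2.0],
                        [0.0, 0.0], [0.9, 0.9], [0.95, 0.95], eps=0.2)
print(checks)
```

In this small example the absolute residual test is the only one that fails, which shows that the criteria are not interchangeable and that the tolerance has to be chosen with the scaling of the problem in mind.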
3.2.2 Gauss-Seidel and SOR
Most classical methods are fixed point methods. The iteration function $x_{k+1} = \phi(x_k)$ is constructed in such a way that it possesses exactly one fixed point $x^*$ which equals the sought exact solution of the linear system of equations $Ax = b$. The choice of an appropriate decomposition of the given matrix $A$ is a crucial point in the construction of an effective classical iteration algorithm. The convergence speed of the fixed point iteration $x_{k+1} = Mx_k + c$ is mainly determined by the spectral radius $\rho(M)$ of the iteration matrix. Yet, it can be improved for a given $M$ by relaxation. The SOR algorithm (successive over-relaxation) may be called the classical iteration method for linear systems, besides the Jacobi algorithm. Its main advantage in comparison with other methods is the small storage requirement. This makes the algorithm suitable for extremely large matrices arising in some applications.
Algorithm 3.2.2.1 (SOR Algorithm; Successive Over-relaxation)
For $i = 1, \dots, n$:
$$x_i^{(k+1)} = \omega \Big( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)} \Big) / a_{ii} + (1 - \omega) x_i^{(k)}.$$
In matrix notation, the SOR algorithm can be written as
$$M_\omega^{SOR} x^{(k+1)} = (M_\omega^{SOR} - A) x^{(k)} + b \quad \text{with} \quad M_\omega^{SOR} = \frac{1}{\omega}(D + \omega L).$$
$D$ stands for a diagonal matrix with the elements $a_{ii}$, $i = 1, \dots, n$, and $L$ for a lower triangular matrix with the elements $a_{ij}$, $i > j$. Then the matrix $(M_\omega^{SOR} - A)$ is
$$(M_\omega^{SOR} - A) = \frac{1 - \omega}{\omega} D - U,$$
where $U$ contains the elements of $A$ lying above the main diagonal.
Remark 3.2.1. For $\omega = 1$, we obtain the Gauss-Seidel algorithm, which is also referred to as a relaxation method. Strictly speaking, the term 'SOR' is correct only for values $\omega > 1$, while for $0 < \omega < 1$ the notion of under-relaxation would be accurate.
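A forward SOR sweep updates the unknowns one after the other and immediately reuses the new values, which is why a single solution vector suffices. The Python sketch below is a minimal illustration under assumptions: dense list-of-lists storage, a zero initial vector, and a simple maximum-norm test on the change of the iterate. The names `sor_sweep` and `sor_solve` are not from the text.

```python
def sor_sweep(a, b, x, omega):
    """One forward SOR sweep, i = 1, ..., n; x is updated in place.

    Entries x[j] with j < i already hold the new values x_j^(k+1),
    entries with j > i still hold the old values x_j^(k).
    """
    n = len(b)
    for i in range(n):
        s = sum(a[i][j] * x[j] for j in range(n) if j != i)
        x[i] = omega * (b[i] - s) / a[i][i] + (1.0 - omega) * x[i]
    return x


def sor_solve(a, b, omega=1.0, eps=1e-10, max_iter=10_000):
    """Iterate SOR sweeps until the iterate changes by at most eps."""
    x = [0.0] * len(b)
    for k in range(1, max_iter + 1):
        x_old = list(x)
        sor_sweep(a, b, x, omega)
        if max(abs(p - q) for p, q in zip(x, x_old)) <= eps:
            return x, k
    return x, max_iter


A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]                      # exact solution is (1, 1, 1)
for omega in (1.0, 1.1):                 # omega = 1 is Gauss-Seidel
    x, iterations = sor_solve(A, b, omega)
    print(omega, iterations, x)
```

For a sparse matrix one would loop only over the non-zero entries of row $i$; the dense inner sum is used here purely for brevity.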
Some results for the SOR algorithm will be given here without proofs. They can be found in textbooks like [9], [103], [114], [248]:
Theorem 3.2.1. The matrix $A$ is assumed to be positive definite and decomposed as
$$A = D + L + L^T,$$
where $D$ is the diagonal of $A$ and $L$ a strictly lower triangular matrix. Furthermore, $0 < \omega < 2$ is assumed. Then the SOR algorithm converges, and the convergence is monotone in the energy norm:
$$\rho(M_\omega^{SOR}) \le \|M_\omega^{SOR}\|_A < 1.$$
Proof: see Ostrowski [192] and [114].
Young's Theorem [324] gives evidence about the optimal relaxation parameter $\omega_{opt}$. It implies
Theorem 3.2.2. Let
- $A$ have property A,
- the block Jacobi matrix $M^{Jak} = I - D^{-1}A$ have only real eigenvalues.
Then the SOR algorithm converges for an arbitrary initial vector $x_0$ if and only if
- $\rho(M^{Jak}) < 1$ and
- $0 < \omega < 2$.
Furthermore, for
$$\omega_{opt} = \frac{2}{1 + \sqrt{1 - \rho(M^{Jak})^2}}, \tag{3.1}$$
the asymptotic convergence factor satisfies
$$\min_\omega \rho(M_\omega^{SOR}) = \rho(M_{\omega_{opt}}^{SOR}) = \omega_{opt} - 1 = \frac{1 - \sqrt{1 - \rho(M^{Jak})^2}}{1 + \sqrt{1 - \rho(M^{Jak})^2}}.$$
Proof: see [9] or [114].
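Formula (3.1) is easy to evaluate once the spectral radius of the Jacobi matrix is known. As a hedged numerical illustration, the sketch below uses the one-dimensional model matrix tridiag(-1, 2, -1) with $n$ unknowns, which is not discussed in the text; for this matrix the Jacobi iteration matrix has the eigenvalues $\cos(k\pi/(n+1))$, $k = 1, \dots, n$, so its spectral radius is $\cos(\pi/(n+1))$. The function names are hypothetical.

```python
import math


def omega_opt(rho_jacobi):
    """Optimal relaxation parameter from (3.1), for 0 <= rho_jacobi < 1."""
    return 2.0 / (1.0 + math.sqrt(1.0 - rho_jacobi ** 2))


def sor_factor(rho_jacobi):
    """Asymptotic convergence factor of SOR at omega_opt."""
    return omega_opt(rho_jacobi) - 1.0


n = 99                                   # number of unknowns (assumed example)
rho_j = math.cos(math.pi / (n + 1))      # Jacobi spectral radius, model matrix
print("Jacobi:        ", rho_j)
print("Gauss-Seidel:  ", rho_j ** 2)     # known relation for this model problem
print("omega_opt:     ", omega_opt(rho_j))
print("SOR at optimum:", sor_factor(rho_j))
```

The closer $\rho(M^{Jak})$ is to one, the closer $\omega_{opt}$ moves towards two and the larger the gain over Gauss-Seidel becomes; this sensitivity is the reason why the parameter is hard to choose when the spectral radius is not known.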
For one-step methods like Gauss-Seidel and SOR, the order in which the equations are iterated influences the quality of error smoothing (cf. [112]). For the two-dimensional case, the FIT equations correspond to the five-point star for the Poisson equation. For the Poisson equation, the so-called red-black ordering is usually recommended for Gauss-Seidel and SOR instead of the normal lexicographic ordering. For this ordering, the grid is split up into two groups of nodes: In case of a two-dimensional Cartesian grid, the red nodes are the grid points $k = (j-1) \cdot J + i$ with $j = 1, \dots, J$, $i = 1, \dots, I$, $i + j$ even; the remaining points are the black nodes. In Red-Black Gauss-Seidel or SOR, the red points are iterated first and then the black ones. However, for problems with a strong coupling in one coordinate direction, the linewise relaxation is usually recommended. In the linewise relaxation, the grid points are also split into two groups: one block consists of the grid points $k = (j-1) \cdot J + i$ with $j$ even, the other one of the grid points $k = (j-1) \cdot J + i$ with $j$ odd. One iterates first the first block, then the second block. The main problem connected with the SOR algorithm is the strong dependence of convergence on the chosen relaxation parameter. This parameter is often hard to choose appropriately. In contrast, the Krylov subspace methods are not affected by this problem.
3.2.3 SGS and SSOR Algorithms
Gauss-Seidel and SOR do not converge for indefinite matrices. It is worth noting that, even for a Hermitian matrix $A$, the iteration matrix does not have to be Hermitian at all. The spectrum of the SOR iteration matrix $M_\omega^{SOR}$ may very well have complex eigenvalues. The SSOR algorithm is a special symmetric iterative method. The SGS algorithm will not be treated separately here since, with $\omega = 1$, it is only a special case of the SSOR. Let us introduce the backward SOR algorithm first. It is denoted by $M_\omega^{BSOR}$.
Algorithm 3.2.3.1 (Backward SOR)
For $i = n, \dots, 1$:
$$x_i^{(k+1)} = \omega \Big( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k+1)} \Big) / a_{ii} + (1 - \omega) x_i^{(k)}.$$
Thus, the iteration matrix is $M_\omega^{BSOR} = \frac{1}{\omega}(D + \omega L)^T$. The SSOR algorithm follows by concatenation of one step of SOR and one step of backward SOR:
Algorithm 3.2.3.2 (SSOR Algorithm; Symmetric SOR)
If the SOR algorithm is denoted by $M_\omega^{SOR}$, then the SSOR algorithm is defined as the product $M_\omega^{SSOR} = M_\omega^{BSOR} \circ M_\omega^{SOR}$.
Today SSOR is mainly important as a preconditioner for the later introduced Krylov subspace methods. In addition, it is often used in connection with the Chebyshev iteration [103].
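One SSOR step is simply a forward sweep followed by a backward sweep with the same relaxation parameter. The following Python sketch shows this concatenation; it is an illustration under assumptions (dense list-of-lists storage, the hypothetical name `ssor_step`, a fixed number of steps instead of a stopping test).

```python
def ssor_step(a, b, x, omega):
    """One SSOR step: forward SOR sweep, then backward SOR sweep.

    x is updated in place; in both sweeps the most recent values of
    the neighbouring unknowns are used.
    """
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = omega * (b[i] - s) / a[i][i] + (1.0 - omega) * x[i]
    return x


A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]                      # exact solution is (1, 1, 1)
x = [0.0, 0.0, 0.0]
for _ in range(50):
    ssor_step(A, b, x, omega=1.0)        # omega = 1 gives the SGS algorithm
print(x)
```

When SSOR is used as a preconditioner, one such step applied to a residual (with zero initial guess) plays the role of the approximate solve; that usage is only hinted at here and not implemented.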
3.2.4 The Kaczmarz Algorithm
The Kaczmarz algorithm is a row-action procedure. While standard iterative methods work in each iteration step with the whole matrix $A$, this method uses in each iteration step only one row $a_i^T$ of the matrix $A \in \mathbb{R}^{m \times n}$. One Kaczmarz step is defined by
Algorithm 3.2.4.1 (Kaczmarz)
For $k = 1, \dots, n$:
$$x^{k+1} = x^k + \lambda_k \frac{b_{l_k} - a_{l_k}^T x^k}{\|a_{l_k}\|_2^2} \, a_{l_k} \quad \text{with} \quad l_k = k \bmod m + 1.$$
The $\lambda_k$ are relaxation parameters. The original Kaczmarz algorithm results from the choice $\lambda_k = 1$. In this case, the following holds:
Theorem 3.2.3. The Kaczmarz algorithm ($\lambda_k = 1$) converges for arbitrary square non-singular matrices $A$.
Proof: Kaczmarz [148]. Without relaxation, the iteration corresponds to a successive orthogonal projection onto the hyperplanes $a_{l_k}^T x = b_{l_k}$. For $\lambda_k \ne 1$, the iterate $x^{k+1}$ lies on a line from $x^k$ to its orthogonal reflection about the hyperplane $a_{l_k}^T x = b_{l_k}$.
Theorem 3.2.4. The Kaczmarz algorithm with relaxation parameter $\lambda_k \ne 1$ and $0 < \liminf_{k \to \infty} \lambda_k \le \limsup_{k \to \infty} \lambda_k < 2$ converges for arbitrary matrices $A$ for which the linear system is consistent, i.e., has at least one solution.
Proof: Herman et al. [126].
Since the Kaczmarz algorithm converges extremely slowly, it is of little importance per se. However, it is often used as a smoothing method in multigrid algorithms.
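The row-action character of the method is easy to see in code: each step touches exactly one row of the matrix. The sketch below is a minimal Python illustration for real matrices, assuming the usual projection form of the Kaczmarz step (orthogonal projection onto one hyperplane, scaled by the relaxation parameter), cyclic row selection with 0-based indices, and a constant relaxation parameter `lam`. The function name and the fixed number of sweeps are choices made here.

```python
def kaczmarz(a, b, sweeps=100, lam=1.0):
    """Cyclic Kaczmarz iteration for a consistent real system A x = b.

    Each step projects the current iterate towards the hyperplane
    a_l^T x = b_l of a single row l; lam = 1 is the original method.
    """
    m, n = len(a), len(a[0])
    x = [0.0] * n
    for k in range(sweeps * m):
        l = k % m                        # cyclic row index (0-based)
        row = a[l]
        norm_sq = sum(t * t for t in row)
        if norm_sq == 0.0:
            continue                     # a zero row carries no information
        gap = b[l] - sum(t * xj for t, xj in zip(row, x))
        factor = lam * gap / norm_sq
        x = [xj + factor * t for xj, t in zip(x, row)]
    return x


x = kaczmarz([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print(x)                                 # approaches (0.8, 1.4)
```

The convergence speed depends on the angles between the hyperplanes: nearly parallel rows lead to the very slow convergence mentioned above.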
For a general convergence analysis of the classical iterative methods, the reader is referred to the basic standard literature of numerical mathematics.
3.3 Chebyshev Iteration
The Chebyshev iteration is an acceleration method for fixed point algorithms^4. Instead of using only the information of the last iteration step, now a linear combination
$$y_k = \sum_{j=0}^{k} v_{kj} x_j$$
made up from the already computed approximate solutions $x_0, \dots, x_k$ is used. The coefficients $v_{kj}$ are constructed so that the sequence $\{y_0, y_1, \dots\}$ converges faster than the original sequence of approximate solutions $\{x_0, x_1, \dots\}$. The following description of the affine subspace containing the vector $y_k$ is implied by the requirement that if $x_0 = \dots = x_k = x$, then $y_k$ should equal the solution $x$:
$$y_k \in V_k := \Big\{ y = \sum_{j=0}^{k} v_{kj} x_j \;\Big|\; \sum_{j=0}^{k} v_{kj} = 1 \Big\}.$$
^4 This description essentially follows [74], which essentially follows [103]. Therefore, in many places where the German textbook [74] is cited, the English speaking reader can also find the same in [103] even when the latter is not explicitly cited.
Furthermore, $y_k$ shall be the best approximation from $V_k$ to the searched solution $x$:
$$\|y_k - x\| = \min_{y \in V_k} \|y - x\| \tag{3.2}$$
for a suitable norm $\|\cdot\|$. The approximation $y_k$ is the affine orthogonal projection of $x$ onto $V_k$ if the Euclidean norm $\|y\| = \sqrt{\langle y, y \rangle}$ is used. In this case, the minimization problem is equivalent to a variational problem which, however, cannot be evaluated since it involves the yet unknown solution $x$. Therefore an alternative problem is constructed from the minimization problem^5. This alternative problem is based on the Chebyshev polynomials. The relation $x_k = \phi^k(x_0)$ holds for the fixed point iteration with $\phi(y) = My + c$. If the polynomial $P_k \in \mathcal{P}_k$ of degree $k$ has the properties
$$P_k(\lambda) = \sum_{j=0}^{k} v_{kj} \lambda^j \quad \text{and} \quad P_k(1) = \sum_{j=0}^{k} v_{kj} = 1,$$
the relation
$$y_k = \sum_{j=0}^{k} v_{kj} x_j = P_k(\phi) x_0$$
follows for $y_k$. Since $\|y_k - x\| \le \|P_k(M)\| \, \|x_0 - x\|$, the minimization problem (3.2) can be replaced by $\|P_k(M)\| = \min$ with $P_k(1) = 1$.
^5 The choice of another norm presents another possibility. The conjugate gradient methods which are treated in the next subsection are based on this idea.
Next, it is assumed that the underlying fixed point iteration is symmetrizable. Let $a$ and $b$ be the minimal and maximal eigenvalues of $M$, and let $\sigma(M)$ denote the spectrum of $M$. Then
$$\rho(P_k(M)) = \max_{\lambda \in \sigma(M)} |P_k(\lambda)| \le \max_{\lambda \in [a,b]} |P_k(\lambda)| =: \bar\rho(P_k(M))$$
holds with the so-called virtual spectral radius $\bar\rho(P_k(M))$ of $M$. With that,
$$\max_{\lambda \in [a,b]} |P_k(\lambda)| = \min \quad \text{with } P_k \in \mathcal{P}_k \text{ and } P_k(1) = 1$$
is the minimax problem to be solved. The minimax properties of the Chebyshev polynomials are well known ([103], [74]) from interpolation theory, and the polynomials $P_k$ turn out to be specially normalized Chebyshev polynomials
$$P_k(\lambda) = \frac{C_k(t(\lambda))}{C_k(t(1))} \quad \text{with} \quad t(\lambda) = \frac{2\lambda - a - b}{b - a}.$$
Using the three-term recurrence
$$C_k(t) = 2t \, C_{k-1}(t) - C_{k-2}(t)$$
as well as some transformations (for details see [103] or [74]) and observing the fact that $P_1(\lambda) = \omega\lambda + 1 - \omega$ with the optimal relaxation parameter $\omega = 2/(2 - b - a)$ for the symmetrizable iteration method $x_{k+1} = Mx_k + c$, one obtains the following relaxation:
$$y_k = \rho_k \big( \omega (M y_{k-1} + c) + (1 - \omega) y_{k-1} - y_{k-2} \big) + y_{k-2}$$
with
$$\rho_k := 2\bar{t} \, \frac{C_{k-1}(\bar{t})}{C_k(\bar{t})} \quad \text{with} \quad \bar{t} := \frac{2 - b - a}{b - a}.$$
In particular, for a fixed point iteration of the form $M = I - B^{-1}A$, $c = B^{-1}b$, the relation
$$y_k = \rho_k \big( y_{k-1} - y_{k-2} + \omega B^{-1}(b - A y_{k-1}) \big) + y_{k-2}$$
holds. With that, it is possible to write down the algorithm for the Chebyshev iteration, sometimes also called the Chebyshev acceleration, for the fixed point iteration $x_{k+1} = Mx_k + c$ with $M = I - B^{-1}A$, $c = B^{-1}b$:
Algorithm 3.3.0.2 (Chebyshev Iteration)
$y_0 := x_0$
$\bar{t} := \dfrac{2 - \lambda_{max}(M) - \lambda_{min}(M)}{\lambda_{max}(M) - \lambda_{min}(M)}$, $\quad \omega := \dfrac{2}{2 - \lambda_{max}(M) - \lambda_{min}(M)}$
$c_0 := 1$; $c_1 := \bar{t}$
$y_1 := \omega (M y_0 + c) + (1 - \omega) y_0$
For $k = 2, 3, \dots, k_{max}$:
    $c_k = 2\bar{t} \, c_{k-1} - c_{k-2}$
    $\rho_k = 2\bar{t} \, c_{k-1} / c_k$
    Solve the linear system $Bz = b - A y_{k-1}$
    $y_k = \rho_k (y_{k-1} - y_{k-2} + \omega z) + y_{k-2}$
    If $\|y_k - y_{k-1}\| \le tol \, \|y_k\|$: stop
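The algorithm for the Chebyshev iteration translates almost line by line into code. The Python sketch below is an illustration under explicit assumptions: the underlying fixed point method is taken to be the Jacobi iteration, i.e., $B = \mathrm{diag}(A)$, so that "solving $Bz = b - Ay$" is a division by the diagonal; the extreme eigenvalues of $M$ are passed in as `lam_min` and `lam_max` and must satisfy `lam_min < lam_max < 1`; the initial vector is zero. The function name is hypothetical.

```python
import math


def chebyshev_jacobi(a, b, lam_min, lam_max, tol=1e-10, k_max=500):
    """Chebyshev acceleration of the Jacobi iteration (B = diag(A)).

    lam_min, lam_max: smallest and largest eigenvalue of M = I - B^{-1} A.
    Returns the final iterate and the number of iterations used.
    """
    n = len(b)

    def solve_b(y):
        # z = B^{-1} (b - A y) for the diagonal matrix B = diag(A)
        return [(b[i] - sum(a[i][j] * y[j] for j in range(n))) / a[i][i]
                for i in range(n)]

    t_bar = (2.0 - lam_max - lam_min) / (lam_max - lam_min)
    omega = 2.0 / (2.0 - lam_max - lam_min)
    y_prev = [0.0] * n                           # y_0 := x_0 = 0
    z = solve_b(y_prev)
    y = [yp + omega * zi for yp, zi in zip(y_prev, z)]   # y_1
    c_prev, c = 1.0, t_bar                       # c_0, c_1
    for k in range(2, k_max + 1):
        c_next = 2.0 * t_bar * c - c_prev        # three-term recurrence
        rho = 2.0 * t_bar * c / c_next
        z = solve_b(y)
        y_next = [rho * (yi - ypi + omega * zi) + ypi
                  for yi, ypi, zi in zip(y, y_prev, z)]
        c_prev, c = c, c_next
        y_prev, y = y, y_next
        diff = math.sqrt(sum((p - q) ** 2 for p, q in zip(y, y_prev)))
        if diff <= tol * math.sqrt(sum(p * p for p in y)):
            return y, k
    return y, k_max


n = 5
A = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
      for j in range(n)] for i in range(n)]
b = [1.0, 0.0, 0.0, 0.0, 1.0]                    # exact solution is (1, ..., 1)
lam = math.cos(math.pi / (n + 1))                # Jacobi eigenvalues lie in [-lam, lam]
y, k = chebyshev_jacobi(A, b, -lam, lam)
print(k, y)
```

The sketch uses the identity $\omega(My + c) + (1 - \omega)y = y + \omega B^{-1}(b - Ay)$, so that $y_1$ and all later iterates are formed from the same residual solve. Since the numbers $c_k$ grow geometrically, a production code would rather update the ratio $c_{k-1}/c_k$ directly.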
The convergence speed of the Chebyshev acceleration for a symmetrizable fixed point method can be estimated by the following theorem:
Theorem 3.3.1. Let $A$ be a symmetric positive definite matrix and $M = M(A)$ the iteration matrix of a symmetrizable fixed point method $x_{k+1} = \phi(x_k) = Mx_k + c$ for $A$ and the arbitrary initial vector $x_0 \in \mathbb{R}^n$. Then
$$\|y_k - x\| \le \frac{1}{|C_k(\bar{t})|} \|x_0 - x\|$$
holds for the corresponding Chebyshev iteration $y_k = P_k(\phi) x_0$, with $\bar{t}$ as in Algorithm 3.3.0.2.
Proof: see, for instance, [74].
Lemma 3.3.1. For the Chebyshev polynomials $C_k$ and condition number $\kappa > 1$, the estimate
$$\Big| C_k\Big(\frac{\kappa + 1}{\kappa - 1}\Big) \Big| \ge \frac{1}{2} \Big( \frac{\sqrt{\kappa} + 1}{\sqrt{\kappa} - 1} \Big)^k$$
holds.
Proof: see [74].
Altogether, the following estimate is valid for the iterates $y_k$:
$$\|y_k - x\| \le 2 \Big( \frac{\sqrt{\kappa(A)} - 1}{\sqrt{\kappa(A)} + 1} \Big)^k \|x_0 - x\|.$$
Obviously, for the realization of the Chebyshev iteration it is very important to have a good knowledge of the spectrum of A (see the literature cited in [103]). The method can also be applied to non-symmetric matrices. Then ellipses enclose the (generally non-real) spectrum [171]. In the Axelsson method [11], [10], which is described in subsection 3.9 and is applicable to complex systems, the Chebyshev iteration is used to solve a real subproblem.
3.4 Krylov Subspace Methods
Krylov subspace methods form a whole group of closely related methods. Linear systems are solved by minimization of a residual functional. In this process, the iterates are obtained from the initial residual by multiplication by a polynomial evaluated at the coefficient matrix, i.e., the minimization takes place over special vector spaces, the so-called Krylov subspaces. The algorithm builds a sequence of orthogonal or conjugate vectors (conjugate means orthogonal with respect to an inner product with a weighting matrix, say $\langle x, y \rangle_A = x^T A y$ with $A$ symmetric positive definite; this scalar product defines the so-called energy norm). Historically, the Lanczos algorithm [164] published in 1950 and the Conjugate Gradient algorithm (cg-algorithm) [127] published by Hestenes and Stiefel in 1952 have been the basis for the development of the Krylov subspace methods. Before describing these methods in detail, we will give a short overview of a few main characteristics. Here are the main reasons supporting the use of Krylov subspace methods:
- Because of the optimality property of the approximate solution over the relevant solution space, the convergence rate is relatively high.
- Using certain preconditioning techniques, it is possible to increase the convergence rate considerably.
- The algorithms are free of any parameters, i.e., the user does not need to make parameter estimates (compare to the relaxation parameter of the SOR algorithm, which has a strong impact on the rate of convergence but is difficult to determine optimally).
- The short recurrence relation leads to acceptable storage requirements and acceptable computation times per iteration.
- The rounding error characteristics are also acceptable.
The main idea of all cg-like algorithms lies in solving an equivalent problem instead of the original linear system. The equivalent problem is the minimization of a functional (cf. (3.2) in the derivation of the Chebyshev iteration). Here we dispense with the historical derivation as an improvement of the gradient method. Yet, these methods can become unstable for indefinite or non-symmetric matrices. As a result, generalized cg methods in a great variety of versions have been developed since the end of the 1970s. These versions are also applicable to non-symmetric and/or indefinite systems. For this kind of problem, it became obvious that often only a clever combination of preconditioning and some generalized cg algorithm led to the desired robustness. The Krylov subspace methods are still an active research area. The use of these methods for non-Hermitian linear systems is discussed in a large number of recent publications. Unfortunately, for non-Hermitian systems $Ax = b$ with $A \in \mathbb{C}^{n \times n}$, $A \ne A^H$, the robustness of Krylov subspace methods is not sufficient to use them as black box solvers. There exist enough examples of system matrices $A$ for which the described methods cannot reach the prescribed accuracy or for which they even fail to converge at all. Some of these examples have great practical importance, so it is always absolutely necessary to carry out careful numerical experiments in order to see which solver is best for a given application problem.
3.4.1 The CG Algorithm
As already mentioned in the derivation of the Chebyshev iteration, the cg algorithm results instead of the Chebyshev iteration if another norm is chosen in the construction of the alternative to the minimization problem. This will be shown in the sequel, where the presentation follows [74] (the English speaking reader is again referred to [103] instead). Again, the approximation of the solution $x$ of the linear problem $Ax = b$ by vectors $y_k$ from an affine subspace $V_k$ is the starting point. Choosing the Euclidean norm $\|y\|_2 = \sqrt{\langle y, y \rangle}$ first led to a dead end, from which the only way out was passing to a solvable alternative problem. The underlying idea will still be followed, but now a scalar product fitting the problem shall be used. This is the scalar product $\langle x, y \rangle_A = \langle x, Ay \rangle = x^T A y$ weighted with a symmetric positive definite (spd for short) matrix $A$. The energy norm is the corresponding norm.
Definition 3.4.1. For positive definite matrices $A$ the so-called A-norm or energy norm is introduced:
$$\|y\|_A := \sqrt{\langle y, y \rangle_A} = \sqrt{\langle y, Ay \rangle}.$$
Let $V_k = x_0 + U_k \subset \mathbb{R}^n$ be again a $k$-dimensional affine subspace and let $U_k$ be the linear subspace parallel to $V_k$.
Definition 3.4.2. The solution $x_k$ of the minimization problem
$$\|x_k - x\|_A = \min_{y \in V_k} \|y - x\|_A$$
is called the Ritz-Galerkin approximation of $x$ in $V_k$.
According to a theorem from linear least squares theory,
$$\|x_k - x\| = \min_{y \in V_k} \|y - x\| \iff x - x_k \in U_k^\perp$$
holds with
$$U_k^\perp = \{ y \in \mathbb{R}^n \mid \langle y, z \rangle = 0 \text{ for all } z \in U_k \}$$
the orthogonal complement of $U_k$ in $\mathbb{R}^n$. Thus, the solution $x_k \in V_k$ of the minimization problem is uniquely determined.
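Definition 3.4.3. The function
$$P: \mathbb{R}^n \to V_k, \quad y \mapsto Py \quad \text{with} \quad \|y - Py\| = \min_{z \in V_k} \|y - z\|$$
is affine linear and is called the orthogonal projection of $\mathbb{R}^n$ onto the affine subspace $V_k$.
Consequently, $x_k$ is the orthogonal projection of $x$ onto $V_k$ with respect to $\langle \cdot, \cdot \rangle_A$. With that, the minimization problem is equivalent to the following variational problem:
$$\langle x - x_k, u \rangle_A = 0 \quad \text{for all } u \in U_k. \tag{3.3}$$
Instead of "orthogonal with respect to $\langle \cdot, \cdot \rangle_A$", the term "A-orthogonal" (also A-conjugate, for historical reasons) is often used. Equation (3.3) states that the error $x - x_k$ must be A-orthogonal to $U_k$. With the residuals $r_k := b - Ax_k$, the relation
$$\langle x - x_k, u \rangle_A = \langle A(x - x_k), u \rangle = \langle r_k, u \rangle$$
holds. Consequently, the variational problem (3.3) is equivalent to the condition that the residuals $r_k$ be orthogonal (with respect to the Euclidean scalar product) to $U_k$, i.e.,
$$\langle r_k, u \rangle = 0 \quad \text{for all } u \in U_k.$$
Now let $p_1, \dots, p_k$ be an A-orthogonal basis of $U_k$, i.e.,
$$\langle p_i, p_j \rangle_A = \delta_{ij} \langle p_i, p_i \rangle_A.$$
Then it follows for the A-orthogonal projection $P_k: \mathbb{R}^n \to V_k$ that
$$x_k = P_k x = x_0 + \sum_{j=1}^{k} \frac{\langle p_j, x - x_0 \rangle_A}{\langle p_j, p_j \rangle_A} \, p_j = x_0 + \sum_{j=1}^{k} \alpha_j p_j \quad \text{with} \quad \alpha_j := \frac{\langle p_j, r_0 \rangle}{\langle p_j, p_j \rangle_A}. \tag{3.4}$$
In contrast to the derivation of the Chebyshev iteration, the right-hand side here no longer depends on the yet unknown solution $x$, i.e., the A-orthogonal projection $x_k$ of $x$ onto $V_k$ can be calculated explicitly. Now (3.4) implies the recursions
$$x_k = x_{k-1} + \alpha_k p_k, \qquad r_k = r_{k-1} - \alpha_k A p_k$$
for the approximate solutions $x_k$ and the residuals $r_k$, since
$$r_k = A(x - x_k) = A(x - x_{k-1}) - \alpha_k A p_k.$$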
Now only the subspaces $V_k \subset \mathbb{R}^n$ for which an A-orthogonal basis $p_1, \dots, p_k$ can easily be computed are still missing for the construction of an approximation method. By the Cayley-Hamilton theorem (see, e.g., [48], [114]), there exists a polynomial $P_{n-1} \in \mathcal{P}_{n-1}$ such that
$$A^{-1} = P_{n-1}(A)$$
and therefore
$$x - x_0 = A^{-1} r_0 = P_{n-1}(A) r_0 \in \mathrm{span}\{r_0, Ar_0, \dots, A^{n-1} r_0\}.$$
Choose $V_k = x_0 + U_k$ with
$$U_k := \mathrm{span}\{r_0, Ar_0, \dots, A^{k-1} r_0\} \quad \text{for } k = 1, \dots, n$$
for the approximation spaces. Then $x \in V_n$, i.e., the $n$-th approximation $x_n$ is the solution itself. The subspaces $U_k$ defined in this way are called Krylov subspaces [91]. Generally, the notation $K_m$ is mostly used instead of the $U_k$ used here:
Definition 3.4.4.
$$K_m = K_m(A, y) := \mathrm{span}\{y, Ay, \dots, A^{m-1} y\} = \Big\{ v \in \mathbb{R}^n \;\Big|\; v = \sum_{i=0}^{m-1} c_i A^i y \Big\}$$
with $K_0 = \{y\}$ and $m = 1, 2, 3, \dots$. $K_m$ is called the $m$-th Krylov subspace of $\mathbb{R}^n$ generated by $A$ and $y$.
Therefore, all cg-like methods are referred to as Krylov subspace methods. With $y = r_0 = b - Ax_0$ the residual for the initial vector $x_0$, the relation $b - Ax_m \perp K_m(A, r_0)$ is called the Petrov-Galerkin condition. The following theorem justifies the construction of an A-orthogonal basis $p_1, \dots, p_k$ from the residuals:
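Theorem 3.4.1. Let $r_k \ne 0$. Then the residuals $r_0, \dots, r_k$ are pairwise orthogonal, i.e.,
$$\langle r_i, r_j \rangle = \delta_{ij} \langle r_i, r_i \rangle \quad \text{for } i, j = 0, \dots, k.$$
They span $U_{k+1}$, i.e.,
$$U_{k+1} = \mathrm{span}\{r_0, \dots, r_k\}.$$
Proof: complete induction over $k$ (see, for instance, [74]).
Set $p_1 := r_0$ for $r_0 \ne 0$. By the above theorem, either $r_k$ vanishes for $k > 1$, i.e., the solution $x = x_k$ has been found, or the vectors $p_1, \dots, p_{k-1}$ and $r_k$ are linearly independent and span $U_{k+1}$ so that, with
$$p_{k+1} := r_k + \beta_{k+1} p_k, \qquad \beta_{k+1} := -\frac{\langle r_k, p_k \rangle_A}{\langle p_k, p_k \rangle_A} = \frac{\langle r_k, r_k \rangle}{\langle r_{k-1}, r_{k-1} \rangle},$$
an A-orthogonal basis of $U_{k+1}$ has been found. This gives rise to the cg algorithm: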
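Algorithm 3.4.1.1 (CG Algorithm; Conjugate Gradient)
Choose $\beta_1 := 0$, $p_1 := r_0$, $x_0 := 0$ and thus $r_0 := b$
For $k = 1, \dots, n$
    if $r_{k-1} = 0$ then
        Set $x = x_{k-1}$ and finish the computation.
    otherwise
        $\beta_k = \langle r_{k-1}, r_{k-1} \rangle / \langle r_{k-2}, r_{k-2} \rangle$ (for $k \ge 2$; $\beta_1 = 0$ and $p_1 = r_0$ as chosen above)
        $p_k = r_{k-1} + \beta_k p_{k-1}$
        $\alpha_k = \langle r_{k-1}, r_{k-1} \rangle / \langle p_k, p_k \rangle_A$
        $x_k = x_{k-1} + \alpha_k p_k$
        $r_k = r_{k-1} - \alpha_k A p_k$
$x = x_n$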
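The recursions of the cg algorithm can be sketched compactly in Python. The following is a minimal illustration, not a tuned implementation: it assumes a real symmetric positive definite matrix stored as a list of lists, starts from $x_0 = 0$ as in the algorithm, and replaces the exact test $r_{k-1} = 0$ by a relative tolerance on the recursively computed residual, which is an assumption made here because exact zeros do not occur in floating point arithmetic. The helper names are hypothetical.

```python
def dot(u, v):
    """Euclidean scalar product <u, v>."""
    return sum(p * q for p, q in zip(u, v))


def matvec(a, v):
    """Matrix-vector product A v for a list-of-lists matrix."""
    return [dot(row, v) for row in a]


def conjugate_gradient(a, b, tol=1e-12, max_iter=None):
    """cg iteration for an spd matrix A, starting from x_0 = 0.

    Stops when ||r_k|| <= tol * ||b|| for the recursively computed
    residual, or after max_iter steps (default: n).
    """
    n = len(b)
    if max_iter is None:
        max_iter = n
    x = [0.0] * n
    r = list(b)                          # r_0 = b because x_0 = 0
    p = list(r)                          # p_1 = r_0
    rr = dot(r, r)
    bb = rr
    steps = 0
    while steps < max_iter and rr > tol * tol * bb:
        ap = matvec(a, p)                # the only matrix-vector product
        alpha = rr / dot(p, ap)          # <r, r> / <p, p>_A
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr               # <r_k, r_k> / <r_{k-1}, r_{k-1}>
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        steps += 1
    return x, steps


x, steps = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(steps, x)                          # solution is (1/11, 7/11)
```

Each pass through the loop needs one matrix-vector product and two scalar products, and only the vectors `x`, `r`, `p` and `ap` are stored.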
In each iteration step, only one matrix-vector multiplication, namely $A p_k$, is necessary. The cg algorithm has the following properties if computation is free of rounding errors:
- $r_n = 0$ and $x_n$ is the exact solution after at most $n$ steps.
- The vectors $p_k$ are pairwise A-orthogonal or conjugate, i.e., $\langle p_k, p_i \rangle_A = 0$ whenever $k \neq i$. They are called search directions.
- Each residual is orthogonal to all previous search directions of the functional F.
- All residuals are pairwise orthogonal.
- The recursively computed residual $r_k$ is the same as the actual residual $r_k = b - A x_k$ of the iterated approximate solution $x_k$.

Thus the cg algorithm combines properties of direct and iterative solution methods: it generates a sequence $x_k$ which approximates the solution $x$ in the prescribed manner but, after at most $n$ steps, it gives the exact solution if there are no rounding errors, i.e., in exact arithmetic. In practice, the recursively computed residuals $r_k$ of the cg algorithm are not orthogonal because of rounding errors. Therefore, the exact solution cannot be obtained after $n$ steps in practice, and the iterative aspect of the method has become more emphasized over the years. In 1959, Stiefel and his co-workers (Rutishauser, Engeli et al.) pointed out the attractiveness of the cg algorithm as a solution method for sparse systems. Since the publication of Reid in 1971, the iterative aspect has been of main interest. One advantage among others is the availability of the recursively computed residuals in each iteration step.

The following rough error estimate holds for the asymptotic rate of convergence of the cg algorithm.

Theorem 3.4.2. Let $A \in \mathbb{R}^{n \times n}$ be a symmetric positive definite matrix with the eigenvalues $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n$. Let $x^*$ be the exact solution of the linear system $Ax = b$. The following relations hold for the iterates $x_k$ of the cg algorithm:
$$\| x_k - x^* \|_A \leq \left( \frac{\lambda_{k+1} - \lambda_n}{\lambda_{k+1} + \lambda_n} \right) \| x_0 - x^* \|_A$$
and
$$\| x_k - x^* \|_A \leq 2 \left( \frac{\sqrt{\kappa_2(A)} - 1}{\sqrt{\kappa_2(A)} + 1} \right)^k \| x_0 - x^* \|_A .$$
Proof: See, for instance, [74].

Thus, the rate of convergence depends on the initial residual and on the eigenvalue distribution, i.e., on the whole spectrum and therefore on the condition number $\kappa_2(A)$ of the matrix $A$. In practice, the following stopping criterion for a given limit $\delta$ is used:
Yet, it is possible to show (see, e.g., [74]) that the residual norm is a poor measure of convergence for ill-conditioned systems ($\kappa(A) \gg 1$), where the iterates $x_k$ can improve drastically even though the residual norms grow (compare Theorem 3.4.2). On the other hand, since ill-conditioned systems should in any case be solved with a preconditioned cg algorithm, this turns out not to be a real problem. A consequence of Theorem 3.4.2 is given by

Lemma 3.4.1. To reduce the error in the energy norm by a factor of $\varepsilon$, i.e., to achieve
$$\| x_k - x^* \|_A \leq \varepsilon \, \| x_0 - x^* \|_A ,$$
at most $k$ cg iterations are necessary, where $k$ is the smallest integer satisfying
$$k \geq \tfrac{1}{2} \sqrt{\kappa_2(A)} \, \ln\!\left(\frac{2}{\varepsilon}\right). \tag{3.5}$$
Proof: see, for instance, [74].

In general, the cg algorithm converges very well, often faster than SOR if preconditioning is used (cf. subsection 3.10). Axelsson [9] analyses the rate of convergence of the cg algorithm very thoroughly, without restricting himself to real matrices. He shows that three phases of the convergence process can typically be distinguished for the cg algorithm:
1. the initial phase, where the residual norm decreases at least as fast as $O((k+1)^{-2})$, where $k$ is the number of iterations;
2. the intermediate phase, where the rate of convergence is linear and given by the standard estimate via the condition number;
3. the last phase, with a tendency to a superlinear rate of convergence, which becomes obvious for special distributions of the eigenvalues.
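The condition-number estimate of Theorem 3.4.2 can be turned directly into an iteration count: the error is reduced by a factor $\varepsilon$ as soon as $2\rho^k \leq \varepsilon$ with $\rho = (\sqrt{\kappa_2(A)} - 1)/(\sqrt{\kappa_2(A)} + 1)$. The following sketch computes the smallest such $k$; the function name `cg_iteration_bound` and the sample values are assumptions made for illustration only.

```python
import math

def cg_iteration_bound(kappa, eps):
    """Smallest k with 2 * rho**k <= eps, where
    rho = (sqrt(kappa) - 1) / (sqrt(kappa) + 1) is the contraction factor
    of the condition-number bound for cg."""
    if kappa <= 1.0:
        return 1                      # rho = 0: the bound vanishes after one step
    rho = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    # 2 * rho**k <= eps  <=>  k >= log(eps / 2) / log(rho), since log(rho) < 0
    return max(0, math.ceil(math.log(eps / 2.0) / math.log(rho)))

# Hypothetical example: condition number 100, error reduction by 1e-6.
k = cg_iteration_bound(100.0, 1e-6)
```

Since this is a worst-case bound, the actual number of iterations is often considerably smaller, in particular when the eigenvalues are clustered.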
Since the cg algorithm works so well for spd matrices, it has been used since the 1970s as a basis for a wealth of algorithmic variants to be applied to non-symmetric matrices. In this connection, note that an arbitrary invertible matrix $A$ generally does not define a norm. The most important algorithmic variants are introduced in the sequel.

3.4.2 Algorithms of Lanczos Type

Just like the cg-like algorithms, the algorithms of Lanczos type are Krylov subspace methods. In 1950, Lanczos [164] developed a special method to generate conjugate orthogonal vectors. This method is used to determine the extreme eigenvalues of a matrix and also to compute the solution of linear systems. Originally developed for symmetric matrices, it gave rise to modifications for non-symmetric (non-Hermitian) matrices. For indefinite matrices, the algorithms of Lanczos type may fail to determine linearly independent search directions. Therefore, some methods were recently developed to avoid the breakdown of the algorithm. These are the so-called Look-Ahead Lanczos algorithms, which, in particular, include the QMR algorithm [100].

Lanczos Algorithm. The Lanczos algorithm was originally developed for the computation of extreme eigenvalues. It generates a series of tridiagonal matrices $\{T_j\}$ with the property that the extreme eigenvalues of $T_j \in \mathbb{C}^{j \times j}$ present increasingly better approximations of the extreme eigenvalues of $A$. Different derivations are possible. Golub [103] (cf. also [74]) chooses optimization of the Rayleigh quotient, while Saad [226], e.g., introduces the Lanczos algorithm as a simplified version of the modified Gram-Schmidt variant of the Arnoldi process, i.e., a projection method based on complete orthogonalization. In any case, the Lanczos algorithm can also be used for the construction or stabilization of cg-like algorithms and, in this context, it is interesting for the following reason. The Lanczos algorithm for non-Hermitian matrices $A \neq A^H$, $A \in \mathbb{C}^{n \times n}$ is based on a bi-orthogonalization. The algorithm that was proposed by Lanczos for non-Hermitian matrices forms a pair of bi-orthogonal bases for the two subspaces
$$V_k(A, v_1) = \mathrm{span}\{v_1, A v_1, \ldots, A^{k-1} v_1\}$$
and
$$W_k(A^H, w_1) = \mathrm{span}\{w_1, A^H w_1, \ldots, (A^H)^{k-1} w_1\}.$$
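Algorithm 3.4.2.1 (Lanczos Algorithm for Non-Hermitian Matrices)
Choose two starting vectors $v_1$ and $w_1$ such that $\langle v_1, w_1 \rangle = 1$ and set $\beta_1 := \delta_1 := 0$, $v_0 = w_0 := 0$
For $j = 1, \ldots, k$
    $\alpha_j = \langle A v_j, w_j \rangle$
    $\tilde v_{j+1} = A v_j - \alpha_j v_j - \beta_j v_{j-1}$
    $\tilde w_{j+1} = A^H w_j - \bar\alpha_j w_j - \bar\delta_j w_{j-1}$
    $\delta_{j+1} = | \langle \tilde v_{j+1}, \tilde w_{j+1} \rangle |^{1/2}$
    $\beta_{j+1} = \langle \tilde v_{j+1}, \tilde w_{j+1} \rangle / \delta_{j+1}$
    $w_{j+1} = \tilde w_{j+1} / \bar\beta_{j+1}$
    $v_{j+1} = \tilde v_{j+1} / \delta_{j+1}$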
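The bi-orthogonalization can be tried out on a small example. The sketch below is restricted to real matrices (so that $A^H = A^T$ and all complex conjugations drop out) and uses dense lists; the function name `lanczos_biorth` and the test matrix are assumptions for illustration, and an exact breakdown simply stops the loop.

```python
def matvec(A, x):
    """Dense product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matvec_t(A, x):
    """Dense product A^T x."""
    return [sum(A[i][j] * x[i] for i in range(len(x))) for j in range(len(A[0]))]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def lanczos_biorth(A, v1, w1, k):
    """k steps of the two-sided Lanczos process for a real matrix A.
    Requires <v1, w1> = 1. Returns the vectors and recurrence scalars."""
    n = len(v1)
    V, W = [list(v1)], [list(w1)]
    v_prev, w_prev = [0.0] * n, [0.0] * n      # v_0 = w_0 = 0
    alphas, betas, deltas = [], [0.0], [0.0]   # beta_1 = delta_1 = 0
    for j in range(k):
        v, w = V[j], W[j]
        Av = matvec(A, v)
        alpha = dot(Av, w)
        alphas.append(alpha)
        vt = [Av[i] - alpha * v[i] - betas[j] * v_prev[i] for i in range(n)]
        Atw = matvec_t(A, w)
        wt = [Atw[i] - alpha * w[i] - deltas[j] * w_prev[i] for i in range(n)]
        s = dot(vt, wt)
        if s == 0.0:                            # exact breakdown: stop here
            break
        delta = abs(s) ** 0.5
        beta = s / delta
        betas.append(beta)
        deltas.append(delta)
        v_prev, w_prev = v, w
        V.append([x / delta for x in vt])       # scaled so that <v, w> = 1
        W.append([x / beta for x in wt])
    return V, W, alphas, betas, deltas

# Hypothetical non-symmetric test matrix and starting vectors.
A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 2.0, 4.0]]
V, W, alphas, betas, deltas = lanczos_biorth(A, [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 2)
```

For this example the vectors satisfy $\langle v_i, w_j \rangle = \delta_{ij}$ up to rounding, which is the bi-orthogonality stated in the text.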
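The scalars $\delta_{j+1}$, $\beta_{j+1}$ for the weighting of $v_{j+1}$ and $w_{j+1}$ can be chosen in a way different from the one in the algorithm above as long as they satisfy $\langle v_{j+1}, w_{j+1} \rangle = 1$. The following tridiagonal matrix results:
$$T_k = \begin{pmatrix} \alpha_1 & \beta_2 & & & \\ \delta_2 & \alpha_2 & \beta_3 & & \\ & \ddots & \ddots & \ddots & \\ & & \delta_{k-1} & \alpha_{k-1} & \beta_k \\ & & & \delta_k & \alpha_k \end{pmatrix} \tag{3.6}$$
When $k$ increases, the extreme eigenvalues of $T_k \in \mathbb{C}^{k \times k}$ tend to the extreme eigenvalues of $A$. With $Q_k = [v_1, v_2, \ldots, v_k]$ and $R_k = [w_1, w_2, \ldots, w_k]$, the following theorem can be derived: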
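Theorem 3.4.3. If the algorithm 3.4.2.1 does not break down until the $k$-th step, the vectors $v_i$, $i = 1, \ldots, k$, and $w_j$, $j = 1, \ldots, k$, build a bi-orthogonal system, i.e.,
$$R_k^H Q_k = I_k$$
holds, where $I_k$ is the unit matrix of order $k$. Furthermore, $\{v_i\}_{i=1,2,\ldots,k}$ is a basis of $V_k(A, v_1)$ and $\{w_j\}_{j=1,2,\ldots,k}$ a basis of $W_k(A^H, w_1)$. Finally,
$$A Q_k = Q_k T_k + \tilde v_{k+1} e_k^T \tag{3.7}$$
$$A^H R_k = R_k T_k^H + \tilde w_{k+1} e_k^T \tag{3.8}$$
$$R_k^H A Q_k = T_k \tag{3.9}$$
where $\tilde v_{k+1}$ and $\tilde w_{k+1}$ denote the vectors $v_{k+1}$ and $w_{k+1}$ before normalization,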
holds, with $e_k$ the $k$-th unit vector. Proof: By induction (see, for instance, [226]).

The relation (3.9) characterizes the Lanczos algorithm. The matrix $T_k$ results from a skew projection of $A$ onto $V_k(A, v_1)$ orthogonally to $W_k(A^H, w_1)$ [226]. The Krylov subspace methods for non-Hermitian matrices use a bi-orthogonalization which goes back to Lanczos and is based on a skew (i.e., non-orthogonal) projection method. Here is the algorithm that solves linear systems by the Lanczos method:

Algorithm 3.4.2.2 (Lanczos Algorithm for Linear Systems)
1. Start: Compute $\beta_1 := \| r_0 \|$.
2. Generate the Lanczos vectors: Carry out $k$ steps of the algorithm 3.4.2.1 for the generation of the Lanczos vectors $v_1, \ldots, v_k$, $w_1, \ldots, w_k$ and the tridiagonal matrix $T_k$, starting from $v_1 := r_0 / \beta_1$ and an arbitrary $w_1$ with $\langle v_1, w_1 \rangle = 1$.
3. Determine the approximate solution:
$$x_k = x_0 + Q_k y_k, \qquad T_k y_k = \beta_1 e_1,$$
with the $n \times k$ matrix $Q_k = [v_1, v_2, \ldots, v_k]$ and the tridiagonal matrix $T_k$ as in (3.6). In the Hermitian as well as in the non-Hermitian case, the following formula can be given for the residual [226]:
with the $k$-th unit vector $e_k$. Therefore, the residual norm in step 2 of algorithm 3.4.2.2 can be determined with little effort and without computing the approximate solution itself. In step 3 of the algorithm 3.4.2.2, the linear system $T_k y_k = \beta_1 e_1$ with the tridiagonal matrix $T_k$ has to be solved. This can be done, e.g., by LU decomposition $T_k = L_k U_k$. By including the decomposition in the Lanczos step, one can determine the approximate solutions $x_k$ consecutively. Saad [226] calls this procedure the direct Lanczos algorithm. If $A$ is real symmetric and a different factorization is chosen, the SYMMLQ algorithm [193] follows. A variant for non-Hermitian systems, which follows another algorithm for real non-symmetric matrices given in [103], can be found in [280]. For the non-Hermitian Lanczos algorithm, one needs to store only six vectors and a tridiagonal matrix.
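The small tridiagonal system $T_k y_k = \beta_1 e_1$ of step 3 can be solved by an LU decomposition without pivoting in $O(k)$ operations. The following sketch shows one way to do this; the function name `solve_tridiagonal`, the storage of the three diagonals as separate lists, and the sample numbers are assumptions for illustration, and no safeguard against a vanishing pivot is included.

```python
def solve_tridiagonal(sub, diag, sup, rhs):
    """Solve T y = rhs for a tridiagonal T via T = L U (no pivoting).
    sub[i] = T[i+1][i], diag[i] = T[i][i], sup[i] = T[i][i+1]."""
    k = len(diag)
    u = [0.0] * k            # diagonal of U (its superdiagonal equals sup)
    l = [0.0] * (k - 1)      # subdiagonal of the unit lower triangular L
    u[0] = diag[0]
    for i in range(1, k):
        l[i - 1] = sub[i - 1] / u[i - 1]
        u[i] = diag[i] - l[i - 1] * sup[i - 1]
    c = list(rhs)            # forward substitution: L c = rhs
    for i in range(1, k):
        c[i] -= l[i - 1] * c[i - 1]
    y = [0.0] * k            # backward substitution: U y = c
    y[k - 1] = c[k - 1] / u[k - 1]
    for i in range(k - 2, -1, -1):
        y[i] = (c[i] - sup[i] * y[i + 1]) / u[i]
    return y

# Hypothetical 3 x 3 example with right-hand side beta_1 * e_1, beta_1 = 3.
sub, diag, sup = [1.0, 2.0], [2.0, 3.0, 4.0], [1.0, 1.0]
y = solve_tridiagonal(sub, diag, sup, [3.0, 0.0, 0.0])
```

Because each new Lanczos step only appends one row and one column to $T_k$, the factors can be updated step by step, which is the idea behind the direct Lanczos algorithm mentioned above.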
3.4.3 Look-Ahead Lanczos Algorithm

While normalizing $v_{j+1}$ and $w_{j+1}$, an exact breakdown of the algorithm may happen if $\langle \tilde v_{j+1}, \tilde w_{j+1} \rangle = 0$. More serious, however, is the case of a near-breakdown. In this case, the Lanczos vectors are scaled by very small numbers, which, after a few steps, can cause a near-breakdown. Yet, by some modifications of the algorithm, it is possible to continue in most cases. This procedure is often referred to as Look-Ahead Lanczos. The underlying idea of this algorithm is that the pair $\{v_{j+2}, w_{j+2}\}$ can often be defined even if the pair $\{v_{j+1}, w_{j+1}\}$ cannot. Then the algorithm can be continued starting with the pair $\{v_{j+2}, w_{j+2}\}$. If $\{v_{j+2}, w_{j+2}\}$ cannot be defined either, the pair $\{v_{j+3}, w_{j+3}\}$ etc. can be used. The following explanation of the mechanism underlying the Look-Ahead Lanczos algorithm follows Saad [226]. First, define a bilinear form on the subspace $P_{k-1}$ by
$$\langle p, q \rangle_p := \langle p(A) v_1, q(A^H) w_1 \rangle. \tag{3.10}$$
Unfortunately, it may happen that $\langle p, p \rangle_p$ vanishes or takes on negative values, thus being an "indefinite inner product". Now, there exists a polynomial $p_j$ of degree $j$ and a scalar $\gamma_j$ such that $v_{j+1} = p_j(A) v_1$ and $w_{j+1} = \gamma_j p_j(A^H) w_1$. The Lanczos algorithm attempts to construct a sequence of polynomials which are orthogonal with respect to the inner product (3.10). Because of
$$\langle p_j, p_j \rangle_p = \gamma_j \langle p_j(A) v_1, p_j(A^H) w_1 \rangle,$$
an exact breakdown happens in step $j$ if and only if the indefinite norm of the polynomial $p_j$ vanishes in step $j$. The main idea of the Look-Ahead Lanczos algorithm is that the polynomial $p_j$ is left out but $p_{j+1}$ is calculated in any case, thus allowing the construction of the series to continue. The following example, with
$$q_j(t) = t \, p_{j-1}(t), \qquad q_{j+1}(t) = t^2 p_{j-1}(t),$$
will illustrate this idea. The polynomials $q_j$ and $q_{j+1}$ are orthogonal to the polynomials $p_1, \ldots, p_{j-2}$. If $p_j = q_j$ is fixed and $p_{j+1}$ is determined by orthogonalizing $q_{j+1}$ against $p_{j-1}$ and $p_j$, then the resulting polynomial is orthogonal to all polynomials of degree $\leq j$. It is therefore possible to continue the algorithm in the same way starting from step $j + 1$.

The disadvantage of Look-Ahead implementations is the immense additional complexity. One difficulty is the need to decide when the situation is a near-breakdown. Furthermore, the matrices $T_k$ are no longer tridiagonal. In what follows, the QMR algorithm will serve as a representative of the many known variants.
3.4.4 Variants of the CG Algorithm for Linear Systems with Non-Hermitian or Indefinite System Matrix

The cg algorithm is a very efficient and easily implemented method for symmetric positive definite matrices. In practice, however, linear systems $Ax = b$ with a complex non-Hermitian or indefinite system matrix $A$ often arise. A number of variants of the cg algorithm have been developed for these linear systems. Some important ones are described in the following. In the literature, many algorithms have only been introduced for real non-symmetric matrices, but they can easily be transferred to the complex case. The inner product $\langle \cdot, \cdot \rangle : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ is then defined as $\langle x, y \rangle := y^H x = \bar y^T x$, where $\bar y$ is the complex conjugate of the vector $y$.

The Krylov subspace methods are still a current research topic. Especially interesting is their application to non-Hermitian linear systems. Yet it has to be noted that the Krylov subspace methods for non-Hermitian systems $Ax = b$ with $A \in \mathbb{C}^{n \times n}$, $A \neq A^H$ are not robust enough to be used as black-box solvers. There are many examples of system matrices $A$ for which the methods introduced below either do not reach the prescribed accuracy or even fail to converge at all. Some of these examples are important applications from practice. Therefore, thorough numerical experiments are always necessary in order to decide whether an algorithm is applicable to a given problem.
CGNR and CGNE Algorithm (CG Applied to the Normal Equations). The simplest algorithms which are suitable for non-Hermitian or indefinite systems $Ax = b$ and related to the cg algorithm are the CGNR and CGNE algorithms, which were developed by Craig [69] for real matrices $A$. Here the more general complex form is given. These algorithms are based on the normal equations $A^H A x = A^H b$ (CGNR) or $A A^H y = b$ with $x = A^H y$ (CGNE). First, they transform the system into a related positive definite system, then they apply the cg algorithm. This approach is studied for real problems in [89]. The advantage of these algorithms is their easy implementation (cf. [226], [280]). Yet, passing to the normal equations increases the condition number, since $\kappa(A^H A) = \kappa(A)^2$. Consequently, only very slow convergence can be expected, and it is essential to precondition the system appropriately. Then these algorithms are usually very robust but also very slow. The latter is the reason why these methods have been excluded a priori from the studies discussed below.

BiCG Algorithm (Bi-Conjugate Gradient). The cg algorithm is not suitable for non-Hermitian systems, since the residual vectors can no longer be made orthogonal via a short recurrence. The GMRES algorithm avoids this difficulty by using longer recurrences, which imposes extra storage requirements. The BiCG algorithm, on the contrary, replaces the orthogonal series of residuals by two mutually orthogonal sequences, but then it cannot maintain the minimization property. Just as it is possible to derive the cg algorithm from the Hermitian Lanczos algorithm, it is possible to derive the BiCG algorithm from the non-Hermitian Lanczos algorithm [226]. The algorithm was first presented in 1952 by Lanczos [165]; then, in 1975, Fletcher [96] published a cg-like version for real non-symmetric matrices. The iterates of the BiCG algorithm satisfy the Petrov-Galerkin condition
$$x_k \in x_0 + V_k, \qquad b - A x_k \perp W_k \tag{3.11}$$
with the Krylov subspaces $V_k = V_k(A, r_0)$ with $r_0 = b - A x_0$ and $W_k = W_k(A^H, \tilde r_0)$, where $\tilde r_0$ is an additional non-trivial starting vector, the so-called pseudo-residual related to $A^H$. Thus, it is a method of projection onto $V_k$ orthogonally to $W_k$. In 1981, Jacobs [144] introduced the method for complex non-symmetric matrices. The BiCG algorithm has been developed for general systems of complex linear equations. It is the basis for several modern algorithms, which will be introduced in the sequel.
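Algorithm 3.4.4.1 (BiCG Algorithm)
Choose $x_0$, $r_0 = b - A x_0$ and $\tilde r_0$ such that $\langle r_0, \tilde r_0 \rangle \neq 0$, set $p_{-1} := \tilde p_{-1} := 0$, $\langle r_{-1}, \tilde r_{-1} \rangle := 1$
For $k = 0, 1, \ldots$:
    $\beta_k = \langle r_k, \tilde r_k \rangle / \langle r_{k-1}, \tilde r_{k-1} \rangle$
    $p_k = r_k + \beta_k p_{k-1}$
    $\tilde p_k = \tilde r_k + \beta_k \tilde p_{k-1}$
    $\alpha_k = \langle r_k, \tilde r_k \rangle / \langle p_k, \tilde p_k \rangle_A$
    $x_{k+1} = x_k + \alpha_k p_k$
    $r_{k+1} = r_k - \alpha_k A p_k$
    $\tilde r_{k+1} = \tilde r_k - \alpha_k A^H \tilde p_k$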
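A compact sketch of the BiCG iteration for real matrices (where $A^H = A^T$) may help to see the two coupled recurrences at work. It is an illustration under stated assumptions, not a robust implementation: the names `bicg`, `matvec`, `matvec_t` and `dot` are hypothetical, the pseudo-residual is simply chosen as $\tilde r_0 := r_0$, and a breakdown raises an exception instead of triggering a look-ahead step or a restart.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matvec_t(A, x):
    """Product with the transposed matrix, A^T x."""
    return [sum(A[i][j] * x[i] for i in range(len(x))) for j in range(len(A[0]))]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def bicg(A, b, tol=1e-10, max_iter=50):
    n = len(b)
    x = [0.0] * n
    r = list(b)                 # r_0 = b - A x_0 with x_0 = 0
    rt = list(r)                # pseudo-residual, chosen as r_0 (assumption)
    p, pt = [0.0] * n, [0.0] * n
    rho_old = 1.0               # <r_{-1}, rt_{-1}> := 1
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        rho = dot(r, rt)
        if rho == 0.0:
            raise ZeroDivisionError("BiCG breakdown: <r, r~> = 0")
        beta = rho / rho_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        pt = [ri + beta * pi for ri, pi in zip(rt, pt)]
        Ap = matvec(A, p)
        denom = dot(Ap, pt)     # <p, p~>_A
        if denom == 0.0:
            raise ZeroDivisionError("BiCG breakdown: <p, p~>_A = 0")
        alpha = rho / denom
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        Atpt = matvec_t(A, pt)  # second matrix-vector product, with A^T
        rt = [ri - alpha * ai for ri, ai in zip(rt, Atpt)]
        rho_old = rho
    return x

# Non-symmetric example with exact solution (1, 2).
A = [[4.0, 1.0], [2.0, 3.0]]
b = [6.0, 8.0]
x = bicg(A, b)
```

The two exception branches correspond to the divisions by zero in the computation of $\beta_k$ and $\alpha_k$ that cause breakdowns of the BiCG algorithm.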
In practice, frequently used expressions such as $\rho_k := \langle r_k, \tilde r_k \rangle$ and $v_k := A p_k$ are stored in separate variables. This is deliberately not done here in order to keep the algorithm as transparent as possible. The pseudo-residuals $\tilde r_k = b - A^H x_k$ and the pseudo search directions $\tilde p_k$ theoretically provide the termination of the process after finitely many steps. By construction, the pseudo-residuals $\tilde r_k$ are orthogonal to the residuals $r_k$, and the pseudo search directions $\tilde p_k$ are A-conjugate to the search directions $p_k$. In the real symmetric case, the BiCG algorithm corresponds to the cg algorithm, since in this case the pseudo search directions and pseudo-residuals coincide with the original search directions and residuals.

Only a few theoretical results about the convergence of the BiCG algorithm are known so far. It can be shown that the residual norm is reduced in phases. During each phase, the number of iterations of the algorithm is more or less comparable with that of the GMRES algorithm [100]. In practice, this can often be observed, but irregular convergence can also occur and the algorithm may even break down. The breakdowns are caused by divisions by zero during the computation of $\alpha_k$ and $\beta_k$. In the first case, the QMR algorithm gives the appropriate Look-Ahead strategy; in the other case, it leads to very complicated algorithms (cf. [196]). A restart directly before some (near-)breakdown and a switch to more robust algorithms such as the GMRES algorithm are other possibilities to handle (near-)breakdowns.

The residuals $r_k$ may be expressed as a linear combination of the basis vectors of $V_k(A, r_0)$, which means nothing else than $r_k = \varphi_k(A) r_0$, with a polynomial $\varphi_k$ in $A$. According to Sonneveld [240], the calculation of the vectors $p_k$ and $r_k$ can be considered as the construction of the so-called BiCG polynomials $\psi_k, \varphi_k \in P_k$ of degree $k$ (where $P_k = \{ q \mid q(t) = \sum_{i=0}^{k} a_i t^i, \; q(0) = 1, \; a_i \in \mathbb{R} \}$), which satisfy three-term recurrence relations and yield
$$r_k = \varphi_k^{BiCG}(A) r_0, \qquad p_k = \psi_k^{BiCG}(A) r_0.$$
For the derivation of further cg-like algorithms, the polynomial representation of the algorithms shows great advantages. The polynomial representation of BiCG corresponds to the polynomial representation of the cg algorithm, merely with another bilinear form. The bilinear form used in BiCG is given by $[\varphi, \psi] := \tilde r_0^H \varphi(A) \psi(A) r_0$, while for the cg algorithm the positive semi-definite bilinear form $[\varphi, \psi] := r_0^T \varphi(A)^T \psi(A) r_0$ is used. In general, i.e., for non-Hermitian indefinite $A$, the bilinear form used in BiCG is not positive semi-definite.
COCG Algorithm (Conjugate Orthogonal Conjugate Gradient). The COCG algorithm of van der Vorst and Melissen [276] is a symmetric variant of the complex BiCG algorithm. Let $A \in \mathbb{C}^{n \times n}$, $A = A^T$, $b \in \mathbb{C}^n$, $x_0 \in \mathbb{C}^n$. The choice of $\tilde r_0 = \bar r_0$, the complex conjugate of the initial residual $r_0$, simplifies the computation of the factors $\alpha_k$ and $\beta_k$ of the BiCG algorithm in the complex symmetric case, since one matrix-vector multiplication with the complex conjugate matrix $A^H$ is avoided.

Algorithm 3.4.4.2 (COCG Algorithm)
Choose $x_0 \in \mathbb{C}^n$, set $p_0 = r_0 = b - A x_0$
For $k = 0, 1, \ldots$:
    $\beta_k = \langle r_k, \bar r_k \rangle / \langle r_{k-1}, \bar r_{k-1} \rangle$
    $p_k = r_k + \beta_k p_{k-1}$
    $\alpha_k = \langle r_k, \bar r_k \rangle / \langle p_k, \bar p_k \rangle_A$
    $x_{k+1} = x_k + \alpha_k p_k$
    $r_{k+1} = r_k - \alpha_k A p_k$

If the system matrix is complex symmetric, it may be assumed that this algorithm is the best choice, since only one matrix-vector multiplication per iteration is necessary, while the BiCG algorithm requires two matrix-vector multiplications. With respect to storage requirements, the COCG algorithm is also favourable, with three vectors to store compared to five vectors in the BiCG algorithm. For the polynomial representation, the following relations hold for the residuals $r_k$ and the search directions $p_k$ of the COCG:
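$$r_k = \varphi_k^{BiCG}(A) r_0, \qquad p_k = \psi_k^{BiCG}(A) r_0$$
where
$$r_k = (\varphi_k^{BiCG})^2(A) r_0, \qquad p_k = (\psi_k^{BiCG})^2(A) r_0,$$
without the need of constructing $\tilde r_k$ and $\tilde p_k$, which, in the original BiCG algorithm, would mean one matrix-vector multiplication with $A^H$. The relation
$$x_k \in x_0 + V_{2k}(A, r_0)$$
holds for the iterates.

Algorithm 3.4.4.3 (CGS Algorithm)
Choose $x_0$, $r_0 = b - A x_0$ and $\tilde r_0$ such that $\langle r_0, \tilde r_0 \rangle \neq 0$, set $q_0 := p_{-1} := 0$, $\langle r_{-1}, \tilde r_0 \rangle := 1$
For $k = 0, 1, \ldots$:
    $\beta_k = \langle r_k, \tilde r_0 \rangle / \langle r_{k-1}, \tilde r_0 \rangle$
    $u_k = r_k + \beta_k q_k$
    $p_k = u_k + \beta_k (q_k + \beta_k p_{k-1})$
    $\alpha_k = \langle r_k, \tilde r_0 \rangle / \langle p_k, \tilde r_0 \rangle_A$
    $q_{k+1} = u_k - \alpha_k A p_k$
    $x_{k+1} = x_k + \alpha_k (u_k + q_{k+1})$
    $r_{k+1} = r_k - \alpha_k A (u_k + q_{k+1})$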
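The CGS recurrences translate almost line by line into code. The following sketch is restricted to real matrices and dense lists; the names `cgs`, `matvec` and `dot` are hypothetical, $\tilde r_0 := r_0$ is an assumed choice, and no safeguard against breakdown or irregular convergence is included.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def cgs(A, b, tol=1e-10, max_iter=50):
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # r_0 = b - A x_0 with x_0 = 0
    rt = list(r)                     # fixed vector r~_0, chosen as r_0 (assumption)
    q, p = [0.0] * n, [0.0] * n      # q_0 = p_{-1} = 0
    rho_old = 1.0                    # <r_{-1}, r~_0> := 1
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        rho = dot(r, rt)
        beta = rho / rho_old
        u = [ri + beta * qi for ri, qi in zip(r, q)]
        p = [ui + beta * (qi + beta * pi) for ui, qi, pi in zip(u, q, p)]
        Ap = matvec(A, p)            # first matrix-vector product
        alpha = rho / dot(Ap, rt)
        q = [ui - alpha * api for ui, api in zip(u, Ap)]
        uq = [ui + qi for ui, qi in zip(u, q)]
        Auq = matvec(A, uq)          # second matrix-vector product
        x = [xi + alpha * wi for xi, wi in zip(x, uq)]
        r = [ri - alpha * wi for ri, wi in zip(r, Auq)]
        rho_old = rho
    return x

# Non-symmetric example with exact solution (1, 2).
A = [[4.0, 1.0], [2.0, 3.0]]
b = [6.0, 8.0]
x = cgs(A, b)
```

Note that neither $A^T$ nor any pseudo-residual sequence appears: only the fixed vector $\tilde r_0$ enters through inner products.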
Compared with the BiCG algorithm, the CGS algorithm has the advantage that the pseudo-residuals $\tilde r_k$ and the pseudo search directions $\tilde p_k$ as well as $A^H$ are not needed. It only requires two matrix-vector multiplications per iteration and the storage of seven vectors. Theoretically, the CGS algorithm converges if the BiCG algorithm converges. Yet, for many applications, CGS is faster than BiCG. The squaring of the BiCG polynomials is characteristic of the CGS algorithm; depending on the choice of the initial residual $r_0$, it either intensifies an existing contraction property of the polynomial or increases the residual norm. This causes the irregular convergence behaviour which can often be observed, can lead to numerical cancellation, and thus strongly influences the stability of this algorithm. As a result of this observation, several new, stabilized algorithms have been developed.
CGS2 Algorithm. Fokkema, Sleijpen, and van der Vorst [238] developed generalized versions of the CGS algorithm. The closest relative is the CGS2 algorithm. This algorithm chooses
with a 'nearby BiCG polynomial' $\tilde\varphi_k$ which is based on the vector $\tilde s$ instead of $\tilde r$. Because of the great similarity to CGS, the algorithm is not explicitly given here. It also requires only two matrix-vector multiplications but, compared to the CGS algorithm, has a higher storage requirement of ten vectors.
SCBiCG Algorithm. Recently, Clemens [60] included some algorithms for complex symmetric systems in one class by introducing the more general formulation of the SCBiCG(T, n) algorithm. This class includes the COCG and BiCGCR algorithms for $n = 0$ and $n = 1$. This class stands out due to the fact that it only requires one matrix-vector multiplication per iteration. Clemens also combines these algorithms with Minimal Residual smoothing, following Schonauer [233]. Let $\pi \in P_n$ be a polynomial of degree $n$ ($P_n := \{ q \mid q(z) = \sum_{i=0}^{n} c_i z^i, \; z \in \mathbb{C}, \; c_i \in \mathbb{R}, \; c_n \neq 0 \}$). Denote the set of its coefficients $c_i$ by $T$: $T := \{c_i\}_{i=0,\ldots,n}$. For these algorithms, the pseudo-residual $\tilde r_0$ of the BiCG algorithm is chosen according to
$$\tilde r_0 := \pi(A) r_0.$$
This implies by induction that
$$\tilde r_k = \pi(A) r_k \quad \text{and} \quad \tilde p_k = \pi(A) p_k \quad \text{for } k = 0, 1, \ldots$$
holds for the pseudo-residuals and pseudo search directions of BiCG. The set of coefficients $T$ and the degree $n$ of the polynomial $\pi$ chosen for the construction of $\tilde r_0$ also determine a special algorithm from this class. Auxiliary vectors $v(i)_k := A^i v_k$ are defined in order to formulate the algorithm:
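Algorithm 3.4.4.4 (SCBiCG(T, n) Algorithm)
Choose $x_0$, $r(0)_0 = b - A x_0$
For $i = 0, 1, \ldots, n-1$: $r(i+1)_0 = A r(i)_0$
For $i = 0, 1, \ldots, n$: $p(i)_0 = r(i)_0$
$p(n+1)_0 = A p(n)_0$
For $k = 0, 1, \ldots$:
    $x_{k+1} = x_k + \alpha_k p(0)_k$
    For $i = 0, 1, \ldots, n$: $r(i)_{k+1} = r(i)_k - \alpha_k p(i+1)_k$
    For $i = 0, 1, \ldots, n$: $p(i)_{k+1} = r(i)_{k+1} + \beta_k p(i)_k$
    $p(n+1)_{k+1} = A p(n)_{k+1}$
with
$$\alpha_k = \frac{\sum c_l \, \langle r(i)_k, r(j)_k \rangle}{\sum c_l \, \langle p(i+1)_k, p(j)_k \rangle}, \qquad \beta_k = \frac{\sum c_l \, \langle r(i)_{k+1}, r(j)_{k+1} \rangle}{\sum c_l \, \langle r(i)_k, r(j)_k \rangle},$$
where all sums run over $0 \leq l \leq n$ with $l = i + j$, $j \leq i \leq j + 1$.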
The Petrov-Galerkin condition (3.11) holds for the iterates $x_k$ of the underlying BiCG algorithm, with the Krylov subspaces $V_k(A, r_0)$ and $W_k(A^H, \tilde r_0)$. The following orthogonality relations hold:
$$\langle r_i, \pi(A) r_j \rangle = 0 \quad \text{for } i \neq j; \; i, j = 0, 1, \ldots, k$$
$$\langle p_i, \pi(A) p_j \rangle_A = 0 \quad \text{for } i \neq j; \; i, j = 0, 1, \ldots, k$$
The COCG algorithm results for $n = 0$, while, for $n = 1$ and $c_0 = 0$, one obtains an algorithm which coincides with the CR algorithm (Conjugate Residual) of Stiefel [247] in the real case; it will be called the BiCGCR algorithm.
3.5 Minimal Residual Algorithms and Hybrid Algorithms

The minimal residual algorithms are more stable than the cg-like algorithms. In particular, they display absolutely monotone convergence behaviour. Nevertheless, some parallels to the cg algorithm exist. The GMRES algorithm of Saad and Schultz [229] is given below. It is a generalization of the MINRES algorithm of Paige and Saunders [193] for non-symmetric systems. Both generate a sequence of orthogonal vectors. However, while MINRES uses short recursions, GMRES has to take into account all previously computed vectors of the orthogonal sequence. For this reason, restarted versions of this method are used in practice. Other related methods are ORTHODIR of Jea and Young [145], a method by Axelsson [6] that builds the basis of the Generalized Conjugate Gradient, Least Squares method of [8], which is described in subsection 3.5.2, the Generalized Conjugate Residual method GCR of Elman [89], [88], the GMERR algorithm of Weiss [312], and the recently published GMBACK algorithm of Kasenally [151]. However, in order to be able to compete with the cg-like methods, these methods generally require (even in the restarted versions) the storage of a large number of basis vectors. In combination with the typical size of the problems arising in applications, their use very often does not make sense. GMRES, for example, was compared for ill-conditioned linear systems by Schmid, Paffrath, and Hoppe in [230] with BiCGSTAB and CGS, which proved to be much more efficient. The observed very slow convergence of GMRES and the wild oscillation of CGS can each be regarded as typical of the respective class.
3.5.1 GMRES Algorithm (Generalized Minimal Residual)

Since the GMRES algorithm itself cannot be recommended for practice because of its enormous storage requirements, and since it only occurs in the sequel in connection with hybrid methods such as the BiCGSTAB algorithm, a detailed derivation is not given here. It can be found in the original publication of Saad and Schultz [229] (see also [280]). In GMRES, as in the cg algorithm, a minimization problem is solved instead of the linear system itself. The minimization problem in GMRES is
$$\min_{y \in V_k(A, r_0)} \| b - A(x_0 + y) \|_2 . \tag{3.12}$$
GMRES is based on the conservation of orthogonality rather than on a three-term recurrence. Consequently, the orthonormal basis $\{v_1, \ldots, v_k\}$ of the Krylov subspace $V_k(A, r_0)$ has to be determined. To this end, the Arnoldi algorithm [3], a modified Gram-Schmidt orthonormalization method, is applied. If the vector $y$ in (3.12) is expressed by its Krylov space representation $Q_k z$, then (3.12) is equivalent to the minimization of
$$J(z) = \| \beta v_1 - A Q_k z \|_2 \quad \text{with } \beta = \| r_0 \|,$$
where $Q_k = [v_1, \ldots, v_k]$ is the matrix with the columns $v_1, \ldots, v_k$. The Arnoldi process yields a $k$-dimensional upper Hessenberg matrix $H_k$ with the elements $h_{ij} = \langle v_j, v_i \rangle_A$. Now let $\bar H_k$ be the Hessenberg matrix $H_k$ extended by the row $(0, \ldots, 0, h_{k+1,k})$. Then the relation
$$A Q_k = Q_{k+1} \bar H_k ,$$
which is proven in [44] and is very important for the derivation of GMRES, holds. With that, using the orthonormality of $Q_{k+1}$, the functional $J(z)$ can be expressed as
$$J(z) = \| \beta e_1 - \bar H_k z \|_2$$
with the unit vector $e_1$. For the solution of the minimization problem, $\bar H_k$ is then brought into upper triangular shape via Givens rotations: $\bar H_k = \Omega_k R_k$ with a unitary rotation matrix $\Omega_k$. Then $J(z)$ can be transformed into
$$J(z) = \| \Omega_k^H \beta e_1 - R_k z \|_2 .$$
With that, $z_k = \arg\min_z J(z)$ can be obtained by backward substitution since $R_k$ is an upper triangular matrix.
The number of vectors to be stored for the reconstruction of the orthonormal basis grows linearly with $k$, the number of floating point operations (flops) quadratically. Therefore, in practice a "dynamical" version of GMRES is used which restarts after $l$ steps and then uses $x_l$ as the initial approximation.
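Algorithm 3.5.1.1 (GMRES(l) Algorithm)
Choose $x_0$, set $r_0 = b - A x_0$, $v_1 = r_0 / \| r_0 \|$, $\Omega_0 = I$, $g_0 = (\| r_0 \|_2, 0, \ldots, 0)^T$
For $j = 1, 2, \ldots, l$: (Arnoldi)
    For $i = 1, \ldots, j$: $h_{i,j} = \langle v_j, v_i \rangle_A$
    $\tilde v_{j+1} = A v_j - \sum_{i=1}^{j} h_{i,j} v_i$
    $h_{j+1,j} = \| \tilde v_{j+1} \|$
    $v_{j+1} = \tilde v_{j+1} / h_{j+1,j}$
    Store $\bar H_j$ in factorized form $\bar H_j = \Omega_j R_j$
    $g_j = \Omega_j g_{j-1}$
If $\| r_l \| = g_l > \varepsilon$: restart with $x_0 := x_l$, $v_1 := r_l / \| r_l \|$
Compute $z_l$ from $R_l z_l = g_l$
$x_l = x_0 + Q_l z_l$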
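A restarted GMRES iteration for real matrices can be sketched as follows. The sketch uses modified Gram-Schmidt for the Arnoldi process and Givens rotations for the small least squares problem; the function names `gmres_cycle` and `gmres_restarted`, the dense list storage and the test problem are assumptions for illustration. One deliberate deviation from the listing above: the iterate $x_l$ is formed at the end of every cycle before the restart test, so that it is available as the new starting vector.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def gmres_cycle(A, b, x0, l):
    """One GMRES cycle of at most l Arnoldi steps; returns (x_l, residual norm)."""
    r0 = [bi - ai for bi, ai in zip(b, matvec(A, x0))]
    beta = math.sqrt(dot(r0, r0))
    if beta == 0.0:
        return list(x0), 0.0
    V = [[ri / beta for ri in r0]]   # orthonormal Krylov basis
    R, rot, g = [], [], [beta]       # columns of R, Givens pairs, rotated rhs
    for j in range(l):
        w = matvec(A, V[j])
        h = []
        for i in range(j + 1):       # modified Gram-Schmidt
            hij = dot(w, V[i])
            h.append(hij)
            w = [wk - hij * vk for wk, vk in zip(w, V[i])]
        hnext = math.sqrt(dot(w, w))
        h.append(hnext)
        for i, (c, s) in enumerate(rot):     # apply earlier rotations
            h[i], h[i + 1] = c * h[i] + s * h[i + 1], -s * h[i] + c * h[i + 1]
        d = math.hypot(h[j], h[j + 1])       # new rotation annihilates h[j+1]
        c, s = h[j] / d, h[j + 1] / d
        rot.append((c, s))
        h[j] = d
        R.append(h[:j + 1])
        g.append(-s * g[j])
        g[j] = c * g[j]
        if hnext == 0.0:             # Krylov space exhausted: solution found
            break
        V.append([wk / hnext for wk in w])
    k = len(R)
    z = [0.0] * k
    for i in range(k - 1, -1, -1):   # backward substitution R z = g
        z[i] = (g[i] - sum(R[m][i] * z[m] for m in range(i + 1, k))) / R[i][i]
    x = list(x0)
    for m in range(k):
        x = [xi + z[m] * vi for xi, vi in zip(x, V[m])]
    return x, abs(g[k])

def gmres_restarted(A, b, l, tol=1e-10, max_restarts=1000):
    x = [0.0] * len(b)
    for _ in range(max_restarts):
        x, res = gmres_cycle(A, b, x, l)
        if res <= tol:
            break
    return x

# Example with exact solution (1, 2, 3), solved with restart length l = 2.
A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
b = [6.0, 15.0, 11.0]
x = gmres_restarted(A, b, 2)
```

The last entry of the rotated right-hand side gives the residual norm without forming the residual itself, which is why the restart test is cheap.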
Convergence of GMRES(l) is guaranteed only for positive definite matrices. For indefinite systems, stagnation may happen, which, however, is seldom observed [230]. Implementations using the Gram-Schmidt orthogonalization are relatively inexpensive, but may be numerically unstable. Therefore, some implementations apply the Householder transformation for orthonormalization, which is twice as expensive. These implementations are known for their stability and their good vectorizability [230], [44].
3.5.2 Hybrid Methods

The hybrid methods described below combine the cg algorithm, the BiCG algorithm, or the Look-Ahead Lanczos algorithm with a minimal residual ansatz, particularly with the GMRES algorithm. In this way, the advantage of short recursions in the cg or Lanczos algorithm is combined with the stable and monotone convergence behaviour of the minimal residual algorithm. The resulting algorithms are very well suited for the solution of complex non-Hermitian linear systems.
BiCGSTAB Algorithm (Bi-Conjugate Gradient Stabilized). The BiCGSTAB algorithm [275] was developed by van der Vorst as a stabilized version of the BiCG algorithm. The BiCGSTAB algorithm is a modification of the CGS algorithm. Because of the squaring of the residual polynomial, $(\varphi_k^{BiCG})^2(A)$, in the CGS algorithm, it may happen, in case of irregular convergence, that rounding errors build up and finally lead to an overflow. To avoid this, the residuals $r_k$ in BiCGSTAB are defined by $r_k = \pi_k(A) \varphi_k^{BiCG}(A) r_0$, with a new polynomial $\pi_k$. This new polynomial $\pi_k$ is defined recursively in each step. The goal is stabilization and smoothing of the algorithm. Therefore, $\pi_k \in P_k$ is defined as
$$\pi_{k+1}(t) = (1 - \omega_k t) \, \pi_k(t).$$
Thus, in each step, the preceding polynomial is multiplied by a polynomial of degree 1. The new parameter $\omega_k$ is chosen so that, by multiplication of the residual vector by $(I - \omega_k A)$, a steepest descent in the direction of the preceding residual is achieved. Thus the parameters $\omega_k$ are chosen so that the Euclidean norm of $r_k$ is minimized:
$$\omega_k = \arg\min_{\omega} \| (I - \omega A) \, \pi_{k-1}(A) \varphi_k^{BiCG}(A) r_0 \|_2 .$$
In other words, $\pi_k$ is a product of $k$ 1-step MR polynomials (Minimal Residual):
$$\pi_k(t) = (1 - \omega_1 t)(1 - \omega_2 t) \cdots (1 - \omega_k t).$$
Introducing the vector $s_k$ with
$$s_k := \pi_{k-1}(A) \varphi_k^{BiCG}(A) r_0,$$
define the parameter $\omega_k$ by
$$\omega_k = \frac{s_k^H A s_k}{s_k^H A^H A s_k}.$$
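In vector form, the algorithm is given by:

Algorithm 3.5.2.1 (BiCGSTAB Algorithm)
Choose $x_0$, $r_0 = b - A x_0$ and $\tilde r_0$ such that $\langle r_0, \tilde r_0 \rangle \neq 0$, set $p_0 := r_0$
For $k = 1, 2, \ldots$:
    $\alpha_{k-1} = \langle r_{k-1}, \tilde r_0 \rangle / \langle p_{k-1}, \tilde r_0 \rangle_A$
    $s_k = r_{k-1} - \alpha_{k-1} A p_{k-1}$
    $\omega_k = \langle s_k, s_k \rangle_A / \langle s_k, s_k \rangle_{A^H A}$
    $x_k = x_{k-1} + \alpha_{k-1} p_{k-1} + \omega_k s_k$
    $r_k = s_k - \omega_k A s_k$
    $\beta_k = \dfrac{\langle r_k, \tilde r_0 \rangle}{\langle r_{k-1}, \tilde r_0 \rangle} \cdot \dfrac{\alpha_{k-1}}{\omega_k}$
    $p_k = r_k + \beta_k (p_{k-1} - \omega_k A p_{k-1})$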
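The BiCGSTAB steps can be illustrated by a short sketch for real matrices. It is a sketch under stated assumptions rather than a production solver: the names `bicgstab`, `matvec` and `dot` are hypothetical, the starting direction is taken as $p_0 := r_0$ and the fixed vector as $\tilde r_0 := r_0$, and a vanishing $\omega_k$ simply terminates the loop.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def bicgstab(A, b, tol=1e-10, max_iter=100):
    n = len(b)
    x = [0.0] * n
    r = list(b)                    # r_0 = b - A x_0 with x_0 = 0
    rt = list(r)                   # fixed vector r~_0, chosen as r_0 (assumption)
    p = list(r)                    # p_0 := r_0
    rho = dot(r, rt)
    for _ in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol:
            break
        Ap = matvec(A, p)          # first matrix-vector product
        alpha = rho / dot(Ap, rt)
        s = [ri - alpha * api for ri, api in zip(r, Ap)]
        As = matvec(A, s)          # second matrix-vector product
        denom = dot(As, As)
        # omega minimizes || s - omega * A s ||_2 (one-step minimal residual)
        omega = dot(As, s) / denom if denom > 0.0 else 0.0
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * asi for si, asi in zip(s, As)]
        if omega == 0.0:           # s = 0 (converged) or stagnation: stop
            break
        rho_new = dot(r, rt)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * api) for ri, pi, api in zip(r, p, Ap)]
        rho = rho_new
    return x

# Non-symmetric example with exact solution (1, 2).
A = [[4.0, 1.0], [2.0, 3.0]]
b = [6.0, 8.0]
x = bicgstab(A, b)
```

The division by `omega` in the update of `beta` shows why a nearly vanishing $\omega_k$ leads to stagnation or breakdown.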
The BiCGSTAB algorithm combines the BiCG polynomial with a GMRES(1) minimization step. Therefore, the BiCGSTAB algorithm normally leads to smoother convergence curves than the CGS algorithm. However, in the BiCGSTAB algorithm, stagnation or even a breakdown can happen if $\omega_k$ nearly vanishes. The computational effort is two matrix-vector multiplications per iteration, and the storage requirement is seven vectors.
BiCGstab2 and BiCGstab(l) Algorithm. In 1991, Gutknecht proposed the BiCGstab2 algorithm [110] in order to avoid stagnation and breakdown caused by nearly vanishing $\omega_k$. His proposal was to use an MR polynomial of second degree. In each even step, he corrects the first degree MR polynomial of the preceding step. The MR polynomial of degree one, however, can already be nearly degenerate and thus can cause degeneration of the MR polynomial of second degree as well as large errors. Therefore, Sleijpen and Fokkema introduced a generalization of this method in 1993 that constructs an MR polynomial of degree $l$ in each $l$-th step. This leads to the more efficient BiCGstab(l) algorithm [237] with
$$\pi_k = \chi_m \chi_{m-1} \cdots \chi_0, \qquad \chi_i \in P_l,$$
where $k = ml + l$. $\chi_m$ minimizes
This method can be regarded as a combination of the BiCG algorithm with GMRES(l). The iterates satisfy
$$x_k = x_{lm} \in x_0 + V_{2ml}(A, r_0).$$
Consequently, for $l = 1$, it coincides with the original BiCGSTAB algorithm. Certain near-breakdowns can be avoided using the BiCGstab(l) algorithm, but in general they cannot be avoided, since the leading coefficient of $\chi_m$ may become very small. In section 3.10, some studies regarding implementations with $l = 1$ and $l = 2$ are discussed. Even for $l = 2$, stagnation may happen if the GMRES(2) part stagnates. The BiCGstab(l) algorithm requires $2l$ matrix-vector multiplications in each iteration and requires to store.
QMR and TFQMR Algorithm (Transpose-free Quasi-Minimal Residual). The QMR algorithm (Quasi-Minimal Residual) of Freund and Nachtigal [100] is based on the Look-Ahead Lanczos algorithm for non-Hermitian linear systems. In the QMR algorithm, the latter is combined with the minimal residual algorithm GMRES. The QMR algorithm can be applied even to singular square systems, as Freund and Hochbruck [98] showed. Thus, the Petrov-Galerkin condition is replaced in the QMR algorithm by quasi-minimization of the residual norm. In contrast to the BiCG algorithm, breakdowns are in principle excluded in QMR by using a Look-Ahead strategy in the underlying Lanczos process. In the QMR algorithm, the vectors $\{v_j\}$ (cf. algorithm 3.4.2.1) generated by the Look-Ahead Lanczos algorithm are used as a basis for the Krylov subspace $V_k(A, r_0)$. Let $Q_k$ be the matrix $Q_k := [v_1, \ldots, v_k]$ built from the basis vectors. Then the $k$-th QMR iterate $x_k$ is defined by
$$x_k = x_0 + Q_k z_k ,$$
where $z_k \in \mathbb{C}^k$ is the unique solution of the least squares problem
$$\| \Omega_{k+1} (\beta e_1 - \bar T_k z_k) \|_2 = \min_{z \in \mathbb{C}^k} \| \Omega_{k+1} (\beta e_1 - \bar T_k z) \|_2 \tag{3.13}$$
(compare to the minimization problem in the derivation of the GMRES algorithm). Therein $\beta = \| r_0 \|$, $e_1$ is the first unit vector from $\mathbb{R}^{k+1}$, and the matrix $\Omega_{k+1} := \mathrm{diag}(\omega_1, \omega_2, \ldots, \omega_{k+1})$ is an arbitrary diagonal weighting matrix with $\omega_j > 0$, $j = 1, 2, \ldots, k+1$. The standard choice for the weights is $\omega_j = 1$ for all $j$. The matrix $\bar T_k$ is a $(k+1) \times k$ tridiagonal matrix from the Lanczos process which satisfies
$$A Q_k = Q_{k+1} \bar T_k .$$
Therefore, the matrix $\Omega_{k+1} \bar T_k$ has full rank and guarantees the existence of a unique solution of the problem (3.13). Then the following holds for the residual vector $r_k := b - A x_k$:
$$r_k = Q_{k+1} \Omega_{k+1}^{-1} \; \Omega_{k+1} (\beta e_1 - \bar T_k z_k). \tag{3.14}$$
Consequently, because of (3.13), the $k$-th QMR iterate $x_k$ is characterized by the minimization of the second factor in (3.14). This is just the quasi-minimal residual property. For further details, the reader is referred to [100].

The QMR and TFQMR algorithms of Freund [97] are closely related to the CGS algorithm (compare [97], [330]). The convergence behaviour of the QMR algorithm is very similar to that of the CGS and CGS2 algorithms. However, the convergence curves are evidently much smoother. In the original QMR algorithm, the usual three-term recursions are used inside the Lanczos process. Yet, since it has been observed that vector iterations based on three-term recurrences are less robust in finite precision arithmetic than the mathematically equivalent two-term recurrences, an implementation with two-term recurrence relations was introduced by Freund and Nachtigal in [99]. There also exists a transpose-free version of the QMR algorithm, the so-called TFQMR algorithm. This algorithm is explicitly given here:
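Algorithm 3.5.2.2 (TFQMR Algorithm)

Choose $x_0$; set $w_1 = y_1 = r_0 = b - A x_0$, $v_0 = A y_1$, $p_0 = 0$, $\vartheta_0 = 0$;
choose $\tilde r_0$ such that $\langle \tilde r_0, r_0 \rangle \neq 0$.
For $k = 1, 2, \ldots$:
    $\alpha_{k-1} = \langle w_{2k-1}, \tilde r_0 \rangle / \langle v_{k-1}, \tilde r_0 \rangle$
    $y_{2k} = y_{2k-1} - \alpha_{k-1} v_{k-1}$
    For $j = 2k-1, 2k$:
        $w_{j+1} = w_j - \alpha_{k-1} A y_j$
        $\vartheta_j = \|w_{j+1}\| \sqrt{1 + \vartheta_{j-1}^2} \, / \, \|w_j\|$
        $p_j = y_j + \vartheta_{j-1}^2 \, p_{j-1} / (1 + \vartheta_{j-1}^2)$
        $x_j = x_{j-1} + \alpha_{k-1} \, p_j / (1 + \vartheta_j^2)$
    $\beta_k = \langle w_{2k+1}, \tilde r_0 \rangle / \langle w_{2k-1}, \tilde r_0 \rangle$
    $y_{2k+1} = w_{2k+1} + \beta_k y_{2k}$
    $v_k = A y_{2k+1} + \beta_k (A y_{2k} + \beta_k v_{k-1})$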
Again, to make the algorithm transparent, the usual auxiliary scalars, which of course should be used in an implementation, are not introduced.

3.5.3 GCG-LS(s) Algorithm (Generalized Conjugate Gradient, Least Squares)

The Generalized Conjugate Gradient, Least Squares algorithm of Axelsson [5], [6], [8] belongs to the group of generalized CG algorithms. A number of such methods have been invented in recent years. They are either of least squares type, such as ORTHOMIN [291], the predecessor of the method [5], [6] treated here, and GMRES [229], or of Galerkin type, as in [313], [6]. In [228], the equivalences between the different methods are discussed in detail. All of these methods recursively construct a sequence of search directions $\{p_j\}$ and approximate solutions $x_k$ as linear combinations of the preceding search directions or as truncated linear expansions, either by minimizing a weighted squared norm of the error or of the residual in the least squares sense or by demanding certain orthogonality relations. The search directions can be computed in many different ways. In [5], [6], they are recursively determined as a linear combination of the last residuals and search directions. The GCG-LS(s) algorithm generalizes the method from [5], [6] by allowing the search direction $p_k$ to be a certain $(s+1)$-term expansion and $x_{k+1}$ a combination of $p_k$ and $p_{k-s-2}$. Here $p_{k-s-2}$ plays the role of a control term, and small values of this term indicate that an appropriate value of $s$ might have been found, where $s$ is the parameter of the "truncated version". For $s = n - 1$, the full version arises.

Algorithm 3.5.3.1 (GCG-LS Algorithm)

k-th step; $s_k = \min\{k, s\}$:
    $\alpha_k^{(k)} = - \langle r_k, p_k \rangle_A \, / \, \langle p_k, p_k \rangle_{A^H A}$
    $x_{k+1} = x_k + \alpha_k^{(k)} p_k$
    $r_{k+1} = r_k + \alpha_k^{(k)} A p_k$
    $h_{k+1} = A^H A r_{k+1}$
    For $l = 0, 1, \ldots, s_k$:
        $\beta_{k-l}^{(k)} = \langle h_{k+1}, p_{k-l} \rangle \, / \, \langle p_{k-l}, p_{k-l} \rangle_{A^H A}$
    $p_{k+1} = -r_{k+1} + \sum_{j=0}^{s_k} \beta_{k-j}^{(k)} p_{k-j}$
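The recursion of Algorithm 3.5.3.1 can be sketched in a few lines of Python. The sketch below is only an illustration under several assumptions that are not part of the text: the matrix is real (so that $A^H = A^T$), the residual is taken as $r = Ax - b$ with $p_0 = -r_0$ and $x_0 = 0$, and the names `matvec`, `dot` and `gcg_ls` as well as the small test matrix are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def gcg_ls(A, b, s, steps):
    """Sketch of the truncated GCG-LS(s) recursion for a real square matrix A."""
    At = [list(col) for col in zip(*A)]        # A^H = A^T in the real case
    x = [0.0] * len(b)
    r = [-bi for bi in b]                      # r_0 = A x_0 - b with x_0 = 0
    P = [[-ri for ri in r]]                    # stored directions, newest last
    AP = [matvec(A, P[0])]
    for k in range(steps):
        p, Ap = P[-1], AP[-1]
        denom = dot(Ap, Ap)                    # <p_k, p_k>_{A^H A}
        if denom == 0.0:                       # residual is already zero
            break
        alpha = -dot(r, Ap) / denom
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri + alpha * api for ri, api in zip(r, Ap)]
        h = matvec(At, matvec(A, r))           # h_{k+1} = A^H A r_{k+1}
        keep = min(k, s) + 1                   # directions p_k, ..., p_{k - s_k}
        P, AP = P[-keep:], AP[-keep:]
        p_new = [-ri for ri in r]
        for pj, Apj in zip(P, AP):
            beta = dot(h, pj) / dot(Apj, Apj)
            p_new = [pn + beta * pji for pn, pji in zip(p_new, pj)]
        P.append(p_new)
        AP.append(matvec(A, p_new))            # second multiplication with A
    return x


# Hypothetical 3x3 test system with positive definite symmetric part.
A = [[4.0, 1.0, 0.0], [-1.0, 3.0, 1.0], [0.0, -1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = gcg_ls(A, b, s=2, steps=3)                 # s = n - 1: full version
residual = [ax - bi for ax, bi in zip(matvec(A, x), b)]
print(max(abs(ri) for ri in residual))
```

With $s = n - 1$ the sketch keeps all previous directions, so for this small example the residual should vanish up to rounding after $n = 3$ steps. Choosing a smaller `s` gives the truncated variant, which stores fewer vectors at the price of losing finite termination.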
In each step, $s_k + 3$ inner products, two matrix-vector multiplications with $A$, and one with $A^H$ are necessary. The GCG-LS(s) algorithm converges monotonically without breakdowns. One can prove under certain assumptions that the algorithm terminates after finitely many steps. As usual, the convergence rate depends on the eigenvalue distribution. In [8], Axelsson gives estimates showing that the GCG-LS(s) algorithm needs markedly fewer iterations than the CGNE algorithm in order to reach the same accuracy. It is worth noting that the Krylov sequence in the GCG-LS(s) algorithm is based on $A$ and not on $A^H A$ as in the CGNE algorithm. However, in the full GCG-LS algorithm, the costs per iteration grow linearly with the number $k$ of steps. This is the reason why the truncated version of the algorithm was introduced.
3.5.4 Overview of BiCG-like Solvers

Table 3.1 compares the polynomial representation and numerical effort for some commonly used Krylov subspace and hybrid methods.

Solver        BiCG polynomial                                           multiplications   vector storage
BiCG          $r_k = \Phi_k^{\mathrm{BiCG}}(A) r_0$                     2                 5
COCG          $r_k = \Phi_k^{\mathrm{BiCG}}(A) r_0$                     1                 3
CGS           $r_k = (\Phi_k^{\mathrm{BiCG}})^2(A) r_0$                 2                 7
CGS2          $r_k = \phi_k(A) \Phi_k^{\mathrm{BiCG}}(A) r_0$           2                 10
BiCGSTAB(l)   $r_k = \pi_k(A) \Phi_k^{\mathrm{BiCG}}(A) r_0$            $2l$              $2l + 5$
TFQMR         QMR ansatz on CGS                                         2                 8

Table 3.1. Comparison of polynomial representation and numerical effort for some Krylov subspace and hybrid methods. The second column shows the polynomial representation of the k-th residual. The third column shows the number of matrix-vector multiplications per iteration. The last column shows the storage requirements in number of vectors to be stored.
3.6 Multigrid Techniques

The multigrid method, which in spite of its name is a construction principle for iterative solvers rather than a 'method', was originally developed for the fast solution of Poisson's equation. The historical development of multigrid techniques, MG for short, began in the early sixties with studies by Fedorenko and Bakhvalov: As early as 1961, Fedorenko [92] described a two-grid algorithm and, in 1964 [93], the first multigrid algorithm for Poisson's equation on a square. In 1966, Bakhvalov [14] published a method for second-order elliptic differential equations with variable coefficients. At that time they spoke of a method for a sequence of grids. On the basis of these papers, Brandt [40] began his studies in 1972. He recognized the great efficiency of the multigrid scheme. Independently of these papers and studies, Hackbusch [111] developed his multigrid algorithms, which he first published in 1976. Further important papers are listed, e.g., in [253] by Stüben and Trottenberg. In [269], Trottenberg gives a short overview of the basic ideas of multigrid techniques. Meanwhile, more textbook-like treatments can be found, e.g., in Briggs' multigrid tutorial [49], Hackbusch [112], [114], or Großmann / Roos [107]. In [320], Wittum gives a popular description of the method. In [42], Brandt gives a kind of guide for the development of a multigrid algorithm. Multigrid techniques have recently acquired additional importance as so-called multilevel preconditioners. Over the years, a variety of different derivations and views has been found. Unfortunately, the multitude of notations that goes along with this does not lead to greater clarity. Among the noteworthy newer developments are Griebel's [106] representation of multigrid techniques and multilevel preconditioners as classical iterative solvers (Gauss-Seidel, Jacobi preconditioner) over generating systems which keep the node bases of several discretization levels, and Deuflhard's [73] cascade methods. The regularly published "mgnet-digest" [81] contains an overview of the wide range of existing literature. Only the essentials of multigrid techniques are given in this subsection, before a special multigrid algorithm [277] is described in subsection 3.7.
Principle of Multigrid Techniques

A boundary value problem for a partial differential equation is usually discretized on some grid. The discretization leads to a large linear system $Ax = b$ with simple structure: The system matrix $A$ is sparse, the non-zero elements for many discretization methods are in very systematic order (e.g., appear only on a few bands below and above the main diagonal) and, for homogeneous domains, all non-zero terms have the same order of magnitude. Iterative solution methods are best for such linear systems. The classical stationary iteration methods have three typical properties when they are applied to linear systems arising from the discretization of differential equations:

1. They give a very good smoothing of the error, i.e., the amplitude of high frequency error components in the Fourier expansion of the error is strongly diminished within a few iteration steps. But:
2. The rate of convergence worsens with refinement of the discretization, i.e., as $h \to 0$.
3. The total error hardly diminishes after the smoothing of the high frequency terms.

The basic principle of multigrid is the combined treatment of a sequence of discrete problems of increasing resolution - all approximating the same continuous problem. The combination is chosen in such a way that each discrete problem mostly takes care of the highest frequency components, leaving lower frequencies to coarser resolutions. Properly implemented, this gives a rate of convergence independent of the step size $h$ of the finest discretization.
Multigrid techniques are especially well suited for the treatment of linear boundary value problems for elliptic differential equations, for elliptic systems of partial differential equations, and for elliptic eigenvalue problems. In addition, they are also suited for non-linear boundary value problems. For other problems such as hyperbolic problems they can also be used but then they do not show their typical advantages (speed and robustness) compared with other methods. Let us emphasize again that - even though an explicit multigrid or multilevel method is used most often - the multigrid idea is a principle to construct iterative solvers for discrete partial differential equations rather than a solution method. This idea has been successfully applied to a wide range of problems, from elasticity to fluid dynamics and to all kinds of discretization methods. The original underlying principle is to exploit a separation of frequency scales implied by the spectral properties of the differential operator, which originally was an elliptic operator.
3.6.1 Smoothing and Local Fourier Analysis

Crucial for the quality of a multigrid algorithm is the choice of a suitable smoothing procedure for the given partial differential equation. One method, the local mode analysis, requires expanding the error of the approximate solution into its Fourier series. The smoothing is mainly a local process, since high frequencies have only a small area of influence. Therefore, the smoothing may be studied far away from the boundaries in the inner part of the grid. Such an analysis is also very helpful for the estimation of the rate of convergence. The error $v_{n,m} = v(n \cdot h_x, m \cdot h_y)$ of an approximate solution on a two-dimensional grid is (locally) expanded in a Fourier series:

$$v_{n,m} = \sum_{\theta} A_\theta \, e^{i(\theta_1 n + \theta_2 m)}, \qquad (3.15)$$
with $\theta = (\theta_1, \theta_2)$. The individual Fourier components of this error expansion may be studied separately. The convergence rate for the high frequency components gives the smoothing rate, where a component is said to have high frequency if it is no longer visible, i.e., not representable, on the next coarser grid. For a coarse grid with step size $H$, this is the case if the wavelength is smaller than $2H$ (i.e., smaller than $4h$ for $H = 2h$). Thus, the following notation results in case of uniform coarsening:
Definition 3.6.1. Let $G_h$ be a fine grid and let $G_H$ be the next coarser one. The error components visible on $G_h$ but not on $G_H$ are the high frequency components. If $G_h$ and $G_H$ are regular grids with $H = ph$, then $\pi/p \le |\theta| \le \pi$ for these components.

Assume that the partial differential equation has been reduced by discretization to the linear system of equations

$$Ax = b. \qquad (3.16)$$

Then

$$M x^{(j+1)} + (A - M) x^{(j)} = b \qquad (3.17)$$

gives a general description of the iteration. Now let $v^{(j)}_{n,m} = x_{n,m} - x^{(j)}_{n,m}$ be the error before and $v^{(j+1)}_{n,m} = x_{n,m} - x^{(j+1)}_{n,m}$ the error after one iteration step. For those, the ansatz

$$v^{(j)}_{n,m} = A^{(j)}_\theta \, e^{i(\theta_1 n + \theta_2 m)}, \qquad v^{(j+1)}_{n,m} = A^{(j+1)}_\theta \, e^{i(\theta_1 n + \theta_2 m)}$$

is used. The subtraction of (3.17) from (3.16) results in

$$M v^{(j+1)} + (A - M) v^{(j)} = 0.$$
Now, in order to obtain statements about convergence and the smoothing factor, it is necessary, at this point, to look at the studied problem and the chosen iterative method. In accordance with Brandt [41], the convergence factor and the smoothing rate are defined as follows:
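Definition 3.6.2. Let $\theta$ be a Fourier component of the error function. Then

$$\mu(\theta) = \left| \frac{A^{(j+1)}_\theta}{A^{(j)}_\theta} \right|$$

is called the convergence factor of the $\theta$-component and

$$\bar\mu = \max_{\pi/p \,\le\, |\theta| \,\le\, \pi} \mu(\theta)$$

is called the smoothing factor.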
A smoothing factor of 0.5, for example, means that three relaxation steps reduce the high frequency error components by almost one order of magnitude (this is the case for Poisson's equation on a square using Gauss-Seidel and $p = 2$). Local mode analysis is straightforward only for regular grids with constant spacing; for unstructured grids it does not work at all.
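The value 0.5 quoted above can be checked numerically with a small local mode analysis. The following sketch assumes the standard 5-point stencil for Poisson's equation and lexicographic Gauss-Seidel, for which inserting the Fourier ansatz into the error equation gives $\mu(\theta) = |e^{i\theta_1} + e^{i\theta_2}| \, / \, |4 - e^{-i\theta_1} - e^{-i\theta_2}|$. It also assumes $|\theta| = \max(|\theta_1|, |\theta_2|)$, and the function names and the sampling resolution are illustrative choices.

```python
import cmath
import math


def gs_convergence_factor(t1, t2):
    """mu(theta) for lexicographic Gauss-Seidel on the 5-point Laplacian."""
    numerator = cmath.exp(1j * t1) + cmath.exp(1j * t2)
    denominator = 4.0 - cmath.exp(-1j * t1) - cmath.exp(-1j * t2)
    return abs(numerator / denominator)


def smoothing_factor(p=2, samples=129):
    """Maximum of mu(theta) over sampled high frequencies pi/p <= |theta| <= pi."""
    thetas = [-math.pi + 2.0 * math.pi * k / (samples - 1) for k in range(samples)]
    threshold = math.pi / p - 1e-12            # tolerate rounding at the boundary
    best = 0.0
    for t1 in thetas:
        for t2 in thetas:
            if max(abs(t1), abs(t2)) >= threshold:
                best = max(best, gs_convergence_factor(t1, t2))
    return best


mu_bar = smoothing_factor()
print(mu_bar)        # close to 0.5
print(mu_bar ** 3)   # reduction of high frequencies after three sweeps
```

The sampled maximum approaches 0.5 from below as the sampling is refined, and $0.5^3 = 0.125$ is the "almost one order of magnitude" mentioned in the text.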
3.6.2 The Two-Grid Method

To explain the principle behind multigrid techniques, it is sufficient to look at a two-grid method. It comprises all important components of a multigrid method and is at the same time still easy to survey. In this introductory subsection, only regular two-dimensional grids will be treated. Let $h$ be the step size of the fine grid and $H$ the step size of the coarse grid. $H = 2h$ is a usual choice. Before the two-grid method is described, some basic notation is introduced.
The Relaxation. In keeping with their special task within multigrid techniques, the classical iteration methods with good error smoothing, e.g., Gauss-Seidel, are generally referred to as smoothing procedures or as the relaxation. The choice of the relaxation depends on the problem. In general, a fixed number of relaxation steps is performed.

The Defect Equation. An essential point in multigrid techniques is the determination of a correction on the coarser grid. The following properties of the error are basic in this context:

1. The error of the iterated approximate solution of the linear system $A_h x_h = b_h$ with the exact solution $x_h$ is itself the solution of another linear system with the same system matrix $A_h$: Let $\tilde x_h$ be the approximate solution of $A_h x_h = b_h$ obtained by the relaxation method. The error is given by $v_h = \tilde x_h - x_h$. Then $A_h v_h = d_h$ holds with $d_h = A_h \tilde x_h - b_h$. Because of $x_h = \tilde x_h - v_h$, the solution $v_h$ yields the desired correction for the approximate solution $\tilde x_h$.
2. The error $v_h$ can also be approximated well on a coarser grid since it is a smooth grid function: Let $\bar x_h$ be the initial approximation for $x_h = A_h^{-1} b_h$. The relaxation yields the approximation $\tilde x_h$. The error $v_h = \tilde x_h - x_h$ is smoother than the error $\bar x_h - x_h$ and can therefore be represented and determined on a coarser grid without considerable distortion.

In this context, the following notation is used:

Definition 3.6.3. The equation $A_h v_h = d_h$ is called the defect equation and the quantity $d_h := A_h \tilde x_h - b_h$ is called the defect.

The Coarse Grid Correction. To determine the correction on a grid with step size $H$, all quantities of the defect equation have to be defined there. In general, the matrix $A_H$ is constructed by applying the discretization method to the coarse grid, i.e., analogously to $A_h$. There exist other possibilities such as the algebraic AMG method described in subsection 3.6.4. The defect $d_h$ and the correction $v_H$ are transferred from one grid to another by linear mappings:

Definition 3.6.4. A linear mapping $I_h^H : G_h \to G_H$ from a grid with step size $h$ to a coarser grid with step size $H$ is referred to as the restriction.

The restriction $d_H = I_h^H d_h$ assigns to each coarse grid point a certain weighted average of neighbouring fine grid values.

Definition 3.6.5. A linear mapping $I_H^h : G_H \to G_h$ from a grid with step size $H$ to a finer grid with step size $h$ is referred to as the interpolation.
For every fine grid point, the interpolation $I_H^h$ uses neighbouring coarse grid points. Usually - the error analysis will tell - a linear interpolation is sufficient. Interpolation and restriction are dual concepts, so there is some incentive to take the generalized inverse of $I_H^h$ for $I_h^H$. Usually, a somewhat simpler restriction serves as well. If a good interpolation is given, which may not be easy, there should be no problem in getting a matching restriction. Next, the defect equation on the coarse grid $G_H$ can be set up and the correction can be determined.
Algorithm 3.6.2.1 (Coarse Grid Correction) One step of the coarse grid correction is composed of the following single steps:

1. Determine the defect $d_h = A_h \tilde x_h - b_h$ of the current approximation $\tilde x_h$ and transfer $d_h$ onto the coarser grid $\to d_H$
2. Solve $A_H v_H = d_H$
3. Transfer $v_H$ onto the fine grid $\to v_h$
4. Build $x_h^{\mathrm{new}} = \tilde x_h - v_h$

On the coarsest grid used, the defect equation $A_H v_H = d_H$ is usually solved directly. Thus, all elements of a multigrid algorithm have already been introduced. A multigrid algorithm usually consists of a combination of two kinds of iteration methods:

I. A classical iteration method with good error smoothing, e.g., the Gauss-Seidel algorithm.
II. An iteration method that reduces low frequency error components: One iteration step consists of applying a correction that has been computed on a coarser grid.
Even though the coarse grid correction, i.e., method II, is not convergent by itself, the combination of the smoothing iteration I with the coarse grid correction turns out to be a very fast converging method. These facts are discussed in detail in [112]. Before describing the general multigrid scheme, the two-grid method is formally given first:

Algorithm 3.6.2.2 (Two-Grid Method) One step of a two-grid method consists of:

    h-grid          relaxation ($\to$ error smoothing)
                    defect computation $d_h = A_h \tilde x_h - b_h$
    restriction     $d_H = I_h^H d_h$
    H-grid          solution of $A_H v_H = d_H$
    interpolation   $v_h = I_H^h v_H$
    h-grid          correction $\tilde x_h - v_h$
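The steps of the two-grid method can be sketched for a simple model problem. The code below is an illustration only and rests on assumptions not made in the text: it uses the one-dimensional Poisson problem $-u'' = f$ on $(0,1)$ with homogeneous Dirichlet boundaries (instead of a two-dimensional grid), Gauss-Seidel as relaxation, full weighting as restriction, linear interpolation, and $H = 2h$. All function names are hypothetical.

```python
def defect(x, b, h):
    """Defect d_h = A_h x_h - b_h for A_h = tridiag(-1, 2, -1) / h^2."""
    xe = [0.0] + list(x) + [0.0]               # pad with Dirichlet boundary zeros
    return [(2.0 * xe[i + 1] - xe[i] - xe[i + 2]) / h**2 - b[i]
            for i in range(len(x))]


def gauss_seidel(x, b, h, sweeps):
    """Relaxation: lexicographic Gauss-Seidel sweeps."""
    xe = [0.0] + list(x) + [0.0]
    for _ in range(sweeps):
        for i in range(1, len(xe) - 1):
            xe[i] = 0.5 * (h**2 * b[i - 1] + xe[i - 1] + xe[i + 1])
    return xe[1:-1]


def restrict(d):
    """Full weighting; coarse point j coincides with fine index 2j + 1."""
    return [0.25 * d[2 * j] + 0.5 * d[2 * j + 1] + 0.25 * d[2 * j + 2]
            for j in range((len(d) - 1) // 2)]


def interpolate(vc):
    """Linear interpolation from n_c coarse values to 2 n_c + 1 fine values."""
    ve = [0.0] + list(vc) + [0.0]
    v = []
    for j in range(len(vc) + 1):
        v.append(0.5 * (ve[j] + ve[j + 1]))    # fine point between two coarse points
        if j < len(vc):
            v.append(vc[j])                    # fine point on a coarse point
    return v


def solve_coarse(d, H):
    """Direct solution of A_H v_H = d_H by tridiagonal elimination."""
    n = len(d)
    diag = [2.0] * n
    rhs = [H**2 * di for di in d]
    for i in range(1, n):                      # forward elimination
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        rhs[i] += w * rhs[i - 1]
    v = [0.0] * n
    v[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        v[i] = (rhs[i] + v[i + 1]) / diag[i]
    return v


def two_grid_step(x, b, h, nu=2):
    x = gauss_seidel(x, b, h, nu)              # relaxation
    d_H = restrict(defect(x, b, h))            # defect computation and restriction
    v_h = interpolate(solve_coarse(d_H, 2.0 * h))
    return [xi - vi for xi, vi in zip(x, v_h)] # correction


n, h = 31, 1.0 / 32.0
b = [1.0] * n                                  # f = 1, exact solution t (1 - t) / 2
x = [0.0] * n
for _ in range(15):
    x = two_grid_step(x, b, h)
err = max(abs(xi - 0.5 * (i + 1) * h * (1.0 - (i + 1) * h)) for i, xi in enumerate(x))
print(err)
```

For this right-hand side the discrete solution coincides with $u(t) = t(1-t)/2$ at the grid points, so `err` measures the remaining iteration error directly.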
In short, the two-grid method may also be expressed by

$$x^{(j+1)} = R^\nu x^{(j)} - I_H^h A_H^{-1} I_h^H \left( A_h R^\nu x^{(j)} - b_h \right),$$

where $R^\nu$ stands for $\nu$ relaxation steps, i.e., $R^\nu x^{(j)}$ corresponds to the relaxed approximation $\tilde x_h$ in the above description of the two-grid method. Usually, a fixed number of relaxation steps is carried out. Adaptive strategies perform smoothing steps until the smoothing rate, which is easily monitored, falls far below the smoothing factor.

3.6.3 The Multigrid Technique
While the two-grid method already shows the principle of the multigrid scheme, it nevertheless differs from typical multigrid algorithms, for which the following holds in general:

- The grids are nested more deeply than in the two-grid method; in general, at least three grids are used.
- The grids are swept through in special cycles, the most important types of which are the V- and W-cycles.
- The coarsest grid has only very few grid points, so that a direct solution method can be applied there.

Even for fine grids ($h \to 0$), the use of several grids only leads to an effort proportional to the number of unknowns on the finest grid, because, on each of the finer grids, only a few iterations are necessary and, on the coarsest grid, only a problem with very few unknowns has to be solved directly or iteratively. In a multigrid algorithm with the grids $G_1, \ldots, G_l$, the correction on the grid $G_{l-1}$, $l > 1$, is only approximately determined by another coarse grid correction. Thus, these algorithms are built by recursive use of coarse grid corrections, where the number of recursions is determined by the number of grids.

Algorithm 3.6.3.1 (Multigrid Scheme) One step of a multigrid scheme on the grid $G_l$, $l > 1$, consists of the following recursion:

    grid $G_l$         $\nu$ relaxations
                       defect computation $d_l = A_l x_l - b_l$
    restriction        $d_{l-1} = I_l^{l-1} d_l$
    grid $G_{l-1}$     $\gamma$ multigrid steps on grid $G_{l-1}$, i.e., for $A_{l-1} v_{l-1} = d_{l-1}$
    interpolation      $v_l = I_{l-1}^l v_{l-1}$
    grid $G_l$         correction $x_l - v_l$

The problem $A_1 v_1 = d_1$ is solved either by a direct method or by another iterative method.

Each choice of the number $\gamma$ of steps on the levels $j = 1, \ldots, l$ leads to a different kind of cycle. A schematic representation of these cycles motivates the following notation:
Definition 3.6.6. A multigrid iteration with $\gamma = 1$ is called a V-cycle; one with $\gamma = 2$ is called a W-cycle.

Fully adaptive cycling strategies have been devised, which are superior in complicated situations.
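The recursion of the multigrid scheme and the role of the cycle index $\gamma$ can be illustrated with a short recursive function. As before, this is only a sketch under assumptions that go beyond the text: a one-dimensional Poisson model problem with homogeneous Dirichlet boundaries, Gauss-Seidel relaxation, full weighting, linear interpolation, grids with $2^m - 1$ interior points, and a coarsest grid consisting of a single unknown. The names are hypothetical.

```python
def defect(x, b, h):
    """Defect d = A x - b for A = tridiag(-1, 2, -1) / h^2."""
    xe = [0.0] + list(x) + [0.0]
    return [(2.0 * xe[i + 1] - xe[i] - xe[i + 2]) / h**2 - b[i]
            for i in range(len(x))]


def gauss_seidel(x, b, h, sweeps):
    xe = [0.0] + list(x) + [0.0]
    for _ in range(sweeps):
        for i in range(1, len(xe) - 1):
            xe[i] = 0.5 * (h**2 * b[i - 1] + xe[i - 1] + xe[i + 1])
    return xe[1:-1]


def restrict(d):
    """Full weighting onto the next coarser grid."""
    return [0.25 * d[2 * j] + 0.5 * d[2 * j + 1] + 0.25 * d[2 * j + 2]
            for j in range((len(d) - 1) // 2)]


def interpolate(vc):
    """Linear interpolation onto the next finer grid."""
    ve = [0.0] + list(vc) + [0.0]
    v = []
    for j in range(len(vc) + 1):
        v.append(0.5 * (ve[j] + ve[j + 1]))
        if j < len(vc):
            v.append(vc[j])
    return v


def mg_cycle(x, b, h, gamma, nu=2):
    """One multigrid step; gamma = 1 gives a V-cycle, gamma = 2 a W-cycle."""
    if len(x) == 1:                            # coarsest grid: solve directly
        return [0.5 * h**2 * b[0]]
    x = gauss_seidel(x, b, h, nu)              # nu relaxations
    d_coarse = restrict(defect(x, b, h))       # defect computation, restriction
    v_coarse = [0.0] * len(d_coarse)
    for _ in range(gamma):                     # gamma multigrid steps on the coarser grid
        v_coarse = mg_cycle(v_coarse, d_coarse, 2.0 * h, gamma, nu)
    v = interpolate(v_coarse)                  # interpolation
    return [xi - vi for xi, vi in zip(x, v)]   # correction


def run(gamma, cycles=20, n=31):
    h = 1.0 / (n + 1)
    b = [1.0] * n                              # f = 1, exact solution t (1 - t) / 2
    x = [0.0] * n
    for _ in range(cycles):
        x = mg_cycle(x, b, h, gamma)
    return max(abs(xi - 0.5 * (i + 1) * h * (1.0 - (i + 1) * h))
               for i, xi in enumerate(x))


err_v = run(gamma=1)
err_w = run(gamma=2)
print(err_v, err_w)
```

The W-cycle visits the coarse grids more often than the V-cycle and is therefore more expensive per step. Which of the two pays off depends on the problem.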
3.6.4 Embedding of the Multigrid Method into a Problem Solving Environment

The solution of an elliptic partial differential equation (PDE) is usually only part of a larger problem, which may contain many further parts. For the embedding of the solver there are two primary decisions:

- Can the multigrid method be interwoven with the problem definition, creating a series of approximations of a continuous problem, or is it supposed to be a black box linear equation solver? Multigrid does not give a perfect black box linear equation solver. The most general attempt to build one was AMG (see below); some 'grey box' attempts [245] have been developed, but all of these fall far short of the speed of a genuine multigrid code. The definition of the coarse grids as well as the interpolation have to be determined by the properties of the differential operator.
- Is there a series of similar problems to be solved (parameter studies ...) or a single one? Like all iterative solvers, the multigrid cycle given above needs a starting approximation. Unless this can be supplied by a previous problem or a parameter study, the natural way to get it is by solving the same problem with lower resolution. This leads directly to the FMG algorithm, where every grid has to provide only a slight increase in accuracy (maybe a factor of 4). Random starts are a waste of effort.

Thus, the proper way to implement a geometric multigrid scheme proceeds as follows: the initial approximation of the solution $x$ of the linear system $Ax = b$ is generated using all coarser grids.
Algorithm 3.6.4.1 (FMG Approach) The FMG approach (Full MultiGrid) on the grid sequence $G_1, \ldots, G_l$ is determined by:

    grid $G_1$                        solution of $A_1 x_1 = b_1$
    grid $G_k$, $k = 2, \ldots, l$    interpolation $\hat x_k = \hat I_{k-1}^k x_{k-1}$
                                      $i$ multigrid steps on grid $G_k$

$\hat x_k$ denotes the initial approximation on the grid $G_k$; $x_k$, $k = 2, \ldots, l$, denotes the approximation iterated with the multigrid scheme; $x_1$ denotes the approximate solution on $G_1$. The interpolation $\hat I_{k-1}^k$ may be different from the interpolation $I_{k-1}^k$ of the multigrid scheme.
The choice of the number $i$ of iteration steps on each grid depends on the problem. Often, an interpolation $\hat I$ of higher order than $I$ is chosen. The reason is that $\hat I$ interpolates the approximate solution of the linear system, which is not necessarily smooth, while $I$ interpolates the smooth correction (= error of approximation). The term nested multigrid iteration is sometimes used in the literature for the FMG approach.

Figure 3.4. Formal course of the FMG approach in case of three grids with two V-cycles each. (Grid 1 is the coarsest.)
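The FMG approach can be sketched on top of a recursive multigrid step. The following code is an illustration under the same assumptions as a simple model problem allows: one-dimensional Poisson equation with homogeneous Dirichlet boundaries, Gauss-Seidel relaxation, full weighting, and a coarsest grid $G_1$ with a single unknown. In addition, it uses plain linear interpolation for $\hat I$ as well, although the text notes that a higher-order interpolation is often preferred. All names are hypothetical.

```python
def defect(x, b, h):
    xe = [0.0] + list(x) + [0.0]
    return [(2.0 * xe[i + 1] - xe[i] - xe[i + 2]) / h**2 - b[i]
            for i in range(len(x))]


def gauss_seidel(x, b, h, sweeps):
    xe = [0.0] + list(x) + [0.0]
    for _ in range(sweeps):
        for i in range(1, len(xe) - 1):
            xe[i] = 0.5 * (h**2 * b[i - 1] + xe[i - 1] + xe[i + 1])
    return xe[1:-1]


def restrict(d):
    return [0.25 * d[2 * j] + 0.5 * d[2 * j + 1] + 0.25 * d[2 * j + 2]
            for j in range((len(d) - 1) // 2)]


def interpolate(vc):
    ve = [0.0] + list(vc) + [0.0]
    v = []
    for j in range(len(vc) + 1):
        v.append(0.5 * (ve[j] + ve[j + 1]))
        if j < len(vc):
            v.append(vc[j])
    return v


def mg_cycle(x, b, h, gamma=1, nu=2):
    if len(x) == 1:
        return [0.5 * h**2 * b[0]]
    x = gauss_seidel(x, b, h, nu)
    d_coarse = restrict(defect(x, b, h))
    v_coarse = [0.0] * len(d_coarse)
    for _ in range(gamma):
        v_coarse = mg_cycle(v_coarse, d_coarse, 2.0 * h, gamma, nu)
    return [xi - vi for xi, vi in zip(x, interpolate(v_coarse))]


def fmg(f, levels, cycles=2):
    """Full multigrid: solve on G_1, then interpolate and cycle on G_2, ..., G_l."""
    n, h = 1, 0.5
    x = [0.5 * h**2 * f(h)]                    # direct solution on the coarsest grid
    for _level in range(2, levels + 1):
        n, h = 2 * n + 1, 0.5 * h
        b = [f((i + 1) * h) for i in range(n)]
        x = interpolate(x)                     # initial approximation on G_k
        for _ in range(cycles):                # i multigrid steps on G_k
            x = mg_cycle(x, b, h)
    return x, h


x, h = fmg(lambda t: 1.0, levels=6)            # f = 1, exact solution t (1 - t) / 2
err = max(abs(xi - 0.5 * (i + 1) * h * (1.0 - (i + 1) * h)) for i, xi in enumerate(x))
print(len(x), err)
```

No initial guess has to be supplied on the finest grid. Each level starts from the interpolated result of the next coarser one, which mirrors the course sketched in Figure 3.4.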
A further variant of the multigrid idea is the algebraic multigrid. These techniques use the multigrid idea without any geometrical background by assigning lower-dimensional systems to a given linear system where these lower-dimensional systems reproduce certain strong couplings to the initial system. Besides that, this approach is completely analogous to the multigrid techniques.
Algorithm 3.6.4.2 (AMG) In the AMG (Algebraic MultiGrid), problems of lower dimension are set up for the system $Ax = b$ on several levels $l - 1, \ldots, 1$:

$$A_{k-1} := I_k^{k-1} A_k I_{k-1}^k, \qquad b_{k-1} = I_k^{k-1} b_k, \qquad k = l, l-1, \ldots, 2.$$

The mappings $I_k^{k-1} : \mathbb{R}^{n_k} \to \mathbb{R}^{n_{k-1}}$ and $I_{k-1}^k : \mathbb{R}^{n_{k-1}} \to \mathbb{R}^{n_k}$ are referred to as the restriction and the interpolation. The restriction is defined by

$$I_k^{k-1} := (I_{k-1}^k)^T.$$

For the total of $l$ levels, the method consists of the following recursion:

    level $l$          $\nu$ relaxations
                       defect computation $d_l = A_l x_l - b_l$
    restriction        $d_{l-1} = I_l^{l-1} d_l$
    level $l - 1$      $\gamma$ steps of AMG on level $l - 1$, i.e., for $A_{l-1} v_{l-1} = d_{l-1}$
    interpolation      $v_l = I_{l-1}^l v_{l-1}$
    level $l$          correction $x_l - v_l$

The problem $A_1 v_1 = d_1$ is solved by a direct or another iterative method.
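The construction $A_{k-1} := I_k^{k-1} A_k I_{k-1}^k$ with $I_k^{k-1} := (I_{k-1}^k)^T$ can be tried out on a small example. The sketch below assumes a one-dimensional Poisson matrix and linear interpolation, which is a geometric choice made here only for illustration and not the algebraic selection of coarse levels that AMG itself performs. The function names are hypothetical.

```python
def poisson_matrix(n, h):
    """Dense tridiag(-1, 2, -1) / h^2 with n unknowns."""
    return [[(2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0) / h**2
             for j in range(n)] for i in range(n)]


def interpolation_matrix(nc):
    """Linear interpolation from nc coarse to 2 nc + 1 fine unknowns."""
    P = [[0.0] * nc for _ in range(2 * nc + 1)]
    for j in range(nc):
        P[2 * j][j] = 0.5
        P[2 * j + 1][j] = 1.0
        P[2 * j + 2][j] = 0.5
    return P


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


nc, h = 3, 1.0 / 8.0
A_fine = poisson_matrix(2 * nc + 1, h)
P = interpolation_matrix(nc)                   # interpolation I_{k-1}^k
R = transpose(P)                               # restriction I_k^{k-1} = (I_{k-1}^k)^T
A_coarse = matmul(matmul(R, A_fine), P)        # Galerkin coarse matrix
A_redisc = poisson_matrix(nc, 2.0 * h)         # discretization on the coarse grid

for row_galerkin, row_redisc in zip(A_coarse, A_redisc):
    print(row_galerkin, row_redisc)
```

In this example the Galerkin matrix is again tridiagonal and equals twice the matrix obtained by discretizing on the coarse grid. The factor 2 stems from using the unscaled transpose as restriction, and it is harmless because the right-hand side is restricted with the same mapping.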
The analogy with multigrid techniques is obvious. The setup of the coarse grid matrices by means of interpolation and restriction according to $A_{k-1} := I_k^{k-1} A_k I_{k-1}^k$ is also called the Galerkin approach [112]. Algebraic multigrid has mainly been developed in order to apply the multigrid idea also to so-called black box solvers. Such techniques are applicable to a whole class of algebraic problems such as, e.g., linear systems with symmetric positive definite matrix. The AMG algorithm frequently compares favourably with other black box linear equation solvers, but for some problems it has failed badly. The crucial part is the proper definition of the matrices $I_k^{k-1}$.
Convergence Properties. As was noted at the beginning, the multigrid technique stands out due to the fact that its convergence is independent of the step size of the discretization. While the classical iteration methods $x^{(j+1)} = M x^{(j)} + c$ typically show a contraction behaviour according to

$$\|x_h^{(j+1)} - x_h\| \le \|M_h\| \, \|x_h^{(j)} - x_h\|, \qquad j = 1, 2, \ldots$$

with $\lim_{h \to +0} \|M_h\| = 1$, it is possible to achieve the following estimate with some constant $\xi \in (0,1)$ independent of $h$

$$\|x_h^{(j+1)} - x_h\| \le \xi \, \|x_h^{(j)} - x_h\|, \qquad j = 1, 2, \ldots$$
only if the algorithmic components of the multigrid method are suitably chosen and combined. Regarding a two-grid method on $G_k$, $G_{k-1}$ for the solution of

$$A_k w_k = q_k, \qquad q_k \in G_k,$$

the algorithm can be written as

$$w_k^{(j+1)} = R_k^{\nu_2} \left( I - I_{k-1}^k A_{k-1}^{-1} I_k^{k-1} A_k \right) R_k^{\nu_1} w_k^{(j)} + C_{k,k-1} q_k \qquad (3.18)$$

$$=: S_{k,k-1} w_k^{(j)} + C_{k,k-1} q_k. \qquad (3.19)$$
More generally, the $(l+1)$-grid operator can be obtained recursively as

$$S_{k,k-l} := R_k^{\nu_2} \left( I - I_{k-1}^k \left( I - S_{k-1,k-l}^{\sigma} \right) A_{k-1}^{-1} I_k^{k-1} A_k \right) R_k^{\nu_1}. \qquad (3.20)$$
Then the following theorem [107] can be shown:

Theorem 3.6.1. Let the two-grid operators $S_{k,k-1}$ defined by (3.19) be bounded by

$$\|S_{k,k-1}\| \le c_1, \qquad k = 2, 3, \ldots, l,$$

with some constant $c_1 \in (0,1)$. Further, assume that some $c_2 > 0$ exists with

$$\|R_k^{\nu_2} I_{k-1}^k\| \, \|A_{k-1}^{-1} I_k^{k-1} A_k R_k^{\nu_1}\| \le c_2, \qquad k = 2, 3, \ldots, l.$$

Then there exists a positive integer $\sigma$ such that the multigrid operators described by (3.20) can be estimated by

$$\|S_{k,1}\| \le c, \qquad k = 2, 3, \ldots, l,$$

with some constant $c \in (0,1)$ independent of $l$.
Some Remarks on the Development of Multigrid Algorithms. It became obvious in this subsection that the following components have to be chosen appropriately during the development of a multigrid algorithm for a special problem:

- the special variant of the method, e.g., the FMG method,
- the relaxation method,
- the choice of grids and cycles,
- the restriction and interpolation,
- the solution method on the coarsest grid.
For simple problems, e.g., for Poisson's equation discretized on a rectangular domain with a structured grid, these components can be chosen with the help of the so-called model problem analysis or the (local) Fourier analysis (compare [112], [42], also see subsection 3.6.1). This technique, often referred to as local mode analysis, is feasible only on structured grids. For more general problems, especially for problems on irregular domains, theoretical studies can hardly be carried out. Complex applications require a careful design. The sequence of grids, the smoother and the interpolation have to be adapted to the properties of the differential operator. Here, techniques like semi-coarsening or transforming smoothers may be necessary. Special care has to be taken if spectral properties of the problem, e.g., the number of negative eigenvalues, depend on the resolution. After the principal design has been decided upon, the tuning will usually require extensive experiments. The abstract convergence theory is of little help here, since it either treats only simple situations or is overly pessimistic. A frequent problem with real world applications is that the resolution of the coarse grid problems is far too low, so the asymptotic estimates are of little value.
3.7 A Special Multigrid Algorithm for the Solution of a Non-Hermitian Indefinite System

Originally, the multigrid technique was developed as a principle for the construction of iterative solvers for discrete elliptic problems. Thus, optimal multigrid algorithms for Poisson's equation or the Navier-Stokes equations as well as for some other applications can easily be found in the relevant literature. Below we present an example of a problem illustrating the fact that, for some special practical applications, it can be very hard to find an optimal combination of multigrid components. The algorithm was designed in 1987 to solve some problems in the construction of an accelerator. It was subject to restrictions to be discussed in the text. They resulted in a less than optimal solution. Some of the issues involved - notably the treatment of indefinite problems - are better understood by now and would find a more efficient treatment. Nevertheless, the example is rather typical of the accumulation of difficulties that may arise in real applications and thus result in a convergence speed far from optimal.

A multigrid algorithm [277] that was developed primarily for a two-dimensional FIT grid in order to solve a linear system with complex symmetric indefinite matrix is introduced. This algorithm solves the high frequency linear system (2.16) from subsection 2.4, which can be indefinite (for resonances and quasi-resonances). The complex problem (difficult domains with arbitrary boundary curves) did not allow us to perform a theoretical convergence study. Therefore, the algorithm is to a large extent the result of experimental studies, which are summarized in the following. The description of the multigrid components which were finally used also reflects the imposed restrictions. In subsection 3.10, the convergence behaviour, which is mainly determined by the indefiniteness, is studied in more detail. The indefiniteness leads to problems in the course of the smoothing procedure and the approximation of the continuous problem [260]. In particular, numerical studies in [277] revealed that a grid-dependent correction of the discrete operator was necessary. Besides the indefiniteness, there are the high frequency and the near-singularity, which make the setup of an optimal multigrid algorithm more difficult.

3.7.1 Peculiarities of the Special Problem and Corresponding Measures

Discretization and Coarse Grid Generation. The calculations were embedded into a system of connected problems where easy exchange of data was essential, so the geometry description and the finest grid were given from outside by the URMEL/MAFIA code (Fig. 3.5). At that time (1987), a compatible grid of higher resolution would have required too much computing time for industrial use.
Of course, this given restriction collides with a guiding principle of multi grid construction: "Start out with a coarse grid and refine it." - yet this situation is quite common. For coarsening, first an existing AMG method was tested and found unsatisfactory, see below. To save development time, coarse grids were constructed by taking every other grid line in the primary grid and constructing the discrete equations by the similar routines as for the fine grid with some variations at boundaries. An irregular coarsening adapted to the geometry would have given much better results. For the structure given in (Fig. 3.5), the first coarsening would optimally contain horizontal lines 11 and 18, too, for a proper representation of the geometry of the system, and further coarsenings would be more irregular. Implementing such a coarsening based on the geometry description of the URMEL/MAFIA codes is quite a task. Even with a geometry adapted coarsening, it will in general not be possible to represent the geometry exactly on the coarse meshes, so that modifications of the coarse
3.7 Special MG-Algorithm for Non-Hermitian Indefinite System
grid equations as described below (Extension of FIT) will be needed. This way, all grids treat (nearly) the same physical problem, at least for $k = 0$.
Figure 3.5. FIT grids $G_l$ and $G_{l-3}$ for a cavity that was used in the PETRA storage ring at DESY in Hamburg (Deutsches Elektronen Synchrotron) for the acceleration of elementary particles. It is a cylindrically symmetric structure, so that, for reasons of symmetry, it is sufficient to discretize the upper half of the cross-section.
To facilitate explanations, only regular two-dimensional grids were treated in subsection 3.6. But a good discretization requires irregular grids. In [277], a Cartesian FIT grid is used that allows irregular step sizes. As was already noted, a uniform coarsening was chosen in which every other grid line of the grid $G_l$ forms a grid line of $G_{l-1}$ in $z$- as well as in $r$-direction:
Definition 3.7.1. Let $P_{i,l} = (j-1)\cdot J + k$ with $j = 1, 2, \dots, J$, $k = 1, 2, \dots, K$, $J \cdot K = N$ be the points of the grid $G_l$. Then, in the uniform coarsening of a FIT grid, the grid $G_{l-1}$ is determined by the points $P_{i,l-1} := (j-1)\cdot J + k$ with $j = 1, 3, \dots, J-2, J$, $k = 1, 3, \dots, K-2, K$ for odd $J$, $K$, $j = 1, 3, \dots, J-1, J$ for even $J$ and $k = 1, 3, \dots, K-1, K$ for even $K$.
For this coarsening, the following relation holds on all grid levels:
$$G_{l-1} \subset G_l.$$
For a regular Cartesian grid, this corresponds to the uniform coarsening with factor 2 (compare [112]). Besides the actual FIT grid G, the finite integration
3. Numerical Treatment of Linear Systems -
Figure 3.6. FIT grids $G_l$ and $G_{l-1}$ as well as the dual grids $\tilde G_l$ and $\tilde G_{l-1}$.
technique also needs a corresponding dual grid $\tilde G$. On the finest grid, the dual grid is fixed by the midpoints of the grid lines of $G$. On the coarser grids, the dual grid is chosen such that it can be composed of grid lines from the grid one level finer:
Definition 3.7.2. Let $P_{i,l} = (j-1)\cdot J + k$ with $j = 1, 2, \dots, J$, $k = 1, 2, \dots, K$, $J \cdot K = N$ be the points of the grid $G_l$. The dual grid $\tilde G_{l-1}$ for uniform coarsening is determined by the points $\tilde P_{i,l-1} := (j-1)\cdot J + k$ with $j = 2, 4, \dots, J-1$, $k = 2, 4, \dots, K-1$ for odd $J$, $K$, or $j = 2, 4, \dots, J-2$ under inclusion of the last grid line in $r$-direction of $G_l$ for even $J$ and $k = 2, 4, \dots, K-2$ under inclusion of the last grid line in $z$-direction of $G_l$ for even $K$.
The following restriction on the grid was imposed in the implementation in order to reduce the programming effort considerably: The fine grid is chosen so that $J$ and $K$ are odd on all grids $G_l$. Since this is not a restriction in principle, this assumption is made in the sequel. It is the only restriction made with respect to the general applicability to problem (2.16). At the same time, this choice guarantees that the step size has about the same order of magnitude over the whole grid on all levels. Consequently, large differences in the matrix elements, which could lead to instabilities, are avoided. The dual grids $\tilde G_l$ form a sequence of so-called staggered grids with
$$\tilde G_{l-1} \not\subset \tilde G_l$$
(cf. [112]). Figure 3.5 shows an example of the coarsening of a Cartesian FIT grid. Figure 3.6 shows the corresponding dual grid.
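The index selection in Definitions 3.7.1 and 3.7.2 can be sketched in a few lines. The following is only an illustration for one coordinate direction; the function names `coarse_lines` and `dual_coarse_lines` are hypothetical and do not come from URMEL-I, and the dual case is restricted to an odd number of grid lines, as assumed in the text.

```python
def coarse_lines(n):
    """1-based indices of the fine grid lines kept on the next coarser grid."""
    if n < 3:
        raise ValueError("need at least three grid lines to coarsen")
    kept = list(range(1, n + 1, 2))      # 1, 3, 5, ...
    if kept[-1] != n:                    # even n: the last line is kept as well
        kept.append(n)
    return kept


def dual_coarse_lines(n):
    """Fine grid line indices that form the coarse dual grid (odd n only)."""
    if n % 2 == 0:
        raise ValueError("this sketch covers only the odd case assumed in the text")
    return list(range(2, n, 2))          # 2, 4, ..., n - 1


J = 9
primal = coarse_lines(J)
dual = dual_coarse_lines(J)
print("coarse primal lines:", primal)
print("coarse dual lines:  ", dual)
```

For odd `J`, the coarse dual lines fall on exactly those fine grid lines that the coarse primal grid skips, which is the staggering described above.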
At this point, it becomes evident that the coarser grids often no longer yield a good approximation of the studied geometry, i.e., the boundaries of the subdomains with different materials no longer (approximately) coincide with grid lines. An extension of FIT was developed to solve this problem.
An Extension of the Finite Integration Technique (FIT). Suppose we are given a FIT grid $G$ that approximates the boundaries of some structure geometry by a polygon consisting of elementary lines. Next, the elementary areas bounded in this way are filled with material. These fillings can also be done diagonally (compare Fig. 3.7).
Figure 3.7. Filling types of the FIT grid in the program URMEL-I [277], [287], which uses the special multigrid algorithm.
On the coarser grids generated for the multigrid algorithm, the material boundaries no longer necessarily coincide with the elementary lines. Therefore, elementary areas exist on the coarser grids which are partially filled with different materials. At the same time, it is extraordinarily important not to change the physical problem, and hence the location of the material boundaries, on the coarser grids, because the solutions are very sensitive to this and a change would lead to serious convergence problems of the multigrid algorithm. To allow the use of the compatible coarsening nevertheless, an extension of the discretization method has been developed in [277]. An obvious ansatz for the treatment of these partially filled areas is the addition of further filling types to those shown in Fig. 3.7. But even if many similar fillings are grouped together, this would require an enormous number of types even for relatively few grid levels. Additionally, the automatic assignment of the filling types to the coarse grid cells requires a great programming effort. The following more practical ansatz has been carried out with the preliminary restriction to vacuum or perfect conductors as materials [277]. The state variables of FIT are defined by integrals along elementary lines and over elementary areas. Now these integrals are modified according to the partial filling:
Definition 3.7.3. The state variables of a partially filled elementary area are determined only over the subarea filled with vacuum or over that part of the elementary lines which borders vacuum.
Figure 3.8 shows how different the partial fillings of an elementary area on a coarse grid may look. This ansatz leads to additional approximations, but their influence on the total error and on the convergence is probably very
small$^6$. Figure 3.8 displays the extension of FIT. The line integrals extend over the shortened distances $\Delta_1, \dots, \Delta_4$ and the integration area is reduced to $A_j := \Delta_3\Delta_2 + \Delta_1\Delta_4 - \Delta_1\Delta_2$. A complete description of the grid quantities used on the coarser grids and of the modified FIT equations can be found in [277]. In [266], the application of this extension has been studied for general time-dependent problems on two-dimensional Cartesian grids.
Figure 3.8. Left: Part of a coarse grid with partially filled elementary areas. Right: Relevant grid quantities for the difference equation for H
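The reduced area $A_j$ can be evaluated directly from the four shortened distances. The sketch below assumes, as an interpretation of the formula rather than something stated in the text, that the vacuum part is the union of a $\Delta_3 \times \Delta_2$ rectangle and a $\Delta_1 \times \Delta_4$ rectangle overlapping in a $\Delta_1 \times \Delta_2$ rectangle; the function name `reduced_area` is hypothetical.

```python
def reduced_area(d1, d2, d3, d4):
    """Integration area A_j = d3*d2 + d1*d4 - d1*d2 of a partially filled cell."""
    return d3 * d2 + d1 * d4 - d1 * d2


# Completely filled cell: the shortened distances equal the full edge lengths.
print(reduced_area(2.0, 3.0, 2.0, 3.0))
# Partially filled cell under the L-shape interpretation stated above.
print(reduced_area(1.0, 1.0, 2.0, 3.0))
```

With $\Delta_1 = \Delta_3$ and $\Delta_2 = \Delta_4$ the expression reduces to the area of the full rectangle, so the unmodified FIT equations are recovered for completely filled cells.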
The singularity of cylindrical coordinates. Cylindrical coordinates have a singularity at the axis. The FIT discretization of (2.16) does not use any values on the axis, but the size of the matrix elements changes proportionally to $r$. For iterative solvers, only differences in neighbouring coefficients cause problems, so the smoothers are not affected much. The interpolation can be improved by a proper transformation, which is described below in Remark 3.7.5.
$^6$ The method gives an approximation only of the first order, but this is also the case on the finer grids as soon as variable step size is used. Thus, it leads to no crucial disadvantage.
3.7.2 The Multigrid Algorithm; Properties of the Linear System and its Solution
As a multigrid scheme, the FMG approach (see subsection 3.6.3) was chosen. As the FMG interpolation, the same interpolation $I$ as in the multigrid algorithm is used. This is described in subsection 3.7.3.
Remark 3.7.2. For solutions that are themselves of high frequency, the multigrid literature particularly recommends choosing an FMG interpolation of higher order. The reason is that, in such cases, a linear FMG interpolation normally leads to an additional cycle compared with an algorithm using a cubic FMG interpolation. In these cases, the interpolation error clearly exceeds the coarse grid error. However, in the algorithm presented here, both errors have about the same order of magnitude, so an improved interpolation would probably not lead to a faster algorithm [244]. The more expensive interpolation makes sense only if a better coarse grid correction can be achieved. Moreover, because of the arbitrary boundary curves, the implementation of a cubic interpolation is very difficult and costly.
On each of the coarser grids, the matrices are set up via the finite integration technique (FIT). The generation of the coarser grids as well as the subsequent setup of the differential equations and the matrices is done fully automatically. The coarsening of the grids is described in subsection 3.7.1. Such a coarsening procedure, which is analogous to the discretization on the finest grid, is also called compatible coarsening and is usually recommended [42].
The Indefinite, Nearly Singular Problem for $k > 0$. Depending on $k$, the continuous and discrete problems may have a small to moderate number of negative eigenvalues (compare Fig. 3.49 in section 3.10). Indefinite problems need special attention in many components of the solver. Gauss-Seidel relaxation ceases to converge, but it is still a good fine grid smoother. Coarse grids may require a more robust smoother, like the Kaczmarz method. Under the following conditions, the coarse grid correction can compensate for the error of the smooth components introduced by the use of Gauss-Seidel relaxation (cf. subsection 3.7.4): Let the error $v_k$ on the fine grid $G_k$ have the smooth eigenfunction $e_k$ as its main part. Then the relation
$$d_k = M_k v_k = \lambda_k e_k$$
holds (compare (2.16)) for the defect $d_k$. The corresponding coarse grid equation is given by
$$M_{k-1} v_{k-1} = \lambda_k I_k^{k-1} e_k.$$
As $e_k$ is smooth, we assume it to be close to an eigenvector of $M_{k-1}$ with eigenvalue $\lambda_{k-1}$. Then
$$v_{k-1} = \frac{\lambda_k}{\lambda_{k-1}} I_k^{k-1} e_k$$
is an approximate solution of the coarse grid equation. Since $I_{k-1}^k I_k^{k-1} e_k = e_k$ may be assumed for the smooth eigenfunction $e_k$, the following holds for the new error on grid $G_k$:
$$v_k^{\mathrm{new}} = v_k - I_{k-1}^k v_{k-1} = \left(1 - \frac{\lambda_k}{\lambda_{k-1}}\right) e_k.$$
Thus, good convergence can be reached for
$$\left|1 - \frac{\lambda_k}{\lambda_{k-1}}\right| \ll 1. \qquad (3.21)$$
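Condition (3.21) is easy to evaluate for a single smooth error component. The following sketch uses made-up eigenvalue pairs purely for illustration (they are not taken from the computations in [277]); the function name `correction_factor` is hypothetical.

```python
def correction_factor(lam_fine, lam_coarse):
    """Factor by which a smooth error component is multiplied by the
    coarse grid correction, i.e. 1 - lambda_k / lambda_{k-1}."""
    return 1.0 - lam_fine / lam_coarse


# (fine grid eigenvalue, coarse grid eigenvalue); illustrative values only.
cases = {
    "eigenvalues nearly equal": (4.0, 4.2),
    "coarse eigenvalue much too small": (4.0, 1.0),
    "sign differs between the grids": (1.3, -6.0),
}
for label, (lam_fine, lam_coarse) in cases.items():
    factor = correction_factor(lam_fine, lam_coarse)
    verdict = "reduced" if abs(factor) < 1.0 else "amplified"
    print(f"{label}: factor = {factor:+.3f} ({verdict})")
```

Whenever the two eigenvalues differ in sign, the factor exceeds 1 in magnitude, so the corresponding component is amplified instead of reduced.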
There may be a small number of eigensolutions for which the coarse and fine grid eigenvalues differ in sign. The coarse grid correction for these has the wrong sign and multigrid does not converge. In our example, this could be avoided by correcting for the grid dependence of the numerical speed of light (see below). Otherwise, taking the coarse grid correction as a preconditioner of a Krylov subspace method for indefinite problems can take care of these eigensolutions at moderate cost [28].
Disadvantages of the AMG Algorithm. Initially, an algebraic multigrid algorithm was designed along the lines of an AMG algorithm for the solution of Maxwell's equations in three dimensions [245]. In the end, this algorithm was not used, since
1. the Galerkin approach to the coarse grid matrix of the AMG method leads, for many practical structures, to the situation that another physical problem is solved on the lower levels;
2. for the special AMG method mentioned above, only the piecewise constant interpolation and the so-called "trivial injection" can be used, for reasons which are explained in detail in [245].
Both would lead to convergence problems if the AMG algorithm were applied in an appropriate way to the linear system (2.16). The first point can easily be seen. As to the second point, there are some peculiarities of this system which are decisive for the convergence of such an AMG algorithm. These peculiarities are generally of basic importance for the setup of a suitable solution method for this system and are described next. Let $\tilde x_k$ be an approximate solution of the system (2.16),
$$M_k x_k = b_k.$$
Then the related defect equation for $v_k = x_k - \tilde x_k$ is given by
$$M_k v_k = d_k, \qquad d_k = b_k - M_k \tilde x_k.$$
Expand the defect $d_k$ into a series of eigenvectors $e_{kj}$ of $M_k$:
$$d_k = \sum_j \beta_{kj}\, e_{kj}.$$
Assume that the current grid $G_k$ is already one of the coarser grids. Then the low frequency eigensolutions are dominant in $d_k$, since the smoothing operations on the finer grids have already eliminated the high frequency eigensolutions, i.e., $|\beta_{kj}|$ is small for all $j$ with large $\lambda_{kj}$. Now, some $\tilde v_k$ is used as a correction on $\tilde x_k$. This correction is given by
$$\tilde v_k := I_{k-1}^k z_{k-1} \quad \text{with} \quad z_{k-1} = \text{approximate solution of } M_{k-1} z_{k-1} = I_k^{k-1} d_k.$$
In particular, let the piecewise constant interpolation be chosen as $I_{k-1}^k$ and the corresponding restriction as $I_k^{k-1}$. Then high frequencies couple in the restriction of $d_k$: For $d_{k-1}$, the restriction generates a staircase approximation whose steps are twice as large as those of $d_k$ on grid $G_k$. This effect corresponds to the addition of a considerable portion of high frequency components to the right hand side of the defect equation on the grid $G_{k-1}$.
Remark 3.7.3. With the Galerkin approach to the generation of the coarse grid matrices and the "trivial injection" as the restriction, the steps in the staircase approximation load each low frequency eigensolution with a large portion of high frequency components. Therefore, the coarse grid frequency results from averaging the basic frequency and the intermixed higher frequencies, i.e., the coarse grid eigenvalues are strongly shifted. For that reason, the factors with which an eigenfunction from $d_k$ enters $z_{k-1}$, and hence the correction term $\tilde v_k$, will be different from those in $v_k$. It may even happen that, for some of the eigenfunctions, the sign with which the eigenfunction contributes to the correction changes.
For homogeneous problems (wave number $k = 0$), the relative change of the eigenvalues is smaller [245], [244].
Grid-Dependent Eigenvalue Shift. For the compatible coarsening, the eigenvalues are shifted less than for the Galerkin approach. Now the eigenvalues of the coarse grid matrices do not depend on the restriction. In the residual restriction, higher frequencies again couple, but they are not reinforced to the same extent as for the Galerkin approach: On the one hand, they make up a smaller portion because of the interpolation of higher order (compare subsection 3.7.3); on the other hand, the low frequency part of the correction has the right factor, since the low frequency parts are not changed too much in the coarse grid matrix. Thus, only high frequency components that can be eliminated by a few smoothing steps should appear. However, during the development of the multigrid algorithm, the suspicion arose that the eigenvalue shift of $M = A - k^2 I + ik'D$ caused by the discretization cannot be neglected but should be determined and included in the multigrid algorithm (compare subsections 3.7.3 and 3.10). The following considerations give an estimate of the grid-dependent eigenvalue shift: In the discretization of a sine wave, some essential shifts happen. A sine wave is still an eigenfunction of the discrete $\Delta$-operator, but the eigenvalue is no longer given by the square of the wave number $k$. In order to make the discrete operators $(A - k_d^2 I)$ have about the same spectrum as the continuous operator $(\Delta - k^2 I)$ and, in particular, have the same sign for all the eigenvalues on the different grids, it is necessary to choose $k_d$ somewhat different from $k$. With the following ansatz, a hint for the calculation of $k_d$ can be obtained: $-k^2$ is the eigenvalue of the one-dimensional wave
$$f(x) = \sin kx; \qquad f''(x) = -k^2 \sin kx.$$
Discretization on a regular grid with step size $h$ leads to:
$$\begin{aligned}
f_h''(x) &= \bigl(\sin(k(x-h)) - 2\sin kx + \sin(k(x+h))\bigr)/h^2 \\
&= (\sin kx \cos kh - \cos kx \sin kh - 2\sin kx + \sin kx \cos kh + \cos kx \sin kh)/h^2 \\
&= (2 \sin kx \cos kh - 2 \sin kx)/h^2 \\
&= \sin kx \cdot \frac{2(\cos kh - 1)}{h^2},
\end{aligned}$$
where the factor $2(\cos kh - 1)/h^2$ is $\approx -k^2$ for small $h$, i.e. $k_d^2$ should be set to about
$$\frac{2(\cos kh - 1)}{h^2} \qquad (3.22)$$
in order to reach the right distribution of signs in the spectrum. The optimal value is still somewhat different, since the assumption of a linear wave is only an approximation. At the same time, it follows that $kh < \pi/2$ is a necessary condition for meaningful computations. The "shift" $k^2$ in the matrix $M$ now also has the factor $s = 2(\cos kh - 1)/(h^2 \cdot k^2)$. Correspondingly, the factor $s$ is applied to the right hand side $b$, which also contains $k^2$. In subsection 3.10, the results are shown for a sample computation with and without a shift factor. Only one grid was used for these computations, i.e., the linear systems were solved directly. The results show very clearly that the sought-for quasi-resonances appear on all grids at the same frequency if the shift factor is used, while otherwise they are shifted to lower frequencies for coarser discretizations.
Remark 3.7.4. The discretization of the $\Delta$-operator leads to a shift of eigenvalues. For the indefinite problem $\Delta - k^2 I$, this shift causes convergence problems. Using a shift factor for $k^2$ to achieve an equal eigenvalue distribution on all grids of the multigrid algorithm considerably improves convergence.
The use of the shift factor diminishes $\left|1 - \frac{\lambda_k}{\lambda_{k-1}}\right|$, which leads to convergence improvement (compare (3.21)).
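The effect of the shift factor can be checked on a one-dimensional model problem. The sketch below is an illustration under stated assumptions, not the computation from subsection 3.10: it uses the second-difference operator on the unit interval with Dirichlet ends, takes the operator with positive sign (so that the corrected squared wave number is $2(1 - \cos kh)/h^2$, the magnitude of expression (3.22)), and picks $k^2 = 85$ slightly below the third continuous eigenvalue $(3\pi)^2$. All function names are hypothetical.

```python
import math


def discrete_eigenvalue(j, h):
    """Eigenvalue of the positive second-difference operator for sin(j*pi*x)."""
    return 2.0 * (1.0 - math.cos(j * math.pi * h)) / h**2


def corrected_k2(k, h):
    """Grid-dependent squared wave number (magnitude of expression (3.22))."""
    return 2.0 * (1.0 - math.cos(k * h)) / h**2


def shifted_eigenvalue(j, k, h, use_shift):
    """Eigenvalue of A - k^2 I (or A - k_d^2 I) belonging to mode j."""
    k2 = corrected_k2(k, h) if use_shift else k**2
    return discrete_eigenvalue(j, h) - k2


k = math.sqrt(85.0)   # assumed wave number, slightly below 3*pi
j = 3                 # mode closest to resonance
for use_shift in (False, True):
    lam_fine = shifted_eigenvalue(j, k, 1.0 / 16.0, use_shift)
    lam_coarse = shifted_eigenvalue(j, k, 1.0 / 8.0, use_shift)
    factor = abs(1.0 - lam_fine / lam_coarse)
    print(f"shift={use_shift}: fine={lam_fine:+.3f}, "
          f"coarse={lam_coarse:+.3f}, |1 - ratio|={factor:.3f}")
```

In this model the shifted eigenvalue equals $(2/h^2)(\cos kh - \cos j\pi h)$, whose sign does not depend on $h$ as long as $kh$ and $j\pi h$ stay below $\pi$; without the shift, the sign of the near-resonant eigenvalue changes between the two grids.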
3.7.3 Grid Transfers for Vector Fields
The multigrid algorithm was implemented in order to compute electromagnetic monopole fields in cylindrically symmetric structures. In this case, the unknowns of the linear system to be solved by the multigrid algorithm are allocated at the points $\tilde P_{i,l}$ of the dual FIT grid $\tilde G_l$. As was noted in subsection 3.7.1, the dual grids $\tilde G_1, \tilde G_2, \dots, \tilde G_l$ form a sequence of staggered grids. The transfer functions between the grids have been chosen accordingly. Details can be found in [277]. For the grid transfer, the piecewise bilinear interpolation and restriction were chosen.
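As a generic illustration of a bilinear transfer on a grid with irregular step sizes, the following sketch computes the four weights for a point inside one coarse cell. It does not reproduce the staggered-grid formulas of [277]; the function names and the corner ordering are assumptions made for this example.

```python
def bilinear_weights(z, r, z0, z1, r0, r1):
    """Weights of the corner values at (z0,r0), (z1,r0), (z0,r1), (z1,r1)."""
    hz, hr = z1 - z0, r1 - r0
    if hz <= 0.0 or hr <= 0.0:
        raise ValueError("cell must have positive extent in both directions")
    tz = (z - z0) / hz          # relative position in z-direction
    tr = (r - r0) / hr          # relative position in r-direction
    return ((1 - tz) * (1 - tr), tz * (1 - tr), (1 - tz) * tr, tz * tr)


def interpolate(corner_values, weights):
    """Weighted sum of the four corner values."""
    return sum(w * v for w, v in zip(weights, corner_values))


# A cell with different step sizes in z and r; the test function is linear.
f = lambda z, r: 2.0 * z + 3.0 * r + 1.0
z0, z1, r0, r1 = 0.0, 0.5, 1.0, 1.2
w = bilinear_weights(0.125, 1.15, z0, z1, r0, r1)
corners = (f(z0, r0), f(z1, r0), f(z0, r1), f(z1, r1))
print(interpolate(corners, w), f(0.125, 1.15))
```

The weights depend only on the local step sizes and sum to one, so constant and linear functions are reproduced exactly.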
The coefficients in the interpolation and restriction formulas depend on the step sizes $h_z$ and $h_r$. Figure 3.9 shows, on top, the allocation of known values on the dual grid $\tilde G_{l-1}$ as well as the values which have to be interpolated on grid $\tilde G_l$. The allocation of known values on the dual grid $\tilde G_l$ and of the unknown values $H_{l-1}$ on $\tilde G_{l-1}$ is displayed on the bottom. Theoretically, instead of the transfer function of second order, a piecewise constant interpolation, which is of first order, could also be chosen. However, tests with such transfer functions of lower order showed that, as might be expected, the convergence of the multigrid algorithm then worsens more and more with increasing frequency $\omega$.
Treatment of the Boundaries in the Interpolation. One problem for the grid transfer is the correct treatment of the boundaries. At the boundaries of the studied geometry, one or several points lie inside the metal or outside of the grid. The treatment of such boundary points is an important issue affecting the convergence of the algorithm. In the interpolation at a border with metal (which is assumed to be perfectly conducting), it is possible to work with "mirror fields" by using the boundary condition that the tangential electric field $E_\parallel$ vanishes there. Figure 3.10 shows this method, which is known from potential theory [142]. This leads to the boundary condition $E_{z1} := 0$, $E_{z2} := 0$. In the monopole case ($H_r = 0$), the following difference equation for $E_z$ can be formulated:
$$ik\,(r_2^2 - r_1^2)\,E_{z1} = 2 r_2 H_3 - 2 r_1 H_1 \quad\Rightarrow\quad \sqrt{r_2}\,H_3 = \frac{\sqrt{r_1}}{\sqrt{r_2}}\,\sqrt{r_1}\,H_1.$$
Analogously, $H_4$ can be replaced by $H_2$ in the interpolation formula if a perfectly conducting boundary crosses a dual grid cell parallel to the $z$-axis, as is shown in Fig. 3.10.
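The mirror relation follows from setting $E_{z1} = 0$ in the equation above, which gives $r_2 H_3 = r_1 H_1$. A minimal sketch, with hypothetical function names and made-up radii, evaluates it both for $H$ itself and for the weighted quantity $\sqrt{r}\,H$ that appears in the second form of the relation:

```python
import math


def mirror_value(r1, r2, h1):
    """H3 at radius r2 such that 2*r2*H3 - 2*r1*H1 = 0, i.e. E_z1 vanishes."""
    return r1 * h1 / r2


def mirror_value_weighted(r1, r2, u1):
    """Same relation written for the weighted field u = sqrt(r) * H."""
    return math.sqrt(r1) / math.sqrt(r2) * u1


r1, r2, h1 = 0.02, 0.03, 1.5          # illustrative values only
h3 = mirror_value(r1, r2, h1)
u3 = mirror_value_weighted(r1, r2, math.sqrt(r1) * h1)
print(h3, u3, math.sqrt(r2) * h3)
```

Both forms describe the same mirror value; the weighted form is the one that matches the transformed unknowns.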
Figure 3.9. Top: Allocation of known values and values to be interpolated on the grids $G_{l-1}$ and $G_l$. Bottom: Allocation of known values and values to be determined by restriction on the grids $G_l$ and $G_{l-1}$.
Figure 3.10. The method of mirror fields used for the interpolation at domain boundaries.
Remark 3.7.5 (Balancing Transformation of the Fields). The equation has been written in the form above, since the linear system (2.16) is transformed before its solution: The fields solving the system (2.16) vary strongly as the radius $r$ increases. In order to keep the interpolation error as small as possible, this property can be compensated for either by a weighted interpolation or by some field transformation which leads to a smaller variation when $r$ changes. Here the second method was chosen: The magnetic field $H$ enters the interpolation in the weighted form $\sqrt{r}\,H$, which is the quantity appearing in the relation above.
Further, a perfectly conducting boundary parallel to the r-axis or in some corner of the grid may occur. In these cases, the method of mirror fields also yields analogous formulas. Last but not least, the axis (r = 0) forms a boundary of the computational domain. Figure 3.11 shows this situation.
Figure 3.11. Interpolation on axis.
In the monopole case ($m = 0$), the condition $E_{z1} \neq 0$, $E_{z2} \neq 0$ holds on the axis. In accordance with
the magnetic monopole field is independent of the azimuthal angle $\varphi$. Thus, the values of $H_1$ and $H_2$ can be replaced in the interpolation by the following:
where the negative sign is given by the direction of e
and, at the left boundary,
Figure 3.12. Interpolation at the "open" boundary in tube region ("waveguide boundary").
Remark 3.7.6. By exploiting the physical properties of the solution fields, this procedure makes it possible to use a bilinear interpolation at the grid boundaries as well.
The restriction does not pose any problems at the boundaries, since the coarse grid values always lie further inside the domain than the fine grid values.
Interpolation on Partially Filled Elementary Areas. For both the restriction and the interpolation, the coefficients for partially filled elementary areas are computed in the same way as those for completely filled areas, since the state variables are formally allocated at the same positions in both cases. The resulting additional approximation error is negligible.
Under-Interpolation Caused by the Indefiniteness of the Problem.
For indefinite problems, difficulties arise if the eigenvalue shift on the coarser grids is not taken into account (see also subsections 3.7.2 and 3.10). In the following, a possibility for improving the convergence proposed by Brandt in [41] is described. Assume
$$\lambda_1^k \le \lambda_2^k \le \dots \le \lambda_n^k < 0 < \lambda_{n+1}^k \le \lambda_{n+2}^k \le \dots$$
on the grid $G_k$ with $1 \le k \le l-1$ for the eigenvalues of the problem to be solved. According to [41], some under-interpolation $\omega_k I_k^{k+1}$ should be used instead of the interpolation $I_k^{k+1}$ if $\omega_k = 1$ does not satisfy the following relation:
$$0 < \lambda_j^{k+1} / \lambda_j^k < \frac{2}{\omega_k}, \qquad 1 \le j \le n+1, \quad 1 \le k \le l-1.$$
If this relation does not hold for $\omega_k = 1$, it follows that
$$\left|1 - \frac{\lambda_j^{k+1}}{\lambda_j^k}\right| \ge 1$$
and thus the algorithm diverges (cf. (3.21) in subsection 3.7.2). For the excited time-harmonic problems, the following holds:
- without eigenvalue shift:
  - $\lambda_j^k \le \lambda_j^{k+1}$ for $1 \le j \le n+1$ (this is typical for Finite Difference methods, while $\ge$ holds in general for Finite Element methods),
  - thus, for all $j$ with $\lambda_j^{k+1} < 0$, the relation for $\omega_k$ is also satisfied with $\omega_k = 1$,
  - but: the frequency intervals around some singularity where $\lambda_j^{k+1} > 0$ and thus $\lambda_j^{k+1}/\lambda_j^k < 0$ holds for $j = n$ and possibly further $j$, i.e., the divergence intervals, are relatively large.
- with eigenvalue shift:
  - aside from small intervals around some singularity, $\lambda_j > 0$ for $j = 1, \dots, n$,
  - for most $j$, $1 \le j \le n+1$, the relation $\lambda_j^{k+1}/\lambda_j^k \approx 1$ holds.$^7$
Consequently, the under-interpolation led to a further convergence improvement: For the new error after the coarse grid correction, the approximation
$$v_{k+1}^{\mathrm{new}} \approx \left(1 - \omega_k \frac{\lambda_j^{k+1}}{\lambda_j^k}\right) e_j^{k+1}$$
holds for under-interpolation (cf. subsection 3.7.2). Thus, for $\omega_k < 1$, a better convergence results than for $\omega_k = 1$ if $\lambda_j^{k+1}/\lambda_j^k > 1$ holds for $j = n$ or $n+1$. Since this condition is not always satisfied for all $j = 1, \dots, n$, the value of $\omega_k$ should not be chosen too small. In the implemented multigrid algorithm, the parameter was set to $\omega_k = s_k^2$, the squared shift factor for the eigenvalue shift. The results shown in subsection 3.10 confirm the advantage of the under-interpolation.
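The trade-off in choosing $\omega_k$ can be made concrete with a few numbers. The sketch below evaluates the admissibility relation and the damping factor $|1 - \omega_k \lambda_j^{k+1}/\lambda_j^k|$ for made-up eigenvalue ratios; the function names and the sample values are assumptions for illustration only.

```python
def admissible(ratio, omega=1.0):
    """True if 0 < ratio < 2/omega, with ratio = lambda_fine / lambda_coarse."""
    return 0.0 < ratio < 2.0 / omega


def damping(ratio, omega=1.0):
    """Magnitude of the error factor after an under-interpolated correction."""
    return abs(1.0 - omega * ratio)


for ratio in (0.95, 1.2, 2.1, -0.2):
    print(f"ratio={ratio:+.2f}: "
          f"omega=1.0 -> {damping(ratio, 1.0):.3f} (ok={admissible(ratio, 1.0)}), "
          f"omega=0.9 -> {damping(ratio, 0.9):.3f} (ok={admissible(ratio, 0.9)})")
```

Ratios above 1 benefit from $\omega_k < 1$, ratios below 1 are damped less well than with $\omega_k = 1$, and a negative ratio cannot be repaired by any positive $\omega_k$, which is why the eigenvalue shift is needed in addition.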
3.7.4 The Relaxation
When choosing a relaxation method, the indefiniteness of the system matrix has to be accounted for. Yet, even classical iteration methods which theoretically diverge for indefinite problems can nevertheless be used as smoothing relaxation in a multigrid algorithm (cf. [112]). It is known from the relevant multigrid literature (e.g., [43]) that the Gauss-Seidel method shows good smoothing properties even for indefinite problems as long as the step size is small enough. On such grids, the divergence of the smooth components is slow compared with the smoothing of the high frequency components and thus can be compensated for by the coarse grid correction. On the coarser grids, however, this method is not suited as a relaxation method because of the divergence of the smooth components, which are the relevant components there. An iteration method that also converges for indefinite problems is the Kaczmarz algorithm. In subsection 3.10, some calculations comparing it with both variants of the Gauss-Seidel algorithm are described. According to these studies, the linewise relaxation seems better suited, which is why it has been implemented in the multigrid algorithm.
$^7$ The shift factor could only be obtained by a series of estimations, since the eigenvalues and eigenvectors themselves are unknown.
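The two pointwise relaxations can be compared on a small model problem. The sketch below is not the linewise relaxation of URMEL-I: it assumes a real, one-dimensional operator $-u'' - k^2 u$ with Dirichlet ends, with $k^2 = 85$ chosen so that the matrix has both negative and positive eigenvalues, and all function names are hypothetical. Each Kaczmarz step is an orthogonal projection onto the solution hyperplane of one equation, so in exact arithmetic the error norm cannot grow.

```python
import math


def helmholtz_matrix(n, k2):
    """Dense matrix of the 1D model operator -u'' - k2*u with Dirichlet ends."""
    h = 1.0 / (n + 1)
    rows = []
    for i in range(n):
        row = [0.0] * n
        row[i] = 2.0 / h**2 - k2
        if i > 0:
            row[i - 1] = -1.0 / h**2
        if i < n - 1:
            row[i + 1] = -1.0 / h**2
        rows.append(row)
    return rows


def matvec(m, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in m]


def kaczmarz_sweep(m, b, x):
    """Project x onto the hyperplane of each equation in turn (real case)."""
    for row, bi in zip(m, b):
        residual = bi - sum(a * xj for a, xj in zip(row, x))
        scale = residual / sum(a * a for a in row)
        for j, a in enumerate(row):
            x[j] += scale * a
    return x


def gauss_seidel_sweep(m, b, x):
    """One lexicographic Gauss-Seidel sweep, updating x in place."""
    for i, (row, bi) in enumerate(zip(m, b)):
        s = sum(a * x[j] for j, a in enumerate(row) if j != i)
        x[i] = (bi - s) / row[i]
    return x


def error_norm(x, x_true):
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(x, x_true)))


n, k2 = 15, 85.0                       # assumed model parameters
M = helmholtz_matrix(n, k2)
grid = [(i + 1) / (n + 1) for i in range(n)]
x_true = [math.sin(3 * math.pi * t) + 0.5 * math.sin(12 * math.pi * t) for t in grid]
b = matvec(M, x_true)

x = [0.0] * n
kaczmarz_errors = [error_norm(x, x_true)]
for _ in range(20):
    kaczmarz_sweep(M, b, x)
    kaczmarz_errors.append(error_norm(x, x_true))

y = [0.0] * n
for _ in range(20):
    gauss_seidel_sweep(M, b, y)

print("Kaczmarz error, start and after 20 sweeps:", kaczmarz_errors[0], kaczmarz_errors[-1])
print("Gauss-Seidel error after 20 sweeps:", error_norm(y, x_true))
```

For the complex symmetric system (2.16), the Kaczmarz update would use the complex conjugate of the row; that variant is left out here to keep the sketch short.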
Its Application in the Special Multigrid Algorithm. First, a multigrid algorithm [277] was devised in which the choice of relaxation was based on the idea that the solution error of the linear system can be expressed as a linear combination of the eigenvectors of the system matrix (cf. subsection 3.7.2). According to [42], some of the higher frequency components do not need to converge 'efficiently' under the relaxation on an intermediate level as long as these components converge 'efficiently' under the relaxation on the next finer grid. Thus, it is important that each eigenfunction is represented correctly on at least one grid, such that the corresponding error components can be smoothed on the different grids. For eigenfunctions with an eigenvalue larger than the squared wave number $k^2$ of the input frequency, the problem is positive definite. For these higher frequency components, the Gauss-Seidel method is suited. Therefore, the first attempt was as follows: The Gauss-Seidel method was used for relaxation down to the second-last grid on which a function in $k^2$ can still be represented well. On all coarser grids, the Kaczmarz method was applied. Extensive testing showed that this combination of Gauss-Seidel and Kaczmarz as relaxations did not lead to a convergent FMG algorithm. There seem to be two reasons for that: On the one hand, the grids on which the Kaczmarz method was applied at all were too coarse to represent the solution, which is a function of $k$. On the other hand, on the coarser grids, the eigenfunctions are shifted to lower frequencies, so that the solutions were fundamentally different on different grids close to resonances and quasi-resonances. Now, to avoid the second problem, a corrected shift $k^2 \cdot 2(\cos kh - 1)/(h^2 \cdot k^2)$ is used on all grids instead of the squared wave number $k^2$ belonging to the input frequency (cf. subsection 3.7.2). Further, an under-interpolation (cf. subsection 3.7.3), as proposed in [41], is used.
The current version of the multigrid algorithm in URMEL-I employs the Kaczmarz method as soon as the Gauss-Seidel method no longer reduces the error sufficiently or even obviously diverges.$^8$ (It becomes obvious already here that not only two or three fixed relaxation steps are used but that the number of relaxations is controlled by some stopping criterion as described below.) Coarse grids on which functions of the wave number $k$ can no longer be represented very well ($h_z > k/5$) are not used at all. This choice of relaxation improves the convergence more than the mere application of Gauss-Seidel (in combination with Kaczmarz). Yet, the convergence still falls short of what is expected of a multigrid method, which is caused by problems to be discussed in subsection 3.10.
$^8$ The program URMEL-I was based on URMEL [300], which is for resonant fields.
Remark 3.7.7. This means that a strong limitation exists on the coarsening process: The higher the input frequency, the finer the coarsest grid. But this limitation is typical for indefinite problems. Thiebes [260] shows that the "dominant" frequency may only be on the coarsest grid, and he denotes a frequency as dominant if it belongs to some eigenfunction with the smallest absolute value of a negative or positive eigenvalue. Here, the squared wave number $k^2$ lies close to the dominant frequencies of $M$. Thus, the observations described above coincide with Thiebes' statements.
The limitation concerning the coarsening limits the applicability of the algorithm: A direct solution method is used on the coarsest grid. Its rounding errors get too large beyond a certain number of unknowns. As a consequence, the input frequency for which the multigrid algorithm in its current version converges is bounded. In [43], a multigrid algorithm for slightly indefinite problems is introduced in which the coarse grid equation is corrected in some sense. Then the subspace of eigenfunctions is approximated using the Kaczmarz method and some special multigrid algorithm not satisfying the condition (3.21) for good convergence. With these approximated eigenfunctions, additional correction equations are set up on the coarser grids. A similar procedure could also make sense for the algorithm from [277], [287] treated here. However, it would require considerably more computational effort than the multigrid algorithm used now, whose convergence is sufficiently good in most cases.
Treatment of Re-Entrant Corners. Re-entrant corners at the boundary of the vacuum domain, as, e.g., at the connection of cavity and beam tube, present a singularity. One possibility to improve the convergence for such problems is the application of the linewise relaxation [112]. This was already chosen in any case because of the strong coupling in $z$-direction (see above). Further, an additional relaxation could be done in the neighbourhood of such a singularity in order to optimize a multigrid algorithm. This procedure has not yet been applied to the current algorithm, since this relaxation would hardly diminish the total error, which is dominated by the coarse grid error [244], compare subsection 3.7.2. Finally, there exists the very expensive possibility of introducing local grids at the boundary [112]. In the end, this coincides with the method proposed by Brandt to treat the inner part and the boundary of the domain separately [244].
3.7.5 The Choice of the Cycles in the FMG Approach
In the FMG approach, the number of multigrid iterations $i_k$ on the grids $G_k$, $k = 2, \dots, l$, the number $\nu_k$ of relaxations on the single grids, and the number $\gamma$ of multigrid steps on the next coarser grid must each be fixed. In general, W-cycles are said to be more robust than V-cycles [253], [112]. Nevertheless, there exist problems for which V-cycles converge but W-cycles do not:
Remark 3.7.8.
1. With W-cycles, the coarse grid correction has a dominant influence. Consequently, an optimal coarse grid correction is required in order to let the W-cycles converge better than the V-cycles [244]. This is precisely where the weaknesses of the algorithm described above lie - caused by the complicated boundary conditions and the indefiniteness (cf. subsections 3.7.2 and 3.7.4).
2. Finally, an adaptive choice of the cycles seems to be better suited than W-cycles [244], since, for frequencies close to a singularity, the coarse grid correction will always be relatively bad. Yet, the improvements could make the interval around a singularity smaller.
If the coarse grid correction is not optimized by the improvements discussed in [277], V-cycles therefore seem to be the best choice. For the multigrid iteration, $\gamma = 1$ was chosen, i.e., V-cycles are used.
Generally, multigrid algorithms are applied to differential equations whose coefficients vary very little locally. In such cases, the number of relaxation steps to be carried out before and after the coarse grid corrections can be fixed a priori by theoretical reasoning. For the problem at hand, however, this is not possible: because of the problems described above, the convergence rate of a standard multigrid cycle can no longer be predicted. Therefore, adaptive cycling is needed. The number $\nu_k$ of relaxations on the grid $G_k$ as well as the number $i_k$ of V-cycles in the FMG approach are not chosen a priori but depend on a stopping criterion formulated in terms of the residual norm. Nevertheless, both can also be fixed to some number $\ge 1$ by the user of the program URMEL-I [277] via the input parameters.
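A residual-based rule for the number of relaxations can be sketched as follows. This is only a minimal illustration and not the URMEL-I implementation: the function names `gauss_seidel_sweep` and `adaptive_relax`, the parameters `reduction` and `max_sweeps`, the use of pointwise Gauss-Seidel in place of the linewise relaxation, and the small 1D Poisson test matrix are all assumptions made for the example.

```python
import numpy as np

def gauss_seidel_sweep(A, x, b):
    """One pointwise Gauss-Seidel sweep, updating x in place."""
    for i in range(len(b)):
        x[i] += (b[i] - A[i] @ x) / A[i, i]
    return x

def adaptive_relax(A, x0, b, reduction=0.1, max_sweeps=50):
    """Relax until the residual norm has dropped by the factor `reduction`
    (hypothetical stopping rule) or `max_sweeps` sweeps have been done."""
    x = np.array(x0, dtype=float)
    r0 = np.linalg.norm(b - A @ x)
    sweeps = 0
    while sweeps < max_sweeps:
        gauss_seidel_sweep(A, x, b)
        sweeps += 1
        if np.linalg.norm(b - A @ x) <= reduction * r0:
            break
    return x, sweeps

# Small 1D Poisson matrix as a stand-in for one grid level (assumption).
n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x, nu = adaptive_relax(A, np.zeros(n), b, reduction=0.1, max_sweeps=200)
print("sweeps used:", nu)
```

The number of sweeps `nu` is an output of the rule rather than an input, which mirrors the idea that $\nu_k$ is determined by the residual norm instead of being fixed in advance.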
3.7.6 The Solution Method on the Coarsest Grid

On the coarsest grid, Gaussian elimination is used for the solution of the linear system $A_1 x_1 = b_1$. In the case of Neumann boundary conditions, one or several values have to be fixed before the defect equation on the coarsest grid can be solved. At first sight, the time-harmonic cylindrically symmetric problem with current excitation on the axis seems to be in this problem class. However, the axis is not a boundary of the studied structure but only a boundary of the computational domain. In setting up the difference equations, the complete structure is used$^{9}$. Thus, the eigenvalue zero does not occur, and no additional equation is necessary to fix $x_1$. Yet, the problem is a radiation problem possessing undesired eigenfunctions with large entries at the waveguide boundaries. In fact, the corresponding homogeneous problem ($J = H^* = 0$) allows for a crossing wave. Setting $x_1(P) = 0$ for all points $P$ of the open boundary suppresses the undesired eigenfunctions. This procedure was also implemented in the algorithm. A further improvement of the method could be reached by normalizing $x_1$ from physical considerations.
3.7.7 Concluding Remarks on the Multigrid Algorithm and Possible Outlook

In subsection 3.10, systematic studies will be discussed. In principle, the described multigrid algorithm from [277] is a solver for the complex symmetric indefinite system (2.16). In practical impedance calculations, however, only the high frequencies near a quasi-resonance (and thus close to a near singularity) are of interest. For such applications of the time-harmonic problem, the performance of the algorithm is unsatisfactory. Since the posed problem involves a number of difficulties which together have a very strong influence on the convergence (compare subsection 3.10), a number of improvements are still possible. Several possible improvements have already been listed in [277] (e.g., interpolation of higher order or a different choice of cycling). In addition, there are ideas which have proved efficient over recent years in other contexts but for similar problems, e.g., semi-coarsening, which is described in [106]. The construction of preconditioning methods based on multigrid, as in the AMLI (Algebraic MultiLevel Iteration) methods described in [189], is also promising. Besides the multigrid ansatz, there are the very attractive Krylov subspace methods described in subsection 3.4.4, which have already proved effective for time-harmonic problems on three-dimensional FIT grids. Finally, with regard to the near-singularity, there exist special methods which were introduced in [12]. There, bordering methods are used for the construction of preconditioners. It could be shown that in some cases it is efficient first to make a nearly singular problem exactly singular by some inclusion method and then to solve this problem by a Krylov subspace method.
Further studies and comparisons in this direction are very desirable, since, due to the lack of efficient and reliable programs, quite rough analytical estimates are still used for the impedances instead of optimizing the design by numerical simulations.

$^{9}$ Furthermore, the solutions (.jTH
3.8 Preconditioning

The convergence behaviour of the Chebyshev iteration and of all Krylov subspace methods strongly depends on the spectrum of the system matrix. Geometrically, this can be illustrated very well with the gradient method for a two-dimensional problem. For badly conditioned system matrices, the contour lines become very elongated ellipses and, for unfavourable initial values, the search path followed by the gradient algorithm zigzags. This is tantamount to extremely slow convergence$^{10}$. For many methods, a large condition number not only reduces the speed of convergence but often puts convergence itself in question. The examples in subsection 3.10 also clearly indicate that a suitable preconditioning is the key to good performance of iterative methods. The condition number of a well-conditioned problem is close to one, i.e., the extreme eigenvalues are nearly the same$^{11}$. The goal of preconditioning is to improve the eigenvalue distribution in the complex plane such that the preconditioned system has a condition number close to one. One possibility is the multiplication by a suitable polynomial in $A$. More often, however, the following way is chosen: a non-singular matrix $M$ is found which is a good approximation to the original matrix $A$ and which can be easily inverted. Multiplying $A$ by the inverse of the preconditioning matrix, one obtains a well-conditioned system if $M$ approximates $A$ well enough. Depending on the positioning of both matrices, three different types of preconditioning are introduced:

- one-sided preconditioning
  - left-handed preconditioning: Solve
    $$M^{-1} A x = M^{-1} b,$$
    where the preconditioned system has the same solution as the original system. This type is usually preferred.
  - right-handed preconditioning: Solve
    $$A M^{-1} (M x) = b,$$
    where the right-hand side remains unchanged.
- split preconditioner: Solve
  $$M_1^{-1} A M_2^{-1} (M_2 x) = M_1^{-1} b,$$
  which, for suitable choices of $M_1$ and $M_2$, keeps a given symmetry.

$^{10}$ Graphical illustrations can be found in nearly every textbook, e.g., [74].
$^{11}$ Then the contour lines of the gradient algorithm become circles and, for every initial value, the normal vectors to the contour lines lead directly to the minimum, which is the solution.
The transformed system
$$A' x' = b'$$
is then solved by a Krylov subspace method. Two fundamentally different strategies for preconditioning are common: preconditioning based on a classical iteration method or on an incomplete triangular decomposition, and preconditioning based on a polynomial in $A$. In addition, preconditioning via multigrid and domain decomposition methods was recently introduced. In [44], some of the most popular methods are ordered by increasing quality, where the quality of a particular preconditioning is assessed by how well it approximates the inverse of $A$. An iteration method without preconditioning has quality factor zero, i.e., computational effort and storage requirements for the preconditioning are low but the iteration effort is high. A direct method has quality factor one: the computational effort for the LU decomposition and backward substitution is high, and at least as much storage space as for the matrix $A$ is needed, but the iteration effort is small, since the iteration has just one step. Preconditioned iteration methods thus have a quality factor between zero and one. The computational effort for the iterative method goes down with increasing approximation quality, while the effort for the preconditioning, which dominates the total effort, grows. Consequently, a choice can be regarded as optimal if it guarantees convergence at the smallest possible effort. The decision about optimality depends on the problem. The selection given in [44] contains all the rather "classical" preconditioners:

- the Jacobi method with $M = \operatorname{diag}(A)$ (one-sided),
- the scaling with $M_1 = M_2 = \sqrt{|\operatorname{diag}(A)|}$ (two-sided Jacobi),
- the modified symmetric Gauss-Seidel preconditioning (SGS) with $M_1 M_2 = (D + L) D^{-1} (D + U)$, where $A = D + L + U$,
- the diagonal ILU decomposition (Incomplete LU) with $M_1 M_2 = (\tilde D + L) \tilde D^{-1} (\tilde D + U)$, where $\tilde D$ is chosen such that $\operatorname{diag}(M_1 M_2) = D$,
- ILU, ILU(0), or ILU(k) with $M_1 M_2 = (I + L)(D + U)$,
- ILUT(k).

In what follows, we will mainly discuss the incomplete LU decompositions, since the combination of an ILU decomposition with Krylov subspace methods is often regarded as the best "general purpose" method.
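The effect of two of the classical preconditioners on the condition number can be illustrated with a small experiment. The sketch below is an illustration under stated assumptions, not taken from [44]: the badly scaled test matrix `A`, the helper name `split_condition_number`, and the use of a Cholesky factor of $M$ as split preconditioner ($M_1 = C$, $M_2 = C^T$ with $M = CC^T$, which keeps the symmetry) are all choices made for this example.

```python
import numpy as np

def split_condition_number(A, M):
    """Condition number of C^{-1} A C^{-T}, where M = C C^T (Cholesky).
    This is the split-preconditioned matrix with M1 = C, M2 = C^T."""
    C = np.linalg.cholesky(M)
    Y = np.linalg.solve(C, A)            # C^{-1} A
    B = np.linalg.solve(C, Y.T).T        # C^{-1} A C^{-T}
    ev = np.linalg.eigvalsh((B + B.T) / 2)
    return ev[-1] / ev[0]

# Test matrix (assumption): 1D Poisson matrix with bad row/column scaling.
n = 20
T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
d = np.linspace(1.0, 100.0, n)
A = d[:, None] * T * d[None, :]          # symmetric positive definite

D = np.diag(np.diag(A))                  # A = D + L + U
L = np.tril(A, k=-1)
U = np.triu(A, k=1)

M_jacobi = D                                   # Jacobi: M = diag(A)
M_sgs = (D + L) @ np.linalg.solve(D, D + U)    # SGS: (D+L) D^{-1} (D+U)

kappa_A = np.linalg.cond(A)
kappa_jacobi = split_condition_number(A, M_jacobi)
kappa_sgs = split_condition_number(A, M_sgs)
print(kappa_A, kappa_jacobi, kappa_sgs)
```

For this particular matrix, the Jacobi scaling removes the artificial scaling and recovers the condition number of the unscaled Poisson matrix, while SGS reduces it further.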
3.8.1 Incomplete LU Decompositions

The LU decomposition of the system matrix $A$ generates a multiplicative decomposition $A = LU$, where $L$ ("lower") and $U$ ("upper") are lower and upper triangular matrices. The computation of the LU decomposition is the same as Gaussian elimination. The LU decomposition is not well suited for sparse matrices, since it produces fill-in in the matrix $A$. The idea of avoiding fill-in during elimination by no longer eliminating all elements in the lower triangular part of $A$ led to the incomplete LU decompositions (ILU):
$$A = LU - R$$
rather than $A = LU$, with $R$ a remainder matrix. If an ILU decomposition is used as a solution method, this is an iterative process, since an exact solution can no longer be found directly. The symmetric variant ($U = L^T$) of the ILU decomposition is referred to as IC (incomplete Cholesky decomposition). The idea behind the incomplete Cholesky decomposition is the computation of a lower triangular matrix $\tilde L$ with suitable sparse structure which is a good approximation to the exact Cholesky factor $L$. The idea of an incomplete decomposition is also supported by the observation that the application of the Cholesky decomposition to sparse symmetric matrices produces only relatively small entries $l_{ij}$ outside the filling scheme of $A$ [74]. The ILU decomposition was first used in 1960, for example, by Varga [289]. [7], [223], and [224] give a good overview of this topic. If the system matrix is an M-matrix$^{12}$, the effects of preconditioning are well understood. For more general systems of linear equations, they are not yet fully understood, so preconditioning of non-Hermitian complex matrices is mainly based on numerical experience. In [173], the effects of preconditioning are visualized for the spectra of some test matrices.

Incomplete LU Decomposition in Accordance with the Matrix Structure (ILU(k)). In its original form, ILU(0), the incomplete LU decomposition maintains the structure of the original matrix $A$ and neglects all additional entries generated by $L$ and $U$ which do not fit in the filling scheme of $A$. Generally, a set $Z$ of pairs $(i,j)$ can be given so that the only stored entries of $L$ and $U$ have indices $(i,j) \in Z$. The set $Z_k$ which belongs to ILU(k) is recursively determined by

$$Z_0 := \{(i,j) \mid 1 \le i,j \le n,\ a_{ij} \ne 0\},$$
$$Z_k := \{(i,j) \mid 1 \le i,j \le n,\ (i,j) \in Z_{k-1} \text{ or } (i,j) \text{ is a position at which an additional entry is generated by elements of } Z_{k-1}\}.$$

Algorithm 3.8.1.1 (ILU(k))

For $i = 1, \ldots, n$:
  For $j = 1, \ldots, n$:
    If $(i,j) \in Z_k$:
      $s_{ij} = a_{ij} - \sum_{k=1}^{\min(i,j)-1} l_{ik} u_{kj}$
      For $i \ge j$: $l_{ij} = s_{ij}$
      For $i \le j$: $u_{ij} = s_{ij} / l_{ii}$
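The following is a minimal dense-storage sketch of the level-zero case of this algorithm. It assumes the convention in which $L$ is lower triangular and carries the pivots while $U$ is upper triangular with unit diagonal; the function name `ilu0` and the small 2D Poisson test matrix are assumptions for illustration, and a practical implementation would use sparse storage.

```python
import numpy as np

def ilu0(A):
    """ILU(0) on the pattern Z_0 = {(i, j): a_ij != 0}.
    Returns L (lower triangular, holds the pivots) and U (unit upper)."""
    n = A.shape[0]
    pattern = A != 0
    L = np.zeros((n, n))
    U = np.eye(n)
    for i in range(n):
        for j in range(n):
            if not pattern[i, j]:
                continue                      # entry outside Z_0 is dropped
            m = min(i, j)
            s = A[i, j] - L[i, :m] @ U[:m, j]
            if i >= j:
                L[i, j] = s
            else:
                U[i, j] = s / L[i, i]
    return L, U

# 2D Poisson matrix on a 3 x 3 grid (five-point stencil) as a test case.
T = 2 * np.eye(3) - np.eye(3, k=1) - np.eye(3, k=-1)
A = np.kron(np.eye(3), T) + np.kron(T, np.eye(3))
L, U = ilu0(A)
R = L @ U - A     # remainder matrix: zero on the pattern of A

# For a tridiagonal matrix no fill-in occurs, so ILU(0) is the exact LU.
L1, U1 = ilu0(T)
```

The remainder `R` vanishes on the filling scheme of `A` and is non-zero only at the positions where fill-in was discarded.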
$^{12}$ i.e., $a_{ii} > 0$, $a_{ij} \le 0$ for $i \ne j$, and $A^{-1} \ge 0$ (elementwise). M-matrices arise in the discretization of simple partial differential equations, e.g., in electrostatics.
In most cases, increasing the number of levels used for the decomposition also improves the rate of convergence (cf. subsection 3.10). One disadvantage is that, for $k > 0$, it is hardly possible to know in advance how many additional entries will be generated. In general, ILU(k) yields an effective preconditioner. Two special modifications due to Wittum [319], known as the ILU$_\beta$(0) and ILU$_\beta$(3) factorizations, often prove to be very efficient preconditioners. Also, over large intervals, they are remarkably insensitive to the parameter $\beta$, so that the choice of a standard value such as $\beta = -0.99$ is possible. ILU$_\beta$(0) leads to the same filling scheme as that of $A$, while ILU$_\beta$(3) allows three additional side bands. On the other hand, ILU$_\beta$(3) generates a better approximation and thus, in general, better convergence. If storage presents no problem, ILU$_\beta$(3) should therefore be preferred (see also section 3.10).

Modified Incomplete LU Decomposition (MILU(k) and MIC(0)). To improve the ILU decomposition, Gustafsson [109] proposed a modification of the diagonal elements of $L$ such that the row sums of the error matrix $LU - A$ vanish. Let $Z_k$ be defined as above.

Algorithm 3.8.1.2 (MILU(k))

For $i = 1, \ldots, n$:
  $l_{ii} = 0$
  For $j = 1, \ldots, n$:
    $s_{ij} = a_{ij} - \sum_{k=1}^{\min(i,j)-1} l_{ik} u_{kj}$
    If $(i,j) \in Z_k$:
      For $i > j$: $l_{ij} = s_{ij}$
      For $i = j$: $l_{ii} = l_{ii} + s_{ii}$
      For $i < j$: $\tilde u_{ij} = s_{ij}$
    Otherwise: $l_{ii} = l_{ii} + s_{ij}$
  For $j = i+1, \ldots, n$:
    $u_{ij} = \tilde u_{ij} / l_{ii}$
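The row-sum property can be checked with a small dense-storage sketch of the level-zero case. As before, it assumes that $L$ is lower triangular with the pivots on its diagonal and $U$ is unit upper triangular; the function name `milu0` and the 2D Poisson test matrix are assumptions made for illustration.

```python
import numpy as np

def milu0(A):
    """Modified ILU(0): fill-in outside the pattern of A is not stored
    but added to the diagonal entry l_ii of the same row."""
    n = A.shape[0]
    pattern = A != 0
    L = np.zeros((n, n))
    U = np.eye(n)
    for i in range(n):
        u_tilde = np.zeros(n)
        for j in range(n):
            m = min(i, j)
            s = A[i, j] - L[i, :m] @ U[:m, j]
            if pattern[i, j]:
                if i > j:
                    L[i, j] = s
                elif i == j:
                    L[i, i] += s
                else:
                    u_tilde[j] = s
            else:
                L[i, i] += s          # lump the dropped entry onto l_ii
        U[i, i + 1:] = u_tilde[i + 1:] / L[i, i]
    return L, U

# 2D Poisson matrix on a 3 x 3 grid (five-point stencil) as a test case.
T = 2 * np.eye(3) - np.eye(3, k=1) - np.eye(3, k=-1)
A = np.kron(np.eye(3), T) + np.kron(T, np.eye(3))
L, U = milu0(A)
E = L @ U - A
row_sums = E.sum(axis=1)    # vanish up to rounding
```

In contrast to plain ILU(0), the error matrix `E` is non-zero on the diagonal as well, but each of its rows sums to zero.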
For certain elliptic partial differential equations, this decomposition turned out to be very efficient [173]. In general, however, the MILU(k) method is often worse than ILU(k) and can even fail altogether. Another modified method is MIC$_\omega$(k) [7]. In section 3.10, some convergence studies on practical examples are given. For example, this preconditioner improves the spectrum of the complex linear system of electro-quasistatics and leads to a remarkable reduction in the number of iteration steps required by the studied cg-like methods (see section 3.10).

Incomplete LU Decomposition with Thresholds for Additional Entries (ILUT(k)). Saad [225] uses another criterion to neglect some entries. He defines scalars $n_i^L$ and $n_i^U$ to be the number of non-vanishing elements in the lower, resp. strictly upper, part of the $i$-th row of $A$. Then only the $n_i^L + k$ (resp. $n_i^U + k$) largest elements in $L$ (resp. $U$) get an entry. The advantage of ILUT(k) compared with ILU(k) is the predictable storage requirement. The algorithm can be found in [225] and [173].

3.8.2 Iteration Methods
The SSOR method is a very frequently used preconditioner. SSOR preconditioning also improves the spectrum of the linear system and in many cases leads to a remarkable reduction of the required number of iterations. The preconditioning matrix is then given by
$$M = \left(\tfrac{1}{\omega_1} D - \tfrac{1}{\omega_2} L\right) \left(\tfrac{1}{\omega_1} D\right)^{-1} \left(\tfrac{1}{\omega_1} D - \tfrac{1}{\omega_2} U\right),$$
where $A = D - L - U$, $D$ is the diagonal, $L$ the lower left triangular matrix, and $U$ the upper right triangular matrix. $L = U^T$ holds for a symmetric system matrix $A$. The advantage of this decomposition is that it is not necessary to compute $M$ explicitly and to store it. To accelerate the method, the parameters $\omega_1$ and $\omega_2$ have to be chosen appropriately. A typical choice is $\omega_1 = \omega_2 = 1$, since the sensitivity of the preconditioning process to these parameters is usually small. For this choice of the parameters, the method is called symmetric Gauss-Seidel preconditioning (SGS). The SSOR or SGS method is often used as a split preconditioner. Then the forward SOR algorithm is used from the left and the backward SOR algorithm from the right. For a "nearly symmetric" matrix, it is often possible to use the storage-efficient SSOR preconditioner by merely applying it to the symmetric part of the system matrix. This idea goes back to Concus, Golub, and Widlund in the mid-1970s and is sometimes called the CGW method (after the authors' initials). There are many variants and (unfortunately) also many different names for this technique. For example, this ansatz is also referred to as partial SSOR preconditioning. Such an ansatz is also studied by Yserentant in [326] for an indefinite symmetric matrix, as it results from the discretization of elliptic equations of Helmholtz type. As a split preconditioner for the complete system, he uses a matrix $B$ obtained from the symmetric positive definite part of an indefinite matrix $A$, and he gives estimates for the spectrum of these preconditioned systems. The consequences for the Krylov subspace method are also described in [326].

3.8.3 Polynomial Preconditioning
In polynomial preconditioning, the system
$$s(A) A x = s(A) b$$
is solved instead of the system $Ax = b$, where $s$ is a polynomial of usually low degree. This idea goes back to Rutishauser [220] and was taken up later by several authors (see, for example, [222]). For instance, for a symmetric positive definite matrix, a least squares polynomial over the interval $[0, \lambda_{\max}]$ can be used, where $\lambda_{\max}$ is an estimate of the largest eigenvalue obtained via Gershgorin circles. Polynomial preconditioning cannot be recommended on scalar computers, since no advantages can be expected there; on such computers, it is customary to apply incomplete LU decompositions for preconditioning. Yet, on vector computers and especially on parallel machines, polynomial preconditioning is advantageous according to [222]. It will not be treated in more detail here, since parallel machines are not standard yet.

3.8.4 Multigrid Methods

In general, classical multigrid algorithms have optimal complexity only for regular problems (i.e., sufficiently regular solution and sufficiently regular grid)$^{13}$. The method of hierarchical basis functions of Yserentant [325] is not of optimal order, especially for three-dimensional problems, where the condition number grows like $O(h^{-1})$, $h$ being the step size. In [13], Axelsson and Vassilevski present a multilevel preconditioning (AMLI for short, Algebraic MultiLevel Iterative Method) on Finite Element triangulations which yields an algorithm of optimal order under weak assumptions on the FEM grid. Their preconditioning is based on a special incomplete decomposition of block matrices. The decomposition at one level is defined by recursion in terms of the decomposition at the previous level, and the resulting linear system on the coarsest grid is then solved directly. This kind of preconditioning can then also be used for a preconditioned iteration method such as a cg-like method. There are successful applications of this method to simple domains with one re-entrant corner.
This domain does not satisfy the elliptic ($H^2$) regularity condition, so a classical V-cycle multigrid algorithm would not converge with optimal order, while the AMLI-preconditioned cg method converges very fast and also better than the ILU-preconditioned cg method. Neytcheva [189] also uses the AMLI preconditioner for indefinite and nearly singular systems. In [190], Oosterlee and Washio look at three different multigrid algorithms as stand-alone solvers and as preconditioners for the BiCGSTAB and GMRES(20) methods, comparing them with the BiCGSTAB and GMRES(50) methods, each preconditioned by MILU. They apply these methods to singularly perturbed problems for which it is difficult to design optimal standard multigrid methods. In particular, a diffusion equation with strongly varying diffusion coefficients is one example for which the three MG algorithms do not converge satisfactorily and for which the Krylov acceleration "is really needed for convergence" [190]. In all examples studied in [190], the MG-preconditioned Krylov subspace methods are faster than the MG algorithms themselves and faster than the MILU-preconditioned Krylov subspace methods.

$^{13}$ Compare the special multigrid algorithm in subsection 3.7 for high-frequency solutions and irregular grids, which confirms this statement.
3.9 Real-Valued Iteration Methods for Complex Systems

For many applications, complex linear systems
$$Ax = b \quad \text{with } A \in \mathbb{C}^{n \times n},\ x, b \in \mathbb{C}^n \tag{3.23}$$
are to be solved. The linear system can also be written as
$$(R + iS)(x + iy) = u + iv \tag{3.24}$$
with $R, S \in \mathbb{R}^{n \times n}$ and $x, y, u, v \in \mathbb{R}^n$. In what follows, a (preconditioned) iterative method is introduced which was developed by Axelsson. The presentation here follows van den Meijdenberg [274] and Korotov [157] as well as a draft by Axelsson and Kutcherov [11], [10].

3.9.1 Axelsson's Reduction of a Complex Linear System to Real Form

This method uses a well-known procedure to solve a real system
$$\begin{pmatrix} R & -S \\ S & R \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix} \tag{3.25}$$
whose dimension is twice as large as the dimension of (3.23) and whose condition number is the square of the condition number of the original problem. It is assumed that $R$ and $R + SR^{-1}S$ are non-singular. This holds, e.g., if $R$ is symmetric and positive definite (usually abbreviated as spd) and $S$ is symmetric$^{14}$. Using the identity
$$(I - iSR^{-1})(R + iS) = R + SR^{-1}S,$$
the relations
$$x = (R + SR^{-1}S)^{-1}(u + SR^{-1}v), \tag{3.26}$$
$$y = (R + SR^{-1}S)^{-1}(v - SR^{-1}u) \tag{3.27}$$
follow for the solution of (3.24). This is an equivalent real formulation for the complex linear system (3.23) or (3.24). However, this formulation is not very advantageous, since it requires the solution of two linear systems, one with $R + SR^{-1}S$ and one with $R$. In addition, it necessitates extra matrix-vector multiplications and vector operations.

$^{14}$ If $R$ is allowed to be singular, then $S$ and $S + RS^{-1}R$ have to be non-singular. In this case, the system (3.24) multiplied by $i$ is solved.
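The real formulation (3.26), (3.27) can be checked numerically against a direct complex solve. The sketch below is only an illustration: the function name `solve_real_form`, the random test matrices, and the use of dense direct solves for all inner systems are assumptions made for the example.

```python
import numpy as np

def solve_real_form(R, S, u, v):
    """Solve (R + iS)(x + iy) = u + iv using real arithmetic only,
    via x = K^{-1}(u + S R^{-1} v), y = K^{-1}(v - S R^{-1} u),
    where K = R + S R^{-1} S."""
    K = R + S @ np.linalg.solve(R, S)
    x = np.linalg.solve(K, u + S @ np.linalg.solve(R, v))
    y = np.linalg.solve(K, v - S @ np.linalg.solve(R, u))
    return x, y

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n))
G = rng.standard_normal((n, n))
R = B @ B.T + n * np.eye(n)      # spd
S = G + G.T                      # symmetric (may be indefinite)
u = rng.standard_normal(n)
v = rng.standard_normal(n)

x, y = solve_real_form(R, S, u, v)
z_ref = np.linalg.solve(R + 1j * S, u + 1j * v)   # complex reference
print(np.max(np.abs(x + 1j * y - z_ref)))
```

The sketch also makes the cost visible: each of the two solves with $K$ is preceded by a solve with $R$, which is what motivates the cheaper variants discussed next.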
To reduce the effort, the real formulation (3.25) can be exploited for the complex system: as soon as $x$ is known, $y$ can be determined from $Ry = v - Sx$ instead of (3.27)$^{15}$. With that, instead of the complex system (3.24),
$$(R + iS)(x + iy) = u + iv,$$
we have the systems
$$(R + SR^{-1}S)x = u + SR^{-1}v, \tag{3.28}$$
$$Ry = v - Sx. \tag{3.29}$$
They are now solved consecutively in the real vector space. If $R$ is ill-conditioned, some generalized form of the first system (3.28) may be better suited. This is the case if $R + \alpha S$ with some real parameter $\alpha$ is better conditioned than $R$. The derivation of the generalized method starts off with system (3.24) in its real form (3.25). Adding the first equation multiplied by $-\alpha$ to the second yields the system
$$Rx - Sy = u, \tag{3.30}$$
$$(R + \alpha S)y - (\alpha R - S)x = v - \alpha u. \tag{3.31}$$
Elimination of $y$ leads to
$$Rx - S(R + \alpha S)^{-1}(\alpha R - S)x = u + S(R + \alpha S)^{-1}(v - \alpha u).$$
Using
$$\alpha R - S = \alpha (R + \alpha S) - (1 + \alpha^2) S$$
gives
$$C_\alpha x = u + S(R + \alpha S)^{-1}(v - \alpha u), \tag{3.32}$$
where
$$C_\alpha = R - \alpha S + (1 + \alpha^2) S (R + \alpha S)^{-1} S. \tag{3.33}$$
For $\alpha = 0$, (3.28) follows again. The system of the form (3.32), (3.33) was first studied in [11], see also [10]. Below it will be shown that the iterative solution of (3.32) is essentially equivalent to the solution of (3.28) with $(R + \alpha S) R^{-1} (R + \alpha S)$ as preconditioner. This method was already studied in [4]. A general iterative method can be written in the form
$$C (x^{l+1} - x^l) = -\tau_l r^l, \quad l = 0, 1, \ldots,$$
where $x^0$ is the initial approximation, $r^l$ the residual, $C$ a preconditioning matrix, and $\tau_l$ are acceleration parameters such as the parameters in the Chebyshev iteration. Consider $C(\alpha) = R + \alpha S$. Equation (3.32) implies the relation
$$r^l = (R - \alpha S) x^l - u + S (R + \alpha S)^{-1} \left[ (1 + \alpha^2) S x^l - v + \alpha u \right].$$
Using the identity for $\alpha R - S$ given above to evaluate $(1 + \alpha^2) S$, one obtains
$$r^l = R x^l - u - S (R + \alpha S)^{-1} \left[ (\alpha R - S) x^l + v - \alpha u \right].$$
Then $r^l$ can be computed without direct inversion of $(R + \alpha S)$ in the following way:
$$\text{Set } z^l = (\alpha R - S) x^l + v - \alpha u, \tag{3.34}$$
$$\text{solve } (R + \alpha S) y^l = z^l, \tag{3.35}$$
$$\text{compute } r^l = R x^l - S y^l - u. \tag{3.36}$$
This way, $y^l$ is already available once $r^l$ has been determined for the current $x^l$, so (3.29) no longer has to be solved. Furthermore, this procedure avoids the initial computation of $S (R + \alpha S)^{-1} (v - \alpha u)$ on the right-hand side of (3.32)$^{16}$. Finally, the method shall be written down completely. For convenience, it is called here the C-to-R method.

$^{15}$ Thus the solution of one system with $R + SR^{-1}S$ is saved.
Algorithm 3.9.1.1 (C-to-R Method with Preconditioning / Axelsson)

Choose $\alpha$. Set $x^0 := 0$.
For $l = 0, 1, \ldots$
  Set $z^l := (\alpha R - S) x^l + v - \alpha u$
  Solve $(R + \alpha S) y^l = z^l$
  Compute $r^l = R x^l - S y^l - u$
  Choose $\tau_l$
  Solve $(R + \alpha S) x^{l+1} = (R + \alpha S) x^l - \tau_l r^l$

This is a general form of the C-to-R method in which the parameters $\alpha$ and $\tau_l$ and the solution method for both linear systems $(R + \alpha S) y^l = z^l$ and $(R + \alpha S) x^{l+1} = (R + \alpha S) x^l - \tau_l r^l$ still have to be chosen.
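One concrete instance of this general form can be sketched as follows. The choices made here are assumptions for illustration and not prescribed by the algorithm: $\alpha = 1$, a constant parameter $\tau_l = 4/3$ (a stationary iteration instead of the Chebyshev iteration), dense direct solves for both inner systems, and a relative residual test with the hypothetical parameters `tol` and `max_iter`. The test problem assumes $R$ spd and $S$ symmetric positive semi-definite.

```python
import numpy as np

def c_to_r(R, S, u, v, alpha=1.0, tau=4.0 / 3.0, tol=1e-10, max_iter=200):
    """Stationary C-to-R iteration for (R + iS)(x + iy) = u + iv."""
    C = R + alpha * S
    x = np.zeros_like(u, dtype=float)
    r0_norm = None
    for _ in range(max_iter):
        z = (alpha * R - S) @ x + v - alpha * u
        y = np.linalg.solve(C, z)            # y^l belonging to x^l
        r = R @ x - S @ y - u                # residual r^l
        r_norm = np.linalg.norm(r)
        if r0_norm is None:
            r0_norm = r_norm
        if r_norm <= tol * r0_norm:
            return x, y
        x = x - tau * np.linalg.solve(C, r)  # (R + aS) x^{l+1} = (R + aS) x^l - tau r^l
    # Iteration limit reached: recompute y for the final x.
    y = np.linalg.solve(C, (alpha * R - S) @ x + v - alpha * u)
    return x, y

rng = np.random.default_rng(1)
n = 6
B = rng.standard_normal((n, n))
G = rng.standard_normal((n, n))
R = B @ B.T + n * np.eye(n)      # spd
S = G @ G.T                      # symmetric positive semi-definite
u = rng.standard_normal(n)
v = rng.standard_normal(n)

x, y = c_to_r(R, S, u, v)
z_ref = np.linalg.solve(R + 1j * S, u + 1j * v)
print(np.max(np.abs(x + 1j * y - z_ref)))
```

Note that the pair returned on convergence is consistent: `y` was obtained from the same `x` for which the residual was found to be small, so no separate solve of $Ry = v - Sx$ is needed.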
$^{16}$ If the first system in (3.25) is multiplied by $(-\alpha)$ and added to the second, then, after simple reordering, one obtains (3.35), i.e., $(R + \alpha S) y^l = (\alpha R - S) x^l + v - \alpha u$.

3.9.2 Efficient Preconditioning of the C-to-R Method

Let $C(\alpha) = R + \alpha S$ be a preconditioner for $C_\alpha$. To analyze the corresponding condition number, the generalized eigenvalue problem
$$\mu (R + \alpha S) x = C_\alpha x$$
is studied. With $H := R^{-1/2} S R^{-1/2}$, this is equivalent to
$$\mu (I + \alpha H) y = \left[ I - \alpha H + (1 + \alpha^2) H (I + \alpha H)^{-1} H \right] y, \tag{3.37}$$
where $y = R^{1/2} x$. Now let $\lambda$ be an eigenvalue of the second generalized eigenvalue problem
$$\lambda R x = S x \text{ with } x \ne 0 \iff \lambda y = H y \text{ with } y \ne 0.$$
By (3.37), it follows that
$$\mu = \frac{1 - \alpha \lambda + (1 + \alpha^2) \lambda^2 / (1 + \alpha \lambda)}{1 + \alpha \lambda},$$
or, equivalently,
$$\mu = \frac{1 + \lambda^2}{(1 + \alpha \lambda)^2}. \tag{3.38}$$
The parameter $\alpha$ should be chosen so that the spectral condition number $\mu_{\max} / \mu_{\min}$ is minimized.
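Theorem 3.9.1. Let $R$ be spd and $S$ be symmetric positive semi-definite. Then
$$\mu_{\min} = \begin{cases} \dfrac{1}{1 + \alpha^2} & \text{for } 0 \le \alpha \le \hat\lambda, \\[2mm] \dfrac{1 + \hat\lambda^2}{(1 + \alpha \hat\lambda)^2} & \text{for } \hat\lambda < \alpha, \end{cases} \qquad \mu_{\max} = \begin{cases} 1 & \text{for } \tilde\alpha \le \alpha, \\[2mm] \dfrac{1 + \hat\lambda^2}{(1 + \alpha \hat\lambda)^2} & \text{for } 0 \le \alpha < \tilde\alpha \end{cases}$$
holds for the extreme eigenvalues of $(R + \alpha S)^{-1} C_\alpha$, with $C_\alpha$ as in (3.33), where $\hat\lambda$ is the maximal eigenvalue of $R^{-1} S$ and
$$\tilde\alpha = \frac{\hat\lambda}{1 + \sqrt{1 + \hat\lambda^2}}.$$
The spectral condition number $\mu_{\max} / \mu_{\min}$ is minimized when $\alpha = \tilde\alpha$; then it takes on the value
$$\frac{\mu_{\max}}{\mu_{\min}} = \frac{2 \sqrt{1 + \hat\lambda^2}}{1 + \sqrt{1 + \hat\lambda^2}}.$$

Proof: The limits for the extreme eigenvalues follow from (3.38) by elementary calculation for $0 \le \lambda \le \hat\lambda$. Similarly, it can be shown that $\mu_{\max} / \mu_{\min}$ is minimized by some $\alpha$ in the interval $\tilde\alpha \le \alpha \le \hat\lambda$ with $\mu_{\max} = 1$. Consequently, it is minimized for $\alpha = \arg\max_{\tilde\alpha \le \alpha} (1 + \alpha^2)^{-1}$, i.e., for $\alpha = \tilde\alpha$.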
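The eigenvalue relation behind the theorem can be checked numerically. Simplifying the expression for $\mu$ obtained from the generalized eigenvalue problem gives $\mu = (1 + \lambda^2) / (1 + \alpha \lambda)^2$, and the sketch below compares this with the computed spectrum of $(R + \alpha S)^{-1} C_\alpha$ at $\alpha = \tilde\alpha$. The random test matrices and the helper names `mu` and `precond_spectrum` are assumptions for illustration. For a matrix with finitely many eigenvalues $\lambda$, the value $1/(1 + \alpha^2)$ is in general only a lower bound for $\mu_{\min}$, so the condition number is compared with the bound from the theorem rather than tested for equality.

```python
import numpy as np

def mu(lam, alpha):
    """Eigenvalue of the preconditioned operator as a function of lambda."""
    return (1 + lam**2) / (1 + alpha * lam) ** 2

rng = np.random.default_rng(2)
n = 6
B = rng.standard_normal((n, n))
G = rng.standard_normal((n, n))
R = B @ B.T + n * np.eye(n)       # spd
S = G @ G.T                       # symmetric positive semi-definite

# Eigenvalues lambda of R^{-1} S and the optimal parameter alpha_tilde.
lam = np.sort(np.linalg.eigvals(np.linalg.solve(R, S)).real)
lam_hat = lam[-1]
alpha_tilde = lam_hat / (1 + np.sqrt(1 + lam_hat**2))

def precond_spectrum(alpha):
    """Sorted eigenvalues of (R + alpha S)^{-1} C_alpha."""
    C = R + alpha * S
    C_alpha = R - alpha * S + (1 + alpha**2) * S @ np.linalg.solve(C, S)
    return np.sort(np.linalg.eigvals(np.linalg.solve(C, C_alpha)).real)

spec = precond_spectrum(alpha_tilde)
kappa = spec[-1] / spec[0]
bound = 2 * np.sqrt(1 + lam_hat**2) / (1 + np.sqrt(1 + lam_hat**2))
print(kappa, bound)
```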
Remark 3.9.1. For an arbitrary $\alpha$ satisfying $\tilde\alpha \le \alpha \le 1$, the condition number is bounded by 2, as was shown above. In practice, $\hat\lambda$ is often large, and $\tilde\alpha = 1 - 1/\hat\lambda + O(1/\hat\lambda^2)$ for $\hat\lambda \to \infty$ [11], [10]. For this reason, $\alpha = 1$ is chosen if $\hat\lambda$ is unknown. In this case, the smallest eigenvalue is given by $\tfrac{1}{2}$, the largest by 1, and the condition number takes on the value 2.

As shown in [11], [10], the C-to-R method can be extended to a two-parametric method. However, the computational effort for one iteration step of the two-parametric C-to-R method equals the effort of two iteration steps of the one-parametric C-to-R method with the Chebyshev iteration. Moreover, because of the optimality property of the Chebyshev iteration, the one-parametric C-to-R method with the Chebyshev iteration converges at least as fast as the two-parametric C-to-R method. For this reason, the two-parametric C-to-R method offers no further advantages compared with the one-parametric C-to-R method with the Chebyshev or cg iteration.
3.9.3 C-to-R Method and Electro-Quasistatics

In electro-quasistatics, the complex linear system (2.14) results from discretization with the Finite Integration Technique.

If the vector of unknowns is written as $\zeta + i\eta$ and the right-hand side as $u + iv$, the correspondence to system (3.24) becomes obvious. The block matrices $A_\kappa = S D_\kappa S^T$ and $A_\varepsilon = S D_\varepsilon S^T$ are both symmetric and positive definite (spd). Thus the C-to-R method can be applied to the system (2.14) of electro-quasistatics. In an unpublished diploma thesis [274], supervised by Axelsson, van den Meijdenberg applies the Finite Element method to the electro-quasistatic equations. She ends up with a complex linear system of the same block structure:
$$(A + i\omega B)(\zeta + i\eta) = x + iy.$$
She uses the C-to-R method by Axelsson to solve this system in the two-dimensional case. The essential results of van den Meijdenberg can be briefly summarized as follows:

- She uses the C-to-R method with the Chebyshev iteration. To determine $y^l$, the preconditioned cg algorithm is used as an inner iteration. An incomplete LU decomposition serves as a preconditioner. Van den Meijdenberg scales the ill-conditioned matrices $A$ and $B$ by a diagonal scaling matrix such that the resulting diagonal entries of the scaled matrix are close to 1.
- Several two-dimensional problems with 700 to 2000 unknowns were studied. For the electro-quasistatic examples$^{17}$, the condition number of the matrix could be significantly improved by scaling; thus she reduces the cost of the outer iterations and performs that computation with lower accuracy than the accuracy possible and meaningful for the inner cg iterations. In some examples, convergence to the correct solution could only be reached after scaling.
- The values of the conductivity of the sheet vary by powers of ten in the range between $8 \cdot 10^{-12}$ and $8 \cdot 10^{-7}$. From $8 \cdot 10^{-9}$ on, the achievable accuracy of the iterated solution went down and, for $8 \cdot 10^{-7}$, no solution could be found any more. This reflects the fact that the diagonal elements of the system matrix must not be too large compared with the other elements if they are to be balanced by the scaling.
- The Gauss algorithm, which is still applicable to the dimensions of the examples, was three times faster than the C-to-R method for some of the examples. For the example with about 2000 unknowns, it became comparable to the C-to-R method.
- The CGS algorithm as well as the GCG-LS method converged for none of the electro-quasistatic examples, while the C-to-R method with scaling converged in all cases.

The C-to-R method was also implemented in the FIT system of electro-quasistatics [129]. The parameter studies of van den Meijdenberg have also been reproduced with Krylov subspace methods for some relatively simple examples. Some of the results are described in subsection 3.10. Altogether, a very similar behaviour was observed.
$^{17}$ Van den Meijdenberg refers to this case as electrostatics, but in fact the equations are those of electro-quasistatics. The other problems, which she calls thermal-magnetic, are usually referred to as eddy current problems.

3.10 Convergence Studies for Selected Solution Methods

Some of the previously introduced direct, stationary, and non-stationary solution methods as well as a special multigrid algorithm have been implemented for several field-theoretical problem types. Some convergence studies from numerical experiments will be described here for the iterative methods. Besides purely academic examples, which are still relatively close to the usual model problems in theoretical convergence studies, some results are presented for realistic applications. One of the questions studied was to what extent theoretical results about convergence properties apply to practical problems. These practical problems are often very large and possess geometrical singularities. All examples have been discretized with the Finite Integration Technique (FIT) [296], [305] in the program packages MAFIA [65] or URMEL-I [287]. Unless otherwise stipulated, the relative residual is taken as the stopping criterion:
$$\|r_i\| / \|r_0\| \le \delta.$$
3.10.1 Real Symmetric Positive Definite Matrices

In [159], the implementation of the FIT equations for electro- and magnetostatics is described for three-dimensional grids, in [25] for two-dimensional Cartesian grids ($(r, z)$ or $(x, y)$; cylindrical problems or Cartesian problems which are invariant in one coordinate direction), and in [72] for three-dimensional circular cylindrical grids as well as with open boundary conditions. In [292], an improved algorithm is introduced for nonlinear magnetostatics, and its application to nonlinear electrostatic problems is described. The FIT equations have also been implemented for stationary current problems and stationary temperature problems (see also [23], [283], [288]). The system matrices of electro- and magnetostatics, stationary current problems, and stationary temperature problems are real symmetric and positive definite. Therefore these linear systems can be solved with the Gauss-Seidel, the SOR, and the cg method. For the cg method, in turn, several different preconditioners can be applied. In extensive convergence studies, the standard IC preconditioning, the modification by Gustafsson [109] called MIC$_\eta$, and several iterative preconditioners have been compared (see also [202]). The results of these convergence studies are briefly summarized in the following.
Simple Model Problem. As a simple model problem, a unit cube with Dirichlet boundaries was chosen, driven by a conducting loop (cf. [202]). The example was discretized on an equidistant Cartesian grid with $m$ points in each coordinate direction (step size $h = 1/(m-1)$, total number of grid points $N = m^3$). The side length of the conducting loop is 0.1 m. The exciting direct current has a strength of 10 A. Figure 3.13 shows the geometry of the example.
Figure 3.13. Simple model problem for statics. Unit cube driven by a current loop with side length 0.1 m and direct current 10 A.
In statics, the matrix $A$ has dimension $n = N$, the diagonal elements $\alpha_i$ each have the value $6/h^2$, and the entries in the side bands $\beta_i$, $\gamma_i$, and $\delta_i$ each have the value $1/h^2$ (before implementing the boundary conditions). The matrix $A$ corresponds to the Finite Difference matrix for Poisson's equation on the same domain. The eigenvalues and eigenvectors of this matrix are well known: The eigenvectors $e_l$ correspond to the eigenvalues

$$\lambda_l = \frac{4}{h^2}\left(\sin^2\left(ih\frac{\pi}{2}\right) + \sin^2\left(jh\frac{\pi}{2}\right) + \sin^2\left(kh\frac{\pi}{2}\right)\right), \quad 1 \le i,j,k \le m, \quad l = 1, \dots, N.$$

The extreme eigenvalues correspond to $i = j = k = m$ and $i = j = k = 1$:

$$\lambda_{max} = \frac{12}{h^2}\cos^2\left(h\frac{\pi}{2}\right) \quad \text{and} \quad \lambda_{min} = \frac{12}{h^2}\sin^2\left(h\frac{\pi}{2}\right).$$

As in the two-dimensional case [114], the condition number is equal to

$$\kappa(A) = \frac{\lambda_{max}}{\lambda_{min}} = \cot^2\left(h\frac{\pi}{2}\right).$$

It grows quadratically when the step size $h$ decreases. The eigenvalues of the Jacobi matrix, which are decisive for the convergence of the Jacobi, Gauss-Seidel, and SOR methods (cf. theorem 3.2.2), are as follows:

$$\lambda_l = \frac{1}{3}\left(\cos(ih\pi) + \cos(jh\pi) + \cos(kh\pi)\right), \quad 1 \le i,j,k \le m, \quad l = 1, \dots, N.$$

The eigenvectors of the Jacobi matrix are the eigenvectors $e_l$, $l = 1, \dots, N$, of the matrix $A$. The spectral radius of the Jacobi matrix in the two- and three-dimensional case is given by

$$\rho(M_{Jac}) = \cos(h\pi).$$
Thus, for the classical iteration methods in the three-dimensional case, the effort is at most of order $O(N^{1.66})$ for the Jacobi and Gauss-Seidel methods and of order $O(N^{1.33})$ for SOR with the optimal relaxation parameter $\omega$, which is given by (3.1) in theorem 3.2.2. The estimate (3.5) from lemma 3.4.1 of the effort for the error reduction in the cg method also holds for the preconditioned cg method. In this case, however, the matrix $A$ has to be replaced by the matrix $A'$ of the transformed system (see subsection 3.8). For left-handed preconditioning with the MIC$_\eta$, ILU$_\omega$, or SSOR method, the following relation holds, which has been shown by Gustafsson [109] and Axelsson [7] for the two- and three-dimensional case:

$$\kappa(A') = O(h^{-1}). \qquad (3.39)$$

In the three-dimensional case, $h^{-1} \propto N^{1/3}$ and

$$k = O(N^{1/6}),$$

which follows from (3.5) for the number of necessary steps. For the dependence of the total number of operations on the step size $h$ or on the number of grid points $N$, $O(N^{1.33})$ follows if no preconditioning or the Jacobi, SGS, IC(0), or IC(3) preconditioning is used, and $O(N^{1.17})$ if SSOR with optimal $\omega$, MIC$_\eta$(0), MIC$_\eta$(3), ILU$_\omega$(0), or ILU$_\omega$(3) is used as a preconditioner.
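The quoted orders of effort can be retraced with a short calculation. It is only a sketch and rests on standard assumptions which are presumably what theorem 3.2.2 and estimate (3.5) express: the number of steps of the classical methods scales like $1/(1-\rho)$ of the respective iteration matrix, the number of cg steps scales like the square root of the condition number, each step costs $O(N)$ operations, and the MIC$_\eta$, ILU$_\omega$, and SSOR preconditioners reduce the condition number to $O(h^{-1})$.

```latex
\begin{align*}
&\text{Grid: } h^{-1} \propto N^{1/3}, \qquad \text{cost per iteration} \propto N. \\
&\text{Jacobi, Gauss-Seidel: } k \propto \frac{1}{1-\rho} = O(h^{-2}) = O(N^{2/3})
  &&\Rightarrow\ \text{effort } O(N^{5/3}) \approx O(N^{1.66}), \\
&\text{SOR with } \omega_{opt}\text{: } k = O(h^{-1}) = O(N^{1/3})
  &&\Rightarrow\ \text{effort } O(N^{4/3}) \approx O(N^{1.33}), \\
&\text{cg, } \kappa(A) = O(h^{-2})\text{: } k = O\bigl(\sqrt{\kappa(A)}\bigr) = O(N^{1/3})
  &&\Rightarrow\ \text{effort } O(N^{4/3}) \approx O(N^{1.33}), \\
&\text{cg, } \kappa(A') = O(h^{-1})\text{: } k = O\bigl(\sqrt{\kappa(A')}\bigr) = O(N^{1/6})
  &&\Rightarrow\ \text{effort } O(N^{7/6}) \approx O(N^{1.17}).
\end{align*}
```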
Figure 3.14. Residual curve for different iterative methods in case of the simple model problem from statics. Graphs from the graduation paper of Pinder [202]
Figure 3.14 shows comparisons of different iteration methods with and without preconditioning on five differently refined grids for the model problem. Table 3.2 gives the numbers of grid points used and the resulting values for the condition number of $A$, the spectral radius of the Jacobi matrix, and the optimal relaxation parameter $\omega_{opt}$. In Fig. 3.14, the number of iterations is displayed on a doubly logarithmic scale as a function of the number of unknowns (= grid points). The iterations were stopped as soon as the actual residual $r_k^{true}$ (which should not be confused with the recursively calculated residual $r_k$ from the cg method) satisfied the stopping condition.
I,J,K   h         N = dim(A)      κ(A)           ρ(M_Jac)   ω_opt
11      0.10000   1.33100·10^3    3.9860·10^1    0.95106    1.5279
21      0.05000   9.26100·10^3    1.6145·10^2    0.98769    1.7295
31      0.03333   2.97910·10^4    3.6409·10^2    0.99452    1.8107
41      0.02500   6.89210·10^4    6.4779·10^2    0.99692    1.8545
53      0.01923   1.48877·10^5    1.0952·10^3    0.99818    1.8861

Table 3.2. Step size, number of grid points, condition of the system matrix, spectral radius of the Jacobi matrix, and optimal value for the SOR relaxation parameter of the five examples computed for the model problem.
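The entries of Table 3.2 can be retraced from the step size alone. The following sketch assumes that the condition number is the quotient of the two extreme eigenvalues quoted in the text, i.e. cot²(hπ/2), that the spectral radius of the Jacobi matrix is cos(hπ), and that (3.1) is Young's classical formula ω_opt = 2/(1 + sqrt(1 - ρ²)); the function name is hypothetical.

```python
import math

def model_problem_quantities(m):
    """Return (h, N, kappa, rho, omega_opt) for m grid points per direction."""
    h = 1.0 / (m - 1)
    n = m ** 3
    kappa = 1.0 / math.tan(h * math.pi / 2) ** 2          # cot^2(h*pi/2)
    rho = math.cos(h * math.pi)                           # Jacobi spectral radius
    omega_opt = 2.0 / (1.0 + math.sqrt(1.0 - rho ** 2))   # assumed form of (3.1)
    return h, n, kappa, rho, omega_opt

for m in (11, 21, 31, 41, 53):
    h, n, kappa, rho, omega = model_problem_quantities(m)
    print(f"{m:3d} {h:.5f} {n:7d} {kappa:10.4e} {rho:.5f} {omega:.4f}")
```

The values obtained this way agree with the table to within the printed accuracy, apart from small deviations in the last digit of the condition number on the coarsest grid.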
The curves show the expected linear dependence very well. The determined average gradients $m$ are listed in Table 3.3.

Method           m
Gauss-Seidel     0.59
SOR, ω_opt       0.34
cg, cg-Jacobi    0.34
cg-SGS           0.30
ICCG(0)          0.30
ICCG(3)          0.30

Table 3.3. Average gradients determined for different preconditioned methods. Both cg without preconditioning and cg with Jacobi preconditioning have the same gradient, since these methods are identical for the studied example, as the diagonal elements of $A$ all have the constant value $6/h^2$.
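A gradient of this kind can be obtained, for instance, by a least-squares fit of log(iterations) against log(N). Whether the averages in Table 3.3 were determined in exactly this way is not stated, so the fit below is an assumption; the iteration counts are synthetic values constructed to follow k = 3·N^(1/3) and are not measured data.

```python
import math

def loglog_slope(ns, iterations):
    """Least-squares slope of log(iterations) versus log(N)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(k) for k in iterations]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Grid sizes N = m^3 of the five examples (m = 11, 21, 31, 41, 53).
ns = [1331, 9261, 29791, 68921, 148877]
# Synthetic iteration counts k = 3 * m = 3 * N^(1/3) (illustration only).
iters = [33, 63, 93, 123, 159]
print(loglog_slope(ns, iters))
```

With a fitted gradient m and an effort per iteration proportional to N, the total effort scales like N^(1+m).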
The effort per iteration is $\propto N$ for all methods, so that the total effort is $\propto N^{1+m}$. The theoretical dependencies given above for the computational effort as functions of the grid resolution were thus confirmed within the bounds of the expected accuracy. According to (3.39), the cg algorithm preconditioned with MIC$_\eta$, ILU$_\omega$, or SSOR should possess a remarkably better convergence speed than the other methods. This indeed becomes obvious in Fig. 3.15. Note that the SOR method with optimal $\omega$ is nearly as fast as the cg method with SGS preconditioning. The incomplete decomposition in its version with three additional diagonals seems to be advantageous only for the IC decomposition. In the other cases, it does not show essential advantages because the lower number of iteration steps is more or less compensated by the greater computational effort. As expected, the preconditioned cg methods are without any doubt the optimal methods for the system matrix of static problems.

Figure 3.15. Computational times of the studied iteration methods for the statics model problem. (Note: SOR with optimal $\omega$.)

All studied preconditioners have one free parameter, which has to be chosen appropriately. Figure 3.16 shows the number of necessary iterations for the cg method with SSOR preconditioning as a function of the relaxation parameter $\omega$ for the model problem with 31 x 31 x 31 grid points (denoted by 'Ex. 1.3' in the figure) and with 41 x 41 x 41 grid points ('Ex. 1.4'). After passing the minima, the curves show a steep increase, so that, in contrast to SOR, it is better to choose the parameter too small than too large. For the example with the larger number of grid points, the gradients are somewhat steeper. That is, with a growing number of grid points, deviations of $\omega$ from the optimal value $\omega_{opt}$ affect the rate of convergence more strongly.

Figure 3.16. Number of iterations as a function of the relaxation parameter $\omega$ for the cg method with SSOR preconditioning in case of the statics model problem with 29791 ('Ex. 1.3') and 68921 grid points ('Ex. 1.4'). Graph from the graduation paper of Pinder [202]
Figure 3.17 shows the influence of the parameter $\eta$ on the rate of convergence for the MIC$_\eta$-preconditioned cg method. The closeness of the optimal parameter to zero can be justified. For the examples 1.3 and 1.4, the general choice of $\eta = 0.01$ in the computations above is nearly optimal. The detailed curves show a very steep gradient between $\eta = 0$ and the minimum. If in doubt, one should prefer $\eta = 0$ to $\eta = 0.2$.
Figure 3.17. Number of iterations as a function of the parameter $\eta$ for the cg method with the MIC$_\eta$ preconditioning in case of the statics model problem with 29791 ('Ex. 1.3') and 68921 grid points ('Ex. 1.4'). The two upper graphs belong to the MIC$_\eta$CG(0) method; Fig. (b) shows a zoomed part of (a). The two lower graphs belong to the MIC$_\eta$CG(3) method; Fig. (d) shows a zoomed part of (c). Graphs from the graduation paper of Pinder [202]
The influence of the parameter $\omega$ in the ILU$_\omega$ preconditioning of the cg method is in principle similar to that of the parameter $\eta$ in the MIC$_\eta$ preconditioning. This is not too astonishing, since the ILU$_{\omega=-1}$CG method is the same as the MIC$_{\eta=0}$CG method$^{18}$. Consequently, Fig. 3.18 shows that the optimal $\omega$ is close to $-1$. The curves for the ILU$_\omega$CG(3) method are obviously flatter than those of the ILU$_\omega$CG(0) method, and their minima span a somewhat larger interval. Thus, the ILU$_\omega$CG(3) method seems to be less sensitive to the choice of $\omega$ than the ILU$_\omega$CG(0) method.

All the studies above dealt with a relatively well-conditioned model problem. It is well known that realistic applications usually have much worse condition numbers because of re-entrant corners, non-homogeneous material distributions, and non-homogeneous grids. The next example shows possible differences in the convergence behaviour, even though it is still a relatively simple example.

18 The ILU$_{\omega=0}$CG method is the same as the ICCG method.
Figure 3.18. Number of iterations as a function of the parameter $\omega$ for the cg method with the ILU$_\omega$ preconditioning in case of the statics model problem with 29791 ('Ex. 1.3') and 68921 grid points ('Ex. 1.4'). The two upper graphs belong to the ILU$_\omega$CG(0) method; Fig. (b) shows a zoomed part of (a). The two lower graphs belong to the ILU$_\omega$CG(3) method; Fig. (d) shows a zoomed part of (c). Graphs from the graduation paper of Pinder [202]
C-Magnet. As the first example of a realistic application, a simple C-magnet driven by two coils was studied (cf. [202]). The material of the magnet is assumed to have the constant permeability $\mu_r = 398$. The air gap of the magnet is of highest interest; this is why that area is discretized especially finely, while the step size may increase more and more towards the outer boundaries of the computational domain. The magnet has two symmetry planes: $y = 0$ and $z = 0$. This is exploited in the numerical field calculation, and only the quarter of the magnet shown in Fig. 3.19 is discretized. At the symmetry plane $z = 0$, the Neumann boundary condition is chosen, since the magnetic field lines are parallel to the boundary there. At $y = 0$, the Dirichlet boundary condition is chosen, since here the field lines are perpendicular to the symmetry plane. At all other boundaries, the open boundary condition is used. In the computation discussed, $N = 98600$ grid points were used, and neighbouring step sizes differ by at most a factor of 2.5. Figure 3.20 shows a cut through the grid.

Figure 3.19. Quarter of a C-magnet. For symmetry reasons it is sufficient to compute only a quarter of the structure. The use of the open boundary condition allows us to place the borders of the computational domain, i.e., those boundaries which contain no symmetry planes, relatively close to the C-magnet. The outer frame in the picture displays the borders of the discretized computational domain.

Figure 3.20. Grid (with $N = 250965$ points) in the (y, z)-plane: zoom of the coil area of the C-magnet.

From the numerical point of view, this example, which is also relevant in practice, is interesting since it leads to a matrix with entries of different orders of magnitude, caused by the strongly varying step size and by the non-homogeneous material filling. The condition of this matrix is much more realistic than that of a simple model problem. Figures 3.21 and 3.22-3.24 display the residual curves as functions of CPU time in a double logarithmic representation. Shown is the relative residual, i.e., the quotient

$$\frac{\|r_k^{true}\|_\infty}{\|r_0^{true}\|_\infty}$$

of the true residual $r_k^{true}$ in the $k$-th iteration step (not the recursively computed residual $r_k$) and the true initial residual $r_0^{true}$. The initial approximation was always chosen to be zero, so that the initial residual $r_0^{true}$ equals the right-hand side $b$ of the system. Note that in Fig. 3.21 the CPU time is given in minutes, while otherwise it is always given in seconds. The typical convergence behaviour described by Axelsson, cited in subsection 3.4.1, with its three different phases can also be observed in Fig. 3.21. Without preconditioning, the cg method is slower for this example than the SOR method with optimal relaxation parameter $\omega$: In order to reach a relative error of order $10^{-4}$, the cg method needs nearly 110 minutes, and SOR with optimal $\omega$ only about 15 minutes. Jumps in the convergence curve as, e.g., at $t = 85$ minutes are each caused by a restart with the last approximate solution.

Figure 3.21. Residual curve of the cg method without preconditioning in case of the C-magnet. Graph from the graduation paper of Pinder [202]
A comparison of Fig. 3.21 and Fig. 3.22 shows that the Jacobi preconditioning leads to a remarkable convergence acceleration in this example with non-homogeneous material distribution and non-homogeneous grid: The efficiency is increased by a factor of 7.5 by this scaling. The cg method with Jacobi preconditioning then becomes comparable to the SOR method with optimal $\omega$, but without the costly need to determine the optimal $\omega$. For all preconditioned cg methods, only a small decrease of the residual can be observed at first. Then, starting at a certain number of iterations, which is nearly the same for the different methods, the rate of convergence rapidly increases and stays at this level until the end of the iteration process. As for the model problem above, the SGS, IC(0), and IC(3) preconditionings do not prove optimal, and the percentage differences are now greater than before. The SSOR preconditioning, which together with the modified IC decomposition was most effective for the model problem, is now much less effective and only comparable with the IC(0) or IC(3) preconditioning. The MIC$_\eta$ and ILU$_\omega$ preconditionings are optimal: Compared with the model problem, the C-magnet problem is much worse conditioned, so that their versions with three additional bands lead to an obvious improvement.

Regarding the optimal relaxation parameter $\omega_{opt}$ of the SOR method always used above, it has to be noted that the effort for its evaluation is in practice out of all reasonable proportion to the solution effort for the linear system. $\omega_{opt}$ is given by (3.1) in theorem 3.2.2. The spectral radius of the Jacobi matrix, which is part of the formula, is usually determined by the von Mises iteration (power iteration). For the C-magnet and the initial approximation $(1, 1, \dots, 1)^T$, the curve for $\omega$ starts to stagnate only after about 100 minutes of CPU time. After more than three hours(!) of CPU time, small oscillations, which were present until then (cf. Fig. 25 in [202]), finally vanish, and the final values $\omega_{opt} = 1.9782$ and $\rho(M_{Jac}) = 0.999939$ result. Consequently, only estimated values are used in practice for $\omega$. Yet, the SOR method is very sensitive to any deviation of $\omega$ from $\omega_{opt}$, especially in case of underestimation. For the above example, the computation time nearly doubled in case of underestimation ($\omega = 1.9682$) and, for overestimation ($\omega = 1.9882$), it still increased by about a factor of 1.7 (cf. Fig. 26 in [202]). Another disadvantage of the SOR method lies in the relatively poor final accuracy (cf. Fig. 3.22). In comparison with Gauss-Seidel, the SOR method does not reach a similarly good smoothing of the error. Therefore, a restart with the last approximation and $\omega = 1$, i.e., the Gauss-Seidel method, can remarkably reduce the final residual within a few steps. In the above example, an improvement by nearly one order of magnitude was reached (cf. Fig. 27 in [202]). The additional effort is relatively small compared to the total effort. Summarizing all reasons described above, it can be stated that, in practice, the SOR method is not of interest as a fast solution method. Also, it should be noted that the ILU$_\omega$CG(0) method is about six times and the ILU$_\omega$CG(3) method about 7.5 times as fast as the SOR method with the optimal parameter. For further grid refinements, these factors only grow: For $N = 250965$, e.g., they are about eight and ten, respectively. For an estimated parameter $\omega$, the factors are even larger.

Figure 3.22. Residual curves of the cg method with different preconditioners (Jacobi and SGS) and of the SOR method with optimal $\omega$ in case of the C-magnet. Graphs from the graduation paper of Pinder [202]

Figure 3.23. Residual curves of the cg method with different preconditioners (SSOR, IC(0), IC(3)) in case of the C-magnet. Graphs from the graduation paper of Pinder [202]

Figure 3.24. Residual curves of the cg method with different preconditioners (MIC$_\eta$(0), MIC$_\eta$(3), ILU$_\omega$(0), ILU$_\omega$(3)) in case of the C-magnet. Graphs from the graduation paper of Pinder [202]

One important question encountered in practice is if and when restarts should be done in the preconditioned cg method. They are necessary because of rounding errors, which let the recursively computed residual differ more and more from the true residual, finally leading to the loss of conjugacy of the search directions. In addition, rounding errors are responsible for possible divisions by very small numbers in the course of the computation of $\beta_k$ in the cg algorithm 3.4.1.1. A restart with the last approximation is a suitable measure at the moment when the norm of the recursively computed residual becomes smaller than the normalized difference between the recursive and the true residual.
Figure 3.25 illustrates the effect of this measure on the cg method with Jacobi preconditioning: By several restarts, the residual can be improved by at least one order of magnitude. Figure 3.26 displays, for the example above, the true and the recursive residual vectors, which are part of the restart criterion, as well as their difference, starting from the 220th iteration step. As soon as the restarts follow each other in rapid succession, the algorithm should be stopped.
Figure 3.25. Effect of restarts on the cg method with Jacobi preconditioning. Graph from the graduation paper of Pinder [202]

Figure 3.26. True residual $r_k^{true}$ and recursive residual $r_k$ as well as their differences in the restart phases (from the 220th iteration step on) in case of the example from Fig. 3.25. Graph from the graduation paper of Pinder [202]
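The restart mechanism can be sketched as follows. This is a hypothetical illustration, not the implementation behind Figs. 3.25 and 3.26: the cg method is shown without preconditioning, the maximum norm is used, and, since the normalization of the difference is not specified here, it is replaced by a free factor `gap_factor`, which is purely an assumption.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm_inf(v):
    return max(abs(c) for c in v)

def cg_with_restarts(A, b, delta=1e-10, max_iter=200, gap_factor=1.0):
    """Unpreconditioned cg (x0 = 0) that monitors the true residual.

    gap_factor is a hypothetical stand-in for the normalization in the
    restart criterion. Returns (x, iterations, number_of_restarts).
    """
    n = len(b)
    x = [0.0] * n
    b_norm = norm_inf(b)
    if b_norm == 0.0:
        return x, 0, 0
    r = list(b)            # recursively updated residual, r0 = b
    p = list(r)            # search direction
    restarts = 0
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        rr = dot(r, r)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        # True residual: costs one extra matrix-vector product per step.
        r_true = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        if norm_inf(r_true) <= delta * b_norm:
            return x, k, restarts
        gap = norm_inf([ri - ti for ri, ti in zip(r, r_true)])
        if norm_inf(r) < gap_factor * gap:
            # Restart with the last approximation: discard the recursion.
            r = r_true
            p = list(r)
            restarts += 1
        else:
            beta = dot(r, r) / rr
            p = [ri + beta * pi for ri, pi in zip(r, p)]
    return x, max_iter, restarts

# Small hypothetical SPD test system tridiag(-1, 2, -1).
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
      for j in range(6)] for i in range(6)]
b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x, iterations, restarts = cg_with_restarts(A, b)
print(iterations, restarts)
```

For such a small, well-conditioned system no restart is expected to be triggered; the criterion only becomes active once rounding errors have separated the two residuals. In a production code the true residual would be evaluated only every few steps to limit the additional cost.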
For the model problem, the influence of the parameters had also been extensively studied. The qualitative relation is quite similar in case of the C-magnet, so no figures for the C-magnet are shown here; they can be found in [202]. For the SSOR preconditioning, the interval around the minimum gets larger compared to the one in Fig. 3.16, and thus the acceptable interval for $\omega$ gets larger. For the MIC$_\eta$ preconditioning, the interval around the minimum now ends much earlier, so that the optimal $\eta$ has to be chosen correspondingly smaller. For MIC$_{\eta \approx 0.002}$CG(3), the lowest number of iterations (about 30) is achieved. This has to be compared with a minimum of 16 iterations for the model problem with 68921 grid points and $\eta \approx 0.004$ (cf. Fig. 3.17). For the ILU$_\omega$ preconditioning, the gradient of the curve just outside of the interval around the minimum is much steeper than the one in the model problem (cf. Fig. 3.18), especially for the ILU(0) version. The absolute minimum of the number of iterations is reached for ILU$_{\omega \approx -0.99}$CG(3) (below 30 iterations); compare this to the minimum of only 14 iterations for the model problem with 68921 grid points and $\omega \in [-0.99, -0.935]$. If one demands at least the same efficiency as for the SSOR preconditioning with optimal $\omega$, then this requirement is satisfied by the ILU$_\omega$CG(0) method for $\omega \in [-0.9995, -0.90]$ and by the ILU$_\omega$CG(3) method for $\omega \in [-0.9995, -0.40]$. Thus, both the MIC$_\eta$- and the ILU$_\omega$-preconditioned cg methods proved to be most efficient. In practice, the ILU$_\omega$ method should be preferred as a preconditioner because of the simpler choice of parameters; even for $\omega = 0$, this gives the still acceptable ICCG method. The comparison with the C-magnet also makes the differences from the convergence behaviour of the model problem obvious. An algorithm that is one of the fastest in one case can be much less efficient for another problem (e.g., the cg method with SSOR preconditioning). All computations described above were done on a 486 DX2-66 PC in single precision. Only the scalar products occurring in the cg method were carried out in double precision [202]. Computed equipotential lines and field distributions are shown in the application part (subsection 4.2.1) in Fig. 4.12 and Fig. 4.11.

Current Sensor. Another example from magnetostatics is the field computation of a current sensor driven by a coil fed by the current $I = 100$ A. The material of the sensor is assumed to have the relative permeability $\mu_r = 500$. Figure 3.27 shows the arrangement with the grid for the sensor. According to its circular geometry, the sensor is discretized on a circular cylindrical grid, exploiting its plane of symmetry.
Figure 3.27. Circular current sensor with excitation from coil. One half of the symmetric structure is shown.
The discretization on a circular cylindrical grid leads to some special characteristics, which also influence the solution of the resulting linear system. For an azimuthally closed grid, the planes
Figure 3.28. Detail close to the axis of a circular cylindrical FIT-grid.
Here is a simple example that includes both special cases discussed above. The system matrix has the structure shown in Fig. 3.29. The crosses denote non-zero elements on the bands, the stars denote additional elements appearing in the special cases. The rows denoted by 'd' belong to the multiple grid points on the axis with
Figure 3.29. Example of a system matrix belonging to a circular cylindrical FIT grid which includes the axis and is azimuthally closed.

Figure 3.30 displays the residual curves for the cg method with modified ILU$_\omega$ preconditioning and for the SOR method with optimal relaxation parameter. In this example, the cg method hardly shows any advantage as far as the rate of convergence is concerned, but its independence of any parameters has to be kept in mind ($\omega$ in the preconditioner can always be set to zero). In contrast, the optimal relaxation parameter for the SOR method, which very strongly influences the rate of convergence, as was shown above, has to be determined in an additional first step. Often, the corresponding effort is very high compared to the solution of the linear system itself, so that the relaxation parameter is only approximated in practice. Consequently, the convergence of the SOR method under realistic conditions slows down noticeably further.
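The additional first step for the SOR method, i.e. estimating the spectral radius of the Jacobi matrix by a power (von Mises) iteration and deriving the relaxation parameter from it, can be sketched as follows. The sketch makes several assumptions: the test matrix is the small one-dimensional matrix tridiag(-1, 2, -1) rather than a FIT matrix, the relaxation parameter is computed with Young's classical formula ω_opt = 2/(1 + sqrt(1 - ρ²)), and all function names are hypothetical. Because the Jacobi matrix of this test problem has eigenvalues in pairs of opposite sign, the iteration is applied to its square.

```python
import math

def jacobi_apply_poisson_1d(x):
    """Apply the Jacobi matrix M = I - D^{-1} A of A = tridiag(-1, 2, -1)."""
    n = len(x)
    return [0.5 * ((x[i - 1] if i > 0 else 0.0) +
                   (x[i + 1] if i < n - 1 else 0.0)) for i in range(n)]

def spectral_radius_power_iteration(apply_m, n, steps=200):
    """Estimate rho(M) by a power iteration on M^2, started with (1,...,1)^T."""
    x = [1.0] * n
    rho = 0.0
    for _ in range(steps):
        y = apply_m(apply_m(x))                    # one step with M^2
        norm_x = math.sqrt(sum(c * c for c in x))
        norm_y = math.sqrt(sum(c * c for c in y))
        rho = math.sqrt(norm_y / norm_x)           # ||M^2 x|| / ||x|| -> rho^2
        x = [c / norm_y for c in y]                # normalize against overflow
    return rho

def omega_opt(rho):
    """Young's formula for the optimal SOR parameter (assumed form of (3.1))."""
    return 2.0 / (1.0 + math.sqrt(1.0 - rho ** 2))

rho = spectral_radius_power_iteration(jacobi_apply_poisson_1d, 10)
print(rho, omega_opt(rho))
```

The convergence of this iteration is governed by the ratio of the two largest eigenvalue moduli; for fine grids this ratio approaches one, which is consistent with the long run times reported above.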
Figure 3.30. Convergence history for the circular current sensor. The relative residuals of the cg method with modified ILU$_\omega$ preconditioning and of the SOR method with optimal relaxation parameter, determined by the Mises iteration, are displayed as functions of the CPU time needed. Graph from the diploma thesis of Pinder [201]

3.10.2 Complex Symmetric Positive Stable Matrices

Now that it has become obvious that, even for real symmetric positive definite matrices, the theoretical convergence results can only inadequately be transferred to the solution of realistic problems, some numerical results obtained by solution methods for complex symmetric matrices are presented. The FIT equations for electro-quasistatics have been implemented for three-dimensional Cartesian coordinate grids (cf. [281], [282]). The resulting system matrices are complex symmetric positive stable (cf. subsection 2.4). These systems of equations cannot be solved with the Gauss-Seidel, SOR, or cg method. Theoretically, the SSOR method could be used, but the modern variants of Krylov subspace methods should be preferred. The COCG method was implemented first, since it had already proved its suitability in a FIT implementation for time-harmonic problems [116]. After reaching satisfactory results with the COCG method even without preconditioning, several more recent Krylov subspace methods, e.g., the TFQMR method [281], [282], [58] and one algorithm of the SCBiCG($\Gamma$, n) class described by Clemens [60], were implemented. All of these methods were studied for several simple problems and for realistic applications as well.
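The basic idea of the COCG method can be sketched as follows: the recursion is that of the cg method, but all inner products are taken without complex conjugation, which matches the symmetry $A = A^T$ of the matrix. The code is a minimal illustration without preconditioning and without safeguards against breakdown; the dense storage, the function names, and the small test matrix are assumptions and do not stem from the implementations cited in the text.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def bilinear(u, v):
    """Unconjugated product u^T v (no complex conjugation)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm2(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def cocg(A, b, delta=1e-10, max_iter=200):
    """COCG for complex symmetric A (A == A^T), start vector x0 = 0."""
    n = len(b)
    x = [0j] * n
    b_norm = norm2(b)
    if b_norm == 0.0:
        return x, 0
    r = [complex(bi) for bi in b]     # r0 = b because x0 = 0
    p = list(r)
    rho = bilinear(r, r)
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rho / bilinear(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if norm2(r) <= delta * b_norm:
            return x, k
        rho_new = bilinear(r, r)
        p = [ri + (rho_new / rho) * pi for ri, pi in zip(r, p)]
        rho = rho_new
    return x, max_iter

# Hypothetical complex symmetric test matrix: tridiag(-1, 4 + 1j, -1).
n = 5
A = [[(4 + 1j) if i == j else (-1 if abs(i - j) == 1 else 0)
      for j in range(n)] for i in range(n)]
b = [1, 2, 3, 4, 5]
x, iterations = cocg(A, b)
print(iterations, max(abs(bi - yi) for bi, yi in zip(b, matvec(A, x))))
```

Since the unconjugated product is not a norm, the quantities `rho` and `bilinear(p, Ap)` can in principle vanish for a nonzero vector; a robust implementation would have to detect this case.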
Analytic Example. Our first example is a simple parallel plate capacitor with two layers of different material, displayed in Fig. 3.31. Let the given material constants be the electric conductivities $\kappa_1$, $\kappa_2$ and the relative permittivities $\varepsilon_{1,r}$, $\varepsilon_{2,r}$. The frequency of excitation is $\omega$. Let $h$ denote half of the height of the plate capacitor (see Fig. 3.31). For this simple example, the analytic solution can be easily determined: For $-h \le y < 0$, the complex potential $\underline{\varphi}$ is given by

and for $0 \le y \le h$ by

where $\lambda_1, \lambda_2 \in \mathbb{C}$ are given by $\lambda_1 = (\kappa_1, \omega\varepsilon_{1,r})$, $\lambda_2 = (\kappa_2, \omega\varepsilon_{2,r})$. The parameters in the studied example were chosen as follows: half height $h = 5$ cm, material parameters $(\varepsilon_{1,r}, \kappa_1) = (8.0, 2.0 \cdot 10^{-7}$ S/m) and $(\varepsilon_{2,r}, \kappa_2) = (4.0, 1.0 \cdot 10^{-12}$ S/m), voltage $V = 1$ V, and frequency $f = 50$ Hz. The dimensions are 4 cm x 10 cm x 1 cm. The system was discretized using a regular step size of 1 cm in all directions, i.e., with only 110 grid cells. Figure 3.32 shows a comparison between the electrostatic potential $\varphi_E$ and the real part of the electro-quasistatic potential $\mathrm{Re}(\underline{\varphi}_E)$, each computed with the Finite Integration Technique, as well as the real part of the analytically determined potential $\underline{\varphi}$. The agreement between the numerical and the analytical results is excellent. This example also makes perfectly clear that the electrostatic model would not be well suited and only the electro-quasistatic model works well for this kind of problem.

Figure 3.31. Simple parallel plate capacitor of height 2h with two layers of different material.
Figure 3.32. Electrostatic potential $\varphi_E$ and real part of the electro-quasistatic potential $\mathrm{Re}(\underline{\varphi}_E)$, each computed using the Finite Integration Technique, as well as the real part of the analytically determined potential $\underline{\varphi}$ for the simple parallel plate capacitor. The numerical results agree very well with the analytic solution, so that no difference can be found in the curves.
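Why the electrostatic model fails for this configuration can be made plausible with a short calculation. The sketch below does not reproduce the analytic potential of the text; it only evaluates how the applied voltage divides between the two equally thick layers, using the continuity of the normal component of the total current density, $\lambda_1 E_1 = \lambda_2 E_2$. It assumes the complex material parameter $\lambda = \kappa + i\omega\varepsilon_0\varepsilon_r$ (including the vacuum permittivity $\varepsilon_0$, which is an assumption about the convention); the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m

def admittivity(kappa, eps_r, f):
    """Complex material parameter kappa + i*omega*eps0*eps_r (assumed form)."""
    return complex(kappa, 2.0 * math.pi * f * EPS0 * eps_r)

def voltage_fraction_layer1(lam1, lam2):
    """Fraction of the total voltage across layer 1 for two equally thick
    layers in series, from lam1 * E1 = lam2 * E2 and (E1 + E2) * h = V."""
    return lam2 / (lam1 + lam2)

f = 50.0                                  # frequency in Hz
lam1 = admittivity(2.0e-7, 8.0, f)        # layer 1: eps_r = 8, kappa = 2e-7 S/m
lam2 = admittivity(1.0e-12, 4.0, f)       # layer 2: eps_r = 4, kappa = 1e-12 S/m

eqs = voltage_fraction_layer1(lam1, lam2)   # electro-quasistatic model
es = voltage_fraction_layer1(8.0, 4.0)      # electrostatic model: permittivities only
print(abs(eqs), es)
```

Under these assumptions, the electrostatic model assigns one third of the voltage to the weakly conducting layer 1, whereas in the electro-quasistatic model the conduction current dominates there and the voltage drop across this layer is much smaller.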
Simple Plate Capacitor. Figure 3.33 shows a typical convergence history for another simple plate capacitor. Its side length is 8 cm and its height is 3 cm. It is assumed that the dielectric material has the relative permittivity $\varepsilon_r = 3$ and the electric conductivity $\kappa = 10^{-6}$ S/m. Furthermore, a voltage gradient of 15 kV/cm is assumed at the frequency $f = 50$ Hz. The complex linear system to be solved has dimension 45177.
Figure 3.33. Convergence history for the simple plate capacitor with 45177 grid points and MIC(0) left-handed preconditioning with $\omega = -0.5$. The curves are plotted against the number of matrix-vector multiplications for (1) MIC(0)-CGS, (2) MIC(0)-CGS2, (3) MIC(0)-TFQMR, (4) MIC(0)-BiCGSTAB, and (5) MIC(0)-BiCGstab(2). The graph is courtesy of Clemens, TU Darmstadt [58]
In all cases, the MIC(O) preconditioning was used. Comparing the convergence curves, it has to be noted that only an upper bound for the true residual is given in case of the TFQMR method. In some cases this upper bound is much larger than the true residual. The CGS and CGS2 methods both produce very strong oscillations. The convergence of the CGS2 method is somewhat better than that of the CGS method. The BiCGSTAB and the BiCGstab2 method show a very similar convergence behaviour. Altogether, the convergence of the BiCGstab2 method is the fastest. It reaches a relative residual norm of 10- 8 after approximately 100 matrix-vector multiplications. The CGS method needs about 25% more matrix-vector multiplications to reach the same residual. Some parameter studies on the dependence of convergence on the actual value of the electric conductivity K, were carried out for this example with a grid of 42772 grid points shown in Fig. 3.35. The values of the electric conductivity K, of the layer of water varied by one magnitude each between 10- 9 Sim and 10- 6 S/m. Starting at about 10- 6 Slm, the convergence curve remarkably went down. This illustrates the fact that the diagonal elements of the problem matrix may not differ too much one from another in order to be balanced by the scaling provided in the Jacobi-preconditioned BiCGCR method. Contaminated Insulator. An important problem in high voltage engineering are phenomena caused by moisture layers or pollution layer on the insulator. In [208], there are experimental studies on the effects of surface contaminations with low conductivity on the aging process of cylindrical test insulators from epoxy resin loaded by alternating current. For numerical studies
Figure 3.34. Simple plate capacitor with layer of water. For symmetry reasons it is sufficient to discretize a quarter of the geometry.
Figure 3.35. Simple plate capacitor with layer of water. Grid in the (x, y)-plane.
one of these test specimens was chosen: The electrodes are each 6 mm thick and have radius 18 mm. The computed model is a 30 mm long solid piece of the originally [208] 100 mm long hollow cylindrical test specimen with radius 15 mm. The epoxy resin has relative permittivity $\varepsilon_r = 4$; the relative permittivity of water droplets is $\varepsilon_r = 81$ and their electrical conductivity may be assumed to be $\kappa = 10^{-6}$ S/m. The frequency $f$ of the alternating current is 50 Hz, and a voltage gradient of 5 kV/cm is applied. The size and form of the water droplets vary. Neglecting possible deformations which may be caused by the electric field, a rounded form with a typical diameter of 1-3 mm can be assumed. As is clearly visible in the photograph of Fig. 4.32 in subsection 4.5, the droplets are quite close to each other but randomly distributed. First, a constant radius of 3 mm was assumed for the water droplets. Next, in Fig. 4.38 in subsection 4.5, a simulation model with many different droplets distributed over the whole surface is shown on the left. On the right, Fig. 4.38
shows a model where some of the water droplets coalesced. In subsection 4.5, potentials and electric fields are displayed for all those examples.
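To get a feeling for the material data quoted above, the following minimal sketch compares the conduction term $\kappa$ with the displacement term $\omega\varepsilon = 2\pi f \varepsilon_0 \varepsilon_r$ at 50 Hz. It assumes the usual time-harmonic electro-quasistatic setting in which each material enters through the complex quantity $\kappa + i\omega\varepsilon$; the function name and the printed layout are illustrative only and do not come from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def conduction_to_displacement_ratio(kappa, eps_r, freq_hz):
    """Return kappa / (omega * eps0 * eps_r) for a material at one frequency."""
    omega = 2.0 * math.pi * freq_hz
    return kappa / (omega * EPS0 * eps_r)


if __name__ == "__main__":
    # Water data from the text: eps_r = 81, conductivity varied over decades.
    for kappa in (1e-9, 1e-8, 1e-7, 1e-6):
        ratio = conduction_to_displacement_ratio(kappa, 81.0, 50.0)
        print(f"kappa = {kappa:.0e} S/m -> kappa/(omega*eps) = {ratio:.4f}")
```

Under these assumptions the conduction term is negligible for $\kappa = 10^{-9}$ S/m and exceeds the displacement term by a factor of roughly 4.4 for $\kappa = 10^{-6}$ S/m, so the matrix entries belonging to water cells change character over the studied range of conductivities.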
[Plot: convergence curves labelled PBiCGCR, PBiCGCR (MRS), COCG, COCG (MRS), and PCOCG; horizontal axis: number of matrix vector multiplications, 0 to 10000.]
Figure 3.36. Convergence history of different Krylov subspace methods for the solution of the field problem in case of an insulator with seven distinct water droplets and a discretization with 308826 grid points, i.e., 926478 complex unknowns. Compared are the COCG and COCG-MRS (COCG with Minimal Residual Smoothing) with the preconditioned methods PCOCG, PBiCGCR, and PBiCGCR-MRS. The preconditioned methods only need about 3 % of the computational effort compared with the COCG without preconditioning. The graph is courtesy of Clemens, TU Darmstadt [60]
The convergence history of different Krylov subspace methods is shown in Fig. 3.36 for the solution of the field problem in case of an insulator with seven distinct water droplets and a discretization with 308826 grid points (cf. Fig. 4.35). The complex linear system has dimension 926478. Compared are the COCG and COCG-MRS (COCG with Minimal Residual Smoothing) with the preconditioned methods PCOCG, PBiCGCR, and PBiCGCR-MRS. The preconditioned methods only need about 3 % of the computational effort compared with the COCG without preconditioning. As a preconditioner, an implicit complex-valued split Jacobi preconditioning is used. The effect of MRS is quite visible: the original COCG method shows the well-known wild oscillations, while COCG-MRS has a smooth convergence curve, which always stays below the oscillating COCG curve. Figure 3.37 shows a zoom into the convergence curves of the preconditioned methods. Most of the time, the curves of PBiCGCR and PBiCGCR-MRS coincide. As expected, they differ wherever the PBiCGCR curve happens to have an increase in the residual norm. All three curves end with the same final result in this example. Yet, from approximately 100 matrix-vector multiplications on, the PCOCG curve shows strong oscillations of the residual, which, however, are still relatively smooth compared with the wild oscillations of COCG (cf. Fig. 3.36). These computations were carried out on a SUN Microsystems workstation in double precision.

[Plot: horizontal axis: number of matrix vector multiplications.]

Figure 3.37. Convergence history of different preconditioned Krylov subspace methods for the solution of the field problems in case of an insulator with seven distinct water droplets and a discretization with 308826 grid points, i.e., 926478 complex unknowns, and computation in double precision. The graph is courtesy of Clemens, TU Darmstadt [60]

Furthermore, some parameter studies with respect to the dependence of the convergence on the actual value of the electric conductivity $\kappa$ have been carried out. The electric conductivity $\kappa$ of the water droplets or water layer was varied in steps of one order of magnitude between $10^{-9}$ S/m and $10^{-6}$ S/m. That strongly influenced convergence, as can be seen in Fig. 3.38. This reflects the fact that the diagonal elements of the problem matrix may differ too much to be balanced by preconditioning. Figure 3.38 shows the obtained convergence curves for different conductivities in case of the Jacobi-preconditioned BiCGCR method. In Fig. 3.39, not only the convergence curves for different conductivities but also the convergence curves of distinct and coalescing droplets are displayed, each for the application of the Jacobi-preconditioned BiCGCR method.
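The interplay of COCG, Jacobi preconditioning, and minimal residual smoothing described above can be sketched in a few lines. The following is a minimal illustration, not the implementation used in [60]: it assumes a dense complex symmetric matrix stored as a list of rows, a plain (not split) Jacobi preconditioner, and the standard MRS recurrence in which the smoothed residual is the shortest vector on the line through the previous smoothed residual and the current COCG residual. The names `pcocg_mrs` and `model_system` and the small tridiagonal test matrix are hypothetical.

```python
def matvec(A, v):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot_u(u, v):
    """Unconjugated bilinear form u^T v, the product used inside COCG."""
    return sum(a * b for a, b in zip(u, v))


def dot_h(u, v):
    """Hermitian inner product u^H v, used for norms and for the smoothing."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def norm(v):
    return dot_h(v, v).real ** 0.5


def pcocg_mrs(A, b, tol=1e-10, max_iter=200):
    """Jacobi-preconditioned COCG with minimal residual smoothing (sketch).

    Returns the smoothed solution and the relative residual histories of the
    raw COCG iterates and of the smoothed iterates.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # Jacobi preconditioner
    x = [0j] * n
    r = list(b)                                    # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rho = dot_u(r, z)
    y, s = list(x), list(r)                        # smoothed iterate, residual
    b_norm = norm(b)
    raw_hist, mrs_hist = [1.0], [1.0]
    for _ in range(max_iter):
        if mrs_hist[-1] <= tol:
            break
        q = matvec(A, p)
        alpha = rho / dot_u(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        # MRS step: choose eta minimising ||s + eta * (r - s)||_2.
        d = [ri - si for ri, si in zip(r, s)]
        dd = dot_h(d, d).real
        if dd > 0.0:
            eta = -dot_h(d, s) / dd
            s = [si + eta * di for si, di in zip(s, d)]
            y = [yi + eta * (xi - yi) for yi, xi in zip(y, x)]
        raw_hist.append(norm(r) / b_norm)
        mrs_hist.append(norm(s) / b_norm)
        z = [di * ri for di, ri in zip(inv_diag, r)]
        rho_new = dot_u(r, z)
        p = [zi + (rho_new / rho) * pi for zi, pi in zip(z, p)]
        rho = rho_new
    return y, raw_hist, mrs_hist


def model_system(n=8):
    """Small complex symmetric tridiagonal test matrix (purely illustrative)."""
    A = [[0j] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 4.0 + 0.5j * (i + 1)
        if i + 1 < n:
            A[i][i + 1] = A[i + 1][i] = -1.0 + 0j
    return A, [1.0 + 0j] * n


if __name__ == "__main__":
    A, b = model_system()
    y, raw_hist, mrs_hist = pcocg_mrs(A, b)
    for k, (raw, smooth) in enumerate(zip(raw_hist, mrs_hist)):
        print(f"step {k}: raw {raw:.3e}  smoothed {smooth:.3e}")
```

By construction the smoothed residual norm never increases and never exceeds the raw residual norm of the same step, which is the behaviour described for the COCG-MRS curve. The smoothing does not change the underlying COCG iterates, so it costs no additional matrix-vector multiplications.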
3.10.3 Complex Indefinite Matrices

In time-harmonic field problems, linear systems have complex indefinite matrices. Depending on the problem type, these matrices may be symmetric, quasi-symmetric, or non-symmetric. (A matrix A is called quasi-symmetric if it is similar to a symmetric matrix.) For the two-dimensional case with
[Plot: convergence curves for several conductivities; horizontal axis: number of iteration steps, 0 to 500.]

Figure 3.38. Convergence curves for different conductivities. The Jacobi-preconditioned BiCGCR method is used. Computation with single precision.

[Plot: curves for a completely dry insulator and for separated and connected drops at different conductivities; horizontal axis: number of iteration steps.]
Figure 3.39. Convergence curves for different conductivities in case of distinct and coalescing water droplets. The Jacobi-preconditioned BiCGCR method is used. Computation with single precision.
quasi-symmetric indefinite matrix, a special multigrid algorithm was developed [277]. In the three-dimensional case, several modern Krylov subspace methods have been implemented [116], [58], [60]. The results of corresponding convergence studies are presented in the following.

Krylov Subspace Methods. In [116], the solution of the FIT equations on three-dimensional Cartesian grids is described for time-harmonic fields. For this problem type, the system matrix is complex symmetric. In [116], this system was solved with the preconditioned COCG method. If a waveguide boundary condition is used in order to simulate open boundaries [58], the complex system matrix becomes non-symmetric and indefinite. Therefore, to solve these linear systems, the modern Krylov subspace methods that were also used in electro-quasistatics, as well as variants of the QMR method that do not assume symmetry, were implemented. For the time-harmonic problems, the convergence behaviour of these solution methods was studied in [58] and [60] for one simple example and two realistic applications.

(i) Simple Test Example: As a simple test problem, a rectangular domain with a wire inside was chosen [58]. The current in the wire induces electromagnetic fields in the box. One complete side of the domain was assumed to be a waveguide boundary. Figure 3.40 shows this simple example. The computational domain was discretized with a $4 \times 3 \times 4$ grid. This gives a complex non-symmetric system with 144 unknowns. Figure 3.41 shows the eigenvalue distribution of the system matrices of this simple example for the cases with and without waveguide boundary condition.
Figure 3.40. $4 \times 3 \times 4$ grid of the simple test problem. The picture is courtesy of Schuhmann, TU Darmstadt [58]
To make a fair judgment of the convergence studies, the following has to be noted: The number of grid points belonging to the waveguide boundary amounts to about 10% of the total number of grid points; this fraction is unusually large compared to that in practical applications. Since preconditioning is oriented to practical problems, the partial SSOR preconditioning was chosen, which simply ignores the non-symmetric fraction of the matrix. As a consequence, the application of this preconditioner leads to deteriorated
[Plot: eigenvalues in the complex plane; legend: x ideal magnetic boundary, o open waveguide boundary; horizontal axis: Re(eigenvalues), -10.0 to 30.0; vertical axis from -1.0 to 1.0.]
Figure 3.41. Eigenvalue distribution of the system matrices of this simple example for the cases with and without waveguide boundary condition. The non-symmetric matrix of the case with waveguide boundary condition has eigenvalues with nonzero imaginary part. The graph is courtesy of Schuhmann, TU Darmstadt [58]
[Plot: logarithmic vertical axis from 1.0e+03 to 1.0e-10; horizontal axis: number of matrix vector multiplications, 0 to 80.]
Figure 3.42. Convergence curves of the implemented solution methods without preconditioning for the simple test example. The graph is courtesy of Clemens, TU Darmstadt [58]
[Plot: convergence curves labelled PSSOR-CGS, PSSOR-TFQMR, PSSOR-BiCGSTAB, and PSSOR-BiCGstab(2); logarithmic vertical axis from 1.0e+03 to 1.0e-10; horizontal axis: number of matrix vector multiplications, 0 to 200.]

Figure 3.43. Convergence curves of the implemented solution methods with preconditioning for the simple test example. The convergence rate is worse than that without preconditioning, since only a few grid points were used, and the non-symmetric part of the system matrix caused by the waveguide boundary condition is unusually large. The graph is courtesy of Clemens, TU Darmstadt [58]

convergence of all used CG-like methods in case of the simple test problem from Fig. 3.40. Figures 3.42 and 3.43 illustrate this fact. In addition, note that the BiCGSTAB method, with and without preconditioning, stagnates even for this very small and simple problem. The issue of stagnation is discussed in more detail in the next problem arising in practice. The curves of the other methods do not differ essentially in this example; all curves wind more or less around each other. The CGS and CGS2 methods show slightly more fluctuations than the BiCGstab2 method, but the differences are not as dramatic as, e.g., in Fig. 3.33. For this simple test problem, the curve of the upper residual bound of the TFQMR method has a shape similar to that of the other methods but is much smoother, as is visible in Figures 3.42 and 3.43. In summary, the BiCGstab2 method again shows the best convergence, even though the related BiCGSTAB method stagnates.

(ii) 3 dB Waveguide Coupler: A 3 dB waveguide coupler has been simulated in [58] as a time-harmonic problem. A detailed description as well as representations of the electric field and the reflection coefficients can be found in subsection 4.7.1. The coupler was discretized on a $53 \times 2 \times 128$ grid ($N = 13568$ points). Figure 4.45 shows the electric field Re(E) in the 3 dB waveguide coupler. This figure also displays the geometry of the coupler.
Even after 3500 matrix-vector multiplications, the CGS method without preconditioning does not show any noticeable reduction of the relative residual, while the relative residual of the preconditioned CGS method starts to decrease substantially at about 500 matrix-vector multiplications, and, after fewer than 1000 matrix-vector multiplications, the relative residual is smaller
[Plot: convergence curves including PSSOR-CGS; logarithmic vertical axis from 1.0e+09 to 1.0e-03; horizontal axis: number of matrix vector multiplications, 0 to 3500.]

Figure 3.44. Convergence curves of the PSSOR-preconditioned CGS and TFQMR methods in comparison to the CGS method without preconditioning in case of the waveguide coupler. The graph is courtesy of Clemens, TU Darmstadt [58]
[Plot: logarithmic vertical axis from 1.0e+07 to 1.0e-04; horizontal axis: number of matrix vector multiplications, 0 to 5000.]

Figure 3.45. Convergence curves of the implemented solution methods with preconditioning in case of the waveguide coupler. The CGS2 and TFQMR methods related to the CGS method show the same rate of convergence as the CGS method itself, while the BiCGSTAB completely fails and BiCGstab2 needs about five times the effort in order to reach the prescribed accuracy for the relative residual. The graph is courtesy of Clemens, TU Darmstadt [58]
[Plot: curves labelled PSSOR-CGS, PSSOR-TFQMR, and PSSOR-BiCGSTAB; logarithmic vertical axis from 1.0e+07 to 1.0e-03; horizontal axis: number of matrix vector multiplications, 0 to 1000.]

Figure 3.46. Convergence curves of the implemented solution methods TFQMR, CGS, CGS2, and BiCGSTAB with PSSOR preconditioning in case of the waveguide coupler. The BiCGSTAB method stagnates after only a few iteration steps. The graph is courtesy of Clemens, TU Darmstadt [58]
than $10^{-3}$. The residual bound of the PSSOR-TFQMR method starts with smaller values than the residual of the PSSOR-CGS method, but, from about 700 matrix-vector multiplications on, both curves essentially coincide. For this larger and realistic problem with 40704 complex unknowns, the application of the partial SSOR preconditioning yields the expected reduction of necessary iteration steps. Figure 3.44 illustrates these facts. The BiCGSTAB method stagnates again. The BiCGstab2 method, which showed the best convergence in the previous example, does not stagnate but converges extremely slowly. It needs nearly 500 % more matrix-vector multiplications in order to reach the same residual ($10^{-3}$) as the CGS, CGS2, or TFQMR method. In addition, these three methods are able to reach a relative residual of $10^{-4}$, which is even one order of magnitude better. A noteworthy fact is that the CGS2 and TFQMR methods, which are related to the CGS method, show nearly the same rate of convergence as the CGS method itself. This becomes obvious in Fig. 3.45. Figure 3.46 shows a zoom from Fig. 3.45 into the curves corresponding to the PSSOR-CGS, PSSOR-TFQMR, and PSSOR-BiCGSTAB methods. It indicates that the convergence behaviour of the CGS and the TFQMR methods differs for the waveguide coupler during the initial phase in the following way: The upper residual bound of the TFQMR method first increases steeply, then increases slowly, and next stays on the same level for some time before it decreases stepwise quite fast. The relative residual of the CGS method oscillates strongly; also, the mean values first increase before they start to decrease, somewhat earlier than the TFQMR bound.
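The text does not state why BiCGSTAB stagnates here. One commonly cited mechanism, offered only as a possible illustration, is the degree-one minimal residual step inside BiCGSTAB: the step length $\omega = (t, s)/(t, t)$ with $t = A s$ becomes tiny when the matrix has eigenvalues with large imaginary parts relative to their real parts, and a tiny $\omega$ stalls the iteration. Variants with second-degree stabilizing polynomials were designed to cope with this situation. The sketch below isolates that single step on two hypothetical $2 \times 2$ matrices; it is not a BiCGSTAB implementation.

```python
def matvec(A, v):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def norm2(v):
    return sum(abs(vi) ** 2 for vi in v) ** 0.5


def mr_step(A, s):
    """One minimal residual step r = s - omega * A s.

    Returns omega and the reduction factor ||r|| / ||s||.
    """
    t = matvec(A, s)
    tt = sum(abs(ti) ** 2 for ti in t)
    omega = sum(ti.conjugate() * si for ti, si in zip(t, s)) / tt
    r = [si - omega * ti for si, ti in zip(s, t)]
    return omega, norm2(r) / norm2(s)


if __name__ == "__main__":
    spd = [[2.0, 0.0], [0.0, 1.0]]            # real positive eigenvalues
    eps = 0.01
    nearly_skew = [[eps, -1.0], [1.0, eps]]   # eigenvalues eps +/- i
    print("SPD matrix:        ", mr_step(spd, [1.0, 1.0]))
    print("nearly skew matrix:", mr_step(nearly_skew, [1.0, 0.0]))
```

For the nearly skew-symmetric matrix the step length is about `eps` and the residual norm is almost unchanged, whereas the same step reduces the residual markedly for the matrix with real positive eigenvalues.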
(iii) Cross-Talk between Bond Wires of a Microchip: As another application problem, consider a part of a microchip with regard to possible cross-talk between 10 and 40 GHz. As is visible in Fig. 4.47 in subsection 4.7.2, the discretized part consists of two microstrip ports and two thin bond wires which connect the microstrip lines with the resistor blocks on the material. The resistor blocks have conductivity $\kappa = 1.3 \cdot 10^{4}$ S/m, and the substrate has relative permittivity $\varepsilon_r = 9.0$. The dimensions are about 700 μm × 300 μm. The discretization was done on a grid with $71 \times 20 \times 85 = 120700$ points. The cross-talk from one wire to the other was determined for the frequencies 10 GHz and 40 GHz. Fig. 4.47 also shows the electric field Re(E).
[Plot: convergence curves including PSSOR-TFQMR; logarithmic vertical axis from 1.0e+05 to 1.0e-02; horizontal axis: number of matrix vector multiplications, 0 to 10000.]

Figure 3.47. Convergence curves of the preconditioned TFQMR method and the non-preconditioned BiCG method in case of a microchip. The graph is courtesy of Clemens, TU Darmstadt [58]
The complex non-symmetric linear system has dimension 362100. The stabilized convergence behaviour of the PSSOR-preconditioned TFQMR method allows the solution of this system to the prescribed accuracy after about 3000 iterations in 17.46 hours on a SUN Sparc 20 workstation. The non-preconditioned BiCG method does not converge for this problem, as can be seen in Fig. 3.47. A comparison with calculations in the time domain has also been carried out in this case. All calculations were done on a SUN Microsystems workstation in double precision.
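The system dimensions quoted for the three time-harmonic examples are consistent with three complex unknowns per grid point. Reading this as one field component per coordinate direction is an assumption; the text only states the totals. A short check, with a hypothetical helper name:

```python
def num_unknowns(nx, ny, nz, components=3):
    """Number of unknowns on an nx x ny x nz grid with `components` per point."""
    return components * nx * ny * nz


if __name__ == "__main__":
    examples = {
        "simple test example": (4, 3, 4),
        "3 dB waveguide coupler": (53, 2, 128),
        "microchip bond wires": (71, 20, 85),
    }
    for name, grid in examples.items():
        print(f"{name}: grid {grid} -> {num_unknowns(*grid)} unknowns")
```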
The presented examples clearly show that preconditioning of Krylov subspace methods is absolutely necessary; it allows a reduction of the computational effort to a few percent compared with the non-preconditioned
methods. The convergence behaviour of the individual methods is very different. Generally, the stabilized BiCGSTAB method can only be recommended with caution since it sometimes stagnates. The smoothed methods do not converge much faster than the non-smoothed variants of the corresponding Krylov subspace methods, but their use as black box solvers in some simulation packages is easier.

Multigrid Technique. As already noted in subsection 3.6.3, a theoretical study of the convergence of the multigrid algorithm is virtually impossible for general problems. Yet, a model problem analysis for a related model problem can be taken as the basis for the choice of components in a multigrid method. In the study of Thiebes [260], such a model problem analysis is given for indefinite elliptic boundary value problems. There the grid-dependent shift of the eigenvalues of the discrete operator and the bounds for the coarsest grid demanded by the smoothing procedure are emphasized as the main problems. These problems also became obvious in the implementation described before (compare [277]). In the following, some numerical results are presented that allow some insight into the convergence behaviour of the developed multigrid algorithm. The computations were all carried out for the cylindrical cavity with tube ("pill-box")^19 displayed in Fig. 3.48. This example was chosen for two reasons: on the one hand, semi-analytical results were available for comparisons; on the other hand, this structure can be approximated by all grids without geometrical error.

^19 Whenever the term "pill-box" is used in the sequel, this special geometry is meant.

[Drawing: pill-box cavity; one dimension labelled 65 mm.]

Figure 3.48. Dimensions of the pill-box example. The drawing shows the upper half of the cavity with radius $r = 10$ cm and length $g = 6.5$ cm. The tube has radius $a = 5$ cm. The cutoff frequency is 2295 MHz.

(i) Eigenvalue Shift: In subsection 3.7.2, the estimate (3.22) was derived for the grid-dependent shift of the eigenvalues of $M$ compared to those of the continuous problem. With this estimate, a shift factor is determined for the squared wavenumber $k^2$ in $M = A - k^2 I + i k \cdot D$. The dependence of multigrid methods on eigenvalues resembles that of defect correction methods [244], for
which the following holds. Let $B$ be an approximation of the inverse of $A$. Then the condition for the convergence of $x^{\mathrm{new}} = x^{\mathrm{old}} + B(b - A x^{\mathrm{old}})$ is that all eigenvalues $\lambda$ of $B \cdot A$ satisfy the estimate $0 < \lambda < 2$. A multigrid method fits into this scheme if one takes the coarse grid correction as a realization of $B$ in the space of those fields which are representable on the coarse grid.

For a detailed study, the eigenvalues of the complex matrix $M$ have been determined by routines from the NAGLIB library [108] (cf. also [315] cited there). Therein the matrix is transformed to upper Hessenberg form and the eigenvalues and eigenvectors are determined by the LU-algorithm. For the pill-box from Fig. 3.48, the calculated eigenvalues are shown in Figures 3.49 and 3.50 on a "fine" and a coarse grid for frequency $f = 3500$ MHz with wavelength 8.566 cm, each with and without the shift factor. Since the main problem becomes obvious only with quite coarse grids, the "fine" grid has a rather large step size and only $N = 297$ grid points. The step sizes on the fine grid are $\Delta r = 1.25$ cm and $\Delta z = 0.8125$ cm; on the coarse grid they are only twice as large. The constant $s$ represents the shift factor for $k^2$ in $M$. The ten smallest eigenvalues of the matrix $M$ are displayed in each case. Both with and without the shift factor, the seven smallest eigenvalues of the fine grid have a negative real part. On the coarser grid, the situation is the same if the shift factor is used, while two more eigenvalues have negative real part if no shift factor is used. As is well known, such a situation leads to major convergence problems or even makes convergence impossible. The importance of the shift factor is thus elucidated by this example, which at the same time stresses the necessity to avoid extremely coarse grids in case of indefinite problems. As to the obtained physical results, Fig. 4.21 from [277] clearly shows that quasi-resonances of the studied cavities occur on all grids at nearly the same frequency if the shift factor is used and are shifted otherwise to smaller and smaller frequencies as the discretization becomes coarser (see also Tables 4.1 - 4.5 in [277]).

(ii) Under-Interpolation: Subsection 3.7.3 motivated the use of under-interpolation. The implemented multigrid algorithm uses $\omega_k = \left( 2(\cos kh - 1)/(h^2 k^2) \right)^2$ with the step size $h$ and the wavenumber $k$ for under-interpolation. Figure 3.51 shows that this parameter evidently improves the convergence behaviour compared with $\omega_k = 1$, even though it is only an estimate for the optimal parameter (for this example its value is $\omega_k = 0.9425$ and, according to Table 4.5 from [277], $\omega_k \approx 0.885$ would be optimal instead). Further results can be found in Table 4.9 from [277].

(iii) Comparison of Red-Black and Line Relaxation: Already in subsection 3.7.4, the choice of the linewise Gauss-Seidel relaxation was based on the strong coupling of the solution in the direction of the z-coordinate. In a number of studies in [277], it became evident that the restriction of the residual yields about the same fractions in linewise as in red-black relaxation. Consequently, the differences are not as great as could be expected for the studied problem type. Since nevertheless the linewise relaxation gave the
[Plot: "Eigenvalues of Matrix M", frequency = 3500 MHz, without shift factor; markers for fine grid and coarse grid; horizontal axis: real part.]

Figure 3.49. Ten smallest eigenvalues of the matrix $M$ at $f = 3500$ MHz on a fine and a coarse grid not using the shift factor $s$ for $k^2$. It is particularly remarkable that the seventh and eighth eigenvalues have negative real part on the coarse grid, while both are positive on the grid that is twice as fine, if no eigenvalue correction is performed via a shift factor.

[Plot: "Eigenvalues of Matrix M", frequency = 3500 MHz, with shift factor; markers for fine grid and coarse grid; horizontal axis: real part.]

Figure 3.50. Ten smallest eigenvalues of the matrix $M$ at $f = 3500$ MHz on a fine and a coarse grid using the shift factor $s$ for $k^2$. All eigenvalues have the same sign on the coarse and fine grid.
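The convergence condition for the defect correction iteration quoted under (i) Eigenvalue Shift, and the harm done by eigenvalues of the wrong sign as discussed for Figures 3.49 and 3.50, can be illustrated on a tiny real model problem. The sketch below is only an illustration under simplifying assumptions: the matrices are diagonal, $B$ is a scaled identity, and all names are hypothetical. With diagonal matrices the eigenvalues $\lambda$ of $B \cdot A$ can be read off directly, and each error component is multiplied by $1 - \lambda$ per step.

```python
def matvec(A, v):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def diag(values):
    """Build a dense diagonal matrix from a list of diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def defect_correction(A, B, b, steps):
    """Iterate x_new = x_old + B (b - A x_old), starting from x = 0."""
    x = [0.0] * len(b)
    for _ in range(steps):
        defect = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        x = [xi + ci for xi, ci in zip(x, matvec(B, defect))]
    return x


def max_error(x, exact):
    return max(abs(xi - ei) for xi, ei in zip(x, exact))


if __name__ == "__main__":
    exact = [1.0, 1.0, 1.0]
    cases = {
        "eigenvalues of BA in (0, 2)": ([1.0, 2.0, 4.0], 0.4),
        "one eigenvalue above 2": ([1.0, 2.0, 4.0], 0.6),
        "one negative eigenvalue": ([-1.0, 2.0, 4.0], 0.4),
    }
    for name, (a_diag, scale) in cases.items():
        A, B = diag(a_diag), diag([scale] * 3)
        b = matvec(A, exact)
        x = defect_correction(A, B, b, steps=60)
        print(f"{name}: max error after 60 steps = {max_error(x, exact):.3e}")
```

In the third case the indefinite matrix produces a negative eigenvalue of $B \cdot A$, so the corresponding error component grows in every step, which mirrors the convergence problems attributed to eigenvalues with negative real part.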
[Plot: "Under-Interpolation", frequency = 3500 MHz; curves for w = 1 and w = 0.9425; logarithmic vertical axis from 0.01 to 10000; horizontal axis: # V-cycles, 0 to 50.]

Figure 3.51. $\|b - Mx\|_2 / \|x\|_2$ for grid $G_2$ of a three-grid algorithm with and without under-interpolation. The number of relaxations before and after each V-cycle is adaptively controlled ($\nu_1, \nu_2 \le 20$ each).
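The under-interpolation parameter can be evaluated numerically. The sketch below reads the formula as $\omega_k = \left( 2(\cos kh - 1)/(h^2 k^2) \right)^2$; the minus sign is an assumption, chosen because it gives $\omega_k \to 1$ for $kh \to 0$. Taking $h = \Delta z = 0.8125$ cm at $f = 3500$ MHz is likewise an assumption, and it gives a value close to the quoted 0.9425. The function names are hypothetical.

```python
import math

C0_CM_PER_S = 2.99792458e10  # speed of light in vacuum, in cm/s


def wavelength_cm(freq_hz):
    """Free-space wavelength in cm."""
    return C0_CM_PER_S / freq_hz


def under_interpolation_weight(freq_hz, h_cm):
    """Evaluate (2 (cos(kh) - 1) / (kh)^2)^2 for step size h and frequency f."""
    k = 2.0 * math.pi / wavelength_cm(freq_hz)   # wavenumber in 1/cm
    kh = k * h_cm
    return (2.0 * (math.cos(kh) - 1.0) / kh ** 2) ** 2


if __name__ == "__main__":
    f = 3.5e9
    print(f"wavelength = {wavelength_cm(f):.4f} cm")
    for h in (0.40625, 0.8125, 1.625):
        w = under_interpolation_weight(f, h)
        steps = wavelength_cm(f) / h
        print(f"h = {h} cm ({steps:.1f} steps per wavelength): w_k = {w:.4f}")
```

The weight moves away from 1 as the number of grid steps per wavelength decreases, which fits the remark that the coarse grids are the critical ones.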
best results overall, the small additional effort is worthwhile. Yet, one should think about a combination of both.

(iv) Dependence of the Rate of Convergence on the Number of Relaxations: The number of relaxations is adaptively controlled using the residual:

- The relaxations with the Gauss-Seidel method are stopped as soon as

  $$\frac{\|b - Mx\|_2^{(\mathrm{new})}}{\|b - Mx\|_2^{(\mathrm{old})}} \ge 0.625.$$

- The relaxations with the Kaczmarz method are stopped as soon as

  $$\frac{\|b - Mx\|_2^{(\mathrm{new})}}{\|b - Mx\|_2^{(\mathrm{old})}} \ge 0.9.$$

Furthermore, a maximal number of relaxations can also be prescribed. This possibility was used to set up the following comparisons. Since theoretically determinable rates of convergence are rare, it is usual to study the behaviour of the proportion of the residuals before and after one V-cycle. In Figures 3.52 and 3.53, this proportion is displayed for a two-grid algorithm as a function of the maximal number $(\nu_1, \nu_2)$ of relaxations before and after each V-cycle. These values are obtained for the pill-box on a grid with $N = 1105$ points and $\Delta z = 0.40625$ cm. At the end, the relative error in $G_2$ has the same order of magnitude as that in $G_1$ ($\sim 10^{-2}$). Three special frequencies were chosen for the following reasons:
- The two frequencies $f = 2500$ MHz and $f = 2700$ MHz are both close to a quasi-resonance at about 2650 MHz. The matrix $A$ in $M = A - s \cdot k^2 I + i k \cdot D$ has eigenvalues 2600 MHz and 2646 MHz on the coarsest grid with $\Delta z = 0.8125$ cm as well as 2521 MHz and 2538 MHz on the coarsest grid of the three-grid algorithm with $\Delta z = 1.625$ cm. Close to those frequencies are singularities of the real part of the matrix $M$.
- The frequency $f = 3500$ MHz has wavelength 8.5655 cm, which gives the borderline case of about five steps per wavelength on the coarsest grid of the three-grid algorithm.

Only one relaxation takes place after the switch to the Kaczmarz method, sometimes even worsening the result. This is probably caused by the fact that in these cases the limits of arithmetic accuracy are reached. But it is acceptable since the Kaczmarz method leads to an evident improvement in many other cases.

[Plot: curves labelled (1,1), (1,2), (1,3), (2,1), and further combinations; logarithmic vertical axis; horizontal axis: # V-cycles.]

Figure 3.52. $\|b - Mx\|_2^{(\mathrm{new})} / \|b - Mx\|_2^{(\mathrm{old})}$ for a two-grid algorithm with $\nu_1, \nu_2 \in \{1, 2, 3\}$ and $N = 1105$ on the finest grid $G_2$. The number of V-cycles is adaptively controlled ($\le 50$); no post-iterations are carried out.
Table 3.4 gives the relative error of the two-grid solution for different small maximal values of $\nu_1$ and $\nu_2$, i.e., the number of relaxations before and after one V-cycle. The corresponding number of work units is given in parentheses. The CPU time used can be found in Table 3.6. Besides, for 2700 MHz, which is very close to a singularity, the adaptive control of the number of relaxations is optimal with respect to the relative error. It corresponds to the standards presented in subsection 3.7.5 for a solution method for a problem of this type; precisely, the main smoothing takes place after the correction. Also, the first V-cycle reduces the error very well, which shows the quality of the presented algorithm.
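The adaptive control of the number of relaxations described under (iv) can be sketched as a loop that repeats a relaxation sweep until the residual ratio reaches a threshold or a prescribed maximal number of sweeps is used up. The following is a minimal illustration under stated assumptions, not the implementation from [277]: it uses pointwise Gauss-Seidel on a small real 1D model matrix instead of line relaxation on the complex matrix $M$, and all names are hypothetical.

```python
def poisson_1d(n):
    """Tridiagonal model matrix with 2 on the diagonal and -1 next to it."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


def residual_norm(A, b, x):
    """Euclidean norm of b - A x."""
    return sum((bi - sum(a * xj for a, xj in zip(row, x))) ** 2
               for row, bi in zip(A, b)) ** 0.5


def gauss_seidel_sweep(A, b, x):
    """One in-place lexicographic Gauss-Seidel sweep."""
    n = len(b)
    for i in range(n):
        sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - sigma) / A[i][i]


def adaptive_relaxation(A, b, x, threshold=0.625, max_sweeps=20):
    """Relax until residual(new)/residual(old) >= threshold or max_sweeps.

    Returns the number of sweeps carried out; x is updated in place.
    """
    old = residual_norm(A, b, x)
    sweeps = 0
    while sweeps < max_sweeps and old > 0.0:
        gauss_seidel_sweep(A, b, x)
        sweeps += 1
        new = residual_norm(A, b, x)
        ratio = new / old
        old = new
        if ratio >= threshold:   # smoothing no longer pays off
            break
    return sweeps


if __name__ == "__main__":
    n = 15
    A, b = poisson_1d(n), [0.0] * n
    x = [(-1.0) ** i for i in range(n)]   # highly oscillatory initial error
    r0 = residual_norm(A, b, x)
    used = adaptive_relaxation(A, b, x)
    print(f"{used} sweeps, residual {r0:.3e} -> {residual_norm(A, b, x):.3e}")
```

Passing `threshold=0.9` gives the corresponding rule quoted for the Kaczmarz relaxation, and a small `max_sweeps` corresponds to the prescribed maximal values of $\nu_1$ and $\nu_2$ used in the comparisons.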
[Plot: curves for at most 6, 7, 8, 9, 10, and 50 V-cycles; logarithmic vertical axis from 0.01 to 1000; horizontal axis: # V-cycles, 0 to 20.]

Figure 3.53. $\|b - Mx\|_2 / \|x\|_2$ for a two-grid algorithm with $\nu_1, \nu_2 \in \{1, 2, 3\}$. Convergence curves for maximally six, seven, ..., ten and fifty V-cycles are displayed.
(Vl,V2) ~l,lJ
(1,2) (1,3) (2,1) (2,2) (2,3) (3,1) (3,2) (3,3) default
pill-box, N - 1105, I - 2 2500 MHz 2700 MHz 3500 MHz Ib-- Mx 1l2 lib· Mxl12 Ilb-- Mx l12 #WU #WU #WU Ilx 12 Ilxll? Ilxll? 0.0890 0.0692 0.0928 0.0964 0.1000 0.0655 0.0912 0.0615 0.0933 (32) (36) (47) 0.0933 (29) 0.0920 (32) 0.0677 (42) 0.0589 (47) s. ~vl,v2J=~1,~J ~automatlc contro!J 0.0644 (44) s. (vl,v2)=(1,3) (automatic control) s. (Vl, v2)=(2,1) (automatic control) s. (Vl, v2)=(1,2) (automatic control) 0.0597 0.0709 s. (Vl, v2)=(1,3) (automatic control) s. (Vl,V2)=(3,3) (automatic control)
g~j
g~j
~:~j
~:~j
Table 3.4. $\|b - Mx\|_2 / \|x\|_2$ for the two-grid algorithm with $\nu_1, \nu_2 \in \{1, 2, 3\}$. The number of V-cycles is adaptively controlled ($\le 50$); no post-iterations are carried out. N is the number of grid points. The number of work units is given after the error in parentheses.
(v) Dependence of the Rate of Convergence on the Number of Multigrid Iterations: The number of multigrid iterations, i.e., of V-cycles, is adaptively controlled using the residual:
- As soon as
$$\frac{\|b - Mx\|_2^{(\mathrm{new})}}{\|b - Mx\|_2^{(\mathrm{old})}} > 0.95$$
holds on the grid $G_k$, $2 \le k \le l$, and the Gauss-Seidel method is used for relaxation, the algorithm switches to the Kaczmarz method for the next
3. Numerical Treatment of Linear Systems
V-cycles. These additional V-cycles especially improve the approximate solutions for higher input frequencies.
- Afterwards, the V-cycles with the Kaczmarz method are stopped as soon as
$$\frac{\|b - Mx\|_2^{(\mathrm{new})}}{\|b - Mx\|_2^{(\mathrm{old})}} > 0.95$$
holds on the grid $G_k$, $2 \le k \le l$.
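The two rules above can be sketched as a small control loop. This is only an illustration under simplifying assumptions: it treats a single grid instead of checking every $G_k$, and the callables `v_cycle` and `residual_norm` as well as the toy cycle used for the demonstration are hypothetical stand-ins, not the routines of the program described here.

```python
def control_v_cycles(v_cycle, residual_norm, x, max_cycles=50, threshold=0.95):
    """Run V-cycles; switch Gauss-Seidel -> Kaczmarz on stagnation, then stop."""
    smoother = "gauss-seidel"
    old = residual_norm(x)
    history = [(smoother, old)]
    for _ in range(max_cycles):
        if old == 0.0:
            break  # exact solution reached, nothing left to improve
        x = v_cycle(x, smoother)
        new = residual_norm(x)
        history.append((smoother, new))
        stagnating = new / old > threshold  # ratio new/old residual norm
        old = new
        if stagnating:
            if smoother == "gauss-seidel":
                smoother = "kaczmarz"  # first stagnation: change the relaxation
            else:
                break                  # second stagnation: stop the V-cycles
    return x, history


def toy_v_cycle(r, smoother):
    """Made-up model: the 'iterate' is the residual itself, reduced by fixed factors."""
    if smoother == "gauss-seidel":
        return r * (0.5 if r > 0.1 else 0.99)
    return r * (0.8 if r > 0.05 else 0.97)


final, history = control_v_cycles(toy_v_cycle, abs, 1.0)
for name, value in history:
    print(name, value)
```

In the toy run the Gauss-Seidel cycles stall first, the controller switches to the Kaczmarz relaxation, and the loop ends once those cycles stall as well.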
Furthermore, a maximal number can again be prescribed. This was used in the setup of the comparisons above. Because of the stopping criterion, more V-cycles are carried out on coarser grids than is necessary for convergence. In particular, it is interesting that a "too good" approximate solution on the coarser grid can even be slightly disadvantageous on the finer grid. This can probably be explained by the fact that, on the one hand, the approximate solutions differ slightly from one grid to another and, on the other hand, new errors are coupled in by interpolation.

It is usual in the literature (cf., e.g., [42]) that a fixed ratio between $\|b - Mx\|^{(0)}$ and $\|b - Mx\|^{(*)}$ is given as a criterion for the optimal switch from the coarse to the fine grid. In [277] (Tables 4.10 - 4.12), the values of $\|b - Mx\|^{(0)} / \|b - Mx\|^{(*)}$ differed so much at the points chosen as optimal for a switch between grids ($1.2 \cdot 10^{-6}$, $9.8 \cdot 10^{-6}$, as well as $44.7 \cdot 10^{-6}$ in the example) that it was difficult to find a reasonable criterion for the switch between grids. Perhaps the reason is that the coarse grid correction is not yet optimal, as was already noted above. For this reason, the same stopping criterion is used on all grids.

The following reflections can be taken as the basis for an optimal switch: If, on the grid $G_{l-1}$, an improvement of the error by $\delta$ costs nearly the same effort as on grid $G_l$, one should switch to grid $G_l$. The error $x - \tilde{x}$ is composed of the discretization error d and the algebraic error a. As soon as a gets smaller than d, the switch of grids should take place. a is dominated by the smallest eigenvalues. d can be estimated quite well by the difference of the solutions on $G_{l-2}$ and $G_{l-1}$. Such estimates for the given problem type are some of the improvements foreseen for the given algorithm.

(vi) CPU Time: The necessary CPU time strongly depends on the actual problem to be solved. 
If many impedance values have to be computed for which the corresponding linear system is nearly singular, or if a cavity with many close quasi-resonances, hence singularities, is studied, then slow convergence, or even none at all, must be expected. For such problems the amount of CPU time will be very large. Furthermore, multicell cavities have many singularities at their re-entrant corners, so they require a rather fine grid. Two examples have been studied under several aspects with respect to the CPU time.

a) Coarse Grid Generation and Construction of the Submatrices A and D: The coarse grid generation and the construction of the submatrices A and D depend only on the geometry of the system. For each computation, they are done only once, independently of the number of impedance values to be computed. Table 3.5 shows the CPU times as a function of the number of grid points for the cylindrical cavity ("pill-box") and for the superconducting cavity shown in Fig. 3.54, which is used as a nine-cell cavity at DESY in Hamburg in the storage ring HERA to accelerate elementary particles. Here the computation was done for a single cell and for a multicell cavity. The computations have been carried out on an IBM 3090-150E (under MVS/XA, FORTRAN VS Compiler Vers. 2 Rel. 3, optimization level 3).

b) Two-Grid Algorithm with Different Step Sizes and Different Numbers of Relaxations: Table 3.6 gives an impression of the CPU time needed by the multigrid algorithm; it lists the CPU time used for different grids. In the case of the pill-box, the variation in CPU time per frequency for different pairs $(\nu_1, \nu_2)$ does not exceed 15%.
Figure 3.54. Dimensions of the superconductive HERA cavity.
pill-box
l    N       G_i     A_i, D_i
1    85      0.23    0.01
1    297     0.25    0.02
1    1105    0.47    0.09
1    4257    1.22    0.34
2    1105    0.49    0.37
3    1105    0.50    0.38

HERA, 1 cell
l    N       G_i     A_i, D_i
1    153     0.35    0.01
1    561     0.58    0.04
1    2145    1.38    0.14
2    2145    1.40    0.20
3    2145    1.40    0.21

HERA, 3 cells
l    N       G_i     A_i, D_i
1    297     0.63    0.02
1    1105    1.36    0.07
1    4257    4.12    0.28
2    4257    4.16    0.39
3    4257    4.17    0.42

Table 3.5. Computational effort in seconds of CPU time on an IBM 3090-150E for the grid generation and the construction of the submatrices $A_i$ and $D_i$ on the grids $G_1, \ldots, G_l$. Column l gives the number of grid levels of the multigrid algorithm, column $G_i$ the CPU time needed for the generation of all grids and the computation of all related quantities. Column $A_i, D_i$ gives the CPU time needed for the construction of the submatrices on all grids. Approximately 0.3 sec has to be added for the storage of the coarse grid matrix in the form needed by the direct solution method.
(VI, V2) (1,1) (1,2) (1.3) (2,1) (2,2) (2,3) (3,1) (3,2) (3,3) default
Pill-box, N = 1105, l = 2 2700 MHz 2500 MHz 3500 0.90 1.41 1.00 0.90 0.97 1.39 0.94 (32) 1.01 (36) 1.43 0.88 (29) 0.94 (32) 1.31 1.44 siehe ~VI, v2~=~1,2~ 1.35 siehe (VI, v2)=(1,3) siehe (VI, v2)=(2'~l siehe (VI,V2)=(2,2) siehe (VI, v2)=(2,3) siehe (VI, V2)=(2,3)
g~~
g~~
MHz
~:~~
(47) (42) (47) (44)
Table 3.6. Computational effort in seconds of CPU time on an IBM 3090-150E for a two-grid algorithm with $\nu_1, \nu_2 \in \{1, 2, 3\}$. The number of V-cycles is adaptively controlled ($\le 50$); no post-iterations are executed. N is the number of grid points. The number of work units is given in parentheses after the time.
3.11 Bibliographical Comments

This section presented an overview of several classes of numerical methods for linear systems of equations. Some of the methods are well described in textbooks, some only in research papers.

General Textbooks

The author's personal choice of textbooks, which have different focuses and are written at different levels, is as follows: First of all, the book by Golub and Van Loan, "Matrix Computations" [103], which covers the most important methods and addresses a wide range of readers. A very comprehensive and well-written book is Axelsson's "Iterative Solution Methods" [9]. Other books on this topic are, e.g., Saad's "Iterative Methods for Sparse Linear Systems" [227] (see also [226]) and Greenbaum's "Iterative Methods for Solving Linear Systems" [105]. Examples of German textbooks are [248] and [74].

Direct Solution Methods

Direct solution methods are described in most elementary textbooks on numerical mathematics; examples are [103], [248] and [74]. The application of Gaussian elimination in the context of mode matching techniques is described, e.g., in [246] and [279]. Algorithms that attempt to circumvent the weakness of Gaussian elimination caused by rounding errors are treated, e.g., in [9], [7], [104]. Modern developments use sophisticated reordering techniques, see, e.g., [254].

Classical Iteration Methods

These iterative methods are described in many textbooks on numerical mathematics such as [9], [103], [114], [248]. An overview is presented in [21], which is also available on the World Wide Web. The convergence theory for these methods is also described there. Some original results have also been cited in this book, like [192], [324], [148], [126]. Error smoothing aspects, which are of great importance in the context of multilevel methods, are treated, e.g., in [112].

Chebyshev Iteration

The Chebyshev iteration, an acceleration method for the classical fixed point algorithms, is described in most of the above-mentioned textbooks, e.g., [9] and [103]. The description here closely follows [103] and [74]. Further references can be found therein.
A version for non-symmetric matrices is described in [171]. A special algorithm for complex matrices involving the Chebyshev iteration is introduced in [11] and [10] (it is also briefly described in subsection 3.9 of this book).
Krylov Subspace Methods

Krylov subspace methods for real matrices (i.e., mainly the cg and Lanczos algorithms) are described and motivated in most textbooks on numerical mathematics such as those cited above ([9], [103], [114] etc.). The original papers are [164] for the Lanczos algorithm and [127] for the cg algorithm. As with the Chebyshev iteration, our presentation of the cg algorithm closely follows [103] and [74]. Some more details on the underlying linear-algebraic theory can be found, e.g., in [48] or [114]. The notion of Krylov subspaces goes back to [91]. An intensive convergence study of the cg algorithm can be found in [9].

Different derivations of the Lanczos algorithm are possible. The reader is referred to [103], [74], and [226]. Here we follow the last book. The SYMMLQ algorithm [193] is a variant for real symmetric matrices. A variant for non-Hermitian systems is presented in [103], see also [280]. The QMR algorithm [100] is a representative of the Look-Ahead Lanczos algorithms.

A number of variants of the cg algorithm have been developed for non-Hermitian or indefinite linear systems. Some of them are treated in more recent textbooks, say [227], but many can be found only in the original conference papers, articles, or lecture scripts.
1. The CGNR and CGNE algorithms were introduced in [69] and are discussed, e.g., in [89], [226], [280].
2. The BiCG algorithm was introduced in [165] and [96]. A variant for complex non-symmetric matrices was introduced in [144]. Only few theoretical results about the convergence of the BiCG algorithm are known until now. Some results can be found in [100]. Breakdowns in the BiCG algorithm are discussed in [196]. The representation via BiCG polynomials was introduced in [240].
3. The COCG algorithm was introduced in [276]. The typical effect of heavy oscillations in the convergence history can be found, e.g., in [116] and in section 4 of this book.
4. The CGS algorithm was first published in [240].
5. The CGS2 algorithm, a generalized version of the CGS algorithm, was first documented in [238].
6. The class of the SCBiCG(F, n) algorithms was introduced in [60]. These methods were also combined with Minimal Residual smoothing, which can be found in [233]. Results for the combined methods are presented in section 4 of this book. The algorithms of this class are related to well-known methods like the COCG or the CR algorithm [247] for real-valued matrices, thus motivating names like BiCGCR, the algorithm that was applied in examples described in section 4 of this book.
Minimal Residual Algorithms and Hybrid Algorithms

The GMRES algorithm, a generalization of the MINRES algorithm [193] for non-symmetric systems, was first published in [229]. It is based on the Arnoldi algorithm [3]. Some basic relations needed in the derivation of GMRES are shown in [44]. Results on the implementation of stable truncated versions of GMRES(l) that use the Householder transformation for orthonormalization can be found, e.g., in [230], [44]. Other related methods are ORTHODIR [145], a method by Axelsson [6] that builds the basis of the Generalized Conjugate Gradient, Least Squares method [8] (also described in subsection 3.5.2 of this book), the Generalized Conjugate Residual method GCR [89], [88], the GMERR algorithm [312], and the GMBACK algorithm [151]. The paper [230] compares GMRES with BiCGSTAB and CGS.

Hybrid methods combine the cg, BiCG, or Look-Ahead Lanczos algorithm with a minimal residual ansatz, particularly with the GMRES algorithm.
1. The BiCGSTAB algorithm was introduced in [275]. The BiCGstab2 algorithm was proposed in [110] and the more efficient BiCGstab(l) algorithm in [237].
2. The QMR algorithm was published first in [100]. Some special applications are shown in [98]. The TFQMR algorithm and its relation to CGS are described in [97], the latter also in [330]. While the original QMR algorithm uses three-term recursions, an implementation with mathematically equivalent two-term recursions was introduced in [99].
3. The Generalized Conjugate Gradient and Least Squares algorithm of Axelsson is described in [5], [6] and [8]. They are representatives of the group of generalized cg algorithms. These methods are either of least squares type like ORTHOMIN [291], the predecessor of the method published in [5] and [6], and GMRES [229], or of Galerkin type like the algorithms in [313] and [6]. In [228], the relations between the different methods are discussed in detail.
Multigrid Techniques

This construction principle for iterative solvers was introduced in [92], [93], [14] and further developed in [40]. Independently, the paper [111] was published. [253] lists other important publications. A textbook-like treatise can be found, e.g., in [49], [112], [114] or [107]. The paper [269] presents a short overview of the basic ideas of multigrid techniques. In [320], a more popular description is given. [42] presents a kind of guide for the development of a multigrid algorithm. A number of different techniques have been developed over the years; some representatives are described in the publications [106] or [73]. The steadily published "mgnetdigest" [81] presents an overview of a wide range of literature.

Convergence theory of multilevel techniques is given in the above-mentioned textbooks and in many publications, e.g., [41] and [260], the latter with a focus on indefinite problems. A special multigrid algorithm for indefinite, nearly singular systems, which was presented in [277] and implemented in an electromagnetic CAE tool [287], is described in subsection 3.7 of this book. Further aspects of the application of multigrid methods in the design of particle accelerators are described in [245]. Some of the grid-related developments have been taken up later on, see [266]. Other discretization ideas like those in [262] could be implemented. Another approach to solving indefinite problems is presented in [43]. Recent multilevel developments for indefinite problems are presented in [28]. Preconditioning methods are often based on a multigrid method, e.g., those described in [189]. In [12], special methods are devised to tackle near-singularity.

Preconditioning

The preconditioning of non-stationary methods - at least the preconditioned cg algorithm - is treated in most modern textbooks on numerical mathematics, see [9], [103], [114], [74] etc. In [44], some of the most popular methods are compared and ordered. The ILU decomposition is described in [289]. [7], [223] and [224] give a good overview of this topic. In [173], the effects of preconditioning are visualized for the spectra of some test matrices. Two special modifications are presented in [319], others in [109] and [7]. Incomplete LU decomposition with thresholds for additional entries (ILUT(k)) is published in [225] and [173]. SSOR preconditioning goes back to Concus, Golub, and Widlund. The partial SSOR preconditioning is studied in [326]. Polynomial preconditioning was introduced in [220] and was treated later by several authors (see, for example, [222]). Multilevel preconditioning (AMLI for short, for Algebraic MultiLevel Iterative Method) is studied, e.g., in [13] and [189] or [190].

Real-Valued Iteration Methods for Complex Systems

A special (preconditioned) iterative method was developed by Axelsson for complex linear systems. The presentation in this book is based heavily on papers of van den Meijdenberg [274] and Korotov [157] as well as on a draft of Axelsson and Kutcherov [11], [10]. The method is based on earlier ideas presented in [4]. An implementation under the guidance of the author is described in [129].
Convergence Studies for Selected Solution Methods

The examples presented in that subsection have all been discretized with the Finite Integration Technique (FIT) [296], [305] as implemented in the CAE package MAFIA [65]. This package consists of different modules for the different electrodynamical problem classes:
1. Static and stationary fields: Implementations of electro- and magnetostatics in different dimensions and on different grids are described in [159], [25], [72], and [292], the latter covering nonlinear electro- and magnetostatics. The implementation of stationary current and temperature problems is presented in [23], [283], [288]. Extensive convergence studies have been carried out comparing standard and modified ILU preconditioning as well as several iterative preconditioners: see the references above concerning the preconditioning and solution methods, and [202] and [201] for more details on the convergence studies. The study of the model problem follows [114], with references to theoretical results from [109] and [7]. Some special implementation aspects are described in [305] and [201].
2. Electro-quasistatic fields: The implementation is presented in [281] and [282]. The first solver tests were based on results from [116]. In [281], [282], [58], and [60], various Krylov subspace and hybrid methods were studied. The experimental background for practical examples is described in [208].
3. Problems with time-harmonic excitation: Studies with the special multigrid algorithm are reported in [277] and summarized in this book. References are made to theoretical studies from [42], [260]. NAGLIB routines [108], [315] were used for comparison. Studies with modern Krylov subspace methods are described in [116], [58] and [60].
4. Applications from Electrical Engineering
In this section, some examples from different fields of electrical engineering are presented. The order of the examples follows the classification of electromagnetic field problems described in subsection 1.3 (see Table 1.1). All examples presented here have been computed with the CAE tool MAFIA [65], which uses the Finite Integration Technique (FIT) as discretization method. Among others, the electromagnetic fields are given for the examples that were used in subsection 3.10 for the convergence studies of the newly implemented Krylov subspace methods. In particular, examples are presented for the newly implemented algorithms based on FIT for electro-quasistatics, stationary current problems, and stationary heat conduction. The FIT discretization of stationary current and temperature problems now also allows for a consistent solution of coupled problems (see also subsection 5.6). The impetus for modelling and implementing electro-quasistatics was given by the practical problem of humid or contaminated high-voltage insulators on which flash-overs may happen.
4.1 Electrostatics

In electrostatics, the analytic and FIT equations to be solved are given by

$$\operatorname{div}(\varepsilon \operatorname{grad} \varphi) = -\rho .$$

The equations are solved for the electrostatic potential, from which the electric field is computed via

$$E = -\operatorname{grad} \varphi .$$
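To make the two relations above concrete, here is a minimal one-dimensional sketch. It is not the FIT discretization used by MAFIA but an ordinary finite-difference scheme on a uniform grid with one permittivity value per cell; the function names, the relative units and the two-layer test case are assumptions chosen for illustration.

```python
def solve_potential_1d(eps, rho, h, phi_left, phi_right):
    """Solve div(eps grad phi) = -rho on a uniform 1D grid with Dirichlet ends.

    eps: permittivity of each of the n + 1 cells, rho: charge density at the
    n interior nodes, h: step size. Returns phi at all n + 2 nodes.
    """
    n = len(rho)
    lower = [-eps[i] for i in range(n)]          # coupling to the left neighbour
    upper = [-eps[i + 1] for i in range(n)]      # coupling to the right neighbour
    diag = [eps[i] + eps[i + 1] for i in range(n)]
    rhs = [rho[i] * h * h for i in range(n)]
    rhs[0] += eps[0] * phi_left                  # boundary values move to the right-hand side
    rhs[-1] += eps[n] * phi_right
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    phi = [0.0] * n
    phi[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        phi[i] = (rhs[i] - upper[i] * phi[i + 1]) / diag[i]
    return [phi_left] + phi + [phi_right]


def electric_field(phi, h):
    """E = -grad(phi), evaluated in each cell by a difference quotient."""
    return [-(phi[i + 1] - phi[i]) / h for i in range(len(phi) - 1)]


# Two dielectric layers of equal thickness (relative permittivities 1 and 4),
# no charge, unit potential difference across a unit length.
eps = [1.0] * 4 + [4.0] * 4
phi = solve_potential_1d(eps, [0.0] * 7, 0.125, 0.0, 1.0)
field = electric_field(phi, 0.125)
print(field)
```

In this charge-free example the product of permittivity and field is the same in both layers, so the field in the layer with the larger permittivity is smaller by the permittivity ratio.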
4.1.1 Plug
Consider a plug discretized in cylindrical coordinates with 32 steps in the $r$-, 47 steps in the $\varphi$-, and 39 steps in the $z$-direction. This example was originally discretized by the team of Weiland, TU Darmstadt; see, e.g., [72]. The total number of grid points is 58656. Figure 4.1 displays the geometry, also indicating the grid cells. Figure 4.2 shows the cross-section of the upper half of the grid in the (r, z)-plane.

U. Rienen, Numerical Methods in Computational Electrodynamics © Springer-Verlag Berlin Heidelberg 2001

Figure 4.1. Plug with grid lines on its outer surface.

Figure 4.2. Upper half of the cross-section in (r, z)-plane of the cylindrical grid for the plug.

Figure 4.3. Electric field in plug. View from inside.

This problem was solved with both solvers that the static S-module of the program package MAFIA [303] offers as standard, viz. the SOR method with estimated relaxation parameter and the preconditioned cg method. The static electric field, i.e., the final approximation, is displayed in Fig. 4.3. The disadvantage of the dependence of the SOR method on a parameter, which was already pointed out in subsection 3.10, becomes very obvious when SOR is used as a black box solver with estimated relaxation parameter, as Figs. 4.4, 4.5, 4.6, and 4.7 illustrate. Figure 4.8 displays the development of the solution, i.e., the electrostatic potential, during the iterative solution process with the preconditioned cg method.
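The sensitivity of SOR to its relaxation parameter can be reproduced on a small model problem. The sketch below is an assumption-laden illustration, not the MAFIA solver: it uses the one-dimensional matrix tridiag(-1, 2, -1) with a constant right-hand side, and the function name and tolerance are hypothetical. For this particular matrix the optimal parameter is known in closed form, which is rarely the case in practice.

```python
import math


def sor_iterations(n, omega, tol=1e-8, max_iter=100000):
    """Count SOR sweeps until the relative residual of tridiag(-1, 2, -1) u = 1 drops below tol."""
    u = [0.0] * (n + 2)           # u[0] and u[n + 1] are the zero boundary values
    r0 = math.sqrt(n)             # residual norm of the zero start vector
    for k in range(1, max_iter + 1):
        for i in range(1, n + 1):
            gauss_seidel = 0.5 * (1.0 + u[i - 1] + u[i + 1])
            u[i] += omega * (gauss_seidel - u[i])   # over-relaxed update
        res = math.sqrt(sum((1.0 - (2.0 * u[i] - u[i - 1] - u[i + 1])) ** 2
                            for i in range(1, n + 1)))
        if res <= tol * r0:
            return k
    return max_iter


n = 31
omega_opt = 2.0 / (1.0 + math.sin(math.pi / (n + 1)))  # optimal value for this model matrix
for omega in (1.0, 1.5, omega_opt, 1.95):
    print(round(omega, 4), sor_iterations(n, omega))
```

Even on this tiny example the iteration count varies strongly with the parameter, which is the behaviour that makes an estimated parameter risky in a black box solver.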
Figure 4.4. Convergence history (relative residual versus number of iterations) of built-in solvers: SOR versus preconditioned cg.
Figure 4.5. Blown-up view of the first 200 iteration steps of the convergence history of built-in solvers: SOR versus preconditioned cg (PCG).
Figure 4.6. Error development in the preconditioned cg algorithm. The error with respect to the best iterative approximation is plotted after the 1st, 6th, 12th, 24th, and 50th step.
To get a better understanding of the local error distribution during the iteration process, the following study¹ was carried out: First, the best approximation $\Phi_E$ was computed with the preconditioned cg algorithm. Let $n_{\mathrm{solv}}$ be the maximal number of iterations needed for the specific solver to reach the level of machine precision. Then, step by step, each iteration was stopped after only j steps, $j = 1, 2, 3, \ldots, n_{\mathrm{solv}}$, with PCG and SOR, and the "error" was computed as the difference between that approximation, $\Phi_j^{(\mathrm{PCG})}$ or $\Phi_j^{(\mathrm{SOR})}$, and $\Phi_E$. Figures 4.6 and 4.7 show the error development in PCG and SOR. Figure 4.8 displays the development of the solution, i.e., the scalar electrostatic potential $\Phi_E$, during the iterative solution process with the preconditioned cg method.

¹ The plots are taken from mpeg films which can be requested from the author. They have been prepared with the help of M. Hilgner.

Figure 4.7. Error development of the SOR algorithm. The error with respect to the best iterative approximation is plotted after the 1st, 6th, 12th, 24th, 50th, and 150th step.
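The procedure of this study, computing a reference solution first and then restarting the solver with a cap of j steps, can be sketched as follows. The sketch makes several simplifying assumptions: it uses a plain (unpreconditioned) cg iteration instead of PCG or SOR, the small model matrix tridiag(-1, 2, -1) instead of the plug problem, and the maximum norm as the error measure; all names are hypothetical.

```python
def apply_a(u):
    """Matrix-vector product with the model matrix tridiag(-1, 2, -1)."""
    n = len(u)
    return [2.0 * u[i]
            - (u[i - 1] if i > 0 else 0.0)
            - (u[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def cg(b, steps):
    """Plain cg started from zero and stopped after at most `steps` iterations."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rr = sum(v * v for v in r)
    rr0 = rr
    for _ in range(steps):
        if rr <= 1e-28 * rr0:     # residual at rounding level: nothing left to gain
            break
        ap = apply_a(p)
        alpha = rr / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = sum(v * v for v in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


n = 20
b = [1.0] * n
reference = cg(b, n)              # best approximation, used as the reference
errors = []
for j in range(1, n + 1):
    x_j = cg(b, j)                # restart and stop after only j steps
    errors.append(max(abs(a - c) for a, c in zip(x_j, reference)))
print(errors)
```

Plotting the difference vector itself instead of its maximum norm would give the kind of local error pictures shown in the figures.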
Figure 4.8. Development of the iterative solution, the electrostatic potential, in the case of PCG. Displayed are the approximations after the 1st, 6th, 12th, 24th and 50th step. The corresponding electric field, i.e., the gradient of the final approximation of the electrostatic potential, is shown once more for comparison (bottom right).
Figure 4.9. Electrostatic potential on the surface of the plug and in its neighborhood.
4.2 Magnetostatics

In magnetostatics, the analytic and FIT equations to be solved are given by

$$\operatorname{curl}\left(\frac{1}{\mu}\operatorname{curl} A\right) = J_E, \qquad \tilde{C} D_{\mu}^{-1} C\, a = j_e .$$

As was described in detail in subsection 2.3, the magnetic field H is decomposed into its homogeneous and non-homogeneous parts:

$$H := H_h + H_i .$$

This finally allows one to use a scalar potential for the homogeneous part:

$$H_h = -\operatorname{grad} \Phi_M, \qquad \operatorname{div}(\mu \operatorname{grad} \Phi_M) = \rho_m .$$

Note that this expression coincides with the governing equation of electrostatics. The resulting magnetic field H can be obtained from

$$H = H_i + H_h = H_i - \operatorname{grad} \Phi_M .$$

It should be stressed again that the scalar potential $\Phi_M$ has no physical meaning at all but is a purely auxiliary quantity. In the following, some field representations are shown for the two linear magnetostatic examples that were treated in subsection 3.10 and a third (nonlinear) problem.
4.2.1 C-Magnet

The magnetostatic application of a simple C-shaped electromagnet was also discussed in connection with convergence studies of iterative solution methods. The magnet is excited by two driven coils. A constant relative permeability of $\mu_r = 398$ is assumed for the magnet material. The magnet has two symmetry planes at y = 0 and z = 0. This symmetry is exploited in the numerical field calculation: only the quarter of the magnet shown in Fig. 4.12 in subsection 3.10 is discretized. The domain around the air gap of the magnet is of greatest interest, which is why it is discretized especially finely, while increasingly larger step sizes may be chosen towards the boundaries of the solution domain. For the presented calculations, N = 98600 grid points were used and neighbouring step sizes differed at most by a factor of 2.5. At the symmetry plane z = 0, the Neumann boundary condition is chosen, since the magnetic field lines run parallel to the boundary there. At y = 0, Dirichlet boundary conditions are chosen, since here the field lines are perpendicular to the symmetry plane. At all other boundaries, the open boundary condition is chosen.

Figure 4.10 displays the geometry of a simple C-shaped magnet. Figure 4.11 shows the field lines of the magnetic flux density in this so-called C-magnet. From the numerical point of view, this practically relevant example is interesting since it leads to matrix entries of very different orders of magnitude because of the strongly varying step size and the non-homogeneous material distribution. The matrix condition number is much more realistic than that of the simple model problem. This is clearly reflected in the convergence behaviour of the investigated iterative methods (see subsection 3.10).

This problem was also solved by both solvers as they appear in a standardized form in the S-module of the CAE tool MAFIA [303], viz. the SOR method with roughly estimated relaxation parameter and the preconditioned cg method. The disadvantage of the dependence of the SOR method on a parameter, which was already pointed out in subsection 3.10, becomes very obvious again when SOR is used as a black box solver with estimated relaxation parameter, as Figs. 4.13 and 4.14 illustrate.

Figure 4.11. Three-dimensional representation of the magnetic field of the C-magnet.

Figure 4.12. Contour lines of the scalar magnetic potential (after 104 steps of PCG) in the symmetry plane z = 0. The plot shows the quarter of the structure that suffices for the computation because of the symmetry: the front and lower planes are symmetry planes.
4.2.2 Current Sensor

Another magnetostatic example is the field computation of a current sensor, which is excited by a loop driven with I = 100 A. A relative permeability of $\mu_r = 500$ is assumed for the material of the sensor. Figure 3.27 in subsection 3.10 shows the arrangement with the grid for the sensor. In accordance with the annular geometry, the sensor is discretized on a circular cylindrical grid.
Figure 4.13. Convergence history (relative residual versus number of iterations) of built-in solvers: SOR versus preconditioned cg.

Figure 4.14. Blown-up view of the first 300 iteration steps of the convergence history of built-in solvers: SOR versus preconditioned cg.
The symmetry with respect to the planes $\varphi = 0$, $\varphi = 2\pi$ allows one to discretize only the half shown in Fig. 3.27. The chosen grid has N = 27869 points. Figure 4.15 shows the magnetic flux density in the plane z = 0. Because of the high permeability of the material, the field is mainly located inside the sensor ring.
4.2.3 Velocity Sensor

A magnetic velocity sensor has been studied by Schillinger and Clemens in order to compare the preconditioned cg algorithm with algebraic multigrid. The cg algorithm was used as it is implemented in the CAE tool MAFIA, with the two different preconditioners IC(0) and ILU(3). The black box solver by Ruge and Stüben [219], [252] was used as a multigrid solver.
Figure 4.15. Magnetic flux density in some z-plane of the annular current sensor.

Figure 4.16. Geometry and magnetic flux density of magnetic velocity sensor. The plot is courtesy of Schillinger, TU Darmstadt.
The problem was discretized using $N \approx 230000$ unknowns. Figure 4.16 shows half of the system and the magnetostatic flux density. Figure 4.17 compares the convergence of the cg algorithm with the two different preconditioners, the AMG, and the AMG including its overhead for the grid setup on the coarser levels. AMG is about twice as fast as PCG, even if the overhead is taken into account. Thus, the velocity sensor is an excellent example of optimal performance of a black box algebraic multigrid solver for Poisson's equation.
Figure 4.17. Convergence history (relative residual versus CPU time in seconds) of the magnetic field computation for the magnetic velocity sensor. Compared are the cg algorithm with IC(0) and ILU(3) preconditioning and the black box AMG (with and without overhead for the grid setup). The plot is courtesy of Schillinger, TU Darmstadt.
4.2.4 Nonlinear C-magnet

In the case of nonlinear permeabilities, the following Newton-Raphson-like procedure [191], [292] is carried out in the static S-module of the CAE package MAFIA: The calculation is carried out in a series of cycles $C_1, \ldots, C_M$, as is illustrated in Fig. 4.18. The first step in cycle $C_i$ is to solve the linear problem using the permeability $\mu_{i-1}$. This gives the magnetic field $H^{(i)}$. Looking up the corresponding value in the B-H curve yields the next value for the permeability, $\mu_i$ (see Fig. 4.19 for a qualitative plot of a B-H curve). As soon as some kind of convergence is reached (see [292] for details), the final field $H^{(M+1)}$ is evaluated via a linear static computation with permeability $\mu_M$.

A C-shaped magnet is chosen again as an example, now with a nonlinear material. The example was originally discretized by the team of Weiland, TU Darmstadt; more details and more examples can be found, e.g., in [292]. Figure 4.20 displays the permeability of the C-magnet. Figure 4.21 shows the magnetic vector potential after the last cycle of the nonlinear iteration. Both are shown in a two-dimensional cross-section. Figures 4.22 and 4.23 display the convergence curves for the SOR method and the preconditioned cg method. The preconditioned cg method needs only about 10% of the total number of iterations necessary for the SOR method. For nonlinear problems, the rate of convergence of the implemented solver becomes especially important.
4.3 Stationary Currents; Coupled Problems
{ Cl) -----------. ~ { ~
217
J.l (0)
Cyclet
J.l
Cycle 2
-----------.
J.l(2) ........
.... ....
{
Cycle M
-.......
Ell)
El2)
~
J.l (M) - - - - - - - - - - - . JlM+l)
Figure 4.18. Computational procedure for nonlinear problems. The arrows from left to right stand for linear computation, those from right to left for the JL-update.
[Plot: B over H, with the permeability μ^(i) marked on the curve.]
Figure 4.19. Qualitative plot of a B-H curve.
4.3 Stationary Currents; Coupled Problems
As in electrostatics, the potential formulation may be used for stationary current fields, as was shown in section 1 and subsection 2.3. The continuous and discrete equations are then given as follows:

$$\operatorname{div} \kappa \operatorname{grad} \varphi = 0, \qquad \tilde{S} D_\kappa \tilde{S}^T \Phi_E = 0.$$

Again, the electric field is computed via $E = -\operatorname{grad} \varphi$.
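To make the structure of the discrete equation concrete, here is a minimal one-dimensional sketch. It assumes a chain of cells with unit spacing, one conductivity per edge, and prescribed potentials at both ends; the difference matrix `G` plays the role of the discrete gradient and its transpose that of the discrete divergence. The function name and the numerical values are hypothetical and not taken from the text.

```python
import numpy as np

def stationary_current_1d(kappa, phi_left, phi_right):
    """Solve G^T diag(kappa) G phi = 0 on a 1D chain with fixed end potentials."""
    n_edges = len(kappa)
    n_nodes = n_edges + 1
    # Discrete gradient: potential difference along each edge.
    G = np.zeros((n_edges, n_nodes))
    for e in range(n_edges):
        G[e, e], G[e, e + 1] = -1.0, 1.0
    A = G.T @ np.diag(kappa) @ G

    phi = np.zeros(n_nodes)
    phi[0], phi[-1] = phi_left, phi_right
    inner = np.arange(1, n_nodes - 1)
    boundary = [0, n_nodes - 1]
    # Move the known boundary potentials to the right-hand side.
    rhs = -A[np.ix_(inner, boundary)] @ phi[boundary]
    phi[inner] = np.linalg.solve(A[np.ix_(inner, inner)], rhs)

    e_field = -G @ phi         # E = -grad phi on every edge
    current = kappa * e_field  # J = kappa E on every edge
    return phi, e_field, current

kappa = np.array([4.0, 4.0, 1.0, 1.0, 4.0])
phi, e_field, current = stationary_current_1d(kappa, 10.0, -10.0)
```

In this example the current is the same on every edge, which is the one-dimensional form of the continuity equation div J = 0, while the electric field is larger in the poorly conducting cells.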
Figure 4.20. Permeability of the C-magnet.
Figure 4.21. Magnetostatic vector potential in the C-magnet after the last cycle of the nonlinear iteration.
[Plot: logarithmic vertical axis over the number of iterations (0 to 400), one curve per cycle.]
Figure 4.22. Convergence history of the SOR algorithm in case of a nonlinear C-magnet.
[Plot: logarithmic vertical axis over the number of iterations (0 to 40), one curve per cycle (Cycle 2 to Cycle 10 and the last cycle).]
Figure 4.23. Convergence history of the preconditioned cg algorithm in case of a nonlinear C-magnet.
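The qualitative difference between the two convergence histories can be reproduced on a small model problem. The sketch below is only an illustration under stated assumptions: it uses the one-dimensional Poisson matrix instead of the C-magnet system, a fixed relaxation parameter of 1.5 for SOR, and a simple Jacobi (diagonal) preconditioner for cg rather than the preconditioners used in the text. All function names are hypothetical.

```python
import numpy as np

def model_matrix(n):
    """1D Poisson matrix tridiag(-1, 2, -1): symmetric positive definite."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def sor(A, b, omega=1.5, tol=1e-8, max_iter=20000):
    """Successive over-relaxation; returns the solution and the iteration count."""
    x = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        for i in range(len(b)):
            sigma = A[i] @ x - A[i, i] * x[i]   # off-diagonal part, newest values
            x[i] = (1.0 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(b - A @ x) <= tol * b_norm:
            return x, it
    return x, max_iter

def jacobi_pcg(A, b, tol=1e-8, max_iter=20000):
    """Conjugate gradients with diagonal preconditioning."""
    x = np.zeros_like(b)
    r = b.copy()
    d_inv = 1.0 / np.diag(A)
    z = d_inv * r
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = d_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

A = model_matrix(50)
b = np.ones(50)
x_sor, it_sor = sor(A, b)
x_cg, it_cg = jacobi_pcg(A, b)
```

On this model problem cg needs far fewer iterations than SOR with the chosen relaxation parameter; the exact ratio depends on the problem and on the parameter and should not be read as a statement about the C-magnet computation.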
In the following, some field representations are shown for two examples: first a simple Hall element, then a semiconductor problem as a sample coupled problem.
4.3.1 Hall Element
In this example, consider a simplified Hall element without an applied magnetic field. This is a stationary current problem. The conductivity of the material is assumed to be κ ≈ 4 S/m. The resulting real, positive definite linear system is solved using the cg method. After 33 iterations, the accuracy cannot be improved further. The obtained solution satisfies the continuity equation div J = 0 with an accuracy of 0.13 · 10^-4.
Figure 4.24. Simple Hall element and vector representation of the electric field.
The scalar potential is given by the solution vector. The electric field can be directly determined without any loss in accuracy. Figure 4.24 shows the geometry of the problem and a representation of the electric field.
4.3.2 Semiconductor
Figure 4.25. Current flow field and resulting magnetic field in a semiconducting cube. The plot is courtesy of Bartsch, CST GmbH
This example shows the coupled calculation of stationary currents and the magnetostatic fields they excite in a semiconducting cube. Two copper contacts are attached to one face of the cube. The contacts are held at different potentials of ±10 V. Figure 4.25 shows the current flow field and the resulting magnetic field for the investigated semiconducting cube (cf. [23], [24]). For the coupled calculation, the field J of the stationary current computation, which is allocated on the FIT grid G, has to be transferred to the dual FIT grid G̃. The vector that represents J on G̃ satisfies the continuity equation and can be used as the excitation for the magnetostatic computation. This procedure is described in detail in [25], [23], [24]. In our example, the relative accuracy of the continuity equation div J = 0 is better than 10^-6. The results of the coupled calculation are displayed in Fig. 4.25.
4.3.3 Circuit Breaker
This practical example (presented also in [24]) deals with a circuit breaker.² Figure 4.26 displays its geometry. The upper left and right regions in Fig. 4.26 are the two fixed parts of the contact. They are connected by the movable contact bridge shown in the lower region.
Figure 4.26. Geometry of a circuit breaker. The plot is courtesy of Bartsch, CST GmbH
To simulate the current, assume there is a potential difference between the contacts. The resulting current flow J is displayed as a three-dimensional arrow plot in Fig. 4.27. The potential distribution Φ_E on the surface of the structure is shown in Fig. 4.28. Figure 4.29 shows an arrow plot of the field strength H as a result of the coupled calculation of the magnetic field. An essential feature of the circuit breaker is that the electromagnetic force, which grows with the current, can be used to separate the bridge from the fixed contacts once the current exceeds a certain threshold value. Figure 4.30 shows the absolute value of the force on the bridge as a three-dimensional contour plot. Forces arise mainly in the vicinity of the contact area. The calculated forces are in good agreement with measurements.
² Thanks are due to P. Steinhauser from Rockwell Automation, Switzerland, for the model and measurement of the circuit breaker example.
Figure 4.27. Current flow field J. The plot is courtesy of Bartsch, CST GmbH
Figure 4.28. Electric potential distribution Φ_E. The plot is courtesy of Bartsch, CST GmbH
Figure 4.29. Magnetic field strength H. The plot is courtesy of Bartsch, CST GmbH
Figure 4.30. Absolute value of the force on the contact bridge. The plot is courtesy of Bartsch, CST GmbH
4.4 Stationary Heat Conduction; Coupled Problems
As was already discussed in subsection 2.3, stationary temperature problems lead to Poisson's equation, just like the static problems treated above. The underlying analytic and the corresponding FIT equations are given as

$$\operatorname{div} \kappa_T \operatorname{grad} T = -w$$
Here the temperature distribution on a board is chosen as an example. Further examples of temperature calculations are given in subsection 5.6 for several coupled problems arising in accelerator physics:
1. Inductive soldering, which needs the coupled computation of an eddy current problem, i.e., an excited time-harmonic problem, and a stationary temperature problem. Figures 5.46, 5.47, 5.48 and 5.49 display results for all stages of the coupled computation.
2. Temperature distribution in an rf cavity, which requires the coupled eigenmode computation, i.e., the solution of a time-harmonic problem, and a stationary temperature problem. Figures 5.50 - 5.53 show the solutions of the single problems in different representations.
3. RF window, which requires a general time domain simulation coupled with the temperature simulation. Figures 5.54 - 5.56 display the results.
4. Waveguide with a load, which also requires combined electromagnetic time domain and temperature simulation. The results are shown in Fig. 5.58 and 5.59.
4.4.1 Temperature Distribution on a Board
A board with several ICs was investigated with regard to the heating of neighbouring components by a heated IC. In the numerical calculation, the heated IC corresponds to a material with given temperature, thus serving as a heat source. The surrounding space is open, i.e., for the simulation it is assumed that the temperature vanishes at infinity. Correspondingly, all temperatures given in the figures are relative to the outer temperature of the board; they should not be understood as absolute temperatures. The problem domain has dimensions 9.4 cm x 6.4 cm x 6.0 cm. The heat source, which might be, for example, the CPU of a PC, is assumed to be 70° Celsius (or Kelvin) warmer than the surrounding air. The other components have the following conductivities: the substrate and the mechanical connections κ = 15 S/m, the small ICs κ = 100 S/m, the sheets κ = 200 S/m, and the air κ = 0.588 S/m. The linear system was solved using the cg method with ILU(3) preconditioning. The implemented cg solver controls the iteration process automatically: it ends the process as soon as the residual stagnates. After 33 iterations, this criterion was satisfied and the value of the relative residual was 0.2179827 · 10^-6. The required cpu time was about 72 seconds on a SUN Sparc Server. Substituting the solution into the divergence equation yields 0.734857 · 10^-4 as the relative accuracy of the solution. Figure 4.31 shows the board with two isometric planes which visualize temperatures 55° and 80° above the room temperature.
Figure 4.31. Board with different ICs. The big IC is assumed to be the heat source with temperature 70° Celsius above that of the surrounding air. Displayed are the board and isometric planes at 55° and 80° above the room temperature.
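The automatic iteration control mentioned above, which stops the cg solver once the residual stagnates, can be sketched as a small test on the residual history. The rule below is only one plausible form of such a criterion; the window length and the reduction factor are hypothetical tuning parameters, and the text does not state how the implemented solver decides.

```python
def has_stagnated(residuals, window=5, min_reduction=0.9):
    """Return True if the residual barely decreased over the last `window` steps.

    residuals: list of residual norms, one entry per iteration.
    The iteration counts as stagnated if the newest residual is still larger
    than `min_reduction` times the residual `window` iterations earlier.
    """
    if len(residuals) <= window:
        return False  # not enough history to judge
    return residuals[-1] > min_reduction * residuals[-1 - window]

# Steadily decreasing history: the iteration should continue.
decreasing = [0.5 ** k for k in range(12)]
# History that levels off near 2e-7: the iteration should stop.
levelled = [0.5 ** k for k in range(12)] + [2.0e-7] * 8
```

A solver loop would append the current residual norm after every iteration and terminate as soon as `has_stagnated` returns True, reporting the last residual as the achieved accuracy.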
4.5 Electro-Quasistatics
As was already discussed in subsection 2.3, electro-quasistatic problems lead to a complex Poisson equation in which the electric field E is represented as the gradient of a complex scalar potential: E = -grad φ.
4.5.1 High Voltage Insulators with Contaminations
High voltage engineering is an important branch of electrical energy engineering. Electric voltages of more than 1000 Volt (1 kV) are referred to as high voltages. For these voltages, electric fields may cause strong discharging effects. High voltage is applied, e.g., in the long distance transmission of electric energy via overhead lines. The operating frequency is 50 Hz. The resulting electromagnetic fields are slowly varying fields which depend mainly on the displacement current. Thus, the equations of electro-quasistatics have to be solved. An important problem in high voltage engineering is posed by phenomena which are caused by humidity or contamination of the insulator, in particular material aging, which is caused by breakdowns. Figure 4.32 shows a typical example of an insulator as well as a test specimen for experimental studies on aging processes. High voltages are applied at the insulators. Below a critical voltage U_k, the electrostatic field of the charge-free space prevails. The dielectric can be assumed to be a perfect insulator free of space charges. Above the critical voltage U_k, the insulating material loses its insulating properties and becomes the carrier of a discharge which makes a conducting connection along the insulator. The coalescence of single drops on humid insulators makes it possible to reach the discharge voltage. This is displayed in Fig. 4.33.
Figure 4.32. Insulator and epoxid resin specimen with a layer of drops as it develops on the surface after water sputtering.
Figure 4.33. Discharge on epoxid resin specimen with a layer of drops as it develops on the surface after water sputtering. The photo is courtesy of Weiland, TU Darmstadt, and Philipp Morris Company
So far, the computations of superposed electrostatic fields and electric fields of lossy dielectrics were not satisfactory, because a suitable model which also allowed detailed quantitative studies was missing: In the course of the studies, only one publication [274] could be found that discretizes the same differential equation, although with the Finite Element Method (FEM) and only for the two-dimensional case. If this problem type is simulated at all with some discretization method, then it is simulated as a static model and mostly only in the two-dimensional case (see, e.g., [241], where the Boundary Element Method (BEM) is used). The discretization of the electro-quasistatic equations with the Finite Integration Technique presents a suitable model for investigations of the superposed electrostatic fields and electric fields of lossy dielectrics.
4.5.2 Surface Contaminations
The electric field on a contaminated surface of a solid dielectric is different from that on a clean and dry surface. The electrostatic field on the surface of a dry and clean dielectric can be simulated by a series of n capacitors. A conductive contamination leads to an increase in surface conductivity and can influence the resulting electric field distribution. Then a series of n capacitors, each connected in parallel with a resistor, gives a suitable equivalent circuit. Qualitatively, the question of surface contaminations was already studied in extenso [156]. In [208], experimental studies are presented on the influence of weakly conductive contaminations on the surface aging of on-load alternating voltage cylindrical specimens made of epoxy-resin. The specimens differ, e.g., by the shape of their electrodes. The influence of the electrode shape on the field arising in their neighbourhood is dramatic. The phenomenon of the so-called "electrolytic partial discharge erosion" causes an increase of the total conductivity of the surface layer. In most cases, the local conductivity depends non-linearly on the local electric field strength. This dependence is again different for different types of contamination.
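The equivalent circuit described above can be evaluated with complex arithmetic. The following minimal sketch computes how an applied alternating voltage divides over a series chain of elements, each consisting of a capacitor in parallel with a resistor. The component values are hypothetical and chosen only to show the effect; an infinite resistance represents a clean, dry surface element.

```python
import math

def chain_voltages(u_total, capacitances, resistances, frequency):
    """Complex voltage across each element of a series chain of parallel R-C pairs."""
    omega = 2.0 * math.pi * frequency
    impedances = []
    for c, r in zip(capacitances, resistances):
        admittance = 1j * omega * c + 1.0 / r  # 1/r is 0.0 for r = math.inf
        impedances.append(1.0 / admittance)
    z_total = sum(impedances)
    return [u_total * z / z_total for z in impedances]

u = 5000.0                     # applied voltage in V (hypothetical)
caps = [1e-12, 1e-12, 1e-12]   # element capacitances in F (hypothetical)

# Clean and dry surface: purely capacitive chain.
dry = chain_voltages(u, caps, [math.inf, math.inf, math.inf], 50.0)
# Conductive contamination on the middle element only.
wet = chain_voltages(u, caps, [math.inf, 1e9, math.inf], 50.0)
```

In the dry case the voltage divides evenly. With the conductive middle element, the voltage across that element drops and the neighbouring elements carry more, which mirrors the field increase next to a contamination.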
4.5.3 Fields on High Voltage Insulators
In [208], experimental studies are presented on the influence of weakly conductive contaminations on the surface aging of on-load alternating voltage cylindrical specimens made of epoxy-resin. The specimens differ, e.g., by the shape of their electrodes. The influence of the electrode shape on the field arising in their neighbourhood is very substantial: For a protruding disc electrode, the maximal electric field is about fifteen times higher than the homogeneous electric field; for the toroid electrodes, it is about six times higher, and for electrodes similar to the Rogowski profile, it only increases by about 10% [208].
Cylindrical Specimen with Toroid Electrodes. For numerical studies, one of these specimens was chosen: The caps are each 6 mm thick and have radius 18 mm. The computed model is a 30 mm long, solid piece of the originally [208] 100 mm long hollow cylindrical specimen of radius 15 mm. The epoxy-resin has relative permittivity ε_r = 4, the relative permittivity of the water drops is ε_r = 81, and their electric conductivity may be assumed to be κ = 10^-6 S/m. The frequency f of the alternating voltage is 50 Hz, and a voltage gradient of 5 kV/cm is applied. The size and form of the water droplets vary. Neglecting deformations caused by the electric field, we assume a round shape with a typical diameter of 1-3 mm. As is clear from the picture in Fig. 4.32, the drops are close together but randomly distributed. A technical drawing of the specimen and a picture of the experimental studies described in [208] are displayed in Fig. 4.34. The problem domain was discretized using a 57 x 57 x 73 grid, leading to a linear system with n = N = 237177 complex unknowns. First, a constant radius of 3 mm was assumed for the water droplets. Figure 4.35 shows (in the left part) a representation of the electric field when only seven droplets of radius 3 mm are assumed on the surface. On the right, a contour plot of the real part of the complex potential is displayed.
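With the material data quoted above, a short calculation shows why neither the conduction current nor the displacement current in the water drops can be neglected at 50 Hz. The sketch compares κ with ωε for the water drops; the function name is hypothetical, and the vacuum permittivity is the only value not taken from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def conduction_to_displacement_ratio(kappa, eps_r, frequency):
    """Ratio kappa / (omega * eps0 * eps_r) of conduction to displacement current density."""
    omega = 2.0 * math.pi * frequency
    return kappa / (omega * EPS0 * eps_r)

# Water drops: kappa = 1e-6 S/m, eps_r = 81, f = 50 Hz (values from the text).
ratio_water = conduction_to_displacement_ratio(kappa=1e-6, eps_r=81.0, frequency=50.0)
```

The ratio comes out at roughly 4.4, so both current contributions are of comparable size inside the drops. A purely electrostatic model, which keeps only the permittivity, therefore misses a contribution of the same order.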
Figure 4.36 shows isometric lines of the real part of the electro-quasistatic potential when the water droplets are arranged in one row along the specimen. So far, simulations for high voltage insulators were in most cases based on electrostatic calculations. Since the electrostatic model does not include the current density and the displacement current, this model obviously contains a systematic error. Figure 4.37 shows a comparison of the longitudinal electric field along the specimen, i.e., from one electrode to the other. This path crosses some water drops, each of which causes a strong increase in the field. The electrostatic model, however, shows significant discrepancies in exactly these areas, which are the most interesting ones in this study. Consequently, the electro-quasistatic model is to be preferred for simulations of high voltage insulators. In Fig. 4.38, a simulation model is shown with many different drops which are distributed over the complete surface. Some of the water drops have coalesced. For symmetry reasons, only a quarter of the structure is discretized.
Figure 4.34. Hollow cylindrical epoxid resin specimen (1) with toroid electrodes (2), terminating caps (3), and annular seal (4). Also shown is the layer of droplets which develops on the surface of this specimen after water sputtering. The drawing is courtesy of Quint [208]
In each of Fig. 4.39 and 4.40, the absolute value of the electric field is displayed over a specific cross-section. The field increase between the drops and near the electrodes becomes very obvious. Only the field increase near the electrodes remains if the test specimen is completely dry (compare Fig. 4.40). The computed field distributions agree very well with the observations in the corresponding experiments [150]. Quantitative comparisons are hardly possible for such specimens for reasons related to the measurement techniques. As a result, simulation, manufacture, and measurement of special specimens are planned for a quantitative validation of the numerical method. A comparative evaluation against other numerical methods is not possible for the time being since only electrostatic models are available (e.g., [241]).
4.5.4 Outlook
In the future, further numerical and experimental studies are foreseen. The influence of the electrode shape, which was already experimentally investigated in [208], shall be compared with numerical results. The relation between the applied field strength and the shape deformation of the droplets found in the experiments described in [55] shall be reproduced numerically by simulating the deformation caused by the electric field. In this process, the field and the resulting deforming forces are computed iteratively. Also, a typical insulator shape (as in Fig. 4.32) shall be simulated. Finally, a direct measurement of the electric field is needed for the quantitative validation of the simulation. For that, single water droplets are put by a pipette on a flat specimen, as is shown in Fig. 4.41, or on a specimen with electrodes inside, and the fields are measured while the conductivity of the water is varied. The enclosed electrodes shall avoid a possible field increase at the boundaries of the electrodes. Measurements of this kind shall facilitate precise comparisons between simulation and practice.
Figure 4.35. Vector representation of the electric field Re(E) and contours of the real part of the complex potential Re(φ_E) for a specimen with only a few water droplets.
Figure 4.36. Isometric lines of the real part of the electro-quasistatic potential Re(φ_E) in a cross-section with four water drops in one row.
[Plot: horizontal axis z/m; legend: Electro-Quasistatics (dashed line), Electrostatics (dotted line).]
Figure 4.37. Comparison of the results from electrostatic and electro-quasistatic computation. The dashed line with the most pronounced minimum shows the z-component of the electric field Re(E) as a function of z as obtained from the electro-quasistatic computation. The dotted line shows the z-component of the electric field E resulting from electrostatic computation.
Figure 4.38. Specimen with many partly coalesced water droplets of varying size. Displayed are the studied geometry (left) and the equipotential lines of the real part of the complex potential Re(φ_E) (right).
Figure 4.39. Representation of the absolute value of the electric field over some specific (y,z)-cross-section (x = 0.7 cm) for a specimen with many separate drops of varying size.
Figure 4.40. Representation of the absolute value of the electric field over some specific (y,z)-cross-section (x = 0.7 cm) for a specimen without any water drops.
Figure 4.41. Electric field on some flat epoxid-resin specimen with electrodes on top. A row of three water droplets is put between the electrodes.
4.6 Magneto-Quasistatics
In subsection 2.3, we derived the FIT equation for magneto-quasistatics. As is explained in more detail in [59] and [57], several aspects, like regularization, have to be taken into account in the setup of the numerical model, eventually leading either to wave equations for the electric grid voltage e and the magnetic vector potential a or to a system of differential-algebraic equations (DAE) of order 1 with a regular matrix stencil:

$$\left(\tilde{C} D_\mu^{-1} C + D_\varepsilon \tilde{S}^T D_N \tilde{S} D_\varepsilon\right) e + D_\kappa \frac{d}{dt} e = -\frac{d}{dt} j$$

$$\left(\tilde{C} D_\mu^{-1} C + D_\varepsilon \tilde{S}^T D_N \tilde{S} D_\varepsilon\right) a + D_\kappa \frac{d}{dt} a = j$$
where the term with the normalizing diagonal matrix D_N represents a local Coulomb gauging [116]. This system is solved by standard implicit one-step methods of integration with respect to time [117]. Examples of such standard time-marching algorithms are the θ-methods [331], which include the well-known implicit backward Euler (θ = 1), Galerkin (θ = 2/3), and Crank-Nicolson (θ = 1/2) methods. A good compromise between the numerical effort arising from the necessarily iterative solution of two linear systems for each time step and the achievable L-stability is given by the stiffly accurate SDIRK2 scheme [1], which is of order 2 for DAEs of order 1. Also applicable are Gear's backward differentiation formulas (BDF) [117]. For the startup phase of the second order BDF2 method, the SDIRK2 method can be applied [56], [59]. After discretization with respect to time, the DAE system yields linear systems of the form

$$\left(D(\Delta t_{n+1}) + A\right) y_{n+1} = b,$$

where D(Δt_{n+1}) is a diagonal matrix with entries depending on the current step size. The formulations were chosen to yield real-valued, symmetric, positive (semi-)definite linear algebraic systems, to which the classical preconditioned conjugate gradient algorithm is applicable. The best results with respect to the total computational time so far were achieved by an efficient implementation of the SSORCG method using an operator-type matrix-vector multiplication [59]. The cg method is especially well-suited for this type of semi-definite problems and also gives meaningful solutions to the strongly singular systems arising from discretizations with respect to time. In the case of non-linear magnetic material, the system to be solved is equivalent to a nonlinear problem, which can be solved either by methods of successive approximation or by Newton-Raphson schemes [158].
4.6.1 TEAM Benchmark Problem
This simple magneto-quasistatic example shows an eddy current simulation of the Team Workshop benchmark problem 7 [38], [119] at 50 Hz. The structure has been modelled using about 47000 mesh cells. The total simulation time of the frequency domain solver is less than 1 hour, whereas the implicit time domain calculation with a transient building-up phase over 1.5 periods requires about 2 (4) hours for 30 time steps with 30 (60) PCG solutions on a SUN ULTRA SPARC 1, depending on the chosen time integration scheme for the vector potential formulations. The deviation of the otherwise excellent results from the measured values is due to the staircase approximation of the coil curvature.
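Each implicit time step counted above amounts to one or more linear solves. As a minimal sketch of how a θ-method produces such a system, consider the generic problem M y' + K y = 0, where M stands in for the conductivity matrix and K for the stiffness-like operator; both names and the dense direct solver are assumptions made for illustration, not the scheme used for the benchmark.

```python
import math
import numpy as np

def theta_step(M, K, y, dt, theta):
    """One step of the theta-method for M y' + K y = 0.

    Solves (M/dt + theta*K) y_new = (M/dt - (1 - theta)*K) y_old.
    theta = 1 gives backward Euler, theta = 1/2 gives Crank-Nicolson.
    """
    lhs = M / dt + theta * K
    rhs = (M / dt - (1.0 - theta) * K) @ y
    return np.linalg.solve(lhs, rhs)

def integrate(M, K, y0, dt, n_steps, theta):
    """Apply n_steps theta-steps of size dt, starting from y0."""
    y = np.asarray(y0, dtype=float)
    for _ in range(n_steps):
        y = theta_step(M, K, y, dt, theta)
    return y

# Scalar decay problem y' + y = 0, y(0) = 1, integrated up to t = 1.
M = np.array([[1.0]])
K = np.array([[1.0]])
y_backward_euler = integrate(M, K, [1.0], dt=0.1, n_steps=10, theta=1.0)
y_crank_nicolson = integrate(M, K, [1.0], dt=0.1, n_steps=10, theta=0.5)
```

The left-hand matrix has the form of a step-size dependent part plus a fixed operator, which matches the structure of the linear systems quoted in the text. For the scalar test problem, Crank-Nicolson is visibly closer to the exact value exp(-1) than backward Euler at the same step size.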
Figure 4.42. Geometry of TEAM benchmark problem No. 7 and measurement path. The plot is courtesy of Clemens, TU Darmstadt
Figure 4.43. Field strength. The plot is courtesy of Clemens, TU Darmstadt
4.7 Time-Harmonic Problems
The Helmholtz equation and its discrete analogue obtained using the Finite Integration Technique, the so-called discrete Curl-Curl Equation or discrete Helmholtz equation (cf. subsection 2.3), are given by

$$\left(\operatorname{curl} \frac{1}{\mu} \operatorname{curl} - \omega^2 \varepsilon'\right) E = -i\omega J_E, \qquad \left(\tilde{C} D_\mu^{-1} C - \omega^2 D_{\varepsilon'}\right) e = -i\omega j_e.$$
The right-hand sides $-i\omega J_E$ and $-i\omega j_e$ represent the impressed current excitation. ε' combines the complex conductivity and permittivity. In case of resonant modes, the right-hand sides are equal to zero and an eigenvalue problem results. Some kind of gauging is applied with the aim of shifting static solutions to facilitate the numerical solution of the eigenvalue problem, which then is given by:

$$\left(D_\varepsilon^{-1} \tilde{C} D_\mu^{-1} C + \tilde{S}^T D_N \tilde{S} D_\varepsilon\right) e - \omega^2 e = 0.$$

Eigenvalue problems are not the main topic of this book, but the reader can find some examples of this type of problem in section 5 (Fig. 5.1) and in subsections 5.2 (Fig. 5.6, 5.7), 5.4 (Fig. 5.20, 5.28, 5.29), and 5.6 (Fig. 5.50). However, the computation of resonant fields by other methods may well lead to linear equation systems: the mode matching technique, described in subsection 2.1, is one example of this. With this method, the natural frequencies (eigenfrequencies) are found as zeros of some matrix determinant (see (2.1.3)), and the amplitudes of the wave excitation are obtained by solving the associated linear system (cf. (2.1.3) and subsection 2.1). Some results of resonant field computations are given in subsection 2.1 (Fig. 2.5 - 2.7) and subsection 5.4 (Fig. 5.19, 5.20 and related figures).
In the convergence studies in subsection 3.10, two time-harmonic applications with excitation were investigated. Both are problems which can also be solved in time domain and were first simulated there. Both examples are only briefly described and some typical results are presented. For a more detailed description, see the literature referred to there. Another example of a time-harmonic application with excitation is the eddy current problem related to inductive soldering described in subsection 5.6 (Fig. 5.46).
[Plot: values over the measurement path in m (0.00 to 0.20); legend: measured values, frequency domain solution, quasistatic SDIRK2, quasistatic BDF2, wave equation.]
Figure 4.44. Comparison of magnetic field strength values for different simulation models and measurement. The plot is courtesy of Clemens, TU Darmstadt
4.7.1 3 dB Waveguide Coupler
The 3 dB waveguide coupler, or, to be more precise, the 3 dB rectangular waveguide directional coupler,³ was calculated in time domain in [76] and compared with measurements and mode matching calculations before it was simulated as a time-harmonic problem in [58]. For the latter, the coupler was discretized on a 53 x 2 x 128 grid (N = 13568 points). It consists of two rectangular waveguides whose wide sides face each other. The coupler is to distribute the energy from one of the waveguides evenly into the two waveguides. For this purpose, six connecting slits, differing in height and spacing, lie between the waveguides over their full width. They were designed so that the optimal frequency response lies in the range 11-12.5 GHz. In this frequency range, the coupling is even better than 2.84 dB [76].
Figure 4.45. Electric field Re(E) in the 3 dB waveguide coupler. The walls of both waveguides (top and bottom in the picture) are shown transparent to allow a view of the field.
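The decibel figures quoted for the coupler can be translated into power fractions with a two-line conversion. The sketch below assumes that the quoted values are power ratios expressed as positive attenuation in dB; the function names are hypothetical.

```python
import math

def db_to_power_fraction(attenuation_db):
    """Fraction of the input power that corresponds to an attenuation in dB."""
    return 10.0 ** (-attenuation_db / 10.0)

def power_fraction_to_db(fraction):
    """Attenuation in dB that corresponds to a given power fraction."""
    return -10.0 * math.log10(fraction)

half_power_db = power_fraction_to_db(0.5)     # an even split into two waveguides
fraction_at_2_84 = db_to_power_fraction(2.84) # the value quoted from [76]
```

An even split corresponds to about 3.01 dB, which explains the name of the device, and 2.84 dB corresponds to roughly 52% of the input power under the stated assumption.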
Figure 4.45 shows the electric field in the 3 dB waveguide coupler. In this picture, the geometry of the coupler also becomes clear. Figure 4.46 shows a comparison of the reflection coefficient S_11 with the time domain computation [76] for three different frequencies. The agreement between the two solution methods is very good.
4.7.2 Microchip
In this example, a section of a microchip was considered in the frequency range of 10 to 40 GHz with the focus on a possible cross-talk between the bond wires. As is visible in Fig. 4.47, the discretized section consists of two microstrip ports and two thin bond wires which connect the microstrips with resistive blocks on the material. The resistive blocks have conductivity κ = 1.3 · 10^4 S/m, and the substrate has relative permittivity ε_r = 9.0. The dimensions are about 700 μm x 300 μm. The discretization was done on a grid with 71 x 20 x 85 = 120700 points. The cross-talk from one wire to the other was determined for the frequencies 10 GHz and 40 GHz. The comparison of the reflection coefficients obtained in time and frequency domain shows an extremely good agreement, cf. Fig. 16 in [58]. Figure 4.47 shows the cross-talk of the electric field at 10 GHz. The electric field Re(E) is displayed in vectorial, logarithmically scaled form.
³ The coupler was manufactured and measured by the company MBB. The results of measurements and mode matching calculations were provided to Dohlus [76] in private communication.
[Plot: reflection coefficient over frequency in GHz (10.0 to 15.0); legend: time domain computation (line), frequency domain computation (markers).]
Figure 4.46. Reflection coefficient S_11 of the 3 dB waveguide coupler. The results from the frequency domain calculation show a very good agreement with the time domain solution.
Figure 4.47. Cross-talk of the electric field Re(E), logarithmically scaled, at a frequency of 10 GHz.
4.8 General Time-Dependent Problems
General time-dependent problems lead to initial value problems which can be solved by explicit procedures based on discretization with respect to time. The analytic operator and the corresponding FIT operator are given by

$$L = \begin{pmatrix} 0 & \dfrac{1}{\varepsilon_0 \varepsilon_r} \operatorname{curl} \\[2ex] -\dfrac{1}{\mu_0 \mu_r} \operatorname{curl} & 0 \end{pmatrix}$$

Thus, no linear system has to be solved for this problem type, so it is not in the scope of this book. Yet, some examples of the solution of Maxwell's equations in time domain can be found in subsection 5.6: Results of the field computation for an rf window are displayed in Fig. 5.54 and Fig. 5.55. Figure 5.58 shows the field distribution in a waveguide with a load.
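To illustrate why no linear system arises, the following minimal sketch advances a one-dimensional field pair with an explicit leapfrog update on staggered grids. It uses normalized units, a Courant number of 0.5, and perfectly conducting ends; all of these choices, as well as the function name, are assumptions for illustration and do not reproduce the three-dimensional FIT scheme of the book.

```python
import numpy as np

def leapfrog_1d(n_cells=200, n_steps=400, courant=0.5):
    """Explicit leapfrog time stepping for 1D fields on staggered grids.

    e lives on the n_cells + 1 grid nodes, h on the n_cells cell midpoints.
    Each step only needs differences of neighbouring values, no linear solve.
    """
    x = np.arange(n_cells + 1)
    e = np.exp(-0.5 * ((x - n_cells / 2) / 8.0) ** 2)  # initial Gaussian pulse
    e[0] = e[-1] = 0.0                                  # perfectly conducting ends
    h = np.zeros(n_cells)
    for _ in range(n_steps):
        h -= courant * (e[1:] - e[:-1])        # update h from the curl of e
        e[1:-1] -= courant * (h[1:] - h[:-1])  # update interior e from the curl of h
    return e, h

e_final, h_final = leapfrog_1d()
```

The update stays bounded as long as the Courant number does not exceed one in this normalized setting, which is the price of avoiding linear solves: the time step is limited by the grid spacing.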
4.9 Bibliographical Comments In this section, several applications which require solutions of large linear systems have been presented. Most of the examples have already been studied from the point of view of convergence of applicable algorithms in subsection 3.10. There appeared some field plots and other relevant plots. In general, the author is not aware of any textbook covering applications from electrical engineering from the point of view of solution of the resulting linear system (which was one of the main reasons to write this book). Consequently, mainly research papers, student's papers, and dissertations have been cited in this section. The most relevant literature concerning the abovementioned examples has already been listed in the bibliographical comments of section 3 and will not be repeated here again. Electrostatics The electrostatic example was originally described in [72], where it was solved by the SOR algorithm and an older version of the CAE package MAFIA[303].
240
4. Applications from Electrical Engineering
Magnetostatics The linear C-magnet and the sensor are just two typical and rather simple examples discretized by Weiland and his group; see section 3 for references. The nonlinear C-magnet was first described in [292]. A description of the Newton-Raphson-like procedure can be found, e.g., in [191]. Stationary Current Problems; Coupled Problems The Hall element was discretized by the author. More details on the semiconducting cube can be found in [23] and [24]. [25] describes the computation of electromagnetic forces with FIT. The circuit breaker was modelled and measured by Rockwell Automation. The simulations for the circuit breaker have been first presented in [24]. Stationary Temperature Problems; Coupled Problems The discretization of the PC board was taken from some time domain simulation of Weiland and his group. The temperature simulation was done by the author. The temperature simulations in the coupled problems from subsection 5.6 were guided by the author. Electro-Quasistatics Electro-Quasistatics is one of the author's main research topics of the last years. It started out from some ideas and discussions with Weiland. Besides theoretical studies, the implementation in MAFIA was carried out and some examples, like the insulator, have been discretized. In the course of the studies, only one publication (274) could be found on a 2D FEM discretization of the same differential equation. A 2D static BEM model is described in (241). Extensive qualitative experimental studies have been carried out under the guidance of Konig [156], [208], and [55); see also [150]. Magneto-Quasistatics In order to complete the spectrum of examples, some recent examples by Clemens have been presented. The examples and the theory underlying the FIT implementation are described in detail in [59] and [57]. Some of the ideas have first been described in [116]. The low frequency Team Workshop benchmark problem 7 was first solved by FEM ([38], (119)), then by FIT ([59) and [57)). 
Textbooks dealing with the implicit time-stepping algorithms are, e.g., [1], [117], and [331]. A German textbook describing the solution of nonlinear electromagnetic problems is [158].
4.9 Bibliographical Comments
Time-Harmonic Problems. Several examples for this problem type are described in subsections 5.2 and 5.6 as well as in subsections 2.1 and 5.4. Please refer to the bibliographical comments in the corresponding sections. The waveguide coupler was manufactured and measured by the company MBB. The results of the measurement and mode matching calculations were provided to Dohlus [76], who solved the problem in time domain and compared the results with measurements and mode matching calculations before it was simulated as a time-harmonic problem by Clemens et al. in [58]. The cross-talk study for the microstrip was previously described in [58].
General Time-Dependent Problems. Some coupled problems involving time domain simulations are described in subsection 5.6; see bibliographical comments there.
5. Applications from Accelerator Physics
High energy physics studies the elementary constituents of matter. Adequate instruments - comparable with microscopes - for studying elementary particles are accelerators, or storage rings. In the construction of large accelerators, technical components are used which are on the edge of feasibility and require a notable amount of funds. Therefore, most precise statements about the capability of a new accelerator are a substantial part of the design. Based on the methods introduced above, computer codes were developed which serve for the optimization of certain technical components of an accelerator, especially of the accelerating sections. Besides their application in accelerator physics, these methods can be used practically everywhere in electrical engineering where microwave systems, e.g., waveguides or resonators, are used.
5.1 Acceleration of Elementary Particles
The simplest apparatus for the acceleration of a charged particle is the plate capacitor. In accelerator physics, the required devices must supply very high voltages (up to $10^{11}$ Volt (100000 MV)). Yet, such high voltages cannot be created between a pair of electrodes. In order to reach the desired high energies, high frequency alternating currents are used and the energy supply is carried out in several steps. The technique for creation of those high frequency waves is the same as in radio or TV engineering. The transport of such high frequency waves with wave length 0.1-1 m takes place in hollow metallic structures that are also of size 0.1-1 m.
Waveguides. A waveguide is characterized by the fact that it carries propagating electromagnetic waves. In general, the z-coordinate is assumed to be the direction of propagation. Then an ideal waveguide is given by a 3-dimensional domain which extends infinitely in the z-direction with perfectly conducting walls forming the transversal boundaries. Waveguides are filled with air or vacuum and often also with dielectrics or ferrites. A real waveguide is some kind of a pipe with walls of finite conductivity. The rf waves are carried to the accelerating sections of the accelerator. These sections are driven so that the particles pass them at the same moment as the accelerating field of the wave.
U. Rienen, Numerical Methods in Computational Electrodynamics © Springer-Verlag Berlin Heidelberg 2001
Cavities. In storage rings, the accelerating structure is usually designed to make the electromagnetic fields stay inside by resonance. In its most essential electromagnetic properties, such an accelerating structure does not differ from a resonant cavity. An ideal resonant cavity is a finite volume where perfectly conducting walls form the boundaries. It may be filled with some material, e.g., some dielectrics or ferrite. A cavity made of a material with finite conductivity having one or several openings is still referred to as a cavity if it is possible to create resonant oscillations in the cavity by applying electromagnetic modes of appropriate frequency. For such frequencies, standing waves develop by reflection of the electromagnetic waves at the boundaries in their direction of propagation. The eigenfrequencies belonging to these resonant oscillations are called resonant frequencies. Then the system is in resonance. An ideal cavity has a $\delta$-like resonant spectrum. In a cavity with finite conductivity, the resonances are no longer undamped. The damping leads to a finite resonance width. Therefore the resonances can occur in the neighbourhood of an eigenfrequency. The eigenfrequencies of the cavities lie in the rf range¹.
The resonant oscillations are time-harmonic oscillations. These harmonic oscillations with finite energy in a given domain are often denoted as modes. The modes form a complete system of functions in $L^2(0, 2\pi)$, i.e., any oscillation can be described as a combination of all modes where amplitude and phase are suitably chosen. Therefore, the knowledge of the harmonic oscillations which solve Maxwell's equations is sufficient to describe all possible oscillation forms. A resonant mode is suitable for the "acceleration" of (or, more precisely, for the supply of kinetic energy to) the particles if it has a strong longitudinal electric field on the axis. Figure 5.1 shows the electric field of such a $TM_{010}$ mode.
In a storage ring (resp. linear accelerator), resonant modes (resp. traveling waves) are used for the energy supply to the particles. Upon passing the electric fields transmitted in the cavities (resp. traveling wave tubes), the particles experience a force which increases their kinetic energy. Here the laws of relativity apply to the ultra-relativistic particles. Accordingly, the speed of light cannot be exceeded. Consequently, these particles do not get faster but heavier according to $E = mc^2$.
In the sequel, we present computations for cavities, aperiodic waveguides, and rf windows. The mode matching technique and the Finite Integration Technique were used to compute traveling waves and standing waves (modes) in accelerating structures. While passing the accelerating structures, the particle bunches themselves cause electromagnetic fields, which are undesirable. Their interactions with the generating and all following particles have to be studied together with the possibilities to suppress those parasitic fields. The results presented for dipole modes in a typical accelerating structure for future linear colliders [267] elucidated one typical problem of these structures,
¹ $f \geq 300$ MHz
Figure 5.1. Electric field of the fundamental mode of a cylindrically symmetric cavity.
the so-called trapped modes. These results initiated a series of theoretical and experimental studies. Some of the successive studies are also described here. Furthermore, calculations of temperature were carried out for the resulting wall losses. The simulation algorithm for the computation of temperature is a helpful tool for many problems related to the accelerator operation. This algorithm also proved itself useful for some problems on manufacturing techniques, in particular, for the soldering of accelerating structures.
5.2 Linear Colliders
In what follows, a special field of applications of accelerator physics is treated in more detail. This field is highly interesting and important, if not from the numerical, then certainly from the physical point of view. In the future, $e^+e^-$ physics will be interested in such high energies (500 GeV up to 1.5 TeV) that storage rings will no longer be applicable because of their high energy losses by synchrotron radiation. This necessitates the construction of a linear collider, and studies toward this goal have already been carried out worldwide. The results for the S-band 2 x 250 GeV linear collider study SBLC [311] are discussed in detail. In this context, a program based on mode matching has been developed [279] and a series of numerical studies on the mode matching technique [183], [87], [204], [239] as well as field calculations for the accelerating traveling wave and parasitic modes in the accelerating structures [318] were carried out. Since the results of the field calculation gave strong indications of the necessity of radical changes in the
design, a short test structure was designed and an experimental measurement was decided upon [284], [160], [161], [163].
In this subsection on linear colliders and the accelerating structures of the future, we analyze linear accelerators and their influence on the beam dynamics. Those accelerators are referred to as linear colliders. The current concepts of linear colliders can in principle be split into two groups: Most paradigms propose normal conductive structures for the acceleration of elementary particles. One paradigm suggests acceleration by superconductive structures. In the sequel, the normal conductive structures are treated in detail.
All design studies for normal conductive linear colliders foresee traveling wave tubes for the acceleration of the particles. In principle, periodic and aperiodic structures can be used. The advantages and disadvantages of constant impedance and constant gradient tubes will be considered. A special study, the S-band linear collider study, SBLC for short, will be the focus of our discussion. Special emphasis will be on the design of the S-band tube. The SBLC study proposes 2452 so-called constant gradient structures for acceleration. These aperiodic traveling wave tubes will have 180 cells and an accelerating gradient of 17 MV/m. A so-called bunch train of 125 packets of particles (bunches) with a distance of 16 ns between the packets is planned. In order to reach as large a luminosity² as possible, any dilation of the bunches has to be avoided. Effects of scattered fields (wake fields³), which are caused by parasitic resonant oscillations, are among the main factors that cause the dilation of the bunches. Consequently, the suppression of these modes, the so-called Higher Order Modes (HOMs), is of fundamental importance for the actual design of the collider. The interaction of the single HOMs with the bunch can be described by the so-called loss parameter, which will be treated later in detail.
The evidence collected so far and some theoretical and numerical preliminary investigations suggest that the modes of the first dipole band would cause the most harmful dilating effects. Therefore, the main interest lay in computations for this dipole band. In what follows, mainly the dipole modes of the S-band tube, their influence on the beam dynamics, and possible measures for their suppression are discussed. With the mode matching method, a numerical analysis was carried out for the HOMs. The frequencies and loss parameters of the dipole modes
² The high energy experiments aim for a reaction between the elementary particles of the colliding bunches. Such a reaction will only happen for those elementary particles out of the two colliding bunches which are close enough; most of the particles will just pass on. The luminosity measures the probability of the event that some elementary particles will cause a reaction. A good luminosity requires a high number of particles in the colliding bunches, a high number of bunches, and small values for transversal density distributions (i.e., "dense" bunches with a small "cross section").
³ The term for these remaining fields is chosen by association with the wakes of a boat. Wake fields and beam loading are treated in more detail in subsection 5.3.2.
thus computed allow the subsequent computation of the wake potential. So, they are of basic relevance for beam dynamics simulations. Contrary to all preceding qualitative considerations, the detailed numerical analysis showed that a radically different design is needed for the HOM damping: Besides the few strongly interacting modes at the end of the tube with the input coupler, there is a multitude of other strongly interacting modes. Many of these modes are trapped completely in the inner part of the tube, i.e., have no field at both ends of the tube. The grave consequence of this is that one damper is far from sufficient to suppress all dipole modes which are dangerous for the particle beam. For experimental validation of the numerical results, a short test structure was designed which was measured in the microwave laboratory of Frankfurt University. The measurement results confirmed the numerical predictions of the field distribution of the parasitic modes very well.
5.2.1 Actual Linear Collider Studies
Before describing the SBLC study in more detail, let us first make some introductory remarks about linear colliders. In high energy physics, a wide consensus exists that an electron-positron collider should be studied as the next project after the largest accelerator, the LHC (Large Hadron Collider) at CERN. This linear collider should have a center of mass energy of 500 GeV and an event rate, the so-called luminosity, of $10^{33}$ cm$^{-2}$ sec$^{-1}$. For such high energies, storage rings are no longer suitable, since their energy losses by synchrotron radiation are proportional to the fourth power of the energy. Starting at about 100 GeV, the compensation of this energy loss is no longer technically and financially affordable. In 1989, the first linear collider SLC was put into operation. It uses the 3 km long linear accelerator that has existed since the sixties. The goal energy is 50 GeV. In some respects, this machine differs from those studied since it carries electrons and positrons in one common accelerator. Nevertheless, it is of great importance for the studies in question, since the SLC presents a successful prototype of a normal conductive linear collider.
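The fourth-power law quoted above can be made concrete with a small sketch. It assumes that the particle type and the bending radius of the ring stay the same, so that only the dependence on the beam energy remains; the function name and the chosen energies are illustrative and not taken from the studies cited here.

```python
def relative_synchrotron_loss(energy_gev: float, reference_gev: float) -> float:
    """Ratio of the radiated energy at two beam energies.

    Assumption: same particle type and same bending radius, so that only
    the fourth-power dependence on the beam energy remains.
    """
    return (energy_gev / reference_gev) ** 4


# Illustrative comparison: 250 GeV per beam (500 GeV center of mass)
# against the roughly 100 GeV named above as the practical limit.
ratio = relative_synchrotron_loss(250.0, 100.0)
print(f"loss ratio: {ratio:.1f}")
```

Under these assumptions, raising the beam energy by a factor of 2.5 raises the radiation loss by a factor of about 39, which illustrates why a ring of comparable size is no longer an option.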
[Figure 5.2 labels: e- source, injector, pre-accelerator; bunch compressor; main linac; collimator; matching optics; I.P.; wiggler; beam dump]
Figure 5.2. Schematic layout of the S-band linear collider SBLC.
Figure 5.2 shows the schematic layout of the S-band linear collider. This layout is valid for most studies. The electrons as well as the positrons pass through a linear accelerator of their own. These accelerators are opposite to each other. The interaction region, i.e., the experimental zone, is at a small angle of a few milliradians to the center line between the electron and positron linear accelerator. Upstream of the main accelerator, each linear accelerator has a source for the elementary particles, a pre-accelerator, and a section for the bunch compression. The accelerating structures in the main accelerator play a central role in the design studies. They are the topic of the following subsections.
Worldwide, there have been six different design projects of a future linear collider. They are carried out in part by international groups. These are the collider projects TESLA (coordinated by DESY, Hamburg, Germany), SBLC (also coordinated by DESY, Hamburg, Germany), JLC (coordinated by KEK, Tsukuba, Japan), NLC (coordinated by SLAC, Stanford, USA), VLEPP (coordinated by BINP, Novosibirsk, Russia), and CLIC (coordinated by CERN, Geneva, Switzerland). At the conference EPAC 1994, a "Technical Review Committee" with representatives of all the projects was officially founded. Its report [267] of 1995 and the proceedings of the "Next Generation Linear Colliders" workshops [242], [153], [140] give a good overview of the different studies. The physics which is made possible by these linear colliders is described in [327]. Three of these design projects are very similar with respect to their working frequency and are meanwhile also in close co-operation, so that there are essentially four different concepts. Table 5.1 gives a short overview of the design projects.

1. with superconductive accelerating structures (9-cell cavities)
   L-band    TESLA            1.3 GHz
2. with normal conductive accelerating structures (traveling wave tubes with 80-200 cells)
   S-band    SBLC             3 GHz
   X-band    JLC              11.4 GHz
             NLC              11.4 GHz
             VLEPP            14 GHz
             CLIC (2 beams)   30 GHz
Table 5.1. Design projects of future linear colliders
The TESLA study [314] conducted by DESY in international collaboration is the only linear collider project foreseeing the use of superconductive accelerating structures. These are 9-cell cavities each driven at a resonant frequency of 1.3 GHz.
Apart from TESLA, all other collider studies mentioned above foresee normal conducting accelerating structures. These are all traveling wave structures with about 80 to 200 cells. The only project in the S-band regime is SBLC [310], [311] with a working frequency of 3 GHz. This study was conducted by DESY in collaboration with other institutions, e.g., the Darmstadt University of Technology and the Frankfurt University. In the X-band regime, the Japanese study JLC [258], [257] and the American study NLC [194] of SLAC, each working at 11.424 GHz, and the Russian study VLEPP [16] at 14 GHz were initiated. Finally, the third conceptually different project is CLIC of CERN [232] with a frequency of 30 GHz. The last differs essentially from the previous studies, since it proposes the so-called Two-Beam Accelerator (TBA). The concept of TBA is based on a relativistic driving beam with high intensity but medium energy (6 GeV) running parallel to the main accelerator and depositing energy periodically in its 30 GHz traveling wave tubes.
All the design projects face the practical problem of constructing accelerating structures within extremely narrow tolerances. Precisions of up to 1 µm have to be kept in production. Potential scientific problems are dark currents and wake fields. The wake fields will be discussed in more detail below.
TESLA proposes an accelerating gradient of 25 MV/m. This puts high demands on the material, the production technique, and the cleaning and handling process. Thermal breakdowns and field emission are reasons why that gradient is at the edge of technical feasibility. In the first prototype, an accelerating gradient of 20 MV/m was reached in continuous wave operation; under pulsed operation, one of six cavities reached 25 MV/m [267]. The total length⁴ of the collider will be 29 km. This project has the highest efficiency at the expense of a very high technical effort.
The normal conductive DESY project SBLC mainly uses the same technology as SLC, which has been working successfully since 1989. Yet, SBLC's accelerating structures are essentially longer and differ slightly in geometry. As in SLC, the accelerating gradient of the SBLC structures equals 17 MV/m. The necessary length of the collider amounts to 33 km. Details on this project follow in the sequel.
To reduce the length of the linear collider and thus reduce the costs, the Japanese project JLC, the Stanford project NLC, and the Russian project VLEPP chose frequencies in the X-band regime. Their accelerating gradients lie at 58 MV/m, 37.3 MV/m, and 91 MV/m respectively, while their total lengths amount to 10.4 km, 15.6 km, and 7.0 km respectively. The CERN project CLIC follows the TBA principle already mentioned above. Its accelerating gradient is 78.4 MV/m, its total length amounts to 8.8 km.
⁴ By the total length, the active length plus further beam guiding components, including the cryostats in TESLA, are meant. The numbers were taken from [267].
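As a rough plausibility check of the gradients and total lengths quoted above, the following sketch estimates the active length of each project as the ratio of center of mass energy and accelerating gradient. It assumes a center of mass energy of 500 GeV for all six projects and treats the quoted gradient as the average energy gain per metre; both are simplifying assumptions of this sketch, and the function name is illustrative.

```python
# Accelerating gradient in MV/m and total length in km, as quoted in the text.
projects = {
    "TESLA": (25.0, 29.0),
    "SBLC": (17.0, 33.0),
    "JLC": (58.0, 10.4),
    "NLC": (37.3, 15.6),
    "VLEPP": (91.0, 7.0),
    "CLIC": (78.4, 8.8),
}

CM_ENERGY_GEV = 500.0  # assumed for all projects in this sketch


def active_length_km(cm_energy_gev: float, gradient_mv_per_m: float) -> float:
    """Active length of both linacs together for singly charged particles."""
    energy_mev = cm_energy_gev * 1.0e3        # GeV -> MeV
    length_m = energy_mev / gradient_mv_per_m  # MeV / (MV/m) = m
    return length_m / 1.0e3                    # m -> km


for name, (gradient, total_km) in projects.items():
    active_km = active_length_km(CM_ENERGY_GEV, gradient)
    print(f"{name:6s} active {active_km:5.1f} km of {total_km:5.1f} km total")
```

In every case the estimated active length stays below the quoted total length, which is consistent with the remark that the total length also includes further beam guiding components.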
With increasing frequency, the dimensions of the accelerating structures get smaller. The diameter of the opening for the beam of the CLIC structure, for example, is only 6 mm. The smaller sizes necessitate very narrow tolerances for the alignment of the structures. For CLIC, they are 10 µm; for the other projects, they are of order 100 µm. For many years, all projects were pursued in parallel. Since 1997, test facilities have been in operation for several of the projects. At this time, comparative evaluations have also started. Because of the enormous building costs of several billion dollars for each of the linear colliders, probably only one of the concepts will eventually be realized. With several thousands of accelerating structures, which all have to be equipped with technical devices for the suppression of parasitic modes, the detailed knowledge of these modes is a decisive financial factor. In the following, studies regarding the parasitic modes or Higher Order Modes (HOM) of SBLC are presented.
5.2.2 Acceleration in Linear Colliders
The goal of acceleration of particles is to continuously transmit energy to the beam over sections as long as possible. The necessary high energies are transferred via fast oscillating rf fields. In the GHz range, waveguide elements are used as cavities or to transmit electromagnetic waves. However, a plain cylindrical waveguide is not suited for the acceleration of elementary particles, since the phase velocity of electromagnetic waves is larger than the speed of light while the velocity of the particles is just below⁵ the speed of light. The particles would not only be accelerated but also decelerated by a wave with higher phase velocity. Yet, by insertion of irises in the cylindrical waveguide, the phase velocity of the waves can be matched to the velocity of the particles. Usually, a constant distance between the irises is chosen. Such structures are called "slow wave structures". Figure 5.3 shows such a slow wave structure (cf. Fig. 2.1).
The photograph shows the cross-section of a typical accelerating structure for a linear collider. In particular, some cups of the SBLC structure are shown. The effect of the irises can be elucidated by the graph in Fig. 5.4: The phase velocity in a waveguide is a function of frequency. The dispersion relation of the waveguide states that the frequency $\omega$ of the wave equals the product of the speed of light $c = 2.997925 \cdot 10^8$ m/s and the square root of the sum of the squared phase constant $\beta$, also called the wave number, and the squared cut-off number $k_c$:

$$\omega = c \sqrt{\beta^2 + k_c^2}$$

⁵ For the particle velocity $v$, the relation $v = \beta c$ holds, with the speed of light $c$, $\beta = \sqrt{1 - 1/\gamma^2}$, and the Lorentz factor $\gamma$. For an electron with energy $E = 100$ GeV, this yields $\gamma = 100\ \mathrm{GeV} / 511\ \mathrm{keV}$. Thus, the estimate $\beta \approx 1 - 1.3 \cdot 10^{-11}$ follows.
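The estimate for the particle velocity of a 100 GeV electron can be reproduced numerically. The sketch below uses a rest energy of about 511 keV; the function name and the rewriting of $1 - \sqrt{1 - x}$ that avoids floating point cancellation are choices of this sketch.

```python
import math

ELECTRON_REST_ENERGY_EV = 510.99895e3  # about 511 keV


def one_minus_beta(energy_ev: float,
                   rest_energy_ev: float = ELECTRON_REST_ENERGY_EV) -> float:
    """Return 1 - beta for a particle of the given total energy.

    With x = 1/gamma^2, the identity 1 - sqrt(1 - x) = x / (1 + sqrt(1 - x))
    avoids subtracting two nearly equal numbers.
    """
    gamma = energy_ev / rest_energy_ev
    x = 1.0 / gamma**2
    return x / (1.0 + math.sqrt(1.0 - x))


deficit = one_minus_beta(100.0e9)  # electron with E = 100 GeV
print(f"1 - beta = {deficit:.2e}")
```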
Figure 5.3. Iris-loaded waveguide. The photo is courtesy of DESY
The cut-off number $k_c$ of a waveguide separates the range of free wave propagation from that with damping. The phase constant $\beta$ is defined via the wave length $\lambda_z$ of the mode as

$$\beta = \frac{2\pi}{\lambda_z}.$$

The phase velocity $v_p$ of a wave can be derived from the dispersion relation: The phase velocity is given as the ratio of the frequency $\omega$ and the wave number $\beta$:

$$v_p = \frac{\omega}{\beta}.$$
Often, the dispersion relations are displayed on graphs of $\omega$ as functions of $\beta$. Then the phase velocity is given by the gradient of the line connecting a point on the dispersion curve with the origin. From the graph of the dispersion relation in a waveguide (cf. Fig. 5.4), it is obvious that the phase velocity in the plain waveguide is always higher than the speed of light. The group velocity $v_g$ is given by the gradient of the dispersion curve:

$$v_g = \frac{d\omega}{d\beta}.$$
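The relations for the plain waveguide can be checked with a few lines of code. The sketch evaluates the dispersion relation $\omega = c\sqrt{\beta^2 + k_c^2}$ and the resulting phase and group velocities; the closed form $v_g = c^2\beta/\omega$ follows from differentiating the dispersion relation. The numerical values of $\beta$ and $k_c$ are arbitrary illustrative choices.

```python
import math

C = 2.997925e8  # speed of light in m/s, as quoted in the text


def omega(beta: float, k_c: float) -> float:
    """Dispersion relation of the plain waveguide."""
    return C * math.sqrt(beta**2 + k_c**2)


def phase_velocity(beta: float, k_c: float) -> float:
    """v_p = omega / beta."""
    return omega(beta, k_c) / beta


def group_velocity(beta: float, k_c: float) -> float:
    """v_g = d(omega)/d(beta) = c^2 * beta / omega for the plain waveguide."""
    return C**2 * beta / omega(beta, k_c)


# Illustrative values in rad/m, not taken from any of the structures above.
beta, k_c = 40.0, 50.0
v_p = phase_velocity(beta, k_c)
v_g = group_velocity(beta, k_c)
print(f"v_p / c = {v_p / C:.3f}, v_g / c = {v_g / C:.3f}")
```

For any positive $\beta$ and $k_c$ the phase velocity exceeds $c$ while the group velocity stays below it, and their product equals $c^2$, which is why the plain waveguide cannot be synchronized with the particles.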
In lossless structures, the energy velocity and the group velocity are equal. The dispersion curve for a cylindrical waveguide loaded with irises separated by a constant distance $d$ starts with large wave length, i.e., small wave numbers $\beta$, like the dispersion curve of the plain waveguide. Then (at $\beta = \pi/d$) it intersects the borderline where the phase velocity $v_p$ is equal to the speed of light $c$. Afterwards it quickly flattens for short wave lengths. The frequency range given by the extreme values of a dispersion curve is referred to as the pass band. A frequency range between two pass bands is called the stop band; there no power can flow into the structure. Suitable accelerating structures for high energy electrons have large coupling holes in order to achieve phase velocity $v_p$ close to the speed of light
[Figure 5.4 plot labels: ideal circular waveguide; free space ($v_p = c$)]
Figure 5.4. Dispersion curves for a plain circular waveguide and a waveguide with irises. Also shown is the borderline where the phase velocity $v_p$ equals the speed of light $c$. At the intersection of this line with the dispersion curve, the wave has the speed of light as its phase velocity and can therefore be used for the acceleration of highly relativistic elementary particles. Graph from the graduation paper of Krietenstein [160]
[Figure 5.5 plot labels, from top to bottom: 3. stopband; 2. passband; 2. stopband; $\omega_0$; 1. passband; 1. stopband. Marks on the horizontal axis: $\beta_0 l$, $\beta_0 l + 2\pi$]
Figure 5.5. Dispersion curve, also called the Brillouin diagram, of a traveling wave tube with equidistantly placed irises. Graph from the graduation paper of Krietenstein [160]
c, since only then can the wave be used for the acceleration of electrons. The large group velocity $v_g$ allows excitation of a suitable traveling wave also in extended lossy structures by only one input coupler for the power supply. Dispersion curves are obtained experimentally by short-circuiting both ends of a structure of finite length with very well conducting metal plates. These plates are placed in such a way that the field configurations of the structure of infinite length are preserved (symmetry planes, compare [30]). In a finite iris-loaded waveguide, the waves have to satisfy the boundary conditions on both ends. Only integer multiples of the distance between the irises can be wave lengths. These wave configurations are also called the modes. The numbering of the modes using index $n$ is done from left to right on the $\beta l$-axis of Fig. 5.5, i.e., from the smaller to larger phase advances per cell. Thus the $n$-th mode has the phase advance

$$\varphi = \frac{n\pi}{N}, \qquad n = 0, 1, \ldots, N$$
from one cell to the next. For reasons to be discussed below, only three of the N possible modes are used for particle acceleration. These are the $\pi$-, $2\pi/3$-, and the $\pi/2$-mode. In a periodic structure, the mode with $\beta \cdot d = 2\pi/3$, where $d$ is the period of the structure (i.e., the gap between cells plus the thickness of an iris), is called the $2\pi/3$-mode. Thus, the wave length of these modes extends over 2, 3 or 4 cells of the tube, respectively. Figure 5.6 shows the usual acceleration modes.
An accelerating structure consists of some material with very high but finite electric conductivity $\kappa$. With a good but not perfect conductor, the fields near the surface behave approximately like those with a perfect conductor. For this reason, it is possible to assume in numerical computations that the material is a perfect conductor. Inside the conductor, the fields are exponentially damped within the so-called skin depth $\delta = \sqrt{2/(\omega\mu\kappa)}$⁶. If the conductivity of the walls is finite, Ohmic losses occur and the power flow along the conductor is damped. The Ohmic losses by wall currents, i.e., the power loss $P_v$, is defined by
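
$$P_v = \frac{1}{2} \int \frac{J \cdot J^*}{\kappa}\, dV = \frac{1}{2} \int \frac{H_t \cdot H_t^*}{\delta \kappa}\, dA = \frac{1}{2} \sqrt{\frac{\omega\mu}{2\kappa}} \int H_t \cdot H_t^*\, dA \qquad (5.1)$$
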
for a dissipative material. Here $A$ denotes the surface of the conductor and $H_t$ is the magnetic field in the loss free case tangent to the surface $A$. This computation method for the wall losses is referred to as the power-loss method. In practice, mainly copper is used to build cavities and traveling wave tubes because of its good conductivity. Superconductive materials have also been used for some time. Even for good conductors, significant energy losses occur in the cavity walls and are caused by the wall currents. The energy losses lead to a damping, which can be expressed via the parameter $Q$. Then $W(t) \propto e^{-(\omega_0/Q)t}$
⁶ At 100 MHz, the skin depth of copper, e.g., is $\delta = 7$ µm.
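The skin depth quoted for copper can be checked against the formula $\delta = \sqrt{2/(\omega\mu\kappa)}$. The sketch below assumes a conductivity of $5.8 \cdot 10^7$ S/m for copper and the vacuum permeability; both values are common textbook numbers and are not taken from this text.

```python
import math

MU_0 = 4.0e-7 * math.pi   # vacuum permeability in Vs/Am (assumed for copper)
KAPPA_COPPER = 5.8e7      # conductivity of copper in S/m (assumed value)


def skin_depth(frequency_hz: float, conductivity: float,
               permeability: float = MU_0) -> float:
    """Skin depth delta = sqrt(2 / (omega * mu * kappa)) in metres."""
    omega = 2.0 * math.pi * frequency_hz
    return math.sqrt(2.0 / (omega * permeability * conductivity))


delta = skin_depth(100.0e6, KAPPA_COPPER)
print(f"skin depth at 100 MHz: {delta * 1e6:.1f} micrometres")
```

With these assumed material constants the result is a little below 7 µm, in line with the rounded value given in the footnote.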
[Figure 5.6 panels: E field configuration of the $\pi/3$-mode; E field configuration of the $\pi/2$-mode; E field configuration of the $2\pi/3$-mode]
Figure 5.6. Field configuration of usual acceleration modes ($\pi/3$, $\pi/2$, $2\pi/3$) in six cells of the SBLC accelerating structure. For symmetry reasons, only the upper half of the cross-section of the cylindrically symmetric structure is shown.
holds for the energy of a standing wave. The damping in a traveling wave structure will be discussed in more detail. The average stored energy $W_s$ is calculated as follows: The total energy inside a volume $V$ is given by

$$W_s = \int_V (W_E + W_M)\, dV.$$

For time-harmonic fields, $W_E = W_M$, where

$W_E = (\varepsilon_r/4)\, E \cdot E^*$ = time-averaged mean of the stored electric energy
$W_M = (\mu_r/4)\, H \cdot H^*$ = time-averaged mean of the stored magnetic energy
The quantities important for the choice of an accelerating mode and the estimation of parasitic modes are the voltage $V$, the shunt impedance $R_s$, and the quality factor $Q$. Figure 5.7 shows the electric field $E$, which is part of the definitions of these quantities, the wall losses $P_v$, and the stored energy $W_s$ of the mode in case of the $2\pi/3$ acceleration mode. A particle passing a cylindrically symmetric structure of length $L$ at the distance $r_0$ from the axis experiences the voltage $V$
For the accelerating mode, the voltage on the design trajectory, i.e., at $r_0 = 0$, is of interest. The term accelerating voltage $V$ is also used for this mode. The shunt impedance reflects the relation between the squared accelerating voltage $V$ and the total power loss $P_v$ in the walls of the structure. The shunt impedance $R_s$

$$R_s = \frac{V V^*}{P_v} = \frac{(\text{voltage})^2}{\text{wall losses}}$$

is a measure of the rf power that is transferred into an accelerating voltage for charged particles. The quality factor $Q$ gives the relation between the stored energy and the power loss:

$$Q = \frac{\omega W_s}{P_v} = \frac{\text{frequency} \times \text{stored energy}}{\text{wall losses}}.$$

For the accelerating mode the shunt impedance $R_s$ and the quality factor $Q$ are maximized while these parameters are minimized as far as possible for all other modes. In the choice of the accelerating mode, the necessary filling time is decisive besides the shunt impedance and the quality factor. The filling time $t_f$ is defined as

$$t_f = \int_0^L \frac{dz}{v(z)},$$

where $v(z)$ is the group velocity $v_g$ in case of a periodic structure and a corresponding auxiliary function in case of an aperiodic structure (compare $v(z)$ according to (5.3) for a constant gradient structure). The $2\pi/3$-mode is mainly used in linear accelerators, since it presents the best compromise with respect to filling time and shunt impedance: The $\pi$-mode requires a relatively long time for filling and therefore is not suited for fast pulse operation. On the other hand, the shunt impedance per unit length is relatively small for the $\pi/2$-mode and thus the energy gain per accelerating section is relatively small for fixed rf power. The reason for this is the larger number of irises per unit length, which increases the total conductive surface, which finally leads to increased wall losses $P_v$.
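The three figures of merit can be evaluated directly from their definitions. The following sketch is only an illustration: the voltage, wall losses, stored energy, group velocity and structure length are hypothetical numbers, and the filling time integral is approximated by a simple midpoint rule.

```python
import math

C = 2.997925e8  # speed of light in m/s


def shunt_impedance(voltage: complex, wall_losses: float) -> float:
    """R_s = V V* / P_v, in ohms for V in volts and P_v in watts."""
    return (voltage * voltage.conjugate()).real / wall_losses


def quality_factor(omega: float, stored_energy: float, wall_losses: float) -> float:
    """Q = omega * W_s / P_v."""
    return omega * stored_energy / wall_losses


def filling_time(velocity, length: float, steps: int = 1000) -> float:
    """t_f = integral of dz / v(z) over [0, length], midpoint rule."""
    dz = length / steps
    return sum(dz / velocity((i + 0.5) * dz) for i in range(steps))


# Hypothetical values, chosen only to exercise the formulas.
omega = 2.0 * math.pi * 3.0e9                      # 3 GHz
r_s = shunt_impedance(complex(1.0e6, 0.0), 2.0e4)  # 1 MV, 20 kW
q = quality_factor(omega, 1.0e-2, 2.0e4)           # 10 mJ stored energy
t_f = filling_time(lambda z: 0.02 * C, 6.0)        # constant v_g, 6 m long

print(f"R_s = {r_s:.2e} ohm, Q = {q:.0f}, t_f = {t_f * 1e6:.2f} microseconds")
```

For a periodic structure the velocity function is constant and the integral reduces to $L / v_g$; passing a $z$-dependent function instead covers the aperiodic case.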
[Figure 5.7 panels: E field of the $2\pi/3$-mode; wall losses $P_v$ of the $2\pi/3$-mode; stored energy $W_s$ of the $2\pi/3$-mode]
Figure 5.7. Electric field $E$ (top), wall losses $P_v$ (middle), and the energy $W_s$ stored in the mode (bottom) for the $2\pi/3$ accelerating mode in six cells of the SBLC structure.
5.2 Linear Colliders
The accelerating structures for linear colliders can be operated with standing waves or traveling waves. Which operation mode should be preferred has been a matter of controversy since the 1940s [176]. One view holds that standing wave structures are suitable for accelerators with long pulses, while traveling wave structures are best suited for linear colliders with short pulses and high accelerating gradients. In the above-mentioned linear collider projects with normal conducting accelerating structures, special traveling wave structures are likewise used (footnote 7).
Periodic structures. For the traveling wave of a particular mode in a periodic structure, the periodicity of the electric field is expressed by Floquet's theorem [66]:

$$E(r,\varphi,z+L) = E(r,\varphi,z)\,e^{-j\beta L}.$$

It states that, for a particular mode and a given frequency, the fields at an arbitrary cross-section of the structure differ from the fields a period apart only by the complex constant $e^{-j\beta L}$. Apart from this factor, the electric field is periodic in $z$ with period $L$. This periodic field can also be expanded in a Fourier series of space harmonics, also referred to as modes, where each harmonic has its own propagation constant $\beta_n = \beta_0 + 2\pi n/L$ and phase velocity $v_{p,n} = \omega/\beta_n$:

$$E(r,\varphi,z) = \sum_{n=-\infty}^{\infty} E_n(r,\varphi)\,e^{-j\beta_n z}.$$
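The relation $\beta_n = \beta_0 + 2\pi n/L$ can be made concrete with a few numbers. The sketch below is illustrative only: it assumes a $2\pi/3$ phase advance per period and a period of one third of the free-space wavelength, so that the fundamental harmonic ($n = 0$) travels at the speed of light; the frequency is the SBLC value of 2.996 GHz, and the function name `space_harmonics` is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def space_harmonics(omega, beta0, period, orders):
    """Return {n: (beta_n, v_p_n)} with beta_n = beta0 + 2*pi*n/period."""
    result = {}
    for n in orders:
        beta_n = beta0 + 2.0 * math.pi * n / period
        result[n] = (beta_n, omega / beta_n)
    return result


# Assumptions for illustration: 2*pi/3 phase advance per period and a period
# equal to one third of the free-space wavelength at 2.996 GHz.
freq = 2.996e9
omega = 2.0 * math.pi * freq
period = C / freq / 3.0
beta0 = (2.0 * math.pi / 3.0) / period

harmonics = space_harmonics(omega, beta0, period, range(-2, 3))
for n, (beta_n, v_p) in sorted(harmonics.items()):
    print(f"n = {n:+d}: beta_n = {beta_n:9.2f} 1/m, v_p / c = {v_p / C:+.3f}")
```

Under these assumptions only the $n = 0$ harmonic is synchronous with an ultra-relativistic particle; the harmonics with $n \neq 0$ have a smaller phase velocity in magnitude, and those with negative $\beta_n$ travel backwards.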
All space harmonics have the same group velocity $v_g = \partial\omega/\partial\beta$. In a finite structure with $N$ cells, only $N+1$ modes having the cell-to-cell phase advances $\beta_n l$ (with $l$ the cell length),

$$\beta_n l = \frac{n\pi}{N}, \qquad n = 0, 1, \ldots, N,$$

can propagate. These relations are also reflected in the dispersion curves. In case of periodic structures, the $\omega$-$\beta$ diagram is also called the Brillouin diagram (compare Fig. 5.5). Another important quantity is the attenuation parameter, which expresses the damping per unit length:

$$\alpha = \frac{\omega}{2 Q v_g}. \qquad (5.2)$$
This parameter is the ratio of the frequency to twice the product of group velocity and quality factor. In a structure without geometrical changes over its total length, the accelerating field and the power are exponentially damped relative to the field and power at the input coupler of the structure:

$$e^{-2\tau} := \frac{P(z=L)}{P(z=0)} = \frac{\text{output power}}{\text{input power}},$$

with the attenuation $\tau = \alpha L$ and $\alpha$ as in (5.2). The damping constant of the exponential decrease of the field (resp. power) is given by once (resp. twice) the attenuation per unit length:

$$\|E(z)\| = \|E_0\|\,e^{-\alpha z}, \qquad P(z) = P_0\,e^{-2\alpha z}.$$

Such structures are called constant impedance structures. The energy gain in a constant impedance structure is given by

$$V = \sqrt{P_0 R_s}\,\sqrt{2\tau}\,\frac{1 - e^{-\tau}}{\tau}.$$

Modes in a constant impedance structure can always be put in relation to some pass band of a single cell. The main disadvantage of the constant impedance structures is the damping of the electric field along the structure. Another disadvantage is related to the parasitic modes, which are excited during the operation of the accelerator and negatively influence the particle dynamics. Among the parasitic modes, the dipole modes show the strongest interaction with the particles. The dipole modes are usually standing waves that interact with the particle bunches over the whole length of the constant impedance structure. The issue of parasitic modes is discussed in more detail in subsections 5.3-5.5.

Footnote 7: A helpful image is that of the particles "sitting" on top of the wave and experiencing the longitudinal field component as a constant acceleration.
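The constant impedance relations above lend themselves to a short numerical illustration. The following sketch is not from the book: the constant group velocity of 2 % of $c$ and the input power of 75 MW are hypothetical, while the frequency, quality factor, length and shunt impedance per metre are borrowed from the SBLC values quoted in this chapter purely to obtain realistic orders of magnitude. The total shunt impedance is assumed to be the per-metre value times the length.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def attenuation_per_length(omega, q, v_g):
    """alpha = omega / (2 Q v_g), eq. (5.2), in Neper per metre."""
    return omega / (2.0 * q * v_g)


def power(z, p0, alpha):
    """P(z) = P_0 exp(-2 alpha z)."""
    return p0 * math.exp(-2.0 * alpha * z)


def field_amplitude(z, e0, alpha):
    """|E(z)| = |E_0| exp(-alpha z)."""
    return e0 * math.exp(-alpha * z)


def energy_gain(p0, r_s, tau):
    """V = sqrt(P_0 R_s) * sqrt(2 tau) * (1 - exp(-tau)) / tau."""
    return math.sqrt(p0 * r_s) * math.sqrt(2.0 * tau) * (1.0 - math.exp(-tau)) / tau


omega = 2.0 * math.pi * 2.996e9
q = 14000.0
length = 6.0            # m
v_g = 0.02 * C          # hypothetical constant group velocity
p0 = 75.0e6             # hypothetical input power in W
r_s = 57.0e6 * length   # assumed total shunt impedance in ohms

alpha = attenuation_per_length(omega, q, v_g)
tau = alpha * length
power_ratio = power(length, p0, alpha) / p0
voltage = energy_gain(p0, r_s, tau)

# Scan tau to see where the energy gain for fixed P_0 and R_s peaks.
taus = [i / 1000.0 for i in range(1, 3001)]
best_tau = max(taus, key=lambda t: energy_gain(1.0, 1.0, t))

print(f"alpha = {alpha:.4f} Np/m, tau = {tau:.4f}")
print(f"P(L)/P(0) = {power_ratio:.4f}, V = {voltage / 1e6:.1f} MV")
print(f"energy gain is largest near tau = {best_tau:.3f}")
```

The scan shows the well-known trade-off: for small $\tau$ most of the power leaves the structure unused, for large $\tau$ the field decays too quickly, and the gain for fixed $P_0$ and $R_s$ peaks near $\tau \approx 1.26$.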
Aperiodic Structures. The goal is to design an accelerating structure with constant accelerating gradient and constant thermal load along the iris-loaded waveguide, in which, additionally, the dipole modes interact over at most a part of the structure. This is achieved not by keeping the damping factor constant but by varying it along the structure. The term "detuned structure" is often used in order to emphasize the detuning of the dipole modes. An aperiodic structure with constant accelerating gradient is referred to as constant gradient structure. (i) Constant gradient structures are iris-loaded traveling wave tubes with tapered cells. The cells differ in geometry from one another in such a way that a constant accelerating gradient is reached during operation. Figure 2.6 shows a tapered nine-cell structure with the $2\pi/3$ monopole mode, which is often used for acceleration. The mode computation is carried out for the loss free case, which is why the longitudinal electric field and hence the accelerating gradient increase as shown in Fig. 2.6. One then designs the structure such that the accelerating gradient stays constant with the wall losses taken into account. The power conversion $P(z)$ shall be as close to a constant as possible. This means that the attenuation $\alpha$ and hence also the group velocity $v_g$ have to be functions of $z$, according to (5.2). Analogously to constant impedance structures, it is usual to define the damping factor $\tau$ by $e^{-2\tau} = P_L / P_0$. The requirement $dP/dz = \text{const.}$ leads to a linear variation of the group velocity:
$$v(z) = \frac{\omega L}{Q} \cdot \frac{1 - (1 - e^{-2\tau})\,\dfrac{z}{L}}{1 - e^{-2\tau}}, \qquad 0 \le z \le L, \qquad (5.3)$$
with an auxiliary function $v(z)$ corresponding to the group velocity (footnote 8). Here $v(z)$ is the group velocity that would be attained, for a coordinate $z$ belonging to the $i$-th cell of the constant gradient structure, in a corresponding periodic structure with the dimensions of cell $i$. In periodic structures, modes can always be put in relation to the pass band of a single cell. This is no longer possible for aperiodic structures, since the cells have different openings for the iris holes on the right and on the left. As an approximation, it is possible to neglect the small radial differences and to compute the properties of one cell with identical iris openings on both sides. In this procedure, the group velocities $v_{gi}$ are used instead of $v(z)$. Usually, the linear variation of the group velocities $v_{gi}$ is achieved by decreasing the iris radii from the first to the last iris of the waveguide. Additionally, a minor variation of the cell radii is necessary in order to tune the accelerating frequency of a constant gradient structure. For the SBLC tube, the variation of the cell radii is smaller than a tenth of a thousandth of the total length of the structure. It is usual to write the cell and iris radii as functions of $v_{gi}$ to characterize a particular constant gradient structure. If the shunt impedance $R_s$ and the quality factor $Q$ of the cells remain constant over the whole length of the structure, a constant field results from the linear variation of the group velocities $v_{gi}$. Yet, the quality factor in fact shows a minor decrease along the structure, while the shunt impedance increases slightly because of the decreasing iris holes. This again leads to a minor increase of the field strength along the structure. However, for constant gradient structures, the design also has to take into account the difference between the loaded and the unloaded case: In the loaded case, the accelerating gradient shall be constant also in the presence of the beam current.
Then, for constant power conversion per length, the field gradient has to grow. This difference between the accelerating gradient and the field gradient is caused by the so-called beam loading, which describes the effects induced by the beam in an accelerating structure. It is one of many effects of the interaction between the beam and its surroundings (cf. subsection 5.3). When a charged particle or a bunch passes an accelerating structure at the speed of light, it creates new electromagnetic fields in the accelerating structure. These induce currents and charges in the walls of the structure, which in turn induce the so-called wake fields inside the structure. The wake fields decelerate the beam and are able to deflect it transversally. The fundamental theorem of beam loading [317] asserts the following: Suppose a pointlike charge passes at the speed of light through a resonator that was originally field free. The energy $\Delta U_n$ transferred into some mode $n$ equals half the product of its own beam-induced voltage $V_n$ in this mode and its electric charge $q$. This implies that the effective voltage $V_{n,\mathrm{eff}}$ "felt" by the particle is only half of the charge-induced voltage $V_n$, thus $V_{n,\mathrm{eff}} = V_n/2$. The beam-induced voltage $V_n$ is proportional to the charge of the particle that causes the excitation: $V_n = 2 k_n q$, with the loss parameter $k_n$ defined in subsection 5.3. Because of energy conservation, the energy $W_n$ remaining after the charge has passed equals the energy $\Delta U_n$ lost by the charge in the structure. In connection with beam loading, the term loaded gradient was introduced, which will not be explained here in more detail. The difference between the unloaded gradient with the voltage $V_g$, provided by the generator, and the loaded gradient is mainly caused by two factors: First, at the moment when the bunch enters the resonator, $V_g$ (the generator voltage) has a phase angle

Footnote 8: The term "group velocity $v_g$" is still often used for the description of constant gradient structures. However, note that the group velocity is only defined for periodic structures. If it is used at all for constant gradient structures, it can only be an auxiliary quantity.
A detailed discussion of constant impedance and constant gradient structures can be found in [293], [167]. In a constant gradient structure, the phase velocity is constant over the complete length of the structure only in case of the accelerating mode. The constant gradient structure with $N$ cells is tuned for the chosen accelerating mode, usually the $2\pi/3$-mode. All other modes show an aperiodic field distribution. The frequencies of the $N$ lowest parasitic dipole modes are equally distributed: As soon as the group velocities $v_{g1}$ and $v_{gN}$ and thus the radii of the end cells have been fixed, the whole frequency interval $\Delta f_{tot}$ of the dipole modes and the average dipole mode frequency $\bar f_1$ are determined uniquely. The distance between neighbouring modes is given by
$$\Delta f_i = f_{r,i} - f_{r,i-1} = \frac{\Delta f_{tot}}{N}.$$
Since the parasitic modes exist in only a part of the structure, their maximal interaction is less than in a corresponding periodic structure. However, the sum of all loss parameters over some pass band nearly coincides in both cases. Note again that the modes of a constant gradient structure cannot be directly related to the pass band of the individual cells. Still, the pass bands of the single cells are usually not spread too widely, so that most modes of a cell complex oscillate in the same band. Yet an overlap of neighbouring dipole bands is possible.

accelerating frequency               $f = 2.996$ GHz
accelerating mode                    $2\pi/3$-mode
wave length                          $\lambda = 0.1$ m
total length                         $L = 6$ m (180 cells)
average shunt impedance              $R_s = 57$ MΩ/m
attenuation                          $\tau_{Neper} = 0.55$
quality factor of the $2\pi/3$-mode  $Q \approx 14000$
group velocities                     $v_g = 4.1 \ldots 1.3$ % $\cdot\, c$

Table 5.2. Main parameters of the constant gradient structure for the SBLC study.
iris radii        $a = 15.340 \ldots 11.003$ mm
cell radii        $b = 41.334 \ldots 40.005$ mm
iris thickness    $d = 5.0$ mm
period length     $l = 33.34$ mm

Table 5.3. The range of geometric parameters of the constant gradient structure for the SBLC project.
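The entries of Tables 5.2 and 5.3 can be cross-checked against each other with elementary arithmetic. The sketch below is only a consistency check under the assumption that the accelerating wave is synchronous with an ultra-relativistic beam, i.e. that its phase velocity equals $c$; the variable names are hypothetical.

```python
import math

C = 299_792_458.0   # speed of light in m/s

freq = 2.996e9      # accelerating frequency in Hz (Table 5.2)
n_cells = 180       # number of cells (Table 5.2)
period = 33.34e-3   # period length l in m (Table 5.3)

wavelength = C / freq            # free-space wavelength
total_length = n_cells * period  # length of the 180-cell structure
# Phase advance per cell of a wave with phase velocity c (assumption).
phase_advance = 2.0 * math.pi * period / wavelength

print(f"wavelength      = {wavelength * 1e3:.2f} mm")
print(f"total length    = {total_length:.3f} m")
print(f"phase advance   = {phase_advance / math.pi:.4f} * pi per cell")
```

Within rounding, the wavelength is 0.1 m, the 180 cells add up to 6 m, and the phase advance per cell is $2\pi/3$, in agreement with the tabulated values.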
Generally, regarding the geometrical extremes of the $N$-cell constant gradient structure, the following should be noted: In practice, the iris diameter at the end of the structure is chosen as small as is compatible with the dimensions of the accelerated beam. Thus, the group velocity $v_{gN} = v(L)$ at the end of the structure is fixed and, after fixing the length $L$ of the structure, the group velocity $v_{g1} = v(0)$ at the input end can be obtained from (5.3). The choice of the iris thickness has to be a compromise between thin apertures with high shunt impedance and thick apertures, which reduce the heat load and the danger of electric breakdown and improve the mechanical stability. To minimize production costs, it is also usual to design a quasi-constant gradient structure with constant impedance landings and adapted transitions between the landings. Tables 5.2 and 5.3 give the most important parameters of the constant gradient structure proposed for SBLC. For manufacturing reasons, the cells
are usually each produced as one piece, thus forming a so-called cup, rather than from two parts: a ring (for the cell) and an aperture. Figures 5.8 and 5.9 show the geometry and a photo of single cups before soldering. The terms in table 5.3 correspond to those in Fig. 5.8. As was described in subsection 5.6.1, the cups are soldered together to form the traveling wave tube (compare Fig. 5.10). The S-band traveling wave tube for the SBLC project consists of 180 single cups. In subsection 5.4, this structure is studied with respect to its parasitic modes.
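The linear group velocity profile of a constant gradient structure, and the statement that the input group velocity follows once the output group velocity and the length are fixed, can be illustrated numerically. The sketch below evaluates the standard constant gradient profile $v(z) = (\omega L/Q)\,[1 - (1 - e^{-2\tau})\,z/L]\,/\,(1 - e^{-2\tau})$ with the Table 5.2 values and integrates $dz/v(z)$ with a trapezoidal rule. It assumes a constant $Q$ along the structure, which the text describes as only approximately true; the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def group_velocity(z, omega, q, length, tau):
    """Auxiliary function v(z) of a constant gradient structure, eq. (5.3)."""
    g = 1.0 - math.exp(-2.0 * tau)
    return (omega * length / q) * (1.0 - g * z / length) / g


def filling_time(velocity, length, steps=2000):
    """t_f = integral of dz / v(z) over [0, length], trapezoidal rule."""
    dz = length / steps
    total = 0.5 * (1.0 / velocity(0.0) + 1.0 / velocity(length))
    for i in range(1, steps):
        total += 1.0 / velocity(i * dz)
    return total * dz


# Values from Table 5.2; a constant Q is an idealization.
omega = 2.0 * math.pi * 2.996e9
q = 14000.0
length = 6.0
tau = 0.55


def v(z):
    return group_velocity(z, omega, q, length, tau)


v_in, v_out = v(0.0), v(length)
t_fill = filling_time(v, length)

print(f"v(0)/c = {100 * v_in / C:.2f} %, v(L)/c = {100 * v_out / C:.2f} %")
print(f"v(0)/v(L) = {v_in / v_out:.3f}, exp(2 tau) = {math.exp(2 * tau):.3f}")
print(f"filling time = {t_fill * 1e6:.3f} us")
```

With these inputs the end values come out close to the 4.1 % and 1.3 % of $c$ listed in Table 5.2, the ratio $v(0)/v(L)$ equals $e^{2\tau}$, and the numerical filling time agrees with the closed form $t_f = 2\tau Q/\omega$ that follows from integrating $1/v(z)$.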
Figure 5.8. Geometry of a single "cup" of the SBLC constant gradient structure. The drawing is courtesy of Drevlak, DESY; now IPP Greifswald.
Figure 5.9. Three single cups of the SBLC structure. The photo is courtesy of DESY.

Figure 5.10. One 5.2 m long S-band tube for LINAC II at DESY. This structure is very similar to that for SBLC and therefore can be regarded as a prototype of the SBLC tube. The photo is courtesy of DESY.

(ii) The term "detuned structure" describes different types of aperiodic accelerating structures. As was already noted, this shall emphasize that the dipole modes are put out of tune. Thus, all aperiodic structures, including in particular the constant gradient structures, are detuned structures. However, the term is mostly reserved for structures in which recoherence of the long range wake fields is avoided by a special distribution of the dipole modes. a) Gaussian distribution of the dipole modes inside some structure: A truncated Gaussian distribution with the standard deviation $\sigma_f$, having a mode density in the neighbourhood of some frequency $f$ proportional to

$$\exp\left[-\frac{(f - \bar f_1)^2}{2\sigma_f^2}\right],$$

presents one possibility [264]. Herein, the distance between neighbouring modes is given by

with

$$n_{\sigma T} = \frac{\Delta f_{tot}}{\sigma_f}.$$

Therein, $n_{\sigma T}$ gives the total width of the truncated distribution in units of $\sigma_f$, and $F(x)$ is the usual error function

$$F(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-u^2}\,du.$$

In the center of the distribution, the relative distance between neighbouring modes is approximately given by
The X-band test structure for the NLC project of SLAC is a structure of this kind. The design of the NLC structure and further details regarding the detuning are described in [264]. Many illustrations concerning the rf parameters can also be found there, often given in comparison with corresponding constant impedance or constant gradient structures. b) Variation of dipole frequencies from structure to structure (structure-to-structure detuning): On the basis of the detuning described above, one can divide the constant gradient structures or detuned structures of the linear collider into several classes which are slightly shifted with respect to one another. This leads to a further reduction of the long range wake fields. This measure was first used at the end of the 1960s on the constant gradient structures of SLAC which were already installed [124]: The structures were subdivided into three equal classes, and some cells of the structures were made uneven, so that the dipole frequency was changed by 2 MHz in one third and by 4 MHz in another third while remaining the same in the last third. Although a Gaussian detuning of the dipole modes was initially considered for the SBLC project [84], a constant gradient structure was finally chosen, and the about 5000 structures were subdivided into ten classes. After careful numerical studies, the maximal detuning with respect to the reference frequency was chosen to be 36 MHz [82]. In this way, the long range wake fields are strongly suppressed. As a further necessary measure for the suppression of the wake fields, damping of the dipole modes is additionally proposed (cf. [77] and subsection 5.5.8).
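Why shifting the dipole frequency from class to class reduces the long range wake can be seen from a deliberately simplified toy model, which is not the simulation used in the cited studies. It assumes that each class contributes one undamped dipole oscillation of equal strength and that the ten classes are detuned in equal steps up to the maximum of 36 MHz; the equal spacing, the equal weights and the function name `coherence` are all assumptions made here for illustration.

```python
import cmath
import math


def coherence(offsets_hz, t):
    """|mean of exp(j 2 pi df t)| over all classes; 1 means fully in phase."""
    total = sum(cmath.exp(2j * math.pi * df * t) for df in offsets_hz)
    return abs(total) / len(offsets_hz)


n_classes = 10                            # from the text
max_detuning = 36.0e6                     # Hz, from the text
step = max_detuning / (n_classes - 1)     # 4 MHz, ASSUMED equal spacing
offsets = [k * step for k in range(n_classes)]

bunch_spacing = 16.0e-9                   # s, SBLC bunch distance

for n_bunch in (0, 1, 2, 4, 8):
    t = n_bunch * bunch_spacing
    print(f"t = {t * 1e9:6.1f} ns: relative coherent sum = {coherence(offsets, t):.3f}")

recoherence_time = 1.0 / step
print(f"recoherence after {recoherence_time * 1e9:.0f} ns: "
      f"{coherence(offsets, recoherence_time):.3f}")
```

In this toy model the contributions already partly cancel at the first following bunch, but they add up in phase again after $1/\Delta f$, which for a 4 MHz step is shorter than the bunch train. This is in line with the remark above that damping of the dipole modes is proposed in addition to the detuning.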
5.2.3 The S-Band Linear Collider Study

The schematic layout of the S-band linear collider, SBLC for short, was already shown (cf. Fig. 5.2), and the accelerating structure was described in some detail in the preceding subsection. The S-band 2 x 250 GeV linear collider project SBLC proposes 5034 accelerating structures having 180 cells each. Each structure is 6 m long; its loaded gradient, i.e., the gradient including beam loading effects, is 17 MV/m. A so-called bunch train of 125 bunches (packets of particles) with a distance of 16 ns from bunch to bunch is planned. A bunch consists of $N = 2.9 \times 10^{10}$ particles, which are injected into the main accelerator with an energy of 3.15 GeV. Table 5.4 summarizes these characteristic parameters of the SBLC project. In order to reach as high a luminosity as possible, any enlargement of the bunch has to be avoided. Wake field effects are among the major causes of the enlargement of the bunches. Consequently, the suppression of the underlying modes, the so-called Higher Order Modes (HOMs), is of fundamental importance for the actual design of the collider. In the computations, the main interest was in the first dipole band, since the modes of this band cause the worst enlarging effects. Because of the minor variation of the cell radii, the electromagnetic fields of long constant gradient structures can hardly be calculated with standard discretization methods. Therefore, the mode matching technique was used for the numerical field calculation.

Center of mass energy           500 GeV
Number of bunches               125
Bunch distance                  16 ns
Particles per bunch             $2.9 \times 10^{10}$
Constant gradient structures    5034 with 180 cells each = 6 m long
Unloaded gradient               21 MV/m
Loaded gradient                 17 MV/m

Table 5.4. The main parameters of the S-band linear collider SBLC.
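The parameters in Table 5.4 fit together, as a few lines of arithmetic show. The sketch below is only a plausibility check: it assumes that the 5034 structures are split evenly between the two linacs and that every structure contributes its full loaded gradient, which ignores any operating margin; the variable names are hypothetical.

```python
E_CHARGE = 1.602176634e-19   # elementary charge in C

n_structures = 5034
structure_length = 6.0       # m
loaded_gradient = 17.0e6     # V/m
n_bunches = 125
bunch_spacing = 16.0e-9      # s
particles_per_bunch = 2.9e10
injection_energy = 3.15e9    # eV

# Assumption: both linacs receive the same number of structures.
structures_per_linac = n_structures // 2
gain_per_structure = loaded_gradient * structure_length          # eV per unit charge
final_energy = injection_energy + structures_per_linac * gain_per_structure
train_duration = (n_bunches - 1) * bunch_spacing
bunch_charge = particles_per_bunch * E_CHARGE

print(f"structures per linac : {structures_per_linac}")
print(f"gain per structure   : {gain_per_structure / 1e6:.0f} MeV")
print(f"beam energy at end   : {final_energy / 1e9:.1f} GeV")
print(f"bunch train duration : {train_duration * 1e6:.3f} us")
print(f"bunch charge         : {bunch_charge * 1e9:.2f} nC")
```

Under these assumptions each linac slightly exceeds the nominal 250 GeV per beam, and the bunch train lasts about 2 µs.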
Figure 5.11. An accelerating module of the S-band linear collider SBLC. Labels in the figure: SBLC: 3 GHz, 2517 modules; modulator: 1200 J per pulse; 150 MW, 2.8 µs, 21 kW, 50 Hz; 125 bunches with a distance of 16 ns; S-band: 180 cells, 6 m long.
Figure 5.12. Longitudinal cut through the SBLC tunnel (figure label: air duct). The illustration is courtesy of Holtkamp, DESY; now FERMILAB
Figure 5.13. Cross-section of the SBLC-tunnel. The illustration is courtesy of Holtkamp, DESY; now FERMILAB
5.3 Beam Dynamics in a Linear Collider

By its principal layout, an accelerator with its accelerating sections and focusing magnets houses electric and magnetic fields that interact with the elementary particles. As long as no disturbing effects take place, these interactions ensure that the bunch train moves along its design trajectory and that two opposite bunches collide at the designed interaction point. Yet, there are a number of effects that influence the behaviour of the beam. The term "collective effects" is often used in this context. For all the linear collider design projects introduced in subsection 5.2.1, the beam dynamics in the accelerating structures of the main linac (cf. Fig. 5.2) is of main interest. The goal is to avoid any possible beam instabilities. All projects propose to deal with many bunches simultaneously (i.e., multibunch operation). A crucial disturbing effect of such operation is the following: If some leading bunch induces parasitic modes by some deviation from the design trajectory, these fields will influence the dynamics of the following bunches. These scattered fields caused by single bunches, the so-called wake fields, sum up. We will now discuss the emittance concept before turning to the wake fields, which will be introduced formally in subsection 5.3.2.
5.3.1 Emittance

The emittance is a term from beam optics, which studies the motion of the beam inside the accelerator. It takes the coordinate system $(x, y, s)$ as a basis. Here $s$ denotes the distance along the accelerator; the horizontal coordinate $x$ and the vertical coordinate $y$ are called transversal coordinates. The longitudinal position $z$ of a particle inside the beam is an additional coordinate with the same direction as $s$ but with its reference point in the center of the beam. Each particle has three momentum components besides the three spatial components. The phase space is used in order to characterize the state of a particle system, e.g., the beam. This is a six-dimensional space with three spatial and three momentum coordinates. Each state of the system is then completely described by some point in the phase space. The phase space projection shown in Fig. 5.14 is chosen as a graphical representation of the phase space. Each spatial component ($x$ or $y$, the displacement) and the corresponding angular component ($x' = dx/ds$ or $y'$, the divergence) are displayed. The divergence corresponds to the angle of the particle with the beam axis. The transverse motion of the beam is controlled by transversal magnetic fields: dipole magnets correct the direction of motion, quadrupole magnets serve for focusing, and sextupole magnets correct nonlinear effects. The equation of motion for a single particle is a differential equation of the second order (in the general case, it is Hill's equation). Special assumptions then yield the design trajectory ($\Delta E = 0$), the dispersion function ($\Delta E \neq 0$, $\Delta p \neq 0$), and the betatron oscillation ($\Delta p = 0$). The betatron motion

$$x_\beta(s) = a_x \sqrt{\beta_x(s)}\,\cos(\psi_x(s) + \phi_x), \qquad y_\beta(s) = a_y \sqrt{\beta_y(s)}\,\cos(\psi_y(s) + \phi_y),$$

with the phase advance

$$\psi_{x,y}(s) = \int_0^s \frac{ds'}{\beta_{x,y}(s')},$$
represents a transversal periodic oscillation (corresponding to the harmonic oscillator). The amplitude and phase advance of this motion are described by the beta function $\beta(s)$, which is of central importance for beam optics. The direct determination of the beta function is very costly, so it is customary to use a matrix formalism. The divergences $x'_\beta(s)$, $y'_\beta(s)$ are determined for this purpose and, after some rearrangements and elimination of the trigonometric functions, the following equation of an ellipse in $x$ follows:

$$a_x^2 = \gamma\,x^2 + 2\alpha\,x\,x' + \beta\,x'^2,$$

with $\alpha = -\beta'/2$ and $\gamma = (1 + \alpha^2)/\beta$. The equation for the other transversal coordinate $y$ is completely analogous. The area $F = \pi a^2 =: \pi\varepsilon$ of the ellipse is a constant independent of the location. The factor $\varepsilon := a_{x,y}^2$ is referred to as emittance.

Figure 5.14. Emittance ellipse in phase space (axes $x$ and $x'$; equivalent area of the particle distribution $A = \pi\varepsilon$). The illustration is courtesy of Drevlak, DESY; now IPP Greifswald
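The step from the betatron motion to the ellipse equation can be checked numerically. Differentiating $x_\beta(s) = a\sqrt{\beta}\cos\psi$ with $\psi' = 1/\beta$ and $\beta' = -2\alpha$ gives $x' = -(a/\sqrt{\beta})(\alpha\cos\psi + \sin\psi)$; this intermediate expression is filled in here and is not quoted from the text. The sketch below samples one betatron period at a fixed location with hypothetical values of $a$, $\beta$ and $\alpha$, and verifies both the invariant and the enclosed area.

```python
import math


def betatron_point(a, beta, alpha, phase):
    """Displacement x and divergence x' for a given betatron phase."""
    x = a * math.sqrt(beta) * math.cos(phase)
    xp = -a / math.sqrt(beta) * (alpha * math.cos(phase) + math.sin(phase))
    return x, xp


def ellipse_invariant(x, xp, beta, alpha):
    """gamma x^2 + 2 alpha x x' + beta x'^2 with gamma = (1 + alpha^2) / beta."""
    gamma = (1.0 + alpha ** 2) / beta
    return gamma * x * x + 2.0 * alpha * x * xp + beta * xp * xp


def polygon_area(points):
    """Area enclosed by a closed polygon (shoelace formula)."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0


# Hypothetical optics values at one location s.
a = 1.0e-3      # amplitude factor, a^2 is the emittance
beta = 10.0     # beta function in m
alpha = -0.5    # alpha = -beta'/2

n_points = 3600
points = [betatron_point(a, beta, alpha, 2.0 * math.pi * k / n_points)
          for k in range(n_points)]
invariants = [ellipse_invariant(x, xp, beta, alpha) for x, xp in points]
area = polygon_area(points)

print(f"invariant range: {min(invariants):.6e} ... {max(invariants):.6e}")
print(f"ellipse area / (pi a^2) = {area / (math.pi * a * a):.6f}")
```

All sampled points return the same invariant $a^2$, and the enclosed area agrees with $\pi a^2 = \pi\varepsilon$ up to the discretization error of the polygon.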
The emittance is derived from the equation of motion for a single particle. The motion of the bunch can then be described by the phase ellipse or emittance ellipse: If the coordinates $(x, x')$ of all particles with emittance $\varepsilon$ are displayed in the phase plane, this produces an ellipse in the phase space. Consequently, the emittance ellipse characterizes the beam at the position $s$. Given a magnet system and a beam that is given in the beginning by a cluster
of points in the phase plane $(x, x')$ centered around the reference point $(0, 0)$, then an ellipse which exactly borders the cluster and thus represents the beam can be determined by a choice of $\beta_0$, $\alpha_0$, and $\varepsilon$. Liouville's theorem on the invariance of the phase ellipse then states that the particle density in the phase space remains constant when the beam passes a magnet system, i.e., that the area of the phase ellipse is invariant: $\int x'\,dx = \text{const.} = \varepsilon\pi$. Setting up transport matrices for each accelerating section, each magnet, and each drift space, the particle trajectory can be tracked by multiplication of the transport matrices, starting from the equation of the phase ellipse (cf. [218] and references therein). It is very important to note that Liouville's theorem is valid only for beams which are guided by external fields. Furthermore, the transformation properties are adjusted to the position in the phase space. A beam with this position is said to be matched. For a mismatched position, the emittance increases. Besides the emittance concept just explained, there are a number of related terms. To keep high luminosity in an accelerator operated with a bunch train, in the so-called multibunch operation, not only the emittance of the single bunch has to be preserved, but any cumulative beam instability along the bunch train has to be avoided. In this context, the specific terms single bunch emittance and multibunch emittance are introduced. In computer simulations for studies of the single and multibunch dynamics of the beam, an effective emittance of the bunch train is computed from the so-called centroids of the bunches (centers of the bunches) and the single bunch emittances. Prescribed tolerances are reflected in the effective emittance of the bunch train at the end of the linear collider. Main input parameters for the simulation are the loss parameters introduced below and the quality factors of higher order modes (long range wake fields).
This topic is treated in detail in [83]. Effects of wake fields and dispersion errors, which are caused by misalignment of accelerating structures and injection errors of the bunch, are main reasons for emittance growth. Further reasons are ground motions, jitter, filamentation by phase mixing, and so on. In [210], these effects and suitable correction mechanisms are described.

5.3.2 Wake Fields and Wake Potential

In April 1966, cumulative beam instabilities were observed for the first time ever at the SLAC two-mile accelerator [184]. Transverse distortions of the beam were slightly amplified in each of the 960 constant gradient structures, thus leading to a large total amplification of six to seven orders of magnitude in the end. With 10 to 20 mA, these instabilities had low current thresholds. A series of observations, experiments and calculations followed in the years 1966/67 [184], the results of which can be summarized as follows.

1. The lowest resonant frequency for which beam break-up was observed was about 4.14 GHz. The field of the corresponding dipole mode was confined to the first eight to ten cells; the rest of the structure was practically field free. The cell-to-cell phase advance in the first cells was close to $\pi$.
2. Only few modes contributed to the instability.
3. For all contributing modes, the relevant electric field was confined to the first quarter of the structure.

Thus, for the SBLC project with its relatively similar constant gradient structure, the following conclusions suggested themselves:

(i) Only few interacting dipole modes were expected.
(ii) For these modes, it was expected that their field and thus their energy would also be confined to the first cells of the structure.
(iii) As a consequence of (ii), only one damper in one of the first cells would be sufficient for an adequate suppression of short range wake fields.

The numerical simulations described in subsection 5.4 (resp. in [83]) were used to study issues (i) and (ii) (resp. (iii)). These studies made it obvious that the situation is fundamentally different for the SBLC structure because of the different degree of geometrical variation compared to the SLAC structure. Now, we introduce wake fields and related quantities before the above-mentioned studies are described in subsection 5.4. Some remarks concerning beam instabilities will also be made. A number of assumptions are made for the exposition to follow:

(i) The particles are assumed to be electrons.
(ii) The particles move at the speed of light $c$, i.e., they are ultra-relativistic.
(iii) The vacuum in the accelerator structure is assumed to be perfect.
(iv) The walls of all components are assumed to be perfectly conducting.
(v) The particle energy is assumed to be so high that Coulomb forces between the particles can be neglected.

The following description is partly based on [309]. More detailed descriptions can also be found in [20] and [53].
Fundamental Principles of Formation of Wake Fields. First, consider a simple point charge $q$ moving in free space with velocity $v = \beta c$, $\beta \approx 1$, i.e., close to the speed of light. It is convenient to choose the cylindrical coordinate system for the description. It is known from classical electrodynamics [142] that a highly relativistic point charge carries a field whose electric and magnetic field lines are almost totally confined to the transverse plane because of the Lorentz contraction. The comparison is often made to a thin plate perpendicular to the direction of propagation. The opening angle of the field lines in the longitudinal plane approximately equals $1/\gamma$, with the Lorentz factor $\gamma = 1/\sqrt{1 - \beta^2}$. In the ultra-relativistic limit $v \to c \Leftrightarrow \gamma \to \infty$, the plate thickness turns into a $\delta$-function-like distribution (footnote 9). The non-vanishing field components become

$$E_r = \frac{q}{2\pi\varepsilon_0 r}\,\delta(z - ct), \qquad H_\varphi = \frac{E_r}{Z_0},$$

with the impedance $Z_0 = 377\ \Omega$ of free space. No forces are exerted on test charges in front of or behind the charge $q$, since the field strength is zero in front of and behind the point charge. This idealization is entirely justified for $e^+e^-$ high energy accelerators, since the Lorentz factor there is of order $10^5$ (LEP or SLC) and space charges can be neglected. For protons and heavy ions, this idealization cannot be accepted automatically. Now, if the charge $q$ is no longer located in free space but is moving along the $z$-axis through a circular cylindrical tube, then the electric field lines end transversally on surface charges in the wall of the tube. The image charges move synchronously with the charge on the axis if the tube is perfectly conducting. Mathematically, the Dirichlet boundary condition is satisfied for the tangential electric field on the perfectly conducting metallic surface. Therefore, the ultra-relativistic point charge $q$ carries a purely transverse electric field as long as it is moving in a smooth perfectly conducting tube. However, in case of finite conductivity, i.e., a tube with specific resistance, the tangential field $E_z$ has to satisfy different conditions depending on the field in the wall of the tube. In [53], the fields $E_z$ and $E_r = B_\varphi$ are derived for this case. It becomes obvious that the longitudinal component $E_z$ has negative sign just behind the charge and thus has a decelerating effect. At a somewhat larger distance, it changes sign and thus has an accelerating effect. Because of causality, no field exists in front of the charge. The scattered fields left by the charge are referred to as resistive wall wakes. In most cases they are negligible [309].

Footnote 9: Dirac's delta function is used in the sense of distributions; it is a singular distribution [206].
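The orders of magnitude quoted above are easy to reproduce. The sketch below computes the Lorentz factor and the opening angle $1/\gamma$ for an electron and, for comparison, a proton of the same total energy, and evaluates the free-space impedance from the vacuum constants. The beam energy of 45.6 GeV is an assumed, illustrative value of the order of the LEP and SLC beam energies, not a number taken from the text; the function names are hypothetical.

```python
import math

ELECTRON_REST_ENERGY_EV = 0.51099895e6
PROTON_REST_ENERGY_EV = 938.27208816e6
MU_0 = 1.25663706212e-6    # vacuum permeability in H/m
EPS_0 = 8.8541878128e-12   # vacuum permittivity in F/m


def lorentz_factor(total_energy_ev, rest_energy_ev):
    """gamma = E / (m c^2)."""
    return total_energy_ev / rest_energy_ev


def beta_from_gamma(gamma):
    """beta = v / c = sqrt(1 - 1 / gamma^2)."""
    return math.sqrt(1.0 - 1.0 / gamma ** 2)


beam_energy_ev = 45.6e9    # ASSUMED illustrative beam energy

gamma_e = lorentz_factor(beam_energy_ev, ELECTRON_REST_ENERGY_EV)
gamma_p = lorentz_factor(beam_energy_ev, PROTON_REST_ENERGY_EV)
z_0 = math.sqrt(MU_0 / EPS_0)

print(f"electron: gamma = {gamma_e:.3e}, opening angle 1/gamma = {1.0 / gamma_e:.2e} rad")
print(f"proton  : gamma = {gamma_p:.1f}, opening angle 1/gamma = {1.0 / gamma_p:.2e} rad")
print(f"proton  : 1 - beta = {1.0 - beta_from_gamma(gamma_p):.2e}")
print(f"free-space impedance Z_0 = {z_0:.2f} Ohm")
```

For the electron the Lorentz factor is close to $10^5$ and the field is confined to a disc with an opening angle of about $10^{-5}$ rad, whereas a proton of the same energy has a Lorentz factor of only a few tens, which illustrates why the idealization cannot be taken for granted for heavier particles.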
Figure 5.15. Formation of wake fields when the bunch passes a diameter change in the tube. Two snapshots while the bunch passes the structure and one snapshot after it has left the structure are shown.

The case of diameter changes in the tube, e.g., by some accelerating structure, is of major importance. In those locations, scattered electromagnetic fields are generated, since the boundary conditions also have to be satisfied in the changed geometry of the perfectly conducting metallic surface. Now the walls are no longer parallel to the trajectory of the charge but are perpendicular to it or form an angle with it. Therefore, the field which was exactly transverse to the $z$-axis is "bent", since it must always be transverse to the
perfectly conducting wall. Figure 5.15 shows this mechanism for a Gaussian bunch: As long as no change in diameter occurs in the beam tube, all image charges on the wall travel exactly with the bunch. In particular, they vanish completely as soon as the bunch has passed some location. At any diameter change in the beam tube, additional field components are generated in longitudinal direction, i.e., in the direction of propagation of the particles. Because of causality, these field components are located behind the particles. If the beam tube narrows again, for example, after some cavity, reflections of the fields left behind may occur. The energy deposited by the scattering can induce resonances in a cavity or other resonant structure. If the decay time of the excited fields is smaller than the bunch length divided by $c$, then even the end of the bunch is influenced by the excited fields. These scattered electromagnetic fields arising as a result of the interaction of the particles with their surroundings are referred to as wake fields. The wake fields act back on the bunch itself, producing energy losses and changes in momentum. Figure 5.16 shows the wake fields in a three-cell accelerating structure. All the pictures show the upper half of the cross-section of the cylindrically symmetric structure. The development of the scattered electromagnetic fields while a Gaussian bunch passes the structure is shown at six successive moments. The wave front of the scattered electromagnetic fields is repeatedly reflected at the walls. In this way, high frequency modes are also induced, in addition to the fundamental mode. The system "beam - accelerating structure" first passes through a transient phase before achieving a stationary state.
Figure 5.16. Wake fields induced by a Gaussian bunch in a three-cell accelerating structure. The upper half of the cross-section of the structure is displayed.
The scattered electromagnetic fields depend both on space and on time. Figure 5.16 shows that the electric field strength has components in the direction of motion as soon as the diameter changes. Therefore, forces arise in the direction of motion, whereas the forces in a smooth cylindrically symmetric structure compensate each other completely. Since it may be assumed for an ultra-relativistic particle that its relative position inside the bunch remains unchanged, the transient force of the scattered fields may be integrated over the total time needed for the bunch to pass the structure, which leads to the concept of the wake potential. The wake fields act back on the generating bunch by exerting a force on the beam that withdraws energy and deflects the particles from the design trajectory. Furthermore, they heat the structure that the beam passes, cf., e.g., [285]. Therefore, these fields are also called parasitic fields. The wake fields can be determined numerically. The program MAFIA, with its time domain solvers T2 and T3 [155], [76], [72], [22], [307], [262], and its predecessors TBCI [301], [299], which was used to generate the plots in Fig. 5.16, and BCI [298] are also based on FIT (cf. subsection 2.3 and [296]) and solve Maxwell's equations in the time domain. Both short range and long range wake fields cause instabilities.
Wake Potential. Next, consider a wake field excited by a reference particle of charge $q_1$ at distance $\tilde r$ from the z-axis, moving parallel to the z-axis with velocity $\beta c$, as well as a test particle of charge $q_2$ following the same path with the same velocity but at longitudinal distance $s$ and at distance $r$ from the z-axis. The charge density given by the point charge $q_1$ is
$$\rho(r, \varphi, z, t) = \frac{q_1}{2\pi}\,\frac{\delta(r)}{r}\,\delta(z - \beta c t).$$
The test particle experiences a Lorentz force generated by the wake fields:
$$\mathbf{F}(r, \varphi, z, (z+s)/(\beta c)) = q_2\left(\mathbf{E}(r, \varphi, z, (z+s)/(\beta c)) + \beta c\,\mathbf{e}_z \times \mathbf{B}(r, \varphi, z, (z+s)/(\beta c))\right).$$
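As a minimal sketch of this force expression, the following Python fragment evaluates $q_2(\mathbf{E} + \beta c\,\mathbf{e}_z \times \mathbf{B})$ for fields given in local cylindrical components. The field values, the test charge and the relation $B_\varphi = \beta E_r / c$ for the field travelling with the charge in a smooth tube are assumptions chosen for illustration and are not taken from the text. Under these assumptions the transverse force shrinks by the factor $1 - \beta^2$, which illustrates why the forces in a smooth tube compensate each other for ultra-relativistic particles.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def lorentz_force(q2, e_field, b_field, beta):
    """Force on a test charge q2 moving with velocity beta*c along e_z.

    Fields are given as local cylindrical components (r, phi, z). This basis
    is right-handed and orthonormal, so np.cross can be applied directly.
    """
    velocity = np.array([0.0, 0.0, beta * C])
    e_field = np.asarray(e_field, dtype=float)
    b_field = np.asarray(b_field, dtype=float)
    return q2 * (e_field + np.cross(velocity, b_field))

# Hypothetical co-moving field in a smooth tube: radial E, azimuthal B,
# related by B_phi = beta * E_r / c (assumption for this illustration).
beta = 0.999
q2 = 1.0e-9            # C, hypothetical test charge
e_r = 1.0e6            # V/m, hypothetical field strength
b_phi = beta * e_r / C
force = lorentz_force(q2, [e_r, 0.0, 0.0], [0.0, b_phi, 0.0], beta)
ratio = force[0] / (q2 * e_r)   # remaining fraction of the electric force
```

The quantity `ratio` equals $1 - \beta^2$ up to rounding, so the net transverse force vanishes in the limit $\beta \to 1$.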
The integral effect of this force reflects the change in momentum and is described by the wake potential: The longitudinal wake potential is defined as the total energy loss of the test particle divided by the charge ql of the reference particle. It is a function of the transversal displacement r and the longitudinal distance s:
$$W_\parallel(r, s) = -\frac{1}{q_1}\int_{-\infty}^{\infty} E_z(r, \varphi, z, (z+s)/(\beta c))\,dz.$$
The transverse wake potential is defined as the transverse change in momentum divided by the charge ql of the reference particle:
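$$\mathbf{W}_\perp(r, s) = \frac{1}{q_1}\int_{-\infty}^{\infty}\left(\mathbf{E}_\perp(r, \varphi, z, (z+s)/(\beta c)) + \beta c\,\mathbf{e}_z \times \mathbf{B}_\perp(r, \varphi, z, (z+s)/(\beta c))\right)dz.$$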
This definition refers to a δ-function-like charge distribution. The wake potential of the δ-function can be used as Green's function to determine the wake potential of an arbitrary charge distribution [20], e.g., for a Gaussian charge distribution of a bunch. The longitudinal wake potential corresponds to a voltage distribution and describes the energy loss of the single particles as a function of the relative position in the bunch. First of all, the scattered fields cause a short range effect on the particles inside the bunch that generates the scattered fields. In addition, long range effects can occur if resonances with high quality factor are induced. If the decay time of these resonances is larger than the time interval between successive bunches, such resonances can lead to instabilities, which widen the bunch to an unacceptable extent.
[Plot of the longitudinal wake potential $W_\parallel$ and the transverse wake potential $W_\perp$ versus particle position s/m; offset = 0.1 mm.]
Figure 5.17. Wake potential generated by a Gaussian bunch in the three-cell accelerating structure from Fig. 5.16.
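The use of the point-charge wake potential as a Green's function can be sketched numerically as a convolution with the line charge density of the bunch. The following fragment is only an illustration under stated assumptions: the point-charge wake is modelled by a single resonant mode of the form $2k\cos(\omega s/c)$ for $s > 0$, the value at $s = 0$ is set to $k$, and the values of `sigma`, `k` and `omega` are hypothetical. None of these choices is claimed to reproduce Fig. 5.17.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def point_charge_wake(s, k, omega):
    """Assumed single-mode point-charge wake: 2 k cos(omega s / c) for s > 0.

    The wake vanishes ahead of the charge (s < 0); at s = 0 the value k is
    used, a common convention that is assumed here.
    """
    s = np.asarray(s, dtype=float)
    w = 2.0 * k * np.cos(omega * s / C)
    return np.where(s > 0.0, w, np.where(s == 0.0, k, 0.0))

def bunch_wake(s_grid, line_density, k, omega):
    """W(s) = integral of lambda(s') w(s - s') ds' on a uniform grid."""
    ds = s_grid[1] - s_grid[0]
    distance = s_grid[:, None] - s_grid[None, :]   # s - s'
    return point_charge_wake(distance, k, omega) @ line_density * ds

sigma = 1.0e-3                      # m, hypothetical rms bunch length
k = 1.0                             # hypothetical loss parameter (arbitrary units)
omega = 2.0 * np.pi * 3.0e9         # rad/s, hypothetical mode frequency
s_grid = np.linspace(-5.0 * sigma, 5.0 * sigma, 1001)
line_density = np.exp(-0.5 * (s_grid / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
w_bunch = bunch_wake(s_grid, line_density, k, omega)
```

For this model the result can be checked analytically: at the bunch centre the wake potential is $k\,e^{-(\omega\sigma/c)^2/2}$, i.e., about half of the value $2k$ seen far behind a point charge, because only the leading half of the bunch acts on the centre.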
Properties of the Wake Potential and its Relation to the Impedance. The longitudinal and the transverse wake potential are related to each other as follows:
$$\frac{\partial \mathbf{W}_\perp(r, s)}{\partial s} = \nabla_\perp W_\parallel(r, s). \qquad (5.4)$$
This basic relation is referred to as the Panofsky-Wenzel Theorem [195]. The correlation between longitudinal and transverse forces expressed by the Panofsky-Wenzel Theorem simplifies many problems. In experiments, it reduces the determination of important resonator quantities, e.g., the transverse shunt impedance $R_\perp$, to the measurement of the longitudinal
electric field at some exposed positions $r$ and $\varphi$ in an accelerating structure [162]. As to the computation of the wake potential for a resonator that ends on both sides in a beam tube of constant radius, it follows from the Panofsky-Wenzel Theorem that the infinite integration may be replaced by the integration along the beam tube, which in turn leads to a finite integral [299], [155], as will be described below. The Fourier transform maps the wake potential to the impedance (often called the coupling impedance):
$$Z(\omega) = \frac{1}{\beta c}\int_{-\infty}^{\infty} W(s)\,e^{-i\omega s/(\beta c)}\,ds.$$
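The Fourier relation between wake potential and impedance can be tried out numerically. The sketch below assumes the prefactor $1/(\beta c)$ with positive sign and $\beta = 1$, and it uses a hypothetical damped single-mode wake $W(s) = 2k\cos(\omega_r s/c)\,e^{-\alpha s}$ for $s \ge 0$; the values of `k`, `f_res` and `alpha` are illustrative assumptions. For this wake the integral is known in closed form, which gives a reference value for the trapezoidal quadrature.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def impedance(omega, s, wake, beta=1.0):
    """Z(omega) = 1/(beta c) * integral of W(s) exp(-i omega s/(beta c)) ds.

    The integral is approximated by the trapezoidal rule on the grid s.
    """
    integrand = wake * np.exp(-1j * omega * s / (beta * C))
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(s))
    return integral / (beta * C)

k = 1.0e12                         # V/C, hypothetical loss parameter
f_res = 3.0e9                      # Hz, hypothetical resonant frequency
omega_res = 2.0 * np.pi * f_res
alpha = 0.5                        # 1/m, hypothetical spatial decay constant
s = np.linspace(0.0, 40.0, 400_001)            # wake is zero for s < 0
wake = 2.0 * k * np.cos(omega_res * s / C) * np.exp(-alpha * s)

z_num = impedance(omega_res, s, wake)
kappa = omega_res / C
# Closed-form value of the same integral at omega = omega_res.
z_ref = (k / C) * (1.0 / alpha + 1.0 / (alpha + 2.0j * kappa))
```

At resonance the real part dominates and is approximately $k/(c\,\alpha)$, so a slowly decaying mode shows up as a sharp, high peak in the impedance.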
The impedance depends only on the transverse variables and the frequency $\omega/2\pi$. The impedance describes the behaviour of a system that extends infinitely in the z-direction. Obviously, it is an integral quantity, i.e., determined by the whole structure. When the impedance is computed, e.g., for an accelerating section, then it is a local quantity in the sense that it only refers to that subdomain of a larger structure.
Multipole Expansion of Wake Potentials. Rotationally invariant structures, represented in cylindrical coordinates $(r, \varphi, z)$, have electromagnetic fields that are periodic with respect to $\varphi$. Thus, the fields can be expanded in a Fourier series
$$\mathbf{E}(r, \varphi, z, t) = \sum_{m=0}^{\infty} \mathrm{Re}\left\{\left(E_r^{(m)}(r, z, t)\,\mathbf{e}_r + E_\varphi^{(m)}(r, z, t)\,\mathbf{e}_\varphi + E_z^{(m)}(r, z, t)\,\mathbf{e}_z\right)e^{im\varphi}\right\},$$
$$\mathbf{H}(r, \varphi, z, t) = \sum_{m=0}^{\infty} \mathrm{Re}\left\{\left(H_r^{(m)}(r, z, t)\,\mathbf{e}_r + H_\varphi^{(m)}(r, z, t)\,\mathbf{e}_\varphi + H_z^{(m)}(r, z, t)\,\mathbf{e}_z\right)e^{im\varphi}\right\}$$
with the complex phasors $\mathbf{E}$ and $\mathbf{H}$ and the unit vectors $\mathbf{e}_r$, $\mathbf{e}_\varphi$, $\mathbf{e}_z$.
In some literature, $W_\parallel^{(0)}$ is also called the longitudinal wake potential. To avoid misunderstanding, this component will be referred to as the monopole component of the longitudinal wake potential in the sequel.
The longitudinal wake potential is given by
$$W_\parallel(r, \varphi, s) = \sum_{m=0}^{\infty} w_m(s)\,\tilde r^{\,m}\,r^m \cos m(\varphi - \tilde\varphi)$$
and the transverse wake potential by
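$$\mathbf{W}_\perp(r, \varphi, s) = \sum_{m=0}^{\infty} m \int_{-\infty}^{s} w_m(s')\,ds'\;\tilde r^{\,m}\,r^{m-1}\left(\mathbf{e}_r \cos m(\varphi - \tilde\varphi) - \mathbf{e}_\varphi \sin m(\varphi - \tilde\varphi)\right).$$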
The $w_m(s)$, $m = 0, 1, 2, \ldots$, are scalar functions, while $\mathbf{W}(s)$ is a vector-valued function. Under the above assumptions, both the longitudinal wake potential $W_\parallel(s)$ and the transverse wake potential $\mathbf{W}_\perp(s)$ are uniquely determined by the $w_m(s)$. Since the dependence on the radial position is known, the integral can be evaluated at an arbitrary radial position, in particular at the tube radius $r = a$. Since $E_z$ vanishes on the metallic walls of the tube, only the integral over the gap of the cavity remains. For many practical applications, the dependence on $(r/a)^m$ implies that the longitudinal wake potential is dominated by the monopole term and the transverse wake potential by the dipole term:
$$W_\parallel(r, \varphi, s) = w_0(s),$$
$$\mathbf{W}_\perp(r, \varphi, s) = \tilde r \int_{-\infty}^{s} w_1(s')\,ds'\left(\mathbf{e}_r \cos(\varphi - \tilde\varphi) - \mathbf{e}_\varphi \sin(\varphi - \tilde\varphi)\right).$$
The effects caused by the wake fields can be divided into three groups: (i) The short range forces which are always excited when a bunch passes a structure off the axis. They are independent of the quality factor of the structure but depend only on the geometry. (ii) The long range forces caused by resonant fields with a special frequency and decay time, being left in a structure after a bunch passed it. As soon as such a stationary state has been achieved, an interaction of the wake fields of one bunch in the beam to another is even possible for finite length of the bunches: Leading bunches excite higher order modes, i.e., parasitic modes, with damping time not small enough compared to the time interval between two successive bunches. Since these fields decay only slowly, the following bunch experiences the distortions of the preceding one.
(iii) The beam dynamics, which can be studied separately with simple models and computer simulations (cf. the transport matrices in the explanation of the emittance). This is possible since all forces are causal and the distribution of the high energy particles varies slowly compared to the generation time of the forces.
Loss Parameter. The loss parameter or loss factor [20] is a measure of the interaction between a beam and the accelerating structure. It measures the amount of energy that a particle can lose to or gain from the fields of a mode as it passes the structure. The loss parameter is defined as follows:
$$k(r) = \frac{|V|^2}{4 W_s}.$$
It is given by the squared absolute value of the voltage that a particle experiences when it passes the structure at radius $r$, divided by four times the stored energy of the mode. Here $V$ stands for the electric voltage experienced by the particle moving along radius $r$, and $W_s$ stands for the stored energy of the mode. The loss parameter (see footnote 11) is needed for the calculation of the wake potential. One often uses the normalized loss parameter $k'$, defined independently of the radius $r$ of the voltage integral:
$$k' = k(r) \cdot \frac{1}{r^{2m}}.$$
Here $m$ describes the azimuthal dependence of the mode: $m = 0$ for monopole accelerating modes, $m = 1$ for the dipole modes, and so on.
Figure 5.18. Loss parameter for the lowest dipole modes of a constant gradient structure with 18 cells and for a periodic structure (constant impedance) with the dimensions of the middle cell of the constant gradient structure. The field computation and the computation of the loss parameter were performed with URMEL-T.
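The two definitions of the loss parameter are simple enough to be written down as a short sketch. The function names below are chosen for illustration only. The example evaluates the normalization $1/r^{2m}$ for a dipole mode ($m = 1$) at the smallest iris radius $r = 1.17646$ cm, which is the radius quoted later in this chapter together with the conversion factor $7.225 \cdot 10^3$ between $k$ and $k'$.

```python
def loss_parameter(voltage, stored_energy):
    """k = |V|^2 / (4 W_s), with V in volts and W_s in joules (k in V/C)."""
    return abs(voltage) ** 2 / (4.0 * stored_energy)

def normalized_loss_parameter(k, radius, m):
    """k' = k(r) / r^(2m), where m is the azimuthal order of the mode."""
    return k / radius ** (2 * m)

r_iris = 1.17646e-2   # m, smallest iris radius quoted in the text
# Conversion factor from k to k' for a dipole mode (m = 1), in 1/m^2.
factor = normalized_loss_parameter(1.0, r_iris, m=1)
```

With SI units the factor evaluates to about $7.225 \cdot 10^3$ per square metre, in agreement with the value given in the text, while for monopole modes ($m = 0$) the normalization leaves $k$ unchanged.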
Then the wake functions $w_m(s)$ of a cylindrically symmetric structure can be expressed via the resonant frequencies $\omega_{mn}$ and the corresponding loss parameters $k_{mn}$:
$$w_m(s) = 2\sum_n k_{mn}(r)\cos(\omega_{mn}s/c). \qquad (5.5)$$
Footnote 11: The loss parameter $k$ should not be confused with the kick factor $k_\perp$, which has another meaning.
The formula (5.5) is valid as long as the modes do not lose any power through the beam tubes. Weak losses can be taken into account in the wake functions using perturbation theory:
In summary, the decisive parameters for beam dynamics computations are the resonant frequencies $\omega_{mn}$, the corresponding loss parameters $k_{mn}$, and the quality factors $Q_{mn}$. The beam dynamics both of single bunches and of a complete bunch train can then be studied using computer simulations.
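A mode sum of the form (5.5) can be evaluated directly once frequencies and loss parameters are known. The following sketch does this for a list of modes. The optional damping factor $e^{-\omega s/(2Qc)}$ is the usual textbook way of including weak losses and is an assumption here, since it is not recovered from the text; the quality factors in the example are hypothetical. The two frequencies and loss parameters are those quoted in the caption of Fig. 5.19 and are used purely as sample input.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def wake_function(s, freqs_hz, loss_params, q_factors=None):
    """Sum 2 k_n cos(omega_n s / c) over the modes; zero for s < 0.

    If q_factors is given, each term is multiplied by
    exp(-omega_n s / (2 Q_n c)), an assumed weak-loss damping factor.
    """
    s = np.asarray(s, dtype=float)
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)
    k = np.asarray(loss_params, dtype=float)
    phase = s[..., None] * omega / C              # shape (..., n_modes)
    terms = 2.0 * k * np.cos(phase)
    if q_factors is not None:
        q = np.asarray(q_factors, dtype=float)
        terms = terms * np.exp(-phase / (2.0 * q))
    return np.where(s >= 0.0, terms.sum(axis=-1), 0.0)

freqs = [4122.5e6, 4393.4e6]        # Hz, from the caption of Fig. 5.19
k_norm = [10.8e4, 10.5e4]           # V/(pC m^2), from the same caption
q_hypothetical = [1.0e4, 1.0e4]     # assumed quality factors
s = np.linspace(0.0, 30.0, 3001)    # m, distance behind the exciting charge
w_undamped = wake_function(s, freqs, k_norm)
w_damped = wake_function(s, freqs, k_norm, q_hypothetical)
```

Comparing `w_undamped` and `w_damped` at the bunch spacing of interest gives a first impression of whether a given quality factor is low enough to decouple successive bunches.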
5.3.3 Single Bunch and Multibunch Instabilities
Instabilities are generated either by a single bunch itself or, in an ensemble of bunches, by one of the other bunches. Therefore the terms "single bunch instabilities" and "multibunch instabilities" are used. Each of these can be divided into longitudinal and transverse effects. The instabilities decrease the luminosity, heat the structures [285], and can even cause the loss of the beam. All effects are only briefly listed here without a detailed explanation. The field of single bunch and multibunch dynamics is a wide research area, in which the higher order modes of the accelerating structures are just one topic. A good overview of the whole subject can be found, e.g., in the dissertations of Raubenheimer [210] and Drevlak [83].
Single Bunch Instabilities. The short range wake fields mentioned above, i.e., the interaction of particles in the head of a bunch with those in the tail of the bunch, are among the major causes of single bunch instabilities. Longitudinal single bunch effects are caused by the energy spread along the bunch, which in turn is generated by longitudinal wake fields and phase differences with the accelerating field (rf deflection). A proper adjustment of the rf phase can partly compensate for these effects, but additional focusing is necessary. Transverse single bunch effects are caused by bunches moving off axis. The deviation from the design trajectory can have different reasons: misalignment of accelerating structures, vibrations of different components, dispersion errors, mismatched injection of the bunch, or some form of jitter. As a result of the deviation, the particles at the head of the bunch interact via the excited transverse wake fields with those in the tail of the bunch, which leads to a linear enlargement of the bunch, the single bunch beam break-up (SBBU). Effects of transverse wake fields as well as chromatic effects can be compensated by BNS damping [15] (cf. also [234]), in which a systematic energy spread along the beam is used.
Multibunch Instabilities. The multibunch dynamics expresses the influence of the leading bunches inside a bunch train on the following bunches. The long range wake fields, i.e., the fields caused by leading bunches in a bunch train that deflect all following bunches, are the main cause of multibunch instabilities. The longitudinal multibunch effect consists of the bunch-to-bunch energy spread caused by the transient beam loading. The beam loading can be compensated for by different measures, e.g., by the so-called "staggered timing" or the ramping of the input rf power. The main problem of the transverse multibunch dynamics is the cumulative beam instability, which can lead to a beam break-up and even to a total loss of the beam (multibunch beam break-up, MBBU). The MBBU is caused by the excitation of higher dipole modes in the accelerating structures. The suppression of these higher order modes (HOM) is one of the most critical points in all design studies of linear colliders. The HOM suppression is necessary to control the multibunch instabilities in the actual linear collider and thus to prevent cumulative beam break-up. Suitable measures are the damping of the higher order modes and/or detuning of the accelerating structures. Both will be described in the sequel.
5.4 Numerical Analysis of Higher Order Modes
In principle, a number of different methods exist for the numerical computation of higher order modes (cf. also section 2):
- semi-analytical methods, e.g., the mode matching technique, based on a Fourier-Bessel expansion of the fields;
- grid-oriented discretization methods such as Finite Volumes, Finite Differences or Finite Elements;
- Boundary Element methods;
- hybrid methods combining, e.g., a discretization method and the mode matching technique;
- coupled circuit models.
It should be noted in the numerical analysis that not all of the mentioned methods are suitable for long constant gradient structures such as the SBLC structure, since the variation of the cell radii of constant gradient structures is usually much smaller than a thousandth of the total length of the structure. Therefore the electromagnetic fields of these aperiodic structures cannot be computed by the usual discretization methods. The semi-analytical methods and the coupled circuit models are suitable numerical methods for long aperiodic structures with small geometric variation from cell to cell. Discretization methods, by contrast, are limited by the feasible number of grid points and the accuracy of the solution algorithms for the underlying eigenvalue problem. Consequently, they are not well suited for long aperiodic structures, since a radial variation of less than 0.1 of a thousandth
of the total length of one structure would lead to an enormously high number of grid points in an adapted rectangular grid and thus would exceed even today's storage capacities. Only recently has some progress been made in parallel computation using Finite Element Methods and domain decomposition [328]. Additionally, each n-cell structure shows a cluster of n close eigenvalues per pass band. Beyond a certain number, the corresponding eigenvectors are no longer numerically separable, i.e., the field computation for the modes by solving an eigenmode problem becomes very difficult for large n. Therefore, the mode matching technique was used for the numerical analysis of the complete SBLC structure. It was implemented in the code ORTHO [279], which was written for the special purpose of higher order mode computation. In order to verify ORTHO, to study the convergence of the method, and to test the design of a relatively short test structure [161], the programs MAFIA [303] and URMEL-T [286], both based on the Finite Integration Technique, were also used. Finally, Dohlus [75] also set up a suitable coupled circuit model called COM as one of several subsequent developments. It is described in detail in [80]. Therein, the model parameters are determined with MAFIA.
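The remark that an n-cell structure shows a cluster of n close eigenvalues per pass band can be made quantitative with a crude model. The sketch below assumes a chain of n identical cells with a cosine-shaped dispersion curve between the 0-mode and the π-mode frequency; this simplified single-band model and the band edges of 4.1 GHz and 4.4 GHz are assumptions for illustration and do not describe the tapered SBLC structure itself. It only shows how quickly the relative distance between neighbouring eigenfrequencies shrinks when n grows.

```python
import numpy as np

def passband_frequencies(n_cells, f_zero, f_pi):
    """Mode frequencies of one pass band for a chain of n identical cells.

    Assumed cosine dispersion: the phase advance per cell takes the n
    equidistant values 0, pi/(n-1), ..., pi.
    """
    phase = np.pi * np.arange(n_cells) / (n_cells - 1)
    return f_zero + 0.5 * (f_pi - f_zero) * (1.0 - np.cos(phase))

def smallest_relative_spacing(n_cells, f_zero, f_pi):
    """Smallest distance between neighbouring modes, relative to the mean frequency."""
    freqs = np.sort(passband_frequencies(n_cells, f_zero, f_pi))
    return np.min(np.diff(freqs)) / np.mean(freqs)

f_zero, f_pi = 4.1e9, 4.4e9      # Hz, hypothetical band edges
spacing_18 = smallest_relative_spacing(18, f_zero, f_pi)
spacing_180 = smallest_relative_spacing(180, f_zero, f_pi)
```

In this model the smallest spacing occurs at the band edges and scales roughly like $1/n^2$, so going from 18 to 180 cells reduces it by about two orders of magnitude. An eigenvalue solver must resolve relative differences of this size to separate the corresponding eigenvectors.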
5.4.1 Computation of the First Dipole Band of the S-Band Structure with 30 Homogeneous Sections
In the following, we describe the results of mode matching computations. The method discussed in subsection 2.1 gives the solution as a linear combination of appropriately weighted basis functions:
For this purpose, one needs to subdivide the solution domain into subdomains allowing an analytic solution of Maxwell's equations as an orthogonal Fourier expansion into discrete modes. Then the field solution for the complete structure can be obtained by continuous matching of the fields at the common interfaces of the subdomains. This method was implemented in ORTHO. In order to use as few subdomains as possible, it is usually necessary to employ geometric simplifications of the original design. Some basic principles are described in subsection 2.1. The following splitting of the computation into two steps turned out to be an advantageous procedure: First, the resonant frequencies of the (parasitic) higher order modes (HOMs) are determined by the mode matching code RESO [246]. In a second step, each of these resonant frequencies is studied separately with the program ORTHO [279]. This computer program was developed to determine the scattering matrices, the inner wave amplitudes, the field distribution of the electric and magnetic field, and many secondary quantities, e.g., the loss parameter, for constant gradient structures. In case of a trapped mode that does not touch any of the end cells, the structure is
split for the numerical computation in its inner part at a suitable iris. The details of this procedure have already been described in subsection 2.1.
Geometry of the Simulated Structure. The simulation process for the SBLC structure is divided into several single steps:
(1) Based on the central parameters and the design of some test cell [329], the geometry of the complete structure with "original cups" was fixed in an iterative process having the appropriate attenuation $\tau$ as the target value (see [318] for details).
(2) Then some details of the geometry had to be modified for the computation with the mode matching technique:
a) The rounding of the irises as well as the asymmetric rounding of the cells had to be neglected.
b) Furthermore, a quasi-constant gradient structure was taken for the numerical analysis. It had 30 homogeneous sections of six equal cells each between some landings instead of a constant gradient structure having a linear variation of the radii of the cells and irises. This idea was known from the construction of existing 5-6 m long structures for injector linear accelerators, where it is used to decrease the production costs. For the simulation, this considerably simplified the modelling of the geometry of the structure. It had to be taken into account that, in constant gradient tubes, the resonant frequency of a given mode need not automatically be the same for magnetic and electric boundary conditions in the case of aperiodic structures. Therefore, the geometry had to be adjusted for equality of the frequency in the $2\pi/3$ accelerating mode: In order to reach all important parameters of the SBLC tube even after neglecting the rounding, a fine tuning of every sixth cell was performed with respect to the frequency and the desired group velocity. Then homogeneous groups of six cells each were built and put in pairs to form groups of twelve cells. These groups were optimized again by variation of the coupling iris (the landing). The procedure was repeated until the geometry of the tube with 180 cells was complete. The whole procedure is described in [318]. There, the results of the field computation are also shown for the $2\pi/3$ accelerating mode. As is described in subsection 2.1, the traveling wave is given by superposition of two standing waves.
(3) Finally, the actual computation of the higher order modes follows. This is described in the sequel.
Table 5.5 gives the data on the quasi-constant gradient structure with accelerating frequency $f_{2\pi/3} = 2997.364$ MHz and attenuation $\tau = 0.56$ Neper, which was studied with the mode matching technique with respect to its higher order modes. It is essential that this structure shows the piecewise linear variation of the auxiliary quantity $v(z)$ replacing the group velocity, which is characteristic for quasi-constant gradient structures. By small changes along the waveguide, a phase advance per cell of about 120° could also be reached for the accelerating mode.
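Two of the numbers in this description can be cross-checked with a short sketch. The first function computes the cell length for which a wave with a given phase advance per cell is synchronous with a particle of velocity $\beta c$; the assumption $\beta = 1$ and the relation $d = (\Delta\varphi / 2\pi)\,\beta c / f$ are standard but are not stated explicitly in the text. The second function replaces a linear taper by 30 homogeneous sections of six equal cells. The end values of the iris radius (1.6062 cm and 1.1765 cm) are taken from the geometry study in subsection 5.4.3 and serve only as illustrative input; the actual fine-tuned SBLC geometry differs.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def cell_length(freq_hz, phase_advance_rad, beta=1.0):
    """Cell length d with phase velocity beta*c: d = phase/(2 pi) * beta*c/f."""
    return phase_advance_rad / (2.0 * np.pi) * beta * C / freq_hz

def stepped_profile(value_in, value_out, n_sections=30, cells_per_section=6):
    """Piecewise-constant approximation of a linear taper.

    One value per homogeneous section, repeated for each cell of the section.
    """
    section_values = np.linspace(value_in, value_out, n_sections)
    return np.repeat(section_values, cells_per_section)

d = cell_length(2997.364e6, 2.0 * np.pi / 3.0)   # m, 2*pi/3 accelerating mode
total_length = 180 * d                           # m, structure with 180 cells
iris_radii = stepped_profile(1.6062e-2, 1.1765e-2)   # m, illustrative end values
```

Under these assumptions the cell length is about 33.3 mm and the 180 cells add up to roughly 6 m, which matches the length of 5-6 m mentioned for comparable structures.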
Table 5.5. The geometry of the studied S-band structure. The group velocities $v_{gi}$ correspond to the Periodic Cell Approximation (PCA) described above.
[Two plots of the amplitude versus cell number; the cell number axis runs from 0 to 190.]
Figure 5.19. Upper graph: Normalized voltage amplitude of the third dipole mode with f = 4122.5 MHz, $k = 10.8 \cdot 10^4$ V/(pC m$^2$). Lower graph: Normalized voltage amplitude of the 116th dipole mode with f = 4393.4 MHz, $k = 10.5 \cdot 10^4$ V/(pC m$^2$).
[Array of panels, one per displayed mode; each panel plots the amplitude versus cell number.]
Figure 5.20. Field distribution of the parasitic dipole modes in the SBLC quasi-constant gradient tube. The resonant frequencies are numbered in ascending order. The graphs show the normalized voltage amplitudes. The abscissa indicates the cell numbers n = 1, ..., 180.
[Plot of the loss parameter k in V/pC versus the frequency f.]
Figure 5.21. Loss parameter k as a function of the frequency, computed by the mode matching technique for the lowest 185 dipole modes in a quasi-constant gradient structure with 180 cells. The loss parameter was computed at the smallest iris radius (i.e., at 1.17646 cm). The 185 points were connected.
Figure 5.20 shows histogram-like field patterns of every tenth of the 180 lowest parasitic dipole modes in the SBLC constant gradient tube. The resonant frequencies were numbered in ascending order. The figure shows the normalized voltage amplitudes. The abscissa gives the cell numbers n = 1, ..., 180. The 185 computed dipole modes (see footnote 12) can be classified in more detail as follows:
(i) modes of the first pass band that touch the input end of the tube: roughly modes 1 to 24;
(ii) modes trapped completely inside the tube: modes 25 to 124;
(iii) further modes with π-mode-like field distribution at the front end of the pattern, trapped at the output end of the tube: modes 125 to 143 as well as modes 145, 146, 148 and 150;
(iv) modes without π-mode-like field distribution at the front end of the pattern that touch the output end of the tube: modes 177, 179, 181, 183 and 185;
(v) modes from the overlap of the first and second pass bands: modes 151 to 176;
(vi) modes that belong completely to the second pass band: modes 144, 147, 149 and then again 178, 180, 182 and 184.
About 140 modes show a π-mode-like field distribution at the front end of their pattern. These modes are most important for the interaction with the bunch, since their phase advance per cell is nearly synchronous, which is expressed in the high values of the loss parameter. As was mentioned before, some of the modes are trapped at the input end of the structure (the left end in the figures), but many of them are trapped in the middle of the structure and have no field at all at the input end.
Results. The phenomenon of dipole modes which interact over large distances with the beam and which are completely trapped in the inner part of the constant gradient structure was observed. The field pattern of such a dipole mode is displayed in the lower graph of Fig. 5.19. Prior to the studies described above, this phenomenon was known neither theoretically nor practically.
On the contrary, experiments at SLAC and those based on a coupled circuit model using only one dipole band suggested that a sharp maximum of the loss parameter should be expected at the lower frequency end of the dipole band and thus only for modes near the input end of the structure. The upper graph in Fig. 5.19 shows a mode of this kind. The most relevant results of the numerical analysis with ORTHO are:
- The first and second dipole bands overlap: Roughly, up to the 150-th mode, the modes are either trapped near the end of the structure or are mixed modes, i.e., they possess a portion of some mode from the first dipole pass band on one end of the structure and another portion of some mode of the second dipole pass band at the input end of the structure.
Footnote 12: Because of the overlapping first and second dipole bands, slightly more than 180 modes were computed.
At this point, a major disadvantage of the coupled circuit models becomes obvious: Prior to the simulations with ORTHO, some simulations were carried out with a single band coupled circuit model [82]. By its nature, this model cannot capture the overlap. Therefore, coupled circuit models should only be used for further (faster) simulations, e.g., for design studies, after verification by a method for field calculation.
- There exist many strongly interacting dipole modes in the structure. The loss parameter curve shows a somewhat oscillatory behaviour, which reflects the special geometry with local periodicity (six equal cells each) and 30 landings [279]. Therefore, the curve locally resembles the curve of a periodic structure depicted in Fig. 5.18. The corresponding curve for a constant gradient structure with continuous linear variation of the radii of the cells and irises can be obtained by averaging. The averaged curve shows a flat maximum over 2/3 of the first dipole band.
- Many of the strongly interacting modes have no contact with the cells on either end but are trapped in the inner part of the structure (trapped modes).
The field patterns of these strongly interacting modes show the following characteristics:
- The first modes are π-mode-like dipole modes. Their fields are trapped in the first 10% to 20% of the cells. They all have a high loss parameter, as is clear from Fig. 5.21 (see footnote 13).
- The next modes are trapped in inner cells. Their loss parameters are still very large, since the portion of five to ten cells with π-mode-like field distribution dominates the field pattern. This pattern with five to ten cells having π-mode-like field distribution "travels" towards the output end of the structure.
- The π-mode-like field portion is followed by some modes which mainly have a significant field near the output end of the structure. They have a very small loss parameter.
- Consequently, not only the first π-mode-like dipole modes influence the beam dynamics but all of the first 120 dipole modes. Only a few of these modes are purely π-mode-like and trapped at the input end of the constant gradient structure. The main portion of these deflecting higher order modes is trapped in the inner part of the structure, i.e., without any contact to the end cells.
Thus, the main result is that about 120 modes of the first dipole band can strongly interact with the bunches. The modes that have a negative influence on the particle dynamics are each trapped in some part of the traveling wave tube, while the rest of the structure is field free. In the original design for the damping of parasitic modes, only one waveguide at the input end was planned to absorb the higher order modes. The described characteristics of the higher order modes made a new damping concept necessary. Now all irises shall be coated with extremely lossy material in order to reach sufficient suppression of the parasitic modes. This can damp the higher order modes very well while the accelerating mode is hardly influenced at all [77], [131]. Two additional absorbing waveguides, the so-called HOM dampers, are now proposed. They will be used, in particular, as so-called pick-up monitors for beam diagnostics [198].
Footnote 13: The loss parameter k was computed at the smallest iris radius r = 1.17646 cm. Multiplication by $7.225 \cdot 10^3$ yields the normalized loss parameter k'.
5.4.2 Developments That Followed the ORTHO Studies
Since these results contradicted the usual ideas on the behaviour of higher order modes in constant gradient structures, the studies described above necessitated a series of further studies:
- All project groups designing a future linear collider intensively studied the question of higher order modes, in particular that of trapped modes.
- Theoretical considerations and numerical studies [160] led to the realization that there is a unique relation between the degree of radial taper and the functional course of the loss parameter curve (compare subsection 5.4.3).
- Numerical convergence studies were carried out with different methods [183], [204], [239] (cf. subsections 2.1.4 and 5.4.3).
- Studies of a faster mode matching implementation were performed [87] (compare subsection 2.1).
- A better suited equivalent circuit model in the form of a double banded coupled oscillator model (COM) [80] was set up for faster studies of the higher order modes. This model was validated with ORTHO (compare subsection 5.4.4).
- Several measurement experiments were carried out: the 36-cell experiment at the Darmstadt University of Technology/University Frankfurt (cf. subsection 5.5), the 21-cell experiment at KEK and SLAC [128], the measurement of the old and new LINAC II structure at DESY [154], [217] (cf. subsection 5.4.5), the measurement of the first 30 cells of the SBLC structure with hybrid coupler [149], and the 28-cell experiment at KEK/University Frankfurt [128].
- In the context of these experiments, a systematic verification of the measurement methods was also carried out (cf. subsection 5.5).
- Furthermore, the new damping concepts mentioned above [77] were developed, as a result of which new parameters [51] that allowed a substantial increase in luminosity could be chosen.
5.4.3 Geometry and Convergence Studies of Trapped Modes
Geometry Studies. In a student's work [160], five 18-cell structures differing in their degree of cell and iris tapering were studied with URMEL-T [286]. The extremes were a constant impedance structure with 4.0634 cm cell
radius and 1.6062 cm iris radius and a constant gradient structure with these dimensions at the input end and 3.9285 cm and 1.1765 cm at the output end. The eigenfrequencies of the first dipole band are very close when the structure is periodic, and the distance between the first and second pass bands, i.e., the stop band, is about twice as large as the first pass band. The stronger the tapering, the wider the pass bands become until they finally overlap. This behaviour can also be estimated by the so-called cut-off diagrams, where the frequencies of the 0- and π-mode are displayed as functions of the cell number (compare Fig. 5.26 in subsection 5.5). With a stronger radial taper, more and more portions of different mode types are found in a single field distribution. Accordingly, the loss parameter curve gets flatter and wider around its maximum. Already for the studied 18 cells with the strongest radial taper, some modes occurred which closely resembled the trapped modes; the structure was just not long enough to have "field free" cells as well. Another study in [160] of an 18-cell structure concentrated on the neglect of roundings and on the differences between quasi-constant gradient and constant gradient structures. These studies were carried out with URMEL-T. Qualitatively, the field distributions coincided with those found with ORTHO for the 180-cell structure. The substitution of roundings by rectangular edges does not influence the qualitative field distribution. The grouping of cells in a quasi-constant gradient structure leads to a stronger concentration of the fields in such a homogeneous cell group. The oscillatory behaviour of the loss parameter curve of a quasi-constant gradient structure could be related uniquely to the number of equal cells in one group (which reflects the local periodicity). Convergence Studies.
A series of convergence studies was carried out with the mode matching technique [183], [239], [204] and the Finite Integration Technique [204]. Some essential results from [183] and [239] were already presented in subsection 2.1.4. In a diploma thesis [204], convergence studies regarding the loss parameter were carried out with ORTHO, MAFIA [303], and URMEL-T. The studies were done for one, three, five, ten, and 36 cells. The integration radius for the loss parameter was always taken as half the iris radius in order to exclude edge effects. With MAFIA, it was ensured that the local step size ratio did not exceed a factor of two and that twice the number of searched dipole modes, i.e., two clusters of eigenvalues, were calculated. With URMEL-T, which uses an older version of the eigenvalue solver SAP¹⁴ [270], [271] than MAFIA, the parameters for SAP were adjusted according to [166] for the case of closely located eigenvalues. A strong dependence of the loss parameter on the chosen grid was observed in URMEL-T. This is due to the facts that the triangles can become degenerate, turning into line segments, and that the interpolation of the field components $E_r$ and $E_z$ can be improved. Altogether, the resonant frequencies showed good convergence already with relatively few mesh points, while the loss parameters converged only for higher numbers of grid points. Only in MAFIA was a unique linear dependence found for the latter.

¹⁴ This algorithm uses the simultaneous iteration of eigenfunctions going back to Bauer [26] and Rutishauser [221] together with Chebyshev acceleration.

The results of ORTHO gave rise to a number of questions which could to a large extent be answered in [239]: Since errors in the longitudinal electric field of the dipole modes essentially determine the error in the loss parameter, the studies in [239] concentrated mainly on the convergence of the $E_z$-field and reached a convergence improvement via some filter and some "artificial" intermediate steps (cf. subsection 2.1.4). In fact, the loss parameters in ORTHO are not computed by numerical integration of the $E_z$-field shown in subsection 2.1.4, but by evaluating an analytic solution of the voltage integral over the $E_z$-field:
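\[
V(\omega) = \int \sum_n \left( a_n^{+}\, e^{-i k_{z,n} z} - a_n^{-}\, e^{-i k_{z,n} z} \right) \cdot d_n \cdot \sqrt{Y_n} \cdot i\kappa^2 \cdot J_\mu(\kappa r) \cdot e^{i\omega z/\beta c}\, dz,
\]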
with the inner amplitudes $a_n^{+}$ and $a_n^{-}$, the propagation constants $k_{z,n}$, normalizing constants $d_n$, $Y_n = \omega\varepsilon/k_z$, and the $\mu$-th Bessel function $J_\mu$ with zeroes $\kappa = \kappa_{\mu n}$. Nevertheless, the $E_z$-curves give a good graphical representation of the error source. The example of a single cell is used in some studies on the normalized loss parameter as a function of the integration radius [239]. By definition, that parameter should be constant. But this is the case only if one uses a filter and auxiliary transitions, and even then increasing loss parameter values can be observed for radii near the iris. Without auxiliary transitions, errors occur near the axis. Comparisons with MAFIA also showed increasing normalized loss parameter values for radii near the iris. This effect is caused by field errors near the edges of the iris (as used for the mode matching simulations). Unless special measures are taken, such as a clever choice of the mode ratio (especially for the auxiliary transitions) in ORTHO or a much finer grid near the edges in MAFIA, an integration radius of about half the iris radius or smaller should be chosen. Another conclusion from the study in [239] is that an error minimization of the $E_z$-field by changes in the inner amplitudes should be advantageous.
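The remark that ORTHO evaluates the voltage integral analytically rather than by numerical integration can be illustrated for a single term of the sum. The following minimal sketch assumes one waveguide mode with a hypothetical propagation constant `k_z` and integrates only the z-dependent factor $e^{-i k_z z}\, e^{i\omega z/\beta c}$ over a section of length `length`. The function names and all numerical values are illustrative assumptions and are not taken from ORTHO.

```python
import cmath
import math

C0 = 299792458.0  # speed of light in vacuum, m/s


def voltage_term_analytic(k_z, omega, length, beta=1.0):
    """Closed form of the integral of exp(-i k_z z) * exp(i omega z / (beta c)) over [0, length]."""
    q = omega / (beta * C0) - k_z  # net phase slip per metre
    if abs(q * length) < 1e-12:
        return complex(length)     # synchronous limit: the integrand equals 1
    return (cmath.exp(1j * q * length) - 1.0) / (1j * q)


def voltage_term_trapezoid(k_z, omega, length, steps, beta=1.0):
    """The same integral evaluated with the composite trapezoidal rule."""
    q = omega / (beta * C0) - k_z
    h = length / steps
    total = 0.5 * (1.0 + cmath.exp(1j * q * length))
    for j in range(1, steps):
        total += cmath.exp(1j * q * j * h)
    return total * h


# Hypothetical values: a 4.175 GHz mode, k_z = 60 1/m, one 35 mm section.
omega = 2.0 * math.pi * 4.175e9
exact = voltage_term_analytic(60.0, omega, 0.035)
coarse = voltage_term_trapezoid(60.0, omega, 0.035, steps=4)
fine = voltage_term_trapezoid(60.0, omega, 0.035, steps=2000)
print(abs(coarse - exact), abs(fine - exact))
```

The closed form carries no discretization error in z, so in such a scheme the remaining error of the loss parameter stems from the field amplitudes themselves, which is in line with the emphasis placed on the convergence of the $E_z$-field.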
5.4.4 Comparison with the Coupled Oscillator Model COM

Microwave devices are often studied by using equivalent circuits. In connection with the HOM studies, a lumped circuit model had to be found which represents the constant gradient structure well enough to allow the determination of characteristic parameters such as the eigenfrequencies, field patterns and loss parameters. Since the first two dipole bands overlap, at least a double-band model is necessary to simulate these effects. Dohlus developed the so-called Multiply Coupled Oscillator Model, COM for short [80]. It was used to compute the undamped and the damped SBLC structure, the 36-cell test structure, and the SBLC structure with 30 landings.
Double-Banded Equivalent Circuit Model. The basic idea is that each cell of the structure is excited by a combination of modes of the corresponding periodic structure. The relation between the modal coefficients of neighbouring cells is fixed by the coupling iris openings. Magnetic coupling may be assumed, since the longitudinal electric dipole field vanishes on the axis. A double-banded coupled model is adequate, since the SBLC structure is characterized by the overlap of the first and second dipole band. The two lowest dipole bands are given by the $TM_{11}$ and the $TE_{11}$ modes. The suitable model for each cell $m$, $m = 1, \ldots, N$, is then given by an equivalent circuit consisting of an inductance $L_{1,m}$ and a capacitance $C_{1,m}$ in parallel, representing the TM-like modes, and an equivalent circuit consisting of an inductance $L_{2,m}$ and a capacitance $C_{2,m}$ in series, representing the TE-like modes. Each circuit of each band is coupled with its nearest neighbours. Furthermore, each circuit $m$ of the first band is coupled with the circuits $(m-1)$ and $(m+1)$ of the second band and vice versa. This is illustrated in Fig. 5.22.
Figure 5.22. Double-banded coupled oscillator model (COM) of Dohlus for the computation of dipole modes in the SBLC structure.
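The coupling coefficients for the left and right neighbours inside the band are denoted by $K_{11,m\pm\frac12}$ and $K_{22,m\pm\frac12}$ for the first resp. second band. $K_{12,m\pm\frac12}$ and $K_{21,m\pm\frac12}$ describe the coupling between the bands. The currents inside the circuits are denoted by $I_{1,m}$ and $I_{2,m}$. Finally, with the further terms, we obtain

\[
\underline{i}_m = \begin{pmatrix} I_{1,m} \\ I_{2,m} \end{pmatrix}, \quad
\underline{K}_{m\pm\frac12} = \begin{pmatrix} K_{11,m\pm\frac12} & K_{12,m\pm\frac12} \\ K_{21,m\pm\frac12} & K_{22,m\pm\frac12} \end{pmatrix}, \quad
\underline{L}_m = \begin{pmatrix} L_{1,m} & 0 \\ 0 & L_{2,m} \end{pmatrix}, \quad
\underline{C}_m = \begin{pmatrix} C_{1,m} & 0 \\ 0 & C_{2,m} \end{pmatrix},
\tag{5.6}
\]

so a linear eigenvalue problem follows for the coupled double-band model:

\[
-\underline{K}_{m-\frac12}\,\underline{i}_{m-1} + \left( \underline{K}_{m-\frac12} + \underline{L}_m + \underline{K}_{m+\frac12} \right) \underline{i}_m - \underline{K}_{m+\frac12}\,\underline{i}_{m+1} = \frac{1}{\omega^2}\, \underline{C}_m^{-1}\, \underline{i}_m.
\tag{5.7}
\]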
The double-banded lumped circuit model requires the solution of a general linear eigenvalue problem $A i = \lambda B i$ with eigenvectors $i^T = (I_1, I_2)_m$. Details of the further steps are described in [80]; key words are a further variable transformation and the matching of the dispersion curves.
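The structure of the eigenvalue problem (5.7) can be made concrete with a small sketch. It assembles the block-tridiagonal matrix for a chain of N cells and solves $A i = \lambda C^{-1} i$ with $\lambda = 1/\omega^2$ by symmetrizing with $C^{1/2}$ and applying a plain Jacobi iteration. This is not the procedure of [80]: the function names, the symmetric coupling blocks ($K_{12} = K_{21}$), the termination with vanishing currents outside the chain, and the normalized parameter values are all assumptions made for illustration.

```python
import math


def jacobi_eigenvalues(a, sweeps=60, tol=1e-24):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(a)
    a = [row[:] for row in a]
    for _ in range(sweeps):
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.hypot(t, 1.0)
                s = t * c
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def com_angular_frequencies(L, C, K):
    """Resonant angular frequencies of the double-band chain described by (5.7).

    L, C: per-cell pairs (L1, L2) and (C1, C2).
    K: N+1 symmetric 2x2 blocks; K[m] couples cell m to its left neighbour,
       K[m+1] to its right neighbour (currents outside the chain are zero).
    """
    n_cells = len(L)
    size = 2 * n_cells
    A = [[0.0] * size for _ in range(size)]
    for m in range(n_cells):
        for r in range(2):
            for c in range(2):
                A[2 * m + r][2 * m + c] += K[m][r][c] + K[m + 1][r][c]
                if m + 1 < n_cells:
                    A[2 * m + r][2 * (m + 1) + c] -= K[m + 1][r][c]
                    A[2 * (m + 1) + r][2 * m + c] -= K[m + 1][r][c]
            A[2 * m + r][2 * m + r] += L[m][r]
    # A i = lam C^{-1} i  <=>  (C^{1/2} A C^{1/2}) y = lam y  with  lam = 1 / omega^2
    root_c = [math.sqrt(C[m][r]) for m in range(n_cells) for r in range(2)]
    S = [[root_c[i] * A[i][j] * root_c[j] for j in range(size)] for i in range(size)]
    return sorted(1.0 / math.sqrt(lam) for lam in jacobi_eigenvalues(S))


# Hypothetical uniform 6-cell chain in normalized units (not SBLC data).
N = 6
K = [[[0.10, 0.02], [0.02, 0.05]] for _ in range(N + 1)]
omegas = com_angular_frequencies([(1.0, 0.5)] * N, [(1.0, 1.5)] * N, K)
print(len(omegas), omegas[0], omegas[-1])
```

For a uniform chain without inter-band coupling, the two bands decouple and the result reduces to the familiar cosine dispersion of a single chain of coupled resonators, which gives a simple check of the assembly.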
Results. In the following, some results of the calculations performed by Dohlus with the double-banded coupled oscillator model are described and compared with ORTHO results. For the COM generation, the following parameters are needed for each cell: the frequencies of the 0-, π/2- and π-mode of the first dipole band, those of the 0- and π-mode of the second dipole band, and finally the "mode-to-voltage" coupling coefficients (compare [80]). For this purpose, the periodic solutions of ten of the "original" SBLC cups (with roundings) with the group velocities $v_g/c_0 = 0.013, 0.016, \ldots, 0.042$ were computed by MAFIA for the $TM_0$ 2π/3-mode. The values for the other cells were interpolated or extrapolated from these data. The quasi-constant gradient structure with 30 landings [279] was analyzed with COM in order to compare ORTHO and COM directly. There were, however, two differences in this comparison, whose effects have to be assessed as follows:

(i) ORTHO used a rectangular cross-section, while COM computed the rounded SBLC geometry. In geometrical studies [160] with URMEL-T for an 18-cell structure, in one case with the rounded SBLC shape and in the other with sharp edges,¹⁵ the higher loss parameters lay between ≈ 2000 V/(pC m²) and ≈ 6300 V/(pC m²) for the rounded shape and between ≈ 2000 V/(pC m²) and ≈ 8100 V/(pC m²) for the angular shape. This corresponds to an increase of the maximal values by ≈ 29% for the change from "round" to "angular" (cf. [160], Fig. 5.11). Hence an over-estimation of the loss parameter of up to 30% has to be expected for the simulation and measurement of the angular shape as a model for the original rounded structure. The comparative studies clearly indicated the advantage of rounded irises.

(ii) An infinitely long tube is assumed at the input and output ends of the 180-cell structure in the simulation with ORTHO,¹⁶ while COM uses a perfect magnetic boundary condition as the longitudinal termination.
Since the modes of interest are trapped inside the structure, it is reasonable to assume that the effect of the boundary conditions can be neglected for these modes. In any case, it should be noted that the boundary conditions in ORTHO as well as in COM differ from the actual conditions, where the input cell is connected in the radial direction with a waveguide and the last cell (output cell) is coated with an absorbing material. Altogether, 140 strongly interacting modes were found by COM, viz. the modes with $k' > 10^3$ V/(pC m²). Figure 5.23 shows both curves obtained by ORTHO and COM for the loss parameters of this structure. COM shows the same strong oscillations as ORTHO, thus clearly confirming that these oscillations reflect the six-cell constant impedance sections between the landings. The following theoretical plausibility reasoning could be confirmed: The first or the first two dipole modes of a constant impedance structure generally have a very high loss parameter; after that, the loss parameter curve descends steeply, as is shown in Fig. 5.18. The total sum of the loss parameters is nevertheless about the same for both types of structures (constant impedance and constant gradient) as long as the number of cells is the same and the dimensions are comparable. This follows from the conservation of energy. Therefore, averaging the loss parameter of the quasi-constant gradient structure with landings over the cells 1-6, 7-12, ... already approximates the curve for the constant gradient structure with linear tapering from cell to cell very well.

¹⁵ Here all roundings were replaced by edges inside.
¹⁶ In discretization methods, this is often referred to as the open boundary condition or waveguide boundary condition. In these methods, a series expansion is done at the waveguide boundary.

Figure 5.23. Loss parameter (plotted against the frequency in GHz): (a) ORTHO results for the quasi-constant gradient structure with 30 landings and angular shape; (b) COM results for the quasi-constant gradient structure with 30 landings and rounded shape of the cups; (c) COM results for the constant gradient structure corresponding to the central values of (a) resp. (b); (d) COM results for the SBLC structure. The graphs are courtesy of Dohlus, DESY.

In summary, the comparisons with the double-banded coupled oscillator model (COM) showed the following:

- A very good qualitative agreement was found with ORTHO, while only comparisons with the unsuitable single-band model [82] as well as estimations with the double-banded model by Bane [19], [18] were possible earlier.
- Also, the oscillations of the loss parameter curve could be verified with the COM model. Thus, the theoretical explanation that the six-cell constant impedance sections between the 30 landings are reflected in the curve was confirmed. Locally, each curve resembles a curve for a constant impedance structure. Averaging yields a very good approximation for the corresponding curve of a linearly tapered constant gradient structure.
- It became obvious that the width and height of the maximum of the loss parameter curve at the beginning of the dipole band depend on the degree of tapering: the stronger the decrease in the radii of the cells and irises, the flatter and wider the maximum of the loss parameter.
- Yet, the height of the maxima also depends on any shape approximations that may have been made: in comparison with the angular shape, the rounded shape showed loss parameters about 30% smaller.
- Absolutely essential is the fact that an unambiguous confirmation of the appearance of trapped modes could be given. Even for the actual design of the SBLC structure with weaker tapering, a considerable portion of the higher order modes is still found to be trapped inside the structure. Thus, the phenomenon of trapped modes can no longer be questioned. The necessity of a special damping strategy with more than just one HOM damper, or of a completely different damping strategy, was confirmed.
5.4.5 Comparison with Measurements for the LINAC II Structure at DESY

HOM measurements other than the 36-cell experiment took place at different accelerator laboratories. A method for broad band impedance measurements was studied at LAL [169]. Another short structure of 21 cells was designed and measured at KEK [322], [128] (cf. also [135]). At DESY, measurements were carried out with different traveling wave tubes:

(i) An "old" S-band structure of the injector linac LINAC II built in 1972, which had to be replaced because of signs of wear, was measured [154]. This quasi-constant gradient structure is 5.2 m long and has landings every 9-13 cells. Thus, it is quite comparable with the 6 m long structure computed by ORTHO (see above). The group velocity $v_g$ varies between 3.5% and 1.2% in the "old" LINAC II structure. The dipole modes were measured with the resonant bead pull measurement [170], which is briefly described in subsection 5.5.4. This structure is perfectly suited for measurements, since each of the 156 cells has tuning holes through which antennas can be mounted in the cells. All measurements were carried out in the "matching cells" (landings) between two constant impedance sections. Trapped modes and also the π-mode-like end of these modes could be uniquely identified. The main difficulty is the determination of the cell-to-cell phase advance from the measured field distribution. This difficulty is caused by the strong overlap of the modes.
Phase advance per cell and shunt impedance could be determined with an accuracy of 10-20%. These measurements were the first indication that the qualitative understanding of the field patterns of higher order modes that was achieved by simulations is in good agreement with measurements.

(ii) The "new" LINAC II structure, which was built to replace the old quasi-constant gradient structure, is a linearly tapered constant gradient structure. COM simulations for this structure are described in [80]. This structure was also measured [217].

(iii) Measurements were also carried out for the first 30 cells of the actual SBLC structure, two of which were built for the SBLC Test Facility [149]. The measurement was done for the 30 cells with one hybrid coupler attached.
5.5 36-Cell Experiment on Higher Order Modes

Since the phenomenon of strongly interacting dipole modes that are completely trapped in the inner part of the accelerating structure had never been studied experimentally before and had hardly been studied theoretically, a dedicated test structure was developed for a detailed study of trapped higher order modes in aperiodic iris-loaded waveguides. In particular, this structure was also to serve for the validation of the program ORTHO and thus of the results for the investigated 180-cell structure. The following characteristics were required of the test structure:

(1) The test structure should be "simple" to measure. Clustered modes cause difficulties from the measurement point of view, since they may overlap. For this reason, the structure should have only a small number of cells and thus a larger distance between the modes of a pass band. Figure 5.24 shows the impedance curve for a frequency interval from 4.14 GHz to 4.18 GHz assuming a quality factor of 10,000 for the dipole modes of the 180-cell resp. 36-cell structure. These curves result from the overlap of the single resonance curves.

(2) Manufacture of the test structure should be simple and thus inexpensive.

(3) It should be computable by different numerical methods without any geometric approximation. The Finite Integration Technique (FIT), mode matching, and equivalent circuits were chosen as numerical methods. In detail, the two programs MAFIA and URMEL-T, both based on FIT, the program ORTHO described above, and the double-banded model COM [75] were used for the design and for numerical comparisons. A discretization in the (r, z)-plane is sufficient because of the cylindrical symmetry. MAFIA uses a Cartesian FIT grid, while URMEL-T uses a triangular FIT grid. ORTHO and COM were described before. Since all roundings are replaced by corresponding rectangular edges in the mode matching in order to keep storage and computational effort within reasonable bounds, the test structure should have only straight and angular boundaries, thus avoiding geometrical errors.
Figure 5.24. Mode distribution in a structure with 180 cells and in one with 36 cells for a uniform quality factor Q = 10,000 (horizontal axis: frequency from 4140 to 4180 MHz). The modes of the 180-cell structure overlap so strongly that their separation in a measurement is extremely difficult.
(4) The most important characteristic which the test structure should possess was the existence of trapped dipole modes. The aim was to give a firm statement about the higher order modes and the programs.
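The argument of requirement (1), that a smaller number of cells separates the resonance curves, can be checked with a rough numerical sketch. It superposes single resonance curves of the form $1/(1 + x^2)$ with $x = Q\,(f/f_0 - f_0/f)$ and compares the curve midway between two neighbouring modes with the curve on a mode. The equidistant mode spacings of 0.5 MHz and 2.5 MHz are hypothetical values (their ratio of five merely mirrors 180 versus 36 cells), and the function names are illustrative.

```python
def resonance(f, f0, q):
    """Normalized real part of a single resonance: 1 / (1 + x^2), x = Q (f/f0 - f0/f)."""
    x = q * (f / f0 - f0 / f)
    return 1.0 / (1.0 + x * x)


def impedance_curve(f, mode_freqs, q):
    """Superposition of the single resonance curves at frequency f."""
    return sum(resonance(f, f0, q) for f0 in mode_freqs)


def dip_to_peak(spacing, centre=4.16e9, q=1.0e4, n_modes=9):
    """Ratio of the curve midway between two neighbouring modes to the curve on a mode."""
    half = n_modes // 2
    modes = [centre + (j - half) * spacing for j in range(n_modes)]
    peak = impedance_curve(centre, modes, q)
    dip = impedance_curve(centre + 0.5 * spacing, modes, q)
    return dip / peak


# Hypothetical equidistant spacings; Q = 10,000 as assumed for Figure 5.24.
print("bandwidth f/Q in MHz:", 4.16e9 / 1.0e4 / 1e6)
print("dense  spacing 0.5 MHz:", dip_to_peak(0.5e6))
print("sparse spacing 2.5 MHz:", dip_to_peak(2.5e6))
```

A ratio close to one means that neighbouring modes merge into a single broad curve, whereas a small ratio indicates clearly separated peaks. With a bandwidth f/Q of roughly 0.4 MHz, the sparse case is well resolved and the dense case is not.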
5.5.1 Design

Thus, discretization methods were deliberately used for the design of the test structure. In detail, there were two programs based on the Finite Integration Technique: the two-dimensional eigenmode solver of MAFIA [303], which uses (r, φ, z)-coordinates and discretizes the (r, z)-plane with a regular Cartesian grid, and the program URMEL-T [286], which also uses (r, φ, z)-coordinates but discretizes the (r, z)-plane with an irregular triangular grid. Both programs are sufficiently reliable, as shown by more than ten (resp. twenty) years of application. For two reasons, the number of cells had to be reduced drastically compared with the SBLC structure: (i) The structure should also be computable with grid-oriented methods as in MAFIA and URMEL-T. (ii) It should allow a good mode separation in the measurement (cf. Fig. 5.24). Since the Cartesian grid in MAFIA hardly allows small radial deviations, the cell and iris radii either had to be kept constant or each had to be changed considerably. At the same time, it had to be taken into account that the taper is necessary for the appearance of trapped modes. In order to avoid any geometrical approximation error, an "angular" structure was chosen, i.e., neither irises nor cells have any roundings. Two different designs, with 36 cells each, were compared with each other [160] using ORTHO, MAFIA and URMEL-T: (i) The first design was an angular approximation of every fifth cell of the SBLC structure (original design). (ii) The second design [295], which was finally chosen for the test structure, had a constant outer radius and a strong radial taper of the irises. Furthermore, the irises of this design are twice as thick as those of the SBLC structure. Figure 5.25 shows typical voltage distributions. Such voltage diagrams were each computed with MAFIA, URMEL-T, and ORTHO and showed a good qualitative agreement [160].
The chosen design showed effects of the SBLC and NLC structure; its simulation in particular showed many clearly trapped dipole modes. The main characteristic parameters of the test structure are:

- It has 36 cells and a total length of 1.30 m.
- The iris radii vary linearly between 20 mm and 10 mm.
- The iris thickness is 10 mm.
- The cell radius is 40 mm.
Figure 5.25. Voltage amplitudes for some modes (panels: modes 16 to 30) of the 36-cell test structure calculated with MAFIA. The voltage of each cell was integrated at r = 5 mm and is plotted against the cell number. Graph from the graduation paper of Krietenstein [160].
Some specific very precise computations with a refined mesh in MAFIA as well as some convergence studies with MAFIA, URMEL-T, and ORTHO were carried out [204] before the structure was manufactured. The 36-cell structure was manufactured from standard OFHC copper (Oxygen Free High-purity Copper) and measured at the microwave laboratory of the Institute of Applied Physics of the Johann Wolfgang Goethe University in Frankfurt. Calculations and measurements were carried out for the first and third dipole band. Results of the measured field patterns, resonant frequencies, and loss parameters are described in subsection 5.5.4. There, they are also compared with numerical simulation results (see also [161]). Recently, some studies on the mode reaction in HOM dampers and lossy sheets were carried out [137].
Figure 5.26. Cut-off frequency diagram for different S-band structures: a) 180-cell quasi-constant gradient structure with angular cell geometry and radial taper similar to the original SBLC design; b) original SBLC design; c) actual SBLC design; d) 36-cell structure. The diagrams are courtesy of Dohlus, DESY.
5.5.2 Numerical Results for the First Dipole Band

Some comparisons were carried out for the 36-cell test structure with the four different numerical methods MAFIA, URMEL-T, ORTHO, and the lumped circuit model COM.

Resonant frequencies and field pattern of the dipole modes. The comparisons with MAFIA, URMEL-T, ORTHO, and COM showed a good agreement of resonant frequencies and field patterns for the dipole modes of the first pass band. The latter is most important, since a clear appearance of trapped modes inside the structure was found for several modes. Figure 5.28 displays a vector representation of the electric field E of a typical trapped mode. The first four and last 12 cells of the 36-cell structure are practically field free for this mode. The field pattern is restricted to the cells 5-23. In cells 5-15 and 22-23, the field already decays strongly. Therefore, the most interesting part with cells 16-21 was zoomed in for the subsequent figures. Figure 5.29 shows the electric field E of the 15th dipole mode, the distribution of the stored energy $W_s$ of the mode and the wall losses $P_v$ in these cells. This mode shows the typical field pattern of all trapped modes, as is also elucidated by Fig. 5.33 on the comparison with the measurement. Each field pattern consists of a π-mode-like front part and a 0-mode-like end. With increasing frequency, this pattern travels from the input to the output end of the structure (from left to right in the figures). This behaviour was also observed in the ORTHO simulations for the 180-cell SBLC structure.

Figure 5.27. Loss parameter of the 36-cell structure as a function of the frequency (MAFIA results).
Figure 5.28. Vector representation of the electric field of the 15th dipole mode with frequency f ≈ 4.175 GHz from the first pass band of the 36-cell structure. Obviously, this mode is trapped in the inner part of the structure.
Figure 5.29. The upper picture shows the electric field E of the 15th dipole mode (f = 4.175 GHz) in cells 16 to 21 (compare Fig. 5.28); the distribution of the energy $W_s$ stored in the mode is shown in the middle for the same cells; the lower picture displays the wall losses $P_v$.

Loss Parameter. A very similar form of the functional relation to the resonant frequencies could be found for the loss parameter with all methods. Figure 5.27 shows the loss parameter as a function of frequency. This curve was computed with MAFIA. It agrees very well with those found by ORTHO and COM. The curve obtained by URMEL-T, however, seemed very noisy. Therefore, some convergence studies were carried out, but they could not finally clarify all details.¹⁷ The following may be asserted: The loss parameter is very sensitive to every error in the field computation. This extreme sensitivity becomes more and more important with an increasing number of cells [204]. It is of low importance as long as the number of cells is smaller than ten (in the specific case studied). Already with 36 cells, the errors can add up enough to distort the loss parameter results considerably, while there is still an excellent agreement of the resonant frequencies and the field pattern along the structure.

Qualitatively, a substantial difference in the loss parameter curves of the 36-cell structure and the SBLC structure has to be stressed: For the lowest dipole modes, the 36-cell curve starts with low values, in contrast to very high values in the SBLC curve. Then a steep increase follows, and the values stay on more or less the same level for many modes, with the exception of only one low loss parameter belonging to the 13th mode. Finally, the curve decreases to negligible values. The wide and flat maximum with $k' \approx 4200$ V/(pC m²) belongs to the modes 4-12 and 14-22. All of these modes are trapped modes. Mode 13 with f = 4.15 GHz can be regarded as the "beam pipe mode", i.e., there exists only some field close to the beam pipe. Therefore, this mode can hardly interact with the beam. All four numerical methods found this mode in good agreement (cf. subsection 5.5.4).

5.5.3 Measurement Methods

A well-established method for the measurement of field distributions and secondary quantities, e.g., the quality factor or the shunt impedance, is the bead pull measurement. With this, the originally homogeneous field in a structure is perturbed by a bead made of dielectric material. Two different versions of this measurement method are to be distinguished: the resonant bead pull measurement and the non-resonant bead pull measurement.
Both shall be briefly described in the following. A detailed description can be found, e.g., in [136] or [162].

Resonant Bead Pull Measurement. The bead pull measurement method exploits the following fact: The electric field in a resonator is perturbed by the insertion of a dielectric body. The resulting shift of the resonant frequency is described by the resonant bead formula

\[
\frac{\omega^2 - \omega_0^2}{\omega^2} = \frac{-\int_{\Delta V} P \cdot E_0 \, dV - \int_{\Delta V} M \cdot H_0 \, dV}{2 W_0}
\]

with the polarization $P = (\varepsilon - \varepsilon_0) E$, the magnetization $M = (\mu - \mu_0) H$, and the total energy $W_0$ of the resonator. The polarization expresses the field difference between the original field and the field in the bead.

¹⁷ Further studies and improvements of the discretization method on which URMEL-T is based are subject of ongoing research [205].
It has to be taken into account that the magnetization inside the dielectric bead is of no relevance, since $H = H_0$ holds. Therefore, the term $\int_{\Delta V} M \cdot H_0 \, dV$ can be neglected in the resonant bead formula, yielding

\[
E_0^2 = \frac{2 Q_0 P_v}{\alpha\, \omega_0} \cdot \frac{\omega_0^2 - \omega^2}{\omega^2}.
\]
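As a minimal sketch of how the resonant bead formula without the magnetic term is evaluated, the following code converts a measured frequency shift into $E_0^2$ and, for checking, also provides the inverse relation. It assumes that $\omega_0$ denotes the resonant frequency without bead, $\omega$ the one with the bead inserted, and $\alpha$ the form factor of the bead; the function names and all numerical values are hypothetical and not measured data.

```python
import math


def field_squared(q0, p_v, alpha, f0, f):
    """E0^2 at the bead position from the resonant bead formula without the magnetic term.

    q0: quality factor, p_v: wall losses, alpha: form factor of the bead,
    f0: resonant frequency without bead, f: resonant frequency with bead (both in Hz).
    """
    w0 = 2.0 * math.pi * f0
    w = 2.0 * math.pi * f
    return 2.0 * q0 * p_v / (alpha * w0) * (w0 ** 2 - w ** 2) / w ** 2


def perturbed_frequency(q0, p_v, alpha, f0, e0_squared):
    """Inverse relation: resonant frequency with the bead inserted for a given E0^2."""
    w0 = 2.0 * math.pi * f0
    return f0 / math.sqrt(1.0 + alpha * e0_squared * w0 / (2.0 * q0 * p_v))


# Hypothetical numbers in consistent units, chosen only to exercise the formulas.
f_bead = perturbed_frequency(1.0e4, 1.0, 1.0e-18, 4.175e9, 1.0e6)
print(4.175e9 - f_bead)                                    # downward shift in Hz
print(field_squared(1.0e4, 1.0, 1.0e-18, 4.175e9, f_bead))  # recovers E0^2
```

Because the shift enters as a difference of two nearly equal squared frequencies, the relative accuracy of $E_0^2$ is limited by how precisely the two resonant frequencies are known.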
A sharp resonance is necessary in order to reach high accuracy with this method. Consequently, frequency shifts for small beads are very well measurable. For $E^2(z)$, the following quantities are measured with a dielectric bead: the quality factor $Q_0$ of the resonator, the unperturbed resonant frequency $\omega_0$, the perturbed resonant frequency $\omega$, and the wall losses $P_v$.

Non-Resonant Bead Pull Measurement. The non-resonant bead pull measurement exploits the fact that the transmission changes when the bead is pulled through. The evaluation is based on a lumped circuit for the resonance, the Slater formula [170], and the fact that $E_z$ and $E_\perp$ can be separated by using two different beads.
Figure 5.30. Lumped circuit for a resonator.

Figure 5.30 shows the lumped circuit for the resonance. The impedance in the shown reference plane is given by

\[
Z(\omega) = \frac{(\omega M)^2 / R}{1 + i Q_0 \left( \dfrac{\omega^2 - \omega_0^2}{\omega\, \omega_0} \right)}.
\]

Here $Q_0 = \omega_0 L / R$ and the well-known relations between $L$ and $C$ are exploited. Introducing a coupling factor, according to

\[
k(\omega) = \frac{(\omega M)^2}{R Z_0},
\]

and using Slater's formula

\[
\Omega = Q_0 \left( \frac{\omega}{\omega_0} - \frac{\omega_0}{\omega} \right),
\]

one obtains the formula (5.8)
for the transmission. Equation (5.8) makes it possible to check the validity of the assumptions made and thus to obtain information on the accuracy of the measurement. For this purpose, only the real part of $\Delta S_{21}$ has to be checked. It should not differ from the magnitude of $\Delta S_{21}$ by more than about 5%. The longitudinal and the transverse components of the electric field are of interest in the measurements. If some isotropic dielectric material of specific form is taken as the bead, e.g., a rotational ellipsoid, the form factor $\alpha$ can be given as the sum of independent quantities. The factor $\alpha \|E_0\|^2$ on the right hand side of (5.8) can then be written as
n
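Therefore, $E_z$ and $E_\perp$ can be determined separately with such beads. Only two beads of different shape (e.g., a needle and a disc) are necessary; their measured transmission changes $\Delta S_{21}^{(1)}$ and $\Delta S_{21}^{(2)}$ and their form factors, such as $\alpha_z^{(1)}$ and $\alpha_z^{(2)}$, enter the resulting system of equations (5.9).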
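The separation of the two field components from two bead measurements amounts to solving a 2×2 linear system. The sketch below assumes, purely for illustration, that each measured signal is proportional to $\alpha_z E_z^2 + \alpha_\perp E_\perp^2$ with bead-specific form factors and that the common proportionality constant has already been divided out; this model, the function name and the numbers are assumptions and do not reproduce the exact form of (5.9).

```python
def separate_components(ds_needle, ds_disc, alpha_needle, alpha_disc):
    """Solve ds_j = a_z(j) * Ez2 + a_perp(j) * Eperp2 (j = needle, disc) for (Ez2, Eperp2).

    alpha_needle, alpha_disc: pairs (a_z, a_perp) of calibrated form factors.
    """
    az1, ap1 = alpha_needle
    az2, ap2 = alpha_disc
    det = az1 * ap2 - ap1 * az2
    if det == 0.0:
        raise ValueError("beads with proportional form factors cannot separate the components")
    ez2 = (ds_needle * ap2 - ap1 * ds_disc) / det       # Cramer's rule, first unknown
    eperp2 = (az1 * ds_disc - az2 * ds_needle) / det    # Cramer's rule, second unknown
    return ez2, eperp2


# Hypothetical form factors: the needle responds mainly to Ez, the disc mainly to E_perp.
needle, disc = (5.0, 1.0), (1.0, 4.0)
signal_needle = 5.0 * 2.0 + 1.0 * 3.0
signal_disc = 1.0 * 2.0 + 4.0 * 3.0
print(separate_components(signal_needle, signal_disc, needle, disc))
```

The determinant makes the requirement of two differently shaped beads explicit: the more the form factor ratios of the two beads differ, the better conditioned the separation becomes.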
After two measurements along the same path with different beads, (5.9) is solved for E. The non-resonant bead pull measurement can also be applied to very small quality factors $Q_0$, but this requires larger beads.

5.5.4 Bead Pull Measurements
The 36-cell structure was measured at the microwave laboratory of the Institute of Applied Physics at the Johann Wolfgang Goethe University in Frankfurt. In what follows, the results of the measured field patterns, resonant frequencies, and loss parameters are presented and compared with the results of the numerical simulation (cf. also [161]).

Test Arrangement. The structure was manufactured from standard OFHC copper (Oxygen Free High-purity Copper). It was not soldered but clamped together with the help of supporting poles. On both ends, so-called cut-off tubes, 10 cm long and of the same radius as the neighbouring iris, were attached. In these tubes, the fields of the studied dipole modes are aperiodically damped, since their resonant frequencies lie below the cut-off frequency of the tubes. Correct alignment of the structure is achieved by putting it on an optical bench. The field measurements are carried out with a modified non-resonant bead pull measurement according to Steele [243], which is briefly described in subsection 5.5.3 and in more detail in [162]. Figure 5.31 shows a picture of the measuring set-up at Frankfurt University. Two ceramic beads of different geometry were used for the measurements: a needle of diameter 0.6 mm and length 7.5 mm as well as a disc of diameter 4.7 mm and thickness 0.31 mm. Both beads were calibrated in a $TM_{010}$ pill-box with regard to their longitudinal and transverse form factors. The needle made of Al₂O₃ showed very stable values over the complete measuring period, while the disc showed significant fluctuations. It is assumed that this is caused by the hygroscopic properties of the bead. Field evaluations are performed at 801 discrete positions along the path on which the bead is pulled through the 36-cell structure. The data are processed with an HP8753c network analyzer. Each time, 15 measuring series were carried out.

Figure 5.31. Measuring set-up for the 36-cell experiment at the microwave laboratory of the Institute for Applied Physics of the Johann Wolfgang Goethe University in Frankfurt. The photo is courtesy of Hülsmann and Kurz, Univ. Frankfurt.
5.5.5 Comparison of Measurement and Simulation

The dipole modes no. 12, 13, 15, 16, and 19-22 have been measured. The bead was pulled through the structure 6 mm off the axis, since the electric field vanishes on the axis. Figures 5.32, 5.33, and 5.34 display the measured field pattern of the longitudinal electric field $E_z$ and the transverse electric field $E_\perp$ for several dipole modes. The absolute values of the longitudinal and transverse electric fields, normalized to the square root of the input power, are shown in Fig. 5.32 for the dipole modes no. 15 and 16. Table 5.6 gives the resonant frequencies, shunt impedances, and quality factors for these two dipole modes as measured and as computed by MAFIA. On average, the resonant frequencies are accurate to less than 0.01%: $\Delta f \le 0.01\%$. The differences between computed and measured shunt impedance lie in the range $8\% \le \Delta R_s \le 16\%$. Thus, these differences partly exceed the estimated accuracy limit of 15% by a small amount. The measured quality factors are about 20% smaller than the simulated values: $\Delta Q \le 20\%$. Yet, this is acceptable, since the structure is only clamped together, not soldered. No better quality factors can be expected under those circumstances.

Figure 5.32. Longitudinal and transverse electric field of dipole mode no. 15 (upper picture) and no. 16 (lower picture) at r = 6 mm off the axis. [Four panels; horizontal axis: bead position in mm, 0 to 1200.] The graphs are courtesy of Hülsmann and Kurz, Univ. Frankfurt.

Figures 5.33 and 5.34 show comparisons between measurement and simulation for several modes of the first dipole band. The normalized absolute value of the longitudinal electric field $E_z$ is displayed at r = 6 mm off the axis. The field patterns in Fig. 5.33 and 5.34 are typical for all trapped modes. The fields were computed with different numerical methods. Shown are comparisons of MAFIA, partly also of RESO/ORTHO, with the bead pull measurement. The agreement of the field patterns in both simulation methods with the measurement is good. The typical behaviour of trapped dipole modes
                  Mode no. 15             Mode no. 16
                  MAFIA      Measured     MAFIA      Measured
f / GHz           4.17099    4.17477      4.18557    4.18931
(R⊥/Q) / Ω        521        626          556        605
Q                 12379      9850         12471      10150

Table 5.6. Comparison between MAFIA simulation and bead pull measurement: resonant frequency, transverse shunt impedance (without the transit time factor), and quality factor of the dipole modes no. 15 and no. 16.
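The deviations between simulation and measurement can be recomputed from the entries of Table 5.6. The following is a minimal sketch; taking the simulated value as the reference of the relative deviation is an assumption, since the text does not state which value serves as reference, and the names `TABLE_5_6`, `relative_deviation`, and `deviations` are illustrative only.

```python
# Values copied from Table 5.6 (dipole modes no. 15 and no. 16).
TABLE_5_6 = {
    15: {
        "MAFIA": {"f_GHz": 4.17099, "R_over_Q": 521.0, "Q": 12379.0},
        "measured": {"f_GHz": 4.17477, "R_over_Q": 626.0, "Q": 9850.0},
    },
    16: {
        "MAFIA": {"f_GHz": 4.18557, "R_over_Q": 556.0, "Q": 12471.0},
        "measured": {"f_GHz": 4.18931, "R_over_Q": 605.0, "Q": 10150.0},
    },
}


def relative_deviation(measured, simulated):
    """Return |measured - simulated| / |simulated|.

    Using the simulated value as reference is an assumption.
    """
    return abs(measured - simulated) / abs(simulated)


def deviations(table):
    """Relative deviation of every tabulated quantity, per mode."""
    result = {}
    for mode, row in table.items():
        result[mode] = {
            key: relative_deviation(row["measured"][key], row["MAFIA"][key])
            for key in row["MAFIA"]
        }
    return result


if __name__ == "__main__":
    for mode, dev in deviations(TABLE_5_6).items():
        for key, value in dev.items():
            print(f"mode {mode:2d}  {key:9s}  {100.0 * value:7.3f} %")
```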
[Two panels: MODE 19 and MODE 20. Curves: MAFIA simulation, ORTHO simulation, measurement. Horizontal axis: z / m, 0 to 1.40; vertical axis: normalized field, 0 to 1.]

Figure 5.33. Normalized longitudinal electric field $|E_z|/\|E_z\|_\infty$ at radius r = 6 mm off the axis for the dipole modes no. 19 and no. 20 of the 36-cell structure.
[Two panels: MODE 21 and MODE 22. Curves: MAFIA simulation, measurement. Horizontal axis: z / m, 0 to 1.40; vertical axis: normalized field, 0 to 1.]

Figure 5.34. Normalized longitudinal electric field $|E_z|/\|E_z\|_\infty$ at the distance r = 6 mm from the axis for the dipole modes no. 21 and no. 22 of the 36-cell structure.
in a constant gradient structure also becomes obvious in Fig. 5.33 and 5.34: With increasing frequency, more or less the same field pattern travels from the input end with larger iris radii to the output end with the smallest iris radius. The modes are well apart, which guarantees a good mode separation. In Table 5.7, the computed resonant frequencies and normalized loss parameters of these modes are given as obtained by MAFIA. Their normalized loss parameters differ by at most 2.5%. The normalized loss parameters of modes no. 19-21 even agree to ≈ 0.2% (see also Fig. 5.27).

Dipole mode no. 12 with the resonant frequency $f_{12}$ = 4.145 GHz lies below the so-called beam pipe mode no. 13 with the resonant frequency $f_{13}$ = 4.150295 GHz. The beam pipe mode was found in simulation and measurement and is displayed in Fig. 5.35: It is the only mode having a significant electric field close to the beam pipe. Therefore, no interaction with the beam is possible, as is reflected in the small loss parameter of 866 V/(pC m²).

Mode   Resonant frequency             Normalized loss parameter
       MAFIA          Measurement     MAFIA
12     4.14504 GHz    -               4204 V/(pC m²)
15     4.17099 GHz    4.17477 GHz     4180 V/(pC m²)
16     4.18557 GHz    4.18931 GHz     4207 V/(pC m²)
19     4.23120 GHz    4.23057 GHz     4108 V/(pC m²)
20     4.24409 GHz    4.24390 GHz     4102 V/(pC m²)
21     4.25630 GHz    4.25631 GHz     4100 V/(pC m²)
22     4.26796 GHz    4.26827 GHz     4000 V/(pC m²)

Table 5.7. Resonant frequencies and normalized loss parameters of some trapped dipole modes of the 36-cell structure as computed by MAFIA.
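The statement that the loss parameters of modes no. 19-21 agree to about 0.2% can be checked directly against the tabulated values. A short sketch follows; the definition of the spread as (max - min) / max is an assumption, and the names `LOSS_PARAMETER` and `relative_spread` are illustrative.

```python
# Normalized loss parameters in V/(pC m^2), copied from Table 5.7.
LOSS_PARAMETER = {
    12: 4204.0, 15: 4180.0, 16: 4207.0,
    19: 4108.0, 20: 4102.0, 21: 4100.0, 22: 4000.0,
}


def relative_spread(values):
    """Return (max - min) / max of a sequence of positive numbers.

    This particular definition of "spread" is an assumption.
    """
    values = list(values)
    if not values or min(values) <= 0.0:
        raise ValueError("need a non-empty sequence of positive values")
    return (max(values) - min(values)) / max(values)


if __name__ == "__main__":
    subset = [LOSS_PARAMETER[m] for m in (19, 20, 21)]
    print(f"spread of modes 19-21: {100.0 * relative_spread(subset):.2f} %")
```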
Some measurements and simulations were also carried out for the third dipole band. It has to be noted that the measurements are significantly more difficult for these higher frequencies than for the first dipole band. Also, the simulations with MAFIA using the same parameters for the eigenvalue solver gave much less accurate solutions than the simulations for the first dipole band. The reason is that either twice as many modes have to be calculated or only a certain frequency window has to be computed. In the first case, 146 eigenvalues and eigenvectors have to be determined, where 36 eigenvalues are in each cluster. This creates difficulties in the orthogonalization of the eigenvectors in the eigenvalue solver SAP [271]. In the second case, a special method, also implemented in SAP, has to be used to find single eigenvalues. Therefore, further studies are necessary in order to get good results. This is the reason why no comparison of measurements with the results of simulation is given here for the third dipole band. Nevertheless, agreement in principle between the measured and simulated field distributions could already be observed.
[One panel: MODE 13. Curves: MAFIA simulation, measurement. Horizontal axis: z / m, 0 to 1.40; vertical axis: normalized field, 0 to 1.]

Figure 5.35. Normalized longitudinal electric field $|E_z|/\|E_z\|_\infty$ for dipole mode no. 13, the so-called beam pipe mode.
5.5.6 Measurement with Local Damping

For the SBLC test facility, two HOM damper cells are proposed for each accelerating structure. The HOM damper cells have been designed by numerical simulation and some have been measured in short constant impedance structures [78] (see also subsection 5.5.8). To study the behaviour of such a damper cell, first measurements were carried out with a damped cell in the 36-cell structure. Since the electrical properties of the clamped structure would change strongly if it were dismounted and mounted again, which would be necessary for a direct comparison, a design was found [179] that allows the damped and undamped cases to be compared without dismounting. For this purpose, a sheet of graphite was put on a piece of paper, which was then inserted in the 18th cell (compare Fig. 5.36). The thickness of the graphite sheet was 1.5 µm and the conductivity of the graphite was 1.25·10^5 1/(Ω m). It has to be stressed that this is only a rough model for test purposes, representing a cell with wall slots and attached waveguides (compare [78] and subsection 5.5.8). This procedure prevents the influence of modified electrical contacts on the quality factor Q, which would result from remounting the structure. The measurement setup remained unchanged.

Figure 5.36. 36-cell structure with damping ring in cell 18. The drawing is courtesy of Müller and Hülsmann, Univ. Frankfurt.

In this measurement, the focus was on the longitudinal electric field, since it indicates the effects of the damper cell most clearly. Therefore, a dielectric needle made of Al2O3 ($\varepsilon_r$ = 9.2) was used. This bead was 6 mm long and had a diameter of 1 mm. It was also calibrated in a TM010 pill-box. Its longitudinal form factor amounted to $\alpha_z$ = 9.34·10^-20 A s m²/V. It was 7 mm away from the axis in this measurement. Some modes from the first pass band with significant field strength at the damper position, i.e., in cell no. 18, were selected for this measurement, since a relevant damping effect can only be expected for modes of this kind. The cell-to-cell phase advance of the dipole modes varies between 0 and π in an aperiodic structure. The following modes were selected in order to study the influence of the damper position by measurement: For mode A, the π-mode-like end of the field pattern is located in the damper cell, for mode B it is the π/2-part, and for mode C it is the 0-mode-like end.

Two different positions of the damping material were studied: First, the paper with the graphite sheet was pressed onto the inner wall of cell no. 18 in order to get a symmetric damper and to reach only a relatively low damping effect. The damping effect was increased in another measurement by moving the paper several millimeters inward, thus placing it in a location with higher field strength. Figure 5.37 shows the measured field distribution for all three cases: (a) no damper, (b) weak damping, and (c) strong damping. Unexpectedly, even for mode C, which has only the 0-mode-like field pattern in the damper cell, good damping over the complete field pattern was found. Table 5.8 gives some data of the test result. The entries in Table 5.8 for the damping effect (damping factor) equal ratios of the measured field strengths, which are in turn proportional to the square of the electric field strengths. Figure 5.38 illustrates that the damping effect is evenly distributed over all phases in the modes, both for weak and for strong damping.

The computer simulation of the 36-cell structure with a very thin (about 1.5 µm) damping sheet in cell no. 18 is scarcely possible with discretization methods, since a sufficiently accurate discretization would imply enormous storage demands. Therefore, a perturbation approach was used, based on data from a field calculation with MAFIA assuming only loss-free material. This approach and the results obtained are described in [308].
Earlier measurements of a 12-cell constant impedance structure [78] showed a strong influence of the damping on the field pattern of the modes (in short, the mode geometry), so the same was expected for the aperiodic 36-cell structure. However, neither the weak nor the strong damping significantly changes the mode geometry, as Fig. 5.37 and 5.38 make obvious. Moreover, the damping effect is evenly distributed over all cell-to-cell phase advances,
[Three panels, one per mode. Curves: without damper, damper on surface, damper in cell. Cell #18 is marked. Horizontal axis: position / m, 0 to 1.4.]

Figure 5.37. Measured field patterns for the three modes A, B, and C in the three cases: (a) no damper, (b) weak damping, and (c) strong damping. The graphs are courtesy of Müller and Hülsmann, Univ. Frankfurt [179].
Mode  Measured variable             no damper   weak damping   strong damping
A     Frequency / GHz               4.14383     4.14383        4.14233
      Quality factor Q0             10400       8900           1500
      Damping effect                1.000       0.778          0.045
      Long. shunt impedance / MΩ    6.43        5.00           0.92
B     Frequency / GHz               4.17397     4.17401        4.17294
      Quality factor Q0             10500       9000           3200
      Damping effect                1.000       0.815          0.172
      Long. shunt impedance / MΩ    11.32       10.74          3.80
C     Frequency / GHz               4.32461     4.32470        4.32387
      Quality factor Q0             12100       10800          4400
      Damping effect                1.000       0.731          0.183
      Long. shunt impedance / MΩ    11.73       7.63           3.26

Table 5.8. Results of measurement with damping.
[Data series: modes A, B, and C with damper on surface; modes A, B, and C with damper in cell. Horizontal axis: position / m, 0 to 1.4; vertical axis: 0 to 1.]

Figure 5.38. Damping effect of weak and strong damping. The graph is courtesy of Müller and Hülsmann, Univ. Frankfurt [179].
i.e., the complete field pattern. Yet, no damping effect can be expected if a mode has no significant energy concentrated in the damper cell.

5.5.7 Comments and Outlook

Figure 5.39. Extract of the grid with smaller step size near iris edges in the 36-cell structure.
The interaction between experiment and simulation proved to be of great importance. The comparison led to improvements on both sides: In MAFIA, the discretization was made significantly finer near angular edges (cf. Fig. 5.39). Although a uniform grid, as used in the design of the structure, perfectly approximates all boundaries of the 36-cell structure, since they are parallel to the coordinate axes, the edge effects had considerable influence on the accuracy of the solutions, so that very good agreement with the measurement could not be reached with the uniform grid.

The measurement method was improved in several respects. The comparisons showed the following relation: The lower the sampling frequency, the more accurate the results. Furthermore, it became obvious that the accuracy is very sensitive to the measuring velocity: Because of the temperature drift, the measurement should not take too much time. The thickness of the walls in the 36-cell structure was already an important advantage with respect to the temperature drift.

Therefore, an important result is the following: Carrying out simulation and measurement simultaneously (after the first design simulations) made it possible to reach very good precision in the quantitative results.
In the future, the 36-cell experiment shall also be used for studies of the mode reaction in lossy cells. For the time being, two cells were built and completely coated with Kanthal so that their quality factors Q only amount to ≈ 200-300. This geometry will also be computed with MAFIA, which allows computations with lossy material. Measurements of a damper cell with HOM coupler are also proposed. Finally, the effects of grouping several damper cells of the simple model type described above shall be compared with the effects of distributed damper cells.

5.5.8 Suppression of Parasitic Modes

In order to suppress the parasitic higher order modes, a special characteristic of the constant gradient tubes has to be taken into account: The energy per cell varies strongly from cell to cell. The ratio between minimal and maximal energy is of the same order as $v_g/c_0$. Additionally, there exist the characteristic trapped modes, in particular in the first dipole band. Figure 5.40 provides a graphical representation [75] of the energy per cell for the dipole modes of the 36-cell structure.

In principle, two methods for higher order mode suppression can be distinguished: damping and detuning. Damping can be reached either by local losses, e.g., by placing HOM dampers in a few positions along the structure, or by damping of each cell.

Local Damping. Local damping is possible, for example, by attachment of a HOM coupler such as a waveguide with cut-off frequency significantly above the accelerating mode. Figures 5.41 and 5.42 display a picture of such a damper cell and a cross-sectional drawing with attached waveguide and load (very lossy material at the end of the waveguide, compare also with the temperature example in subsection 5.6) for the higher order modes. An advantage of the HOM dampers is their suitability for diagnostic purposes: HOM signals can be used for correct positioning of misaligned structures [198].
With local damping, one has to take into account that couplers have only a limited range. Furthermore, the couplers should be designed so that they are also applicable for beam diagnosis. The design of structures with only a few dampers has to take into account the local character of the modes (trapped modes) and is limited by the filling time. From the point of view of perturbation theory, the power absorption of a damper cell is proportional to the stored energy in this cell:

$$Q_{\text{total}} = Q_{\text{cell}} \cdot \frac{W_{\text{total}}}{W_{\text{cell}}} \qquad (5.10)$$

Therefore, only a part of the trapped modes can efficiently be damped with this kind of local damper cell.
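Relation (5.10) can be illustrated with a few lines of code. The sketch below only evaluates the formula; the function name `total_q` and the numerical values (a damper-cell quality factor of 50 and the energy fractions) are hypothetical and chosen for illustration.

```python
def total_q(q_cell, w_total, w_cell):
    """Evaluate (5.10): Q_total = Q_cell * W_total / W_cell.

    q_cell  -- quality factor of the damper cell alone
    w_total -- energy stored in the whole structure for the mode
    w_cell  -- energy stored in the damper cell for the mode
    """
    if w_cell <= 0.0:
        raise ValueError("the mode stores no energy in the damper cell")
    return q_cell * w_total / w_cell


if __name__ == "__main__":
    q_cell = 50.0  # hypothetical value
    # Energy spread evenly over 36 cells versus a mode that is trapped
    # away from the damper cell (0.1 % of its energy in the damper cell).
    for fraction in (1.0 / 36.0, 1.0e-3):
        print(f"W_cell/W_total = {fraction:.4f}:  "
              f"Q_total = {total_q(q_cell, 1.0, fraction):.0f}")
```

The smaller the energy fraction in the damper cell, the larger the resulting Q of the whole structure, which is why trapped modes located away from the damper are hardly affected.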
[Vertical axis: f / GHz, 4.0 to 4.4; horizontal axis: cell number, 1 to 36; grey scale from -4 to 0.]

Figure 5.40. Energy per cell for the first 36 dipole modes of the 36-cell structure. The dark areas hold the largest energies. The diagram is courtesy of Dohlus, DESY.
Figure 5.41. Damper cell with wall slots and two rectangular waveguides. The photo is courtesy of Hülsmann and Kurz, Univ. Frankfurt.

Figure 5.42. Diagram of a multicell structure with damper cell and attached waveguide with lossy load. The diagram is courtesy of Hülsmann and Kurz, Univ. Frankfurt.
Global Damping. This problem is avoided in a very natural way if all cells are damped. With the damping of every cell, the design becomes simpler. Some crucial points have to be considered while assembling the complete structure: Tuning and cooling have to be done without substantial cost increase. However, in many cases the complete structure becomes more complicated and also more expensive. Furthermore, the limit for the losses per cell of the accelerating mode decreases if all cells are damped. The so-called choke-mode cavity [235] is an example of a structure with damping of all cells. This cavity has no coupling of the accelerating mode with the absorber. It is a frequency-selecting stop band filter. It allows a perfect separation into the accelerating mode and the lossy load. Yet, the choke-mode cavity is operated in resonance. Therefore, a part of the energy is stored in the choke mode and the shunt impedance is reduced. Consequently, more energy W is needed to reach the same accelerating voltage. Additional losses in the second resonance reduce the quality factor Q.

Figure 5.43. Two damper cells with Kanthal-coated iris. The photo is courtesy of DESY.
A convenient method to damp all cells is to coat the irises with sheets of damping material [77]. Figure 5.43 shows a picture of two such cells, the so-called lossy cells. The irises are coated with Kanthal, since this material keeps its substantial damping properties even during the soldering process, in which a temperature of 820°C is reached. This method is applied in the SBLC test facility, where two HOM dampers per accelerating structure are used as beam position monitors.

Local as well as global damping has to be selective in the sense that it should hardly influence the shunt impedance of the accelerating mode, while the dangerous higher order modes should be effectively reduced. Two selection mechanisms have to be distinguished: frequency selection and "field pattern selection" [75].

Detuning of Parasitic Modes. The detuning of the dipole modes can be performed in two different ways: The modes are detuned within one single structure and/or from structure to structure. Constant gradient structures are per se detuned structures for which the frequency distribution of the first dipole band is nearly equidistant. The NLC project proposes a "damped-detuned" structure, while the SBLC project proposes structure-to-structure detuning. For this purpose, ten classes of accelerating structures are manufactured with mutually detuned dipole modes (about 1% frequency shift from structure to structure).

5.5.9 Design of the Damped SBLC Structure

As a consequence of the previously described numerical studies and of the general reasoning about the damping, the following damping strategy was accepted for the SBLC structure:

- two HOM couplers in cells 25 and 106,
- coating of all irises with some very lossy material.

This strategy reduces the Q-values of all modes in the first dipole band with high loss parameter. The degree of the Q-reduction depends on the type of coating and on the parameter of surface absorption. In [77], different combinations of these factors are studied. Simulations in [77] gave nearly constant Q-values in the frequency range 4.22 GHz to 4.44 GHz for the coating of all irises. Thus, it is possible to lower the mean Q-value for all strongly interacting dipole modes in the first dipole band to a value of 3109 with an increase of only 5% in the losses of the accelerating mode. Both HOM dampers are also used as so-called pick-ups (pick-up monitors) for the structure alignment.
5.5.10 Concluding Remarks about the Linear Collider Studies

The presented linear collider studies mainly dealt with the accelerating structures. Looking back, one can recognize that the accelerating mode in constant gradient tubes was fully understood, but the higher order modes (HOMs) had been studied neither theoretically nor experimentally to a sufficient extent. The structure discussed above was the accelerating structure of the SBLC. Nevertheless, the discussion is of general importance, since the problem of higher order modes has to be solved for all linear colliders. A detailed knowledge of the higher order modes is a prerequisite for any prognosis of emittance growth and thus is a central point in the design of linear colliders.

The presented numerical studies made the phenomenon of trapped modes obvious. The 36-cell experiment, which was subsequently carried out, confirmed this phenomenon. It showed that the interaction between experiment and simulation improves the reliability of any prognosis. These numerical and experimental studies led to a new damping concept [51]. In particular, the implementation of damping by iris coating with very lossy materials [77] yielded such a good suppression of multibunch instabilities in beam dynamics simulations with the previously proposed SBLC parameters that now the single bunch instabilities play the limiting role [51]. This prompted new parameter studies and a new scaling of the SBLC parameters [50] towards a lower bunch charge, so that the single bunch wake fields are also reduced. The bunch distance is reduced accordingly, so that the product of bunch charge and the number of bunches remains unchanged. The advantages of the smaller charges are relaxed tolerances and/or the possibility of a smaller vertical emittance. Some other parameters, e.g., the crossing angle, are also changed accordingly. The result of these parameter changes is a luminosity increase by 43%.
5.6 Coupled Temperature Problems in Accelerator Physics

The design of many technical components requires investigations of their electro- or magneto-thermal behaviour. The Finite Integration Technique allows consistent numerical computation of electromagnetic fields. To be able to simulate coupled thermal problems consistently, FIT was applied to the computation of stationary temperature problems, as described in subsection 2.3.2. This procedure guarantees the consistency of the coupled calculations. A material of prescribed temperature or a heat source of given density or emission can be chosen as a heat source for the computation. Inside the modular program package based on FIT, it is then possible, in particular, to compute completely the heating by wall losses of a mode and the heating by Joule heat caused by eddy currents. The modes are computed in time or frequency domain, the eddy currents are calculated by the time-harmonic solver, and the losses are determined in the post-processor. The stationary temperature computation is implemented in the static module of MAFIA. The discretization can be carried out on a two- or three-dimensional Cartesian or cylindrically symmetric grid.

Some examples of heat caused by dissipation of electric energy are treated in the sequel:

(i) inductive heating of some material, as it is used in inductive ovens by application of eddy currents;
(ii) heating by wall losses in traveling wave tubes and resonators used for the acceleration of elementary particles;
(iii) dissipative heating in ceramic rf-windows;
(iv) dissipative heating in lossy waveguide terminations for high power rf applications.

5.6.1 Inductive Soldering of a Traveling Wave Tube

Dissipative heating is profitably used in the construction of long traveling wave tubes for the SBLC linear collider described in subsection 5.2.3 and for the test facility of this study: The conversion of electric energy into thermal energy is used for the soldering process [132].
The location which has to be soldered is positioned close to the inductive coil. The driven coil induces eddy currents in the surrounding material, which heat the soldering material by Joule heat. It was investigated whether this procedure would be suitable for the soldering of the S-band structures for SBLC. The first simulations [133] supported a positive decision. Since the end of 1995, an inductive oven has been used at DESY. Figure 5.44 shows the inductive oven in a test setup with one inductor applied to solder vacuum chambers. Table 5.9 gives the main parameters of this inductive oven. The S-band tube for the SBLC project, which was already treated in detail, consists of 180 single so-called cups. Figure 5.8 shows a sketch of one cup and Fig. 5.9 a picture of three cups before soldering. Figure 5.45 shows two tested inductors for the soldering of SBLC cups.

Figure 5.44. Inductive oven with vacuum chamber and corresponding inductor. The photo is courtesy of Jagnow, DESY.

inductive oven with one     induction coil
water-cooled conductor      15 x 15 mm
distance to structure       5-10 mm
frequency                   4 kHz
power                       50-100 kW
efficiency                  ≈ 50 %
final temperature           852°C

Table 5.9. Main parameters of the inductive oven.

Figure 5.45. Inductor for the S-band cups. The photo is courtesy of DESY.

The idea to solder the cups by Joule heat was investigated [133] with the algorithm for the computation of stationary temperature fields, as described in subsection 2.3.2. Here, results of computations with the actual parameters and a better discretization are shown. This is a coupled problem in which an eddy current subproblem is to be solved first. The wall losses are then computed from the electromagnetic fields obtained in step 1. Finally, the wall losses can be used as excitation in the simulation of the stationary temperature problem. Thus, a time-harmonic field calculation with excitation and a stationary temperature calculation are coupled.

The two cups which have to be soldered were discretized on a cylindrically symmetric grid together with the surrounding induction loop. For symmetry reasons, it is sufficient to compute only a quarter of the structure. The solution domain was discretized with a 72 x 10 x 87 grid, i.e., with N = 62640 grid points. The complex symmetric matrix of the eddy current problem thus has dimension n = 3N = 187920. The real symmetric matrix of the temperature problem has dimension n = N = 62640.

Figure 5.46 shows the electric field and the magnetic flux in the wall of the cups for the eddy current problem. Figures 5.47 and 5.48 show a three-dimensional representation of the Joule heat that the induction coils deposit in the walls of the S-band cup. The resulting temperature distribution in the walls of the S-band cups which are to be soldered is also shown. The soldering wire is located in a groove at the temperature maximum. The surrounding area and the middle iris are also heated, but not as strongly as the wire. Figure 5.49 shows a vector representation of the heat flow in a cross-section of two soldered SBLC cups.

5.6.2 Temperature Distribution in Accelerating Structures
Wall losses lead to significant heating in the walls of the accelerator when it is in operation. A traveling wave tube or a cavity consists of material of finite but very high conductivity κ. Normal conducting structures are usually made of copper, superconducting structures of niobium. Despite the good conductivity, there are significant energy losses in the walls of a traveling wave tube or a resonator caused by wall currents (cf. (1.10) in subsection 1.2). For a good but not perfect conductor, the fields near the surface approximately behave as for a perfect conductor. For this reason, a perfect conductor may be assumed for the numerical calculation. The calculation of the wall losses is then performed a posteriori with the so-called power-loss method, which was described in subsection 5.2.2, equation (5.1).

Besides the damping explained in subsection 5.2.2, the energy losses result in a heating of the structure. Thus, the wall losses create a heat source. This is the reason why normal conducting structures are water-cooled; the water flows through cooling channels which are attached to the outer wall. Superconducting accelerating structures are put in a so-called cryostat, a tank with liquid helium at extremely low temperatures. For example, the TESLA structure is operated at 1.8 K.

In the following application, the resonant modes and the resulting wall losses were first computed with FIT in accordance with (5.1). Next, the wall losses were used as a heat source for the calculation of stationary temperature distributions in the cooled walls. Figure 5.50 shows the electric field of the 2π/3 accelerating mode in one cell of the 180-cell SBLC structure. (The SBLC linear collider project was described in more detail in subsection 5.2.1.) For this calculation, one cell of the constant gradient structure was chosen and the eigenvalue problem was solved with periodic boundary condition and 2π/3 phase advance per cell. This periodic approximation is absolutely valid for the temperature calculation. For symmetry reasons, it would be sufficient to compute only a quarter of the structure. For a better understanding of the figures, about 2/3 of the structure was computed. Eight cooling channels are attached to the outer shell of the cell to compensate for the heating of the structure due to wall losses (see footnote 18). Furthermore, a three-dimensional contour plot of the wall currents on the cell surface is displayed in Fig. 5.50. The areas with the highest wall losses are the bright areas. Figure 5.50 also shows the resulting temperature distribution as the third possible representation.
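The coupling described here, with wall losses acting as the heat source of a stationary temperature problem in a cooled wall, can be illustrated by a deliberately simplified one-dimensional sketch. It is not the three-dimensional FIT computation behind Fig. 5.50: it assumes a plane wall of thickness `length` with a uniform volumetric heat source, a fixed coolant temperature on one side, and an insulated surface on the other. The function name, the finite-difference discretization, and all numerical values (including the conductivity, a rough value for copper) are assumptions made for illustration only.

```python
def solve_stationary_heat_1d(n, length, conductivity, source, t_cool):
    """Solve -k T'' = q on [0, length] with T(0) = t_cool and T'(length) = 0.

    Uses n equal cells (nodes x_i = i*h, i = 0..n) and central differences.
    Returns the list of nodal temperatures T_0..T_n.
    """
    if n < 2:
        raise ValueError("need at least two cells")
    h = length / n
    # Tridiagonal system for the unknowns T_1..T_n:
    # lower diagonal a, main diagonal b, upper diagonal c, right-hand side d.
    a = [-1.0] * n
    b = [2.0] * n
    c = [-1.0] * n
    d = [source * h * h / conductivity] * n
    d[0] += t_cool          # known coolant temperature moved to the right-hand side
    a[-1] = -2.0            # insulated end: ghost node T_{n+1} = T_{n-1}

    # Thomas algorithm: forward elimination ...
    for i in range(1, n):
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    # ... and back substitution.
    t = [0.0] * n
    t[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        t[i] = (d[i] - c[i] * t[i + 1]) / b[i]
    return [t_cool] + t


if __name__ == "__main__":
    # Illustrative values only: 1 cm wall, k = 400 W/(m K), q = 1e7 W/m^3.
    temperatures = solve_stationary_heat_1d(50, 0.01, 400.0, 1.0e7, 300.0)
    print("coolant side:", temperatures[0], "K, far side:", temperatures[-1], "K")
```

For this model problem the exact solution is T(x) = T_cool + (q/k)(L x - x^2/2), so the hottest point lies at the surface farthest from the cooling, in qualitative agreement with the hot spots that appear away from the cooling channels.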
The influence of the cooling is clearly observable: near the cooling channels, the temperature is low as expected, while there are hot spots at larger distances from the cooling area. Figure 5.51 shows contour lines of the temperature and the temperature gradient for the upper half of a cell. The cut lies in the plane of a cooling channel.

18 The "new" LINAC II structure shown in Fig. 5.10 has eight cooling channels; only four shall be used in SBLC.

5.6.3 RF-Window

During the operation of a storage ring or linear collider, broken rf-windows in the vacuum system are a recurrent problem. The break is often caused by overheating. Therefore, a quantitative determination of the temperature distribution is of practical importance. The ceramic windows are often enlarged to decrease the heat load. As an example, consider an rf-window for a rectangular waveguide. The window itself is placed in a short cylindrical waveguide section. The TE01 mode with frequency 3 GHz was used as the waveguide mode. The electromagnetic field problem is solved by simulation in time domain. The loss angle of ceramics is taken into account in this calculation. In the numerical simulation, the total power flow results from a special waveguide boundary condition. The time-harmonic waveguide excitation is monitored until a steady state is reached. Then the field is stored and another quarter period is recorded in the time domain simulation. This process yields the real and imaginary parts of the solution in frequency domain. Figure 5.54 shows a vector representation of the electric field. The power density is determined a posteriori from the stationary electric field. In Fig. 5.55, the distribution of the energy density is displayed. This power density is then used as a heat source for the temperature computation. Figure 5.56 shows the final temperature distribution in the ceramic disc.

5.6.4 Waveguide with a Load
Coupled rf temperature problems also occur in high power rf-applications where rf-power is to be absorbed in lossy waveguide terminations. Such terminations are used, e.g., during the test phase of new power sources. Another application is accelerating structures for which resonant higher order modes have to be damped. The chosen concrete example was investigated in connection with the Stanford/Berkeley/Livermore "B-Factory" project. It is a very typical example. In the B-Factory, resonators are used for particle acceleration. As was described in detail for linear collider studies, the particles themselves induce Higher Order Modes (HOM), which have to be eliminated in order to guarantee beam stability. For this purpose, powerful antennas which extract the rf-energy are installed in each resonator. This energy is conducted to an rf-load nearby. Such a load consists of a lossy assembly in a rectangular waveguide. Figure 5.58 shows the typical dimensions of the load. Figure 5.57 displays the geometry of the waveguide. The arising heat is absorbed by cooling channels which are attached on the upper side. Figure 5.58 shows a vector representation of the electric field. The absorption by the load is clearly visible. Approximately in the middle of the load, the field is already nearly zero. The temperature distribution in the lossy load is displayed in Fig. 5.59.
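The quarter-period recording used for the rf-window in subsection 5.6.3 can be illustrated with a scalar stand-in for one field component. The sketch below assumes the time convention f(t) = Re(F exp(iωt)); with the opposite sign convention the imaginary part changes sign. The function names are hypothetical, and a real computation stores whole field arrays rather than a single number.

```python
import cmath
import math


def steady_state_field(t, phasor, omega):
    """Time-harmonic field component f(t) = Re(phasor * exp(i*omega*t))."""
    return (phasor * cmath.exp(1j * omega * t)).real


def phasor_from_quarter_period(sample_now, sample_quarter_later):
    """Recover the complex amplitude P = F*exp(i*omega*t0) from two samples.

    f(t0) = Re(P), and a quarter period later f(t0 + T/4) = Re(i*P) = -Im(P).
    """
    return complex(sample_now, -sample_quarter_later)


if __name__ == "__main__":
    frequency = 3.0e9                     # 3 GHz, as in the rf-window example
    omega = 2.0 * math.pi * frequency
    period = 1.0 / frequency
    true_phasor = cmath.rect(2.0, 0.7)    # arbitrary amplitude and phase
    t0 = 12.3 * period                    # some instant in the steady state
    recovered = phasor_from_quarter_period(
        steady_state_field(t0, true_phasor, omega),
        steady_state_field(t0 + period / 4.0, true_phasor, omega),
    )
    reference = true_phasor * cmath.exp(1j * omega * t0)
    print("deviation from reference phasor:", abs(recovered - reference))
```

Two stored time-domain snapshots are therefore enough to reconstruct both parts of the frequency-domain solution, provided the steady state has really been reached.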
5.7 Bibliographical Comments

This last section dealt with a special field of applications of the numerical methods described above. It was intended to be almost self-contained: starting with an introduction to the topic, focusing on a practical problem to be solved, applying semi-analytical as well as discretization methods for the electromagnetic field computation and, finally, stressing the importance of practical validation via measurements.
General Textbooks

Not too many textbooks exist on accelerator physics. Yet, there exist many proceedings of accelerator workshops, which often have textbook quality. Some recommendable books on the fascinating story of high-energy physics that are readable for non-experts are, e.g., those by Ne'eman and Kirsh [187] (German translation: [188]) and by Close, Marten, and Sutton [61] (German translation: [62]). A student's textbook on this topic is by Das and Ferbel [70] (German translation: [71]). Another interesting German book for non-scientists is by Fritsch [101]. Many relevant questions concerning accelerator physics can be answered with the help of classical electrodynamics, which is very well described in Jackson's book [143] (an older edition translated into German is [142]). A very comprehensive book on collective beam instabilities was written by Chao [52], another one on charged particle dynamics by Reiser [214] (see also references therein). Both were published in the new series "Series in Beam Physics and Accelerator Technology", which also includes an introductory book by Edwards and Syphers [86]. Other books often used by the author are [138] and [139] by Humphries. A German textbook was written by Wille [316]. A very good textbook on the foundations of electromagnetic slow wave systems, which also play an important role in many linear accelerators, is the book by Bevensee [30]. As for the proceedings of accelerator workshops, [218] presents a valuable course on accelerator optics (see also references therein), [53] on coherent instabilities, and [309] on wake fields and impedances.

Future Linear Colliders

Reports and design reports on the ongoing studies of future high energy linear colliders are the report of the "Technical Review Committee" [267] on all activities worldwide, [311] on SBLC, and [31] on TESLA and SBLC. Also, the proceedings of the "Next Generation Linear Colliders" workshops [242], [153], [140] give a good overview of the existing projects.
The physics made possible by these linear colliders is described in [327]. Some early publications on the many projects are: [314] for TESLA, [310] and [311] for SBLC, [258] and [257] for JLC, [194] for NLC, [16] for VLEPP, and [232] for CLIC.

Accelerating Structures

Many of the questions related to the design of a linear collider were first discussed in the 1960s when the Stanford Linear Accelerator was built: In April 1966, cumulative beam instabilities were observed for the first time ever at the SLAC two-mile accelerator. The observations, experiments, and calculations that followed are described in the famous "Blue Book" [184]. The fundamental question of the operational mode of accelerating structures for linear colliders is treated by Miller in [176]. The fundamental theorem of beam loading was presented by Wilson in [317]. Boussard [39] presents a good explanation of the loaded gradient [124] and reports about suppression of parasitic higher order modes. A basic relation in the transverse deflection of charged particles in rf fields, the Panofsky-Wenzel theorem [195], goes back to 1956. Another fundamental article about wake fields is by Bane, Wilson, and Weiland [20]. Measurement techniques exploiting the Panofsky-Wenzel theorem for wake fields are described, e.g., in [162], simulation techniques in [298], [301], [299], and [155]. Further publications concerning numerical simulations are [76], [72], [22], [307], [262]. Secondary effects such as structure heating are described, e.g., in [285]. Effects of transverse wake fields as well as chromatic effects can be compensated by the BNS damping [15] described by Balakin, Novokhatsky, and Smirnov (also cf. [234]). Some more recent publications on constant impedance, constant gradient, and detuned structures are, e.g., [293], [167], and [264]. Effects of and measures against parasitic modes in accelerating structures taken to optimize the beam dynamics are treated in great detail in [83] and [210].

Study of Higher Order Modes
The mode matching program ORTHO, based on the code RESO [246], and first results for the SBLC structure were described in [279]. The design of a single cell of this structure is one of the examples treated in [329]. Prior to the simulations with ORTHO, some simulations were carried out with a single band coupled circuit model [82]. A series of numerical studies on the mode matching technique and ORTHO can be found in the student theses [183], [87], [204], [239], and [318]. The design of the test structure and related results are described in [284], [160], [161], [163], and [308] (measurements at University Frankfurt under the guidance of Hülsmann). For this structure, simulations with MAFIA [303] and URMEL-T [286], [166] as well as with the coupled circuit model COM [80] were also carried out. One important point of these studies was the adequate choice of parameters in the eigenvalue solver SAP [270], [271] of URMEL-T. Later, similar experiments were also carried out at LAL by Losito [169], at KEK, SLAC, and University Frankfurt under the guidance of Rigo [322], with some linac structures at DESY [154], [217], and with the first 30 cells of the SBLC structure with hybrid coupler [149], both under the guidance of Holtkamp and Dohlus. Many of those measurements were based on the resonant bead pull measurement, according to Maier and Slater [170], or a modified non-resonant bead pull measurement, according to Steele [243]. Descriptions of this method can also be found in [136] or [162].

Strategies for dipole detuning and damping in the SBLC and corresponding measurements are described in [84], [82], [235], [78], [77], [51], [179], and [131]. They gave rise to a new scaling of the SBLC parameters, which was presented in [50]. Parts of the damping system can also be used for beam diagnostics [198]. Recent studies with the 36 cell structure on the reaction of modes to HOM dampers and lossy sheets are described in [137] and [308]. The NLC structure was studied first by means of a double-banded model by Bane [19]. [328] described the development and simulation results of a similar program developed for NLC studies.

Coupled Temperature Problems in Accelerators
The data for the examples were provided by the group of Holtkamp at DESY and by Weiland in private communication.
Figure 5.46. Solution of the eddy current problem: Electric field Re(E) and magnetic flux Re(B) in the wall of the cup.
Figure 5.47. Evaluation process: Three-dimensional representation of the distribution of Joule's energy that the inductive coils deposit in the walls of the S-band cups. One quarter of the cylindrically symmetric structure was computed.
Figure 5.48. Solution of the stationary temperature problem: Resulting temperature distribution in the walls of the S-band cups to be soldered. The soldering wire is located in a groove at the maximal temperature. The neighbouring areas and the middle iris are also heated but much less.
Figure 5.49. Solution of the stationary temperature problem: Vector representation of the heat flow in a cross-section of two of the SBLC cups to be soldered.
Figure 5.50. Eigenvalue problem, evaluation process, and stationary temperature problem: Vector representation of the electric field of the 2π/3 accelerating mode, the wall losses, and the resulting temperature distribution in a cooled cell of the 180-cell SBLC structure. The areas with highest wall losses are in the light regions.
Figure 5.51. The stationary temperature problem: Contour lines of temperature and the temperature gradient in a cross-section of the cell in the plane of a cooling channel.
Figure 5.52. The stationary temperature problem: Contour lines of temperature and the temperature gradient in a cross-section of two cooling channels.
Figure 5.53. The stationary temperature problem: The temperature gradient in a cross-section of a cell and an iris.
Figure 5.54. The time-domain calculation: Vector representation of the electric field in an rf-window.
Figure 5.55. Evaluation process: Distribution of the energy density in the ceramic disc of the rf-window.
Figure 5.56. The stationary temperature problem: Temperature distribution in the ceramic disc of the rf-window.
Figure 5.57. Geometry of the waveguide with a load and attached cooling channel, which is visible on the upper side.
Figure 5.58. The field problem: Vector representation of the electric field in the waveguide with a load. The upper wall was removed for this representation.
Figure 5.59. The stationary temperature problem: Temperature distribution in a lossy load.
Summary
This book has several main topics, which are all related to the solution of linear systems of equations in numerical field theory. In the first two sections, we discussed analytical and numerical methods for the solution of Maxwell's equations, viz. one semi-analytical and two major grid-oriented methods for field computation. Two of those three methods, the Mode Matching Technique and the Finite Integration Technique, were used to model various problems from field theory. In the course of this, on the one hand, full complex matrices of moderate dimension and, on the other hand, very large sparse real and complex matrices arise. The third section is devoted to possible solution methods for the linear systems thus obtained. The section started with Gaussian elimination, continued with SOR, the cg method, and the most recent Krylov subspace and hybrid methods, and ended with the multigrid or multilevel methods. The most recent algorithms in the field of Krylov subspace methods were collected and documented in a unified style. Besides the theory of Krylov subspace and multigrid methods, their convergence observed in practice was investigated in detail. In contrast to theoretical convergence results, which can be obtained only for simple model problems, serious difficulties arise when the methods are applied to examples relevant to practice. In the fourth section, examples of all electromagnetic problem types were presented. Among those are all the examples from the convergence part of the third section. The fields on humid high voltage insulators were treated much more thoroughly than the other examples. The high voltage problem was modeled with the Finite Integration Technique. First numerical results confirmed the suitability of this model. They also indicated the error made if this field problem is solved in electrostatics. Finally, the fifth section contained practical examples from accelerator physics. 
The computation of parasitic modes in accelerating structures is a substantial matter of concern in all linear collider projects. A short introduction to linear colliders was given before the simulation results were presented. The computation of parasitic higher order modes revealed the phenomenon of trapped modes that strongly interact with the particle beam. This result had a considerable influence on the design of the structure. For the validation of the simulation results, a test structure was developed and investigated by measurements. Here it became obvious that interaction between experiment and simulation is of great importance for quantitatively reliable results.
References
1. R. Alexander. Diagonally Implicit Runge-Kutta Methods for Stiff ODE's. SIAM J. Numer. Anal., 14:1006-1022, 1977.
2. J.H. Argyris. Energy Theorems and Structural Analysis. Butterworths Scientific Publications, London, 1960. First published in Aircraft Engineering, 26:347-356,383-387,394, 1956 and 27:42-58,80-94,125-134, 1955.
3. W.E. Arnoldi. The Principle of Minimized Iteration in the Solution of the Matrix Eigenvalue Problem. Quart. Appl. Math., 9:17-29, 1951.
4. O. Axelsson. On the Computational Complexity of Some Matrix Iterative Algorithms. Technical report, Department of Computer Science, Chalmers University of Technology, Göteborg, Sweden, 1974.
5. O. Axelsson. Conjugate Gradient Type Methods for Unsymmetric and Inconsistent Systems of Linear Equations. Technical Report RR.78.03R, Department of Computer Science, Chalmers University of Technology, Göteborg, Sweden, 1978.
6. O. Axelsson. Conjugate Gradient Type Methods for Unsymmetric and Inconsistent Systems of Linear Equations. Linear Algebra Appl., 29:1-16, 1980.
7. O. Axelsson. A Survey of Preconditioned Iterative Methods for Linear Systems of Algebraic Equations. BIT, 25:166-187, 1985.
8. O. Axelsson. A Generalized Conjugate Gradient, Least Square Method. Numer. Math., 51:209-227, 1987.
9. O. Axelsson. Iterative Solution Methods. Cambridge University Press, 1994.
10. O. Axelsson and A. Kucherov. Real Valued Iterative Methods for Solving Complex Linear Systems. Report 9904, Department of Mathematics, University of Nijmegen, Nijmegen, Netherlands, January 1999.
11. O. Axelsson and A.A. Kutcherov. Real Valued Iterative Methods for Solving Complex Linear Systems. unpublished, Universität Nijmegen, January 1994.
12. O. Axelsson, M. Neytcheva, and B. Polman. The Boardering Method as a Preconditioning Method. Report 9348, Department of Mathematics, Catholic University, Nijmegen, Netherlands, December 1993.
13. O. Axelsson and P.S. Vassilevski. Algebraic Multilevel Preconditioning Methods, II. SIAM J. Numer. Anal., 27(6):1569-1590, 1990.
14. N.S. Bakhvalov. On the Convergence of a Relaxation Method with Natural Constraints on the Elliptic Operator. USSR Comput Math and Math Phys, 6(5):101-135, 1966. Original: O schodimosti odnogo relaksacionnogo metoda pri estestvennych ogranicenijach na ellipticeskij operator. Z vycisl matem i matem fiz 6,5 (1966) 861-883.
15. V. Balakin, A. Novokhatsky, and V. Smirnov. VLEPP: Transverse Beam Dynamics. In 12th Int. Conf. on High Energy Accelerators, Fermilab, 1983.
16. V.E. Balakin, G.I. Budker, and A.N. Skrinski. In Int. Sem. on Prob. on High Energy and Controlled Nucl. Fusion, pages 78-101, Novosibirsk, USSR, 1978.
17. K. Bane and B. Zotter. Transverse Modes in Periodic Cylindrical Waveguides. In XIth Int. Conf. on High Energy Accelerators, pages 581-585, Geneva, Switzerland, July 1980.
18. K.L.F. Bane. private communication. SLAC.
19. K.L.F. Bane and R.L. Gluckstern. The Transverse Wakefield of a Detuned X-Band Accelerator Structure. Part. Acc., 42(3-4):123-169, 1993.
20. K.L.F. Bane, P.B. Wilson, and T. Weiland. Wake Fields and Wake Field Acceleration. American Institute of Physics, AIP, 127:875-928, 1983.
21. R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. Van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd Edition. SIAM, Philadelphia, PA, 1994. Also available under http://mip.ups-tlse.fr/~grundman/templates/report.html.
22. M. Bartsch, U. Becker, M. Dehler, M. Dohlus, X. Du, S. Gutschling, P. Hahne, R. Klatt, M. Marx, Z. Min, U. van Rienen, D. Schmitt, A. Schulz, B. Steffen, P. Thoma, B. Wagner, T. Weiland, S. Wipf, and H. Wolter. Time Domain Electromagnetic Field Computation with Finite Difference Methods. presented at IEEE 93 Berlin, 1993.
23. M. Bartsch, U. van Rienen, and T. Weiland. Consistent Finite Integration Approach for Coupled Computation of Static Current Distributions and Electromagnetic Fields. In ICEF'96, Wuhan, China, October 1996.
24. M. Bartsch, U. van Rienen, and T. Weiland. Consistent Finite Integration Approach for Coupled Computation of Static Current Distributions and Electromagnetic Fields. IEEE Transactions on Magnetics, 34(5):3098-3101, Sept. 1998.
25. M. Bartsch and T. Weiland. 2D and 3D Calculation of Forces. IEEE Transactions on Magnetics, 30(5), 1994.
26. F.L. Bauer. Das Verfahren der Treppeniteration und verwandte Verfahren zur Lösung algebraischer Eigenwertprobleme. ZAMP, 8:214-235, 1967.
27. A. Bayliss and E. Turkel. Radiation Boundary Conditions for Wave-Like Equations. Communications on Pure and Applied Mathematics, XXXIII:707-725, 1980.
28. R. Beck, P. Deuflhard, R. Hiptmair, R.H.W. Hoppe, and B. Wohlmuth. Adaptive Multilevel Methods for Edge Element Discretizations of Maxwell's Equations. Report SC-96-51, Konrad-Zuse-Zentrum für Informationstechnik, Berlin, 1996.
29. J.P. Berenger. A Perfectly Matched Layer for the Absorption of Electromagnetic Waves. Journal of Computational Physics, 114:185-200, 1994.
30. R.M. Bevensee. Electromagnetic Slow Wave Systems. John Wiley, New York, London, Sydney, 1964.
31. W. Bialowons et al. Conceptual Design Report of a 500 GeV e+e- Linear Collider with Integrated X-ray Laser Facility. Number DESY 1997-048, Hamburg, 1997. DESY. ECFA 1997-182.
32. U. Bleil. Zur numerischen Berechnung elektrostatischer und zeitharmonischer elektromagnetischer Felder im nichtorthogonalen dreidimensionalen Gitter. PhD thesis, Darmstadt University of Technology, 1994.
33. A. Bossavit. private communication.
34. A. Bossavit. Whitney forms: A Class of Finite Elements for three-dimensional Computations in Electromagnetism. In IEE Proc. A 135, pages 493-500, 1988.
35. A. Bossavit. A New Viewpoint on Mixed Elements. Meccanica, 27:3-11, 1992.
36. A. Bossavit. Computational Electromagnetism, Variational Formulations, Edge Elements, Complementarity. Academic Press, Boston, 1998.
37. A. Bossavit and L. Kettunen. Yee-like Schemes on a Tetrahedral Mesh with Diagonal Lumping. Int. J. Numer. Modelling, 1998. Special Issue "Finite Difference Time and Frequency Domain Methods".
38. F. Bouillault, Z. Ren, and A. Razek. Calculation of Eddy Currents in an Asymmetrical Conductor with a Hole. COMPEL, 9(A):227-229, 1990.
39. D. Boussard. Beam Loading. In CERN Accelerator School 1985, Oxford, Great Britain, September 1985. Also as CERN SPS/86-10.
40. A. Brandt. Multi-Level Adaptive Technique (MLAT) for Fast Numerical Solution to Boundary Value Problems. In H. Cabannes and R. Temam, editors, Third international conference on numerical methods in fluid mechanics, pages 82-89, Paris, France, July 1972. Springer, Berlin. Lecture Notes in Phys. 18, 1973.
41. A. Brandt. Multi-Level Adaptive Solutions to Boundary-Value Problems. Math. Comp., 31:333-390, 1977.
42. A. Brandt. Multigrid Techniques: 1984 Guide with Applications to Fluid Dynamics. Studien 85, GMD, St. Augustin bei Bonn, Mai 1984.
43. A. Brandt and S. Ta'asan. Multigrid Method for Nearly Singular and Slightly Indefinite Problems. In 2nd European Conf. on Multigrid Methods, pages 100-122, Köln, October 1-4 1985. GMD-Studien 110, Mai 1986.
44. Ch. Breit. Vektorimplementierung iterativer Verfahren und Vorkonditionierungsmethoden zur Lösung großer linearer Gleichungssysteme in der Schaltkreissimulation. Diplomarbeit, Technische Universität München, 1992.
45. S.C. Brenner and L.R. Scott. The Mathematical Theory of Finite Element Methods, volume 15 of Texts in Applied Mathematics. Springer, New York, 1996.
46. F. Brezzi, J. Douglas, and L.D. Marini. Two Families of Mixed Finite Elements for Second Order Elliptic Problems. Numer. Math., 47:217-235, 1985.
47. F. Brezzi and M. Fortin. Mixed and Hybrid Finite Elements, volume 15 of Springer Series in Computational Mathematics. Springer, New York, 1991.
48. E. Brieskorn. Lineare Algebra und analytische Geometrie, Band I und II. Vieweg, Braunschweig, Wiesbaden, 1983.
49. W.L. Briggs. A Multigrid Tutorial. SIAM, Philadelphia, Pennsylvania, 1987.
50. R. Brinkmann. On the Scaling of SBLC Parameters Towards Smaller Bunch Charge. internal note, May 1995. DESY.
51. R. Brinkmann, M. Dohlus, M. Drevlak, N. Holtkamp, A. Mosnier, U. van Rienen, R. Wanzenberg, G.-A. Voss, and T. Weiland. notes of the "Beam Dynamics Study Group for TESLA and SBLC".
52. A. Chao. Physics of Collective Beam Instabilities in High Energy Accelerators. Series in Beam Physics and Accelerator Technology. Wiley, New York, 1993.
53. A.W. Chao. Coherent Instabilities of a Relativistic Bunched Beam. In Second Summer School on High Energy Particle Accelerators, Stanford Linear Accelerator Center, Stanford, California, August 1982. Also SLAC-PUB-2946, June 1982.
54. P.G. Ciarlet. The Finite Element Method for Elliptic Problems. North-Holland, Amsterdam, 1987.
55. O. Claus. Charakterisierung des Oberflächenzustandes zylindrischer Prüfkörper aus Epoxidharz-Formstoff vor und nach der Beanspruchung mit wässrigen salzhaltigen Fremdschichten und 50 Hz-Wechselspannung. PhD thesis, Darmstadt University of Technology, 1996.
56. M. Clemens. private communication. Darmstadt University of Technology.
57. M. Clemens. Zur numerischen Berechnung zeitlich langsam-veränderlicher elektromagnetischer Felder mit der FI-Methode. PhD thesis, Darmstadt University of Technology, 1998.
58. M. Clemens, R. Schuhmann, U. van Rienen, and T. Weiland. Modern Krylov Subspace Methods in Electromagnetic Field Computation Using the Finite Integration Theory. ACES Journal, Special Issue on Applied Mathematics: Meeting the challenges presented by Computational Electromagnetics, 11(1):70-84, March 1996.
59. M. Clemens, P. Thoma, T. Weiland, and U. van Rienen. A Survey on the Computational Electromagnetic Field Calculation with the FI Method. Surveys on Mathematics for Industry, 8(3-4):213-232, 1999. vermeintliches Special Issue on Scientific Computing in Electrical Engineering.
60. M. Clemens and T. Weiland. Iterative Methods for the Solution of Very Large Complex Symmetric Linear Systems of Equations in Electrodynamics. In T. A. Manteuffel and S. F. McCormick, editors, Eleventh Copper Mountain Conf. 1996, Copper Mountain, Colorado, USA, April 1996. The paper is available via MGNET.
61. F. Close, M. Marten, and C. Sutton. The Particle Explosion. Oxford University Press, Oxford, 1987.
62. F. Close, M. Marten, and C. Sutton. Spurensuche im Teilchenzoo. Spektrum der Wissenschaft, Heidelberg, 1989.
63. R.W. Clough. The Finite Element Method in Plane Stress Analysis. In 2nd Conf. on Electronic Computation, ASCE, Pittsburg, Pa, USA, 1960.
64. G. Cohen and P. Monk. Gaußpoint Mass Lumping Schemes in Electromagnetism. COMPEL Suppl. A, 13:293-298, 1994.
65. The MAFIA collaboration. User's guide mafia version 4.x. Technical report, CST GmbH, Lauteschlägerstr. 38, D-64289 Darmstadt, Germany.
66. R.E. Collin. Field Theory of Guided Waves. IEEE Press, New York, 2nd edition, 1991.
67. R.E. Collin. Foundations for Microwave Engineering. McGraw-Hill, New York, 2nd edition, 1992.
68. R. Courant. Variational Methods for the Solution of Problems of Equilibrium and Vibrations. Bull. Amer. Math. Soc., 49:1-23, 1943.
69. E.J. Craig. The n-step Iteration Procedures. Math. Phys., (34):64-73, 1955.
70. A. Das and T. Ferbel. Introduction to Nuclear and Particle Physics. Wiley, 1994.
71. A. Das and T. Ferbel. Kern- und Teilchenphysik. Spektrum Akademischer Verlag, Heidelberg, 1995.
72. M. Dehler. Numerische Lösung der Maxwellschen Gleichungen auf kreiszylindrischen Gittern. PhD thesis, Darmstadt University of Technology, 1993.
73. P. Deuflhard. Cascadic Conjugate Gradient Methods for Elliptic Partial Differential Equations: Algorithm and Numerical Results. Cont. Math., 180:29-42, 1994.
74. P. Deuflhard and A. Hohmann. Numerische Mathematik 1. de Gruyter Lehrbuch. Walter de Gruyter, Berlin, 1993.
75. M. Dohlus. private communication. DESY.
76. M. Dohlus. Ein Beitrag zur numerischen Berechnung elektromagnetischer Felder im Zeitbereich. PhD thesis, Darmstadt University of Technology, 1992.
77. M. Dohlus, H. Hartwig, N. Holtkamp, S. Ivanov, V. Kaljazkny, and A. Nabaka. Higher Order Mode Damping by Artificially Increased Surface Losses. Technical report, DESY, 1996. in preparation.
78. M. Dohlus, M. Marx, N. Holtkamp, P. Hülsmann, W.F.O. Müller, M. Kurz, H.-W. Glock, and H. Klein. S-Band HOM-Damper Calculations and Experiments. In Particle Accelerator Conf. PAC'95, pages 692-694, Dallas, USA, May 1-5, 1995. DESY M-95-08, pp. 85-87.
79. M. Dohlus, P. Thoma, and T. Weiland. Stability of Finite Difference Time Domain Methods related to Space and Time Discretization. IEEE-MTT. submitted. 80. M. Dohlus and U. van Rienen. Analysis of Higher Order Modes in ConstantGradient Structures. internal note. 81. C. Douglas. MGNet-Digest. http://www.cerfacs.frr douglas/mgnet.html. 82. M. Drevlak. Berechnung des Wechselwirkungsparameters von Wanderwellenriihren. Studienarbeit, Darmstadt University of Technology, 1991. 83. M. Drevlak. On the Preservation of Single- and Multi-Bunch Emittance in Linear Accelerators. PhD thesis, Darmstadt University of Technology, 1995. DESY-report DESY 95-225. 84. M. Drevlak, H.G. Beyer, N. Holtkamp, U. van Rienen, V. Tsakanov, R. Wanzenberg, T. Weiland, and M. Zhang. Attenuation of Transverse Modes by Variable Cell Geometries in Travelling Wave Tubes. In XV Int. Conf. on High Energy Accelerators, pages 876-878, Hamburg, July 1992. 85. J.A. Edminister. Electromagnetism. Schaum's Outline. McGraw-Hill, New York,1984. 86. D.A. Edwards and M.J. Syphers. An Introduction to the Physics of High Energy Accelerators. Series in Beam Physics and Accelerator Technology. Wiley, New York, 1992. 87. R. Ehmann. Ein schnelles Verfahren zur Feldberechnung in aperiodischen Wanderwellenriihren. Diplomarbeit, Darmstadt University of Technology, 1994. 88. S.C. Eisenstat, H.C. Elman, and Schultz M.H. Variational Iterative Methods for Nonsymmetric Systems of Linear Equations. SIAM Sci. Stat. Comput., 20(2):345-357, April 1983. 89. H. Elman. Iterative Methods for Large, Sparse, Nonsymmetric Systems of Linear Equations. PhD thesis, Computer Science Dept., Yale University, New Haven, CT, 1982. Research Report 229. 90. B. Engquist and A. Majda. Absorbing Boundary Conditions for the Numerical Simulation of Waves. Math. Comp., 31(139):629-651, July 1977. 91. D.K. Faddeev and V.N. Faddeeva. Computational Methods of Linear Algebra. Freeman, San Francisco, California, 1963. 92. R.P. Fedorenko. 
A Relaxation Method for Solving Elliptic Difference Equations. USSR Comput. Math. and Math. Phys., 1(5):1092-1096, 1961. Original: Relaksacionnyj metod rešenija raznostnych elliptičeskich uravnenij. Ž. vyčisl. matem. i matem. fiz. 1,5 (1961) 922-927. 93. R.P. Fedorenko. The Speed of Convergence of one Iteration Process. USSR Comput. Math. and Math. Phys., 4(3):227-235, 1964. Original: O skorosti schodimosti odnogo iteracionnogo processa. Ž. vyčisl. matem. i matem. fiz. 4,3 (1964) 559-564. 94. P. Fellinger. Ein Verfahren zur numerischen Lösung elastischer Wellenausbreitungsprobleme im Zeitbereich durch direkte Diskretisierung der elastodynamischen Grundgleichungen. PhD thesis, Gesamthochschule Kassel, 1991. 95. R.P. Feynman, R.B. Leighton, and M. Sands. The Feynman Lectures on Physics - Volume II; mainly electromagnetism and matter. Addison-Wesley, Menlo Park, 1977. 96. R. Fletcher. Conjugate Gradient Like Methods for Indefinite Systems. In Dundee Biennial Conf. on Numerical Analysis, pages 73-89, University of Dundee, Scotland, 1974. Springer Verlag, New York, 1975. 97. R.W. Freund. A Transpose-Free Quasi-Minimal Residual Algorithm for Non-Hermitian Linear Systems. SIAM J. Sci. Comput., 14(2):470-482, March 1993.
98. R.W. Freund and M. Hochbruck. On the Use of two QMR Algorithms for Solving Singular Systems and Applications in Markov Chain Modelling. RIACS Technical Report 91.25, Moffett Field, California, 1991. 99. R.W. Freund and N.M. Nachtigal. An Implementation of the QMR Method based on Coupled Two-Term Recurrences. 100. R.W. Freund and N.M. Nachtigal. QMR: A Quasi-Minimal Residual Method for Non-Hermitian Linear Systems. Numer. Math., 60:315-339, 1991. 101. H. Fritsch. Vom Urknall zum Zerfall. dtv, München, 2nd edition, 1996. 102. B. Geib. Rechnergestützter Breitbandentwurf in Rechteckhohlleitertechnik. PhD thesis, Darmstadt University of Technology, 1992. 103. G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, 1983. 104. U. Gotte. Numerische Lösung linearer Gleichungssysteme unter a priori Genauigkeitsvorgaben mit Anwendungen in der Netztheorie. Diplomarbeit, Universität Bonn, 1982. 105. A. Greenbaum. Iterative Methods for Solving Linear Systems, volume 17 of Frontiers in applied mathematics. Society for Industrial and Applied Mathematics, Philadelphia, 1997. 106. M. Griebel. Multilevelmethoden als Iterationsverfahren über Erzeugendensystemen. Teubner Skripten zur Numerik. Teubner, Stuttgart, 1994. 107. Ch. Großmann and H.-G. Roos. Numerik partieller Differentialgleichungen. Teubner Studienbücher, Mathematik. Teubner, Stuttgart, 1994. 108. NAG Numerical Algorithm Group. NAG LIB-Manual. Technical Report Mark 11, 1983. 109. I. Gustafsson. A Class of First Order Factorization Methods. BIT, 18:142-156, 1978. 110. M.H. Gutknecht. Variants of BICGSTAB for Matrices with Complex Spectrum. SIAM J. Sci. Stat., 14(5):1020-1033, September 1993. 111. W. Hackbusch. A Fast Iterative Method Solving Poisson's Equation in a General Region. In R. Bulirsch, R.D. Grigorieff, and J. Schröder, editors, Numerical treatment of differential equations, pages 51-62, Oberwolfach, July 1976. Springer Verlag, Berlin. Lecture Notes in Math. 631, 1978. 
112. W. Hackbusch. Multi-Grid Methods and Applications, volume 4 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 1985. 113. W. Hackbusch. Elliptic Differential Equations - Theory and Numerical Treatment, volume 18 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 1992. 114. W. Hackbusch. Iterative Lösung großer schwachbesetzter Gleichungssysteme, volume 69 of Leitfäden der Angewandten Mathematik und Mechanik. Teubner Studienbücher: Mathematik, Stuttgart, 1993. 115. H. Hahn. Deflecting Mode in Circular Iris-Loaded Waveguides. Rev. Sci. Instrum., 34(10):1094-1100, October 1963. 116. P. Hahne. Zur numerischen Feldberechnung zeitharmonischer elektromagnetischer Felder. PhD thesis, Darmstadt University of Technology, 1992. 117. E. Hairer and G. Wanner. Solving Ordinary Differential Equations II. Stiff and Differential-Algebraic Problems, volume 14 of Series in Comp. Math. Springer Verlag, 1996. 118. P. Hammes. Verbesserung des Konvergenzverhaltens der FI Methode durch Berücksichtigung von Feldsingularitäten. Diplomarbeit, Darmstadt University of Technology, 1996. 119. M. Hano. Solution of the Problem of Asymmetric Conductor with a Hole by Finite Element Method Using Magnetic Vector Potential. COMPEL, 9(A):233-235, 1990.
120. H.A. Haus and J.R. Melcher. Electromagnetic Fields and Energy. Prentice Hall, Englewood Cliffs, 1989. 121. S.A. Heifets and S.A. Kheifets. Longitudinal Electromagnetic Fields in an Aperiodic Structure. IEEE-MTT, 42(1):108-117, January 1994. SLAC-PUB-5907, September 1992. 122. S.A. Heifets, S.A. Kheifets, and B. Woo. Transverse Electromagnetic Fields in a Detuned X-Band Accelerating Structure. PUB 6336, SLAC, Stanford, California, USA, September 1993. 123. B. Heinrich. Finite Difference Methods on Irregular Networks. Akademie Verlag, Berlin, 1987. 124. R.H. Helm et al. IEEE Trans. Nucl. Sci., 16, 1969. 125. H. Henke. Impedances of a Set of Cylindrical Resonators with Beam Pipes. LEP-RF 87-62, CERN, Geneva, Switzerland, November 1987. 126. G.T. Herman, A. Lent, and P.H. Lutz. Relaxation Methods for Image Reconstruction. Commun. Assoc. Comput. Mach., 21:152-158, 1978. 127. M.R. Hestenes and E. Stiefel. Methods of Conjugate Gradients for Solving Linear Systems. J. Res. Nat. Bur. Standards, 49:409-436, 1952. 128. T. Higo. private communication. KEK, Japan. 129. M. Hilgner. Eine spezielle reellwertige Methode zur Lösung komplexer linearer Gleichungssysteme. Studienarbeit, Darmstadt University of Technology, 1997. (Beginn Mai 1996). 130. N. Holtkamp. private communication. DESY. 131. N. Holtkamp. The S-Band Linear Collider Test Facility. In Particle Accelerator Conf. PAC'95, pages 686-688, Dallas, USA, May 1-5 1995. DESY M-95-08, pp. 113-115. 132. N. Holtkamp, D. Kahlke, and D. Jagnow. private communication, 1995. DESY. 133. N. Holtkamp and M. Marx. private communication, 1994. DESY. 134. K.H. Huebner. The Finite Element Method for Engineers. Wiley, 1975. 135. P. Hülsmann. private communication. Univ. Frankfurt. 136. P. Hülsmann. Theoretische und experimentelle Untersuchungen zur Bestimmung der transversalen Shuntimpedanz und Güte an störmodenbedämpften Beschleunigerresonatoren für lineare Kollider und Hochstrombeschleuniger in mittleren und hohen Energiebereichen. 
PhD thesis, Universität Frankfurt, 1992.
137. P. Hülsmann, W.F.O. Müller, H. Klein, U. van Rienen, and T. Weiland. The 36-Cell Structure - Calculations and Experiments. In Particle Accelerator Conf. PAC'97, page to appear, Vancouver, Canada, May 12-16 1997. 138. S. Humphries, Jr. Principles of Charged Particle Acceleration. Wiley, New York, 1986. 139. S. Humphries, Jr. Charged Particle Beams. Wiley, New York, 1990. 140. INP. Seventh International Workshop on Linear Colliders, Zvenigorod, Russia, 1997. Branch INP Scientific Library. 141. T. Itoh, editor. Numerical Techniques for Microwave and Millimeter-Wave Passive Structures, pages 592-621. John Wiley & Sons, 1989. 142. J.D. Jackson. Klassische Elektrodynamik. W. de Gruyter, Berlin, New York, 1983. 143. J.D. Jackson. Classical Electrodynamics. Wiley-VCH, Weinheim, 1998. Third Edition. 144. D. A. H. Jacobs. The Exploitation of Sparsity by Iterative Methods. Sparse matrices and their Uses. I. S. Duff, Springer, 1981. pp. 191-222. 145. K.C. Jea and D.M. Young. Generalized Conjugate Gradient Acceleration of Nonsymmetrizable Iterative Methods. Lin. Alg. Appl., 34:159-194, 1980.
146. A. Jöstingmeyer, C. Rieckmann, and A.S. Omar. Rigorous and Numerically Efficient Computation of the Irrotational Electric and Magnetic Eigenfunctions of Complex Gyrotron Cavities. IEEE-MTT, 43(5):1187-1195, May 1995. 147. A. Jöstingmeyer. private communication, 1996. DESY. 148. S. Kaczmarz. Angenäherte Auflösung von Systemen linearer Gleichungen. Bull. Int. Acad. Polon. Sci. A, pages 355-357, 1937. 149. V.E. Kaljuzhny, D.V. Kostin, S.V. Ivanov, O.S. Milovanov, N.N. Nechaev, A.N. Parfenov, N.P. Sobenin, A.A. Zavadzev, S.N. Yarigin, M. Dohlus, and N. Holtkamp. Investigation of a Hybrid Coupler for a 6 Meter S-Band Linear Collider Accelerating Structure. M 96-05, DESY, Hamburg, April 1996. 150. U. Kaltenborn. private communication. TU Dresden. 151. E.M. Kasenally. GMBACK: A Generalised Minimum Backward Error Algorithm for Nonsymmetric Linear Systems. SIAM J. Sci. Comput., 16(3):698-719, May 1995. 152. E. Keil. Beam-Cavity Interaction in Electron Storage Rings. NIM, 127:475-485, 1975. 153. KEK. Sixth International Workshop on Next-Generation Linear Colliders, Tsukuba, Japan, 1995. 154. T. Khabiboulline, V. Puntus, S. Ivanov, K. Jin, M. Dohlus, N. Holtkamp, and G. Kreps. Measurement of the Higher Order Mode (HOM) Field Distribution in a Linac II Accelerating Section. M 94-02, DESY, Hamburg, June 1994. 155. R. Klatt and T. Weiland. Wake Field Calculations with Three-Dimensional BCI Code. In Int. Linear Accelerator Conf. LINAC'86, pages 282-285, Stanford University, June 1986. SLAC-303; DESY M-86-07, June 1986, pp. 20-23. 156. D. König. Surface and Ageing Phenomena on Organic Insulation under the Condition of Light Contamination and High Electric Stress. In Conf. Proceedings Nord-IS-94, 1994. (Invited Lecture). 157. S. Korotov. Iterative Solutions of certain Complex Valued Systems. Abschlußarbeit "master class", Katholieke Universiteit Nijmegen, Netherlands, 1994. 158. A. Kost. Numerische Methoden in der Berechnung elektromagnetischer Felder. Springer-Lehrbuch. 
Springer-Verlag, Berlin Heidelberg, 1994. 159. F. Krawczyk. A Contribution to the Numerical Calculation of Static Electromagnetic Fields in Unbounded Domains. PhD thesis, Universität Hamburg, 1990. DESY M-90-13. 160. B. Krietenstein. Geometrieveränderungen an Wanderwellenröhren und Bestimmung einer Teststruktur zu Meßzwecken. Studienarbeit, Darmstadt University of Technology, 1994. 161. B. Krietenstein, O. Podebrad, U. van Rienen, T. Weiland, H.-W. Glock, P. Hülsmann, H. Klein, M. Kurz, M. Dohlus, and N. Holtkamp. The S-Band 36-cell Experiment. In Particle Accelerator Conf. PAC'95, pages 695-697, Dallas, USA, May 1-5 1995. DESY M-95-08, pp. 81-83. 162. M. Kurz. Untersuchungen zu mikrowellenfokussierenden Beschleunigerstrukturen für zukünftige lineare Collider. PhD thesis, Universität Frankfurt, 1993. 163. M. Kurz, W.F.O. Müller, U. Niermann, P. Hülsmann, H. Klein, U. van Rienen, O. Podebrad, T. Weiland, and M. Dohlus. Higher-Order Modes in a 36-Cell Test Structure. In 5th European Particle Accelerator Conf. (EPAC'96), Sitges, Spain, June 1996. 164. C. Lanczos. An Iteration Method for the Solution of the Eigenvalue Problem of Linear Differential and Integral Operators. J. Res. Nat. Bur. Standards, 45:255-282, 1950. 165. C. Lanczos. Solution of Systems of Linear Equations by Minimized Iterations. J. Res. Nat. Bur. Standards, 49:33-53, 1952.
166. U. Laustroer, U. van Rienen, and T. Weiland. URMEL and URMEL-T User Guide. M 87-03, DESY, February 1987. 167. J. Le Duff. Dynamics and Acceleration in Linear Structures. In CERN Accelerator School, Fifth General Accelerator Physics Course, pages 253-288, Jyväskylä, Finland, September 1992. 168. G. Lehner. Elektromagnetische Feldtheorie. Lehrbuch. Springer, Berlin, 1996. 169. R. Losito. private communication. INFN, Frascati, Guest visitor at LAL, Orsay. 170. L.C. Maier and J.C. Slater. Field Strength Measurements in Resonant Cavities. Journal of Applied Physics, 23(1), 1952. 171. T.A. Manteuffel. The Tchebychev Iteration for Nonsymmetric Linear Systems. Numer. Math., 28:307-327, 1977. 172. J.C. Maxwell. A Treatise on Electricity and Magnetism, 3rd edition 1891. Dover Publications, New York, 1954. 173. U. Meier Yang. Preconditioned Conjugate Gradient-Like Methods for Nonsymmetric Linear Systems. Diplomarbeit, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, 1994. Research Report 1210. 174. Th. Meis and U. Marcowitz. Numerische Behandlung partieller Differentialgleichungen. Springer-Verlag, Berlin, 1978. 175. J. Meixner. Die Kantenbedingung in der Theorie der Beugung elektromagnetischer Wellen an vollkommen leitendem ebenen Schirm. Ann. der Physik, 6(6):2-9, 1949. 176. R.H. Miller. Comparison of Standing-Wave and Travelling-Wave Structures. In Int. Linear Accelerator Conference LINAC'86, pages 200-205, 1986. 177. T.G. Moore, J.G. Blaschak, A. Taflove, and G.A. Kriegsmann. Theory and Application of Radiation Boundary Operators. IEEE Trans. on Antennas and Propagation, 36(12):1797-1812, 1988. 178. P.M. Morse and H. Feshbach. Methods of Theoretical Physics. Int. Series in Pure and Applied Physics. McGraw-Hill, New York, 1953. 179. W.F.O. Müller, P. Hülsmann, M. Kurz, C. Peschke, H. Klein, U. van Rienen, T. Weiland, and M. Dohlus. 
Measurements and Numerical Calculations on Higher-Order-Mode-Dampers within a Stack of 36 Detuned S-Band-Cells. In Int. Linear Accelerator Conference LINAC'96, Geneva, Switzerland, August 1996. 180. G. Mur. Absorbing Boundary Conditions for the Finite-Difference Approximation of the Time-Domain Electromagnetic-Field Equations. IEEE Trans. on Electromagnetic Compatibility, 23(4):377-382, 1981. 181. G. Mur. Edge Elements, their Advantages and Disadvantages. Et/EM 1993-35, TU Delft, 1993. 182. G. Mur. Panel Session on Edge Elements. In XI Conf. on the Computation of Electromagn. Fields COMPUMAG, Rio de Janeiro, Brazil, November 1997. 183. J. Nahr. Konvergenzuntersuchungen zur Orthogonalentwicklung in nichtperiodischen Wanderwellenröhren. Diplomarbeit, Darmstadt University of Technology, 1992. 184. R.B. Neal, editor. The Stanford Two-Mile Accelerator. Stanford University, W.A. Benjamin, Inc., New York, 1968. 185. J. Nédélec. Mixed Finite Elements in R3. Numer. Math., 35:315-341, 1980. 186. J. Nédélec. A New Family of Mixed Finite Elements in R3. Numer. Math., 50:57-81, 1986. 187. Y. Ne'eman and Y. Kirsh. The particle hunters. Cambridge University Press, Cambridge, 1986. 188. Y. Ne'eman and Y. Kirsh. Die Teilchenjäger. Springer, Berlin, 1995.
189. M. Neytcheva. Arithmetic and Communication Complexity of Preconditioning Methods. PhD thesis, Katholieke Universiteit Nijmegen, Netherlands, 1995. 190. K. Oosterlee and T. Washio. An Evaluation of Parallel Multigrid as a Solver and a Preconditioner for Singular Perturbed Problems: Part I: The Standard Grid Sequence. Arbeitspapiere 980, GMD, Sankt Augustin, March 1996. 191. J. M. Ortega and W. C. Rheinboldt. Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York, 1970. 192. A.M. Ostrowski. On the Linear Iteration Procedures for Symmetric Matrices. Rend. Math. e Appl., 14:140-163, 1954. 193. C.C. Paige and M.A. Saunders. Solution of Sparse Indefinite Systems of Linear Equations. SIAM J. Numer. Anal., 12(4):617-629, September 1975. 194. R.B. Palmer. Prospects for High Energy e+e- Linear Colliders. PUB 5195, SLAC, Stanford, California, USA, 1990. 195. W.K.H. Panofsky and W.A. Wenzel. Some Considerations Concerning the Transverse Deflection of Charged Particles in Radio-Frequency Fields. Rev. Sci. Instrum., 27:967, 1956. 196. B.N. Parlett, D.R. Taylor, and Z.A. Liu. A Look-ahead Lanczos Algorithm for Unsymmetric Matrices. Math. Comp., 44:105-124, 1985. 197. K. Paulsen and D.R. Lynch. Elimination of Vector Parasites in Finite Element Maxwell Solutions. IEEE-MTT, 39(3):395-404, March 1991. 198. C. Peschke, P. Hülsmann, W.F.O. Müller, and H. Klein. Beam Position Monitoring for SBLC Using HOM-Coupler Signals. In 5th European Particle Accelerator Conf. (EPAC'96), Sitges, Spain, June 1996. 199. G. Piefke. Das dreidimensionale Zwischenmedium in der Feldtheorie. AEU, 24:523, 1970. 200. G. Piefke. Feldtheorie III, volume 782 of BI-Hochschultaschenbücher. Bibliographisches Institut AG, Mannheim, 1977. 201. P. Pinder. Berechnung stationärer Temperaturfelder unter Vorgabe von Wärmequellen. Diplomarbeit, Darmstadt University of Technology, 1994. 202. P. Pinder. 
Verwendung vorkonditionierter cg-Verfahren bei der Berechnung statischer elektromagnetischer Felder. Studienarbeit, Darmstadt University of Technology, 1994. 203. P. Pinder and T. Weiland. Numerical Calculation of Coupled Electromagnetic and Thermal Fields Using the Finite Integration Method. In PIERS'96, 1996. 204. O. Podebrad. Untersuchungen zu den Streu- und Wechselwirkungsparametern einer Beschleunigungsstruktur. Diplomarbeit, Darmstadt University of Technology, 1994. 205. O. Podebrad and T. Weiland. private communication. Darmstadt University of Technology. 206. W. Preuß, A. Bleyer, and H. Preuß. Distributionen und Operatoren. Springer Verlag, Wien, 1985. 207. T. Pröpper. Numerische Integration der Gitter-Maxwell-Gleichungen zur Lösung langsam-veränderlicher Feldprobleme. PhD thesis, Darmstadt University of Technology, 1995. 208. I. Quint. Untersuchungen zum Einfluß von schwach leitenden Fremdschichten auf das Oberflächen-Alterungsverhalten wechselspannungsbehafteter zylindrischer Prüflinge aus Epoxidharz-Formstoff. PhD thesis, Darmstadt University of Technology, 1993. 209. S.S. Rao. The Finite Element Method in Engineering. Pergamon, 1982. 210. T. Raubenheimer. The Generation and Acceleration of Low Emittance Flat Beams for Future Linear Colliders. PhD thesis, Stanford University, USA, 1991. SLAC-report 387.
211. P.-A. Raviart and J.-M. Thomas. A Mixed Finite Element Method for Second Order Elliptic Problems. In I. Galligani and E. Magenes, editors, Mathematical Aspects of the Finite Element Method, pages 292-315. Springer-Verlag, 1977. 212. P.A. Raviart and J.M. Thomas. Dual Finite Element Models for Second Order Elliptic Problems. Energy Methods in Finite Element Analysis. John Wiley and Sons, Chichester, 1979. 213. F. Reisdorf. Die Zwischenmediumsmethode. PhD thesis, Darmstadt University of Technology, 1977. 214. M. Reiser. Theory and Design of Charged Particle Beams. Series in Beam Physics and Accelerator Technology. Wiley, New York, 1994. 215. A.J. Roberts. A one-dimensional Introduction to Continuum Mechanics. World Sci., 1994. 216. A.J. Roberts. Advanced Engineering Mathematics A. http://www.sci.usq.edu.au/units/64636/aem/aem.html. 1995. 217. G. Romanov, S. Ivanov, M. Dohlus, and N. Holtkamp. Some Remarks on the Location of Higher Order Modes in Tapered Accelerating Structures with the Use of a Coupled Oscillator Model. In Particle Accelerator Conf. PAC'95, pages 2345-2347, Dallas, USA, May 1-5 1995. DESY M-95-08, pp. 89-91. 218. J. Rossbach and P. Schmüser. Basic Course on Accelerator Optics. In CERN Accelerator School, Fifth General Accelerator Physics Course, pages 17-88, Jyväskylä, Finland, September 1992. 219. J. Ruge and K. Stüben. Algebraic Multigrid AMG. Arbeitspapiere der GMD 210, GMD, Schloss Birlinghoven, June 1986. 220. H. Rutishauser. Theory of Gradient Methods, pages 24-49. Refined Iterative Methods for Computation of the Solution and the Eigenvalues of Self-Adjoint Boundary Value Problems. Institute of Applied Mathematics, Zürich, Basel-Stuttgart, 1959. 221. H. Rutishauser. Simultaneous Iteration Method for Symmetric Matrices. Numer. Math., 16:205-223, 1970. 222. Y. Saad. Practical Use of Polynomial Preconditionings for the Conjugate Gradient Method. SIAM J. Sci. Stat. Comput., 6(4):865-881, October 1985. 223. Y. Saad. 
Preconditioning Techniques for Nonsymmetric and Indefinite Linear Systems. Journal of Computational and Applied Mathematics, 24:89-105, 1988. 224. Y. Saad. Krylov Subspace Methods on Supercomputers. SIAM J. Sci. Stat. Comput., 10(6):1200-1232, 1989. 225. Y. Saad. ILUT: A Dual Threshold Incomplete LU Factorization. Numerical Linear Algebra with Applications, 1(4):387-402, 1994. 226. Y. Saad. Krylov Subspace Techniques, Conjugate Gradients, Preconditioning and Sparse Matrix Solvers. Lecture Series 1994-05, von Karman Institute for Fluid Dynamics, California, 1994. 227. Y. Saad. Iterative Methods for Sparse Linear Systems. The PWS Series in Computer Science. PWS Publ. Company, Boston, 1996. 228. Y. Saad and M.H. Schultz. Conjugate Gradient-like Algorithms for Solving Nonsymmetric Linear Systems. Math. Comput., 44:417-424, 1985. 229. Y. Saad and M.H. Schultz. GMRES: A Generalised Minimal Residual Method for Solving Nonsymmetric Linear Systems. SIAM J. Sci. Stat. Comput., 7:856-869, 1986. 230. W. Schmid, M. Paffrath, and R.H.W. Hoppe. Application of Iterative Methods for Solving Nonsymmetric Linear Systems in the Simulation of Semiconductor Processing. Surv. Math. Ind., 5:1-26, 1995. 231. D. Schmitt. Zur numerischen Berechnung von Resonatoren und Wellenleitern. PhD thesis, Darmstadt University of Technology, 1994.
232. W. Schnell. Research and Development for a CERN LINEAR COLLIDER. CERN-LEP-RF 87-58, CERN, Geneva, Switzerland, 1987. See also CERN-LEP-RF/86-06 and CERN-LEP-RF/86-27, 1986. 233. W. Schönauer. Scientific Computing on Vector Computers. North-Holland, Amsterdam, New York, Oxford, Tokyo, 1987. 234. J.T. Seeman and N. Merminga. Mutual Compensation of Wakefield and Chromatic Effects of Intense Linac Bunches. PUB 5220, SLAC, Stanford, USA, May 1990. 235. T. Shintake et al. High Power Test of HOM-Free Choke-Mode Damped Accelerating Structures. In 17th Int. Linac Conf., Tsukuba, Japan, 1994. KEK Preprint 94-82 (1994). 236. P.P. Silvester and M.V.K. Chari. Finite Element Solution of Saturable Magnetic Field Problems. IEEE Trans. on PAS, 89:1642-1651, 1970. 237. G.L.G. Sleijpen and D.R. Fokkema. BICGstab(l) for Linear Equations Involving Unsymmetric Matrices with Complex Spectrum. Electronic Transactions on Numerical Analysis, 1:11-32, September 1993. 238. G.L.G. Sleijpen, D.R. Fokkema, and H.A. van der Vorst. Generalized Conjugate Gradient Squared. Preprint No. 851, University Utrecht, Department of Mathematics, May 1994. 239. M. Sommer. Untersuchungen zum Gibbs'schen Phänomen bei der Methode der Orthogonalentwicklung zur Berechnung von Wanderwellenröhren. Diplomarbeit, Darmstadt University of Technology, 1995. 240. P. Sonneveld. CGS, a Fast Lanczos-Type Solver for Nonsymmetric Linear Systems. SIAM J. Sci. Stat. Comput., 10(1):36-52, November 1989. 241. M. Spasojevic, P. Levin, J.T. Schmidt, S. Schrater, and C. Schlyter. Rapid Evaluation of HVDC Grading Rings by BIE-based Field Calculation. In Ninth Int. Symp. on High Voltage Engineering, pages 8339(1-4), Graz, 1995. 242. Stanford Linear Accelerator Center, Stanford University. Fifth International Workshop on Next-Generation Linear Colliders, Stanford, California 94309, October 1993. SLAC-436. 243. C.W. Steele. A Nonresonant Perturbation Theory. IEEE-MTT, 14(2):70-74, 1966. 244. B. Steffen. private communication. 
KFA Jülich. 245. B. Steffen. Incorporation of Multigrid Methods in Accelerator Software. In 2nd European Conf. on Multigrid Methods, pages 163-178, Köln, October 1-4 1985. GMD-Studien 110, May 1986. 246. K. Steinigke. Wellenausbreitung in koaxial und exzentrisch geschichteten Rundhohlleiterstrukturen. PhD thesis, Darmstadt University of Technology, 1992. Fortschritt-Berichte R. 21, 108, VDI Verlag, Düsseldorf. 247. E. Stiefel. Relaxationsmethoden bester Strategie zur Lösung linearer Gleichungssysteme. Comment. Math. Helv., 29:157-179, 1955. 248. J. Stoer and R. Bulirsch. Numerische Mathematik II, volume 114 of Heidelberger Taschenbücher. Springer, Berlin, Heidelberg, 1990. 249. G. Strang. A Homework Exercise in Finite Elements. Int. J. Num. Meths. Engrg., 11:411-417, 1977. 250. G. Strang and G.J. Fix. An Analysis of the Finite Element Method. Prentice-Hall, New York, 1973. (Now published by Wellesley-Cambridge Press). 251. J.A. Stratton. Electromagnetic Theory. Int. Series in Pure and Applied Physics. McGraw-Hill, New York, London, 1941. 252. K. Stüben. Algebraic Multigrid (AMG): An Introduction with Applications. GMD-Reports 53, GMD, Schloss Birlinghoven, March 1999.
253. K. Stüben and U. Trottenberg. Multigrid Methods: Fundamental Algorithms, Model Problem Analysis and Applications. In W. Hackbusch and U. Trottenberg, editors, Multigrid Methods, Lecture Notes in Mathematics, 960, pages 1-176. Springer, Berlin, 1982. 254. T.A. Davis and I.S. Duff. A Combined Unifrontal/Multifrontal Method for Unsymmetric Sparse Matrices. Technical Report TR-97-016, Computer and Information Science and Engineering Department, University of Florida, 1997. 255. A. Taflove. Computational Electrodynamics - The Finite-Difference Time-Domain Method. Artech House, Boston, London, 1995. 256. A. Taflove, K.D. Umashankar, and T.G. Jurgens. Validation of FD-TD Modeling of the Radar Cross Section of Three-Dimensional Structures Spanning Up to Nine Wavelengths. IEEE Trans. on Antennas and Propagation, 33(6):662-666, June 1985. 257. K. Takata. In First Workshop on JAPAN LINEAR COLLIDER (JLC), KEK, Tsukuba, Japan, October 24-25 1989. 258. S. Takeda. Part. Acc., 30:1101, 1990. 259. T. Tarhasaari, L. Kettunen, and A. Bossavit. An Interpretation of the Galerkin Method as the Realization of a Discrete Hodge Operator. In 8th Biennial IEEE Conf. on Electrom. Field Computation CEFC 1998, Tucson, Arizona, USA, June 1998. 260. H.-J. Thiebes. Mehrgittermethoden und Reduktionsverfahren für indefinite elliptische Randwertaufgaben. PhD thesis, Institut für Angewandte Mathematik, Universität Bonn, 1983. 261. P. Thoma. Zur numerischen Lösung der Maxwellschen Gleichungen im Zeitbereich. PhD thesis, Darmstadt University of Technology, 1997. 262. P. Thoma and T. Weiland. A Consistent Subgridding Scheme for the Finite Difference Time Domain Method. International Journal of Numerical Modelling, 9:359-374, 1996. 263. J.M. Thomas. Sur l'analyse numérique des méthodes d'éléments finis hybrides et mixtes. Thèse d'état, Université Pierre et Marie Curie, Paris, 1977. 264. K.A. Thompson, C. Adolphsen, K.L.F. Bane, H. Deruyter, Z.D. Farkas, H.A. Hoag, K. Ko, G.A. Loew, R.H. Miller, E.M. 
Nelson, R.B. Palmer, J.M. Paterson, R.D. Ruth, J.W. Wang, P.B. Wilson, N.M. Kroll, X.T. Lin, R.L. Gluckstern, N. Holtkamp, B.W. Littmann, and D.U.L. Yu. Design and Simulation of Accelerating Structures for Future Linear Colliders. Part. Acc., 47:65-109, 1994. 265. P. Tong and J.N. Rossettos. Finite-Element Method. The MIT Press, Cambridge, 1977. 266. B. Trapp. Berücksichtigung teilweise elektrisch gefüllter Zellen mit der FI-Methode. Studienarbeit, Darmstadt University of Technology, 1996. 267. TRC: International Linear Collider Technical Review Committee Report, 1995. 268. L.N. Trefethen and L. Halpern. Well-Posedness of One-Way Wave Equations and Absorbing Boundary Conditions. Math. Comp., 47:421-435, October 1986. 269. U. Trottenberg. Schnelle Lösung elliptischer Differentialgleichungen nach dem Mehrgitterprinzip. Mitteilungen der GAMM Heft 1, February 1984. 270. J. Tückmantel. URMEL 1.8, URMEL with a High Speed and High Precision Eigenvector Finder. EF/RF 83-5, CERN, Geneva, Switzerland, July 1983. 271. J. Tückmantel. Application of SAP in URMEL. EF/RF 85-4, CERN, Geneva, Switzerland, July 1985. 272. M.J. Turner, R.W. Clough, H.C. Martin, and L.J. Topp. Stiffness and Deflection Analysis of Complex Structures. J. Aero. Sci., 23:805-823, 1956.
273. H.-G. Unger. Elektromagnetische Theorie für die Hochfrequenztechnik II. Dr. A. Hüthig Verlag, Heidelberg, 1981. 274. M. van den Meijdenberg. On the Iterative Solution of Electrostatic Harmonic Problems with Thin Dielectric or Conducting Layers. Diplomarbeit, Katholieke Universiteit Nijmegen, Netherlands, 1989. 275. H.A. van der Vorst. Bi-CGStab: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems. SIAM J. Sci. Stat. Comput., 13(2):631-644, 1992. 276. H.A. van der Vorst and J.B.M. Melissen. A Petrov-Galerkin Type Method for Solving Ax=b, where A is Symmetric Complex. IEEE Trans. Mag., 26(2):706-708, 1990. 277. U. van Rienen. Zur numerischen Berechnung zeitharmonischer elektromagnetischer Felder in offenen, zylindersymmetrischen Strukturen unter Verwendung von Mehrgitterverfahren. PhD thesis, Darmstadt University of Technology, 1989. 278. U. van Rienen. Mode Matching Technique for Calculating Multi-Cell Cavities. In Third Int. Workshop on Linear Colliders, pages 352-365, Protvino, USSR, Sept. 1991. Vol. 3, Issue 2. 279. U. van Rienen. Higher Order Mode Analysis of Tapered Disc-Loaded Waveguides Using the Mode Matching Technique. Part. Acc., 41:173-201, 1993. 280. U. van Rienen. Numerik großer Gleichungssysteme - Theorie und Praxis. Skriptum zur Vorlesung, Darmstadt University of Technology, FB 18, FG TEMF, July 1996. 281. U. van Rienen, M. Clemens, and T. Weiland. Computation of Low-frequency Electromagnetic Fields. ZAMM, 76(Suppl. 1):567-568, 1996. 282. U. van Rienen, M. Clemens, and T. Weiland. Simulation of Low-Frequency Fields on High-Voltage Insulators with Light Contaminations. IEEE Trans. Mag., 32(3):816-819, May 1996. 283. U. van Rienen, P. Pinder, and T. Weiland. Consistent Finite Integration Approach for the Coupled Calculation of Electromagnetic Fields and Stationary Temperature Distributions. In 7th Biennial IEEE Conference on Electromagnetic Field Computation (CEFC), page 294, Okayama, Japan, March 1996. 284. 
U. van Rienen, G.-A. Voss, and T. Weiland. private communication, 1993. Darmstadt University of Technology, DESY. 285. U. van Rienen and T. Weiland. Higher Order Mode Loss Heating of Vacuum Chamber Joints in HERA p-Ring. HERA 84-13, DESY, Hamburg, June 1984. 286. U. van Rienen and T. Weiland. Triangular Discretization Method for the Evaluation of RF-Fields in Waveguides and Cylindrically Symmetric Cavities. Part. Acc., 20:239-267, 1986/87. 287. U. van Rienen and T. Weiland. Impedance Calculation with URMEL-I Using Multigrid Methods. IEEE Trans. Mag., 26(2):743, 1990. 288. U. van Rienen and T. Weiland. Coupled Calculation of Electromagnetic Fields and Stationary Temperature Distributions. In 5th European Particle Accelerator Conf. (EPAC'96), Sitges, Spain, June 1996. 289. R.S. Varga. Factorization and Normalized Iterative Methods, pages 121-142. Boundary Problems in Differential Equations. University of Wisconsin Press, Madison, 1960. 290. C. Vassallo. On a Direct Use of Edge Condition in Modal Analysis. IEEE Trans. MTT, 24(April):208-212, 1976. 291. P.K.W. Vinsome. ORTHOMIN, an Iterative Method for Solving Sparse Sets of Simultaneous Linear Equations. In Fourth Symposium on Reservoir Simulation, pages 149-159, Society of Petroleum Engineers of AIME, 1976.
292. B. Wagner. Numerische Simulation elektromagnetischer Felder in nichtlinearen Medien. PhD thesis, Darmstadt University of Technology, 1996. 293. J.W. Wang. RF Properties of Periodic Accelerating Structures for Linear Colliders. PhD thesis, Stanford University, Stanford, California, 1989. 294. J.P. Webb. Edge Elements and What They can do for You. IEEE Trans. on Magnetics, 29(2):1460-1465, 1993. 295. T. Weiland. private communication. Darmstadt University of Technology. 296. T. Weiland. Eine Methode zur Lösung der Maxwellschen Gleichungen für sechskomponentige Felder auf diskreter Basis. AEU, 31:116-120, 1977. 297. T. Weiland. Zur Berechnung der Wirbelströme in beliebig geformten, lamellierten, dreidimensionalen Eisenkörpern. Archiv für Elektrotechnik, 60:345-351, 1978. 298. T. Weiland. In 11th Int. Conf. on High Energy Accelerators, page 570, Geneva, 1980. 299. T. Weiland. Comment on Wake Field Computation in Time Domain. NIM, 216:31-34, 1983. 300. T. Weiland. On the Computation of Resonant Modes in Cylindrically Symmetric Cavities. NIM, 216:329-348, 1983. 301. T. Weiland. Transverse Beam Cavity Interaction, Part I: Short Range Forces. NIM, 212:13-21, 1983. 302. T. Weiland. On the Numerical Solution of Maxwell's Equations and Applications in Accelerator Physics. Part. Acc., 15:245-291, 1984. 303. T. Weiland. On the Unique Numerical Solution of Maxwellian Eigenvalue Problems in Three Dimensions. Part. Acc., 17:227-242, 1985. 304. T. Weiland. Ein allgemeines Verfahren zur Lösung der Maxwell'schen Gleichungen und seine Anwendung in Physik und Technik. Physikalische Blätter, 41:380, 1986. 305. T. Weiland. Elektromagnetisches CAD - Rechnergestützte Methoden zur Berechnung von Feldern. Script, Darmstadt University of Technology, May 1995. 306. T. Weiland. High Precision Eigenmode Computation. Part. Acc., 56:61-82, 1996. 307. T. Weiland. Time Domain Electromagnetic Field Computation with Finite Difference Methods. Int. 
Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 9(4):295-319, July-August 1996. 308. T. Weiland, U. van Rienen, P. Hulsmann, W.F.O. Muller, and H. Klein. Investigations of Trapped Higher Order Modes using a 36-Cell Test Structure. Physical Review Special Topics - Accelerators and Beam (PRST·AB), 2:042001-1 - 042001-8, 1999. Art. 04200l. 309. T. Weiland and R. Wanzenberg. Wake fields and impedances. In 1990 Joint US-CERN Accelerator Course, pages 39-79, Hilton Head, South Carolina, November 1990. also DESY M-91-06. 310. T. Weiland et al. Design Study for a 500 GeV Linear Collider. In Int. Linear Accelerator Con/. LINAC'90, Albuquerque, New Mexico, USA, 1990. 311. T. Weiland et al. Status Report of a 500 GeV S-Band Linear Collider Study. In Workshop on Physics and Experiments with Linear Colliders, Saariselkii., Finnland, September 1991. 312. R. Weiss. Error-Minimizing Krylov Subspace Methods. SIAM J. Sci. Comput., 15(3):511-527, May 1994. 313. 0 Widlund. A Lanczos Method for a Class of Non-symmetric Systems of Linear Equations. SIAM J. Numer. Anal., 15:801-812, 1978.
352
References
314. B. Wiik and M. Tigner. Electron-Proton-Colliders beyond HERA. In H. Padamsee, editor, Proceedings of the First TESLA- Workshop, July 1990. CLNS90-1029. 315. J.H. Wilkinson and C. Reinsch. Linear Algebra, volume II of Handbook for Automatic Computation. Springer-Verlag, Berlin, 1971. 316. K. Wille. Physik der Teilchenbeschleuniger und Synchrotronstrahlungsquellen. Studienbiicher Physik. Teubner, Stuttgart, 1992. 317. P. Wilson. ISR-TH 78-23, CERN, 1978. 318. M. Witting. Untersuchung elektromagnetischer Eigenschaften von Wanderwellenrohren mit der Methode der Orthogonalentwicklung. Diplomarbeit, Darmstadt University of Technology, 1992. 319. G. Wittum. On the Robustness of ILU Smoothing. SIAM J. Sci. Stat. Comput., 10:699-717, 1989. 320. G. Wittum. Mehrgitterverfahren. Spektrum der Wissenschajt, pages 78-90, April 1990. 321. H. Wolter. Berechnung akustischer Wellen und Resonatoren mit der FITMethode. PhD thesis, Darmstadt University of Technology, 1995. 322. M. Yamamoto, T. Higo, and K. Takata. Analysis of Detuned Structure by Open Mode Expansion. In 17th International Linac Conference (LINAC'94), Tsukuba, Japan, August 1994. 323. K.S. Yee. Numerical Solution of Initial Boundary Value Problems Involving Maxwell's Equations in Isotropic Media. IEEE-AP, 14:302-307, 1966. 324. D.M. Young. Iterative Solution of Large Linear Systems. Computer Science and Applied Mathematics. Academic Press, New York, 1971. 325. H. Yserentant. On the Multilevel Splitting of Finite Element Spaces. Numer. Math., 49:379-412, 1986. 326. H. Yserentant. Preconditioning Indefinite Discretization Matrices. Numer. Math., 54:719-735, 1988. 327. P.M. Zerwas. e+e- Linear Colliders: Physics Prospects. In ECFA Workshop on e+e- Linear Colliders (Fourth Int. Workshop on Next-Generation Linear Colliders), Garmisch-Partenkirchen, 25 July - 2 August 1992. ECFA 93-154, June 1993. 328. X. Zhan. 
Parallel Electromagnetic Field Solvers using Finite Element Methods with Adaptive Refinement and their Application to Wakefield Computation of Axisymmetric Accelerator Structure. PhD thesis, Stanford University, 1997. 329. M. Zhang. Numerical Optimization of Electromagnetic Components. PhD thesis, Darmstadt University of Technology, 1994. 330. L. Zhuo and F. Walker. Residual Smoothing Techniques for Iterative Methods. SIAM J. Sci. Comput., 15(2):297-312, March 1994. 331. O.C. Zienkiewicz, W.L. Wood, and N.H. Hine. A Unified Set of Single Step Algorithms. Part 1: General Formulation and Applications. Int. J. Num. Meth., 20:1529-1552, 1984.
Symbols
Classical Electrodynamics
dielectric constant; permittivity
dielectric constant of vacuum; $\approx 8.854 \cdot 10^{-12}$ As/Vm
relative permittivity
(electric) conductivity; [1/Ωm]
electric conductivity; [1/Ωm]
thermal conductivity; [W/m·K]
permeability
μ0
permeability of vacuum; $4\pi \cdot 10^{-7}$ Vs/Am $\approx 1.2566 \cdot 10^{-6}$ Vs/Am
relative permeability
wave length
circular cutoff frequency
scalar potential
complex scalar potential
ρ
charge density; [As/m³]
χe
dielectric susceptibility
χm
magnetic susceptibility
c
speed of light
c0
vacuum speed of light; $c_0 = 2.99792458 \cdot 10^{8}$ m/s
A
vector potential
B
magnetic flux; [T] = [Vs/m²]
$\underline{B}$
phasor; complex amplitude of $B(r,t) = \mathrm{Re}(\underline{B}(r)\, e^{i\omega t})$ in complex notation for time-harmonic magnetic flux
D
electric flux; [C/m²] = [As/m²]
$\underline{D}$
phasor; complex amplitude of $D(r,t) = \mathrm{Re}(\underline{D}(r)\, e^{i\omega t})$ in complex notation for time-harmonic electric flux
e
electron charge; $e = 1.60 \cdot 10^{-19}$ As
E
electric field strength; [V/m]
E'
normalized electric field strength
$\underline{E}$
phasor; complex amplitude of $E(r,t) = \mathrm{Re}(\underline{E}(r)\, e^{i\omega t})$ in complex notation for time-harmonic electric field
fc
cutoff frequency
H
magnetic field strength; [A/m]
H'
normalized magnetic field strength
$\underline{H}$
phasor; complex amplitude of $H(r,t) = \mathrm{Re}(\underline{H}(r)\, e^{i\omega t})$ in complex notation for time-harmonic magnetic field
J
current density; [A/m²]
$\underline{J}$
phasor; complex amplitude of $J(r,t) = \mathrm{Re}(\underline{J}(r)\, e^{i\omega t})$ in complex notation for time-harmonic current density
JE
impressed current density
JK
convection current density
JL
conduction current density
Jw
thermal flux; [W/m²]
k
Boltzmann constant; $k = 1.38 \cdot 10^{-23}$ J/K
L
typical length of a setup
M
magnetization
P
polarization density
S
Poynting vector; [W/m²]
t
time
T
temperature
v
velocity vector
u
energy as function of temperature
W
heat source density
We
electric energy density
Wm
magnetic energy density
Y0
admittance
Z0
wave impedance
Mode Matching Technique
resonant frequency
Cmn
coupling matrix element
weighted coupling matrix element
coupling matrix
coupling matrix
identity matrix
field eigenfunction of the electric field
hn
field eigenfunction of the magnetic field
In
current amplitude (Fourier coefficient)
I
vector holding the current amplitudes
M
number of eigenmodes
N
number of eigenmodes
T(E)
cross-sectional eigenfunction of Ez-waves
T(H)
cross-sectional eigenfunction of Hz-waves
Un
voltage amplitude (Fourier coefficient)
U
vector holding the voltage amplitudes
Ws
stored energy
Finite Element Method
basis function
f
right-hand side of differential equation
H(curl; Ω)
function space in edge element formulation
H(div; Ω)
function space in edge element formulation
interaction coefficients
linear differential operator
r
residual
u
unknown field; solution function
coefficients in approximation to u; nodal values
Finite Integration Technique
PE,i
discrete electric scalar potential
discrete magnetic scalar potential
Pm
magnetic charges (!non-physical auxiliary quantity!)
p.
discrete complex scalar potential (EQS)
-~
elementary area
b
vector holding the magnetic (grid-)flux
magnetic (grid-)flux
C
discrete curl operator
discrete curl operator on dual grid
d
vector holding the electric (grid-)flux
electric (grid-)flux
discrete material matrices
e
vector holding the electric (grid-)voltage
electric (grid-)voltage
(G,G)
grid duplet
G
FIT-grid
G
dual FIT-grid
magnetic (grid-)voltage
homogeneous part of magnetic field
inhomogeneous part of magnetic field
i, j, k
index of lexicographic numbering (in u-, v-, w-direction)
I, J, K, N
maximal index of the grid (in u-, v-, w-direction resp. total number of grid points)
total electric (grid-)current
elementary line
grid point
discrete operators corresponding to partial derivatives in x, y and z on dual-orthogonal grids
qi
discrete charges
S
discrete divergence operator
discrete divergence operator on dual grid
discrete gradient operator
discrete gradient operator on dual grid
elementary volume; FIT-cell
Numerical Treatment of Linear Systems
parameter on S in Axelsson's method
... in cg-algorithm
... in cg-algorithm
cycling parameter in multigrid methods
κ2(A)
condition number of matrix A
relaxation parameter of the Kaczmarz algorithm
eigenvalue of system matrix
number of relaxations on grid Gk
convergence factor of Theta-component
smoothing factor
iteration function
BiCG-polynomial relating residual in step k to initial residual
nearby BiCG-polynomial
BiCG-polynomial relating search direction in step k to initial residual
ρ(M)
spectral radius of matrix M
fJ(PdM))
virtual spectral radius of matrix M
Pk
parameter in Chebyshev iteration
acceleration parameter
e
Fourier component of error function
ω
relaxation parameter
A
system matrix
system matrix on grid with step size h
system matrix on grid Gl
element in row i and column j of the system matrix A
b
right hand side
right hand side on grid with step size h
right hand side on grid Gl
C
preconditioning matrix
Chebyshev polynomial of degree k
defect on grid with step size h
defect on grid Gl
D
diagonal matrix
ei
i-th unit vector
Gn
system matrix of transformed system in Axelsson's method
Gh
grid with step size h (fine grid)
GH
grid with step size H (coarse grid)
$I_h^H$
restriction from fine to coarse grid
$I_l^{l-1}$
restriction from fine to coarse grid
$I_H^h$
interpolation from coarse to fine grid
$I_{l-1}^{l}$
interpolation from coarse to fine grid
interpolation from coarse to fine grid in FMG
k
wave number
k'
Km
m-th Krylov subspace
L
lower triangular matrix in Gaussian elimination
M
iteration matrix
n
matrix dimension
P
permutation matrix in Gaussian elimination
Pk
search direction in step k of cg-algorithm
Pk
pseudo-search direction
P
orthogonal projection
Pk
polynomial of degree k
Pk
projection on Vk
residual vector after k iterations
pseudo-residual
R
real part of complex system matrix
s
shift-factor for eigenvalue $k^2$ in multigrid algorithm
S
imaginary part of complex system matrix
tridiagonal matrix in Lanczos algorithm
u
real part of right hand side (Axelsson's method)
U
upper triangular matrix in Gaussian elimination
linear subspace
v
imaginary part of right hand side (Axelsson's method)
error on grid with step size h
error on grid Gl
coefficient in linear combination of approximate solutions after k Chebyshev iterations
Vn,m
error of approximation to solution on two-dimensional grid
affine subspace
weighting factor for under-interpolation in multigrid algorithm
affine subspace
x
solution vector resp. its real part (Axelsson's method)
solution vector on grid with step size h
xl
solution vector on grid Gl
approximation to solution vector on grid with step size h
approximation to solution vector on grid Gl
xl
initial approximation to solution vector on grid Gl (FMG)
x*
fixpoint
x*
exact solution
xk
approximation to solution vector after k iterations
x(k)
approximation to solution vector after k iterations
$x_i^{(k)}$
i-th component of k-th iterative approximation
y
complex part of solution vector (Axelsson's method)
yk
approximation to solution vector after k iterations of Chebyshev algorithm
Index
0-mode, 289, 300, 311 2π/3-mode, 253, 255, 258, 282, 323 36-cell experiment, 295-302, 304-315, 319 π-mode, 253, 255, 289, 300, 311 π/2-mode, 253, 255 e+e- physics, see electron-positron physics * operator, see Hodge operator 3 dB waveguide coupler, 187, 237 A-conjugate, see A-orthogonal A-norm, 100 A-orthogonal - basis, 101, 102 - projection, 101, 102 - vectors, 103 ABC, see absorbing boundary condition absorbing boundary condition, 30 accelerating - gradient, 246, 258 - - loaded, 260 - - SBLC, 265 - mode, 253, 255, 282 - section, 243 - structure, 248, 253, 322 - - aperiodic, 258-264 - - constant gradient, 246, 255, 258-262, 265, 270, 278, 280 - - constant impedance, 258, 278 - - detuned, 258, 262-264 - - loaded, 259 - - periodic, 257-258 - - quasi-constant gradient, 261, 282 - - rotationally invariant, 276 - - SBLC, 265, 271 - - SLAC, 270 - - tapered, 258 - voltage, 255 acceleration of charged particles, 244, 250
accelerator, 243 acoustics, 57 adaptive strategy, 127 admittance, 21 algebraic multigrid, see AMG AMG, 129, 214 AMLl,155 amplitudes - current, 39 - inner, 41 - outer, 42 - voltage, 39 - wave, 41 analytical solution methods, 23 aperiodic acceleration structure, 258-264, 280, 311 - loss parameter curve, 287 approximation quality of matrix inverse, 151 Arnoldi method, 115 asymptotic convergence factor, 94 attenuation, 258, 282 - parameter, 257 auxiliary transitions, 46 Axelsson's method, 158 Axelsson's reduction of a complex linear system to real form, 156-161 B-Factory, 324 backward SOR, 95, 154 balancing transformation of magnetic field, 143 basis function, 51 BCl,274 bead, 303, 310 beam - break-up, 270, 279 - - multibunch, 280 - - single bunch, 279 - dynamics, 246, 268-280 - instabilities, 268, 275, 279-280 - loading, 246, 259, 260
- - fundamental theorem, 259 - - transient, 280 - matched, 270 - mismatched, 270 - optics, 268 - position monitor, 318 beam pipe mode, 309 BEM, 226, 280 Bernoulli's product ansatz, 28 Bessel's differential equation, 28 - general solution, 28 beta function, 269 betatron - motion, 269 - oscillation, 269 bi-orthogonalization, 106 BiCG, 109, 110, 190 BiCG polynomials, 111 BiCGCR,113 BiCGSTAB, 115-117, 178, 185, 187, 190 BiCGstab(l), 118 BiCGstab2, 118, 178, 185, 187 BINP, 248 black box solver, 100, 130, 190 BNS damping, 279 board, 223 Boltzmann constant, 15 bond wires, 189 bordering methods, 150 boundary condition - perfect electric, 42 - perfect magnetic, 42 boundary element method, see BEM boundary operator, 31 boundary value problems, 29-33 - absorbing boundary condition, 30 - open boundary condition, 30 - periodic boundary condition, 30 - potential theory, 29 Dirichlet problem, 29 first boundary value problem, 29 mixed boundary value problem, 29 -- Neumann problem, 29 - - Newton problem, 29 - - second boundary value problem, 29 - - third boundary value problem, 29 - waveguide boundary condition, 31 box schemes and FIT grids, 68 breakdown, 118 - in BiCe, 110 Brillouin diagram, 257 bunch, see particle bunch
- charge, 319 - compression, 248 - distance, 246 - Gaussian, 273 - length, 273 - train, 246, 270 -- SBLC, 265 bunch-to-bunch energy spread, 280 C-magnet, 168, 212 - nonlinear, 216 C-to-R method, 158 - with Chebyshev iteration, 160, 161 - with preconditioning, 158 calculation domain, 57 Cartesian coordinate grid, 57 cavity, 244 cell-to-cell phase advance, 257, 271, 282,311 centroid, 270 CERN, 247, 248 cg, 99-105, 161, 162, 169, 170, 207, 209, 214, 216, 219 - error estimate, 104 - rate of convergence, 105 - stopping criterion, 104 - with ILU(3) preconditioning, 224 CGNE,109 CGNR,109 CGS, 112, 115, 161, 178, 185, 187 CGS2, 113, 178, 185, 187 CGW-method, 154 chain matrix, 41 charge density, 12 charged particle, 243, 260 Chebyshev acceleration, 98 Chebyshev iteration, 96-99, 150, 160 - and SSOR, 95 - error estimate, 99 Chebyshev polynomials, 97 chip, 237 choke-mode cavity, 317 Cholesky-factor, 152 circuit breaker, 221 classification of electromagnetic fields, 22 CLIC, 248, 249 coarse grid, 126 - correction, 126 coarsening for high frequency problems, 147 coce, 111, 177, 181 - with Minimal Residual Smoothing (COCG-MRS), 182
collective effects, 268 COM, 281, 290-295 complex matrices - indefinite, 184 - special properties, 76-77 - symmetric, 184 - symmetric positive stable, 177 complex scalar potential, 24, 72, 224 computational domain, 31 computational effort, 31, 164, 172, 178, 182, 187 - BiCG, 111 - BiCGSTAB, 118 - BiCGstab(l), 118 - CGS, 112 - CGS2, 113 - COCG, 111 - GCG-LS, 121 - GMRES, 116 - model problem, 164-168 - SCBiCG, 113 condition number, 104, 150, 164 conductive contamination, 226 conductivity, 12 conjugate, 99 Conjugate Gradient algorithm, see cg connection coefficients, 51 conservation of charges, 13 conservation of energy, 14 consistent discretization method, 56, 64 constant gradient structure, 246, 255, 258-262, 280, 309, 315, 319 constant impedance landings, 261, 282 constant impedance structure, 246, 258, 310 contaminated insulator, 180 continuity condition, 40, 54 continuity equation, 13 convergence - behaviour, 178, 180-182, 187, 189 - - dependence on material parameters, 182 - factor, 124 - speed, 92 cooling of cavities, 323 Coulomb gauge, 23 coupled - calculation, 220 - problem, 205 - temperature problems, 320-324 coupled circuit model, 280 - verification, 287 coupling
- factor, 303 - impedance, 276 - matrix, 40 critical voltage, 225 cross-talk, 189, 237 cryostat, 323 cumulative beam instability, 270, 280 cup, 250, 262 Curl-Curl Equation, 75, 234 current - amplitude, 39 - density, 12 - - conduction rv, 12 -- convection rv, 12 -- impressed rv, 12 - sensor, 174, 213 cut-off - diagram, 289 - number, 250 - tubes, 304 cylindrically symmetric structures, 36 DAE,233 damper, 271 damping, 244, 257, 319 - constant, 258 - factor, 258, 311 - global, 317-318 - local, 315 - strong, 311 - weak, 311 dark currents, 249 decay time, 273, 275 decoupling by differentiation, 25 defect, 125 - equation, 125 delta function, 271 design trajectory, 269 DESY, 248 determinant - after LV decomposition, 89 detuned structure, 258, 262-264 detuning, 280, 318 dielectric - constant, 12 - susceptibility, 12 differential-algebraic equations, 233 diffusion constant, 12 dipole component, 276 dipole modes, 38, 258, 260, 278, 283, 305,309 - strongly interacting, 287 direct Lanczos algorithm, 107 directional coupler, 237
Dirichlet - (boundary value) problem, 29 - boundary condition, 212, 272 discharge, 225 - voltage, 225 discrete - charges, 61 - curl operator, 60, 63 - divergence operator, 63 - gradient operator, 63 discretization methods, 35 - finite difference method, see FD - finite element method, see FEM - finite integration technique, see FIT - finite volume method, 56 dispersion - curve, 251, 253 - - for periodic structures, 257 - function, 269 - relation, 250 displacement current, 225 dissipative heating, 320 divergence, 268 dual - FIT grid, 60 - - dual-orthogonal, 60, 61 - - for uniform coarsening, 134 - - special properties for dualorthogonal grids, 66 - formulation, 54 - grid, 60 - problem, 53 duality methods, 52 duality theory, 52 Ez-waves, 27 eddy currents, 320 edge, 54 - condition, 44, 46 - element formulation, 54, 55 effective emittance, 270 eigenfrequency, 244 eigenfunctions, 32 - cross-sectional eigenfunctions of E z and Hz-waves, 29 - expansion, 32 - field eigenfunctions, 38 - modes, 38 eigenvalue, 32 - distribution, 184 - of E z- and Hz-waves (Helmholtz equation for circular cylindrical waveguides), 29
elastodynamics, 57 electric - (grid) current, 61 - (grid-)voltage, 58 - charge density, 12 - conductivity, 12 - displacement, 11 - field, 220, 227, 237, 239, 244, 276, 300 - - static, 207 - field strength, 11 - flux, 11, 61 - scalar potential, 23, 24 electro-quasistatic potential, 178, 227 electro-quasistatics (EQS), 17, 160, 177, 205, 224-233 - complex scalar potential, 24 - fundamental equations, 18 - with FIT, 72-73 electro-thermal behaviour, 320 electrode shape, 227 electrolytic partial discharge erosion, 226 electromagnetic fields, 11 - scattered, 272 electromagnetic waves, 21-22, 243 electron charge, 15 electron-positron collider, 247 electron-positron physics, 245 electrostatic model - vs. electro-quasistatic model, 227 electrostatic potential, 178, 207, 209 electrostatics, 16, 162, 205-211 - with FIT, 70 elementary - area, 57 - line, 57 - volumes, 57 elementary particle, 243 emittance, 268-270, 319 - effective, 270 - ellipse, 269 - growth, 270, 319 - multibunch, 270 - single bunch, 270 energy, 260, 318 - center of mass, 247 - conversion, 260 - density, 13, 324 -- electric, 13 - - magnetic, 13 - flux, 14 - gain, 258
- - in constant gradient structures, 260 - inner as function of temperature, 15 - kinetic, 244 - loss, 273, 274, 278, 322 - norm, 99, 100 - per cell - - in constant gradient structures, 315 - spread, 279 - - bunch-to-bunch, 280 - stored, 254, 278, 300 - supply, 244 - total, 302 - velocity, 251 epoxy-resin, 227 EQS, see electro-quasistatics equivalent circuit, 226, 290, 295 error estimate - cg, 104 - Chebyshev iteration, 99 error smoothing, 122, 123 exact breakdown, 108 excited fields, 273 expansion - into a series of eigenfunctions, 32 - multipole - - of wake potentials, 276 - orthonormal series, 33 FD, 35, 48, 280 FDTD, 57 FEM, 48-56, 226, 280 - conforming approximation, 53 - degrees of freedom, 53 - edge element formulation, 54 - element properties, 50 - external approximation, 53 - for Maxwell's equations, 54 - Hermite type elements, 53 - internal approximation, 53 - interpolation, 50 - Lagrange type elements, 53 - mixed, 54 - non-conforming approximation, 53 - system assembly, 50 - system solution, 50 - with Whitney forms, 54 field distribution - in 36-cell structure, 309 - in constant gradient structures, 260 field eigenfunctions, 38 fill-in, 90 filling time, 255, 315 filter
- weighting eigenfunctions, 44 - weighting the coupling matrix, 44 fine grid, 126 finite difference method, see FD Finite Difference Time Domain, see FDTD finite element method, see FEM finite elements, 48 - one-dimensional, 49 - three-dimensional, 49 - two-dimensional, 49 finite integration technique, see FIT first boundary value problem, 29 first law of thermodynamics, 15 FIT, 36, 56-76, 162, 178,205,226,244, 274, 281, 295, 323 - analogy with FD, 66 - cell, 57 - convergence of loss parameter, 289 - correspondence between discrete field quantities and state variables, 66 - coupled thermal problems, 320 - discretization of integrals, 66 - grid, 57 - - correspondence to box schemes, 68 - - discrete charges, 61 -- dual, 60 -- dual-orthogonal, 60, 61 -- grid cells, 57 - - scalar potential, 62 - - state variables, 58 - - three-dimensional dual nonorthogonal grids, 68 -- total electric current, 61 - - triangular grids, 68 - material matrices, 66 - properties, 64-65 - system matrices, 78-79 FIT grid - electric - - (grid-) voltage, 58 - magnetic - - gradient, 62 -- voltage, 61 Five-Point-Star, 94 fixed point, 92 - methods, 92 - - acceleration, 96 Floquet's theorem, 30, 257 FMC, 148 FMG approach, 129 force - exerted by wake fields, 274
form factor, 304, 310 forward SOR, 154 Fourier coefficients, 39 Fourier series, 276 Fourier's law, 14 Fourier-Bessel series, 36, 37 - truncation, 44 frequency, 250 - accelerating - - tuning, 259 - excitation, 76, 79 - resonant, 42 Full MultiGrid, see FMG fundamental theorem of beam loading, 259 Galerkin approach, 50, 130 Galerkin's criterion, 51 Gauss algorithm, 161 Gauss-Seidel, 162 - algorithm, 126, 146 - method, 164 - - for restart, 172 - preconditioning, 154 Gaussian bunch, 273 Gaussian elimination, 89 - matrix inversion, 89 GCG-LS, 114, 120, 161 GCR,114 Generalized Conjugate Gradient, Least Squares method, see GCG-LS Generalized Conjugate Residual method, see GCR Generalized Minimal Residual, see GMRES geometrical mode ratio, 44, 47 ghost modes, 55 Gibbs' phenomenon, 44 GMBACK,114 GMERR,114 GMRES, 114, 115 - restarted version, 116 GMRES(l), 116 grid - Cartesian coordinate, 57 - in multigrid scheme, 127 - non-coordinate, 57 - non-orthogonal, 57 - operator, 131 - pair, 57 grid-dependent eigenvalue shift, 140-141 group velocity, 251, 255, 257, 258, 282
- linear variation, 259 Hi-field, 71 Hz-waves, 27 Hall element, 219 hard materials, 16 harmonic - oscillation, 27, 244 -- with FIT, 74-76 - time dependence - - electro-quasistatic, 17 - - magneto-quasistatic, 18 heat - flow, 322 - load, 323 - source, 223, 324 - source density, 15 Helmholtz equation, 27, 75, 234 - homogeneous, 27 - inhomogeneous, 27 - scalar, 28 - - solution, 28 - vectorial, 28 Hermitian part, 77 high energy physics, 243, 247 high frequency - error component, 124 - fields, 74 - waves, 243 high voltage, 225 - insulators, 224-233 Higher Order Modes, see HOM Hodge operator, 55 HOM, 246, 250, 265, 279, 280, 319, 324 - 36-cell experiment, 295-302, 304-315 - computation, 281 - coupler, 315 - dampers, 288, 299 - damping, 247, 280, 315-319 - - by iris coating, 288 - in constant gradient structures, 288-295 - in quasi-constant gradient SBLC-like structure, 281-288 - numerical analysis, 280-295 - trapped - - in aperiodic iris-loaded waveguides, 295-302, 304-315 HOM dampers, 310 hybrid methods, 116 - comparison, 121 hysteresis, 16 IC, 152,223
IC preconditioning, 162, 170 ICCG method, 174 ILU, 152, 161 - in accordance with matrix structure, 152 - with thresholds, 154 ILUβ(0), 153 ILUβ(3), 153 ILUω, 164 ILUω preconditioning, 167, 171, 174 ILU(0), 152 ILU(k), 153 ILUT(k), 154 image charge, 272 impedance, 149, 276, 295 - matrix, 43 implicit complex-valued split Jacobi preconditioning, 182 incomplete Cholesky, see IC incomplete LU decomposition, see ILU induction law - discrete form, 60 inductive - coil, 320 - oven, 320 - soldering, 320 inductor, 320 injection error, 270 inner amplitudes, 41 input coupler, 247, 253, 257 insulator, 180, 225 interaction coefficients, 51 interaction region, 248 intermediate steps, 46 interpolation, 126 involutory matrix, 78 iris-loaded waveguide, 36 isotropic media, 16 iteration - error, 92 - function, 92 iterative methods - classical iteration methods, 91 - Krylov subspace methods, 91 Jacobi - matrix - - eigenvalues and eigenvectors, 163 - method, 92, 164 - preconditioned BiCGCR, 180, 184 - preconditioning, 170, 176 jitter, 270 JLC, 248, 249
Joule's energy, 320, 322 Joule's heat, 14 Kaczmarz algorithm, 146 Kaczmarz method, 95 KEK,248 kick factor, 278 Krylov subspace, 87, 99, 102 - methods, 99-121,150,177, 181, 184, 205 -- comparison, 121 L-band,248 Lanczos algorithm, 99 - for linear systems, 107 - for non-hermitian matrices, 106 Lanczos-type algorithms, 105 landings, 261, 282 Laplace equation, 23 - on a rectangular domain, 32 Large Hadron Collider, 247 leap frog scheme, 57 least squares fit, 33 left-handed preconditioning, 164 LEP, 272 lexicographic ordering, 94 LHC, see Large Hadron Collider linear accelerator, 248 linear collider, 245 - beam dynamics, 268-280 - studies -- concluding remarks, 319 linear media, 16 linear system, 55 linewise relaxation, 94, 146 Liouville's Theorem, 270 loaded gradient, 260 local mode analysis, 131 longitudinal position, 268 Look-Ahead Lanczos, 108 Look-Ahead strategy, 118 Lorentz - contraction, 271 - factor, 250, 271 - force - - generated by wake fields, 274 - gauge, 25 loss factor, 278 loss parameter, 246, 260, 270, 278-279, 281, 285, 300, 309 - curve, 289 - - dependence on geometry, 294 - - dependence on tapering, 294 - - for constant gradient structure, 294
- - for constant impedance structure, 294 - normalized, 278 loss parameter curve, 293 lossy cell, 318, 319 lossy sheets, 299 LU decomposition, 152 - with partial pivoting, 88 luminosity, 279, 319 lumped circuit - for resonance, 303 - model for constant gradient structure, 290 MAFIA, 162, 274, 281, 289, 292, 295, 306, 311, 320 magnet - dipole, 268 - quadrupole, 268 - sextupole, 268 magnetic - (grid) flux, 59 - charges (auxiliary quantity), 71 - field strength, 11 - flux, 11 - - density, 212, 214, 215 - induction, 11 - scalar potential (auxiliary quantity), 71, 211 - susceptibility, 12 - vector potential, 23, 216 - voltage, 61 magnetization, 12, 302 magneto-quasistatics (MQS), 17, 18, 233-234 - with FIT, 73, 234 magneto-thermal behaviour, 320 magnetostatics, 16, 162, 211-216 - with FIT, 70-71 mass matrix, 55 matching, 36, 38 material - discrete approximation, 66 - dissipative, 253 - matrices, 66 material aging, 225 matrix condition, 212 Maxwell, 11 Maxwell Grid Equations, 63 - setup, 57 Maxwell's equations, 11-13, 54 - Ampere's law - - discrete form, 62
- as an initial-boundary value problem, 21 - divergence equation - - discrete form, 62 - for quasistatic fields, 17 - - electro-quasistatics, 17 - - magneto-quasistatics, 18 - for stationary fields, 16-18 - for time-harmonic fields, 22, 27 - in differential form, 12 - in integral form, 12, 56 - induction law, 58 - - discrete form, 60 - non-existence of magnetic sources, 60 - - discrete form, 60 MBBU, see multibunch beam break-up measurement, 228 - experiments, 288, 294 - - 36-cell experiment, 295-302, 304-315 - non-resonant bead pull, 303 - resonant bead pull, 294, 302-304 - sampling frequency, 314 measuring velocity, 314 mesh generation, 48 method of integral equations, 35 method of moments, 35 method of separation, 26-29 MIC, 176 - preconditioning, 178 MICry, 162, 164 - preconditioning, 167, 171, 174 MIC w (k),154
microchip, 189, 237 microwave systems, 243 MILU(k), 153 Minimal Residual smoothing, 113 MINRES, 114 mirror fields, 142 misalignment, 270 mixed boundary value problem, 29 mixed Finite Element Method (FEM), 54 mixed mode, 286 mode - 2π/3-, 253, 255, 258, 282 - π-, 253, 255 - π/2-, 253, 255 - acceleration, 255, 282 - separation, 297, 309 mode matching technique, 35-47, 244, 265, 280, 295 - auxiliary transitions, 290
- convergence, 44-47, 289 - convergence criteria, 44 - - edge condition, 44 - - geometrical mode ratio, 44 - filter in, 290 - speed up by interpolation, 43 - system matrices, 77 model problem, 162 modes, 38, 244, 253, 257 - acceleration, see accelerating mode - dipole, see dipole modes - Higher Order, see HOM - monopole, 38, 278 - parasitic, see parasitic modes - quadrupole, 38 - trapped, see trapped modes modified IC, see MIC modified ILU, see MILU(k) modified ILUω preconditioning, 176 momentum, 273 momentum coordinates, 268 monopole component, 276 MQS, see magneto-quasistatics MR polynomials, 117 multibunch - beam break-up, 280 - dynamics - - longitudinal, 280 - - transverse, 280 - emittance, 270 - instabilities, 279-280, 319 - operation, 268, 270 multigrid, 214 - algorithm, 184 - for indefinite, nearly singular system, 138 - operator, 131 - scheme, 127 - techniques (MG), 121 multilevel preconditioner, 122, 155 multiply coupled oscillator model, see COM natural frequency, 42 near-breakdown, 108 nearly singular, 149 nested multigrid iteration, 129 Neumann - (boundary value) problem, 29 Neumann boundary condition, 212 Newton - (boundary value) problem, 29 NLC, 248, 249, 297, 318
non-Hermitian, 100 non-resonant bead pull measurement, 303-305 non-stationary methods, 83, 85 non-symmetric and indefinite matrices, 184 nonlinear C-magnet, 216 normalization, 21 Ohmic losses, 253 open boundary condition, 30, 212 optimal relaxation parameter, 93, 164 - for SOR, 171 optimization of technical components, 243 ORTHO, 281, 289, 293, 295, 306 ORTHODIR, 114 orthogonal - Fourier series, 36 - functions, 31 - matrix, 78 - projection, 96, 101 ORTHOMIN, 120 orthonormal series expansion, 33 outer amplitudes, 42 over-relaxation, 92 Panofsky-Wenzel Theorem, 275, 276 parallel plate capacitor, 177 parasitic - fields, 244, 274 - modes, 245, 246, 250, 258, 260, 268, 284 partial - pivoting, 88 - SSOR preconditioning, 176, 184, 187 partially filled FIT cell, 136 particle bunch, 244, 246 pass band, 251, 258, 286 - overlap, 286 PBiCGCR, 182 PBiCGCR-MRS, 182 PCG,214 PCOCG,182 perfect - electric boundary condition, 42 - magnetic boundary condition, 42 perfectly conducting wall, 272, 273 periodic acceleration structure, 257-258 - loss parameter curve, 287 periodic boundary condition, 30, 323 permeability, 12 - nonlinear, 216 permittivity, 12
permutation matrix, 88 perturbation theory, 279, 315 Petrov-Galerkin condition, 102, 110, 118 phase - advance, 253, 269 -- cell-to-cell, 257, 271, 282, 286, 294, 311 - angle, 260 - constant, 250 - ellipse, 269 - - invariance, 270 - space, 268 - - projection, 268 - velocity, 251, 257, 260 phasers, 18, 276 pick-up monitors, 288, 319 pivot element, 87 plate capacitor, 243 plug, 205 point charge, 271 - highly relativistic - - field of, 271 Poisson equation, 23 - complex (electro-quasistatics), 24 polarization, 11, 302 - density, 11 polynomial preconditioning, 155 polynomial representation - BiCG, 111 - BiCGSTAB, 117 - CGS, 112 - CGS2, 113 - COCG, 112 positive definite matrix, 77, 162 positive semi-definite matrix, 77 positive stable matrix, 77 potential equation, 23 potential theory, 23-25, 29 power, 257 - absorption, 315 - conversion, 258 - density, 14, 324 - flow, 253 - loss, 253 - rf-, 255 - supply, 253 power-loss method, 253, 323 Poynting vector, 14 Poynting's theorem, 14 preconditioned methods, 182 - cg, 161, 166, 214 - - on circular cylindrical grids, 175
- CGS, 187 - COCG, 184 preconditioner - SSOR,95 preconditioning, 150-156 - left-handed, 151 - of Krylov subspace methods, 190 - one-sided, 151 - right-handed, 151 - split, 151 principle of multigrid approach, 123 product ansatz of Bernoulli, 28 projection method, 110 propagation constant, 257 pseudo-residual, 110 PSSOR-BiCGSTAB, 189 PSSOR-CGS, 187, 189 PSSOR-TFQMR, 187, 189 QMR, 108, 118, 184 quadrupole component, 276 quality factor, 255, 259, 275, 295, 306, 318 Quasi Minimal Residual, see QMR quasi-constant gradient structure, 261, 282 quasi-symmetric matrix, 77 quasistatic approximation, 18 quasistatic fields, 17 - conditions for, 18-20 - electro-quasistatics, 17 - magneto-quasistatics, 17 - with FIT, 72-73, 234 rate of convergence - cg, 105 re-entrant corners, 148 real symmetric matrix, 162 Red-Black Gauss-Seidel, 94 red-black ordering, 94 Red-Black SOR, 94 reflection coefficient, 237 reflection matrix, 42 relative iteration error, 92 relative residual, 92, 169 relaxation, 92, 125 - for indefinite problems, 146 - optimal relaxation parameter, 93 residual, 51, 92 resistive wall wakes, 272 RESO, 281, 306 resonance, 244 resonant - bead formula, 302
- bead pull measurement, 294, 302-303 - cavity, 244 - frequency, 42, 244, 303, 306 - mode, 244 restart, 170 restriction, 126 rf deflection, 279 rf-load, 324 rf-window, 323 Ritz-Galerkin approximation, 101 Robbin's boundary condition, 31 robustness, 100 Rogowski profile, 227 rounding errors, 90 row scaling, 88 row-action procedure, 95 S-band, 245, 248, 249 S-band structure, 320 SBBU, see single bunch beam break-up SBLC, 245, 248, 249, 265-268, 271, 280, 297, 318-320, 323 - accelerating module, 266 - damped accelerating structure, 318 - main parameters, 265 - test facility, 318 - tunnel layout, 266 scalar potential, 23, 35 - for electro-quasistatics, 24 - for stationary current fields, 24 scalar potential function, 23 scalar wave equation, 25 scattering matrix, 38, 41-44, 80, 281 - concatenation, 41 SCBiCG(r, n), 113 SCBiCG(r, n) class, 177 search directions, 103 second boundary value problem, 29 semi-analytical methods, 35, 280 - method of integral equations, 35 - method of moments, 35 - mode matching technique, 35 semi-coarsening, 150 semiconductor, 220 sensor, 213, 214 separation equation, 29 separative plane, 43 SGS, 95 SGS preconditioning, 154, 170 shunt impedance, 255, 259, 275, 306, 318 simple plate capacitor, 178 - with layer of water, 178
simple test example with current excitation, 184 single bunch - beam break-up, 279 - dynamics - - longitudinal, 279 - - transverse, 279 - emittance, 270 - instabilities, 279-280, 319 singularity, 54 skew projection, 107 skew-Hermitian part, 77 skin depth, 253 SLAC, 248 - two-mile accelerator, 270 Slater formula, 303 SLC, 247, 272 slow wave structure, 30, 250 slowly varying fields, 74, 225 smoothing factor, 124 smoothing procedure, 125 soft materials, 16 SOR, 93, 162, 169, 170, 216 - convergence, 93 - sensitivity to the relaxation parameter, 171 - with optimal relaxation parameter, 164, 176 source field, 75 source-free field, 75 space harmonics, 257 spd matrix, 100, 162 spectral condition number, 159 spectral radius, 92 - of Jacobi matrix, 164 spectrum, 150 speed of light, 250 split preconditioner, 151, 154 spurious modes, 55 SSOR, 95, 164 - preconditioning, 154, 166, 171, 174 stabilized BiCG, 116-118 staggered grids, 57 stagnation, 185, 187 - in BiCGstab(2), 118 standing wave, 42 state variables, 58 - allocation, 58 - correspondence to discrete field quantities, 66 - of a partially filled elementary area, 136 stationary
- current fields, 16, 217-221 - - scalar potential, 24 - - with FIT, 71 - current problems, 162 - fields, 15-17 - - with FIT, 70-72 - heat conduction, 223-224 - temperature fields, 321 - - with FIT, 72 - temperature problems, 162 stationary methods, 83 stop band, 251 stopping criterion, 92 - for cg, 104 storage requirement - BiCG, 111 - BiCGSTAB, 118 - BiCGstab(l) 2l+5 vectors, 118 - CGS, 112 - CGS2, 113 - COCG, 111 - GMRES, 116 - non-Hermitian Lanczos algorithm, 107 - of Gaussian elimination, 89 - of iterative methods, 91 storage ring, 243, 245, 247 stored energy, 43, 254 subdivision - for mode matching, 36 successive over-relaxation, see SOR superconductive accelerating structure, 248 superposition of waves, 42 surface charge, 272 symmetric Gauss-Seidel, see SGS symmetric matrix, 78 symmetric SOR, 95 symmetry plane, 212 SYMMLQ, 107 synchrotron radiation, 245, 247 tapered, 258 TBA, see Two-Beam Accelerator TBCI, 274 TE01-mode, 323 TE-waves, 27 TEAM benchmark problem, 234 TEM-waves, 27 temperature, 14 - distribution, 224, 323, 324 - problems, 57, 223 TESLA, 248, 249, 323
TFQMR, 118, 119, 177, 178, 185, 187 thermal - conduction, 14 - conductivity, 14 - flux, 14 - load, 258 third boundary value problem, 29 three term recurrence, 87 three-dimensional dual non-orthogonal FIT grids, 68 time-dependence - general time-dependence, 21 - harmonic time-dependence, 22 - - electro-quasistatic, 17 - - magneto-quasistatic, 18 time-dependent - fields, 239 - - with FIT, 74-76 time-harmonic - fields, 184, 234-239 - - electro-quasistatic, 17 - - magneto-quasistatic, 18 - oscillations, 22 TM010 mode, 244 TM-waves, 27 toroid electrode, 227 tracking, 270 transmission, 303 transmission matrix, 42 transport matrix, 270 Transpose-free Quasi-Minimal Residual, see TFQMR transversal coordinates, 268 transversal waves, 27 - transverse electric, 27 - transverse magnetic, 27 transverse motion, 268 trapped modes, 43, 245, 247, 281, 286-302, 304-315, 319 - field pattern, 287 - geometry studies, 288-289 traveling wave, 42, 244 - structure, 249 - tube, 320 traveling wave tube - iris-loaded, 258 triangular FIT grids, 68 tuning, 259 Two-Beam Accelerator (TBA), 249 two-grid method, 126 two-grid operator, 131 ultra-relativistic particle, 244 under-interpolation, 145
under-relaxation, 93 uniform coarsening of a FIT grid, 134 unit cube, 162 URMEL, 147 URMEL-I, 147, 149 URMEL-T, 278, 281, 288, 289, 295 V-cycle, 128, 148 vacuum system, 323 vector potential, 23, 25, 27, 35 vector wave equation, 25 velocity of light, 12 velocity sensor, 214 virtual spectral radius, 97 VLEPP, 248, 249 voltage, 255, 278 - amplitude, 39 - beam-induced, 260 - distribution (diagram), 283, 284, 297 - effective, 260 - generator-, 260 W-cycle, 128, 148 wake fields, 246, 249, 259, 268, 270-278, 319 - long range, 264, 275, 280 - - recoherence, 262 - long range forces, 277 - - HOM, 277 - longitudinal, 279 - resistive wall, 272 - short range, 271, 275, 279 - short range forces, 277 - transverse, 279 wake function, 277, 278 wake potential, 274-278 - m-pole component, 276 - longitudinal, 274 - - multi pole expansion, 276 - multi pole expansion, 276 - transverse, 274 - - multi pole expansion, 277 wall currents, 323
wall losses, 300, 320-322 water droplet, 227 wave - amplitude, 41 - impedance, 21 - length, 251 - number, 250 wave equation, 25 - for a damped wave, 26 - - inhomogeneous equation for electric field, 26 wave front - reflection of, 273 waveguide, 237, 243, 323 - circular cylindrical, 27 -- iris-loaded, 36 - lossy termination, 324 - mode matching for cylindrical waveguide, 39 - with load, 324 waveguide boundary condition, 31, 184, 324 waveguide coupler, 187 waves - in circular cylindrical waveguides, 27 - - TEM-waves, 27 - incident, 41 - reflected, 41 - standing, 42, 257 - superposition, 42 - transmitted, 41 - traveling, 42, 257 weak formulation, 54 weighted residual method, 50 Whitney forms, 54 Wiedemann-Franz' law, 15 X-band, 248 Young's Theorem, 93 Z-matrix, 43 Zwischenmediumsmethode, 46
Editorial Policy
§1. Volumes in the following four categories will be published in LNCSE:
i) Research monographs
ii) Lecture and seminar notes
iii) Conference proceedings
iv) Textbooks
Those considering a book which might be suitable for the series are strongly advised to contact the publisher or the series editors at an early stage.
§2. Categories i) and ii). These categories will be emphasized by Lecture Notes in Computational Science and Engineering. Submissions by interdisciplinary teams of authors are encouraged. The goal is to report new developments - quickly, informally, and in a way that will make them accessible to non-specialists. In the evaluation of submissions, timeliness of the work is an important criterion. Texts should be well-rounded, well-written and reasonably self-contained. In most cases the work will contain results of others as well as those of the author(s). In each case the author(s) should provide sufficient motivation, examples, and applications. In this respect, Ph.D. theses will usually be deemed unsuitable for the Lecture Notes series. Proposals for volumes in these categories should be submitted either to one of the series editors or to Springer-Verlag, Heidelberg, and will be refereed. A provisional judgment on the acceptability of a project can be based on partial information about the work: a detailed outline describing the contents of each chapter, the estimated length, a bibliography, and one or two sample chapters - or a first draft. A final decision whether to accept will rest on an evaluation of the completed work which should include
- at least 100 pages of text;
- a table of contents;
- an informative introduction perhaps with some historical remarks which should be accessible to readers unfamiliar with the topic treated;
- a subject index.
§3. Category iii). Conference proceedings will be considered for publication provided that they are both of exceptional interest and devoted to a single topic. One (or more) expert participants will act as the scientific editor(s) of the volume. They select the papers which are suitable for inclusion and have them individually refereed as for a journal. Papers not closely related to the central topic are to be excluded. Organizers should contact Lecture Notes in Computational Science and Engineering at the planning stage. In exceptional cases some other multi-author-volumes may be considered in this category.
§4. Category iv). Textbooks on topics in the field of computational science and engineering will be considered. They should be written for courses in CSE education. Both graduate and undergraduate level are appropriate. Multidisciplinary topics are especially welcome.
§5. Format. Only works in English are considered. They should be submitted in camera-ready form according to Springer-Verlag's specifications. Electronic material can be included if appropriate. Please contact the publisher. Technical instructions and/or TeX macros are available via http://www.springer.de/author/tex/help-tex.html; the name of the macro package is "LNCSE - LaTeX2e class for Lecture Notes in Computational Science and Engineering". The macros can also be sent on request.
General Remarks
Lecture Notes are printed by photo-offset from the master-copy delivered in camera-ready form by the authors. For this purpose Springer-Verlag provides technical instructions for the preparation of manuscripts. See also Editorial Policy. Careful preparation of manuscripts will help keep production time short and ensure a satisfactory appearance of the finished book. The actual production of a Lecture Notes volume normally takes approximately 12 weeks.
The following terms and conditions hold:
Categories i), ii), and iii): Authors receive 50 free copies of their book. No royalty is paid. Commitment to publish is made by letter of intent rather than by signing a formal contract. Springer-Verlag secures the copyright for each volume. For conference proceedings, editors receive a total of 50 free copies of their volume for distribution to the contributing authors.
Category iv): Regarding free copies and royalties, the standard terms for Springer mathematics monographs and textbooks hold. Please write to [email protected] for details. The standard contracts are used for publishing agreements.
All categories: Authors are entitled to purchase further copies of their book and other Springer mathematics books for their personal use, at a discount of 33.3 % directly from Springer-Verlag.
Addresses:
Professor M. Griebel, Institut für Angewandte Mathematik der Universität Bonn, Wegelerstr. 6, D-53115 Bonn, Germany, e-mail: [email protected]
Professor D. E. Keyes, Computer Science Department, Old Dominion University, Norfolk, VA 23529-0162, USA, e-mail: [email protected]
Professor R. M. Nieminen, Laboratory of Physics, Helsinki University of Technology, 02150 Espoo, Finland, e-mail: [email protected]
Professor D. Roose, Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001 Leuven-Heverlee, Belgium, e-mail: [email protected]
Professor T. Schlick, Department of Chemistry and Courant Institute of Mathematical Sciences, New York University and Howard Hughes Medical Institute, 251 Mercer Street, Rm 509, New York, NY 10012-1548, USA, e-mail: [email protected]
Springer-Verlag, Mathematics Editorial, Tiergartenstrasse 17, D-69121 Heidelberg, Germany, Tel.: *49 (6221) 487-185, e-mail: [email protected], http://www.springer.de/math/peters.html
Lecture Notes in Computational Science and Engineering
Vol. 1 D. Funaro, Spectral Elements for Transport-Dominated Equations. 1997. X, 211 pp. Softcover. ISBN 3-540-62649-2
Vol. 2 H. P. Langtangen, Computational Partial Differential Equations. Numerical Methods and Diffpack Programming. 1999. XXIII, 682 pp. Hardcover. ISBN 3-540-65274-4
Vol. 3 W. Hackbusch, G. Wittum (eds.), Multigrid Methods V. Proceedings of the Fifth European Multigrid Conference held in Stuttgart, Germany, October 1-4, 1996. 1998. VIII, 334 pp. Softcover. ISBN 3-540-63133-X
Vol. 4 P. Deuflhard, J. Hermans, B. Leimkuhler, A. E. Mark, S. Reich, R. D. Skeel (eds.), Computational Molecular Dynamics: Challenges, Methods, Ideas. Proceedings of the 2nd International Symposium on Algorithms for Macromolecular Modelling, Berlin, May 21-24, 1997. 1998. XI, 489 pp. Softcover. ISBN 3-540-63242-5
Vol. 5 D. Kröner, M. Ohlberger, C. Rohde (eds.), An Introduction to Recent Developments in Theory and Numerics for Conservation Laws. Proceedings of the International School on Theory and Numerics for Conservation Laws, Freiburg / Littenweiler, October 20-24, 1997. 1998. VII, 285 pp. Softcover. ISBN 3-540-65081-4
Vol. 6 S. Turek, Efficient Solvers for Incompressible Flow Problems. An Algorithmic and Computational Approach. 1999. XVII, 352 pp, with CD-ROM. Hardcover. ISBN 3-540-65433-X
Vol. 7 R. von Schwerin, Multi Body System SIMulation. Numerical Methods, Algorithms, and Software. 1999. XX, 338 pp. Softcover. ISBN 3-540-65662-6
Vol. 8 H.-J. Bungartz, F. Durst, C. Zenger (eds.), High Performance Scientific and Engineering Computing. Proceedings of the International FORTWIHR Conference on HPSEC, Munich, March 16-18, 1998. 1999. X, 471 pp. Softcover. ISBN 3-540-65730-4
Vol. 9 T. J. Barth, H. Deconinck (eds.), High-Order Methods for Computational Physics. 1999. VII, 582 pp. Hardcover. ISBN 3-540-65893-9
Vol. 10 H. P. Langtangen, A. M. Bruaset, E. Quak (eds.), Advances in Software Tools for Scientific Computing. 2000. X, 357 pp. Softcover. ISBN 3-540-66557-9
Vol. 11 B. Cockburn, G. E. Karniadakis, C.-W. Shu (eds.), Discontinuous Galerkin Methods. Theory, Computation and Applications. 2000. XI, 470 pp. Hardcover. ISBN 3-540-66787-3