ADVANCES IN IMAGING AND ELECTRON PHYSICS VOLUME 136
EDITOR-IN-CHIEF
PETER W. HAWKES CEMES-CNRS Toulouse, France
ASSOCIATE EDITOR
BENJAMIN KAZAN Palo Alto, California
HONORARY ASSOCIATE EDITOR
TOM MULVEY
Advances in
Imaging and Electron Physics Edited by
PETER W. HAWKES CEMES-CNRS Toulouse, France
VOLUME 136
Elsevier Academic Press 525 B Street, Suite 1900, San Diego, California 92101-4495, USA 84 Theobald’s Road, London WC1X 8RR, UK
This book is printed on acid-free paper.

Copyright © 2005, Elsevier Inc. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher.

The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher’s consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (www.copyright.com), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-2005 chapters are as shown on the title pages. If no fee code appears on the title page, the copy fee is the same as for current chapters. 1076-5670/2005 $35.00

Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, E-mail: [email protected]. You may also complete your request on-line via the Elsevier homepage (http://elsevier.com), by selecting “Customer Support” and then “Obtaining Permissions.”

For all information on all Academic Press publications visit our Web site at www.books.elsevier.com

ISBN: 0-12-014778-5

PRINTED IN THE UNITED STATES OF AMERICA
05 06 07 08 09 10 9 8 7 6 5 4 3 2 1
CONTENTS

Contributors
Preface
Future Contributions

Real and Complex PDE-Based Schemes for Image Sharpening and Enhancement
Guy Gilboa, Nir Sochen, and Yehoshua Y. Zeevi
  I. Overview of PDE-Based Processes
  II. Sharpening by the Axiomatic Approach
  III. Sharpening by the Variational Approach
  IV. Complex Diffusion Processes
  V. Texture-Preserving Denoising
  VI. Conclusion
  References

The S-State Model for Electron Channeling in High-Resolution Electron Microscopy
P. Geuens and D. Van Dyck
  I. Introduction
  II. The Channeling Theory
  III. Calculation of the Eigenfunctions of an Electron in an Isolated Atom Column
  IV. The S-State Model
  V. The S-State Model for Nonisolated Atom Columns
  VI. The S-State Model in Case of Crystal or Beam Tilt
  VII. Experimental Channeling Maps
  VIII. Electron Diffraction and the S-State Model
  References

Measurement of Electric Fields on Object Surface in an Emission Electron Microscope
S. A. Nepijko, N. N. Sedov, and G. Schönhense
  I. Introduction
  II. Direct and Inverse Problems of Measurement of Electric Fields (Potential) on the Object Surface in Emission Electron Microscope
  III. Model Experiments on Mapping of Electric Fields (Potential) on the Object Surface Using an Emission Electron Microscope
  IV. The Effect of the Local Fields and Microroughness at the Object on the Imaging and Resolving Power of an Emission Electron Microscope
  V. Practical Applications of Microfield Measurement Using an Emission Electron Microscope
  VI. Measurement of Object Surface Geometry (Relief) with an Emission Electron Microscope
  VII. Conclusions
  References

Index
CONTRIBUTORS
Numbers in parentheses indicate the pages on which the authors’ contributions begin.
D. Van Dyck (111), Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium

P. Geuens (111), Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium

Guy Gilboa (1), Department of Electrical Engineering, Technion: The Israel Institute of Technology, Haifa 32000, Israel

S. A. Nepijko (227), Institute of Physics, University Mainz, Staudingerweg 7, 55099 Mainz, Germany and Institute of Physics, National Academy of Sciences of Ukraine, Pr. Nauki 46, 03028 Kiev, Ukraine

G. Schönhense (227), Institute of Physics, University Mainz, Staudingerweg 7, 55099 Mainz, Germany

N. N. Sedov (227), The Moscow Military Institute, Golovachev Str., 109380 Moscow, Russia

Nir Sochen (1), Department of Applied Mathematics, University of Tel-Aviv, Tel-Aviv 69978, Israel

Yehoshua Y. Zeevi (1), Department of Electrical Engineering, Technion: The Israel Institute of Technology, Haifa 32000, Israel
PREFACE

Three long contributions on a powerful method now coming into use in image processing, on image simulation and interpretation in electron microscopy, and on an aspect of emission electron microscopy fill this new volume of Advances in Imaging and Electron Physics.

The book begins with an extensive account by G. Gilboa, N. Sochen, and Y. Y. Zeevi of the schemes for image sharpening and enhancement based on partial differential equations. The literature of image enhancement is already huge, with large subsections devoted to convolutional methods and morphological approaches, as well as many other nonlinear techniques. Here, partial differential equations are at the heart of the methods, as they have several attractive features: their local nature is well adapted to the nonstationary character of images; their mathematics is well understood and the corresponding numerical methods are available; the algorithms involved are commonly straightforward and concise; the step from two to three or more dimensions is usually easy. The authors explain in detail how such equations are integrated into the image processing armoury.

In the second long contribution, P. Geuens and D. van Dyck present the theory of their new approach to image simulation, an essential step in quantitative high-resolution electron microscopy. The older methods, based on plane-wave propagation in the specimen, give little physical insight into the scattering process whereas the theory presented here clearly matches the electron physics of the situation. The electron wavefunction is now expanded in terms of the eigenfunctions of the atomic column potential, averaged along the columns. Even one eigenfunction then gives a reasonably good description of the electron scattering. After explaining the advantages of this new approach, the authors describe how the eigenfunctions of an electron in an isolated atom column can be calculated. They then describe the S-state and its use for non-isolated atom columns, the effect of crystal or beam tilt, and the treatment of electron diffraction. This account, in which all the details are spelled out carefully, will certainly be the standard reference for some time to come.

The volume concludes with a contribution by S. A. Nepijko, N. N. Sedov, and G. Schönhense on the measurement of electric fields on the surface of the object in an emission electron microscope. The effect of local electric or magnetic fields at the specimen surface in such a microscope is easy to understand intuitively, for the slow-moving electrons emitted will be deviated by the forces exerted by such fields. However, the image can be interpreted correctly only if a quantitative theory is available and this is the subject of this chapter, together with the associated inverse problem and ways of solving it.

As always, I am most grateful to all the authors for the trouble they have taken over their material and I list below the contributions promised for future volumes.

Peter Hawkes
FUTURE CONTRIBUTIONS
G. Abbate: New developments in liquid-crystal-based photonic devices
S. Ando: Gradient operators and edge and corner detection
A. Asif: Applications of noncausal Gauss-Markov random processes in multidimensional image processing
C. Beeli: Structure and microscopy of quasicrystals
M. Bianchini, F. Scarselli, and L. Sarti: Recursive neural networks and object detection in images
G. Borgefors: Distance transforms
A. Bottino: Retrieval of shape from silhouette
A. Buchau: Boundary element or integral equation methods for static and time-dependent problems
B. Buchberger: Gröbner bases
J. Caulfield: Optics and information sciences
C. Cervellera and M. Muselli: The discrepancy-based approach to neural network learning
T. Cremer: Neutron microscopy
H. Delingette: Surface reconstruction based on simplex meshes
A. R. Faruqi: Direct detection devices for electron microscopy
R. G. Forbes: Liquid metal ion sources
J. Y.-l. Forrest: Grey systems and grey information
E. Förster and F. N. Chukhovsky: X-ray optics
A. Fox: The critical-voltage effect
L. Godo and V. Torra: Aggregation operators
A. Gölzhäuser: Recent advances in electron holography with point sources
H. Harmuth and B. Meffert (vol. 137): Dogma of the continuum and the calculus of finite differences in quantum physics
K. Hayashi: X-ray holography
M. I. Herrera: The development of electron microscopy in Spain
D. Hitz: Recent progress on HF ECR ion sources
D. P. Huijsmans and N. Sebe: Ranking metrics and evaluation measures
K. Ishizuka: Contrast transfer and crystal images
K. Jensen: Field-emission source mechanisms
L. Kipp: Photon sieves
G. Kögel: Positron microscopy
T. Kohashi: Spin-polarized scanning electron microscopy
W. Krakow: Sideband imaging
R. Leitgeb: Fourier domain and time domain optical coherence tomography
B. Lencová: Modern developments in electron optical calculations
R. Lenz (vol. 138): Aspects of colour image processing
W. Lodwick: Interval analysis and fuzzy possibility theory
R. Lukac: Weighted directional filters and colour imaging
L. Macaire, N. Vandenbroucke, and J.-G. Postaire: Color spaces and segmentation
M. Matsuya: Calculation of aberration coefficients using Lie algebra
S. McVitie: Microscopy of magnetic specimens
L. Mugnier, A. Blanc, and J. Idier: Phase diversity
K. Nagayama (vol. 138): Electron phase microscopy
M. A. O’Keefe: Electron image simulation
J. Orloff and X. Liu (vol. 138): Optics of a gas field-ionization source
D. Oulton and H. Owens: Colorimetric imaging
N. Papamarkos and A. Kesidis: The inverse Hough transform
K. S. Pedersen, A. Lee, and M. Nielsen: The scale-space properties of natural images
E. Rau: Energy analysers for electron microscopes
H. Rauch: The wave-particle dualism
E. Recami: Superluminal solutions to wave equations
J. Řeháček, Z. Hradil, J. Peřina, S. Pascazio, P. Facchi, and M. Zawisky: Neutron imaging and sensing of physical fields
G. Ritter: Lattice-based artificial neural networks
J.-F. Rivest: Complex morphology
G. Schmahl: X-ray microscopy
G. Schönhense, C. M. Schneider, and S. A. Nepijko: Time-resolved photoemission electron microscopy
F. Shih: General sweep mathematical morphology
R. Shimizu, T. Ikuta, and Y. Takai: Defocus image modulation processing in real time
S. Shirai: CRT gun design methods
N. Silvis-Cividjian and C. W. Hagen: Electron-beam-induced deposition
T. Soma: Focus-deflection systems and their applications
Q. F. Sugon: Geometrical optics in terms of Clifford algebra
W. Szmaja: Recent developments in the imaging of magnetic domains
I. Talmon: Study of complex fluids by transmission electron microscopy
I. J. Taneja (vol. 138): Divergence measures and their applications
M. E. Testorf and M. Fiddy: Imaging from scattered electromagnetic fields, investigations into an unsolved problem
M. Tonouchi: Terahertz radiation imaging
N. M. Towghi: Ip norm optimal filters
Y. Uchikawa: Electron gun optics
K. Vaeth and G. Rajeswaran: Organic light-emitting arrays
J. Valdés (vol. 138): Units and measures, the future of the SI
D. Walsh (vol. 138): The importance-sampling Hough transform
G. G. Walter: Recent studies on prolate spheroidal wave functions
C. D. Wright and E. W. Hill: Magnetic force microscopy
B. Yazici: Stochastic deconvolution over groups
M. Yeadon: Instrumentation for surface studies
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 136
Real and Complex PDE-Based Schemes for Image Sharpening and Enhancement

GUY GILBOA,* NIR SOCHEN,† AND YEHOSHUA Y. ZEEVI*

*Department of Electrical Engineering, Technion: The Israel Institute of Technology, Haifa 32000, Israel
†Department of Applied Mathematics, University of Tel-Aviv, Tel-Aviv 69978, Israel
I. Overview of PDE-Based Processes
  A. Key PDE Processes in Vision
  B. Axiomatic Approach
    1. Linear Scale-Space
    2. Perona–Malik Nonlinear Diffusion
    3. Tensor Diffusivity
  C. Variational Approach
  D. Image Segmentation
  E. Color Processing
    1. The Beltrami Framework
    2. A Geometric Measure on Embedded Maps
    3. The Metric as a Structure Tensor
  F. Image Sharpening
    1. Shock Filters
    2. Objectives for Image Sharpening
  G. Summary
    1. Why Use PDEs for Image Processing?
II. Sharpening by the Axiomatic Approach
  A. Introduction
    1. Linear Inverse Diffusion
    2. Relation to Deconvolution
    3. Physical Interpretation
    4. Advancing Back in Time
  B. Forward-and-Backward Diffusion
    1. The Model
    2. Setting Criteria for Image Sharpening
    3. The Diffusion Coefficient
    4. Adaptive Parameters
    5. Comparison with Shock Filters
    6. Examples
    7. Stability of Smooth Regions in 1D
  C. Super-Resolution by the FAB Process
    1. Some Background: What Is Super-Resolution?
    2. The Proposed Scheme: Single-Image Super-Resolution
    3. Resolution Enhancement: An Example
  D. Color Processing
    1. The Beltrami Framework
    2. The Adaptive Structure Tensor
    3. Algorithm for Color Image Enhancement
    4. Experimental Results
  E. Discussion
III. Sharpening by the Variational Approach
  A. Introduction
    1. Related Studies
    2. The Double-Well Potential
  B. Energy Wells in Image Processing
    1. The Energy Functional
    2. The Triple-Well Potential
    3. Higher-Order Regularization
    4. Energy Minimization Flow
    5. Steady-State Solutions
  C. Examples
  D. Relations to FAB Diffusion of Section II
  E. Discussion
IV. Complex Diffusion Processes
  A. Introduction
  B. Previous Related Studies
  C. Linear Complex Diffusion
    1. Problem Definition
    2. Fundamental Solution
    3. Approximate Solution for Small Theta
    4. Analysis of the Fundamental Solution
    5. Properties of the Real Kernel hR
    6. Properties of the Imaginary Kernel hI
    7. Examples
    8. Generalization to Nonlinear Complex Diffusion
  D. Ramp-Preserving Denoising
  E. Regularized Shock Filters
    1. Previous Related Studies
    2. Coupling Shock and Diffusion
    3. Shock and Linear Diffusion
    4. The Magnitude of the Second Derivative
  F. Complex Shock Filters
  G. Discussion
V. Texture-Preserving Denoising
  A. Introduction
  B. The Cartoon Pyramid Model
  C. The Adaptive F Problem
    1. Automatic Texture-Preserving Denoising
    2. Denoising with Prior Information
  D. Examples
    1. Implementation Details
  E. Discussion
VI. Conclusion
References

ISSN 1076-5670/05 DOI: 10.1016/S1076-5670(04)36001-5
Copyright 2005, Elsevier Inc. All rights reserved.
I. OVERVIEW OF PDE-BASED PROCESSES
To understand our motivation for adopting the partial differential equation (PDE) based approach to image processing, we present a broader perspective on the PDEs used in the fields of vision and image processing. We review some of the influential studies in this field and illustrate theoretical results with specific examples. We then focus on image sharpening, a main topic addressed by this study, and discuss in greater detail the relevant algorithms proposed using PDE techniques.

A. Key PDE Processes in Vision

We first review some PDE-based processes that had a major influence in the field. Here we refer to processes related to image-processing tasks other than sharpening, such as denoising, edge detection, and segmentation. In general, less attention was given to sharpening methods using the PDE framework. Section I.F gives the background for PDE-based sharpening algorithms.

Two basic approaches to the implementation of PDEs in low-level vision have been previously presented. The first is the axiomatic approach, formalized by Alvarez et al. (1993). In this and related studies (see Lindeberg et al., 1994), a set of assumptions about the nature of the image and the required filtering is incorporated at the axiomatic level. The second is the variational approach, based on calculus of variations, whereby the diffusion-like PDE is derived by a functional minimization process (see Aubert and Kornprobst, 2002, and references therein).

B. Axiomatic Approach

1. Linear Scale-Space

The papers by Witkin (1983) and Koenderink (1984) are most commonly referred to as introducing linear PDEs to the vision field (although it was recently revealed that similar issues had already been dealt with in the 1960s in Japan [Weickert et al., 1999]). The scale-space approach was suggested as a multiresolution technique for image structure analysis.
For low-level vision processing, certain requirements were set in order to construct an uncommitted front end (ter Haar Romeny, 1996):

- Linearity (no previous model)
- Spatial shift invariance
- Isotropy
- Scale invariance (no preferred size)
The unique operator obeying all these requirements was a convolution with a Gaussian kernel. In order to be scale invariant, all scales were to be considered. Therefore the Gaussian convolution was to be applied to the input at all scales (standard deviation of the Gaussian kernel ranging from 0 to $\infty$). The diffusion process (or heat equation) is equivalent to a smoothing process with a Gaussian kernel. In this context the linear diffusion equation was used:

\[ I_t = c\,\Delta I, \qquad I|_{t=0} = I_0, \qquad c > 0,\; c \in \mathbb{R}. \tag{1} \]
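The equivalence between Eq. (1) and Gaussian smoothing can be checked numerically. The following is a minimal sketch, not the authors' implementation: it integrates the one-dimensional version of Eq. (1) with an explicit finite-difference scheme and compares the variance of a diffused unit impulse with the value $2t$ expected for a Gaussian of standard deviation $\sqrt{2t}$ when $c = 1$. The unit grid spacing, the reflecting boundaries, and the helper names `diffuse_linear` and `variance` are assumptions made for illustration.

```python
import math

def diffuse_linear(signal, dt, steps, c=1.0):
    """Explicit finite-difference integration of I_t = c * I_xx in 1D.

    Assumes unit grid spacing and reflecting (Neumann) boundaries.
    The explicit scheme is stable only for c * dt <= 0.5.
    """
    u = list(signal)
    n = len(u)
    for _ in range(steps):
        nxt = u[:]
        for i in range(n):
            left = u[i - 1] if i > 0 else u[i]
            right = u[i + 1] if i < n - 1 else u[i]
            nxt[i] = u[i] + c * dt * (left - 2.0 * u[i] + right)
        u = nxt
    return u

def variance(profile):
    """Variance of a nonnegative profile treated as a distribution over indices."""
    total = sum(profile)
    mean = sum(i * p for i, p in enumerate(profile)) / total
    return sum((i - mean) ** 2 * p for i, p in enumerate(profile)) / total

# Unit impulse in the middle of a 201-sample signal (hypothetical test input).
impulse = [0.0] * 201
impulse[100] = 1.0

dt, steps = 0.2, 50            # total diffusion time t = dt * steps
t = dt * steps
blurred = diffuse_linear(impulse, dt, steps)

# Standard deviation of the diffused impulse versus the Gaussian prediction sqrt(2t).
print(math.sqrt(variance(blurred)), math.sqrt(2.0 * t))
```

The signal is kept long enough that the impulse never reaches the boundaries within the simulated time, so the boundary treatment does not affect the comparison.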
This introduced a natural continuous scale dimension $t$. For a constant diffusion coefficient $c = 1$, solving the diffusion equation (1) is analogous to convolving the input image $I_0$ with a Gaussian of standard deviation $\sigma = \sqrt{2t}$. Important cues, such as edges and critical points, are gathered from information distributed over all scales to analyze the scene as a whole. One of the problems associated with this approach is that important structural features such as edges are smoothed and blurred along the flow, as the processed image evolves in time. As a consequence, the trajectories of zero crossings of the second derivative, which indicate the locations of edges, vary from scale to scale (Figure 1).

2. Perona–Malik Nonlinear Diffusion

Perona and Malik (P-M) (1990) addressed this issue by using the general divergence diffusion form to construct a nonlinear adaptive denoising process, in which the diffusion coefficient varies spatially in order to reduce the smoothing effect near edges. The general diffusion equation, controlled by the gradient magnitude, is of the form:

\[ I_t = \operatorname{div}\bigl(c(|\nabla I|)\,\nabla I\bigr), \tag{2} \]
where in the P-M case, $c$ is a positive decreasing function of the gradient magnitude. Two functions for the diffusion coefficient were proposed: $c_{PM1} = \bigl(1 + (|\nabla I|/k_{PM})^2\bigr)^{-1}$ and $c_{PM2} = \exp\bigl(-(|\nabla I|/k_{PM})^2\bigr)$. It turns out that both have similar basic properties (positive coefficient, nonconvex potentials, ability for some local enhancement of large gradients). An example of P-M denoising is compared with linear diffusion in Figure 2.

FIGURE 1. A few instances along the linear scale-space attained by solving the linear diffusion equation (a). Corresponding Laplacian zero-crossings indicating edges (b). Iterations (from left): 0, 1, 10, 100 (dt = 0.2). Small-scale features are smoothed out with the increase of scale. The remaining significant edges (mainly of large features) are dislocated, though, and should be traced back along the scale-space for proper object extraction.

Results obtained with the P-M process paved the way for a variety of PDE-based methods that were applied to various problems in low-level vision (see ter Haar Romeny, 1996, and references cited therein). Some drawbacks and limitations of the original model have been mentioned in the literature (e.g., Catte et al., 1992; Li and Chen, 1994; Whitaker and Pizer, 1993). Catte et al. have shown the ill-posedness of the diffusion equation, imposed by using the P-M diffusion coefficients, and proposed a regularized version wherein the coefficient is a function of a smoothed gradient:

\[ I_t = \operatorname{div}\bigl(c(|\nabla I * G_\sigma|)\,\nabla I\bigr). \tag{3} \]

Note that although this formulation solved a deep theoretical problem associated with the P-M process, the characteristics of this process essentially remained (see Figure 5). Weickert and Benhamouda (1997) investigated the stability of the P-M equation by spatial discretization, and Radmoser et al. (2000) proposed a generalized regularization formula in the continuous domain. In this study we show that this general diffusion form may well also accomplish image sharpening.
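To make the behavior of Eq. (2) concrete, the following is a minimal one-dimensional sketch of an explicit P-M step using the first coefficient, $c_{PM1}$. It is an illustration under stated assumptions rather than the scheme used by the authors: the unit grid spacing, the zero-flux boundaries, the time step, the synthetic noisy step signal, and the names `c_pm1` and `perona_malik_1d` are all hypothetical, and the smoothed-gradient regularization of Eq. (3) is omitted for brevity.

```python
import random

def c_pm1(g, k):
    """Perona-Malik coefficient c_PM1 = 1 / (1 + (g / k)^2)."""
    return 1.0 / (1.0 + (g / k) ** 2)

def perona_malik_1d(signal, k, dt=0.2, steps=100):
    """Explicit scheme for I_t = (c(|I_x|) I_x)_x with zero-flux boundaries.

    Assumes unit grid spacing; since 0 < c <= 1, dt <= 0.5 keeps each
    update a convex combination of neighboring samples.
    """
    u = list(signal)
    n = len(u)
    for _ in range(steps):
        # Forward differences live on the half-grid points i + 1/2.
        grad = [u[i + 1] - u[i] for i in range(n - 1)]
        flux = [c_pm1(abs(g), k) * g for g in grad]
        nxt = u[:]
        for i in range(n):
            f_right = flux[i] if i < n - 1 else 0.0
            f_left = flux[i - 1] if i > 0 else 0.0
            nxt[i] = u[i] + dt * (f_right - f_left)
        u = nxt
    return u

# Hypothetical test signal: a step edge of height 100 with Gaussian noise.
random.seed(0)
step = [0.0] * 50 + [100.0] * 50
noisy = [v + random.gauss(0.0, 5.0) for v in step]
out = perona_malik_1d(noisy, k=10.0)

print("jump across the edge after diffusion:", out[50] - out[49])
print("input range:", min(noisy), max(noisy))
print("output range:", min(out), max(out))
```

Because the coefficient is small where the gradient is much larger than `k`, little flux crosses the step, while the low-gradient regions on either side are smoothed. The output range stays inside the input range, as expected for a process with a positive diffusion coefficient.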
FIGURE 2. Denoising by linear and nonlinear diffusion processes. (a) Original Old Technion image (left); image contaminated by additive white Gaussian noise (right, $\sigma_n = 10$). (b) Denoised by linear diffusion (left); denoised by Perona–Malik (P-M) nonlinear diffusion (right). Linear diffusion suppresses noise at the expense of smearing the edges (an inherent characteristic of any linear smoothing kernel). The P-M process removes noise while keeping edges sharp (and even enhanced, in some cases). Some isolated noisy points may remain. A part of the textural information is lost in the process.
3. Tensor Diffusivity

For oriented flowlike structures, such as fingerprints, truly anisotropic processes are required. Processes emerging from Eq. (2) are controlled by a scalar diffusion coefficient $c(x, y, t)$. This permits a spatially varying process that can also change throughout the evolution but is basically isotropic, that is, locally the process acts the same in all directions (in the regularized version, see Weickert and Benhamouda, 1997). Weickert (1995b, 1997, 1999a,b) suggested an effective anisotropic scheme using a tensor diffusivity. The diffusion tensor is derived by manipulation of the eigenvalues of the smoothed structure tensor $J_\sigma = G_\sigma * (\nabla I_\sigma \nabla I_\sigma^T)$. This technique results in strong smoothing along edges and low smoothing across them. In relatively homogeneous regions without coherent edges, the process approaches linear diffusion with low diffusivity. The semilocal nature of the process may extract information from a neighborhood of radius proportional to $\sigma$. This enables completion of interrupted lines and enhances flowlike structures. See Cottet and Germain (1993) for a different anisotropic method.
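The role of the structure tensor can be illustrated with a small sketch. The code below is a simplification made for illustration and not Weickert's scheme: it uses unsmoothed central-difference gradients and replaces the Gaussian window $G_\sigma$ by a plain average over the whole image, and the function names `structure_tensor` and `eigen_sym2` are hypothetical. It computes the averaged outer product of the gradient for a synthetic stripe pattern and extracts the eigenvalues and the orientation of the dominant eigenvector, which points across the stripes; an anisotropic process would smooth mainly along the perpendicular direction.

```python
import math

def structure_tensor(img):
    """Average outer product of central-difference gradients over interior pixels.

    Returns (j11, j12, j22). A plain average stands in for the Gaussian
    window, and the image is not pre-smoothed (both are simplifications).
    """
    rows, cols = len(img), len(img[0])
    j11 = j12 = j22 = 0.0
    count = 0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            ix = 0.5 * (img[y][x + 1] - img[y][x - 1])
            iy = 0.5 * (img[y + 1][x] - img[y - 1][x])
            j11 += ix * ix
            j12 += ix * iy
            j22 += iy * iy
            count += 1
    return j11 / count, j12 / count, j22 / count

def eigen_sym2(a, b, d):
    """Eigenvalues (mu1 >= mu2) of [[a, b], [b, d]] and the angle of the mu1 eigenvector."""
    mean = 0.5 * (a + d)
    root = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    return mean + root, mean - root, theta

# Vertical stripes: intensity varies only along x (hypothetical test pattern).
img = [[math.sin(0.5 * x) for x in range(32)] for _ in range(32)]
mu1, mu2, theta = eigen_sym2(*structure_tensor(img))

# mu1 >> mu2 signals a coherent orientation; theta is the across-stripe direction.
print(mu1, mu2, theta)
```

A large gap between the two eigenvalues marks a coherent, flowlike neighborhood, while two similar eigenvalues indicate a region without a preferred orientation.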
C. Variational Approach

A different approach for data regularization is based on the calculus of variations. Some smoothness assumptions are made about the original data (which differ from the noisy input data). A minimizer is sought for a cost functional, which penalizes “non-smoothness” (defined differently for different purposes) and distance from the input data (“measurements”). The cost is often viewed as the energy of the system, where the task is to bring the system to the state of minimal energy. The calculus of variations defines a necessary condition for the minimum, known as the Euler–Lagrange (E-L) equation:

\[ \frac{\delta E(I)}{\delta I} = 0, \tag{4} \]
where the left-hand side denotes the variation of $E$ with respect to $I$. In most cases the solution cannot be found analytically and is sought numerically. The E-L equation is used in this case to construct an evolutionary process that dissipates energy. The process converges to a local minimum of the energy functional. For a convex functional this is also the unique global minimum.

Let us first review the relation between nonlinear diffusion processes and energy minimization flows. We define a potential function (energy density), which is a function of the gradient magnitude of $I$, $\Psi(|\nabla I|)$, and a corresponding energy functional

\[ E(I) = \int_\Omega \Psi(|\nabla I|)\,dx. \tag{5} \]

Minimization of this functional, using a gradient descent method, leads to a nonlinear diffusion process:

\[ I_t = \operatorname{div}\bigl(J(\nabla I)\bigr), \tag{6} \]

where $J(\cdot)$ is the flux function given by

\[ J(\nabla I) = c(\cdot)\,\nabla I = \frac{\Psi'(|\nabla I|)}{|\nabla I|}\,\nabla I, \tag{7} \]

and $c(\cdot)$ is the diffusion coefficient. Note that the flux in this context is defined with a negative sign, compared with its physical notion. The initial condition is $I|_{t=0} = I_0$, where, in image-processing applications, $I_0$ is the input image. Neumann boundary conditions are assumed for gray-value conservation. One can observe that Eq. (6) coincides with Eq. (2). For more details, see Deriche and Faugeras (1996), Weickert (1997), and You et al. (1996).
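The relation in Eq. (7) between a potential (written $\Psi$ here) and its diffusion coefficient, $c(s) = \Psi'(s)/s$ with $s$ the gradient magnitude, can be verified numerically. The sketch below is a hedged illustration: it differentiates three of the potentials listed in the caption of Figure 3 (linear forward diffusion, total variation, and Perona–Malik) by central differences and prints the result next to the closed-form coefficients listed in the caption of Figure 4. The function name `coefficient_from_potential`, the step size `h`, and the sample values of `s` are assumptions.

```python
import math

def coefficient_from_potential(psi, s, h=1e-6):
    """Diffusion coefficient c(s) = psi'(s) / s, with psi' by central difference."""
    dpsi = (psi(s + h) - psi(s - h)) / (2.0 * h)
    return dpsi / s

k = 1.0  # contrast parameter, as in the figure captions

# Potentials as functions of the gradient magnitude s.
potentials = {
    "linear": lambda s: 0.5 * s * s,
    "tv": lambda s: s,
    "perona_malik": lambda s: 0.5 * k * k * math.log(1.0 + (s / k) ** 2),
}

# Closed-form diffusion coefficients paired with the potentials above.
closed_form = {
    "linear": lambda s: 1.0,
    "tv": lambda s: 1.0 / s,
    "perona_malik": lambda s: 1.0 / (1.0 + (s / k) ** 2),
}

for name in potentials:
    for s in (0.5, 1.0, 3.0):
        numeric = coefficient_from_potential(potentials[name], s)
        print(name, s, numeric, closed_form[name](s))
```

In each case the numerical and closed-form values should agree to within the accuracy of the finite-difference derivative.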
Typically, denoising potentials are monotonically increasing and attain their minimum at zero. Potentials of this type can be classified as either convex potentials (e.g., linear diffusion [Charbonnier et al., 1994], Beltrami diffusion [Sochen, 2001]) or nonconvex potentials (e.g., Perona and Malik, 1990). Processes derived from convex potentials are well posed, and their evolution approaches the global minimum of the energy (zero gradient magnitude everywhere, that is, a constant function). Nonconvex potentials retain stronger edge-preserving properties, but their flux is not monotonic and the theory of proper energy minimization is much more complex in this case. Höllig (1983) showed the existence of an infinite number of solutions of a one-dimensional diffusion process with nonmonotonic flux (nonconvex potential). You et al. (1996) analyzed two-dimensional nonlinear diffusion and proved that processes based on a nonmonotonic flux, with the condition

\[ J(|\nabla I| \to \infty) = 0, \tag{8} \]

can have an infinite number of stationary points of the energy functional (and therefore are ill posed). Both studies were restricted to the case of positive diffusion coefficients. In practice, however, the regularization of Catte et al. (1992) or even simple discretization (Weickert and Benhamouda, 1997) have been shown to suffice, causing the evolutionary process to converge to a unique, trivial, constant steady-state solution. Apparent instabilities are staircasing (Figure 7, third column) and some speckle effects (see Figure 5, bottom right) (Weickert, 1997; Weickert and Benhamouda, 1997).

A different and powerful approach has become known as total variation (TV) denoising (Rudin et al., 1992). This approach, based on an $L^1$ norm, is a special case in the context of our classification in that its potential is convex but not strictly convex. To avoid numerical problems at low gradients, a small constant is usually added in the calculation of the gradient magnitude (Vogel and Oman, 1996) (i.e., $|\nabla I|$ is substituted by $\sqrt{|\nabla I|^2 + \epsilon^2}$), turning the process into a convex one. This gives rise to interpreting such processes as surface minimization evolutions, which are directly connected to the Beltrami flow (Sochen et al., 1998) (with a single channel in this case), described hereafter.

Figures 3 and 4 show examples of the potentials of some classical processes and of the corresponding diffusion coefficients. A qualitative comparison between TV and P-M for the piecewise constant case is shown in Figure 6. In cases of monotonically increasing potentials, the diffusion coefficients are positive. Thus the minimum-maximum principle is satisfied (the minimum and maximum of $I(t)$ are bounded by the initial condition $I_0$, for all $t > 0$ in any dimension) and no real sharpening can occur.[1] Looking at the

[1] Note that this is not the case for numerical schemes of systems with codimension > 1; see Dascal and Sochen (2003).
FIGURE 3. Potentials $\Psi(s)$ plotted as a function of the gradient magnitude $s$ for some classical processes. (a) Linear forward diffusion ($\Psi(s) = \frac{1}{2}s^2$); (b) TV ($\Psi(s) = s$); (c) Charbonnier et al. ($\Psi(s) = k\sqrt{k^2 + s^2} - k^2$, $k = 1$); (d) Perona–Malik ($\Psi(s) = \frac{1}{2}k^2 \log(1 + (s/k)^2)$, $k = 1$); (e) linear inverse (backward) diffusion ($\Psi(s) = -\frac{1}{2}s^2$).

FIGURE 4. Diffusion coefficients $c(s)$ plotted as a function of the gradient magnitude $s$ for the above processes. (a) Linear forward diffusion ($c(s) = 1$); (b) total variation (TV) ($c(s) = 1/s$); (c) Charbonnier et al. ($c(s) = 1/\sqrt{1 + s^2/k^2}$, $k = 1$); (d) Perona–Malik ($c(s) = 1/(1 + s^2/k^2)$, $k = 1$); (e) linear inverse (backward) diffusion ($c(s) = -1$).
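As a rough numerical companion to Figures 3 and 4, the following Python sketch tabulates the listed potentials and diffusion coefficients and checks that they are related by $c(s) = \Psi'(s)/s$. That relation is the standard one for gradient-descent flows of $\int \Psi(|\nabla I|)$ and is assumed here rather than quoted from this passage; the function names, the sampling grid, and the `eps` argument (the small constant of the regularized TV magnitude) are illustrative choices.

```python
import numpy as np

def potentials(s, k=1.0, eps=0.0):
    """Potentials Psi(s) of Figure 3, keyed by (hypothetical) process names."""
    s_reg = np.sqrt(s**2 + eps**2)  # regularized gradient magnitude used for TV
    return {
        "linear": 0.5 * s**2,
        "tv": s_reg,
        "charbonnier": k * np.sqrt(k**2 + s**2) - k**2,
        "perona_malik": 0.5 * k**2 * np.log1p((s / k) ** 2),
        "inverse": -0.5 * s**2,
    }

def coefficients(s, k=1.0, eps=0.0):
    """Diffusion coefficients c(s) of Figure 4, same keys as potentials()."""
    s_reg = np.sqrt(s**2 + eps**2)
    return {
        "linear": np.ones_like(s),
        "tv": 1.0 / s_reg,
        "charbonnier": 1.0 / np.sqrt(1.0 + s**2 / k**2),
        "perona_malik": 1.0 / (1.0 + s**2 / k**2),
        "inverse": -np.ones_like(s),
    }

# Compare a finite-difference estimate of Psi'(s)/s with the closed-form c(s).
s = np.linspace(0.1, 5.0, 2000)
psi, c = potentials(s), coefficients(s)
errors = {}
for name in psi:
    dpsi = np.gradient(psi[name], s)  # numerical derivative Psi'(s)
    errors[name] = np.max(np.abs(dpsi[1:-1] / s[1:-1] - c[name][1:-1]))
    print(f"{name:13s} max |Psi'/s - c| = {errors[name]:.2e}")
```

Only the backward (inverse) process has a negative coefficient, and with `eps > 0` the TV coefficient stays finite at $s = 0$, which is the purpose of the regularization mentioned above.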
FIGURE 5. Comparison between total variation (TV) denoising and regularized Perona–Malik (Catte et al., 1992). (a) Original image (left) contaminated by additive white Gaussian noise (right, $\sigma_n = 15$). (b) Image denoised using TV (left) and regularized Perona–Malik (right). The noise power is assumed to be known. The results of both processes are visually quite similar. In the TV case, though, edges are not enhanced and there is a little less contrast. Also, isolated points vanish. In both schemes, textures and small-scale features are smoothed along with the noise. The latter issue is addressed in Section V.
FIGURE 6. Denoising of a piecewise constant image. Gradient-controlled processes are best suited for denoising such images, exhibiting excellent performance. (a) From left: original, noisy (SNR = 7 dB), results of P-M process (SNR = 26 dB), results of TV process (SNR = 20 dB). (b) Plot of the intensity values of one horizontal line. From left: original, noisy, P-M, TV. The sharp edges are well recovered by both the P-M and TV processes. Noise is greatly suppressed. Edges and corners are slightly less sharp in the TV process.
FIGURE 7. Denoising of gradual intensity changes by gradient-controlled processes causes staircasing effects. (a) From left: original, noisy image, results of P-M process, results of TV process. (b) Plot of the intensity values of one horizontal line. From left: original, noisy intensity, P-M, TV. Staircasing is apparent primarily in the P-M process, which is not convex. The denoising by the TV process is also not uniform, and the ramp is not recovered properly. The process should be controlled by higher-order derivatives (or, according to our suggestion, by complex-valued processes) in order not to confuse gradual changes with edges. A process suitable for denoising of ramps is presented in Section IV.
backward (inverse) linear diffusion process, the potential is strictly concave (as the diffusion coefficient $c = \text{const} < 0$). This process attains its minimum energy at infinite gradient magnitudes, causing an explosion of the signal and severe noise amplification.

D. Image Segmentation

In parallel, evolutionary processes derived from differential geometry for curve evolutions were introduced. The method of Shah and Mumford (1989) was aimed at finding a proper contour $C$ that segments the image into piecewise smooth regions (usually objects and background). They suggested minimizing the following functional:

$$E_{MS}(I, C) = \int_{\Omega} (I_0 - I)^2\, d^2x + \lambda \int_{\Omega \setminus C} |\nabla I|^2\, d^2x + \nu |C|, \tag{9}$$
where $I$ is a piecewise smooth approximation of $I_0$, separated by the contour $C$. As the minimization is with respect to both $I$ and $C$, two evolutionary equations were to be processed in parallel.

Another segmentation technique was the use of snakes, or active contours. It was first proposed as a parametric curve evolution by Kass et al. (1987), where an initial curve was evolved to close on the border of an object in a bandlike manner. Today, most applications use some variation of the level-sets formulation (Caselles et al., 1997; Kichenassamy et al., 1995; Malladi et al., 1995) termed geodesic active contours:

$$E_{AC}(C) = \int_0^1 g(|\nabla I_0(C(q))|)\, |C'(q)|\, dq, \tag{10}$$
where $g$ is a decreasing function. The contour is evolved toward a short and smooth curve that is mostly on the boundaries of the object (large gradients). The level-sets method, a powerful numerical tool proposed by Osher and Sethian (1988), facilitated the implementation of curve evolutions in an efficient and stable manner. In this framework a curve was described as a level line of a higher-dimensional function. The curve was evolved in an implicit manner by evolving the embedding function.

E. Color Processing

Generalization of PDEs for gray-level images to color images (or more generally to multichannel signals) is not straightforward. Applying the scalar (gray-level) process to each channel independently has proved to be
inappropriate, creating artifacts and disregarding the strong correlation that usually exists between the channels. The problem of vector-valued diffusion has been addressed by several studies (e.g., Blomgren and Chan [1998], Weickert [1999b], Sapiro and Ringach [1996], and Whitaker and Gerig [1994]). In general, the idea is to compute the control mechanism (diffusion coefficient or its equivalent) using information from all channels, and then evolve each channel separately. In our work, we use the Beltrami framework proposed by Kimmel et al. (1996, 2000), Sochen (2001), and Sochen et al. (1998), which views multichannel signals as Riemannian manifolds embedded in a higher-dimensional space (in the case of color images, a two-dimensional [2D] manifold in five dimensions [5D]). This is a very general setting, with a sound geometrical interpretation, that can be used for many applications (video, texture analysis, etc.) in addition to color images. In our work, we apply our sharpening scheme to color images using the Beltrami framework, but it can be generalized to the enhancement of other types of multivalued signals. Therefore we present this approach in more detail.

1. The Beltrami Framework

The original study of Sochen, Kimmel, and Malladi (1998) unifies several approaches by means of the Beltrami framework and offers new definitions and solutions for various image-processing tasks. According to the extended Beltrami framework, images, visual objects, and their characteristics of interest, such as derivatives, orientations, texture, disparity in stereo vision, optical flow, and more, are described as embedded manifolds. The embedded manifold is equipped with a Riemannian structure (i.e., a metric that encodes the geometry of the manifold). Nonlinear operations act on these objects according to the proper local geometry. Iterative processes are considered in this context as evolutions of manifolds. The latter is a consequence of the action of a nonlinear diffusion process or another type of nonlinear PDE. No global (timewise) kernels can be associated with these nonlinear PDEs. Short-time kernels for these processes were derived recently by Sochen (2001).

2. A Geometric Measure on Embedded Maps

According to the geometric approach to image representation, images are considered to be 2D Riemannian surfaces embedded in higher-dimensional spatial-feature Riemannian manifolds (Kimmel et al., 1996, 2000; Sochen et al., 1998). Let $u^1, u^2$ be the local coordinates on the image surface, and let $X^i$, $i = 1, 2, \ldots, m$, be the coordinates of the embedding space. Then the embedding map is given by
$$\left(X^1(u^1, u^2),\; X^2(u^1, u^2),\; \ldots,\; X^m(u^1, u^2)\right). \tag{11}$$
Riemannian manifolds are endowed with a bilinear positive-definite symmetric tensor that constitutes a metric. Let $(\Sigma, (g_{\mu\nu}))$ denote the image manifold and its metric, and $(M, (h_{ij}))$ denote the spatial-feature manifold and its corresponding metric. Then according to Polyakov (1981), the map $X: \Sigma \to M$ has the following weight:

$$E[X^i, g_{\mu\nu}, h_{ij}] = \int d^2u\, \sqrt{g}\, g^{\mu\nu} (\partial_\mu X^i)(\partial_\nu X^j)\, h_{ij}(X), \tag{12}$$

where the range of indices is $\mu, \nu = 1, 2$ and $i, j = 1, \ldots, m = \dim M$, and where we use the Einstein summation convention: identical subscript and superscript indices are summed over. We denote by $g$ the determinant of $(g_{\mu\nu})$ and by $(g^{\mu\nu})$ the inverse of $(g_{\mu\nu})$. The measure $d^2u\, \sqrt{g}$ is an area element of the image manifold, and $g^{\mu\nu} (\partial_\mu X^i)(\partial_\nu X^j)\, h_{ij}(X)$ is a generalization of the L2 norm for gradients, from Euclidean spaces to manifolds. The last two expressions do not depend on the choice of local coordinates. A gradient descent evolution along the feature coordinates is derived from the respective E-L equations, multiplied by a strictly positive function and a positive-definite matrix, in order to gain reparameterization invariance:

$$X^i_t \equiv \frac{\partial X^i}{\partial t} = -\frac{1}{2\sqrt{g}}\, h^{il}\, \frac{\delta E}{\delta X^l}. \tag{13}$$
Given a Euclidean embedding space with a Cartesian coordinate system, the variational derivative of $E$ with respect to the coordinates of the embedding space is given by

$$-\frac{1}{2\sqrt{g}}\, h^{il}\, \frac{\delta E}{\delta X^l} = \Delta_g X^i = \frac{1}{\sqrt{g}}\, \partial_\mu \left( \sqrt{g}\, g^{\mu\nu}\, \partial_\nu X^i \right), \tag{14}$$
where $\Delta_g$, referred to as the Beltrami operator (Kreyszig, 1991), is a generalization of the Laplacian from flat surfaces to manifolds. Assuming an isometric embedding, the image manifold metric can be deduced from the mapping $X$ and the embedding space's metric $h_{ij}$:

$$g_{\mu\nu} = h_{ij}\, \partial_\mu X^i\, \partial_\nu X^j. \tag{15}$$
It is called the induced metric.

3. The Metric as a Structure Tensor

At the end of Section II we generalize the analysis elaborated by Kimmel et al. (2000). For completeness, we reiterate some of the relations developed in that study. Let us first show that the direction of the diffusion can be
deduced from the smoothed metric coefficients $g_{\mu\nu}$ and may thus be included within the Beltrami framework under the right choice of directional diffusion coefficients. The induced metric $(g_{\mu\nu})$ is a symmetric positive-definite matrix that captures the geometry of the image surface. Let $\lambda_1$ and $\lambda_2$ be the large and the small eigenvalues of $(g_{\mu\nu})$, respectively. Because $(g_{\mu\nu})$ is a symmetric positive-definite matrix, its corresponding eigenvectors, $u_1$ and $u_2$, can be chosen orthonormal. Let

$$U \equiv (u_1 \,|\, u_2), \qquad \Lambda \equiv \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix},$$

and therefore

$$(g_{\mu\nu}) = U \Lambda U^T. \tag{16}$$

Let us define

$$(g^{\mu\nu}) \equiv (g_{\mu\nu})^{-1} = U \Lambda^{-1} U^T = U \begin{pmatrix} 1/\lambda_1 & 0 \\ 0 & 1/\lambda_2 \end{pmatrix} U^T, \tag{17}$$

and

$$g \equiv \det(g_{\mu\nu}) = \lambda_1 \lambda_2. \tag{18}$$
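Equations (15) to (18) can be illustrated with a short NumPy sketch. It assumes a Euclidean embedding space ($h_{ij} = \delta_{ij}$) and the embedding $(x, y, \beta I^1, \ldots, \beta I^N)$, for which Eq. (15) gives $g_{\mu\nu} = \delta_{\mu\nu} + \beta^2 \sum_a \partial_\mu I^a\, \partial_\nu I^a$. The scaling factor `beta`, the finite-difference gradients, and all function names are assumptions made for this example and are not taken from the text.

```python
import numpy as np

def induced_metric(image, beta=1.0):
    """Per-pixel induced metric, shape (H, W, 2, 2), for an (H, W[, C]) image.

    Assumes the embedding (x, y, beta*I^1, ..., beta*I^C) in Euclidean space.
    """
    img = np.asarray(image, dtype=float)
    if img.ndim == 2:
        img = img[..., None]  # treat a gray-level image as one channel
    # Derivatives along axis 0 (y) and axis 1 (x), for every channel.
    dy, dx = np.gradient(img, axis=(0, 1))
    g = np.empty(img.shape[:2] + (2, 2))
    g[..., 0, 0] = 1.0 + beta**2 * np.sum(dx * dx, axis=-1)
    g[..., 0, 1] = g[..., 1, 0] = beta**2 * np.sum(dx * dy, axis=-1)
    g[..., 1, 1] = 1.0 + beta**2 * np.sum(dy * dy, axis=-1)
    return g

def eigen_decomposition(g):
    """Return (lam, U) with lam[..., 0] >= lam[..., 1], as in Eq. (16)."""
    lam, U = np.linalg.eigh(g)  # eigh returns ascending eigenvalues
    return lam[..., ::-1], U[..., ::-1]

rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))  # synthetic three-channel test image
g = induced_metric(image)
lam, U = eigen_decomposition(g)

# Eq. (17): inverse metric from the eigen-decomposition, U diag(1/lam) U^T.
g_inv = np.einsum("...ik,...k,...jk->...ij", U, 1.0 / lam, U)
# Eq. (18): determinant as the product of the eigenvalues.
det_g = lam[..., 0] * lam[..., 1]
```

Under this embedding the small eigenvalue never drops below 1, so the metric stays well conditioned in flat regions and its inverse always exists.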
Our proposed enhancement procedure controls the above-determined eigenvalues adaptively, so that only meaningful edges are enhanced, whereas smooth areas are denoised.

F. Image Sharpening

Image restoration and sharpening have been investigated for several decades. Much research has been devoted to deconvolution techniques in which the image is assumed to be linearly degraded by a convolution with a blurring kernel, which is known a priori. The naive solution was to use inverse filtering, which was then generalized to the optimal linear Wiener filter (in the mean square error [MSE] sense), to account for additive noise and zeros in the blurring kernel. More modern nonlinear deconvolution methods in use today are based on statistical methods (Fan, 1991; Stefanski and Carroll, 1990), Tikhonov regularization (Tikhonov and Arsenin, 1977), or wavelet-based techniques (Abramovich and Silverman, 1998; Donoho, 1995; Starck and Bijaoui, 1994), among other methods. For cases in which the convolution kernel is not known, methods for blind deconvolution were proposed (Ayers and Dainty, 1998; Kundur and Hatzinakos, 1996; McCallum, 1990) that try to achieve an adequate solution based on
some general assumptions regarding the smoothness of the image and of the blurring kernel. In this field, PDE-based methods were also proposed, achieving high-quality results (Chan and Wong, 1998; Kaftory et al., 2003; You and Kaveh, 1996).

A somewhat different sharpening strategy is to try to find a general sharpening operator. In this case the blur is also not known, but it is not assumed to be linear and therefore cannot be modeled as a convolution with a blurring kernel. Such blur degradations are frequently encountered today as by-products of image compression, for example, where blur and noise are data dependent and change throughout the image. In later sections we elaborate on how to find a general and robust sharpening operator that can operate on a large collection of images possibly degraded by various types of blur and noise artifacts.

A classical linear sharpening operator is the negative Laplacian of the image, which is introduced and examined in Section II. Its sensitivity to noise and its close relation to the ill-posed inverse diffusion equation are shown there. Unsharp masking is another common, somewhat similar technique, in which the input image is blurred and its difference from the input image (the "mask") is added back to the input image, increasing contrast (along with noise). Gabor (1965) proposed an anisotropic operator that sharpens edges by subtracting the second directional derivative in the gradient direction and adding the second directional derivative in the perpendicular level-set direction.

Few studies were available in the PDE community regarding a general sharpening process. The P-M equation is known to have some sharpening effects, but it serves mainly as an edge-preserving denoising process. The idea of Gabor (1965) was generalized to PDEs of gray levels in Lindenbaum et al. (1994) and later to vector-valued PDEs in Kimmel et al. (2000). We relate to Kimmel et al. (2000) in the color-processing section of Section II. Pollak et al. (2000) proposed an enhancement process based on a spring-mass model. This model, however, depends on one-sided derivatives and is mostly formulated in a discrete setting. Its purpose is primarily the segmentation of noisy signals (such as SAR images). An important general deblurring process is the shock filter (Osher and Rudin, 1990), which we explain in detail below.

1. Shock Filters

Most of the research concerning the application of partial differential equations in the fields of computer vision and image processing has focused on parabolic (diffusion-type) equations. Osher and Rudin (1990) proposed a hyperbolic equation, called the shock filter, that can serve as a stable deblurring algorithm and behaves similarly to deconvolution (Figure 8).
FIGURE 8. Shock filter deblurring of a step edge. Solid line indicates blurred step edge. Dashed lines indicate the three steps in the evolution of the PDE toward formation of a shock in the location of the inflection point.
The formulation of the shock filter equation is:²

$$I_t = -|I_x|\, F(I_{xx}), \tag{19}$$

where $F$ should satisfy $F(0) = 0$ and $F(s)\, \mathrm{sign}(s) \ge 0$. Choosing $F(s) = \mathrm{sign}(s)$ yields the classical shock filter equation:

$$I_t = -\mathrm{sign}(I_{xx})\, |I_x|, \tag{20}$$

generalized in the 2D case to:

$$I_t = -\mathrm{sign}(I_{\eta\eta})\, |\nabla I|, \tag{21}$$

where $\eta$ is the direction of the gradient. The main properties of the shock filter are as follows:

• Shocks develop at inflection points (zero crossings of the second derivative).
• Local extrema remain unchanged in time. No erroneous local extrema are created.
• The scheme is total-variation preserving (TVP).
• The steady-state (weak) solution is piecewise constant (with discontinuities at the inflection points of $I_0$).
• The process approximates deconvolution.

² Note that the above equation and all other evolutionary equations in this section have initial conditions $I(x, 0) = I_0(x)$ and Neumann boundary conditions ($\partial I / \partial n = 0$, where $n$ is the orientation perpendicular to the boundary).
As noted in the original paper, any noise in the blurred signal will also be enhanced. In fact, this process is extremely sensitive to noise. Theoretically, in the continuous domain, any white noise added to the signal may add an infinite number of inflection points, disrupting the process completely. Discretization may help somewhat, but the basic sensitivity to noise persists. This is illustrated by comparing the processing of a noiseless and a noisy sine-wave signal (Figure 9). Whereas in the noiseless case the shock filter enhances the edges well, turning the sine wave into a square wave, in the noisy case the shock filter does not enhance the edges at all, and the primary result of the processing is amplification of noise, although only a very low level of white Gaussian noise was added to the input signal (signal-to-noise ratio [SNR] = 40 dB). Note also that this process does not enhance contrast, cannot be viewed as an energy-minimizing process, and does not obey a conservation law (i.e., the mean value of the signal is not preserved during the evolution).
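The one-dimensional shock filter of Eq. (20) can be sketched with an explicit upwind update. The discretization below (minmod-limited one-sided differences, unit grid spacing, a fixed time step `dt`, and replicated end samples for the Neumann condition) is one common choice and an assumption of this example, not necessarily the scheme used to produce Figures 8 and 9. The test signal, a `tanh` profile standing in for a blurred step, is also hypothetical.

```python
import numpy as np

def minmod(a, b):
    """Smaller-magnitude argument when signs agree, zero otherwise."""
    same_sign = a * b > 0
    return np.where(same_sign, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def shock_filter_1d(signal, n_iter=200, dt=0.25):
    """Explicit sketch of I_t = -sign(I_xx) |I_x| on a unit grid."""
    I = np.asarray(signal, dtype=float).copy()
    for _ in range(n_iter):
        p = np.pad(I, 1, mode="edge")   # Neumann boundary: replicate end samples
        fwd = p[2:] - p[1:-1]           # forward difference
        bwd = p[1:-1] - p[:-2]          # backward difference
        I_xx = fwd - bwd                # second difference
        I -= dt * np.sign(I_xx) * np.abs(minmod(fwd, bwd))
    return I

x = np.linspace(-5.0, 5.0, 101)
blurred_step = np.tanh(x)               # smooth edge with inflection at x = 0
sharpened = shock_filter_1d(blurred_step)

print("max slope before:", np.max(np.diff(blurred_step)))
print("max slope after: ", np.max(np.diff(sharpened)))
```

On this clean, monotone input the extrema are left untouched while the transition steepens around the inflection point, in line with the properties listed above. Adding even weak noise to `blurred_step` creates many spurious inflection points, which is the sensitivity discussed in the text.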
FIGURE 9. A noiseless sine-wave signal (a) and the steady state of its processing by a shock filter (b) are compared with the processing of a noisy signal generated by adding low-level white Gaussian noise (SNR = 40 dB) (c). The steady state of the processed noisy signal does not exhibit any enhancement, and the only result is noise amplification (d). In Section IV a complex regularized shock filter is developed that can perform well in a noisy environment. SNR, signal-to-noise ratio.
A more practical process based on the original shock filter was proposed by Alvarez and Mazorra (1994). This procedure couples a directional diffusion term with the shock filter of Osher and Rudin (1990), where the shock part is controlled by a smoothed second derivative, yielding an equation of the form:

$$I_t = -\mathrm{sign}(G_\sigma * I_{\eta\eta})\, |\nabla I| + \lambda I_{\xi\xi}, \tag{22}$$
where $\lambda$ is a positive constant.³ The first term on the right side creates solutions approaching piecewise constant regions separated by shocks at the zero crossings of the smoothed second derivative $G_\sigma * I_{\eta\eta}$. The second term is an anisotropic diffusion along the level-set lines. This process is still quite sensitive to noise, as explained in more detail in a comparison to our process (Section II). More modern approaches, used by Coulon and Arridge (2000) and Kornprobst et al. (1997), develop a coupled enhancement-denoising process.

2. Objectives for Image Sharpening

The aim in the enhancement part of the study (Sections II and III) is to find a stable sharpening algorithm with the following characteristics:

• Robust against noise (can still work effectively in a moderately noisy environment without noise amplification).
• Stable in all dimensions (specifically in 1D).
• Can be formulated as a continuous PDE.
• Can increase contrast (important for line-type edges).
• A general sharpener. Requires minimal knowledge of the blurring process. Can enhance, to some extent, degradations caused also by nonlinear, anisotropic, or shift-variant blurs.
• The process should be understood also in the context of variational calculus (as some sort of energy-minimization process).
• Can be easily generalized to multichannel signals (specifically to color images).
³ $\eta$ is the direction of the gradient ($\nabla I$) and $\xi$ is the direction perpendicular to the gradient.

G. Summary

This section presented some of the widely used image-processing algorithms based on PDEs. We showed their assets and mentioned the main difficulties and drawbacks, some of which we will address in the following sections. PDEs are not (yet?) a classical tool in the image-processing field. To recap this introductory section, we list in the following paragraph a few advantages of PDE-based processes and explain why research in this direction can
contribute significantly to the theory and applications in the field of computer and computational vision.

1. Why Use PDEs for Image Processing?

Advantages of algorithms based on PDEs include the following:

• The local nature of PDEs. One major characteristic of images is their nonstationary nature. Images in general have many local features and are not well described by global features such as frequencies (which served well in stationary signal processing). Therefore linear algorithms can achieve only limited performance, and adaptive nonlinear processes are required. The adaptive aspects are mostly advantageous compared with various linear techniques. Wavelet and Gabor-based (1946) methods are especially local in nature (and therefore are commonly used in today's image-processing algorithms).
• A vast theory and rigorous mathematical foundations related to PDEs already exist. These include strong tools for proving convergence, stability, and uniqueness of solutions of processes (specifically for convex processes).
• There are a variety of well-developed numerical schemes for the implementation of PDE-based algorithms.
• Concise representation of an algorithm (often a single equation). No need to write and analyze complex and long algorithms. Easily understood. Less heuristic in nature.
• In many cases a process can be generalized to any dimension in a trivial way using the Laplacian, divergence, and gradient operators.
• A convenient and efficient decoupling of theoretical aspects and implementational ones. All the theory and analysis of the characteristics and behavior of a new process can be done in the continuous domain. The implementation is done in the discrete domain (on pixels) and may be accomplished using different numerical techniques.
• Finally, the ultimate reason in engineering: it works. PDE-based algorithms maintain most of the structure and information of the image and can compete well with state-of-the-art methods of other modern image-processing disciplines.
II. SHARPENING BY THE AXIOMATIC APPROACH
A. Introduction

In this section we present a nonlinear enhancement process termed forward-and-backward (FAB) diffusion, based on the axiomatic approach. Our goal is to enhance and sharpen blurry signals in a robust manner while still allowing
for some additive white noise to interfere with the process. We minimize the effect of noise amplification (an inherent by-product of signal sharpening) by combining forward diffusion with a selective backward process. We then generalize the analysis of Kimmel et al. (2000) by introducing a local adaptive criterion for the FAB diffusion in sharpening and denoising of color images (or any multivalued images, in general). We begin our discussion by analyzing a classical linear sharpener:

$$I(x) = I_0(x) - \lambda \Delta I_0(x), \tag{23}$$
where $\lambda$ is a constant controlling the measure of sharpening. This is quite a general-purpose algorithm that can be applied to most signals and types of blur (a desirable feature, which we would like to retain in our scheme as well). To illustrate its operation, let us begin with a simple example of deblurring a one-dimensional (1D) step function (Figure 10). Subtracting the Laplacian from the input signal emphasizes sharp transitions. In the case of signals with almost no noise (extremely high SNR), this
FIGURE 10. Operation of a basic linear sharpener [Eq. (23)] on a clean signal. (a) Blurred step without noise, $I_0$; (b) Laplacian approximation of $I_0$; (c) sharpened signal $I = I_0 - 2\Delta I_0$ ($\lambda = 2$).
FIGURE 11. Operation of a basic linear sharpener [Eq. (23)] on a noisy signal. (a) Blurred step with noise, $I_0$; (b) Laplacian approximation of $I_0$; (c) sharpened signal $I = I_0 - \Delta I_0$ ($\lambda = 1$).
simple algorithm may work fairly well. But in the common case, it is reasonable to assume that the signal is also degraded by some additive white noise. As seen in Figure 11, this algorithm performs very poorly even in a moderately noisy environment, where its output exhibits significant noise amplification. We shall now show the connection of the above sharpening algorithm with a known linear PDE called inverse diffusion.

1. Linear Inverse Diffusion

Let us define an iterative process, based on Eq. (23), where $I^{n+1}(x)$ is computed from $I^n(x)$ of the previous iteration:

$$I^{n+1}(x) = I^n(x) - \lambda \Delta I^n(x), \qquad I^{n=0}(x) = I_0(x). \tag{24}$$
Then Eq. (23) is simply the first iteration of this generalized scheme. Seeing this iterative process as an evolution in time we can write Eq. (24)
with a change of notation $I^n(x) \to I(x, t)$, $I^{n+1}(x) \to I(x, t + dt)$, where $t = n\,dt$, as

$$\frac{I(x, t + dt) - I(x, t)}{dt} = -\frac{\lambda}{dt}\, \Delta I(x, t), \qquad I|_{t=0} = I_0.$$

In the limit $dt \to 0$, letting $\lambda = c\,dt$ ($c$ is some positive constant), we get the linear inverse diffusion equation:

$$I_t = -c\, \Delta I(x, t), \qquad I|_{t=0} = I_0, \tag{25}$$

where $c$, in this context, is understood as the diffusion coefficient. Naturally, the noise problems of the algorithm in Eq. (23) also arise in the process of Eq. (25). (See Figures 12 and 13 for 1D and 2D examples.) We now give some more insight regarding inverse diffusion from the perspectives of inverse filtering, physical interpretation, and relation to advancing back in time.
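The iteration of Eq. (24) is easy to try numerically. The sketch below is a minimal 1D illustration under stated assumptions: the Laplacian is the standard three-point second difference with replicated end samples, and the test signal (a sigmoid standing in for a blurred step), the noise level, and the value of `lam` are hypothetical choices made for this example.

```python
import numpy as np

def laplacian_1d(I):
    """Three-point second difference with replicated ends (Neumann boundary)."""
    p = np.pad(I, 1, mode="edge")
    return p[2:] - 2.0 * p[1:-1] + p[:-2]

def inverse_diffusion(I0, lam, n_iter):
    """Iterate Eq. (24): I <- I - lam * Laplacian(I); n_iter=1 is Eq. (23)."""
    I = np.asarray(I0, dtype=float).copy()
    for _ in range(n_iter):
        I = I - lam * laplacian_1d(I)
    return I

rng = np.random.default_rng(0)
x = np.arange(256)
clean = 1.0 / (1.0 + np.exp(-(x - 128) / 6.0))  # smooth (blurred) step
noise = 0.01 * rng.standard_normal(x.size)      # low-level white noise
noisy = clean + noise

one_step = inverse_diffusion(noisy, lam=0.2, n_iter=1)
many_steps = inverse_diffusion(noisy, lam=0.2, n_iter=20)

# The process is linear, so the noise response can be measured separately.
noise_gain_1 = np.std(inverse_diffusion(noise, 0.2, 1)) / np.std(noise)
noise_gain_20 = np.std(inverse_diffusion(noise, 0.2, 20)) / np.std(noise)
print(f"noise gain after 1 step:   {noise_gain_1:.2f}")
print(f"noise gain after 20 steps: {noise_gain_20:.3g}")
```

The mean of the signal is preserved at every step, but the noise gain grows rapidly with the number of iterations, so the noise overwhelms the edge long before a useful amount of sharpening is reached.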
FIGURE 12. Linear inverse diffusion in 1D. From top: a few instances in time of the inverse diffusion process applied to a step edge.
FIGURE 13. Two-dimensional linear inverse diffusion. (a) Original image; (b) blurred image; (c–d) inversely diffused at times 0.5 and 1.
2. Relation to Deconvolution

As mentioned in the previous section, linear forward diffusion is analogous to convolution with a Gaussian kernel. Hence linear backward (inverse) diffusion is analogous to Gaussian deconvolution. Let us examine the Fourier coefficients of the 1D case of Eq. (25) with $c = 1$ (i.e., $I_t = -I_{xx}$):

$$\sum_k \frac{\partial C_k}{\partial t}\, e^{ikx} = \sum_k k^2 C_k\, e^{ikx}.$$

This gives us a simple ordinary differential equation (ODE) for each coefficient, with the solution

$$C_k = e^{k^2 t}.$$
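The per-frequency growth can be checked on the discrete counterpart of Eq. (25). In the sketch below, which assumes a periodic signal and the three-point Laplacian (both choices made only for this example), one explicit step with step size `lam` multiplies the mode $\cos(kx)$ by $1 + 4\lambda \sin^2(k/2)$. For small $k$ this is approximately $1 + \lambda k^2 \approx e^{k^2 \lambda}$, matching the continuous solution over a time interval $\lambda$.

```python
import numpy as np

def inverse_diffusion_step_periodic(I, lam):
    """One explicit step I <- I - lam * Laplacian(I) with periodic boundaries."""
    lap = np.roll(I, -1) - 2.0 * I + np.roll(I, 1)
    return I - lam * lap

N = 64
x = np.arange(N)
lam = 0.1
gains = []
for cycles in (1, 4, 16, 32):               # 32 cycles is the Nyquist mode
    k = 2.0 * np.pi * cycles / N
    mode = np.cos(k * x)
    out = inverse_diffusion_step_periodic(mode, lam)
    measured = np.dot(out, mode) / np.dot(mode, mode)
    predicted = 1.0 + 4.0 * lam * np.sin(k / 2.0) ** 2
    gains.append((cycles, measured, predicted))
    print(f"cycles={cycles:2d}  measured gain={measured:.4f}  predicted={predicted:.4f}")
```

The gain is smallest for the slowly varying modes that carry most of the image content and largest at the Nyquist frequency, where white noise has as much energy as anywhere else.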
Obviously, this process causes noise amplification that explodes with frequency. Numerical application of such a deconvolution process results in oscillations (ringing) that grow with time until they reach the limiting minimum and maximum saturation values and the original signal is completely lost (as depicted in Figures 12 and 13).

3. Physical Interpretation

If we consider the gray-level value at a pixel to be analogous to an amount of particles, each having one unit of "mass," stacked at the pixel, then in order to emphasize large gradients we would like to move mass from the lower part of a "slope" upward. This is exactly what inverse diffusion does: It brings up mass from the "bottom of a hill" (positive Laplacian) to the upper part (negative Laplacian), thus creating a "higher hill" with a sharper slope.

4. Advancing Back in Time

We can regard a blurred image as one that was affected by a linear diffusion process [Eq. (1)] of limited duration $T$. Then in order to deblur it, one should reverse the diffusion process back in time by $T$ units. We can do this by changing the time variable $t$ of Eq. (1) to negative time, $\tau = -t$. It is easily seen that this is equivalent to changing the sign of the diffusion coefficient $c$ (to a negative value), which yields Eq. (25). Unfortunately, inverse diffusion is a well-known example of an ill-posed equation. That is, the solution is extremely sensitive to any small perturbation of the initial conditions. The next section presents the image degradation model more formally, then analyzes the problems associated with inverse diffusion, and arrives at a more complex nonlinear PDE-based sharpening process.

B. Forward-and-Backward Diffusion

1. The Model

We assume the following general model of our degraded image $I_0$:

$$I_0 = B(I_{orig}) + n, \tag{26}$$
where $I_{orig}$ is the original image; $B$ is a smoothing (blurring) transformation, not necessarily linear or shift invariant; and $n$ is some noise, uncorrelated with the signal (not necessarily white, but not of impulsive nature). We assume that large gradients (i.e., edges) of $I$ are still relatively large in $B(I)$. After some sort of smoothing (or discretization) of $I_0$ (e.g., $\tilde{I}_0 = I_0 * g_\sigma = B(I) * g_\sigma + n * g_\sigma$), we assume that the gradient magnitude of the noise is less than an upper bound $k$ with a very high probability (e.g., $\mathrm{Prob}(|\nabla n * g_\sigma| < k) \approx 1$).
Our objective is to sharpen important edges of the image, that is, edges with a relatively large gradient magnitude and with sufficient support. An imperative requirement is that noise should not be amplified in the process (and preferably should even be reduced). The noise amplification by-product is a major drawback of many classical sharpening processes.

2. Setting Criteria for Image Sharpening

Three major problems associated with the linear backward diffusion process must be addressed: the explosive instability, noise amplification, and oscillations.

To avoid the effect of an explosive instability, the value of the inverse diffusion coefficient at high gradients can be diminished. In this way, after the singularity exceeds a certain gradient, it no longer affects the process. In order not to amplify noise, which after some presmoothing can be regarded as having mainly medium to low gradients, it is desirable to diminish the inverse diffusion process at low gradients. To minimize the effect of oscillations, they should be suppressed as soon as they are introduced. For this reason, a forward diffusion force that smooths low gradients should be incorporated. This force also smooths some of the original noise that contaminates the signal from the beginning. Unfortunately, low gradients that are not caused by noise, such as those characteristic of certain textures in images, are also affected and smoothed out by this force.

In addition to the specific sharpening characteristics of our process, we would also like it to retain some general invariances to image transformations, such as translation, rotation, and constant illumination changes. See Olver et al. (1994) for a broad analysis of invariant geometric flows. In this context we define the processing of the input image $I_0$ as $T_s(I_0)$, where

$$T_s(I_0) = I_0 + \int_0^s I_t\, dt, \tag{27}$$
and $I_t$ is an evolutionary process with initial condition $I_0$. We can summarize these general requirements in the following properties (axioms):

Property 1. Invariance to constant illumination changes:

$$T_s(I_0 + k) = T_s(I_0) + k, \qquad k \in \mathbb{R}.$$

Property 2. Invariance to image translation:

$$T_s(I_0(x + k)) = T_s(I_0)(x + k), \qquad x, k \in \mathbb{R}^2.$$

Property 3. Invariance to image rotation:

$$T_s(R I_0) = R\, T_s(I_0), \qquad R \text{ is a } 2 \times 2 \text{ rotation matrix}.$$

Property 4. Mean value is not changed:

$$\int_\Omega T_s(I_0(x))\, dx = \int_\Omega I_0(x)\, dx.$$

This means that the process does not amplify or attenuate the signal. As most blur operations do not change the signal's mean value, neither should the recovering operation (the noise is assumed to be of zero mean).

Property 5. Gray-level scaling:

$$T_s|_{kp}(k I_0) = k\, T_s|_p(I_0), \qquad k \in \mathbb{R},$$
where $T_s|_p$ is the result of a process with a set of parameters $p = (p_1, p_2, \ldots)$.

REMARK: Nonlinear processes in general are not invariant to gray-level scaling, which is often quite sensible (e.g., when the image gray-level range is [0, 1], important gradients would be defined differently than when the range is [0, 255]). Therefore scaling invariance is not required; instead, we require an easy mechanism for adapting the process to a new gray-level range. Specifically, in the above formulation, when the parameters of a process are rescaled according to the scaling factor, we require that the processing be identical.

3. The Diffusion Coefficient

Consider the following formula for the diffusion coefficient:

$$c_{FAB}(s) = \frac{1}{1 + (s/k_f)^n} - \frac{\alpha}{1 + ((s - k_b)/w)^{2m}}. \tag{28}$$

In our implementation, the exponent parameters $(n, m)$ were chosen to be (4, 2), with $k_f < k_b - w$. The P-M diffusion coefficient, in comparison, is:

$$c_{PM}(s) = \frac{1}{1 + (s/k)^2}. \tag{29}$$
Plots of the coefficients and respective fluxes of Eqs. (28) and (29) are shown in Figures 14 and 15, respectively.

Theorem 1. The process of Eq. (2) with the diffusion coefficient defined in Eq. (28) obeys the five properties stated in Section II.B.2.

All proofs of theorems and lemmas in this paper can be found in Gilboa (2004). A diffusion process defined by $c$ as in Eq. (28), or by another coefficient of this type, switches adaptively between forward and backward diffusion. Therefore we refer to it as an FAB diffusion process. In Section III another coefficient
REAL AND COMPLEX PDE-BASED SCHEMES
29
FIGURE 14. The coeYcient cFAB and the corresponding flux, plotted as a function of the gradient magnitude.
is proposed with similar nature that emerges from a potential function. Other formulas with similar nature may also be proposed. Compared with the P-M, Eq. (29), where an ‘‘edge threshold’’ k is the sole parameter, we now have a parameter for the forward force kf, two parameters for the backward force (we defined them by the center kb and width w), and the relations between the strength of the backward and forward forces (a ratio we denoted by a). We therefore discuss some rules for determining these parameters. The parameter kf is essentially the limit of gradients to be smoothed out and is similar in nature to the role of the k parameter of the P-M diVusion equation. The parameters kb and w define the range of backward diVusion and should be determined by the values of gradients that one wishes to emphasize. In the proposed formula the range is symmetric, and we restrain the width of the backward diVusion to avoid overlapping the forward diVusion. The parameter a determines the ratio between the backward and forward diVusion. If the backward diVusion force is too dominant, the stabilizing forward force is not strong enough to avoid oscillations. The development of new singularities over smooth areas in the 1D case can be avoided by
30
GILBOA ET AL.
FIGURE 15. cPM and the corresponding flux as a function of the gradient magnitude.
bounding the maximum flux permissible in the backward diVusion to be less than the maximum of the forward one. Formally we say: maxfs cðsÞg > s
max
kb w<s
fs cðsÞg:
ð30Þ
In the case of our proposed coefficient, simple bounds for $\alpha$ satisfying the inequality are obtained. (These hold for most choices of positive integer exponent parameter combinations $n$, $m$.) We have the following bound:
$$\alpha \le \frac{k_f}{2(k_b + w)}, \quad \text{for any } 0 < w < k_b - k_f. \tag{31}$$
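The flux condition of Eq. (30) and the bound of Eq. (31) can be checked numerically for a given parameter set. The sketch below is an illustration under assumptions: the helper names (`fab_flux`, `flux_condition_holds`, `alpha_bound`), the sampling grid, and the example parameter values are hypothetical and not part of the original text.

```python
import numpy as np

def fab_flux(s, kf, kb, w, alpha, n=4, m=2):
    """Flux s*c(s) with the FAB coefficient of Eq. (28)."""
    c = 1.0 / (1.0 + (s / kf) ** n) - alpha / (1.0 + ((s - kb) / w) ** (2 * m))
    return s * c

def alpha_bound(kf, kb, w):
    """Upper bound on alpha from Eq. (31)."""
    return kf / (2.0 * (kb + w))

def flux_condition_holds(kf, kb, w, alpha, n=4, m=2, samples=20001):
    """Numerically test Eq. (30) on a dense grid of gradient magnitudes."""
    s = np.linspace(0.0, kb + w, samples)
    flux = fab_flux(s, kf, kb, w, alpha, n, m)
    forward_max = flux.max()                       # maximal forward flux
    band = (s > kb - w) & (s < kb + w)             # backward-diffusion band
    backward_max = (-flux[band]).max()             # maximal backward flux magnitude
    return bool(forward_max > backward_max)

kf, kb, w = 1.0, 4.0, 1.0                          # illustrative values, 0 < w < kb - kf
print(alpha_bound(kf, kb, w))                      # bound from Eq. (31)
print(flux_condition_holds(kf, kb, w, alpha_bound(kf, kb, w)))
print(flux_condition_holds(kf, kb, w, 0.5))        # alpha well above the bound
```

For this example the condition holds at the bound and fails when $\alpha$ is chosen five times larger, which is the regime in which the text warns about oscillations.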
There are a few ways to regularize this PDE-based approach. Given a priori information on the smallest scale of interest, one can smooth smaller scales in a noisy signal by preprocessing. Because the signal is enhanced afterward, the smoothing has little effect on the end result. This enables operation in a much noisier environment. Another possibility is to convolve the gradient used for calculating the diffusion coefficient with a Gaussian, following the regularization method proposed by Catte et al. (1992). Using relatively smooth diffusion coefficients (low exponent parameters $m$, $n$) also reduces the sensitivity of the process to noise.
4. Adaptive Parameters
One way to determine these parameters in the discrete case, without any prior information, is by calculating the mean absolute gradient (MAG). For instance, $[k_f, k_b, w] = [2, 4, 1] \cdot \mathrm{MAG}$. Local adjustment of the parameters can be done by calculating the MAG value in a window. The parameters $[k_f(x, y), k_b(x, y), w(x, y)]$ then vary gradually along the signal, and enhancement is accomplished by inducing different thresholds in different locations. This is indeed required for natural signals or images because of their nonstationary structure. Usually a minimal value of forward diffusion should be kept, so that large smooth areas do not become noisy. An example implementing the local parameter adjustment is given by the parrot image (see Figure 20). In the deer image (see Figure 21) we adjusted the parameters according to the gradient magnitude of the initial image convolved with a Gaussian (instead of the MAG), obtaining similar results.
5. Comparison with Shock Filters
In this section we clarify the differences between the FAB scheme and the enhancement process of Alvarez and Mazorra (A-M) (1994) described in the introduction [Eq. (22)]. The main differences between this scheme and FAB are as follows. The A-M scheme is limited to the minimum and maximum gray-level values of the original image and therefore is more stable but cannot achieve real contrast enhancement. The A-M scheme is not adaptive with respect to the gradient’s magnitude; it is targeted for enhancement everywhere, including smooth regions. Eq. (22) contains a diffusion term along the level sets (a curvature-flow process); therefore small features are sometimes eliminated and rounding of objects occurs. Edges are determined by the zero-crossings of the smoothed second derivative along the gradient direction, whereas we use the P-M model of gradient-dependent diffusivity.
The first method may well serve edge-detection purposes but is not necessarily useful as a means of controlling an enhancement process; indicating edges by the zero-crossing is a binary decision process (and not a fuzzy one) and may therefore be less robust to noise. The Gaussian convolution may help to some extent, but many false edges will still be enhanced (because the gradient magnitude is not considered). Applying a very wide Gaussian to increase robustness will dislocate edges and wipe out many image details. In the following section we compare, by means of an example, the A-M scheme with FAB.
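The MAG-based parameter selection of Section 4 can be sketched for a 1D signal as follows. This is a minimal illustration under assumptions: the function names, the moving-average window, and the `min_mag` floor (one possible way to keep a minimal value of forward diffusion) are hypothetical choices, not the authors' code.

```python
import numpy as np

def mean_abs_gradient(signal):
    """Mean absolute gradient (MAG) of a 1D signal, using forward differences."""
    return float(np.mean(np.abs(np.diff(np.asarray(signal, dtype=float)))))

def global_fab_parameters(signal, factors=(2.0, 4.0, 1.0)):
    """Global [kf, kb, w] = factors * MAG, as in the example given in the text."""
    mag = mean_abs_gradient(signal)
    return tuple(f * mag for f in factors)

def local_fab_parameters(signal, window=15, factors=(2.0, 4.0, 1.0), min_mag=1e-3):
    """Spatially varying [kf(x), kb(x), w(x)] from the MAG in a sliding window.

    `window` should be odd; `min_mag` is an assumed floor that keeps some
    forward diffusion active in flat regions.
    """
    signal = np.asarray(signal, dtype=float)
    grad = np.abs(np.diff(signal, append=signal[-1]))      # same length as signal
    padded = np.pad(grad, window // 2, mode="edge")
    local_mag = np.convolve(padded, np.ones(window) / window, mode="valid")
    local_mag = np.maximum(local_mag, min_mag)
    return tuple(f * local_mag for f in factors)

sig = np.array([0.0, 1.0, 3.0, 6.0])
print(global_fab_parameters(sig))
print(local_fab_parameters(sig, window=3))
```

For 2D images the same idea applies with a 2D gradient magnitude and a 2D window; the text also mentions using a Gaussian-smoothed gradient magnitude in place of the MAG.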
6. Examples
We used the explicit Euler scheme with a forward difference for the time derivative and a central difference scheme with a $3 \times 3$ kernel for the spatial derivatives. Examples of signal and image restoration using the selective inverse diffusion are shown below. A blurred and noisy step signal (Figure 16b) was processed, assuming the availability of some prior information regarding the expected range of noise power and the approximate size of the original step. Enhancement of the step and denoising of the rest of the signal are clearly depicted in Figure 16e. The second example illustrates simultaneous denoising and enhancement of a blurred multistep signal contaminated by uniform noise (Figure 17).
The FAB is also effective in the enhancement of images, as illustrated in Figures 18, 19, 20, and 21. In Figures 18 and 19 we compare FAB results with those obtained by the application of the A-M process [Eq. (22)]. The A-M process indeed enhances the objects and forms clear edges, but small details are lost (e.g., the teddy bear’s shirt patterns). Consequently, the image appears to have lost its natural look. In addition, relatively smooth regions (like the table) are erroneously enhanced, creating artificial textures. A plot of a horizontal line from the middle of the image (Figure 19) highlights in more detail the different behavior of the two enhancement processes. A fidelity term $\lambda(I_0 - I)$ was added to the evolution equation ($\lambda = 0.05$) in Figures 18 and 21.
In the second example (Figure 20), we implemented the automatic adjustment of local parameters explained previously. This was needed to enhance gradients at different locations differently. For example, the gradients near the parrot’s beak are very large, whereas those around the eye are much smaller. Enhancing both regions without blurring important details on the one hand, and maintaining stability on the other, required completely different sets of parameters $(k_f, k_b, w)$. With automatic adjustment, the parameters vary gradually with image location, and the enhancement appears natural. In Figure 21 we compare the results of constant versus spatially varying parameters; with the latter, the deer, as well as the trees behind them, are enhanced. We should comment that the process is not well suited for handling textures, as seen in the spots created on the textured ground.
FIGURE 16. FAB processing of a single-step noisy signal (from top to bottom): (a) blurred step; (b) signal contaminated by white Gaussian noise (SNR = 7 dB); (c–e) results of FAB diffusion process after (c) 1, (d) 3, and (e) 30 time steps, respectively. $[k_f, k_b, w] = [1/6, 1, 1/3]$.
FIGURE 17. FAB processing of multiple-step noisy signal (from top to bottom): (a) original signal (with both positive and negative discontinuities); (b) blurred signal contaminated by white uniform noise (SNR = 8 dB); (c–e) results of FAB diffusion process obtained after (c) 2, (d) 4, and (e) 16 time steps, respectively. $[k_f, k_b, w] = [0.1, 0.8, 0.2]$.
FIGURE 18. Comparison between regularized shock filter (Alvarez–Mazorra) and FAB. (a) Original; (b) shock filter ($\sigma = 1$, $c = 0.5$); (c) FAB process ($[k_f, k_b, w] = [2, 20, 10]$).
FIGURE 19. Plot of gray-level values obtained along one line of Figure 18. (a) Original; (b) shock; (c) FAB.
7. Stability of Smooth Regions in 1D
Problem definitions: The flux $J$ is defined as follows:
$$J \equiv c(\cdot)\,I_x(x, t).$$
(Note that flux in physical problems is usually defined with the opposite sign.) We assume a diffusion coefficient of the type $c = c(|I_x|)$, leading to the flux properties $J = J(I_x(x, t))$ and the antisymmetry relation $J(s) = -J(-s)$. The nonlinear diffusion equation, with its initial and boundary conditions, is:
$$I_t = J_x = \frac{\partial}{\partial x}\big(c(|I_x|)\,I_x\big), \quad 0 \le x \le 1; \qquad I(x, t = 0) = I_0(x); \qquad I_x(0, t) = I_x(1, t) = 0.$$
In Theorem 2 we regard the simpler case of a positive diffusion coefficient ($c(\cdot) > 0$) with nonmonotonic flux. We prove that once a gradient gets into the smooth band $I_x \in [-r, r]$, where $r$ is the point of maximum flux, it remains trapped there.
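The 1D problem above can be simulated with a simple explicit Euler step. The sketch below is an illustration under stated assumptions rather than the scheme used for the figures: it uses a conservative (flux-form) discretization on a unit-spaced grid with the flux evaluated between neighboring samples, which is swapped in here for the central-difference scheme mentioned in the Examples section because it makes the zero-flux (Neumann) boundaries and mean preservation (Property 4) explicit. The names `c_fab` and `fab_diffuse_1d`, the time step, and the test signal are hypothetical.

```python
import numpy as np

def c_fab(s, kf, kb, w, alpha, n=4, m=2):
    """FAB diffusion coefficient of Eq. (28)."""
    return 1.0 / (1.0 + (s / kf) ** n) - alpha / (1.0 + ((s - kb) / w) ** (2 * m))

def fab_diffuse_1d(signal, kf, kb, w, alpha, dt=0.1, steps=30):
    """Explicit Euler integration of I_t = (c(|I_x|) I_x)_x with zero-flux ends."""
    I = np.asarray(signal, dtype=float).copy()
    for _ in range(steps):
        grad = np.diff(I)                                   # I_x between samples
        flux = c_fab(np.abs(grad), kf, kb, w, alpha) * grad # J = c(|I_x|) I_x
        flux = np.concatenate(([0.0], flux, [0.0]))         # Neumann: J = 0 at both ends
        I += dt * np.diff(flux)                             # I_t = J_x
    return I

# Hypothetical test signal: a smooth step with a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
noisy_step = 1.0 / (1.0 + np.exp(-40.0 * (x - 0.5))) + 0.02 * rng.standard_normal(x.size)
result = fab_diffuse_1d(noisy_step, kf=1 / 6, kb=1.0, w=1 / 3, alpha=0.0625)
print(noisy_step.mean(), result.mean())
```

Because the update is written as a difference of fluxes that vanish at the boundaries, the sum of the samples is unchanged at every step up to rounding, so the two printed means agree.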
FIGURE 20. FAB diffusion process applied to the parrot image, with local parameter adjustment using the MAG measure. (a) Original image; (b) blurred image. Bottom: diffusion process after (c) 1 and (d) 8 time steps, respectively.
The maximum of the flux (see Figure 22) is defined by:
$$M \equiv \max_{I_x}\{J(I_x)\}, \qquad r = \arg\max_{I_x}\{J(I_x)\}.$$
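As a worked illustration of these definitions, consider the P-M coefficient of Eq. (29), which is positive everywhere and has a nonmonotonic flux. The choice of this coefficient as the example is an assumption made here for concreteness; the computation is elementary calculus.

```latex
\begin{align*}
J(s) &= s\,c_{PM}(s) = \frac{s}{1 + (s/k)^2}, \\
J'(s) &= \frac{1 - (s/k)^2}{\bigl(1 + (s/k)^2\bigr)^2} = 0
  \quad\Longrightarrow\quad r = k, \\
M &= J(r) = \frac{k}{2}.
\end{align*}
```

In this case the smooth band is $I_x \in [-k, k]$: the flux increases for $|s| < k$ and decreases for $|s| > k$.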
Theorem 2 (smooth band “trap”) If $J(r) \equiv M = \max_{I_x}\{|J(I_x)|\}$, $c(s) > 0 \ \forall s$, and $|I_x(x_0, t_0)| < r$, then $|I_x(x_0, t)| < r$ for any $t > t_0$.
Theorem 3 is the version of Theorem 2 adapted to the proposed FAB coefficient, which has both positive and negative values of $c$. The points of extrema of the flux, in an FAB diffusion process, are defined as follows (see Figure 23):
$$M_f \equiv \max_{I_x > 0}\{J(I_x)\}, \qquad M_b \equiv \min_{I_x > 0}\{J(I_x)\}, \qquad \{r_f : J(I_x = r_f) = M_f\}, \qquad \{r_b : J(I_x = r_b) = M_b\}.$$
FIGURE 21. FAB diffusion process applied to the deer image. (a) Original image; (b) result of processing with constant parameters $k_f = 2$, $k_b = 50$, $w = 10$; (c) magnitude of smoothed gradient of original image $T(x, y) = |\nabla I_0 * G_\sigma|$ ($\sigma = 3$); (d) result of processing with spatially varying parameters $k_f(x, y) = 0.1T$, $k_b(x, y) = 6T$, $w(x, y) = 2T$. © M-dash.com.
FIGURE 22. Nonmonotonic flux of a forward-diffusion process and its critical points $M$ and $r$.
FIGURE 23. Flux and critical points of the FAB diffusion process.
This theorem states that, in the 1D case, a point $x_0$ with an initial gradient magnitude below $r_f$ will not assume a gradient magnitude larger than $r_f$ (i.e., will stay “smooth”) throughout the entire FAB diffusion process, provided the forward maximum flux, $M_f$, is larger in magnitude than the backward one, $M_b$.
Theorem 3 (stability of smooth regions) If $M_f > |M_b|$, then for every $x_0$ for which $|I_x(x_0, 0)| < r_f$, the derivative stays bounded at all times (i.e., $|I_x(x_0, t)| < r_f$ for any $t > 0$).
C. Super-Resolution by the FAB Process
The FAB diffusion process is useful in applications requiring simultaneous enhancement and smoothing. We present a simple super-resolution (SR) scheme, incorporating two main subsystems: an interpolator and an enhancer-denoiser, as shown in Figure 24.
FIGURE 24. Super-resolution processor.
1. Some Background: What Is Super-Resolution?
By SR we refer to the process of artificially increasing the resolution of an image, using side information about the structure of a specific subset of images or of natural images in general. The processed image should not only have more pixels but, more importantly, be characterized by a wider band than that of the original image. Most applications of SR use several images obtained from the same scene or object, taken from slightly different angles or locations. After proper registration, a higher-resolution image can be obtained from the low-resolution images by exploiting the combined information available at the different sets of sampling points. Examples of such SR procedures can be found in studies conducted at NASA on satellite images (Cheeseman et al., 1996), by Schultz and Stevenson (1996), who processed a series of movie frames, and by Elad and Feuer (1999).
2. The Proposed Scheme: Single-Image Super-Resolution
We elaborate an approach suitable for SR based on a single image, similar to Vitsnudel et al. (1991). Instead of using a sequence of video frames or multiple exposures, we exploit the properties common to a wide range of natural images. Obviously, there are cases where only one image is available, and one would still like to enhance the resolution. The basic assumption is that images can be segmented into regions falling into one of the following three categories: smooth areas, edges, and textured regions. At this point we simplify our model and consider only images that are not endowed with significant textural attributes, that is, images that can be approximated by piecewise-smooth segments separated by edges. The proposed scheme receives a low-resolution image as an input, possibly with some prior information about the structure of the scene. The processing is executed in two steps. First, the image is interpolated to the new desired size. In our implementation we used cubic B-spline interpolation, but other methods may also be used. The first step provides good results over smooth areas, but edges are smeared. The interpolated images often exhibit ringing effects, with low spatial oscillations. The purpose of the second processing step is to enhance the edges and denoise the interpolation by-products. This is accomplished by using the FAB diffusion process. In our implementation the parameters $k_f$, $k_b$, $w$ were locally adjusted according to the mean gradient criterion.
3. Resolution Enhancement: An Example
Consider a narrow-band system, such as a cellular telephone, that permits the communication of only low-resolution images at a reasonable rate. We wish to enhance the resolution of an image at the receiving end of the communication channel in such a way that it will appear as though a high-resolution image was transmitted over a wide-band channel. We down-sample a high-resolution input image by 4 in each dimension and send the low-resolution “blocky” image (i.e., 1/16 of the original size). At the receiving end we apply the proposed SR process: the image is up-sampled and enlarged back to its original size, and the FAB process is then applied. The end result (Figure 25d) looks more like the original image (Figure 25a) than the low-resolution image does (Figure 25b). A considerable improvement can be gained by transmitting some side information in addition to the image itself. Such side information may include suitable parameters of the FAB process, specification of segments where enhancement should be avoided or emphasized, and so forth.
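The two-stage structure of this example (interpolate, then enhance) can be sketched as a small pipeline. The code below is only a structural illustration under assumptions: pixel replication stands in for the cubic B-spline interpolation used in the text, the enhancement stage is left as a pluggable function (identity by default) where an FAB diffusion routine would be supplied, and all function names are hypothetical.

```python
import numpy as np

def downsample(image, factor=4):
    """Keep every `factor`-th pixel in each dimension (1/factor**2 of the pixels)."""
    return image[::factor, ::factor]

def upsample_blocky(image, factor=4):
    """Placeholder interpolator: pixel replication (the text uses cubic B-splines)."""
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

def super_resolve(low_res, factor=4, interpolate=upsample_blocky, enhance=lambda img: img):
    """Stage 1: interpolate to the target size. Stage 2: enhance and denoise."""
    return enhance(interpolate(low_res, factor))

high = np.arange(64 * 64, dtype=float).reshape(64, 64)   # stand-in for a real image
low = downsample(high, 4)
restored = super_resolve(low, 4)
print(low.size / high.size)      # fraction of pixels transmitted
print(restored.shape)
```

Side information of the kind mentioned above (for example, a mask of regions where enhancement should be avoided) could be passed to the `enhance` stage in such a design.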
FIGURE 25. Application of the SR process (from top down). (a) Original high-resolution image; (b) low-resolution blocky input image with 1/16 of the original pixels; (c) the image shown in (b) but up-sampled and interpolated by a cubic B-spline; (d) image obtained after FAB processing; (e) FAB processing with additional side information, avoiding enhancement of most of the sky area.
Whenever the original high-resolution image is available at the transmitting end of the channel, the optimal parameters suitable for the task can be found much more easily. In Figure 25, we assumed that additional information was available, specifying where enhancement should be avoided. Such image segments are typically blurry and fuzzy in the first place, like clouds for instance. In Figure 25e, we show the result of avoiding enhancement of most of the sky (above a certain horizontal line in the image). This results in fuzzy clouds, whereas the mountains below are crisp and sharp. By comparison, in the global enhancement (Figure 25d) the clouds are also sharpened and lose their natural appearance. We should emphasize that this resolution-enhancement process does not replace ordinary image compression. It can be used as an additional tool that improves the overall performance in terms of bandwidth of the final image that is displayed. Indeed the image of Figure 25d (or Figure 25e) is of a wider band than that of the transmitted one (Figure 25b).
D. Color Processing
1. The Beltrami Framework
For enhancement of color images, we adopt the Beltrami framework (described in Section I). We follow the studies presented by Perona and Malik (1990), Sochen et al. (1998), Kimmel et al. (2000), and Weickert (1999b), and show how one can design a structure tensor that controls the nonlinear diffusion process, starting from the induced metric that is given in the Beltrami framework. The proposed structure tensor is not restricted to being positive definite; it may be positive or negative and switches between these states according to image features. This results in an FAB diffusion flow, in which different regions of the image are either forward or backward diffused, according to the local geometry within a neighborhood. The adaptive property of the process, which finds its expression in the local decision on the direction of the diffusion and on its strength, is the main novelty of this section.
2. The Adaptive Structure Tensor
From the above derivation of the induced metric $g_{\mu\nu}$, it follows that the larger eigenvalue $\lambda_1$ corresponds to the eigenvector in the gradient direction (in the 3D Euclidean case: $(I_x, I_y)$). The smaller eigenvalue $\lambda_2$ corresponds to the eigenvector perpendicular to the gradient direction (in the 3D Euclidean case: $(-I_y, I_x)$). The eigenvectors are equal for both $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$, whereas the eigenvalues have reciprocal values. We can use the eigenvalues as a means to control the Beltrami flow process. For convenience let us define $\lambda^1 \equiv 1/\lambda_1$. As the first eigenvalue of $g^{\mu\nu}$ (that is, $\lambda^1$) increases, so does the diffusion force in the gradient direction. Thus by changing this eigenvalue we can reduce, eliminate, or even reverse the diffusion process in the gradient direction. Similarly, changing $\lambda^2 \equiv 1/\lambda_2$ controls the diffusion in the level-set direction. What is the best strategy to control the diffusion process via adjustment of the relevant parameters? The following requirements may be considered as guidelines:
The enhancement should essentially be with respect to the important features, whereas smooth segments should not be enhanced.
The contradictory processes of enhancement and noise reduction by smoothing (filtering) should coexist.
The process should be as stable as possible, although restoration and enhancement processes are inherently unstable.
Let us define $\alpha_1(s)$ as a new adaptive eigenvalue to be considered instead of the original $\lambda^1$. We propose an eigenvalue that is a function of the determinant of the smoothed metric. The formulation of the new eigenvalue is the same as that of the FAB diffusion coefficient, that is:
$$\alpha_1(s) = c(s), \tag{32}$$
where $c(s)$ is defined by Eq. (28) and $s$, here, is chosen to be a function of the smoothed metric: $s = \sqrt{\det(g_{\mu\nu} * G_\rho)}$.
3. Algorithm for Color Image Enhancement
To implement the flow $I_t = \Delta_{\hat g} I$ for color image enhancement, we modify and generalize the algorithm of Kimmel et al. (2000) as follows:
1. Compute the metric $g_{\mu\nu}$. For the $N$-channel case (for conventional color mapping $N = 3$), we have
$$g_{\mu\nu} = \delta_{\mu\nu} + \sum_{k=1}^{N} I^k_\mu I^k_\nu. \tag{33}$$
2. Diffuse the $g_{\mu\nu}$ coefficients by convolving them with a Gaussian of variance $\rho$, thereby
$$\tilde g_{\mu\nu} = G_\rho * g_{\mu\nu}. \tag{34}$$
3. Compute the inverse smoothed metric $\tilde g^{\mu\nu}$. Change the eigenvalues $\lambda^1$, $\lambda^2$ ($\lambda^1 < \lambda^2$) of $\tilde g^{\mu\nu}$ to $\alpha_1(s)$, $\alpha_2$, respectively. The new, second eigenvalue should be in the range $0 < \alpha_2 \le 1$, preferably minimal ($\alpha_2 \ll 1$) when the image is not noisy. This yields a new inverse structure tensor $\hat g^{\mu\nu}$ that is given by:
$$(\hat g^{\mu\nu}) = \tilde U \begin{pmatrix} \alpha_1(s) & 0 \\ 0 & \alpha_2 \end{pmatrix} \tilde U^T = \tilde U \tilde\Lambda \tilde U^T. \tag{35}$$
4. Calculate the determinant of the new structure tensor. Note that $\hat g$ can now have negative values.
5. Evolve the $k$-th channel via the Beltrami flow
$$I^k_t = \Delta_{\hat g} I^k \equiv \frac{1}{\sqrt{\hat g}}\,\partial_\mu\big(\sqrt{\hat g}\,\hat g^{\mu\nu}\,\partial_\nu I^k\big). \tag{36}$$
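Steps 1 to 3 of the algorithm above can be sketched in NumPy as follows. This is a rough illustration under several assumptions, not the authors' implementation: image derivatives are taken with `np.gradient`, the Gaussian smoothing is a small hand-written separable filter parameterized by a standard deviation `sigma` (the text specifies the Gaussian by its variance $\rho$), and the function names and default values are hypothetical. The flow of step 5 is not implemented here.

```python
import numpy as np

def gaussian_blur(a, sigma):
    """Separable Gaussian blur of a 2D array, with edge replication at borders."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    for axis in (0, 1):
        pad = [(0, 0), (0, 0)]
        pad[axis] = (radius, radius)
        padded = np.pad(a, pad, mode="edge")
        a = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), axis, padded)
    return a

def c_fab(s, kf, kb, w, alpha, n=4, m=2):
    """FAB coefficient of Eq. (28), used as the adaptive eigenvalue alpha_1(s)."""
    return 1.0 / (1.0 + (s / kf) ** n) - alpha / (1.0 + ((s - kb) / w) ** (2 * m))

def adaptive_inverse_tensor(image, kf, kb, w, alpha, alpha2=0.01, sigma=1.0):
    """image: (H, W, N) array. Returns the (H, W, 2, 2) tensor of Eq. (35)."""
    Iy, Ix = np.gradient(image.astype(float), axis=(0, 1))
    # Step 1: metric g = delta + sum_k I^k_mu I^k_nu, Eq. (33).
    g = np.empty(image.shape[:2] + (2, 2))
    g[..., 0, 0] = 1.0 + np.sum(Ix * Ix, axis=-1)
    g[..., 0, 1] = g[..., 1, 0] = np.sum(Ix * Iy, axis=-1)
    g[..., 1, 1] = 1.0 + np.sum(Iy * Iy, axis=-1)
    # Step 2: smooth each metric coefficient, Eq. (34).
    for i in range(2):
        for j in range(2):
            g[..., i, j] = gaussian_blur(g[..., i, j], sigma)
    # Step 3: eigen-decompose the inverse smoothed metric and replace its eigenvalues.
    s = np.sqrt(np.linalg.det(g))
    _, U = np.linalg.eigh(np.linalg.inv(g))       # eigenvalues ascending: lambda^1 first
    lam = np.stack([c_fab(s, kf, kb, w, alpha), np.full_like(s, alpha2)], axis=-1)
    return np.einsum("...ij,...j,...kj->...ik", U, lam, U)

rng = np.random.default_rng(0)
img = rng.uniform(0.0, 255.0, size=(8, 8, 3))     # stand-in for a color image
T = adaptive_inverse_tensor(img, kf=10.0, kb=2000.0, w=1000.0, alpha=0.5)
print(T.shape)
```

The smallest eigenvalue of the inverse metric corresponds to the gradient direction, so it is the one replaced by $\alpha_1(s)$; the perpendicular direction receives the constant $\alpha_2$.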
REMARK: In this flow, we do not get imaginary values, although we have the term $\sqrt{\hat g}$, because in cases of negative $\hat g$ the constant imaginary factor $i \equiv \sqrt{-1}$ cancels.
4. Experimental Results
We applied three Beltrami-type processes to the iguana color image: the original scheme of Kimmel et al. (2000); a modified version of Kimmel et al. (2000), where the second eigenvalue is small; and our Beltrami–FAB process. The results presented in Figure 26 show that in the first process, smoothing along the edges is very dominant, creating snakelike features at places of nonorientational textures (like the sand). The second process (using a small value of $\alpha_2$) creates strong sharpening effects but amplifies noise in smooth regions (like the sea), as is clearly depicted in the enlargement (Figure 27). Our Beltrami–FAB process seems to behave well on this relatively complex natural image. In Figure 28, we show the effects of enhancement on a compressed image. The tulip image was highly compressed according to the JPEG standard. A known by-product of JPEG compression is the blocking effect created in smooth regions. Indeed, the original and modified schemes of Kimmel et al. (2000) enhanced the $8 \times 8$ block boundaries, whereas our scheme smoothed them out.
FIGURE 26. Iguana image processed by three Beltrami-type processes. From top: (a) original; (b) scheme of Kimmel et al. (2000) ($\alpha_1 = 0.3$, $\alpha_2 = \alpha_1^{-1}$); (c) modified Kimmel et al. (2000) with small $\alpha_2$ ($\alpha_1 = 0.3$, $\alpha_2 = 0.01$); (d) Beltrami–FAB process ($\alpha_1 = \alpha_1(s)$, $[k_f, k_b, w, \alpha] = [10, 2000, 1000, 0.5]$, $\alpha_2 = 0.01$). All processes ran 13 iterations, time step $dt = 0.1$, $\rho = 2$.
FIGURE 27. Enlargement of a segment of the iguana’s head, with the sea in the background. From left: (a) original; (b) image processed by a modification of Kimmel et al. (2000) with small $\alpha_2$; and (c) by the Beltrami–FAB. Note that smooth regions such as the sea do not become noisy when processed by our scheme.
FIGURE 28. A segment of the compressed tulip image, processed by three Beltrami-type schemes. (a) Original; (b) result of processing by the scheme of Kimmel et al. (2000) with $\alpha_1 = 0.5$, $\alpha_2 = \alpha_1^{-1}$; (c) processed by a modified Kimmel et al. (2000) with small $\alpha_2$ ($\alpha_1 = 0.5$, $\alpha_2 = 0.1$); (d) processed by the Beltrami–FAB process with $\alpha_1 = \alpha_1(s)$, $[k_f, k_b, w, \alpha] = [30, 300, 200, 0.5]$, $\alpha_2 = 0.1$. All processes ran 10 iterations, $dt = 0.1$, $\rho = 1$. Note that the JPEG-blocking artifacts are not enhanced by the Beltrami–FAB process.
E. Discussion
Sharpening and denoising are contradictory requirements in image enhancement. We show how they can be reconciled by a local decision mechanism that controls the orientation, type, and extent of the diffusion process. The combined FAB diffusion process offers practical advantages over previously proposed approaches to image enhancement.
One of the important aspects of any attempt to implement a truly backward diffusion process in image processing (i.e., a process where the diffusion coefficient becomes negative) is the inherent instability. Because physical diffusion and heat propagation occur only as a forward process, the mathematical model that well represents the physics becomes ill posed when the diffusion coefficient changes its sign. As is well known, stability is not well defined in ill-posed problems. It is therefore important to note that stability is afforded over certain regimes in the case of the FAB diffusion. We have proven stability for small gradient bands in the 1D case and verified the feasibility of our approach on a variety of signals and images. Intuitively, the stability in the backward process is afforded by its limitation to small areas of very few pixels, surrounded by larger areas of many more pixels, where the forward diffusion provides a “safety belt” that avoids explosion.
Indeed, because the majority of pixels in natural images are characterized by low gradients, and mainly singular edges give rise to the reversal of the diffusion coefficient sign, stability is achieved. This argument no longer holds when the FAB diffusion process encounters a highly textured or extremely noisy image. The FAB model is also generalized in the framework of the Beltrami flow for adaptive processing of color images. This is accomplished by replacing the eigenvalues of the color image metric by an adaptive coefficient that locally controls the orientation and extent of the diffusion. The decision of where and how to adapt the coefficient is based on the edge’s direction and strength, defined by the eigenvectors and determinant of the smoothed image metric, respectively. The FAB diffusion process takes place in the direction of the gradient, and forward diffusion takes place in the perpendicular direction. Examples illustrate that this approach works and that sharpening and denoising can be combined in the enhancement of gray-level and color images. For more details and numerical examples, see Gilboa et al. (2000a,b, 2001a, 2002a).
III. SHARPENING BY THE VARIATIONAL APPROACH
A. Introduction
In this section we address the issue of sharpening blurred and noisy images in the variational framework. We show how a selective sharpening process can be viewed as an energy minimization flow of a nonconvex energy density function in the shape of a triple well. Our aim is to find a proper potential $\Psi(|\nabla I|)$ for Eq. (5) that rewards sharper transitions on the one hand, and penalizes oscillations on the other. Minimizing this energy by steepest descent according to Eq. (7) provides the proper sharpening process. Not surprisingly, this eventually results in an FAB-type process, which differs, in some aspects, from the process presented in Section II. We analyze the relations between these processes and present the new insight revealed by following the variational approach. As in the FAB case, this type of potential is new and has not been investigated before. In the image processing community, almost all potentials proposed so far were increasing. After a survey of many different scientific fields that model phenomena with PDEs, we found that a gradient-based double-well potential is used to model the formation of microstructures (Ball and James, 1992; Ericksen, 1987). These related issues are addressed next.
1. Related Studies
Samson et al. (2000) presented a study involving a nonconvex potential using multiple wells for image classification. Their work is fundamentally different from ours in that their potential is based on the signal and not on its gradient, and their purpose is classification based on gray-level values. Kurganov et al. (1998) present some interesting bounds on the norm of the solution to a gradient-dependent inverse-diffusion problem in one dimension. This could be interpreted as minimization of a decreasing potential. The diffusion coefficient, though, is negative for small gradient magnitudes and the solution tends, therefore, to create microstructures.
2. The Double-Well Potential
Well-shaped potentials have been investigated recently in material science and structural mechanics (Ball and James, 1992; Ericksen, 1987; James, 1992; Luskin, 1997). We review some of the mathematical and numerical aspects that are relevant to our case. A mathematical model for the formation of microstructures in certain alloys was presented by Ball and James (1992). The theory is based on an energy minimization process of a double-well potential. The gradient-dependent potential attains its minimum value at symmetry-related deformation gradients (Ball and James, 1992; Ericksen, 1987; Luskin, 1997). In the 1D case a typical example of such a potential is
$$\Psi_{dw}(I_x) = (I_x^2 - k^2)^2. \tag{37}$$
Although it was not referred to as a diffusion process, and the outcome of this energy minimization flow does not resemble classical diffusion, it can clearly be viewed as a nonlinear diffusion process, with the following diffusion coefficient:
$$c_{dw}(|I_x|) = 4(I_x^2 - k^2). \tag{38}$$
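As a quick check of Eq. (38), assume the usual relation between a gradient-dependent potential and the diffusion coefficient of its steepest-descent flow, $c(s) = \Psi'(s)/s$. This relation is taken here as an assumption, since Eq. (7) is not reproduced in this section. The coefficient then follows by direct differentiation:

```latex
\begin{align*}
\Psi_{dw}(s) &= (s^2 - k^2)^2, \\
\Psi_{dw}'(s) &= 4s\,(s^2 - k^2), \\
c_{dw}(s) &= \frac{\Psi_{dw}'(s)}{s} = 4\,(s^2 - k^2),
\end{align*}
```

so that $c_{dw} < 0$ for $|s| < k$ and $c_{dw} > 0$ for $|s| > k$.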
Plots of the potential and of the corresponding diffusion coefficient are depicted in Figure 29. This is indeed an FAB-type process: for small gradients $|I_x| < k$, it is a backward-diffusion process; and for large gradients $|I_x| > k$, it is a forward one. This leads to the sharpening of low gradients and the smoothing of large gradients, where both approach a magnitude of $k$ ($I_x = \pm k$). Because the potential is nonconvex, and along some of its segments decreasing (creating an inverse diffusion flow), this process has stimulated a growing number of studies dealing with both the theoretical and numerical difficulties that it entails (see, for example, Ball and James, 1992; Carstensen and Plechac, 1997; Gobbert and Prohl, 1998; Luskin, 1997; Munoz and Pedregal, 2000; and Pedregal, 1996).
FIGURE 29. A double-well potential (a) and the corresponding diffusion coefficient (b). $k = 1$.
Three main methods for numerical solutions of such problems were proposed (Gobbert and Prohl, 1998):
Convexification of the potential, wherein the original potential is replaced by its convex hull. In this case, there exists a minimizer and it can be easily obtained, but at the cost of changing some of the process characteristics.
Reformulation of the problem using Young measures (a mathematical tool in the calculus of variations, applying a gradient-generated family of probability measures) (Demoulini, 1996; Pedregal, 1999; Roubicek, 1997).
Direct minimization of the energy functional. In this type of method, the process may converge toward a fixed point at a local minimum because of the nonconvex nature of the problem. However, in some applications, such minima are also of interest.
The nature of the double-well and other related problems is quite similar to the formalism of our problem, and numerical techniques in image processing can most likely benefit from the research conducted in this (mathematically and computationally) related field. Yet, we should emphasize the following differences from the problem that we have at hand:
The potential does not have a “relaxed” region, where gradients are being smoothed. Specifically, constant functions are unstable.
The basic solution of the crystalline microstructure tends to have oscillations, which is not desirable in our case.
The boundary conditions are different (Dirichlet versus Neumann in our case).
The motivation is different. We are interested in the evolution of the input image, whereas analysis of the double-well model focuses on the final minimal energy state, with only a weak relation to the initial state.
B. Energy Wells in Image Processing
1. The Energy Functional
We minimize the following energy functional:
$$E(I) = \int_\Omega \big(W(|\nabla I|) + \lambda F(I) + R(|\nabla^2 I|)\big)\,dx. \tag{39}$$
$W$ is a potential generating a selective sharpening flow. Its form is discussed in detail below. $F$ is a convex fidelity criterion related to the input image:
$$F(I) = r(|I - I_0|). \tag{40}$$
We choose here the standard function r(s) ¼ 12s2, but other choices are also possible (e.g., Nikolova, 2002). In Section V, we discuss in more detail the role of the fidelity term in denoising processes and suggest an adaptive term to preserve textures. Note that we assume no a priori knowledge of the blurring process and therefore avoid the introduction of a blur operator in the fidelity term. For cases of linear and translation invariant blur, a blind deconvolution may be a viable option (see Chan and Wong, 1998; Kaftory et al., 2003). R is a higher-order regularizing term. It is a function of the Laplacian and is discussed later. 2. The Triple-Well Potential We begin by discussing the shape of the potential W derived from our objectives. The blurring process smears edges; thus gradients of large magnitude decrease. We would like to reverse this process and increase medium gradients back to their original state. Therefore high gradients should retain a lower energy state (‘‘cost less energy’’) and the energy minimization process would thus be rewarded on edge sharpening. However, two restrictions must be made. First, a saturation of the sharpening should be defined so that very high gradients would not be sharpened and cause the explosion of the signal. As we do not want to fall in the
GILBOA ET AL.
category of the ill-posed problems of the condition in Eq. [8], very large gradients should even be smoothed slowly to reduce staircasing. Second, low gradients should not be enhanced, to avoid noise amplification as much as possible. Specifically, the zero gradient should not contribute any energy (i.e., be of zero potential). From this discussion it follows that a potential intended for sharpening should be constructed of three basic attractors (low-energy states) in one dimension: two for high gradients (of positive and negative values) and one for the zero gradient. In two dimensions the potential is rotationally symmetric. This leads to a triple-well-shaped potential. Formally, we set the following requirements:
\[
\begin{aligned}
&(a)\quad W(0) = 0 \\
&(b)\quad W(s) = W(-s), \ \forall s \\
&(c)\quad W(s) \ge 0, \ \forall s \\
&(d)\quad \exists\, 0 < a < b < \infty : \ W'(s \in (a,b)) < 0 \\
&(e)\quad W'(s \to \infty) > 0.
\end{aligned}
\tag{41}
\]
We suggest the following formula for the potential:
\[ W(s) = k_f\sqrt{k_f^2 + s^2} - k_f^2 - \frac{\alpha}{2} k_b^2 \log\left(1 + \left(\frac{s}{k_b}\right)^2\right), \tag{42} \]
where $k_f$, $k_b$ are parameters determining the lower gradients' forward diffusion region and the higher gradients' backward diffusion region, respectively ($k_f < k_b$), and $\alpha$ is a weight parameter. In order to fulfill Requirement (41.c), a proper bound on $\alpha$ should be set. The corresponding diffusion coefficient is
\[ c_W(s) = \frac{1}{\sqrt{1 + (s/k_f)^2}} - \frac{\alpha}{1 + (s/k_b)^2}. \tag{43} \]
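The relation between the potential and its diffusion coefficient can be checked numerically. The following is a minimal sketch, assuming Eq. (42) reads $W(s) = k_f\sqrt{k_f^2+s^2} - k_f^2 - \tfrac{\alpha}{2}k_b^2\log(1+(s/k_b)^2)$ and Eq. (43) reads $c_W(s) = 1/\sqrt{1+(s/k_f)^2} - \alpha/(1+(s/k_b)^2)$, with the Figure 30 values $k_f = 0.2$, $k_b = 1$, $\alpha = 2.2k_f/k_b$. The function names `triple_well` and `c_w` are illustrative and not taken from the original study.

```python
import numpy as np

def triple_well(s, kf=0.2, kb=1.0, alpha=None):
    """Potential W(s) as read from Eq. (42); alpha defaults to 2.2*kf/kb."""
    if alpha is None:
        alpha = 2.2 * kf / kb
    s = np.asarray(s, dtype=float)
    return (kf * np.sqrt(kf**2 + s**2) - kf**2
            - 0.5 * alpha * kb**2 * np.log1p((s / kb) ** 2))

def c_w(s, kf=0.2, kb=1.0, alpha=None):
    """Diffusion coefficient c_W(s) = W'(s)/s as read from Eq. (43)."""
    if alpha is None:
        alpha = 2.2 * kf / kb
    s = np.asarray(s, dtype=float)
    return 1.0 / np.sqrt(1.0 + (s / kf) ** 2) - alpha / (1.0 + (s / kb) ** 2)

# Compare a central-difference estimate of W'(s)/s with the closed form c_W(s).
s = np.linspace(0.01, 20.0, 4000)
h = 1e-6
dW = (triple_well(s + h) - triple_well(s - h)) / (2 * h)
max_mismatch = np.max(np.abs(dW / s - c_w(s)))

# The sign pattern of c_W reflects the three regimes: forward diffusion near
# zero, backward diffusion for medium gradients, forward again for large ones.
signs = [float(np.sign(c_w(v))) for v in (0.0, 1.0, 20.0)]
```

With these parameters the coefficient is positive near zero, negative around $s = 1$, and positive again for large gradients, which is the saturation behavior described in the text.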
The potential is "designed" such that the resultant diffusion coefficient is as simple as possible. After all, we use the diffusion coefficient to compute the flow in the numerical implementation. (See Figure 30 for plots of $W$ and $c_W$.) Other, more sophisticated formulas, with more parameters controlling the shape of the potential, can be used.

3. Higher-Order Regularization

We wish to have the "smoothest" possible energy minimizer in order to reduce oscillations between the three low-energy states. (The reasoning is similar to what is given in cases of viscosity solutions.) For this purpose, we add the following high-order convex regularization term to the total energy density function:
FIGURE 30. A triple-well potential (a) and the corresponding 1D diffusion coefficient (b). $k_f = 0.2$, $k_b = 1$, $\alpha = 2.2k_f/k_b$.
\[ R(|\nabla^2 I|) = \tfrac{1}{2}|\nabla^2 I|^2. \tag{44} \]
This adds a linear fourth-order term $\nabla^4 I$ to the gradient descent flow, where $\nabla^4$ is the biharmonic operator (or bi-Laplacian). In the 1D case, $\nabla^4 I = I_{xxxx}$, whereas in two dimensions, $\nabla^4 I = I_{xxxx} + 2I_{xxyy} + I_{yyyy}$. The fourth-order linear equation
\[ I_t = -\nabla^4 I, \qquad I|_{t=0} = I_0 \tag{45} \]
is often referred to as a hyperdiffusion flow (also superdiffusion). The fundamental solution of Eq. (45) in the frequency domain ($\omega$) is $e^{-\omega^4 t}$, implying that it is a strongly low-pass filtering flow that rapidly diminishes high-frequency oscillations. (See Figure 31 for plots of the fundamental solution, and Figures 32 and 33 for examples of hyperdiffusion in one and two dimensions.) A nonlinear hyperdiffusion term was added by Wei (1999) to the standard Perona–Malik (1990) equation to rapidly remove the noise. Note, though, that hyperdiffusion does not obey the minimum-maximum principle (the spatial fundamental solution is not strictly positive and more closely resembles the ideal low-pass sinc function [Figure 31]). Thus its implementation for denoising purposes should be executed with care. The Cahn–Hilliard (1958) and Kuramoto–Sivashinsky (Kuramoto, 1984; Sivashinsky, 1983) equations have a hyperdiffusion term that stabilizes inverse diffusion processes (along with a first-order nonlinearity). These equations were used to model evolution of phase fields in alloy mixtures (Cahn and Hilliard, 1958), oscillatory chemical reactions (Kuramoto, 1984), and fronts of premixed flames (Sivashinsky, 1983), among other natural phenomena (Rost and Krug, 1995; Witelski, 1996).
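The low-pass behavior of the two linear flows can be compared directly in the frequency domain. The sketch below assumes a periodic 1D grid with unit spacing (not the Neumann conditions used elsewhere in this section) and applies the multipliers $e^{-\omega^2 t}$ (diffusion) and $e^{-\omega^4 t}$ (hyperdiffusion) with the FFT. The helper names `spectral_flow` and `amplitude` and the two-harmonic test signal are hypothetical.

```python
import numpy as np

def spectral_flow(I0, t, order):
    """Exact periodic solution of I_t = -|d/dx|^order I (order=2: diffusion, 4: hyperdiffusion)."""
    omega = 2 * np.pi * np.fft.fftfreq(I0.size)          # angular frequencies, unit grid spacing
    multiplier = np.exp(-np.abs(omega) ** order * t)
    return np.fft.ifft(multiplier * np.fft.fft(I0)).real

def amplitude(I, k):
    """Amplitude of the k-th harmonic of a real signal."""
    return 2 * np.abs(np.fft.fft(I)[k]) / I.size

N = 256
x = np.arange(N)
low, high = 4, 100                                       # |omega| < 1 and |omega| > 1, respectively
I0 = np.cos(2 * np.pi * low * x / N) + np.cos(2 * np.pi * high * x / N)

diffused = spectral_flow(I0, t=1.0, order=2)
hyperdiffused = spectral_flow(I0, t=1.0, order=4)

for name, I in (("diffusion", diffused), ("hyperdiffusion", hyperdiffused)):
    print(name, amplitude(I, low), amplitude(I, high))
```

For $|\omega| > 1$ the fourth-order multiplier decays faster than the second-order one, and for $|\omega| < 1$ it decays more slowly, which matches the sharper frequency cutoff described for Figure 31.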
FIGURE 31. Fundamental solution of the hyperdiffusion (line) vs. diffusion (dots), plotted in the spatial domain (a) and frequency domain (b). Whereas the diffusion kernel is Gaussian in both domains, the hyperdiffusion has a sharper frequency cutoff and also attains negative values in the spatial domain.
FIGURE 32. Comparison of hyperdiffusion (left) and linear diffusion (right) processing of noise and a step edge, given at times 0, 0.1, 1, 10 (from top to bottom, respectively). Hyperdiffusion diminishes high-frequency noise more rapidly, whereas low frequencies decay slower. Also, hyperdiffusion does not obey the minimum-maximum principle (most apparent in the step processing).
Witelski (1996) showed that a nonlinear forward-backward diffusion process with higher-order regularization (of hyperdiffusion and a viscous relaxation term) yields a unique solution. Although the equations are different (e.g., Witelski's nonlinear diffusion coefficient is a function of the signal itself [$c = c(I)$] and not of its gradient), we assume that similar results can be obtained in our case.
FIGURE 33. Hyperdiffusion processing of the cameraman image, given at times 0 (a), 0.1 (b), 1 (c), and 10 (d).
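4. Energy Minimization Flow

We use the following dissipating energy process:
\[
\begin{aligned}
I_t &= \mathrm{div}\left(c_W(|\nabla I|)\nabla I\right) + \lambda(I_0 - I) - \epsilon\nabla^4 I, \\
I|_{t=0} &= I_0, \qquad \partial_n I|_{x\in\partial\Omega} = 0, \qquad \partial_n^2 I|_{x\in\partial\Omega} = 0,
\end{aligned}
\tag{46}
\]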
where $n$ is a unit vector, outward normal to the boundary $\partial\Omega$. The second boundary condition is stated in this case for the fourth-order PDE to be well defined (in addition to the standard first-order Neumann boundary conditions [BC]).

5. Steady-State Solutions

We would like to examine the evolution in time of the triple-well potential (without the fidelity and higher-order terms). In standard scale-space nonlinear diffusions the steady-state solution is the trivial constant image, where
the constant is the mean value of the input image. In our case, the flow is controlled by a nonconvex nonmonotone potential. What are its stationary solutions? A steady-state solution is reached when $I$ corresponds to a local minimum of the energy functional. A necessary condition is the Euler-Lagrange (E-L) equation (Aubert and Kornprobst, 2002). We examine the triple-well potential $W(s)$ of Gilboa et al. (2002b). In one dimension, the E-L equation corresponds to
\[ \frac{\partial}{\partial x}\left(\frac{W'(|I_x|)}{|I_x|} I_x\right) = 0, \tag{47} \]
which implies
\[ \frac{W'(|I_x|)}{|I_x|} I_x = \mathrm{Const}. \tag{48} \]
From the Neumann boundary conditions $I_x|_{x\in\partial\Omega} = 0$, it follows that $W'(|I_x|)|_{x\in\partial\Omega} = W'(0) = 0$. Therefore in the right-hand side of Eq. (48), Const $= 0$ and the E-L condition is satisfied when
\[ W'(|I_x|) = 0. \tag{49} \]
In our case this means that solutions of Eq. (46) with only the first term ($\lambda = 0$, $\epsilon = 0$) converge to piecewise linear functions with the following characteristics:
\[
\begin{aligned}
&(I)\quad I_x \in \{0, \pm k_w\}, \ \text{where } I_x \text{ exists;} \\
&(II)\quad \int_\Omega I(x)\,dx = \int_\Omega I_0(x)\,dx,
\end{aligned}
\tag{50}
\]
where $k_w > 0$ is the right local minimum of the triple-well potential (in which $W'(k_w) = 0$). A more rigorous analysis would require the definition of weak solutions (because the above functions are only piecewise differentiable). We shall not go into it in this study, but it can be an interesting path to follow in the future. Here we limit ourselves to showing numerically that the signal converges to a function in the set described by Eq. [50]. We could regard such evolution as approaching a sharpened cartoonlike sketch of the input signal. However, it should be noted that in practical evolutions the fidelity term does not allow reaching these degenerate states and the result is much closer to the input signal. Figures 34 and 35 show the behavior of three evolutionary processes in terms of their gradient statistics. An input signal containing only white Gaussian noise was set as the initial condition to three evolutionary processes. These were the triple-well potential process ($k_f = 0.2$, $k_b = 1$), linear forward diffusion ($c = 0.1$), and linear backward diffusion ($c = -0.001$). The
FIGURE 34. Gradient histograms h(Px) of three processes after 1000 iterations. (a) Histogram of input signal $I_0$, white Gaussian noise ($\sigma_n = 3$); (b) histogram after process by triple-well potential; (c) histogram after process by forward-linear diffusion; (d) histogram after process by backward (inverse) linear diffusion. Forward diffusion converges to a zero-gradient unique steady state. Backward diffusion diverges. The triple-well process converges to a local minimum of the energy functional where the gradient's value approaches three possible values, as stated in Goupillaud et al. (1984–1985). See Figure 35 for the evolution of these histograms in time.
FIGURE 35. Evolution in time of the gradient histograms of Figure 34. (a) Triple-well process; (b) forward diffusion; (c) backward diffusion. Lighter gray levels indicate higher values (log scale).
evolution was done for 1000 iterations. Figure 34 shows the resulting histogram of each process. Figure 35 shows the evolution of each histogram in time. The forward diffusion clearly converges to a solution where $I_x \to 0$. The backward diffusion diverges. The triple-well converges to a stationary solution of the type defined in Eq. (50).

C. Examples

A 1D signal resembling a blurred line (two close step edges of opposite signs), with additive noise, was processed (Figure 36). This example demonstrates a noise-removing process that also sharpens edges. Whereas the two edges are sharpened, the noise is smoothed out. This process can handle multiple types of blurs, both isotropic and anisotropic, simultaneously (Figure 37). This is in contrast to deconvolution techniques that assume either an a priori known or an unknown (blind deconvolution) linear-blurring kernel. In Figure 38 a
FIGURE 36. Line edge with additive white Gaussian noise of standard deviation $\sigma_n = 0.1$ (SNR = 7 dB). (a) Blurred signal; (b) blurred and noisy input signal; (c–e) processed signal at times 2.5 (c), 10 (d), and 150 (e), respectively. Parameters: $k_f = 1/6$, $k_b = 1$, $\epsilon = 0.01$, $\lambda = 0$.
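A one-dimensional version of the flow of Eq. (46) can be sketched with a simple explicit scheme. This is only an illustration under stated assumptions, not the implementation used for the figures: the flow is written in flux form, $I_t = (c_W(|I_x|)I_x - \epsilon I_{xxx})_x + \lambda(I_0 - I)$, the total flux is set to zero at both ends (which enforces $I_x = 0$ there and conserves the mean when $\lambda = 0$, property (II) of Eq. (50)), and the time step, grid, test signal, and names `flow_step` and `pad0` are hypothetical. The parameter values follow the Figure 36 caption, with $\alpha = 2.2k_f/k_b$.

```python
import numpy as np

def c_w(s, kf, kb, alpha):
    """Diffusion coefficient of the triple-well potential, as read from Eq. (43)."""
    return 1.0 / np.sqrt(1.0 + (s / kf) ** 2) - alpha / (1.0 + (s / kb) ** 2)

def pad0(v):
    """Append a zero at both ends (zero derivative / zero flux at the boundary)."""
    return np.concatenate(([0.0], v, [0.0]))

def flow_step(I, I0, dt, kf, kb, alpha, eps, lam):
    """One explicit Euler step of I_t = (c_W(|I_x|) I_x - eps I_xxx)_x + lam (I0 - I)."""
    dI = np.diff(I)                    # I_x at half-grid points
    lap = np.diff(pad0(dI))            # I_xx at grid points, with I_x = 0 at both ends
    flux = c_w(np.abs(dI), kf, kb, alpha) * dI - eps * np.diff(lap)
    return I + dt * (np.diff(pad0(flux)) + lam * (I0 - I))

# Hypothetical test signal: a blurred line (two close step edges) plus noise.
rng = np.random.default_rng(0)
x = np.arange(200)
line = ((x >= 90) & (x < 110)).astype(float)
kernel = np.exp(-0.5 * (np.arange(-15, 16) / 4.0) ** 2)
I0 = np.convolve(line, kernel / kernel.sum(), mode="same")
I0 = I0 + 0.1 * rng.standard_normal(x.size)

kf, kb, eps, lam = 1.0 / 6.0, 1.0, 0.01, 0.0
alpha = 2.2 * kf / kb
I = I0.copy()
for _ in range(250):                   # 250 steps of dt = 0.1, i.e. time 25
    I = flow_step(I, I0, 0.1, kf, kb, alpha, eps, lam)

mean_drift = abs(I.mean() - I0.mean())
```

Writing both terms as the divergence of a single flux makes mean conservation exact up to rounding, which is a convenient sanity check for any discretization of the flow with $\lambda = 0$.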
FIGURE 37. Processing of a nonstationarily blurred step image, contaminated by additive noise. Top left: Degradation function, highlighting regions of different types of degradations: (a) isotropic Gaussian blur ($\sigma = 2$); (b) anisotropic exponential blur, $e^{-(|x| + |y|/5)}$; (c) $5 \times 5$ uniform averaging blur; (d) jagginess. Regions overlapped by a few filters were processed by all of them. Top right: degraded image, with added Gaussian white noise of std $\sigma_n = 0.03$ and uniform white noise in the band $[-0.05, 0.05]$ (SNR = 15 dB). Bottom: processed image. Process parameters: $k_f = 0.02$, $k_b = 0.5$, $\lambda = 0.01$, $\epsilon = 0.1$. Image is $50 \times 80$ pixels, with original gray-level values of 0.25 (box) and 0.75 (background).
blurred flower image is processed. Here an extended version of the process is implemented, where the parameters controlling the shape of the well are spatially varying, that is, we use $k_f(x,y)$, $k_b(x,y)$. This is done to obtain a wider sharpening range, where enhancement is accomplished by inducing different thresholds in different locations. We use an automatic heuristic mechanism to determine these parameters without having any prior information. We define $T(x,y) = g_{\sigma_s} * |\nabla I_0(x,y)|$, which measures the average gradient magnitude in a neighborhood. The potential parameters are in turn adjusted according to $T(x,y)$. The following values were assigned: $k_f(x,y) = 0.5T(x,y)$, $k_b(x,y) = 5T(x,y)$, $\sigma_s = 5$. Although edges are sharper, there are still some staircasing effects and the edges are not so smooth. A straightforward improvement could be the implementation of tensor diffusivity, instead of a scalar one (as in Weickert's coherence-enhancing diffusion [1992a]), where the sharpening triple-well potential is used across the edge, and some smoothing potential is used along the edge. The numerical implementation consists of two iterative stages: at each time step the nonlinear FAB diffusion, with a fidelity term, is calculated by a
FIGURE 38. Comparison between our sharpening scheme and publicly available "off-the-shelf" sharpeners. Processing of a Gaussian-blurred flower image ($\sigma = 2$), contaminated by white Gaussian noise (SNR = 15 dB). (a) Input image; (b) processed image by our scheme; (c) applying Matlab unsharp filtering on the input image; (d) applying the sharpening filter of Photoshop on the input image. As can be seen, standard general sharpeners tend to amplify noise.
standard $3 \times 3$ template. The second stage implements the linear hyperdiffusion, by convolution with a $5 \times 5$ kernel (the minimal support required in the case of a fourth-order equation). For the triple-well potential we used, in all examples, $\alpha = 2.2k_f/k_b$.

D. Relations to FAB Diffusion of Section II

In the previous section, we proposed a different formula for a forward-backward diffusion coefficient. Both processes turn out to be FAB-type where denoising and sharpening processes are acting simultaneously on the signal. In the following list, we summarize the relations and main differences between these two processes.
1. The triple-well has a clear saturation in the enhancement of large gradients. This ensures stability but may slightly reduce contrast of very sharp edges, beyond the point of saturation. $c_{FAB}$ has no positive value in the large gradients region. This means its potential is not in a structure of a triple-well and can be viewed as a central well near zero and two one-sided wells in the high positive and negative gradients regions.
2. Eq. (28) is hard to formulate as a minimization of a gradient-based potential because $c_{FAB}(s)s$ has no analytic integral expression.
3. In this section a higher-order regularization (hyperdiffusion) was added that increases robustness and reduces oscillatory solutions and the enhancement of isolated points.
4. In the energy formulation, it is natural to allow only positive energy. The consequence of this restriction to any FAB diffusion coefficient (of which many cannot be formulated analytically in terms of potential) is:
\[ \int_0^q c(s)s\,ds \ge 0, \quad \forall\, 0 \le q \le \infty. \tag{51} \]
In the FAB formulation of Eq. (28) there exists a point $k$ where $c(s)s \ge 0$, $\forall\, 0 \le s \le k$ and $c(s)s \le 0$, $\forall\, k \le s < \infty$. Therefore the minimum of the integral is achieved at $\infty$ and the condition amounts to
\[ \int_0^\infty c_{FAB}(s)s\,ds \ge 0. \tag{52} \]
5. To see if the stability of smooth regions is assured in the triple-well formulation, we should check if the conditions of Theorem 2 are met. In terms of potentials this means:
\[ \Psi'(s)|_{s\in(0,q_1)} > \Psi'(s)|_{s\in(q_1,q_2)}, \tag{53} \]
where $\Psi(q_1)$ is the local maximum and $\Psi(q_2)$ is the local minimum of $\Psi$ ($q_1, q_2 > 0$). The formulation of Eq. (42) satisfies this condition. See Figure 39 for a plot of the flux of the triple-well potential.

E. Discussion

This study has been concerned with the task of enhancement of important (steep) edges, by increasing their gradients, in order to reverse blurring effects. Such an ill-posed task has to be accomplished without noise amplification to avoid signal "explosion." This led to the formulation of a novel approach of signal- and image-sharpening processes according to a framework of calculus of variations. Our proposal is to use a gradient-dependent energy functional
FIGURE 39. Triple-well flux $J_W(I_x)$. Conditions of Theorem 2 are kept in the triple-well formulation (guaranteeing that smooth regions are not enhanced). $k_f = 0.2$, $k_b = 1$, $\alpha = 2.2k_f/k_b$.
based on a triple-well potential. The present study extends our FAB diffusion-type process for sharpening of edges while denoising fluctuations and noise (Gilboa et al., 2002a) in that it formulates it as a variational problem. The variational approach permits incorporation of additional terms into the functional to account for the importance of additional image attributes. It also facilitates the process of regularization. To accomplish the desired task, two additional terms were added to the general energy functional: a standard fidelity term and the square magnitude of the Laplacian, serving as a high-order regularizing term. The energy minimization associated with the resultant functional leads to a hyperdiffusion flow, a fourth-order process that exhibits strong low-pass filtering and attenuates high-frequency oscillations that are characteristic of inverse diffusion. The hyperdiffusion process eliminates the effect of enhancement of isolated points, otherwise sharpened by the triple-well potential. Moreover, edges become more coherent. As the weight of this smoothing term increases, the sharpening effects become less apparent. Additional effects of hyperdiffusion on the general process are yet to be further analyzed and understood. The proposed approach of triple-well potentials, first proposed by Gilboa et al. (2004a), can be generalized to process color images using the Beltrami framework. For an example, see Sochen et al. (2000). It can be further generalized and extended for processing and enhancement of additional image features.
IV. COMPLEX DIFFUSION PROCESSES

A. Introduction

In this section we take a fresh look at the application of PDEs in image processing and computer vision and propose a new, more general framework. In various areas of physics and engineering, it was realized that extending the analysis from the real axis to the complex domain is very useful, even if the variables, quantities of interest, or both, are real. In many cases the analytical structure reveals important features of the system, which are difficult to account for by different means. Examples of this effect can be found in such unrelated subjects as the S-matrix elements in high-energy physics and in the bread and butter of signal processing: the Fourier transform. Similarly, the Gabor transform (Gabor, 1946), the Gabor wavelets (Zibulski and Zeevi, 1997), and the Morlet wavelet (Goupillaud et al., 1984–1985; Grossmann and Morlet, 1984) are complex-valued transforms. The latter is relevant to our study in that it incorporates a discrete set of scaled Gaussian filters and a set of scaled approximations of the Gaussian second derivative. All of these are examples of complex filters used in the processing of real signals. In this section, we follow the idea of complexification and generalize it from filters to PDEs. We generalize the linear scale spaces in the complex domain, by combining the diffusion equation with the free Schrödinger equation. A fundamental solution for the linear case is developed. Analysis of the linear complex diffusion shows that the generalized diffusion has properties of both forward and inverse diffusion. We thus obtain a stable flow that violates the maximum principle, while preserving other desirable mathematical and perceptual properties. The example of this flow may pave the way to a new class of diffusion-like processes.
An important observation, supported theoretically and numerically, is that the imaginary part can serve as an edge detector (smoothed second derivative scaled by time) when the complex diffusion coefficient approaches the real axis. Based on this observation, we develop two examples of nonlinear complex processes for the denoising and the enhancement of images. This section is based mainly on the studies of Gilboa et al. (2001b, 2002b, 2004a).

B. Previous Related Studies

Complex diffusion-type processes are commonly encountered, for example, in quantum physics and electro-optics (Cross and Hohenberg, 1993; Newell, 1974). The time-dependent Schrödinger equation is the fundamental equation
of quantum mechanics. In the simplest case for a particle without spin, subjected to an external field, it has the form
\[ i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\Delta\psi + V(x)\psi, \tag{54} \]
where $\psi = \psi(t,x)$ is the wave function of a quantum particle, $m$ is its mass, $\hbar$ is the reduced Planck constant, $V(x)$ is the external field potential, $\Delta$ is the Laplacian, and $i \doteq \sqrt{-1}$. With an initial condition $\psi|_{t=0} = \psi_0(x)$, requiring that $\psi(t,\cdot) \in L^2$ for each fixed $t$, the solution is $\psi(t,\cdot) = e^{-\frac{i}{\hbar}tH}\psi_0$, where the exponent is a shorthand for the corresponding power series, and the higher-order terms are defined recursively by $H^n\Psi = H(H^{n-1}\Psi)$. The operator
\[ H = -\frac{\hbar^2}{2m}\Delta + V(x), \tag{55} \]
called the Schrödinger operator, is interpreted as the energy operator of the particle under consideration. The first term is the kinetic energy and the second term is the potential energy. The duality relations that exist between the Schrödinger equation and diffusion theory have been studied and are considered, for example, in Nagasawa (1993). It is very revealing to study the basic solution of the free (i.e., $V = 0$) "particle." Using separation of variables $\Psi(x,y,t) = f(t)\Phi(x,y)$, and simple manipulation of the equation, we get
\[ i\hbar\frac{f_t}{f} = -\frac{\hbar^2}{2m}\frac{\Delta\Phi}{\Phi} = E. \]
Because this equation is valid for all $x$, $y$, and $t$, it is clear that $E$ is a constant. The basic solution is therefore $f = \exp(-i\frac{E}{\hbar}t)$ and $\Phi = \exp(i\frac{\sqrt{2m}}{\hbar}\,k\cdot x)$, where $k\cdot k = E$. This implies that the basic solution is a plane wave! We will encounter this "wavy behavior" in our complex flow. Another important complex PDE in the field of phase transitions in traveling wave systems is the complex Ginzburg–Landau (CGL) equation (Ginzburg and Landau, 1950)
\[ u_t = (1 + i\nu)u_{xx} + Ru - (1 + i\mu)|u|^2 u. \]
Note that although these flows have a structure of a diffusion process, because of the complex coefficient, they also retain wave propagation properties. In both cases of complex diffusion a nonlinearity is introduced by adding a potential term, whereas the kinetic energy remains linear. In this study we use the equation with zero potential (no external field) but with complex and nonlinear "kinetic energy." There are several examples of diffusion of complex-valued features in low-level vision (e.g., Barbaresco, 2000; Kimmel et al., 2000; Whitaker and Gerig, 1994). Whitaker and Gerig generate a collection of band-passed images by means of Gabor filtering with a specific set of frequencies. This
vector-valued feature space was then smoothed linearly and in an anisotropic way. It is important to note that only the coefficient of the drift term (the first derivatives) becomes complex. This is a basic difference from our approach because the qualitative behavior of a diffusion equation depends primarily on the coefficient (or tensor in the general case) of the diffusion term. It follows that the complex scale-space equation(s) that we present in this study are extremely different from the Whitaker and Gerig equations. A similar argument is relevant in reference to the approach presented by Kimmel et al. (2000). In their study the coefficients of the Gabor–Morlet wavelet transform are smoothed by the Beltrami flow. Although the values of these filters are complex, the diffusion tensor is real and the behavior of the Beltrami flow is different from the one described in this section. Another interesting work that studies the diffusion of complex-valued functions is the one presented by Barbaresco. This study is concerned, however, with complex curves using a variational technique.

C. Linear Complex Diffusion

1. Problem Definition

We consider the following initial value problem:
\[ I_t = cI_{xx}, \quad t > 0, \ x \in \mathbb{R}; \qquad I(x;0) = I_0 \in \mathbb{R}, \quad c, I \in \mathbb{C}. \tag{56} \]
This equation unifies the linear diffusion Eq. (1) for $c \in \mathbb{R}$ and the free Schrödinger equation (i.e., $c \in \mathbb{I}$ and $V(x) \equiv 0$). When $c \in \mathbb{R}$, there are two cases: for $c > 0$ the process constitutes a well-posed forward diffusion, whereas for $c < 0$ an ill-posed inverse diffusion process is obtained. In the general case the initial condition $I_0$ is complex. In this section we discuss the particular case of real initial conditions, where $I_0$ is the original image.

2. Fundamental Solution

We seek the complex fundamental solution $h(x;t)$ that satisfies the relation
\[ I(x;t) = I_0 * h(x;t), \tag{57} \]
where $*$ denotes convolution.
We write the complex diffusion coefficient as
\[ c \doteq re^{i\theta}. \tag{58} \]
Because a stable fundamental solution of the inverse diffusion process does not exist, we restrict the analysis to a positive real part of $c$, that is, $\theta \in (-\frac{\pi}{2}, \frac{\pi}{2})$.

Theorem 4 The fundamental solution of Eq. (56), $\theta \in (-\frac{\pi}{2}, \frac{\pi}{2})$, is
\[ h(x;t) = A\,g_\sigma(x;t)\,e^{i\alpha(x;t)}, \tag{59} \]
where
\[ g_\sigma(x;t) = \frac{1}{\sqrt{2\pi}\,\sigma(t)}e^{-x^2/2\sigma^2(t)}, \quad A = \frac{1}{\sqrt{\cos\theta}}, \quad \alpha(x;t) = \frac{x^2\sin\theta}{4tr} - \frac{\theta}{2}, \quad \sigma(t) = \sqrt{\frac{2tr}{\cos\theta}}. \tag{60} \]
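The closed form of Theorem 4 can be checked against the ordinary heat kernel evaluated at a complex diffusion coefficient. The sketch below assumes Eqs. (59) and (60) read $h = A g_\sigma e^{i\alpha}$ with $A = 1/\sqrt{\cos\theta}$, $\alpha = x^2\sin\theta/(4tr) - \theta/2$, and $\sigma = \sqrt{2tr/\cos\theta}$, and compares this with $\exp(-x^2/(4ct))/\sqrt{4\pi ct}$ for $c = re^{i\theta}$ (principal branch of the square root). The function names and the sampling grid are illustrative.

```python
import numpy as np

def h_closed_form(x, t, r, theta):
    """Fundamental solution as read from Eqs. (59)-(60)."""
    sigma = np.sqrt(2 * t * r / np.cos(theta))
    g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    A = 1 / np.sqrt(np.cos(theta))
    alpha = x**2 * np.sin(theta) / (4 * t * r) - theta / 2
    return A * g * np.exp(1j * alpha)

def h_heat_kernel(x, t, r, theta):
    """Heat kernel exp(-x^2/(4ct)) / sqrt(4 pi c t) with complex c = r exp(i theta)."""
    c = r * np.exp(1j * theta)
    return np.exp(-x**2 / (4 * c * t)) / np.sqrt(4 * np.pi * c * t)

x = np.linspace(-60.0, 60.0, 24001)
dx = x[1] - x[0]
checks = {}
for theta in (np.pi / 30, 14 * np.pi / 30):
    h = h_closed_form(x, 1.0, 1.0, theta)
    checks[theta] = (
        np.max(np.abs(h - h_heat_kernel(x, 1.0, 1.0, theta))),  # agreement of the two forms
        h.real.sum() * dx,                                      # integral of the real kernel
        h.imag.sum() * dx,                                      # integral of the imaginary kernel
    )
```

The two integrals computed at the end illustrate that the real part of the kernel carries unit mass while the imaginary part integrates to zero, for both a small and a large phase angle.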
See Figure 40 for a plot of Eq. (59).

3. Approximate Solution for Small Theta

We will now show a novel observation in which, as $\theta \to 0$, the imaginary part can be regarded as a smoothed second derivative of the initial signal, factored by $\theta$ and the time $t$. The solution generalizes to any dimension: in Cartesian coordinates $\mathbf{x} \doteq (x_1, x_2, \ldots, x_N) \in \mathbb{R}^N$, $I(\mathbf{x};t) \in \mathbb{C}^N$, noting that in this coordinate system $g_\sigma(\mathbf{x};t) \doteq \prod_i^N g_\sigma(x_i;t)$. The next theorem entails a primary result of complex diffusion:

Theorem 5 For any $t \ge 0$, the real part of the fundamental solution of Eq. [61] approaches a Gaussian and the imaginary part approaches a Laplacian of a Gaussian, scaled by time and the magnitude of $c$, as the phase angle $\theta$ approaches 0:
\[
\begin{aligned}
&(a)\quad \lim_{\theta\to 0}\mathrm{Re}(I) = g_\sigma * I_0; \\
&(b)\quad \lim_{\theta\to 0}\frac{\mathrm{Im}(I)}{\theta} = tr\,\Delta g_\sigma * I_0,
\end{aligned}
\tag{61}
\]
where $\mathrm{Re}(\cdot)$ is the real part and $\mathrm{Im}(\cdot)$ is the imaginary part. In Figure 41 the approximations of the real and imaginary kernels are visualized. We now show, more concisely, how this approximation can be reached using a Taylor expansion. Restricting the analysis, for convenience, to a unitary complex diffusion coefficient $c = e^{i\theta}$, utilizing the approximation $\cos\theta = 1 + O(\theta^2)$ and $\sin\theta = \theta + O(\theta^3)$ for small $\theta$, and introducing an operator $\tilde{H} = c\Delta$, Eq. (56) can be written (for higher-dimensional systems too) as: $I_t = \tilde{H}I$; $I|_{t=0} = I_0$. The solution $I = e^{t\tilde{H}}I_0$ is the equivalent of Equations (57) and (59). The above approximations yield:
\[ I(x;t) = e^{ct\Delta}I_0 = e^{e^{i\theta}t\Delta}I_0 \approx e^{(1+i\theta)t\Delta}I_0 = e^{t\Delta}e^{i\theta t\Delta}I_0 \approx e^{t\Delta}(1 + i\theta t\Delta)I_0 = (1 + i\theta t\Delta)\,\tilde{g}_\sigma * I_0. \]
Further insight into the behavior of the small theta approximation can be gained by separating the real and imaginary parts of the signal, $I = I_R + iI_I$, and diffusion coefficient, $c = c_R + ic_I$, into a set of two equations:
\[
\begin{aligned}
I_{Rt} &= c_R I_{Rxx} - c_I I_{Ixx}, & I_R|_{t=0} &= I_0 \\
I_{It} &= c_I I_{Rxx} + c_R I_{Ixx}, & I_I|_{t=0} &= 0,
\end{aligned}
\tag{62}
\]
FIGURE 40. Fundamental solution $h_\theta(x,t)$ as a function of $x$ and $\theta$ ($t = 1$). (a) Real part ($h_R$); (b) imaginary part normalized by $\theta$ ($h_I/\theta$).
where $c_R = \cos\theta$, $c_I = \sin\theta$. The relation $I_{Rxx} \gg \theta I_{Ixx}$ holds for small enough $\theta$, which allows us to omit the second term on the right-hand side of the first equation, to get the small $\theta$ approximation:
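FIGURE 41. (a) $h_R$ (solid line) and $g_\sigma$ (dashed line) as a function of $x$; (b) $h_I/\theta$ (solid line) and $t\frac{\partial^2}{\partial x^2}g_\sigma$ (dashed line) as a function of $x$; (c) the difference function $h_R - g_\sigma$; (d) the difference function $h_I/\theta - t\frac{\partial^2}{\partial x^2}g_\sigma$. $\theta = \frac{\pi}{10}$.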
\[ I_{Rt} \approx I_{Rxx}, \qquad I_{It} \approx I_{Ixx} + \theta I_{Rxx}. \tag{63} \]
In Eq. (63) $I_R$ is controlled by a linear forward-diffusion equation, whereas $I_I$ is affected by both the real and imaginary equations. We can regard the imaginary part as $I_{It} \approx \theta I_{Rxx}$ + ("a smoothing process"). Note that because the initial condition is real valued, the term $\theta I_{Rxx}$ is dominant and cannot be omitted even for very small $\theta$ (at $t = 0$ it is infinitely larger than $I_{Ixx}$ as $I_I|_{t=0} \equiv 0$).
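The small-theta behavior can be observed numerically. The following sketch assumes a periodic 1D grid with unit spacing and $r = 1$, solves Eq. (56) exactly in the Fourier domain with the multiplier $e^{-c\omega^2 t}$, and compares the real part with the Gaussian-smoothed signal and the imaginary part divided by $\theta$ with $t$ times the second derivative of the smoothed signal. The box-shaped test signal and all names are hypothetical.

```python
import numpy as np

def complex_diffusion(I0, t, theta, r=1.0):
    """Exact periodic solution of I_t = c I_xx with c = r exp(i theta)."""
    omega = 2 * np.pi * np.fft.fftfreq(I0.size)
    c = r * np.exp(1j * theta)
    return np.fft.ifft(np.exp(-c * omega**2 * t) * np.fft.fft(I0))

N = 512
I0 = np.zeros(N)
I0[192:320] = 1.0                      # a box: two step edges of opposite sign
t, theta = 5.0, np.pi / 1000

I = complex_diffusion(I0, t, theta)

# Reference signals for theta -> 0: Gaussian smoothing and its second derivative.
omega = 2 * np.pi * np.fft.fftfreq(N)
I0_hat = np.fft.fft(I0)
smoothed = np.fft.ifft(np.exp(-omega**2 * t) * I0_hat).real
smoothed_xx = np.fft.ifft(-omega**2 * np.exp(-omega**2 * t) * I0_hat).real

err_real = np.max(np.abs(I.real - smoothed))
err_imag = np.max(np.abs(I.imag / theta - t * smoothed_xx))
scale_imag = np.max(np.abs(t * smoothed_xx))
```

For this small phase angle both errors are orders of magnitude below the signal scale, so the imaginary part behaves as a time-scaled, smoothed second derivative, in line with Theorem 5.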
4. Analysis of the Fundamental Solution

We consider a few properties with reference to the fundamental solution and derive bounds on error under the small $\theta$ approximation. The approximation of the real part to a Gaussian (and of the imaginary part to its second derivative scaled by time) obtained for small $\theta$ is of the order $O(\theta^2)$. Here we limit the presentation to the summary of our results. All proofs and calculations can be seen in Gilboa (2004). The kernel can be separated into its real and imaginary parts. As the initial condition $I_0$ is real-valued in this study, the real part of $I(x;t)$ is affected only by the real kernel, and the imaginary part of $I(x;t)$ is affected only by the imaginary kernel:
\[ I(x;t) = I_R + iI_I = I_0 * h = I_0 * h_R + iI_0 * h_I, \tag{64} \]
where $h = h_R + ih_I$. (We get $I_R = I_0 * h_R$, $I_I = I_0 * h_I$.) The nature of the complex kernel does not change through the evolution; the kernel is basically rescaled according to the time $t$ (or to $\sigma$). Therefore we can analyze a few characteristics of the kernel as a function of $\sigma$ for different values of $\theta$. In the sequel we present some of the major characteristics of the real and imaginary kernels.
5. Properties of the Real Kernel $h_R$

1. Kernel formulation
\[ h_R(x;t) = A\,g_\sigma(x;t)\cos\alpha(x;t). \tag{65} \]
2. Maximal amplification

Theorem 6 For any $x$, $t \ge 0$, $\theta \in (-\frac{\pi}{2}, \frac{\pi}{2})$, $I_0 \in L^\infty$, there exists the following upper bound on the amplification of $I(x,t) = I_0 * h_R$ with respect to the initial condition $I_0(x)$:
\[ \frac{\max_{x,t}|I|}{\max_x|I_0|} \le A. \tag{66} \]
For small theta, we have $A = \cos^{-1/2}\theta = 1 + \frac{\theta^2}{4} + O(\theta^4)$.

3. Effectively positive kernels. One requirement of the linear scale-space is to avoid creation of new local extrema along the scale-space in 1D. Kernels obeying this requirement should be positive everywhere. In 1D this is equivalent to the requirement that the operator be causal (Lindeberg and ter Haar Romeny, 1994). As this kernel is not positive everywhere, we check how
close it is to a positive kernel. Let us define a positivity measure $-1 \le P_h \le 1$ of a kernel $h$ as follows:
\[ P_h \doteq \frac{\int_{-\infty}^{\infty}h(x)\,dx}{\int_{-\infty}^{\infty}|h(x)|\,dx}. \tag{67} \]
We regard a kernel $h$ as effectively positive with the measure $\epsilon$, if $P_h \ge 1 - \epsilon$. Considering $h_R$ we get the following bound:

Theorem 7 For small enough $\theta$ the positivity measure defined in Eq. (67) of the real kernel $h_R$ is bounded from below by
\[ P_{h_R} \ge \frac{1 - 8\Phi(-x_1)}{1 + 8\Phi(-x_1)}, \tag{68} \]
where
\[ \Phi(x) = \int_{-\infty}^{x}g_{\sigma=1}(s)\,ds \tag{69} \]
and
\[ x_1 = \sqrt{\left(\frac{2}{3}\pi + \theta\right)\cot\theta}. \tag{70} \]
This bound is actually valid for quite a large theta range: $\theta \in (-73^\circ, 73^\circ)$. Example (a): for $\theta = 1^\circ = \frac{\pi}{180}$ we get $x_1 \approx 11$, and $P_{h_R} \ge \frac{1 - 8\Phi(-11)}{1 + 8\Phi(-11)} = 1 - 2\cdot 10^{-27}$. Example (b): for $P_{h_R} > 0.99999$ ($\epsilon < 10^{-5}$) we require $\theta < 5^\circ$.

4. Small theta approximation. We define the distance between convolution kernels as the norm of their difference operator:
\[ d(h,g) \doteq \|T_{h-g}\|_\infty, \tag{71} \]
where the norm of a linear operator (using the $\infty$ norm) is
\[ \|T_h\|_\infty \doteq \sup_{\|f\|\ne 0}\frac{\|T_h f\|_\infty}{\|f\|_\infty} = \sup_{\|f\|\ne 0}\frac{\|h * f\|_\infty}{\|f\|_\infty}. \tag{72} \]
Lemma 1 The distance between the real kernel $h_R(x;t)$ and a Gaussian $g_\sigma(x;t)$ is
\[ d(h_R, g_\sigma) = O(\theta^2). \tag{73} \]
For small values of theta the distance is bounded by:
\[ d(h_R, g_\sigma) < 0.5\theta^2, \quad \forall\theta \in \left[0, \frac{\pi}{10}\right], \ \forall t \ge 0. \tag{74} \]
See Figures 42 and 43 for graphic representations.
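Both kernel measures can be evaluated numerically on a fine grid. The sketch below makes several assumptions that should be read as such: the real kernel is taken as $h_R = A g_\sigma\cos\alpha$ with the definitions of Eq. (60) and $r = 1$; the Gaussian in the distance is taken with the same $\sigma(t)$; the bound of Theorem 7 is read as $(1 - 8\Phi(-x_1))/(1 + 8\Phi(-x_1))$ with $x_1 = \sqrt{(2\pi/3 + \theta)\cot\theta}$; and the operator distance of Eq. (71) is computed as the $L^1$ norm of the kernel difference, which equals the $\infty$-operator norm of a convolution. All function names are illustrative.

```python
import math
import numpy as np

def kernels(x, t, theta, r=1.0):
    """Return (h_R, g_sigma) with sigma = sqrt(2 t r / cos(theta)), per Eqs. (60) and (65)."""
    sigma = np.sqrt(2 * t * r / np.cos(theta))
    g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    alpha = x**2 * np.sin(theta) / (4 * t * r) - theta / 2
    return g * np.cos(alpha) / np.sqrt(np.cos(theta)), g

def positivity(h):
    """Positivity measure of Eq. (67) on a uniform grid (the grid step cancels)."""
    return h.sum() / np.abs(h).sum()

def bound_theorem7(theta):
    """Lower bound of Theorem 7, under the reading stated in the lead-in."""
    x1 = math.sqrt((2 * math.pi / 3 + theta) / math.tan(theta))
    phi = 0.5 * math.erfc(x1 / math.sqrt(2))        # Phi(-x1) for a unit Gaussian
    return (1 - 8 * phi) / (1 + 8 * phi)

x = np.linspace(-40.0, 40.0, 16001)
dx = x[1] - x[0]

theta_big = np.pi / 3                               # 60 degrees, inside the stated range
h_big, _ = kernels(x, 1.0, theta_big)
P_big = positivity(h_big)

theta_small = np.pi / 30
h_small, g_small = kernels(x, 1.0, theta_small)
d_small = np.abs(h_small - g_small).sum() * dx      # L1 distance between h_R and g_sigma
mass_small = h_small.sum() * dx
```

Because the kernel is only rescaled in time, both measures are independent of $t$, so a single value of $t$ suffices for such a check.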
FIGURE 42. (a) $d_R$; (b) $d_R/\theta^2$ as a function of $\theta$ ($t = 1$).
FIGURE 43. $d_R$ as a function of $t$ ($\theta = \pi/1000$). (a) $t \in (0, 10)$; (b) $t \in (5, 500)$.
5. Definite integral
\[ \int_{-\infty}^{\infty}h_R(x;t)\,dx = 1. \tag{75} \]

6. Properties of the Imaginary Kernel $h_I$

1. Kernel formulation
\[ h_I(x;t) = A\,g_\sigma(x;t)\sin\alpha(x;t). \tag{76} \]
2. Maximal amplification

Theorem 8 For any $x$, $t \ge 0$, $\theta \in (-\frac{\pi}{2}, \frac{\pi}{2})$, $I_0 \in L^\infty$, there exists the following upper bound on the amplification of $I(x,t) = I_0 * h_I$ with respect to the initial condition $I_0(x)$:
\[ \frac{\max_{x,t}|I|}{\max_x|I_0|} \le A. \tag{77} \]
3. Small theta approximation

Lemma 2 The distance between the imaginary kernel, normalized by $\theta$, and a Gaussian's second derivative scaled by time, is
\[ d\left(h_I/\theta,\ t\frac{\partial^2}{\partial x^2}g_\sigma\right) = O(\theta^2). \tag{78} \]
For small values of theta the distance is bounded by:
\[ d\left(h_I/\theta,\ t\frac{\partial^2}{\partial x^2}g_\sigma\right) < 0.5\theta^2, \quad \forall\theta \in \left[0, \frac{\pi}{10}\right], \ \forall t \ge 0. \tag{79} \]
See Figures 44 and 45 for graphic representations.

4. Definite integral
\[ \int_{-\infty}^{\infty}h_I(x;t)\,dx = 0. \tag{80} \]
7. Examples

We present examples of 1D and 2D signal processing with linear complex diffusion, characterized by small and large values of $\theta$. Figures 46 and 47 depict the evolution of a unit step, processed by a complex diffusion of small and large $\theta$ ($\frac{\pi}{30}$, $\frac{14\pi}{30}$), respectively. The same $\theta$ values are used in the processing of the cameraman image (Figures 48 and 49, respectively). The
FIGURE 44. (a) $d_I$; (b) $d_I/\theta^2$ as a function of $\theta$ ($t = 1$).
FIGURE 45. $d_I$ as a function of $t$ ($\theta = \pi/1000$). (a) $t \in (0, 10)$; (b) $t \in (5, 500)$.
FIGURE 46. Complex diffusion with small theta, $\theta = \pi/30$, applied to a step signal. Left frame, real values; right frame, imaginary values. Each frame depicts from top to bottom: original step (a), diffused signal after times 0.025 (b), 0.25 (c), 2.5 (d), and 25 (e).
FIGURE 47. Complex diffusion with large theta, $\theta = 14\pi/30$, applied to a step signal. Left frame, real values; right frame, imaginary values. Each frame depicts from top to bottom: original step (a), diffused signal after times 0.025 (b), 0.25 (c), 2.5 (d), and 25 (e).
qualitative properties of the edge detection (smoothed second derivative) are clearly apparent in the imaginary part of the signals for the small $\theta$ value, whereas the real value depicts the properties of ordinary Gaussian scale-space. However, for large $\theta$ the imaginary part feeds back into the real part significantly, creating wavelike ringing. In addition, the signal overshoots and undershoots, exceeding the original maximum and minimum values and
FIGURE 48. Complex diffusion with small theta ($\theta = \pi/30$), applied to the cameraman image. Top images, real values; bottom images, imaginary values (factored by 20). Each frame (from left to right): original image (a), and the results obtained after processing time 0.25 (b), 2.5 (c), and 25 (d), respectively.
FIGURE 49. Complex diffusion with large theta ($\theta = 14\pi/30$), applied to the cameraman image. Top sequence of images, real values; bottom sequence, imaginary values (factored by 20). Each sequence depicts, from left to right, the original image (a) and the results of the processing after $t = 0.25$ (b), 2.5 (c), and 25 (d), respectively.
thereby violating the "maximum-minimum" principle. This property is suitable for sharpening purposes, similar to the Mach bands characteristic of vision (Ratliff, 1965).
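The behaviour described for the step signal can be reproduced with a minimal explicit finite-difference sketch. It is an illustration under stated assumptions (unit grid spacing, replicated boundaries, forward Euler time stepping, $r = 1$) and not the implementation used to produce the figures. The function names are hypothetical.

```python
import numpy as np

def laplacian_1d(u):
    """Second difference with replicated (Neumann) boundaries, grid spacing h = 1."""
    p = np.pad(u, 1, mode="edge")
    return p[2:] - 2.0 * u + p[:-2]

def complex_diffusion(signal, theta, t_end, dt, r=1.0):
    """Forward-Euler integration of I_t = r*exp(i*theta) * I_xx."""
    c = r * np.exp(1j * theta)
    u = signal.astype(complex)
    for _ in range(int(round(t_end / dt))):
        u = u + dt * c * laplacian_1d(u)
    return u

x = np.arange(-100, 101)
step = (x >= 0).astype(float)

theta = np.pi / 30
small = complex_diffusion(step, theta, t_end=2.5, dt=0.1)
reference = complex_diffusion(step, 0.0, t_end=2.5, dt=0.1).real   # ordinary diffusion
# dt must satisfy dt <= cos(theta)/2 for this explicit scheme, hence the smaller step
large = complex_diffusion(step, 14 * np.pi / 30, t_end=2.5, dt=0.05)

print("max |Re - real diffusion| :", np.max(np.abs(small.real - reference)))
print("max |Im/theta - t*I_xx|   :",
      np.max(np.abs(small.imag / theta - 2.5 * laplacian_1d(reference))))
print("large-theta real range    :", large.real.min(), large.real.max())
```

For the small angle the real part stays close to ordinary diffusion and the normalized imaginary part tracks the time-scaled second derivative. For the large angle the real part leaves the range of the original step.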
8. Generalization to Nonlinear Complex Diffusion

Nonlinear complex processes can be derived, based on the properties of the linear complex diffusion, to match the requirements of specific applications. We present two such nonlinear schemes developed for application in image denoising and enhancement.

D. Ramp-Preserving Denoising

Ramp functions can be used as a model of the basic structure of edges in images or their equivalent 1D functions. Step-type (singular) edges are a limiting case of ramp functions. Visual responses to ramp functions have been widely investigated both psychophysically and physiologically. In particular, they are known for the Mach bands associated with their perception (Ratliff, 1965). Ramp-type edges are generic fundamental components of images, and as such, are extracted in the raw primal sketch of images (Marr, 1982). It is therefore of special interest and importance to compare the action of a nonlinear complex-diffusion equation on a ramp function with those of real nonlinear diffusion and other previously proposed operators. We are looking for a general nonlinear diffusion equation
$$ I_t = \frac{\partial}{\partial x}\left(c(\cdot)\,I_x\right) \tag{81} $$
that preserves smoothed ramps. As was the case with real nonlinear diffusion processes, we search here, too, for a suitable differential operator $D$ for ramp edges. Eq. (81), with a diffusion coefficient $c(|DI|)$ that is a decreasing function of $|DI|$, can be regarded as a ramp-preserving process. Examining the gradient as a possible candidate leads to the conclusion that it is not a proper measure, for two reasons. The gradient does not detect the main features of the ramp, namely its endpoints. Moreover, it has a nearly uniform value across the whole smoothed ramp, causing a nonlinear gradient-dependent diffusion to slow the diffusion process in that region and therefore to be less effective in noise reduction within the ramp edge. The second derivative (Laplacian in more than one dimension) is a more suitable choice. It has a high magnitude near the endpoints and low magnitude elsewhere and thus enables the nonlinear diffusion process to reduce noise over the ramp. We formulate $c(s)$ as a decreasing function of $s$:
$$ c(s) = \frac{1}{1 + s^2}, \quad \text{where } c(s) = c(|I_{xx}|), \tag{82} $$
and apply it in Eq. (81) to yield:
$$ I_t = \frac{\partial}{\partial x}\left(\frac{I_x}{1 + I_{xx}^2}\right) = I_{xx}\,\frac{1 + I_{xx}^2 - 2 I_x I_{xxx}}{\left(1 + I_{xx}^2\right)^2}. \tag{83} $$
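The expansion in Eq. (83) follows from the quotient rule, and it can be sanity-checked numerically. The sketch below uses the test function $I(x) = \sin x$, an arbitrary choice for illustration, and compares a central difference of the flux $I_x/(1 + I_{xx}^2)$ with the closed-form right-hand side.

```python
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 2001)
h = x[1] - x[0]

# Analytic derivatives of the test function I(x) = sin(x)
Ix, Ixx, Ixxx = np.cos(x), -np.sin(x), -np.cos(x)

flux = Ix / (1.0 + Ixx**2)                       # c(|I_xx|) * I_x with c(s) = 1/(1+s^2)
lhs = (flux[2:] - flux[:-2]) / (2.0 * h)         # central difference of the flux
rhs = (Ixx * (1.0 + Ixx**2 - 2.0 * Ix * Ixxx) / (1.0 + Ixx**2) ** 2)[1:-1]

max_err = np.max(np.abs(lhs - rhs))
print("max discrepancy between both sides of Eq. (83):", max_err)
```

The discrepancy shrinks with the square of the grid spacing, as expected for a central difference.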
Two main problems are associated with this scheme. The first and more severe problem is the fact that noise has very large (theoretically unbounded) second derivatives. Second, a numerical problem arises when third-order derivatives are computed, with large numerical support and noisier derivative estimations. Both problems are addressed by a nonlinear complex diffusion. Following the results of the linear complex diffusion [Eq. (61)], we use the imaginary value of the signal (divided by $\theta$) to control the diffusion process. Whereas for small $t$ this term vanishes, allowing stronger diffusion to reduce the noise, with time its influence increases and preserves the ramp features of the signal. The equation for the multidimensional process is
$$ I_t = \nabla \cdot \left(c(\mathrm{Im}(I))\,\nabla I\right), \qquad c(\mathrm{Im}(I)) = \frac{e^{i\theta}}{1 + \left(\frac{\mathrm{Im}(I)}{k\theta}\right)^2}, \tag{84} $$
where $k$ is a threshold parameter. For the same reasons discussed in the linear case, here, too, the phase angle $\theta$ should be small ($\theta \ll 1$). Because the imaginary part is normalized by $\theta$, the process is almost unaffected by changes in the value of $\theta$, as long as it stays small ($\theta < 5^\circ$). The discrete implementation details are in Gilboa (2004). Figures 50 and 51 compare denoising of a 1D ramp signal by a P-M process [Eq. (2)] with the performance of the above process [Eq. (84)]. This example illustrates that the staircasing effect, characteristic of the P-M process, does not occur in processing by our nonlinear complex scheme. In Figure 52 a 2D box with varying illumination was processed. The example demonstrates how denoising by our process can perform better in changing illumination conditions. We also show how the imaginary value can be of use for segmentation in such circumstances. With regard to the P-M and similar gradient-controlled processes, it is demonstrated that overcoming the staircasing effects by increasing the threshold $k_{PM}$ (thus causing the gradient magnitude of the illumination to be in the convex regime of the process) comes at a cost of severely degrading the edges. In Figures 53 and 54 two face images were processed (where there are typically shadings and illumination changes). To the first image (part of the Barbara image) additive white Gaussian noise was added synthetically (SNR = 20 dB), whereas the Mona
FIGURE 50. Perona–Malik nonlinear diffusion process applied to a ramp-type soft edge ($k_{PM} = 0.1$). Left panels, original (top) and noisy ramp signal (white Gaussian, SNR = 15 dB). Middle panels, denoised signal at times 0.25, 1, 2.5, from top to bottom, respectively. Right panels, respective values of the $c$ coefficient.
FIGURE 51. Nonlinear complex diffusion process applied to a ramp-type soft edge ($\theta = \pi/30$, $k = 0.07$). Left panels, real values of denoised signal at times 0.25, 1, 2.5, from top to bottom, respectively. Middle panels, respective imaginary values. Right panels, respective real values of $c$.
Lisa image is a low-quality JPEG image with visible artifacts. The ramp-preserving process smoothes gradual changes well, yet preserves edges (although not as strongly as the P-M). The staircasing effect of the P-M process, which can create false edges, is apparent. Note also the JPEG artifacts near the eyes (see Figure 54) that were removed by our process.

E. Regularized Shock Filters

1. Previous Related Studies

The original motivation and formulation of the shock filter of Osher and Rudin (1990) are explained in detail in Section I.F.1. We return here to the issue of how to improve its performance under noisy conditions. The noise sensitivity problem of the original formulation is critical and, unless properly
FIGURE 52. Filtering of a synthetic test image of ramp varying illumination. (a) (from left) Original image; with noise (SNR = 5 dB) and linearly varying illumination ($I = I + 4x + 2y$); ramp-preserving denoising, real part ($k = 2$). (b) (from left) Ramp-preserving denoising, imaginary part; Perona–Malik (P-M) with lower threshold ($k_{PM} = 3$); P-M with higher threshold ($k_{PM} = 5$). (c) (from left) Thresholded gradient of imaginary part, ramp-preserving; thresholded gradient of P-M (low threshold). (d) (from left) Level-sets of imaginary part and P-M.
FIGURE 53. Filtering of a face (Barbara image). (a) (from left) Noisy image (SNR = 20 dB); result of filtering with a linear diffusion process; filtering with Perona–Malik (P-M) diffusion process. (b) (from left) Ramp-preserving denoising, real part; imaginary part. (c) Enlargement of forehead (from left): original; P-M; ramp-preserving real part.
solved, will continue to hinder most practical applications of shock filters. Previous studies addressing this issue came up with several plausible solutions. The common approach to increase robustness (Alvarez and Mazorra, 1994; Coulon and Arridge, 2000; Kornprobst et al., 1997; Rougon and Preteux, 1995) is to convolve the signal's second derivative with a lowpass filter, such as a Gaussian:
$$ I_t = -\mathrm{sign}(G_\sigma * I_{xx})\,|I_x|, \tag{85} $$
where $G_\sigma$ is a Gaussian of standard deviation $\sigma$.
FIGURE 54. Filtering a low-quality JPEG of the Mona Lisa image. (a) (from left) Original image; result of filtering with a linear diffusion process; filtering with Perona–Malik (P-M) diffusion process. (b) Ramp-preserving denoising: real part; imaginary part. (c) Enlargement of the eyes: original; P-M; ramp-preserving real part.
This is generally not sufficient to overcome the noise problem: in many cases, convolving the signal with a Gaussian of moderate width does not cancel the inflection points produced by the noise. Their magnitude becomes considerably lower, but there is still a change of sign at these points, which induces flow in opposite directions on each side of the inflection point. For very wide (large-scale) Gaussians, most inflection points produced by the noise are diminished, but at a cost: the location of the signal's inflection points becomes less accurate. Moreover, the effective Gaussian width $\sigma$ often exceeds the signal's extent, thus causing the boundary conditions imposed on the process to strongly affect the solution. Finally, from a computational viewpoint, the convolution process in each iteration is costly.
A more complex approach is to address the issue as an enhancing-denoising problem: smoother parts are denoised, whereas edges are enhanced and sharpened. The main idea is to add some sort of anisotropic diffusion term with an adaptive weight of the shock and diffusion processes. Alvarez and Mazorra (1994) were the first to couple shock and diffusion, proposing Eq. (22). This equation, however, degenerates to Eq. (85) in the 1D case, and the contribution of the diffusion to the combined process is lost. A more advanced scheme was proposed by Kornprobst et al. (1997):
$$ I_t = \alpha_r\left(h_\tau I_{\eta\eta} + I_{\xi\xi}\right) - \alpha_e (1 - h_\tau)\,\mathrm{sign}(G_\sigma * I_{\eta\eta})\,|\nabla I|, \tag{86} $$
where $h_\tau = h_\tau(|G_{\tilde\sigma} * \nabla I|) = 1$ if $|G_{\tilde\sigma} * \nabla I| < \tau$, and 0 otherwise. The original scheme of Kornprobst et al. includes another fidelity term, $\alpha_f (I - I_0)$, that is omitted here (since such a term can be added to any scheme). Another modern scheme was proposed by Coulon and Arridge (2000),
$$ I_t = \mathrm{div}(c\,\nabla I) - (1 - c)^{\alpha}\,\mathrm{sign}(G_\sigma * I_{\eta\eta})\,|\nabla I|, \tag{87} $$
where $c = \exp\left(-\frac{|G_{\tilde\sigma} * \nabla I|^2}{k}\right)$, which was originally used for classification, based on a probabilistic framework. Eq. (87) is the adaptation of the original process for the task of image processing. The performance of the last two schemes will be later compared with that of the process proposed in the present study.

2. Coupling Shock and Diffusion

The following section analyzes two discrete schemes involving shock filter and diffusion. We provide a few theorems regarding the behavior of these schemes. For simplicity, our analysis is done in one dimension.

3. Shock and Linear Diffusion

We start by adding a linear diffusion term to the shock filter equation:
$$ I_t = -\mathrm{sign}(I_{xx})\,|I_x| + \lambda I_{xx}, \tag{88} $$
where $0 < \lambda \in \mathbb{R}$ is a constant weight parameter. The discrete scheme of Eq. (88) is
$$ I_i^{n+1} = I_i^n + \Delta t\left(-\mathrm{sign}(D^2 I_i^n)\,|DI_i^n| + \lambda D^2 I_i^n\right), \tag{89} $$
where
$$ DI_i^n := m(\Delta_+ I_i^n, \Delta_- I_i^n)/h, \qquad D^2 I_i^n := (\Delta_+ \Delta_- I_i^n)/h^2, \tag{90} $$
$m(x, y)$ is the minmod function:
$$ m(x, y) := \begin{cases} (\mathrm{sign}\,x)\,\min(|x|, |y|) & \text{if } xy > 0, \\ 0 & \text{otherwise,} \end{cases} $$
and $\Delta_\pm := \pm(u_{i\pm 1} - u_i)$. The CFL condition is $\lambda \Delta t \le 0.5 h^2$ ($h \equiv 1$). Most rigorous analysis and proofs regarding the properties of the original shock filter, described in Section I.F.1, were based on the discrete scheme. We will follow this line in our analysis.

Theorem 9 The scheme of Eq. (89) obeys the strong minimum-maximum principle (no new local extrema are created and the global maximum and minimum at any time are bounded by those of the initial condition) and reaches a trivial constant steady-state solution $\lim_{n \to \infty} I^n(x) = \text{constant}$ for any $0 < \lambda \in \mathbb{R}$.

This process is a mix between denoising and enhancement processes, where for low $\lambda$ it behaves more like an enhancing shock filter and for large $\lambda$ denoising is more dominant (with some edge preservation). Some characteristics of the shock filter are lost: real shocks are actually not created; the scheme is not total-variation preserving; the signal diminishes with time, and the steady-state solution is a constant function.

4. The Magnitude of the Second Derivative

To account for the magnitude of the second derivative controlling the flow, we return to the original shock filter formulation of Eq. (19) and use $F(s) = \frac{2}{\pi}\arctan(as)$, where $a$ is a parameter that controls the sharpness of the slope near zero. With this modification of $F(s)$, Eq. (19) becomes:
$$ I_t = -\frac{2}{\pi}\arctan(a I_{xx})\,|I_x| + \lambda I_{xx}. \tag{91} $$
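The discrete scheme of Eqs. (89) and (90) is short enough to sketch directly. The code below is a minimal illustration with unit grid spacing and replicated boundaries. The test signal, the noise level, and the values of `dt` and `lam` are assumptions chosen for the example. The time step is kept small enough that both the diffusion term and the shock term are stable, which is stricter than the CFL condition quoted in the text.

```python
import numpy as np

def minmod(a, b):
    """m(a, b) = sign(a) * min(|a|, |b|) if a*b > 0, else 0."""
    return np.where(a * b > 0, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def shock_diffusion_step(u, dt, lam, h=1.0):
    """One explicit update of Eq. (89) with replicated (Neumann) boundaries."""
    p = np.pad(u, 1, mode="edge")
    d_plus = p[2:] - u                     # Delta_+ u_i = u_{i+1} - u_i
    d_minus = u - p[:-2]                   # Delta_- u_i = u_i - u_{i-1}
    du = minmod(d_plus, d_minus) / h       # limited first derivative, Eq. (90)
    d2u = (d_plus - d_minus) / h**2        # second difference, Eq. (90)
    return u + dt * (-np.sign(d2u) * np.abs(du) + lam * d2u)

rng = np.random.default_rng(0)
x = np.arange(-50, 51)
blurred_step = 0.5 * (1.0 + np.tanh(x / 4.0))          # smooth edge (illustrative)
noisy = blurred_step + 0.05 * rng.standard_normal(x.size)

u = noisy.copy()
for _ in range(200):
    u = shock_diffusion_step(u, dt=0.1, lam=0.5)

print("initial range:", noisy.min(), noisy.max())
print("final range:  ", u.min(), u.max())
```

In line with the minimum-maximum principle of Theorem 9, the final range stays inside the initial one.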
Consequently, the inflection points are not of equal weight any longer; regions near edges, with large magnitude of the second derivative near the zero-crossing, are sharpened much faster than relatively smooth regions. This type of process is implemented in the sequel in a new formulation of a complex PDE.

F. Complex Shock Filters

From Equations (91) and (61) we derive the complex shock filter formulation for small $\theta$:
$$ I_t = -\frac{2}{\pi}\arctan\left(a\,\mathrm{Im}\left(\frac{I}{\theta}\right)\right)|I_x| + \lambda I_{xx}, \tag{92} $$
where $\lambda = re^{i\theta}$ is a complex scalar.
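A rough 1D sketch of Eq. (92) is given below. It is not the authors' discretization, which is documented in Gilboa (2004). Several choices here are assumptions made only for illustration: the gradient magnitude is estimated from the real part with a minmod limiter, the time stepping is forward Euler, and the parameter values (`r`, `theta`, `a`, `dt`, number of iterations) are arbitrary.

```python
import numpy as np

def minmod(a, b):
    return np.where(a * b > 0, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def complex_shock_filter(signal, n_iter, dt=0.1, r=0.1, theta=np.pi / 1000, a=10.0):
    """Explicit iteration of I_t = -(2/pi) arctan(a Im(I)/theta) |I_x| + lam I_xx."""
    lam = r * np.exp(1j * theta)
    u = signal.astype(complex)
    for _ in range(n_iter):
        p = np.pad(u, 1, mode="edge")                  # replicated boundaries
        d_plus, d_minus = p[2:] - u, u - p[:-2]
        # Assumption: |I_x| is taken from the real part, limited by minmod.
        grad = np.abs(minmod(d_plus.real, d_minus.real))
        control = (2.0 / np.pi) * np.arctan(a * u.imag / theta)
        u = u + dt * (-control * grad + lam * (d_plus - d_minus))
    return u

x = np.arange(-50, 51)
kernel = np.exp(-x**2 / (2.0 * 3.0**2))
blurred = np.cumsum(kernel) / kernel.sum()             # unit step blurred with sigma_b = 3

result = complex_shock_filter(blurred, n_iter=300)
slope_before = np.max(np.abs(np.diff(blurred)))
slope_after = np.max(np.abs(np.diff(result.real)))
print("largest grid-point difference before/after:", slope_before, slope_after)
```

No convolution appears in the loop: the smoothed second-derivative estimate that steers the shock term comes from the imaginary part of the evolving signal.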
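Generalization of the complex shock filter to 2D yields:
$$ I_t = -\frac{2}{\pi}\arctan\left(a\,\mathrm{Im}\left(\frac{I}{\theta}\right)\right)|\nabla I| + \lambda I_{\eta\eta} + \tilde\lambda I_{\xi\xi}, \tag{93} $$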
where $\tilde\lambda$ is a real scalar. The complex filter provides an elegant way to avoid the need for convolving the signal in each iteration and still get smoothed estimations. The inherent time dependency contributes to the robustness of the process. Moreover, the imaginary value receives feedback: it is smoothed by the diffusion and enhanced at sharp transitions by the shock, and thus can better control the process than a simple second derivative.

The performance of our complex-valued shock filter [Eq. (92)] is compared with the most advanced real-valued robust shock filters, described earlier, of Kornprobst et al. (Eq. 86) and of Coulon and Arridge (Eq. 87). All three filters are designed to perform in a noisy environment, to produce shocks of important edges while simultaneously denoising fluctuations (of noise and/or texture). To obtain objective quantitative measures for evaluating these filters, we conducted a representative experiment of processing a blurred and noisy step edge. In the experiment 100 blurred and noisy step edges (white Gaussian noise, SNR = 5 dB) were processed by each filter. The summary of the results is shown in Table 1. The discrete signal $I$ comprises $N$ grid points ($I_i$, $i = 1, 2, \ldots, N$). In this context the gradient is a simple grid point difference $DI_i = I_{i+1} - I_i$, where the largest gradient was considered as the place of the shock.

TABLE 1. Experiment results comparing three robust shock filters processing a blurred noisy step (SNR = 5 dB)

| Process | Slope | Slope variance | Shock success | Stability in time | Shock dislocation |
|---|---|---|---|---|---|
| Ideal | 1 | 0 | 100% | 1 | 0 |
| Kornprobst et al. | 0.57 | 0.031 | 65% | 0.73 | 2.6 |
| Coulon–Arridge | 0.76 | 0.192 | 72% | 0.82 | 3.9 |
| Ours | 0.78 | 0.006 | 99% | 0.99 | 1.7 |
| Ours (0 dB) | 0.62 | 0.024 | 81% | 0.99 | 2.4 |

| Process | Location variance | Location success | Location bias | SNR |
|---|---|---|---|---|
| Ideal | 0 | 100% | 0 | $\infty$ |
| Kornprobst et al. | 14.3 | 93% | 0.5 | 8.7 |
| Coulon–Arridge | 86.7 | 94% | 2.1 | 7.6 |
| Ours | 4.7 | 99% | 0.3 | 10.7 |
| Ours (0 dB) | 8.7 | 92% | 0.6 | 8.8 |

We now explain each column of the table. "Slope" is the slope of the largest gradient, $s(I) = \max_i |DI_i|$. "Slope variance" is the variance of $s(I)$ over 100 trials. "Shock success" indicates a successful shock creation if the shock's slope was at least half of the original magnitude, $s(I) \ge 0.5$. "Stability in time" indicates how sensitive the result is to the stopping time. We computed the relative shock's slope after 10% more time: $s(I(1.1T))/s(I(T))$. "Shock dislocation" is the average distance of the produced shock from the original shock location in terms of grid points, $E[|i_s - i_{orig}|]$, where $i_s = \arg\max_i |DI_i|$ and $i_{orig}$ is the original shock point. "Location variance" is $\mathrm{Var}[i_s]$. "Location success" indicates a success in terms of location accuracy if the distance of the formed shock was no more than 5 grid points from the original location, $|i_s - i_{orig}| \le 5$. "Location bias" is represented by $E[i_s - i_{orig}]$ (negative values mean bias toward the center). The expected value of the shock location of an unbiased process is at the original location. "SNR" is the average SNR of $I(T)$ with respect to the original unit step.

In this evaluation, for each process the parameters were first tuned to give good results and were kept constant in the experiment itself. The stopping time $T$ was chosen automatically to produce a nonoscillatory signal with a sharp and clear shock. For the experiment to be reproducible, all the parameters and exact criteria are listed in Gilboa (2004). Some examples of processed outputs are shown in Figure 55. This experiment gives quantitative indications of the advantages of the complex shock filter with regard to the aforementioned criteria. The considerably lower variance in the results (sharpness and location) accounts for the process reliability.
The stability of the shock over evolution time indicates that a proper stopping time can be selected also in the enhancement of more compound signals with several blurred steps of different sizes and locations. Also, from our experience, it is far less sensitive to parameter tuning. Trying our process with noisier inputs of 0 dB SNR gives comparable results to the other processes at 5 dB SNR. In Figure 56 a blurred and noisy image was processed. In the case of 2D signals, only the scheme of Kornprobst et al. (1997) and our complex scheme produce acceptable results at this level of noise (SNR = 15 dB). However, processing with the complex process results in sharper edges and is closer to the shock process, as can be observed in a comparison to an ideal shock response to a blurred image without noise (top right image of Figure 56). The combined enhancement-denoising properties of the complex scheme are highlighted by the display of one horizontal line of the image (bottom right image of Figure 56).
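The per-trial quantities behind the columns of Table 1 can be written compactly. The sketch below is an illustration only. The function name `shock_metrics` and the toy signals are hypothetical, and the SNR formula (ratio of signal variance to error variance, in dB) is an assumption, since the exact definition is given in Gilboa (2004) rather than here. Averages and variances over many trials would then give the table entries.

```python
import numpy as np

def shock_metrics(result, result_later, original, i_orig):
    """Per-trial measures for one filtered signal I(T) and a later state I(1.1T)."""
    diffs = np.abs(np.diff(result))              # |DI_i| = |I_{i+1} - I_i|
    i_s = int(np.argmax(diffs))                  # location of the largest gradient
    slope = float(diffs[i_s])
    slope_later = float(np.max(np.abs(np.diff(result_later))))
    error = result - original
    # Assumption: SNR = 10*log10(var(original) / var(error)), in dB.
    snr_db = 10.0 * np.log10(np.var(original) / np.var(error)) if np.any(error) else np.inf
    return {
        "slope": slope,
        "shock_success": slope >= 0.5,
        "stability_in_time": slope_later / slope,
        "dislocation": abs(i_s - i_orig),
        "location_success": abs(i_s - i_orig) <= 5,
        "signed_offset": i_s - i_orig,           # averaged over trials: location bias
        "snr_db": snr_db,
    }

idx = np.arange(20)
original = (idx >= 10).astype(float)             # unit step, jump at difference index 9
result = np.where(idx >= 12, 1.0, 0.0)
result[11] = 0.2                                 # toy output: shock displaced by 2 points
result_later = result.copy()
result_later[11] = 0.25                          # toy state after 10% more time

metrics = shock_metrics(result, result_later, original, i_orig=9)
print(metrics)
```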
G. Discussion

Generalization of the linear and nonlinear scale-spaces to the complex domain, by combining the diffusion and the free Schrödinger equations, further enhances the theoretical framework of the diffusion-type PDE approach to image processing. The following advantages are afforded by the complexification of the diffusion equation according to the approach introduced in the present study.

1. The fundamental solution of the linear complex diffusion indicates that there exists a stable process over the wide range of the angular orientation of the complex diffusion coefficient, $\theta \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, that restricts the real value of the coefficient to be positive. (Issues related to aspects of inverse diffusion in image processing, that is, negative real-valued diffusion coefficient, are dealt with elsewhere [Gilboa et al., 2002a].)

2. In the case of small $\theta$, two observations concerning the properties of the real and imaginary components of the complex diffusion process are relevant with regard to the application of this process in image processing. The real function is effectively decoupled from the imaginary one and behaves like a real linear diffusion process, whereas the imaginary part approximates a smoothed second derivative of the real part and can therefore well serve as an edge detector. In other words, the single complex diffusion process generates simultaneously an approximation of both the Gaussian and Laplacian pyramids (at a discrete set of temporal sampling points), that is, the scale-space (Burt and Adelson, 1983).

3. It paves the way to a more complete scale-space analysis. Further, the complex field is complete and brings along with it powerful tools for dealing with critical values.

4. In the linear case the imaginary part is a bounded operator (and hence well-posed). Therefore small perturbations in the data cannot cause divergence of the results.
This is unlike first- or second-order derivatives, which are ill-posed operators and are generally used for edge detection (preconvolving the signal with a Gaussian still produces unstable results as $\sigma \to 0$,
FIGURE 55. Shock filters comparison experiment. From left to right. (a) Original step; blurred step (Gaussian blur $\sigma_b = 3$). (b) Example of noisy signal with 5 dB SNR; example of noisy signal with 0 dB SNR; example of one result from Alvarez–Mazorra process (Eq. 22). In the last four rows, some examples of processed signals from the experiment (5 dB SNR) are shown. The result (solid line) is superimposed on the ideal response (dashed line). (c) Kornprobst et al. (Eq. 86); (d) Coulon–Arridge (Eq. 87); (e) our scheme (Eq. 92); (f) our scheme processing noisier signals of 0 dB SNR.
FIGURE 56. From left. (a) Original tools image; Gaussian blurred ($\sigma = 2$) with added white Gaussian noise (SNR = 15 dB); ideal shock response (of blurred image without the noise). (b) Evolutions of Eq. (22): Alvarez–Mazorra ($\sigma = 10$); Eq. (86) of Kornprobst et al. ($\alpha_r = 0.2$, $\alpha_e = 0.1$, $\tau = 0.2$, $\sigma = 10$, $\tilde\sigma = 1$); Eq. (87) of Coulon–Arridge ($k = 5$, $\alpha = 1$, $\sigma = 10$, $\tilde\sigma = 1$). (c) Evolution of Eq. (92): complex process, from left: real values; imaginary values ($|\lambda| = 0.1$, $\tilde\lambda = 0.5$, $a = 0.5$); gray-level values generated along a horizontal line in the course of complex evolution of the process (thin line, 1 iteration; bold line, 100 iterations). All of the image evolution results are presented for 100 iterations ($dt = 0.1$).
scaling by time is imperative). One may therefore conclude that the imaginary part can serve better, as it is a "well-posed edge detector" for any $t \ge 0$. Its stability is inherent and does not depend on discretization effects or on the numerical schemes used in the computations.

5. In many cases it is advantageous to switch on the nonlinearity in an adiabatic way, such that over short time (small scale) the flow is mostly smoothing, and as time progresses the interaction of the smoothing with the image's features becomes more important and dominates the flow at large times. Explicit time dependency of the P-M coefficient and its benefits were demonstrated by Gilboa et al. (2001c). In the complex framework presented here, time dependency of the anisotropic case is inherent.

6. Complex diffusion enables better performance in different nonlinear tasks, such as ramp denoising and regularization of shock filters.

Although nonlinear schemes remain to be further analyzed and better understood, nonlinear complex diffusion-type processes can be derived from the properties of the complex linear diffusion and applied in image processing and enhancement. Such are the two schemes developed for denoising of ramp edges and for regularization of shock filters. In the first scheme a nonlinear complex diffusion process controlled by the signal's imaginary value avoids the staircasing effect that is characteristic of gradient-controlled nonlinear processes such as the P-M process (Perona and Malik, 1990) (see Figures 50 through 54). The second proposed scheme presents a complex shock filter that overcomes problems inherent in the enhancement of noisy signals and images by the shock filters (Osher and Rudin, 1990) and outperforms its variants (Alvarez and Mazorra, 1994; Coulon and Arridge, 2000; Kornprobst et al., 1997; Rougon and Preteux, 1995).

V. TEXTURE-PRESERVING DENOISING

A. Introduction

A classical variational denoising algorithm is the total variation (TV) minimizing process (Rudin et al., 1992). This algorithm seeks an equilibrium state (minimal energy) of an energy functional composed of the TV norm of the image $I$ and the fidelity of this image to the noisy input image $I_0$:
$$ E_{TV} = \int_\Omega \left(|\nabla I| + \frac{1}{2}\lambda (I - I_0)^2\right) dx\,dy. \tag{94} $$
This is further generalized by the $\Phi$ formulation (Blanc-Féraud et al., 1995; Deriche and Faugeras, 1996) with the functional
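$$ E_\Phi = \int_\Omega \left(\Phi(|\nabla I|) + \frac{1}{2}\lambda (I - I_0)^2\right) dx\,dy. \tag{95} $$
The E-L equation is
$$ \mathcal{F} \equiv \mathrm{div}\left(\Phi' \frac{\nabla I}{|\nabla I|}\right) + \lambda (I_0 - I) = 0, \tag{96} $$
where $\lambda \in \mathbb{R}$ is a scalar controlling the fidelity of the solution to the input image (inversely proportional to the measure of denoising). Neumann boundary conditions are assumed. The solution is usually found by a steepest descent method:
$$ I_t = \mathcal{F}, \qquad I|_{t=0} = I_0. \tag{97} $$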
When the noise is approximated by an additive white Gaussian process of standard deviation $\sigma$, the problem can be formulated as finding
$$ \min_I \int_\Omega \Phi(|\nabla I|)\,dx\,dy \quad \text{subject to} \quad \frac{1}{|\Omega|}\int_\Omega (I - I_0)^2\,dx\,dy = \sigma^2. \tag{98} $$
In this formulation, $\lambda$ can be considered as a Lagrange multiplier, computed by:
$$ \lambda = \frac{1}{\sigma^2 |\Omega|}\int_\Omega \mathrm{div}\left(\Phi' \frac{\nabla I}{|\nabla I|}\right)(I - I_0)\,dx\,dy. \tag{99} $$
The actual function with which we work in this section is $\Phi(s) = \sqrt{1 + \beta^2 s^2}$. The process that results from this function is more stable than the TV. We choose it as a representative of variational denoising processes.

Although this and other PDE-based methods have shown impressive results, the limitations of such processes have recently become of great concern (Chan and Shen, 2002; Meyer, 2001; Vese and Osher, 2002). The implicit assumption that underlies the formulation of these flows/equations is the approximation of images by piecewise constant functions (in the BV space). In some sense they produce an approximation of the input image as the so-called cartoon model and thus naturally dispose of the oscillatory noise while preserving edges (in some cases even enhancing them, e.g., Perona and Malik, 1990). A good cartoon model captures much of the important information in an image. Yet it has several obvious drawbacks: textures are excluded, significant small details may be left out, and even large-scale fine features, which are not characterized by dominant edges, are often disregarded.
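The steepest descent of Eq. (97) with $\Phi(s) = \sqrt{1 + \beta^2 s^2}$ can be sketched in one dimension as follows. This is a minimal illustration with a fixed scalar $\lambda$. The discretization (forward differences, zero flux at both ends, explicit time stepping) and all parameter values are assumptions made for the example. In 1D the term $\mathrm{div}(\Phi' \nabla I/|\nabla I|)$ reduces to the derivative of $\beta^2 I_x/\sqrt{1 + \beta^2 I_x^2}$.

```python
import numpy as np

def phi_energy(u, u0, lam, beta):
    """Discrete version of Eq. (95): sum of Phi(|I_x|) plus the fidelity term."""
    d = np.diff(u)
    return np.sum(np.sqrt(1.0 + beta**2 * d**2)) + 0.5 * lam * np.sum((u - u0) ** 2)

def phi_denoise(u0, lam=0.1, beta=1.0, dt=0.2, n_iter=500):
    """Explicit steepest descent of the discrete energy, started from the input."""
    u = u0.copy()
    energies = [phi_energy(u, u0, lam, beta)]
    for _ in range(n_iter):
        d = np.diff(u)
        flux = beta**2 * d / np.sqrt(1.0 + beta**2 * d**2)     # Phi'(|I_x|) I_x / |I_x|
        div = np.diff(np.concatenate(([0.0], flux, [0.0])))    # zero flux at both ends
        u = u + dt * (div + lam * (u0 - u))
        energies.append(phi_energy(u, u0, lam, beta))
    return u, np.array(energies)

rng = np.random.default_rng(0)
clean = np.repeat([0.0, 1.0], 100)
noisy = clean + 0.1 * rng.standard_normal(clean.size)
denoised, energies = phi_denoise(noisy)
print("energy: start", energies[0], "end", energies[-1])
print("mean preserved:", noisy.mean(), denoised.mean())
```

With zero-flux boundaries and the descent started at $I_0$, the mean of the signal is preserved at every iteration, which is the discrete counterpart of integrating Eq. (96) over the domain.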
The purpose of this section is to show that a relatively simple modification of the previous equation yields a denoising algorithm that better preserves the structural (texture) information of the image, as shown in Gilboa et al. (2003a,b).

B. The Cartoon Pyramid Model

The cartoon model was defined and investigated in the early 1980s (Blake and Zisserman, 1987; Geman and Geman, 1984), was further elaborated by Mumford (1994), and is widely used as the basic underlying model for many image denoising methods. In the continuous case the cartoon has a curve $\Gamma$ of discontinuities, but everywhere else it is assumed to have a small or a null gradient $|\nabla I|$. The TV and other nonlinear diffusion processes are especially good at extracting the cartoon part of the image. Therefore we use them as a simple pyramid (scale-space) of rough image sketches at different scales. Let us define a cartoon of scale $s$, using the $\Phi$ process, as follows:
$$ C_s := I_\Phi\big|_{\lambda = 1/s}, \tag{100} $$
where $I_\Phi$ is the steady state of Eq. (97). Let us define the residue as the difference between two scales' cartoons:
$$ R_{n,m} := C_n - C_m \quad (n < m). \tag{101} $$
We shall refer to the Non-Cartoon (NC) part of scale $s$ as the residue from level zero:
$$ NC_s := R_{0,s} = C_0 - C_s. \tag{102} $$
This cartoon and residue data structure is analogous to the pyramid of wavelet approximations. By using the definitions of Equations (100) and (101) and integrating the E-L equation [Eq. (96)], we deduce several basic properties, listed below.

Theorem 10 The cartoon pyramid model has the following properties:

1. The cartoon of scale 0 is the input image: $C_0 = I_0$.
2. The cartoon of scale $\infty$ is the mean of the input image: $C_\infty = \int_\Omega I_0(x, y)\,dx\,dy$.
3. The mean of any residue is zero: $\int_\Omega R_{n,m}\,dx\,dy = 0$.
4. A cartoon image can be built from residues of larger scales: $C_s = \sum_{n=s}^{\infty} R_{n,n+1} + C_\infty$.

The proof can be seen in Gilboa (2004).
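A small 1D sketch can illustrate the cartoon and residue structure. It approximates each cartoon $C_s$ by running the $\Phi$ descent with $\lambda = 1/s$ for a fixed number of iterations, which is only an approximation of the steady state. The scale list, test signal, iteration count, and step-size rule are assumptions made for the example, and a finite list of scales stands in for the infinite sum of property 4.

```python
import numpy as np

def cartoon(u0, scale, beta=1.0, n_iter=4000):
    """Approximate C_s by a long descent of the Phi flow with lambda = 1/s."""
    if scale == 0:
        return u0.copy()                          # C_0 is the input itself
    lam = 1.0 / scale
    dt = 0.9 / (4.0 * beta**2 + lam)              # below the stability limit of the explicit step
    u = u0.copy()
    for _ in range(n_iter):
        d = np.diff(u)
        flux = beta**2 * d / np.sqrt(1.0 + beta**2 * d**2)
        div = np.diff(np.concatenate(([0.0], flux, [0.0])))   # zero flux at both ends
        u = u + dt * (div + lam * (u0 - u))
    return u

rng = np.random.default_rng(1)
u0 = np.repeat([0.0, 1.0, 0.3], 40) + 0.05 * rng.standard_normal(120)

scales = [0, 1, 4, 16]
cartoons = {s: cartoon(u0, s) for s in scales}
residues = {(n, m): cartoons[n] - cartoons[m] for n, m in zip(scales[:-1], scales[1:])}

# Finite analogue of property 4: the coarsest cartoon plus all residues gives C_0
rebuilt = cartoons[scales[-1]] + sum(residues.values())
for key, res in residues.items():
    print("residue", key, "mean:", res.mean())
print("max reconstruction error:", np.max(np.abs(rebuilt - u0)))
```

Each residue has (numerically) zero mean because the discrete flow preserves the mean of the input, mirroring property 3.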
The $\Phi$ diffusion process dissipates energy. We note that the term $\int_\Omega (I_0 - I)^2\,dx\,dy$ is, actually, the power of the residue. This implies that $I_\Phi$ can be viewed as the most nonoscillatory sketch of $I_0$ when the permitted reduced power of the original signal is bounded by some measure proportional to $1/\lambda$.

In order to model a natural image in a simple way, yet capture its significant characteristics, we model the image as a cartoon of a single scale with its matching residue. We refer to the scale so chosen, to represent the cartoon part of the image, as the representative cartoon scale $s_r$. There can be several approaches to finding a representative scale, and, in general, an image can have several such scales. We propose to find the representative scale by examining the stability of the gradients along scales. As a cartoon consists primarily of a piecewise smooth image, partitioned by edges, a stable scale range $[s_1, s_2]$ is one in which the total edge length (number and size of objects) changes very slowly. As the definition of an edge is not always clear, we resort to finding the smooth regions, defined as having a gradient of less than 1% of the dynamic range of the input image. The total area (length in 1D) of smooth regions is $|N_s| = \int_\Omega \chi(N_s)\,dx\,dy$, where $\chi(A)$ is the indicator function of the set $A$, and we define the set of smooth points as $N_s := \{(x, y) : |\nabla I(x, y)| < T_s\}$. Here $T_s = (\max_\Omega(I_0) - \min_\Omega(I_0))/100$. The set of nonsmooth points is $N_{ns} = \Omega - N_s$. The smoothness area $|N_s|$ is generally increasing in scale ($|N_{ns}|$ decreasing), although monotonicity of the area, and embedding of the sets, is not guaranteed. For monotone Lyapunov functionals that can indicate stability of scales, see Weickert (1999a). We choose the scale $s_r$ as one of the meta-stable states of $|N_s|$ (Figures 57 and 58).
Our model consists of three components: $I_0 = I_C + I_{NC} + I_n$, where $I_{orig} = I_C + I_{NC}$ is the original image; $I_C$ is the cartoon approximation; $I_{NC}$ is the remainder Non-Cartoon part; and $I_n$ is an additive noise. Note that we left the definition of "non-cartoon" partly vague. Typically it consists of textures, small-scale details, thin lines, etc. The only assumption we make is that it has zero mean. Under this decomposition, the residue of the noisy image is:
$$ I_R \equiv I_0 - I = \tilde{I}_{NC} + \tilde{I}_n. \tag{103} $$
Note that we distinguish between the "true" nonoscillatory part and its approximation by the $\Phi$ diffusion process by the superscript tilde.

C. The Adaptive $\Phi$ Problem

To obtain an adaptive scheme, we generalize the $\Phi$ denoising problem by imposing a spatially varying power constraint. Let us define first a measure to which we refer as the local power:
FIGURE 57. (a) Original signal $I_{orig}$ (left); nonsmoothness $|N_{ns}|$ as a function of scale $s = 1/\lambda$ (right). (b) Signal (left) and residue (right) of first stable scale. (c) Signal (left) and residue (right) of second stable scale.
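$$ P_z(x, y) \equiv \frac{1}{|\Omega|}\int_\Omega \left(I_z(\tilde{x}, \tilde{y}) - E[I_z]\right)^2 w_{x,y}(\tilde{x}, \tilde{y})\,d\tilde{x}\,d\tilde{y}, \tag{104} $$
where $w_{x,y}(\tilde{x}, \tilde{y}) = w(|\tilde{x} - x|, |\tilde{y} - y|)$ is a normalized ($\int_\Omega w_{x,y}(\tilde{x}, \tilde{y})\,d\tilde{x}\,d\tilde{y} = 1$) and radially symmetric smoothing window, and $E[\cdot]$ is the expected value.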
GILBOA ET AL.
FIGURE 58. (a) Noisy signal $I_0$ (left); nonsmoothness $|N_{ns}|$ as a function of scale $s = 1/\lambda$ (right). (b) Signal (left) and residue (right) of stable scale.
From the definition of the local power, it follows that

$$\int_\Omega P_z(x, y)\, dx\, dy = \bar{P}_z, \tag{105}$$

where $\bar{P}_z \equiv \mathrm{var}(I_z)$.
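A discrete sketch of the local power and of the property in Eq. (105) is given below. It rests on several assumptions that are not taken from the text: periodic boundaries, a Gaussian window applied through the FFT, and a normalization in which the $1/|\Omega|$ factor is dropped so that the average (rather than the sum) of the local power over the image equals the variance. The helper names `gaussian_smooth` and `local_power` are hypothetical.

```python
import numpy as np

def gaussian_smooth(a, sigma_w):
    """Circular convolution with a Gaussian window normalized to sum to 1
    (periodic boundaries are an assumption of this sketch)."""
    ny, nx = a.shape
    dy = np.fft.fftfreq(ny) * ny   # signed pixel offsets 0, 1, ..., -1
    dx = np.fft.fftfreq(nx) * nx
    w = np.exp(-(dy[:, None] ** 2 + dx[None, :] ** 2) / (2.0 * sigma_w ** 2))
    w /= w.sum()
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(w)))

def local_power(Iz, sigma_w=5.0):
    """Window-weighted squared deviation from the global mean, per pixel."""
    return gaussian_smooth((Iz - Iz.mean()) ** 2, sigma_w)

rng = np.random.default_rng(0)
Iz = rng.normal(0.0, 10.0, size=(64, 64))
P = local_power(Iz)
# Discrete counterpart of Eq. (105): the image average of P equals var(Iz),
# because the window is normalized.
mean_P, var_Iz = P.mean(), Iz.var()
```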
We reformulate the scalar $\Phi$ problem, stated in Eq. (98), in the context of the adaptive $\Phi$ problem as follows:

$$\min_I \int_\Omega \Phi(|\nabla I|)\, dx\, dy \quad \text{subject to} \quad P_{\hat{R}}(x, y) = S(x, y), \tag{106}$$

where $I_{\hat{R}} = I - I_0 - C$, $C$ is a constant, and $S(x, y) > 0$ is assumed to be given a priori. We solve the optimization problem using Lagrange multipliers:

$$E = \int_\Omega \left( \Phi(|\nabla I|) + \frac{1}{2} \lambda(x, y) P_{\hat{R}}(x, y) \right) dx\, dy. \tag{107}$$
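The E-L equation for the variation with respect to $I$ is

$$\bar{\lambda}(x, y)(I - I_0 - C) - \mathrm{div}\left( \Phi' \frac{\nabla I}{|\nabla I|} \right) = 0, \tag{108}$$

where for any quantity $X(x, y)$, we define the locally averaged quantity $\bar{X}(x, y) = \int_\Omega X(\tilde{x}, \tilde{y}) w_{x,y}(\tilde{x}, \tilde{y})\, d\tilde{x}\, d\tilde{y}$. We solve this equation for $I$ by a gradient descent:

$$I_t = \bar{\lambda}(x, y)(I_0 - I + C) + \mathrm{div}\left( \Phi' \frac{\nabla I}{|\nabla I|} \right). \tag{109}$$

To compute the value of $\lambda$ we multiply the E-L equation [Eq. (108)] by $(I - I_0 - C)$ and integrate over $\Omega$. After a change in the order of integrals in the $\lambda$ term, we get

$$\int_\Omega \left( \lambda(x, y) S(x, y) - Q(x, y) \right) dx\, dy = 0, \tag{110}$$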
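where

$$Q(x, y) = (I - I_0 - C)\, \mathrm{div}\left( \Phi' \frac{\nabla I}{|\nabla I|} \right).$$

A sufficient condition is

$$\lambda(x, y) = \frac{Q(x, y)}{S(x, y)}. \tag{111}$$

Finally, the constant $C$ is obtained by solving $\partial_C E = 0$, yielding

$$C = \frac{\int_\Omega \lambda(x, y) \left( \bar{I}(x, y) - \bar{I}_0(x, y) \right) dx\, dy}{\int_\Omega \lambda(x, y)\, dx\, dy}. \tag{112}$$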
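The gradient descent of Eq. (109) with $\lambda = Q/S$ can be sketched as an explicit Euler iteration. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes $\Phi(s) = \sqrt{1 + s^2}$ and $C = 0$, uses periodic boundaries with forward/backward differences, smooths $\lambda$ with an FFT-based Gaussian window, and clips $\lambda$ to $[0, \lambda_{max}]$ so that the step `dt = 0.2` stays stable. The clipping, the step size, and all function names are choices made for this sketch.

```python
import numpy as np

def gaussian_smooth(a, sigma_w):
    """Circular convolution with a normalized Gaussian window (periodic)."""
    ny, nx = a.shape
    dy = np.fft.fftfreq(ny) * ny
    dx = np.fft.fftfreq(nx) * nx
    w = np.exp(-(dy[:, None] ** 2 + dx[None, :] ** 2) / (2.0 * sigma_w ** 2))
    w /= w.sum()
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(w)))

def div_flux(I):
    """div( grad I / sqrt(1 + |grad I|^2) ), i.e. Phi(s) = sqrt(1 + s^2)."""
    Ix = np.roll(I, -1, axis=1) - I               # forward differences
    Iy = np.roll(I, -1, axis=0) - I
    g = 1.0 / np.sqrt(1.0 + Ix ** 2 + Iy ** 2)    # Phi'(s) / s
    px, py = g * Ix, g * Iy
    # Backward differences of the flux give the divergence.
    return (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))

def adaptive_flow(I0, S, n_iter=100, dt=0.2, sigma_w=5.0, lam_max=1.0):
    """Explicit Euler iteration of Eq. (109) with C = 0."""
    I = I0.astype(float).copy()
    lam_bar = np.zeros_like(I)
    for _ in range(n_iter):
        d = div_flux(I)
        Q = (I - I0) * d                           # Q(x, y) with C = 0
        lam = np.clip(Q / S, 0.0, lam_max)         # Eq. (111), clipped (assumption)
        lam_bar = gaussian_smooth(lam, sigma_w)    # locally averaged lambda
        I = I + dt * (lam_bar * (I0 - I) + d)      # Euler step of Eq. (109)
    return I, lam_bar

rng = np.random.default_rng(0)
I0 = 100.0 + rng.normal(0.0, 10.0, size=(32, 32))  # flat patch plus noise
S = np.full(I0.shape, 10.0 ** 2)                   # uniform target power
out, lam_bar = adaptive_flow(I0, S, n_iter=50)
```

With `dt = 0.2` and `lam_max = 1.0`, each update is a convex combination of the current pixel, its four neighbours, and the input value, so the result stays within the range of `I0`; a larger `lam_max` would call for a smaller step.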
1. Automatic Texture-Preserving Denoising

In the general case, we do not have any significant prior knowledge about the image that can facilitate the denoising process. We only assume that the noise is of constant power and is not correlated with the signal (e.g., additive white Gaussian or uniform noise). Our aim is to use the $\Phi$ denoising mechanism in a more accurate and precise manner. Images that can be well represented by large-scale cartoon models are the best candidates for successful denoising. Images with much fine texture and detail will not benefit much from the operation; although it reduces most of the noise, this type of processing inevitably degrades important image features. The first problem is to distinguish between good and bad candidates for $\Phi$ denoising. The task becomes even more complex if this is done adaptively. Many natural images exhibit a mosaic of piecewise
smooth and texture patches. This type of image structure calls for a position (spatial)-varying filtering operation. The performance of the scalar $\Phi$ denoising process is illustrated in Figure 59, using typical cartoon-type and textured images. The SNRs of these three processed images are summarized in Figure 60 and plotted as a function of the reduced power (normalized power of the residue). As these examples illustrate, cartoon-type images are denoised much better than textured images (both in terms of SNR and visually). Another important observation is that the maximal SNR of cartoon and non-cartoon images is reached at different levels of denoising. Whereas cartoon-type images are stable and reach their peak SNR at high denoising levels ($\bar{P}_R \approx \sigma^2$), non-cartoon images degrade faster and require less denoising ($\bar{P}_R < \sigma^2$).

We present here a relatively simple method that can approximate the desired level of denoising in a region. In our aforementioned formulation [Eq. (106)], the problem reduces to finding $S(x, y)$. We use the cartoon pyramid model for this purpose. Our first aim is to differentiate between the cartoon part of the image $I_C$ and the noise and texture parts $I_{NC} + I_n$. We choose the first meta-stable scale where $\bar{P}_R \ge \sigma^2$ (this condition is actually implicit as there is no stable scale with residue power below the noise level). We assign

$$S(x, y) = \frac{\sigma^4}{P_R(x, y)}, \tag{113}$$
where $P_R(x, y)$ is the local power of the residue $I_R$. In the case where $I_R \approx I_n$ (basic cartoon model without textures or fine-scale details), this scheme degenerates to the scalar $\Phi$ process. The local power of the residue is almost constant ($P_R(x, y) \approx \sigma^2$) and hence $S(x, y) \approx \sigma^2$. We get a high-quality denoising process where $I \approx I_C = I_{orig}$. In the case of most natural images, however, textures will also be filtered and included in the residue part. As the noise is uncorrelated with the signal, we can approximate the total power of the residue as $P_{NC}(x, y) + P_n(x, y)$, the sum of local powers of the non-cartoon part and the noise, respectively. Thus textured regions are characterized by high local power of the residue. In order to preserve the detailed structure of such regions, the level of filtering there should be reduced. Let us recall the classical Wiener filter (optimal linear filter in the mean squared-error sense). Its formulation in the frequency domain is

$$G(\omega) = \frac{P_s(\omega)}{P_s(\omega) + P_n(\omega)}, \tag{114}$$
FIGURE 59. Scalar $\Phi$ denoising of textured and texture-free images. (a) Piecewise constant image; (b) textured image of grass; (c) patches of the two types of images combined in one. Left column depicts the original images; middle column shows noisy images; right column shows the result of scalar $\Phi$ processing (Eq. 3) at convergence ($\bar{P}_R = \sigma^2$). As can be seen, this process is most suitable for piecewise constant images and unsuitable for textured ones. In the case of images containing both types (as often happens in natural images) the textured parts are oversmoothed, whereas the texture-free parts are not sufficiently denoised. This naturally calls for different measures of denoising in different parts of the image.
FIGURE 60. SNR of scalar $\Phi$ denoising of images shown in Figure 59. SNR is plotted as a function of the reduced power normalized by the noise power: $\bar{P}_R/\sigma^2$. The dashed line indicates the piecewise constant image, the dash-dot line shows the texture image, and the solid line shows the combined image.
where $P_s(\omega)$ and $P_n(\omega)$ are the power spectra of the signal and noise, respectively. The basic concept amounts to a reduction in the extent of filtering ($G \to 1$) at frequencies where the signal power exceeds that of the noise. In our case we have a similar principle, whereby a reduction in the extent of filtering (i.e., $S \to 0$) is called for in regions where the signal power exceeds that of the noise. The signal in this case is that portion of the image accounting for the texture and fine details that may be filtered out by the $\Phi$ process. Formally, substituting the relation $P_R(x, y) \approx P_{NC}(x, y) + P_n = P_{NC}(x, y) + \sigma^2$ for $P_R(x, y)$ in Eq. (113), we get

$$S(x, y) \approx \sigma^2\, \frac{1}{1 + P_{NC}(x, y)/\sigma^2}. \tag{115}$$
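The assignment of Eq. (113) can be illustrated on a synthetic residue. The sketch below is hedged in the same way as any toy example: the Gaussian window via FFT, the periodic boundaries, the small floor `eps`, the normalization in which pure noise gives $P_R \approx \sigma^2$, and the sinusoidal "texture" are all assumptions made for illustration, and `reduced_power_target` is a hypothetical name.

```python
import numpy as np

def gaussian_smooth(a, sigma_w):
    """Circular convolution with a normalized Gaussian window (periodic)."""
    ny, nx = a.shape
    dy = np.fft.fftfreq(ny) * ny
    dx = np.fft.fftfreq(nx) * nx
    w = np.exp(-(dy[:, None] ** 2 + dx[None, :] ** 2) / (2.0 * sigma_w ** 2))
    w /= w.sum()
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(w)))

def reduced_power_target(I_R, sigma, sigma_w=5.0, eps=1e-12):
    """S(x, y) = sigma^4 / P_R(x, y), with P_R the local power of the residue."""
    P_R = gaussian_smooth((I_R - I_R.mean()) ** 2, sigma_w)
    return sigma ** 4 / np.maximum(P_R, eps)   # eps guards against division by zero

rng = np.random.default_rng(1)
sigma = 10.0
I_R = rng.normal(0.0, sigma, size=(64, 128))   # noise-only residue
# Hypothetical texture that was filtered into the residue, right half only.
cols = np.arange(64)
I_R[:, 64:] += 30.0 * np.sin(2.0 * np.pi * cols / 4.0)[None, :]

S = reduced_power_target(I_R, sigma)
S_smooth = S[:, 16:48].mean()      # interior of the noise-only half
S_texture = S[:, 80:112].mean()    # interior of the textured half
```

In the noise-only half $S$ stays near $\sigma^2$, while in the textured half it drops, in line with Eq. (115): the assumed texture has power $P_{NC} = 450$, so the predicted value is roughly $\sigma^2/(1 + 4.5)$.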
2. Denoising with Prior Information

In cases where more information regarding the structure of the original signal is available, the performance of a denoising process incorporating a spatially varying fidelity constraint can be substantially ameliorated. The specifics are application dependent and heuristic in nature. Therefore we
mention here only a few related ideas. To preserve specific features in the denoising process, such as long, thin lines or known types of textures, one can preprocess with the corresponding feature detector (Hough transform, texture detector). The value of $S(x, y)$ then depends locally on the feature detector response. Cases of spatially varying noise also fit the model. For example, in low-quality JPEG images, the boundaries between 8 × 8 pixel blocks are often more noisy, and fidelity to the original data there should, therefore, be decreased ($S$ increased).

D. Examples

The effects of adaptive- versus scalar-fidelity denoising are illustrated using a synthetic mosaic composed of two textured patches juxtaposed with two smooth patches (Figure 61). The scalar-fidelity term requires that a global power, equal to the noise power, be reduced. As the $\Phi$ process smooths both texture and noise, more power is reduced in the textured regions than in the originally smooth ones. This results in oversmoothing of textured regions, whereas smooth regions are not sufficiently denoised (Figure 61b, left side). The adaptive-fidelity term process (Figure 61b, right) applies different levels of denoising in different regions. This improves the result both visually (texture is better preserved, smooth regions are better denoised) and in terms of signal-to-noise ratio. In Figure 61c we show how the required spatially varying noise power, $S(x, y)$ (right), depends on the value of the residue, $I_R$ (left). The value of the adaptive fidelity term, $\lambda(x, y)$, obtained when the process converges is depicted graphically by the image at the bottom of the figure (lighter regions indicate higher value). Naturally, the value of $\lambda(x, y)$ is inversely related to the reduced power measure $S(x, y)$. Processing a noisy version of the Barbara image (see Figure 64) demonstrates that the adaptive $\Phi$ method performs well on natural images.
Our simple local power criterion seems to be sufficient to differentiate textured from smooth regions, even in relatively complex images. Accordingly, appropriate local requirements on the power to be reduced are applied. In Figure 65, Barbara's right knee is enlarged to highlight similar phenomena to those obtained in the case of the synthetic example, where textures are preserved and the denoising of smooth regions is stronger. Figure 62 shows the teddy bear from the toys image where the textured bear parts are in front of a smooth background. Noise is reduced selectively in a natural manner. In Figure 63 the texture of the background snow is preserved better in the proposed scheme compared with the regularized P-M process.
FIGURE 61. Processing of a noisy mosaic of textures (fabric and metal) and smooth areas. (a) Original mosaic made of patches of fabric and metal textures, juxtaposed with two constant patches (left); noisy version, $I_0$, of the original with SNR = 2.4 dB, $\sigma = 40$ (right). (b) Result of processing with scalar $\lambda$: SNR = 6.4 dB (left); result with adaptive $\lambda$: SNR = 7.6 dB (right). (c) Residue $I_R$ (left); $S(x, y)$ calculated according to the residue (right). (d) $\lambda(x, y)$ at the convergence of the process.

FIGURE 62. Part of the toys image. (a) Left: original; right: noisy image. (b) Left: result of scalar $\lambda$ denoising; right: result of our adaptive $\lambda$ denoising. More information of the original image is preserved in our scheme.

FIGURE 63. Comparison between regularized Perona–Malik (P-M) and our adaptive scheme. This example addresses the texture-oversmoothing problem raised in Section I (Figure 5). (a) Original (left); image contaminated by additive white Gaussian noise (right, $\sigma_n = 15$). (b) Image denoised using P-M (left) and processing with adaptive $\lambda$ (right). Textures and small-scale features are kept better in our scheme.

Table 2 shows the comparison between scalar and adaptive processes in terms of SNR. As can be observed, denoising is improved in a variety of natural images.

1. Implementation Details

We used explicit Euler schemes to implement the iterative processes. The averaging window $w(x, y)$ was selected to be a Gaussian of standard deviation $\sigma_w = 5$. The potential in all images was $\Phi(s) = \sqrt{1 + s^2}$ ($\beta = 1$). As we used gray-level images with values in the range [0, 255] the results are similar to TV denoising. We observed that the calculation of the constant $C$ gives very little improvement. Therefore we used $C = 0$ to save time. The residue power was bounded by $\bar{P}_R \le 1.5\sigma^2$. In the experiment on natural images (results shown in Table 2) we set a constant residue power $\bar{P}_R = 1.5\sigma^2$. Texture patches were taken from the VisTex archive (VisTex Vision Texture Archive). All images were processed automatically with the same parameters (no tuning of parameters was performed for each image).
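The remark that the results resemble TV denoising can be made explicit with a short worked step. Assuming the potential $\Phi(s) = \sqrt{1 + s^2}$ quoted in the implementation details, the diffusion term of Eq. (109) becomes:

```latex
\Phi(s) = \sqrt{1 + s^2}
\;\;\Longrightarrow\;\;
\Phi'(s) = \frac{s}{\sqrt{1 + s^2}},
\qquad
\mathrm{div}\!\left( \Phi'(|\nabla I|) \frac{\nabla I}{|\nabla I|} \right)
= \mathrm{div}\!\left( \frac{\nabla I}{\sqrt{1 + |\nabla I|^2}} \right)
\;\xrightarrow{\;|\nabla I| \gg 1\;}\;
\mathrm{div}\!\left( \frac{\nabla I}{|\nabla I|} \right).
```

For gray levels in [0, 255], gradients across edges and noise are typically much larger than 1, so the flow is close to the TV flow there, while the constant 1 under the square root keeps the term well defined where $\nabla I = 0$.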
FIGURE 64. An example of processing results obtained with a natural image. (a) Original Barbara image (left); noisy version of the original image, $I_0$, with SNR = 8.7 dB, $\sigma = 20$ (right). (b) Result of processing with scalar $\lambda$ (SNR = 12.6 dB, left); result of processing with adaptive $\lambda$ (SNR = 14.2 dB, right). (c) Residue $I_R$ (left); $S(x, y)$ calculated according to residue (middle); $\lambda(x, y)$ at convergence of process (right).
FIGURE 65. Enlargement of Barbara’s right knee (full images are in Figure 64). Left: result of scalar process; right: result of our adaptive process.
TABLE 2
DENOISING RESULTS OF A FEW CLASSICAL IMAGES

Image        SNR0    Scalar    Adaptive
Cameraman    15.8    19.2      20.8
Lena         13.5    17.5      18.6
Boats        15.6    19.6      20.6
Sailboat     10.4    15.1      16.3
Toys         10.0    16.8      17.8

From left, SNR of the noisy image (SNR0), SNR of scalar $\lambda$ denoising (Scalar), SNR of our adaptive $\lambda$ denoising (Adaptive). All experiments were done on images degraded by additive white Gaussian noise ($\sigma = 10$).
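As a small arithmetic check on Table 2, the improvement of the adaptive over the scalar process can be tabulated per image. The values below are transcribed from the table; the variable names are illustrative only, and the gains are assumed to be in the same units as the tabulated SNR values.

```python
# SNR values transcribed from Table 2: (noisy, scalar, adaptive).
table2 = {
    "Cameraman": (15.8, 19.2, 20.8),
    "Lena":      (13.5, 17.5, 18.6),
    "Boats":     (15.6, 19.6, 20.6),
    "Sailboat":  (10.4, 15.1, 16.3),
    "Toys":      (10.0, 16.8, 17.8),
}

# Gain of the adaptive over the scalar process, rounded to the table precision.
gains = {name: round(adaptive - scalar, 1)
         for name, (_, scalar, adaptive) in table2.items()}
mean_gain = sum(gains.values()) / len(gains)
min_gain = min(gains.values())
```

Every image in the table gains at least 1.0 from the adaptive constraint, and the average gain over the five images is about 1.2.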
E. Discussion

The widely used variational denoising algorithms with global power constraints perform well on simple cartoon-type images, where most of the information is represented by the simple skeleton approximation of the image. However, more subtle constraints are called for to preserve texture and small-scale details. We developed an adaptive variational scheme that controls the level of denoising by local power (variance) constraints. In this study a simple mechanism based on the local power of the residue was introduced to determine the desired adaptive constraints. Solving the E-L equations resulted in a spatially varying fidelity term that determines the value of the fidelity to the input image (or degree of denoising) in each region. A priori knowledge on the details to be preserved can further enhance this method. We have shown that this scheme can filter noise better than the scalar-fidelity term process in terms of SNR over a variety of synthetic and natural
images. Visually, the processed images look more natural and less "cartoon-like." Spatially varying power constraints can be used in almost any variational denoising process. Further improvement in distinguishing between texture and noise may be gained by using schemes more elaborate than the power criterion, such as those obtained by transforming the residue to the Gabor/wavelet space.

VI. CONCLUSION

In this study we have tried to show the various capabilities of PDE-based image-processing algorithms. A few classical denoising schemes, such as P-M and Rudin–Osher–Fatemi (TV), were introduced, as well as some of the latest directions that are currently under research. We summarize below the main features of the recently developed methods of Sections II through V, which constitute part of the authors' contribution to this dynamic field.

1. FAB–Triple-Well

An evolutionary, gradient-based, general-purpose sharpener, which can cope with a wide range of blur and noise degradations, is presented and implemented. Simultaneous forward-and-backward diffusion processes are coupled in a single process. The diffusion coefficient may assume negative values, and in this respect differs from edge-preserving denoising processes. The stability over smooth regions is guaranteed under certain conditions. This provides a bound on the extent of the backward-diffusion part permitted in such processes. This type of sharpening process can also emerge out of a different approach by using the tools of calculus of variations. In this case the energy functional to be minimized is based on a nonmonotonic potential in the form of a triple-well. A fourth-order hyperdiffusion term is incorporated to increase robustness and reduce oscillatory solutions in processes containing backward diffusion. Vector-valued sharpening is achieved by an FAB-type process in the Beltrami framework, where the new "metric" defined on the manifold can be nonpositive definite at certain regions of the image.
2. Complex Diffusion

The complex diffusion problem is formulated for image-processing tasks, coupling the diffusion and Schrödinger equations.
A fundamental solution for $t \ge 0$, $\theta \in (-\frac{\pi}{2}, \frac{\pi}{2})$ is obtained. The small $\theta$ approximation is outlined and analyzed. The basic findings are:

a. The imaginary part is approximately a smoothed second derivative, scaled by time. It can serve as a well-posed edge detector.
b. Essentially, the real part of the process is similar to Gaussian smoothing with respect to the distance operator and the positivity measure of the kernel. It can therefore be regarded as a Gaussian scale-space for any practical implementation.
c. The approximation error is of order $O(\theta^2)$, diminishing reasonably fast as $\theta \to 0$.

A ramp-preserving nonlinear complex process overcomes the staircasing problem of gradient-based nonlinear diffusions. The imaginary part is found to be especially suitable for detecting edges in cases of varying illumination changes. A complex shock filter process is presented that can work in a noisy environment. It is demonstrated by a quantitative experiment that the process outperforms the most advanced and popular real-valued robust shock filters.
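Finding (a) can be checked numerically in one dimension. The sketch below assumes the linear complex diffusion equation in the form $I_t = e^{i\theta} I_{xx}$ on a periodic grid with unit spacing, solved exactly in the Fourier domain; the equation form, the test signal, and the parameter values are assumptions of this illustration, not a reproduction of the authors' experiments.

```python
import numpy as np

def complex_diffusion_1d(I0, t, theta):
    """Exact periodic solution of I_t = exp(i*theta) * I_xx via the FFT."""
    k = 2.0 * np.pi * np.fft.fftfreq(I0.size)          # angular frequencies
    return np.fft.ifft(np.fft.fft(I0) * np.exp(-np.exp(1j * theta) * k ** 2 * t))

x = np.arange(256)
I0 = np.exp(-((x - 128.0) ** 2) / (2.0 * 10.0 ** 2))   # smooth test bump
t, theta = 4.0, np.pi / 180.0                          # small angle: one degree

I = complex_diffusion_1d(I0, t, theta)

# Small-theta prediction: Im(I)/theta ~ t * second derivative of the
# Gaussian-smoothed signal (Gaussian of variance 2t).
k = 2.0 * np.pi * np.fft.fftfreq(I0.size)
pred = t * np.real(np.fft.ifft(-(k ** 2) * np.exp(-(k ** 2) * t) * np.fft.fft(I0)))
rel_err = np.max(np.abs(I.imag / theta - pred)) / np.max(np.abs(pred))
```

The normalized imaginary part `I.imag / theta` agrees with the time-scaled smoothed second derivative up to a small relative error, consistent with the $O(\theta^2)$ error term in finding (c).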
3. Texture Preserving Denoising

A pyramidal model of sketches of increasing scale is presented, wherein the scale of textures and details is well defined in the framework of a $\Phi$-process. It was shown how, for quite a general family of denoising processes, textures can be better preserved by imposing spatially varying power constraints. An automatic mechanism in the case of white Gaussian noise is presented, depicting visual improvement and a consistent increase in terms of SNR.
REFERENCES

Abramovich, F., and Silverman, B. W. (1998). Wavelet decomposition approaches to statistical inverse problems. Biometrika 85, 115–129.
Alvarez, L., Guichard, F., Lions, P. L., and Morel, J. M. (1993). Axioms and fundamental equations of image processing. Arch. Ration. Mech. Anal. 123(3), 199–257.
Alvarez, L., and Mazorra, L. (1994). Signal and image restoration using shock filters and anisotropic diffusion. SIAM J. Numer. Anal. 31(2), 590–605.
Aubert, G., and Kornprobst, P. (2002). Mathematical problems in image processing, in Applied Mathematical Sciences, Vol. 147, New York: Springer-Verlag.
Ayers, G. R., and Dainty, J. C. (1998). Iterative blind deconvolution method and its applications. Opt. Lett. 13(7), 547–549.
Ball, J., and James, R. (1992). Proposed experimental tests of a theory of fine microstructure and the two-well problem. Phil. Trans. R. Soc. Lond. A 338, 389–450.
Barbaresco, F. (2000). Calcul des variations et analyse spectrale: Equations de Fourier et de Burgers pour modeles autoregressifs regularises. Traitement du Signal 17(5/6).
Blake, A., and Zisserman, A. (1987). Visual Reconstruction. Cambridge, MA: MIT Press.
Blanc-Féraud, L., Charbonnier, P., Aubert, G., and Barlaud, M. (1995). Nonlinear image processing: Modelling and fast algorithm for regularization with edge detection. Proc. IEEE ICIP-95 1, 474–477.
Blomgren, P. V., and Chan, T. F. (1998). Color TV: Total variation methods for restoration of vector valued images. IEEE Trans. Image Process. 7, 304–309.
Burt, P. J., and Adelson, E. H. (1993). The Laplacian pyramid as a compact image code. IEEE Trans. Commun. COM-31(4), 532–540.
Cahn, J. W., and Hilliard, J. E. (1958). Free energy of a nonuniform system. I. Interfacial free energy. J. Chem. Phys. 28(2), 258–267.
Carstensen, C., and Plechac, P. (1997). Adaptive mesh refinement in scalar non-convex variational problems. Berichtsreihe des Mathematischen Seminars Kiel. 97(2).
Caselles, V., Kimmel, R., and Sapiro, G. (1997). Geodesic active contours. Int. J. Comput. Vision 22(1), 61–79.
Catte, F., Lions, P. L., Morel, J. M., and Coll, T. (1992). Image selective smoothing and edge detection by nonlinear diffusion. SIAM J. Num. Anal. 29(1), 182–193.
Chan, T. F., and Wong, C. (1998). Total variation blind deconvolution. IEEE Trans. Image Process. 7, 370–375.
Chan, T. F., and Shen, J. (2002). A good image model eases restoration - on the contribution of Rudin-Osher-Fatemi's BV image model. IMA Preprints 1829.
Charbonnier, P., Blanc-Féraud, L., Aubert, G., and Barlaud, M. (1994). Two deterministic half-quadratic regularization algorithms for computed imaging. Proc. IEEE ICIP '94 2, 168–172.
Cheeseman, P., Kanefsky, B., Kraft, R., Stutz, J., and Hanson, R. (1996). Super-resolved surface reconstruction from multiple images, in Maximum Entropy and Bayesian Methods, edited by G. R. Heidbreder. Kluwer: The Netherlands, pp. 293–308.
Cottet, G. H., and Germain, L. (1993). Image processing through reaction combined with nonlinear diffusion. Math. Comp. 61, 659–673.
Coulon, O., and Arridge, S. R. (2000). Dual echo MR image processing using multi-spectral probabilistic diffusion coupled with shock filters, in MIUA '2000, British Conference on Medical Image Understanding and Analysis. London, United Kingdom.
Courant, R., Friedrichs, K. O., and Lewy, H. (1967). On the partial difference equations of mathematical physics. IBM J. 11, 215–235.
Cross, M. C., and Hohenberg, P. C. (1993). Pattern formation outside of equilibrium. Rev. Mod. Phys. 65, 854–1090.
Dascal, L., and Sochen, N. (2003). The maximum principle in the Beltrami color flow, in Scale-Space 2003, LNCS 2659, edited by L. D. Griffin and M. Lillholm. Isle of Skye, UK: Springer, pp. 196–208.
Demoulini, S. (1996). Young measure solutions for a nonlinear parabolic equation of forward-backward type. SIAM J. Math. Anal. 27, 376–403.
Deriche, R., and Faugeras, O. (1996). Les EDP en traitement des images et vision par ordinateur. Traitement du Signal 13(6).
Donoho, D. L. (1995). Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. App. Comp. Harmonic Anal. 2, 101–126.
Elad, M., and Feuer, A. (1999). Super-resolution restoration of continuous image sequence - adaptive filtering approach. IEEE Trans. Image Process. 8(3), 387–395.
Ericksen, J. (1987). Some constrained elastic crystals, in Material Instabilities in Continuum Mechanics and Related Problems, edited by J. Ball. Oxford: Oxford University Press, pp. 119–137.
Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Stat. 19, 1257–1272.
Gabor, D. (1946). Theory of communication. J. Inst. Electric. Eng. 93(III), 429–457.
Gabor, D. (1965). Information theory in electron microscopy. Lab. Invest. 14(6), 801–807.
Geman, S., and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. PAMI 6, 721–741.
Gilboa, G. (2004). Super-resolution algorithms based on inverse diffusion-type processes. PhD thesis. Technion - Israel Institute of Technology.
Gilboa, G., Zeevi, Y. Y., and Sochen, N. (2000a). Anisotropic selective inverse diffusion for signal enhancement in the presence of noise. Proc. IEEE ICASSP-2000, I, pp. 221–224, Istanbul, Turkey.
Gilboa, G., Zeevi, Y. Y., and Sochen, N. (2000b). Signal and image enhancement by a generalized forward-and-backward adaptive diffusion process. Proc. EUSIPCO-2000, Tampere, Finland.
Gilboa, G., Zeevi, Y. Y., and Sochen, N. (2001a). Resolution enhancement by forward-and-backward nonlinear diffusion processes, in Nonlinear Signal and Image Processing. Baltimore, Maryland.
Gilboa, G., Zeevi, Y. Y., and Sochen, N. (2001b). Complex diffusion processes for image filtering, in Scale-Space 2001, LNCS 2106, edited by M. Kerckhove. Vancouver: Springer-Verlag, pp. 299–307.
Gilboa, G., Sochen, N., and Zeevi, Y. Y. (2001c). Image enhancement segmentation and denoising by time dependent nonlinear diffusion processes. Proc. Int. Conf. Image Process. (ICIP) 2001 3, 134–137.
Gilboa, G., Sochen, N., and Zeevi, Y. Y. (2002a). A forward-and-backward diffusion process for adaptive image enhancement and denoising. IEEE Trans. Image Process. 11(7), 689–703.
Gilboa, G., Sochen, N., and Zeevi, Y. Y. (2002b). Regularized shock filters and complex diffusion, in ECCV-'02, LNCS 2350. Copenhagen: Springer-Verlag, pp. 399–313.
Gilboa, G., Sochen, N., and Zeevi, Y. Y. (2003a). Texture preserving variational denoising using an adaptive fidelity term. Proc. VLSM 2003, Nice, France, pp. 137–144.
Gilboa, G., Sochen, N., and Zeevi, Y. Y. (2003b). PDE-based denoising of complex scenes using a spatially-varying fidelity term. Proc. ICIP 2003. 1, Barcelona, Spain, pp. 865–868.
Gilboa, G., Sochen, N., and Zeevi, Y. Y. (2004a). Image sharpening by flows based on triple well potentials. J. Math. Imag. Vision 20, 121–131.
Gilboa, G., Sochen, N., and Zeevi, Y. Y. (2004b). Image enhancement and denoising by complex diffusion processes. IEEE Trans. Pat. Anal. Mach. Intell. (PAMI) 25(8), 1020–1036.
Ginzburg, V. L., Landau, L. D., and Fiz Zh. Eksp. Teor. Fiz. (1950). English translation: see Men of Physics: in Landau, edited by D. ter Haar. Vol. II. New York: Pergamon, pp. 546–568.
Gobbert, M. K., and Prohl, A. (1998). A survey of classical and new finite element methods for the computation of crystalline microstructure. IMA Preprints 1576.
Goupillaud, P., Grossmann, A., and Morlet, J. (1984–1985). Cycle-octave and related transforms in seismic signal analysis. Geoexploration 23, 85–102.
Grossmann, A., and Morlet, J. (1984). Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM J. Math. Anal. 15, 723–736.
Höllig, K. (1983). Existence of infinitely many solutions for a forward-backward heat equation. Trans. Amer. Math. Soc. 278, 299–316.
Kaftory, R., Sochen, N., and Zeevi, Y. Y. (2003). Color image denoising and blind deconvolution using the Beltrami operator, in Proc. Int. Symposium on Image and Signal Processing and Analysis. Rome, Italy, pp. 1–4.
Kass, M., Witkin, A., and Terzopoulos, D. (1987). Snakes: Active contour models. Inter. J. Comput. Vision 1, 321–331.
Kichenassamy, S., Kumar, A., Olver, P., Tannenbaum, A., and Yezzi, A. (1995). Gradient flows and geometric active contour models. Proc. IEEE ICCV 810–815.
Kimmel, R., Sochen, N., and Malladi, R. (1996). On the geometry of texture. Report by Berkeley Labs. UC, LBNL-39640, UC-405.
Kimmel, R., Malladi, R., and Sochen, N. (2000). Images as embedding maps and minimal surfaces: Movies, color, texture, and volumetric medical images. Int. J. of Computer Vision 39(2), 111–129.
Koenderink, J. J. (1984). The structure of images. Biol. Cybern. 50, 363–370.
Kornprobst, P., Deriche, R., and Aubert, G. (1997). Image coupling, restoration and enhancement via PDE's, in Proc. Int. Conf. on Image Processing 1997, Santa Barbara, pp. 458–461.
Kreyszig, E. (1991). Differential Geometry. New York: Dover Publications.
Kundur, D., and Hatzinakos, D. (1996). Blind image deconvolution. IEEE Sig. Process. Mag. 13, 43–64.
Kuramoto, Y. (1984). Chemical Oscillations, Waves, and Turbulence. New York: Springer-Verlag.
Kurganov, A., Levy, D., and Rosenau, P. (1998). On Burgers-type equations with non-monotonic dissipative fluxes. Commun. Pure Appl. Math. 51, 443–473.
Li, X., and Chen, T. (1994). Nonlinear diffusion with multiple edginess thresholds. Pat. Recogn. 27(8), 1029–1037.
Lindeberg, T., and ter Haar Romeny, B. (1994). Linear scale-space: (I) Basic theory and (II) Early visual operations, in Geometry-Driven Diffusion, edited by B. ter Haar Romeny. Dordrecht: Kluwer Academic Publishers, pp. 1–77.
Lindenbaum, M., Fischer, M., and Bruckstein, A. (1994). On Gabor's contribution to image enhancement. Pat. Recognition 27(1), 1–8.
Luskin, M. (1997). Approximation of a laminated microstructure for a rotationally invariant, double-well energy density. Numer. Math. 75, 205–221.
Malladi, R., Sethian, J. A., and Vemuri, B. C. (1995). Shape modeling with front propagation: A level set approach. IEEE Trans. Pat. Anal. Mach. Intell. 17(2), 158–175.
Marr, D. (1982). Vision. San Francisco, CA: Freeman & Co.
McCallum, B. C. (1990). Blind deconvolution by simulated annealing. Opt. Commun. 75(2), 101–105.
Meyer, Y. (2001). Oscillatory patterns in image processing and nonlinear evolution equations, Vol. 22. University Lecture Series, AMS. American Mathematical Society, Providence, RI.
Mumford, D. (1994). The Bayesian rationale for energy functionals, in Geometry Driven Diffusion in Computer Vision, pp. 141–153, edited by B. M. ter Haar Romeny. Dordrecht: Kluwer Academic Publishers.
Mumford, D., and Shah, J. (1989). Optimal approximations by piece-wise smooth functions and associated variational problems. Comm. Pure Appl. Math. LII, 577–685.
Munoz, J., and Pedregal, P. (2000). Explicit solutions of nonconvex variational problems in dimension one. Appl. Math. Optimiz. 41(1), 129–140.
Nagasawa, M. (1993). Schrödinger equations and diffusion theory, in Monographs in Mathematics, Vol. 86, Basel, Switzerland: Birkhäuser Verlag.
Newell, A. C. (1974). Envelope equations. Lect. Appl. Math. 15, 157–163.
Nikolova, M. (2002). Minimizers of cost-functions involving non-smooth data-fidelity terms. Application to the processing of outliers. SIAM Journ. Numer. Anal. 40, 3.
Olver, P., Sapiro, G., and Tannenbaum, A. (1994). Differential invariant signatures and flows in computer vision: A symmetry group approach, in Geometry-Driven Diffusion, edited by B. ter Haar Romeny. Dordrecht: Kluwer Academic Publishers.
Osher, S. J., and Rudin, L. I. (1990). Feature-oriented image enhancement using shock filters. SIAM J. Numer. Anal. 27, 919–940.
Osher, S., and Sethian, J. (1988). Fronts propagating with curvature dependent speed: Algorithms based on Hamilton-Jacobi formulations. J. Comp. Phys. 79, 12–49.
Pedregal, P. (1996). On the numerical analysis of non-convex variational problems. Numer. Math. 74(03), 325–336.
Pedregal, P. (1999). Optimization, relaxation and Young measures. Bull. Amer. Math. Soc. 36, 27–58.
Perona, P., and Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE Trans. Pat. Anal. Machine Intel. PAMI-12(7), 629–639.
Pollak, I., Willsky, A. S., and Krim, H. (2000). Image segmentation and edge enhancement with stabilized inverse diffusion equations. IEEE Trans. Image Process. 9(2), 256–266.
Polyakov, A. M. (1981). Quantum geometry of bosonic strings. Phys. Lett. 103B, 207–210.
Radmoser, E., Scherzer, O., and Weickert, J. (2000). Scale-space properties of nonstationary iterative regularization methods. J. Visu. Commun. Image Representation 8, 96–114.
Ratliff, F. (1965). Mach Bands: Quantitative Studies on Neural Networks in the Retina. San Francisco: Holden-Day.
Rost, M., and Krug, J. (1995). A practical model for the Kuramoto-Sivashinsky equation. Physica D 88, 1–13.
Roubicek, T. (1997). Relaxation in Optimization Theory and Variational Calculus. New York: Walter de Gruyter.
Rougon, N., and Preteux, F. (1995). Controlled anisotropic diffusion. Proc. SPIE Conf. on Non-linear Image Processing VI - IS&T/SPIE Symp. on Electronic Imaging, Science and Technology '95 2424, pp. 329–340.
Rudin, L., Osher, S., and Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Physica D 60, 259–268.
Samson, C., Blanc-Féraud, L., Aubert, G., and Zerubia, J. (2000). A variational model for image classification and restoration. IEEE Trans. Pat. Anal. Machine Intel. 22(5), 460–472.
Sapiro, G., and Ringach, D. L. (1996). Anisotropic diffusion of multivalued images with applications to color filtering. IEEE Trans. Image Process. 5, 1582–1586.
Schultz, R. R., and Stevenson, R. L. (1996). Extraction of high-resolution frames from video sequences. IEEE Trans. Image Process. 5(6), 996–1011.
Sivashinsky, G. I. (1983). Instabilities, pattern formation, and turbulence in flames. Ann. Rev. Fluid Mech. 15, 179–199.
Sochen, N., Gilboa, G., and Zeevi, Y. Y. (2000). Color image enhancement by a forward-and-backward adaptive Beltrami flow, in AFPAC-2000, LNCS 1888, edited by G. Sommer and Y. Y. Zeevi. Kiel: Springer-Verlag, pp. 319–328.
Sochen, N., Kimmel, R., and Malladi, R. (1998). A general framework for low level vision. IEEE Trans. Image Process. 7, 310–318.
Sochen, N. (1999). Stochastic processes in vision I: From Langevin to Beltrami. CCII Report No. 245, June 1999, Technion, and Proceedings of Int. Conf. Comp. Vis. July 2001, Vancouver, pp. 288–293.
REAL AND COMPLEX PDE-BASED SCHEMES
109
Starck, J. L., and Bijaoui, A. (1994). Filtering and deconvolution by the wavelet transform. Sig. Process. 35, 195–211. Stefanski, L., and Carroll, R. (1990). Deconvoluting kernel density estimators. Statistics 21, 169–184. ter Haar Romeny, B. M. (Ed.) (1994). Geometry Driven DiVusion in Computer Vision. Kluwer Academic Publishers. Dordrecht. ter Haar Romeny, B. M. (1996). Introduction to scale-space theory: Multiscale geometric image analysis. Tech. Report No. ICU-96-21. Utrecht University. Tikhonov, A. N., and Arsenin, V. Y. (1977). Solutions of Ill-posed Problems. Washington, D. C., Winston and Sons. Vese, L. A., and Osher, S. J. (2002). Modeling textures with total variation minimization and oscillating patterns in image processing. UCLA CAM Report pp. 02–19. VisTex Vision Texture Archive of the MIT Media Lab http://www-white.media.mit.edu/ vismod/imagery/VisionTexture/vistex.html. Vitsnudel, I., Ginosar, R., and Zeevi, Y. Y. (1991). Neural network aided design for image processing. SPIE Symp. Vis. Commun. Image Process. 1606, Boston, MA, pp. 1086–1091. Vogel, R. V., and Oman, M. E. (1996). Iterative methods for total variation denoising. SIAM J. Sci. Comput. 17(1), 227–238. Wei, G. W. (1999). Generalized Perona-Malik equation for image restoration. IEEE Signal Process. Lett. 6, 165–167. Weickert, J. (1995b). Anisotropic diVusion in image processing. Ph.D. thesis, Kaiserslautern University, Germany. Weickert, J. (1977). A review of nonlinear diVusion filtering, edited by B. ter Haar Romeny, L. Florack, J. Koenderink, and M. Viergever. in Scale-Space Theory in Computer Vision, LNCS 1252, Berlin: Springer, pp. 3–28. Weickert, J. (1999a). Coherence-enhancing diVusion filtering. Inter. J. Comput. Vision 31, 111–127. Weickert, J. (1999b). Coherence-enhancing diVusion of colour images. Image Vision Comp. 17, 199–210. Weickert, J., Ishikawa, S., and Imiya, A. (1999). Linear scale-space has first been proposed in Japan. J. Math. Imag. Vision 10, 237–252. 
Weickert, J., and Benhamouda, B. (1997). A semidiscrete nonlinear scale-space theory and its relation to the Perona-Malik paradox, in Advances in Computer Vision, edited by F. Solina, Wien: Springer, pp. 1–10 Whitaker, R., and Gerig, G. (1994). Vector-valued diVusion, in Geometry-Driven DiVusion, edited by B. ter Haar Romeny. Bordrecht: Kluwer Academic Publishers, pp. 93–134. Whitaker, R. T., and Pizer, S. M. (1993). A multi-scale approach to non uniform diVusion. CVGIP: Image Understanding 57(1), 99–110. Witelski, T. P. (1996). The structure of internal layers for unstable nonlinear diVuison equations. Stud. Appl. Math. 96, 277–300. Witkin, A. P. (1983). Scale space filtering. Proc. Int. Joint Conf. Artificial Intelligence, pp. 1019–1023. You, Y., and Kaveh, M. (1996). A regularization approach to joint blur identification and image restoration. IEEE Trans. Image Process. 5(3), 416–428. You, Y., Xu, W., Tannenbaum, A., and Kaveh, M. (1996). Behavioral analysis of anisotropic diVusion in image processing. IEEE Trans. Image Process. 5(11), 1539–1553. Zibulski, M., and Zeevi, Y. Y. (1997). Analysis of multi-window Gabor-type schemes by frame methods. J. App. Comput. Harmon. Analy. 4, 188–221.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 136
The S-State Model for Electron Channeling in High-Resolution Electron Microscopy P. GEUENS AND D. VAN DYCK Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
I. Introduction
   A. Why High-Resolution Electron Microscopy?
   B. The Need for a New Theory
   C. Survey of Diffraction Theories
      1. Thin Objects: The Weak Phase Object Approximation
      2. Multislice Method
      3. Bloch Wave Method
   D. Conclusion
II. The Channeling Theory
   A. Introduction
   B. The S-State Model for Channeling
III. Calculation of the Eigenfunctions of an Electron in an Isolated Atom Column
   A. The Angular Equation
   B. The Radial Equation of the Bound Eigenfunctions
   C. Solutions of the Radial Equation of the Bound Eigenfunctions: Finite Difference Methods
   D. Solutions of the Radial Equation of the Bound Eigenfunctions: Expansion in a Basis Set
      1. A Basis Set of Bessel Functions of the First Kind
      2. Expansion in Two-Dimensional Quantum Harmonic Oscillator Eigenfunctions
      3. Optimization of the Two-Dimensional Quantum Harmonic Oscillator Width
   E. Calculation of the Bound Eigenfunctions Using the Bloch Wave Method or the Multislice Method
      1. The Bloch Wave Method
      2. The Multislice Method
   F. Comparison of the Performance of the Presented Methods to Calculate the Bound Eigenfunctions and Their Eigenenergy
      1. Finite Difference Method
      2. Bessel Functions of the First Kind Versus Two-Dimensional Quantum Harmonic Oscillator Eigenfunctions
      3. Some Hard Numbers for the Eigenenergies
   G. The Radial Equation of the Continuum Eigenfunctions
   H. Excitation of the Eigenfunctions
      1. Excitation of the Bound Eigenfunctions
      2. Excitation of the Continuum Eigenfunctions
ISSN 1076-5670/05 DOI: 10.1016/S1076-5670(04)36002-7
Copyright 2005, Elsevier Inc. All rights reserved.
GEUENS AND VAN DYCK
IV. The S-State Model
   A. Physical Insight in the S-State Model: The Channeling Map
   B. Scaling and Parameterization of the S-State Model
      1. Scaling and Parameterization of the 1S Eigenfunction
      2. Parameterization of the 1S Eigenenergy
   C. A Fast Method to Calculate the Parameterized 1S-State: The Variational Principle
   D. Conclusion
V. The S-State Model for Nonisolated Atom Columns
   A. Symmetry Arguments
   B. A Pair of Identical Atom Columns
   C. A Pair of Nonidentical Atom Columns
   D. The S-State Model for a General Assembly of Atom Columns
   E. Accuracy of the S-State Model for a General Assembly of Atom Columns
   F. The LCAO: A Method to Calculate Approximate Eigenfunctions and Eigenenergies of a Pair of Atom Columns
   G. Conclusion
VI. The S-State Model in Case of Crystal or Beam Tilt
   A. Introduction
   B. Excitation of the Eigenfunctions
   C. Shift of the Maxima in the Amplitude and Phase of the Wave Function
   D. The S-State Model for a General Assembly of Atom Columns Including Crystal or Beam Tilt
   E. Accuracy of the S-State Model for a General Assembly of Atom Columns Including Crystal or Beam Tilt
   F. Small-Angle Nonparallel Illumination
   G. Conclusion
VII. Experimental Channeling Maps
VIII. Electron Diffraction and the S-State Model
   A. Electron Diffraction
   B. Direct Methods
      1. The Patterson Function
      2. Inequalities
   C. How Extinct Are Kinematically Forbidden Reflections?
      1. The (002) Reflection of a Diamond-Type Structure in [110] Zone-Axis Orientation
      2. Beyond the S-State Model
      3. The Influence of Tilt
   D. Conclusion
Appendix A. The Mean Atom Column Potential
Appendix B. The Two-Dimensional Quantum Harmonic Oscillator
   1. Solution of the Angular Equation
   2. Solution of the Radial Equation
   3. The Generating Function
References
S-STATE MODEL FOR ELECTRON CHANNELING IN HREM
I. INTRODUCTION

A. Why High-Resolution Electron Microscopy?

As scientists manage to control the structure of materials and devices on an ever finer scale, more and more nanostructures with interesting properties are being developed. Parallel to this evolution, we also see an evolution in the understanding and prediction of their properties. In the years to come, materials science and bioscience will gradually evolve from descriptive into designing sciences. If this evolution toward understanding and designing nanostructures is to continue, it is imperative that structure characterization techniques keep pace.

Information about an object can only be obtained from collisions with particles that interact with the object and carry this information to the observer. Only a few types of particles can be used for this purpose: photons, neutrons, and electrons. Other types of particles are more difficult to generate or to handle. For the study of nanostructures such as nanoparticles, which cannot be arranged periodically and for which diffraction methods therefore cannot be used, electrons are much more appropriate as imaging particles because their interaction with atoms is orders of magnitude stronger than that of x-rays and neutrons. Furthermore, of all particles, electrons provide the most information for a given amount of radiation damage (Henderson, 1995). Because electrons are charged, they are easy to deflect in an electrostatic or magnetic field, which makes it possible to construct an electron lens and to combine lenses into an electron microscope.

During the last decade, steady technological improvements gradually pushed the resolution of the electron microscope to below 1 Å. Because of their large kinetic energy, individual electrons can be detected with high efficiency in novel detectors such as charge-coupled device (CCD) cameras, so that all information can be captured and atom positions can be determined with the highest attainable precision. With these resolution and detection capabilities it now becomes possible to resolve the individual atoms in a structure and to refine the atomic structures quantitatively with a precision on the order of 0.01 Å, as required to match the experiment with theoretical calculations. However, this ambitious goal is still hampered by difficulties in the quantitative interpretation of the data.

B. The Need for a New Theory

A quantitative refinement consists of searching for the best fit between simulated and experimental datasets (images and/or diffraction patterns) in which all model parameters (atom coordinates, specimen orientation and
thickness, imaging parameters, etc.) are varied. In fact, one searches for a global optimum in a high-dimensional space. This search is done iteratively, and each step requires a full calculation of the dynamical electron diffraction in the crystal. At present, this is done with standard multislice programs (Zandbergen et al., 1997), which, if repeated thousands of times, present a real bottleneck for flexible applications. A simpler way to calculate the exit wave, and hence the images and diffraction patterns, would speed up the calculations drastically.

Another need for a more efficient description of the diffraction process stems from the fact that it has recently become possible to reconstruct the exit wave of an object at sub-angstrom resolution, either by focal series reconstruction or by off-axis holography (Kisielowski et al., 2001). In order to interpret the amplitude and phase of the exit wave in terms of the mass and position of the projected atom columns, the dynamical scattering of the electrons in the object must be "inverted" to obtain a starting structure, which can then be used as a "seed" for further quantitative structure refinement. Multislice methods, or plane-wave-based methods, are not useful for this purpose because they do not explain on an intuitive basis why, even in a case of highly dynamical scattering, the high-resolution electron microscopy (HREM) exit wave can still be locally related to the projected structure. The classical picture of electrons traversing the crystal as planelike waves in the directions of the Bragg beams, which stems from the x-ray diffraction picture and on which most of the simulation programs are based, is, in fact, misleading.

The physical reason for this "local" dynamic diffraction is the channeling of the electrons along the atom columns parallel to the beam direction. In a zone-axis orientation, where the projected crystal structure is simplest, the atom cores exactly superimpose along the beam direction and hence the scattering is very dynamic. It would therefore be much better to look for a more appropriate quantum mechanical basis to describe the dynamic wave field. For a zone-axis orientation of the specimen in a transmission electron microscope with an accelerating voltage larger than 100 kV, the electrons are trapped in the electrostatic potential of the atom columns parallel to the electron beam. Once trapped in an atom column, the electrons cannot leave because their transverse kinetic energy is too small to escape the electrostatic potential of the atoms in the column. Classically speaking, the electrons oscillate in the column while propagating to the exit face. A simple analog is depicted in Figure 1. If an atom is considered as a small lens and the electron wave as a light wave, the successive atoms periodically
FIGURE 1. From exit wave to structure: channeling theory.
focus and defocus the wave (as in a waveguide). The distance between two focal points is then called the extinction distance. If the crystal thickness is equal to an integer multiple of the extinction distance, the exit wave is identical to the incident wave, and the column in a sense disappears. Inside the column the electrons are not influenced by the neighboring columns. In this way, the wave at the exit face of a crystal can still be expressed in terms of the projected structure, even though the electron scattering within each column can be very dynamic. This effect can be exploited to speed up calculations drastically and to help interpret the exit wave.

C. Survey of Diffraction Theories

1. Thin Objects: The Weak Phase Object Approximation

The weak phase object approximation describes the electron scattering in very thin specimens and in specimens with very light atom columns. The geometric thickness is neglected in this case. Many practical specimens are too thick for this approximation to be quantitatively correct, because it does not properly include the effects of multiple scattering in the specimen foil. However, this approach can provide qualitative insight and is the basis for a more advanced method, namely, the multislice method, which describes the multiple scattering of an electron in a thick specimen foil. The primary interaction between the specimen and the imaging electrons is an interaction between the electrostatic potential of the specimen foil and the charge of the electron. In conventional high-resolution electron microscopy
FIGURE 2. An incident electron plane wave passing through the specimen. The wave function is drawn as lines of constant phase; the specimen is assumed to have a uniform, constant potential. The electron wavelength is reduced by the positive potential inside the specimen.
(CHRTEM) the electrons are primarily described by a single plane wave, which is affected by the specimen foil. The wave function $\psi(x, y, z)$ of a plane wave traveling along the optical axis in the $z$ direction is

$$\psi(x, y, z) = e^{2\pi i z/\lambda}, \tag{1}$$

with $\lambda$ the wavelength of the electron. Because the speed of the electron approaches the speed of light and the rest mass of an electron is small, quantities such as the mass $m$ and the wavelength should be treated relativistically:

$$m = m_0\left(1 + \frac{eV_0}{m_0 c^2}\right), \tag{2}$$

$$\frac{1}{\lambda} = \sqrt{\frac{2 m_0 e V_0}{h^2}\left(1 + \frac{eV_0}{2 m_0 c^2}\right)}, \tag{3}$$
with $m_0$ the rest mass, $e$ the charge of the electron, $V_0$ the potential through which the electron was accelerated, $h$ Planck's constant, and $c$ the speed of light. If the specimen is thin, the deviation of the path of the incident electrons will be small and can be well approximated as a small change in the wavelength of the electrons, caused by their acceleration by a small positive electrostatic potential $V_s$ as they pass through the specimen foil (Figure 2). The change in wavelength in the specimen foil is determined by the mean electrostatic potential $\frac{1}{z}\int_0^z V_s(x, y, z')\,dz'$ averaged along the $z$ direction, and is to a good approximation given by
$$
\begin{aligned}
\frac{1}{\lambda_s(x, y, z)}
&= \sqrt{\frac{2 m_0 e\left(V_0 + \frac{1}{z}\int_0^z V_s(x, y, z')\,dz'\right)}{h^2}\left(1 + \frac{e\left(V_0 + \frac{1}{z}\int_0^z V_s(x, y, z')\,dz'\right)}{2 m_0 c^2}\right)} \\
&= \frac{1}{\lambda}\sqrt{1 + \frac{\int_0^z V_s(x, y, z')\,dz'}{V_0 z}\left(\frac{2 m_0 c^2 + 2 e V_0 + \frac{e}{z}\int_0^z V_s(x, y, z')\,dz'}{2 m_0 c^2 + e V_0}\right)} \\
&= \frac{1}{\lambda}\left[1 + \frac{\sigma\lambda}{2\pi z}\int_0^z V_s(x, y, z')\,dz' + O\!\left(\left(\frac{\int_0^z V_s(x, y, z')\,dz'}{V_0 z}\right)^{2}\right)\right],
\end{aligned}
\tag{4}
$$

with

$$\sigma = \frac{2\pi}{\lambda V_0}\,\frac{m_0 c^2 + e V_0}{2 m_0 c^2 + e V_0} = \frac{2\pi m e \lambda}{h^2}. \tag{5}$$
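As a numerical illustration of Eqs. (2) through (5), the following minimal Python sketch evaluates the relativistic wavelength and the interaction constant $\sigma$, and compares the linearized last line of Eq. (4) with the exact first line. The function names, the CODATA constant values, and the example numbers (200 kV, a mean potential of 20 V) are assumptions made for illustration only; they do not come from the chapter.

```python
import math

# Physical constants in SI units (CODATA values; an assumption of this sketch).
H = 6.62607015e-34       # Planck constant, J s
M0 = 9.1093837015e-31    # electron rest mass, kg
E = 1.602176634e-19      # elementary charge, C
C = 2.99792458e8         # speed of light, m/s


def wavelength(v0):
    """Relativistic electron wavelength in metres for a voltage v0 in volts, Eq. (3)."""
    inv_lam = math.sqrt(2 * M0 * E * v0 / H**2 * (1 + E * v0 / (2 * M0 * C**2)))
    return 1.0 / inv_lam


def sigma(v0):
    """Interaction constant in rad/(V m), first form of Eq. (5)."""
    rest = M0 * C**2
    return 2 * math.pi / (wavelength(v0) * v0) * (rest + E * v0) / (2 * rest + E * v0)


def sigma_from_mass(v0):
    """Second form of Eq. (5), using the relativistic mass of Eq. (2)."""
    m = M0 * (1 + E * v0 / (M0 * C**2))
    return 2 * math.pi * m * E * wavelength(v0) / H**2


def inv_wavelength_linearized(v0, vs_mean):
    """Last line of Eq. (4) without the O(...) term; vs_mean is the mean potential in volts."""
    lam = wavelength(v0)
    return (1 + sigma(v0) * lam * vs_mean / (2 * math.pi)) / lam


if __name__ == "__main__":
    v0 = 200e3  # hypothetical accelerating voltage, V
    print("wavelength [pm]:", wavelength(v0) * 1e12)
    print("sigma [rad/(V m)]:", sigma(v0), sigma_from_mass(v0))
    exact = 1.0 / wavelength(v0 + 20.0)  # first line of Eq. (4) with a 20 V mean potential
    print("relative error of linearization:",
          abs(inv_wavelength_linearized(v0, 20.0) - exact) / exact)
```

The two forms of $\sigma$ agree to rounding error, and for a mean potential of a few tens of volts the linearization error is of second order in $V_s/V_0$.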
This approximation assumes that the electrostatic potential in the specimen foil $V_s$ is much smaller than $V_0$. The electron wave function after interaction with a thin specimen then becomes, by replacing $\lambda$ by $\lambda_s(x, y, z)$ in Eq. (1),

$$\psi(x, y, z) = e^{2\pi i z/\lambda}\, e^{\,i\sigma\int_0^z V_s(x, y, z')\,dz'}, \tag{6}$$
and is called the phase object approximation. If $\int_0^z V_s(x, y, z')\,dz'$ is small, the phase object approximation can be approximated as

$$\psi(\mathbf{r}) = e^{2\pi i z/\lambda}\left[1 + i\sigma\int_0^z V_s(x, y, z')\,dz' + O\!\left(\left(\int_0^z V_s(x, y, z')\,dz'\right)^{2}\right)\right], \tag{7}$$

which is known as the weak phase object approximation.

Because these approximations are only valid for very thin crystals, thinner than the realistic specimen thicknesses used in CHRTEM, more elaborate models are needed to describe the electron-object interaction for thick specimen foils. If the electron interacts strongly with the specimen and can scatter more than once as it passes through the specimen foil, the scattering is said to be dynamic. If the electron can scatter only once, as discussed above, the scattering is said to be kinematic. In principle, for precise calculations including the spin of the electrons, dynamic scattering of electrons must be treated with the relativistic Dirac equation. However, the simpler approach of ignoring the spin and using the nonrelativistic Schrödinger equation with the relativistically correct mass and wavelength is mostly used,
because it is much easier to work with. This approach has been compared with more accurate calculations using the relativistic Dirac equation by Fujiwara (1961), who found that the nonrelativistic Schrödinger equation with the relativistically correct mass and wavelength is usually sufficiently accurate for the typical energy range used in electron microscopy (100–400 keV). Because the potential of the specimen is assumed to be stationary, the three-dimensional (3D) time-independent Schrödinger equation can be used to calculate the electron wave function $\psi(x, y, z)$:

$$\left[-\frac{\hbar^2}{2m}\Delta_{x,y,z} - eV(x, y, z)\right]\psi(x, y, z) = E_0\,\psi(x, y, z), \tag{8}$$

with $\hbar = h/2\pi$, $\Delta_{x,y,z}$ the 3D Laplacian operator, $V(x, y, z)$ the electrostatic potential in the specimen foil, and $E_0 = eV_0$ the kinetic energy of the incident electrons. Because the kinetic energy of the incident electron is several orders of magnitude larger than the potential energy of the specimen foil, the electron wave function can be regarded as a modulation of the incident wave function, which in the case of CHRTEM is a plane wave:

$$\psi(x, y, z) = e^{2\pi i (k_x x + k_y y + k_z z)}\,\Psi(x, y, z), \tag{9}$$

with $\mathbf{k}$ the 3D wave vector of the incident plane wave, defined by $k^2 = k_x^2 + k_y^2 + k_z^2 = 1/\lambda^2$, with $k_x$ and $k_y$ the one-dimensional (1D) wave vectors perpendicular to the optical axis and $k_z$ the 1D wave vector parallel to the optical axis. After substitution of Eq. (9) in Eq. (8), Eq. (8) can be written as

$$-\frac{\hbar^2}{2m}\left[\Delta_{xy} + \frac{\partial^2}{\partial z^2} + 4\pi i\,\mathbf{k}_{xy}\cdot\nabla_{xy} + 4\pi i k_z\frac{\partial}{\partial z}\right]\Psi(x, y, z) - eV(x, y, z)\,\Psi(x, y, z) = 0, \tag{10}$$

with $\nabla_{xy}$ the in-plane gradient operator and $\mathbf{k}_{xy} = \mathbf{k}_x + \mathbf{k}_y$. The motion of the high-energy electrons is predominantly in the forward $z$ direction, meaning that $\Psi(x, y, z)$ changes slowly with $z$. Therefore

$$\left|\frac{\partial^2 \Psi}{\partial z^2}\right| \ll \left|k_z\frac{\partial \Psi}{\partial z}\right|, \tag{11}$$

given that $k_z$ is very large. Equation (10) can then be rewritten as

$$-\frac{\hbar^2}{2m}\left[\Delta_{xy} + 4\pi i\,\mathbf{k}_{xy}\cdot\nabla_{xy} + 4\pi i k_z\frac{\partial}{\partial z}\right]\Psi(x, y, z) - eV(x, y, z)\,\Psi(x, y, z) = 0. \tag{12}$$

Ignoring the term containing the second derivative with respect to $z$ is sometimes referred to as ignoring backscattered electrons or as the "forward-scattering approximation," which is appropriate for high-energy electrons
(Howie and Basinski, 1968; Van Dyck, 1976; Van Dyck and Coene, 1984). This approximation is probably better known as the paraxial approximation to the Schrödinger equation. The Schrödinger equation for fast electrons interacting with the specimen foil, for parallel illumination along a main zone axis ($\mathbf{k}_{xy} = 0$), can be written as a first-order differential equation in $z$:

$$\frac{\partial}{\partial z}\Psi(x, y, z) = \left[\frac{i\lambda}{4\pi}\Delta_{xy} + i\sigma V(x, y, z)\right]\Psi(x, y, z). \tag{13}$$
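The step from Eq. (12) to Eq. (13) can be written out explicitly. The short derivation below is a sketch under two assumptions that are not spelled out in the text: that the prefactor in Eq. (8) is $\hbar^2/2m$ with $\hbar = h/2\pi$, and that $k_z \approx 1/\lambda$ for illumination along the optical axis. With these readings the coefficient of the potential term reproduces $\sigma$ of Eq. (5).

```latex
% Eq. (12) with k_xy = 0:
-\frac{\hbar^2}{2m}\left[\Delta_{xy} + 4\pi i k_z \frac{\partial}{\partial z}\right]\Psi - eV\Psi = 0
% Isolate the z-derivative (note that -1/i = i):
\quad\Longrightarrow\quad
\frac{\partial \Psi}{\partial z}
  = \frac{i}{4\pi k_z}\,\Delta_{xy}\Psi + \frac{i}{4\pi k_z}\,\frac{2me}{\hbar^2}\,V\Psi .
% Insert k_z = 1/\lambda and \hbar = h/(2\pi):
\frac{1}{4\pi k_z} = \frac{\lambda}{4\pi},
\qquad
\frac{1}{4\pi k_z}\,\frac{2me}{\hbar^2}
  = \frac{\lambda}{4\pi}\,\frac{2me\,(2\pi)^2}{h^2}
  = \frac{2\pi m e \lambda}{h^2} = \sigma ,
% which gives Eq. (13):
\frac{\partial \Psi}{\partial z} = \left[\frac{i\lambda}{4\pi}\,\Delta_{xy} + i\sigma V\right]\Psi .
```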
The theory of dynamic electron diffraction has been studied by many scientists throughout the past century and into the present one. In principle, it all comes down to solving the differential Eq. (8), or its approximation Eq. (12), or Eq. (13) in the case $\mathbf{k}_{xy} = 0$. Methods to solve this differential equation can be subdivided into two groups: the multislice method and the Bloch wave method.

Cowley and Moodie (1957) considered the dynamic scattering problem by starting from optics and derived a method that has become known as the multislice method. In this method the specimen is divided into thin two-dimensional (2D) slices along the electron beam direction. The electron beam alternately traverses a slice and propagates to the next slice. If each slice is thin enough, it can be regarded as a phase object, and the propagation between the slices is described by the Fresnel formula.

Bethe (1928) was the first to describe dynamic scattering in the context of electron diffraction. He started from the Schrödinger equation and Fourier expanded the crystal potential and the electron wave function in components that match the underlying periodicity of the crystal lattice. In this way he obtained a set of coupled linear dispersion equations for the plane wave expansion coefficients that can be put in matrix form and, in the forward-scattering approximation, can be reformulated as an eigenvalue problem. The Fourier components of the wave function have since become known as Bloch waves, in analogy with Bloch's theorem in solid-state physics. Since Bethe's original approach, several many-beam electron-diffraction theories have been developed, which are all in a sense reformulations of the original Bethe theory. Howie and Whelan (1961) used a different starting point but concluded with a set of coupled first-order differential equations similar to the Bloch wave method. This approach was extremely valuable for the calculation of the contrast of electron microscopic images in two-beam and three-beam situations. Goodman and Moodie (1974) elucidated the interrelationship between the various existing theories and the multislice method, starting from Schrödinger's equation. Independently, and using a totally different but equivalent approach based on the Feynman path integral formalism of quantum
mechanics, Van Dyck (1975) showed the equivalence between the multislice formulas and the system of Howie and Whelan (1961).

Many simulation programs using the multislice method or the Bloch wave method have been developed in groups throughout the world. The two best-known commercial programs are Mac Tempas (Kilaas, 1987) (multislice) and EMS (Stadelmann, 1987) (multislice and Bloch waves). A noncommercial program, NCEMSS, is available from the internet (see footnote 1). The next two sections discuss the multislice method and the Bloch wave method.

2. Multislice Method

Although the multislice method can be derived for nonparallel illumination ($\mathbf{k}_{xy} \neq 0$), for simplicity the illumination will be assumed to be parallel to a main zone axis. The wave equation for fast electrons [Eq. (13)] in the case $\mathbf{k}_{xy} = 0$ can be written in operator form (Kirkland, 1998) as

$$
\begin{aligned}
\frac{\partial}{\partial z}\Psi(x, y, z) &= \left[A + B(z)\right]\Psi(x, y, z), \\
A &= \frac{i\lambda}{4\pi}\Delta_{xy}, \\
B(z) &= i\sigma V(x, y, z),
\end{aligned}
\tag{14}
$$
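where $A$ and $B$ are noncommuting operators. Equation (14) is, in fact, a mixture of two equations. The first equation is

$$\frac{\partial}{\partial z}\Psi(x, y, z) = B(z)\,\Psi(x, y, z), \tag{15}$$

with solution

$$\Psi(x, y, z + \Delta z) = \exp\left[\int_z^{z+\Delta z} B(z')\,dz'\right]\Psi(x, y, z) \tag{16}$$

$$\phantom{\Psi(x, y, z + \Delta z)} = \exp\left[i\sigma\int_z^{z+\Delta z} V(x, y, z')\,dz'\right]\Psi(x, y, z), \tag{17}$$

which yields the phase object expression given in Eq. (6). The second equation is

$$\frac{\partial}{\partial z}\Psi(x, y, z) = A\,\Psi(x, y, z), \tag{18}$$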
which is a complex diffusion equation with solution

$$\Psi(x, y, z + \Delta z) = \exp(A\,\Delta z)\,\Psi(x, y, z), \tag{19}$$

which yields a propagation effect. Equation (14) has a formal operator solution; after offsetting the start value to $z$ and integrating from $z$ to $z + \Delta z$, this solution is

$$\Psi(x, y, z + \Delta z) = \exp\left[A\,\Delta z + \int_z^{z+\Delta z} B(z')\,dz'\right]\Psi(x, y, z). \tag{20}$$

Footnote 1: Available at http://ncem.lbl.gov/frames/software.htm.

If $A$ and $B$ are noncommuting operators and $\epsilon$ is a small real number, then

$$\exp(\epsilon A + \epsilon B) = \exp(\epsilon A)\exp(\epsilon B) + O(\epsilon^2). \tag{21}$$
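The second-order splitting error in Eq. (21) can be checked numerically on a toy problem. The sketch below is an illustration only: it replaces the operators $A$ and $B$ of Eq. (14) with two small noncommuting real matrices (a hypothetical choice made purely for the demonstration) and uses a hand-written Taylor series for the matrix exponential. Halving $\epsilon$ should reduce the error by roughly a factor of four.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(m, s):
    """Multiply every entry of a 2x2 matrix by the scalar s."""
    return [[s * x for x in row] for row in m]


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def expm(m, terms=30):
    """Matrix exponential of a 2x2 matrix from its truncated Taylor series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(matmul(term, m), 1.0 / n)  # term now holds m^n / n!
        result = add(result, term)
    return result


# Toy noncommuting matrices (stand-ins for the operators A and B).
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]


def splitting_error(eps):
    """Largest entry of exp(eps*A + eps*B) - exp(eps*A) exp(eps*B)."""
    exact = expm(scale(add(A, B), eps))
    split = matmul(expm(scale(A, eps)), expm(scale(B, eps)))
    return max(abs(exact[i][j] - split[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.025):
        print(eps, splitting_error(eps))
```

For these matrices the leading error term is $\tfrac{1}{2}\epsilon^2$ times the commutator $[A, B]$, which is why the printed errors scale quadratically with $\epsilon$.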
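Equation (20) can then be written for small $\Delta z$ as

$$
\begin{aligned}
\Psi(x, y, z + \Delta z) &= \exp(A\,\Delta z)\exp\left[\int_z^{z+\Delta z} B(z')\,dz'\right]\Psi(x, y, z) + O(\Delta z^2) \\
&= p(x, y, \Delta z) \otimes \left[t(x, y, z)\,\Psi(x, y, z)\right] + O(\Delta z^2),
\end{aligned}
\tag{22}
$$

which is discussed in detail in Kirkland (1998). Here $p(x, y, \Delta z)$ is the propagator function in real space for a distance $\Delta z$ and $t(x, y, z)$ is the transmission function, given by

$$p(x, y, \Delta z) = \frac{1}{i\lambda\Delta z}\exp\left[\frac{i\pi}{\lambda\Delta z}\left(x^2 + y^2\right)\right], \tag{23}$$

$$t(x, y, z) = \exp\left[i\sigma\int_z^{z+\Delta z} V(x, y, z')\,dz'\right]. \tag{24}$$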
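The transmit-and-propagate recursion of Eqs. (22) through (24) can be sketched in a few lines of Python. The example below is a simplified one-dimensional illustration under stated assumptions, not the implementation used by any of the programs cited in this chapter: the convolution with $p$ is carried out as a multiplication in Fourier space by $\exp(-i\pi\lambda\Delta z\,k^2)$, which is what $\partial\Psi/\partial z = A\Psi$ gives for a single spatial frequency $k$; the small FFT routine, the Gaussian "column" potential, and all numerical values are hypothetical.

```python
import cmath
import math


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def ifft(a):
    """Inverse FFT including the 1/n normalisation."""
    n = len(a)
    return [v / n for v in fft(a, inverse=True)]


def gaussian_column(n, dx, peak, width):
    """Hypothetical projected potential of one slice (V m), centred in the cell."""
    return [peak * math.exp(-(((i - n / 2) * dx) / width) ** 2) for i in range(n)]


def multislice_1d(psi0, projected_potentials, sigma, wavelength, dz, dx):
    """Apply Eq. (22) once per slice: multiply by t, then propagate over dz."""
    n = len(psi0)
    # Spatial frequencies in FFT order (periodic cell of length n*dx).
    freqs = [(k if k < n // 2 else k - n) / (n * dx) for k in range(n)]
    propagator = [cmath.exp(-1j * math.pi * wavelength * dz * f * f) for f in freqs]
    psi = list(psi0)
    for vz in projected_potentials:
        # Transmission function of Eq. (24): a pure phase factor per pixel.
        psi = [cmath.exp(1j * sigma * v) * p for v, p in zip(vz, psi)]
        # Propagation: convolution with p, done as a product in Fourier space.
        spectrum = fft(psi)
        psi = ifft([pk * sk for pk, sk in zip(propagator, spectrum)])
    return psi


if __name__ == "__main__":
    n, dx, dz = 64, 1e-11, 2e-10             # grid and slice thickness in metres
    sigma, lam = 7.288e6, 2.508e-12          # roughly the 200 kV values of Eq. (5)
    slices = [gaussian_column(n, dx, 2e-8, 5e-11)] * 20
    psi_exit = multislice_1d([1 + 0j] * n, slices, sigma, lam, dz, dx)
    print("total intensity:", sum(abs(p) ** 2 for p in psi_exit))
```

Because both the transmission function and the Fourier-space propagator have unit modulus, the total intensity is conserved from slice to slice, which gives a simple sanity check on any implementation of the scheme.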
The symbol $\otimes$ denotes a convolution product. Thus, if the initial wave function $\Psi(x, y, 0)$ and the potential distribution of the specimen are given, the electron wave function at any depth $z$ can be calculated by repeated application of Eq. (22). The specimen is first divided into many thin slices, as in Figure 3. At each slice the electron wave function experiences a phase shift due to the projected potential of all atoms in the slice, as shown in Section I.C.1, and is then propagated to the next slice. More accurate
FIGURE 3. Multislice decomposition of a thick specimen. (a) The specimen divided into thin slices. (b) Each slice is treated as a transmission step (solid line) followed by a propagation step (vacuum between the slices).
solutions can be obtained by extending the Taylor expansion in Eq. (21) (Chen, 1997; Van Dyck, 1979). Note that the multislice scheme can also provide a more general solution of Eq. (8) (Chen, 1997). In principle, the multislice method solves the problem of the propagation of a quantum mechanical wave packet through a potential, which is a much more general problem. It is therefore not surprising that similar methods have evolved in fields other than electron microscopy.

3. Bloch Wave Method

The Bloch wave method directly solves the 3D time-independent Schrödinger Eq. (8). It makes use of the Bloch theorem (see, for instance, Kittel, 1996), which states that a particular solution for the motion of the electron in a periodic potential $V(\mathbf{r})$ is of the form

$$\psi(\mathbf{r}) = b(\mathbf{K}, \mathbf{r})\exp(2\pi i\,\mathbf{K}\cdot\mathbf{r}), \tag{25}$$
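where $b(\mathbf{K}, \mathbf{r})$ has the periodicity of the potential, $\mathbf{K}$ is the Bloch wave vector, and $\mathbf{r}$ is a 3D position vector. As a result, $b(\mathbf{K}, \mathbf{r})$ can be expanded in a Fourier series of plane waves with wave vectors $\mathbf{g}$, similar to $V(\mathbf{r})$:

$$b(\mathbf{K}, \mathbf{r}) = \sum_{\mathbf{g}} C_{\mathbf{g}}^{\mathbf{K}}\exp(2\pi i\,\mathbf{g}\cdot\mathbf{r}), \tag{26}$$

$$V(\mathbf{r}) = \sum_{\mathbf{g}} V_{\mathbf{g}}\exp(2\pi i\,\mathbf{g}\cdot\mathbf{r}), \tag{27}$$

with $C_{\mathbf{g}}^{\mathbf{K}}$ the Bloch wave coefficients. Hence the solution of the Schrödinger equation is the sum of the particular solutions

$$\psi(\mathbf{r}) = \sum_{\mathbf{K}} c_{\mathbf{K}}\sum_{\mathbf{g}} C_{\mathbf{g}}^{\mathbf{K}}\exp\left(2\pi i\,(\mathbf{K} + \mathbf{g})\cdot\mathbf{r}\right). \tag{28}$$

After substitution of Eq. (28) in Eq. (8), Eq. (8) can be rewritten as

$$\left[k^2 - (\mathbf{K} + \mathbf{g})^2\right]C_{\mathbf{g}}^{\mathbf{K}} + \frac{2me}{h^2}\sum_{\mathbf{h}} V_{\mathbf{g}-\mathbf{h}}\,C_{\mathbf{h}}^{\mathbf{K}} = 0, \tag{29}$$

in which the unknowns are $C_{\mathbf{g}}^{\mathbf{K}}$ and $\mathbf{K}$. A nontrivial solution of Eq. (29) is obtained when $\det(M) = 0$, with $M$ equal to

$$M_{\mathbf{g}\mathbf{h}} = \delta(\mathbf{g} - \mathbf{h})\left[k^2 - (\mathbf{K} + \mathbf{g})^2\right] + \frac{2me}{h^2}V_{\mathbf{g}-\mathbf{h}}, \tag{30}$$

or

$$a_{2n}|\mathbf{K}|^{2n} + a_{2n-1}|\mathbf{K}|^{2n-1} + \ldots + a_0 = 0, \tag{31}$$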
which is known as the dispersion relation of the dynamic theory. It determines the possible $\mathbf{K}$ vectors and shows that $2n$ particular Bloch wave solutions $b(\mathbf{K}, \mathbf{r})$ exist. It can be shown that $n$ solutions describe backscattered waves and $n$ solutions describe forward-scattered waves. To solve Eq. (29), it is forced into the form of an eigenvalue equation by writing

$$\mathbf{K} = \mathbf{k} + \gamma^{\mathbf{K}}\,\mathbf{n}, \tag{32}$$

where $\gamma^{\mathbf{K}}$ is the eigenvalue of the Bloch wave with wave vector $\mathbf{K}$ and $\mathbf{n}$ is the outward pointing normal to the entrance surface. Substitution of Eq. (32) into Eq. (29) and neglecting $(\gamma^{\mathbf{K}})^2$ brings Eq. (29) into the form of a linear eigenvalue problem:

$$\left[k^2 - (\mathbf{k} + \mathbf{g})^2\right]C_{\mathbf{g}}^{\mathbf{K}} + \frac{2me}{h^2}\sum_{\mathbf{h}} V_{\mathbf{g}-\mathbf{h}}\,C_{\mathbf{h}}^{\mathbf{K}} = 2\gamma^{\mathbf{K}}\left[(\mathbf{k} + \mathbf{g})\cdot\mathbf{n}\right]C_{\mathbf{g}}^{\mathbf{K}}. \tag{33}$$

This approximation neglects backscattered waves (Lewis et al., 1978). Each eigenvalue $\gamma^{\mathbf{K}}$ has its associated eigenvector $[C^{\mathbf{K}}]$. The constants $c_{\mathbf{K}}$ must be obtained from boundary conditions and therefore depend on the shape of the crystal.

D. Conclusion

The multislice method and the Bloch wave method were historically the first accurate methods to describe the dynamic scattering of an electron in a specimen foil. However, they do not provide intuitive physical insight in real space, and multislice and Bloch wave programs are often used as black boxes. For example, it is hard to explain, using the classic picture of electrons crossing the specimen foil as planelike waves in the directions of the Bragg beams, why the exit wave and the projected structure are related to each other. This classic picture stems, in fact, from x-ray diffraction. This makes the Bloch wave method and the multislice method less suitable to "invert" the multiple electron scattering in the specimen foil. In a fitting procedure, the derivatives of these numerical iterative methods have to be calculated by a finite difference expansion (Jansen et al., 1998), which may lead to tedious calculations. However, because of their accuracy, the Bloch wave method and the multislice method are very well suited for the refinement step. In this step the structure is refined, starting from a start structure determined in the resolving step, in order to reach the required precision. In the refinement step tedious calculations are not necessary because the resolving step provides a seed that is close enough to the final solution.
124
GEUENS AND VAN DYCK
The next sections propose an alternative, the S-state model, which is based on the S-state of the channeling theory and describes the dynamic scattering in an atom column and in a specimen foil. The S-state model provides an approximate analytical description, which will be used in the resolving step to "invert" the multiple scattering in the specimen foil and so obtain a start structure that serves as a seed for the refinement step.
II. THE CHANNELING THEORY

A. Introduction

Plane wave–based methods such as the multislice method and the Bloch wave method are not useful for inverting the process of multiple scattering of an electron in the specimen foil. They do not explain on an intuitive basis why, even in the case of highly dynamic scattering, the HRTEM exit wave is still locally related to the projected structure. The physical reason for this local dynamic diffraction is the channeling of the electrons along the atom columns parallel to the beam direction. Due to the positive electrostatic potential of the atoms, an atom column acts as a guide or channel for an electron. In an atom column the electron can scatter dynamically without leaving it. The S-state of the channeling theory (Nellist and Pennycook, 1999; Op de Beeck and Van Dyck, 1991, 1995, 1996; Sinkler and Marks, 1999; Van Dyck, 1989; Van Dyck and Op de Beeck, 1996) describes this effect and thus provides such physical insight.

The principle of the S-state of the channeling theory is based on the expansion of the electron wave function in eigenfunctions of the atom column potential averaged along the atom column. It turns out that this basis is so effective that the scattering of the electron can be described fairly well using only the so-called 1S eigenfunction. This model will be named the 1S-state model. The electron wave function can then be represented by a simple analytic expression, which allows fast calculation and provides analytical derivatives with respect to the parameters. The model also explains why the motion of the electron along the atom column is nearly periodic as a function of the depth in the specimen foil. Because of its simplicity, the method has the potential to become a workhorse for HRTEM. It permits interpretation of the reconstructed electron wave function directly in terms of the projected structure, yielding an approximate structure model that can then be used as a start for quantitative refinement. Furthermore, it is valid even for crystal defects (dislocations, translation interfaces, etc.) as long as the atoms are aligned in columns in a direction close to the beam direction.
S-STATE MODEL FOR ELECTRON CHANNELING IN HREM
125
In this section, the main focus is on the S-state model for an isolated atom column. In the following sections, the S-state model for an assembly of atom columns and the S-state model in the case of tilted illumination or crystal tilt are discussed.

The concept of channeling is not new. It describes the tendency of a beam of charged particles to run along the paths of lowest potential energy in the crystal. For positively charged particles these paths are the empty tunnels between the atom strings. For electrons they run along the nuclei, as stated earlier. For electrons and positrons channeling is a quantum mechanical effect, whereas for ion beams it may be described using classical concepts. When a high-energy beam of charged particles hits a target, a very large number of elastic and inelastic processes occur. These may include the ionization of atoms, the excitation of atoms, the excitation of valence band electrons, Rutherford scattering, Bragg diffraction, nuclear reactions, x-ray emission, Auger electron production, phonon excitation, and many other processes. The finding that the rates of these processes depend on the direction of the incident beam in the case of a crystalline specimen is a result of channeling. Channeling may thus also be studied by measuring the intensity of secondary emissions as a function of the incident beam direction (Spence, 1992). Although the possibility of the effect was pointed out very early by Stark and Wendt (1912), it was demonstrated experimentally 50 years later by Rol et al. (1960), when the result of ion sputtering was found to depend on the orientation of the target crystal. Lindhard (1965) pointed out that classical theory can still hold for a series of collisions, each of which cannot be treated classically. The quantum mechanical treatment of the channeling of energetic electron beams was developed by Tamura and Ohtsuki (1974) and Fujimoto (1978). The theory of diffraction channeling was given in a form applicable to experiments with fast electrons and positrons by Howie (1966). The analogy with atomic wave functions has been developed extensively by Buxton et al. (1978) for the interpretation of convergent beam electron diffraction (CBED) patterns.

B. The S-State Model for Channeling

The main approximation made in the S-state of the channeling theory is that the potential energy felt by an electron in the foil can be assumed to be proportional to the 2D atom column potential averaged along the atom column direction, U(x, y) (Appendix A). This is justified by the high energy of the electrons used in CHRTEM, because of which they do not sense the alternating acceleration and deceleration in the successive atoms of the atom column. Assuming the potential to be averaged along the atom column is equivalent to neglecting the higher-order Laue zones. In this sense, it is a high-energy
approximation, suitable for situations in which the incident beam direction is parallel or close to parallel to a main zone-axis. Equation (12) for parallel illumination along a main zone-axis ($k_{xy} = 0$) can then be rewritten as

$$-\frac{2\pi\hbar^2 k_z}{m i}\frac{\partial}{\partial z}\Psi(x,y,z) = \left[-\frac{\hbar^2}{2m}\Delta_{xy} - eU(x,y)\right]\Psi(x,y,z) = H\Psi(x,y,z), \tag{34}$$

with $H$ the Hamiltonian. Because $H$ does not depend on $z$, the wave function $\Psi(x,y,z)$ can be written as a series of products of $(x,y)$-dependent eigenfunctions $\phi_{nm}(x,y)$ of the Hamiltonian and $z$-dependent phase factors

$$\Psi(x,y,z) = \sum_{nm} c_{nm}\,\phi_{nm}(x,y)\exp\left[-i\pi\frac{E_{nm}}{E_0}k_z z\right]. \tag{35}$$

Note the similarity with the solution of the 2D time-dependent Schrödinger equation (with $t = mz/(2\pi\hbar k_z)$), which describes the time-dependent motion of an electron in a stationary 2D potential. The $c_{nm}$ are the excitation coefficients of the eigenfunctions $\phi_{nm}(x,y)$ of the Hamiltonian with eigenenergies $E_{nm}$, and $E_0$ is the kinetic energy of the incident electron. Substitution of this solution in Eq. (34) results in an eigenvalue problem

$$H\phi_{nm}(x,y) = E_{nm}\,\phi_{nm}(x,y), \tag{36}$$

with $n$ and $m$ the principal and angular quantum number, respectively. The quantum numbers classify the bound eigenfunctions in a similar way as the eigenfunctions of the 2D quantum harmonic oscillator. The same restrictions on $m$ are valid as for the 2D quantum harmonic oscillator, that is, $m = -n, -n+2, \ldots, n-2, n$ with $n = 0, 1, 2, \ldots$ an integer. Equation (35) can alternatively be written as

$$\Psi(x,y,z) = \sum_{nm} c_{nm}\,\phi_{nm}(x,y)\left[1 - i\pi\frac{E_{nm}}{E_0}k_z z\right] + \sum_{nm} c_{nm}\,\phi_{nm}(x,y)\left[\exp\left(-i\pi\frac{E_{nm}}{E_0}k_z z\right) - 1 + i\pi\frac{E_{nm}}{E_0}k_z z\right]. \tag{37}$$

The excitation coefficients $c_{nm}$ are determined from the boundary condition at $z = 0$, which is equal to

$$\sum_{nm} c_{nm}\,\phi_{nm}(x,y) = \Psi(x,y,0) = 1 \tag{38}$$

in the case of plane wave incidence. From Eq. (36) it can be concluded that

$$\sum_{nm} c_{nm}\,\phi_{nm}(x,y)\,E_{nm} = H\Psi(x,y,0) = H\cdot 1 = -eU(x,y). \tag{39}$$
Using the boundary condition and Eq. (39), Eq. (35) can be rewritten as

$$\Psi(x,y,z) = 1 + i\pi\frac{eU(x,y)}{E_0}k_z z + \sum_{nm} c_{nm}\,\phi_{nm}(x,y)\left[\exp\left(-i\pi\frac{E_{nm}}{E_0}k_z z\right) - 1 + i\pi\frac{E_{nm}}{E_0}k_z z\right]. \tag{40}$$

The wave function is now described in terms of the eigenfunctions $\phi_{nm}(x,y)$ of the 2D Hamiltonian, with eigenenergies $E_{nm}$. The first two terms yield the weak phase approximation given by Eq. (7). In the third term only those states appear in the summation for which

$$|E_{nm}| \geq \frac{E_0}{\pi k_z z}. \tag{41}$$

If the object is very thin, so that no eigenfunction obeys Eq. (41), the weak phase approximation is valid. For thicker objects, only bound states with very deep energy levels, which are localized near the atom column cores, will appear. Furthermore, a 2D projected atom column potential has only a few strongly bound eigenfunctions, and when the overlap between adjacent atom columns is small, only the radially symmetric eigenfunctions will be excited. In practice, for most types of atom columns, only one eigenfunction appears, namely $\phi_{00}(x,y)$, which can be compared with the 1S eigenfunction of a hydrogen atom. In the case of an isolated atom column labeled $j$, taking the origin in the center of the atom column, the electron wave function is given by

$$\Psi(x,y,z) = 1 + i\pi\frac{eU^{j}(x,y)}{E_0}k_z z + \sum_{nm} c^{j}_{nm}\,\phi^{j}_{nm}(x,y)\left[\exp\left\{-i\pi\frac{E^{j}_{nm}}{E_0}k_z z\right\} - 1 + i\pi\frac{E^{j}_{nm}}{E_0}k_z z\right]. \tag{42}$$

An interesting consequence of this description is that, because the eigenfunctions $\phi^{j}_{nm}(x,y)$ are very localized at the atom column cores, the wave function for the total crystal can be expressed as a superposition of the individual atom column eigenfunctions as

$$\Psi(x,y,z) = 1 + \sum_{j}\left( i\pi\frac{eU^{j}(x-x_j,\,y-y_j)}{E_0}k_z z + \sum_{nm} c^{j}_{nm}\,\phi^{j}_{nm}(x-x_j,\,y-y_j)\left[\exp\left\{-i\pi\frac{E^{j}_{nm}}{E_0}k_z z\right\} - 1 + i\pi\frac{E^{j}_{nm}}{E_0}k_z z\right]\right). \tag{43}$$
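If all eigenfunctions other than $\phi^{j}_{00}$ have very small eigenenergies, that is,

$$|E_{nm}| \ll \frac{E_0}{\pi k_z z}, \tag{44}$$

then Eq. (43) simplifies to

$$\Psi(x,y,z) = 1 + \sum_{j}\sum_{nm} c^{j}_{nm}\,\phi^{j}_{nm}(x-x_j,\,y-y_j)\left[\exp\left\{-i\pi\frac{E^{j}_{nm}}{E_0}k_z z\right\} - 1\right] \tag{45}$$

$$\phantom{\Psi(x,y,z)} = 1 + \sum_{j}\sum_{nm} 2c^{j}_{nm}\,\sin\left(\pi\frac{E^{j}_{nm}}{E_0}\frac{k_z}{2}z\right)\phi^{j}_{nm}(x-x_j,\,y-y_j)\exp\left\{-i\pi\left(\frac{E^{j}_{nm}}{E_0}\frac{k_z}{2}z + \frac{1}{2}\right)\right\}. \tag{46}$$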
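As a rough numerical illustration of the depth dependence in Eq. (45), the sketch below evaluates the period of the phase factor $\exp[-i\pi(E/E_0)k_z z]$, which is $2E_0/(|E|k_z)$, and the on-axis intensity of a single column with one bound state. Several choices here are assumptions of the sketch and not taken from the text: $k_z$ is set to the inverse relativistic wavelength, $E_0$ is set to the nominal beam energy, the bound-state energies are entered as negative numbers, and `amplitude` is a hypothetical stand-in for the product $c_{00}\phi_{00}(0,0)$.

```python
import math

HC_EV_NM = 1239.841984   # h*c in eV nm
M0C2_EV = 510998.95      # electron rest energy in eV


def electron_wavelength_nm(e0_ev):
    """Relativistic de Broglie wavelength for a kinetic energy e0_ev (in eV)."""
    return HC_EV_NM / math.sqrt(e0_ev * (e0_ev + 2.0 * M0C2_EV))


def channeling_period_nm(e0_ev, e_state_ev):
    """Depth period of exp[-i*pi*(E/E0)*k_z*z], assuming k_z = 1/wavelength."""
    k_z = 1.0 / electron_wavelength_nm(e0_ev)
    return 2.0 * e0_ev / (abs(e_state_ev) * k_z)


def column_intensity(z_nm, e0_ev, e_state_ev, amplitude):
    """|Psi|^2 on the column axis for a single bound state, following Eq. (45).

    amplitude is a hypothetical value for c_00 * phi_00(0, 0).
    """
    k_z = 1.0 / electron_wavelength_nm(e0_ev)
    phase = -math.pi * (e_state_ev / e0_ev) * k_z * z_nm
    psi = 1.0 + amplitude * (complex(math.cos(phase), math.sin(phase)) - 1.0)
    return abs(psi) ** 2


# Bound-state energies of 250 eV and 80 eV (taken as negative) at 200 keV
period_au = channeling_period_nm(200e3, -250.0)
period_mn = channeling_period_nm(200e3, -80.0)
print(f"period (|E| = 250 eV): {period_au:.2f} nm")
print(f"period (|E| = 80 eV):  {period_mn:.2f} nm")
```

Under these assumptions the two periods come out near 4 nm and 12.5 nm, which is of the same order as the periodicities quoted for the Au and Mn columns in the discussion of Figure 4.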
Equations (43) and (45) are the basic results of the S-state of the channeling theory. Equation (45) is named the S-state model because it depends only on the so-called 1S eigenfunction of the atom columns. The interpretation of the S-state model is simple. Each atom column j acts as a channel in which the wave function oscillates periodically with depth. The periodicity is related to the "weight" of the atom column (i.e., proportional to the atomic number of the atoms in the column and inversely proportional to the interatomic distance between the atoms in the atom column). The importance of these results is that they describe the dynamic diffraction for larger thicknesses than the usual phase grating approximation and that they require the knowledge of only one function, namely the 1S eigenfunction, per atom column. Furthermore, even in the presence of dynamic scattering, the wave function at the exit face still retains a one-to-one relation with the configuration of the atom columns. Hence the description is very useful for interpreting high-resolution images and provides a possible answer to the direct retrieval problem as discussed in the introduction. Equation (45) applies to light atom columns, such as Si[111]0 or Cu[100], with a medium acceleration voltage. When the atom columns are "heavier" and the acceleration voltage is higher, which because of the relativistic correction also increases the effective strength of the atom column potential, then Eq. (43) must be used. This is the case for Au[100] at larger thicknesses, for example. Figure 4 shows the electron density $|\Psi(x,y,z)|^2$ as a function of depth in an Au4Mn alloy crystal for 200-keV incident electrons. The corners represent the projection of the Mn atom columns. The square in the center represents the four Au atom columns. The distance between adjacent atom columns is 0.2 nm. The periodicity along the direction of the atom column is
FIGURE 4. The electron density as function of depth in Au4Mn. The corners represent the projection of the Mn atom column. The square in the center represents the four Au atom columns.
0.4 nm. From these results it is clear that the electron density in each atom column fluctuates nearly periodically with depth. For Au this periodicity is approximately 4 nm, and for Mn, 13 nm. These periodicities are nearly the same as those for isolated atom columns, so that the influence of neighboring atom columns is still small in this case. The energies of the 1S eigenfunctions are, respectively, 250 eV and 80 eV. When the atoms are heavier and the accelerating voltage is very high (0.5 to 1 MeV), more rotationally symmetric eigenfunctions become important, which makes the wave function more complicated. When a crystal is viewed along a high-indexed zone-axis, the distance between adjacent atom columns decreases and the weight of the atom columns also decreases. Hence the bound eigenfunctions broaden, and overlap between adjacent atom column eigenfunctions starts to occur. This can be incorporated in the theory by using perturbation theory. When the overlap between the eigenfunctions of the atom columns is too large, they have to be considered as molecules. The localization can be improved by using higher accelerating voltages. It is interesting to note that channeling is usually described in terms of Bloch waves (Berry and Mount, 1972; Kambe et al., 1974). However, as follows from the foregoing, channeling is not a mere consequence of the periodicity of the crystal but occurs even in an isolated atom column parallel to the beam direction. In fact, even for an isolated atom column, the problem can be treated mathematically by making the column artificially periodic so as to generate a basis of functions (Bloch functions) in which to expand the wave function. In this view, the Bloch character is of only mathematical importance. This is the case even in a crystal, in which Bloch wave calculations then yield the same 1S eigenfunctions as found in our simplified
treatment. Only when the overlap between atom column eigenfunctions and potentials increases or when the beam is inclined, do the other Bloch states become physically important. Because the channeling is a consequence of the atom column structure and not the crystal periodicity, it is also valid in the presence of defects, if the atom columns parallel to the beam direction are not disrupted. The next section describes how to calculate the eigenfunctions of an electron in an isolated atom column. It should be noted that the reader can skip this part and proceed with Section IV.

III. CALCULATION OF THE EIGENFUNCTIONS OF AN ELECTRON IN AN ISOLATED ATOM COLUMN
This section discusses in more detail how to calculate the eigenfunctions of an electron in an isolated atom column. To be correct, Eq. (35) will be written as follows:

$$\Psi(x,y,z) = \sum_{nm;\,E_{nm}<0} c_{nm}\,\phi_{nm}(x,y)\exp\left[-i\pi\frac{E_{nm}}{E_0}k_z z\right] + \int_0^{\infty} c(\xi)\,\phi(x,y,\xi)\exp\left[-i\pi\frac{\xi}{E_0}k_z z\right]d\xi. \tag{47}$$

Here a distinction is made between the bound eigenfunctions, which have a discrete eigenenergy $E_{nm}$, and the continuum eigenfunctions, which have a continuous eigenenergy $\xi$. The functions $\phi_{nm}(x,y)$ and $\phi(x,y,\xi)$ are, respectively, the solutions of the following equations:

$$H\phi_{nm}(x,y) = E_{nm}\,\phi_{nm}(x,y) \quad\text{with}\quad E_{nm} < 0, \tag{48}$$

$$H\phi(x,y,\xi) = \xi\,\phi(x,y,\xi) \quad\text{with}\quad \xi \geq 0. \tag{49}$$
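The excitation coefficients $c_{nm}$ and $c(\xi)$ are determined from the boundary condition at $z = 0$:

$$\sum_{nm} c_{nm}\,\phi_{nm}(x,y) + \int_0^{\infty} c(\xi)\,\phi(x,y,\xi)\,d\xi = \Psi(x,y,0) = 1. \tag{50}$$

Using this boundary condition, Eq. (47) can be rewritten as

$$\Psi(x,y,z) = 1 + \sum_{nm;\,E_{nm}<0} c_{nm}\,\phi_{nm}(x,y)\left[\exp\left(-i\pi\frac{E_{nm}}{E_0}k_z z\right) - 1\right] + \int_0^{\infty} c(\xi)\,\phi(x,y,\xi)\left[\exp\left(-i\pi\frac{\xi}{E_0}k_z z\right) - 1\right]d\xi \tag{51}$$

$$\phantom{\Psi(x,y,z)} = 1 + \sum_{nm} 2c_{nm}\sin\left(\pi\frac{E_{nm}}{E_0}\frac{k_z}{2}z\right)\phi_{nm}(x,y)\exp\left[-i\pi\left(\frac{E_{nm}}{E_0}\frac{k_z}{2}z + \frac{1}{2}\right)\right] + \int_0^{\infty} 2c(\xi)\sin\left(\pi\frac{\xi}{E_0}\frac{k_z}{2}z\right)\phi(x,y,\xi)\exp\left[-i\pi\left(\frac{\xi}{E_0}\frac{k_z}{2}z + \frac{1}{2}\right)\right]d\xi. \tag{52}$$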
The wave function is now described in terms of the bound eigenfunctions $\phi_{nm}(x,y)$ with eigenenergies $E_{nm}$ and the continuum eigenfunctions $\phi(x,y,\xi)$ with eigenenergies $\xi$, all eigenfunctions of the 2D Hamiltonian. Because the atom columns are regarded as isolated, the system possesses rotational symmetry, that is, all directions in the plane are equivalent. The potential of such a system depends only on the radial distance $r$ from a suitably chosen origin; this means $U(x,y) = U(r)$ with $r = |\mathbf{x} + \mathbf{y}|$. Such a problem is known in physics as a central force problem. In this case, the solutions of Eq. (48) and Eq. (49) may be factored, respectively, as

$$\phi_{nm}(x,y) = R_{nm}(r)\,\Phi_m(\varphi) \tag{53}$$

and

$$\phi(x,y,\xi) = R(r,\xi)\,\Phi_m(\varphi), \tag{54}$$

with $\Phi_m(\varphi)$ the angular function, a solution of the angular equation

$$\frac{\partial^2}{\partial\varphi^2}\Phi_m(\varphi) = -m^2\,\Phi_m(\varphi), \tag{55}$$

and $R_{nm}(r)$ and $R(r,\xi)$ the radial functions, which are, respectively, solutions of the radial equation of the bound eigenfunctions

$$-\left\{\frac{\hbar^2}{2m}\left[\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2}\right] + eU(r)\right\}R_{nm}(r) = E_{nm}\,R_{nm}(r), \tag{56}$$

and the radial equation of the continuum eigenfunctions

$$-\left\{\frac{\hbar^2}{2m}\left[\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2}\right] + eU(r)\right\}R(r,\xi) = \xi\,R(r,\xi). \tag{57}$$

Here $m$ in the term $m^2/r^2$ is a separation constant, not to be confused with the electron mass $m$ in the prefactor $\hbar^2/2m$. The solutions of these equations are discussed in the following paragraphs.
A. The Angular Equation

Equation (55) is known from the elementary theory of ordinary differential equations. Two elementary independent solutions are $e^{im\varphi}$ and $e^{-im\varphi}$. Therefore, $\Phi_m(\varphi)$ can be written as

$$\Phi_m(\varphi) = \frac{1}{\sqrt{2\pi}}\,e^{im\varphi}. \tag{58}$$

If $m$ is an integer ($m = 0, \pm 1, \pm 2, \ldots$), this set of functions is orthonormal. Standing wave representations of $\Phi_m(\varphi)$ can be formed by defining appropriate linear combinations. For example, for $m = \pm 1$ the standing wave representations $\Phi_{1x}(\varphi)$ and $\Phi_{1y}(\varphi)$ are defined as

$$\Phi_{1x}(\varphi) = \frac{1}{\sqrt{\pi}}\cos\varphi, \qquad \Phi_{1y}(\varphi) = \frac{1}{\sqrt{\pi}}\sin\varphi. \tag{59}$$

B. The Radial Equation of the Bound Eigenfunctions

The radial equation of the bound eigenfunctions is much more difficult to solve than the angular equation. In contrast with the angular equation, there is no analytical solution. Nevertheless, in Op de Beeck (1994) and Op de Beeck and Van Dyck (1995), an approximate analytical solution for the lowest eigenfunction ($n = 0$, $m = 0$) was presented. Alternatively, Eq. (56) can be solved numerically. In the past, various attempts were made to do this. These attempts fall mainly into two groups: finite difference methods (Op de Beeck and Van Dyck, 1995; Sinkler and Marks, 1999) and expansion of the bound eigenfunctions in a set of basis functions (Op de Beeck and Van Dyck, 1995). Both methods have particular advantages and disadvantages, which are discussed below. First, the finite difference expansion proposed by Op de Beeck (1994) and Op de Beeck and Van Dyck (1995) is reviewed briefly and discussed, after which an alternative finite difference expansion is proposed. Second, the expansion of the bound eigenfunctions in a set of basis functions, namely Bessel functions of the first kind, by Op de Beeck (1994) and Op de Beeck and Van Dyck (1995) is reviewed and discussed. Third, a new set of basis functions is introduced, namely the eigenfunctions of the 2D quantum harmonic oscillator. Finally, a method is presented that is based on Fourier analysis of the periodic behavior of the wave function as a function of thickness, from which the eigenenergies of the excited bound eigenfunctions can be determined.
C. Solutions of the Radial Equation of the Bound Eigenfunctions: Finite Difference Methods

If one tries to solve differential Eq. (56) by expansion in finite differences, one is faced with divergence problems of the term

$$\frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2} \tag{60}$$

in the origin. The first term can be reduced to a second derivative using l'Hôpital's rule and the boundary condition

$$\lim_{r\to 0}\frac{\partial}{\partial r}R_{nm}(r) = 0, \tag{61}$$

keeping the rotational symmetry in mind, so that the apparent divergence can be avoided. Divergence of the second term can only be avoided for $m = 0$. However, this puts restrictions on the eigenfunctions that can be calculated by this method. In Op de Beeck and Van Dyck (1995), two other numerical problems are reported. The first problem is that the radial range must be restricted to the interval $[0, L]$ in order to limit the number of sampling points, and hence the order of the eigenmatrix. This implies that only those bound eigenfunctions that are sufficiently damped can be calculated correctly. The second problem is that the equation is not defined for negative values of $r$, so that asymmetric difference formulae must be used to approximate the derivatives in the neighborhood of the boundaries. Their use destroys the symmetry of the eigenmatrix, and hence its hermiticity, so that the eigensolutions of the matrix will not necessarily form a complete basis.

The divergence problem of the term given in Eq. (60) in the origin for $m \neq 0$, as well as the nondefined negative values of the radial solution, can be solved if the differential Eq. (48) is expanded in finite differences and $eU(x,y)$ and $\phi_{nm}(x,y)$ are sampled discretely at sampling points $(i\Delta y, j\Delta x)$, with $i$ the row index, $j$ the column index, $\Delta y = 2L_y/N_y$ and $\Delta x = 2L_x/N_x$, and with $N_y$ and $N_x$ the respective number of sampling points in the $y$ and $x$ direction. The Laplacian term can be expanded in finite differences as

$$-\frac{\hbar^2}{2m}\Delta_{xy}\phi_{nm}(i,j) = -\frac{\hbar^2}{2m}\left[\frac{\phi_{nm}(i,j-1) - 2\phi_{nm}(i,j) + \phi_{nm}(i,j+1)}{(\Delta x)^2} + O\!\left((\Delta x)^2\right) + \frac{\phi_{nm}(i-1,j) - 2\phi_{nm}(i,j) + \phi_{nm}(i+1,j)}{(\Delta y)^2} + O\!\left((\Delta y)^2\right)\right]. \tag{62}$$
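The five-point stencil of Eq. (62) can be turned into a sparse eigenvalue problem in a few lines. The following is a minimal sketch, not the authors' implementation: it uses SciPy's `eigsh`, which wraps ARPACK, works in units with $\hbar = m = 1$, places the grid points with `linspace` (so the spacing is $2L/(N-1)$ rather than $2L/N$), and is validated on a 2D harmonic oscillator potential, whose exact levels 1, 2, 2, ... are known, instead of on an atom column potential. The function names are hypothetical.

```python
import numpy as np
from scipy.sparse import diags, identity, kron
from scipy.sparse.linalg import eigsh


def fd_hamiltonian(potential, half_width, n):
    """Sparse H = -(1/2) Laplacian + V on an n x n grid over [-half_width, half_width]^2.

    Units with hbar = m = 1 are an assumption of this sketch.
    """
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    # 1D second difference: (f[j-1] - 2 f[j] + f[j+1]) / dx^2, zero outside the grid
    d2 = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / dx**2
    eye = identity(n)
    # Five-point stencil; the flattened index is k = i * n + j (row i, column j)
    laplacian = kron(eye, d2) + kron(d2, eye)
    xx, yy = np.meshgrid(x, x)
    v = diags(potential(xx, yy).ravel())
    return (-0.5 * laplacian + v).tocsc()


def lowest_states(hamiltonian, count=3):
    """Lowest `count` eigenvalues (ascending) and eigenvectors, computed with ARPACK."""
    vals, vecs = eigsh(hamiltonian, k=count, which="SA")
    order = np.argsort(vals)
    return vals[order], vecs[:, order]


# Validation case: V = r^2 / 2 has exact eigenvalues 1, 2, 2, ...
h = fd_hamiltonian(lambda x, y: 0.5 * (x**2 + y**2), half_width=6.0, n=61)
energies, states = lowest_states(h)
print(energies)
```

Building the 2D operator from Kronecker products of the 1D second difference keeps the matrix symmetric and avoids coupling the last point of one grid row to the first point of the next. Replacing the test potential by $-eU(x,y)$ in consistent units would give the bound states of a column, subject to the range and sampling caveats discussed in this subsection.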
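The finite difference expansion of differential Eq. (48) can then be written in matrix form as

$$\begin{pmatrix}
C_{1,1} - E_{nm} & \cdots & C_{1,k} & \cdots & C_{1,N_xN_y} \\
\vdots & \ddots & & & \vdots \\
C_{k',1} & \cdots & C_{k',k'} - E_{nm} & \cdots & C_{k',N_xN_y} \\
\vdots & & & \ddots & \vdots \\
C_{N_xN_y,1} & \cdots & C_{N_xN_y,k} & \cdots & C_{N_xN_y,N_xN_y} - E_{nm}
\end{pmatrix}
\begin{pmatrix}
\phi_{nm}(1,1) \\ \vdots \\ \phi_{nm}(1,N_x) \\ \phi_{nm}(2,1) \\ \vdots \\ \phi_{nm}(N_y,N_x)
\end{pmatrix} = 0, \tag{63}$$

with

$$C_{k',k} = \left[\frac{\hbar^2}{m}\left(\frac{1}{(\Delta x)^2} + \frac{1}{(\Delta y)^2}\right) - eU\!\left(\left\lfloor\frac{k'-1}{N_x}\right\rfloor + 1,\; k' - N_x\left\lfloor\frac{k'-1}{N_x}\right\rfloor\right)\right]\delta_{k',k} - \frac{\hbar^2}{2m(\Delta x)^2}\left(\delta_{k',k-1} + \delta_{k',k+1}\right) - \frac{\hbar^2}{2m(\Delta y)^2}\left(\delta_{k'+N_x,k} + \delta_{k'-N_x,k}\right), \tag{64}$$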
with $\lfloor x\rfloor$ the floor function or greatest integer function, which gives the largest integer less than or equal to $x$, and $\delta_{k',k} = 1$ if $k' = k$ and $\delta_{k',k} = 0$ if $k' \neq k$. $C$ is a sparse symmetric matrix of dimensions $N_xN_y \times N_xN_y$. To solve such a huge matrix eigenvalue problem the ARPACK software package (www.caam.rice.edu/software/ARPACK) is used. ARPACK is a collection of Fortran77 subroutines designed to solve large-scale eigenvalue problems and is most appropriate for large sparse or structured matrices. The package is designed to compute a few eigenvalues, with user-specified features such as those with the smallest real part, and the corresponding eigenvectors of a general square matrix.

Although the problems of divergence and of nondefined negative values of the radial solution are solved by this approach, the problem of restricting the range to $[-L_x, L_x]$ in the $x$ direction and $[-L_y, L_y]$ in the $y$ direction, in order to limit the order of the eigenvalue matrix, remains. In addition, it remains important that the potential is sampled sufficiently. It is possible to adapt the grid spacings $\Delta x$ and $\Delta y$ as a function of the local variation of the potential, but in this work $\Delta x$ and $\Delta y$ are chosen to be constant over space. Nevertheless, finite difference methods do provide a way to calculate the bound eigenfunctions, which are directly accessible, and their associated eigenenergies with an accuracy that depends on the number of sampling points and on the ranges $[-L_x, L_x]$ in the $x$ direction and $[-L_y, L_y]$ in the $y$ direction. Note that it is easy to incorporate two atom column potentials and calculate the eigenfunctions of two neighboring atom columns, which is not straightforward in the case of an expansion of the solutions of the radial equation, discussed in the next subsection. The performance of this method is discussed in subsection III.F.1.

D. Solutions of the Radial Equation of the Bound Eigenfunctions: Expansion in a Basis Set

Expansion of the solutions of the radial equation in an orthonormal and complete basis avoids the dense sampling needed to describe the sharp peak of the potential correctly, at least if the matrix elements of the eigenvalue problem to which the problem is reduced can be calculated analytically or by means of a recurrence relation. When an optimal basis is chosen, the number of basis functions needed is limited. This reduces the dimension of the eigenvalue problem that must be solved. A disadvantage is that the eigenfunctions are not directly accessible: the solution of the eigenvalue problem is not the eigenfunctions and their eigenenergies but an eigenvector and an eigenenergy, where the elements of the eigenvector are the excitation coefficients of the various basis functions. In the past, Bessel functions of the first kind, orthogonalized on an interval $[0, L]$, were proposed as a basis set (Op de Beeck and Van Dyck, 1995). Here, the 2D quantum harmonic oscillator eigenfunctions are proposed as a basis set. The pros and cons of both basis sets are discussed as well. The performance of both methods is discussed in subsection III.F.2.

1. A Basis Set of Bessel Functions of the First Kind

An expansion of the solutions of the radial equation of the bound eigenfunctions in a set of Bessel functions of the first kind was first proposed by Op de Beeck and Van Dyck (1995). However, they calculated only the most strongly bound eigenfunction with this method. The derivation is repeated here briefly.
Apart from the potential term, the radial equation is closely related to Bessel's differential equation (Arfken and Weber, 1995). Therefore, it seems a reasonable ansatz to expand the solution $R_{nm}(r)$ in Bessel functions of the first kind, which provide a complete and orthogonal basis on the interval $[0, L]$ after orthogonalization. The solutions of the radial equation can then be expanded in an orthogonalized set of basis functions

$$R_{nm}(r) = \sum_{k=0}^{N-1} a^{nm}_{k}\,\langle r|k\rangle, \tag{65}$$

with

$$\langle r|k\rangle = N_k\,J_{|m|}\!\left(\frac{\lambda_k r}{L}\right) \quad\text{and}\quad N_k = \frac{\sqrt{2}}{L\,|J_{|m|+1}(\lambda_k)|}, \tag{66}$$
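with $\lambda_k$ the roots of $J_{|m|}(r)$ and $N$ the number of basis functions in the expansion. The integral over the 2D space of the product of $\langle k'|r\rangle$ and Eq. (56), after substitution of Eq. (65), divided by $2\pi$, is given by

$$-\sum_{k=0}^{N-1} a^{nm}_{k}\int_0^{L}\langle k'|r\rangle\left\langle r\left|\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2}\right) + eU(r)\right|r\right\rangle\langle r|k\rangle\,r\,dr$$

$$= -\sum_{k=0}^{N-1} a^{nm}_{k}\int_0^{L}\langle k'|r\rangle\left\langle r\left|-\frac{\hbar^2}{2m}\left(\frac{\lambda_k}{L}\right)^2 + eU(r)\right|r\right\rangle\langle r|k\rangle\,r\,dr = \sum_{k=0}^{N-1} E_{nm}\,a^{nm}_{k}\,\delta_{k',k}, \tag{67}$$

since

$$\int_0^{L}\langle k'|r\rangle\langle r|r\rangle\langle r|k\rangle\,r\,dr = \frac{1}{2\pi}\langle k'|k\rangle = \delta_{k'k} \tag{68}$$

and

$$\int_0^{L}\langle k'|r\rangle\left\langle r\left|\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2}\right|r\right\rangle\langle r|k\rangle\,r\,dr = -\int_0^{L}\langle k'|r\rangle\left\langle r\left|\left(\frac{\lambda_k}{L}\right)^2\right|r\right\rangle\langle r|k\rangle\,r\,dr, \tag{69}$$

which is Bessel's differential equation, with $\delta_{k',k} = 1$ if $k' = k$ and $\delta_{k',k} = 0$ if $k' \neq k$. Differential Eq. (56) is now reduced to an eigenvalue problem, which can be written in matrix form as

$$\begin{pmatrix}
C_{0,0} - E_{nm} & \cdots & C_{0,k} & \cdots & C_{0,N-1} \\
\vdots & \ddots & & & \vdots \\
C_{k',0} & \cdots & C_{k',k'} - E_{nm} & \cdots & C_{k',N-1} \\
\vdots & & & \ddots & \vdots \\
C_{N-1,0} & \cdots & C_{N-1,k} & \cdots & C_{N-1,N-1} - E_{nm}
\end{pmatrix}
\begin{pmatrix}
a^{nm}_{0} \\ \vdots \\ a^{nm}_{k} \\ \vdots \\ a^{nm}_{N-1}
\end{pmatrix} = 0, \tag{70}$$

with

$$C_{k',k} = -\int_0^{L}\langle k'|r\rangle\left\langle r\left|-\frac{\hbar^2}{2m}\left(\frac{\lambda_k}{L}\right)^2 + eU(r)\right|r\right\rangle\langle r|k\rangle\,r\,dr. \tag{71}$$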
From Appendix A, it is known that $eU(r)$ can be written in a parameterized form using the Doyle and Turner parameterization for the electron-scattering factors as

$$eU(r) = \sum_i \frac{A_i}{B_i}\,e^{-r^2/B_i}. \tag{72}$$

The second term of Eq. (71) is thus a 2D integral of the product of a Gaussian and Bessel functions of the first kind. The integral can be calculated analytically on an infinite interval, whereas only a numerical approximation exists on a finite interval $[0, L]$. However, if $L$ is taken large enough, the Gaussian will be damped almost completely at $L$. As a consequence the integral over a finite interval $[0, L]$ will approach the integral over an infinite interval, which is given by

$$\int_0^{\infty} e^{-\rho^2x^2}J_p(\alpha x)J_p(\beta x)\,x\,dx = \frac{1}{2\rho^2}\exp\left(-\frac{\alpha^2 + \beta^2}{4\rho^2}\right)I_p\!\left(\frac{\alpha\beta}{2\rho^2}\right), \qquad \left[\mathrm{Re}(p) > -1,\; |\arg\rho| < \frac{\pi}{4},\; \alpha > 0,\; \beta > 0\right] \tag{73}$$

(Gradshteyn and Ryzhik, 1965), with $I_p$ a modified Bessel function. The diagonal matrix elements $C_{k,k}$ of $C$ are then equal to

$$C_{k,k} = \frac{\hbar^2}{2m}\left(\frac{\lambda_k}{L}\right)^2 - \frac{1}{2}N_k^2\sum_i A_i\exp\left(-B_i\frac{\lambda_k^2}{2L^2}\right)I_{|m|}\!\left(B_i\frac{\lambda_k^2}{2L^2}\right), \tag{74}$$

and the nondiagonal matrix elements $C_{k',k}$ of $C$ are equal to

$$C_{k',k} = -\frac{1}{2}N_{k'}N_k\sum_i A_i\exp\left(-B_i\frac{\lambda_{k'}^2 + \lambda_k^2}{4L^2}\right)I_{|m|}\!\left(B_i\frac{\lambda_{k'}\lambda_k}{2L^2}\right). \tag{75}$$
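The closed-form integral of Eq. (73), on which the analytical matrix elements rest, can be checked numerically. The sketch below compares the right-hand side with a direct quadrature of the left-hand side using SciPy; the parameter values are arbitrary test inputs, and the truncation of the upper limit at $12/\rho$ is an assumption of the sketch that relies on the Gaussian factor being negligible beyond that point.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import iv, jv


def weber_closed_form(p, alpha, beta, rho):
    """Right-hand side of Eq. (73)."""
    return (np.exp(-(alpha**2 + beta**2) / (4.0 * rho**2))
            * iv(p, alpha * beta / (2.0 * rho**2)) / (2.0 * rho**2))


def weber_numerical(p, alpha, beta, rho):
    """Left-hand side of Eq. (73) by quadrature on a truncated interval."""
    upper = 12.0 / rho  # exp(-rho^2 x^2) is about 1e-63 here, so the tail is negligible

    def integrand(x):
        return np.exp(-(rho * x) ** 2) * jv(p, alpha * x) * jv(p, beta * x) * x

    value, _ = quad(integrand, 0.0, upper, limit=400, epsabs=1e-13, epsrel=1e-11)
    return value


for params in [(0, 2.4, 5.5, 1.0), (1, 3.8, 7.0, 1.5)]:
    print(params, weber_numerical(*params), weber_closed_form(*params))
```

With $\rho^2 = 1/B_i$, $\alpha = \lambda_{k'}/L$ and $\beta = \lambda_k/L$, the same closed form yields the Gaussian contributions to the matrix elements, provided $L$ is large enough for the finite-interval integral to be replaced by the infinite one.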
The advantage of an expansion of the eigenfunctions in a basis set of Bessel functions is that the elements of the eigenmatrix can be calculated analytically if the Doyle and Turner parameterization of the electron-scattering factors is used to describe the mean atom column potential (Appendix A). A disadvantage is that an interval $[0, L]$ has to be chosen on which the basis set is orthogonal. The larger the interval, the larger the number of basis functions that must be taken into account to achieve convergence, as will be shown in subsection III.F.2. Although the number of basis functions needed to describe the most strongly bound eigenfunction is limited, Bessel functions of the first kind are not an optimal basis set for an expansion. This can be understood intuitively if one considers that the Bessel functions of the first kind oscillate as a function of the distance from the core, whereas the solutions of the radial equation are quite smooth. To counterbalance these oscillations many Bessel functions of the first kind are needed in the expansion. Two-dimensional quantum harmonic oscillator eigenfunctions provide a much more effective basis set in this respect, as will be discussed next.

2. Expansion in Two-Dimensional Quantum Harmonic Oscillator Eigenfunctions

The eigenfunctions of the 2D quantum harmonic oscillator provide an orthonormal and complete basis set over the whole 2D space. The solutions of the radial equation can be expanded in a set of basis functions $u_{(2k+|m|)m}(r)$, the radial parts of the 2D quantum harmonic oscillator eigenfunctions, which are discussed in more detail in Appendix B:

$$R_{nm}(r) = \sum_{k=0}^{N-1} a^{nm}_{k}\,\langle r|k\rangle \quad\text{with}\quad \langle r|k\rangle = u_{(2k+|m|)m}(r), \tag{76}$$
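with $N$ the number of basis functions in the expansion. The integral over the 2D space of the product of $\langle k'|r\rangle$ and Eq. (56), after substitution of Eq. (76), divided by $2\pi$, is given by

$$-\sum_{k=0}^{N-1} a^{nm}_{k}\int_0^{\infty}\langle k'|r\rangle\left\langle r\left|\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2}\right) + eU(r)\right|r\right\rangle\langle r|k\rangle\,r\,dr = \sum_{k=0}^{N-1} E_{nm}\,a^{nm}_{k}\,\delta_{k',k}, \tag{77}$$

since

$$\int_0^{\infty}\langle k'|r\rangle\langle r|r\rangle\langle r|k\rangle\,r\,dr = \frac{1}{2\pi}\langle k'|k\rangle = \delta_{k'k}, \tag{78}$$

with $\delta_{k',k} = 1$ if $k' = k$ and $\delta_{k',k} = 0$ if $k' \neq k$. The differential Eq. (56) is now reduced to an eigenvalue problem, which can be written in matrix form similar to Eq. (70), with

$$C_{k',k} = -\int_0^{\infty}\langle k'|r\rangle\left\langle r\left|\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2}\right) + eU(r)\right|r\right\rangle\langle r|k\rangle\,r\,dr. \tag{79}$$

To simplify the calculation of the matrix elements of $C$, the matrix will be written as $C = P + Q$, with the matrix elements of $P$ and $Q$ equal to

$$P_{k',k} = -\int_0^{\infty}\langle k'|r\rangle\left\langle r\left|\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2}\right)\right|r\right\rangle\langle r|k\rangle\,r\,dr, \tag{80}$$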
$$Q_{k',k} = -\int_0^{\infty}\langle k'|r\rangle\langle r|eU(r)|r\rangle\langle r|k\rangle\,r\,dr. \tag{81}$$

Note that $P_{k',k}$ can be written as

$$P_{k',k} = \hbar\omega\,(2k + |m| + 1)\,\delta_{k',k} - \frac{1}{2}m\omega^2\int_0^{\infty}\langle k'|r\rangle\langle r|r^2|r\rangle\langle r|k\rangle\,r\,dr, \tag{82}$$

using Eq. (B7), with $\hbar\omega\,(2k + |m| + 1)$ the eigenenergy of the 2D quantum harmonic oscillator eigenfunction with principal quantum number $2k + |m|$. To calculate the matrix elements of $P$, it is sufficient to calculate

$$\int_0^{\infty}\langle k'|r\rangle\langle r|r^2|r\rangle\langle r|k\rangle\,r\,dr = \int_0^{\infty}u^{*}_{(2k'+|m|)m}(r)\,u_{(2k+|m|)m}(r)\,r^3\,dr = (-1)^{k'+k}\,\frac{b_0^{4}}{2}\,N_{(2k'+|m|)m}N_{(2k+|m|)m}\int_0^{\infty}e^{-x}x^{|m|+1}L^{|m|}_{k'}(x)\,L^{|m|}_{k}(x)\,dx, \tag{83}$$
with $x = (r/b_0)^2$. Making use of the recurrence relation of the associated Laguerre polynomials (Arfken and Weber, 1995)

$$xL^{|m|}_{k}(x) = -(k+1)\,L^{|m|}_{k+1}(x) + (2k + |m| + 1)\,L^{|m|}_{k}(x) - (k + |m|)\,L^{|m|}_{k-1}(x), \tag{84}$$

and the orthogonality relation

$$\int_0^{\infty}e^{-x}x^{|m|}L^{|m|}_{k'}(x)\,L^{|m|}_{k}(x)\,dx = \frac{(k + |m|)!}{k!}\,\delta_{k',k}, \tag{85}$$
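Both Laguerre relations, Eqs. (84) and (85), can be verified exactly with rational arithmetic. The sketch below is illustrative only: it builds $L^{a}_{k}(x)$ for integer $a = |m|$ from the standard explicit sum $L^{a}_{k}(x) = \sum_{i=0}^{k}(-1)^i\binom{k+a}{k-i}x^i/i!$, and evaluates the weighted integrals term by term using $\int_0^\infty e^{-x}x^n\,dx = n!$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def laguerre_coeffs(k, a):
    """Coefficients (ascending powers of x) of L_k^a(x) for integer a >= 0."""
    if k < 0:
        return [Fraction(0)]
    return [Fraction((-1) ** i * comb(k + a, k - i), factorial(i))
            for i in range(k + 1)]


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def overlap(k1, k2, a):
    """Integral over [0, inf) of exp(-x) x^a L_k1^a(x) L_k2^a(x), evaluated exactly."""
    product = poly_mul(laguerre_coeffs(k1, a), laguerre_coeffs(k2, a))
    return sum(c * factorial(a + i) for i, c in enumerate(product))


def recurrence_residual(k, a):
    """Coefficients of x L_k - [-(k+1) L_{k+1} + (2k+a+1) L_k - (k+a) L_{k-1}]."""
    residual = [Fraction(0)] + laguerre_coeffs(k, a)  # multiplication by x
    terms = [(-(k + 1), laguerre_coeffs(k + 1, a)),
             (2 * k + a + 1, laguerre_coeffs(k, a)),
             (-(k + a), laguerre_coeffs(k - 1, a))]
    for factor, coeffs in terms:
        for i, c in enumerate(coeffs):
            residual[i] -= factor * c
    return residual


print(overlap(3, 3, 2), Fraction(factorial(5), factorial(3)))  # both sides of Eq. (85)
print(recurrence_residual(3, 2))                               # all zeros if Eq. (84) holds
```

Because the arithmetic is exact, a nonzero residual or a mismatch in the overlap would point to a sign or index error in the transcription of the relations rather than to rounding.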
the integral in Eq. (83) can be solved and rewritten in the following form: Z 1 hk0 jrihrjr2 jrihrjkirdr 0 b0 4 ðk þ jmj þ 1Þ! dk0 ;kþ1 Nð2ðkþ1ÞþjmjÞm Nð2kþjmjÞm ¼ k! 2 ðk þ jmjÞ! dk0 ;k þð2k þ jmj þ 1ÞNð2kþjmjÞm Nð2kþjmjÞm ð86Þ k! ðk þ jmjÞ! dk0 ;k1 þNð2ðk1Þþjmjm Nð2kþjmjÞm ðk 1Þ! pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ¼ b0 2 ½ ðk þ 1Þðk þ jmj þ 1Þdk0 ;kþ1 þ ð2k þ jmj þ 1Þdk0 ;k pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi þ kðk þ mÞdk0 ;k1 Substitution of Eq. (86) into Eq. (82) gives
$$P_{k',k} = \frac{1}{2}\hbar\omega\,(2k + |m| + 1)\,\delta_{k',k} - \frac{1}{2}\hbar\omega \left[ \sqrt{(k+1)(k+|m|+1)}\,\delta_{k',k+1} + \sqrt{k(k+|m|)}\,\delta_{k',k-1} \right] \tag{87}$$
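The closed forms of Eqs. (86) and (87) can be checked numerically. The following is a minimal sketch, not the authors' code: it builds the $r^2$ matrix (in units of $b'^2$) and the kinetic matrix P (in units of $\hbar\omega$) and compares the former with a Gauss-Laguerre quadrature of Eq. (83). The function names are hypothetical, and the normalization $N_{(2k+|m|)m}^2 = (2/b'^2)\,k!/(k+|m|)!$ is an assumption inferred from Eqs. (85), (86), and (100).

```python
import numpy as np
from math import factorial


def assoc_laguerre(k, alpha, x):
    """Associated Laguerre polynomial L_k^alpha(x) from the three-term recurrence."""
    x = np.asarray(x, dtype=float)
    l_prev = np.ones_like(x)              # L_0
    if k == 0:
        return l_prev
    l_curr = 1.0 + alpha - x              # L_1
    for j in range(1, k):
        l_next = ((2 * j + 1 + alpha - x) * l_curr - (j + alpha) * l_prev) / (j + 1)
        l_prev, l_curr = l_curr, l_next
    return l_curr


def r2_matrix(n_basis, m_abs):
    """<k'|r^2|k> in units of b'^2, closed form of Eq. (86)."""
    k = np.arange(n_basis)
    off = np.sqrt((k[:-1] + 1.0) * (k[:-1] + m_abs + 1.0))
    return np.diag(2.0 * k + m_abs + 1.0) + np.diag(off, 1) + np.diag(off, -1)


def kinetic_matrix(n_basis, m_abs):
    """P_{k',k} in units of hbar*omega, Eq. (87): oscillator energy minus the r^2 term."""
    k = np.arange(n_basis)
    return np.diag(2.0 * k + m_abs + 1.0) - 0.5 * r2_matrix(n_basis, m_abs)


def r2_matrix_quadrature(n_basis, m_abs, n_nodes=30):
    """Same matrix from the integral in Eq. (83), using Gauss-Laguerre quadrature."""
    x, w = np.polynomial.laguerre.laggauss(n_nodes)   # nodes/weights for weight e^{-x}
    mat = np.empty((n_basis, n_basis))
    for kp in range(n_basis):
        for k in range(n_basis):
            # N_{k'} N_k b'^4 / 2 expressed in units of b'^2 (assumed normalization)
            norm = np.sqrt(factorial(kp) * factorial(k)
                           / (factorial(kp + m_abs) * factorial(k + m_abs)))
            integrand = (x ** (m_abs + 1)
                         * assoc_laguerre(kp, m_abs, x) * assoc_laguerre(k, m_abs, x))
            mat[kp, k] = (-1) ** (kp + k) * norm * np.sum(w * integrand)
    return mat
```

Because the integrand is a polynomial times $e^{-x}$, the quadrature is exact up to rounding, so `r2_matrix(6, 2)` and `r2_matrix_quadrature(6, 2)` should agree to machine precision.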
The matrix P is thus tridiagonal and the matrix elements $P_{k',k}$ can be calculated analytically. The matrix elements $Q_{k',k}$ cannot be calculated analytically, in contrast with the matrix elements $P_{k',k}$, but a recurrence relation can be derived. This was shown in Arickx et al. (1994) in three dimensions for the special case of a Gaussian potential. Arickx et al. used a set of 3D quantum harmonic oscillator eigenfunctions as a basis set to expand the scattered wave function, in order to study quantum scattering problems. The recurrence relation is found with the help of a generating function $K(s', s)$ of the matrix elements $Q_{k',k}$, making use of the generating function $g_m(r, s)$ of $u_{nm}(r)$ given by Eq. (B25). $K(s', s)$ is defined as

$$\begin{aligned}
K(s', s) &= \int_0^\infty g_m(r, s')\, eU(r)\, g_m(r, s)\, r\,dr \\
&= \sum_{k'=0}^{\infty} \sum_{k=0}^{\infty} s'^{k'} s^{k}\, \frac{\int_0^\infty \langle k'|r\rangle \langle r|eU(r)|r\rangle \langle r|k\rangle \, r\,dr}{N_{(2k'+|m|)m} N_{(2k+|m|)m}} \\
&= \sum_{k'=0}^{\infty} \sum_{k=0}^{\infty} s'^{k'} s^{k}\, \frac{Q_{k',k}}{N_{(2k'+|m|)m} N_{(2k+|m|)m}}
\end{aligned} \tag{88}$$
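From Appendix A it is known that eU(r) can be written in a parameterized form using the Doyle and Turner parameters as

$$eU(r) = -\sum_i \frac{A_i}{B_i}\, e^{-r^2/B_i} \tag{89}$$

Assume $K(s', s) = \sum_i K_i(s', s)$ and $Q_{k',k} = \sum_i Q^i_{k',k}$, with $K_i(s', s)$ and $Q^i_{k',k}$ defined as

$$K_i(s', s) = -\int_0^\infty g_m(r, s')\, \frac{A_i}{B_i}\, e^{-r^2/B_i}\, g_m(r, s)\, r\,dr, \tag{90}$$

$$Q^i_{k',k} = \int_0^\infty \langle k'|r\rangle \left\langle r \left| -\frac{A_i}{B_i}\, e^{-r^2/B_i} \right| r \right\rangle \langle r|k\rangle \, r\,dr; \tag{91}$$

they are related as

$$K_i(s', s) = \sum_{k'=0}^{\infty} \sum_{k=0}^{\infty} s'^{k'} s^{k}\, \frac{Q^i_{k',k}}{N_{(2k'+|m|)m} N_{(2k+|m|)m}} \tag{92}$$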
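After some rewriting, $K_i(s', s)$ is then equal to

$$K_i(s', s) = -\frac{A_i}{B_i}\, \frac{1}{2\left((1+s')(1+s)\,\alpha\right)^{|m|+1}} \int_0^\infty e^{-u} u^{|m|}\,du, \tag{93}$$

where

$$\alpha = \frac{b'^2}{B_i} + \frac{1}{2}\,\frac{1-s'}{1+s'} + \frac{1}{2}\,\frac{1-s}{1+s} \quad \text{and} \quad u = \alpha \left(\frac{r}{b'}\right)^2, \tag{94}$$

and

$$\Gamma(z) = \int_0^\infty e^{-u} u^{z-1}\,du, \quad \Re(z) > 0; \tag{95}$$

then

$$K_i(s', s) = -\frac{A_i}{B_i}\, \frac{1}{2}\, \frac{\Gamma(|m|+1)}{\left[\dfrac{b'^2}{B_i}\,(1 + s' + s + s's) + (1 - s's)\right]^{|m|+1}} \tag{96}$$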
After differentiating $K_i(s', s)$, using Eq. (96), with respect to $s'$, multiplying both sides by $\frac{b'^2}{B_i}(1 + s' + s + s's) + (1 - s's)$, and regrouping, the following relation is obtained:

$$\left[ \left(\frac{b'^2}{B_i} + 1\right) + \frac{b'^2}{B_i}\, s' + \frac{b'^2}{B_i}\, s + \left(\frac{b'^2}{B_i} - 1\right) s's \right] \frac{d}{ds'} K_i(s', s) = -(|m| + 1) \left[ \left(\frac{b'^2}{B_i} - 1\right) s + \frac{b'^2}{B_i} \right] K_i(s', s) \tag{97}$$

On the other hand, the derivative of $K_i(s', s)$ with respect to $s'$, using Eq. (92), is equal to

$$\frac{d}{ds'} K_i(s', s) = \sum_{k'=0}^{\infty} \sum_{k=0}^{\infty} k'\, s'^{(k'-1)} s^{k}\, \frac{Q^i_{k',k}}{N_{(2k'+|m|)m} N_{(2k+|m|)m}} \tag{98}$$

After substitution of Eqs. (92) and (98) in Eq. (97), and after some rewriting, the recurrence relation becomes

$$(k'+1)\, Q^i_{k'+1,k} + (k'+|m|+1)(1-\gamma_i)\, \epsilon_{k'+1}\, Q^i_{k',k} + (k'+1)(1-\gamma_i)\, \epsilon_k\, Q^i_{k'+1,k-1} + (k'+|m|+1)(1-2\gamma_i)\, \epsilon_{k'+1} \epsilon_k\, Q^i_{k',k-1} = 0 \tag{99}$$

with

$$\gamma_i = \frac{1}{1 + \dfrac{b'^2}{B_i}} \quad \text{and} \quad \epsilon_k = \frac{N_{(2k+|m|)m}}{N_{(2(k-1)+|m|)m}} = \sqrt{\frac{k}{k+|m|}}. \tag{100}$$
It is sufficient to calculate one matrix element explicitly, namely $Q^i_{0,0}$, because it can be calculated analytically:

$$Q^i_{0,0} = -\frac{A_i}{B_i}\, \gamma_i^{|m|+1} \tag{101}$$
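As an illustration of how the recurrence can be used, the sketch below fills one Gaussian contribution $Q^i$ column by column, starting from $Q^i_{0,0}$ and using the symmetry of Q, and compares the result with a direct Gauss-Laguerre quadrature of Eq. (91). It is a sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the potential is taken as $eU(r) = -\sum_i (A_i/B_i)\exp(-r^2/B_i)$, the last coefficient of the recurrence is taken as $(1 - 2\gamma_i)$, and the normalization $N_{(2k+|m|)m}^2 = (2/b'^2)\,k!/(k+|m|)!$ is assumed. The parameter values in the usage note are arbitrary.

```python
import numpy as np
from math import factorial


def assoc_laguerre(k, alpha, x):
    """Associated Laguerre polynomial L_k^alpha(x) from the three-term recurrence."""
    x = np.asarray(x, dtype=float)
    l_prev = np.ones_like(x)
    if k == 0:
        return l_prev
    l_curr = 1.0 + alpha - x
    for j in range(1, k):
        l_next = ((2 * j + 1 + alpha - x) * l_curr - (j + alpha) * l_prev) / (j + 1)
        l_prev, l_curr = l_curr, l_next
    return l_curr


def q_matrix_recurrence(n_basis, m_abs, a_i, b_i, width):
    """Q^i from the recurrence relation, seeded with the analytical Q^i_{0,0}."""
    gamma = 1.0 / (1.0 + width ** 2 / b_i)

    def eps(k):
        # Ratio of normalization constants, only evaluated for k >= 1
        return np.sqrt(k / (k + m_abs))

    q = np.zeros((n_basis, n_basis))
    q[0, 0] = -(a_i / b_i) * gamma ** (m_abs + 1)      # assumed sign convention
    for k in range(n_basis):
        if k > 0:
            q[0, k] = q[k, 0]                          # symmetry of Q
        for kp in range(n_basis - 1):
            total = (kp + m_abs + 1) * (1 - gamma) * eps(kp + 1) * q[kp, k]
            if k > 0:
                total += (kp + 1) * (1 - gamma) * eps(k) * q[kp + 1, k - 1]
                total += ((kp + m_abs + 1) * (1 - 2 * gamma)
                          * eps(kp + 1) * eps(k) * q[kp, k - 1])
            q[kp + 1, k] = -total / (kp + 1)
    return q


def q_matrix_quadrature(n_basis, m_abs, a_i, b_i, width, n_nodes=40):
    """Reference values by direct integration with x = (r/b')^2."""
    beta = width ** 2 / b_i
    t, w = np.polynomial.laguerre.laggauss(n_nodes)    # weight e^{-t}
    x = t / (1.0 + beta)                               # t = (1 + beta) x
    q = np.empty((n_basis, n_basis))
    for kp in range(n_basis):
        for k in range(n_basis):
            norm = np.sqrt(factorial(kp) * factorial(k)
                           / (factorial(kp + m_abs) * factorial(k + m_abs)))
            integrand = x ** m_abs * assoc_laguerre(kp, m_abs, x) * assoc_laguerre(k, m_abs, x)
            integral = np.sum(w * integrand) / (1.0 + beta)
            q[kp, k] = -(a_i / b_i) * (-1) ** (kp + k) * norm * integral
    return q
```

For example, `q_matrix_recurrence(6, 1, 2.0, 0.5, 0.8)` can be compared element by element with `q_matrix_quadrature(6, 1, 2.0, 0.5, 0.8)`; the full matrix Q is then the sum of such contributions over the Gaussian terms i.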
All other matrix elements can be calculated using the recurrence relation and the property that Q is symmetric. Because of this symmetry, it is sufficient to calculate the matrix elements $Q_{k' \geq k,\,k}$. Because the basis functions are quite smooth, only a limited number of them is needed to describe the eigenfunctions, provided that the 2D quantum harmonic oscillator width $b'$ is optimized to the problem under study. The elements of the eigenmatrix cannot all be calculated analytically, but a recurrence relation can be set up if the Doyle and Turner parameterization of the electron-scattering factors is used.

3. Optimization of the Two-Dimensional Quantum Harmonic Oscillator Width

As previously stated, the number of 2D quantum harmonic oscillator basis functions needed in the expansion of the eigenfunctions to achieve convergence is a function of the 2D quantum harmonic oscillator width $b'$, and the optimal $b'$ is not the same for all eigenfunctions. In the case of the most bound eigenfunction with angular quantum number m, the optimal value for $b'$ can be determined by finding the root of an analytical function, as will be shown.

a. $n = |m|$ and $m \in \mathbb{Z}$. It can be assumed that only one basis function in the expansion is needed to determine the most bound eigenfunction with angular quantum number m and its eigenenergy. Nevertheless, the most bound eigenfunction with angular quantum number m of the 2D quantum harmonic oscillator will not match the true eigenfunction, since the atom column potential differs from the potential of the 2D quantum harmonic oscillator with oscillator width $b'$. However, a value for $b'$ can be found for which the match between the most bound eigenfunction of the 2D quantum harmonic oscillator and the true eigenfunction is maximal, that is, for which the calculated eigenenergy is minimal. The variational method (Sakurai, 1994) provides a way to determine such a value of $b'$. It states that
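$$E_{|m|m} \leq H_m(b') = \frac{\displaystyle\int_0^{2\pi}\!\!\int_0^\infty \langle 0|r\rangle \left\langle r \left| -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{m^2}{r^2}\right) + eU(r) \right| r \right\rangle \langle r|0\rangle \, r\,dr\,d\varphi}{\langle 0|0\rangle} = \frac{\hbar^2}{2m}\,\frac{1}{b'^2}\,(|m|+1) - \sum_i \frac{A_i}{B_i} \left(\frac{1}{1 + \dfrac{b'^2}{B_i}}\right)^{|m|+1}, \tag{102}$$

with $\langle r|0\rangle = u_{|m|m}(r)$ the trial eigenfunction. $H_m$ thus provides an upper bound for the eigenenergy $E_{|m|m}$. To find an optimal value for $b'$, $H_m(b')$ is minimized with respect to $b'$:

$$\left. \frac{\partial}{\partial b'} \left[ \frac{\hbar^2}{2m}\,\frac{1}{b'^2}\,(|m|+1) - \sum_i \frac{A_i}{B_i} \left(\frac{1}{1 + \dfrac{b'^2}{B_i}}\right)^{|m|+1} \right] \right|_{b' = b'_{\mathrm{opt}}} = 0 \tag{103}$$

$b'_{\mathrm{opt}}$ must therefore be a solution of

$$\sum_i \frac{A_i}{B_i^2} \left(\frac{1}{1 + \dfrac{b'^2_{\mathrm{opt}}}{B_i}}\right)^{|m|+2} = \frac{\hbar^2}{2m}\,\frac{1}{b'^4_{\mathrm{opt}}} \tag{104}$$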
b. $n > |m|$ and $m \in \mathbb{Z}$. The variational method can be generalized to eigenfunctions with a higher eigenenergy than the lowest eigenfunction with angular quantum number m, if a trial function is chosen that is orthogonal to all the exact eigenfunctions with a lower eigenenergy than the eigenfunction for which one wants to optimize $b'$. However, no such trial function is available. Therefore, $b'$ is optimized by solving the eigenvalue problem, taking into account only a small number of basis functions. $b'_{\mathrm{opt}}$ is then the 2D quantum harmonic oscillator width for which the eigenenergy of the eigenfunction one wants to optimize is minimal, and it is determined by trial and error.

E. Calculation of the Bound Eigenfunctions Using the Bloch Wave Method or the Multislice Method

1. The Bloch Wave Method

Both the S-state of the channeling theory and the Bloch wave method provide a high-energy solution $\Psi(\mathbf{r})$ of the Schrödinger equation (Coene et al., 1985). After substitution of both solutions in Eq. (8), the differential equation was brought to an eigenproblem with $E_{nm}$ the eigenenergy and
$\psi_{nm}(x, y)$ the associated eigenfunction of the S-state of the channeling theory, and with $\gamma_K$ the eigenvalue and $C^K_{\mathbf{g}}$ the associated eigenvector of the Bloch wave method. A first difference between both solutions is the potential term: U(x, y) is two dimensional in the case of the S-state of the channeling theory, and V(x, y, z) is three dimensional in the case of the Bloch wave method. A second difference is the periodicity of $\psi_{nm}(x, y)$ and $b(K, \mathbf{r})$; namely, $\psi_{nm}(x, y)$ is nonperiodic and $b(K, \mathbf{r})$ has the periodicity of the potential. If it is assumed, as argued before, that the potential energy felt by an electron in the specimen foil is proportional to $V(x, y, z) \simeq U(x, y)$, the 2D atom column potential averaged along the atom column, that $\psi_{nm}(x, y)$ has the periodicity of the potential, and that the illumination is parallel with a main zone axis ($k_{xy} = 0$), a relation between $E_{nm}$ and $\gamma_K$ can be derived. Under these assumptions the wave function $\Psi(x, y, z)$ can be written as

$$\Psi(x, y, z) = \sum_{nm} c_{nm}\, \psi_{nm}(x, y) \exp\left[-i\pi \frac{E_{nm}}{E_0} k_z z\right], \tag{105}$$

with $E_{nm} \in \mathbb{R}$. If $\psi_{nm}(x, y)$ has the same periodicity as the potential, it can be expanded in a Fourier series of plane waves with 2D reciprocal wave vector $\mathbf{g}_R$ in a similar way as $U(\mathbf{R})$:

$$\psi_{nm}(x, y) = \sum_{\mathbf{g}_R} \psi^{nm}_{\mathbf{g}_R} \exp(2\pi i\, \mathbf{g}_R \cdot \mathbf{R}), \tag{106}$$

$$U(x, y) = \sum_{\mathbf{g}_R} U_{\mathbf{g}_R} \exp(2\pi i\, \mathbf{g}_R \cdot \mathbf{R}) \tag{107}$$

After substitution of both in Eq. (34) and regrouping, Eq. (34) becomes

$$-g_R^2\, \psi^{nm}_{\mathbf{g}_R} - \frac{2me}{h^2} \sum_{\mathbf{h}_R} U_{\mathbf{g}_R - \mathbf{h}_R}\, \psi^{nm}_{\mathbf{h}_R} = -\frac{2m}{h^2}\, E_{nm}\, \psi^{nm}_{\mathbf{g}_R} \tag{108}$$

Because in the S-state of the channeling theory it is assumed that the potential energy felt by an electron in the foil is proportional to the 2D atom potential averaged along the atom column, $g_z = 0$ and thus $\mathbf{g} = \mathbf{g}_R$. The eigenproblem of the Bloch wave method, given by Eq. (33), with solutions $\gamma_K$, the eigenvalue, and $C^K_{\mathbf{g}}$, the associated eigenvector, can then be written, in the case of parallel illumination along a main zone axis ($k_{xy} = 0$), as

$$-g_R^2\, C^K_{\mathbf{g}_R} - \frac{2me}{h^2} \sum_{\mathbf{h}_R} V_{\mathbf{g}_R - \mathbf{h}_R}\, C^K_{\mathbf{h}_R} = 2\gamma_K k_z\, C^K_{\mathbf{g}_R}, \tag{109}$$
since $\mathbf{k} \cdot \mathbf{n} = k$ and $\mathbf{k} \cdot \mathbf{g}_R = 0$. Comparison of both equations results in the following relation between $E_{nm}$ and $\gamma_K$, and between $\psi_{nm}(x, y)$ and $b(K, \mathbf{r})$:

$$E_{nm} = -\frac{h^2}{m}\, \gamma_K k_z, \tag{110}$$

$$\psi_{nm}(x, y) = b(K, \mathbf{r}) \tag{111}$$
The following question can be asked: Is it possible to use the Bloch wave method to calculate the eigenenergies and their associated eigenfunctions of an isolated atom column? From Eq. (108), one can conclude that this equation is equal to the Fourier transform of differential Eq. (48), where $\psi^{nm}_{\mathbf{g}_R}$ are the discrete values of $\hat{\psi}_{nm}(g_x, g_y)$ at the sampling points $(h\Delta g_x, k\Delta g_y)$, with $\Delta g_x$ and $\Delta g_y$ the grid spacing in, respectively, the $g_x$ and $g_y$ direction, and $\hat{\psi}_{nm}(g_x, g_y)$ the 2D Fourier transform of $\psi_{nm}(x, y)$. The Bloch wave expansion is thus closely related to the finite difference expansion of Eq. (48) in real space presented in subsection III.C. Similar to subsection III.C, the 2D discretely sampled eigenfunction $\hat{\psi}_{nm}(g_x, g_y)$ is written as a vector. Compared to the finite difference expansion of Eq. (48) in real space, there are two differences. First, in Fourier space there is no need to expand the Laplacian in finite differences, as in real space. Second, the product of eU(x, y) and $\psi_{nm}(x, y)$ becomes a convolution product in Fourier space, as is clear from Eq. (108). As a consequence, the eigenmatrix is not sparse as in real space, and dedicated algorithms for solving large sparse eigenvalue problems are not suitable. The number of sampling points, or beams, is therefore rather limited, which can hamper convergence of the eigenenergy, as will be shown in subsection III.F.1. Table 1 shows the relation between the range $[-L, L]$ in real space and the range $[-N/(4L), N/(4L)]$ in Fourier space, as well as $\Delta x$ and $\Delta g$. From this it is clear that, to match a finite difference calculation with L = 0.3 nm and $N = N_x = N_y = 64$, the number of beams in the Bloch wave calculation is approximately $\frac{\pi}{4} N_x N_y \simeq 3215$, which is far more than the maximum number of beams that can be taken into account in a commercial Bloch wave algorithm such as EMS (Stadelmann, 1987).
TABLE 1
THE RELATION BETWEEN THE INTERVAL IN REAL AND FOURIER SPACE, AS WELL AS THE RELATION BETWEEN THE GRID SPACINGS IN BOTH SPACES(a)

                                              Real space     Fourier space
Range                                         [−L, L]        [−N/(4L), N/(4L)]
Grid spacing                                  Δx = 2L/N      Δg = 1/(2L)
Dimension of the grid                         N × N          N × N
Number of sampling points, number of beams    N²             N²

(a) There are N values x = −L, −L + Δx, . . . , L − Δx for the interval [−L, L] in real space and N values g = −N/(4L), −N/(4L) + Δg, . . . , N/(4L) − Δg in Fourier space.
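The grid relations of Table 1 and the beam estimate quoted in the text can be reproduced with a few lines. This is a minimal sketch with hypothetical function names; it assumes that the beams counted are the Fourier-space sampling points inside the circle of radius N/(4L), which is what the estimate $\frac{\pi}{4} N_x N_y$ suggests.

```python
import numpy as np


def grids(half_width, n):
    """Real-space and Fourier-space sampling points for the interval [-L, L]."""
    dx = 2.0 * half_width / n                # real-space grid spacing
    dg = 1.0 / (2.0 * half_width)            # Fourier-space grid spacing
    x = -half_width + dx * np.arange(n)
    g = -n / (4.0 * half_width) + dg * np.arange(n)
    return x, g


def beams_in_circle(half_width, n):
    """Number of 2D Fourier sampling points with |g| <= N/(4L)."""
    _, g = grids(half_width, n)
    gx, gy = np.meshgrid(g, g)
    g_max = n / (4.0 * half_width)
    return int(np.count_nonzero(gx ** 2 + gy ** 2 <= g_max ** 2))


x, g = grids(0.3, 64)                        # L = 0.3 nm, N = 64
n_beams = beams_in_circle(0.3, 64)           # close to pi/4 * 64 * 64
```

The Fourier grid `g` coincides with the frequencies returned by `np.fft.fftshift(np.fft.fftfreq(64, d=x[1] - x[0]))`, which is the sampling used by FFT-based multislice codes.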
In conclusion, compared to a finite difference method in real space or an expansion in a set of basis functions, such as Bessel functions of the first kind or 2D quantum harmonic oscillator eigenfunctions, the Bloch wave method is by far the least suitable method to calculate the eigenenergies and associated eigenvectors of an isolated column.

2. The Multislice Method

a. The Method of Evolution of Imaginary Time. To solve an arbitrary time-independent Schrödinger equation such as

$$H\psi_n(x, y) = E_n \psi_n(x, y) \tag{112}$$

for the lowest eigenfunction, the method of evolution of imaginary time (Hiller et al., 1995; Koonin, 1986) can be used. This method exploits the connection between the time-dependent Schrödinger equation and a diffusion-like equation. The diffusion-like equation can be obtained from the time-dependent Schrödinger equation by substitution of an imaginary time. In the solution of the diffusion-like equation, given by

$$\Psi(x, y, \tau) = \sum_{n=0}^{\infty} c_n\, \psi_n(x, y) \exp\left(-\frac{E_n}{\hbar}\tau\right) \quad \text{with } E_n \geq 0, \tag{113}$$

the higher-energy eigenfunctions decay most rapidly and, after a sufficient amount of “time” $\tau = it$, the eigenfunction with the lowest eigenenergy remains. If $E_0$ is known, the eigenfunction $\psi_0(x, y)$ can then be extracted from $\Psi(x, y, \tau)$, if the computations are extended far enough in $\tau$:

$$\psi_0(x, y) \propto \lim_{\tau \to \infty} \exp\left(\frac{E_0}{\hbar}\tau\right) \Psi(x, y, \tau) \tag{114}$$

This method can be applied to solve Eq. (34) for the lowest eigenfunction $\psi_{00}(x, y)$. Indeed, after substitution of $\zeta = iz$, Eq. (34) becomes a diffusion-like equation with a solution

$$\Psi(x, y, \zeta) = \sum_{nm;\, E_{nm} < 0} c_{nm}\, \psi_{nm}(x, y) \exp\left[-\pi \frac{E_{nm}}{E_0} k_z \zeta\right] + \int_0^\infty c(\xi)\, \psi(x, y, \xi) \exp\left[-\pi \frac{\xi}{E_0} k_z \zeta\right] d\xi \tag{115}$$

The terms containing the eigenfunctions with a discrete negative eigenenergy will gain importance exponentially as a function of $\zeta$, but since $|E_{00}| > |E_{nm}|$ for all other bound eigenfunctions ($E_{nm} < 0$), at large “thicknesses” the lowest eigenfunction will dominate. If the second term is regarded as an infinite sum of eigenfunctions with a discrete eigenenergy equal to or larger than zero, it can be easily shown that they will
decay as a function of “thickness” or will be constant in the case of $\xi = 0$. If $E_{00}$ is known, the $\psi_{00}(x, y)$ eigenfunction can be extracted from $\Psi(x, y, \zeta)$ as shown above. $\Psi(x, y, \zeta)$ can be calculated using a slightly altered multislice method by substitution of $\zeta = iz$. The propagator function $p(x, y, \Delta\zeta)$ and the transmission function $t(x, y, \zeta)$ then become

$$p(x, y, \Delta\zeta) = \frac{1}{\lambda \Delta\zeta} \exp\left[-\frac{\pi}{\lambda \Delta\zeta}\,(x^2 + y^2)\right], \tag{116}$$

$$t(x, y, \zeta) = \exp\left[\sigma \int_{\zeta}^{\zeta + \Delta\zeta} V(x, y, \zeta')\, d\zeta'\right], \tag{117}$$

and

$$\Psi(x, y, \zeta + \Delta\zeta) = p(x, y, \Delta\zeta) * \left[t(x, y, \zeta)\, \Psi(x, y, \zeta)\right] + O(\Delta\zeta^2) \tag{118}$$
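The structure of Eq. (118), a multiplication by a real transmission function followed by a convolution with a Gaussian propagator, is the same as that of a split-step integration in imaginary time. The sketch below illustrates the idea on a deliberately simple stand-in problem, not on the channeling equation itself: a 1D harmonic oscillator in units $\hbar = m = \omega = 1$, whose lowest eigenenergy is known to be 0.5. The function name, the symmetric splitting, and all parameter values are choices made for this illustration.

```python
import numpy as np


def imaginary_time_ground_state(potential, x, d_tau, n_steps):
    """Propagate in imaginary time with a symmetric split-step (FFT) scheme.

    Returns the normalized lowest eigenfunction and its energy expectation value
    for H = -0.5 d^2/dx^2 + V(x) on a periodic grid.
    """
    n = x.size
    dx = x[1] - x[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)          # angular wave numbers
    half_potential_step = np.exp(-0.5 * d_tau * potential)
    kinetic_step = np.exp(-0.5 * d_tau * k ** 2)       # Gaussian in Fourier space
    psi = np.ones(n, dtype=complex)                    # start with overlap on the ground state
    for _ in range(n_steps):
        psi = half_potential_step * psi                # analogue of the transmission function
        psi = np.fft.ifft(kinetic_step * np.fft.fft(psi))   # analogue of the propagator
        psi = half_potential_step * psi
        psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)  # renormalize instead of rescaling by exp(E0 tau)
    psi_k = np.fft.fft(psi)
    kinetic = np.sum(0.5 * k ** 2 * np.abs(psi_k) ** 2) / np.sum(np.abs(psi_k) ** 2)
    potential_energy = np.sum(potential * np.abs(psi) ** 2) * dx
    return psi.real, kinetic + potential_energy


x = np.linspace(-10.0, 10.0, 256, endpoint=False)
psi0, e0 = imaginary_time_ground_state(0.5 * x ** 2, x, d_tau=0.01, n_steps=2000)
```

Renormalizing after every step plays the role of the factor $\exp(E_0\tau/\hbar)$ in Eq. (114) without requiring the eigenenergy in advance; here the energy is estimated afterwards as an expectation value.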
This method thus provides a technique to calculate the lowest eigenfunction by a slight modification of a multislice algorithm.

b. Determination of $E_{nm} < 0$ Using the Multislice Method. To extract the lowest eigenfunction from $\Psi(x, y, \zeta)$, $E_{00}$ should be known. $E_{00}$ can be calculated with a standard multislice algorithm, as will be shown below. Regarding the discrete form of Eq. (47),

$$\Psi(x, y, z) = \sum_{nm} c_{nm}\, \psi_{nm}(x, y) \exp\left[-i\pi \frac{E_{nm}}{E_0} k_z z\right], \tag{119}$$

with $E_{nm} \in \mathbb{R}$, it is clear that each term oscillates as a function of z with a respective frequency $\frac{E_{nm}}{E_0}\frac{k_z}{2}$, proportional to the eigenenergy $E_{nm}$. To determine the respective frequencies of the terms, the forward 1D Fourier transform with respect to z of Eq. (119) is calculated:

$$\hat{\Psi}(x, y, g_z) = \sum_{nm} c_{nm}\, \psi_{nm}(x, y)\, \delta\left(g_z + \frac{E_{nm}}{E_0}\frac{k_z}{2}\right), \tag{120}$$

with $\delta(g)$ the delta function. After integration over x and y, a 1D function $f(g_z)$ is obtained as a function of $g_z$, from which the respective eigenenergies $E_{nm}$ of the eigenfunctions $\psi_{nm}(x, y)$ for which $c_{nm} \neq 0$ can be obtained by determining the positions of the delta peaks. Note that the theoretical function $\hat{\Psi}(x, y, g_z)$ is calculated from an infinite continuous series of wave functions $\Psi(x, y, z)$ as a function of z. In practice this is not feasible, so that a finite discrete series is calculated. Due to the finiteness of the series, the delta peaks in $f(g_z)$ are convolved with a sinc function, defined as $\mathrm{sinc}(x) = \sin(x)/x$, and are thus broadened. A new discrete sampled function $F(g_z)$ is obtained:

$$F(g_z) = \sum_{nm} (z_{\max} - z_{\min})\, \mathrm{sinc}\left[\pi (z_{\max} - z_{\min}) \left(g_z + \frac{E_{nm}}{E_0}\frac{k_z}{2}\right)\right] c_{nm} \int_x \int_y \psi_{nm}(x, y)\, dx\, dy, \tag{121}$$
with $z_{\max}$ and $z_{\min}$, respectively, the thicknesses at which the last and first wave function in the range are calculated. The peak positions can thus be determined by fitting sinc functions to the discrete sampled function $F(g_z)$. To narrow the sinc function and to sample the peaks sufficiently, $\Psi(x, y, z)$ must be calculated over a large range, up to large values of z. As a consequence of the broadening of the peaks, the eigenenergies of the eigenfunctions contributing to the wave function must be well separated to avoid problems in determining the peak positions. This problem is not encountered in the case of a single atom column in a large unit cell. The multislice method thus provides a technique to calculate the eigenenergy of the bound eigenfunctions. Note that this method is not limited to the case of an isolated atom column but can also be used for a pair of atom columns. This allows the influence of the distance between the atom columns on the eigenenergy to be evaluated, as shown in Section V. Instead of integrating over the whole wave function, one can alternatively integrate over a single atom column of a pair of atom columns in order to determine the eigenenergy of that single atom column, which makes the technique quite powerful. The accuracy that can be achieved with this method is discussed in subsection III.F.3.

F. Comparison of the Performance of the Presented Methods to Calculate the Bound Eigenfunctions and Their Eigenenergy

1. Finite Difference Method

To study the performance of the finite difference method, the eigenenergies of the bound eigenfunctions were calculated for a Cu and an Au atom column with, respectively, a repeat distance in the atom column of 0.3615 nm and 0.40786 nm, for various numbers of sampling points $N = N_x = N_y$ and ranges $[-L, L] = [-L_x, L_x] = [-L_y, L_y]$. The Debye–Waller factor B is 0.006 nm², and the acceleration voltage 300 keV.
Table 2 shows the eigenenergies calculated for a Cu atom column, and Table 3 shows the eigenenergies calculated for an Au column. First, the eigenvalue is strongly dependent on the grid spacing. For equal grid spacings [e.g., (N = 64, L = 0.3 nm) and (N = 128, L = 0.6 nm)], comparable eigenenergies
TABLE 2
THE EIGENENERGIES OF THE BOUND EIGENFUNCTIONS OF A Cu ATOM COLUMN(a)

              L (nm)
N       0.3         0.6         1.2         2.4
64      78.288      88.862      180.481     235.432
128     77.194      78.288      88.862      180.4811
192     76.999      77.470      79.602      118.5826

              L (nm)
N       0.3         0.6         1.2         2.4
64      1.364       1.95        0.5050      0.0568
128     1.1926      1.4573      1.95        0.5051
192     1.1613      1.344       1.632       1.722

              L (nm)
N       0.3         0.6         1.2         2.4
64      0.7438      0.4592      4.7894      0.0451
128     0.8574      0.0431      0.4806      4.7894
192     0.8854      0.0170      0.1584      3.0792

(a) d = 0.3615 nm, B = 0.006 nm², and an acceleration voltage of 300 keV, for various ranges L and number of sampling points N. In the upper table the n = 0, m = 0 eigenenergy is shown; in the middle table the n = 1, |m| = 1 eigenenergy; and in the lower table the n = 2, m = 0 eigenenergy.
are found. Second, the eigenenergy converges only slowly as a function of N. Therefore, a large number of sampling points is needed before convergence is achieved. Nevertheless, because the dimension of the eigenmatrix increases quadratically as a function of the number of sampling points N, the maximum number of sampling points is limited. As is clear from the tables, the less localized eigenfunctions with quantum numbers n = 1, |m| = 1 and n = 2, m = 0 are hard to determine, particularly for the lighter atom columns such as the Cu atom column, since the results are highly dependent on N as well as on the range $[-L, L]$. In comparison with the finite difference calculations presented in Op de Beeck (1994), where the differential Eq. (56) was expanded in finite differences and the asymmetric difference formula was used to approximate the first-order and second-order derivatives in the neighborhood of the boundary points, no nonphysical solutions are encountered and more sampling points are needed before convergence is achieved. The latter is obvious because, in order to sample the potential equally, the actual number of grid points is $4N^2$
TABLE 3
EIGENENERGIES OF THE BOUND EIGENFUNCTIONS OF AN Au ATOM COLUMN(a)

              L (nm)
N       0.3         0.6         1.2
64      215.242     266.790     423.607
128     211.443     215.242     266.794
192     210.786     212.384     220.895

              L (nm)
N       0.3         0.6         1.2
64      42.095      44.9431     18.867
128     39.390      42.095      44.9431
192     38.936      40.049      44.910

              L (nm)
N       0.3         0.6         1.2
64      9.426       35.7930     26.534
128     8.251       9.429       35.793
192     8.065       8.530       11.567

(a) d = 0.40786 nm, B = 0.006 nm², and an acceleration voltage of 300 keV, for various ranges L and number of sampling points N. In the upper table the n = 0, m = 0 eigenenergy is shown; in the middle table the n = 1, |m| = 1 eigenenergy; and in the lower table the n = 2, m = 0 eigenenergy.
in the case of a finite difference expansion of differential Eq. (48), compared to N in the case of a finite difference expansion of differential Eq. (56). From a numerical point of view, it is possible to solve the eigenvalue problem using the finite difference method. Nevertheless, a finite difference expansion limited to first order in $\Delta x$ and $\Delta y$ is a limitation from the physical point of view. In conclusion, an expansion of the bound eigenfunctions in an efficient basis set is preferable to the finite difference method and allows the eigenenergy of the bound eigenfunctions to be calculated much more accurately, as shown in the next paragraph.

2. Bessel Functions of the First Kind Versus Two-Dimensional Quantum Harmonic Oscillator Eigenfunctions

Figures 5 to 9 show plots of the eigenenergies of the bound eigenfunctions for various atom column types as a function of the number of basis functions in the expansion. The atom column types are Si, Cu, Sr, Sn, and Au, with,
FIGURE 5. The eigenenergy of the eigenfunction (n = 0 and m = 0) calculated by expansion of the eigenfunction in, respectively, a set of Bessel functions of the first kind “B,” for various ranges [0, L] and numbers of basis functions, and a set of 2D quantum harmonic oscillator eigenfunctions “H,” for various numbers of basis functions. The atom column types in the respective plots are (a) Si, (b) Cu, and (c) Sr, with respective repeat distances of 0.5431 nm, 0.3615 nm, and 0.608 nm. The Debye–Waller factor is 0.006 nm² and the acceleration voltage 300 keV.
respectively, a repeat distance of 0.5431 nm, 0.3615 nm, 0.608 nm, 0.6489 nm, and 0.40786 nm. The Debye–Waller factor B is 0.006 nm² and the acceleration voltage 300 keV. The matrix elements of matrix C are calculated from Eqs. (71) and (79), respectively, corresponding with an
FIGURE 6. The eigenenergy of the eigenfunction (n = 0 and m = 0) calculated by expansion of the eigenfunction in, respectively, a set of Bessel functions of the first kind “B,” for various ranges [0, L] and numbers of basis functions, and a set of 2D quantum harmonic oscillator eigenfunctions “H,” for various numbers of basis functions. The atom column types in the respective plots are (a) Sn and (b) Au, with respective repeat distances of 0.6489 nm and 0.40786 nm. The Debye–Waller factor is 0.006 nm² and the acceleration voltage 300 keV.
expansion in Bessel functions of the first kind or 2D quantum harmonic oscillator eigenfunctions. The dimension of C is equal to the largest number of basis functions in the expansion shown in the plots. For each smaller number of basis functions in the expansion, a submatrix of C is used. The matrix elements of C thus have to be calculated only once, for the largest number of basis functions. In the case of a Bessel function expansion, the dependence of the eigenenergy on L is studied as well. From the plots it can be concluded that the larger L is, the more basis functions are needed to achieve convergence. Figures 5 and 6 show plots of the lowest eigenenergies of the eigenfunctions (n = 0 and m = 0). It is shown that for both sets of basis functions the eigenenergy converges using a reasonable number of basis functions in the expansion. Nevertheless, it is clear that a quantum harmonic oscillator expansion is much more effective than a Bessel function expansion. Only a
FIGURE 7. The eigenenergy of the eigenfunction (n = 1 and |m| = 1) calculated by expansion of the eigenfunction in, respectively, a set of Bessel functions of the first kind “B,” for various ranges [0, L] and numbers of basis functions, and a set of 2D quantum harmonic oscillator eigenfunctions “H,” for various numbers of basis functions. The atom column types in the respective plots are (a) Cu, (b) Sr, and (c) Sn, with respective repeat distances of 0.3615 nm, 0.608 nm, and 0.6489 nm. The Debye–Waller factor is 0.006 nm² and the acceleration voltage 300 keV.
limited number of basis functions must be taken into account if the 2D quantum harmonic oscillator width $b'$ is optimized to the problem. In each plot the optimized value for $b'$ is mentioned. Also for the calculation of the eigenfunctions and eigenenergies (n ≠ 0 and m ≠ 0), a quantum harmonic oscillator expansion is more effective than a
FIGURE 8. The eigenenergy of the eigenfunction (n = 1 and |m| = 1) calculated by expansion of the eigenfunction in, respectively, a set of Bessel functions of the first kind “B,” for various ranges [0, L] and numbers of basis functions, and a set of 2D quantum harmonic oscillator eigenfunctions “H,” for various numbers of basis functions. The atom column type is Au with a repeat distance of 0.40786 nm. The Debye–Waller factor is 0.006 nm² and the acceleration voltage 300 keV.
Bessel function expansion, although the differences are smaller than in the case of (n = 0 and m = 0), as is clear from Figures 7 to 9. Figures 7 and 8 show plots of the eigenenergies of the eigenfunctions (n = 1 and |m| = 1). The eigenfunction (n = 1 and |m| = 1) for Cu is only very weakly bound and is quite broad. In the case of a Bessel function expansion, the eigenenergy does not converge to the exact eigenenergy for all values of L. For example, for L = 0.3 nm the converged eigenenergy is larger than the converged eigenenergy for L = 0.6 nm. The eigenfunction (n = 1 and |m| = 1) is thus not yet damped out at L = 0.3 nm. Here, a quantum harmonic oscillator expansion has a significant advantage compared to a Bessel function expansion. The eigenfunctions of the 2D quantum harmonic oscillator are, after all, orthonormal and complete over the 2D space. The converged eigenvalue and eigenfunction are therefore an exact solution of the eigenproblem, whereas a converged eigenvalue and eigenfunction obtained with a basis set of Bessel functions of the first kind can be false if L is not chosen properly. Figure 9 shows the eigenenergies of the eigenfunctions (n = 2 and m = 0). Using a Bessel function expansion, the converged eigenenergy of Cu in the case of L = 0.3 nm to L = 0.6 nm is positive. In conclusion, a basis set of 2D quantum harmonic oscillator eigenfunctions is preferable to a basis set of Bessel functions of the first kind, especially because an improperly chosen value for L can lead to wrong results even after convergence.
FIGURE 9. The eigenenergy of the eigenfunction (n = 2 and m = 0) calculated by expansion of the eigenfunction in, respectively, a set of Bessel functions of the first kind “B,” for various ranges [0, L] and numbers of basis functions, and a set of 2D quantum harmonic oscillator eigenfunctions “H,” for various numbers of basis functions. The atom column types in the respective plots are (a) Cu, (b) Sr, and (c) Au, with respective repeat distances of 0.3615 nm, 0.608 nm, and 0.40786 nm. The Debye–Waller factor is 0.006 nm² and the acceleration voltage 300 keV.
In this work the Doyle and Turner parameterization for the electron-scattering factors was used to calculate the atom column potential averaged along the atom column. An advantage of this parameterization is that the mean atom column potential can be calculated analytically from the electron-scattering factors, as shown in Appendix A. Although the solutions of
the provided techniques and the presented calculations were based on this parameterization, the techniques are more generally applicable to other parameterizations of the scattering factors or of the atom column potentials averaged along the atom column. However, it is possible that the matrix elements then cannot be calculated analytically, as in the case of a Bessel function expansion, or by means of a recursion relation, as in the case of a quantum harmonic oscillator expansion. The accuracy of the method will then depend on the accuracy of the numerical integration performed to calculate the matrix elements. The reasons for using the Doyle and Turner parameterization for the electron-scattering factors are that it provides a well-behaved atom column potential averaged along the atom column and that most multislice and Bloch wave programs use this parameterization, which allows a comparison to be made.

3. Some Hard Numbers for the Eigenenergies

In Table 4 some hard numbers are given for the eigenenergies calculated for various atom column types. It is found that all methods, namely a Bessel function expansion, a quantum harmonic oscillator expansion, and Fourier analysis of the periodic behavior of the wave function as a function of thickness, give comparable results, provided that a sufficient number of basis

TABLE 4
EIGENENERGIES OF THE BOUND EIGENFUNCTIONS(a)

Atom column type     Bessel functions (eV)    2D quantum harmonic oscillator expansion (eV)    Multislice method (eV)
Si (0.5431 nm)       20.16                    20.11                                            20.3
Cu (0.3615 nm)       78.32                    78.12                                            77.5
                     1.43                     1.42
                     0.12                     0.10
Sr (0.608 nm)        57.26                    57.11                                            57.4
                     3.63                     3.71
                     1.36                     1.36
Sn (0.6489 nm)       69.78                    69.61                                            70.1
                     2.55                     2.54
Au (0.40786 nm)      210.79                   210.27                                           210.9
                     38.68                    38.58
                     7.93                     7.93

(a) Calculated for various atom column types (atom type and repeat distance in the atom column) with the presented methods. The Debye–Waller factor was 0.006 nm² and the acceleration voltage 300 keV.
functions is used in the expansion, that L is chosen properly in the case of a Bessel function expansion, that the 2D quantum harmonic oscillator width is optimized to the problem in the case of a quantum harmonic oscillator expansion, and that the thickness range is chosen properly in the case of a Fourier analysis of the behavior of the wave function as a function of thickness.

G. The Radial Equation of the Continuum Eigenfunctions

In contrast with the bound eigenfunctions, the continuum eigenfunctions are not localized with respect to the atom column core; they keep oscillating and extend infinitely in space. The continuum eigenfunctions can be factorized similarly to the bound eigenfunctions, as shown in Section II. To calculate $\psi(x, y, \xi)$, Eq. (57) must be solved. Here an approximate solution of Eq. (57) is proposed. For large r, eU(r) can be neglected in Eq. (57), so that the radial equation becomes, after some regrouping,

$$\left[\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{2m\xi}{\hbar^2} - \frac{m^2}{r^2}\right] R(r, \xi) = 0, \tag{122}$$

which is Bessel's differential equation. Without any boundary conditions imposed on the solution, this differential equation has two well-known solutions: the Bessel function of the first kind $J_m$ and the Bessel function of the second kind $Y_m$. Consequently, every continuum eigenfunction reduces to a linear combination of the two solutions at large values of r, where eU(r) can be neglected,

$$R(r, \xi) = N_m \left[ C_m J_m\!\left(\sqrt{\frac{2m\xi}{\hbar^2}}\, r\right) + D_m Y_m\!\left(\sqrt{\frac{2m\xi}{\hbar^2}}\, r\right) \right], \tag{123}$$

with $N_m$ a normalization constant. For $r \to \infty$, $J_m$ and $Y_m$ are, respectively, proportional to a cosine and a sine function. $R(r, \xi)$ then becomes

$$R(r \to \infty, \xi) \simeq N_m \sqrt{\frac{2}{\pi \sqrt{\dfrac{2m\xi}{\hbar^2}}\, r}} \left[ C_m \cos\!\left(\sqrt{\frac{2m\xi}{\hbar^2}}\, r - \left(m + \frac{1}{2}\right)\frac{\pi}{2}\right) + D_m \sin\!\left(\sqrt{\frac{2m\xi}{\hbar^2}}\, r - \left(m + \frac{1}{2}\right)\frac{\pi}{2}\right) \right], \tag{124}$$

which is a solution of

$$-\frac{\hbar^2}{2m} \frac{\partial^2}{\partial r^2} R(r \to \infty, \xi) = \xi\, R(r \to \infty, \xi) \tag{125}$$
An alternative solution of Eq. (125) is

$$R(r \to \infty, \xi) \propto \exp\!\left(i \sqrt{\frac{2m\xi}{\hbar^2}}\, r\right) \tag{126}$$
For reasons of mathematical simplicity, this solution is used to study the excitation coefficients of the continuum eigenfunctions in subsection III.H.2.

H. Excitation of the Eigenfunctions

1. Excitation of the Bound Eigenfunctions

Only a limited number of bound eigenfunctions are excited in the case of an isolated atom column and parallel illumination along a main zone axis. This is clear if one calculates the excitation coefficients $c_{nm}$:

$$c_{nm} = \int_0^{2\pi}\!\!\int_0^\infty \psi^*_{nm}(\mathbf{r})\, \Psi(\mathbf{r}, 0)\, r\,dr\,d\varphi = \int_0^\infty R_{nm}(r)\, r\,dr \int_0^{2\pi} \Phi^*_m\, d\varphi, \tag{127}$$

with * denoting the complex conjugate. Because

$$\int_0^{2\pi} \Phi_m\, d\varphi = \begin{cases} \sqrt{2\pi} & m = 0 \\ 0 & m \neq 0, \end{cases} \tag{128}$$

only $c_{n0} \neq 0$; the wave function of an isolated atom column is thus rotationally symmetric, $\Psi(x, y, z) = \Psi(r, z)$. Experimentally it has been found that the number of bound eigenfunctions depends on the type of atom column and on the speed of the incident electrons. In CHRTEM the acceleration voltages are mostly of the order of 200 keV to 400 keV, so that for most atom columns only one eigenfunction is strongly bound, $\psi_{00}(r)$. In previous articles about channeling, this eigenfunction was called the 1S eigenfunction, in analogy with the hydrogen atom. Note that this spectroscopic notation stems from the labeling of the eigenfunctions of a 3D one-electron atom, whereas here the problem is two dimensional. Nevertheless, this nomenclature is used here to avoid confusion with previous articles. The subscript of the 1S eigenfunction will be kept as “00”. For heavier atom columns (e.g., Sr [100] and Au [100]), $\psi_{20}(r)$ is also bound, albeit only weakly compared to $\psi_{00}(r)$. Because $|E_{20}|$ is much smaller than $|E_{00}|$, the relative importance of $\psi_{20}(r)$ is much smaller and builds up much more slowly as a function of thickness than that of $\psi_{00}(r)$. This is clear from Figure 10, where the absolute values of what will be called the excitation, $\left|c_{00} \sin\!\left(\pi \frac{E_{00}}{E_0}\frac{k_z}{2} z\right)\right|$ and $\left|c_{20} \sin\!\left(\pi \frac{E_{20}}{E_0}\frac{k_z}{2} z\right)\right|$ of, respectively, $\psi_{00}(r)$ and $\psi_{20}(r)$, are plotted for various atom column types as a function of thickness.
S-STATE MODEL FOR ELECTRON CHANNELING IN HREM
FIGURE 10. $\left| c_{n0} \sin\!\left( \pi \frac{E_{n0}}{E_0} \frac{k_z}{2} z \right) \right|$ plotted as a function of specimen foil thickness, for all n for which $E_{n0} < 0$. The repeat distance d in the atom column is, respectively, for Si, Cu, Sr, Sn, and Au: 0.5431 nm, 0.3615 nm, 0.608 nm, 0.6489 nm, and 0.40786 nm. The Debye–Waller factor was 0.006 nm² and the acceleration voltage 300 keV. Note that $\left| c_{n0} \sin\!\left( \pi \frac{E_{n0}}{E_0} \frac{k_z}{2} z \right) \right|$ oscillates faster for heavier atom columns.
2. Excitation of the Continuum Eigenfunctions

In contrast with the eigenenergies of the bound eigenfunctions, the eigenenergy $\xi$ is not discrete but continuous. The excitation coefficient of a continuum eigenfunction with eigenenergy $\xi$ is equal to
\[ c(\xi) = \int_0^{\infty} R(r, \xi)\, r\, dr \int_0^{2\pi} \Phi_m^{*}\, d\varphi = \begin{cases} \sqrt{2\pi} \int_0^{\infty} R(r, \xi)\, r\, dr & m = 0 \\ 0 & m \neq 0. \end{cases} \tag{129} \]
Because the continuum eigenfunctions extend infinitely in space, the part of the continuum eigenfunctions where eU(r) can be neglected contributes most to the integral, which is approximately equal to
\[ \int_0^{\infty} R(r, \xi)\, r\, dr \simeq \int_0^{\infty} R(r \to \infty, \xi)\, r\, dr \propto \int_0^{\infty} e^{\, i \sqrt{\frac{2m\xi}{\hbar^2}}\, r}\, r\, dr \propto \sqrt{\frac{\hbar^2}{2m\xi}}. \tag{130} \]
The excitation coefficient of the continuum eigenfunctions is thus inversely proportional to $\sqrt{\xi}$ (Op de Beeck and Van Dyck, 1995). From this it can be concluded that only the continuum eigenfunctions with an eigenenergy close to zero have an excitation coefficient that is appreciably different from zero.
Furthermore, if the excitation $\left| c_m(\xi) \sin\!\left( \pi \frac{\xi}{E_0} \frac{k_z}{2} z \right) \right|$ of a continuum eigenfunction with an eigenenergy close to zero is considered, it can be concluded that its relative importance compared to the bound eigenfunctions is negligible.
IV. THE S-STATE MODEL

As concluded in subsection III.H.1, there is only one strongly bound eigenfunction, namely the 1S eigenfunction. For heavier atom columns other eigenfunctions are bound, but only weakly compared to the 1S eigenfunction. In subsection III.H.2 it was concluded that the excitation of the continuum eigenfunctions is negligible compared to the excitation of the 1S eigenfunction. The wave function can thus to good approximation be written as
\[ \Psi(r, z) \simeq 1 + 2 c_{00} \sin\!\left( \pi \frac{E_{00}}{E_0} \frac{k_z}{2} z \right) \psi_{00}(r) \exp\!\left[ -i\pi \left( \frac{E_{00}}{E_0} \frac{k_z}{2} z + \frac{1}{2} \right) \right], \tag{131} \]
an expression that is known as the S-state model (Geuens and Van Dyck, 2002). From Figure 10 it can be concluded that restricting the expansion to $\psi_{00}(r)$ is a valid approximation, for an isolated atom column, up to a thickness of approximately 80% of $D_{00} = \frac{2}{k_z} \frac{E_0}{E_{00}}$, the thickness period. This can be taken as a rule of thumb for the validity of this approximation. The thickness of the specimen foil used in HREM is mostly in that range. Note that the S-state model is valid for incident electrons with a kinetic energy in the intermediate range, that is, 200 to 400 keV, and is not valid for high kinetic energies (e.g., 1 MeV), for which, due to the increase of the interaction, more rotationally symmetric eigenfunctions are bound.

A. Physical Insight in the S-State Model: The Channeling Map

In this section the different notations of the S-state model are discussed, as well as their physical meaning. The mathematical formulation of the S-state model can be expressed in two equivalent forms, namely as Eq. (51) or Eq. (52). From Eq. (51) the wave function can be written as an expansion in eigenfunctions of the atom column,
\[ \Psi(r, z) \simeq c_{00} \psi_{00}(r) \exp\!\left[ -i\pi \frac{E_{00}}{E_0} k_z z \right] + \left( 1 - c_{00} \psi_{00}(r) \right). \tag{132} \]
The first term contains the 1S eigenfunction of the atom column with an eigenenergy $E_{00}$; the second term contains an eigenfunction with eigenenergy
equal to zero, namely a continuum eigenfunction, which is constant at large r, where $c_{00} \psi_{00}(r)$ is zero. On the other hand, from Eq. (52) the wave function can be written as a sum of two waves, namely a vacuum wave, which did not interact with the atom column, and a scattered wave:
\[ \Psi(r, z) \simeq 1 + 2 c_{00} \sin\!\left( \pi \frac{E_{00}}{E_0} \frac{k_z}{2} z \right) \psi_{00}(r) \exp\!\left[ -i\pi \left( \frac{E_{00}}{E_0} \frac{k_z}{2} z + \frac{1}{2} \right) \right]. \tag{133} \]
The amplitude of the scattered wave depends on r through the r dependence of the 1S eigenfunction, and oscillates as a function of z with a periodicity equal to $D_{00}$, the thickness periodicity, which depends on the "weight" of the atom column. The phase of the scattered wave is a linear function of z and is, apart from a constant, proportional to the 1S eigenenergy. The heavier the atom column, the larger the phase shift as a function of the thickness of the specimen foil. The scattered wave can now be obtained by subtracting the vacuum wave, that is,
\[ \Psi(r, z) - 1 \simeq 2 c_{00} \sin\!\left( \pi \frac{E_{00}}{E_0} \frac{k_z}{2} z \right) \psi_{00}(r) \exp\!\left[ -i\pi \left( \frac{E_{00}}{E_0} \frac{k_z}{2} z + \frac{1}{2} \right) \right]. \tag{134} \]
The amplitude and phase of the scattered wave are, respectively, equal to
\[ \mathrm{abs}(\Psi(r, z) - 1) \simeq 2 c_{00} \sin\!\left( \pi \frac{E_{00}}{E_0} \frac{k_z}{2} z \right) \psi_{00}(r), \tag{135} \]
\[ \arg(\Psi(r, z) - 1) \simeq -\pi \left( \frac{E_{00}}{E_0} \frac{k_z}{2} z + \frac{1}{2} \right). \tag{136} \]
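The equivalence of the eigenfunction form, Eq. (132), and the vacuum-plus-scattered-wave form, Eq. (133), can be verified numerically. The sketch below assumes the sign convention $\exp[-i\pi (E_{00}/E_0) k_z z]$ and uses purely illustrative numbers (a local value of 0.8 for the product of excitation coefficient and eigenfunction, an eigenenergy of -210 eV, $E_0$ = 300 keV, $k_z$ = 500 nm⁻¹); none of these values or function names come from the text.

```python
import cmath
import math


def theta(e00_ev, e0_ev, kz_per_nm, z_nm):
    """Dimensionless phase pi * (E00/E0) * kz * z."""
    return math.pi * (e00_ev / e0_ev) * kz_per_nm * z_nm


def wave_eigen_form(c_psi, th):
    """Eq. (132): c*psi*exp(-i*theta) + (1 - c*psi), with c_psi the local product c00*psi00(r)."""
    return c_psi * cmath.exp(-1j * th) + (1.0 - c_psi)


def wave_scattered_form(c_psi, th):
    """Eq. (133): 1 + 2*c*psi*sin(theta/2)*exp(-i*(theta/2 + pi/2))."""
    return 1.0 + 2.0 * c_psi * math.sin(th / 2.0) * cmath.exp(-1j * (th / 2.0 + math.pi / 2.0))


def thickness_period(e00_ev, e0_ev, kz_per_nm):
    """Thickness at which the scattered amplitude returns to zero: 2*E0 / (|E00|*kz)."""
    return 2.0 * e0_ev / (abs(e00_ev) * kz_per_nm)


if __name__ == "__main__":
    th = theta(-210.0, 300e3, 500.0, 3.0)
    print(wave_eigen_form(0.8, th), wave_scattered_form(0.8, th))
    print("thickness period (nm):", thickness_period(-210.0, 300e3, 500.0))
```

Both forms agree to rounding error for any thickness, and the scattered wave vanishes when the thickness equals the thickness period.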
The phase of the scattered wave provides direct access to the 1S eigenenergy $E_{00}$ if the foil thickness z is known. To interpret the amplitude of the scattered wave, the limit for z going to zero, or the kinematic limit, is considered. Eq. (135) then becomes
\[ \lim_{z \to 0} \mathrm{abs}(\Psi(r, z) - 1) \simeq \sigma U(r) z, \tag{137} \]
because
\[ c_{00} \psi_{00}(r) = -\frac{eU(r)}{E_{00}} \quad \text{and} \quad \frac{\pi k_z e}{E_0} = \sigma, \tag{138} \]
using the approximate expression of the 1S eigenfunction given by Sinkler and Marks (1999) and $\sigma = \frac{2\pi m e \lambda}{h^2}$. This expression is obtained by setting Eq. (131) (the S-state model) equal to Eq. (7) (the weak phase approximation), in the limit for z going to zero and with $\int_0^z eV_s(r, z')\, dz' = eU(r) z$. The amplitude of the scattered wave is thus, in the kinematic approximation,
proportional to the atom column potential averaged along the beam direction. At thicknesses where the kinematic approximation is no longer valid, the amplitude of the scattered wave function is still peaked at the atom column position. Nevertheless, the peak height depends on the foil thickness and oscillates with a periodicity $D_{00}$. Note that due to dynamic scattering, it is now possible that for certain thicknesses light atom columns contribute much more to the amplitude of the scattered wave than heavy atom columns. This will happen if the specimen foil thickness is almost equal to $D_{00}$ of the heavy atom column. In the phase of the scattered wave, phase jumps of $\pi$ will appear. This is due to the fact that the amplitude is restricted to positive values. Every time the specimen foil thickness equals an integer times $D_{00}$, the scattered wave changes sign as a function of the specimen foil thickness and a phase jump occurs. Even at large thicknesses, where the 1S-state model is no longer valid, the amplitude of the scattered wave is peaked at the atom column positions. The amplitude of the scattered wave is then equal to
\[ \mathrm{abs}(\Psi(r, z) - 1) = \left| \sum_{nm} 2 c_{nm} \sin\!\left( \pi \frac{E_{nm}}{E_0} \frac{k_z}{2} z \right) \psi_{nm}(r) \exp\!\left[ -i\pi \left( \frac{E_{nm}}{E_0} \frac{k_z}{2} z + \frac{1}{2} \right) \right] \right| \tag{139} \]
\[ = \left[ \sum_{nm} \sum_{n'm'} 4 c_{nm} c_{n'm'} \sin\!\left( \pi \frac{E_{nm}}{E_0} \frac{k_z}{2} z \right) \sin\!\left( \pi \frac{E_{n'm'}}{E_0} \frac{k_z}{2} z \right) \psi_{nm}(r) \psi_{n'm'}(r) \exp\!\left( -i\pi \frac{E_{nm} - E_{n'm'}}{E_0} \frac{k_z}{2} z \right) \right]^{1/2}, \tag{140} \]
which is still peaked at the atom column position. Due to this property, direct methods for x-ray crystallography will still work for dynamic electron diffraction, even at thicknesses for which the electron-object interaction can be regarded as highly dynamic. This will be shown theoretically in Section VIII.B. In practical experiments, the vacuum wave mostly differs from one. The exact value of the vacuum wave can be determined only if there is an edge of the object in the same field of view. The channeling wave is then calculated as
\[ \frac{\Psi(r, z) - \Psi_u(r, z)}{\Psi_u(r, z)}, \tag{141} \]
with $\Psi_u(r, z)$ the vacuum wave.
A convenient way to visualize that the amplitude oscillates periodically as a function of thickness and that the phase increases linearly with thickness is to plot, for each pixel of the wave function, its real and imaginary values as coordinates in a complex plane, with the x-axis as the real axis and the y-axis as the imaginary axis. If the object is a wedge containing different thicknesses, it can be expected from the 1S model that all the points are located on a circular locus that starts at the origin (for zero thickness) and has its center on the x-axis. The amplitude of the channeling wave is then the radius of the circle and the phase is the angular coordinate along the circle. All the pixels of the same column may have slightly different amplitudes, so that the circle in fact becomes a ring, but they are all expected to have the same angular coordinate (i.e., to lie in the same sector). If an atom is added to a column, the corresponding points are expected to rotate over a fixed angular increment. If the crystal contains different types of columns, this map will reveal a distinct circle for each type of column. For this reason we call this representation a channeling map. Figure 11a shows simulations and Figure 11b experimental results for the channeling map of Au [110], which both confirm these theoretical conclusions. For a discussion see Section VII.
FIGURE 11. (a) Channeling map of a simulated wave function of a wedge-shaped Au [110] sample (10 layers) by multislice calculations (300 keV, Debye–Waller factor 0.005 nm²). (b) Channeling map of an experimentally reconstructed wave function of a wedge-shaped Au [110] sample. The complex pixel values can be assigned to sectors, which correspond to different numbers of atoms in the atom columns. (Courtesy of J. R. Jinschek and C. Kisielowski, NCEM, Berkeley, California.)
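The circular locus of the channeling map follows directly from the single-eigenfunction form of the scattered wave, which is proportional to $\exp(-i\theta) - 1$. The sketch below is an illustration under stated assumptions, not the authors' procedure: the local product of excitation coefficient and 1S eigenfunction is collapsed into one real number `c_psi`, and each added atom is assumed to advance the phase by a fixed, hypothetical increment `phase_per_atom`.

```python
import cmath


def channeling_points(c_psi, phase_per_atom, n_atoms):
    """Scattered wave c_psi * (exp(-i * n * phase_per_atom) - 1) for n = 0 .. n_atoms.

    c_psi and phase_per_atom are illustrative inputs, not values from the text.
    """
    return [c_psi * (cmath.exp(-1j * n * phase_per_atom) - 1.0) for n in range(n_atoms + 1)]


def distances_from_center(points, c_psi):
    """Distance of every point from the circle center at -c_psi on the real axis."""
    return [abs(p + c_psi) for p in points]


if __name__ == "__main__":
    pts = channeling_points(0.8, -0.35, 10)
    print(pts[0])                             # zero thickness lies at the origin
    print(distances_from_center(pts, 0.8))    # all equal to c_psi
```

All points lie on a circle through the origin with its center on the real axis, and consecutive points are separated by the same rotation angle about that center.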
B. Scaling and Parameterization of the S-State Model

1. Scaling and Parameterization of the 1S Eigenfunction

Empirically, it has been shown that the 1S eigenfunctions of most types of atom columns can be scaled to a uniform 1S eigenfunction $\psi'_{00}(r)$ (Van Dyck and Chen, 1999a,b; Geuens et al., 1999; see also Figure 12b), defined by
\[ \psi_{00}(r) = \sqrt{|E_{00}|}\; \psi'_{00}\!\left( \sqrt{|E_{00}|}\, r \right), \tag{142} \]
which is independent of $E_{00}$. Substitution of Eq. (142) in Eq. (48) results in
\[ \left[ -\frac{\hbar^2}{2m} \Delta_{r'} - \frac{e}{|E_{00}|} U\!\left( \frac{r'}{\sqrt{|E_{00}|}} \right) \right] \psi'_{00}(r') = -\psi'_{00}(r'), \tag{143} \]
with $r' = \sqrt{|E_{00}|}\, r$. Since $\psi'_{00}(r')$ is independent of $E_{00}$, $\frac{1}{|E_{00}|} eU\!\left( \frac{r'}{\sqrt{|E_{00}|}} \right)$ should also be independent of $E_{00}$, which leads to a scaling requirement of eU(r):
\[ eU'(r') = \frac{1}{|E_{00}|}\, eU\!\left( \frac{r'}{\sqrt{|E_{00}|}} \right). \tag{144} \]
FIGURE 12. The normalized and scaled potential $eU'(r')$ (a) and the normalized and scaled 1S eigenfunction $\psi'_{00}(r')$ (b).
It can be shown empirically that such a scaling exists, as is clear from Figure 12a. The scaling requirement of eU(r) is also clear from Eq. (138), or from the scaling relation
\[ c'_{00}\, \psi'_{00}(r') = eU'(r'). \tag{145} \]
Although this approximation of the 1S eigenfunction is a crude one, it illustrates why the 1S eigenfunction resembles the atom column potential, as well as the scaling requirement of eU(r). If such a uniform atom column potential exists, the integral volume of each atom column potential must be constant because
\[ \int_0^{2\pi}\!\!\int_0^{\infty} eU(r)\, r\, dr\, d\varphi = \frac{2\pi \hbar^2}{m_0 d}\, f_e(0) = \int_0^{2\pi}\!\!\int_0^{\infty} eU'(r')\, r'\, dr'\, d\varphi = \text{constant}. \tag{146} \]
Nevertheless, it is clear that the integral volume is not constant but inversely proportional to d, the repeat distance in the atom column, and proportional to $f_e(0)$. Here $f_e(0)$ is the electron-scattering factor at g = 0, which is equal to the integral volume of the 3D atom potential. Note that this holds for the particular case of identical atoms in the atom column, but it can be generalized in a straightforward way for nonidentical atoms in the atom column. An expression for the electron-scattering factor at g = 0 is given by Ibers (1958):
\[ f_e(0) = \frac{Z}{3 a_0} \langle r^2 \rangle, \tag{147} \]
with $a_0 = \hbar^2 / m_0 e^2 \simeq 0.05292$ nm the Bohr radius, $\langle r^2 \rangle$ the mean square radius of the electrons in the atom, and Z the atomic number. Assuming that $\langle r^2 \rangle$ is nearly constant, it can be concluded that the integral volume of the atom column potential is proportional to Z. At first sight these results seem to contradict the conclusion that the scaled potentials coincide quite well. Nevertheless, a careful study of Figure 12a shows that although the scaled potentials coincide quite well, the integral volume of the 2D scaled potentials is different. Because a uniform 1S eigenfunction exists, it is attractive to parameterize it by an analytical function. In this case the expression of the wave function would be completely analytical, which allows analytical derivatives to be taken with respect to the parameters, resulting in a fast calculation. From calculations (Figure 12) and prior work (Geuens et al., 1999; Op de Beeck and Van Dyck, 1996; Van Dyck and Chen, 1999a) it can be concluded that the 1S eigenfunctions have a shape that is somewhere in between a 2D quadratically normalized exponential function
\[ \psi_{00}(r) = \sqrt{\frac{|E_{00}|}{2\pi}}\, \frac{1}{b} \exp\!\left( -\frac{1}{2} \left( \frac{\sqrt{|E_{00}|}\, r}{b} \right) \right), \tag{148} \]
and a 2D quadratically normalized Gaussian function
\[ \psi_{00}(r) = \sqrt{\frac{|E_{00}|}{\pi}}\, \frac{1}{b} \exp\!\left( -\frac{1}{2} \left( \frac{\sqrt{|E_{00}|}\, r}{b} \right)^{2} \right). \tag{149} \]
If b is now calculated using a Gaussian parameterization and the method presented in Section IV.C, it can be concluded that b is slightly dependent on Z, as is clear from Figure 13a. From Figure 13b it can be concluded that b depends on d, for values of d in the physically interesting region where d is of the order of 0.1 nm to 0.7 nm. In Figure 13c it is shown that b is independent of B for realistic values of B.
FIGURE 13. The dependence of b on the atomic number Z (a), the repeat distance in the atom column d (b), and the Debye–Waller factor B (c).
FIGURE 14. The dependence of $\sqrt{d}\, b$ for light atom columns (large values of the repeat distance in the atom column d) as a function of d.
In the limit for large values of d or light atoms in the atom column, the behavior of b as a function of d and Z can be predicted by Eq. (145), which gives an approximation for the uniform 1S eigenfunction in the case of weak phase conditions (large d). The integral volume of the uniform 1S eigenfunction squared is, according to Eq. (145), proportional to the integral volume of the atom column potential, which is inversely proportional to d. In the case of a quadratically normalized 2D Gaussian parameterization of the uniform 1S eigenfunction, the integral volume of the uniform 1S eigenfunction squared is proportional to $b^2$. b should thus be inversely proportional to $\sqrt{d}$, which is shown in Figure 14.

2. Parameterization of the 1S Eigenenergy

Empirically it was shown (Op de Beeck and Van Dyck, 1995; Van Dyck and Chen, 1999a,b) that $E_{00}$ can be approximately parameterized as a function of the atomic number Z, the repeat distance in the atom column d, and the Debye–Waller factor B, as in Van Dyck et al. (1989),
\[ \frac{1}{|E_{00}|} = a \left( \frac{d^2}{Z} + bB \right), \tag{150} \]
or as in Op de Beeck and Van Dyck (1995) and Van Dyck and Chen (1996b),
\[ \frac{1}{|E_{00}|} = a \left( \frac{d^{5/4}}{Z} + bB \right). \tag{151} \]
In this paragraph, a theoretical foundation is given for these expressions. First, assume B = 0. In this case all atoms are perfectly aligned in the atom column. The atom column potential is then sharply peaked at the atom
column core, as is the 1S eigenfunction. Therefore, it can be assumed that the 1S eigenfunction is well described by a 2D quadratically normalized exponential function of the form
\[ \psi_{00}(r) = \sqrt{\frac{|E_{00}^{B=0}|}{2\pi}}\, \frac{1}{b} \exp\!\left( -\frac{1}{2} \left( \frac{\sqrt{|E_{00}^{B=0}|}\, r}{b} \right) \right). \tag{152} \]
By substitution of Eq. (152) in Eq. (48) it can be shown mathematically that the atom column potential is inversely proportional to r and, because it is averaged along the atom column, inversely proportional to d:
\[ eU(r, B = 0) = C\, \frac{1}{d\, r}, \tag{153} \]
with C a proportionality factor, which is independent of d. Such coulombic string potentials were used in the past to describe so-called rosette-motion channeling (Komaki and Fujimoto, 1974; Tamura and Kawamura, 1976). The eigenenergy can then be calculated as
\[ E_{00}^{B=0} = \frac{\int_0^{\infty} H \psi_{00}(r)\, r\, dr}{\int_0^{\infty} \psi_{00}(r)\, r\, dr}. \tag{154} \]
After substitution of Eqs. (152) and (153) in Eq. (154), Eq. (154) is equal to
\[ E_{00}^{B=0} = -\frac{C}{d}\, \frac{\int_0^{\infty} \exp\!\left( -\frac{1}{2} \left( \frac{\sqrt{|E_{00}^{B=0}|}\, r}{b} \right) \right) dr}{\int_0^{\infty} \exp\!\left( -\frac{1}{2} \left( \frac{\sqrt{|E_{00}^{B=0}|}\, r}{b} \right) \right) r\, dr} = -\frac{1}{b}\, \frac{C}{2d} \sqrt{|E_{00}^{B=0}|}, \tag{155} \]
or
\[ \frac{1}{|E_{00}^{B=0}|} = \frac{4 b^2}{C^2}\, d^2. \tag{156} \]
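The step from Eq. (155) to Eq. (156) relies on the ratio of two elementary integrals: for a decay constant $\kappa = \sqrt{|E_{00}^{B=0}|} / (2b)$ the ratio equals $\kappa$. The following sketch checks this numerically and confirms that Eq. (156) is the self-consistent solution. The function names and the numerical values of b, C, and d are arbitrary illustrations in unspecified units, not values from the text.

```python
import math


def midpoint_integral(f, upper, steps=20000):
    """Midpoint-rule estimate of the integral of f over [0, upper]."""
    h = upper / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))


def string_eigenenergy(kappa, c_over_d):
    """Right-hand side of Eq. (155): -(C/d) * int exp(-kappa r) dr / int exp(-kappa r) r dr."""
    upper = 40.0 / kappa  # the integrands are negligible beyond this radius
    num = midpoint_integral(lambda r: math.exp(-kappa * r), upper)
    den = midpoint_integral(lambda r: r * math.exp(-kappa * r), upper)
    return -c_over_d * num / den


def inverse_energy_eq156(b, c, d):
    """Eq. (156): 1/|E| = 4 b^2 d^2 / C^2."""
    return 4.0 * b ** 2 * d ** 2 / c ** 2


if __name__ == "__main__":
    b, c, d = 0.5, 2.0, 0.4                       # arbitrary illustrative values
    energy = 1.0 / inverse_energy_eq156(b, c, d)  # |E| predicted by Eq. (156)
    kappa = math.sqrt(energy) / (2.0 * b)         # decay constant of Eq. (152)
    print(energy, string_eigenenergy(kappa, c / d))
```

With these inputs Eq. (156) gives |E| = 25, and inserting the corresponding decay constant back into Eq. (155) returns -25, as expected for a self-consistent solution.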
Because C is independent of d, the power of d depends on the relation between b and d. C and b depend on Z, as shown in Section IV.B.1. The power of Z is thus determined by these dependencies. To estimate the powers of d and Z, as well as a proportionality factor a, a large number of eigenenergies are calculated for Z = 14 to 79, d = 0.3 nm to 0.7 nm, and B = 0, to which two parametric models are fitted:
\[ \frac{1}{|E_{00}^{B=0}|} = a_1 \frac{d^x}{Z}, \tag{157} \]
and
\[ \frac{1}{|E_{00}^{B=0}|} = a_2 \frac{d^x}{Z^y}. \tag{158} \]
The parameters were found to be $a_1$ = 1.1541, $a_2$ = 3.5712, x = 1.66, and y = 1.275. The parametric models resemble Eqs. (150) and (151). The parameters given are those for which the parametric models best fit the calculated eigenenergies for the given ranges of Z and d. These ranges are chosen quite wide. Slightly different parameters are found if these ranges are chosen smaller. In this way different parameters can be found for light and heavy atoms in the atom column, or for large and small values of d. At the end of this subsection some hard numbers for the eigenenergies of various atom column types are given and compared to exact eigenenergies calculated with one of the methods proposed in this article. Second, it is assumed that B ≠ 0. In this case the atoms are no longer perfectly aligned in the atom column due to thermal motion. As a result, U(r, B ≠ 0) is broadened and flattened. If the atom column potential is broadened, the 1S eigenfunction will also be broader. Let us therefore assume that the 1S eigenfunction is well described by a Gaussian. It can be proven mathematically, by substitution into Eq. (48), that the atom column potential is then quadratic, which is equal to a Gaussian up to second order in r:
\[ U(r, B \neq 0) \propto ab\, |E_{00}^{B \neq 0}| \exp\!\left( -4\pi^2 ab\, |E_{00}^{B \neq 0}|\, r^2 \right). \tag{159} \]
The broadening of U(r, B ≠ 0) due to thermal motion can be described by a convolution of U(r, B = 0) and a Gaussian damping function proportional to $\frac{1}{B} \exp\!\left( -4\pi^2 \frac{r^2}{B} \right)$. Assume now for simplicity that U(r, B = 0) is also Gaussian, in analogy with the usual parameterization of the electron-scattering factors (Doyle and Turner, 1968):
\[ U(r, B = 0) \propto ab\, |E_{00}^{B=0}| \exp\!\left( -4\pi^2 ab\, |E_{00}^{B=0}|\, r^2 \right). \tag{160} \]
For B ≠ 0, U(r, B ≠ 0) is then
\[ U(r, B \neq 0) \propto \frac{1}{\frac{1}{ab\, |E_{00}^{B=0}|} + B} \exp\!\left( -4\pi^2 \frac{r^2}{\frac{1}{ab\, |E_{00}^{B=0}|} + B} \right). \tag{161} \]
From Eq. (161), Eq. (159), and, respectively, Eq. (157) and Eq. (158), it follows that
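\[ \frac{1}{|E_{00}^{B \neq 0}|} = a_1 \left( \frac{d^x}{Z} + b_1 B \right), \tag{162} \]
and
\[ \frac{1}{|E_{00}^{B \neq 0}|} = a_2 \left( \frac{d^x}{Z^y} + b_2 B \right). \tag{163} \]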
In order to estimate b, a large number of eigenenergies are calculated for Z = 14 to 79, d = 0.3 nm to 0.7 nm, and B = 0 nm² to 0.05 nm², to which Eqs. (162) and (163) are fitted. $b_1$ and $b_2$ were found to be, respectively, 0.1734 and 0.05312. The expressions obtained are similar to Eqs. (150) and (151). Eqs. (162) and (163) are not exact expressions for $E_{00}$ but approximate ones, and can be used as rules of thumb. Nevertheless, as shown in Tables 5 and 6, Eqs. (162) and (163) provide very good estimates for the eigenenergy. If the Debye–Waller factor is small, a quadratically normalized exponential will yield a sufficiently good approximation. If the Debye–Waller factor is large, a quadratically normalized 2D Gaussian will yield a sufficiently good approximation. Note that a Gaussian also has the advantage that its Fourier transform has a simple analytic expression. The analytic S-state model is in this case also applicable in Fourier space. Note that this derivation is for the particular case of identical atoms in the atom column. However, it can be shown in a straightforward way that in the case of nonidentical atoms in the atom column Eqs. (157), (158), (162), and (163) become equal to
TABLE 5
COMPARISON OF THE EIGENENERGIES OF THE MOST BOUND EIGENFUNCTION OF DIFFERENT TYPES OF ATOM COLUMNS, CALCULATED WITH DIFFERENT METHODS (DEBYE–WALLER FACTOR 0 nm² AND ACCELERATION VOLTAGE 300 keV)

Z     d (nm)(a)   E00 (eV)(b)   E00 (eV)(c)   E00 (eV)(d)
Si    0.5431      22.91         33.4          22.3
Cu    0.3615      106.99        136.0         110.0
Sr    0.608       72.00         75.2          66.9
Sn    0.6489      91.65         88.8          84.2
Au    0.40786     326.29        303.0         326.0

(a) The repeat distance in the atom column.
(b) The eigenenergy E00 calculated by expansion of the eigenfunctions in a basis of 2D quantum harmonic oscillator eigenfunctions.
(c) Parametric model for E00 provided by Eq. (162).
(d) Parametric model for E00 provided by Eq. (163).
The Debye–Waller factor was 0 nm² and the acceleration voltage 300 keV.
TABLE 6
COMPARISON OF THE EIGENENERGIES OF THE MOST BOUND EIGENFUNCTION OF DIFFERENT TYPES OF ATOM COLUMNS, CALCULATED WITH DIFFERENT METHODS (DEBYE–WALLER FACTOR 0.006 nm² AND ACCELERATION VOLTAGE 300 keV)

Z     d (nm)(a)   E00 (eV)(b)   E00 (eV)(c)   E00 (eV)(d)
Si    0.5431      20.11         32.12         21.8
Cu    0.3615      78.12         116.9         98.5
Sr    0.608       57.11         69.0          61.5
Sn    0.6489      69.61         80.3          76.8
Au    0.40786     210.27        222.4         237.7

(a) The repeat distance in the atom column.
(b) The eigenenergy E00 calculated by expansion of the eigenfunctions in a basis of 2D quantum harmonic oscillator eigenfunctions.
(c) Parametric model for E00 provided by Eq. (162).
(d) Parametric model for E00 provided by Eq. (163).
The Debye–Waller factor was 0.006 nm² and the acceleration voltage 300 keV.
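\[ \frac{1}{|E_{00}^{B=0}|} = a_1 \frac{d^x}{\left( \sum_i Z_i^{1/2} \right)^{2}} \quad \text{and} \quad \frac{1}{|E_{00}^{B=0}|} = a_2 \frac{d^x}{\left( \sum_i Z_i^{y/2} \right)^{2}}, \tag{164} \]
and
\[ \frac{1}{|E_{00}^{B \neq 0}|} = a_1 \left( \frac{d^x}{\left( \sum_i Z_i^{1/2} \right)^{2}} + b_1 B \right), \tag{165} \]
and
\[ \frac{1}{|E_{00}^{B \neq 0}|} = a_2 \left( \frac{d^x}{\left( \sum_i Z_i^{y/2} \right)^{2}} + b_2 B \right). \tag{166} \]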
From simulations it can be shown that the Sr and TiO atom columns of SrTiO3 [100] show a similar contrast variation in the amplitude and phase of the wave function as a function of thickness. This is due to a rather similar mean atom column potential, which is reflected in a similar value for the eigenenergy of both atom columns. This is clear from the equations above, because $Z_{\mathrm{Sr}}^{y/2} \simeq Z_{\mathrm{Ti}}^{y/2} + Z_{\mathrm{O}}^{y/2}$.

C. A Fast Method to Calculate the Parameterized 1S-State: The Variational Principle

It was concluded in subsection IV.B.1 that the 1S eigenfunction of an atom column can be reasonably described as a 2D quadratically normalized Gaussian or exponential function. This deduction was drawn based on
calculations of the most bound eigenfunction of an atom column, by finite difference and by expansion in a set of basis functions, and on physical intuition. Two parameters are unknown in this parameterization of the $\psi_{00}(r)$ eigenfunction: b and $E_{00}$. To improve an estimate for b and $E_{00}$ the variational principle will be used (Sakurai, 1994). This method makes it possible to calculate $E_{00}$ in an easy and quite accurate way compared with the methods presented in Section III.2. Assume that the Gaussian parameterization of the 1S eigenfunction of the atom column is written as
\[ \psi_{00}(r; b') = \frac{1}{\sqrt{\pi}\, b'} \exp\!\left( -\frac{1}{2} \left( \frac{r}{b'} \right)^{2} \right), \tag{167} \]
with $b' = \frac{b}{\sqrt{|E_{00}|}}$ variable. Then the variational principle states that
\[ E_{00} \leq H(b') = \frac{\int \psi_{00}^{*}(r; b')\, H\, \psi_{00}(r; b')\, r\, dr}{\int \psi_{00}^{*}(r; b')\, \psi_{00}(r; b')\, r\, dr}. \tag{168} \]
To minimize H(b'), the integrals in Eq. (168) must be calculated. It can be shown that H(b') is equal to $H_0(b')$ given in Eq. (102) if the Doyle and Turner (1968) parameterization is used for the 2D mean atom column potential U(r). H(b') is then equal to
\[ H(b') = \frac{\hbar^2}{2m} \left( |m| + 1 \right) \frac{1}{b'^2} - \sum_i \frac{A_i}{B_i} \left( \frac{1}{1 + \frac{b'^2}{B_i}} \right)^{|m| + 1}. \tag{169} \]
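In order to minimize H(b'), $b'_{\mathrm{opt}}$ must be a solution of
\[ \sum_i \frac{A_i}{B_i^2}\, \frac{1}{\left( 1 + \frac{b'^2_{\mathrm{opt}}}{B_i} \right)^{|m| + 2}} = \frac{\hbar^2}{2m}\, \frac{1}{b'^4_{\mathrm{opt}}}. \tag{170} \]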
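The minimization of Eq. (169) can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the implementation used for the figures: it works in hypothetical units in which $\hbar^2/2m = 1$, takes the potential parameters as a list of $(A_i, B_i)$ pairs, and uses a golden-section search (a generic one-dimensional minimizer chosen here for simplicity). For a single term with A = 4 and B = 1, Eq. (170) can be solved by hand, giving $b'_{\mathrm{opt}} = 1$ and $H = -1$, which serves as a check.

```python
import math

HBAR2_OVER_2M = 1.0  # hypothetical units in which hbar^2 / 2m = 1


def h_gauss(b, terms, m=0):
    """Eq. (169): expectation value for a Gaussian trial function of width b.

    terms is a list of (A_i, B_i) pairs of the Gaussian potential parameterization.
    """
    kinetic = HBAR2_OVER_2M * (abs(m) + 1) / b ** 2
    potential = sum((a / bi) * (1.0 / (1.0 + b ** 2 / bi)) ** (abs(m) + 1) for a, bi in terms)
    return kinetic - potential


def stationarity_residual(b, terms, m=0):
    """Left-hand side minus right-hand side of Eq. (170); zero at the optimum."""
    lhs = sum((a / bi ** 2) / (1.0 + b ** 2 / bi) ** (abs(m) + 2) for a, bi in terms)
    return lhs - HBAR2_OVER_2M / b ** 4


def minimize_width(terms, lo=0.05, hi=20.0, tol=1e-10):
    """Golden-section search for the width that minimizes h_gauss on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    left, right = lo, hi
    c = right - g * (right - left)
    d = left + g * (right - left)
    while right - left > tol:
        if h_gauss(c, terms) < h_gauss(d, terms):
            right = d
        else:
            left = c
        c = right - g * (right - left)
        d = left + g * (right - left)
    return 0.5 * (left + right)


if __name__ == "__main__":
    terms = [(4.0, 1.0)]  # single illustrative term, not a real atom column
    b_opt = minimize_width(terms)
    print(b_opt, h_gauss(b_opt, terms), stationarity_residual(b_opt, terms))
```

The search assumes a single minimum inside the bracket; for a multi-term potential the bracket may need to be adapted.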
As for a Gaussian, H(b') can be calculated assuming that the 1S eigenfunction of the atom column can be parameterized by an exponential function,
\[ \psi_{00}(r; b') = \frac{1}{\sqrt{2\pi}\, b'} \exp\!\left( -\frac{1}{2} \frac{r}{b'} \right). \tag{171} \]
H(b') can then be written as
\[ H(b') = \frac{\hbar^2}{2m} \frac{1}{b'^2} - 2 \sum_i \frac{A_i}{b'^2} \left[ 1 - \sqrt{\pi}\, \frac{\sqrt{B_i}}{b'} \exp\!\left( \frac{B_i}{b'^2} \right) \left( 1 - \mathrm{erf}\!\left( \frac{\sqrt{B_i}}{b'} \right) \right) \right]. \tag{172} \]
FIGURE 15. The modulus of the eigenenergy $E_{00}$ calculated by expansion of the eigenfunctions in a basis of 2D quantum harmonic oscillators and by the presented fast method (Gaussian parameterization), using the variational principle. The repeat distance d is assumed to be 0.4 nm. The Debye–Waller factor was 0.006 nm² and the acceleration voltage 300 keV.
After minimization of this equation as a function of b', $E_{00}$ can be estimated. Experience shows that for small values of the Debye–Waller factor, the parameterization of the 1S eigenfunction as an exponential function provides better estimates of $E_{00}$ than a Gaussian parameterization. In practice, however, it is much simpler to minimize Eq. (169) than Eq. (172), and the latter is numerically unstable in some particular situations. The variational principle thus provides a very effective and quite accurate method to calculate the eigenenergy of an atom column. In Figure 15 the eigenenergies calculated with the presented method (using a Gaussian parameterization) and by means of expansion in a basis set are both plotted for various atom types Z, at constant d and B. In Table 7 some hard numbers for the eigenenergy of particular well-known isolated atom columns are given, calculated by expansion of the eigenfunctions in a basis of 2D quantum harmonic oscillator eigenfunctions and by the presented fast method (Gaussian parameterization), using the variational principle. As is clear from Figure 15 and from Table 7, the match is quite acceptable.

D. Conclusion

In this section an alternative method that describes the dynamic scattering of an electron in a specimen foil, namely the S-state model of the channeling theory, was discussed. The principle of this method is based on the expansion of the
TABLE 7
COMPARISON OF THE EIGENENERGY OF PARTICULAR WELL-KNOWN ISOLATED ATOM COLUMNS CALCULATED BY TWO DIFFERENT METHODS

Z     d (nm)(a)   E00 (eV)(b)   E00 (eV)(c)
Si    0.5431      20.11         18.42
Cu    0.3615      78.12         74.95
Sr    0.608       57.11         54.30
Sn    0.6489      69.61         66.63
Au    0.40786     210.27        206.82

(a) The repeat distance in the atom column.
(b) The eigenenergy E00 calculated by expansion of the eigenfunctions in a basis of 2D quantum harmonic oscillator eigenfunctions.
(c) The eigenenergy E00 calculated by the presented fast method (Gaussian parameterization), using the variational principle.
The Debye–Waller factor was 0.006 nm² and the acceleration voltage 300 keV.
electron wave function in eigenfunctions of the atom column potential averaged along the atom column. To study these eigenfunctions, several new methods were proposed to calculate the eigenfunctions and their associated eigenenergies accurately: a finite difference expansion of the Schrödinger equation in Cartesian coordinates, an expansion in a basis set of 2D quantum harmonic oscillators, Fourier analysis of the periodic behavior of the wave function as a function of the specimen thickness, and a fast method to calculate the parameterized 1S eigenfunction by means of the variational principle. The performance of these methods was compared with some previously proposed methods. It could be shown that the basis of eigenfunctions of the atom columns is so effective that the scattering of the electron can be described fairly well using only the lowest eigenfunction in the expansion, namely the 1S eigenfunction. Accurate calculations of the 1S eigenfunction allowed study of its scaling behavior and its parameterization by a 2D quadratically normalized exponential or Gaussian function. The electron wave function could then be represented as a simple analytic expression, which allows fast calculation and provides analytic derivatives with respect to the parameters. In addition to a parameterization of the 1S eigenfunction, a parameterization of the 1S eigenenergy was proposed, as a function of the atomic number Z, the repeat distance in the atom column d, and the Debye–Waller factor B. In contrast with methods such as the Bloch wave method and the multislice method, the S-state model provides intuitive physical insight. Because of its simplicity the method has the potential to become a workhorse for HRTEM. It permits interpretation of the reconstructed electron wave function directly in terms of the projected structure, yielding an approximate structure model that can then be used as a starting point for quantitative refinement.
Furthermore, it is valid even for crystal defects (dislocations,
translation interfaces, etc.) as long as the atoms are aligned in columns in a direction close to the beam direction. In the next section, the validity of the S-state model for an assembly of atom columns is studied.

V. THE S-STATE MODEL FOR NONISOLATED ATOM COLUMNS
Until now only the wave function of an isolated atom column was considered. In this section, the influence of neighboring atom columns on each other will be studied, in particular the influence on the most bound local eigenfunctions and their eigenenergies. It will be shown that the wave function of an assembly of parallel atom columns can, to a good approximation, be described by the superposition of the wave functions of the constituent atom columns, at least if the atom columns are not too closely spaced (Geuens and Van Dyck, 2002). For simplicity, a pair of atom columns will be considered. The results can be generalized to general assemblies of atom columns. In this case, the Hamiltonian is equal to
\[ H = -\frac{\hbar^2}{2m} \Delta_{xy} - eU^{a}(x - x_a, y - y_a) - eU^{b}(x - x_b, y - y_b), \tag{173} \]
with $(x_a, y_a)$ the position of the first atom column and $(x_b, y_b)$ the position of the second one. As for an isolated atom column, the wave function can be expanded in eigenfunctions of the Hamiltonian,
\[ \Psi(x, y, z) = 1 + \sum_p 2 c_p \sin\!\left( \pi \frac{E_p}{E_0} \frac{k_z}{2} z \right) \psi_p(x, y) \exp\!\left[ -i\pi \left( \frac{E_p}{E_0} \frac{k_z}{2} z + \frac{1}{2} \right) \right], \tag{174} \]
where the $\psi_p(x, y)$ are solutions of the differential equation $H \psi_p(x, y) = E_p \psi_p(x, y)$ and p labels the eigenfunctions. Exact solutions of this equation are not easily accessible. Approximate solutions can be obtained by expansion of the unknown eigenfunctions in a complete set of basis functions. A complete set usually contains an infinite number of elements. In this case, little is accomplished unless the basis set has the desirable property that only a small number of basis functions is sufficient to describe the wave function with reasonable accuracy. Two familiar sets for a pair of atom columns are $\{\psi^{a}_{nm}(x, y)\}$, the set of isolated atom column eigenfunctions localized at atom column a, and $\{\psi^{b}_{nm}(x, y)\}$, the set of isolated atom column eigenfunctions localized at atom column b. Each set is complete and consists of orthonormal basis functions. Although in principle $\psi_p(x, y)$ can be expanded only
in terms of $\{\psi^a_{nm}(x, y)\}$, a very large number of basis functions would be needed to describe the behavior at atom column $b$. The solution is to use the collection of all eigenfunctions on $a$ and $b$ as a basis set. In most cases, only a few eigenfunctions are needed. However, this set has one disadvantage: the eigenfunctions of different atom columns are not orthogonal to each other. Symmetry arguments, together with qualitative insight, can be used to decide which basis functions to keep. Note that the expansion of $\psi_p(x, y)$ in functions of $\{\psi^a_{nm}(x, y)\}$ and $\{\psi^b_{nm}(x, y)\}$ allows one to learn about the relation between eigenfunctions of a columnar structure and eigenfunctions of an isolated atom column.

A. Symmetry Arguments

Columnar structures and crystals are characterized by certain symmetry operations that permit classification of the $\psi_p(x, y)$ eigenfunctions. Some symmetry operations leave the Hamiltonian of the system unchanged. In two dimensions the possible symmetries are limited to the ten 2D crystallographic point groups, and for a pair of atom columns they are more limited still. The point group is $2m_hm_v$ for a pair of identical atom columns and $m_h$ for a pair of nonidentical atom columns, with 2 a rotation over $\pi$ radians, $m_h$ a horizontal mirror axis, and $m_v$ a vertical mirror axis. The Hamiltonian commutes in both cases with $m_h$. The eigenfunctions are classified according to their behavior when acted on by this operation,
$$m_h\psi_\sigma(x, y) = \psi_\sigma(x, y) \quad (\sigma\ \text{eigenfunction}), \qquad m_h\psi_\pi(x, y) = -\psi_\pi(x, y) \quad (\pi\ \text{eigenfunction}). \tag{175}$$
The eigenfunctions have a second label according to their rank in energy (1, 2, 3, ...). In case of identical atom columns, the Hamiltonian is also invariant under inversion $I = m_hm_v = 2$, which demands a third classification,
$$I\psi_g(x, y) = \psi_g(x, y) \quad (\text{gerade eigenfunction}), \qquad I\psi_u(x, y) = -\psi_u(x, y) \quad (\text{ungerade eigenfunction}). \tag{176}$$
Summarizing, there are four possible symmetries for eigenfunctions of a pair of identical atom columns, $\sigma_g$, $\sigma_u$, $\pi_g$, and $\pi_u$, and two possible symmetries for eigenfunctions of a pair of nonidentical atom columns, $\sigma$ and $\pi$. These symmetries are illustrated in Figure 16, where the signs in each quadrant denote the relative sign of the eigenfunction. In the expansion of an eigenfunction of a pair of atom columns, only those isolated atom column eigenfunctions appear that have the required symmetry. Note that in case of plane wave illumination along a main zone-axis, the ‘‘$\pi$ eigenfunctions’’ and ‘‘ungerade
FIGURE 16. Possible symmetries of the eigenfunctions of a pair of identical (a) and nonidentical (b) atom columns. The relative sign of the eigenfunction (+ or −) is shown in each case.
FIGURE 17. (a) The correlation diagram of a pair of identical atom columns, showing the lowest eigenfunctions. (b) The correlation diagram of a pair of nonidentical atom columns, showing the lowest eigenfunctions. The dotted line marks the realistic interatom column distance D region for columnar structures in a main zone-axis orientation.
eigenfunctions’’ are not excited, which can be concluded from symmetry arguments. Each eigenfunction $\psi_p(x, y)$ has a certain symmetry, classified above, and an associated eigenenergy $E_p$, which is a function of the interatom column distance $D$. Hence it is useful to represent on a diagram the eigenenergy of each eigenfunction as a function of $D$ together with its symmetry. Such a schematic sketch is called a correlation diagram. Figure 17a shows the correlation diagram for a pair of identical atom columns. The extremes of the graph show the eigenfunctions for infinitely spaced atom columns (right)
and for fully coincident atom columns (left). The eigenfunctions with the same symmetry are connected: the lowest ($D = 0$) to the lowest ($D = \infty$), the second lowest ($D = 0$) to the second lowest ($D = \infty$), and so on. Figure 17b shows the correlation diagram for a pair of nonidentical atom columns. As for identical atom columns, the eigenfunctions are classified according to their symmetry and are connected. The eigenfunctions of a pair of nonidentical atom columns are nondegenerate. Here, it is assumed that the energy levels of atom column $a$ are higher than the energy levels of atom column $b$. A very similar scheme is discussed in more detail for 3D molecules in Morisson et al. (1976). Sketches of the eigenfunctions of a pair of identical and nonidentical atom columns can be generated using the correlation diagrams in Figures 17a and 17b, respectively. For example, in case of a pair of identical atom columns, the $1\sigma_g$ eigenfunction emerges from two isolated 1S eigenfunctions with the same eigenenergy and must change continuously to a 1S isolated atom column eigenfunction as $D$ decreases from $\infty$. Meanwhile, the $\sigma$ symmetry must be preserved for all $D$. A sketch of the $1\sigma_g$ eigenfunction, as well as some other eigenfunctions, in the extreme limits and for an intermediate $D$, is shown in Figure 18. Figure 19 shows some sketches of eigenfunctions of a pair of nonidentical atom columns. Conversely, the correlation diagrams can be used to decide which isolated atom column eigenfunctions contribute most to the eigenfunction $\psi_p(x, y)$ of a particular pair of atom columns. In the range between $D = 0$ and $D = \infty$, other isolated atom column eigenfunctions besides $\psi_{00}$ can also contribute to $1\sigma_g$ as long as the symmetry requirements are fulfilled. In the next section the relevance of the various eigenfunctions of a pair of atom columns and the various isolated atom column eigenfunctions in the expansion is studied.
Note that in HRTEM in a main zone-axis orientation the interatom column distances are close to the $D = \infty$ case, which implies that the atom columns can, to a very good approximation, be considered as isolated and the mutual overlap can be considered as a perturbation. This is discussed later. Note that atom columns with a zig-zag arrangement are excluded here; they can, to a first approximation, be regarded as a single atom column with a large Debye–Waller factor.

B. A Pair of Identical Atom Columns

In case of a pair of identical atom columns, $\{E^a_{nm}\}$ and $\{E^b_{nm}\}$ are equal, which means that they are degenerate. As stated above, it can be concluded from symmetry arguments that, in case of plane wave illumination parallel to a main zone-axis, only the $\sigma_g$ eigenfunctions are excited. In this section
FIGURE 18. Sketches of the eigenfunctions of a pair of identical atom columns, in the extreme limits ($D = 0$, at the left) and ($D = \infty$, at the right) and for intermediate interatom column distances $D$. The points denote the atom column positions.
FIGURE 19. Sketches of the eigenfunctions of a pair of nonidentical atom columns, in the extreme limits ($D = 0$, at the left) and ($D = \infty$, at the right) and for intermediate interatom column distances $D$. The points denote the atom column positions.
only the $\psi^a_{00}(x, y)$ and $\psi^b_{00}(x, y)$ eigenfunctions are considered in the expansion of $\psi_{1\sigma_g}(x, y)$. In principle, $\psi^a_{11x}(x, y)$, $\psi^b_{11x}(x, y)$, $\psi^a_{20}(x, y)$, and $\psi^b_{20}(x, y)$ could be included in the expansion since they have the appropriate symmetry, although it is expected that these eigenfunctions will contribute much less than $\psi^a_{00}(x, y)$ and $\psi^b_{00}(x, y)$. The aim is to show that expanding $\psi_{1\sigma_g}(x, y)$ only in $\psi^a_{00}(x, y)$ and $\psi^b_{00}(x, y)$ is sufficient and that, to a good approximation, the expansion of $\Psi(x, y, z)$ can be limited to $\psi_{1\sigma_g}(x, y)$ alone. $\psi_{1\sigma_g}(x, y)$ can be written as
$$\psi_{1\sigma_g}(x, y) \simeq a^{1\sigma_g}_{00}\psi^a_{00}(x - x_a, y - y_a) + b^{1\sigma_g}_{00}\psi^b_{00}(x - x_b, y - y_b), \tag{177}$$
where $a^{1\sigma_g}_{00}$ is equal to $b^{1\sigma_g}_{00}$ since $\psi_{1\sigma_g}(x, y)$ is invariant under inversion. Substitution of Eq. (177) in Eq. (174) yields
$$\Psi(x, y, z) \simeq 1 + 2c_{1\sigma_g}c^{1\sigma_g}_{00}\sin\left(\pi\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z\right)\left[\psi^a_{00}(x - x_a, y - y_a) + \psi^b_{00}(x - x_b, y - y_b)\right]\exp\left[-i\pi\left(\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z + \frac{1}{2}\right)\right], \tag{178}$$
where $a^{1\sigma_g}_{00}$ and $b^{1\sigma_g}_{00}$ are set to $c^{1\sigma_g}_{00}$. The next bound eigenfunction of a pair of identical atom columns, which is neglected in the equation above, is $\psi_{2\sigma_g}(x, y)$, which can mainly be described as a linear combination of $\psi^a_{11x}(x, y)$ and $\psi^b_{11x}(x, y)$. This can be concluded from the correlation diagram in Figure 17a. This eigenfunction will contribute much less than $\psi_{1\sigma_g}(x, y)$ and is therefore neglected.
C. A Pair of Nonidentical Atom Columns

In the case of a pair of nonidentical atom columns, it may be supposed that $E^a_{00} > E^b_{00}$, since $\psi^a_{00}(x, y)$ and $\psi^b_{00}(x, y)$ are nondegenerate, in contrast with a pair of identical atom columns. Here too, the expansions of $\psi_{1\sigma}(x, y)$ and $\psi_{2\sigma}(x, y)$ are limited to $\psi^a_{00}(x, y)$ and $\psi^b_{00}(x, y)$, and the expansion of $\Psi(x, y, z)$ is limited to the two most bound eigenfunctions $\psi_{1\sigma}(x, y)$ and $\psi_{2\sigma}(x, y)$, which can be written as
$$\psi_{1\sigma}(x, y) \simeq a^{1\sigma}_{00}\psi^a_{00}(x - x_a, y - y_a) + b^{1\sigma}_{00}\psi^b_{00}(x - x_b, y - y_b), \tag{179}$$
$$\psi_{2\sigma}(x, y) \simeq a^{2\sigma}_{00}\psi^a_{00}(x - x_a, y - y_a) + b^{2\sigma}_{00}\psi^b_{00}(x - x_b, y - y_b). \tag{180}$$
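Substitution of Eqs. (179) and (180) into Eq. (174) results in
$$\begin{aligned}\Psi(x, y, z) \simeq{}& 1 - \left(c_{1\sigma}a^{1\sigma}_{00} + c_{2\sigma}a^{2\sigma}_{00}\right)\psi^a_{00}(x - x_a, y - y_a) - \left(c_{1\sigma}b^{1\sigma}_{00} + c_{2\sigma}b^{2\sigma}_{00}\right)\psi^b_{00}(x - x_b, y - y_b)\\ &+ \left\{c_{2\sigma}a^{2\sigma}_{00} + c_{1\sigma}a^{1\sigma}_{00}\exp\left[-i\pi\frac{E_{1\sigma} - E_{2\sigma}}{E_0}k_z z\right]\right\}\psi^a_{00}(x - x_a, y - y_a)\exp\left[-i\pi\frac{E_{2\sigma}}{E_0}k_z z\right]\\ &+ \left\{c_{1\sigma}b^{1\sigma}_{00} + c_{2\sigma}b^{2\sigma}_{00}\exp\left[-i\pi\frac{E_{2\sigma} - E_{1\sigma}}{E_0}k_z z\right]\right\}\psi^b_{00}(x - x_b, y - y_b)\exp\left[-i\pi\frac{E_{1\sigma}}{E_0}k_z z\right].\end{aligned} \tag{181}$$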
It can be shown that if the overlap among $\psi^a_{00}(x - x_a, y - y_a)$, $\psi^b_{00}(x - x_b, y - y_b)$, $U^a(x - x_a, y - y_a)$, and $U^b(x - x_b, y - y_b)$ is negligible, $a^{1\sigma}_{00}$ and $b^{2\sigma}_{00}$ tend to zero. The wave function can then be written as
$$\begin{aligned}\Psi(x, y, z) \simeq{}& 1 + 2c_{2\sigma}a^{2\sigma}_{00}\sin\left(\pi\frac{E_{2\sigma}}{E_0}\frac{k_z}{2}z\right)\psi^a_{00}(x - x_a, y - y_a)\exp\left[-i\pi\left(\frac{E_{2\sigma}}{E_0}\frac{k_z}{2}z + \frac{1}{2}\right)\right]\\ &+ 2c_{1\sigma}b^{1\sigma}_{00}\sin\left(\pi\frac{E_{1\sigma}}{E_0}\frac{k_z}{2}z\right)\psi^b_{00}(x - x_b, y - y_b)\exp\left[-i\pi\left(\frac{E_{1\sigma}}{E_0}\frac{k_z}{2}z + \frac{1}{2}\right)\right].\end{aligned} \tag{182}$$
From this expression, it is clear that the excitation varies periodically with foil thickness, the periodicity being different for different atom columns, as is observed in experiments and simulations (Sinkler and Marks, 1999).

D. The S-State Model for a General Assembly of Atom Columns

Eqs. (131), (178), and (182) are closely related to each other. These expressions are generalized as
$$\Psi(x, y, z) \simeq 1 + \sum_j 2c^j_1\sin\left(\pi\frac{E^j_1}{E_0}\frac{k_z}{2}z\right)\psi^j_1(x - x_j, y - y_j)\exp\left\{-i\pi\left(\frac{E^j_1}{E_0}\frac{k_z}{2}z + \frac{1}{2}\right)\right\}, \tag{183}$$
where $j$ labels the atom columns in the crystal. Since, as shown in Section IV.B.1, the most bound local eigenfunction $\psi^j_1(x, y)$ can be approximated well as a 2D Gaussian or exponential function, both quadratically normalized,
$$\psi^j_1(x, y) = \frac{1}{b_j}\sqrt{\frac{|E^j_1|}{\pi}}\exp\left[-\frac{1}{2}\left(\sqrt{|E^j_1|}\,\frac{\sqrt{x^2 + y^2}}{b_j}\right)^2\right], \tag{184}$$
or
$$\psi^j_1(x, y) = \frac{1}{b_j}\sqrt{\frac{|E^j_1|}{2\pi}}\exp\left[-\frac{1}{2}\sqrt{|E^j_1|}\,\frac{\sqrt{x^2 + y^2}}{b_j}\right], \tag{185}$$
the wave function can be expressed in closed analytical form. Henceforth it will be called the analytic S-state model. The wave function is now completely determined by the parameters $c^j_1$, $E^j_1$, $b_j$, $z$, and $(x_j, y_j)$. This allows a significant gain in calculation speed compared to iterative methods such as Bloch wave or multislice algorithms. In addition, such an analytic model for the wave function is very well suited to inverting the dynamic electron scattering. In order to invert, the parameters must be estimated by means of a parameter estimation technique. Because the model for the wave function is analytical, the gradient of $\Psi(x, y, z)$ and the Hessian matrix can also be calculated analytically, which allows fast convergence of the fit. The experimental data used for the fitting can be experimentally reconstructed exit waves or $C_s$-corrected images.

E. Accuracy of the S-State Model for a General Assembly of Atom Columns

The S-state model is only an approximate description of the dynamic scattering process of an electron in a specimen foil. This section studies the limits of the model and the accuracy of the atom column positions that can be expected. As mentioned previously, the main goal of the S-state model is to invert the dynamic electron scattering, which basically yields the atom column positions and their chemical composition. To evaluate the accuracy to which the atom column positions can be determined, multislice simulations of known test structures are performed and used as observations, to which an analytic S-state model is fitted using a least-squares criterion. Noise is not taken into account. The estimated atom column positions $(x_j, y_j)$ are then compared to the initial atom column positions used as input for the multislice calculations. From this, the accuracy or systematic error of the estimated atom column positions can be derived.
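For reference, the model that is fitted in this test, Eq. (183) with the Gaussian parameterization of Eq. (184), can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the grid, and all numerical values (accelerating energy, wavelength, excitation coefficient, eigenenergy, and width parameter) are assumptions chosen only to make the example run.

```python
import numpy as np

def gaussian_1s(x, y, energy, b):
    """Quadratically normalized 2D Gaussian 1S eigenfunction, as in Eq. (184)."""
    e_abs = abs(energy)
    return (np.sqrt(e_abs / np.pi) / b) * np.exp(-0.5 * e_abs * (x**2 + y**2) / b**2)

def s_state_wave(x, y, z, columns, e0, kz):
    """Analytic S-state wave function, as in Eq. (183).

    columns: iterable of (x_j, y_j, c_j, E_j, b_j), one tuple per atom column.
    """
    wave = np.ones(np.broadcast(x, y).shape, dtype=complex)
    for xj, yj, cj, ej, bj in columns:
        half_phase = np.pi * (ej / e0) * (kz / 2.0) * z
        wave += (2.0 * cj * np.sin(half_phase)
                 * gaussian_1s(x - xj, y - yj, ej, bj)
                 * np.exp(-1j * (half_phase + np.pi / 2.0)))
    return wave

# Hypothetical dumbbell: two identical columns 0.136 nm apart (illustrative values only).
E0 = 300e3                    # incident electron energy in eV (assumed)
KZ = 1.0 / 1.97e-3            # 1/lambda in nm^-1 for an assumed wavelength of 1.97 pm
B = 0.03 * np.sqrt(40.0)      # width parameter chosen to give a Gaussian about 0.03 nm wide
COLUMNS = [(-0.068, 0.0, 0.05, -40.0, B), (0.068, 0.0, 0.05, -40.0, B)]

grid = np.linspace(-0.5, 0.5, 401)   # positions in nm
X, Y = np.meshgrid(grid, grid)
exit_wave = s_state_wave(X, Y, 5.0, COLUMNS, E0, KZ)   # foil thickness of 5 nm
```

In a fit, the tuple entries of `columns` together with `z` would be the free parameters, and `abs(exit_wave)` and `np.angle(exit_wave)` would be compared with the amplitude and phase of the reference wave.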
Note that this test does not provide information about the statistical precision but only about the accuracy of the model. As test structures, Si [110], Sn [110], and GaN [110] are chosen because of the small spacing between the atom columns in the [110] orientation,
FIGURE 20. The systematic error on the estimated atom column position of both the left and right atom column of a dumbbell, along [001], for both Si [110] and Sn [110]. The dotted line marks 2 pm.
respectively, 136 pm, 162 pm, and 113 pm. All parameters mentioned in the previous section are fitted. From Figure 20 it is clear that the accuracy of the estimated atom column positions is better than 10 pm for Si [110] and 1.5 pm for Sn [110] up to approximately 70%–80% of $D_{00}$, the thickness periodicity, which is approximately 30 nm for Si [110] and 10 nm for Sn [110]. Note that the accuracy of the atom column positions is better than 2 pm for both Si [110] and Sn [110] up to a thickness of 10 nm. Figure 21 shows that the accuracy of the estimated atom column positions in the $x$ direction, chosen equal to the [001] direction, and the $y$ direction, chosen equal to the [110] direction, is better than 5 pm for both the Ga and N atom columns of GaN [110], up to a thickness of 9 nm, approximately 80% of the thickness periodicity $D_{00} \simeq 11$ nm of the Ga atom columns. For small thicknesses the positions of the Ga atom columns can be estimated more accurately than the positions of the N atom columns, whereas for thicknesses near the thickness periodicity $D_{00}$ of the Ga atom columns, the N atom column positions can be estimated more accurately than the Ga atom column positions. Figures 22 and 23 show the amplitude and phase, respectively, of the fitted analytic S-state model and the wave function calculated with a multislice algorithm, for Sn [110] ($z = 9$ nm) and GaN [110] ($z = 8$ nm). From this it can be concluded that the S-state model provides a quite robust model
FIGURE 21. The systematic error on the estimated atom column position of both Ga and N along the x direction ([001]) (a) and the y direction ([110]) (b). The dotted line marks 2 pm.
to estimate the atom column positions with high accuracy from a wave function of crystals, even with closely spaced atom columns.

F. The LCAO: A Method to Calculate Approximate Eigenfunctions and Eigenenergies of a Pair of Atom Columns

Previous sections have shown that the bound eigenfunctions of a pair of atom columns can be expanded in a basis set of eigenfunctions $\psi^a_{nm}(x, y)$ localized on atom column $a$ and eigenfunctions $\psi^b_{nm}(x, y)$ localized on atom column $b$. A method related to the linear combination of atomic orbitals (LCAO) will be used to calculate the approximate most bound eigenfunctions and eigenenergies of a pair of atom columns. Symmetry arguments, together with qualitative insight, could be used to decide which basis functions to keep in the expansion. It was concluded that in case of a pair of identical atom columns $\psi_{1\sigma_g}(x, y)$ can be written as Eq. (177), and in case of a pair of nonidentical atom columns $\psi_{1\sigma}(x, y)$ and $\psi_{2\sigma}(x, y)$ can be written, respectively, as Eqs. (179) and (180). In order to evaluate $\psi_{1\sigma_g}(x, y)$, $\psi_{1\sigma_u}(x, y)$, $\psi_{1\sigma}(x, y)$, and $\psi_{2\sigma}(x, y)$ and $E_{1\sigma_g}$, $E_{1\sigma_u}$, $E_{1\sigma}$, and $E_{2\sigma}$, the excitation
FIGURE 22. The electron wave of Sn [110] at a specimen foil thickness of 9 nm. (a) The amplitude of the fitted analytical S-state model. (b) The phase of the fitted analytical S-state model. (c) The amplitude calculated with a multislice formalism. (d) The phase calculated with a multislice formalism.
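coefficients $a^{1\sigma_g}_{00}$, $b^{1\sigma_g}_{00}$, $a^{1\sigma_u}_{00}$, $b^{1\sigma_u}_{00}$, $a^{1\sigma}_{00}$, $b^{1\sigma}_{00}$, $a^{2\sigma}_{00}$, and $b^{2\sigma}_{00}$ must be determined. These coefficients and the eigenenergies can be determined by solving the differential equations
$$H\psi_{1\sigma_g}(x, y) = E_{1\sigma_g}\psi_{1\sigma_g}(x, y), \tag{186}$$
$$H\psi_{1\sigma_u}(x, y) = E_{1\sigma_u}\psi_{1\sigma_u}(x, y), \tag{187}$$
$$H\psi_{1\sigma}(x, y) = E_{1\sigma}\psi_{1\sigma}(x, y), \tag{188}$$
and
$$H\psi_{2\sigma}(x, y) = E_{2\sigma}\psi_{2\sigma}(x, y), \tag{189}$$
with $\psi_{1\sigma_g}(x, y)$, $\psi_{1\sigma}(x, y)$, and $\psi_{2\sigma}(x, y)$ given by Eqs. (177), (179), and (180), respectively, and $\psi_{1\sigma_u}(x, y)$ defined as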
FIGURE 23. The electron wave function of GaN [110] at a specimen foil thickness of 8 nm. (a) The amplitude of the fitted analytic S-state model. (b) the phase of the fitted analytic S-state model. (c) The amplitude calculated with a multislice formalism. (d) The phase calculated with a multislice formalism.
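$$\psi_{1\sigma_u}(x, y) \simeq a^{1\sigma_u}_{00}\psi^a_{00}(x - x_a, y - y_a) - b^{1\sigma_u}_{00}\psi^b_{00}(x - x_b, y - y_b), \tag{190}$$
where $a^{1\sigma_u}_{00}$ is equal to $b^{1\sigma_u}_{00}$ since $\psi_{1\sigma_u}(x, y)$ changes its sign under inversion. For simplicity these differential equations will be generalized as
$$H\psi_\sigma(x, y) = E_\sigma\psi_\sigma(x, y), \tag{191}$$
with
$$\psi_\sigma(x, y) = a^\sigma_{00}\psi^a_{00}(x - x_a, y - y_a) + b^\sigma_{00}\psi^b_{00}(x - x_b, y - y_b). \tag{192}$$
By multiplying on the left by $\psi^{a*}_{00}(x - x_a, y - y_a)$ and $\psi^{b*}_{00}(x - x_b, y - y_b)$, respectively, and integrating, the differential equation is turned into a matrix equation
$$\begin{pmatrix} H_{aa} - E_\sigma & H_{ab} - E_\sigma S\\ H_{ba} - E_\sigma S & H_{bb} - E_\sigma \end{pmatrix}\begin{pmatrix} a^\sigma_{00}\\ b^\sigma_{00}\end{pmatrix} = 0 \tag{193}$$
with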
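$$H_{aa} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{a*}_{00}(x - x_a, y - y_a)\,H\,\psi^a_{00}(x - x_a, y - y_a)\,dx\,dy, \tag{194}$$
$$H_{bb} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{b*}_{00}(x - x_b, y - y_b)\,H\,\psi^b_{00}(x - x_b, y - y_b)\,dx\,dy, \tag{195}$$
$$H_{ab} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{a*}_{00}(x - x_a, y - y_a)\,H\,\psi^b_{00}(x - x_b, y - y_b)\,dx\,dy, \tag{196}$$
$$H_{ba} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{b*}_{00}(x - x_b, y - y_b)\,H\,\psi^a_{00}(x - x_a, y - y_a)\,dx\,dy, \tag{197}$$
$$S = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{a*}_{00}(x - x_a, y - y_a)\,\psi^b_{00}(x - x_b, y - y_b)\,dx\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{b*}_{00}(x - x_b, y - y_b)\,\psi^a_{00}(x - x_a, y - y_a)\,dx\,dy. \tag{198}$$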
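The matrix equation has nontrivial solutions if the determinant is zero, or
$$(H_{aa} - E_\sigma)(H_{bb} - E_\sigma) - (H_{ab} - E_\sigma S)(H_{ba} - E_\sigma S) = 0. \tag{199}$$
This is a quadratic equation in $E_\sigma$, $(1 - S^2)E_\sigma^2 - \left[H_{aa} + H_{bb} - (H_{ab} + H_{ba})S\right]E_\sigma + (H_{aa}H_{bb} - H_{ab}H_{ba}) = 0$. Its discriminant is equal to
$$D = \left(H_{aa} + H_{bb} - (H_{ab} + H_{ba})S\right)^2 - 4(1 - S^2)(H_{aa}H_{bb} - H_{ab}H_{ba}), \tag{200}$$
and the solutions of Eq. (199) are
$$E_{1\sigma} = \frac{\left(H_{aa} + H_{bb} - (H_{ab} + H_{ba})S\right) - \sqrt{D}}{2(1 - S^2)}, \tag{201}$$
$$E_{2\sigma} = \frac{\left(H_{aa} + H_{bb} - (H_{ab} + H_{ba})S\right) + \sqrt{D}}{2(1 - S^2)}. \tag{202}$$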
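As a quick numerical cross-check of Eqs. (199) through (202), the sketch below evaluates the closed-form roots and compares them with a direct solution of the 2 x 2 generalized eigenvalue problem of Eq. (193). The function names and the matrix elements used in the example are hypothetical; they are not taken from the tables in this section.

```python
import numpy as np

def lcao_pair_energies(h_aa, h_bb, h_ab, h_ba, s):
    """Roots of the secular equation (199) in the closed form of Eqs. (200)-(202)."""
    linear = h_aa + h_bb - (h_ab + h_ba) * s
    disc = linear**2 - 4.0 * (1.0 - s**2) * (h_aa * h_bb - h_ab * h_ba)
    root = np.sqrt(disc)
    denom = 2.0 * (1.0 - s**2)
    return (linear - root) / denom, (linear + root) / denom

def lcao_pair_energies_numeric(h_aa, h_bb, h_ab, h_ba, s):
    """Same roots from the matrix form of Eq. (193): H c = E S c."""
    h = np.array([[h_aa, h_ab], [h_ba, h_bb]], dtype=float)
    overlap = np.array([[1.0, s], [s, 1.0]])
    values = np.linalg.eigvals(np.linalg.solve(overlap, h))
    return tuple(np.sort(values.real))

# Hypothetical matrix elements in eV and a hypothetical overlap integral.
identical = lcao_pair_energies(-35.0, -35.0, -5.0, -5.0, 0.1)
nonidentical = lcao_pair_energies(-30.0, -45.0, -4.0, -4.0, 0.08)
```

For the identical pair, the two roots reduce to $(H_{aa} + H_{ab})/(1 + S)$ and $(H_{aa} - H_{ab})/(1 - S)$, which for the values assumed here are about -36.4 eV and -33.3 eV.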
In case of a pair of identical atom columns, $H_{aa} = H_{bb}$ and $H_{ab} = H_{ba}$, and the solutions of Eq. (199) are then
$$E_{1\sigma_g} = \frac{H_{aa} + H_{ab}}{1 + S}, \tag{203}$$
$$E_{1\sigma_u} = \frac{H_{aa} - H_{ab}}{1 - S}, \tag{204}$$
as was shown by Van Aert (1999). The parameters $a^{1\sigma_g}_{00}$, $b^{1\sigma_g}_{00}$, $a^{1\sigma_u}_{00}$, $b^{1\sigma_u}_{00}$, $a^{1\sigma}_{00}$, $b^{1\sigma}_{00}$, $a^{2\sigma}_{00}$, and $b^{2\sigma}_{00}$ can be determined after substitution of the solutions for $E_\sigma$ in matrix Eq. (193). In case of a 2D quadratically normalized Gaussian parameterization of the 1S eigenfunctions $\psi^a_{00}(x, y)$ and $\psi^b_{00}(x, y)$, the integrals in $H_{aa}$, $H_{bb}$, $H_{ab}$,
TABLE 8
EIGENENERGIES OF THE TWO MOST BOUND EIGENFUNCTIONS OF SOME PAIRS OF ATOM COLUMNS

| Structure | Isolated atom column (eV) | Pair of atom columns, LCAO (eV) | Pair of atom columns, multislice method (eV) |
| Diamond [110] | 24.6 | 34.5, 24.8 | 40.3 |
| Si [110] | 34.6 | 37.7, 37.2 | 42.3 |
| GaAs [112] | 25.7 (Ga), 29.7 (As) | 29.1 (Ga), 38.5 (As) | 29.2 (Ga), 45.2 (As) |
| GaN [110] | 104.5 (Ga), 16.2 (N) | 105.7 (Ga), 22.8 (N) | 110.0 (Ga), 27.4 (N) |

The eigenenergies are calculated for isolated atom columns, $E^a_{00}$ and $E^b_{00}$, using the method provided in Section IV.C, as well as the corrected eigenenergies $E_{1\sigma_g}$, $E_{1\sigma_u}$, $E_{1\sigma}$, and $E_{2\sigma}$ using Eqs. (201–204) and solving Eqs. (194–198). Fourier analysis of a series of wave functions, as function of thickness calculated with a multislice method, provides more accurate calculations of the eigenenergies $E_{1\sigma_g}$, $E_{1\sigma}$, and $E_{2\sigma}$.
and $H_{ba}$ can be calculated analytically using the Doyle and Turner parameterization of the mean atom column potentials (Appendix A). Table 8 shows the eigenenergies calculated for isolated atom columns, $E^a_{00}$ and $E^b_{00}$, using the method provided in Section IV.C, as well as the corrected eigenenergies $E_{1\sigma_g}$, $E_{1\sigma_u}$, $E_{1\sigma}$, and $E_{2\sigma}$ using Eqs. (201) through (204) and solving Eqs. (194) through (198). Fourier analysis of a series of wave functions, as function of thickness calculated with a multislice method, provides more accurate eigenenergies $E_{1\sigma_g}$, $E_{1\sigma}$, and $E_{2\sigma}$. $E_{1\sigma_u}$ could not be determined in this way, because $\psi_{1\sigma_u}(x, y)$ is not excited in case of plane wave illumination. Note that $H_{aa}$, $H_{bb}$, $H_{ab}$, and $H_{ba}$ can alternatively be written as a correction to the single isolated atom column eigenenergies,
$$H_{aa} = E^a_{00} - \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{a*}_{00}(x - x_a, y - y_a)\,eU^b(x - x_b, y - y_b)\,\psi^a_{00}(x - x_a, y - y_a)\,dx\,dy, \tag{205}$$
$$H_{bb} = E^b_{00} - \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{b*}_{00}(x - x_b, y - y_b)\,eU^a(x - x_a, y - y_a)\,\psi^b_{00}(x - x_b, y - y_b)\,dx\,dy, \tag{206}$$
$$H_{ab} = E^b_{00}S - \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{a*}_{00}(x - x_a, y - y_a)\,eU^a(x - x_a, y - y_a)\,\psi^b_{00}(x - x_b, y - y_b)\,dx\,dy, \tag{207}$$
$$H_{ba} = E^a_{00}S - \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^{b*}_{00}(x - x_b, y - y_b)\,eU^b(x - x_b, y - y_b)\,\psi^a_{00}(x - x_a, y - y_a)\,dx\,dy. \tag{208}$$
Table 9 shows the eigenenergies calculated for isolated atom columns, $E^a_{00}$ and $E^b_{00}$, using an expansion in a basis set of 2D quantum harmonic oscillator eigenfunctions presented in subsection III.D.2. The corrected eigenenergies $E_{1\sigma_g}$, $E_{1\sigma_u}$, $E_{1\sigma}$, and $E_{2\sigma}$ are calculated using Eqs. (201) through (204), Eqs. (205) through (208), Eq. (198), and the isolated atom column eigenenergies $E^a_{00}$ and $E^b_{00}$. The more accurate eigenenergies $E_{1\sigma_g}$, $E_{1\sigma}$, and $E_{2\sigma}$ were calculated in the same way as in Table 8. The corrected eigenenergies calculated using LCAO in Table 9 match the more accurate calculations better than those in Table 8. Nevertheless, from both tables it can be concluded that the LCAO method corrects the isolated atom column eigenenergies, but that an LCAO expansion in just $\psi^a_{00}(x - x_a, y - y_a)$ and $\psi^b_{00}(x - x_b, y - y_b)$ is not sufficient to predict the correction accurately. Extending the expansion to bound isolated atom column eigenfunctions other than the ones noted above will allow a more exact calculation of the correction. Recently, similar results were published by Anstis et al. (2002). In that work the 1S eigenfunctions were parameterized by an analytic function other than a 2D quadratically normalized Gaussian function.
TABLE 9
EIGENENERGIES OF THE TWO MOST BOUND EIGENFUNCTIONS OF SOME PAIRS OF ATOM COLUMNS

| Structure | Isolated atom column (eV) | Pair of atom columns, LCAO (eV) | Pair of atom columns, multislice method (eV) |
| Diamond [110] | 26.2 | 36.0, 26.5 | 40.3 |
| Si [110] | 36.7 | 39.8, 39.3 | 42.3 |
| GaAs [112] | 28.4 (Ga), 32.4 (As) | 31.8 (Ga), 41.2 (As) | 29.2 (Ga), 45.2 (As) |
| GaN [110] | 107.8 (Ga), 17.9 (N) | 109.0 (Ga), 24.5 (N) | 110.0 (Ga), 27.4 (N) |

The eigenenergies are calculated for isolated atom columns, $E^a_{00}$ and $E^b_{00}$, using an expansion of the isolated atom column eigenfunctions in a basis set of quantum harmonic oscillator eigenfunctions. The corrected eigenenergies $E_{1\sigma_g}$, $E_{1\sigma_u}$, $E_{1\sigma}$, and $E_{2\sigma}$ are calculated using Eqs. (201–204), Eqs. (205–208), Eq. (198), and the isolated atom column eigenenergies $E^a_{00}$ and $E^b_{00}$. Fourier analysis of a series of wave functions, as function of thickness calculated with a multislice method, provides more accurate calculations of the eigenenergies $E_{1\sigma_g}$, $E_{1\sigma}$, and $E_{2\sigma}$.
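The tabulated eigenenergies can be related to the thickness periodicity quoted in Section V.E. In Eq. (183) the column term repeats in modulus when the argument of the sine increases by $\pi$, which gives a period of $2E_0/(|E|\,k_z)$ with $k_z = 1/\lambda$ for untilted illumination. The sketch below evaluates this for two multislice values from the tables. The incident energy of 300 keV and the wavelength of 1.97 pm are assumptions, since the microscope settings are not stated in this section, and the function name is hypothetical.

```python
def thickness_periodicity(eigenenergy_ev, e0_ev=300e3, wavelength_nm=1.97e-3):
    """Thickness period (nm) of the sine factor in Eq. (183): 2 E0 lambda / |E|."""
    return 2.0 * e0_ev * wavelength_nm / abs(eigenenergy_ev)

D_SI = thickness_periodicity(42.3)     # multislice value for the Si [110] pair
D_GA = thickness_periodicity(110.0)    # multislice value for Ga in GaN [110]
print(f"Si [110]: {D_SI:.1f} nm, Ga in GaN [110]: {D_GA:.1f} nm")
```

Under these assumptions the periods come out near 28 nm and 11 nm, of the same order as the approximately 30 nm and 11 nm quoted in the text for Si [110] and for the Ga atom columns of GaN [110].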
G. Conclusion

This section studied the mutual influence of neighboring atom columns, in particular the influence on the most bound local eigenfunctions and their eigenenergies. As an example, pairs of atom columns were considered. The results can be generalized to general assemblies of atom columns. The eigenfunctions of a pair of atom columns were expanded in a basis of isolated atom column eigenfunctions. Symmetry arguments, together with qualitative insight, could be used to decide which basis functions to keep. It was shown that, in case of a pair of identical atom columns, restricting the expansion of the wave function to $\psi_{1\sigma_g}(x, y)$ is sufficient, as is the expansion of $\psi_{1\sigma_g}(x, y)$ in $\psi^a_{00}(x - x_a, y - y_a)$ and $\psi^b_{00}(x - x_b, y - y_b)$. In case of a pair of nonidentical atom columns, restricting the expansion of the wave function to $\psi_{1\sigma}(x, y)$ and $\psi_{2\sigma}(x, y)$ is sufficient, as is the expansion of these eigenfunctions in $\psi^a_{00}(x - x_a, y - y_a)$ and $\psi^b_{00}(x - x_b, y - y_b)$. After parameterization of $\psi^a_{00}(x - x_a, y - y_a)$ and $\psi^b_{00}(x - x_b, y - y_b)$ as a 2D quadratically normalized Gaussian or exponential function, the wave function can be written in a closed analytic form. The accuracy of the analytic model using a 2D quadratically normalized Gaussian parameterization was tested by fitting it to simulated wave functions calculated with a multislice method, using a least-squares criterion. It was shown that the atom column positions could be determined with an accuracy of approximately 5 pm even for closely spaced atom columns. Using the LCAO method and restricting the basis set to $\psi^a_{00}(x - x_a, y - y_a)$ and $\psi^b_{00}(x - x_b, y - y_b)$, a correction to the eigenenergies $E^a_{00}$ and $E^b_{00}$ could be calculated analytically if the single atom column eigenfunctions were parameterized as 2D quadratically normalized Gaussians.
From comparison of the calculated corrections with the more accurate eigenenergies obtained from Fourier analysis of the thickness behavior of a series of wave functions calculated with a multislice method, it can be concluded that the LCAO method corrects the isolated atom column eigenenergies, but that an LCAO expansion in $\psi^a_{00}(x - x_a, y - y_a)$ and $\psi^b_{00}(x - x_b, y - y_b)$ is not sufficient to predict the exact correction. An extension of the expansion to bound isolated atom column eigenfunctions other than the ones noted above will allow a more exact calculation of the correction. An expansion of the eigenfunctions of a pair of atom columns in isolated atom column eigenfunctions allows learning about the relation between the eigenfunctions of columnar structures and the eigenfunctions of isolated atom columns.
VI. THE S-STATE MODEL IN CASE OF CRYSTAL OR BEAM TILT
A. Introduction

In practice, it is almost impossible to align the specimen locally in perfect zone-axis orientation. This is due, for example, to local bending of the specimen or different grain orientations. Therefore, it is important for the practical applicability of the S-state model that the effect of crystal and beam tilt be included in the model and can be estimated. Tilt was first included by Van Dyck et al. (1998), in a more complicated way than in Geuens and Van Dyck (2002). Nevertheless, the conclusions are similar. In addition, the applicability of the S-state model in case of tilt is studied, as well as the influence of tilt on the accuracy of the estimates of the atom column positions. A distinction must be made between crystal and beam tilt. In case of crystal tilt, the electron beam is aligned along the optical axis and the crystal is tilted away, so that its zone-axis is inclined with respect to the beam (Figure 24a). In case of beam tilt, the zone-axis of the crystal is oriented
FIGURE 24. (a) Crystal and (b) beam tilt can be regarded as equivalent if the description is restricted to the interaction between electron and specimen.
along the optical axis, but the incident beam is inclined to this axis (Figure 24b). Although the relative orientation of the electron beam to the crystal is the same in both cases, the tilt of the electron beam has an effect on the imaging process, namely, it introduces coma. However, crystal and beam tilt can be regarded as equivalent if the description is restricted to the interaction between electron and specimen. For convenience, the reference frame will be fixed to the crystal and the beam will be described as tilted. In this case $\mathbf{k}_{xy}$, the 2D radial component of the wave vector $\mathbf{k}$ of the incident plane wave parallel to the surface of the specimen foil, is nonzero. The 3D time-independent Schrödinger equation of an electron in a 2D potential, in the ‘‘forward-scattering approximation,’’ can be written as Eq. (12) with $eV(x, y, z) = eU(x, y)$, or alternatively as follows:
$$-\frac{2\pi\hbar^2k_z}{mi}\frac{\partial}{\partial z}\left[\Psi(x, y, z)e^{2\pi i(k_x x + k_y y)}\right] = H\Psi(x, y, z)e^{2\pi i(k_x x + k_y y)} - \frac{4\pi^2\hbar^2(k_x^2 + k_y^2)}{2m}\Psi(x, y, z)e^{2\pi i(k_x x + k_y y)}. \tag{209}$$
Because $H$ is not dependent on $z$, the wave function $\Psi(x, y, z)$ can be written as a series of solutions, similar to Eq. (47), factorized in $(x, y)$-dependent eigenfunctions $\psi_{nm}(x, y)$ and $z$-dependent phase factors,
$$\begin{aligned}\Psi(x, y, z)e^{2\pi i(k_x x + k_y y)} ={}& \sum_{nm,\,E_{nm}<0} c_{nm}(k_x, k_y)\psi_{nm}(x, y)\exp\left[-i\pi\left(\frac{E_{nm}}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{k_z}\right]\\ &+ \int_0^\infty c(\xi, k_x, k_y)\psi(x, y, \xi)\exp\left[-i\pi\left(\frac{\xi}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{k_z}\right]d\xi.\end{aligned} \tag{210}$$
Note that $c_{nm}(k_x, k_y)$ and $c(\xi, k_x, k_y)$ are now dependent on $k_x$ and $k_y$. Using the boundary condition
$$\sum_{nm,\,E_{nm}<0} c_{nm}(k_x, k_y)\psi_{nm}(x, y) + \int_0^\infty c(\xi, k_x, k_y)\psi(x, y, \xi)\,d\xi = \Psi(x, y, 0)e^{2\pi i(k_x x + k_y y)} = e^{2\pi i(k_x x + k_y y)}, \tag{211}$$
this equation can be rewritten as
$$\begin{aligned}\Psi(x, y, z) ={}& 1 + \sum_{nm,\,E_{nm}<0} 2c_{nm}(k_x, k_y)\sin\left[\pi\left(\frac{E_{nm}}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}\right]\psi_{nm}(x, y)\\ &\qquad\times\exp\left[-i\pi\left(\left(\frac{E_{nm}}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z} + \frac{1}{2} + 2(k_x x + k_y y)\right)\right]\\ &+ \int_0^\infty 2c(\xi, k_x, k_y)\sin\left[\pi\left(\frac{\xi}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}\right]\psi(x, y, \xi)\\ &\qquad\times\exp\left[-i\pi\left(\left(\frac{\xi}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z} + \frac{1}{2} + 2(k_x x + k_y y)\right)\right]d\xi.\end{aligned} \tag{212}$$
After substitution of Eq. (210) or (212) into Eq. (209), it is clear that the differential equation can be transformed into the known eigenvalue problem of Eqs. (48) and (49), which were derived for plane wave illumination along a main zone-axis. In a similar manner as shown in Sections III.G and III.H.2, it can be proven that only those continuum eigenfunctions are excited for which the eigenenergy $\xi$ is equal to the transverse energy
$$E_t = \frac{h^2(k_x^2 + k_y^2)}{2m}. \tag{213}$$
For small tilt angles the most bound eigenfunctions will still dominate the expansion of the wave function. The term containing the continuum eigenfunctions is negligible compared to the terms containing the most bound eigenfunctions. At larger thicknesses the situation is different, as will be discussed further. Equation (212) is not valid for high tilt angles. The breakdown occurs when the approximations made in Eq. (209) are no longer justified. For large beam tilts, the backscattering cannot be neglected and thus the ‘‘forward-scattering approximation’’ is no longer valid. However, for realistic tilt angles, as occur in CHRTEM, the S-state model including tilt remains valid, as will be shown in the next section. The thickness periodicity $D^{k_{xy}}_{nm}$ of the eigenfunctions $\psi_{nm}(x, y)$ is not invariant under tilt $\mathbf{k}_x + \mathbf{k}_y$ and is equal to
$$D^{k_{xy}}_{nm} = \frac{2k_z}{\left|\frac{E_{nm}}{E_0}k^2 - (k_x^2 + k_y^2)\right|} = D^0_{nm}\frac{\sqrt{1 + \tan^2\alpha}}{(1 + \tan^2\alpha)^{3/2} + \frac{D^0_{nm}}{2\lambda}\tan^2\alpha}, \tag{214}$$
because $|\mathbf{k}_x + \mathbf{k}_y|/|\mathbf{k}_z| = \tan\alpha$ and $k_z = \frac{1}{\lambda}\frac{1}{\sqrt{1 + \tan^2\alpha}}$, with $\alpha$ the tilt angle. If $\alpha$ is small, $D^{k_{xy}}_{nm}$ can be written to a good approximation as
$$D^{k_{xy}}_{nm} \simeq D^0_{nm}\left[1 - \left(1 + \frac{D^0_{nm}}{2\lambda}\right)\alpha^2\right]. \tag{215}$$
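The closed form of Eq. (214) and its small-angle limit, Eq. (215), can be compared numerically. The sketch below does this for an untilted periodicity of 30 nm (the approximate value quoted earlier for Si [110]) and an assumed wavelength of 1.97 pm; both numbers, like the function names, are illustrative assumptions rather than values prescribed by the text.

```python
import numpy as np

def periodicity_tilted(d0, wavelength, alpha):
    """Thickness periodicity under tilt, closed form of Eq. (214)."""
    t = np.tan(alpha) ** 2
    return d0 * np.sqrt(1.0 + t) / ((1.0 + t) ** 1.5 + d0 * t / (2.0 * wavelength))

def periodicity_small_angle(d0, wavelength, alpha):
    """First-order expansion in alpha^2, Eq. (215)."""
    return d0 * (1.0 - (1.0 + d0 / (2.0 * wavelength)) * alpha ** 2)

D0_NM = 30.0            # untilted periodicity in nm (assumed)
LAMBDA_NM = 1.97e-3     # electron wavelength in nm (assumed)

for alpha_mrad in (0.5, 1.0, 2.0, 5.0):
    alpha = alpha_mrad * 1e-3
    exact = periodicity_tilted(D0_NM, LAMBDA_NM, alpha)
    approx = periodicity_small_angle(D0_NM, LAMBDA_NM, alpha)
    print(f"{alpha_mrad:4.1f} mrad: Eq. (214) {exact:7.3f} nm, Eq. (215) {approx:7.3f} nm")
```

Because the factor $D^0_{nm}/(2\lambda)$ is large, the expansion is accurate only while $(D^0_{nm}/2\lambda)\alpha^2$ stays well below 1, which for the values assumed here means tilts of a few milliradians at most.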
The thickness periodicity will thus decrease with increasing tilt, as if the electron–object interaction increases. This seems surprising, because the strength of the electron–object interaction is expected to decrease with increasing tilt. However, this is only part of the story. It will be shown that the electron–object interaction indeed decreases as expected. The eigenfunctions $\psi_{nm}(x, y)$ can be classified into three groups. A first group contains the 1S eigenfunction $\psi_{00}(x, y)$, with an absolute eigenenergy that is much larger than zero. A second group contains the less bound eigenfunctions, which have a much smaller absolute eigenenergy than the 1S eigenfunction. However, due to their symmetry, some of these eigenfunctions are not excited in case of plane wave illumination along a main zone-axis. A third group contains the continuum eigenfunctions, which have an eigenenergy approximately equal to $E_t$. If, for example, a Si [100] atom column is considered, which has only one bound eigenfunction, the 1S eigenfunction, only eigenfunctions of groups one and three are involved. To a good approximation the terms of Eq. (212) containing eigenfunctions of the third group can be expanded up to first order in $\pi\left(\frac{\xi}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}$, because only those continuum eigenfunctions are excited for which $\xi \simeq E_t$,
$$\begin{aligned}\Psi(x, y, z) \simeq{}& 1 + 2c_{00}(k_x, k_y)\sin\left[\pi\left(\frac{E_{00}}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}\right]\psi_{00}(x, y)\\ &\qquad\times\exp\left[-i\pi\left(\left(\frac{E_{00}}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z} + \frac{1}{2} + 2(k_x x + k_y y)\right)\right]\\ &- \int_0^\infty 2i\,c(\xi, k_x, k_y)\,\pi\left(\frac{\xi}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}\,\psi(x, y, \xi)\,e^{-2\pi i(k_x x + k_y y)}\,d\xi.\end{aligned} \tag{216}$$
It can be easily shown that the excitation coefficient
$$c_{00}(k_x, k_y) = \iint \psi^*_{00}(x, y)\,e^{2\pi i(k_x x + k_y y)}\,dx\,dy \tag{217}$$
is decreasing with increasing tilt. The second term of Eq. (216) thus gains importance compared to the first term, and the thickness dependence of the wave function becomes much more linear. Physically this means that fewer electrons are trapped in the atom column and that the electron–object interaction is closer to a kinematic model than it is in the nontilted case. Tilt thus breaks down the strong dynamic interaction, as is expected.

For a heavier atom column, such as an Au [100] atom column, eigenfunctions of the second group contribute to Eq. (216) as well. Although $E_{nm} - E_t$ can be rather small, an expansion up to first order in $\pi\left(\frac{E_{nm}}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}$ would be insufficient. This group of eigenfunctions will thus introduce terms in Eq. (216) that are nonlinear in thickness. Nevertheless, the excitation coefficients $c_{nm}(k_x, k_y)$ will decrease with increasing beam tilt, as will be shown in the next section. This confirms what would be expected; namely, that for heavier atom columns a larger beam tilt is needed to reduce the dynamic interaction.

B. Excitation of the Eigenfunctions

In case of plane wave illumination along a main zone-axis, the eigenfunctions that were not invariant under $m_h$ and $I$ were not excited (e.g., $\psi_{11x}(x, y)$ in case of an isolated atom column and $\psi_{1\pi_g}(x, y)$ in case of a pair of identical atom columns). Nevertheless, in case of tilted illumination, the excitation coefficients of these eigenfunctions, defined similarly to Eq. (217), will be different from zero depending on the direction of tilt. To evaluate the influence of tilt on the excitation of the eigenfunctions, the symmetry of the eigenfunctions multiplied by $e^{2\pi i(k_x x + k_y y)}$ will be studied. First, the behavior of the real and imaginary part of $e^{2\pi i(k_x x + k_y y)}$ is considered when acted on by the operations $m_h$ and $I$:

\[
\begin{aligned}
m_h \cos(2\pi(k_x x + k_y y)) &= \cos(2\pi(k_x x - k_y y)), \\
m_h \sin(2\pi(k_x x + k_y y)) &= \sin(2\pi(k_x x - k_y y)),
\end{aligned}
\tag{218}
\]

and

\[
\begin{aligned}
I \cos(2\pi(k_x x + k_y y)) &= \cos(2\pi(-k_x x - k_y y)), \\
I \sin(2\pi(k_x x + k_y y)) &= \sin(2\pi(-k_x x - k_y y)).
\end{aligned}
\tag{219}
\]
It can be concluded that neither the real nor the imaginary part is invariant under the mirror operation $m_h$, at least not if both $k_x$ and $k_y$ are different from zero. The real part is invariant under inversion $I$, whereas the imaginary part is not. Combining the behavior of the eigenfunctions and of

\[
e^{2\pi i(k_x x + k_y y)} = \cos(2\pi(k_x x + k_y y)) + i \sin(2\pi(k_x x + k_y y))
\tag{220}
\]

under the operation $I$, using $I(g(x, y)h(x, y)) = I(g(x, y))\,I(h(x, y))$, it can be concluded that the excitation coefficients of the eigenfunctions that are invariant under inversion $I$ will be real and that those of the eigenfunctions that are not invariant under inversion $I$ will be imaginary. Because all excitation coefficients are either real or imaginary, their phases are 0, and $\pi/2$ or $-\pi/2$, respectively. In case of an isolated atom column, the excitation coefficient of $\psi_{n0}(x, y)$ will be real and, for example, the excitation coefficients of $\psi_{11x}(x, y)$ and $\psi_{11y}(x, y)$ will be imaginary. In case of a pair of atom columns, the excitation coefficients of $\psi_{\sigma_g}(x, y)$ and $\psi_{\pi_g}(x, y)$ will be real and the excitation coefficients of $\psi_{\sigma_u}(x, y)$ and $\psi_{\pi_u}(x, y)$ will be imaginary. If $k_x = 0$, $\psi_{11x}(x, y)$, $\psi_{\sigma_u}(x, y)$ and $\psi_{\sigma_g}(x, y)$ are not excited, and if $k_y = 0$, $\psi_{11y}(x, y)$ and $\psi_{\pi}(x, y)$ are not excited. This can be concluded from the behavior of the eigenfunction multiplied by $e^{2\pi i(k_x x + k_y y)}$ under the operation $m_h$, using $m_h(g(x, y)h(x, y)) = m_h(g(x, y))\,m_h(h(x, y))$.

In Figure 25, $c_p(k_x, k_y)\sin\left(\pi\left(\frac{E_p}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}\right)$ is plotted for different eigenfunctions of a pair of Sn atom columns as a function of thickness and crystal tilt, for the case that both $k_x \neq 0$ and $k_y \neq 0$. It is clear that for tilts larger than a few mrad, for a dumbbell of Sn [110], the S-state model is no longer valid, because the wave function is no longer governed by the $1\sigma_g$ or the 1S eigenfunction.

FIGURE 25. $\left|c_p(k_x, k_y)\sin\left(\pi\left(\frac{E_p}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}\right)\right|$ as function of specimen foil thickness $z$ and tilt $(k_x, k_y)$, respectively, for (a) $1\sigma_g$, (b) $1\sigma_u$, (c) $2\sigma_g$, (d) $1\pi_u$, (e) $1\pi_g$, and (f) $2\sigma_u$. The excitations and eigenenergies were calculated for a dumbbell of Sn [110].
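The symmetry argument for the excitation coefficients can be checked numerically. The sketch below is illustrative only: it uses two hypothetical, unnormalized model functions (a radially symmetric Gaussian standing in for an S-type eigenfunction and $x$ times that Gaussian standing in for a $p_x$-type eigenfunction) rather than actual column eigenfunctions, and evaluates the overlap integral of Eq. (217) on a symmetric grid.

```python
import numpy as np

W = 0.05  # hypothetical width of the model column state (nm)


def s_like(x, y):
    """Toy S-type function: invariant under inversion."""
    return np.exp(-(x**2 + y**2) / (2 * W**2))


def px_like(x, y):
    """Toy p_x-type function: changes sign under inversion."""
    return x * s_like(x, y)


def excitation(psi, kx, ky, half_width=0.5, n=201):
    """Riemann-sum estimate of the overlap of conj(psi) with exp(2*pi*i*(kx*x + ky*y))."""
    x = np.linspace(-half_width, half_width, n)  # symmetric grid (nm)
    X, Y = np.meshgrid(x, x, indexing="ij")
    dx = x[1] - x[0]
    plane_wave = np.exp(2j * np.pi * (kx * X + ky * Y))
    return np.sum(np.conj(psi(X, Y)) * plane_wave) * dx * dx


if __name__ == "__main__":
    print(excitation(s_like, 1.0, 0.5))   # real to rounding error
    print(excitation(px_like, 1.0, 0.5))  # imaginary to rounding error
    print(excitation(px_like, 0.0, 0.5))  # vanishes when k_x = 0
```

In this toy setting the S-type coefficient is real and shrinks with increasing tilt, while the $p_x$-type coefficient is purely imaginary and vanishes when $k_x = 0$, in line with the conclusions drawn above.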
C. Shift of the Maxima in the Amplitude and Phase of the Wave Function

Tilted illumination introduces different effects on the wave function. Most were discussed in detail in the previous sections, such as the decrease of the thickness periodicity and the excitation of eigenfunctions with p or "ungerade"-like symmetry due to the nonsymmetric illumination. One effect that was not discussed in the previous sections is the factor $c_{nm}(k_x, k_y)\,e^{2\pi i(k_x x + k_y y)}$, which will cause a shift of the maxima and minima of the amplitude and phase of the wave function with respect to nontilted illumination. In a fitting procedure this will lead to biased estimates of the atom column positions. This can be avoided by including two extra tilt parameters $(k_x, k_y)$, which can be position dependent if the crystal foil is, for example, locally bent.

D. The S-State Model for a General Assembly of Atom Columns Including Crystal or Beam Tilt

The S-state model including crystal or beam tilt can be written as

\[
\begin{aligned}
\Psi(x, y, z) \simeq{}& 1 + \sum_j 2c_1^j \sin\left[\pi\left(\frac{E_1^j}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z}\right]\psi_1^j(x - x_j, y - y_j) \\
&\times \exp\left\{-i\pi\left[\left(\frac{E_1^j}{E_0}k^2 - (k_x^2 + k_y^2)\right)\frac{z}{2k_z} - \frac{1}{2} + 2(k_x x + k_y y)\right]\right\},
\end{aligned}
\tag{221}
\]

where $j$ labels the atom columns in the crystal. If it is assumed that the most bound local eigenfunction $\psi_1^j(x, y)$ can be approximated well as a 2D Gaussian or exponential function, both quadratically normalized, as shown in subsection IV.B.1,

\[
\psi_1^j(x, y) = \sqrt{\frac{|E_1^j|}{\pi}}\,\frac{1}{b_j}\exp\left(-\frac{1}{2}\left(\sqrt{|E_1^j|}\,\frac{\sqrt{x^2 + y^2}}{b_j}\right)^2\right),
\tag{222}
\]

or

\[
\psi_1^j(x, y) = \sqrt{\frac{|E_1^j|}{2\pi}}\,\frac{1}{b_j}\exp\left(-\frac{1}{2}\sqrt{|E_1^j|}\,\frac{\sqrt{x^2 + y^2}}{b_j}\right),
\tag{223}
\]

the wave function can be expressed in closed analytical form. The wave function is now completely determined by the parameters $E_1^j$, $b_j$, $c_1^j$, $(x_j, y_j)$, $z$, and $(k_x, k_y)$.
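As a consistency check on the two parameterizations, the following Python sketch verifies numerically that a 2D Gaussian and a 2D exponential with the prefactors read from Eqs. (222) and (223) are quadratically normalized, i.e. that $\int |\psi_1^j|^2\,dx\,dy = 1$. This is a minimal illustration under stated assumptions: the parameter values are hypothetical, and $\sqrt{|E_1^j|}/b_j$ is simply treated as an inverse length in nm$^{-1}$.

```python
import numpy as np


def psi_gauss(x, y, e1, b):
    """2D Gaussian form of the 1S state (prefactor as in Eq. (222))."""
    kappa = np.sqrt(abs(e1)) / b
    return kappa / np.sqrt(np.pi) * np.exp(-0.5 * kappa**2 * (x**2 + y**2))


def psi_exp(x, y, e1, b):
    """2D exponential form of the 1S state (prefactor as in Eq. (223))."""
    kappa = np.sqrt(abs(e1)) / b
    return kappa / np.sqrt(2 * np.pi) * np.exp(-0.5 * kappa * np.sqrt(x**2 + y**2))


def norm_sq(psi, e1, b, half_width=1.5, n=1201):
    """Riemann-sum estimate of the integral of |psi|^2 over the plane."""
    x = np.linspace(-half_width, half_width, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    dx = x[1] - x[0]
    return np.sum(np.abs(psi(X, Y, e1, b)) ** 2) * dx * dx


if __name__ == "__main__":
    e1, b = -100.0, 0.5  # hypothetical values giving sqrt(|E1|)/b = 20 nm^-1
    print(norm_sq(psi_gauss, e1, b), norm_sq(psi_exp, e1, b))
```

Both integrals come out close to one on a sufficiently fine grid, which is what "quadratically normalized" requires of either form.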
E. Accuracy of the S-State Model for a General Assembly of Atom Columns Including Crystal or Beam Tilt

To evaluate the accuracy to which the atom column positions can be determined in case of crystal or beam tilt, multislice simulations of a known test structure are performed and used as observations, to which an analytic S-state model including crystal or beam tilt is fitted using a least-squares criterion. As test structure GaN [110] is chosen, which has a spacing between the atom columns of only 113 pm. Wave functions are simulated for various thicknesses, tilt directions, and tilt magnitudes. All parameters mentioned in the previous section are fitted.

Figures 26 and 27 show the accuracy of the estimated atom column positions for crystal or beam tilt parallel to the x direction, which is chosen along the [001] direction, and parallel to the y direction, which is chosen along the [$\bar{1}$10] direction, respectively. The accuracy of the estimated atom column positions in the direction perpendicular to the tilt direction is hardly affected, whereas the accuracy of the estimated atom column positions parallel to the tilt direction is affected. Note that the effect of crystal or beam tilt is highly dependent on the crystal thickness. Figures 28 and 29 show the fitted tilt magnitudes $(k_x, k_y)$ compared with the exact ones. It is clear that the tilt magnitude is generally underestimated
FIGURE 26. The systematic error on the estimated atom column position of both Ga (solid square) and N (solid circle) along (a) the x direction ([001]) and (b) the y direction ([$\bar{1}$10]) (solid line $k_x = 0.97$ nm$^{-1}$ or $\alpha \simeq 1.9$ mrad; dashed line $k_x = 1.93$ nm$^{-1}$ or $\alpha \simeq 3.8$ mrad; dotted line $k_x = 2.91$ nm$^{-1}$ or $\alpha \simeq 5.7$ mrad; in all cases $k_y = 0$).
FIGURE 27. The systematic error on the estimated atom column position of both Ga (solid square) and N (solid circle) along (a) the x direction ([001]) and (b) the y direction ([$\bar{1}$10]) (solid line $k_y = 0.91$ nm$^{-1}$ or $\alpha \simeq 1.8$ mrad; dashed line $k_y = 1.81$ nm$^{-1}$ or $\alpha \simeq 3.6$ mrad; dotted line $k_y = 2.72$ nm$^{-1}$ or $\alpha \simeq 5.4$ mrad; in all cases $k_x = 0$).
FIGURE 28. The estimated tilt magnitudes $(k_x, k_y)$ compared with the exact ones for different crystal thicknesses. The exact values for $k_x$ are equal to 0.97 nm$^{-1}$, 1.93 nm$^{-1}$, and 2.91 nm$^{-1}$, and $k_y = 0$ nm$^{-1}$. The solid line shows the function $f(x) = x$.
FIGURE 29. The estimated tilt magnitudes $(k_x, k_y)$ compared with the exact ones for different crystal thicknesses. The exact values are $k_x = 0$ nm$^{-1}$, and $k_y$ equal to 0.91 nm$^{-1}$, 1.81 nm$^{-1}$, and 2.72 nm$^{-1}$. The solid line shows the function $f(x) = x$.
but that the estimates improve for larger thicknesses, at least for the tilt magnitude along the initial tilt direction. From this it can be concluded that the S-state model including crystal or beam tilt still provides quite a robust model to estimate the atom column positions with reasonable accuracy from a wave function of crystals, even with closely spaced atom columns, tilted over small tilt angles.

F. Small-Angle Nonparallel Illumination

In practice, the incident beam is not perfectly parallel to the atom columns but is slightly convergent. In this case the beam consists of a set of plane waves with k vectors distributed around a central vector parallel to the atom columns. In case of coherent illumination, all magnitudes of the k vectors are equal. The 1S state model can then be written as

\[
\Psi(x, y, z) \simeq p(x, y, \theta_c) + \sum_j 2c_1^j(\theta_c)\sin\left(\pi\frac{E_1^j}{E_0}\frac{k_z}{2}z\right)\psi_1^j(x - x_j, y - y_j)\exp\left\{-i\pi\left(\frac{E_1^j}{E_0}\frac{k_z}{2}z - \frac{1}{2}\right)\right\},
\tag{224}
\]
where $j$ labels the atom columns in the crystal, $p(x, y, \theta_c)$ is the probe function defined as

\[
\Psi(x, y, 0) = p(x, y, \theta_c),
\tag{225}
\]
and $\theta_c$ the half angle of beam convergence. $c_1^j(\theta_c)$ can be calculated given the correct boundary conditions. In case of incoherent illumination, the formed image is equal to the superposition of the images caused by each of the incident beam directions, which are calculated from the wave functions given by Eq. (212). Nevertheless, in practice the effect of beam convergence is small. The influence of beam convergence is therefore introduced on the level of the image formation rather than on the level of dynamic calculations.

G. Conclusion

In this section crystal or beam tilt was introduced in the S-state model. This was needed because in practice it is almost impossible to align the specimen locally in perfect zone-axis orientation. This is due, for example, to local bending of the specimen or different grain orientations. The influence of crystal or beam tilt on the wave function was studied. It can be concluded that the wave function can still be expanded to a good approximation in the local atom column eigenfunctions. On the other hand, the excitation coefficients are altered due to crystal or beam tilt. Nonradially symmetric eigenfunctions are now excited as well due to the nonsymmetrical illumination. Nevertheless, the 1S-state model is still a valid model, after including tilt, up to tilt angles of a few mrad. The thickness periodicity now decreases with tilt angle. The minima and maxima of the wave functions are shifted. The accuracy of the analytic model using a 2D quadratically normalized Gaussian parameterization was tested by fitting it, using a least-squares criterion, to simulated wave functions including crystal or beam tilt calculated with a multislice method. Two extra tilt parameters $(t_x, t_y)$ are fitted. It was shown that the atom column positions could be determined with a reasonable accuracy even for closely spaced atom columns.

VII. EXPERIMENTAL CHANNELING MAPS

As shown in Section IV, the S-state model can be written in good approximation in closed analytical form (described by three parameters $c_1$, $E_1$, $b$, respectively, an excitation coefficient, the eigenenergy of the S-state, and the characteristic width of the uniform S-state):

\[
\Psi(x, y, z) \simeq 1 + \sum_j 2c_1^j \sin\left(\pi\frac{E_1^j}{E_0}\frac{k_z}{2}z\right)\psi_1^j(x - x_j, y - y_j)\exp\left\{-i\pi\left(\frac{E_1^j}{E_0}\frac{k_z}{2}z - \frac{1}{2}\right)\right\}
\tag{226}
\]

with

\[
\psi_1^j(x, y) = \sqrt{\frac{|E_1^j|}{\pi}}\,\frac{1}{b_j}\exp\left(-\frac{1}{2}\left(\sqrt{|E_1^j|}\,\frac{\sqrt{x^2 + y^2}}{b_j}\right)^2\right)
\tag{227}
\]
with $j$ the atom column index, $E_0$ the incident electron energy, $k_z$ proportional to the inverse wavelength, and $z$ the crystal thickness (Geuens et al., 1999). This expression allows determination of the position and "weight" ($E_1$) of the atom columns by direct fitting with the exit wave. To interpret the exit wave $\Psi$ in terms of channeling, it is much more appropriate to subtract the vacuum wave (entrance wave) (i.e., $\Psi_s = \Psi - 1$) to isolate the part describing the channeling. We will call this the channeling wave $\Psi_s$. In practical experiments both the exit wave $\Psi$ and the vacuum wave $\Psi_v$ can be determined if the edge of the object is in the same field of view. Because the vacuum wave $\Psi_v$ can only be determined apart from an arbitrary phase, the channeling wave is then calculated as $\Psi_s = (\Psi - \Psi_v)/\Psi_v$. According to the S-state model, the amplitude of $\Psi_s$ is strongly peaked at the atom column position, which allows determination of the position of the column, and the phase is constant over the atom column and proportional to its "weight" ($E_1$). The amplitude of $\Psi_s$ oscillates periodically as a function of thickness, and the phase increases linearly with thickness. A convenient way to visualize this is by plotting for each pixel of the exit wave its real and imaginary values as coordinates in a complex plane, with the x-axis as the real axis and the y-axis as the imaginary axis. If the object is a wedge containing different thicknesses, one can expect from the 1S theory above that all the points are located on a circular locus that starts at the origin (for zero thickness) and has its center on the x-axis. The amplitude of the channeling wave is then the radius of the circle, and the phase is the angular coordinate along the circle. All the pixels of the same column may have slightly different amplitudes so that the circle in fact becomes a ring, but they are all expected to have the same angular coordinate (i.e., to lie in the same sector).
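The circular locus can be illustrated with a short Python sketch. It is a minimal toy calculation, not the procedure used for Figure 11: a single column is evaluated at its own position with the 1S expression of Eq. (226), the value `amp` stands for $c_1\psi_1(0, 0)$, the sign convention of the exponent is the one assumed in Eq. (226), and all numbers (energy ratio, wave number, vacuum phase) are hypothetical.

```python
import numpy as np


def exit_wave_1s(z, amp, e1_over_e0, k):
    """1S exit wave at the column position; amp stands for c1 * psi1(0, 0)."""
    half_phase = np.pi * e1_over_e0 * k * z / 2.0
    return 1.0 + 2.0 * amp * np.sin(half_phase) * np.exp(-1j * (half_phase - np.pi / 2))


def channeling_wave(psi, psi_vac):
    """Channeling wave (psi - psi_vac) / psi_vac; removes an arbitrary vacuum phase."""
    return (psi - psi_vac) / psi_vac


if __name__ == "__main__":
    z = np.linspace(0.0, 10.0, 101)        # wedge thicknesses (nm)
    amp, ratio, k = 0.4, 6.7e-4, 507.6     # hypothetical values
    vac = np.exp(0.3j)                     # vacuum wave with arbitrary phase
    psi_s = channeling_wave(vac * exit_wave_1s(z, amp, ratio, k), vac)
    # Distance of every point to (amp, 0) equals amp: a circle through the origin.
    print(np.allclose(np.abs(psi_s - amp), amp))
```

In this toy model every thickness maps onto a circle of radius `amp` centered at (`amp`, 0) in the complex plane, and the division by the vacuum wave removes the arbitrary phase offset.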
If an atom is added to a column, the corresponding points are expected to rotate over a fixed angular increment. If the crystal contains different types of columns, this map will reveal a distinct circle for each different type of column. For this reason we call this representation a channeling map. Figure 11a shows simulations and Figure 11b experimental results for the channeling map of Au[110], both of which confirm these theoretical conclusions. It is clear that the pixels are grouped in well-resolved sectors with an increment of approximately 0.6 rad, each corresponding to one extra Au atom in the column. For this particular case, the channeling map allows determination of the composition of each column with single atom sensitivity. It should be noted that Sinkler and Marks (1999) were the first to plot exit wave pixels for $\Psi$ (instead of $\Psi_s$) in the complex plane and in polar coordinates (Argand diagram), but only for very thin crystal films with insufficient thickness variation to observe the typical channeling circles. In the near future we will try to extend the method to systems with different types of columns and to columns with mixed composition.

VIII. ELECTRON DIFFRACTION AND THE S-STATE MODEL
Due to the simplicity of the S-state model and the fact that it describes the dynamic electron scattering fully in real space, it provides intuitive physical insight into a large range of properties of the electron–object interaction, insight that is not provided by a Bloch wave solution or multislice method. In this section the S-state model is used to explain some properties of electron diffraction in particular, such as "Why do direct methods work for dynamic electron diffraction?" and "How extinct are kinematically forbidden reflections?" It is hard or almost impossible to answer these questions using a Bloch wave method or multislice method. Before these questions are answered, an electron diffraction pattern is described using the S-state model.

A. Electron Diffraction

Up to now only a description of the wave function in real space has been discussed. However, the S-state model is equally well suited to describe the wave function in Fourier space (i.e., electron diffraction). According to the dynamic theory, the electron is multiply scattered inside the crystal but can freely propagate once outside the crystal. Hence the wave function at the exit face of the crystal can be considered as a Huygens source for secondary waves, so that the diffracted wave pattern is proportional to the Fourier transform of the wave function at the exit face of the object (Fraunhofer theory).
Whereas the description of the wave function in real space was well suited to describe small local deviations from the bulk structure (e.g., interfaces, point defects), the description of the wave function in Fourier space is better suited to describe diffraction at a periodic structure. After all, in this case the electron diffraction pattern is discrete. The wave function at the exit face of a periodic structure in real space can be written according to the S-state model as

\[
\begin{aligned}
\Psi(x, y, z) \simeq{}& 1 + \sum_{u_x, u_y}\ \sum_{j \in \text{unit cell}} 2c_1^j \sin\left(\pi\frac{E_1^j}{E_0}\frac{k_z}{2}z\right)\psi_1^j(x - x_j, y - y_j) \ast \delta(x - u_x, y - u_y) \\
&\times \exp\left\{-i\pi\left(\frac{E_1^j}{E_0}\frac{k_z}{2}z - \frac{1}{2}\right)\right\},
\end{aligned}
\tag{228}
\]

with $j$ the label of the atom columns in the unit cell and $(u_x, u_y)$ a vector with $u_x = na$ and $u_y = mb$, with $n = -\infty, \ldots, \infty$ and $m = -\infty, \ldots, \infty$ integer numbers, and $a$ and $b$ the dimensions of the rectangular 2D unit cell of the averaged atom column potential along the beam direction. The unit cell is chosen as rectangular here for reasons of simplicity but can be more general. After forward Fourier transforming, the wave function becomes

\[
\begin{aligned}
\hat{\Psi}(g_x, g_y, z) \simeq{}& \delta(g_x, g_y) + \sum_{q_x, q_y}\ \sum_{j \in \text{unit cell}} 2c_1^j \sin\left(\pi\frac{E_1^j}{E_0}\frac{k_z}{2}z\right)\hat{\psi}_1^j(g_x, g_y) \\
&\times \exp(-2\pi i(g_x x_j + g_y y_j))\,\delta(g_x - q_x, g_y - q_y)\exp\left\{-i\pi\left(\frac{E_1^j}{E_0}\frac{k_z}{2}z - \frac{1}{2}\right)\right\}
\end{aligned}
\tag{229}
\]

with $g_x$ and $g_y$ the reciprocal lattice vectors and $q_x = h/a$ and $q_y = k/b$, with $h = -\infty, \ldots, \infty$ and $k = -\infty, \ldots, \infty$ integer numbers called the Miller indices. Note that if the 1S eigenfunction is parameterized as a 2D quadratically normalized Gaussian function, $\hat{\psi}_1^j(g_x, g_y)$ is equal to

\[
\hat{\psi}_1^j(g_x, g_y) = 2\sqrt{\frac{\pi}{|E_1^j|}}\,b_j\exp\left(-2\pi^2\,\frac{b_j^2\,(g_x^2 + g_y^2)}{|E_1^j|}\right),
\tag{230}
\]

and the wave function in Fourier space is expressed in closed analytical form. The proper Fourier series representing $\Psi(x, y, z)$ is then

\[
\Psi(x, y, z) = \frac{1}{A}\sum_{h,k}\hat{\Psi}_{h,k}(z)\exp\left(2\pi i\left(h\frac{x}{a} + k\frac{y}{b}\right)\right),
\tag{231}
\]
TABLE 10
SUMMARY OF THE RESULTS

                  Kinematic      Dynamic
Real space        $U(x, y)$      $|\Psi(x, y, z) - 1|$
Fourier space     $F_{hk}$       $\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k}$
with $A = ab$ the size of the unit cell and $\hat{\Psi}_{h,k}(z)$ given by

\[
\hat{\Psi}_{h,k}(z) = \hat{\Psi}(g_x, g_y, z)\,\delta\left(g_x - \frac{h}{a}, g_y - \frac{k}{b}\right) = \delta(h, k) + i\hat{\Psi}'_{h,k}(z),
\tag{232}
\]

with $\hat{\Psi}'_{h,k}(z)$ equal to

\[
\hat{\Psi}'_{h,k}(z) = -i\int_{-a/2}^{a/2}\int_{-b/2}^{b/2}(\Psi(x, y, z) - 1)\exp\left(-2\pi i\left(h\frac{x}{a} + k\frac{y}{b}\right)\right)dx\,dy
\tag{233}
\]

In order to interpret $\hat{\Psi}'_{h,k}(z)$, the kinematical limit ($z \to 0$) is taken. In the kinematical limit $-i(\Psi(x, y, z) - 1)$ is approximately equal to $|\Psi(x, y, z) - 1|$ and, as shown in Section IV.A, $|\Psi(x, y, z) - 1|$ is in the kinematic limit equal to

\[
\lim_{z \to 0}|\Psi(x, y, z) - 1| \simeq \sigma U(x, y)z,
\tag{234}
\]

and $\lim_{z \to 0}\hat{\Psi}'_{h,k}(z)$ is thus equal to

\[
\lim_{z \to 0}\hat{\Psi}'_{h,k}(z) \simeq \lim_{z \to 0}\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k} \simeq \sigma F_{h,k}z
\tag{235}
\]

In the kinematic limit $\hat{\Psi}'_{h,k}(z)$ is equal to $\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k}$, the Fourier coefficients of $|\Psi(x, y, z) - 1|$, and thus proportional to the structure factor $F_{h,k}$. These results are summarized in Table 10.

B. Direct Methods

As stated by Sinkler et al. (1998), direct methods for crystal structure determination are unique in their ability to determine the positions of atoms within a crystalline unit cell using only the diffraction amplitudes. The power of this approach is evident in the long list of organic and inorganic structures that have been solved using direct methods. The theoretical basis of direct methods has been developed under the assumption that the diffraction amplitudes are purely kinematic and the prior application of direct methods
has thus largely been to cases in which complete kinematic diffraction data sets are available from single-crystal x-ray measurements. Such measurements are associated with significant experimental effort and are dependent on the ability to obtain a single crystal. This has been a motivation for recent work applying direct methods to electron diffraction data for solving organic structures (Dorset, 1996a,b). Nevertheless, one can ask the question, "Are direct methods applicable to dynamic diffraction data?" From experiments the answer would be "Yes." But what is the theory behind that? This question was answered by Sinkler et al. (1998, 1999b). In the kinematic case

\[
F_{h,k} = F^{*}_{-h,-k},
\tag{236}
\]

because the inverse Fourier transform of $F_{h,k}$ is real, namely, the real charge distribution. In case of dynamic diffraction, this is no longer valid because

\[
\hat{\Psi}'_{h,k}(z) \neq \hat{\Psi}'^{*}_{-h,-k}(z),
\tag{237}
\]

due to the complex nature of $-i(\Psi(x, y, z) - 1)$. From this, one could conclude that direct methods cannot be applied to dynamic electron diffraction data. Nevertheless, this is not true, as will be shown below. The basic theory of direct methods arises from probabilistic relationships obeyed by a set of (pseudorandom) "atomistic" peaks in real space, ideally delta functions, surrounded by regions of essentially zero amplitude. In case of dynamic diffraction, $-i(\Psi(x, y, z) - 1)$ is not proportional to a set of atomistic peaks in real space and is even complex, although away from the atom columns it is zero. Nevertheless, $|\Psi(x, y, z) - 1|$ is real, peaked at the atom columns, and zero away from the atom columns, as discussed in Section IV.A. The peaks that represent the 1S eigenfunction are even sharper than the peaks in the projected potential. Thus, given the amplitudes of the Fourier coefficients of $|\Psi(x, y, z) - 1|$, they can be phased using direct methods and $|\Psi(x, y, z) - 1|$ can be reconstructed. If a relation exists between the amplitudes of the Fourier coefficients of $|\Psi(x, y, z) - 1|$ and $\Psi(x, y, z)$, direct methods can be applied to dynamic electron diffraction data. Such a theoretical relation exists if the 1S-state model is valid and all atom columns in the unit cell are equal. If all atom columns are equal,

\[
\begin{aligned}
\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k} &\simeq \sum_{j \in \text{unit cell}} 2c_1^j \sin\left(\pi\frac{E_1^j}{E_0}\frac{k_z}{2}z\right)\hat{\psi}_1^j(h, k)\exp\left(-2\pi i\left(h\frac{x_j}{a} + k\frac{y_j}{b}\right)\right) \\
&\simeq s_{h,k}\,|\hat{\Psi}'_{h,k}(z)|
\end{aligned}
\tag{238}
\]

with

\[
s_{h,k} = \operatorname{sign}\left(\sum_{j \in \text{unit cell}}\exp\left(-2\pi i\left(h\frac{x_j}{a} + k\frac{y_j}{b}\right)\right)\right)
\tag{239}
\]

and the sign function defined as $\operatorname{sign}(x) = x/|x|$. The modulus of $\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k}$ is thus equal to the modulus of $\hat{\Psi}'_{h,k}(z)$ if the 1S-state model is valid and all atom columns in the unit cell are equal. It has been shown by Sinkler et al. (1998, 1999b) that this relation is more generally true for nonidentical atom columns in the unit cell and at thicknesses larger than the thickness periodicity $D_{00}$ of the heaviest atom columns.
1. The Patterson Function

The inability to reconstruct $|\Psi(x, y, z) - 1|$ straightforwardly from the measured intensities naturally leads to the question: What useful information, if any, can be obtained from a Fourier series employing the amplitudes of $\hat{\Psi}'_{h,k}(z)$ directly? This question was first successfully answered by Patterson for x-ray diffraction. In this subsection it will be shown that what Patterson found for x-ray diffraction still holds in case of dynamic electron diffraction. The Patterson series stems from a well-known function in Fourier integral theory called the correlation operation. This function $P(X, Y)$ can be expressed as

\[
\begin{aligned}
P(X, Y) ={}& \frac{1}{A}\int_{-a/2}^{a/2}\int_{-b/2}^{b/2}|\Psi(x, y, z) - 1|\,|\Psi(x + X, y + Y, z) - 1|\,dx\,dy \\
={}& \frac{1}{A^3}\int_{-a/2}^{a/2}\int_{-b/2}^{b/2}\sum_{h,k}\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k}\exp\left(2\pi i\left(h\frac{x}{a} + k\frac{y}{b}\right)\right) \\
&\times \sum_{h',k'}\mathcal{F}(|\Psi(x, y, z) - 1|)_{h',k'}\exp\left(2\pi i\left(h'\frac{x + X}{a} + k'\frac{y + Y}{b}\right)\right)dx\,dy \\
={}& \frac{1}{A^3}\sum_{h,k}\sum_{h',k'}\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k}\,\mathcal{F}(|\Psi(x, y, z) - 1|)_{h',k'}\exp\left(2\pi i\left(h'\frac{X}{a} + k'\frac{Y}{b}\right)\right) \\
&\times \int_{-a/2}^{a/2}\int_{-b/2}^{b/2}\exp\left(2\pi i\left((h + h')\frac{x}{a} + (k + k')\frac{y}{b}\right)\right)dx\,dy
\end{aligned}
\tag{240}
\]

The integral in Eq. (240) equals zero unless $h' = -h$ and $k' = -k$. Because $|\Psi(x, y, z) - 1|$ is real,

\[
\mathcal{F}(|\Psi(x, y, z) - 1|)_{-h,-k} = \mathcal{F}^{*}(|\Psi(x, y, z) - 1|)_{h,k}
\tag{241}
\]

and $P(X, Y)$ is then equal to

\[
P(X, Y) = \frac{1}{A^2}\sum_{h,k}\left|\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k}\right|^2\cos 2\pi\left(h\frac{X}{a} + k\frac{Y}{b}\right),
\]

where the sine terms in the series cancel in pairs, due to the summation from minus to plus infinity. Because there exists a relation $|\mathcal{F}(|\Psi(x, y, z) - 1|)_{h,k}| \simeq |\hat{\Psi}'_{h,k}(z)|$, the Patterson function is applicable to dynamic electron diffraction:

\[
P(X, Y) \simeq \frac{1}{A^2}\sum_{h,k}|\hat{\Psi}'_{h,k}(z)|^2\cos 2\pi\left(h\frac{X}{a} + k\frac{Y}{b}\right)
\tag{242}
\]

2. Inequalities

It was thought at first that the interpretation of Patterson maps was the only way that observed intensity data could yield any information about the crystal structure. Fortunately, this is not correct. It will be shown that the amplitudes of the Fourier coefficients of $|\Psi(x, y, z) - 1|$ themselves can be used in a direct way to determine crystal structures without recourse to any other information. In x-ray crystallography, the possibility of direct phasing was seen from the Harker–Kasper inequality (1948) using the Cauchy–Schwarz inequalities. Harker and Kasper examined the relations between structure-factor magnitudes that could be established with their aid. Here it will be shown that the Harker–Kasper inequality derived originally for x-ray diffraction is valid for electron diffraction as well. As an illustration, the Schwarz inequality is considered:

\[
\left|\int f g\,d\tau\right|^2 \leq \int |f|^2\,d\tau\int |g|^2\,d\tau
\tag{243}
\]

In case of a centro-symmetric unit cell, $\Psi_{\text{unit cell}}(x, y, z) = \Psi_{\text{unit cell}}(-x, -y, z)$, $\hat{\Psi}'_{h,k}(z)$ can be written as

\[
\hat{\Psi}'_{h,k}(z) = -i\int_{-a/2}^{a/2}\int_{-b/2}^{b/2}(\Psi(x, y, z) - 1)\cos 2\pi\left(h\frac{x}{a} + k\frac{y}{b}\right)dx\,dy
\tag{244}
\]

$f$, $g$, and $d\tau$ can then be defined as
\[
f = \sqrt{\Psi(x, y, z) - 1}
\tag{245}
\]

\[
g = -i\sqrt{\Psi(x, y, z) - 1}\,\cos 2\pi\left(h\frac{x}{a} + k\frac{y}{b}\right)
\tag{246}
\]

\[
d\tau = dx\,dy
\tag{247}
\]
jCðx; y; zÞ 1j dx dy 2 a=2 b=2 "Z # x a=2 Z b=2 y jCðx; y; zÞ 1j 1 þ cos 2p 2h þ 2k dx dy ; a b a=2 b=2 ð248Þ 6 x 77 1 6 6 6 x 777 y y or via the identity cos2 2p h a þ k b ¼ 2 1 þ cos 2p 2h a þ 2k b , "Z # a=2 Z b=2 1 2 0 jC h;k ðzÞj
jCðx; y; zÞ 1j dx dy 2 a=2 b=2 ð249Þ R a=2 R b=2 a=2 b=2 jCðx; y; zÞ 1j dx dy þ F jCðx; y; zÞ 1j 6
2h;2k
^0 jC
Because h;k ðzÞj to good approximation is equal to jF ðjCðx; y; zÞ 1jÞh;k j, as shown before, the inequality can be written as " # 2 1 Z a=2 Z b=2 F jCðx; y; zÞ 1j
jCðx; y; zÞ 1j dx dy h;k 2 a=2 b=2 "Z # ð250Þ a=2 Z b=2 jCðx; y; zÞ 1j dx dy þ F jCðx; y; zÞ 1j 2h; 2k a=2
b=2
After assumption of F jCðx; y; zÞ 1j h;k Uh;k ðzÞ ¼ Z a=2 Z b=2 ; jCðx; y; zÞ 1j dxdy a=2
ð251Þ
b=2
the inequality is equal to jUh;k ðzÞj2
1 1 þ U2h;2k ðzÞ 2
ð252Þ
Hence, if $|U_{h,k}(z)|^2$ is larger than 1/2, then $U_{2h,2k}(z)$ must be positive, and so on. Similar relationships can be found for other symmetry operations on zonal reflections. Here two other inequalities are given in case of centro-symmetry. They can be proven in a similar way as the one above:

\[
|U_{h,k}(z) + U_{h',k'}(z)|^2 \leq \left(1 + U_{h+h',k+k'}(z)\right)\left(1 + U_{h-h',k-k'}(z)\right),
\tag{253}
\]

and

\[
|U_{h,k}(z) - U_{h',k'}(z)|^2 \leq \left(1 - U_{h+h',k+k'}(z)\right)\left(1 - U_{h-h',k-k'}(z)\right)
\tag{254}
\]

Note that in the kinematic limit ($z \to 0$) $U_{h,k}(z)$ converges to the unitary structure factor $U_{hk}$, which follows from Eqs. (251), (234), and (235):

\[
\lim_{z \to 0}U_{h,k}(z) = \frac{F_{h,k}}{Z} = U_{hk},
\tag{255}
\]

with $Z$ the total number of electrons contained in the unit cell. The discovery of Harker and Kasper showed that it is possible to deduce phases directly from magnitudes of structure factors and inaugurated a new era in structure analysis.
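The inequality of Eq. (252) can be checked numerically for any non-negative, centro-symmetric function. The following Python sketch is illustrative only: a hypothetical toy "density" built from pairs of Gaussian peaks at $\pm(x_j, y_j)$ stands in for $|\Psi(x, y, z) - 1|$, the peak positions and widths are arbitrary, and the unitary coefficients are computed as normalized cosine sums on a grid of fractional coordinates.

```python
import numpy as np


def toy_density(n=64, width=0.04):
    """Non-negative, centro-symmetric toy function on fractional coordinates."""
    frac = (np.arange(n) - n // 2) / n
    X, Y = np.meshgrid(frac, frac, indexing="ij")
    rho = np.zeros_like(X)
    for px, py in [(0.12, 0.20), (0.31, -0.08)]:    # hypothetical peak positions
        for sign in (1, -1):                        # add the inversion partner
            rho += np.exp(-((X - sign * px) ** 2 + (Y - sign * py) ** 2) / (2 * width**2))
    return rho, X, Y


def unitary_coefficient(rho, h, k, X, Y):
    """U_hk: cosine Fourier coefficient normalized by the integral of rho."""
    return np.sum(rho * np.cos(2 * np.pi * (h * X + k * Y))) / np.sum(rho)


def harker_kasper_holds(rho, X, Y, max_index=5, tol=1e-12):
    """Check |U_hk|^2 <= (1 + U_2h,2k) / 2 for all |h|, |k| <= max_index."""
    for h in range(-max_index, max_index + 1):
        for k in range(-max_index, max_index + 1):
            u = unitary_coefficient(rho, h, k, X, Y)
            u2 = unitary_coefficient(rho, 2 * h, 2 * k, X, Y)
            if u**2 > 0.5 * (1.0 + u2) + tol:
                return False
    return True


if __name__ == "__main__":
    rho, X, Y = toy_density()
    print(harker_kasper_holds(rho, X, Y))
```

Because the check only relies on the Schwarz inequality applied to a non-negative weight, it holds for the sampled sums as well as for the integrals, so a `False` result would point to a coding error rather than to a property of the chosen toy function.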
C. How Extinct Are Kinematically Forbidden Reflections?

Electron scattering in a crystal is caused by the electrostatic potential of the atoms of the crystal. Hence, in the kinematic limit, which essentially is a single scattering limit, the diffracted electron wave in a zone plane is given by the Fourier transform of the electrostatic potential, projected along the corresponding zone-axis. Reflections can be kinematically extinct as a consequence of the symmetry of the averaged potential along the beam direction. Kinematic extinctions occur, for instance, in crystals with space group $Fd\bar{3}m$, like diamond, Si, Ge, and Sn, for the reflections $hkl$ with $h + k + l = 4n + 2$, with $n$ an integer number. If a diamond-type structure is regarded as two interpenetrating face-centered cubic lattices displaced by a vector $\left(\frac{1}{4}, \frac{1}{4}, \frac{1}{4}\right)$, the squared structure factor of the $hkl$ reflection is given by

\[
|F_{hkl}|^2 = 4f^2\left[1 + (-1)^{h+k} + (-1)^{h+l} + (-1)^{k+l}\right]^2\cos^2\left(\frac{\pi}{4}(h + k + l)\right)
\tag{256}
\]

With $f$ the scattering factor of the constituent atom, it is clear that the cosine term vanishes for $h + k + l = 4n + 2$. Figure 30 shows a diffraction pattern of an Si crystal in [110] zone-axis orientation. The reflections with $h + k = 0$ and $l = 4n + 2$, like {002}, {222}, . . . are kinematically extinct and are indicated by open circles in Figure 30.

FIGURE 30. Diffraction pattern of a diamond-type structure in [110] zone-axis orientation (kinematic theory). Circles mark the $h + k = 0$ and $l = 4n + 2$ reflections. The gray arrows denote the multiple scattering paths for the (002) reflection.

When the thickness of the crystal becomes larger, so that multiple scattering occurs, these reflections can gain intensity due to two or more successive scatterings. For instance, the $h + k = 0$ and $l = 4n + 2$ reflections of a diamondlike structure in [110] zone-axis orientation gain intensity due to multiple scattering. This can be explained by means of the multibeam Darwin–Howie–Whelan equation of dynamic electron diffraction theory [27]. This equation for the (002) reflection of a diamondlike structure in [110] zone-axis orientation is given by

\[
\frac{d\psi_{002}}{dz} - 2\pi i s_{002}\psi_{002} = i\pi\left(\ldots + \frac{e^{i\theta_{002}}}{\xi_{002}}\psi_{000} + \ldots + \frac{e^{i\theta_{1\bar{1}1}}}{\xi_{1\bar{1}1}}\psi_{\bar{1}11} + \ldots + \frac{e^{i\theta_{\bar{1}11}}}{\xi_{\bar{1}11}}\psi_{1\bar{1}1} + \ldots\right)
\tag{257}
\]

The term $\frac{e^{i\theta_{002}}}{\xi_{002}}\psi_{000}$ vanishes because $\xi_{002} = \infty$. The other terms on the right-hand side are nonzero and contain a factor $\xi_{\bar{1}11}$ or $\xi_{1\bar{1}1}$, which describes the probability to be scattered dynamically by the ($\bar{1}$11) or (1$\bar{1}$1) planes, and a diffracted amplitude factor $\psi_{\bar{1}11}$ or $\psi_{1\bar{1}1}$. Electrons scattered by both planes ($\bar{1}$11) and (1$\bar{1}$1) will have a direction given by $\mathbf{k} + (\bar{1}11) + (1\bar{1}1) = \mathbf{k} + (002)$. Electrons are not scattered directly in the (002) direction but after multiple scattering with the ($\bar{1}$11) and (1$\bar{1}$1) planes. The multiple scattering path is shown in Figure 30 and is denoted by arrows.
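The kinematic extinction rule can be checked by summing the structure factor directly over the eight atoms of the conventional diamond cubic cell and comparing with the closed form of Eq. (256). The Python sketch below is an illustration with a constant, hypothetical scattering factor $f = 1$; the function names are not taken from any library.

```python
import cmath
import math

# Fractional coordinates: an fcc lattice plus a copy displaced by (1/4, 1/4, 1/4).
FCC = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
DIAMOND = FCC + [(x + 0.25, y + 0.25, z + 0.25) for x, y, z in FCC]


def structure_factor(h, k, l, f=1.0):
    """Kinematic structure factor summed over the diamond cubic cell."""
    return f * sum(cmath.exp(-2j * math.pi * (h * x + k * y + l * z)) for x, y, z in DIAMOND)


def squared_structure_factor_closed_form(h, k, l, f=1.0):
    """Closed form: 4 f^2 [1 + (-1)^(h+k) + (-1)^(h+l) + (-1)^(k+l)]^2 cos^2(pi (h+k+l) / 4)."""
    fcc = 1 + (-1) ** (h + k) + (-1) ** (h + l) + (-1) ** (k + l)
    return 4 * f**2 * fcc**2 * math.cos(math.pi * (h + k + l) / 4) ** 2


if __name__ == "__main__":
    for hkl in [(0, 0, 2), (2, 2, 2), (1, 1, 1), (0, 0, 4), (2, 2, 0)]:
        print(hkl, abs(structure_factor(*hkl)) ** 2)
```

With $f = 1$ the (002) and (222) reflections vanish, whereas (111) gives a squared structure factor of 32 and (004) and (220) give 64, in agreement with the $h + k + l = 4n + 2$ rule.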
It may occur that in the presence of a screw-axis or a glide plane, due to group theoretical reasons, multiple scattering contributions cancel even in the dynamic scattering regime, because of destructive interference between contributions with opposite phase. Such dynamical extinctions are called Gjo¨ nnes–Moodie extinctions (1965). However, most kinematic extinctions, such as the h þ k ¼ 0 and l ¼ 4n þ 2 reflections of a diamondlike structure, are not of this Gjo¨ nnes–Moodie type. Nevertheless, dynamic simulations and experiments show that these reflections are still weak at thicknesses for which the diVraction can be considered as highly dynamic. It is hard to explain this using the multibeam Darwin– Howie–Whelan equation of dynamic electron diVraction theory shown in Eq. (258). Therefore, the S-state of the channeling theory is used here to explain this phenomenon in real space. Although a crystal with a diamondlike structure, such as Si in [110] orientation and particularly the (002) reflection, will be used as concrete example, the conclusions are more general. 1. The (002) Reflection of a Diamond-Type Structure in [110] Zone-Axis Orientation According to the kinematic theory, the amplitude of the diVraction pattern is proportional to the Fourier transform of the averaged potential along the beam direction eU(x, y). Figure 31 shows the projected structure of a diamond-type crystal in [110] zone-axis orientation. From this figure, it is clear that the projection of eU(x, y) onto the [001] axis has a periodicity of c/ 4, because the projections of the (002) planes and the (004) planes are identical. The (002) reflection is thus kinematically extinct, as was shown above using the structure factor. In the dynamic theory, the diVracted wave pattern is proportional to the Fourier transform of the wave function as already explained. Hence the
FIGURE 31. A diamond-type structure in [110] zone-axis orientation (half of the cubic unit cell is outlined by a rectangle) and the projected structure on the [001] axis.
FIGURE 32. The (a) real and (b) imaginary part of the wave function of an Si crystal in [110] zone-axis orientation at a specimen thickness of 13 nm (acceleration voltage = 300 kV, Debye–Waller factor = 0.006 nm²). The projections of the real part and imaginary part of the wave function on the [001] axis are shown in (c) and (d), respectively.
extinctions in the diffraction pattern will be caused by symmetry conditions in the wave function. Kinematic extinctions, which are a consequence of the symmetry of the projected potential, will also remain dynamically extinct if the wave function maintains the symmetry of the projected potential. For an Si crystal in [110] orientation at a thickness of 13 nm (Figure 32), this is not the case. Compared to the projection of eU(x, y) onto the [001] axis, the projection of the wave function onto the [001] axis shows a doubled periodicity, namely c/2, as is clear from Figure 32. Hence the (002) reflection is no longer extinct. The reason for the symmetry breaking can be explained by means of the channeling theory in real space.

2. Beyond the S-State Model

Because the most bound local eigenfunctions have the symmetry of the local atom column potentials, the description of the wave function of a specimen foil given in Eq. (183) has the symmetry of the local atom columns, which is radially symmetric around the projected center of the atom column. The S-state model thus cannot explain the intensity in the (002) reflection. The (002) reflection appears because of deviations from the S-state model. If the structure of an Si crystal in [110] zone-axis orientation is considered, it is clear that it is built up of well-separated dumbbells of closely spaced atom
columns that cannot be regarded as isolated. Therefore, the structure is particularly well suited to studying deviations from the S-state model. Instead of isolated atom columns, the dumbbells are now regarded as the isolated entities. Because the separation between adjacent dumbbells is large, the wave function of the specimen foil can to good approximation be expanded in the most bound eigenfunctions of the constituting dumbbells

$$\Psi(x,y,z) \simeq 1 + \sum_j 2c_{1\sigma_g}\,\sin\!\left(\pi\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z\right)\psi_{1\sigma_g}(x-x_j,\,y-y_j)\,\exp\!\left[-i\pi\left(\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z+\frac{1}{2}\right)\right], \tag{258}$$

where j labels the dumbbells and $\psi_{1\sigma_g}(x,y)$, $c_{1\sigma_g}$, and $E_{1\sigma_g}$ are, respectively, the most bound "sigma gerade" eigenfunction of the dumbbell and its associated excitation coefficient and eigenenergy. For thicknesses normally used in HREM it is sufficient to keep only these in the expansion. It has been shown in Section V that $\psi_{1\sigma_g}(x,y)$ can be described as a linear combination of isolated atom column eigenfunctions. Nevertheless, in Section V the expansion was restricted to $\psi^a_{00}(x,y)$ and $\psi^b_{00}(x,y)$. Because such an expansion is not sufficient to explain the intensity in the (002) reflection, the expansion is extended to

$$\psi_{1\sigma_g}(x,y) \simeq c^{1\sigma_g}_{00}\left[\psi^a_{00}(x-x_a,\,y-y_a)+\psi^b_{00}(x-x_b,\,y-y_b)\right] + c^{1\sigma_g}_{11x}\left[\psi^a_{11x}(x-x_a,\,y-y_a)-\psi^b_{11x}(x-x_b,\,y-y_b)\right]. \tag{259}$$

The sign of the linear combination coefficients can be determined from symmetry arguments. After substitution of Eq. (259) in Eq. (258), the wave function becomes

$$\begin{aligned}
\Psi(x,y,z) \simeq 1 &+ \sum_j 2c_{1\sigma_g}c^{1\sigma_g}_{00}\,\sin\!\left(\pi\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z\right)\left[\psi^a_{00}(x-x_j-x_a,\,y-y_j-y_a)+\psi^b_{00}(x-x_j-x_b,\,y-y_j-y_b)\right]\exp\!\left[-i\pi\left(\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z+\frac{1}{2}\right)\right]\\
&+ \sum_j 2c_{1\sigma_g}c^{1\sigma_g}_{11x}\,\sin\!\left(\pi\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z\right)\left[\psi^a_{11x}(x-x_j-x_a,\,y-y_j-y_a)-\psi^b_{11x}(x-x_j-x_b,\,y-y_j-y_b)\right]\exp\!\left[-i\pi\left(\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z+\frac{1}{2}\right)\right].
\end{aligned} \tag{260}$$
The projection of the wave function onto the [001] axis now has a periodicity of c/2, due to the symmetry of the $\psi_{11x}(x,y)$ eigenfunction. The periodicity is thus doubled compared to the periodicity of the projection of eU(x, y) onto the [001] axis. As a consequence the (002) reflection is no longer extinct. The wave function can be split into two parts: part one containing the first two terms and part two containing the last term. The first part will only contribute to the reflections with h + k = 0, h + l = 2m, and l ≠ 4n + 2, with m an integer, whereas part two will contribute to the reflections with h + k = 0 and l = 4n + 2. Because part two is much smaller than part one, at least in the case of Si in [110] orientation, the h + k = 0 and l = 4n + 2 reflections will appear only weakly in the diffraction pattern up to thicknesses normally used in HREM. This is clear from Figure 33, which shows the intensity of the h + k = 0 and l = 4n + 2 reflections normalized to (a) the total intensity and (b) the total diffracted intensity. The normalized intensity of these reflections is negligible up to a thickness of approximately 10–20 nm. At larger thicknesses the description of the wave function given in Eq. (258) is no longer sufficient. The second most bound "sigma gerade" dumbbell eigenfunction is no longer negligible compared to the most bound one. The wave function then becomes
FIGURE 33. The intensity of the h + k = 0 and l = 4n + 2 reflections normalized to (a) the total intensity and (b) the total diffracted intensity for an Si crystal in [110] zone-axis orientation (acceleration voltage = 300 kV; Debye–Waller factor = 0.006 nm²).
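$$\begin{aligned}
\Psi(x,y,z) \simeq 1 &+ \sum_j 2c_{1\sigma_g}\,\sin\!\left(\pi\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z\right)\psi_{1\sigma_g}(x-x_j,\,y-y_j)\,\exp\!\left[-i\pi\left(\frac{E_{1\sigma_g}}{E_0}\frac{k_z}{2}z+\frac{1}{2}\right)\right]\\
&+ \sum_j 2c_{2\sigma_g}\,\sin\!\left(\pi\frac{E_{2\sigma_g}}{E_0}\frac{k_z}{2}z\right)\psi_{2\sigma_g}(x-x_j,\,y-y_j)\,\exp\!\left[-i\pi\left(\frac{E_{2\sigma_g}}{E_0}\frac{k_z}{2}z+\frac{1}{2}\right)\right],
\end{aligned} \tag{261}$$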
with $\psi_{2\sigma_g}(x,y)$, $c_{2\sigma_g}$, and $E_{2\sigma_g}$, respectively, the second most bound "sigma gerade" dumbbell eigenfunction and its associated excitation coefficient and eigenenergy. Similar to $\psi_{1\sigma_g}(x,y)$, it can be shown that $\psi_{2\sigma_g}(x,y)$ can be described as a linear combination of isolated atom column eigenfunctions, in which $\psi_{11x}(x,y)$ will be the most significant term, as was clear from the correlation diagram shown in Figure 17:

$$\psi_{2\sigma_g}(x,y) \simeq c^{2\sigma_g}_{00}\left[\psi^a_{00}(x-x_a,\,y-y_a)+\psi^b_{00}(x-x_b,\,y-y_b)\right] + c^{2\sigma_g}_{11x}\left[\psi^a_{11x}(x-x_a,\,y-y_a)-\psi^b_{11x}(x-x_b,\,y-y_b)\right] + \ldots \tag{262}$$

The projected periodicity of the term containing the second most bound dumbbell eigenfunction is thus c/2. This term will therefore contribute to the h + k = 0 and l = 4n + 2 reflections and will gain intensity as a function of thickness up to a thickness of $(1/k_z)(E_0/E_{2\sigma_g})$, and decrease from there on. This is clear from Figure 33. The normalized intensity of the h + k = 0 and l = 4n + 2 reflections builds up more rapidly as a function of thickness at larger specimen thicknesses. Nevertheless, at a specimen thickness of 50 nm the intensity of the h + k = 0 and l = 4n + 2 reflections normalized to the total intensity is only 16%.

3. The Influence of Tilt

In the case of crystal or beam tilt the wave function, according to Section VI, is

$$\Psi(x,y,z) \simeq 1 + \sum_j 2c_{1\sigma_g}(k_x,k_y)\,\sin\!\left\{\pi\left[\frac{E_{1\sigma_g}}{E_0}k^2-(k_x^2+k_y^2)\right]\frac{z}{2k_z}\right\}\psi_{1\sigma_g}(x-x_j,\,y-y_j)\,\exp\!\left\{-i\pi\left(\left[\frac{E_{1\sigma_g}}{E_0}k^2-(k_x^2+k_y^2)\right]\frac{z}{2k_z}+\frac{1}{2}\right)+2\pi i\,(k_x x+k_y y)\right\}. \tag{263}$$
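As a numerical aside on the thickness dependence discussed before the tilt case, the sketch below checks the elementary identity $e^{-i\theta}-1 = 2\sin(\theta/2)\,e^{-i(\theta/2+\pi/2)}$ and the thickness at which a single bound state contributes most. It assumes that the thickness-dependent factor of one state has the usual S-state form $e^{-i\theta}-1$ with $\theta = \pi (E/E_0) k_z z$ and E taken as a magnitude; the function names and the numerical values of E0_EV, E_STATE_EV, and K_PER_NM are hypothetical illustrations, not values from this chapter.

```python
import cmath
import math


def s_state_factor(theta):
    """Thickness-dependent factor of one bound state, assumed to be exp(-i*theta) - 1."""
    return cmath.exp(-1j * theta) - 1.0


def sine_phase_form(theta):
    """Same factor written as amplitude times phase: 2 sin(theta/2) exp[-i(theta/2 + pi/2)]."""
    return 2.0 * math.sin(theta / 2.0) * cmath.exp(-1j * (theta / 2.0 + math.pi / 2.0))


def thickness_of_maximum(e0, e_state, k):
    """Thickness at which sin(pi * (E/E0) * (k/2) * z) first reaches 1, i.e. z = E0 / (E * k)."""
    return e0 / (e_state * k)


# Hypothetical illustrative numbers (assumptions, not data from the text).
E0_EV = 300e3        # incident electron energy in eV
E_STATE_EV = 100.0   # magnitude of the eigenenergy in eV
K_PER_NM = 508.0     # wave number in 1/nm

z_max_nm = thickness_of_maximum(E0_EV, E_STATE_EV, K_PER_NM)
theta_at_max = math.pi * (E_STATE_EV / E0_EV) * K_PER_NM * z_max_nm

print(f"thickness of maximum contribution: {z_max_nm:.2f} nm")
print(f"|factor| at that thickness: {abs(s_state_factor(theta_at_max)):.6f}")
```

With these assumptions the modulus of the factor is $2|\sin(\theta/2)|$, so a more weakly bound state (smaller E) reaches its maximum at a proportionally larger thickness.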
Due to the factor $c_{1\sigma_g}(k_x,k_y)\exp\{2\pi i(k_x x+k_y y)\}$ the symmetry breaking is enhanced. The (004) and (002) planes are now distinguishable in the projection of the wave function of a diamond-type crystal in [110] zone-axis orientation onto the [001] direction. As a consequence the h + k = 0 and l = 4n + 2 reflections will gain intensity. It was shown experimentally by Coene et al. (1985) that in hexagonal polytypes and 18R-type structures, the experimental CHRTEM images often show extra modulations and the kinematically forbidden reflections show intensity. Moreover, the kinematically forbidden reflections were much stronger than predicted by computer simulations assuming perfect orientation of the crystal. It was shown by computer simulations that these anomalies could be associated with small misorientations of the crystal. At that time, the origin of this sensitivity to small tilt was not properly understood. Based on the present work it becomes clear that even a small tilt can cause a symmetry breaking from the perfect channeling condition.

D. Conclusion

In this section it was shown that the S-state model is equally well suited to describe the wave function in Fourier space (i.e., electron diffraction). If the 1S eigenfunction is parameterized as a 2D quadratically normalized Gaussian function, the wave function in Fourier space can be expressed in closed analytical form. Using the S-state model it was proven that direct methods are still valid for electron microscopy even at thicknesses for which the electron scattering can be regarded as highly nonlinear. It has been proven that the Patterson function is still applicable for dynamic electron diffraction, as well as the inequalities originally set up for x-ray diffraction.
By means of the channeling theory, it has been explained why kinematic extinctions such as the h + k = 0 and l = 4n + 2 reflections in a diamondlike structure in [110] orientation are still weak at thicknesses at which the electron–object interaction can be regarded as highly dynamic. It has been shown that the appearance of these reflections is a consequence of the deviation of the exact wave function from the S-state model. Because the deviation from the S-state model is rather small for thicknesses up to some tens of nm, the intensity of the h + k = 0 and l = 4n + 2 reflections is rather weak compared to the other beams. Therefore the appearance of the h + k = 0 and l = 4n + 2 reflections can be used as a validity test of the S-state model. It is shown that the deviation of the real wave function from the S-state model is only significant at thicknesses that are larger than the ones normally used in HREM. Crystal or beam tilt can enhance the symmetry breaking and thus can affect the intensity of the h + k = 0 and l = 4n + 2 reflections.
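The kinematic side of this argument can be checked with a few lines of code. The sketch below is a minimal illustration, assuming identical atoms with unit scattering factor on the eight sites of the conventional diamond-type cell; the helper names are hypothetical. It evaluates the kinematic structure factor and confirms that (002) is extinct while the two {111}-type reflections of the [110] zone, whose sum is (002), are allowed.

```python
import cmath

# Fractional coordinates of the 8 atoms in the conventional cubic cell of a
# diamond-type structure: an fcc lattice plus a copy shifted by (1/4, 1/4, 1/4).
FCC = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
ATOMS = FCC + [(x + 0.25, y + 0.25, z + 0.25) for (x, y, z) in FCC]


def structure_factor(h, k, l):
    """Kinematic structure factor for identical atoms with unit scattering factor."""
    return sum(cmath.exp(2j * cmath.pi * (h * x + k * y + l * z)) for x, y, z in ATOMS)


def add_reflections(g1, g2):
    """Sum of two reciprocal-lattice vectors given as (h, k, l) tuples."""
    return tuple(a + b for a, b in zip(g1, g2))


# Reflections of the [110] zone (h + k = 0) relevant to the discussion.
for hkl in [(0, 0, 2), (0, 0, 4), (-1, 1, 1), (1, -1, 1)]:
    print(hkl, f"|F| = {abs(structure_factor(*hkl)):.4f}")

print("double-diffraction route:", add_reflections((-1, 1, 1), (1, -1, 1)))
```

The vanishing structure factor expresses only the kinematic extinction; the dynamic intensity discussed in this section arises from the wave function, not from this sum.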
APPENDIX A. THE MEAN ATOM COLUMN POTENTIAL

For all calculations presented in this work a parameterized form was used for the mean atom column potential, which is deduced from the Doyle and Turner parameterization of the electron scattering factors given by

$$f_e(g_x,g_y,g_z) = \sum_i a_i\,e^{-b_i g^2}, \tag{A1}$$

with $a_i$ and $b_i$ the Doyle and Turner (1968) parameters and $g^2 = g_x^2+g_y^2+g_z^2$. The atom potential $V_j(x,y,z)$ and the atom electron scattering factor $f_e^j(g_x,g_y,g_z)$ are related as

$$eV_j(x,y,z) = \frac{2\pi\hbar^2}{m_0}\,\mathcal{F}^{-1}\!\left[f_e^j\!\left(\frac{g_x}{2},\frac{g_y}{2},\frac{g_z}{2}\right)\right], \tag{A2}$$

with $\mathcal{F}^{-1}$ the inverse Fourier transform and e the absolute charge of an electron. The mean atom column potential eU(x, y) is related to $eV(x,y,z) = \sum_j eV_j(x,y,z)$, with eV(x, y, z) the atom column potential and $V_j(x,y,z)$ the potential of atom j, as

$$eU(x,y) = \frac{1}{T}\int_0^T eV(x,y,z)\,dz = \frac{1}{T}\int_0^T \sum_j eV_j(x,y,z)\,dz = \frac{1}{d}\sum_j\int_{-d/2}^{d/2} eV_j(x,y,z)\,dz = \frac{1}{d}\int_{-\infty}^{\infty} eV_j(x,y,z)\,dz, \tag{A3}$$

with T the thickness of the specimen foil and d the repeat distance in the atom column. Note that this derivation is for the particular case of identical atoms in the atom column. Nevertheless, it can be generalized to nonidentical atoms in the atom column in a straightforward way. Substitution of Eqs. (A1) and (A2) into Eq. (A3) gives

$$\begin{aligned}
eU(x,y) &= \frac{2\pi\hbar^2}{m_0 d}\sum_i \iint a_i\,e^{-\frac{b_i}{4}(g_x^2+g_y^2)}\,e^{2\pi i(g_x x+g_y y)}\,dg_x\,dg_y\\
&= \frac{2\pi\hbar^2}{m_0 d}\sum_i a_i\,e^{-\frac{4\pi^2(x^2+y^2)}{b_i}}\iint e^{-\left(\frac{\sqrt{b_i}}{2}g_x-\frac{2\pi i x}{\sqrt{b_i}}\right)^2}\,e^{-\left(\frac{\sqrt{b_i}}{2}g_y-\frac{2\pi i y}{\sqrt{b_i}}\right)^2}\,dg_x\,dg_y\\
&= \frac{8\pi^2\hbar^2}{m_0 d}\sum_i \frac{a_i}{b_i}\,e^{-\frac{4\pi^2(x^2+y^2)}{b_i}},
\end{aligned} \tag{A4}$$
or simplified

$$eU(x,y) = \sum_i \frac{A_i}{B_i}\,e^{-\frac{x^2+y^2}{B_i}}, \tag{A5}$$

with $A_i = \dfrac{2\hbar^2}{m_0 d}\,a_i$ and $B_i = \dfrac{b_i}{4\pi^2}$.

The electron scattering factors were calculated for atoms at a temperature of 0 K. In practice, the atoms will vibrate. The scattering planes in a crystal are no longer mathematical planes. Temperature will thus affect the electron scattering factors; it will dampen them. Debye was the first to analyze the effect of the thermally induced vibrations of the atoms on x-ray diffraction maxima. He found that the scattering factors were damped by a Gaussian-like function. The electron scattering factor is then

$$f_e(g_x,g_y,g_z) = \sum_i a_i\,e^{-(b_i+B)g^2}, \tag{A6}$$

and $B_i$ is thus equal to $B_i = \dfrac{b_i+B}{4\pi^2}$.

Figure 34 shows the Doyle and Turner parameterization of the mean atom column potential of an electron in an Au atom column, together with a parabolic parameterization and an $r^{-1}$ parameterization.
FIGURE 34. The mean atom column potential of an electron in an Au atom column represented by different parameterizations. The interatomic distance in the atom column is 0.40786 nm; the Debye–Waller factor was 0.006 nm².
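The Gaussian integral behind Eq. (A4) and the resulting sum of Gaussians in Eq. (A5) can be sketched numerically as follows. This is an illustration under stated assumptions only: the one-dimensional integral is evaluated with a simple trapezoidal rule, the constant $2\hbar^2/(m_0 d)$ is passed in as a free `prefactor`, and the function names and test parameters are hypothetical rather than actual Doyle and Turner values.

```python
import math


def fourier_gaussian_numeric(b, x, g_max=40.0, n=40000):
    """Trapezoidal estimate of the integral over g of exp(-b g^2 / 4) * cos(2 pi g x)."""
    h = 2.0 * g_max / n
    total = 0.0
    for i in range(n + 1):
        g = -g_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-b * g * g / 4.0) * math.cos(2.0 * math.pi * g * x)
    return total * h


def fourier_gaussian_closed(b, x):
    """Closed form of the same integral: sqrt(4 pi / b) * exp(-4 pi^2 x^2 / b)."""
    return math.sqrt(4.0 * math.pi / b) * math.exp(-4.0 * math.pi ** 2 * x * x / b)


def column_potential(x, y, a, b, debye_waller=0.0, prefactor=1.0):
    """Sum_i (A_i / B_i) exp(-(x^2 + y^2) / B_i), with A_i = prefactor * a_i
    and B_i = (b_i + debye_waller) / (4 pi^2); prefactor stands for 2 hbar^2 / (m0 d)."""
    total = 0.0
    for a_i, b_i in zip(a, b):
        big_b = (b_i + debye_waller) / (4.0 * math.pi ** 2)
        total += prefactor * a_i / big_b * math.exp(-(x * x + y * y) / big_b)
    return total


# Hypothetical parameters, chosen only to exercise the functions.
a_demo = [0.5, 1.0]
b_demo = [0.3, 2.0]
print(fourier_gaussian_numeric(2.0, 0.25), fourier_gaussian_closed(2.0, 0.25))
print(column_potential(0.0, 0.0, a_demo, b_demo),
      column_potential(0.05, 0.0, a_demo, b_demo, debye_waller=0.6))
```

Only the real (cosine) part of the Fourier kernel is integrated because the imaginary part is odd in g and integrates to zero. A nonzero `debye_waller` broadens every Gaussian and lowers the peak, which is the damping described above.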
APPENDIX B. THE TWO-DIMENSIONAL QUANTUM HARMONIC OSCILLATOR

In order to calculate the eigenfunctions of a 2D quantum harmonic oscillator, one has to solve the following Schrödinger equation:

$$H\psi_{nm}(r,\varphi) = E_{nm}\,\psi_{nm}(r,\varphi). \tag{B1}$$

The Hamiltonian is given by

$$H = T + V = -\frac{\hbar^2}{2m}\Delta + \frac{1}{2}m\omega^2 r^2. \tag{B2, B3}$$

The Laplacian operator may be written as

$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\varphi^2}. \tag{B4}$$

Equation (B1) is a second-order partial differential equation in the two variables r and φ. However, because the potential term of the Hamiltonian depends only on the radial distance r, the solution of Eq. (B1) can be factorized as

$$\psi_{nm}(r,\varphi) = u_{nm}(r)\,\Phi_m(\varphi), \tag{B5}$$

where $\Phi_m(\varphi)$ is the angular function and $u_{nm}(r)$ is the radial function, which are solutions of

$$\frac{1}{\Phi_m(\varphi)}\frac{\partial^2\Phi_m(\varphi)}{\partial\varphi^2} = -m^2 \tag{B6}$$

and

$$\left\{-\frac{\hbar^2}{2m}\left[\frac{\partial^2}{\partial r^2}+\frac{1}{r}\frac{\partial}{\partial r}-\frac{m^2}{r^2}\right]+\frac{1}{2}m\omega^2 r^2\right\}u_{nm}(r) = E_{nm}\,u_{nm}(r), \tag{B7}$$

respectively, where m on the right-hand side of Eq. (B6) and in the term $m^2/r^2$ is a separation constant. Equation (B6) is called the angular equation and Eq. (B7) the radial equation.

1. Solution of the Angular Equation

Equation (B6) is familiar from the elementary theory of ordinary differential equations. Two elementary independent solutions are $e^{im\varphi}$ and $e^{-im\varphi}$.
Therefore $\Phi_m(\varphi)$ can be written as

$$\Phi_m(\varphi) = \frac{1}{\sqrt{2\pi}}\,e^{im\varphi}. \tag{B8}$$

This set of functions is orthonormal if m is an integer (m = 0, ±1, ±2, ...).

2. Solution of the Radial Equation

The radial Eq. (B7) can be solved by transforming it into a standard differential equation. Therefore the new variables $b' = \sqrt{\hbar/(m\omega)}$, $x = (r/b')^2$, and $X_{nm}(x) = u_{nm}(r)$ are defined. After appropriately grouping the terms, Eq. (B7) becomes

$$\left[\frac{\partial^2}{\partial x^2}+\frac{1}{x}\frac{\partial}{\partial x}-\frac{1}{4}+\frac{E_{nm}}{2\hbar\omega}\frac{1}{x}-\frac{m^2}{4x^2}\right]X_{nm}(x) = 0. \tag{B9}$$

In the limit x → ∞, Eq. (B9) becomes

$$\left[\frac{\partial^2}{\partial x^2}-\frac{1}{4}\right]X_{nm}(x) = 0. \tag{B10}$$

A solution of this equation that is physically admissible (finite everywhere) is

$$\lim_{x\to\infty} X_{nm}(x) \propto e^{-x/2}. \tag{B11}$$

Equation (B9) can alternatively be written as

$$\left[\frac{\partial^2}{\partial x^2}-\frac{1}{4}+\frac{E_{nm}}{2\hbar\omega}\frac{1}{x}+\frac{1-m^2}{4x^2}\right]\sqrt{x}\,X_{nm}(x) = 0, \tag{B12}$$

making use of

$$\frac{\partial^2}{\partial x^2}X_{nm}(x) = x^{-k}\frac{\partial^2}{\partial x^2}\left[x^k X_{nm}(x)\right]-k(k-1)\,x^{-2}X_{nm}(x)-2k\,x^{-1}\frac{\partial}{\partial x}X_{nm}(x). \tag{B13}$$

In the limit x → 0, Eq. (B12) becomes, keeping only the second-order term in $x^{-1}$,

$$\left[\frac{\partial^2}{\partial x^2}+\frac{1-m^2}{4x^2}\right]\sqrt{x}\,X_{nm}(x) = 0. \tag{B14}$$

A solution of this equation that is regular at x = 0 is
$$\lim_{x\to 0} X_{nm}(x) \propto x^{|m|/2}. \tag{B15}$$
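The exponent in Eq. (B15) can be made explicit with a short worked step. The following is a sketch that assumes a power-law trial form $\sqrt{x}\,X_{nm}(x) = x^{s}$ in the small-x equation; the symbol s is introduced here only for illustration.

```latex
\begin{align*}
\left[\frac{\partial^2}{\partial x^2} + \frac{1-m^2}{4x^2}\right] x^{s}
  &= \left[s(s-1) + \frac{1-m^2}{4}\right] x^{s-2} = 0
  \quad\Longrightarrow\quad s = \frac{1 \pm |m|}{2},\\
\sqrt{x}\,X_{nm}(x) \propto x^{(1\pm|m|)/2}
  &\quad\Longrightarrow\quad X_{nm}(x) \propto x^{\pm|m|/2}.
\end{align*}
```

Only the upper sign stays finite at x = 0 for m ≠ 0. For m = 0 the two roots coincide, and the second independent solution behaves as ln x and is likewise rejected.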
A general solution of Eq. (B9) must then be of the form

$$X_{nm}(x) = x^{|m|/2}\,e^{-x/2}\,f_{nm}(x). \tag{B16}$$

Substitution of this solution in Eq. (B9), with $E_{nm} = (n+1)\hbar\omega$, gives

$$x\frac{\partial^2}{\partial x^2}f_{nm}(x)+\left(|m|+1-x\right)\frac{\partial}{\partial x}f_{nm}(x)+\frac{n-|m|}{2}\,f_{nm}(x) = 0. \tag{B17}$$

A solution $f_{nm}(x)$ is then given by

$$f_{nm}(x) = (-1)^{(n-|m|)/2}\,L^{|m|}_{(n-|m|)/2}(x), \tag{B18}$$

with $L^{|m|}_{(n-|m|)/2}(x)$ the associated Laguerre polynomials, which are defined for integral values of (n − |m|)/2 and m. (n − |m|)/2 is restricted to integers, because for nonintegers the associated Laguerre polynomials diverge [2]. n = 0, 1, ... is the principal quantum number; the condition that (n − |m|)/2 be an integer restricts the angular quantum number m to m = −n, −n + 2, ..., n − 2, n. The general solution of the radial equation can then be written as

$$u_{nm}(r) = N_{nm}\,(-1)^{(n-|m|)/2}\left(\frac{r}{b'}\right)^{|m|}L^{|m|}_{(n-|m|)/2}\!\left(\frac{r^2}{b'^2}\right)e^{-\frac{1}{2}\frac{r^2}{b'^2}}, \tag{B19}$$

with $N_{nm}$ the normalization coefficient, which is calculated from the orthogonality relation

$$\int_0^\infty x^{|m|}\,e^{-x}\,L^{|m|}_{(n-|m|)/2}(x)\,L^{|m|}_{(n'-|m|)/2}(x)\,dx = \frac{\left[(n+|m|)/2\right]!}{\left[(n-|m|)/2\right]!}\,\delta_{n,n'}, \tag{B20}$$

so that, with $r\,dr = (b'^2/2)\,dx$,

$$\int_0^\infty u_{n,m}(r)\,u_{n',m}(r)\,r\,dr = N_{nm}^2\,\frac{b'^2}{2}\,\frac{\left[(n+|m|)/2\right]!}{\left[(n-|m|)/2\right]!}\,\delta_{n,n'}, \tag{B21}$$

and, requiring the radial functions to be orthonormal,

$$N_{nm} = \frac{\sqrt{2}}{b'}\sqrt{\frac{\left[(n-|m|)/2\right]!}{\left[(n+|m|)/2\right]!}}. \tag{B22}$$
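The radial functions and their normalization can be checked numerically. The sketch below is a minimal illustration, not the authors' code: it assumes the standard three-term recurrence for the associated Laguerre polynomials, sets the length scale b' to 1 by default, and integrates with Simpson's rule on a finite interval; all function names are hypothetical.

```python
import math


def assoc_laguerre(k, alpha, x):
    """Associated Laguerre polynomial L_k^alpha(x) from the recurrence
    (j + 1) L_{j+1} = (2j + 1 + alpha - x) L_j - (j + alpha) L_{j-1}."""
    if k == 0:
        return 1.0
    prev, curr = 1.0, 1.0 + alpha - x
    for j in range(1, k):
        prev, curr = curr, ((2 * j + 1 + alpha - x) * curr - (j + alpha) * prev) / (j + 1)
    return curr


def radial_u(n, m, r, b=1.0):
    """Normalized radial function u_nm(r) of the 2D harmonic oscillator, length scale b."""
    am = abs(m)
    if am > n or (n - am) % 2 != 0:
        raise ValueError("(n - |m|)/2 must be a non-negative integer")
    k = (n - am) // 2
    norm = math.sqrt(2.0) / b * math.sqrt(math.factorial(k) / math.factorial(k + am))
    x = (r / b) ** 2
    return norm * (-1) ** k * (r / b) ** am * assoc_laguerre(k, am, x) * math.exp(-x / 2.0)


def overlap(n1, n2, m, b=1.0, r_max=12.0, steps=6000):
    """Simpson estimate of the integral of u_{n1,m}(r) u_{n2,m}(r) r dr on [0, r_max * b]."""
    h = r_max * b / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * radial_u(n1, m, r, b) * radial_u(n2, m, r, b) * r
    return total * h / 3.0


for n1, n2, m in [(0, 0, 0), (2, 2, 0), (0, 2, 0), (3, 3, 1), (3, 1, 1)]:
    print((n1, n2, m), f"{overlap(n1, n2, m):+.6f}")
```

Functions with the same m and different n should give an overlap close to zero, and equal n an overlap close to one, within the accuracy of the quadrature.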
3. The Generating Function

The generating function of $u_{nm}(r)$ is developed from the generating function of the associated Laguerre polynomials (Arfken and Weber, 1995), given by

$$\frac{\exp\{-xs/(1-s)\}}{(1-s)^{|m|+1}} = \sum_{k=0}^{\infty} L^{|m|}_k(x)\,s^k, \tag{B23}$$

or

$$\frac{\exp\{+xs/(1+s)\}}{(1+s)^{|m|+1}} = \sum_{k=0}^{\infty} L^{|m|}_k(x)\,(-1)^k s^k, \tag{B24}$$

with |s| < 1. When x is replaced with $r^2/b'^2$ and both sides are multiplied by $\left(\dfrac{r}{b'}\right)^{|m|}e^{-\frac{1}{2}\frac{r^2}{b'^2}}$,

$$g_m(r,s) = \frac{\left(\dfrac{r}{b'}\right)^{|m|}\exp\!\left\{-\dfrac{1}{2}\dfrac{r^2}{b'^2}\,\dfrac{1-s}{1+s}\right\}}{(1+s)^{|m|+1}} = \sum_{k=0}^{\infty}\frac{u_{(2k+|m|)m}(r)}{N_{(2k+|m|)m}}\,s^k, \tag{B25}$$

with |s| < 1, a generating function $g_m(r,s)$ of $u_{nm}(r)$ is derived.

ACKNOWLEDGMENTS

P. Geuens thanks IWT-Flanders for financial support of this research. Both authors thank J. R. Jinschek and C. Kisielowski of the National Center for Electron Microscopy in Berkeley, California, for the data presented in Figure 11 and the discussions about applying theory in practice.
REFERENCES

Anstis, G. R., Cai, D. Q., and Cockayne, D. J. H. (2002). Limitations on the s-state approach to the interpretation of sub-angstrom resolution electron microscope images and microanalysis. Ultramicroscopy 94, 309–327.
Arfken, G. B., and Weber, H. J. (1995). Mathematical Methods for Physicists. 4th ed. San Diego: Academic Press.
Arickx, F., Broeckhove, J., Van Leuven, P., Vasilevsky, V., and Filippov, G. (1994). Algebraic method for the quantum-theory of scattering. Am. J. Phys. 62, 362–370.
Berry, M. V., and Mount, K. E. (1972). Semiclassical approximations in wave mechanics. Rep. Prog. Phys. 35, 315.
Bethe, H. A. (1928). Theorie der Beugung von Elektronen an Kristallen. Annalen der Physik (Leipzig) 87, 55–129.
Buxton, B. F., Loveluck, J. E., and Steeds, J. W. (1978). Bloch waves and their corresponding atomic and molecular-orbitals in high-energy electron-diffraction. Philosophi. Magaz. A-Phys. Conde. Mat. Struct. Def Mech. Prop. 38, 259–278.
Chen, J. H. (1997). Accurate Elastic Electron Scattering in Transmission Electron Microscopy. PhD Thesis, University of Antwerp, Belgium.
Coene, W., Bender, H., Lovey, F. C., Van Dyck, D., and Amelinckx, S. (1985). On the influence of crystal orientation on the high-resolution image-contrast of polytypes. Phys. Status Solidi A-Applied Res. 87, 483–497.
Cowley, J. M., and Moodie, A. F. (1957). The scattering of electrons by atoms and crystals. I. A new theoretical approach. Acta Crystallogr. 10, 609–619.
Dorset, D. L. (1996a). Direct phasing in protein electron crystallography – phase extension and the prospects for ab initio determinations. Acta Crystallogr. A 52, 480–489.
Dorset, D. L. (1996b). Electron crystallography. Acta Crystallogr. B-Struct. Sci. 52, 753–769.
Doyle, P. A., and Turner, P. S. (1968). Relativistic Hartree-Fock X-ray and electron scattering factors. Acta Crystallogr. A 24, 390–397.
Fujimoto, F. (1978). Periodicity of crystal-structure images in electron-microscopy with crystal thickness. Physica Status Solidi A-Applied Res. 45, 99–106.
Fujiwara, K. (1961). Relativistic dynamical theory of electron diffraction. J. Phys. Soc. Jpn. 16, 2226–2238.
Geuens, P., Chen, J. H., den Dekker, A. J., and Van Dyck, D. (1999). An analytic expression in closed form for the electron exit wave. Acta Crystallogr. A 55(Suppl. Pil.OE.002).
Geuens, P., and Van Dyck, D. (2002). The S-state model: A work horse for HRTEM. Ultramicroscopy 93, 179–198.
Geuens, P., and Van Dyck, D. (2003). About forbidden and weak reflections. Micron 34, 167–171.
Gjønnes, J., and Moodie, A. F. (1965). Extinction conditions in the dynamical theory of electron diffraction. Acta Crystallogr. 19, 65–67.
Goodman, P., and Moodie, A. F. (1974). Numerical evaluation of N-beam wave-functions in electron-scattering by multi-slice method. Acta Crystallogr. A 30, 280–290.
Gradshteyn, I. S., and Ryzhik, I. M. (1965). Table of Integrals, Series, and Products. New York: Academic Press.
Harker, D., and Kasper, J. S. (1948). Phases of Fourier coefficients directly from crystal diffraction data. Acta Crystallogr. 1, 70–75.
Henderson, R. (1995). The potential and limitations of neutrons, electrons and X-rays for atomic resolution microscopy of unstained biological molecules. Quar. Rev. Biophys. 28, 171–193.
Hiller, J. R., Johnston, I. D., and Styer, D. F. (1995). Quantum Mechanics Simulations: The Consortium for Upper-level Physics Software. New York: Wiley.
Howie, A. (1966). Diffraction channelling of fast electrons and positrons in crystals. Philosophi. Magaz. 14, 223–237.
Howie, A., and Basinski, Z. S. (1968). Approximations of the dynamical theory of diffraction contrast. Philosophi. Magaz. 17, 1039–1063.
Howie, A., and Whelan, M. J. (1961). Diffraction contrast of electron microscope images of crystal lattice defects. II. The development of a dynamical theory. Proc. R. Soc. of London Ser. A Math. Phys. Sci. 263, 217–237.
Ibers, J. A. (1958). Atomic scattering amplitudes for electrons. Acta Crystallogr. 11, 178–183.
Jansen, J., Tang, D., Zandbergen, H. W., and Schenk, H. (1998). MSLS, a least-squares procedure for accurate crystal structure refinement from dynamical electron diffraction patterns. Acta Crystallogr. A 54, 91–101.
Kambe, K., Lehmpfuhl, G., and Fujimoto, F. (1974). Interpretation of electron channeling by dynamical theory of electron-diffraction. Zeitschrift Fur Naturforschung Section A-A J. Phys. Sci. A 29, 1034–1044.
Kilaas, K. (Ed.) (1987). Interactive simulation of high-resolution electron micrographs, in Proceedings of the 45th Annual Meeting of the Microscopy Society of America. San Francisco: San Francisco Press.
Kirkland, E. J. (1998). Advanced Computing in Electron Microscopy. New York: Plenum Press.
Kisielowski, C., Hetherington, C. J. D., Wang, Y. C., Kilaas, R., O'Keefe, M. A., and Thust, A. (2001). Imaging columns of the light elements carbon, nitrogen and oxygen with sub Angstrom resolution. Ultramicroscopy 89, 243–263.
Kittel, C. (1996). Introduction to Solid State Physics, 7th ed. New York: Wiley.
Komaki, K., and Fujimoto, F. (1974). Quantized rosette motion of energetic electron around an atomic row in crystal. Phys. Lett. A 49, 445–446.
Koonin, S. E. (1986). Computational Physics. Menlo Park, CA: Benjamin/Cummings.
Lewis, A. L., Villagrana, R. E., and Metherell, A. J. F. (1978). Description of electron diffraction from higher-order Laue zones. Acta Crystallogr. A 34, 138–139.
Lindhard, J. (1965). Influence of crystal lattice on motion of energetic charged particles. Matematisk-Fysiske Meddelelser Danske Videnskab Selskab 34, 1–64.
Morrison, M. A., Estle, T. L., and Lane, N. F. (1976). Quantum States of Atoms, Molecules and Solids. NJ: Prentice-Hall.
Nellist, P. D., and Pennycook, S. J. (1999). Incoherent imaging using dynamically scattered coherent electrons. Ultramicroscopy 78, 111–124.
Op de Beeck, M. (1994). Direct Structure Reconstruction in High Resolution Transmission Electron Microscopy. PhD Thesis, University of Antwerp, Belgium.
Op de Beeck, M., and Van Dyck, D. (1995). An analytical approach for the fast calculation of dynamical scattering in HRTEM. Phys. Status Solidi A-Applied Res. 150, 587–602.
Op de Beeck, M., and Van Dyck, D. (1996). Direct structure reconstruction in HRTEM. Ultramicroscopy 64, 153–165.
Pennycook, S. J., and Jesson, D. E. (1991). High-resolution Z-contrast imaging of crystals. Ultramicroscopy 37, 14–38.
Rol, P. K., Fluit, J. M., Viehböck, F. P., and De Jong, M. (1960). In The Fourth International Conference on Ionization Phenomena in Gasses, edited by N. R. Nilsson. Amsterdam: North-Holland, p. 257.
Sakurai, J. J. (1994). Modern Quantum Mechanics, revised edition. Reading, MA: Addison-Wesley.
Sinkler, W., Bengu, E., and Marks, L. D. (1998). Application of direct methods to dynamical electron diffraction data for solving bulk crystal structures. Acta Crystallogr. A 54, 591–605.
Sinkler, W., and Marks, L. D. (1999a). A simple channelling model for HREM contrast transfer under dynamical conditions. J. Microsc. Oxford 194, 112–123.
Sinkler, W., and Marks, L. D. (1999b). Application of direct methods for crystal structure determination using strongly dynamical bulk electron diffraction. Ultramicroscopy 75, 251–268.
Spence, J. C. H. (1992). Electron channelling and its uses, in Electron Diffraction Techniques, Vol. 1 of IUCR Monographs on Crystallography, edited by J. M. Cowley. New York: Oxford University Press, pp. 464–532.
Stadelmann, P. A. (1987). EMS - a software package for electron-diffraction analysis and HREM image simulation in materials science. Ultramicroscopy 21, 131–145.
Stark, J., and Wendt, G. (1912). Annalen der Physik 38, 921.
Tamura, A., and Kawamura, T. (1976). Quantum-theory of rosette-motion channeling. Phys. Status Solidi B-Basic Res. 73, 391–400.
Tamura, A., and Ohtsuki, Y. H. (1974). Quantum-mechanical study of rosette-motion channeling. Phys. Status Solidi B-Basic Res. 62, 477–480.
Van Aert, S. (1999). Quantitative High-Resolution Electron Microscopy: Channelling Theory and Its Application on GaN. Master's thesis, University of Antwerp, Belgium.
Van Dyck, D. (1975). Path integral formalism as a new description for diffraction of high-energy electrons in crystals. Phys. Status Solidi B-Basic Res. 72, 321–336.
Van Dyck, D. (1976). Importance of backscattering in high-energy electron-diffraction calculations. Phys. Status Solidi B-Basic Res. 77, 301–308.
Van Dyck, D. (1979). Improved methods for the high speed calculation of electron microscopic structure images. Phys. Status Solidi A 52, 283–292.
Van Dyck, D., Bokel, R. M. J., and Zandbergen, H. W. (1998). Does crystal tilt enhance the electron interaction? Microsc. Microanal. 4, 428–434.
Van Dyck, D., and Chen, J. H. (1999). Towards an exit wave in closed analytical form. Acta Crystallogr. A 55, 212–215.
Van Dyck, D., and Chen, J. H. (1999). A simple theory for dynamical electron diffraction in crystals. Solid State Commun. 109, 501–505.
Van Dyck, D., and Coene, W. (1984). The real space method for dynamical electron-diffraction calculations in high-resolution electron-microscopy. I. Principles of the method. Ultramicroscopy 15, 29–40.
Van Dyck, D., Danckaert, J., Coene, W., Selderslaghs, E., Broddin, D., Van Landuyt, J., and Amelinckx, S. (1989). The atom column approximation in dynamical electron-diffraction calculations, in Computer Simulations of Electron Microscope Diffraction and Images. Warrendale, PA: The Minerals, Metals and Materials Society, pp. 107–134.
Van Dyck, D., and Op de Beeck, M. (1996). A simple intuitive theory for electron diffraction. Ultramicroscopy 64, 99–107.
Zandbergen, H. W., Andersen, S. J., and Jansen, J. (1997). Structure determination of Mg5Si6 particles in Al by dynamic electron diffraction studies. Science 277, 1221–1225.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 136
Measurement of Electric Fields on Object Surface in an Emission Electron Microscope

S. A. NEPIJKO,*,† N. N. SEDOV,‡ AND G. SCHÖNHENSE*

*Institute of Physics, University Mainz, Staudingerweg 7, 55099 Mainz, Germany
†Institute of Physics, National Academy of Sciences of Ukraine, Pr. Nauki 46, 03028 Kiev, Ukraine
‡The Moscow Military Institute, Golovachev Str., 109380 Moscow, Russia
I. Introduction . . . . . . . . . . . . . . . . . . . . . . . II. Direct and Inverse Problems of Measurement of Electric Fields (Potential) on the Object Surface in Emission Electron Microscope . . . . . . . . A . Measurement of Local Electric Fields in an Emission Electron Microscope Without Restriction of the Electron Beam. . . . . . . . . . . . 1. Model of an Optical System of an Emission Electron Microscope Without Restriction of the Electron Beam. . . . . . . . . . . 2. The Relation Between the Local Field Strength and the Electron Trajectory Deflection . . . . . . . . . . . . . . . . . . 3. The Relation Between the Local Shift at the Image and the Screen Brightness . . . . . . . . . . . . . . . . . . 4. Solution of the Inverse Problem of the Local Field Determination from the Image. . . . . . . . . . . . . . . 5. Simplified Analytical Expressions for the Case of One-Dimensional Fields . . . . . . . . . . . . B . Measurement of Local Electric Fields in an Emission Electron Microscope with Partial Electron Beam Restriction . . . . . . . . 1. Image Contrast Formation in the Case of Beam Restriction . . . . 2. The Relation Between the Electron Beam Shift and the Image Brightness . . . . . . . . . . . . . . . . . . . 3. The Relation Between the Local Field Strength and Electron Beam Shift . . . . . . . . . . . . . . . . . . 4. Solution of the Inverse Problem . . . . . . . . . . . . . . C . Comparison of Investigation Techniques in an Emission Electron Microscope With and Without Beam Restriction . . . . . . . . . III. Model Experiments on Mapping of Electric Fields (Potential) on the Object Surface Using an Emission Electron Microscope . . . . . . A . Visualization of a Potential Step . . . . . . . . . . . . . . . B . Computer Simulation of the Image Contrast . . . . . . . . . . . C . Illustrative Measurements with a Semiconductor p-n Junction . . . . . IV. 
The EVect of the Local Fields and Microroughness at the Object on the Imaging and Resolving Power of an Emission Electron Microscope . . . . A . Distortion of the Image Details Under the EVect of the Local Field on the Object Surface . . . . . . . . . . . . . . . . . 1. Image Distortion of One-Dimensional Structures . . . . . . . .
ISSN 1076-5670/05 DOI: 10.1016/S1076-5670(04)36003-9
227
. .
228
. .
230
. .
230
. .
230
. .
232
. .
235
. .
236
.
238
. . . .
239 240
. .
242
. . . .
247 249
. .
251
. . . . . . .
252 254 256 258
. .
262
. . . .
262 262
Copyright 2005, Elsevier Inc. All rights reserved.
228
NEPIJKO ET AL.
2. Image Distortion of Two-Dimensional Structures
B. Effect of the Object Microgeometry (Relief) on the Image
C. Deterioration of the Resolving Power of an Emission Electron Microscope in the Presence of Local Fields and Object Roughness
D. Numerical Simulation of Electron Trajectories for the Case of Equipotential Object With Roughness
V. Practical Applications of Microfield Measurement Using an Emission Electron Microscope
A. Measurement of Electric Fields (Potential) on an Object Under a Variable-Voltage–Applied. Microelectronics Application
B. Measurement of Electric Fields (Potential) Distribution at Different Emission Current Density From Various Object Areas
1. Method of Determination of Trajectory Shifts at Different Emission Current Density From the Object
2. Measurement of Potentials for Catalytic Chemistry Applications
3. Improvement of Accuracy in Electric Potential Measurement by Application of Statistical Methods
C. Other Possibilities for Electric Potential Imaging with an Emission Electron Microscope
VI. Measurement of Object Surface Geometry (Relief) with an Emission Electron Microscope
A. Imaging of Small Three-Dimensional Particles on the Object Surface
1. Imaging of Spherical Particle Under Sharp Focusing Without Contact Potential Difference
2. Imaging of Spherical Particle Under Sharp Focusing with Additional Contact Potential Difference
3. Measurements of Spherical Particle Parameters in the Case of Sharp Focusing
4. Imaging of Spherical Particles in the Case of Defocusing
5. Measurement of Spherical Particle Diameter Without Contact Potential Difference in the Case of Defocusing
6. Measurement of Spherical Particle Parameters with Contact Potential Difference in the Case of Defocusing
7. Effect of Spherical Particle Shadow on the Image Pattern
B. Characterization of Three-Dimensional Objects in Particular with Sizes Comparable to or Smaller than the Lateral Resolution of the Emission Electron Microscope. Practical Recommendations
VII. Conclusions
References
I. INTRODUCTION

Image contrast in an emission electron microscope arises not only from differences in the emission density between regions of the object under study and from the geometrical relief on it, but also from local electric and magnetic fields on the object surface (Dyukov et al., 1991; Sedov et al., 1962; Spivak et al., 1957). The slowly moving
MEASUREMENT OF ELECTRIC FIELDS ON OBJECT SURFACE
electrons emitted from the object surface are acted on by these local fields near the surface. Their trajectories are distorted, which redistributes the electron current density on the screen. As a result, the local electric and magnetic fields become visible as locally bright and dark regions on the microscope screen. These phenomena are not routinely considered in investigations using emission electron microscopy (EEM). In a number of cases, however, the influence of these fields on image formation can be rather strong or even dominant. The mechanism of this influence depends on whether or not electrodes of the optical system restrict the electron beam on its way from the object to the screen. These two cases are therefore considered separately.

The influence of the local fields on image formation in the emission electron microscope should be taken into account for several reasons. (1) Bright or dark spots caused by these fields can be confused with other effects, for example, with a difference in the electron emission density. (2) The deformation of the electron trajectories by the local fields distorts the geometrical sizes of details in the image relative to their undistorted sizes. (3) The microscope's local resolving power can change under the action of the local fields. (4) Finally, the image contrast generated by these fields can be used to measure their strength and the potential distribution function on the object surface. The theory of image formation under the action of the local fields enables these quantities to be calculated.
This method of image evaluation can be applied to measure the fields of electronic microcircuits or contact potential differences, to estimate quantitatively processes on the object surface that are accompanied by a change of these fields, to measure fields that change rapidly in time, and so on. It makes it possible to measure fields of sizes so small that such measurements cannot be performed by other methods.

In fact, the image of an object investigated in an emission electron microscope is almost always deformed by the microgeometry or by local fields. Even if a flat object is observed and there are no salient fields on its surface, the image contrast is, as a rule, caused by different microregions having different work functions. Such differences produce corresponding contact potential differences that give rise to local microfields. The deforming action of the microfields becomes stronger as the microregions become smaller.

In the present paper a quantitative theory is given that allows the influence of the local electric fields on the EEM image to be calculated. The so-called inverse problem (i.e., the calculation of these fields from the image) is also solved, and the distortion of image details caused by the local electric fields
NEPIJKO ET AL.
is calculated. Examples of the application of this theory to the investigation of objects of different nature are also presented. The theory allows (1) calculation of the distribution of the local fields and, as a consequence, (2) reconstruction of the image by recovering the real sizes and shapes of microregions, and (3) measurement of the profile of a microrelief, because the surface microrelief and the local microfields are related to one another. The theory also enables the local deterioration of the microscope resolving power caused by these factors to be estimated.

II. DIRECT AND INVERSE PROBLEMS OF MEASUREMENT OF ELECTRIC FIELDS (POTENTIAL) ON THE OBJECT SURFACE IN EMISSION ELECTRON MICROSCOPE

A. Measurement of Local Electric Fields in an Emission Electron Microscope Without Restriction of the Electron Beam

1. Model of an Optical System of an Emission Electron Microscope Without Restriction of the Electron Beam

A model of the electron optics of an emission electron microscope (Figure 1) was chosen for the following calculations. It consists of a flat object, a uniform field accelerating electrons above its surface, the anode diaphragm
FIGURE 1. Schematic view of an emission electron microscope illustrating the principle of the image contrast formation due to the effect of the local fields on the object surface. K is the object surface, K′ is the virtual cathode plane, L denotes the electron lens, and E is the microscope screen. The dashed line represents the trajectory of an electron leaving the object with a given initial velocity and moving without deflection by local fields, as well as the continuation of the tangent to the trajectory toward the virtual cathode plane (a). The solid line represents the trajectory of an electron deflected by the local fields above the object surface, as well as the continuation of the tangent to this trajectory (b). The electron shift s at the microscope screen corresponds to the shift S in the virtual cathode plane. Other notation is explained in the text.
with hole, the electron-optical lenses focusing the electron beam, and the screen on which the image is observed.

The electron emission from the object surface can be of various kinds; for example, it can be thermionic emission from heated objects. However, emission mechanisms that do not require heating of the object are of particular interest for the investigation of objects of different kinds because they make it possible to study an object under real conditions. A photoemission electron microscope has the advantage that it can operate in high vacuum. Secondary emission under the action of a primary beam of electrons or ions can also be used. Emission caused by an ion beam is preferable because the electrons emitted from the object surface have low initial velocities, which results in a good lateral resolution of the object details.

The real construction of the microscope cathode lens can differ from this model. For example, electrodes are frequently made in the form of a cone so that the primary beam is not obscured. A focusing electrode can be located in front of the anode. However, the model described above is suitable for these cases, too. The point is that the strong field accelerating the electrons should act on the object region being imaged. The length of the uniform field region proves to be inessential. The only condition is that the length of the accelerating field should be several times larger than the sizes of the local fields on the object surface. This condition is usually met in any construction of emission microscopes.

The accelerating voltage applied to the anode of the emission electron microscope will be designated V0. Suppose the studied site of the object is exposed to an external electric field of strength E0. We introduce a cathode lens parameter l satisfying the equality

\[ V_0 = E_0 l. \tag{1} \]
If there is a flat anode in front of the object, the parameter l is equal to the distance between the object and the anode. Let us select a rectangular coordinate system such that its (x, y) plane coincides with the object surface, while its z-axis is directed along the optical axis of the cathode lens.

If the electrons on their way from the object to the microscope screen meet no obstacles capable of partially restricting the beam, the local fields on the object surface are imaged because of the shift of the electron trajectories caused by the local electric fields above the object surface, as shown in Figure 1. Electrons emitted from the object surface and not deflected by local electric fields move along parabolic trajectories in the uniform accelerating electric field E0 (see Figure 1, dashed curve [a]). Lens L focuses these electrons into the corresponding image point on the microscope screen. If we continue the tangent to the trajectory drawn at the point of electron
arrival at the anode plane before the lens, it crosses the axis behind the object plane at a distance approximately equal to l. This is the virtual cathode plane K′, which is optically conjugate with the screen plane E.

If there are local electric fields on the object surface, an electron trajectory near the object will be distorted (see Figure 1, solid curve [b]). Any point of the object will be projected on the screen with a shift s by the microscope lens L. If we continue the tangent to this trajectory drawn at some point near the anode up to its intersection with the virtual cathode plane K′, we will see the corresponding shift S of the image point, with S = s/M, where M stands for the magnification of the microscope. Because the shift S is different for different points of the object, a redistribution of the electron flux density will be seen on the screen as a local variation of the brightness of the corresponding image spots, even if the density of electron emission from the object is constant. In theory, the variation of the current density at the screen (image brightness) can be related to the field strength on the object surface. The problem is subdivided into two parts: (1) to find the quantitative relationship between the field strength at the object and the image shift S; (2) to find the relationship between the shift S and the local current density at the screen j(x, y).

2. The Relation Between the Local Field Strength and the Electron Trajectory Deflection

Local fields on the object surface being investigated will be characterized by a potential distribution function φ(x, y). The potential distribution in the space above the object surface is then determined as the solution of Dirichlet's problem for the half-space (Courant and Hilbert, 1989):

\[ V(x, y, z) = \frac{z}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{\varphi(\xi, \eta)\, d\xi\, d\eta}{\left[(x - \xi)^2 + (y - \eta)^2 + z^2\right]^{3/2}} + E_0 z. \tag{2} \]
In practice, a much simpler case is often encountered, in which the potential on the object surface depends, within certain limits, only on the x-coordinate. In such a case, Eq. (2) simplifies to

\[ V(x, z) = \frac{z}{\pi} \int_{-\infty}^{\infty} \frac{\varphi(\xi)\, d\xi}{(x - \xi)^2 + z^2} + E_0 z. \tag{3} \]
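As an illustration of Eq. (3), the following minimal Python sketch evaluates the integral numerically for a hypothetical test object: a strip of half-width `a` held at the potential `phi0` on an otherwise equipotential surface. The result is compared with the closed form of the same integral for this strip. The function names and all numerical values are assumptions introduced for this example only.

```python
import numpy as np

# Illustrative values (assumptions): lengths in micrometres, potentials in volts.
a, phi0, E0 = 1.0, 1.0, 10.0

# Midpoint grid on the object surface; the strip edges +-a fall on cell borders.
h = 0.001
xi = np.arange(-5.0, 5.0, h) + 0.5 * h
phi = np.where(np.abs(xi) < a, phi0, 0.0)   # potential distribution phi(xi)


def local_potential(x, z):
    """Integral term of Eq. (3): (z/pi) * int phi(xi) / ((x - xi)^2 + z^2) dxi."""
    kernel = z / (np.pi * ((x - xi) ** 2 + z ** 2))
    return float(np.sum(kernel * phi) * h)


def total_potential(x, z):
    """Full Eq. (3): local term plus the uniform accelerating field E0 * z."""
    return local_potential(x, z) + E0 * z


def strip_exact(x, z):
    """Closed form of Eq. (3) for the strip, used only as a check."""
    local = phi0 / np.pi * (np.arctan((x + a) / z) - np.arctan((x - a) / z))
    return float(local + E0 * z)
```

Close to the surface the local term approaches the surface potential itself, while at heights comparable to the strip width it is already noticeably smeared out, which is the behavior described in the text for Eqs. (2) and (3).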
Equations (2) and (3) can be differentiated with respect to x and y, giving similar integral expressions for the field strength. Physically, these equations mean that the effect of the local fields decreases with height z, whereas the spatial spread of the beam increases. For this reason, the deflection of the
electron trajectory is affected not only by the local fields acting at the point of electron emission on the object surface, but also by the fields at adjacent areas. The latter effect weakens as the distance to these areas increases. This makes the relation between the field distribution function and the deflection of the electron trajectories more complicated.

Let us derive formulae that relate the potential distribution function on the object surface φ(x, y) to the electron shift S(x, y). Consider an electron emitted from the object surface at the moment when it is at a distance z from this surface. As seen in Figure 1, the tangent to the trajectory, continued to the plane that is mirror-symmetric to the plane z with respect to the object plane, will be shifted along the x-axis by

\[ S_x(z) = \Delta x - 2z \tan\alpha = \Delta x - 2z\,\frac{\dot{x}}{\dot{z}}, \tag{4} \]
where Δx is the shift of the electron in the x direction at the plane z, and α is the angle between the tangent to the trajectory and the z-axis. Let us rearrange this equation, taking into account that under ordinary conditions the external accelerating field strength E0 is much higher than the strength of the local fields:

\[ |\operatorname{grad} \varphi| \ll E_0. \tag{5} \]
In this case, as a first approximation, we can suppose that the equations of electron motion

\[ \ddot{x} = \frac{e}{m}\frac{\partial V}{\partial x}, \qquad \ddot{y} = \frac{e}{m}\frac{\partial V}{\partial y}, \qquad \ddot{z} = \frac{e}{m}\frac{\partial V}{\partial z} \tag{6} \]

produce the equations

\[ \dot{z} = \frac{e}{m} E_0 t, \qquad z = \frac{e}{2m} E_0 t^2. \tag{7} \]

Whence it follows that

\[ \dot{z} = \frac{dz}{dt} = \sqrt{2\,\frac{e}{m}\, E_0 z}. \tag{8} \]

Here, e and m are the charge and mass of the electron, and t is the time elapsed since the electron left the surface. Let us take the time t as the argument in Eq. (4):

\[ S_x(t) = \Delta x - 2z(t)\,\frac{\dot{x}}{\dot{z}(t)} = \Delta x - t\dot{x}. \tag{9} \]

Differentiation of this equation with respect to time gives

\[ \dot{S}_x(t) = \dot{x} - \dot{x} - t\ddot{x} = -t\ddot{x}. \tag{10} \]
Let us calculate dS_x/dz, considering z = z(t):

\[ \frac{dS_x}{dz} = \frac{dS_x}{dt}\,\frac{dt}{dz} = \frac{\dot{S}_x}{\dot{z}} = \frac{\dot{S}_x}{\sqrt{2\,\frac{e}{m}\,E_0 z}} = \frac{m\dot{S}_x}{eE_0 t}. \tag{11} \]
Substituting Eq. (10) here, we obtain

\[ \frac{dS_x}{dz} = -\frac{m}{eE_0}\,\ddot{x}. \tag{12} \]

Upon integrating over z from zero to l, we obtain an expression for the electron shift displayed on the microscope screen (Dyukov et al., 1991; Sedov, 1970):

\[ S_x(x, y) = -\frac{m}{eE_0} \int_0^l \ddot{x}(z)\, dz. \tag{13} \]
Taking Eq. (6) into account, this relation assumes the following form:

\[ S_x(x, y) = -\frac{1}{E_0}\int_0^l \frac{\partial V}{\partial x}\, dz. \tag{14} \]
In this equation, the upper integration limit can be replaced by infinity because in practice the extent of the local fields is much smaller than the distance l. Upon substituting Eq. (2) into Eq. (14), reversing the order of integration, and performing some transformations, one obtains the following equation for the electron shift seen on the screen:

\[ S_x(x, y) = -\frac{1}{2\pi E_0} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{\partial \varphi(x - \xi, y - \eta)}{\partial x}\, \frac{d\xi\, d\eta}{\sqrt{\xi^2 + \eta^2}}. \tag{15} \]
A similar equation can also be written for the y-component of the shift, S_y(x, y). When the local fields are a function of the x-coordinate only, we obtain in a similar manner

\[ S(x) = -\frac{1}{2\pi E_0}\int_{-\infty}^{\infty} \varphi'(x - \xi)\, \ln\!\left(1 + \frac{l^2}{\xi^2}\right) d\xi. \tag{16} \]
Equations (15) and (16) provide the solution of the first part of the problem of local electric field imaging on the screen of an emission electron microscope.
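The one-dimensional direct problem can be sketched numerically as follows. The example takes Eq. (16) in the form S(x) = −(1/2πE0) ∫ φ′(x − ξ) ln(1 + l²/ξ²) dξ and evaluates it for an assumed Lorentzian potential φ(x) = φ0 a²/(x² + a²). This potential, the function names, and the numerical values are illustrative assumptions, not data from the text. For this particular φ the integral has a closed form, which serves as a check.

```python
import numpy as np

# Illustrative values (assumptions): micrometres and volts.
a, phi0 = 1.0, 1.0        # width and height of the Lorentzian potential feature
E0, l = 10.0, 2000.0      # accelerating field and cathode lens parameter


def dphi(u):
    """Derivative of the assumed potential phi(u) = phi0 * a^2 / (u^2 + a^2)."""
    return -2.0 * phi0 * a**2 * u / (u**2 + a**2) ** 2


def shift_eq16(x, half_range=200.0, h=0.01):
    """S(x) = -(1/(2 pi E0)) * int dphi(x - xi) * ln(1 + l^2/xi^2) dxi."""
    # Midpoint grid: xi = 0, where the kernel has an integrable logarithmic
    # singularity, is never sampled.
    xi = np.arange(-half_range, half_range, h) + 0.5 * h
    kernel = np.log1p((l / xi) ** 2)
    return float(-np.sum(dphi(x - xi) * kernel) * h / (2.0 * np.pi * E0))


def shift_exact(x):
    """Closed form of the same integral for the Lorentzian potential (check only)."""
    return phi0 / E0 * (a * x / (x**2 + a**2) - a * x / (x**2 + (a + l) ** 2))
```

For l much larger than a, the second term of the closed form is negligible, so the shift hardly depends on l; it is an odd function of x that vanishes at the center of the feature.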
3. The Relation Between the Local Shift at the Image and the Screen Brightness

Let us assume that the function j0(x, y) describes the unperturbed current density distribution at the microscope screen in the absence of local fields. If, under the action of local fields, parts of the image are shifted by the components Sx(x, y) and Sy(x, y), the points of the object with the coordinates (x, y) will be displayed at points with the new coordinates

\[ x' = x + S_x(x, y) \tag{17} \]

and

\[ y' = y + S_y(x, y). \tag{18} \]

Consequently, a redistribution of the current density will take place. The new distribution can be found from the law of current conservation in electron-optical systems without restriction of the electron beam (by diaphragms, etc.):

\[ j_0(x, y)\, dx\, dy = j(x', y')\, dx'\, dy'. \tag{19} \]
The elementary area dx′dy′ in the new coordinates is equal to

\[ dx'\,dy' = \frac{D(x', y')}{D(x, y)}\,dx\,dy = \left(\frac{\partial x'}{\partial x}\frac{\partial y'}{\partial y} - \frac{\partial y'}{\partial x}\frac{\partial x'}{\partial y}\right) dx\,dy = \left[1 + \frac{\partial S_x}{\partial x} + \frac{\partial S_y}{\partial y} + \frac{D(S_x, S_y)}{D(x, y)}\right] dx\,dy. \tag{20} \]

This means that the new distribution function of the electron current density at the screen is described by the equation (Dyukov et al., 1991; Sedov, 1970)

\[ j(x', y') = \frac{j_0(x, y)}{1 + \dfrac{\partial S_x}{\partial x} + \dfrac{\partial S_y}{\partial y} + \dfrac{D(S_x, S_y)}{D(x, y)}}. \tag{21} \]

If all the functions depend only on the x-coordinate, this relation is simplified:

\[ j(x') = \frac{j_0(x)}{1 + \dfrac{dS}{dx}}. \tag{22} \]
The complete solution of the direct problem for the current density distribution on the screen (i.e., the local brightness of the image), due to the effect of
local fields on the object, consists of calculating the shift S by Eq. (15) or (16) and then the current density itself by Eq. (21) or (22).

4. Solution of the Inverse Problem of the Local Field Determination from the Image

The inverse problem of calculating the electric field strength on the object surface from the image is also solved in two stages: (1) calculation of the shifts Sx and Sy from the experimentally measured function j(x′, y′) or j(x′) using Eq. (21) or (22); (2) calculation of the local field strength on the object surface, or of the electric potential distribution function φ(x, y) or φ(x) itself, from the calculated shift functions Sx and Sy.

We first consider the solution of the first part of the problem for the simple case when all functions depend only on the x-coordinate. Equation (22) can be written in the following form:

\[ j(x')\,dx' = j_0(x)\,dx. \tag{23} \]
Upon integration of this equation, we obtain

\[ J'(x') = \int_{x'_0}^{x+S} j(x')\,dx' = \int_{x_0}^{x} j_0(x)\,dx = J(x). \tag{24} \]
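The matching of the two integral curves in Eq. (24) can be sketched in a few lines of Python. The synthetic profiles, the function names, and the assumption that both grids start far from the local fields (so that the two lower integration limits coincide) are introduced here for illustration only.

```python
import numpy as np


def cumulative(y, t):
    """Running trapezoidal integral of y(t), equal to zero at t[0]."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))


def shift_from_profiles(x, j0, x_img, j):
    """Solve J'(x + S) = J(x), Eq. (24), for S(x).

    x, j0    : object-side grid and current density without local fields
    x_img, j : screen-side grid and current density with local fields
    Both grids must start far from the local fields, where S is close to zero.
    """
    J = cumulative(j0, x)
    J_img = cumulative(j, x_img)
    # J' increases monotonically (j > 0), so it can be inverted by interpolation.
    x_matched = np.interp(J, J_img, x_img)
    return x_matched - x


# Synthetic test data (assumption): a smooth, localized shift function.
x = np.linspace(-10.0, 10.0, 2001)
S_true = 0.05 * x * np.exp(-0.5 * x**2)
dS_true = 0.05 * (1.0 - x**2) * np.exp(-0.5 * x**2)
j0 = np.ones_like(x)
# Forward model, Eq. (22), resampled onto the uniform screen grid x_img = x.
j_meas = np.interp(x, x + S_true, j0 / (1.0 + dS_true))

S_rec = shift_from_profiles(x, j0, x, j_meas)
```

The recovered array `S_rec` reproduces the shift used to generate the synthetic profile. With measured data, the accuracy is limited mainly by noise in j and by how well the condition S ≈ 0 holds at the starting point.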
Here, x0 and x′0 denote mutually corresponding initial coordinates. In practice, these points should be taken far away from the region of local fields, so that the shift S at this point is approximately zero. In this case we simply have x′0 = x0. It follows from Eq. (24) that the shift S can be calculated as the difference of the arguments of the two integral curves at equal values of the integrals. This procedure for the calculation of the shifts S is illustrated graphically in Figure 2.

The calculation of S(x) poses no special problems when it is possible to switch off the local fields in order to measure the function j0(x). This is the case, for example, in studies of p-n junctions in semiconductors or of integrated circuits, where the voltage applied to the object can be switched off. When this is not possible, S(x) can still be obtained from measurements of the current density distribution at the screen at two different accelerating voltages of the microscope. This method is described in the following paragraphs.

In the two-dimensional case, the problem of evaluating Sx(x, y) and Sy(x, y) can be solved in a similar manner. Upon integration of Eq. (19) we obtain:
FIGURE 2. Illustration of the method of calculating the coordinate shifts S(x) from the integral curves of the current density distribution J′(x′) and J(x) at the microscope screen. The shift S(x) is visible as the abscissa difference at a given ordinate value of the integral curves J′(x′) and J(x) (upper diagram). The lower diagram shows the corresponding shift function S(x).

\[ J(x, y) = \int_{x_0}^{x}\int_{y_0}^{y} j_0(x, y)\,dx\,dy = \int_{x'_0}^{x+S_x}\int_{y'_0}^{y+S_y} j(x', y')\,dx'\,dy' = J'(x', y'). \tag{25} \]
The equality of the two integrals J and J′ (conservation of the total electron current) defines a curve in the (x′, y′) plane, and the end of the shift vector S lies on this curve. To determine this vector uniquely for the point (x, y), it is necessary to repeat the integration procedure, changing the order of integration over the x and x′ axes. In doing so, we obtain a second, similar curve intersecting the first one. The point of intersection of these two curves defines the position of the end of the vector S. In this way, the components Sx = x′ − x and Sy = y′ − y of the vector S are uniquely defined for any point in the (x, y) plane.

To determine the integrals J and J′, the initial lines of integration x′0(x, y) and y′0(x, y) must be known. In other words, it is necessary to know the shift values along these curves. If the fields on the object are localized, this is simplified by choosing a sufficiently wide path of integration, such that the image shift along this path is practically absent.

The solution of the second part of the problem consists in solving the integral Eqs. (15) and (16) with respect to the function φ. This solution is performed with the help of the direct Fourier transform in the space of generalized functions (distributions). The solution of Eq. (15) has the following form:
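\[ \frac{\partial\varphi}{\partial x}(x, y) = \frac{E_0}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{S_x(\xi, \eta) - S_x(x, y) + (x - \xi)\,\dfrac{\partial S_x}{\partial x}(x, y) + (y - \eta)\,\dfrac{\partial S_x}{\partial y}(x, y)}{\left[(x-\xi)^2 + (y-\eta)^2\right]^{3/2}}\,d\xi\,d\eta. \tag{26} \]

A similar equation can be written for ∂φ/∂y (x, y). The solution of Eq. (16) has the following form:

\[ \varphi'(x) = \frac{E_0}{\pi}\int_{-\infty}^{\infty} \frac{S(\xi) - S(x) + (x-\xi)\,S'(x)}{(x-\xi)^2}\,d\xi. \tag{27} \]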
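The second stage of the inverse problem can be illustrated by evaluating Eq. (27), taken in the form φ′(x) = (E0/π) ∫ [S(ξ) − S(x) + (x − ξ)S′(x)]/(x − ξ)² dξ, for an assumed shift function. The Lorentzian shift, the function names, the finite integration window, and the numerical values below are assumptions made for this sketch. For this shift the integral has a closed form, which is used only as a check.

```python
import numpy as np

# Assumed shift function (illustration only): S(x) = s0 * a^2 / (x^2 + a^2).
a, s0, E0 = 1.0, 0.05, 10.0     # micrometres, micrometres, volts per micrometre


def shift(x):
    return s0 * a**2 / (x**2 + a**2)


def dphi_from_shift(x, S, R=200.0, h=0.005):
    """Evaluate Eq. (27) at the point x for a callable shift function S.

    With t = xi - x, the term (x - xi) S'(x) is odd in t and cancels over the
    symmetric window |t| < R, so only the even part of the integrand is summed.
    """
    t = (np.arange(int(round(R / h))) + 0.5) * h          # midpoints, t > 0
    core = np.sum((S(x + t) + S(x - t) - 2.0 * S(x)) / t**2) * h
    # Assumption: S is negligible for |t| > R, so only -S(x)/t^2 contributes there.
    tail = -2.0 * S(x) / R
    return float(E0 / np.pi * (core + tail))


def dphi_exact(x):
    """Closed form of Eq. (27) for the assumed S(x), used only as a check."""
    return -E0 * s0 * a * (a**2 - x**2) / (x**2 + a**2) ** 2
```

The symmetric pairing of the points x + t and x − t removes the singularity at ξ = x, so no special treatment of the principal value is needed. For measured data, `S` could be replaced by a smooth interpolant of the sampled shift.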
The electric potential distribution function φ(x, y) or φ(x) on the object surface can be determined by integrating Eqs. (26) and (27). Thus, the electric fields on the object surface are calculated according to the following procedure. First, the current density distribution function j(x) or j(x, y) in the image plane is measured. Then the electron trajectory shift function S is calculated using Eq. (24) or (25). Next, the distribution of the local fields on the object surface is calculated using Eq. (26) or (27). Finally, the electric potential distribution function φ(x, y) or φ(x) is calculated by numerical integration of the local field distributions.

5. Simplified Analytical Expressions for the Case of One-Dimensional Fields

Upon integration of Eq. (27) by parts we obtain

\[ \varphi'(x) = -\frac{E_0}{\pi}\int_{-\infty}^{\infty} \frac{S'(\xi)\,d\xi}{x - \xi}. \tag{28} \]
Its inverse transformation is

\[ E_0 S'(x) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\varphi'(\xi)\,d\xi}{x - \xi}. \tag{29} \]
At the same time, in the absence of electric charge (div E = 0), the electric field components E_{0z} and E_{0x} = φ′(x) on the surface z = 0 are related by the Hilbert transformation

\[ E_{0x}(x) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{E_{0z}(\xi)\,d\xi}{x - \xi} \tag{30} \]

and

\[ E_{0z}(x) = -\frac{1}{\pi}\int_{-\infty}^{\infty} \frac{E_{0x}(\xi)\,d\xi}{x - \xi} = -\frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\varphi'(\xi)\,d\xi}{x - \xi}. \tag{31} \]
From the comparison of Eqs. (29) and (31) we obtain the following expression for the vertical component of the field:

\[ E_{0z}(x) = -E_0 S'(x). \tag{32} \]

All the aforementioned expressions containing the Hilbert transformation are understood as Cauchy principal values of the integrals. Because the Hilbert transformation is valid for functions belonging to the class L^p with p > 1, it is necessary that the function E_{0x}(x) belong to this class. In the present case the function S′(x) will also belong to this class. When not only the derivative of φ(x) but also this function itself belongs to this class, expressions similar to Eqs. (28) and (29) can be written directly for the functions φ(x) and S(x), because their harmonic components are related in a similar manner. In this case

\[ \varphi(x) = -\frac{E_0}{\pi}\int_{-\infty}^{\infty} \frac{S(\xi)\,d\xi}{x - \xi} \tag{33} \]

and

\[ S(x) = \frac{1}{\pi E_0}\int_{-\infty}^{\infty} \frac{\varphi(\xi)\,d\xi}{x - \xi}. \tag{34} \]
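Because Eq. (34) is a Hilbert transformation, it can be evaluated efficiently with a fast Fourier transform. The sketch below takes Eq. (34) in the form S(x) = (1/πE0) P.V. ∫ φ(ξ) dξ/(x − ξ) and applies the Fourier multiplier −i sign(k). The Lorentzian test potential, the function name, and the grid parameters are assumptions chosen for illustration; the FFT treats the sampled interval as periodic, so the potential must be well localized inside it.

```python
import numpy as np


def hilbert_transform(f, dx):
    """(1/pi) P.V. int f(xi) / (x - xi) dxi for samples f on a uniform grid.

    Uses the Fourier multiplier -i*sign(k); the grid is treated as periodic,
    so f should be localized well inside the sampled interval.
    """
    k = np.fft.fftfreq(f.size, d=dx)
    return np.real(np.fft.ifft(-1j * np.sign(k) * np.fft.fft(f)))


# Illustrative values (assumptions): micrometres and volts.
a, phi0, E0 = 1.0, 1.0, 10.0
n, dx = 8192, 0.05
x = (np.arange(n) - n // 2) * dx
phi = phi0 * a**2 / (x**2 + a**2)            # assumed Lorentzian potential

S = hilbert_transform(phi, dx) / E0          # Eq. (34)
S_exact = phi0 / E0 * a * x / (x**2 + a**2)  # known Hilbert pair, check only
```

Applying the same function to the sampled shift and multiplying by −E0 gives the potential, as in Eq. (33), up to an additive constant that the transformation cannot recover.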
Equation (34) is simpler and can be used instead of Eq. (16), provided the magnitude of l is much larger than the field of view at the object, which is always true. In such cases the magnitude of l has practically no effect on the image contrast.

B. Measurement of Local Electric Fields in an Emission Electron Microscope with Partial Electron Beam Restriction

One of the operation modes of the emission electron microscope is the regime with partial restriction of the electron beam. This mode can be achieved in two ways: (1) placement of an aperture of small diameter in the crossover region of the microscope immersion objective lens, and (2) partial beam restriction by plates (“knife-edges”) of different shapes. The electron beam can also be partially restricted by various electrodes of the electron-optical system of the microscope. In this case the partial restriction is produced by deflecting the electron beam with magnetic fields.
Placement of an aperture diaphragm in the crossover plane of the objective lens is often used to increase the resolving power of emission electron microscopes (Boersch, 1942; Recknagel, 1943), because it eliminates the electrons emitted from the object at large angles and with high initial velocities. For each given distribution of the emitted electrons over velocities and angles, there is an optimal diameter of the aperture diaphragm that provides the best resolution. In addition, the aperture diaphragm opens new possibilities for the imaging of local fields. In particular, the microscope sensitivity to local microfields and microgeometry increases substantially. The effect is most pronounced when the microscope design permits movement of the diaphragm in the directions perpendicular to the optical axis (Sedov et al., 1962). In terms of image quality, the best approach is to place the diaphragm or knife-edge in the crossover plane.

Beam restriction by deflecting the trajectories with an external magnetic field is used when the microscope design is not suitable for placement of a diaphragm or knife-edge in the crossover plane of the objective (Mundschau et al., 1996; Nepijko et al., 2002b). From the electron-optical point of view, this technique is not the best one because it results in image vignetting. When this occurs, the local field action is observed only within a relatively narrow band on the object surface. On one side of this band the image is completely blocked, whereas on the other side the image is formed with no beam restriction. Nevertheless, this technique is interesting because it allows operation with a standard equipment configuration without modification.

With partial electron beam restriction, the EEM image contrast is governed by different rules than for the unrestricted beam.
Under these conditions, the image contrast is formed by even a small tangential velocity gained by the electrons under the effect of the local fields. Consequently, the microscope sensitivity to these fields becomes much higher. As in the case without beam restriction, the image contrast is calculated in two stages: (1) calculation of the tangential velocity acquired by the electron in the space above the object surface under the action of the local field, and (2) calculation of the image brightness distribution (electron current density) at the microscope screen caused by the tangential velocity. These two problems are considered in more detail later in this section.

1. Image Contrast Formation in the Case of Beam Restriction

To calculate the image contrast observed on the microscope screen under the effect of the local fields in the case of beam restriction with an aperture diaphragm placed in the crossover plane of the objective lens, we adopt the
FIGURE 3. Schematic diagram of the image contrast formation in an emission electron microscope caused by local fields with restriction of the electron beam. Undeflected and deflected electron beams are denoted by ‘‘a’’ and ‘‘b,’’ respectively. K is the object plane, L is the objective lens, and E is the screen. The contrast aperture D with a round hole (upper diagram) or knife‐edge P (lower diagram) is located in the crossover plane C (back focal plane of the objective lens). L is the distance from the objective lens to the screen, l is the distance from the object plane to the anode diaphragm, f is the focal length of the objective lens, and S is the shift of the electron beam in the crossover plane due to the fields.
microscope model of an EEM instrument shown in Figure 3, upper diagram. Here, K denotes the flat surface of an object and L designates the objective lens of the microscope. The anode diaphragm, to which the microscope anode voltage V0 is applied, is placed at the distance l from the object. Above the object surface, there is a uniform accelerating electric field determined from Eq. (1): E0 = V0/l. The microscope screen plane, on which the image is observed, is designated as E. C denotes the crossover plane where all the electron rays leaving the object perpendicular to the surface intersect with the optical axis of the system. The focal distance of the objective lens is labeled f. As in the previous case shown in Figure 1, the z-axis of the Cartesian coordinate system is directed along the optical axis of the microscope, whereas the plane (x, y) coincides with the flat object surface.
In the crossover plane, we place a diaphragm with a hole D, leading to a partial restriction of the electron beam. The letter “a” designates the trajectory of an electron emitted from some point of the object normally to the object surface when this path is not deflected by the local fields. If the diaphragm hole is positioned concentric with the optical axis, a major part of the beam will pass to the screen, and the corresponding object point will look bright. If the electron trajectory near the object is deflected by the local fields (the ray denoted by “b”), these electrons will be blocked by the diaphragm, and the corresponding area of the object image will look dark. In this case, we observe a bright-field image of the local field distribution. However, if the diaphragm D is moved off axis (e.g., upward; see Figure 3, upper diagram), then, on the contrary, the electron flux on the screen will be larger for the deflected rays than for the nondeflected ones, and a dark-field image is observed.

Images of a similar nature can be obtained not only with a diaphragm but also with a knife-edge placed in the crossover plane, which restricts the beam from one side only (Figure 3, lower diagram). Referring to the figure, if the beam is deflected upward under the effect of the local fields and the edge blocks the lower part of the beam, a dark-field image appears. If the beam is deflected downward, the image will be bright-field. It is worth mentioning that if the diameter of the aperture diaphragm placed in the crossover plane is much larger than the electron beam cross-section, the beam is restricted by only one side of the diaphragm, and this case does not in fact differ from that of the knife-edge.

2. The Relation Between the Electron Beam Shift and the Image Brightness

To find the relation between the shift of the electron beam under the effect of the local fields and the electron current density on the screen, it is necessary to know the width, in the crossover plane, of the electron beam emitted from a single point of the object. For this purpose, we consider the beam parameters in the EEM optical system in more detail (Figure 4). Let us single out two basic rays. Ray “1” emerges from the center of the object K at a certain angle α. In the paraxial approximation (small α), the screen plane E (or the object plane of the projection lens) is located at the second point of intersection of this ray with the optical axis of the system. Ray “2” emerges normally to the object plane K at a distance r from the axis and intersects the screen plane E at a distance R from the axis. The ratio M = R/r is defined as the microscope's lateral magnification. The plane C, where ray “2” intersects the axis, constitutes the crossover plane. It is in this plane that the aperture diaphragm or the beam-restricting knife-edge is positioned. There is no electric field in the crossover region. All other rays
FIGURE 4. Principal electron trajectories in an emission electron microscope. See text for explanations.
starting at an arbitrary point of the object at other angles can be represented by a linear combination of these two principal rays. Let us denote by $\varepsilon$ the potential difference that corresponds to the initial energy of the electron when it escapes from the object. Assuming that the angle $\alpha$ is small enough, we write the Helmholtz–Lagrange equation for electron-optical systems (Glaser, 1952):
$$\alpha \sqrt{\varepsilon}\, r = \theta \sqrt{V_0}\, R, \tag{35}$$
where $\theta$ denotes the aperture angle in the screen plane. On the right-hand side of this equation we have neglected $\varepsilon$, which is negligibly small compared with the microscope anode voltage $V_0$. Multiplying both sides of this equality by $\sqrt{2e/m}$, we obtain
$$v_{0x} = M \theta v_0, \tag{36}$$
where $v_{0x}$ designates the tangential component of the electron initial velocity and $v_0 = \sqrt{2eV_0/m}$ denotes the electron velocity after its acceleration by the anode voltage. Because there is no field in the crossover plane and beyond it up to the screen plane, the electron trajectories in this region are straight lines (Figure 4). If the distance between the crossover plane and the screen plane equals $L$, the distance $r$ from trajectory 1 to the axis in the crossover plane is
$$r = \theta L = \frac{v_{0x} L}{M v_0} = f\, \frac{v_{0x}}{v_0} = f \sqrt{\frac{\varepsilon}{V_0}}. \tag{37}$$
Here, $f = L/M$ is a focal distance and $\varepsilon = m v_{0x}^2/(2e)$. In this form, Eq. (37) is valid for all values of the angle $\alpha$.
Assuming that the electrons emitted from the object are distributed in energy according to a Maxwell distribution and that their angular distribution is close to a cosine law, we find that the current density of the electron beam in the crossover plane follows a Gaussian distribution (Glaser, 1952):
$$j_c = j_{c0} \exp\left(-r^2/r_c^2\right). \tag{38}$$
Here, $r = \sqrt{x^2 + y^2}$ is the distance from the axis, and $r_c$ is the effective radius of the electron distribution function in the crossover plane. According to Eq. (37), this radius is equal to
$$r_c = f \sqrt{V_T/V_0}. \tag{39}$$
Here, $V_T$ denotes the potential corresponding to the most probable energy (center of gravity) of the distribution of electrons leaving the object. The value $eV_T$ is of the order of tenths of an eV for thermoelectrons; it varies from tenths of an eV up to several eV for photoelectrons and is of the order of one to several eV for ion-induced secondary electron emission. Clearly, if there is a mutual shift $s$ between the center of the diaphragm and the peak of the current density in the beam, the current transmitted through the diaphragm depends on the relative value of the shift
$$S = s/r_c. \tag{40}$$
This dependence is shown graphically in Figure 5. The dependence is close to a Gaussian distribution but somewhat broader owing to the finite diameter of the diaphragm hole; that is, the width of the distribution also depends on the hole diameter. If the diaphragm can be displaced upward or downward in the crossover plane by some distance $s_0$, we can obtain the best image contrast. In this case, the current transmitted to the screen depends on the value $S - S_0$, where $S_0 = s_0/r_c$. The sign of the initial relative shift $S_0$ depends on whether the shifts $S$ and $S_0$ are in the same or in opposite directions. The current transmitted through the diaphragm from a given point of the object is reduced according to the relation
$$j(x, y) = j_0(x, y) \exp\left[-(S - S_0)^2\right]. \tag{41}$$
Here, j0(x, y) is a distribution function for the current density on the screen in the absence of local fields. Let us now calculate the magnitude of current transmitted to the screen if the electron beam is restricted by the knife-edge of plate P and the center of the beam is shifted with respect to the edge by the distance s. Upon
FIGURE 5. Relative magnitude of the electron flux j/j0 on the screen of an emission electron microscope as a function of the shift S between the maximum of the electron flux distribution F and the center of the round hole of the contrast aperture D. In the upper part of the figure the dotted line shows the undeflected flux and the solid line shows the deflected flux distribution.
integration of Eq. (38) within infinite limits along the y-axis and within the interval from $-\infty$ to $s$, we obtain
$$j(s) = \frac{j_0}{2}\left[1 + \operatorname{erf}\left(\frac{s}{r_c}\right)\right]. \tag{42}$$
Here, $j_0$ is the current density from the elementary surface area of the object. As seen from this equation, the current transmitted to the screen depends not on the absolute value of the shift but on its relative value $S = s/r_c$. We obtain
$$\frac{j(S)}{j_0} = \frac{1}{2}\left(1 + \operatorname{erf}(S)\right). \tag{43}$$
The plot of this dependence is shown in Figure 6. However, the beam-restricting edge P can also be shifted by a distance $s_0$ with respect to the axis. Because the current transmitted to the screen depends on the shift of the electron beam relative to this edge, the value $S$ in Eq. (43) should be replaced by $S - S_0$, where $S_0 = s_0/r_c$. Eq. (43) then takes the final form
$$\frac{j(S)}{j_0} = \frac{1}{2}\left(1 + \operatorname{erf}(S - S_0)\right). \tag{44}$$
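As a quick numerical illustration of the knife-edge relation, Eq. (44), the sketch below evaluates the relative screen current for a few relative shifts. The function name `knife_edge_transmission` and the sample values are illustrative choices, not taken from the text; only the standard-library `math.erf` is used.

```python
import math

def knife_edge_transmission(S, S0=0.0):
    """Relative screen current j/j0 behind a knife-edge, Eq. (44).

    S  -- relative beam shift s/r_c caused by the local field
    S0 -- relative position s0/r_c of the knife-edge
    """
    return 0.5 * (1.0 + math.erf(S - S0))

# Beam centered on the edge: half of the current is transmitted.
print(knife_edge_transmission(0.0))
# Beam shifted by two effective radii toward the open side: nearly full current.
print(knife_edge_transmission(2.0))
# Same beam shift with the edge moved by S0 = 2: half transmission again.
print(knife_edge_transmission(2.0, 2.0))
```

Only the difference between the two shifts enters the result, which is why moving the edge can turn the same deflection from a brightening into a darkening.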
FIGURE 6. Relative magnitude of the electron flux j/j0 on the screen of an emission electron microscope as a function of the shift S between the maximum of the electron flux distribution F and the position of a knife‐edge P. Other notations as in Figure 5.
It can also be seen from Figures 3, 5, and 6 that, depending on the value and direction of the shifts, either a bright-field or a dark-field image of the same object can be seen on the screen. Indeed, let us assume that the position of the edge of diaphragm D and the direction of the electron beam shift under the effect of the local fields are as shown in Figure 3, upper diagram. In this case, the electron flux transmitted to the microscope screen from those parts of the object where the local fields deflect the beam will be smaller than that from the parts without local fields. The object will be displayed as a bright-field image. Let us now suppose that the diaphragm in Figure 3, upper diagram, is also moved upward. In this case, on the contrary, the parts of the object without local fields will look darker, whereas the parts with deflecting fields will look brighter; in other words, we will see a dark-field image. An intermediate case might also occur when the aperture diaphragm is moved by a relatively small distance. Let us assume that a dark-field image is observed and an image section becomes brighter at small beam deflections. If the beam deflection is large enough, however, the beam center is again displaced away from the center of the diaphragm, and we again see a darkening in the central parts of the local deflecting field. In addition, it is worth mentioning the practically important case of weak image contrast, when, after the beam restriction by the knife-edge, we are in the central part of the curve in Figure 6, which is linear with respect to the shift. In this case, the microscope sensitivity to the local fields is highest.
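The three image types described above can be reproduced with a very small model. The sketch below assumes a purely Gaussian transmission law, j/j0 = exp[-(S - S0)^2], for a small diaphragm; the broadening by the finite hole diameter seen in Figure 5 is neglected, and the three sample deflections are hypothetical values chosen only for illustration.

```python
import math

def diaphragm_transmission(S, S0=0.0):
    """Relative current j/j0 through a small diaphragm (Gaussian model).

    Assumes j/j0 = exp(-(S - S0)**2); the extra broadening caused by the
    finite hole diameter is not modeled.
    """
    return math.exp(-(S - S0) ** 2)

# Hypothetical relative beam shifts of three object regions.
regions = {"field-free": 0.0, "weak field": 1.5, "strong field": 2.5}

for S0 in (0.0, 1.5, 2.5):
    row = {name: round(diaphragm_transmission(S, S0), 3)
           for name, S in regions.items()}
    print(f"S0 = {S0}: {row}")
```

With S0 = 0 the field-free region is the brightest (bright-field image); with S0 = 2.5 the strongly deflecting region is the brightest (dark-field image); with S0 = 1.5 the weakly deflecting region is brightest while both the field-free and the strongly deflecting regions are dimmer, which is the intermediate case.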
Upon differentiation of Eq. (44), we find that for small values of the difference $S - S_0$ the following approximation can be applied:
$$\frac{j(S)}{j_0} \approx \frac{S - S_0}{\sqrt{\pi}} + \frac{1}{2}. \tag{45}$$
If the absolute value of $S - S_0$ does not exceed 0.3, the error in the estimate of the current density on the screen according to Eq. (45) is no more than 5%. This simplifies the calculations in the case of a relatively weak image contrast, because in this part of the curve the current density depends linearly on the shift $S$.
3. The Relation Between the Local Field Strength and Electron Beam Shift
To complete the calculation of the image contrast, it is necessary to find a relation between the distribution function of the local fields on the object surface and the shift of the electron trajectories $S$. The value of this shift in the crossover plane of the objective lens depends on the additional tangential velocity $v$ gained by the electrons under the effect of the local fields near the object surface. The additional tangential velocity is determined by the method of successive approximations. The motion of an electron with zero initial speed under the action of only the accelerating field $E_0$ of the immersion objective is taken as the zero-order approximation. Because the strength of this field is much larger than that of the local fields of the object, even the first approximation provides fairly high accuracy (Dyukov et al., 1991; Sedov, 1970). Let us calculate the tangential velocity $v_x$ acquired by an electron under the effect of the local fields above the object surface while the electron moves along the z-axis:
$$v_x = \dot{x} = \frac{e}{m} \int_0^t \frac{\partial V}{\partial x}\, dt = \frac{e}{m} \int_0^z \frac{\partial V}{\partial x}\, \frac{dz}{\dot{z}}. \tag{46}$$
Here, the integration with respect to time $t$ was changed to integration over the z-coordinate, and $\dot{z}$ is expressed by Eq. (7). Considering that the local fields above the object surface rapidly decrease with height, the upper limit of integration can be taken as infinity. The expression for $\partial V/\partial x$ is taken from Eq. (2) after differentiation with respect to $x$ under the integral. Upon these substitutions, changing the order of integration and integrating over $z$, we obtain
$$v_x(x, y) = \sqrt{\frac{\pi e}{2 m E_0}}\, \frac{1}{[\Gamma(1/4)]^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\partial \varphi/\partial x\, (x - \xi,\, y - \eta)}{(\xi^2 + \eta^2)^{3/4}}\, d\xi\, d\eta. \tag{47}$$
Here $\Gamma(1/4) = 3.6256$ is the value of Euler's gamma function at 1/4. A similar expression can be written for the shift in the direction of the y-axis. Let us also consider the practically important case in which the local fields on the object surface depend only on the x-coordinate. In this case, integrating Eq. (47) with respect to the variable $\eta$, we obtain
$$v_x(x) = \frac{1}{2} \sqrt{\frac{e}{m E_0}} \int_{-\infty}^{\infty} \frac{d\varphi(x - \xi)}{dx}\, \frac{d\xi}{\sqrt{|\xi|}}. \tag{48}$$
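The reduction from the two-dimensional to the one-dimensional expression rests on a single integral over $\eta$. It is written out here as a reading aid; the substitution variable $t$ and the beta function $B$ are introduced only for this sketch.

```latex
\begin{aligned}
\int_{-\infty}^{\infty} \frac{d\eta}{(\xi^2+\eta^2)^{3/4}}
  &= \frac{1}{\sqrt{|\xi|}} \int_{-\infty}^{\infty} \frac{dt}{(1+t^2)^{3/4}}
     \qquad (\eta = |\xi|\, t) \\
  &= \frac{1}{\sqrt{|\xi|}}\, B\!\left(\tfrac12, \tfrac14\right)
   = \frac{1}{\sqrt{|\xi|}}\, \frac{\Gamma(1/2)\,\Gamma(1/4)}{\Gamma(3/4)}
   = \frac{[\Gamma(1/4)]^2}{\sqrt{2\pi}}\, \frac{1}{\sqrt{|\xi|}},
\end{aligned}
```

where $\Gamma(1/2) = \sqrt{\pi}$ and the reflection formula $\Gamma(1/4)\,\Gamma(3/4) = \pi\sqrt{2}$ were used. Multiplying by the two-dimensional prefactor $\sqrt{\pi e/(2 m E_0)}\,/\,[\Gamma(1/4)]^2$ cancels the gamma functions and leaves $\tfrac{1}{2}\sqrt{e/(m E_0)}$, the coefficient of the one-dimensional formula.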
The electron shift $s(x, y)$ in the crossover plane caused by the tangential velocity acquired by the electron can be calculated from Eq. (37), into which the value of $\varepsilon$ taken from the expression
$$e\varepsilon = \frac{m v_x^2}{2} \tag{49}$$
should be substituted. From here we obtain the expression for the absolute value of the shift:
$$s = f \sqrt{\frac{m}{2 e V_0}}\, v_x. \tag{50}$$
As seen from this expression, the actual shift is directly proportional to the acquired tangential velocity. When calculating the current density at the screen, it is more convenient to use the relative shift defined by Eqs. (39) and (40). Upon substitution we obtain
$$S = \frac{v_x}{\sqrt{2 e V_T/m}}. \tag{51}$$
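The chain from tangential velocity to relative shift can be checked numerically. The sketch below evaluates the absolute shift, the crossover radius of Eq. (39), and the relative shift, and confirms that the focal distance and the anode voltage drop out of the ratio. The function names and the numbers for f, V_T, and v_x are hypothetical and serve only as an example.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
E_MASS = 9.1093837015e-31    # electron rest mass, kg

def crossover_radius(f, V_T, V_0):
    """Effective beam radius in the crossover plane, Eq. (39)."""
    return f * math.sqrt(V_T / V_0)

def absolute_shift(v_x, f, V_0):
    """Absolute shift s in the crossover plane, Eq. (50)."""
    return f * math.sqrt(E_MASS / (2.0 * E_CHARGE * V_0)) * v_x

def relative_shift(v_x, V_T):
    """Relative shift S = s / r_c, Eq. (51)."""
    return v_x / math.sqrt(2.0 * E_CHARGE * V_T / E_MASS)

# Hypothetical numbers: f = 10 mm, V_0 = 2 kV, V_T = 1 V, v_x = 3e5 m/s.
f, V_0, V_T, v_x = 10e-3, 2000.0, 1.0, 3.0e5
s = absolute_shift(v_x, f, V_0)
r_c = crossover_radius(f, V_T, V_0)
print(s / r_c, relative_shift(v_x, V_T))  # the two values agree
```

Because S depends only on the ratio of the acquired tangential velocity to the characteristic thermal velocity, the image contrast is set by the emission energy spread rather than by the lens geometry.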
Substitution of $v_x$ from Eq. (47) or (48) gives the relative shift $S = S_x(x, y)$ for two-dimensional fields on the object surface:
$$S_x(x, y) = \frac{\sqrt{\pi l}}{2 \sqrt{V_0 V_T}\, [\Gamma(1/4)]^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\partial \varphi/\partial x\, (x - \xi,\, y - \eta)}{(\xi^2 + \eta^2)^{3/4}}\, d\xi\, d\eta. \tag{52}$$
Here the accelerating field has been expressed through the anode voltage as $E_0 = V_0/l$. A similar expression is also valid for the y-axis. For one-dimensional fields that depend only on the x-coordinate, we obtain
$$S(x) = \frac{\sqrt{l}}{2 \sqrt{2 V_0 V_T}} \int_{-\infty}^{\infty} \frac{d\varphi(x - \xi)}{dx}\, \frac{d\xi}{\sqrt{|\xi|}}. \tag{53}$$
With a little manipulation, these expressions can also be written in another form. For example, by a change of variables, Eq. (53) can be written as
$$S(x) = \frac{\sqrt{l}}{2 \sqrt{2 V_0 V_T}} \int_{-\infty}^{\infty} \frac{\varphi'(\xi)\, d\xi}{\sqrt{|x - \xi|}}. \tag{54}$$
Upon integration by parts, this expression can be represented as
$$S(x) = -\frac{\sqrt{l}}{4 \sqrt{2 V_0 V_T}} \int_{-\infty}^{\infty} \frac{\varphi(\xi)\, \operatorname{sgn}(x - \xi)}{|x - \xi|^{3/2}}\, d\xi. \tag{55}$$
This integral should be taken in the sense of the principal value. The derived Eqs. (52) through (55) are functional transformations that, together with Eqs. (41) through (45), solve the direct problem of the image contrast on the screen of an emission electron microscope under the effect of the local fields. Using the known function $\varphi(x, y)$, we calculate the integral transformations (52) through (55), and then, using the calculated shifts $S(x, y)$, we determine the electron current transmitted to the screen at the given point of the image.
4. Solution of the Inverse Problem
Because the previously derived formulae define a quantitative relation between the local field strength on the object surface and the current density on the screen, it is now possible to solve the inverse problem of the image contrast; in other words, we can derive the electric potential distribution on the object surface from its image. The solution of the inverse problem consists of two parts: first, the functions $S_x(x, y)$ and $S_y(x, y)$ are calculated from the measured brightness distribution at the screen of an emission electron microscope, and then, with the help of these functions, the electric potential distribution $\varphi(x, y)$ is calculated. The first part of the problem is solved with the help of plots such as those in Figure 5 or 6, depending on whether the beam is restricted by a diaphragm or by a plate with a knife-edge. It is also possible to use Eqs. (41) through (45). For practical applications, the case of a relatively small image contrast described by Eq. (45) is very convenient. Solving Eq. (45) with respect to the shift $S$, we obtain
$$S_x(x) = S_0 + \sqrt{\pi} \left[\frac{j(x)}{j_0} - \frac{1}{2}\right]. \tag{56}$$
The second part of the problem can be solved if we consider Eqs. (52) and (53) as integral equations with respect to the unknown function $\varphi(x, y)$. These are equations of the convolution type
(Fredholm equations of the first kind with polar kernels having different exponents). Equations of this type are solved by the Fourier transform method over the space of generalized functions (distributions) (Schwartz, 1950). The Fourier transform of a convolution of two functions is equal to the product of the transforms of these functions. Therefore we can derive an expression for the Fourier transform of the desired function, and then, upon the inverse Fourier transformation, this function itself can be expressed in explicit form. However, to exclude infinitely large components, the obtained expressions must be regularized according to the rules established for generalized functions. Omitting the details of these transformations, we present here the final solution of the integral equations given previously. The solution of Eq. (52) for two-dimensional fields on the object surface has the following form for the field component directed along the x-axis:
$$\frac{\partial \varphi(x, y)}{\partial x} = \frac{[\Gamma(1/4)]^2}{8 \pi^2} \sqrt{\frac{V_0 V_T}{\pi l}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{S_x(x, y) - S_x(x - \xi,\, y - \eta)}{(\xi^2 + \eta^2)^{5/4}}\, d\xi\, d\eta. \tag{57}$$
A similar solution can also be written for the field component directed along the y-axis. The electric potential distribution function $\varphi(x, y)$ on the object surface can be found by integrating these expressions over the respective coordinates. When the local fields depend only on a single coordinate $x$, the solution of Eq. (53) has the following form:
$$\frac{d\varphi}{dx} = \frac{1}{\pi} \sqrt{\frac{V_0 V_T}{2 l}} \int_{-\infty}^{\infty} \frac{S(x) - S(x - \xi)}{|\xi|^{3/2}}\, d\xi. \tag{58}$$
The electric potential distribution on the object surface is derived from this equation by means of numerical integration. Thus the complete solution of the problem of calculation of the local electric fields on the object surface in the case of partial restriction of the electron beam is made in two stages. (1) Using the experimentally measured electron current distribution at the screen j(x, y), the function of the electron beam shifts S is calculated with the help of Eqs. (41) through (45) or plots such as Figures 5 and 6. (2) Using Eq. (57) or (58), the integral transformations are performed. After this, the electric potential distribution on the surface is calculated by means of numerical integration over the coordinates.
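The two stages can be sketched in a few lines of standard-library Python. Stage 1 is shown for the knife-edge case (exact inversion of the error-function law by bisection, and the linear weak-contrast formula); stage 2 evaluates the one-dimensional inverse integral, Eq. (58), by a midpoint rule. All function names, the quadrature scheme, and the value of V_T are assumptions of this sketch. The check uses the arctan-shaped potential step treated in Section III, whose closed-form shift is quoted there as Eq. (62).

```python
import math

def shift_linear(j_rel, S0=0.0):
    """Stage 1, weak contrast: Eq. (56), S = S0 + sqrt(pi) * (j/j0 - 1/2)."""
    return S0 + math.sqrt(math.pi) * (j_rel - 0.5)

def shift_exact(j_rel, S0=0.0):
    """Stage 1, knife-edge: invert Eq. (44) for S by bisection."""
    lo, hi = -6.0, 6.0                      # erf saturates well inside this range
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if 0.5 * (1.0 + math.erf(mid)) < j_rel:
            lo = mid
        else:
            hi = mid
    return S0 + 0.5 * (lo + hi)

def field_from_shift(S, x, a, l, V0, VT, u_max=200.0, n=20000):
    """Stage 2: d(phi)/dx at point x from a callable S, Eq. (58).

    The integral is symmetrized and evaluated with xi = a * u**2, which
    removes the |xi|**(-3/2) singularity; `a` only sets the length scale.
    """
    du = u_max / n
    s_x = S(x)
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du                  # midpoint rule
        xi = a * u * u
        total += 2.0 * (2.0 * s_x - S(x - xi) - S(x + xi)) / (u * u)
    # Analytic tail of the 2*S(x) term; S(x +/- xi) is neglected beyond u_max.
    integral = total * du + 4.0 * s_x / u_max
    integral /= math.sqrt(a)
    return math.sqrt(V0 * VT / (2.0 * l)) / math.pi * integral

# Check with the arctan potential step; VT = 1 V is an assumed value.
phi0, a, l, V0, VT = 9.0, 3e-6, 3e-3, 2000.0, 1.0

def step_shift(x):
    """Closed-form relative shift of the arctan step, Eq. (62)."""
    q = math.sqrt(1.0 + (x / a) ** 2)
    return 0.5 * phi0 * math.sqrt(l / (2.0 * V0 * VT * a)) * math.sqrt(1.0 + q) / q

x = 2e-6
recovered = field_from_shift(step_shift, x, a, l, V0, VT)
expected = phi0 / math.pi * a / (x * x + a * a)   # derivative of Eq. (59)
print(recovered, expected)
```

In a real measurement the callable S would be an interpolation of the shifts obtained in stage 1, and the potential would follow from a cumulative numerical integration of the recovered field over x. Measured data are noisy, so the smoothing and regularization issues discussed in Section II.C apply before this step.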
C. Comparison of Investigation Techniques in an Emission Electron Microscope With and Without Beam Restriction
Based on the presented theory and experiments on the measurement of local fields on the object, we can draw the following conclusions regarding the comparative assessment of the two EEM experimental techniques, namely those with and without electron beam restriction. Studying objects in the operation mode without electron beam restriction allows the use of a simple emission electron microscope design without additional elements such as an aperture diaphragm or a knife-edge placed in the crossover plane. In this case, the sensitivity of the microscope to local fields is lower than with beam restriction; however, the image generally shows a lower level of noise caused by the microroughness of the object surface. The presence of an aperture diaphragm or a knife-edge in the crossover plane of the emission electron microscope significantly improves the microscope performance, in particular the sensitivity to local fields. The improvement is especially marked if the microscope design permits a controllable shift of the diaphragm or the knife-edge. This concerns not only the capabilities of the emission electron microscope for the study of local fields but also its application for other purposes. It becomes possible to study an object using electrons that escape from the object within narrow ranges of angles and initial energies. Moreover, the sensitivity of the microscope to the local fields and to the object microgeometry improves significantly. A few words are necessary about the stability of the field calculations with respect to the uncertainty of the current density measurements on the screen. The current density distribution at the screen is always measured with uncertainties caused mainly by microroughness of the object.
The question is: What is the magnitude of the error of the calculated electric field and potential distributions on the object surface? The point is that the integral Eqs. (15), (16), (52), and (53) belong to the so-called ill-posed problems of mathematical physics. This means that when the integrand (the unknown function) is calculated, the errors may increase. In physical terms, this is because the solution increases the contribution of the high spatial frequencies $\omega = 2\pi/d$ of the initial functions, where $d$ is the period of spatial variation of the function. The distribution of spatial frequencies of a function can be obtained by Fourier transformation. As a rule, measurement errors and noise contribute most to the high spatial frequencies. A more detailed mathematical analysis leads to the following results. When there is no beam restriction, the field strength for one-dimensional fields is calculated using Eq. (27), which represents the
convolution of the function $S(x)$ with the polar kernel $\xi^{-2}$. In terms of the Fourier transformation, this means that the spatial frequencies of the function $S(x)$ are multiplied by $\omega$. This enhances the high-frequency noise, and the magnitude of the initial experimental error increases approximately by a factor of 2. However, the subsequent integration of the field strength function, performed to obtain the electric potential distribution, is equivalent to dividing all spatial frequencies by $\omega$, which reduces the error to its initial magnitude. In other words, if the relative error in the determination of the function $S(x)$ from its electron microscope image is 10%, then the error in the field strength calculation will be approximately 20%, but the potential distribution function will be determined with an error of 10%. In the case of electron beam restriction, however, the field strength for one-dimensional fields is calculated using Eq. (58), which represents the convolution of the function $S(x)$ with the polar kernel $|\xi|^{-3/2}$. In terms of the Fourier transformation, this means that the spatial frequencies of the function $S(x)$ are multiplied by $\omega^{1/2}$. This also increases the high-frequency noise; however, the relative increase of the initial errors is smaller, approximately a factor of 1.5. Subsequent integration of the field strength to obtain the potential distribution function likewise decreases the error. Thus the computational errors are somewhat smaller when microfields are measured with partially restricted electron beams. Certain measures can be taken to increase the accuracy of the measurements. For example, when measuring the electron current density for one-dimensional local fields, it is reasonable to carry out several measurements along parallel lines and to average the results afterward.
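The benefit of averaging parallel line scans can be illustrated with synthetic data. The sketch below assumes that the noise on different lines is statistically independent and Gaussian, which is an idealization; the profile shape, noise level, and number of lines are arbitrary example values.

```python
import math
import random

def average_profiles(profiles):
    """Point-by-point average of current density profiles taken along parallel lines."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]

def rms_deviation(profile, reference):
    """Root-mean-square deviation of a profile from a reference profile."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(profile, reference)) / len(profile))

# Synthetic example: one smooth profile plus independent noise on each of 16 lines.
random.seed(0)
xs = [i / 50.0 - 2.0 for i in range(201)]
clean = [0.5 * (1.0 + math.erf(0.3 * math.exp(-x * x))) for x in xs]
lines = [[c + random.gauss(0.0, 0.05) for c in clean] for _ in range(16)]

single = rms_deviation(lines[0], clean)
averaged = rms_deviation(average_profiles(lines), clean)
print(single, averaged)   # the averaged profile is markedly less noisy
```

For independent noise the residual error falls roughly as the inverse square root of the number of averaged lines; noise that is correlated between neighboring lines, such as that from extended surface defects, is reduced less.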
In the presence of high-frequency noise caused by the object microgeometry, it is recommended first to smooth the current density profiles using mathematical techniques. Finally, there are purely mathematical techniques for the solution of such ill-posed problems that minimize possible errors (Tikhonov and Arsenin, 1977, 1986; Tikhonov and Goncharsky, 1987; Tikhonov et al., 1998).
III. MODEL EXPERIMENTS ON MAPPING OF ELECTRIC FIELDS (POTENTIAL) ON THE OBJECT SURFACE USING AN EMISSION ELECTRON MICROSCOPE
Generally, the aforementioned integral transformations are performed by numerical methods. However, in some cases, these integrals can be calculated analytically. For example, let us take an object with a smooth potential
step described by the function
$$\varphi(x) = \frac{\varphi_0}{\pi} \arctan\frac{x}{a}. \tag{59}$$
Here, $\varphi_0$ is the height of the potential step, and $a$ is a parameter determining its width (see Figure 7a). Such a potential distribution function is characteristic, for example, of a p-n junction in a semiconductor to which a blocking voltage $\varphi_0$ is applied.
FIGURE 7. Analytically calculated functions obtained from the solution of the direct image contrast problem for a potential step on a smooth object surface. Electric potential distribution on the object surface when a potential of $\varphi_0$ = 3 (1), 6 (2), and 9 V (3) is applied to it (a). Curves of electron shifts S calculated for these cases (b). Plots of the current density distribution at the microscope screen corresponding to these cases (c). The values of the microscope parameters are given in the text.
A. Visualization of a Potential Step
To calculate the image of a potential step on a smooth object surface in an emission electron microscope without electron beam restriction, let us substitute Eq. (59) into the integral (16). The integral can be evaluated analytically, which gives the following expression for the shift function $S(x)$:
$$S(x) = \frac{\varphi_0}{2 \pi E_0} \ln \frac{(l + a)^2 + x^2}{x^2 + a^2}. \tag{60}$$
This function is shown in Figure 7b for different voltages $\varphi_0$. Figure 7c shows the plots of the corresponding electron current density distribution at the screen calculated by Eq. (22). The calculation was performed assuming the following parameters: $V_0$ = 2 kV, $l$ = 3 mm, $a$ = 3 µm, and $\varphi_0$ = 3, 6, and 9 V. From this figure, several conclusions can be drawn. First, the image of the potential step (located at $x = 0$) can be significantly shifted on the microscope screen; the distance between the actual and displayed locations of the step increases with increasing voltage applied to the step. The respective shift is equal to
$$S(0) = \frac{\varphi_0}{\pi E_0} \ln \frac{l + a}{a}. \tag{61}$$
For example, if the voltage applied to the p-n junction is 9 V, the shift of the apparent step position amounts to 30 µm, which is much more than the half-width of the step of 3 µm. If this shift were neglected, a false conclusion would be drawn regarding the location of this feature on the object. The potential step looks like a double dark and bright stripe at the screen. When the voltage applied to the object increases, the maximum of the current density at the screen becomes sharper, and finally a narrow bright stripe is observed at the screen. Consequently, owing to the effect of the local fields, the depth of the image contrast at the screen of the emission electron microscope can be rather large. Let us now calculate the image of the same potential step displayed at the screen of an emission electron microscope under partial restriction of the electron beam by an aperture diaphragm. Upon substitution of Eq. (59) into Eq. (53), we obtain
$$S(x) = \frac{\varphi_0}{2} \sqrt{\frac{l}{2 V_0 V_T a}}\; \frac{\sqrt{1 + \sqrt{1 + x^2/a^2}}}{\sqrt{1 + x^2/a^2}}. \tag{62}$$
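The numbers quoted above can be reproduced directly. The sketch below evaluates the apparent displacement of the step for the stated parameters, taking E0 = V0/l (the relation implied by comparing the velocity and relative-shift formulas of Section II.B.3), and then compares the closed-form relative shift for the restricted beam with a direct midpoint-rule quadrature of the convolution integral, Eq. (54). The function names, the quadrature scheme, and the value V_T = 1 V are assumptions of this sketch.

```python
import math

V0, l, a = 2000.0, 3e-3, 3e-6     # anode voltage (V), l (m), step half-width (m)
E0 = V0 / l                       # assumed relation between E0, V0 and l

def apparent_displacement(phi0):
    """Eq. (61): shift of the apparent step position without beam restriction."""
    return phi0 / (math.pi * E0) * math.log((l + a) / a)

def shift_closed_form(x, phi0, VT):
    """Eq. (62): relative shift with partial beam restriction."""
    q = math.sqrt(1.0 + (x / a) ** 2)
    return 0.5 * phi0 * math.sqrt(l / (2.0 * V0 * VT * a)) * math.sqrt(1.0 + q) / q

def shift_by_quadrature(x, phi0, VT, u_max=60.0, n=60000):
    """Eq. (54) for the step (59): midpoint rule with xi = x +/- a * u**2."""
    def dphi(xi):                                  # derivative of Eq. (59)
        return phi0 / math.pi * a / (xi * xi + a * a)
    du = u_max / n
    total = 0.0
    for i in range(n):
        d = a * ((i + 0.5) * du) ** 2
        total += dphi(x + d) + dphi(x - d)
    integral = 2.0 * math.sqrt(a) * total * du
    return 0.5 * math.sqrt(l / (2.0 * V0 * VT)) * integral

print(apparent_displacement(9.0) * 1e6)            # about 29.7 (micrometers)
VT = 1.0                                           # assumed value, not from the text
print(shift_closed_form(2e-6, 9.0, VT), shift_by_quadrature(2e-6, 9.0, VT))
```

The first line confirms the displacement of roughly 30 µm for a 9 V step. The substitution xi = x ± a u² removes the inverse-square-root singularity of the kernel, so a plain midpoint rule is sufficient for the comparison.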
The current density distribution on the screen is shown in Figure 8 for different positions of the aperture diaphragm in the crossover plane. Figure 8a shows the potential distribution on the object surface. Figure 8b shows the shift curve $S(x)$ calculated by Eq. (62). In the following graphs, the current density on the screen is presented for different values of the aperture diaphragm shift $S_0$ with respect to the optical axis of the system (see Section II.B.2). Figure 8c corresponds to the diaphragm centered on the optical axis
FIGURE 8. Calculated curves for the case of a potential step on a smooth object surface when imaging with an aperture diaphragm placed in the crossover plane. Potential distribution on the object surface (a). Curve of trajectory shift calculated by using Eq. (62) (b). Plot of the current density at the screen of the emission electron microscope with the aperture diaphragm shifted relative to the optical axis by $S_0$ = 0 (c), 1.5 (d), 2.5 (e), and 3.5 (f). The image character is bright‐field (c), intermediate (d), dark‐field (e), and still dark‐field but with weaker brightness (f).
of the system, $S_0 = 0$. Therefore we observe a bright-field image: the electrons escaping from the object far from the potential step pass freely to the screen, whereas part of the electrons leaving the surface in the region of the potential step are blocked by the diaphragm. Figure 8d shows the current density curve for a diaphragm shift of $S_0 = 1.5$. The image is of intermediate nature: the electrons escaping from the object far from the potential step are partially blocked, whereas those leaving the surface near the step are slightly deflected and reach the center of the diaphragm; therefore the current density increases. However, the electrons escaping from the object surface in the central region of the potential step experience a deflection large enough to cause the current density to decrease again. This results in a minimum in the central part of the curve. Figure 8e corresponds to a diaphragm shift of $S_0 = 2.5$ with respect to the axis. This is a dark-field image because only strongly deflected electrons pass through the diaphragm. Finally, Figure 8f shows the current density at a diaphragm shift of $S_0 = 3.5$. Again, we observe a dark-field image, but its brightness is lower. The image variations for such diaphragm shifts in the crossover plane were actually observed in experiment (Dyukov et al., 1991).
B. Computer Simulation of the Image Contrast
To check the validity of the above analytical derivation, we performed a computer simulation of the electron trajectories for the case of the observation of a p-n junction in an emission electron microscope. The calculation was performed using the program SIMION 6.0. The p-n junction field was simulated by a linear voltage drop of 10 V over a section 1 mm in width. The starting angles of the electrons escaping from the object surface were taken in the range from −20° to +20° with an interval of 10°.
A contrast aperture with a hole 100 µm in diameter was modeled in the crossover plane of the objective lens. The magnification was assumed to be equal to 25, a typical value for the objective lens of an emission electron microscope. Results of the calculations of the current density at the screen are shown in Figure 9. Figure 9a shows the current density distribution at the microscope screen in the operation mode without restriction of the electron beam. The overall sideways shift of the part of the electron beam above the local field corresponds to Eq. (16). This shift results in the appearance of a maximum and a minimum in the curve of the current density distribution. The total current of the beam does not change in this case (flux conservation); therefore the areas under the curve $j(x)$ in the sections corresponding to the decrease and the increase of the current with respect to the initial level are equal.
FIGURE 9. Computer simulation of the current density on the screen j(x) from a p‐n junction located on the optical axis. There is no contrast aperture (a). The distance between the contrast aperture with a round hole 100 µm in diameter and the microscope optical axis is 0 (b), 100 (c), 200 (d), 300 (e), and 400 µm (f). Other parameters used in the calculation are given in the text.
Figure 9b gives the current distribution for the case of a bright-field image, when the aperture diaphragm of 100 µm width is centered on the optical axis of the system. In this case, the full current passes through the diaphragm to the screen from the object regions where there are no fields, whereas a pronounced darkening reaching zero intensity is observed in the central region, where the local field is maximum. In Figure 9c, the contrast aperture is shifted 100 µm off axis. This corresponds to the aforementioned intermediate type of contrast. In the central object part, the shift of the rays is so large that it exceeds the diaphragm shift. Therefore, a darkening is observed here (i.e., the current on the screen is reduced). However, at some distance from the object center, where the deflection of the electron trajectories is smaller, it becomes equal to the shift of the diaphragm itself. Then the relative shift $(S - S_0)$ becomes zero, and the maximum current passes through the diaphragm. Still farther from the
central object part, the beam shift is already smaller than the diaphragm shift; therefore the transmitted current decreases again. In this case a p-n junction looks like a double bright and dark stripe. The diaphragm shift is increased to 200 mm in Figure 9d. Then the image is qualitatively similar to the previous one, but the passed current goes down because the aperture is far away from the optical axis. The image approaches the dark-field case. When the diaphragm is shifted by 300 mm, as shown in Figure 9e, only electrons in the central object part, for which the deflection is strongest, can pass through the diaphragm, forming a bright stripe at the screen. This case corresponds to the dark-field image. Note that the apparent position of the p-n junction has changed significantly. Finally, if the diaphragm shift is as large as 400 mm, as shown in Figure 9f, even in the central object part, only a small fraction of electrons passes through the diaphragm. The stripe appears very weakly. A further shift of the diaphragm leads to a completely dark screen. These direct calculations of the electron trajectories prove the validity of the derived formulae for calculation of the image contrast. They also confirm the above results obtained analytically for the case of a potential step. The aforementioned conclusions are illustrated for the image pattern as a function of initial diaphragm position and the magnitude of the electron beam deflection due to the eVect of the local field on the object. C. Illustrative Measurements with a Semiconductor p-n Junction An experimental verification of the theory set forth in Section II was also performed for the case of semiconducting p-n junctions with applied blocking voltage of preset value. Experiments of this kind allow more exact definition of some parameters of the electron microscope, such as the quantity l appearing in Eq. (1) and subsequent ones. 
These experiments were carried out with an emission electron microscope that uses ion-induced secondary electron emission from the object surface (Nepijko et al., 2002d, 2003). The electric fields were studied both without electron beam restriction and with partial restriction by a contrast aperture, and the results measured in the two cases were compared. The images of a p-n junction on a silicon diode are shown in Figure 10, which compares three images of the same object obtained under different conditions. Figure 10a shows the image of the p-n junction when no electron beam restriction is applied, Figure 10b shows the bright-field image obtained with the beam restricted by an aperture diaphragm placed in the crossover plane, and Figure 10c shows the dark-field image.
FIGURE 10. Images of a p-n junction on silicon in an emission electron microscope without (a) and with (b, c) electron beam restriction. The bright-field (b) and dark-field (c) images were obtained by means of a lateral shift of the aperture diaphragm. A voltage of 10 V was applied to the p-n junction in the reverse direction. To determine the p-n junction characteristics, the profiles of the current density j(x) measured along line AB were used.
Because the investigated object was chemically homogeneous, the density of the electron emission current j0 was practically the same from all parts of its surface, and in the absence of an applied voltage the p-n junction is not visible on the microscope screen. It becomes visible when the blocking voltage is applied. The image shown in Figure 10a was taken at an applied voltage of 10 V. The electron flux at the screen is redistributed because the electron trajectories are shifted by the local electric field across the p-n junction. Therefore the p-n junction looks like a double bright and dark stripe. The results of processing this image according to the aforementioned theory are given in Figure 11. Figure 11a shows the profile of the current density distribution j(x) along line AB (see Figure 10a). Figure 11b shows the shift S(x) of the electron trajectories obtained by integration of the current density curve, as explained in Figure 2, and by application of Eq. (24). Figure 11c presents the distribution of the local electric field strength E(x) = φ′(x). This curve was calculated numerically by solving the integral equation, Eq. (27). Finally, Figure 11d shows the profile of the electric potential distribution φ(x) obtained by numerical integration of the function φ′(x).
The design of the emission electron microscope with ion-induced secondary electron emission that was used in our studies permits placement of an aperture diaphragm in the crossover plane of the immersion objective lens. The diaphragm can be moved in a controlled manner in the directions perpendicular to the optical axis. In this way, all the aforementioned types of images can be obtained, namely bright-field, dark-field, and the intermediate type. The bright-field and dark-field images are the most suitable for measurements.
FIGURE 11. Results obtained from the solution of the inverse problem for the electric potential distribution on the surface of the silicon p-n junction shown in Figure 10a. (a) Profile of the current density distribution j(x) along line AB in Figure 10a. (b) Shift S(x) as a function of the coordinate, calculated from the curve j(x) after smoothing to reduce the influence of noise. (c) Calculated profile of the electric field strength E(x) at the object. (d) Electric potential distribution φ(x) at the p-n junction obtained by numerical integration of the field strength.
Figures 10b and 10c show the images of a p-n junction on silicon obtained with the beam restricted by the aperture diaphragm. To calculate the potential distribution at the object, we first have to measure the initial energy distribution of the emitted secondary electrons, which is required for the subsequent calculations. The measured energy distribution corresponded to a value of VT = 1.1 V (see Section II.B.2). A voltage of 10 V was applied to the p-n junction in the reverse direction. In Figure 12a, the image brightness profiles j(x) along line AB (see Figure 10a) are compared for the bright-field (Figure 10b) and dark-field (Figure 10c) images. Despite the opposite sign of the observed features, they correspond to the same local field, which can be calculated from each of these curves if the corresponding magnitude of the aperture shift S0 is taken into account.
FIGURE 12. (a) Current density profiles j(x) measured along line AB (marked in Figure 10a) in Figures 10b and 10c. (b) Corresponding shifts S(x), (c) corresponding distributions of the electric field E(x), and (d) potential φ(x). The curves j(x), S(x), and E(x) are shown by solid and dashed lines for the bright-field (Figure 10b) and dark-field (Figure 10c) images, respectively. The curves φ(x) practically coincide for both cases (d).
The curves obtained were processed with the computer program MATHCAD. A weak smoothing of the curves was first performed to remove noise. In addition, it was necessary to remove a possible influence of the nonuniformity of object illumination by the primary beam on the measured profiles. This nonuniformity manifests itself in a variation of the unperturbed current density j0 across the sample. In the present experiment this dependence was weak, and j0(x) can be assumed to depend linearly on the coordinate across the field of view. To eliminate this effect, a first-order regression of the measured curve j(x) was calculated by the program, and the linear part of the regression was subtracted from the initial curve. This operation leveled the current density at both ends of the curves, where there is no influence of local fields. The resulting profile of the current density j(x) was then used to calculate the electron beam shift S(x) on the basis of Eq. (16).
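The background correction described above can be sketched in a few lines of Python. This is only an illustration, not the MATHCAD worksheet used for the measurements: the function name `remove_linear_background` and the synthetic profile are hypothetical, and subtracting only the slope term (so that the mean level, taken as j0, is preserved) is an assumption about what "the linear part of the regression" means.

```python
def remove_linear_background(x, j):
    """Subtract the slope term of a first-order least-squares fit from j(x).

    Assumption of this sketch: only the slope is removed, so the mean level
    of the profile (interpreted as the unperturbed j0) is preserved.
    """
    n = len(x)
    x_mean = sum(x) / n
    j_mean = sum(j) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (ji - j_mean) for xi, ji in zip(x, j))
    slope = sxy / sxx
    leveled = [ji - slope * (xi - x_mean) for xi, ji in zip(x, j)]
    return leveled, slope


# Synthetic example: flat profile j0 = 1 with a weak linear illumination drift.
x = [0.5 * i for i in range(101)]             # positions, arbitrary units
j = [1.0 + 0.002 * (xi - 25.0) for xi in x]   # drift of 0.002 per unit length
leveled, slope = remove_linear_background(x, j)
```

With a measured profile, the fit could instead be restricted to the field-free ends of the curve so that the junction feature does not bias the slope; the text does not say whether this was done.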
Figures 12b–d show the results of this procedure using the aforementioned equations. Figure 12b gives the shifts S(x) obtained from the bright-field (solid curve) and dark-field (dashed curve) images of the p-n junction. A small difference is visible, which can be explained by the measurement error in each image. Nevertheless, this figure illustrates that the profile of the shift S(x) can be reconstructed with satisfactory accuracy from both types of images. The distributions of the electric field strength E(x) on the object surface were determined numerically for both curves using Eq. (58) with the program MATHCAD. These curves are presented in Figure 12c. Figure 12d shows the curves of the electric potential distribution φ(x) obtained from the preceding two curves by numerical integration. These curves practically coincide because the influence of noise decreases under integration. Thus, the comparison of the experimental results obtained on the same object by the two different techniques, with and without electron beam restriction, shows that these results are practically identical.
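The last processing step, the numerical integration of the field strength to obtain the potential, might look as follows. This is a minimal sketch using the trapezoidal rule; the function name `integrate_field`, the synthetic 10 V step, and the choice φ = 0 at the left end of the profile are assumptions for illustration, and E(x) = φ′(x) is taken as written in the text.

```python
import math


def integrate_field(x, field, phi_start=0.0):
    """Cumulative trapezoidal integral of field(x); returns phi on the same grid."""
    phi = [phi_start]
    for k in range(1, len(x)):
        dx = x[k] - x[k - 1]
        phi.append(phi[-1] + 0.5 * (field[k] + field[k - 1]) * dx)
    return phi


# Synthetic profile (not measured data): a smooth 10 V step
# phi(x) = (10/pi) * arctan(x/w), whose derivative serves as the "field".
w = 2.0                                         # half-width, arbitrary units
x = [-200.0 + 0.05 * i for i in range(8001)]
field = [(10.0 / math.pi) * w / (xi * xi + w * w) for xi in x]
phi = integrate_field(x, field)
total_step = phi[-1] - phi[0]                   # potential drop across the junction
```

Because each value of φ is a sum over many field samples, uncorrelated noise in E(x) partly averages out, which is consistent with the observation that the two φ(x) curves nearly coincide.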
IV. THE EFFECT OF THE LOCAL FIELDS AND MICROROUGHNESS AT THE OBJECT ON THE IMAGING AND RESOLVING POWER OF AN EMISSION ELECTRON MICROSCOPE
In Section III we presented images of a p-n junction obtained in the emission electron microscope that demonstrate some peculiarities of imaging by a cathode lens in the presence of local fields on the object surface. In this section we present further examples showing visible distortions of the images that can be analyzed systematically by analytical calculations (Nepijko et al., 2000d) or ray tracing (Nepijko et al., 2000c).
A. Distortion of the Image Details Under the Effect of the Local Field on the Object Surface
1. Image Distortion of One-Dimensional Structures
First, let us suppose that the emission current density j0 is uniform. This is possible, for example, in the case of ion-induced secondary electron emission from the object surface, when the emission current depends only weakly on the work function of the object material. Consider a strip of width 2d on the object at a negative voltage φ0 with respect to the rest of the surface. The potential distribution at the strip boundary is described by Eq. (59). The numerical values of the parameters were taken as follows: φ0 = 3 V, a = 1 μm, V0 = 20 kV, l = 10 mm, and the half-width of the strip d = 5 μm. The current density distribution at the screen calculated for this case is shown in Figure 13.
FIGURE 13. Current density distribution at the screen (line scan) in the presence of a negatively charged strip with a half-width of d = 5 μm on the object surface. The parameters of the calculation are given in the text.
The following features of the image can be seen. One could expect the negatively charged strip to look wider than its actual width and darker than the adjacent areas, because the electrons are repelled from this strip. Nevertheless, our calculations show the reverse situation. This phenomenon can be explained using Figure 1. Of course, the electron trajectories diverge from the negatively charged strip; however, we should consider not these trajectories themselves, but the continuation of the tangents to these trajectories back to the virtual object plane. According to the calculations, these continuations cross each other, as shown in the figure. This means that instead of diverging, the electrons approach each other at the screen, which results in a reversal of the image contrast. However, it should be noted that here we deal with the in-focus image. Even the smallest defocusing may cause an additional defocusing contrast whose sign depends on the sign of the defocusing. In that case, we can obtain both positive and negative image contrast. These types of image contrast can also be calculated by the formulae given by Dyukov et al. (1991). As is evident from Figure 13, for the aforementioned numerical values of the parameters the apparent width of the strip is approximately 7 μm instead of 10 μm. Thus, for such small details, the difference between the image dimensions and those of the actual objects can be rather significant. This difference decreases as the dimensions of the details increase. Figure 14 shows the case of a positively charged strip with the same values of the parameters.
It is clear that in this case the strip looks wider than its actual size and darker than the adjacent areas of the object. The apparent width of the strip is 13 to 14 μm instead of 10 μm.
FIGURE 14. Current density distribution at the screen (line scan) in the presence of a positively charged strip with a half-width of d = 5 μm on the object surface. The parameters of the calculation are given in the text.
Let us now consider the case when the initial distribution of the current density j0 emitted from the object surface is not uniform. Such a case can be observed, for example, when the work function differs between different parts of the object being investigated. The photoelectric current density will be larger at the parts of the object with lower work function. The combination of these two factors (a jump of the emission current density and a potential jump) results in the current density distribution shown by the solid curve in Figure 15. The plot refers to the following parameter values: φ0 = 3 V, l = 10 mm, V0 = 20 kV, and a = 1 μm, and the ratio of the initial brightness of the two parts is equal to 3. The following characteristic features of the image can be seen from Figure 15. In the presence of a sharp drop of the emission, the potential step still looks sharp, with some distortion of the current density distribution. However, a notable shift of the step edge is observed toward the darker part of the object, which is characterized by a higher work function. In this example, the shift reaches 4 μm. The dashed curve in Figure 15 corresponds to the initial distribution of the emission current, with its jump at x = 0, when the effect of the microfield is neglected. As follows from the aforementioned examples, in these cases the image of a narrow strip will also look sharp, but a dark strip will look narrower than its actual size, and vice versa.
FIGURE 15. Current density distribution at the screen (line scan) in the presence of a jump of the emission current density in combination with a potential jump on the object. This behavior occurs if the adjacent areas have different work functions. The parameters of the calculation are given in the text.
2. Image Distortion of Two-Dimensional Structures
The previous examples illustrate the principle of the image distortion. For the sake of clarity, this section presents examples of image distortions for two-dimensional details on the object surface. Let us suppose that on the object surface there is a circular patch (microdot) that is negatively charged with respect to the rest of the surface. In this case, its apparent diameter on the screen will be smaller than in reality, and the current density inside the image will be increased. In general, this increase in brightness is highest at the perimeter of the circle, whereas the brightness is lower in its center, as shown in Figure 16. A marked dark ring is observed outside the bright edge. In a similar way, a positively charged microdot on the object is displayed darker than the adjacent areas, and its diameter appears larger than its actual value (Figure 17). A bright ring is observed outside the disk. For the calculation we assumed values of φ0 = 3 V, a = 1 μm, V0 = 20 kV, l = 10 mm, and the circle radius R0 = 10 μm. The potential distribution at the microdot boundary is described by the two-dimensional analog of Eq. (59).
Let us now suppose that there is a charged round spot on the object surface, the electric potential of which varies with the distance r from the center of the spot as
$$\varphi(r) = \frac{a\,\varphi_0}{\sqrt{r^{2} + a^{2}}}. \tag{63}$$
FIGURE 16. Current density distribution for a negatively charged microdot on the object surface. The radius of the circle R0 is taken equal to 10 μm. Other parameters are identical to those given for calculation of the strip shown in Figure 13.
FIGURE 17. Current density distribution for a positively charged microdot on the object surface. The radius of the circle R0 is taken equal to 10 μm. Other parameters are identical to those used for calculation in Figure 14.
FIGURE 18. Current density distribution (line scan) on the screen for a positively charged spot on the object surface. Curves 1 through 4 correspond to increasing potentials in the center of the spot.
Here, φ0 denotes the electric potential in the center of the spot, while a characterizes the spot size. Analytical calculations performed using this formula demonstrate that if the spot is charged positively, the current density in the center of the image decreases, and the spot looks dark on the microscope screen. Figure 18 shows the current density distribution calculated for this case for several values of the electric potential in the center of the spot. The image brightness in the center of the spot can be expressed in the following form:
$$\frac{j(0)}{j_0(0)} = \frac{1}{\left(1 + \dfrac{\varphi_0}{2aE_0}\right)^{2}}. \tag{64}$$
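Equation (64) is easy to evaluate directly. The sketch below is only an illustration: the function name `central_brightness_ratio` is hypothetical, and the values of a and E0 are the ones used for the calculations in this section (a = 1 μm, E0 = V0/l with V0 = 20 kV and l = 10 mm). A vanishing denominator is reported as an infinite ratio, which corresponds to the caustic of the paraxial formula.

```python
def central_brightness_ratio(phi0, a, e0):
    """j(0)/j0(0) from Eq. (64); phi0 in V, a in m, e0 in V/m."""
    denom = 1.0 + phi0 / (2.0 * a * e0)
    if denom == 0.0:
        return float("inf")   # caustic: the formula diverges at phi0 = -2*a*e0
    return 1.0 / denom ** 2


a = 1.0e-6                    # spot size parameter, m
e0 = 20.0e3 / 10.0e-3         # accelerating field V0/l, V/m
phi_ref = 2.0 * a * e0        # characteristic potential 2*a*E0, in V

ratio_positive = central_brightness_ratio(phi_ref, a, e0)         # darkened spot
ratio_negative = central_brightness_ratio(-0.5 * phi_ref, a, e0)  # brightened spot
ratio_caustic = central_brightness_ratio(-phi_ref, a, e0)         # diverges
```

The ratio depends only on the dimensionless quantity φ0/(2aE0), so a smaller spot or a weaker accelerating field makes the same potential produce a stronger contrast.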
Therefore, for example, when φ0 = 2aE0, the image brightness in the center of the spot decreases by a factor of 4 compared with the initial one. For the assumed values of E0 = 2 × 10^6 V/m and a = 1 μm, this happens at φ0 = 4 V. Such a spot may look dark in photoemission electron microscopy (PEEM) studies even if it emits more photoelectrons owing to a smaller photoelectric work function. On the contrary, if the spot is charged negatively with respect to the rest of the object surface, its image will appear brighter. The corresponding curves are shown in Figure 19 for different values of the electric potential at the center of the spot. When φ0 = −2aE0, a caustic appears in the center of the spot; that is, even if the electron emission from the center of the spot is reduced owing to a larger work function, this spot will appear as a bright point on the microscope screen. Emission electron microscopes are also used to investigate the emission distribution from the surface of thermocathodes. In this case, a variation in the thermionic emission current is also caused by a difference in the work function between different parts of the cathode, which inevitably leads to a contact potential difference between these parts. The local fields generated in this way redistribute the emission current at the screen, which changes the apparent dimensions of the emitting areas and thus gives erroneous values of the emission current density for the different parts of the cathode. As is already clear from the previous examples, it is a general trend that the image brightness of strongly emitting parts of the object is reduced with respect to the actual brightness, whereas the sizes of these parts are increased, and vice versa. This effect can also be calculated using the theory given previously.
FIGURE 19. Current density distribution (line scan) on the screen for a negatively charged spot on the object surface. Curves 1 through 3 correspond to increasing absolute values of the potential in the center of the spot.
B. Effect of the Object Microgeometry (Relief) on the Image
The effect of a geometric surface roughness (relief) on the image can be calculated with the formulae derived in Section II for the effect of local fields. This enables reconstruction of the geometric profile of the object surface. It also becomes possible to separate the effect of the object geometry from the local field effect if both are present. Let us assume that the electric potential at the object surface is constant and equal to zero. Owing to the strong accelerating field E0 above the object surface, the electric potential increases by ΔV = hE0 at a small height h above this surface. However, if there is a protruding region of height h on the object surface, the potential at this point will also equal zero. Therefore, in the first approximation, a roughness feature of height h is equivalent to a charged area on the flat surface with the electric potential φ = −hE0, because in this case the potential at the same point (at height h above the flat surface) will also be equal to zero. This allows us to relate the geometric and the potential patterns of the object surface by the formula
$$\varphi(x, y) = -h(x, y)\,E_0. \tag{65}$$
This relation is valid under the condition that
$$|\mathrm{grad}\,h(x, y)| \ll 1, \tag{66}$$
which in this case is equivalent to Eq. (5). Thus, to calculate the shift S(x, y) of electron trajectories under the effect of a geometric roughness at the object surface, we can use Eqs. (15) and (16) with Eq. (65) substituted for φ(x, y). Solving the inverse problem, we can calculate the shape of the geometric relief of the object from its image using Eqs. (26) and (27), in which the same Eq. (65) is substituted. Substituting Eq. (65) into Eq. (16), we obtain the formula for the shift S(x) in the presence of a surface roughness h(x) that depends on the single coordinate x:
$$S(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} h'(x-\xi)\,\ln\!\left(1+\frac{l^{2}}{\xi^{2}}\right)d\xi. \tag{67}$$
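A direct numerical evaluation of Eq. (67) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the original work: the function name `shift_from_relief`, the midpoint rule, the truncation of the integral, and the arctangent test profile with its parameter values are all choices made here. The closed-form value S(0) = (h0/π) ln(1 + l/a) used for comparison is our own check for this particular profile and does not appear in the text.

```python
import math


def shift_from_relief(x, h_prime, l, half_range, n):
    """Midpoint-rule estimate of Eq. (67):
    S(x) = 1/(2*pi) * integral of h'(x - xi) * ln(1 + l^2/xi^2) d(xi).

    The integral is truncated to |xi| <= half_range. With an even n the
    midpoints never hit the integrable logarithmic singularity at xi = 0.
    """
    step = 2.0 * half_range / n
    total = 0.0
    for k in range(n):
        xi = -half_range + (k + 0.5) * step
        total += h_prime(x - xi) * math.log(1.0 + (l / xi) ** 2)
    return total * step / (2.0 * math.pi)


# Smooth step h(x) = (h0/pi) * arctan(x/a); all lengths in the same unit.
# Illustrative values only: l is far smaller than a real cathode-anode gap.
h0, a, l = 1.0, 1.0, 10.0


def h_prime(u):
    return (h0 / math.pi) * a / (u * u + a * a)


s_center = shift_from_relief(0.0, h_prime, l, half_range=400.0, n=400_000)
s_closed_form = (h0 / math.pi) * math.log(1.0 + l / a)
```

The logarithmic kernel makes the shift grow only slowly with l/a, so for this profile the shift at the step center stays of the order of the step height.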
To solve the inverse problem of calculating the geometric profile of the object from the observed image for the same case, we substitute Eq. (65) into Eq. (27):
$$h'(x) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{S(x)-S(\xi)+(\xi-x)\,S'(x)}{(\xi-x)^{2}}\,d\xi. \tag{68}$$
Finally, the function h(x) is obtained by numerical integration. Comparison of Eqs. (67) and (68) with Eqs. (15) and (27) for the local fields allows us to separate the effect of surface geometry from that of surface potential. For that purpose, images of the same object should be obtained at two values of the accelerating voltage V0. Indeed, as is seen from Eq. (15), the shift S under the effect of the local fields is inversely proportional to the field strength E0, that is, to the accelerating voltage V0. At the same time, it follows from Eqs. (67) and (68) that the contrast depth caused by the surface geometry does not depend on the accelerating voltage. This fact, in principle, permits separation of the two effects.
C. Deterioration of the Resolving Power of an Emission Electron Microscope in the Presence of Local Fields and Object Roughness
The resolution of an emission electron microscope is impaired by the presence of geometrical roughness and electric or magnetic microfields on the object surface (Nepijko et al., 2000c, 2002e). This effect is illustrated in Figure 20. The object has a rough surface described by the function h = h(x, y). Electrons 1, 2, and 3, emitted at various angles from the point x = 0 of this surface, are accelerated by the uniform electric field E0 = V0/l of the immersion objective lens. Let us assume that the focusing microscope lens L is ideal, that is, all electrons starting from the same object point are focused to the same screen point E if the object surface is plane. This means that we neglect all aberrations of the acceleration field and the lens. However, the object roughness perturbs the electric field in the vicinity of the object, which in turn results in an additional shift S(x, y) on the screen. These shifts are different for electrons 1, 2, and 3 because the field perturbations differ along the trajectory of each electron.
Consequently, an enlarged spot of current density distribution j(x, y) is observed on the screen. Depending on the electron path, the object point x = 0 appears with a virtual shift. Cases in which electron trajectories cross each other are also possible and result in an additional deterioration of resolution.
FIGURE 20. Explanation of the contrast formation and resolution determination due to roughness features h(x) on the object surface. Other notations are given in the text.
Let us consider the motion of two electrons starting from the same point (x, y) of the object: one of them moves normally to the surface, and the second moves tangentially with energy eε in the direction of the x-axis (e is the electron charge, and ε is the voltage corresponding to this energy). In the following we assume a transmission of 1 (i.e., the electron beam is not restricted by a contrast aperture). If the second electron moves only under the action of the field E0, its trajectory is a parabola:
$$\Delta x = 2\sqrt{\frac{\varepsilon z}{E_0}}. \tag{69}$$
The gradient of the microfield perturbation is expanded in the series
$$\frac{\partial}{\partial x}V(x+\Delta x, y, z) = \frac{\partial}{\partial x}V(x, y, z) + \Delta x\,\frac{\partial^{2}}{\partial x^{2}}V(x, y, z) + \frac{\Delta x^{2}}{2}\,\frac{\partial^{3}}{\partial x^{3}}V(x, y, z) + \ldots \tag{70}$$
The first term of this series is the same for all electrons; therefore it does not result in a deterioration of resolution but only in a shift of the image of this point on the screen. The deterioration is due to the second and the following terms of the series. For example, substituting the second term into Eq. (14), we obtain
$$d_1(x, y) = -\frac{2\sqrt{\varepsilon}}{E_0^{3/2}}\int_{0}^{\infty}\sqrt{z}\,\frac{\partial^{2}}{\partial x^{2}}V(x, y, z)\,dz. \tag{71}$$
d1 denotes the basic reduction of resolution due to the first-order correction in the series expansion of Eq. (70). Generally, a roughness h(x, y) on the object surface produces a field perturbation in the vicinity of the object that is equivalent to the field of a planar object with the potential distribution φ(x, y) on its surface [see Eq. (65)]. The potential distribution V(x, y, z) above the object is determined by these boundary conditions as the solution of Dirichlet's problem for a half-space (Courant and Hilbert, 1989). This solution is substituted into Eq. (71). After a series of rearrangements of the integral, we obtain an additional dispersion spot that is equal to the difference in the electrons' shifts:
$$d_1(x, y) = \frac{[\Gamma(1/4)]^{2}\sqrt{\varepsilon}}{8\pi^{3/2}E_0^{3/2}}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\xi\,\dfrac{\partial}{\partial x}\varphi(x-\xi, y-\eta)}{(\xi^{2}+\eta^{2})^{5/4}}\,d\xi\,d\eta. \tag{72}$$
A similar expression can be obtained for the y-axis, and one can see that the resolution deterioration is anisotropic. Let us also consider the case, which is important for practical purposes, when the functions h and φ depend only on the coordinate x. Then the integration over η can be carried out in Eq. (72), and we obtain
$$d_1(x) = \frac{\sqrt{\varepsilon}}{\sqrt{2}\,E_0^{3/2}}\int_{-\infty}^{\infty}\frac{\varphi'(x-\xi)\,\mathrm{sgn}\,\xi}{\sqrt{|\xi|}}\,d\xi. \tag{73}$$
Equations (72) and (73) give the first-order solution of the problem of resolution deterioration under the action of electric microfields φ(x, y) on the object surface, caused, for example, by work function differences. In the case of a geometric roughness h(x, y), we substitute Eq. (65) into these equations and obtain
$$d_1(x, y) = -\frac{[\Gamma(1/4)]^{2}\sqrt{\varepsilon}}{8\pi^{3/2}\sqrt{E_0}}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\xi\,\dfrac{\partial}{\partial x}h(x-\xi, y-\eta)}{(\xi^{2}+\eta^{2})^{5/4}}\,d\xi\,d\eta \tag{74}$$
and
$$d_1(x) = -\sqrt{\frac{\varepsilon}{2E_0}}\int_{-\infty}^{\infty}\frac{h'(x-\xi)\,\mathrm{sgn}\,\xi}{\sqrt{|\xi|}}\,d\xi. \tag{75}$$
Integrating by parts, we can bring Eq. (75) to the form
$$d_1(x) = \frac{\sqrt{\varepsilon}}{2\sqrt{2E_0}}\int_{-\infty}^{\infty}\frac{h(x-\xi)-h(x)}{|\xi|^{3/2}}\,d\xi. \tag{76}$$
Let us consider some examples. Assume a sinusoidal height modulation of the form
$$h(x) = h_0\sin\frac{2\pi x}{a}. \tag{77}$$
In accordance with Eq. (75), the maximum resolution deterioration is
$$d_1 = \sqrt{2}\,\pi\sqrt{\frac{\varepsilon}{E_0 a}}\;h_0. \tag{78}$$
This value is twice as large as that calculated by Bertein (1953). The reason is that Eq. (78) was derived for focusing on the object surface, whereas Bertein (1953) calculated the resolution for the optimal defocusing of each surface microregion. The method developed in the present paper also permits calculation of the resolution in the case of arbitrary defocusing, and then the results coincide. However, under defocusing the resolution improvement on one distorted object microregion is accompanied by a corresponding resolution deterioration on other microregions without distortions. Another example is a geometric step of height h0 with a smooth transition of half-width a:
$$h(x) = \frac{h_0}{\pi}\arctan\frac{x}{a}. \tag{79}$$
Substitution in Eq. (75) gives
$$d_1(x) = \frac{h_0\sqrt{\varepsilon}}{2\sqrt{E_0}}\;\frac{\sqrt{\sqrt{x^{2}+a^{2}}-x}-\sqrt{\sqrt{x^{2}+a^{2}}+x}}{\sqrt{x^{2}+a^{2}}}. \tag{80}$$
The absolute value of this expression should be taken, because the sign of d1 only indicates crossing trajectories. Consider numerical examples. Let us assume ε = 0.5 V, V0 = 18 kV, l = 4 mm, and a = h0/2. Then for h0 equal to 0.2 nm, 3 nm, 0.3 μm, 1 μm, and 10 μm, the maximum value of d1 is 2.4 nm, 9.1 nm, 91.5 nm, 167 nm, and 0.53 μm, respectively. In these cases the basic resolution of the acceleration field calculated with the Brüche–Recknagel formula (Brüche, 1942; Recknagel, 1941) is given by
$$d_0 = \frac{\varepsilon}{E_0} = 110\ \mathrm{nm}. \tag{81}$$
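The numerical values quoted above can be checked with a short script. This is a sketch under stated assumptions: Eq. (80) is evaluated in the form |d1(x)| = h0 √ε |√(ρ − x) − √(ρ + x)| / (2 √E0 ρ) with ρ = √(x² + a²), the maximum is located by a simple grid search, and the function names `d1_step` and `max_d1` are hypothetical.

```python
import math

EPS = 0.5                  # initial energy expressed in volts
E0 = 18.0e3 / 4.0e-3       # accelerating field V0/l, in V/m


def d1_step(x, h0, a):
    """|d1(x)| from Eq. (80) for the smooth step of Eq. (79); lengths in metres."""
    rho = math.hypot(x, a)
    # max(..., 0.0) guards against tiny negative round-off under the roots
    bracket = math.sqrt(max(rho - x, 0.0)) - math.sqrt(max(rho + x, 0.0))
    return abs(h0 * math.sqrt(EPS) * bracket / (2.0 * math.sqrt(E0) * rho))


def max_d1(h0, samples=4001, span=10.0):
    """Largest |d1| on a grid covering -span*a ... +span*a, with a = h0/2."""
    a = h0 / 2.0
    xs = (a * span * (2.0 * k / (samples - 1) - 1.0) for k in range(samples))
    return max(d1_step(x, h0, a) for x in xs)


d0_nm = EPS / E0 * 1e9                            # Eq. (81), in nm
heights = [0.2e-9, 3e-9, 0.3e-6, 1e-6, 10e-6]     # step heights h0, in metres
maxima_nm = [max_d1(h0) * 1e9 for h0 in heights]  # to compare with the quoted list
```

For a = h0/2 the maximum of |d1| works out analytically to (1/2)√(ε h0/E0), reached at x = ±√3 a, so the deterioration grows only as the square root of the step height.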
It is clear from these examples that in this case (without restriction of the electron beam) the resolution deterioration is of little significance for steps a few atomic layers in height. In practice, values of d0 = 50 nm can be reached in threshold photoemission. Thus steps of the order of 100 nm or more will lead to a significant decrease in resolution. It may seem from Eq. (80) that the resolution is not deteriorated at x = 0. However, this is not correct, and the next term of the series expansion, Eq. (70), must be taken into account for the total resolution calculation. After substitution of this term in Eq. (71) and similar transformations, we obtain for two-dimensional microfields
$$d_2(x, y) = \frac{\varepsilon}{\pi E_0^{2}}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\xi\,\dfrac{\partial^{2}}{\partial x^{2}}\varphi(x-\xi, y-\eta)}{\xi^{2}+\eta^{2}}\,d\xi\,d\eta \tag{82}$$
and for one-dimensional microfields
$$d_2(x) = \frac{2\varepsilon}{E_0^{2}}\,\varphi'(x). \tag{83}$$
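The corresponding expression for a geometrical roughness in the two-dimensional case is given by
$$d_2(x, y) = -\frac{\varepsilon}{\pi E_0}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\xi\,\dfrac{\partial^{2}}{\partial x^{2}}h(x-\xi, y-\eta)}{\xi^{2}+\eta^{2}}\,d\xi\,d\eta \tag{84}$$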
and in the one-dimensional case
$$d_2(x) = -\frac{2\varepsilon}{E_0}\,h'(x). \tag{85}$$
Here, d2 denotes the reduction of resolution due to the second-order correction in the series expansion of Eq. (70). Consideration of this term becomes essential when d1 is close to zero. For example, for a step of the form (79) we obtain from Eq. (85)
$$d_2(x) = -\frac{2ah_0}{\pi(x^{2}+a^{2})}\,\frac{\varepsilon}{E_0}. \tag{86}$$
According to this equation, the maximum deterioration is obtained for the point x = 0, which follows from the general consideration. From the above-mentioned numerical data we have for all steps
$$|d_2(0)| = 140\ \mathrm{nm}. \tag{87}$$
Finally, we combine the joint action of d0, d1, and d2 using the following equation:
$$d = \sqrt{d_0^{2} + (d_1 + d_2)^{2}}. \tag{88}$$
The first term, d0, is independent of the electron trajectory and gives the base resolution of the acceleration field without restriction of the electron beam. Only the second term, (d1 + d2), describes the resolution deterioration. The sum of d1 and d2 is taken in Eq. (88) because the two contributions act together; they may partly compensate each other if they are opposite in sign. The values of d1 and d2 should be doubled to take into account that the initial tangential electron velocities point in all directions. The theoretical curves d1(x) and d2(x), as well as their joint action d(x), in units of d0, are presented in Figure 21 for the conditions ε = 0.3 V, V0 = 18 kV, l = 4 mm, a = 1 μm, and h0 = 2 μm. Results of the numerical simulation are shown as open squares. As seen from the figure, the direct calculation of trajectories (ray tracing) gives results that agree closely with the derived analytical formulae. Part of the difference can be explained by the fact that the third- and higher-order terms of the series in Eq. (70) should be taken into account for a more accurate calculation. However, the order of magnitude of the resolution deterioration caused by a corrugation on the object is correct. It is also clear from this figure that the resolution deteriorates somewhat asymmetrically relative to the step position x = 0.
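The combination rule of Eq. (88) can be illustrated for the smooth step of Eq. (79) with the parameters quoted for Figure 21. The sketch below rests on several assumptions that are not fixed by the text: d1 is taken from Eq. (80) with its sign, d2 from Eq. (86) with the sign that follows from φ = −hE0, and the doubling mentioned above is not applied. The function names are hypothetical, and the output is not claimed to reproduce the curves of Figure 21.

```python
import math


def total_resolution(d0, d1, d2):
    """Eq. (88): d = sqrt(d0^2 + (d1 + d2)^2); d1 and d2 enter with their signs."""
    return math.sqrt(d0 ** 2 + (d1 + d2) ** 2)


# Parameters quoted for Figure 21, in SI units.
eps = 0.3                      # initial energy expressed in volts
e0 = 18.0e3 / 4.0e-3           # accelerating field V0/l, V/m
a, h0 = 1.0e-6, 2.0e-6         # step half-width and height, m
d0 = eps / e0                  # Eq. (81)


def d1(x):
    """Eq. (80), signed."""
    rho = math.hypot(x, a)
    bracket = math.sqrt(max(rho - x, 0.0)) - math.sqrt(max(rho + x, 0.0))
    return h0 * math.sqrt(eps) * bracket / (2.0 * math.sqrt(e0) * rho)


def d2(x):
    """Eq. (86); the negative sign is an assumption of this sketch."""
    return -2.0 * a * h0 * eps / (math.pi * (x * x + a * a) * e0)


# Relative resolution d/d0 at three positions (in micrometres) across the step.
relative = {x_um: total_resolution(d0, d1(x_um * 1e-6), d2(x_um * 1e-6)) / d0
            for x_um in (-2.0, 0.0, 2.0)}
```

Because d1 is odd in x while d2 is even, the two contributions add on one side of the step and partly cancel on the other, which is the origin of the asymmetry about x = 0; which side is degraded more depends on the sign conventions assumed here.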
FIGURE 21. Comparison of the theoretical curves for the relative deterioration of resolution in an emission electron microscope with the results of a numerical simulation. The dashed curve (a) shows the first approximation, the dotted curve (b) demonstrates the second approximation, and the solid curve (c) gives the resulting relative resolution. The result of the numerical simulation is shown by open squares.
D. Numerical Simulation of Electron Trajectories for the Case of Equipotential Object With Roughness
The computer simulation shows the magnitude and character of the distortion of the electron trajectories caused by microcorrugations on the object surface. For this purpose, geometrical steps of different height and shape were modeled and treated in combination with the tetrode objective lens of a photoemission electron microscope (FOCUS IS-PEEM, see Swiech et al., 1997). The difficulty is to achieve a spatial resolution in the nanometer range and, at the same time, to treat the macroscopic objective lens, which forms its image at a distance of 80 mm. Therefore two regions were defined: one close to the sample, with a grid spacing down to 1 nm to model the field distribution around the step properly, and a second, macroscopic region containing the whole objective lens up to the first intermediate image plane. The two regions could be matched easily because at a certain distance from the sample surface (depending on the step height) there is only the homogeneous acceleration field defined by the potentials of the sample and the extractor electrode (Figure 22). At this interface, all relevant data of the trajectories (position, angle, and energy) were transferred from the fine-mesh region to the coarse-mesh region. The computer program used was SIMION 6.0, which is based on the Runge–Kutta method. Figure 22 illustrates the case of a rectangular step 1 μm in height. The equipotentials in this figure are equidistant with 5 V spacing; in addition, the contours close to the step at 0.1, 0.5, 1, and 2 V are shown. Groups of trajectories with starting angles from −8° to +8° with respect to the surface normal were calculated. These conditions are typical for high-resolution PEEM, although electrons are emitted within the full π/2 angle. The starting energy eε, length l, and acceleration voltage V0 were assumed to be 0.5 eV, 4 mm, and 18 kV, respectively.
It can clearly be seen that the trajectories in the vicinity of the step are asymmetrically distorted (they are inclined toward the lower terrace). This makes it possible to separate regions with higher and lower current density and leads to a nonmonotonic distribution of the current density in the image plane. To determine this distribution, an algorithm was applied that counts the number of trajectories falling into equally spaced intervals and displays these numbers in a histogram. The above-mentioned data were used in the calculation of the current density j(x) presented in Figure 23. The increment of the escape angles and the spacing of the starting points were chosen as 2° and 12.5 nm, respectively. Figure 23 shows the nonmonotonic character of the contrast from a rectangular-shaped step. The apparent step position, as observed in a photoemission electron microscope, is shifted toward the right (i.e., toward the lower terrace). It looks like a bright stripe adjacent to a wider dark one.
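The counting step described above can be sketched in a few lines of Python. This is only a minimal illustration and not the code used together with SIMION 6.0: the function name, the interval limits, and the sample landing positions are hypothetical, and the counts are returned without normalization.

```python
def current_density_histogram(landing_positions, x_min, x_max, n_bins):
    """Count trajectories landing in equally spaced intervals of the image plane.

    Returns the interval centers and the number of trajectories per interval;
    the counts are proportional to the current density j(x).
    """
    if n_bins <= 0 or x_max <= x_min:
        raise ValueError("need n_bins > 0 and x_max > x_min")
    width = (x_max - x_min) / n_bins
    counts = [0] * n_bins
    for x in landing_positions:
        if x_min <= x <= x_max:
            # Clamp the index so that x == x_max falls into the last interval.
            index = min(int((x - x_min) / width), n_bins - 1)
            counts[index] += 1
    centers = [x_min + (i + 0.5) * width for i in range(n_bins)]
    return centers, counts


# Hypothetical landing positions in arbitrary units, not simulation output.
centers, counts = current_density_histogram([0.1, 0.2, 0.3, 0.9], 0.0, 1.0, 4)
print(list(zip(centers, counts)))
```

Because the starting points are equidistant and the escape angles are sampled with a fixed increment, a uniform emitter would fill all intervals equally; deviations from a flat histogram then reflect the redistribution caused by the step.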
276
NEPIJKO ET AL.
FIGURE 22. Computer simulation of the trajectories in the vicinity of a step of 1 μm height. The groups of trajectories have starting angles between −8° and +8°, increment 2°. The upper part of the figure displays the trajectories at the sample. The lower part shows the image plane of the objective lens with a magnification of 25; for simplicity the image is shown non-inverted.
The dark stripe is on the side of the upper terrace. The inflection from maximum to minimum I of the curve of the current density distribution j(x) becomes less sharp as the step becomes higher. Using the Rayleigh criterion as shown in Figure 23 and taking into account that the magnification of the immersion objective is 25, the width d* of this inflection was 69, 195, and 500 nm for rectangular-shaped steps of height h = 0.2, 1, and 10 μm, respectively. The value of d* determined in this way is not identical with the line resolution. In images illustrating the resolving power of a microscope, the line resolution is the quantity most often used. Usually it is the minimum distance at which adjacent image details (lines) are separately visible. In this context the following question can arise: At which minimum distance are two steps still visualized separately? The computer simulation
FIGURE 23. Simulated current density distribution in the image plane (area denoted by dashed line in Figure 22) for a step of 1 μm height. The data points (circles) represent the histogram of trajectory numbers falling into equally spaced intervals. See text for explanations.
shows that the contrast from two steps becomes more complicated. When a second step facing the same side is added, a feature appears at the inflection from maximum to minimum of the curve of the current density distribution in the image plane. The smaller the distance between the steps, the narrower this feature is and the closer it lies to the maximum. In this manner the line resolution can be determined in every specific case, but a more precise definition should be given. The results of the calculation depend on whether adjacent steps point in the same direction or in opposite directions. In the latter case it is important whether they form a protrusion or a recess. In addition, the steps can differ in height or in other parameters.

The computer simulation also enables the point resolution to be determined. To do this, it suffices to calculate the photoelectron trajectories for only two starting points. In the image plane the current distribution of electrons emitted from one starting point is a bell-shaped curve. The two bell-shaped curves increasingly overlap as the distance between the starting points decreases. In the general case the total curve is characterized by two maxima. The minimal distance between two starting points at which these maxima can be separated using the Rayleigh criterion is the point resolution in a local region at the distance x from a step.

A natural question arises: How is it possible to improve the resolution of an emission electron microscope in the presence of local microfields or a microgeometry at the object surface? For that purpose, the following techniques can be used.
1. If the resolution is deteriorated by local fields on the object surface, it can be improved by increasing the accelerating field strength E0 above the object surface. This follows from Eq. (72): the spatial resolution is inversely proportional to E0^(3/2). If the deterioration is caused by a surface microgeometry, the effect of the field strength on the resolving power of the microscope is much weaker. As follows from Eq. (74), the spatial resolution in this case is inversely proportional to E0^(1/2).

2. The resolution can be improved by decreasing the chromatic aberration. This possibility is realized in the time-of-flight photoemission electron microscope (Nepijko et al., 2004). It is also possible to correct the chromatic and spherical aberrations using time-of-flight techniques (Schönhense and Spiecker, 2002) or an electron mirror (Rose and Preikszas, 1992, 1995; Veneklasen, 1991). The theoretical description of the aberration caused by the uniform field of the immersion lens was given by Liebl (1988). Calculations of the resolving power of the time-of-flight photoemission electron microscope (Nepijko et al., 2004) show that it can be improved significantly, by an order of magnitude and more, with respect to standard emission electron microscopes. To realize this possibility, it is necessary to use for image formation only the electrons emitted in a defined, narrow range of energies. In addition, it is necessary to restrict the width of the electron beam entering the electron optical column by means of a diaphragm, preferably placed in the crossover plane. In this case, it is possible to reach a theoretical resolving power of a few nanometers.

V. PRACTICAL APPLICATIONS OF MICROFIELD MEASUREMENT USING AN EMISSION ELECTRON MICROSCOPE

The theory described in Section II of the present paper, as well as some other techniques, was used for EEM measurements of local electric fields of different nature.
Fields of various objects to which a voltage was applied were measured. Local fields due to the contact potential difference arising from chemical reactions on the object surface were also investigated. Examples of these investigations are given below.

A. Measurement of Electric Fields (Potential) on an Object Under a Variable Applied Voltage. Microelectronics Application

Small metal particles and ensembles of particles (nanoparticle films) are a topic of high current interest. Their properties are essentially determined by the particle sizes and by effects caused by their interaction. For example, if the
concentration of metal particles is high enough, a tunnel current can pass through them when a voltage is applied (Borziak et al., 1981). These systems exhibit new physical properties in comparison with the bulk material. The passage of a tunnel current through a metal nanoparticle film is accompanied by electron and photon emission (Borzjak et al., 1965; Fedorovich et al., 2000; Nepijko et al., 2000a, 2002a). The electric field (potential) distribution in such films was investigated by using an emission electron microscope (Gloskovskii et al., 2004). The silver nanoparticle films were prepared and studied under ultrahigh vacuum (UHV) conditions. They were deposited on a glass substrate in a gap between two Ag electrodes. The electrodes had a thickness of 100 nm and a length of 2 mm and were separated by a gap 25 μm wide. A photoemission electron microscope (FOCUS IS-PEEM, see Swiech et al., 1997) was used to visualize the cluster film. Photoelectrons were excited by illumination with a high-pressure mercury UV lamp (hν = 4.9 eV).

Figure 24 shows a series of EEM images of a silver nanoparticle film at different voltages applied between the contacts. The left contact was grounded. A positive voltage Uf of 0 (a), 15 (b), 40 (c), and 50 V (d) was applied to the right contact. The dashed line in Figure 24a is drawn through the center of the silver nanoparticle film and is parallel to the contacts. When the voltage increases, the image of the nanoparticle film widens, its contrast becomes more pronounced, and its character changes. The widening results from a shift of the images of both contacts to the left, with the left contact shifting more strongly. At Uf = 15 V the silver nanoparticle film is visualized as two doubled alternating bright and dark stripes parallel to the contacts (Figure 24b). As the voltage increases (Uf = 40 V), the contrast between the bright and dark regions of the left stripe increases more strongly than that of the right one (Figure 24c).
Further increase of the voltage
FIGURE 24. Image series of a silver nanoparticle film in an emission electron microscope at voltages of Uf = 0 (a), 15 (b), 40 (c), and 50 V (d). The dashed line in (a) indicates the center of the nanoparticle film. The electrical potential distribution was determined along line AB. Arrows indicate the electron emission centers (see text).
(Uf = 50 V) results in the left band prevailing (Figure 24d) and in the contrast on the side of the right contact fading out. This reflects the fact that the microscope lenses were adjusted at Uf = 0 V (i.e., at the potential of the left contact). At a voltage of Uf = 40 V the silver nanoparticle film begins to eject electrons (arrows in Figure 24c). If the UV lamp is switched off, we observe only the contribution of these electrons. As seen in Figure 24c and d, the electron emission observed from the nanoparticle film has a local character. In this connection let us apply the concept of electron emission centers. According to Blessing and Pagnia (1982), an emission center is a particle emitting electrons.

The performed measurements allow us to restore the electrical field (potential) distribution in the silver nanoparticle film when a voltage is applied to it. This is illustrated for two voltages: Uf = 15 (I) and 40 V (II). The intensity profiles j(x) along the line AB in Figure 24a through c, shown by the solid curves (R), (I), and (II) in Figure 25a, were taken as input for the calculations. The curve of the photoemission current density distribution (intensity profile) at Uf = 0 V serves as reference (R). In the given case the intensity profile is close to a constant, and its value is taken as 1 arb. unit. At Uf = 15 V, the profile is deformed into a nonmonotonic curve with several distinct minima and maxima (I). At Uf = 40 V, the nanoparticle film just starts to eject electrons (II). The x-axis runs from point A to point B. The length of line AB exceeds the linear dimensions of the area in which the contrast caused by the nanoparticle film changes when the voltage is applied. An analysis shows that the areas under the three solid curves in Figure 25a are equal. This means that the total number of photoelectrons forming the image of the silver nanoparticle film remains unchanged at different voltages.
The variation of the image contrast is caused exclusively by a redistribution of the photoelectron density due to deflections of the trajectories by the electrical microfields. Contrast formation without restriction of the electron beam, as in the given case, requires the use of the approaches and formulae from Section II.A. Using the intensity profiles j(x) measured in the field-free case and in cases when a voltage is applied to the nanoparticle film as input in Eq. (22), the function S(x) describing the shift of the image elements at the microscope screen is found. From this dependence and Eq. (27), the electrical field distribution E(x) or the potential distribution φ(x) in the silver nanoparticle film is uniquely reconstructed (Krasnov, 1975). We refer to this case as the inverse problem. Another possibility lies in the use of Eq. (16) (i.e., the direct problem). In this case a distribution of the electrical field (or the potential) is chosen so that the initial curve of the intensity profile measured at Uf = 0 V is deformed until the best fit is achieved with the intensity profile curve measured at a given voltage.
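One way to picture how a shift function S(x) follows from a reference profile and a profile measured with the voltage applied is to compare their running integrals, using the observation that the area under the curves is conserved. The Python sketch below assumes that electrons are only displaced and never lost, so that J(x + S(x)) = J0(x) for the integral curves. It does not reproduce Eqs. (22) or (27) of this chapter, and all function names and sample data are illustrative.

```python
from bisect import bisect_left


def cumulative_integral(x, j):
    """Trapezoidal running integral J(x) of a sampled, non-negative profile j(x)."""
    total = [0.0]
    for i in range(1, len(x)):
        total.append(total[-1] + 0.5 * (j[i] + j[i - 1]) * (x[i] - x[i - 1]))
    return total


def position_at_level(x, integral, level):
    """Position where the non-decreasing integral curve reaches `level`."""
    i = bisect_left(integral, level)
    if i == 0:
        return x[0]
    if i >= len(integral):
        return x[-1]
    if integral[i] == integral[i - 1]:
        return x[i]
    t = (level - integral[i - 1]) / (integral[i] - integral[i - 1])
    return x[i - 1] + t * (x[i] - x[i - 1])


def image_shift(x, j_reference, j_measured):
    """Shift S at each object coordinate: horizontal distance between integral curves."""
    ref = cumulative_integral(x, j_reference)
    meas = cumulative_integral(x, j_measured)
    return [position_at_level(x, meas, level) - xi for xi, level in zip(x, ref)]


# Illustrative data: a bright band displaced by one sampling step.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
shifts = image_shift(x, [0.0, 1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0, 0.0])
print(shifts)
```

Because the construction relies on integrals of measured profiles, noise in j(x) and the choice of the integration interval both propagate into S(x), which is consistent with the sensitivity of the inverse problem discussed in this section.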
FIGURE 25. Photoemission intensity profiles along line AB in Figure 24a, b, and c are shown here in panel (a), in addition to corresponding curves of the electrical field (b) and potential (c) distribution. The voltage applied to the silver nanoparticle film amounts to Uf = 15 (I) and 40 V (II). Intensity profile at Uf = 0 V served as reference (R). Theoretical curves of the distribution of photoemission current density (intensity profiles) are shown by dashed curves (direct problem, see text).
It might seem that the inverse problem should always be preferable. However, this problem is ill-posed, as mentioned previously. In practice this manifests itself in the fact that the experimental current density profiles j(x) are always measured with certain errors, and the calculated curves of the electrical field E(x) (potential φ(x)) distribution are characterized by errors several times larger. A further disadvantage of this method is that the solution depends on the size of the area in which the profile j(x) was measured. This area may exceed the linear size of the area of the local field concentration on the object by many times. In practice, the function of the current density distribution j(x) can often be reliably measured on a limited interval only. In such cases it is worth giving preference to the solution of the direct problem. Nevertheless, this does not overcome the fact that the problem is ill-posed. Therefore, both methods of solving the problem give almost the same
accuracy of solution. If, as in our case, the accuracy of the measurement of j(x) can be estimated as 10%, then, in practice, this corresponds to an inaccuracy of approximately 30% in the determination of the potential distribution function φ(x). The results of the calculation of the electrical field (potential) distribution, obtained from the image in an emission electron microscope without restriction of the electron beam by the solution of the inverse and direct problems, are similar. The latter method is used in this section.

Figure 25b shows curves (I) and (II) of the electrical field distribution E(x) in the silver nanoparticle film at voltages Uf = 15 and 40 V, respectively. The solution of the direct problem gives the intensity profiles j(x) shown as dashed curves in Figure 25a. In these measurements and calculations the following parameters of the emission electron microscope were used: the accelerating voltage applied to the anode (extractor) V0 = 4 kV and the distance between the cathode (object) and anode (extractor) l = 2 mm. As can be seen, the dashed and solid curves are close to each other in both cases (I) and (II) (i.e., the theoretically calculated and experimentally measured intensity profiles j(x) agree quite well). In Figure 25c, curves (I) and (II) of the potential distribution φ(x) in the silver nanoparticle film at voltages Uf = 15 and 40 V, respectively, are shown. They are obtained by numerical integration of the corresponding curves E(x) in Figure 25b. As is evident from Figure 25c, the voltage drops in two areas of the silver nanoparticle film near the contacts. The drop is substantially larger in the left near-contact area. At Uf = 15 V, the voltage drop in the left and right near-contact areas equals 12 and 3 V, respectively.
As the voltage is increased to Uf = 40 V, the areas of the electrical field localization (the voltage drop) remain the same, but the nonuniformity of the voltage drop becomes more pronounced. Now the voltage drop in the left and right near-contact areas is 36 and 4 V, respectively. This result allows us to conclude that the voltage drop (the value of the electrical field) across the silver nanoparticle film is a nonlinear function of the applied voltage.

The emission center marked by the upper arrow in Figure 24c is located in the immediate vicinity of line AB. The electrical field distribution E(x) along this line is shown by curve (II) in Figure 25b. This allows the value of the electrical field near the marked emission center to be estimated. It amounts to nearly 2 × 10^6 V/m. Certainly, the electrical field on the emission center itself can be somewhat larger, but hardly by three orders of magnitude. This result is important because it rules out the field emission mechanism: an electrical field of 10^9 V/m is necessary to realize this type of emission (Müller, 1956). Rather, the electron emission from the nanoparticle film, which becomes very prominent in Figure 24d, is connected with tunneling conductance (Gloskovskii et al., 2004).
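The step from the field curves E(x) of Figure 25b to the potential curves of Figure 25c is a running numerical integration. A minimal Python sketch is given below. It assumes the common sign convention E = −dφ/dx, which may differ from the convention of Eq. (27), uses the trapezoidal rule, and works on made-up sample values rather than on the measured data.

```python
def potential_from_field(x, field, phi_start=0.0):
    """Running trapezoidal integral phi(x) = phi_start - integral of E dx.

    x in meters, field in V/m; the sign convention E = -d(phi)/dx is an assumption.
    """
    phi = [phi_start]
    for i in range(1, len(x)):
        step = 0.5 * (field[i] + field[i - 1]) * (x[i] - x[i - 1])
        phi.append(phi[-1] - step)
    return phi


def voltage_drop(phi, i_start, i_end):
    """Magnitude of the potential change between two sample indices."""
    return abs(phi[i_end] - phi[i_start])


# Made-up example: a uniform field of 2e6 V/m over 3 micrometers.
x = [0.0, 1.0e-6, 2.0e-6, 3.0e-6]
phi = potential_from_field(x, [2.0e6] * 4)
print(phi, voltage_drop(phi, 0, 3))
```

Evaluating `voltage_drop` separately over the left and right near-contact intervals would give the kind of partition of the applied voltage quoted in the text.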
B. Measurement of Electric Fields (Potential) Distribution at Different Emission Current Density From Various Object Areas

In most cases the image contrast on the microscope screen is formed by two factors: (1) the difference between the emission current density from different regions of the object, and (2) the redistribution of the current density on the screen caused by the action of local fields. In many cases the first factor is predominant, and usually only this one is taken into account for image interpretation. The second factor is often masked by the first one, which may hinder its detection and quantitative evaluation. For such cases, the developed theory has to be supplemented so that the influence of the emission density variation on the image contrast can be separated from the effect of the local fields. Below we describe such a separation technique based on the use of two different accelerating voltages at the extractor electrode. The technique was applied to calculate the potential distribution on the object surface for the case of helical structures formed in oxidation reactions as described by Mundschau et al. (1990) and Rotermund et al. (1991).

1. Method of Determination of Trajectory Shifts at Different Emission Current Density From the Object

In cases where the microfields cannot be switched off, the initial distribution function j0(x) of the emission current density on the object surface is unknown because of the redistribution of the current density on the screen. It is therefore necessary to separate the influence of variations of the initial emission current density on the image from that of the trajectory shift under the action of the local fields. This is achieved by comparing two images of the same object region obtained at different accelerating voltages of the cathode lens. It follows from Eqs. (14) through (16) that the shift S(x) is inversely related to the accelerating field strength E0.
Therefore, the trajectory shift will be different at different accelerating voltages V0, which makes it possible to separate these two factors. A quantitative determination of S can be made in the following way. It is necessary to obtain two images of the same object region at different accelerating voltages V01 and V02. Then the current density distribution profiles are taken on both images along the same line normal to the linear details of the image. Let us denote the obtained intensity curves by j1(x) and j2(x), respectively. These curves are then numerically integrated. Let us designate the corresponding integral curves as J1(x) and J2(x). In Figure 26 this procedure is explained schematically by plotting an assumed initial integral curve
FIGURE 26. The method for determination of the image shifts caused by the effect of the local fields based on the application of a variable accelerating voltage of the cathode lens of the microscope. See text for explanations.
$$J_0(x) = \int_{x_0}^{x} j_0(x)\,dx \tag{89}$$

in addition to the two resulting integral curves J1(x) and J2(x). The exact shape of J0(x) is unknown. We draw a horizontal straight line intersecting the three integral curves. The distances along this line measured from the initial curve J0(x) are equal to S1(x) and S2(x) according to Eq. (24). Because in practice we have access only to curves J1(x) and J2(x), we can measure only the difference (S2 − S1). However, it follows from Eqs. (14) through (16) that

$$\frac{S_2}{S_1} = \frac{E_{01}}{E_{02}} = \frac{V_{01}}{V_{02}} = k. \tag{90}$$

Thus, after measuring the difference (S2 − S1), we can calculate the values of the shift according to:

$$S_1 = \frac{S_2 - S_1}{k - 1}, \qquad S_2 = \frac{k\,(S_2 - S_1)}{k - 1}. \tag{91}$$

Simultaneously one can find the true x coordinate on the object:

$$x = x' - S_1 = x' - \frac{S_2 - S_1}{k - 1} = x'' - S_2 = x'' - \frac{k\,(S_2 - S_1)}{k - 1}. \tag{92}$$

Here, x′ = x + S1 and x″ = x + S2.
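The arithmetic of Eqs. (90) through (92) can be written as a short Python sketch. The function names are hypothetical, and the assignment of the two extractor voltages to the indices 1 and 2 in the example is chosen only for illustration.

```python
def separate_shifts(shift_difference, v01, v02):
    """Eqs. (90) and (91): recover S1 and S2 from the measured difference S2 - S1.

    shift_difference is (S2 - S1) in any length unit; v01 and v02 are the two
    accelerating voltages in the same unit, so that k = V01 / V02.
    """
    k = v01 / v02
    if k == 1.0:
        raise ValueError("the two accelerating voltages must differ")
    s1 = shift_difference / (k - 1.0)
    s2 = k * shift_difference / (k - 1.0)
    return s1, s2


def object_coordinate(x_image, shift):
    """Eq. (92): true coordinate on the object, x = x' - S1 (or x'' - S2)."""
    return x_image - shift


# Illustrative numbers: voltages of 19.6 and 16.0 kV, measured difference 0.45 units.
s1, s2 = separate_shifts(0.45, 19.6, 16.0)
print(s1, s2, object_coordinate(10.0, s1))
```

Note that the denominator k − 1 becomes small when the two voltages are close, so the measured difference (S2 − S1) is amplified together with its error; a larger voltage ratio makes the separation more robust.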
Once S1 and S2 are found, the electric field and potential distribution at the object surface are finally calculated using Eq. (27).

2. Measurement of Potentials for Catalytic Chemistry Applications

Investigations of electric fields arising from the contact potential difference that emerges due to chemical reactions on the object surface were carried out by Nepijko et al. (2003). Specifically, the catalytic reaction of CO oxidation on a Pt(110) single crystal surface was investigated (the substrate itself served as catalyst). The measurements were performed using a photoemission electron microscope (Engel et al., 1991), and the photoelectrons were excited by UV light from a deuterium discharge lamp. The photoemission electron microscope was installed in a UHV chamber with a base pressure < 1 × 10^−10 mbar. A feedback-stabilized gas inlet system was used to keep the partial pressures of CO and O2 constant during the experiments (Rotermund et al., 1991). The Pt(110) single crystal was cleaned by mild sputtering (1 keV Ar+, 5 min) and annealed to 700 K. The experiment was carried out at a specimen temperature of 470 K.

Figure 27 shows the reaction-diffusion patterns obtained in the photoemission electron microscope without restriction of the electron beam for two different accelerating voltages applied to the extractor electrode (19.6 and 16.0 kV). As seen in Figure 27, the reaction-diffusion patterns have the shape of evolving spirals (see also Graham et al., 1994; Mundschau et al., 1990; Rotermund et al., 1991). The oxygen-covered parts of the surface appear dark because of their higher work function, whereas the CO-covered areas with their lower work function appear gray in the PEEM image. The dark patterns with rectangular shape are the reference marks on the object
FIGURE 27. Images of spiral structures for catalytic oxidation of CO on Pt(110) taken at the temperature of 470 K in a photoemission electron microscope without restriction of the electron beam. The voltage applied to the extractor was equal to 19.6 (a) and 16.0 kV (b).
surface. They served for determination of the precise magnification and for localization of the measured region and were produced from Ti (a noncatalytic material for CO oxidation). Owing to the high work function of the oxygen-covered Ti layer, the UV radiation of the deuterium lamp does not lead to significant photoelectron emission. One of the lines along which intensity profiles were determined is indicated in Figure 27. This line AB is drawn nearly perpendicular to the loops of the spiral structure in the area where the helical structure remains unchanged during the time interval between the two measurements.

Curves a and b in Figure 28 show the current density distribution along line AB in Figure 27a and b, respectively. Curve (b) lies lower than curve (a) because the total image brightness was weaker for the lower accelerating voltage (the extractor voltage). As expected, the relative amplitude of the oscillations (relative to the average brightness of the same curve) is larger at the lower voltage. To separate the influence of the local fields from that of the difference of emission intensity from different areas (work function contrast), the trend of the potential was calculated just from the difference of these relative amplitudes. The processing of these curves was based on the method described in Section V.B.1, followed by the calculation of the potential distribution function using Eq. (33). Figure 28c gives the resulting surface potential distribution along the line AB in Figure 27. The front of the spreading oxidation reaction does not have the shape of a sharp jump; rather, the potential changes gradually in the adsorbed layer. The shape of the obtained curves is close to sinusoidal. Waves of this kind spread on the object surface in the tangential direction.
FIGURE 28. Line scans of the current density along line AB in Figure 27 with extractor voltages of 19.6 (a) and 16.0 kV (b). Calculated potential distribution on the object surface along this line (c).
Figure 28c thus shows the relative value of the potential variation (i.e., the work function difference) measured in volts.

3. Improvement of Accuracy in Electric Potential Measurement by Application of Statistical Methods

As mentioned, curves (a) and (b) in Figure 28 have a shape close to sinusoidal. In such cases it is possible to apply a statistical method that averages over several waves. This can significantly improve the precision of the evaluation. Let the potential distribution function take the following form:

$$\varphi(x) = \varphi_0 \sin\frac{2\pi x}{a}, \tag{93}$$

where a is the structure period. Upon substituting Eq. (93) into Eq. (34), we obtain

$$S(x) = \frac{\varphi_0}{E_0}\cos\frac{2\pi x}{a}. \tag{94}$$
In our case the image contrast is masked by the distribution of the photoemission current density, which is similar in shape to the potential distribution. So, let us parameterize the current density as

$$j_0(x) = A + B\sin\frac{2\pi x}{a}, \tag{95}$$

where A and B characterize the average density of the emission current and the amplitude of its variation, respectively. Using Eq. (22) along with Eqs. (94) and (95), one can obtain

$$j(x) = \frac{A + B\sin\dfrac{2\pi x}{a}}{1 - \dfrac{2\pi\varphi_0}{E_0 a}\sin\dfrac{2\pi x}{a}} \approx \left(A + B\sin\frac{2\pi x}{a}\right)\left(1 + \frac{2\pi\varphi_0}{E_0 a}\sin\frac{2\pi x}{a}\right), \tag{96}$$

because in our case the magnitude of 2πφ0/(E0 a) is far less than 1. After applying this equation twice for the accelerating microscope voltages V01 and V02 with consideration of Eq. (1) and choosing coordinates with the maximum and minimum current density, one arrives at the following expression:

$$\varphi_0 = \frac{a V_{01} V_{02}}{2\pi l\,(V_{02} - V_{01})}\left(\frac{j_{1\max} - j_{1\min}}{j_{1\max} + j_{1\min}} - \frac{j_{2\max} - j_{2\min}}{j_{2\max} + j_{2\min}}\right), \tag{97}$$

where j1max, j1min, j2max, and j2min are the maximum and minimum values of the current density on the line scan. They are taken for the accelerating microscope voltages V01 and V02.
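Eq. (97) lends itself to a direct numerical evaluation with averaging over several periods. The Python sketch below is an illustration under stated assumptions: the function names, the structure period, the cathode-extractor distance, and the current density extrema are hypothetical and are not the measured values of Figure 28.

```python
import math


def relative_contrast(j_max, j_min):
    """Relative amplitude (j_max - j_min) / (j_max + j_min) of one period."""
    return (j_max - j_min) / (j_max + j_min)


def potential_amplitude(a, l, v01, v02, extrema_1, extrema_2):
    """Eq. (97) with the relative contrasts averaged over several periods.

    a: structure period (m); l: cathode-extractor distance (m);
    v01, v02: accelerating voltages (V);
    extrema_1, extrema_2: lists of (j_max, j_min) pairs, one pair per period,
    read from the line scans taken at v01 and v02, respectively.
    """
    if v01 == v02:
        raise ValueError("the two accelerating voltages must differ")
    if not extrema_1 or not extrema_2:
        raise ValueError("at least one period is needed for each voltage")
    c1 = sum(relative_contrast(hi, lo) for hi, lo in extrema_1) / len(extrema_1)
    c2 = sum(relative_contrast(hi, lo) for hi, lo in extrema_2) / len(extrema_2)
    return a * v01 * v02 * (c1 - c2) / (2.0 * math.pi * l * (v02 - v01))


# Hypothetical extrema (arbitrary units) for three periods at each voltage.
scan_1 = [(1.107, 0.894), (1.110, 0.896), (1.105, 0.893)]
scan_2 = [(1.020, 0.821), (1.018, 0.820), (1.022, 0.823)]
print(potential_amplitude(20e-6, 2e-3, 19600.0, 16000.0, scan_1, scan_2))
```

Since Eq. (97) is linear in the two relative contrasts, averaging the contrasts before inserting them gives the same result as averaging the values of φ0 obtained period by period.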
Equation (97) can be used for averaging the required value of φ0 over several periods of the waves in order to decrease the experimental error. This procedure gave a potential difference of 0.45 V between the bright and dark regions on the object surface. Another conclusion can be drawn from Eq. (97). If the relative amplitude of the current density oscillations in the image is larger at the smaller accelerating microscope voltage, then the object surface potential has the more positive value in the regions with the highest current density, and vice versa. Application of this rule to the object being investigated shows that the potential is more positive in the regions with the higher brightness.

C. Other Possibilities for Electric Potential Imaging with an Emission Electron Microscope

The intensities in threshold and multiphoton electron emission are sensitive to the work function Φ of the investigated surface area. Indeed, the width of the energy spectrum is equal to nhν − Φ (where n is the number of photons absorbed). Therefore the area under the spectral curve (photoemission intensity) changes with the work function variation. The photoemission current for fixed photon energy hν (in the case of a high-pressure mercury or deuterium UV lamp, an interference filter can be used), work function Φ, and temperature T was first explained by Fowler (1931). It is assumed that the photocurrent is proportional to the number of occupied states for hν > Φ. This number can be assumed to be constant for simple metals (e.g., Cu, Ag, Au, or alkalis) with only s-p bands in the region close to EF if the photon energy exceeds the work function only slightly (< 2 eV). In practice, it has to be lower than the energy needed to excite emission from the d bands. After integration over all directions in the half space above the sample, the total photocurrent is given by

$$j(\nu, \Phi, T) = A T^2 f\!\left(\frac{h\nu - \Phi}{kT}\right), \tag{98}$$

where A and f(x) are the Fowler constant and function, respectively.
The electric field E0 generated by the extractor shifts the threshold by lowering the solid-to-vacuum potential barrier (Schottky effect). Eq. (98) then remains valid if the argument x of f(x) is replaced by

$$x' = \left(h\nu - \Phi + \sqrt{eE_0/4\pi\varepsilon_0}\right)/kT. \tag{99}$$

Here, e is the elementary charge and ε0 is the vacuum dielectric constant. The contribution of the last term in Eq. (99) can be simply taken into account
because the geometry of the immersion objective of an emission electron microscope and the voltage applied to the extractor are always known. Because V0 ≤ 20 kV and l = 2 mm, the largest electric field for a flat surface equals 10^7 V/m; see Eq. (1). This corresponds to a lowering of the threshold by 120 mV.

The adsorption of atoms and molecules also results in changes ΔΦ of the work function caused by an additional dipole moment. The coverage dependence of these changes is usually described by the Topping formula (Topping, 1927)

$$\Delta\Phi = \frac{4\pi p_0 N}{1 + 9\alpha N^{3/2}}. \tag{100}$$

Here, p0 is the initial dipole moment of the adsorbate at coverage close to zero, N is the number of adsorbed particles per unit area, and α is their effective polarizability (Ertl and Küppers, 1985; Hölzl and Schulze, 1979). The initial change of the work function is proportional to the change of the coverage (for N → 0 the right-hand side of Eq. (100) reduces to 4πp0N).

Figure 29a shows the PEEM image of a Ti sample, where the crystallites 1–7 with different brightness are clearly seen (von Przychowski et al., 2004). This experiment was performed using a PEEM instrument (Marx et al., 1994) with an electrostatic tetrode lens (Chmelik et al., 1989) installed in a UHV chamber. Oxygen could be dosed using a leak valve. An initially nearly amorphous Ti plate was mechanically polished and heated for several hours at 970 K until crystallites with sizes of the order of 100 μm became visible. The carbon contamination was completely removed by Ar+ ion bombardment and heating cycles, whereas a small amount of sulfur remained at the surface. Care was taken during the cleaning cycles that the temperature did not exceed the hcp-bcc transition temperature at 1150 K.
FIGURE 29. Series of PEEM images taken during oxygen adsorption on polycrystalline Ti. The oxygen exposure is 0 (a), 10 (b), and 20 L (c). The area marked with a square in (c) is shown in more detail in (d). To determine the contact potential difference between neighboring crystallites, the profiles of the current density j(x) measured along line AB were used. Different Ti crystallites are marked by numbers 1 through 7 in panel (a).
Figure 29a through c shows the images taken during the oxygen adsorption series at exposures of 0 (a), 10 (b), and 20 L (c). A comparison of the image of the clean sample (a) with the one taken at high oxygen exposure (c) reveals a reversal of the contrast. The different crystallites exhibit strongly different behavior during oxygen adsorption. More detailed information can be gained from the local brightness recorded during the oxygen adsorption, as shown in Figure 30a. The brightness of crystallite 1 responds much faster to adsorbed oxygen than that of the remaining crystallites. The reversal of the contrast between crystallite 1 and the other crystallites appears at the points where the corresponding brightness curve successively crosses those of the other crystallites. The same phenomenon also occurs for the other crystallites, which leads to contrast reversal between crystallites 2 and 3 or 5 and 6. At high exposure, the brightness becomes nearly equal for all crystallites. This may point to the onset of bulk oxidation. By processing the experimental curves shown in Figure 30a with the help of Eqs. (98) through (100), the work function variation ΔΦ as a function of coverage was obtained for different Ti crystallites (i.e., different crystallographic faces) (Figure 30b; see von Przychowski et al., 2004).
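The link between brightness and work function expressed by Eqs. (98) and (99) can be illustrated with a short Python sketch. It evaluates the Fowler function by its standard series expansion and the Schottky term of Eq. (99) in electron volts. The function names, the number of series terms, the Fowler constant of 1, and the sample work functions are assumptions made for illustration; the sketch is not the evaluation procedure of von Przychowski et al. (2004).

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19     # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
BOLTZMANN_EV = 8.617333262e-5           # eV/K


def schottky_lowering_ev(field):
    """Last term of Eq. (99) in eV for an extraction field given in V/m."""
    return math.sqrt(ELEMENTARY_CHARGE * field / (4.0 * math.pi * VACUUM_PERMITTIVITY))


def fowler_function(x, terms=200):
    """Fowler function f(x), evaluated with a truncated alternating series."""
    if x <= 0.0:
        return sum((-1) ** (n + 1) * math.exp(n * x) / n ** 2
                   for n in range(1, terms + 1))
    tail = sum((-1) ** (n + 1) * math.exp(-n * x) / n ** 2
               for n in range(1, terms + 1))
    return math.pi ** 2 / 6.0 + 0.5 * x * x - tail


def photocurrent(hv, work_function, temperature, field=0.0, fowler_constant=1.0):
    """Eq. (98) with the argument of Eq. (99); energies in eV, temperature in K."""
    x = (hv - work_function + schottky_lowering_ev(field)) / (BOLTZMANN_EV * temperature)
    return fowler_constant * temperature ** 2 * fowler_function(x)


# Threshold lowering for the largest field quoted in the text (1e7 V/m).
print(schottky_lowering_ev(1.0e7))
# Relative brightness for two hypothetical work functions at hv = 4.9 eV, T = 300 K.
print(photocurrent(4.9, 4.5, 300.0, 1.0e7) / photocurrent(4.9, 4.3, 300.0, 1.0e7))
```

Inverting such a brightness curve for Φ at each exposure would give a work function change that can then be compared with the coverage dependence of Eq. (100), provided the constant A does not change during adsorption.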
FIGURE 30. Image brightness (a) and calculated work function difference (b) for numbered areas (see Figure 29a) versus oxygen exposure. The curves for crystallites 1 and 4, as well as 2 and 7, coincide. The curves corresponding to crystallites 1 (4), 2 (7), 3, 5, and 6 are marked by (), (△), (▼), (◇), and (◀), respectively.
FIGURE 31. Current density profiles j(x) measured along the line AB in Figure 29d (solid line) and determined by solving the direct problem (dashed line). They show a good overlap for a contact potential of 0.2 V between crystallites 1 and 2 (7). The parameters used in the calculation are given in the text.
Coincidence of the curves for crystallites 1 and 4, as well as 2 and 7, in Figure 30a, b points to the identical orientation of the corresponding crystallites. The described procedure is important for express (routine) measurements preceded by specific (reference) potential measurements. When the adsorbate chemically interacts with the substrate, the electronic structure changes together with the material-specific constant A in Eq. (98). Independent measurements of the electric potential can be carried out, for example, as set forth in Section V.B. As an example, Figure 29d shows the area marked with a square in Figure 29c in more detail. It demonstrates the nonmonotonic character of the image contrast at the boundary of crystallites 1 and 2 (7) at an exposure of 20 L. The absolute value of the potential difference (work function difference) of crystallites 1 and 2 (7) can be calculated from the profile of the intensity curve j(x) along line AB in Figure 29d. By solving the direct problem (with the PEEM parameters V0 = 10 kV and l = 2.5 mm), one obtains a contact potential difference of 0.2 V. In this case, the calculated and experimental profiles, plotted respectively in Figure 31 with dashed and solid lines, have an optimal overlap. Let us note here that, as seen in Figure 30b, the potential difference between crystallites 1 and 2 (7) after exposure to 20 L oxygen is 0.15 V. Thus, a treatment of the PEEM images by means of the two procedures gives similar results.

VI. MEASUREMENT OF OBJECT SURFACE GEOMETRY (RELIEF) WITH AN EMISSION ELECTRON MICROSCOPE
In Section IV we derived Eqs. (67) and (68), which allow us to calculate the image contrast resulting from the effect of a surface microgeometry and to solve the inverse problem of surface geometry calculation from the image.
These formulae are applicable under the condition in Eq. (66) (i.e., for a sufficiently smooth geometry). However, cases can occur when the geometrical microstructures do not satisfy this condition. For example, this occurs in the case of small three-dimensional particles on a flat object surface. Then the problem is not only to evaluate the actual size of such particles, but also to find the possible contact potential difference between the substrate and the particle (Nepijko et al., 2000b). The approach is important for the interpretation of photoemission images of nanoparticles, for example, after pulsed-laser excitation (Cinchetti et al., 2004; Fecher et al., 2002; Schmidt et al., 2002).

A. Imaging of Small Three-Dimensional Particles on the Object Surface

Let us assume small-size spherical particles on a flat object surface being investigated with an emission electron microscope. We calculate the image contrast for a spherule on a flat object surface. The derived formulae will help us to evaluate the size of these spherules from their image despite a strong deflection of the electron trajectories and the subsequent strong distortion of the apparent dimension of the spherule image. Let d be the spherule diameter. At the beginning, let us also suppose that there is no contact potential difference (CPD) between the spherule and the substrate, so they are equipotential. It is known that when a free conducting spherule of diameter d is placed in a uniform electric field E0, it generates an electric dipole field with a moment (Govorkov and Kupalyan, 1970)

$$p = \frac{1}{2}\pi\varepsilon_0 d^3 E_0. \tag{101}$$
Here, ε0 is the vacuum dielectric constant. However, the total field pattern becomes more complicated for a deposited spherule due to electrostatic reflection from the substrate surface caused by the mirror charge of the dipole. The uniform field E0 is supplemented by a combined field generated by the negatively charged spherule and its positively charged mirror image. Although the total field pattern of such a system is more complicated, it can also be represented with great accuracy as a dipole field with a different electric moment. The electric potential of the modified dipole field superimposed with the uniform field E0 is expressed by the following formula (Govorkov and Kupalyan, 1970):

$$\varphi(r, z) = E_0 z + \frac{p}{4\pi\varepsilon_0}\,\frac{z}{(r^2 + z^2)^{3/2}}. \tag{102}$$
Here, r is the distance from the z-axis. The field components along the direction of the r- and z-axes will be equal to

$$E_r = -\frac{3p}{4\pi\varepsilon_0}\,\frac{rz}{(r^2 + z^2)^{5/2}} \tag{103}$$

and

$$E_z = E_0 - \frac{p}{4\pi\varepsilon_0}\,\frac{2z^2 - r^2}{(r^2 + z^2)^{5/2}}. \tag{104}$$
The electric dipole moment p of such a system can be determined using the condition that in reality, disregarding a possible CPD (this effect and the way to measure it will be considered below), the electric potential at the top point of the spherule is equal to zero. From this condition using Eq. (102) we obtain:

$$p = -4\pi\varepsilon_0 d^3 E_0. \tag{105}$$
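The step from Eq. (102) to Eq. (105) can be written out explicitly. The following is only a sketch: it assumes that the top point of the spherule lies on the axis at r = 0, z = d, and the symbol p_free is introduced here solely to denote the free-spherule moment of Eq. (101) for comparison.

```latex
\begin{align*}
\varphi(0, d) &= E_0 d + \frac{p}{4\pi\varepsilon_0}\,\frac{d}{(d^2)^{3/2}}
               = E_0 d + \frac{p}{4\pi\varepsilon_0 d^2} = 0 \\
\Rightarrow\quad p &= -4\pi\varepsilon_0 d^3 E_0 , \qquad
\frac{|p|}{p_{\mathrm{free}}}
  = \frac{4\pi\varepsilon_0 d^3 E_0}{\tfrac{1}{2}\pi\varepsilon_0 d^3 E_0} = 8 .
\end{align*}
```

Under these assumptions the modulus of the equivalent moment of the deposited spherule is eight times that of the same spherule in free space.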
Here, the minus sign indicates the direction of the dipole moment in the explicit form, which results from the negative charge of the spherule in the accelerating field. The same electric moment is characteristic of the dipole with a finite distance between the charges if these charges are placed in the centers of the real spherule and its symmetrical mirror image. In certain cases it is worthwhile to operate the emission electron microscope not only in the focused mode, but also at a defined defocusing h relative to the focused mode. Here, h denotes the distance between the plane of the virtual image (K′ in Figure 1) formed by the uniform accelerating field and the plane focused by the objective lens on the screen. The virtual object plane is located at a distance l behind the physical object plane. The cases of sharp image focusing and defocusing are separately considered below. The image contrast at sharp focusing will be referred to as the basic contrast, while at image defocusing it will be referred to as the defocusing contrast.

1. Imaging of Spherical Particle Under Sharp Focusing Without Contact Potential Difference

For the basic image contrast, the electron shift at the screen S is calculated by Eq. (14). Upon substitution of the expression for a potential derivative according to Eq. (103) into this formula and making some calculations, we obtain

$$S(r) = \frac{p}{4\pi\varepsilon_0 E_0}\,\frac{1}{r^2}. \tag{106}$$
This is the electron trajectory shift at the screen for electrons escaping from the object surface with zero initial velocity. According to this formula, when r → 0, the shift becomes infinitely large, but this formula is not applicable when r < d/2 because in this range the electrons are emitted directly from the spherule. Therefore the electron emission from the spherule is to be calculated independently. The problem is additionally complicated due to the shadow of the spherule formed in the case of an inclined incidence of primary particles initiating the electron emission from the object, as well as due to different emission intensity from different areas of the spherule caused by different incident angles of the primary particles. These effects result in changes in the central part of the spherule image, although such changes are not significant for further considerations. For the case of circular symmetry, the current density redistribution at the screen of the emission electron microscope without beam restriction can be calculated from the following considerations. Let j0(r) be the initial current density at the screen in the absence of the spherule. Due to the perturbation by the spherule, electrons starting at the distance r from the center will be shifted in the radial direction by the distance S(r). For infinitely narrow rings with radius ranging from r to r + dr, the electron flux conservation equation can be written in the following form:

$$2\pi j_0\, r\, dr = 2\pi\, j(r + S)\,(r + S)\, d(r + S). \tag{107}$$

Here, j(r + S) refers to the new current density at the distance r + S from the center. Hence we obtain

$$j(r + S) = \frac{j_0(r)}{(1 + S/r)(1 + dS/dr)}. \tag{108}$$
For the sake of simplicity, suppose the initial emission density j0(r) from the object surface to be constant. The numerically calculated electron trajectories near the object surface in the presence of a spherule are shown in Figure 32. The calculation was made for the following parameter values: V0 = 18 kV, l = 4 mm, and d = 20 nm. Here, the initial electron velocity was taken as directed normally to the object surface and corresponding to an energy of 0.25 eV. As can be seen from the figure, near the spherule the electrons are deflected back to the surface by the local field. With increasing they are deflected less strongly according to Eq. (14). From Figure 32, one could conclude that the spherule at the screen of the emission electron microscope is displayed as a black spot. However, this is not true. In the case of the focused image, the microscope lenses display at the screen not the current density existing at some height above the object surface, but the current density distribution at the virtual image in
FIGURE 32. Electron trajectories in the vicinity of a spherule with diameter d on a planar object surface in the emission electron microscope. The parameters of the calculation are given in the text.
the plane of the virtual cathode K′ located at the distance l behind the real object plane (see Figure 1). To obtain the current density distribution patterns at the screen, it is necessary to consider the tangents to the electron trajectories at their exit from the uniform acceleration field and draw them to this virtual image plane. Should these tangents intersect, this means that a caustic will be observed at the screen (i.e., a bright spot is observed instead of a dark one). This seemingly paradoxical conclusion is due to the fact that, in reality, the sign of the electric dipole moment p of the spherule is negative. Therefore, three-dimensional particles at the object surface will appear not as dark, but as bright spots. This is experimentally confirmed. If there is no CPD between the spherule and the substrate, we substitute Eq. (105) for the dipole moment of the spherule into Eq. (106) and obtain the simple expression

$$S = -\frac{d^3}{r^2}. \tag{109}$$
The current density distribution pattern at the screen for this case is shown in Figure 33. A bright spot with maximum intensity in the center is seen at the place of the spherule. Theoretically, the current density in the center is infinitely large if the finite resolution power of the microscope is neglected. According to additional numerical processing of this curve, the diameter of this bright spot is equal to the diameter of the spherule at the current density level
FIGURE 33. Current density distribution on the screen of an emission electron microscope when imaging a spherule of diameter d in the focused regime. The diameter can be determined by the current density level j* = 1.106 j0.
$$j^{*} = 1.106\, j_0. \tag{110}$$
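The level quoted in Eq. (110) can be reproduced from Eqs. (108) and (109). The sketch below assumes a constant j0, takes the shift as S(r) = −d³/r² (the sign that follows from the negative dipole moment of Eq. (105)), and interprets the spot diameter as twice the image radius r + S; the function names are illustrative only and do not come from the original work.

```python
def focused_profile(u):
    """Return (image radius / d, j / j0) for an electron starting at r = u * d.

    Assumes S(r) = -d**3 / r**2 and constant j0, so that
    1 + S/r = 1 - u**-3 and 1 + dS/dr = 1 + 2 * u**-3 in Eq. (108).
    """
    rho_over_d = u - 1.0 / u**2                      # (r + S) / d
    j_over_j0 = 1.0 / ((1.0 - u**-3) * (1.0 + 2.0 * u**-3))
    return rho_over_d, j_over_j0


def start_radius_for_image_radius(rho_over_d, lo=1.0 + 1e-12, hi=100.0):
    """Bisection for u = r/d whose image radius equals rho_over_d.

    The image radius u - u**-2 increases monotonically with u for u > 0,
    so a simple bisection is sufficient.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if focused_profile(mid)[0] < rho_over_d:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


u_edge = start_radius_for_image_radius(0.5)          # image radius equal to d/2
level = focused_profile(u_edge)[1]
print(f"j*/j0 at image radius d/2: {level:.3f}")
```

With these assumptions the current density at the image radius d/2 comes out close to 1.106 j0, and j/j0 stays below 1 at larger radii, in line with the flux deficit described for Figure 33.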
In other words, we can evaluate the diameter of the spherule at the object surface if we measure the intensity (current density) profile through its image and trace here the specified brightness level. Note that the electron flux focused into the maximum is missing at higher r/d, and only at r/d > 6 does the flux density j/j0 gradually reach the value of 1. In order to eliminate the CPD, it is recommended to deposit on the object surface a thin conducting layer, which does not have a significant effect on the size of the image details.

2. Imaging of Spherical Particle Under Sharp Focusing with Additional Contact Potential Difference

If there is a CPD ΔΦ between the spherule and the substrate, the dipole moment p of the spherule is changed, resulting in subsequent changes of the apparent image size at the screen. In this case, the dipole moment of the spherule should be taken equal to

$$p = -4\pi\varepsilon_0 d^2 (E_0 d + \Delta\Phi). \tag{111}$$
Here, the sign of the dipole moment is already taken into account. If the CPD causes the spherule to be negatively charged with respect to the substrate, the modulus of the dipole moment increases while its sign remains negative. The spherule is displayed as a bright spot, but the size of this spot is increased as compared with the case ΔΦ = 0. To calculate the image, we must substitute Eq. (111) in Eq. (106). We then obtain:

$$S(r) = -d^2\left(d + \frac{\Delta\Phi}{E_0}\right)\frac{1}{r^2}. \tag{112}$$
Unlike in Eq. (109), here the field strength E0 enters into the shift function. In this case, a combined value depending on the spherule diameter and on the CPD value will be measured if the spot diameter is measured at the same level of brightness according to Eq. (110):

$$D = \sqrt[3]{d^2\left(d + \frac{\Delta\Phi}{E_0}\right)}. \tag{113}$$

Should the spherule be charged positively with regard to the substrate, the result will depend on the CPD value. As long as the CPD value is small enough, that is,

$$|\Delta\Phi| < E_0 d, \tag{114}$$

the image pattern is qualitatively retained, but the visible size of the bright spot decreases until it reaches zero for the case of the equality. Then, only secondary effects remain, which are connected with the shadow from the spherule and a weak inhomogeneity of the emission in the vicinity of the spherule. However, provided that

$$|\Delta\Phi| > E_0 d, \tag{115}$$

the image is inverted and looks like a dark spot. This spot is surrounded by a bright rim. The rotational symmetry of the pattern may be slightly distorted because of the shadow of the primary particles' beam caused by the spherule. The dark spot diameter can be determined from the condition that in this case the caustic is observed when the following equation is satisfied:

$$1 + \frac{dS}{dr} = 0. \tag{116}$$

On substituting Eq. (112) in Eq. (116), we obtain the dark spot diameter

$$D = 2.52\,\sqrt[3]{-d^2\left(d + \frac{\Delta\Phi}{E_0}\right)}. \tag{117}$$

The current density distribution pattern at the screen for this case is shown in Figure 34. Here we also did not take into account a possible image blooming due to the finite resolving power of the microscope.
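The three regimes of Eqs. (113) through (117) can be collected in one small routine. This is a sketch under stated assumptions: ΔΦ is counted positive when the spherule is charged negatively with respect to the substrate (the convention of Eq. (111)), the shift is S(r) = −c/r² with c = d²(d + ΔΦ/E0), and the name `apparent_spot` is hypothetical.

```python
def apparent_spot(d, E0, dphi):
    """Classify the focused image of a spherule and return (kind, diameter).

    d    : spherule diameter (m)
    E0   : accelerating field strength V0 / l (V/m)
    dphi : contact potential difference (V), positive when the spherule is
           charged negatively with respect to the substrate (assumed convention)
    """
    c = d**2 * (d + dphi / E0)            # S(r) = -c / r**2, Eq. (112)
    if c > 0:
        # Bright spot; diameter read at the level j* = 1.106 j0, Eq. (113).
        return "bright", c ** (1.0 / 3.0)
    if c < 0:
        # Contrast reversal: 1 + dS/dr = 0 gives r**3 = -2c, Eq. (116).
        r_caustic = (-2.0 * c) ** (1.0 / 3.0)
        return "dark", 2.0 * r_caustic    # = 2.52 * (-c)**(1/3), Eq. (117)
    return "none", 0.0                    # equality case between Eqs. (114) and (115)


E0 = 18e3 / 4e-3        # V0 = 18 kV and l = 4 mm, the values quoted in the text
d = 10e-9               # illustrative spherule diameter of 10 nm
for dphi in (0.0, 0.02, -0.02, -0.09):
    kind, D = apparent_spot(d, E0, dphi)
    print(f"dphi = {dphi:+.2f} V: {kind} spot, D/d = {D / d:.3f}")
```

For these parameters E0·d = 0.045 V, so the last value of ΔΦ in the loop lies beyond the reversal threshold and yields a dark spot, whereas the first three give bright spots of differing size.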
FIGURE 34. Current density distribution on the screen of an emission electron microscope when imaging a spherule of diameter d in the case of image contrast reversal due to a CPD.
3. Measurements of Spherical Particle Parameters in the Case of Sharp Focusing

Let us derive some formulae that allow us to determine the spherule diameter and CPD value in the case of sharp focusing. Similar to the case of defocusing, two techniques can be applied here.

1. The first technique consists of the following. First, the bright spot diameter d1 is measured at the image in the presence of CPD, and then a thin conducting layer is deposited on the object surface to prevent CPD formation. After that, the geometrical size of the spherule d is evaluated. Both diameters are measured at the same brightness level according to Eq. (110). Equation (113) gives the following expression for the calculation of the CPD value:

$$\Delta\Phi = \frac{E_0}{d^2}\left(d_1^3 - d^3\right). \tag{118}$$
The following arguments can be applied to determine the sign of the CPD. If the spherule is charged negatively with respect to the substrate, its dipole moment increases, which results in an increase of the spot at the screen. Therefore, if the spot decreases in diameter when the CPD is eliminated, it follows that the spherule was charged negatively, and vice versa. However, if the spherule was charged positively with respect to the substrate, this might result not only in a decrease of the visible size of the spherule image at the screen, but also in an image contrast reversal. For example, at the aforementioned parameters of the emission microscope and a spherule diameter of 10 nm, this will occur when the positive potential of the spherule relative to the substrate just exceeds 0.045 V. Thus, if the image contrast is reversed after deposition of the CPD-preventing conducting layer on the object surface, the spherule is charged positively.

2. Finally, we derive formulae for a determination of the diameter and CPD of a spherule placed on the surface by means of a variable accelerating voltage technique. All designations of variables are the same as those used in the case of image defocusing. Upon substituting the expression E0 = V0/l with two values of accelerating voltage V01 and V02 into Eq. (113), we obtain:

$$D_1 = \sqrt[3]{\frac{d^2\,(V_{01} d + l\,\Delta\Phi)}{V_{01}}} \tag{119}$$

and

$$D_2 = \sqrt[3]{\frac{d^2\,(V_{02} d + l\,\Delta\Phi)}{V_{02}}}. \tag{120}$$

Considering these formulae again as a system of equations for definition of d and ΔΦ, we obtain the following solution:

$$d = \sqrt[3]{\frac{D_2^3 V_{02} - D_1^3 V_{01}}{V_{02} - V_{01}}} \tag{121}$$

and

$$\Delta\Phi = \frac{V_{01} V_{02}\,(D_1^3 - D_2^3)}{l\,\sqrt[3]{(V_{02} - V_{01})\,(D_2^3 V_{02} - D_1^3 V_{01})^2}}. \tag{122}$$
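A numerical round trip is a convenient check on Eqs. (119) through (122). The sketch below assumes the CPD sign convention of Eq. (111) (ΔΦ positive for a negatively charged spherule) and writes the two measured spot diameters as D1 and D2; the function names and the test values are illustrative assumptions.

```python
def spot_diameter(d, dphi, V0, l):
    """Bright-spot diameter at the level j* = 1.106 j0: Eq. (113) with E0 = V0 / l."""
    return (d**2 * (V0 * d + l * dphi) / V0) ** (1.0 / 3.0)


def invert_two_voltages(D1, D2, V01, V02, l):
    """Recover (d, dphi) from spot diameters D1, D2 measured at V01, V02."""
    d_cubed = (D2**3 * V02 - D1**3 * V01) / (V02 - V01)
    d = d_cubed ** (1.0 / 3.0)
    # Denominator l * (V02 - V01) * d**2 is the cube-root expression
    # l * ((V02 - V01) * (D2**3 * V02 - D1**3 * V01)**2) ** (1/3).
    dphi = V01 * V02 * (D1**3 - D2**3) / (l * (V02 - V01) * d**2)
    return d, dphi


# Illustrative values: d = 20 nm, CPD = 0.03 V, l = 4 mm, V0 = 10 kV and 18 kV.
d_true, dphi_true, l = 20e-9, 0.03, 4e-3
V01, V02 = 10e3, 18e3
D1 = spot_diameter(d_true, dphi_true, V01, l)
D2 = spot_diameter(d_true, dphi_true, V02, l)
d_rec, dphi_rec = invert_two_voltages(D1, D2, V01, V02, l)
print(f"D1 = {D1 * 1e9:.2f} nm, D2 = {D2 * 1e9:.2f} nm")
print(f"recovered d = {d_rec * 1e9:.2f} nm, CPD = {dphi_rec:.4f} V")
```

In this example the spot shrinks when the voltage is raised, which under the assumed convention corresponds to a negatively charged spherule.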
These formulae permit independent calculation of both the spherule diameter and CPD value by two measurements at different voltages. If the diameter of the spot seen at the screen decreases when the accelerating voltage increases, it means that as a result of the CPD effect the spherule is charged negatively relative to the substrate, and vice versa. In this case, it is not necessary to deposit a coating layer.

4. Imaging of Spherical Particles in the Case of Defocusing

In the case of defocusing, the object image drastically changes, and an image contrast reversal is often observed when switching between underfocusing and overfocusing. These effects can also be numerically calculated and used for determination of parameters of the object details. Image defocusing causes an additional shift of the electron trajectories at the microscope screen, thus affecting the image pattern. Let us assume that under defocusing
FIGURE 35. Electron trajectories in the cathode lens in the presence of microfields. Solid and dashed lines denote the real and apparent trajectories, respectively. L represents the lens of the microscope immersion objective. K is the surface of the object. The virtual image K′ is observed in the focused regime of the emission electron microscope. K″ is the virtual image plane at defocusing h. S is the additional trajectory shift due to defocusing. In the case of divergent electron trajectories, this can result in formation of a bright spot instead of a dark one on the screen.
the microscope optical system focuses to the screen not the virtual cathode plane K′, but some other plane K″ located at a distance h from the first one (Figure 35). As seen in Figure 35, the electron trajectory continuations up to this plane experience the additional shift S

$$S = h\tan\alpha = \frac{h v_r}{v_0} = \frac{h v_r}{\sqrt{2 V_0 e/m}}, \tag{123}$$
where vr is the electron radial velocity gained under the effect of the local microfields, v0 is the total electron velocity gained due to the effect of the accelerating voltage, and e and m refer to the charge and mass of the electron. The radial velocity vr gained by an electron under the effect of the local fields can be calculated in the following way. Equation (6) gives

$$dt = \frac{dz}{\sqrt{2 E_0 z\,(e/m)}}. \tag{124}$$

Therefore,

$$dv_r = \frac{e}{m} E_r\, dt = \sqrt{\frac{e}{2 E_0 z m}}\; E_r\, dz \tag{125}$$

and

$$v_r = \sqrt{\frac{e}{2 E_0 m}} \int_0^{\infty} \frac{E_r(z)\, dz}{\sqrt{z}}. \tag{126}$$
Upon substituting Eq. (103) and performing the integration, we obtain

$$v_r = -\frac{3p\sqrt{2\pi l e/m}}{8\varepsilon_0 \sqrt{V_0}\,[\Gamma(1/4)]^2}\,\frac{1}{r^{5/2}}. \tag{127}$$

Upon substituting this expression into Eq. (123), we obtain

$$S(r) = -\frac{3hp\sqrt{\pi l}}{8\varepsilon_0 V_0\,[\Gamma(1/4)]^2}\,\frac{1}{r^{5/2}}. \tag{128}$$
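The Γ(1/4) factor in Eqs. (127) and (128) comes from the z-integration of the dipole field in Eq. (126). As a hedged check, the sketch below evaluates the dimensionless integral ∫₀^∞ t^{1/2}(1 + t²)^{−5/2} dt (obtained with the substitution z = r·t, which is an assumption of this example) with a hand-written Simpson rule and compares it with the closed form π^{3/2}/[Γ(1/4)]².

```python
import math


def integrand(s):
    """Integrand after the further substitution t = s**2, which removes the sqrt(t) factor."""
    return 2.0 * s * s / (1.0 + s**4) ** 2.5


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# The integrand decays as 2 * s**-8, so truncating at s = 40 is harmless here.
numeric = simpson(integrand, 0.0, 40.0)
closed_form = math.pi ** 1.5 / math.gamma(0.25) ** 2

# Numerical factor multiplying h*|p|*sqrt(l) / (eps0 * V0 * r**2.5) in Eq. (128).
coefficient = 3.0 * math.sqrt(math.pi) / (8.0 * math.gamma(0.25) ** 2)

print(f"numeric integral = {numeric:.6f}, closed form = {closed_form:.6f}")
print(f"coefficient of Eq. (128) = {coefficient:.5f}")
```

Multiplying the integral by 3/(8π) reproduces the numerical coefficient 3√π/(8[Γ(1/4)]²) of Eq. (128).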
The defocusing value h may be either positive, which corresponds to image overfocusing, or negative, which corresponds to image underfocusing. In the latter case, negative images are displayed, which are different in detail as well. The overfocusing mode, with a dark spot observed in the image, is best suited for measurements. The dark spot radius can also be obtained from Eq. (116) upon substituting Eq. (128). We obtain

$$r_1 = K^{2/7}\left[\left(\frac{5}{2}\right)^{2/7} + \left(\frac{2}{5}\right)^{5/7}\right], \tag{129}$$

where

$$K = -\frac{3hp\sqrt{\pi l}}{8\varepsilon_0 V_0\,[\Gamma(1/4)]^2}. \tag{130}$$
Upon collecting all numerical coefficients, we can obtain the following formula for the determination of the dark spot diameter:

$$D = 2r_1 = 1.551\left(-\frac{hp\sqrt{l}}{\varepsilon_0 V_0}\right)^{2/7}. \tag{131}$$

It follows from this formula that the image diameter D increases very slowly with defocusing h, in proportion to h^{2/7}. So, to increase the spot diameter 2 times we have to increase the defocusing value by a factor of 11.3. In the case of underfocusing, an image contrast reversal is observed; in other words, a bright spot appears in the image centre surrounded by a dark rim. However, according to additional research, the size of this bright spot depends only weakly on the spherule diameter, and therefore underfocusing is less suitable for measurements.

5. Measurement of Spherical Particle Diameter Without Contact Potential Difference in the Case of Defocusing

In this case, the dipole moment of the spherule is taken in accordance with Eq. (105). Upon substituting it into Eq. (131) and making calculations of the numerical coefficient, we obtain

$$D = 3.196\,\frac{d^{6/7} h^{2/7}}{l^{1/7}}. \tag{132}$$
Hence it follows that the analytical formula for a determination of the spherule diameter from its defocused image has the following form:

$$d = \frac{0.258\, D^{7/6}\, l^{1/6}}{h^{1/3}}. \tag{133}$$
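The numerical coefficients 1.551, 3.196, and 0.258 can be traced back to Eqs. (129) and (130). The following sketch recomputes them, assuming that the dipole moment of the deposited spherule without CPD has the modulus 4πε0d³V0/l; the helper names are illustrative and not part of the original work.

```python
import math

G = math.gamma(0.25)
k = 3.0 * math.sqrt(math.pi) / (8.0 * G * G)   # numerical factor of K, Eq. (130)
bracket = 2.5 ** (2 / 7) + 0.4 ** (5 / 7)      # bracket of Eq. (129)
c131 = 2.0 * bracket * k ** (2 / 7)            # coefficient of Eq. (131)
c132 = c131 * (4.0 * math.pi) ** (2 / 7)       # coefficient of Eq. (132)
c133 = c132 ** (-7 / 6)                        # coefficient of Eq. (133)


def defocused_spot_diameter(d, h, l):
    """Dark-spot diameter for a spherule without CPD, Eq. (132)."""
    return c132 * d ** (6 / 7) * h ** (2 / 7) / l ** (1 / 7)


def spherule_diameter(D, h, l):
    """Spherule diameter from the defocused dark spot, Eq. (133)."""
    return c133 * D ** (7 / 6) * l ** (1 / 6) / h ** (1 / 3)


print(f"c131 = {c131:.3f}, c132 = {c132:.3f}, c133 = {c133:.3f}")

# Round trip with illustrative values: d = 20 nm, h = 10 um, l = 4 mm.
D = defocused_spot_diameter(20e-9, 10e-6, 4e-3)
print(f"D = {D * 1e9:.1f} nm, recovered d = {spherule_diameter(D, 10e-6, 4e-3) * 1e9:.2f} nm")
```

The factor (4π)^{2/7} in `c132` is what results from the moment of the deposited spherule; the free-spherule moment of Eq. (101) would give a different coefficient.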
The main difficulty that arises when applying this technique is determining the defocusing value h. This problem can be solved rather simply when a magnetic lens is placed in the immersion objective of the emission electron microscope since its focal length dependence on the lens current is known. For an electrostatic immersion objective, the problem is more complicated and needs an additional consideration. One of the possible ways to solve this problem is as follows. First, it is necessary to carry out a measurement for a spherule of known diameter. From this measurement, we can calculate the defocusing value for given parameters of the objective lens by the following formula derived from Eq. (133):

$$h = \frac{0.0172\, D^{7/2}\sqrt{l}}{d^3}. \tag{134}$$

Another possibility is to apply numerical ray tracing calculations along with a planar pattern on the substrate surface, suitable to find the settings for sharp focusing.

6. Measurement of Spherical Particle Parameters with Contact Potential Difference in the Case of Defocusing

In this case, the equivalent dipole electric moment of the spherule is taken according to Eq. (111). Again, the diameter of the spherule image visible at the screen depends on the CPD sign and value. Two techniques can be proposed for a measurement of CPD and diameter of the spherule from its images in the emission electron microscope.
1. After measuring the spot diameter at the image, a thin conducting layer is deposited on the object surface that prevents CPD formation. This layer should not change the spherule diameter significantly. Then the geometric size of the spherule can be calculated by processing this new image. After that, the CPD is calculated by comparing these two images. This technique is fairly simple, but it results in a possible destruction of the object by the overlayer.

2. Two different images of the spherule are recorded at two different values of the microscope accelerating voltage V0. Alternatively, it is also possible to change the microscope parameter l. In the presence of a CPD, the diameters of the two spots displayed will be different. Using these data, we could calculate the actual size of the spherule and its CPD. The advantage of this technique is that the object is preserved.

Let us consider these two techniques in more detail.

1. Let D designate the diameter of the spot displayed at the screen in the case of a CPD, whereas D0 refers to the spot diameter observed after the deposition of the conducting layer. In the case of defocusing, we can calculate the spherule diameter from the value D0 using Eq. (133). Using Eq. (131), the electric dipole moment p can be expressed as follows:

$$p = -\frac{0.215\,\varepsilon_0 V_0 D^{7/2}}{h\sqrt{l}}. \tag{135}$$
Upon equating it to the dipole moment taken from Eq. (111), we obtain the expression for the CPD calculation:

$$\Delta\Phi = \frac{0.258\, V_0\left(D^{7/2} - D_0^{7/2}\right)}{h^{1/3}\, l^{5/6}\, D_0^{7/3}}. \tag{136}$$
2. Let us now derive the formula for CPD determination by varying the microscope accelerating voltage in the case of defocusing. For this purpose, let V01 and V02 be two different values of the accelerating voltage. They correspond to two different values of the spherule electric dipole moment:

$$p_1 = -4\pi\varepsilon_0 d^2\left(\frac{V_{01} d}{l} + \Delta\Phi\right) \tag{137}$$

and

$$p_2 = -4\pi\varepsilon_0 d^2\left(\frac{V_{02} d}{l} + \Delta\Phi\right). \tag{138}$$
Upon substitution of these expressions into Eq. (130), we obtain two spot diameters, D1 and D2, respectively. These two expressions are considered as two combined equations in two variables d and ΔΦ:

$$D_1 = 1.551\left[\frac{4\pi d^2 h\,(V_{01} d + l\,\Delta\Phi)}{V_{01}\sqrt{l}}\right]^{2/7} \tag{139}$$

and

$$D_2 = 1.551\left[\frac{4\pi d^2 h\,(V_{02} d + l\,\Delta\Phi)}{V_{02}\sqrt{l}}\right]^{2/7}. \tag{140}$$
Leaving out the solution details, we present here just the final expressions:

$$d = 0.258\,\sqrt[3]{\frac{\left(V_{02} D_2^{7/2} - V_{01} D_1^{7/2}\right)\sqrt{l}}{h\,(V_{02} - V_{01})}} \tag{141}$$

and

$$\Delta\Phi = \frac{0.258\, V_{01} V_{02}\left(D_1^{7/2} - D_2^{7/2}\right)}{l^{5/6}\, h^{1/3}\,(V_{02} - V_{01})^{1/3}\left(V_{02} D_2^{7/2} - V_{01} D_1^{7/2}\right)^{2/3}}. \tag{142}$$
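As for sharp focusing, the pair of Eqs. (141) and (142) can be checked by a numerical round trip through Eqs. (139) and (140). The sketch assumes the CPD sign convention of Eq. (111) (ΔΦ positive for a negatively charged spherule) and uses the rounded coefficient 1.551 of Eq. (131); function names and test values are illustrative.

```python
import math

C131 = 1.551  # coefficient of Eq. (131), as quoted in the text


def defocused_spot(d, dphi, V0, h, l):
    """Dark-spot diameter at defocusing h in the presence of a CPD, Eqs. (139)/(140)."""
    arg = 4.0 * math.pi * d**2 * h * (V0 * d + l * dphi) / (V0 * math.sqrt(l))
    return C131 * arg ** (2 / 7)


def invert_defocus(D1, D2, V01, V02, h, l):
    """Recover (d, dphi) from two defocused spot diameters, Eqs. (141)/(142)."""
    c = math.sqrt(l) / (4.0 * math.pi * h * C131 ** 3.5)   # c**(1/3) is about 0.258 * (sqrt(l)/h)**(1/3)
    d_cubed = c * (V02 * D2**3.5 - V01 * D1**3.5) / (V02 - V01)
    d = d_cubed ** (1 / 3)
    dphi = c * V01 * V02 * (D1**3.5 - D2**3.5) / (l * (V02 - V01) * d**2)
    return d, dphi


# Illustrative values: d = 20 nm, CPD = 0.03 V, h = 10 um, l = 4 mm.
d_true, dphi_true, h, l = 20e-9, 0.03, 10e-6, 4e-3
V01, V02 = 10e3, 18e3
D1 = defocused_spot(d_true, dphi_true, V01, h, l)
D2 = defocused_spot(d_true, dphi_true, V02, h, l)
d_rec, dphi_rec = invert_defocus(D1, D2, V01, V02, h, l)
print(f"recovered d = {d_rec * 1e9:.2f} nm, CPD = {dphi_rec:.4f} V")
```

Because the same rounded coefficient is used in both directions, the round trip is exact up to floating-point error; with measured data the rounding of 1.551 and 0.258 would enter at the level of about 0.1%.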
These formulae permit independent calculation of both the diameter of the spherule and its CPD from the two measurements. The CPD sign can be determined using the same procedures as described for sharp focusing. If the visible spot diameter decreases when the accelerating voltage increases, it means that due to the CPD the spherule is charged negatively with respect to the substrate, and vice versa. In practical situations, in the case of defocusing, both types of image contrast, namely the basic and defocusing ones, are simultaneously present. Therefore, strictly speaking, it is necessary to add the shift values for the two contributions to the image contrast. In the above formulae, the trajectory shift caused by the basic contrast is supposed to be negligible with respect to the one caused by the defocusing contrast. However, it is worth mentioning that at the assumed microscope parameters and spherule diameter and provided a reasonable h value, the defocusing contrast is smaller than the basic contrast.

7. Effect of Spherical Particle Shadow on the Image Pattern

At the off-normal incidence of the primary beam of particles (e.g., photon beam), which is responsible for the electron emission from the object, the shadow from the spherule appears in the form of an ellipse on the substrate (Figure 36). In the case of photoemission, the spherule diameter is vastly
FIGURE 36. Formation of a shadow area from a spherule on the substrate at off-normal incidence of the primary beam.
larger than the wavelength of light (micrometer sizes) when diffraction effects do not appear. For the sake of simplicity, suppose the initial beam to be parallel, and its angle of incidence with respect to the surface normal to be equal to θ. A converging bundle of rays will generate a partial shadow, but it does not markedly affect the results. The semi-axes of the elliptical shadow will be equal to d/2 and d/(2cos θ), and its area is equal to

$$S_e = \frac{\pi d^2}{4\cos\theta}. \tag{143}$$
There is no electron emission from this area of the substrate. For grazing incidence (i.e., an angle θ close to a right angle), the shadow will be rather extensive. For example, at θ = 75°, a common angle for PEEM instruments, the length of the shadow will be equal to 3.86d, and the ellipse area will be Se = 0.97πd². The emission from the spherule surface will be strongly nonuniform. However, according to numerical simulation of the electron trajectories, electrons emitted from the surface of the spherule are strongly deflected aside, and therefore they do not significantly affect the image. Exceptions are those cases in which the emissivity of the particle is strongly enhanced either by a lower work function or by secondary processes such as surface plasmon–enhanced multiphoton photoemission (Cinchetti et al., 2004) or emission of hot electrons (Andreeva et al., 1984; Gloskovskii et al., 2004). Generally, however, the effect of inclined incidence of the primary beam is reduced to the shadow effect on the image. In the case of sharp focusing, this leads to the breaking of the circular spot symmetry. The spot will be deformed at the shadow side. For small particles, the pattern will be smoothed out due to the finite microscope resolving power and diffraction of the UV beam in PEEM applications.
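The shadow figures quoted for 75° follow directly from the semi-axes and from Eq. (143). A minimal sketch, assuming a parallel beam and a hypothetical helper name:

```python
import math


def shadow_ellipse(d, theta_deg):
    """Length (full long axis) and area of the elliptical shadow for a parallel beam.

    d         : spherule diameter
    theta_deg : angle of incidence measured from the surface normal, in degrees
    """
    c = math.cos(math.radians(theta_deg))
    length = d / c                        # twice the long semi-axis d / (2 cos(theta))
    area = math.pi * d**2 / (4.0 * c)     # Eq. (143)
    return length, area


length, area = shadow_ellipse(1.0, 75.0)
print(f"length = {length:.2f} d, area = {area / math.pi:.2f} pi d^2")
# prints: length = 3.86 d, area = 0.97 pi d^2
```

For the 65° incidence used in the laser experiments described later, the same function gives a shadow length of about 2.4d.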
In order to reduce the influence of this phenomenon on the measurement accuracy for determination of the spherule diameter and its CPD, we recommend the following rule be observed: When the spot observed at the image is slightly deformed, for calculation take the largest magnitude of the spot dimensions as its diameter value; in other words, the spot diameter has to be measured along the direction not perturbed by the shadow.

B. Characterization of Three-Dimensional Objects in Particular with Sizes Comparable to or Smaller than the Lateral Resolution of the Emission Electron Microscope. Practical Recommendations

Although usually used for the study of plane surfaces, emission electron microscopes are also suited for characterization of small three-dimensional objects. Characterization of small particles consists of determining their shape and dimensions. Information regarding the particle shape can be obtained when the photoelectrons are excited by s- and p-polarized light (Nepijko et al., 2002c). For practical measurements of a particle diameter from its images, it is most suitable to use Eq. (110) at the current density level at the image j* = 1.106 j0 (j0 is the initial emission current density from the substrate surface in the vicinity of the given particle). It is necessary to take into account that the particle size displayed at the screen may be affected by the contact potential difference between the particle and the substrate. Another technique to measure the diameter can be recommended for small particles, whose size is comparable with the resolvable distance. From consideration of dimensionality it follows that the relative integrated magnitude of the brightness enhancement along a line across the spot diameter is proportional to the squared diameter of the particle. In this case, we can obtain the constant of proportionality required for calculations from a comparison of the results measured for large particles by the two abovementioned techniques.
These and other details (practical recommendations) that may arise during characterization of three-dimensional objects with an emission electron microscope are discussed in this section. A photoemission electron microscope (FOCUS IS-PEEM, see Swiech et al., 1997) was used for the investigation. It was attached to a UHV chamber with a base pressure below 10⁻¹⁰ mbar. The photoelectrons were excited by ultrashort laser pulses with a wavelength of 400 nm (hν ≈ 3.1 eV). The pulses of approximately 80 fs were generated with a repetition rate of 80 MHz and a maximum peak power of 1 kW by a frequency-doubled (with a β-BaB2O4 crystal) self-mode-locked Ti:sapphire laser (Spectra Physics). A Fresnel rhomb served to rotate the polarization vector of the laser light. Its angle of incidence was 65° off the surface normal.
MEASUREMENT OF ELECTRIC FIELDS ON OBJECT SURFACE
The samples were prepared by in situ evaporation of Pb from indirectly heated crucibles onto an Si(111) surface. The formation of Pb particles was initiated by heating the sample. Their formation at the melting point of the Pb film was monitored in situ with the photoemission electron microscope. The Pb particles prepared in this way had sizes from some hundreds of nanometers up to several micrometers (Fecher et al., 2002; Nepijko et al., 2002c). The Pb particles are visible in Figure 37. Figures 37a and b show the same area, with the photoelectrons excited by s- and p-polarized light, respectively. The images were obtained using the photoemission electron microscope in the focusing regime. It can easily be seen that the images in Figures 37a and b are not identical. The photoemission intensity depends
FIGURE 37. PEEM images of the same area showing an ensemble of Pb particles on an Si wafer upon illumination by femtosecond laser pulses with a wavelength of λ = 400 nm. The illuminating light was s-polarized (a) or p-polarized (b). The asymmetry image (c) and sum image (d) of images (a) and (b) are also given. Designations are explained in the text.
NEPIJKO ET AL.
on both the sample material and the angle of incidence, but these parameters remain the same for both types of polarization. The differences seen in Figures 37a and b for the images of the same particle can be used to study its geometric shape anisotropy. If the photoemission intensities for s- and p-polarization are equal, then a Pb particle is symmetrical with respect to these planes of polarization (e.g., it is essentially spherical). However, this is no longer the case for nonsymmetrical particles. In addition, the image of the Si substrate is more intense when excited by p-polarized light (Figure 37b) than when excited by s-polarized light (Figure 37a). This behavior can be understood by assuming that the photoemission signal depends on the projection of the electric vector on the direction of observation. In the former case, the electric vector has a component perpendicular to the surface, whereas in the latter case it is purely in-plane.

Let us discuss the following differences between Figures 37a and b. For example, the bright particles enclosed by a circle in Figure 37a look darker in Figure 37b. Conversely, the bright particles enclosed by a square in Figure 37b are darker in Figure 37a. The shape of these particles obviously differs significantly from a sphere. The particles enclosed by a circle and by a square are strongly oblate or prolate, respectively, with respect to the substrate surface normal. Figure 37c shows the asymmetry between Figures 37a and b, that is, the normalized intensity difference. All three images contain the same number of pixels, and the asymmetry is defined for each pixel by

$$A = \frac{I_s - I_p}{I_s + I_p}, \tag{144}$$
where $I_s$ and $I_p$ are the intensities for the orthogonal directions of the polarization vector. Eq. (144) has to be applied pixel by pixel to the images in Figures 37a and b to produce the asymmetry image shown in Figure 37c. It is clearly seen from Figure 37c that smaller particles generally appear dark, whereas particles larger than a certain limiting size appear bright in the asymmetry image. This result shows that small (dark) particles emit symmetrically when illuminated by s- or p-polarized light ($I_s \approx I_p$). Small particles can thus be considered spherical to a good approximation. However, this is not true for larger particles. The numbered arrows in Figure 37c point to particles whose size decreases as their number increases. Particles 2 through 5 appear dark (i.e., they have a spherical shape). Particle 1 is gray; therefore it can be considered nearly spherical. The other, larger particles in Figure 37c are bright (i.e., they are not spherical). Thus, the spherical particles 1 through 5 were chosen according to this contrast criterion for the subsequent determination of their sizes.
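The pixel-by-pixel evaluation of Eq. (144), together with the sum image used later for the size measurement, can be sketched as follows. This is only an illustration under stated assumptions: the two images are taken to be registered arrays of equal shape, the function name `asymmetry_and_sum` and the guard `eps` against empty pixels are hypothetical, and how the resulting values are mapped to gray levels in Figure 37c is not specified in the text.

```python
import numpy as np

def asymmetry_and_sum(I_s, I_p, eps=1e-12):
    """Return the asymmetry image A (Eq. 144) and the sum image I_s + I_p.

    Pixels whose summed intensity does not exceed `eps` get A = 0
    (a choice made for this sketch to avoid division by zero).
    """
    I_s = np.asarray(I_s, dtype=float)
    I_p = np.asarray(I_p, dtype=float)
    total = I_s + I_p
    A = np.zeros_like(total)
    np.divide(I_s - I_p, total, out=A, where=total > eps)
    return A, total

# Tiny synthetic example: equal intensities give A = 0 (symmetric emitter).
I_s = np.array([[2.0, 1.0], [0.0, 3.0]])
I_p = np.array([[2.0, 3.0], [0.0, 1.0]])
A, total = asymmetry_and_sum(I_s, I_p)
print(A)
print(total)
```

Pixels with $I_s \approx I_p$ give values of $A$ close to zero, which corresponds to the criterion used above for selecting nearly spherical particles.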
Figure 37d was used to determine the diameters of particles 1 through 5. This figure was obtained by summing Figures 37a and b, so that Figure 37d contains the information corresponding to illumination of the object by unpolarized light. The diameter of the particles was calculated using the criterion $j^* = 1.106\,j_0$ from the radial dependence of the current density $j(x)$, as shown in Figure 38. The profiles in Figure 38 were obtained from line scans through the centers of the particle images presented in Figure 37d. However, the following effects should be taken into account. As previously discussed, a contact potential difference between a particle and the substrate influences the apparent size of the particle. The correction is $\Delta d = \Delta\Phi/E_0$. It has to be added if the work function of the particle is lower than that of the substrate, and subtracted in the opposite case. In this experiment, Si(111) served as the substrate; its work function is approximately 4.6 eV (Fomenko and Podchernyaeva, 1975) and exceeds the work function of the Pb particles by approximately 0.7 eV. A more precise determination of the work function difference $\Delta\Phi$ is rather difficult because it is not known which crystallographic plane of the Pb particles is parallel to the substrate. However, the work functions of the major low-index planes of Pb, such as (100), (110), and (111), are 3.95, 3.8, and 3.85 eV, respectively (Fomenko and Podchernyaeva, 1975), and thus all close to approximately 3.9 eV. The field of the immersion objective lens was $E_0 = 3\cdot10^6$ V/m. Hence the correction is approximately $\Delta d \approx 0.2$ μm, and it may therefore be neglected for particles with a size of several micrometers even though the work function was not precisely determined.
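The size of the contact-potential correction follows directly from the numbers quoted above. The short sketch below evaluates $\Delta d = \Delta\Phi/E_0$ and applies the sign rule stated in the text; the function name is illustrative, and the work-function difference in eV is treated as numerically equal to the potential difference in volts for a single electron charge.

```python
def cpd_corrected_diameter_um(width_um, phi_particle_ev, phi_substrate_ev,
                              field_v_per_m):
    """Apparent width plus the correction delta_d = delta_phi / E0.

    The correction is positive (added) when the particle work function is
    lower than that of the substrate and negative (subtracted) otherwise.
    Returns (corrected diameter, correction), both in micrometres.
    """
    delta_phi_v = phi_substrate_ev - phi_particle_ev   # eV per electron -> V
    delta_d_um = delta_phi_v / field_v_per_m * 1e6     # metres -> micrometres
    return width_um + delta_d_um, delta_d_um

# Values quoted in the text: Si(111) about 4.6 eV, Pb about 3.9 eV, E0 = 3e6 V/m.
diameter, delta_d = cpd_corrected_diameter_um(2.33, 3.9, 4.6, 3e6)
print(f"delta_d = {delta_d:.2f} um, corrected diameter = {diameter:.2f} um")
```

With these inputs the correction evaluates to about 0.23 μm, which the text rounds to approximately 0.2 μm.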
FIGURE 38. Profiles of the current density distribution obtained by line scans passing through the centres of the particles’ images shown in Figure 37d. Curves 1–5 correspond to the numbered particles in Figure 37c and d.
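Reading a diameter off such a profile amounts to finding the width of the region where $j(x)$ exceeds the level $j^* = 1.106\,j_0$. A minimal sketch of that step is given below; the linear interpolation between samples, the function name `width_at_level`, and the synthetic triangular test profile are assumptions made for illustration and are not taken from the original analysis.

```python
import numpy as np

def width_at_level(x, j, level):
    """Width of the interval where the profile j(x) exceeds `level`.

    Crossing points are located by linear interpolation between samples.
    Returns 0.0 if the profile never exceeds the level (as for a spot
    whose maximum stays below j*).
    """
    x = np.asarray(x, dtype=float)
    j = np.asarray(j, dtype=float)
    above = np.nonzero(j > level)[0]
    if above.size == 0:
        return 0.0
    first, last = above[0], above[-1]
    if first == 0:
        left = x[0]
    else:
        frac = (level - j[first - 1]) / (j[first] - j[first - 1])
        left = x[first - 1] + frac * (x[first] - x[first - 1])
    if last == len(x) - 1:
        right = x[-1]
    else:
        frac = (j[last] - level) / (j[last] - j[last + 1])
        right = x[last] + frac * (x[last + 1] - x[last])
    return right - left

# Synthetic profile: background j0 = 1 with a triangular bump of height 0.5.
x = np.linspace(-4.0, 4.0, 81)
j0 = 1.0
j = j0 + 0.5 * np.clip(1.0 - np.abs(x) / 2.0, 0.0, None)
width = width_at_level(x, j, 1.106 * j0)
faint = width_at_level(x, j0 + 0.1 * (j - j0), 1.106 * j0)  # peak below j*
print(width, faint)
```

The second call illustrates the situation described in the text for very small particles, where the brightness maximum no longer reaches the level $j^*$ and the criterion yields no width.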
The next approach takes the limited resolution of the microscope into account. In the present experiments the smallest resolvable distance was approximately 0.1 μm. Each point of an object image is a circle of diffusion with a current density close to a Gaussian distribution. For this reason, a flat maximum is observed in the center of the particle image instead of a sharp peak of high brightness. The brightness curve as a whole becomes bell-shaped. This effect is also weak if the diameters of the particles greatly exceed the resolvable distance. It results in only a slight increase of the apparent diameter for large particles. However, the effect becomes significant for small particles whose size is comparable to the microscope resolution. The particle image then becomes a spot only slightly exceeding the background in brightness. The brightness maximum of particle 5 (Figure 38) is already below the level $j^* = 1.106\,j_0$ determined for the case when the limited resolution of the microscope is not taken into account (dashed line). Another method of measuring the diameter may be proposed for small spherical particles. From dimensional considerations it follows that the integral spot intensity is given by

$$J_i = \int_0^{2\pi}\!\!\int_0^{\infty} j_i(r)\, r\, dr\, d\varphi, \tag{145}$$
where $r = x - x_{0i}$ and $x_{0i}$ is the position of the center of the i-th particle. $J_i$ is proportional to the square of the particle diameter, $d_i^2$. This suggests that the diameter of the i-th particle may be expressed through the parameters of a reference particle:

$$d_i = d_{\mathrm{reference}} \sqrt{\frac{J_i}{J_{\mathrm{reference}}}}. \tag{146}$$

The results of the particle size determination by the proposed methods are discussed in the following text. The width of curves 1 through 4 at the level $j^* = 1.106\,j_0$ (see Figure 38) was determined as 2.33, 2.19, 1.62, and 1.23 μm, respectively. It is necessary to add a correction due to the contact potential difference to these values. Thus, the diameters of particles 1 through 4 determined by the first method amount to 2.53, 2.39, 1.82, and 1.43 μm. The accuracy of the diameter of the largest particles can be estimated to be approximately 0.2 μm. As the particle size decreases, the accuracy becomes worse. If particle 1 is taken as the reference, then the diameters of particles 2 through 5 obtained by the second method are equal to 2.37, 1.45, 0.94, and 0.51 μm, respectively. The diameters calculated by both methods still coincide for particle 2, but the next values in this series differ more
and more. Calculations performed by the second method result in smaller values. Let us discuss possible reasons for this discrepancy. It should be noted that the second method, based on Eqs. (145) and (146), assumes that the magnitude of the integral photoemission current is proportional to the particle surface area. This is strictly valid in the case of one-photon photoemission, when the excitation energy exceeds the work function. However, it is only an approximation in the case of multiphoton photoemission. The latter is based on nonlinear processes that are strongly enhanced as the illuminating laser power increases and as the particle size decreases. From the assumption that the integral photoemission intensity does not increase as the particle size decreases, it follows that the effective sizes should be increased. However, comparison of the sizes of particles 1 through 5 determined by both methods shows the inverse dependence. It only remains to suppose that this effect was not dominant in the investigated size range. For multiphoton photoemission the second method gives an upper estimate of the particle size. This means that the real sizes of particles 3 through 5 may be even smaller than those calculated by the second method. The intensity of the photoemission signal per unit volume of a particle shows a size dependence as long as the particle size remains smaller than the photon penetration depth. However, all particles observed here had diameters of at least 0.5 μm, so this additional effect should not be expected. The precision of the diameters of particles 1 and 2 is approximately 4%. The accuracy of the diameters of the remaining particles is determined by the functional dependence given in Eq. (146), as well as by the accuracy of the values entering the equation. The precision of the diameters obtained for particles 3, 4, and 5 is approximately 9, 14, and 35%, respectively.
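The second method, Eqs. (145) and (146), can be illustrated with a small numerical sketch. It is only a toy model under stated assumptions: each particle is represented by a uniform disc of excess brightness above the background (so that the integral scales with the squared diameter), the integral of Eq. (145) is replaced by a pixel sum of that excess, and the limited resolution is mimicked by a normalized Gaussian blur. All function names are illustrative.

```python
import numpy as np

def disc_excess(shape, center, diameter_px, height=1.0):
    """Uniform disc of excess brightness (toy model of a particle image)."""
    yy, xx = np.indices(shape)
    r2 = (xx - center[0]) ** 2 + (yy - center[1]) ** 2
    return np.where(r2 <= (diameter_px / 2.0) ** 2, height, 0.0)

def gaussian_blur(image, sigma_px):
    """Separable blur with a normalized Gaussian kernel (numpy only)."""
    half = int(np.ceil(4 * sigma_px))
    t = np.arange(-half, half + 1)
    kernel = np.exp(-t ** 2 / (2.0 * sigma_px ** 2))
    kernel /= kernel.sum()
    conv = lambda v: np.convolve(v, kernel, mode="same")
    return np.apply_along_axis(conv, 0, np.apply_along_axis(conv, 1, image))

def integral_intensity(excess):
    """Discrete analogue of Eq. (145): sum of the excess brightness."""
    return float(excess.sum())

def diameter_from_integral(J_i, J_ref, d_ref):
    """Eq. (146): d_i = d_ref * sqrt(J_i / J_ref)."""
    return d_ref * np.sqrt(J_i / J_ref)

sigma = 3.0                                    # blur width in pixels
ref_sharp = disc_excess((128, 128), (64, 64), 40)
small_sharp = disc_excess((128, 128), (64, 64), 10)
J_ref = integral_intensity(gaussian_blur(ref_sharp, sigma))
J_small = integral_intensity(gaussian_blur(small_sharp, sigma))
d_small = diameter_from_integral(J_small, J_ref, d_ref=40.0)
print(f"estimated small diameter: {d_small:.2f} px (true value 10 px)")
```

In this toy model the blur leaves the integral unchanged, so the diameter of the small disc is recovered from the reference even though its profile is strongly smeared. The sketch does not reproduce the multiphoton nonlinearity discussed above, for which the text states that the method yields only an upper estimate.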
Thus, two methods were described for the quantitative determination of the diameter of three-dimensional particles in PEEM investigations. Determination of the particle diameter from the $j^* = 1.106\,j_0$ criterion is a direct method. It is less precise for small particles because of the spreading of the image caused by the limited resolution of the instrument. This method is correct for both one-photon and multiphoton photoemission. The second method is attractive because it does not use the shape of the intensity curve through the i-th particle, $j_i(x)$, which is deformed for small particles, as mentioned above, but instead uses the integral intensity $J_i$ derived from Eq. (145). The latter is determined by the photoelectron intensity forming the image of an observed particle and remains unchanged when the particle image is spread by the limited lateral resolution. Thus, the first method was applied to determine the diameter of a large spherical particle well separated from the others. This diameter is further used as the reference when the diameters of the smaller particles are derived by the second
method. This approach is accurate in the case of one-photon photoemission and enables the upper limit of the particle size to be estimated in the case of multiphoton photoemission.

VII. CONCLUSIONS

Based on the theory and experiments described in the present paper, the following conclusions can be drawn.

1. An emission electron microscope makes it possible to observe local electric fields on the object surface with very high sensitivity.
2. Images of various kinds can be obtained depending on whether the electron beam is restricted on its way from the object to the screen. The microscope sensitivity to local fields is much higher if the beam is partially restricted by means of a contrast aperture or a knife-edge located in the crossover plane of the objective lens.
3. Depending on the position of the contrast aperture or knife-edge, a typical bright-field or dark-field image of the same object, or an image of intermediate contrast type, can be obtained.
4. Independent of the image type, it is possible to reconstruct the quantitative two-dimensional distribution of the strength of electric microfields and the distribution of the local potential on the object surface from these images using the described formulae.
5. Using the above-mentioned methods, it is possible to calculate numerically the geometrical surface microprofile of the investigated object.
6. Owing to the distortion of the electron trajectories by local microfields, the image size and shape of inhomogeneities on the surface of the investigated object, as seen on the electron microscope screen, can differ considerably from their actual values. The above theory makes it possible to restore the actual size and shape of these details.
7. The formulae given above enable the sizes of microparticles present on the object surface to be determined.
8. Contrary to what might be expected, in the regime of sharp focusing, particles positioned on the object surface appear not as dark spots but as bright spots. Conversely, pits on the object surface look dark relative to adjacent areas.
9. Similarly, positively charged microregions on the object surface look darker and larger than their actual size, and vice versa for negatively charged regions.
10. In the regime of defocusing, the image contrast changes and is sometimes transformed into an inverted contrast. The theory enables the image to be calculated for any value of defocusing.
11. The technique based on the application of two different accelerating voltages of the cathode lens of the microscope permits separation of the effect of a microinhomogeneity from that of local electric fields, as well as from effects caused by different values of the local emission current density from the object.
12. Local fields and surface roughness on the object surface may significantly deteriorate the spatial resolution in the vicinity of such an inhomogeneity. The theoretical resolution can be calculated by the formulae derived in the paper.

REFERENCES

Andreeva, L. I., Benditskii, A. A., Viduta, L. V., Granovskii, A. B., Kulyupin, Yu. A., Makedontsev, M. A., Rukman, G. I., Stepanov, B. M., Fedorovich, R. D., Shoitov, M. A., and Yuzhin, A. I. (1984). Electron emission induced by the action of CO2 laser radiation on island metal films. Sov. Phys. Solid State 26, 923–924.
Bertein, F. (1953). Aberrations des images electroniques des cathodes emissives imparfaites. J. Phys. Et Radium 14, 235–240.
Blessing, R., and Pagnia, H. (1982). Electron emission from gold island films. Phys. Stat. Solidi (b) 110, 537–542.
Boersch, H. (1942). Die Verbesserung des Auflösungsvermögens im Emissions-Elektronenmikroskop. Zs. Techn. Phys. 23, 129–130.
Borziak, P. G., Kulyupin, Yu. A., Nepijko, S. A., and Shamonya, V. G. (1981). Electrical conductivity and electron emission from discontinuous metal films of homogeneous structure. Thin Solid Films 76, 359–378.
Borzjak, P. G., Sarbej, O. G., and Fedorowitsch, R. D. (1965). New phenomena in very thin metal layers. Phys. Stat. Solidi 8, 55–58.
Brüche, E. (1942). Die Auflösungsgrenze des Emissions-Elektronenmikroskops. Kolloid. Z. 100, 192–206.
Chmelik, J., Veneklasen, L., and Marx, G. (1989). Comparing cathode lens configurations for low energy electron microscopy. Optik 83, 155–160.
Cinchetti, M., Valdaitsev, D. A., Gloskovskii, A., Oelsner, A., Nepijko, S. A., and Schönhense, G. (2004). Photoemission time-of-flight spectromicroscopy of Ag nanoparticle films on Si(111). J. Electr. Spectr. Relat. Phenom. 137–140, 249–257.
Courant, R., and Hilbert, D. (1989). Methods of Mathematical Physics, Vols. 1 and 2. New York: John Wiley & Sons.
Dyukov, V. G., Nepijko, S. A., and Sedov, N. N. (1991). Electron Microscopy of Local Potentials. Kiev: Naukova Dumka.
Engel, W., Kordesch, M. E., Rotermund, H. H., Kubala, S., and von Oertzen, A. (1991). An UHV photoemission microscope for use in surface science. Ultramicroscopy 36, 148–153.
Ertl, G., and Küppers, J. (1985). Low Energy Electrons and Surface Chemistry. Weinheim: VCH.
Fecher, G. H., Schmidt, O., Hwu, Y., and Schönhense, G. (2002). Multiphoton photoemission electron microscopy using femtosecond laser radiation. J. Electr. Spectr. Relat. Phenom. 126, 77–87.
Fedorovich, R. D., Naumovets, A. G., and Tomchuk, P. M. (2000). Electron and light emission from island metal films and generation of hot electrons in nanoparticles. Phys. Rep. 328, 73–179.
Fomenko, V. S., and Podchernyaeva, I. A. (1975). Emission and Adsorptive Properties of Substances and Materials, Handbook. Moscow: Atomizdat.
Fowler, R. H. (1931). The analysis of photoelectric sensitivity curves for clean metals at various temperatures. Phys. Rev. 38, 45–56.
Glaser, W. (1952). Grundlagen der Elektronenoptik. Wien: Springer-Verlag.
Gloskovskii, A., Valdaitsev, D., Nepijko, S. A., Sedov, N. N., and Schönhense, G. (2004). Electrical and emission properties of current-carrying silver cluster films detected by an emission electron microscope. Appl. Phys. A 79, 707–712.
Govorkov, V. A., and Kupalyan, S. D. (1970). Theory of Electromagnetic Field in Exercises and Problems. Moscow: Vysshaya Shkola.
Graham, M. D., Kevrekidis, I. G., Asakura, K., Lauterbach, J., Krischer, K., Rotermund, H. H., and Ertl, G. (1994). Effects of boundaries on pattern formation: Catalytic oxidation of CO on platinum. Science 264, 80–82.
Hölzl, J., and Schulze, F. K. (1979). Work function of metals, in Springer Tracts in Modern Physics: Solid Surface Physics, Vol. 85, edited by G. Höhler. Berlin: Springer-Verlag, pp. 1–150.
Krasnov, M. L. (1975). Integral Equations. Moscow: Nauka.
Liebl, H. (1988). On the image aberration of the uniform acceleration field of an emission lens. Optik 80, 4–8.
Marx, G., von Przychowski, M. D., Krömker, B., Ziethen, Ch., and Schönhense, G. (1994). Construction of a UHV emission electron microscope with preparation chamber, in Electron Microscopy 1994, Vol. 1, edited by B. Jouffrey and C. Colliex. Les Ulis: Les Editions de Physique, pp. 239–240.
Müller, E. W. (1956). Field emission microscopy, in Physical Methods in Chemical Analysis, Vol. 3, edited by W. G. Berl. New York: Academic Press, pp. 135–182.
Mundschau, M., Kordesch, M. E., Rausenberger, B., Engel, W., Bradshaw, A. M., and Zeitler, E. (1990). Real-time observation of the nucleation and propagation of reaction fronts on surfaces using photoemission electron microscopy. Surface Sci. 227, 246–260.
Mundschau, M., Romanowicz, J., Wang, J. Y., Sun, D. L., and Chen, H. C. (1996). Imaging of ferromagnetic domains using photoelectrons: Photoelectron emission microscopy of neodymium-iron-boron (Nd2Fe14B). J. Vac. Sci. Technol. B 14, 3126–3130.
Nepijko, S. A., Fedorovich, R. D., Viduta, L. V., Ievlev, D. N., and Schulze, W. (2000a). Light emission from Ag cluster films excited by conduction current. Ann. Phys. 9, 125–131.
Nepijko, S. A., Gloskovskii, A., Sedov, N. N., and Schönhense, G. (2003). Measurement of the electric field distribution and potentials on the object surface in an emission electron microscope without restriction of the electron beams. J. Microsc. 211, 89–94.
Nepijko, S. A., Ievlev, D. N., Viduta, L. V., Schulze, W., and Ertl, G. (2002a). The light emission observed from small palladium particles during passage of electron current. ChemPhysChem 3, 680–685.
Nepijko, S. A., Oelsner, A., Krasiyuk, A., Gloskovskii, A., Sedov, N. N., Schneider, C. M., and Schönhense, G. (2004). Lateral resolving power of a time-of-flight photoemission electron microscope. Appl. Phys. A 78, 47–51.
Nepijko, S. A., Sedov, N. N., Ohldag, H., and Kisker, E. (2002b). Measurement of local magnetic fields in photoelectron emission microscopy by restriction of the electron beams. Rev. Sci. Instr. 73, 1224–1228.
Nepijko, S. A., Sedov, N. N., Schmidt, O., Fecher, G. H., and Schönhense, G. (2002c). Size of three-dimensional objects measured by means of photoemission electron microscopy. Ann. Phys. 11, 39–48.
Nepijko, S. A., Sedov, N. N., Schmidt, O., Schönhense, G., Bao, X., and Huang, W. (2000b). Imaging of three-dimensional objects in emission electron microscopy. J. Microsc. 202, 480–487.
Nepijko, S. A., Sedov, N. N., Schönhense, G., and Escher, M. (2002d). Use of electron emission microscope for potential mapping in semiconductor microelectronics. J. Microsc. 206, 132–138.
Nepijko, S. A., Sedov, N. N., Schönhense, G., Escher, M., Bao, X., and Huang, W. (2000c). Resolution deterioration in emission electron microscopy due to object roughness. Ann. Phys. 9, 441–451.
Nepijko, S. A., Sedov, N. N., Schönhense, G., Muschiol, U., Schneider, C. M., Zennaro, S., and Zema, N. (2002e). Resolution of an emission electron microscope in the presence of magnetic fields on the object. Ann. Phys. 11, 461–471.
Nepijko, S. A., Sedov, N. N., Ziethen, Ch., Schönhense, G., Merkel, M., and Escher, M. (2000d). Peculiarities of imaging one- and two-dimensional structures in an emission electron microscope. 1. Theory. J. Microsc. 199, 124–129.
Recknagel, A. (1941). Theorie des elektrischen Elektronenmikroskops für Selbststrahler. Z. Phys. 117, 689–708.
Recknagel, A. (1943). Das Auflösungsvermögen des Elektronenmikroskops für Selbststrahler. Z. Phys. 120, 331–362.
Rose, H., and Preikszas, D. (1992). Outline of a versatile corrected LEEM. Optik 92(1), 31.
Rose, H., and Preikszas, D. (1995). Time-dependent perturbation formalism for calculating the aberrations of systems with large ray gradients. Nucl. Instrum. Methods Phys. Res. A 363, 301–315.
Rotermund, H. H., Engel, W., Jakubith, S., von Oertzen, A., and Ertl, G. (1991). Methods and application of UV photoelectron microscopy in heterogeneous catalysis. Ultramicroscopy 36, 164–172.
Schmidt, O., Bauer, M., Wiemann, C., Porath, R., Scharte, M., Andreyev, O., Schönhense, G., and Aeschlimann, M. (2002). Time-resolved two photon photoemission electron microscopy. Appl. Phys. B 74, 223–227.
Schönhense, G., and Spiecker, H. (2002). Correction of chromatic and spherical aberration in electron microscopy utilizing the time structure of pulsed excitation sources. J. Vac. Sci. Technol. B 20, 2526–2534.
Schwartz, L. (1950). Théorie des Distributions. Paris: Hermann et C.
Sedov, N. N. (1970). Théorie quantitative des systèmes en microscopie électronique à balayage, à miroir et à émission. Journal de Microscopie 9, 1–26.
Sedov, N. N., Spivak, G. V., and Ivanov, R. D. (1962). Electron-optical study of p-n junction of germanium and silicon. Izv. Akad. Nauk SSSR, Ser. Fiz. 26, 1332–1338.
Spivak, G. V., Dombrovskaja, T. N., and Sedov, N. N. (1957). Observation of ferromagnetic domain structure by using photoelectrons. Dokl. Akad. Nauk SSSR 113, 78–81.
Swiech, W., Fecher, G. H., Ziethen, Ch., Schmidt, O., Schönhense, G., Grzelakowski, K., Schneider, C. M., Frömter, R., Oepen, H. P., and Kirschner, J. (1997). Recent progress in photoemission microscopy with emphasis on chemical and magnetic sensitivity. J. Electr. Spectr. Relat. Phenom. 84, 171–188.
Tikhonov, A., and Arsenin, V. (1977). Solution of Ill-Posed Problems. New York: John Wiley & Sons.
Tikhonov, A. N., and Arsenin, V. Ya. (1986). The Methods of Solution of Incorrect Problems. Moscow: Nauka.
Tikhonov, A. N., and Goncharsky, A. V. (1987). Ill-Posed Problems in the Natural Sciences. Moscow: Mir.
Tikhonov, A. N., Leonov, A. S., and Yagola, A. G. (1998). Non-Linear Ill-Posed Problems, Vols. 1 and 2. London: Chapman & Hall.
Topping, J. (1927). On the mutual potential energy of a plane network of doublets. Proc. Royal. Soc. A 114, 67–72.
Veneklasen, L. H. (1991). Design of a spectroscopic low-energy electron microscope. Ultramicroscopy 36, 76–90.
von Przychowski, M. D., Marx, G. K. L., Fecher, G. H., and Schönhense, G. (2004). A spatially resolved investigation of oxygen adsorption on polycrystalline copper and titanium by means of photoemission electron microscopy. Surf. Sci. 549, 37–51.
Index
A a priori, 16 adaptive OI problem, 90–97 automatic texture-preserving denoising in, 93–96 denoising with prior information in, 96–97 adaptive processing, 46 P-M scheme comparison to, 100 scalar comparison to, 101 adaptive structure tensor, 41–42 Alvarez and Mazzora (A-M) scheme, 31, 84, 85 FAB comparisons to examples of, 32–34 image enhancement by, 32–33 angular equation, 221 anisotropic operator, 17 aperture diaphragm, 240, 251, 255 ARPACK software, 134 atom column, 124, 174 2D, 125 2D potential of, 127 bound eigenfunctions, number of dependent on, 158 bound eigenfunctions, symmetry with, 213 broadening of, 169 correlation diagram and, 177–178 dependence of db (square root) for, 167 different parameters for light and heavy atoms in, 169 eigenenergies comparison of most bound eigenfunctions of different types of, 170, 171 eigenenergies/bound eigenfunction in Au, 150 eigenenergies/bound eigenfunction in Cu, 149 eigenenergies/bound eigenfunction with various methods, 156 eigenfunctions, possible symmetries of, 176, 177 electrons’ density in, 129 Gaussian parameterization of 1S eigenfunction of, 172 identical, 178–180 identical/nonidentical, 179, 190 identical/nonidentical correlation, 177 integration of, 148 LCAO, 184–189 mean potential of, 218–219, 220 neighboring influence on each other, 175 nonidentical, 180–181 nonisolated, 175–190 scattered wave relation to, 161–162 Sr, 171 S-state model, accuracy of general assembly with, 182–184 S-state model and, 125, 181–182, 197, 198–200 systematic error of estimated position of, 183, 184 tilt and, 194–195 TiO, 171 types of, 150–151 uniform potential, 165 atomic number Z, 166, 167 Auger electron production, 125 automatic texture-preserving denoising, 93–96 axiomatic approach, 3–6 linear scale-space, 3–4
Perona-Malik nonlinear diffusion, 4–5 sharpening by, 21–46 tensor diffusivity, 6
B basic contrast, 293 Beltrami flow, 8, 41, 46, 63 framework, 14, 16, 41, 60 operator, 15 process, 44 schemes, 45 Bessel differential equation, 157 radial equation similarity to, 135–137 Bessel functions, 132, 156–157 basis set of first kind, 135–138 eigenenergy calculated by, 151, 152, 153, 154, 155 eigenfunctions’ expansion advantages/disadvantages, 137–138 expansion, ineffectiveness of, 152–154 of first kind v. 2D quantum harmonic oscillator eigenfunctions, 150–156 of first/second kind, 157 biharmonic operator, 51 blind deconvolution, 16, 49, 56 Bloch function, 130 Bloch theorem, 119 Bloch wave method, 119, 122–123, 174, 203 associated eigenvector of, 144 channeling described by, 129 eigenenergies, ineffectiveness with, 146 eigenfunctions’ calculation by, 143–146 eigenproblem written as, 144–145 eigenvalues and, 123
finite difference expansion’s related to, 145 historical significance of, 123 isolated atom column’s eigenenergies/eigenfunctions calculated by, 145 refinement step and, 123 Schrödinger equation solved by, 122 blurring kernel, 16, 17 bound eigenfunctions, 130, 131 basis functions’ expansion of, 132, 137–138, 142 Bloch wave method, calculation of, 143–146 comparison of methods for calculation of, 148–160 eigenenergies’ comparison of types of atom columns calculated with different methods, 170, 171 eigenenergy of, 156, 188 eigenenergy of in a Au atom column, 150 eigenenergy of in a Cu atom column, 149 excitation of, 158–159 local atom columns, symmetry with, 213 multislice method calculation of, 146–148 number of dependent on atom column, 158 optimal value of, 142 radial equation for, solution of with expansions in a basis set, 135–143 radial equation for, solution of with finite difference expansion methods, 133–135 radial equation of, 132 sigma gerade, 214–215 wave function dominated by, for small tilt angles, 193
Bragg beams, 114 diffraction, 125 bright field, 259, 262 bright particles, 308 Brüche-Recknagel formula, 272
C cartoon pyramid model definition of scale in, 89 desired level in a region, method for approximating, 94 model components of, 90 non-cartoon and, 89 properties of, 89 catalystic chemistry applications, measurement of, 285–287 Cauchy integrals, 239 Cauchy-Schwartz inequalities, 208–210 CCD. See charged coupled device CGL. See complex Ginzburg-Landau equation channeling map, 160–163 experimental, 201–203 reasoning for, 202 representation of, 163 channeling theory, 124–130, 143 associated eigenfunction of S-state model of, 144 Bloch waves’ description of, 129 concept of, 125 introduction to, 124–125 principle of, 124 S-state model equations for, 126–128 S-state model for, 125–130 S-state model main approximation in, 125–126 study of, 125 channeling wave, 162, 202 charged coupled device (CCD), 113 CHRTEM. See Conventional High Resolution Electron Microscopy
circular spot symmetry, 305 circular symmetry, 294 CO oxidation, 285–286 color image enhancement, 32–33 algorithm of, 42–43 color processing, 13–16, 41–43 adaptive structure tensor, 41–42 Beltrami framework, 41 experimental results of, 43 columnar structures, 175–176, 190 complex diffusion processes, 61–87 basic findings of, 104 complex shock filters, 81–84 discussion of, 85–87 of a large/small theta, 72 of a large/small theta with cameraman image, 73 linear, 63–74 linear, fundamental solution of, 85 main features of new methods of, 103–104 previous related studies of, 61–63 ramp-preserving denoising, 74–76 regularized shock filters of, 76–81 schemes of, 87 wave equation for fast electrons solution of, 120–121 complex Ginzburg-Landau (CGL) equation, 62 complex shock filters, 81–84 performance of, 82 constant illumination changes, invariance to, 27 contact potential difference (CPD), 292, 296–299, 302–304 continuum eigenfunctions, 131 excitation of, 159–160 radial equation for, 157–158 contrast. See also image contrast basic, 293 of defocusing, 304 Conventional High Resolution Electron Microscopy (CHRTEM), 115–116 convex potentials, 8 convolution kernel, 16
correlation diagram, 177–178 correlation operator, 207–208 coulombic string potentials, 168 Coulon and Arridge scheme, 80, 82, 84, 85, 86 CPD. See contact potential difference crystal structure determination, 205–206 crystal tilt, 191–201, 217–218 beam tilt equivalence to, 191 crystal thickness dependent for, 198 equations of, 192–193 process of, 191 S-state model and, 197–201 current density, 286 emission, measurement of electric fields distribution at, 283–288 jo, 262–264 photoelectric, 264 photoemission, 280, 281, 287 profiles, from solution of direct problem, 291 current density distribution, 296, 298, 309 function of, 281 simulated in the image plane, 277 current density distribution function, 237, 238, 264, 276 for a negative charged microdot, 265 for a positively charged microdot, 266 in presence of a negatively charged strip, 263 current density profiles, 261 current density redistribution, 294
D dark-field, 259, 313 dark-field image, 246, 255, 257–258, 262 Darwin-Howie-Whelan equation, 211, 212 Debye-Waller factor, 170 deconvolution, 17–18, 25–26
blind, 16, 49, 56 Gaussian, 25 a priori/blind, 56 defocusing, 272, 298 contrast, 293, 304 EEM mode of, 293 image, 299–300 spherical particle measurement without CPD in case of, 302 spherical particles imaging in case of, 299–302 value, 301 denoising, 3, 46, 95 adaptive, 102 adaptive v. scalar effects of, 97 automatic texture-preserving, 93–96 classical images results of, 102 edge-preserving process of, 17 of gradual intensity changes, 12 by linear/nonlinear diffusion processes, 6 method for approximating desired level in a region, 94 of a piecewise constant image, 11 potentials of, 8 with prior information, 96–97 ramp preserving, 74–76, 77, 104 scalar l, 99, 102 scalar OI, 95, 98 sharpening, contradictory to, 43 texture-preserving, 87–104 total variation, 8, 10, 11, 12, 104 density, 229, 255, 256, 264, 280 current, 283–288 current distribution, 26, 237, 238, 261, 263, 264, 265, 266, 276, 277, 281, 294, 296, 298, 309 in electrons, 129 deuterium lamp, 286 diamond-type crystal, 211–213 diffraction; diffusion. See complex diffusion processes; forward-and-backward diffusion; hyperdiffusion; linear diffusion; nonlinear diffusion diffusion coefficient, 28–30, 47
double-well potential corresponding to, 48 triplewell potential corresponding to, 51 diffusionlike equation, 146 Dirac equation, 117, 118 direct methods, 205–210 basic theory of, 206–207 direct problem, 280, 291 solution of, 281–282 Dirichlet’s problem, 232, 271 dispersion relation, 123 double-well potential, 47–49 diffusion coefficient corresponding to, 48 Doyle and Turner parameterization, 137, 140, 172, 188, 218 for electron scattering factors, 155–156 dynamic diffraction, 206 dynamical extinctions, 212
E edge detection, 3 EEM. See emission electron microscope; emission electron microscopy effectively positive kernels, 67–68 eigenenergy, 139, 145, 170, 171, 173, 174 1S interaction with scattered wave, 161 1S, parameterization of, 167–171 Bloch wave method, ineffectiveness with, 146 of bound eigenfunction of a Au atom column, 150 of bound eigenfunction of a Cu atom column, 149 comparison of methods for calculation of, 148–160, 151, 152, 153, 154, 155, 156–157 LCAO calculation of, 184–189 modules of, 173
of most bound eigenfunctions of some pairs of atom columns, 188, 189 multislice methods’ calculation of, 147–148 tilt and, 194 eigenfunctions, 149, 150, 156. See also bound eigenfunctions; 1S eigenfunctions; two-dimensional quantum harmonic oscillator eigenfunctions Bessel functions expansion, advantages/disadvantages, 137–138 Bloch wave method calculation of, 143–146 of columnar structures, 175–176 continuum, 131, 157–158, 159–160 correlation diagram and, 177–178 dumbbell, 214–216 excitation of, 158–160, 195–196 LCAO calculation of, 184–189 methods for studying of, 174 multislice methods’ calculation of, 146–148 of a pair of identical/nonidentical atom columns, 178–180, 190 possible symmetries for, in nonidentical atom columns, 176, 177 tilt and, 193, 194 two-dimensional quantum harmonic oscillator calculated for, 219–223 two-dimensional quantum harmonic oscillator expansion, 138–142 E-L equation, 54, 93 electric field, 278–282, 312 EEM and, 230–262 mapping of, models for using EEM, 252–262 nanoparticle film and, 279, 280 object surface calculation of, 238 electric field distribution, 283–288
electric field distribution function, 233, 282 magnitude of error of, 251 electric field strength, inverse problem solution for, 236 electric potential, 268, 287–291 electric potential distribution function, 236, 238, 249, 253, 260 electron beam, 244–247, 250, 282, 312 electron beam restriction, 239, 250, 252, 258, 259, 282 current density distribution at screen of EEM without, 256, 257 EEM investigation techniques with/ without, 251–252 image contrast formation in case of, 240–242 image contrast schematic diagram of, with partial, 241 by means of trajectory deflection, 240 PEEM and, 285 potential step and, 254–256 electron beam shift, 242 image brightness’ relation to, 242–247 local field strength’s relation to, 247–249 electron beam tilt, 191–201, 217–218 crystal equivalence to, 191 crystal thickness dependent for, 198 equations of, 192–193 process of, 191–192 S-state model and, 197, 198–200, 201 electron diffraction, 203–218 direct methods of, 205–210 discrete pattern of, 204 dynamic, 206 dynamic theory of, 203 inequalities, 208–210 kinematic, 206 pattern of, 211 Patterson function, 207–208
electron emission centers, 280, 282 electron flux, 245, 246 electron scattering, 174, 182, 210, 217, 218–219 temperature’s effect on, 219 electron trajectories, 230–234, 256–258, 294, 312 in cathode lens in presence of microfields, 300 computer simulation of, 256, 276 crossed, 270 deflection, 232–234 emission current density and, 283–285 numerical simulation for equipotential object with roughness with, 275–278 numerical simulation of, for equipotential object with roughness, 275–278 principle, 243 in spherule vicinity, 295 electron wave, 185, 186 electrons additional tangential velocity of, 247 backscattered, 118 channeling theory and, 125 computer simulation of trajectories of, 256, 276 density as function of depth in Au4Mn, 129 density in atom columns of, 129 direct trajectory calculations, 256–258 eigenfunctions calculated in an isolated atom column, 130–160 emission density, 229 exit wave interpreted by inverted, 114 flux, relative magnitude of on screen of EEM, 245, 246 as imaging particles, 113 initial tangential velocities of, 274 multiple scattering of, 211–212
nanoparticles, emission observed, 279–280 object interaction decreased with tilt increased, 194–195 object surface emitted on, 231 oscillation of while propagating to exit face, 114, 115 partial beam restriction, 240 potential energy in S-state model of, 144 potential step and, 256 Schrödinger equation for fast, 119 shift value, 233 shifts and microscope resolution, 269–271 wave equation for fast, 120 wave function of, 118, 124, 174 wavelength of, 116 electrostatic potential, 116–117 embedded manifold, 14 embedded maps, 14–15 emission electron microscope (EEM), 228, 255, 256, 257, 298, 312 accelerating voltage of, 231 bright field/dark field/intermediate type images of, 259 cathode lens, real construction of, 231 computer simulation of image contrast of, 256–258 current density distribution at screen of without electron beam restriction, 256, 257 current density plot at screen of, with aperture diaphragm shifted relative to optical axis, 255 dark-field image formed in, 246 defocusing mode of, 293 electric fields calculation from image in without electron beam restriction, 282 electric fields (potential) on object surface in, 230–252
electric fields (potential) on object surface, mapping of, model experiments for using, 252–262 electric potential imaging, other possibilities of, 288–291 electron flux, relative magnitude of on screen of, 245, 246 electron optics, model for, 230 image contrast calculated with partial beam restriction, 240 image contrast, schematic diagram of with partial beam restriction, 241 image contrast, solution to direct problem of, on screen of, under effect of local fields, 247–249 investigation techniques comparison of with/without beam restriction, 251–252 local electric fields measurement in, 230–250 local field and microroughness effect at object on imaging and resolving power of, 262–278 local field strength and electron trajectory deflection relationship, 232–234 local fields’ influence on, 229 local shift at image and screen brightness relationship, 235–236 microfields measurement, practical applications of using, 278–291 object surface geometry (relief) measurements with, 291–312 one-dimensional fields, simplified analytical expressions of, 238–239 optical system model without restriction of electron beam, 230–232 performance capabilities methods to improve, 251 potential step visualization of, 254–256
practical recommendations for, 306–312 principal electron trajectories in, 243 regime with partial restriction, operation of, 239–240 resolution deterioration and, 274 resolution impaired by geometrical roughness, 269 resolution improved in presence of local microfields or microgeometry at object surface, 277–278 resolution limitations of, 310 resolving power’s deterioration in presence of local fields and object roughness, 269–274 schematic view illustrating principle of image contrast, 230 semiconductor p-n junction, illustrative measurements with, 258–262 silver nanoparticle film in, 279 spherule of on screen of, 294–295 theoretical curves, comparison for relative deterioration of, 274 thermacathodes’ emission distribution investigated by, 267 three-dimensional objects characterized by, 306–312 emission electron microscopy (EEM), 229 energy minimization flows, 7, 53, 60 energy wells, 49–56 energy functional, 49 energy minimization flow, 53 higher-order regularization, 50–52 steady state solutions, 53–56 triple well potential, 49–50 equations of crystal tilt, 192–193 of electron beam tilt, 192–193
of S-state model, 126–128 of texture-preserving denoising, 88 equipotential object, 275–278 Euclidean embedding space, 15 Euler schemes, 99 Euler-Lagrange (E-L) equation, 7 exit wave, 114, 115, 202 experimental channeling maps, 201–203 extinction distance, 115 extinctions dynamic, 212 Gjönnes-Moodie, 212 kinematic, 210, 217 extractor electrode, 283
F FAB. See forward-and-backward diffusion finite difference expansion, 132, 174 Bloch wave method related to, 145 limitations of, 150 finite difference expansion methods, 133–135, 148–150 problems with, 133 flux conservation, 256 focusing, 296–297. See also defocusing under, 301–302 over, 301 sharp, 293–296, 298–302 forward-and-backward (FAB) diffusion, 21–22, 26–38, 39, 46, 50, 55, 57 1D smooth regions stability, 34–38 adaptive parameters of, 31 A-M scheme, comparison to, 32–34 aspects to implement, 43–45 coefficient’s value in large gradient regions, 59 diffusion coefficient of, 28–30 extrema of flux points in, 36 flow, 41 flux and critical points of, 38 image enhancement by, 32
main features of new methods of, 103 model of, 26–27 multiple-step noisy signal processed by, 33 nonlinear, 52 nonmonotonic flux of, 37 parameters between, 29–30 process of, 28–29 resolution enhancement with, 40–41 setting criteria of, 27–28 shock filters, comparison to, 31 single-step noisy signal processed by, 32 super-resolution by, 38–41 Fourier analysis, 156–157, 174, 188 Fourier space, 145, 217 kinematic v. dynamic space, 205 wave function in, 203–218 Fourier transformation, 61, 145, 170, 203, 212, 218, 237, 250, 251–252 Fowler constant, 288 Fraunhofer theory, 203 Fredholm’s equation, 250 Fresnel formula, 119 functions. See Bessel functions; eigenfunctions; generating functions
G Gabor, 61 filtering, 62 -Morlet wavelet, 61 wavelet, 61 Gabor-Morlet wavelet, 63 Gaussian, 68, 170 2D parameterization, 167, 174, 187, 217 2D quadratically normalized, 171–172 dampening function, 169 deconvolution, 25 distribution, 244, 310 kernel, 4, 25 noise, 6, 19, 54
parameterization, 166, 173, 174 parameterization of 1S eigenfunction, 172 real kernel, difference to, 68, 69 regularized shock filters and, 78–79 second derivative, 70 white noise, 55, 56, 86, 104 generating functions, 223 geometric roughness, corresponding expression of, 273 geometric step of height h0, 272 geometric surface roughness (relief), 268, 271 Gjönnes-Moodie extinctions, 212 global power, 97 gradient histograms, 55 gray-level scaling, 28 grazing incidence, 305
H Hamiltonian, 126, 220 Harker-Kasper inequality, 203, 208 importance of, 210 Helmholtz-Lagrange equation, 243 higher-order regularization, 50–52, 59 high-resolution electron microscopy (HREM), 124 need for, 113–114 new theory impetus for, 113–115 survey of different theories of, 115–124 Hilbert transformation, 238–239 Hough transform, 97 HREM. See high-resolution electron microscopy Huyghens source, 203 hyperdiffusion, 51, 58, 59 flow, 51, 52, 60 processing, 53
I identical atom columns, 178–180 illumination constant changes, 27
incoherent, 201 nonsymmetric, 197 nontilted, 197 nonuniformity of, 261 small angle nonparallel, 200–201 tilted, 195, 197 image dark-field, 246, 255, 257–258, 262 nonstationary nature of, 21 object microgeometry’s effect on, 268–269 shifts, method for determining, 284 of small three-dimensional particles on object surface, 292–303 of spherical particle in case of defocusing, 299–302 of spherical particle under sharp focusing with contact potential difference, 296–297 of spherical particle under sharp focusing without contact potential difference, 293–296 of spiral structures for catalytic oxidation of CO, 285 image brightness, 290 electron beam shift’s relation to, 242–247 image contrast, 228, 229, 230, 240–242, 280, 287, 291 analytically calculated functions obtained from solution of direct problem of, 253 completion of calculation of, 247 computer simulation of, 256–258 derived formulae’s validity, 256–258 direct problem solution of, on screen of EEM under effect of local fields, 247–249 inverse problem of, 249 on microscope screen, 283 partial beam restriction’s calculation of, 240 schematic diagram of with partial beam restriction, 241 shift values of, 304
weak, 246 image defocusing, 299–300 image distortion, 262–268 of one-dimensional structures, 262–264 of two-dimensional structure, 265–268 image enhancement, 32–33, 42–43 image pattern, spherical particle shadow’s effect on, 304–306 image processing, 20–21 algorithms various capabilities of, 103–104 energy functional, 49 energy minimization flow, 53 energy wells in, 49–56 higher-order regularization, 50–52 steady-state solutions, 53–56 triplewell potential, 49–50 image rotation, invariance to, 27 image segmentation, 13 image sharpening, 16–20 objectives for, 20 setting criteria for, 27 shock filters, 17–20 image translation, invariance to, 27 imaginary kernel, 67 Gaussian’s second derivative distance of, 70 maximal amplification of, 70 properties of, 70 small theta approximation of, 70 incident beam, 200 incident wave function, 116, 118 induced metric, 15, 16 inequalities, 208–210 inhomogeneity, 312–313 initial wave function, 121 inverse filtering, 16 inverse problem, 236–238, 281 of object geometry profile, 269 results obtained from solution of, 260 silver nanoparticles films and, 280
solution of, 249–250 isolated atom column eigenenergies/eigenfunctions calculated by Bloch wave method, 145 eigenfunction excitation and tilt’s relation to, 195–196 isotropy, 4
K kernel blurring, 16, 17 convolution, 16 effectively positive, 67–68 Gaussian, 4, 68, 69 imaginary, 67, 70 real, 67–70 kinematic approximation, 161–162 extinctions, 210, 217 limit, 210 kinematically forbidden reflections, 210–217 tilt’s influence on, 216–217 Kornprobst scheme, 80, 82, 84, 85, 86
L Lagrange multipliers, 92 Laguerre polynomials, 222, 223 Laplacian, 15, 21, 62 operator, 118, 220 positive/negative, 26 Laue zones, 125 LCAO. See linear combination of atomic orbitals light polarized, 306, 307 unpolarized, 309 line resolution, 276 linear backward diffusion, problems with, 27 linear combination of atomic orbitals (LCAO), 184–189 accuracy of, 190
linear complex diffusion, 63–74 examples of, 70–73 fundamental solution, analysis of, 67 fundamental solution of, 63–64, 65 nonlinear complex diffusion, generalization to, 74 problem definition of, 63 real kernel properties, 67–70 small theta, approximate solution for, 64–66 linear diffusion, 6, 80–81 linear diffusion process, 78, 79 parameters’ adjustment for control of, 42 linear forward diffusion, 66 deconvolution’s relation to, 25–26 linear inverse diffusion, 23–24 1d, 24 2d, 25 advancing back in time with, 26 physical interpretation of, 26 linear scale-space, 3–4 examples of, 5 linear sharpener, basic, 22, 23 linearity, 3 linear sharpening operator, 17 local electric fields, 230–239 EEM measurement of with partial beam restriction, 239–250 object surface/electron trajectory and, 230–232 solution of problem on object surface in case of partial beam restriction, 244–247, 250 local field determination, inverse problem solution, 236–238 local field strength, 232–234, 236 electron beam shift’s relation to, 247–249 local fields, 229, 278 image distortion effected by on object surface, 262–268 imaging and resolving power of EEM at object on, 262–278
object microgeometry effect on image, 268–269 resolving power’s deterioration of an EEM in presence of object roughness and, 269–274 local power, 90–91, 94 local shift, 235–236 location bias, 83 location success, 83 location variance, 83 Lyapunov functionals, 90
M Mach bands, 73, 74 MATHCAD, 260, 262 maxima, 197 Maxwell distribution, 244 mean absolute gradient (MAG), 31, 36 measured brightness distribution, 249 metric, 15 smoothed, 42 as structure tensor, 15–16 microdot, 265 negatively charged, 265, 267 positively charged, 265 microfields, EEM measurement of, practical applications of using, 278–291 microroughness, 262–278 “minimum-maximum” principle, 73, 81 Morlet wavelet, 61 multiphoton photoemission, 311–312 multislice methods, 114, 115, 120–122, 174, 203 dynamic of, 119 eigenenergy calculated by, 147–148 eigenfunction calculation by, 146–148 historical significance of, 123 refinement step and, 123
N nanoparticle films, 278, 279–280, 292 electric field (potential) distributed in, 280 photoemission density and, 280, 281 potential distribution, 280 nanostructures, designing and understanding evolution of, 113 NASA, 39 NC. See Non-Cartoon negatively charged strip, 263 Neumann boundary conditions, 7, 53, 54 noise amplification, 22, 27 power, 97 noisy signal, 92 Non-Cartoon (NC), 89 nonconvex potentials, 8 nonidentical atom columns, 180–181 nonisolated atom columns, 175–190 dynamic of, 175–176 equation solution of, 175–176 pair of identical, 178–180 a pair of nonidentical, 180–181 symmetry arguments of, 176–178 wave function of, 175 nonlinear complex diffusion, 6, 74 general equation of, 74 ramp functions and, 74 ramp-type soft edge application of, 76 nonlinear diffusion coefficient, 52 nonlinear diffusion processes, 7, 8, 14, 47 cartoon pyramid model and, 89 equation of, 35 nonstationarily blurred step image, 57 normalization coefficient, 222
O object microgeometry (relief), 268–269 object roughness, 269–274
object surface, imaging of small three-dimensional particles on, 292–303 object surface geometry (relief), EEM measurement of, 291–312 ODE. See ordinary differential equation 1S eigenfunctions, 124, 128, 158, 174, 217 as 2D quadratically normalized Gaussian or exponential function, 171–172 broadening, 169 and correlation diagrams, 178 energies of, 129 Gaussian parameterization of, 172 interaction with scattered wave, 161 LCAO, 187–188 scaled into uniform, 164–165 shape of, 165–166 uniform approximation for weak phase conditions, 167 optimal linear Wiener filter, 16 ordinary differential equation (ODE), 25–26 original signal, 91 overfocusing, 301 oxidation bulk, 290 reaction spread, 286 oxygen adsorption, 289, 290
P paraxial approximation, 119 partial beam restriction electrons, 240 image contrast and, 240 image contrast calculated with EEM, 240, 241 local electric fields and, 239–250 partial differential equations. See PDE particles, 299–302 3D, 292–303 electrons as imaging, 113
nano, 279–280 Pb, 307, 308 Patterson function, 207–208, 217 maps, 203, 208 Pb particles, 307, 308 PDE, 3 algorithms’ various capabilities of, 103–104 complexification generalized by filters of, 61 image processing, algorithms based on, 20–21 implementation of, 3 nonlinear, 14 processes of, 3 real and complex based schemes, 9 vector-valued, 17 PEEM. See photoemission electron microscope Perona-Malik nonlinear diffusion, 4–5, 6, 10, 12, 76, 77, 78, 79, 103 Perona-Malik (P-M) regularized scheme, comparison to adaptive scheme, 100 perturbation theory, 129 phase object approximation, 117 photoelectric current density, 264 photoelectron trajectories, 277 photoemission density, 280, 281 photoemission electron microscope (PEEM), 231, 267, 275, 278, 279, 285 oxygen adsorption images by, 289, 290 Pb particles and, 307 reaction-diffusion patterns obtained without beam restriction, 285 photoemission intensity, 288 photon beam, 304 excitation, 125 penetration depth, 311 Planck’s constant, 62 plane wave, 118
plane-wave-based, 114 plasmon-enhanced multiphoton photoemission, 305 P-M equation, 17 P-M model, 31 P-M process, 75, 87 p-n junction appearance of, 259 images of, on silicon diode with/ without electron beam restriction, 258, 259 inverse problem and, 260 semiconductor, 258–262 point resolution, 277 polarized light, 306, 307 position (spatial)-varying filtering operation, 94 positively charged strip, 263 potential step, 264 actual and displayed location distance between, 254 appearance of, 254 curves calculated for, 255 partial electron beam restriction calculation of under, 254–256 visualization of, 254–256 pulsed-laser excitation, 292
R radial equation, 132 Bessel differential equation similarity to, 135–137 of continuum eigenfunctions, 157–158 difficulty of, 132 problems with, 133 solution for, expansions in 2D quantum harmonic oscillator eigenfunctions, 138–142 solution for, expansions in basis set, 135–143 solution for, finite difference methods, 133–135 solution of, 221–222 ray tracing, 274
Rayleigh criterion, 276 real kernel, 67 formulation of, 67 Gaussian’s difference to, 68, 69 maximal amplification of, 67 properties of, 67–70 small theta approximation, 68 real space, 145 kinematic v. dynamic, 205 reflections, 210–217, 214–215 of a diamond-like structure, 211, 217 of a diamond-like structure in zone-axis orientation, 212–213 tilt’s influence on, 216–217 regularized shock filters, 76–81 previous related studies of, 76–80 second derivative magnitude in, 81 shock and diffusion coupling, 80–81 resolution, 40–41, 269 limitations, 310 super, 38–41 resolution deterioration, 270–274, 278 theoretical curves’ comparison for in an EEM, 274 resolution determination, 270 resolution enhancement, 40–41 Riemannian manifolds, 14–15 rosette-motion channeling, 168 Rudin-Osher-Fatemi (TV) scheme, 103 Runge-Kutta formalism, 275 Rutherford scattering, 125
S scalar OI denoising, 95 SNR of, 96 scalar process, 101, 102 scale invariance, 4, 28 scattered wave, 161 amplitude of, 162 Schrödinger equation, 61–62, 63, 85, 117, 118, 126, 143, 174, 192 Bloch wave method solved for, 122
for fast electrons, 119 solution of, 219–220 time-dependent, 146 Schrödinger operator, 62 screen brightness, 235–236 segmentation, 3 sharp focusing, 293–296, 298–302 sharpening. See also image sharpening axiomatic approach with, 21–46 denoising, contradictory to, 43 “off-the-shelf” comparisons with, 58 operator, 17 triplewell potential and, 49–50 by variational approach, 46–60 shock dislocation, 83 shock filters, 17–20. See also regularized shock filters comparison experiment of, 84, 85 complex, 81–84 deblurring of step edge, 18 FAB diffusion, comparisons to, 31 formulation of, 18 main properties of, 18–19 noise sensitivity of, 19, 20 robust processing of, 82 sine waves processed by, 19 total-variation preserving (TVP) scheme, 18 shock success, 83 SIMION 6.0, 256 simplified one-dimensional fields, 238–239 sine-wave signal, 19 sinusoidal height modulation, 272 slope, 83 slopes variance, 83 small particles, 308 small theta approximation, 64–66, 68, 69, 70 small-angle nonparallel illumination, 200–201 spatial discretization, 5 resolution, 278 shift invariance, 3
spectral curve, 288 spherical particle diameter, 302 parameters, 298–299, 302–304 shadow, 304–306 spherule, 294, 295, 296–297, 299, 303, 305 spherule diameter, 303, 304–306 spot diameter, 303, 306 SR. See super-resolution S-state model, 124, 143, 160–175, 185, 186 analytical, 182, 183 associated eigenfunction of, 144 atom columns, for general assembly of, 181–182, 197 atom columns, for general assembly of, accuracy of, 182–184, 198–200 channeling map of, 160–163 in channeling theory, 125–130 channeling wave and, 202 conclusion to study of, 173–175 in crystal or beam tilt, 191–201 direct methods, 205–210 dumbbells, deviations of with, 214 electron diffraction and, 203–218 electron potential energy in, 144 equations of, 126–128 interpretation of, 128 intuitive physical insight of, 174 1S parameterized, fast method for calculation of, 171–173 for isolated atom column, 125 main goal of, 182 mathematical formulation of, 160–161 for nonisolated atom columns, 175–190 physical insight into, 160–163 principles of, 124, 173–174 reflections, inability to express intensity by, 213–214 scaling and parameterization of, 164–171
Schrödinger 2D equation similarity to, 126 small angle nonparallel illumination and, 200–201 testing process of, 182–183 tilt introduced to, conclusions of, 201 wave function as, 160, 182 wave function of Fourier space described by, 203–218 stable process, 85 successive approximations, 247 super-resolution (SR), 38 application of, 40 definition of, 38–39 processor of, 39 single-image proposed scheme, 39
T tangential velocity, 247–248 Taylor expansion, 64 tensor diffusivity, 6 texture detector, 97 texture-preserving denoising, 87–104 adaptive IO problem, 90–97 cartoon pyramid model and, 89–90 discussion of, 102–103 equations of, 88 examples of, 97–102 implementation details of, 99–100 introduction to, 87–89 limitations of, 88 main features of new methods of, 104 thermacathodes, 267 three-dimensional objects, 306–312 Tikhonov regularization, 16 tilt, 196, 217–218 beam, 191–201 crystal, 191–201 eigenfunction excitation and, 195–196 electron-object interaction decreased with increased, 194–195 reflections influence of, 216–217
S-state model and, 197 S-state model conclusion of testing with, 201 S-state model, general assembly with, accuracy of, 198–200 tilted illumination, 195, 197 Topping formula, 289 total variation (TV) denoising, 8, 10, 11, 12, 104 total variation (TV) minimizing process, 87 transmitted wave function, 121 triplewell flux, 60 triplewell formulation, smooth region stability of, 59 triplewell potential, 49–50 1D diffusion coefficient corresponding to, 51 forward/backward diffusion regions in, 50 proposed approach of, 60 saturation in large gradient enhancements of, 59 steady-state solutions and, 53–54 triplewell process, 55 main features of new methods of, 103 two-dimensional quantum harmonic oscillator, 173, 174 eigenfunctions calculated for, 219–223 expansion of, 152–154 two-dimensional quantum harmonic oscillator eigenfunctions, 174 accuracy of, 154 Bessel functions of first kind v., 150–156 eigenenergy calculated by, 151, 152, 153, 154, 155 eigenenergy of, 139 expansion, 138–142 two-dimensional quantum harmonic oscillator width, 156–157 optimization of, 142–143
U ultra high vacuum (UHV), 279 underfocusing, 301–302 unpolarized light, 309 unsharp masking, 17
V vacuum wave, 161, 202 value of, 162 variational approach, 3, 7–13 with bound eigenfunctions, 142–143 double-well potential, 47–49 energy wells, 49–56 FAB similarity to, 46 related studies to, 47 sharpening by, 46–60 variational principle, 171–173, 174 eigenenergy calculated by, 173 vector-valued diVusion, 14 VisTex archive (VisTex Vision Texture Archive), 100
W wave channeling, 162, 202 electron, 185, 186 exit, 114, 115, 202 plane, 118 scattered, 161–162 vacuum, 161, 162, 202 wave function, 183, 193, 217–218 analytical expression of, 165 dumbbells and, 214–216
electron diffraction and, 203–205 of electrons, 118, 124, 174 of Fourier space described by S-state model, 203–218 incident, 116 maxima shift in amplitude and phase of, 197 of nonisolated atom columns, 175 plotting of, 163 Si crystal and, 213 Si crystal, real and imaginary parts of, 213 of a specimen foil, 213–214 S-state model as, 160, 182 sum of, 161 transmitted, 121 weak image contrast, 246–247 weak phase conditions, 167 weak phase object approximation, 115–120, 127 Weickert’s coherence-enhancing diffusion, 57 well-shaped potentials, 47–49 Wiener filter, 94, 96
Y Young measures, 48
Z zone-axis orientation, 114, 211, 212, 217 reflections of a diamond-type structure in, 212–213