ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS VOLUME 70
EDITOR-IN-CHIEF
PETER W. HAWKES Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique, Toulouse, France
ASSOCIATE EDITOR
BENJAMIN KAZAN Xerox Corporation Palo Alto Research Center Palo Alto, California
Advances in
Electronics and Electron Physics EDITED BY PETER W. HAWKES Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique, Toulouse, France
VOLUME 70
ACADEMIC PRESS, INC. Harcourt Brace Jovanovich, Publishers Boston San Diego New York Berkeley London Sydney Tokyo Toronto
COPYRIGHT © 1988 by ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC. 1250 Sixth Avenue, San Diego, CA 92101
United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. 24-28 Oval Road, London NW1 7DX
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 49-7504 ISBN 0-12-014670-3 PRINTED IN THE UNITED STATES OF AMERICA
88 89 90 91   9 8 7 6 5 4 3 2 1
CONTENTS

CONTRIBUTORS TO VOLUME 70
PREFACE

Scanning Electron Microscopy at Very Low Temperatures
R. P. HUEBENER
  I. Introduction
  II. Low-Temperature Stage
  III. Principles and Electron Beam Parameters
  IV. Interaction between Electron Beam and Specimen
  V. Superconducting Tunnel Junctions: Pair Tunneling
  VI. Superconducting Tunnel Junctions: Quasiparticle Tunneling
  VII. Arrays of Superconducting Tunnel Junctions
  VIII. Hotspots in Superconducting Microbridges
  IX. Current Filaments and Turbulence in Semiconductors
  X. Ballistic Phonon Signal
  XI. Phonon Focusing
  XII. Imaging of Structural Defects with Ballistic Phonons
  Acknowledgments
  References

Robust Image Models and Their Applications
R. L. KASHYAP AND KIE-BUM EOM
  Abstract
  I. Introduction and Overview
  II. AR and ARMA Models
  III. Robust Estimation in Causal Autoregressive Models
  IV. Image Restoration with Robust Image Modelling Techniques
  V. Composite Edge Detection
  VI. Summary and Suggestions
  References

Physical Limits in Information Processing
ROBERT W. KEYES
  I. Introduction
  II. Representation of Information
  III. Systems
  IV. The Nature of Devices
  V. Transistors
  VI. Wiring
  VII. Fabrication
  VIII. Dissipation of Energy
  IX. Concluding Remarks
  References

Synthetic Aperture Ultrasonic Imagery
KEINOSUKE NAGAI
  I. Introduction
  II. Imaging System and Aperture
  III. Theory and Application of Holography
  IV. Fundamentals of Digital Ultrasonic Imaging
  V. Properties of a Transducer Array
  VI. Actual Digital Imaging System
  VII. Diffraction Tomography as the Inverse Problem
  References

Dual Complementary Variational Techniques for the Calculation of Electromagnetic Fields
J. PENMAN
  I. Introduction
  II. A Historical Perspective
  III. Complementary Variational Principles
  IV. The General Engineering Field Problem
  V. Field Problems in Engineering
  VI. Magnetostatics
  VII. The Electrostatic Field
  VIII. The Electromagnetic Field
  IX. Concluding Remarks
  Acknowledgments
  References
CONTRIBUTORS TO VOLUME 70

The numbers in parentheses indicate the pages on which the authors' contributions begin.

Kie-Bum Eom, School of Electrical Engineering, Purdue University, West Lafayette, Indiana 47907 (79)
R. P. Huebener, Physikalisches Institut II, Universität Tübingen, D-7400 Tübingen, Federal Republic of Germany (1)
R. L. Kashyap, School of Electrical Engineering, Purdue University, West Lafayette, Indiana 47907 (79)
Robert W. Keyes, IBM T. J. Watson Research Center, P.O. Box 218, Yorktown Heights, New York 10598 (159)
Keinosuke Nagai, Institute of Applied Physics, University of Tsukuba, Sakura, Ibaraki 305, Japan (215)
J. Penman, Department of Engineering, University of Aberdeen, Aberdeen, Scotland (315)
PREFACE

The five chapters in this volume range over the traditional subjects of these Advances, with an aspect of scanning electron microscopy in first place. Very low temperature SEM is a relatively new development: it merits separate treatment, for not only does it enable us to study superconductors by the techniques well developed in the scanning field, but it also proves to generate information of a different kind, due essentially to the localized heating effect caused by the energy deposited by the scanning beam. Not much of the work described here has yet found its way into the textbooks of SEM and we are delighted to publish so complete and authoritative an account by R. P. Huebener in these pages.

The second chapter, by R. L. Kashyap and K. B. Eom, reflects my efforts to increase the coverage of digital image processing in these pages. Image models are needed in various types of image processing, image restoration by statistical methods and image segmentation in particular. R. L. Kashyap has made important contributions to our understanding of image models, especially those that are robust in the sense that they remain reliable even if the assumptions on which they are based are not exactly satisfied. He and K. B. Eom cover the theoretical background of AR and ARMA models, robust estimation in the causal case, and restoration and edge detection based on these statistical foundations.

The third chapter examines questions raised by the extraordinary rate at which miniaturization is progressing. There must be a limit to this progress, but what is it, what governs it, and is it likely to be reached? R. W. Keyes considers the numerous factors involved, which are of very different kinds. At one extreme, we have physical laws governing speeds of propagation, mathematical laws associated with topology, the slippery rules of uncertainty, and the constraints of thermodynamics.
At the other, there are economic pressures, which tend to mean that large scale progress, as opposed to isolated achievement, can only be expected if it pays off. R. W. Keyes succeeds in keeping all these pressures in mind in his discussion of the various devices used to process information.

In the fourth chapter we return to imagery, and, in particular, to synthetic aperture ultrasonic images. Ultrasound images are found in medicine, wherever non-destructive testing is vital and in the study of natural resources. Unfortunately, the wavelengths employed are frequently comparable with those of the structures of interest, and synthetic aperture techniques have hence been developed to provide good images despite this. K. Nagai examines the whole field, from wave propagation to the properties of transducer arrays and the performance of modern systems. This authoritative account will surely be of use to the experienced and invaluable to newcomers to the subject.

The final chapter is concerned with a problem that is encountered in all branches of electronics and electron physics: the calculation of electromagnetic fields. The dual complementary variational techniques presented here are not as well known as they deserve to be, despite the ubiquity of the finite-element method. J. Penman first sets out the mathematical tools needed and then examines first the two static cases, with examples from electrostatics and magnetostatics. He subsequently turns to the electromagnetic field, devoting a section to eddy current problems. This clear account of the ideas will surely help many of us confronted with the problem of field calculation to appreciate these methods.

As usual, we conclude with a list of forthcoming chapters.

Peter W. Hawkes

Parallel Image Processing Methodologies (J. K. Aggarwal)
Image Processing with Signal-Dependent Noise (H. H. Arsenault)
Scanning Electron Acoustic Microscopy (L. J. Balk)
Electronic and Optical Properties of Two-Dimensional Semiconductor Heterostructures (G. Bastard et al.)
Inverse Problems (M. Bertero)
Pattern Recognition and Line Drawings (H. Bley)
Magnetic Reconnection (A. Bratenahl and P. J. Baum)
Sampling Theory (J. L. Brown)
Dimensional Analysis (J. F. Cariñena and M. Santander)
Electrons in a Periodic Lattice Potential (J. M. Churchill and F. E. Holmstrom)
The Artificial Visual System Concept (J. M. Coggins)
Accelerator Physics (F. T. Cole and F. Mills)
High-Resolution Electron Beam Lithography (H. G. Craighead)
Corrected Lenses for Charged Particles (R. L. Dalglish)
Environmental Scanning Electron Microscopy (D. G. Danilatos)
The Development of Electron Microscopy in Italy (G. Donelli)
Energy-Loss Spectroscopy (J. Fink)
Amorphous Semiconductors (W. Fuhs)
Median Filters (N. C. Gallagher and E. Coyle)
Bayesian Image Analysis (S. and D. Geman)
Vector Quantization and Lattices (J. D. Gibson and K. Sayood)
Aberration Theory (E. Hahn)
Ion Optics (D. Ioanoviciu)
Systems Theory and Electromagnetic Waves (M. Kaiser)
Phosphor Materials for CRTs (K. Kano et al.)
The Scanning Tunnelling Microscope (H. Van Kempen)
Multi-Colour AC Electroluminescent Thin-Film Devices (H. Kobayashi and S. Tanaka)
Spin-Polarized SEM (K. Koike)
Proton Microprobes (J. S. C. McKee and C. R. Smith)
Ferroelasticity (S. Meeks and B. A. Auld)
Active-Matrix TFT Liquid Crystal Displays (S. Morozumi)
Image Formation in STEM (C. Mory and C. Colliex)
Electron Microscopy in Archaeology (S. L. Olsen)
Low-Voltage SEM (J. Pawley)
Languages for Vector Computers (R. H. Perrott)
Electron Scattering and Nuclear Structure (G. A. Peterson)
Electrostatic Lenses (F. H. Read and I. W. Drummond)
Historical Development of Electron Microscopy in the USA (J. H. Reisner)
Atom-Probe FIM (T. Sakurai)
X-Ray Microscopy (G. Schmahl)
Applications of Mathematical Morphology (J. Serra)
Focus-Deflection Systems and Their Applications (T. Soma et al.)
Electron Gun Optics (Y. Uchikawa)
Electron Beam Testing (K. Ura)
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 70
Scanning Electron Microscopy at Very Low Temperatures
R. P. HUEBENER
Physikalisches Institut II, Universität Tübingen, D-7400 Tübingen, Federal Republic of Germany
I. Introduction
II. Low-Temperature Stage
III. Principles and Electron Beam Parameters
IV. Interaction Between Electron Beam and Specimen
  A. Thermalization of the Beam Energy
  B. Localized Heating Effect: Thermal Healing Length and Thermal Relaxation Time
V. Superconducting Tunnel Junctions: Pair Tunneling
VI. Superconducting Tunnel Junctions: Quasiparticle Tunneling
VII. Arrays of Superconducting Tunnel Junctions
VIII. Hotspots in Superconducting Microbridges
IX. Current Filaments and Turbulence in Semiconductors
X. Ballistic Phonon Signal
XI. Phonon Focusing
XII. Imaging of Structural Defects with Ballistic Phonons
Acknowledgments
References
I. INTRODUCTION
Today scanning electron microscopy (SEM) is a widely used analytical tool providing structural information in many different fields such as materials science, solid state physics, microelectronics, biology, and the medical sciences (Reimer, 1985). The principle of SEM simply consists of scanning the surface of the specimen with a well-focused electron beam and simultaneously recording a suitable response signal generated by the interaction of the beam with the specimen. If this signal is displayed following the same geometric pattern as in the scanning process, a two-dimensional image of some specimen property can be generated. Usually, the response signal is utilized for modulating the brightness on the screen of a cathode-ray tube, which is operated synchronously with the electron-beam scanning process. In many
applications of SEM, the response signal consists of the emitted secondary electrons or the back-scattered electrons. In addition, the emission of Auger electrons and x-ray photons is often utilized for structural imaging. Of course, the interaction processes generating these signals take place only within the penetration depths of the primary beam electrons in the sample material. Hence, the information obtained in this way is restricted to a region close to the specimen surface. The spatial resolution of SEM is determined by the diameter of the region perturbed by the electron beam and acting as the signal source. Hence, the beam diameter represents the ultimate spatial resolution limit. However, the beam-induced perturbation of the specimen often extends appreciably beyond the beam diameter, resulting in a corresponding deterioration of the resolution limit, perhaps by several orders of magnitude. If the primary electron beam irradiating the sample is temporally structured, time dependent phenomena can also be investigated by SEM. Using the stroboscopic principle, strongly time-dependent structures can be observed with high temporal and spatial resolution. Of course, the principle of scanning microscopy can be extended to any other probe that is movable in a two-dimensional pattern (Ash, 1980). A moving laser beam, acoustic beam, or mechanical micro-contact represent some examples that have been used for two-dimensional imaging. However, due to the well developed technology for generating and manipulating a sharply focused electron beam, so far electron-beam scanning has found the widest application in scanning microscopy. Here the small value of the beam diameter and the long working distance between specimen and lower polepiece of the final lens, which can be achieved, represent distinct advantages of SEM. Although SEM is now widely used as an analytical tool, its extension to the regime of very low temperatures is still relatively rare. 
Here we have in mind the temperature range provided by liquid helium, i.e., temperatures around 4 K and below down to about 1.5 K. For experiments in this temperature range one needs a scanning electron microscope equipped with a well-functioning liquid-He stage. With such an apparatus, two types of studies can be performed. First, typical low temperature phenomena such as superconductivity and low-temperature devices used in cryoelectronics can be investigated. Second, experiments can be performed where the temperature range of liquid He is required by the measuring principle. The ballistic phonon signal represents an example for the second case. This signal requires a long phonon mean free path and a highly sensitive phonon detector, both being realized only in the temperature range of liquid He. In the following review, we will deal with both types of applications of low temperature scanning electron microscopy (LTSEM). Of course, in addition to electron beam scanning other scanning
techniques can also be extended to the liquid-He temperature range (see Huebener, 1984; Bosch et al., 1986). In the following we only discuss these other scanning experiments if they bear directly upon the results obtained by LTSEM. We do not present a critical evaluation and comparison of the different scanning techniques which can be performed at low temperatures.

The signals mainly to be utilized in SEM performed in the temperature range of liquid helium are expected to be different from those usually used during room temperature operation and mentioned above. Generally, the latter signals do not provide any new information if the sample is cooled to low temperatures. We will see that it is the localized heating effect caused by the electron beam during the scanning process that generates the important response signal providing the structural information about the specimen. This prominent role of the electron beam as a localized heat source and the importance of the beam-induced thermal perturbation of the sample result from the fact that at low temperatures many material properties can depend sensitively upon temperature. Here superconductors represent a particularly striking example since their energy gap often corresponds to thermal energies of only a few K. Of course, in an ancillary and helpful way the "usual signals" discussed above are always used in LTSEM in addition to the new signals only obtained at low temperatures.

In this review we summarize the results obtained recently by LTSEM. Following a brief discussion of the main features of the low-temperature stage in §II, we treat the important underlying principles of LTSEM in §III. In §IV we discuss the interaction between the electron beam and the specimen, concentrating only on the signal generating processes important for the low temperature experiments treated in the remainder of this article. In Sections V-VII we deal with spatial structures observed by LTSEM in superconducting tunnel junctions.
In §VIII we discuss experiments relating to spatial temperature structures in current-carrying superconducting microbridges. Spatial structures generated in semiconductors during avalanche breakdown at low temperatures and observed by LTSEM are treated in §IX. The signals discussed in Sections VI-IX have some similarity to the concepts of the electron-beam induced current (EBIC) or electron-beam induced voltage (EBIV) utilized often in studies of semiconductors by means of SEM (see Reimer, 1985; Ehrenberg and Gibbons, 1981). In §§X-XII we deal with a distinctly different signal for spatial imaging, namely the ballistic phonon signal. Here the region locally heated by the electron beam acts as a source of ballistic phonons (quanta of sound energy) in a similar way as the heated filament in a light bulb acts as a source of photons (quanta of electromagnetic energy). The ballistic phonons can serve for imaging the phonon focusing effect based on the elastic anisotropy in a single crystal. They can also be utilized for imaging structural defects in a crystal. These two applications of the ballistic phonon signal are treated in §XI and §XII, respectively.
II. LOW-TEMPERATURE STAGE

The best experimental setup for performing scanning electron microscopy at liquid-He temperatures appears to be an arrangement where one side of the specimen is in direct contact with the liquid-He bath, whereas the opposite side of the sample can directly be scanned with the electron beam. Such an arrangement is shown schematically in Fig. 1. Further, it is highly advantageous if the temperature of the liquid-He bath can be reduced from 4.2 K down to about 1.5 K by pumping. The operation in the temperature range below 2.17 K is of particular interest, since here the cooling efficiency of liquid He is strongly increased due to its superfluid state. Because of these considerations a bath cryostat extending into the sample chamber of the scanning electron microscope appears to be the best possible choice.

The typical features of a low-temperature stage based on the principles indicated above are shown schematically in Fig. 2 (Seifert, 1982). On the left side we see the lower part of a conventional ⁴He cryostat consisting of a cylindrical liquid-He tank surrounded by a liquid-nitrogen tank for precooling and thermal shielding. The cryostat extends horizontally into the sample chamber of the microscope. On the right side in Fig. 2 we see the lower part of the electron-beam column and the sample chamber of the microscope. For thermal shielding it is important that the part cooled to liquid-nitrogen temperatures extends into the sample chamber in addition to the liquid-He tank. Horizontal adjustment of the sample position is possible by mechanically shifting the base plate carrying the low-temperature stage. Flexible connections between the low-temperature stage in the sample chamber and the cryostat on the left provide the necessary mechanical freedom. Helium gas bubbles that may form near the sample and impede the cooling of the specimen can be removed by a circulation pump within the ⁴He cryostat.
[Figure 1: schematic with labels: electron beam, thin-film specimen, substrate, liquid He.]
FIG. 1. Sample configuration for LTSEM.
[Figure 2: schematic with panels labeled CRYOSTAT, MICROSCOPE, and CRYO STAGE.]
FIG. 2. Schematics of the low-temperature stage. 1, outer wall; 2, LN2 reservoir; 3, LHe reservoir; 4, vacuum space; 5, position of LHe transfer line; 6, driving shaft; 7, circulation pump; 8, LHe transfer tubes with bellows; 9, LN2 shield; 10, copper ribbon for thermal coupling; 11, micrometer screw for shifting micropositioning stage; 12, mechanical vacuum feedthrough; 13, push rod; 14, mounting plate; 15, x-y micropositioning stage; 16, LN2 base plate; 17, LHe base plate; 18, spacers for thermal isolation; 19, LHe tank; 20, sample; 21, microscope chamber; 22, microscope column; 23, electron beam (reproduced from Seifert, H., Cryogenics, 1982, 22, 657-660, by permission of the publishers, Butterworth & Co (Publishers) Ltd. ©).
The top plate of the He tank in the sample chamber can be used directly as sample holder. Such a sample mounting configuration is shown schematically in Fig. 3. Here the sample material separates the liquid He from the vacuum of the electron-beam column. The sample, shaped preferably as a round disk (of, say, 20 mm diameter and 2-3 mm thickness), is fixed mechanically by a clamping screw which also compresses the indium seal between sample and top plate. It is important to keep the sample position sufficiently low such that the liquid-He level is always higher. The top plate is fixed by a clamping ring and sealed also with indium. A shield above the sample with a hole for the electron beam provides protection against thermal radiation. Typical dimensions of the He tank are about 4 cm height and 5 cm diameter, corresponding to a volume of about 80 cm³.

Of course, this sample mounting configuration is also very useful for investigating thin-film structures deposited on the top of a proper substrate, the substrate being again preferably shaped as a round disk. Due to its high heat conductivity, single-crystalline sapphire is well suited as substrate
FIG. 3. Sample mounting configuration. 1, electron beam; 2, sample; 3, sample holder; 4, clamping screw; 5, copper ring for wire heat sinking; 6, thermal shield; 7, LHe tank; 8, clamping ring; 9, indium seal; 10, LHe tubes (from Seifert, 1982).
material for such thin-film structures. In the same way, other specimens which are unsuitable to act directly as the separating wall between the liquid He and the vacuum because of their small size or their mechanical weakness can be mounted on such a substrate material with high heat conductivity. Sufficient thermal contact to the substrate can be attained by a proper medium such as Stycast cement, vacuum grease, etc.

A photograph of a low-temperature stage which has been used for several years in the laboratory of the author is shown in Fig. 4. The circular flange (diameter = 6.8 cm) represents the top plate of the He tank, the sample being located below the opening in the middle. The whole stage is to be inserted into the sample chamber of the microscope. In the back, the end plate of the horizontal extension of the cryostat with the rubber O-ring seal can be seen.

If electrical current and voltage leads are to be attached to the top side of the sample or of the substrate (electrical connections to a thin-film structure, etc.), it is important that these lead wires are thermally anchored to the liquid helium bath after a short distance, in order to minimize sample heating effects.

In some applications of LTSEM, it is necessary to apply an external magnetic field to the sample. Such a field can be generated by a small superconducting coil placed in the liquid He surrounding the sample location. On the other hand, it can become important to carefully shield the sample against any ambient external magnetic field such as the earth's magnetic field. An effective magnetic shield of the sample can be fabricated from a magnetically soft material such as Cryoperm (obtained from Vakuumschmelze GmbH, Hanau, FRG). Figure 5 shows a cross-section of the complete sample mounting configuration with the magnetic shield in place. Such a magnetic shielding
FIG.4. Low-temperature stage.
has turned out to be highly effective in LTSEM studies of superconducting tunnel junctions, as will be discussed in Sections V-VII.

The ease with which the low-temperature stage can be attached to and removed from the scanning electron microscope represents an important consideration. A quick turnaround time for changing the sample is always attractive. Further, intermittent operation of the microscope at room temperature for conventional applications is often required. A low-temperature stage of the type shown in Figs. 2 and 3 can be self-supporting and
FIG. 5. Cross section of the sample mounting stage including the magnetic shield. 1, electron beam; 2, sample; 3, sample holder; 4, clamping screw; 5, magnetic shield.
provides this flexibility. On the other hand, scanning electron microscopes are commercially available today, where the low-temperature stage can directly be attached to the hinged door of the sample chamber without any further mechanical provisions. The fact that the weight of some of the more elaborate sample tables available today is similar to the weight of the complete liquid-He stage including the cryostat provides such a possibility. (As an example, in the laboratory of the author, a liquid-He stage is mounted in this way to the hinged door of a Camscan Model S4DV scanning electron microscope.)

It is important to note that we have concentrated so far on a form of the low-temperature stage which is most universal in terms of its applicability and most effective in terms of its cooling power. For special applications simpler and less expensive cold stages are often adequate and commercially available. Here the specimen is mounted on some cold finger extending into the sample chamber of the microscope, and no direct contact between the liquid He and the specimen is provided. At present the lowest temperature which can be reached with such simple cooling stages is often limited to 10-15 K. In this review we exclude such applications from our discussion.

III. PRINCIPLES AND ELECTRON BEAM PARAMETERS

The principle of SEM is illustrated in Fig. 6. The primary electron beam is scanned over the specimen surface and produces a localized perturbation of the sample. As a consequence the sample generates a response signal which generally depends upon the coordinates of the electron beam focus. The electron beam of a separate cathode ray tube (CRT) is operated synchronously with the primary electron beam. If the beam intensity of the CRT is modulated by means of the sample response signal, a two-dimensional image of the specimen property corresponding to this response signal is obtained.
In addition to this two-dimensional display by means of the brightness on the CRT, a linear scan with the signal amplitude plotted against the scanned coordinate is often useful for a quantitative analysis of the results. This latter operational mode is referred to as "y-modulation".

For SEM studies at low temperatures, the localized heating effect of the electron beam already results directly in a highly useful response signal (see Clem and Huebener, 1980). At liquid-He temperatures the electronic properties of superconductors as well as semiconductors respond sensitively to small changes in temperature. Therefore, in many applications of LTSEM the electron-bombardment induced conductivity plays a central role for imaging. (A similar effect is often used at room temperature in SEM studies of semiconductors or electrical insulators; see, e.g., Reimer, 1985; Ehrenberg and Gibbons, 1981.)
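The scanning principle described above can be illustrated with a minimal sketch. It assumes an idealized probe that visits each point of a rectangular grid and records one response value per point; the function names `raster_scan`, `line_scan`, and `toy_response`, and the toy sample itself, are hypothetical and serve only as illustration.

```python
import numpy as np

def raster_scan(response, nx, ny):
    """Build a 2-D image by stepping a probe over an nx-by-ny grid.

    `response(ix, iy)` is a hypothetical stand-in for the measured
    sample response signal at beam position (ix, iy).
    """
    image = np.empty((ny, nx))
    for iy in range(ny):        # slow scan direction (successive lines)
        for ix in range(nx):    # fast scan direction (points along a line)
            image[iy, ix] = response(ix, iy)
    return image

def line_scan(image, iy):
    """Signal amplitude versus x along one scan line ("y-modulation")."""
    return image[iy, :].copy()

def toy_response(ix, iy):
    # Hypothetical sample: unit signal inside a small square, zero elsewhere.
    return 1.0 if (2 <= ix <= 4 and 1 <= iy <= 3) else 0.0

image = raster_scan(toy_response, nx=8, ny=5)   # brightness-modulated display
profile = line_scan(image, iy=2)                # amplitude along a single line
```

The array `image` plays the role of the brightness-modulated CRT display, while `profile` corresponds to a single linear scan used for quantitative analysis.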
[Figure 6: schematic with labels: beam source, primary electron beam, deflection, response signal, CRT screen.]
FIG. 6. Principle of scanning electron microscopy.
At low temperatures the phonon density in a crystal is strongly reduced compared to the region locally heated by the electron beam. Therefore, this heated region acts as a strong source of phonons which propagate ballistically, i.e., without any scattering, over long distances through the remainder of the crystal. This ballistic phonon signal can also be utilized for structural imaging.

In many applications of LTSEM the electron beam should represent a passive probe. Hence, the beam-induced perturbation of the sample and the beam power should be as small as possible. On the other hand, the beam power must be sufficiently large for generating a detectable signal. With a typical beam voltage and current of 20 kV and 10 pA, respectively, we have a beam power of 0.2 µW. As an example we consider a thin-film specimen deposited on a substrate with high thermal conductivity. We assume that the thickness $d$ of the specimen film is larger than the range of penetration for the electrons of the primary beam into the solid. (This range of penetration is about 1 µm for a beam energy near 20 keV and for materials such as the noble metals.) In this case, the beam power is totally dissipated within the specimen film. The resulting temperature rise $T - T_b$ in the film is given by

$$T - T_b = \frac{P_0}{\pi \eta^2 \alpha},$$
$$T - T_b = \frac{P_0}{\pi \eta^2 \alpha} \qquad (1)$$
where $T$ is the temperature in the irradiated film region and $T_b$ is the bath temperature. $P_0$ is the beam power and $\alpha$ the heat transfer coefficient describing the heat transfer between the film and the substrate. The quantity $\eta$ is the thermal healing length given by (see Huebener, 1984; Clem and Huebener, 1980)
$$\eta = \left(\frac{\kappa d}{\alpha}\right)^{1/2} \qquad (2)$$
where $\kappa$ is the heat conductivity of the film. Assuming the value $\pi\eta^2 = 10$ µm² and taking for the heat transfer coefficient near 4.2 K the value $\alpha = 1$ W cm⁻² K⁻¹ (Skocpol et al., 1974; Schulze and Keck, 1984), we find for $P_0 = 0.2$ µW the temperature increment $T - T_b = 2$ K.
The sensitivity for signal detection can be strongly increased if the electron beam is modulated and phase-sensitive detection is applied. At sufficiently high modulation frequencies the spatial spreading of the modulated sample perturbation is reduced because of thermal or electronic skin effects (Clem and Huebener, 1980). In this way, the spatial resolution of LTSEM can be increased considerably. This gain in resolution is limited by the fact that the amplitude of the modulated signal decreases with increasing modulation frequency. We shall return to this question in the following section. Investigations of time-dependent effects can be performed with high time resolution if short electron beam pulses are used in combination with a boxcar technique.
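The temperature estimate quoted above can be checked by direct arithmetic. The following minimal sketch assumes that the temperature rise equals the beam power divided by the product of the heated area $\pi\eta^2$ and the heat transfer coefficient $\alpha$, which reproduces the quoted 2 K; the function and variable names are illustrative and are not taken from the source.

```python
import math

def beam_power_watts(voltage_v, current_a):
    """Beam power as the product of beam voltage and beam current."""
    return voltage_v * current_a

def temperature_rise_kelvin(power_w, heated_area_cm2, alpha_w_per_cm2_k):
    """Assumed balance: absorbed power = alpha * heated area * (T - T_b)."""
    return power_w / (heated_area_cm2 * alpha_w_per_cm2_k)

# Values quoted in the text: 20 kV, 10 pA, pi*eta^2 = 10 um^2, alpha = 1 W cm^-2 K^-1
p0 = beam_power_watts(20e3, 10e-12)
heated_area_cm2 = 10.0 * 1e-8          # 1 um^2 = 1e-8 cm^2
delta_t = temperature_rise_kelvin(p0, heated_area_cm2, 1.0)

print(f"beam power  P0      = {p0 * 1e6:.2f} uW")
print(f"temperature rise dT = {delta_t:.2f} K")
```

Because the rise scales linearly with beam power in this estimate, reducing the beam current by a factor of ten reduces the perturbation to about 0.2 K under the same assumptions.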
IV. INTERACTION BETWEEN ELECTRON BEAM AND SPECIMEN
Due to the electron beam irradiation, both charge and energy are transferred to the specimen. Whereas the charge transfer is not of primary importance, it is the energy transfer that is generally utilized for imaging. This energy transfer takes place via inelastic scattering processes (generation of phonons). As a result, the sample is locally heated by the electron beam. This localized heating effect represents the important sample perturbation which is utilized for imaging in LTSEM (Clem and Huebener, 1980). Therefore, in the following, we concentrate on the thermalization of the beam energy and the corresponding time and length scales. Further, we deal with the characteristic healing length and relaxation time associated with the beam-induced thermal perturbation of the specimen. We do not discuss secondary and backscattered electrons, Auger electrons, or x-ray emission. Although these latter probes provide important signals for imaging by conventional SEM (Reimer, 1985), they are not characteristic of imaging in LTSEM.
A. Thermalization of the Beam Energy
The energy loss of the beam electrons in a stopping medium and the sequence of collisions can be described as a continuous process if the energy transfer in a Coulomb interaction between the incident and the target electrons is much smaller than the energy $E_0$ of the incident electrons. The energy loss can then be expressed per unit path length along the electron trajectory in the target, leading to the equation (see Reimer, 1985; Birkhoff, 1958)
$$\frac{dE}{dx} = -4\pi N Z \frac{e^4}{m v^2} \ln B_i \qquad (3)$$
Here $E$ is the electron energy, and $x$ is the coordinate along the electron path in the target. The coordinate $x$ is taken to increase from zero at the target surface as the beam electrons proceed into the target interior. Due to the sequence of collisions the coordinate $x$ follows a complicated path and is not simply the distance of the beam electrons from the target surface. $N$ is the number of target atoms per unit volume, $Z$ is the number of electrons per target atom, $e$ is the elementary charge, $m$ is the electron mass, and $v$ is the velocity of the electrons. The quantity $B_i$ denotes the ratio of the maximum to the minimum impact parameter and depends upon the electron energy. The derivative $dE/dx$ is called the stopping power, and Eq. (3) is often referred to as the Bethe loss formula (Bethe, 1930). Because of its weak influence, in the following, we ignore the factor $\ln B_i$ and rewrite Eq. (3) in the form
$$\frac{dE}{dx} = -\frac{2\pi N Z e^4}{E} \qquad (4)$$
using $E = m v^2/2$. With the boundary condition $E = E_0$ for $x = 0$, we obtain the solution
$$\frac{1}{2}\left(E_0^2 - E^2\right) = (2\pi N Z e^4)\,x \qquad (5)$$
According to Eq. (5) the total path length $L$ travelled by the beam electrons before reaching the thermal energy of the target atoms is
$$L = \frac{E_0^2}{4\pi N Z e^4} \qquad (6)$$
Here, in view of an energy $E_0$ as high as about 20 keV and of a target kept near liquid-He temperatures, we have set the final energy $E$ corresponding to thermal equilibrium with the target equal to zero. The thermalization time $\tau_b$ is obtained from
$$\tau_b = \int_0^L \frac{dx}{v(x)} \qquad (7)$$
Inserting $v(x)$ using Eq. (5) we find
$$\tau_b = \frac{m^2 v_0^4}{4}\,(3\pi N Z e^4 v_0)^{-1} \qquad (8)$$
where $v_0$ is defined by $E_0 = m v_0^2/2$. As an example, we consider lead as a target material and take $E_0 = 10$ keV as the incident beam energy. From Eq. (8) we then obtain
$$\tau_b = 3 \times 10^{-14}\ \mathrm{s} \qquad (9)$$
This value of the thermalization time appears to be consistent with the experimental observation that the emission of secondary electrons reaches its final value in a time less than $10^{-10}$ s following the impact of the primary beam electrons (Oatley, 1972). From Eqs. (6) and (8) one obtains the relation
$$L = \frac{3}{4}\, v_0 \tau_b \qquad (10)$$
Except for the factor 3/4, this is the path length the electrons would travel along their trajectory in the target material during the time $\tau_b$ if they had the constant velocity $v_0$. For the value $v_0 = 6 \times 10^9$ cm s⁻¹ corresponding to $E_0 = 10$ keV and the thermalization time of Eq. (9), we find from Eq. (10) the path length $L = 1.4$ µm. Since scattering in the forward direction is expected to dominate for the higher energy range of the beam electrons, this value of $L$ is similar to the penetration depth of the electron beam within the target material.
The thermalization time of Eq. (9) shows that the beam energy thermalizes quickly at the coordinate point of the beam focus on the specimen surface. The energy of the beam electrons is transferred into phonons, resulting in a localized heating effect. On the other hand, Eq. (10) tells us that these thermalization processes take place only in a thin layer at the specimen surface with a typical depth in the µm range. Following these thermalization processes, the interaction between the electron beam and the specimen can simply be treated in terms of a localized heating effect (Clem and Huebener, 1980), as will be discussed in the next section.
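The numbers quoted for lead can be reproduced from Eqs. (6), (8), and (10). The sketch below works in Gaussian (CGS) units, in which the expressions of this section are written, and neglects the factor $\ln B_i$ and relativistic corrections as the text does. The density, molar mass, and atomic number of lead are standard handbook values assumed here; they do not appear in the source, and the function names are illustrative.

```python
import math

# Physical constants in Gaussian (CGS) units
E_CHARGE = 4.803e-10     # elementary charge [esu]
M_ELECTRON = 9.109e-28   # electron mass [g]
ERG_PER_EV = 1.602e-12   # conversion factor [erg/eV]
AVOGADRO = 6.022e23      # [1/mol]

def path_length_cm(e0_erg, n_atoms_cm3, z):
    """Eq. (6): L = E0^2 / (4 pi N Z e^4)."""
    return e0_erg ** 2 / (4.0 * math.pi * n_atoms_cm3 * z * E_CHARGE ** 4)

def initial_velocity_cm_s(e0_erg):
    """Nonrelativistic velocity from E0 = m v0^2 / 2."""
    return math.sqrt(2.0 * e0_erg / M_ELECTRON)

def thermalization_time_s(e0_erg, n_atoms_cm3, z):
    """Eq. (8): tau_b = (m^2 v0^4 / 4) / (3 pi N Z e^4 v0)."""
    v0 = initial_velocity_cm_s(e0_erg)
    numerator = M_ELECTRON ** 2 * v0 ** 4 / 4.0
    return numerator / (3.0 * math.pi * n_atoms_cm3 * z * E_CHARGE ** 4 * v0)

# Lead (assumed handbook values): density 11.34 g/cm^3, molar mass 207.2 g/mol, Z = 82
n_pb = 11.34 * AVOGADRO / 207.2
e0 = 10e3 * ERG_PER_EV                      # 10 keV incident energy

v0 = initial_velocity_cm_s(e0)
path = path_length_cm(e0, n_pb, 82)
tau_b = thermalization_time_s(e0, n_pb, 82)

print(f"v0    = {v0:.2e} cm/s")
print(f"L     = {path * 1e4:.2f} um")
print(f"tau_b = {tau_b:.1e} s")
print(f"(3/4) v0 tau_b = {0.75 * v0 * tau_b * 1e4:.2f} um")   # Eq. (10)
```

Under these assumptions the sketch gives a velocity close to $6 \times 10^9$ cm s⁻¹, a path length near 1.4 µm, and a thermalization time near $3 \times 10^{-14}$ s, in line with the values stated in the text.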
B. Localized Heating Effect: Thermal Healing Length and Thermal Relaxation Time
For treating the beam-induced localized heating effect, we apply the usual concepts of heat diffusion. We start with the geometry of a homogeneous half-space irradiated with the electron beam. The beam power $P_0$ is assumed to be homogeneously dissipated within a hemisphere of radius $r_0$. Because of the spherical symmetry of the geometry, heat conduction is determined by the equations
$$P_0 \left(\frac{r}{r_0}\right)^3 = -\kappa(T)\,\frac{dT}{dr}\,2\pi r^2 \quad \text{for } 0 \le r \le r_0 \qquad (11)$$
$$P_0 = -\kappa(T)\,\frac{dT}{dr}\,2\pi r^2 \quad \text{for } r_0 < r < \infty \qquad (12)$$
Here $\kappa(T)$ is the temperature-dependent heat conductivity of the target material, $r$ is the radial coordinate, and $T$ is the temperature. For simplicity, we ignore for the moment the temperature dependence of $\kappa$. Then we find from Eqs. (11) and (12) by integration
$$T(r) - T_b = \frac{P_0}{4\pi\kappa r_0}\left(3 - \frac{r^2}{r_0^2}\right) \quad \text{for } 0 \le r \le r_0 \qquad (13)$$
and
$$T(r) - T_b = \frac{P_0}{2\pi\kappa r} \quad \text{for } r_0 < r < \infty \qquad (14)$$
The total beam-induced temperature increment (at the center of the hemisphere) is
$$T(0) - T_b = \frac{3 P_0}{4\pi\kappa r_0} \qquad (15)$$
The temperature profile is seen to depend critically upon the value taken for the radius $r_0$. From Eqs. (14) and (15), the radius $r^*$ at which the beam-induced temperature increment is reduced to 50% is given by
$$r^*(50\%) = \frac{4}{3}\, r_0 \qquad (16)$$
i.e., it is only slightly larger than the radius $r_0$. This radius is approximately given by the penetration depth of the beam electrons in the target material, which typically amounts to about 1 µm for a beam energy of 10-20 keV. Of course, in a more accurate treatment of the temperature profile, the temperature dependence of the heat conductivity $\kappa$ of the target material must be taken into account. We note further that the temperature dependence of $\kappa$ is particularly strong in the low-temperature regime where LTSEM is applied. We can include this temperature dependence in a convenient way by approximating the function $\kappa(T)$ in terms of a simple power series, say, of the form
$$\kappa(T) = aT^2 + bT + c \qquad (17)$$
where $a$, $b$, and $c$ are constants. After inserting this function $\kappa(T)$ into Eqs. (11) and (12), the solution $T(r)$ can be found analytically by integration. As an example (Metzger, 1987), we show in Fig. 7 the temperature profile $T(r)$ for germanium and $T_b = 2$ K based on the approximation (17) and the experimental heat conductivity data of Geballe and Hull (1958).
Whereas so far we have considered an unmodulated electron beam, next we turn to the case where the beam current is modulated at the angular frequency $\omega$. Now we are dealing with time-dependent heat diffusion. The beam-induced temperature modulation of the specimen extends up to the distance $\eta_\omega$ from the coordinate point of the beam focus given by (Clem and Huebener, 1980)
$$\eta_\omega = \left(\frac{2D}{\omega}\right)^{1/2} \qquad (18)$$
Here $D$ is the thermal diffusivity
$$D = \frac{\kappa}{C \rho} \qquad (19)$$
where $C$ and $\rho$ are the specific heat and the mass density of the target material, respectively. The length $\eta_\omega$ is often referred to as the dynamic thermal healing length. The frequency dependence in expression (18) is known as the thermal skin effect. We see that the length $\eta_\omega$ decreases with increasing modulation frequency. This results in a corresponding increase of the spatial resolution obtained by LTSEM, if the modulated signal is detected. On the other hand, the signal amplitude decreases with increasing modulation frequency. In this way the gain in spatial resolution becomes limited at high modulation frequencies. As an example we take again germanium. If we ignore for the moment the local beam-induced increment of the time-averaged temperature, we find from Eq. (18) $\eta_\omega = 1$ mm, taking the temperature value $T = 4.2$ K and the modulation frequency $\omega/2\pi = 1$ MHz. However, due to the rise of the time-averaged temperature near the coordinate point of the beam focus, the dynamic thermal healing length $\eta_\omega$ is expected to be considerably smaller than this value. This expectation has been confirmed by experiment (Huebener and Metzger, 1985). Of course, an exact treatment requires the consideration of the correct temperature dependence of the thermal diffusivity, and the accurate value of the healing length $\eta_\omega$ can only be obtained by numerical procedures.
So far we have dealt with the geometry of a homogeneous half-space. Next we consider scanning a thin film deposited on a substrate, as we have briefly discussed already in Section III. The film thickness $d$ is assumed larger than the range of penetration of the beam electrons. The substrate is assumed to have
FIG. 7. Temperature profile calculated for germanium based on the approximation (17), for different values of the electron-beam power as indicated; temperature is plotted versus radius [µm]. $T_b = 2$ K. (From Metzger, 1987.)
high heat conductivity and to be closely coupled to the liquid-He bath, such that the substrate temperature is about equal to the bath temperature $T_b$. For an unmodulated beam the thermal perturbation of the specimen film extends up to the distance from the coordinate point of the beam focus given by the thermal healing length $\eta$ of Eq. (2). The temporal response to the beam irradiation is governed by the thermal relaxation time (Huebener, 1984; Clem and Huebener, 1980)
$$\tau = \frac{C \rho d}{\alpha} \qquad (20)$$
If the electron beam is turned on or off, the steady-state behavior is attained after the time $\tau$. The thermal healing length $\eta$ and the thermal relaxation time $\tau$, given by Eqs. (2) and (20), respectively, determine the spatial and temporal resolution limits of the thermally induced response signal in LTSEM. For a lead film of 1 µm thickness deposited on a substrate, at 4.2 K one obtains the typical values $\eta = 100$ µm and $\tau = 10^{-\cdots}$ s. Here we have used $\alpha = 1$ W cm⁻² K⁻¹ for the heat transfer coefficient. (The heat conductivity $\kappa$ was found using the Wiedemann-Franz law and assuming an electrical resistance ratio between room temperature and 4.2 K for the lead film of 200.) It is important to note that the length $\eta$ of Eq. (2) refers to an unmodulated electron beam. If the beam current is modulated at the angular frequency $\omega$, the static healing length $\eta$ is replaced by the frequency-dependent thermal healing length
$$\eta_\omega = \eta \left[\frac{2}{1 + (1 + \omega^2\tau^2)^{1/2}}\right]^{1/2} \qquad (21)$$
where $\eta$ is taken from Eq. (2). In the limits $\omega\tau \ll 1$ and $\omega\tau \gg 1$, Eq. (21) approaches the expressions given in Eqs. (2) and (18), respectively. Of course, the increase in spatial resolution at high modulation frequencies expressed in Eqs. (18) and (21) can only be obtained if the modulated signal is detected. The frequency dependence of Eq. (21) has recently been confirmed experimentally (see Pavlicek et al., 1984). In this study a superconducting thin-film microbridge fabricated from lead-indium alloy was used as a bolometer for recording the beam-induced voltage signal as a function of the distance between the microbridge and the coordinate point of the beam focus. The width and length of the microbridge were about 5 µm and 10 µm, respectively. The frequency of the beam modulation was varied between 100 kHz and 18 MHz. Typical experimental results are shown in Fig. 8. Comparison with the theoretical curve calculated from Eq. (21) indicates
FIG. 8. Experimental values of the length $\eta_\omega$ for a thin film of lead-indium alloy versus the frequency $f$ [MHz] of the beam modulation (circles). The solid line is calculated from Eq. (21). $T_b = 4.2$ K (from Pavlicek et al., 1984).
excellent agreement between experiment and theory. If one applies Eq. (21) to superconducting films of Pb, Pb-In alloy, or Nb₃Ge with 0.5 µm thickness as typical model cases, values of $\eta_\omega$ near or below 1 µm are obtained for modulation frequencies in the range 10-1000 MHz, depending upon the thin-film material (Pavlicek et al., 1984).
Of course, in any calculation of the beam-induced thermal response of the specimen only the beam power absorbed in the sample plays a role. Therefore, correction factors may have to be applied due to the emission of secondary and backscattered electrons, photons, etc.
So far, in our discussion of the thermal sample response to the electron beam irradiation we have concentrated on purely diffusive heat propagation. This mode of energy transfer dominates if local equilibrium is established within the phonon and electron systems and also between both systems by means of sufficient scattering. However, at liquid-He temperatures phonon-phonon and electron-phonon scattering are strongly reduced and energetic excitations can propagate over relatively long distances ballistically, i.e., without scattering. For the application of LTSEM, ballistic phonon propagation is particularly important, and we will discuss this point in more detail in Sections X-XII. Ballistic processes are expected to contribute eventually to the spatial resolution limit of LTSEM if the thermal healing lengths $\eta$ or $\eta_\omega$ become extremely small. Of course, ultimately the resolution of LTSEM is always limited by about the range of penetration of the beam electrons in which the thermalization of the beam energy takes place, and which we have discussed in Section IV.A.
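The two limits of the frequency-dependent healing length discussed in this section can be illustrated numerically. The sketch below assumes that Eq. (21) has the form $\eta_\omega = \eta\,\{2/[1 + (1+\omega^2\tau^2)^{1/2}]\}^{1/2}$, which follows from a linearized heat balance of the film and reduces to $\eta$ for $\omega\tau \ll 1$ and to the skin depth $(2D/\omega)^{1/2}$ of Eq. (18) for $\omega\tau \gg 1$. The static value $\eta = 100$ µm is the one quoted for the lead film; the relaxation time used here is an assumed illustrative value, not a value from the source, and the function names are hypothetical.

```python
import math

def healing_length(eta_static, omega, tau):
    """Assumed form of Eq. (21) for the frequency-dependent healing length."""
    x = omega * tau
    return eta_static * math.sqrt(2.0 / (1.0 + math.sqrt(1.0 + x * x)))

def skin_depth(diffusivity, omega):
    """Eq. (18): dynamic thermal healing length (thermal skin depth)."""
    return math.sqrt(2.0 * diffusivity / omega)

ETA_STATIC_CM = 100e-4   # 100 um, quoted in the text for a 1 um lead film
TAU_S = 1e-7             # ASSUMPTION: illustrative relaxation time only
# With eta^2 = kappa*d/alpha and tau = C*rho*d/alpha, D = kappa/(C*rho) = eta^2/tau
DIFFUSIVITY = ETA_STATIC_CM ** 2 / TAU_S

for f_hz in (1e3, 1e5, 1e6, 1e7, 1e8, 1e9):
    omega = 2.0 * math.pi * f_hz
    eta_w = healing_length(ETA_STATIC_CM, omega, TAU_S)
    delta = skin_depth(DIFFUSIVITY, omega)
    print(f"f = {f_hz:8.0e} Hz   eta_omega = {eta_w * 1e4:8.2f} um   "
          f"skin depth = {delta * 1e4:10.2f} um")
```

At low frequencies the printed healing length stays at the static value, while at high frequencies it approaches the skin depth, which is the behavior the text attributes to Eq. (21).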
V. SUPERCONDUCTING TUNNEL JUNCTIONS: PAIR TUNNELING
As the first example of a recent application of LTSEM we consider the observation of spatial structures in superconducting tunnel junctions. Here we deal with a typical low-temperature phenomenon particularly well suited for LTSEM. Usually, superconducting tunnel junctions consist of two superconducting films separated from each other by a thin electrically insulating barrier. The thin-film structure is deposited on a suitable substrate. During the LTSEM experiments, the film structure on top of the substrate is scanned directly with the electron beam, whereas the bottom side of the substrate is in direct contact with the liquid-He bath (see Fig. 1). In addition to the strong fundamental interest in the observation of spatial structures in superconducting tunnel junctions by means of LTSEM, such experiments are also highly important for cryoelectronic applications. In Sections V-VII we deal separately with different aspects of tunnel junctions, namely pair tunneling, quasiparticle tunneling, and large junction arrays, respectively.
The typical geometry of a superconducting tunnel junction is shown schematically in Fig. 9. The two superconducting films crossing each other are separated by a thin electrically insulating barrier. The barrier is usually formed by an oxide of one of the superconducting electrodes and has a thickness of only a few Å. Current and voltage leads can easily be attached to the tunnel junction in the cross-line geometry as indicated in Fig. 9. Standard thin-film technology and microfabrication techniques are employed for preparing the junctions.
As first predicted by Josephson (1962), Cooper pairs (the particles constituting the superconducting electronic system) can tunnel across the barrier, resulting in an electric current flow at zero voltage, if the junction barrier is sufficiently thin.
The maximum Josephson current density that can flow across the tunnel junction without electrical resistance is given by the equation
$$J(x,y) = J_1(x,y)\,\sin\phi(x,y) \qquad (22)$$
Here $J_1(x,y)$ is the local critical current density and $\phi(x,y)$ the local difference between the phases of the superconducting wave function in both electrodes.
FIG. 9. Cross-line geometry of a superconducting tunnel junction. (Schematic labels: current, voltage.)
The coordinates on the junction are denoted by $x$ and $y$. In the following we assume $J_1(x,y)$ to be homogeneous. (Spatial variations of $J_1(x,y)$ can result from inhomogeneities in the barrier and in the superconducting electrodes of the junction.) For two identical and homogeneous superconductors one finds (see, e.g., Ambegaokar and Baratoff, 1963)
$$J_1 = \frac{\pi \Delta(T)}{2 e R_N} \tanh\!\left(\frac{\Delta(T)}{2 k_B T}\right) \qquad (23)$$
where $\Delta(T)$ is the temperature-dependent energy gap, $e$ the elementary charge, $R_N$ the tunneling resistance per unit area of the junction in the normal state, and $k_B$ Boltzmann's constant. The phase-difference function $\phi(x,y)$ depends upon the junction geometry. Further, it is influenced by a magnetic field applied parallel to the junction barrier. Geometrically, the important length scale is the Josephson penetration depth $\lambda_J$ given by
$$\lambda_J = \left(\frac{\hbar}{2 e \mu_0 d J_1}\right)^{1/2} \qquad (24)$$
where $\hbar$ is Planck's constant divided by $2\pi$ and $\mu_0$ the permeability of free space. The length $d$ is
$$d = \lambda_{L1} + \lambda_{L2} + t \qquad (25)$$
where $\lambda_{L1}$ and $\lambda_{L2}$ are the effective London penetration depths of the two superconducting electrodes and $t$ the barrier thickness of the junction. The typical range of $\lambda_J$ is about 1 mm. For junctions with small area, where both dimensions are much smaller than $\lambda_J$, and for zero applied magnetic field, the phase-difference function and, hence, the current density $J(x,y)$ are constant. If a magnetic field $H$ is applied parallel to the barrier of a small-area junction, the local maximum Josephson current density $J(x,y)$ is modulated as a function of the magnetic field, and the total maximum Josephson current $I_{\max}$ shows the "Fraunhofer diffraction pattern"
$$I_{\max}(\varphi) = J_1\, w\, L \left|\frac{\sin(\pi\varphi/\varphi_0)}{\pi\varphi/\varphi_0}\right| \qquad (26)$$
Here $w$ and $L$ are the width and length of the junction, respectively. $\varphi = H \cdot L \cdot d$ is the total magnetic flux threading the junction (assuming the magnetic field to be oriented perpendicular to the direction with the junction dimension $L$). $\varphi_0 = h/2e$ is the magnetic flux quantum. For junctions with dimensions equal to or larger than $\lambda_J$ the phase-difference function depends upon the spatial coordinates of the junction since the magnetic self-field of the tunneling current cannot be neglected any more.
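The origin of the Fraunhofer pattern in a small-area junction can be made concrete with a short numerical check. The sketch below assumes the standard small-junction picture, which is not spelled out in the text: the applied field makes the phase difference vary linearly along the junction, $\phi(y) = \phi_0 + 2\pi(\varphi/\varphi_0)\,y/L$, and the maximum supercurrent is found by integrating $J_1 \sin\phi(y)$ over the junction and maximizing over the free offset $\phi_0$. The result is compared with the closed form $J_1 w L\,|\sin(\pi\varphi/\varphi_0)/(\pi\varphi/\varphi_0)|$. All function names and grid sizes are illustrative choices.

```python
import math

def fraunhofer(flux_ratio, j1=1.0, w=1.0, length=1.0):
    """Closed-form pattern: J1 * w * L * |sin(pi r) / (pi r)|, with r = flux / flux quantum."""
    if flux_ratio == 0.0:
        return j1 * w * length
    x = math.pi * flux_ratio
    return j1 * w * length * abs(math.sin(x) / x)

def max_current_numeric(flux_ratio, j1=1.0, w=1.0, length=1.0, n_y=400, n_phase=360):
    """Integrate J1*sin(phi0 + 2*pi*r*y/L) over y (midpoint rule) and maximize over phi0."""
    dy = length / n_y
    best = 0.0
    for k in range(n_phase):
        phi0 = 2.0 * math.pi * k / n_phase
        total = 0.0
        for i in range(n_y):
            y = (i + 0.5) * dy
            total += j1 * math.sin(phi0 + 2.0 * math.pi * flux_ratio * y / length) * dy
        best = max(best, w * total)
    return best

for r in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5):
    print(f"flux/flux quantum = {r:3.1f}   numeric = {max_current_numeric(r):.4f}   "
          f"closed form = {fraunhofer(r):.4f}")
```

The maximum current vanishes whenever an integer number of flux quanta threads the junction, because the oscillating current density then integrates to zero for every choice of $\phi_0$.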
FIG. 10. Density of the maximum Josephson current $J(y)$ of a one-dimensional tunnel junction versus the junction coordinate (from 0 to $L$) for the case $L = 15\,\lambda_J$. $L$ = junction length; $\lambda_J$ = Josephson penetration depth.
First we consider a one-dimensional geometry, where one dimension, say, the width $w$ of the junction (in $x$-direction), is still much smaller than $\lambda_J$, whereas the length $L$ of the other dimension (in $y$-direction) is much larger than $\lambda_J$. The first analysis of such a geometry was given by Owen and Scalapino (1967). In Fig. 10 we show the maximum Josephson current density along the direction of the dimension of length $L$ for the case $L = 15\,\lambda_J$ and for zero applied magnetic field. The current density $J(y)$ is seen to reach a maximum at a distance of about $\lambda_J$ from both ends of the junction and to decrease sharply in the interior of the junction. This expulsion of the Josephson current from the junction interior is similar to the Meissner effect in a bulk superconductor, the London penetration depth $\lambda_L$ being replaced by the Josephson penetration depth $\lambda_J$. If a magnetic field $H$ parallel to the barrier and perpendicular to the direction of the dimension of length $L$ is applied to the junction, Josephson vortices penetrate into the junction interior. In Fig. 11, we show schematically the case where four Josephson vortices exist within the junction due to the applied magnetic field. The current applied to the junction is assumed to be zero. In addition to the current density $J(y)$, the current vortices, as viewed parallel to the barrier, can be seen. It is the existence of the Josephson vortices within the junction and the oscillating pair current density similar to that shown in Fig. 11 that results in the "Fraunhofer diffraction pattern" of $I_{\max}$ expressed in Eq. (26) for a small-area junction.
Two one-dimensional junction geometries which are particularly interesting due to their simplicity are shown schematically in Fig. 12: the in-line and the overlap geometry. In both cases, the current can be fed symmetrically to the junction.
FIG. 11. Top: Density of the Josephson current $J(y)$ of a one-dimensional tunnel junction in the case where four Josephson vortices are generated due to a magnetic field $H$ applied parallel to the barrier. The current applied to the junction is zero. Bottom: Josephson vortices as seen parallel to the barrier (top electrode, barrier, bottom electrode).
Two-dimensional junction geometries, where both dimensions are larger than $\lambda_J$, require a complicated numerical analysis. The influence of the geometry of the current feeding lines must be explicitly taken into account in addition to the particular junction geometry. The brief summary we have given so far on the spatial distribution of the pair tunneling current has concentrated only on the most essential points. Further details can be found elsewhere (see, e.g., Tinkham, 1975; Barone and Paterno, 1982; Bosch, 1986).
Next we turn to the signal generated by LTSEM for imaging the spatial distribution of the maximum Josephson current density $J(x,y)$. The local perturbation of the junction by the electron beam is expected to change the critical current density $J_1(x,y)$ by the amount $\delta J_1$ in the irradiated region. If the phase-difference function $\phi(x,y)$ were to remain unaffected, the measured beam-induced change of the junction critical current as a function of the beam position $y_0$ would be given by
$$\delta I_1^{l}(y_0) = \delta J_1(y_0) \cdot \delta y \cdot w\, \sin\phi(y_0) \qquad (27)$$
Here and in the following, for simplicity, we assume a one-dimensional geometry, restricting any spatial dependence to the y-direction and ignoring
FIG. 12. In-line and overlap geometry. The barrier is indicated by the hatched part.
the dependence upon the $x$-coordinate. The results obtained can easily be extended to a two-dimensional geometry. In Eq. (27) $\delta y$ is the length of the junction element perturbed by the beam, and $w$ is the junction width measured along the $x$-direction. From Eq. (27) we see that the change $\delta I_1^{l}(y_0)$ is proportional to the current distribution in the unperturbed junction biased at its critical current value. According to Eq. (22) this current distribution is determined by the phase-difference function $\phi(y_0)$. As pointed out by Chang and coworkers (Chang and Scalapino, 1984; Chang and Ho, 1984; Chang, Ho, and Scalapino, 1985), in addition to the local effect expressed in Eq. (27), there also exists a nonlocal contribution to the beam-induced signal $\delta I_1(y_0)$ due to the change $\delta\phi$ in the phase-difference function. This contribution results from the increase in the penetration depths $\lambda_L$ and $\lambda_J$ by $\delta\lambda_L$ and $\delta\lambda_J$, respectively, and is given by
$$\delta I_1^{nl}(y_0) = w \int dy \cdot J_1(y) \cdot \cos\phi(y) \cdot \delta\phi(y, y_0) \qquad (28)$$
Clearly, the beam-induced change $\delta\phi$ results in a signal contribution also from the nonirradiated part of the junction in addition to the irradiated portion, as expressed by the integral over the $y$-coordinate in Eq. (28). In the limit of weak perturbation ($\delta\lambda_L/\lambda_L \ll 1$; $\delta\lambda_J/\lambda_J \ll 1$), the total beam-induced change $\delta I_1(y_0)$ can be written as the sum
$$\delta I_1(y_0) = \delta I_1^{l}(y_0) + \delta I_1^{nl}(y_0) \qquad (29)$$
using the expressions of Eqs. (27) and (28). Whereas $\delta I_1^{l}(y_0)$ has the same spatial dependence as the unperturbed current density $J(y)$, the contribution $\delta I_1^{nl}(y_0)$ can be quite complicated. An attempt to understand the qualitative behavior of the contribution $\delta I_1^{nl}(y_0)$ from simple physical arguments can be found elsewhere (see Bosch, 1986).
The principle of the LTSEM experiments performed for the observation of the spatial distribution of the pair tunneling current density (Bosch et al., 1985) is shown schematically in Fig. 13. The electron beam is scanned over the junction surface, and the beam-induced change $\delta I_1(x,y)$ of the maximum Josephson current is recorded electronically as a function of the coordinate point $(x,y)$ of the beam focus. A magnetic field $H$ parallel to the junction barrier is applied by passing an electric current through the upper electrode only. A typical result is shown in Fig. 14. The signal $-\delta I_1(x,y)$ is plotted for a series of linear scans along the $y$-direction, the geometric position of the junction and the scanning direction being indicated at the bottom. The 4-5 vortex state generated in an applied magnetic field parallel to the barrier can be seen. The sapphire substrate used in this experiment was coated with a superconducting Nb ground plane of 150 nm thickness and a SiO insulating film of 330 nm thickness in the area near the junction prior to the evaporation of the planar thin-film tunnel junction. The base electrode was a PbIn film of
FIG. 13. Principal arrangement for measuring the spatial variation of the pair tunneling current density (from Bosch et al., 1985). (Schematic labels: electron beam, lead for magnetic field generating current, ground.)
130 nm thickness. A PbBi film of 330 nm thickness served as the counter electrode. The rectangular tunneling barrier of about 3.5 µm x 90 µm area was formed by oxidation of the base electrode in an O₂ atmosphere at room temperature. This area of the tunneling barrier was sharply defined by means of a suitable window in a thin insulating SiO layer deposited on the base electrode. Further details about the electronic measuring procedures can be found elsewhere (see, e.g., Bosch et al., 1985).
The results shown in Fig. 14 refer to a one-dimensional junction geometry, the width and length being smaller and much larger than $\lambda_J$, respectively. For this junction the ratio $L/\lambda_J$ is 14.4. The magnetic interference pattern of the maximum Josephson current $I_{\max}$ of this junction was typical for a junction with a length $L \gg \lambda_J$ (Bosch et al., 1985). The recording (b) in Fig. 14 was obtained near the local maximum associated with the 4-5 vortex state in the magnetic interference pattern. The recordings (a) and (c) were obtained on the low- and high-field side of this maximum, respectively. The increase and decrease in the maximum amplitude of the signal $-\delta I_1(x,y)$ along the $y$-coordinate shown by the recordings (a) and (c), respectively, result from the nonlocal response due to the influence of the phase-difference function $\phi$ and agree well with the predictions by Chang and coworkers (1984, 1985). The maximum amplitudes of the beam-induced change $-\delta I_1(x,y)$ in Fig. 14 correspond to about 30% of the maximum critical current of the unirradiated junction at the corresponding magnetic field values.
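The split of the beam-induced signal into a local and a nonlocal part can be sketched as a small numerical routine. The code below only evaluates the structure described in the text: a local term $\delta J_1 \cdot \delta y \cdot w \sin\phi(y_0)$ and a nonlocal term $w \int dy\, J_1(y) \cos\phi(y)\, \delta\phi(y, y_0)$, here integrated with the trapezoidal rule. It does not model how $\delta\phi$ arises; the sample profiles for $\phi$ and $\delta\phi$ are hypothetical placeholders chosen purely for illustration, as are all function names.

```python
import math

def local_signal(delta_j1, delta_y, w, phi_at_beam):
    """Local term: delta_J1 * delta_y * w * sin(phi(y0))."""
    return delta_j1 * delta_y * w * math.sin(phi_at_beam)

def nonlocal_signal(w, y, j1, phi, delta_phi):
    """Nonlocal term: w * integral of J1(y) cos(phi(y)) delta_phi(y) dy (trapezoidal rule)."""
    total = 0.0
    for k in range(len(y) - 1):
        f0 = j1[k] * math.cos(phi[k]) * delta_phi[k]
        f1 = j1[k + 1] * math.cos(phi[k + 1]) * delta_phi[k + 1]
        total += 0.5 * (f0 + f1) * (y[k + 1] - y[k])
    return w * total

def total_signal(delta_j1, delta_y, w, beam_index, y, j1, phi, delta_phi):
    """Sum of the local and nonlocal contributions for a beam at y[beam_index]."""
    return (local_signal(delta_j1, delta_y, w, phi[beam_index])
            + nonlocal_signal(w, y, j1, phi, delta_phi))

# Hypothetical inputs in arbitrary units (not data from the source)
n = 201
y = [i / (n - 1) for i in range(n)]                    # junction coordinate, 0..1
j1 = [1.0] * n                                         # homogeneous critical current density
phi = [2.0 * math.pi * 4.5 * yi for yi in y]           # placeholder linear phase profile
beam_index = 60
delta_phi = [0.01 * math.exp(-((yi - y[beam_index]) / 0.2) ** 2) for yi in y]  # placeholder

signal = total_signal(-0.1, 0.02, 1.0, beam_index, y, j1, phi, delta_phi)
print(f"total beam-induced change at y0 = {y[beam_index]:.2f}: {signal:.5f}")
```

Keeping the two terms in separate functions makes it easy to see that the nonlocal part collects contributions from the whole junction length, whereas the local part depends only on the phase at the beam position.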
FIG. 14. The signal $-\delta I_1(x,y)$ showing the 4-5 vortex state and obtained by scanning longitudinally along the junction. The line scans were performed for several values of the transverse coordinate. The position of the junction (length 90 µm) and the scanning direction are indicated at the bottom. Recording (b) was obtained near the local maximum of the magnetic interference pattern, whereas recordings (a) and (c) were taken on the low- and high-field side of this maximum, respectively. $T_b = 4.2$ K, beam voltage = 26 kV, beam current = 10-100 pA (from Bosch et al., 1985).
In addition to the different vortex states in an applied magnetic field, the restriction of the flow of the maximum Josephson current to the region near both ends of the junction at zero magnetic field, as determined by the penetration depth $\lambda_J$, could be confirmed by LTSEM imaging. Of course, from these latter experiments the value of the Josephson penetration depth $\lambda_J$ can be obtained.
The evolution of the different vortex states in an increasing magnetic field parallel to the junction barrier (Bosch, 1986) can be seen in Fig. 15. Here the sample geometry is similar to that of Figs. 13 and 14, with a sapphire substrate and Nb ground plane covered by a SiO insulating film. The base electrode is a PbIn film of 109 nm thickness and the top electrode a PbBi film of 250 nm
FIG. 15. Signal $-\delta I_1(y)$ recorded in an increasing magnetic field applied parallel to the barrier and showing the different vortex states. For each value of the magnetic field only a single line scan is presented. The number of vortices in the junction is indicated on the left. The values of the magnetic field generating current $I_F$ [mA] (passing through the top electrode) are given in the second column on the left. From top to bottom the sensitivity of the signal detection increases as indicated by the scale marks on the right. $T_b = 4.2$ K, beam voltage = 26 kV, beam current = 10-100 pA (from Bosch, 1986).
thickness. The area of the tunneling window is 97 µm x 19 µm. The signal $-\delta I_1(y)$, obtained by scanning linearly along the middle of the junction in the direction of the long dimension $L$, is shown. For this junction the ratio $L/\lambda_J$ is 5.2. The different vortex states were recorded for the magnetic field values corresponding to the local maxima of the magnetic interference pattern of the maximum Josephson current. Therefore, the maximum amplitude of the signal $-\delta I_1(y)$ is expected to remain approximately constant along the junction length. The magnetic interference pattern of this junction is shown in Fig. 16. As seen from Fig. 15, the different vortex states show reasonably regular behavior. The small shifting of the signal seen at the highest vortex states containing 12-14 vortices was accompanied by an increasing difficulty at higher magnetic fields in accurately reproducing the recorded signal during repeated line scans. It appears that these high vortex states are extremely sensitive to small perturbations.
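The Josephson penetration depths implied by the two quoted ratios $L/\lambda_J$ follow from a one-line calculation. The sketch below uses only the junction dimensions and ratios stated in the text (about 3.5 µm x 90 µm with $L/\lambda_J = 14.4$, and 97 µm x 19 µm with $L/\lambda_J = 5.2$); the dictionary layout and names are illustrative.

```python
def josephson_penetration_depth_um(length_um, length_over_lambda):
    """lambda_J from the junction length L and the quoted ratio L / lambda_J."""
    return length_um / length_over_lambda

# Dimensions and ratios as quoted in the text for the two junctions
junctions = {
    "junction of Fig. 14": {"length_um": 90.0, "width_um": 3.5, "ratio": 14.4},
    "junction of Fig. 15": {"length_um": 97.0, "width_um": 19.0, "ratio": 5.2},
}

results = {}
for name, j in junctions.items():
    lam = josephson_penetration_depth_um(j["length_um"], j["ratio"])
    results[name] = lam
    print(f"{name}: lambda_J = {lam:.2f} um, width / lambda_J = {j['width_um'] / lam:.2f}")
```

With these inputs the first junction has a width of roughly half of $\lambda_J$, whereas the width of the second junction is comparable to $\lambda_J$.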
FIG. 16. Magnetic interference pattern (maximum Josephson current versus magnetic field) for the same sample as that of Fig. 15, plotted against the magnetic field generating current $I_F$ [mA]. Two measured curves are shown at different sensitivity of the vertical current axis as indicated (scales in mA and µA). $T_b = 4.2$ K (from Bosch, 1986).
In summary, we note that the different vortex states generated by the pair tunneling current have clearly been observed by LTSEM. The results agree well with the theoretical expectation. In particular, the nonlocal effect due to the influence of the phase-difference function predicted theoretically has been confirmed experimentally. So far, most experiments have concentrated on one-dimensional junction geometries. In these experiments a spatial resolution of 1-2 µm has been reached. For further experiments, studies of two-dimensional junction geometries are highly interesting. Here particular attention is expected to be given to the influence of the geometry of the lines feeding electric current to the junction. Comparison of the two-dimensional images of the pair current density with numerical calculations will be an important task.
Throughout this section we have assumed that the tunneling barrier is spatially homogeneous. Accurate evaluation of the barrier homogeneity and a sensitive detection of inhomogeneities in the barrier are possible by means of the quasiparticle tunneling current, as we will see in the following section.
VI. SUPERCONDUCTING TUNNEL JUNCTIONS: QUASIPARTICLE TUNNELING
In addition to the pair tunneling current at zero voltage drop, the normal excitations or single quasiparticles can tunnel across the barrier of a superconducting tunnel junction. In the latter case, the flow of the tunneling
current is accompanied by a resistive voltage. The current-voltage characteristic (IVC) is highly nonlinear, its shape depending on the energy gap of the superconducting electrodes. The tunneling of quasiparticles across a superconducting tunnel junction was first demonstrated experimentally by Giaever (1960). The variation of the quasiparticle tunneling current with the voltage $V$ applied to the junction is shown schematically in Fig. 17, assuming two superconducting electrodes with the energy gaps $\Delta_1$ and $\Delta_2$, respectively. At $T = 0$ the tunneling current remains zero until a discontinuous jump occurs at the voltage $V = (\Delta_1 + \Delta_2)/e$. Following this jump, the current gradually approaches the normal tunneling characteristic (curve $I_n$) obtained when both electrodes are in the normal state. At $T > 0$ thermally excited quasiparticles result in a tunneling current also at voltages smaller than $(\Delta_1 + \Delta_2)/e$. Now the tunneling current peaks at $V = |\Delta_1 - \Delta_2|/e$ and rises sharply again at $V = [\Delta_1(T) + \Delta_2(T)]/e$, approaching the curve $I_n$. This structure in the IVC provides a simple way to measure the energy gaps of the superconducting electrodes. The quasiparticle tunneling current is given by (Tinkham, 1975; Barone and Paterno, 1982)
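$$I = \frac{A}{eR_n}\int_{-\infty}^{\infty} \frac{|E|}{(E^2-\Delta_1^2)^{1/2}}\,\frac{|E+eV|}{[(E+eV)^2-\Delta_2^2]^{1/2}}\,[f(E)-f(E+eV)]\,dE \tag{30}$$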
Here, $A$ is the tunnel junction area and $R_n$ the normal tunneling resistance per unit area obtained when both electrodes are in the normal state. $E$ is the quasiparticle energy and $f(E)$ the Fermi distribution function at energy $E$. In the integral of Eq. (30), the energy ranges $|E| < |\Delta_1|$ and $|E + eV| < |\Delta_2|$ are excluded. The complete IVC is found from Eq. (30) by numerical integration. In some regimes approximate expressions for the quasiparticle tunneling current can be used. For a symmetric tunnel junction where both electrodes consist of the same material ($\Delta_1 = \Delta_2 = \Delta$) one obtains in the voltage range $V < 2\Delta/e$ (Solymar, 1972)
$$\ldots\;\exp(-\Delta/k_B T) \tag{31}$$
According to Eq. (31) the current depends only weakly upon the voltage, except near $V = 0$ where it decreases linearly with decreasing voltage. Assuming the voltage-dependent terms to be of the order of 1, one finds
$$I \approx \frac{A}{eR_n}\,(2\pi\Delta k_B T)^{1/2}\exp(-\Delta/k_B T) \tag{32}$$
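As a rough numerical illustration of Eq. (32), read here as $I \approx (A/eR_n)(2\pi\Delta k_B T)^{1/2}\exp(-\Delta/k_B T)$, the following Python sketch estimates the order of magnitude of the thermal tunneling current. The function name, the lumped conductance argument (standing for $A/R_n$), and the sample values of gap and resistance are illustrative assumptions and not data from the experiments.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def thermal_tunnel_current(gap_ev, temp_k, conductance_s):
    """Order-of-magnitude subgap current of a symmetric junction.

    gap_ev        -- energy gap Delta in eV
    temp_k        -- temperature T in K
    conductance_s -- normal-state conductance A/R_n in siemens (assumed input)
    Returns the current in amperes.
    """
    kt = K_B * temp_k
    # sqrt(2*pi*Delta*k_B*T) is an energy; expressed in eV and divided
    # by the elementary charge it is numerically a voltage in volts.
    volts = math.sqrt(2.0 * math.pi * gap_ev * kt)
    return conductance_s * volts * math.exp(-gap_ev / kt)


# Hypothetical example: Delta = 1.3 meV, normal-state resistance 1 ohm.
for t in (4.2, 1.8):
    print(f"T = {t} K: I ~ {thermal_tunnel_current(1.3e-3, t, 1.0):.3e} A")
```

The sketch ignores the temperature dependence of the gap, so it only conveys how strongly the exponential factor suppresses the current when the temperature is lowered.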
FIG. 17. Quasiparticle tunneling current $I$ versus the voltage $V$ for two superconducting electrodes with the energy gaps $\Delta_1$ and $\Delta_2$, respectively. The solid line refers to $T = 0$ and the dashed line to $T > 0$. The straight line marked $I_n$ represents the normal tunneling characteristic.
FIG. 18. Pair-tunneling current at zero voltage. If its maximum value $I_{max}$ is exceeded, the junction switches along the load line to the voltage state on the quasiparticle tunneling characteristic. The straight line marked $I_n$ represents the normal tunneling characteristic.
It is the resistive nonlinear quasiparticle IVC which is attained by the junction if the total maximum Josephson current $I_{max}$ discussed in Section V is exceeded. At $I > I_{max}$, the junction switches from the zero-voltage state to the corresponding point on the quasiparticle tunneling characteristic along the load line (see Fig. 18). In this state of nonzero voltage, in addition to the quasiparticle DC current, the Josephson AC current appears. The latter current flows without dissipation and oscillates at the frequency
$$\omega = \frac{2e}{\hbar}\,V \tag{33}$$
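For orientation, the size of this frequency can be evaluated directly from the relation $\omega = 2eV/\hbar$. The short Python sketch below is illustrative only; the function name and the sample voltage of 1 mV are assumptions and are not taken from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C (exact SI value)
H_PLANCK = 6.62607015e-34    # Planck constant in J s (exact SI value)
HBAR = H_PLANCK / (2.0 * math.pi)


def josephson_angular_frequency(voltage_v):
    """Angular frequency omega = 2 e V / hbar of the Josephson AC current."""
    return 2.0 * E_CHARGE * voltage_v / HBAR


# Hypothetical bias voltage of 1 mV across the junction.
omega = josephson_angular_frequency(1.0e-3)
freq = omega / (2.0 * math.pi)
print(f"omega = {omega:.4e} rad/s, f = {freq / 1e9:.1f} GHz")
```

A voltage of 1 mV thus corresponds to an oscillation frequency of roughly 484 GHz, which indicates why the AC component is not resolved in the low-frequency imaging signals discussed here.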
These pair current oscillations in a superconducting tunnel junction are described by the equation
$$\frac{\partial \varphi}{\partial t} = \frac{2e}{\hbar}\,V \tag{34}$$
where $\varphi$ is the phase difference function introduced in Section V. Equations (22) and (34) constitute the two Josephson equations describing the flow of supercurrent across the barrier of a tunnel junction. In the LTSEM imaging experiments on the quasiparticle current distribution described in the following, the Josephson current is usually suppressed by means of a small magnetic field applied to the junction. The highly nonlinear quasiparticle IVC of a superconducting tunnel junction and the extremely rapid switching between the zero-voltage and finite-voltage state (Fig. 18) are of particular interest for many cryoelectronic applications. Further details about the basic properties and the cryoelectronic applications of tunnel junctions can be found elsewhere (Tinkham, 1975; Barone and Paterno, 1982; Solymar, 1972; Matisoo, 1980). As we can see from Eqs. (30)-(32), spatial structures in the distribution of the quasiparticle current of a tunnel junction can be caused by inhomogeneities in the barrier and in the energy gap of the superconducting electrodes. In this case, a two-dimensional voltage image of the inhomogeneous junction properties is obtained by recording the beam-induced voltage change $\delta V(x, y)$ of the current-biased junction as a function of the coordinates $x$ and $y$ of the beam focus. This voltage signal $\delta V(x, y)$ is due to the localized heating effect of the electron beam and is the signature of the beam-induced conductivity change discussed in Section III. We shall see below that the information obtained in this way depends critically upon the bias point on the IVC of the junction. Whereas the first LTSEM studies were performed using single junctions (Epperlein, Seifert and Huebener, 1982; Seifert, Huebener and Epperlein, 1983; Gross et al., 1984), subsequent experiments dealt with double junctions with the standard injector-detector configuration (Gross, Koyanagi,
Seifert and Huebener, 1985; Gross, Schmid and Huebener, 1986). Earlier summaries of these results can be found in Huebener (1984), Bosch et al. (1986), Huebener and Seifert (1984), and Huebener (1985). The beam-induced voltage signal $\delta V(x, y)$ is generated in the following way (Seifert, Huebener and Epperlein, 1983). We consider a planar tunnel junction of total area $A$. Focusing the electron beam on the coordinate point $x, y$ on the junction surface results in the thermal perturbation of the area $\pi\Lambda^2$ around this point. Usually the radius $\Lambda$ can be identified with the thermal healing length of Eq. (2) or (18). In this area the tunneling current increases by the amount $\delta I(x, y)$. Hence, in the unperturbed area $A - \pi\Lambda^2$ of the current-biased junction, the current is reduced by the amount $\delta I(x, y)$. Assuming $\pi\Lambda^2 \ll A$, i.e., $\delta I(x, y) \ll I$, the beam-induced voltage change is found to be
$$\delta V(x, y) = -\frac{1}{\sigma(V)}\,\delta I(x, y) \tag{35}$$
where $\sigma(V)$ is the differential conductance of the total unperturbed junction at the voltage $V$. Due to the beam irradiation, a transition takes place locally in the perturbed part of the junction from the quasiparticle IVC of the superconducting state to that of the normal state. This transition follows the load line with the slope $(-1/\sigma)$, as shown schematically in Fig. 19. First, we consider a bias point in the thermal tunneling regime ($V < 2\Delta/e$
for a symmetric junction; see Eqs. (31) and (32)), as shown in Fig. 19(a) for two different values of the local tunneling conductance. The current increment $\delta I(x, y)$ and, hence, the voltage signal $|\delta V(x, y)|$ are seen to increase with increasing local tunneling conductance. It is the latter junction property that is imaged by LTSEM at this bias point. Next we turn to a bias point in the gap regime ($V \approx 2\Delta/e$ for a symmetric junction), as shown in Fig. 19(b) for two different values of the local energy gap. Here the voltage signal is dominated by the superconducting energy gap and increases with increasing local gap value, such that the local gap value is imaged by LTSEM at this bias point. For a bias point above the gap regime ($V > 2\Delta/e$ for a symmetric junction) the voltage signal is expected to decrease rapidly, since the IVC approaches the normal-state asymptote. In this regime, the voltage signal is determined by the local tunneling conductance, as in the thermal tunneling regime. These considerations have been confirmed experimentally. As a typical example we show in Fig. 20 the results obtained for a PbAuIn/PbBi cross-line junction (Seifert, Huebener and Epperlein, 1983). The base electrode and counterelectrode were typically 200 nm and 400 nm thick, respectively. The rectangular tunnel area was 9 μm × 20 μm. The barrier consisted of a thin SiO layer. The three voltage images (a)-(c) in Fig. 20 were obtained at the corresponding bias points shown in the inset. The upper voltage image (a), recorded in the thermal tunneling regime and imaging the quasiparticle tunneling conductance, shows a distinct maximum in the middle of the junction. (The reduced local tunneling conductance along both counterelectrode edges is probably caused by the influence of the photoresist stencil, which defines the pattern for the counterelectrode, on the local electric field distribution in the rf-plasma discharge used for the fabrication of the barrier.)
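The relation of Eq. (35), $\delta V = -\delta I/\sigma(V)$, can be made concrete with a small Python sketch. It assumes a tabulated IVC and obtains the differential conductance $\sigma(V) = dI/dV$ by numerical differentiation. The synthetic IVC, the function name `voltage_signal`, and the sample current increment are hypothetical and serve only as an illustration.

```python
import numpy as np


def voltage_signal(v_grid, i_grid, v_bias, delta_i):
    """Beam-induced voltage change at a bias point, following Eq. (35).

    v_grid, i_grid -- tabulated IVC of the unperturbed junction (V, A)
    v_bias         -- bias voltage at which the junction is operated (V)
    delta_i        -- local beam-induced current increment (A)
    """
    sigma = np.gradient(i_grid, v_grid)            # differential conductance dI/dV
    sigma_bias = np.interp(v_bias, v_grid, sigma)  # conductance at the bias point
    return -delta_i / sigma_bias


# Synthetic, purely illustrative IVC: I = G1*V + G3*V**3.
v = np.linspace(0.0, 2.0e-3, 401)
i = 0.5 * v + 2.0e5 * v**3

dv = voltage_signal(v, i, v_bias=1.0e-3, delta_i=1.0e-6)
print(f"delta V = {dv * 1e6:.3f} microvolt")
```

For a given current increment the signal is largest where the IVC is flat (small $\sigma$) and shrinks where the IVC is steep, which is one reason the image content depends so strongly on the bias point.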
The middle voltage image (b), obtained in the gap regime, indicates an energy gap minimum in the middle of the junction, where the current density shows a maximum according to the voltage image (a). A simple thermal analysis suggests that this spatial structure of the energy gap can result from the increased Joule heating of the junction in the middle region of high current density (Seifert et al., 1983). The lower voltage image (c) shows a structure similar to, but smaller in amplitude than, that obtained in the thermal regime, as expected for the bias point above the gap regime. The results of Fig. 20 were obtained with 20 kHz beam modulation and a lock-in technique for detecting the voltage signal. Typically the signal $\delta V(x, y)$ was smaller than about 10 μV. The spatial resolution of this imaging method based on the voltage signal $\delta V(x, y)$ is determined by the thermal healing length $\eta$ of Eq. (21), which yields the expressions (2) and (18) in the low-frequency and high-frequency limit, respectively. From this we note that relatively high spatial resolution can be obtained for tunnel junctions with strongly impure films (such as alloys) as superconducting electrodes (such that
FIG. 20. Two-dimensional voltage image obtained for a PbAuIn/PbBi cross-line junction for the different bias points shown in the inset. The beam is scanned horizontally and the voltage signal $|\delta V(x, y)|$ is plotted vertically (y-modulation). The boundary lines of the junction electrodes are indicated by the markers. $T_b$ = 4.2 K, beam voltage = 26 kV, beam current = 10-100 pA (from Seifert et al., 1983).
the heat conductivity of the films is small) in contrast to junctions prepared with highly pure electrode films. The increase in spatial resolution based on the thermal skin effect and achieved by high-frequency beam modulation, as expressed in Eq. (18), has been demonstrated experimentally (Seifert et al., 1983). So far, LTSEM imaging of superconducting tunnel junctions has been extended up to angular frequencies of the beam modulation of 10-20 MHz. Experimentally a spatial resolution approaching 1-2 μm has been achieved. The analysis of the spatial resolution limit encountered in LTSEM of thin-film superconductors has recently been extended beyond the purely thermal model discussed so far by treating the nonequilibrium distributions of the quasiparticles and of the phonons separately by means of the Rothwarf-Taylor equations (Gross and Koyanagi, 1985). In the limit of high phonon trapping the thermal healing length is found to be dominant. On the other hand, for low phonon trapping, where the beam-generated nonequilibrium
phonons leave the superconducting electrode films rapidly, the spatial resolution is limited by the quasiparticle diffusion length. It appears that the first case is valid for most junction materials. The second case, which has not been studied in LTSEM experiments up to now, can arise for Al as the material of the junction electrodes (Gross and Koyanagi, 1985). So far we have given only a qualitative discussion of the origin of the voltage signal $\delta V(x, y)$ for the various bias points on the quasiparticle IVC. A more detailed discussion can be found elsewhere (Huebener, 1984; Gross et al., 1985; Huebener and Seifert, 1984). Furthermore, the absolute local value of the energy gap at the coordinate point $x, y$ of the junction can be determined quantitatively by plotting the product $\delta V(x, y)\,\sigma(V)$ versus the voltage (Gross et al., 1985). In this way, for example, the high and low energy gap in the detector junction of a double junction configuration, where the detector was only partly covered by the injector, could be determined accurately. By investigating the spatial transition between the high-gap and the low-gap region, the quasiparticle diffusion length in the electrodes of the detector junction could be measured. We have seen that by means of LTSEM one can determine independently the local value of the tunneling conductance and of the energy gap in the superconducting electrodes of a tunnel junction, depending upon the bias point on the quasiparticle IVC. This becomes particularly important for investigating the spatial structures developed in the nonequilibrium state of a thin-film superconductor during tunnel injection of quasiparticles (Gross, Schmid and Huebener, 1986). For this nonequilibrium state, theory predicts a gap instability resulting in a spatial multiple gap structure where domains with different values of the energy gap exist simultaneously in the superconductor (see, e.g., Gross, Schmid and Huebener, 1986; Tremblay, 1981; Elesin and Kopaev, 1981).
For the first time, LTSEM provides the possibility of spatially resolved observation of this multiple gap structure with high resolution. Specifically, theory predicts that during injection of excess quasiparticles two stable values of the energy gap can exist: a reduced gap value $\Delta_1$ and an unperturbed gap value $\Delta_3$. For a certain fixed voltage $V_i$ across the injector junction, a first-order phase transition takes place, and the relative size of the domains with the different energy gaps $\Delta_1$ and $\Delta_3$ is determined by the total injection current. For a fixed current through the injector, a second-order phase transition takes place at $V = V_i$, and the given total current fixes the relative phase volumes of the $\Delta_1$ (higher-current) and the $\Delta_3$ (lower-current) domain. The phase boundaries are stationary and stable. A current change is accompanied by a motion of the phase boundary. For the injection voltage $V_i$ the relation $\Delta_1 + \Delta_3 < |eV_i| < 2\Delta_3$ holds. Further theoretical references and details can be found elsewhere (see Gross, Schmid and Huebener, 1986).
FIG. 21. Double junction configuration for studying spatial structures developed during tunnel injection of quasiparticles. The cross-section shows film I (PbIn, 900 Å), film II (PbIn, 1300 Å), and film III (PbIn, 2700 Å); injector: films I-II, detector: films II-III. Further details are given in the text (from Gross et al., 1986).
As a typical example, we present results obtained with a PbIn-PbIn-PbIn double tunnel junction. The geometrical cross-section of this configuration is shown in Fig. 21. The bottom and middle films formed the injector junction (overlap geometry). The detector junction, formed by the middle and top films, was defined by a window in an insulating SiO layer. In this way, an incomplete overlay of the detector and injector junctions, which would result in an unperturbed part of the detector junction, was avoided. The oxide barriers of the junctions were formed by exposing the metal surface to a pure O$_2$ atmosphere. The tunneling area of the injector junction was typically 45 μm × 50 μm. The normal tunneling resistance $R_n$ per unit area of the detector junction was always at least an order of magnitude larger than that of the injector junction. For observing the spatial configuration of the multiple gap structure by means of LTSEM, the voltage signal $\delta V(x, y)$ generated in the injector junction is recorded at different values of the injector current. (During this imaging process no voltage is applied to the detector junction.) Typical results are presented in Fig. 22. The inset shows the IVC of the injector junction and the bias points (black dots) at which the voltage images were recorded. Image (a) was obtained with a bias point in the thermal tunneling regime and shows the spatial distribution of the quasiparticle tunneling current density. Evidently an inhomogeneity in the barrier results in a relatively sharp peak of the tunneling current density in the middle of the junction. The current density is also slightly increased near the middle of the right edge of the injector junction. The images (b)-(g) were obtained with the bias points in the gap regime.
Since the relation $\Delta_1 + \Delta_3 < eV_i < 2\Delta_3$ holds for the injector bias voltage $V_i$ in the vertically rising part of its IVC, and since the voltage signal reaches its maximum where $eV_i = 2\Delta(x, y)$, the transition region between the domains of small gap $\Delta_1$ and large gap $\Delta_3$ generates the maximum voltage signal. Image (b) shows that the small-gap domains nucleate just at the point of increased local current density, i.e., the inhomogeneities in the tunneling barrier act as nucleation centers for the small-gap domain. As the injector current is increased, the small-gap region grows in size until it occupies the
FIG. 22. Two-dimensional voltage images of the injector junction showing the quasiparticle current density distribution (a) and the energy gap distribution (b)-(g). The bias points used for the recording of the images are shown in the inset. The electron-beam-induced voltage signal $|\delta V(x, y)|$ is plotted vertically during the horizontal scans. The maximum voltage signal is about 1 μV. The arrows mark the junction area within the field of view. The spatial resolution is about 3 μm. $T_b$ = 1.8 K, beam voltage = 26 kV, beam current = 6.5 pA, beam modulation frequency = 20 kHz (from Gross et al., 1986).
total junction area (images (c)-(g)). The width of the transition region between the small-gap and the high-gap domain (where the maximum voltage signal is generated) is given by the quasiparticle diffusion length. Of course, an accurate determination of the two energy gaps $\Delta_1$ and $\Delta_3$ (without observing their spatial distribution) is possible in the usual way by measuring the quasiparticle IVC of the detector junction at fixed values of the injector current. Such measurements were found to be in excellent agreement with the theoretical predictions (Gross, Schmid and Huebener, 1986).
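The voltage images of Figs. 20 and 22 were recorded with a modulated beam and lock-in detection of the voltage signal. The following Python sketch illustrates the principle of such phase-sensitive detection on synthetic data; it is not a description of the instrumentation actually used. The sampling rate, the noise level, and the signal amplitude are assumptions chosen only for illustration, while the 20 kHz modulation frequency is the value quoted for these figures.

```python
import numpy as np


def lock_in_amplitude(signal, t, f_ref):
    """Amplitude of the component of `signal` at the reference frequency."""
    ref_cos = np.cos(2.0 * np.pi * f_ref * t)
    ref_sin = np.sin(2.0 * np.pi * f_ref * t)
    x = 2.0 * np.mean(signal * ref_cos)  # in-phase component
    y = 2.0 * np.mean(signal * ref_sin)  # quadrature component
    return float(np.hypot(x, y))


f_mod = 20.0e3                      # beam modulation frequency (Hz)
f_sample = 1.0e6                    # assumed sampling rate (Hz)
t = np.arange(200_000) / f_sample   # 0.2 s record = 4000 full periods

rng = np.random.default_rng(0)
true_amp = 5.0e-6                   # assumed 5 microvolt signal amplitude
clean = true_amp * np.sin(2.0 * np.pi * f_mod * t + 0.3)
trace = clean + 50.0e-6 * rng.standard_normal(t.size)  # broadband noise

recovered = lock_in_amplitude(trace, t, f_mod)
print(f"recovered amplitude: {recovered * 1e6:.2f} microvolt")
```

Averaging over many modulation periods is what allows a microvolt signal to be extracted from noise that is an order of magnitude larger in each individual sample.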
The tunneling barrier between the two superconducting electrodes represents the most sensitive part of a junction. Since the tunneling resistance depends exponentially upon the height and width of the barrier potential, small inhomogeneities in the barrier cause strong deviations from the regular junction behavior. The spatially resolved detection of such barrier inhomogeneities and the clarification of the mechanisms underlying their generation represent perhaps the most important and challenging tasks for the application of LTSEM to superconducting tunnel junctions. Such experiments promise an effective optimization of the junction fabrication process. In the remainder of this section, we concentrate on this question of barrier inhomogeneities. In Fig. 23 we show the results of three different methods by which typical barrier inhomogeneities can be detected (Bosch, 1986). These measurements were performed with a PbIn-PbIn junction with a tunneling window of
FIG. 23. Three methods for two-dimensional imaging of barrier inhomogeneities. The principle of each method is indicated on the left. (a) Image resulting from the beam-induced change $-\delta I_c(x, y)$ of the maximum Josephson current. (b) Image based on the beam-induced change $\delta I(x, y)$ of the quasiparticle current at fixed bias voltage $V_s$ in the thermal tunneling regime. (c) Voltage image $\delta V(x, y)$ obtained from the quasiparticle IVC according to Eq. (35). Further details are given in the text. $T_b$ = 4.2 K, beam voltage = 26 kV, beam current = 10-100 pA (from Bosch, 1986).
101 μm length and 19 μm width (Josephson penetration depth $\lambda_J$ = 47 μm). On the left, the measuring principle of each method is indicated. On the right, the beam-induced signal of different line scans in the y-direction of the long junction dimension for fixed values of the x-coordinate is plotted vertically (y-modulation). Image (a) shows the change $-\delta I_c(x, y)$ of the maximum Josephson current. A large signal amplitude corresponds to a large value of the local maximum Josephson current density in the unperturbed junction. Each line scan took about 0.8 s. During this scanning time the maximum Josephson current was measured electronically about 4000 times. Image (b) is based on the quasiparticle IVC. Here the beam-induced change $\delta I(x, y)$ of the quasiparticle current at a fixed bias voltage $V_s \approx \Delta/e$ in the thermal tunneling regime constitutes the signal. This imaging procedure simply consists of electronically detecting the quasiparticle current value at which the fixed threshold voltage $V_s$ is exceeded. Image (c) is also based on the quasiparticle IVC and shows the voltage signal $\delta V(x, y)$ as expressed in Eq. (35), obtained in the thermal tunneling regime near the operating voltage $V \approx \Delta/e$. All three images in Fig. 23 are similar and clearly indicate a local maximum of the tunneling conductivity in the middle of the junction. The deviation in the signal shape of the line scans in image (a) from that in the other two images apparently results from the influence of the phase-difference function $\varphi(x, y)$ (see Eq. (28)). A local conductivity maximum in the tunneling window such as shown in Fig. 23 can eventually lead to the development of a hotspot in the junction (Gross, Koyanagi, Seifert and Huebener, 1984). A particularly drastic and important case of an inhomogeneity in the tunneling barrier is the existence of a superconducting microshort between the two electrodes of the junction.
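The threshold detection described for image (b), i.e., finding the current at which a fixed threshold voltage is first exceeded, can be sketched as follows. This is a minimal Python illustration on a synthetic current ramp; the function name, the toy IVC, and all numerical values are hypothetical and do not describe the electronics actually used.

```python
import numpy as np


def current_at_threshold(current, voltage, v_threshold):
    """Return the first current of a ramp at which voltage exceeds v_threshold.

    current, voltage -- samples taken while the current is ramped upward
    Returns None if the threshold is never exceeded.
    """
    above = np.nonzero(voltage > v_threshold)[0]
    if above.size == 0:
        return None
    k = above[0]
    if k == 0:
        return float(current[0])
    # Linear interpolation between the last sample below and the first above.
    frac = (v_threshold - voltage[k - 1]) / (voltage[k] - voltage[k - 1])
    return float(current[k - 1] + frac * (current[k] - current[k - 1]))


# Toy ramp: the voltage grows quadratically with the current (illustrative only).
i_ramp = np.linspace(0.0, 1.0e-3, 1001)   # A
v_meas = 1.0e3 * i_ramp**2                # V

i_thr = current_at_threshold(i_ramp, v_meas, v_threshold=2.0e-4)
print(f"threshold current: {i_thr * 1e3:.4f} mA")
```

Repeating such a measurement at every beam position and recording the change of the detected current relative to the unperturbed value would give one pixel of the image per ramp.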
In this case, imaging by means of the voltage signal $\delta V(x, y)$ derived from the spatial distribution of the quasiparticle current density and summarized in Eq. (35) is no longer possible, since the IVC of the junction is changed to that of a superconducting microbridge. However, the location of one or more microshorts in the junction can be imaged by LTSEM using the same method that has been applied for imaging the Josephson current density distribution (see Section V). Instead of the maximum Josephson current, one determines the critical current of the shorted junction as a function of the coordinate point of the electron beam focus. For detection of the critical current value one can use a suitable threshold value of the voltage of, say, 10 mV. As a typical example, we show in Fig. 24 the IVC of a tunnel junction containing a distinct superconducting microshort. The specimen is a PbIn-PbBi junction with a tunneling window of 97 μm length and 4 μm width (in-line geometry). The critical current is about 11 mA. The IVC is highly similar to that of a superconducting microbridge in which phase slip processes take
FIG. 24. IVC of a tunnel junction with a superconducting microshort (current $I$ in mA versus voltage $V$ in mV). Further details are given in the text. $T_b$ = 4.2 K (from Bosch, 1986).
place (Huebener, 1979). It is distinctly different from that expected for quasiparticle tunneling and displayed in Figs. 17 and 18. An SEM micrograph of the junction is shown in Fig. 25(c). The beam-induced critical current change $-\delta I_c(y)$ as a function of the y-coordinate in the direction of the long junction dimension is shown in Fig. 25(a). This change is proportional to the local critical current density in the unperturbed junction. The junction was scanned along a single line (at a fixed value of the x-coordinate). The time for this line scan was about 100 ms. During this scanning time the critical current was measured electronically about 500 times. The critical current is seen to be strongly reduced by the beam irradiation at two locations close to the two ends of the junction, indicating superconducting microshorts at these locations. The maximum beam-induced signal $|\delta I_c(y)|$ is about 30% of the critical current in the unperturbed sample. A two-dimensional image of the signal
FIG. 25. Image of a superconducting short in a tunnel junction showing the IVC presented in Fig. 24. (a) Beam-induced change $-\delta I_c(y)$ of the critical current versus the (longitudinal) y-coordinate. (b) Two-dimensional image of the signal $|\delta I_c(x, y)|$ obtained by modulating the brightness on the oscilloscope screen. The dark area indicates a large value of the signal $|\delta I_c(x, y)|$. The dashed lines mark the boundary of the tunneling window. $T_b$ = 4.2 K, beam voltage = 26 kV, beam current = 10-100 pA. (c) SEM micrograph of the specimen (from Bosch, 1986).
$|\delta I_c(x, y)|$ is shown in Fig. 25(b). Here the signal is used for modulating the brightness on the oscilloscope screen. The dark area indicates large values of the signal $|\delta I_c(x, y)|$. The position of the tunneling window is marked by the dashed lines. The investigation of possible correlations between the appearance of superconducting microshorts and the metallurgical microstructure of the junction electrodes represents an important and challenging task. In this context the spatially resolved functional test of the junction by means of LTSEM is crucial. Summarizing, we note that the voltage signal $\delta V(x, y)$ based on the quasiparticle IVC clearly images inhomogeneous junction properties of the barrier as well as of the energy gap in the superconducting electrodes with a spatial resolution approaching 1-2 μm. This structural information obtained by LTSEM is highly important for optimizing the fabrication process and the geometric parameters of the tunnel junction. Microshorts can also be detected accurately by LTSEM and can be correlated with the defect structure of the barrier or the metallurgical microstructure of the superconducting electrodes. In addition to these inhomogeneities resulting from the junction fabrication process, intrinsic spatial structures of the energy gap due to excess quasiparticle injection can be investigated by means of LTSEM.
VII. ARRAYS OF SUPERCONDUCTING TUNNEL JUNCTIONS
In addition to the studies of single tunnel junctions discussed in Sections V and VI, large junction arrays and complex superconducting electronic circuits can be investigated by LTSEM. Such arrays are becoming increasingly interesting because of their fundamental properties (Bindslev Hansen and Lindelof, 1984; Lobb, 1984) and because of their cryoelectronic applications, including high-frequency devices (Bindslev Hansen, Finnegan and Lindelof, 1981) or frequency-based voltage standards (Niemeyer, Hinken and Kautz, 1984; Niemeyer, Hinken and Meier, 1984). Superconducting microelectronic circuits of varying complexity find increasing use as intelligent electromagnetic sensors of extreme sensitivity. Here superconducting quantum interference devices (SQUIDs) often play a central role. Further, the Josephson computer represents a highly complex cryoelectronic system with many interconnected superconducting lines and tunnel junctions. Today all of these systems are fabricated by means of thin-film technology. LTSEM is ideally suited for a spatially resolved investigation of their physical properties. In addition to the spatially resolved investigation of the overall superconducting circuitry of such complex systems, at the same time the individual
FIG. 26. (a) Micrograph of a large part of an array of 166 tunnel junctions. The series connection of the junctions is indicated at the bottom. (b) Voltage image of the array obtained for current bias in the gap regime of the quasiparticle IVC. The dark junction areas represent shorted junctions. Scale bars: 200 μm. $T_b$ = 4.2 K, beam voltage = 26 kV, beam current = 10-100 pA. Further details are given in the text (from Bosch et al., 1985).
components such as single tunnel junctions can be studied with high spatial resolution. In this way, LTSEM can detect single malfunctioning elements in a large complex array or network. For illustration we present some results obtained recently by LTSEM for a series configuration of 166 superconducting tunnel junctions (Bosch et al., 1985). This array was fabricated in conjunction with the development of a Josephson voltage standard (see Niemeyer, Hinken and Kautz, 1984; Niemeyer, Hinken and Meier, 1984). The base electrodes were 200 nm thick and consisted of Pb-12%In-4%Au alloy. The counterelectrodes were 450 nm thick and consisted of Pb-3%Au alloy (compositions in weight percent). The single-junction area of 20 μm × 45 μm was defined by a window in a 350 nm thick SiO layer. Figure 26(a) shows a micrograph of a large part of the array. The series connection of the tunnel junctions is indicated at the bottom. In Fig. 26(b), we see a voltage image of a large part of the array obtained at the bath temperature $T_b$ = 4.2 K. The Josephson current was suppressed by a small externally applied magnetic field. The signal $\delta V(x, y)$ of the voltage image was obtained when the array was current biased in the gap regime of the quasiparticle IVC. In this case, the beam-induced voltage signal $\delta V(x, y)$ displays a mixture of information about the energy gap of the superconducting electrodes and the barrier resistance, as we have discussed in Section VI. The voltage signal was used for modulating the brightness on the oscilloscope screen of the SEM. The bright regions indicate a large tunneling conductance or a large energy gap. In Fig. 26(b), the tunneling windows of the individual
junctions can clearly be seen. The dark junction areas represent (seven) shorted tunnel junctions, which can easily be identified by this method. Again, one would like to correlate the malfunctioning junctions with anomalies in their microstructure resulting from the fabrication process of the array. Sometimes useful indications can be obtained already from conventional SEM studies performed in situ along with the LTSEM experiments (Bosch, Gross, Huebener and Niemeyer, 1985). Finally we note that thin-film superconducting networks are rapidly becoming subjects of fundamental research because of their interesting properties and the recent advances in microfabrication techniques (Pannetier et al., 1984; Gordon et al., 1986). Clearly the high-resolution imaging capability of LTSEM promises further insight into the physical behavior of such networks.
VIII. HOTSPOTS IN SUPERCONDUCTING MICROBRIDGES
In a thin-film superconductor, electric current can flow without energy dissipation only up to a distinct critical value. If the critical current is exceeded, a resistive voltage appears. The onset of this voltage is due to the process of phase slippage and flux flow (Huebener, 1979). If the current is increased further, the energy dissipation eventually becomes large enough that one or more self-heating hotspots develop locally (possibly resulting in the destruction of the superconductor). Such a hotspot in a thin-film superconductor represents a stable temperature structure consisting of a domain where the temperature is elevated appreciably above the critical temperature of the superconducting sample. In this way considerable Joule energy is dissipated within the hotspot domain. The stable spatial temperature structure of a hotspot is an example of the dissipative structures often encountered in the nonequilibrium state of open systems. Studies of the formation of hotspots in a thin-film superconductor are important both for fundamental reasons (Bedeaux and Mazur, 1981; Freytag and Huebener, 1985) and because of their implications for technology (Gray et al., 1983). As pointed out by Landauer (Buttiker and Landauer, 1982; Landauer, 1978), the temperature structure associated with a hotspot results from the particular temperature dependence of the electric resistance, namely an S-shaped curve such as shown schematically in Fig. 27. The S-shaped character of the resistance-temperature curve is particularly pronounced in a superconductor. Here, with increasing temperature, the electric resistance rises rapidly from zero near the critical temperature, reaching a value that varies only little as the temperature is increased further.
FIG. 27. S-shaped temperature dependence of the electric resistance.
A detailed analysis of the heat balance equation describing a hotspot generated by Joule heating in a thin-film superconductor deposited on a substrate has been given by Skocpol et al. (1974). For simplicity, these authors assumed a one-dimensional geometry where the transverse sample dimension is small compared to the thermal healing length. Their experiments essentially confirmed the details of their theoretical analysis, including the highly nonlinear IVC associated with the generation of a hotspot. Subsequent experiments on wider films (Huebener, 1975) have shown that the one-dimensional model also applies reasonably well to geometries that are no longer strictly one-dimensional. The subject of hotspots in thin-film superconductors has been summarized in two recent reviews (Dharmadurai, 1980; Skocpol, 1982). We will see in the following that the boundaries of a hotspot in a thin-film superconductor can be imaged by LTSEM. Again it is the beam-induced localized heating effect and the resulting voltage signal δV(x, y) which is utilized for imaging, similar to the situation discussed in Section VI in conjunction with superconducting tunnel junctions. Instead of a nonlinear quasiparticle IVC, we are now dealing with the nonlinear IVC associated with the hotspot formation in a thin-film superconductor (Skocpol et al., 1974). The principle for the generation of this voltage image δV(x, y) is shown schematically in Fig. 28. In the region of the hotspot, the temperature in the film is raised above the critical temperature T_c, whereas outside this region the temperature is below T_c and approaches the bath temperature T_b sufficiently far away from the hotspot. The boundaries of the hotspot are defined by the coordinates at which the temperature profile passes the value T_c. The voltage signal δV(x, y) for imaging the hotspot boundaries consists of the change in the resistive voltage of the current-biased sample caused by the beam irradiation.
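For orientation, the one-dimensional heat balance underlying this picture can be sketched in the form commonly associated with the model of Skocpol et al. (1974). The notation below is an assumption made for illustration (κ: heat conductivity of the film, d: film thickness, α: heat transfer coefficient per unit area to the substrate, j: current density, ρ_n: normal-state resistivity, x_b: position of a hotspot boundary) and may differ from that of the original analysis:

```latex
% Steady-state heat balance along the film (sketch, assumed notation)
-\kappa \frac{d^2 T}{dx^2} + \frac{\alpha}{d}\,\bigl(T - T_b\bigr) =
\begin{cases}
  j^2 \rho_n, & T > T_c \quad \text{(normal region, inside the hotspot)} \\
  0,          & T < T_c \quad \text{(superconducting region)}
\end{cases}

% On the superconducting side of a boundary at x_b the excess
% temperature decays exponentially over the thermal healing length
T(x) - T_b = (T_c - T_b)\, e^{-|x - x_b| / \eta},
\qquad
\eta = \sqrt{\kappa d / \alpha}
```

In this sketch the thermal healing length η sets the distance over which the film temperature relaxes from T_c at the boundary toward T_b, which is why the beam-sensitive region near each boundary is expected to be of order η.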
We note that the hotspot is maintained by the current applied to the sample, whereas the electron beam only acts as a small perturbation of the
FIG. 28. Principle of the generation of the voltage image δV(x, y) of the hotspot boundaries in a thin-film superconductor. (a) Sample geometry and location of the hotspot (hatched area). (b) Temperature profile along the sample. (c) Electron-beam induced voltage signal δV(x) versus the sample coordinate.
system. Qualitatively, the origin of the voltage signal marking the hotspot boundaries can be understood as follows. If the electron beam is scanned over the total length of the sample, the beam-induced local temperature increment causes a small increase in the electric sample resistance, resulting in a voltage signal δV(x, y), only near the location where the temperature profile passes through the value T_c. Further away from the hotspot the sample temperature is too low for any additional electric resistance to appear due to the beam irradiation. On the other hand, within the hotspot domain with T > T_c, the resistance is practically temperature independent for the small temperature excursions expected due to the beam irradiation. The sensitive region at each hotspot boundary extends over a distance of about the thermal healing length η in both directions into the adjacent superconductor. Therefore, we expect a peak of width 2η in the voltage signal δV(x, y) to appear at the hotspot boundaries if the sample film is scanned in the longitudinal direction. A more detailed treatment of the origin of the voltage signal δV(x, y), based on a mathematical analysis of the heat balance equation, can be found elsewhere (see Eichele et al., 1983). Of course, such an accurate treatment must include the influence of the external electronic circuit attached to the specimen. As
FIG. 29. Two-dimensional voltage image of the hotspot boundaries, indicated by the bright regions, for a superconducting tin film. The narrow strip with the wide section at both ends is indicated by the black lines. The image shown at the top is obtained by detecting the backscattered electrons. Results are presented for different sample voltages as indicated on the right. T_b = 2.5 K, beam voltage = 30 kV, beam current = 1-50 pA. (From Eichele et al., 1983.)
shown by the theoretical analysis, the voltage signal δV(x, y) arises because of a small expansion of the hotspot in addition to a small shift in its location caused by the electron-beam irradiation. Typical experimental results (Eichele et al., 1983) are shown in Fig. 29. Here we see the two-dimensional voltage signal δV(x, y) for increasing and decreasing sample voltage. The signal is used for modulating the brightness on the oscilloscope screen. Bright regions correspond to large values of the signal |δV(x, y)|. At the top of the figure the signal generated by the backscattered electrons is shown for identification of the sample location. The experiments were performed using a tin film of 1.9 mm length, 102 μm width and 0.5 μm thickness at a He bath temperature of 2.5 K. Figure 30 presents the voltage signal obtained under the same conditions as in Fig. 29, plotted against the longitudinal sample coordinate (y-modulation). As expected, a localized voltage signal δV(x, y) appears at the two hotspot boundaries. Figures 29 and 30 clearly show how the length of the hotspot changes with increasing and decreasing sample voltage and current.
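How a hotspot length might be read off such a line scan can be sketched as follows. The example uses a synthetic signal, not measured data: the two-sided exponential peak shape, the boundary positions, the peak width, and the function names are all assumptions made for illustration. The hotspot length is then taken as the distance between the two largest local maxima.

```python
import math

def line_scan(x_mm, boundaries_mm=(0.6, 1.3), eta_mm=0.05):
    """Toy signal: one two-sided exponential peak per hotspot boundary.

    The peak shape and all numbers are assumed for illustration only.
    """
    return sum(math.exp(-abs(x_mm - b) / eta_mm) for b in boundaries_mm)

def hotspot_length(xs, signal):
    """Distance between the two largest local maxima of the signal."""
    peaks = [
        (signal[i], xs[i])
        for i in range(1, len(xs) - 1)
        if signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("fewer than two peaks found in the line scan")
    (_, x_first), (_, x_second) = sorted(peaks, reverse=True)[:2]
    return abs(x_first - x_second)

# Synthetic scan along a 1.9 mm long film in 1 micrometer steps (units: mm).
xs = [i * 0.001 for i in range(1901)]
signal = [line_scan(x) for x in xs]
length_mm = hotspot_length(xs, signal)

if __name__ == "__main__":
    print(f"estimated hotspot length: {length_mm:.3f} mm")
```

With real data, noise would normally require smoothing or a minimum peak height before the maxima are selected; that step is omitted here to keep the sketch short.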
FIG. 30. Voltage signal δV(x) (in arbitrary units) obtained under the same conditions as in Fig. 29 versus the longitudinal sample coordinate. The amplitude of the signals δV(x) shown is 100-200 nV. The top shows the signal generated by the backscattered electrons. The two hotspot boundaries are marked by the voltage peaks. The different sample voltages are indicated on the right. (From Eichele et al., 1983.)
Figure 31 shows a plot of the electric resistance of the film versus the distance between the two maxima of the voltage signal for eight different values of this distance. The data can be fitted well by a straight line passing through the origin and through the resistance value obtained if the total sample length L is in the normal state (last point on the upper right). These results refer to a tin film doped with O₂ of 1.9 mm length, 91 μm width and 0.5 μm thickness. They clearly demonstrate that the distance between the two peaks of the voltage signal indeed measures the geometric length of the hotspot. We have pointed out above that the width of the signal δV(x, y) at the hotspot boundaries is approximately twice the thermal healing length η
FIG. 31. Electric resistance versus the distance between the two maxima of the voltage signal δV(x) for an O₂-doped tin film. T_b = 3.7 K, beam voltage = 30 kV, beam current = 1-50 pA. (From Eichele et al., 1983.)
expressed in Eq. (2). Through the heat conductivity κ, the length η depends upon the purity of the superconducting film. This dependence is demonstrated in Fig. 32, which shows the signal δV(x, y) marking the two hotspot boundaries plotted against the longitudinal sample coordinate for three samples differing strongly in their heat conductivity. All samples had the same film thickness of 0.5 μm and were deposited on single-crystalline sapphire substrates. From top to bottom in Fig. 32 the sample material was pure tin, O₂-doped tin, and O₂-doped aluminum. For the last material the image δV(x, y) of a single hotspot boundary is also shown at higher resolution. The distance 2η calculated from Eq. (2) for the three samples is indicated in Fig. 32 and agrees reasonably well with the geometrical width of the voltage signals. It is interesting that in the O₂-doped aluminum sample the voltage signal is localized in a region as small as a few μm. Further details on these experiments, including the measuring electronics, can be found in Eichele et al. (1983). In a superconducting microbridge fabricated from a highly impure material such as O₂-doped aluminum, many hotspots can be generated simultaneously because of their small size. Indeed, hotspots with a total length of 15-20 μm have been observed by LTSEM (Eichele et al., 1983). Of
FIG. 32. Voltage signal δV(x) marking the two hotspot boundaries versus the longitudinal sample coordinate for three samples as indicated in the text. The thermal healing length η decreases from the sample at the top to that at the bottom because of the decrease in heat conductivity. For sample H9 (bottom) the voltage signal δV(x) of a single boundary is also shown at higher spatial resolution. The width 2η of each voltage peak calculated from Eq. (2) is indicated for each sample. T_b differed for the three samples and ranged between 2.25 and 3.7 K. Beam voltage = 30 kV, beam current = 1-50 pA. (From Eichele et al., 1983.)
course, the high spatial resolution provided by LTSEM is crucial for such observations. According to Eq. (21), high-frequency modulation of the electron beam results in a shrinking of the dynamic thermal healing length η_ω with increasing modulation frequency ω. Correspondingly, the width of the modulated voltage signal marking the hotspot boundaries is reduced with increasing frequency. Experiments performed with an O₂-doped tin bridge up to beam modulation frequencies of 10 MHz confirmed these effects (Freytag et al., 1985). Typical results are shown in Fig. 33. Here the dynamic thermal healing length η_ω found experimentally from the width of the signal peaks δV(x, y) at the hotspot boundaries is plotted versus the frequency ν of the beam modulation. The solid line represents a theoretical curve calculated from Eq. (21)
FIG. 33. Dynamic thermal healing length η_ω versus the frequency ν = ω/2π of the beam modulation (in MHz). Crosses: experimental values obtained from the width of the voltage peaks. Solid line: theoretical curve calculated from Eq. (21) using the experimentally determined values η = 86 μm and τ = 280 ns. T_b = 2.5 K. (From Freytag et al., 1985.)
using the values η = 86 μm and τ = 280 ns. These two values were found from independent measurements. Hence, the theoretical curve contains no adjustable parameter. Together with the dynamic healing length η_ω, the peak value of the voltage signal δV(x, y) at the hotspot boundaries decreases with increasing modulation frequency. This effect is shown in Fig. 34. Again, the solid line represents a theoretical curve obtained from the theoretical analysis of the voltage signal at high-frequency beam modulation (Freytag et al., 1985). Both Figs. 33 and 34 indicate satisfactory agreement between experiment and theory. For further details we refer to Freytag et al. (1985). In summary, we have seen that the beam-induced voltage signal δV(x, y) clearly images the boundaries of a hotspot in a thin-film superconductor. Theoretically the origin of the voltage signal is well understood, and good agreement is found between experiment and theory, including the effects due to high-frequency beam modulation. Noting that the development of a hotspot represents the final stage (before destructive burnout) in the resistive behavior of a thin-film superconductor, reached at relatively high values of the applied electric current, the possible role of LTSEM imaging of the “early” stage remains an open and interesting question. Here we have in mind the formation of a phase-slip center in a nearly one-dimensional geometry
FIG. 34. Peak value of the voltage signal marking the hotspot boundaries versus the frequency ν = ω/2π of the beam modulation (in MHz). Crosses: experimental values. Solid line: theoretical curve obtained from a theoretical analysis. T_b = 2.5 K. (From Freytag et al., 1985.)
(Huebener, 1979; Skocpol et al., 1974). In a two-dimensional geometry, we deal correspondingly with a phase-slip line (Volotskaya et al., 1981, 1984) and flux flow (Huebener, 1979). So far, imaging of these latter structures in thin-film superconductors has not been reported. However, the generation of a voltage image of, say, a phase-slip center by means of LTSEM does not seem unfeasible, in particular if the phase-slip center can be pinned at some location within the thin-film superconductor.
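As a numerical illustration of the high-frequency beam modulation discussed in this section, the sketch below evaluates a dynamic healing length under the assumption that it follows the standard thermal-wave result η_ω = η / Re[(1 + iωτ)^(1/2)]. Equation (21) itself is not reproduced in this section, so this functional form, the function name, and the identification of the quoted 280 ns with the thermal relaxation time τ are assumptions rather than the author's expression.

```python
import math

def dynamic_healing_length(eta, tau_s, freq_hz):
    """Decay length of a damped thermal wave (assumed form, see lead-in).

    eta     : static thermal healing length (any length unit)
    tau_s   : thermal relaxation time in seconds
    freq_hz : beam modulation frequency nu = omega / (2 pi) in Hz
    """
    omega_tau = 2.0 * math.pi * freq_hz * tau_s
    # Real part of sqrt(1 + i * omega_tau), written out explicitly.
    real_part = math.sqrt(0.5 * (1.0 + math.sqrt(1.0 + omega_tau ** 2)))
    return eta / real_part

if __name__ == "__main__":
    eta_um = 86.0      # value quoted in the text (micrometers)
    tau_s = 280e-9     # value quoted in the text (seconds)
    for freq_mhz in (0.0, 1.0, 2.0, 5.0, 10.0):
        length = dynamic_healing_length(eta_um, tau_s, freq_mhz * 1e6)
        print(f"nu = {freq_mhz:5.1f} MHz -> healing length = {length:5.1f} um")
```

Under this assumption the healing length equals η at zero frequency and shrinks monotonically as the modulation frequency increases, which is the qualitative trend described for Fig. 33.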
IX. CURRENT FILAMENTS AND TURBULENCE IN SEMICONDUCTORS
So far we have discussed applications of LTSEM dealing with superconductivity as a typical low-temperature phenomenon. Next we turn to semiconductors. Here we will concentrate on the application of LTSEM for imaging the current filaments generated by avalanche breakdown induced by impurity impact ionization at low temperatures. At liquid-helium temperatures all charge carriers in a semiconductor are frozen out, and the material acts as an electric insulator if the applied electric field is not too large. However, above a threshold field, avalanche breakdown occurs, resulting in electric current flow. In a doped semiconductor this process
is caused by impact ionization of the donors or acceptors (Seeger, 1982; Monch, 1969). Recently, the subject of avalanche breakdown in semiconductors has received strongly increasing attention because of the observation of spontaneous oscillations and chaotic behavior of the electric resistance during avalanche breakdown (Teitsworth et al., 1983, 1986; Held et al., 1984, 1986; Peinke et al., 1985). Usually during avalanche breakdown of a homogeneous semiconductor, spatial structures develop such as current filaments or high-field domains (Bonch-Bruevich et al., 1975). These structures evolve as a result of the highly nonlinear sample response to the applied electric field, and their formation is associated with a strongly nonlinear IVC of the semiconducting material. One of the major issues of great current interest is the question of to what extent the complex temporal resistance behavior of a semiconductor during avalanche breakdown (Teitsworth et al., 1983, 1986; Held et al., 1984, 1986; Peinke et al., 1985) is correlated with complex spatial structures of the electric current flow. Here imaging by means of LTSEM promises to provide an answer. In the following we summarize recent experiments performed with p-germanium using LTSEM for the two-dimensional imaging of electric current filaments (Mayer, 1986; Mayer et al., 1987). The experimental arrangement is shown schematically in Fig. 35. The semiconductor crystal is glued to a single-crystalline sapphire substrate with
FIG. 35. Experimental arrangement for imaging current filaments in a semiconductor by means of LTSEM. The p-germanium sample is glued to the sapphire substrate using Stycast cement. The ohmic contacts are indicated by the hatched areas on the sample surface. (From Mayer, 1986.)
1 mm thickness and 20 mm diameter using Stycast cement for good thermal contact. During the LTSEM experiments the bottom of the sapphire substrate is in direct contact with the liquid-He bath, whereas the top surface of the semiconducting crystal can be scanned with the electron beam. Electric leads attached to proper ohmic contacts on the specimen serve for applying the electric field. The response signal to be investigated as a function of the coordinate point (x, y) of the electron beam focus on the sample surface is the beam-induced conductivity change. The experiments described in the following were performed with voltage-biased operation. Hence, the response signal consisted of the beam-induced electric current change δI(x, y). We note that the beam irradiation only represents a small perturbation of the object to be studied. Typically, the absorbed beam power was about 1 μW, whereas the Joule heat dissipated in the specimen during avalanche breakdown was in the mW range. To increase the sensitivity, the beam current was modulated at 5 kHz and the signal δI(x, y) was detected using a lock-in technique. Qualitatively, the imaging process of the current filaments by means of the signal δI(x, y) can be understood as follows (see Fig. 36). We refer to a semiconducting sample doped homogeneously with shallow donors or acceptors. Taking Ge as an example and assuming a typical beam energy of 26 keV, the beam injects about 10⁴ electron-hole pairs per incident electron in the generation volume of a few μm diameter. If the beam is directed to a nonconducting region, where no filamentary current flow takes place, the injected hot carriers can locally induce avalanche breakdown by impurity impact ionization, resulting in a significant current increment δI(x, y) in the voltage-biased sample. On the other hand, if the beam is focused on a highly conducting region with filamentary current flow, where most of the shallow
FIG. 36. Origin of the signal δI(x, y) for imaging current filaments in a semiconductor. Top: part of a semiconductor sample as seen from the top, with a current filament in the vertical direction indicated by the hatched section in the center. The horizontal dashed line indicates a line scan of the electron beam. Bottom: beam-induced signal δI(x) versus the beam coordinate for the line scan shown at the top.
FIG. 37. Upper part: current-voltage characteristic without and with the beam irradiation. Inset shows the sample configuration. Lower part: beam-induced signal δI(x, y) plotted vertically for the series of horizontal line scans. The triangular markers on the images correspond to those on the inset and specify the scanned portion of the sample. The images (a)-(f) refer to the different bias voltages indicated. T_b = 4.2 K, beam voltage = 26 kV, beam current = 300 pA, material is p-doped Ge. (From Mayer, 1986.)
impurities are ionized, no significant current increment is expected. In this way the boundaries of a current filament can be detected. (This imaging principle is in some ways similar to that for hotspots in current-carrying thin-film superconductors discussed in Section VIII.) The sample material was single-crystalline p-doped Ge with an acceptor (indium) concentration of about 10¹⁴ cm⁻³. The samples had the typical dimensions 9 × 3 × 0.26 mm³ and were polished with diamond paste. They were provided with ohmic aluminum contacts. Figure 37 (top) shows the IVC without and with the beam irradiation for a Ge sample with the geometry indicated in the inset. The hatched areas represent the ohmic contacts. The voltage was applied to the inner two contacts, and the beam-induced signal δI(x, y) was detected by means of the 1 ohm series resistor. The beam irradiation is seen to cause a small shift of the IVC. During this measurement of the IVC for the irradiated sample the beam was focused on a fixed point on the sample surface showing a relatively large response signal δI(x, y). In the lower part of Fig. 37, the signal δI(x, y) is plotted vertically for a series of horizontal line scans (y-modulation) across the center part of the sample. The triangular markers on the two-dimensional images correspond to those on the inset and specify the scanned portion of the specimen. The images (a)-(f) were obtained for increasing bias voltage as shown. Images (a) and (b) were recorded in the pre-breakdown regime with about 10 times the signal amplification used for images (c)-(f), which were obtained in the post-breakdown regime. The results of Fig. 37 clearly indicate the formation of a current filament, its width extending up to about 2 mm at the highest voltage shown. Of course, the shape of the filament in Fig. 37 is influenced by the dipole-like electric field pattern caused by the small area of the inner ohmic contacts.
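The orders of magnitude quoted for the beam perturbation (an absorbed power of order 1 μW and about 10⁴ electron-hole pairs per incident electron at 26 keV) can be checked with simple arithmetic. In the sketch below, the mean pair-creation energy of about 2.9 eV for Ge is an assumed literature-style value that does not appear in the text, the function names are hypothetical, and energy losses through backscattered electrons are ignored, so the computed power is an upper bound on the absorbed power.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulomb

def beam_power_w(beam_voltage_v, beam_current_a):
    """Incident beam power in watts (upper bound on the absorbed power)."""
    return beam_voltage_v * beam_current_a

def pairs_per_electron(beam_energy_ev, pair_energy_ev=2.9):
    """Electron-hole pairs per incident electron.

    pair_energy_ev is an assumed mean pair-creation energy for Ge.
    """
    return beam_energy_ev / pair_energy_ev

def electrons_per_second(beam_current_a):
    """Number of beam electrons arriving per second."""
    return beam_current_a / ELEMENTARY_CHARGE_C

if __name__ == "__main__":
    beam_voltage_v = 26e3
    for beam_current_a in (20e-12, 300e-12):  # beam currents quoted in the captions
        power_uw = beam_power_w(beam_voltage_v, beam_current_a) * 1e6
        rate = electrons_per_second(beam_current_a) * pairs_per_electron(beam_voltage_v)
        print(f"I_beam = {beam_current_a * 1e12:.0f} pA: "
              f"P = {power_uw:.2f} uW, pair generation rate = {rate:.2e} per s")
```

Both beam currents give powers in the microwatt range, roughly three orders of magnitude below the mW-level Joule heating during breakdown, consistent with the statement that the beam acts only as a small perturbation.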
This field pattern is likely to play a role in the rapid decrease of the signal δI(x, y) seen in the images (d)-(f) outside the filament region. For a proper recording of the signal δI(x, y) the beam modulation frequency must be sufficiently low that the sample can respond properly. For the Ge samples studied the typical signal decay time was found to be about 10-20 μs. Therefore, the 5 kHz modulation frequency appears adequate. Figure 38 shows the evolution of a multifilamentary structure for increasing voltage bias. The top part indicates the geometry, and the results refer to the upper sample portion. The two small tips on each of the upper two ohmic contacts serve for promoting filament nucleation. The IVC is also shown at the top. It does not display any beam-induced shift because of the small beam current of about 20 pA used in this experiment. The two-dimensional images presented in the lower part were obtained at the different levels of the voltage bias as indicated. Here the signal δI(x, y) is used for modulating the brightness on the oscilloscope screen. Bright regions correspond to a large
FIG. 38. Evolution of multifilamentary current flow in p-doped Ge. Upper part: geometry and current-voltage characteristic. Lower part: two-dimensional images of the beam-induced signal δI(x, y) obtained at the different voltage levels indicated. Bright regions correspond to a large signal δI(x, y). T_b = 4.2 K, beam voltage = 26 kV, beam current = 20 pA. (From Mayer, 1986.)
response signal δI(x, y). Again, the triangular markers on the two-dimensional image on the upper left correspond to those on the inset and specify the scanned portion of the specimen. All other images refer to the identical scanned section of the sample. The image obtained at 1.210 V refers to the pre-breakdown regime according to the IVC displayed at the top. Apparently, a self-sustained current filament does not yet exist, and current flow only occurs in the presence of the beam irradiation. The bright vertical band seems to be caused by this beam-induced current. At 1.220 V the onset of a self-sustained current filament can be seen on the upper right. At the same voltage a non-zero slope appears in the IVC. At 1.230 V, the filament is more pronounced. However, the image of the filament is still relatively noisy. At 1.235 V the filament looks more stable and the noise level is strongly reduced. The following images at 1.240 and 1.245 V show growth of the filament width. At 1.260 V the filament extends up to the right sample edge, and the filament continues to grow (1.270 V). The last three images (1.275 V, 1.277 V, 1.278 V) were obtained at relatively small voltage increments and refer to the onset of the region with the steep slope in the IVC. The rapid formation of new current filaments on the left can be seen. If the voltage was increased further, the current flow appeared to become more and more homogeneous. The sequence of events displayed in Fig. 38 can also be seen distinctly on linear line scans performed perpendicular to the electric field direction across the current filament structure. Figure 39 shows such a series of line scans performed on the same specimen as that in Fig. 38. These scans were taken across the middle of the sample for bias voltages in the range 1.200-1.300 V. The triangular markers indicate the location of the left and right specimen edge.
Again, the self-sustained current filaments show up in the form of a depression of the beam-induced signal δI(x, y). In particular, we point out the nucleation (1.222 V) and growth (1.224-1.274 V) of a current filament on the right side of the sample, the noisy behavior (1.229 V), and the strong signal increase at 1.274 V, just before several new filaments appear abruptly in the left part of the sample (1.278 V). Again, the signal disappears at higher voltages. We have seen that LTSEM provides valuable information on the spatial structures developed during avalanche breakdown of a homogeneous semiconductor. In addition, the temporal structure of the sample response to a local beam-induced perturbation can be investigated as a function of the coordinates of the beam focus. The results of such an experiment are shown in Fig. 40. They indicate that the beam irradiation can stimulate current oscillations if the beam is focused on the boundary region of a current filament. The beam was turned on and off periodically as shown by the trace on the top. The three temporal recordings of the response signal δI(t) were obtained
FIG. 39. Signal δI(x) for a series of line scans performed on the same sample as that in Fig. 38. The scans were taken across the middle of the sample for the different bias voltages indicated. The triangular markers indicate the left and right specimen edge. T_b = 4.2 K, beam voltage = 26 kV, beam current = 20 pA. (From Mayer, 1986.)
FIG. 40. Time dependence of the beam-induced signal δI(t) in p-doped Ge (time scale: 100 μs/div). The trace on the top indicates the pulsed operation of the electron beam. The three signal recordings (a)-(c) were obtained when the beam was focused on the corresponding locations marked on the inset. T_b = 4.2 K, beam voltage = 26 kV, beam current = 3 nA. (From Mayer, 1986.)
by moving the beam focus from location (a) to location (c) across the filament boundary, as indicated on the inset. When the beam was focused on location (b) on the filament boundary, stable current oscillations were observed as shown on the temporal trace (b). Beam-induced oscillations of the response signal δI(t) appeared in the frequency range 10-120 kHz, depending upon the location of the beam focus along the filament boundary and upon the beam intensity. The measurements shown in Fig. 40 were performed using a standard boxcar technique. Recordings (a) and (c) were obtained for the location of the beam focus outside and inside the current filament region and show a relatively large and small signal amplitude, respectively. This behavior is consistent with the results presented above. The temporal structure of the sample response observed when the beam is focused on the boundary region of the current filament suggests that it is this boundary region where the spontaneous oscillations and the chaotic behavior of the electric resistance (Teitsworth et al., 1983, 1986; Held et al., 1984, 1986; Peinke et al., 1985) develop during avalanche breakdown.
The experiments described above have shown that LTSEM yields valuable new information on the spatially structured current flow in a homogeneous semiconductor during avalanche breakdown at low temperatures. From the examples discussed we conclude that in the near future such experiments are expected to contribute significantly to a more detailed understanding of various low-temperature properties of semiconductors. As an interesting example for further investigations by means of LTSEM, we mention the new concept of a magnetic field effect transistor discovered recently (Mannhart and Huebener, 1986). This concept is based on the magnetic control and switching of current filaments in a semiconductor and appears highly promising as the key element of a new cryoelectronic device family (Mannhart et al., 1986; Huebener et al., 1985). Clarification of the mechanisms that determine the spatial resolution limit in these applications of LTSEM to low-temperature semiconductor physics still represents an important task. So far, systematic experiments relating to this question have not been performed. Empirically, a resolution of better than 30 μm, as indicated by the traces of linear line scans, has been obtained (Mayer, 1986). Finally, we note that in addition to the LTSEM experiments described above some other attempts at two-dimensional imaging of spatially structured current flow in a semiconductor have been reported recently (see, e.g., Kerner and Sinkevich, 1982; Jager et al., 1986).
X. BALLISTIC PHONON SIGNAL
Conceptually, the applications of LTSEM discussed in Sections VI-IX were all based on the same principle: the electron-beam induced local change of the electric conductivity of the specimen. Depending upon the bias condition, the sample response then consisted of the voltage signal δV(x, y) or the current signal δI(x, y). In these applications the local perturbation of the sample effected by the electron beam (thermal and/or electronic excitations) spread by means of diffusive processes, and these processes played a central role in determining the spatial resolution limit. In the final three sections, we deal with a distinctly different signal concept for spatial imaging, namely the ballistic phonon signal. Here the region of the specimen locally heated by the electron beam acts as a source of phonons (quanta of sound energy) which propagate ballistically (i.e., without scattering) to the opposite side of the crystal, where they can be detected with a suitable phonon detector. This arrangement is shown schematically in Fig. 41. There is a close similarity between this emission of phonons from the source
FIG. 41. Imaging principle based on the ballistic phonon signal.
region and the electromagnetic radiation emanating from a heated source (blackbody radiation) (Ashcroft and Mermin, 1976; Fjeldly et al., 1973). For the ballistic phonon propagation through the crystal over distances in the range of mm or cm, it is essential that the crystal is kept at low temperatures of only a few K (except for the hot region acting as phonon source). In this way phonon-phonon scattering is sufficiently suppressed. On the other hand, in metallic electrical conductors phonons cannot propagate ballistically over long distances because of their interaction with the electrons. Hence, the ballistic phonon signal can be utilized only in electric insulators and semiconductors (which act as insulators at low temperatures since all charge carriers are frozen out). Superconductors at temperatures far below their critical temperature T_c also represent an interesting case with a sufficiently long phonon mean free path because of the strong suppression of the electron-phonon interaction. However, in all cases imaging by means of the ballistic phonon signal is restricted to nearly single-crystalline materials. The ballistic phonons propagating through the crystal can serve for imaging the elastic anisotropy of the material by means of the phonon focusing effect. On the other hand, they can image structural defects in the crystal due to their scattering at these defects. These two imaging concepts are discussed in Sections XI and XII, respectively. The imaging procedure by means of the ballistic phonon signal (Huebener and Metzger, 1985), shown schematically in Fig. 41, is straightforward. The ballistic phonons are generated locally at the top surface of the specimen by irradiation with the electron beam. They are detected at the bottom surface of the specimen. The bottom surface is in direct contact with the liquid-He bath (note the general scheme shown in Fig. 1).
In this way effective cooling of the phonon detector and its operation at a well defined temperature is assured. By scanning the specimen surface with the electron beam, the ballistic phonons
arriving from the different directions are recorded by the detector. In this way the detector signal images the angular variation of the phonon intensity. Clearly, the angular resolution of this imaging procedure is determined by the geometric size of the source and of the detector for the ballistic phonons. For eliminating charging effects on the top side of the specimen due to the electron beam irradiation, this side is usually covered with an electrically conducting metal film. Typically, a granular aluminum film of about 0.5 µm thickness prepared in the presence of oxygen has been used for this overlay (Eichele et al., 1982).

Turning first to the phonon source, we assume that the source region is heated to an effective temperature T*. According to Planck's radiation law the spectral energy density u(ω, T*) of the phonons emitted from this region is given by (Ashcroft and Mermin, 1976; Fjeldly et al., 1973)

$$u(\omega, T^{*}) - \frac{3\hbar\omega^{3}}{4\pi^{2} v_s^{3}} = \frac{3\hbar\omega^{3}}{2\pi^{2} v_s^{3}}\,\frac{1}{\exp(\hbar\omega / k_B T^{*}) - 1} \tag{36}$$

Here the second term on the left takes into account the zero-point energy, and $\omega$ is the angular phonon frequency. The factor $1/v_s^{3}$ is the average of the inverse third power of the long-wavelength phase velocities of the three acoustic phonon modes. The frequency dependence of expression (36) for the temperature T* = 10 K is shown in Fig. 42. Here we have taken the value $v_s$ = 4000 m/s for the sound velocity.

FIG. 42. Spectral energy density u(ω, T) versus the phonon frequency ν = ω/2π. (The zero-point energy is subtracted; T = 10 K; $v_s$ = 4000 m/s.)

The spectral energy density is seen to reach a maximum at some frequency $\omega_{\max}$. From Eq. (36) one obtains the relation

$$\hbar\omega_{\max} = 2.82\, k_B T^{*} \tag{37}$$

which can also be written in the form

$$\lambda_{\max} T^{*} = 6.81 \times 10^{-8}\ \mathrm{m\,K} \tag{38}$$

Here $\lambda_{\max}$ is the phonon wavelength corresponding to the frequency $\omega_{\max}$. The optical analog of Eq. (38) is known as Wien's displacement law. As an important result, by looking at Fig. 42 we note that the dominant phonon frequency is typically about 600 GHz corresponding to an acoustic wavelength of about 7 nm (based on the values $v_s$ = 4000 m/s and T* = 10 K).

A rough estimate of the effective temperature T* of the source region can be obtained from the balance between the power input of the electron beam and the power output of the emitted ballistic phonons (Huebener and Metzger, 1985). Here we assume that the electron-beam power dissipated in the sample is completely removed by the emission process of the ballistic phonons. The power emitted by the phonons from a region at temperature T* into the adjoining half-space is given per unit area by

$$\frac{P}{A} = \frac{1}{4}\, w(T^{*})\, v_s \tag{39}$$

where $v_s$ is the sound velocity (averaged over the different acoustic modes) and w(T*) the phonon energy density in this region. The energy density w(T*) is found from the spectral energy density u(ω, T*) of Eq. (36) by integration over all phonon frequencies, yielding

$$w(T^{*}) = \frac{\pi^{2}}{10}\,\frac{(k_B T^{*})^{4}}{(\hbar v_s)^{3}} \tag{40}$$

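The numbers quoted for the phonon spectrum can be cross-checked with a few lines of code. The following is a minimal sketch, assuming the standard Planck form for three acoustic branches with a common velocity $v_s$ (thermal part only, zero-point term omitted); the function names are illustrative and not taken from the original work. It solves for the peak position behind Eq. (37), converts it to the dominant frequency and wavelength, and compares a numerical frequency integral with the closed form of Eq. (40).

```python
import math

H_BAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K


def thermal_spectral_density(omega, t_star, v_s):
    """Thermal part of the phonon spectral energy density (zero-point term omitted)."""
    x = H_BAR * omega / (K_B * t_star)
    prefactor = 3.0 * H_BAR * omega**3 / (2.0 * math.pi**2 * v_s**3)
    return prefactor / math.expm1(x)


def peak_position():
    """Solve x = 3 (1 - exp(-x)), the condition for the maximum of x^3 / (exp(x) - 1)."""
    x = 3.0
    for _ in range(60):  # plain fixed-point iteration; it contracts near the root
        x = 3.0 * (1.0 - math.exp(-x))
    return x


def energy_density_numeric(t_star, v_s, steps=20000, x_cut=60.0):
    """Midpoint-rule integral of the thermal spectrum up to hbar*omega = x_cut * k_B * T."""
    omega_cut = x_cut * K_B * t_star / H_BAR
    d_omega = omega_cut / steps
    total = 0.0
    for i in range(steps):
        total += thermal_spectral_density((i + 0.5) * d_omega, t_star, v_s)
    return total * d_omega


def energy_density_closed_form(t_star, v_s):
    """Closed form of Eq. (40)."""
    return math.pi**2 / 10.0 * (K_B * t_star)**4 / (H_BAR * v_s)**3


if __name__ == "__main__":
    t_star, v_s = 10.0, 4000.0  # values quoted in the text
    x_max = peak_position()
    nu_max = x_max * K_B * t_star / (2.0 * math.pi * H_BAR)
    print(f"x_max      = {x_max:.3f}")
    print(f"nu_max     = {nu_max / 1e9:.0f} GHz")
    print(f"lambda_max = {v_s / nu_max * 1e9:.2f} nm")
    ratio = energy_density_numeric(t_star, v_s) / energy_density_closed_form(t_star, v_s)
    print(f"numeric / closed form = {ratio:.6f}")
```

With T* = 10 K and $v_s$ = 4000 m/s the sketch should reproduce a dominant frequency close to 600 GHz and a wavelength close to 7 nm, in line with the values stated in the text.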
The optical analog of the result expressed in Eqs. (39) and (40) is known as the Stefan-Boltzmann law. For determining the emitted phonon power, one must know the total effective area of the phonon source. This area can be found experimentally from the angular resolution limit of the images based on the ballistic phonon signal. In this way a typical value T* = 10 K has been obtained recently for germanium (Huebener and Metzger, 1985).

In our discussion given above we have assumed that the source region is in thermal equilibrium at the elevated temperature T*. It is important to note that such a treatment of the phonon source region in terms of an effective temperature T* represents an approximation where all nonequilibrium aspects are contained only in the value of this temperature T*. The questions relating to the diameter of the source region have been discussed in Section IVB, and the important results are contained in Eqs. (16)-(19). A typical temperature profile T(r) of the region locally heated by the electron beam is shown in Fig. 7. Again, high-frequency beam modulation can be utilized for reducing considerably the diameter of the modulated part of the source region. In this way the angular resolution of the phonon imaging technique can be increased. Experimentally, diameter values of the source region in the range of 10-100 µm have been observed (Metzger, 1987; Huebener and Metzger, 1985; Huebener et al., 1986). We will see below that the effective area of the phonon detector can be made much smaller than that of the phonon source. Therefore, the angular resolution limit of the phonon imaging technique is dominantly determined by the phonon source.

Turning next to the phonon detector, we note that the detector operation in the liquid-helium temperature range represents a distinct advantage since it allows the application of extremely sensitive low-temperature devices. The simplest device appears to be a superconducting microbridge attached directly to the specimen surface at the bottom by thin-film evaporation. Such microbridges have been used extensively for phonon detection at low temperatures (Eichele et al., 1982). They consist of a thin-film superconductor connected at both ends to much wider film sections to which current and voltage leads are attached. Such a detector is shown schematically in Fig. 43. For maximum sensitivity the detector is operated close to the superconducting critical temperature $T_c$, where the electric resistance of the microbridge shows an extremely strong temperature dependence. During the experiment the temperature of the liquid-helium bath is adjusted for operation at the maximum slope of the resistance-temperature curve of the detector. Usually, the microbridge is current biased, and the voltage change due to the arriving ballistic phonons is recorded.
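The operating principle of the current-biased microbridge can be illustrated with a toy model. The sketch below is only an assumption-laden illustration: the resistive transition is modeled by a logistic step, and the transition width (10 mK), normal-state resistance (100 Ω), bias current, and temperature rise are hypothetical values, not data from the experiments described here. It locates the bath temperature of maximum slope dR/dT and evaluates the small-signal voltage change ΔV = I (dR/dT) ΔT.

```python
import math


def resistance(temp, t_c=2.12, width=0.01, r_normal=100.0):
    """Hypothetical R(T) in ohm: a logistic step of the given width (K) centred at t_c."""
    return r_normal / (1.0 + math.exp(-(temp - t_c) / width))


def slope(temp, dt=1e-5):
    """Central-difference estimate of dR/dT in ohm/K."""
    return (resistance(temp + dt) - resistance(temp - dt)) / (2.0 * dt)


def operating_temperature(t_low, t_high, steps=2000):
    """Scan the bath temperature and return the grid point with the largest dR/dT."""
    grid = [t_low + (t_high - t_low) * i / steps for i in range(steps + 1)]
    return max(grid, key=slope)


def voltage_signal(bias_current, temp, delta_t):
    """Small-signal response of a current-biased bolometer: dV = I * (dR/dT) * dT."""
    return bias_current * slope(temp) * delta_t


if __name__ == "__main__":
    t_op = operating_temperature(2.0, 2.2)
    signal = voltage_signal(10e-6, t_op, 1e-3)  # 10 uA bias, 1 mK temperature rise
    print(f"operating temperature: {t_op:.3f} K")
    print(f"voltage signal: {signal * 1e6:.1f} microvolt")
```

The linear response used here only holds while the temperature excursion stays small compared with the width of the transition.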
As detector material, superconducting films of granular aluminum prepared in the presence of oxygen with a typical thickness of about 30 nm have been used. Granular aluminum films have a high electric resistance at low temperatures in the normal state yielding a relatively strong detector signal. The critical temperature of such films is near 2.12 K. As a consequence, the temperature of the liquid-helium bath is below
FIG. 43. Typical geometry of a thin-film superconducting bolometer. The hatched part shows the effective bolometer area which can be as small as a few µm².
the λ-point, and the superfluid state of the liquid helium provides additional thermal stability of the cooling arrangement. The effective area of the thin-film detector can be made as small as only a few µm² by means of standard microfabrication techniques. In this case, the contribution of the detector to the angular resolution limit of the imaging method is insignificant, as we have pointed out above. Further details including the electronic measuring procedures can be found elsewhere (see Metzger, 1987; Huebener and Metzger, 1985; Eichele et al., 1982).

It is important to note that in this application as a phonon detector a superconducting microbridge represents a bolometer integrating over all frequencies of the arriving phonons. Therefore, dispersive effects requiring frequency selective phonon detection cannot be investigated in this way. On the other hand, frequency selective phonon detection at low temperatures is possible by means of superconducting tunnel junctions (Eisenmenger, 1976). So far, tunnel junctions have been used mainly for the detection of ballistic phonons above a distinct threshold frequency (Renk, 1972; Anderson and Wolfe, 1986). For further details and references we refer to these authors.

The ballistic phonon signal for the two-dimensional imaging of the phonon propagation in a crystal can also be generated by a laser scanning technique. Here the electron beam is just replaced by a laser beam and scanning is performed by means of a two-mirror system operated in the form of precision galvanometers (Northrop and Wolfe, 1980). Laser-beam scanning has been used extensively for investigating phonon focusing as will be discussed in Section XI.
From a comparison of electron-beam scanning with laser-beam scanning at low temperatures it appears that the former has distinct advantages over the latter because of the relatively small electron-beam diameter available even at relatively long working distances between the final lens and the specimen surface.

The ballistic phonon signal in LTSEM is distinctly different from the acoustic signal generated for imaging in scanning electron-acoustic microscopy (SEAM) and scanning photo-acoustic microscopy (SPAM) (Huebener and Metzger, 1985). In the two latter schemes the electron or laser beam is modulated at the angular frequency ω, and coherent sound waves of frequency ω are generated near the specimen surface. Typically, modulation frequencies in the range ω = 100 kHz to 500 MHz are used. On the other hand, the ballistic phonons we have discussed in this section are emitted incoherently from the region locally heated by the beam, and their frequency distribution is given by Eq. (36) (see also Fig. 42). The dominant frequency of these incoherent phonons is typically ν = 600 GHz, at least three orders of magnitude higher than the typical acoustic frequencies in SEAM and SPAM.
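Returning to the power balance used earlier in this section to estimate the effective source temperature T*, the estimate can be sketched numerically as follows. This is an illustration under stated assumptions: the emitted power per unit area is taken as w(T*) v_s / 4 (the phonon analog of the blackbody flux), the beam power is assumed to be fully converted into ballistic phonons as in the text, and the example inputs (26 kV, 1 µA, 40 µm source diameter, $v_s$ = 4000 m/s) are combined here only for illustration.

```python
import math

H_BAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K


def phonon_flux(t_star, v_s):
    """Power per unit area emitted into a half-space, w(T*) * v_s / 4, with w from Eq. (40)."""
    w = math.pi**2 / 10.0 * (K_B * t_star)**4 / (H_BAR * v_s)**3
    return 0.25 * w * v_s


def effective_temperature(beam_power, source_diameter, v_s):
    """Invert the balance beam_power = flux(T*) * area for T* (flux scales as T*^4)."""
    area = math.pi * (source_diameter / 2.0)**2
    flux_at_1k = phonon_flux(1.0, v_s)
    return (beam_power / (area * flux_at_1k)) ** 0.25


if __name__ == "__main__":
    beam_power = 26e3 * 1e-6  # 26 kV times 1 uA, in W (assumed fully dissipated)
    t_star = effective_temperature(beam_power, 40e-6, 4000.0)
    print(f"estimated T* = {t_star:.1f} K")
```

Because T* depends only on the fourth root of the power density, even sizable uncertainties in the absorbed beam power or in the source area change the estimate only weakly, and the result stays of the order of 10 K.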
XI. PHONON FOCUSING

As the first application of the ballistic phonon signal for two-dimensional imaging, we discuss in the following the phonon focusing effect. This effect is caused by the elastic anisotropy of the crystal. As a result the intensity of the ballistic phonon flux through the crystal depends upon the crystallographic direction of the phonon propagation. Because of the elastic anisotropy in a crystal the surface of constant phonon energy in wave vector space is, in general, not spherical and displays more or less pronounced anisotropy depending upon the material. Therefore, the energy flux or group velocity of an acoustic plane wave is generally not parallel to the wave vector. This situation is shown schematically in Fig. 44. We see that the phonon energy propagates preferentially along distinct crystallographic directions. At low temperatures, where the phonon mean free path is long and where ballistic phonon propagation dominates over diffusive processes, the anisotropic channeling of the phonon energy can be highly
FIG.44. Principle of phonon focusing. For an anisotropic surface of constant phonon energy in wave vector space, the phonon energy flux (oriented perpendicular to the surface of constant energy) is generally not parallel to the phonon wave vector. This results in preferential propagation of phonon energy along distinct crystallographic directions, as indicated by the arrows at the top.
pronounced. The first experimental demonstration of this “phonon focusing” has been reported by Taylor et al. (1969, 1971). In principle, the phonon focusing effect can be measured using a fixed localized phonon source at the crystal surface and a two-dimensional array of phonon detectors attached to the opposite surface. The phonon intensity measured with the different detectors of the array (possibly corrected for the differences in distance between source and detector) immediately yields the anisotropy of the ballistic phonon propagation. However, this scheme can also be inverted by attaching a single localized detector to one crystal surface and by operating a two-dimensional array of individual phonon sources on the opposite surface. The two-dimensional phonon source array can simply be realized by means of electron beam scanning, and we arrive at the scheme shown in Fig. 41. By recording the signal of the phonon detector as a function of the coordinate point of the electron beam focus, a two-dimensional image of the anisotropic phonon energy flux is obtained.

Recently such experiments have been performed with single-crystalline α-quartz (Eichele et al., 1982), sapphire (Eichele et al., 1982), germanium (Schulze and Keck, 1984; Huebener et al., 1986), and silicon (Metzger, 1987; Huebener and Metzger, 1985; Huebener et al., 1986). The samples were typically disks of 20 mm diameter and 2 mm thickness. As an example we show in Fig. 45 the two-dimensional image of the
FIG. 45. Two-dimensional display of the ballistic phonon signal in single-crystalline α-quartz (y-orientation). Bright regions indicate high intensity of the phonon flux. Further details are given in the text. Tb = 2 K, beam voltage = 30 kV, beam current = 30 nA. (From Eichele et al., 1982.)
FIG. 46. Monte Carlo image calculation of the phonon-focusing pattern of α-quartz in y-orientation. Bright regions indicate high phonon intensity. Further details are given in the text. (From Koos and Wolfe, 1984.)
anisotropic energy flux of the ballistic phonons in single-crystalline α-quartz (Eichele et al., 1982). The crystallographic y-axis of the sample (a disk of 20 mm diameter and 2 mm thickness) is oriented perpendicular to the plane of the figure, and the x-axis lies horizontally in the plane of the figure. Bright regions indicate high intensity of the phonon flux. The scanned area shown is 3.5 mm x 3.5 mm, corresponding to an angle of about ±40° around the crystal y-axis. The image shown in Fig. 45 represents the time-integrated bolometer signal. This signal is proportional to the total phonon intensity and, hence, does not display separately the individual contributions from the different acoustic phonon branches.

For comparison we show in Fig. 46 the results of a Monte Carlo image calculation of the phonon-focusing pattern of α-quartz viewed along the y-direction (Koos and Wolfe, 1984). The calculated image extends ±56° horizontally around the crystal y-axis. Piezoelectric effects, which cause only insignificant changes, are included in the calculation. Again the bright regions indicate high phonon intensity. We see that these theoretical results agree well with the experimental image shown in Fig. 45. Theoretical calculations of the angular dependence of the total phonon energy flux for the sum of the three acoustic modes in the long-wavelength limit reported earlier (Rosch and Weis, 1976) also show good agreement with the experimental results of Fig. 45. In addition to the time-integrated bolometer signal, the time-resolved
FIG. 47. Two-dimensional image of the anisotropic time-integrated phonon energy flux in single-crystalline [111]-oriented silicon. Bright regions indicate high intensity of the phonon flux. Further details are given in the text. Tb = 2.01 K, beam voltage = 26 kV, beam current ≈ 1 µA.
bolometer signal can be recorded. In this way the anisotropic ballistic propagation of the three acoustic phonon branches can be investigated separately because of their different phase velocities. Such experiments based on electron-beam scanning have been performed also with α-quartz confirming the theoretical predictions. Whereas the time-integrated measurements were carried out using a lock-in technique with a modulation frequency for the electron beam of typically 10-50 kHz, the time-resolved images were obtained by means of boxcar integration (Eichele et al., 1982). Figure 47 shows the anisotropic time-integrated phonon energy flux of [111]-oriented single-crystalline silicon (a disk of 20 mm diameter and 2 mm
FIG. 48. Ballistic phonon signal obtained for a single line scan of single-crystalline [111]-oriented silicon versus the horizontal line coordinate (same sample as in Fig. 47). The location of the line scan within the two-dimensional phonon focusing pattern is indicated by the dashed line in the inset. Tb = 2.01 K, beam voltage = 26 kV, beam current ≈ 1 µA. (Scale bar: 1 mm.)
thickness). Again, bright regions indicate high intensity of the phonon flux. The scanned area shown is 7 mm x 8 mm, corresponding to an angle of about ±60° around the [111] axis. For quantitative studies, individual line scans performed with the electron beam are more advantageous. The bolometer signal can then be plotted vertically versus the horizontal line coordinate (y-modulation). As an example we show in Fig. 48 the result of such a line scan performed on the same [111]-oriented silicon crystal as in Fig. 47. The location of this line scan within the two-dimensional phonon focusing pattern is indicated in the inset. The time-integrated phonon energy flux of [001]-oriented single-crystalline germanium (a disk of 20 mm diameter and 2 mm thickness) is presented in Fig. 49.

Most of the experiments mentioned above were performed using superconducting thin-film bolometers for phonon detection with an effective area typically ranging from 10 µm x 10 µm down to 2 µm x 2 µm. As we have pointed out in Section X, the angular resolution of the acoustic imaging method is dominated by the effective diameter of the phonon source if such highly miniaturized devices are utilized for phonon detection. High-frequency modulation of the electron beam can serve for reducing the diameter of the modulated phonon source region and thereby increasing the angular resolution (see Eq. (18)). Experimentally, modulation frequencies up to about
FIG. 49. Image of the time-integrated phonon energy flux of [001]-oriented single-crystalline germanium. Bright regions indicate high intensity of the phonon flux. Tb = 1.93 K, beam voltage = 26 kV, beam current ≈ 2 nA. (Scale bar: 20 µm.)
20 MHz have been used so far for imaging by means of the ballistic phonon signal. The effective diameter of the phonon source and, hence, the angular resolution limit of this imaging method can be estimated experimentally from line scans such as shown in Fig. 48 and a comparison with theoretical curves. The result of such a procedure (Metzger, 1987; Huebener et al., 1986) is shown in Fig. 50. Here the anisotropic intensity of the ballistic phonons has been calculated for the case of [001]-oriented germanium, assuming a phonon source of variable diameter and a point-like phonon detector. A Gaussian distribution was assumed for the intensity of the phonon source as a function of the radial distance from its center. Twice the radius at which the intensity has decreased to 1/e of the value at the center was taken as the source diameter. In Fig. 50 we show a series of theoretical curves with the diameter d of the phonon source as parameter. An experimental curve is also shown for comparison. All results given in Fig. 50 refer to the slow-transverse phonon mode and the location of the line scan indicated in the inset. By comparing the first peak on the left of the experimental curve with the theoretical results, we
FIG. 50. Intensity of the ballistic slow-transverse phonons in [001]-oriented single-crystalline germanium for the line scan shown in the inset. Left: theoretical curves obtained for the different values of the source diameter d indicated. Right: experimental curve. Crystal thickness = 2 mm, Tb = 1.96 K, detector area 10 µm x 10 µm, beam voltage = 26 kV, beam current = 1 µA (from Metzger, 1987). (Horizontal axis: scan coordinate; scale bar: 100 µm.)
find for the diameter of the phonon source d = 40 µm. We note that this value has been obtained without high-frequency beam modulation. Of course, the effective source diameter can be reduced by means of high-frequency beam modulation and by detection of the modulated signal (Metzger, 1987; Huebener and Metzger, 1985; Huebener et al., 1986).

The influence of a thin-film overlay, placed on the specimen surface irradiated with the electron beam, upon the effective diameter of the phonon source has been investigated in a series of experiments (see Metzger, 1987; Huebener and Metzger, 1985). However, it appears that more work needs to be done before this question of an optimum overlay film for improving the angular resolution limit of this imaging method is settled. The experiments described above yielding information on the phonon propagation through the bulk crystal clearly require a highly homogeneous specimen surface for irradiation with the electron beam. In this way, additional
two-dimensional structures in the ballistic phonon image possibly due to the phonon generation process at the crystal surface are avoided.

In addition to electron-beam scanning, laser-beam scanning has also been used for studying phonon focusing in single crystals. Such experiments have been performed with germanium (Northrop and Wolfe, 1980, 1979); silicon (Hurley and Wolfe, 1985); α-quartz (Koos and Wolfe, 1984); sapphire (Every et al., 1984); lithium niobate (Koos and Wolfe, 1984); diamond (Hurley et al., 1984); calcium fluoride (Hurley and Wolfe, 1985); lithium fluoride (Northrop et al., 1982); tellurium dioxide (Hurley et al., 1986); and gallium arsenide (Northrop et al., 1985). In those cases where a comparison is possible because of the material and orientation of the crystal (α-quartz, sapphire, and germanium) electron-beam scanning and laser-beam scanning yielded highly similar results. In a series of experiments, laser-beam scanning has also been employed recently for investigating dispersive effects in the phonon focusing pattern appearing at large phonon wave vectors. (See, for example, Dietsche et al., 1981; Northrop, 1982; Wolfe and Northrop, 1984; Hebboul and Wolfe, 1986; Schreiber et al., 1986.)

XII. IMAGING OF STRUCTURAL DEFECTS WITH BALLISTIC PHONONS

Structural defects in a nearly single-crystalline specimen, impeding the ballistic phonon propagation by absorption or scattering, cause a reduction of the ballistic phonon signal recorded by the detector. Therefore, this signal can also be used for imaging these structural defects. As we have seen in Section X, the phonon frequencies contained in the ballistic phonon signal are typically about 600 GHz corresponding to an acoustic wavelength of only a few nm. Hence, this acoustic imaging of crystal defects can simply be discussed in terms of geometric optics. The phonon detector just observes the “shadow” generated by such an object. Clearly, during electron-beam scanning an object will be imaged by its shadow falling on the detector, if the straight line connecting source and detector of the ballistic phonons just passes through the object. Furthermore, if two phonon detectors are operated simultaneously, three-dimensional imaging of structural defects with ballistic phonons becomes possible (three-dimensional acoustic tomography; Huebener and Metzger, 1985; Huebener et al., 1986; Huebener, 1986). This imaging principle is shown schematically in Fig. 51. By scanning the electron beam over the specimen surface, two different two-dimensional images by means of the ballistic phonon signal are obtained with the two phonon detectors placed
FIG. 51. Three-dimensional tomography based on the ballistic phonon signal. The acoustic “shadow” of an object (indicated by the open dot) is independently registered by the two phonon detectors at the bottom. (Labels in the figure: electron beam; small-area phonon detectors.)
at different locations. From these two images the three-dimensional configuration of the structural inhomogeneities can be reconstructed. Of course, with a simple scheme such as shown in Fig. 51 (two detectors only), the unambiguous reconstruction of the three-dimensional configuration of the inhomogeneities is possible only if, for a given coordinate point of the beam focus on the specimen surface, each detector signal is affected only by a single object. Therefore, the density of the objects affecting the detector signal must be sufficiently low. As an example we show in Fig. 52 the ballistic phonon image of two holes, drilled sideways into a sapphire single crystal using a laser technique (crystal thickness = 2 mm). A superconducting thin-film bolometer fabricated from
FIG. 52. Ballistic phonon image (left) and optical image (right) of two laser-drilled holes in z-oriented single-crystalline sapphire. Parameters of the ballistic phonon image: Tb = 2.065 K, beam voltage = 26 kV, beam current ≈ 0.3 µA. The background features in the ballistic phonon image are due to phonon focusing. (From Huebener et al., 1986.)
FIG. 53. Optical image (top) and ballistic phonon image (bottom) of a laser-drilled hole in single-crystalline α-quartz. Parameters of the ballistic phonon image: Tb = 1.91 K, beam voltage = 26 kV, beam current ≈ 0.3 µA.
oxygen-doped aluminum with an effective area of 10 µm x 10 µm has been used for phonon detection. Comparison with the optical image, which is also shown, indicates that nearly all details are well reproduced by the acoustic image based on ballistic phonons. Figure 53 shows the acoustic image of a hole drilled horizontally into an α-quartz single crystal (crystal thickness = 2 mm) together with the optical image. At the tip of the hole a round “shadow” can be seen in the acoustic image, which is not present in the optical image and which may be caused by mechanical strain. The fact that different images are obtained for two bolometers placed at different locations on the specimen surface is demonstrated by the results presented in Fig. 54. Here the sample is the same as that in Fig. 53, and the hole is detected separately by two bolometers placed about 200 µm apart. In both
FIG. 54. Ballistic phonon images of a laser-drilled hole in single-crystalline α-quartz. The two images were obtained with two bolometers placed about 200 µm apart. The sample and the experimental parameters are the same as in Fig. 53. (Scale bar: 100 µm.)
acoustic images the hole appears at different locations relative to the phonon focusing pattern of the background, as one would expect. Of course, from both images together with the exact locations of the two bolometers the exact placement of the hole can be obtained in all three dimensions.

The background features seen in the acoustic images of Figs. 52-54 result from phonon focusing effects. It appears that in general the anisotropy resulting from the phonon focusing effect does not present severe problems for the acoustic imaging of structural defects. On the contrary, phonon focusing represents an advantage since an angular regime with relatively high ballistic phonon intensity can be selected (Metzger et al., 1985).

The acoustic imaging method based on the ballistic phonon signal appears particularly interesting for a detailed characterization of the materials used in semiconductor microelectronics. The method looks highly promising for three-dimensional imaging of doping structures, chemical precipitates, dislocations, etc. Recently, encouraging results on imaging of oxide precipitates in silicon have been reported (see Metzger et al., 1985).
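The two-detector localization described above can be sketched as a simple triangulation in the geometric-optics picture. The code below is a minimal illustration, not the reconstruction procedure of the cited work: it assumes detectors on the bottom surface (z = 0), beam positions on the top surface (z = thickness), and straight lines of sight, and it takes the defect as the midpoint of the shortest segment between the two lines. The function names and the example defect position are hypothetical; only the 2 mm thickness and the 200 µm detector spacing are taken from the text.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def closest_point_between_lines(p1, d1, p2, d2):
    """Midpoint of the shortest segment joining the lines p1 + s*d1 and p2 + u*d2."""
    w = [x - y for x, y in zip(p1, p2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if denom <= 1e-12 * a * c:
        raise ValueError("lines of sight are (nearly) parallel")
    s = (b * e - c * d) / denom
    u = (a * e - b * d) / denom
    q1 = [p + s * k for p, k in zip(p1, d1)]
    q2 = [p + u * k for p, k in zip(p2, d2)]
    return [(x + y) / 2.0 for x, y in zip(q1, q2)]


def locate_defect(detector1, shadow1, detector2, shadow2, thickness):
    """Intersect the two source-detector lines of sight.

    detector1/detector2 are (x, y) positions on the bottom surface (z = 0);
    shadow1/shadow2 are the (x, y) beam positions on the top surface
    (z = thickness) at which the respective detector signal is reduced.
    """
    p1 = [detector1[0], detector1[1], 0.0]
    p2 = [detector2[0], detector2[1], 0.0]
    d1 = [shadow1[0] - p1[0], shadow1[1] - p1[1], thickness]
    d2 = [shadow2[0] - p2[0], shadow2[1] - p2[1], thickness]
    return closest_point_between_lines(p1, d1, p2, d2)


def shadow_position(defect, detector, thickness):
    """Forward model: beam position at which the defect blocks the line of sight."""
    scale = thickness / defect[2]
    return (detector[0] + (defect[0] - detector[0]) * scale,
            detector[1] + (defect[1] - detector[1]) * scale)


if __name__ == "__main__":
    thickness = 2.0                    # mm, crystal thickness as in the text
    det_a, det_b = (0.0, 0.0), (0.2, 0.0)   # mm, detectors 200 um apart
    true_defect = (0.3, -0.2, 1.2)     # mm, hypothetical defect position
    s_a = shadow_position(true_defect, det_a, thickness)
    s_b = shadow_position(true_defect, det_b, thickness)
    print(locate_defect(det_a, s_a, det_b, s_b, thickness))
```

Using the midpoint of the shortest connecting segment rather than an exact intersection keeps the estimate defined when measurement errors make the two lines of sight slightly skew.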
The spatial resolution limit of this three-dimensional acoustic imaging principle is found from the same considerations discussed in Sections X and XI. The accessible depth range is limited, of course, by the phonon mean free path and is expected to reach at least several mm. In all experiments reported so far the phonon detectors were evaporated directly on the specimen surface opposite to that scanned by the beam. In this way, multiple use of a single detector for different samples has been impossible. The development of a suitable highly miniaturized detector configuration, which can be removed from the sample and used again for other specimens, represents an interesting task.
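As a rough illustration of how the finite source and detector sizes translate into resolution, the following sketch applies a similar-triangle (penumbra) estimate. This estimate is an assumption introduced here for illustration, not a formula from the original text: the blur at an object plane is taken as the linear sum of the source diameter scaled by height/thickness and the detector size scaled by (thickness - height)/thickness, where height is measured from the detector plane. The example values (40 µm source, 10 µm detector, 2 mm crystal) are those quoted in Section XI.

```python
import math


def angular_resolution(source_diameter, thickness):
    """Full angle (rad) subtended by the source as seen from a point-like detector."""
    return 2.0 * math.atan(source_diameter / (2.0 * thickness))


def object_plane_resolution(height, thickness, source_diameter, detector_size):
    """Geometric blur (same length unit as the inputs) for an object at `height`
    above the detector plane, using similar triangles for source and detector."""
    if not 0.0 <= height <= thickness:
        raise ValueError("object must lie between detector and source planes")
    source_term = source_diameter * height / thickness
    detector_term = detector_size * (thickness - height) / thickness
    return source_term + detector_term


if __name__ == "__main__":
    t, d_src, d_det = 2e-3, 40e-6, 10e-6   # metres
    print(f"angular resolution: {math.degrees(angular_resolution(d_src, t)):.2f} deg")
    for h in (0.0, 1e-3, 2e-3):
        blur = object_plane_resolution(h, t, d_src, d_det)
        print(f"height {h * 1e3:.1f} mm -> blur {blur * 1e6:.1f} um")
```

In this picture an object close to the detector is resolved on the scale of the detector size, while an object close to the irradiated surface is resolved only on the scale of the source diameter.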
ACKNOWLEDGMENTS

A large part of the work on scanning electron microscopy at very low temperatures performed in the group of the author and described in this article has been supported financially by grants of the Deutsche Forschungsgemeinschaft and of the Stiftung Volkswagenwerk. The developments summarized in this article were possible only because of the outstanding contributions of the many former and present coworkers of the author: J. Bosch, John R. Clem, R. Eichele, P. W. Epperlein, L. Freytag, R. Gross, H.-U. Habermeier, R. J. Haug, E. Held, W. Klein, M. Koyanagi, J. Mannhart, K. M. Mayer, W. Metzger, J. Niemeyer, H. Pavlicek, H. Seifert, and H.-G. Wener.
REFERENCES

A. C. Anderson and J. P. Wolfe, Proceed. Fifth Internat. Conf. on Phonon Scattering in Condensed Matter, Springer Verlag, Berlin, 1986.
V. Ambegaokar and A. Baratoff, Phys. Rev. Lett. 10, 486 (1963); erratum 11, 104 (1963).
E. A. Ash, Scanned Image Microscopy, Academic Press, New York, 1980.
N. W. Ashcroft and N. D. Mermin, Solid State Physics, Holt, Rinehart and Winston, New York, 1976.
A. Barone and G. Paterno, Physics and Applications of the Josephson Effect, John Wiley, New York, 1982.
D. Bedeaux and P. Mazur, Physica 105A, 1 (1981).
H. Bethe, Ann. Phys. 5, 325 (1930).
J. Bindslev Hansen, T. F. Finnegan, and P. E. Lindelof, IEEE Trans. Magn. MAG-17, 95 (1981).
J. Bindslev Hansen and P. E. Lindelof, Rev. Mod. Phys. 56, 431 (1984).
R. D. Birkhoff, Handb. Physik, Springer Verlag, Berlin, Vol. 34, 1958.
V. L. Bonch-Bruevich, I. P. Zvyagin, and A. G. Mironov, Domain Electrical Instabilities in Semiconductors, Consultants Bureau, New York, 1975.
J. Bosch, Dissertation, University of Tuebingen, 1986.
J. Bosch, R. Gross, and R. P. Huebener, in Josephson Effects-Achievements and Trends, A. Barone, ed., World Scientific Publishing Comp., Singapore, 1986.
J. Bosch, R. Gross, R. P. Huebener, and J. Niemeyer, Appl. Phys. Lett. 47, l W (1985).
J. Bosch, R. Gross, M. Koyanagi, and R. P. Huebener, Phys. Rev. Lett. 54, 1448 (1985).
M. Buttiker and R. Landauer, in Nonlinear Phenomena at Phase Transitions and Instabilities, T. Riste, ed., Plenum Press, New York, 1982, p. 111.
J. J. Chang and C. H. Ho, Appl. Phys. Lett. 45, 182 (1984).
J. J. Chang, C. H. Ho, and D. J. Scalapino, Phys. Rev. B31, 5826 (1985).
J. J. Chang and D. J. Scalapino, Phys. Rev. B29, 2843 (1984).
J. Clarke and T. Y. Hsiang, Phys. Rev. B13, 4790 (1976).
J. R. Clem and R. P. Huebener, J. Appl. Phys. 51, 2764 (1980).
G. Dharmadurai, Phys. Stat. Sol. 62, 11 (1980).
W. Dietsche, G. A. Northrop, and J. P. Wolfe, Phys. Rev. Lett. 47, 660 (1981).
W. Ehrenberg and D. J. Gibbons, Electron Bombardment Induced Conductivity and Its Applications, Academic Press, London, 1981.
R. Eichele, L. Freytag, H. Seifert, R. P. Huebener, and J. R. Clem, J. Low Temp. Phys. 52, 449 (1983).
R. Eichele, R. P. Huebener, and H. Seifert, Z. Phys. B48, 89 (1982).
W. Eisenmenger, in Physical Acoustics, Vol. XII, W. P. Mason and R. N. Thurston, eds., Academic Press, New York, 1976, p. 79.
V. F. Elesin and Yu. V. Kopaev, Usp. Fiz. Nauk 133, 259 (1981) [Sov. Phys. Usp. 24, 116 (1981)].
P. W. Epperlein, H. Seifert, and R. P. Huebener, Phys. Lett. 92A, 146 (1982).
A. G. Every, G. L. Koos, and J. P. Wolfe, Phys. Rev. B29, 2190 (1984).
T. Fjeldly, T. Ishiguro, and C. Elbaum, Phys. Rev. B7, 1392 (1973).
L. Freytag and R. P. Huebener, J. Low Temp. Phys. 60, 377 (1985).
L. Freytag, R. P. Huebener, and H. Seifert, J. Low Temp. Phys. 60, 365 (1985).
T. H. Geballe and G. W. Hull, Phys. Rev. 110, 773 (1958).
I. Giaever, Phys. Rev. Lett. 5, 147 (1960).
I. Giaever, Phys. Rev. Lett. 5, 464 (1960).
J. M. Gordon, A. M. Goldman, J. Maps, D. Costello, R. Tiberio, and B. Whitehead, Phys. Rev. Lett. 56, 2280 (1986).
K. E. Gray, R. T. Kampwirth, J. F. Zasadzinski, and S. P. Ducharme, J. Phys. F: Met. Phys. 13, 405 (1983).
R. Gross and M. Koyanagi, J. Low Temp. Phys. 60, 277 (1985).
R. Gross, M. Koyanagi, H. Seifert, and R. P. Huebener, Phys. Lett. 109A, 298 (1985).
R. Gross, M. Koyanagi, H. Seifert, and R. P. Huebener, Proceedings LT17, U. Eckern, A. Schmid, W. Weber, and H. Wühl, eds., North Holland, Amsterdam, Vol. I, p. 431 (1984).
R. Gross, D. B. Schmid, and R. P. Huebener, J. Low Temp. Phys. 62, 245 (1986).
S. E. Hebboul and J. P. Wolfe, Proceedings 18th Internat. Conf. on the Physics of Semiconductors, Stockholm, 1986.
G. A. Held, C. Jeffries, and E. E. Haller, Phys. Rev. Lett. 52, 1037 (1984); G. A. Held and C. Jeffries, Phys. Rev. Lett. 56, 1183 (1986).
R. P. Huebener, German Patent No. DE 3526241 A1 from 27.02.1986.
R. P. Huebener, J. Appl. Phys. 46, 4982 (1975).
R. P. Huebener, Magnetic Flux Structures in Superconductors, Springer Verlag, Berlin, 1979.
R. P. Huebener, Rep. Prog. Phys. 47, 175 (1984).
R. P. Huebener, Oyo Buturi, Japan. Soc. of Appl. Phys. 54, 660 (1985).
R. P. Huebener, E. Held, W. Klein, and W. Metzger, Proceedings Fifth Internat. Conf. on Phonon Scattering in Condensed Matter, A. C. Anderson and J. P. Wolfe, eds., Springer Verlag, Berlin, 1986.
R. P. Huebener, J. Mannhart, and J. Parisi, German Patent Application P 3.5412Y0.9, 22. Nov. 1985.
R. P. Huebener and W. Metzger, Scanning Electron Microscopy, 1985, II, p. 617.
SCANNING ELECTRON MICROSCOPY AT VERY LOW TEMPERATURES
77
R. P. Huebener and H. Seifert, Scanning Electron Microscopy, 1984,111, p. 1053. D. C. Hurley, A. G . Every, and J. P. Wolfe, J . Phys. C: SolidStare Phys. 17, 3157 (1984). D. C. Hurley and J. P. W o k , Phys. Rev. B32, 2568 (1985). D. C. Hurley, J. P. Wolfe. and K. A. McCarthy, Phys. Rro. B33,4189(1986). D. Jiger, H. Baumann, and R. Syrnanczyk, Phys. Left. 117,141 (1986). B. D. Josephson, Phyx Lett. I, 251 (1962). B. S. Kerner and V. F. Sinkevich, Pis’ma Zh. Eksp. Tear. Fiz. 36, 359 (1982) [JETP Letters 36, 437 (1982)l. G. L. Koos and J. P. Wolfe, f h y s . Rev. B29.6015 (1984). G. L. Koos and J. P. Wolfe, fh.ps. Reo. B30, 3470 (1984). R. Landauer, Phys. Today 31, 23 (1978). C. J. Lobb, Physira 126B. 319 (1984). J. Mannhart and R. P. Huebener, J . Appl. Phys. 60,1829 (1986). J. Mannhart, R. P. Huebener, J. Parisi, and J. Peinke, Solid State Comnt. 58, 323 (1986). J. Matisoo, IBM J . Rrs. Deu. 24, 113 (1980). J. Matisoo, Sci.Am. 242, 38 (1980). K. M. Mayer, Thesis, University of Tubingen, 1986. K. M. Mayer, R. Gross, J. Parisi, J. Peinke, and R. P. Huebener, Solid State Comm. 63, 55 (1987). W. Metzger. Dissertation, University of Tubingen, 1987 (unpublished). W. Metzger, R. P. Huebener, R. J. Haug,and H.-U. Haberrneier, Appl. Phys. Lett. 47, 1051 (1985). W. Miinch, Phys. Stat. Sol. 36, 9 (1969). J. Niemeyer. J. H. Hinken, and R. L. Kautz, A p p / . Phys. Lett. 45, 478 (1984). J. Niemeyer, J. H. Hinken, and W. Meier, IEEE Trans. Instr. Meas. 1M-33, 31 1 (1984). G. A. Northrop, Phys. Reu. B26,903 (1982). G . A. Northrop, E. J. Cotts, A. C. Anderson, and J. P. Wolfe, Phqw. Rev. Lett. 49, 54 (1982). G. A. Northrop, S. E. Hebboul, and J. P. Wolfe, Phq).s.Rev. Lett. 5 5 9 5 (1985). G . A. Northrop and J. P. W o k , Phys. Rev. B22.6196 (1980). G . A. Northrop and J. P. Wolfe, Phy.s. Rev. Lett. 43, 1424 (1979). W. C. Oatley, The Scanning Electron Mic,roscope. Cambridge University Press, Cambridge, 1972. C. S. Owen and D. J. Scalapino, Phys. Reu. 164,538 (1967). B. 
Pannetier, J. Chaussy, R. Rarnmal, and J. C. Villegier, Phys. Rev. Lett. 53, 1845 (1984). H. Pavlicek, L. Freytag, H. Seifert, and R. P. Huebener, J. LOWTemp. Phys. 56, 237 (1984). J. Peinke, A. Miihlbach, R. P. Huebener, and J. Parisi, Phys. Left. 108A. 407 (1985). L. Reimer, Scunning Electron Microscopy, Springer Verlag, Berlin, 1985. K. F. Renk, in Festkiirperprohlenze, Vol. XII, ed. by 0. Madelung, Vieweg, Braunschweig, 1972, p. 107. F. Riisch and 0. Weis, Z . Phys. B24, 101 (1976). C. Schmidt and E. Urnlauf, J . Low Temp. Phys. 22, 597 (1976). M. Schreiber, M. Fieseler, A. Mazur, J. Pollmann, B. Stock, and R. G. Ulbrich, Proceedings 18rh Inrernut. Con/: on the Physics of Semiconductors, Stockholm, 1986. H.-J. Schulze and K. Keck, Appl. Phys. AM, 243 (1984). H. Seifert, Cryogenic.v 22, 657 (1982). H. Seifert. R . P. Huebener, and P. W. Epperlein, Ph.p.s. Left.95A. 326 (1983). H. Seifert, R. P. Huebener, and P. W. Epperlein, Phys. Lett. 97A, 421 (1983). K. Seeger, Semiconductor Physics, Springer Verlag. Berlin, 1982. W. J. Skocpol, in Noneyuilihrium Supercottduciioir?,, Phonons, and Kapitza Boundaries, K. E. Gray, ed., Plenum Press, New York, 1982, p. 559. W. J. Skocpol, M. R. Beasley, and M. Tinkham, J . Appl. f h y s . 45,4054 (1974). W. J. Skocpol, M. R. Beasley, and M. Tinkham, J. Low Temp. Phys. 16, 145 (1974).
78
R. P. HUEBENER
L. Solymar, Superconductive Tunneling and Applications, Chapman and Hall, London, 1972. B. Taylor, H. J. Maris, and C. Elbaum, Phys. Rev. B3, 1462 (1971). B. Taylor, H. J. Maris, and C. Elbaum, Phys. Rev. Lett. 23,416 (1969). S . W. Teitsworth, R. M. Westervelt, and E. E. Haller, Phys. Rev. Lett. 51, 825 (1983); S. W. Teitsworth and R. M. Westervelt, Phys. Rev. Lett. 56, 516 (1986). M. Tinkham, Introduction to Superconductivity, McGraw-Hill, New York, 1975. A. M. S. Tremblay, Nonequifibrium Superconductivity, Phonons, and Kapitza Boundaries, K. E. Gray, ed., Plenum Press, New York (1981),p. 289,309. V. G. Volotskaya, 1. M. Dmitrenko, L. E. Musienko, and A. G . Sivakov, Fiz. Nizk. Temp. 7 , 3 8 3 (1981) [Sov. J. Low Temp. Phys. 7 , 188 (1981)l; V. G. Volotskaya, I. M. Dmitrenko, and A. G. Sivakov, Fiz. Nizk. Temp. 10, 347 (1984) [Sov. J. Low Temp. Phys. 10, 179 (1984)l. J. P. Wolfe and G. A. Northrop, in Proceed. Fourth Internat. Con5 on Phonon Scattering in Condensed Matter, ed., by W. Eisenmenger, K. Lassmann, and S. Dottinger, Springer, New York, 1984, p. 100.
.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 70

Robust Image Models and Their Applications
R. L. KASHYAP AND KIE-BUM EOM*
School of Electrical Engineering
Purdue University
West Lafayette, Indiana
Abstract
I. Introduction and Overview
   A. Robust Statistical Procedures
   B. Image Models
   C. Applications
      1. Image Restoration
      2. Boundary Detection
II. AR and ARMA Models
   A. Introduction
   B. 2D AR Processes
   C. Simultaneous and Recursive AR Models
   D. Generalized ARMA Models
   E. Generative Interpretation of Models
   F. Approximations to the Image Models
   G. Summary
III. Robust Estimation in Causal Autoregressive Models
   A. Introduction
   B. Causal Autoregressive Model
   C. Robust Parameter Estimation
      1. Perfect Observation Case
      2. Noisy Observation Case
   D. Experimental Results
   E. Discussions and Conclusions
IV. Image Restoration with Robust Image Modelling Techniques
   A. Introduction
   B. Previous Robust Filters
      1. The L Filter
      2. The M Filter
   C. Intensity Representation for Restoration
   D. Image Restoration Algorithm
   E. Experimental Results
   F. Discussions and Conclusions
V. Composite Edge Detection
   A. Introduction
   B. Edge Hypothesis Generation (Algorithm 1)
   C. Confirming the Presence of Edges (Algorithm 2)
      1. Confirming a Texture Edge
      2. Confirming an Intensity Edge
   D. Experimental Results
   E. Discussions and Conclusions
VI. Summary and Suggestions
References

Partially supported by the Office of Naval Research under the grant N00014-85K-0611 and by the National Science Foundation under the grant IST 8405052.
* Currently with the Department of Electrical and Computer Engineering, Syracuse University, Syracuse, New York 13244-1240.

Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014670-3
ABSTRACT

Various types of image models for representing images are considered, and robust image models are developed. The robust methods in image models are also applied to some important image processing problems, such as image segmentation by texture property and image restoration in the presence of impulse noise.

Robust estimation algorithms for two different outlier processes in causal autoregressive models are developed. These algorithms are based on robust M-estimators. Theoretical properties of the robust estimation algorithms are presented. The robustness of the estimators is also shown experimentally.

The robust estimation algorithm for causal autoregressive models is applied to the image restoration problem. Traditionally, median or α-trimmed mean filters are used, but these methods result in blurred images. The restoration method based on the robust image model cleans out impulse noise without blurring the image. Experimental results show that the quality of images restored by the model-based method is much superior to that of images restored by other traditional methods.

We considered the detection of both intensity edges and texture edges. An intensity edge is defined by an abrupt change of intensities, and a texture edge is defined as a boundary between different textures. Traditional edge detection algorithms cannot effectively detect texture edges. We developed an image model based edge detection algorithm which can detect both intensity and texture edges. The validity of the model-based method is demonstrated by comparison with the results of conventional methods.
I. INTRODUCTION AND OVERVIEW

In the past decade, there has been remarkable progress in research on statistical image models and their applications. Statistical image models (often called random field models or spatial interaction models) represent the image intensity of a given picture by a small number of parameters. There are many applications of image models in image processing and analysis. For instance, they can be used for image synthesis (Kashyap, 1984b; Cross and Jain, 1983), image restoration (Chellappa and Kashyap, 1982; Geman and Geman, 1984), image coding (Delp et al., 1979), texture boundary detection (Kashyap and Eom, 1985a), and texture analysis (Kashyap and Khotanzad, 1984a).

To apply image models to such image processing tasks, we need to estimate the parameters of the image models. There are many different estimation algorithms for different image models, but most of these methods are based on the assumption of a Gaussian image intensity distribution. However, the actual distribution of image intensity deviates from the Gaussian assumption, and traditional estimation methods are very sensitive to minor deviations from it. During the past decades, many estimators which are robust to deviations from the Gaussian assumption have been proposed (Huber, 1981), but they are rarely applied to image modelling. In this study, robust estimation procedures for several different image models are developed and applied to some important image processing problems, such as image segmentation and image restoration.

A. Robust Statistical Procedures
There has been considerable interest in robust methods in statistics in recent years. This is because most statistical inference methods are based on rather restrictive assumptions about the observations and models, such as independence of the observations, the distribution of the observations, etc. However, these assumptions do not always hold, and many statistical procedures are very sensitive to minor deviations from the given assumptions. For example, it is well known that least squares methods are excessively sensitive to a small number of outliers.

The term robust was introduced by G. E. P. Box in 1953. A procedure is called robust if it is reasonably good (optimal or nearly optimal) when the assumptions hold and is not sensitive to small deviations from them. Primarily, robustness implies distribution robustness, i.e., robustness against small deviations from the assumed distribution (usually Gaussian). Resistance to outliers is considered equivalent to distribution robustness (Huber, 1981).

There are several types of robust procedures: M-estimators, L-estimators, and R-estimators. Among these, M-estimators have an advantage over the other procedures because they can be extended to the parameter estimation problems in image models. In contrast, both L-estimators and R-estimators are difficult to generalize beyond one-parameter location or scale problems. The robust M-estimators are applied to the parameter estimation problem of causal autoregressive models. Two different outlier processes are considered, and iterative robust estimation algorithms for both of the outlier processes are developed. Theoretical properties of the proposed robust estimators are investigated.
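To make the sensitivity of least squares to outliers concrete, the following minimal sketch compares the sample mean (the least squares location estimate) with a Huber-type M-estimate of location on contaminated data. It is only an illustration of the general idea of an M-estimator, not the estimation algorithm developed in this chapter; the function name `huber_location`, the tuning constant `c = 1.345`, the MAD-based scale, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def huber_location(x, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location by iteratively reweighted averaging.

    c is an assumed tuning constant; the scale is held fixed at the
    normalized median absolute deviation (MAD).
    """
    x = np.asarray(x, dtype=float)
    mu = np.median(x)                                  # robust starting point
    scale = 1.4826 * np.median(np.abs(x - mu))         # normalized MAD
    if scale == 0.0:
        return mu
    for _ in range(max_iter):
        r = (x - mu) / scale                           # standardized residuals
        # Huber weights: 1 for small residuals, c/|r| for large ones
        w = np.where(np.abs(r) <= c, 1.0, c / np.maximum(np.abs(r), 1e-12))
        mu_new = np.sum(w * x) / np.sum(w)
        converged = abs(mu_new - mu) < tol
        mu = mu_new
        if converged:
            break
    return mu

rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=1.0, size=200)       # Gaussian observations
data[:10] = 100.0                                      # 5% gross outliers

mean_est = data.mean()                                 # least squares estimate
huber_est = huber_location(data)                       # M-estimate
print("mean:", mean_est, "Huber:", huber_est)
```

With only 5% of the samples replaced by outliers, the sample mean is pulled far from the true location, whereas the M-estimate stays close to it because the outliers receive small weights.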
B. Image Models

Image models characterize the image intensity surface with a small number of parameters. Image models can be divided into two groups, namely, descriptive and generative models. A descriptive model for an image summarizes the intensity distribution into a finite number of statistics. An example is the co-occurrence matrix (Haralick, 1973) used in texture analysis. The generative model, on the other hand, allows one to synthesize an image obeying the given model by using the model description and a set of random numbers. We will restrict ourselves to generative models since they can be used for a wide variety of applications.

We can further divide the generative models into two large classes. In the first class, the observed intensity function y(i, j) is assumed to be the sum of a deterministic function (usually a polynomial or sinusoid) and an additive noise. In the second class, the image intensity function is generated as the output of a transfer function whose input is a sequence of independent random variables. The transfer function represents the known structural information on the image surface; the independent random sequence accounts for the unknown part. Note that in the second class the neighboring pixels are highly correlated, unlike in the first class, and the transfer function accounts for the covariance.

C. Applications
Image restoration and image segmentation are two important branches of image processing. Image restoration is needed to recover the original image from an image corrupted by noise (including impulse noise), and image segmentation, especially edge detection or boundary detection, is involved in most high level image processing problems. Robust image models are developed and applied to these image processing problems in this study.

1. Image Restoration
An image may be subject to noise and interference from many different sources, and image restoration is used to remove noise from the given image. Traditionally, the noise distribution is assumed to be Gaussian, and many different restoration algorithms based on the Gaussian assumption have been introduced (Pratt, 1978; Rosenfeld and Kak, 1982). Recently, image models have been used in image restoration applications. For example, Chellappa and Kashyap (1982) used the simultaneous autoregressive model and the conditional Markov model, Wu (1985) used the nonsymmetric half plane autoregressive model and a two-dimensional Kalman filtering approach, and Geman and Geman (1984) used a family of Markov models. Even though these examples show some successful applications of image models to the image restoration problem, all of these methods are designed to remove Gaussian noise and are not very effective in removing impulse noise (Pratt, 1978).

Traditionally, the median filter and its generalizations (Kassam and Lee, 1985) are used to remove impulse noise (also called salt and pepper noise) from the noisy image. These methods are simple applications of robust location parameter estimators, such as the median or the α-trimmed mean, where the image intensity is assumed constant over a small window. However, the images restored by these methods are blurred (Pratt, 1978).

Robust image model approaches are applied to the image restoration problem in our study. The original image intensity is assumed to follow an image model, and the parameters are estimated by a robust estimation algorithm. The image is restored by applying a data cleaning algorithm with the robustly estimated parameters. The robust model-based method performs better than the other traditional methods in the experiment.
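The traditional window-based filters mentioned above can be sketched in a few lines. The example below applies the mean, the median, and an α-trimmed mean over a 3 × 3 window to a synthetic step-edge image corrupted by impulse noise. It illustrates only these classical location estimators, not the model-based restoration method of this chapter; the helper names `window_filter` and `trimmed_mean`, the window size, the trimming fraction, and the test image are assumptions made for the illustration.

```python
import numpy as np

def window_filter(img, estimator, size=3):
    """Apply a location estimator over each size x size window (edge-replicated)."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = estimator(padded[i:i + size, j:j + size].ravel())
    return out

def trimmed_mean(values, alpha=0.2):
    """Alpha-trimmed mean: drop the alpha fraction of smallest and largest samples."""
    v = np.sort(values)
    k = int(alpha * v.size)
    return v[k:v.size - k].mean()

def mse(a, b):
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(1)
# Test image: a vertical step edge (left half 50, right half 200).
original = np.full((32, 32), 50.0)
original[:, 16:] = 200.0

# Impulse ("salt and pepper") noise on about 10% of the pixels.
noisy = original.copy()
mask = rng.random(original.shape) < 0.10
noisy[mask] = rng.choice([0.0, 255.0], size=int(mask.sum()))

mean_out = window_filter(noisy, np.mean)          # least squares location estimate
median_out = window_filter(noisy, np.median)      # robust: median
trimmed_out = window_filter(noisy, trimmed_mean)  # robust: alpha-trimmed mean

for name, out in [("noisy", noisy), ("mean", mean_out),
                  ("median", median_out), ("trimmed", trimmed_out)]:
    print(name, mse(out, original))
```

All three filters assume a constant intensity inside the window, which is why they tend to degrade detail near edges and fine structure; the robust ones merely resist the impulses better than the plain mean.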
2. Boundary Detection

Edge detection or boundary detection is a fundamental step in scene analysis. Traditionally, an edge is defined as a boundary between two uniform regions, where the intensity of each region is uniform and the intensity difference between the two regions is large. Most edge detection algorithms are based on the gradient operator or the Laplacian operator (Davis, 1977), which are sensitive to changes of intensity. Recently, some model based edge detection approaches have been proposed (Haralick, 1984; Zhou and Chellappa, 1986), but they are also derivative-based methods, using decision rules with estimated model parameters.

For higher level processing, the edges should be able to distinguish the shape of each object from the background of an image. However, intensity edges are sometimes not satisfactory for representing an object and distinguishing it from the background, because the intensity of an object or a background is not uniform. For instance, a grass lawn in an outdoor scene is homogeneous in its texture property, but it has many intensity edges within the region. This example suggests the necessity of detecting boundaries (or edges) by texture property.

Image models are already used in synthesizing textures which are very similar to real textures, and the estimated parameters obtained by fitting an image model to the given image can be used as texture features. The texture features derived from an image model or from other methods can be used to segment an image by a statistical classification method, if the number and types of textures in the given image are known in advance. However, this prior information is generally not available.

A composite edge detection algorithm is developed in this study. The composite edge detection algorithm combines the model-based texture boundary detection method and a conventional intensity edge detection method. This algorithm detects all potential edges by a directional derivatives method, and the final edges are then confirmed as either texture edges or intensity edges. This algorithm is also compared with other conventional edge detection methods in the experiment, where it performs better than conventional methods which detect only intensity edges.
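The grass lawn remark can be illustrated with a small synthetic experiment. In the sketch below, a flat region and a finely textured region with the same mean intensity are passed through a simple central-difference gradient operator; the textured region is flagged as "edge" almost everywhere even though it is homogeneous as a texture. The image, the threshold of 20, and the function name `gradient_magnitude` are assumptions made for the illustration; this is not the composite algorithm of Section V.

```python
import numpy as np

def gradient_magnitude(img):
    """Central-difference gradient magnitude (a simple stand-in for a gradient operator)."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

n = 32
img = np.full((n, n), 100.0)                       # left half: flat region
# Right half: a fine texture alternating between 60 and 140 in 2 x 2 blocks,
# so that its mean intensity equals that of the left half.
rows, cols = np.indices((n, n // 2))
img[:, n // 2:] = np.where((rows // 2 + cols // 2) % 2 == 0, 60.0, 140.0)

edges = gradient_magnitude(img) > 20.0             # assumed threshold

inside_flat = edges[:, : n // 2 - 2].mean()        # fraction of edge pixels, flat region
inside_texture = edges[:, n // 2 + 2:].mean()      # fraction of edge pixels, textured region
print("flat:", inside_flat, "texture:", inside_texture)
```

An intensity-only detector therefore cannot delineate the textured region by its outline; some texture feature, such as fitted model parameters, is needed to recognize it as a single region.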
II. AR AND ARMA MODELS

A. Introduction
It is claimed traditionally that a complete stochastic description of an $M \times M$ array of pixel intensities $y(s)$ is given by the joint probability density of the $M^2$ intensity variables $y(\cdot)$. Even writing down the expression is horrendous, considering that the typical value of $M$ is 128 or 256 or 512. As a consequence, it was often conjectured that probabilistic models may not be of much use in solving interesting problems in image processing.

The purpose of this chapter is to draw attention to the existence of a large class of image models which can be characterized completely in terms of the second order properties of the image sequence, i.e., the correlations $E[y(s)y(s + r)]$ or the corresponding spectral density. Consequently, these models are relatively easy to analyze. It must be emphasized that the joint probability density of all the intensities is not assumed to be Gaussian.

In the beginning, we will focus our attention on the two-dimensional generalization of the autoregressive (AR) models and autoregressive moving average (ARMA) models popular in time series analysis. Basically, all these two dimensional models can handle rational spectral densities, i.e., ratios of two linear combinations of sinusoids in the two frequency variables, just as in the one dimensional case. However, there are many differences between the 1D and 2D cases which will be highlighted in this chapter. For example, in the 1D case, the correlation function is an exponentially decaying function of the lag variable. But in the 2D case, one rarely encounters the exponential correlation function. Similarly, in the 1D case, the driving input random sequence is both statistically independent and uncorrelated with the dependent variables in the past. In the general 2D case, the input sequence cannot possess both these properties simultaneously.

Secondly, we will consider the various possible ways of defining the weak Markov property in the 2D case. By weak, we mean that the corresponding Markov property can be described completely in terms of second order properties like the correlation or spectral density. The traditional Markov property defined in terms of the probability densities is termed the strong Markov property. A sequence cannot be strong Markov without being weak Markov. We will characterize the various subclasses of 2D AR and ARMA models which possess various types of weak Markov property.

We recall that the general AR or ARMA models mentioned above are not recursive, in general. Still, these models are generative in principle, i.e., it is possible to give an algorithm which generates a sequence obeying a prespecified model. However, the amount of computation involved may be considerable. We will consider modifications or approximations of the AR or ARMA models so that it is relatively easy to synthesize an image obeying a given model.

Preliminaries. We will consider a covariance stationary array of real numbers $\{y(i, j),\ -\infty < i, j < \infty\}$.
Covariance stationarity means that the covariance between $y(s)$ and $y(s + (i, j))$ is a function of $i$ and $j$ alone and not a function of $s$, and hence is denoted by $R(i, j)$:

$$E[(y(s) - \bar{y})(y(s + (i, j)) - \bar{y})] = R(i, j)$$

where $\bar{y} = E[y(s)]$. A covariance stationary random field in which $\bar{y}$ is a constant is called weak stationary. A random field $\{y(s)\}$ is said to be isotropic if $R(i, j) = R(|i|, |j|) = R(j, i)$. For a covariance stationary random field (RF), we can define a spectral density:

$$S(\lambda) = \sum_{s} R(s) \exp[\sqrt{-1}\, s \cdot \lambda] = \sum_{s_1 = -\infty}^{\infty} \sum_{s_2 = -\infty}^{\infty} R(s_1, s_2) \exp[\sqrt{-1}\,(s_1 \lambda_1 + s_2 \lambda_2)]$$
Another important second order measure of an RF model is the variogram

$$V_r(s) = E[(y(s) - y(s + r))^2],$$

which is a function of $r$ only if $y(\cdot)$ is weak stationary. The covariance function $R(\cdot)$ can be recovered from $S(\lambda)$ by the usual Fourier integral

$$R(r) = \int S(\lambda) \exp[\sqrt{-1}\, \lambda \cdot r]\, |d\lambda|, \qquad |d\lambda| = |d\lambda_1|\,|d\lambda_2|, \quad \lambda = (\lambda_1, \lambda_2), \quad r = (i, j).$$

Another important concept is the neighbor set. A neighbor set is a set of grid points whose coordinates are near 0, but 0 itself is not a member of a neighbor set. $N$ is said to be symmetric if $r \in N$ implies $-r \in N$. Popular neighbor sets are the ones having the 4 nearest neighbors and the 8 nearest neighbors.
      x                  x  x  x
   x  •  x               x  •  x
      x                  x  x  x

 4-neighbor N         8-neighbor N
A neighbor set $N$ is said to be semicausal if the row coordinates (or the column coordinates) of all its members have the same sign. Some examples of semicausal neighbor sets are given below.

[Figure: examples of semicausal neighbor sets, where • stands for the origin.]
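The definitions of symmetric and semicausal neighbor sets translate directly into code. The following sketch represents a neighbor set as a Python set of (row, column) offsets and checks both properties. The function names and the particular semicausal example are hypothetical, and treating a zero coordinate as compatible with either sign is an assumption, since the text does not say how zero coordinates are counted.

```python
def is_symmetric(neighbors):
    """True if r in N implies -r in N."""
    return all((-i, -j) in neighbors for (i, j) in neighbors)

def _same_sign(values):
    # Assumption: a zero coordinate is compatible with either sign.
    return all(v >= 0 for v in values) or all(v <= 0 for v in values)

def is_semicausal(neighbors):
    """True if all row coordinates, or all column coordinates, share one sign."""
    rows = [i for i, _ in neighbors]
    cols = [j for _, j in neighbors]
    return _same_sign(rows) or _same_sign(cols)

# The two popular symmetric neighbor sets (the origin is never a member).
four_neighbor = {(0, 1), (0, -1), (1, 0), (-1, 0)}
eight_neighbor = {(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)} - {(0, 0)}

# A hypothetical semicausal set: members lie in the current row or the row above.
semicausal_example = {(0, -1), (0, 1), (-1, -1), (-1, 0), (-1, 1)}

for name, nbrs in [("4-neighbor", four_neighbor),
                   ("8-neighbor", eight_neighbor),
                   ("semicausal example", semicausal_example)]:
    print(name, "symmetric:", is_symmetric(nbrs), "semicausal:", is_semicausal(nbrs))
```

Under these definitions the 4-neighbor and 8-neighbor sets are symmetric but not semicausal, while the example set is semicausal but not symmetric.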
B. 2D AR Processes

Consider a real valued stationary process possessing a spectral density of the form

$$S(\lambda) = \frac{1}{[\text{positive linear combination of sinusoids in } \lambda_1, \lambda_2]}$$
Our first step is to enquire whether $y(\cdot)$ can be expressed as the output of a system characterized by a two dimensional rational transfer function of finite order, the input being some elementary stochastic process, say $v(\cdot)$. Towards this end, consider the system described by the difference equation

$$y(s) = \sum_{r \in N} \theta_r\, y(s + r) + \sqrt{\rho}\, v(s) \qquad (1)$$

where $v(\cdot)$ is the elementary input and $N$, a so called neighbor set, is a set of grid points possessing symmetry, i.e., if $s \in N$, then $-s \in N$. No neighbor set can have the origin 0 as a member. However, not all neighbor sets are symmetric. Define the two dimensional polynomial $A(z_1, z_2)$ in terms of the coefficients $\theta$:

$$A(z_1, z_2) = A(z) = 1 - \sum_{(i, j) \in N} \theta_{i,j}\, z_1^i z_2^j$$

The coefficients $\{\theta_r\}$ in (1) obey the following condition defined in terms of the polynomial $A$:

$$A(z_1, z_2) > 0 \quad \text{for all } |z_1| = 1 \text{ and } |z_2| = 1 \qquad (2)$$

In addition, the input $v(\cdot)$ in (1) is assumed to have zero mean and be orthogonal to all $y(\cdot)$, i.e.,

$$E[v(s)\, y(s + r)] = 0 \quad \text{for all } r \neq 0 \qquad (3)$$

We also assume $E[v^2(s)] = 1$. The parameter $\rho$ in (1) can specify the relative power of the input term. We can also rewrite Eq. (1) compactly in terms of the polynomial $A$:

$$A(z)\, y(s) = \sqrt{\rho}\, v(s) \qquad (4)$$

In defining (4), the $z_i$ are interpreted as the unit lead operators in the two directions. Equation (3) defines the process $v(\cdot)$ only indirectly. The precise structure of the process $v(\cdot)$ is not obvious. We will derive later an expression for the spectral density of $v(\cdot)$ using (1)-(3). Equation (3) can be thought of as defining a $v(\cdot)$ process given a $y(\cdot)$ process. It is not obvious how to generate a $y(\cdot)$ and a $v(\cdot)$ sequence obeying
simultaneously (1)-(3). We will later show constructively that there do exist infinite sequences $y(\cdot)$ and $v(\cdot)$ obeying (1)-(3).

Structure of the $v(\cdot)$ Process. The following theorem gives the spectral densities of the processes $y(\cdot)$ and $v(\cdot)$ which obey (1)-(3).

Theorem 1: The spectral densities of $y$ and $v$ obeying (1)-(3) are given below:

$$S_{yy}(\lambda) = \rho / A_1(\lambda) \qquad (5)$$

$$S_{vv}(\lambda) = A_1(\lambda) \qquad (6)$$

where $A_1(\lambda) = A(z_1, z_2)$, $z_i = \exp[\sqrt{-1}\, \lambda_i]$.

Proof: We will obtain a difference equation for the covariance function of $y$. Note that $E(y(\cdot)) = 0$. Let $R(t) = E[y(s)y(s + t)]$. Multiply (1) by $v(s)$, take expectation on both sides, and use (3):

$$E[y(s)v(s)] = \sqrt{\rho}\, E[v^2(s)] = \sqrt{\rho} \qquad (7)$$

Next, multiply (1) by $y(s + t)$ on both sides and take expectation:

$$R(t) = \sum_{r \in N} \theta_r\, R(t - r) + \rho\, \delta_{t,0} \qquad (8)$$

by using (3) and (7), where

$$\delta_{t,0} = 1 \text{ if } t = 0, \qquad \delta_{t,0} = 0 \text{ otherwise.}$$

Take the Fourier transform of (8):

$$S_{yy}(\lambda) = \sum_{r \in N} \theta_r \exp[\sqrt{-1}\, \lambda \cdot r]\, S_{yy}(\lambda) + \rho,$$

i.e.,

$$A_1(\lambda)\, S_{yy}(\lambda) = \rho,$$

or $S_{yy}(\lambda) = \rho / A_1(\lambda)$. To prove (6), take the spectral density of both sides of (4):

$$\rho\, S_{vv}(\lambda) = \| A(z_1 = \exp(\sqrt{-1}\, \lambda_1),\ z_2 = \exp(\sqrt{-1}\, \lambda_2)) \|^2\, S_{yy}(\lambda) = \| A_1(\lambda) \|^2\, S_{yy}(\lambda)$$

Using (5) for $S_{yy}(\lambda)$, the above equation yields the required expression for $S_{vv}(\lambda)$ in (6).

The proof is given in some detail because it gives the difference equation for $R_y(r)$. In addition, the above proof indicates the existence of a process $y(\cdot)$ obeying (1)-(3) by demonstrating its spectral density. The $v(\cdot)$ process is an analog of a one-dimensional moving average process. Its covariance function is

$$E[v(s)v(s + r)] = \begin{cases} -\theta_r & \text{if } r \in N \\ 1 & \text{if } r = 0 \\ 0 & \text{elsewhere} \end{cases} \qquad (9)$$
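The covariance structure in (9) and the difference equation (8) can be checked numerically. The sketch below samples $A_1(\lambda)$ on an $M \times M$ frequency grid and inverts the sampled spectral densities with a 2D FFT, which gives the covariances of the corresponding process on an $M \times M$ torus. The parameter values, the grid size, and the torus (periodic) setting are assumptions made for this illustration; they are not taken from the text.

```python
import numpy as np

M = 16                                    # frequency samples per axis (assumed)
# Hypothetical symmetric coefficients theta_r = theta_{-r} on the 4-neighbor set.
theta = {(0, 1): 0.15, (0, -1): 0.15, (1, 0): 0.05, (-1, 0): 0.05}
rho = 1.0

lam = 2 * np.pi * np.arange(M) / M
l1, l2 = np.meshgrid(lam, lam, indexing="ij")

# A_1(lambda) = 1 - sum_r theta_r exp(sqrt(-1) lambda . r); real since theta_r = theta_{-r}
A1 = np.ones((M, M), dtype=complex)
for (i, j), th in theta.items():
    A1 -= th * np.exp(1j * (i * l1 + j * l2))
A1 = A1.real
assert A1.min() > 0                       # positivity condition (2)

S_yy = rho / A1                           # Eq. (5)
S_vv = A1                                 # Eq. (6)

# Covariances on the torus: R(r) = (1/M^2) sum_k S(lambda_k) exp(sqrt(-1) lambda_k . r)
R_v = np.fft.ifft2(S_vv).real
R_y = np.fft.ifft2(S_yy).real

print("R_v(0,0) =", R_v[0, 0])            # variance of v
print("R_v(0,1) =", R_v[0, 1])            # equals -theta_(0,1)
print("R_v(1,1) =", R_v[1, 1])            # zero outside N and the origin

# Difference equation (8) at t = 0: R(0) - sum_r theta_r R(-r) = rho
lhs = R_y[0, 0] - sum(th * R_y[(-i) % M, (-j) % M] for (i, j), th in theta.items())
print("Eq. (8) at t = 0:", lhs)
```

The inverse transform of $A_1$ has only finitely many nonzero lags, which is the sense in which $v(\cdot)$ resembles a moving average process even though it is not generated as a finite linear combination of independent variables.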
However, one important distinction between the 1D and 2D cases lies in the fact that the $v(\cdot)$ process cannot have a 2D version of the moving average representation, i.e., it cannot be represented as a finite linear combination of independent random variables. The reason is that the symmetric polynomial $A(z_1, z_2)$ cannot, in general, be factored, i.e., expressed as a product of two finite polynomials.

Converse of Theorem 1. This section started with the assumption (3) on $v(\cdot)$. What would be the structure of the process $y(\cdot)$ if $v(\cdot)$ is assumed to be white? We will prove the converse of Theorem 1 and show that a process with inverse sinusoidal spectral density does not in general have any representation other than (1). The exceptions will be handled later.

Theorem 2: Consider a zero mean stationary process $y(\cdot)$ having a spectral density as shown below:

$$S_{yy}(\lambda) = \rho / [\text{a positive linear combination of sinusoids in } \lambda_1, \lambda_2]$$

i.e.,

$$S_{yy}(\lambda) = \rho / A(z_1, z_2), \qquad z_i = \exp(\sqrt{-1}\, \lambda_i) \qquad (10)$$

and

$$A(z_1, z_2) = 1 - \sum_{r \in N} \theta_r\, z^r, \qquad z^r = z_1^i z_2^j \text{ for } r = (i, j),$$

where $N$ is symmetric, $\theta_r = \theta_{-r}$, and $A(\cdot)$ obeys (2). Then define $v(\cdot)$ as:

$$v(s) = A(z)\, y(s) / \sqrt{\rho}$$

Then

$$E[v(s)\, y(s + r)] = 0 \quad \text{for all } r \neq 0$$
Proof: By definition,

$$v(s) = A(z)\, y(s) / \sqrt{\rho}$$

Multiply both sides by $y(s + t)$ and take expectation:

$$R_{vy}(-t) = A(z)\, R(t) / \sqrt{\rho}$$

Take the Fourier transform of both sides:

$$S_{vy}(\lambda) = A(z_1, z_2)\, S_{yy}(\lambda) / \sqrt{\rho}, \qquad z_i = \exp(\sqrt{-1}\, \lambda_i)$$

$$= \sqrt{\rho} \quad \text{by (10)}$$

Since the cross spectral density is a constant, $E[v(s)y(s + r)] = 0$ if $r \neq 0$.

Expression for the Correlation. In the one dimensional case, the correlation function is a linear combination of exponentially decaying functions of the lag term, given that the spectral density is a ratio of linear combinations of sinusoids. Such a result is not true in the 2D case. Exponential correlation functions are rare. We can evaluate the correlations from the spectral density by numerical integration. We will give one example below.

Example: Consider the 4 member symmetric neighbor set.
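Let

$$y(s) = \theta \sum_{r \in N} y(s + r) + \sqrt{\rho}\, v(s), \qquad N = \{(i, j) : |i| = 1 \text{ or } |j| = 1, \text{ not both}\},$$

i.e., $N = \{(0, 1), (0, -1), (1, 0), (-1, 0)\}$. The spectral density is

$$S(\lambda) = \rho / \{1 - 2\theta(\cos \lambda_1 + \cos \lambda_2)\}$$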
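The numerical integration mentioned above can be sketched with a 2D FFT, which amounts to a Riemann sum over a uniform frequency grid. The sketch assumes the normalization $1/(4\pi^2)$ over $[-\pi, \pi]^2$ for the inverse Fourier integral (the convention under which the difference equation (8) holds), and the values $\theta = 0.2$, $\rho = 1$, and $M = 64$ are hypothetical choices made only for the illustration.

```python
import numpy as np

theta, rho = 0.2, 1.0        # hypothetical values; |theta| < 1/4 keeps the denominator positive
M = 64                       # number of frequency samples per axis (assumed)

lam = 2 * np.pi * np.arange(M) / M
l1, l2 = np.meshgrid(lam, lam, indexing="ij")
S = rho / (1.0 - 2.0 * theta * (np.cos(l1) + np.cos(l2)))

# Riemann-sum approximation of (1/(4 pi^2)) * integral of S(lambda) exp(sqrt(-1) lambda . r)
R = np.fft.ifft2(S).real

# Correlations along a row, along a column, and along the diagonal.
for lag in range(4):
    print(lag, R[lag, 0], R[0, lag], R[lag, lag])

# A separable exponential correlation R(i, j) = R(0, 0) a^|i| b^|j| would make this zero.
separability_gap = R[1, 1] * R[0, 0] - R[1, 0] * R[0, 1]
print("R(1,1) R(0,0) - R(1,0) R(0,1) =", separability_gap)
```

The row and column correlations coincide, as expected from the symmetry of the spectral density in $\lambda_1$ and $\lambda_2$, while the nonzero separability gap shows that the correlation is not of the product-exponential form familiar from the 1D case.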
In this example, $y(\cdot)$ is isotropic.

Full Plane Weak Markov Property. The system in (1)-(3) possesses a type of Markov property which is called weak since it can be described in terms of second order properties only. The weak Markov property will be defined in terms of the operator $E^*$, the linear estimate of $y(s)$ based on all its neighbors having the least mean square error.

Definition of $E^*$: $E^*(y(s) \mid \text{all } y(s + r),\ r \neq 0)$ is defined as the estimate of $y(s)$ based on all linear functions of $y(s + r)$, $r \neq 0$, having the least mean square error, i.e.,

$$E[(y(s) - E^*[y(s) \mid \text{all } y(s + r),\ r \neq 0])^2] \leq E[(y(s) - g(y(s + r),\ r \neq 0))^2] \qquad (11)$$

where $g$ is any linear function of one or more variables $y(s + r)$, $r \neq 0$.
Definition (Weak Full Plane Markov): A sequence $y(\cdot)$ is said to have the weak full plane Markov property if

$$E^*[y(s) \mid \text{all } y(s + r),\ r \neq 0] = \sum_{r \in N_1} \alpha_r\, y(s + r) \qquad (12)$$

where $N_1$ is a finite neighbor set and $\alpha_r$, $r \in N_1$, are constants. The words 'full plane' appear because $y(s)$ is conditioned on all other intensities in the entire plane. We will introduce later the half plane Markov property also. We will introduce below the intimate connection between the weak Markov property and the AR models discussed earlier.

Theorem 3:

(i) The process $y(\cdot)$ defined by the system in (1)-(3) possesses the weak full plane Markov property with the neighbor set $N_1$ equal to $N$ and $\alpha_r$ the same as $\theta_r$ in (1).
(ii) A stationary process which possesses the weak full plane Markov property in (12) must have its spectral density as in (10).

Proof:
Part (i): By adding and subtracting $E^*(y(s) \mid y(s + r),\ r \neq 0)$ to $y(s) - g$ on the RHS of Eq. (11), and simplifying using the linearity of the function $g$, one can easily show that (11) is equivalent to the following:

$$E[(y(s) - E^*(y(s) \mid \text{all } y(s + r_1),\ r_1 \neq 0))\, y(s + r)] = 0 \quad \text{for all } r \neq 0. \qquad (13)$$

By the definition of the $v(\cdot)$ process in (3), we have

$$E\Big[\Big(y(s) - \sum_{t \in N} \theta_t\, y(s + t)\Big)\, y(s + r)\Big] = 0 \quad \text{for all } r \neq 0.$$

Consequently, from (13), we have

$$E^*(y(s) \mid \text{all } y(s + r),\ r \neq 0) = \sum_{t \in N} \theta_t\, y(s + t). \qquad (14)$$

Part (ii): In (12), let us replace $\alpha_r$ by $\theta_r$ and $N_1$ by $N$ for the sake of simplicity. Define:

$$v(s) = y(s) - \sum_{r \in N} \theta_r\, y(s + r).$$

By the equivalence of (11) and (13), we have

$$E[v(s)\, y(s + r)] = 0 \quad \text{for all } r \neq 0,$$

which is precisely Eq. (3). By Theorem 1, the spectral density will have the required form.
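Part (i) of Theorem 3 can be checked numerically on a small periodic grid. The sketch below computes the covariance of $y$ from its spectral density, then solves the least squares normal equations for the best linear estimate of $y(s)$ from all other sites, and compares the resulting coefficients with $\theta_r$. The torus setting, the grid size $M = 8$, and the coefficient values are assumptions made for the illustration.

```python
import numpy as np

M = 8                                     # small torus so the normal equations stay tiny (assumed)
# Hypothetical symmetric coefficients on the 4-neighbor set.
theta = {(0, 1): 0.15, (0, -1): 0.15, (1, 0): 0.05, (-1, 0): 0.05}
rho = 1.0

lam = 2 * np.pi * np.arange(M) / M
l1, l2 = np.meshgrid(lam, lam, indexing="ij")
# With theta_r = theta_{-r}, A_1 reduces to a cosine sum.
A1 = 1.0 - sum(th * np.cos(i * l1 + j * l2) for (i, j), th in theta.items())
R = np.fft.ifft2(rho / A1).real           # covariance R(r) on the M x M torus

# All lags r != 0 on the torus: the variables y(s + r) available to E*.
lags = [(i, j) for i in range(M) for j in range(M) if (i, j) != (0, 0)]

# Normal equations of the least squares problem:
#   sum_t a_t R(r - t) = R(r)   for every r != 0
G = np.array([[R[(ri - ti) % M, (rj - tj) % M] for (ti, tj) in lags]
              for (ri, rj) in lags])
b = np.array([R[ri, rj] for (ri, rj) in lags])
a = np.linalg.solve(G, b)

coeff = dict(zip(lags, a))
# On the torus, the lag (0, -1) is stored as (0, M - 1) and (-1, 0) as (M - 1, 0).
expected = {(i % M, j % M): th for (i, j), th in theta.items()}
largest_other = max(abs(v) for k, v in coeff.items() if k not in expected)
for k in sorted(expected):
    print(k, coeff[k])
print("largest coefficient outside N:", largest_other)
```

The estimate places weight only on the members of $N$, with weights equal to $\theta_r$, even though every other site on the grid was offered as a predictor.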
Gaussian Markov Models. Since a Gaussian distribution is completely characterized by the second order properties, we will obtain an explicit expression for the joint probability density of all the $M^2$ pixel intensities in the given image in terms of the AR parameters $\theta_r$ and $\rho$. The expression is useful in estimating the appropriate value of the AR parameters so that the model fits the given image. Often, even if the actual density is not Gaussian, we will still use the Gaussian expression for estimating the parameters. The parameter estimates obtained from this approach possess good properties like relatively low bias and variance, even though their variance may not be the least attainable value.

Let us arrange the $M^2$ pixel intensities in the form of a column vector, row by row:

$$\mathbf{y} = \text{Col.}[y(0, 0), y(0, 1), \ldots, y(0, M - 1), y(1, 0), \ldots, y(M - 1, M - 1)].$$

By definition, $E[\mathbf{y}] = 0$. Let $R = E[\mathbf{y}\mathbf{y}^T]$. The joint probability density of $\mathbf{y}$ is Gaussian$(0, R)$:

$$p(\mathbf{y}) = (2\pi)^{-M^2/2}\, |R|^{-1/2} \exp[-(1/2)\, \mathbf{y}^T R^{-1} \mathbf{y}] \qquad (15)$$

where $|R|$ is the determinant of $R$. The size of the matrix $R$ is $M^2 \times M^2$. Even though every element of $R$ is a correlation of the type $E[y(s)y(s + r)]$ and hence can be related to the parameters $\theta_r$ in principle, the sheer size of $R$ is a drawback in the manipulation of $R$. Hence, for all practical purposes, the above expression is not very useful.

We will give here an approximation to the probability density when $M$ is large which clearly displays the explicit dependence on the parameters $\theta_r$. According to a result of Brillinger, the elements of the discrete Fourier transform (DFT) of $\mathbf{y}$ are approximately independently distributed for large $M$, with the following density. Let $\{Y_{ij},\ i, j = 0, \ldots, M - 1\}$ be the DFT of $\mathbf{y}$; then

$$p(Y_{ij}) \simeq N(0, S_{ij}), \qquad S_{ij} = \rho / (1 - 2\theta^T z_{ij})$$

where

$$\theta = \text{Column}[\theta_{k,l},\ (k, l) \in N_s], \qquad z_{ij} = \text{Column}[\cos[(ik + jl)2\pi/M],\ (k, l) \in N_s],$$

and $N_s$ is the asymmetric half of $N$. We can write an expression for the density of $\mathbf{y}$ from the density of $\{Y_{ij}\}$. Note that the determinant of the matrix of the transform from $\{Y_{ij}\}$ to $\mathbf{y}$ is one.
ROBUST IMAGE MODELS A N D THEIR APPLICATIONS
Thus the density displays the effect of the parameters $\theta_r$ clearly. The error introduced by the use of Brillinger's result is of the order $O(1/M^2)$. Since $M$ is of the order 128-512 in real images, the error is negligible. The logarithm of the density function is not quadratic in $\theta$. Hence maximizing this function is relatively difficult.

C. Simultaneous and Recursive AR Models

We will consider the AR models in which the input random sequence is white. These models are referred to as simultaneous AR (SAR) models. As mentioned earlier, the class of these models is a proper subset of the class of general AR models discussed earlier.

Simultaneous AR Models. A simultaneous model is described by the difference equation

$$y(s) = \sum_{r \in N_1} \beta_r\, y(s+r) + \sqrt{\rho}\, w(s) \tag{16}$$

where $N_1$ can be any arbitrary neighbor set, and $\{w(\cdot)\}$ is a sequence of zero mean and uncorrelated random variables with $E[w^2(s)] = 1$, i.e.,

$$E[w(s)w(r)] = 0 \quad \text{if } r \neq s. \tag{17}$$

The above equation can also be rewritten in the operator form

$$A_1(z)\, y(s) = \sqrt{\rho}\, w(s) \tag{18}$$

where $A_1(z) = 1 - \sum_{r \in N_1} \beta_r z^r$. If $r = (i,j)$, $z^r \triangleq z_1^i z_2^j$.

For the stationarity of the process, the coefficients $\beta_r$ must obey the following condition:

$$A_1(z_1, z_2)\, A_1(z_1^{-1}, z_2^{-1}) > 0 \quad \text{for all } |z_1| = 1 \text{ and } |z_2| = 1. \tag{19}$$

The spectral density of $y(\cdot)$ obeying (16) or (18) can be written down by inspection:

$$S_{yy}(\lambda) = \rho / [A_1(z_1, z_2)\, A_1(z_1^{-1}, z_2^{-1})]. \tag{20}$$

This expression clearly indicates the connection between the SAR models and the general AR models. Clearly the SAR model is one in which the denominator of the spectral density possesses a factorization, i.e., if $S_{yy}(\lambda) = \rho / A(z_1, z_2)$, then the process $y$ can possess a SAR representation as in (16) if and only if the polynomial $A$ can be factored as follows:

$$A(z_1, z_2) = A_1(z_1, z_2)\, A_1(z_1^{-1}, z_2^{-1}) \tag{21}$$
where $A_1$ can be any finite polynomial. Unlike 1D polynomials, 2D polynomials do not in general possess a factorization as in (21). Equation (20) clearly indicates that a process $y(\cdot)$ obeying a SAR model in (16) possesses a full plane weak Markov property with a neighbor set $N$ defined by the product $A_1(z_1, z_2) A_1(z_1^{-1}, z_2^{-1})$, i.e.,

$$A_1(z_1, z_2)\, A_1(z_1^{-1}, z_2^{-1}) = \theta_0 + \sum_{r \in N} \theta_r z^r.$$

Consequently, the minimum mean square error of the linear estimate of $y(s)$ given all the other intensities is $\rho/\theta_0$.

We will show that the noise $w(\cdot)$ is correlated with $y(\cdot)$. We will determine the cross spectral density between $w(\cdot)$ and $y(\cdot)$:

$$S_{wy}(\lambda) = \sqrt{\rho} / A_1(z_1^{-1}, z_2^{-1}). \tag{22}$$

Thus $E[w(s)y(s+r)] \neq 0$ in general. If we do not put any restrictions on the neighbor set $N_1$, then the cross correlation between $w(\cdot)$ and $y(\cdot)$ does not have any particular pattern. But if $N_1$ is limited to special cases, as in the recursive models, then the cross correlations $E[w(s)y(s+r)]$ will possess special properties.

Gaussian SAR Models. Let us write an expression for the joint density of the $M^2$ intensity variables of a given image assuming that the density is Gaussian. One way is to begin with the joint density of all the $w(s)$ variables, which are independent and Gaussian. Then we can use (16) to obtain the density for $y(\cdot)$. However, we need the Jacobian of the transformation between the $M^2$ vector of the $y(\cdot)$ variables and the $M^2$ vector of the $w(\cdot)$ variables. This determinant does not in general equal one and it is not easy to evaluate exactly. The second method is to use the approximation mentioned in the earlier section. Let $\{Y_{ij},\ i,j = 0, \ldots, M-1\}$ be the DFT of the image sequence $\{y(i,j),\ i,j = 0, \ldots, M-1\}$. Let

$$S_{ij} = S_{yy}(\lambda_1 = 2\pi i/M,\ \lambda_2 = 2\pi j/M) = \rho / [(1 - \beta^T z_{ij})(1 - \beta^T z_{ij}^*)]$$

$$\beta = \mathrm{Column}[\beta_{k,l},\ (k,l) \in N_1]$$

$$z_{ij} = \mathrm{Column}[\exp[\sqrt{-1}\,(2\pi/M)(ik + jl)],\ (k,l) \in N_1]$$

$$z_{ij}^* = \mathrm{Column}[\exp[-\sqrt{-1}\,(2\pi/M)(ik + jl)],\ (k,l) \in N_1]$$

$$Y_{ij} \sim N(0, S_{ij})$$
$$p(y(i,j),\ i,j = 0, \ldots, M-1) = \prod_{i,j=0}^{M-1} p(Y_{ij}).$$

Since $S_{ij}^{-1}$ is quadratic in the coefficient vector, the log likelihood function, i.e., the logarithm of the density function, involves quadratic terms in the coefficients in addition to nonquadratic terms in the coefficients and $\rho$.

Recursive AR Models. We will introduce here a subclass of the simultaneous AR models which behave very much like 1D time series models and consequently possess properties like recursive generation. Let us define an asymmetric half plane $R^-$ as follows:

$$R^- = \{(i,j) : (i = 0 \text{ and } j < 0) \text{ or } (i < 0 \text{ and } j \text{ is arbitrary})\}. \tag{25}$$

One important property of $R^-$ is: if $r \in R^-$ and $s \in R^-$ then $(s + r) \in R^-$. A recursive AR model can be written as

$$y(s) = \sum_{r \in N} \beta_r\, y(s+r) + \sqrt{\rho}\, w(s) \tag{26}$$

where (i) the noise $w(\cdot)$ is zero mean, unit variance and uncorrelated, and (ii) the neighbor set $N$ is a subset of the asymmetric half plane $R^-$. Equation (26) can be rewritten in the usual operator form

$$A_1(z)\, y(s) = \sqrt{\rho}\, w(s), \qquad \text{where } A_1(z) = 1 - \sum_{r \in N} \beta_r z^r.$$

We will prove that $w(s)$ is uncorrelated with $y(s+r)$ for all $r$ belonging to $R^-$. This property is similar to that of time series AR models. To prove this result, we begin with the expression (22) for the cross spectral density:

$$A_1(z_1^{-1}, z_2^{-1})\, S_{wy}(\lambda) = \sqrt{\rho}. \tag{27}$$

We will express $S_{wy}(\lambda)$ as a power series in $z_1$ and $z_2$:

$$S_{wy}(\lambda) = \sum_{r} R_{wy}(r)\, z^r$$

where $R_{wy}(r) = E[w(s)y(s+r)]$. Since the right hand side of (27) is independent of $z_1$ and $z_2$, (27) is possible only if

$$R_{wy}(i,j) = 0 \quad \text{for all } (i,j) \in R^-,$$

i.e., $E[w(s)y(s+r)] = 0$ for all $r \in R^-$.
Similarly we can see that $E[w(s)y(s)] = \sqrt{\rho}$.

Recursive Generation. We will describe a procedure which recursively generates a sequence $y(\cdot)$ obeying (26) given a set of values for $w(\cdot)$ and a set of boundary conditions. Recursive generation is possible only with the recursive AR models discussed in this section. Our intention is to recursively generate a finite image $\{y(i,j),\ 0 \le i,j \le M-1\}$. We need the intensities of the boundary pixels, to be defined presently. Let

$$i^* = \min\{i;\ (i,j) \in N \text{ for some } j\}, \qquad j^* = \min\{j;\ (i,j) \in N \text{ for some } i\},$$

$$\Omega_B = \{(i,j) : (i = -1, \ldots, i^* \text{ and } j = 0, 1, \ldots, M-1) \text{ or } (j = -1, \ldots, j^* \text{ and } i = 0, 1, \ldots, M-1)\}.$$

We need the intensities of all the pixels in $\Omega_B$. For example, if $N = \{(0,-1), (-1,1), (-2,0)\}$, then $i^* = -2$, $j^* = -1$, and $\Omega_B$ is displayed below.

[Diagram: the boundary region $\Omega_B$ for $M = 5$; x marks the boundary pixels in rows $-1$ and $-2$ and in column $-1$, and * marks the pixel (0,0) at row 0, column 0.]

We are given $\{w(i,j),\ 0 \le i,j \le M-1\}$. We generate the image row by row, beginning with the zeroth row, using the following equation:

$$y(0,j) = \sum_{r \in N} \beta_r\, y((0,j) + r) + \sqrt{\rho}\, w(0,j), \qquad j = 0, 1, \ldots, M-1.$$

In this case all the values $y((0,j) + r)$ on the right hand side are either intensities of the pixels in $\Omega_B$ or intensities already generated earlier in the same row. The values $w(0,j)$, $j = 0, 1, \ldots, M-1$, are given; they are a set of independent pseudorandom numbers with zero mean and unit variance. Next we can generate all the intensities in row 1, namely $\{y(1,j),\ j = 0, \ldots, M-1\}$, as before. All the quantities on the right hand side are available, either precomputed or given a priori. In general, to compute the intensities of the $i$th row, given all the intensities of the previous rows, we use the following:

$$y(i,j) = \sum_{r \in N} \beta_r\, y((i,j) + r) + \sqrt{\rho}\, w(i,j), \qquad j = 0, 1, \ldots, M-1.$$
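The row-by-row procedure above can be sketched in a few lines of Python. This is only an illustrative sketch of the recursion in (26): the function name `generate_recursive`, the dictionary layout of the coefficients, and the choice of a constant boundary value for every pixel outside the image are assumptions made here, not part of the original text.

```python
import random


def generate_recursive(beta, rho, M, w, boundary=0.0):
    """Generate y(i, j), 0 <= i, j < M, in raster order from
    y(s) = sum_{r in N} beta[r] * y(s + r) + sqrt(rho) * w(s).

    beta maps each offset r = (di, dj) of the neighbor set N to its
    coefficient; every offset must lie in the asymmetric half plane R-.
    Pixels outside the image take the (assumed) constant boundary value.
    """
    for (di, dj) in beta:
        if not (di < 0 or (di == 0 and dj < 0)):
            raise ValueError("neighbor offset %r is not in R-" % ((di, dj),))
    y = {}
    for i in range(M):
        for j in range(M):
            # Every neighbor is either already generated or a boundary pixel.
            acc = sum(b * y.get((i + di, j + dj), boundary)
                      for (di, dj), b in beta.items())
            y[(i, j)] = acc + rho ** 0.5 * w[i][j]
    return [[y[(i, j)] for j in range(M)] for i in range(M)]


# Usage: the neighbor set of the example above with hypothetical coefficients.
random.seed(1)
M = 5
noise = [[random.gauss(0.0, 1.0) for _ in range(M)] for _ in range(M)]
image = generate_recursive({(0, -1): 0.4, (-1, 1): 0.3, (-2, 0): 0.2},
                           1.0, M, noise)
```

Because every offset lies in $R^-$, a single raster scan suffices; no system of equations has to be solved.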
State Variable Models. All the models defined above involve recursion in both the variables $(i,j)$. Here we will focus on a subset of the recursive models which can be written as a multivariable model with recursion in only one index, say $i$. Then we can use Kalman type algorithms for filtering and smoothing an image in the presence of additive noise. We consider the recursive model given below:

$$y(s) = \sum_{r \in N_2} \beta_r\, y(s+r) + \sqrt{\rho}\, w(s)$$

where $\{w(\cdot)\}$ is a zero mean uncorrelated sequence with unit variance. The neighbor set $N_2$ is restricted as follows:

$$N_2 = \{(i,j),\ i \le -1\}.$$

We can also handle a similar neighbor set in which $i$ is replaced by $j$, in a similar way. For illustration, consider the following set:

$$N_2 = \{(-1,-1), (-1,0), (-1,1)\}.$$

Hence

$$y(k,j) = \beta_1\, y(k-1, j-1) + \beta_2\, y(k-1, j) + \beta_3\, y(k-1, j+1) + \sqrt{\rho}\, w(k,j).$$

Let $y^{(k)}$ = column vector of dimension $M$ = $\mathrm{Col.}(y(k,0), \ldots, y(k,M-1))$. Let $w^{(k)} = \mathrm{Col.}(w(k,0), \ldots, w(k,M-1))$. Then the equation can be rewritten as follows:

$$y^{(k)} = B\, y^{(k-1)} + \sqrt{\rho}\, w^{(k)}$$

where $B$ is the $M \times M$ tridiagonal matrix with $\beta_2$ on the main diagonal, $\beta_1$ on the subdiagonal and $\beta_3$ on the superdiagonal:

$$B = \begin{bmatrix} \beta_2 & \beta_3 & & \\ \beta_1 & \beta_2 & \beta_3 & \\ & \ddots & \ddots & \ddots \\ & & \beta_1 & \beta_2 \end{bmatrix}.$$

In the above equation, we generate one entire row as one step. Thus we have converted the two dimensional recursion into an $M$-variable one dimensional recursion. If $N_2$ has neighbors like $(-2, j)$, then the corresponding vector equation for $y^{(k)}$ will involve $y^{(k-1)}$ and $y^{(k-2)}$. We can convert this difference equation into a $2M$-variable first order vector difference equation.

D. Generalized ARMA Models

An ARMA model is characterized by two polynomials $A(z)$ and $B(z)$:

$$A(z) = 1 - \sum_{r \in N_1} \theta_r z^r, \quad \theta_r = \theta_{-r}, \qquad B(z) = 1 + \sum_{r \in N_2} \phi_r z^r, \quad \phi_r = \phi_{-r}.$$
$N_1$ and $N_2$ are symmetric neighbor sets. In addition, the parameters $\theta$ and $\phi$ satisfy the following condition:

$$A(z) > 0 \text{ and } B(z) > 0 \quad \text{for all } |z_1| = 1 \text{ and } |z_2| = 1. \tag{30}$$

A stationary ARMA model is defined by the following difference equation involving the coefficients $\theta$ and $\phi$ defined above:

$$y(s) = \sum_{r \in N_1} \theta_r\, y(s+r) + \sqrt{\nu}\, v(s) \tag{31}$$

or

$$A(z)\, y(s) = \sqrt{\nu}\, v(s) \tag{32}$$

where the polynomials $A$ and $B$ obey (30) and the input $v(\cdot)$ is zero mean and correlated, with spectral density

$$S_{vv}(z) = A(z) B(z). \tag{33}$$

The conditions (30) on $A$ and $B$ are necessary to ensure the existence of a stationary process $y(\cdot)$ obeying (31). As before, (31) is a descriptive model. It is not obvious how to generate a $y(\cdot)$ using (31). This aspect will be treated later. The spectral density of $y(\cdot)$ obeying (31) is

$$S_{yy}(\lambda) = \nu\, B(z) / A(z). \tag{34}$$

It is easy to see that given any spectral density which is a ratio of two positive linear combinations of sinusoids in $\lambda_1$ and $\lambda_2$, there exists a corresponding ARMA model as in (31). Unlike the one dimensional ARMA models, $v(\cdot)$ in Eq. (31) cannot be replaced, in general, by a finite moving average representation operating on an uncorrelated sequence because $A(z)B(z)$ is not, in general, a finite order polynomial in $z$. In view of (33), the sequence $v(\cdot)$ has nonzero correlations only over a finite number of lags, as displayed below:

$$E[v(t)v(t+s)] = \begin{cases} -\sum_{r \in N_1',\ (s-r) \in N_2'} \theta_r\, \phi_{s-r} & \text{if } s \in N'' \\ 0 & \text{otherwise} \end{cases}$$

where

$$N'' = \{(r + s) : r \in N_1',\ s \in N_2'\}, \qquad N_i' = N_i \cup \{0\}, \quad i = 1, 2.$$

We are defining $\theta_0 = -1$, $\phi_0 = 1$. Note that the parameter $\nu$ is introduced in (31) since $E[e^2(s)] = 1$.
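The finite support of the correlations of $v(\cdot)$ can be illustrated numerically. The sketch below assumes, as in (33), that the correlation at lag $s$ is the coefficient of $z^s$ in the product $A(z)B(z)$, with $A(z) = 1 - \sum \theta_r z^r$ and $B(z) = 1 + \sum \phi_r z^r$ (consistent with $\theta_0 = -1$, $\phi_0 = 1$). The helper name `poly_product`, the two small neighbor sets and the coefficient values are hypothetical choices made for the illustration.

```python
def poly_product(a, b):
    """Multiply two 2D polynomials in (z1, z2) stored as {(i, j): coeff}."""
    out = {}
    for (i1, j1), c1 in a.items():
        for (i2, j2), c2 in b.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0.0) + c1 * c2
    return out


theta, phi = 0.2, 0.1
# A(z) = 1 - theta (z2 + z2^-1): symmetric set N1 = {(0, 1), (0, -1)}
A = {(0, 0): 1.0, (0, 1): -theta, (0, -1): -theta}
# B(z) = 1 + phi (z1 + z1^-1): symmetric set N2 = {(1, 0), (-1, 0)}
B = {(0, 0): 1.0, (1, 0): phi, (-1, 0): phi}

# Coefficient of z^s in A(z)B(z), i.e. the correlation of v at lag s.
R_vv = poly_product(A, B)
nonzero_lags = sorted(R_vv)
```

Here the nonzero lags are exactly the nine sums $r + s$ with $r \in N_1'$ and $s \in N_2'$; every other lag has zero correlation.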
Let us define the cross correlations between $v$ and $y$. Multiply both sides of Eq. (31) by $y(s+r)$, take expectation and sum over $r$ from $-\infty$ to $\infty$:

$$A(z)\, S_{yy}(\lambda) = \sqrt{\nu}\, S_{vy}(\lambda).$$

Using the expression for $S_{yy}$ in (34), we get

$$S_{vy}(\lambda) = \sqrt{\nu}\, B(z). \tag{35}$$

Expanding $S_{vy}(\cdot)$ in powers of $z$ and equating the coefficient of $z^r$ on both sides, we get

$$E[v(s)\,y(s+r)] = \begin{cases} \sqrt{\nu}\, \phi_r & \text{if } r \in N_2' \\ 0 & \text{otherwise.} \end{cases}$$

Thus the sequence $v(s)$ has nonzero correlation with $y(s+r)$ only for a finite number of values of $r$. The ARMA model in (31) does not, in general, possess the full plane weak Markov property. This will be established by showing that the best linear estimate of $y(s)$ given all other intensities explicitly depends on all of them.

Theorem 4: The linear least squares estimate of $y(s)$ based on all $y(r)$, $s \neq r$, has the following expression, where the sequence $y(\cdot)$ is stationary and obeys (31):

$$\hat{y}(s) = E^*[y(s) \mid \text{all } y(s+r),\ r \neq 0] = \sum_{r \neq 0} g_r\, y(s+r) \tag{36}$$

and

$$E[(y(s) - \hat{y}(s))^2] = \nu / K \tag{37}$$

where

$K$ = the constant term in the expansion of $A(z)/B(z)$,
$-K g_r$ = the coefficient of $z^r$ in the expansion of $A(z)/B(z)$,

i.e.,

$$-K \sum_r g_r z^r = A(z)/B(z), \qquad g_0 = -1. \tag{38}$$

Proof: Let

$$u(s) = y(s) - \sum_{\text{all } r,\ r \neq 0} g_r\, y(s+r),$$

where $g_r$ obeys (38). Then

$$u(s) = \Big(-\sum_r g_r z^r\Big)\, y(s), \qquad \text{since } g_0 = -1. \tag{39}$$

Find the cross spectral density of $u$ and $y$:

$$S_{uy}(\lambda) = \Big(-\sum_r g_r z^r\Big)\, S_{yy}(\lambda).$$
Substitute for $S_{yy}(\lambda)$ from (34) and $\sum_r g_r z^r$ from (38):

$$S_{uy}(\lambda) = \frac{1}{K}\, \frac{A}{B} \cdot \nu\, \frac{B}{A} = \nu / K.$$

Consequently

$$E[u(s)\,y(s+r)] = 0 \quad \text{for all } r \neq 0. \tag{40}$$

Let $y'(s)$ be any linear function of all $y(s+r)$, $r \neq 0$. Then

$$E[(y(s) - y'(s))^2] = E\Big[\Big(y(s) - \sum g_r\, y(s+r) + \sum g_r\, y(s+r) - y'(s)\Big)^2\Big]$$

$$= E\Big[\Big(y(s) - \sum g_r\, y(s+r)\Big)^2\Big] + E\Big[\Big(\sum g_r\, y(s+r) - y'(s)\Big)^2\Big], \quad \text{using (40)}$$

$$\ge E\Big[\Big(y(s) - \sum g_r\, y(s+r)\Big)^2\Big].$$

This proves (36). To prove (37), let us find the spectral density $S_{uu}$ from the definition of $u$ in (39):

$$S_{uu}(\lambda) = \frac{1}{K^2} \Big(\frac{A(z)}{B(z)}\Big)^2 \nu\, \frac{B(z)}{A(z)} = \frac{\nu}{K^2}\, \frac{A(z)}{B(z)}.$$

Since the constant term of $A(z)/B(z)$ is $K$, we have $E[u^2(s)] = \nu / K$.

As a consequence of the theorem, only the generalized AR model possesses the full plane weak Markov property among all processes with 'rational' spectral densities. There are several special subclasses of ARMA models which possess interesting properties. A detailed description of these models and their properties is given in Kashyap (1981).

E. Generative Interpretation of Models

While defining the basic AR process in Section A, Eq. (1) was used to define the process $u(\cdot)$ in terms of the $y(\cdot)$ process. The spectral density of $u(\cdot)$ in (6) was derived from the corresponding properties of $y(\cdot)$. In this section, we will view the basic Eq. (1) as generative, i.e., given a sequence $u(\cdot)$ with spectral density in (6), how can we generate a sequence $y(\cdot)$ with the corresponding spectral density in (5), or equivalently, $y(\cdot)$ is the output of the system whose input is $u(\cdot)$. If Eq. (1) were recursive, the generation would
not pose any problem. However (1) is not recursive and $y(i,j)$ depends on intensities of pixels all around it. Thus the solution of the generative process is not an initial value problem as in recursive equations, but a boundary value problem. We will pose the problem in a precise way and solve it accordingly. We are interested in generating a sequence $\{y(s)\}$ which belongs to the grid $\Omega$:

$$\Omega = \{(i,j),\ 0 \le i,j \le M-1\}.$$

Let the spectral density of the required process be $\rho / A(z)$, where $A(z)$ is defined in terms of a symmetric neighbor set $N$:

$$A(z) = 1 - \sum_{r \in N} \theta_r z^r.$$

Note that some of the neighbors, as defined by $N$, of the pixels on the boundary of $\Omega$ will not belong to $\Omega$. Let $\Omega_1$ be the smallest superset of $\Omega$ to include all the neighbors of all the pixels in $\Omega$, i.e., $s \in \Omega_1$ implies either $s \in \Omega$ or there exists an $s_1 \in \Omega$ so that $s = s_1 + r$, $r \in N$. For instance, if $N = \{(0,1), (0,-1), (1,0), (-1,0)\}$, then $\Omega_1 = \{(i,j),\ -1 \le i,j \le M\}$. Let $\Omega_B = \Omega_1 - \Omega$. Let us rewrite Eq. (1):

$$y(s) = \sum_{r \in N} \theta_r\, y(s+r) + \sqrt{\rho}\, u(s), \qquad s \in \Omega. \tag{41}$$

The boundary value problem can be posed as follows: Given $\{u(s),\ s \in \Omega\}$ and $\{y(s),\ s \in \Omega_B\}$, determine $\{y(s),\ s \in \Omega\}$ by solving the $M^2$ equations given by (41). Let $y = \{y(0,0), y(0,1), \ldots, y(M-1,M-1)\}$ and $u = \{u(0,0), u(0,1), \ldots, u(M-1,M-1)\}$. We can rewrite the $M^2$ equations in (41) in the following vector matrix format:

$$A y = \sqrt{\rho}\, u + y_B \tag{42}$$

where $A$ is an $M^2 \times M^2$ matrix involving $(\theta_r,\ r \in N)$, and $y_B$ is an $M^2$ vector whose components are formed from the image intensities belonging to the border region $\Omega_B$. The matrix $A$ is a block Toeplitz matrix. The existence of the solution for $y$ obeying (42) depends on the regularity (nonsingularity) of the matrix $A$. It can be shown that the condition (2) on the polynomial $A(z)$ implies that the matrix $A$ is invertible. Thus it is interesting that the conditions (2) are needed in both the generative and nongenerative interpretation of the model. The existence of the inverse for $A$ is not the same thing as saying that the inverse is easily computable for any neighbor set $N$. Recall that a value of 130 for $M$ is common or even low, and the $M^2 \times M^2$ dimension then means 16,900 x 16,900. The sheer volume of computation is stupendous. We will discuss the approximation methods in a later section.
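Let us discuss the structure of $A$ for the simple case of the 4 member first order or nearest neighbor set $N$. The basic equation is

$$y(i,j) = \theta_1 [y(i+1,j) + y(i-1,j)] + \theta_2 [y(i,j-1) + y(i,j+1)] + \sqrt{\rho}\, u(i,j), \qquad (i,j) \in \Omega.$$

We can write down the following $M^2$ simultaneous equations for $y$ in terms of $u$ and the vector of boundary pixels:

$$A y = \sqrt{\rho}\, u + b_1 \tag{43}$$

$$A = \begin{bmatrix} A_1 & A_2 & & \\ A_2 & A_1 & A_2 & \\ & \ddots & \ddots & \ddots \\ & & A_2 & A_1 \end{bmatrix}, \qquad A_1 = \begin{bmatrix} 1 & -\theta_2 & & \\ -\theta_2 & 1 & -\theta_2 & \\ & \ddots & \ddots & \ddots \\ & & -\theta_2 & 1 \end{bmatrix}, \qquad A_2 = -\theta_1 I.$$

To find the eigenvectors and eigenvalues of the matrix $A$, we need to solve the equation $A x = \lambda x$, where $x = \mathrm{Col.}[x_{11}, x_{12}, \ldots, x_{1M}, x_{21}, \ldots, x_{MM}]$, i.e.,

$$-\theta_1 x_{i-1,j} - \theta_2 x_{i,j-1} + x_{ij} - \theta_2 x_{i,j+1} - \theta_1 x_{i+1,j} = \lambda x_{ij}. \tag{44}$$

Let us try for $x_{ij}$ a solution of the type

$$x_{ij} = \sin i\alpha\, \sin j\gamma. \tag{45}$$

Substituting (45) in (44) and simplifying, using $\sin (i-1)\alpha + \sin (i+1)\alpha = 2 \sin i\alpha \cos \alpha$, we find the eigenvalues of $A$ to be

$$\lambda_{ij} = 1 - 2\theta_1 \cos\Big(\frac{i\pi}{M+1}\Big) - 2\theta_2 \cos\Big(\frac{j\pi}{M+1}\Big), \qquad i,j = 1, \ldots, M. \tag{46}$$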
The corresponding eigenvector is

$$x^{(i,j)} = \mathrm{Col.}\Big[\sin\Big(\frac{ik\pi}{M+1}\Big) \sin\Big(\frac{jl\pi}{M+1}\Big),\ k,l = 1, \ldots, M\Big].$$

Note that the condition (2) on $A(z)$ implies that all the eigenvalues in (46) are strictly positive. Hence we can easily invert the $A$ matrix and solve Eq. (43) for $y$ in terms of $u$ and the boundary conditions. The above procedure is elegant, but it does not appear that one can generalize it to handle Eq. (1) with an arbitrary neighbor set. In the next section, we will give approximation methods which are computationally elegant and still can synthesize an image close to the ideal image given in Eq. (43).
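The eigenstructure claimed for the nearest neighbor matrix $A$ can be checked numerically on a small grid. The sketch below is an illustration only: it assumes zero intensities outside the grid, pixel indices running from 1 to $M$ in row-by-row order, and eigenvalues of the form $1 - 2\theta_1 \cos(i\pi/(M+1)) - 2\theta_2 \cos(j\pi/(M+1))$ with product-of-sines eigenvectors. The function names and the values of $\theta_1$, $\theta_2$ are hypothetical.

```python
import math


def build_A(theta1, theta2, M):
    """M^2 x M^2 matrix of the 4 neighbor model, pixels ordered row by row."""
    n = M * M
    A = [[0.0] * n for _ in range(n)]
    for i in range(1, M + 1):
        for j in range(1, M + 1):
            p = (i - 1) * M + (j - 1)
            A[p][p] = 1.0
            # theta1 couples neighboring rows, theta2 neighboring columns.
            for di, dj, t in ((1, 0, theta1), (-1, 0, theta1),
                              (0, 1, theta2), (0, -1, theta2)):
                k, l = i + di, j + dj
                if 1 <= k <= M and 1 <= l <= M:
                    A[p][(k - 1) * M + (l - 1)] = -t
    return A


def eigenpair(theta1, theta2, M, i, j):
    """Candidate eigenvalue and eigenvector indexed by (i, j)."""
    lam = (1.0 - 2.0 * theta1 * math.cos(i * math.pi / (M + 1))
           - 2.0 * theta2 * math.cos(j * math.pi / (M + 1)))
    x = [math.sin(i * k * math.pi / (M + 1)) * math.sin(j * l * math.pi / (M + 1))
         for k in range(1, M + 1) for l in range(1, M + 1)]
    return lam, x


def residual(A, lam, x):
    """Largest entry of |A x - lam x|; near zero for a true eigenpair."""
    return max(abs(sum(a * v for a, v in zip(row, x)) - lam * xv)
               for row, xv in zip(A, x))


M, theta1, theta2 = 4, 0.2, 0.15
A = build_A(theta1, theta2, M)
worst = max(residual(A, *eigenpair(theta1, theta2, M, i, j))
            for i in range(1, M + 1) for j in range(1, M + 1))
```

With these parameter values `worst` is at the level of rounding error and all the candidate eigenvalues are positive, which is what makes the solution of (43) well defined.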
F. Approximations to the Image Models

The models presented earlier are very interesting from the theoretical point of view. But if we are interested in applying the models to applications like image synthesis, segmentation, etc., the associated computational problems are not easy to handle. To illustrate the nature of the difficulty, we will consider four cases. First of all, to obtain an exact value for the correlation $R(s)$, we need to numerically solve the corresponding Fourier inversion problem involving the spectral density. There is no analytic method. We have already indicated that obtaining an exact expression for the joint probability density of all the intensities in an $M \times M$ image is almost impossible; we can get only an approximate expression for the density. Next, synthesizing an $M \times M$ image obeying a given model is computationally stupendous, as indicated in the earlier section.

As indicated, we can think of a separate approximation for each task, like joint density or synthesis. The alternative method is to think of approximating the model itself, like the AR model in (1) or the ARMA model in (28), so that all the tasks mentioned earlier can be done with relative ease and modest computation. We will consider several approximations, but the most important approximation is the toroidal model. These approximations are also called finite lattice models since they are defined only for a finite lattice, say $M \times M$. The model has no meaning for the intensities outside of the grid.

Toroidal Approximation. The toroidal approximation of the generalized AR model (1) defined for an $M \times M$ lattice is given below:

$$y(s) = \sum_{r \in N} \theta_r\, y(s \oplus r) + \sqrt{\rho}\, u(s), \qquad s \in \Omega. \tag{48}$$
All the symbols in (48) such as $N$, etc. have exactly the same meaning as in (1), except that the symbol $\oplus$ stands for addition modulo $M$ in both components, i.e., $(i,j) \oplus (k,l) = ((i+k) \bmod M,\ (j+l) \bmod M)$. Equation (48) is closed, i.e., it gives $M^2$ linear equations involving the $M^2$ image intensities $\{y(i,j),\ i,j = 0, \ldots, M-1\}$. They do not involve any values of $y(s)$ for $s$ outside the boundary. The modulo operation makes the pixel $(0, M-1)$ the left neighbor of $(0,0)$ and $(M-1, 0)$ the top neighbor of $(0,0)$. A similar statement is valid for the other pixels on the edge of the grid. It is as if we have folded the $M \times M$ grid into a torus such that the 0th row and the $(M-1)$th row are neighbors and similarly the 0th column and the $(M-1)$th column are neighbors.

Often the toroidal models are criticized as being unrealistic for "real" images in view of the folding property. This criticism misses the point that this toroidal AR model is an approximation to the spatial AR model in (1). We measure the quality of approximation by the quality of the inferences such as correlations, synthesized images, etc. The toroidal model is accepted only if the error between the inferences given by the toroidal models and the original models is sufficiently small.

Properties of the Toroidal Model. We will consider here the properties of the model in (48), the toroidal approximation of the AR model in (1). Toroidal ARMA models possess similar properties. Let $\{Y_{ij},\ i,j = 0, \ldots, M-1\}$ be the discrete Fourier transform of the finite sequence $\{y(i,j),\ i,j = 0, \ldots, M-1\}$. Let us assume that $y(\cdot)$ is stationary and $E[y(\cdot)] = 0$. Similarly, let $\{U_{ij}\}$ be the DFT of the sequence $\{u(s)\}$. Recall that

$$A(z_1, z_2) = 1 - \sum_{(i,j) \in N} \theta_{ij}\, z_1^i z_2^j.$$

The spectral density $S_{uu}(i,j)$ of $u(\cdot)$ is defined as the mathematical expectation of $U_{ij} U_{ij}^*$, with $*$ indicating the complex conjugate:

$$S_{uu}(i,j) = A(\nu^i, \nu^j), \qquad \nu = \exp[\sqrt{-1}\, 2\pi/M].$$

$R_{uu}(r)$ = discrete inverse transform of the spectral density $S_{uu}$:

$$R_{uu}(r) = \begin{cases} -\theta_r & r \in N \\ 1 & r = 0 \\ 0 & \text{elsewhere.} \end{cases}$$
By transforming (48),

$$Y_{ij} = \sqrt{\rho}\, U_{ij} / A(\nu^i, \nu^j). \tag{49}$$

Consequently

$$S_{yy}(i,j) = \text{spectral density of } y(\cdot) = E[Y_{ij} Y_{ij}^*] = \rho / A(\nu^i, \nu^j).$$

Thus both the toroidal AR model in (48) and the ordinary AR model (1) have the same expression for the spectral density. This statement is also true for an ordinary ARMA model in (28) and the corresponding toroidal model. We can restate this result as follows.

Theorem 5: The spectral density of an ordinary AR or ARMA model, say $S_\infty(i,j)$, defined over the infinite grid $\{(i,j);\ i,j = -\infty, \ldots, \infty\}$ and evaluated at the frequencies $(2\pi i/M,\ 2\pi j/M)$, is identical to the spectral density of the corresponding toroidal model defined over the finite lattice $M \times M$, say $S_F(i,j)$, i.e.,

$$S_\infty(i,j) = S_F(i,j), \qquad i,j = 0, \ldots, M-1. \tag{50}$$

Consequently we can rewrite (49) as

$$Y_{ij} = \sqrt{\rho / A(\nu^i, \nu^j)}\; W_{ij}, \tag{51}$$

where $\{W_{ij}\}$ is the DFT of an uncorrelated sequence $\{w(\cdot)\}$ with zero mean and unit variance. (51) suggests a relatively easy method of generating a finite $M^2$ sequence $y(\cdot)$ obeying (48) in the following 3 steps:

(i) Obtain a sequence of $M^2$ uncorrelated pseudo-random numbers $\{w(\cdot)\}$ with zero mean, unit variance and any required marginal distribution. Let $\{W_{ij},\ i,j = 0, \ldots, M-1\}$ be the DFT of $\{w(\cdot)\}$.
(ii) Compute $Y_{ij}$, $i,j = 0, 1, \ldots, M-1$, from (51).
(iii) An inverse DFT of $\{Y_{ij}\}$ yields the required finite image $\{y(i,j),\ i,j = 0, 1, \ldots, M-1\}$.
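The three synthesis steps can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes that (51) has the form $Y_{ij} = \sqrt{\rho / A(\nu^i, \nu^j)}\, W_{ij}$, uses a naive unitary 2D DFT that is only practical for small $M$, and takes a 4 member nearest neighbor set with a hypothetical common coefficient 0.2. The function names `dft2`, `transfer` and `synthesize` are introduced here for the example.

```python
import cmath
import math
import random


def dft2(x, inverse=False):
    """Naive unitary 2D DFT of an M x M array (O(M^4), small M only)."""
    M = len(x)
    sign = 1.0 if inverse else -1.0
    out = [[0j] * M for _ in range(M)]
    for k in range(M):
        for l in range(M):
            acc = 0j
            for i in range(M):
                for j in range(M):
                    angle = sign * 2.0 * math.pi * (i * k + j * l) / M
                    acc += x[i][j] * cmath.exp(1j * angle)
            out[k][l] = acc / M
    return out


def transfer(theta, M):
    """A(nu^i, nu^j) for a symmetric neighbor set.

    theta maps each offset (k, l) to its coefficient and must contain both
    r and -r with equal values, so the polynomial is real on the DFT grid.
    """
    return [[1.0 - sum(t * math.cos(2.0 * math.pi * (i * k + j * l) / M)
                       for (k, l), t in theta.items())
             for j in range(M)] for i in range(M)]


def synthesize(theta, rho, w):
    """Steps (i)-(iii): DFT of w, spectral shaping, inverse DFT."""
    M = len(w)
    A = transfer(theta, M)
    W = dft2(w)
    Y = [[math.sqrt(rho / A[i][j]) * W[i][j] for j in range(M)]
         for i in range(M)]
    y = dft2(Y, inverse=True)
    # The imaginary parts vanish up to rounding because w and A are real.
    return [[v.real for v in row] for row in y]


random.seed(0)
M = 6
rho = 1.0
theta = {(0, 1): 0.2, (0, -1): 0.2, (1, 0): 0.2, (-1, 0): 0.2}
w = [[random.gauss(0.0, 1.0) for _ in range(M)] for _ in range(M)]
y = synthesize(theta, rho, w)
```

In practice the naive transform would be replaced by a fast Fourier transform routine; the naive version is used here only to keep the sketch self-contained.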
Joint Probability Density. An exact expression can be given for the probability density of the finite image $\{y(i,j),\ i,j = 0, \ldots, M-1\}$, given that it is Gaussian. Since $\{W_{ij}\}$ are Gaussian with zero mean, unit variance and independent, (51) implies that $\{Y_{ij},\ i,j = 0, 1, \ldots, M-1\}$ are also zero mean and independent, and $Y_{ij}$ has the density

$$Y_{ij} \text{ is Gaussian}(0,\ \rho / A(\nu^i, \nu^j)).$$
Since the Jacobian of the transformation from $\{y(s)\}$ to $\{Y_{ij}\}$ is one, we can write the joint density of $y(\cdot)$ in terms of $Y_{ij}$:

$$p(y(i,j),\ i,j = 0, \ldots, M-1) = \frac{1}{(2\pi\rho)^{M^2/2}} \prod_{i,j=0}^{M-1} \big(A(\nu^i, \nu^j)\big)^{1/2} \exp\big[-(1/2\rho)\, |Y_{ij}|^2 A(\nu^i, \nu^j)\big].$$

Recall that

$$A(\nu^i, \nu^j) = 1 - \sum_{(k,l) \in N} \theta_{k,l}\, \nu^{(ik + jl)}.$$

Thus the exponent of the density function is linear in $\theta_r$. Also note that the above expression, which is the exact joint density for the image obeying (48), is also the approximate expression for the density of the model in Eq. (1), as discussed in Section (2). Thus the toroidal model throws light on the precise nature of the approximation in the density expression of Section (2). The correlation function of $y$ can be obtained by summation of $S_{yy}$.

The correlations of the toroidal lattice model are not exactly identical to the correlations of the corresponding infinite lattice AR and ARMA models. But the difference between them is small.

Synthesis. Let $\{w(i,j),\ i,j = 0, \ldots, M-1\}$ be a sequence of zero mean uncorrelated variables with unit variance. Let $\{W_{ij}\}$ be its DFT. We can relate $Y_{ij}$ to $W_{ij}$ as follows:

$$Y_{ij} = \sqrt{\rho}\, W_{ij} / \sqrt{A(\nu^i, \nu^j)}.$$

Note that $A(\nu^i, \nu^j)$ is real. Finally, it is possible to arrange the $M^2$ equations for $\{y(i,j),\ i,j = 0, 1, \ldots, M-1\}$ given by (48) in the matrix form:

$$B y = \sqrt{\rho}\, u$$

$$y = \mathrm{Col.}[y(0,0), \ldots, y(0,M-1), y(1,0), \ldots, y(M-1,M-1)]$$

$$u = \mathrm{Col.}[u(0,0), \ldots, u(M-1,M-1)]$$

where $B$ is a doubly circulant matrix, i.e., $B$ can be written as a block circulant matrix and each block is also a circulant:
$$B = \begin{bmatrix} B_{0,0} & B_{0,1} & \cdots & B_{0,M-1} \\ B_{0,M-1} & B_{0,0} & \cdots & B_{0,M-2} \\ \vdots & & \ddots & \vdots \\ B_{0,1} & B_{0,2} & \cdots & B_{0,0} \end{bmatrix}$$

where $B_{0,i}$, $i = 0, 1, \ldots, M-1$, are also circulant. For instance, if $N$ is the 4 member nearest neighbor set, then

$$y(s) = \theta_{0,1}\big(y(s \oplus (0,1)) + y(s \oplus (0,-1))\big) + \theta_{1,0}\big(y(s \oplus (1,0)) + y(s \oplus (-1,0))\big) + \sqrt{\rho}\, u(s)$$

$$B_{0,0} = \mathrm{circulant}(1, -\theta_{0,1}, 0, \ldots, 0, -\theta_{0,1}) = \begin{bmatrix} 1 & -\theta_{0,1} & & -\theta_{0,1} \\ -\theta_{0,1} & 1 & -\theta_{0,1} & \\ & \ddots & \ddots & \ddots \\ -\theta_{0,1} & & -\theta_{0,1} & 1 \end{bmatrix}$$

$$B_{0,1} = B_{0,M-1} = -\theta_{1,0}\, I, \qquad B_{0,i} = 0 \quad \text{for } i \neq 0, 1, M-1.$$

The eigenvectors of the matrix $B$ are the Fourier vectors $f_{ij}$:

$$f_{ij} = \mathrm{Col.}[\nu^{(ik + jl)},\ k,l = 0, 1, \ldots, M-1], \qquad \nu = \exp[\sqrt{-1}\, 2\pi/M].$$

The eigenvalue of $B$ associated with the eigenvector $f_{ij}$ is $\mu_{ij}$:

$$\mu_{ij} = A(\nu^i, \nu^j).$$

For the 4 neighbor case

$$\mu_{ij} = 1 - \theta_{0,1}(\nu^j + \nu^{-j}) - \theta_{1,0}(\nu^i + \nu^{-i}) = 1 - 2\theta_{0,1} \cos\frac{2\pi j}{M} - 2\theta_{1,0} \cos\frac{2\pi i}{M}, \qquad i,j = 0, 1, \ldots, M-1.$$

We can compare these eigenvalues with the eigenvalues of the $A$ matrix in Section II.E. These eigenvalues are identical to the approximate eigenvalues of the $A$ matrix. This again throws light on the nature of the approximation.

Other Finite Lattice Models. As alluded to earlier, the toroidal model of (48) is only one way of modifying the AR model in (1) so that the equations for $y(\cdot)$ are closed, i.e., they do not involve any $y(i,j)$ where $(i,j)$ is outside the $M \times M$
lattice. We can have other models in which the closure is achieved in different ways. We will give two examples. In both of them, we have the 4 member neighbor set

$$N = \{s_1 = (0,1),\ \bar{s}_1 = (0,-1),\ s_2 = (1,0),\ \bar{s}_2 = (-1,0)\}.$$

Example 1 (cosine vectors as eigenvectors): Let us modify Eq. (1) as follows to achieve closure:

$$y(s) = \sum_{i=1}^{2} \theta_i\, y_1(s + s_i) + \sqrt{\rho}\, u(s), \qquad s = (i,j),\ i,j = 0, 1, \ldots, M-1 \tag{52}$$

where

$$y_1(s + s_i) = \begin{cases} y(s + s_i) + y(s + \bar{s}_i) & \text{if } (s + s_i) \in \Omega,\ (s + \bar{s}_i) \in \Omega \\ 2y(s + s_i) & \text{if } (s + s_i) \in \Omega,\ (s + \bar{s}_i) \notin \Omega \\ 2y(s + \bar{s}_i) & \text{if } (s + \bar{s}_i) \in \Omega,\ (s + s_i) \notin \Omega. \end{cases} \tag{53}$$

By definition the $M^2$ equations for $y(s)$ given by (52) are closed, i.e., we can write Eq. (52) as

$$D y = \sqrt{\rho}\, u$$
where
D,, and I are M x M matrices. To find the eigenvector of D , consider the Chebychev polynomials ci(xj), xj = cos ci(xj)= , , , ( A ) ,
( M j : ~
i, j
1) = 0,1,
.. . , M - 1
-
c j = (co(xj),cl(xj),.. . , c ~ - , ( x ~ ) M-vector )~
cij = column(c0xj)ci, c,(xj)ci,.. .,c M M,(xj)ci)
where $0 \le i,j \le M-1$. Then $c_{ij}$, $0 \le i,j \le M-1$, are the eigenvectors of $D$. The eigenvalue of $D$ corresponding to the eigenvector $c_{ij}$ is $a_{ij}$, $a_{ij} = 1 + 2\theta_1 x_j + 2\theta_2 x_j$.

Example 2: Consider again the 4 neighbor case; (52) is the finite lattice model with the following definition for $y_1(\cdot)$ instead of (53):

$$y_1(s + s_i) = \begin{cases} y(s + s_i) + y(s + \bar{s}_i) & \text{if } s + s_i \in \Omega,\ s + \bar{s}_i \in \Omega \\ y(s + s_i) & \text{if } s + s_i \in \Omega,\ s + \bar{s}_i \notin \Omega \\ y(s + \bar{s}_i) & \text{if } s + \bar{s}_i \in \Omega,\ s + s_i \notin \Omega. \end{cases}$$

Then we can rewrite these equations as

$$D y = \sqrt{\rho}\, u.$$

The eigenvectors of the matrix $D$ are the so-called discrete sine vectors.

G. Summary

We have defined the various types of AR and ARMA models and their toroidal and other approximations. We have concentrated on their second order properties. We have also discussed the various types of weak Markov properties which are completely characterized by the second order properties.

III. ROBUST ESTIMATION IN CAUSAL AUTOREGRESSIVE MODELS
A. Introduction
The importance of model based techniques for image processing tasks such as edge detection, image synthesis, image coding, image restoration, etc. has been well documented. However, in all of these models, the image intensity array is assumed to have a multivariate Gaussian distribution. The Gaussian assumption is used primarily in estimating the parameters of the image model fitted to the image. The corresponding estimation procedure is relatively easy; for example, for the causal autoregressive model, the maximum likelihood method is the same as the least squares method. However, in many applications, it is well known that the Gaussian assumption is not appropriate. A more realistic assumption is a contaminated Gaussian noise,

$$\xi(i,j) = \begin{cases} w(i,j) & \text{with probability } 1 - p \\ u(i,j) & \text{with probability } p \end{cases} \tag{54}$$
where $w(i,j)$ is a regular white Gaussian noise, $u(i,j)$ is an outlier process, and the proportion $p$ of outliers is assumed small (less than 5%). Unfortunately, least squares estimators or maximum likelihood estimators under the Gaussian assumption are very sensitive to minor deviations from the Gaussian noise assumption. Even a single bad data point (outlier) among 1000 observations can cause a large error in the estimator. Because of this excessive sensitivity of the least squares estimator, a robust estimator is needed in image models. The robust estimator should possess the following properties:

(1) It should have a reasonably good (optimal or nearly optimal) efficiency at the assumed noise distribution.
(2) It should be robust in the sense that a small number of outliers impair the performance only slightly.
(3) Somewhat larger deviations from the assumed distribution should not cause a catastrophe.

The resistance to outliers (e.g., impulse noise) is equivalent to the distribution robustness by Hampel's theorem (Huber, 1981). Many different robust estimation algorithms have been developed in the last twenty years, mostly for location parameter estimation. These robust estimation algorithms can be classified into three large types of estimators: M-estimators, L-estimators, and R-estimators. An M-estimator is a maximum likelihood type estimator and is obtained by solving a minimization problem. An L-estimator is a linear combination of order statistics. An R-estimator is derived from rank tests. We are mostly interested in the M-estimator for application to the image models. The M-estimator is easy to extend to the problems of image models, but the other types of estimators are difficult to use in problems other than simple location parameter estimation. The M-estimator is defined by the following minimization problem,

$$\text{Minimize } \sum_i \rho(x_i; \theta) \tag{55}$$

or by solving the following implicit equation,

$$\sum_i \psi(x_i; \theta) = 0 \tag{56}$$

where $\rho$ is a continuous and differentiable convex function possessing a bounded and continuous derivative $\psi(x) = \partial \rho(x) / \partial x$, and $\rho$ is symmetric about the origin with $\rho(0) = 0$. The convexity of the $\rho$ function ensures the equivalence of (55) and (56). The boundedness and continuity of the $\psi$ function are essential in obtaining robustness of the M-estimator. If $\psi$ is not bounded, then a single gross outlier can completely upset the estimator. If $\psi$ is not continuous, then small changes in the observation $x_i$ may produce a large change in the estimator.
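As a concrete illustration of (55) and (56), the sketch below computes an M-estimate of a location parameter. The choices made here are assumptions for the example and are not taken from the text: the $\psi$ function is Huber's clipped linear function (bounded and continuous), the tuning constant is $c = 1.345$, the scale is taken as known and equal to one, and the equation $\sum_i \psi(x_i - \theta) = 0$ is solved by iteratively reweighted averaging started at the median.

```python
def huber_psi(r, c=1.345):
    """Huber's psi: linear for |r| <= c, clipped to +/- c beyond."""
    return max(-c, min(c, r))


def m_estimate_location(x, c=1.345, tol=1e-12, max_iter=200):
    """Solve sum_i psi(x_i - theta) = 0 by iteratively reweighted averaging."""
    xs = sorted(x)
    n = len(xs)
    # Start from the sample median, itself a robust location estimate.
    theta = xs[n // 2] if n % 2 else 0.5 * (xs[n // 2 - 1] + xs[n // 2])
    for _ in range(max_iter):
        # Weight psi(r) / r: 1 for small residuals, c / |r| for large ones.
        weights = [1.0 if abs(v - theta) <= c else c / abs(v - theta)
                   for v in x]
        new_theta = sum(wt * v for wt, v in zip(weights, x)) / sum(weights)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


data = [0.9, 1.1, 1.0, 0.95, 1.05, 100.0]   # one gross outlier
robust = m_estimate_location(data)
mean = sum(data) / len(data)
```

For this data the sample mean is pulled to 17.5 by the single outlier, while the M-estimate stays near the bulk of the observations, because the bounded $\psi$ limits the influence of any one observation to $c$.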
There are several different definitions of robustness of an estimator (Huber, I98 1). Qualitutiiv ruhttstrie.s.s is defined by weak continuity of the estimator. M-estimator is qualitatively robust if and only if the corresponding t+b is bounded and continuous. Mininiux rohusf estimator minimizes the maximum degradation over E deviations. The M-estimator of location is optimal in the sense of minimax robustness. Quantitutiue robustness is defined by the property of small change in asymptotic bias and asymptotic variance in the contaminated neighborhood. Even though a robust procedure is necessary in most image processing applications, very little research has been done on the use of robust procedure in image processing. In this section, we develop estimation algorithms for the causal autoregressive image model.
B. Ctrusul Autoreyressioe Model I t is well known that a large class of images can be effectively represented by various types of image models involving small number of parameters (Kachyap, 1981).Image models are already uscd in image coding (Delp et ul., 1979), image synthesis, texture analysis, edge detection (Kashyap and Eom, 198Sa).Of course, there are many different types of image models and these can be classified into two large classes of image models by their second order statistical structures: classical short correlation models and long correlation models. These different image models and their general properties are discussed by Kashyap (1981). The causal autoregressive model is a generalization of one dimensional autoregressive model. This model is simple but has good modelling performance as shown in previous studies. Consider the following m x n image (Fig. I ) .
FIG. 1. An m x n image and three causal neighbors.
R. L. KASHYAP AND KIE-BUM EOM
Assume that the image intensity follows a three-neighbor causal autoregressive model. Let $(i, j)$ be an index for the coordinate location and $y(i, j)$ be the intensity at the coordinate $(i, j)$. Then the three causal neighbors of this pixel are $\{y(i-1, j),\ y(i, j-1),\ y(i-1, j-1)\}$. This causality follows from the convention of raster scanning, and because of it the resulting two-dimensional model has all the convenience of a one-dimensional model. Suppose that $\{\zeta(i, j)\}$ is a two-dimensional white noise sequence with outliers as assumed in (54). The variance of the regular part of the noise is $\sigma^2$. Then the three-neighbor causal autoregressive model is represented by the following equation:
$$y(i, j) = \theta^T z(i, j) + \zeta(i, j) \qquad (57)$$
where $\theta$ is a parameter vector and $z(i, j)$ is a vector consisting of the intensities of the three causal neighbors and unity. The last element of the vector $z(i, j)$ is used to represent the constant grey level in the image.
$$z(i, j) = [\,y(i, j-1),\ y(i-1, j),\ y(i-1, j-1),\ 1\,]^T \qquad (58)$$
It is assumed that every pixel has all of its neighbors, i.e., for each pixel at $(i, j)$, the pixels at $(i, j-1)$, $(i-1, j)$ and $(i-1, j-1)$ are available. We consider the robust parameter estimation of the causal autoregressive model for two cases of outliers. In the first case, we assume that the process $y(i, j)$ given in (57) can be perfectly observed. Here the outlier process enters only through the noise process $\zeta(i, j)$ that generates $y(i, j)$. In the second case, we assume that the observation $x(i, j)$ of the process $y(i, j)$ is corrupted by noise $\xi(i, j)$. It is given by the following equation:
$$x(i, j) = y(i, j) + \xi(i, j) \qquad (59)$$
The noise process $\xi$ is assumed to contain outliers. In this case, the outliers are involved not only in generating $y(i, j)$ but also in the observation. In the next section, robust parameter estimation is discussed for these two cases of outliers.
C. Robust Parameter Estimation
1. Perfect Observation Case
The parameters of the image model given in (57) can be estimated by a robust M-estimator. The M-estimator of the parameters in (57) is a generalization of the location M-estimator. Define the following function $Q(\theta, \sigma)$:
where $\rho$ is a continuous, differentiable and convex function possessing a bounded derivative, and it is symmetric about the origin with $\rho(0) = 0$. Then the M-estimator of the causal autoregressive model is defined by the following minimization problem.
$$\text{Minimize } Q(\theta, \sigma) \qquad (61)$$
The M-estimator can also be obtained by solving the following two equations simultaneously.
where $\psi(x) = d\rho(x)/dx$ and $\chi(x) = x\psi(x) - \rho(x)$, and the function $\psi$ is continuous and bounded. The following $\rho$, $\psi$, and $\chi$ functions satisfy the above conditions. In this section, it is assumed that these functions are used in our robust estimation algorithm.
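$$\rho(x) = \begin{cases} x^2/2, & |x| \le c \\ c|x| - c^2/2, & |x| > c \end{cases} \qquad (64)$$
$$\psi(x) = \begin{cases} x, & |x| \le c \\ c\,\mathrm{sgn}(x), & |x| > c \end{cases} \qquad (65)$$
$$\chi(x) = x\psi(x) - \rho(x) = \tfrac{1}{2}[\psi(x)]^2 \qquad (66)$$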
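The identity in (66) can be checked numerically. The sketch below assumes the hard-limiter (Huber-type) forms of $\rho$ and $\psi$ with threshold `c`; the function names and the value `c = 1.5` are illustrative choices, not taken from the text.

```python
def rho(x, c=1.5):
    """Assumed rho: quadratic near the origin, linear in the tails."""
    ax = abs(x)
    return 0.5 * x * x if ax <= c else c * ax - 0.5 * c * c


def psi(x, c=1.5):
    """Derivative of rho: identity on [-c, c], clipped to +/-c outside."""
    return max(-c, min(c, x))


def chi(x, c=1.5):
    """chi(x) = x * psi(x) - rho(x), as defined in the text."""
    return x * psi(x, c) - rho(x, c)


# Largest deviation between chi(x) and psi(x)**2 / 2 over a few test points.
points = (-5.0, -1.5, -0.3, 0.0, 0.7, 2.0, 10.0)
max_gap = max(abs(chi(x) - 0.5 * psi(x) ** 2) for x in points)
```

For these forms the gap is zero up to rounding, both inside the linear region and in the clipped tails.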
Asymptotic Property. The asymptotic property of the robust M-estimator for autoregression was investigated by Nasburg and Kashyap (1975). The asymptotic property of one-dimensional autoregression is also applicable to the two-dimensional causal autoregressive model. First, the following conditions are assumed.
(i) $\{y(i, j)\}$ is a weakly stationary random sequence.
(ii) $\psi(\cdot)$ is an odd, monotone increasing function satisfying a Lipschitz condition.
(iii) The noise process $\zeta$ has finite moments up to third order.
(iv) $E[\psi(\zeta(i, j) + c)] = \psi(c)$ for all $c$.
Now define $\hat{\theta}_N$ to be an M-estimator which satisfies (61) and is computed with sample size $N$. The following Theorem 6 and Theorem 7 are from Nasburg and Kashyap (1975).
Theorem 6 (Consistency): Under the above assumptions,
$$\hat{\theta}_N \to \theta \quad \text{w.p.1 as } N \to \infty$$
Theorem 7 (Asymptotic Normality): Under the above assumptions, $N^{1/2}(\hat{\theta}_N - \theta)$ converges in distribution to a normal distribution with zero mean and variance $V_1/V_2^2$, where
and
Computation of M-Estimator. The robust M-estimators defined by (61) can be computed by the following iterative algorithm.
Algorithm 1 (Estimation with Perfect Observation):
1. Let $\theta^{(0)}$ and $\sigma^{(0)}$ be initial estimates (the conventional least squares estimator can be used).
2. At the k-th iteration, compute the following quantities by the given formula.
$$r^{(k)}(i, j) = y(i, j) - \theta^{(k)T} z(i, j) \qquad (67)$$
where $z(i, j)$ is as defined in (58).
3. Update the parameters by the following formula.
$$\theta^{(k+1)} = \theta^{(k)} + \tau^{(k)}$$
4. Repeat steps 2-3 until the differences $\|\theta^{(k+1)} - \theta^{(k)}\|$ and $|\sigma^{(k+1)} - \sigma^{(k)}|$ become negligible.
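The overall shape of such an iteration can be sketched for a scalar regression $y_t = \theta z_t + \text{noise}$. This is not the authors' update rule, whose formulas are not reproduced above; it is a minimal sketch that assumes a Huber-style scheme: the scale step rescales $\sigma$ by the average of $\chi(r/\sigma)$ relative to its Gaussian expectation, and the parameter step $\tau$ is a least squares fit to the clipped ("modified") residuals. All function names, the constant `c = 1.5`, and the synthetic data are hypothetical.

```python
import math
import random


def psi(x, c=1.5):
    """Hard-limiter psi: identity on [-c, c], clipped outside."""
    return max(-c, min(c, x))


def chi(x, c=1.5):
    """chi(x) = x*psi(x) - rho(x), equal to psi(x)**2 / 2 for the hard limiter."""
    return 0.5 * psi(x, c) ** 2


def chi_gauss_mean(c=1.5):
    """E[chi(Z)] for Z ~ N(0, 1); assumed normalising constant of the scale step."""
    big_phi = 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))
    small_phi = math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)
    return 0.5 * ((2.0 * big_phi - 1.0) - 2.0 * c * small_phi
                  + 2.0 * c * c * (1.0 - big_phi))


def least_squares(z, y):
    """Ordinary least squares slope and residual r.m.s. for y = theta * z."""
    theta = sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * zi for zi in z)
    sigma = math.sqrt(sum((yi - theta * zi) ** 2 for zi, yi in zip(z, y)) / len(y))
    return theta, sigma


def robust_fit(z, y, c=1.5, max_iter=100, tol=1e-8):
    """Iterate residuals -> scale step -> parameter step until both settle.

    Assumes the initial least squares residuals are not all zero.
    """
    n = len(y)
    szz = sum(zi * zi for zi in z)
    theta, sigma = least_squares(z, y)              # step 1: initial estimates
    a_const = chi_gauss_mean(c)
    for _ in range(max_iter):
        r = [yi - theta * zi for zi, yi in zip(z, y)]   # step 2: residuals
        # assumed scale update: sigma_new^2 = sigma^2 * sum(chi(r/sigma)) / (n * a)
        sigma_new = sigma * math.sqrt(
            sum(chi(ri / sigma, c) for ri in r) / (n * a_const))
        # assumed parameter step: least squares fit to the clipped residuals
        tau = sum(zi * sigma_new * psi(ri / sigma_new, c)
                  for zi, ri in zip(z, r)) / szz
        converged = abs(tau) < tol and abs(sigma_new - sigma) < tol   # step 4
        theta, sigma = theta + tau, sigma_new       # step 3: update
        if converged:
            break
    return theta, sigma


rng = random.Random(1)
z = [rng.uniform(-10.0, 10.0) for _ in range(200)]
y = [0.9 * zi + rng.gauss(0.0, 1.0) for zi in z]
for t in range(0, 200, 20):                         # 5% gross outliers
    y[t] += 50.0 if z[t] >= 0 else -50.0

theta_ls, _ = least_squares(z, y)
theta_rob, sigma_rob = robust_fit(z, y)
```

On this synthetic data the least squares slope is pulled away from the true value 0.9 by the outliers, while the iterated estimate stays close to it, because each outlier contributes at most `c * sigma` to the parameter step.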
The following Lemma 1, Lemma 2 and Theorem 8 summarize the theoretical properties of the iterative method (Algorithm 1) for computing the robust M-estimator. Detailed proofs of these lemmas and the theorem can be found in Eom (1986). Lemma 1 and Lemma 2 show the decrease of $Q$ in (60) with the update of the parameters $\theta$ and $\sigma$ at each iteration.
Lemma 1: Assume that the $\psi$ function given in (65) is used in the iterative estimation algorithm (Algorithm 1). Then the following relation holds.
Lemma 2: Under the assumption of Lemma 1, the following relation is satisfied.
$$Q(\theta^{(k)}, \sigma^{(k)}) - Q(\theta^{(k)}, \sigma^{(k+1)}) \ge \frac{(\sigma^{(k+1)} - \sigma^{(k)})^2}{2\sigma^{(k)}} \qquad (73)$$
The iterative estimation algorithm (Algorithm 1) decreases the value of the function $Q$ in (60) as the number of iterations increases and converges to the minimum value of $Q$. The estimated parameters also converge to the parameters which minimize $Q$ in (60). This property of the algorithm is summarized in Theorem 8.
Theorem 8: Under the assumption of Lemma 1 and $\hat{\sigma} > 0$, the sequence $\{(\theta^{(k)}, \sigma^{(k)})\}$ converges to $(\hat{\theta}, \hat{\sigma})$, a unique solution of (62) and (63) and an estimator which minimizes (61).
In our experiment, Algorithm 1 converges relatively fast. In most cases, the estimates converge in fewer than three iterations.
2. Noisy Observation Case
In this section, we assume that the process $y(i, j)$ is not observable and that the observation $x(i, j)$ is corrupted by noise with outliers. The observation $x(i, j)$ can be represented by the following equation.
$$x(i, j) = y(i, j) + \xi(i, j)$$
The white noise process $\xi(i, j)$ contains a small fraction of outliers and $y(i, j)$ is defined in (57). As mentioned in the introduction, an M-estimator is defined by the minimization of a likelihood-type loss function. However, the M-estimator for the problem with noisy observation is difficult to define. Kleiner, Martin, and Thomson (1979) considered the following minimization problem for one-dimensional autoregression, with some heuristic arguments.
$$\text{Minimize } \sum_i \rho\big(x(i) - \theta^T \hat{z}(i)\big)$$
where $\hat{z}$ is an estimator of the state vector $z$ defined in (58). However, this criterion is not very helpful, because the resulting estimator can be far from robust, depending on the state estimate $\hat{z}$. In this section, estimation with noisy observation is done by combining the robust estimation algorithm for perfect observation with a data cleaning procedure. In this algorithm, the original (uncorrupted) data are also estimated at each iteration by the data cleaning procedure. At each iteration, the data cleaned at the preceding iteration are used to estimate the parameters by Algorithm 1, and the data are cleaned again using the estimated parameters. The estimation algorithm for noisy observation is as follows.
Algorithm 2 (Estimation with Noisy Observation):
1. Initially, set $y^{(0)}(i, j) = x(i, j)$. Compute the initial estimates $\theta^{(0)}$ and $\sigma^{(0)}$ from the noisy observation $\{x(i, j)\}$ by the least squares algorithm.
2. Consider the k-th iteration. Clean the data by the following formula.
where
3. Obtain estimators $\theta^{(k+1)}$ and $\sigma^{(k+1)}$ from the cleaned data $y^{(k+1)}$ by minimizing the following function.
This can be computed by Algorithm 1. Practically, it needs only one or two iterations with $\theta^{(k)}$ and $\sigma^{(k)}$ as initial estimates in Algorithm 1.
4. Repeat steps 2 and 3 until the difference of estimates between successive iterations becomes small.
We are unable to show the theoretical convergence of the sequence $\{y^{(k)}\}$ to the original image intensity $y$ in this study. However, Algorithm 2 performed satisfactorily in the experiment with synthesized and real images.
FIG. 2. Hard limiter type $\psi$ function.
The algorithm given above also stabilizes fast and yields both estimates in fewer than three iterations in our experiment. To speed up the process, the iteration in Algorithm 1 (step 3) can be applied only once. In the experiment, the above algorithm performed well even with one iteration in Algorithm 1.
Choice of $\psi$ Function. The $\psi$ function is used in both Algorithm 1 and the data cleaning procedure. A good choice of $\psi$ function is important not only for the robustness of the estimator but also for the fast convergence of the iterative procedure. The theoretical results in Section III.C.1 are developed with the following monotone $\psi$ function $\psi_{HL}$.
The typical value for $c$ is between 1.5 and 2 in the above $\psi$ function (Fig. 2).
FIG. 3. $\psi$ function for the 3-$\sigma$ rule.
Even before the theoretical work on robust estimation, the 3-$\sigma$ edit rule was used for data cleaning for many years. The 3-$\sigma$ rule is a simple implementation of a hard rejection rule and corresponds to the following choice of $\psi$ function.
The above $\psi$ function is obviously not continuous. The discontinuity of the $\psi$ function is not desirable for robust estimation, as discussed before (Fig. 3). Another interesting $\psi$-function is the following Hampel's $\psi$-function.
$$\psi(x) = \begin{cases} x, & |x| \le a \\ a\,\mathrm{sgn}(x), & a < |x| \le b \\ a\,\mathrm{sgn}(x)\,\dfrac{c - |x|}{c - b}, & b < |x| \le c \\ 0, & |x| > c \end{cases} \qquad (79)$$
Hampel's function is also continuous but returns to zero outside some interval. It is known that the redescending $\psi$ function yields higher efficiencies than the monotone $\psi$ function for extremely heavy-tailed distributions (Huber, 1981; Rey, 1983). This advantage of the redescending $\psi$ function is also confirmed in our experiment: the procedure converges much faster with Hampel's redescending $\psi$ function (79) than with Huber's monotone $\psi$ function (77). This function performed best with parameters $a = 2$, $b = 2.5$, $c = 4.5$ in our experiment (Fig. 4).
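The three candidate $\psi$ functions discussed here can be written down side by side. The sketch below is illustrative: the arguments are standardized residuals (residual divided by the scale), the three-part form is the standard Hampel function, and the 3-$\sigma$ form (keep the value inside three scale units, reject it outside) is an assumption about the rule's $\psi$, since its formula is not reproduced above. Function names are hypothetical.

```python
def psi_hard_limiter(x, c=1.5):
    """Monotone psi: identity on [-c, c], clipped to +/-c outside."""
    return max(-c, min(c, x))


def psi_three_sigma(x, k=3.0):
    """Assumed psi of the 3-sigma edit rule: keep x if |x| <= k, else reject (0).

    Discontinuous at |x| = k.
    """
    return x if abs(x) <= k else 0.0


def psi_hampel(x, a=2.0, b=2.5, c=4.5):
    """Three-part redescending psi; continuous, and zero for |x| > c."""
    ax = abs(x)
    sign = 1.0 if x >= 0 else -1.0
    if ax <= a:
        return x
    if ax <= b:
        return a * sign
    if ax <= c:
        return a * sign * (c - ax) / (c - b)
    return 0.0


samples = [0.5, 2.2, 3.5, 5.0]
table = [(x, psi_hard_limiter(x), psi_three_sigma(x), psi_hampel(x)) for x in samples]
```

A large standardized residual such as 5.0 still contributes `c` under the hard limiter, but contributes nothing under the two rejecting rules; unlike the 3-$\sigma$ rule, Hampel's function reaches zero without a jump.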
These three different $\psi$ functions are compared in the experiment, and the best performing function is chosen in our algorithm. Hampel's function performed better than the other functions in our experiment with the parameter values given above. The theoretical results of Algorithm 1 are also valid with the redescending $\psi$ function, except for the uniqueness of the solution. However, we should be careful in using the redescending $\psi$ function in data cleaning, because it may converge to a local minimum of (60). Consequently, we need to iterate only a few times when using the redescending $\psi$ function. Nevertheless, the experiment shows that both robust estimation algorithms with Hampel's redescending $\psi$ function are stable for more than ten iterations in all images tested. More details of the experiment are discussed in Section III.D.
FIG. 4. Hampel's $\psi$ function.
D. Experimental Results
Perfect Observation Case. Experimental comparisons are made between the robust estimator and the conventional least squares estimator with perfectly observed data. One hundred images, each of size 10 x 10, are generated by the causal autoregressive model using the following equation.
$$y(i, j) = 0.3\,\big(y(i, j-1) + y(i-1, j) + y(i-1, j-1)\big) + \zeta(i, j) \qquad (80)$$
where $\{\zeta(i, j)\}$ is a white noise sequence generated by adding $\beta$ percent of outliers to standard Gaussian (zero mean, unit variance) noise as in (54). The outliers are random noise of high magnitude, generated from a uniform noise sequence. The result for Algorithm 1 is summarized in Table I. The robust estimator is obtained after three iterations with the hard limiter type $\psi$ function (Fig. 2). Note that the robust algorithm is not very sensitive to outliers, whereas the mean square error of the least squares estimator increases rapidly as $\beta$ increases.

TABLE I
COMPARISON OF ROBUST ESTIMATOR AND LEAST SQUARES ESTIMATOR FOR PERFECTLY OBSERVED CAUSAL AUTOREGRESSIVE MODEL. $\beta$ IS THE PERCENTAGE OF OUTLIERS.

         Least squares estimator       Robust estimator
$\beta$  Sample mean   M.S. error      Sample mean   M.S. error
0        0.3024        0.0031          0.3011        0.0040
1        0.2994        0.0043          0.3055        0.0034
2        0.2906        0.0050          0.2973        0.0042
3        0.2901        0.0065          0.3033        0.0045
4        0.2944        0.0081          0.3066        0.0046
5        0.2915        0.0084          0.3080        0.0048
6        0.2869        0.0091          0.3012        0.0048
7        0.2819        0.0096          0.2994        0.0050
8        0.2854        0.0115          0.3008        0.0064

Noisy Observation Case. One hundred images, each of size 10 x 10, are generated by the following equation.
$$x(i, j) = y(i, j) + \xi(i, j) \qquad (81)$$
where $y(i, j)$ is generated by (80) with standard Gaussian noise $w(i, j)$, and $\xi(i, j)$ is the outlier sequence, which occupies $\beta$ percent of the image. The result of the experiment with Algorithm 2 is summarized in Table II. The robust estimator is obtained after three iterations with Hampel's $\psi$ function in Algorithm 2. The robust estimator is much less sensitive to the outliers than the least squares estimator.

TABLE II
COMPARISON OF ROBUST ESTIMATOR AND LEAST SQUARES ESTIMATOR FOR CAUSAL AUTOREGRESSIVE MODEL WITH NOISY OBSERVATION. $\beta$ IS THE PERCENTAGE OF OUTLIERS.

         Least squares estimator       Robust estimator
$\beta$  Sample mean   M.S. error      Sample mean   M.S. error
0        0.3024        0.0031          0.3083        0.0040
1        0.2889        0.0046          0.3176        0.0034
2        0.2934        0.0050          0.3055        0.0040
3        0.3168        0.0058          0.3135        0.0041
4        0.3137        0.0057          0.3120        0.0040
5        0.3050        0.0063          0.3050        0.0041
6        0.2988        0.0076          0.3036        0.0044
7        0.2795        0.0081          0.2959        0.0055
8        0.3120        0.0082          0.3185        0.0057
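Synthetic test images of the kind used in these experiments can be generated along the following lines. This is a sketch, not the authors' generator: it assumes that an outlier replaces the Gaussian sample with probability `beta`, that outliers are uniform on `[-outlier_amp, outlier_amp]`, and that pixels outside the image are zero so that every pixel has its three causal neighbors. The function name and these parameters are hypothetical.

```python
import random


def simulate_causal_ar(m=10, n=10, coeff=0.3, beta=0.05, outlier_amp=10.0, seed=0):
    """Generate an m x n image from the three-neighbor causal AR recursion (80)."""
    rng = random.Random(seed)
    # One extra row and column of zeros act as the (assumed) boundary values.
    y = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):          # raster scan: rows, then columns
        for j in range(1, n + 1):
            if rng.random() < beta:
                noise = rng.uniform(-outlier_amp, outlier_amp)  # outlier sample
            else:
                noise = rng.gauss(0.0, 1.0)                     # regular noise
            y[i][j] = coeff * (y[i][j - 1] + y[i - 1][j] + y[i - 1][j - 1]) + noise
    return [row[1:] for row in y[1:]]  # drop the boundary row and column


image = simulate_causal_ar(seed=3)
```

Repeating the call with one hundred different seeds, and estimating the common coefficient from each image, would give sample means and mean square errors comparable in kind to those in the tables; a noisy observation can be formed by adding a separate outlier sequence to `image`.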
E. Discussions and Conclusions
Robust estimation in the causal autoregressive model is considered in this section. When the outliers are present only in the recursive generation process, the estimator is obtained by minimizing a loss function. This estimator is consistent and asymptotically normal. The iterative estimation algorithm presented in this paper converges to the robust M-estimator. For estimation with noisy observation, an iterative estimation algorithm that uses a data cleaning procedure is developed. These estimation algorithms are tested in the experiment and show the robustness of the estimator.
IV. IMAGE RESTORATION WITH ROBUST IMAGE MODELLING TECHNIQUES
A. Introduction
Restoration of an image in the presence of noise is one of the fundamental problems in image processing. Let $x(i, j)$ be the observed intensity of the original (uncorrupted) image intensity $y(i, j)$ at the location $(i, j)$, assumed to be corrupted by additive white noise $\xi(i, j)$.
To restore the image intensity $\{y(i, j)\}$ from the observation $\{x(i, j)\}$, we generally make assumptions on the noise process $\{\xi(i, j)\}$ and the original image intensity $y(i, j)$. A common assumption on the noise process is that the noise distribution is Gaussian. However, the assumption of Gaussian noise has been seriously questioned, as discussed in Section III. A more realistic assumption is that the noise is a mixture of Gaussian noise and impulse noise.
$$\xi(i, j) = \begin{cases} w(i, j) & \text{with probability } 1 - \beta \\ u(i, j) & \text{with probability } \beta \end{cases} \qquad (83)$$
where $w(i, j)$ is regular Gaussian noise, $u(i, j)$ is an outlier, and $\beta$ is the fraction of outliers, usually less than 5%. There are many image restoration methods based on the Gaussian noise assumption. Chellappa and Kashyap (1982) used a spatial interaction model to represent the image intensity array and restored images with a minimum mean square error criterion. Geman and Geman (1984) used the equivalence of Markov random fields and Gibbs distributions and restored images by a stochastic relaxation method with a maximum a posteriori criterion. Bovick et al. (1985) used an order constrained least squares method. Wu (1985) used a multidimensional Kalman filtering approach and a nonsymmetric half plane autoregressive model. Chan and Lim (1985) used a cascade of four 1-D adaptive filters in four different directions. Unfortunately, most image restoration methods based on the Gaussian noise assumption are not effective against impulse noise (Rosenfeld and Kak, 1982). The impulsive component of the noise, also called salt and pepper noise, affects only a small portion (usually less than 5%) of the image but is difficult to remove by methods based on the Gaussian noise assumption, because its amplitude is much higher than the signal amplitude. The importance of this problem has long been recognized. Traditionally, nonlinear filtering methods such as the median filter (Pratt, 1978) or the $\alpha$-trimmed mean filter (Bovick et al., 1983) are used to remove impulse noise from the image. These methods use a sliding window, and the grey level of the center pixel of the window is estimated by the median or $\alpha$-trimmed mean of the samples in the window. The grey level of the center pixel is replaced by this estimate.
These traditional nonlinear filtering methods are based on robust location estimators which use a linear combination of order statistics (robust L-estimator; Huber, 1981). Methods based on order statistics have been used in robust estimation of location parameters since the 18th century (Rey, 1983). The median and the generalized median (a linear combination of order statistics) are resistant to contamination by outliers. However, they are based on the assumption of a constant grey level in the window applied to the image. Obviously, this constant intensity assumption is inaccurate: the image intensity in a window changes continuously, especially near edges or corners. Because of this assumption, methods based on linear combinations of order statistics, such as the median filter or the $\alpha$-trimmed mean filter, have the disadvantage of producing blurred results. The blurring effect is more severe with the $\alpha$-trimmed mean filter than with the median filter because of its averaging effect, even though the mean square error of the $\alpha$-trimmed mean filter is smaller than that of the median filter. The median filter generally does better in preserving edges and corners, but the following known examples (Pratt, 1978) show that it also blurs the image. The line in Fig. 5a disappears as in Fig. 5b, and the corner in Fig. 5c becomes rounded as in Fig. 5d, after 3 x 3 median filtering.
FIG. 5. Examples of blurring effect of median filter (Pratt, 1978). (a) Original image of a line. (b) 3 x 3 median filtered image of (a). (c) Original image of a corner. (d) 3 x 3 median filtered image of (c).
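The two effects in Fig. 5 are easy to reproduce. The sketch below applies a 3 x 3 median filter to a binary image containing a one-pixel-wide line and to one containing a corner; leaving the border pixels unchanged is a choice made for this illustration, and the 7 x 7 test images are hypothetical rather than copied from the figure.

```python
import statistics


def median_filter_3x3(img):
    """3 x 3 median filter; border pixels are left unchanged in this sketch."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            window = [img[i + di][j + dj] for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            out[i][j] = statistics.median(window)
    return out


# One-pixel-wide vertical line in column 3.
line = [[1 if j == 3 else 0 for j in range(7)] for _ in range(7)]
filtered_line = median_filter_3x3(line)

# Bright quadrant whose corner is at pixel (3, 3).
corner = [[1 if (i >= 3 and j >= 3) else 0 for j in range(7)] for i in range(7)]
filtered_corner = median_filter_3x3(corner)
```

Every interior window centered on the line contains only three bright pixels out of nine, so the median is dark and the line vanishes; the window centered on the corner pixel contains four bright pixels, so the tip of the corner is removed while the pixels next to it survive.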
There are two difficulties in solving the blurring problem in traditional methods such as the median filter or the $\alpha$-trimmed mean filter. First, the intensity function in the window applied to the image is unknown and difficult to represent by a simple function. Second, the linear combination of order statistics used in traditional methods cannot easily accommodate the effect of changing intensity. A facet model based approach (Yasuoka and Haralick, 1983) has been proposed to reduce the blurring effect after removing impulse noise, but it is based on the least squares estimator, which is not robust to impulse noise. We propose a restoration method which uses a statistical image model to represent the changing intensity and a type of robust method, the so-called M-estimator. We can use one of the image models mentioned earlier to represent the intensity change in a window of the original image. The parameters of the image model can be estimated by a robust M-estimator as shown in Section III. The robust M-estimator of the causal autoregressive model can be obtained by the iterative algorithm given in Section III. This estimation algorithm includes a data cleaning procedure at each iteration, which reduces the outliers in the observed data. The convergence property of the robust parameter estimation algorithm is also discussed in Section III. The image data become closer to noise free as the number of iterations increases, because the parameter estimates converge with the M-estimator of the causal autoregressive model. By this data cleaning procedure, we can obtain an image from which most of the impulse noise has been removed while the original sharpness of the edges is preserved. The iterative data cleaning procedure converges relatively fast in our experiment. In most of our experiments, the data cleaning procedure converges after only three iterations, with almost noise-free results.
The restoration algorithm based on the robust estimation algorithm has many advantages over traditional methods such as the median filter or the $\alpha$-trimmed mean filter. The comparison with other methods is discussed later.
B. Previous Robust Filters
There has been growing interest in robust methods in the signal and image processing area in recent years. A good review of applications of robust procedures in signal and image processing can be found in Kassam and Poor (1985). Image restoration in the presence of impulse noise is one of the important applications of robust statistical procedures. This problem is an extension of the robust (resistant) estimation problem introduced by Tukey (1971), where we want to estimate parameters robustly from data which contain a small fraction of outliers.
Consider a small window taken from a relatively large image. If the window is sufficiently small, then the intensity change can be assumed small, and the intensity function in the window can be approximated by a constant with some sacrifice of resolution. This is the justification for applying a robust location estimator in traditional restoration procedures. A one-dimensional observation sequence can be obtained from the two-dimensional intensity observations in the window. Let these one-dimensional observations be $y_1, \ldots, y_n$. Under the constant intensity assumption they can be represented by (84).
$$y_i = k + \zeta_i, \qquad i = 1, \ldots, n \qquad (84)$$
where $\{\zeta_1, \ldots, \zeta_n\}$ is a zero mean, white noise sequence with a small fraction of outliers (impulse noise). This robust location parameter estimation problem has been studied by many statisticians (Huber, 1981; Rey, 1983), and several types of methods are widely used in statistics and signal processing. Among these, the maximum likelihood type estimator (M-filter) and the linear combination of order statistics estimator (L-filter) have been used in previous image restoration studies (Kassam and Poor, 1985). The L-filter has been widely used for many years, and many of its properties have been investigated (Huber, 1981). The L-filter is easier to construct than the M-filter, which requires solving a minimization problem, but the L-filter needs ordering and takes more time in this ordering process than the M-filter. The M-filter, in contrast, has not been a popular method in previous image restoration studies, even though it has some advantages over the L-filter. Recently a limiter type M-filter (LTM filter) was used by Lee and Kassam (1985), but no experiment with real contaminated images was performed.
1. The L Filter
Let $y_{(1)} < \cdots < y_{(n)}$ be the ordered observations of the samples $y_1, \ldots, y_n$. Then the L-filter is defined by a linear combination of these order statistics.
In order that the estimator is unbiased, $\{a_i\}$ should satisfy the following equation.
$$\sum_{i=1}^{n} a_i = 1$$
Median Filter. One well-known L-filter is the median filter. The output of this filter is given by the median of the observations. It is based on the fact that the median is resistant to outliers. It is given by
$$y_{\text{median}} = y_{([n/2])} \qquad (87)$$
The robustness of the median filter is well known, and it has been used for many years in signal and image processing to suppress impulse noise. The median filter preserves edges and corners relatively well compared to other conventional approaches such as mean filters, but it produces unnatural looking images.
$\alpha$-Trimmed Mean Filter. Another example of an L-filter is the $\alpha$-trimmed mean filter.
$$y_{\alpha} = \frac{1}{n - 2[n\alpha]} \sum_{i=[n\alpha]+1}^{n-[n\alpha]} y_{(i)} \qquad (88)$$
where $\alpha$ is a fractional number in the range $[0, \tfrac{1}{2})$ and $[n\alpha]$ is the largest integer $\le n\alpha$. That is, we throw out the outer $[n\alpha]$ observations on either side and take the average of the rest. The $\alpha$-trimmed mean has been known since the 18th century (Rey, 1983) and was applied to image restoration recently (Bovick et al., 1983). The $\alpha$-trimmed mean filter provides intermediate behavior between the mean and the median. This filter is better than the median filter under the mean square error criterion, but it gives a more blurred image than the median filter. Human eyes are known to be more sensitive to edges and corners, and the $\alpha$-trimmed mean filter often produces worse results than the median filter under a qualitative criterion. Another difficulty with this method is in choosing the value of $\alpha$. One approach is to choose $\alpha$ from the distribution of the noise (Bovick et al., 1983), but this approach is unrealistic, because the noise distribution cannot be known in advance.
Other Types of L-filters. There are many different L-filters which have characteristics similar to those of the $\alpha$-trimmed mean. For example, the modified trimmed mean (MTM) (Lee and Kassam, 1985), Edgeworth's location estimator (Rey, 1983), and Tukey's trimean (Rey, 1983) are weighted averages of order statistics. The modified trimmed mean filter is a generalization of the $\alpha$-trimmed mean filter which can select the value of $\alpha$ adaptively. This adaptability removes the difficulty of choosing $\alpha$ optimally in the $\alpha$-trimmed mean filter, and the filter is better than the $\alpha$-trimmed mean filter in preserving edges (Lee and Kassam, 1985). However, this filter also loses fine details of the original image, because it is based on the constant intensity assumption. Although these L-filters have been used for many years in signal and image processing, they have some disadvantages compared with the M-filter. First, these L-estimators are computationally expensive, because ordering the samples takes a long computation time.
Second, it is difficult to apply this L-estimator to problems other than the location parameter estimation problem.
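The L-filters described above reduce to sorting the window samples and averaging a central portion of them. The following sketch shows the $\alpha$-trimmed mean for a single window; the function name and the sample window (which contains one white and one black impulse) are illustrative.

```python
import math
import statistics


def alpha_trimmed_mean(samples, alpha):
    """Drop the floor(n * alpha) smallest and largest samples, average the rest."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 1/2)")
    ordered = sorted(samples)          # the ordering step that dominates the cost
    n = len(ordered)
    k = math.floor(n * alpha)          # number of samples trimmed on each side
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


# A 3 x 3 window flattened to nine samples, with impulses 255 and 0.
window = [10, 11, 9, 10, 255, 10, 12, 0, 10]

plain_mean = alpha_trimmed_mean(window, 0.0)   # alpha = 0 gives the ordinary mean
trimmed = alpha_trimmed_mean(window, 0.2)      # trims one sample on each side
center = statistics.median(window)
```

With `alpha = 0` the estimate is the ordinary mean and is dominated by the white impulse; trimming one sample on each side removes both impulses, and the median ignores them as well.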
2. The M Filter
The M-filter is defined as a solution of the following equation.
$$\sum_{i=1}^{n} \psi(y_i - k) = 0$$
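Because a monotone $\psi$ makes the left-hand side non-increasing in $k$, the defining equation can be solved by bisection without sorting the samples. The sketch below assumes a clipped-identity $\psi$ with limit `p` and a known unit scale; both are simplifications made for illustration, and the function names are hypothetical.

```python
def psi_limiter(x, p=2.0):
    """Assumed limiter psi: identity on [-p, p], clipped outside."""
    return max(-p, min(p, x))


def m_filter(samples, p=2.0, tol=1e-9):
    """Solve sum(psi(y_i - k)) = 0 for k by bisection on [min, max] of the samples."""
    lo, hi = min(samples), max(samples)

    def score(k):
        return sum(psi_limiter(y - k, p) for y in samples)

    # score(k) is non-increasing in k, non-negative at lo and non-positive at hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# The same nine-sample window as a median or trimmed mean filter would see.
window = [10, 11, 9, 10, 255, 10, 12, 0, 10]
estimate = m_filter(window)
```

For this window the two impulses contribute `+p` and `-p` and cancel, so the solution is the average of the seven remaining samples. In practice the residuals would first be divided by a scale estimate, since the filter is not scale invariant.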
The $\psi$-function should be bounded, continuous, and odd (antisymmetric about the origin). The weight function $\psi$ is essential for robustness.
Limiter Type M Filter (LTM Filter). The LTM filter is an M-filter with the following limiter type $\psi$ function.
$$\psi(x) = \begin{cases} g(p), & x > p \\ g(x), & |x| \le p \\ -g(p), & x < -p \end{cases} \qquad (90)$$
where $g(x)$ is a strictly increasing, odd, continuous function, and $p$ is some positive constant. Notice that the boundedness of the $\psi$-function makes the estimator robust. The M-filter is scale variant, and the scale parameter needs to be estimated. The filter coefficient $p$ in the LTM filter depends on the scale parameter. Even though several types of scale parameter estimators are available, they are hardly used by Lee and Kassam (1985), and no experiment with the M-filter has been performed in previous studies.
C. Intensity Representation for Restoration
The objective of the restoration problem is to estimate the original image intensity $y(i, j)$ from the given sequence $x(i, j)$. In most images, $y(i, j)$ changes over pixel locations $(i, j)$. However, in the traditional approaches (median filter or $\alpha$-trimmed mean filter), $y(i, j)$ is assumed constant and a simple robust location estimation method is applied. For the restoration of an image from its corrupted version, an image representation method is needed. One simple method is to assume constant intensity over small sliding windows. This method is used in traditional image restoration methods, such as median filters or $\alpha$-trimmed mean filters. However, this approximation is unrealistic, and methods which use it give blurred images, as discussed in Section IV.A. Markov random field models are widely used to represent an image in restoration from Gaussian noise contamination. For example, Chellappa and Kashyap (1982) used simultaneously an autoregressive model and a conditional Markov model, Wu (1985) used a nonsymmetric half plane autoregressive model, and Geman and Geman (1984) used a family of Markov models. We use the causal autoregressive model to represent the image intensity array. The causal autoregressive model is one of the classical short correlation models and is a generalization of one-dimensional autoregressive models. This model is simple but has good modelling performance, as shown in previous studies (Kashyap, 1981). Robust estimation in causal autoregressive models is explained in Section III. Let $(i, j)$ be an index for the coordinate location and $y(i, j)$ be the intensity at the location $(i, j)$. Then the three-neighbor causal autoregressive model is represented by the following equation.
$$y(i, j) = \theta^T z(i, j) + \zeta(i, j) \qquad (91)$$
where $\theta$ is a parameter vector, $\{\zeta(i, j)\}$ is a two-dimensional white noise sequence with outliers as in (83), and $z(i, j)$ is a vector consisting of the intensities of the three causal neighbors and unity. The last element of the vector $z(i, j)$ is used to represent the constant grey level in the image.
$$z(i, j) = [\,y(i, j-1),\ y(i-1, j),\ y(i-1, j-1),\ 1\,]^T \qquad (92)$$
It is assumed that every pixel has all of its neighbors, i.e., for each pixel at $(i, j)$, the pixels at $(i, j-1)$, $(i-1, j)$ and $(i-1, j-1)$ are available. We assume that the observation $x(i, j)$ of the process $y(i, j)$ is corrupted by noise $\xi(i, j)$. It is given by the following equation.
$$x(i, j) = y(i, j) + \xi(i, j) \qquad (93)$$
The noise process $\xi$ is assumed to contain outliers.
D. Image Restoration Algorithm
The purpose of image restoration is to remove noise, including impulse noise, from the image. The image degradation process can be represented by the following equation.
$$x = y + \xi \qquad (94)$$
where $x$ is the observation, $y$ is the original image intensity, and $\xi$ is the noise process with outliers. Image restoration involves estimation of the original intensity $y$ from the observation $x$. For a small image, the original image intensity can be modelled by a causal autoregressive model. If the original image intensity indeed obeys a causal autoregressive model, then it can be recovered by the robust estimation algorithm for the noisy observation case (Algorithm 2). The data cleaning procedure in Algorithm 2 removes outliers at each iteration without degrading the original signal. The restoration method based on the robust image model has an advantage over conventional methods such as the median filter or the $\alpha$-trimmed mean filter: it does not blur images after restoration. Conventional methods replace every pixel by its location estimate. Because these methods are based on the constant intensity assumption, the details of the original image are significantly blurred. The procedure at each iteration is described in the following block diagram (Fig. 6), and the algorithm is also summarized below.
[Block diagram: $y^{(k)}$ enters Estimation, which produces $\theta^{(k)}$ and $\sigma^{(k)}$; Compute Residual produces $r^{(k)}$; Data Cleaning produces $y^{(k+1)}$; the loop is iterated.]
FIG. 6. Block diagram of image restoration method at each iteration. $y^{(k)}$ and $y^{(k+1)}$ are cleaned data at the k-th and (k + 1)-th iterations respectively, $\theta^{(k)}$ and $\sigma^{(k)}$ are parameter estimates obtained by Algorithm 1, and $r^{(k)}$ is the residual.
Image Restoration Algorithm:

1. Divide the image into small (8 × 8) windows and apply the procedures in steps 2-6 to each window.
2. Let $\{x(i,j)\}$ represent the given noisy data in the window and $\{y^{(k)}(i,j)\}$ represent the cleaned data at the k-th iteration. Initially, $y^{(0)}(i,j) = x(i,j)$ for all $(i,j)$. Compute the initial estimators $\theta^{(0)}$ and $\sigma^{(0)}$ by the least squares method, where $m$ and $n$ are the row and column dimensions of the image and $z^{(k)}(i,j)$ is the state vector formed from the cleaned data at the causal neighbors of $(i,j)$, beginning with $y^{(k)}(i,j-1)$.
3. Consider the k-th iteration, k > 0. Compute the residuals $r^{(k)}(i,j)$ and the modified residuals $\tilde r^{(k)}(i,j)$ with the estimated parameters computed in step 2, for all pixels in the window.

$$r^{(k)}(i,j) = y^{(k)}(i,j) - \theta^{(k)T} z^{(k)}(i,j) \tag{97}$$

The modified residual is computed from $r^{(k)}(i,j)$ through $\psi$, a bounded and continuous function as discussed in Section III (e.g., Hampel's redescending $\psi$-function).

4. Restore the image by the following rule.

$$y^{(k+1)}(i,j) = \theta^{(k)T} z^{(k)}(i,j) + \tilde r^{(k)}(i,j) \tag{99}$$
5. Update the estimators of the parameter $\theta$ and the scale parameter $\sigma^2$.
6. Repeat steps 3-5 until the difference between estimates in successive iterations becomes small.

E. Experimental Results

The restoration algorithm based on the robust modelling approach is applied to five different pictures, shown in Fig. 7. Figure 7a is a 256 × 256 picture of a bridge. Figure 7b is a 256 × 256 picture of the face of a monkey. Figure 7c is a 256 × 256 picture of a girl. Figure 7d is a 256 × 256 picture of an outdoor scene. Figure 7e is a 512 × 512 aerial picture of the Purdue University campus. All of these pictures are digitized into 256 grey levels. To measure the performance of different algorithms on the noisy pictures, contaminated
FIG.7. Originals.
images are constructed by adding both Gaussian(0, 100) noise and 5% impulse noise to the originals given in Fig. 7. The generated impulse noise has only two grey levels, 0 (black) and 255 (white), both with the same probability. In the robust model-based algorithm, Hampel's ψ-function is used in all experiments. The experiments are designed to clarify three different aspects of the restoration process. First, the convergence of the restoration algorithm is shown with these noisy pictures and the rate of convergence is measured experimentally. Second, the mean square errors of three different restoration algorithms, namely, the model-based algorithm, the median filter, and the α-trimmed mean filter, are compared for different window sizes and different images. Third, the overall performance of the three restoration algorithms is compared qualitatively for different noisy images.

Convergence of Image Restoration Algorithm. The robust model-based restoration algorithm is applied to the contaminated images. The mean square error of the cleaned image is computed at each iteration. Figures 8a, 8b, and 8c are plots of mean square error versus the number of iterations for the outdoor scene (Fig. 7d), the girl's image (Fig. 7c) and the
bridge scene (Fig. 7a), respectively. Contaminated pictures are made by adding Gaussian(0, 100) noise and 5% impulse noise to the images in Fig. 7. The initial mean square errors are very large in all cases because of the additive noise, but they decrease rapidly in the first two iterations. The mean square error stabilizes in fewer than three iterations. The convergence of the data cleaning method is also fast (fewer than three iterations).

Mean Square Error Comparison of Image Restoration Methods. Four different types of image restoration methods with different window sizes, 3 × 3, 5 × 5, and 7 × 7, are used in this experiment. These are the mean filter, the median filter, the α-trimmed mean filter with the trimming ratio α = 0.15, and the robust model-based method. Note that the popular choice of α is in the range from 0.1 to 0.15, and the α-trimmed mean filter performed best with α = 0.15 in our experiment. In the case of the robust model-based method, a fixed window size of 8 × 8 is used. The choice of 8 × 8 is made for convenience; a small change of window size would not adversely affect the performance, because the fitted image model would not change significantly. Four contaminated images are obtained from the originals in Fig. 7 by the same procedure explained in the preceding section. The restoration methods discussed above are applied to these contaminated images, and the mean square errors of the restored images are computed. In the case of the median filter, mean filter, and α-trimmed mean filter, the mean square error is computed for different window sizes, but in the case of the robust model-based method, the plotted mean square error is for the fixed window size 8 × 8. The computed mean square error is plotted with respect to window size. Figures 9a, 9b, 9c, and 9d are plots of the mean square error computed by the different methods for the outdoor scene (Fig. 7d), the girl's image (Fig. 7c), the bridge scene (Fig. 7a), and the aerial picture of the Purdue University campus (Fig. 7e), respectively. The results are consistent for all the different types of images. All traditional methods result in relatively large values of mean square error on most of the images, especially on the images having many edges. For example, in the outdoor scene, the minimum values of the mean square error of the mean filter, median filter, and α-trimmed mean filter are 690.1441, 651.1638, and 220.2222, respectively. In contrast, the mean square error of the robust model-based method is 103.9669. The difference, which is significant, corresponds to the fact that the intensity in a window cannot be approximated by a constant because of the edges and corners. Traditional methods attain their smallest mean square error at the window size 3 × 3 or 5 × 5, depending on the type of image. The mean filter performs worst on all images tested, as expected, and the median filter has a slightly lower mean square error than the mean filter; the α-trimmed mean filter performs better than the median filter or mean filter, but its mean square error is always larger than
FIG. 9. Mean square error comparisons of different methods. (a) Comparison for outdoor scene. (b) Comparison for girl image. (c) Comparison for bridge picture. (d) Comparison for Purdue campus. [Each panel plots mean square error against window size (3, 5, 7) for the mean, median, and α-trimmed mean filters; the robust method uses the fixed window size 8.]
that of the robust model-based method on all images tested. The mean square error comparison shows that the robust model-based method performs better than any of the conventional methods on the tested images. The minimum values of mean square error for the conventional methods are 220.2222 for the outdoor scene, 80.6720 for the girl, 92.1115 for the bridge, and 253.7658 for the Purdue campus. The mean square errors of our approach are 103.9669 for
TABLE III
MEAN SQUARE ERROR COMPARISON OF DIFFERENT RESTORATION METHODS ON FOUR DIFFERENT TYPES OF IMAGES

Image      Robust model method   Mean filter   Median filter   α-TM filter
Outdoor    103.9669              690.1441      651.1638        220.2222
Girl        52.5648              318.9122      300.3172         80.6720
Bridge      47.3367              264.6290      216.3370         92.1115
Campus     189.1433              453.9291      401.6255        253.7658
the outdoor scene, 52.5648 for the girl, 47.3367 for the bridge, and 189.1443 for the Purdue campus. The mean square error of the conventional methods is always higher than that of the robust model-based method. The detailed comparison is summarized in Table III.

Qualitative Comparison of Image Restoration Methods. The noisy images and the images restored by the different restoration algorithms are shown in Figs. 10-14. Figures 10-14 are results on the originals of Fig. 7a-e in the same order. The upper left corner of each picture of Figs. 10-14 is the noisy picture, generated by adding white Gaussian(0, 100) noise and 5% impulse noise to the original. This image shows a typical salt and pepper noise pattern as well as Gaussian noise degradation. This noisy picture is used to obtain the restored images by the different methods. The upper right corner of each picture in Figs. 10-14 is the image restored by the robust model-based method. This image is obtained after three iterations of the data cleaning process. The impulsive noise is almost completely absent and the residual Gaussian noise is hardly noticeable. The fine details of the restored image are well preserved. As a matter of fact, almost all details of the original in Fig. 7 are still clearly visible in this picture. For example, the guy wire of the bridge (Fig. 10), the hair of the monkey's face (Fig. 11), the eyes of the girl's face (Fig. 12), the leaves of the trees (Fig. 13), etc. have sharp edges as in the original. This result shows the important ability of the image model-based approach: it preserves edges and corners while also removing noise more effectively. The lower left corner of each picture of Figs. 10-14 is the image restored by the median filter with a 5 × 5 window. Note that the 5 × 5 window, like the 3 × 3 window, gave the lowest mean square error in the experiment of the preceding section.
Most of the impulse noise is removed in these pictures, but they are much more blurred than the results of the robust model-based method. This blurring effect can be more easily observed in the images with many edges and corners than in the images with large areas with constant grey levels. Guy wire
FIG. 10. Qualitative comparison for bridge picture. Most of the details, such as the guy wire, are clearly shown in the result of the model-based approach, but are not clear in others. (a) Contaminated image. (b) Robust model approach. (c) Median filter. (d) α-trimmed mean filter.
and details of the bridge frame (Fig. 10), hairs and eyes of the monkey's face (Fig. 11), eyes and mouth of the girl's image (Fig. 12), leaves of the tree, details of the car and windows of the house in the outdoor scene (Fig. 13), and most details in the aerial picture (Fig. 14) are blurred and cannot be observed in these median filtered images. The regions with small intensity variations are replaced by constant grey levels and the transitions between different regions are rather abrupt. This effect is typical of the median filter because the median filter fails in smoothing images. These effects can be observed in the tower region of the bridge (Fig. 10), the windows and wheels of the car (Fig. 13), etc. The lower right corner of each picture of Figs. 10-14 is the image restored by the α-trimmed mean filter with a 5 × 5 window and α = 0.15. Note that the
FIG. 11. Qualitative comparison for monkey picture. Most of the details, such as hair, eyes, etc., are clearly shown in the result of the model-based approach, but are not clear in others. (a) Contaminated image. (b) Robust model approach. (c) Median filter. (d) α-trimmed mean filter.
choice of α = 0.15 is considered a good choice in previous studies (Rey, 1983; Bickel, 1977). Even though the α-trimmed mean filter has a lower mean square error than the median filter, the image restored by the α-trimmed mean filter is more blurred than that restored by the median filter. Edges and corners of the image convey more information to human perception, and because of this, the image restored by the α-trimmed mean filter is worse than that restored by the median filter in the visual comparison even though it has a smaller mean square error. For example, the tower and guy wire in the bridge (Fig. 10), the hairs and eyes of the monkey's face (Fig. 11), the eyes and flowers in the girl's image (Fig. 12), most of the tree, car, and shrubs in the outdoor scene (Fig. 13), and most details of the aerial picture (Fig. 14) are blurred. It is also not successful in removing impulse noise and leaves considerable residual noise caused by impulse noise. This residual noise can be observed in all images (Figs. 10-14).

FIG. 12. Qualitative comparison for girl picture. Most of the details, such as flowers, eyes, etc., are clearly shown in the result of the model-based approach, but are blurred in others. (a) Contaminated image. (b) Robust model approach. (c) Median filter. (d) α-trimmed mean filter.

F. Discussions and Conclusions
An image restoration method based on the robust modelling approach is presented in this section. This method can restore images contaminated by impulse noise as well as Gaussian noise, and has the advantage of preserving all the details of the original image. This ability comes from modelling the image intensity function by the three neighbor causal
FIG. 13. Qualitative comparison for outdoor scene. Most of the details, such as leaves, windows, etc., are clearly shown in the result of the model-based approach, but are blurred in others. (a) Contaminated image. (b) Robust model approach. (c) Median filter. (d) α-trimmed mean filter.
autoregressive model. The experiments show that this method is fast, because it does not need any ordering of data and converges quickly (in three iterations). The mean square error of the images restored by the robust model-based method is smaller than that of images restored by any of the conventional methods. Qualitative comparisons show the superior performance of the robust model-based method. The details of the image are well recovered and all the impulsive noise is removed, whereas conventional methods blur these details and cannot remove all impulse noise from some images.
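The per-window iteration of the restoration algorithm (steps 2-6) can be illustrated with the rough sketch below. Several parts are assumptions made only for illustration and are not taken from the text: the state vector is taken to be the three causal neighbors with no constant term, the modified residual is taken as $\sigma\,\psi(r/\sigma)$, the Hampel constants (1.7, 3.4, 8.5) are arbitrary, the scale is a MAD estimate, and the robust parameter update of step 5 is replaced by a plain least squares re-fit on the cleaned data. All function names are hypothetical.

```python
import numpy as np

def hampel_psi(x, a=1.7, b=3.4, c=8.5):
    """Hampel's three-part redescending psi-function (constants assumed)."""
    ax = np.abs(x)
    out = np.where(ax <= a, x, a * np.sign(x))
    out = np.where(ax > b, a * np.sign(x) * (c - ax) / (c - b), out)
    return np.where(ax > c, 0.0, out)

def state_vectors(y):
    """Causal neighbors (i, j-1), (i-1, j), (i-1, j-1) of each interior pixel."""
    z = np.stack([y[1:, :-1], y[:-1, 1:], y[:-1, :-1]], axis=-1)
    return z.reshape(-1, 3)

def fit(y):
    """Least squares AR fit; returns theta, scale, state vectors, residuals."""
    z = state_vectors(y)
    target = y[1:, 1:].ravel()
    theta, *_ = np.linalg.lstsq(z, target, rcond=None)
    resid = target - z @ theta
    # MAD-based scale: a stand-in for the chapter's scale estimator.
    scale = max(1.4826 * np.median(np.abs(resid)), 1e-12)
    return theta, scale, z, resid

def restore_window(x, n_iter=3):
    """Iteratively clean one window; the first row and column are left as is."""
    y = x.astype(float).copy()
    for _ in range(n_iter):
        theta, scale, z, resid = fit(y)
        # Cleaned value = prediction + modified residual (assumed form).
        cleaned = z @ theta + scale * hampel_psi(resid / scale)
        y_next = y.copy()
        y_next[1:, 1:] = cleaned.reshape(y[1:, 1:].shape)
        y = y_next
    return y
```

Because ψ is bounded and redescending, a pixel whose residual is many scale units away from zero is pulled back to the model prediction, while pixels with small residuals are returned essentially unchanged.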
FIG. 14. Qualitative comparison for Purdue campus. Most of the details, such as roads, buildings, etc., are clearly shown in the result of the model-based approach, but are blurred in others. (a) Contaminated image. (b) Robust model approach. (c) Median filter. (d) α-trimmed mean filter.
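The contaminated images and the mean square error used throughout Section IV.E can be reproduced in spirit with the short sketch below. It assumes that Gaussian(0, 100) denotes zero mean and variance 100, that impulses replace the pixel value outright, and that noisy values are clipped to the range 0-255; the function names and the fixed seed are illustrative choices.

```python
import numpy as np

def contaminate(image, noise_var=100.0, impulse_frac=0.05, seed=0):
    """Add Gaussian noise, then overwrite a fraction of pixels with 0 or 255."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(float) + rng.normal(0.0, np.sqrt(noise_var), image.shape)
    noisy = np.clip(noisy, 0.0, 255.0)  # clipping is an assumption
    impulses = rng.random(image.shape) < impulse_frac
    values = rng.choice([0.0, 255.0], size=image.shape)  # equal probability
    noisy[impulses] = values[impulses]
    return noisy

def mean_square_error(original, restored):
    """Average squared grey-level difference between two images."""
    diff = original.astype(float) - restored.astype(float)
    return float(np.mean(diff ** 2))
```

With this measure, an image contaminated as above starts with an error far larger than the Gaussian variance alone, because each impulse contributes a squared error on the order of the full grey-level range.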
V. COMPOSITE EDGE DETECTION

A. Introduction
Edge detection is not only an important topic in image processing in its own right, but also a tool for the important problem of image segmentation. The traditional methods of edge detection based on the windows of Roberts,
Prewitt and Sobel (Rosenfeld and Kak, 1982) are based on the fact that there is a sharp change in the intensity on either side of an edge pixel. We can call these types of edges step edges. Instead of the step function, we can employ other types of functions like the roof function (Brady, 1982) to characterize the local intensity behavior near the edge. In recent times there have been attempts at characterizing and detecting edges by considering the intensity density over a broad area around the edge pixels. Examples of these methods are the Laplacian on Gaussian operator (Marr and Hildreth, 1980), the difference of Gaussians (DOG) (Wilson and Bergen, 1979), the facet model-based methods (Haralick, 1984), and the causal autoregressive model-based methods (Zhou and Chellappa, 1986). However, there is another mechanism of creation of an edge which has recently received some attention. Consider the pixels which are at the boundary of two textures, say cotton canvas and raffia. There is no sharp intensity change at the boundary, yet everyone will perceive the existence of a sharp edge at the boundary of the two textures. We can characterize these edge pixels as texture edges. Recently there has been considerable interest in developing methods which can detect all the texture boundaries in a scene involving several textures (Kashyap and Eom, 1985a). These algorithms effectively locate most of the boundaries between the textures which are perceived by a human observer. Of course, any real life image such as an outdoor scene or an airport scene will have both intensity edges and texture edges. When we apply the methods mentioned earlier for detecting edges on outdoor scenes, the final result is not satisfactory for several reasons. For instance, the Laplacian on Gaussian approach or the facet model approach yields a lot of micro edges corresponding to the leaves of a tree or the inside of a shrub in the house image.
These micro edges do not convey much information and only add to the confusion, while the edges of runways or highways are often smeared. The texture boundaries are never sharply delineated. These methods cannot distinguish between the edges within a texture, like the wood texture, and the boundary between two textures, say wood and cork. The texture based algorithms also have their limitations. Since the size of the windows or masks needed to detect or discriminate between textures is much bigger than that used in the other methods, sharp edges like highways or runways in the airport are missed by these algorithms. The purpose of this section is to develop a composite edge detection approach which can detect all types of edges, including intensity edges and texture edges. We employ a two stage approach. In the first stage, we use an algorithm which determines all the possible pixels in an image which are
FIG. 15. Block diagram of the composite edge detection algorithm. [Blocks: Original Image, Edge Hypothesis Generation, Texture Edge Test, Intensity Edge Test, Final Detected Edges.]
potential edges (either intensity edge or texture edge). In addition, the algorithm gives the direction of the potential edge. In the second stage, we submit each candidate edge pixel to two procedures, one of which is designed to test whether the candidate edge pixel is a texture edge or not, and the other designed to test whether the candidate edge is an intensity edge. We accept only those edges which pass at least one of the two tests. The procedure for testing for the texture edge is a likelihood approach based on a causal autoregressive model. The procedure for testing for a step edge is fairly conventional. The comprehensive algorithm presented here was applied to several images, both synthetic as well as real life images. The synthetic images are checkerboard images involving two different textures alternately. Each texture has its own internal structure. The other two images are the outdoor scene and the airport image. We give the results of our algorithm. To bring out the highlights of our approach, we also give the results of the two popular edge detection approaches in recent literature, namely the Laplacian on Gaussian method and the facet model approach, for all four images. The overall approach is given in Fig. 15.
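The acceptance logic of the second stage, in which a candidate is kept if it passes at least one of the two tests, can be written down in a few lines. In this sketch the candidate representation (row, column, quantized direction) and the two test callables are hypothetical placeholders for the procedures described in the following subsections.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical candidate record: (row, column, quantized edge direction).
Candidate = Tuple[int, int, int]

def confirm_edges(
    candidates: Iterable[Candidate],
    is_texture_edge: Callable[[Candidate], bool],
    is_intensity_edge: Callable[[Candidate], bool],
) -> List[Candidate]:
    """Keep the candidates that pass at least one of the two tests."""
    return [c for c in candidates
            if is_texture_edge(c) or is_intensity_edge(c)]
```

A candidate rejected by both tests is treated as a spurious edge and dropped from the final edge map.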
B. Edge Hypothesis Generation (Algorithm 1)

As indicated in Fig. 15, the first step in the composite edge detection algorithm is identifying all pixels which are potential edge pixels. In this process, all potential edge pixels should be detected whether they are step edges, roof edges, or texture edges. Intensity edges, such as step edges or roof edges, have abrupt changes of intensity at the edge pixels and these can be detected by a derivative operator. An intensity transition is also involved at the texture boundary, as well as at microedges inside each texture, and it can be detected by a derivative operator. The algorithm used here is based on directional derivatives. We use 3 × 3 masks so that the edge pixels detected here
are relatively sharp. Large mask operators are not adequate because they yield potential edge pixels which are situated away from the actual or true edge pixels. Let $g(x, y)$ be the image intensity at position $(x, y)$. The first order directional derivative is given by the following equation.

$$g'_\alpha = \frac{\partial g}{\partial x}\cos\alpha + \frac{\partial g}{\partial y}\sin\alpha \tag{103}$$

where $\partial g/\partial x$ and $\partial g/\partial y$ are the partial derivatives of $g$ in the $x$ and $y$ directions, which can be obtained by convolving $g$ with the 3 × 3 differencing operators $D_x$ and $D_y$ of (104a) and (104b), whose entries are $-1$, $0$, and $1$. The angle of gradient direction is

$$\alpha = \tan^{-1}\left(\frac{\partial g/\partial y}{\partial g/\partial x}\right)$$

Likewise, the second order directional derivative is given by the following equation.

$$g''_\alpha = \frac{\partial^2 g}{\partial x^2}\cos^2\alpha + 2\,\frac{\partial^2 g}{\partial x\,\partial y}\sin\alpha\cos\alpha + \frac{\partial^2 g}{\partial y^2}\sin^2\alpha$$

where the second order partial derivatives $\partial^2 g/\partial x^2$, $\partial^2 g/\partial x\,\partial y$, and $\partial^2 g/\partial y^2$ are obtained by convolving $g$ with the second order differencing operators $D_{xx}$, $D_{xy}$, and $D_{yy}$ of (108a)-(108c).

An edge hypothesis is made at the pixel whose first directional derivative has a magnitude larger than a threshold $t$, and whose corresponding second order directional derivative is negative, i.e.,

$$|g'_\alpha| > t \quad \text{and} \quad g''_\alpha < 0 \tag{109}$$
Note that the Prewitt operator is a special case of this directional derivatives method; the Prewitt operator does not involve the second order directional derivative. The angle of the first derivative is given by

$$\alpha = \tan^{-1}\left(\frac{\partial g/\partial y}{\partial g/\partial x}\right)$$

and it can be any value between 0 and 360 degrees. The angle of edge direction is quantized into 4 directions as defined in Table IV so that horizontal or vertical directional strips can be applied.

TABLE IV
QUANTIZATION RULE FOR ESTIMATED EDGE DIRECTION

Gradient angle (degrees)   Approximated direction   Type of strip
315-45                     0                        horizontal
45-135                     2                        vertical
135-225                    4                        horizontal
225-315                    6                        vertical

Around each potential edge pixel, a
m × 2n strip (5 × 16 is used in this experiment) is constructed so that the strip is perpendicular to the approximated edge direction (Fig. 16). For each potential edge pixel at the center of the strip, the following null hypothesis $H_0$ is assigned.

$H_0$: An edge exists in the given direction

The above hypothesis is tested by applying decision rules to the image strip. The details of the tests are given later.

C. Confirming the Presence of Edges (Algorithm 2)
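The candidates to be confirmed come from the hypothesis generation step of Section VB, which can be sketched as follows before turning to the confirmation tests. The 3 × 3 masks used here (Prewitt-type first order masks and simple second order difference masks) are assumptions standing in for the chapter's operators, x is taken along columns and y along rows, and the half-open angle intervals are one possible reading of Table IV. The function names are hypothetical.

```python
import numpy as np

# Assumed 3 x 3 differencing masks; not the chapter's own operators.
D_X = np.array([[-1, 0, 1]] * 3) / 6.0
D_Y = D_X.T
D_XX = np.array([[1, -2, 1]] * 3) / 3.0
D_YY = D_XX.T
D_XY = np.array([[1, 0, -1], [0, 0, 0], [-1, 0, 1]]) / 4.0

def apply_mask(img, mask):
    """Apply a 3 x 3 mask at every interior pixel; borders stay at zero."""
    h, w = img.shape
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            out[1:-1, 1:-1] += mask[di, dj] * img[di:h - 2 + di, dj:w - 2 + dj]
    return out

def edge_hypotheses(img, t):
    """Return a boolean candidate map and the quantized direction codes."""
    g = img.astype(float)
    gx, gy = apply_mask(g, D_X), apply_mask(g, D_Y)
    gxx, gyy, gxy = apply_mask(g, D_XX), apply_mask(g, D_YY), apply_mask(g, D_XY)
    alpha = np.arctan2(gy, gx)                      # gradient direction
    c, s = np.cos(alpha), np.sin(alpha)
    first = gx * c + gy * s                         # first directional derivative
    second = gxx * c * c + 2 * gxy * s * c + gyy * s * s
    candidates = (np.abs(first) > t) & (second < 0)
    angle = np.degrees(alpha) % 360.0
    # Codes 0, 2, 4, 6 for the four 90-degree sectors centered on 0, 90, 180, 270.
    direction = (2 * (((angle + 45.0) // 90.0) % 4)).astype(int)
    return candidates, direction
```

Direction codes 0 and 4 would select a horizontal strip and codes 2 and 6 a vertical strip, following Table IV.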
The potential edge pixels selected by the edge hypothesis generation process given in Section VB are not the final edge pixels. Each potential edge can be either an intensity edge, a texture edge, or a spurious edge (microedge) caused by intensity changes inside a texture. We want to keep only valid edges, such as intensity edges or texture edges; microedges (spurious edges) need to be deleted from the potential edge map. We therefore need to confirm a valid edge at each potential edge pixel. Intensity edges and texture edges have different generation mechanisms, so they need to be confirmed by separate decision processes, and two different types of decision rules are needed. The first decision rule tests the existence of a texture edge at the given position and edge direction, and it is based on the likelihood ratio test with a statistical texture modelling method. The second decision rule tests the existence of an intensity edge, and it is based on a weighted differencing operator. The pixels which fail both of these tests are deleted from the final edge map.

1. Confirming a Texture Edge
FIG. 16. m × 2n strip of image with potential edge pixel at center.

A texture edge in an image can be modelled as a boundary between two different texture regions. This is analogous to the intensity edge, which is modelled by a boundary between two different grey levels. Detection of a texture edge is much more difficult than detection of intensity edges, because each texture region contains many microedges. The texture edges cannot be detected by the strength of gradient or Laplacian operators, and we need a method to characterize textures before detecting texture edges. Textures can be characterized by a small number of parameters after fitting the image by an image model, such as the causal autoregressive model. Consider a horizontal strip of an image intensity array which is sufficiently small, so that the strip can have at most two different textures. If it has two textures, the boundary between the textures can be assumed to be vertical. In this strip, a texture edge is defined as the boundary between two different textures. Consider the strip around the candidate pixel defined earlier. Let the null and alternative hypotheses be

$H_0$: a texture edge exists at the given pixel and direction
$H_1$: no texture edge exists at the given pixel and direction

Under the hypothesis $H_0$, the texture to the left of the potential edge (this region will be called $\Omega_1$) and the texture to the right of the potential edge (this region will be called $\Omega_2$) are different from each other. These two different textures are modelled by causal autoregressive models. The models in the regions $\Omega_1$ and $\Omega_2$ are defined below.

$$g(i,j) = \theta_1^T z(i,j) + \sqrt{\beta_1}\, w(i,j), \quad \text{if } 0 \le i \le m,\ 0 \le j \le n \ (\text{region } \Omega_1) \tag{110}$$

$$g(i,j) = \theta_2^T z(i,j) + \sqrt{\beta_2}\, w(i,j), \quad \text{if } 0 \le i \le m,\ n < j \le 2n \ (\text{region } \Omega_2) \tag{111}$$

where $\{w(i,j)\}$ is a standard 2D white noise sequence, $\theta_1$ and $\theta_2$ are parameter vectors for the regions $\Omega_1$ and $\Omega_2$, respectively, and $z(i,j)$ is a 4-vector.
The parameters of the autoregressive model in the two regions $\Omega_1$ and $\Omega_2$ will be different under the null hypothesis $H_0$. On the other hand, under the hypothesis $H_1$, the strip has only one type of texture, which is also assumed to follow a causal autoregressive model.
$$g(i,j) = \theta_0^T z(i,j) + \sqrt{\beta_0}\, w(i,j), \quad \text{if } 0 \le i \le m,\ 0 \le j \le 2n \ (\text{region } \Omega_1 \cup \Omega_2) \tag{112}$$

where $\theta_0$ is the parameter vector and $z(i,j)$ is as previously defined. The decision rule based on the likelihood ratio test has the following form:

$$\begin{cases} \text{accept } H_0 & \text{if } \log p(g \mid H_0) - \log p(g \mid H_1) \ge K \\ \text{reject } H_0 & \text{if } \log p(g \mid H_0) - \log p(g \mid H_1) < K \end{cases} \tag{113}$$

where $K$ is a constant. The likelihood functions $\log p(g \mid H_0)$ and $\log p(g \mid H_1)$ for the autoregressive model are given in the following theorem. The proof can be found in Kashyap (1982).

Theorem 9: The likelihood functions $\log p(g \mid H_0)$ and $\log p(g \mid H_1)$ are given by equations (114) and (117).

$$\log p(g \mid H_0) = \sum_{k=1,2} \left\{ -(mn - 5)\log\hat\beta_k - 3\log(mn) - \log\det(S_k^{-1}) \right\} \tag{114}$$

where $\hat\beta_k$, $k = 1, 2$, are the maximum likelihood estimates of $\{\beta_k, k = 1, 2\}$, and the matrices $\{S_k, k = 1, 2\}$ are given by (115) and (116).

$$\log p(g \mid H_1) = -(2mn - 5)\log\hat\beta_0 - 3\log(2mn) - \log\det(S_0^{-1}) \tag{117}$$

where the maximum likelihood estimator $\hat\beta_0$ and the matrix $S_0$ are obtained in the same way as in (115) and (116) by summing over the whole strip. In the likelihood functions given in (114) and (117), the term $\det(S_k^{-1}) = \det\left(\sum_s z(s) z^T(s)\right)$ is an inverse measure of correlation between pixels in a strip. If the pixels in the strip are strongly correlated, then this quantity is small, and if the correlation is weak, then this quantity is large. However, if the pixels in the strip are perfectly correlated (for example, constant grey level), this quantity becomes zero and the likelihood function does not exist. In such a trivial case, the following rule is used.
1. If $\det(S_0^{-1}) = 0$, the whole strip is completely correlated. Therefore, the hypothesis $H_0$ is rejected.
2. If $\det(S_0^{-1}) \neq 0$ and $\det(S_k^{-1}) = 0$ for $k = 1$ or $2$, the hypothesis $H_0$ is accepted.
3. In any other case, the decision rule (113) is applied.

The texture edge detection obtained by applying the decision rule (113) to the pixels with an edge hypothesis has several advantages over the texture boundary detection algorithms given in Kashyap and Eom (1985a). First, the texture edge direction is estimated in the new method, and this gives more accuracy in detecting edges than applying both horizontal and vertical strips. Second, the new method tests only the existence of a texture edge, which makes processing much faster.

2. Confirming an Intensity Edge

This decision rule tests the existence of an intensity edge at the pixel having an edge hypothesis. The intensity edge is modelled by a step edge and the decision is made on the output of a differencing operator with weighted averaging. Briefly, with the strip applied at the given pixel, the difference of the weighted averages of the grey levels on the two sides of the potential edge pixel is computed. If this difference exceeds a threshold, the pixel is accepted. This decision rule can also be extended to detect a local maximum instead of the strength of the weighted differencing operator output. Let $W(i,j)$ be a weight function. This weight function should be asymmetric with respect to the hypothetical edge pixel and direction. Then the output of the weighted differencing operator is given by the following equation.
All edge detection window operators can be considered members of this family of weighted differencing operators. For example, the Prewitt, Roberts and Sobel operators (Rosenfeld and Kak, 1982) are weighted differencing operators with appropriate weight functions that detect large outputs as edges; the Laplacian on Gaussian operator (Marr and Hildreth, 1980) is also a weighted differencing operator, with a derivative of Gaussian weight function, that detects local maxima of the output. Many variations of weight functions are possible, but we will restrict our attention to the simple operator which can detect the step edges. Probably the simplest weighted differencing operator is the one with
uniform weight function. This operator is defined for the given strip in Eq. (119). It is used to decide the existence of an intensity edge at the potential edge pixel in our experiment. The decision is based on the strength of the operator output, i.e.,

accept edge if $S' > t$; reject edge otherwise

where $t$ is a constant. Experimental results (Figs. 17-21) show good performance with this simple decision rule.

D. Experimental Results
The composite edge detection algorithm is tested with the following four different images (Fig. 17). Figure 17a is a 128 × 128 image generated from two textures chosen from Brodatz's photo album (Brodatz, 1966), grass and wood grain. This image has major edges only at the boundaries of the two textures, but each square has many weak edges caused by the textures. Figure 17b is a 128 × 128 test image generated by rotating a checker board image generated in the same way as Fig. 17a. The textures in this image are the same as in Fig. 17a. The major edges of this image are sloped in a 45-degree direction and each diamond pattern has many weak edges caused by intensity changes within a texture. This image is given to demonstrate that our method can detect edges which are neither horizontal nor vertical. Figure 17c is a 256 × 256 outdoor scene. Figure 17d is a 256 × 256 image of a monkey.

Experiment 1: Checker Board Image. Figure 18a is the final result of the composite edge detection algorithm with a low threshold in the decision rule for the intensity edge. It shows the detected major edges at the boundaries of the two different textures as well as weak edges inside each texture. The edges detected inside the textures are close to the actual edge locations. Figure 18b is the result of the composite edge detection algorithm with a high threshold in the decision rule for the intensity edge. It shows only the major edges between the two different textures, and most of the weak edges inside the textures are eliminated. Thus, an investigator can get an idea of the texture edges (corresponding to the boundaries between textures) and the intensity edges separately.
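The two confirmation rules whose thresholds are varied in these experiments can be illustrated with the sketch below, for a horizontal strip whose candidate boundary lies at the middle column. The sketch rests on assumptions: the uniform-weight operator is taken as the absolute difference of the mean grey levels of the two halves; the 4-vector $z(i,j)$ is taken to be the three causal neighbors plus a constant 1; $\hat\beta$ is the mean squared least squares residual; and the number of usable interior pixels replaces $mn$ in (114) and (117). The degenerate zero-determinant cases are not handled, and all names are hypothetical.

```python
import numpy as np

def intensity_edge_strength(strip):
    """Absolute difference of mean grey levels of the left and right halves."""
    n = strip.shape[1] // 2
    return abs(float(strip[:, n:].mean() - strip[:, :n].mean()))

def ar_log_likelihood(region):
    """Log-likelihood of the form -(N-5) log(beta) - 3 log(N) - log det(sum z z^T)."""
    g = np.asarray(region, dtype=float)
    target = g[1:, 1:].ravel()
    # Assumed state vector: three causal neighbors and a constant term.
    z = np.stack([g[1:, :-1], g[:-1, 1:], g[:-1, :-1],
                  np.ones_like(g[1:, 1:])], axis=-1).reshape(-1, 4)
    theta, *_ = np.linalg.lstsq(z, target, rcond=None)
    count = target.size
    beta_hat = np.mean((target - z @ theta) ** 2)
    _, logdet = np.linalg.slogdet(z.T @ z)
    return -(count - 5) * np.log(beta_hat) - 3 * np.log(count) - logdet

def texture_edge_statistic(strip):
    """log p(g | H0) - log p(g | H1); compare with a constant K as in (113)."""
    n = strip.shape[1] // 2
    log_h0 = ar_log_likelihood(strip[:, :n]) + ar_log_likelihood(strip[:, n:])
    log_h1 = ar_log_likelihood(strip)
    return log_h0 - log_h1
```

Raising the threshold on `intensity_edge_strength` suppresses weak edges inside a texture, while the texture statistic responds to a change in the fitted model between the two halves rather than to a change in mean grey level.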
FIG. 17. Original images. (a) Checker board. (b) Rotated checker board. (c) Outdoor scene. (d) Monkey image.
Figure 18c is the result of the Laplacian on Gaussian approach with σ = 0.5. Even if we alter the parameters, the final edge map remains similar. Thus if we use this approach, we cannot distinguish the edges which are caused by the boundaries of textures from the microedges within each texture. Figure 18d is the result of the facet model approach. It shows detected major edges and weak edges. Even if the parameters are changed, the final edge map is similar to Fig. 18d. Thus if we use this approach, the texture edges and intensity edges are not distinguished. Another noticeable distortion is at the corners of the squares, where the detected edges are distorted.

Experiment 2: Rotated Checker Board Image. Figure 19a is the final result of the composite edge detection algorithm with a low threshold in the decision rule for the intensity edge. It shows all major edges between texture regions
FIG. 18. Comparison with checker board image. (a) Edges detected by composite edge detection algorithm with low threshold. (b) Edges detected by composite edge detection algorithm with high threshold. (c) Result of Laplacian on Gaussian method. (d) Result of facet model method.
and weak edges inside each texture region. The locations of the detected edges correspond to the actual edge locations in the original image. Figure 19b is the final result of the composite edge detection algorithm with a high threshold in the decision rule for the intensity edge. It shows all major edges between different texture regions, but most of the weak edges inside the texture regions are removed without weakening the major edges. Figure 19c is the result of the Laplacian on Gaussian operator. It shows both major edges and weak edges. The final edge map does not change even if the parameters are changed. Therefore, with this approach, texture edges and intensity edges are not distinguished. Figure 19d is the result of the facet model approach. It shows severely distorted detected edges. It contains major
FIG. 19. Comparison with rotated checker board image. (a) Edges detected by composite edge detection algorithm with low threshold. (b) Edges detected by composite edge detection algorithm with high threshold. (c) Result of Laplacian on Gaussian method. (d) Result of facet model method.
edges between texture regions and weak edges inside each texture region, but the weak edges cannot be separated from the major edges by changing parameters.
Experiment 3: Outdoor Scene. Figure 20a is the image of pixels having an edge hypothesis, generated by the edge hypothesis generation process described in Section VB. It contains all edge pixels in the original image. These potential edge locations are very close to the actual edge locations. For example, the windows of the house in the upper left quarter, the wheels of the car in the lower half, the rear of the car in the lower left, the entrance of the house in the middle, the side window in the gable in the upper right quarter, etc. are very well
FIG. 20. Comparison with outdoor scene. (a) Potential edges (pixels having edge hypothesis) detected by Algorithm 1. (b) Composite edges detected by Algorithm 2 (composite edge detection algorithm). (c) Result of Laplacian on Gaussian method. (d) Result of facet model method.
detected in this edge picture. This picture shows that the edge hypothesis generation process can itself be used as a good edge detection method. The performance of this 3 x 3 mask directional derivative method as an edge detector appears to be comparable or superior to that of other edge detection algorithms (we will discuss this more later). Figure 20b is the final result of the composite edge detection algorithm. Notice that most of the unwanted microedges caused by textures, such as the leaves of the tree in the upper left and the shrubs in the middle, are removed, but other important edges are still preserved and not weakened. For example, the edges at the window area in the left and middle, the wheels of the car in the lower half, the entrance of the house in the middle, and the side window in the gable in the
upper right quarter are still preserved even though minor edges in the tree region are absent. Figure 20c is the result of the Laplacian on Gaussian operator. The edges detected by this method are distorted, and unwanted minor edges inside the texture regions still exist. The detected edges in the window area in the left and middle, the car in the lower half, the roof top in the upper part, the line at the bottom, the lines on the gable in the upper right, the entrance of the house in the middle, etc. are distorted, and the locations of the detected edges are relatively far from the actual edge locations. The side window in the gable in the upper right quarter is not detected even though many unwanted weak edges are detected. The windows in the left of the house cannot be distinguished since they are distorted by the microedges of the tree. Figure 20d is the result of the facet model approach. The detected edges are not only blurred but also distorted. Most of the unwanted minor edges generated by textures still exist, and the major edges are blurred. The detected edges in the regions of the windows in the left and middle, the entrance of the house in the middle, the wheels of the car in the lower half, the lines in the gable in the upper right quarter, etc. are distorted. The windows in the left cannot be distinguished from leaves, because they are distorted and mixed with edges generated by the leaves and shades of a tree, which look like windows.
Experiment 4: Monkey Image. Figure 21a is the image of pixels having an edge hypothesis, obtained by the edge hypothesis generation process described in Section VB. It shows sharp edges, and the locations of these potential edge pixels are very close to the actual edge locations. For example, the eyes of the monkey, the lines in the center of the image, etc. are well detected, showing the good performance of this algorithm as an edge detection method. Its performance as an edge detection method is superior to that of other edge detection methods.
Figure 21b is the final result of the composite edge detection algorithm. Notice that most of the microedges in the texture region in the cheeks of the monkey’s face are removed, but most of the important edges, such as the eyes and nose of the monkey and the lines in the center of the picture, are well preserved. Figure 21c is the result of the Laplacian on Gaussian operator. It shows distorted major edges and many unwanted microedges caused by textures in the cheeks of the monkey’s face. The edges in the eyes and nose region are distorted and barely distinguishable. Figure 21d is the result of the facet model approach. The detected edges are distorted and contain many false (spurious) edges. The locations of the detected edges are relatively far from the actual edge locations.
FIG. 21. Comparison with monkey image. (a) Potential edges (pixels having edge hypothesis) detected by Algorithm 1. (b) Composite edges detected by Algorithm 2 (composite edge detection algorithm). (c) Result of Laplacian on Gaussian method. (d) Result of facet model method.
E. Discussions and Conclusions
Edges are generated in at least two different ways, namely by the difference in intensity (intensity edge) and by the difference in textures (texture edge). The importance of the texture edge is demonstrated by the examples. Conventional edge detection algorithms cannot distinguish between texture edges and intensity edges. A new edge detection algorithm which can detect both intensity and texture edges is developed. The performance of the composite edge detection algorithm shown in this experiment can be summarized into the following two points.
1. The edge hypothesis generation procedure developed in this research can be used as an edge detection method, and its performance as an edge detection algorithm is better than that of other edge detection methods.
2. Our composite edge detection algorithm is flexible enough to detect both major and weak edges by changing the threshold. In other words, for high thresholds it detects only major edges, without the microedges caused by texture, and for lower thresholds it detects both major edges and microedges.
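The role of the threshold in the second point can be illustrated with a deliberately simplified Python sketch. It is not the composite algorithm or its decision rule; it assumes a one-dimensional intensity profile, uses the absolute first difference as a stand-in for edge strength, and all names and numbers (`edge_strength`, `detect_edges`, the sample profile, the two thresholds) are hypothetical.

```python
def edge_strength(profile):
    """Absolute first difference between neighboring samples (a toy edge measure)."""
    return [abs(b - a) for a, b in zip(profile, profile[1:])]


def detect_edges(profile, threshold):
    """Return indices i where the jump between samples i and i+1 exceeds threshold."""
    return [i for i, s in enumerate(edge_strength(profile)) if s > threshold]


# Two hypothetical "textured" regions: small fluctuations inside each region
# (microedges) and one large step between the regions (major edge).
profile = [10, 14, 9, 13, 10, 60, 64, 59, 63, 60]

low_threshold_edges = detect_edges(profile, threshold=2)    # major edge and microedges
high_threshold_edges = detect_edges(profile, threshold=20)  # major edge only

print("low threshold: ", low_threshold_edges)
print("high threshold:", high_threshold_edges)
```

With the low threshold every fluctuation is reported, while the high threshold keeps only the step between the two regions, which mirrors qualitatively the behavior described for Figs. 18a and 18b.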
VI. SUMMARY AND CONCLUSIONS
Robust image models are investigated in this study and applied to several important image processing problems. Robust image models have potential applications in many problems arising in image processing and computer vision. Image models are already used in image synthesis, texture analysis, image coding, and image segmentation, but they are generally not robust to outliers. We applied the robust image models to two important problems in image processing, namely image restoration and edge detection. The robust model-based methods are compared experimentally with conventional methods. The advantage of robust model-based methods over conventional methods in some image processing problems has been shown in Sections IV and V.
REFERENCES
Besag, J. E. (1972a). J. Royal Statistical Society, B34, p. 75.
Besag, J. E. (1972b). Biometrika, 59, p. 43.
Besag, J. E. (1974). J. Royal Statistical Society, B36, p. 192.
Bickel, P. J. and Doksum, K. A. (1977). "Mathematical Statistics." Holden-Day, Inc.
Bovick, A. C., Huang, T. S. and Munson, D. C., Jr. (1983). IEEE Trans. on Acoustics, Speech and Signal Processing, ASSP-31, p. 1342.
Bovick, A. C., Huang, T. S. and Munson, D. C., Jr. (1985). IEEE Trans. on Acoustics, Speech and Signal Processing, ASSP-33, p. 1253.
Brady, M. (1982). ACM Computing Surveys, 14, p. 3.
Brillinger, D. R. (1981). "Time Series, Data Analysis and Theory." Expanded Edition, Holden Day Inc.
Brodatz, P. (1966). "Textures: A Photographic Album for Artists and Designers." Dover Publication, New York.
Chan, P., and Lim, J. S. (1985). IEEE Trans. on Acoustics, Speech, and Signal Processing, ASSP-33, p. 117.
Chellappa, R., and Kashyap, R. L. (1982). IEEE Trans. on Acoustics, Speech, and Signal Processing, ASSP-30, p. 461.
Chen, C. H. (1982). Proc. IEEE 6th Int. Conf. on Pattern Recognition, p. 172.
Chen, P. C., and Pavlidis, T. (1979). Computer Graphics and Image Processing, 10, p. 172.
Cross, G. R., and Jain, A. K. (1983). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-5, p. 25.
Davis, L. (1977). Computer Graphics and Image Processing, 6, p. 492.
Delp, E. J. and Chu, C. H. (1985). IEEE Trans. on Systems, Man, and Cybern., SMC-15, p. 144.
Delp, E. J., Kashyap, R. L., and Mitchell, O. R. (1979). Pattern Recognition, 11, p. 313.
Eom, K. B. (1986). "Robust Image Models with Application." Ph.D. Thesis, Purdue University.
Fu, K. S. (1982). "Syntactic Pattern Recognition and Applications." Prentice-Hall.
Geman, S. and Geman, D. (1984). IEEE Trans. on Pattern Anal. and Machine Intell., PAMI-6, p. 721.
Gradshteyn, I. S. and Ryzhik, I. M. (1980). "Table of Integrals, Series, and Products." Academic Press.
Granger, C. W. J. and Joyeux, R. (1980). Journal of Time Series Analysis, 1, p. 15.
Haralick, R. M., Shanmugam, K., and Dinstein, I. (1973). IEEE Trans. on Syst., Man, and Cybern., SMC-3, p. 610.
Haralick, R. M. (1983a). Computer Vision, Graphics, and Image Processing, 22, p. 28.
Haralick, R. M. and Shapiro, L. G. (1983b). Computer Vision, Graphics, and Image Processing, 29, p. 102.
Haralick, R. M., Watson, L. T., and Laffey, T. J. (1983c). The International Journal of Robotics Research, 2, p. 50.
Haralick, R. M. (1984). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-6, p. 58.
Hosking, J. R. M. (1981). Biometrika, 68, p. 165.
Huber, P. J. (1977). "Robust Statistical Procedures." SIAM Press.
Huber, P. J. (1981). "Robust Statistics." John Wiley & Sons, Inc.
Jernigan, M. E. and Wardell, R. W. (1981). IEEE Trans. on System, Man, and Cybernetics, SMC-11, p. 441.
Kashyap, R. L. (1981). In "Progress in Pattern Recognition" (L. N. Kanal and A. Rosenfeld, ed.), 1, p. 149, North Holland.
Kashyap, R. L. (1982). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-4, p. 99.
Kashyap, R. L. and Rao, A. R. (1976). "Dynamic Stochastic Models from Empirical Data." Academic Press.
Kashyap, R. L. and Khotanzad, A. (1984a). Proc. Int. Conf. Pattern Recognition.
Kashyap, R. L. and Lapsa, P. M. (1984b). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-6.
Kashyap, R. L. and Eom, K. B. (1985a). Proc. IEEE Int. Geoscience and Remote Sensing Symposium, p. 255.
Kashyap, R. L. and Eom, K. B. (1985b). Proc. The 23rd Annual Allerton Conference, p. 314.
Kashyap, R. L. and Eom, K. B. (1986). Proc. IEEE Int. Symp. Inf. Theory.
Kassam, S. A. and Poor, H. V. (1985). Proceedings of IEEE, 73, p. 433.
Kasturi, R., Walkup, J. F., and Krile, T. F. (1984). Computer Vision, Graphics, and Image Processing, 28, p. 363.
Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P. (1983). Science, 220, p. 671.
Kleiner, B., Martin, R. D., and Thomson, D. J. (1979). Journal of Royal Society, B41, p. 313.
Kuan, T. K., Sawchuck, A. A., Strand, C. S., and Chavel, P. (1985). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-7, p. 165.
Kundu, A., Mitra, S. K., and Vaidyanathan, P. P. (1984). IEEE Trans. on Acoustics, Speech, and Signal Processing, ASSP-32, p. 600.
Lawrance, A. J. and Kottegoda, N. T. (1977). Journal of Royal Statistical Society, A140, Part 1, p. 1.
Lee, Y. H. and Kassam, S. A. (1985). IEEE Trans. on Acoustics, Speech and Signal Processing, ASSP-33, p. 672.
Loeve, M. (1963). "Probability Theory." Third Edition, D. Van Nostrand Company, Inc.
Mandelbrot, B. B. (1967). IEEE Trans. on Information Theory, IT-13, p. 289.
Mandelbrot, B. B. and van Ness, J. W. (1968). S.I.A.M. Review, 10, p. 422.
Mandelbrot, B. B. (1977). "Fractals: Form, Chance, and Dimension." Freeman, San Francisco.
Marr, D. and Hildreth, E. (1980). Proc. Royal Society of London, B207, p. 187.
Martin, R. D. (1972). IEEE Trans. on Inf. Theory, IT-18, p. 596.
Martin, R. D. and Schwartz, S. C. (1971). IEEE Trans. on Inf. Theory, IT-17, p. 50.
Martin, R. D. (1972). IEEE Trans. on Information Theory, IT-18, p. 596-606.
Martin, R. D. and Masreliez, C. J. (1975). IEEE Trans. on Inf. Theory, IT-21, p. 263.
Martin, R. D. (1981). In "Applied Time Series II" (H. D. Findley, ed.), p. 683, Academic Press.
Martin, R. D., and Thomson, D. J. (1982). Proceedings of IEEE, 70, p. 1097.
Masreliez, C. J. and Martin, R. D. (1977). IEEE Trans. on Auto. Control, AC-22, p. 361.
Medioni, G. and Yasumoto, Y. (1984). Proc. Workshops on Computer Vision, p. 25.
Mero, L. (1981). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-3, p. 593.
Nasburg, R. E. and Kashyap, R. L. (1975). Proceedings of Information Science and Systems Conference, Johns Hopkins University.
Ord, K. (1975). Journal of the American Statistical Association, 70, p. 120.
Pentland, A. P. (1984). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-6, p. 661.
Polzak, B. T. and Tsypkin, Ja. Z. (1980). Automatica, 16, p. 53.
Pratt, W. K. (1978). "Digital Image Processing." John Wiley & Sons, Inc.
Price, E. L. and Vandelinde, V. D. (1979). IEEE Trans. on Inf. Theory, IT-25.
Rao, C. R. (1967). "Linear Statistical Inference and Its Applications." John Wiley & Sons.
Rey, W. J. J. (1983). "Introduction to Robust and Quasi-Robust Statistical Methods." Springer-Verlag, New York.
Robbins, H. and Monro, S. (1951). Annals of Mathematical Statistics, 22, p. 400.
Rosenblatt, M. (1956). Proceedings of National Academy of Science (U.S.A.), 42, p. 43.
Rosenfeld, A. and Kak, A. (1982). "Digital Image Processing." Second Edition, 1 & 2, Academic Press.
Shanmugam, K. S., Dickey, F. M., and Green, J. A. (1979). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-1, p. 37.
Shipman, A. L., Bitmead, R. R., and Allen, G. H. (1984). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-6, p. 96.
Spitzer, F. (1971). American Mathematics Monthly, 78, p. 142.
Tukey, J. W. (1971). "Exploratory Data Analysis." Addison-Wesley Publ., Mass.
Wilson, H. R., and Bergen, J. R. (1979). Vision Research, 19, p. 19.
Wu, L.-D. (1984). IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-6, p. 41.
Wu, Z. (1985). IEEE Trans. on Acoustics, Speech, and Signal Processing, ASSP-33, p. 1576.
Yasuoka, Y. and Haralick, R. M. (1983). Pattern Recognition, 16, p. 23.
Yeh, C.-L., and Chin, R. T. (1985). IEEE Trans. on Acoustics, Speech, and Signal Processing, ASSP-33, p. 1593.
Zhou, Y. and Chellappa, R. (1986). Proc. Int. Conf. Acoustics, Speech, and Signal Processing.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 70
Physical Limits in Information Processing
ROBERT W. KEYES
IBM T. J. Watson Research Center
Yorktown Heights, NY
I. Introduction
II. Representation of Information
III. Systems
IV. The Nature of Devices
   A. Three Terminal Devices
   B. Two Terminal Devices
   C. Voltage
V. Transistors
   A. Bipolar Transistors
   B. Field-Effect Transistors
   C. MESFET
   D. Soft Errors
VI. Wiring
   A. Chip Wire
   B. Electromigration
   C. Chip Interconnection
VII. Fabrication
VIII. Dissipation of Energy
   A. Fundamental Limits
   B. Power Supply and Cooling
IX. Concluding Remarks
References
I. INTRODUCTION
The rapid progress of information processing technology causes one to wonder how long its exponential rates of change can continue. Decreases by factors of two every few years in dimensions and costs, and increases in the sizes of systems and the number of components on a chip, obviously cannot continue forever. Yet new inventions and discoveries continue to expand the distance to which such trends are believable. Thus, one seeks to find limits related to information and the devices that handle it that are based on fundamental physical laws, rather than on any assumptions about technology.
Copyright © 1988 by Academic Press, Inc. All rights of reproduction reserved. ISBN 0-12-014670-3
Unfortunately, this goal is elusive. A limited amount of progress can be made with information as an abstract concept. Mathematical descriptions of noise can be introduced. The efficiency of coding schemes that combat noise with redundancy can be calculated. In real systems, in which it is desired to compare limits with the present state of affairs, information is represented as the value of a physical quantity. If information is represented by photons, it is possible to deduce limits relevant to reality, as photons can be treated as noninteracting particles in a vacuum, a system whose properties are well known. Useful results pertaining to optical communication can be obtained, against which systems in which attenuation, diffraction, and dispersion are present can be compared. However, most systems involve much more than the propagation of light in vacuum, and one is soon involved with the physics of materials. In communication, signals must be converted from one physical form to another with detectors, amplifiers, and transducers, and they are often guided by material structures. In logic, material systems that can produce strong interactions between different streams of information are essential. The subject of limits on information processing is quickly entwined with that of limits of materials. The latter can be founded in the properties of known materials, but at this point one is fairly far away from the ideal of “fundamental”. Basic physics has not established that the favorable properties of known materials cannot be improved upon. On the other hand, the decades of research that have not produced better electronic materials than those now known may be evidence that better materials cannot be created because of aspects of the physics of solids that are not well understood. Another fact that may or may not be viewed as fundamental is that the advancement of the technology of information handling and processing is driven by economic forces.
Integrated circuits and digital electronics are technologies. They exist because they perform useful functions economically. Miniaturization and integration are pursued as avenues to reduce the cost of providing services. They are supported by resources that are made available because they have demonstrated an ability to make electronic manipulation and handling of information, in such functions as record keeping, simulation of physical processes, and analysis of large amounts of data, possible at lower cost. Although there is nothing fundamental in the sense of physical science about these economic considerations, they are essential to the consideration of limits to information handling technology. There is an enormous gap between the physics that can be done in a laboratory with equipment worth millions of dollars and practical systems where a million or more components are used and their cost must be measured in cents. Thus an inquiry into the limits of a technology raises a semantic question. “Physical” invokes thoughts of very basic laws, such things as the uncertainty
principle of quantum mechanics, the laws of thermodynamics, and the relativistic limit on the velocity of material bodies. However, many things that are permissible from the perspective of physical laws are out of the question as elements of a technology. There is no doubt that the presence or absence of a single atom or electron at a specified position can represent a bit of information. A complex of laboratory apparatus can, in fact, “write” and “read” such a bit (Gabrielse et al., 1985). Such possibilities, though quite consistent with physical law, have no relevance to the limitations on the availability and performance of information processing technology.
II. REPRESENTATION OF INFORMATION
Information is handled in digital, almost exclusively binary, form in modern computation. Digital means that there is a finite set of signal values that must be distinguished. The digits are represented as the value of some physical quantity. One reason that binary representation dominates in computation is its natural correspondence with a variety of physical effects. Many physical concepts are inherently binary. Some familiar examples are the power supply and ground potentials in electrical circuits, the direction of magnetization of a magnetic domain, the amorphous and crystalline states of a solid, and opaqueness or transparency of a material. A bit is also the natural answer to many common information processing questions, e.g., is there another datum to be examined? Is the printer turned on? Is the file sought on the accessed disk? Even quantities that are not inherently digital are used in binary form in computation and in information storage. A bead on a rod in an abacus may occupy a continuum of positions, but only two are used. A charge on a capacitor is used to indicate the value of a bit, even though it can be regarded as a continuous variable. The value of digital representation lies in its use in preventing the deterioration of information in the course of a long series of computations. For example, to say that information is conveyed by a binary digit means that there are two distinct standard signal values that can be recognized by the computational elements. A large difference between the digits allows them to be recognized even in the presence of a certain amount of noise or distortion or attenuation (Lo, 1961). Communication also depends upon the representation of bits of information as the value of some physical quantity. Information to be transmitted over long distances has for a long time been transmitted as electrical pulses, quantities of electrons, but is increasingly represented as numbers,
frequencies, and phases of photons. (Large quantities of information represented as inks on paper and chemicals in photographic film are, of course, also transmitted over long distances and stored for varying periods of time.) Communication and computation differ greatly in their demands upon the physical representation of information. Communication is rather similar to memory in that its purpose is to represent a bit in some physical form that can be subsequently retrieved and identified. Both are, in a sense, linear. A one entry in binary representation should be retrieved as a one and a zero entered should be retrieved as a zero. Limitations upon the representation of information for the purposes of storage and communication arise from various undesired perturbations of the representation collected in the term noise. Noise can usually be regarded as additive. Error correcting codes, parity, and other forms of redundancy, can be used to increase noise immunity. The retrieval of the information is at a significantly different location than the point of entry in communication. In contrast, logic signals must interact with one another in nonlinear ways to implement functions such as AND or NOR, Table I, in which the result is not linearly related to the operands. The signals must be large enough to produce nonlinear responses in materials. Nonlinearity also has a role in memory, as it permits a form of AND operation to be used in matrix addressing of memory. For example, a current through two wires was used to switch the direction of magnetization of a ferrite core; the field produced by current in one wire was not sufficient to overcome the coercive force. The need for large signals is the source of the various limitations treated subsequently. The representation of information as a physical quantity does not have to be binary. Proposals for modifying digital devices to represent two or more bits with some physical quantity are occasionally investigated. 
The obvious difficulty is that 2^n signal levels must be distinguished to represent n bits; the accuracy required of the sensing systems increases exponentially with n. Binary representation maximizes the separation of signals that must be distinguished.

TABLE I
EXAMPLES OF BOOLEAN LOGIC FUNCTIONS

p  q  r  |  p AND q AND r  |  p NOR q NOR r
0  0  0  |        0        |        1
0  0  1  |        0        |        0
0  1  0  |        0        |        0
0  1  1  |        0        |        0
1  0  0  |        0        |        0
1  0  1  |        0        |        0
1  1  0  |        0        |        0
1  1  1  |        1        |        0

III. SYSTEMS
A fundamental aspect of useful computation is that it is not accomplished by individual devices, but requires systems of many devices. The devices are organized into logic gates, small circuits that perform elementary logic functions. A logic gate accepts several inputs and produces an output that is a Boolean function of the inputs. The NOR gate, whose input-output relation is shown in Table I, is an example of a useful gate. It is known that all logic functions can be produced by combining NORs. The creation of a computational system, a microprocessor working with four-bit words, on a single chip around 1970, was a significant advance in the technology of computation. This simple system contained about 2000 transistors. The size of contemporary computing systems ranges from this minimum to large supercomputers containing several million transistors. The large numbers limit the technologies that may be used to produce the devices and constrain the properties that the devices must possess. We review these constraints. One of the realities of providing large numbers of components seriously limits the nature of systems: the large numbers mean that the cost per component must be very low. The need for low cost precludes high-precision fabrication of the components; a certain amount of variability and imprecision among devices must be accepted and taken into account in the design of systems. Failure of a device in a large system usually means failure of the system. The reliability of each device must therefore be very high. Components must function in the variable temperatures encountered in large systems, in the presence of intense energy fluxes representing information, and must withstand the irreversible effects of corrosion, creep, bleaching, and diffusion. In other words, they must work properly for a long time through a range of conditions.
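The statement that all logic functions can be produced by combining NORs can be checked exhaustively for the basic two-input functions. The following Python sketch is illustrative only; the helper names (`nor`, `not_gate`, `or_gate`, `and_gate`) are hypothetical and do not come from the text.

```python
from itertools import product


def nor(*inputs):
    """NOR gate: the output is 1 only when every input is 0 (compare Table I)."""
    return 0 if any(inputs) else 1


def not_gate(a):
    """NOT built from one NOR with both inputs tied together."""
    return nor(a, a)


def or_gate(a, b):
    """OR built from a NOR followed by a NOR used as an inverter."""
    return not_gate(nor(a, b))


def and_gate(a, b):
    """AND built by inverting both inputs and applying NOR."""
    return nor(not_gate(a), not_gate(b))


# Exhaustive check against Python's own Boolean operators.
for a, b in product((0, 1), repeat=2):
    assert not_gate(a) == 1 - a
    assert or_gate(a, b) == (a | b)
    assert and_gate(a, b) == (a & b)

# The three-input NOR column of Table I: 1 only for p = q = r = 0.
nor_column = [nor(p, q, r) for p, q, r in product((0, 1), repeat=3)]
print(nor_column)
```

Since NOT, OR, and AND suffice to express any Boolean function, building all three from NOR alone is enough to support the claim.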
If it is not very easy to make devices work under controlled conditions in a laboratory, the difficulties will be enormously magnified when they are assembled into a large system. Feasibility of a system of a great many devices also depends upon a low power dissipation per device. The subject of the power consumed by logic devices is of great importance and has received a large amount of attention and we will return to it later. Still another fundamental aspect of systems composed of a large number of logic gates is that a large amount of communication among the gates is
needed to enable the system to function as a unit. The provision of many channels of communication is necessary. The connections have a pseudorandom character, and much can be learned about the requirements by statistical studies of probabilistic models (Heller et al., 1977; El Gamal, 1981; Donath, 1981; Keyes, 1982b). Experience has shown that the communication is characterized by what might be called “nonlocality”. We mean the following by this. If one examines segments of a computing system the segments do not become more self-contained as the size of the segment grows. Rather, as the size of a segment grows, the length of communication paths within it increases, as do its demands for communication with the remainder of the system. Although it is hard to justify these trends by arguing from basic principles, they are supported by empirically derived relations, to which we will return in Section VI, a more quantitative description of communication in computational systems. Another consequence of communication is that it provides an opportunity for signals to become altered in their passage from one part of the system to another part. High fidelity cannot be attained, and the logic gates must be able to work with signals that have been distorted in transit.
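The requirement that gates work with signals distorted in transit can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a description of any real circuit: each link is assumed to attenuate the signal by 20% and add a small alternating offset, and the names `distort`, `restore`, and `transmit` as well as the numerical values are hypothetical.

```python
HIGH, LOW = 1.0, 0.0   # the two standard signal values
THRESHOLD = 0.5        # decision level halfway between them


def distort(level, stage):
    """Hypothetical link: 20% attenuation plus a small alternating offset."""
    offset = 0.05 if stage % 2 == 0 else -0.05
    return 0.8 * level + offset


def restore(level):
    """Ideal restoring gate: map any received level back to a standard value."""
    return HIGH if level > THRESHOLD else LOW


def transmit(bit, stages, restoring):
    """Send one bit through a chain of links, optionally restoring after each."""
    level = HIGH if bit else LOW
    for stage in range(stages):
        level = distort(level, stage)
        if restoring:
            level = restore(level)
    return 1 if level > THRESHOLD else 0


print(transmit(1, stages=10, restoring=False))  # distortion accumulates
print(transmit(1, stages=10, restoring=True))   # standard value re-established each stage
```

Without restoration the distortion accumulates from stage to stage until a one is read as a zero; with a standardized value re-established at every gate, the same per-stage distortion is harmless.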
IV. THE NATURE OF DEVICES
The devices that have been pursued as bases for the construction of large computers can be divided into two broad types. One type, relays, vacuum tubes, and transistors, has three terminals and has been successfully used to construct large computers. The second type, two terminal bistable devices, namely, tunnel diodes and cryotrons, although subject to massive development programs, have been unable to evolve to working systems. What limits the ability of two terminal devices to function in computing systems?
A. Three Terminal Devices
Binary logic requires that input signals have the ability to cause a change in the state of a device. A finite input signal amplitude is required to insure this ability. There are essentially two reasons for the need for a finite signal size. One is inherent in the action of devices. As discussed below, the current in a bipolar transistor depends on voltage as exp(qV/kT), for example, and a change in voltage of many kT/q is necessary to change a current from a negligible value to a large and easily detectable value. The signal must be large enough to cause a clearly nonlinear effect; the logic functions are nonlinear. Somewhat different considerations apply to a device such as a relay, in
which the input triggers some instability. Even though the triggering action itself may be a response to an infinitesimal change in an input, finite signals are still required because the exact point at which the triggering action will occur is not known. The input signal must be large enough to encompass all possible thresholds of the recipient device. In the case of the relay, the point at which it switches will be affected by temperature (which changes the resistance of the coil and the elasticity of the spring), by fatigue in the spring, by friction, and by inevitable small differences in manufacture. The input signal must be large enough to cover the entire range of possibilities. In fact, variability of characteristics is also found in transistors and tubes and is an additional demand on signal size in those cases. The design amplitudes of signals in practical systems must be even greater than the values that can cause switching in accordance with the discussion in the preceding paragraphs. The form of the desired response function is shown in Fig. 1. The two possible values of output must be established over an appreciable range of input signal amplitudes. The large ranges throughout which each of the binary outputs is produced are known as noise margins and are needed because of the many distortions to which physical signals are subject during their passage from one device to another. Even if the amplitude of a signal is altered by a substantial amount during transmission, it can be recognized and a standardized value produced as output. Electrical signals suffer from dispersion, loss in ohmic resistance, and crosstalk. Optical signals are distorted by dispersion, misalignment of parts, attenuation, and diffraction. Information represented by the pressure in a fluid, as is the case in hydraulic controls, is attenuated by viscous resistance. The need for fan-out, the transmission of output signals simultaneously to several recipient devices, may place a further burden on means for enduring signal distortion. In all
The need for fan-out, the transmission of output signals simultaneously to several recipient devices, may place a further burden on means for enduring signal distortion. In all
FIG. 1. A response with large noise margins that is typical of the three-terminal devices that have been successfully used in large computing machines (relay, vacuum tube, transistor).
cases reliable computation can only be insured by large noise margins that can accommodate extraneous influences. Obtaining the response shown in Fig. 1 from an electrical circuit is possible because the devices in question all have high voltage gain. That is, the change in the output voltage is much larger than the change in input voltage needed to effect it. The devices also provide the high current gain needed for fan-out, the transmission of the output signal to a multiplicity of receivers.

B. Two Terminal Devices
Two terminal bistability occurs when a material or structure exhibits a response to an input of the type shown in Fig. 2, a response that is called a negative resistance if the input and output are electrical current and voltage. The tunnel or Esaki diode is probably the best-known electrical example, in which case the abscissa of Fig. 2 represents current and the ordinate represents voltage (Sze, 1969). In any case, the region of negative response is unstable in most circumstances, and the observed dependence of output on input is as shown in Fig. 3. The bistable region in the characteristic of Fig. 3 naturally invites its use to represent binary digits. It is seen that when the input is increased the bistable device must switch to the upper branch of the curve at the threshold T. Upon decreasing the input the device returns to the lower branch at S. Variation of the input thus can control the state of the device.
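The switching sequence just described can be mimicked with a very small state machine. The Python sketch below is only an illustration under simple assumptions: the class name `BistableDevice`, the numerical thresholds, and the idealization of instantaneous switching exactly at T and S are hypothetical and are not taken from the text.

```python
class BistableDevice:
    """Idealized two-terminal bistable element with hysteresis.

    Assumption: the device jumps to the upper branch when the input
    reaches the threshold T and falls back to the lower branch when
    the input drops to S, with S < T. Between S and T it keeps its state.
    """

    def __init__(self, set_threshold, reset_threshold):
        if reset_threshold >= set_threshold:
            raise ValueError("S must lie below T for a bistable region to exist")
        self.T = set_threshold
        self.S = reset_threshold
        self.high = False  # start on the lower branch

    def apply(self, x):
        """Apply input x and return True if the device is on the upper branch."""
        if x >= self.T:
            self.high = True
        elif x <= self.S:
            self.high = False
        # For S < x < T the previous state is retained (the bistable region).
        return self.high


def sweep(device, inputs):
    """Apply a sequence of inputs and record the branch after each one."""
    return [device.apply(x) for x in inputs]


# Hypothetical thresholds, in arbitrary units.
dev = BistableDevice(set_threshold=1.0, reset_threshold=0.4)
states = sweep(dev, [0.0, 0.6, 1.1, 0.6, 0.3])
print(states)
```

The same input value, 0.6 in this sweep, leaves the device on different branches depending on its history, which is the property that allows the bistable region to store a binary digit.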
FIG. 2. Response of a system that can exhibit bistability. For example, a tunnel diode with current as input and voltage as output has a negative resistance characterized qualitatively by this type of curve.

FIG. 3. Bistability obtained with the system of Fig. 2. For example, operation of a tunnel diode in series with a resistor and a voltage source can produce this kind of bistability.
More complex control of the state of a bistable device is possible by using the sum of a number of inputs as the input variable in Fig. 4. Signals of magnitude X are added to a biasing input B, Fig. 4, and switching occurs when the sum of bias and signals exceeds the threshold. It is possible in this way to implement various logic functions. M ONE inputs to a logic stage are summed and if the sum is equal to or greater than some value N the device switches. M and N are small integers. If N = 1 an OR is created, if N = M the function is an AND. In other words, if the bias is such that a single input causes the device to switch, then it acts as a logical OR. If all of the inputs are needed to cause switching, then an AND function is formed. There are fundamental difficulties with this approach to logic. An apparent one is the lack of inversion: once the device is switched to the high output state the logic signals cannot cause it to return to the low state. A separate resetting operation involving removal of the bias is needed. Another limitation is the restricted range in which the signal must lie. Consider that X and B in an N input AND must satisfy

$$B + NX > T \tag{1}$$

and

$$B + (N - 1)X < T \tag{2}$$
FIG. 4. Threshold logic with a bistable device. Bias B maintains the device on the lower branch of the curve. Addition of a sufficient number of inputs X causes the threshold T to be exceeded and the device is switched to the upper branch of the characteristic.
Thus

$$\frac{T - B}{N - 1} > X > \frac{T - B}{N} \tag{3}$$
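The width of the window allowed by these inequalities can be made concrete with a few lines of Python. This is a minimal sketch: the function names and the numerical values of T, B, and X are hypothetical, and the device is idealized as switching exactly when the summed input exceeds T.

```python
def and_signal_window(T, B, N):
    """Return (X_min, X_max) for an N-input AND:
    (T - B)/N < X < (T - B)/(N - 1)."""
    if N < 2:
        raise ValueError("an AND needs at least two inputs")
    if B >= T:
        raise ValueError("the bias alone must not exceed the threshold")
    return (T - B) / N, (T - B) / (N - 1)


def switches(T, B, X, active_inputs):
    """True if the bias plus the active inputs exceed the threshold."""
    return B + active_inputs * X > T


# Hypothetical values in arbitrary units: threshold 1, bias 0.5, three inputs.
T, B, N = 1.0, 0.5, 3
x_min, x_max = and_signal_window(T, B, N)
print(f"allowed signal: {x_min:.4f} < X < {x_max:.4f}")

X = 0.2  # a signal chosen inside the window
print(switches(T, B, X, N), switches(T, B, X, N - 1))
```

The ratio of the upper to the lower bound is N/(N - 1) regardless of T and B, so the window narrows quickly as the number of inputs grows.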
The limited range in which the signal must lie causes limited ability to withstand alterations of signals during transmission and in fan-out to other devices and a variability of signal sources. Still another limitation is the poor tolerance of variability in component parameters, the threshold and the bias. To illustrate the effect of variability, assume that the signal amplitudes lie in a range between $(1 + \alpha)X$ and $(1 - \alpha)X$ around the nominal value and that the threshold may vary from $(1 - \beta)T$ to $(1 + \beta)T$. Assume also for simplicity that the bias B can be obtained from an accurately controlled source. Then, to implement an N input AND function, switching by N inputs must be assured even in the worst case:

$$B + NX(1 - \alpha) > (1 + \beta)T \tag{4}$$

Also, in no case must N - 1 inputs cause switching:

$$B + (N - 1)X(1 + \alpha) < (1 - \beta)T \tag{5}$$
Figures 5(a) and 5(b) are plots of examples of regions in the (X/T)-(B/T) space where the inequalities (4) and (5) are both satisfied when $\alpha = \beta$. The case $\alpha = \beta = 0$ reduces to (1) and (2). Component variability decreases the already small range of admissible signal amplitudes. The restriction of signal amplitudes to a narrow range means that nothing comparable to the large noise margins that can be provided with the three terminal devices is available. It is not possible to increase the amplitudes to values much greater than those needed to actually cause switching to tolerate the effects of unanticipated attenuation of signals.

FIG. 5. Restrictions on the value of the nominal signal value X used in switching the bistable device of Fig. 4 as an N input AND as a function of bias B and variability of signal amplitude and threshold. (a) The dashed lines bound the allowable input amplitudes when there is no signal or component variability, $\beta = 0$. The shaded region shows workable combinations of X and B if $\beta = 0.05$. (b) Permissible range of X and B in the case N = 2, $\beta = 0.15$. See inequalities (1) and (2).

Figure 5 also shows that only a certain range of values of the bias is admissible in the presence of component variability. As the bias increases the restrictions on signal amplitude become more severe. A large bias may be needed because the range of bistability is small, that is, the interval between S and T in Fig. 3 is a small fraction of T. Often in experiments the bias is deliberately selected to be close to the threshold in order to demonstrate switching with a low energy input. However, large bias implies a very narrow range of allowable signal and component values in the implementation of an AND in either case. Careful adjustment to match signals to a particular device is necessary to achieve switching. To quantify the limitations on the bias, X can be eliminated from the inequalities (4) and (5) to obtain

$$\frac{B}{T} < \frac{1 + \alpha\beta - (2N - 1)(\alpha + \beta)}{1 - (2N - 1)\alpha} \tag{6}$$
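The worst-case conditions can be explored numerically. The sketch below assumes that the two worst-case inequalities read B + NX(1 - α) > (1 + β)T and B + (N - 1)X(1 + α) < (1 - β)T, and that eliminating X gives B/T < [1 + αβ - (2N - 1)(α + β)] / [1 - (2N - 1)α]. The function names and sample tolerances are hypothetical.

```python
def worst_case_window(N, b, alpha, beta):
    """Allowed range of X/T for an N-input AND at normalized bias b = B/T.

    Lower bound: N inputs, each at its smallest amplitude, must still
    exceed the largest threshold. Upper bound: N - 1 inputs, each at its
    largest amplitude, must stay below the smallest threshold.
    Returns None when no signal amplitude satisfies both conditions.
    """
    x_min = ((1 + beta) - b) / (N * (1 - alpha))
    x_max = ((1 - beta) - b) / ((N - 1) * (1 + alpha))
    return (x_min, x_max) if x_min < x_max else None


def max_bias_ratio(N, alpha, beta):
    """Largest B/T for which the window above is non-empty."""
    k = 2 * N - 1
    denom = 1 - k * alpha
    if denom <= 0:
        raise ValueError("signal variability too large for any bias to work")
    return (1 + alpha * beta - k * (alpha + beta)) / denom


# Hypothetical example: two-input AND with 5% variability in signal and threshold.
N, alpha, beta = 2, 0.05, 0.05
limit = max_bias_ratio(N, alpha, beta)
print(f"bias must satisfy B/T < {limit:.4f}")
print(worst_case_window(N, 0.5, alpha, beta))   # a usable window
print(worst_case_window(N, 0.9, alpha, beta))   # None: bias too close to threshold
```

With zero variability the bias limit returned by `max_bias_ratio` is 1, that is, the bias may approach the threshold itself; any tolerance on signal or threshold pulls the limit below 1.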
Again letting $\beta = \alpha$ for illustration, Eq. (6) becomes a relation between the maximum value of $\alpha$ and (B/T). The relation is shown for several values of N in Fig. 6. Figures 5 and 6 show that severe demands on the precision of component parameters and signal amplitudes must be met to perform threshold logic.

FIG. 6. Maximum allowable variability as a function of bias (Eq. (6)).

C. Voltage
Note the nonlinearity in the response functions of Fig. 1. Physical systems respond only linearly to small signals. Many of the physical limitations on devices for digital logic arise from a need to use large signals. Large signals are required because of the inherent nonlinearity of the processes involved. Binary logic devices are essentially switches; they produce one of two outputs depending on the state of their inputs. Electrical devices operate through the application of voltages to change the heights of potential barriers that retain electrons. Large voltages are required in the following sense: the energy of the electrons being retained is distributed through a range of several times the thermal energy, kT, and only a differential effect is obtained if the energy imparted to each electron by the voltage is less than kT. A large effect, a transition from a condition in which practically no electrons can surmount a barrier to one in which a great many electrons can overcome it, requires that the voltage change by an amount V that satisfies qV >> kT. The classic example of this requirement is the p-n junction in semiconductors. The electron current through the junction when a voltage V is applied to it, lowering the barrier to the passage of electrons into the p side, is

$$i = qA\,\frac{D_n}{L_n}\,\frac{N_c N_v}{P}\,\exp\!\left(-\frac{E_G}{kT}\right)\left[\exp\!\left(\frac{qV}{kT}\right) - 1\right] \tag{7}$$

Here q is the charge on the electron, A is the area of the junction, $N_c$ and $N_v$ are the thermally available densities of states in the conduction and valence bands, P is the doping level on the p side of the junction, and $D_n$ and $L_n$ are the diffusion constant of electrons on the p side of the junction and the average distance that an electron diffuses there before recombining with a hole. $E_G$ is the energy gap of the semiconductor and V is the applied voltage, in the sense that positive V means that the p side is positive. A similar equation applies to the hole current. The resultant dependence of i on V is shown in Fig. 7, and it is seen that the nonlinearity is quite large when V >> kT/q. A general measure of the voltage scale of nonlinearity of the current-voltage relationship of electrical devices was proposed by Gunn (1968):

$$V_{nl} = \left(\frac{di}{dV}\right)\left(\frac{d^2 i}{dV^2}\right)^{-1} \tag{8}$$
FIG. 7. The strongly nonlinear exponential dependence of current on voltage in a p-n junction.

FIG. 8. An npn bipolar transistor. The shaded areas are metallic contacts to the emitter, base, and collector.
$V_{nl} = kT/q$ for a p-n junction from Eq. (7). Strong nonlinear response necessitates that signal amplitudes be much larger than $V_{nl}$. The nonlinearity of Eq. (7) is used to advantage in the bipolar transistor, Fig. 8. The electrons flowing from the emitter into the base are controlled by
FIG. 9. An emitter-coupled NOR circuit.
the voltage across the E-B junction in the way expressed by Eq. (7). Most of the current is carried by these electrons. Almost all of the electrons are removed by the reverse biased collector junction rather than by recombination, however, so that $L_n$ must be replaced by the base width. Thus, the operation of the transistor is described by an equation of the form

$$i_c = I_s\left[\exp\!\left(\frac{qV_{be}}{kT}\right) - 1\right] \tag{9}$$
$i_c$ is the collector current, $V_{be}$ is the voltage applied to the base-emitter junction, and $I_s$ is a constant. The magnitude of the signal needed to provide the desired logic response may be estimated by considering the emitter-coupled NOR circuit of Fig. 9 (Sedra and Smith, 1982). The operation of the circuit qualitatively is as follows. $V_{ref}$ is a reference potential. If terminals A and B are at a low voltage, $T_1$ and $T_2$ are off and a current flows through $T_3$. Output terminal $V_0$ is at the power supply potential. If one of the inputs, say B, is at a high voltage, current flows through $T_2$, raising the potential of the emitters. The emitter potentials are raised sufficiently closely to $V_{ref}$ that very little current flows through $T_3$. The current through $R_A$ produces a low output at $V_0$. The dependence of the output on an input at A or B is as shown in Fig. 10. The signal amplitude at the input required to switch the current from $T_3$ to the input transistor may be defined as the range between the points at which the ratio of the currents is 0.01 and 100. This is readily calculated from Eq. (9). Since the emitters of the transistors are at the same potential, 100 times as much current flows through $T_3$ when the voltage on the base of the input transistor is (kT/q) ln 100 below $V_{ref}$, neglecting the 1 in Eq. (9). Similarly, the ratio of the currents is 100 in the other
FIG. 10. Response at the output terminal of the circuit of Fig. 9 to an input signal. [Horizontal axis: $V_A$, $V_B - V_{ref}$ (mV), from -100 to +100.]
direction when the input potential is the same amount above $V_{ref}$. The voltage swing is therefore

$$\Delta V = 2\,\frac{kT}{q}\,\ln 100 \tag{10}$$

$\Delta V$ is 240 millivolts for T = 300 K. This example shows that 1/4 of a volt is approximately a minimum logic signal amplitude in room temperature circuitry. In practice, a substantial additional design signal voltage is needed to allow for degradation of signals during transmission.

The situation is somewhat different in field-effect transistors. Here the current depends on the gate voltage, $V_G$, as

$$i = c(V_G - V_T)^2 \tag{11}$$

in the saturation region (Sze, 1969). $V_T$ is the threshold voltage. The voltage scale for small-signal nonlinearity, Eq. (8), in this case, obtained from Eq. (11), is $V_{nl} = (V_G - V_T)$. The use of Eq. (11) already assumes that $(V_G - V_T) \gg kT/q$, so that nonlinearity in field-effect transistors is smaller than in bipolar transistors. Signals are larger in field-effect logic circuits than in bipolar circuits for this reason.

The conclusion to be drawn from the preceding discussion is that voltage in electrical logic circuitry cannot be reduced indefinitely. Similar statements apply to logical operations on information represented in other physical forms.
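The numbers quoted in this subsection are easy to check. The Python sketch below assumes that the logic swing of the emitter-coupled circuit is twice (kT/q) ln 100, and that Gunn's nonlinearity voltage is the ratio of the first to the second derivative of current with respect to voltage, a form that reproduces both values quoted in the text (kT/q for the junction, $V_G - V_T$ for the field-effect transistor). Function names, the finite-difference step, and the operating points are hypothetical.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def thermal_voltage(temperature_k):
    """kT/q in volts."""
    return K_B * temperature_k / Q_E


def logic_swing(temperature_k, current_ratio=100.0):
    """Input swing that takes the current ratio from 1/ratio to ratio."""
    return 2.0 * thermal_voltage(temperature_k) * math.log(current_ratio)


def nonlinearity_voltage(i_of_v, v, h=1e-4):
    """Estimate (di/dV) / (d2i/dV2) at voltage v by central differences."""
    d1 = (i_of_v(v + h) - i_of_v(v - h)) / (2.0 * h)
    d2 = (i_of_v(v + h) - 2.0 * i_of_v(v) + i_of_v(v - h)) / h**2
    return d1 / d2


vt = thermal_voltage(300.0)
print(f"kT/q at 300 K    : {vt * 1e3:.1f} mV")
print(f"logic swing      : {logic_swing(300.0) * 1e3:.0f} mV")  # about 240 mV

# Exponential junction characteristic: V_nl should come out close to kT/q.
junction = lambda v: math.exp(v / vt)
print(f"V_nl, junction   : {nonlinearity_voltage(junction, 0.5) * 1e3:.1f} mV")

# Square-law field-effect characteristic with a hypothetical V_T = 0.5 V:
# V_nl should equal V_G - V_T, here 1.0 V.
fet = lambda v: (v - 0.5) ** 2
print(f"V_nl, square law : {nonlinearity_voltage(fet, 1.5):.3f} V")
```

The comparison of the last two results illustrates the point made in the text: a square-law device operated well above threshold has a much larger voltage scale of nonlinearity than a junction, so it needs larger signals.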
Large voltages are also used in logic to take account of the unavoidable differences among devices. Signal voltages must be large enough to switch all the devices of the system in spite of the differences. There are many sources of variation in the threshold voltages of insulated gate field-effect transistors: trapped charges, differences in insulator thicknesses, differences in doping concentrations. Bipolar transistors are less affected by physical differences; changes in voltages required vary only logarithmically with device parameters because of the exponential current-voltage characteristic. The additional voltages that take account of the difference among devices are part of the need for large noise margins. Voltages larger than necessary are sometimes used because devices must be adapted to existing power supply standards. A power supply may have several uses and is not easily changed to adapt to new device technologies. The demand for high voltages in solid-state electronics leads to two kinds of limits. One concerns the effects of high electric fields in devices. The other is the problem of removing the heat produced, the product of the voltage and the current supplied. These will be discussed in later sections.
V. TRANSISTORS
Continued miniaturization combined with a minimum voltage tends to increase the electric fields in electronic devices. Containment of high electric field effects as miniaturization is advanced led to a reduction in the voltages used in digital circuitry in the early stages of semiconductor device development. Reduction of voltage was also pursued as a means to reduce power dissipation. However, the intent to thwart high electric field effects by reducing voltage is eventually frustrated by the existence of the important voltage restrictions that have just been mentioned. High electric fields can produce undesirable or catastrophic effects, not encompassed in conventional transport theory. These include the production of hot electrons, reduction of mobilities, and dielectric breakdown. The nature of high-field effects is shown very schematically in Fig. 11, a plot of the current through a region of a semiconductor as a function of the electric field in the region. The current is controlled by mobility at low fields; the electron velocity and the current are proportional to the field. At around $10^4$ V/cm the electrons begin to acquire energy from the field at a faster rate than it is lost through the scattering processes that dominate at low field, the average electron energy increases, new scattering mechanisms come into play, and the velocity is no longer proportional to the field. The velocity eventually reaches a field-independent, saturated value. As the field approaches $10^6$ V/cm
FIG. 11. Hot electron effects in a semiconductor. The current is proportional to electric field at small fields. The carrier velocity reaches a field-independent value at high fields. At still higher electric fields electrons acquire enough energy to excite more electrons across the energy gap and avalanche breakdown occurs.
some electrons acquire enough energy to excite a second electron from the valence band to the conduction band. The added electrons can also acquire energy from the field, excite more electrons, and an uncontrolled avalanche breakdown current rapidly develops. The exact fields at which these events take place are different for different semiconductors and also depend on doping concentrations and on the size of the region to which the field is applied. The voltage differences between different parts of a transistor are supported by layers depleted of electrons and holes. The thickness of such a depleted layer, x, is related to the potential difference that it supports, $\phi$, and the doping level in the layer, N, by

$$x = \left(\frac{2\epsilon\phi}{qN}\right)^{1/2} \tag{12}$$
Here $\epsilon$ is the dielectric constant of the semiconductor. Equation (12) is the basis of various limitations on semiconductor devices. Elements of a device must be large enough to avoid being entirely depleted by applied voltages. The dimensions of the layers are shown in Fig. 12. The depleted layers occupy a large part of the volume of transistors, and miniaturization of devices must
FIG. 12. Widths of one-sided depletion layers as a function of voltage across the layer and doping concentration. The numbers apply to silicon and gallium arsenide.
include miniaturization of the depleted layers. Apparently the widths can be reduced by increasing N, Eq. (12); miniaturization is accompanied by higher doping levels.

A. Bipolar Transistors
Practically all logic circuitry in large high-performance computers is based on bipolar transistors. As in the case of other solid-state electronic components, the evolution of bipolar transistors has been marked by rapid miniaturization. However, the current that these transistors are called upon to control in the fastest logic circuits has not decreased very much. Higher speeds have been attained by combining the reduction of capacitances produced by miniaturization with high currents. Therefore, the density of current has steadily increased. The trend of current density in bipolar transistors for high-speed logic in a sampling of the literature is shown in Fig. 13 (Keyes, 1987). The need to handle the high and steadily increasing current densities severely limits the design of transistors for use in logic circuits.
FIG. 13. [Plot not reproduced: current density in bipolar transistors on a logarithmic scale versus year, 1950 to 2000.]
The problems arising from the voltage requirement of logic circuitry are less severe in bipolar devices than in field-effect devices. The principal limit arising directly from the application of voltage to a bipolar device is that the p-n junctions should not break down and that the base should not "punch-through." The latter expression refers to complete depletion of the base by the reverse bias applied to the collector. The essential compromise that must be made is between heavy doping of the base, which reduces the penetration of the depleted layer into it, and low doping to widen the collector depletion layer and increase its breakdown voltage (Hoeneisen and Mead, 1972b). Heavy doping of the base is also desirable to decrease the resistance in the path of the base current. The quantitative details depend on the doping profile at the junction. The doping in many modern planar bipolar transistors varies with distance from the surface as shown in Fig. 14. The lightly doped collector allows a wide depletion layer to form and the breakdown voltage to be high. It also reduces the capacitance of the collector junction, which is important in determining the switching speed of a switching circuit.
FIG. 14. Doping concentration as a function of distance from the surface in an npn bipolar transistor fabricated by the methods of planar technology. The transistor is formed in a lightly doped layer grown epitaxially on a heavily doped subcollector in the substrate.
However, the use of the lightly doped collector in planar bipolar transistors introduces two new problems at high current densities. One is apparent from the physical form of the transistor and its contacts, Fig. 8. The collected current must flow from the region under the emitter to the contact through the substrate. The high resistance of a lightly doped collector in this current path would be inadmissible and it is avoided by the heavily doped “subcollector”, shown in Fig. 14, which provides an alternative, lower resistance current path to the contact. The other limitation of the lightly-doped collector is the phenomenon known as “base stretching” (Kirk, 1962). Base stretching occurs at high current levels, when the density of electrons required to carry the current is comparable to the density of the doping impurities. The charge of the current-carrying electrons then drastically changes the potential distribution in the device. Quantitative analysis of the events in a structure such as that of Fig. 14 is complicated and must be done by numerical simulation (Poon et al., 1969; Knepper et al., 1985). An idealized model can show the nature and approximate magnitude of the effect, however. We consider the base-collector junction to consist of three uniformly doped regions: base, lightly-doped collector, and subcollector, separated by abrupt junctions. The sequence of events as the current is increased is shown in Fig. 15 in the form of graphs of potential, shown as the energy of the conduction band, as a function of position. (a) is the potential in equilibrium, with no applied voltage and no current flowing. (b) shows the effect of reverse bias on the collector; the regions depleted of carriers have widened. When current flows the electrons crossing the depleted region of the collector compensate the positively charged donor atoms there. The density of
FIG. 15. Potential in the base, collector, and subcollector of a bipolar transistor under various conditions. (a) No voltage on the collector, no current flowing. (b) Reverse bias on collector, no current flowing. The depleted regions are thicker than in (a). (c) The concentration of electrons needed to carry the current through the collector is equal to the collector doping. The collector is neutral and the field in it is constant. (d) The concentration of electrons carrying current through the collector is very high. Holes neutralize the electrons and the effective base width is greatly increased.
electrons with velocity u needed to carry the current j is

$$n = \frac{j}{qu} \tag{13}$$
The electrons compensate the donors in the collector and effectively reduce the concentration, $N_c$, to $N_c - n$. When $n = N_c$ the collector is neutral, the field in it is constant, and the potential varies as in Fig. 15(c). Beyond this point the net charge in the collector becomes negative and the potential has the form of
Fig. 15(d). Holes enter the collector to compensate the electron charge and rapidly extend the neutral base beyond the metallurgical junction. The extra charge stored in the expanded base slows the transistor action at current densities greater than

$$j_L = q N_c u_L \tag{14}$$

The electric fields in contemporary silicon transistors are large enough to make u close to the saturation velocity, $u_L = 10^7$ cm/s. For practical purposes $j_L$ is an upper limit to the current density in the transistor. The avoidance of base stretching interacts with the rest of transistor design. Thin bases are desired in bipolar transistors to minimize base transit times and the charge stored in the base. The widths of the depleted regions in uniformly doped base and collector are given by

$$x_b = \left[\frac{2\epsilon V N_c}{q N_b (N_b + N_c)}\right]^{1/2} \tag{15}$$

$$x_c = \left[\frac{2\epsilon V N_b}{q N_c (N_b + N_c)}\right]^{1/2} \tag{16}$$
Here $V = V_c + V_{bi}$, and $V_c$ is the reverse bias applied to the collector. However, uniformly doped bases and collectors are not used in the fastest logic transistors. Because of the limitations of the lightly doped collector that have been discussed, the collector doping is increased rapidly away from the junction to prevent the base from extending into the collector. The depletion layer is primarily in the base and extends into it to a distance given by Eq. (12), or with $N_c \gg N_b$ in Eq. (15):

$$x_b = \left(\frac{2\epsilon V}{q N_b}\right)^{1/2} \tag{17}$$
The condition that the base not “punch-through”, $w_b > x_b$, can be written in terms of the number of impurities per unit area of base, $M_b = \int N_b\,dx$ ($M_b = N_b w_b$ in a uniformly doped base), as

$$M_b > \frac{\xi \epsilon V_{pt}}{q w_b} \tag{18}$$
Here $w_b$ is the base thickness, $V_{pt}$ is the punch-through voltage, and $\xi$ is a constant that depends on the distribution of the dopant atoms in the base. From Eq. (17), $\xi = 2$ for a uniformly doped base, but is close to 1 for the type of doping profile shown in Fig. 14, and hereafter $\xi = 1$ is assumed in Eq. (18). The sheet resistance of the base is another important parameter associated with $M_b$. The sheet resistance is

$$\rho_\square = (M_b q \mu_p)^{-1} \tag{19}$$
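The orders of magnitude involved can be estimated with a short calculation. The Python sketch below is a rough illustration, not a device model. It assumes the one-sided depletion width $x = (2\epsilon\phi/qN)^{1/2}$, a punch-through condition of the form $M_b > \xi\epsilon V_{pt}/(q w_b)$, a sheet resistance $1/(q\mu_p M_b)$, and a base-stretching limit $q N_c u_L$. The relative permittivity of silicon (11.7), the hole mobility, and all sample doping levels and dimensions are assumed values chosen only for illustration.

```python
import math

Q_E = 1.602176634e-19        # elementary charge, C
EPS_0 = 8.8541878128e-14     # vacuum permittivity, F/cm
EPS_SI = 11.7 * EPS_0        # assumed permittivity of silicon, F/cm


def depletion_width_cm(phi_volts, doping_cm3):
    """One-sided depletion width (cm) for potential phi and doping N."""
    return math.sqrt(2.0 * EPS_SI * phi_volts / (Q_E * doping_cm3))


def min_base_dose(v_pt, w_b_cm, xi=1.0):
    """Smallest base dopant density per unit area (cm^-2) that avoids
    punch-through at voltage v_pt for base width w_b."""
    return xi * EPS_SI * v_pt / (Q_E * w_b_cm)


def sheet_resistance(m_b_cm2, mu_p):
    """Base sheet resistance (ohm per square) for dose M_b and hole mobility mu_p."""
    return 1.0 / (Q_E * mu_p * m_b_cm2)


def base_stretching_limit(n_c_cm3, u_sat=1.0e7):
    """Current density (A/cm^2) at which electrons neutralize the collector doping."""
    return Q_E * n_c_cm3 * u_sat


# Hypothetical sample values.
print(f"depletion width, 1 V, 1e16 cm^-3 : {depletion_width_cm(1.0, 1e16) * 1e4:.2f} um")
print(f"min dose, 3 V, 0.1 um base       : {min_base_dose(3.0, 1e-5):.2e} cm^-2")
print(f"sheet resistance, M_b = 1e13     : {sheet_resistance(1e13, 100.0):.0f} ohm/sq")
print(f"base-stretching limit, 1e16      : {base_stretching_limit(1e16):.2e} A/cm^2")
```

For a uniformly doped base the choice `xi=2.0` makes the dose condition exactly equivalent to requiring the base to be wider than its depletion layer, which is a convenient consistency check on the assumed formulas.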
Any base current must flow through this sheet resistance. The potential drop caused by the passage of base current through the sheet resistance decreases the forward bias at the emitter-base junction and decreases the density at which current is injected from the emitter. Because the current through a p-n junction depends on voltage as exp(qV/kT), if the voltage drop through resistance in the base is greater than (kT/q), little current flows across the emitter junction. If the emitter is a rectangle with dimensions W x L, then, very approximately, for the entire emitter to be active as a current source

$$\frac{W}{4L}\,\rho_\square\, i_b < \frac{kT}{q} \tag{20}$$
Here $i_b$ is the base current, assumed to be injected along the long sides L. All current crossing the emitter junction is assumed to be captured by the collector in the ideal npn transistor; no base current is needed. The main source of deviation from this ideal is a component of current carried across the emitter junction by holes entering the emitter. The ratio of the current collected by the collector to that which must be supplied by the base is the current gain $\beta$. High values of $\beta$, $\beta \sim 50$, are needed in logic circuitry. All but a few percent of the current crossing the emitter junction reaches the collector. Nevertheless, the base current plays an important role in transistor design by limiting the distance from the base contact to which bipolar transistor action is effective. When $i_b = jWL/\beta$ is substituted in (20) the limit becomes

$$\rho_\square < \frac{4\beta\,(kT/q)}{jW^2} \tag{21}$$
Many miniaturized logic transistors are square or circular, in which case $jW^2$ is approximately the transistor current. Current is injected along the entire periphery of the base, and a numerical factor appreciably larger than the 4 in Eq. (21) is applicable in this case. As an example of (21), if $\beta = 50$, (kT/q) = 0.025 V, and $jW^2$ = 1 mA, then $\rho_\square \le 5 \times 10^3$ Ohm. It is difficult to obtain high-speed switching of the required currents with $\rho_\square > 10^4$ Ohm. Equations (18) and (19) show the advantages of heavy doping of the base. However, the ratio of currents carried across the emitter junction by electrons and by holes is proportional to the ratio of the numbers of these entities in the emitter and the base. The large number of holes in the base associated with large $M_b$ leads to injection of holes into the emitter. This hole current is supplied through the base and decreases the current gain, $\beta$. The number of dopant atoms in the base should be less than the number in the emitter to maintain high emitter efficiency. Heavy doping of the emitter helps to attain this end up to a point, but is not effective beyond about $5 \times 10^{19}$ cm$^{-3}$ because of a decrease in the energy gap of silicon at higher doping levels (Slotboom and
deGraaf, 1976; Keyes, 1976; Selloni and Pantelides, 1982). The decrease in the energy gap lowers the barrier to the entrance of holes into the emitter. Therefore, keeping the number of dopants in the base below the number in the emitter means in practice that the average base doping must be below $10^{18}$ cm$^{-3}$. The heavy doping in the emitter can also cause a decrease in lifetime by leading to Auger recombination, which increases injection of holes into the emitter. Solomon (1982) has discussed scaling and limits of bipolar transistors. In addition to the decreasing energy gap at high doping levels other limits to the scaling of bipolar transistors to higher current densities and smaller dimensions arise from effects that are more difficult to quantify. A further limit to the doping level in the base is the possible existence of a tunnel-assisted recombination current in the emitter-base junction (del Alamo and Swanson, 1986) at dopings above a few times $10^{18}$/cm$^3$. Another, less significant, effect of heavy doping in the base is the scattering of charge carriers that it introduces. The lowered hole mobility operates on $\rho_\square$ and the lowered electron diffusivity increases the time taken for electrons to cross the base. In addition, neutrality is maintained in the base by holes that compensate the injected electrons. The number of holes in the base is equal to the sum of the number of acceptor atoms and the number of electrons injected. The latter number is determined by the current; the charge in the base is equal to the current times the time needed for an electron to traverse the base. The electrons cross the base by diffusion in an average time

$$t = \frac{w_b^2}{2D} \tag{22}$$
The number of electrons per unit area of base is thus

$$M_e = \frac{j w_b^2}{2qD} \tag{23}$$

When j is small $M_e$ is small and the number of holes in the base is nearly equal to the number of acceptors. At high current $M_e$ can dominate the acceptors and the number of holes increases with current, thereby increasing the rate at which holes are injected into the emitter and reducing the current gain. The current gain begins to fall off when the current density becomes so large that $M_e$ exceeds the number of doping atoms in the base. To insure that the current gain is controlled by the base and does not fall off at high current density it can be required that the number of doping atoms in the base be greater than $M_e$:

$$M_b > \frac{j w_b^2}{2qD} \tag{24}$$
Summarizing again, a high $\beta$ at large current densities, high punch-through voltage, and low base resistance all call for a high $M_b$. High emitter efficiency and high carrier mobilities conflict with these requirements. The various influences that confine the base doping on both the high and the low sides have prevented $M_b$ from changing very much during the decades of miniaturization; it has remained in the range $10^{12}$ cm$^{-2}$ to $10^{13}$ cm$^{-2}$. Also, near-equality prevails in Eq. (24). The status of Eq. (24), which represents a limit on the base doping, is shown in Fig. 16, where the data of Fig. 13 are plotted against the base width (Keyes, 1987). Figure 16 shows a correlation of base width with current density. This figure includes points in addition to those in Fig. 13 derived from simulations of projected devices (Gaur, 1979; Kimura and Takahashi, 1982). No date was attached to such devices and no attempt to plot them in Fig. 13 was made. To the extent that $M_b$ is a constant and (24) is an equality the relation in Fig. 16 is expected to be

$$w_b = \left(\frac{2qDM_b}{j}\right)^{1/2} \tag{25}$$
The line shown is calculated from (25) and deviates from proportionality to the -1/2 power because of the dependence of D on doping. Increasing doping
FIG. 16. Comparison of trends in npn bipolar transistors with Eq. (25), including some simulated transistors to which no date could be attached for inclusion in Fig. 13. The line is derived from Eq. (25) with $M_b = 10^{12}$ (Keyes, 1987). [Horizontal axis: j (A/cm$^2$).]
levels reduce mobility and the electron diffusion constant in the base. The current density increases more slowly with decreasing base width than would be suggested by Eq. (25) with constant D. The line in Fig. 16 is calculated with $M_b = 10^{12}$ cm$^{-2}$. The best value for $M_b$ is rather uncertain because of a variety of effects that are difficult to take into account quantitatively: the effects of fields associated with concentration gradients in the base, the effect of electron-hole scattering and compensation on mobility, averaging over the rapidly varying impurity concentration of Fig. 14, and the effects of degeneracy at high carrier concentrations. The limits described by Eqs. (18) and (24) and $N_b < 3 \times 10^{18}$ cm$^{-3}$ are illustrated in Fig. 17, a plot in the $M_b$-$w_b$ plane. As $w_b$ is decreased the value of $M_b$ becomes increasingly confined between the punch-through and heavy base doping limits. The window between these vanishes at the $10^6$ A/cm$^2$ current density contour if the average acceptor concentration is kept below $3 \times 10^{18}$ cm$^{-3}$. The maximum current density transistor is close to that described by Solomon and Tang (1979). Certain other consequences of these considerations are worth noting. Eliminating $\rho_\square$ and $M_b$ from Eqs. (19), (21) and (25) yields

$$W^2 < 2\beta\, w_b^2\,\frac{kT\mu_p}{qD} \tag{26a}$$

By the Einstein relation the last part of (26a) is the ratio of the mobilities of the two carrier types, so

$$W^2 < 2\beta\,\frac{\mu_p}{\mu_n}\, w_b^2 \tag{26b}$$
There is a rough proportionality between the thickness of the base and the linear size of the emitter. A rapidly decreasing total number of impurities in the base is also implied. The decrease with time is shown in Fig. 18(a). It appears that in a decade or so it will be small enough that random fluctuations will affect yield. In terms of the above equations, the number of dopant atoms in a square base is $M_b W^2$. The number can be written by using (19) and (21) as

$$ M_b W^2 = \ldots \tag{27} $$

The relation between $M_b W^2$ and $j$ is shown in Fig. 18(b), where the inverse proportionality described by (27) is illustrated by the line. There is an inevitability to the decrease in the number of dopant atoms as the current density is increased.

According to Eq. (25) the increasing current densities that accompany miniaturization of bipolar transistors for high speed circuitry demand decreasing base widths. The fabrication of the ever-thinner bases is a difficult
FIG. 17. Limits on bipolar transistors in the $w_b$-$M_b$ space. Contours of constant current density (solid lines) and punch-through voltage are shown. The average acceptor concentration in the base is indicated by the dotted lines. The limit set by a 3 V punch-through voltage and an average base doping level of $3 \times 10^{18}$/cm$^3$ is shown by the heavy line. [Horizontal axis: base width $w_b$ (μm).]
aspect of bipolar technology. The base width is defined by the difference in penetration beneath the surface of an acceptor and a donor (see Figs. 8 and 14). As neither of these can be controlled perfectly, controlling the base width within reasonable limits requires that both depths be scaled together; the depth of the emitter-base junction decreases with the base width. However, the contact of the emitter to a metal conductor at the surface has a high recombination velocity and forms an almost perfect sink for holes. The
FIG. 18. (a) The decrease in the number of acceptor atoms in the base of npn transistors with time (Keyes, 1987). (b) The relation of the number of dopants in the base to the current density (A/cm$^2$). The line illustrates Eq. (27).
gradient of the hole concentration in the emitter is increased by a shallow junction and the hole current flowing from the base is increased, decreasing the current gain. The passage of current through contacts of ever-decreasing cross section increases the loss in series resistance, which is another limitation on the miniaturization of bipolar devices.

These limits on the base and emitter can be attacked by the heterojunction emitter and the polycrystalline emitter. The use of an emitter with an energy gap larger than the base creates an energy barrier to entry of holes into the emitter. The base can then be heavily doped without reducing emitter efficiency. The larger energy gap can be realized by forming the emitter from a different semiconductor than the base, making the emitter junction a heterojunction. A suitable larger energy gap semiconductor, one that can be heavily doped and grown epitaxially on the base, is not known for silicon. The requirement can be met by depositing a GaAs-AlAs alloy on a GaAs base, and heterojunction emitter transistors are pursued in this system.

The poly-emitter refers to the deposition of a layer of doped silicon on the surface of the emitter. The deposited layer is ordinarily not epitaxial to the base and is polycrystalline. Nevertheless, the interface between the single and polycrystals can be of such good quality that the deposit acts as an extension of the single crystal emitter, moving the metal contact away from the base-emitter junction and allowing the number of dopants in the emitter to be increased.

The level at which these limitations on bipolar transistors are effective involves process details and qualitative judgements. The details depend on the exact doping profiles. The magnitude of current gain needed depends on circuit choice. The necessary control of base width depends on acceptable yield, among other things.

B. Field-Effect Transistors
Metal-oxide-semiconductor field-effect transistors are limited by phenomena associated with both the semiconductor and the oxide. The structure of a small n-channel MOSFET is shown in Fig. 19. The transistor is turned on by the application of a positive potential to the gate electrode, producing a variation of potential normal to the surface as shown in Fig. 20. $F_p$ is the distance of the Fermi energy from the top of the valence band. The threshold voltage, $V_T$, is the voltage that must be applied to the gate to induce a significant number of electrons at the semiconductor-oxide interface, conventionally defined as the point at which the energy of the conduction band is depressed by an amount $E_G - 2F_p$ at the interface. The electrons induced at the Si-SiO$_2$ interface, the “channel”, carry current when a positive voltage is applied to the drain.
FIG. 19. The physical structure of a MOSFET, showing gate, oxide, source, drain, and substrate.

FIG. 20. Form of the variation of potential perpendicular to the surface of a MOSFET at threshold. The potential in the semiconductor is represented by the energy of the conduction band. The Fermi energy is shown as the dashed line.
A brief description of the theory of the MOSFET current is given here to clarify the meaning of a few terms and introduce notation. The current carried by the electrons at the silicon surface, the “channel”, is (charge per unit area) × velocity × width. The charge density is equal to the excess of gate voltage minus threshold voltage over the channel voltage, $V$, acting through the capacitance of the insulator. The velocity is determined by the mobility and the electric field along the channel, $dV/dx$. Thus

$$ i = W \mu \frac{\epsilon_i}{t_i} (V_G - V_T - V) \frac{dV}{dx} \tag{28} $$

$\epsilon_i$ is the dielectric constant of the insulator and $t_i$ is its thickness. The source and the substrate are regarded as the zero of potential. The current is constant along the channel. The voltage is therefore determined by integrating (28).

$$ i x = W \mu \frac{\epsilon_i}{t_i} \left[ (V_G - V_T) V - \frac{V^2}{2} \right] \tag{29} $$

$V$ and $x$ are measured from the source. At the drain $V = V_D$ and $x = L$, the length of the channel. The current depends on $V_D$ as

$$ i = \frac{W \mu \epsilon_i}{L t_i} \left[ (V_G - V_T) V_D - \frac{V_D^2}{2} \right] \tag{30} $$

The current reaches a maximum as a function of $V_D$ when $V_D = V_G - V_T$. When $V_D > V_G - V_T$ Eq. (30) no longer holds; the current remains constant at its maximum value

$$ i = \frac{W \mu \epsilon_i (V_G - V_T)^2}{2 L t_i} \tag{31} $$
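As a numerical illustration of the simple model, the following Python sketch evaluates the drain current of Eq. (30) below saturation and holds it at the saturated value of Eq. (31) once $V_D$ reaches $V_G - V_T$. It assumes the standard long-channel prefactor $W\mu\epsilon_i/(L t_i)$, which is the form implied by Eq. (38). The device dimensions, oxide thickness, mobility, and the function name `drain_current` are hypothetical values chosen only for illustration.

```python
# Sketch of the simple long-channel MOSFET model, Eqs. (30) and (31).
# All numerical values below are illustrative assumptions, not data from the text.

EPS0 = 8.854e-14          # permittivity of free space, F/cm
EPS_OX = 3.9 * EPS0       # SiO2 permittivity (relative value 3.9 assumed)


def drain_current(v_g, v_d, v_t, width, length, t_ox, mobility):
    """Drain current in amperes; lengths in cm, mobility in cm^2/V s."""
    overdrive = v_g - v_t
    if overdrive <= 0.0:
        return 0.0                          # no channel below threshold in this model
    prefactor = width * mobility * EPS_OX / (length * t_ox)
    if v_d < overdrive:                     # Eq. (30): below saturation
        return prefactor * (overdrive * v_d - 0.5 * v_d ** 2)
    return prefactor * 0.5 * overdrive ** 2  # Eq. (31): saturated value


# Hypothetical device: W = 10 um, L = 1 um, 25 nm oxide, mobility 500 cm^2/V s.
params = dict(v_t=1.0, width=10e-4, length=1e-4, t_ox=25e-7, mobility=500.0)
for v_d in (0.5, 1.0, 2.0, 3.0, 5.0):
    current = drain_current(v_g=3.0, v_d=v_d, **params)
    print(f"V_D = {v_d:.1f} V  ->  i = {current * 1e3:.3f} mA")
```

The current rises with $V_D$ and then stays flat, which is the zero differential drain conductance of the simple model in saturation.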
The current is said to be saturated at the value (31). The extent of the depleted regions in the miniaturized MOSFET is suggested in Fig. 21.

A first limit on the design of the MOSFET is seen from Fig. 21(a): the depleted layers at the source and drain electrodes must not meet (Hoeneisen and Mead, 1972a). The widths are of the form of Eq. (12)

$$ x_d = \left[ \frac{2 \epsilon (V_D + \phi_i)}{q N_a} \right]^{1/2} \tag{32} $$

$V_D$ is the voltage applied to the drain and $\phi_i$ is the built-in voltage of the junctions. Thus it is required that the channel length satisfy

$$ L > 2 x_d \tag{33} $$

The limit (33) can be reduced by increasing $N_a$, Eq. (32). However, heavy doping of the substrate increases the field normal to the surface at the interface
FIG. 21. The dashed lines show the extent of the depleted region in a MOSFET. (a) Reverse biased drain below threshold. (b) With channel formed and current flowing.
with the oxide and the field in the oxide. For a uniformly doped substrate the field is

$$ F = \left( \frac{2 q N_a \phi}{\epsilon} \right)^{1/2} \tag{34} $$

The potential supported by the depleted layer in the substrate, $\phi$, approaches $(E_G/q)$, Fig. 20, in a heavily doped semiconductor at the threshold. The field in the oxide is given by $\epsilon_i F_i = \epsilon F$, where subscript $i$ refers to the insulator. Thus, the maximum field sustainable by the oxide sets a limit on $N_a$ and on $L$ (Hoeneisen and Mead, 1972a).

$$ L > \frac{4}{F} \left[ \frac{(V_D + \phi_i) E_G}{q} \right]^{1/2} \tag{35} $$
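The chain of limits from Eq. (32) to Eq. (35) can be checked numerically. The sketch below assumes the standard depletion-layer forms for Eqs. (32) and (34), that is, $x_d = [2\epsilon(V_D+\phi_i)/qN_a]^{1/2}$ and $F = (2qN_a\phi/\epsilon)^{1/2}$, and combines them with $L > 2x_d$. The drain voltage, built-in voltage, energy gap, maximum oxide field, and all function names are assumptions made only for this illustration.

```python
import math

Q = 1.602e-19             # electron charge, C
EPS0 = 8.854e-14          # permittivity of free space, F/cm
EPS_SI = 11.7 * EPS0      # silicon permittivity (assumed relative value 11.7)
EPS_OX = 3.9 * EPS0       # SiO2 permittivity (assumed relative value 3.9)


def depletion_width(v_d, phi_i, n_a):
    """Depletion width in cm for a junction supporting v_d + phi_i volts."""
    return math.sqrt(2.0 * EPS_SI * (v_d + phi_i) / (Q * n_a))


def interface_field(n_a, phi):
    """Field in the silicon at the interface (V/cm) for surface potential phi."""
    return math.sqrt(2.0 * Q * n_a * phi / EPS_SI)


def min_channel_length(v_d, phi_i, gap_volts, oxide_field_max):
    """Lower bound on L (cm) when the oxide field is held at oxide_field_max."""
    f_si = EPS_OX * oxide_field_max / EPS_SI      # continuity of displacement
    return 4.0 * math.sqrt((v_d + phi_i) * gap_volts) / f_si


# Hypothetical operating point: 3 V drain, 0.8 V built-in voltage,
# E_G/q = 1.12 V, oxide field limited to 3e6 V/cm.
length = min_channel_length(3.0, 0.8, 1.12, 3e6)
print(f"minimum channel length ~ {length * 1e4:.3f} um")
```

The doping that puts the oxide exactly at its field limit is the one for which twice the depletion width equals this minimum length, which is a convenient consistency check on the three expressions.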
This kind of limit is not quantitatively applicable to modern MOSFETs, however, since implantations of ions into the depleted layer and near the surface are used to create variations in the doping concentration and the potential contours that mitigate the undesirable effects described (Parrillo, 1983). It is often the case that increasing process complexity can relieve limits derived from simple models.

The simple MOSFET model of Eqs. (28) to (30) treats the transistor as a planar structure in which the electrical lines of force are perpendicular to the surface. In this model the depleted regions around the source and drain are small compared to the gate. The form of the depleted regions in Fig. 21 shows that this is not the case. When the depleted regions have the substantial extent shown in the figure, deviations from the simple picture are quite significant. The deviations are known as “short channel effects” and limit the straightforward miniaturization of MOSFETs. Several types (Duvvury, 1986) will be described.

The threshold in the simple model is the gate voltage that just compensates the charge in the depleted region in the substrate under the gate and the insulator. Some of this charge is compensated by the ions in the heavily doped source and drain regions in a short channel. As there is less charge to be compensated by the gate, the threshold is lowered. The decrease depends on the gate length, and a dependence of the threshold on gate length appears.

The high electric fields in the channel produce velocity saturation, Fig. 11, degrading the transconductance.

The current in the simple model is independent of drain voltage in the saturation regime; the differential drain conductance is zero. However, as the drain bias is increased the drain depletion width increases, encroaching on the channel and effectively shortening it in the short channel case. The decreasing channel length causes the current to increase and introduces a drain conductance, which reduces the circuit gain in logic applications.

The thickness of the potential barrier in the substrate that separates the source and the drain decreases with decreasing gate length. An increasing
amount of current can leak over the barrier. This current can be quite important when the channel current is turned off by the gate. It is known as subthreshold current and increases with decreasing gate length.

Also sometimes counted among the short channel effects is the increasing electric field associated with the heavy substrate doping of short channel FETs, Eq. (34). The higher vertical electric fields retain the electrons in the channel closer to the surface and decrease their mobility through the increased surface scattering.

The regions of high electric fields in the semiconductor can accelerate electrons to high energies, producing electrons that are said to be “hot”. Sufficiently energetic electrons can overcome the 3.2 eV barrier between the conduction bands of the SiO$_2$ and the silicon and enter the oxide. There is a certain probability that an electron entering the oxide is trapped at a defect there, changing the threshold of the device (Abbas and Dockerty, 1975; Cottrell et al., 1979; Ning et al., 1979). There are few electrons in the depleted regions of the FET, but the fields there are high. Thermally generated electrons can gain energy rapidly and be swept to the surface and into the SiO$_2$.

High electric fields parallel to the surface are found in the channel. Indeed, according to the simple FET model of Eqs. (28) to (30), the field increases without bound as the drain is approached in the saturated condition. Of course this does not actually occur; the approximations involved in writing (28) and (29) fail at high fields (Hoefflinger et al., 1979). However, high electric fields do exist near the drain and cause degradation of device characteristics with use. Some relief from the severity of hot-electron alteration of devices is obtained by the use of “lightly doped drain” structures, in which a low concentration of donors is created near the surface between the channel and the heavily doped drain. This structure, among other things, reduces the maximum electric field near the drain (Ogura et al., 1980).

The difficulty in controlling degradation of MOSFETs by hot electrons is rooted in the fact that even extremely rare events can accumulate to an unacceptable change during the long lifetimes expected of electronic components in large systems. Consider the simple model of a MOSFET above. $n_T$ trapped electrons per unit volume, uniformly distributed throughout the oxide, change the threshold voltage by

$$ \Delta V = \frac{q n_T t_i^2}{2 \epsilon_i} \tag{36} $$

The total number of electrons that must be trapped to produce a change $\Delta V$ is

$$ N_T = n_T t_i L W = \frac{2 \epsilon_i \Delta V L W}{q t_i} \tag{37} $$
One wishes $\Delta V$ to be less than some maximum amount after a life $t_L$. The total number of electrons that pass through the transistor in $t_L$ can be obtained from Eq. (31).

$$ N_e = \frac{W \mu \epsilon_i (V_G - V_T)^2 t_L}{2 L q t_i} \tag{38} $$

The fraction of the electrons that pass under the oxide in the device current that must be trapped is

$$ \eta = \frac{N_T}{N_e} = \frac{4 L^2 \Delta V}{\mu (V_G - V_T)^2 t_L} \tag{39} $$
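The ratio of Eq. (37) to Eq. (38) can be evaluated directly. The sketch below is a minimal check under stated assumptions: the channel width and oxide thickness are hypothetical (they cancel in the ratio), the SiO$_2$ relative permittivity is taken as 3.9, and the remaining values are the representative ones quoted in the text (1 μm channel, 0.1 V shift, 500 cm$^2$/V s mobility, 2 V gate overdrive, $10^8$ s life). Function names are illustrative only.

```python
# Trapped-electron budget: Eq. (37) divided by Eq. (38).
Q = 1.602e-19                 # electron charge, C
EPS_OX = 3.9 * 8.854e-14      # SiO2 permittivity, F/cm (assumed)


def electrons_to_trap(delta_v, length, width, t_ox):
    """Eq. (37): trapped electrons needed for a threshold shift delta_v."""
    return 2.0 * EPS_OX * delta_v * length * width / (Q * t_ox)


def electrons_passed(mobility, overdrive, lifetime, length, width, t_ox):
    """Eq. (38): electrons carried by the saturated current during the lifetime."""
    return (width * mobility * EPS_OX * overdrive ** 2 * lifetime
            / (2.0 * length * Q * t_ox))


def trapped_fraction(delta_v, mobility, overdrive, lifetime, length,
                     width=1e-3, t_ox=25e-7):
    """Fraction of channel electrons that must be trapped; width and t_ox cancel."""
    return (electrons_to_trap(delta_v, length, width, t_ox)
            / electrons_passed(mobility, overdrive, lifetime, length, width, t_ox))


eta = trapped_fraction(delta_v=0.1, mobility=500.0, overdrive=2.0,
                       lifetime=1e8, length=1e-4)
print(f"eta = {eta:.1e}")     # eta = 2.0e-20
```

Because the width, oxide thickness, and permittivity drop out, the result depends only on channel length, allowed shift, mobility, overdrive, and lifetime.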
For example, taking $L = 1$ μm, $\Delta V = 0.1$ V, $\mu = 500$ cm$^2$/V s, $V_G - V_T = 2$ V, and $t_L = 3$ yr $= 10^8$ s, one finds $\eta = 2 \times 10^{-20}$. Even very improbable events can be a source of difficulty.

It is sometimes thought that if all voltages applied to a MOSFET are less than 3.2 V, then no electron can acquire large enough energy to surmount the barrier to entry into the SiO$_2$. This is certainly true if no collisions take place or if the only collisions are elastic ones between a pair of identical spheres. However, no thermalization of the electron distribution could take place then either. The reality of other possibilities can be illustrated with a few simple examples, chosen for their numerical simplicity.

The anisotropic effective masses of silicon allow substantial changes in particle energies to occur in two-electron collisions. The ratio of the heavy mass to the light mass is approximately $m_h/m_l = 5$ (e.g., Sze, 1969). Let two electrons be moving in a (001) direction, one with heavy mass in that direction and momentum

$$ p_h = \frac{m_h - m_l}{m_h + m_l}\, p_f \tag{40} $$

and the other in the light mass direction of another valley with momentum

$$ p_l = \frac{2 m_l}{m_h + m_l}\, p_f \tag{41} $$

Here $p_f$ is the total momentum of the two electrons. These electrons can collide with conservation of energy and momentum in such a way that after the collision the momentum in the light mass valley is zero and the momentum of the heavy electron is $p_f$. The original energy of the light mass

$$ \frac{p_l^2}{2 m_l} = \frac{2 m_l\, p_f^2}{(m_h + m_l)^2} \tag{42} $$

is added to the energy in the heavy mass,

$$ \frac{p_h^2}{2 m_h} = \frac{(m_h - m_l)^2\, p_f^2}{2 m_h (m_h + m_l)^2} \tag{43} $$

to produce an energy $U = p_f^2/2m_h$ in the heavy mass. In the case of silicon the relative values of the energies are 1.25, 1, and 2.25.

Apart from anisotropy, three-body collisions can concentrate energy in a single electron. Phonons can be quite effective in maintaining momentum balance. Consider two electrons with identical momenta $p$ interacting with a phonon of momentum $p_q$ to leave one electron at rest. Then conservation of momentum and energy take the form

$$ 2p = p_f + p_q \tag{44} $$

$$ \frac{p^2}{m} = \frac{p_f^2}{2m} + p_q s \tag{45} $$

$p_f$ is again the final electron momentum and $s$ is the velocity of the phonon, and dispersion in the phonon spectrum is ignored. Since $s$ is small compared to the velocities of electrons, an approximate solution of (44) and (45) is

$$ p_f = p\sqrt{2}, \qquad p_q = (2 - \sqrt{2})\,p \tag{46} $$

Almost all of the original energy resides in a single electron.

Figure 22 illustrates the transfer of energy to a single electron in a three-electron interaction. It can be argued that the three body collisions have small probability. However, even very improbable events can contribute to long-time reliability problems. It cannot be guaranteed that low voltages will completely eliminate hot-electron threshold drift.

FIG. 22. Example of a three body collision in which the maximum electron energy is increased by a factor 16/9.
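The three collision examples above can be verified with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the two-valley momenta are obtained by solving momentum and energy conservation for the stated final state, the phonon case solves the two conservation laws exactly for a small but finite sound velocity, and the three-body case reads the labels of Fig. 22 as velocities $v$, $v$, 0 before and $4v/3$, $v/3$, $v/3$ after for equal masses (an interpretation of the figure labels). All function names are hypothetical.

```python
import math
from fractions import Fraction


def two_valley_collision(m_h, m_l, p_f=1.0):
    """Initial momenta and energies that end with the light electron at rest."""
    p_h = (m_h - m_l) / (m_h + m_l) * p_f
    p_l = 2.0 * m_l / (m_h + m_l) * p_f
    e_h = p_h ** 2 / (2.0 * m_h)          # initial energy, heavy-mass electron
    e_l = p_l ** 2 / (2.0 * m_l)          # initial energy, light-mass electron
    u = p_f ** 2 / (2.0 * m_h)            # final energy, all in the heavy mass
    return p_h, p_l, e_h, e_l, u


def phonon_assisted(p, m, s):
    """Exact final electron and phonon momenta when one electron is left at rest."""
    p_f = m * s + math.sqrt((m * s) ** 2 - 4.0 * m * s * p + 2.0 * p ** 2)
    return p_f, 2.0 * p - p_f


# Two-electron collision with m_h / m_l = 5: energies in the ratio 1.25 : 1 : 2.25.
p_h, p_l, e_h, e_l, u = two_valley_collision(5.0, 1.0)
print(e_l / e_h, 1.0, u / e_h)

# Phonon-assisted case: p_f approaches sqrt(2) p when s is small.
p_f, p_q = phonon_assisted(p=1.0, m=1.0, s=0.01)
print(p_f, math.sqrt(2.0))

# Three equal masses (Fig. 22 as interpreted above), in units of v.
before = [Fraction(1), Fraction(1), Fraction(0)]
after = [Fraction(4, 3), Fraction(1, 3), Fraction(1, 3)]
print(max(a * a for a in after) / max(b * b for b in before))   # 16/9
```

In each case total momentum and total energy are unchanged, yet the largest single-particle energy grows, which is the point of the argument.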
C. MESFET
The silicon field-effect transistor has been outstandingly successful because of the excellent qualities of SiO$_2$ as an insulator. No comparable insulator
FIG. 23. Structure of a MESFET on a semi-insulating substrate. Heavily doped source and drain regions allow ohmic contacts to be made to the ends of the channel with metals. A different metal forms the Schottky gate.
compatible with other semiconductors, in particular, the III-V compounds such as GaAs and InP, is known. Field-effect transistors are made in other semiconductors by the method illustrated in Fig. 23. A thin n-type layer is prepared on a high resistivity p-type or semi-insulating substrate. A metal deposited on the n-type layer forms a Schottky barrier with a region depleted of electrons beneath it. The metal acts as a gate; potentials applied to it change the width of the depleted layer, thereby modulating the conductivity of the n layer. As used in logic circuits, a large enough negative potential is applied to the gate to extend the depleted region completely through the n-type layer, cutting off conductivity between the source and the drain and causing the device to act as a switch. The transistor is called a MESFET (Metal-Semiconductor Field-Effect Transistor) or Schottky-gate transistor (Sze, 1969).

The conductive n-type layer can be made so thin that it is completely depleted of electrons even in the absence of a potential applied to the gate. A positive potential can then be applied to the gate to narrow the depleted region and allow the device to become conductive. This form of transistor is called a normally-off MESFET.

Although MESFETs have been made in various semiconductors, including silicon, and can be made of p-type semiconductors, the greatest interest in them is in the case of n-GaAs (Eden et al., 1979). Electrons in GaAs have very high mobility, and the gallium arsenide MESFET offers an avenue to the use of the high mobility in electronic devices. This approach has been very successful in the microwave field; the attainment of a miniaturized MESFET in gallium arsenide immediately produced a large step in the ability of solid state devices to amplify microwaves (Drangeid et al., 1970).

The thickness of the depleted layer is related to the barrier height plus applied voltage and the doping of the n layer in the way described by Eq. (12). The relation shown in Fig. 12 is plotted in Fig. 24 with the addition of special features relevant to MESFETs. A scale showing voltage applied to the gate is shown in addition to the scale of potential supported by the barrier for a barrier height of 0.8 V, approximately that for metals on GaAs. The vertical scale, the thickness of the depleted layer, can also be regarded as the thickness
FIG. 24. A plot of the type of Fig. 12 applied to GaAs MESFETs. The Schottky barrier is taken to be 0.8 V. The applied voltage modulates the thickness of the depleted region, which also depends on the doping, as shown by the lines. The points illustrate two representative MESFETs with layer dopings of $10^{17}$ cm$^{-3}$. One, with a layer thickness of 0.09 μm, is normally off (0.8 V depletes 0.1 μm) and is turned on by a positive voltage of 0.1 V applied to the gate. The layer thickness of the other is 0.13 μm and it is pinched off by a gate voltage of -0.7 V. The limits set by dielectric breakdown under negative gate voltage and by forward current (10 A/cm$^2$) with forward voltage are shown by the solid lines. [Horizontal scales: barrier voltage (V) and gate voltage (V).]
of a MESFET channel that is just depleted by the applied voltage. When this voltage is positive the transistor is normally off, and the applied voltage may be regarded as the threshold at which conductivity appears. If the applied voltage is negative, it is the potential that must be applied to the gate to pinch off the channel, reducing the conductance of the device to zero. Two points show the locations of representative GaAs MESFETs.

The limitations of MESFETs for logic applications are also shown in Fig. 24 (Keyes, 1981). The normally-on MESFET is limited by dielectric breakdown. The electric field in the depleted region increases with applied negative voltage and the conductive layer must be entirely depleted by the voltage before the breakdown field is reached. The limit in the case of the normally-off MESFET is the forward conductance of the Schottky gate. The value of this current that is acceptable depends on the circuit application; 10 A/cm$^2$ has been used as the limit in the figure for purposes of illustration.
The logic signal must switch the gate between points between the limits. It is seen that an interval of several volts is available even for doping levels exceeding $10^{17}$ cm$^{-3}$. MESFETs with layer thickness of only 25 nm should be feasible. It is desirable that gate lengths should be several times the thickness to avoid degradation of the characteristics by short-channel effects; MESFETs should be practical to gate lengths of 0.1 μm.

D. Soft Errors
Another basic problem of miniaturized electronic devices arises from the presence of energetic radiation in the environment. The problem can be severe in integrated electronics because of the presence of natural radioactivity in the impurity content of materials used in electronic packaging: ceramics, plastics, solders, metallic wires. α particle emission is prominent in natural radioactivity at α energies around 5 MeV. An α particle entering the silicon creates over $10^6$ electron-hole pairs distributed through a range of a few tens of microns. The range of the energetic α particles is an order of magnitude or more greater than the thickness of the layer in which the devices are fabricated. However, the extraneous holes and electrons can diffuse to the devices and add to any charge that may be stored in their capacitances. The effect is particularly noticeable in dynamic FET memory, where information is stored as charge on a capacitor (May, 1979). Alterations of the stored charge that destroy the information contained in the capacitor are called “soft” errors, a term intended to imply that no permanent change in the device is involved; only a transient error has occurred.

The magnitude of this limit depends on environments that are difficult to characterize. Purification of packaging materials is an obvious approach to controlling soft errors. A certain amount of the problem, however, also arises from cosmic radiation (Ziegler and Lanford, 1979). Error correction, the use of redundancy in storage of information that enables a limited number of erroneous bits to be recognized and corrected, can lessen the seriousness of soft errors. A most promising approach to management of soft errors is the development of structures that avoid collecting carriers created at some distance from a device. For example, the formation of an oxide layer between the surface layer that contains the device and the bulk silicon can prevent most of the radiation-induced carriers from reaching the devices.

It is found that acceptable soft error rates require that at least $10^6$ electrons, 160 fC, be used to represent a bit in random access memory. The density at which charge can be stored in an SiO$_2$-insulated capacitor is limited by the maximum field that can be sustained by the oxide. SiO$_2$ is an excellent insulator. However, the mechanisms that cause it to break down at high electric fields are not understood. Time of exposure to the field, thickness of the layer, and method of preparation are all factors. The maximum working
field compatible with long term reliability of devices must be regarded at present as less than $3 \times 10^6$ V/cm. This maximum field allows 10 fC/μm$^2$ to be stored in a capacitor. Storage of $10^6$ electrons then requires an area of 16 μm$^2$, an area too great to be economically accommodated in megabit memories with conventional technology. The solution to this dilemma has been the trench capacitor, made by etching a hole in the silicon and oxidizing the sides of the hole to form the insulator of the capacitor; the silicon surface area occupied is less than the capacitor area.
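The storage-capacitor numbers quoted above follow from the charge per unit area $\epsilon_i F$ that an oxide can hold at its working field. The short sketch below reproduces them, assuming a relative permittivity of 3.9 for SiO$_2$ (a value not stated in the text); the variable names are illustrative.

```python
# Charge, charge density, and capacitor area for one stored bit.
Q = 1.602e-19                 # electron charge, C
EPS_OX = 3.9 * 8.854e-14      # SiO2 permittivity, F/cm (relative value assumed)

electrons_per_bit = 1e6
max_field = 3e6               # working oxide field, V/cm

charge_fc = electrons_per_bit * Q * 1e15            # stored charge in fC
density_fc_per_um2 = EPS_OX * max_field * 1e15 * 1e-8   # C/cm^2 -> fC/um^2
area_um2 = charge_fc / density_fc_per_um2

print(f"charge per bit : {charge_fc:.0f} fC")
print(f"charge density : {density_fc_per_um2:.1f} fC/um^2")
print(f"capacitor area : {area_um2:.1f} um^2")
```

The result, about 160 fC stored at roughly 10 fC/μm$^2$ over some 16 μm$^2$, is the planar area that the trench capacitor folds into the depth of the silicon.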
A. Chip Wire

Another frequently recognized limit to miniaturization is increasing ohmic resistance of wires. This problem derives from the limitations of manufacturing processes: as the width of wires is reduced, their thickness must also be reduced. It is difficult to fabricate structures with thickness equal to or greater than their width with high yield. The cross sectional area of wires decreases nearly as the square of their width.

A simple model of wiring reveals the important features (Keyes, 1982a). A chip can be regarded as a collection of logic gates, each occupying a cell of area $A$. The area $A$ must contain a number of devices of total area $A_D$ and area devoted to wiring channels that traverse it. The latter area is the number of channels times their length times their width. The length of a channel is just $A^{1/2}$ and, calling their width $W$ and their number $N_T$, the area of the wire channels is

$$ A_w = N_T W A^{1/2} \tag{47} $$

The area per gate, containing wire and devices, is $A_w + A_D$. The wire and the devices can be placed in $K$ layers. Thus

$$ K A = N_T W A^{1/2} + A_D \tag{48} $$

Since the total length of wire in the logic cell is $N_T A^{1/2} = L_w$, (48) can also be written

$$ K (L_w/N_T)^2 = W L_w + A_D \tag{49} $$

When the wiring area dominates $A_D$, as in gate arrays for high-speed logic, then a good approximation to the solution of the quadratic equation (49) for $L_w$ is

$$ L_w = \frac{N_T^2 W}{K} + \frac{A_D}{W} \tag{50} $$
If the sheet resistance of the wiring is $\rho_\square$ and the width of the wire is $W/2$, then the resistance of the wire in a cell, which must equal the average wire driven by a logic gate, is $R_w = 2\rho_\square L_w/W$. Using (50)

$$ R_w = 2 \rho_\square \left[ \frac{N_T^2}{K} + \frac{A_D}{W^2} \right] \tag{51} $$

The dominant term, the first, does not depend on $W$ and is not obviously affected by miniaturization. There are, nevertheless, strong incentives for reducing the width of wires on chips. Reducing the area of circuits is an essential point of integration. In the approximation leading to (50) the area per logic cell is

$$ A = \frac{N_T^2 W^2}{K^2} + \frac{2 A_D}{K} \tag{52} $$
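The wiring model of Eqs. (49) to (52) is easy to explore numerically. The sketch below solves the quadratic (49) exactly for the wire length per cell, compares it with the wiring-dominated approximation $N_T^2 W/K + A_D/W$, and evaluates the wire resistance $2\rho_\square L_w/W$. The device area of 500 μm$^2$ is a hypothetical value, the other inputs echo values quoted elsewhere in this section, and the function names are illustrative.

```python
import math


def wire_length_exact(n_t, width, k, a_d):
    """Positive root of K (L/N_T)^2 - W L - A_D = 0, Eq. (49)."""
    root = math.sqrt(width ** 2 + 4.0 * k * a_d / n_t ** 2)
    return n_t ** 2 * (width + root) / (2.0 * k)


def wire_length_approx(n_t, width, k, a_d):
    """Wiring-dominated approximation to the solution of Eq. (49)."""
    return n_t ** 2 * width / k + a_d / width


def wire_resistance(rho_sheet, l_w, width):
    """Resistance of a wire of width W/2 and length L_w."""
    return 2.0 * rho_sheet * l_w / width


def cell_area(l_w, n_t):
    """Cell area from L_w = N_T A^(1/2)."""
    return (l_w / n_t) ** 2


# Lengths in micrometres; A_D = 500 um^2 is a hypothetical device area.
n_t, width, k, a_d = 16, 5.0, 3, 500.0
exact = wire_length_exact(n_t, width, k, a_d)
approx = wire_length_approx(n_t, width, k, a_d)
print(f"L_w exact  = {exact:.1f} um, approx = {approx:.1f} um")
print(f"R_w        = {wire_resistance(0.1, exact, width):.1f} ohm")
print(f"cell area  = {cell_area(exact, n_t):.0f} um^2")
```

With these inputs the approximation overestimates the exact wire length by a few percent, which gives a feel for when the wiring-dominated limit is adequate.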
The limitations on $W$ are one of the important limits to the level of integration; high levels of integration require small areas per circuit. The effects of miniaturization on $R_w$ appear in two other ways. The thickness of wires is decreased with their width because of the limitations of fabrication technology, causing an increase in $\rho_\square$. Also, it is found that the required number of tracks, $N_T$, increases as levels of integration increase (Heller et al., 1977). For these reasons increasing wire resistance is, in fact, a serious problem of miniaturization. The wire resistance is reduced by increasing $K$, the number of wire layers. Increasing $K$ is desirable from all performance points of view. In the framework of the above approximation the wiring capacitance per logic cell is

$$ C = \left[ \frac{N_T^2 W^2}{K} + A_D \right] \frac{\epsilon_i}{t_i} \tag{53} $$

$C$ is also reduced by large $K$. $\epsilon_i$ and $t_i$ are the dielectric constant and the thickness of the insulator that separates the wiring from the substrate. The trend to narrower lines and more layers of wire seems destined to continue. Because of the nature of IC fabrication processes $t_i$ and $W$ tend to scale with one another and $\epsilon_i W/t_i$ tends to remain constant at 2 pF/cm $= c_0$.

At the other extreme limit, the optimum design would allot an area $A = A_D$ to each logic gate (assuming that all of the devices must be fabricated in the substrate) and place all of the wiring in other layers. If $A = A_D$, then from (48)

$$ K = 1 + \frac{N_T W}{A_D^{1/2}} \tag{54} $$

The required values of $K$ to achieve this limit are about 5 for current technologies and can be anticipated to rise to perhaps 8 in plausible
extrapolations to the end of the century. With such large numbers of layers, larger channel widths will be necessary in the higher wiring levels than in the layers near the substrate. $W$ in (48) should then be interpreted as the harmonic mean of the widths in the various layers. Examining this case in the approximation $K \gg 1$ yields for the wire resistance and capacitance

$$ R_w = \frac{2 \rho_\square N_T A_D^{1/2}}{W}, \qquad C_w = \ldots \tag{55} $$

Thus there is a time constant

$$ R_w C_w = \ldots \tag{56} $$

For example, assuming $\rho_\square = 0.1$ ohm for metallic wires, $N_T = 30$, $W = 2$ μm, and $A_D = 30W^2$ gives $R_w C_w = 4$ ps. Wire resistance does not seem to be an important consideration if this limit can be reached.

The most serious limitation arising from the need for a large amount of wire at present is the difficulty of fabricating it. Let there be some maximum length of wire, $L_{max}$, that can be fabricated with acceptable yield on a chip (yield here is the probability that the wire is not defective). Then the number of logic gates that can be integrated on the chip is
$$ M = \frac{L_{max}}{L_w} \tag{57} $$

Equation (49) can be expressed in terms of $M$ and $L_{max}$

$$ K \left( \frac{L_{max}}{M N_T} \right)^2 = \frac{W L_{max}}{M} + A_D \tag{58} $$

$L_{max}$ depends on the channel width, $W$. The narrower the channel, the greater the probability of a defect. The wire yield $Y$, the probability that there is no defect in a wire of length $L$, depends on $L$ as

$$ Y = \exp(-y(W) L) \tag{59} $$

The probability per unit length of a defect, $y(W)$, increases with decreasing $W$. For example, if $y(W) = Z/W$, then the maximum length depends on the yield as

$$ L_{max} = \frac{W}{Z} \ln\frac{1}{Y} \tag{60} $$
Using this value of $L_{max}$ in (57) and observing from (49) that $L_w > (N_T^2 W/K)$ yields for a maximum level of integration

$$ M < \frac{K}{Z N_T^2} \ln\frac{1}{Y} \tag{61} $$
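The yield-limited level of integration can be checked with a few lines of code. The sketch below assumes the exponential yield law with $y(W) = Z/W$, so that $L_{max} = (W/Z)\ln(1/Y)$, and bounds $M$ by $L_{max}$ divided by the smallest wire length per cell, $N_T^2 W/K$. The value $Z = 10^{-6}$ is an inference from the quoted $L_{max} = 5$ m at $W = 5$ μm, not a number read directly from the text, and the function names are illustrative.

```python
import math


def max_wire_length(width, z, target_yield):
    """Longest wire (same unit as width) with defect-free probability target_yield."""
    return width / z * math.log(1.0 / target_yield)


def max_gates(k, n_t, z, target_yield):
    """Upper bound on the number of logic gates, using L_w > N_T^2 W / K."""
    return k * math.log(1.0 / target_yield) / (z * n_t ** 2)


# Representative values: K = 3, N_T = 16, W = 5 um (in metres), Y = 1/e.
# Z = 1e-6 is inferred, see the note above.
l_max = max_wire_length(width=5e-6, z=1e-6, target_yield=1.0 / math.e)
m_max = max_gates(k=3, n_t=16, z=1e-6, target_yield=1.0 / math.e)
print(f"L_max = {l_max:.2f} m, M < {m_max:.0f}")
```

The bound comes out near $10^4$ gates, and it scales as $1/Z$, so any process improvement that lowers the defect probability per unit length raises the attainable level of integration in proportion.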
For example, the following numbers are somewhat representative of 1985 technology: $K = 3$, $N_T = 16$, $Z = 10^{-6}$, $W = 5$ μm. If $Y = 1/e$, then $L_{max} = 5$ m and the limit on $M$ given by (61) is $10^4$. $(1/Z)$, Eq. (60), is essentially the ratio of the total wire length to wire width. Improvements in technology that increase this ratio, decreasing $Z$, are necessary to increasing integration. The form of limit (61) obviously depends on the particular choice of $y(W)$ and should not be interpreted too rigorously. The yield also depends directly on $K$ and the ability of fabrication techniques to provide many connections among wiring layers.

B. Electromigration
Another limit on the reduction of the wire width is set by electromigration. The high current densities in miniaturized wires produce a force on atoms in metallic wires that causes them to move (D’Heurle and Rosenberg, 1973). The motions are related to diffusion; the jumps of an atom from one site to another are thermally activated, and the electrical force imposes a preference for jumping in one direction. The wires are polycrystalline and the events associated with electromigration are complicated, involving grain boundaries and nucleation phenomena. Any divergence in the atomic current leads to depletion of material. A void nucleated at a grain boundary can grow and lead to an open circuit in the wire. The relation of the mean time to failure of a wire to the most important parameters determining the failure rate can be approximated for practical purposes by the form (Ghate, 1983),

$$ \mathrm{MTF} = A J^{-n} \exp(H/kT) \tag{62} $$

Here $J$ is the current density, $T$ is the temperature, and $H$ is an activation energy. Other variables, such as grain size, conductor dimensions, surface conditions, and efficiency of heat sinking are reflected in the parameters $A$ and $n$. There is a wide variability of lifetimes around the mean, even in nominally identical conductors, largely because of the differences in arrangement of grains. An example of a lifetime distribution is shown in Fig. 25.
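The way such a lifetime law is used to extrapolate from stress conditions to operating conditions can be sketched as follows. The code assumes the commonly used form MTF $= A J^{-n} \exp(H/kT)$ for Eq. (62); the exponent $n = 2$, the activation energy $H = 0.5$ eV, and the stress and use conditions are hypothetical values chosen only to show the calculation, not parameters taken from the text.

```python
import math

K_B = 8.617e-5        # Boltzmann constant, eV/K


def mean_time_to_failure(a, j, n, h_ev, temp_k):
    """Mean time to failure for current density j (A/cm^2) at temperature temp_k."""
    return a * j ** (-n) * math.exp(h_ev / (K_B * temp_k))


def acceleration_factor(j_test, t_test, j_use, t_use, n, h_ev):
    """Ratio MTF(use) / MTF(test); the prefactor A cancels."""
    current_term = (j_test / j_use) ** n
    thermal_term = math.exp(h_ev / K_B * (1.0 / t_use - 1.0 / t_test))
    return current_term * thermal_term


# Hypothetical stress test at 1e6 A/cm^2 and 378 K, use at 2e5 A/cm^2 and 353 K.
factor = acceleration_factor(1e6, 378.0, 2e5, 353.0, n=2.0, h_ev=0.5)
print(f"acceleration factor ~ {factor:.0f}")
print(f"2000 h under stress corresponds to ~ {2000 * factor / 8766:.0f} years of use")
```

Because the result depends exponentially on $H$ and as a power of $n$, the uncertainty in those two parameters translates into a large uncertainty in the extrapolated lifetime, which is why a conservative design margin is needed.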
FIG. 25. The distribution of times to failure in accelerated testing of aluminum conductors at $10^6$ A/cm$^2$, 105°C (Attardo and Rosenberg, 1970). The distribution has the lognormal form with a mean time to failure of 2000 hours. [Horizontal axis: failures, cumulative fraction from 0.50 to 0.99.]
Accelerated testing is necessary to design structures intended for the long lifetimes expected of electronic circuitry. The accelerated testing is accomplished by the use of high current densities and high temperatures and extrapolation to the conditions of use with (62). The great variability of actual lifetimes around a mean makes a conservative approach necessary and also leads to uncertainty in the values of $H$ and $n$. A maximum permissible current density in copper-doped aluminum at 80°C appears to be about $2 \times 10^5$ A/cm$^2$.

Preventing the spreading of voids through the conductor is a promising approach to increasing the electromigration limit. This has been demonstrated by insertion of a layer of an intermetallic compound in the center of the conductor (Howard et al., 1978). Preparing metal lines with large grains that have fewer grain boundary intersections at which atomic flux divergences occur is another method of attack on electromigration (Attardo and Rosenberg, 1970; Vaidya et al., 1980; Vaidya and Sinha, 1981).

C. Chip Interconnection
The density at which chips can be placed on a substrate is also limited by the density at which the interconnections among them can be fabricated. The interconnections consist of wires fabricated by mass production methods in layers in the chip substrate (Blodgett, 1983). The techniques used in substrate manufacture do not permit use of dimensions as small as those on chips; coarser wiring is used. The wire requirements can be quantified in a
way similar to that applied to chips in Section A. The amount of wire in the substrate is determined by the number of chip terminals that must be interconnected. Let a module, the substrate on which the chips are mounted, contain K layers of wiring channels of width W and have a total area MA, where M is the number of chips on the module and A is the module area per chip. The total length of wire channel is then

$$L_w = \frac{KMA}{W}. \qquad (63)$$

$L_w$ must be equal to the amount of wire needed to interconnect the chips. If each chip has C terminals, there are MC terminals to be connected. Assuming as a first approximation that the terminals are connected in pairs, there are MC/2 wires. Let the average length of the wires be measured in chip spacings traversed; that is, average length = $mA^{1/2}$. Then

$$L_w = \frac{MC}{2}\,mA^{1/2}. \qquad (64)$$

From (63) and (64),

$$A = \frac{m^2 C^2 W^2}{4K^2}. \qquad (65)$$
C increases with level of integration and is roughly proportional to the 1/2 or 2/3 power of the number of logic gates on a chip (Landman and Russo, 1971). Thus, according to (65), maintaining the close spacing of chips requires increasing K as integration advances. The other alternative, reducing W, is not feasible without a very major change in packaging technology. Experience shows that the length of wires on a module is some appreciable fraction of the linear size of the module; that is, the parameter m in (63)-(65) increases with M, very roughly as $m \approx M^{1/2}/4$. Thus, modules containing more chips also demand either more area per chip or more wire layers.
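The wiring estimate above can be turned into a small calculation. This is a minimal sketch of the relation A = m²C²W²/(4K²) combined with the rough rule m ≈ M^(1/2)/4; the function names and the sample numbers (200 terminals per chip, 0.025 cm channel width, 10 layers, 100 chips) are hypothetical and chosen only for illustration.

```python
import math


def average_wire_length_in_chip_spacings(num_chips):
    """Rough empirical rule quoted in the text: m is about sqrt(M) / 4."""
    return math.sqrt(num_chips) / 4.0


def module_area_per_chip(terminals_per_chip, channel_width_cm, wiring_layers, num_chips):
    """Module area per chip, A = m^2 C^2 W^2 / (4 K^2), in cm^2.

    terminals_per_chip is C, channel_width_cm is W, wiring_layers is K,
    and num_chips is M (used only to estimate m).
    """
    m = average_wire_length_in_chip_spacings(num_chips)
    return (m * terminals_per_chip * channel_width_cm / (2.0 * wiring_layers)) ** 2


# Hypothetical module: C = 200, W = 0.025 cm, K = 10, M = 100.
area_10_layers = module_area_per_chip(200, 0.025, 10, 100)
area_20_layers = module_area_per_chip(200, 0.025, 20, 100)
```

For these assumed values the area per chip is about 0.39 cm², and doubling the number of wiring layers reduces it by a factor of four, which illustrates why more layers are needed as terminal counts grow.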
VII. FABRICATION
The advance of microelectronic technology is paced by the rate of development of fabrication methods. It is difficult to identify any real physical limits to minimum dimensions of manufactured structures. Conductors with widths in the range of tens of nanometers (Broers et al., 1976) and layers with thicknesses of a few atomic radii can be fabricated. Additional factors are important in considering limits to the fabrication of useful electronic devices, however: economic considerations become essential.
The electronic circuitry on chips consists of complex patterns of materials. The patterns are originally formed in radiation-sensitive substances by beams of various types. Certain fundamental physical facts are associated with practical limits to pattern formation with beams. One is the increasing energy of quanta of electromagnetic radiation with decreasing wavelength. Diffraction requires the wavelength of exposing radiation to be less than the dimension of structures to be fabricated. Thus, one micron resolution demands photon energies greater than 1.24 eV. Tenth micron resolution means radiation of energy greater than 12.4 eV, and a 0.01 μm wavelength corresponds to 124 eV radiation. Several factors impede the production of structures on such small scales, however.

Wavelength is usually not a consideration in exposure with particle beams; the wavelengths are very short compared to foreseeable structures. The particles of principal interest are electrons, as electron beams have been successfully used in the fabrication of microelectronic circuits (Brewer, 1980). The limitations on the use of electron beams in microfabrication stem from the nature of electron lenses and electron emitters. High energy particles are used for somewhat different reasons. The lenses are subject to chromatic aberration; their focal length depends on the energy with which an electron enters the lens system. The energy is variable because electrons are emitted from the source with a Maxwellian velocity distribution. The relative spread of electron energies and the enlargement of a spot by chromatic aberration are reduced by a high accelerating voltage. The thermal distribution of electron velocities also means that transverse velocity components are present. A high accelerating voltage increases the fraction of the electrons that can be focussed into a small spot.
The mutual repulsion of electrons in a beam also causes spreading of a spot at a rate that increases with increasing density of electrons in the beam. High velocity, produced by a high accelerating voltage, reduces the density and the space charge spreading. The high energies used in lithographic exposure can cause atomic displacements, damage, that may affect the operation of devices. This possibility limits the application of exposure tools in particular cases. Another physical fact is that macroscopic tools only control the average rate of arrival of quanta at a substrate. The number of quanta that actually arrive at a particular small area of a substrate is randomly distributed about some average value. A relatively small variability in the fabrication process, an acceptable yield, requires that the number of quanta used to expose a spot be large, one hundred or more. This latter observation leads to an economic limitation, as it affects “throughput”. Throughput means the rate at which silicon is processed. It
must be high to justify the large investment in equipment needed for modern semiconductor manufacturing. Assume that M quanta per spot are required as the nominal exposure that reduces the statistical variability in the number actually arriving at the spot to an acceptable level ($M \gg 1$). Let the spot have diameter d, so that quanta of energy at least hc/d are used in the exposure and an energy Mhc/d is deposited in the spot. The rate at which spots are exposed must be proportional to $1/d^2$ to maintain a constant throughput, that is, to expose silicon area at a constant rate as d varies. Therefore, the rate at which energy must be delivered to a substrate varies as $1/d^3$ in this limit. Stringent demands are placed upon radiation sources and optics of exposure tools.

Some length parameters related to various exposure methods are shown in Fig. 26. A few general observations can be made. It becomes increasingly difficult to localize the effects of exposure as dimension is reduced because of the increasing range of electrons with increasing energy. For example, a 1 keV quantum has a wavelength of only about 0.001 μm but can produce an electron with a range of 0.02 μm. The exposure produced by energetic particles is not confined to the area in which a beam is focussed, but occurs throughout the particle's range. This effect is most prominent in exposure with electrons. Electrons experience many large-angle scattering events because of their small mass compared to atoms and produce exposure at an appreciable distance from their point of entry. Fortunately, a large part of this proximity effect can be compensated in computer-controlled electron beam tools by adjusting the exposure at each site to account for any exposure received from nearby sites.

The sensitivity of resist materials in current use ranges up to 1 J/cm$^2$. The energy densities corresponding to 100 quanta per spot are also shown in Fig. 26. The quantum limit ($M \approx 100$ quanta per spot) is not yet here. The energy deposited per unit area in the quantum limit increases rapidly with decreasing dimension. Further, less energy is deposited in exposure with photons than in exposure with electrons at a given dimension in the quantum limit. The long electron ranges shown cannot be regarded as a dimensional limit in the same way as the wavelength of radiation is a limit, however; a long electron range produces the proximity effect mentioned above, which can be corrected to a great extent.

FIG. 26. Relations between length parameters relevant to lithographic exposure and energetic quantities. Line a is the energy of an electromagnetic quantum as a function of its wavelength. Line b is the range of electrons in silicon as a function of energy; its dashed extension suggests the range of secondary electrons produced by quanta of radiation. The shaded area encompasses the range-energy relation of various ions. The dotted lines are contours of the density of energy deposition (J/cm$^2$) when 100 quanta or particles are used to expose each spot of the diameter on the abscissa. Abscissa: length (μm).

In conventional fabrication methods, the radiation exposes some kind of resist material, changing its solubility in a developer. The pattern created in the resist must be converted to some physical structure that forms part of a device, usually a pattern of metallic conductors or of dopants in a semiconductor. The conversion is effected by a sequence of processes involving material deposition and selective material removal by various kinds of etching.
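The photon-energy and quantum-limit figures discussed above can be checked with a short calculation. The sketch below assumes photons whose wavelength equals the spot diameter d (the minimum energy hc/d), a square spot of area d², and Poisson statistics for the number of quanta arriving at a spot; the function names are illustrative only.

```python
import math

HC_EV_UM = 1.23984     # Planck constant times speed of light, in eV·μm
EV_TO_JOULE = 1.602177e-19


def photon_energy_ev(wavelength_um):
    """Energy of an electromagnetic quantum of the given wavelength."""
    return HC_EV_UM / wavelength_um


def relative_dose_fluctuation(quanta_per_spot):
    """Relative standard deviation of a Poisson-distributed count with mean M."""
    return 1.0 / math.sqrt(quanta_per_spot)


def quantum_limit_dose_j_per_cm2(spot_diameter_um, quanta_per_spot=100):
    """Energy per unit area when M quanta of energy hc/d expose a d x d spot."""
    energy_per_spot_j = quanta_per_spot * photon_energy_ev(spot_diameter_um) * EV_TO_JOULE
    spot_area_cm2 = (spot_diameter_um * 1.0e-4) ** 2  # 1 μm = 1e-4 cm
    return energy_per_spot_j / spot_area_cm2


dose_tenth_micron = quantum_limit_dose_j_per_cm2(0.1)
dose_ratio = quantum_limit_dose_j_per_cm2(0.05) / quantum_limit_dose_j_per_cm2(0.1)
```

Under these assumptions, 100 quanta per spot correspond to a 10% statistical fluctuation in dose, the photon-limited dose at d = 0.1 μm is about 2 × 10⁻⁶ J/cm², and halving d raises the dose by a factor of eight, consistent with the 1/d³ scaling.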
The accuracy with which the original pattern can be reproduced from one process step to another limits the fabrication of microminiaturized structures. A major improvement in the fidelity of pattern transfer must accompany any significant reduction of the dimensions of devices. Time-consuming development of new process methods limits the rate at which miniaturization can advance.

Alignment is a further problem that limits the rate of advance of miniaturization. Device structures are made by several successive steps. For example, first a highly conductive region is formed in a semiconductor by introducing a dopant, then a metallic contact is deposited on the doped region. Alignment means locating the same place on a chip several times so that every step in a sequence is properly placed. Increasing the accuracy of alignment is part of miniaturization and also requires the development of new methods as dimensions are reduced.
VIII. DISSIPATION OF ENERGY

The dissipation of energy to heat in logical operations is one of the limits on the computational process that has attracted most attention. Removing the heat produced in logic operations is one of the limitations on the performance of large computing systems (Keyes, 1970; Vilkelis and Henle, 1979).

A. Fundamental Limits

It is known that computation can be performed in principle without the dissipation of energy. Landauer (1986) and Bennett and Landauer (1985) have reviewed the subject. However, the idealized systems that demonstrate the possibility of dissipationless computation are far removed from the devices that have actually proved useful. Perhaps the concept closest to reality is the molecular Turing machine, whose operation closely resembles the transcription of genetic information in living systems (Bennett, 1982).

A Turing machine consists of a black box or “head” that can assume any of a certain set of states and a movable tape on which a member of an alphabet of symbols can be written by the head. The machine operates step-by-step. At each step the head reads a symbol from the tape. Logic in the head combines the symbol read and the current state of the head to produce a new state in the head, write a new symbol on the tape, and move the tape to another position. The machine is then ready for the next step. A Turing machine can perform any possible computation.

The tape of the molecular Turing machine would consist of a long molecule such as a strand of DNA that can bind various symbol molecules at a series of positions along its length. The state of the machine is represented by the attachment of one of a set of state molecules, a molecule selected to record the state of the machine, at the current position of the tape.
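The step-by-step operation of a Turing machine described above can be sketched in a few lines. The simulator and the bit-inverting rule table below are hypothetical illustrations of the read, write, move, and change-state cycle; they do not model the molecular machine itself.

```python
from collections import defaultdict


def run_turing_machine(rules, tape, state="start", halt_state="halt",
                       blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol_read) to (new_state, symbol_to_write, move),
    where move is -1 (left), 0 (stay) or +1 (right).
    """
    cells = defaultdict(lambda: blank, enumerate(tape))
    position = 0
    steps = 0
    while state != halt_state:
        if steps >= max_steps:
            raise RuntimeError("step limit reached before the machine halted")
        symbol = cells[position]                      # the head reads a symbol
        state, write, move = rules[(state, symbol)]   # logic in the head
        cells[position] = write                       # the head writes a symbol
        position += move                              # the tape is moved
        steps += 1
    used = sorted(cells)
    return "".join(cells[i] for i in range(used[0], used[-1] + 1)).strip(blank)


# Hypothetical rule table: invert every bit, then halt at the first blank.
INVERT_RULES = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", 0),
}

result = run_turing_machine(INVERT_RULES, "1011")  # "0100"
```

The rule table plays the role of the logic in the head: each entry combines the current state and the symbol read to determine the next state, the symbol written, and the tape motion.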
The calculation is carried out by a collection of enzymes that can catalyze the reaction on the chain that places the next symbol molecule on the chain and replaces the previous state molecule with the appropriate one. Chemical reactions, such as those involved in the operation of the molecular Turing machines, are driven by imbalances in the concentration of reactants. At equilibrium a reaction proceeds equally rapidly in both directions. No progress is made in either direction and no energy is consumed; diffusion and Brownian motion cause reactants to come together or to dissociate occasionally. A reaction can be made to proceed preferentially in one direction by controlling the concentration of reactants, in accord with the law of mass action. The reaction in the case of the molecular Turing machine
consumes the symbol monomers and can be caused to proceed in the direction of the computation by an excess of these monomers. However, the reaction is then no longer in equilibrium and dissipation occurs.

The source of energy dissipation in logic can be identified as the discarding of information in an idealized system. The basic idea can be seen by thinking of a box partitioned into two equal parts. A bit of information can be stored by putting a gas of N particles in one or the other side of the box. If the partition separating the two chambers is removed, thereby erasing the stored information, the entropy of the system is increased by Nk ln 2. An amount of work NkT ln 2 must be performed to compress the gas, and energy NkT ln 2 dissipated to heat, to restore an information-containing state. Since writing a bit can be regarded as the simplest logic operation and the minimum value of N is one, kT ln 2 represents a minimal dissipation per logic operation. This elementary system is still quite far from any real logic device. However, the result does suggest that kT is an appropriate unit in which to measure dissipation in logic and thus compare actual energy utilization with a crude limit.

The dissipation in this example is essential because removing the partition between the two cells is thermodynamically irreversible; the cycle through which the system has been carried cannot be run in reverse. Common logic operations have a similar property. Consider the NOR in Table I. The inputs cannot be determined in general from the output; the computation cannot be reversed. Pursuing the example, the dissipation could be avoided by eliminating the destruction of the original information. The new information could be written with an apparatus that could move the gas from one cell to the other at constant volume if the original location of the gas were known.
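Both points in the preceding paragraphs can be made concrete. The sketch below evaluates NkT ln 2 at an assumed temperature of 300 K and tabulates which input pairs of a two-input NOR produce each output; the function names are illustrative.

```python
import math
from itertools import product

BOLTZMANN_J_PER_K = 1.380649e-23
EV_TO_JOULE = 1.602177e-19


def erasure_dissipation_joules(temperature_k, n_particles=1):
    """Minimal dissipation N k T ln 2 for erasing one bit stored in N particles."""
    return n_particles * BOLTZMANN_J_PER_K * temperature_k * math.log(2.0)


def nor(a, b):
    """Two-input NOR: the output is 1 only when both inputs are 0."""
    return int(not (a or b))


def inputs_by_output(gate):
    """Group all two-bit input pairs by the output the gate produces."""
    table = {}
    for a, b in product((0, 1), repeat=2):
        table.setdefault(gate(a, b), []).append((a, b))
    return table


limit_j = erasure_dissipation_joules(300.0)   # about 2.9e-21 J
limit_ev = limit_j / EV_TO_JOULE              # about 0.018 eV
nor_table = inputs_by_output(nor)
```

At 300 K the crude limit kT ln 2 is about 2.9 × 10⁻²¹ J, or roughly 0.018 eV. The NOR table shows three distinct input pairs mapping to the output 0, so the inputs cannot be recovered from the output and the operation discards information.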
Models of computational systems that would avoid dissipation in principle by storing all intermediate results to enable the system to be restored to its initial state without dissipation have been described (Bennett, 1973). Likharev (1982) has proposed a scheme for doing this that does not seem too remote from reality.

Stein (1977) has considered the effect of thermal fluctuations on information in electrical form. If a bit is represented by a charge or absence of charge on a capacitor C, then the energy of storage is $CV^2/2$. The mean thermal fluctuation energy in the capacitor, $kT/2$, can produce an error in reading the bit. Stein shows, for example, that the probability of incorrect reading is reduced to lo-'" if $CV^2/2 > 165kT$.

Still another view of a fundamental limit to dissipation is based on the observation that according to the quantum mechanical uncertainty principle an event localized in a time t must be associated with an energy $\sim\hbar/t$, suggesting that this quantity is a lower limit to the dissipation of a logic operation performed in time t. It does not seem possible to translate this rather general thought into specific models involving particles, potentials, fields, or Hamiltonians, however. Its significance is, therefore, somewhat vague. A version that is difficult to question is that nothing can be done with a photon of energy $h\nu$ in a time less than $1/\nu$.

B. Power Supply and Cooling
The energy dissipated in practical electrical logic is 10' kT or more and far exceeds any of these limits. Contemporary electrical logic uses large signals to influence many electrons at each step. We now turn to this topic, which constitutes an important limit.

The need for large signals in logic means that large amounts of power must be supplied to logic chips and this power must be removed as heat. The ability of technology to meet these demands is limited by the properties of known materials. Power is supplied to chips through the same kind of connections that are used for logic signals. The connections onto and off the chip are subject to electromigration and are limited to current densities not much greater than $10^5$ A/cm$^2$. Ohmic resistance may limit current densities to even lower values. For example, the resistive voltage loss in a conductor of resistivity $10^{-5}$ ohm cm carrying current at a density $10^5$ A/cm$^2$ is 1 V/cm, an amount that may not allow the tolerances on the power supply to be met on all chips and devices. The limited number of connections to the chip and their finite current-carrying capacity limit the power that can be supplied.

The heat flow paths in a large system are shown schematically in Fig. 27. Most of the energy drawn from the power supply is converted to heat in the devices. Heat is conducted away from devices into the body of the chip, encountering the thermal resistance of the silicon in doing so. If the dimensions of the region under the emitter in a bipolar transistor are small compared to the thickness of the chip and the distance between devices, the thermal resistance in the silicon can be regarded as a spreading resistance.

FIG. 27. Heat flow from devices to a cooling fluid. Heat spreads into the chip from the device where it is produced at high density. It is passed from the chip to cooling structures from which it is transferred at low density to a moving fluid that carries it out of the system.

FIG. 28. Values of thermal resistance encountered in spreading of heat from a rectangular source region of dimensions L, W (L > W) into a semi-infinite substrate with thermal conductivity K. Horizontal axis: L/W.

The spreading resistance of rectangular regions has been calculated and the results are presented in Fig. 28 (Loewen and Shaw, 1954). The relation is well represented by the formula
Here K is the thermal conductivity of the substrate and L and W are the dimensions of the rectangle (L is the larger of these dimensions in Eq. 66). Bipolar transistors projected for the future with 1/4 μm dimensions and currents around 0.5 mA (Solomon and Tang, 1979) will have thermal resistances to the substrate exceeding $10^4$ deg/W and will rise 10°C above the substrate temperature. The time constants for a steady-state heat flow to be achieved will be a few nanoseconds with 1 μm dimensions. As these times are longer than the anticipated delay times on the chips, the temperature of the device will depend to some extent on its switching history. Examples of calculated times and thermal resistances are presented in Fig. 29.

The thermal resistance given by Eq. (66) actually refers to the relation of the maximum temperature of the rectangle, which is found at the center, to heat current uniformly distributed across the rectangle. The temperature increase will be only about half of this amount at the corners. The temperature gradient will cause a device to operate nonuniformly across its area. A 5 degree temperature difference changes the current density in a junction by a factor 2/3.
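The magnitudes quoted above can be checked against the textbook result for a uniform heat flux applied over a rectangle on a semi-infinite solid. This is a sketch under that assumption, not necessarily the fitted form of Eq. (66); the function names and the silicon conductivity of 1.5 W/(cm·K) are assumed for illustration.

```python
import math


def center_thermal_resistance(length_cm, width_cm, conductivity_w_per_cm_k):
    """Center temperature rise per watt (K/W) for uniform flux over an L x W area.

    Obtained by integrating q / (2*pi*K*r) over the rectangle on the surface
    of a semi-infinite solid and dividing by the total power q*L*W.
    """
    L, W, K = length_cm, width_cm, conductivity_w_per_cm_k
    return (math.asinh(L / W) / L + math.asinh(W / L) / W) / (math.pi * K)


def corner_thermal_resistance(length_cm, width_cm, conductivity_w_per_cm_k):
    """Corner temperature rise per watt; for uniform flux it is half the center value."""
    return 0.5 * center_thermal_resistance(length_cm, width_cm, conductivity_w_per_cm_k)


K_SILICON = 1.5            # W/(cm K), assumed room-temperature value
QUARTER_MICRON_CM = 0.25e-4

r_center = center_thermal_resistance(QUARTER_MICRON_CM, QUARTER_MICRON_CM, K_SILICON)
r_corner = corner_thermal_resistance(QUARTER_MICRON_CM, QUARTER_MICRON_CM, K_SILICON)
```

For a 1/4 μm square source this gives about 1.5 × 10⁴ K/W at the center, in line with the figure of more than 10⁴ deg/W quoted above, so a device dissipating somewhat under a milliwatt would run roughly 10 degrees above the substrate.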
FIG. 29. Thermal resistances and thermal relaxation times for rectangular areas on a silicon surface. The solid lines represent the thermal resistance, left hand scale, and the dashed lines represent the relaxation times, right hand scale. The curves are labelled with the values of L/W to which they apply. Horizontal axis: W (μm).
The heat spreads into the chip from the devices. It is eventually transferred to the atmosphere, a large heat sink. The density at which this is done depends on the investment in cooling hardware and to a less important degree on the permissible temperature difference between the chip and its environment. The allowable increase in temperature of a chip is limited by reliability; activated processes such as diffusion, creep, chemical reaction, and thermomigration alter device structures increasingly rapidly as temperature is increased. The activation energy of failure mechanisms in integrated circuitry is about 0.5 eV. At 70°C a temperature change of 15°C changes the mean life of components by a factor 2. Careful attention to the details of heat transfer is essential. It is desired to place the chips in Fig. 27 as closely together as possible in order to minimize the transit time of signals between chips. The spacing is limited by the density at which heat can be removed from the chip array. The historical trend of cooling in high performance systems is shown in Fig. 30. Early systems were cooled by free convection of the air. Fans were soon introduced to provide forced convection. At heat transfer rates above a few tenths of a watt per square centimeter, special structures must be provided to increase the area of solid-air interface across which heat can be moved. Heat
FIG. 30. Evolution of cooling technologies for large high-performance computers. Horizontal axis: year.
transfer to air becomes quite difficult around 1 W/cm2 and liquid cooling is used to take advantage of higher heat transfer coefficients at solid-liquid interfaces. There are several compromises involved in the design of heat transfer structures: thin fins introduce a thermal resistance into the path of heat flow into the fluid, thick fins reduce the number of fins that can be placed in a given volume; short fins increase the density at which heat must be transferred to the fluid, long fins increase the volume of fluid that must be provided; small separations between fins cause high viscous resistance to fluid flow, large separations mean that heat must flow a long distance through the thermal resistance of the fluid to reach the rapidly moving part and also reduce the number of fins. The optimum design must take account of many factors, including the constraints upon the system that is called upon to supply the fluid, notably, pressure and volume of fluid required. Tuckerman and Pease (1981) showed that there is a length parameter that characterizes a fluid defined by
$$b = \left(\frac{\Gamma\,\mu_F K_F L^2}{C_F P}\right)^{1/4}. \qquad (67)$$
Here $\mu_F$ is the viscosity of the fluid, $K_F$ is its thermal conductivity, and $C_F$ is its heat capacity per unit volume. P is the pressure driving the fluid through a channel of length L. $\Gamma$ is a constant, approximately 100. The optimum channel width is close to b, depending to some extent on the constraints under which optimization is carried out (Tuckerman, 1984; Keyes, 1984). Tuckerman and Pease (1981) recognized that if water is used as the cooling fluid, L is a typical chip dimension, 0.5 cm or 1 cm, and P is around one atmosphere, then $b \approx 50$ μm, a dimension that is easy to fabricate with techniques practiced in semiconductor technology. They demonstrated that hundreds of watts can be removed from a chip by this method, simply etching fins and channels into the back of a microelectronic chip.
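The channel width quoted for water cooling can be reproduced with a short estimate. The sketch below assumes that the length parameter has the form b = (Γ μ_F K_F L² / (C_F P))^(1/4), which is dimensionally a length and matches the numbers in the text but is an assumed form here; the function name and the water properties (near 20°C) are likewise illustrative assumptions.

```python
import math


def channel_length_parameter(viscosity_pa_s, conductivity_w_per_m_k,
                             heat_capacity_j_per_m3_k, pressure_pa,
                             channel_length_m, gamma=100.0):
    """Return b in metres, assuming b = (gamma * mu * K * L**2 / (C * P)) ** 0.25."""
    numerator = gamma * viscosity_pa_s * conductivity_w_per_m_k * channel_length_m ** 2
    denominator = heat_capacity_j_per_m3_k * pressure_pa
    return (numerator / denominator) ** 0.25


# Assumed properties of water near 20 degrees C, in SI units.
WATER_VISCOSITY = 1.0e-3        # Pa s
WATER_CONDUCTIVITY = 0.6        # W/(m K)
WATER_HEAT_CAPACITY = 4.18e6    # J/(m^3 K)
ONE_ATMOSPHERE = 101325.0       # Pa

b_1cm = channel_length_parameter(WATER_VISCOSITY, WATER_CONDUCTIVITY,
                                 WATER_HEAT_CAPACITY, ONE_ATMOSPHERE, 0.01)
b_half_cm = channel_length_parameter(WATER_VISCOSITY, WATER_CONDUCTIVITY,
                                     WATER_HEAT_CAPACITY, ONE_ATMOSPHERE, 0.005)
```

With these assumed values b is about 60 μm for a 1 cm channel and about 45 μm for a 0.5 cm channel, bracketing the figure of roughly 50 μm cited above. The weak fourth-root dependence means that b changes little over a wide range of pressures and fluid properties.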
IX. CONCLUDING REMARKS

Many physical effects retard the advance of microelectronics towards smaller structures, higher levels of integration, higher speeds, lower powers, and greater reliability. Most of these depend on the properties of particular materials. In some cases, such as semiconductor materials, few materials are known and one can regard the consequences of the properties of these as limits. In other cases, for example, radiation-sensitive resists, there is substantial opportunity to devise new compositions and one anticipates a stream of improvements. Advances in the capabilities of microelectronics by and large stem from miniaturization and integration. Continuing these avenues of improvement depends on new device structures and new fabrication tools. The extent to which novel concepts can be utilized is frequently limited by economics, the ever-increasing cost of new instruments and facilities.
REFERENCES

Abbas, S. A., and Dockerty, R. C. (1975). Appl. Phys. Letters 27, 147-148.
Attardo, M. J., and Rosenberg, R. (1970). J. Appl. Phys. 41, 2381-2386.
Bennett, C. H. (1973). IBM J. Res. Dev. 17, 525-532.
Bennett, C. H. (1982). Int. J. Theor. Phys. 21, 905-940.
Bennett, C. H., and Landauer, R. (1985). Scientific American 253 (1), 48-56.
Blodgett, A. J., Jr. (1983). Scientific American 249 (1), 86-96.
Brewer, G. R. (1980). In “Electron-Beam Technology in Microelectronic Fabrication” (G. R. Brewer, ed.), 1-58. Academic Press, New York.
Broers, A. N., Molzen, W. W., Cuomo, J. J., and Wittles, N. D. (1976). Appl. Phys. Letters 29, 596-598.
Cottrell, P. E., Troutman, R. R., and Ning, T. H. (1979). IEEE Trans. Electr. Dev. ED-26, 520-533.
D'Heurle, F., and Rosenberg, R. (1973). In “Physics of Thin Films”, 7 (Academic Press, New York), 257-310.
del Alamo, J. A., and Swanson, R. M. (1986). IEEE Electr. Dev. Letters EDL-7, 628-631.
Donath, W. E. (1981). IBM Journal Res. and Dev. 25, 152-155.
Drangeid, K. E., Sommerhalder, R., and Walter, W. (1970). Electronics Letters 6, 228-229.
Duvurry, C. (1986). IEEE Ckts. and Dev. Magazine 2 (6), 6-10.
Eden, R. C., Welch, B. M., Zucca, R., and Long, S. L. (1979). IEEE Trans. Electr. Dev. ED-26, 299-317.
El Gamal, A. A. (1981). IEEE Trans. Ckts. Sys. CS-28, 127-135.
Gabrielse, G., Dehmelt, H., and Kells, W. (1985). Phys. Rev. Letters 54, 537-539.
Gaur, S. P. (1979). IEEE Trans. Electr. Dev. ED-26, 415-421.
Ghate, P. B. (1983). Solid State Tech. 26 (3), 113-120.
Gunn, J. B. (1968). J. Appl. Phys. 39, 5357-5361.
Heller, W. R., Mikhail, W. F., and Donath, W. E. (1977). Proc. 14th Design Automation Conference (New Orleans, 1977), 32-42.
Hoefflinger, B., Sibbert, H., and Zimmer, G. (1979). IEEE Trans. Electr. Dev. ED-26, 513-520.
Hoeneisen, B., and Mead, C. A. (1972a). Sol. St. Electr. 15, 819-829.
Hoeneisen, B., and Mead, C. A. (1972b). Sol. St. Electr. 15, 891-897.
Howard, J. K., White, J. F., and Ho, P. S. (1978). J. Appl. Phys. 49, 4083-4093.
Keyes, R. W. (1970). Science 168, 796-801.
Keyes, R. W. (1976). Comments Sol. St. Phys. 7, 149-157.
Keyes, R. W. (1981). In “Digital Technology - Status and Trends”, ed. H. Painke. Oldenburg, Munich. 253-271.
Keyes, R. W. (1982a). IEEE J. Sol.-St. Ckts. SC-17, 1232-1233.
Keyes, R. W. (1982b). Int. J. Theor. Phys. 21, 263-273.
Keyes, R. W. (1984). IEEE Trans. Electr. Dev. ED-31, 1218-1221.
Keyes, R. W. (1987). IBM Research Report RC12843 (unpublished).
Keyes, R. W., and Landauer, R. (1970). IBM J. Res. Dev. 14, 152-157.
Kimura, K., and Takahashi, T. (1982). In “Large Scale Integrated Circuits Technology”, ed. L. Esaki and G. Soncini (Martinus Nijhoff, The Hague) 373-398.
Kirk, C. T. (1962). IRE Trans. Electr. Dev. ED-9, 164-174.
Knepper, R. W., Gaur, S. P., Chang, F.-Y., and Srinivasan, G. R. (1985). IBM J. Res. Dev. 29, 218-228.
Kohonen, T. (1972). “Digital Circuits and Devices”. Prentice-Hall, Englewood Cliffs, N. J.
Landauer, R. (1986). In “Der Informationsbegriff im Technik und Wissenschaft”, eds. O. G. Folberth, C. Hackl. Oldenbourg, Munchen. 139-158.
Landman, B. S., and Russo, R. L. (1971). IEEE Trans. Computers C-20, 1469-1479.
Lo, A. W. (1961). IRE Trans. Elect. Comput. EC-10, 416-425.
Likharev, K. K. (1982). Int. J. Theor. Phys. 21, 311-326.
Loewen, E. G., and Shaw, M. C. (1954). Trans. ASME 76, 217.
May, T. C. (1979). IEEE Trans. Components, Hybrids, Manuf. Tech. CHMT-2, 377-387.
Ning, T. H., Cook, P. W., Dennard, R. H., Osburn, C. M., Schuster, S. E., and Yu, H-N. (1979). IEEE Trans. Electr. Dev. ED-26, 346-352.
Ogura, S., Tsang, P. J., Walker, W. W., Critchlow, D. L., and Shephard, J. F. (1980). IEEE Trans. Electr. Dev. ED-27, 1359-1367.
Parrillo, L. C. (1983). In “VLSI Technology”, ed. S. M. Sze. McGraw-Hill, New York. 445-505.
Poon, H. C., Gummel, H. K., and Scharfetter, D. L. (1969). IEEE Trans. Electr. Dev. ED-16, 455-457.
Sedra, A. S., and Smith, K. C. (1982). “Microelectronic Circuits”. Holt, Rinehart and Winston.
Selloni, A., and Pantelides, S. T. (1982). Phys. Rev. Lett. 49, 586.
Slotboom, J. W., and de Graaf, H. C. (1976). Solid-State Electron. 19, 857.
Solomon, P. M., and Tang, D. D. (1979). Int. Sol. St. Ckts. Conf. Digest, 86-87.
Sze, S. M. (1969). “Physics of Semiconductor Devices”. Wiley-Interscience, New York.
Stein, K.-U. (1977). IEEE J. Sol. St. Ckts. SC-12, 527-530.
Tuckerman, D. B., and Pease, R. F. W. (1981). IEEE Electron Device Letters EDL-2, 126-129.
Tuckerman, D. B. (1984). Thesis, Stanford University.
Vaidya, S., Sheng, T. T., and Sinha, A. K. (1980). Appl. Phys. Letters 36, 464-6.
Vaidya, S., and Sinha, A. K. (1981). Thin Solid Films 75, 253-9.
Vilkelis, W., and Henle, R. A. (1979). Spring Compcon 79 Digest, 285.
Ziegler, J. F., and Lanford, W. A. (1979). Science 206, 776-788.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 70

Synthetic Aperture Ultrasonic Imagery

KEINOSUKE NAGAI
Institute of Applied Physics, University of Tsukuba, Sakura, Ibaraki, Japan
I. Introduction
II. Imaging System and Aperture
A. A Problem in Ultrasonic Imaging Method
B. Aperture
C. Resolution
III. Theory and Application of Holography
A. Holography and the Synthetic Aperture Imaging Method
B. Principles of Holography
C. Application to Ultrasonic Imaging
D. Synthetic Aperture Side-Looking Sonar
IV. Fundamentals of Digital Ultrasonic Imaging
A. Propagation of an Ultrasonic Wave
B. Numerical Reconstruction of an Image from a Hologram
C. Holography with a Broad-Band Pulse Wave
Appendix. Spatial Fourier Transform of a Spherical Wave
V. Properties of a Transducer Array
A. Radiated Field from a Single Transducer
B. Radiation Field from an Array of Transducers
C. Combination of Transmitter-Array and Receiver-Array
VI. Actual Digital Imaging System
A. Speedy Processing System with a Transducer Array
B. Synthetic Aperture Method Using a Broad-Band Pulse Wave
C. Ultrasonic Computerized Tomography Using the Time-of-Flight Profiles
VII. Diffraction Tomography as the Inverse Problem
A. Wave in an Inhomogeneous Medium
B. Diffraction Tomography with Plane Wave Illumination
C. Diffraction Tomography with Fan-Beam Illumination
D. Diffraction Tomography with Broad-Band Pulse Wave
References
Copyright © 1988 by Academic Press, Inc. All rights of reproduction reserved. ISBN 0-12-014670-3

I. INTRODUCTION

The theory and the application of synthetic aperture ultrasonic imaging are reviewed in this article. Ultrasonic images of high quality can be obtained by this technique, which removes diffraction effects and brings the images into focus.

Ultrasonic waves play important roles in medical diagnosis, nondestructive evaluation (NDE), underwater observation, and earth resource survey. This is because the waves have many advantageous properties: they are non-invasive; they can penetrate solids and liquids; and they convey information on the spatial distribution of elastic parameters in a medium, which is complementary to information on optical parameters. The disadvantage of ultrasonic waves relative to optical waves or X-rays is that they often have wavelengths comparable in size to object inhomogeneities. The observed values in these fields are often displayed in the form of images, and the use of long wavelengths brings diffraction effects that degrade the images.

Many researchers have investigated synthetic aperture ultrasonic imaging, which adopts the concept of holography in order to remove the diffraction effects. Their results are reviewed in this article with emphasis on fundamental theory and concepts.
II. IMAGING SYSTEM AND APERTURE

A. A Problem in Ultrasonic Imaging Method
Ultrasonic images are at present mainly produced by the pulse-echo method or the pulse-transmission method. The pulse-echo method is shown in Fig. 1 as an example. An ultrasonic pulse is emitted from the transducer, travels to the object, and is reflected from it. The reflected pulse is then received by the transducer. The received signal is processed and displayed on a CRT (cathode ray tube) display. The horizontal and vertical positions of the spot on the CRT correspond to the round-trip time of the pulse and the position of the transducer, respectively. The brightness of the spot represents the intensity of the received wave. As the transducer is scanned vertically, the whole image is completed.

The method assumes that the ultrasonic wave is a narrow beam. If the assumption is satisfied, a good image is obtained; if it is violated, the image is severely blurred.

When one intends to obtain an ultrasonic image of the inside of an object, the main problem is attenuation of the wave. Attenuation associated with wave propagation is represented by

$$a = a_0 \exp(-\alpha x),$$

where $a$ is the amplitude of the ultrasonic wave, $x$ is the propagation distance, $a_0$ is a constant, and $\alpha$ is called the attenuation constant. $\alpha$ is usually proportional to the square of the frequency. Thus, an ultrasonic wave with a high operating frequency cannot penetrate a thick object. An ultrasonic pulse with a carrier frequency of several MHz is usually used in medical diagnosis and in nondestructive evaluation. The wavelength is then of the order of tenths of a millimeter, which is comparable in size to the defects or inhomogeneities of the object. Scattering and diffraction therefore take place, and the ultrasonic wave spreads out. In this case the wave cannot form a beam, and diffraction considerably degrades the images.

FIG. 1. B-scan imaging by the pulse-echo method.

B. Aperture
An image can nevertheless be formed from a wavefront that has been spread by diffraction. The fundamentals of image formation are found in the imaging system of a single lens, so this system is considered first. As shown in Fig. 2, an optical wave emanating from point A spreads out, but it can be focused at point B by a lens. Image B of point A is thus formed. This fact can be interpreted as follows. The wavelength inside the lens is shorter than that outside the lens (in the air), because the lens is made of a material with a larger refractive index. That is, it takes more time for the wave to travel a given distance inside the lens than to travel the same distance outside the lens.
FIG. 2. Image formation by a lens.
Of the wave propagation paths from A to B (see Fig. 2), the long paths APB and AQB run through the thin part of the lens, whereas the short path AOB runs through the thick part. B is the point that all waves emanating from A and passing through the lens reach simultaneously. These waves are added in phase at point B, where a large amplitude is observed. Waves that emanate from A and reach a point other than B are added out of phase and cancel each other out. It is concluded that the waves emanating from A are collected to give a large amplitude at B and vanish at all other points. That is, the image of point A is formed at point B.

An ultrasonic imaging system that removes diffraction effects can be derived from the lens system described above by replacing the optical wave with an ultrasonic wave and making the other necessary changes. In the structure shown in Fig. 3, the ultrasonic wave departing from an object is received by a large number of transducers. If an image of point A is to be formed, each received signal is delayed by an appropriate set of delay lines to compensate for the propagation time from A to the corresponding transducer. All waves from A are then added in phase, whereas waves from points other than A are added out of phase and cancel each other out.

When the image is formed from the spread wavefront, only part of the wavefront is usable by the imaging system. The aperture of the imaging system is defined as the area within which the wavefront is used to form the image. In the example shown in Fig. 3, it is the width of $\overline{PQ}$. Thus, the aperture can be synthesized by arraying transducers.

In order to realize the system shown in Fig. 3 with hard-wired circuits, a great number of delay lines must be installed. The number of sets of delay lines should be almost equal to the number of image points, so very large equipment may be needed. It will be shown in Sections III and IV, however, that the equivalent digital image processing can be accomplished effectively by executing a two-dimensional fast Fourier transform (FFT) only once.

FIG. 3. Fundamental structure of synthetic aperture ultrasonic imaging method.

C. Resolution
As the aperture becomes large, the performance of the imaging system improves and high resolution is attained. This could be discussed quantitatively by applying the well-known theory of lenses. Here, however, the resolution is derived directly, for the sake of an intuitive understanding of the imaging system. A two-dimensional treatment is sufficient for this purpose.

1. Azimuthal (Lateral) Resolution
An image of point A is formed by the system shown in Fig. 3. As mentioned in the previous section, waves emanating from A are received by all transducers, and all the waves at the outputs of the delay lines are brought into phase. This is shown in Fig. 4(a), where each wave is depicted as a vector whose length and deflection angle correspond to the amplitude and the phase of the wave, respectively. The amplitudes of the waves are assumed to be appropriately compensated, so the lengths of all vectors are equal. All waves received at O, P, Q and so on are in phase and form a straight line, as shown in the figure.
FIG. 4. Waves at output of delay-line.

FIG. 5. Schematic for calculating azimuthal resolution.
Now consider the wave emanating from A', a point slightly shifted from A in the direction of the transducer array, which is called the azimuthal or lateral direction. Figure 5 shows the shift in this direction. The propagation distance $\overline{A'O}$ is almost unchanged and is the same as $\overline{AO}$, but $\overline{A'P}$ becomes shorter than $\overline{AP}$, whereas $\overline{A'Q}$ becomes longer than $\overline{AQ}$. As a result, the waves at the outputs of the delay lines become slightly out of phase, as in Fig. 4(b). Waves emanating from a point shifted further in the azimuthal direction become completely out of phase, as in Fig. 4(c), and cancel each other out, so that their sum vanishes. The shift $\overline{AA'}$ at which the sum vanishes corresponds to a zero of the amplitude of a point image. Thus, the azimuthal resolution is defined as the minimum shift $\overline{AA'}$ for which the sum vanishes.
When the phase difference due to the propagation distances $\overline{AP}$ and $\overline{A'P}$ becomes $\pi$ (Fig. 5), then, for the symmetrical geometry, the phase difference due to $\overline{AQ}$ and $\overline{A'Q}$ becomes $-\pi$. Therefore, when the difference between the two distances becomes half the wavelength $\lambda$,

$$\overline{AP} - \overline{A'P} = \frac{\lambda}{2}, \tag{1}$$

the signals are almost completely out of phase, as shown in Fig. 4(c), and their sum becomes almost zero. When the shift is smaller than this, the signals are as shown in Fig. 4(b), and the sum is not zero. Thus, the shift that satisfies Eq. (1) is the azimuthal resolution $r_a$. Now let $\theta$ be the angle OAP in Fig. 5, so that the array PQ subtends an angle $2\theta$ at point A. Then

$$\overline{AP} - \overline{A'P} = r_a \sin\theta. \tag{2}$$

Substituting Eq. (2) into Eq. (1) yields

$$r_a = \frac{\lambda}{2\sin\theta}. \tag{3}$$
Equation (3) indicates that a large aperture (a large $\theta$) gives good (small) azimuthal resolution. When the aperture is infinite ($\theta = \pi/2$), $r_a$ becomes $\lambda/2$, which is the limit of the azimuthal resolution. Even if the aperture is synthesized to be arbitrarily large, the azimuthal resolution is limited to half of the wavelength used.
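The cancellation argument that leads to Eq. (3) can be checked numerically. The following is a minimal sketch and not part of the original derivation: it assumes a one-dimensional array of ideal point receivers of equal sensitivity, a narrow-band wave, and arbitrary illustrative values for the wavelength, the aperture width and the range. All function and variable names are hypothetical.

```python
import cmath
import math


def array_response(shift, wavelength, aperture, distance, n_elements=201):
    """Normalized magnitude of the delay-and-sum output for a laterally shifted source.

    The delay lines are matched to point A = (0, distance); the source is at
    A' = (shift, distance). Every vector has unit length, as in Fig. 4.
    """
    k = 2.0 * math.pi / wavelength
    total = 0j
    for n in range(n_elements):
        # Receiver positions spread uniformly over the aperture PQ.
        x = -aperture / 2.0 + aperture * n / (n_elements - 1)
        path_focus = math.hypot(x, distance)           # distance from A
        path_source = math.hypot(x - shift, distance)  # distance from A'
        # Residual phase left after the delay line has compensated for A.
        total += cmath.exp(1j * k * (path_source - path_focus))
    return abs(total) / n_elements


def azimuthal_resolution(wavelength, aperture, distance):
    """Eq. (3): r_a = lambda / (2 sin(theta)), theta being the angle OAP."""
    sin_theta = (aperture / 2.0) / math.hypot(aperture / 2.0, distance)
    return wavelength / (2.0 * sin_theta)


# Assumed geometry, in units of the wavelength.
wavelength, aperture, distance = 1.0, 40.0, 200.0
r_a = azimuthal_resolution(wavelength, aperture, distance)
on_axis = array_response(0.0, wavelength, aperture, distance)
half_way = array_response(0.5 * r_a, wavelength, aperture, distance)
at_null = array_response(r_a, wavelength, aperture, distance)
print(f"r_a = {r_a:.3f} wavelengths")
print(f"normalized sum at A: {on_axis:.3f}, at r_a/2: {half_way:.3f}, at r_a: {at_null:.3f}")
```

With these assumed values the sum is largest at A, is reduced but not zero at half the resolution distance, corresponding to Fig. 4(b), and nearly vanishes at the shift given by Eq. (3), corresponding to Fig. 4(c). The small residue at the null comes from the finite number of receivers.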
2. Radial (Longitudinal) Resolution

The radial or longitudinal resolution, that is, the resolution in the direction perpendicular to the array, can be derived similarly. Consider a point A' shifted from A in the radial direction, as shown in Fig. 6. The shift in the azimuthal direction discussed previously keeps $\overline{AO}$ unchanged, shortens $\overline{AP}$ and lengthens $\overline{AQ}$. The shift in the radial direction, however, lengthens all of $\overline{AO}$, $\overline{AP}$ and $\overline{AQ}$. This suggests that a very large radial shift is required for these waves to become out of phase and cancel each other out. Thus, the radial resolution is much worse than the azimuthal resolution. From Fig. 6 it is clear that the following equation has to be satisfied in order for the waves to cancel out:

$$(\overline{A'O} - \overline{AO}) - (\overline{A'P} - \overline{AP}) = \lambda. \tag{4}$$

The shift $\overline{AA'}$ that satisfies Eq. (4) is referred to as the radial resolution $r_r$.
FIG. 6. Schematic for calculating radial resolution.
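Now substituting

$$\overline{A'O} - \overline{AO} = r_r, \qquad \overline{A'P} - \overline{AP} \approx r_r\cos\theta \tag{5}$$

into Eq. (4), and using $1 - \cos\theta \approx \tfrac{1}{2}\sin^2\theta$ (valid for small $\theta$), the radial resolution $r_r$ is obtained:

$$r_r = \frac{2\lambda}{\sin^2\theta}. \tag{6}$$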
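Equations (3) and (6) are simple enough to evaluate directly for several aperture angles. The sketch below is an illustration only; the function names are hypothetical, and the "unapproximated" variant is obtained here by solving Eqs. (4) and (5) as $r_r(1 - \cos\theta) = \lambda$, which is an assumption about the intermediate step rather than a formula quoted from the text.

```python
import math


def azimuthal_resolution(wavelength, theta):
    """Eq. (3): r_a = lambda / (2 sin(theta))."""
    return wavelength / (2.0 * math.sin(theta))


def radial_resolution(wavelength, theta):
    """Eq. (6): r_r = 2 lambda / sin^2(theta)."""
    return 2.0 * wavelength / math.sin(theta) ** 2


def radial_resolution_unapproximated(wavelength, theta):
    """Eqs. (4) and (5) solved directly: r_r (1 - cos(theta)) = lambda."""
    return wavelength / (1.0 - math.cos(theta))


wavelength = 1.0  # results are expressed in units of the wavelength
for degrees in (10, 30, 60):
    theta = math.radians(degrees)
    r_a = azimuthal_resolution(wavelength, theta)
    r_r = radial_resolution(wavelength, theta)
    r_r_direct = radial_resolution_unapproximated(wavelength, theta)
    print(f"theta = {degrees:2d} deg: r_a = {r_a:6.2f}, r_r = {r_r:7.2f} "
          f"(direct {r_r_direct:7.2f}), r_r / r_a = {r_r / r_a:5.1f}")
```

The ratio $r_r / r_a = 4/\sin\theta$ shows why the radial resolution deteriorates quickly when the aperture angle is limited by the dimensions of the equipment.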
3. Discussion

As a numerical example, if $\theta = 30^\circ$, the azimuthal resolution obtained from Eq. (3) is $r_a = \lambda$ and the radial resolution obtained from Eq. (6) is $r_r = 8\lambda$. Thus, the radial resolution is eight times as coarse as the azimuthal resolution. Optical holography is said to have the ability to record three-dimensional images. It is clear from this numerical example, however, that the radial resolution is several times inferior to the azimuthal resolution. In optics, the wavelength is so short that the resolution is sufficient even in the radial direction. In ultrasonics, the wavelength is about 1000 times as long as that of an optical wave, and $\theta$ is limited by the dimensions of the equipment, so the poor resolution in the radial direction becomes a serious problem. As a countermeasure, the time information of the burst signal is used in side-looking sonar. Furthermore, multiple-frequency techniques or broad-band pulses are actively adopted. These subjects will be discussed in later sections.

It has been shown in this section that a large aperture (large $\theta$) attains high resolution. A large aperture can be synthesized by the use of area sensors such as photographic films or by the use of transducer arrays. The term "synthetic aperture" is derived from synthetic aperture side-looking radar. The term is now used in a much broader sense; an example is the simulation of a filled array by the clever use of several smaller arrays. The term "synthetic aperture imaging method" describes a method that reconstructs good images by using a synthesized aperture to eliminate the diffraction effects.

III. THEORY AND APPLICATION OF HOLOGRAPHY
A. Holography and the Synthetic Aperture Imaging Method

The synthetic aperture imaging method forms good ultrasonic images. The structure of the imaging system depends largely on the ultrasonic detectors used, so in an actual implementation of the method it is very important to select appropriate detectors. Detectors in common use are listed in Table I, which is partly quoted from the papers of Berger (1967) and Mueller (1971). The upper three detectors in the table are square-law detectors, that is, they are sensitive to the intensity (the square of the amplitude) of the ultrasonic wave. The lower two detectors, on the other hand, are linear-law detectors. They respond to the amplitude itself and can therefore detect the phase as well as the amplitude of the ultrasonic wave.

TABLE I
ULTRASONIC DETECTORS*

Detector                                 Law
Photographic plate in developer bath     Square
Thermosensitive dyes                     Square
Liquid surface deformation               Square
Solid surface deformation                Linear
Piezoelectric detector                   Linear

Approximate threshold sensitivity (W/cm²), values in printed order (row alignment not recoverable from the source): 5-1, 1, -, 10^-3, 10^-6, 10^-7, 10^-1’

* After Berger H. (1967) and Mueller R. K. (1971).
The square-law detectors have the disadvantage of low sensitivity. They are advantageous, however, because they can detect the spatial distribution of the field intensity simultaneously over an area. These detectors are the ultrasonic counterparts of photographic film in optics. The linear-law detectors have to be arrayed or scanned mechanically in order to detect the spatial distribution of the amplitude and the phase of the wave. Their sensitivity is several orders of magnitude higher than that of the square-law detectors. As mentioned in the previous section, the synthetic aperture imaging method utilizes the phase and the amplitude of the ultrasonic waves. The phase, for example, is detected and delayed to form images. By applying holographic techniques, square-law detectors can also be used in the imaging method. That is, holography makes it possible to detect and record the phase and the amplitude of the ultrasonic waves with square-law detectors. In holography, the record of the phase and the amplitude of the wave is called a hologram, which is defined by El-Sum (1967) as follows. Hologram: A recording (permanent or semipermanent, surface or volume) of diffraction pattern of an object biased by a coherent background radiation. This biasing radiation may be referred to as a reference wave.
In holography it is essential to add the reference wave to the wave diffracted from an object in order to detect the phase and the amplitude with square-law detectors. Holography is merely one of the techniques that realize synthetic aperture imaging. However, holography makes the role of the phase and the amplitude of the wave in the imaging method conspicuous. Historically, the synthetic aperture imaging method has been developed from holography and similar techniques. The procedures and theories of synthetic aperture imaging are therefore often interpreted and explained in the terminology of holography. For example, a recording of the phase and the amplitude detected with a linear-law detector is called a hologram, even though it does not satisfy the definition given above.
B. Principles of Holography

Gabor (1948) succeeded in recording a wavefront and reconstructing its replica with his invention of holography. His procedure is performed in two steps. In the first step, an object is insonified by a monochromatic wave to generate the scattered wave, or "object wave". As mentioned previously, the object wave is added to a mutually coherent wave, the "reference wave". The two waves interfere with each other and produce an interference fringe, which is detected and recorded with photographic film having square-law characteristics. The film is developed and fixed later, and becomes a transparency with an amplitude transmission proportional to the recorded interference fringe. This transparency is the hologram, a word that in Greek stands for complete recording. In the second step, the hologram is illuminated by a monochromatic wave, from which a replica of the object is generated. The steps of recording and reconstruction are formulated in the simple equations shown below.

1. Recording
In this article, for brevity, ultrasonic waves are expressed as scalar quantities, though they should be treated as tensors or vectors to be represented exactly. The complex amplitudes of the object wave and the reference wave are denoted by $U_o$ and $U_r$, respectively. As shown in Fig. 7, $U_r$ is added to $U_o$ so that the two interfere with each other. The interference fringe is detected with a square-law detector and is recorded. The information recorded is

$$|U_r + U_o|^2 = |U_r|^2 + U_r U_o^* + U_r^* U_o + |U_o|^2, \tag{7}$$

where the asterisk * represents a complex conjugate. The third term on the right-hand side of Eq. (7) contains $U_o$, which confirms that the object wave is recorded.
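The way in which the reference wave preserves the phase in Eq. (7) can be illustrated numerically. The sketch below is not taken from the original article: it assumes a reference wave of unit amplitude and an object wave of amplitude 0.5 (arbitrary values), and the function names are hypothetical.

```python
import cmath
import math


def recorded_intensity(u_r, u_o):
    """Square-law detector output |U_r + U_o|^2, the left-hand side of Eq. (7)."""
    return abs(u_r + u_o) ** 2


def expansion_terms(u_r, u_o):
    """The four right-hand-side terms of Eq. (7), in the order written there."""
    return (
        abs(u_r) ** 2,
        u_r * u_o.conjugate(),
        u_r.conjugate() * u_o,
        abs(u_o) ** 2,
    )


u_r = cmath.rect(1.0, 0.0)  # assumed reference wave: unit amplitude, zero phase
results = {}
for phase_deg in (0, 90, 180):
    u_o = cmath.rect(0.5, math.radians(phase_deg))  # assumed object wave
    with_reference = recorded_intensity(u_r, u_o)
    without_reference = abs(u_o) ** 2  # square-law detection of U_o alone
    results[phase_deg] = (with_reference, without_reference)
    # The four terms of Eq. (7) add up to the detected intensity.
    assert math.isclose(sum(expansion_terms(u_r, u_o)).real, with_reference)
    print(f"object phase {phase_deg:3d} deg: with reference {with_reference:.2f}, "
          f"without reference {without_reference:.2f}")
```

Without the reference wave the detector output is the same for every object phase, whereas with the reference wave the recorded intensity changes with the phase through the two cross terms.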
2. Reconstruction

The transparency (or hologram) on which the information of Eq. (7) is recorded is illuminated by a monochromatic wave $U_c$. Except for a constant factor, the wave transmitted through the transparency is represented by the product of $U_c$ and Eq. (7):

$$U_c|U_r + U_o|^2 = U_c|U_r|^2 + U_c U_r U_o^* + U_c U_r^* U_o + U_c|U_o|^2. \tag{8}$$

Now the condition

$$U_c U_r^* = \text{constant} \tag{9}$$

is imposed on $U_c$. The third term on the right-hand side of Eq. (8) then shows that the object wave is reconstructed. However, the other terms in Eq. (8) cannot be neglected. The transmitted wave includes not only the object wave but also the waves represented by these extra terms. The first term on the right-hand side of Eq. (8) represents the illuminating wave that passes through the transparency and goes straight on. It is similar to the zero-order diffracted wave from a diffraction grating. The second term contains $U_o^*$. The complex conjugate of a diverging spherical wave represents a converging spherical wave. By analogy, $U_o^*$ expresses a wave converging to a real image, so the second term represents a wave forming a real image of the object. The third term, therefore, represents a wave forming a virtual image. The two images are called a "twin image". The factor $|U_o|^2$ in the last term represents the interaction of waves emanating from each point of the object. This term shows that $|U_o|^2$ is carried by the illuminating wave. The first and the fourth terms are often called "background noise".

In the procedure proposed by Gabor (1948), the reference wave is also used to illuminate the object, so that an in-line hologram is produced. In the reconstruction step, the wave forming the virtual image is embedded in the waves due to the other terms, and the reconstructed image is severely degraded. Leith and Upatnieks (1962) succeeded in separating the object wave from the waves due to the other terms.

3. Offset-Reference Hologram
Two fundamental developments have expedited progress in holography. One is the invention of the laser as a highly coherent illumination source. The other is the offset-reference hologram proposed by Leith and Upatnieks (1962). In the recording step, the reference wave is arranged to be separate from the wave illuminating the object, and its direction of propagation makes an appropriate angle with that of the object wave. Figure 7 shows the geometry. For simplicity, let the reference wave $U_r$ be a plane wave, represented on the hologram plane as

$$U_r = A\exp\left\{-i\frac{2\pi}{\lambda}x\sin\phi\right\}, \tag{10}$$

where $A$ is a constant and the angle between the direction of propagation and the optical axis (the z-axis) is $-\phi$ ($\phi > 0$), as shown in Fig. 7. In the reconstruction step, the reference wave is also used as the illuminating wave, that is,

$$U_c = U_r. \tag{11}$$

Substitution of Eqs. (9) to (11) into Eq. (8) yields the reconstructed wavefront,

$$U_c|U_r + U_o|^2 = A\exp\left\{-i\frac{2\pi}{\lambda}x\sin\phi\right\}\left\{|U_r|^2 + |U_o|^2\right\} + A^2\exp\left\{-i\frac{4\pi}{\lambda}x\sin\phi\right\}U_o^* + |A|^2 U_o. \tag{12}$$

FIG. 7. Recording the offset-reference hologram.

FIG. 8. Image reconstruction from the offset-reference hologram.
The reconstruction step is shown in Fig. 8. The third term in Eq. (12) shows that the object is reconstructed in the vicinity of the optical axis. From the first term, it is seen that the background noise is carried in the direction of propagation of the illuminating wave, that is, at angle $\phi$ from the optical axis. The second term shows that the real image is formed in the direction at the angle $\phi' = \sin^{-1}(2\sin\phi)$ from the optical axis; $\phi'$ is roughly $2\phi$.
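The angular separation of the three reconstructed waves can be tabulated directly from these expressions. The sketch below is an illustration under the plane-reference-wave assumption of Eq. (10); the function name is hypothetical, and the error raised when $2\sin\phi > 1$ reflects the observation, added here, that no propagating real-image wave exists in that case.

```python
import math


def real_image_angle(phi):
    """Direction of the real-image wave: phi' = arcsin(2 sin(phi)), in radians."""
    s = 2.0 * math.sin(phi)
    if abs(s) > 1.0:
        raise ValueError("2 sin(phi) exceeds 1: no propagating real-image wave")
    return math.asin(s)


for degrees in (5, 10, 20):
    phi = math.radians(degrees)
    phi_prime = real_image_angle(phi)
    print(f"reference tilt {degrees:2d} deg: background at {degrees:2d} deg, "
          f"real image at {math.degrees(phi_prime):5.2f} deg "
          f"(2 phi = {2 * degrees:2d} deg)")
```

For small tilt angles the real image is found close to $2\phi$, in agreement with the approximation above, and the deviation grows as $\phi$ increases.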
In the recording step, one should tilt the direction of the reference wave at an appropriate angle. The reconstructed object wave can then be separated from the waves generated by the extra terms and is not affected by them.

4. Formula for Image Reconstruction

A simple formula for image reconstruction by holography is derived here. As an object can be regarded as a set of points, it is often sufficient to know where and how the image of a point is formed. It is therefore useful to analyze holography for the case in which the object is a single point. Suppose a point object at $(x_0, y_0, z_0)$ is illuminated by a monochromatic wave with wavelength $\lambda_1$. The object wave $U_o$ is then generated, and is represented on the hologram plane $(x, y, 0)$ as

$$U_o = B\exp\left\{i\frac{2\pi}{\lambda_1}\left[z_0^2 + (x - x_0)^2 + (y - y_0)^2\right]^{1/2}\right\}, \tag{13}$$

where $B$ is a constant.

a. Resolution. When the point object is placed on the optical axis at $(0, 0, z_0)$, Eq. (13) becomes

$$U_o = B\exp\left\{i\frac{2\pi}{\lambda_1}\left(z_0^2 + x^2 + y^2\right)^{1/2}\right\}. \tag{14}$$
The period of the fringe becomes narrow in the outer part of the hologram. This is seen quantitatively by calculating the spatial frequency of Eq. (14). Let the argument of the exponential in Eq. (14) be $F$:

$$F = \frac{2\pi}{\lambda_1}\left(z_0^2 + x^2 + y^2\right)^{1/2}. \tag{15}$$

The local spatial frequency $\nu$ is calculated from the gradient of $F$,

$$\nu = \frac{1}{2\pi}\nabla F = \frac{1}{\lambda_1}\,\frac{(x,\ y)}{\left(z_0^2 + x^2 + y^2\right)^{1/2}}, \tag{16}$$

which increases monotonically with respect to $x$ and $y$. Thus, the size of the hologram limits the spatial frequency. This can be understood intuitively from Fig. 9, which shows the Fresnel zone plate, a transparency representing the real part of Eq. (14) in binary notation: 1 or 0.

FIG. 9. Fresnel zone plate. The real part of $U_o = B\exp\{i(2\pi/\lambda_1)(z_0^2 + x^2 + y^2)^{1/2}\}$, $z_0/\lambda_1 = 10$, is represented in binary notation: 1 or 0. The aperture limits the spatial frequency.

For simplicity, a one-dimensional hologram is considered ($y = 0$). The maximum value of the spatial frequency, $\nu_{\max}$, is derived from Eq. (16). If the aperture width of the hologram is $L$, substitution of $x = L/2$, $y = 0$ into Eq. (16) yields

$$\nu_{\max} = \frac{1}{\lambda_1}\,\frac{L/2}{\left[z_0^2 + (L/2)^2\right]^{1/2}} = \frac{\sin\theta}{\lambda_1}, \tag{17}$$
where $\theta$ is the angle shown in Fig. 10; the hologram subtends the angle $2\theta$ at the point object. Spatial frequencies in the range $-\nu_{\max} < \nu < \nu_{\max}$ are recorded on the hologram. Thus, the frequency bandwidth is $2\nu_{\max}$, and the reciprocal of the bandwidth gives the azimuthal resolution $r_a$,

$$r_a = \frac{1}{2\nu_{\max}}. \tag{18}$$

Substitution of Eq. (17) into Eq. (18) yields

$$r_a = \frac{\lambda_1}{2\sin\theta}. \tag{19}$$

Equation (19) coincides with Eq. (3).
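The chain from the fringe phase to the resolution can be followed numerically for the one-dimensional case ($y = 0$). The sketch below is illustrative only: it uses the value $z_0/\lambda_1 = 10$ quoted in the caption of Fig. 9, an assumed aperture width $L = 20\lambda_1$, and hypothetical function names. The local frequency is written as $x / (\lambda_1 (z_0^2 + x^2)^{1/2})$ and checked against a finite-difference derivative of the phase.

```python
import math


def fringe_phase(x, wavelength, z0):
    """Phase F of the point-object wave on the hologram line y = 0."""
    return 2.0 * math.pi / wavelength * math.hypot(z0, x)


def local_frequency(x, wavelength, z0):
    """Local spatial frequency (1 / 2 pi) dF/dx = x / (lambda sqrt(z0^2 + x^2))."""
    return x / (wavelength * math.hypot(z0, x))


def local_frequency_numeric(x, wavelength, z0, dx=1e-6):
    """Central-difference estimate of (1 / 2 pi) dF/dx."""
    dphase = fringe_phase(x + dx, wavelength, z0) - fringe_phase(x - dx, wavelength, z0)
    return dphase / (2.0 * dx) / (2.0 * math.pi)


wavelength = 1.0     # lengths are measured in wavelengths
z0 = 10.0            # as in the caption of Fig. 9
aperture = 20.0      # assumed aperture width L

nu_max = local_frequency(aperture / 2.0, wavelength, z0)
sin_theta = (aperture / 2.0) / math.hypot(z0, aperture / 2.0)
r_a_from_bandwidth = 1.0 / (2.0 * nu_max)            # reciprocal of the bandwidth
r_a_from_angle = wavelength / (2.0 * sin_theta)      # lambda / (2 sin(theta))

print(f"nu_max = {nu_max:.4f} cycles per wavelength")
print(f"r_a from bandwidth = {r_a_from_bandwidth:.4f}, from angle = {r_a_from_angle:.4f}")
```

The two resolution values agree, which is the statement that the band-limit argument of this subsection and the phasor argument of Section II.C lead to the same result.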
FIG. 10. Geometry of recording hologram.
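b. Magnification. The azimuthal magnification and the radial magnification of a holographic imaging system are derived in this subsection. For simplicity, assuming

$$z_0^2 \gg (x - x_0)^2,\ (y - y_0)^2,$$

the approximation

$$\left[z_0^2 + (x - x_0)^2 + (y - y_0)^2\right]^{1/2} \approx z_0 + \frac{(x - x_0)^2 + (y - y_0)^2}{2z_0}$$

yields the following equation for the object wave of Eq. (13),

$$U_o(x, y) = C\exp\left\{i\frac{\pi}{\lambda_1 z_0}\left(x^2 - 2xx_0 + y^2 - 2yy_0\right)\right\}, \tag{20}$$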
where $C$ is a constant. The reference wave $U_r$ is not necessarily a plane wave. Here, however, it is taken to be a tilted plane wave as represented by Eq. (10), for simplicity. So far, the waves have not been specified. They may be optical waves, X-rays or ultrasonic waves. Now, turning to the main subject, the following situation is considered. In recording, an ultrasonic wave with wavelength $\lambda_1$ is used, but in reconstruction, an optical wave with wavelength $\lambda_2$ is adopted in order for the image to be visible. The value of $\lambda_1/\lambda_2$ is roughly a thousand. The ultrasonic hologram should therefore be reduced in size by a factor $m$, where $m < 1$. So, replacing $x$ and $y$ of $U_o$ in Eq. (20) with $x/m$ and $y/m$, respectively,

$$U_o\!\left(\frac{x}{m}, \frac{y}{m}\right) = C\exp\left\{i\frac{\pi}{\lambda_1 z_0 m^2}\left(x^2 - 2mxx_0 + y^2 - 2myy_0\right)\right\} \tag{21}$$

is obtained. The hologram is illuminated by an optical wave that satisfies the condition of Eq. (9). Then, the object wave $U_o'$ is reconstructed. Let the image point of the wave be $(x_i, y_i, z_i)$; then, referring to Eq. (20), $U_o'$ should be

$$U_o'(x, y) = D\exp\left\{i\frac{\pi}{\lambda_2 z_i}\left(x^2 - 2xx_i + y^2 - 2yy_i\right)\right\}, \tag{22}$$

where $D$ is a constant. Equating the arguments of Eq. (21) and Eq. (22),

$$x_i = mx_0, \qquad y_i = my_0, \qquad z_i = m^2\frac{\lambda_1}{\lambda_2}z_0. \tag{23}$$
Meier (1965) reported if it does not hold that 3" m = l
4
(26)
spherical aberration takes place. A more severe problem is that the difference between M , and M , increases the three dimensional distortion. However, if Eq. (26) holds, according to Eq. (24) and (25), the reduction is so small (about 1/1000) that a microscope is necessary for images to be seen. C. Application to Ultrasonic Imaging Various ultrasonic detectors are used in the imaging method and they are listed in Table I in the preceding section. Good ultrasonic images can be obtained by synthetic aperture methods that are considered to be applications of holography. These methods depend on the detectors. Three of them: (i) liquid surface deformation, (ii) solid surface deformation,
232
KEINOSUKE NAGAI
(iii) mechanical scan of a converging beam by a piezoelectric detector, are discussed here and are representative examples of analog-type methods. Other methods of digital imaging will be discussed in later sections. The liquid surface is a square-law dectector. The imaging method using this detector is a direct implementation of holography. The solid surface, on the other hand, is a linear-law detector. The method using the solid surface, therefore, is not a direct application, but it adopts the concept of holography. 1. Liquid Surface Deformution Method
a. Transfer Function of a Liquid Surface. It is well known that when an ultrasonic wave impinges on a liquid-air interface, the interface is deformed by radiation pressure. Thus, a liquid relief representing the spatial distribution of the wave intensity is formed at the interface. The liquid surface deformation method obtains ultrasonic images by utilizing this phenomenon. The spatial Fourier transform of the two-dimensional impulse response of the surface deformation, that is, the spatial transfer function, is given by

$$H = \frac{1}{1 + (\nu/\nu_c)^2}, \tag{27}$$

where $\nu$ is a spatial frequency and $\nu_c$ is a cut-off frequency, which is expressed in terms of the surface tension $\sigma$, the density $\rho$ of the liquid and the acceleration of gravity $g$ as

$$\nu_c = \frac{1}{2\pi}\left(\frac{\rho g}{\sigma}\right)^{1/2}. \tag{28}$$
As $\sigma = 70$ dyn/cm at the liquid-air interface, $\nu_c$ is about 0.6 cycle/cm, which is rather small. The transfer function for these parameters is shown in Fig. 11. It is seen from this figure that the liquid surface deformation works like a low-pass filter with cut-off frequency $\nu_c$. Thus, $\nu_c$ limits the resolution of the imaging method, as the aperture of the hologram does (described in the preceding section).

FIG. 11. Transfer function relating water displacement to acoustic pressure.

Now, the hologram of a point object is again considered. For simplicity, a two-dimensional $(x, z)$ problem is discussed here. The point object is located at $(0, z_0)$ and the hologram plane is $z = 0$. The object wave $U_o(x)$ on the plane is

$$U_o(x) = B\exp\left\{i\frac{2\pi}{\lambda}\left(z_0^2 + x^2\right)^{1/2}\right\}, \tag{29}$$

where $B$ is a constant. The reference wave $U_r$ is a plane wave that propagates in the direction tilted at angle $\phi$ from the optical axis (the z-axis), and has been described by Eq. (10),

$$U_r = A\exp\left\{-i\frac{2\pi}{\lambda}x\sin\phi\right\}.$$

Square-law detection of the sum of $U_r$ and $U_o$ includes $U_r^*U_o$, which is, from Eqs. (29) and (10),

$$U_r^*U_o = A \cdot B\exp\left\{i\frac{2\pi}{\lambda}\left[x\sin\phi + \left(z_0^2 + x^2\right)^{1/2}\right]\right\}. \tag{30}$$

Let the argument of the exponential in Eq. (30) be $F$. Similarly to Eq. (16), the local spatial frequency $\nu$ can be obtained,

$$\nu = \frac{1}{2\pi}\cdot\frac{dF}{dx} = \nu_r + \nu_o, \tag{31}$$

where $\nu_r$ is the frequency of the reference wave and $\nu_o$ is the frequency of the object wave,

$$\nu_r = \frac{1}{\lambda}\sin\phi, \tag{32}$$

$$\nu_o = \frac{x}{\lambda\left(z_0^2 + x^2\right)^{1/2}}. \tag{33}$$
FIG. 12. Schematic diagram of liquid surface holographic imaging (Anderson, 1974).

The spatial frequency of the offset-reference hologram is the sum $\nu_r + \nu_o$, which is higher by $\nu_r$ than that of the in-line hologram. As the tilt angle $\phi$ is increased to separate the image completely, $\nu_r$ becomes higher and the sum $\nu_r + \nu_o$ exceeds the cut-off frequency $\nu_c$. This results in a blurred image. Equation (28) shows that a liquid with small $\sigma$ should be used to obtain a high cut-off frequency $\nu_c$. The surface of the liquid may be treated with a wetting agent such as Triton X-100, which makes the surface tension lower than that of water.
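The conflict between the tilt of the reference wave and the low-pass behaviour of the liquid surface can be made concrete with a short calculation. The sketch below evaluates the cut-off frequency, the transfer function and the reference-wave frequency in CGS units. The density, the gravitational acceleration and the ultrasonic wavelength of 0.03 cm (about 5 MHz in water) are assumed round values chosen for illustration, and the function names are hypothetical.

```python
import math


def cutoff_frequency(density, gravity, surface_tension):
    """Eq. (28): nu_c = (1 / 2 pi) sqrt(rho g / sigma); cycles/cm for CGS inputs."""
    return math.sqrt(density * gravity / surface_tension) / (2.0 * math.pi)


def transfer_function(nu, nu_c):
    """Eq. (27): H = 1 / (1 + (nu / nu_c)^2)."""
    return 1.0 / (1.0 + (nu / nu_c) ** 2)


def reference_frequency(phi, wavelength):
    """Eq. (32): nu_r = sin(phi) / lambda."""
    return math.sin(phi) / wavelength


# Water in CGS units: 1 g/cm^3 and 980 cm/s^2 are assumed; sigma = 70 dyn/cm.
nu_c = cutoff_frequency(density=1.0, gravity=980.0, surface_tension=70.0)
print(f"nu_c = {nu_c:.3f} cycles/cm")

# Assumed ultrasonic wavelength of 0.03 cm (about 5 MHz in water).
wavelength = 0.03
for degrees in (0.5, 1.0, 2.0):
    nu_r = reference_frequency(math.radians(degrees), wavelength)
    print(f"phi = {degrees:3.1f} deg: nu_r = {nu_r:.2f} cycles/cm, "
          f"H(nu_r) = {transfer_function(nu_r, nu_c):.2f}")
```

Under these assumptions a reference tilt of the order of one degree already places $\nu_r$ near the cut-off frequency, which illustrates why a lower surface tension is desirable.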
b. Images Reconstructed from a Liquid Surface. As mentioned previously, when the liquid-air interface is illuminated by the ultrasonic wave, a relief pattern of the spatial intensity distribution of the wave field is formed. Strictly speaking, it is a spatially low-pass-filtered pattern. A visible representation of the image is produced by reflecting light from the pattern. The liquid surface deformation method has the disadvantage of low sensitivity, but it has the advantage of real-time visualization.

Figure 12 shows the schematic diagram of the liquid surface deformation method proposed by Anderson (1974). The object wave and the reference wave impinge on the liquid-air interface. The hologram is formed at the interface, which is illuminated by laser light to reconstruct images. An example is shown in Fig. 13(a), which is an ultrasonic image of the forearm. An X-ray image is shown in Fig. 13(b) for comparison. Bony detail is clearer in the X-ray image, but a great deal more soft-tissue structure can be seen in the ultrasonic image.

FIG. 13. Comparison of ultrasonic and X-ray images of the forearm (Anderson, 1974). (a) Ultrasonic image. (b) X-ray image.

Figure 14 illustrates the experimental setup proposed by Holbrooke et al. (1974). The same principle is applied as in the previous example. The water tank in which the hologram is formed is separated from that of the object to avoid disturbing the hologram plane, and a lens system is provided to make the separation possible. Figure 15 shows the results, which are ultrasonic images of a 14-week fetus in vitro. In some images the liver appears translucent and in others opaque, depending on the interactive effect of the impinging ultrasonic waves.

2. Solid Surface Deformation Method

A solid surface is displaced by an ultrasonic wave. The instantaneous displacement can be read out by a laser beam that is scanned two-dimensionally. The sensitivity is three orders of magnitude higher than that of the liquid surface deformation method. The propagation axis of the laser beam is either tilted with respect to the normal of the surface or parallel to it.
FIG. 14. Functional schematic of immersion-type surface levitation holographic imaging system (Holbrooke et al., 1974). [Labeled components: spatial filter, lens, laser, object transducer, acoustic lenses, reference transducer.]

FIG. 15. 5 MHz images of in vitro 14-week fetus demonstrating apparent differences in acoustic absorption as specimen is moved about in ultrasonic field (Holbrooke et al., 1974).

FIG. 16. Experimental setup used to form an acoustic image hologram at 2.268 MHz (Whitman et al., 1972).
The tilted incident beam is deflected by an angle proportional to the slope of the deformation, and is then partially intercepted by a knife edge. Thus, the intensity of the beam collected at a photodiode behind the knife edge represents the slope of the deformation, which is converted to a diode current. In the case of parallel incidence, the ultrasonic displacement can be read out by an optical Michelson interferometer.

Figure 16 shows the experimental setup proposed by Whitman et al. (1972). The object placed in the water tank is illuminated by the ultrasonic wave. The scattered wave, that is, the object wave, is focused on the gold-plated plexiglass by the brass concave mirror. The tilted incident laser beam is scanned over the gold-plated plexiglass. The beam is deflected according to the ultrasonic displacement of the plexiglass and is converted into a photodiode current. The image signal is then amplified and displayed on a CRT (cathode ray tube) synchronously with the scanning laser beam. The ultrasonic image of a hand obtained by this system is shown in Fig. 17. The operating frequency is 2.25 MHz. The level of the insonification is 40 mW/cm².
FIG. 17. A composite acoustic transmission picture of a hand, taken at 2.25 MHz (Whitman et al., 1972).
The ultrasonic image is focused directly on the plexiglass by the brass concave mirror. The phase and the amplitude of the image can be detected, because the image is represented by the instantaneous displacement. The hologram of the image can therefore be recorded by adding an electronic reference wave, even if the image is out of focus, and a clear image can be obtained from the hologram.

Figure 18 illustrates the basic arrangement of the experimental system proposed by Mezrich et al. (1975), which is based on the optical Michelson interferometer. The laser light is split by the beam splitter. One wave, reflected from the mirror M1, is used as the reference wave. The other propagates in the direction parallel to the normal of a thin film (pellicle), M2, from which it is reflected. Both are collected by the lens L1 and detected by the photodiode D.
SYNTHETIC APERTURE ULTRASONIC IMAGERY
FIG. 18. Basic arrangement of system. M2 is pellicle in water tank, M1 is reference mirror (Mezrich et al., 1975).
The essential part of the system is the pellicle M2, which is a thin (~6 µm) metalized film. The ultrasonic image is focused on the pellicle by the polystyrene lens, which is not depicted in Fig. 18. The displacement of the ultrasonic wave can be detected by interferometrically measuring the motion of the pellicle. By scanning the laser beam over the pellicle, the spatial distribution of the ultrasonic field is obtained. Figure 19 shows the ultrasonic image of a hand obtained by the system. The lower picture is an expanded view of a region of the palm near the index finger. It clearly shows a bifurcation of a blood vessel.
3. Mechanical Scanning of a Convergent Ultrasonic Beam
Techniques other than holography are also used to obtain clear ultrasonic images by eliminating the diffraction effect. An ultrasonic lens, for example, can bring the image into focus. Another technique, discussed in this subsection, adopts ultrasonic lenses or concave transducers to make a convergent ultrasonic beam. The beam is focused on an object, and the image information (transmissivity or reflectivity) at the focal point is obtained. The focal point is scanned over the object, and one-to-one mapping constructs the image of the object on the CRT. A sufficiently convergent wave is obtained with comparatively small lenses or concave transducers when an ultrasonic wave with a short wavelength is used. When the wavelength is almost equal to that of light, the method is called SAM (mechanically Scanning Acoustic Microscopy); SAM has achieved brilliant success in the field of ultrasonic imaging. Although the mechanical scanning requires much time, one-to-one mapping removes interference between the points of the object. Namely, it does not
FIG. 19. Acoustic image of adult hand. Lower picture shows detail of region of palm near index finger, with bifurcation of blood vessel visible (Mezrich et al., 1975).
generate speckle noise, which is the worst disadvantage of coherent imaging. Therefore, methods of speeding up the scanning have recently been vigorously investigated, rather than replacing the scanning mechanism with other procedures.
a. Mechanically Scanning Acoustic Microscopy (SAM). Figure 20 shows a schematic diagram of SAM for (a) the transmission mode and (b) the reflection mode. The pulse signal from the oscillator is converted into an ultrasonic plane wave by the transducer. The plane wave travels through a sapphire block. The interface between the sapphire and water forms a concave lens, through which the ultrasonic wave is transmitted into the water as a convergent wave. The object is placed on the mylar support in the water. For the transmission mode, the convergent wave is focused in the object, through which it transmits
FIG. 20. Schematic diagram of Scanning Acoustic Microscope (SAM) (Quate et al., 1979). (a) Transmission mode. (b) Reflection mode. Labeled parts: piezoelectric transducer (ZnO layer between metal electrodes), sapphire block, contour of acoustic radiation.
and propagates as the divergent wave. It is restored to the plane wave by the concave lens of the receiver. Then the wave is reconverted into the electric current. Because the wave is most affected by the elastic properties about the focal point, the amplitude of the current is considered to be proportional to the transmissivity in the vicinity of the focal point. The received current, that is, the image signal, is amplified and displayed at the corresponding point on the CRT. Then the position of the object is slightly shifted by the x-y scanning equipment, and the operation is repeated to obtain the image of the whole object. Usually the time required is several seconds. The influence of the multiple reflections, mainly due to the water-sapphire interface and the water-object interface, can be excluded by adopting the pulse wave and an appropriate time window. The reflection mode works on almost the same principle as the transmission mode. The former, however, uses the circulator to separate the received signal from the transmitted signal. Figure 21 shows an example of the image formed by SAM in the reflection mode (after Quate et al., 1979). The lower picture (b) is the ultrasonic image of polished samples of coal. The upper picture (a) is the optical image, which is shown for comparison. The operating frequency of the ultrasonic wave is 1100 MHz, i.e., the wavelength is about 1 µm. Optical images plainly represent the optical properties of objects. Ultrasonic images, on the other hand, display elastic properties. The two often form different images of the same object. The optical reflectivity of the sample in Fig. 21 varies between 1 and 2 percent. The ultrasonic reflectivity is much larger (typically 10 to 100 percent). Thus, the ultrasonic image increases contrast in this case. Acoustic microscopy has been investigated by many researchers, and many excellent explanations have been published. Readers with a deeper interest in the subject should refer to such explanations as Quate et al. (1979), Kessler and Yuhas (1979), and Chubachi (1982).
FIG. 21. (a) Optical (×125) and (b) acoustic images in reflection of polished sample of coal. 1100 MHz (Quate et al., 1979).
b. Mechanically Scanning Imaging for Medical Diagnosis. Figure 22 shows the schematic diagram of the imaging system developed for medical diagnosis (Green et al., 1972). Ultrasonic waves with an operating frequency of 5 MHz travel into the human body. The wavelength is 0.3 mm, which is roughly two orders of magnitude longer than that used in SAM. The size of the system is increased by the same factor. An ultrasonic pulse wave is transmitted from the transmitting transducer. A continuous wave can be used in principle, but the pulse wave is preferable to overcome the multipath problem. The pulse is received by the receiving transducer after traveling through the object and the convergent lens. The amplitude of the received pulse is considered to be proportional to the transmissivity at the focal point, as explained in the preceding subsection on SAM.
The transmitting transducer is scanned synchronously with the focused receiver. A small light source, modulated by the received signal, is also scanned with the transducers and paints out the image on film in a time-exposed camera. The transmission mode has been discussed so far. The reflection mode can also be realized. Most ultrasonic images in medical diagnosis have been constructed by the pulse-echo method, which represents the cross-sectional (B-scan) images of the object. On the other hand, the image made by this system is the orthographic (C-scan) image which is similar to the usual optical imaging. This imaging format is preferable for workers in medicine.
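The wavelengths quoted for the two mechanically scanned systems follow from λ = c/f. The short sketch below checks them, assuming a nominal sound speed of 1500 m/s in water; this value and the function name are assumptions for illustration and are not given in the text.

```python
def wavelength(frequency_hz, sound_speed=1500.0):
    """Wavelength in metres; 1500 m/s is an assumed nominal sound speed in water."""
    return sound_speed / frequency_hz

sam = wavelength(1100e6)      # acoustic microscope of Fig. 21 (1100 MHz)
medical = wavelength(5e6)     # medical imaging system of Fig. 22 (5 MHz)

print(f"SAM: {sam * 1e6:.2f} um, medical: {medical * 1e3:.2f} mm, "
      f"ratio: {medical / sam:.0f}")
```

Under this assumption the medical wavelength is a few hundred times the SAM wavelength, which is why the size of the medical system grows accordingly.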
FIG. 22. Arrangement for producing focused ultrasonic images of excised organs with a mechanically scanned transducer (Green et al., 1972). Labeled parts: modulated scanning lamp, camera, transducer.
The image made by this system is displayed in Fig. 23 (after Green et al., 1972). The object is a human fetus in approximately the 17th week. The carrier frequency of the ultrasonic pulse is 5 MHz. Scan-line spacing is 1/8 mm. An image of very high quality is obtained. However, medical workers reported that the system requires too much time, half an hour or more, to paint out the image. The time should be decreased to several seconds, as little as that of SAM, for the system to be more practical in medical use.
D. Synthetic Aperture Side-Looking Sonar
1. Real Aperture Side-Looking Sonar (SLS)
The image of sea bottom is obtained by real aperture side-looking sonar (SLS). A ship is equipped with the ultrasonic transducer from which the pulse wave is transmitted at a grazing angle to the bottom, as shown in Fig. 24. The
FIG. 23. Transmission image of a human fetus in approximately the 17th week (Green et al., 1972).
FIG. 24. Geometry of side-looking sonar (the ship and its direction of motion are indicated).
reflected wave from the bottom is received by the transducer and is then processed for display. One coordinate, say the ordinate, of the spot on the CRT represents the round-trip time of the wave. The brightness is proportional to the intensity of the wave. The abscissa corresponds to the position of the ship, which advances perpendicularly to the direction of the wave propagation to paint out the whole image. SLS is similar to the pulse-echo imaging method. Both methods work on the same principle, though some differences are present: the scale is larger; SLS looks obliquely, whereas the pulse-echo method looks down; and SLS obtains not a cross-sectional image but an orthographic image. The radial resolution and the azimuthal resolution are determined by the pulse duration and the beam width, respectively. Sufficient radial resolution is obtained because the pulse duration can easily be shortened. The beam width is inversely proportional to the width of the transducer, called the real aperture, which is severely restricted because the equipment is carried by ships. It is therefore difficult to increase the real aperture in order to attain high azimuthal resolution.
2. Synthetic Aperture Technique
The azimuthal resolution can be improved by reducing the aperture, though this seems paradoxical. Figure 25 shows this method, which uses a wide beam. The object at the point P enters the beam when the ship reaches point A and leaves it when the ship reaches point B.
FIG. 25. Synthetic aperture technique (the synthesized aperture and the ultrasonic fan beam are indicated).
Then, during the time when the ship travels from A to B, all the received waves include the reflected wave from P. If the data of these received waves are appropriately processed, the position of P can be precisely known; that is, the point is distinguished with high resolution. The track from A to B is considered as the effective aperture of this imaging system. The track AB is called the synthetic aperture; its length corresponds to the beam width at P, as shown in Fig. 25. The azimuthal resolution improves as AB becomes longer.
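The gain from synthesizing the aperture can be estimated with the usual far-field approximations: a real aperture of width D has a beam width of about λ/D, so its footprint at range R, and hence the synthetic aperture AB, is about λR/D, and a two-way synthetic aperture of length L resolves about λR/(2L). These relations are standard textbook approximations rather than formulas stated in this section, and the numerical values (100 kHz sonar, 0.5 m transducer, 200 m range) are assumptions chosen only for illustration.

```python
def real_aperture_resolution(wavelength, aperture, slant_range):
    """Azimuthal footprint of a real aperture: beam width (lambda/D) times range."""
    return wavelength / aperture * slant_range

def synthetic_aperture_length(wavelength, aperture, slant_range):
    """Track AB along which the point P stays inside the beam (equal to the footprint)."""
    return wavelength / aperture * slant_range

def synthetic_aperture_resolution(wavelength, aperture, slant_range):
    """Two-way synthetic aperture of length L resolves about lambda*R/(2L), i.e. D/2."""
    length = synthetic_aperture_length(wavelength, aperture, slant_range)
    return wavelength * slant_range / (2.0 * length)

wavelength = 1500.0 / 100e3   # assumed: 100 kHz in water at 1500 m/s -> 15 mm
aperture = 0.5                # assumed transducer width D [m]
slant_range = 200.0           # assumed range R [m]

real_res = real_aperture_resolution(wavelength, aperture, slant_range)
synth_res = synthetic_aperture_resolution(wavelength, aperture, slant_range)
print(f"real aperture: {real_res:.2f} m, synthetic aperture: {synth_res:.2f} m")
```

In this approximation the synthetic resolution equals D/2 independently of range, which illustrates why a smaller real aperture (a wider beam) gives a finer azimuthal resolution.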
3. Synthetic Aperture Side-Looking Radar
The imaging technique described in the preceding subsection is implemented in the synthetic aperture side-looking radar, which uses microwaves and an airplane instead of ultrasonic waves and a ship (Cutrona et al., 1966). The synthetic aperture side-looking radar is one of the greatest achievements in the field of synthetic aperture imaging. In fact, synthetic aperture is named after this technique, though the term 'synthetic aperture' is used at present in a much broader sense, as mentioned in Part I. The imaging of the synthetic aperture side-looking radar or sonar can be easily accomplished by recent digital techniques. However, the procedure proposed by Cutrona et al. (1966) is described here. The image is reconstructed with an optical system composed of lenses. Figure 26 shows the geometry for data acquisition. The received signals are displayed on the CRT and recorded on film in the format shown in Fig. 27. Though the object space is naturally scaled down to the storage space, a scale of 1 to 1 is considered here to avoid needless complexity in the discussion.
FIG. 26. Geometry of synthetic aperture radar (vehicle, its direction of motion, the azimuth axis and the object point $(x_0, y_0)$ are indicated).
FIG. 27. Storage format (axes: range and azimuth; the sweep direction and the film motion are indicated).
The airplane equipped with the radar antenna flies along the x-axis, as shown in Fig. 26. The pulse signal with carrier frequency $\omega$ and envelope $A(t)$,
$$A(t)\exp(-i\omega t),$$
is transmitted by the antenna of the airplane at $(x, 0)$. Suppose that only the object at $(x_0, y_0)$ is present. The reflected wave from the object is expressed by
$$\sigma A\!\left(t - \frac{l}{c}\right)\exp\left\{-i\left(\omega t - \frac{2\pi l}{\lambda}\right)\right\},$$
where $\sigma$ is a constant which includes the reflectivity; $c$, the wave velocity; and $\lambda = 2\pi c/\omega$, the wavelength. $l$ is the round-trip distance, $l = 2\{y_0^2 + (x - x_0)^2\}^{1/2}$, which is approximated by
$$l = 2y_0 + \frac{(x - x_0)^2}{y_0} \tag{34}$$
by assuming $|x - x_0|^2 \ll y_0^2$. The detected output $g(x)$ is represented by the phase delay due to propagation over the distance $l$. Thus, substituting Eq. (34),
$$g(x) = B\exp\left\{i\,\frac{2\pi (x - x_0)^2}{\lambda y_0}\right\} \tag{35}$$
is obtained, where $B$ contains $A(t)$. $g(x)$ is recorded on film and is schematically displayed in Fig. 28.
FIG. 28. An example of the storage when the object is a point: one-dimensional zone plate.
The image reconstruction of the synthetic aperture side-looking radar can be interpreted in two ways. One is from the viewpoint of signal processing. The other is from that of coherent optics, or holography. The two interpretations are described below. One can see that the concept of holography is very useful in interpreting the imaging method.
a. Viewpoint of Signal Processing. An object is regarded as a set of points. Thus, in order to resolve each point, the received wave is cross-correlated with the signal
$$r_r(x, y) = C\exp\left\{i\,\frac{2\pi x^2}{\lambda y}\right\}, \tag{36}$$
which is assumed to be the reflected wave from the point at $(x, y)$ with the antenna at the origin $(0, 0)$, $C$ being a constant. Then, the correlation is
$$\psi(x, y) = \int g(x' - x)\, r_r^*(x', y)\, dx' \tag{37}$$
where $r_r^*$ is the complex conjugate of $r_r$,
$$r_r^*(x, y) = C^*\exp\left\{-i\,\frac{2\pi x^2}{\lambda y}\right\}. \tag{38}$$
The calculation of Eq. (37) is executed by the optical system. Though the optical system involves such a troublesome task as the development of film, it has the advantage of executing two-dimensional correlations at the speed of light.
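The correlation of the recorded signal with the conjugate reference can also be evaluated numerically. The sketch below samples a quadratic-phase record of a single point and a reference of the same form, and evaluates the correlation integral as a direct sum for a range of film shifts. The wavelength, range, point position, beam half-width and sampling step are assumptions chosen for illustration, and the constants B and C are set to 1.

```python
import numpy as np

lam, y0, x0 = 0.03, 1000.0, 12.0   # assumed wavelength, range and azimuth of the point [m]
half_beam = 40.0                   # assumed half-length of the synthetic aperture [m]

def g(x):
    """Recorded signal exp{i 2 pi (x - x0)^2 / (lam y0)}, nonzero only inside the beam."""
    inside = np.abs(x - x0) <= half_beam
    return inside * np.exp(1j * 2.0 * np.pi * (x - x0) ** 2 / (lam * y0))

def r_ref(x):
    """Reference signal exp{i 2 pi x^2 / (lam y0)} for a point at range y0."""
    inside = np.abs(x) <= half_beam
    return inside * np.exp(1j * 2.0 * np.pi * x ** 2 / (lam * y0))

dx = 0.1                                            # sampling step [m]
xp = np.arange(-half_beam, half_beam + dx, dx)      # integration variable x'
shifts = np.arange(-30.0, 30.0 + dx, dx)            # film shift x

# psi(x) = integral of g(x' - x) * conj(r_ref(x')) dx', evaluated as a direct sum
psi = np.array([np.sum(g(xp - s) * np.conj(r_ref(xp))) * dx for s in shifts])

peak_shift = shifts[np.argmax(np.abs(psi))]
print(f"correlation peak at film shift x = {peak_shift:.1f} m")
```

With the shift convention g(x' − x), the peak appears at x = −x0, that is, at the shift that brings the recorded zone plate into register with the reference. The long quadratic-phase record is compressed into a narrow peak, which is the resolution gain of the synthetic aperture.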
FIG. 29. Spherical lens.
Properties of Lenses. Before we enter the main subject, some basic properties of lenses are presented here. The transmittance function of the spherical lens shown in Fig. 29 is
$$t_l(x, y) = D\exp\left\{-i\,\frac{\pi (x^2 + y^2)}{\lambda f}\right\} \tag{39}$$
where $f$ is the focal length and $D$ is a constant. Let the (complex) amplitude of the incident light on the front focal plane of the lens be $t(x, y)$; then the amplitude $t_b(\xi, \eta)$ on the back focal plane of the lens becomes
$$t_b(\xi, \eta) = E\int\!\!\int_{-\infty}^{\infty} t(x, y)\exp\left\{-i\,\frac{2\pi (x\xi + y\eta)}{\lambda f}\right\} dx\,dy \tag{40}$$
where $E$ is a constant. Equation (40) represents the Fourier transforming property of the lens. The transmittance function of the cylindrical lens shown in Fig. 30 is
$$t_c(x, y) = F\exp\left\{-i\,\frac{\pi x^2}{\lambda f}\right\} \tag{41}$$
where $F$ is a constant and $f$ is the focal length of the cylindrical lens.
Optical Implementation. Comparing Eq. (41) with Eq. (38), it can be seen that the transmittance function $r_r^*(x, y)$ is realized by the cylindrical lens with the focal length being
FIG. 30. Cylindrical lens.
FIG. 31. Conical lens.
proportional to $y$, that is, simply the conical lens. The conical lens is shown in Fig. 31. Let the transmittance function of the storage film be $g(x, y)$. The film is shifted by $x$ and placed in contact with the conical lens. Then, the film is illuminated by a plane wave of light. The transmitted light $t(x', y)$ becomes
$$t(x', y) = g(x' - x, y)\, r_r^*(x', y) \tag{42}$$
which is the integrand of Eq. (37). The integral of Eq. (42) can be executed by the use of the Fourier transforming property of the spherical lens. At the origin ($\xi = \eta = 0$) of the back focal plane, Eq. (40) becomes
$$t_b(0, 0) = E\int\!\!\int_{-\infty}^{\infty} t(x, y)\, dx\, dy \tag{43}$$
This equation is, however, two-dimensional, whereas Eq. (37) requires a one-dimensional integral. In order to execute only the integral with respect to $x$, a cylindrical lens curved along the y-axis is placed in front of the spherical lens. Then, with respect to the y-axis, $t(x, y)$ is Fourier transformed twice, by the cylindrical lens and by the spherical lens, and is restored to the original state with the exception of a reversed sign. The cylindrical lens has no effect in the x-direction. Therefore, the integral with respect to the x-axis is executed by the spherical lens, and the reconstructed image appears on the line $\xi = 0$ on the back focal plane of the spherical lens. The image is photographed on film through the slit $\xi = 0$. The film for the
FIG. 32. Optical system for reconstructing images (conical lens, spherical lens and slit are indicated).
image is sent in synchronism with the film of the data storage so that the whole image may be reconstructed and recorded. Summing up the description, the optical system is implemented as shown in Fig. 32.
b. Viewpoint of Holography. Figure 28 shows the record of the reflected wave from a point object, which is expressed by Eq. (35). The record is regarded as a one-dimensional zone plate. Therefore, the point object is reconstructed at $(x_0, y_0)$ if the record is illuminated by a plane wave. The record of a general object is considered to be a superposition of a large number of one-dimensional zone plates. Each constituent point is reconstructed from the record when it is illuminated by the plane wave. Two problems are posed at this stage.
a) The tilted image is generally reconstructed as shown in Fig. 33, as depth of the image varies with the height of the film record (refer to Eq. (35)).
FIG. 33. Image points of the one-dimensional zone plate. These points form a tilted image plane because the focal length varies with the height.
b) The image is focused on the tilted plane only in the x-direction. It is blurred in the y-direction because the zone plate is one-dimensional.
The optical system is constructed to solve these problems. The record film is illuminated by an optical plane wave. The wave which passes through the film converges in the x-direction and diverges in the y-direction. First, the wave is collimated in the x-direction by the conical lens, which has a focal length equal and opposite to that of the record film. Then, the wave is collimated in the y-direction by the cylindrical lens. Thus, the wave is collimated in both directions; that is, the image is reconstructed on the plane at infinity. Finally, the image at infinity is reimaged by the spherical lens at its focal plane. The resulting optical system is the same as that displayed in Fig. 32. An example of radar imagery obtained by the optical processing is shown in Fig. 34. The data were collected with a synthetic aperture radar. The radar image, which is of Monroe, Michigan, clearly shows a variety of targets including city, wooded areas, farmland, and so on.
IV. FUNDAMENTALS OF DIGITAL ULTRASONIC IMAGING
The phase and the amplitude of the object wave, that is, the ultrasonic wave scattered from an object, can be easily measured with such a linear-law detector as a piezoelectric transducer. The collected data are called the hologram, though they do not satisfy the definition given by El-Sum (1967) as mentioned in Part 11. They do not involve the difficulty of a twin image and the background noise of holography. The numerical data of the hologram can be processed to reconstruct the image.
FIG.34. Synthetic-aperture radar image of Monroe, Michigan area (Cutrona et al., 1966).
Digital processing of ultrasonic imaging is discussed. The theoretical background is clarified in this section and includes the derivation of the wave equation, the calculation of images from an ordinary hologram, and imaging from the hologram produced with the broad-band pulse, which is equivalent to the multi-frequency hologram.
A. Propagation of an Ultrasonic Wave
1. Wave Equation
The ultrasonic pressure $p$ is the pressure change caused by sound. The particle considered is so small that each constituent element moves in unison with the particle velocity $\mathbf{u}$. Let the particle be a rectangular prism with an area $\Delta y\,\Delta z$ and a length $\Delta x$. Referring to Fig. 35, one-dimensional motion of the particle in the x-direction is considered first. If the ultrasonic pressures on the surfaces at $x$ and $x + \Delta x$ are represented as $p$ and $p + (\partial p/\partial x)\Delta x$, respectively, the pressure $(\partial p/\partial x)\Delta x$ acts on the particle in the negative direction. Newton's equation of motion is written in terms of $u_x$, the particle velocity in the x-direction; $F_x$, the x component of the body force, such as gravity, which acts on the particle from the outside; and $\rho$, the density. Thus,
$$\rho\,\Delta x\,\Delta y\,\Delta z\,\frac{du_x}{dt} = \rho\,\Delta x\,\Delta y\,\Delta z\,F_x - \frac{\partial p}{\partial x}\,\Delta x\,\Delta y\,\Delta z \tag{44}$$
FIG. 35. One-dimensional geometry.
It is assumed that
$$\frac{du_x}{dt} = \frac{\partial u_x}{\partial t} + \frac{\partial u_x}{\partial x}\cdot\frac{\partial x}{\partial t} \approx \frac{\partial u_x}{\partial t} \tag{45}$$
That is, $u$ and $p$ are so small that their products such as $|u|^2$, $p|u|$, $p^2$ and so on can be neglected. Therefore, only the linear field is discussed. Substituting Eq. (45) into Eq. (44),
$$\rho\frac{\partial u_x}{\partial t} = \rho F_x - \frac{\partial p}{\partial x} \tag{46}$$
If the body force is removed ($F_x = 0$), Eq. (46) becomes
$$\rho\frac{\partial u_x}{\partial t} = -\frac{\partial p}{\partial x} \tag{47}$$
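The equation of continuity is similarly considered. The difference between the fluid flow across the surface at $x$ and that at $x + \Delta x$ produces a change in the fluid density $\rho$,
$$\Delta y\,\Delta z\,\rho\,\frac{\partial u_x}{\partial x}\,\Delta x = -\Delta x\,\Delta y\,\Delta z\,\frac{\partial \rho}{\partial t}$$
and then
$$\rho\frac{\partial u_x}{\partial x} = -\frac{\partial \rho}{\partial t} \tag{48}$$
If the pressure applied to the fluid increases by $\Delta p$, the volume $v$ of the fluid decreases by $-\Delta v$ and the density $\rho$ increases by $\Delta\rho$. Neglecting the second-order terms,
$$\frac{\Delta\rho}{\rho} = -\frac{\Delta v}{v} \tag{49}$$
The compressibility of the fluid is defined by
$$\kappa = -\frac{1}{v}\cdot\frac{\Delta v}{\Delta p} \tag{50}$$
Substitution of Eq. (50) into Eq. (49) yields
$$\frac{\Delta\rho}{\rho} = \kappa\,\Delta p \tag{51}$$
Further substitution of Eq. (51) into Eq. (48) yields the following equation:
$$\frac{\partial u_x}{\partial x} = -\kappa\frac{\partial p}{\partial t} \tag{52}$$
Only one dimension (the x-axis) has been considered so far. It is not difficult to generalize these equations to three dimensions if vector notation is used. From Eqs. (46) and (52),
$$\rho\frac{\partial \mathbf{u}}{\partial t} = \rho\mathbf{F} - \nabla p \tag{53}$$
$$\nabla\cdot\mathbf{u} = -\kappa\frac{\partial p}{\partial t} \tag{54}$$
Assuming the spatial and time derivatives of $\rho$ and $\kappa$ to be small enough to be neglected, the following equation is obtained from Eqs. (53) and (54):
$$\nabla^2 p - \rho\kappa\frac{\partial^2 p}{\partial t^2} = -f \tag{55}$$
where $f = -\nabla\cdot(\rho\mathbf{F})$ represents a wave source.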
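2. Green's Theorem
The temporal Fourier transform and its inverse are defined by
$$p_\omega = \int_{-\infty}^{\infty} p(t)\,e^{i\omega t}\,dt \tag{56}$$
$$p(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} p_\omega\,e^{-i\omega t}\,d\omega \tag{57}$$
The Fourier transform of both sides of Eq. (55) becomes
$$\nabla^2 p_\omega + k^2 p_\omega = -f_\omega \tag{58}$$
where $c = (\rho\kappa)^{-1/2}$ is the ultrasonic velocity and $k = \omega/c$ is the wave number. The representation of the wave field is derived by Green's theorem. Green's function of free space is first introduced for the purpose:
$$g_\omega(\mathbf{r}\,|\,\mathbf{r}_0) = \frac{\exp(ik|\mathbf{r} - \mathbf{r}_0|)}{4\pi|\mathbf{r} - \mathbf{r}_0|} \tag{59}$$
where $\mathbf{r}$ and $\mathbf{r}_0$ are position vectors. The function is the solution of the following equation:
$$\nabla^2 g_\omega(\mathbf{r}\,|\,\mathbf{r}_0) + k^2 g_\omega(\mathbf{r}\,|\,\mathbf{r}_0) = -\delta(\mathbf{r} - \mathbf{r}_0) \tag{60}$$
The left-hand side of Eq. (60), into which Eq. (59) is substituted, becomes zero at all points except $\mathbf{r} = \mathbf{r}_0$. At $\mathbf{r} = \mathbf{r}_0$, the integrals of both sides over an infinitesimally small sphere centered at $\mathbf{r}_0$ coincide. Now, Eq. (58) is multiplied by $g_\omega$, and Eq. (60) is multiplied by $p_\omega$. The difference between the two products becomes
$$g_\omega\nabla^2 p_\omega - p_\omega\nabla^2 g_\omega = \nabla\cdot(g_\omega\nabla p_\omega - p_\omega\nabla g_\omega) = -f_\omega g_\omega + p_\omega\,\delta(\mathbf{r} - \mathbf{r}_0)$$
This equation is integrated term by term over the region of volume $V_0$. The integral including the divergence operator is represented by the surface integral over $S_0$, the boundary of $V_0$, with $n$ the outward normal:
$$\iint_{S_0}\left(g_\omega\frac{\partial p_\omega}{\partial n} - p_\omega\frac{\partial g_\omega}{\partial n}\right)ds = -\iiint_{V_0} f_\omega g_\omega\,dv + p_\omega(\mathbf{r}_0)$$
Rearranging this, the equation representing the wave field is obtained as follows:
$$p_\omega(\mathbf{r}) = \iiint_{V_0} g_\omega(\mathbf{r}\,|\,\mathbf{r}_0)\,f_\omega(\mathbf{r}_0)\,dv_0 + \iint_{S_0}\left\{g_\omega(\mathbf{r}\,|\,\mathbf{r}_0)\frac{\partial p_\omega(\mathbf{r}_0)}{\partial n_0} - p_\omega(\mathbf{r}_0)\frac{\partial g_\omega(\mathbf{r}\,|\,\mathbf{r}_0)}{\partial n_0}\right\}ds_0 \tag{61}$$
where the notations $\mathbf{r}_0$ and $\mathbf{r}$ in Eq. (60) are interchanged and the property $g_\omega(\mathbf{r}\,|\,\mathbf{r}_0) = g_\omega(\mathbf{r}_0\,|\,\mathbf{r})$ is used.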
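That the free-space Green's function exp(ikR)/(4πR) satisfies the homogeneous Helmholtz equation away from the source point can be checked numerically. The sketch below estimates the Laplacian with central differences and reports the residual of (∇² + k²)g relative to k²|g|. The frequency, sound speed, observation point and step size are assumptions chosen for illustration.

```python
import numpy as np

def green(r, r0, k):
    """Free-space Green's function exp(ikR) / (4 pi R) with R = |r - r0|."""
    R = np.linalg.norm(np.asarray(r, dtype=float) - np.asarray(r0, dtype=float))
    return np.exp(1j * k * R) / (4.0 * np.pi * R)

def helmholtz_residual(r, r0, k, h):
    """Central-difference estimate of (Laplacian + k^2) g at r; about 0 for r != r0."""
    r = np.asarray(r, dtype=float)
    lap = -6.0 * green(r, r0, k)
    for axis in range(3):
        step = np.zeros(3)
        step[axis] = h
        lap += green(r + step, r0, k) + green(r - step, r0, k)
    return lap / h**2 + k**2 * green(r, r0, k)

c = 1500.0                      # assumed sound speed in water [m/s]
k = 2.0 * np.pi * 1.0e6 / c     # wave number at an assumed frequency of 1 MHz
r0 = (0.0, 0.0, 0.0)            # source point
r = (0.01, 0.02, 0.03)          # observation point away from the source [m]

residual = helmholtz_residual(r, r0, k, h=1.0e-5)
relative = abs(residual) / (k**2 * abs(green(r, r0, k)))
print(f"relative residual: {relative:.1e}")
```

The remaining residual is the truncation error of the finite-difference Laplacian, of order (kh)²/12, and shrinks as the step h is reduced.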
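3. Two-Dimensional Problem
Equation (61) represents wave fields in three dimensions. In ultrasonic imaging, however, the assumption that the field is constant with respect to, say, the y-axis is often made from the practical viewpoint. The representation in the two-dimensional (x-z) plane is, therefore, useful and is described here. The two-dimensional Green's function corresponding to Eq. (59) is
$$g_\omega(\mathbf{r}\,|\,\mathbf{r}_0) = \frac{i}{4}H_0^{(1)}(kR), \qquad R = |\mathbf{r} - \mathbf{r}_0|, \tag{62}$$
where $H_0^{(1)}$ is the zero-order Hankel function of the first kind. Using Eq. (62), the two-dimensional version of Eq. (61) is obtained:
$$p_\omega(\mathbf{r}) = \iint_{S_0} g_\omega(\mathbf{r}\,|\,\mathbf{r}_0)\,f_\omega(\mathbf{r}_0)\,ds_0 + \oint_{L_0}\left\{g_\omega(\mathbf{r}\,|\,\mathbf{r}_0)\frac{\partial p_\omega(\mathbf{r}_0)}{\partial n_0} - p_\omega(\mathbf{r}_0)\frac{\partial g_\omega(\mathbf{r}\,|\,\mathbf{r}_0)}{\partial n_0}\right\}dl_0 \tag{63}$$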
4. The Rayleigh-Sommerfeld Formula of Diffraction
a. Three-Dimensional Representation. The representations of the wave fields described in the preceding subsections are applied to the theory of ultrasonic imaging. As a step in the application, the Rayleigh-Sommerfeld formula of diffraction by a plane screen is derived. As shown in Fig. 36, the infinite plane $\Sigma$ forms a boundary. The Green's function suitable for the boundary is
$$g_\omega(\mathbf{r}\,|\,\mathbf{r}_0) = \frac{\exp\{ikR\}}{4\pi R} - \frac{\exp\{ikR'\}}{4\pi R'}, \tag{64}$$
where $R = |\mathbf{r} - \mathbf{r}_0|$ and $R' = |\mathbf{r}' - \mathbf{r}_0|$. The position vector $\mathbf{r}$ expresses a point to the right of the boundary; $\mathbf{r}'$, the point symmetrical to $\mathbf{r}$ with respect to the boundary; and $\mathbf{r}_0$, any point. The Green's function represented by Eq. (64) satisfies Eq. (60). This can be proved in the same way as for Eq. (59). Thus the representation of Eq. (61) also applies to the boundary if $g_\omega$ in Eq. (61) is given by Eq. (64). Now consider only the region to the right of the boundary in Fig. 36. The infinite plane $\Sigma$ together with the infinite sphere $S'$ can be regarded as $S_0$ in Eq. (61). $f_\omega$ is zero because there is no source within the boundary. If the field $p_\omega$ vanishes as fast as a diverging wave, the integral over $S'$ vanishes. $g_\omega(\mathbf{r}_0\,|\,\mathbf{r})$ becomes zero on $\Sigma$ because $R = R'$ on $\Sigma$. Thus, Eq. (61) reduces to
$$p_\omega(\mathbf{r}) = -\iint_{\Sigma} p_\omega(\mathbf{r}_0)\frac{\partial g_\omega}{\partial n}\,ds_0 \tag{65}$$
FIG. 36. Diffraction from the planar object.
Then, consider
$$\frac{\partial g_\omega}{\partial n} = \left(ik - \frac{1}{R}\right)\frac{\exp(ikR)}{4\pi R}\frac{\partial R}{\partial n} - \left(ik - \frac{1}{R'}\right)\frac{\exp(ikR')}{4\pi R'}\frac{\partial R'}{\partial n}$$
where, on $\Sigma$, $R' = R$ and $\partial R'/\partial n = -\partial R/\partial n$. Assuming the distance from $\Sigma$ to the observation point to be much longer than the wavelength,
$$k \gg \frac{1}{R}$$
and approximating
$$\frac{\partial R}{\partial n} = 1$$
the following equation is obtained:
$$\frac{\partial g_\omega}{\partial n} = \frac{ik\exp(ikR)}{2\pi R} \tag{66}$$
Substitution of Eq. (66) into Eq. (65) yields the Rayleigh-Sommerfeld
formula of diffraction:
$$p_\omega(\mathbf{r}) = -\frac{ik}{2\pi}\iint_{\Sigma} p_\omega(\mathbf{r}_0)\frac{\exp(ikR)}{R}\,ds_0 \tag{67}$$
Equation (67) explains quantitatively the so-called Huygens' principle: the spherical wave $\exp(ikR)/(4\pi R)$, whose amplitude is $-2ik\,p_\omega(\mathbf{r}_0)$, is generated at the point $\mathbf{r}_0$ on the boundary. The spherical waves from all points are observed at the point $\mathbf{r}$.
b. Two-Dimensional Representation. The process used to obtain Eq. (67) from Eq. (61) is applied to Eq. (63) to yield the two-dimensional Rayleigh-Sommerfeld formula of diffraction. The result is as follows:
$$p_\omega(\mathbf{r}) = \frac{ik}{2}\int_{\Sigma} p_\omega(\mathbf{r}_0)\,H_1^{(1)}(kR)\,dl_0 \tag{68}$$
B. Numerical Reconstruction of an Image from a Hologram
1. Recording Holograms
Figure 37 shows the geometry of recording a hologram using piezoelectric transducers. An object is placed on the object plane ($z = 0$), which is insonified by the ultrasonic wave $p_i$. The scattered waves are measured over the hologram plane ($z = z_h$) by a planar array of detectors, or by mechanical scanning of a single detector if sufficient time is available for data acquisition. $f(x_0, y_0)$ denotes the scattering coefficient of the object; $p_i(x_0, y_0)$ and $p_o(x_0, y_0)$ are the illuminating wave and the object wave, respectively, on the object plane. Then,
$$p_o(x_0, y_0) = p_i(x_0, y_0)\,f(x_0, y_0) \tag{69}$$
$p_h(x_h, y_h)$, the object wave on the hologram plane, is derived from Eq. (67):
$$p_h(x_h, y_h) = -\frac{ik}{2\pi}\int\!\!\int_{-\infty}^{\infty} p_o(x_0, y_0)\frac{\exp(ikR)}{R}\,dx_0\,dy_0, \qquad R = \{z_h^2 + (x_h - x_0)^2 + (y_h - y_0)^2\}^{1/2} \tag{70}$$
The set of the measurements $p_h(x_h, y_h)$ is called the hologram. The illuminating wave $p_i(x_0, y_0)$ is usually known. Then $f(x_0, y_0)$ is easily obtained from $p_o(x_0, y_0)$ by the use of Eq. (69). $p_o(x_0, y_0)$ as well as $f(x_0, y_0)$ is often called the 'object'. Image reconstruction means reconstruction of the object, that is, $f(x_0, y_0)$ or $p_o(x_0, y_0)$, from the hologram.
FIG. 37. Geometry of recording ultrasonic hologram.
2. Numerical Reconstruction of Images
The relationship between the hologram $p_h(x_h, y_h)$ and the object $p_o(x_0, y_0)$ is given by Eq. (70). Using this relationship, $p_o(x_0, y_0)$ can be calculated from $p_h(x_h, y_h)$. For example, Eq. (70) means that $p_h(x_h, y_h)$ is a linear combination of $p_o(x_0, y_0)$; that is, for various $(x_h, y_h)$, Eq. (70) presents simultaneous linear equations in the unknowns $p_o(x_0, y_0)$. A numerical solution for $p_o(x_0, y_0)$ is, of course, obtainable. However, the process takes too much time. The procedure which reconstructs images using a spatial Fourier transform is described here. The spatial Fourier transform $P(u, v)$ of $p(x, y)$ is defined by
$$P(u, v) = \int\!\!\int_{-\infty}^{\infty} p(x, y)\exp\{-i(ux + vy)\}\,dx\,dy \tag{71}$$
and its inverse by
$$p(x, y) = \frac{1}{4\pi^2}\int\!\!\int_{-\infty}^{\infty} P(u, v)\exp\{i(ux + vy)\}\,du\,dv \tag{72}$$
Note that the sign of $i$, the imaginary unit, in the temporal Fourier transforms of Eqs. (56) and (57) is reversed in the spatial Fourier transforms of Eqs. (71) and (72).
It is known from Eq. (70) that the hologram $p_h$ is the two-dimensional convolution of the object $p_o$ and the spherical wave
$$S(x, y) = \frac{\exp(ikR)}{4\pi R}, \qquad R = (x^2 + y^2 + z^2)^{1/2}, \tag{73}$$
that is,
$$p_h(x, y) = -2ik\,p_o(x, y) * S(x, y) \tag{74}$$
where $*$ means the convolution integral. By the convolution theorem, the Fourier transform of the convolution of two functions is the product of their transforms:
$$P_h(u, v) = -2ik\,P_o(u, v)\,S(u, v) \tag{75}$$
where $S(u, v)$ is the Fourier transform of the spherical wave and is derived in Appendix A attached at the end of this section. Now, as $z = z_h$,
$$S(u, v) = \frac{i\exp\{iz_h(k^2 - u^2 - v^2)^{1/2}\}}{2(k^2 - u^2 - v^2)^{1/2}} \tag{76}$$
$P_h(u, v)$, the spatial Fourier transform of the hologram, is numerically calculated by the use of the fast Fourier transform (FFT) from the numerical data of $p_h(x_h, y_h)$. Referring to Eqs. (75) and (76), $P_o(u, v)$, the spatial Fourier transform of the object, is obtained by
$$P_o(u, v) = \frac{(k^2 - u^2 - v^2)^{1/2}}{k}\exp\{-iz_h(k^2 - u^2 - v^2)^{1/2}\}\,P_h(u, v) \tag{77}$$
The inverse Fourier transform of $P_o(u, v)$ can again be numerically calculated by the FFT, which results in the reconstructed image:
$$p_o(x, y) = \frac{1}{4\pi^2}\int\!\!\int_{-\infty}^{\infty} P_o(u, v)\exp\{i(ux + vy)\}\,du\,dv \tag{78}$$
If the scattering coefficient $f(x, y)$ is required, it can be obtained by
$$f(x, y) = \frac{p_o(x, y)}{p_i(x, y)} \tag{79}$$
where $p_i(x, y)$ is the illuminating wave, which was already described in Eq. (69). The main calculation needed to reconstruct the image in this procedure consists of two two-dimensional Fourier transforms executed by FFT. One is the transform of the hologram and the other is the inverse transform.
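The two-FFT procedure can be sketched with NumPy as follows. The forward step builds a synthetic hologram by multiplying the object spectrum by k·exp{i z_h (k² − u² − v²)^{1/2}}/(k² − u² − v²)^{1/2}, and the backward step applies the reciprocal factor before the inverse FFT. The function names, the Gaussian test object, the wavelength and the sampling step are assumptions for illustration; the sampling step is set equal to the wavelength so that u² + v² < k² everywhere on the FFT grid and no evanescent components arise.

```python
import numpy as np

def kz_grid(shape, dx, k):
    """(k^2 - u^2 - v^2)^(1/2) on the FFT frequency grid (u, v in rad/m)."""
    ny, nx = shape
    u = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)
    v = 2.0 * np.pi * np.fft.fftfreq(ny, d=dx)
    uu, vv = np.meshgrid(u, v)
    return np.sqrt((k**2 - uu**2 - vv**2).astype(complex))

def propagate_to_hologram(p_o, dx, k, z_h):
    """Forward model: object spectrum times k * exp(i z_h kz) / kz."""
    kz = kz_grid(p_o.shape, dx, k)
    return np.fft.ifft2((k / kz) * np.exp(1j * z_h * kz) * np.fft.fft2(p_o))

def reconstruct_object(p_h, dx, k, z_h):
    """Backward step: hologram spectrum times (kz / k) * exp(-i z_h kz), then inverse FFT."""
    kz = kz_grid(p_h.shape, dx, k)
    return np.fft.ifft2((kz / k) * np.exp(-1j * z_h * kz) * np.fft.fft2(p_h))

wavelength = 0.3e-3              # assumed: 5 MHz in water
k = 2.0 * np.pi / wavelength
dx = wavelength                  # sampling step chosen so that u^2 + v^2 < k^2 on the grid
n = 64
x = (np.arange(n) - n // 2) * dx
xx, yy = np.meshgrid(x, x)
p_o = np.exp(-((xx - 2.0e-3) ** 2 + yy**2) / (2.0 * (1.0e-3) ** 2))   # assumed test object

p_h = propagate_to_hologram(p_o, dx, k, z_h=0.05)
p_rec = reconstruct_object(p_h, dx, k, z_h=0.05)
print("max reconstruction error:", np.max(np.abs(p_rec - p_o)))
```

Because the synthetic hologram is produced with the same transfer function, this round trip checks only the bookkeeping of the two FFTs. With measured data, components with u² + v² ≥ k² would be amplified by the backward factor and would have to be filtered out.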
3. Imaging Procedure Based on Fresnel Approximation
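The numerical reconstruction procedure presented in the preceding subsection imposes no particular condition on the recording of the hologram. The procedure is applicable to almost all holograms. However, it requires two two-dimensional FFTs. The calculation can be halved when the Fresnel approximation is valid. The procedure, which is accomplished with only one two-dimensional FFT, is described here. Consider Eq. (70) first:
$$p_h(x_h, y_h) = -\frac{ik}{2\pi}\int\!\!\int_{-\infty}^{\infty} p_o(x_0, y_0)\frac{\exp(ikR)}{R}\,dx_0\,dy_0$$
If the lateral dimensions are much smaller than the longitudinal distance from the object plane to the hologram plane,
$$|x_0|,\ |x_h|,\ |y_0|,\ |y_h| \ll |z_h|,$$
$R$ in the argument of the exponential in the integrand of Eq. (70) can be approximated as follows:
$$R = \{z_h^2 + (x_0 - x_h)^2 + (y_0 - y_h)^2\}^{1/2} \approx z_h + \frac{(x_0 - x_h)^2 + (y_0 - y_h)^2}{2z_h} \tag{80}$$
whereas it is sufficient for $R$ in the denominator to be approximated by $z_h$. The Fresnel approximation of the diffraction formula is obtained:
$$p_h(x_h, y_h) = -\frac{i}{\lambda z_h}\exp(ikz_h)\int\!\!\int_{-\infty}^{\infty} p_o(x_0, y_0)\exp\left\{\frac{i\pi[(x_0 - x_h)^2 + (y_0 - y_h)^2]}{\lambda z_h}\right\}dx_0\,dy_0 \tag{81}$$
where the wave number $k = 2\pi/\lambda$ is used. Referring to Eq. (71), the definition of the spatial Fourier transform, Eq. (81) can be written as follows:
$$p_h(x_h, y_h) = -\frac{i}{\lambda z_h}\exp(ikz_h)\exp\left\{\frac{i\pi(x_h^2 + y_h^2)}{\lambda z_h}\right\}\mathcal{F}\left[p_o(x_0, y_0)\exp\left\{\frac{i\pi(x_0^2 + y_0^2)}{\lambda z_h}\right\}\right] \tag{82}$$
where $\mathcal{F}$ is the operator of the spatial Fourier transform, evaluated at $u = 2\pi x_h/(\lambda z_h)$ and $v = 2\pi y_h/(\lambda z_h)$. Equation (82) expresses that the hologram $p_h(x_h, y_h)$ is the Fourier transform of the object $p_o(x_0, y_0)$ multiplied by a phase factor $\exp\{i\pi(x_0^2 + y_0^2)/(\lambda z_h)\}$, except for the coefficient. Then, it is easily seen that the image of the object $p_o(x_0, y_0)$ can be calculated by the inverse Fourier transform of the hologram; that is,
$$p_o(x_0, y_0) = i\lambda z_h\exp(-ikz_h)\exp\left\{-\frac{i\pi(x_0^2 + y_0^2)}{\lambda z_h}\right\}\mathcal{F}^{-1}\left[p_h(x_h, y_h)\exp\left\{-\frac{i\pi(x_h^2 + y_h^2)}{\lambda z_h}\right\}\right] \tag{83}$$
where $\mathcal{F}^{-1}$ is, of course, the operator of the inverse Fourier transform. Equation (83) shows that the reconstruction of the image from the hologram is accomplished by only one two-dimensional FFT. Usually only the intensity of the image is of concern. When this is the case, Eq. (83) is rewritten as
$$|p_o(x_0, y_0)|^2 = (\lambda z_h)^2\left|\mathcal{F}^{-1}\left[p_h(x_h, y_h)\exp\left\{-\frac{i\pi(x_h^2 + y_h^2)}{\lambda z_h}\right\}\right]\right|^2 \tag{84}$$
The image reconstruction is accomplished by Eq. (84). However, Eq. (83) should be discussed further and will be found important in a later section. Rewriting Eq. (83) in the form of an integral, the following equation is obtained:
$$p_o(x_0, y_0) = \frac{i}{\lambda z_h}\exp(-ikz_h)\int\!\!\int_{-\infty}^{\infty} p_h(x_h, y_h)\exp\left\{-\frac{i\pi[(x_0 - x_h)^2 + (y_0 - y_h)^2]}{\lambda z_h}\right\}dx_h\,dy_h \tag{85}$$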
The exact equation is restored from this equation as follows:
$$p_o(x_0, y_0) = \frac{i}{\lambda}\int\!\!\int_{-\infty}^{\infty} p_h(x_h, y_h)\frac{\exp\{-ikR\}}{R}\,dx_h\,dy_h \tag{86}$$
It is seen that Eq. (86) and Eq. (70) form a pair. Equation (70) shows that the wave propagates from $p_o$ to $p_h$. Equation (86) can be interpreted as a numerical implementation of the synthetic aperture imaging schematically shown in Fig. 3 in Part I. Each measurement $p_h(x_h, y_h)$ is multiplied by $\exp(-ikR)/R$ to compensate the phase delay and attenuation caused by the propagation from the object point to the measuring point. Then, the measurements are added to reconstruct images. In fact, if one intends to implement the synthetic aperture ultrasonic imaging equipment with an electric circuit, it becomes very large and complex, as described by Harmuth (1979). However, if one adopts the numerical method, the image can be obtained by executing the two-dimensional FFT only once, though it takes some time. The numerical reconstruction method discussed in this subsection is named the backward propagation method.
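The single-FFT reconstruction under the Fresnel approximation can be sketched as follows: the hologram is multiplied by the quadratic phase factor exp{−iπ(x_h² + y_h²)/(λ z_h)} and one inverse two-dimensional FFT gives the image intensity up to a constant factor. The function name, the grid parameters and the synthetic point-object hologram are assumptions for illustration. The image sample spacing that results from the FFT is λ z_h/(N Δx_h), and the test point is placed on that grid.

```python
import numpy as np

def fresnel_image(p_h, dx_h, wavelength, z_h):
    """Image intensity (up to a constant) from one inverse 2-D FFT of the hologram."""
    n_y, n_x = p_h.shape
    x_h = (np.arange(n_x) - n_x // 2) * dx_h
    y_h = (np.arange(n_y) - n_y // 2) * dx_h
    xx, yy = np.meshgrid(x_h, y_h)
    phase = np.exp(-1j * np.pi * (xx**2 + yy**2) / (wavelength * z_h))
    intensity = np.fft.fftshift(np.abs(np.fft.ifft2(p_h * phase)) ** 2)
    # image coordinates implied by u = 2 pi x_h / (lambda z_h)
    x_o = np.fft.fftshift(np.fft.fftfreq(n_x, d=dx_h)) * wavelength * z_h
    y_o = np.fft.fftshift(np.fft.fftfreq(n_y, d=dx_h)) * wavelength * z_h
    return intensity, x_o, y_o

wavelength, z_h, n, dx_h = 0.3e-3, 0.2, 128, 0.5e-3   # assumed values [m]
pixel = wavelength * z_h / (n * dx_h)                 # image sample spacing
a, b = 5 * pixel, -3 * pixel                          # assumed point-object position

# Fresnel hologram of a point object at (a, b), constant factors omitted
x_h = (np.arange(n) - n // 2) * dx_h
xx, yy = np.meshgrid(x_h, x_h)
p_h = np.exp(1j * np.pi * ((xx - a) ** 2 + (yy - b) ** 2) / (wavelength * z_h))

intensity, x_o, y_o = fresnel_image(p_h, dx_h, wavelength, z_h)
iy, ix = np.unravel_index(np.argmax(intensity), intensity.shape)
print(f"reconstructed point at x = {x_o[ix] * 1e3:.3f} mm, y = {y_o[iy] * 1e3:.3f} mm")
```

The image grid is fixed by the hologram sampling and the distance z_h, so changing z_h rescales the reconstructed field of view; this is a property of the single-FFT formulation rather than a free parameter.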
C. Holography with a Broad-Band Pulse Wave
An imaging system using an ultrasonic wave with a single frequency can increase its azimuthal resolution by synthesizing the aperture. However, the radial resolution is very poor, as described in Part II. An imaging system using sonar finds the position of the object from the time at which the pulse reflected from the object is received. The resolution of such a system is proportional to $c\,t_p$, where $c$ is the acoustic velocity and $t_p$ the duration of the pulse. Good resolution is attained by adopting a short $t_p$. From the viewpoint of the frequency domain, $t_p$ is inversely proportional to the frequency bandwidth of the pulse, so the resolution improves as the bandwidth becomes broader. Multifrequency signals are also adopted in synthetic aperture imaging methods to improve the radial resolution. Holograms are produced with different frequencies, and the images from the holograms are then coherently summed to increase the radial resolution. However, this procedure is usually difficult to follow. The procedure which uses broad-band pulses and reconstructs images in the real domain instead of the frequency domain is practical and is equivalent to the coherent summation. This procedure is discussed here. An object is illuminated by the broad-band pulse wave as shown in Fig. 37.
KEINOSUKE NAGAI
The scattered wave is $p_o(t, x_o, y_o)$ just behind the object and $p_h(t, x_h, y_h)$ on the hologram plane, where it is received and recorded. Let the temporal Fourier transforms of $p_o(t, x_o, y_o)$ and $p_h(t, x_h, y_h)$ be $P_o(\omega, x_o, y_o)$ and $P_h(\omega, x_h, y_h)$, respectively. Similarly to Eq. (86), the following equation is obtained:
$$P_o(\omega, x_o, y_o) = -\frac{i}{\lambda}\iint_{-\infty}^{\infty} P_h(\omega, x_h, y_h)\,\exp(-ikR)\cdot R\,dx_h\,dy_h \tag{87}$$
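The coherent summation of $P_o(\omega, x_o, y_o)$ is
$$p_o(t, x_o, y_o) = \frac{1}{2\pi}\int_{-\infty}^{\infty} P_o(\omega, x_o, y_o)\exp(-i\omega t)\,d\omega \tag{88}$$
Substitution of Eq. (87) into Eq. (88) yields, with $1/\lambda = \omega/2\pi c$,
$$p_o(t, x_o, y_o) = \frac{1}{2\pi c}\iint_{-\infty}^{\infty} \frac{\partial}{\partial t}\,p_h\!\left(t + \frac{R}{c}, x_h, y_h\right)\cdot R\,dx_h\,dy_h \tag{89}$$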
The property that the inverse transform of $-i\omega P(\omega)$ is $\partial p(t)/\partial t$ is used in the derivation. Equation (89) shows the algorithm by which the image is calculated from $p_h(t, x_h, y_h)$ in the real domain. It represents the backward propagation method with the broad-band pulse. The wave field $p_o(t, x_o, y_o)$ is determined by the measurement $p_h(t + R/c, x_h, y_h)$, whose time argument is shifted by the propagation delay from the object to the measuring point.

APPENDIX. SPATIAL FOURIER TRANSFORM OF A SPHERICAL WAVE
A spherical wave is represented in the spatial frequency domain as m
exp(ikR)
1
exp{i(ux u2
+ uy + WZ)}du du dw
+ v2 + w 2- k2
('41)
which is quoted from Morse and Ingard (1968). Referring to the definition of Eq. (72), the spatial Fourier transform of the spherical wave is S(U, u) = -
exp(iwz)
u2
+ v 2 + w 2 - k 2 dw
FIG. 38. Contour of the integral of Eq. (A2) in the complex plane of $w$.
The integrand of this equation is rewritten as
$$\frac{\exp(iwz)}{u^2 + v^2 + w^2 - k^2} = \frac{\exp(iwz)}{2(k^2 - u^2 - v^2)^{1/2}}\left\{\frac{1}{w - (k^2 - u^2 - v^2)^{1/2}} - \frac{1}{w + (k^2 - u^2 - v^2)^{1/2}}\right\} \tag{A3}$$
When our concern is limited to the region $z > 0$, the contour of the integral in the complex plane of $w$ can be taken as shown in Fig. 38. The integral of Eq. (A2) is evaluated by the residue as follows,
V. PROPERTIES OF A TRANSDUCER ARRAY

A piezoelectric detector is usually used in ultrasonic imaging methods. As mentioned in Section III, the piezoelectric detector is the most sensitive of all detectors, as it can detect a wave field as weak as $10^{-11}$ W/cm$^2$. However, it has the disadvantage that it cannot detect the entire spatial distribution of the wave field at one time. To overcome this problem, either a single detector has to be scanned mechanically or many detectors have to be arrayed on the plane of measurement. The mechanical scan is adopted in scanning acoustic microscopy (SAM) and the similar methods mentioned in Section III. The array of detectors is discussed in this section. The wave field produced by a single transducer is analyzed first. Then the result is used to derive the directivity of the field radiated from the array. Finally, a procedure is discussed that combines a small transmitter array with a small receiver array to acquire the large number of hologram data needed to reconstruct images.

A. Radiated Field from a Single Transducer
The wave field produced by a single transducer is considered first. The field in the unbounded medium is represented from Eq. (61) in Part IV,
where $g_\omega(\mathbf{r} \mid \mathbf{r}_o)$ is the Green's function in the medium, which was represented in Eq. (59) and is cited again as follows:
$f_\omega$ expresses the wave source and appeared in the wave equation (58) as well,
$$\nabla^2 P_\omega + k^2 P_\omega = -f_\omega \tag{92}$$
The source of a vibrating plane is discussed. The region $V_o$ of thickness $t$ surrounding the plane is considered. Figure 39 shows the region, of which the surface is denoted by $S_o$. Integrating Eq. (92) over $V_o$, the second term of the left-hand side vanishes in the limit as $t$ becomes zero. Then,
$$\text{the right-hand side} = -\iint_{S_o} t f_\omega\,dS_o \tag{94}$$
The integrands of Eqs. (93) and (94) are equal, as the two sides are equal independently of the shape of $V_o$,
where $\mathbf{n}_1$ and $\mathbf{n}_2$ are unit vectors along the external normals of the plane.
FIG. 39. Region $V_o$ and its surface $S_o$ surrounding the planar source.

Let $\mathbf{n} = \mathbf{n}_1 = -\mathbf{n}_2$; then

Substitution of Eq. (96) into Eq. (90) yields
$$P_\omega(\mathbf{r}) = -\frac{1}{2\pi}\iint_{S_o} \frac{\partial P_\omega}{\partial n}\,\frac{\exp(ikR)}{R}\,dS_o \tag{97}$$
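The inverse temporal Fourier transform of Eq. (97) represents the wave field in the real domain,
$$p(t, \mathbf{r}) = -\frac{1}{2\pi}\iint_{S_o} \frac{1}{R}\,\frac{\partial p}{\partial n}\!\left(t - \frac{R}{c}\right) dS_o \tag{98}$$
where $c$ is the acoustic velocity. If the plane source vibrates with the velocity $u$ in the normal direction, Eq. (47) in Section IV becomes
$$\rho\,\frac{\partial u}{\partial t} = -\frac{\partial p}{\partial n} \tag{99}$$
Substituting Eq. (99) into Eq. (98),
$$p(t, \mathbf{r}) = \frac{\rho}{2\pi}\iint_{S_o} \frac{u'(t - R/c)}{R}\,dS_o \tag{100}$$
where $u'$ represents the time derivative of $u$. Equation (100) expresses the wave field radiated from a single transducer.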
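When the whole surface moves uniformly, setting
$$h(t, \mathbf{r}) = \frac{\rho}{2\pi}\iint_{S_o} \frac{\delta(t - R/c)}{R}\,dS_o \tag{101}$$
the wave field is represented by $h(t, \mathbf{r})$ in the form
$$p(t, \mathbf{r}) = h(t, \mathbf{r}) * u'(t) \tag{102}$$
or alternatively
$$p(t, \mathbf{r}) = h'(t, \mathbf{r}) * u(t) \tag{103}$$
where $*$ denotes the convolution integral.

Radiation Field from Rectangular Transducer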
Now the radiation field from the rectangular transducer is discussed, which will be useful in later sections. The geometry is shown in Fig. 40(a). The width $b$ of the transducer is, for the sake of simplicity, assumed to be very small. If the observation point $\mathbf{r}$ is placed on the x-z plane, with the angle of $\mathbf{r}$ from the z-axis being $\theta$ and $|\mathbf{r}| = R$, Eq. (101) becomes
$$h(t, \mathbf{r}) = \frac{\rho b}{2\pi}\int_{-a/2}^{a/2} \frac{\delta(t - R/c)}{R}\,dx \tag{104}$$

FIG. 40. Geometry. (a) Rectangular transducer (shaded area). (b) The x-z plane.
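Referring to Fig. 40(b), $dx$ is expressed in relation to $dR$ by
$$dx = \frac{dR}{\sin\theta} \tag{105}$$
Using the identity
$$\delta\!\left(t - \frac{R}{c}\right) = c\,\delta(R - ct) \tag{106}$$
and the approximation for $R \gg a$, the following equation is obtained:
$$h(t, \mathbf{r}) = \frac{\rho c b}{2\pi R\sin\theta}\int_{R-(a/2)\sin\theta}^{R+(a/2)\sin\theta} \delta(R - ct)\,dR \tag{107}$$
or
$$h(t, \mathbf{r}) \approx \begin{cases} \dfrac{\rho c b}{2\pi R\sin\theta} & \text{if } \dfrac{R - (a/2)\sin\theta}{c} \le t \le \dfrac{R + (a/2)\sin\theta}{c} \\[2ex] 0 & \text{otherwise,} \end{cases} \tag{108}$$
schematically shown in Fig. 41.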
FIG. 41. $h(t)$ and $h'(t)$ from the rectangular transducer.
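Finally, differentiating both sides, the following equation is obtained:
$$h'(t, \mathbf{r}) = \frac{\rho c b}{2\pi R\sin\theta}\left[\delta\!\left(t - \frac{R - (a/2)\sin\theta}{c}\right) - \delta\!\left(t - \frac{R + (a/2)\sin\theta}{c}\right)\right] \tag{109}$$
$h'(t, \mathbf{r})$ is also shown in Fig. 41. As an example, if the plane vibrates sinusoidally, that is,
$$u(t) = U_o\exp(-i\omega t) \tag{110}$$
the radiated field can be obtained by substituting Eqs. (109) and (110) into Eq. (103):
$$p(t, \mathbf{r}) = \frac{\rho c b U_o}{2\pi R\sin\theta}\left[\exp\!\left\{-i\omega\!\left(t - \frac{R - (a/2)\sin\theta}{c}\right)\right\} - \exp\!\left\{-i\omega\!\left(t - \frac{R + (a/2)\sin\theta}{c}\right)\right\}\right] = \frac{-i\rho c b U_o}{\pi R\sin\theta}\exp\{-i(\omega t - kR)\}\sin\!\left\{\frac{ka}{2}\sin\theta\right\} \tag{111}$$
where $k$ is the wave number, $k = \omega/c$. Normalizing Eq. (111), the directivity of the radiated field from the rectangular transducer is given by
$$D_s(\theta) = \frac{\sin\{(ka/2)\sin\theta\}}{(ka/2)\sin\theta} \tag{112}$$
The numerical example for $a = \lambda$ is calculated and displayed in Fig. 42.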
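The directivity of the rectangular transducer is easy to tabulate numerically. The following Python sketch assumes the normalized form $\sin X / X$ with $X = (ka/2)\sin\theta = (\pi a/\lambda)\sin\theta$; the function name and the sampled angles are illustrative choices, and the limit $X \to 0$ is handled explicitly.

```python
import math

def single_element_directivity(theta, width, wavelength):
    """Normalized directivity sin(X)/X with X = (k a / 2) sin(theta)."""
    x = math.pi * width / wavelength * math.sin(theta)
    if abs(x) < 1e-12:
        return 1.0          # limit of sin(X)/X on the axis
    return math.sin(x) / x

wavelength = 1.0            # arbitrary length unit
width = wavelength          # the case a = lambda
pattern = [(deg, abs(single_element_directivity(math.radians(deg), width, wavelength)))
           for deg in range(0, 91, 15)]
```

For $a = \lambda$ the first zero of the pattern falls at $\sin\theta = \lambda/a = 1$, that is, at 90 degrees from the axis, so a single element of this width radiates over the whole half space.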
FIG. 42. The directivity of the radiated field from the rectangular transducer: $|D_s(\theta)|$, $a = \lambda$.

B. Radiation Field from an Array of Transducers

The directivity of the radiated field from a linear array of transducers is calculated here. The linear array is placed along the x-axis as shown in Fig. 43. The array consists of $N$ transducers, each the same as the one discussed in the preceding section. These transducers are driven by a sinusoidal signal with angular frequency $\omega$. However, the phase of the signal fed to each transducer is delayed by $\varphi_0$ from that fed to its left-hand neighbor, and the phase of its right-hand neighbor is further delayed by $\varphi_0$. Thus, the phase is delayed by $\varphi_0$ from one element to the next, and the phase of the rightmost transducer is therefore delayed by $(N - 1)\varphi_0$ from that of the leftmost transducer. The directivity of the elementary transducer is $D_s(\theta)$. $\varphi_0$ is replaced with $\theta_0$ by the relation $\varphi_0 = kd\sin\theta_0$, where $d$ is the interval of the transducers in the array. It will be found that $\theta_0$ is the deflection angle of the radiated field.

FIG. 43. The radiated field from transducer-array.

The radiated field sufficiently distant from the array in the direction $\theta$ from the z-axis is proportional to $W(\theta)$ and is given as follows:
$$W(\theta) = D_s(\theta) + D_s(\theta)\exp\{ikd(\sin\theta - \sin\theta_0)\} + \cdots + D_s(\theta)\exp\{i(N - 1)kd(\sin\theta - \sin\theta_0)\} \tag{113}$$
The first term represents the field from the leftmost transducer, the second term the field from the second transducer, and so on; the last term corresponds to the field from the rightmost transducer. Equation (113) is easily evaluated, as it is a geometric series,
$$W(\theta) = D_s(\theta)\,\frac{1 - \exp\{iNkd(\sin\theta - \sin\theta_0)\}}{1 - \exp\{ikd(\sin\theta - \sin\theta_0)\}} \tag{114}$$
Normalizing Eq. (114), the directivity of the whole array, $D_w(\theta)$, is obtained as
$$D_w(\theta) = D_s(\theta)\cdot D_a(\theta) \tag{115}$$
where

FIG. 44. The directivity of the radiated field from an array of infinitesimally narrow transducers: $|D_a(\theta)|$, $\theta_0 = 0$, $d = 2\lambda$.
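The array directivity plotted in Fig. 44 can be evaluated with a short Python sketch. It assumes the usual normalized array factor, $|\sin\{(Nkd/2)(\sin\theta - \sin\theta_0)\}| / (N|\sin\{(kd/2)(\sin\theta - \sin\theta_0)\}|)$, which is what the magnitude of a geometric series of $N$ equal-amplitude terms reduces to; the normalization by $N$, the function names, and the numerical threshold are assumptions of this sketch. Grating lobes are located from the zeros of the denominator.

```python
import math

def array_factor_magnitude(theta, n_elements, spacing, wavelength, theta0=0.0):
    """|D_a(theta)| for N equally spaced, equally driven point elements."""
    half_psi = math.pi * spacing / wavelength * (math.sin(theta) - math.sin(theta0))
    denominator = math.sin(half_psi)
    if abs(denominator) < 1e-9:
        return 1.0      # limit of the ratio at the main lobe and grating lobes
    return abs(math.sin(n_elements * half_psi) / (n_elements * denominator))

def grating_lobe_angles(spacing, wavelength, theta0=0.0):
    """Angles in degrees where sin{(kd/2)(sin(theta) - sin(theta0))} = 0, main lobe excluded."""
    angles = []
    max_order = int(2 * spacing / wavelength) + 1
    for m in range(-max_order, max_order + 1):
        if m == 0:
            continue    # m = 0 is the main lobe at theta0
        s = math.sin(theta0) + m * wavelength / spacing
        if -1.0 <= s <= 1.0:
            angles.append(math.degrees(math.asin(s)))
    return angles

lobes = grating_lobe_angles(spacing=2.0, wavelength=1.0)
at_lobe = array_factor_magnitude(math.radians(30.0), 8, 2.0, 1.0)
at_null = array_factor_magnitude(math.asin(1.0 / 16.0), 8, 2.0, 1.0)
```

With $d = 2\lambda$ and $\theta_0 = 0$ the grating lobes appear at $\sin\theta = \pm 1/2$ and $\pm 1$, which is why such a sparse array needs either a directive element pattern or a second array whose zeros cancel them.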
Equation (115) shows that the whole directivity is expressed by the product of $D_s(\theta)$ and $D_a(\theta)$. $D_s(\theta)$ is the directivity of the single transducer, which was described in the preceding section. $D_a(\theta)$ represents the directivity of the array when $D_s(\theta) = 1$; that is, when the width of the constituent transducers is infinitesimally small. Equation (116) shows that $D_a(\theta)$ has its maximum at $\theta = \theta_0$, which is interpreted as the deflection angle of the radiated field. Figure 44 depicts $|D_a(\theta)|$ for $\theta_0 = 0$. The zeros of the denominator of $D_a(\theta)$, that is, $\sin\{(kd/2)(\sin\theta - \sin\theta_0)\} = 0$, determine the positions of the grating lobes of the directivity. Figure 45 illustrates $|D_w(\theta)|$ for $a = \lambda$, $d = 2\lambda$. As the deflection angle $\theta_0$ is increased, the peak of the radiation pattern of the whole array, $D_w(\theta)$, is relatively decreased by the directivity of the single transducer, $D_s(\theta)$.

C. Combination of Transmitter-Array and Receiver-Array

1. The Number of Hologram Data
The relationships between the hologram $p_h(x_h, y_h)$ and the object $p_o(x_o, y_o)$ have been derived and expressed in Eqs. (70) and (86) in Part IV.

FIG. 45. The directivity of the radiated field from an array of transducers with finite width: $|D_w(\theta)|$, $\theta_0 = 10°$, $d = 2\lambda$, $a = \lambda$.

$$p_o(x_o, y_o) = \frac{i}{\lambda}\iint_{-\infty}^{\infty} p_h(x_h, y_h)\,\exp(-ikR)\cdot R\,dx_h\,dy_h \tag{118}$$
where $R = \{z^2 + (x_h - x_o)^2 + (y_h - y_o)^2\}^{1/2}$. These equations show that the hologram at a point is a linear combination of the object points and vice versa. Therefore, the required number of hologram points is equal to the number of image points to be reconstructed. The number may be very large. If each hologram datum is obtained by one element of an array, the array might be too large to be practical. Methods called "super resolution" reconstruct a large number of object points from a small number of hologram data. However, these methods are not always successful. Figure 46 shows the structure of another method, which combines a transmitter-array and a receiver-array to collect hologram data. It is plain that the hologram data at different points can be obtained by moving the receiver. However, the same data might be obtainable by moving the transmitter while the receiver is fixed. If this scheme is realized, $N_t N_r$ hologram data can be collected by $(N_t + N_r)$ elements, namely $N_t$ transmitters and $N_r$ receivers. A piezoelectric transducer can be used both as the transmitter and as the receiver. This method has the advantage that the number of elements does not increase as rapidly as the number of data. A large number of data could therefore be collected by the combination of small arrays.
FIG. 46. Combination of a transmitter-array and a receiver-array to collect hologram data.

FIG. 47. Combination of small arrays equivalent to large array: panels (a), (b), and (c), showing the arrangement of transmitter and receiver elements.
The geometrical structure of the array should be carefully examined to realize the method. There are two reasons for this. One is to avoid collecting redundant data. The other is to keep the numerical processing from becoming especially troublesome. In other words, it is desired that the data collected by the structure of Fig. 47(a) can also be acquired by that of Fig. 47(b) or (c). The structure of Fig. 47(a) consists of one transmitter and $N_t N_r$ receivers. Those of Figs. 47(b) and (c) show the combination of an array of $N_t$ transmitters and an array of $N_r$ receivers. It is impossible to collect exactly the same data as in Fig. 47(a) by using the structures of Fig. 47(b) or (c). However, it will become clear that these data are equivalent within the limits of the Fresnel approximation. It will also be shown that the data of Fig. 47(b) or (c) are processed as easily as the data of Fig. 47(a).
2. Structure of Arrays

Let the illuminating wave be the spherical wave emanating from the transmitter at the point $(x_t, y_t, z_t)$ (refer to Fig. 46). Then, from Eq. (69) in Part IV,
$$p_o(x_o, y_o) = f(x_o, y_o)\,\frac{\exp(ikR_t)}{4\pi R_t} \tag{119}$$
where $R_t = \{z_t^2 + (x_t - x_o)^2 + (y_t - y_o)^2\}^{1/2}$, and $f(x_o, y_o)$ is the scattering coefficient of the object. The scattered wave is received at the point $(x_r, y_r, z_r)$ as the hologram, which is denoted by $p_h(x_t, y_t, x_r, y_r)$. Substitution of Eq. (119) into Eq. (117) yields
where $R_r = \{z_r^2 + (x_r - x_o)^2 + (y_r - y_o)^2\}^{1/2}$. Equation (120) expresses that the hologram data are the weighted sum of the object points. The weight, $\exp(ikR_t)\exp(ikR_r)/(R_t R_r)$, is determined symmetrically by the coordinates of both the transmitter and the receiver. For simplicity, notation related to the y-coordinate is omitted hereafter; keep in mind that calculations with respect to $x_t$ or $x_r$ involve those with respect to $y_t$ or $y_r$, respectively. Equation (120) is now modified. The Fresnel approximation is made: $R_t$ and $R_r$ in the denominator are replaced with $z_t$ and $z_r$, respectively.
and
are substituted into the argument of the exponent. Rearranging the result, the following equation is obtained,
where
and F is the operator of Fourier transform.
From Eq. (121), the spatial frequency $u$ of the Fourier transform domain corresponds to
$$u = \frac{2\pi}{\lambda}\left(\frac{x_t}{z_t} + \frac{x_r}{z_r}\right) \tag{122}$$
Now, let the intervals of the transmitters and the receivers be $d_t$ and $d_r$, respectively:
$$x_t = n_t d_t \quad (n_t = 0, 1, 2, \ldots, N_t - 1), \qquad x_r = n_r d_r \quad (n_r = 0, 1, 2, \ldots, N_r - 1) \tag{123}$$
Substituting Eq. (123) into Eq. (122), the spatial frequency obtained is represented by
$$u = \frac{2\pi}{\lambda}\left(\frac{n_t d_t}{z_t} + \frac{n_r d_r}{z_r}\right) \tag{124}$$
It is required that the hologram data avoid redundancy. That is, the points indicated by Eq. (124) should be regularly located, and no two of them should coincide. Referring to Eq. (124), if either
$$d_t = \frac{z_t}{z_r}\,N_r d_r \tag{125}$$
or
$$d_r = \frac{z_r}{z_t}\,N_t d_t \tag{126}$$
holds, these points are regularly placed. Now, if Eq. (125) holds, Eq. (124) becomes
$$u = \frac{2\pi d_r}{\lambda z_r}\,(N_r n_t + n_r) \tag{127}$$
These points are regularly located and the data collected at them avoid redundancy.

3. Reconstruction of Images

It was seen that the geometrical structure of the arrays should satisfy Eq. (125) or Eq. (126). Now the image reconstruction procedure from the data $p_h(x_t, x_r)$ collected by the arrays is discussed. From the inverse Fourier transform of Eq. (121), the image of the object is obtained as follows,
Considering that $u$ is expressed at sampling points, substitution of Eq. (122) into Eq. (128) yields
where $B$ is a constant. Let $x$ be represented by sampling points
The image and the hologram $p_h(x_t, x_r)$ are rewritten as $f(m)$ and $p_h(n)$, respectively. Substitution of Eqs. (124), (127), and (130) into Eq. (129) yields,
Equation (131) has the form of the two-dimensional discrete Fourier transform (2-D DFT) and can be calculated by executing a 2-D FFT only once (note that the transform with respect to the y-axis is omitted here), which is similar to the case of ordinary data acquisition. That is, the combination of the arrays gives rise to no extraneous calculations.
4. Point Spread Function

The structure of $N_t$ transmitters and $N_r$ receivers which satisfies Eq. (125) or Eq. (126) is equivalent to that of one transmitter and $N_t N_r$ receivers from the viewpoint of not only the image reconstruction procedure but also the resolution (point spread function). This is proved in this subsection. Consider a point object at $x_o$, represented by the delta function $\delta(x - x_o)$ of Eq. (132). The hologram is given by the substitution of Eq. (132) into Eq. (121),
Further substituting Eq. (133) into Eq. (129), the reconstruction image is obtained.
Using Eq. (124), Eq. (134) is calculated to obtain
(+I i ) ( x 2 - xi)}
f,(x) = Bexp{ - i E1 z,
n N d , ( x - x,) . nN,d,(x - x,) sin Jzt Azr sin nd,(x - x,) sin . nd,(x - x,) i.zt JZ,
sin X
(135)
Now, introducing the Fresnel approximation, x
.
X O
sin 8, = -,
sin H,, = -,
Z,
Z,
From Eq. ( 1 16), the approximated normalized field from the transmitter array D,(x) and that of the receiver array D,(x) are obtained by,
where $k = 2\pi/\lambda$ is used. Referring to Eqs. (137) and (138), $D_c(x)$, the approximated normalized field of the combined arrays, is rewritten from Eq. (135) as follows,
$$D_c(x) = D_t(x)\cdot D_r(x) \tag{139}$$
From the relationship of Eq. (125), which characterizes the arrays, the following equation holds,
Then, the numerator of Eq. (138) coincides with the denominator of Eq. (137). That is, in Eq. (135) or Eq. (139), the numerator cancels out the denominator. In other words, the grating lobes of the transmitter-array are removed by the zeros of the receiver-array. Equation (135), therefore, is represented by
$$D_c(x) = \frac{\sin\{\pi N_t N_r d_r (x - x_o)/\lambda z_r\}}{\sin\{\pi d_r (x - x_o)/\lambda z_r\}} \tag{141}$$
Equation (141) represents the reconstructed image of the point object given by Eq. (132), which is interpreted as the point spread function of the imaging system. It also expresses the directivity of the combination of arrays, which coincides with that of an array with element interval $d_r$ and $N_t N_r$ elements, from the Fresnel approximation of Eq. (116).

VI. ACTUAL DIGITAL IMAGING SYSTEM

The ultrasonic imaging system adopts piezoelectric transducers and digital techniques to attain high sensitivity and to speed up processing. The procedures which implement the system with electric circuits are discussed by Harmuth (1979). It is clear that the system inevitably becomes large and complex if one intends to obtain images in real time. Simpler systems are discussed here, which rely in part on, for example, mechanical scanning and/or a general-purpose computer.
A. Speedy Processing System with a Transducer Array

A speedy processing system can be realized by adopting a transducer array to shorten the data-acquisition time. Such systems are discussed first. Figure 48 shows the real-time imaging system proposed by Suarez et al. (1975). This system can detect an ultrasonic field as weak as $10^{-11}$ W/cm$^2$. The ultrasonic chirp signal (800 kHz bandwidth centered at 2 MHz) is collimated by a plastic lens (refer to Fig. 48(a)), and the signal is scattered by the object in the auxiliary tank. The field is collected and focused by two polystyrene lenses onto the 192-element piezoelectric linear array. Thus, one line of the image is obtained. A pair of polystyrene prisms are counter-rotated to sweep the ultrasonic field across the array. The image field is scanned electrically in the direction of the array, but mechanically in the perpendicular direction by the prisms. The resultant image consists of approximately 400 lines. The image data are stored and displayed on a television screen. One of the images is shown in Fig. 48(b). This is the transmission image of a hand, which shows the metacarpal-phalangeal and interphalangeal joints. The linear structures visualized between the bones are probably related to muscles, tendons, and neurovascular bundles. Figure 49 shows the ultrasonic imaging system using multibeam scanning proposed by Nitadori et al. (1980). This system uses ultrasonic waves with an operating frequency of about 200 kHz to view the near-shore sea bottom where a person is working. Figure 49(a) shows the underwater unit, which consists of transducer arrays. A 4 x 4 transmitter-array is seen outside and a 32 x 32 receiver-array is inside. These relatively small arrays are combined to collect a large number of hologram data. The theory has already been described in Section C of Part V and is applicable to the unit. These arrays are placed on the same plane; that is, for the structure shown in Fig. 46 in Part V,
$$z_t = z_r \tag{142}$$
From Fig. 49(a), it is clear that
$$d_t = N_r d_r \tag{143}$$
where $d_t$ and $d_r$ are the interval of the transmitters and that of the receivers, respectively, and the number of receivers is $N_r = 32$. Referring to Eqs. (142) and (143), it can be seen that this structure of arrays satisfies Eq. (125) in Part V. The hologram data collected by this system are then equivalent to those collected by the combination of one transmitter and 128 x 128 receivers, as discussed.
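The equivalence claimed for the 4-transmitter and 32-receiver arrangement can be checked along one axis with a few lines of Python. The sketch assumes that, with the transmitter interval equal to $N_r$ times the receiver interval and both arrays in the same plane, each transmitter and receiver pair $(n_t, n_r)$ samples the spatial frequency at the integer index $N_r n_t + n_r$ in units of $2\pi d_r/\lambda z_r$. It also checks numerically that the product of the two unnormalized array patterns collapses to the pattern of a single array of $N_t N_r$ elements. Function names and the test abscissa are illustrative.

```python
import math

def virtual_indices(num_tx, num_rx):
    """Sample index N_r * n_t + n_r for every transmitter/receiver pair."""
    return sorted(num_rx * n_t + n_r
                  for n_t in range(num_tx)
                  for n_r in range(num_rx))

def pattern(num_elements, x):
    """Unnormalized array pattern sin(N x) / sin(x)."""
    return math.sin(num_elements * x) / math.sin(x)

N_T, N_R = 4, 32
indices = virtual_indices(N_T, N_R)

# x stands for pi * d_r * (x - x_o) / (lambda * z_r); any value with sin(x) != 0 works.
x = 0.0123
# Transmitter pattern uses the interval d_t = N_r * d_r, hence the argument N_R * x.
product = pattern(N_T, N_R * x) * pattern(N_R, x)
combined = pattern(N_T * N_R, x)
```

The 128 indices are distinct and contiguous, so 36 elements per axis collect the same samples as 128 receivers with a single transmitter, and the numerator of the receiver pattern cancels the denominator of the transmitter pattern.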
FIG. 49. Ultrasonic imaging system using multibeam scanning (Nitadori et al., 1980). (a) Underwater unit. A 4 x 4 transducer-array is seen outside; a 32 x 32 receiver-array is inside. (b) Ultrasonic image of a bicycle taken at a range of 3.6 m with the operating frequency of 200 kHz.
FIG.50. Underwater imaging system (Shibata and Koda, 1986). (a) Coaxial circular spherical arrays (CCS array). The outermost elements with horns are the transmitters and the inner three are the receiver-array. (b) Underwater vehicle equipped with CCS array. (c) The image with operating frequency of 95 kHz.
It was reported that the image could be reconstructed in 2 seconds by the computer. Figure 49(b) shows an example of the image. This is the image of a bicycle taken at a range of 3.6 m. Considering the long wavelength of 7.5 mm, the quality of the image is very high. Figure 50 also shows an underwater imaging system, proposed by Shibata and Koda (1986). The structure of the transmitter-array and the receiver-array is indicated in Fig. 50(a). They are also set on the same plane. Eight transmitters with horns attached are placed on the outermost circle. Receivers form three inner coaxial circular arrays, each consisting of 16 receivers. They are considerably sparse arrays. However, the radii of the four circles are selected so that the grating lobes of one array are canceled by the zeros of another array. Thus, all the grating lobes at small angles from the axis are removed. The canceling of grating lobes by the combination of linear arrays was discussed in Section C of Part V; the method extends to circular arrays. The system has the distinctive feature of a small number of transducers. Because of this feature, the electric circuit could be implemented with a relatively simple set of delay lines using charge-coupled devices. Figure 50(b) shows the underwater vehicle equipped with the arrays. Figure 50(c) shows the image reconstructed by the system. The operating frequency is 95 kHz. The object is the character 'A', 50 cm long and 5 cm wide, made of aluminum.

B. Synthetic Aperture Method Using a Broad-Band Pulse Wave
In order to attain radial resolution, a broad-band pulse wave is used, and the aperture is synthesized in the real domain. An example of the technique was reported by Ishii and Sasaki (1983) and is shown in Fig. 51. The data in the example were collected by mechanical scanning of a transducer instead of using arrays. The transducer is used not only as a transmitter but also as a receiver. Thus, $R$ in Eq. (89) in Part IV, the distance from the object point to the receiver, should be replaced with $2R$, the round-trip distance. A metal block is drilled to have 5 holes as shown in Fig. 51(a). The transducer is scanned along a line on the upper surface of the block. At each spatial sampling point, the broad-band pulse wave is transmitted, and the scattered wave is received and recorded as a time series. The image reconstruction is based on Eq. (89); that is, the collected data are delayed by the corresponding round-trip time from the transducer to the object point, and summed. The resulting two-dimensional cross-sectional image is shown in Fig. 51(b).
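The delay-and-sum step described above can be sketched in Python. This is a simplified illustration, not the processing used by Ishii and Sasaki: it omits the time derivative and the distance weighting of Eq. (89), replaces recorded time series by a synthetic echo from one point scatterer, and assumes a sound speed, a Gaussian-windowed pulse, and a scan geometry chosen only for the example.

```python
import math

SOUND_SPEED = 1500.0      # m/s (assumed)
CENTER_FREQ = 1.0e6       # Hz, centre frequency of the pulse (assumed)
ENVELOPE = 0.5e-6         # s, Gaussian envelope width (assumed)

def pulse(t):
    """Broad-band pulse: Gaussian envelope times a cosine carrier."""
    return math.exp(-0.5 * (t / ENVELOPE) ** 2) * math.cos(2.0 * math.pi * CENTER_FREQ * t)

def echo(scan_x, t, scatterer):
    """Synthetic pulse-echo record at scan position scan_x (round-trip delay 2R/c)."""
    sx, sz = scatterer
    r = math.hypot(scan_x - sx, sz)
    return pulse(t - 2.0 * r / SOUND_SPEED)

def delay_and_sum(scan_xs, record, grid_xs, grid_zs):
    """For each image point, read every record at its round-trip time and add."""
    image = {}
    for gx in grid_xs:
        for gz in grid_zs:
            total = 0.0
            for x in scan_xs:
                r = math.hypot(x - gx, gz)
                total += record(x, 2.0 * r / SOUND_SPEED)
            image[(gx, gz)] = total
    return image

scatterer = (0.0, 0.030)                                  # x, z in metres
scan_xs = [(n - 16) * 1e-3 for n in range(33)]            # 33 positions, 1 mm step
grid_xs = [i * 1e-3 for i in range(-5, 6)]
grid_zs = [0.025 + i * 1e-3 for i in range(11)]
image = delay_and_sum(scan_xs, lambda x, t: echo(x, t, scatterer), grid_xs, grid_zs)
peak = max(image, key=lambda point: abs(image[point]))
```

Only at the scatterer position do all records contribute the pulse maximum at once, which is what produces both the lateral and the radial focusing.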
FIG. 51. Synthetic aperture method using pulse waves (Ishii and Sasaki, 1983). (a) Specimen. (b) Image.
C. Ultrasonic Computerized Tomography Using the Time-of-Flight Profiles

Computerized tomography (CT), which obtains clear x-ray images of the cross section of the human body, has achieved great success. However, this technique may not be directly applicable to ultrasonic imaging: diffraction effects cannot be neglected for ultrasonic waves, whereas they cause no problem for x-rays. The image of the ultrasonic CT can be reconstructed from the time-of-flight profiles, where the time of flight means the propagation time from the transmitter to the receiver through the specimen. The profiles are measured, and then the two-dimensional distribution of the ultrasonic velocity is mapped. In this method, diffraction effects are minimized, since the earliest arrival time is most probably that of the straightest ray path. The method is not a synthetic aperture method, but it is a promising new method. Figure 52(a) shows the geometry of ultrasonic transmission for algebraic reconstruction. The time of flight $\tau_j$ is measured by two opposing transducers on either side of the specimen in the fluid.
FIG. 52. Ultrasonic computerized tomography using the time-of-flight (Greenleaf et al., 1975). (a) Geometry of ultrasound transmission for algebraic reconstruction (scan locus; water-filled tank, 22°C). (b) Reconstruction of relative propagation delay within canine heart (left) compared to photographs of sections through corresponding levels (right).

where $L_{ij}$ is the length of ray path $j$ in region $i$, $v_{ti}$ is the velocity in the specimen in region $i$, $v_w$ is the velocity in the fluid, and $D$ is the distance from the transmitter to the receiver. Usually the difference $\Delta\tau_j$ between the arrival time $\tau_w$ through the fluid and the arrival time $\tau_j$ through the specimen is measured,
The measured data at sampling points are collected by scanning the transmitter and the receiver rectilinearly. Then the object is rotated by an angle of $\Delta\theta$, and the measurements are repeated. A large number of $\Delta\tau_j$ are obtained by the rotation and the rectilinear scanning. Using the $\Delta\tau_j$, the unknowns $(1/v_w - 1/v_{ti})$ in Eq. (145) are found for each value of $i$ by solving the simultaneous linear equations. As the number of measurements increases, the quality of the images improves, since noise is averaged out over the measurements. Figure 52(b) shows the image reconstructed by the method. This is the profile of relative propagation delays within a canine heart for two transverse levels separated by 2 cm (left), compared to photographs of sections through the corresponding levels (right). Pulses were propagated through the tissue at each of 256 equally spaced points along the 12 cm scan and were digitized with a temporal resolution of ±10 nanoseconds. This was repeated at each of 36 angles of view separated by 5°.
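Solving the simultaneous linear equations for the slowness differences can be illustrated with a small algebraic reconstruction (Kaczmarz) iteration in Python. The sketch assumes the delay model $\Delta\tau_j = \sum_i L_{ij}(1/v_w - 1/v_{ti})$ with straight rays; the 2 x 2 grid of 1 cm cells, the six rays, and the velocity values are invented for the example and are not taken from the cited experiment.

```python
import math

def kaczmarz(path_lengths, delays, sweeps=500):
    """Row-action solution of sum_i L_ij * s_i = delay_j for the unknowns s_i."""
    unknowns = [0.0] * len(path_lengths[0])
    for _ in range(sweeps):
        for row, measured in zip(path_lengths, delays):
            norm_sq = sum(length * length for length in row)
            if norm_sq == 0.0:
                continue                      # ray misses the specimen
            predicted = sum(length * s for length, s in zip(row, unknowns))
            step = (measured - predicted) / norm_sq
            unknowns = [s + step * length for s, length in zip(unknowns, row)]
    return unknowns

V_WATER = 1500.0                                   # m/s (assumed)
TRUE_V = [1540.0, 1560.0, 1520.0, 1580.0]          # m/s, cells c00, c01, c10, c11
A = 0.01                                           # m, cell side
DIAG = math.sqrt(2.0) * A
PATHS = [                                          # path length of each ray in each cell
    [A, A, 0.0, 0.0], [0.0, 0.0, A, A],            # two horizontal rays
    [A, 0.0, A, 0.0], [0.0, A, 0.0, A],            # two vertical rays
    [DIAG, 0.0, 0.0, DIAG], [0.0, DIAG, DIAG, 0.0] # two diagonal rays
]

true_unknowns = [1.0 / V_WATER - 1.0 / v for v in TRUE_V]
delays = [sum(l * s for l, s in zip(row, true_unknowns)) for row in PATHS]
estimate = kaczmarz(PATHS, delays)
recovered_v = [1.0 / (1.0 / V_WATER - s) for s in estimate]
```

Each pass corrects the current estimate so that one measured delay is matched exactly; with noisy data, the redundancy provided by many rays and view angles is what suppresses the noise.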
VII. DIFFRACTION TOMOGRAPHY AS THE INVERSE PROBLEM

The term "ultrasonic imaging" has been used without clear definition in this article so far. Its meaning is as follows: an object is first illuminated by an ultrasonic wave. The scattered and diffracted wave is then measured. The form of the object is evaluated from the measurements and is finally displayed visually. The evaluation of the form of the object from the scattered wave has long been studied as the inverse scattering problem, and considerably exact solutions of some problems have been obtained. An image of high quality might be obtained if such inverse solutions are used, because the scattering and diffraction effects are removed from the image. The evaluation requires measurements of the wave over a certain region, which corresponds to the synthesized aperture. Diffraction effects degrade the image reconstructed by ultrasonic tomography. Ultrasonic tomography which takes diffraction into account and compensates for its effects is called diffraction tomography. The diffraction tomography which uses the inverse scattering solution to reconstruct a clear image is described here. Two-dimensional problems are discussed: the configurations are assumed to be constant with respect to the y-axis ($\partial/\partial y = 0$), so that a two-dimensional image similar to the B-scan is obtained. The discussion could be extended straightforwardly to three dimensions, and a three-dimensional image could be reconstructed. The two-dimensional assumption is made here to keep the discussion simple and easy to develop.

A. Wave in an Inhomogeneous Medium

1. Wave Equation

The object in which we are interested is something like a human body. It exists in a homogeneous medium. Its elastic properties, the density $\rho$ and the compressibility $\kappa$, are slightly different from those of the surrounding medium. The wave equation in an inhomogeneous medium is derived first. The coordinate system is shown in Fig. 53. The object, the inhomogeneous medium, is assumed to exist only inside the region $S$. That is, inside $S$, $\rho$ and $\kappa$ are functions of position, but outside $S$ they are constants, $\rho_0$ and $\kappa_0$.

FIG. 53. Coordinate system.
Where there is no wave source of the body force, substituting $F = 0$ in Eq. (53) of Part IV, the following equation holds,
$$\rho\,\frac{\partial \mathbf{u}}{\partial t} = -\nabla p \tag{146}$$
Equation (54) is cited here,
Equation (147) is differentiated with respect to $t$. Substituting Eq. (146) into the result, the following equation is obtained,
$$\kappa\,\frac{\partial^2 p}{\partial t^2} = \nabla\cdot\left(\frac{1}{\rho}\nabla p\right) \tag{148}$$
Now, the following quantities are defined using $\rho$ and $\kappa$,
It is clear that $\beta_c$ and $\beta_d$ are zero outside $S$. From Eqs. (149) and (150),
$$\kappa = \kappa_0(\beta_c + 1) \tag{151}$$
Substituting Eqs. (151) and (152) into Eq. (148) and rearranging the result,
where $c_0 = (\rho_0\kappa_0)^{-1/2}$ is the acoustic velocity in the homogeneous medium. Using the definitions of the Fourier transform pair, Eqs. (56) and (57) in Section IV, Eq. (153) becomes
where $k = \omega/c_0$ is the wave number and
The solution of the wave equation in two dimensions is obtained from Eq. (63) in Section III as follows,
where the two-dimensional Green's function is
$$R = |\mathbf{r} - \mathbf{r}_0| \tag{158}$$
and $H_0^{(1)}$ is the zero-order Hankel function of the first kind. $f_\omega$ in Eq. (58) of Section IV expresses the wave source of the body force, but in Eq. (155) of this section it represents the source of the secondary wave caused by the inhomogeneity of the medium. In other words, the incident wave is scattered by the inhomogeneity and $f_\omega$ is the source of the scattered wave. When the boundary is taken at infinity, the line integral in Eq. (156) expresses the incident wave $p_i$. Substituting Eq. (155) into Eq. (156), a term becomes
Since $\gamma_d$ vanishes on L, Eq. (156) finally becomes
2. The Born Approximation and the Rytov Approximation

The total field $p_\omega$ is the sum of the incident wave $p_i$ and the scattered wave $p_s$ due to the inhomogeneity of the medium,
$$p_\omega = p_i + p_s \tag{161}$$
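Compared with Eq. (161), Eq. (160) becomes
$$p_s(\mathbf{r}) = \iint_S \left( \gamma_c k^2 p_\omega g_\omega + \gamma_d \nabla g_\omega \cdot \nabla p_\omega \right) ds \tag{162}$$
The relationship between $p_s$ and $\gamma_c$ or $\gamma_d$ is not clear, because Eq. (162) contains $p_\omega$ in the integral. However, assuming
$$|p_i| \gg |p_s| \tag{163}$$
$p_\omega$ in Eq. (162) could be replaced with $p_i$ to obtain
$$p_s(\mathbf{r}) = \iint_S \left( \gamma_c k^2 p_i g_\omega + \gamma_d \nabla g_\omega \cdot \nabla p_i \right) ds \tag{164}$$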
This approximation is called the Born approximation and is applicable to the case where scattering is weak (refer to Eq. (163)) and multiple scattering can be neglected. The other approximation, called the Rytov approximation, is as follows. The wave field is represented in the exponential form
$$p_\omega = e^{q} \tag{165}$$
and $q$ is represented by the sum of the component due to the incident wave and the component due to the scattered wave,
$$q = q_i + q_s \tag{166}$$
where $q_i$ is expressed by the incident wave of Eq. (161),
$$p_i = \exp(q_i) \tag{167}$$
that is,
$$p_\omega = p_i \exp(q_s) \tag{168}$$
Assuming $|q_s| \ll 1$, $\exp(q_s) \approx 1 + q_s$ could be substituted into Eq. (168) to obtain
$$p_\omega \approx p_i(1 + q_s) \tag{169}$$
Comparing Eq. (169) with Eq. (161),
$$p_i q_s = p_s \tag{170}$$
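The linearization step between Eqs. (168) and (169) can be checked numerically. The sketch below is only an illustration: the function name and the sample values of $q_s$ are assumptions made here, not quantities from the text. It reports how far $1 + q_s$ is from $\exp(q_s)$ as $|q_s|$ shrinks.

```python
import cmath

def linearization_error(q_s):
    """Relative error of replacing exp(q_s) by 1 + q_s, the step taken in Eq. (169)."""
    exact = cmath.exp(q_s)
    approx = 1 + q_s
    return abs(exact - approx) / abs(exact)

# Illustrative (assumed) complex perturbations of decreasing magnitude.
for magnitude in (0.5, 0.1, 0.01):
    q_s = magnitude * cmath.exp(0.7j)
    print(f"|q_s| = {magnitude:5.2f}   relative error = {linearization_error(q_s):.2e}")
```

The error falls roughly as $|q_s|^2/2$, which is why the replacement is only acceptable when the scattered component of the complex phase is small.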
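Finally, substituting Eq. (164) into Eq. (170), the scattered field is obtained by
$$q_s(\mathbf{r}) = \frac{1}{p_i(\mathbf{r})} \iint_S \left( \gamma_c k^2 p_i g_\omega + \gamma_d \nabla g_\omega \cdot \nabla p_i \right) ds \tag{171}$$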
Theoretically, the Rytov approximation gives a better solution for forward scattering than the Born approximation. However, the solution represented by the Rytov approximation is very sensitive to noise, which limits its use. The Born approximation is used hereafter in this article.

B. Diffraction Tomography with Plane Wave Illumination
It is assumed in this part and the next that the density $\rho$ is independent of position. That is, $\rho$ is the constant $\rho_0$ and only the compressibility $\kappa$ is inhomogeneous. This assumption is often made in medical diagnosis to model the human body. If the assumption is valid, from Eq. (150),
$$\gamma_d = 0 \tag{172}$$
and Eq. (164) reduces to
$$p_s(\mathbf{r}) = k^2 \iint_S \gamma_c\, p_i\, g_\omega\, ds \tag{173}$$
Figure 54 shows the configuration of diffraction tomography with plane wave illumination, which was presented by Wolf (1969). This is the first scheme of diffraction tomography as an application of inverse scattering solutions. Other configurations are developed from it. The figure displays a single transducer which is scanned. This could, of course, be replaced with a transducer array. Similar replacements are also valid in the configurations of Figs. 58 and 63, which appear later.

FIG. 54. The configuration of the diffraction tomography with plane wave illumination.

An object is illuminated by a plane wave. The scattered wave from the object is then detected at a point on a line which is parallel to the x-axis (refer to Fig. 54). It will be shown that the transducer must be scanned on two lines, $z = z_r$ and $z = -z_r$, to detect the reflected wave and the transmitted wave, respectively. For now, the receiver is scanned on the line $z = z_r$ and is placed at $x = x_r$. A point of the object is represented by $\mathbf{r} = (x, z)$. The illuminating plane wave is expressed by
$$p_i(\mathbf{r}) = \exp(i\mathbf{k} \cdot \mathbf{r}) \tag{174}$$
where the amplitude is normalized and the wave vector $\mathbf{k} = (k_x, k_z)$ expresses the direction of the plane wave propagation. Substituting Eqs. (157) and (174) into Eq. (173), the scattered wave which is received is represented by
$$p_s(x_r, z_r) = \frac{ik^2}{4} \iint_{-\infty}^{\infty} \gamma_c(\mathbf{r}) \exp(i\mathbf{k} \cdot \mathbf{r})\, H_0^{(1)}(k|\mathbf{r}_r - \mathbf{r}|)\, dx\, dz \tag{175}$$
In the derivation the domain of integration can be extended to infinity because $\gamma_c = 0$ outside S. The scattered wave $p_s(x_r, z_r)$ is measured at many points on $z = z_r$. The spatial Fourier transform $P_s(u, z_r)$ with respect to the x-axis is numerically calculated from the measurements,
$$P_s(u, z_r) = \int_{-\infty}^{\infty} p_s(x_r, z_r) \exp(-iux_r)\, dx_r \tag{176}$$
Substitution of Eq. (175) into Eq. (176) yields
$$P_s(u, z_r) = \frac{ik^2}{4} \iiint_{-\infty}^{\infty} \gamma_c(\mathbf{r}) \exp(i\mathbf{k} \cdot \mathbf{r})\, H_0^{(1)}(k|\mathbf{r}_r - \mathbf{r}|) \exp(-iux_r)\, dx\, dz\, dx_r \tag{177}$$
Substituting the integral representation of the Hankel function, Eq. (178), into Eq. (177), interchanging the order of the integrals and using Dirac's delta function,
$$\frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\{i(K_x - u)x_r\}\, dx_r = \delta(K_x - u) \tag{179}$$
the spatial Fourier transform is obtained as follows,
:'
Ps(u,z,) = -
s:
t c ( r )exp(ik
- r)
Actually, the receiver must be scanned along $z = -z_r$ as well as $z = z_r$, and the wave is measured on both lines. The positions of the lines are selected a priori so as not to enter the region occupied by the object ($z_L < z < z_U$). Namely, the condition
$$-z_r < z_L, \qquad z_U < z_r \tag{181}$$
is imposed and is always satisfied by the configurations in the later sections as well. Then, Eq. (180) becomes
$$P_s(u, \pm z_r) = \frac{ik^2}{2} \exp\left[-i\{\pm (k^2 - u^2)^{1/2} z_r\}\right] \iint_{-\infty}^{\infty} \gamma_c(\mathbf{r}) \exp(i\mathbf{k} \cdot \mathbf{r}) \frac{\exp\left[i\{-ux \pm z(k^2 - u^2)^{1/2}\}\right]}{(k^2 - u^2)^{1/2}}\, dx\, dz \tag{182}$$
where the double signs correspond in order. The data of the reflection mode and those of the transmission mode are acquired along the scanning lines $z = z_r$ and $z = -z_r$, respectively. Let the spatial Fourier transform of $\gamma_c(x, z)$ be $\Gamma(u, w)$,
$$\Gamma(u, w) = \iint_{-\infty}^{\infty} \gamma_c(x, z) \exp\{-i(ux + wz)\}\, dx\, dz \tag{183}$$
Comparing Eq. (182) with Eq. (183), it is found that the following simple relationship holds between the Fourier transforms of $p_s(x_r, \pm z_r)$ and $\gamma_c(x, z)$,
$$P_s(u, \pm z_r) = \frac{ik^2}{2} \frac{\exp\left[-i\{\pm z_r (k^2 - u^2)^{1/2}\}\right]}{(k^2 - u^2)^{1/2}}\, \Gamma\left[u - k_x,\; -\{\pm (k^2 - u^2)^{1/2} + k_z\}\right] \tag{184}$$
FIG. 55. Locus of points at which $\Gamma(u, w)$ is obtained.

If the arguments of $\Gamma$ in Eq. (184) are represented by U and W,
$$U = u - k_x, \qquad W = -\{\pm (k^2 - u^2)^{1/2}\} - k_z \tag{185}$$
the following equation is derived,
$$(U + k_x)^2 + (W + k_z)^2 = k^2 \tag{186}$$
Equation (184) expresses that $\Gamma(u, w)$ could be numerically calculated from the Fourier transform of the measurements, $p_s(x_r, \pm z_r)$. The locus of the points at which $\Gamma(u, w)$ is obtained is the circle represented by Eq. (186). The circle is shown in Fig. 55. The center of the circle, $(-k_x, -k_z)$ on the (u, w) plane, is determined by the direction of propagation of the illuminating wave, that is, the wave vector $\mathbf{k} = (k_x, k_z)$. The radius is $k = |\mathbf{k}|$. Therefore, the circle passes through the origin. The upper semicircle (a broken line) is formed from the measurements collected along $z = -z_r$, which are the data of the transmitted wave. The lower semicircle (a solid line) is drawn from the measurements acquired along $z = z_r$, which are the data of the reflected wave. As the incident angle is increased step by step from 0 to $2\pi$, the region $R$ within which $\Gamma(u, w)$ is known extends and becomes the interior of the circle with its center at the origin and a radius of $2k$, which is shown in Fig. 56. If the spatial frequency components of $\gamma_c(x, z)$ are limited within $R$, $\gamma_c(x, z)$ is perfectly obtained from the inverse Fourier transform of $\Gamma(u, w)$,
$$\gamma_c(x, z) = \frac{1}{4\pi^2} \iint_R \Gamma(u, w) \exp\{i(ux + wz)\}\, du\, dw \tag{187}$$
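The geometry described by Eqs. (185) and (186) can be sketched numerically. In the following illustration the parametrization of the wave vector by an angle `theta`, the sampling densities, and the function names are assumptions made for the example. It evaluates (U, W) for many incident directions and confirms that every sample lies on a circle of radius k centred at $(-k_x, -k_z)$ and that the samples fill out a disk of radius 2k.

```python
import math

def spectrum_sample(u, k, theta, sign):
    """(U, W) of Eq. (185) for an assumed wave vector k*(cos(theta), sin(theta)).

    sign = +1 or -1 selects the upper or lower choice of the double sign.
    """
    k_x, k_z = k * math.cos(theta), k * math.sin(theta)
    root = math.sqrt(max(k * k - u * u, 0.0))
    return u - k_x, -sign * root - k_z

def max_radius(k, n_angles=72, n_u=201):
    """Largest distance from the origin reached by the sampled (U, W) points."""
    r_max = 0.0
    for a in range(n_angles):
        theta = 2.0 * math.pi * a / n_angles
        for j in range(n_u):
            u = -k + 2.0 * k * j / (n_u - 1)
            for sign in (+1, -1):
                U, W = spectrum_sample(u, k, theta, sign)
                r_max = max(r_max, math.hypot(U, W))
    return r_max

k = 1.0  # wave number in arbitrary units (assumed)
print("largest sampled spatial frequency / k:", max_radius(k) / k)
```

Each incident direction contributes one circle through the origin, so sweeping the direction over a full turn is what enlarges the known region to radius 2k.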
FIG. 56. Region $R$ within which $\Gamma(u, w)$ is obtainable.
The inverse scattering problem has thus been solved, and the image of the object $\gamma_c(x, z)$ is obtained. $R$ is represented by
$$u^2 + w^2 \le (2k)^2 \tag{188}$$
which is interpreted as follows. Let the minimum units in the x-direction and the z-direction be $\Delta x$ and $\Delta z$, respectively. Using $k = 2\pi/\lambda$, where $\lambda$ is the wavelength, Eq. (188) is written as
Referring to Fig. 57, let $\Delta L$ be
$$\frac{1}{(\Delta L)^2} = \frac{1}{(\Delta x)^2} + \frac{1}{(\Delta z)^2} \tag{190}$$
and from Eq. (189), the following equation is obtained,
$$\Delta L \ge \frac{\lambda}{2} \tag{191}$$

FIG. 57. The relationship between $\Delta L$ and $\Delta x$, $\Delta z$.

$\Delta L$ is regarded as a rough measure of the resolution of the imaging system and is given by Eq. (191). This is the meaning of Eq. (188).

C. Diffraction Tomography with Fan-Beam Illumination
It seems difficult to realize the diffraction tomography with plane wave illumination described in the preceding section, though the technique is useful for understanding the theory of diffraction tomography. In fact, the technique has to illuminate an object with plane waves from various directions, and it is troublesome to generate such plane waves. The diffraction tomography with fan-beam illumination is more realizable (Nahamoo et al., 1984). Figure 58 shows the configuration of the diffraction tomography with fan-beam illumination. The coordinate system is given as in this figure. A receiver and a transmitter are set on either side of an object. They are scanned in the x-direction. The scanning lines of the transmitter and the receiver are $z = z_t$ and $z = z_r$, respectively. The transmitter is placed at a point on the line $z = z_t$ and transmits ultrasonic waves. The scattered waves from the object are measured by the receiver at all sampling points $x = x_r$. Then the transmitter is moved to the next point and similar data acquisition is made. The procedure is repeated for all transmitting points. Assuming that the real apertures of the transmitter and the receiver are sufficiently small, the illuminating wave $p_i(\mathbf{r})$ from the transmitter is a cylindrical wave, because the problem is again two-dimensional. Then,
where $\mathbf{r}_t = (x_t, z_t)$ is the position of the transmitter. The scattered wave received at $\mathbf{r}_r = (x_r, z_r)$ is denoted by $p(x_t; x_r)$. Substitution of Eqs. (157) and (192) into Eq. (173) yields
$$p(x_t; x_r) = -\frac{k^2}{16} \iint_{-\infty}^{\infty} \gamma_c(x, z)\, H_0^{(1)}(k|\mathbf{r} - \mathbf{r}_t|)\, H_0^{(1)}(k|\mathbf{r} - \mathbf{r}_r|)\, dx\, dz \tag{193}$$

FIG. 58. The configuration of the diffraction tomography with fan-beam illumination.
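If the spatial Fourier transform of $p(x_t; x_r)$ with respect to $(x_t, x_r)$ is $P(\alpha, \beta)$,
$$P(\alpha, \beta) = \iint_{-\infty}^{\infty} p(x_t; x_r) \exp\{-i(\alpha x_t + \beta x_r)\}\, dx_t\, dx_r \tag{194}$$
Equation (193) is substituted into Eq. (194). Then, rewriting the Hankel function with the integral representation as in Eq. (178), the following equation is obtained,
$$P(\alpha, \beta) = -\frac{k^2}{16\pi^2} \iiiint_{-\infty}^{\infty} dx\, dz\, dx_t\, dx_r\, \gamma_c(x, z) \exp\{-i(\alpha x_t + \beta x_r)\} \int_{-\infty}^{\infty} \frac{\exp\left[i\{K_x(x - x_t) + (k^2 - K_x^2)^{1/2}(z_t - z)\}\right]}{(k^2 - K_x^2)^{1/2}}\, dK_x \int_{-\infty}^{\infty} \frac{\exp\left[i\{K_x'(x - x_r) + (k^2 - K_x'^2)^{1/2}(z - z_r)\}\right]}{(k^2 - K_x'^2)^{1/2}}\, dK_x' \tag{195}$$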
KEINOSUKE NAGAI
Interchanging the order of the integral, and further substituting
i 271 jm exp{ - i ( K x + a).,}
dx, = 6 ( K , + a)
(196)
+ /?)x,} dx, = 6(K: + /?)
(197)
-m
and 271
jrn
exp{ -i(K;
-m
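Equation (195) becomes
$$P(\alpha, \beta) = -\frac{k^2 \exp\left[i\{z_t(k^2 - \alpha^2)^{1/2} - z_r(k^2 - \beta^2)^{1/2}\}\right]}{4(k^2 - \alpha^2)^{1/2}(k^2 - \beta^2)^{1/2}} \iint_{-\infty}^{\infty} \gamma_c(x, z) \exp\left[-i\{(\alpha + \beta)x + ((k^2 - \alpha^2)^{1/2} - (k^2 - \beta^2)^{1/2})z\}\right] dx\, dz \tag{198}$$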
which expresses the spatial Fourier transform of the object $\gamma_c(x, z)$. Namely, let
$$\Gamma(u, w) = \iint_{-\infty}^{\infty} \gamma_c(x, z) \exp\{-i(ux + wz)\}\, dx\, dz \tag{199}$$
Comparing Eqs. (198) and (199), the following relationship is finally obtained,
$$\Gamma\{\alpha + \beta,\; (k^2 - \alpha^2)^{1/2} - (k^2 - \beta^2)^{1/2}\} = -P(\alpha, \beta)\, \frac{4}{k^2}\, (k^2 - \alpha^2)^{1/2} (k^2 - \beta^2)^{1/2} \exp\left[i\{z_r(k^2 - \beta^2)^{1/2} - z_t(k^2 - \alpha^2)^{1/2}\}\right] \tag{200}$$
The spatial Fourier transform $\Gamma(u, w)$ of the object can be obtained from the spatial Fourier transform of the received data $p(x_t; x_r)$. Denoting two wave vectors as
$$\mathbf{k}_1 = \{\alpha,\; -(k^2 - \alpha^2)^{1/2}\}, \qquad \mathbf{k}_2 = \{\beta,\; +(k^2 - \beta^2)^{1/2}\} \tag{201}$$
Equation (200) gives $\Gamma$ at $\mathbf{k}_1 + \mathbf{k}_2$ on the (u, w) plane, that is, $\Gamma(\mathbf{k}_1 + \mathbf{k}_2)$. The loci of $\mathbf{k}_1 + \mathbf{k}_2$ are shown in Fig. 59. $\Gamma$ is obtained within the region indicated by Fig. 60 by using Eq. (200).

FIG. 59. Loci of $\mathbf{k}_1 + \mathbf{k}_2$ on the (u, w) plane.

FIG. 60. The region within which $\Gamma$ is obtainable when the transmitter and the receiver are scanned parallel to the x-axis.

FIG. 61. The region within which $\Gamma$ is obtainable when the transmitter and the receiver are scanned parallel to the z-axis.

FIG. 62. The total region. The sum of the regions indicated by Fig. 60 and Fig. 61.
So far, the transmitter and the receiver have been scanned parallel to the x-axis. When they are scanned along the z-axis, or when they are still scanned along the x-axis but the object is rotated by 90°, the region where $\Gamma$ is obtained is derived by repeating the above discussion. The region is shown in Fig. 61. The total region where $\Gamma$ is obtained is the sum of the region of Fig. 60 and that of Fig. 61, which is shown in Fig. 62. From this figure the region of the spatial frequency domain (u, w) is expressed by
$$(u^2 + w^2)^{1/2} < \sqrt{2}\, k \tag{202}$$
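The bound in Eq. (202) can be checked with a short numerical sketch. It rests on a working assumption that is not stated in the text: substituting $\alpha = k\sin a$ and $\beta = k\sin b$ into the argument of $\Gamma$ in Eq. (200) suggests that the x-scan samples fill two disks of radius k centred at $(\pm k, 0)$, and that the z-scan adds the same pair rotated by 90°. The function names below are hypothetical.

```python
import math

def fan_beam_sample(alpha, beta, k):
    """Argument (u, w) of Gamma in Eq. (200) for transmitter/receiver frequencies alpha, beta."""
    return alpha + beta, math.sqrt(k * k - alpha * alpha) - math.sqrt(k * k - beta * beta)

def covered(u, w, k, tol=1e-12):
    """True if (u, w) lies in the assumed union of four disks of radius k."""
    centres = ((k, 0.0), (-k, 0.0), (0.0, k), (0.0, -k))
    return any(math.hypot(u - cu, w - cw) <= k + tol for cu, cw in centres)

def inscribed_radius(k, n=3600):
    """Radius of the largest origin-centred disk inside the assumed union.

    Along direction phi the x-scan disks reach 2k|cos(phi)| and the z-scan disks 2k|sin(phi)|.
    """
    reach = []
    for i in range(n):
        phi = 2.0 * math.pi * i / n
        reach.append(max(2.0 * k * abs(math.cos(phi)), 2.0 * k * abs(math.sin(phi))))
    return min(reach)

k = 1.0  # wave number in arbitrary units (assumed)
print("inscribed radius / k:", inscribed_radius(k) / k)
```

Under this assumption the largest disk centred on the origin that fits inside the total region has radius $\sqrt{2}\,k$, in agreement with Eq. (202).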
The image of the object is reconstructed by the inverse Fourier transform of $\Gamma$, similarly to the preceding section.

D. Diffraction Tomography with Broad-Band Pulse Wave
It was shown that the inverse scattering problem can be solved and that a fairly exact solution is obtained. The solutions are applicable to imaging methods that reconstruct good images. There are, however, two weak points in these methods.
(1) The collection of the data is problematic. These methods require a vast amount of data corresponding to the geometrical combinations of the transmitter and the receiver.
(2) These methods are based on the assumption that the density of the object is homogeneous. It is improbable that the assumption always holds. The assumption at least limits the objects of interest.
Sinusoidal waves with a single frequency, which have been used in the previous two sections, are replaced with broad-band pulses to overcome the first weak point. The information of the scattered wave with respect to one dimension, say the z-axis, is exchangeable for that in the form of time or, equivalently, in the form of frequency. The efficiency of the data collection is therefore increased if pulse waves are used and the positions of the transmitter and the receiver are kept fixed. Some discussion is needed in order to find a procedure to circumvent the second weak point. In Eq. (164), which presents the received wave,
$$p_s(\mathbf{r}) = \iint_S \left( \gamma_c k^2 p_i g_\omega + \gamma_d \nabla g_\omega \cdot \nabla p_i \right) ds \tag{164}$$
the second term of the integrand, $\gamma_d \nabla g_\omega \cdot \nabla p_i$, which is an inner product of vectors, varies largely according to the angle made by the lines from the transmitter to the object point and from the object point to the receiver. It is troublesome to deal with this term. In the previous sections, the assumption that the density of the object is homogeneous, that is, $\gamma_d = 0$, is required to remove the term. The term can be modified and dealt with more easily by selecting the geometry of the transmitter and the receiver. In the procedure described here, a single transducer is used as both the transmitter and the receiver. The transducer transmits an ultrasonic pulse wave. The wave is scattered by an object and is then received by the same transducer. Denoting the position of the transducer as $\mathbf{r}_t$, Eqs. (157) and (192) become,
From the integral representation of the Hankel function,
the following equation is obtained
$$\nabla g_\omega = \nabla p_i = -\frac{1}{4} \mathbf{k} H_0^{(1)}(kR) \tag{204}$$
where $\mathbf{k} = \{K_x, (k^2 - K_x^2)^{1/2}\}$ is a wave number vector. When the transmitter is used as its own receiver, the angle made by the lines from the transmitter to the object point and from the object point to the receiver is always 0°. Then, because
$$\nabla g_\omega \cdot \nabla p_i = \frac{1}{16} k^2 \{H_0^{(1)}(k|\mathbf{r} - \mathbf{r}_t|)\}^2 \tag{205}$$
the second term in the integral of Eq. (164) takes the same form as the first term. Substitution of Eqs. (203), (205) and $\mathbf{r}_r = \mathbf{r}_t$ into Eq. (164) yields,
$$p_s(\mathbf{r}_t) = \frac{k^2}{16} \iint_{-\infty}^{\infty} (\gamma_d - \gamma_c) \{H_0^{(1)}(k|\mathbf{r} - \mathbf{r}_t|)\}^2\, dx\, dz \tag{206}$$
$(\gamma_d - \gamma_c)$ in the equation is considered as the object $o(x, z)$, that is,
$$o(x, z) = \gamma_d(x, z) - \gamma_c(x, z) \tag{207}$$
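If $|kR| \gg 1$, using the following approximation
$$H_0^{(1)}(kR) \approx \left\{ \frac{2}{\pi i k R} \right\}^{1/2} \exp(ikR) \tag{208}$$
the square of the Hankel function becomes,
$$\{H_0^{(1)}(kR)\}^2 = \frac{2}{\pi i k R} \exp(i2kR) = \frac{2}{(\pi i k R)^{1/2}} H_0^{(1)}(2kR) \tag{209}$$
Substituting Eqs. (207) and (209) into Eq. (206) and rewriting $p_s(\mathbf{r}_t)$ as $p_s(\omega, \mathbf{r}_t)$ to emphasize the frequency $\omega$, the following representation is obtained.
where $R = |\mathbf{r}_t - \mathbf{r}|$, $\mathbf{r} = (x, z)$, $\mathbf{r}_t = (x_t, z_t)$.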
a. Data Acquisition. Figure 63 shows the configuration for collecting data with broad-band pulse illumination. The transducer at $x = x_t$ transmits a broad-band pulse wave. The scattered wave from an object is received by the same transducer. The received signal over the appropriate period is Fourier-transformed with respect to time. The frequency component at $\omega$ is considered to be represented by Eq. (210). Thus, $p_s(\omega, x_t)$ is obtained within the frequency bandwidth of the pulse wave.

FIG. 63. The configuration of diffraction tomography with broad-band pulse illumination, using a transmitter as its receiver.

b. Image Reconstruction. $p_s(\omega, x_t)$ is spatially Fourier-transformed with respect to $x_t$ to obtain the image,
$$P_s(\omega, u) = \int_{-\infty}^{\infty} p_s(\omega, x_t) \exp\{-iux_t\}\, dx_t \tag{211}$$
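Equation (210) is substituted into Eq. (211), and the Hankel function is rewritten with the integral representation of Eq. (178). The following equation is obtained.
$$P_s(\omega, u) = \frac{k^2}{8(\pi i k)^{1/2}} \iint_{-\infty}^{\infty} \frac{o(x, z)}{\sqrt{R}}\, dx\, dz \times \frac{1}{\pi} \iint_{-\infty}^{\infty} dK_x\, dx_t\, \frac{\exp\left[i\{K_x(x_t - x) + (k^2 - K_x^2)^{1/2}|z - z_t| - ux_t\}\right]}{(k^2 - K_x^2)^{1/2}} \tag{212}$$
And further substituting
$$\frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\{-i(u - K_x)x_t\}\, dx_t = \delta(u - K_x)$$
$$P_s(\omega, u) = \frac{k^2 \exp\{\mp i (k^2 - u^2)^{1/2} z_t\}}{4(\pi i k)^{1/2}(k^2 - u^2)^{1/2}} \iint_{-\infty}^{\infty} \frac{o(x, z)}{\sqrt{R}} \exp\left[-i\{ux \mp (k^2 - u^2)^{1/2} z\}\right] dx\, dz \tag{213}$$
is obtained, where the double signs depend on the scanning line. The upper sign corresponds to $z = -z_t$ and the lower corresponds to $z = z_t$. Let the spatial Fourier transform of $o(x, z)/\sqrt{R}$ be $O(u, w)$, and it is represented by
$$O\{u, \mp (k^2 - u^2)^{1/2}\} = \frac{4(\pi i k)^{1/2}(k^2 - u^2)^{1/2}}{k^2} \exp\left[-\{\mp i z_t (k^2 - u^2)^{1/2}\}\right] P_s(\omega, u) \tag{214}$$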
FIG. 64. The region within which $O(u, w)$ is obtainable. The upper semicircle and the lower semicircle are depicted corresponding to the scanning lines $z = z_t$ and $z = -z_t$, respectively.

The spatial Fourier transform of the object can be obtained on the circle with its center at the origin of the (u, w) plane and with a radius of $k = \omega / c_0$, where $c_0$ is the acoustic velocity in the surrounding homogeneous medium. $O(u, w)$ within a certain region of the (u, w) plane is obtained corresponding to the bandwidth of $\omega$. The region is shown in Fig. 64. The image $o(x, z)$ is reconstructed by the inverse spatial Fourier transform of $O(u, w)$ thus obtained:
$$o(x, z) = \gamma_d(x, z) - \gamma_c(x, z)$$
c. Experiment. When the broad-band spatial frequency components of the object are required, the transducer has to be scanned on both sides of the object. However, when the object is point-like or when only an outline of the object is needed, half the data of the frequency domain, obtained by scanning the transducer on only one side, often works well. As it is troublesome to scan the transducer on both sides of the object or to rotate the object by 180°, it is desirable to reconstruct the image from one half of the data. Images reconstructed from such data by the procedure just mentioned are presented here (Nagai, 1985). Mechanical scanning of a single transducer simulates a linear array of transducers. The object is placed 70 mm ahead of the scanning line. Figure 65 shows the pulse wave which was used for the aluminum block object. The shape of the pulse is shown in Fig. 65(a) and the spectrum is indicated by Fig. 65(b). The frequency band from 2.10 MHz to 9.00 MHz was used to reconstruct images. A different transducer was used for the underwater object. The frequency band from 1.62 MHz to 2.38 MHz was adopted then. The shape of that wave, however, is omitted here to avoid similar figures.

FIG. 65. Pulse used. (a) Wave form. (b) Spectrum.

Figure 66 shows the image of five parallel wires of 1 mm diameter in the water tank. The point spread function of this imaging system is estimated from this figure. The size of the lattice in this and the next figure is 1 mm × 1 mm. The coordinates marked in these figures simply indicate the center of the reconstructed plane. Though the area of data acquisition was determined so that the resolution would be 1 mm, the point spread function is slightly broader. This might be caused by the transducer used. The diameter of the transducer was relatively large, 5 mm, and its directivity might limit the effective area of the data acquisition. The central wire was placed exactly on a lattice point of the reconstructed image, so its amplitude is the highest possible. The amplitudes of the others decrease as their positions depart from lattice points. The imaging ability of the system is estimated to be good from the point spread functions in this figure.
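To relate the quoted frequency bands to length scales, the sketch below converts them to wavelengths. The sound speeds are assumptions introduced for this example (roughly 1480 m/s for water and 6300 m/s for longitudinal waves in aluminum); the text does not state the values used in the experiment.

```python
def wavelength_mm(frequency_mhz, speed_m_per_s):
    """Wavelength in millimetres for a frequency in MHz and a sound speed in m/s."""
    return speed_m_per_s / (frequency_mhz * 1.0e6) * 1.0e3

# Assumed sound speeds, not given in the text.
ASSUMED_SPEED_WATER = 1480.0      # m/s
ASSUMED_SPEED_ALUMINUM = 6300.0   # m/s, longitudinal wave

bands_mhz = {
    "water tank": (1.62, 2.38, ASSUMED_SPEED_WATER),
    "aluminum block": (2.10, 9.00, ASSUMED_SPEED_ALUMINUM),
}

for name, (f_low, f_high, speed) in bands_mhz.items():
    longest = wavelength_mm(f_low, speed)
    shortest = wavelength_mm(f_high, speed)
    print(f"{name}: wavelength from {shortest:.2f} mm to {longest:.2f} mm")
```

Under these assumed speeds the wavelengths are of the same order as the 1 mm lattice used for the reconstructed images.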
FIG. 66. Image of five parallel wires. The division of the lattice in this and the next figure is 1 mm × 1 mm (Nagai, 1985).
Figure 67 shows the image of a plate behind the five parallel wires in the water tank. The reflected wave from the plate is much larger than those from the wires, which is expressed by the difference between the amplitude of the plate and that of the wires. The amplitude of the wires is very small, but their shadows are clear in the image of the plate. Artifacts appear further behind the plate, probably caused by multiple reflection of the wave between the plate and the wires. In the theoretical discussion, multiple reflection is neglected by the Born approximation. The artifacts are one of the effects of multiple reflection.

FIG. 67. Image of a plane behind five wires. Shadows of the wires are found on the plane and artifacts are recognized further behind the plane.

The test object for nondestructive evaluation is shown in Fig. 68(a). It is an aluminum block 100 mm long, 48 mm high, and 48 mm deep. Five holes are drilled in the depth direction. The diameter of the largest hole is 10 mm and that of the smallest one is 0.6 mm. The transducer was scanned linearly only on the upper surface and collected the data at intervals of 0.5 mm. The reconstructed image is displayed on the CRT at 128 × 128 lattice points with a spacing of 0.5 mm × 0.5 mm. The picture is shown in Fig. 68(b). The five holes are reconstructed. The shape of the largest hole is not clear from the figure. This is because the image was reconstructed from the data collected only on the upper surface of the block, and data on the lower surface were not used. Except for this, a clear image has been reconstructed by the method.

FIG. 68. Reconstruction of aluminum block with five holes. (a) Test object for nondestructive evaluation, 100 mm long, 48 mm high and 48 mm deep. (b) Reconstructed image.
REFERENCES

Ahmed, M., Wang, K. Y. and Metherell (1979). Proc. IEEE 67, 446.
Alais, P. (1974). Acoust. Hologr. 5, 671.
Anderson, R. E. (1974). Acoust. Hologr. 5, 505.
Aoki, Y., Yosida, N., Tzukamoto, N. and Suzuki, M. (1967). Proc. IEEE 55, 1622.
Aoki, Y. (1970). IEEE Trans. Audio Electroacoust. AU-18, 258.
Berger, H. (1967). Acoust. Hologr. 1, 27.
Boyer, A. L., Hirsch, P. M., Jordan, J. A., Lesem, Jr. L. B. and Van Rooy, D. L. Acoust. Hologr. 3, 333.
Brenden, B. B. (1967). Acoust. Hologr. 1, 57.
Chubachi, N. (1982). Japan. J. Appl. Phys. 21-3, 7. (Proc. 3rd Symp. Ultrasonic Electronics 1982).
Corl, P. D., Kino, G. S., Desilets, C. S., and Grant, P. M. (1980). Acoust. Imag. 8, 39.
Cutrona, L. T., Leith, E. N., Porcello, L. J. and Vivian, W. E. (1966). Proc. IEEE 54, 1026.
Devaney, A. J. (1983). IEEE Trans. Sonics Ultrason. SU-30, 355.
Devaney, A. J., Beylkin, G. (1984). Ultrasonic Imag. 6, 181.
El-Sum, H. M. A. (1967). Acoust. Hologr. 1, 1.
El-Sum, H. M. A. (1969). Acoust. Hologr. 2, 7.
Ermert, H. and Karg, R. (1979). IEEE Trans. Sonics Ultrason. SU-26, 279.
Esmersoy, C. and Levy, B. C. (1986). Proc. IEEE 74, 466.
Farser, J., Havlice, J., Kino, G., Leung, W., Shaw, Toda, K., Waugh, T., Winslow, D. and Zitell, L. (1975). Acoust. Hologr. 6, 275.
Gabor, D. (1948). Nature 161, 777.
Goodman, J. W. (1969). Acoust. Hologr. 1, 173.
Goodman, J. W. (1971). Proc. IEEE 59, 1292.
Green, P. S. (1971). Acoust. Hologr. 3, 173.
Green, P. S., Schaefer, L. F. and Macovski, A. (1972). Acoust. Hologr. 4, 97.
Green, P. S., Schaefer, L. F., Jones E. D. and Suarez, R. (1974). Acoust. Hologr. 5, 493.
Greenleaf, J. F., Johnson, S. A., Samayoa, W. F. and Duck. (1975). Acoust. Hologr. 6, 71.
Greenleaf, J. F. and Bahn, R. C. (1981). IEEE Trans. Biomedical Eng. BME-28, 177.
Harmuth, H. F. (1979). "Acoustic Imaging with Electronic Circuits", Academic Press, New York.
Hidaka, T. (1975). J. Appl. Phys. 46, 786.
Hildebrand, B. P. (1980). Acoust. Imag. 8, 165.
Hildebrand, B. P., Boland, A. J. and Cochram M. L. (1982). Acoust. Imag. 11, 529.
Holbrooke, D. R., McCurry, E. E., and Richards, V. (1974). Acoust. Hologr. 5, 415.
Igarashi, M., Kaihoh, I. and Hayakawa, H. (1986). J. Acoust. Soc. Japan 42, 548. (in Japanese).
Iizuka, K., Ogura, H., Yen, J. L., Nguyen, V. K. and Weedmark, J. R. (1976). Proc. IEEE 64, 1493.
Ishii, J. and Sasaki, S. (1983). Japan. J. Appl. Phys. 22-3, 130. (Proc. 3rd Symp. Ultrasonic Electron.).
Johnson, S. A., Greenleaf, J. F., Duck, F. A., Chu, A., Samayoa, W. R. and Gilbert, B. K. (1974). Acoust. Hologr. 6, 193.
Johnson, J. A. and Barna, B. A. (1983). IEEE Trans. Sonics Ultrason. SU-30, 5.
Kaveh, M., Soumekh, M. and Greenleaf, J. F. IEEE Trans. Sonics Ultrason. SU-31, 230.
Kessler, L. W. and Yuhas, D. E. (1979). Proc. IEEE 67, 526.
Kino, G. S. (1979). Proc. IEEE 67, 510.
Kubota, J., Ishii, J. and Sasaki, S. (1982). Acoust. Imag. 11, 597.
Leith, E. N. and Upatnieks, J. (1962). J. Opt. Soc. Am. 52, 1123.
Macovski, A. (1979). Proc. IEEE 67, 484.
Maginness, M. G., Cook, G. B. and Higgens, L. G. (1972). Acoust. Hologr. 4, 195.
Meier, R. W. (1965). J. Opt. Soc. Am. 55, 987.
Metherell, A. F. (1974). Acoust. Hologr. 5, 41.
Mezrich, P. S., Etzold, K. P. and Vilkomerson, D. H. R. (1975). Acoust. Hologr. 6, 165.
Miyashita, T. (1980). Proc. IEEE 68, 1018.
Morse, P. M. and Ingard, K. U. (1968). "Theoretical Acoustics", McGraw-Hill, New York.
Mueller, R. (1971). Proc. IEEE 59, 1319.
Mueller, R., Kaveh, M. and Wade, G. (1979). Proc. IEEE 67, 567.
Mueller, R., Kaveh, M. and Iverson, R. (1980). Acoust. Hologr. 8, 615.
Nagai, K. (1983). Proc. IEEE 71, 1457.
Nagai, K. (1984a). IEEE Trans. Sonics Ultrason. SU-31, 151.
Nagai, K. (1984b). Proc. IEEE 72, 748.
Nagai, K. (1985). IEEE Trans. Sonics Ultrason. SU-32, 531.
Nahamoo, D. and Kak, A. C. (1981). Ultrasonic Imaging 3, 1.
Nahamoo, D., Pan, S. X. and Kak, A. C. (1984). IEEE Trans. Sonics Ultrason. SU-31, 218.
Nakayama, J., Ogura, H. and Fujiwara, M. (1978). Proc. IEEE 66, 1289.
Nitadori, K. (1975). Acoust. Hologr. 6, 507.
Nitadori, K., Mano, K. and Kamata, H. (1980). Acoust. Imag. 8, 249.
Norton, S. J. and Linzer, M. (1981). IEEE Trans. Biomed. Eng. BME-28, 202.
Ogura, H. and Iizuka, K. (1973). Proc. IEEE 61, 1040.
Porter, R. P. (1969). Phys. Lett. 29A, 193.
Porter, R. P. (1970). J. Opt. Soc. Am. 60, 1051.
Porter, R. P. (1981). Opt. Comm. 39, 362.
Power, J. P. and Mueller L. D. E. (1973). Acoust. Hologr. 5, 527.
Quate, C., Atalar, A. and Wickramasinghe, H. K. (1979). Proc. IEEE 67, 1092.
Rayleigh, J. W. S. (1945). "The Theory of Sound", Dover, New York.
Roger, G. L. (1950). Nature 166, 237.
Sasaki, O. and Yoshida, H. (1986). J. Acoust. Soc. Am. 79, 999.
Sato, T., Sunada, T. and Wadaka, S. (1978). J. Acoust. Soc. Am. 64, 1101.
Sato, T., Sasaki, K. and Umemura, K. (1979). J. Acoust. Soc. Am. 65, 976.
Schueler, C. F., Lee, H. and Wade, G. (1981). Proc. IEEE 69, 1580.
Schueler, C. F., Lee, H. and Wade, G. (1984). IEEE Trans. Sonics Ultrason. SU-31, 195.
Shibata, S. and Koda, T. (1986). J. Acoust. Soc. Japan 42, 556. (in Japanese).
Smith, J. M. and Moody, N. F. (1969). Acoust. Hologr. 1, 97.
Suarez, J. R., Marich, K. W., Holzemer, J. F., Taenzer, J. and Green, P. S. (1975). Acoust. Hologr. 6, 1.
Sutton, J. L. (1979). Proc. IEEE 67, 554.
Takahashi, F., Suzuki, K. and Kanamori, T. (1980). Acoust. Hologr. 8, 685.
Wells, W. H. (1970). Acoust. Hologr. 2, 87.
Whitman, R. L., Ahmed, M. and Korpel, A. (1972). Acoust. Hologr. 4, 11.
Wolf, E. (1969). Opt. Comm. 1, 153.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 70

Dual and Complementary Variational Techniques for the Calculation of Electromagnetic Fields

J. PENMAN
Department of Engineering, University of Aberdeen, Aberdeen, Scotland

I. Introduction
II. A Historical Perspective
III. Complementary Variational Principles
    A. Spaces, Operators and Functionals
    B. Adjoint Linear Operators
    C. Weak Solutions Given by Functional Stationarity
IV. The General Engineering Field Problem
    A. The Generalized Form of Green's Theorem
    B. The General Functional
    C. Standard Functional
    D. Complementary Functional
    E. Upper and Lower Bounds
V. Field Problems in Engineering
VI. Magnetostatics
    A. The Primal Set
    B. The Dual Set
    C. A Magnetostatic Example
VII. The Electrostatic Field
    A. The Primal Set
    B. The Dual Set
    C. An Electrostatic Example
VIII. The Electromagnetic Field
    A. The Structure
    B. The Primal and Dual Equations
    C. Complementary Functions for Eddy Current Systems
    D. An Electromagnetic Field Example
IX. Concluding Remarks
Acknowledgments
References
I. INTRODUCTION

Below we shall discuss the dual and complementary variational formulations of the electromagnetic field equations and explore the usefulness of these techniques, particularly when they are used in conjunction with the finite element method. These methods offer the numerical analyst great advantages since they provide a means whereby error bounds can be placed on the global accuracy of problems. They also allow local solutions to be exploited so that a more efficient use can be made of the finite element technique. This, in turn, releases the user of such methods from many of the time-consuming tasks associated with data entry and mesh design.

II. A HISTORICAL PERSPECTIVE
Since the techniques we wish to discuss have their roots in the calculus of variations, we shall begin with a short historical perspective of the relevant work, as it relates to our needs. Probably the first use of variational principles was that of Aristotle around 350 B.C., when he used a veiled formulation of the principle of virtual work to derive the law of the lever. Galileo further refined the notion of virtual work in the 1600s, and in 1717 John Bernoulli developed the principle of virtual work in more or less the form we know it today. During the 1700s Euler and Lagrange recognised variational principles as an exciting new subject in its own right, and developed a variational treatment of mechanics by introducing a new calculus, the calculus of variations. This led to variational principles being applied to many abstract problems which were intractable using conventional methods. Although variational principles were first used to derive operator equations from the laws of energy conservation, it was realised in the 1800s that this process was reversible, enabling a wide variety of such systems to be expressed in terms of a weaker energy formulation. This in turn led to a method of solving such operator equations, devised by Rayleigh in 1870 and later refined by Ritz in 1908. A second, independent solution technique applicable to operator equations is one based on the notion of weighted residuals, and Galerkin's method, developed in 1915, has proved to be the most reliable method of this class. These two methods, Rayleigh-Ritz and Galerkin, were soon recognised as the simplest and most flexible means of solving many of the equations that were being formulated at the time. Initially the Rayleigh-Ritz method proved to be less useful because of difficulties associated with the choice of appropriate basis functions. More recently, with the development of the finite element
DUAL AND COMPLEMENTARY VARIATIONAL TECHNIQUES
317
method as a vehicle to help construct these basic functions, the variational method has had its previous limitations removed. With complementary variational principles two independent, but bounded, descriptions of the same problem are obtained, and we shall refer to these as the standard energy principle and the complementary energy principle. The method developed by Hamilton, (1805- 1865) and known as Hamilton’s Principle, may be regarded as the first explicit statement of a standard energy principle. However, it was not until 1952 that Toupin enunciated a formulation involving complementary energy which could be viewed as a complementary formulation of Hamilton’s Principle. The first use of finite elements, in a strict mathematical sense, was by Courant in 1943, for the solution of a torsion problem. Unfortunately, the true benefits of the finite element method were not to be realised until much later with the development of high powered digital computers. Finite elements were first used in a significant way by Turner et al. in 1956, applied to the theory of structures. Initially therefore the development of the finite element method lay in the hands of the structural engineers. In these initial stages, the formulation was exclusively the type we will later develop and call the standard primal form, with the complementary primal method not being used until 1964 by Fraeijs de Veubeke. This method aroused great interest since it was shown that upper and lower bounds could be obtained on the energy of a problem. Unfortunately, the difficulty in suppressing rigid body modes, combined with many other technical problems led to a loss of interest in these techniques for structural problems. The work of Nobe and Rall (1966) is also of fundamental importance. Their approach to the subject was from a completely mathematical viewpoint employing the methods of functional analysis. 
This heralded the development of a much more mathematical approach to complementary variational principles and finite elements. Soon the work of Tonti (1972) unified the mathematical structure of a large class of physical problems. This work illustrated the striking mathematical similarity between a wide variety of general field problems. Following Tonti, between 1972 and 1980, Oden and Reddy provided formulations for complementary variational principles, not through Hamilton's and Toupin's Principles, but via the method of direct integration. This is essentially the approach used in functional analysis and it allows a more direct and complete statement of complementary variational principles. The result is that today the method of finite elements is well defined and understood, in a mathematical sense.

Applications of variational principles to problems outside mechanical systems can be found by referring to the work of Arthurs and Anderson (1969), who obtained bounds to the capacitance of a microwave problem. In 1976, Hammond, and Hammond and Penman applied complementary variational principles to magnetostatic problems using the principles of Toupin and Hamilton. Hammond and Penman (1978) then extended their work to include steady-state eddy current problems. In 1981, Hammond outlined complementary variational principles for electromagnetic wave problems. Later, Penman and Fraser (1982, 1984) applied a more rigorous approach using the techniques of functional analysis to construct complementary variational principles for a whole class of general engineering field problems. The extension of complementary variational principles to complementary finite element principles was also examined, and the globally bounded nature of the two complementary finite element solutions verified. This allowed the application of these techniques to magnetostatic, electrostatic, and steady-state eddy current systems.

III. COMPLEMENTARY VARIATIONAL PRINCIPLES

Complementary variational principles can be developed in a variety of ways, for example, the hypercircle method (Prager and Synge, 1947), or using Legendre transformations (Lanczos, 1970). Here a third technique is used, that of direct integration through functional analysis. We must begin by defining many of the terms and concepts of functional analysis that we will require. This is considered only briefly here. Readers who wish a much more comprehensive account of the subject are directed to the work of Oden and Reddy (1976).

A. Spaces, Operators and Functionals
All the systems we shall deal with here have an underlying linear structure. This leads us logically to the use of vector spaces and linear operators. By a vector space we mean a set of elements including a zero element, called vectors, and the linear operations of vector addition and scalar multiplication. Therefore, functions can also be elements of such a space. A normed vector space is one in which the concept of length is recognised through a mapping called a norm, and the norm of an element $u$ is denoted by $\|u\|$. We must also introduce an abstract geometry that includes definitions of direction, orthogonality, and "angle between vectors". These geometrical properties are introduced via the inner product, and the notation for the inner product of two elements $u$, $v$ is $(u \mid v)$. A space with all of the above features is called a Hilbert space. An example of a Hilbert space is the $L_2(\Omega)$ Lebesgue space. This is the set of all functions, over a domain $\Omega$, whose square integral is finite, i.e.

$$L_2(\Omega) = \left\{ u(x) : x \in \Omega,\ \int_\Omega u^2 \, d\Omega < \infty \right\}, \qquad \Omega \subset \mathbb{R}^n$$
The inner product for this space is defined as

$$(u \mid v)_{L_2} = \int_\Omega u v \, d\Omega$$
Also two elements $u$, $v$ of a Hilbert space $U$ are orthogonal if

$$(u \mid v)_U = 0$$
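The $L_2$ inner product and the orthogonality condition above can be checked numerically. The following is a minimal sketch, assuming the one-dimensional domain $\Omega = (0, 1)$ and two sine functions chosen purely for illustration; the helper name `l2_inner` and the midpoint quadrature are our own choices and do not come from the text.

```python
import numpy as np

def l2_inner(u, v, a=0.0, b=1.0, n=20000):
    """Approximate (u | v) = integral of u*v over [a, b] with the midpoint rule."""
    x = a + (np.arange(n) + 0.5) * (b - a) / n   # midpoints of n equal cells
    return np.sum(u(x) * v(x)) * (b - a) / n

# Two illustrative elements of L2(0, 1)
u = lambda x: np.sin(np.pi * x)
v = lambda x: np.sin(2.0 * np.pi * x)

norm_u_sq = l2_inner(u, u)   # squared norm of u, finite, so u lies in L2(0, 1)
uv = l2_inner(u, v)          # vanishes up to quadrature error: u and v are orthogonal
```

Here `norm_u_sq` approximates the integral of $\sin^2(\pi x)$, which is $1/2$, while `uv` is zero to within rounding.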
Important subsets of the $L_2(\Omega)$ space are the Sobolev spaces $H^n(\Omega)$. These are the sets of all functions which, together with their first $n$ derivatives, are square integrable. For example, over a two dimensional domain
and the inner product of two elements $u$, $v$ is given by
These spaces effectively indicate the degree of smoothness of functions by determining the number of derivatives they possess. With the concept of spaces defined, we may now introduce operators which map each element of one space into itself or into another space. For example, if an operator $A$ maps an element $u$ of a space $U$ into an element $v$ of a space $V$, then it is denoted by

$$Au = v, \qquad u \in U, \quad v \in V$$
The set of all bounded linear operators from a space $U$ to a space $V$ is itself a normed space, denoted by $B(U, V)$, where the norm $\|A\|$ of an element $A$ of this space is given by

$$\|A\| = \sup_{u \neq 0} \frac{\|Au\|_V}{\|u\|_U}$$

where sup is the least upper bound. In the special case where an operator maps elements $u$ of a space $U$ to the real number set $\mathbb{R}$, the operator is called a functional. If $\phi$ is a functional on $u \in U$, then it has a special notation. $\phi(u)$
=
Such operators have special significance in our work, since they are used to set up the complementary variational equations. Every normed space $U$ has a space of bounded linear functionals acting on $U$. This space is called the dual space of $U$ and is denoted by $U^*$. In many cases, for example the $L_2(\Omega)$ space, the dual space is none other than the space itself. That is,

$$L_2^*(\Omega) = L_2(\Omega)$$
B. Adjoint Linear Operators
Such operators have a central position in the development of this work. They are defined as follows. Consider an operator $T \in B(U, V)$. The adjoint operator of $T$, denoted by $T^a$, is that operator which satisfies the relationship:

$$(T^a w \mid u)_U = (w \mid Tu)_V$$
This can usefully be expressed pictorially as shown in Fig. 1. The adjoint operator is widely used in problems for which complementary variational principles exist, since such problems can be cast in the general form,

$$T^a M T u = p \tag{1}$$

which in turn can be expressed as $P(u) = 0$, where $P(u) = Lu - p$ and $L = T^a M T$; that is, $L$ is separable into the operators $T^a$, $M$ and $T$. This is shown diagrammatically in Fig. 2.
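In finite dimensions the adjoint relationship and the separable form $L = T^a M T$ can be illustrated with matrices. The sketch below is only an illustration under our own assumptions: $T$ is taken to be a forward-difference matrix, the inner products are Euclidean so that $T^a$ is simply the transpose, and $M$ is a random positive diagonal matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# T: forward-difference matrix mapping R^n -> R^(n-1), a discrete stand-in for a gradient
T = np.diff(np.eye(n), axis=0)
Ta = T.T                                    # adjoint for the Euclidean inner product
M = np.diag(rng.uniform(1.0, 2.0, n - 1))   # self-adjoint, positive definite operator

u = rng.standard_normal(n)
w = rng.standard_normal(n - 1)

lhs = np.dot(Ta @ w, u)    # (T^a w | u)_U
rhs = np.dot(w, T @ u)     # (w | T u)_V

L = Ta @ M @ T             # separable operator L = T^a M T
```

With these choices `lhs` and `rhs` agree to rounding error, and `L` is symmetric and positive semi-definite, which is the discrete counterpart of the structure assumed in Eq. (1).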
FIG. 1. The adjoint operator.

FIG. 2. Schematic representation of P(u) = 0.
C. Weak Solutions Given by Functional Stationarity
Consider $P$, an operator (not necessarily linear), from a subspace $H$ of a Hilbert space $U$ to another subspace $G$, also in $U$. The strong solution to the operator equation $P(u) = 0$ is given by the element $u_0 \in H$ such that $P$ maps $u_0$ to the zero element in $G$. This is shown diagrammatically in Fig. 3. The weak solution to $P(u) = 0$ is given by the $u$ such that

$$(P(u) \mid \xi)_U = 0 \qquad \forall \xi \in V \subset U$$

That is, the $u$ for which $r = P(u)$ is orthogonal to the entire subspace $V$. If $V = G$ then choosing $\xi = r = P(u)$ gives

$$\|P(u)\|^2 = 0 \;\Rightarrow\; P(u) = 0$$

and so, if $V$ is the same space as the range of $P$, then the strong and weak solutions are identical.

FIG. 3. Weak and strong solution to P(u) = 0.

The weak solution to an operator problem can be expressed indirectly in a functional form. If a functional $\phi: U \to \mathbb{R}$ is given, then the Gateaux differential of $\phi$ at $u$ is given by

$$\delta\phi(u; \xi) = (\operatorname{grad} \phi(u) \mid \xi)_U$$
The Gateaux derivative is merely a generalised form of differentiation (Oden and Reddy, 1976), so that $\delta\phi(u; \xi)$ represents the infinitesimal increment in $\phi$ at $u$ in the $\xi$ direction. Now, if the functional $\phi$ is chosen such that its gradient is equal to the operator of the problem, i.e. $\operatorname{grad} \phi(u) = P(u)$, then

$$\delta\phi(u; \xi) = (P(u) \mid \xi)_U$$

and so the stationary point of $\phi$, $\delta\phi = 0$, provides a weak solution to $P(u) = 0$. In general, if a functional $\phi$ on a Hilbert space is supplied, then an operator $P$ can be found such that $P(u) = 0$ is given by the stationary point of $\phi$. In the following pages, the inverse problem is addressed; that is, given an operator $P$, find a functional whose stationary point is a weak solution to $P(u) = 0$. The process of constructing functionals for a class of general operators, through the method of direct integration, is mathematically complex. The interested reader may refer to the work of Vainberg (1973) for a full account. Here, we will present only the salient features as required for our development. An operator $P$ is defined to be a potential operator if the bilinear functional $(\delta P(u; \eta) \mid \xi)$ is symmetric, that is, if

$$(\delta P(u; \eta) \mid \xi) = (\delta P(u; \xi) \mid \eta)$$

It can then be shown that for a potential operator $P$, a general functional $\phi$, such that the stationary point of $\phi$ provides a weak solution to $P(u) = 0$, is given by

$$\phi(u) = \int_0^1 (P(su) \mid u) \, ds \tag{2}$$
Accordingly, the operator given by expression (1) is potential. The relationship formalised in (2) is therefore of major importance in the development of complementary variational methods for the class of problems under consideration here.
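The direct integration formula (2) can be tried out on a small linear problem. The sketch below assumes a finite dimensional operator $P(u) = Lu - p$ with a symmetric matrix $L$ (so that $P$ is potential); the matrix, the vector and the quadrature are arbitrary choices of ours. Integrating (2) by hand for this case gives $\phi(u) = \tfrac12 (Lu \mid u) - (p \mid u)$, and the code compares this with a numerical evaluation and checks that the gradient of $\phi$ reproduces $P(u)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
L = A.T @ A + np.eye(n)      # symmetric, so P(u) = L u - p is a potential operator
p = rng.standard_normal(n)

def P(u):
    return L @ u - p

def phi(u, m=200):
    """phi(u) = integral over s in [0, 1] of (P(s u) | u), by the midpoint rule."""
    s = (np.arange(m) + 0.5) / m
    return np.mean([np.dot(P(si * u), u) for si in s])

u = rng.standard_normal(n)
closed_form = 0.5 * u @ L @ u - p @ u     # result of integrating (2) by hand

# A central-difference gradient of phi should reproduce P(u)
eps = 1e-6
grad = np.array([(phi(u + eps * e) - phi(u - eps * e)) / (2 * eps) for e in np.eye(n)])
```

Because the integrand is linear in $s$ for a linear operator, the midpoint rule is exact here, and the stationary point of `phi` is the solution of $Lu = p$.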
IV. THE GENERAL ENGINEERING FIELD PROBLEM

The purpose of this section is to develop, using variational methods, solutions to partial differential equations that have the general form

$$T^a M (T u(x) + v_s(x)) = p(u(x)), \tag{3}$$

where $x \in \Omega$ and $\Omega \subset \mathbb{R}^3$, representing 3-dimensional space. We wish to do this because Eq. (3) has the same form as many of the equations relating to practical engineering field problems. We are particularly interested in problems involving magnetostatics, electrostatics, and sinusoidally time varying electromagnetic fields. We shall develop the variational principles in general form using Eq. (3), and give explicit forms for the specific systems afterwards. Now, associated with Eq. (3) there are constraints which are placed upon the solution at the boundaries. These conditions are of the form

$$Bu(x) = g(x) \quad \text{for } x \in \partial\Omega_1, \qquad B^+ w(x) = h(x) \quad \text{for } x \in \partial\Omega_2, \tag{4}$$
where $w = M(Tu + v_s)$ and $\partial\Omega_1 \cup \partial\Omega_2 = \partial\Omega$, the bounding surface to the problem domain $\Omega$. $\partial\Omega_1$ and $\partial\Omega_2$ are exclusive surfaces; that is, they do not overlap. $v_s(x)$ is a known function of $x$, and $u(x)$ is the unknown in the equation. Also, the operator $T^a$ is the adjoint of the operator $T$; $M$ is a self-adjoint invertible operator; and the domains and ranges of the operators defined in (3) and (4) are

$$T: H \to G, \qquad M: G \to G, \qquad T^a: G \to H, \qquad B: H \to \partial G, \qquad B^+: G \to \partial H, \tag{5}$$
FIG. 4. The general operator set.
where $U$ and $V$ are Hilbert spaces and $H \subset U$ and $G \subset V$. The spaces $\partial G$ and $\partial H$ are known as boundary spaces. Operators $B$ and $B^+$ are best introduced using the two trace operators $\gamma_0$ and $\delta_0$. These are special operators which map functions defined over a region $\Omega$ into functions defined only on the bounding surface of the region. That is,

$$\gamma_0: u(x) \in H \to \partial H, \qquad \delta_0: w(x) \in G \to \partial G$$

These operators are very often $I$, $n\cdot$, $n\times$, or they can involve the differentiation $\partial/\partial n$, where $I$ is the identity operator and $n$ is the outward normal on $\partial\Omega$. If two new operators $B_0$ and $B_0^a$ are now defined as boundary selective operators such that

$$B_0: a(x) \to \begin{cases} a(x) & \text{if } x \in \partial\Omega_1 \\ 0 & \text{if } x \in \partial\Omega_2 \end{cases} \qquad B_0^a: b(x) \to \begin{cases} 0 & \text{if } x \in \partial\Omega_1 \\ b(x) & \text{if } x \in \partial\Omega_2 \end{cases}$$

then $B$ and $B^+$ may be defined as

$$B = B_0 \gamma_0, \qquad B^+ = B_0^a \delta_0$$

The operators and spaces that we have introduced are best visualised diagrammatically as in Fig. 4.

A. The Generalized Form of Green's Theorem
In complementary variational principles, Green's theorem can be used to good advantage to help develop the complementary forms from general functionals of the type given by Eq. (2). To preserve our generality for the time being, we will therefore require a general statement of Green's theorem. This is achieved as follows. Consider two new spaces $\bar{U}$ and $\bar{V}$, known as product spaces and defined as

$$\bar{U} = U \times \partial H \quad \text{and} \quad \bar{V} = V \times \partial G$$

An element $\bar{u} \in \bar{U}$ is the ordered pair $(u, \gamma_0 u)$ and an element $\bar{w} \in \bar{V}$ is the ordered pair $(w, \delta_0 w)$. Two new operators $\bar{T}$ and $\bar{T}^a$ can therefore be defined such that

$$\bar{T}: \bar{U} \to \bar{V} \quad \text{and} \quad \bar{T}^a: \bar{V} \to \bar{U},$$

where $\bar{T}$ and $\bar{T}^a$ have the form

Here, $T^a$ is the adjoint of $T$, and $B_0^a$ is the adjoint of $B_0$. As we have previously seen, if $\bar{T}$ and $\bar{T}^a$ are adjoint, then

$$(\bar{T}\bar{u} \mid \bar{w})_{\bar{V}} = (\bar{T}^a \bar{w} \mid \bar{u})_{\bar{U}}$$

and expanding this gives

which immediately yields the generalised form of Green's theorem,

$$(T^a w \mid u)_U = (Tu \mid w)_V + (Bu \mid \delta_0 w)_{\partial G} - (B^+ w \mid \gamma_0 u)_{\partial H} \tag{7}$$
B=I
The appropriate adjoints are thus,
B+ = -n
T"=V.,
So for U = Ho(Q) and V = Ho(12)x Ho(12), ( T a w1 u ) =~
In-
(V w)udR
and (Tu 1 W)" =
jn(
.
VU) w dQ
Also
and
(BfwlG,u),, =
la%
(-n.w))ud(dQ)
which on substitution in to equation (7) gives the classical Green's formula
12(vw)u d~
= j n ( - v u ) . w dn
*
+ Ianu(n - w)d(aQ)
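The classical Green's formula above is easy to verify in one dimension, where the divergence and gradient both reduce to $d/dx$ and the boundary integral becomes a sum over the two end points with outward normals $-1$ and $+1$. The following sketch assumes $\Omega = (0, 1)$ and two arbitrary polynomials of our own choosing; it uses only the `numpy.polynomial.Polynomial` class.

```python
from numpy.polynomial import Polynomial

u = Polynomial([1.0, 2.0, 3.0])        # u(x) = 1 + 2x + 3x^2
w = Polynomial([0.0, 1.0, -1.0, 2.0])  # w(x) = x - x^2 + 2x^3

def integrate01(poly):
    """Exact integral of a polynomial over [0, 1]."""
    anti = poly.integ()
    return anti(1.0) - anti(0.0)

lhs = integrate01(w.deriv() * u)               # integral of (div w) u
volume = integrate01(-u.deriv() * w)           # integral of (-grad u) . w
boundary = u(1.0) * w(1.0) - u(0.0) * w(0.0)   # u (n . w) summed over the end points
```

The identity states that `lhs` equals `volume + boundary`, which in one dimension is just integration by parts.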
B. The General Functional

We have now developed the necessary mathematical tools and we can proceed by considering the partial differential Eq. (3) with the boundary conditions (4). They can be expressed in canonical form, thus:

$$\begin{aligned} Tu &= v - v_s, & Bu &= g \ \text{on } \partial\Omega_1 \\ Mv &= w \\ T^a w &= p(u), & B^+ w &= h \ \text{on } \partial\Omega_2 \end{aligned} \tag{8}$$

We have called Eqs. (8) the canonical set because of their similarity to the canonical equations of analytical mechanics, as proposed by Hamilton. The problem at hand is therefore to find a functional whose stationary point is the solution to (8). This will be achieved using the method of direct integration referred to earlier, and outlined in the following pages. Firstly, we define a product space which consists of the functions $u$, $v$, and $w$, together with their boundary terms. An operator equation representing (8) is then constructed using elements of this space and, following the methods used by Vainberg, Eq. (2) is used to obtain a functional $\Phi$, which has the form:

$$\Phi(u, v, w) = (\tfrac12 T^a w - f \mid u)_U + (\tfrac12 B^+ w - h \mid \gamma_0 u)_{\partial H} + \tfrac12 (Mv - w \mid v)_V + (\tfrac12 Tu - \tfrac12 v + v_s \mid w)_V + (\tfrac12 Bu - g \mid \delta_0 w)_{\partial G} \tag{9}$$
Here $p(u) = f(x)$ is a function of position only, and therefore this represents a static field problem. If, however, $p(u) = k(x)u(x) + f(x)$, it could equally well describe a steady state sinusoidally time varying problem. We shall consider this more complicated case later. In the theory of elasticity, the functional $\Phi$ is known as the Hu-Washizu form, and finding the ordered set of functions $(u, v, w)$ which makes the functional stationary is equivalent to solving the canonical Eqs. (8). Now, the stationary point of $\Phi$ is given by

$$\delta\Phi((u, v, w); (\eta, \xi, \zeta)) = 0$$

and this yields the three equations

$$(T^a w - f \mid \eta)_U + (B^+ w - h \mid \gamma_0 \eta)_{\partial H} = 0, \qquad \forall \eta \in U$$

$$(Mv - w \mid \xi)_V = 0, \qquad \forall \xi \in V$$

$$(Tu + v_s - v \mid \zeta)_V + (Bu - g \mid \delta_0 \zeta)_{\partial G} = 0, \qquad \forall \zeta \in V$$
which are the conditions for the weak solution to the canonical set (8). Now it is apparent that finding the stationary point of $\Phi$ involves seeking the three functions $u$, $v$, and $w$ simultaneously. Clearly this method has a large number of unknowns. It is therefore rarely used in practice. However, when the relationship between $v$ and $w$ is known, then $v$ can be substituted as $M^{-1}w$. This reduces $\Phi(u, v, w)$ to $\Pi(u, w)$, and this form is known in the theory of elasticity as the Hellinger-Reissner functional:

$$\Pi(u, w) = (\tfrac12 T^a w - f \mid u)_U + (\tfrac12 B^+ w - h \mid \gamma_0 u)_{\partial H} + (\tfrac12 Tu - \tfrac12 M^{-1} w + v_s \mid w)_V + (\tfrac12 Bu - g \mid \delta_0 w)_{\partial G} \tag{10}$$

This functional can be further simplified in two ways.
C. Standard Functional
If the generalised form of Green's theorem (7) is used to substitute for $\tfrac12 (T^a w \mid u)_U$ in (10) then $\Pi$ may be written as

$$\Pi(u, w) = (Tu - \tfrac12 M^{-1} w + v_s \mid w)_V - (f \mid u)_U + (Bu - g \mid \delta_0 w)_{\partial G} - (h \mid \gamma_0 u)_{\partial H} \tag{11}$$

If $v = M^{-1} w$ is defined as $Tu + v_s$, and $u$ is such that $Bu$ is identically equal to $g$ on $\partial\Omega_1$, then from (11)

$$\Pi(u, w(u)) \to \Theta(u)$$

where

$$\Theta(u) = \tfrac12 (Tu + v_s \mid M(Tu + v_s))_V - (f \mid u)_U - (h \mid \gamma_0 u)_{\partial H} \tag{12}$$
It can be shown that, at the stationary point of this functional, i.e. when $\delta\Theta(u; \eta) = 0$, then

$$(T^a w - f \mid \eta)_U + (B^+ w - h \mid \gamma_0 \eta)_{\partial H} = 0, \qquad \forall \eta \in H_g(\Omega)$$

where $H_g(\Omega)$, a subset of $H$, is the space of all functions satisfying the boundary condition $Bu = g$. Therefore, the stationary point of $\Theta$ provides a weak solution to the equations

$$T^a w = f \ \text{in } \Omega \quad \text{and} \quad B^+ w = h \ \text{on } \partial\Omega_2 \tag{13}$$

provided that

$$Tu + v_s = v, \quad Mv = w \ \text{in } \Omega, \quad \text{with } Bu = g \ \text{on } \partial\Omega_1$$
are satisfied a priori. The $\Theta$ functional is known as the total potential energy in the theory of elasticity. The complementary functional to $\Theta$, known as the total complementary energy, is denoted by $\Xi$ and may also be constructed from $\Pi$ as follows.

D. Complementary Functional

If, this time, the generalised form of Green's theorem (7) is applied to (10) to substitute for $\tfrac12 (Tu \mid w)_V$ then $\Pi$ may be written as

$$\Pi(u, w) = (T^a w - f \mid u)_U - (\tfrac12 M^{-1} w - v_s \mid w)_V + (B^+ w - h \mid \gamma_0 u)_{\partial H} - (g \mid \delta_0 w)_{\partial G} \tag{15}$$

If $w$ is defined such that $T^a w$ is identical to $f$ in $\Omega$, and $B^+ w$ is identically $h$ on $\partial\Omega_2$, then from (15)

$$\Pi(u(w), w) \to \Xi(w)$$

where

$$\Xi(w) = (-\tfrac12 M^{-1} w + v_s \mid w)_V - (g \mid \delta_0 w)_{\partial G} \tag{16}$$
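It can also be shown that at the stationary point of (16), when $\delta\Xi(w; \zeta) = 0$, then the following is true:

$$(Tu + v_s - M^{-1} w \mid \zeta)_V + (Bu - g \mid \delta_0 \zeta)_{\partial G} = 0, \qquad \forall \zeta \in G_h(\Omega) \subset G$$

where $G_h(\Omega)$, a subset of $G$, is the space of all functions satisfying the boundary condition $B^+ w = h$. Therefore, the stationary point of $\Xi$ generates a weak solution to

$$Tu + v_s = v \ \text{in } \Omega \quad \text{and} \quad Bu = g \ \text{on } \partial\Omega_1 \tag{17}$$

provided that

$$T^a w = f, \quad Mv = w \ \text{in } \Omega, \quad \text{with } B^+ w = h \ \text{on } \partial\Omega_2$$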
are satisfied a priori. We have therefore used the general functional form, Eq. (9), to find two further functionals, Eqs. (12) and (16), both of which will provide a solution to the canonical set of Eqs. (8) and which will therefore solve the partial differential equation (3). This somewhat lengthy and involved development has left us with two solution routes to the same problem. We proceed by examining the nature of these functionals.

E. Upper and Lower Bounds
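The convex or concave nature of the $\Theta$ and $\Xi$ functionals can be examined by the sign of the second Gateaux differential. The second Gateaux differential of $\Theta$ at $u$ and in the direction $\eta$ is given by

$$\delta^2\Theta(u; \eta, \eta) = \left.\frac{\partial}{\partial\rho}\, \delta\Theta(u + \rho\eta; \eta)\right|_{\rho = 0}$$

Performing these required operations on the functional expression (12) gives

$$\delta^2\Theta(u; \eta, \eta) = (T\eta \mid MT\eta)_V, \qquad \forall \eta \in H_g(\Omega)$$

Letting $\psi = T\eta$, and noting that $M$ is a positive definite operator, gives

$$\delta^2\Theta(u; \eta, \eta) = (\psi \mid M\psi)_V \geq 0 \tag{19}$$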
FIG. 5. Extremum point of functionals.
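The solution to (13) is therefore obtained by minimizing $\Theta$ over the space $H_g(\Omega)$. Similarly, the second Gateaux differential of $\Xi$ at $w$ and in the direction $\zeta$ is given by

$$\delta^2\Xi(w; \zeta, \zeta) = \left.\frac{\partial}{\partial\rho}\, \delta\Xi(w + \rho\zeta; \zeta)\right|_{\rho = 0}$$

and using the functional expression (16) gives

$$\delta^2\Xi(w; \zeta, \zeta) = -(\zeta \mid M^{-1}\zeta)_V \leq 0 \tag{20}$$

Therefore, the solution to (17) is obtained by maximizing $\Xi$ over the space $G_h(\Omega)$. For the case of a space with infinite dimensions, it can be seen that if $u$ and $w$ exactly satisfy

$$Tu + v_s = v, \quad T^a w = f \ \text{in } \Omega, \qquad Bu = g \ \text{on } \partial\Omega_1, \quad B^+ w = h \ \text{on } \partial\Omega_2 \tag{21}$$

then

$$\Theta(u) = \Pi(u, w) = \Xi(w)$$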
However, when numerical solution procedures are used, the spaces $H_g(\Omega)$ and $G_h(\Omega)$ must be approximated with the finite dimensional subspaces $H_g^{(n)}(\Omega)$ and $G_h^{(n)}(\Omega)$. This means that the resulting solutions $u_n$ and $w_n$ rarely satisfy (21) exactly, and so the true extremum point of the functionals is not found. Under these conditions we have

$$\Theta(u_n) \geq \Theta(u) = \Pi(u, w) = \Xi(w) \geq \Xi(w_n).$$

Thus $\Theta(u_n)$ and $\Xi(w_n)$ provide upper and lower bounds to the exact functional. This is shown schematically in Fig. 5. Obviously, as $n$, the dimension of the spaces $H_g^{(n)}(\Omega)$ and $G_h^{(n)}(\Omega)$, is increased, the error between the approximate functionals reduces. Essentially, we have now developed a general theory which allows us to construct complementary functionals, the solution of which will provide answers which bound the true value of energy for the system under consideration.

V. FIELD PROBLEMS IN ENGINEERING
We will now continue by applying our general results to systems that occur commonly in engineering field problems. Previously it was shown how complementary variational methods could be used to solve equations of the form

$$T^a M (Tu + v_s) = p \ \text{in } \Omega \quad \text{with} \quad Bu = g \ \text{on } \partial\Omega_1 \quad \text{and} \quad B^+ w = h \ \text{on } \partial\Omega_2,$$

and it is precisely this type of equation that governs the behavior of a large class of time independent physical phenomena. In general, we may say problems of this type satisfy three related equations which we will call

(i) a compatibility equation, $S^a v = q$
(ii) a constitutive equation, $w = Mv$
(iii) an equilibrium equation, $T^a w = p$ (24)
This notation is of course borrowed from stress analysis but we choose to extend the terminology to the general field problem, as a matter of convenience. These relationships can be expressed in diagrammatical form by employing the schematic representation proposed by Tonti (1972). This is illustrated in Fig. 6, and it represents a number of Hilbert spaces connected by operators. We will now continue by showing that field problems such as these can in fact be transformed into two different sets of canonical equations of the general form of Eq. (8).
FIG. 6. Relationships between general field quantities.
This is achieved as follows. If two functions, $u \in U$ and $v_s \in V$, and an operator $T$ are defined so as to satisfy the relationship

$$v = v_s + Tu \tag{25}$$

such that

$$S^a v = S^a v_s + S^a T u = q + 0 \tag{26}$$

then our so-called compatibility condition is automatically satisfied. Here $v_s$ is the source component of $v$, and when operated on by $S^a$ it yields $q$. To satisfy $S^a T u = 0$, the range of operator $T$ must of course be the null space of operator $S^a$, and this is the case for all problems of the type being considered here. Similarly, two further functions $r \in W$ and $w_s \in V$, together with operator $S$, are introduced such that if

$$w = w_s + Sr$$

then

$$T^a w = T^a w_s + T^a S r = p + 0$$

Here the equilibrium condition is automatically satisfied, and a condition of this is that the range of $S$ must be the null space of operator $T^a$. This implies the extended structure illustrated by Fig. 7.

FIG. 7. The overall relationship between spaces.

To complete the picture we must return once again to the problem of the boundary conditions. Generally, boundary conditions can be of three types for systems of the kind under consideration:
(i) a far boundary condition;
(ii) a compatibility boundary condition, $E^+ v = h$;
(iii) an equilibrium boundary condition, $B^+ w = h$.

Conditions (ii) and (iii) may be restated using expressions (25) and (26) to give the indirect conditions

$$Bu = g \ \text{on } \partial\Omega_1 \tag{27}$$

and

$$Er = g \ \text{on } \partial\Omega_2 \tag{28}$$
The applicable boundary equation will depend upon the solution potential being used. At the far boundaries, the field may be considered to be zero and therefore it reduces to a special case of the above conditions. The structure shown in Fig. 7 can now be regarded in two alternative ways. First, the compatibility equation can be directly satisfied by expressing $v$ as $v_s + Tu$. When this is done, the resulting equation is called the primal canonical set and is written as

$$\begin{aligned} Tu + v_s &= v, & Bu &= g \ \text{on } \partial\Omega_1 \\ Mv &= w & &\text{in } \Omega \\ T^a w &= p, & B^+ w &= h \ \text{on } \partial\Omega_2 \end{aligned} \tag{29}$$

Alternatively, the equilibrium condition can be satisfied directly by expressing $w$ as $w_s + Sr$ and this produces the dual canonical set, which may be written as

$$\begin{aligned} S^a v &= q, & E^+ v &= h \ \text{on } \partial\Omega_1 \\ v &= M^{-1} w & &\text{in } \Omega \\ Sr + w_s &= w, & Er &= g \ \text{on } \partial\Omega_2 \end{aligned} \tag{30}$$

We may now gather together all of the notions developed thus far and express them schematically as shown in Fig. 8.
We can now clearly see the choices. It has been shown that each canonical set can be solved in one of two ways, giving in all four possible solution schemes. They are:

(i) Standard Primal Method

enforce: $w = Mv$, $Tu + v_s = v$ in $\Omega$, $Bu = g$ on $\partial\Omega_1$,
minimize: $\Theta(u)$,
solves: $T^a w = p$ in $\Omega$, $B^+ w = h$ on $\partial\Omega_2$.

This is a direct equilibrium method, as it gives a weak solution to the equilibrium equation.

(ii) Complementary Primal Method

enforce: $w = Mv$, $T^a w = p$ in $\Omega$, $B^+ w = h$ on $\partial\Omega_2$,
maximize: $\Xi(w)$,
solves: $Tu + v_s = v$ in $\Omega$ and $Bu = g$ on $\partial\Omega_1$. (32)

This is an indirect compatibility method, since it gives a weak solution to equations which in turn satisfy the original compatibility equations.

(iii) Standard Dual Method

enforce: $w = Mv$, $Sr + w_s = w$ in $\Omega$, $Er = g$ on $\partial\Omega_2$,
minimize: $\Theta(r)$,
solves: $S^a v = q$ in $\Omega$ and $E^+ v = h$ on $\partial\Omega_1$.

A direct compatibility method.

(iv) Complementary Dual Method

enforce: $w = Mv$, $S^a v = q$ in $\Omega$, $E^+ v = h$ on $\partial\Omega_1$,
maximize: $\Xi(v)$,
solves: $Sr + w_s = w$ in $\Omega$ and $Er = g$ on $\partial\Omega_2$. (34)

An indirect equilibrium method. This concludes the general development and we shall now show how it can be directly related to problems in electromagnetism.
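The bounding property of the standard and complementary methods can be seen on a very small model problem. The sketch below is an illustration under our own assumptions and is not taken from the text: we take $\Omega = (0, 1)$, $T = d/dx$, $T^a = -d/dx$, $M = 1$, $v_s = 0$, $f = 1$ and $u = 0$ at both ends, so that $\Theta(u) = \tfrac12 \int u'^2 \, dx - \int u \, dx$ and $\Xi(w) = -\tfrac12 \int w^2 \, dx$ for any $w$ with $-w' = 1$, i.e. $w = c - x$. The exact value of both functionals is $-1/24$. The function names `theta_fe` and `xi` are hypothetical.

```python
import numpy as np

def theta_fe(n):
    """Minimum of Theta over piecewise-linear functions on n >= 2 equal elements."""
    h = 1.0 / n
    # Stiffness matrix for the interior nodes of -u'' = 1 with u(0) = u(1) = 0
    K = (np.diag(2.0 * np.ones(n - 1))
         - np.diag(np.ones(n - 2), 1)
         - np.diag(np.ones(n - 2), -1)) / h
    b = h * np.ones(n - 1)                 # load vector for f = 1
    u = np.linalg.solve(K, b)
    return 0.5 * u @ K @ u - b @ u

def xi(c):
    """Xi(w) = -1/2 * integral of w^2 over (0, 1) for the admissible field w = c - x."""
    return -0.5 * (c**2 - c + 1.0 / 3.0)

exact = -1.0 / 24.0
upper = theta_fe(8)                  # standard method: approaches the exact value from above
lower = xi(0.45)                     # complementary method with a non-optimal constant c
estimate = 0.5 * (upper + lower)     # average of the two answers
half_gap = 0.5 * (upper - lower)     # guaranteed bound on the error of the average
```

In this one-dimensional setting the admissible complementary fields form a one-parameter family, so maximizing over $c$ would already give the exact value at $c = 1/2$; a non-optimal $c$ is used only to show a strict lower bound. The average of the two values is in error by no more than half the gap between them.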
VI. MAGNETOSTATICS

We begin by examining the magnetostatic field. The magnetostatic field is such that the following equations must be satisfied in the domain of the problem, denoted by $\Omega$.

(i) $\nabla \cdot B = 0$ (compatibility)
(ii) $B = \mu H$ (constitutive)
(iii) $\nabla \times H = J$ (equilibrium)
where $B$ is the magnetic flux density, $H$ is the magnetic field intensity, $J$ is the current density, and $\mu$ is the permeability. Relating the above operators to the general operators used in the previous section, it is easy to see that the following equivalences are implied,

$$S^a = \nabla\cdot \quad \text{and} \quad T^a = \nabla\times$$

Therefore, by considering the adjoints of the vector operators $\nabla\cdot$ and $\nabla\times$,

$$S = -\nabla \quad \text{and} \quad T = \nabla\times$$

Now, introducing $A$, the magnetic vector potential, defined by

$$B = \nabla \times A$$

we have

$$\nabla \cdot B = \nabla \cdot (\nabla \times A) \equiv 0,$$

and so the compatibility equation in $\Omega$ is satisfied. Also, for two dimensional fields in which the current density can be expressed as

$$J = 0i + 0j + Jk$$

it is sufficient also to define $A$ by

$$A = 0i + 0j + Ak.$$

That is, the vector field has only one unknown, namely the $z$ directed component of $A$. We also choose to introduce another potential $\psi$, termed the reduced magnetic scalar potential, which is defined by

$$H = H_s - \nabla\psi$$

where

$$\nabla \times H = (\nabla \times H_s) + (\nabla \times (-\nabla\psi)) = J + 0$$
This means that the equilibrium condition is automatically satisfied in $\Omega$. Following Penman and Fraser (1984), $H_s$ is called the reduced magnetic field intensity and represents the magnetic field produced by the current carrying elements of the system. The $-\nabla\psi$ term represents the "deformation" of the $H$ field, from $H_s$, due to the effects of magnetic materials. Also, for the primal equations, we see that

$$M = \nu, \ \text{the reluctivity,} \quad \text{and} \quad v_s = 0.$$

For the dual equation we have

$$M = \mu, \ \text{the permeability,} \quad \text{and} \quad v_s = H_s.$$

Associated with the operators $\nabla\cdot$ and $\nabla\times$ are their boundary operators

$$E^+ = -n\cdot \quad \text{and} \quad B^+ = -n\times$$

therefore we may define their adjoints as

$$E = I \quad \text{and} \quad B = -n\times,$$

noting that in two dimensions

$$n = n_x i + n_y j$$

The manner in which these boundary conditions arise is discussed later in this section. Note though that the boundary operator $B$ acts on the magnetic vector potential. That is,

$$-n \times A = g$$

If we remember that $A$ has only a $z$-component and use the form of $n$ given above then we have

$$(-n_y i + n_x j) A = g$$
where g, although still a vector is now in the z direction. The boundary conditions for two dimensional magnetostatic problems are normally specified using the following sign convention for the boundary operators, n.B=h
$=g
nxH=h
A=g
338
J. PENMAN
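The reduction of $-n \times A = g$ to a condition on the single component $A$ can be written out explicitly. The following short derivation is our own working, using only $n = n_x i + n_y j$, $A = Ak$ and the unit vector products $i \times k = -j$ and $j \times k = i$:

$$-n \times A = -(n_x i + n_y j) \times (A k) = -A\,(n_x\, i \times k + n_y\, j \times k) = -A\,(-n_x j + n_y i) = (-n_y i + n_x j)\,A$$

Since $n_x^2 + n_y^2 = 1$, the vector $-n_y i + n_x j$ is a unit tangent to the boundary, so prescribing $-n \times A$ on the boundary is equivalent to prescribing the scalar $A$ there.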
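Taking this into account and substituting the magnetostatic terms into the general functionals given by Eqs. (12) and (16), the following system of equations and functionals is obtained:

A. The Primal Set

Primal Canonical Equation

$$\begin{aligned} B &= \nabla \times A, & A &= g \ \text{on } \partial\Omega_1 \\ H &= \nu B & &\text{in } \Omega \\ J &= \nabla \times H, & n \times H &= h \ \text{on } \partial\Omega_2 \end{aligned} \tag{35}$$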
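for which the appropriate functionals are,

Standard Primal Functional

$$\Theta(A) = \tfrac12 \int_\Omega (\nabla \times A) \cdot \nu (\nabla \times A) \, d\Omega - \int_\Omega J \cdot A \, d\Omega + \int_{\partial\Omega_2} h \cdot A \, d(\partial\Omega) \tag{36}$$

with $A = g$ specified on $\partial\Omega_1$.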
Complementary Primal Functional

with $\nabla \times H = J$ in $\Omega$ and $n \times H = h$ on $\partial\Omega_2$ both satisfied. We can express this functional in a more useful form by introducing the magnetostatic scalar potential, $\psi$, defined by $H = H_s - \nabla\psi$. This gives,

with $\psi = g$ specified on $\partial\Omega_2$.
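B. The Dual Set

We can proceed in an identical way to that above, again using the general expression to produce a dual set.

Dual Canonical Equation

$$\begin{aligned} H &= H_s - \nabla\psi, & \psi &= g \ \text{on } \partial\Omega_2 \\ B &= \mu H & &\text{in } \Omega \\ 0 &= \nabla \cdot B, & n \cdot B &= h \ \text{on } \partial\Omega_1 \end{aligned}$$

Standard Dual Functional

$$\Theta(\psi) = \tfrac12 \int_\Omega (H_s - \nabla\psi) \cdot \mu (H_s - \nabla\psi) \, d\Omega + \int_{\partial\Omega_1} h \psi \, d(\partial\Omega) \tag{39}$$

with $\psi = g$ specified on $\partial\Omega_2$.

Complementary Dual Functional

$$\Xi(B) = -\tfrac12 \int_\Omega B \cdot \nu B \, d\Omega + \int_\Omega H_s \cdot B \, d\Omega - \int_{\partial\Omega_2} g \, n \cdot B \, d(\partial\Omega)$$

with $\nabla \cdot B = 0$ in $\Omega$ and $n \cdot B = h$ on $\partial\Omega_1$ enforced. If we replace $B$ by $\nabla \times A$ in this expression we arrive at

$$\Xi(A) = -\tfrac12 \int_\Omega (\nabla \times A) \cdot \nu (\nabla \times A) \, d\Omega + \int_\Omega H_s \cdot \nabla \times A \, d\Omega - \int_{\partial\Omega_2} g \, n \cdot (\nabla \times A) \, d(\partial\Omega) \tag{40}$$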
with $A = g$ specified on $\partial\Omega_1$. The structure of the magnetostatic field can be summarized by Fig. 9 and certain points now emerge. When we wish to solve a magnetostatic field problem, several choices are open to us. First we may choose to solve either the Primal Equation,

$$\nabla \times \nu \nabla \times A = J,$$

or we use the Dual Equation,

$$\nabla \cdot \mu (H_s - \nabla\psi) = 0$$

FIG. 9. The magnetostatic system.

Once this choice has been made we may then use the functionals associated with each equation. The choice is Eqs. (36) and (37) for the primal set, or Eqs. (39) and (40) for the dual set. All four forms will provide an answer but a further benefit is that when considered in pairs, as given above, these pairs provide solutions which bound the exact solution.

C. A Magnetostatic Example
We can illustrate this by means of an example. Only the primal set is considered, i.e. the functionals described by (36) and (37) will be extremized. This is done using the finite element method, for a series of uniform meshes with an increasing degree of refinement. This course is chosen because it is generally the case that an increase in mesh refinement will improve solution accuracy, and hence the convergence from above and below to the exact value of field energy should be easily seen. The problem involves a simple magnetic circuit as shown in Fig. 10. The figure represents one quarter of the complete system and gives the appropriate boundary conditions. The four meshes with varying degrees of refinement are shown in Fig. 11, whilst the solution fields are illustrated in Fig. 12. It should be noted that in this problem we have used a total scalar potential function, defined by $H = -\nabla\chi$, in the iron region to avoid the cancellation errors discussed by Simkin and Trowbridge (1979, 1980). The convergence of the functional values for the standard and complementary forms is shown in Fig. 13. The system energy is clearly bounded, and as expected, the error reduces with mesh refinement. The advantages are immediately recognizable too, for it is now possible to solve a relatively simple problem twice, using both functional forms, and average the answers. This will give an accurate value that would otherwise have required a much greater computational effort. The additional benefit of being able to quantify error is obviously highly desirable.

FIG. 10. A magnetostatic problem (regions labelled air and iron; boundary condition n × H = 0).

FIG. 11. Meshes used for finite element solutions. (a) 64 nodes. (b) 225 nodes. (c) 484 nodes. (d) 841 nodes.

FIG. 12. Solution fields. (a) ψ-potential. (b) χ-potential. (c) H field. (d) A-potential.

VII. THE ELECTROSTATIC FIELD
We can now consider the electrostatic field by noting that the following equations must be satisfied in the domain of the problem.
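$$\begin{aligned}
\text{(i)}\quad & \nabla \times \mathbf{E} = 0 && \text{compatibility} \\
\text{(ii)}\quad & \mathbf{D} = \varepsilon\mathbf{E} && \text{constitutive} \\
\text{(iii)}\quad & \nabla \cdot \mathbf{D} = \rho && \text{equilibrium}
\end{aligned} \qquad (41)$$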
where D is the electric flux density, E is the electric field intensity, $\rho$ is the charge density, and $\varepsilon$ is the permittivity. If we introduce an electric scalar potential $\phi$ by letting

$$\mathbf{E} = -\nabla\phi \qquad (42)$$

then this automatically satisfies the compatibility condition, since $-\nabla \times (\nabla\phi) = 0$. Similarly, introducing an electric vector potential C, such that

$$\mathbf{D} = \mathbf{D}_s + \nabla \times \mathbf{C} \qquad (43)$$

where

$$\nabla \cdot \mathbf{D} = \nabla \cdot \mathbf{D}_s + \nabla \cdot \nabla \times \mathbf{C} = \rho + 0$$

automatically satisfies the equilibrium condition. For two-dimensional problems C, like the magnetic vector potential, requires only the z component to model the system correctly. Following the detail presented earlier for magnetostatics, we see that the boundary conditions for the electrostatic problem are $\mathbf{n} \times \mathbf{E} = h$ and $\mathbf{n} \cdot \mathbf{D} = h$, which give $\phi = g$ and $\mathbf{C} = g$.
Also, inserting the electrostatic quantities into the general equations and functionals gives the following electrostatic forms.
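A. The Primal Set

Primal Canonical Equation

$$-\nabla\phi = \mathbf{E}, \qquad \mathbf{D} = \varepsilon\mathbf{E}, \qquad \rho = \nabla \cdot \mathbf{D} \qquad (43)$$

$$\phi = g \ \text{on}\ \partial R_1, \qquad \mathbf{n} \cdot \mathbf{D} = h \ \text{on}\ \partial R_2$$

and this generates the following pair of complementary functionals:

Standard Primal Functional

$$\Theta(\phi) = \ldots \int_R (-\nabla\phi) \cdot \varepsilon(-\nabla\phi)\,dR \;\ldots\; d(\partial R) \qquad (44)$$

with $\phi = g$ specified on $\partial R_1$.

Complementary Primal Functional

$$\Xi(\mathbf{C}) = \ldots \int_R (\mathbf{D}_s + \nabla \times \mathbf{C}) \cdot \frac{1}{\varepsilon}(\mathbf{D}_s + \nabla \times \mathbf{C})\,dR \;\ldots$$

with $\mathbf{C} = g$ specified on $\partial R_2$.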
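B. The Dual Set

As with the magnetostatic set, it is also possible to develop a dual system, which is classified by the equations given below.

Dual Canonical Equation

$$\mathbf{D} = \mathbf{D}_s + \nabla \times \mathbf{C}, \qquad \mathbf{E} = \frac{1}{\varepsilon}\mathbf{D}, \qquad 0 = \nabla \times \mathbf{E} \qquad \text{in } R$$

$$\mathbf{C} = g \ \text{on}\ \partial R_2, \qquad \mathbf{n} \times \mathbf{E} = h \ \text{on}\ \partial R_1$$

for which the following functionals are appropriate:

Standard Dual Functional

$$\Theta(\mathbf{C}) = \tfrac{1}{2}\int_R (\mathbf{D}_s + \nabla \times \mathbf{C}) \cdot \frac{1}{\varepsilon}(\mathbf{D}_s + \nabla \times \mathbf{C})\,dR \;\ldots\; \int \ldots\, \mathbf{C}\,d(\partial R) \qquad (47)$$

with $\mathbf{C} = g$ specified on $\partial R_2$.

Complementary Dual Functional

$$\Xi(\phi) = \ldots \int_R (-\nabla\phi) \cdot \varepsilon(-\nabla\phi)\,dR \;\ldots\; \int_R \mathbf{D}_s \cdot (\nabla\phi)\,dR \;\ldots\; \int g\,\mathbf{n} \times (-\nabla\phi)\,d(\partial R) \qquad (48)$$

with $\phi = g$ specified on $\partial R_1$.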
FIG. 14. The electrostatic system.

The full structure of electrostatic systems, together with its boundary conditions, is illustrated in Fig. 14. Again we see that we have two choices of system equation: the standard form,

$$\nabla \cdot (-\varepsilon\nabla\phi) = \rho$$

and the dual equation,

$$\nabla \times \frac{1}{\varepsilon}(\mathbf{D}_s + \nabla \times \mathbf{C}) = 0$$

Both of these yield pairs of functionals which give error bounds. The structure also illustrates the striking symmetry inherent in electric and magnetic fields. We shall see later how it is possible to combine Figs. 9 and 14 to provide a complete structure for the electromagnetic field. First, we illustrate the provision of error bounds in the electrostatic field by means of an example.

C. An Electrostatic Example
The problem considered here is that of a point to plane discharge. The system geometry, which is axisymmetric, is defined in Fig. 15, and as before a series of solutions were computed for a set of increasingly refined meshes. The solution fields produced using the two functionals generated by the standard form, given above, are illustrated in Fig. 16, whilst the energy convergence curves are given in Fig. 17. The provision of error bounds is once again clearly illustrated, and all of the conclusions drawn in the discussion of the magnetostatic problem solved in the previous section are obviously still appropriate.

FIG. 15. An electrostatic example. (Boundary labels include $\phi = 0$ and $\mathbf{n} \cdot \mathbf{D} = 0$.)
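The bounding behaviour described in both static examples can be sketched in a few lines of Python. This is only an illustrative helper, not the author's code: the class and function names are hypothetical, and the numbers in the usage note are invented purely to show the arithmetic. It assumes that one functional of a complementary pair bounds the exact energy from above and the other from below, so that the average of the two values differs from the exact value by at most half the gap between them.

```python
from dataclasses import dataclass


@dataclass
class BoundedEstimate:
    """Energy estimate bracketed by a complementary pair of functionals."""

    lower: float
    upper: float

    @property
    def mean(self) -> float:
        # Average of the two bounds, used as the working estimate.
        return 0.5 * (self.lower + self.upper)

    @property
    def half_gap(self) -> float:
        # If lower <= exact <= upper, then |exact - mean| <= half_gap.
        return 0.5 * (self.upper - self.lower)


def combine_bounds(standard_value: float, complementary_value: float) -> BoundedEstimate:
    """Combine the two functional values without assuming which one is the upper bound."""
    lower = min(standard_value, complementary_value)
    upper = max(standard_value, complementary_value)
    return BoundedEstimate(lower=lower, upper=upper)


def refinement_history(pairs):
    """Return (mean, half_gap) for each mesh in a refinement sequence.

    `pairs` is an iterable of (standard_value, complementary_value) tuples,
    one per mesh, ordered from coarsest to finest.
    """
    return [(e.mean, e.half_gap) for e in (combine_bounds(s, c) for s, c in pairs)]
```

For instance, `combine_bounds(1.05, 0.97)` (invented values) yields an estimate of about 1.01 with a guaranteed error of at most about 0.04, and `refinement_history` makes it easy to watch the half-gap shrink as the mesh is refined.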
VIII. THE ELECTROMAGNETIC FIELD

A. The Structure
In this section, we will show how time varying systems can also be incorporated into our general scheme, and how our schematic representation can be extended to accommodate this additional complexity. Much of the general structure proposed here first appeared in Penman and Fraser (1984), published in Proc. IEE (Pt. A), 131, 1984. When the sources of the electric and magnetic fields vary with time, the two fields become coupled. This is expressed in the time varying forms of Maxwell's Equations,

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \qquad (49)$$

and

$$\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} \qquad (50)$$

FIG. 16. Solution fields. (a) $\phi$-potential. (b) C-potential. (c) Orthogonal nature of potentials in charge free regions. (d) E field.
We can extend (49), for the sake of symmetry, to include a magnetic current $\boldsymbol{\Phi}$. This may be regarded as the transport of free poles, which will of course always be zero. Its inclusion does, however, allow us to write

$$\nabla \times \mathbf{E} = \boldsymbol{\Phi} - \frac{\partial \mathbf{B}}{\partial t} \qquad \text{or} \qquad \nabla \times \mathbf{E} = \boldsymbol{\Phi} + \boldsymbol{\Phi}_d \qquad (51)$$

where $\boldsymbol{\Phi}_d$, the magnetic displacement current, is $-\partial \mathbf{B}/\partial t$. Also, we define $\boldsymbol{\Phi}_t = \boldsymbol{\Phi} + \boldsymbol{\Phi}_d$.

Equation (50) is often expressed in the form

$$\nabla \times \mathbf{H} = \mathbf{J} + \mathbf{J}_d \qquad (52)$$

with $\mathbf{J}_d$, the electric displacement current, equal to $\partial \mathbf{D}/\partial t$, and we may also let $\mathbf{J}_t = \mathbf{J} + \mathbf{J}_d$. It is now simple to include time variation in our diagrammatic representation. Consider for the moment the top loop of Fig. 9 combined with the lower half of Fig. 14. We can connect these diagrams through the mechanism of time differentiation by forming the three-dimensional diagram shown in Fig. 18, which assumes for simplicity that no free charges or poles are present. It may be seen that the links joining the 'electric field plane' to the 'magnetic field plane' are not space differentials but time differentials, and that moving from the electric to the magnetic field implies the operation $\partial/\partial t$, whilst moving in the reverse sense implies the adjoint of this operator, $-\partial/\partial t$.

FIG. 18. The time linkages in the electromagnetic field.

We can extend our representation further by noting that we previously split the magnetic field H into two vectors thus,

$$\mathbf{H} = \mathbf{H}_s - \nabla\psi \qquad (53)$$

and we may compare this with the usual expression for the non-conservative electric field,

$$\mathbf{E} = -\frac{\partial \mathbf{A}}{\partial t} - \nabla\phi \qquad (54)$$

Clearly we may write,

$$\mathbf{H}_s = \frac{\partial \mathbf{C}}{\partial t} \qquad (55)$$
and this confirms our earlier choice of the name electric vector potential for the quantity C. These additional relationships allow us to expand Fig. 18 to the fuller form shown in Fig. 19. For completeness, we have added the links between the electric current and field, and the corresponding magnetic variables. The result is a striking symmetry of structure in which our previously developed notions of equilibrium and compatibility, and primal, dual, and complementary still hold. It also shows that when we solve electromagnetic field problems we are effectively solving both compatibility, or both equilibrium equations, whereas for static fields only one need be considered.
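B. The Primal and Dual Equations

With reference to Fig. 19 we can identify the following principal relationships.

(i) Compatibility Equations

$$\nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{E} = \boldsymbol{\Phi}_t = -\frac{\partial \mathbf{B}}{\partial t} \qquad \text{in the absence of free poles}$$

(ii) Constitutive Equations

$$\mathbf{B} = \mu\mathbf{H}, \qquad \mathbf{D} = \varepsilon\mathbf{E}$$

(iii) Equilibrium Equations

$$\nabla \cdot \mathbf{D} = \rho, \qquad \nabla \times \mathbf{H} = \mathbf{J}_t = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}$$

FIG. 19. The electromagnetic system.

From this set of expressions we can now construct the primal and dual equations. The Primal form is developed in the following way: if B is defined by $\nabla \times \mathbf{A} = \mathbf{B}$, then

$$\nabla \cdot \mathbf{B} = \nabla \cdot \nabla \times \mathbf{A} \equiv 0,$$

i.e. $\nabla \cdot \mathbf{B} = 0$ is satisfied. Also,

$$\nabla \times \mathbf{E} = \nabla \times (-\nabla\phi) + \nabla \times \left(-\frac{\partial \mathbf{A}}{\partial t}\right) = 0 - \frac{\partial \mathbf{B}}{\partial t}$$

Thus both the compatibility equations are satisfied. The constitutive relationships may be written as,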
and the equilibrium conditions as,
Equations (56) and (57) may now be combined with the expressions for B and E,

$$\begin{bmatrix} \mathbf{B} \\ -\mathbf{E} \end{bmatrix} = \begin{bmatrix} \nabla\times & 0 \\ \dfrac{\partial}{\partial t} & \nabla \end{bmatrix} \begin{bmatrix} \mathbf{A} \\ \phi \end{bmatrix} \qquad (58)$$

to give the Primal equation,

Examination of Eq. (59) shows that it has precisely the general form we developed earlier, i.e.

$$T^{*}M(Tu + u_s) = p$$

in which $u_s = 0$. This implies that the standard solution to the primal form in electromagnetic systems is in A and $\phi$, and it is a direct equilibrium method since it satisfies (58) and directly solves Eq. (57). The complementary primal form is therefore in H and D and is an indirect compatibility method, which solves Eq. (58). We shall develop complementary functionals that achieve these two solutions later. First, for completeness, we develop the dual equation for the electromagnetic system. We begin by remembering that D has been defined as,
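$$\mathbf{D} = \mathbf{D}_s + \nabla \times \mathbf{C} \qquad (60)$$

It is therefore simple to satisfy the condition

$$\nabla \cdot \mathbf{D} = \nabla \cdot \mathbf{D}_s = \rho$$

by direct integration. Also since,

$$\nabla \times \mathbf{H} = \nabla \times \mathbf{H}_s + \nabla \times (-\nabla\psi) = \nabla \times \left(\frac{\partial \mathbf{C}}{\partial t}\right) \qquad (61)$$

then using (60) and (61) we have

$$\nabla \times \mathbf{H} = \frac{\partial}{\partial t}(\mathbf{D} - \mathbf{D}_s) = \frac{\partial \mathbf{D}}{\partial t} - \frac{\partial \mathbf{D}_s}{\partial t}$$

and if the time variation of $\mathbf{D}_s$ is such that

$$\frac{\partial \mathbf{D}_s}{\partial t} = -\mathbf{J}$$

then $\nabla \times \mathbf{H} = \mathbf{J} + \partial \mathbf{D}/\partial t$ is also satisfied. This means that if we ensure that $\nabla \cdot \mathbf{D}_s = \rho$ and $\partial \mathbf{D}_s/\partial t = -\mathbf{J}$, then the equilibrium conditions are automatically satisfied.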
The remaining conditions are,
L
J
and
[:]
=
a at
The above equations may be combined to yield the Dual equation,
Again we see that it has the general form given above, and is therefore amenable to our variational treatment. Note too that posing the problem in $(\mathbf{C}, \psi)$ is a direct compatibility method giving the standard dual form, whilst using (B, E) is the indirect equilibrium approach, for the complementary dual form. The structure is thus preserved for electromagnetic systems. We can still develop a standard or a dual partial differential equation, and each of them yields two functionals which provide error bounds. In the specific cases of Eqs. (59) and (64), the extraction of appropriate functionals is a significant task. As previously, in the case of the static systems, we shall only consider the standard form, and limit our practical illustrations to two-dimensional steady-state eddy current problems.

C. Complementary Functionals for Eddy Current Systems
We begin by returning to the primal Eq. (59), for which it is easy to show that, if there are no free charges and no displacement currents, it reduces to the usual form of the diffusion equation in A,

$$\nabla \times \nu(\nabla \times \mathbf{A}) = -\sigma\frac{\partial \mathbf{A}}{\partial t} + \mathbf{J} \qquad (65)$$

together with the condition,

$$\nabla \cdot \mathbf{A} = 0$$

Furthermore, in the absence of eddy currents, that is, in a region of zero conductivity, it yields the usual form of the wave equation. And, in the absence of time variation, it reduces to the standard magnetostatic and electrostatic partial differential equations. This highlights the fact that Eq. (59) is very general indeed, but we now limit our attention to the particular case where Eqs. (65) and (66) are applicable. Furthermore, we will restrict ourselves to situations where the vector potential and the impressed current may be defined by,

$$\mathbf{A} = A\mathbf{k} \qquad \text{and} \qquad \mathbf{J} = J\mathbf{k},$$

where k is the unit vector in the z-direction. For a steady-state sinusoidal analysis

$$A(x, y, t) = \hat{A}(x, y)\cos(\omega t + \theta(x, y))$$

and

$$J_s(x, y, t) = \hat{J}_s(x, y)\cos(\omega t)$$

where $\hat{A}$ and $\hat{J}_s$ represent the peak values of A and J, respectively, $\omega$ is the angular frequency of the current, and $\theta$ is a phase angle. Equation (65) can now be more conveniently represented in complex number notation. That is, we use the identity $\cos(\omega t) = \tfrac{1}{2}(e^{j\omega t} + e^{-j\omega t})$ and substitute it in (65) to give

$$\nabla \times \nu\nabla \times \hat{\mathbf{A}}^{c} + j\omega\sigma\hat{\mathbf{A}}^{c} = \hat{\mathbf{J}}_s, \qquad \nabla \times \nu\nabla \times \hat{\mathbf{A}}^{c*} - j\omega\sigma\hat{\mathbf{A}}^{c*} = \hat{\mathbf{J}}_s$$

where $\hat{A}^{c} = \hat{A}e^{j\theta}$ is a complex function, and $\hat{A}^{c*}$ denotes the complex conjugate of $\hat{A}^{c}$. If the following notation to denote real and imaginary components is used:

$$\hat{\mathbf{A}}^{c} = \hat{A}^{c}\mathbf{k} = u' + ju'', \qquad \hat{\mathbf{A}}^{c*} = \hat{A}^{c*}\mathbf{k} = u' - ju'', \qquad \hat{\mathbf{J}}_s = \hat{J}_s\mathbf{k} = f' + j0,$$

then (66) may be rewritten as

This can of course be expressed in our operator form as

$$T^{*}MTu = p(u)$$
Here $\bar{u}$ is an ordered pair of the real functions $(u', u'')$, and they replace the complex vector potential $\hat{A}^{c}$. With homogeneous boundary conditions, this operator equation can be written in its canonical form as

$$T\bar{u} = \bar{v}, \qquad M\bar{v} = \bar{w}, \qquad T^{*}\bar{w} = p(\bar{u}) \qquad \text{in } \Omega \qquad (68)$$

$$\bar{u} = 0 \ \text{on}\ \partial\Omega_1, \qquad \bar{w} = 0 \ \text{on}\ \partial\Omega_2$$

We see that Eq. (66) has therefore been transformed into a form to which complementary variational principles can be applied, since it is identical in form to Eq. (8). The physical significance of T, M, and p can again be highlighted, thus.

Standard and complementary functionals for (68) can therefore be constructed in the manner described in Section 4" and 4D. The resulting functionals are:
which becomes

$$\Xi(\mathbf{H}) = -\tfrac{1}{2}\int \mu\,\mathbf{H}' \cdot \mathbf{H}'\,dR \;\ldots\; \int \mu\,\mathbf{H}'' \cdot \mathbf{H}''\,dR \;\ldots\; \int \mathbf{J} \cdot (\nabla \times \mathbf{H}'')\,dR \qquad (73)$$

The functional represented by expression (72) is the usual form for the two-dimensional eddy current problem, and together with expression (73) it provides a complementary pair of functionals which bound the exact value of the functional from above and below.

D. An Electromagnetic Field Example
In this type of problem we have a further variable to consider, namely that of frequency. We have therefore solved the problem illustrated in Fig. 20 for the range of finite element discretizations also shown in this figure, and for a spread of frequencies. By splitting the functionals into real and imaginary parts it is possible to equate these components to the stored magnetic energy and the energy dissipated. This, in turn, allows the inductance and resistance to be calculated for each condition. The variation of each of these parameters with mesh refinement and frequency is shown in Figs. 21 and 22. It can be seen that bounded solutions are still provided, but the absolute accuracy also depends on how well the mesh models the skin depth. A small number of elements in the iron region clearly results in an increased error for each individual solution as the frequency increases. Nevertheless, the accuracy of the average value is still good under these conditions. This can result in much improved computer utilization, for it allows a greater degree of confidence to be placed in results obtained from relatively coarse meshes. For completeness, the computed real and imaginary components of the standard potential, A, are shown in Figs. 23 and 24, for each frequency, at the highest level of mesh discretization.
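The two practical points in this example, how circuit parameters follow from the energies and why the mesh must resolve the skin depth, can be sketched as follows. The formulas used are the standard circuit and field relations rather than anything quoted from the text: with an RMS current I, the time-averaged stored energy is W = L I²/2 and the mean dissipated power is P = R I², and the skin depth is δ = sqrt(2 / (ω μ σ)). All function names are hypothetical, and the RMS convention is an assumption (the text does not state which convention was used).

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, H/m


def skin_depth(frequency_hz: float, conductivity: float, relative_permeability: float = 1.0) -> float:
    """Classical skin depth delta = sqrt(2 / (omega * mu * sigma)), in metres."""
    omega = 2.0 * math.pi * frequency_hz
    return math.sqrt(2.0 / (omega * MU_0 * relative_permeability * conductivity))


def inductance_from_energy(mean_stored_energy: float, rms_current: float) -> float:
    """Inductance from time-averaged magnetic energy, assuming W = L * I_rms**2 / 2."""
    return 2.0 * mean_stored_energy / rms_current ** 2


def resistance_from_loss(mean_power: float, rms_current: float) -> float:
    """Resistance from mean dissipated power, assuming P = R * I_rms**2."""
    return mean_power / rms_current ** 2


def elements_per_skin_depth(element_size: float, frequency_hz: float,
                            conductivity: float, relative_permeability: float = 1.0) -> float:
    """Rough mesh-quality indicator: how many elements of this size span one skin depth."""
    return skin_depth(frequency_hz, conductivity, relative_permeability) / element_size
```

As a sanity check on the skin-depth formula, copper (conductivity taken as 5.8e7 S/m) at 50 Hz gives a depth of roughly 9.3 mm. Because δ falls as the inverse square root of frequency, a mesh that is adequate at low frequency can span less than one skin depth per element at high frequency, which is consistent with the growth in individual-solution error described above.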
IX. CONCLUDING REMARKS

In the preceding pages we have presented a more or less self-contained development of how complementary variational methods can be applied to problems in electromagnetism. A general structure has been introduced which leads to a powerful schematic representation of the interrelationships among the various field quantities. Examples of the application of the method via the finite element method have demonstrated its utility and power, and may hopefully provoke others to find yet wider areas of application for these techniques.

FIG. 20. An electromagnetic problem. (a) Problem definition. (b) 93 nodes. (c) 238 nodes. (d) 471 nodes.
FIG. 21. Bounds on system inductance as a function of mesh discretization and frequency.

Variation of resistance with frequency. (Horizontal axis: frequency, Hz.)
ACKNOWLEDGMENTS

The author would like to warmly acknowledge the debt he owes, in the development of this work, to Dr. J. R. Fraser and Dr. M. D. Grieve. They both contributed massively to the thoughts and substance contained in this article through their labours as graduate students in the Faculty of Engineering at the University of Aberdeen. The U.K. Science and Engineering Research Council is also due thanks for financially supporting a large amount of this work.
REFERENCES

Arthurs, A. M. and Anderson, N. (1970). “Bounds for capacities in a microwave filter problem”, 28, No. 3, pp. 259-262.
Fraeijs de Veubeke, B. M. (1964). “Upper and lower bounds in matrix structural analysis”, AGARDograph, Pergamon Press, 72, pp. 165-192.
Hammond, P. (1981). “Energy methods in electromagnetism”, Pergamon Press.
Hammond, P. (1976). “Physical basis of the variational method for the computation of magnetic field problems”, Compumag-76, Rutherford Laboratories, Oxford, pp. 28-34.
Hammond, P. and Penman, J. (1978). “Calculation of eddy currents by dual energy methods”, Proc. IEE, 125, No. 7, pp. 701-708.
Hammond, P. and Penman, J. (1976). “Calculation of inductance and capacitance by means of dual energy principles”, Proc. IEE, 123, No. 6, pp. 554-559.
Lanczos, C. (1970). “The variational principles of mechanics”, Univ. of Toronto Press, 4th Edition.
Noble, B. (1966). “Complementary variational principles for boundary value problems”, Report 473, Mathematics Research Centre, Univ. of Wisconsin.
Oden, J. T. and Reddy, J. N. (1974). “On dual complementary variational principles in mathematical physics”, Int. J. of Engineering Science, 12, pp. 1-29.
Oden, J. T. and Reddy, J. N. (1976). “Variational methods in theoretical mechanics”, Springer-Verlag.
Penman, J. and Fraser, J. R. (1982). “Complementary and dual energy finite element principles in magnetostatics”, Trans. IEEE, MAG-18, No. 2, pp. 319-324.
Penman, J. and Fraser, J. R. (1984). “Unified approach to problems in electromagnetism”, Proc. IEE, 131, No. 1, pp. 55-61.
Prager, W. and Synge, J. L. (1947). “Approximations in elasticity based on the concept of function space”, Quart. of App. Math., 5, No. 3, pp. 241-269.
Simkin, J. and Trowbridge, C. W. (1979). “On the use of the total scalar potential in the numerical solution of field problems in electromagnetics”, Int. J. Num. Meth. Eng., 14, pp. 423-440.
Simkin, J. and Trowbridge, C. W. (1980). “Three dimensional nonlinear electromagnetic field computations using scalar potentials”, Proc. IEE, 127, No. 6, pp. 368-374.
Tonti, E. (1972). “On the mathematical structure of a large class of physical theories”, Accad. Naz. dei Lincei, LII, Series III, pp. 48-56.
Turner, M. J., Clough, R. W., Martin, H. C., Topp, J. L. (1956). “Stiffness and deflection analysis of complex structures”, J. of Aeronautical Sciences, 23, No. 9, pp. 805-824.
Vainberg, M. M. (1973). “Variational methods and methods of monotone operators in the theory of non-linear equations”, John Wiley.