ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS VOLUME 89
EDITOR-IN-CHIEF
PETER W. HAWKES CEMES/Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique, Toulouse, France
ASSOCIATE EDITOR
BENJAMIN KAZAN Xerox Corporation Palo Alto Research Center Palo Alto, California
ACADEMIC PRESS
San Diego New York Boston London Sydney Tokyo Toronto
This book is printed on acid-free paper.
Copyright © 1994 by ACADEMIC PRESS, INC. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press, Inc. A Division of Harcourt Brace & Company 525 B Street, Suite 1900, San Diego, California 92101-4495
United Kingdom Edition published by
Academic Press Limited
24-28 Oval Road, London NW1 7DX

International Standard Serial Number: 0065-2539
International Standard Book Number: 0-12-014731-9

PRINTED IN THE UNITED STATES OF AMERICA
94 95 96 97 98 99 BC 9 8 7 6 5 4 3 2 1
CONTENTS

CONTRIBUTORS
PREFACE

Digital Techniques in Electron Off-Axis Holography
G. ADE
I. Introduction
II. Electron Off-Axis Holography
III. Problems of Off-Axis Holography
IV. Examples of Applications
V. Conclusions and Future Prospects
References

Optical Symbolic Substitution Architectures
M. S. ALAM AND M. A. KARIM
I. Introduction
II. Optical Symbolic Substitution
III. Coding Techniques
IV. Signed-Digit Arithmetic Using OSS
V. OSS Architectures
VI. Limitations and Challenges
References

Semiconductor Quantum Devices
MARC CAHAY AND SUPRIYO BANDYOPADHYAY
I. Introduction
II. Quantum Devices
III. Resonant Tunneling Devices
IV. Aharonov-Bohm Effect-Based Devices
V. T-Structure Transistors
VI. Electron Wave Directional Couplers
VII. Spin Precession Devices
VIII. Granular Electron Devices
IX. Connecting Quantum Devices on a Chip: The Interconnecting Problems
X. Quantum-Coupled Architectures and Quantum Chips
XI. Epilogue: The Long-Term Prognosis
References

Fuzzy Relations and Applications
BERNARD DE BAETS AND ETIENNE KERRE
I. Introduction to Fuzzy Set Theory
II. Fuzzy Relational Calculus
III. Special Types of Fuzzy Relations
IV. Applications of Triangular Compositions
V. Fuzzy Inference Mechanisms
References

Basis Algorithms in Mathematical Morphology
RONALD JONES AND IMANTS D. SVALBE
I. Introduction
II. Basis Algorithms
III. Applying the General Basis Algorithm
IV. Filtering Properties and the Basis Representation
V. Translation-Invariant Set Mappings
VI. Gray-Scale Function Mappings
VII. Transforming the Basis Representation
VIII. Conclusion
Appendix
References

Mirror-Bank Energy Analyzers
S. P. KARETSKAYA, L. G. GLICKMAN, L. G. BEIZINA, AND YU. V. GOLOSKOKOV
I. Introduction
II. Equations for Charged Particle Trajectories in an Electrostatic Field Having a Symmetry Plane
III. Peculiarities of Charged Particle Focusing and Energy Separation in a Mirror with a Two-Dimensional Electric Field
IV. Energy Analyzers Based on Mirrors with Two-Plate Electrodes Separated by Direct Slits
V. Peculiarities of Charged Particle Focusing and Separation in Energy in a Transaxial Mirror
VI. Energy Analyzers Based on Transaxial Mirrors with Two-Plate Electrodes
VII. Conclusion
References

INDEX
CONTRIBUTORS

Numbers in parentheses indicate the pages on which the authors' contributions begin.

G. ADE (1), Physikalisch-Technische Bundesanstalt, D-38116 Braunschweig, Germany

M. S. ALAM (53), Department of Engineering, Purdue University, Fort Wayne, Indiana 46805

SUPRIYO BANDYOPADHYAY (93), Department of Electrical Engineering, University of Notre Dame, Notre Dame, Indiana 46556

L. G. BEIZINA (391), Laboratory of Mass-Spectroscopy, Institute of Nuclear Physics, National Nuclear Center, Republic of Kazakhstan, Alma-Ata 480082, Kazakhstan

MARC CAHAY (93), Department of Electrical and Computer Engineering, University of Cincinnati, Cincinnati, Ohio 45221

BERNARD DE BAETS (255), Department of Applied Mathematics and Computer Science, University of Gent, 9000 Gent, Belgium

L. G. GLICKMAN (391), Laboratory of Mass-Spectroscopy, Institute of Nuclear Physics, National Nuclear Center, Republic of Kazakhstan, Alma-Ata 480082, Kazakhstan

YU. V. GOLOSKOKOV (391), Laboratory of Mass-Spectroscopy, Institute of Nuclear Physics, National Nuclear Center, Republic of Kazakhstan, Alma-Ata 480082, Kazakhstan

RONALD JONES¹ (325), Department of Physics, Monash University, Melbourne, Victoria 3168, Australia

S. P. KARETSKAYA (391), Laboratory of Mass-Spectroscopy, Institute of Nuclear Physics, National Nuclear Center, Republic of Kazakhstan, Alma-Ata 480082, Kazakhstan

M. A. KARIM (53), Center for Electro-Optics, University of Dayton, Dayton, Ohio 45469

ETIENNE KERRE (255), Department of Applied Mathematics and Computer Science, University of Gent, 9000 Gent, Belgium

IMANTS D. SVALBE (325), Department of Physics, Monash University, Melbourne, Victoria 3168, Australia

¹ Present address: CSIRO Division of Mathematics and Statistics, Locked Bag 17, North Ryde, NSW 2113, Australia.
PREFACE
The range of topics explored in this volume typifies the ambitions of these Advances, with chapters on electron holography, electron device physics, fuzzy sets, optical computers, and a branch of image processing.

The first chapter illustrates the interplay between technological development and pure science. Gabor invented holography to cure a defect of electron lenses, but the method could not be used in practice until coherent sources became available. Now that it is being employed in electron optics as well as with light, it has become apparent that digital rather than optical reconstruction is preferable, indeed indispensable for the more exacting tasks, and the necessary computing power has only recently become generally available. It is this latest installment of the electron holography story that is related here by G. Ade.

In the second chapter, in which M. S. Alam and M. A. Karim describe the architectures of optical computers using optical symbolic substitution, we might be accused of opening the gates to the enemy, for electronic and optical computers have been seen as rivals. We, however, share the views of A. Lohmann, active for many years in the field of optical computing, that they are complementary and believe that it is important for each community to be aware of progress in the other. This extremely clear account of an important aspect of optical information processing is very welcome.

With the extreme miniaturization that has been achieved over the past few years, a family of semiconductor devices that depend on the quantum mechanical properties of charge carriers has come into being. In their detailed account of semiconductor quantum devices, M. Cahay and S. Bandyopadhyay present the underlying physics and the technological features of these miniature elements. With this chapter, which may properly be regarded as a short monograph on the subject, we reaffirm the vocation of the series to review new developments in electron physics.
Uncertainty, imprecision, and the qualitative nature of many scientific statements: these are hallmarks of science as soon as we move away from abstractions toward practical measurement and decision-making. Fuzzy set theory was introduced some 30 years ago in order to provide a respectable mathematical structure in terms of which these vague statements could be expressed more clearly. The subject has now progressed far from its theoretical beginnings and is used in many widely different fields; it undoubtedly has something to offer in many others into which it has not yet penetrated. The very readable account by B. De Baets and E. Kerre of fuzzy relations and their applications will perhaps awaken wider interest in these methods and complements an earlier survey by S. K. Pal.

The branch of image processing that is known as mathematical morphology has now come of age, with regular courses, textbooks, and computer packages available. Nevertheless, the subject is still young and numerous aspects of it are far from fully explored. In their contribution to this volume, R. Jones and I. D. Svalbe give a full account of an important and difficult problem, namely, the computation of bases in terms of which a broad class of mappings can be represented. Thanks to their introductory material and to the examples that illustrate their account, they manage to demystify the subject and ease the reading of their solution to this knotty problem.

We conclude with a very full account by S. P. Karetskaya, L. G. Glickman, L. G. Beizina, and Yu. V. Goloskokov of a type of energy analyzer that has been developed and extensively studied in their institute in Alma-Ata. These analyzers exploit the properties of a particular kind of reflecting element, which has likewise been thoroughly examined in Russia and Kazakhstan, and several monographs on these instruments have been published in Russian. I feel sure that this extended account in English will be welcomed by the mass-analyzer and mass-spectrometer communities. I am most grateful to these authors for the trouble they have taken to make their work accessible to readers of these Advances.

As usual, I conclude with a list of forthcoming contributions, in which it will be seen that the number of surveys in the field of image science, image processing in particular, continues to be appreciable. This has been a deliberate policy on my part for some years now and, with the full approval of Academic Press, a new name has been chosen for the series that reflects this trend: from the next volume, the series will be entitled Advances in Imaging and Electron Physics. The Preface to Volume 90 will contain more details.
FORTHCOMING ARTICLES

Group invariant Fourier transform algorithms: Y. Abdelatif and colleagues
Nanofabrication: H. Ahmed
Use of the hypermatrix: D. Antzoulatos
Image processing with signal-dependent noise: H. H. Arsenault
The Wigner distribution: M. J. Bastiaans
Parallel detection: P. E. Batson
Hexagon-based image processing: S. B. M. Bell
Microscopic imaging with mass-selected secondary ions: M. T. Bernius
Nanoemission: Vu Thien Binh
Metareasoning in image interpretation: P. Bottoni and P. Mussio
Magnetic reconnection: A. Bratenahl and P. J. Baum
Sampling theory: J. L. Brown
ODE methods: J. C. Butcher
The artificial visual system concept: J. M. Coggins
Projection methods for image processing: P. L. Combettes
Minimax algebra and its applications: R. A. Cuninghame-Green
Corrected lenses for charged particles: R. L. Dalglish
The development of electron microscopy in Italy: G. Donelli
Space-time algebra and electron physics: C. Doran and colleagues
The study of dynamic phenomena in solids using field emission: M. Drechsler
Gabor filters and texture analysis: J. M. H. Du Buf
Group algebra in image processing: D. Eberly
Miniaturization in electron optics: A. Feinerman
Crystal aperture STEM: J. T. Fourie
The critical-voltage effect: A. Fox
Physical information and electron physics: B. R. Frieden
Amorphous semiconductors: W. Fuhs
Stack filtering: M. Gabbouj
Median filters: N. C. Gallagher and E. Coyle
Bayesian image analysis: S. Geman and D. Geman
RF tubes in space: A. S. Gilmour
Mirror electron microscopy: R. Godehardt
Relativistic microwave electronics: V. L. Granatstein
Rough sets: J. W. Grzymala-Busse
The quantum flux parametron: W. Hioe and M. Hosoya
The de Broglie-Bohm theory: P. Holland
Contrast transfer and crystal images: K. Ishizuka
Seismic and electrical tomographic imaging: P. D. Jackson and colleagues
Morphological scale-space operations: P. Jackway
Algebraic approach to the quantum theory of electron optics: R. Jagannathan and S. Khan
Electron holography in conventional and scanning transmission electron microscopy: F. Kahl and H. Rose
Quantum neurocomputing: S. Kak
Applications of speech recognition technology: H. R. Kirby
Spin-polarized SEM: K. Koike
Sideband imaging: W. Krakow
Highly anisotropic media: C. M. Krowne
High-definition television: M. Kunt
Regularization: A. Lannes
Near-field optical imaging: A. Lewis
SEM image processing: N. C. MacDonald
Electronic tools in parapsychology: R. L. Morris
Image formation in STEM: C. Mory and C. Colliex
The Growth of Electron Microscopy: T. Mulvey (ed.)
Phase retrieval: N. Nakajima
Phase-space treatment of photon beams: G. Nemes
Image plate: T. Oikawa and N. Mori
New developments in electron diffraction theory: L. M. Peng
Z-contrast in materials science: S. J. Pennycook
Electron scattering and nuclear structure: G. A. Peterson
Multislice theory of electron lenses: G. Pozzi
The wave-particle dualism: H. Rauch
Electrostatic lenses: F. H. Read and I. W. Drummond
Scientific work of Reinhold Rudenberg: H. G. Rudenberg
Electron holography: D. Saldin
X-ray microscopy: G. Schmahl
Accelerator mass spectroscopy: J. P. F. Sellschop
Applications of mathematical morphology: J. Serra
Set-theoretic methods in image processing: M. I. Sezan
Texture analysis: H. C. Shen
Parallel image processing and image algebra: H. Shi, G. X. Ritter, and J. N. Wilson
Focus-deflection systems and their applications: T. Soma
Information measures: I. J. Taneja
New developments in ferroelectrics: J. Toulouse
Orientation analysis: K. Tovey
The Suprenum project: O. Trottenberg
Knowledge-based vision: J. K. Tsotsos
Electron gun optics: Y. Uchikawa
Very high resolution electron microscopy: D. van Dyck
Spin-polarized SEM: T. R. van Zandt and R. Browning
Morphology on graphs: L. Vincent
Cathode-ray tube projection TV systems: L. Vriens, T. G. Spanjer, and R. Raue
Canonical aberration theory: J. Ximen
Image enhancement: P. Zamperoni
Signal description: A. Zayezdny and I. Druckmann
The Aharonov-Casher effect: A. Zeilinger, E. Rasel, and H. Weinfurter
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 89

Digital Techniques in Electron Off-Axis Holography

G. ADE

Physikalisch-Technische Bundesanstalt, D-38116 Braunschweig, Germany
I. Introduction
   A. Why Holography?
   B. Principle of Holography
   C. A Brief Historical Review
   D. Choice of Method
   E. The Problem of Phase Detection
II. Electron Off-Axis Holography
   A. Generation of the Hologram
   B. Light-Optical Reconstruction of the Object
   C. Digital Reconstruction without Aberration Corrections
   D. Techniques for Displaying Phase Distributions
   E. Phase Amplification Techniques
   F. Digital Reconstruction Including Aberration Correction
III. Problems of Off-Axis Holography
   A. The Effect of Limited Coherence
   B. Noise Problems
   C. Problems of Hologram Recording
IV. Examples of Applications
   A. Thickness Measurement
   B. Study of Dynamical Phase Effects
   C. Crystal Defects
V. Conclusions and Future Prospects
References
I. INTRODUCTION

The objective of this article is to give an overview of the present state of the art in electron off-axis holography. For completeness and a better understanding of the subject, an outline of the principle of holography and a brief historical review are given. Examples of applications are presented and future prospects are touched upon. Although there is no substitute for reading the original papers, no attempt has been made to cover the literature comprehensively. For some topics, reference has been made to review articles rather than to original papers.
A. Why Holography?

The importance of electron microscopy is based on the fact that it provides direct information on the microscopic structures of the object. Generally, the electron wave leaving the object is modulated in both amplitude $A$ and phase $\Phi$. This wave is transformed into an image wave determined by the functions $A_i$ and $\Phi_i$, which do not agree with $A$ and $\Phi$ because of the influence of the aberrations of the objective lens. For a full determination of the object structures, both functions $A_i$ and $\Phi_i$ must be known. Unfortunately, only the information on the amplitude $A_i$ can be obtained from a conventional electron micrograph. To visualize the phase information, the phase $\Phi_i$ must be converted into a bright-dark distribution, which, in principle, can be achieved by Fourier filtering. This leads, however, to a deterioration of the image and in particular to a limitation of resolution. Problems of this kind are thoroughly treated by the contrast transfer theory of the electron microscope (see, e.g., Hanszen, 1971).

The difficulty in obtaining phase information can be overcome by means of holography. As was shown by Gabor (1948, 1949, 1951), it is always possible to recover the full information about $A_i$ and $\Phi_i$ and also to correct for aberrations by supplementing the image formation in the electron microscope by a light-optical reconstruction step. Electron holography thus provides an excellent means of quantitatively determining the complex object function of interest without sacrificing the resolution. The reconstruction of this function makes it possible to investigate its amplitude and phase components by employing various methods such as dark-field imaging, differential contrast, and holographic interferometry.

B. Principle of Holography
Holography is normally understood as a two-step imaging technique. In fact, it is a three-step rather than a two-step method. This is because the recording process, which represents an intermediate stage between the hologram-formation step and the reconstruction step, plays an important role in determining the quality of the reconstructed image. Because of the nonlinearity of the photographic process, for example, reliable reconstructions can only be obtained when certain conditions are fulfilled (Ade, 1980, 1982a). This and other problems connected with the hologram recording step are discussed in Section III,C.

In the first step, an interference pattern is produced by superposing the object wave with a reference wave. If we denote the complex object wave in an observation plane by $\psi_o$ and the reference wave by $\psi_r$, the intensity distribution of the interference pattern can be written as

$$I_{\mathrm{int}} = |\psi_r + \psi_o|^2 = |\psi_r|^2 + |\psi_o|^2 + \psi_r^*\psi_o + \psi_r\psi_o^*, \tag{1.1}$$
where the symbol * represents a complex conjugation. The first two terms on the right-hand side depend on the intensity of the reference and object waves, while the last two terms depend on the amplitude and phase of these waves.

The second step consists of recording the interference pattern (1.1). For simplicity, we assume here that it is linearly recorded on a photographic plate. The amplitude transmittance $T_a$ of the developed transparency, the hologram, is then given by

$$T_a = T_b + q\left(|\psi_o|^2 + \psi_r^*\psi_o + \psi_r\psi_o^*\right), \tag{1.2}$$
where $T_b$ is a "bias" and $q$ is a constant factor. Once the amplitude and phase information on the object wave have been recorded, it only remains to reconstruct this wave.

In the reconstruction step, the hologram is illuminated by a coherent light wave $\psi_l$. The light transmitted by the transparency is then

$$\psi_l T_a = \psi_l T_b + q\,\psi_l|\psi_o|^2 + q\,\psi_l\psi_r^*\psi_o + q\,\psi_l\psi_r\psi_o^*. \tag{1.3}$$
It can be easily recognized that if $\psi_l$ is a duplication of the original reference wave $\psi_r$, the third term of Eq. (1.3) is, up to a constant factor, an exact duplication of the original object wave. This term leads to the so-called main image. Furthermore, if $\psi_l$ is chosen as the conjugate of the reference wave, the fourth term becomes proportional to the complex conjugate of the object wave, which leads to the conjugate or twin image. Thus, it can be concluded that full information on the object wave can indeed be reconstructed from the hologram. However, the problem of whether or not the reconstructed wave can be efficiently separated from the other terms occurring simultaneously in Eq. (1.3) remains. The solution to this problem is discussed in the following section.
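The bookkeeping of Eqs. (1.1)-(1.3) can be checked numerically. The following Python sketch is an illustration only, under stated assumptions: a one-dimensional grid, a unit-amplitude tilted plane wave as reference, and arbitrary values for the object wave, the bias, and the factor q. None of these numbers is taken from the text, and the function names are hypothetical. For an illuminating wave equal to the reference wave, it confirms that the third term of Eq. (1.3) reproduces the object wave up to the constant factor q.

```python
import numpy as np


def hologram_terms(psi_o, psi_r):
    """Return the four terms on the right-hand side of Eq. (1.1)."""
    return (np.abs(psi_r) ** 2,
            np.abs(psi_o) ** 2,
            np.conj(psi_r) * psi_o,
            psi_r * np.conj(psi_o))


def transmittance(psi_o, psi_r, t_b, q):
    """Eq. (1.2): linear recording; the reference intensity is absorbed in t_b."""
    _, i_obj, main, twin = hologram_terms(psi_o, psi_r)
    # main + twin is twice the real part of main, so the sum is real.
    return t_b + q * (i_obj + main + twin).real


def reconstruction_terms(psi_l, psi_o, psi_r, t_b, q):
    """Return the four terms of Eq. (1.3) separately."""
    return (psi_l * t_b,
            q * psi_l * np.abs(psi_o) ** 2,
            q * psi_l * np.conj(psi_r) * psi_o,   # main image term
            q * psi_l * psi_r * np.conj(psi_o))   # twin image term


# Assumed, purely illustrative test data.
x = np.linspace(-1.0, 1.0, 2048)
amplitude = 1.0 - 0.3 * np.exp(-(x / 0.2) ** 2)
phase = 0.8 * np.exp(-(x / 0.3) ** 2)
psi_o = amplitude * np.exp(1j * phase)      # object wave
psi_r = np.exp(2j * np.pi * 40.0 * x)       # tilted plane reference wave
t_b, q = 0.5, 0.1                           # assumed bias and slope

# Eq. (1.1): the intensity equals the sum of the four terms.
intensity = np.abs(psi_r + psi_o) ** 2
print(np.allclose(intensity, sum(hologram_terms(psi_o, psi_r)).real))

# Eq. (1.3) with psi_l = psi_r: the third term is q * |psi_r|^2 * psi_o.
bias, halo, main, twin = reconstruction_terms(psi_r, psi_o, psi_r, t_b, q)
print(np.allclose(main, q * psi_o))
print(np.allclose(bias + halo + main + twin,
                  psi_r * transmittance(psi_o, psi_r, t_b, q)))
```

In this sketch the four reconstruction terms overlap in position; whether they can be separated depends on the geometry of the reference wave, which is the question taken up next.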
C. A Brief Historical Review

As already mentioned, the essence of holography is to produce an interference pattern between a reference wave and the wave modulated by the object. If the reference wave coincides with the object wave, an in-line hologram is produced. A major disadvantage of this holographic technique is that the disturbing twin image that is always produced in the reconstruction step cannot be separated from the main image. The effect of the twin image can be reduced to a reasonable degree, however, if the hologram is formed under Fraunhofer conditions (Thompson, 1965; De Velis et al., 1966). The realization of this concept in electron microscopy was performed by Tonomura et al. (1968a,b). A detailed description of this method has been given by Hanszen (1970, 1982).

The other technique for avoiding the disturbance caused by the twin image is tilted-beam or off-axis holography, introduced by Leith and Upatnieks (1962, 1967). In this method the reference wave is inclined with respect to the image-forming wave. If the hologram is formed in a plane that is conjugate to the object plane, the technique is called image-plane off-axis holography. Such holograms are particularly relevant because the effect of spatial coherence on resolution is the least in this case. This and other holographic techniques are described in detail in many articles on holography (see, e.g., Collier et al., 1971; Menzel et al., 1973). A review of holography with an extended list of references has been given by Hawkes (1978) as part of an article on partial coherence.

In electron holography and interferometry, several authors have examined methods for recording holograms and discussed the problems encountered in the reconstruction of the original object wave (for a full review, see, e.g., Hanszen, 1971, 1982, 1986; Wade, 1980; Missiroli et al., 1981; Tonomura, 1986, 1987; Lichte, 1991a). Experimental work on electron holography was first carried out by Haine and Mulvey (1952, 1953) and Hibi (1956). These experiments did not lead to the success expected, however, because the coherence requirements for the electron beam were so severe that recording the hologram with the conventional sources available required very long exposure times. Consequently, the practical resolution of the hologram was limited by mechanical vibrations, specimen-stage drift, and magnetic stray fields.

In modern electron microscopes, these problems have practically been overcome. Because of the employment of more efficient electron guns, especially field emission guns, the exposure times necessary to record high-resolution holograms can be drastically reduced. However, although image resolutions of about 0.2 nm can be readily obtained with the electron microscope, the resolutions of holographic reconstructions are usually found to be relatively low. As we shall see later in Sections II,F and III,B, this is mainly due to the influence of aberrations and noise. Thus, Gabor's objective of visualizing atoms at a resolution of about 0.1 nm in holographic reconstructions has not yet been realized. In this context, it should be mentioned that resolutions of 0.1 nm or less are not only of academic interest, but also of great importance in a variety of technical fields, such as materials science and modern electronic devices.
D. Choice of Method

Several methods have already been employed in electron holography. We concentrate here on the most important method, image-plane off-axis holography, suggested by Weingärtner et al. (1971) and realized and thoroughly investigated by Wahl (1974, 1975) using a Möllenstedt-type electrostatic biprism¹ (Möllenstedt and Düker, 1956). In contrast to the in-line holography method, in which only weak objects can be reliably reconstructed, off-axis holography also enables the reconstruction of strong objects. After the preliminary experiments by Möllenstedt and Wahl (1968), the method was successfully applied by Tomita et al. (1970, 1971, 1972). Further experiments were performed by Wahl (1974, 1975). A considerable improvement in the quality of reconstructed images was achieved by Tonomura et al. (1979b,c), Lauer (1982a), and Lichte (1982) using a field emission gun. As a result of this improvement, interest in electron holography has increased, and many applications have been reported which demonstrate the potential of this method. A full review of these investigations can be found in articles by Hanszen (1982, 1986) and Tonomura (1986, 1987). These investigations are, however, confined to the region of medium resolutions, since the reconstructed images were obtained without lens aberration corrections. The present attempts (see, e.g., Lichte, 1991a,b, 1992a,b, 1993) are therefore aimed at solving this problem.
¹ A different electron holography method without the use of a biprism has been investigated by Lauer (1981, 1984).

E. The Problem of Phase Detection

As already mentioned, holography provides full information on the amplitude and phase structures of the object. However, there remains the problem of how to detect the amplitude and phase distributions separately. Until recently, this problem was solved by performing the reconstruction step in a light-optical interferometer (Tonomura et al., 1979a; Hanszen, 1980). A comprehensive review of this subject including practical applications can be found in articles by Hanszen (1982, 1986), Tonomura (1986, 1987), and Lichte (1991a). Generally, the light-optical reconstruction is beset by many difficulties, such as the nonlinearity of the photographic process (Ade, 1980, 1982a), photographic noise (Hanszen and Ade, 1983, 1984; Hanszen, 1983), and imperfections of the optical elements. These problems can be totally avoided, however, if the hologram is recorded electronically and the reconstruction process is performed digitally by means of a computer (Takeda and Ru, 1985; Franke et al., 1986; Lichte, 1986; Yatagai et al., 1987; Ade and Lauer, 1988, 1990). Further investigations have been carried out by Lichte (1991a,b, 1992a,b, 1993), Ade and Lauer (1991, 1992a,b), Matsuda et al. (1991), and Ru et al. (1991).

Digital techniques are finding increasing application in almost all areas of electron microscopy, ranging from computer control of the instrumental functions to the processing and visual display of the final image. Modern image-processing systems are usually equipped with fast processors which allow the complex Fourier transform of an input signal or the corresponding modified output signal to be determined within a relatively short time, a few seconds in most cases. Furthermore, they are extremely flexible with regard to arbitrary manipulations in both real and Fourier spaces, such as contrast enhancement, signal mixing, edge extraction, and filtering. But because of the limited number of pixels (usually up to 512 x 512) and grey levels (≤ 255) in most image-processing systems, the detection of small phase shifts on the order of 2π/100, which are of great interest in practical work on an atomic scale, is still a difficult problem.

Digital techniques which lead to a high degree of phase amplification and thus to a high sensitivity in phase detection (cf. Ade and Lauer, 1990, 1992a,b; Lauer and Ade, 1990) are discussed in Section II,E. The detailed theoretical background of the digital phase reconstruction and amplification methods is given there and experimental results are presented which demonstrate that these methods can be effectively used to detect very small phases of the just-mentioned order. In high-resolution work, some problems such as the processing of a large number of pixels within a reasonable time, elimination of lens aberrations, and reduction of the influence of electron noise must be overcome; see Sections II,F and III,A-C. Recent attempts at solving these problems are discussed by Fu et al. (1991) and Lichte (1991a,b, 1992a,b, 1993). For a limited field of view, commercial systems with image formats of 512 x 512 pixels or smaller can be efficiently used to obtain the desired reconstructions of the object in real time.

II. ELECTRON OFF-AXIS HOLOGRAPHY
Since holography was first introduced by Gabor (1948), several types have been realized. As already mentioned in Section I,D, we shall concentrate here on the image-plane off-axis holography method (Leith and Upatnieks, 1962; Weingärtner et al., 1971), in which a slightly tilted wave is used as a reference. For convenience, the term "image-plane" will be omitted from now on.
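Digital reconstruction of an off-axis hologram is commonly carried out by Fourier filtering. As a rough orientation only, the following Python sketch shows the generic sideband procedure for a one-dimensional, noise-free hologram: transform the hologram, isolate one sideband, shift it to the origin, and transform back. The function names, the carrier frequency, the mask radius, and the test object are assumptions chosen for illustration; they are not taken from the work cited here, and aberrations, noise, and the finite number of grey levels are all ignored.

```python
import numpy as np


def make_hologram(psi_o, carrier):
    """Intensity of object wave plus tilted reference wave, as in Eq. (1.1).

    The reference is a unit-amplitude plane wave with `carrier` fringes
    across the field (an assumed, idealized reference).
    """
    n = psi_o.size
    psi_r = np.exp(-2j * np.pi * carrier * np.arange(n) / n)
    return np.abs(psi_r + psi_o) ** 2


def reconstruct_sideband(hologram, carrier, radius):
    """Recover the band-limited object wave from one sideband.

    carrier: integer position of the sideband in FFT bins.
    radius:  half-width of the mask in FFT bins (assumed to cover the
             object spectrum without touching the central band).
    """
    n = hologram.size
    spectrum = np.fft.fft(hologram)
    k = np.fft.fftfreq(n, d=1.0 / n)                 # integer bin indices
    sideband = np.where(np.abs(k - carrier) <= radius, spectrum, 0.0)
    centred = np.roll(sideband, -carrier)            # move sideband to origin
    return np.fft.ifft(centred)


# Assumed smooth test object: weak absorption and a small phase shift.
n = 1024
pos = np.arange(n) - n // 2
amplitude = 1.0 - 0.3 * np.exp(-(pos / 60.0) ** 2)
phase = 0.5 * np.exp(-(pos / 80.0) ** 2)
psi_o = amplitude * np.exp(1j * phase)

hologram = make_hologram(psi_o, carrier=200)
psi_rec = reconstruct_sideband(hologram, carrier=200, radius=40)

print("max amplitude error:", np.max(np.abs(np.abs(psi_rec) - amplitude)))
print("max phase error:    ", np.max(np.abs(np.angle(psi_rec) - phase)))
```

The mask radius is the main design choice in such a scheme: it must be wide enough to pass the object spectrum, yet the carrier frequency must be high enough that the sideband stays clear of the central band, which is one reason the number of available pixels matters.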
A . Generation of the Hologram
Electron holograms are obtained in the electron microscope by means of an electrostatic biprism (Mollenstedt and Diiker, 1956) inserted between the back focal plane (aperture plane) of the objective lens and the intermediate image plane. As shown schematically in Fig. la, the object under investigation is illuminated by a coherent electron wave. The part of this wave covering the area on the other side of the optical axis is used as a reference wave. When a voltage is applied to the filament of the biprism, the object a Electrons Object Hologram
[Fig. 1 schematic labels: electrons, object, objective lens, aperture plane, aperture, filament, biprism, image plane, hologram.]
FIGURE 1. Schematic arrangements for the recording and reconstruction steps in electron off-axis holography. (a) The object and reference waves are brought to interference in an intermediate image plane by means of an electrostatic biprism. The resulting fringe pattern is recorded at high magnification as a hologram in the final image plane. (b) The incident light beam produces three diffracted beams behind the hologram. Only one of the first-order beams is used for reconstruction. If no reference beam is employed, only the amplitude information on the object can be obtained from the reconstructed image.
and reference waves overlap, producing an interference pattern which can be recorded either photographically or by electronic means. This recorded pattern is called a hologram. The object can be generally described by a complex function of the form
$$F(\mathbf{x}) = A(\mathbf{x}) \exp[i\Phi(\mathbf{x})], \tag{II.1}$$
where $A(\mathbf{x})$ and $\Phi(\mathbf{x})$ represent the amplitude and phase components at the position $\mathbf{x} = (x, y)$ in the object plane. Taking third-order aberrations of the imaging system into account (Ade, 1978, 1981), the wave function of the image wave in the hologram recording plane can be written as
$$\psi_H = \psi_{QH}\, F_H(\mathbf{x}), \tag{II.2}$$
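where $\psi_{QH}$ represents the wave function of the illuminating wave, and the complex function $F_H(\mathbf{x})$ is given by
$$F_H(\mathbf{x}) = \int_{-\infty}^{\infty} \tilde{F}(\mathbf{R})\, P(\mathbf{R}, \mathbf{x}) \exp(i2\pi\mathbf{R}\mathbf{x})\, d^2R. \tag{II.3}$$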
As is usual in imaging theory, the vector $\mathbf{x} = \mathbf{x}_H/m$ is used to describe the position $\mathbf{x}_H$ in the hologram plane; $m$ is the actual magnification. The function $\tilde{F}(\mathbf{R})$ occurring in the integral on the right-hand side of this equation is the Fourier transform (spectrum) of the object function,² and the two-dimensional vector $\mathbf{R} = (R_x, R_y)$ describes the spatial frequencies of the object. The aberrations have the effect of multiplying the object spectrum by the complex “amplitude transfer function”
$$P(\mathbf{R}, \mathbf{x}) = a(\mathbf{R}) \exp[iW(\mathbf{R}, \mathbf{x})], \tag{II.4}$$
where $a(\mathbf{R})$ is the aperture function and $W$ is the reduced wave aberration. The latter depends not only on the spatial frequencies of the object, but also on the position $\mathbf{x}$ in the hologram plane. The aberration coefficients occurring in $W$ are generally complicated functions of magnification, pupil position, and defocusing (Ade, 1978, 1982b; Hawkes, 1968a,b, 1970). Simple relations exist only for parallel object illumination and small defocus values, as is usually the case in off-axis holography.
² The tilde is used here to indicate the Fourier transform of the function under consideration.
By means of Eq. (45) of Ade (1978) and Eqs. (14), (24), and (25) of Ade (1982b), the
wave aberration $W$ can then be written as
$$W(\mathbf{R}, \mathbf{x}) = -\frac{\Delta z}{2}\lambda R^2 + \frac{C_s}{4}\lambda^3 R^4 + [\text{isotropic off-axial terms}] + [\text{anisotropic off-axial terms}]\,(R_y x - R_x y), \tag{II.5a}$$
where $\Delta z$ is the defocus, $C_s$ the spherical aberration coefficient, $\lambda$ the wavelength of the imaging electrons, and $M'$ the magnification of the objective lens. The off-axial aberration coefficient $F_i/M'$ represents isotropic coma, $C_i/M'$ isotropic astigmatism, $D_i/M'$ field curvature, and $E_i/M'$ isotropic distortion. Furthermore, the coefficients $f_r/M'$, $c_r/M'$, and $e_r/M'$ correspondingly represent anisotropic coma, anisotropic astigmatism, and anisotropic distortion. Because of the off-axial aberrations, the imaging conditions in the electron microscope are generally found to be non-isoplanatic. For a detailed study of the effect of non-isoplanatism in micrographs taken in an electron microscope equipped with a field emission gun, see, e.g., the article by Hanszen et al. (1985a). As discussed by Ade (1978), the condition of isoplanatism is nearly satisfied if the contribution of every off-axial aberration is small compared with $2\pi$. In this way, the size of the axial isoplanatic region can easily be estimated. It is found that the radius of this region depends not only on the numerical values of the aberration coefficients, but also on the resolution limit considered. For simplicity, the discussion will be restricted here to the pure isoplanatic case where only spherical aberration and slight defocusing are taken into account. In this case, the full expression for the wave aberration reduces to the well-known form
$$W(R) = -\frac{\Delta z}{2}\lambda R^2 + \frac{C_s}{4}\lambda^3 R^4. \tag{II.5b}$$
As can be deduced from Eq. (II.3), the image function $F_H$ completely agrees with the original object function $F$ if the transfer function $P(\mathbf{R}, \mathbf{x})$ is made equal to unity. In this ideal case, the amplitude and phase components $A_H$ and $\Phi_H$ are identical with the corresponding components $A$ and $\Phi$ of the object function. As discussed by Wahl (1975) and Ade (1982a), the action of the biprism can be taken into account by multiplying the spectrum of the electron wave
of interest by the complex phase function (II.6), where the plus and minus signs apply to the reference and the image wave, respectively. The first term in the exponent describes the tilt of each wave, and the second its shift by one-half of the width $D = |\mathbf{D}|$ of the interference field, cf. Section II,F. The vector $\mathbf{Q}$ is determined by the direction of the illumination wave in the object plane. The value $\mathbf{Q} = 0$ corresponds to axial illumination. Assuming axial illumination and ideal coherence³ between the reference and the image wave, the normalized intensity distribution in the overlap region is found to be
$$H(\mathbf{x}) = 1 + A_H^2(\mathbf{x}) + 2A_H(\mathbf{x})\cos[2\pi\mathbf{R}_c\mathbf{x} - \Phi_H(\mathbf{x})], \tag{II.7}$$
where the carrier frequency $R_c = |\mathbf{R}_c|$ is determined by the mutual tilt of the two waves.⁴ Equation (II.7) shows that the interference pattern consists of a cosinusoidal fringe system that is modulated in contrast by $A_H(\mathbf{x})$ and in position by $\Phi_H(\mathbf{x})$. If we assume a linear photographic recording of this pattern, we may consider Eq. (II.7) as representing the amplitude transmittance of the hologram.
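As a rough numerical illustration of Eq. (II.7), the following one-dimensional Python sketch builds the fringe pattern for an invented test object. The pixel count, the carrier of 32 fringes per field, and the Gaussian phase bump are assumptions chosen only for illustration; none of these values is taken from the text.

```python
import math

def hologram_intensity(amplitude, phase, carrier, x):
    """Normalized intensity of Eq. (II.7) at position x (one dimension)."""
    return (1.0 + amplitude ** 2
            + 2.0 * amplitude * math.cos(2.0 * math.pi * carrier * x - phase))

# Hypothetical test object: unit amplitude and a Gaussian phase bump 1 rad high.
N = 256                 # number of pixels (assumed)
R_C = 32.0 / N          # carrier frequency in cycles per pixel (assumed)
phase = [math.exp(-((x - N / 2) / 20.0) ** 2) for x in range(N)]
hologram = [hologram_intensity(1.0, phase[x], R_C, x) for x in range(N)]
```

For unit amplitude the intensity swings between 0 and 4, and the phase bump shows up only as a local displacement of the fringes, which is the "modulation in position" referred to above.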
B. Light-Optical Reconstruction of the Object
The object information is usually determined from the hologram by light-optical reconstruction using one or both first diffraction orders of the hologram. Generally, all filtering operations that are desirable but not feasible in the electron microscope can be easily performed in the light-optical step. To display the phases, all the methods available in coherent light optics, such as bright-field, dark-field, phase-contrast imaging, and holographic interferometry, can be used. In the following, only the last-mentioned method will be briefly discussed. For reconstruction purposes, the hologram is illuminated by a collimated laser beam, and the resulting light distribution in the hologram plane is Fourier transformed by means of a lens (Fig. 1b). The desired diffraction order is selected with the aid of a diaphragm, and the corresponding reconstructed image is recorded in the image plane of the hologram.
³ The effect of partial coherence is discussed in Section III,A.
⁴ Note that all coordinates and tilt angles refer to the object side.
Unfortunately, the recorded image does not carry the important phase
FIGURE 2. Mach-Zehnder interferometer for light-optical reconstruction of the object wave from off-axis holograms (see Hanszen, 1980; Hanszen and Ade, 1983).
information on the object, since only the intensity of the reconstructed wave can be recorded. In order to detect the phase information, the image wave must be superposed with a properly adjusted reference wave. The first holographic reconstructions of this kind were reported by Wahl (1974, 1975). For the same purpose, a Mach-Zehnder interferometer (Fig. 2) was used by Tonomura et al. (1979a) and Hanszen (1980); see also Hanszen (1983), Hanszen et al. (1983), and Hanszen and Ade (1983, 1984). Because of the difficulty of constructing reconstruction lenses that fulfill all the necessary requirements (Ade, 1981), the problem of obtaining the highest resolution by eliminating lens aberrations in the light-optical reconstruction step has not yet been satisfactorily solved. The problem of eliminating lens aberrations by digital means will be discussed in Section II,F. In the following, we shall ignore the influence of aberrations and thus restrict the discussion to the case of medium resolutions. In this case, the functions $A_H$ and $\Phi_H$ in Eq. (II.7) can be replaced by the corresponding functions $A$ and $\Phi$ of the object. If we use Eq. (II.1) and the exponential representation of the cosine, Eq. (II.7) takes the form
$$\begin{aligned} H(\mathbf{x}) &= 1 + A^2(\mathbf{x}) + 2A(\mathbf{x})\cos[2\pi\mathbf{R}_c\mathbf{x} - \Phi(\mathbf{x})] \\ &= 1 + |F(\mathbf{x})|^2 + F(\mathbf{x})\exp(-i2\pi\mathbf{R}_c\mathbf{x}) + F^*(\mathbf{x})\exp(+i2\pi\mathbf{R}_c\mathbf{x}), \end{aligned} \tag{II.8}$$
where the last two conjugate complex terms represent the main and the twin image, respectively. If the hologram is coherently illuminated by a collimated laser beam, as schematically shown in Fig. 2, the light distribution in the hologram plane is then, up to a constant factor, given by
Eq. (II.8), and the corresponding two-dimensional Fourier transform (FT) is found to be
$$\tilde{H}(\mathbf{R}) = \delta(\mathbf{R}) + \mathrm{FT}[A^2(\mathbf{x})] + \tilde{F}(\mathbf{R}_c + \mathbf{R}) + \tilde{F}^*(\mathbf{R}_c - \mathbf{R}). \tag{II.9}$$
Besides the zero diffraction order (delta function) and the intermodulation spectrum (second term), two sidebands representing the Fourier spectrum of the object wave and its conjugate are seen to occur in the back focal plane of the Fourier transforming lens. If the carrier frequency $R_c$ is sufficiently high, the two sidebands will not overlap with the intermodulation term, and they can therefore be selected without affecting the resolution of the reconstructed image. This means that $R_c$ should be at least three times as large as the highest spatial frequency $R_e$ to be resolved in the image. If only one sideband, the +1 in Fig. 2, is selected by means of an aperture (cf. Hanszen and Ade, 1983) and a properly adapted reference wave is employed,⁵ we obtain as the reconstructed image an interferogram with the normalized intensity distribution
$$\Gamma_I(\mathbf{x}) = 1 + A^2(\mathbf{x}) + 2A(\mathbf{x})\cos[2\pi\mathbf{R}_I\mathbf{x} - \Phi(\mathbf{x})], \tag{II.10}$$
where $|1/\mathbf{R}_I|$ represents the fringe distance of the resulting interference fringes. The right-hand side of this equation is seen to be equivalent to that of the hologram (II.8). Thus, the interferometric reconstruction does not provide more information than the hologram, but facilitates the visual interpretation and extraction of the desired information on the object, since the fringe distances and the azimuth of the fringes can be chosen to meet specific needs. The phase information is encoded in the interference pattern and can be determined from the bends of the fringes. Further details and other techniques for obtaining this information have been given elsewhere (see, e.g., Hanszen, 1982; Hanszen et al., 1983).
⁵ This can be achieved by correctly tilting the mirror M in Fig. 2.

C. Digital Reconstruction without Aberration Corrections
As already mentioned, all the problems encountered in the light-optical reconstruction step can be avoided if the reconstruction process is performed digitally. For simplicity, we shall ignore the influence of aberrations in the following considerations. This problem is discussed in Section II,F. As in the preceding section, the hologram can then be described by means of Eq. (II.7) with $A_H$ and $\Phi_H$ being replaced by $A$ and $\Phi$; see Eq. (II.8). We now assume that the hologram is read out digitally into an image-processing system.
1. Interferometric-Type Reconstruction
The explicit calculation of the amplitude and phase functions of the unknown object from the digitized hologram requires the use of an efficient image-processing system and quite sophisticated software for the data processing and a suitable display of the final image data. Before we deal with this problem in the next section, a simple procedure is now described which leads to an “interferometric” reconstruction similar to that given by Eq. (II.10) and which requires only a few manipulations in the real and Fourier spaces. Experimental results and an outline of the method have been reported by Ade and Lauer (1988, 1990). A similar approach has recently been made by Ru et al. (1992). The desired interferometric reconstruction of the complex object function can be accomplished in three steps:
a. Preprocessing. In the first step, a filtered version of the hologram is produced by complex Fourier transformation of the digitized hologram data and subsequent inverse Fourier transformation of the data corresponding to the two sidebands around $\pm\mathbf{R}_c$ in Fourier space; see Fig. 3b. For this purpose, a high-pass filtering mask with two circular regions centered at $\pm\mathbf{R}_c$ is employed. In this way, background variations and nonlinear terms caused by the hologram recording process are removed, as are hologram distortions with low frequencies and noise terms whose frequency components occur outside the filter pass-areas. The resulting filtered hologram can then be described by the function
$$H_f(\mathbf{x}) = b + 2A(\mathbf{x})\cos[2\pi\mathbf{R}_c\mathbf{x} - \Phi(\mathbf{x})], \tag{II.11}$$
where $b$ represents a constant bias term. Comparison with the first line of Eq. (II.8) shows that the term $A^2$ has been removed by this filtering operation.
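The preprocessing step can be sketched in one dimension as follows. This is only an illustration under stated assumptions: a naive discrete Fourier transform stands in for the fast Fourier processor, the mask becomes two intervals around the carrier bins instead of two circular regions, and the test object, `CARRIER_BIN`, and `radius` are hypothetical choices. The filtered output has zero mean; a constant bias $b$ would be added afterwards for display.

```python
import cmath
import math

def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / n if inverse else 1.0
    return [scale * sum(signal[j] * cmath.exp(sign * 2j * math.pi * ((j * k) % n) / n)
                        for j in range(n))
            for k in range(n)]

def sideband_filter(hologram, carrier_bin, radius):
    """Keep only the bins within `radius` of +/- carrier_bin, then transform back."""
    n = len(hologram)
    spectrum = dft(hologram)
    for k in range(n):
        signed_k = k if k <= n // 2 else k - n      # signed frequency index
        if abs(abs(signed_k) - carrier_bin) > radius:
            spectrum[k] = 0.0
    return [value.real for value in dft(spectrum, inverse=True)]

# Hypothetical hologram: A = 1, so Eq. (II.8) reads 2 + 2 cos(...).
N, CARRIER_BIN = 128, 32
phase = [0.8 * math.exp(-((x - N / 2) / 12.0) ** 2) for x in range(N)]
hologram = [2.0 + 2.0 * math.cos(2 * math.pi * CARRIER_BIN * x / N - p)
            for x, p in enumerate(phase)]
filtered = sideband_filter(hologram, CARRIER_BIN, radius=16)
```

With these values the filtered signal agrees closely with $2A\cos[2\pi R_c x - \Phi(x)]$, i.e. the background term $1 + A^2$ has been removed while the fringe term is preserved.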
b. Employment of a Reference Fringe System. To obtain a reconstruction with a given fringe orientation and a prespecified fringe distance, the azimuthal positions and the mutual separation of the aforementioned sidebands must be properly adjusted. This can be done by employing a cosinusoidal fringe system of the form
$$s(\mathbf{x}) = 1 + 2p\cos(2\pi\mathbf{R}_\beta\mathbf{x}), \tag{II.12}$$
where $p$ is a constant, and $|1/\mathbf{R}_\beta|$ represents the fringe distance.
⁶ In an earlier version of this method (Ade and Lauer, 1988), digital subtraction was used instead of multiplication.
If we multiply⁶ the filtered hologram (II.11) by the fringe system (II.12) and
apply a two-dimensional Fourier transformation to the resulting “moiré image”, we then obtain for the corresponding spectrum (Ade and Lauer, 1991) the expression
$$\begin{aligned} \tilde{H}_{\mathrm{mod}}(\mathbf{R}) = {} & \delta(\mathbf{R}) + p\,\delta(\mathbf{R}_c - \mathbf{R}_s + \mathbf{R}) + p\,\delta(\mathbf{R}_c - \mathbf{R}_s - \mathbf{R}) \\ & + \tilde{F}(\mathbf{R}_c + \mathbf{R}) + \tilde{F}^*(\mathbf{R}_c - \mathbf{R}) \\ & + p\tilde{F}(2\mathbf{R}_c - \mathbf{R}_s + \mathbf{R}) + p\tilde{F}^*(2\mathbf{R}_c - \mathbf{R}_s - \mathbf{R}) \\ & + p\tilde{F}(\mathbf{R}_s + \mathbf{R}) + p\tilde{F}^*(\mathbf{R}_s - \mathbf{R}), \end{aligned} \tag{II.13}$$
where $\mathbf{R}_s = \mathbf{R}_c - \mathbf{R}_\beta$. The first three terms occurring on the right-hand side of this equation represent the Fourier spectrum of the fringe system (II.12), and the two following terms represent the sidebands corresponding to the spectrum of the object and its conjugate. As can be easily recognized, the residual terms of Eq. (II.13) represent laterally shifted sidebands which are centered at $\pm(2\mathbf{R}_c - \mathbf{R}_s)$ and $\pm\mathbf{R}_s$. By a proper choice of the spatial frequency $\mathbf{R}_\beta$ of the reference fringe system, and hence of the difference frequency $\mathbf{R}_s$, the sidebands given by the last two terms of (II.13) can be made to occur near the origin of the spectrum where they can easily be selected for reconstruction purposes; see Fig. 3c.

c. Reconstruction. In the final step, a low-pass filtering mask is used to select the aforementioned sidebands around $\pm\mathbf{R}_s$ [last two terms in Eq. (II.13)] and to remove all other unwanted spectral terms; cf. Fig. 3c. Inverse Fourier transformation of the corresponding data leads directly to the result
$$\Gamma(\mathbf{x}) = 2pA(\mathbf{x})\cos[2\pi\mathbf{R}_s\mathbf{x} - \Phi(\mathbf{x})], \tag{II.14}$$
which can be immediately displayed as an “interferometric” reconstruction of the form
$$\Gamma_1(\mathbf{x}) = b_1 + 2pA(\mathbf{x})\cos[2\pi\mathbf{R}_s\mathbf{x} - \Phi(\mathbf{x})] \tag{II.15}$$
by including a constant bias $b_1$, or as
$$\Gamma_2(\mathbf{x}) = |2pA(\mathbf{x})\cos[2\pi\mathbf{R}_s\mathbf{x} - \Phi(\mathbf{x})]| \tag{II.16}$$
by determining the absolute value of the Fourier-transformed data. The equation for the $n$th dark fringe in these reconstructions is given by
$$2\pi\mathbf{R}_s\mathbf{x} - \Phi(\mathbf{x}) = (2n + 1)\pi; \qquad n = 0, \pm 1, \pm 2, \ldots \tag{II.17a}$$
in the case of $\Gamma_1$, and by
$$2\pi\mathbf{R}_s\mathbf{x} - \Phi(\mathbf{x}) = (2n + 1)\pi/2 \tag{II.17b}$$
FIGURE 3. Digital interferometric reconstruction of a latex sphere (Ade and Lauer, 1991). (a) Electron off-axis hologram. (b) Corresponding Fourier-transform amplitude (spectrum). (c) Fourier spectrum of the moiré image resulting from the multiplication of the filtered hologram with a sinusoidal fringe system. The sidebands of interest now occur in the middle section of the micrograph. (d), (e) Interferometric reconstructions according to Eqs. (II.15) and (II.16). The result (f) is obtained by applying thresholding and contrast enhancement techniques to the reconstruction (e). The fringes occurring in these reconstructions indicate object positions of equal phase. Their bends reveal the thickness profile of the object.
in the case of $\Gamma_2$. Thus, the fringe distance in the reconstruction $\Gamma_2$ is seen to be one-half of that in $\Gamma_1$, in accordance with the results shown in Figs. 3d and 3e. To demonstrate the performance of the method, a latex sphere was chosen as a test object. An off-axis hologram of this object was produced in the electron microscope and recorded on a photographic plate. The hologram was scanned with a TV camera and the video signal was digitized in a 512 × 512 raster with an 8-bit gray scale by means of an image-processing system equipped with a fast Fourier processor. The hologram and the digital reconstructions are given in Fig. 3. The interference fringes occurring in the reconstructions shown in Figs. 3d and 3e indicate positions of equal phase, and the bends of the fringes indicate local variations of the object phase function. Since the phase is directly proportional to the object thickness in our example (see also Section IV,A), the thickness profile of the test object is thus made visible by holographic reconstruction. Comparison of the reconstruction results of Figs. 3d and 3e shows that the latter is more appropriate for a quantitative determination of $\Phi$ because
of the sharpness of the fringes and the enhanced phase sensitivity inherent in this type of reconstruction. As can be deduced directly from Eq. (II.16), the dark fringes occurring in the reconstruction (e) indicate an amplified phase distribution with an amplification rate of two. The sharpness and detectability of the fringes in the reconstructions of type (II.16) can be appreciably improved by employing thresholding and contrast-enhancement techniques, as demonstrated in Fig. 3f. Further sensitivity enhancements can be achieved by applying other phase amplification methods (Ade and Lauer, 1990, 1992a,b), which will be discussed in the following section.

2. Digital Calculation of Amplitude and Phase
We now assume that an efficient image-processing system with fast Fourier transform capabilities and sufficient memory capacities is used for reconstruction. If we perform a two-dimensional Fourier transformation of the digitized hologram data, the resulting spectrum can be described by an equation of the form (II.9). The desired object information is stored in both sidebands of the spectrum which represent the Fourier transform of the object and its conjugate. To reconstruct the object function it is therefore sufficient to use only one of these sidebands. We choose here the sideband around $-\mathbf{R}_c$ because it leads directly to the original object function. This function can be simply reconstructed by isolating the sideband under consideration, shifting it to the origin of the Fourier space, and performing an inverse Fourier transformation of the corresponding data.⁷ The resulting complex function $F(\mathbf{x})$ can then be analyzed by displaying its amplitude
$$A(\mathbf{x}) = \{[\mathrm{Re}(F)]^2 + [\mathrm{Im}(F)]^2\}^{1/2} \tag{II.18a}$$
and phase
$$\Phi(\mathbf{x}) = \arctan[\mathrm{Im}(F)/\mathrm{Re}(F)] \tag{II.18b}$$
as reconstructed “amplitude” and “phase” images. The signs of both the numerator and denominator in (II.18b) must be taken into account to obtain phase values ranging from $-\pi$ to $+\pi$.
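The reconstruction just described (isolate the sideband around $-\mathbf{R}_c$, shift it to the origin, transform back, then evaluate amplitude and phase) can be sketched in one dimension as below. The naive transform, the test object, and the names `reconstruct_object`, `CARRIER_BIN`, and `radius` are assumptions made for this illustration. The two-argument arctangent `math.atan2` is one way to respect the signs of numerator and denominator, so that the phase falls in the range $-\pi$ to $+\pi$.

```python
import cmath
import math

def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / n if inverse else 1.0
    return [scale * sum(signal[j] * cmath.exp(sign * 2j * math.pi * ((j * k) % n) / n)
                        for j in range(n))
            for k in range(n)]

def reconstruct_object(hologram, carrier_bin, radius):
    """Isolate the sideband around -R_c, shift it to the origin, transform back."""
    n = len(hologram)
    spectrum = dft(hologram)
    centred = [0j] * n
    for offset in range(-radius, radius + 1):
        centred[offset % n] = spectrum[(offset - carrier_bin) % n]
    wave = dft(centred, inverse=True)
    amplitude = [abs(v) for v in wave]                     # Eq. (II.18a)
    phase = [math.atan2(v.imag, v.real) for v in wave]     # Eq. (II.18b)
    return amplitude, phase

# Hypothetical weak object with a Gaussian amplitude dip and phase bump.
N, CARRIER_BIN = 128, 32
bump = [math.exp(-((x - N / 2) / 12.0) ** 2) for x in range(N)]
true_amp = [1.0 - 0.3 * g for g in bump]
true_phase = [0.8 * g for g in bump]
hologram = [1.0 + a * a + 2.0 * a * math.cos(2 * math.pi * CARRIER_BIN * x / N - p)
            for x, (a, p) in enumerate(zip(true_amp, true_phase))]
amp, phase = reconstruct_object(hologram, CARRIER_BIN, radius=16)
```

In this sketch the sideband centre is known exactly because the carrier falls on an integer frequency bin; with real data the centring would have to be determined from the spectrum.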
Since the calculated phase values are always modulo $2\pi$, phase steps may occur in the image displayed.
⁷ For a perfect reconstruction, the position of the sideband must be exactly known. Usually, the center of the sideband can be determined within a precision of one pixel. The information on the exact position can, however, be obtained by the method described by de Ruijter (1992). The correct centering of the sideband can then be achieved by employing the algorithm introduced by Völkl and Allard (1994).
To create a continuous phase distribution, special techniques for phase
unwrapping (cf., e.g., Takeda et al., 1982; Macy, 1983) must be applied to the result (II.18b). Note that if the sideband used for reconstruction is not shifted to the origin of the Fourier space, a linear phase term
$$\Phi_{\mathrm{lin}} = -2\pi\mathbf{R}_c\mathbf{x} \tag{II.19}$$
will be added to the phase $\Phi$ of interest. This means that if we merely separate the sideband around $-\mathbf{R}_c$ and perform an inverse Fourier transformation of the corresponding data, we then obtain the function
$$F_1(\mathbf{x}) = F(\mathbf{x})\exp(-i2\pi\mathbf{R}_c\mathbf{x}) = A(\mathbf{x})\exp[+i\Phi_1(\mathbf{x})] \tag{II.20}$$
with
$$\Phi_1(\mathbf{x}) = \Phi(\mathbf{x}) - 2\pi\mathbf{R}_c\mathbf{x}. \tag{II.21}$$
This differs from the object function $F(\mathbf{x})$ only by the complex factor $\exp(-i2\pi\mathbf{R}_c\mathbf{x})$. According to the shift theorem of Fourier transforms (see, e.g., Champeney, 1973, p. 15), this is because a shift $\mathbf{u}$ in Fourier space corresponds to a multiplication by $\exp(-i2\pi\mathbf{u}\mathbf{x})$ in real space. To compensate for the unwanted linear phase introduced by this factor, it is therefore necessary to shift the sideband to the origin of the Fourier space. The required shift of the sideband may not always be possible, however, since in many image-processing systems only one-half of the Fourier spectrum is usually calculated, and the other half is obtained from the data of the first one by complex conjugation (Friedel's law), because the input signals are always real functions. This means that if the sideband were shifted to the desired position, only one-half of the corresponding data would be effectively available. To avoid this complication, additional memory capacities and software modifications are generally necessary. A much simpler solution to the problem of eliminating the linear phase (II.19) is now described which does not require any shift of the sideband under consideration. Instead of the necessary shift, a second reconstruction with a linear phase identical with that in (II.19) is employed. This reconstruction can be easily obtained by an inverse Fourier transformation of a single point at the position $-\mathbf{R}_c$ in the spectrum, i.e., at the center of the sideband used for reconstruction. If we denote this reconstruction by $F_2$, we can then write
$$F_2(\mathbf{x}) = \exp(-i2\pi\mathbf{R}_c\mathbf{x}) = \exp[i\Phi_2(\mathbf{x})]. \tag{II.22}$$
Once the functions $F_1$ and $F_2$ given by (II.20) and (II.22) have been calculated, the corresponding phases $\Phi_1$ and $\Phi_2$ can be determined
according to
$$\Phi_j = \arctan[\mathrm{Im}(F_j)/\mathrm{Re}(F_j)], \qquad j = 1, 2, \tag{II.23}$$
where $\mathrm{Re}(F_j)$ and $\mathrm{Im}(F_j)$ represent the real and imaginary parts of $F_j$. As in (II.18b), the signs of the numerator and denominator in (II.23) must be taken into account to obtain values ranging from $-\pi$ to $+\pi$. Digital subtraction of the two unwrapped phase distributions given by means of (II.23) immediately leads to the result
$$\Phi(\mathbf{x}) = \Phi_1(\mathbf{x}) - \Phi_2(\mathbf{x}) \tag{II.24}$$
for the phase $\Phi$ of the original object function.
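The shift-free procedure of Eqs. (II.20)-(II.24) can be illustrated with the following one-dimensional sketch. Again the naive transform and the test object are assumptions. One deliberate simplification is made and should be noted: instead of unwrapping $\Phi_1$ and $\Phi_2$ separately, the sketch wraps their difference back into the range $-\pi$ to $+\pi$, which gives the same result as long as $|\Phi| < \pi$.

```python
import cmath
import math

def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / n if inverse else 1.0
    return [scale * sum(signal[j] * cmath.exp(sign * 2j * math.pi * ((j * k) % n) / n)
                        for j in range(n))
            for k in range(n)]

def wrapped_phase(values):
    """Eq. (II.23): phase in (-pi, pi], signs of Re and Im respected."""
    return [math.atan2(v.imag, v.real) for v in values]

N, CARRIER_BIN, RADIUS = 128, 32, 16            # hypothetical values
true_phase = [0.8 * math.exp(-((x - N / 2) / 12.0) ** 2) for x in range(N)]
hologram = [2.0 + 2.0 * math.cos(2 * math.pi * CARRIER_BIN * x / N - p)
            for x, p in enumerate(true_phase)]
spectrum = dft(hologram)

# F_1: inverse transform of the sideband around -R_c, left in place (Eq. II.20).
sideband = [0j] * N
for offset in range(-RADIUS, RADIUS + 1):
    k = (offset - CARRIER_BIN) % N
    sideband[k] = spectrum[k]
f1 = dft(sideband, inverse=True)

# F_2: inverse transform of a single point at -R_c (Eq. II.22).
point = [0j] * N
point[(-CARRIER_BIN) % N] = N
f2 = dft(point, inverse=True)

phi1, phi2 = wrapped_phase(f1), wrapped_phase(f2)
# Eq. (II.24), with the difference wrapped back into (-pi, pi].
phi = [math.atan2(math.sin(a - b), math.cos(a - b)) for a, b in zip(phi1, phi2)]
```

No spectral data are moved in this variant, so it also works when only one half of the spectrum is stored; the price is one additional inverse transform for $F_2$.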
D. Techniques for Displaying Phase Distributions
There are several optical techniques for observing phase distributions, such as dark-field imaging, phase-contrast imaging employing a $\pi/2$ phase plate, and interference microscopy. The last is the most commonly used method for phase determination in light optics. An electron-optical version of this method can be easily realized once the electron wave is reconstructed. As discussed in Section II,C,1, an “interferometric” reconstruction of the object can be obtained directly by inverse Fourier transformation of the data corresponding to the two sidebands of the hologram spectrum which have been shifted to the positions $\pm\mathbf{R}_s$ by multiplying the filtered hologram by a reference fringe system of an appropriate fringe distance. The equation for the $n$th dark line in this reconstruction is given by means of (II.17). A reconstructed phase image revealing very sharp fringes can be easily obtained by utilizing function (II.20) together with a function in the form of (II.22) resulting from the inverse Fourier transformation of a single point set at the position $-\mathbf{R}_\beta$ in Fourier space. If we subtract the phase distribution of the first function from that of the second one and introduce the frequency $\mathbf{R}_s = \mathbf{R}_c - \mathbf{R}_\beta$, we obtain the phase difference
$$\Phi_{\mathrm{dif}}(\mathbf{x}) = 2\pi\mathbf{R}_s\mathbf{x} - \Phi(\mathbf{x}). \tag{II.25}$$
By displaying this distribution without phase unwrapping, a fringe system with a fringe distance of $2\pi$ is obtained where the fringe locations are determined by the phase steps; see Fig. 13. As discussed in Section IV,A, the bends of these fringes are directly proportional to the local variations of the phase distribution $\Phi(\mathbf{x})$. It can be recognized from (II.25) that when $\mathbf{R}_s = 0$, i.e., when $\mathbf{R}_\beta$ is equal to $\mathbf{R}_c$, a contour map of the phase distribution $\Phi(\mathbf{x})$ of the object is obtained; see Fig. 14.
In many cases, the phase $\Phi(\mathbf{x})$ is found to be extremely small ($|\Phi| \ll 2\pi$), and therefore the bends of the resulting fringes can scarcely be observed in the displayed image. Under these circumstances, it is necessary to amplify the phase distribution in order to facilitate its visualization and to enhance the sensitivity of its measurement.
E. Phase Amplification Techniques
In principle, the digitally calculated phase distribution can be easily amplified to any desired degree by multiplying by an appropriate factor (numerical amplification). Because of the limited number of pixels and grey levels available in most image-processing systems, however, it seems to be impossible to detect extremely small phase shifts on the order of $2\pi/100$ which are of great interest in practical work on an atomic scale. This difficulty can be avoided if higher diffraction orders of the hologram spectrum are used. These can be obtained by employing the following two methods (cf. Ade and Lauer, 1992a,b):

1. Nonlinear Imaging of the Original Hologram
As schematically shown in Fig. 4, a nonlinear image of the hologram leading to several diffraction orders can be produced by Fourier transformation of the hologram data, employment of the ±1st-order spectral parts, inverse Fourier transformation of the corresponding data, and display of the absolute value of the resulting interference pattern including a small bias term. The spectrum of this nonlinear image of the hologram is found to consist of a set of sidebands occurring at the positions $\pm n\mathbf{R}_c$. Selection of the sideband at $-n\mathbf{R}_c$ and application of the phase determination method described in Section II,C,2 directly leads to an $n$-times amplified phase distribution of the object.
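A one-dimensional sketch of the nonlinear-imaging idea, for the second diffraction order only, is given below. All numerical values (`N`, `CARRIER_BIN`, `RADIUS`, `BIAS`, the Gaussian phase) are hypothetical, and a naive transform replaces the fast Fourier processor. The carrier bin is deliberately chosen so that it does not divide the number of pixels; this is the one-dimensional counterpart of keeping repeated spectral orders away from the sideband of interest.

```python
import cmath
import math

def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / n if inverse else 1.0
    return [scale * sum(signal[j] * cmath.exp(sign * 2j * math.pi * ((j * k) % n) / n)
                        for j in range(n))
            for k in range(n)]

N, CARRIER_BIN, RADIUS, ORDER = 256, 27, 8, 2    # hypothetical values
BIAS = 0.05                                      # small bias term (assumed)
true_phase = [0.3 * math.exp(-((x - N / 2) / 30.0) ** 2) for x in range(N)]

# Filtered hologram containing only the +/-1st orders (A = 1, no background).
fringes = [2.0 * math.cos(2 * math.pi * CARRIER_BIN * x / N - p)
           for x, p in enumerate(true_phase)]
# Nonlinear image: absolute value of the fringe pattern including the bias.
nonlinear = [abs(BIAS + f) for f in fringes]

# Select the sideband at -ORDER * R_c, shift it to the origin, transform back.
spectrum = dft(nonlinear)
centred = [0j] * N
for offset in range(-RADIUS, RADIUS + 1):
    centred[offset % N] = spectrum[(offset - ORDER * CARRIER_BIN) % N]
wave = dft(centred, inverse=True)
amplified = [math.atan2(v.imag, v.real) for v in wave]
```

Under these assumptions `amplified` follows twice the original phase, apart from small residual errors caused by weak aliased orders that fall inside the selected band.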
2. Repeated Generation of Holograms
A high rate of phase amplification can be achieved by repeated generation of a new hologram using the first-order sidebands of the preceding one;
[Fig. 4 flow: Hologram → Selection of ±1st order → Bias, abs. value → Nonlinear image]
FIGURE 4. Flow diagram for nonlinear hologram imaging (Ade and Lauer, 1992a).
[Fig. 5 flow: Hologram → FT → Selection of ±1st order → FT⁻¹ → Absolute value → Phase-amplified hologram]
FIGURE 5. Flow diagram for generating phase-amplified holograms (Ade and Lauer, 1992a).
see Fig. 5. When the process is repeated $n$ times, the resulting hologram is then given by
$$H_n(\mathbf{x}) = |2A\cos[2\pi\mathbf{R}_c\mathbf{x} - 2^{n-1}\Phi(\mathbf{x})]|, \tag{II.26}$$
where $\mathbf{R}_c$ denotes the position of the sideband used for the generation of the hologram. The Fourier transform of (II.26) consists of a set of sidebands situated at $\pm 2\mathbf{R}_c$, $\pm 4\mathbf{R}_c$, etc. The higher-order sidebands are usually weak and can therefore be neglected. If the first-order sideband at $-2\mathbf{R}_c$ is used, a phase distribution with an amplification rate of $2^n$ is obtained. The applicability of the preceding two methods is based on the repetitive character of the spectrum, which is a result of the discrete Fourier transformation involved. As demonstrated in Fig. 6, spectral parts occurring outside the actual field in Fourier space (monitor in Fig. 6) are introduced into this field by the laterally shifted repetitions of the spectrum. To avoid an overlap of these spectral parts with those of lower order, the hologram fringes must be inclined slightly at an angle to the coordinate axis. Generally, the method of repeated generation of a new hologram is found to be superior to that of nonlinear imaging of the original electron hologram. This is because only a limited number of diffraction orders (sidebands), and consequently a relatively low amplification rate (usually < 10), can be achieved in the latter method. On the other hand, only amplification rates of 2, 4, 8, ... are achievable with the other method. In practical applications, it is therefore advantageous to combine the two
[Fig. 6 schematic labels: monitor (observation field), spectrum, repetitions; sidebands numbered by diffraction order.]
FIGURE 6. The occurrence of high-order diffraction sidebands within the observation field is demonstrated here (Ade and Lauer, 1992a,b).
methods to obtain the desired rate of amplification. When high diffraction orders are required for sensitivity enhancement, there may be some noise and overlapping problems. In this case it is advisable to use one of the next lower-order sidebands and to apply numerical amplification to the phase of the reconstructed wave. Examples of applications that clearly demonstrate the usefulness of the just-described methods for phase amplification are given in Figs. 13-15.

F. Digital Reconstruction Including Aberration Correction
Up to now the discussion has been confined to the case of medium resolutions. The image wave recorded in the hologram has therefore been assumed to be the same as that just behind the object. Because of the degrading effect of aberrations, however, this does not hold in the case of high resolutions. Thus, for a perfect reconstruction of the original object function, it is generally necessary to eliminate the wave aberration $W$ introduced in the recording step of the hologram. This can be achieved by multiplying the sideband of the hologram spectrum that is used for reconstruction, the (−1) in our case, by a complex “filter” of the form $\exp(-i2\pi W)$ and performing an inverse Fourier transformation of the resulting data. A numerical example showing the phase distribution of such a filter is given in Fig. 7. It should be mentioned, however, that the correction procedure requires not only an exact knowledge of the aberration coefficients, but also a correct sampling of the wave aberration function $W$. This implies that a minimum number of pixels is available and a minimum number of fringes are contained in the hologram. Because of the limited number of pixels available in most image-processing systems, however, the resolution achievable and the number of hologram fringes are not independent of each other. Generally, the resolution is determined by the maximum spatial frequency $R_e$ recorded in the hologram. Since $R_e$ is restricted by $R_e \le R_c/3$, the carrier frequency $R_c$ must be sufficiently high. To achieve a resolution of 0.2 nm, for example, a fringe distance $\varepsilon_c = 1/R_c$ of about 0.06 nm must be realized. In practical work, the carrier frequency $R_c$ should be made just high enough to ensure that the information on the object structures of interest is well transferred to the hologram.
Because the fringe contrast deteriorates with decreasing fringe spacing as a result of the instabilities of the electron microscope, there is no point in selecting values of $R_c$ higher than three times the spatial frequency limit determined by the effective widths of the attenuating envelope functions that describe the influence of partial coherence (see Section III,A).
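The correction filter can be illustrated numerically for the isoplanatic wave aberration of Eq. (II.5b). The sketch below assumes that the recorded sideband carries the phase factor $\exp(+i2\pi W)$, so that multiplication by $\exp(-i2\pi W)$ restores it; the instrument parameters (wavelength, $C_s$, defocus) and the flat test spectrum are hypothetical values chosen only to exercise the formulas.

```python
import cmath
import math

def wave_aberration(r, defocus, cs, wavelength):
    """Reduced wave aberration of Eq. (II.5b); lengths in nm, r in 1/nm."""
    return (-0.5 * defocus * wavelength * r ** 2
            + 0.25 * cs * wavelength ** 3 * r ** 4)

def correction_filter(r, defocus, cs, wavelength):
    """Complex filter exp(-i 2 pi W) applied to the centred sideband."""
    return cmath.exp(-2j * math.pi * wave_aberration(r, defocus, cs, wavelength))

# Hypothetical instrument parameters (assumed for illustration only).
WAVELENGTH = 0.0037      # nm, roughly 100 keV electrons
CS = 1.2e6               # nm, i.e. 1.2 mm
DEFOCUS = 0.0            # nm

frequencies = [0.5 * k for k in range(11)]       # 0 ... 5 per nm
# Sideband of a flat-spectrum object, degraded by the assumed factor exp(+i 2 pi W).
aberrated = [cmath.exp(2j * math.pi * wave_aberration(r, DEFOCUS, CS, WAVELENGTH))
             for r in frequencies]
corrected = [value * correction_filter(r, DEFOCUS, CS, WAVELENGTH)
             for value, r in zip(aberrated, frequencies)]
```

Because $W$ grows with the fourth power of the spatial frequency, the filter phase varies very rapidly near the resolution limit, which is why the sampling of $W$ becomes the critical issue.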
FIGURE 7. Unwrapped phase distribution of a digital filter for correcting spherical aberration in an image reconstructed from an off-axis hologram taken at zero defocus. The moiré patterns occurring in the outer part of the figure are due to an undersampling of the steep wave aberration at higher spatial frequencies.
Experimentally, the carrier frequency $R_c$ is generated by applying a voltage $U_F$ to the filament of the biprism. The image and reference waves are then deflected towards each other by an angle $\gamma = c_0 U_F$, where $c_0$ is determined by the energy of the electrons as well as by the radius of the filament and its distance from the ground plates (see Wahl, 1975). The two waves overlap at an angle $\beta = 2a\gamma/(a + b)$ in the intermediate image plane and produce an interference pattern of width $D_H$ and fringe spacing $\varepsilon_c'$ (see Fig. 8). The carrier frequency $R_c' = 1/\varepsilon_c'$ of the resulting fringes is given by
$$R_c' = \beta/\lambda = \frac{2c_0 a}{\lambda(a + b)}\, U_F. \tag{II.27}$$
It can be referred to the object side by multiplying it by the magnification $M' = (a + b)/f$ of the objective lens of focal length $f$. We then have
$$R_c = M'R_c' = 2c_0 a U_F/\lambda f. \tag{II.28}$$
Thus, for a given energy of the electrons, the carrier frequency R, is only determined by the voltage U, of the filament, since the distance a is kept fixed by the design of the biprism.
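As a small numerical sketch of Eqs. (II.27) and (II.28), the carrier frequency can be evaluated in the image plane and referred back to the object side. All numerical values below (deflection constant, distances, focal length, filament voltage) are hypothetical and serve only to show the linear dependence on the filament voltage; the function names are likewise illustrative.

```python
def image_plane_carrier_frequency(c0, u_f, a, b, lam):
    """R_c' = beta / lam with beta = 2*a*gamma/(a+b) and gamma = c0*U_F (Eq. II.27)."""
    gamma = c0 * u_f
    beta = 2.0 * a * gamma / (a + b)
    return beta / lam


def carrier_frequency(c0, u_f, a, lam, f):
    """Object-side carrier frequency R_c = 2*c0*a*U_F/(lam*f) (Eq. II.28)."""
    return 2.0 * c0 * a * u_f / (lam * f)


# Hypothetical values in SI units (not taken from the text).
C0 = 2.0e-6      # rad/V, deflection per volt
U_F = 100.0      # V, filament voltage
A, B = 0.1, 0.2  # m, distances a and b
LAM = 3.7e-12    # m, electron wavelength at 100 kV
F = 2.0e-3       # m, focal length of the objective lens

rc_image = image_plane_carrier_frequency(C0, U_F, A, B, LAM)
rc_object = carrier_frequency(C0, U_F, A, LAM, F)
magnification = (A + B) / F
```

Multiplying `rc_image` by `magnification` reproduces `rc_object`, and the distance $b$ drops out of the object-side result, as stated in the text.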
FIGURE 8. Schematic diagram for determining the fringe spacing $\varepsilon_s'$ and the width $D_H$ of the interference field in the intermediate image plane of the electron microscope.
The width $D_H$ of the interference field at a distance $b$ from the filament can be easily calculated by means of Fig. 8. To obtain the object-side related width $D$ of the hologram, the expression for $D_H$ must be divided by the magnification $M'$. This leads to the result (see also Wahl, 1975)
(II.29)
where $r_F$ represents the radius of the biprism filament. It follows from Eqs. (II.28) and (II.29) that the carrier frequency $R_c$ and the hologram width $D$ cannot be controlled independently by the filament voltage $U_F$. Since $U_F$ is determined by the choice of $R_c$, the width $D$ can only be controlled by means of the distance $b$ between the filament and the intermediate image plane.
1. Number of Hologram Fringes
For an accurate correction of aberrations, the phase uncertainty $2\pi|\delta W| = 2\pi|\operatorname{grad} W|\,\delta R$ within a sampling interval $\delta R = |\delta\mathbf{R}|$ should not exceed the value of $\pi/2$ (see also Lichte, 1991a,b). This means that $\delta R$ should satisfy the condition
$$\delta R \le \frac{1}{4\,|\operatorname{grad} W|_{R=R_e}}, \tag{II.30}$$
where $R_e$ represents the maximum spatial frequency to be resolved in the reconstructed image.
The sampling interval $\delta R$ can be expressed as
$$\delta R = 1/D \tag{II.31a}$$
in terms of the hologram width $D$, or as
$$\delta R = R_c/N_{fr} \tag{II.31b}$$
in terms of the carrier frequency $R_c$ and the number $N_{fr}$ of the hologram fringes. Noting that $R_c$ must meet the condition $R_c \ge 3R_e$ in the general case of a strong object, it follows that a minimum number
$$N_{fr} \ge 12R_e\,|\operatorname{grad} W|_{R=R_e} \tag{II.32}$$
of hologram fringes is required to achieve the envisaged resolution. If, for example, only spherical aberration is taken into account, i.e.,
$$W = C_s\lambda^3R^4/4, \tag{II.33}$$
the required number of fringes is then given by
$$N_{fr} \ge 12C_s\lambda^3R_e^4. \tag{II.34}$$
For $C_s = 1.2$ mm, $\lambda = 2.5$ pm (corresponding to an electron energy of 200 keV), and $R_e = 7\ \mathrm{nm}^{-1}$, the condition (II.34) leads to the result that the hologram must contain about 540 fringes. Since a sampling rate of 4 pixels per fringe is required (Lenz and Völkl, 1990), the hologram must be sampled with more than 2000 × 2000 pixels.
2. Lateral Resolution and Number of Pixels
The maximum spatial frequency $R_e$ that can be reconstructed from the hologram is determined not only by the number $N_{fr}$ of the hologram fringes, but also by the number $N_p$ of pixels available in the image-processing system employed. By making use of the relation
$$N_p = 4N_{fr}, \tag{II.35}$$
the following result can be directly obtained from (II.34):
$$R_e \le \left(\frac{N_p}{48C_s\lambda^3}\right)^{1/4}. \tag{II.36}$$
Thus, if an image-processing system with $N_p = 512$ (2,048) pixels per line is employed and the previously given values for $C_s$ and $\lambda$ are used, a lateral resolution of 0.2 nm (0.14 nm) can be expected in the reconstructed image. Besides the number of hologram fringes and the number of available pixels, the performance of electron holography is found to be limited by other parameters which will be discussed in the next section.
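The numerical values quoted above can be checked with a short sketch of the conditions (II.34) to (II.36). The function names are illustrative; the input values are those given in the text, expressed in nm.

```python
def required_fringes(cs, lam, r_e):
    """Minimum number of hologram fringes, N_fr = 12*Cs*lam^3*R_e^4 (Eq. II.34)."""
    return 12.0 * cs * lam**3 * r_e**4


def max_spatial_frequency(n_pixels, cs, lam):
    """Largest reconstructable R_e for N_p pixels per line at 4 pixels per fringe."""
    return (n_pixels / (48.0 * cs * lam**3)) ** 0.25


CS_NM = 1.2e6    # Cs = 1.2 mm in nm
LAM_NM = 2.5e-3  # lambda = 2.5 pm in nm

n_fr = required_fringes(CS_NM, LAM_NM, 7.0)               # about 540 fringes
n_pix = 4.0 * n_fr                                        # pixels per line
res_512 = 1.0 / max_spatial_frequency(512, CS_NM, LAM_NM)    # in nm
res_2048 = 1.0 / max_spatial_frequency(2048, CS_NM, LAM_NM)  # in nm
```

Because $R_e$ grows only with the fourth root of $N_p$, quadrupling the number of pixels per line improves the resolution by a factor of only $4^{1/4} \approx 1.4$.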
III. PROBLEMS OF OFF-AXIS HOLOGRAPHY
Although off-axis holography is a straightforward method for reconstructing the original object function, its realization, especially at high resolutions, is restricted by several parameters. The restrictions imposed by the number of hologram fringes and the number of pixels available have been dealt with in Section II,F. The effect of other important parameters will now be discussed.
A. The Effect of Limited Coherence
Up to now holography has been discussed in terms of strictly coherent waves. In reality, electron waves are far from being coherent, and electron holograms can only be recorded under certain conditions. Because of the effect of partial coherence, only information about spatial frequencies below a certain limit can be recovered from the hologram. The effect of partial coherence on image formation in light and electron optics has been studied by several authors. A detailed and comprehensive review of this subject has been given by Hawkes (1978). An extensive study of the influence of partial coherence in contrast transfer theory has been made by Hanszen and Trepte (1971a,b). A theoretical treatment of the possibility of writing the effective transfer function as a product of the coherent transfer function and an envelope function has been given by Hanszen and Trepte (1971a) in the case of chromatic partial coherence, and by Frank (1973) in the general case of partial coherence. The effects of both spatial and chromatic partial coherence on reconstructed images in off-axis holography have been analyzed by Wahl (1975), and in an extended form including off-axial aberrations by Ade (1982a,c).
1. Chromatic Partial Coherence
The effects of current and voltage fluctuations and also the influence of the energy width of the electron beam can be generally described by means of defocus variations (Hanszen and Trepte, 1971a). The intensity distribution in the hologram plane is therefore a composite of intensities corresponding to different values $\delta z$ of defocus. The mean intensity can be found by integration over the defocus, taking a distribution function $h_c$ into account. As can be easily shown (see, e.g., Hanszen and Trepte, 1971a; Ade, 1982c), the effect of chromatic partial coherence can be described by an envelope function $G_c$ that is determined by the Fourier transform of the distribution function under consideration. As an example, we consider here a Gaussian distribution
$$h_c(\delta z) = \frac{1}{\sqrt{\pi}\,\delta_c}\exp[-(\delta z/\delta_c)^2] \tag{III.1}$$
with the effective $1/e$ width $2\delta_c$. In most cases, the defocus variations are mainly determined by the relative energy width $\delta E/E$ of the electrons. According to Hanszen and Trepte (1971a), we can therefore write
$$\delta_c = C_c\,\delta E/E, \tag{III.2}$$
where $C_c$ represents the coefficient of chromatic aberration. The envelope function $G_c(R)$ is then found to be
$$G_c(R) = \tilde h_c(\lambda R^2/2) = \exp[-(\pi\delta_c\lambda R^2/2)^2]. \tag{III.3}$$
This function is shown graphically in Fig. 9 for different values of the parameter $\delta E/E$. Because of the steep decrease of the envelope at higher spatial frequencies, the resolution of the reconstructed image can be severely restricted by chromatic partial coherence. Since $\delta_c$ is mainly determined by $\delta E/E$, the attenuation effect of the envelope cannot be changed by choosing other conditions of operation in the electron microscope.
FIGURE 9. Envelope $G_c$ as a function of the spatial frequency $R$ (in nm$^{-1}$) in the case of chromatic partial coherence with a Gaussian distribution, plotted for $C_c = 2$ mm. The parameter is the relative energy width $\delta E/E$ of the electron beam.
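A minimal sketch of the chromatic envelope, assuming the Gaussian form $G_c(R) = \exp[-(\pi\delta_c\lambda R^2/2)^2]$ with $\delta_c = C_c\,\delta E/E$, is given below. The value $C_c = 2$ mm is taken from Fig. 9; the wavelength and the relative energy widths are hypothetical choices, and the function name is illustrative.

```python
import math


def chromatic_envelope(r, lam, c_c, rel_energy_width):
    """Gaussian chromatic envelope G_c(R); lengths in nm, R in nm^-1."""
    delta_c = c_c * rel_energy_width  # defocus spread, assumed delta_c = Cc * dE/E
    return math.exp(-(math.pi * delta_c * lam * r**2 / 2.0) ** 2)


C_C_NM = 2.0e6   # Cc = 2 mm in nm (value shown in Fig. 9)
LAM_NM = 3.7e-3  # hypothetical wavelength (100 kV electrons)

# Envelope at a few spatial frequencies for two hypothetical energy widths.
curve_narrow = [chromatic_envelope(r, LAM_NM, C_C_NM, 0.5e-5) for r in range(0, 7)]
curve_wide = [chromatic_envelope(r, LAM_NM, C_C_NM, 1.0e-5) for r in range(0, 7)]
```

Since $R$ enters the exponent as $R^4$, the envelope stays close to unity at low spatial frequencies and then falls off very steeply, which is the behaviour described in the text.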
2. Spatial Partial Coherence
In the case of spatial partial coherence, the finite size of the electron source must be taken into account. To calculate the intensity distribution in the hologram plane, the source is split into point sources which are considered to radiate incoherently (Hanszen and Trepte, 1971b). Each point source corresponds to a different direction of illumination indicated by the illumination angle $\varphi$. The total intensity distribution can be derived by integration over all possible directions. As in the case of chromatic partial coherence, the effect of spatial partial coherence can be described by an envelope function $G_s$ which is the Fourier transform of the distribution function $h_s$ of the source. As already shown by Ade (1982c), the envelope $G_s$ can be written as
$$G_s(\mathbf{R}, \mathbf{x}, \mathbf{D}) = \tilde h_s[\mathbf{D} + \operatorname{grad} W(\mathbf{R}, \mathbf{x})], \tag{III.4}$$
where $|\mathbf{D}|$ is the object-side related width of the hologram. In the general case, the wave aberration $W$ includes all axial and off-axial aberrations; see Section II,A. Two important cases will now be discussed.
a. Rotationally Symmetric Source. We now assume an ideal case of aberration-free imaging, i.e., $W = 0$. The envelope (III.4) is then only a function of the width $D$ of the hologram. Furthermore, we assume a rotationally symmetric source with the normalized Gaussian distribution
$$h_s(\varphi_x, \varphi_y) = \frac{1}{\pi\varphi_e^2}\exp[-(\varphi_x^2 + \varphi_y^2)/\varphi_e^2], \tag{III.5}$$
where $\varphi_x$ and $\varphi_y$ are the components of the illumination aperture in the $x$ and $y$ direction, and $2\varphi_e$ represents the effective angular width of the source. The corresponding envelope function is then given by
$$G_{s0}(D) = \exp[-(\pi\varphi_eD/\lambda)^2]. \tag{III.6}$$
According to the general theory of partial coherence (cf. Born and Wolf, 1959, Section 10.4.2), the envelope $G_{s0}$ is identical with the degree of spatial coherence $\kappa_{pc}$ between two points with a mutual separation $D$ in the object plane. Note that the fringe contrast decreases with increasing width $D$ of the hologram; see Fig. 10. In principle, this decrease of contrast can be compensated by proper demagnification of the source. But this leads to a decrease of the current density in the hologram plane and, consequently, to severe noise problems in the reconstruction step.
FIGURE 10. Envelope $G_{s0}$ as a function of the width $D$ of the interference field (in nm) in the case of aberration-free imaging and spatial partial coherence with a rotationally symmetric Gaussian distribution. The parameter $\varphi_e$ represents the effective illumination aperture of the electron source.
Since $\varphi_e$ can be expressed as
$$\varphi_e = x_{Qe}/\zeta \tag{III.7}$$
in terms of the natural coordinate $x_{Qe}$ in the source plane and the distance $\zeta$ between the source and object plane, we rewrite (III.6) as
$$G_{s0}(D) = \kappa_{pc} = \exp[-(\pi x_{Qe}D/\zeta\lambda)^2]. \tag{III.8}$$
The terms $\pi x_{Qe}^2$ and $\pi D^2/\zeta^2$ represent the effective area $\delta f$ of the source and the solid angle $\delta\Omega$ of diameter $2D/\zeta$, respectively. If we take the logarithm of (III.8) and make use of the relation
$$B = \frac{\delta J}{\delta f\,\delta\Omega} \tag{III.9}$$
for the axial brightness $B$ of the source (see, e.g., Lauer, 1982b), the following result is obtained for the current $\delta J$ available for recording a hologram with the contrast factor $\kappa_{pc}$:
$$\delta J = -B\lambda^2\ln\kappa_{pc}. \tag{III.10}$$
Thus, for a given electron wavelength $\lambda$ and contrast factor $\kappa_{pc}$, the current $\delta J$ is determined only by the brightness $B$ of the source. To achieve higher current densities in the hologram plane, a more favorable illumination mode can be employed (Lauer, 1982a; see also Hanszen et al., 1985a,b, 1986).
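The relations (III.6) and (III.10) can be sketched numerically as follows. The brightness value is a hypothetical order of magnitude for a field emission source and is not taken from the text; the function names are illustrative.

```python
import math


def spatial_coherence(phi_e, d, lam):
    """Degree of spatial coherence exp[-(pi*phi_e*D/lam)^2] (Eq. III.6); d and lam in the same unit."""
    return math.exp(-(math.pi * phi_e * d / lam) ** 2)


def available_current(brightness, lam, kappa_pc):
    """Coherent current dJ = -B * lam^2 * ln(kappa_pc) (Eq. III.10), SI units."""
    return -brightness * lam**2 * math.log(kappa_pc)


LAM_M = 3.7e-12      # m, wavelength at 100 kV
BRIGHTNESS = 1.0e12  # A m^-2 sr^-1, hypothetical value

kappa = spatial_coherence(1.0e-5, 50.0e-9, LAM_M)  # phi_e = 1e-5 rad, D = 50 nm
current = available_current(BRIGHTNESS, LAM_M, 0.61)
```

The current depends on the hologram width and the illumination aperture only through the contrast factor $\kappa_{pc}$, so a wider hologram at fixed contrast does not provide more electrons.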
b. Elliptical Source. To obtain high fringe modulation, the value of the envelope function (III.6) must be close to unity. This implies that the illumination aperture $\varphi_e$ must satisfy the condition
$$\varphi_e \ll \lambda/\pi D. \tag{III.11}$$
For $\lambda = 3.7$ pm and $D = 50$ nm, for example, the right-hand side of (III.11) is about $2.4 \times 10^{-5}$ rad, so that a considerably smaller illumination aperture is required. On the other hand, in order to operate with small exposure times, the current density in the hologram plane should be high. These two requirements can be fulfilled if a field emission gun is used and the object is coherently illuminated by a line-shaped source (astigmatic illumination). This kind of illumination can be obtained by overexciting the field of the condenser stigmator to such a degree that two mutually perpendicular line foci with a small separation are produced (Lauer, 1982a). As schematically shown in Fig. 11, by a proper excitation of the condenser lens, the first line focus is made to occur in front of the object, and the other one at a very short distance behind it. The two line foci of vanishingly small widths represent one-dimensional images of the point-like electron source. Their widths and distances from the object determine the angular extensions of the two components of the effective illumination aperture in the $x$ and $y$ directions. Thus, the illumination aperture is, as a function of the azimuth, generally elliptical in shape. To calculate the influence of astigmatic illumination on the contrast and resolution in off-axis holography (cf. Ade, 1982c; Ade et al., 1985), we assume that the effective source can be described by a Gaussian function according to
$$h_s(\varphi_x, \varphi_y) = \frac{1}{\pi\varphi_{ex}\varphi_{ey}}\exp[-(\varphi_x/\varphi_{ex})^2 - (\varphi_y/\varphi_{ey})^2], \tag{III.12}$$
where $2\varphi_{ex}$ and $2\varphi_{ey}$ represent the $1/e$ widths in the $x$ and $y$ direction, respectively. Furthermore, we assume that the hologram fringes are oriented parallel to the $x$ direction. Finally, we restrict the discussion to the case of isoplanatic imaging and take only spherical aberration into account.* The wave aberration $W$ is then given by means of Eq. (II.5) with $\Delta z = 0$. If we introduce polar coordinates according to
$$R_x = R\cos\gamma; \qquad R_y = R\sin\gamma, \tag{III.13}$$
the envelope function (III.4) can be written as (cf. Ade, 1982c)
$$G_{se}(R, D, \gamma) = \exp\{-\pi^2[(\varphi_{ex}\xi)^2 + (\varphi_{ey}\eta)^2]\}, \tag{III.14}$$
* The following discussion is aimed at studying the attenuation effect of the envelope. As already discussed by Lichte (1991b), the combined effect of defocus and spherical aberration can be optimized in the spatial frequency range of interest by employing a proper defocus value in the recording step of the hologram.
FIGURE 11. Object illumination by a coherent line-shaped focus produced by astigmatic imaging of the point source of a field emission gun (Lauer, 1982a). This line focus is adjusted parallel to the biprism filament. The second focus extending in a perpendicular direction is placed at a short distance from the object. Because of the elliptical shape of the illumination aperture, the current density is very high in this case.
where
$$\xi = C_s\lambda^2R^3\cos\gamma; \tag{III.15a}$$
$$\eta = C_s\lambda^2R^3\sin\gamma + D/\lambda. \tag{III.15b}$$
To study the attenuation effect of the envelope on the sideband used for reconstruction, Eq. (III.14) is normalized by dividing it by its value
$$G_0 = \exp[-(\pi\varphi_{ey}D/\lambda)^2] \tag{III.16}$$
at the position $R = 0$. The resulting function $G_{se}/G_0$ is shown in Fig. 12 in the form of contour lines, where the position $\mathbf{R} = (0, -R_c)$ has been taken as the origin. It follows from this result [see also Eqs. (78) and (83) in the article by Ade (1982c)] that the transfer properties of the reconstructed image are determined only by the illumination aperture $\varphi_{ey}$ perpendicular to the hologram fringes. The value of the illumination aperture $\varphi_{ex}$ parallel to the fringes can be chosen at least 100 times larger than the maximum tolerable value of $\varphi_{ey}$ without degrading the quality of the reconstructed image. This means that the exposure time can be about 100 times shorter than that in the case of rotationally symmetric illumination.
FIGURE 12. Two-dimensional display (projected cross-sections) of the normalized envelope $G_{se}/G_0$ in the case of off-axis holography employing astigmatic illumination (Ade et al., 1985), plotted for $\lambda = 3.7 \times 10^{-3}$ nm and $D = 50$ nm. It is assumed that the ($-1$) diffraction order of the hologram is utilized for reconstruction. The relevant region of the corresponding spectral part remains almost unaffected by the envelope.
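The normalized envelope can be sketched as follows, assuming the Gaussian form $G_{se} = \exp\{-\pi^2[(\varphi_{ex}\xi)^2 + (\varphi_{ey}\eta)^2]\}$ with $\xi = C_s\lambda^2R^3\cos\gamma$ and $\eta = C_s\lambda^2R^3\sin\gamma + D/\lambda$. The values of $C_s$ and of the two illumination apertures below are hypothetical, and the function name is illustrative.

```python
import math


def normalized_envelope(r, azimuth, d, lam, cs, phi_ex, phi_ey):
    """G_se / G_0 for astigmatic illumination; lengths in nm, r in nm^-1."""
    xi = cs * lam**2 * r**3 * math.cos(azimuth)
    eta = cs * lam**2 * r**3 * math.sin(azimuth) + d / lam
    g_se = math.exp(-math.pi**2 * ((phi_ex * xi) ** 2 + (phi_ey * eta) ** 2))
    g_0 = math.exp(-(math.pi * phi_ey * d / lam) ** 2)
    return g_se / g_0


# Hypothetical parameters (nm units); only lam and D follow Fig. 12.
LAM_NM, D_NM, CS_NM = 3.7e-3, 50.0, 2.8e6
PHI_EX, PHI_EY = 1.0e-4, 1.0e-6  # rad, assumed 100:1 ellipticity

at_origin = normalized_envelope(0.0, 0.0, D_NM, LAM_NM, CS_NM, PHI_EX, PHI_EY)
along_fringes = normalized_envelope(3.0, 0.0, D_NM, LAM_NM, CS_NM, PHI_EX, PHI_EY)
```

Along the fringe direction the large aperture $\varphi_{ex}$ acts only through the aberration term $\xi$, which vanishes at $R = 0$; this is why it can be chosen much larger than $\varphi_{ey}$ without lowering the fringe contrast.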
B. Noise Problems
As is well known in high-resolution electron microscopy, the image acquires a granular appearance because of statistical variations of the electron density. This "quantum noise" is found to be governed by Poisson statistics. If the mean number of electrons incident on an image element (pixel) during the exposure time is denoted by $N_0$ and its standard deviation by $\Delta N$, the noise contrast between different pixels is then given by $\Delta N/N_0 = 1/\sqrt{N_0}$. Image details with a contrast smaller than $\Delta N/N_0$ are therefore buried in noise and cannot be recognized in the image. To suppress the effect of noise, the number $N_0$ of electrons must be as high as possible. In electron holography, both the contrast and the location of the hologram fringes are influenced by noise. Consequently, the amplitude and phase of the reconstructed wave can be determined only with a limited accuracy. The influence of photographic noise on light-optical reconstructions from off-axis holograms has been thoroughly investigated by Hanszen
and Ade (1983, 1984) and Hanszen (1983); see also Lauer and Hanszen (1986a,b). Because of the large width of the noise band, it is found that object phases smaller than $\pi/6$ cannot be recognized in interferometric reconstructions (Hanszen and Ade, 1984). The statistical errors inherent in the determination of the amplitude and phase in a noisy interference pattern have been calculated by Walkup and Goodman (1973), Lichte et al. (1987), Lenz (1988), Lenz and Völkl (1990), and de Ruijter and Weiss (1993). The accuracy of these parameters is found to be limited by the total number $N$ of electrons in a given area and by the contrast $\kappa$ of the fringe pattern. The phase error $\Delta\Phi$ of the reconstructed wave is, for example, given by
$$\Delta\Phi \ge (\kappa\sqrt{N/2})^{-1}. \tag{III.17}$$
This relation shows that $\Delta\Phi$ is strongly influenced by the contrast $\kappa$, and more weakly influenced by $N$. Thus, to keep $\Delta\Phi$ small, the fringe contrast should be made as high as possible, because any reduction of contrast must be compensated by a quadratic increase in the number of electrons. Generally, the fringe contrast $\kappa$ is determined by the relation
$$\kappa = \kappa_{in}\,\kappa_{pc}\,\kappa_{MTF}, \tag{III.18}$$
which includes the effects of the microscope instabilities, partial coherence, and the modulation transfer function (MTF). The factor $\kappa_{in}$ depends on the technical design of the electron microscope; with careful measures it can be made close to unity. The factor $\kappa_{pc}$ has been dealt with in Section III,A,2, and the effect of the remaining factor $\kappa_{MTF}$ will be investigated in the following section.
C. Problems of Hologram Recording
Electron holograms are usually recorded on photographic plates or films even if the subsequent reconstruction process is performed digitally. Electronic detection would, of course, be more favorable for a real-time digital reconstruction, but image-processing systems with a sufficiently large number of pixels to meet practical needs are not yet available. As is well known, the photographic process has some undesirable characteristics which will now be briefly discussed.
1. Detective Quantum Efficiency
One of the most important parameters for describing a recording device is its detective quantum efficiency (DQE). In an ideal recorder (DQE = 1), all impinging electrons are registered as spatial $\delta$-pulses of uniform height.
The DQE of a real recorder is generally found to be smaller than unity because of the nonuniform pulse-height distribution and additional noise. In the case of photographic recording, this is due to electron diffusion and to the statistical variation of the optical density, which results from the statistical variation of the density of the impinging electrons and of the number of grains made developable by each electron. If we take the effect of the DQE into account, the phase error (III.17) becomes
$$\Delta\Phi \ge (\kappa\sqrt{\mathrm{DQE}\cdot N/2})^{-1}. \tag{III.19}$$
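The trade-off between fringe contrast and electron dose can be illustrated with a minimal sketch. It assumes the phase error in the form $\Delta\Phi = [\kappa\sqrt{\mathrm{DQE}\cdot N/2}]^{-1}$; the numerical prefactor is an assumption of this sketch, whereas the scaling with $\kappa$ and $N$ is the point being illustrated. The function name and the input values are illustrative.

```python
import math


def phase_error(kappa, n_electrons, dqe=1.0):
    """Lower bound of the phase error for contrast kappa and N electrons (assumed form)."""
    return 1.0 / (kappa * math.sqrt(dqe * n_electrons / 2.0))


N = 10_000
ideal = phase_error(1.0, N)              # ideal contrast, ideal recorder
half_contrast = phase_error(0.5, N)      # contrast halved: error doubles
compensated = phase_error(0.5, 4 * N)    # four times the electrons restores it
photographic = phase_error(1.0, N, dqe=0.7)
```

Halving the contrast doubles the phase error, and four times as many electrons are needed to restore it, which is the quadratic compensation mentioned in the text.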
According to the theoretical results published by Herrmann (1982), a DQE value of about 0.7 can be achieved under optimum conditions. This means that if the number of electrons is chosen to meet the condition for maximum DQE, nearly 70% of the electrons impinging on a photographic plate can be expected to yield a significant signal. Besides its inability to record all incident electrons (DQE < 1), the photographic process has other undesirable characteristics which have their origin in the multiple scattering of the electrons in the emulsion (for details, see Frieser and Klein, 1958; Frieser et al., 1959).
2. Nonlinearity of the Transmittance-Exposure Curve
If a photographic plate is exposed to electrons with the current density $j$, the optical density $\hat D$ of the developed plate is found to be adequately described by
$$\hat D = \hat D_s[1 - \exp(-cjt)], \tag{III.20}$$
where $t$ is the exposure time, $\hat D_s$ is the saturation density, and $c\hat D_s$ defines the sensitivity of the emulsion. For $\hat D \ll \hat D_s$, the linear relation
$$\hat D = c\hat D_sjt \tag{III.21}$$
can be used as an approximation. For a light-optical reconstruction of the hologram, the amplitude transmittance $T_a$ of the developed photographic plate is required, while for a digital reconstruction the intensity transmittance $T$ is more relevant. The transmittance $T$ can be related to the optical density $\hat D$ by recalling that $\hat D$ is defined as the decimal logarithm of $1/T$, i.e.,
$$\hat D = \log_{10}(1/T). \tag{III.22}$$
It directly follows from this relation that
$$T = 10^{-\hat D}. \tag{III.23}$$
Consequently, the amplitude transmittance $T_a$ can be written as
$$T_a = 10^{-\hat D/2}. \tag{III.24}$$
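For digital reconstruction, the relations (III.20), (III.22), and (III.23) can be inverted to recover the current density from a measured intensity transmittance. The following sketch assumes the exponential density law given above; the emulsion constants are hypothetical, and the function names are illustrative.

```python
import math


def optical_density(j, t, d_sat, c):
    """Optical density D = D_s * [1 - exp(-c*j*t)] (Eq. III.20)."""
    return d_sat * (1.0 - math.exp(-c * j * t))


def intensity_transmittance(density):
    """T = 10^(-D) (Eq. III.23)."""
    return 10.0 ** (-density)


def amplitude_transmittance(density):
    """T_a = 10^(-D/2) (Eq. III.24)."""
    return 10.0 ** (-density / 2.0)


def current_density_from_transmittance(trans, t, d_sat, c):
    """Invert Eqs. (III.22) and (III.20): T -> D -> j. Requires D < D_s."""
    density = -math.log10(trans)
    return -math.log(1.0 - density / d_sat) / (c * t)


# Hypothetical emulsion constants and exposure (arbitrary consistent units).
D_SAT, C, T_EXP, J_TRUE = 4.0, 0.5, 1.0, 1.0
dens = optical_density(J_TRUE, T_EXP, D_SAT, C)
j_back = current_density_from_transmittance(intensity_transmittance(dens), T_EXP, D_SAT, C)
```

The inversion is exact as long as the measured density stays below the saturation density; close to saturation, small errors in the measured transmittance are strongly amplified.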
Unfortunately, neither $T_a$ nor $T$ is linearly related to the current density $j$ of the electrons. Linear relations are obtained only for small values of the optical density $\hat D$. This is of minor importance, however, because high fringe contrast is normally needed in practice to reduce the effect of noise (Lauer and Hanszen, 1986a,b). The influence of the nonlinearity of $T_a$ on light-optical reconstructions has been investigated by Ade (1980, 1982a) for both in-line and off-axis holography. This nonlinearity generally leads to a distortion of the amplitude and phase in both methods. In off-axis holography with low resolution requirements, however, the nonlinearity has no influence on the reconstructed phase. In the case of digital reconstruction, the nonlinearity of the intensity transmittance $T$ of the photographic plate does not represent a serious problem, since the electron current density $j$ of interest can be easily determined from the measured values of the transmittance by using the known relation between $j$ and $T$.
3. Modulation Transfer Function
Because of multiple scattering, a large number of photographic grains can be made developable by a single impinging electron within the scattering or diffusion volume. The lateral spread within the emulsion corresponds to the size of the "point spread function" which sets the resolution limit. The effect of this function is to attenuate the contrast in images of small object details, i.e., of high spatial frequencies. This attenuation is described by means of the so-called modulation transfer function⁹ (MTF). In the case of off-axis holography, the contrast of the hologram fringes is found to be attenuated by the factor
$$\kappa_{MTF} = [1 + (s_0/\varepsilon_s)^2]^{-1} = [1 + \nu s_0^2/\nu\varepsilon_s^2]^{-1}, \tag{III.25}$$
where $s_0$ is the half width of the MTF, $\varepsilon_s = 1/R_c$ is the fringe distance, and $\nu$ is the electron density (i.e., electrons per unit area). In the following, we assume that $\nu$ is chosen to yield a maximum DQE. As mentioned in Section II,B, the fringe distance must be so adjusted that three fringes correspond to the linear size of the object element to be resolved in the reconstruction. Thus, if $N$ electrons are collected within a "pixel" of an area $(3\varepsilon_s)^2$, the term $\nu\varepsilon_s^2$ occurring in (III.25) corresponds to $N/9$. Similarly, the term $\nu s_0^2$ corresponds to the number $N_D$ of electrons impinging on the diffusion area $s_0^2$. Thus, we can rewrite Eq. (III.25) as
$$\kappa_{MTF} = (1 + 9N_D/N)^{-1}. \tag{III.26}$$
If we use this relation in (III.18), the fringe contrast takes the form
$$\kappa = \kappa_{in}\kappa_{pc}/(1 + 9N_D/N). \tag{III.27}$$
⁹ Experimentally determined MTF curves for various photographic emulsions have been published by Downing and Grano (1982).
To determine the combined effect of the DQE and MTF on the phase error $\Delta\Phi$, we insert Eq. (III.27) in (III.19) and obtain the result
$$\Delta\Phi \ge \frac{1 + 9N_D/N}{\kappa_{in}\kappa_{pc}\sqrt{\mathrm{DQE}\cdot N/2}}. \tag{III.28}$$
If a total number $N_{tot} = \tau\,\delta J/e$ of electrons with charge $e$ is available during the exposure time $\tau$ for recording the hologram, the number of "pixels" $n_\kappa = N_{tot}/N$ which can be recorded and subsequently reconstructed with a phase accuracy $\Delta\Phi$ can be calculated by means of (III.10) and (III.28). It follows that $n_\kappa$ can be written as
$$n_\kappa = n_{\kappa,id}\,\frac{\mathrm{DQE}}{(1 + 9N_D/N)^2}, \tag{III.29}$$
where
$$n_{\kappa,id} = -\kappa_{in}^2\kappa_{pc}^2\ln(\kappa_{pc})\,B\tau\lambda^2\Delta\Phi^2/2e \tag{III.30}$$
represents the number of reconstructable pixels in the case of an ideal recorder. Equation (III.30) shows that $n_{\kappa,id}$ is strongly dependent on the contrast factors $\kappa_{in}$ and $\kappa_{pc}$. For a fixed value of $\kappa_{in}$, the number $n_{\kappa,id}$ is found to be maximum if $\kappa_{pc}$ is made equal to 0.61. According to Eqs. (III.28) and (III.29), the smallest number $N_D$ needed for recording the hologram at optimum DQE should be used to keep the phase error $\Delta\Phi$ as small as possible, and the number of pixels $n_\kappa$ as large as possible. As discussed by Lichte (1991a), a small value of $N_D$ can be achieved by combining different values of $s_0$ with corresponding emulsion speeds. All emulsions with the same value $N_D$ are found to be equivalent. Emulsions leading to small diffusion areas under exposure with high electron densities are preferable, however, since they allow a larger number of pixels to be recorded. As can be seen in Eq. (III.29), the effect of the photographic process is to reduce the number $n_{\kappa,id}$ of reconstructable pixels. If the numerical values DQE = 0.7, $N_D = 100$, and $N = 10{,}000$ are used, for example, the number of pixels is found to be reduced by a factor of 0.6.
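The two numerical statements of this paragraph can be checked with a short sketch. It assumes that the photographic process reduces the ideal pixel number by the factor $\mathrm{DQE}/(1 + 9N_D/N)^2$ and that the ideal pixel number depends on the coherence factor through $-\kappa_{pc}^2\ln\kappa_{pc}$. The function names are illustrative, and the optimum is located by a simple grid search.

```python
import math


def pixel_reduction_factor(dqe, n_d, n):
    """Ratio n_kappa / n_kappa_id caused by the DQE and MTF (assumed form)."""
    return dqe / (1.0 + 9.0 * n_d / n) ** 2


def ideal_pixel_weight(kappa_pc):
    """kappa_pc-dependent part of the ideal pixel number: -kappa^2 * ln(kappa)."""
    return -kappa_pc**2 * math.log(kappa_pc)


reduction = pixel_reduction_factor(0.7, 100, 10_000)

# Grid search over 0 < kappa_pc < 1 for the coherence factor maximizing n_kappa_id.
best_kappa = max((k / 1000.0 for k in range(1, 1000)), key=ideal_pixel_weight)
```

The grid search returns a value close to $e^{-1/2} \approx 0.61$, the analytic maximum of $-\kappa_{pc}^2\ln\kappa_{pc}$, and the reduction factor evaluates to about 0.59, in agreement with the rounded value of 0.6 quoted above.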
IV. EXAMPLES OF APPLICATIONS
To demonstrate the potential of electron off-axis holography, we now proceed to a discussion of several experimental examples dealing with nonmagnetic crystalline objects. Digital reconstructions dealing with magnetic structures have been reported by Matsuda et al. (1991), Ru et al. (1991), and Tonomura (1992). The reconstruction results presented in this article have been obtained without the correction of aberrations. Because a large number of pixels and exact knowledge of the aberration coefficients are required for a reliable high-resolution reconstruction (cf. Section II,F), only a few examples of reconstructions at atomic resolutions have yet been published (cf. Lichte, 1991a, 1992a,b; Kawasaki et al., 1992). The first examples to be discussed here deal with thickness measurement. They are followed by another example in which the dynamical phase effects of transmitted waves in crystals are studied. The final example deals with the detection of lattice defects by means of holography. The holograms were produced in the electron microscope (Philips EM 400 T, equipped with a field emission gun and an electrostatic biprism) and recorded on photographic plates. The developed plates were scanned by means of a video camera, and the signals were digitized in a 512 × 512 raster with an 8-bit grey scale by means of a TEMDIPS image-processing system.¹⁰ As mentioned in Section I,A, the goal of electron holography is the reconstruction of the original electron wave at the exit side of the object, and particularly the quantitative determination of the phase distribution of this wave.
A. Thickness Measurement
Within the validity range of the kinematical theory, a non-magnetic crystal can be considered as a refracting medium. Because of refraction on the two generally non-parallel surfaces of the crystal, the electron wave incident in a direction far from the Bragg position will therefore be deflected from its original direction of propagation after leaving the crystal. Ignoring relativistic corrections, the phase distribution of the wave just behind the crystal can be expressed as
$$\Phi = \frac{\pi U_0t}{\lambda U_a}, \tag{IV.1}$$
where the symbol $U_0$ represents the mean inner potential, $t$ the crystal thickness, $U_a$ the accelerating voltage of the electrons, and $\lambda$ their wavelength.
¹⁰ Tietz Video & Image Processing Systems, Gauting, Germany.
FIGURE 13. Digital phase reconstruction of an MgO crystal. (a) Two-times amplified phase reconstruction with fringes. (b) Perspective view of the crystal obtained by differentiating the image shown in (a) by a Laplacian.
Since $\Phi$ is proportional to the thickness $t$, the reconstructed phase image offers a direct means for thickness measurements. As demonstrated in Fig. 13, the whole thickness profile of the crystal can be made visible by holographic reconstruction. The relation (IV.1) shows that the phase distribution $\Phi$ is determined by the product of the inner potential $U_0$ and the crystal thickness $t$. If the phase $\Phi$ and one of these parameters are known, the other parameter can then be easily determined by means of (IV.1). The holographic determination of the mean inner potential $U_0$ was demonstrated by Lauer et al. (1980) by light-optical reconstruction. Here we demonstrate the other possibility, i.e., the measurement of thickness variations. The objects investigated are flat platelets of MgO crystals with several thickness steps on their surfaces. These steps are barely visible in conventional electron micrographs or holograms. To enhance the sensitivity of phase measurement, the amplification techniques described in Section II,E were used. The reconstruction results are shown in Figs. 14 and 15. The fringes represent lines of equal phase, and the bends $\delta\mathbf{x}$ of the fringes correspond to local phase changes $\delta\Phi$. If we take an amplification rate $n$ into account, by means of (II.25) we obtain
$$\delta\Phi = 2\pi\,\delta\mathbf{x}\cdot\mathbf{R}_c/n. \tag{IV.2}$$
For convenience, we take the direction parallel to the fringes as the $x$ direction. In this case the scalar product $\delta\mathbf{x}\cdot\mathbf{R}_c$ can be replaced by $\delta y/\varepsilon_s$, where $\varepsilon_s$ is the fringe distance. Equation (IV.2) can then be rewritten as
$$\delta\Phi = \frac{2\pi\,\delta y}{n\varepsilon_s}. \tag{IV.3}$$
FIGURE 14. Digital phase reconstruction of an almost square depression on the surface of a thin MgO platelet (see also Ade and Lauer, 1992a). (a) Usual phase reconstruction (contour map). (b) Differentiated phase with an amplification rate of 12. The bends of the fringes correspond to a step height of 2.2 nm.
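If we now express $\delta\Phi$ as
$$\delta\Phi = \frac{\pi U_0\,\delta t}{\lambda U_a} \tag{IV.4}$$
by means of the step height $\delta t$ of interest [see Eq. (IV.1)], the following result is obtained:
$$\delta t = \frac{2\lambda U_a}{nU_0}\,\frac{\delta y}{\varepsilon_s}. \tag{IV.5}$$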
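The evaluation of a step height from a measured fringe bend can be sketched as follows, using the kinematical relation between phase and thickness and the MgO data quoted in this section ($U_0 = 16.23$ V; 3.7 pm at 100 kV and 4.87 pm at 60 kV). The function names are illustrative, and the relative fringe bend in the example is computed from the step height rather than measured.

```python
import math


def phase_shift(step, lam, u_a, u_0):
    """Phase change pi*U0*dt/(lam*Ua) for a thickness step dt; step and lam in nm."""
    return math.pi * u_0 * step / (lam * u_a)


def step_height(bend_over_spacing, lam, u_a, u_0, amplification):
    """Step height dt = 2*lam*Ua*(dy/eps_s)/(n*U0) from the relative fringe bend."""
    return 2.0 * lam * u_a * bend_over_spacing / (amplification * u_0)


U_0 = 16.23  # V, mean inner potential of MgO quoted in the text

# Fig. 14: 100 kV, 2.2 nm step; Fig. 15: 60 kV, 3.24 nm and 0.42 nm steps.
phi_a = phase_shift(2.2, 3.7e-3, 100e3, U_0)
phi_b = phase_shift(3.24, 4.87e-3, 60e3, U_0)
phi_c = phase_shift(0.42, 4.87e-3, 60e3, U_0)

# Relative fringe bend expected for the monoatomic step at 32-fold amplification.
rel_bend = 32 * phi_c / (2.0 * math.pi)
recovered = step_height(rel_bend, 4.87e-3, 60e3, U_0, 32)
```

The three phase shifts come out near $2\pi/21$, $2\pi/11$, and $2\pi/86$, consistent with the rounded values quoted in the text, and the monoatomic step bends the 32-times amplified fringes by roughly a third of the fringe spacing.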
The holograms leading to the reconstruction results of Figs. 14 and 15 were produced in the electron microscope at an accelerating voltage of 100 kV and 60 kV, respectively. The corresponding wavelengths are 3.7 and 4.87 pm. For the mean inner potential $U_0$, the theoretical value of 16.23 V can be used (cf. Radi, 1970). With these numerical data, the step height $\delta t$ can be easily determined according to Eq. (IV.5) by inserting the measured values for the fringe distance $\varepsilon_s$ and the bend $\delta y$ of the fringes. For the examples shown in Figs. 14 and 15, step heights of 2.2 nm, 3.24 nm, and 0.42 nm were determined. These values correspond to a phase shift of about $2\pi/20$, $2\pi/12$, and $2\pi/90$, respectively.
FIGURE 15. Digital phase reconstruction of atomic steps in thin MgO crystals (Ade and Lauer, 1992a,b). Left column: Bright-field image taken under exciting the (200)-reflection (top) and corresponding 8-times amplified phase reconstruction (bottom) obtained from a hologram of the area marked by a rectangle in the bright-field image. To obtain a topographic view of the steps, two orthogonal fringe systems according to Eq. (II.25) were produced and superposed after differentiation by a Laplacian. The step within the encircled area corresponds to a thickness variation of 3.24 nm. Right column: Dark-field image of the crystal (200-reflection) revealing some surface steps (top) and corresponding digital 32-times phase-amplified reconstruction from a hologram of the area marked by a rectangle in the dark-field image. The fringe step within the encircled area corresponds to a monoatomic step leading to a thickness variation of 0.42 nm.
B. Study of Dynamical Phase Effects
As mentioned earlier, the phase distribution (IV.1) is only valid in the kinematical case, where the direction of the incident electron beam is far from the Bragg position. Much more complicated conditions occur when a direction of illumination near Bragg reflection is realized, i.e., when the dynamical theory is to be applied. A full review of this theory can be found in the book by Hirsch et al. (1965; see also Reimer, 1984). The dynamical amplitude and phase distributions of the electron wave at the exit side of the crystal can, in general, only be obtained by numerical calculation. Analytical expressions for these distributions can, however, be derived within the validity range of the so-called two-beam approximation. Usually, only expressions for the amplitude distribution can be found in the literature. Analytical calculation of dynamical phase distributions in the absence of absorption was carried out by Ade more than 10 years ago
(see Ade et al., 1980). Detailed calculations of amplitude and phase distributions including the effect of absorption can be found in an extended article by Ade (1986a). The results are derived for the two-beam case, where besides the direct wave only one diffracted wave is excited. In this case, the incident wave is split up into two partial waves propagating in the direction of the incident wave, and two partial waves propagating in the direction of the Bragg-reflected wave. The current of the electron wave therefore oscillates between the two directions (Pendellösung). Usually, only the direct waves are employed in holographic work. But even in this case, i.e., if the diffracted waves are screened off by means of a diaphragm, the encoded phase information is found to depend not only on the mean inner potential, as is kinematically expected. Because of the interaction between the direct and the diffracted waves, the structure potential $U_g$ also contributes to the phase modulation. According to the results of Ade (1986a,b,c), the amplitude and phase distributions of the total direct wave in the absorption-free case can be written as
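$$A(t, w) = \left[1 - \frac{\sin^2 X}{1 + w^2}\right]^{1/2}, \tag{IV.6a}$$

$$\Phi(t, w) = \frac{\pi t U_0}{U_g \xi_g} + \frac{\pi w t}{\xi_g} - \arctan\left[\frac{w}{\sqrt{1 + w^2}} \tan X\right], \tag{IV.6b}$$

where

$$X = \frac{\pi t}{\xi_g} \sqrt{1 + w^2}. \tag{IV.6c}$$

If absorption is taken into account, the corresponding expressions are given by

$$A(t, w) = \exp(-\pi t/\xi_0') \left[1 - \frac{\sin^2 X}{1 + w^2} + \frac{1 + 2w^2}{1 + w^2} \sinh^2 X' + \frac{w}{\sqrt{1 + w^2}} \sinh 2X'\right]^{1/2}, \tag{IV.7a}$$

$$\Phi(t, w) = \frac{\pi t U_0}{U_g \xi_g} + \frac{\pi w t}{\xi_g} - \arctan\left[\frac{\sinh X' + \left(w/\sqrt{1 + w^2}\right) \cosh X'}{\cosh X' + \left(w/\sqrt{1 + w^2}\right) \sinh X'} \tan X\right], \tag{IV.7b}$$

where

$$X' = \frac{\pi t}{\xi_g' \sqrt{1 + w^2}}. \tag{IV.7c}$$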
In these equations, the parameter

$$\xi_g = \lambda U_a / U_g \tag{IV.8}$$

represents the so-called extinction thickness. In a crystal of this thickness, the electron beam oscillates once from the direction of the incident wave to that of the diffracted wave and then back to the original direction. The value of $\xi_g$ is mainly determined by the structure potential $U_g$, and thus it depends on the reflection excited. Furthermore, the parameters $\xi_0'$ and $\xi_g'$ are used to describe the effects of “mean absorption” and “anomalous absorption,” respectively. As can be recognized in Eq. (IV.7), the mean absorption parameter occurs only in the envelope function $\exp(-\pi t/\xi_0')$ of the amplitude $A$, and thus leads to a strong attenuation of the intensity as a function of the crystal thickness $t$. Finally, the dimensionless parameter $w$ is used to denote the deviation $\delta\theta$ from the Bragg angle $\theta$. It is defined as

$$w = \xi_g\, \delta\theta / d = 2 \xi_g\, \delta\theta \sin\theta / \lambda, \tag{IV.9}$$
where $d$ denotes the lattice plane spacing. The amplitude and phase distributions just given have been numerically evaluated, and the results are graphically displayed in Figs. 16 and 17. The kinematical phase part $\Phi_{kin} = \pi t U_0/(U_g \xi_g) = \pi t U_0/(\lambda U_a)$ has already been discussed in the preceding section. It describes a linear increase of the phase as a function of $t$ and leads only to an inclination of the curves with respect to the horizontal $t$-axis. It has, therefore, been omitted in the curves in Fig. 17. As can be seen in Eq. (IV.7) and Fig. 16, both the intensity (squared amplitude) and phase distributions depend strongly on the deviation parameter $w$. In contrast to the intensity, the phase distribution also depends on the sign of $w$. According to a proposal of Lauer (1982a), the whole range of the dynamical phenomena can be experimentally covered by employing astigmatic illumination and placing one of the two resulting line foci near the object (see Figs. 11, 16c, and 18). In this case, the object is coherently illuminated by a quasicylindrical wave, and the illumination of each object element parallel to the line focus occurs at a different value of the deviation parameter $w$. To give experimental proof of the previously mentioned theoretical results, a wedge-shaped silicon crystal was used as an object. The orientation of the line focus of the astigmatic illumination with respect to the crystal edge is shown schematically in Fig. 18. As can be seen in the holographic reconstructions shown in Figs. 16d and 17d, the steps of the fringe systems have exactly the characteristics predicted by the theory according to Figs. 16b and 17b.
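The numerical evaluation mentioned above can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: it uses the standard two-beam form of the direct wave (a cosine and a sine term weighted by hyperbolic functions of the anomalous-absorption argument), which is assumed here to correspond to Eqs. (IV.6) and (IV.7); the function names, the choice of measuring the thickness in units of the extinction thickness, and the default absorption ratios are all hypothetical. The kinematical phase part is left out, as in Fig. 17.

```python
import cmath
import math


def direct_wave(t, w, k0=0.0, kg=0.0):
    """Dynamical part of the two-beam direct wave (illustrative sketch).

    t  : crystal thickness in units of the extinction thickness
    w  : dimensionless deviation parameter of Eq. (IV.9)
    k0 : ratio of extinction thickness to mean-absorption length (assumed name)
    kg : ratio of extinction thickness to anomalous-absorption length (assumed name)
    With k0 = kg = 0 the absorption-free case is obtained.
    """
    root = math.sqrt(1.0 + w * w)
    u = w / root
    x = math.pi * t * root         # argument X
    xp = math.pi * t * kg / root   # argument X' (zero without absorption)
    bracket = complex(
        math.cos(x) * (math.cosh(xp) + u * math.sinh(xp)),
        -math.sin(x) * (math.sinh(xp) + u * math.cosh(xp)),
    )
    envelope = math.exp(-math.pi * t * k0)
    return envelope * cmath.exp(1j * math.pi * w * t) * bracket


def amplitude_closed_form(t, w, k0=0.0, kg=0.0):
    """Modulus of the same wave written out as a single real expression."""
    root = math.sqrt(1.0 + w * w)
    u = w / root
    x = math.pi * t * root
    xp = math.pi * t * kg / root
    inside = (1.0 - math.sin(x) ** 2 / (1.0 + w * w)
              + (1.0 + u * u) * math.sinh(xp) ** 2
              + u * math.sinh(2.0 * xp))
    return math.exp(-math.pi * t * k0) * math.sqrt(max(inside, 0.0))


if __name__ == "__main__":
    # Amplitude and wrapped dynamical phase along the thickness axis for a few w.
    for w in (-1.0, 0.0, 1.0):
        for t in (0.25, 0.5, 0.75, 1.0):
            psi = direct_wave(t, w, k0=0.16, kg=0.025)
            print(f"w={w:+.1f} t={t:.2f} A={abs(psi):.4f} phase={cmath.phase(psi):+.4f}")
```

Working with the complex wave and taking its modulus and argument avoids the branch ambiguity of the arctangent; `cmath.phase` returns the phase wrapped to the interval from $-\pi$ to $\pi$, so the phase jumps at the extinction contours show up as wraps.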
FIGURE 16. Thickness fringes and dynamical phase distribution of the direct wave behind a wedge-shaped crystal (see also Ade, 1986a,b). (a) Normalized intensity distribution (squared amplitude) for a crystal orientation at Bragg position ($w = 0$) as a function of the crystal thickness $t$. Because of absorption, the thickness fringes fade out with increasing $t$. (b) Phase distribution $\Phi(t, w)$ as a function of the thickness $t$ with $w$ as a parameter. For clarity, the curves are drawn vertically displaced. The inclination of the curves is determined by the kinematical part $\Phi_{kin}$ of $\Phi$. The bends of the curves and the steps occurring at the extinction contours (locations of vanishing intensity) are seen to diminish with increasing values of the deviation $|w|$ from the Bragg position. Numerical data for Si (220) according to Radi (1970): $U_0/U_g = 2.63$; $\xi_g/\xi_0' = U_0'/U_g = 0.16$; $\xi_g/\xi_g' = U_g'/U_g = 0.025$. (c) Simplified diagram for crystal illumination by a line focus (see also Fig. 18). In this way, the deviation parameter $w$ varies in proportion to the $x$ coordinate. (d) Differentiated phase image of a silicon crystal obtained from an off-axis hologram produced under bright-field conditions (excited reflection 220). A comparison of this reconstruction with the theoretical results shown in (b) reveals a remarkable correspondence.
FIGURE 17. Dynamical part $\Phi_d$ of the phase distribution $\Phi(t, w)$ as a function of the crystal thickness $t$ with $w$ as a parameter (see also Ade, 1986a,b). (a) In the absence of absorption, the phase distribution $\Phi_d$ is symmetrical about $w = 0$. The phase steps occurring at the extinction contours degenerate into phase jumps of $\pi$ at $w = 0$. (b) When absorption is included, the phase distribution $\Phi_d$ becomes asymmetric about $w = 0$, and the phase jumps at $w = 0$ are smoothed. Furthermore, steps of opposite sign may also occur. (c) Simplified diagram for crystal illumination (see legend of Fig. 16). (d) Differentiated phase image corresponding to that of Fig. 16d, but with fringes directed horizontally to remove the effect of the kinematical phase part. The experimental result is seen to be in accord with the theoretical predictions.
C. Crystal Defects

As a final example we consider the reconstruction of a wedge-shaped silicon crystal with an inclined stacking fault; see Fig. 19a (for more details, see Ade and Lauer, 1981; Hanszen, 1986).
FIGURE 18. Astigmatic illumination used for the investigation of dynamical phase effects of transmitted waves behind a wedge-shaped crystal (Lauer and Ade, 1990, poster presentation). The first line focus is parallel to the biprism filament, and the second one is perpendicular to the crystal edge. The illumination angle $\alpha$ and, correspondingly, the deviation parameter $w$ vary in proportion to the coordinate $x$.
An off-axis hologram of the crystal was produced in the electron microscope at an accelerating voltage of 100 kV. The crystal was placed so that the part containing the fault and a perfect part of the crystal occupied different areas on opposite sides of the optical axis. In this way, the kinematical phase shift introduced by the mean inner potential is totally eliminated, because the hologram is formed by means of two electron waves passing through congruent parts of the crystal. Only the excited diffraction order (220) was used for hologram formation. A dark-field micrograph of the crystal at the Bragg position ($w = 0$) and the corresponding digital phase reconstruction are shown in Figs. 19b and 19d. The fault can be recognized in the micrograph by contrast phenomena. The wedge-shaped area in this micrograph is the projection of the fault plane. In the holographic reconstruction, the fringes are directed horizontally. However, their azimuth can be arbitrarily tilted without any loss of information. As can be easily recognized, the fringe positions on different sides of the fault reveal a phase shift of one-third of the fringe distance. This corresponds exactly to the actual shift of the lattice planes.

FIGURE 19. (a) Perspective view of a wedge-shaped crystal with a stacking fault. (b) Electron dark-field micrograph of a silicon crystal at $w = 0$. (c) Calculated positions of the fringes for a crystal thickness smaller than the extinction thickness $\xi_g$. (d) Digital phase reconstruction with horizontal fringes. A phase shift of one-third of the fringe distance $d'$ corresponding to the actual shift of the lattice planes can be clearly recognized between the fringe positions on opposite sides of the fault.
For a simple interpretation of the results, the relevant theoretical relations will now be presented. The theory is based on four Bloch waves which interfere just behind the crystal. General expressions for the wave functions of the direct and diffracted waves were derived by Whelan and Hirsch (1957) and by Hashimoto et al. (1962), including the effect of absorption. According to these results, the wave function of the total diffracted wave for a crystal orientation at the Bragg position ($w = 0$) can be written as

$$\psi_g = \psi_0 \exp\left(\frac{i \pi U_0 t}{\lambda U_a}\right) i \left\{ \sin\frac{\pi t}{\xi_g} - \frac{1 - e^{-i\alpha}}{2} \left[ \sin\frac{\pi t}{\xi_g} + \sin\frac{2 \pi z}{\xi_g} \right] \right\}, \tag{IV.10}$$

where $\psi_0$ represents a constant amplitude, $z$ is a coordinate indicating the fault plane, and $\alpha = 2\pi/3$ is a phase angle which characterizes the displacement of the lattice planes. Equation (IV.10) is also applicable to the case of absorption, provided we replace $1/\xi_g$ by $1/\xi_g + i/\xi_g'$ and multiply the expression on the right-hand side by $\exp(-\pi t/\xi_0')$ to include mean absorption. As can be expected, Eq. (IV.10) reduces to the perfect crystal expression

$$\psi_{g,\mathrm{perf}} = \psi_0 \exp\left(\frac{i \pi U_0 t}{\lambda U_a}\right) \left[ i \sin\frac{\pi t}{\xi_g} \right] \tag{IV.11}$$

if $\alpha$ is zero or $2\pi n$, where $n$ is an integer. Furthermore, even when $\alpha$ is nonzero, the expression (IV.10) joins continuously to that of the perfect crystal at the top ($z = -t/2$) and bottom ($z = +t/2$) surfaces. Thus, in full agreement with the experimental result, the observable phase difference $\delta\Phi$ of the wave functions on opposite sides of the fault is found to be

$$\delta\Phi = \alpha = 2\pi/3. \tag{IV.12}$$
For comparison with the holographic reconstruction, the theoretical phase curves based on Eq. (IV.10) including absorption are shown in Fig. 19c for various values of the crystal thickness $t$. As can be recognized, they are in good accord with the reconstruction result. Other examples with larger deviation from the Bragg position ($w = 0$) and corresponding calculations based on the general expression for the direct wave are given by Ade and Lauer (1981; see also Hanszen, 1986). According to these theoretical investigations, the result (IV.12) can in principle be obtained with an arbitrary value of $w$. Hence, an exact adjustment of the Bragg position is not an essential requirement for the experimental determination of the displacement parameter $\alpha$.
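The behavior described for the faulted crystal can be checked numerically. The sketch below is a minimal illustration, not the author's calculation: it assumes the absorption-free two-beam form of the diffracted wave at the Bragg position, written as a perfect-crystal term minus a fault term, omits the constant amplitude and the kinematical phase factor, and uses hypothetical function and parameter names. It confirms that the expression joins onto the perfect-crystal wave at both surfaces and that the phase difference across the fault equals the phase angle of the displacement.

```python
import cmath
import math


def diffracted_wave(t, z, alpha, xi_g=1.0):
    """Diffracted wave behind a crystal with a stacking fault (sketch).

    t     : crystal thickness
    z     : depth coordinate of the fault plane, from -t/2 (top) to +t/2 (bottom)
    alpha : phase angle of the lattice displacement
    xi_g  : extinction thickness (same unit as t and z)
    Assumptions: Bragg position (w = 0), no absorption, constant prefactors omitted.
    """
    perfect = 1j * math.sin(math.pi * t / xi_g)
    fault = (0.5 * (1.0 - cmath.exp(-1j * alpha)) * 1j
             * (math.sin(math.pi * t / xi_g) + math.sin(2.0 * math.pi * z / xi_g)))
    return perfect - fault


if __name__ == "__main__":
    t = 0.4                      # thinner than the extinction thickness, as in Fig. 19c
    alpha = 2.0 * math.pi / 3.0
    top = diffracted_wave(t, -t / 2.0, alpha)
    bottom = diffracted_wave(t, +t / 2.0, alpha)
    shift = cmath.phase(top / bottom)
    print("phase difference / (2*pi) =", shift / (2.0 * math.pi))
```

A phase difference of one-third of $2\pi$ corresponds to a fringe displacement of one-third of the fringe distance in the reconstruction.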
V. CONCLUSIONS AND FUTURE PROSPECTS
The theoretical and experimental results of the preceding sections demonstrate clearly that the digital reconstruction of electron holograms provides a powerful means for a quantitative determination of the object function of interest. By employing the phase amplification techniques, for example, phase shifts of the order of $2\pi/100$ or less can be detected. This leads to a variety of practical applications, such as the observation of microscopic magnetic structures and thickness measurements on an atomic scale. Furthermore, the ability to correct the wave aberration and the attenuating effect of the envelope functions due to partial coherence makes it possible to attain ultrahigh point resolutions of less than 0.1 nm. This allows real crystal structures such as grain boundaries and point defects to be investigated. Such an ability is indispensable for new technologies aiming to control materials in the atomic range. In addition, because the objective aperture can be appreciably increased, the problem of radiation damage can be considerably reduced. However, as discussed in Sections II,F and III,A-C, there are some obstacles to be removed before the digital reconstruction method is firmly established. The problem of generating holograms with high contrast and a sufficient number of fine fringes to enable the recording of object structures at a resolution of about 0.1 nm can be overcome by employing modern electron microscopes with high-brightness electron guns and taking careful measures against instabilities. The difficulties connected with the photographic recording of the hologram, such as photographic noise and the nonlinearity of the photographic process, can be reduced to a large extent if the hologram is recorded electronically. Despite its long processing time and other disadvantages, the photoplate is still favored as a recording medium because of its high number of pixels and ease of use.
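The link between the quoted phase sensitivity and thickness measurements on an atomic scale can be made concrete with the kinematical phase relation $\Phi_{kin} = \pi t U_0/(\lambda U_a)$ of Section IV. The following Python sketch rests on two assumptions that are not taken from the text: $U_a$ is treated as the plain accelerating voltage without relativistic correction, and the mean inner potential of 16.23 V quoted in Section IV,A is used throughout. The function names are hypothetical.

```python
import math

U0_VOLT = 16.23  # mean inner potential quoted in Section IV,A (assumed to apply here)


def kinematical_phase(thickness_m, wavelength_m, u_acc_volt, u0_volt=U0_VOLT):
    """Kinematical phase shift (rad) of a thickness step, U_a taken as the plain voltage."""
    return math.pi * thickness_m * u0_volt / (wavelength_m * u_acc_volt)


def thickness_for_phase(phase_rad, wavelength_m, u_acc_volt, u0_volt=U0_VOLT):
    """Thickness step (m) that produces a given kinematical phase shift."""
    return phase_rad * wavelength_m * u_acc_volt / (math.pi * u0_volt)


if __name__ == "__main__":
    # Step heights and microscope settings quoted in Section IV,A.
    cases = [(2.2e-9, 3.7e-12, 100e3), (3.24e-9, 4.87e-12, 60e3), (0.42e-9, 4.87e-12, 60e3)]
    for step, lam, volt in cases:
        phi = kinematical_phase(step, lam, volt)
        print(f"step {step * 1e9:.2f} nm -> phase of about 2*pi/{2 * math.pi / phi:.0f}")
    # Thickness step corresponding to a phase sensitivity of 2*pi/100 at 100 kV.
    t_min = thickness_for_phase(2 * math.pi / 100, 3.7e-12, 100e3)
    print(f"2*pi/100 at 100 kV corresponds to about {t_min * 1e9:.2f} nm")
```

Under these assumptions the three step heights reproduce the quoted phase shifts only approximately, which is consistent with the word "about" in Section IV,A; a relativistically corrected voltage would shift the numbers slightly.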
However, changes will come about in electron holography because of the introduction of high-grade CCD camera systems. These have pixel sizes equivalent to those of photoplates and a dynamic range of about 10,000, which exceeds that of the plate by a factor of 10. The rapid progress in modern slow-scan CCD technology and the continuing increase in computing power will surely allow digital reconstruction to be performed on-line at the electron microscope. The major difficulty in reaching the highest resolutions seems to lie in the determination of the wave aberration of the electron microscope. As mentioned in Section III,F, the aberration correction is performed by multiplying the spectrum of the sideband used for reconstruction by an appropriate phase filter. The data of the electron microscope, and particularly the aberration coefficients occurring in the wave aberration, must therefore be exactly known for each particular hologram. According to Lichte (1993), a precision of a few parts per thousand is needed if resolutions of about 0.1 nm are to be achieved. However, this is not too far from being reached (see Coene and Denteneer, 1991; Koster and de Jong, 1991). In conclusion, we note that all the signs point to an exciting future for electron holography.

ACKNOWLEDGMENTS
I thank Dr. R. Lauer for his critical reading of the manuscript, Mr. W. Buchholz for his cooperation and efficient assistance in preparing the line drawings, Mr. U. Dierksen for his graphical work, and Mrs. P. Helm for putting the final polish on the English text.

REFERENCES

Ade, G. (1978). Optik 50, 143.
Ade, G. (1980). Electron Microsc., Proc. Eur. Congr., 7th, The Hague, 1980, Vol. 1, p. 138.
Ade, G. (1981). Optik 58, 321.
Ade, G. (1982a). Optik 62, 67.
Ade, G. (1982b). Optik 63, 43.
Ade, G. (1982c). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-18, 1.
Ade, G. (1986a). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-26, 1.
Ade, G. (1986b). Electron Microsc., Proc. Int. Congr., 11th, Kyoto, 1986, Vol. 1, p. 687.
Ade, G. (1986c). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-30, 27.
Ade, G., and Lauer, R. (1981). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-16, 25.
Ade, G., and Lauer, R. (1988). Electron Microsc., Proc. Eur. Congr., 9th, York, 1988, Vol. 1, p. 203.
Ade, G., and Lauer, R. (1990). Electron Microsc., Proc. Int. Congr., 12th, Seattle, 1990, Vol. 1, p. 232.
Ade, G., and Lauer, R. (1991). Optik 88, 103.
Ade, G., and Lauer, R. (1992a). Optik 91, 5.
Ade, G., and Lauer, R. (1992b). PTB-Mitt. 102, 181.
Ade, G., Hanszen, K.-J., and Lauer, R. (1980). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-15, 11.
Ade, G., Lauer, R., and Hanszen, K.-J. (1985). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-25, 45.
Born, M., and Wolf, E. (1959). “Principles of Optics.” Pergamon, Oxford.
Champeney, D. C. (1973). “Fourier Transforms and their Physical Applications.” Academic Press, London and New York.
Coene, W. M. J., and Denteneer, T. J. J. (1991). Ultramicroscopy 38, 225.
Collier, R. J., Burckhardt, C. B., and Lin, L. H. (1971). “Optical Holography.” Academic Press, New York and London.
de Ruijter, W. J. (1992). PhD Thesis, Delft University.
de Ruijter, W. J., and Weiss, J. K. (1993). Ultramicroscopy 50, 269.
De Velis, J. B., Parrent, G. B., and Thompson, B. J. (1966). J. Opt. Soc. Am. 56, 423.
Downing, K. H., and Grano, D. A. (1982). Ultramicroscopy 7, 381.
Frank, J. (1973). Optik 38, 519.
Franke, F. J., Herrmann, K.-H., and Lichte, H. (1986). Electron Microsc., Proc. Int. Congr., 11th, Kyoto, 1986, Vol. 1, p. 677.
Frieser, H., and Klein, E. (1958). Z. Angew. Phys. 10, 337.
Frieser, H., Klein, E., and Zeitler, E. (1959). Z. Angew. Phys. 11, 190.
Fu, Q., Lichte, H., and Völkl, E. (1991). Phys. Rev. Lett. 67, 2319.
Gabor, D. (1948). Nature (London) 161, 777.
Gabor, D. (1949). Proc. R. Soc. London, Ser. A 197, 454.
Gabor, D. (1951). Proc. Phys. Soc. London, Ser. B 64, 449.
Haine, M. E., and Mulvey, T. (1952). J. Opt. Soc. Am. 42, 763.
Haine, M. E., and Mulvey, T. (1953). Proc. Int. Congr. Electron Microsc., 1st, Paris, 1950, p. 120.
Hanszen, K.-J. (1970). Optik 32, 74.
Hanszen, K.-J. (1971). Adv. Opt. Electron Microsc. 4, 1.
Hanszen, K.-J. (1980). Electron Microsc., Proc. Eur. Congr., 7th, The Hague, 1980, Vol. 1, p. 136.
Hanszen, K.-J. (1982). Adv. Electron. Electron Phys. 59, 1.
Hanszen, K.-J. (1983). Optik 65, 153.
Hanszen, K.-J. (1986). J. Phys. D 19, 373.
Hanszen, K.-J., and Ade, G. (1983). Optik 63, 247.
Hanszen, K.-J., and Ade, G. (1984). Optik 68, 81.
Hanszen, K.-J., and Trepte, L. (1971a). Optik 32, 519.
Hanszen, K.-J., and Trepte, L. (1971b). Optik 33, 166 and 182.
Hanszen, K.-J., Lauer, R., and Ade, G. (1983). Optik 63, 285.
Hanszen, K.-J., Lauer, R., and Ade, G. (1985a). Ultramicroscopy 16, 47.
Hanszen, K.-J., Lauer, R., and Ade, G. (1985b). Optik 71, 64.
Hanszen, K.-J., Ade, G., and Lauer, R. (1986). Optik 73, 163.
Hashimoto, H., Howie, A., and Whelan, M. J. (1962). Proc. R. Soc. London, Ser. A 269, 80.
Hawkes, P. W. (1968a). Br. J. Appl. Phys. 1, 131 and 1549.
Hawkes, P. W. (1968b). Optik 27, 287.
Hawkes, P. W. (1970). Adv. Electron. Electron Phys., Suppl. 7, Section II.C.
Hawkes, P. W. (1978). Adv. Opt. Electron Microsc. 7, 101.
Herrmann, K.-H. (1982). Electron Microsc., Pap. Int. Congr., 10th, Hamburg, 1982, Vol. 1, p. 131.
Hibi, T. (1956). J. Electron Microsc. 4, 10.
Hirsch, P. B., Howie, A., Nicholson, R. B., Pashley, D. W., and Whelan, M. J. (1977). “Electron Microscopy of Thin Crystals.” R. Krieger Publishing Co., Malabar, Florida.
Kawasaki, T., Ru, Q. X., and Tonomura, A. (1992). Electron Microsc., Proc. Eur. Congr., 10th, Granada, 1992, Vol. 1, p. 653.
Koster, A. J., and de Jong, A. F. (1991). Ultramicroscopy 38, 235.
Lauer, R. (1981). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-16, 19.
Lauer, R. (1982a). Electron Microsc., Pap. Int. Congr., 10th, Hamburg, 1982, Vol. 1, p. 427.
Lauer, R. (1982b). Adv. Opt. Electron Microsc. 8, 137.
Lauer, R. (1984). Optik 66, 159.
Lauer, R., and Ade, G. (1990). Electron Microsc., Proc. Int. Congr., 12th, Seattle, 1990, Vol. 1, p. 230.
Lauer, R., and Hanszen, K.-J. (1986a). Electron Microsc., Proc. Int. Congr., 11th, Kyoto, 1986, Vol. 1, p. 681.
Lauer, R., and Hanszen, K.-J. (1986b). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-30, 15.
Lauer, R., Hanszen, K.-J., and Ade, G. (1980). Ber. APh-Phys.-Tech. Bundesanst. PTB-APh-15, 19.
Leith, E. N., and Upatnieks, J. (1962). J. Opt. Soc. Am. 52, 1123.
Leith, E. N., and Upatnieks, J. (1967). Prog. Opt. 6, 1.
Lenz, F. (1988). Optik 79, 13.
Lenz, F., and Völkl, E. (1990). Electron Microsc., Proc. Int. Congr., 12th, Seattle, 1990, Vol. 1, p. 228.
Lichte, H. (1982). Optik 70, 176.
Lichte, H. (1986). Ultramicroscopy 20, 293.
Lichte, H. (1991a). Adv. Opt. Electron Microsc. 12, 25.
Lichte, H. (1991b). Ultramicroscopy 38, 13.
Lichte, H. (1992a). Ultramicroscopy 47, 223.
Lichte, H. (1992b). Electron Microsc., Proc. Eur. Congr., 10th, Granada, 1992, Vol. 1, p. 637.
Lichte, H. (1993). Ultramicroscopy 51, 15.
Lichte, H., Herrmann, K.-H., and Lenz, F. (1987). Optik 77, 135.
Macy, W. W., Jr. (1983). Appl. Opt. 22, 3898.
Matsuda, T., Fukuhara, A., Yoshida, T., Hasegawa, S., Tonomura, A., and Ru, Q. (1991). Phys. Rev. Lett. 66, 457.
Menzel, E., Mirandé, W., and Weingärtner, I. (1973). “Fourier-Optik und Holographie.” Springer-Verlag, Vienna and New York.
Missiroli, G. F., Pozzi, G., and Valdrè, U. (1981). J. Phys. E 14, 649.
Möllenstedt, G., and Düker, H. (1956). Z. Phys. 145, 377.
Möllenstedt, G., and Wahl, H. (1968). Naturwissenschaften 55, 340.
Radi, G. (1970). Acta Crystallogr., Sect. A A26, 41.
Reimer, L. (1984). “Transmission Electron Microscopy,” Springer Series in Optical Sciences, Vol. 36. Springer-Verlag, Berlin and New York.
Ru, Q., Matsuda, T., Fukuhara, A., and Tonomura, A. (1991). J. Opt. Soc. Am. A 8, 1739.
Ru, Q., Endo, J., and Tonomura, A. (1992). Appl. Phys. Lett. 60, 2840.
Takeda, M., and Ru, Q. (1985). Appl. Opt. 24, 3068.
Takeda, M., Ina, H., and Kobayashi, S. (1982). J. Opt. Soc. Am. 72, 156.
Thompson, B. J. (1965). Jpn. J. Appl. Phys. 4, Suppl. 1, 302.
Tomita, A., Matsuda, T., and Komoda, T. (1970). Jpn. J. Appl. Phys. 9, 719.
Tomita, A., Matsuda, T., and Komoda, T. (1971). Electron Microsc., Proc. Int. Congr., 7th, Grenoble, 1970, Vol. 1, p. 151.
Tomita, A., Matsuda, T., and Komoda, T. (1972). Jpn. J. Appl. Phys. 11, 143.
Tonomura, A. (1986). Prog. Opt. 23, 185.
Tonomura, A. (1987). Rev. Mod. Phys. 59, 639.
Tonomura, A. (1992). Ultramicroscopy 47, 419.
Tonomura, A., Fukuhara, A., Watanabe, H., and Komoda, T. (1968a). Jpn. J. Appl. Phys. 7, 295.
Tonomura, A., Fukuhara, A., Watanabe, H., and Komoda, T. (1968b). Proc. Reg. Conf. Electron Microsc., 4th, Rome, 1968, Vol. 1, p. 277.
Tonomura, A., Endo, J., and Matsuda, T. (1979a). Optik 53, 143.
Tonomura, A., Matsuda, T., and Endo, J. (1979b). Jpn. J. Appl. Phys. 18, 9.
Tonomura, A., Matsuda, T., Endo, J., Todokoro, H., and Komoda, T. (1979c). J. Electron Microsc. 28, 1.
Völkl, E., and Allard, L. F. (1994). MSA Bull. 24(1), 466.
Wade, R. H. (1980). In “Computer Processing of Electron Microscope Images” (P. W. Hawkes, ed.), p. 223. Springer-Verlag, Berlin and New York.
Wahl, H. (1974). Optik 39, 585.
Wahl, H. (1975). Habilitationsschrift, University of Tübingen.
Walkup, J. F., and Goodman, J. W. (1973). J. Opt. Soc. Am. 63, 399.
Weingärtner, I., Mirandé, W., and Menzel, E. (1971). Ann. Phys. (Leipzig) [7] 26, 289.
Whelan, M. J., and Hirsch, P. B. (1957). Philos. Mag. [8] 2, 1121 and 1303.
Yatagai, T., Ohmura, K., Iwasaki, S., Hasegawa, S., Endo, J., and Tonomura, A. (1987). Appl. Opt. 26, 337.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 89

Optical Symbolic Substitution Architectures

M. S. ALAM
Department of Engineering, Purdue University, Fort Wayne, Indiana

and

M. A. KARIM
Center for Electro-Optics, University of Dayton, Dayton, Ohio
I. Introduction
II. Optical Symbolic Substitution (OSS)
III. Coding Techniques
IV. Signed-Digit Arithmetic Using OSS
   A. Signed-Digit Theory
   B. Algorithm for OSS Rules
   C. Higher-Order MSD Arithmetic
   D. Truth-Table Minimization
   E. MSD OSS Rule Coding
   F. Optical Implementation
V. OSS Architectures
   A. OSS Using Diffraction Gratings
   B. OSS Using Matched Filtering
   C. OSS Using Phase-Only Holograms
   D. OSS Using Opto-electronic Devices
   E. OSS Using Acousto-optic Cells
   F. OSS Using Multiplexed Correlator
   G. OSS Using Shadow-Casting and Polarization
   H. OSS-Based Image Processing
VI. Limitations and Challenges
References
I. INTRODUCTION

Optical information processing techniques have shown remarkable promise over all-electronic techniques for computation-intensive applications. It appears that the escalating demand for higher computing power can only be met through optical computing or digital optical computing techniques. Optical computing systems will allow two- and three-dimensional interconnections and non-interfering communication, as well as ultrahigh switching speed. The main thrust in optical computing research is thus being geared towards the development of alternate algorithms and architectures to exploit all the advantages that optics can offer. Among the various optical computing techniques, however, symbolic substitution appears to be the most promising since it exploits the parallelism of optics completely (Huang, 1983; Yu and Jutamulia, 1987). Symbolic substitution has been proposed as a very powerful means for implementing optical computing operations (Huang, 1983). This technique exploits the parallelism of optics to first perform a spatial search (referred to as recognition) for all occurrences of a particular pattern and then replace (referred to as substitution) all occurrences of this pattern with another pattern. While a Boolean operator, such as AND, recognizes a combination of input bits and produces a single output bit, symbolic substitution recognizes not only a combination of bits, but also the relative spatial location of these bits, and often outputs a combination of bits positioned according to an arbitrary substitution rule, as shown in Fig. 1 (Brenner et al., 1986).

FIGURE 1. (a) A Boolean logic system and (b) a symbolic substitution system.

II. OPTICAL SYMBOLIC SUBSTITUTION (OSS)
Consider the recognition and substitution phases of symbolic substitution for a particular pattern using intensity coding as shown in Fig. 2. In the recognition phase, a parallel search is performed for a desired pattern shown in Fig. 2a on the input pattern shown in Fig. 2b. The input pattern is copied n times, where n is the number of opaque pixels in the search pattern. These copies are shifted up, down, left, or right depending on the location of the opaque pixels in the search pattern. These copies when superimposed will produce in the final output an opaque pixel for each occurrence of the search pattern in the input pattern. A careful observation of the search pattern reveals that a shift down by one row of the original pattern will bring the top left pixel to the desired bottom left position, while a shift left by one column of the original pattern will bring the bottom right pixel to the desired bottom left position. Accordingly, in Fig. 2b, two copies of the original pattern are needed because there are two opaque pixels present in the search pattern. These two copies of the original image are superimposed, i.e., ANDed, and the resulting output pattern shows only two opaque pixels at the bottom left position of each region, identifying the two instances of the search pattern in the input pattern. A mask may be used to read only the lower left pixel of the 2 x 2 patterns.

FIGURE 2. A symbolic substitution system using intensity coding. (a) Search pattern, (b) recognition unit, (c) substitution pattern, and (d) substitution unit.

The output pattern obtained from the aforementioned recognition phase is used as the input pattern for the substitution phase as shown in Fig. 2d. Assume that the intent is to replace the search pattern with the scribe pattern, as shown in Fig. 2c. The scribe pattern has an opaque pixel at the top right position. Consequently, it is necessary to move this dark pixel to the new location by shifting right by one column and then shifting up by one row, as shown in Fig. 2d. The resulting pattern is ORed with the applied input pattern to yield the final output pattern, which has the scribe pattern for every instance of the search pattern detected during the recognition phase.

Another example of symbolic substitution using polarization coding is shown in Fig.
3a, while the input pattern is shown on the left side of Fig. 3b. The input image is copied twice, corresponding to the two polarized pixels of the search pattern. One copy is shifted left by one column, corresponding to the lower right pixel of the search pattern, while the other copy is shifted down by one row, corresponding to the upper left pixel of the search pattern. If a search pattern pixel is horizontally polarized, the corresponding copy of the input pattern is passed through a half-wave plate before shifting. All copies of the input pattern are then superimposed to yield the recognition pattern, as shown on the far right of Fig. 3b. This pattern contains pixels with two horizontal polarizations, two vertical polarizations, and some with both polarizations. All pixels with only vertical polarization indicate the presence of a search pattern in the input image. Next, the recognized output is passed through an analyzer and then through an optical NOR gate array to remove all unwanted signals and to yield the final output pattern shown in Fig. 3b. The substitution phase is shown in Fig. 3d, where the output pattern obtained from the recognition phase is used as the input. This input pattern is copied, shifted, and superimposed according to the number and location of polarized pixels of the substitution pattern shown in Fig. 3c. The rightmost pattern in Fig. 3d represents the substituted pattern.

FIGURE 3. A symbolic substitution system using polarization coding. (a) Search pattern, (b) recognition unit, (c) substitution pattern, and (d) substitution unit.
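The intensity-coded recognition and substitution steps described above can be mimicked on a small binary array. The sketch below is only a software analogy under stated assumptions: opaque pixels are represented by 1 and transparent pixels by 0, shifting a copy of the image is modeled by reading a neighboring pixel, superposition is modeled by a logical AND (recognition) or OR (substitution), and the test image and all function names are hypothetical. The search pattern has opaque pixels at the top left and bottom right of a 2 x 2 cell, and the scribe pattern has one opaque pixel at the top right.

```python
def recognize(image, search_offsets):
    """AND together shifted copies of the image, one per opaque search pixel.

    Each offset (dr, dc) locates an opaque search pixel relative to the
    reference pixel (the lower left pixel of the cell).  Reading
    image[r + dr][c + dc] is equivalent to shifting the whole image by
    (-dr, -dc) before the copies are superimposed.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(all(
                0 <= r + dr < rows and 0 <= c + dc < cols and image[r + dr][c + dc]
                for dr, dc in search_offsets
            ))
    return out


def apply_mask(marks, cell=2):
    """Keep only the lower left pixel of every cell x cell block."""
    return [[m if (r % cell == cell - 1 and c % cell == 0) else 0
             for c, m in enumerate(row)]
            for r, row in enumerate(marks)]


def substitute(marks, scribe_offsets):
    """OR together shifted copies of the recognition output, one per scribe pixel."""
    rows, cols = len(marks), len(marks[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if marks[r][c]:
                for dr, dc in scribe_offsets:
                    if 0 <= r + dr < rows and 0 <= c + dc < cols:
                        out[r + dr][c + dc] = 1
    return out


# Offsets are (row, column) relative to the lower left pixel of a 2 x 2 cell.
SEARCH = [(-1, 0), (0, 1)]   # top left and bottom right pixels are opaque
SCRIBE = [(-1, 1)]           # top right pixel is opaque

IMAGE = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
]

if __name__ == "__main__":
    raw = recognize(IMAGE, SEARCH)
    masked = apply_mask(raw)
    for row in substitute(masked, SCRIBE):
        print(row)
```

In this test image the unmasked recognition output also marks a position that straddles two cells, which illustrates why the mask that reads only the lower left pixel of each 2 x 2 pattern is useful.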
111. CODING TECHNIQUES
Symbolic substitution employs two-dimensional spatial patterns as symbols. For illustration, to implement a binary logic either intensity coding or polarization coding may be used. In intensity coding for binary logic, a dual-rail scheme is used where both 1 and 0 have a transparent and an opaque pixel in them, as shown in Fig. 4. Thus, a bit of information is represented by two pixels, where one pixel is the complement of the other pixel. The relative positions of the transparent and opaque pixels indicate whether it is 1 or 0. The dual-rail scheme simplifies the recognition phase of symbolic substitution, since it is then necessary to identify only the transparent or the opaque pixel of an input pattern. This type of coding is also preferable from the viewpoint of energy balancing, since it guarantees homogeneous distribution of energy (Brenner et al., 1986).

In polarization coding, a bit is represented simply by a particular state of light polarization. For example, 1 can be represented by a vertically polarized light (denoted by a vertical arrow) pixel and 0 can be represented by a horizontally polarized light (represented by a horizontal arrow) pixel, as shown in Fig. 5. The advantage of polarization coding over intensity coding is the ability to recognize with respect to either zeros or ones, not simply one or the other. Polarization codes, when compared to the intensity codes, generally reduce the size of the inputs by half, since only one pixel is required for coding both 0 and 1. Note that for systems involving more than two symbols, such as the trinary signed-digit number system, additional pixels will be required in each case of the aforementioned coding techniques. In the following section, we consider the implementation of high-speed carry-free addition and borrow-free subtraction using symbolic substitution.

FIGURE 4. Intensity coding.

FIGURE 5. Polarization coding.
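The difference between the two coding techniques of Section III can be illustrated with a minimal Python sketch. The pixel conventions used here (1 = transparent, 0 = opaque, "V"/"H" for vertical/horizontal polarization) and the choice of which dual-rail arrangement stands for logical 1 are assumptions made for illustration; they are not read off Figs. 4 and 5.

```python
# Assumed pixel conventions (illustrative, not taken from Figs. 4 and 5):
#   intensity pixels:    1 = transparent, 0 = opaque
#   polarization pixels: "V" = vertical,  "H" = horizontal
INTENSITY_CODE = {1: (1, 0), 0: (0, 1)}
POLARIZATION_CODE = {1: "V", 0: "H"}


def encode_intensity(bits):
    """Dual-rail intensity coding: two complementary pixels per bit."""
    return [pixel for bit in bits for pixel in INTENSITY_CODE[bit]]


def encode_polarization(bits):
    """Polarization coding: a single pixel per bit."""
    return [POLARIZATION_CODE[bit] for bit in bits]


word = [1, 0, 1, 1]
intensity_pixels = encode_intensity(word)
polarization_pixels = encode_polarization(word)

# Energy balancing: every dual-rail word has one transparent pixel per bit,
# whatever the data are.
transparent_count = sum(intensity_pixels)
```

In this sketch the intensity-coded word occupies twice as many pixels as the polarization-coded word, while the number of transparent pixels depends only on the word length.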
IV. SIGNED-DIGIT ARITHMETIC USING OSS
The most important arithmetic operation in digital computing is addition. But as the bit string of the pair of numbers to be added increases, the computation speed decreases because of the propagation of the carry through the cascaded adders. Carry look-ahead addition may speed up the process for shorter bit strings. But as the number of bits increases, the number of inputs to the logic gates becomes excessive, thus creating a high fan-in problem (Alam et al., 1992a; Awwal et al., 1992). To solve this speed bottleneck problem, various nonbinary number systems such as multiple-valued fixed radix, residue, and modified signed-digit (MSD) (Li and Eichmann, 1987; Huang et al., 1979; Cherri and Karim, 1988; Bocker et al., 1986; Drake et al., 1986) have already been proposed and investigated. Although multiple-valued fixed radix number representation increases the processing speed, it cannot guarantee carry-free computation. The residue number system also allows for a parallel processing scheme, but in order to process large numbers, large prime modulo logic elements, which are often difficult to implement, must be used (Karim and Awwal, 1992). Among these methods, the MSD arithmetic appears to be the most promising because it provides parallel, carry-free addition and borrow-free subtraction with storage complexity proportional to the length of the bit string (Cherri and Karim, 1988; Avizienis, 1961; Mirsalehi and Gaylord, 1986; Li et al., 1989). In MSD carry-free addition schemes, the pair of numbers to be added is first converted into an intermediate pair such that the addition of the latter pair will not generate any carry. These two steps of the MSD system readily correspond to the recognition and substitution phases of symbolic substitution. Optical symbolic substitution schemes for binary arithmetic have already been proposed and investigated (Brenner et al., 1986).
Huang (1983) presented the basic symbolic substitution rules for binary (1 bit) addition which processes two bits at a time. Thus, for an n-bit string, n computation steps are necessary, which makes the computation process relatively slow (Kozaitis, 1988; Eichmann et al., 1990; Alam et al., 1992b). Li and Eichmann (1987) improved this technique by incorporating additional information (called the reference) from the next less significant pair of digits so that when the first-step symbolic substitution rules are applied there are no two identical nonzero MSD digits in any column, which would otherwise result in a carry in the second step. Kozaitis (1988) identified the addition rules for a multiple-bit (two-bit) symbolic substitution scheme, thus allowing four bits to be processed at the same time. Therefore, for longer words, the use of a combination of the rules presented in Kozaitis (1988) reduces the number of computational steps, thus resulting in higher computational speed. Applying the higher-order symbolic substitution rules developed by Kozaitis (1988), Eichmann et al. (1990) explored a number of optical implementations. However, the methods presented in Eichmann et al. (1990) and Kozaitis (1988) are still dependent on the number of to-be-processed bits, thus resulting in an unwanted bottleneck in the computation speed when processing large numbers. Most recently, a number of higher-order signed-digit based symbolic substitution techniques for arithmetic operations have been reported (Alam et al., 1992a,b). In these techniques, both addition and subtraction operations can be implemented in only two steps, irrespective of the number of bits to be processed. The higher-order symbolic substitution techniques are expected to make the implementation of higher-order MSD-based symbolic substitution systems faster, easier, and more practical. Accordingly, in the following sections, we consider the design of a higher-order MSD symbolic substitution system.

A. Signed-Digit Theory
In general, a signed decimal number D may be represented in terms of an n-bit radix-r signed-digit number as

$$D = \sum_{i=1}^{n} d_i r^{i} \qquad \text{(IV.1)}$$

where the modified signed digit $d_i$ is a member of the set $\{\overline{r-1}, \overline{r-2}, \ldots, \bar{1}, 0, 1, \ldots, r-2, r-1\}$, with $\overline{r-1}, \overline{r-2}, \ldots, \bar{1}$ representing $-(r-1), -(r-2), \ldots, -2$, and $-1$, respectively. Herein, the value of r is taken to be 2 since we are dealing with the MSD number system. Because a number may have more than one representation in the MSD number system, it is also known as a redundant number system.
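The redundancy of the MSD number system can be checked with a short Python sketch. The function names are hypothetical, digits are written most significant first with -1 standing for $\bar{1}$, and the least significant digit is assumed to carry the weight $r^0$, as in the worked examples later in this section.

```python
from itertools import product


def msd_value(digits, radix=2):
    """Decimal value of a signed-digit list, most significant digit first.

    Assumption: the least significant digit has weight radix**0.
    """
    value = 0
    for d in digits:
        value = radix * value + d
    return value


def representations(value, n):
    """All n-digit MSD (radix-2) digit strings that evaluate to `value`."""
    return [rep for rep in product((-1, 0, 1), repeat=n)
            if msd_value(rep) == value]


two_digit_ones = representations(1, 2)      # 01 and 1(-1)
three_digit_minus_three = representations(-3, 3)
```

Under these assumptions the decimal number 1 already has two two-digit representations, 01 and $1\bar{1}$, which is what makes the carry-free rearrangement of operands possible.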
B. Algorithm for OSS Rules
In a higher-order MSD system, a carry will be generated when the two-bit operands are either ($1X$, $1X$), or (11, 01), or (01, 11), or ($\bar{1}X$, $\bar{1}X$), or ($\bar{1}\bar{1}$, $0\bar{1}$), or ($0\bar{1}$, $\bar{1}\bar{1}$), where X is a don't-care literal. Thus, the carry generation can be prohibited by transforming the addition operands into an intermediate sum and an intermediate carry such that addition of the ith intermediate sum digit and the intermediate carry generated at the (i - 1)th digit position will not produce a carry. The algorithm to derive the higher-order MSD symbolic substitution rules consists of the following steps:

Step 1: Find all possible three-digit MSD representations for each sum value and arrange them in one group. The most significant MSD digit is termed the intermediate carry $C_{i+1}$, and the remaining two digits are identified as the intermediate sum $S_iS_{i-1}$.
Step 2: For each group of step 1, find all representations of the two-digit MSD addend and augend operands.
Step 3: Based on the reference pair of digits [(i - 2)th digit pair], select a suitable symbolic substitution rule such that the addition of the (i - 2)th intermediate carry and the (i - 1)th intermediate sum will not generate any carry.

The intermediate pair of digits are then added to get the final sum. During this addition, for the least significant digits, the reference digits are assumed to be zero, and for odd-length operands, zero padding can be used if necessary. For MSD subtraction, if the subtrahend is bitwise complemented, then all steps of MSD addition can be applied.

C. Higher-Order MSD Arithmetic
For the proposed system, all possible sums of two 2-bit MSD numbers are considered, as shown in Table I. Since the sums range from -6 (i.e., $\bar{1}\bar{1} + \bar{1}\bar{1}$) to +6 (i.e., 11 + 11), there are 13 possible groups of sums. Also, with the exception of -6, -4, 0, +4, and +6, the rest of the decimal numbers in Table I have more than one MSD representation. For example, the MSD representations corresponding to -3 are $0\bar{1}\bar{1}$, $\bar{1}1\bar{1}$, and $\bar{1}01$. The higher-order MSD symbolic substitution addition rules are listed in Table II, with the last two columns listing the intermediate carry and sum digits. The pairs of 2-bit MSD numbers to be added are divided into 13 groups such that the two numbers $A_iA_{i-1}$ and $B_iB_{i-1}$ to be added (in each group) correspond to the same sum. For example, all group 2 entries of Table II correspond to the sum -5.

TABLE I
MSD REPRESENTATION OF THE SUM OF TWO 2-BIT NUMBERS

Decimal number    MSD representation
-6                $\bar{1}\bar{1}0$
-5                $\bar{1}0\bar{1}$, $\bar{1}\bar{1}1$
-4                $\bar{1}00$
-3                $0\bar{1}\bar{1}$, $\bar{1}01$, $\bar{1}1\bar{1}$
-2                $0\bar{1}0$, $\bar{1}10$
-1                $00\bar{1}$, $0\bar{1}1$, $\bar{1}11$
 0                $000$
 1                $001$, $01\bar{1}$, $1\bar{1}\bar{1}$
 2                $010$, $1\bar{1}0$
 3                $011$, $10\bar{1}$, $1\bar{1}1$
 4                $100$
 5                $101$, $11\bar{1}$
 6                $110$

Assume that the intermediate sum digits
correspond respectively to the ith (most significant digit) and the (i - 1)th digit positions. Inspection of Table II reveals that the allowable combination of digits at the (i - 2)th digit position must satisfy one of the following three cases:

Case 1: In this category, the digit pair at the (i - 2)th digit position belongs to the set $\bar{T} = \{\bar{1}\bar{1}, \bar{1}0, 0\bar{1}\}$. When this case is encountered, there can be a negative carry (i.e., $\bar{1}$) propagating from the (i - 2)th to the (i - 1)th digit position. Therefore, the (i - 1)th digit of the intermediate sum must be either 0 or 1. Note that if there is a $\bar{1}$ carry propagating from the (i - 3)th to the (i - 2)th digit position, the digit pairs $\bar{1}0$ and $0\bar{1}$ of the set $\bar{T}$ will also generate a $\bar{1}$ carry to the (i - 1)th digit position. This category is referred to as all-negative.
Case 2: This category includes the members of the set $T$ when no $\bar{1}$ carry is generated from the (i - 2)th digit position. Evidently, the (i - 1)th digit of the intermediate sum can be either 0 or $\bar{1}$. This category is referred to as the otherwise case.
Case 3: When the (i - 1)th digit of the intermediate sum is 0, there is no restriction on the digit pair at the (i - 2)th position, so any member of either set may be allowed. This corresponds to the don't-care case. It should be mentioned that the digit pair 00 can be a member of the set $\bar{T}$ or $T$.

TABLE II
HIGHER-ORDER MSD OSS RULE TRUTH TABLE FOR ADDITION
Columns: No.; addend $A_iA_{i-1}$; augend $B_iB_{i-1}$; reference digits $A_{i-2}B_{i-2}$ (all negative, otherwise, or don't care); carry $C_i$; sum $S_iS_{i-1}$.
The aforementioned cases are listed in column 3 of Table II, which determines the $S_i$, $S_{i-1}$, and $C_{i+1}$ entries of column 4. We observe that the negative digit pairs at the (i - 2)th position produce a carry of either 0 or $\bar{1}$, in which case the allowed (i - 1)th sum digit is either 0 or 1. The remaining combinations of digits produce a carry of either 0 or 1, with the corresponding sum digit allowed to be either 0 or $\bar{1}$. For a particular input condition, sometimes the requirements imposed by the (i - 2)th pair of digits can be satisfied in multiple ways. For example, group 2 input pairs of Table II can be mapped into either $\bar{1}\bar{1}1$ or $\bar{1}0\bar{1}$. When mapping with an intermediate sum of $\bar{1}1$, the (i - 2)th digit pair cannot be allowed to be 11, because a carry of 1 would otherwise be generated by this combination. Thus, all combinations except 11 are allowed for this position. However, when substituting an intermediate sum of $0\bar{1}$ for the same input condition, the selection of the (i - 2)th digit position should be restricted to only those combinations of numbers which will not produce a negative carry. The following example shows the application of some of the symbolic substitution rules listed in Table II, where $\phi$ indicates a padded zero. Notice that the final result of addition is carry-free. The higher-order MSD symbolic substitution rules for subtraction can be derived from the addition rules of Table II with minor modification.

$$\begin{aligned}
\text{Addend} &= 1\,\bar{1}\,1\,1\,0\,\bar{1}\,0\,\bar{1}\,0\,\bar{1} = 427_{10} \\
\text{Augend} &= 0\,1\,\bar{1}\,\bar{1}\,0\,1\,\bar{1}\,1\,\bar{1}\,1 = 75_{10} \\
\text{Intermediate sum} &= 1\,0\,0\,0\,0\,0\,\bar{1}\,0\,\bar{1}\,0 \\
\text{Intermediate carry} &= 0\,\phi\,0\,\phi\,0\,\phi\,0\,\phi\,0\,\phi\,\phi \\
\text{Final sum} &= 1\,0\,0\,0\,0\,0\,\bar{1}\,0\,\bar{1}\,0 = 502_{10} \\
\text{Final carry} &= 0\,0\,0\,0\,0\,0\,0\,0\,0\,0
\end{aligned}$$
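The two-step procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a transcription of Table II: digits are listed most significant first with -1 standing for $\bar{1}$, the helper names are hypothetical, and the rule-selection policy simply picks the first three-digit representation that respects the all-negative, otherwise, and don't-care constraints described above. The intermediate digits it produces may therefore differ from those obtained with the tabulated rules, although the value of the final sum is the same.

```python
from itertools import product

DIGITS = (-1, 0, 1)


def msd_value(digits):
    """Decimal value of an MSD digit list (most significant digit first)."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


# Step 1: group all three-digit MSD strings (carry, S_i, S_{i-1})
# by the value they represent.
REPS = {}
for rep in product(DIGITS, repeat=3):
    REPS.setdefault(msd_value(rep), []).append(rep)


def possible_carries(high_pair):
    """Carries a two-digit group may emit, judged only from its most
    significant digit pair (the reference digits of the next group)."""
    if high_pair == (0, 0):
        return {0}
    if max(high_pair) <= 0:
        return {0, -1}   # "all negative" reference
    return {0, 1}        # "otherwise" reference


def msd_add(a, b):
    """Two-step addition of equal-length MSD digit lists."""
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    if len(a) % 2:                        # zero-pad odd-length operands
        a, b = [0] + list(a), [0] + list(b)
    n = len(a)
    inter_sum = [0] * (n + 1)
    inter_carry = [0] * (n + 1)
    for k in range(n - 2, -1, -2):        # groups, least significant first
        total = 2 * (a[k] + b[k]) + a[k + 1] + b[k + 1]
        ref = (a[k + 2], b[k + 2]) if k + 2 < n else (0, 0)
        incoming = possible_carries(ref)
        outgoing = possible_carries((a[k], b[k]))
        # Assumed policy: first representation compatible with both the
        # incoming carry range and the carry range promised upward.
        carry, s_hi, s_lo = next(
            rep for rep in REPS[total]
            if rep[0] in outgoing
            and all(abs(rep[2] + c) <= 1 for c in incoming)
        )
        inter_sum[k + 1], inter_sum[k + 2] = s_hi, s_lo
        inter_carry[k] = carry            # two positions left of s_lo
    # Second step: digit-wise addition that cannot generate a carry.
    return [s + c for s, c in zip(inter_sum, inter_carry)]


addend = [1, -1, 1, 1, 0, -1, 0, -1, 0, -1]
augend = [0, 1, -1, -1, 0, 1, -1, 1, -1, 1]
total = msd_add(addend, augend)
```

Each group is processed using only its own four digits and the two reference digits, so all groups could be handled in parallel; the loop here is only a sequential stand-in for that parallel substitution.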
The MSD subtraction is performed by adding the minuend (P) to the complement of the subtrahend (Q), since

$$P - Q = P + \bar{Q}. \qquad \text{(IV.2)}$$

Therefore, by complementing the $B_iB_{i-1}$ entries of Table II, the higher-order MSD symbolic substitution subtraction rules can be obtained, as depicted in Table III. Consider the tabulated example for MSD subtraction.
Minuend Subtrahend
_ _ ioioioi
= 1 11 = 1 1 1 oo 1 1 1 i 1
=
939,,
= -613,,
Intermediate sum = 0 0 1 10 0 1 0 T 0 Intermediate carry = 0 0 1 $J 0 d, 0 $J 0 $J $J Final difference Final borrow
= 0 1 I 100 1 0 T O = 0000000000
= 36,,
It is evident that the final result of subtraction is borrow-free.

TABLE III
HIGHER-ORDER MSD OSS RULE TRUTH TABLE FOR SUBTRACTION
Columns: No.; minuend $P_iP_{i-1}$; subtrahend $Q_iQ_{i-1}$; reference digits $P_{i-2}Q_{i-2}$ (all negative, otherwise, or don't care); borrow $B_i$; difference $D_iD_{i-1}$.
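The relation $P - Q = P + \bar{Q}$ of Eq. (IV.2) can be illustrated with a brief Python sketch. The function names are hypothetical, -1 stands for $\bar{1}$, and `reference_add` is a stand-in adder based on ordinary integer arithmetic; it is not the optical rule set of Table III and is used only so that the sketch is self-contained.

```python
def msd_value(digits):
    """Decimal value of an MSD digit list (most significant digit first)."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def complement(digits):
    """Digit-wise MSD complement: 1 <-> -1, 0 unchanged."""
    return [-d for d in digits]


def to_msd(value, n):
    """One of the many n-digit MSD strings for `value` (|value| < 2**n)."""
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    return [sign * ((magnitude >> k) & 1) for k in range(n - 1, -1, -1)]


def reference_add(a, b):
    """Stand-in adder using integer arithmetic (assumption, see lead-in)."""
    return to_msd(msd_value(a) + msd_value(b), len(a) + 1)


def msd_subtract(p, q, add=reference_add):
    """P - Q computed as P + complement(Q)."""
    return add(p, complement(q))


minuend = to_msd(37, 7)        # hypothetical operands
subtrahend = to_msd(12, 7)
difference = msd_subtract(minuend, subtrahend)
```

Because complementing every digit negates the represented value, any MSD adder passed in as `add` can be reused unchanged for subtraction.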
D. Truth-Table Minimization
The first step toward the minimization of a truth table is to obtain the reduced minterms (maxterms) in sum-of-product (product-of-sum) form for each output bit. This can be accomplished by using Karnaugh maps (K-maps) or Quine-McCluskey's tabular reduction method (Karim and Awwal, 1992). Using K-maps and considering only the 1 and $\bar{1}$ entries of the outputs $S_i$, $S_{i-1}$, and $C_i$ in Table II, we obtain the reduced minterms shown in Table IV, where X represents a don't-care literal, meaning that the bit could be any of 1, 0, or $\bar{1}$, while $X_{0\bar{1}}$ ($X_{01}$) represents a partial don't-care, meaning that the bit could be either 0 or $\bar{1}$ (1). Each entry in Table IV consists of six bits, where the least significant two bits correspond to the reference bits ($A_{i-2}B_{i-2}$), the most significant two bits correspond to the addend ($A_iA_{i-1}$), and the remaining two bits correspond to the augend ($B_iB_{i-1}$), respectively. The maximum logical minimization occurs for the 1 and $\bar{1}$ entries of the $S_{i-1}$ output, and the corresponding K-maps for the first four entries of $S_{i-1}$ in Table IV are shown in Fig. 6. The remaining entries of Table IV are derived following a similar procedure. In Table IV, the all-negative group can be identified as $X_{0\bar{1}}X_{0\bar{1}}$, while the otherwise group can be identified as $X_{01}X_{01}$. The don't-care group appears as $XX$, representing $X_{10\bar{1}}X_{10\bar{1}}$. Considering Tables II and IV, we observe that the unminimized minterms for the 1 and $\bar{1}$ entries of the output $S_i$ amount to 234 (= 117 + 117) minterms, which can be minimized to 38 (= 19 + 19) minterms. The corresponding number of minterms for the output bits $S_{i-1}$ and $C_i$ reduces from 216 (= 96 + 120) to 16 (= 8 + 8) and from 198 (= 102 + 96) to 22 (= 11 + 11), respectively. Thus, a total of 72 minterms are required to yield a pair of output bits.

TABLE IV
REDUCED MINTERMS FOR HIGHER-ORDER MSD ADDITION

FIGURE 6. Karnaugh-map minimization for the $S_{i-1}$ outputs.
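The effect of full and partial don't-care literals on the number of stored minterms can be sketched in Python. The one-character alphabet below is a hypothetical notation introduced only for this sketch ("T" for $\bar{1}$, "N" for $X_{0\bar{1}}$, "P" for $X_{01}$), and the example minterm is illustrative rather than an entry of Table IV.

```python
from itertools import product

# Hypothetical one-character alphabet for minterm literals.
LITERALS = {
    "1": {1},
    "0": {0},
    "T": {-1},          # 1-bar
    "X": {-1, 0, 1},    # full don't-care
    "N": {0, -1},       # partial don't-care X_{0 1-bar}
    "P": {0, 1},        # partial don't-care X_{01}
}


def matches(minterm, digits):
    """True if the six input digits are covered by the reduced minterm.

    Order assumed as in the text: A_i A_{i-1} B_i B_{i-1} A_{i-2} B_{i-2}.
    """
    return all(d in LITERALS[lit] for lit, d in zip(minterm, digits))


def expand(minterm):
    """All explicit (unminimized) digit combinations covered by a minterm."""
    return list(product(*(sorted(LITERALS[lit]) for lit in minterm)))


reduced = "1T01NN"                  # illustrative, all-negative reference
covered = expand(reduced)           # explicit minterms it replaces
```

A single reduced minterm with an all-negative reference thus replaces four explicit minterms, and one with a full don't-care reference ("XX") replaces nine, which is the source of the storage savings quoted above.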
E. MSD OSS Rule Coding
For spatial coding of the minterms (MSD digits), a dual-rail coding scheme has been proposed, as shown in Fig. 7a. A bit value of 1 ($\bar{1}$) is coded by making the top pixel subcell opaque (transparent) and the bottom pixel subcell transparent (opaque). A 0 (don't-care) literal is represented by making both the pixel subcells transparent (opaque). Assuming that the (i - 2)th digit pair is either $\bar{1}\bar{1}$ or 11, the symbolic substitution rules using this coding scheme for the first two groups of Table II, for example, are shown in Figs. 7b and 7c, respectively. In the substituted pattern (shown in front of the arrows in Figs. 7b and 7c), the upper-row pixels correspond to the sum bits, while the bottom-row pixels correspond to the carry bit, which is shifted to the left by two pixel positions. The shifted positions in the bottom row are padded with zeros. The coding rules for other group members of Table II can be derived in a similar fashion. Finally, a practical example for the addition of two MSD numbers (0010 and 1011) using the coded symbolic substitution rules of Figs. 7b and 7c is shown in Fig. 8. Note that the augend bit positions in the input pattern, which are responsible for the generation of carry, are substituted by zeros after the first symbolic substitution operation (first step), thus generating the intermediate pairs. In the second step, the substitution rules are applied again to the intermediate pairs to yield the carry-free result. From the rightmost pattern of Fig. 8, we observe that the sum ($111\bar{1}$) appears in the upper-row pixels and the bottom-row pixels appear padded with zeros (since the result is carry-free).

FIGURE 7. (a) Dual-rail coding scheme suitable for the MSD digits. (b) Coded symbolic substitution rule for group 1 of Table II. (c) Coded symbolic substitution rule for group 2 of Table II.

FIGURE 8. Addition of two MSD numbers using the higher-order MSD symbolic substitution rules of Table II.

F. Optical Implementation
The symbolic substitution processor for higher-order MSD arithmetic can be realized by spatially coding the input variables and then using a holographic content-addressable memory (CAM) for both recognition and substitution. Mirsalehi and Gaylord (1986) implemented the MSD addition and subtraction by utilizing a direct truth-table-based CAM. This is a one-step process and requires 56 holograms to store the minterms for each output bit, thus requiring a large amount of storage. Li and Eichmann (1987) decreased the hologram storage requirement for binary MSD addition from 56 to 20 for each output bit by adopting a two-step scheme. Cherri and Karim (1988) subsequently observed that the hologram requirement can be further reduced to 13 by implementing a three-step scheme. However, as the number of steps increases, the processing speed diminishes.

To implement the aforementioned higher-order symbolic substitution rules, a CAM using a holographic recording technique, first proposed by Mirsalehi and Gaylord (1986) and later modified by Li and Eichmann (1987) and by Cherri and Karim (1988), can be used. An optical implementation for the holographic recording and reconstruction of the higher-order MSD symbolic substitution scheme is shown in Fig. 9, where the minterm recording operation involves a three-step process (Li and Eichmann, 1987; Mirsalehi and Gaylord, 1986) and makes use of an extra bit (R) as the reference in addition to the six input lines ($A_iA_{i-1}$, $B_iB_{i-1}$, $A_{i-2}B_{i-2}$). At first, the complement of the input pattern (minterm) is recorded with the phase between the object beam and the reference beam set at 0°. For simplicity, the reference beam is not shown in Fig. 9. Next, the input pattern (minterm) is recorded with 180° phase difference between the object beam and the reference beam. Finally, the reference bit R (which is kept off in the first two steps) is recorded with 0° phase difference between the two beams. The don't-care bits (X) are kept opaque during all three steps of recording. For partial don't-care bits ($X_{01}$ or $X_{0\bar{1}}$), the pixel(s) corresponding to the forbidden bit (i.e., $\bar{1}$ or 1) for this bit position is (are) made transparent during the first recording step, while the pixel(s) corresponding to the allowed bits is (are) made opaque during all three recording steps. During logic operation (reconstruction), the reference beam (used during the recording operation) is turned off while the input signal and the reference bit are introduced to the system. When the input signal matches a stored minterm, a dark spot appears at the corresponding location of the detector (output) plane, because the waves diffracted by the second- and third-step recorded patterns cancel each other (since both have the same magnitude but opposite phase). Depending on the intensity of this spot, an appropriate value can be assigned to it. In case of no match, a bright spot (which is equivalent to zero) appears at the detector plane.

FIGURE 9. Optical implementation of the higher-order MSD holographic recording and reconstruction scheme.
V. OSS ARCHITECTURES
A number of symbolic substitution architectures using CAM (Mirsalehi and Gaylord, 1986), location addressable memory (Drake et al., 1986), diffraction gratings (Thalmann et al., 1990), spatial filtering (Jeon et al., 1990; Brenner et al., 1989), phase-only holograms (Mait and Brenner, 1988), self-electro-optic effect devices (Cloonan, 1988), multichannel correlation (Botha et al., 1987; Hwang and Louri, 1987; Casasent and Botha, 1989), shadow-casting and polarization (Cherri and Karim, 1989a; Louri, 1991), and optical phase conjugation (Eichmann et al., 1988) have been suggested. Although each of these implementations has its own advantages and disadvantages relative to the others, the CAM technique described in Section IV.F has been widely investigated and appears to be very promising. The aforementioned architectures are described briefly in the following sections.

A. OSS Using Diffraction Gratings
Both coherent and incoherent optical processors have been proposed for the implementation of symbolic substitution. Since incoherent systems offer several advantages over coherent systems (Huang, 1983; Brenner et al., 1986; Thalmann et al., 1990), incoherent processors for symbolic substitution are of special interest. Accordingly, a symbolic substitution system based on diffraction gratings and Fourier filtering (Thalmann et al., 1990) is discussed in this section.

1. Recognition Unit
The optical implementation of the symbol recognition unit is shown in Fig. 10 (Thalmann et al., 1990). This is essentially a 4f filtering system. The input data are encoded in intensity transmittance (dark/bright) of the input spatial modulator. A two-dimensional diffraction grating, placed behind the input pattern, splits the input data into different diffraction orders. The distance d between the grating and the input pattern is chosen so that the patterns of two adjacent diffraction orders are shifted by one pixel in the image plane. The search symbol is selected by appropriate spatial filtering of the diffraction orders at the Fourier plane. The selected diffraction orders are then recombined in the image plane to yield recognition of the search symbol. For numerous applications, several different symbols need to be recognized at the same time. This can be achieved by introducing an additional grating to produce multiple spatially separated images in the image plane, as shown in Fig. 11. The corresponding diffraction orders in the Fourier plane must be sufficiently separated so that a different search symbol can be chosen for each channel by appropriate filtering of the diffraction orders.

FIGURE 10. Optical symbolic recognition system using diffraction gratings. Adapted from Thalmann et al. (1990).

FIGURE 11. Multi-channel symbolic recognition system using diffraction gratings. Adapted from Thalmann et al. (1990).
2. Substitution Unit
The optical implementation for substitution is identical to that for recognition. The complete system for symbolic substitution is thus composed of two 4f systems in series, as shown in Fig. 12. A NOR gate array is inserted at the output plane of the recognition system to perform thresholding and reestablish the binary values. The bright pixels at the output of the NOR gate array identify the location of the search pattern in the input pattern. The output of the NOR gate array is split again by a two-dimensional diffraction grating. The scribe pattern is chosen by spatial filtering in the Fourier plane of the substitution unit. Depending on the spatial filter, both intensity- and polarization-coded output patterns may be obtained.

FIGURE 12. Optical symbolic substitution system using diffraction gratings. Adapted from Thalmann et al. (1990).
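The recognition and substitution stages described above can be mimicked numerically with a small Python sketch. It models only the logic (shifted copies, superposition, NOR thresholding, and scribing), not the optics. The image convention (1 = bright, 0 = dark), the function names, the treatment of pixels outside the image as bright, and the vertical dual-rail symbols used in the example are all assumptions made for illustration.

```python
def shift(image, dr, dc, fill):
    """Copy of `image` with out[r][c] = image[r + dr][c + dc]."""
    rows, cols = len(image), len(image[0])
    return [[image[r + dr][c + dc]
             if 0 <= r + dr < rows and 0 <= c + dc < cols else fill
             for c in range(cols)]
            for r in range(rows)]


def recognize(image, dark_offsets):
    """Superimpose shifted copies and apply a NOR threshold.

    A pixel of the result is bright (1) only where every pixel listed in
    `dark_offsets` (relative to that position) is dark in the input.
    """
    copies = [shift(image, dr, dc, fill=1) for dr, dc in dark_offsets]
    rows, cols = len(image), len(image[0])
    return [[0 if any(copy[r][c] for copy in copies) else 1
             for c in range(cols)]
            for r in range(rows)]


def substitute(marks, bright_offsets):
    """Scribe a pattern: one shifted copy of the marks per bright pixel."""
    copies = [shift(marks, -dr, -dc, fill=0) for dr, dc in bright_offsets]
    rows, cols = len(marks), len(marks[0])
    return [[1 if any(copy[r][c] for copy in copies) else 0
             for c in range(cols)]
            for r in range(rows)]


# Each column is one dual-rail bit: (bright over dark) = 1, (dark over bright) = 0.
image = [[1, 0, 1, 1],      # encodes the bit string 1 0 1 1
         [0, 1, 0, 0]]
# Search symbol "1 followed by 0": its dark pixels relative to the top-left pixel.
marks = recognize(image, dark_offsets=[(1, 0), (0, 1)])
# Scribe symbol "0 followed by 1": its bright pixels relative to the same pixel.
output = substitute(marks, bright_offsets=[(1, 0), (0, 1)])
```

Only the dark pixels of the search symbol need to be tested, which is the simplification that dual-rail coding provides for the recognition phase.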
B. OSS Using Matched Filtering
In this technique, symbolic substitution is implemented by pattern recognition followed by pattern substitution, pattern combination, and feedback for a fixed number of iterations to yield the final output. The schematic diagram of the optical symbolic substitution processor is shown in Fig. 13c. The encoded binary input pattern, formed by using the symbols shown in Fig. 13a, is fed into shutter S1 and split into four identical portions by a combination of beam splitters and mirrors. The coding scheme shown in Fig. 13a is specially designed to prevent crosstalk. Since there are four possible combinations of symbols for binary addition (0 + 0, 0 + 1, 1 + 0, and 1 + 1), four different holographic filters are needed to recognize them. Each of these filters has a transfer function which is the complex conjugate of the Fourier transform of one of the four different symbols shown in Fig. 13b. Through these filters and Fourier transform (FT) lenses, autocorrelations are produced in the output planes of the matched filters. Using threshold elements in these output planes, the autocorrelation peaks are detected at all positions where the four different patterns are matched. The recognized patterns are used to generate new patterns based on the substitution rules shown in Fig. 13b. Since a hologram can be used as a beam steering element, any new pattern can be generated using computer-generated or optically recorded holograms placed between the Fourier transform lenses. The substituted patterns are combined through a number of beam splitters and mirrors, as shown on the right side of Fig. 13c. The resulting pattern is again split into two parts and stored in optical memory M1 while shutter S2 is open and shutter S4 is closed. After storing the output, shutters S1 and S2 are closed and S3 is opened to produce the intermediate result of the first iteration. This intermediate result is fed back to the input through the beam splitters and mirrors for the second iteration. In the second iteration, S2 is still closed and S4 is opened to store the result of the second iteration in the optical memory M2. After storing the result of the second iteration, S3 and S4 are closed, M1 is erased, and S5 is opened to feed the result back into the input. The final result for the binary addition of two n-bit numbers is thus obtained in n + 1 iterations.

C. OSS Using Phase-Only Holograms
1. One-Channel System
In a single-channel optical symbolic recognition system, all operations such as replication, shift, and polarization rotation are performed sequentially. A recognition system using intensity-coded inputs is shown in Fig. 14a (Mait and Brenner, 1988). Assume that a logic having three rules is to be implemented. The input image is introduced in the input plane of the system. The holographic splitter produces three copies of the input image. Next, prisms are used to provide appropriate shifts and overlap of the duplicated images. Then, an output mask is used to select the recognized output from the superimposed image. Figure 14b is a similarly constructed polarization-based logic system. Replication of the input image is performed by the holographic splitter. Necessary changes in polarization are achieved through polarization rotators, while the prisms provide the necessary shifting and overlap of the images. The polarizing filters complete the recognition process by producing a logical zero if the search pattern is present in the input scene. In both cases, the recognition output is passed through a nonlinear device for appropriate optical logic and signal regeneration for the subsequent processing in the substitution stage.

FIGURE 13. A symbolic substitution processor based on spatial filtering. (a) Coding scheme. (b) Substitution rules. (c) Schematic diagram. Adapted from Jeon et al. (1990).

FIGURE 14. Single-channel optical symbolic recognition system using (a) dual-rail logic and (b) polarization-based logic. Adapted from Mait and Brenner (1988).

Substitution systems for intensity-coded and polarization logic are represented in Figs. 15a and 15b, respectively. In the substitution system, the recognized outputs of Fig. 14a or Fig. 14b are used as the input patterns depending on the type of coding used. The operations in this stage involve splitting, shifting, superimposing, and combining, which are realized using holographic splitters, prisms, and holographic combiners, respectively. Note that the substitution systems are constructed by reversing the order of operations in the recognition systems and interchanging the role of holographic combiners and splitters. The prisms, used as combiners in the recognition system, however, are replaced by holographic splitters in the substitution system. Finally, the holographic combiner combines the superimposed images to generate the final output. The intensity-coded technique was used by Cherri and Karim (1989b) to implement a multiplier and a histogram equalization processor, while the polarization-coded technique was used by Barua (1991a, b) to implement a high-speed multiplier and a binary median filter for image processing applications.

2. Dual-Channel System
A Michelson interferometer can be used to produce two shifted copies of an input pattern. However, using two phase-only holograms in conjunction with the interferometer it is possible to realize multiple shifted replicas, as shown in Fig. 16 (Mait and Brenner, 1988). The inclusion of holograms alters the interferometric system from being a geometrical-optics-based system to a diffractive-optics-based system. Moreover, the two channels allow for direct realization of a complex transfer function, as opposed to a sequential implementation employing a single channel. The dual-channel system presented herein is suitable only for intensity-coded logic.
D. OSS Using Opto-electronic Devices A symmetric self-electro-optic effect device (S-SEED) based symbolic substitution processor is shown in Fig. 17 (Cloonan, 1988). This parallel
FIGURE 15. Single-channel optical symbolic substitution system using (a) dual-rail logic and (b) polarization-based logic. Adapted from Mait and Brenner (1988).
architecture employs a multirule implementation consisting of an input, an output, and a processing loop. The processing loop contains multiple processing blocks and two S-SEED storage arrays (A and B). The outputs
FIGURE 16. Dual-channel optical symbolic substitution processor using a Michelson interferometer. Adapted from Mait and Brenner (1988).
FIGURE 17. A symbolic substitution processor using SEED arrays. Adapted from Cloonan (1988).
of the S-SEED storage arrays are combined into one image and looped back to the system input, where the image is split and directed to each of the processing blocks. In a processing block, the image is split and shifted into multiple copies to recognize a particular pattern. The multiple copies are then overlapped and directed to an array of SEED NOR gates. The NOR gates produce a logical 1 output only in the spatial regions where the search pattern is detected. The outputs from the SEED NOR gates can be shifted and overlapped to produce a scribing image. This image is combined with the outputs of the other processing blocks and is directed back to the S-SEED arrays. The scribing image writes the substitution pattern into the receiving S-SEED array whenever the NOR gates detect the search pattern.

In the symbolic substitution architecture of Fig. 17, multiple rules can be implemented at a given instant of time, because a copy of the initial image can be routed to each processing block. The external control unit selects which substitution rules are implemented at any instant of time. Since Fig. 17 shows six processing blocks, the system can implement six processing rules. Four of these six processing blocks provide the rules required for binary addition. The fifth and sixth blocks provide rules for crossed (uninverted) data transfers and uncrossed (inverted) data transfers. In a typical application, an input data pattern would be routed through the crossed processing block to S-SEED array A. The output from S-SEED array A could then be looped back through the four addition processing blocks, and the output images from these blocks would be recombined and used to simultaneously scribe the four substitution patterns into S-SEED array B. The resulting image could then be looped back through the uncrossed processing block to invert the bits and store them in array A. The image in array A, representing the inverted sum of bits, could then be routed to the system output. For an n-bit addition, a processing cycle is defined to be a single step in which the four substitution rules for binary addition are implemented in parallel during a single pass around the processing loop of Fig. 17. Thus, n + 1 processing cycles must be implemented around the processing loop to yield the final result.
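The n + 1 cycle count can be checked with a short software sketch. The code below is only an illustration under stated assumptions: it takes the four addition rules to be the usual half-adder rules (each column pair is replaced by a sum bit in place and a carry bit one column to the left), it ignores the data inversion performed by the S-SEED arrays, and the names `RULES`, `ss_add`, `to_bits`, and `from_bits` are hypothetical.

```python
# Assumed half-adder substitution rules: (top, bottom) -> (sum bit, carry bit).
RULES = {(0, 0): (0, 0), (0, 1): (1, 0), (1, 0): (1, 0), (1, 1): (0, 1)}


def to_bits(value, n):
    """Return the n least significant bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits for an LSB-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


def ss_add(a_bits, b_bits):
    """Add two n-bit operands (LSB first) in n + 1 parallel substitution cycles."""
    n = len(a_bits)
    width = n + 1                      # the sum needs at most n + 1 bits
    top = list(a_bits) + [0]
    bottom = list(b_bits) + [0]
    for _ in range(n + 1):             # one pass around the loop per cycle
        new_top = [0] * width
        new_bottom = [0] * width
        for col in range(width):       # all columns are rewritten in parallel
            s, c = RULES[(top[col], bottom[col])]
            new_top[col] = s
            if col + 1 < width:        # the carry is written one column to the left
                new_bottom[col + 1] = c
        top, bottom = new_top, new_bottom
    return top                         # the carry row is all zeros by now


print(from_bits(ss_add(to_bits(5, 4), to_bits(7, 4))))  # 12
```

After k cycles the k lowest carry bits are zero, which is why n + 1 cycles are enough to clear the carry row for n-bit operands.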
E. OSS Using Acousto-optic Cells

Botha et al. (1987) introduced an architecture for symbolic substitution using presently available devices such as light-emitting diodes (LEDs), multichannel correlators, and detectors. The method was intended for the implementation of the ripple carry addition rules. Hwang and Louri (1987) modified this architecture to implement the modified signed-digit addition
and subtraction, as shown in Fig. 18a. The search patterns are introduced through the LEDs in the input plane of the recognition phase. Each column of LEDs corresponds to four pixels required to encode the operand digits. During the recognition, the input operands are fed to the multichannel acousto-optic cells, and the reference patterns are input through the LEDs. A multiplication of the reference patterns and the input operands takes place in the acousto-optic cells; the result is integrated onto the detectors by a lens. The number of detectors is equal to the number of search patterns. A peak signal on the output detectors indicates the presence in time of the
FIGURE 18. A symbolic substitution processor using multichannel acousto-optic cells. (a) Recognition phase and (b) substitution phase. Adapted from Hwang and Louri (1987).
search pattern. The substitution phase is implemented by the optical setup shown in Fig. 18b. The detector outputs of the recognition phase are used to scribe the desired substitution symbol on the LEDs situated in the input plane of the substitution phase. The LED light is imaged by a lens to an array of shift detectors. The size of the detector array is n × 4, where n is the operand length and the factor 4 corresponds to the encoding scheme (two pixels per digit). The right end of the detector array is fed back to the acousto-optic cells for the next iteration.

F. OSS Using Multiplexed Correlator

A correlator realization of symbolic substitution was developed by Botha et al. (1987) and Casasent and Botha (1989) for implementing logic and numeric operations. Figure 19 shows this architecture, which consists of a standard matched spatial filter correlator with multiple laser diode sources at P0 and multiple spatially multiplexed filters at P2, with frequency-multiplexed filters at each spatially multiplexed filter position. The output P3 contains four correlation planes. The laser diodes at P0 select the set of four frequency-multiplexed filters at P2. The recognition phase is achieved with one set of four frequency-multiplexed filters at one location in P2. The four-output correlation plane in P3 contains peaks at each location corresponding to the 00, 01, 10, and 11 patterns in the input plane. Figure 20 lists the substitution rules for the four possible input patterns to realize all 16 binary logic functions. In the substitution phase, one or all four correlation output planes are placed at P1, and a different set of four filters is placed
FIGURE 19. Optical symbolic substitution multiplexed correlator (P0: laser diodes; P1: input data; P2: filter bank and HOEs; P3: output correlation plane). Adapted from Casasent and Botha (1989).
FIGURE 20. Symbolic substitution rules for 16 logic functions. Adapted from Casasent and Botha (1989).
at P2, corresponding to each substitution rule. For each correlation peak, a given pattern replaces it, and the AND of all four substitution planes is formed. An example of the steps and intermediate results involved in the logic AND operation is shown in Fig. 21, where two four-digit spatially encoded binary numbers are used as the input operands (shown above P1). The four frequency-multiplexed filters at P2 are shown in the space domain. The output correlation plane data are shown above P3. The correlation plane data are thresholded by using a bistable optical device at P3. The four regions of P3 are Fourier transformed by a holographic optical element and illuminate four spatially multiplexed filters at P4. The four convolution outputs are superimposed at the output plane and yield the pattern shown, which is the AND of the two input operands in the same symbolic representation.

G. OSS Using Shadow-Casting and Polarization
Figure 22 illustrates the parallel implementation of the recognition phase of two symbolic substitution rules (Louri, 1991). The source plane consists of 2 × 2 orthogonally polarized LED arrays. Each element of the 2 × 2 array is a pair of orthogonally polarized LEDs that are located almost at the
FIGURE 21. The steps and intermediate results involved in a symbolic substitution operation. Adapted from Casasent and Botha (1989).
same point in the source plane. These LEDs can radiate both horizontally and vertically polarized light, which simultaneously passes through the input plane. The vertically polarized LEDs implement the substitution rule 1, while substitution rule 2 is implemented by the horizontally polarized LEDs. The horizontal and vertical states of polarization are represented by a horizontal and a vertical bar, respectively. The reference pixel for indicating the location of the search pattern is chosen to be the lower right corner of the search pattern. Each LED array provides multiple shadowgrams of the input plane such that all the pixels of its associated search pattern overlap in the reference pixel. The configuration of the LED arrays produces distinct, shifted copies of the input plane onto the screen. The superimposed image consists of pixels containing two horizontal polarizations, two vertical polarizations, one horizontal polarization, one vertical polarization, or both polarizations. Pixels containing two
FIGURE 22. Parallel implementation of the recognition phase of two symbolic substitution rules. Adapted from Louri (1991).
horizontal (vertical) polarizations indicate the presence and location of the search pattern for substitution rule 2 (substitution rule 1). This image is then passed through a Wollaston prism, which deflects the two states of polarization in opposite directions, thus producing two physically separate recognition planes. Note that the Wollaston prism splits the image according to polarization states. Consequently, there is no power loss in generating these two images. Next, a thresholding operation through an optical AND gate array is applied to these images. The thresholding operation makes all the pixels with two identical polarizations bright; all other pixels become opaque. The thresholded patterns are passed through an optical mask, whose transparent pixels coincide with the location of the reference pixel in the thresholded image, to filter out the erroneous pixels. Thus, the recognition phase output contains bright pixels only in those locations of the input plane where the search patterns are present.

Figure 23 shows the implementation of the substitution phase for a single substitution rule. The unpolarized LED array provides a superimposed image of shifted replicas of the recognition plane. The on-off state of each LED is dictated by the placement of the bright pixels in the substitution pattern. Thus, for a parallel implementation, two distinct LED arrays are required. The replicas are shifted and superimposed, thereby scribing the substitution pattern in all occurrences of the search pattern in the input image. The source plane configuration and reference pixel arrangement for the parallel implementation of four substitution rules are shown in Fig. 24.
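The shift, superimpose, threshold, and scribe sequence described above can be mimicked in software for a single rule. The following is a minimal sketch, not the optical implementation: it models intensity only (no polarization multiplexing), it checks only the bright pixels of the search pattern, and the function names and the offset convention (row and column displacements measured from the reference pixel) are assumptions made for illustration.

```python
def recognize(image, search_offsets):
    """Mark reference pixels where every bright pixel of the search pattern is present.

    Each offset (dr, dc) locates one bright pattern pixel relative to the
    reference pixel. Summing the shifted copies and keeping only pixels that
    reach the full count plays the role of the AND-gate threshold.
    """
    rows, cols = len(image), len(image[0])
    needed = len(search_offsets)
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for dr, dc in search_offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += image[rr][cc]
            out[r][c] = 1 if total == needed else 0
    return out


def substitute(recognized, substitution_offsets):
    """Scribe the substitution pattern at every recognized reference pixel."""
    rows, cols = len(recognized), len(recognized[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if recognized[r][c]:
                for dr, dc in substitution_offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1   # superposition of shifted replicas
    return out


image = [
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
# Hypothetical rule: a vertical bright pair (reference = lower pixel)
# is replaced by a horizontal bright pair (reference = right pixel).
found = recognize(image, [(0, 0), (-1, 0)])
result = substitute(found, [(0, 0), (0, -1)])
```

Because only bright pixels are tested, a practical rule set needs a coding scheme such as dual-rail or polarization coding so that dark pixels of the search pattern can also be verified.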
FIGURE 23. Parallel implementation of the substitution phase using shadow-casting. Adapted from Louri (1991).
FIGURE 24. (a) Source plane configuration and (b) reference pixel arrangement for the parallel implementation of four symbolic substitution rules (Rj: reference pixel for rule j). Adapted from Louri (1991).
H. OSS-Based Image Processing

Symbolic substitution has also been found to be attractive for image-processing applications such as morphological processing, feature suppression, histogram equalization, and image enhancement (Casasent and Botha, 1989; Cherri and Karim, 1989b; Goodman and Rhodes, 1988; Cherri and Karim, 1991; Eichmann et al., 1988). In this section, image enhancement using various image-sharpening operators is discussed.

1. Roberts and Difference Operators
The Roberts gradient is a simple nonlinear edge detector. It is employed by convolving an image with two 2 × 2 kernels which approximate the horizontal and vertical strength of the edge at each pixel location, given by

$$R_+ = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \qquad R_- = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$

Thus, the Roberts operator is especially useful for detecting edges of an image along the diagonal. Each pixel in the input is replaced with the larger of the absolute values of these two operators, given by

$$f_{\mathrm{edge}}(x, y) = \max\left(|R_+|, |R_-|\right). \tag{V.1}$$

The Roberts edge kernels are convolved with the input image to create the edge-enhanced or sharpened image. It is obvious that the Roberts operator suffers from directional dependency. For illustration, R+ detects only 45° edges and R− detects only 135° edges (Cherri and Karim, 1991). To avoid the directional dependency of the Roberts operator, a modified Roberts
operator may be used. Table Va lists the 16 symbolic substitution rules corresponding to the 16 input combinations of the 2 × 2 window of the Roberts mask. The input variables A, B, C, and D represent the binary image values at the (i, j), (i + 1, j), (i, j + 1), and (i + 1, j + 1) pixel locations, respectively. Notice that these input combinations include all possible edge orientations: vertical, horizontal, diagonal, no edge, and corners. The absolute values of the responses R+ and R−, as well as their maxima, are listed in columns six through eight of Table Va, where

$$|R_+| = |A - D| \tag{V.2}$$

and

$$|R_-| = |B - C|. \tag{V.3}$$

From Table Va, it is evident that the edge detector can be implemented by using the input combinations that yield either the 1 outputs or the 0 outputs. Since the 0 output involves only four input conditions (1, 7, 10, and 16), it is preferable to design the symbolic substitution-based edge detector using the minterms corresponding to the 0 outputs. A CAM-based symbolic substitution system, such as the one proposed by Mirsalehi and Gaylord (1986), may be used to implement the symbolic substitution-based edge detector. Since CAMs use reduced logical expressions, many recognition-substitution rules can be combined to generate an output pattern, thus reducing the overall number of recognition-substitution patterns required to perform a specific task. The reduced minterms are used as references and stored in Fourier holograms. In this case, four holograms will be necessary for each output bit in the image, whether one elects to use only output 0's or only output 1's.

A second pattern check may now be considered for edge detection. In binary images, an edge is detected whenever a 0 → 1 or a 1 → 0 transition is encountered. For detecting a horizontal/vertical edge, one can make use of a vertical/horizontal difference operator, given by

$$D_+ = \begin{bmatrix} 1 & -1 \end{bmatrix}$$

and

$$D_- = \begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$

Again, to avoid directional dependency of these operators, they are combined to form a modified digital gradient in almost the same manner as for the Roberts operators. Since the first pixel of both D+ and D− is at the same position, a three-pixel mask will be sufficient to extract edges. Table Vb lists the eight input combinations for the three-pixel windows, with I, J, and K representing binary image values at the (i, j), (i, j + 1), and (i + 1, j) pixel positions, respectively, of the input image. The corresponding absolute difference responses and modified digital gradients are also listed in the table.
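The rule tables described above follow directly from Eqs. (V.2) and (V.3) and from the difference responses |I − J| and |I − K|. The sketch below enumerates them, assuming that entries are numbered from 1 in binary counting order with A (or I) as the most significant bit; the function names are hypothetical.

```python
from itertools import product


def roberts_rules():
    """Rows (entry, A, B, C, D, |R+|, |R-|, max) for all 16 windows of the 2 x 2 mask."""
    rows = []
    for entry, (a, b, c, d) in enumerate(product((0, 1), repeat=4), start=1):
        r_plus = abs(a - d)    # Eq. (V.2)
        r_minus = abs(b - c)   # Eq. (V.3)
        rows.append((entry, a, b, c, d, r_plus, r_minus, max(r_plus, r_minus)))
    return rows


def difference_rules():
    """Rows (entry, I, J, K, |DX|, |DY|, max) for all 8 three-pixel windows."""
    rows = []
    for entry, (i, j, k) in enumerate(product((0, 1), repeat=3), start=1):
        dx = abs(i - j)
        dy = abs(i - k)
        rows.append((entry, i, j, k, dx, dy, max(dx, dy)))
    return rows


roberts_zero = [row[0] for row in roberts_rules() if row[-1] == 0]
difference_zero = [row[0] for row in difference_rules() if row[-1] == 0]
print(roberts_zero)      # [1, 7, 10, 16]
print(difference_zero)   # [1, 8]
```

Under this numbering the four zero-output Roberts entries agree with the input conditions 1, 7, 10, and 16 quoted in the text, and the difference operators have two zero-output entries.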
TABLE V. Symbolic substitution rules for edge detection using (a) Roberts operators and (b) difference operators.
(a) Roberts masks. Columns: Entry #, A, B, C, D, |R1|, |R2|, max(|R1|, |R2|).
(b) DX and DY operators. Columns: Entry #, I, J, K, |DX|, |DY|, max(|DX|, |DY|).
2. Simulation Results

A 64 × 64 pixel binary image having edges along multiple orientations is subjected to a digital edge detection experiment. The input image is
FIGURE 25. A 64 × 64 pixel input image.
intensity encoded as shown in Fig. 25. For a CAM-based symbolic substitution detector, reduced minterms are used for recognition. Figure 26a shows the edge-detected output obtained using Roberts operators, while Fig. 26b shows the output obtained using difference operators. Notice that because of the symmetric nature of the Roberts mask, all edges have equal strength, which is obvious from Fig. 26a. If a difference operator is used instead of the Roberts operator, then from Fig. 26b it is evident that the difference operator cannot detect all edges with equal strength. Notice the weak response at both upper left corners and at 135° edges. The Roberts operator requires either four symbolic substitution rules or four reduced minterms, while two symbolic substitution rules or two reduced minterms are needed for the difference operators. Comparing Fig. 26a with Fig. 26b, we observe that the Roberts operator-based symbolic edge detector yields better image quality.
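The difference in corner response can be reproduced on a small test pattern. The sketch below is an illustration only: it uses a 5 × 5 image rather than the 64 × 64 image of Fig. 25, implements recognition as a lookup against the zero-output window patterns, and assumes that the first window coordinate runs along rows. All names are hypothetical.

```python
# Window patterns that produce a 0 output; every other pattern produces a 1.
ROBERTS_ZERO = {(0, 0, 0, 0), (0, 1, 1, 0), (1, 0, 0, 1), (1, 1, 1, 1)}
DIFFERENCE_ZERO = {(0, 0, 0), (1, 1, 1)}

# Pixel offsets (row, column) read into each window, relative to the current pixel.
ROBERTS_OFFSETS = ((0, 0), (0, 1), (1, 0), (1, 1))
DIFFERENCE_OFFSETS = ((0, 0), (0, 1), (1, 0))


def detect_edges(image, zero_patterns, offsets):
    """Write 0 where the local window matches a zero-output pattern, else 1."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):          # the last row and column have no full window
        for c in range(cols - 1):
            window = tuple(image[r + dr][c + dc] for dr, dc in offsets)
            out[r][c] = 0 if window in zero_patterns else 1
    return out


square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
roberts_out = detect_edges(square, ROBERTS_ZERO, ROBERTS_OFFSETS)
difference_out = detect_edges(square, DIFFERENCE_ZERO, DIFFERENCE_OFFSETS)
```

For this test square the Roberts rules give a closed contour, while the difference rules leave the upper-left corner pixel dark, because the three-pixel window there contains no transition.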
VI. LIMITATIONS AND CHALLENGES
Symbolic substitution is a very powerful technique for optical parallel digital computing. However, in addition to encoding of data, symbolic substitution requires a coherent source for recognition and substitution. The primary limitation of symbolic substitution logic is that it can only be used in a space-invariant interconnected architecture (Alam et al., 1992a;
FIGURE 26. Edge-detected image using (a) Roberts operator and (b) difference operator.
Barua, 1991b). That is, there has to be the same number of inputs to each gate or module, the same number of outputs, and all of the connections have to be exactly the same between the modules in a stage. Other drawbacks include a requirement for higher space-bandwidth product.
REFERENCES

Alam, M. S., Awwal, A. A. S., and Karim, M. A. (1992a). Digital optical processing based on higher-order modified signed-digit symbolic substitution. Appl. Opt. 31, 2419.
Alam, M. S., Karim, M. A., Awwal, A. A. S., and Westerkamp, J. J. (1992b). Optical processing based on conditional higher-order trinary modified signed-digit symbolic substitution. Appl. Opt. 31, 5614.
Avizienis, A. (1961). Signed-digit number representation for fast parallel arithmetic. IRE Trans. Electron. Comput. EC-10, 389.
Awwal, A. A. S., Islam, M. N., and Karim, M. A. (1992). Modified signed-digit trinary arithmetic by using optical symbolic substitution. Appl. Opt. 31, 1687.
Barua, S. (1991a). Finite impulse response-median hybrid filtering techniques for image smoothing. Opt. Eng. 30, 271.
Barua, S. (1991b). High-speed multiplier for digital signal processing. Opt. Eng. 30, 1997.
Bocker, R. P., Drake, B. L., Lasher, M. E., and Henderson, T. B. (1986). Modified signed-digit addition and subtraction using optical symbolic substitution. Appl. Opt. 25, 456.
Botha, E., Casasent, D., and Barnhard, E. (1987). Optical symbolic substitution using multichannel correlators. Appl. Opt. 27, 817.
Brenner, K. H., Huang, A., and Streibl, N. (1986). Digital optical computing with symbolic substitution. Appl. Opt. 25, 3054.
Brenner, K. H., Lohmann, A. W., and Merklein, T. K. (1989). Symbolic substitution implemented by spatial filtering logic. Opt. Eng. 28, 390.
Casasent, D., and Botha, E. (1989). Multifunctional optical processor based on symbolic substitution. Opt. Eng. 28, 425.
Cherri, A. K., and Karim, M. A. (1988). Modified signed-digit arithmetic using an efficient symbolic substitution. Appl. Opt. 27, 384.
Cherri, A. K., and Karim, M. A. (1989a). Symbolic substitution based flagged arithmetic using polarization encoded optical shadow-casting. Opt. Commun. 70, 455.
Cherri, A. K., and Karim, M. A. (1989b). Symbolic substitution based operations using holograms: Multiplication and histogram equalization. Opt. Eng. 28, 638.
Cherri, A. K., and Karim, M. A. (1991). Image enhancement using optical symbolic substitution. Opt. Eng. 30, 259.
Cloonan, T. J. (1988). Performance analysis of optical symbolic substitution. Appl. Opt. 27, 1701.
Drake, B. L., Bocker, R. P., Lasher, M. E., and Henderson, T. B. (1986). Photonic computing using modified signed-digit number representation. Opt. Eng. 25, 38.
Eichmann, G., Zhu, J., and Li, Y. (1988). Optical parallel image skeletonization using content-addressable memory based symbolic recognition. Appl. Opt. 28, 2905.
Eichmann, G., Kostrzewski, A., Kim, D. H., and Li, Y. (1990). Optical higher order symbolic recognition. Appl. Opt. 29, 135.
Goodman, S. D., and Rhodes, W. T. (1988). Symbolic substitution applications to image processing. Appl. Opt. 27, 1708.
Huang, A. (1983). Parallel algorithms for optical digital computers. Tech. Dig., Int. Opt. Comput. Conf., 10th, IEEE Computer Society, Los Angeles, 1983, 13.
Huang, A., Tsunida, Y., Goodman, J. W., and Ishihara, S. (1979). Optical computation using residue arithmetic. Appl. Opt. 18, 149.
Hwang, K., and Louri, A. (1987). Optical multiplication and division using modified signed-digit symbolic substitution. Appl. Opt. 27, 817.
Jeon, H., Abushagur, M. A. G., Sawchuk, A. A., and Jenkins, B. K. (1990). Digital optical processor based on symbolic substitution using holographic matched filtering. Appl. Opt. 29, 2113.
Karim, M. A., and Awwal, A. A. S. (1992). Optical Computing: An Introduction. Wiley, New York.
Kozaitis, S. P. (1988). Higher-ordered rules for symbolic substitution. Opt. Commun. 65, 339.
Li, Y., and Eichmann, G. (1987). Conditional symbolic modified signed-digit arithmetic using optical content-addressable memory logic elements. Appl. Opt. 26, 2328.
Li, Y., Eichmann, G., Dorisinville, R., and Alfano, R. K. (1988). Parallel digital and symbolic optical computation via optical phase conjugation. Appl. Opt. 27, 2025.
Li, Y., Kim, D. H., Kostrzewski, A., and Eichmann, G. (1989). Content-addressable-memory-based single-stage optical modified signed-digit arithmetic. Opt. Lett. 14, 1254.
Louri, A. (1991). Parallel implementation of optical symbolic substitution logic using shadow-casting and polarization. Appl. Opt. 30, 540.
Mait, J. N., and Brenner, K. H. (1988). Optical symbolic substitution: System design using phase-only holograms. Appl. Opt. 27, 169.
Mirsalehi, M. M., and Gaylord, T. K. (1986). Truth table look-up parallel data processing using an optical content-addressable memory. Appl. Opt. 25, 77.
Thalmann, R., Pedrini, G., and Weible, K. J. (1990). Optical symbolic substitution using diffraction gratings. Appl. Opt. Commun. 29, 2126.
Yu, F. T. S., and Jutamulia, S. (1987). Implementation of symbolic substitution logic using optical associative memories. Appl. Opt. 26, 2293.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 89

Semiconductor Quantum Devices

MARC CAHAY
Department of Electrical and Computer Engineering, University of Cincinnati, Cincinnati, Ohio

and

SUPRIYO BANDYOPADHYAY
Department of Electrical Engineering, University of Notre Dame, Notre Dame, Indiana

I. Introduction
II. Quantum Devices
A. Electron Wave Devices
B. Multichannel and Nonlinear Transport in Electron Wave Devices: Additional Requirements for Interference
C. Wave Behavior of Electrons in Solids: The Analogy with Linear Optics and Microwaves
D. Current and Conductance Formulas for Electron Wave Devices
III. Resonant Tunneling Devices
A. Space Charge Effects in Resonant Tunneling Devices: Possible Sources of Bistability
B. Effects of Inelastic Scattering on the Device Characteristics of Double-Barrier Resonant Tunneling Diodes
C. Applications of Resonant Tunneling Devices
D. Quantum-Mechanical Tunneling Time and Its Relation to the Tsu-Esaki Formula
IV. Aharonov-Bohm Effect-Based Devices
A. The Optical Analog of Aharonov-Bohm Devices: The Mach-Zehnder Interferometer
B. The Magnetostatic Aharonov-Bohm Effect in Wide Rings or Double Quantum Wells
C. The Electrostatic Aharonov-Bohm Effect in Double Quantum Wells: Possible Ultrahigh-Performance Transistors
D. The Electrostatic Aharonov-Bohm Effect in Disordered Structures: Comparison between Double Quantum Wires and Double Quantum Wells in the Presence of Elastic Scattering
V. T-Structure Transistors
A. Analysis of T-Structure Transistors
B. Sensitivity of the Device Characteristics to Structural Dimensions: Implications for Integrated Circuit Implementation
C. Device Characteristics of a Single Transistor
D. Analog Applications
E. Digital Applications
F. Electro-optic Applications
VI. Electron Wave Directional Couplers
A. Theory of Electron Wave Directional Couplers
VII. Spin Precession Devices
VIII. Granular Electron Devices
IX. Connecting Quantum Devices on a Chip: The Interconnecting Problems
A. Coupling between Optical Interconnects
B. Quantum-Mechanical Coupling
C. Calculation of Coupling Coefficients
X. Quantum-Coupled Architectures and Quantum Chips
A. Shortcomings of Quantum-Coupled Devices
B. Quantum-Coupled Spin-Polarized Single-Electron Logic Devices
C. Logic Circuits Using Spin-Polarized Single Electrons
D. Reading and Writing Operations: Orienting and Detecting Electron Spins in Single-Electron Cells
E. Potential Problems in the Computing Scheme and Their Solutions
F. Performance Figures for Spin-Polarized Single-Electron Logic Devices
XI. Epilogue: The Long-Term Prognosis
References

Copyright © 1994 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014731-9
I. INTRODUCTION
This article presents a review of a recent genre of semiconductor electronic devices which utilize quantum-mechanical properties of charge carriers for their operation. Examples of such properties are the quantum-mechanical wave nature (as opposed to the classical particle nature) of electrons and holes, the granularity of charge, and the precession or polarization of an electron spin. We have attempted to present a comprehensive description, as well as a balanced view of the merits and drawbacks of these devices, which have been variously termed "quantum devices," "mesoscopic devices," and "nanostructure electronic devices." In addition to discussing individual quantum devices, we have also discussed how these devices can ultimately be incorporated in actual circuits to realize a complete functional unit such as a microprocessor or a supercomputer. The ideal chip architecture of quantum devices is quantum-coupled architecture, in which quantum-mechanical coupling between nearest neighbors replaces the physical wires that carry signals from one device to the next. The elimination of physical interconnects results in unprecedented density and vastly improved speed of operation. This article is not concerned with the merits of quantum devices alone. In spite of all their potential advantages, quantum devices also suffer from some serious drawbacks. We have addressed some of the drawbacks and
highlighted the unresolved questions that need to be answered before quantum devices and circuits become a reality. The subject matter of this article has matured in the last 10 years. Quantum devices have proliferated in the literature, and new ideas are being put forth with increasing frequency. Therefore, this review cannot be exhaustive, and we apologize to those whose work has not been mentioned because of our unfamiliarity with the details. In addition to the references listed in this article, the reader is referred to the bibliography on quantum interference effects in semiconductors published by Gaylord et al. (1991).

During the late 1970s and early 1980s, two revolutionary advancements were made in semiconductor technology. These were the invention and refinement of sophisticated thin-film growth techniques such as molecular beam epitaxy (MBE), organo-metallic chemical vapor deposition (OMCVD), or atomic layer epitaxy (ALE), and of superfine patterning techniques such as electron beam lithography (EBL), X-ray lithography, focussed ion beam milling (FIBM), or scanning tip lithography (STL). With the help of these new techniques it became possible to make a wide variety of unique ultrasmall semiconductor structures. Using MBE, for instance, one could grow structures consisting of multiple layers of ultrathin semiconductor films of extremely high purity and different material composition. The interfaces between such layers can be atomically sharp, and the thickness of each layer can be that of a monolayer (~4 Å) or even a fraction thereof. With STL, it is possible to pattern and delineate structures whose lateral dimensions are ~60 Å or less. The combination of the two techniques affords unique possibilities of making truly ultrasmall semiconductor structures. One can grow a layer of a material with thickness as small as 4 Å, and then pattern the "horizontal" dimensions down to about 60 Å or less. Such a structure is shown in Fig. 1.
The vertical dimension can be made comparable to the spacing between atoms in semiconductors, while the horizontal dimensions could become comparable to the de Broglie wavelength of electrons and holes, or the mean free path of electrons at room temperature, or the effective Bohr radius of excitons in semiconductors.¹ A host of novel physical phenomena, many of which are of purely quantum-mechanical origin, have been demonstrated in such structures. Quite often, they have intriguing device applications and are of paramount interest to device physicists and engineers.

¹ The thermal de Broglie wavelengths of electrons, light holes, and heavy holes in GaAs are 292 Å, 264 Å, and 112 Å, respectively. The Fermi-de Broglie wavelength at typical carrier concentrations is also of this order. The mean free path of electrons under quasi-equilibrium conditions is about 300 Å in GaAs at room temperature. The effective Bohr radii of electron-light hole excitons and electron-heavy hole excitons in GaAs are about 56 Å and % Å, respectively.

This article will provide an introduction to and a review of such devices and their quantum-mechanical operating principles. Ultrasmall structures of the type just discussed are generally referred to as quantum-confined systems or simply confined systems. Almost all quantum devices use confined systems. The origin of the name "confined system" lies in the fact that in these structures, an electron's (or hole's) motion is restricted in one, two, or all of the three possible coordinate directions, unlike in the case of bulk structures, where an electron is free to move in all
[Figure 1, panels (a) and (b). Labels: quantum wire, 2DEG, cap layer (n+ GaAs), donor layer (n+ AlGaAs), spacer layer (i-AlGaAs), quantum well (GaAs), i-AlGaAs, i-GaAs.]
FIGURE 1. A two-dimensional structure (quantum well) is fabricated by growing an ultrathin film surrounded by materials with wider bandgap. The electrons in the structure are confined in the ultrathin film, forming a two-dimensional electron gas (2DEG). To create a one-dimensional structure (quantum wire), one can either (a) etch a narrow mesa past the two-dimensional electron gas, or (b) employ shallow mesa etching, or (c) fabricate split metallic gates, or (d) deposit metals on the etched surface. In case (b), the two-dimensional electron gas is depleted under the etched (exposed) surface because of Fermi level pinning in the bandgap caused by surface states at the exposed surface. In cases (c) and (d), depletion is caused by applying a dc voltage (negative for electrons and positive for holes) to the split gates. The split gate structures are superior to the deep mesa etched structures, since one can vary the degree of one-dimensional confinement by varying the split gate voltage. Additionally, the edges of the structure are defined by electrostatic confinement, rather than by material boundaries, so that boundary roughness scattering is minimized.
SEMICONDUCTOR QUANTUM DEVICES
[Figure 1, panels (c) and (d). Labels: quantum wire, metal Schottky gate pad, 2DEG, cap layer (n+ GaAs), donor layer, spacer layer, quantum well (GaAs).]
FIGURE 1 (continued).
three directions. The restriction of motion accrues from the small size of the structure; it is caused by the fact that one, two, or all three dimensions of the structure are comparable to the de Broglie wavelength of the electrons at the Fermi energy. Obviously, the restriction can be of three types depending on how many dimensions are comparable to the de Broglie wavelength. In an ultrathin film grown by MBE, only the thickness is comparable to the wavelength. Such a structure is called a two-dimensional structure or a “quantum well,” in which an electron's motion is restricted only along the thickness. STL can then be used to pattern and subsequently trim (or etch) the film along the width to create a one-dimensional structure (called a “quantum wire”) whose thickness and width are both comparable to the de Broglie wavelength of electrons. Finally, the length can also be trimmed to create a quasi zero-dimensional structure (called a “quantum dot” or “quantum bubble”) in which all three dimensions are comparable to the de Broglie wavelength and the motion of electrons is restricted along all three coordinate axes.
It is now apparent that there are three different types of confined systems: (a) quantum wells, (b) quantum wires, and (c) quantum dots. All three types have found applications in existing or proposed quantum devices.
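As a rough numerical check on the length scales discussed above, the following sketch evaluates a thermal de Broglie wavelength and counts how many dimensions of a structure fall below it. The definition λ = h/√(2m*kT), the GaAs electron effective mass of 0.067 m0, and the hard threshold used in `classify_confinement` are assumptions made for illustration; none of them is taken from the text.

```python
import math

H = 6.62607015e-34      # Planck constant (J s)
K_B = 1.380649e-23      # Boltzmann constant (J/K)
M0 = 9.1093837015e-31   # free-electron mass (kg)


def thermal_de_broglie_angstrom(m_eff_ratio, temperature_k=300.0):
    """Thermal de Broglie wavelength in angstroms.

    Assumed definition: lambda = h / sqrt(2 m* k T).
    """
    momentum = math.sqrt(2.0 * m_eff_ratio * M0 * K_B * temperature_k)
    return H / momentum * 1e10


def classify_confinement(dims_angstrom, wavelength_angstrom):
    """Name the structure by counting dimensions not exceeding the wavelength.

    The sharp cutoff is a simplification of "comparable to the wavelength".
    """
    confined = sum(1 for d in dims_angstrom if d <= wavelength_angstrom)
    names = {0: "bulk", 1: "quantum well", 2: "quantum wire", 3: "quantum dot"}
    return names[confined]


if __name__ == "__main__":
    lam = thermal_de_broglie_angstrom(0.067)  # assumed GaAs electron mass
    print(f"thermal de Broglie wavelength: {lam:.0f} angstrom")
    print(classify_confinement((40.0, 1.0e4, 1.0e4), lam))
    print(classify_confinement((40.0, 60.0, 1.0e4), lam))
    print(classify_confinement((40.0, 60.0, 60.0), lam))
```

With these assumed inputs the electron wavelength comes out close to 300 Å, the same order as the value quoted for GaAs, and the three sample geometries are labeled as a well, a wire, and a dot.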
II. QUANTUM DEVICES
In many quantum-confined structures, the quantum-mechanical nature of charge carriers (electrons and holes) can become manifest. The most common, and perhaps the most mundane, example of this is a generic metal oxide semiconductor field-effect transistor (MOSFET), or its more modern cousin, the modulation-doped field-effect transistor (MODFET). In these rather garden-variety devices, the conducting channel (inversion layer or accumulation layer) is a two-dimensional quantum-confined system with the thickness comparable to the de Broglie wavelength of charge carriers. This leads to quantization of the kinetic energy of charge carriers in the conducting channel (this is the kinetic energy of motion along the thickness). The quantization, which is undoubtedly a quantum-mechanical phenomenon, has some effect on the total inversion or accumulation layer charge, transconductance of the transistor, threshold voltage, and other properties (see, for example, Stern and Howard, 1967; Delagebeaudeuf and Linh, 1982), but these effects are neither dramatic nor easy to observe unambiguously. Most importantly, the quantization of the energy plays no role in the operation of the MOSFET or MODFET; the operation still obeys entirely classical principles. Therefore, a MOSFET or MODFET is not a “quantum device” in the true sense; quantum mechanics is not central to its operation even though quantum mechanics does affect the device characteristics somewhat. In contrast, ultrashort constrictions or “point contacts” (ballistic quasi one-dimensional structures), which exhibit quantized conductance (Van Wees et al., 1988; Wharam et al., 1988) because of the same energy quantization, will qualify as “quantum devices,” since the quantization of the electron energy determines the basic feature of the device characteristics. What makes a device a true “quantum device” is quantum mechanics playing the central role (rather than a peripheral role) in the operation.
The old tunnel diode by this definition would be a true quantum device, as would be the superconducting quantum interference device (SQUID). We shall not discuss superconducting devices at all in this article, but one should be aware that semiconductors are not the only materials used for quantum devices. In fact, the use of semiconductors is quite recent, with the only exception being the tunnel diode pioneered by Esaki (1958).
The quantum behavior of electrons and holes in true quantum devices may be manifested in many different ways. For example, (a) the quantum-mechanical wave nature of charge carriers may hold dominant sway, and the device operation may be based on wave attributes such as interference, focussing, diffraction, etc.; (b) the granularity of charge (i.e., the fact that electrical charge comes in quanta of the electronic charge e) may form the basis of the device operation; or (c) such orthodox quantum-mechanical properties as the spin polarization of electrons may be responsible for device operation. Other types of quantum behavior (and corresponding quantum devices) may be possible (and indeed do exist), but are not addressed in this review. This review concentrates only on the three types of quantum devices that we have just mentioned. We will call them “electron wave devices,” “granular electronic devices,” and “electron spin devices,” respectively. They are discussed in the rest of this article.
Before concluding this section, we may mention that quantum devices are often termed by other names as well. The two most popular ones are “nanostructure devices” and “mesoscopic devices.” Because these devices are so small, they are referred to as “nanostructure devices” (their dimensions are typically smaller than 1 μm or 1,000 nm). For the same reason, they are often called “mesoscopic devices” because their dimensions span the range between the “microscopic” with atomic dimensions and the “macroscopic” with dimensions exceeding about 1 μm (in which classical physics usually dominates device behavior). It is important to keep these various names in mind since the terminology has still not been standardized in current literature.
A. Electron Wave Devices
In this subsection, we will introduce “electron wave devices” whose operations rely on the wave nature of electrons. The wave-particle duality of electrons is a basic tenet of quantum mechanics as embodied in the famous de Broglie relation²
$$p = h/\lambda, \tag{II.1}$$
where $p$ is the electron momentum and $\lambda$ is the electron's de Broglie wavelength. In electron wave devices, the primary requirement is to promote the wave nature of electrons over the particle nature. The most important characteristic of a wave is a well-defined phase, so the wave nature of electrons can be manifested only if we can ensure that the phase of the electron wave is not randomized by any perturbation. This requires that an electron preserve its phase (or phase memory) throughout the entire device. This is a necessary, but not a sufficient, condition for a device to act as an electron wave device. The question that naturally arises now is what ensures complete preservation of phase memory. In a solid not subjected to external acoustic or electromagnetic perturbations such as incoherent light or microwave radiation, the phase of an electron can be randomized only by so-called phase-randomizing (or dephasing) collisions that the electron can suffer. Therefore, we must make sure that no such collisions take place (or, to be more realistic, that they take place very infrequently). This means that the transit time $\tau_t$ of an average electron through the device must be smaller than the mean time $\tau_\phi$ between phase-randomizing collisions:
$$\tau_t < \tau_\phi. \tag{II.2}$$
² This relation is not quite applicable in the presence of a magnetic field, and a magnetic field is often required for quantum devices.
If the electron motion is diffusive, then the transit time is $\tau_t = L^2/D$, where $L$ is the length of the device along the direction of current flow (the distance between the so-called “source” and “drain” contacts in the parlance of electronic devices) and $D$ is the diffusion coefficient. Therefore, the condition in Eq. (II.2) can be recast as
$$L < L_\phi = \sqrt{D\tau_\phi}. \tag{II.3}$$
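The diffusive coherence condition above is easy to evaluate numerically. The sketch below computes the phase coherence length as the square root of the product of the diffusion coefficient and the dephasing time, and checks whether a device of a given length satisfies the transit-time condition. The function names and the sample values (D = 0.1 m²/s, a dephasing time of 10 ps) are hypothetical and chosen only to produce round numbers.

```python
import math


def phase_coherence_length(diffusion_m2_s, tau_phi_s):
    """L_phi = sqrt(D * tau_phi), in meters."""
    return math.sqrt(diffusion_m2_s * tau_phi_s)


def diffusive_transit_time(length_m, diffusion_m2_s):
    """tau_t = L^2 / D for diffusive motion, in seconds."""
    return length_m ** 2 / diffusion_m2_s


def is_phase_coherent(length_m, diffusion_m2_s, tau_phi_s):
    """True when the transit time is shorter than the dephasing time."""
    return diffusive_transit_time(length_m, diffusion_m2_s) < tau_phi_s


if __name__ == "__main__":
    d, tau_phi = 0.1, 1.0e-11  # illustrative values, not from the text
    print(f"L_phi = {phase_coherence_length(d, tau_phi) * 1e6:.2f} um")
    for length in (0.5e-6, 2.0e-6):
        print(length, is_phase_coherent(length, d, tau_phi))
```

Comparing transit time with dephasing time and comparing the device length with the coherence length are the same test, so the helper uses the former and reports the latter only for reference.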
A necessary requirement for a device to become an electron wave device is now obvious: the dimension along the direction of current flow must be smaller than the phase-breaking length or phase coherence length $L_\phi$ defined by the preceding equation. Now comes the fundamental question. What properties determine $L_\phi$ (or, more fundamentally, $\tau_\phi$)? To answer this question, we must first define precisely what phase-randomizing collisions are. Not all collisions that an electron suffers in a solid are phase-randomizing. Only those that cause a change in the quantum state or an internal degree of freedom of the environment scramble the phase (Leggett, 1989). Therefore, generally inelastic collisions in which a nonzero amount of energy is exchanged by an electron with the environment (such as through phonon or photon emission or absorption, or through collision with another electron or hole) would be phase-randomizing. There are, however, two serious exceptions to this rule. An electron absorbing energy from a coherent radiation (such as a laser or a maser) is not undergoing a dephasing event. This has now been demonstrated in a series of beautiful experiments (Badurek et al., 1983). The other situation is pertinent to a two-branch interference experiment like Young's double-slit experiment in optics or its electronic analogue (Feynman and Hibbs, 1965). Let us consider a ring structure with two arms as shown in Fig. 2a.
[Figure 2, panels (a) and (b). Axis label in panel (b): Voltage.]
FIGURE 2. (a) A ring structure used as an interferometer much like Young's double-slit experiment. (b) The terminal current as a function of the phase (or voltage) difference accrued by an electron in traversing the two branches of the ring.
An electron enters from the left lead and exits at the right lead, where it is detected as terminal current. The terminal current should be determined by the interference of the two waves in the two branches, provided we cannot tell by any means which of the two branches the electron traversed before arriving at the right lead (Feynman and Hibbs, 1965). If this last condition is fulfilled, then the terminal current should oscillate as a function of the phase difference between the two branches, as shown in Fig. 2b.³ If, on the other hand, we could tell by some means which branch the electron took in arriving at the right lead, then the wavefunction would have collapsed in that branch and the interference would have been destroyed. Consider the situation in which the electron suffers an inelastic collision somewhere in the structure (in one of the two branches). This would locally change the temperature or cause local luminescence or absorption, depending on whether a phonon or photon was involved. By monitoring the temperature and luminescence of the two branches, we could tell which branch the electron was in when the event took place (this situation is depicted in Fig. 3a). Obviously, this inelastic event would be effectively phase-randomizing, since it would in principle tell the observer which branch the electron traversed. In other words, the inelastic collision would collapse the wavefunction in one of the branches and destroy the interference. Now consider the situation shown in Fig. 3b. Every time an inelastic collision occurs in one branch, an identical event occurs in the same position in the other branch. In this case, we could never tell which branch the electron was in, and the interference is recovered. Therefore, perfectly correlated (perfectly synchronized in time and space) inelastic collisions (admittedly a highly unlikely occurrence unless it involves absorption from a coherent photon source) are not phase-randomizing (Datta and Bandyopadhyay, 1987).
³ It is indeed possible to modulate the phase difference between the two branches with an external electric or magnetic field in an actual device, as we shall see later.
FIGURE 3. (a) An inelastic scattering event occurring in one branch of the ring, which collapses the wavefunction in that branch and ruins all interference effects. (b) Perfectly correlated and synchronized inelastic scattering events, which do not collapse the wavefunction in any branch and therefore do not destroy the interference.
What about elastic collisions? Since there is no finite energy exchange with the environment in these collisions, they are generally not phase-randomizing (Gunther and Imry, 1969; Landauer, 1970; Büttiker, 1985a,b). However, there is again an exception. An elastic collision with a magnetic
impurity may flip the spin of the impurity so that there is a change in the internal degree of freedom of the scatterer. Since the scatterer is a part of the environment, such a collision, albeit elastic, is phase-randomizing. From the preceding discussions, we get some inkling of what determines $\tau_\phi$ and therefore $L_\phi$. Since $\tau_\phi$ is determined by inelastic collisions and collisions with magnetic impurities, the two obvious ways of increasing $\tau_\phi$ are (a) by reducing the ambient (lattice) temperature, which reduces inelastic phonon collision rates and electron-electron or electron-hole collision rates; and (b) by eliminating magnetic impurities such as the ones that cause the Kondo effect. In addition, $L_\phi$ can be increased by increasing the diffusion coefficient $D$. The latter can be increased by reducing the concentration of impurities and crystallographic defects which degrade the mobility of the structure. Therefore, we need clean (relatively impurity-free or high-mobility) structures and low temperatures for electron wave devices.
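The two-branch argument of this subsection can be illustrated with a toy model. The sketch below adds two branch amplitudes with a relative phase and then averages the transmitted intensity over an ensemble in which each electron picks up an extra random phase. The equal amplitudes of 0.5 and the uniformly distributed phase kick are assumptions made for illustration; this is not the authors' model of dephasing.

```python
import cmath
import math
import random


def ring_transmission(delta_phi, a1=0.5, a2=0.5):
    """Intensity of two interfering branch amplitudes with relative phase."""
    return abs(a1 + a2 * cmath.exp(1j * delta_phi)) ** 2


def averaged_transmission(delta_phi, phase_spread, samples=20000, seed=0):
    """Ensemble average with a random phase kick in [-spread, +spread].

    The uniform distribution of the kick is an illustrative assumption.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        kick = rng.uniform(-phase_spread, phase_spread)
        total += ring_transmission(delta_phi + kick)
    return total / samples


def visibility(phase_spread):
    """Contrast (max - min) / (max + min) of the averaged oscillation."""
    t_max = averaged_transmission(0.0, phase_spread)
    t_min = averaged_transmission(math.pi, phase_spread)
    return (t_max - t_min) / (t_max + t_min)


if __name__ == "__main__":
    for spread in (0.0, math.pi / 2, math.pi):
        print(f"spread = {spread:.2f} rad, visibility = {visibility(spread):.3f}")
```

In this model the oscillation has full contrast when every electron carries the same phase difference, and the contrast falls toward zero as the spread in phase approaches π.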
B. Multichannel and Nonlinear Transport in Electron Wave Devices: Additional Requirements for Interference
At the beginning of the previous section we mentioned that a necessary (but not sufficient) condition for a device to act as an electron wave device is that the phase memory of every electron passing through the device be preserved throughout the entire device. We now elucidate why this is not a sufficient condition. There are two reasons for this. If we consider an electron moving freely along the x-direction, then the x-component of its wavefunction is a plane wave, $\psi(x, t) = e^{i(k_x x - Et/\hbar)}$. The electron's phase is then $(k_x x - Et/\hbar)$, which depends on the wavevector $k_x$ and the energy $E$ of the electron. We want to make sure this phase is unique for every electron that traverses the device. Since the current through the device is determined by the ensemble average over all such electrons, the phase shift has to be more or less unique. Otherwise, ensemble averaging will wash out any interference effect. Uniqueness of the phase can only be guaranteed if both $E$ and $k_x$ are conserved. Let us see how we can meet the first requirement, namely the invariance of $E$. This is met by enforcing two conditions: (a) no random inelastic collisions to change $E$ unpredictably, thereby causing different electrons to have different $E$, and (b) “monochromaticity” of the incident electron flux. All electrons carrying current must have the same energy $E$. This is possible only at very low temperatures and low applied voltages (biases). In this so-called linear response regime of transport, only electrons at the Fermi level can carry current. This happens because there are no electrons above the Fermi level at low temperatures, and states below the Fermi level are
filled and therefore Pauli-blocked (electrons cannot flow into a filled state because of the Pauli Exclusion Principle). In this case, a current-carrying electron's energy is fixed; it is the Fermi energy $E_F$.
What about the second requirement, namely invariance of $k_x$? To see how this can be ensured, consider a quasi one-dimensional structure with multiple transverse subbands occupied. Here $E$ and $k_x$ are related by (we neglect band-structure nonparabolicity effects)
$$E = E_m + E'_n + \frac{\hbar^2 k_x^2}{2m^*}, \tag{II.4}$$
where $E_m$ is the subband bottom energy for the $m$th subband along the transverse y-direction, $E'_n$ is the subband bottom energy for the $n$th subband in the transverse z-direction, and $m^*$ is the effective mass for motion along the x-direction. The energy dispersion relation $E$ versus $k_x$ is shown in Fig. 4a. Note that an elastic scattering event can take an electron from one subband to a multitude of others without changing the net energy $E$. However, this changes $k_x$ by an arbitrary amount, and therefore the absolute phase shift changes by an arbitrary amount. In two- or three-dimensional structures, the number of transverse subbands can be viewed as being essentially infinite, so that there are infinite possibilities regarding the change in $k_x$. (The elastic scattering event is not phase-randomizing, since elastic perturbations are time-invariant. Therefore, we can trace the system back in time to recover the phase information. But this kind of scattering causes different electrons to have different phase shifts even when they all have the same energy.) The only way to ensure that elastic scattering does not change $k_x$ randomly is to make sure that only one transverse subband is filled in either the y- or z-direction so that transport is single-channeled. For such a system, if $E$ is constant, so is $k_x$. This allows us to meet the second requirement.⁴
There are thus three conditions that together are sufficient to ensure wave behavior of electrons. These are:
1. no inelastic collisions, or very infrequent inelastic collisions
2. single- or few-channeled transport, as in quasi one-dimensional structures
3. linear response transport
⁴ It is still possible that an elastic scattering event can change the sign of $k_x$, as shown in Fig. 4b. The accompanying momentum change is, however, quite large, especially if $k_x$ is large enough. This makes it a highly unlikely event (Sakaki, 1980), since most elastic scattering potentials are long-range and can only effect small momentum changes. In other words, large-angle scatterings (such as over an angle of 180°) as depicted in Fig. 4b have very low probabilities.
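The role of multiple subbands can be made concrete with a short calculation. The sketch below lists the longitudinal wavevectors available at a fixed total energy when the transverse subband energies are those of a hard-wall well of a given width. The hard-wall (infinite square well) subband ladder, the single frozen subband in the other transverse direction, the effective mass of 0.067 m0, and the 10 meV energy are all assumptions made for illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
M0 = 9.1093837015e-31    # free-electron mass (kg)
EV = 1.602176634e-19     # joules per electron volt


def subband_energy(index, width_m, m_eff_kg):
    """Bottom energy of subband `index` in a hard-wall well (assumed model)."""
    return (HBAR * math.pi * index / width_m) ** 2 / (2.0 * m_eff_kg)


def propagating_wavevectors(energy_j, width_m, m_eff_kg):
    """k_x for every subband whose bottom lies below the total energy.

    Uses E = E_m + hbar^2 k_x^2 / (2 m*), with the other transverse
    subband energy taken as the zero of energy.
    """
    wavevectors = []
    index = 1
    while subband_energy(index, width_m, m_eff_kg) < energy_j:
        kinetic = energy_j - subband_energy(index, width_m, m_eff_kg)
        wavevectors.append(math.sqrt(2.0 * m_eff_kg * kinetic) / HBAR)
        index += 1
    return wavevectors


if __name__ == "__main__":
    m_eff = 0.067 * M0          # assumed GaAs electron effective mass
    energy = 10.0e-3 * EV       # illustrative Fermi energy
    for width in (30e-9, 100e-9):
        ks = propagating_wavevectors(energy, width, m_eff)
        print(f"width {width * 1e9:.0f} nm: {len(ks)} channel(s)")
```

For the narrow wire only one value of the wavevector exists at the chosen energy, so elastic scattering cannot change its magnitude; the wider wire offers several distinct values at the same energy.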
[Figure 4, panels (a) and (b): energy versus wavevector, with the Fermi level marked.]
FIGURE 4. (a) The energy dispersion (E-k) relations for various occupied subbands. An elastic scattering event can take an electron from one subband to a multitude of others without changing the total energy. (b) When only one subband is occupied, an elastic scattering event can take an electron from a wavevector state to another with equal but opposite wavevector. This backward scattering by 180° is, however, highly improbable for high-velocity electrons, since it entails a very large momentum change. Such momentum changes usually cannot be caused by scattering potentials that vary smoothly in space, such as impurity scattering potentials that are Coulombic.
These requirements together are sufficient, but not all of them are necessary. Only the first one is always necessary. To see when the last two requirements are not necessary, consider the following. There are instances when the operation of an electron wave device depends on the interference of waves in two branches of a structure. This requires that the phase difference $(k_1 x_1 - E_1 t_1) - (k_2 x_2 - E_2 t_2)$
between the two branches be unique, while the absolute phases $k_1 x_1 - E_1 t_1$ or $k_2 x_2 - E_2 t_2$ in each branch need not be. An example of such a device is the “quantum interference transistor” utilizing the magnetostatic Aharonov-Bohm effect (Datta et al., 1985), which relies on the interference of waves in two branches of a ring-like structure. In this device, the phase difference between the two branches is independent of both $k_x$ and $E$ if the two branches are identical and either transport is ballistic or scattering events in the two branches are perfectly correlated in time and space (we shall show this later in Section IV). In that case, neither linear response nor single-channeled transport is required for essentially a 100% interference effect (Datta et al., 1985; Datta and Bandyopadhyay, 1987; Datta, 1989a). Another example of this is encountered in the “spin precession transistor” utilizing interference between electron spins (Datta and Das, 1990). This device is discussed in Section VII. Here again, conditions 2 and 3 are not necessary, even though condition 1 still is.
Before we conclude this section, let us address the more general situation where $E$ and $k_x$ do determine the phase that controls interference effects. How necessary is it to meet the conditions of linear response (low temperature, low bias) and single-channeled transport in such cases? Quantum interference effects have been demonstrated in semiconductors at quite high temperatures (~100 K) and high biases (Webb, 1989). They have also been demonstrated in metallic structures where transport is far from single-channeled (metallic structures used in quantum interference experiments typically involve transport in $10^5$ channels) (Webb et al., 1985; Chandrasekhar et al., 1985). So what really are the important (rather than the absolute) requirements? At any finite temperature, there is a spread in the energy of electrons that carry current through the device.
This spread is approximately $kT$ around the Fermi energy, where $T$ is the electron temperature (which is also the lattice temperature if the bias is small enough that the electron distribution is not heated). Because of this spread, different electrons traversing the device will have different energies and therefore different phase shifts, which will dilute the interference effects. The dilution occurs because if the spread in the phase shift is ~$\pi$, then some electrons would interfere constructively while others would interfere destructively, and there would be no net interference effect. This is the deleterious role of ensemble averaging. The question now is how large we can allow $T$ (or, equivalently, the spread in the electron energy around the Fermi level) to be. The maximum allowed value of this spread is such that two electrons that have this difference in energy and traverse the device along the same path acquire a phase difference of $\pi$. If the path length is $S$, then this maximum allowed
spread in electron energy, which is called the “correlation energy” $E_c$ (a term adopted from electromagnetics), is obtained from (Stone, 1985)
$$2\pi S\left[\frac{1}{\lambda(E + E_c)} - \frac{1}{\lambda(E)}\right] = \pi, \quad\text{so that}\quad E_c \approx \frac{\pi\hbar v_F}{S}, \tag{II.5}$$
where $v_F$ is the Fermi velocity (which is the average velocity of electrons around the Fermi level) and $\lambda$ is the de Broglie wavelength, which depends on the electron energy $E$. If transport is ballistic, then $S$ is equal to the device length $L$, and we find that the maximum allowed temperature $T_{\max}$ is given by
$$kT_{\max} = E_c \approx \pi\hbar v_F/L. \tag{II.6}$$
On the other hand, if transport is diffusive (elastic collisions are present), then $S = v_F L^2/D$, and we get
$$kT_{\max} = E_c = \pi\hbar D/L^2. \tag{II.7}$$
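To get a feel for the temperatures involved, the following sketch evaluates the correlation energy as πħv_F/S for a path of length S and converts it to a maximum temperature in the ballistic and diffusive limits. The Fermi velocity, diffusion coefficient, and device length used in the example are hypothetical round numbers, not values from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
K_B = 1.380649e-23       # Boltzmann constant (J/K)


def correlation_energy(path_length_m, fermi_velocity_m_s):
    """E_c = pi * hbar * v_F / S for a path of length S, in joules."""
    return math.pi * HBAR * fermi_velocity_m_s / path_length_m


def t_max_ballistic(length_m, fermi_velocity_m_s):
    """Ballistic limit: the path length equals the device length."""
    return correlation_energy(length_m, fermi_velocity_m_s) / K_B


def t_max_diffusive(length_m, diffusion_m2_s):
    """Diffusive limit: k T_max = pi * hbar * D / L^2."""
    return math.pi * HBAR * diffusion_m2_s / (length_m ** 2 * K_B)


if __name__ == "__main__":
    length = 1.0e-6      # 1 um device (illustrative)
    v_fermi = 3.0e5      # m/s (illustrative)
    diffusion = 0.1      # m^2/s (illustrative)
    print(f"ballistic T_max: {t_max_ballistic(length, v_fermi):.2f} K")
    print(f"diffusive T_max: {t_max_diffusive(length, diffusion):.2f} K")
```

For these inputs both limits give temperatures of a few kelvin. Halving the length doubles the ballistic value and quadruples the diffusive one.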
The temperature $T_{\max}$ is called the “Thouless temperature” after Thouless, who first pointed out its importance (Edwards and Thouless, 1972), and its particular relevance to quantum interference effects was later elucidated by many other authors (see, for example, Stone and Imry, 1986). The important points to note are the following. In ballistic transport, one can expect to see quantum interference effects at higher temperatures by increasing the carrier concentration, which results in a larger Fermi velocity $v_F$ (see Eq. II.6). In diffusive transport, the improvement accrues from increasing the diffusion constant. Since in degenerate semiconductors or metals the diffusion coefficient is given by
$$D = (1/q)\,v_F^2\,\tau, \tag{II.8}$$
where $\tau$ is the mean time between elastic collisions and $q$ is the dimensionality of the structure ($q = 2$ for quantum wells, $q = 1$ for quantum wires, etc.), it is obvious that higher operating temperatures can be obtained by increasing the carrier concentration and decreasing the density of elastic scatterers in a sample. In all cases, shorter samples work better, since $T_{\max}$ is inversely proportional to $L$ or $L^2$. Therefore, heavily doped, high-mobility, short structures are apparently ideal for electron wave devices. This is an important insight. There is, however, one qualification to the preceding statement. Higher carrier concentration also causes increased carrier-carrier scattering, which is dephasing. Therefore, increasing the carrier concentration shortens $L_\phi$ and correspondingly demands a shorter device length $L$ ($L \le L_\phi$). Nonetheless, as long as $L$ is significantly shorter than $L_\phi$, higher carrier
FIGURE 5. Calculation of the correlation temperature $T_{\max}$.
concentration helps. This was shown explicitly by Bandyopadhyay and Porod (1988a,b) in the context of the electrostatic Aharonov-Bohm interference effect. What happens when the device length $L$ is larger than the phase-breaking length $L_\phi$? In that case, the correlation energy $E_c$ is equal to the so-called Thouless energy (Webb, 1989),
$$E_c = \hbar D/L_\phi^2 = \hbar/\tau_\phi. \tag{II.9}$$
Note that the phase-breaking time $\tau_\phi$ is itself temperature-dependent and varies as $\tau_\phi \sim 1/T^p$, where $p$ is between 0.33 and 1 (Lin et al., 1983). One can then find the critical temperature $T_{\max}$ by solving for $T$ from the equation $kT = E_c(T)$. This is shown pictorially in Fig. 5. Once the temperature exceeds $T_{\max}$, the quantum interference effects generally decay slowly with increasing temperature. For instance, the quantum correction to the resistance of a structure varies as
$$\Delta R \sim T^{-p/2}.$$
This is a slow enough decay that quantum interference effects in the resistance of a structure may be observed at quite high temperatures. In fact, it is possible (although certainly not easy) to observe quantum interference effects at the liquid nitrogen temperature of 77 K in certain semiconductor structures (Bandyopadhyay and Porod, 1988a). In the end, there is one last question that may arise. Looking at Eq. (II.9), it appears as if decreasing the phase-breaking time is beneficial in that it increases $E_c$ and $T_{\max}$. That may be so, but overall it is not beneficial to decrease the phase-breaking time (such as by increasing the carrier
FIGURE 6. A set of ring structures arranged in series.
concentration to promote carrier-carrier scattering).⁵ A smaller phase-breaking length results in a weaker interference effect at a given temperature. Consider a set of rings with two branches arranged in series as shown in Fig. 6. The semi-circumference of each ring is smaller than $L_\phi$ so that each individual ring is phase-coherent, but the total length is larger than $L_\phi$. In that case, the quantum correction $\Delta G$ to the conductance of the ensemble varies as a positive power of $L_\phi$. Evidently, then, a larger $L_\phi$ is better. If, on the other hand, the semi-circumference of an individual ring is larger than $L_\phi$, then the quantum correction has a much stronger dependence on $L_\phi$; it varies as $e^{-L/L_\phi}$, where $L$ is the semi-circumference (Chandrasekhar et al., 1985). Therefore, decreasing $L_\phi$ is actually counterproductive. The next question that needs to be answered is, how large can the bias across a structure be before quantum interference effects are destroyed? The bias should be small enough to prevent electron heating so that the electron temperature does not exceed the lattice temperature. More importantly, the bias also effectively causes a spread in the electron energy equal to $eV$ ($V$ is the voltage). Insofar as this spread has to be smaller than the correlation energy, we arrive at the condition that
$$eV < \begin{cases} \pi\hbar v_F/L & \text{(ballistic transport)}, \\ \pi\hbar D/L^2 & \text{(diffusive transport)}. \end{cases} \tag{II.10}$$
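The bias condition can be turned into a number in the same way as the temperature condition. The sketch below computes the degenerate diffusion coefficient as v_F²τ/q and the largest bias for which eV stays below the correlation energy in the ballistic and diffusive limits. All sample values (Fermi velocity, elastic scattering time, device length) are hypothetical and serve only to indicate orders of magnitude.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
E_CHARGE = 1.602176634e-19  # elementary charge (C)


def diffusion_coefficient(fermi_velocity_m_s, tau_elastic_s, dimensionality):
    """D = v_F^2 * tau / q for a degenerate system of dimensionality q."""
    return fermi_velocity_m_s ** 2 * tau_elastic_s / dimensionality


def max_bias_ballistic(length_m, fermi_velocity_m_s):
    """Largest V satisfying eV < pi * hbar * v_F / L, in volts."""
    return math.pi * HBAR * fermi_velocity_m_s / (length_m * E_CHARGE)


def max_bias_diffusive(length_m, diffusion_m2_s):
    """Largest V satisfying eV < pi * hbar * D / L^2, in volts."""
    return math.pi * HBAR * diffusion_m2_s / (length_m ** 2 * E_CHARGE)


if __name__ == "__main__":
    v_fermi = 3.0e5      # m/s (illustrative)
    tau = 1.0e-12        # s, elastic scattering time (illustrative)
    length = 1.0e-6      # m (illustrative)
    d_well = diffusion_coefficient(v_fermi, tau, 2)
    print(f"D (quantum well) = {d_well:.3f} m^2/s")
    print(f"ballistic limit: {max_bias_ballistic(length, v_fermi) * 1e3:.2f} mV")
    print(f"diffusive limit: {max_bias_diffusive(length, d_well) * 1e3:.3f} mV")
```

With these inputs the allowed bias is well below a millivolt in both limits, which indicates how restrictive the linear response requirement is for a micron-scale device.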
We can now quantify the requirements of “linear response transport.” There are two independent requirements that impose a limit on the operating temperature and the bias. These are given by Eqs. (II.6), (II.7), and (II.10).
We now address the final issue, namely how many transport channels can be present (or, equivalently, how many transverse subbands can be occupied) in an electron wave device without diluting the interference effect. Different transport channels will undergo different phase shifts [since they have different $k_x$ even for the same energy $E$; see Eq. (II.4)], and ensemble averaging over the channels will presumably “self-average” away the interference effect. In fact, it was once postulated that the relative quantum correction to the conductance of a structure will decay as $1/M$, where $M$ is the number of transport channels (Büttiker et al., 1985). This is absolutely true if all the $M$ channels are uncorrelated. Actual simulations (Stone, 1985), however, showed that the relative conductance correction decayed much more slowly than $1/M$. This agrees with experiments (Webb et al., 1985) where $M$ was as large as $10^5$. If the $1/M$ rule were valid, then these experiments should not have shown any signature of quantum interference effects, since they would have been too small to detect. Therefore, the $1/M$ rule must not be strictly valid, which means that there must be some subtle, mysterious correlation between the channels (Stone, 1986) which causes a violation of the $1/M$ rule. It was later shown that this correlation (or “conspiracy”) exists only between the channels that transmit, not between channels that reflect (Lee, 1987). Presumably these correlations develop because the electrons in the transmitted channels undergo many elastic collisions before exiting at the drain contact of a device, while the reflected channels that end up in the injecting source have suffered very few collisions. So, somehow, these elastic collisions develop the correlations.
⁵ Note that we are talking about the situation when the device length $L$ is larger than $L_\phi$. When $L < L_\phi$, increasing the carrier concentration is in fact beneficial, as discussed before in connection with Eqs. (II.6)-(II.8).
C. Wave Behavior of Electrons in Solids: The Analogy with Linear Optics and Microwaves
In the classical picture, electrons (or holes) in solids do not behave as waves. Instead, they behave as particles that obey Newton's laws. The famous drift-diffusion formalism of electron transport (found in all undergraduate textbooks on semiconductor device physics) relies on the classical picture. The drift-diffusion formalism has two basic assumptions:
1. Electrons and holes are particles (like billiard balls) whose motion in a solid is governed by Newton's laws. External electric and magnetic fields accelerate these particles, which also encounter occasional scattering forces which could be either time-invariant (e.g., impurity scattering, which varies only randomly in space but not in time) or time-varying (e.g., phonon scattering, electron-electron scattering). The response of the particles to the scattering forces also obeys Newton's laws.
2. The accelerating electric or magnetic field (causing particle drift) changes slowly enough in space and time so that a particle can scatter many times before the field changes significantly. This allows the particles to equilibrate locally with the local electric or magnetic field.
SEMICONDUCTOR QUANTUM DEVICES
In many semiconductor structures such as submicron devices, doping or compositional superlattices, or n-n+ diodes, the internal electric field can change very rapidly in space. In fact, the spatial scale of the change can be much smaller than the mean free path. As a result, electrons or holes can no longer equilibrate with the local electric field via scattering. This violates the second assumption that underlies the drift-diffusion picture. When this happens, one encounters non-local hot electron effects such as velocity overshoot. In the presence of these effects, the drift-diffusion formalism is no longer adequate to describe charge transport.⁶ Transport of electrons and holes is then governed by the Boltzmann Transport Equation (BTE), which is more general than the drift-diffusion formalism⁷ but is still based on Newton’s laws.⁸ In other words, classical physics still holds sway. However, if the phase memory of electrons is preserved, classical physics is no longer adequate, and electrons or holes can no longer be treated as particles obeying Newton’s laws. Instead, they must be viewed as waves that propagate through the device according to the
⁶ Unlike velocity overshoot, saturation of the electron’s drift velocity, which occurs at high electric fields, is a hot electron effect, but not a non-local one. Local hot electron effects such as velocity saturation can be incorporated into the classical drift-diffusion picture by introducing nonlinearity into the linear differential equations of drift-diffusion. One way of doing this is to view the mobility and diffusion coefficient of electrons and holes in a material not as transport constants (i.e., intrinsic material-dependent parameters independent of the local electric field), but rather as transport variables that depend on the local field and vary spatially. The functional dependence of the mobility and diffusion coefficient on the local electric field needs to be established either experimentally or from more sophisticated theories such as the Boltzmann Transport Equation. The resulting equations are sometimes called the modified nonlinear drift-diffusion equations.

⁷ The drift-diffusion equations are approximate versions of a more sophisticated hierarchy of equations called the “balance equations” or “hydrodynamic equations.” The latter equations can be derived rigorously from the Boltzmann Transport Equation, but they present one serious problem. The number of unknowns in these equations is always one more than the number of equations, which necessitates an ad hoc assumption to close the hierarchy. This last fact makes these equations less rigorous than the BTE itself. However, it has been claimed that the balance equations can in some instances approximately incorporate the physics of non-local hot electron effects such as velocity overshoot.

⁸ The BTE also has its own limitations. For instance, it is based on the assumption that scattering forces are instantaneous in both time and space. This assumption is invalid at high electric fields, when the collision duration may become comparable with the mean time between collisions. Also, the particles may be accelerated significantly by the high electric field during the collisions, resulting in what is known as the intra-collisional field effect (see, for example, Levinson, 1970; Barker, 1973; Thornber, 1978). Another effect that invalidates the BTE is collisional broadening, whereby an uncertainty is introduced into the electron energy because of collisions. These effects are, strictly speaking, beyond the scope of the BTE.
MARC CAHAY and SUPRIYO BANDYOPADHYAY
Schrodinger equation.⁹ The Schrodinger equation is adequate to describe quantum transport as long as there are no dissipative (inelastic) phase-randomizing collisions. It describes the evolution of the electron (or hole) wavefunction $\psi(\mathbf{r}, t)$ in time and space:

\[ i\hbar \frac{\partial \psi(\mathbf{r}, t)}{\partial t} = \left[ \frac{(\mathbf{p} - e\mathbf{A}(\mathbf{r}, t))^2}{2m^*} + V(\mathbf{r}, t) \right] \psi(\mathbf{r}, t), \tag{II.11} \]

where $\mathbf{p}$ is the momentum operator, $\mathbf{A}(\mathbf{r}, t)$ is the magnetic (or electric) vector potential due to any magnetic (or electric) field present, and $V(\mathbf{r}, t)$ is the spatially and temporally varying electrostatic potential that a carrier sees as a result of any band discontinuities, internal and external electric fields (any time-varying field must have temporal coherence to preserve phase), many-body interactions (such as Hartree, exchange or correlation), elastic scattering potentials, and even the periodic lattice potential. If we are interested in one-dimensional steady-state transport in the absence of any magnetic field, then we can use the time-independent Schrodinger equation

\[ \frac{d^2 \psi(x)}{dx^2} + \frac{2m^*}{\hbar^2} [E - V(x)] \psi(x) = 0. \tag{II.12} \]

Note that this equation is mathematically similar to Maxwell’s (or Helmholtz’s) equation for the scalar component of the electric or magnetic vector of a monochromatic electromagnetic wave,

\[ \frac{\partial^2 E_x(x)}{\partial x^2} + \frac{\omega^2}{c^2} n^2(x) E_x(x) = 0, \tag{II.13} \]
where $E_x$ is the x-component of the electric field, n is the spatially varying refractive index of the medium supporting the wave, c is the speed of light in vacuum, and $\omega$ is the angular frequency. Because of the similarity between Eqs. (II.12) and (II.13), the physics of dissipationless steady-state quantum transport of charge in a device is mathematically equivalent to the physics of electromagnetic wave propagation through a medium with a spatially varying refractive index. This is more than an interesting and fundamental observation: because of it, many quantum devices have been proposed and demonstrated that are exact analogues of optical and microwave devices. In the rest of this article we will discuss several such devices and point out each time the optical or microwave analogue.

⁹ The Schrodinger equation is by no means the only equation that has been used to model the behavior of electron waves. In fact, the Schrodinger equation cannot describe quasi-dissipative transport, where some (perhaps infrequent) inelastic collisions take place but the wave nature is still preserved. In the latter case, two main approaches have been used to model quantum transport. One is based on the Liouville equation (see, for example, Frensley, 1990) and its derivative, the Wigner function formalism (see, for example, Frensley, 1987; Kluksdahl et al., 1989; Buot and Jensen, 1990). The other, more sophisticated approach, which can handle both Markovian and non-Markovian dissipative (phase-randomizing) scattering easily, is based on the Kadanoff-Baym-Keldysh formalism (see, for example, Kadanoff and Baym, 1962; Keldysh, 1965; Mahan, 1987; Datta, 1989b, 1990, 1993; McLennan et al., 1991; D’Amato and Pastawski, 1990; Jauho and Ziep, 1989; Jauho, 1989). A third approach based on Feynman path integrals (see, for example, Feynman and Hibbs, 1965; Feynman and Vernon, 1963; Thornber and Feynman, 1970) has been tried in the past, but without any significant success in the context of device simulation.

D. Current and Conductance Formulas for Electron Wave Devices
In analyzing any electronic device, we are primarily interested in finding the current response to voltages applied at the various terminals of the device. In analyzing quantum devices we will make two basic assumptions:

(a) The terminals or contacts of the device are thermodynamic “reservoirs” in which carriers are in local thermodynamic equilibrium. This means that the distribution function for carriers in any contact is the Fermi-Dirac function $1/\{\exp[(E(k) - \mu)/kT] + 1\}$, where $\mu$ is the chemical potential in that contact, E is the electron kinetic energy, and k is the wavevector.¹⁰ The contacts “shoot” electrons into the device with a spectrum of wavevectors weighted according to the distribution function in the contacts.

(b) Any dissipation that is required to enforce equilibrium in the contacts takes place in the contacts themselves (Landauer, 1957). The device itself is dissipationless (see Fig. 7).

We will now assume that the leads of the device are long and narrow so that they act as quasi one-dimensional structures. The current entering the ith lead is then given by

\[ I_i^+ = 2e \sum_{n,n'} \int_0^{\infty} \frac{dk_{n,n'}}{2\pi} \frac{\hbar k_{n,n'}}{m^*} f(E - \mu_i), \tag{II.14} \]
where f is the electron occupation probability, which depends on the electron energy E and the chemical potential $\mu_i$ in the ith contact, m* is the effective mass of carriers, and $k_{n,n'}$ is the wavevector along the lead in the (n, n')th transverse subband (each subband is labeled by two indices corresponding to the two transverse directions).

¹⁰ At high enough current levels, the thermodynamic equilibrium in the contacts must be disturbed. This leads to a deviation of the distribution function from the Fermi-Dirac form. This issue has been addressed by some researchers (see, for example, Potz, 1989a,b).

FIGURE 7. In the Landauer approach, an obstacle is connected to two different reservoirs by ideal conductors. A stream of particles incident from the left hits the disordered region; a fraction R is reflected, and a fraction T is transmitted. The conductance G due to elastic scattering in the disordered region is given by the Landauer-Buttiker formula. In this formalism, phase-breaking processes are assumed to occur only in the contacts.

Assuming a parabolic dispersion relation,
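we can relate the wavevector to the energy

\[ E = \epsilon_n + \epsilon_{n'} + \frac{\hbar^2 k_{n,n'}^2}{2m^*}, \tag{II.15} \]

where $\epsilon_n$ and $\epsilon_{n'}$ are the subband bottom energies for the nth and n'th subbands along the two transverse directions. This immediately gives

\[ I_i^+ = \frac{2e}{h} \sum_{n,n'} \int_{\epsilon_n + \epsilon_{n'}}^{\infty} dE\, f(E - \mu_i). \tag{II.16} \]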
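The passage from an integral over wavevectors to the energy integral above relies on a cancellation that is specific to one dimension: the carrier velocity ħk/m* is exactly compensated by the one-dimensional density of states. The following Python sketch checks this numerically for a single subband. It is only an illustration under stated assumptions: the units (ħ = m* = 1, hence h = 2π), the function names, and the parameter values are choices made here and do not come from the text.

```python
import math


def fermi(energy, mu, kT):
    """Fermi-Dirac occupation 1/(exp((E - mu)/kT) + 1), arranged to avoid overflow."""
    x = (energy - mu) / kT
    if x > 0.0:
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))


def flux_from_wavevector_integral(eps, mu, kT, k_max=10.0, steps=100_000):
    """Midpoint-rule estimate of 2 * Int_0^k_max dk/(2*pi) * v(k) * f(E(k) - mu).

    Assumed units: hbar = m* = 1, so v(k) = k and E(k) = eps + k**2 / 2,
    where eps is the subband bottom energy.
    """
    dk = k_max / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * dk
        total += k * fermi(eps + 0.5 * k * k, mu, kT)
    return 2.0 * total * dk / (2.0 * math.pi)


def flux_from_energy_integral(eps, mu, kT):
    """Closed form of (2/h) * Int_eps^inf dE f(E - mu), with h = 2*pi in these units."""
    x = (mu - eps) / kT
    # Int_eps^inf f dE = kT * ln(1 + exp(x)); rewritten to stay finite for large x.
    if x > 0.0:
        log_term = x + math.log1p(math.exp(-x))
    else:
        log_term = math.log1p(math.exp(x))
    return (2.0 / (2.0 * math.pi)) * kT * log_term


# Illustrative values in arbitrary energy units (assumptions, not from the text).
eps, mu, kT = 0.2, 1.0, 0.1
flux_k = flux_from_wavevector_integral(eps, mu, kT)
flux_E = flux_from_energy_integral(eps, mu, kT)
print("wavevector integral:", flux_k)
print("energy integral    :", flux_E)
```

Multiplying either number by the electron charge gives the current injected from one subband; the two routes agree because the factor k in the velocity is absorbed by dE = k dk in these units.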
The current entering the ith lead and exiting at the jth lead is then given by

\[ I_{i \to j} = \frac{2e}{h} \sum_{m,m',n,n'} \int_{\epsilon_n + \epsilon_{n'}}^{\infty} dE\, f(E - \mu_i)\, T_{ij}^{m,m',n,n'}(E, \mu_i - \mu_j), \tag{II.17} \]

where (m, m') denotes the subband index in the jth lead and $T_{ij}^{m,m',n,n'}(E, \mu_i - \mu_j)$ is the transmission probability for an electron entering the device from the (n, n')th subband in the ith lead with an energy E and exiting at the (m, m')th subband in the jth lead with the same energy (no dissipation within the device). The preceding equation can be written in a more compact form as

\[ I_{i \to j} = \frac{2e}{h} \int_0^{\infty} dE\, f(E - \mu_i)\, T_{ij}(E, \mu_i - \mu_j), \tag{II.18} \]

where

\[ T_{ij}(E, \mu_i - \mu_j) = \sum_{m,m',n,n'} T_{ij}^{m,m',n,n'}(E, \mu_i - \mu_j)\, \Theta(E - \epsilon_n - \epsilon_{n'}), \tag{II.19} \]

and $\Theta$ is the Heaviside unit step function.

1. Two-Terminal Current and Conductance
If the device has only two terminals i and j, then the current flowing through either terminal is given by

\[ I_i = I_{i \to j} - I_{j \to i} = -I_j. \tag{II.20} \]

The last equality follows from conservation of charge and is the statement of Kirchhoff’s current law. In dissipationless transport, time reversal symmetry dictates that $T_{ij} = T_{ji}$, so that

\[ I_i = -I_j = \frac{2e}{h} \int_0^{\infty} dE\, T_{ij}(E, V) [f(E - \mu_i) - f(E - \mu_i - eV)], \tag{II.21} \]
where V is the applied voltage between the two terminals ($-eV = \mu_i - \mu_j$). The preceding equation is often referred to as the Tsu-Esaki formula, since it was used by Tsu and Esaki (1973) in the analysis of resonant tunneling devices, which we will discuss in Section III. This formula has proved to be extremely useful in analyzing and predicting the behavior of a large number of quantum devices. We will next derive a simplified version of this formula that applies when the applied voltage V is much smaller than the thermal voltage kT/e. Expanding the distribution function $f(E - \mu_i)$ in a Taylor series, we obtain

\[ f(E - \mu_i) = f_0(E - \mu_0) + \left. \frac{\partial f}{\partial \mu} \right|_{\mu = \mu_0} (\mu_i - \mu_0) + \text{higher-order terms}, \tag{II.22} \]

where $f_0$ is the distribution function and $\mu_0$ the chemical potential under global equilibrium, when no current flows through the device. Noting that $f_0$ is the Fermi-Dirac factor, the second term on the right-hand side is obtained by straightforward differentiation. This yields

\[ f(E - \mu_i) = f_0(E - \mu_0) + \frac{1}{4} \operatorname{sech}^2\!\left[ \frac{E - \mu_0}{2kT} \right] \left\{ \frac{\mu_i - \mu_0}{kT} \right\} + \text{higher-order terms}. \tag{II.23} \]
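The differentiation step can be checked numerically. The sketch below compares the derivative of the Fermi-Dirac function with respect to the chemical potential, which is sech²[(E − μ₀)/(2kT)]/(4kT), against a finite difference, and measures how well the first-order expansion reproduces the exact distribution for a small shift of the chemical potential. The function names and the numerical values (kT = 25 meV, a 1 meV shift) are illustrative assumptions, not values taken from the text.

```python
import math


def fermi(energy, mu, kT):
    """Fermi-Dirac occupation 1/(exp((E - mu)/kT) + 1), arranged to avoid overflow."""
    x = (energy - mu) / kT
    if x > 0.0:
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))


def dfermi_dmu(energy, mu, kT):
    """Analytic derivative with respect to mu: sech^2[(E - mu)/(2kT)] / (4kT)."""
    return 1.0 / (4.0 * kT * math.cosh((energy - mu) / (2.0 * kT)) ** 2)


def fermi_linearized(energy, mu_i, mu_0, kT):
    """First-order Taylor expansion of f(E - mu_i) about mu_0."""
    return fermi(energy, mu_0, kT) + dfermi_dmu(energy, mu_0, kT) * (mu_i - mu_0)


kT = 0.025            # eV, roughly room temperature (assumed)
mu_0 = 0.050          # eV, assumed equilibrium chemical potential
mu_i = mu_0 + 0.001   # eV, shifted by 1 meV, much less than kT

# Central finite difference of f with respect to mu at one sample energy.
h = 1e-6
sample_energy = 0.060
numeric_slope = (fermi(sample_energy, mu_0 + h, kT)
                 - fermi(sample_energy, mu_0 - h, kT)) / (2.0 * h)
analytic_slope = dfermi_dmu(sample_energy, mu_0, kT)

# Largest deviation of the linearized form from the exact one on 0 to 0.2 eV.
worst_error = max(
    abs(fermi(n * 0.001, mu_i, kT) - fermi_linearized(n * 0.001, mu_i, mu_0, kT))
    for n in range(201)
)
print("finite-difference slope:", numeric_slope)
print("analytic slope         :", analytic_slope)
print("worst linearization error:", worst_error)
```

The remaining error is second order in the shift of the chemical potential divided by kT, which is why the expansion is useful only when the bias is small compared with the thermal voltage.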
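a. Low-Bias Situation. Consider the situation when the bias is small, such that V ≪ kT/e. Since $-eV = \mu_i - \mu_j$ and $\mu_i - \mu_0$ is no larger in magnitude than $\mu_i - \mu_j$, the term within the curly brackets in the above equation will be very small. Therefore, we can neglect the higher-order terms in Eq. (II.23) and then use it in Eq. (II.20) to obtain

\[ I_i = -I_j = \frac{e}{2hkT} \int_0^{\infty} dE\, [T^0_{ij}(E)\mu_i - T^0_{ji}(E)\mu_j]\, \operatorname{sech}^2\!\left[ \frac{E - \mu_0}{2kT} \right] = \frac{2e}{h} [\bar{T}_{ij}(\mu_0)\mu_i - \bar{T}_{ji}(\mu_0)\mu_j], \tag{II.24} \]

where

\[ \bar{T}_{ij}(\mu_0) = \int_0^{\infty} dE\, T^0_{ij}(E)\, \frac{\operatorname{sech}^2[(E - \mu_0)/(2kT)]}{4kT}. \tag{II.25} \]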
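As a numerical illustration of the low-bias result, the sketch below compares the full two-terminal current integral with the linearized form, in which a thermally averaged transmission multiplies the chemical potential difference. Several ingredients are assumptions introduced only for this example: the Lorentzian transmission function, the neglect of any bias dependence of the transmission, the symmetric placement of the two chemical potentials about μ₀, and all numerical values. Currents are expressed in units of 2e/h with energies in eV.

```python
import math


def fermi(energy, mu, kT):
    """Fermi-Dirac occupation, arranged to avoid overflow."""
    x = (energy - mu) / kT
    if x > 0.0:
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))


def transmission(energy, e_res=0.10, gamma=0.01):
    """Hypothetical Lorentzian transmission peak (a stand-in, not from the text)."""
    return gamma ** 2 / ((energy - e_res) ** 2 + gamma ** 2)


def integrate(func, lo, hi, steps=20_000):
    """Midpoint rule on [lo, hi]."""
    dE = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * dE) for i in range(steps)) * dE


def current_full(mu_i, mu_j, kT):
    """I / (2e/h): integral of T(E) * [f(E - mu_i) - f(E - mu_j)] over energy."""
    return integrate(
        lambda E: transmission(E) * (fermi(E, mu_i, kT) - fermi(E, mu_j, kT)),
        0.0, 1.0)


def current_linear(mu_i, mu_j, mu_0, kT):
    """I / (2e/h): thermally averaged transmission times (mu_i - mu_j)."""
    def weight(E):
        return 1.0 / (4.0 * kT * math.cosh((E - mu_0) / (2.0 * kT)) ** 2)
    t_bar = integrate(lambda E: transmission(E) * weight(E), 0.0, 1.0)
    return t_bar * (mu_i - mu_j)


kT = 0.025      # eV (assumed)
mu_0 = 0.080    # eV, assumed equilibrium chemical potential
bias = 0.001    # eV, i.e. V = 1 mV, much smaller than kT/e
mu_i = mu_0 - 0.5 * bias   # chosen so that mu_i - mu_j = -eV
mu_j = mu_0 + 0.5 * bias

full = current_full(mu_i, mu_j, kT)
linear = current_linear(mu_i, mu_j, mu_0, kT)
print("full integral  :", full)
print("linear response:", linear)
```

Raising `bias` toward kT makes the two numbers drift apart, which is the practical meaning of the condition V ≪ kT/e.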
b. Low-Temperature Operation. If, in addition to the bias V being low, the temperature T is also low, then the thermal broadening function $\operatorname{sech}^2[(E - \mu_0)/(2kT)]/(4kT)$, which has a full-width-at-half-maximum spread of 3.5kT in energy, begins to resemble a Dirac delta function $\delta(E - \mu_0)$. Equation (II.24) then reduces to

\[ I_i = -I_j = \frac{2e}{h} [T^0_{ij}(\mu_0)\mu_i - T^0_{ji}(\mu_0)\mu_j], \tag{II.26} \]

where $T^0_{ij}(\mu_0)$ is the transmission probability of Fermi-level electrons under global equilibrium. Since time-reversal symmetry makes the transmission probability reciprocal ($T^0_{ij}(\mu_0) = T^0_{ji}(\mu_0)$), we obtain

\[ I_i = -I_j = \frac{2e}{h} T^0_{ij}(\mu_0) [\mu_i - \mu_j] = -\frac{2e^2}{h} T^0_{ij}(\mu_0) V. \tag{II.27} \]

Note that the current is linearly proportional to the voltage. Hence, this regime (low bias and low temperature) is called the linear response regime. Dividing the magnitude of the current by the voltage, we obtain a constant conductance G in the linear response regime:

\[ G = \frac{2e^2}{h} T^0_{ij}(\mu_0). \tag{II.28} \]
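Two numbers in this derivation are easy to verify: the 3.5kT width quoted for the thermal broadening function, and the size of the conductance prefactor 2e²/h. The sketch below does both. The function names are hypothetical; the values of e and h are the exact SI defining constants.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)
PLANCK_CONSTANT = 6.62607015e-34     # J s (exact SI value)


def broadening(x):
    """Thermal broadening function in units of 1/kT, with x = (E - mu_0)/kT."""
    return 0.25 / math.cosh(0.5 * x) ** 2


def fwhm_in_units_of_kT():
    """Solve sech^2(x/2) = 1/2 for the half-width and double it."""
    return 4.0 * math.acosh(math.sqrt(2.0))


def landauer_conductance(total_transmission):
    """G = (2 e^2 / h) * T, in siemens."""
    return 2.0 * ELEMENTARY_CHARGE ** 2 / PLANCK_CONSTANT * total_transmission


# Area under the broadening function (midpoint rule on [-40, 40] in units of kT).
steps = 80_000
dx = 80.0 / steps
area = sum(broadening(-40.0 + (i + 0.5) * dx) for i in range(steps)) * dx

fwhm = fwhm_in_units_of_kT()
g_unit = landauer_conductance(1.0)
print("FWHM / kT       :", fwhm)
print("area            :", area)
print("2e^2/h (siemens):", g_unit)
print("h/2e^2 (ohms)   :", 1.0 / g_unit)
```

The unit area is what allows the broadening function to approach a delta function as the temperature falls, and the reciprocal of 2e²/h, roughly 12.9 kΩ, sets the scale of the resistance of a single fully transmitting channel.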
Note that the maximum value of $T^0_{ij}(\mu_0)$ is M, the number of subbands occupied in the contacts, which is also the number of transport “channels.” This maximum value is reached when transport through the device is entirely ballistic, so that each transport channel has a 100% transmission probability. In that case, Eq. (II.28) predicts that the conductance is finite and has a value (2e²/h)M. The reciprocal of this value has been interpreted as the contact resistance of the structure (Imry, 1986), since a ballistic device by itself can offer no resistance. Equation (II.28) is the famous two-terminal Landauer formula (Landauer, 1957). It offers an important viewpoint in that it reduces the problem of calculating quantum conductance in the dissipationless linear response regime to a much simpler scattering problem: calculating the transmission of electron waves through a medium with a spatially varying potential. It has been shown that the Landauer formalism is equivalent to the more familiar but more difficult Kubo formalism of linear response transport. Either formalism can be used to describe linear response transport, but the Landauer formalism is more elegant.

2. Multi-terminal Formulas

The two-terminal Landauer formula was generalized to a multi-terminal formula by Buttiker (1988a). The basis of his work was that all terminals of a device can be treated on an equal footing. In that case, one finds that in a device with multiple terminals, the current flowing in the ith terminal is obtained by summing the right-hand side of Eq. (II.26) over all terminals:

\[ I_i = \frac{2e}{h} \sum_{j} [T^0_{ij}(\mu_0)\mu_i - T^0_{ji}(\mu_0)\mu_j]. \tag{II.29} \]
The preceding equation holds for the terminal current in any terminal and has been successfully applied to explain a number of phenomena, including the celebrated integral quantum Hall effect (Buttiker, 1988b). Buttiker has also used the multi-terminal generalization to obtain expressions for the four-terminal resistance of a structure. This is the resistance measured by passing current between two probes and measuring the voltage that develops between two other probes. This technique of measuring resistance eliminates the effect of contact resistance and is widely used by experimentalists studying electron interference effects. Buttiker’s expression for the four-terminal resistance is (Buttiker, 1988a)

(II.30)

where $T_{mn}$ is the transmission probability between the mth and nth terminals. In the preceding equation, the ith and jth terminals are the voltage probes. The preceding multi-terminal formulas have been applied to explain a number of quantum transport phenomena in the linear response regime, such as the negative magnetoresistance of quantum point contacts (Beenakker and Van Houten, 1989). Buttiker has also derived expressions for phase-sensitive, phase-insensitive and phase-averaged resistances. These are resistances measured by different types of measuring probes (terminals), depending on whether or not they can detect the phase of the electron at the measuring location (Buttiker, 1989, 1990). This subject is not strictly relevant to the subject matter of this article and therefore will not be discussed further.
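The multi-terminal current formula, in which the current at terminal i is (2e/h) times the sum over the other terminals j of (T_ij μ_i − T_ji μ_j), lends itself to a short numerical illustration. The Python sketch below evaluates the terminal currents for a hypothetical three-terminal device and finds the chemical potential at which one terminal acts as a voltage probe, that is, draws no net current. The transmission matrix, the chemical potentials, and the function names are all assumptions made for this example; currents are in units of 2e/h.

```python
def terminal_currents(T, mu):
    """Return I_i / (2e/h) = sum over j != i of (T[i][j]*mu[i] - T[j][i]*mu[j])."""
    n = len(mu)
    return [
        sum(T[i][j] * mu[i] - T[j][i] * mu[j] for j in range(n) if j != i)
        for i in range(n)
    ]


def floating_probe_potential(T, mu, probe):
    """Chemical potential at which terminal `probe` draws no net current."""
    others = [j for j in range(len(mu)) if j != probe]
    outgoing = sum(T[probe][j] for j in others)
    incoming = sum(T[j][probe] * mu[j] for j in others)
    return incoming / outgoing


# Hypothetical symmetric transmission matrix for three terminals
# (the diagonal entries are not used).
T = [
    [0.0, 0.6, 0.3],
    [0.6, 0.0, 0.2],
    [0.3, 0.2, 0.0],
]
mu = [1.0, 0.0, 0.0]   # chemical potentials in arbitrary energy units
mu[2] = floating_probe_potential(T, mu, probe=2)
currents = terminal_currents(T, mu)
print("probe potential  :", mu[2])
print("terminal currents:", currents)
```

With a reciprocal (symmetric) transmission matrix the terminal currents sum to zero, and setting all chemical potentials equal gives zero current everywhere, as expected at global equilibrium. A four-terminal resistance measurement corresponds to two such floating probes.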
3. Extension to Quasi-dissipative Transport
We mentioned earlier that electron wave behavior is manifested only in the absence of dissipative, inelastic collisions. Dissipation tends to restore classical (particle-like) behavior. However, the transition from wave to particle behavior is not sudden; the behavior gradually transforms from wave type to particle type as the frequency of dissipative interactions is increased. Treating the transition region is the most difficult challenge for transport theorists. In this regime, neither the Schrodinger equation nor the Boltzmann equation will work. The former does not work because dissipative interactions are caused by complex (rather than real) scattering potentials, whose inclusion in the Hamiltonian will make the Hamiltonian non-Hermitian and therefore tend to violate the conservation of charge (Datta, 1989b). On the other hand, the Boltzmann Transport Equation cannot work because it treats electrons as particles only. This regime has been addressed by a number of researchers. One approach is based on the Liouville equation (see, for example, Frensley, 1990) and its derivative, the Wigner function formalism (see, for example, Frensley, 1987; Kluksdahl et al., 1989; Buot and Jensen, 1990). These formalisms have been successfully applied to the study of resonant tunneling devices, which are discussed in the next section. In these approaches, the dissipative interaction is often treated through an energy-independent relaxation time approximation, which fails to distinguish between Markovian and non-Markovian scattering processes and thus masks some interesting physics (Lake and Datta, 1992). The other, more sophisticated approach, which can handle both Markovian and non-Markovian scattering easily, is based on the Kadanoff-Baym-Keldysh formalism (see, for example, Kadanoff and Baym, 1962; Keldysh, 1965; Mahan, 1987; Datta, 1989a, 1990, 1993; McLennan et al., 1991; D’Amato and Pastawski, 1990; Pastawski, 1991; Jauho and Ziep, 1989; Jauho, 1989). A third approach based on Feynman path integrals (see, for example, Feynman and Hibbs, 1965; Feynman and Vernon, 1963; Feynman et al., 1965; Thornber and Feynman, 1970) has been tried in the past, but without any significant success in the context of device simulation.

All the preceding approaches are applicable to both linear and nonlinear response (high biases and high temperatures). It would be interesting to investigate whether inelastic processes can be treated more easily in the special case of the linear response regime. A viewpoint due to Buttiker (1986, 1988a) hypothesized that in the linear response regime, one can simulate localized phase-breaking scatterers by “probes” or contacts. Therefore, a continuous distribution of phase-breaking scatterers is equivalent to a continuous distribution of current- or voltage-measuring probes placed along the length of a device. This viewpoint led to a current formula given by

\[ I(\mathbf{r}) = -\frac{2e}{h} \int d\mathbf{r}'\, T(\mathbf{r}, \mathbf{r}') [\mu(\mathbf{r}') - \mu(\mathbf{r})], \tag{II.31} \]
where $\mu(\mathbf{r})$ is the chemical potential at location r and T(r, r’) is the transmission probability for propagating from r to r’. The foregoing approach has been used by Pastawski (1991) and Khondker and Alam (1991). Its validity was established on a rigorous footing by McLennan et al. (1991). Later it was shown by Datta (1990) that this approach is strictly valid if the phase-breaking scatterers are point scatterers or the transmission probability is fairly independent of electron energy. Datta (1990) also generalized the preceding equation to the case of spatially correlated scatterers. Datta’s prescription has been successfully applied to model resonant tunneling devices (Lake and Datta, 1992). A thorough discussion of his approach is, however, outside the scope of this article.

4. Where Do the Potential Drop and the Energy Dissipation Occur?
What the Landauer formula of Eq. (II.28) and its extensions do not answer are questions such as where exactly the potential drop IR and the power dissipation I²R associated with a scatterer occur in a quantum device. Interestingly, these questions have actually led to some variations of the Landauer formula, depending on how the potential drop is measured (Jain and Kivelson, 1988; Yosefin and Kaveh, 1990; Engquist and Anderson, 1981). The IR drop and the I²R dissipation do not necessarily occur together. In fact, the IR drop occurs even in dissipationless transport, but obviously the I²R dissipation does not. Landauer has shown that the potential drop IR occurs because of a residual resistivity dipole (Landauer, 1957; Landauer and Woo, 1972), which forms in the vicinity of a scatterer because charges pile up on one side of the scatterer and are depleted on the other side. This dipole has profound implications for electromigration (Landauer, 1975). The questions of the IR drop and the I²R dissipation have been answered in great detail by McLennan et al. (1991), but are rather outside the scope of the present review. The question of spatial variations of transport quantities around elastic scatterers in the phase-coherent regime has been addressed in great detail by Bandyopadhyay et al. (1992) and Chaudhuri et al. (1992, 1993). Lee et al. (1991) examined these distributions in the phase-incoherent regime. These groups also studied the influence of a magnetic field on the distributions. Bandyopadhyay et al. (1992) found several interesting effects, such as the following:

(a) The formation of current vortices as a result of quantum interference between electron waves reflected by the scatterers and the walls of a quasi one-dimensional structure. These vortices are much more prominent when the scatterers have an attractive potential (as in majority carrier transport as opposed to minority carrier transport).
(b) Quenching or accentuation of these vortices by a magnetic field.
(c) The formation of circulating currents around a scatterer at certain magnetic field strengths, associated with the formation of a magnetic bound state.
(d) A fundamental difference in the chemical potential drops around an attractive and a repulsive scatterer.

Examples of these are provided in Figs. 8a-c. In many situations, only the terminal characteristics of a device are important, and the internal spatial variations of transport quantities are not. However, there are other situations when the distributions determine the terminal characteristics and are very important.
This is well known in the device simulation community, where people are acutely familiar with space charge effects that cause an inhomogeneous electric field distribution in solid-state devices. Space charge effects exert a profound influence on device switching speed as well as current-voltage characteristics. The spatial distributions are also important in more basic situations. In many cases, they provide direct visualization of the transport physics. Therefore, understanding these distributions is imperative for the device physicist. In the next sections, we discuss some specific electron wave devices. All these devices utilize quantum confined structures in which the electron motion is restricted in at least one direction by potential barriers. For convenience, we will classify these devices into two groups: vertical devices, in which the current flows perpendicular to the confining potential barriers; and lateral devices, in which current flows parallel to potential barriers. Occasionally, some devices may belong to both classes.
III. RESONANT TUNNELING DEVICES
We start with a description of the first semiconductor quantum device that made its debut after the advent of molecular beam epitaxy. This is the genre of resonant tunneling devices which are electronic analogues of optical Fabry-Perot cavities. It is a vertical device according to the classification
scheme just mentioned, and was conceptualized by Tsu and Esaki (1973). Its basic operation was first demonstrated experimentally by Chang et al. (1974). The archetypal resonant tunneling structure consists of two barriers sandwiching a quantum well, as shown in Fig. 9. Typically the barriers are wide-gap semiconductors (such as AlGaAs) that are lattice-matched to a narrower-gap semiconductor (such as GaAs), which forms the well. The structure is grown by molecular beam epitaxy or a similar technique that allows monolayer control over the barrier and well widths and provides extremely sharp interfaces between the barriers and the well. The operating principle of this device is rather simple. An electron emanating from an emitter impinges on the leftmost barrier, tunnels through it, traverses the well, tunnels through the second barrier, and finally emerges into a collector. The current depends on the total transmission probability through the entire structure, $T_{ij}(E, V)$, according to Eq. (II.21). This probability depends on the kinetic energy E of the incident electron and the applied voltage V (these two are not independent quantities). These quantities can be modulated by applying a bias to the well through a third terminal or simply by varying the bias between the emitter and collector. The crucial point to note is that the transmission probability $T_{ij}(E, V)$ is a non-monotonic function of the electron energy E, as shown in Fig. 10, and therefore the current through the structure is also a non-monotonic function of the applied voltage V. This gives rise to negative differential resistance in the current-voltage characteristics, which has applications in high-frequency devices and logic circuits.

FIGURE 8. (a) The formation of current vortices in a quasi one-dimensional structure (for linear response transport) as a result of quantum interference between electron waves reflected by elastic scatterers and the walls of the structure. The structure is a quantum wire of length 800 Å and width 1,000 Å. The Fermi energy is 2.41 meV. The scatterers are attractive, and their locations are shown by solid circles. These vortices are much more prominent when the scatterers have an attractive potential (as in majority carrier transport as opposed to minority carrier transport). (b) The formation of circulating currents around a scatterer at certain magnetic field strengths associated with the formation of a magnetic bound state. The Fermi energy is 2.38 meV, and the magnetic flux density is 3.5 tesla. (c) The chemical potential drops around an attractive and (d) a repulsive scatterer. Current flows along the x-direction. The scatterer is located as indicated in Fig. 8b. In the case of an attractive scatterer, most of the drop occurs around the scatterer, showing that the scatterer is the major cause of the resistance. In the case of a repulsive scatterer of the same scattering cross-section, the drop mostly occurs at the contacts, showing that the contact resistance is the dominant source of the resistance. This indicates that majority carrier mobility in the quantum regime can be significantly different from minority carrier mobility.

FIGURE 9. Typical conduction- and valence-band energy profiles in a double-barrier resonant tunneling diode. In this device, the current flows along the direction perpendicular to the layers. The quasi-bound state in the well (E,) is shown. Typically, regions 1, 3, and 5 are formed of GaAs, while the thin barriers (regions 2 and 4) are grown using $\mathrm{Al}_x\mathrm{Ga}_{1-x}\mathrm{As}$ with aluminum fraction x.

FIGURE 10. Typical transmission coefficient vs. incident energy (in eV, plotted from 0.0 to 0.5 eV) for a double-barrier resonant tunneling structure (under zero-bias conditions). The transmission coefficient reaches unity at the quasi-bound state energies.

To see how this device is an electron wave device, consider the situation when the two barriers are identical in all respects. Let T be the classical transmission probability through each barrier for a given kinetic energy of
the incident electron. The total transmission through both barriers between the emitter (terminal 1) and collector (terminal 2) can be found by classically adding up the transmission probabilities of all possible Feynman paths, as shown in Fig. 11. The sum forms a geometric series, so that the transmission probability $T_{12}$ (between terminals 1 and 2) is given by

\[ T_{12} = T^2 [1 + R^2 + R^4 + \cdots] = \frac{T^2}{1 - R^2} = \frac{T}{1 + R} = \frac{T}{2 - T}, \tag{III.1} \]

where R is the reflection coefficient (T + R = 1). From the preceding equation, we immediately see that the classical transmission probability through two barriers is less than the transmission probability through each barrier. This is what we expect intuitively. After all, it must be more difficult to travel through two barriers than through one.

FIGURE 11. The transmission coefficient through a resonant tunneling structure calculated by summing the amplitudes of an infinite series of multiple reflections: $(r_1, t_1)$ and $(r_2, t_2)$ are the reflection and transmission amplitudes for barriers 1 and 2, respectively; k is the wavevector of the electron incident on the resonant tunneling structure.

The picture is very different if we analyze this problem quantum mechanically, when instead of summing transmission probabilities, which are non-negative real numbers, we have to sum probability amplitudes,
which are complex quantities with phases. In that case, the transmission amplitude through two barriers is

\[ t_{12} = \frac{t^2}{1 - r^2 e^{i2ka}} = \frac{|t|^2 e^{i2\theta_t}}{1 - |r|^2 e^{i(2ka + 2\theta_r)}} = e^{i2\theta_t} \quad \text{if } ka + \theta_r = n\pi, \tag{III.2} \]

where k is the electron’s wavevector along the direction of current flow, a is the separation between the two barriers, $\theta_t$ and $\theta_r$ are the phases of the transmission and reflection amplitudes, respectively, and n is an integer. The condition on the right-hand side of the foregoing equation is the well-known Fabry-Perot resonance condition: the round-trip phase shift within the “cavity” formed by the two barriers is an even multiple of π. When this condition is met, the transmission probability through the structure $T_{12}$ ($= |t_{12}|^2$) is unity regardless of what the transmission probability through each barrier is. Here is a little bit of the mysticism often associated with quantum mechanics. When resonance is achieved, the transmission through each barrier may be much less than unity so that the
electron can barely go through either barrier. Yet, it can go clean through both! This would have been classically impossible; what makes it possible is quantum mechanics. The electron is a wave that suffers multiple reflections between the two barriers. When the multiply reflected waves interfere constructively, the transmission goes to unity. It is obvious now that this is a quantum interference device, and of course it has an optical analogue: the optical or microwave Fabry-Perot resonator.

It is easy to see how the current through such a structure is a non-monotonic function of the applied bias. As the bias across the structure is increased from zero, the energy and wavevector of the electrons incident from the emitter increase. The wavevector k gradually approaches the resonance condition in Eq. (III.2) and the transmission increases, ultimately reaching its peak at resonance. At this point, the current reaches a peak. It can be verified that the resonance also corresponds to the situation when the electron energy matches the energy of a quasi-bound state in the well. As the electron energy (or bias across the structure) is increased further, the transmission goes off-resonance and the current falls. This causes a negative differential resistance in the current-voltage characteristics.

In the preceding analysis, we assumed that the barriers are identical and transport is purely ballistic. Even if the barriers are identical to start with, they will not remain identical when a bias is applied. For nonidentical barriers, the transmission probability at resonance is the transmission probability through the more opaque barrier divided by the transmission probability through the more transparent barrier (Ricco and Azbel, 1984). To ensure that this ratio is the maximum possible, several tricks are employed.
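The contrast between adding probabilities and adding amplitudes in the double-barrier problem can be made concrete with a few lines of Python. The sketch below is an illustration under stated assumptions: barriers are described only by their transmission probabilities, the reflection phases are lumped into a single `round_trip_phase` (standing for 2ka plus the reflection phases), and the two-barrier amplitude is written as a geometric series in r₁r₂, which generalizes the identical-barrier expression to unequal barriers. The function names and the numerical values are hypothetical.

```python
import cmath
import math


def classical_two_barrier(T):
    """Incoherent sum T^2 * (1 + R^2 + R^4 + ...) = T / (2 - T), with R = 1 - T."""
    R = 1.0 - T
    return T * T / (1.0 - R * R)


def coherent_two_barrier(T1, T2, round_trip_phase, terms=None):
    """|t_12|^2 from summing amplitudes t1 * t2 * (r1 * r2 * exp(i*phase))**n.

    With terms=None the closed form of the geometric series is used;
    otherwise only the first `terms` multiple-reflection paths are added.
    Overall phase factors, which do not affect the probability, are dropped.
    """
    t1, t2 = math.sqrt(T1), math.sqrt(T2)
    r1, r2 = math.sqrt(1.0 - T1), math.sqrt(1.0 - T2)
    ratio = r1 * r2 * cmath.exp(1j * round_trip_phase)
    if terms is None:
        amplitude = t1 * t2 / (1.0 - ratio)
    else:
        amplitude = t1 * t2 * sum(ratio ** n for n in range(terms))
    return abs(amplitude) ** 2


T = 0.01  # each barrier alone transmits 1% (illustrative value)
classical = classical_two_barrier(T)
on_resonance = coherent_two_barrier(T, T, 0.0)
off_resonance = coherent_two_barrier(T, T, math.pi)
partial_sum = coherent_two_barrier(T, T, 0.0, terms=5000)
unequal_peak = coherent_two_barrier(0.01, 0.05, 0.0)
print("classical sum           :", classical)
print("coherent, on resonance  :", on_resonance)
print("coherent, off resonance :", off_resonance)
print("first 5000 paths        :", partial_sum)
print("unequal barriers, peak  :", unequal_peak)
```

For identical barriers the coherent sum reaches unity on resonance even though each barrier transmits only 1%, while the classical sum stays near T/2. With unequal barriers the peak falls below unity in this model, which is why keeping the two barriers as alike as possible at the operating bias matters.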
For example, the emitters and collectors can be doped such that the Fermi level in the structure at equilibrium is close to the energy of the lowest quasi-bound state in the well. This ensures that a rather small bias will attain the resonance condition. The smaller the bias, the more identical the two barriers will be. Another trick is to intentionally make the barriers different such that at the bias corresponding to the resonance, they become as identical as possible. Unfortunately, the latter trick does not work very well except at very low temperature. At higher temperature, there is a significant spread in the energy of the incoming electrons, which is typically much larger than the level broadening of the quasi-bound states in the well. In that case, the transmission probability is fairly close to that of the more opaque barrier (Weil and Vinter, 1987; Jonson and Grincwajg, 1987). It must be pointed out that the Fabry-Perot mechanism, which requires coherence of the electron wavefunction across both barriers, is not a requirement for either resonant tunneling or negative differential resistance
SEMICONDUCTOR QUANTUM DEVICES
FIGURE 12. Illustration of how a double-barrier resonant tunneling diode may operate. The conduction band diagrams are shown with and without the application of an external bias $V$. Shown alongside is the Fermi sphere of a degenerately doped emitter. Since the lateral component of an electron's wavevector, $k_\perp$, is preserved during resonant tunneling, only those electrons whose $k_z$ lies on a disk $k_z = k_0$ [where $k_0^2 = (2m^*/\hbar^2)(E_0 - E_c)$] will be resonant. At zero temperature, resonant tunneling occurs only in a range of voltage for which the disk $k_z = k_0$ (shaded disk) moves down from the pole to the equatorial plane of the Fermi sphere. After Capasso et al. (1986). © 1986 IEEE.
(NDR) in double-barrier diodes. It was shown by Luryi (1985) that an entirely different mechanism is more likely to produce the NDR. It arises simply from energy and momentum conservation (which requires that no scattering be present, but does not require coherence of the electron wavefunction). To understand this, one may refer to Fig. 12.
The energy of electrons in the three-dimensional emitter is given by

$$E = E_c + \frac{\hbar^2 k_z^2}{2m^*} + \frac{\hbar^2 k_\perp^2}{2m^*}, \qquad \text{(III.3)}$$

where $k_z$ and $k_\perp$ are the longitudinal and transverse wavevectors, and $E_c$ is the bulk conduction band bottom in the emitter. In the quantum well, the energy is

$$E = E_n + \frac{\hbar^2 k_\perp^2}{2m^*}, \qquad \text{(III.4)}$$

where $E_n$ is the energy at the bottom of the $n$th $z$-directed subband. Resonant tunneling requires conservation of energy and lateral momentum ($k_\perp$). Therefore, equating the preceding two equations, we observe that only those electrons can tunnel whose longitudinal ($z$-directed) wavevector along the direction of current flow is given by

$$k_z^2 = \frac{2m^*}{\hbar^2}\,(E_n - E_c). \qquad \text{(III.5)}$$
This means that the tunneling electrons must lie on the shaded disk within the Fermi sphere shown in Fig. 12. Now as the bias across the structure is increased, $E_c$ rises, and therefore $k_z$ decreases. The shaded disk moves towards the equatorial plane of the Fermi sphere, and its area increases. At low enough temperature, the number of electrons that can tunnel is proportional to this area, so the current increases. Once $E_c$ rises above $E_n$, there are no electrons in the emitter that can tunnel (at the absolute zero of temperature) and the current should drop abruptly. This causes NDR, which is associated with tunneling from a three-dimensional emitter into a two-dimensional quantum well, and even the presence of two tunneling barriers is not essential. This mechanism of NDR was demonstrated experimentally by Beltram et al. (1988). The coherent Fabry-Perot mechanism was demonstrated by Choi et al. (1987).

Both of the above-mentioned mechanisms for NDR assume ballistic transport with no scattering. In the limit of strong scattering, one can still obtain NDR by a third mechanism known as incoherent tunneling. This is a two-step process for a double-barrier diode. Electrons first tunnel into the well through the first barrier, and then tunnel out through the second barrier. It is the first step that causes NDR (Capasso et al., 1990). The role of scattering in double-barrier resonant tunneling diodes was elucidated by Stone and Lee (1985). They showed that in the presence of scattering, the peak transmission decreases by a factor proportional to the
FIGURE 13. Schematic illustration of sequential resonant tunneling of electrons for a potential energy drop across the superlattice period equal to (a) the energy difference between the first excited state and the ground state of the wells, and (b) the energy difference between the second excited state and the ground state of the wells. After Capasso et al. (1986). © 1986 IEEE.
ratio $\tau/(\tau + \tau_0)$, where $\tau$ is the mean time between collisions and $\tau_0$ is related to the resonance width or energy broadening $\Gamma$ of the transmission peak ($\tau_0 = \hbar/\Gamma$). A larger value of $\tau_0$ corresponds to a less leaky resonator, i.e., a well with wider and thicker barriers. The ratio of electrons that tunnel coherently to those that tunnel after scattering is simply $\tau/\tau_0$.

In a strong electric field, a fourth mechanism can give rise to NDR in multiple-barrier diodes. It is known as sequential resonant tunneling, in which electrons sequentially tunnel through a Stark ladder of states, as shown in Fig. 13. The current peaks at well-defined values of the voltage bias when the lowest quasi-bound Stark ladder state in the $n$th well is degenerate in energy with one of the higher Stark ladder states in the $(n + 1)$th well. Electrons tunnel resonantly from the ground state in the $(n - 1)$th well into an excited state in the $n$th well, decay into the ground state in the $n$th well by emitting phonons, and then tunnel resonantly into an excited state in the $(n + 1)$th well to carry on the process. This process was first demonstrated experimentally by Capasso et al. (1986, 1987).
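The scattering estimate above can be made concrete with a small sketch. It takes at face value the quoted ratio $\tau/\tau_0$ of coherent to scattered electrons, which implies a coherent fraction $\tau/(\tau + \tau_0)$, and uses $\tau_0 = \hbar/\Gamma$. The function names and the sample numbers (a 1 meV resonance width and a 0.1 ps collision time) are hypothetical and chosen only for illustration.

```python
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s

def resonance_lifetime(gamma_ev):
    """Lifetime tau_0 = hbar / Gamma for a resonance of width Gamma (in eV)."""
    return HBAR_EV_S / gamma_ev

def coherent_fraction(tau, tau0):
    """Fraction of electrons that tunnel without scattering.

    Follows from a coherent-to-scattered ratio of tau / tau0.
    """
    return tau / (tau + tau0)

tau0 = resonance_lifetime(1.0e-3)        # assumed 1 meV resonance width
frac = coherent_fraction(1.0e-13, tau0)  # assumed 0.1 ps collision time

print(f"tau_0 = {tau0:.3e} s, coherent fraction = {frac:.3f}")
```

With these assumed numbers the resonance lifetime exceeds the collision time, so most electrons scatter before escaping the well and the sequential (incoherent) picture would dominate.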
A. Space Charge Effects in Resonant Tunneling Devices: Possible Sources of Bistability

Consider a double-barrier resonant tunneling device at resonance, when the transmission is maximum. If the resonance is caused by the coherent Fabry-Perot mechanism (i.e., the constructive interference of all multiply reflected electron waves that are reflected by the two barriers),¹¹ then the wavefunction will build up coherently within the well. Since the electron density is proportional to the squared magnitude of the wavefunction, this means that the electron concentration in the well can be very high. At such high concentrations, Hartree and many-body interactions (exchange and correlation) are significant and alter the potential (or the shape of the conduction band profile) within the well (Bandara et al., 1988). To device engineers, this is known as the space charge effect.

The space charge effect has an important device application in that it can cause bistability in the current-voltage characteristics of a resonant tunneling diode. Such bistability can be used in binary logic and other applications. The origin of this bistability is explained next. Since the potential within the well depends on whether the well is full of electrons or empty, the transmission probability through the well (and hence the current) can have two different values at the same applied bias, depending on whether the well is initially full or empty. Therefore, the resonance (peak in the current) can appear at two different voltages, depending on whether one is approaching the resonance from below or above. Such bistability has been observed experimentally (Goldman et al., 1987a,b). The effect of space charge buildup on the current-voltage characteristics of a double-barrier resonant tunneling diode has been studied extensively in theory.
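The feedback between well charge and level position described above can be illustrated with a deliberately crude toy model. This is a sketch under stated assumptions and not the calculation of any of the cited works: the quasi-bound level is assumed to drop by half the applied bias and to rise linearly with the stored charge, the well is assumed to fill only while the level lies between the emitter band edge and the emitter Fermi energy, and all parameter values and function names are hypothetical (energies are in units of the emitter Fermi energy).

```python
import numpy as np

# Hypothetical parameters, in units of the emitter Fermi energy.
E_F = 1.0   # emitter Fermi energy, measured from the emitter band edge
E_0 = 2.0   # level position at zero bias with an empty well
U = 0.5     # upward level shift produced by a completely full well

def well_charge(level):
    """Normalized steady-state well charge (0 to 1) for a given level energy.

    The well fills only while the level lies between the emitter band
    edge (0) and the Fermi energy, more strongly as the level moves down.
    """
    if 0.0 < level < E_F:
        return (E_F - level) / E_F
    return 0.0

def sweep(biases, n_start=0.0, mix=0.5, iterations=200):
    """Follow the self-consistent well charge along a bias sweep."""
    n = n_start
    charges = []
    for V in biases:
        for _ in range(iterations):
            level = E_0 - 0.5 * V + U * n                   # charge pushes the level up
            n = (1.0 - mix) * n + mix * well_charge(level)  # damped fixed-point update
        charges.append(n)
    return np.array(charges)

V_grid = np.linspace(0.0, 6.0, 121)
n_up = sweep(V_grid)                  # bias swept upward from zero
n_down = sweep(V_grid[::-1])[::-1]    # bias swept downward from the top

bistable = V_grid[np.abs(n_up - n_down) > 1e-6]
print("bistable window:", bistable.min(), "to", bistable.max())
```

In this toy model the charged well holds the level above the emitter band edge on the upward sweep, so the resonance survives to a higher bias than on the downward sweep, where the well starts empty. The two sweeps therefore differ over a finite bias window, which is the qualitative signature of the bistability discussed above.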
Since 1986, quantitative calculations have been reported by several authors (Onishi et al., 1986; Cahay et al., 1987; Brennan, 1987; Potz, 1989a,b; Mains et al., 1989; Landheer and Aers, 1990; Fij and Jauho, 1991; Jogai et al., 1991). We will not review these calculations, but instead briefly recount the salient features.

¹¹ The coherence is, however, likely to be lost after a few multiple reflections because of an inelastic scattering event.

1. Ballistic Regime: Full Quantum-Mechanical Analysis of Space Charge Effects

Proper treatment of space charge effects requires solving the Schrödinger equation (which gives the carrier concentration for a given potential) and
the Poisson equation (which gives the potential for a given carrier concentration) simultaneously. An alternative (and more popular) approach is to solve these two equations iteratively until a self-consistent solution is obtained. The latter approach was used by Cahay and co-workers (1987) to study how space charge effects influence the current-voltage characteristics of a resonant tunneling diode. Their analysis made three main assumptions: (a) the emitter and collector of the double-barrier device act as electron reservoirs in which the electrons maintain local thermodynamic equilibrium at all injection levels; (b) the structure is three-dimensional, but the potential varies only along the direction of current flow; and (c) the effective mass approximation is valid. The structure analyzed by Cahay et al. (1987) is a resonant tunneling device fabricated by Ray et al. (1986) and is shown in Fig. 14. The calculated current-voltage characteristics in the presence of space charge effects are compared in Fig. 15 with the so-called "flatband" results obtained by assuming that the applied bias drops linearly across the
FIGURE 14. (a) Structure fabricated by Ray et al. (1986). Contact regions are doped 2 × 10'' cm$^{-3}$ (Te); spacer regions are undoped GaAs; barriers are undoped AlGaAs; and the well is undoped GaAs. (b) Equilibrium conduction band energy profiles and quasi-bound states for self-consistent and flatband calculations. After Cahay et al. (1987); reprinted with permission.

FIGURE 15. Current-voltage characteristics (both self-consistent and flatband results) for the structure of Fig. 14, at 300 K. Note that the inclusion of self-consistency has shifted the position of NDR to a higher bias and broadened the characteristic. In addition, the peak current is reduced for the self-consistent calculation. After Cahay et al. (1987); reprinted with permission.
structure. The flatband theory makes no allowance for space charge effects. Note that NDR occurs at a higher voltage when space charge effects are taken into account. This happens because accumulated charges in the well alter the potential in such a way that the first quasi-bound state in the well is shifted away from the conduction band edge in the emitter. Therefore, a larger bias is now required to line up the quasi-bound state in the well with the Fermi level in the emitter to cause resonance. There are other effects associated with space charges. The transmission peak is broadened and the peak current is reduced, leading to a degradation of the peak-to-valley ratio of the current. This ratio is a figure of merit for a resonant tunneling device. The reduction in the peak current is related to the fact that the peak shifts to a higher bias. At a higher bias, the structure becomes more asymmetric so that the transmission through the two barriers becomes more unequal. This reduces the net transmission probability according to the findings of Ricco and Azbel (1984), as previously described.

The bistability in the current-voltage characteristics caused by space charge effects was demonstrated theoretically by Mains et al. (1989) and Frensley (1990). In these calculations, space charge effects are included via the Thomas-Fermi model. Mains et al. (1989) calculated the current-voltage characteristic of an In$_{0.53}$Ga$_{0.47}$As-In$_{0.52}$Al$_{0.48}$As barrier-In$_{0.53}$Ga$_{0.47}$As well structure. Figure 16 shows the static current-voltage characteristics obtained for this device. For the solid curve, the bias voltage was swept upwards from 0 volts, while for the dashed curve (which coincides
FIGURE 16. I-V curves for the In$_{0.53}$Ga$_{0.47}$As-In$_{0.52}$Al$_{0.48}$As barrier-In$_{0.53}$Ga$_{0.47}$As well structure (with 23.2 Å barriers and 43.5 Å well) studied by Mains et al. (1989). The doping of the In$_{0.53}$Ga$_{0.47}$As contacts is $2 \times 10^{18}$ cm$^{-3}$, and 50 Å undoped spacer layers are placed adjacent to each barrier. The effective mass is assumed to be $0.042\,m_0$ in the contacts and well, and $0.075\,m_0$ in the barrier. The barrier height is 0.53 eV. The full (dashed) line in the I-V curve was obtained while sweeping the voltage from 0 V (0.8 V) to 0.8 V (0 V). The bistability in the negative differential resistance region of the current-voltage characteristic can clearly be seen. After Mains et al. (1989); reprinted with permission.
with the solid curve except in the bistable region), the bias voltage was swept down from 0.8 volts. It is seen that a bistable region exists around resonance.

2. Scattering-Dominated Regime: Classical Thomas-Fermi Screening Model for Space Charge Effects
It has been argued that the Thomas-Fermi model for space charge effects (Mains et al., 1989) is appropriate for the usually heavily implanted regions on either side of a double-barrier structure since it automatically includes the effect of inelastic scattering. Frensley has claimed that inelastic collisions are required to allow the electric fields in the devices to approach zero at the emitter and collector boundaries. In addition, there is experimental evidence that a quasi-bound state resides within the conduction-band notch between the emitter spacer and the first barrier of the device (Jogai et al., 1991; Choi and Wie, 1992). This state cannot even be populated without inelastic collisions. Landheer and Aers (1990) have compared the two types of self-consistent calculations (the quantum-mechanical model for the ballistic regime and the Thomas-Fermi model for the scattering-dominated regime) in the case of Si/Si$_{1-x}$Ge$_x$ and AlAs/GaAs double-barrier diodes. They showed that, for SiGe double-barrier diodes, the Thomas-Fermi model in the contact regions
gives closer agreement with experiment because the carrier mean free path near the barrier regions is short in these structures so that electrons suffer frequent scattering. In contrast, the full quantum-mechanical treatment, equivalent to the assumption of an infinite carrier mean free path, seems more appropriate for AlAs/GaAs devices, which have high mobility.

B. Effects of Inelastic Scattering on the Device Characteristics of Double-Barrier Resonant Tunneling Diodes
Two of the long-standing problems with the purely coherent picture of resonant tunneling were its inability to explain the large value of the valley current and the appearance of a phonon peak in the NDR region of double-barrier diodes, a peak whose intensity can be quite substantial in asymmetric structures (i.e., with different barrier thicknesses) (Goldman et al., 1987a,b; Alves et al., 1988; Leadbeater et al., 1988). These features are intricately related to inelastic scattering events and could not be explained by the coherent picture. The effect of inelastic scattering on quantum electron transport through double-barrier resonant tunneling structures has been addressed by several groups (Stone and Lee, 1985; Glazman and Shekhter, 1988; Cai et al., 1989; Wingreen et al., 1988; Wu and McGill, 1989; Jonson, 1989). In their treatments, two approximations have been used for inelastic scattering, namely (a) it is confined to a region of space, and (b) it is treated within the one-electron picture, which neglects the effects of all other electrons in the device and therefore the Pauli Exclusion Principle. Alternative methods have been proposed by Chevoir and Vinter (1989) and Gu et al. (1990). The most popular techniques, however, are based on a solution of the Liouville equation, which yields the density matrix (Frensley, 1987), and an equivalent technique based on a Wigner-Weyl transform of the density matrix, namely the Wigner function (Buot and Jensen, 1990; Kluksdahl et al., 1989; Frensley, 1990). The inclusion of inelastic scattering in these formalisms has been problematic and usually relies on a relaxation time approximation. The latter approximation summarily neglects non-Markovian effects and thus masks many interesting features associated with non-Markovian scattering processes.
A much more powerful technique to investigate the effects of inelastic scattering is based on the general many-body nonequilibrium Green's function formalism of Keldysh (1965) and Kadanoff and Baym (1962). This was applied successfully by Lake and Datta (1992) and by Lake et al. (1993), who rigorously took into account the Pauli Exclusion Principle, a feature that was neglected in all previous work. Unlike in the case of the density
matrix method or the Wigner function method, Lake and Datta did not make the Markovian approximation either. This allowed them to include the phonon energy spectrum in the calculations. As a result, they could calculate the effect of inelastic scattering on the population of the quasi-bound states (both in the emitter notch and in the well), the density of states, the energy distribution of the current density, and the power density in double-barrier resonant tunneling diodes. Their results showed peaks in the power density (dissipated in the device) when current passes through a resonance. The effects of inelastic scattering were also examined by Rudberg (1990), Hyldgaard and Jauho (1990), Anda and Flores (1991), Stovneng et al. (1991), Wu and Ulloa (1991), and Gadzuk (1991). The work of Lake et al. (1993), however, clearly provided a prescription to study the role of inelastic scattering in the context of specific measurable device characteristics. These workers also pointed out the existence of a backflow component in the tunneling current due to the absorption of phonons. Furthermore, they explained the existence of the phonon peak in the tunneling current found in asymmetric structures under forward and reverse bias. Overall, their technique has now emerged as a powerful tool to study the effect of dissipative interactions in electron wave devices.

C. Applications of Resonant Tunneling Devices
The field of resonant tunneling devices is quite old and mature. It would be impossible to do justice to this field within the short scope of this article. Several excellent review articles have appeared on this topic, and we refer the reader to the one by Capasso and co-workers (1990). However, we shall dwell on this topic a bit longer and mention some landmark developments in this field. The first practical application of resonant tunneling diodes was demonstrated by Sollner et al. (1983) at the MIT Lincoln Laboratory. They used the NDR of double-barrier resonant tunneling diodes in detectors and mixers at frequencies up to 2.5 terahertz. With the improvement of the technology, the parasitic series resistance was reduced, and resonant tunneling diodes could be used for microwave generation as well. Brown et al. (1989) have reported microwave oscillations in double-barrier resonant tunneling diodes at frequencies of 456 GHz, which, to our knowledge, is the highest frequency of operation demonstrated for any solid-state electronic device to date. Resonant tunneling through parabolic quantum wells, as opposed to square wells, was demonstrated by Sen et al. (1987). Resonant tunneling electron spectroscopy was used as a tool to probe the energy distribution
of electrons by Capasso et al. (1986, 1987). Over the years, a variety of resonant tunneling transistors, based on different operating principles, have been proposed and demonstrated. They include both bipolar (Luryi and Capasso, 1985; Capasso and Kiehl, 1985; Yokoyama et al., 1985; Bonnefoi et al., 1985; Futatsugi et al., 1987; Woodward et al., 1987; Capasso et al., 1986, 1987) and field-effect transistors (Sen et al., 1987). The latter type has exhibited negative transconductance even when the majority charge carriers are electrons instead of holes. In addition, novel structures where transistor action is based on the gating of the quantum well subbands have been demonstrated (Beltram et al., 1988). In all of these transistors, the purpose of the resonant tunneling structure is to control the collector or drain current via modulation of the base or gate terminal. Thus peaks are obtained in the direct and transfer characteristics. It is this feature that is very useful for the implementation of many circuits. These include exclusive ORs, parity checkers, analog-to-digital converters, frequency multipliers, and multiple-valued logic. The application of resonant tunneling devices in some of these circuits has been reviewed by Capasso and co-workers (1990) and Sollner and co-workers (1983, 1987).

D. Quantum-Mechanical Tunneling Time and Its Relation to the Tsu-Esaki Formula
Before we conclude this section on resonant tunneling diodes, we should say a few words about tunneling time, which is the time it takes for an electron to tunnel through one or both barriers. This is a topic of much recent interest because it has serious implications for the ultimate switching speed of resonant tunneling devices. It is not even obvious how one can define the tunneling time. To define it as the time that elapses between the electron's disappearing within the barrier and reappearing at the other side would require assigning definite positions to the electron. This is, of course, forbidden by the Uncertainty Principle. Because of this nebulosity, various definitions of the tunneling time have been proposed. The earliest one is based on the phase-delay method introduced by Bohm (1951) and Wigner (1955). An alternative approach in which the Larmor precession of the electron's spin serves as an index of tunneling time was proposed by Buttiker (1983) and generalized by Hauge and Stovneng (1989). Smith (1960) proposed a third definition which he called the “dwell time.” Others have performed numerical simulations of wavepacket propagation through potential barriers (Collins et al., 1987; Jauho and Nieto, 1986) and arrived at various estimates of this so-called “tunneling time.”
Most of the preceding studies have dealt only with the tunneling time through simple obstacles, including delta-potential scatterers and simple rectangular barriers under zero bias. Even in these simple cases, the agreement among the various estimates is far from satisfactory. Recently, one of us (M. C.) derived an expression for the tunneling time valid for a barrier of arbitrary shape and under nonzero bias (Cahay et al., 1992). The approach is based on a technique developed by Khondker and co-workers (1988). In this technique, one defines a complex quantum-mechanical wave impedance $Z(x)$ at any point $x$ within a structure as follows:

$$Z(x) = \frac{2\hbar}{j m^*}\,\frac{\Psi'}{\Psi} = R(x) + jX(x), \qquad \text{(III.6)}$$

where $\Psi$ is the electron wavefunction and $\Psi'$ is its gradient; $R(x)$ and $X(x)$ are the real and imaginary parts of the impedance $Z(x)$ looking in the positive $x$ direction; and $m^*$ is the electron effective mass, assumed to be constant. Equation (III.6) may be recast in the form

$$\frac{\Psi'}{\Psi} = j\kappa(x) + \alpha(x), \qquad \text{(III.7)}$$

where

$$\kappa(x) = \frac{m^*(x)R(x)}{2\hbar} \quad \text{and} \quad \alpha(x) = -\frac{m^*(x)X(x)}{2\hbar};$$

these quantities may be interpreted as the propagation constant and the attenuation constant of the wavefunction, respectively. Integrating Eq. (III.7) from $x = 0$ to any $x$ results in

$$\Psi(x) = \Psi_0 \exp\left\{\int_0^x \left[j\kappa(x') + \alpha(x')\right] dx'\right\}, \qquad \text{(III.8)}$$

where $\Psi_0 = \Psi(0)$ is the wavefunction of the incident electron at $x = 0$. Under steady-state conditions, the probability current density at any point $x$ may be written as

$$J = \frac{\hbar\kappa(x)}{m^*}\,|\Psi(x)|^2 = \tfrac{1}{2}R(x)\,|\Psi(x)|^2. \qquad \text{(III.9)}$$

The probability current density $J$ can also be defined in terms of the group velocity, $v_g(x)$, as

$$J = v_g(x)\,|\Psi(x)|^2. \qquad \text{(III.10)}$$
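The link between the wave impedance and the probability current can be checked numerically. The sketch below assumes the definition $Z = (2\hbar/jm^*)\,\Psi'/\Psi$ of Eq. (III.6) and the textbook current $J = (\hbar/m^*)\,\mathrm{Im}(\Psi^*\Psi')$, from which $J = \tfrac{1}{2}R\,|\Psi|^2$ follows. The test wavefunction (an incident plus a reflected plane wave in a flat region), the wavenumber, and the reflection amplitude are hypothetical choices made only for illustration.

```python
import numpy as np

HBAR = 1.054571817e-34   # J*s
M0 = 9.1093837015e-31    # free-electron mass, kg

def wave_impedance(psi, dpsi, m_eff):
    """Quantum-mechanical wave impedance Z = (2*hbar / (j*m*)) * psi'/psi."""
    return (2.0 * HBAR / (1j * m_eff)) * dpsi / psi

m_eff = 0.067 * M0       # GaAs-like effective mass (assumed)
k = 5.0e8                # wavenumber in 1/m (assumed)
r = 0.3 + 0.4j           # hypothetical reflection amplitude, |r| = 0.5

x = np.linspace(0.0, 2.0e-8, 201)
psi = np.exp(1j * k * x) + r * np.exp(-1j * k * x)
dpsi = 1j * k * (np.exp(1j * k * x) - r * np.exp(-1j * k * x))

Z = wave_impedance(psi, dpsi, m_eff)
J_from_Z = 0.5 * Z.real * np.abs(psi) ** 2                   # (1/2) R |psi|^2
J_direct = (HBAR / m_eff) * np.imag(np.conj(psi) * dpsi)     # textbook current
J_expected = (HBAR * k / m_eff) * (1.0 - abs(r) ** 2)        # incident minus reflected

print(J_from_Z[0], J_direct[0], J_expected)
```

Although $R(x)$ and $|\Psi(x)|^2$ each oscillate with position because of the standing-wave pattern, their product is constant, as a steady-state current must be, and $R/2$ plays the role of the local velocity in Eq. (III.10).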
Assuming plane wave boundary conditions and using Eqs. (III.9) and (III.10), the following result is then obtained:

(III.11)

where $T_{LR}$ is the transmitted amplitude at the right contact of a plane wave incident from the left, and $k_R$ is the electron wavenumber at the right contact. The time required to move through an elemental distance $dx$ at any $x$ is given by $d\tau = dx/v_g(x)$. Using Eqs. (III.6)-(III.11), the time required for an electron incident from the left contact to traverse the potential barrier of length $L$ is given by

(III.12)

where $k_L$ is the electron wavenumber at the left contact. The term within the curly brackets is the electron transmission across the structure. Using the plane wave boundary conditions, $|\Psi_0|^2 = |1 + \rho_{LR}|^2$ at the left contact, where $\rho_{LR}$ is the amplitude of the reflected wave. Equation (III.12) can be rewritten by making use of Eqs. (III.7)-(III.12) as follows:

(III.13)

Equation (III.13) is general and applies for off-resonance conditions. It also generalizes the tunneling time expression obtained by Hirschfelder et al. (1975) and Khondker et al. (1988), which they derived under zero-bias conditions. Starting with Eq. (III.9) and using Eq. (III.12), it is instructive to write the total current density flowing from the left to the right in a structure as follows:

(III.14)

which is equivalent to the Tsu-Esaki formula ($-e$ is the charge of the electron). Equations (III.13) and (III.14) are therefore the quantum-mechanical analogue of a quite familiar expression used in the study of the base transit time through bipolar devices (Ferry et al., 1988). More importantly, the last equation relates the quantum-mechanical tunneling time to a measurable physical quantity: the current through the structure. It is interesting to note that the tunneling time expression [Eq. (III.13)] reduces to its classical counterpart when the Wentzel-Kramers-Brillouin (WKB) approximation is used for the electron wavefunction. In this case, the
conduction band profile $E_c$ is assumed to vary smoothly with position over a length equal to the de Broglie wavelength, and the solution of the Schrödinger equation can be written as

$$\Psi(x) = \Psi_0 \sqrt{\frac{k_z(0)}{k_z(x)}}\; e^{j\phi(x)}, \qquad \text{(III.15)}$$

where

$$k_z(x) = \left[\frac{2m^*}{\hbar^2}\bigl(E - E_c(x)\bigr) - k_t^2\right]^{1/2} \qquad \text{(III.16)}$$

and

$$\phi(x) = \int_0^x dx'\, k_z(x'), \qquad \text{(III.17)}$$

$k_t$ being the transverse wavevector of the incident electron with energy $E$. Using Eqs. (III.10) and (III.11), the tunneling time expression [Eq. (III.13)] reduces to

$$\tau = \int_0^L \frac{dx}{v(x)}, \qquad \text{(III.18)}$$

where $v(x) = \hbar k_z(x)/m^*$, which is the well-known classical result.

As an application of the approach, we consider a typical resonant tunneling structure with a 50 Å wide GaAs well sandwiched between two 50 Å thick AlGaAs layers with a barrier height of 0.225 eV. The electron effective mass is assumed to be $0.067\,m_0$ everywhere. Figure 17a shows the tunneling time for this structure under zero bias conditions. Also shown is the phase-delay time calculated by Liu (1987) for the same structure. The tunneling time calculated in this study is found to be minimum at the quasi-bound-state energy ($E = 0.0791$ eV), whereas the phase-delay time is a maximum at this energy (Liu, 1987). At resonance, both tunneling times are in good agreement, with a value around one picosecond. More systematic work needs to be carried out to understand the qualitative difference between the two times at off-resonance energies. It should be pointed out that if the tunneling time were shorter at off-resonance energies, as predicted by the phase-delay result, resonant tunneling devices ought to operate at frequencies far beyond the terahertz regime, which is contrary to the observed experimental results. Figures 17a and 18a show the tunneling time for electrons impinging from both contacts for an applied bias of 0.1 V and 0.2 V, respectively. Figures 17b and 18b show a plot of the transmission coefficient through the structure for the corresponding biasing condition. As can be seen, the tunneling time is minimum when the transmission coefficient is maximum.
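The classical limit quoted above, in which the traversal time is the integral of $dx/v(x)$ with $v(x) = \hbar k_z(x)/m^*$, can be evaluated directly. The sketch below is not the resonant structure of Fig. 17: it assumes a barrier-free 150 Å slab with a linear 0.1 V potential drop, normal incidence ($k_t = 0$), and an incident energy of 0.05 eV, all hypothetical values chosen so that the motion is classically allowed everywhere and a closed-form answer exists for comparison.

```python
import numpy as np

E_CHARGE = 1.602176634e-19   # elementary charge, C
M0 = 9.1093837015e-31        # free-electron mass, kg

def classical_traversal_time(x, v):
    """Trapezoidal estimate of tau = integral of dx / v(x)."""
    integrand = 1.0 / v
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)))

m_eff = 0.067 * M0           # GaAs-like effective mass
length = 150e-10             # slab length in m (assumed)
bias = 0.1                   # potential drop in V (assumed)
energy = 0.05 * E_CHARGE     # incident energy in J (assumed)

x = np.linspace(0.0, length, 2001)
band_edge = -E_CHARGE * bias * x / length           # E_c(x), linear drop
v = np.sqrt(2.0 * (energy - band_edge) / m_eff)     # v = hbar*k_z/m*, with k_t = 0

tau_numeric = classical_traversal_time(x, v)

# Constant force => constant acceleration, so tau = m* (v_out - v_in) / F.
force = E_CHARGE * bias / length
tau_exact = m_eff * (v[-1] - v[0]) / force

print(f"numeric: {tau_numeric:.4e} s, closed form: {tau_exact:.4e} s")
```

The result is a few tens of femtoseconds, much shorter than the picosecond-scale times quoted for the resonant structure, which is expected since this sketch contains no barriers and therefore no dwell in a quasi-bound state.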
FIGURE 17. (a) Tunneling time through a resonant tunneling structure with 50 Å wide barriers and wells. The barrier height is assumed to be 0.3 eV; the effective mass $m^*$ is taken to be $0.067\,m_0$ everywhere. A linear potential drop of 0.1 V has been applied across the structure. (b) Plot of the transmission coefficient versus energy. The labels LR and RL are for electrons impinging from left to right and right to left, respectively. (c) Plot of the beat frequency $\omega_-$ as a function of energy in the vicinity of the first quasi-bound-state energy. The position of the latter is indicated by a vertical dashed line. After Cahay et al. (1992); reprinted with permission.
FIGURE 18. (a) Same as Figure 17a for an applied bias of 0.2 V. (b) Same as Figure 17b for an applied bias of 0.2 V. After Cahay et al. (1992); reprinted with permission.

Figures 17a and 18a show that the difference in the energy dependence of the tunneling times from opposite contacts is more pronounced for larger bias. The following mechanism could be proposed to explain the high-frequency components observed in resonant tunneling devices. When the structure is biased at 0.1 V, the position of the lowest quasi-bound state is indicated by the dashed vertical line in Fig. 17a. Only electrons impinging with an energy within the width of the resonant peak will contribute to the total current at relatively low temperature. Because of the difference in tunneling times for electrons impinging from opposite contacts, there will be a time variation in the electron charge density $\Delta n_L$ (within the energy range $\Delta E$ of the resonance width) in the left contact, inducing a change in current through the structure given by

$$\Delta I \propto \frac{d\,\Delta n_L}{dt} = -\frac{\Delta n_L}{\tau_{LR}} + \frac{\Delta n_R}{\tau_{RL}}, \qquad \text{(III.19)}$$

$\tau_{LR}$ and $\tau_{RL}$ being the tunneling times (assumed not to vary much over $\Delta E$) while traveling from left to right and right to left, respectively, and $\Delta n_R$ being the variation in the electron density in the right contact in the same energy range. The change in total current can now be written as

$$\Delta I \propto -\tfrac{1}{2}(\Delta n_L + \Delta n_R)\,\omega_- - \tfrac{1}{2}(\Delta n_L - \Delta n_R)\,\omega_+, \qquad \text{(III.20)}$$

where

$$\omega_\pm = \frac{1}{\tau_{LR}} \pm \frac{1}{\tau_{RL}}.$$

As evident from Fig. 17c, the beat frequency varies rapidly with energy and reaches the terahertz regime within the width of the resonant peak. This suggests that current components in resonant tunneling devices could appear at the beat frequency. The maximum beat frequency for the specific structure considered above was found to be about 1 THz near the position of the resonant peak under a bias of 0.1 V. The intrinsic bistability and high-frequency current oscillations observed in resonant tunneling devices could therefore result from the difference in tunneling times for electrons with energy around the location of the quasi-bound state but impinging from opposite contacts.

One should mention that several review articles have already addressed the problem of an appropriate definition of a quantum-mechanical tunneling time. The subject is still fairly controversial. At the time of this writing, a popular article appeared in Scientific American summarizing experimental efforts to measure the tunneling time (Chiao et al., 1993).
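The bookkeeping behind the beat-frequency argument can be verified with a few lines of code. The sketch assumes the rate-equation form $d\Delta n_L/dt = -\Delta n_L/\tau_{LR} + \Delta n_R/\tau_{RL}$ and the definition $\omega_\pm = 1/\tau_{LR} \pm 1/\tau_{RL}$. The function names, the two tunneling times, and the density variations are hypothetical, and the rates are left in s$^{-1}$ because the text does not say whether a factor of $2\pi$ is intended.

```python
def beat_frequencies(tau_lr, tau_rl):
    """Sum and difference rates 1/tau_LR +/- 1/tau_RL, in 1/s."""
    w_plus = 1.0 / tau_lr + 1.0 / tau_rl
    w_minus = 1.0 / tau_lr - 1.0 / tau_rl
    return w_plus, w_minus

def density_rate(dn_l, dn_r, tau_lr, tau_rl):
    """Rate of change of the left-contact density, written with omega_+/-."""
    w_plus, w_minus = beat_frequencies(tau_lr, tau_rl)
    return -0.5 * (dn_l + dn_r) * w_minus - 0.5 * (dn_l - dn_r) * w_plus

tau_lr, tau_rl = 1.0e-12, 0.5e-12    # assumed tunneling times, s
dn_l, dn_r = 3.0, 1.0                # assumed density variations, arbitrary units

w_plus, w_minus = beat_frequencies(tau_lr, tau_rl)
rate_beat_form = density_rate(dn_l, dn_r, tau_lr, tau_rl)
rate_direct = -dn_l / tau_lr + dn_r / tau_rl   # loss to the right, gain from the right

print(w_plus, w_minus, rate_beat_form, rate_direct)
```

The two forms agree identically, so the beat-frequency expression is a regrouping of the simple rate equation. Tunneling times that differ by a fraction of a picosecond already give a difference rate of order $10^{12}$ s$^{-1}$, the same order as the terahertz beat frequency quoted above.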
IV. AHARONOV-BOHM EFFECT-BASED DEVICES

The next genre of quantum devices that we will discuss are lateral devices based on the Aharonov-Bohm effect. We remind the reader that lateral devices are those in which current flows parallel to the confining potential barriers. To understand how such a device works, consider the ring structure shown in Fig. 19. The semi-circumference of the ring is much shorter than the mean free path of carriers so that transport in the two branches is ballistic. If $t^{(1)}_{mn}(E)$ and $t^{(2)}_{mn}(E)$ are the transmission amplitudes from the $m$th subband in the left lead to the $n$th subband in the right lead via the two branches, then (if we neglect multiple reflection effects) the total transmission $T_{12}(E)$ through the structure (from contact 1 to contact 2) at low bias is given by [recall Eq. (II.25)]
$$\Gamma_{12} = G_{12} = \frac{e^2}{4kT\,h} \sum_{m=1}^{M_i} \sum_{n=1}^{M_j} \int_0^\infty dE\, \left|t^{(1)}_{mn}(E) + t^{(2)}_{mn}(E)\right|^2 \mathrm{sech}^2\!\left[\frac{E - \mu_0}{2kT}\right]. \qquad \text{(IV.1)}$$
FIGURE 19. An Aharonov-Bohm ring structure. The semi-circumference of the ring (L) is smaller than the mean free path, and the leads are narrow quantum wires in which only one transverse subband is occupied. At low enough temperatures, transport in this structure is single-moded.
If the two branches are identical and placed symmetrically with respect to the two leads, then we can assume that the magnitudes of $t^{(1)}_{mn}(E)$ and $t^{(2)}_{mn}(E)$ are equal for all $m$, $n$, and $E$. They may, however, have different phases which may be controlled by some external means. Let the phase difference be $\theta_{mn}(E)$. Then [recall Eq. (II.24)] the terminal current will be
where $V$ is the applied bias. If the phase factor $(1 + \cos[\theta_{mn}(E)])$ is independent of energy and the subband indices $m$ and $n$, then it can be pulled outside the integration as well as the summations to yield
$$I_{12} = I_0 \cos^2(\theta/2), \qquad \text{(IV.3)}$$
where
$$I_0 = \sum_{m=1}^{M_i} \sum_{n=1}^{M_j} I_{mn},$$
and $\theta_{mn}(E) = \theta$ for all $m$, $n$, $E$. Equation (IV.3) immediately shows that the current can be modulated between 0 and $I_0$, thereby producing a 100% interference effect. We will show shortly that one can indeed induce an energy-independent and subband-independent phase difference $\theta$ between the transmission amplitudes $t^{(1)}_{mn}$ and $t^{(2)}_{mn}$ with an external magnetic field. This is made possible by the so-called magnetostatic Aharonov-Bohm effect (Aharonov and Bohm, 1959). If a magnetic field is applied in a direction perpendicular to the plane of the ring, then it induces a constant phase shift between the wavefunctions in the two branches given by
$$\theta = \frac{e}{\hbar} \oint \mathbf{A} \cdot d\mathbf{l} = \frac{e}{\hbar}\Phi, \qquad \text{(IV.5)}$$
where $\Phi$ is the flux enclosed by the ring,$^{12}$ $\mathbf{A}$ is the magnetic vector potential, and the contour integration is performed around the circumference of the ring. Strictly speaking, this way of inducing a phase shift is based on the so-called type II Aharonov-Bohm (AB) effect, since in the original (type I) AB effect, the magnetic field is not supposed to exist in the path of the electrons, i.e., the two branches should be shielded from the field. We shall not deal with such strict semantics here and simply call it the AB effect. The total phase difference $\theta_{mn}(E)$ between the wavefunctions in the two branches (in the presence of a magnetic flux threading the ring) can be written as
$$\theta^{mag}_{mn}(E, \Phi) = \theta_{mn}(E, 0) + \frac{e}{\hbar}\Phi. \qquad \text{(IV.6)}$$
If the zero-field phases of the different conduction channels (or subbands) are strongly correlated at all energies (recall the mysterious “conspiracy theory” mentioned at the end of Section II,B), then the zero-field phase difference $\theta_{mn}(E, 0)$ is a constant independent of $m$, $n$, and $E$. In that case, $\theta^{mag}_{mn}(E, \Phi)$ is also approximately independent of $m$, $n$, and $E$, which means that every conduction channel has about the same phase shift. We can then use Eq. (IV.3), which predicts that a 100% modulation in the conductance is possible as the magnetic field is scanned. The major assumption in this is that the zero-field phases $\theta_{mn}(E, 0)$ are equal for all $m$, $n$, and $E$. In Section IV,C, we will show how this may be enforced by clever design of structures and by ensuring ballistic transport.

$^{12}$Since the branches of the ring have a nonzero thickness, the area enclosed by the ring has an uncertainty whose maximum value is the difference between the areas enclosed by the inner and outer circumferences. This introduces an uncertainty in $\Phi$ and therefore in the Aharonov-Bohm phase shift. For this reason, it is usually necessary to maintain a large “aspect ratio,” defined as the ratio of the area of the hole to that of the annulus.
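To make the magnetostatic phase shift concrete, the following minimal Python sketch evaluates $\theta = (e/\hbar)\Phi$, the ideal single-channel modulation $\cos^2(\theta/2)$ of Eq. (IV.3), and the magnetic field increment that corresponds to one $h/e$ flux period. The function names and the ring radius of 0.5 μm are hypothetical illustrations; the ring is assumed to enclose a circular area, and the physical constants are the SI defined values.

```python
import math

H_PLANCK = 6.62607015e-34          # Planck constant, J s
HBAR = H_PLANCK / (2.0 * math.pi)  # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19         # elementary charge, C
FLUX_PERIOD = H_PLANCK / E_CHARGE  # h/e in Wb, one full Aharonov-Bohm period


def ab_phase(flux_wb):
    """Magnetostatic Aharonov-Bohm phase (e/hbar) * flux, in radians."""
    return E_CHARGE * flux_wb / HBAR


def relative_current(flux_wb):
    """Ideal modulation I/I_0 = cos^2(theta/2) for a single common phase."""
    return math.cos(0.5 * ab_phase(flux_wb)) ** 2


def field_period(radius_m):
    """Field increment (tesla) that adds one flux period h/e to a circular ring."""
    return FLUX_PERIOD / (math.pi * radius_m ** 2)
```

For the assumed 0.5 μm radius, `field_period(0.5e-6)` is a little over 5 mT, and evaluating it for the inner and the outer radius of a real ring gives a feel for the flux uncertainty described in footnote 12.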
But if this assumption does not hold, then the conductance modulation would decrease. In the limit of large $M_i$ or $M_j$, the conductance modulation decreases to $1/M_i$ or $1/M_j$ of the 100% value if the zero-field phases are completely random and uncorrelated. Therefore, it is usually best to have a single subband occupied in the leads. Actually, we will show later, in Section IV,B, that it is only necessary to have a single subband occupied in the direction perpendicular to both the magnetic field and the direction of current flow; multiple subbands can be occupied in the direction of the magnetic field. In this situation, one can in principle obtain 100% conductance modulation as long as transport is ballistic and the ring is geometrically symmetric about an axis. Note that the ring structure acts as a transistor because the current flowing between two terminals can be controlled by a magnetic field. For practical applications, a magnetic field is not as convenient as an electric field. Fortunately, it turns out that an electric field (or an electrostatic potential) imposed between the two branches can also induce a relative phase shift between the two branches owing to the so-called electrostatic Aharonov-Bohm effect (Matteucci and Pozzi, 1985; Washburn et al., 1987; De Vegvar et al., 1989).$^{13}$ However, the phase shift is then no longer independent of the electron energy or subband index. It is given by
$$\theta^{el}_{mn}(E, \Delta V) = \frac{e}{\hbar}\,\Delta V\,\langle\tau_t\rangle(E, m, n, \Delta V), \qquad \text{(IV.7)}$$
where $\Delta V$ is the potential drop between the branches inducing the phase shift, and $\langle\tau_t\rangle(E, m, n, \Delta V)$ is the harmonic mean of the transit times through the two branches. Equation (IV.7) can be rewritten as
where $E$ is the electron energy measured from the bottom of the bulk conduction band in the material, $E_{mn}$ is the energy at the bottom of the $(m, n)$th subband, and $L$ is the semi-circumference of the ring.$^{14}$ Looking at Eq. (IV.8), it becomes evident that the phase shift due to the electrostatic effect depends on the electron energy $E$ and the subband indices $m$ and $n$. Hence, $\theta^{el}_{mn}(E, \Delta V)$ can no longer be pulled outside either the integral over energy or the summation over $m$ and $n$ in Eq. (IV.2). Consequently, the current or conductance will not be modulated by 100% unless (a) only the lowest subband is occupied (so that the sum over $m$ and $n$ has only one term corresponding to $m = 1$ and $n = 1$) and (b) the spread in the electron energy ($\sim kT$) approaches zero. Physically, this means that ensemble averaging, indicated by the integration over energy and the summations over subband indices $m$ and $n$ in Eq. (IV.2), will tend to dilute the interference effect at elevated temperatures and in multichanneled transport. To see how the dilution of interference occurs, consider the situation when electrons enter the device with a broad spectrum of energy and from a multitude of subbands. Since $\theta^{el}_{mn}(E, \Delta V)$ depends on energy and subband indices, some of the electrons will interfere constructively while others will interfere destructively. Consequently, there will be hardly any net interference effect to cause a current modulation. The only way to mitigate this problem is to ensure that the spread in the energy of the electrons ($\sim kT$) is kept small by keeping the temperature low, and that the structure is strictly one-dimensional so that only the lowest subband is occupied everywhere in the branches. This will ensure that all electrons contributing to the current will suffer the same phase shift and therefore interfere in synchronism. This synchronism is most important for a large ($\sim$100%) current modulation. The synchronism was guaranteed for the magnetostatic effect, since the magnetostatic phase shift is independent of electron energy and subband indices. In that case, ensemble averaging did not dilute the interference effect. Therefore, the magnetostatic AB effect is easier to observe and control than its electrostatic counterpart. In other words, the magnetostatic effect is more robust than the electrostatic effect.

$^{13}$The AB effect can even be induced by a gravitational field (see, for example, Colella et al., 1975).

$^{14}$Note that the phase shift is a nonlinear function of $\Delta V$, so that the electrostatic Aharonov-Bohm oscillations are not periodic in $\Delta V$ unless $e\Delta V$ is much smaller than the electron's kinetic energy $(E - E_{mn})$.
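The dilution of the electrostatic interference by thermal averaging can be illustrated numerically. The sketch below is a simplified model and not the authors' calculation: it assumes a single subband, the small-$\Delta V$ form of Eq. (IV.7) with transit time $L/v(E)$ for a parabolic band, a GaAs-like effective mass, and the thermal weight $\mathrm{sech}^2[(E - \mu)/2kT]$ that appears in Eq. (IV.1). The chemical potential of 5 meV, the branch length of 0.2 μm, and all function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34          # J s
E_CHARGE = 1.602176634e-19      # C
K_B = 1.380649e-23              # J/K
M_EFF = 0.067 * 9.1093837e-31   # assumed GaAs-like effective mass, kg


def electrostatic_phase(energy_j, d_v, length_m):
    """(e dV / hbar) * L / v(E): small-dV phase for kinetic energy E (assumption)."""
    velocity = math.sqrt(2.0 * energy_j / M_EFF)
    return E_CHARGE * d_v * length_m / (HBAR * velocity)


def averaged_factor(d_v, mu_j, temperature_k, length_m, n=1000):
    """Thermal average of (1 + cos(theta))/2 with sech^2 weights (midpoint rule)."""
    kt = K_B * temperature_k
    lo = max(mu_j - 20.0 * kt, 0.02 * mu_j)   # cut off near E = 0 to keep v(E) finite
    hi = mu_j + 20.0 * kt
    step = (hi - lo) / n
    num = den = 0.0
    for i in range(n):
        energy = lo + (i + 0.5) * step
        weight = 1.0 / math.cosh((energy - mu_j) / (2.0 * kt)) ** 2
        num += weight * 0.5 * (1.0 + math.cos(electrostatic_phase(energy, d_v, length_m)))
        den += weight
    return num / den


def modulation_depth(mu_j, temperature_k, length_m, steps=200):
    """(max - min)/(max + min) of the averaged factor over the first oscillation."""
    v_pi = math.pi * HBAR * math.sqrt(2.0 * mu_j / M_EFF) / (E_CHARGE * length_m)
    values = [averaged_factor(1.5 * v_pi * i / steps, mu_j, temperature_k, length_m)
              for i in range(steps + 1)]
    return (max(values) - min(values)) / (max(values) + min(values))


MU = 5.0e-3 * E_CHARGE                       # hypothetical chemical potential, 5 meV
depth_cold = modulation_depth(MU, 0.1, 0.2e-6)
depth_warm = modulation_depth(MU, 10.0, 0.2e-6)
```

In this model the modulation depth is essentially complete at 0.1 K and visibly reduced at 10 K. A phase that does not depend on energy, as in the magnetostatic case, would factor out of the average and leave the depth at 100% for any temperature.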
In spite of its relative disadvantages, the electrostatic effect is still preferable for device applications since a magnetic field is difficult to generate in an integrated circuit and cannot be switched as rapidly as an electric field. Therefore, electronic transistors must utilize the electrostatic effect and consequently must contend with the deleterious effect of ensemble averaging. Such transistors have been proposed (Fowler, 1985; Datta et al., 1985, 1986a,b; Bandyopadhyay et al., 1986a,b). They have many advantages and we will describe them in a later section; but first, we show that they are indeed true electron wave devices and that there is a well-known optical analog.
A. The Optical Analog of Aharonov-Bohm Devices: The Mach-Zehnder Interferometer

A well-known interferometer in optics is the Mach-Zehnder interferometer (see, for example, Loenberger et al., 1976). It basically consists of a ring
FIGURE 20. A Mach-Zehnder interferometer consisting of a ring of optical fibers. One branch of the ring is gated to change its refractive index through the application of an external voltage.
made of a single-moded optical fiber as shown in Fig. 20. An electric field changes the refractive index (and hence the phase velocity of light) in one of the branches as a result of some electro-optic effect. This introduces a frequency-dependent phase shift between light propagating in the two branches. This phase shift can be varied continuously by the electric field, which results in a field-controlled interference of the light emerging in the right fiber. The Mach-Zehnder interferometer is an exact optical analogue of electronic transistors based on the electrostatic Aharonov-Bohm interference effect. To see this analogy more clearly, the reader may refer to Fig. 21, where we show the dispersion (frequency versus wavevector) relation of guided light in the two branches of the Mach-Zehnder interferometer. The slopes of the lines are different for the two branches because the velocity of light is different in them. For monochromatic light of a given frequency $\omega$, this results in a difference in the wavevectors $\Delta k$ ($= \omega\Delta n/c$) in the two branches, where $\Delta n$ is the difference in the refractive indices and $c$ is the speed of light in vacuum. The resulting phase difference $\Delta k L$, which depends on the light frequency or photon energy, is the cause of the interference effect. In Fig. 21, we also show the dispersion (energy versus wavevector) relation for electrons in the two branches of a quantum interference transistor when a potential $\Delta V$ is impressed between the two branches, or a magnetic field with flux density $B$ is applied perpendicular to the plane of the ring. We assume that a single subband is occupied. Again, at any given energy, we get a net phase shift $\Delta k L$ which depends on the energy for the electrostatic case, but is independent of the energy for the magnetostatic case. If we assume
FIGURE 21. (a) The dispersion relation for guided light in the two branches of a Mach-Zehnder interferometer, and (b) the dispersion relation for electrons in the two branches of the ring structure of Fig. 19 with a potential difference $\Delta V$ applied between the two branches.
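the dispersion relation to be parabolic, then for the electrostatic case, we get
$$e\,\Delta V = \frac{\hbar^2 (k_1 + k_2)(k_1 - k_2)}{2m^*}, \qquad \text{(IV.9)}$$
which gives
$$\Delta k\,L = (k_1 - k_2)L = \frac{e}{\hbar}\,\Delta V\,\frac{2}{1/[L/(\hbar k_1/m^*)] + 1/[L/(\hbar k_2/m^*)]} = \frac{e}{\hbar}\,\Delta V\,\langle\tau_t\rangle, \qquad \text{(IV.10)}$$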
where $\langle\tau_t\rangle$ is the harmonic mean of the transit times. The transit time is given by $\tau_t = L/(\hbar k/m^*)$. Note that the above equation gives exactly the same expression as that in Eq. (IV.7). In the case of the magnetostatic effect, the phase shift is
$$\Delta k\,L = \frac{e}{\hbar}BdL = \frac{e}{\hbar}\Phi, \qquad \text{(IV.11)}$$
where $d$ is the center-to-center displacement between the two branches. Note again that the phase shift is the same as the expression in Eq. (IV.5) and is independent of the electron energy. The fact that the magnetostatic phase shift does not depend on electron energy or the subband index (whereas the electrostatic phase shift does) has a very important consequence. It allows one to observe the magnetostatic effect (in principle a 100% modulation of the current) in two-dimensional structures where several subbands can be occupied in one transverse direction. It also allows observation of the effect at arbitrarily high temperature (or for an arbitrarily large spread in the electron energy) as long as the temperature is low enough to allow ballistic transport. All this is made possible by the fact that we could pull the phase factor outside the integral over energy and summation over subbands in Eq. (IV.2). As long as the integration and summation do not operate on the phase factor, we are guaranteed immunity from the ensemble averaging effects. In discussing the 100% current modulation for the magnetostatic effect, we had made two tacit assumptions. We neglected the effects of multiple reflection suffered by an electron within the ring [this was the assumption in arriving at Eq. (52)] and we tacitly assumed that the zero-field phases $\theta_{mn}(E, 0)$ are independent of $m$, $n$, and $E$. In the next subsection, we will show that multiple reflections do not degrade the 100% current modulation; they merely make the oscillation non-sinusoidal by causing harmonic distortion. The harmonics are generated because multiply reflected waves may suffer phase shifts that are integral multiples of $(e/\hbar)\Phi$. Finally, in Section IV,C we will show how one can approximately ensure that the zero-field phases $\theta_{mn}(E, 0)$ are independent of $m$, $n$, and $E$.
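The equivalence between the wavevector form of the electrostatic phase shift, Eqs. (IV.9) and (IV.10), and the transit-time form of Eq. (IV.7) can be checked numerically. The sketch below is only an illustration under stated assumptions: a parabolic band with a GaAs-like effective mass, a potential difference split symmetrically as $\pm\Delta V/2$ between the branches, and the harmonic mean taken as $2/(1/\tau_1 + 1/\tau_2)$. The function names and sample numbers are hypothetical.

```python
import math

HBAR = 1.054571817e-34          # J s
E_CHARGE = 1.602176634e-19      # C
M_EFF = 0.067 * 9.1093837e-31   # assumed GaAs-like effective mass, kg


def branch_wavevectors(e_kin_j, d_v):
    """k1, k2 for kinetic energies E + e dV/2 and E - e dV/2 (symmetric split assumed)."""
    k1 = math.sqrt(2.0 * M_EFF * (e_kin_j + 0.5 * E_CHARGE * d_v)) / HBAR
    k2 = math.sqrt(2.0 * M_EFF * (e_kin_j - 0.5 * E_CHARGE * d_v)) / HBAR
    return k1, k2


def phase_from_wavevectors(e_kin_j, d_v, length_m):
    """Direct phase difference (k1 - k2) L."""
    k1, k2 = branch_wavevectors(e_kin_j, d_v)
    return (k1 - k2) * length_m


def phase_from_transit_times(e_kin_j, d_v, length_m):
    """(e dV / hbar) times the harmonic mean of the two branch transit times."""
    k1, k2 = branch_wavevectors(e_kin_j, d_v)
    tau1 = length_m / (HBAR * k1 / M_EFF)
    tau2 = length_m / (HBAR * k2 / M_EFF)
    harmonic_mean = 2.0 / (1.0 / tau1 + 1.0 / tau2)
    return E_CHARGE * d_v * harmonic_mean / HBAR


def magnetostatic_phase(b_tesla, d_m, length_m):
    """(e/hbar) B d L: no dependence on the electron energy."""
    return E_CHARGE * b_tesla * d_m * length_m / HBAR


E5 = 5.0e-3 * E_CHARGE    # hypothetical kinetic energies of 5 meV and 10 meV
E10 = 10.0e-3 * E_CHARGE
```

Calling `phase_from_wavevectors(E5, 1e-3, 0.2e-6)` and `phase_from_transit_times(E5, 1e-3, 0.2e-6)` returns the same number, while repeating the call with `E10` gives a different phase. That energy dependence is exactly what separates the electrostatic effect from the magnetostatic one, whose phase function takes no energy argument at all.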
B. The Magnetostatic Aharonov-Bohm Effect in Wide Rings or Double Quantum Wells

A convenient way to realize an Aharonov-Bohm ring like the one shown in Fig. 19 is to use a double quantum well structure (see Fig. 22). The two quantum wells serve as the two branches. Note that in this case the branches of the ring are semi-infinite in width, so that a very large number of transverse
FIGURE 22. A double quantum well structure grown by molecular beam epitaxy which simulates an Aharonov-Bohm ring structure as shown in Fig. 19. After Datta et al. (1985); reprinted with permission.
subbands will be occupied along the width. The earliest attempts to study the magnetostatic Aharonov-Bohm effect experimentally involved a similar structure, namely a cylinder with leads attached at diametrically opposite points on the surface. These experiments (Sharvin and Sharvin, 1981; Al’tshuler et al., 1982) showed the conductance oscillations but with exactly one-half of the predicted period in the magnetic flux. These oscillations have a somewhat different origin from the Aharonov-Bohm effect and were predicted by Al’tshuler, Aronov, and Spivak in the same year in which they were observed (1981). They are customarily referred to as the $h/2e$ oscillations (in contrast, the Aharonov-Bohm oscillations are called $h/e$ oscillations since the fundamental period in the magnetic flux is $h/e$). We will discuss the origin of the $h/2e$ effect shortly, but before we do that, we want to discuss why the $h/e$ Aharonov-Bohm effect is difficult or impossible to observe in cylinders. In dirty metal cylinders such as those that were used in the experiments, there is abundant elastic scattering even though phase-randomizing inelastic scatterings may be infrequent at sufficiently low temperature. The reader should recall the discussion in Section II,B, where we indicated that elastic
FIGURE 23. Illustration of the Aronov-Al’tshuler-Spivak effect which gives rise to $h/2e$ conductance oscillations. The reflection probability for an incident electron (and hence the conductance of the structure) is determined by the interference of two time-reversed Feynman trajectories that complete a full circle around the ring clockwise and counterclockwise.
scattering can change the absolute phase of an electron randomly in structures where many transverse subbands are occupied. This happens because elastic scattering can cause intersubband transitions. As a result, different electrons undergo different zero-field phase shifts and fall out of synchronism. This is equivalent to an ensemble averaging which dilutes the interference effect. Therefore, wide structures with multiple subband occupancy will not exhibit a sufficiently strong Aharonov-Bohm effect if transport is diffusive instead of ballistic. Nevertheless, one still expects to observe a small effect with a conductance oscillation amplitude of $\sim 2e^2/h$ because of the mysterious conspiracy between the transport channels as discussed in Section II,B. Precisely this effect was observed in two experiments in 1985 that were carried out in dirty metal rings with semi-circumference much smaller than the phase-breaking length $L_\phi$ (Webb et al., 1985; Chandrasekhar et al., 1985). We now discuss the origin of the $h/2e$ oscillations which survive elastic scattering. These arise from the interference of two time-reversed Feynman paths that can be viewed as real space trajectories traveling in clockwise and counterclockwise directions around the ring. They are shown in Fig. 23. These two trajectories visit exactly the same elastic scatterers in their journey around the ring and suffer exactly the same zero-field phase shifts.$^{15}$ Therefore, the zero-field phase difference between these two trajectories is not affected by elastic scattering; in fact, this phase difference is exactly zero. In the presence of a magnetic field, they undergo equal but opposite phase shifts $(e/\hbar)\oint \mathbf{A} \cdot d\mathbf{l}$ so that the phase difference is twice the
$^{15}$These are the trajectories that are responsible for Anderson localization (Bergmann, 1983).
Aharonov-Bohm phase shift $(e/\hbar)\Phi$. Therefore, the oscillations have twice the frequency. What about wide structures (such as the one in Fig. 22) in which transport is ballistic instead of diffusive? Large conductance oscillations with amplitude far in excess of $2e^2/h$ were experimentally observed in molecular beam epitaxy-grown double quantum wells in which transport was partially ballistic. In this section, we show that 100% conductance modulation due to the magnetostatic Aharonov-Bohm effect can, in principle, be obtained in such structures regardless of (a) the width of the structure (or number of transverse subbands occupied along the width), (b) the ambient temperature (or the spread in the electron energy), and (c) the amount of multiple reflections suffered by an electron within the structure, if four conditions are satisfied: (a) either transport in the two wells is ballistic, or there is elastic scattering, but it is perfectly correlated in the two wells (this means that the concentration and configuration of elastic scatterers in the two wells are identical); (b) only a single subband is occupied in the wells (along the $z$-axis in Fig. 22); (c) the structure is geometrically symmetric about the $x$-$y$ plane [this condition, along with condition (b), ensures that the zero-field phase difference $\theta_m(E, 0) = 0$ for all energies $E$ and $y$-directed subbands indexed by the integer $m$]; and (d) the magnetic field is weak enough that the cyclotron radius is larger than the width of the well along the $z$-direction. Consider now the hypothetical structure as shown in Fig. 24a with two end regions ($x < 0$ and $x > L$) and a middle region ($0 < x < L$) consisting of two contiguous quantum wells. In each of the three regions the energy of an electron is given by
$$E(k_x, k_y) = \epsilon_0 + \frac{\hbar^2 k_x^2}{2m^*} + \frac{\hbar^2 k_y^2}{2m^*}, \qquad \text{(IV.12)}$$
where $m^*$ is the effective mass and $\epsilon_0$ is the subband bottom energy of the lowest (and the only occupied) subband. The dispersion relations are sketched in Fig. 24b. It will be noted that at equilibrium, the relative positioning of the subbands in the three regions is fixed by the requirement of a constant Fermi level for the specified doping densities. The current $I$ through the structure for an applied potential $V$ is given by the Tsu-Esaki formula
where $W_y$ is the width of the structure in the $y$ direction and $T(E, k_y)$ is the transmission coefficient from the left contact to the right contact. The quantities $E$ and $k_y$ are the energy and the $y$-component of an electron's wavevector.
FIGURE 24. (a) A hypothetical double quantum well structure grown by etching and regrowth in which a single transverse subband is occupied everywhere in the wells along the $z$-direction. The device is uniform along the $y$-direction with a width $W_y$. (b) The electron energy $E(k_x, k_y = 0)$ versus wavevector $k_x$ relation in various regions of the structure. In the region ($0 < x < L$) where the two branches exist, the dispersion relations in the two branches are degenerate and are shown slightly displaced from each other for clarity. 1 and 2 are the two lowest subbands in the wide regions, while 1' (1'') and 2' (2'') are the lowest subbands in the narrow regions (branches). Only subbands whose bottoms are below the Fermi energy $E_F$ are occupied. After Datta (1989a); reprinted with permission.
We will assume that all four conditions mentioned before are satisfied. Condition (d) means that the magnetic field $B_y$ is low enough to allow its effect to be described by lowest-order perturbation theory. Condition (c) means that the conduction band potential $E_c(z)$ is symmetric about the plane $z = 0$. It was shown by Datta and Bandyopadhyay (1987) that when all four conditions are met, a large conductance modulation (approaching
100%) can be obtained even if (a) the width $W_y$ is arbitrarily large (so that transport is severely multichanneled), (b) an electron suffers multiple reflections within the structure, and (c) the ambient temperature is arbitrarily high. The absolute conductance modulation $\Delta G$ can thus be made arbitrarily large by increasing $W_y$. Furthermore, the conductance modulation is unaffected by the aspect ratio of the ring. We now proceed to prove the preceding statements rigorously, following Datta and Bandyopadhyay (1987). Condition (b) implies that there is only one subband to consider in each of the end regions. In the middle region there are two subbands corresponding to channels 1 and 2. The transmission coefficient $T$ from one end to the other can be written as (Anderson, 1981)
$$T = t'\,[I - P\,r\,P'\,r']^{-1}\,P\,t. \qquad \text{(IV.14)}$$
The preceding equation explicitly takes into account the effects of multiple reflections that can be suffered by an electron within the ring before it exits. The matrices $t$, $t'$, $r$, $r'$, $P$, and $P'$ are defined in Fig. 24c. $t$ is a 2 x 1 matrix describing the transmission from the left end into the two wells, while $t'$ is a 1 x 2 matrix describing the transmission from the wells into the right end:
$$t = \begin{pmatrix} t_1 \\ t_2 \end{pmatrix}, \qquad t' = \begin{pmatrix} t'_1 & t'_2 \end{pmatrix}. \qquad \text{(IV.15)}$$
Similarly, $r$ and $r'$ are 2 x 2 matrices describing the reflections at the two junctions from the wells back into the wells. $P$ and $P'$ are 2 x 2 matrices describing forward and reverse propagation between $x = 0$ and $x = L$. Since transport is ballistic, these matrices are diagonal, and the diagonal elements describe the absolute phase shifts in traversing the two wells:
$$P = \begin{pmatrix} P_1 & 0 \\ 0 & P_2 \end{pmatrix}, \qquad P' = \begin{pmatrix} P'_1 & 0 \\ 0 & P'_2 \end{pmatrix}, \qquad \text{(IV.16)}$$
where
$$P_{1,2} = \exp(i k_{x1,2} L), \qquad \text{(IV.17)}$$
$$P'_{1,2} = \exp(i k'_{x1,2} L). \qquad \text{(IV.18)}$$
Here $k_{x1}$ and $k_{x2}$ are the $x$-components of the wavevectors in wells 1 and 2 for a given $E$ and $k_y$. Unprimed wavevectors characterize forward propagation from $x = 0$ to $x = L$, while primed wavevectors describe reverse propagation from $x = L$ to $x = 0$. Using Eqs. (IV.15)-(IV.18) in Eq. (IV.14) and taking the magnitude squared, we get
$$|T|^2 = |a|^2 + |b|^2 + 2|a||b| \cos(\theta + \varphi), \qquad \text{(IV.19)}$$
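where
$$a = (D t'_1 t_1 + C t'_2 t_1)/(AD - BC), \qquad \text{(IV.20a)}$$
$$b = (B t'_1 t_2 + A t'_2 t_2)/(AD - BC), \qquad \text{(IV.20b)}$$
$$\varphi = \mathrm{phase}(a^* b), \qquad \text{(IV.20c)}$$
$$\theta = (k_{x2} - k_{x1})L, \qquad \text{(IV.20d)}$$
$$A = 1 - P_1 P'_1 r_{11} r'_{11} - P_1 P'_2 r_{12} r'_{21}, \qquad \text{(IV.21a)}$$
$$B = P_1 P'_1 r_{11} r'_{12} + P_1 P'_2 r_{12} r'_{22}, \qquad \text{(IV.21b)}$$
$$C = P_2 P'_2 r_{22} r'_{21} + P_2 P'_1 r_{21} r'_{11}, \qquad \text{(IV.21c)}$$
$$D = 1 - P_2 P'_2 r_{22} r'_{22} - P_2 P'_1 r_{21} r'_{12}. \qquad \text{(IV.21d)}$$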
Every quantity in Eqs. (IV.20) and (IV.21) except $\theta$ in the magnetostatic effect [recall Eq. (IV.11)] depends on $k_x$ and therefore on $E$ and $k_y$. For a given $E$ and $k_y$, the parameters $t$, $t'$, $r$, and $r'$ can be calculated exactly by enforcing continuity of the wavefunction and its first derivative across a boundary (Frohne and Datta, 1988). The transmission $|T|^2(E, k_y)$ can then be calculated from Eq. (IV.19) and used in Eq. (IV.13) to calculate the current for a given voltage. The quantity $|T|^2$ depends on $a$, $b$, $\varphi$, and $\theta$, of which the first three parameters will, in general, vary with $E$ and $k_y$. However, if $E_c(z)$ is symmetric about $z = 0$, then $t_1 = t_2$, since wells 1 and 2 are symmetrically disposed with respect to the leads, while the lowest subband in the leads (end regions) has a symmetric wavefunction (a wavefunction of even parity). Similarly, $t'_1 = t'_2$, $r_{11} = r_{22}$, etc. As a result, it can be seen from Eqs. (IV.20) and (IV.21) that $a = b$; hence,
$$|T(E, k_y, \Phi)|^2 = 2\,|a(E, k_y, \Phi)|^2 (1 + \cos\theta). \qquad \text{(IV.22)}$$
The phase shift $\theta = (e/\hbar)\Phi$ [recall Eq. (IV.11)] and is therefore independent of $E$ and $k_y$. Therefore, substituting the preceding expression in Eq. (IV.13), we immediately find that
$$I \propto (1 + \cos\theta) \iint dk_y\, dE\, [f(E) - f(E + eV)]\, |a|^2. \qquad \text{(IV.23)}$$
The quantity $|a|^2$ depends on the magnetic flux $\Phi$, and hence on $\theta$. Consequently, the current does not vary sinusoidally with the magnetic flux. This is a consequence of the multiple reflections suffered by an electron within the ring before it exits. These reflections generate harmonics that distort the shape of the oscillations, but do not degrade the 100% modulation of the current. Moreover, the absolute modulation of the conductance $\Delta G$ can be made arbitrarily large by increasing the width $W_y$. At the same time, the
relative modulation remains independent of temperature and the aspect ratio of the ring. What makes the 100% modulation (or infinite peak-to-valley ratio of the conductance) possible is the fact that $\theta$ is independent of $E$ and $k_y$. Given that, one can proceed to build clever structures in which $a = b$ (this requirement is satisfied by meeting the four conditions mentioned earlier). Such structures will exhibit a 100% modulation. Unfortunately, the electrostatic Aharonov-Bohm phase shift is not independent of $E$ and $k_y$ in a double quantum well structure, so that the conductance modulation would never be 100%. This is unfortunate, since the electrostatic effect is more amenable to device application. Before we conclude, we wish to revisit conditions (c) and (d) for the 100% modulation. At large magnetic fields, the wavefunctions will resemble those of cyclotron orbits associated with Landau levels. If the cyclotron radius $r_c \ll W$, then the electron wavefunction in the lead is localized near the entry to one of the wells. Electrons will then be preferentially injected in that well so that $a \neq b$. In that case the modulation is reduced significantly. This will, of course, inevitably happen at high magnetic fields, and there are experimental indications of this actually happening in ring structures (Datta et al., 1985, 1986a,b; Melloch et al., 1986; Timp et al., 1987). It thus appears that in order to observe many periods of Aharonov-Bohm oscillations, long structures ($L \gg 2W$), which reduce the amount of flux required to induce one period of the oscillation, should be used. Finally we address the question as to whether sufficiently symmetric structures can be fabricated by present-day technology so that $a = b$ and large interference effects can be observed. In metals, this appears impossible because the phase shift across one arm of the interferometer is typically thousands of radians, and even a small percentage deviation would cause a phase jitter greater than $2\pi$.
But in semiconductors the de Broglie wavelength is much longer, so that the phase shift across one arm can be only a few radians, and it should be possible to control the phase more accurately. It should be noted that a scattering center of a given dimension has a far smaller scattering cross-section in semiconductors compared to metals; this implies a smaller phase shift. Moreover, because of the small screening in semiconductors, the scattering potential in one channel could be strongly correlated with that in a nearby channel ~100 Å away. The interference is not affected if the scattering potential is the same in both channels, even if it is inelastic; it is the differential potential that matters. There is already some experimental evidence (Datta et al., 1985) for conductance modulations far in excess of $2e^2/h$ even though the end regions in the experimental structure were multimoded and transport was probably only partially ballistic.
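The claim that multiple reflections distort the oscillations without spoiling the null in the transmission can be explored with a small numerical model built on Eqs. (IV.14), (IV.20), and (IV.21). The Python sketch below is not the calculation of Datta and Bandyopadhyay (1987). It assumes a standard symmetric three-way junction of the Büttiker type with a hypothetical coupling parameter `eps` (between 0 and 0.5) at both ends, a single wavevector in both wells, and Aharonov-Bohm phases of $\mp\theta/2$ for forward propagation in wells 1 and 2 and $\pm\theta/2$ for reverse propagation. All names and parameter values are illustrative.

```python
import cmath
import math


def ring_transmission(theta, k_l, eps=0.3):
    """|T|^2 from T = t'[I - P r P' r']^{-1} P t for a symmetric two-branch ring.

    theta: Aharonov-Bohm phase (e/hbar)*flux; k_l: zero-field phase k*L per branch;
    eps:   assumed lead-to-branch coupling of the junction model, 0 < eps <= 0.5.
    """
    s = math.sqrt(1.0 - 2.0 * eps)
    rho, sigma = 0.5 * (s - 1.0), 0.5 * (s + 1.0)  # same-branch / cross-branch reflection
    t1 = t2 = t1p = t2p = math.sqrt(eps)           # symmetric injection and collection
    r11 = r22 = r11p = r22p = rho
    r12 = r21 = r12p = r21p = sigma
    # Forward (P) and reverse (P') propagation factors with the assumed AB phases.
    p1 = cmath.exp(1j * (k_l - 0.5 * theta))
    p2 = cmath.exp(1j * (k_l + 0.5 * theta))
    p1p = cmath.exp(1j * (k_l + 0.5 * theta))
    p2p = cmath.exp(1j * (k_l - 0.5 * theta))
    # Eqs. (IV.21a)-(IV.21d).
    a_ = 1.0 - p1 * p1p * r11 * r11p - p1 * p2p * r12 * r21p
    b_ = p1 * p1p * r11 * r12p + p1 * p2p * r12 * r22p
    c_ = p2 * p2p * r22 * r21p + p2 * p1p * r21 * r11p
    d_ = 1.0 - p2 * p2p * r22 * r22p - p2 * p1p * r21 * r12p
    det = a_ * d_ - b_ * c_
    # Eqs. (IV.20a) and (IV.20b); the total amplitude is a*P1 + b*P2.
    a = (d_ * t1p * t1 + c_ * t2p * t1) / det
    b = (b_ * t1p * t2 + a_ * t2p * t2) / det
    return abs(a * p1 + b * p2) ** 2


# Sample scan over one flux period for a hypothetical k*L = 1.3.
scan = [ring_transmission(2.0 * math.pi * i / 100, 1.3) for i in range(101)]
```

In this model the transmission vanishes at $\theta = \pi$ for any allowed coupling, so the modulation stays complete, while the shape of the scan is in general not a pure $\cos^2(\theta/2)$ curve because the coefficients $a$ and $b$ themselves depend on $\theta$. Values of `k_l` at integer multiples of $\pi$ should be avoided at zero flux, where the uncoupled antisymmetric mode makes the determinant vanish.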
C. The Electrostatic Aharonov-Bohm Effect in Double Quantum Wells: Possible Ultrahigh-Performance Transistors

In the previous section, we discussed the magnetostatic Aharonov-Bohm effect in double quantum wells. In this section, we will discuss the electrostatic Aharonov-Bohm effect in double quantum wells and its possible device applications. Although the electrostatic effect cannot produce a 100% conductance modulation in double quantum well structures under any circumstance, it is still interesting because of its potential applications in ultrafast and ultralow-power electronic transistors. The double quantum well structure that has been most often invoked for transistor applications (Datta et al., 1986b) consists of two parallel GaAs quantum wells separated by an AlGaAs barrier, as shown in Fig. 25. Alternate designs have been proposed by Okuda et al. (1990, 1993). In the structure of Fig. 25, the AlGaAs barrier is thin at the two ends and thick in the middle. Such a structure can be grown by selective area etching followed by film regrowth using molecular beam epitaxy (Chang and Kroemer, 1984). At the two ends, the barriers are thin (a few tens of angstroms) to allow considerable coupling between the wells. Consequently, there is a significant overlap between the electron wavefunctions in the two wells at the ends, but hardly any in the middle. The large coupling in the end regions makes the electronic state with the symmetric wavefunction $|S\rangle$ lower in energy than the antisymmetric state $|A\rangle$, as shown in Fig. 25, and most of the electrons occupy the symmetric state (at low temperatures) if the carrier density is such that the Fermi level is located as shown in the figure.
FIGURE 25. Proposed structure of the quantum interference transistor based on the electrostatic Aharonov-Bohm effect. After Datta et al. (1986b); reprinted with permission.
The wavefunction $\psi(\mathbf{r}, t)$ for electrons at the ends ($x > L$ and $x < 0$) can now be written as
$$\psi(\mathbf{r}, t) = |S\rangle \exp i[k_x x + k_y y - Et/\hbar]. \qquad \text{(IV.24)}$$
In the central region ($0 \le x \le L$), there is no coupling, so that the wavefunctions $|1\rangle$ and $|2\rangle$ in the two isolated wells are orthogonal. Here we can write the wavefunction $\psi(\mathbf{r}, t)$ as a linear combination of the wavefunctions $|1\rangle$ and $|2\rangle$:
$$\psi(\mathbf{r}, t) = [C_1(x)|1\rangle + C_2(x)|2\rangle] \exp i[k_y y - Et/\hbar]. \qquad \text{(IV.25)}$$
Since the electron is in a symmetric state at the left end, it will enter the two wells in phase. Note that this satisfies the requirement a = b as discussed in the previous section. Therefore, at x = 0, Cl(0) = C2(0).
(IV.26)
Let kl and k2 be the wavevectors in the x direction along the two wells. At the right end (x = L), we get (IV .27) If the waves in the two channels arrive at the right end (x = L ) in phase [i.e., (k,- k2)L = 2n721, then C,(L) = C2(L). In that case, the wavefunction given by Eq. (IV.25) will have even parity and constitute a symmetric state which can transmit out freely. But if the waves arrive out of phase [C,(L)= - C2(L)],then the wavefunction is antisymmetric. The antisymmetric state is completely reflected at the right end for the following reasons. It cannot transmit into the symmetric state IS) in the right end because the overlap with this state is zero. However, the symmetric state is the onlypropugating state in the right end since the antisymmetric state [ A ) is above the Fermi energy and hence evanescent. In sum, the structure conducts when the phase difference (k,- k2)L between the two wells is an even multiple of n and does not conduct when the phase difference is an odd multiple. This realizes an interferometer. Furthermore, by controlling the phase difference with an electrostatic potential applied via a third (gate) terminal, one realizes a “quantum interference transistor.’’ We now derive the transfer characteristics (current versus gate voltage) of this structure to demonstrate the transistor action. Mathematically, the transmission coefficient 1 T l2 for electrons from the left end to the right end is given by (IV.28)
SEMICONDUCTOR QUANTUM DEVICES
Using Eqs. (IV.26)-(IV.28), we get
$$|T|^2 = \cos^2[(k_1 - k_2)L/2]. \tag{IV.29}$$
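As a numerical illustration of how Eqs. (IV.26) and (IV.29) fit together, the sketch below propagates equal amplitudes through the two wells and projects the result onto the symmetric state at the right end. The projection used for $|T|^2$ and the function name `transmission` are assumptions made for this illustration; the normalization of Eq. (IV.28) in the source may be written differently.

```python
import cmath
import math


def transmission(k1, k2, length):
    """Return |T|^2 for equal input amplitudes C1(0) = C2(0), Eq. (IV.26)."""
    c0 = 1 / math.sqrt(2)                   # C1(0) = C2(0), normalized to unit flux
    c1 = c0 * cmath.exp(1j * k1 * length)   # amplitude in well 1 at x = L
    c2 = c0 * cmath.exp(1j * k2 * length)   # amplitude in well 2 at x = L
    # Assumed form of the transmission: overlap with |S> = (|1> + |2>)/sqrt(2).
    amp_s = (c1 + c2) / math.sqrt(2)
    return abs(amp_s) ** 2


if __name__ == "__main__":
    length = 1.0  # arbitrary units; only the phase (k1 - k2) * L matters
    for n_pi in (0.0, 0.5, 1.0, 2.0):
        dk = n_pi * math.pi / length
        t = transmission(10.0 + dk / 2, 10.0 - dk / 2, length)
        print(f"(k1 - k2)L = {n_pi:.1f} pi -> |T|^2 = {t:.3f}")
```

Under this assumption the result coincides with $\cos^2[(k_1 - k_2)L/2]$: full transmission for even multiples of $\pi$ and none for odd multiples.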
We are assuming that the length L is shorter than a mean free path, so that both elastic and inelastic scattering can be neglected. With a mobility of $10^5$ cm$^2$/V s and a Fermi velocity of $10^7$ cm/s (typical of GaAs wells with a carrier concentration of $10^{11}$ cm$^{-2}$), the mean free path is ~0.4 μm. The current I flowing between two terminals of the structure in response to a voltage V is, up to a constant prefactor,
$$I \propto \iint dE\, dk_y \cos^2[(k_1 - k_2)L/2]\,[f(E) - f(E + eV)]. \tag{IV.30}$$
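The mean free path quoted above can be checked with a short calculation. This is a minimal sketch assuming a GaAs effective mass of 0.067 free-electron masses (a standard value, not stated in the text) and the simple relations $\tau = \mu m^*/e$ and $\ell = v_F \tau$; the function name is hypothetical.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # free-electron mass, kg


def mean_free_path(mobility_cm2_per_vs, fermi_velocity_cm_per_s, m_eff_ratio=0.067):
    """Mean free path in meters from mobility and Fermi velocity.

    m_eff_ratio = 0.067 is an assumed GaAs effective-mass ratio.
    """
    mobility = mobility_cm2_per_vs * 1e-4          # cm^2/(V s) -> m^2/(V s)
    velocity = fermi_velocity_cm_per_s * 1e-2      # cm/s -> m/s
    tau = mobility * m_eff_ratio * M_E / E_CHARGE  # momentum relaxation time, s
    return velocity * tau


if __name__ == "__main__":
    mfp = mean_free_path(1e5, 1e7)
    print(f"mean free path = {mfp * 1e6:.2f} micrometers")
```

With the mobility and velocity given in the text, this estimate lands close to the ~0.4 μm figure.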
Using Eq. (IV.10) to replace $(k_1 - k_2)L$, we obtain
$$I \propto \iint dE\, dk_y \cos^2\!\left[\frac{e\,\Delta V\,\langle\tau_t\rangle(E, \Delta V, k_y)}{2\hbar}\right][f(E) - f(E + eV)]. \tag{IV.31}$$
It is obvious that one can modulate I by modulating $\Delta V$. The modulation is of course not periodic in $\Delta V$, since $\langle\tau_t\rangle$ depends on $\Delta V$, which makes the argument of the cosine a nonlinear function of $\Delta V$. More importantly, since $\langle\tau_t\rangle$ depends on E and $k_y$, the conductance modulation is degraded by the integration over the energy and transverse wavevector.

The simple description just presented is qualitatively correct, but it is complicated somewhat by multiple reflections. These can be accounted for by using scattering matrices which relate amplitudes of reflected waves to those of incident waves:
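$$\begin{bmatrix} C_1^-(0) \\ C_1^+(L) \\ C_2^-(0) \\ C_2^+(L) \end{bmatrix} = \begin{bmatrix} 0 & e^{ik_1^- L} & 0 & 0 \\ e^{ik_1^+ L} & 0 & 0 & 0 \\ 0 & 0 & 0 & e^{ik_2^- L} \\ 0 & 0 & e^{ik_2^+ L} & 0 \end{bmatrix} \begin{bmatrix} C_1^+(0) \\ C_1^-(L) \\ C_2^+(0) \\ C_2^-(L) \end{bmatrix}. \tag{IV.32}$$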
Here the superscripts + and − are used to denote the amplitudes of wave functions propagating in the positive and negative x directions, respectively. For an electrostatic potential difference $\Delta V$,
$$k_1^+ = k_1^- = k_1 = k + (e\Delta V/2\hbar v), \tag{IV.32a}$$
$$k_2^+ = k_2^- = k_2 = k - (e\Delta V/2\hbar v), \tag{IV.32b}$$
where $v$ is the average of the electron velocities in the two wells. The amplitudes $C_S$ and $C_A$ of the symmetric and antisymmetric components are defined by
$$C_S = (1/\sqrt{2})(C_1 + C_2), \tag{IV.33a}$$
$$C_A = (1/\sqrt{2})(C_1 - C_2). \tag{IV.33b}$$
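The antisymmetric component is completely trapped between $x = 0$ and $x = L$ and cannot propagate in the two end regions where it is evanescent. Assuming that the reflection coefficient for the antisymmetric state is −1 at either end, we get from Eqs. (IV.32) and (IV.33)
$$\begin{bmatrix} C_S^-(0) \\ C_S^+(L) \end{bmatrix} = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} \begin{bmatrix} C_S^+(0) \\ C_S^-(L) \end{bmatrix}, \tag{IV.34}$$
where
$$S_{11} = S_{22} = -b^2/(1 - a^2), \tag{IV.35a}$$
$$S_{12} = S_{21} = a[1 + b^2/(1 - a^2)], \tag{IV.35b}$$
$$a = e^{ikL}\cos[(k_1 - k_2)L/2], \tag{IV.36a}$$
$$b = i\,e^{ikL}\sin[(k_1 - k_2)L/2]. \tag{IV.36b}$$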
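The closed forms in Eqs. (IV.35) can be cross-checked numerically. The sketch below is an illustration under stated assumptions: `a` and `b` are taken as the diagonal and off-diagonal elements of the one-pass propagation matrix in the symmetric/antisymmetric basis, which gives `b` a factor of i relative to a bare sine, and the antisymmetric state reflects with coefficient −1 at both ends. The function names are hypothetical. The code compares the closed form for $S_{21}$ with an explicit sum over round trips of the trapped antisymmetric component.

```python
import cmath
import math


def pass_amplitudes(k, dk, length):
    """One-pass amplitudes in the S/A basis; dk = k1 - k2."""
    theta = dk * length / 2
    phase = cmath.exp(1j * k * length)
    a = phase * math.cos(theta)        # S -> S (and A -> A) in one pass
    b = 1j * phase * math.sin(theta)   # S <-> A conversion (assumed factor i)
    return a, b


def symmetric_s_matrix(k, dk, length):
    """Closed forms of Eqs. (IV.35a) and (IV.35b): returns (S11, S21)."""
    a, b = pass_amplitudes(k, dk, length)
    s11 = -b**2 / (1 - a**2)
    s21 = a * (1 + b**2 / (1 - a**2))
    return s11, s21


def s21_series(k, dk, length, n_terms=500):
    """S21 as an explicit sum over round trips of the antisymmetric wave."""
    a, b = pass_amplitudes(k, dk, length)
    total = a  # direct transmission in the symmetric state
    for n in range(n_terms):
        # S->A on the way in, reflection at x = L, return pass, reflection at
        # x = 0, n further round trips (a^2 each), then A->S on the way out.
        total += b * (-1) * a * (-1) * (a * a) ** n * b
    return total


if __name__ == "__main__":
    s11, s21 = symmetric_s_matrix(7.3, 0.4, 1.9)
    print("|S11|^2 + |S21|^2 =", abs(s11) ** 2 + abs(s21) ** 2)
    print("closed form vs series:", abs(s21 - s21_series(7.3, 0.4, 1.9)))
```

With these definitions the two-port matrix conserves flux, which is a useful sanity check on the signs in Eqs. (IV.35) and (IV.36).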
If we assume that the symmetric component flows freely to and from the end regions without any reflection, then the transmission coefficient T is given by $T = S_{21}$. Figure 26 shows the linear response conductance G (= I/V) calculated as a function of the voltage difference $\Delta V$. In these calculations, we assumed L = 5,000 Å and the Fermi velocity $v$ to be $10^7$ cm/s. It is interesting to note that the conductance can be decreased by 75% with a voltage of only ~1 mV. This indicates that a quantum interference transistor will have very low threshold voltages, which, in fact, endows it with extremely attractive properties, as we shall show later.

In practice, of course, the assumption that the symmetric component flows freely to and from the ends is unrealistic, especially in view of the geometric bend caused by the process of etching and regrowth. This bend will inevitably cause some reflection, which can be taken into account using the scattering matrix in Eq. (IV.34). Reflections in the end regions distort the transfer characteristics in Fig. 26, and the amount of distortion is critically sensitive to the phase of the reflection coefficient.

1. Performance of Quantum Interference Transistors
In this section, we discuss the performance of quantum interference transistors utilizing double quantum wells such as the one in Fig. 25. Appropriate
FIGURE 26. Normalized conductance for the structure shown in Fig. 25 (L = 5,000 Å) as a function of (a) magnetic field B, plotted from 0 to 5,000 gauss (assuming d = 300 Å), and (b) potential difference $\Delta V$ between the channels, plotted from 0 to 2.0 mV (assuming that the Fermi velocity is $10^7$ cm/s). After Datta et al. (1986b); reprinted with permission.
performance figures were calculated by Bandyopadhyay et al. (1986a, b). In all calculations the length L between the contacts injecting and detecting electrons (termed “source” and “drain” according to conventional device parlance) was taken to be 2,000 Å, and the carrier concentration was $10^{11}$ cm$^{-2}$. These are very reasonable figures for GaAs devices. The lattice temperature (as well as the electron temperature in the contacts) was assumed to be 4.2 K. In Fig. 27a we show the transfer characteristics (drain current vs. gate voltage) of the device for two source-to-drain biases as calculated by Bandyopadhyay et al. (1986a, b). The maximum source-to-drain bias that can be applied without promoting significant phase-randomizing inelastic
FIGURE 27. (a) The transfer characteristics (drain current versus gate voltage, 0 to 4.5 mV) for a quantum interference transistor for two different source-to-drain voltages (curves labeled 30 mV and 1 mV). The source-to-drain separation is 2,000 Å, the carrier concentration is $10^{11}$ cm$^{-2}$, and the electron temperature is 4.2 K. (b) The output characteristics (drain current versus source-to-drain voltage, 0 to 30 mV) for various gate biases (curves labeled 0.0 and 0.4 mV). After Bandyopadhyay et al. (1986b); reprinted with permission.
collisions (due to polar optical phonon emission) is 36 mV.¹⁶ This arises from the fact that the threshold electron energy for polar optical phonon emission in GaAs is 36 meV. Therefore, a ballistic electron arriving at the drain must have an excess kinetic energy smaller than this in order to prevent polar optical phonon emission.

In arriving at the characteristics of Fig. 27a, three assumptions were made. Firstly, it was assumed that the applied gate voltage is dropped entirely between the two quantum wells. This means that the voltage drops over the donor and spacer layers and the substrate (see Fig. 25) are negligible. The substrate may be a conducting material (such as Cr-doped n+-GaAs), so that the substrate drop may indeed be negligible. Secondly, the electrons were assumed not to suffer any reflection as they enter and exit the wells from the contacts. This is a tacit assumption, since electrons must suffer some reflection in entering and exiting the upper well, which has an abrupt geometric bend caused by the process of etching and regrowth. The effects of such spurious reflections, as well as of reflections due to imperfections in the channel, were later examined by Frohne and Datta (1988) and Cahay et al. (1990). They found that these reflections have some deleterious effects, but they are not serious enough to inhibit the interference phenomenon responsible for transistor action. Finally, no allowance is made for scattering within the wells; all electrons are assumed to be ballistic. In reality, electrons whose velocities are not co-directional with the electric field between the source and drain have relatively larger transit times. Some of these electrons will scatter and lose phase memory. They do not contribute to the interference, but instead give rise to a uniform background current, which decreases the percentage modulation in the conductance. From Fig.
27a, it is seen that the transconductance (slope of the transfer characteristics) can be either positive or negative depending on the choice of the dc gate bias. This can result in many interesting applications. As an example, one can realize unipolar complementary logic elements by synthesizing a complementary pair with one transistor biased in the positive and the other in the negative transconductance region. Each transistor can also be made to act as either a depletion-mode device or an enhancement-mode device, by introducing a sheet of charge in the AlGaAs barrier layer during MBE growth to cause a fixed threshold shift. It is also possible to synthesize a single-stage differential amplifier with essentially infinite common mode rejection ratio. Since the electrostatic Aharonov-Bohm phase difference that controls the current depends on the voltage difference between the two wells, the output current depends on the difference input and not on the absolute input. If we can provide separate gate contacts to the two wells, we can provide independent inputs. This allows us to use the structure as a differential amplifier. One can also use the structure as a single transistor frequency multiplier. It is evident that if an ac gate voltage of amplitude (2m + 1) times the threshold voltage is applied at the gate, then the source-to-drain current will oscillate with a frequency that is m times the frequency of the gate signal. In all of the foregoing applications, a single transistor can be made to perform the task of multiple transistors in conventional circuits. We call this property multifunctionality (Bandyopadhyay et al., 1989). This feature may be the most valued feature of quantum devices.

¹⁶ Actually, this is an overestimation. At this bias, a ballistic electron gains an energy equal to 36 meV from the applied bias, which causes the electron temperature to be 415 K! Needless to say, significant electron-electron scattering can take place at this electron temperature.

In Fig. 27b we reproduce the drain current versus drain voltage for various gate biases as calculated by Bandyopadhyay et al. (1986b). Note that the characteristics saturate past a drain voltage of 25 mV. In reality, the drain current will saturate if the drain bias exceeds $E_F/e$, where $E_F$ is the Fermi energy and e the charge of an electron. Therefore, one can make the drain characteristics saturate at a lower voltage by reducing the Fermi energy. This is achieved by reducing the electron concentration in the quantum well (the Fermi energy of a two-dimensional electron gas is proportional to the electron concentration). Reducing the Fermi energy, however, also reduces the threshold voltage, since the latter is proportional to the Fermi velocity.

We now present the estimates of Bandyopadhyay et al. (1986b) for the figures of merit of such a transistor. The excellent figures mostly accrue from the very low threshold voltage $V_T$ (~1.5 mV). An obvious consequence of this low value of the threshold voltage is that the transconductance (and hence the frequency bandwidth) of this device can be very large and the power-delay product can be very small.
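The scaling statements above (Fermi energy proportional to concentration, threshold voltage proportional to Fermi velocity) can be made concrete with a short estimate. This is a rough sketch under stated assumptions: a single spin-degenerate subband at zero temperature, a GaAs effective mass of 0.067 free-electron masses, and a threshold voltage defined as the first null of Eq. (IV.29) with $k_1 - k_2 = e\Delta V/\hbar v$ evaluated at the Fermi velocity. The function names are hypothetical, and the averaging over energy and transverse wavevector is ignored.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31        # free-electron mass, kg
M_EFF = 0.067 * M_E           # assumed GaAs effective mass


def fermi_energy_2d(n_per_cm2):
    """Fermi energy (J) of a 2-D electron gas, measured from the subband edge."""
    n = n_per_cm2 * 1e4                      # cm^-2 -> m^-2
    return math.pi * HBAR**2 * n / M_EFF


def fermi_velocity_2d(n_per_cm2):
    """Fermi velocity (m/s) of a 2-D electron gas with k_F = sqrt(2 pi n)."""
    n = n_per_cm2 * 1e4
    return HBAR * math.sqrt(2 * math.pi * n) / M_EFF


def threshold_voltage(velocity, length):
    """Voltage (V) at which (k1 - k2) L = pi, i.e. the first null of cos^2."""
    return math.pi * HBAR * velocity / (E_CHARGE * length)


if __name__ == "__main__":
    n = 1e11                                  # cm^-2, as in the text
    v_f = fermi_velocity_2d(n)
    print(f"E_F/e = {fermi_energy_2d(n) / E_CHARGE * 1e3:.2f} mV")
    print(f"v_F   = {v_f * 1e2:.2e} cm/s")
    print(f"V_T   = {threshold_voltage(v_f, 2000e-10) * 1e3:.2f} mV for L = 2,000 A")
```

For the concentration and length used in the text, this simple estimate gives a threshold voltage of the same order as the ~1.5 mV quoted, and halving the concentration lowers both $E_F/e$ and $V_T$.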
In Table I we list the various device performance figures that have been estimated (Bandyopadhyay et al., 1986b). The transconductance $g_m$ is defined as $\partial I_{SD}/\partial V_G$ (for a fixed $V_{SD}$), where $V_{SD}$ is the source-to-drain voltage, $I_{SD}$ is the source-to-drain current, and $V_G$ is the gate voltage. The unity gain frequency $f_T$ is given by $g_m/(C_G + C_{SD})$, where $C_G$ and $C_{SD}$ are the gate and source-to-drain capacitances, respectively. The switching delay is given by $C_G(V_T/I_{SD})$, which is the effective RC time constant associated with gate charging.

TABLE I
INTRINSIC PARAMETERS FOR THE QUANTUM INTERFERENCE TRANSISTOR (FIG. 25)ᵃ

Parameter                 Value
Power-delay product       $10^{-20}$ to $10^{-19}$ J
Transconductance          > 100 S/mm of gate width
Unity gain frequency      ~1 THz
Threshold voltage         ~1.5 mV

ᵃ After Bandyopadhyay et al. (1986b).

It was pointed out by Yamanishi (1989) that since the threshold voltage is so small, it need not be applied through an electrical gate contact. Instead, it can be generated by an optical radiation field with a photon energy smaller than the bandgap of the AlGaAs barrier material. Such a field can induce virtual transitions between electron and hole states in a quantum confined structure (Yamanishi, 1987; Chemla et al., 1987). Yamanishi (1989) showed that the charge polarization caused by these virtual transitions can generate the small voltage required for switching the transistor. This mode of switching of course eliminates the RC time constant limitation associated with gate charging, since it eliminates the gate circuit altogether. Moreover, it opens the way to realizing an ultrafast opto-electronic switch. The switching delay for this mode of switching is determined by the optical pulse width and the intrinsic time response of the virtual charge polarization. This time scale is on the order of 100 femtoseconds.

The power-delay product for switching in a typical inverter circuit was calculated by Bandyopadhyay et al. (1986b). It is ~$10^{-20}$ joules, which is within three orders of magnitude of the ultimate quantum-mechanical and thermodynamical limits (Bate, 1983). Such low power-delay products can allow extremely high levels of integration (McDonald et al., 1984). The small power-delay product accrues from the small value of the threshold voltage. Unfortunately, the low threshold voltage also causes the noise margin to be very low and imposes a restriction on the operating environment. These devices need to be operated in well-shielded environments, which imposes stringent demands on packaging. Another important consideration is the operating temperature.
Low-temperature operation is mandatory to reduce the ensemble averaging effect, to reduce Johnson noise, and to eliminate inelastic scattering. However, we shall show later that if the transistor is implemented from quantum wires, as opposed to quantum wells, then the ensemble averaging due to the occupancy of multiple subbands is eliminated. In that case, the operating temperature can be increased significantly, perhaps even to 77 K for wires that are short enough that ballistic transport is still achievable (Bandyopadhyay and Porod, 1988, 1989). This is discussed in the next subsection.
D. The Electrostatic Aharonov-Bohm Effect in Disordered Structures: Comparison between Double Quantum Wires and Double Quantum Wells in the Presence of Elastic Scattering

Until now, we have examined the electrostatic Aharonov-Bohm effect only in the ballistic regime when an electron suffers no collision, elastic
or inelastic. We now examine the diffusive regime when elastic (but not inelastic) scattering is present. In the diffusive regime, there is a significant difference between the performance of two-dimensional structures (double quantum wells) and that of one-dimensional structures (double quantum wires), as far as the electrostatic effect is concerned. We saw in Section IV,C that in the ballistic regime, double quantum wells may still produce a large (~75%) conductance modulation owing to the electrostatic effect, but unfortunately this does not apply in the diffusive regime. Elastic scattering rapidly degrades the performance of double quantum well interferometers. In contrast, it has very little effect on the performance of quantum wire interferometers. This is primarily because elastic scattering is rather infrequent in one-dimensional structures as a result of a dramatic constriction of the available phase space for scattering. In fact, elastic scattering by long-range potentials (e.g., impurity scattering) is virtually prohibited in strictly one-dimensional structures (Sakaki, 1980), so that one-dimensional structures have an advantage.

The motivation to examine diffusive transport as opposed to ballistic transport stems from a practical consideration. It is very difficult to produce truly ballistic structures. Fabrication of such structures places extreme demands on semiconductor technology. The demand is twofold. First, sophisticated techniques, such as modulation doping, are required to reduce impurity scattering. Second, the fabrication demands can be quite imposing, since the length of the structure has to be smaller than the elastic mean free path, and this quantity can be quite small in quantum wells since one of the interfaces of the well will always be an “inverted” interface.¹⁷ Inverted interfaces are known to cause significant roughness scattering. In contrast, structures meant for the diffusive regime are much easier to fabricate.
Modulation doping is not required, and the fabrication demands are significantly relaxed because the length of the structure has to be merely shorter than the phase-breaking length (or inelastic diffusion length), which is typically much longer than the elastic mean free path at low temperatures.

¹⁷ An inverted interface is the interface that occurs when GaAs is grown on AlGaAs rather than the reverse. Inverted interfaces are rougher than normal interfaces and are generally of worse quality.

We now examine why elastic scattering has a severe detrimental effect on the electrostatic AB interference. We have seen before that the electrostatic AB phase shift is proportional to the transit time of electrons. Therefore, any uncertainty or spread in the transit time (due to whatever source) will introduce a corresponding spread in the phase shift, which dilutes the interference effect. In the ballistic case, the spread in the transit time accrues from only two sources: the spread in the electron energy caused by nonzero temperature (thermal broadening) and multiple subband occupancy (recall that in two-dimensional structures, $\langle\tau_t\rangle$ is a function of E and $k_y$). However, in the diffusive case there is a third (and much more important) source of the spread which exists in polydimensional structures but is virtually absent in strictly one-dimensional structures. We discuss this next.

In diffusive (or disordered) 2-D structures, carriers execute a “random walk” motion because of elastic scattering, which introduces a very large spread in the transit time (usually much larger than that introduced by multiple subband occupancy or elevated temperatures). This spread is very small in 1-D structures, since the “random walk” motion is severely restricted. The only permitted “random walk” in strictly 1-D structures is “backwards and forwards” motion (but no “sideways” motion), since all elastic scattering events must involve a 180° deflection of the electron, which corresponds to a reflection. As stated before, even this reflection is a highly unlikely occurrence, especially for high-velocity electrons, because the accompanying momentum change is so large that it can only be caused by the short-range (i.e., large-wavevector) components of the scattering potential. As long as the scattering potential in a 1-D structure varies smoothly in space (an example is the Coulombic impurity scattering potential in modulation-doped structures), such scatterings are practically absent and random walk is essentially prohibited.

The difference between two- and one-dimensional disordered structures is now obvious. In two-dimensional structures there are three distinct causes for the large spread in the phase shift: (a) thermal broadening, (b) multiple subband occupancy, and (c) random walk motion of carriers in diffusive transport.
In contrast, strictly one-dimensional structures are vulnerable only to thermal broadening; the other two sources are absent.

We now proceed to calculate the spread in the phase shift in two- and one-dimensional structures in diffusive transport. Based on this, we will establish a performance criterion to define various levels of performance for quantum interferometers. To determine an appropriate performance criterion, we first ask what is the important figure of merit for a quantum interference transistor. For switching applications, it is necessary to ensure that the conductance modulation (or the ratio of the maximum to minimum conductance of the structure) is sufficiently large that there are well-defined logic levels. This is ensured if the valley current is close to zero. For this we require that when the mean phase difference between the two branches of the interferometer is $\pi$ (corresponding to the conductance minimum), the spread in the phase shift satisfies $\Delta\phi \ll \pi$.

The preceding condition is equivalent to the condition that the ratio $\Delta\phi/\bar{\phi} \ll 1$. We now adopt this ratio as a “performance index” (labeled $\zeta$). The smaller the value of $\zeta$, the better the performance. We now calculate this dimensionless quantity $\zeta$ for 2-D and 1-D structures:
$$\zeta = \frac{\Delta\phi}{\bar{\phi}} = \frac{\Delta(\tau_t)}{\bar{\tau}_t}, \tag{IV.37}$$
where $\Delta(\tau_t)$ is the spread in the transit time (the standard deviation) and the overline denotes an ensemble average. For 2-D structures, the major source of any spread in the transit time is the random walk motion. The other two sources, namely thermal broadening and multiple subband occupancy, are relatively minor. Therefore, we will calculate $\Delta(\tau_t)$ considering only the random walk motion. The quantity $\Delta(\tau_t)$ can be calculated the same way as is done for the Shockley-Haynes experiment (Streetman, 1972):
(IV.38)
where D is the diffusion coefficient, $\bar{v}_d$ is the average drift velocity, and L is the length of the structure between the contacts.
(IV.39)
Therefore,
(IV.40)
For a low electric field $\mathcal{E}$, $\bar{v}_d = \mu\mathcal{E}$, where $\mu$ is the mobility. Also, $\mathcal{E} = V_{bias}/L$, where $V_{bias}$ is the bias voltage over the structure. This gives
(IV.41)
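To give a feel for the magnitudes involved, the sketch below evaluates a drift-diffusion estimate of the performance index. It is an illustration only, built on assumptions that should be read as such: the spread of a drifting carrier pulse is taken as the 1/e width used in textbook treatments of the Haynes-Shockley experiment, $\Delta x = 4\sqrt{D t}$, so that $\zeta = \Delta(\tau_t)/\bar{\tau}_t = 4\sqrt{D/(\mu V_{bias})}$; the prefactor 4 is an assumption. The ratio $D/\mu$ is evaluated with the classical Einstein relation $D/\mu = k_B T/e$, valid for a nondegenerate gas, and the bias is set to the 36 mV optical-phonon limit discussed earlier. The function names are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant in eV/K (numerically k_B/e in V/K)


def thermal_voltage(temp_k):
    """Classical Einstein ratio D/mu = k_B T / e, in volts (nondegenerate gas)."""
    return K_B_EV * temp_k


def zeta_2d(d_over_mu_volts, v_bias_volts):
    """Assumed form zeta = 4 * sqrt(D / (mu * V_bias)) for random-walk spreading."""
    return 4.0 * math.sqrt(d_over_mu_volts / v_bias_volts)


if __name__ == "__main__":
    v_bias = 0.036  # volts, the polar-optical-phonon limit on the bias
    for temp in (0.25, 4.2, 7.0, 26.0, 77.0):
        z = zeta_2d(thermal_voltage(temp), v_bias)
        print(f"T = {temp:6.2f} K -> zeta = {z:.3f}")
```

The inverse square-root dependence on the bias is visible directly in `zeta_2d`: quadrupling `v_bias_volts` halves the index.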
For 2-D structures, $\zeta$ depends inversely on the square root of the bias. The upper limit on $V_{bias}$ is the voltage at which an electron, arriving at one contact from the other, just reaches sufficient energy to cross the threshold for polar optical phonon emission (strong inelastic scattering). Hence, from Eq. (IV.39), invoking the generalized Einstein relation for a carrier concentration n,
FIGURE 28. Performance diagram for two-dimensional GaAs electrostatic Aharonov-Bohm interferometers operating in the diffusive regime (axis label: temperature in K). Regions of “fair” and “good” performance are shown in the diagram. There is no region of excellent performance. After Bandyopadhyay and Porod (1989); reprinted with permission.
we get that for a 2-D structure
where $E_0$ is the energy at the bottom of the lowest subband (the only one presumed to be occupied), $E_F$ is the Fermi energy (which depends on the carrier concentration), and $\varepsilon_{pop}$ is the polar optical phonon energy (= 36 meV for GaAs). Equation (IV.42) gives the limiting values of temperature and carrier concentration for which $\zeta_{min} < 1$. As long as the temperature and carrier concentration are such that $\zeta_{min}$ is less than unity, the device can perform satisfactorily.

In Fig. 28 we depict the performance of 2-D GaAs interferometers as a function of temperature and carrier concentration. There are regions of “fair” and “good” performance. The performance is considered “fair” if $\zeta_{min} < 1$, “good” if $\zeta_{min} < 0.5$, and “excellent” if $\zeta_{min} < 0.1$. For 2-D structures, there is no visible region of “excellent” performance. Note that the performance improves with decreasing carrier concentration and temperature. The lowest carrier concentration practically achievable in GaAs 2-D structures (molecular beam epitaxy-grown quantum wells) is about $10^8$ cm$^{-2}$. For this carrier concentration, $\zeta_{min} \le 0.1$ only at temperatures $\le 250$ mK. From this figure, we also find that 2-D structures cannot operate at 77 K, which is far outside the range of “fair” performance. In addition, we find that even for “fair”
performance, the maximum temperature of operation is ~26 K (for the lowest carrier concentration) and the maximum allowed carrier concentration is ~$6.3 \times 10^{10}$ cm$^{-2}$ (at the lowest temperature). For “good” performance, the maximum temperature of operation is 7 K, and the maximum allowed carrier concentration is ~$1.6 \times 10^{10}$ cm$^{-2}$. Two-dimensional interferometers are therefore not a judicious choice for device application in the diffusive regime. Of course, in the ballistic regime, 2-D interferometers can perform much better and have the additional advantage over 1-D structures in that the transconductance of the device can be made arbitrarily large by increasing the transverse width of the structure.

We now discuss 1-D structures. In such systems, diffusive motion of carriers (random walk) is inhibited. Hence the only source of a spread in the transit time is the thermal smearing of the electron distribution. For a one-dimensional structure, we then have
$$\zeta = \frac{\Delta(\tau_t)}{\bar{\tau}_t} = \frac{\Delta v_d}{\bar{v}_d}, \tag{IV.43}$$
where $\Delta v_d$ is the spread in the drift velocity arising from the thermal spread in energy. For a non-degenerate carrier distribution and sufficiently high temperature, we adopt the thermionic emission model. This gives
Therefore,
(IV.44)
One-dimensional structures, therefore, cannot perform satisfactorily with non-degenerate carrier concentrations at elevated temperature. For degenerate carrier concentrations, $\bar{v}_d = v_F$ (the Fermi velocity). Hence,
(IV.45)
In contrast to the case of 2-D structures, $\zeta$ for 1-D structures does not depend on the bias and improves with increasing carrier concentration. In Fig. 29, we show the performance of 1-D structures for various temperatures and (degenerate) carrier concentrations. Unlike in the case of 2-D structures, there is a region of “excellent” performance. We find that one can expect “fair” performance at liquid nitrogen temperature if the carrier concentration exceeds $2.5 \times 10^5$ cm$^{-1}$, “good” performance
FIGURE 29. Performance diagram for one-dimensional GaAs electrostatic Aharonov-Bohm interferometers operating in the diffusive regime. There are regions of “fair”, “good”, and “excellent” performance. “Good” to “excellent” performance can be expected at 77 K for practical carrier concentrations around $10^6$ cm$^{-1}$. After Bandyopadhyay and Porod (1989); reprinted with permission.
if it exceeds $5 \times 10^5$ cm$^{-1}$, and “excellent” performance if it exceeds $2.5 \times 10^6$ cm$^{-1}$. However, as the carrier concentration increases, electron-electron scattering (which is an inelastic mechanism) becomes more frequent and the inelastic mean free path (the phase-breaking length) becomes shorter, which requires that the device size be shrunk concomitantly. Nevertheless, for a carrier concentration of ~$10^6$ cm$^{-1}$, which can be obtained with present-day technology, the conductance modulation over the first half-cycle can be greater than 90% in a 1-D structure at 77 K (Bandyopadhyay and Porod, 1989). A 1-D structure with this carrier concentration can therefore perform admirably as a switching transistor, as long as its length is shorter than the inelastic mean free path. The inelastic scattering time in heavily doped semiconductor wires has been reported to exceed 5 ps at 4.2 K (Ishibashi et al., 1987). Therefore, assuming a $T^{-1/2}$ dependence of the inelastic scattering time on temperature (Santhanam et al., 1984), we expect the phase-breaking length to exceed 0.3 μm for a carrier concentration of $10^6$ cm$^{-1}$ at 77 K. Delineation of this feature size is well within the capability of electron beam and other types of lithography.

Although 1-D structures have the potential for excellent performance at 77 K, there are other critical design issues that must be addressed before such performance can be expected. Perhaps the most critical issue is the role of the contacts. The contacts must infuse and extract carriers from the
two interfering paths with a definite phase relationship so that $\gamma$ given by Eq. (IV.20c) is zero or at least a constant. In Section IV,C, we achieved this by selectively populating only the symmetric state in the end region using etching and regrowth. An easier way to approximately realize this is to place the interfering paths in close physical proximity and reduce the carrier concentration in the contacts as much as possible.¹⁸ Lightly doped contacts can be realized by Si implantation (for GaAs-AlGaAs structures) (Williams, 1987).

In addition, the contact geometry must also be such that the contacts are “transparent” to the electrons. Otherwise, an electron will suffer many reflections back and forth between the contacts before it finally exits the structure. Multiple reflections have two deleterious effects. First, since the transit time through the structure increases proportionately with the number of reflections, the spread in the transit time also increases, which in turn reduces the conductance modulation. Second, the dwell time of an electron within the structure increases, and this enhances its chances of encountering a phase-randomizing inelastic collision. The geometry of the structure is therefore a critical consideration in the design. Semiconductor ring structures, which are conventionally used for experiments (Timp et al., 1987), are a poor design in this respect because the radius of curvature of the ring is usually comparable to the de Broglie wavelength of carriers, so that multiple reflections between the leads (contacts) can be severe. Alternate structures that do not have sharp bends and curvatures are possibly a better choice for device configurations. One such suitable quantum wire structure is discussed in the next section.
A quantitative analysis (Bandyopadhyay and Porod, 1989) has indicated that such a structure, with feature sizes of 0.25 μm and a carrier concentration of 10⁶ cm⁻¹, is capable of producing ~90% conductance modulation due to the electrostatic Aharonov-Bohm effect at liquid nitrogen temperature.

¹⁸ It can be shown that the maximum uncertainty in the phase γ is k_F d, where k_F is the Fermi wavevector in the injecting contacts and d is the separation between the wells.

1. Double Quantum Wire Interferometers for Possible Elevated Temperature (77 K) Operation

FIGURE 30. Proposed structure for a double quantum wire electrostatic Aharonov-Bohm interferometer for possible 77 K operation. After Bandyopadhyay and Porod (1989); reprinted with permission.

In this section, we discuss a possible realization of a double quantum wire electrostatic Aharonov-Bohm interferometer that could operate at the relatively balmy temperature of 77 K. The proposed structure is shown in Fig. 30. It consists of a single undoped GaAs quantum well sandwiched between intrinsic AlGaAs layers. After etching a narrow V-groove through the quantum well by focused ion beam milling or electron beam exposure, an n+-AlGaAs layer is regrown on the etched surface. These steps can all be performed in ultrahigh vacuum without ever breaking the vacuum. The process of “etching and regrowth” is certainly a difficult step, but it has been demonstrated (Chang and Kroemer, 1984). Following successful regrowth, two parallel, closely spaced “quantum wires” will form as accumulation layers in the GaAs quantum well if spatial transfer of charges from the n+-AlGaAs layer to the GaAs layer takes place. Even if the spatial transfer does not occur, there may still be enough carriers in the channel generated by positively charged interface states. The mobility of these carriers will be poor, but the mobility is not important in this case. The only major problem may arise because of Fermi level pinning. If the Fermi level gets pinned inside the bandgap, the wires will be depleted of carriers. This problem does not arise in InAs systems. There have been reports of inversion layers forming under natively grown oxides on InAs (Baglee et al., 1980). It may be advantageous to replace the GaAs quantum well with an InAs quantum well, since then one merely has to grow a native oxide on the
etched surface of the V-groove to generate the carriers. This is much easier than effecting spatial transfer of charges across the V-groove surface. Another problem may arise because of electron localization effects. It is important to ensure that the device does not operate in the regime of strong localization. This can be ensured by making the length of the structure shorter than the localization length. The quantum wires formed at the surface of the V-groove can be contacted by either Au-Ge alloying or by Si implantation (Williams, 1987). The latter is preferable because it creates rather lightly doped contacts, and this has an advantage, as previously mentioned. The electrostatic potential inducing the Aharonov-Bohm effect is applied between two gate pads (see the top view in Fig. 30). We reproduce here the analysis due to Bandyopadhyay and Porod (1989) which resulted in the prediction of 77 K operation for this structure. For purposes of analysis, the proposed structure is schematically represented as shown in Fig. 31. The two-terminal conductance of this structure for small applied bias is given by the usual prescription
$$G = \frac{e^2}{2hkT}\int dE\,\left|T_{\mathrm{total}}(E)\right|^2\,\operatorname{sech}^2\!\left[\frac{E - E_F}{2kT}\right]. \tag{IV.46}$$
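As an illustration of how the thermal average in Eq. (IV.46) can be evaluated numerically, the following is a minimal sketch using a trapezoidal rule. The function name, the integration window of ±30 kT around E_F, and the number of steps are illustrative assumptions, not taken from the original analysis; `transmission_prob` stands for |T_total(E)|² and must be supplied by the user.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge (C)
PLANCK_H = 6.62607015e-34    # Planck constant (J s)
K_BOLTZ = 1.380649e-23       # Boltzmann constant (J/K)


def thermal_conductance(transmission_prob, fermi_energy_j, temperature_k,
                        half_width_kt=30.0, steps=4000):
    """Evaluate G = e^2/(2hkT) * Int dE |T(E)|^2 sech^2[(E - E_F)/(2kT)].

    transmission_prob: callable returning |T_total(E)|^2 for E in joules.
    The integral is truncated to E_F +/- half_width_kt * kT (an assumption).
    """
    kt = K_BOLTZ * temperature_k
    e_min = fermi_energy_j - half_width_kt * kt
    e_max = fermi_energy_j + half_width_kt * kt
    de = (e_max - e_min) / steps
    total = 0.0
    for i in range(steps + 1):
        energy = e_min + i * de
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal end weights
        kernel = 1.0 / math.cosh((energy - fermi_energy_j) / (2.0 * kt)) ** 2
        total += weight * transmission_prob(energy) * kernel
    return E_CHARGE ** 2 / (2.0 * PLANCK_H * kt) * total * de


# Sanity check: perfect transmission should give the quantum 2e^2/h.
G0 = 2.0 * E_CHARGE ** 2 / PLANCK_H
g_perfect = thermal_conductance(lambda e: 1.0, 10e-3 * E_CHARGE, 77.0)
```

For an energy-independent transmission the thermal kernel integrates to 4kT, so the expression reduces to |T|² × 2e²/h; this provides a quick check on any implementation before an energy-dependent transmission is inserted.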
To find the transmission probability T(E) through the entire structure, one needs to cascade three scattering matrices (Anderson, 1981; Shapiro, 1983; Datta, 1989a) representing propagation from the left contact region to the interfering paths, propagation along the paths, and propagation from the paths to the right contact. The first and the last of these scattering matrices (for junctions A-B and C-D; see Fig. 31) can, in principle, be found exactly by matching the wavefunctions and their derivatives along the junction between the contacts and paths (Frohne and Datta, 1988).
FIGURE 31. Schematic representation of the double quantum wire structure in Fig. 30 showing the pertinent scattering matrices. After Bandyopadhyay and Porod (1989); reprinted with permission.
However, for simplicity, we will represent these scattering matrices by the so-called Shapiro matrix (Shapiro, 1983; Büttiker, 1985b):
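$$\begin{pmatrix} A^- \\ B_1^+ \\ B_2^+ \end{pmatrix} = \begin{pmatrix} -(a+b) & \sqrt{\varepsilon} & \sqrt{\varepsilon} \\ \sqrt{\varepsilon} & a & b \\ \sqrt{\varepsilon} & b & a \end{pmatrix} \begin{pmatrix} A^+ \\ B_1^- \\ B_2^- \end{pmatrix}, \tag{IV.47}$$

where

$$a = \tfrac{1}{2}\left(\sqrt{1 - 2\varepsilon} - 1\right), \tag{IV.48}$$

$$b = \tfrac{1}{2}\left(\sqrt{1 - 2\varepsilon} + 1\right). \tag{IV.49}$$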
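The current-conserving property of the matrix in Eqs. (IV.47)-(IV.49) is easy to verify numerically. The sketch below builds the real 3 × 3 Shapiro matrix for a given ε and checks that its rows are orthonormal. The port ordering (contact first, then the two paths) and the function names are assumptions made for illustration.

```python
import math


def shapiro_matrix(eps):
    """Real Shapiro matrix for coupling parameter eps (0 <= eps <= 0.5).

    Port order (assumed): contact, path 1, path 2.
    """
    if not 0.0 <= eps <= 0.5:
        raise ValueError("eps must lie between 0 and 0.5")
    root = math.sqrt(1.0 - 2.0 * eps)
    a = 0.5 * (root - 1.0)
    b = 0.5 * (root + 1.0)
    s = math.sqrt(eps)
    return [[-(a + b), s, s],
            [s, a, b],
            [s, b, a]]


def is_unitary(matrix, tol=1e-12):
    """Check row orthonormality; sufficient for a real square matrix."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            dot = sum(matrix[i][k] * matrix[j][k] for k in range(n))
            target = 1.0 if i == j else 0.0
            if abs(dot - target) > tol:
                return False
    return True


# eps = 0.5: the contact reflection element -(a + b) vanishes.
perfect = shapiro_matrix(0.5)
```

At ε = 0.5 the element −(a + b) is zero, which is the "no end-reflection" case; as ε decreases toward zero the contact decouples from the paths and the reflection element approaches −1.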
The amplitudes A and B are defined in Fig. 31. The superscript + refers to waves traveling from left to right, and − refers to waves traveling in the opposite direction. In the Shapiro matrix, ε represents the probability of transmission from the contact into any one path; ε = 0.5 corresponds to perfect transmission (no end-reflection). The Shapiro matrix implicitly assumes that the transmissions into the two paths from the contact are equal in both magnitude and phase. The latter condition is more difficult to meet in practice, but if the carrier concentration in the contacts is not too large and the channels are physically very close, so that the separation between them is comparable to the de Broglie wavelength in the contacts, then some degree of phase coherence in the injection and detection process can be expected. This consideration makes it necessary to make the V-groove in the proposed structure very narrow, so that the wires are closely spaced, and to have the contacts defined by Si implantation rather than by Au-Ge alloying. The transmission amplitudes for propagation from junction B to C are represented by $t_{1,2}$ for left-to-right propagation along paths 1 and 2 and by $t'_{1,2}$ for reverse propagation along these paths. For single-moded structures, the task of cascading these scattering matrices is relatively simple and can be performed analytically to yield closed-form expressions for $T_{\mathrm{total}}$ in the presence of an external electric field. For the electrostatic effect, the transmission probability (assuming strictly single-channeled transport) is given by (Cahay et al., 1989)
where φ is the electrostatic AB phase shift given by Eq. (IV.8). Note that the transmission reaches zero under two distinct values of the potential difference ΔV inducing the phase shift. They correspond to the conditions

…

where kL is the phase of $t_1$ and $t'_1$. This means that, unlike the magnetostatic effect, the conductance oscillations due to the electrostatic effect have two components (with the same period) that are displaced in phase by 2kL. As a result, one would observe twice as many nulls in the conductance oscillations as one would naively expect from the AB effect. The origin of the second set of nulls has been discussed by Cahay et al. (1989).

To examine the conductance modulation in an electric field, the integral in Eq. (IV.46) was performed numerically, and the results are shown in Fig. 32 (at T = 77 K) for various values of ε.¹⁹ Note that ε = 0.5 corresponds to perfect transmission from the contact into the paths, and ε = 0.1 corresponds to 10% transmission probability. In the calculation, we neglected any dependence of ε on the electron’s wavevector. The carrier concentration was assumed to be 10⁶ cm⁻¹, the length of the structure was 0.25 μm, and the material was GaAs. For this structure, the “correlation temperature” [recall Eq. (II.6)] is 8 K. For a 1,000 Å long structure with a carrier concentration of 4 × 10⁶ cm⁻¹, the correlation temperature is about 77 K. For the sake of comparison, we have also shown the conductance modulation of such a structure in Fig. 33 for ε = 0.5.

¹⁹ We do not resolve the second set of nulls in this figure, since 2kL = π.

FIGURE 32. The transfer characteristics for the double quantum wire structure for various values of ε. The carrier concentration is 10⁶/cm and the length is 0.25 μm, which gives a correlation temperature of 8 K. [Horizontal axis: potential V (mV), 0 to 20.] After Bandyopadhyay and Porod (1989); reprinted with permission.

FIGURE 33. The transfer characteristics for a structure with length 1,000 Å and carrier concentration of 4 × 10⁶/cm. The corresponding correlation temperature is 77 K. [Horizontal axis: potential V (mV), 0 to 210; vertical axis: 0 to 1.0.] After Bandyopadhyay and Porod (1989); reprinted with permission.

Several interesting features are found in Fig. 32. The conductance modulation (for the first half-cycle) is greater than 90% at 77 K for all values of ε. This is promising for switching transistor applications. For such applications, it is only the first half-cycle of the oscillation that is important, since all that is required is to switch the conductance between the on and off states. Consequently, a 90% conductance modulation over the first half-cycle is encouraging. Another interesting feature is the decay of the oscillations with increasing values of the electrostatic potential. This happens because the uncertainty in the phase shift for a fixed potential V is given by Δφ = VΔ⟨τ_t⟩, and this is proportional to the potential itself. At low values of the potential, the uncertainty Δφ is small and the conductance modulation is large. At higher values of the potential, the uncertainty increases, thereby decreasing the conductance modulation. Perhaps the most interesting feature in Fig. 32 is the effect of ε, or the role of multiple reflections. If ε is small (large end-reflections), only the first half-cycle of the oscillation is discernible and the later cycles are not. This will make it impossible to detect the presence of the Aharonov-Bohm oscillations in a direct experiment if the test structure is not cleverly designed to eliminate such reflections. An explanation for this role of multiple reflections is the following. If an electron suffers many reflections back and forth between the contacts before it exits the structure, its
effective transit time increases. This, in turn, increases the spread in the transit time, thereby increasing the uncertainty in the phase-shift for a given potential. As a result, the interference effect “dies off” much more rapidly in the presence of multiple reflections. There is, of course, another harmful effect of multiple reflections, namely that an increase in the effective transit time may cause it to exceed the inelastic scattering time and the interference effect may be summarily destroyed by inelastic scattering. We had discussed this in the previous subsection. For experiments designed to demonstrate the electrostatic Aharonov-Bohm oscillations, it is important to minimize multiple reflections. Before concluding this section, we would like to leave the reader with the impression that one-dimensional electrostatic Aharonov-Bohm interferometers are a real possibility and may even work at 77 K. However, this vision must be tempered by the realization that these interferometers are extremely difficult to fabricate because the specifications are very demanding, and experimental realization may be several years away.
V. T-STRUCTURE TRANSISTORS
In this section, we discuss a different electron wave device that also has a well-known microwave analogue: the microwave stub-tuned T-junction. This device was first conceptualized by Datta (1988a,b) and independently by Fowler (1988) and Sols et al. (1989a,b). Preliminary experimental studies were reported by Miller et al. (1989) and Aihara et al. (1992). The archetypal quantum interference transistor consists of a T-shaped semiconductor structure with three terminals, as shown in Fig. 34. It can be delineated lithographically by patterning a T-shaped mesa (on a modulation-doped heterostructure or a quantum well) with longitudinal dimensions (i.e., dimensions along current flow) smaller than the phase-coherence length of electrons, and transverse dimensions smaller than the Fermi-de Broglie wavelength of electrons. At low enough temperatures, only the lowest subband is occupied everywhere in the structure, so that each limb behaves as a true quantum wire or a “single-moded electron waveguide.” Its operation as a transistor is elucidated next.

The T-shaped waveguide has three ports, which we term (in conventional device parlance) the source (S), the drain (D), and the gate (G). The first two terminals are ohmic contacts, and the last is a Schottky contact. A negative dc potential applied at the gate port will deplete a portion of the semiconductor under the gate terminal, thereby effectively controlling the conducting length of the gate arm. This is equivalent to inserting or withdrawing a “stub” (stub-tuning) in a microwave T-junction, which modulates the transmission between the other two ports, namely the source and the drain. It has been shown (Datta, 1988a) that modulating the potential at the gate indeed modulates the transmission (and hence the current) from the source to the drain, which realizes the transistor action.

FIGURE 34. A T-structure transistor showing the source, drain, and gate. The figure also shows the two primary Feynman paths whose interference results in the modulation of the current. The interference can be controlled by the gate potential, which varies the depletion layer width under the gate (shaded region) and therefore the path length of one of the paths. This realizes the transistor action. After Subramaniam et al. (1990); reprinted with permission.

A. Analysis of T-Structure Transistors
We will briefly review the analysis of two T-structure transistor designs that were discussed by Subramaniam et al. (1990). They are depicted schematically in Figs. 35a and 35b. To understand the operation of the structure in Fig. 35a, one may assume that the source-to-drain transmission (or current) is controlled by the interference of two primary paths, as shown by the broken lines. Of course, there are numerous other paths that are multiply reflected between the source and drain, but they are not important in the basic operation. They only cause harmonic distortion in the conductance oscillations. Note that the length of the longer path (and hence the phase difference between the two paths) is controlled by the width of the depletion layer underneath the gate, which depends on the gate voltage. Changing the gate voltage will therefore modulate the phase difference and the interference. This will control the current at the drain terminal, thereby realizing transistor action. Just as in the case of the Aharonov-Bohm interferometers, where we required that the transmission along the two branches be equal in magnitude [this is equivalent to the condition that |a| = |b|, where these quantities were defined in Eq. (IV.20)], we require here that the transmission probabilities along the two paths in Fig. 35a be as close as possible. This is ensured by placing the source between the gate and the drain (so that both paths have to bend around corners) and by placing the source much closer to the gate than to the drain.

FIGURE 35. Two possible designs for the T-structure transistor: (a) a single-gate transistor, and (b) a double-gate transistor. In the latter case, the second gate is maintained at a fixed potential and merely reflects incoming electrons. The first gate creates a potential barrier of variable height and width underneath the strip, which affects the phase of the Feynman path that is reflected by the gate. The figure also shows the conduction band edge profiles E_c(x). After Subramaniam et al. (1990); reprinted with permission.

The structure in Fig. 35b operates somewhat differently. Here the gate voltage which modulates the drain current is not applied at the gate
termination, but is instead applied to a Schottky metal strip which creates an electrostatic potential barrier of variable height underneath the strip. This modulates the electron’s wavevector k and the phase shift kL_G under the gate strip (L_G is the width of the gate strip). This controls the interference effect. This structure has the advantage that the current can be modulated by a much lower gate voltage than in the structure of Fig. 35a. To analyze the structures, we start with the scattering matrix for a three-port network similar to the one introduced in Section IV,D,1. We assume that the transmission probabilities of the two Feynman paths discussed in the previous paragraph are equal. This is equivalent to assuming that the source-to-drain transmission is equal to the source-to-gate transmission, if the gate is perfectly reflecting. If this assumption holds, then without any loss of generality we can represent the scattering matrix of the structure by a generalized Shapiro matrix. The difference between the generalized matrix and the original Shapiro matrix of Section IV,D,1 is that in the generalized matrix, we allow the matrix elements ε, a, and b to assume complex values instead of just real values. The requirements of current conservation and time-reversal symmetry mandate that the generalized Shapiro matrix be unitary. This gives rise to the following relations between the elements of the matrix:
$$b - a = e^{i\alpha}, \tag{V.1}$$

where α is the phase of the element a in the Shapiro matrix and obeys the inequality

… (V.4)

Equations (V.1)-(V.4) were derived by Subramaniam et al. (1990). If the wave amplitudes G⁺ and G⁻ at the gate port are related according to

$$G^+ = R\,G^-, \tag{V.5}$$

where R is the reflection coefficient at the gate, then the total transmission t from the source to the drain (which determines the source-to-drain current) is given by (Datta, 1988a)

$$t = \sqrt{\varepsilon} + b\,\sqrt{\varepsilon}\,\frac{R}{1 - Ra}. \tag{V.6}$$

Using Eq. (V.1) to relate b to a, one obtains

$$t = \sqrt{\varepsilon}\,\frac{1 + Re^{i\alpha}}{1 - Ra}. \tag{V.7}$$
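The equivalence of the two expressions for t can be checked numerically. The following sketch is illustrative only: it assumes, as one convenient parametrization satisfying b − a = e^{iα}, that a and b are the real Shapiro elements multiplied by a common phase factor e^{iα}; this choice and the function names are not taken from Subramaniam et al. (1990).

```python
import cmath
import math


def generalized_elements(eps, alpha):
    """Return (a, b) with a common phase alpha (assumed parametrization).

    By construction b - a = exp(i*alpha), as required by Eq. (V.1).
    """
    root = math.sqrt(1.0 - 2.0 * eps)
    phase = cmath.exp(1j * alpha)
    return 0.5 * (root - 1.0) * phase, 0.5 * (root + 1.0) * phase


def t_from_gate_reflection(eps, alpha, reflection):
    """Direct path plus the path reflected at the gate: sqrt(eps) + b sqrt(eps) R / (1 - R a)."""
    a, b = generalized_elements(eps, alpha)
    s = math.sqrt(eps)
    return s + b * s * reflection / (1.0 - reflection * a)


def t_closed_form(eps, alpha, reflection):
    """Closed form after eliminating b: sqrt(eps) (1 + R e^{i alpha}) / (1 - R a)."""
    a, _ = generalized_elements(eps, alpha)
    return (math.sqrt(eps) * (1.0 + reflection * cmath.exp(1j * alpha))
            / (1.0 - reflection * a))


# Example: eps = 0.5, alpha = pi, perfectly reflecting gate with phase theta = pi.
t_example = t_closed_form(0.5, math.pi, cmath.exp(1j * math.pi))
```

With ε = 0.5 and α = π, the transmission probability |t|² swings between 0 (at θ = 0) and 8/9 (at θ = π) as the phase θ of a unit-magnitude R is varied, which illustrates the gate-controlled modulation described in the text.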
It is reasonable to assume that the reflection coefficient R has a magnitude of unity, since the gate impedance is infinitely high. The phase θ of the reflection coefficient is related to the path length of the reflected path and depends on the gate voltage. For the structure in Fig. 35a, it is given by

… (V.8)

where L_d(V_G) is the gate-voltage-dependent depletion width under the gate. For a Schottky gate, it is given by

… (V.9)

where V_bi is the Schottky barrier height at the gate, N_c is the carrier concentration, κ is the dielectric constant, and V_G is the gate potential. For the structure in Fig. 35b,

… (V.10)

where L_G is the width of the gate strip and k(V_G) is the gate-voltage-dependent electron wavevector under the gate strip. Substituting R = e^{iθ} in Eq. (V.7), one finally obtains an expression for the total transmission t through the structure:

$$t = \sqrt{\varepsilon}\,\frac{1 + e^{i(\theta + \alpha)}}{1 - a\,e^{i\theta}}. \tag{V.11}$$

Note that the transmission t is a function of the electron energy E as well as the gate voltage V_G, since θ is a function of both these quantities. From the transmission t, we can calculate the source-to-drain current I_SD as a function of the gate voltage V_G for various source-to-drain voltages V_SD:
$$I_{SD} = \frac{2|e|}{h}\int dE\,\left|t(E, V_G)\right|^2\left[f(E) - f(E + eV_{SD})\right]. \tag{V.12}$$
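A minimal numerical sketch of the current integral in Eq. (V.12) is given below. Several ingredients are assumptions made purely for illustration and are not taken from the source: the reflected-path phase is taken as θ = 2k(L − L_d) with a hypothetical gate-arm length L, the depletion width is the textbook Schottky expression, the subband is parabolic with the GaAs effective mass, and a and b share a common phase α. All function and parameter names are illustrative.

```python
import cmath
import math

E_CHARGE = 1.602176634e-19          # elementary charge (C)
PLANCK_H = 6.62607015e-34           # Planck constant (J s)
HBAR = PLANCK_H / (2.0 * math.pi)
K_BOLTZ = 1.380649e-23              # Boltzmann constant (J/K)
EPS0 = 8.8541878128e-12             # vacuum permittivity (F/m)
M_EFF = 0.067 * 9.1093837015e-31    # GaAs effective mass (kg), assumed


def depletion_width(v_gate, v_bi, n_volume, kappa=12.9):
    """Textbook Schottky depletion width in metres (assumed form)."""
    return math.sqrt(2.0 * kappa * EPS0 * (v_bi - v_gate) / (E_CHARGE * n_volume))


def t_structure_prob(energy, v_gate, arm_length, v_bi, n_volume,
                     eps=0.5, alpha=math.pi):
    """|t|^2 with R = exp(i*theta) and theta = 2k(L - L_d) (assumed form)."""
    k = math.sqrt(2.0 * M_EFF * energy) / HBAR
    theta = 2.0 * k * (arm_length - depletion_width(v_gate, v_bi, n_volume))
    a = 0.5 * (math.sqrt(1.0 - 2.0 * eps) - 1.0) * cmath.exp(1j * alpha)
    r = cmath.exp(1j * theta)
    t = math.sqrt(eps) * (1.0 + r * cmath.exp(1j * alpha)) / (1.0 - r * a)
    return abs(t) ** 2


def fermi(energy, mu, kt):
    """Fermi-Dirac occupation, guarded against overflow."""
    x = (energy - mu) / kt
    if x > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def drain_current(trans_prob, v_sd, fermi_energy, temperature, steps=4000):
    """I_SD = (2e/h) Int dE |t|^2 [f(E) - f(E + e V_SD)] by the trapezoidal rule."""
    kt = K_BOLTZ * temperature
    e_min = max(fermi_energy - E_CHARGE * v_sd - 25.0 * kt, 0.0)
    e_max = fermi_energy + 25.0 * kt
    de = (e_max - e_min) / steps
    total = 0.0
    for i in range(steps + 1):
        energy = e_min + i * de
        weight = 0.5 if i in (0, steps) else 1.0
        window = (fermi(energy, fermi_energy, kt)
                  - fermi(energy + E_CHARGE * v_sd, fermi_energy, kt))
        total += weight * trans_prob(energy) * window
    return 2.0 * E_CHARGE / PLANCK_H * total * de


# Illustrative numbers: linear density 10^7 cm^-1, volume density 10^19 cm^-3.
K_F = math.pi * 1.0e9 / 2.0                       # m^-1, spin-degenerate subband
E_F = (HBAR * K_F) ** 2 / (2.0 * M_EFF)           # J, parabolic-band estimate
i_max = drain_current(lambda e: 1.0, 0.01, E_F, 4.2)
i_gate = drain_current(
    lambda e: t_structure_prob(e, -0.1, 1.0e-7, 0.8, 1.0e25), 0.01, E_F, 4.2)
```

For unit transmission the integral reduces to (2e²/h)V_SD, which is a convenient upper bound against which any gate-modulated current computed this way can be compared.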
It is obvious that the current, and therefore the linear response conductance between the source and drain, will oscillate as the gate voltage is ramped, since |t| has an oscillatory dependence on θ, which depends on the gate voltage. The oscillation is not sinusoidal because of harmonic distortion caused by multiple reflections. Moreover, the oscillation will not be periodic in the gate voltage, since θ does not depend linearly on the gate voltage.

B. Sensitivity of the Device Characteristics to Structural Dimensions: Implications for Integrated Circuit Implementation
In this section, we examine the dependence of the conductance oscillations on various elements of the Shapiro matrix that characterize the structure. From Eqs. (V.2), (V.3), (V.11), and (V.12), we find that the linear response conductance G_SD (= I_SD/V_SD), at any value of θ, depends on two basic parameters: ε and α. Note that ε is a measure of how easily the source can inject electrons into the structure. In other words, it is a measure of the source “transparency.” The parameter α, on the other hand, is the phase of the internal reflection of an electron at the gate termination. Both ε and α depend on the electron’s wavevector or energy, as well as the precise dimensions and geometry of the structure. We are interested in how these parameters affect the oscillation of the source-to-drain conductance as the gate voltage is varied. In Fig. 36 we show the amplitude of the source-to-drain conductance oscillation (which is the maximum “on-conductance” of the device) as a function of α and ε, reproduced from Subramaniam et al. (1990). What is immediately apparent is the extreme sensitivity of G_SD to the structural parameter α. The on-conductance G_SD varies by almost 100% when α varies over a range of π/2. This can have a catastrophic effect in integrated circuits. The parameter α, which depends on the precise dimensions of the T-structure and the electron wavevector, can vary significantly across a wafer. Consequently, different devices on a wafer will exhibit widely different behavior. For a typical carrier concentration of 1 × 10⁶/cm, the Fermi wavevector is 1.57 × 10⁶/cm. Therefore, a variation of just ~50 Å in the dimensions of the structure alone can cause α to vary by ±π/2. This then causes the on-conductance (and hence the output current for individual devices) to vary by almost 100%, which renders integrated circuit implementation impossible. To prevent this, one needs unprecedented tolerance, namely control to within 50 Å in individual device dimensions. Control at this scale is unattainable unless the entire wafer is patterned by scanning tip lithography (Van Loenen et al., 1989), which, however, is unlikely ever to emerge as a mass production tool because it is a direct-write technique. The extreme sensitivity of device characteristics to structural parameters is a serious drawback for some quantum devices and may at present preclude their application in integrated circuits. The T-structure transistor is a pathological example of this unfortunate aspect. However, we do not wish to appear as doomsayers. This is a problem that seems to exist at present, but may gradually disappear with improvements in fabrication technology. Nonetheless, it is necessary to point out this aspect in order to present a balanced and honest view rather than that of an over-enthusiastic zealot.

FIGURE 36. The maximum source-to-drain conductance G_SD(max) (in the linear response regime) as a function of ε and α. After Subramaniam et al. (1990); reprinted with permission.
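The arithmetic behind the 50 Å tolerance estimate can be checked directly. The sketch below assumes a spin-degenerate single subband, so that k_F = π n_l/2 (which reproduces the Fermi wavevector quoted in the text), and assumes that a change ΔL in the gate-arm length changes the reflected-path phase by 2k_F ΔL because the arm is traversed twice. The function names are illustrative.

```python
import math


def fermi_wavevector_1d(n_linear_per_cm):
    """Fermi wavevector (cm^-1) of a spin-degenerate single 1D subband."""
    return math.pi * n_linear_per_cm / 2.0


def round_trip_phase_change(k_per_cm, delta_length_angstrom):
    """Phase change (rad) when the reflected path lengthens by 2 * delta_length."""
    delta_length_cm = delta_length_angstrom * 1.0e-8
    return 2.0 * k_per_cm * delta_length_cm


k_f = fermi_wavevector_1d(1.0e6)                 # about 1.57e6 cm^-1
delta_alpha = round_trip_phase_change(k_f, 50.0)  # about pi/2
```

Under these assumptions a 50 Å change in length shifts the phase by π/2, the range over which the on-conductance is stated to vary by almost 100%.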
C. Device Characteristics of a Single Transistor

In this section, we examine the device characteristics of a single discrete transistor as calculated by Subramaniam et al. (1990). In all calculations, the cross-sectional area of the quantum wires, A_c, is assumed to be 100 Å × 100 Å. Figures 37a and 37b show the transfer characteristics (drain current vs. gate voltage) for the structures in Figs. 35a and 35b, respectively. The ambient temperature is assumed to be 4.2 K, and the material is GaAs. These curves were obtained directly from Eq. (V.12). The drain bias V_SD is assumed to be 10 millivolts. At this bias, a ballistic electron arriving at the drain has an excess kinetic energy of 10 meV, which, although well below the threshold for polar optical phonon emission, will still raise the electron temperature to ~115 K. At this temperature, significant electron-electron scattering (which is the dominant phase-randomizing inelastic process in these structures) can occur. The mean time between electron-electron collisions in one-dimensional structures depends inversely on the square root of temperature. Electron-electron scattering times of 5 ps have been measured in relatively heavily doped GaAs quantum wire structures at 4.2 K (Ishibashi et al., 1987). We therefore expect the scattering time at a temperature of 115 K to be ~… s. For a carrier concentration of 10⁷ cm⁻¹, the electron-electron scattering mean free path is ~1,000 Å at a drain bias of 10 mV. This feature size is at the limit of present-day lithographic capability, so that realistically, 10 millivolts is about the largest drain bias that can be applied in these structures.

FIGURE 37. Drain current versus gate voltage for T-structure transistors: (a) for the single-gate structure in Fig. 35a, and (b) for the double-gate structure in Fig. 35b. In (a), we have shown the characteristics for three different carrier concentrations. The ambient temperature is 4.2 K, and the scattering matrix parameters are ε = 0.5 and α = π. After Subramaniam et al. (1990); reprinted with permission.

Returning to Fig. 37a, we find two salient features. First, the gate voltage required to induce one cycle of the oscillation decreases with increasing carrier concentration N_c. This behavior is opposite to that of electrostatic Aharonov-Bohm interferometers. The origin of this is easily understood from Eqs. (V.8)-(V.10). To induce one cycle of the oscillation, the phase shift θ has to be changed by 2π. We find from Eqs. (V.8)-(V.10) that the quantity ∂θ/∂V_G increases as √N_c, since the wavevector k increases linearly with N_c while L_d(V_G) decreases as ~1/√N_c. Consequently, a smaller gate voltage is required to induce a 2π change in θ if N_c is larger.

The second and perhaps the more important feature in Fig. 37a is that the peak-to-valley ratio of the drain current increases with increasing carrier concentration. This can be understood as follows. The current modulation in this structure is due to quantum interference and therefore depends critically on how rigidly the phase shift θ can be controlled. The phase shift θ depends on the electron energy. At nonzero temperatures, the thermal spread in the electron energy introduces a spread Δθ in the phase shift. If Δθ is large, the interference effect is washed out by thermal ensemble averaging. The smaller the value of Δθ at a given temperature, the stronger the interference effect and the larger the conductance modulation or the transconductance. From Eq. (V.8), we find that for the structure in Fig. 35a,
Δθ = … (V.13)

where ΔE is the thermal spread in energy (≅ kT) and n_l is the electron concentration per unit length. For the structure in Fig. 35b, Eq. (V.10) gives the following relation:

Δθ = … (V.14)
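The dependence of the thermal phase spread on carrier concentration can be made plausible with a short estimate for the single-gate structure. The following is only a sketch under stated assumptions and is not a reproduction of Eq. (V.13): it assumes a reflected-path phase of the form θ = 2k(L − L_d), with L a hypothetical gate-arm length, a parabolic subband with effective mass m*, and a spin-degenerate single subband so that k_F = π n_l/2.

```latex
\begin{align*}
E = \frac{\hbar^2 k^2}{2m^*}
  \quad &\Rightarrow \quad
  \Delta k \simeq \frac{m^*\,\Delta E}{\hbar^2 k_F}, \\[4pt]
\Delta\theta \simeq 2\,(L - L_d)\,\Delta k
  &= \frac{2\,m^*\,(L - L_d)\,\Delta E}{\hbar^2 k_F}
   = \frac{4\,m^*\,(L - L_d)\,\Delta E}{\pi\,\hbar^2\,n_l}.
\end{align*}
```

Under these assumptions the phase spread scales as ΔE/n_l, so that at a fixed temperature a larger linear carrier concentration gives a smaller spread.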
From Eqs. (V.13) and (V.14), we can see that for both structures, Δθ decreases with increasing carrier concentration n_l. This will strike the reader as a familiar feature, since the same feature (with the same underlying physics) was found in connection with electrostatic Aharonov-Bohm interferometers (Section IV,D). A larger carrier concentration therefore gives rise to a smaller Δθ at a given temperature and provides a larger current modulation or a larger peak-to-valley ratio of the drain current. This, combined with the fact that the gate voltage required to induce one cycle of the oscillation also decreases with increasing carrier concentration, means that the transconductance g_m (= ∂I_SD/∂V_G) increases with increasing carrier concentration. It appears from Fig. 36 that at an operating temperature of 4.2 K, one requires a volume carrier concentration of 10¹⁹ cm⁻³ (which corresponds to a linear carrier concentration of 10⁷ cm⁻¹, assuming the cross-sectional area of the structure to be 100 Å × 100 Å) in order to obtain a sufficiently large transconductance. However, even at this large carrier concentration and low temperature, the actual value of the transconductance is rather small; it is only about 10^(−…) siemens. This obviously has a deleterious effect on device performance and lowers both the small-signal gain and the bandwidth significantly. We will examine the cause of the small value of the transconductance later. In Figs. 38a and 38b, we plot the drain characteristics (drain current vs. drain voltage) for the structures in Figs. 35a and 35b for various gate voltages. The most important feature to note here is that the drain characteristics do not saturate up to the maximum allowed drain voltage of 10 mV. This has very serious implications for device application.
Because of this feature, it may be argued that it is not even meaningful to specify a transconductance for this device, since the transconductance is not constant over any appreciable range of the drain voltage or output voltage swing. More importantly, it also implies that this transistor cannot provide a constant amplification over any range of the input signal. This is a serious drawback in that the output signal will be a nonlinear function of the input if this device is used as an amplifier. Devices with this property are known to cause severe signal distortion, which may preclude their use in many applications. This is actually a pathological problem with quantum devices that are constrained to operate in the linear response regime in order to avoid carrier heating. For constant signal amplification, a device must operate in the nonlinear response regime in the sense that the drain current must saturate. This can be achieved by lowering the carrier concentration in this structure (the drain characteristics do tend to saturate if the carrier concentration is less than 10¹⁷ cm⁻³ for the prototype structures), but this also lowers the transconductance drastically. Even if the drain current saturates so that the differential drain resistance r_d (= ∂V_SD/∂I_SD) is large, a small transconductance g_m would mean that the small-signal voltage gain a_v (= g_m r_d) may not be large. Therefore, a constant voltage amplification can be achieved only by sacrificing the magnitude of the voltage amplification. This is a serious dilemma.

FIGURE 38. The drain current versus drain voltage characteristics for the structures in (a) Fig. 35a and (b) Fig. 35b. The temperature is 4.2 K, ε = 0.5, and α = π. After Subramaniam et al. (1990); reprinted with permission.
D. Analog Applications

It is obvious that the T-structure transistor is not suitable for application as an analog amplifier. Yet there may be some applications for which it is suitable, or even desirable. One such application is frequency multiplication. Since the gate voltage swing required to make the drain current go through one cycle of oscillation is rather small, this transistor can be used as a single-stage frequency multiplier in the same way as an electrostatic Aharonov-Bohm transistor (see Section IV,C,1). If V_c is the gate voltage required to induce one cycle of the current oscillation, then for a gate voltage of peak-to-zero amplitude V, the frequency multiplication factor N is simply V/V_c. For a carrier concentration of 10¹⁹ cm⁻³, the threshold voltage to induce the first half-cycle of the oscillation is ≈170 mV for the structure of Fig. 35a and 240 mV for the structure of Fig. 35b, if the gate width is 100 Å. Hence, for a gate voltage amplitude of 10 volts, the frequency multiplication factor is ~20-30. This factor itself is not all that impressive, but what is impressive is that it can be realized with a single transistor. As discussed before, it is the multifunctionality property (the ability of a single device to perform the task of many) which is often most attractive. Since quantum devices are not ideally suited for use in integrated circuits or circuits with many active elements, the ability to realize the function of a multi-element circuit without using more than one element is a treasured property.
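The quoted multiplication factor can be reproduced with a one-line estimate. The sketch below assumes that the full-cycle voltage is roughly twice the quoted half-cycle threshold; this is only approximate, since the oscillation is not strictly periodic in the gate voltage. The function name is illustrative.

```python
def multiplication_factor(gate_amplitude_v, half_cycle_voltage_v):
    """Frequency multiplication factor N = V / V_cycle.

    Assumes V_cycle is about twice the half-cycle threshold voltage.
    """
    full_cycle_v = 2.0 * half_cycle_voltage_v
    return gate_amplitude_v / full_cycle_v


n_single_gate = multiplication_factor(10.0, 0.170)   # structure of Fig. 35a
n_double_gate = multiplication_factor(10.0, 0.240)   # structure of Fig. 35b
```

With a 10 V gate amplitude, these two estimates bracket the range of roughly 20 to 30 stated in the text.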
E. Digital Applications

We now examine the performance of the T-structure transistor for discrete logic applications. The non-saturating behavior of the drain characteristics poses a problem here as well, since signal restoration at logic nodes, sufficient noise margin, and sharp transitions between logic levels all require saturating (nonlinear) devices with preferably high transconductance (Ferry et al., 1989). More importantly, isolation between input and output stages also requires a rather large intrinsic device gain (we discuss this issue in more detail in Section XI). Therefore, the T-structure transistor does not seem to be a very attractive candidate for digital applications either. However, there are some special applications where a T-structure transistor may have an advantage. An example is a discrete logic switch (inverter) for digital applications. There are three basic requirements that such a switch has to satisfy: (1) the ratio of the on to off conductance must be sufficiently large so that there are well-defined logic levels; (2) the switching speed must be high; (3) the power-delay product must be low. From Fig. 36 we find that in order to satisfy the first requirement, the carrier concentration must at least equal 10¹⁹ cm⁻³. This value of the carrier concentration was used by Subramaniam et al. (1990, 1991) to calculate the switching speed and power-delay product. Their calculations are repeated here.
In calculating the switching speed, one should first recognize that there are three time constants involved in switching. The gate charges up to the threshold voltage in a time determined by the RC time constant of the gate circuit, the barrier in the device responds to the gate potential in the dielectric response time, and the drain (output) current responds on time scales of the order of the transit time through the device. Of these three time constants, the dielectric response time is much smaller than the other two, so that the switching time is determined primarily by the other two time constants. The transit time through the structure is essentially the time required to traverse the longer of the two Feynman paths in Fig. 34. Assuming this path length $L$ to be 1,000 Å, one obtains
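$$t_{\mathrm{transit}} = \frac{L}{v_F} \approx 0.1\ \mathrm{ps}, \qquad \text{(V.15)}$$
where the Fermi velocity $v_F$ is assumed to be ~$10^{8}$ cm s$^{-1}$, which is the limit set by Bragg reflection in GaAs. The RC time constant is obtained as
$$t_{RC} = \frac{C_g V_t}{I}, \qquad \text{(V.16)}$$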
where $C_g$ is the gate capacitance, $V_t$ is the threshold voltage, and $I$ is the gate charging current. It is important to note that the RC time constant depends on the threshold voltage $V_t$. The lower the value of $V_t$, the lower is the value of the RC time constant. The low values of threshold voltages in quantum devices partially offset the low values of the current $I$ that can be sustained in these devices to yield a small enough RC time constant. From Figs. 37a and 37b we find that the drain current can be changed by more than 90% if the gate voltage is changed by ≈100 mV. Hence, the threshold voltage $V_t$ ≈ 100 mV. Using this value of $V_t$ along with $C_g = 10^{-18}$ F (see footnote 20) and $I = (2e^2/h)V_{SD}(\mathrm{max}) = (2e^2/h) \times 10$ mV, one obtains an RC time constant $t_{RC}$ of ~0.15 ps. The overall intrinsic switching delay of the transistor is therefore ~0.1 ps. This is, of course, faster than the switching speeds of many ultrafast electronic devices that currently exist (Kobayashi et al., 1988; Kotani et al., 1988; Yamane et al., 1988), but one must remember that this is the theoretical limit. Actual switching speeds will be lower because of parasitic elements and interconnects. The small value of the switching delay accrues primarily from the small threshold voltage and the small gate capacitance. The gate capacitance was estimated to be $10^{-18}$ F. While such small discrete capacitors have been
Footnote 20: This value is obtained from the gate dimensions.
realized in Coulomb blockade experiments (Barner and Ruggiero, 1987), it is unlikely that when interconnects are attached to the device, the overall gate capacitance (including that due to the interconnects) will be that small. A more realistic estimate for the overall capacitance (including the effect of interconnects) is F. Therefore, even though the intrinsic switching delay is smaller than 1 ps, the extrinsic switching delay may actually exceed 100ps because of the external switching circuit. In the next subsection, we will discuss how this limitation may be overcome by switching the device optically rather than electronically, thereby eliminating the external switching circuit altogether.
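The two intrinsic time constants discussed in this subsection can be estimated numerically. The following is a minimal sketch, assuming that the transit time is the path length divided by a velocity of $10^8$ cm/s and that the gate charges to the threshold voltage at a constant current, so that the charging time is (capacitance × threshold voltage)/current. The function and variable names are hypothetical.

```python
# Order-of-magnitude estimate of the intrinsic switching time constants.
E_CHARGE = 1.602176634e-19   # elementary charge (C)
PLANCK = 6.62607015e-34      # Planck constant (J s)

def transit_time_s(path_length_m: float, velocity_m_per_s: float) -> float:
    """Time to traverse the device at a fixed velocity."""
    return path_length_m / velocity_m_per_s

def charging_time_s(capacitance_f: float, threshold_v: float, current_a: float) -> float:
    """Time to charge a capacitance to the threshold voltage at constant current."""
    return capacitance_f * threshold_v / current_a

# Values quoted in the text: L = 1,000 angstrom, v = 1e8 cm/s = 1e6 m/s,
# gate capacitance 1e-18 F, threshold 100 mV, I = (2e^2/h) x 10 mV.
conductance_s = 2.0 * E_CHARGE**2 / PLANCK
current_a = conductance_s * 10e-3
t_transit = transit_time_s(1000e-10, 1.0e6)
t_rc = charging_time_s(1e-18, 100e-3, current_a)
print(f"transit time ~ {t_transit * 1e12:.2f} ps, RC time ~ {t_rc * 1e12:.2f} ps")
```

Both numbers come out at the 0.1 ps level, in line with the intrinsic delay quoted above. Because the charging time scales linearly with capacitance, any additional interconnect capacitance lengthens it in direct proportion.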
F. Electro-optic Applications
In this section, we examine the viability of switching the T-structure transistor optically, rather than electronically, in order to eliminate the RC time constant limitation on switching and realize an ultrafast electro-optic switch. Recently, it was pointed out that an optical radiation field, with a frequency lower than the bandgap frequency, can induce virtual charge polarization because of virtual transitions between electron and hole states in a quantum confined structure (Yamanishi, 1987; Chemla et al., 1987). The field associated with this charge polarization may be sufficiently large to generate the small voltage required for switching a T-structure transistor. It is important to note that this voltage is not generated electronically, so that there is no RC time constant limitation on the switching (Yamanishi, 1989). Instead, the voltage is generated on time scales determined by the pulse width of the optical pulse and the inherent time response of the virtual charge polarization mechanism, which is expected to be 100 fs (Yamanishi, 1987). This time scale is comparable to or smaller than the transit time through the T-structure, so that the overall switching speed will be on the order of the transit time, which can be made small enough (~1 ps) by making the structure short. Using this scheme, one can therefore realize an ultrafast electro-optic switch. We now examine any modifications that would be necessary to convert the T-structure transistor into an electro-optic switch. It was calculated by Yamanishi (1989) that for a reasonable laser pump power density of $10^{8}$ W cm$^{-2}$, the screening field generated by virtual charge polarization is ~0.5 kV/cm for an optical detuning energy of 50 meV, when the structure is biased with a dc field of $10^{5}$ V/cm along the direction in which the screening field is created. Since a laser spot can be focused to an area of 1 µm$^2$, the required pump power can be provided by a 1-watt laser.
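The unit conversions behind these figures are easy to get wrong, so a short check may be useful. The sketch below multiplies the quoted power density by the spot area, and multiplies each quoted field by a 200 Å vertical dimension (the value used in the discussion that follows). The helper names are hypothetical.

```python
# Unit-conversion check for the optical switching estimates.

def laser_power_w(power_density_w_per_cm2: float, spot_area_um2: float) -> float:
    """Pump power needed to reach a power density over a focused spot."""
    area_cm2 = spot_area_um2 * 1e-8        # 1 um^2 = 1e-8 cm^2
    return power_density_w_per_cm2 * area_cm2

def voltage_across_v(field_v_per_cm: float, thickness_angstrom: float) -> float:
    """Voltage dropped by a uniform field across a layer of given thickness."""
    thickness_cm = thickness_angstrom * 1e-8   # 1 angstrom = 1e-8 cm
    return field_v_per_cm * thickness_cm

pump_power = laser_power_w(1e8, 1.0)           # 1e8 W/cm^2 over 1 um^2
bias_voltage = voltage_across_v(1e5, 200.0)    # dc bias field across 200 angstrom
screening_voltage = voltage_across_v(0.5e3, 200.0)  # 0.5 kV/cm screening field
print(pump_power, bias_voltage, screening_voltage)
```

The results, about 1 W, 200 mV, and 1 mV, agree with the numbers used in the text.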
For a T-structure with a vertical dimension of 200 Å, the required dc bias of $10^{5}$ V/cm can be obtained by applying a constant voltage of 200 mV across the vertical dimension. With this arrangement, the voltage generated by the screening field will be 0.5 kV/cm × 200 Å = 1 mV. If the threshold voltage for switching can be lowered to this value, it will be possible to switch the transistor optically and realize an ultrafast electro-optic switch. To attain a threshold voltage of 1 mV while maintaining a carrier concentration of $10^{19}$ cm$^{-3}$, one needs to increase the width of the gate strip in the structure of Fig. 35b to ~4,000 Å. This will of course also necessitate increasing the source-to-drain separation to about 1 µm, which in turn would necessitate reducing the drain voltage to about 1 mV in order to increase the mean free path for electron-electron scattering. This low value of the source-to-drain voltage does not present any special problems for switching applications, since even with this small drain voltage, the drain current can swing over a range of ~$(e^2/h)V_{SD}$ = 40 nA, which can be detected by sensing amplifiers. We therefore conclude that it is possible to use a T-structure transistor to realize an ultrafast electro-optic switch. A prototypical structure for such a device was proposed by Subramaniam et al. (1990). It will consist of the configuration shown in Fig. 35b with the entire top surface covered with an optically opaque material, leaving a transparent slit of width ~4,000 Å in the place of the gate strip. This is shown in Fig. 39. Since the vertical dimension of the structure is only about 200 Å, one can neglect the effect of diffraction through the slit and assume that incident radiation absorbed through the slit will create a localized potential barrier underneath the slit which can switch the transistor from an on to an off state or vice versa. The effective switching speed will be of the order of the transit time through the
FIGURE 39. Proposed structure for an opto-electronic switch. The T-structure transistor is switched optically by creating the threshold voltage using subbandgap radiation. This causes virtual electron-hole excitations, and the resulting virtual charge polarization can cause the appearance of a voltage in the irradiated region.
structure, which for a 1 µm long structure is ~1 ps. This is an attractively high switching speed. To conclude, the T-structure has some promise, but it also has serious drawbacks. It is difficult to foretell what the future of such devices will be, and they may not even materialize for many years to come. Nonetheless, it is important to pursue research in such concepts, since they can lead to valuable insights in other areas. Moreover, they can also aid the conception and realization of other similar devices with equally attractive properties. One such device is discussed in the next section.
VI. ELECTRON WAVE DIRECTIONAL COUPLERS
In optical and microwave networks, a fairly common device that is used for switching power and signal from one waveguide to another is a directional coupler. In this section, we review the experimental efforts in building its electronic analogue, the electron wave directional coupler. This is an electron wave device that fully exploits the wave attributes of electrons for its operation. Theoretical modeling of the steady-state and transient operation of this structure has been performed using standard coupled mode theory, the strong coupling theory, and the more recent supermode theory. This theory is reviewed in the next subsection.
A. Theory of Electron Wave Directional Couplers
The first proposals for electron wave directional couplers were put forth by Dagli et al. (1991) and Tsukada et al. (1990). The prototypical structure consists of two contiguous “electron waveguides” with controlled coupling between them (see Fig. 40). In these devices an electron injected in one electron waveguide is controllably coupled (or switched) to the other waveguide, very much like switching a train from one track to another. Electron waveguides are realized by ballistic quasi one-dimensional semiconductor structures where electron propagation mimics electromagnetic wave propagation in a waveguide. The coupling is achieved by allowing a controllable amount of tunneling between the waveguides. The control over tunneling is achieved by varying the potential height or width of the tunneling barrier with an external gate voltage. Experimental demonstration of this concept was recently reported by Eugster and del Alamo (1991) and Eugster et al. (1992). In order to investigate the potential high-frequency applications of electron wave directional couplers, Qian et al. (1992) studied the transfer of
FIGURE 40. Schematic representation of an electron wave directional coupler. The two waveguides are in close proximity in the middle of the device and split away at both ends of the active region. The gate in the middle is used to control the amount of switching between the two electron waveguides.
wavepackets between parallel quantum wells (two-dimensional electron waveguides) by solving the two-dimensional time-dependent Schrödinger equation. They found that a wavepacket injected into one well is only partially transferred to the other (coupling efficiency close to 85%) even for perfect double quantum wells. These authors also studied the amount of transfer in asymmetric well configurations as a function of the total energy of the wavepacket. The length and time scales for maximum transfer are found to be around a few thousand angstroms and picoseconds, respectively, which appears very promising. Recently, Sankaran and Singh (1991) have used the time-dependent multiband effective mass equation to calculate the tunneling rate for holes between coupled quantum well structures. Their results indicate that it is possible to realize p-channel directional couplers as well. The availability of both n-channel and p-channel switches is important for the viability of ultrafast complementary logic. More importantly, coherent oscillations of an electron wavepacket between two quantum wells can emit terahertz electromagnetic radiation. The frequency of the radiation can be tuned by an external electric field applied over the wells, thereby providing a continuously tunable submillimeter wave source. Recently, the observation of such radiation was reported by Leo et al. (1991).
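The connection between picosecond transfer times and terahertz emission can be illustrated with the textbook two-level picture, in which an electron placed in one well is a superposition of the symmetric and antisymmetric states and oscillates between the wells at a frequency set by their energy splitting. This is a simplified sketch and not the calculation of the authors cited above. The splitting values used are hypothetical, chosen only to show the scaling.

```python
# Two-level estimate: oscillation frequency f = dE / h and one-way
# transfer time t = h / (2 dE) for a symmetric/antisymmetric splitting dE.
PLANCK_EV_S = 4.135667696e-15   # Planck constant (eV s)

def oscillation_frequency_hz(splitting_ev: float) -> float:
    """Frequency of charge oscillation between two coupled wells."""
    return splitting_ev / PLANCK_EV_S

def transfer_time_s(splitting_ev: float) -> float:
    """Half an oscillation period: time for one well-to-well transfer."""
    return 0.5 * PLANCK_EV_S / splitting_ev

# Hypothetical splittings (not taken from the text).
for splitting_mev in (1.0, 4.0):
    de = splitting_mev * 1e-3
    print(f"dE = {splitting_mev} meV: f = {oscillation_frequency_hz(de) / 1e12:.2f} THz, "
          f"transfer time = {transfer_time_s(de) * 1e12:.2f} ps")
```

In this picture, splittings of a few meV correspond to frequencies of a fraction of a terahertz up to about 1 THz, and to transfer times of order a picosecond.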
Steady-state analysis of electron transfer between coupled double quantum wells was presented by a number of authors (del Alamo and Eugster, 1989; Dagli et al., 1991; Tsukada et al., 1990; Wang et al., 1991; Yang and Xu, 1991a,b; Singh et al., 1992; Wilson et al., 1993). Singh et al. (1992) performed a transient analysis, and we reproduce some of their results.
1. Examples of Coupling between Contiguous Quantum Wells
We present an example of electron transfer between two 100 Å wide quantum wells separated by a barrier 200 Å wide and 10 meV high. The potential barrier at the boundary is 0.1 eV high (see Fig. 41). Such a structure can be realized using an Al$_x$Ga$_{1-x}$As-GaAs-Al$_y$Ga$_{1-y}$As-GaAs-Al$_x$Ga$_{1-x}$As layered structure grown by molecular beam epitaxy. All the interfaces are assumed to be smooth in this simulation (i.e., surface roughness scattering is neglected). The initial wavepacket has the form
[Figure 41: conduction-band energy diagram. Recoverable labels: barrier height 0.1 eV; level energies 4.65, 7.83, 19.97, 30.74, 48.06, 66.89, 88.28; dimensions 100 Å, 200 Å, 200 Å, 700 Å; axes x and y; legend: symmetric, antisymmetric.]
FIGURE 41. Conduction energy profile in the electron wave directional coupler studied in the text. The location of the symmetric and antisymmetric eigenstates in the directional coupler are shown on the left. The dashed curve shows a more realistic conduction-band energy profile in the direction perpendicular to the direction of current flow. This conduction-band energy profile must be calculated using a self-consistent two-dimensional solution to the Poisson and Schrödinger equations.
196
MARC CAHAY and SUPRIYO BANDYOPADHYAY 0
2
.’ .
I
I
I
FIGURE42. Time evolution of the probability for the wavepacket described by Eq. (V1.1)to be in well 1 ( y > 350 A; full line) and well 2 (y < 350 A; dashed line), respectively. The curve starting in the lower left corner is interpreted as the probability of transfer between wells (see Fig. 41).
where $(x_0, z_0)$ = (500 Å, 200 Å) are the coordinates of the wavepacket center, $\sigma$ is equal to 50 Å, and $W$ is the well width. The electron wavevector $k_0$ is 0.0265 Å$^{-1}$ [with this value of $k_0$, a free electron ($m^* = 0.067m_0$) travels a length of 1,000 Å in 0.22 ps]. The average kinetic energy of an electron in the state is about 65 meV for the values of the parameters listed. The simulation results are shown in Fig. 42. In this figure, one can see the probability of the wavepacket to be on the left or right of the plane z = 350 Å as a function of time. Obviously, the transfer is only partial. This is to be expected, since a steady-state analysis of perfectly symmetrical (in the z direction) directional couplers indicates that total transfer can occur only if one assumes a plane wave solution of the electron wavefunction in the x-direction. For a wavepacket built up of many plane-wave Fourier components, the length over which total transfer occurs varies with the wavevector, leading to partial transfer of the wavepacket between wells. This also explains the fragmentation of the wavepacket in many successive lobes in the direction of propagation, as clearly seen in Fig. 43. Singh et al. (1992) found that the amount of transfer between wells is slightly altered in asymmetric configurations (narrow well/wide well) and is hardly affected by the initial energy of the wavepacket.
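The transit figure quoted in brackets follows from the free-electron group velocity $\hbar k_0/m^*$. The short sketch below reproduces it and also evaluates the kinetic energy $\hbar^2 k_0^2/2m^*$ associated with motion along the propagation direction alone. The variable names are hypothetical.

```python
# Group velocity, transit time, and longitudinal kinetic energy for k0.
HBAR = 1.054571817e-34        # reduced Planck constant (J s)
M0 = 9.1093837015e-31         # free electron mass (kg)
EV = 1.602176634e-19          # joules per electron volt

k0 = 0.0265e10                # 0.0265 per angstrom, converted to 1/m
m_eff = 0.067 * M0            # GaAs effective mass used in the text

velocity = HBAR * k0 / m_eff                      # m/s
transit_time = 1000e-10 / velocity                # time to travel 1,000 angstrom
kinetic_mev = (HBAR * k0) ** 2 / (2.0 * m_eff) / EV * 1e3

print(f"v = {velocity:.3e} m/s, t = {transit_time * 1e12:.2f} ps, "
      f"E_x = {kinetic_mev:.1f} meV")
```

The transit time comes out at about 0.22 ps, as quoted. The longitudinal kinetic energy is about 40 meV, which is lower than the roughly 65 meV average quoted above; the difference presumably reflects the transverse confinement contribution, although that attribution is an assumption made here.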
FIGURE 43. Wavepacket switching between the two parallel quantum wells shown in Fig. 41. Initially the wavepacket is located in well 1 and described by Eq. (VI.1).
More simulations need to be done to study the possibility of near-perfect (100%) switching of electron wavepackets between parallel quantum wells in an asymmetric configuration. Singh et al. (1992) found in their prototype problem that the maximum transfer is only 85% and occurs when the total kinetic energy of the electron injected in the first well is 35 meV. This is not a very high injection energy and is slightly below the threshold for polar optical phonon emission in GaAs. The potential energy profile assumed in the simulations of Singh et al. consisted of rectangular barriers of varying heights. In a realistic electron wave directional coupler such as proposed by Dagli and co-workers (1991), the potential is smoother because it results from electrostatic depletion under closely spaced gate configurations. One would expect, however, that the results for these structures will be qualitatively the same as those for rectangular barriers. In summary, electron wave directional couplers also appear to be promising. However, much more work needs to be done before such a device can be considered a serious candidate for switching applications.
VII. SPIN PRECESSION DEVICES
It was mentioned at the beginning of this review that electron wave devices are often conceptualized from analogous microwave or optical devices. A number of microwave and optical devices like the magic tee or the electro-optic light modulator rely on the interference between two allowed polarizations of electromagnetic waves. An obvious question to ask, then, is whether analogous devices are conceivable based on the two possible spin polarizations of electron waves. This possibility was examined theoretically by Datta and Das (1990), whose analysis is discussed in this section. Their observations led to the proposal of a spin-precession transistor whose operation is based on the interference of two spin polarizations of an electron wave.
The phase difference between the two spin polarizations (which can be controlled by an external electrostatic potential) is independent of the electron energy, so that any resulting interference effect is immune to the deleterious effects of ensemble averaging. This feature, which is very desirable, is of course also found in the magnetostatic Aharonov-Bohm interferometer. However, the advantage here is that the controlling field is an electric field rather than a magnetic field. This makes the spin-precession device more attractive than Aharonov-Bohm interferometers. The basic concept behind the operation of the spin precession transistor can be understood by considering a familiar optical device: the electro-optic
FIGURE 44. (a) A conventional electro-optic modulator. (b) An electronic analogue of the electro-optic modulator, which is the spin-precession transistor. After Datta and Das (1990); reprinted with permission.
light modulator, which is shown in Fig. 44a. In the latter device, a polarizer at the input polarizes incoming light at 45° to the y axis (in the y-z plane), which can be represented as a linear combination of z- and y-polarized light:
$$\begin{pmatrix} 1 \\ 1 \end{pmatrix}_{(45^\circ\ \mathrm{pol.})} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}_{(z\ \mathrm{pol.})} + \begin{pmatrix} 0 \\ 1 \end{pmatrix}_{(y\ \mathrm{pol.})}. \qquad \text{(VII.1)}$$
As this light passes through the electro-optic material, the two polarizations suffer different phase shifts $k_1 L$ and $k_2 L$ because the electro-optic effect makes the z-component of the dielectric constant tensor $\varepsilon_{zz}$ slightly different from the y-component $\varepsilon_{yy}$, so that $k_1 \neq k_2$. This phase difference causes the plane of polarization to rotate. The difference between the two components of the dielectric tensor (and hence the amount of rotation) can be controlled by an external electric field applied to the material through a gate terminal, as shown in Fig. 44a. The light emerging from the electro-optic material is represented as
$$\begin{pmatrix} e^{ik_1 L} \\ e^{ik_2 L} \end{pmatrix}.$$
Finally, the analyzer at the output (which is aligned congruously with the polarizer) lets only the component along $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ pass through. Therefore, the output power $P_0$ is given by
$$P_0 \propto \left| \begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} e^{ik_1 L} \\ e^{ik_2 L} \end{pmatrix} \right|^2 = 4\cos^2\frac{(k_1 - k_2)L}{2}. \qquad \text{(VII.2)}$$
The light output can now be modulated with a gate voltage that controls the differential phase difference $\Delta\theta = (k_1 - k_2)L$. The analogous device based on electron wave interference is shown in Fig. 44b. The polarizer and analyzer can be implemented using contacts made of a ferromagnetic material such as iron. At the Fermi level in such materials, the density of states for electrons with one spin greatly exceeds that for the other, so that the contact preferentially injects and detects electrons with a particular spin. Spin current polarization up to ~50% has been experimentally demonstrated utilizing Permalloy contacts (Johnson and Silsbee, 1988; Meservey et al., 1976). A contact magnetized in the x direction preferentially launches and detects electrons spin polarized along the positive x-direction, which is a superposition of positive z-polarized and negative z-polarized electrons:
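$$\begin{pmatrix} 1 \\ 1 \end{pmatrix}_{(+x\ \mathrm{pol.})} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}_{(+z\ \mathrm{pol.})} + \begin{pmatrix} 0 \\ 1 \end{pmatrix}_{(-z\ \mathrm{pol.})}. \qquad \text{(VII.3)}$$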
Therefore, ferromagnetic contacts can mimic the 45° polarizers and analyzers. Finally, one needs the phase-shifter, or the analogue of an electro-optic material which will introduce a differential phase shift between +z polarized and −z polarized electrons that can be controlled with a gate voltage. Narrow-gap semiconductors such as InGaAs provide just this, as discussed later. It has been established both theoretically (Lommer et al., 1988; Bychkov and Rashba, 1984) and experimentally (Luo et al., 1988; Das et al., 1989) that narrow-gap semiconductor quantum wells exhibit an energy splitting between up-spin and down-spin electrons even in the absence of any magnetic field. The dominant mechanism for this “zero-field spin splitting” is believed to be the so-called Rashba term in the effective mass Hamiltonian of a confined system:
$$H_R = \eta(\sigma_z k_x - \sigma_x k_z), \qquad \text{(VII.4)}$$
where $\sigma$ is the Pauli spin matrix and $\eta$ is the spin-orbit coupling constant. This term arises from the perpendicular electric field at heterojunction interfaces, and $\eta$ depends on this field. Other mechanisms, such as the inversion asymmetry term in certain crystals, can also contribute to the zero-field spin splitting. It is easy to see that the Rashba term causes +z polarized and −z polarized electrons with the same energy to have different wave vectors $k_1$ and $k_2$. Consider an electron traveling in the x direction with $k_z = 0$ and $k_x \neq 0$ (we assume that the electron forms a two-dimensional electron gas in the x-z plane). The Rashba term $H_R$ is then equal to $\eta\sigma_z k_x$. This raises the energy of z-polarized electrons by $\eta k_x$ and lowers that of −z polarized electrons by the same amount. It is as if the electrons feel a magnetic field $B_z$ proportional to $k_x$ ($\eta k_x \rightarrow \mu_B B_z$, $\mu_B$ being the Bohr magneton). The energies of the two spin polarizations are given by
$$E(z\ \mathrm{pol.}) = \frac{\hbar^2 k_{x1}^2}{2m^*} - \eta k_{x1}, \qquad \text{(VII.5)}$$
$$E(-z\ \mathrm{pol.}) = \frac{\hbar^2 k_{x2}^2}{2m^*} + \eta k_{x2}. \qquad \text{(VII.6)}$$
In the absence of inelastic scattering, the energy is invariant (it is a good quantum number), so that the preceding two equations yield
$$k_{x1} - k_{x2} = \frac{2m^*\eta}{\hbar^2}. \qquad \text{(VII.7)}$$
It is apparent that a differential phase shift
$$\Delta\theta = (k_{x1} - k_{x2})L = \frac{2m^*\eta L}{\hbar^2} \qquad \text{(VII.8)}$$
is introduced between up- and down-spin (or z polarized and −z polarized) electrons by the Rashba effect, which is proportional to the spin-orbit coefficient $\eta$. Changing $\eta$ to change this phase shift will effectively rotate or precess the spin of the electron and modulate the current. This is very much like changing the phase shift between two polarization components of a polarized light beam to rotate the plane of polarization as in an electro-optic modulator. The question now is whether the spin-orbit coupling constant $\eta$ (and therefore the amount of precession) can be controlled by some external means so as to realize a transistor. The quantity $\eta$ is proportional to the expectation value of the electric field at the heterostructure interface (Bychkov and Rashba, 1984) and therefore, in principle, can be controlled by the application of a gate voltage. If $\eta$ can be modulated to yield a phase difference that can vary between 0 and $\pi$, then one can utilize this device as a transistor. For InGaAs/InAlAs heterostructures, from the experimentally observed zero-field spin splitting, $\eta$ is estimated to be $3.9 \times 10^{-12}$ eV m (Das et al., 1989). This yields a value of $L$ required for a 180° phase shift to be 0.67 µm in InGaAs. This is obviously smaller than the mean free path for
spin-flip scattering in high-mobility semiconductors at low temperatures. It therefore appears possible that $\Delta\theta$ can be modulated by a gate voltage to cause an appreciable change in the current. This realizes transistor action. We call this device a “spin precession field effect transistor” (SPINFET) where the current between the source and the drain (spin polarized contacts) can be modulated by a gate voltage. The device just described is obviously an electronic analogue of the electro-optic modulator. Here the electron spin plays the role of the photon polarization. It may be possible to realize this device using iron contacts to implement the polarizer and the analyzer and a narrow-gap semiconductor such as InGaAs to implement the phase shifter. The phase shifter introduces a controllable phase shift (controlled by an external gate voltage) which makes this device act as a transistor. An important issue that needs to be considered here is the following. In the preceding analysis [which was carried out by Datta and Das (1990)] only electrons traveling along the x-direction were considered. In practice, there will be an angular spectrum of electrons in the x-z plane. The spin precession effect is reduced if the direction of propagation is not along the x axis. Of course, the electrons which have their velocities misaligned with the x-axis (which is the direction of current flow) will also have a longer transit time and therefore will probably suffer spin-flip scattering. They will not contribute to the interference, but instead create an incoherent background current which decreases the relative modulation of the current. This effect was analyzed by Datta and Das (1990), who concluded that the background current is severely suppressed in quasi one-dimensional structures whose width is much less than $\hbar^2/\eta m^*$. Before concluding this section, we point out that the differential phase shift $\Delta\theta = 2m^*\eta L/\hbar^2$ is the same for all subbands and for all energies.
This is the same feature that makes the magnetostatic Aharonov-Bohm effect much more robust than the electrostatic Aharonov-Bohm effect, and permits the observation of a 100% conductance modulation in arbitrarily wide structures at arbitrarily high temperatures as long as transport is ballistic. Consequently, it may be possible to achieve large percentage modulation of the current via the spin precession effect even in multichanneled transport and at elevated temperature and applied bias. This is a rare and very desirable feature.
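The length needed for a 180° differential phase shift can be estimated directly from the expression for the phase shift, $2m^*\eta L/\hbar^2$, using the quoted spin-orbit coefficient of $3.9 \times 10^{-12}$ eV m. The following is a minimal sketch. The effective mass of $0.05m_0$ for InGaAs is an assumption made here, since the text does not state the value used, and the normalized modulation $\cos^2(\Delta\theta/2)$ is simply the polarizer/analyzer result of the optical analogy carried over to the spin case. The function names are hypothetical.

```python
import math

# Spin-precession phase shift and the channel length giving a pi shift.
HBAR = 1.054571817e-34        # reduced Planck constant (J s)
M0 = 9.1093837015e-31         # free electron mass (kg)
EV = 1.602176634e-19          # joules per electron volt

def phase_shift_rad(eta_ev_m: float, length_m: float, mass_ratio: float) -> float:
    """Differential phase shift 2 m* eta L / hbar^2."""
    return 2.0 * mass_ratio * M0 * eta_ev_m * EV * length_m / HBAR**2

def length_for_pi_m(eta_ev_m: float, mass_ratio: float) -> float:
    """Channel length at which the differential phase shift equals pi."""
    return math.pi * HBAR**2 / (2.0 * mass_ratio * M0 * eta_ev_m * EV)

def relative_output(phase_rad: float) -> float:
    """Normalized polarizer/analyzer transmission, cos^2(phase / 2)."""
    return math.cos(phase_rad / 2.0) ** 2

ETA = 3.9e-12                 # eV m, value quoted in the text
MASS_RATIO = 0.05             # assumed InGaAs effective mass ratio
l_pi = length_for_pi_m(ETA, MASS_RATIO)
print(f"L(pi) ~ {l_pi * 1e6:.2f} um, "
      f"output at L(pi) = {relative_output(phase_shift_rad(ETA, l_pi, MASS_RATIO)):.3f}")
```

With the assumed mass the length comes out near 0.6 µm, the same order as the 0.67 µm quoted above. The exact figure depends on the effective mass adopted.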
VIII. GRANULAR ELECTRON DEVICES
Granular electron devices are a new genre of electronic devices that have recently attracted some attention. They are not “electron wave devices”;
FIGURE 45. The basic structure for a single-electron transistor.
the only quantum-mechanical property that is involved in their operation is the granularity of charge, i.e., the fact that electric charge is quantized in units of the fundamental electronic charge e. Since the granularity is often a more robust property than the wave attribute (see footnote 21), it is sometimes claimed that these devices are less delicate than electron wave counterparts. The basic structure for the granular electron transistor (also called “charge effect transistor”) (Amman et al., 1989; Likharev and Claeson, 1992) consists of two ultrasmall capacitors in series, as shown in Fig. 45. The three plates can be contacted separately to realize a three-terminal device or a transistor. The way this device works is as follows. When the charge on the central plate is ne (n is an integer), no current can pass through the capacitors. This is because of a phenomenon known as Coulomb blockade (Kulik and Schechter, 1975; Averin and Likharev, 1986). When the charge is (n + 0.5)e, the Coulomb blockade is removed and current flows through the capacitors. Therefore by controlling the charge on the central plate by the central terminal, one can control the current flowing between the other two terminals. This realizes the transistor action. If it is bothersome to understand how the charge on the central terminal can ever be a fraction of e, one has to remember that this charge is a transferred charge which really represents the net shift of the electrons from their equilibrium positions. Such a shift need not be quantized in units of e. It has been claimed that granular electron transistors can operate at room temperature with much more ease than electron wave devices. This is because the operation of single electron transistors does not depend on the preservation of an electron’s phase, which is an extremely delicate entity.
However, room-temperature operation is not all that easy anyway, since the charging energy $e^2/C$ must always exceed the thermal energy $kT$ for the Coulomb blockade effect to be sufficiently prominent. This requires the capacitance to be less than $10^{-18}$ farads, which requires the capacitor plate
Footnote 21: Phase-randomizing scattering events will inhibit the wave nature of electrons, but not the granularity.
areas to be smaller than 300 Å × 300 Å. If the plates are patterned using e-beam or x-ray lithography and then etched using anisotropic etching processes such as reactive ion etching, the resulting structures can be severely damaged. In fact, conventional nanofabrication processes introduce many stray defects and spurious charges in the dielectric sandwiched by the capacitor plates, which can completely mask the Coulomb blockade effect. Another merit claimed for granular electron devices is that the noise in these devices is quite small (Amman et al., 1989). This is also an advantage that can be profitably exploited. A variety of Boolean logic circuits have been designed using single electron transistors. Figure 46 shows possible designs of NAND and NOR gates (Tucker, 1992). Since the capacitor areas are very small, such circuits can result in increased density in integrated circuit chips. The Coulomb blockade effect also has other potential applications. It can be used to realize high-frequency oscillators. When an ultrasmall capacitor of capacitance $C$ is driven by a dc current of magnitude $I$, the voltage across the capacitor will exhibit oscillations with amplitude $e/2C$ and frequency $I/e$ (Ben-Jacob and Gefen, 1985; Averin and Likharev, 1986). For a current of 1 µA, the resulting frequency is easily in the submillimeter wave (terahertz) range. Coherent electromagnetic sources in this frequency range are rather rare, so that these so-called single electron tunneling oscillations can be very important for device applications. In addition, single-electron devices can also be used as ultrasensitive electrometers capable of measuring charges one thousandth of e, and they can serve as high-accuracy ammeters
FIGURE 46. (a) Capacitively biased double junction “n-switch.” The switch is an open circuit for low gate voltage $V_g = 0$ and drains charge off the output load capacitor $C_L$ when a high gate voltage $V_g = V_s$ is applied. (b) A corresponding “p-switch” which is an open circuit for high gate bias $V_g = V_s$ and charges the load capacitor $C_L$ to the supply voltage when a low gate voltage $V_g = 0$ is applied. (c) Complementary inverter circuit utilizing capacitively biased double junction switches. After Tucker (1992); reprinted with permission.
(Likharev and Claeson, 1992). Therefore, granular electronic devices can have a large number of analog applications. In addition to the analog applications just discussed, granular electronic devices may also have intriguing applications in digital electronics. The basic current-voltage characteristic of two ultrasmall capacitors in series has a steplike shape, as shown in Fig. 47a, known as the “Coulomb staircase” (Barner and Ruggiero, 1987), and this can be used to realize analog-to-digital converters (an analog voltage is converted into a digital
FIGURE 47. (a) The current-voltage characteristics of two ultrasmall capacitors in series, showing the Coulomb staircase; (b) the Coulomb staircase for capacitors with ferromagnetic plates which emit spin-polarized electrons in a magnetic field. There can be a hysteresis leading to bistable states in a resistance-capacitance circuit which may be utilized for ultrafast digital logic.
current). The current-voltage characteristics may also exhibit intrinsic bistability, as in the case of resonant tunneling diodes, especially if the capacitor plates are made from spin-polarized materials such as iron (Bandyopadhyay et al., 1993). The bistability may arise as follows. The basic Coulomb blockade is an electrostatic effect that can exist between any two charged particles regardless of whether they are bosons or fermions. However, in the case of fermions (such as electrons), there is an additional source of repulsion caused by the Pauli exclusion principle, which mandates that two electrons cannot coexist in the same quantum state. Therefore, there is an additional energy cost involved in transferring an electron to a capacitor plate if that plate is already occupied by other electrons of the same spin. One may call this additional blockade the Pauli or exchange blockade. This additional blockade may cause bistability to appear in the current-voltage characteristics in a magnetic field, as shown in Fig. 47b. One can exploit this bistability in a resistance-capacitance circuit to realize ultrafast digital logic whose switching time can be 0.1 ps (R = 1 MΩ, C = $10^{-19}$ F).
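As a rough check on the magnitudes quoted for granular electronic devices, the following sketch evaluates the RC switching time, taking the quoted values as R = 1 MΩ and C = 10^-19 F, together with the frequency I/e and amplitude e/2C of single-electron tunneling oscillations. The 1 µA drive current and the reuse of the same capacitance for the oscillator are illustrative assumptions, and the function names are hypothetical.

```python
# Illustrative arithmetic only; function names and the 1 microampere
# drive current are assumptions made for this sketch.
E_CHARGE = 1.602176634e-19  # elementary charge, in coulombs


def rc_time(resistance_ohm, capacitance_farad):
    """Return the RC time constant in seconds."""
    return resistance_ohm * capacitance_farad


def set_oscillation(current_amp, capacitance_farad):
    """Return (frequency in Hz, amplitude in V) using f = I/e and amplitude = e/(2C)."""
    frequency = current_amp / E_CHARGE
    amplitude = E_CHARGE / (2.0 * capacitance_farad)
    return frequency, amplitude


tau = rc_time(1e6, 1e-19)                  # R = 1 megohm, C = 1e-19 F
freq, amp = set_oscillation(1e-6, 1e-19)   # assumed 1 microampere drive

print(f"RC switching time: {tau:.1e} s")
print(f"oscillation frequency: {freq:.2e} Hz, amplitude: {amp:.2f} V")
```

The RC product comes out at 0.1 ps, and a microampere-level current places I/e in the terahertz range.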
There are many other potential applications of granular electronic devices. One that was recently demonstrated uses these devices as electronic turnstiles that allow precisely one electron to pass in each cycle of an ac voltage imposed over the junction (Geerligs et al., 1988). Such an effect has important applications in sequential circuits and dynamic memory.
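Because a turnstile passes exactly one electron per cycle, the average current it carries is simply the elementary charge times the drive frequency. The short sketch below evaluates this relation; the 10 MHz drive frequency and the function name are hypothetical choices for illustration, not values from the cited experiment.

```python
# Illustrative arithmetic only; the 10 MHz drive frequency is an assumed value.
E_CHARGE = 1.602176634e-19  # elementary charge, in coulombs


def turnstile_current(frequency_hz, electrons_per_cycle=1):
    """Average current when a fixed number of electrons passes in each ac cycle."""
    return electrons_per_cycle * E_CHARGE * frequency_hz


current = turnstile_current(10e6)
print(f"average current at 10 MHz: {current:.3e} A")
```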
IX. CONNECTING QUANTUM DEVICES ON A CHIP: THE INTERCONNECTING PROBLEMS
Until now, we have discussed a number of different discrete quantum devices. In doing so, we have tacitly avoided the question of how these devices could be connected with each other on a chip without sacrificing performance. This is a very important issue, perhaps the most important issue today in this field. In the remaining parts of this article, we will mostly discuss this problem and possible remedies. It appears that the interconnection problem (along with the reproducibility problem hinted at in Section V,C) may be the ultimate bottleneck in the way of implementing “quantum chips” as opposed to discrete quantum devices. Two major advantages of quantum devices are their high speed and their small power dissipation. The former makes them ideal candidates for very high-speed integrated circuits (VHSIC). The latter allows extremely dense integration, so that quantum devices also appear ideal for wafer-scale integration and ultralarge-scale integrated (ULSI) chips. It turns out, however, that both these advantages may be limited by the chip interconnects. We discuss this in some detail next. The actual speed of a functional unit on a chip (say, for carrying out a logic operation) may not depend on the intrinsic device speed, but rather on the time it takes to charge and discharge the interconnects which communicate signals between individual devices. This time is approximately (Rout + Rinterconnect)(Cout + Cinterconnect), where Rout and Cout are the output resistance and capacitance of the preceding device, and Rinterconnect and Cinterconnect are the corresponding quantities for the interconnect. At one time it was believed that since superconducting interconnects can reduce Rinterconnect dramatically, they can vastly improve the interconnect speed.
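The charging-time estimate above can be sketched numerically to see when removing the interconnect resistance actually helps. All component values below are hypothetical and chosen only to contrast a device with a large output resistance against one with a small output resistance; the function names are likewise assumptions.

```python
# Illustrative sketch of the delay (Rout + Rint) * (Cout + Cint).
# All resistance and capacitance values are assumed, not taken from the text.

def interconnect_delay(r_out, c_out, r_int, c_int):
    """Approximate charge/discharge time in seconds."""
    return (r_out + r_int) * (c_out + c_int)


def superconducting_speedup(r_out, c_out, r_int, c_int):
    """Ratio of the delay with a resistive wire to the delay with r_int set to zero."""
    normal = interconnect_delay(r_out, c_out, r_int, c_int)
    superconducting = interconnect_delay(r_out, c_out, 0.0, c_int)
    return normal / superconducting


# Device with a large output resistance (10 kilohm) driving a 100 ohm wire.
high_rout = superconducting_speedup(10e3, 1e-15, 100.0, 100e-15)
# Device with a small output resistance (10 ohm) driving the same wire.
low_rout = superconducting_speedup(10.0, 1e-15, 100.0, 100e-15)

print(f"speedup with large Rout: {high_rout:.2f}x")
print(f"speedup with small Rout: {low_rout:.2f}x")
```

With these assumed numbers the gain is about 1% when the output resistance dominates and about elevenfold when the wire resistance dominates.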
This belief was later tempered by the realization that Rout is usually much larger than Rinterconnect (even for relatively long metallic or silicide interconnects), so that replacing conventional wires with superconducting ones does not buy a great deal of advantage (Likharev, 1990). With quantum devices, the wisdom of this dictate is called into question. Electron wave
devices typically have linear output characteristics (see, for instance, the discussion in Section V,C) and therefore a small value of Rout. Consequently, the interconnect resistance may dominate in quantum chips, in which case superconducting wires may offer an advantage. The other issue that needs to be addressed is the question of interconnect capacitance. This may be three or four orders of magnitude larger than the device capacitance. We saw an example of this in Section V,D, where the interconnect capacitance, rather than the device capacitance, will most likely control the power-delay product and the switching speed. In circuits too densely packed on a wafer, the interconnect capacitance may become too large, which may ultimately limit the integration level that one can implement even with quantum devices. Another problem, which we will briefly address in this section, is the issue of coupling and crosstalk between densely packed interconnects in ULSI. In ULSI chips, interconnect lines are so narrow and so closely spaced that a signal from one line could easily get coupled to another electromagnetically or otherwise, causing interference and crosstalk. For very closely spaced interconnects, electrons may even quantum-mechanically tunnel from one wire into another, causing interference. This effect is very similar to the effect that undergirds the operation of the directional coupler devices. The difference is that here it is undesirable and may cause catastrophic chip failures. Electromagnetic crosstalk and coupling between interconnects have been discussed by a number of authors (Gray, 1963; Catt, 1967; DeFalco, 1970; Brews, 1989). A set of lossless interconnects can be modeled as coupled transmission lines whose equations are
$$ -\frac{d}{dz}[V] = j\omega [L][I], \qquad -\frac{d}{dz}[I] = j\omega [C][V], \tag{IX.1} $$
where [V] and [I] are the vectors of line voltages and currents, and [L] and [C] are the inductance and capacitance matrices, which give the distributed inductance and capacitance per unit length. Coupling is represented by the off-diagonal terms in the inductance and capacitance matrices. These off-diagonal terms can be calculated starting from Maxwell’s equations, and we shall not dwell on that issue here. Instead, we will treat an unusual situation, namely when crosstalk occurs as a result of quantum-mechanical tunneling of electrons from one interconnect to another. Normally, this effect is almost imperceptible in ordinary VLSI circuits, but it may not be negligible in future ULSI circuits for quantum devices with an
interconnect spacing < 100 Å. This kind of coupling will be especially severe if the dielectric in which the interconnects are embedded is leaky (such as porous Si3N4 or SiO2 grown by wet oxidation). In addition to causing crosstalk, tunneling can give rise to a unique problem. In multilayered interconnects, there can be crossings of two lines with a very thin dielectric layer sandwiched between them. If the thickness of the lines is a few hundred angstroms, then at the crossing we have a crossover capacitor whose effective plate area is a few hundred angstroms square, and the plate separation is also of the same order. The corresponding capacitance can be estimated from a standard formula (Long and Butner, 1990). If the linewidths are 300 Å and the plate separation is 100 Å, then the crossover capacitance is ~$10^{-17}$ farads. Since the dielectric layer between the plates is thin enough, an electron can tunnel through this layer from one interconnect to another. Such tunneling can charge up the capacitor to 10 mV per electron! This is the single-electron charging effect that we discussed earlier in connection with granular electron devices. Obviously, stray voltages of this nature are undesirable in an integrated circuit and can cause reliability problems, logic errors, etc., especially if the supply voltages have been scaled down with the device sizes. The close physical proximity of neighboring lines is not the only cause of increased coupling in ULSI. The increasing length of interconnects with increasing die size also contributes to increased coupling, since a larger region is available for interaction when the interconnects are long. Suffice it to say, then, that crosstalk and coupling can be a serious problem in large ULSI chips because of the dense packing and long interconnect length. The coupling between interconnect lines was analyzed in detail by Bandyopadhyay (1992).
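The single-electron charging of a crossover capacitor can be bounded with a bare parallel-plate estimate. The sketch below ignores fringing fields and assumes a relative permittivity of 3.9 (a typical value for SiO2); it is not the formula of Long and Butner (1990), which is not reproduced here, so it should only be expected to agree with the quoted figures to within an order of magnitude. The function names are hypothetical.

```python
# Order-of-magnitude sketch only: ideal parallel plates, no fringing fields,
# and an assumed relative permittivity of 3.9 for the dielectric.
E_CHARGE = 1.602176634e-19   # elementary charge, in coulombs
EPS0 = 8.8541878128e-12      # vacuum permittivity, in F/m


def parallel_plate_capacitance(width_m, separation_m, rel_permittivity):
    """Capacitance of a square plate of side width_m at the given separation."""
    return EPS0 * rel_permittivity * width_m ** 2 / separation_m


def voltage_per_electron(capacitance_farad):
    """Voltage step produced by adding one electron to the capacitor."""
    return E_CHARGE / capacitance_farad


c_cross = parallel_plate_capacitance(300e-10, 100e-10, 3.9)  # 300 A lines, 100 A gap
v_step = voltage_per_electron(c_cross)

print(f"parallel-plate crossover capacitance: {c_cross:.2e} F")
print(f"charging voltage per electron: {v_step * 1e3:.0f} mV")
```

Under these assumptions a single tunneled electron shifts the crossover voltage by tens of millivolts, which is significant once supply voltages are scaled down.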
That analysis considered two cases: quantum-mechanical coupling between silicide interconnects embedded in silicon dioxide, and optical coupling between optical interconnects. It was based on coupled mode theory (Yariv, 1973; Ishimaru, 1991; Tamir, 1975; Hardy and Streifer, 1985, 1986; Tsang and Chuang, 1988; Chuang, 1987a,b; Chuang and Do, 1987), which is of course also used in the analysis of directional coupler-based devices. The salient features of Bandyopadhyay’s analysis (1992) are reproduced next.
A. Coupling between Optical Interconnects
To model electromagnetic coupling between a set of closely spaced optical interconnect lines, one can view the interconnects as optical waveguides and start from the wave equation that governs the propagation of an electromagnetic signal in a waveguide. The interconnects are assumed to be lossless
and non-dispersive. This is a very good assumption for optical interconnects comprising GaAs waveguides surrounded by AlGaAs cladding. The scalar wave equation for a TE mode propagating in the x direction in any one waveguide is (Yariv, 1973)
$$ \nabla^2 E_y - \mu\epsilon \frac{\partial^2 E_y}{\partial t^2} = \mu \frac{\partial^2}{\partial t^2} [P_{coup}]_y, \tag{IX.2} $$
where $E_y$ is the y-component of the electric field in the interconnect (waveguide), and $[P_{coup}]_y$ is the y-component of a distributed polarization source caused by the coupling of signal from other interconnects. The quantities $\mu$ and $\epsilon$ are the permeability and permittivity of the interconnects. To solve for the field $E_y$ in the preceding equation, one can invoke standard coupled mode theory. The solution $E_y(\mathbf{r}, t)$ can be written as a linear superposition of the normal modes (unperturbed fields) in the individual interconnects:
(IX.3)
where $\mathcal{E}_y^{(n)}(y, z)$ is the y-component of the electric field in the isolated nth interconnect (in the absence of coupling) and $\omega$ is the signal frequency (one assumes linear coupling so that there is no harmonic generation). The coefficient $C_n$ is a measure of how much the electromagnetic field couples to the nth interconnect. In fact, it is the amplitude of the electromagnetic field in the nth interconnect. It was shown by Bandyopadhyay (1992) that $C_n$ satisfies a set of coupled equations
(m = 1, 2, 3, ..., n), (IX.4)
where $\omega_n$ is the signal frequency in the nth interconnect, and $O_{mn}$ is the overlap between the fields in the mth and nth interconnects ($O_{mn} = \int d^3r\, \mathcal{E}_y^{(m)*} \mathcal{E}_y^{(n)}$). The preceding set of equations was solved analytically by Bandyopadhyay (1992) for the special case of just two interconnects to find $C_1(x)$ and $C_2(x)$. If $A_n$ and $B_n$ are the values of $C_n(x)$ at x = 0 and x = L, then they
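are related by
$$ \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}, \tag{IX.5} $$
where the matrix elements, with $b = c$ proportional to $\sin(\nu L)$, are expressed through the quantities $k_0$, $\delta$, $K$, and $\nu$ [Eq. (IX.6)], and the coupling constant $K$ is determined by the overlaps $O_{11}$, $O_{22}$, $O_{12}$, and $O_{21}$ [Eq. (IX.7)].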
Note that the 2 × 2 matrix in Eq. (IX.5) is in the form of a transmission matrix. The magnitude of the off-diagonal element b (= c) is a measure of the coupling from one interconnect to another. It is the fraction of the signal in interconnect 1 that gets coupled to interconnect 2 after a distance L. Substituting back the values of K and ν, we find that the quantity |b| is given by
(IX.8)
It is easy to show that this coupling is maximum if δ = 0 as long as tan(KL) > KL, or as long as KL < π/2. To have δ = 0 would require that the two interconnects be identical and carry the same signal frequency. Note that when δ is zero it is possible for 100% of the signal in one interconnect to get coupled to the other, and this happens at a distance
$$ L_{100\%} = \frac{\pi}{2K}. \tag{IX.9} $$
In the case of δ = 0, the coupling over a distance L is simply given by
$$ |b| = |c| = |\sin(KL)|. \tag{IX.10} $$
Therefore, the fraction of the signal power from one interconnect that is coupled into another is given by
$$ |b|^2 = |c|^2 = |\sin(KL)|^2. \tag{IX.11} $$
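The matched case (δ = 0) is simple enough to evaluate directly. The following sketch computes the coupled power fraction sin²(KL) and the full-transfer length π/2K; the value of the coupling constant used here is a hypothetical placeholder, not one calculated in the cited analysis, and the function names are assumptions.

```python
import math

# Sketch of the matched (delta = 0) case: coupled power fraction sin^2(K L)
# and the length pi / (2 K) at which all of the power has transferred.
# The coupling constant below is an assumed illustrative value.


def coupled_power_fraction(coupling_constant, length):
    """Fraction of power transferred to the neighboring interconnect."""
    return math.sin(coupling_constant * length) ** 2


def full_transfer_length(coupling_constant):
    """Shortest length at which 100% of the power has been transferred."""
    return math.pi / (2.0 * coupling_constant)


K = 0.5  # assumed coupling constant, in 1/cm
L_full = full_transfer_length(K)

for length_cm in (0.5, 1.0, L_full, 2.0 * L_full):
    fraction = coupled_power_fraction(K, length_cm)
    print(f"L = {length_cm:.3f} cm -> coupled fraction = {fraction:.3f}")
```

The power sloshes back and forth between the two lines as L increases, so a longer interconnect does not always mean more crosstalk at its far end.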
B. Quantum-Mechanical Coupling
The Schrödinger equation governing quantum mechanical coupling between neighboring interconnects is
$$ i\hbar \frac{\partial \psi}{\partial t} = (H_0 + H')\psi = \left( -\frac{\hbar^2}{2m^*}\nabla^2 + \cdots \right)\psi, \tag{IX.12} $$
where $\hbar$ is the reduced Planck’s constant, $m^*$ is the electron’s effective mass, $\psi$ (= $\psi(\mathbf{r}, t)$) is the electronic wavefunction, $H_0$ is the unperturbed Hamiltonian, and $H'$ is the perturbation in the Hamiltonian arising from coupling. It was shown by Bandyopadhyay (1992) that in this case coupling can be represented by the same matrix equation as Eq. (IX.5), where $A_n$ and $B_n$ are the wavefunction amplitudes. The coupling coefficient b is given by
$$ |b| = |c| = \sin(\zeta L), \tag{IX.13} $$
where
$$ \zeta = \frac{(m|H'_{12}|/\hbar^2)\,O_{11} - (m|H'_{11}|/\hbar^2)\,O_{21}}{\beta\,(O_{11}O_{22} - |O_{12}|^2)}, \tag{IX.14} $$
while $H'_{mn} = \int d^3r\, \phi_m^* H' \phi_n$, $O_{mn}$ are the wavefunction overlaps, and
$$ \beta = \sqrt{2m(E - E_1)}/\hbar = \sqrt{2m(E - E_2)}/\hbar, \tag{IX.15} $$
where $E_1$ and $E_2$ are the unperturbed energy levels in interconnects 1 and 2.
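To get a feel for the length scales involved, the sketch below evaluates a wavevector of the form sqrt(2mE)/ħ for the two energies quoted for the model problem: a longitudinal kinetic energy of 10 meV and a subband level 1 eV below the barrier. The free-electron mass is used as a stand-in for the effective mass, and the exponential fall-off of the wavefunction in the barrier is a textbook simplification rather than the overlap integrals of the cited analysis; both are assumptions of this sketch, as is the function name.

```python
import math

# Rough scale estimate. Assumptions: free-electron mass in place of the
# effective mass, and a simple exponential decay exp(-kappa * s) in the barrier.
HBAR = 1.054571817e-34    # reduced Planck constant, in J s
M_E = 9.1093837015e-31    # free-electron mass, in kg
EV = 1.602176634e-19      # one electronvolt, in joules


def wavevector(energy_ev, mass=M_E):
    """Return sqrt(2 m E) / hbar in 1/m for an energy given in eV."""
    return math.sqrt(2.0 * mass * energy_ev * EV) / HBAR


beta = wavevector(0.010)   # longitudinal kinetic energy of 10 meV
kappa = wavevector(1.0)    # decay constant for a level 1 eV below the barrier

# Relative suppression of the barrier wavefunction when the spacing grows
# from 100 angstroms to 150 angstroms (an extra 50 angstroms of barrier).
suppression = math.exp(-kappa * 50e-10)

print(f"beta  = {beta:.2e} 1/m")
print(f"kappa = {kappa:.2e} 1/m")
print(f"extra suppression over 50 angstroms: {suppression:.1e}")
```

Under these assumptions the coupling drops by many orders of magnitude between the two spacings, consistent with the strong spacing dependence reported for the quantum-mechanical case.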
C. Calculation of Coupling Coefficients
The coupling coefficients (K for electromagnetic coupling and ζ for quantum-mechanical coupling) were calculated by Bandyopadhyay (1992) for the model problems of two rectangular optical waveguides and two rectangular silicide wires separated from each other by a distance s. From this, the coupling parameter $|b|^2$ [the squared magnitude of the off-diagonal element in the matrix of Eq. (IX.5)] is obtained.
FIGURE 48. (a) The coupling parameter $|b|^2$ (which is the fraction of the power in one interconnect coupled to another) for electromagnetic coupling as a function of the separation between the interconnects. The interconnects are 1 µm wide, and the refractive indices of the interconnect material and the surroundings are 3.6 and 3.5, respectively. The results are plotted for two different interconnect lengths, namely 1 and 10 cm. The angular frequency of the electromagnetic wave is 6π × $10^{14}$/s. (b) The coupling parameter versus interconnect length for a spacing of 1.005 µm. After Bandyopadhyay (1992). © 1992 IEEE.
In Fig. 48a we show the coupling parameter $|b|^2$ due to optical coupling as a function of the spacing between two identical optical interconnects for two different lengths of the interconnects. The coupling parameter $|b|^2$ is the fraction of the power in one interconnect that is coupled to the other. The interconnects are assumed to be 1 µm wide (this is of the order of the wavelength of light emitted by semiconductor lasers and light emitting diodes), the material is GaAs with a refractive index of 3.6, and the isolating medium is AlGaAs with a refractive index of 3.5. Figure 48b shows the coupling parameter due to electromagnetic coupling as a function of interconnect length when the spacing between the optical interconnects is 1.005 µm. The signal frequency is assumed to be 2 × $10^{14}$ Hz, which roughly corresponds to the signal frequency of a GaAs diode laser. Other parameters regarding the interconnects are shown in the figure legends. In Fig. 49a we show the coupling parameter due to quantum-mechanical coupling as a function of spacing between two identical silicide interconnects embedded in silicon dioxide. We assume that the potential barrier
FIGURE 49. (a) The coupling parameter $|b|^2$ for quantum mechanical coupling versus the separation between interconnects. The interconnects are assumed to be made of polysilicon embedded in silicon dioxide. The energy barrier between polysilicon and oxide is assumed to be 3 eV, the electron subband energy is 1 eV below the barrier, and the longitudinal kinetic energy is 10 meV. The width W of the interconnects is 50 Å. The results are plotted for two different lengths of the interconnects, namely 1 cm and 10 cm. (b) The coupling parameter $|b|^2$ versus the length of the interconnect for two different spacings of 100 Å and 150 Å. All other parameters are the same as in Fig. 49a. After Bandyopadhyay (1992). © 1992 IEEE.
between silicide and silicon dioxide is 3 eV, which is close to the potential barrier between silicon and silicon dioxide. The interconnects are assumed to be 50 Å wide (the limit of present-day lithography capability). The data are presented for two different lengths L of the interconnects (L = 1 cm and L = 10 cm). In Fig. 49b, we show the coupling parameter as a function of the length of the interconnect for two different spacings of 100 Å and 150 Å. All other relevant parameters about the interconnects are displayed in the legend. It is evident from the results that coupling in optical interconnects is not a serious problem in ULSI. This is because GaAs/AlGaAs waveguides provide excellent confinement of the optical signal. However, quantum-mechanical tunneling can be quite serious in silicide interconnects embedded in silicon dioxide. As shown in Fig. 49a, the coupling parameter $|b|^2$ for quantum-mechanical coupling approaches 1 if the spacing is 100 Å. The coupling is always larger for a longer length of the interconnects since a
larger region of coupling is provided. The coupling is also large when the spacing between interconnects is small. In Fig. 49b, we see that the coupling oscillates with increasing length of the interconnect L when the spacing is small enough. This is easily understood from Eq. (IX.13), which shows that $|b|^2 = \sin^2(\zeta L)$, which is an oscillatory function of L. When the spacing is 100 Å, the coupling and ζ are large enough that a period of the oscillation occurs when L = 6,300 µm. For a spacing of 150 Å, the coupling and ζ are much smaller, so that a much larger length L would be required for a full period of the oscillation. Therefore, we do not see the oscillation for the spacing of 150 Å in Fig. 49b. Figure 49 shows that quantum-mechanical tunneling between silicide interconnects can be a serious problem in ULSI. Tunneling, of course, can be a coherent process or an incoherent process (phonon-assisted tunneling, for instance, is incoherent), and our analysis has dealt with coherent tunneling only. In the analysis, it was assumed that the electron wavefunction is coherent over the entire width of the interconnect, which is a good assumption for very narrow polysilicon interconnects (< 50 Å wide) even at room temperature and certainly below. Consequently, tunneling may cause significant crosstalk in ULSI circuits at room temperature and at cryogenic temperatures. To circumvent this problem, one may devise ways of destroying the coherence of the electron wavefunction to suppress tunneling. One possibility is to impregnate the intervening dielectric isolating the interconnects with a soft magnetic material in which an electron will suffer strong spin-orbit scattering. Since spin-orbit scattering is very efficient in destroying coherence, this may inhibit coupling between
FIGURE 50. The charge polarization in an array of quantum dashes containing single electrons. The black circles represent the mean positions of the electrons. The polarization in neighboring dashes is opposite, which mimics antiferroelectricity. After Bakshi et al. (1991); reprinted with permission.
neighboring interconnects and reduce crosstalk. The other obvious way of countering this problem is to use as isolating dielectrics those insulators that present a large energy barrier to tunneling. In this respect, silicon dioxide is better than silicon nitride, since the energy barrier to tunneling is usually larger with silicon dioxide, which has an energy gap of ~9 eV compared to ~5 eV for silicon nitride. The small energy gap of silicon nitride may preclude its use as an isolating dielectric in ULSI in spite of its other attractive properties, such as high resistance to Na+ diffusion.
X. QUANTUM-COUPLED ARCHITECTURES AND QUANTUM CHIPS
In the last section, we saw what problems may arise because of close physical proximity of interconnects in ultralarge-scale integrated circuits (ULSI). These problems are caused by unwanted coupling between wires. Similar unwanted coupling may also occur between devices and add to the crosstalk. This is unfortunate, since the only remedy seems to be decreasing device density on a chip, which defeats the very purpose of using quantum devices. After all, quantum devices have very low power dissipation (recall Section IV,C,1) and therefore can be packed much more densely than conventional devices. A solution to this apparent impasse is almost melodramatic. It may be possible to use the very effects which cause coupling between devices as effective interconnects for communicating signals between the devices. These “non-physical” wires do not have to be fabricated separately, and it is to be hoped that they will be faster than physical wires, since they are much “shorter.” Chip architectures based on this concept are termed quantum-coupled architectures. Here quantum-mechanical coupling between neighboring devices communicates signals back and forth between the devices without the presence of any actual interconnecting wire. This idea dates back to at least 1983 (Bate, 1983, 1985; Bate et al., 1987). The preferred prototypes for such schemes are cellular automata, since they are most synergistic with quantum coupling. Quantum coupling is short-range and is best suited to architectures that have nearest-neighbor interaction. Cellular automata not only meet this requirement, but also afford massive parallelization and are known to be fault-tolerant (Burks, 1970; Harao and Noguchi, 1975; Wolfram, 1986; Albert and Kulik, 1986; Gacs, 1986). Moreover, this architecture can perform any desired memory or logic function, and in some special cases is more efficient than other architectures.
Among its various applications are imaging of x-rays, blood-cell recognition, analyzing bubble chamber photographs, synthetic aperture radar imaging, and floating-point computations.
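The defining feature of a cellular automaton, namely that each cell is updated using only its own state and the states of its nearest neighbors, can be illustrated with a minimal one-dimensional binary automaton. The rule number (90), the periodic boundary conditions, and the function name below are arbitrary illustrative choices and are not taken from any of the proposals cited in this section.

```python
# Minimal one-dimensional binary cellular automaton with nearest-neighbor
# updates. Rule 90 and the periodic boundaries are assumed for illustration.


def step(cells, rule):
    """Apply one synchronous update; each new cell depends on (left, self, right)."""
    n = len(cells)
    updated = []
    for i in range(n):
        left = cells[(i - 1) % n]
        center = cells[i]
        right = cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right  # neighborhood as a 3-bit number
        updated.append((rule >> index) & 1)          # look up that bit of the rule
    return updated


cells = [0, 0, 0, 1, 0, 0, 0]
for _ in range(4):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells, 90)
```

Because every update is local, all cells can be updated in parallel, which is the property that makes such architectures a natural match for short-range coupling.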
Researchers at Texas Instruments have proposed specific implementations of quantum-coupled integrated circuits utilizing cellular-automata architectures (Bate et al., 1987, 1989; Bate, 1989; Randall et al., 1989a,b, 1990) without physical wires. In these circuits, inter-device communication is achieved through a variety of coupling mechanisms. For instance, the devices can be quantum-dot resonant tunneling devices, and signal is transferred from one device to the next via tunneling of electrons. Switches and logic gates (INVERTERS, NAND and NOR gates) have been designed using this scheme. Also, more complicated logic functions, such as Shannon cells, have been designed (Randall et al., 1989b). In addition to tunneling, other possible coupling mechanisms that have been proposed to replace physical interconnects are electrostatic (or capacitive) coupling (Randall et al., 1989a), optical coupling, acoustic coupling (Randall et al., 1990), etc. The elimination of physical interconnects between devices is an important conceptual leap in the area of ultralarge-scale integrated circuits, since physical interconnects pose the ultimate obstacle to further miniaturization (Ferry et al., 1988, 1988; Bandyopadhyay, 1992). It would be most desirable to eliminate all physical interconnects in a ULSI chip, but in practice, this is not possible. Even though the on-chip interconnects between devices can be eliminated, the chip itself must communicate with the external world (power supply, terminals, other chips, etc.) through physical wires. Therefore, at least some devices on a chip must be connected to wires that will carry information back and forth from the external world. These devices will act as input/output ports. The Texas Instruments (TI) group’s suggestion was that these devices be placed at the periphery of the chip for a simple technological reason. Bonding pads typically consume the area occupied by several hundred thousand devices. 
Therefore, it would not be possible to bond to individual ultrasmall devices at the center of the chip (where packing is densest to facilitate short-range quantum coupling), since even the most sophisticated bonding technique would not have that kind of spatial resolution. Therefore, it is advisable to wire only devices at the edges of a chip (where packing is relatively sparse), and provide or retrieve data only to and from these devices. Another important idea that naturally evolved out of the TI group’s work is the concept of using the many-body physical ground state of an ensemble of interacting logic devices to represent the result of a computation. Although the idea of computing with the ground state is also implicit in Hopfield-type artificial neural networks (Hopfield, 1984), it was new in the context of quantum-coupled architecture. This idea was refined and implemented in an elegant scheme by Bakshi and co-workers (1991), which we will discuss later. The basic idea is the following. Individual bistable logic devices at the periphery of a chip (the input ports) receive the input
string, which switches some or all of them. This may take the entire system into an excited state, and the excitation is communicated in a domino-like fashion throughout the chip via nearest-neighbor coupling between devices. The system is then allowed to cooperatively relax to its many-body ground state after the input is removed. By some clever arrangement, the cooperative interactions between the devices are so engineered that once the many-body physical ground state is reached, the logic states of the output devices (which are also placed on the periphery of the chip) represent the results of the computation in response to the input. The output is then read from these devices. Every time a binary input string arrives, a new ground state is attained, and the logic states of the output devices contain the result of the computation. This “ground-state computing” scheme is actually a necessary requirement of truly interconnectless architectures. Since the internal devices on a chip are not connected to any power source, there is no mechanism to retain them perpetually in excited states, and they will inevitably decay to the lowest energy states. The obvious advantages of ground-state computing are the inherent stability of the device, reliability of computation, fault tolerance, improved noise immunity (noise perturbations may cause the device to stray from the ground state, but it will ultimately relax back to the ground state), possible non-volatility, and the elimination of the need to provide refresh cycles that are responsible for about 80% of the power consumption of a conventional chip. These are major advantages. The concept of ground-state computing in an interconnectless architecture was elegantly utilized in a scheme recently proposed by Bakshi and co-workers (1991).
In this scheme, regimented arrays of elongated semiconductor quantum dots (termed “quantum dashes”) are fabricated on a wafer with dimensions small enough that each dash contains a single conduction band electron. With the right arrangement and geometry, the many-body ground state of this array will be such that the lone electron in each dash will be displaced towards one edge or the other of the dash, giving rise to a net charge polarization in each cell. The polarizations in neighboring dashes are opposite when the system is in the ground state. This mimics antiferroelectricity, as shown in Fig. 50. Each dash can now act as a binary logic device, with the two possible directions of the charge polarization representing the two binary bits. Computation is performed by providing input to (orienting the polarizations in) dashes at the periphery of a chip. These polarizations modify those in the neighboring dashes via Coulomb interaction, and the effect propagates in a domino-type fashion through the entire array. The final configuration of the individual polarizations in the dashes (the new ground state) will represent the result of the computation. Note that this is an interconnectless architecture (Coulomb interaction plays
the role of interconnects), and it also embodies the idea of ground-state computing. A slight variation of the Bakshi scheme (Bakshi et al., 1991) was proposed by Lent et al. (1993) in which a logic device is constructed by arranging five semiconductor quantum dots in the shape of an X. There are only two conduction-band electrons in the entire device, which (in the ground state of the system) can occupy either the two pivotal dots in one limb of the X or the two pivotal dots in the other. This gives rise to two possible charge polarizations along the two limbs of the X, which can encode the two binary bits. It has been shown (Lent et al., 1993) that because of Coulomb repulsion between electrons, the polarization that is favored in one device is the same as that of its nearest neighbor. Therefore, switching the polarization of one device will switch that of its nearest neighbor via Coulomb interaction, and the effect will again propagate in a domino fashion. This scheme is identical to that of Bakshi et al., with the only difference being that the ground state mimics ferroelectricity rather than antiferroelectricity. However, it was claimed that such a scheme affords better bistability of the charge polarization than the scheme of Bakshi et al. (1991) even though the electron mechanics for switching such a device is much more complicated. In the scheme of Bakshi et al. (1991), switching requires merely pushing an electron from one side to the other in a quantum dash. In the scheme of Lent et al. (1993), switching would require that each of the two electrons in a limb tunnel through the central dot and occupy the pivotal dots in the other limb, a significantly more complicated motion.
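The idea of ground-state computing in a chain of coupled bistable cells can be sketched with a toy model. Here each cell's charge polarization is reduced to a value of +1 or -1, neighboring cells interact through a single nearest-neighbor coupling energy, and the ground state is found by brute-force enumeration with the input cell held fixed. This Ising-like energy function is an assumption made purely for illustration; it is not the Hamiltonian used by Bakshi et al. (1991) or Lent et al. (1993), and the function names are hypothetical.

```python
from itertools import product

# Toy model: polarizations are +1 or -1 and the energy is
# coupling * sum(s_i * s_{i+1}). A positive coupling favors opposite
# neighbors (antiferroelectric-like order); a negative coupling favors
# equal neighbors (ferroelectric-like order).


def chain_energy(spins, coupling):
    """Total nearest-neighbor interaction energy of the chain."""
    return coupling * sum(a * b for a, b in zip(spins, spins[1:]))


def ground_state(n_cells, input_spin, coupling):
    """Lowest-energy configuration with the first (input) cell held fixed."""
    best_energy, best_spins = None, None
    for rest in product((+1, -1), repeat=n_cells - 1):
        spins = (input_spin,) + rest
        energy = chain_energy(spins, coupling)
        if best_energy is None or energy < best_energy:
            best_energy, best_spins = energy, spins
    return best_spins


antiferro = ground_state(5, +1, coupling=+1.0)
ferro = ground_state(5, +1, coupling=-1.0)

print("antiferroelectric-like chain:", antiferro)
print("ferroelectric-like chain:    ", ferro)
print("output cell after flipping the input:", ground_state(5, -1, +1.0)[-1])
```

Flipping the input cell changes the ground state of the whole chain, so the polarization of the last cell can be read as the output without any wire connecting it to the input.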
In addition to the foregoing schemes, there have been earlier proposals for quantum-coupled cellular automata architectures which utilize multistationary quantum states of molecules (Carter, 1982, 1987) or semiconductor quantum dots (Obermayer et al., 1988; Teich et al., 1988; Teich and Mahler, 1992). Logic operations are performed by selectively driving optical absorption resonances that induce transitions between these states. Cellular automata rules are configured by Coulomb interaction between nearest neighbors (Teich et al., 1988). Teich et al. (1988) provided a detailed scheme for realizing a one-dimensional one-way cellular automaton. More recently, an elegant scheme has been proposed for realizing a one-dimensional quantum-coupled cellular automaton using a heteropolymer excited by sequences of resonant laser pulses (Lloyd, 1993). Each molecule in the heteropolymer has a ground state and a long-lived excited state, which act as the binary logic levels. Switching between these levels is accomplished by the laser pulse. Logical operations take place coherently, and dissipation is required only for error correction. This scheme is potentially capable of approaching the requirements of a dissipationless (reversible) computer
(Landauer, 1961; Lecerf, 1963a,b; Bennet, 1973, 1982; Benioff, 1980, 1982a,b; Fredkin and Toffoli, 1982; Likharev, 1982; Feynman, 1982, 1985, 1986; Zurek, 1984; Peres, 1985). However, the problem with this approach is that it does not use the concept of ground-state computing and therefore relies on metastable states for computation. Consequently, it does not have the advantages of stability, high noise margin, etc., which are the hallmarks of the schemes of Bakshi et al. (1991) and Lent et al. (1993).
A. Shortcomings of Quantum-Coupled Devices
Until now, we have extolled the merits of quantum-coupled devices and ground-state computing. However, these devices and architectures are not devoid of shortcomings. They all have a few problems, and some may have more disadvantages than advantages. We can use the scheme of Lent et al. (1993) to illustrate the major problems and possible pitfalls. Specifically, we will point out two problems that are quite generic in nature. Switching of the logic device in the Lent scheme requires the physical movement of precisely one electron from one dot to another. Trapping of that lone electron in any one dot would result in catastrophic failure. Carrier trapping and its aftermath are well-known problems to device engineers in the charge-coupled-device (CCD) community. CCDs are classical analogues of single-electron devices. The difference between the two is that in CCDs, the charge packet typically contains about 10,000 electrons, whereas in single-electron devices, that number is precisely 1. Therefore, single-electron devices are much more vulnerable to trapping than CCDs. Even in a CCD, the charge transfer inefficiency arising from trapping of a few carriers by material defects limits the bit rate in sequential memory to less than 1 Mbit/s (Carnes and Kosonocky, 1974; Pierret, 1978; Schroder, 1989). Therefore, in single-electron devices, the bit rate will have to be unacceptably low, possibly orders of magnitude lower than what is found in garden-variety classical memory chips. The only way to alleviate this problem (which is endemic to all granular electronic devices that require charge movement) is to devise a scheme where switching does not require physical movement of an electron. Such a scheme, recently proposed by Bandyopadhyay et al. (1993), will be discussed later. The trapping problem is not peculiar to the scheme of Lent et al. (1993) alone. To some extent, it also afflicts the scheme of Bakshi et al. (1991), but to a lesser degree. 
There, switching requires the movement of an electron within the same quantum dash, not between two different quantum dashes. This reduces the trapping probability, which is an advantage. The TI scheme (Bate, 1989; Bate et al., 1989; Randall et al., 1990), on the other
hand, does not suffer from this problem, even though the “quantum dot” resonant tunneling devices are not much different from single-electron tunneling devices. The difference in the TI scheme is that switching does not depend on the movement of discrete charge packets with no tolerance for any variation in the number of electrons in the packet. Instead, it depends on resonant tunneling current. The second problem pertains only to those devices that rely on tunneling for switching of logic levels. It is believed that tunneling devices may suffer seriously from irreproducibility (Landauer, 1989a,b, 1990). Tunneling probability and tunneling speed are exponentially sensitive to barrier heights and widths, which cannot be controlled with sufficient precision (Landauer, 1989b) unless the barriers are fabricated by rather sophisticated and expensive film growth techniques such as atomic layer epitaxy (ALE). Moreover, in resonant tunneling devices, the current depends sensitively on the relative transmission of two barriers which are not grown simultaneously (Landauer, 1989b). The onset of negative differential resistance also depends on the width of the intervening well. Because of all this, fabricating devices with a specified set of characteristics may not be easy. However, recent advances in ALE have made it possible to achieve monolayer control of film thickness across an entire wafer. This may now result in adequate reproducibility and yield of resonant tunneling integrated circuits. In contrast to resonant tunneling devices, the barrier widths in the devices of Lent et al. (1993) are defined by lithography, which cannot provide anywhere near the monolayer resolution afforded by ALE. Therefore, the reproducibility (and consequently the yield) of such devices will probably be poor. This is also true of lateral resonant tunneling devices (Chou et al., 1989; Ismail et al., 1989; Bouchard et al., 1991) with lithographically defined barriers. 
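The exponential sensitivity of tunneling to barrier width can be made concrete with a rough WKB estimate for a rectangular barrier, T ≈ exp(-2κd) with κ = sqrt(2m*(V - E))/ħ. The following is only a sketch: the effective mass (0.067 free-electron masses), the 0.3 eV barrier, the 5 nm nominal width, the 0.283 nm monolayer thickness, and the 2 nm lithographic error are illustrative assumptions, not values taken from the schemes discussed above.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M0 = 9.1093837015e-31    # free-electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

def decay_constant(barrier_ev, m_eff=0.067):
    """Evanescent decay constant kappa (1/m) for an electron barrier_ev below the barrier top.

    m_eff is an assumed effective mass in units of the free-electron mass.
    """
    return math.sqrt(2.0 * m_eff * M0 * barrier_ev * EV) / HBAR

def wkb_transmission(width_nm, barrier_ev, m_eff=0.067):
    """WKB estimate exp(-2*kappa*d) for a rectangular barrier of width d."""
    return math.exp(-2.0 * decay_constant(barrier_ev, m_eff) * width_nm * 1e-9)

def transmission_ratio(width_nm, delta_nm, barrier_ev, m_eff=0.067):
    """Factor by which transmission drops when the barrier widens by delta_nm."""
    return (wkb_transmission(width_nm, barrier_ev, m_eff)
            / wkb_transmission(width_nm + delta_nm, barrier_ev, m_eff))

if __name__ == "__main__":
    # Assumed numbers: 5 nm nominal barrier, 0.3 eV high.
    print("one-monolayer error:", transmission_ratio(5.0, 0.283, 0.3))
    print("2 nm lithographic error:", transmission_ratio(5.0, 2.0, 0.3))
```

With these assumed numbers, a one-monolayer width error changes the transmission by roughly half, whereas a 2 nm error changes it by more than an order of magnitude, which illustrates why lithographically defined barriers are expected to give poor reproducibility.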
Devices with poor reproducibility are unsuitable for high-density integrated circuits where billions of devices must be fabricated with nominally identical characteristics. Therefore, tunneling devices with lithographically defined barrier widths will most likely be unsuitable for integrated circuits. The reproducibility problem associated with tunneling devices is fortunately nonexistent in the scheme of Bakshi et al. (1991), where switching is achieved by simply skewing the wavefunction of an electron in a cell from one side to another. Since the switching characteristic is not sensitive to the precise dimensions or shape of the dashes (as long as they contain a single electron), this scheme has much better fabrication tolerance. The fabrication tolerance vastly improves the reproducibility of individual devices, which is a requirement for any realistic scheme. We believe that it is the combination of single-electron transfer and transfer by tunneling that may be lethal as far as quantum integrated circuits are concerned.
B. Quantum-Coupled Spin-Polarized Single-Electron Logic Devices
We now discuss a scheme, recently proposed by Bandyopadhyay et al. (1993), that does not suffer from the two drawbacks of quantum-coupled devices previously discussed. It utilizes spin-polarized single electrons as binary logic devices. The spin of the electron encodes the bit information. The desire for single-electron devices has a well-founded basis. In all devices that we have discussed in this article [with perhaps the sole exception of the turnstile device of Geerligs et al. (1988)], several electrons contribute to the current at any time. As a result, many electrons (not just one) determine the device characteristics, and we always have to ensemble average over them to find out how the device will behave. The ensemble averaging has many deleterious effects (see footnote 22). As discussed before, it could wash out any observable quantum interference effect if different electrons interfere differently. In connection with Aharonov-Bohm devices, we showed that in order to prevent this, two-dimensional structures need to be really ballistic, so that there is no scattering, elastic or inelastic. One-dimensional structures are somewhat better, since elastic scattering is highly suppressed. In addition, the temperature needs to be low, so that the spread in the energy of the electrons (which introduces a corresponding spread in the phase shift) remains small. All the preceding difficulties can be surmounted if one can conceptualize a functional device whose operation is determined by some property of a single electron which does not depend on any external or system parameter or the environment, such as temperature. Such a device can be immune to the deleterious effects of ensemble averaging and can provide room-temperature operation. Moreover, if the property is robust, then the device can have a very high noise margin. Recently, such a concept was put forth by Bandyopadhyay et al. 
(1993), where a single electron by itself acts as a binary logic device. The spin of the electron (which in some cases can assume only one of two polarizations and hence becomes a binary variable) encodes the bit information. Since the spin polarization does not depend on the electron energy or any other such parameters, these devices are not vulnerable to the harmful effects of ensemble averaging. Moreover, the spin is a rather robust physical quantity and cannot be flipped by any external perturbation other than a strong magnetic field.
[Footnote 22: The ensemble averaging is, of course, more serious for some devices than others. The reader will remember that it is less serious for magnetostatic Aharonov-Bohm effect devices and spin precessing transistors than, say, electrostatic Aharonov-Bohm devices.]
As a result, the proposed devices have some rare properties:
(a) They can operate at room temperature, (b) they are practically immune to electrical noise and have high reliability, and (c) they can act as nonvolatile memory. Bandyopadhyay et al. (1993) have proposed a quantum-coupled architecture utilizing spin-polarized single-electron logic devices. Their scheme embodies the concept of ground-state computing and is fashioned after that of Bakshi et al. (1991). We will describe it in the next section. In some ways, this is the culmination of many aspects of quantum device and architecture research. The device itself is the ultimate quantum device: a single electron, which is the smallest possible maneuverable entity. It is also a superdevice in that it has none of the usual disadvantages of quantum devices, such as low noise margin, irreproducibility, and low temperature of operation, and yet it has all the advantages such as high speed, high density, and low power dissipation. Finally, the architecture is also superior to most quantum-coupled architectures, as we shall show later.
1. Spin-Polarized Single-Electron Logic Devices and Architecture
We now examine how the spin-polarized single-electron logic chip works. It works in exactly the same way as the scheme of Bakshi et al. (1991), where nearest-neighbor quantum dashes had opposite charge polarizations, thereby mimicking antiferroelectricity. In the case of spin-polarized single electrons, nearest-neighbor electrons confined in quantum dots or dashes with single-electron occupancy will have antiparallel spins. This mimics antiferromagnetism. Therefore, the only difference between this scheme and that of Bakshi et al. (1991) is that in one case an electron’s spin orientation encodes the bit information, and in the other it is the charge polarization. There are other possible ways of utilizing spin-polarized single-electron chips to perform computation. 
The traditional cellular automaton scheme requires that the logic state of each element be determined uniquely by the logic states of its nearest neighbors. Computation proceeds by advancing the logic devices through each time step with an external clock. The clock can be realized by an oscillating, globally applied ac (square wave) magnetic field (Landauer, 1992) superimposed on the weak dc magnetic field. The ac field can flip the spins in each cycle. The direction of the spin in any cell at any time step depends on the previous spin orientation and the orientation of the immediate neighbors. One can configure cellular automaton rules in this fashion. This possibility is very attractive for an interconnectless architecture, since individual devices do not have to be contacted for the clock terminal. Moreover, it may be possible to approximate dissipationless (reversible) computation by using a coherent resonant electromagnetic field
to drive the transitions between spin states via magnetic dipole interactions. Although this is an intriguing possibility, we shall not discuss it here. The use of spin polarization in lieu of charge polarization has the following advantages: (a) bistability of the spin orientation can be much stronger than the bistability of charge polarization; (b) the bistability is not affected by temperature, unlike in the case of charge polarization, where the bistability is smeared out at slightly elevated temperatures (~4.2 K); (c) switching the polarization (and hence a logic bit) does not require physical movement of charge from one region of space to another; and (d) spin is immune to most types of electrical noise. The concept of using electron spin to represent binary bits is quite old (Benioff, 1980, 1982a,b; Deutsch, 1985; Feynman, 1982, 1985, 1986; Zurek, 1984; Peres, 1985). However, in order to translate this concept into practice, one needs a specific computing paradigm and a way to realize it. Once these are established, four other requirements must be met. They are (a) a mechanism for switching between the logic states (spins), (b) a layout scheme to realize different circuit topologies to perform different logic functions, (c) a communication method (interconnection) between different logic devices to transfer information back and forth for computation (for this, the state of one logic device must determine that of its nearest neighbor, to which it is connected via quantum coupling); the communication may have to be unidirectional or highly nonreciprocal in order to provide isolation between input and output stages, and (d) the ability to read and write bit information in selected (input/output) devices. These selected devices provide the link between the chip and the external world. 
The first requirement, namely switching a device from one logic state to another, is accomplished by flipping an electron’s spin, either with a locally applied magnetic field (externally induced during writing of data) or by spin-spin coupling between two nearest-neighbor electrons (internally induced during computation). A highly localized magnetic field can be applied to a single electron by a spin polarized scanning tunneling microscope (SPSTM) tip (Manassen et al., 1989; Wiesendanger et al., 1990, 1992). We will discuss this in more detail later on. The second requirement is met by laying out cells containing single electrons in various two-dimensional arrangements (by some patterning scheme) to realize different circuit topologies. The spatial arrangement determines how the single electrons (logic devices) are connected to each other, and this realizes various circuits. The entire chip can be fabricated in this way. The third requirement of inter-device communication is accomplished by quantum-mechanical (spin-spin) coupling. The spin of one electron affects that of its neighbor in the following way. In a two- or one-dimensional
arrangement of electrons subjected to a weak magnetic field, it is energetically favorable for two nearest neighbors to have opposite spins (we will show this later). Therefore, when the spin of one electron is switched during the write cycle, the neighbor feels it since the system goes to an excited state. The system can relax to the ground state only if the neighbor flips its own spin by emitting a phonon or magnon. This, in turn, flips the spin of the next electron, and so on. The effect again propagates in a domino-like fashion until the entire system of electrons has achieved a new ground-state spin configuration. This new configuration is the result of the computation in response to the input. This is the basic computing paradigm. Note that information is transmitted across the entire chip by perturbations that act like “spin waves.” These waves typically have speeds that are about two to three orders of magnitude smaller than the speed of light. The only other requirement that needs to be met is that of nonreciprocal or unidirectional transfer of information between devices to isolate input and output stages. This is the most difficult requirement in these schemes and is addressed later on. Finally, the last requirement of reading and writing bit information in selected elements is fulfilled through the use of SPSTM tips. These tips can orient (write) the spin of the lone conduction-band electron in a chosen cell by creating a localized magnetic field with atomic resolution and also measure (read) the spin polarization of such an electron in an isolated cell. This provides the link between the chip and the external world.
2. Antiferromagnetism: The Basis of the Computing Paradigm
We now discuss the basis of ground-state computing. The sole basis is the fact that two conduction-band electrons in two nearest-neighbor single-electron cells will tend to have their spins antiparallel (even in the presence of a sufficiently weak magnetic field). This is the same basis as that in the scheme of Bakshi et al. (1991). The antiferromagnetic ordering is caused by exchange and correlation, which affect only conduction-band electrons in neighboring cells since the valence-band or core electrons in any cell are highly localized, and their wavefunctions do not overlap with those of others in a different cell. Consequently, there is no exchange interaction and spin-spin coupling between valence-band or core electrons in different cells. One can neglect the valence-band or core electrons in the theoretical analysis and concentrate only on the lone conduction-band electron in each cell. The rest of the cell merely acts as a core potential for confining the lone conduction-band electron and serves no other purpose. Therefore, inter-cell interaction consists only of the quantum-mechanical (Hartree, exchange and correlation) interaction between two nearest-neighbor conduction-band
electrons in nearest-neighbor cells. In such a case, the spins of nearest-neighbor conduction-band electrons will assume antiparallel orientations in the ground state (antiferromagnetic ordering). This will be true even in the presence of a magnetic field, as long as the field is weak enough that the Zeeman splitting induced by the field is smaller than the exchange splitting (the energy difference between the triplet and singlet states of a two-electron system). The fact that antiferromagnetism is the preferred ground state does not, of course, guarantee that it is stable. For a long time it was believed that antiferromagnetism in a one- or two-dimensional Ising model is not stable against lattice perturbations (phonons) and spin waves (magnons) at any temperature above absolute zero (Peierls, 1935; Landau, 1937). However, it is now understood that the instability occurs only in infinite systems, whereas finite systems are theoretically stable (Gunther, 1967). Therefore, it is possible to sustain stable antiferromagnetism in a finite system of restricted size at finite temperatures. Unlike a one-dimensional system, a two-dimensional system can exhibit a phase transition to an ordered magnetic state at a nonzero temperature, even if it is infinite in extent. The critical temperature T_c for such a phase transition (in a perfectly periodic infinite Ising system) is given by the Onsager relation (Ziman, 1972; Ashcroft and Mermin, 1976)
$$ k T_c = \frac{2|J|}{\ln(1+\sqrt{2})} \approx 2.269\,|J|, \qquad (11.1) $$
where J is the exchange splitting. This splitting is the energy difference between the singlet and triplet states of two electrons in nearest-neighbor cells. In the Heisenberg model, the sign of J determines whether the phase transition occurs to a ferromagnetic or antiferromagnetic state. Since J is negative, antiferromagnetism is preferred. One would want the phase transition temperature to exceed room temperature so that the antiferromagnetism may be sustained at room temperature. This is a necessary (but not a sufficient) condition to allow room-temperature operation of the circuits. Obviously, this condition can be met by making the exchange splitting |J| sufficiently large. A large splitting also causes a large energy difference between antiferromagnetic and ferromagnetic ordering, which improves the stability of the ground state and the fault tolerance while reducing the bit error rate. The probability of a bit error can be interpreted as the probability that two nearest-neighbor cells have parallel spins. This probability is ~exp[-|J|/kT] at a temperature T. For it to be small, we require the magnitude of J to be large. Fortunately, unlike in a natural system, the exchange splitting J in an artificially structured two-dimensional array can be engineered. The
magnitude of the splitting depends on two factors: (a) the separation between adjacent single-electron cells, and (b) the size and shape of the cells, which determine the degree of quantum confinement for each electron. By adjusting these parameters, one can make J sufficiently large to allow room-temperature operation. The magnitude |J| was estimated by Bandyopadhyay et al. (1993) for a model two-electron system by solving the pertinent Schrödinger equation numerically. From these solutions, it was found that (a) the singlet state (with antiparallel spins) is indeed lower in energy than the triplet state (with parallel spins), and (b) the energy difference |J| between the two states can be larger than 100 meV if the cells have a diameter of 20 Å and are separated from each other by barriers that are 20 Å wide and 1 eV high. These parameters may not be too difficult to realize in real systems. Bandyopadhyay et al. (1993) proposed a technique involving nanophase particles patterned by scanning tunneling microscopes. In such systems, antiferromagnetism will be strongly preferred, even in the presence of a weak magnetic field, as long as the Zeeman splitting induced by the magnetic field is significantly smaller than 100 meV. One also finds that with these parameters, the critical temperature for phase transition to antiferromagnetism [see Eq. (11.1)] is ~2,600 K (theoretically), which is much above room temperature.
C. Logic Circuits Using Spin-Polarized Single Electrons
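Before turning to the circuits, the numerical estimates that close the preceding subsection can be checked with a few lines of arithmetic. The sketch below assumes that the Onsager relation takes its standard square-lattice form, k T_c = 2|J|/ln(1 + sqrt(2)), with J read as the exchange splitting; it uses the bit-error estimate exp(-|J|/kT) quoted in the text, and it assumes a free-electron g factor of 2 for the Zeeman splitting. The function names and the 1 T field are illustrative choices, not part of the original analysis.

```python
import math

K_B_EV = 8.617333262e-5    # Boltzmann constant, eV/K
MU_B_EV = 5.7883818060e-5  # Bohr magneton, eV/T

def onsager_tc(j_ev):
    """Critical temperature (K) from the assumed square-lattice Onsager form."""
    return 2.0 * abs(j_ev) / (K_B_EV * math.log(1.0 + math.sqrt(2.0)))

def bit_error_probability(j_ev, temperature_k):
    """Estimate exp(-|J|/kT) that two neighbouring cells have parallel spins."""
    return math.exp(-abs(j_ev) / (K_B_EV * temperature_k))

def zeeman_splitting_ev(b_tesla, g_factor=2.0):
    """Zeeman splitting g*mu_B*B (eV); g = 2 is an assumption."""
    return g_factor * MU_B_EV * b_tesla

if __name__ == "__main__":
    j = 0.100  # exchange splitting of 100 meV, as quoted in the text
    print("T_c (K):", onsager_tc(j))
    print("bit error estimate at 300 K:", bit_error_probability(j, 300.0))
    print("Zeeman splitting at 1 T (eV):", zeeman_splitting_ev(1.0))
```

Under these assumptions, a 100 meV splitting gives a critical temperature of about 2,630 K, consistent with the ~2,600 K quoted above; the same expression for the bit-error probability evaluates to about 2 percent at 300 K, and a 1 T field produces a Zeeman splitting of only about 0.12 meV, far below the exchange splitting.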
Following Bandyopadhyay et al. (1993), we now show the design of various logic circuits with single-electron cells. Their implementation requires only two basic properties, namely (a) antiferromagnetic ordering of cells, i.e., any two nearest-neighbor electrons have opposite spins, and (b) only nearest-neighbor interactions matter; second-nearest neighbor interactions between cells are unimportant. These two properties together imply that if the spin of any one electron in a cell is “up,” then the spin of its nearest neighbor must be “down” even if a third downspin electron is only incrementally farther away. The basis for assuming only nearest-neighbor interaction is that spin-spin coupling is short-range and decays exponentially with distance, unlike Coulomb or electrostatic interaction. The short-range interaction is of course ideally suited for cellular automata, where each cell must have interactions with only some of its neighbors and not with others (Landauer, 1992). Spin-spin coupling satisfies this requirement, while unscreened Coulomb interaction between charge dipoles (Bakshi et al., 1991; Lent et al., 1993) may not. The latter interaction is rather long-range, so that second- or third-nearest neighbor
interaction is quite strong. It may appear that the introduction of charge screening will mitigate this problem to a large extent, since the screened Coulomb interaction decays exponentially with distance. Unfortunately, screening may also inhibit self-polarization of the quantum dots and dashes, as predicted by Bakshi et al. (1991) and Lent et al. (1993). Moreover, screening can weaken or eliminate the bistable behavior of the polarization, which is the heart of the physics. Therefore, dipole interaction may not be optimum for cellular automata. In the following section, we will demonstrate the design of some logic gates and circuits (after Bandyopadhyay et al., 1993) that may be more suitable for conventional architecture than cellular automata. This does not mean that the cellular automaton approach is inappropriate for these devices. The purpose here is to demonstrate the versatility of the proposed technology by designing logic gates that are compatible with conventional architecture as well. To demonstrate these gates, we will adopt the convention that the “up” spin state is logic level 1 and the “down” spin state is logic level 0. Similar logic gates in quantum-coupled architecture were also designed by Bate et al. (1987), Bakshi et al. (1991), and Lent et al. (1993) using their schemes.
1. NOT Gates (Inverters)
It is obvious that a system of just two coupled electrons (in closely spaced cells) constitutes a natural inverter. If the spin of one is “up,” then the spin of the other must be “down,” and vice versa, since the ordering is antiferromagnetic. Therefore, if we consider the electron spin in one cell to be the input and the other to be the output, the output will always be the inverted version of the input. This realizes a NOT gate, which is schematically depicted in Fig. 51.
2. AND and NAND Gates
To construct a NAND gate, one needs three equally spaced cells in a linear chain (Fig. 52). The two extreme cells are the two input ports, and the one in the middle is the output port. If the spins in the two extreme cells are
FIGURE 51. An inverter (NOT gate) utilizing spin-polarized single conduction-band electrons confined in ultrasmall cells. The spin state (logic level) of one electron (input) is the inverted version of the spin state or logic level of the other (output).
FIGURE 52. (a) A NAND gate, and (b) an AND gate, each shown as a nanophase realization with inputs A and B and output Y. Also shown are the four possible spin configurations of the array, which correspond to the Boolean truth table.
TABLE II
TRUTH TABLE OF THE NAND GATE SHOWN IN FIG. 52

Input 1    Input 2    Output
1          1          0
0          0          1
1          0          1
0          1          1
oriented “up” (i.e., both inputs are held at logic level 1), then the spin in the middle cell must be “down” for antiferromagnetic ordering. Similarly, when both inputs are held at logic level 0, the output will be at 1. Now, if one of the inputs is 1 and the other is 0, then the output can be either 1 or 0, since these two possibilities appear to be energetically degenerate. In reality, they will not be energetically degenerate because of the weak external dc magnetic field applied globally on the entire chip. In fact, it is for this reason alone that the field is required. The field induces a small Zeeman splitting (smaller than the exchange splitting J) between the “up” and “down” spin states, and thus defines a preferred orientation. Let us assume that the direction of this field is such that the “up” spin state is favored. Therefore, if one of the two inputs is at logic level 1 and the other is at logic level 0, the output will be at logic level 1. The resulting truth table is shown in Table II. It is easy to verify that this is the truth table of a NAND gate. The NAND gate can be converted to an AND gate by directing the output of the NAND gate through an inverter. This requires four cells in a nonlinear chain. The two extreme cells are the input ports, and the one off the line is the output port. The spin orientations in the various cells for various inputs are also shown in Fig. 52. It can be easily verified from this diagram (which is essentially the “truth table”) that this system is an AND gate.
3. OR and NOR Gates
The OR gate can be realized from NAND gates and inverters through an application of De Morgan’s law of Boolean algebra. This law states $\overline{\bar{A}\,\bar{B}} = A + B$, where A and B are two binary Boolean quantities. The right-hand side of this equality is the OR function of the two quantities A and B. Therefore, an OR gate can be realized by realizing the left-hand side of the equality, using NAND gates and inverters. The realization is shown in Fig. 53.
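The gate constructions described so far can be mimicked with a toy energy-minimization model. The sketch below is not the calculation of Bandyopadhyay et al. (1993); it assumes an Ising-like energy in which each parallel nearest-neighbor pair costs J_AF, and a weak dc field lowers the energy of an "up" spin by H_DC (both values are arbitrary illustrative units with H_DC smaller than J_AF). Input spins are held fixed, and the free output cell takes whichever orientation has the lower energy. Chaining gates as ordinary function calls additionally assumes the unidirectional information flow that the text identifies as the most difficult requirement.

```python
from itertools import product

J_AF = 1.0    # assumed energy cost of a parallel nearest-neighbour pair (arbitrary units)
H_DC = 0.05   # assumed weak Zeeman bias favouring spin "up"; must stay below J_AF

def to_spin(bit):
    """Map logic level 1 to spin +1 ("up") and logic level 0 to spin -1 ("down")."""
    return 1 if bit else -1

def relax_output(input_bits):
    """Logic level of a free cell coupled to fixed neighbouring input cells."""
    def energy(s_out):
        exchange = J_AF * sum(to_spin(b) * s_out for b in input_bits)
        zeeman = -H_DC * s_out
        return exchange + zeeman
    return 1 if energy(+1) < energy(-1) else 0

def not_gate(a):
    """Two-cell inverter: one input neighbour."""
    return relax_output([a])

def nand_gate(a, b):
    """Three-cell linear chain: the middle cell sees both inputs."""
    return relax_output([a, b])

def or_gate(a, b):
    """De Morgan: A + B equals the NAND of the inverted inputs."""
    return nand_gate(not_gate(a), not_gate(b))

if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, "NAND:", nand_gate(a, b), "OR:", or_gate(a, b))
```

In this model the dc field matters only when the two inputs disagree, in which case the exchange terms cancel and the bias alone decides the output, which is the role the text assigns to the field.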
FIGURE 53. (a) An OR gate, and (b) a NOR gate. Each panel shows a nanophase realization and the possible spin configurations of the array (truth table).
4. Exclusive OR Gates
Exclusive OR gates can be realized by using the exclusive OR relation $Y = (A + B)\,\overline{(AB)} = \overline{(\bar{A}\,\bar{B})}\;\overline{(AB)}$, where A and B are the two inputs and Y is the output. The array in Fig. 54 is an exclusive OR gate.
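The exclusive OR relation can be checked exhaustively at the Boolean level. The sketch below treats the NAND gate and the inverter as ideal abstract functions (it does not model the spin array of Fig. 54), and the helper names are illustrative.

```python
def nand(a, b):
    """Ideal two-input NAND on logic levels 0 and 1."""
    return 1 - (a & b)

def inv(a):
    """Ideal inverter."""
    return 1 - a

def xor_gate(a, b):
    """Y = (A + B) AND NOT(AB), built only from NAND gates and inverters."""
    a_or_b = nand(inv(a), inv(b))   # De Morgan form of A + B
    not_ab = nand(a, b)             # complement of AB
    return inv(nand(a_or_b, not_ab))  # AND of the two factors

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_gate(a, b))
```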
FIGURE 54. An exclusive OR gate (inputs and output labeled on a nanophase realization).
5. Combinational Digital Systems for Arithmetic Logic Units
A digital computer is required to perform only two basic types of functions: logic operations and memory storage (Millman, 1979). Logic operations are achieved through combinational digital systems (consisting of logic gates), while random-access memory (RAM) can be realized through sequential digital systems. The following example, taken from Bandyopadhyay et al. (1993), is that of a basic combinational system: the binary half adder. In a half adder, if A and B are the two binary addends, S the sum, D the least significant digit of the sum, and C the carry, then D is the exclusive OR function of A and B [i.e., $D = (A + B)\,\overline{(AB)}$], while C is the AND function of A and B. The schematic realization of a half adder is shown in Fig. 55a, and the actual realization with single-electron cells is shown in Fig. 55b. The chip area consumed by such a system is only about 3,000 Å², which promises extremely high functional density. In a similar fashion, one can construct code converters, parity checkers, parity encoders, multiplexers, etc. These circuits are of course more complicated and are not presented in this article.
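The half adder described above can be written out at the gate level and checked against ordinary binary addition. This is a Boolean sketch only, assuming ideal NAND gates and inverters; it says nothing about the cell layout of Fig. 55b, and the function names are illustrative. Note that the term NAND(A, B) is computed once and reused for both the digit and the carry.

```python
def nand(a, b):
    """Ideal two-input NAND on logic levels 0 and 1."""
    return 1 - (a & b)

def inv(a):
    """Ideal inverter."""
    return 1 - a

def half_adder(a, b):
    """Return (digit, carry) for one-bit addends a and b."""
    not_ab = nand(a, b)                 # shared term: complement of AB
    a_or_b = nand(inv(a), inv(b))       # A + B by De Morgan's law
    digit = inv(nand(a_or_b, not_ab))   # D = (A + B) AND NOT(AB), i.e. exclusive OR
    carry = inv(not_ab)                 # C = AB
    return digit, carry

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            d, c = half_adder(a, b)
            print(f"{a} + {b} -> carry {c}, digit {d}")
```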
FIGURE 55. (a) A realization of a binary half adder using exclusive OR and AND gates; (b) a spin-polarized single-electron realization.
6. Sequential Digital Systems for Random Access Memory
As an example of a sequential digital system for memory, Fig. 56a shows the design of an SR flip-flop. Its nanophase realization is shown in Fig. 56b. The reader can verify that it indeed performs as required. Other types of flip-flops such as J-K and master-slave J-K can also be constructed in a similar fashion. From these flip-flops, all basic memory circuits such as shift registers and counters can be constructed.
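The behavior expected of a NAND-based SR flip-flop can be sketched as follows. The exact wiring of Fig. 56a is not reproduced here; the sketch assumes the common arrangement of two steering NAND gates, gated by an enable (clock) input, feeding a cross-coupled NAND latch. The `enable` input, the function names, and the settle-by-iteration loop are assumptions made for illustration.

```python
def nand(a, b):
    """Ideal two-input NAND on logic levels 0 and 1."""
    return 1 - (a & b)

def sr_step(s, r, enable, q, q_bar, max_iter=8):
    """Return the settled (Q, Q_bar) of an assumed four-NAND SR flip-flop.

    Steering gates form the active-low set and reset signals; the latch is
    then iterated until its two outputs stop changing.
    """
    set_n = nand(s, enable)
    reset_n = nand(r, enable)
    for _ in range(max_iter):
        new_q = nand(set_n, q_bar)
        new_q_bar = nand(reset_n, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar

if __name__ == "__main__":
    state = (0, 1)                        # start in the reset state
    state = sr_step(1, 0, 1, *state)      # set
    print("after set:  ", state)
    state = sr_step(0, 0, 1, *state)      # hold
    print("after hold: ", state)
    state = sr_step(0, 1, 1, *state)      # reset
    print("after reset:", state)
```

The input combination S = R = 1 with the enable high drives both outputs to 1 in this model and is not a meaningful memory state, as is usual for SR flip-flops.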
FIGURE 56. (a) A realization of an SR flip-flop using NAND gates (steering gates and latch), with its truth table; (b) a spin-polarized single-electron realization.
D. Reading and Writing Operations: Orienting and Detecting Electron Spins in Single-Electron Cells
We have discussed the computational paradigm and logic circuit design utilizing single-electron cells. The last requirement for realizing a complete functional system is the inputting and outputting of data, namely the READ/WRITE mechanism. The WRITE operation will align the conduction electron spin in a single-electron cell (which will be a particle with few atoms) to the desired orientation, while the READ operation will detect the orientation. It does not matter if the WRITE operation affects the valence-band and core electrons as well. Since they do not cause inter-cell interactions, they do not transfer any information, and their spin states are unimportant. However, the READ operation must single out the conduction-band electron (against the background of the valence-band and core electrons) in each cell, since only its spin is meaningful and carries
information. Fortunately, this is achieved if the READ operation involves a current which can be carried only by the conduction-band electron. For both READ and WRITE operations, one needs to access individual cells with virtually atomic resolution. This can be achieved with a spin-polarized scanning tunneling microscope (SPSTM), which offers atomic resolution as well as the ability to couple to an electron’s spin. We discuss this next. Since their inception, scanning tunneling microscopes (STMs) have been extensively used for surface analysis of atomic arrangements as well as nanofabrication (Binnig et al., 1982a,b; Staufer et al., 1988; Ehrichs et al., 1988). An SPSTM is a special type of STM in which the probe is constructed from a magnetic material. The tunneling current in this case depends on the magnetization of the probe (the spin orientation at the very tip), as well as the magnetization of the surface (spin orientation of the surface atoms). In the last few years, there has been considerable interest in SPSTM for fundamental studies of surface magnetism as well as for developing techniques for magnetic recording (Shvets et al., 1992). Using SPSTM, spin-polarized electrons have been observed on the surfaces of SiO2 and Cr (Manassen et al., 1989). Very recently, the imaging of magnetite (Fe3O4) was reported using an Fe tip (Wiesendanger et al., 1992). The use of SPSTM is possibly the ideal technique for spin READ/WRITE (spin alignment/spin detection) operations in single-electron cells.
1. Reading Mechanism
The READ operation can be performed by detecting the spin polarization in selected cells (at the periphery of the chip) with SPSTM tips. An SPSTM detects spin as follows. The tunneling current depends on the relative spin polarizations of the probe tip and the atom on the surface being probed. The probe tip has a fixed, known polarization. Thus, from a measurement of the tunneling current, the spin polarization in a surface atom (or a single-electron cell) can be determined at any time. Theoretical estimates (Minakov and Shvets, 1990) show that the currents for the two spin polarizations can differ by a factor of 3, which is sufficiently large for unambiguous spin detection. This allows one to perform the READ operation. Note that since only the conduction-band electron will contribute to the tunneling current, one has effectively filtered it out from the background of the valence-band and core electrons. This is an important requirement.
A question that may arise is whether, while singling out the conduction-band electron, the tunneling current will also disrupt the single-conduction-electron nature of the cell. However, this is not a problem. When the
FIGURE 57. Reading the spin of a nanophase particle using an SPSTM. The tunneling current measured by the SPSTM tip depends on the relative spin polarizations of the tip and the particle. (Labels in the figure: SPSTM probe; spin-polarized nanophase particles.)
current is shut off, the cell returns to single-conduction-electron occupancy since that is the equilibrium ground state. By making repeated measurements using pulsed currents, and time averaging over these measurements, one can detect the equilibrium spin state of any cell.
Of course, certain difficulties can be encountered in the READ operation. First of all, one cannot scan the probe across the chip to read the spin state in selected cells, since mechanical motion is unacceptably slow. Hence, a probe tip must be permanently attached to each output port (cell) along the chip periphery. This is shown schematically in Fig. 57. Other difficulties can be encountered because of magnetostriction and variations in cell size and shape. Magnetostriction can cause a change in the thickness and the shape of the cell (particle) itself. Since the gap between the probe tip and the particle is very small (~1 Å), even a small amount of magnetostriction will affect the tunneling current. A possible way to alleviate the problem is to use a non-magnetic probe tip to measure the particle height under two different magnetizations, and then use these data to calibrate the reading operation.
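The pulsed, time-averaged READ described above can be illustrated with a small numerical sketch. Everything in it other than the factor-of-3 current ratio quoted from Minakov and Shvets (1990) is an assumption made for illustration: the arbitrary current units, the Gaussian noise model, the midpoint decision threshold, and the function names `simulate_pulses` and `classify_spin` are hypothetical and do not come from the text.

```python
import random
import statistics

# Ratio between the tunneling currents for the two spin polarizations
# (the factor of 3 quoted in the text).
CURRENT_RATIO = 3.0


def simulate_pulses(mean_current, noise_sigma, n_pulses, rng):
    """Return n_pulses noisy current readings (arbitrary units).

    Gaussian noise is an assumption of this sketch, not a claim about
    real SPSTM measurements.
    """
    return [rng.gauss(mean_current, noise_sigma) for _ in range(n_pulses)]


def classify_spin(pulse_currents, low_current):
    """Time-average the pulses and compare with the midpoint threshold.

    Returns "high" if the averaged current lies above the midpoint of the
    two expected levels (low_current and CURRENT_RATIO * low_current),
    otherwise "low". Which spin orientation corresponds to the higher
    current depends on the tip polarization and is left open here.
    """
    high_current = CURRENT_RATIO * low_current
    threshold = 0.5 * (low_current + high_current)
    averaged = statistics.fmean(pulse_currents)
    return "high" if averaged > threshold else "low"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demonstration is repeatable
    pulses = simulate_pulses(mean_current=3.0, noise_sigma=1.0,
                             n_pulses=200, rng=rng)
    print(classify_spin(pulses, low_current=1.0))
```

With noise comparable to the lower current level, a single pulse can be misread, but the average over a few hundred pulses separates the two levels cleanly; this is the role the text assigns to time averaging.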
2. Writing Mechanism
The WRITE mechanism will polarize the spin of the electron in an input cell to the desired orientation. Again, SPSTM tips will be used for this purpose. To polarize an electron’s spin to a desired orientation, a sufficiently strong magnetic field needs to be generated locally (with atomic resolution and range). This can be achieved as follows. A soft magnetic probe is placed in close proximity to the cell (particle). During the writing operation, the probe is magnetized electrically. The magnetostatic force experienced by the particle (the basis for the Magnetic Force Microscope) is expected to alter its spin polarization.
E. Potential Problems in the Computing Scheme and Their Solutions
1. Unidirectional Isolation between Input and Output
In all electronic logic devices, a necessary requirement is isolation between input and output. Information should flow unidirectionally, i.e., the output of one device should drive the input of the next; however, the logic state of a device must not influence that of the preceding one. In other words, the interaction between devices must be non-reciprocal so that the input signal determines the output signal, but not vice versa. Under no circumstance should the output usurp the role of the input. In conventional active devices, this is accomplished through the device gain. The input signal is amplified on its way to the output port, whereas the output signal is attenuated in propagating back to the input. This automatically provides isolation. In passive microwave circuits where there is no gain, isolation is achieved through the use of non-reciprocal elements such as circulators or isolators based on Faraday rotation. Unfortunately, spin-polarized single-electron logic devices have no gain and nothing to emulate a Faraday rotator. This poses a problem with unidirectional isolation.
To understand how this problem arises, we refer to Fig. 58. There are two NOT gates in series, and Fig. 58a shows the equilibrium configuration of spins. Now imagine that the input of the first NOT gate is flipped by an external source such as an SPSTM (Fig. 58b). At this point, the spin state in the central cell becomes indeterminate, since the spin in the right cell favors the “upspin” state while the spin in the left cell favors the “downspin” state. If the external magnetic field (refer to the discussion of AND gates) favors the “upspin” state, then the spin in the central cell will not flip in response to the external input. In other words, the first NOT gate fails.
In fact, if the external input (SPSTM) is removed, the spin in the leftmost cell (input port) will flip back to “upspin.” The foregoing is an example of the input being determined by the state of the output port rather than the reverse. This is due to the lack of unidirectional isolation between input and output. This problem was probably never recognized in previous schemes and, to our knowledge, has never been addressed in the context of granular electronic devices. We believe that this is the major problem with such devices and that it may ultimately limit their applicability.
A possible solution is to change the spacing between cells, as shown in Fig. 58c. Since the right cell is farther from the central cell than the left cell, the left cell has dominant sway.
FIGURE 58. An example of failure due to the lack of unidirectional isolation between input and output. (a) The equilibrium configuration of electron spins (logic states) in two NOT gates in series. (b) The logic state at the input of the first NOT gate (left cell) is changed by an SPSTM (external source), but the state at the output does not change in response because of the previous state at the output of the second NOT gate. The failure occurs because of a lack of unidirectional isolation between the input and output. (c) A possible solution. The left cell is closer to the central cell and therefore holds dominant sway. (Labels in the figure: NOT gate 1; NOT gate 2.)
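The competition between the two neighbors of the central cell in Fig. 58 can be made concrete with a toy energy model. The sketch below is only an illustration under stated assumptions: spins are treated as classical values of +1 or -1, the right cell is held at its previous state, the global magnetic field is ignored, and the coupling is taken as J(d) = J0 exp(-d/λ) with hypothetical values of J0 and λ. The function names `exchange_coupling` and `central_cell_spin` are not from the text.

```python
import math


def exchange_coupling(distance, j0=1.0, decay_length=1.0):
    """Antiferromagnetic coupling strength, assumed to decay exponentially.

    j0 and decay_length are hypothetical parameters in arbitrary units.
    """
    return j0 * math.exp(-distance / decay_length)


def central_cell_spin(left_spin, right_spin, d_left, d_right):
    """Return the lowest-energy spin of the central cell.

    Energy model (an assumption of this sketch):
        E(s) = J_left * left_spin * s + J_right * right_spin * s
    with J > 0, so antiparallel neighbors are favored. Returns +1 or -1,
    or 0 when both orientations have the same energy (indeterminate).
    """
    local_field = (exchange_coupling(d_left) * left_spin
                   + exchange_coupling(d_right) * right_spin)
    if math.isclose(local_field, 0.0, abs_tol=1e-12):
        return 0  # the two neighbors pull equally hard
    return -1 if local_field > 0 else 1


if __name__ == "__main__":
    # Input flipped to -1 while the right cell still holds +1.
    print(central_cell_spin(-1, +1, d_left=1.0, d_right=1.0))  # equal spacing
    print(central_cell_spin(-1, +1, d_left=1.0, d_right=1.5))  # right cell farther
```

With equal spacing the two contributions cancel and the central cell is undetermined; moving the right cell farther away lets the input win, so the central cell settles antiparallel to the input, as a NOT gate should.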
Making the left cell the closer neighbor in this way provides effective unidirectional isolation. Unfortunately, this type of solution is problem-specific. Also, the separation cannot be increased indefinitely, since increasing the separation also decreases the strength of the spin-spin coupling. Ultimately, this limits the number of logic devices that can be used in a serial register-type array. We may also point out that using spin polarization, rather than charge polarization [as in the schemes of Bakshi et al. (1991) and Lent et al. (1993)], to encode bit information again has an advantage in this case. Since the spin-spin coupling decays exponentially with distance, a slight variation in the intercell distance can significantly alter the coupling strength between cells. Therefore, varying the intercell coupling is rather easy. In contrast, the
unscreened Coulomb coupling is long-range and decays slowly with distance. Therefore, varying the intercell coupling is much more difficult. Of course, the introduction of screening will largely mitigate this problem, but this may weaken or eliminate the bistable behavior of charge polarization in the schemes of Bakshi et al. (1991) and Lent et al. (1993), which is the heart of the physics.
2. Wirings for Conventional Layouts
In a chip that does not utilize cellular automata architecture, remote devices (rather than just nearest-neighbor devices) may sometimes need to be connected together. Obviously, this cannot be accomplished by quantum coupling, since it is short-range and exists only between nearest neighbors. To connect remote devices, we need something to emulate a physical wire or a line that will carry a signal. Such a “wire” can be realized by a linear (rectilinear or curvilinear) array of an odd number of cells, as shown in Fig. 59. Unfortunately, we cannot have these wires cross over each other as is done in multilayered interconnects in conventional integrated circuits. Therefore, the layout can be quite challenging, since it has to avoid crossovers. It is possible that the best architectures will be a combination of cellular automata and conventional architectures, with the former being dominant. A completely cellular automata-based chip will have input/output devices placed along the periphery of the chip where the packing density is relatively sparse. The center of the chip will have the highest packing density, and
FIGURE 59. A simulated wire to connect remote devices: (a) a physical wire, and (b) a wire consisting of an odd number of cells.
FIGURE 60. A schematic representation of a quantum-coupled chip based on cellular automata. Each cell, represented by a dot, communicates only with its nearest neighbors via quantum-mechanical (spin-spin) coupling. The packing density decreases gradually away from the center towards the periphery. The cells which receive the input and provide the output data to the external world are placed at the periphery, where the packing density is the sparsest.
an individual cell will communicate only with its nearest neighbors through short-range quantum-mechanical coupling. The schematic layout of such an idealized chip is shown in Fig. 60.
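The reason the “wire” of Fig. 59 uses an odd number of cells can be checked with a short sketch. It assumes ideal antiferromagnetic ordering along the chain, so that each cell simply takes the spin opposite to its predecessor; the function name `wire_spins` and the +1/-1 encoding are illustrative choices, not taken from the text.

```python
def wire_spins(input_spin, n_cells):
    """Return the spins along a chain of n_cells cells.

    Assumes ideal antiferromagnetic nearest-neighbor ordering: every cell
    is antiparallel to the cell before it. Spins are encoded as +1 or -1.
    """
    if n_cells < 1:
        raise ValueError("a wire needs at least one cell")
    spins = [input_spin]
    for _ in range(n_cells - 1):
        spins.append(-spins[-1])  # each neighbor flips the spin once
    return spins


if __name__ == "__main__":
    print(wire_spins(+1, 5))  # odd length: last cell equals the input
    print(wire_spins(+1, 4))  # even length: last cell is inverted
```

An odd number of cells gives an even number of inversions, so the far end reproduces the input; an even number of cells would behave as an inverter instead.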
F. Performance Figures for Spin-Polarized Single-Electron Logic Devices
In this section, we present the performance figures for spin-polarized single-electron logic devices as calculated by Bandyopadhyay et al. (1993).
1. Switching Speed
In single-electron logic, a bit is switched by flipping an electron spin. This is achieved either by locally applying a magnetic field (during writing) or by spin-spin coupling (during computation). In the former case, the switching time will be of the order of $\hbar/(g\mu_B B)$, where $B$ is the flux density of the locally applied field, $g$ is the Landé g-factor (which can be very large in some semimagnetic semiconductors such as CdMnTe), and $\mu_B$ is the Bohr magneton. For a flux density of 0.1 tesla (which may be applied with an SPSTM) and a g-factor of 100, the switching time is ~1 picosecond. In the second case, spin-spin coupling flips the spin of an electron by emitting a phonon. Spin-phonon coupling can be quite strong in pyroelectric materials (uniaxial crystals without inversion symmetry) where electric dipole spin resonance (Romestain et al., 1977; Dobrowolska et al.,
1982) can increase spin flip rates significantly. In some materials such as HgTe, spin-phonon transition linewidths of 0.4 meV have been predicted (Zawadzki et al., 1975), which gives a switching time of $\sim\hbar/(0.4\ \mathrm{meV})$, again of the order of 1 picosecond.
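Both switching-time estimates can be reproduced with a few lines of arithmetic. The sketch below uses standard values of the physical constants; the function names are illustrative, and the results are order-of-magnitude estimates in the same spirit as the text.

```python
# Physical constants in SI units.
HBAR = 1.054571817e-34            # reduced Planck constant, J s
BOHR_MAGNETON = 9.2740100783e-24  # J/T
ELECTRON_VOLT = 1.602176634e-19   # J


def zeeman_switching_time(g_factor, flux_density_tesla):
    """Switching time ~ hbar / (g * mu_B * B) for a locally applied field."""
    return HBAR / (g_factor * BOHR_MAGNETON * flux_density_tesla)


def linewidth_switching_time(linewidth_mev):
    """Switching time ~ hbar / linewidth for phonon-assisted spin flips."""
    return HBAR / (linewidth_mev * 1e-3 * ELECTRON_VOLT)


if __name__ == "__main__":
    t_write = zeeman_switching_time(g_factor=100, flux_density_tesla=0.1)
    t_compute = linewidth_switching_time(linewidth_mev=0.4)
    print(f"field-driven flip:   {t_write * 1e12:.2f} ps")
    print(f"phonon-assisted flip: {t_compute * 1e12:.2f} ps")
```

Both numbers come out between 1 and 2 picoseconds, consistent with the ~1 picosecond figure used in the remainder of this section.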
2. Power Dissipation
The power dissipation for switching a single bit can be estimated as follows. If the exchange splitting (energy difference between the triplet and singlet state) is 100 meV and the switching time is 1 picosecond, then the power dissipation for switching a single bit is 100 meV/1 picosecond = 16 nanowatts. This is a few orders of magnitude smaller than what can be achieved in conventional devices. The power-delay product is then $\sim 10^{-20}$ joules (100 meV is about $1.6 \times 10^{-20}$ J), which is of the same order as that achievable with quantum interference devices. It is orders of magnitude smaller than what can be achieved with conventional devices, including Josephson junctions.
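The per-bit power and the power-delay product follow directly from the two numbers quoted above. The sketch below simply carries out the unit conversion; the function names are illustrative.

```python
ELECTRON_VOLT = 1.602176634e-19  # J


def switching_power_watts(energy_mev, switching_time_s):
    """Power dissipated when energy_mev is released once per switching time."""
    return energy_mev * 1e-3 * ELECTRON_VOLT / switching_time_s


def power_delay_product_joules(energy_mev, switching_time_s):
    """Power multiplied by delay, which reduces to the switching energy."""
    return switching_power_watts(energy_mev, switching_time_s) * switching_time_s


if __name__ == "__main__":
    power = switching_power_watts(energy_mev=100, switching_time_s=1e-12)
    pdp = power_delay_product_joules(energy_mev=100, switching_time_s=1e-12)
    print(f"power per bit: {power * 1e9:.1f} nW")
    print(f"power-delay product: {pdp:.2e} J")
```

Because the delay cancels, the power-delay product is just the exchange splitting expressed in joules, about 1.6e-20 J.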
3. Bit Density
Next, we calculate the bit density that can be realized. Each nanophase particle (or each bit) occupies an area of ~50 × 50 Å. Therefore, the bit density will be ~25 terabits/cm². Such a high bit density poses a problem with cooling. Since the power dissipation per bit is 16 nanowatts, the maximum power dissipation from a 1 cm² chip will be ~400,000 watts! Removal of 1,000 W/cm² from a silicon chip was demonstrated more than 10 years ago (Tuckerman and Pease, 1981), and it may be possible to improve this. However, acquiring a capability of removing 400 kilowatts/cm² at room temperature will not be easy.
This problem can of course be eliminated altogether by reducing the operating temperature from room temperature to 77 K. The device sizes can then be increased to reduce the energy splitting between the triplet and singlet states (energy difference between logic levels) to 10 meV. This is still larger than the thermal energy kT at 77 K and therefore allows 77 K operation. Also, at 77 K, the phonon-assisted spin flip rate (switching speed) may decrease by a factor of 10, since phonon-assisted scattering rates are proportional to the Bose-Einstein factor, which has an exponential dependence on temperature. Together, the tenfold smaller energy per switching event and the tenfold lower switching rate reduce the power dissipation per bit by a factor of 100, thereby reducing the total dissipation to 4000 watts/cm², which is more manageable. It must be emphasized that low-temperature (77 K) operation is not required because of device or circuit limitations. Rather, it is required because current heat-removal technology cannot perform at the required level. Once heat-sinking technology has improved enough, room-temperature operation can be restored.
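The heat-removal figures above can be checked with the following sketch, which takes the quoted bit density of 25 terabits/cm² and the per-bit power of 16 nanowatts at face value. The helper `areal_density_per_cm2` is included only as a consistency check and assumes square, close-packed cells: under that assumption a 50 Å pitch corresponds to 4 × 10^12 cells/cm², whereas 25 × 10^12 cells/cm² corresponds to a pitch of about 20 Å. The function names are illustrative.

```python
def areal_density_per_cm2(pitch_angstrom):
    """Cells per cm^2 for square cells on a square grid (an assumption)."""
    pitch_cm = pitch_angstrom * 1e-8  # 1 angstrom = 1e-8 cm
    return 1.0 / pitch_cm ** 2


def chip_power_density(bits_per_cm2, power_per_bit_w):
    """Worst-case dissipation in W/cm^2 if every bit switches continuously."""
    return bits_per_cm2 * power_per_bit_w


if __name__ == "__main__":
    quoted_density = 25e12     # bits/cm^2, as quoted in the text
    room_temp_power = 16e-9    # W per bit (100 meV per 1 ps)
    # At 77 K: 10x smaller energy and 10x slower switching, so 100x less power.
    cold_power = room_temp_power / 100

    print(f"{chip_power_density(quoted_density, room_temp_power):.0f} W/cm^2")
    print(f"{chip_power_density(quoted_density, cold_power):.0f} W/cm^2 at 77 K")
    print(f"50 A pitch: {areal_density_per_cm2(50):.1e} cells/cm^2")
    print(f"20 A pitch: {areal_density_per_cm2(20):.1e} cells/cm^2")
```

The totals of 400,000 W/cm² and 4000 W/cm² follow from the quoted density; with a density of 4 × 10^12 cells/cm² both would be smaller by a factor of about 6.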
4. Temperature of Operation
The temperature of operation is ultimately determined by the smallest energy scale that is involved in the operation of the circuits. This is the Zeeman splitting energy caused by the globally applied magnetic field (recall the discussion of NAND gates). The maximum value of the Zeeman splitting depends on the maximum magnetic field we can apply globally without upsetting the basic antiferromagnetic ordering and without exceeding the field that can be applied with an SPSTM during writing of input data. From these considerations, we deduce that the maximum field may be about 10 teslas, which gives rise to a Zeeman splitting of 57 meV (g-factor = 100). This is smaller than the achievable exchange splitting of 100 meV, so that we can sustain the antiferromagnetic ordering even in the presence of a globally applied field of 10 teslas. With this Zeeman splitting, the bit error probability $\exp(-g\mu_B B/kT)$ becomes 11% at room temperature and only 0.02% at 77 K. Therefore, room-temperature operation may be possible.
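The Zeeman splitting and the two error probabilities can be reproduced as follows. The sketch uses standard constants in electron-volt units and assumes that "room temperature" means 300 K; the function names are illustrative.

```python
import math

BOHR_MAGNETON_EV = 5.7883818060e-5  # eV/T
BOLTZMANN_EV = 8.617333262e-5       # eV/K


def zeeman_splitting_ev(g_factor, flux_density_tesla):
    """Zeeman splitting g * mu_B * B in electron volts."""
    return g_factor * BOHR_MAGNETON_EV * flux_density_tesla


def bit_error_probability(g_factor, flux_density_tesla, temperature_k):
    """Boltzmann factor exp(-g * mu_B * B / kT) used in the text as the error rate."""
    splitting = zeeman_splitting_ev(g_factor, flux_density_tesla)
    return math.exp(-splitting / (BOLTZMANN_EV * temperature_k))


if __name__ == "__main__":
    splitting = zeeman_splitting_ev(g_factor=100, flux_density_tesla=10)
    print(f"Zeeman splitting: {splitting * 1e3:.1f} meV")
    for temperature in (300, 77):  # 300 K is assumed for room temperature
        p = bit_error_probability(100, 10, temperature)
        print(f"T = {temperature} K: error probability {p * 100:.3f} %")
```

The splitting evaluates to about 58 meV, and the error probabilities to roughly 11% at 300 K and 0.02% at 77 K, in line with the values quoted above.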
XI. EPILOGUE: THE LONG-TERM PROGNOSIS
In this section, we conclude by discussing the long-term outlook for quantum devices. We summarize their advantages and point out their disadvantages.
The problem with classical devices (conventional bipolar junction transistors and metal oxide semiconductor field effect transistors) is thought to be associated with scaling. Power-supply voltages cannot be scaled down indefinitely. The minimum they can reach is the thermal voltage, below which the noise margin becomes unacceptably poor. When device sizes are scaled down without scaling the voltages, the electric field increases proportionately with decreasing length. Ultimately, the electric field will reach the critical value for breakdown, which sets a limit to device scaling. The product of the power-supply voltage and the unity gain frequency in conventional devices cannot exceed $F_b v_s$, where $F_b$ is the breakdown field and $v_s$ is the saturation velocity of charge carriers at that field. This is known as the Johnson limit (Johnson, 1965), and it has never been surmounted. For silicon, this limits the maximum unity gain frequency to $2 \times 10^{11}$ Hz and the switching speed to 5 ps if the voltage is 1 volt. This is a fundamental material limit and no amount of ingenuity can surmount it (Nagata, 1992). Therefore, it becomes necessary to explore alternate means of realizing ultrafast and ultradense computing devices. This led to the research on quantum devices.
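The trade-off expressed by the Johnson limit can be illustrated numerically. The sketch below takes the silicon figure quoted above (a unity gain frequency of 2 × 10^11 Hz at 1 volt) as the value of the voltage-frequency product and, following the 5 ps figure in the text, takes the switching time as the reciprocal of that frequency. Both choices are assumptions of the sketch, and the function names are illustrative.

```python
# Voltage-frequency product for silicon implied by the figures in the text:
# 2e11 Hz at a supply voltage of 1 V.
SILICON_LIMIT_V_HZ = 2e11


def max_unity_gain_frequency(supply_voltage_v, limit_v_hz=SILICON_LIMIT_V_HZ):
    """Highest unity gain frequency allowed at a given supply voltage."""
    return limit_v_hz / supply_voltage_v


def min_switching_time(supply_voltage_v, limit_v_hz=SILICON_LIMIT_V_HZ):
    """Shortest switching time, taken here as 1 / (unity gain frequency)."""
    return 1.0 / max_unity_gain_frequency(supply_voltage_v, limit_v_hz)


if __name__ == "__main__":
    for voltage in (0.5, 1.0, 3.3, 5.0):
        t = min_switching_time(voltage)
        print(f"{voltage:4.1f} V: at best {t * 1e12:.1f} ps")
```

Under this reading, a higher supply voltage buys noise margin only at the cost of speed, which is why scaling the device without scaling the voltage runs into the limit.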
The advantages of quantum devices are the exceptional speed, extremely low power dissipation, and high packing density. These are significant advantages, but there are also disadvantages. It has been pointed out (Landauer, 1989a,b, 1990; Keyes, 1992; Subramaniam et al., 1991) that quantum interference devices are impractical for integrated circuits. This is because their characteristics are extremely sensitive to a few angstroms’ variation in size, or a few millivolts’ variation in voltage, or a few nanoamperes’ variation in current. Because of the lack of fabrication tolerance, these devices are not reproducible. Consequently, they cannot be used in integrated circuits where hundreds of millions of devices must be fabricated reproducibly with reasonably high yield. In addition to having no fabrication tolerance, quantum interference devices also have practically no noise tolerance. Such delicate devices cannot work in integrated circuits where voltage variations will inevitably occur owing to reflection, attenuation, and distortion of signals communicated between various devices.
There are some other fundamental shortcomings of quantum interference devices. For instance, the lack of nonlinear operating characteristics (the only exception is the resonant tunneling device) and the lack of intrinsic device gain make these devices unsuitable for many analog and digital applications. Finally, the extremely low current-carrying capability (quantum devices must operate at low currents to avoid dephasing interactions) causes these devices to be actually quite slow in their overall switching response (~100 ps), sometimes slower than even conventional silicon devices.
Granular quantum devices whose switching relies on the transfer (usually via tunneling) of one or a few electrons from one region of space to another are often worse than quantum interference devices, especially if they require physical movement of precisely one or a few electrons and that, too, by tunneling.
In addition to having most of the disadvantages of quantum interference devices, such devices also have the additional disadvantage of being extremely slow in their response. The switching speed is reduced dramatically by lack of tolerance to trapping. In silicon CCDs, the trapping problem typically limits switching delays to 1 µs, resulting in an extremely slow bit rate of 1 Mbit/s. In single- or few-electron devices, the problem is bound to be worse, since the charge packets are extremely small (one or a few electrons instead of about 10,000 in conventional CCDs). Therefore, no single-electron device should ever rely on charge transfer for switching. Fortunately, the spin-polarized single-electron devices discussed in Section X meet this requirement.
Finally, almost all quantum devices have a serious drawback. Most of them cannot operate even at 77 K, let alone room temperature. This feature makes them impractical for many applications. The only exceptions to this
are resonant tunneling devices, which have operated at room temperature. Presumably, the spin-polarized single-electron devices could also operate at room temperature. Other than these devices, none of the other quantum devices could operate at non-cryogenic temperatures given the present state of fabrication technology. However, it is hoped that with gradual improvement in material growth and nanofabrication techniques, this barrier may one day be surmounted. Till then, quantum devices will probably remain a laboratory curiosity.
ACKNOWLEDGMENTS
This review has described work performed by the authors under the auspices of the National Science Foundation (Award No. ECS-9108932), the U.S. Air Force Office of Scientific Research (Award No. AFOSR-91-0211), the U.S. Office of Naval Research (Award No. N00014-91-J-1505), the IBM Corporation, and the U.S. Department of Energy providing funds through the Midwest Superconductivity Consortium. M. C. also acknowledges the Ohio Supercomputer Center for the use of its facilities. We thank Tarkeshwar Singh for Figs. 7, 9, 12, 13, 40, 41, 42, and 43. The material reported here drew on the expertise of many of our colleagues, teachers, and students, and we remain grateful for their help, constructive criticisms, and advice. Marc Cahay dedicates this article to the memory of his father.
REFERENCES
Aharonov, Y., and Bohm, D. (1959). Phys. Rev. 115, 485. Aihara, K., Yamamoto, M., and Mizutani, T. (1992). Jpn. J. Appl. Phys. 31, L916. Albert, J., and Kulik, K. (1986). Proc. MIT Workshop Cell. Automata, 1986. Al’tshuler, B. L., Aronov, A. G., and Spivak, B. Z. (1981). Sov. Phys.-JETP Lett. (Engl. Transl.) 33, 94. Al’tshuler, B. L., Aronov, A. G., Spivak, B. Z., Sharvin, D. Yu., and Sharvin, Yu. V. (1982). Sov. Phys.-JETP Lett. (Engl. Transl.) 35, 588. Alves, E. S., Eaves, L., Henni, M., Hughes, O. H., Leadbeater, M. L., Sheard, F. W., and Toombs, G. A. (1988). Electron. Lett. 24, 1190. Amman, M., Mullen, K., and Ben-Jacob, E. (1989). J. Appl. Phys. 65, 339. Anda, E. V., and Flores, F. (1991). J. Phys.: Condens. Matter 3, 9087. Anderson, P. W. (1981). Phys. Rev. B: Condens. Matter 23, 4828. Ashcroft, N. W., and Mermin, N. D. (1976). “Solid State Physics,” Chapter 32. Saunders College, Philadelphia. Averin, D. V., and Likharev, K. K. (1986). J. Low Temp. Phys. 62, 345.
Badurek, G., Rauch, H., and Summhammer, J. (1983). Phys. Ref. Lett. 51, 1015. Baglee, D. A., Ferry, D. K..Wilmsen, C. W., and Wieder, H. H. (1980). J. Vac. Sci. Technol. 17. 1032. Bakshi, P., Broido, D., and Kempa, K . (1991). J. Appl. Phys. 70, 5150. Bandara, K. M. S. V., Coon, D. D., Byungsung, O., Lin, Y. F., and Francombe, M. H. (1988). Appl. Phys. Lett. 53, 1931. Bandyopadhyay, S. (1992). IEEE J. Quant. Electron. QE-28, 1554. Bandyopadhyay, S., and Porod, W. (1988). Appl. Phys. Lett. 53, 2323. Bandyopadhyay, S., and Porod, W. (1989). Superlattices Microstruct. 5, 239. Bandyopadhyay, S., Datta, S., and Melloch, M. R. (1986a). SuperlatticesMicrostruct. 2, 539. Bandyopadhyay, S., Melloch, M. R., Datta, S., Das, B., Cooper, J. A., and Lundstrom, M. S. (1986b). Proc. IEEE Electron Devices Meet.. Los Angeles, IEEE Catalog No. 86CH2381-2, p. 76. Bandyopadhyay, S., Bernstein, G. H., and Porod, W. (1989). I n “Nanostructure Physics and Fabrication” (M. A. Reed and W. P. Kirk, eds.), p. 31. Academic Press, Boston. Bandyopadhyay, S., Chaudhuri, S.. Das, B., and Cahay, M. (1992). SuperlatticesMicrostruct. 12. 123. Bandyopadhyay, S., Das, B., and Miller, A. E. (1993). Preprint. Barker, J. R. (1973). J. Phys. C 6 , 2663. Barner, J. B., and Ruggiero, S. T. (1987). Phys. Rev. Lett. 59, 807. Bate, R. T. (1983). Workshop Phys. Submicron Struct. 4th, Champaign, Illinois. Bate, R. T. (1985). I n “VLSI Handbook” (N. G. Einspruch, ed.), Vol. 2, Chapter 9. Academic Press, New York. Bate, R. T. (1989). Solid State Technol. 6, 101. Bate, R. T., Frazier, G. A., Frensley, W. R., Lee, J. W., and Reed, M. A. (1987). Proc. SPIE-Int. SOC.Opt. Eng. 792, 26. Bate, R. T., Frazier, G. A., Frensley, W. R., and Reed, M. A. (1989). Tex. Instrum. Tech. J. 6, 13. Beenakker, C . W. J., and Van Houten, H. (1989). Phys. Rev. B: Condens. Matter 39, 10445. Beltram, F., Capasso, F., Luryi, S., Chu, N. G., and Cho, A. Y. (1988). Appl. Phys. Lett. 53, 219. Benioff, P. (1980). J. Stat. Phys. 22, 563. 
Benioff, P. (1982a). Phys. Rev. Left. 48, 1581. Benioff, P. (1982b). J. Stat. Phys. 29, 515. Ben-Jacob, E., and Gefen, Y. (1985). Phys. Lett. A 108, 289. Bennet, C. H. (1973). IBM J. Res. Dev. 17, 525. Bennet, C. H. (1982). Int. J. Theor. Phys. 21, 905. Bergmann, G. (1983). Phys. Rev. B: Condens. Matter 28, 2914. Binnig, G., Rohrer, H., Gerber, C., and Weibel, E. (1982a). Appl. Phys. Lett. 40, 178. Binnig, G., Rohrer, H., Gerber, C., and Weibel, E. (1982b). Phys. Rev. Lett. 49, 57. Bohm (1951). “Quantum Theory.” Prentice-Hall, Englewood, NJ. Bonnefoi, A. R., Collins, R. T., McGill, T. C., Burnham, R. D., and Ponce, F. A. (1985). Appl. Phys. Lett. 46, 285. Bouchard, A. M., Luscombe, J. H., Seabaugh, A. C., and Randall, J. N. (1991). Proc. Int. Symp. Nanmtruct. Mesoscopic Syst., Santa Fe, New Mexico. Brennan, K . F. (1987). J. Appl. Phys. 62, 2392. Brews, J. F. (1989). In “Submicron Integrated Circuits” (R. K. Watts, ed.), Chapter 6. Wiley, New York. Brown, E. R., Sollner, T. C. L. G., Parker, C. D., Goodhue, W. D., and Chen, C. L. (1989). Appl. Phys. Lett. 55, 1777.
Buot, F. A., and Jensen, K. L. (1990). Phys. Rev. B: Condens. Matter 42, 9429. Burks, A. W., ed. (1970). “Essays on Cellular Automata.” Univ. of Illinois Press, Urbana. Biittiker, M. (1983). Phys. Rev. B: Condens. Matter 27, 6178. Biittiker, M. (1985a). Phys. Rev. B: Condens. Matter 32, 1846. Biittiker, M. (1985b). In “SQUID-85; Superconducting Quantum Interference Devices and Their Applications” (D. A. Halbohm and H. Lubbig, eds.), p. 529. de Gruyter, New York. Biittiker, M. (1986). Phys. Rev. B: Condens. Matter 33, 3020. Biittiker, M. (1988a). IBMJ. Res. Dev. 32, 63. Biittiker, M. (1988b). Phys. Rev. B: Condens. Matter 38, 9375. Biittiker, M. (1989). Phys. Rev. B: Condens. Matter 40, 3409. Biittiker, M. (1990). In “Analogies in Optics and Microelectronics” (W. van Haeringen and D. Lenstra, eds.), p. 185. Kluwer Acad. Publ., Dordrecht, The Netherlands. Biittiker, M., Imry, Y., Landauer, R., and Pinhas, S. (1985). Phys. Rev. B: Condens. Matter 31, 6207. Bychkov, Y. A., and Rashba, E. I. (1984). J. Phys. C 17, 6039. Cahay, M., McLennan, M., Datta, S., and Lundstrom, M. (1987). Appl. Phys. Lett. 50,612. Cahay, M., Bandyopadhyay, S., and Grubin, H. L. (1989). Phys. Rev. B: Condens. Matter39, 12989. Cahay, M., Bandyopadhyay, S., and Frohne, H. R. (1990). J. Yac. Sci. Technol., B [2] 8, 1399. Cahay, M., Dalton, K., Fisher, G., Anwar, A. F. M., and Lacomb, R. (1992). Superlattices Microstruct. 111, 113. Cai, W., Zheng, T. F., Hu, P., Yudanin, B., and Lax, M. (1989). Phys. Rev. Lett. 63, 418. Capasso, F., and Kiehl, R. A. K. (1985). J. Appl. Phys. 58, 1366. Capasso, F., Mohammed, K., and Cho, A. Y. (1986). IEEE J. Quant. Electron. QE-9, 1853. Capasso, F., Sen, S., Cho, A. Y., and Hutchinson, A. L. (1986). Inst. Phys. Conf. Ser. 83, 539. Capasso, F., Sen, S., Beltram, F., and Cho, A. Y. (1990). In “Physics of Quantum Electron Devices” (F. Capasso, ed.), p. 181. Springer-Verlag, Berlin. Carnes, J. E., and Kosonocky, W. F. (1974). Solid State Technol. 17, 67. Carter, F. L. 
(1982). “Molecular Electronics.” Dekker, New York. Carter, F. L. (1987). “Molecular Electronics 11.” Dekker, New York. Catt, I. (1967). IEEE Trans. Electron. Comput. EC-16, 743. Chandrasekhar, V., Rooks, M. L., Wind, S., and Prober, D. E. (1985). Phys. Rev. Lett. 55, 1610. Chang, L. L., Esaki, L., and Tsu, R. (1974). Appl. Phys. Lett. 24, 593. Chang, Y. J., and Kroemer, H. (1984). Appl. Phys. Lett. 45, 449. Chaudhuri, S., Bandyopadhyay, S., and Cahay, M. (1992). Phys. Rev. B: Condens. Matter45, 11126. Chaudhuri, S., Bandyopadhyay, S., and Cahay, M. (1993). Phys. Rev. B: Condens. Matter47, 12649. Chemla, D. S., Miller, D. A. B., and Schmitt-Rink, S. (1987). Phys. Rev. Lett. 59, 1018. Chevoir, F., and Vinter, B. (1989). Appl. Phys. Lett. 55, 1859. Chiao, R. Y., Kwiat, P. G., and Steinberg, A. M. (1993). Sci. Am., August, p. 52. Choi, K. K., Levine, B. F., Bethea, C. G., Walker, J., and Malik, R. J. (1987). Phys. Rev. Lett. 59, 2459. Choi, Y. W., and Wie, C. R. (1992). J. Appl. Phys. 71, 1853. Chou, S.-Y., Alle, D. R., Pease, R. F. W., and Harris, J. S. (1989). Appl. Phys. Lett. 55, 176. Chuang, S. L. (1987a). J. Lightwave Technol. LT-5, 5 . Chuang, S. L. (1987b). J. Lightwave Technol. LT-5. 174.
Chuang, S. L., and Do, B. (1987).J . Appl. Phys. 62, 1290. Colella, R., Overhauser, A. W., and Werner, S. A. (1975).Phys. Rev. Lett. 34, 1472. Collins, S., Lowe, D., and Barker, J. R. (1987).J. Phys. C34, 1472. Collins, S., Lowe, D., and Barker, J. R. (1987).J. Phys. C20, 6213. Dagli, N., Snider, G., Waldman, J., and Hu. E. (1991). J. Appl. Phys. 69, 1047. D’Amato, J. L., and Pastawski, H. M. (1990).Phys. Rev. B: Condens. Matter 41, 741 I . Das, B., Miller, D. C., Datta, S., Reifenberger, R., Hong, W. P., Bhattacharyya, P. K., Singh, J., and Jaffe, M. (1989).Phys. Rev. B: Condens. Matter 39, 141 1. Datta, S. (1988a). Extt. Abstr. 20th (19SSInt.) Conf. Solid State Devices Muter., J. SOC.Appl. Phys., Tokyo, p. 491. Datta, S. (1988b). Talk at the Air Force Office of Scientific Research Workshop on Quantum Devices, Atlanta, Sept. 15-16. Datta, S. (1989a). Superlattices Microstruct. 6,83. Catta, S. (1989b).Phys. Rev. B: Condens. Matter 40, 5830. Datta, S. (1990). J. Phys.: Condens. Matter 2, 8023. Datta, S. (1993).In “Physics of Low Dimensional Semiconductor Structures” (N. H. March, P. N. Butcher, and M. P. Tosi, eds.), p. OOO. Plenum, New York. Datta, S., and Bandyopadhyay, S. (1987).Phys. Rev. Lett. 58, 717. Datta, S., and Das, B. (1900). Appl. Phys. Left. 56, 665. Datta, S., Melloch, M. R., Bandyopadhyay, S., Noren, R., Vaziri, M., Miller, M., and Reifenberger. R. (1985).Phys. Rev. Lett. 55, 2344. Datta, S., Bandyopadhyay, S., Melloch, M. R., Reifenberger, R., Miller, M., Vaziri, M., Dungan, T., Cahay, M., and Noren, R. (1986a). Surf. Sci. 174, 439. Datta, S.,Melloch, M. R., Bandyopadhyay, S., and Lundstrom, M. S. (1986b).Appl. Phys. Leu. 48, 487. DeFalco, J. A. (1970).IEEE Spectrum 7, 44. Delagebeaudeuf, D., and Linh, N. T. (1982). IEEE Truns. Electron. Devices ED-29, 955. del Alamo, J. A., and Eugster, C. C. (1989).Appl. Phys. Lett. 56, 78. Deutsch, D. (1985).Proc. R. SOC. London, Ser. A 400,97. De Vegvar, P. G. N., Timp, G., Mankiewich, P. 
M., Behringer, R., and Cunningham, J. (1989).Phys. Rev. E: Condens. Matter 40, 3491. Dobrowolska, M., Drew, H. D., Furdyna, J. K.,Ichiguchi, T., Wltowski, A., and Wolff, P. A. (1982).Phys. Rev. Letf. 49, 845. Edwards, J. T., and Thouless, D. J . (1972). J. Phys. C 5, 807. Ehrichs, E. E.,Silver, R. M., and delozanne, A. L. (1988).J. Vac. Sci. Technol., A [2]6,540. Engquist, H. L., and Anderson, P. W. (1981).Phys. Rev. E: Condens. Matter 24, 1151. Esaki, L. (1958).Phys. Rev. 109, 603. Eugster, C. C., and del Alamo, J. A. (1991).Phys. Rev. Lett. 67, 3586. Eugster, C. C., del Alamo, J. A., Rooks, M. J., and Melloch, M. R. (1992).Appl. Phys. Lett. 60,642. Ferry, D. K., Akers, L. A., and Greeneich, E. W. (1988). “Ultra Large Scale Integrated Microelectronics.” Prentice-Hall, Englewood Cliffs, NJ. Ferry, D. K., Grondin, R. 0..and Akers, L. A. (1989). In “Submicron Integrated Circuits” (R. K. Watts, ed.), Chapter 9. Wiley, New York. Feynman, R. P. (1982).lnt. J. Theor. Phys. 21, 467. Feynman, R. P. (1985).Opt. News 11, 1 1 . Feynman, R. P. (1986).Found. Phys. 16, 507. Feynman, R. P., and Hibbs, A. R. (1965). “Quantum Mechanics and Path Integrals.” McGraw-Hill, New York. Feynman, R. P., and Vernon, F. L., Jr. (1963).Ann. Phys. (Leipzig) [7]24, 118.
Feynman, R. P., Leighton, R. B., and Sands, M. (1965). “The Feynman Lectures on Physics,” Vol. 3, Chapter I . Addison-Wesley, Reading, MO. Fij, T., and Jauho, A. P. (1991). Appl. Phys. Lett. 59, 2245. Fowler, A. B. (1985). U.S. Pat. 45,503,320. Fowler, A. B. (1988). Talk presented at the U.S. Air Force Office of Scientific Research Workshop on Quantum Devices, Atlanta, GA. Fredkin, E., and Toffoli, T. (1982). Int. J. Theor. Phys. 21, 219. Frensley, W. R. (1987). Phys. Rev. B: Condens. Matter 36, 1570. Frensley, W. R. (1990). Rev. Mod. Phys. 62, 745. Frohne, H. R., and Datta, S. (1988). J. Appl. Phys. 64, 4086. Futatsugi, T., Yamaguchi, T., Ishii, K., Imamura, K., Muto, S., Yokoyama, N., and Shibatomi, A. (1987). Jpn. J. Appl. Phys. 26, L-131. Gacs, P. (1986). MIT Workshop on Cell. Automata, 1986. Gadzuk, J. W. (1991). Phys. Rev. B: Condens. Matter 44, 13446. Gaylord, T. K., Glytsis, E. N., Henderson, G. N., Martin, K. P., Walker, D. B., Wilson, D. W., and Brennan, K. F. (1991). Proc. IEEE 79, 1159. Geerligs, L. J., Anderegg, V. F., Holweg, P. A. M., Mooij, J. E., Pothiar, H., Esteve, D., Urbina, C., and Devoret, M. H. (1988). Phys. Rev. Lett. 64, 2691. Glazman, L. I., and Shekhter, R. I. (1988). Sov. Phys.-JETP (Engl. Transl.) 67, 163. Goldman, V. J., Tsui, D. C., and Cunningham, J. E. (1987a). Phys. Rev. Lett. 58, 1256. Goldman, V. J., Tsui, D. C., and Cunningham, J. E. (1987b). Phys. Rev. B: Condens. Matter 36, 7635.
Gray, H. J. (1963). “Digital Computer Engineering.” Prentice-Hall, Englewood Cliffs, NJ. Gu, B. Y., Coluzza, C., Mangiantini, M., and Frova, A. (1990). SuperlatticesMicrostruct. 7, 29.
Gunther, L. (1967). Phys. Lett. A 25A, 649; G . Aallan in Ref. 35, p. 2. Gunther, L., and Imry, Y. (1969). Solid State Cornmun. 7, 1391. Harao, M., and Noguchi, S. (1975). J. Comput. Syst. Sci. 11, 171. Hardy, A., and Streifer, W. (1985). J. Lightwave Technol. LT-3, 1135. Hardy, A., and Streifer, W. (1986). IEEE J. Quantum Electron. QE-22, 528. Hauge, E. H., and Strovneng, J. A. (1989). Rev. Mod. Phys. 61, 917. Hirschfelder, J. 0..Christoph, A. C., and Polke, W.E. (1975). J. Chem. Phys. 61, 5435. Hopfield, J. J. (1984). Proc. Natl. Acad. Sci. U.S.A. 79, 2554. Hyldgaard, P., and Jauho, A. P. (1990). J. Phys.: Condens. Matter 2, 8725. Imry, Y. (1986). Europhys. Lett. 1, 249. Ishibashi, K., Nagata, K., Ishida, S., Gamo, K., Aoyagi, Y., Kawabe, M., Murase, K., and Namba, S. (1987). Solid State Commun. 61, 385. Ishimaru, A. (1991). “Electromagnetic Wave Propagation, Radiation and Scattering.” Prentice-Hall, Englewood Cliffs, NJ. Ismail, K., Antoniadis, D. A., and Smith, H. I. (1989). Appl. Phys. Lett. 55, 589. Jain, J. K., and Kivelson, S. A. (1988). Phys. Rev. B: Condens. Matter 37, 4276. Jauho, A. P. (1989). Solid State Electron. 32, 1265. Jauho, A. P., and Nieto, 0. (1986). Wavepacket 1986, p. 44. Jauho, A. P., and Ziep, 0. (1989). Phys. Scr. T25,329. Jogai, B., Huang, C. I., Koenig, E. T., and Bozada, C. A. (1991). J. Vac. Sci. Technol.. B [2] 9, 143. Johnson, E. 0. (1965). RCA Rev. 26, 163. Johnson, M., and Silsbee, R. H. (1988). Phys. Rev. B: Condens. Matter 37, 5312. Jonson, M. (1989). Phys. Rev. B: Condens. Matter 39, 5924. Jonson, M., and Grincwajg, A. (1987). Appl. Phys. Lett. 51, 1729.
250
MARC CAHAY and SUPRIYO BANDYOPADHYAY
Kadanoff, L. P., and Baym, G. (1962). “Quantum Statistical Mechanics.” Benjamin, New York. Keldysh, L. V. (1965). Sov. Phys.-JETP (Engl. Transl.) 20, 1018. Keyes, R. W. (1992). Phys. Today 45, 42. Khondker, A. N., and Alam, M. A. (1991). Phys. Rev. B: Condens. Matter 44,5444. Khondker, A. N., Khan, M. R., and Anwar, A. F. M. (1988). J. Appl. Phys. 63, 5191. Kluksdahl, N. C., Kriman, A. M., Ferry, D. K., and Ringhofer, C. (1989). Phys. Rev. B: Condens. Matter 39, 7720. Kobayashi, T., Miyaka, M., Okazaki, Y., Matsuda, T., Sato, M., Diguchi, K., Ohki, S., and Oda, M. (1988). Tech. Dig. IEEE Electron Device Meet., p. 881. Kotani, S., Imamura, T., and Hasuo, S. (1988). Tech. Dig. IEEE Electron Device Meet., p. 884. Kulik, 1. 0.. and Schechter, R. 1. (1975). Sov. Phys.-JETP (Engl. Transl.) 41, 308. Lake, R., and Datta, S. (1992). Phys. Rev. B: Condens. Matter 45, 6670. Lake, R., Klimeck, G., and Datta, S. (1993). Phys. Rev. B: Condens. Matter 47, 6427. Landau, L. D. (1937). Phys. Z. Sowjetunion 11, 26. Landauer, R. (1957). IBM J. Res. Dev. 1, 223. Landauer, R. (1961). IBM J. Res. Dev. 5, 183. Landauer, R. (1970). Philos. Mag. [8] 21, 863. Landauer, R. (1975). Z. Phys. B 21, 247. Landauer, R. (1989a). Phys. Today 42, 119. Landauer, R. (1989b). In “Nanostructure Physics and Fabrication” (M. A. Reed and W. P. Kirk, eds.), p. 17. Academic Press, Boston. Landauer, R. (1990). Physica A (Amsterdam) 168, 75. Landauer, R. (1992). Proc. Workshop Phys. Comput., Dallas, Texas, 1992, available as an IEEE Computer Society Press reprint. Landauer, R., and Woo, J . W. F. (1972). Phys. Rev. B: Condens. Matter 5, 1189. Landheer, D., and Aers, G. C. (1990). Superlattices Microstruct. 7, 17. Leadbeater, M. L., Alves, E. S., Eaves, L., Henni, M., Hughes, 0. H., Sheard, F. W., and Toombs, G . A. (1988). Semicond. Sci. Technol. 3, 1060. Lecerf, Y. (1963a). C. R. Hebd. Seances Acad. Sci. 257, 2597. Lecerf, Y. (1963b). C. R. Hebd. Seances Acad. Sci. 257, 2940. Lee, P. A. (1987). 
Physica A (Amsterdam) 140, 169. Lee, Y., McLennan, M. J., and Datta, S. (1991). Phys. Rev. B: Condens. Matter 43, 14333. Leggett, A. J. (1989). I n “Nanostructure Physics and Fabrication” (M. A. Reed and W. P. Kirk, eds.), p. 31. Academic Press, Boston. Lent, C. S., Tougaw, P. D., and Porod, W. (1993). Appl. Phys. Lett. 62, 714. Leo, K., Shah, S., Gobel, E. O., Dammen, T. C., Schmitt-Rink, S., Schafer, W., and Koler, K. (1991). Phys. Rev. Lett. 66, 201. Levinson, I. B. (1970). Sov. Phys.-JETP (Engl. Transl.) 30, 362. Likharev, K. K. (1982). Int. J. Theor. Phys. 21, 311. Likharev, K. K. (1990). In “Superconducting Devices” ( S . T. Ruggiero, ed.), Chaper 1. Academic Press, Boston. Likharev, K. K., and Claeson, T. (1992). Sci Am., June, p. 80. Lin, B. J. F., Paalanen, M. A., Gossard, A. C., and Tsui, D. C. (1983). Phys. Rev. B: Condens. Matter 29, 927. Liu, H. C. (1987). Superlattices Microstruct. 3, 379. Lloyd, S. (1993). Science 261, 1569. Loenberger, F. J., Woodward, C. E.,and Spears, D. L. (1976). IEEE Trans. Circuits Syst. CAS-26, 1125.
SEMICONDUCTOR QUANTUM DEVICES
25 1
Lommer, G., Malcher, F., and Rossler, U. (1988). Phys. Rev. Lett. 60,728. Long, S. I., and Butner, S. E. (1990). “GaAs Digital Integrated Circuit Design.” McGrawHill, New York. Luo, J., Munekata, H., Fang, F. F., and Stiles, P. J. (1988). Phys. Rev. 8: Condens. Matter 38, 10142. Luryi, S. (1985). Appl. Phys. Lett. 47, 490. Luryi, S . , and Capasso, F. (1985). Appl. Phys. Lett. 47, 1347. Mahan, G. D. (1987). Phys. Rep. 145, 251. Mains, R. K., Sun, J. P., and Haddad, G. I. (1989). Appl. Phys. Lett. 55, 371. Manassen, Y., Hamers, R. J., Demuth, J. E., and Castellano, A. J., Jr. (1989). Phys. Rev. Lett. 62, 2531. Matteucci, G., and Pozzi, G. (1985). Phys. Rev. Lett. 54, 2469. McDonald, J. F., Rogers, E. H., Rose, K., and Steckl, A. J. (1984). IEEE Spectrum, October, p. 32. McLennan, M. J., Lee, Y., and Datta, S. (1991). Phys. Rev. B: Condens. Matter 53, 13846. Melloch, M. R., Bandyopadhyay, S., Datta, S., Noren, R., Lundstrom, M. S., Tan, K.,and Dungan, T. (1986). J. Vac. Sci. Technol., B [2] 4, 653. Meservey, R., Paraskevopoulos, D., and Tedrow, P. M. (1976). Phys. Rev. Lett. 37, 858. Miller, D. C., Lake, R. K., Datta, S., Lundstrom, M. S., Melloch, M. R., and Reifenberger, R. (1989). In “Nanostructure physics and Fabrication” (M. A. Reed and W. P. Kirk, eds.), p. 165. Academic Press, Boston. Millman, J. (1979). “Microelectronics.” McGraw-Hill, New York. Minakov, A. A., and Shvets, I. V. (1990). Surf. Sci. Lett. 236, L377. Nagata, M. (1992). IEEE J. Solid-State Circuits 27. 465. Obermayer, K., Teich, W. G., and Mahler, G. (1988). Phys. Rev. B: Condens. Matter 37, 8096. Okuda, M., Fujii, K., and Shimizu, A. (1990). Appl. Phys. Lett. 57, 2231. Okuda, M., Miyazawa, S., Fujii, K., and Shimizu, A. (1993). Phys. Rev. B: Condens. Mutter 47, 4103. Onishi, H., Inata, T., Muto, S., Yokoyama, N., and Shibatomi, A. (1986). Appl. Phys. Lett. 49, 1249. Pastawski, H. M. (1991). Phys. Rev. B: Condens. Matter 44, 6329. Peierls, R. E. (1935). Ann. Imt. Henri Poincar65, 177. Peres, A. 
(1985). Phys. Rev. A 32, 3266. Pierret, R. F. (1978). Purdue Univ. Tech. Rep. TR-EE-78-50. Potz, W. (1989a). J. Appl. Phys. 66, 2458. Potz, W. (1989b). Superlattices Microstruct. 6, 187. Quian, G., Singh, T., and Cahay, M. (1992). Superluttices Microstruct. 12, 185. Randall, J. N., Luscombe, J., Reed, M. A., and Seabaugh, A. (1989a). Tex.Instrum. Tech. J. 6, 49. Randall, J. N., Reed, M. A., and Frazier, G. A. (1989b). J. Vac. Sci. Technol., B [2] 7, 1398. Randall, J. N., Reed, M. A., and Kao, Y. C. (1990). J. Vac. Sci. Technol., B [2] 8, 1348. Ray, S., Sokolov, P., Kolbas, R., Boonstra, T., and Williams, J. (1986). Appl. Phys. Lett. 48, 1666. Ricco, B., and Azbel, M. Ya. (1984). Phys. Rev. B: Condens. Matter 29, 1970. Romestain, R., Geschwind, S., and Devlin, G. E. (1977). Phys. Rev. Lett. 39, 1583. Rudberg, B. G. R. (1990). Semicond. Sci. Technol. 5, 328. Sakaki, H. (1980). Jpn. J. Appl. Phys. 19, L-735. Sankaran, V., and Singh, J. (1991). Appl. Phys. Lett. 59, 1963. Santhanam, P., Wind, S., and Prober, D. E. (1984). Phys. Rev. Lett. 53. 1179.
252
MARC CAHAY and SUPRIYO BANDYOPADHYAY
Schroder, D. K. (1989). I n “Advanced MOS Devices: Modular Series on Solid State Devices” (R. F. Pierret and G. W. Neudeck, eds.), Vol. 7. Addison-Wesley, Reading, MA. Sen S., Capasso, F., Gossard, A. C., Spah, R. A., Hutchinson, A. L., and Chu, S. N. G. (1987). Appl. Phys. Left. 51, 1428. Shapiro, B. (1983). Phys. Rev. Lett. 50, 747. Sharvin, D. Yu., and Sharvin, Yu. V. (1981). Sov. Phys.-JETP (Engl. Transl.) 34, 272. Shvets, I. V., Wiesendanger, R., Burglar, D., Tarrach, G., Guntherodt, H. J., and Coey, J. M. D. (1992). J. Appl. Phys. 71, 5489. Singh, T., Qian, G., and Cahay, M. (1992). Proc. SPIE-Quantum WellSuperlatticePhys. IV 1675, 1 1 .
Smith, F. T. (1960). Phys. Rev. 118, 349. Sollner, T. C. L. G., Goodhue, W. D., Tannenwald, P. E., Parker, C. D., and Peck, D. D. (1983). Appl. Phys. Lett. 43, 588. Sollner, T. C. L. G., Brown, E. R., Goodhue, W. D., and Le, H. Q. (1987). Appl. Phys. Lett. 50, 332. Sols, F., Maccuci, M., Ravaioloi, U.,and Hess, K. (1989a). Apple. Phys. Lett. 54, 350. Sols, F., Maccuci, M., Ravaioloi, U., and Hess, K. (1989b). J. Appl. Phys. 66, 3892. Staufer, U., Weisdanger, R., Eng, L., Rosenthaler, L., Hidber, H. R., and Guntherodt, H. J. (1988). J. Vac. Sci. Technol., A [2] 6, 537. Stern, F., and Howard, W. E. (1967). Phys. Rev. 163, 816. Stone, A. D. (1985). Phys. Rev. Lett. 54, 2692. Stone, A. D. (1986). Proc. Int. Symp. Found. Quantum Mech. 2nd, Tokyo, p. 207. Stone, A. D., and Imry, Y. (1986). Phys. Rev. Lett. 56, 189. Stone, A. D., and Lee, P. A. (1985). Phys. Rev. Lett. 54, 1196. Stovneng, J. A., Hauge, E. H., Lipavsky, P., and Spicka, V. (1991). Phys. Rev. E: Condens. Matter 44, 13595. Streetman, B. G. (1972). “Solid State Electronic Devices.” Prentice-Hall, Englewood Cliffs, NJ. Subramanian, S., Bandyopadhyay, S., and Porod, W. (1990). J. Appl. Phys. 68, 4861. Subramaniam, S., Bandyopadhyay, S., and Porod, W. (1991). Superlattices Microstruct. 10, 347. Tamir. T. (1975). “Integrated Optics.” Springer-Verlag, New York. Teich, W. G., and Mahler, 0. (1992). Phys. Rev. A 45, 3300. Teich, W. G., Obermayer, K., and Mahler, G. (1988). Phys. Rev. E: Condens. Matter 37, 8111. Thornber, K. K. (1978). Solid State Electronics 21, 259. Thornber, K. K.,and Feynman, R. P. (1970). Phys. Rev. B: Solid State 1, 4099. Timp, G. L., Chang, A. M., Cunningham, J. E., Chang, T. Y., Mankiewich, P., Behringer, R., and Howard, R. E. (1987). Phys. Rev. Lett. 58, 2814. Tsang, L., and Chuang, S . L. (1988). J. Lightwave Technol. LT-6, 304. Tsu, R., and Esaki, L. (1973). Appl. Phys. Lett. 22, 562. Tsukada, N., Wieck, A. D., and Ploog, K. (1990). Appl. Phys. Lett. 56, 2527. Tucker, J. A. (1992). J. 
Appl. Phys. 72, 4399. Tuckerman, D. B., and Pease, R. F. W. (1981). IEEE Electron Device Lett. EDL-2, 126. Van Loenen, E. J., Dijkkamp, D., Hoeven, A. J., Lenssinck, J. M., and Dieleman, J. (1989). Appl. Phys. Lett. 55, 1312. Van Wees, B. J., Van Houten, H., Beenakker, C. W. J., Williamson, J. G., Kouwenhoven, L. P., Van der Marel, D., and Foxon, C. T. (1988). Phys. Rev. Lett. 60,848. Wang, J., Guo, H., and Harris, R. (1991). Appl. Phys. Lett. 59, 3075. Washburn, S., Schmid, H., Kern, D., and Webb, R. A. (1987). Phys. Rev. Lett. 59, 1791.
SEMICONDUCTOR QUANTUM DEVICES
253
Webb, R. A. (1989). In “Nanostructure Physics and Fabrication” (M. A. Reed and W. P. Kirk, eds.), p. 43. Academic Press, Boston. Webb, R. A., Washburn, S., Umbach, C., and Laibowitz, R. A. (1985). Phys. Rev. Lett. 54, 2996.
Weil, T., and Vinter, B. (1987). Appl. Phys. Lett. 50, 1281. Wharam, D. A., Thornton, T. J., Newbury, R., Pepper, M., Ahmed, H., Frost, J. E. F., Hasko, D. G., Peakcock, D. C., Ritchie, D. A., and Jones, G . A. C. (1988). J. Phys. C 2 1 , L209.
Wiesendanger, R., Guntherodt, H. J., Guntherodt, G., Gambino, R. J., and Ruf, R. (1990). Phys. Rev. Lett. 65, 247. Wiesendanger, R., Shvets, I. V., Burglar, D., Tarach, G., Guntherodt, H. J., and Coey, J. M. D. (1992). Europhys. Lett. 19, 141. Wigner, E. P. (1955). Phys. Rev. 98, 145. Williams, R. E. (1987). In “GaAs Processing Techniques,” Chapter 11. Artech House, Dedham, MA. Wilson, D. W., Glytsis, E. N., and Gaylor, T. K. (1993). J. Appl. Phys. 73, 3352. Wingren, N. S . , Jacobsen, K. W., and Wilkins, J. W. (1988). Phys. Rev. Lett. 61, 1396. Wolfram, S.(1986). “Theory and Applications of Cellular Automata,” p. 7. World Scientific, Singapore. Woodward, T. K., McGill, T. C., Chung, H. F., and Burnham, R. D. (1987). Appl. Phys. Lett. 51, 1542. Wu, G. Y., and McGill, T. C. (1989). Phys. Rev. B: Condens. Matter 40,9969. Wu, X., and Ulloa, S. E. (1991). Phys. Rev. B: Condens. Matfer44, 13148 Yamane, Y., Enoki, T., Sugitani, S., and Hirayama, M. (1988). Tech. Dig. IEEE Electron Device Meet., IEEE, New York), p. 894. Yamanishi, M. (1987). Phys. Rev. Left. 59, 1014. Yamanishi, M. (1989). Superlattices Microstruct. 4, 403. Yang, R. Q., and Xu, J. M. (1991a). Appl. Phys. Lett. 59, 315. Yang, R. Q., and Xu, J. M. (1991b). Phys. Rev. B: Condens. Matter43, 1699. Yariv, A. (1973). IEEE J. Quantum Electron. QE-9,919. Yokoyama, N., Imamura, K., Muto, S., Hiyamizu, S., and Nishi, H.(1985). Jpn. J. Appl. Phys. 84, L-853. Yosefin, M., and Kaveh, M. (1990). Phys. Rev. lett. 64, 2819. Zawadzki, W., Bauer, G., Racek, W., and Kahlert, H. (1975). Phys. Rev. Lett. 35, 1098. Ziman, J. M. (1972). “Principles of the Theory of Solids,” 2nd edn., Chapter 10. Cambridge Univ. Press, Cambridge, UK. Zuek, W. H. (1984). Phys. Rev. Lett. 53, 391.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 89
Fuzzy Relations and Applications
BERNARD DE BAETS* and ETIENNE KERRE
Department of Applied Mathematics and Computer Science, University of Gent, 9000 Gent, Belgium
I. Introduction to Fuzzy Set Theory
   A. Introduction
   B. Structural Considerations
   C. Miscellaneous Concepts
II. Fuzzy Relational Calculus
   A. Introduction
   B. Classical Relational Calculus
   C. Fuzzy Relational Calculus
   D. Further References
III. Special Types of Fuzzy Relations
   A. Potential Properties of Fuzzy Relations
   B. Similarity Relations
   C. Likeness Relations
IV. Applications of Triangular Compositions
   A. Introduction
   B. An Example from Medical Diagnosis
   C. An Example from Automated Information Retrieval
V. Fuzzy Inference Mechanisms
   A. Introduction
   B. A Fuzzy Knowledge Base
   C. A Fuzzy Inference Engine
   D. Further References
References
I. INTRODUCTION TO FUZZY SET THEORY
A. Introduction
Because of the rapid development of computer technology during the past two decades, our society is moving from an industrial towards an information society. The immense growth of the number-crunching capabilities of modern computers has led to new, powerful applications such as the simulation of complex industrial processes and the creation of very large data bases. Another important computer application lies in the field of
* Research Assistant of the National Fund for Scientific Research (Belgium).
knowledge engineering (Zadeh, 1984). The design of expert systems is one of the major topics of interest in artificial intelligence. It is widely recognized that in such applications we are interested not merely in the syntactical transmission of information, but more in its semantical aspect (Zadeh, 1978). Most classically designed expert systems are based on binary logic and classical set theory, and therefore suppose well-defined notions and predicates that are true without any restriction. Unfortunately, the facts and if-then rules used by experts do not always fit these crisp criteria. It is widely accepted now that humans deal with imprecision and uncertainty by using linguistic values and applying some kind of approximate reasoning techniques, rather than by using exact numerical values and first-order logic. Hence, there exists a real need for formal models capable of grasping this human way of tackling complex phenomena. Imprecision and uncertainty are two inherent features of the incompleteness of information. A proposition is considered uncertain if its truth or falsity cannot be claimed. Classically, there are two ways to treat uncertainty: probability theory and error calculus. As argued extensively by Zadeh (1986), the long-standing tradition in science of treating any kind of uncertainty with probability theory can be questioned. Classical probability theory is insufficiently expressive to serve as the language of uncertainty in artificial intelligence. It has no facilities to describe fuzzy predicates such as “young,” “small,” etc., nor fuzzy probabilities such as “likely” or “not very likely,” nor fuzzy truth values such as “more or less true” or “not completely false,” nor modifiers such as “very” or “slightly.” A proposition is considered imprecise if it contains predicates that are not precisely defined. A classical subset A of a universe X partitions this universe into two disjoint parts. 
This sharp partitioning is an immediate consequence of the classical two-valued logic underlying classical set theory, more specifically of the law of excluded middle $P \lor \lnot P$ that holds for every formula $P$. A lot of notions cannot be handled in such a crisp way. The verification of an assertion such as “2121 is a large number” presupposes the existence of a set of large numbers in order to be able to verify whether 2121 belongs to it. This will be no problem for extreme cases; however, there will be serious doubt about a lot of remaining numbers, the borderline cases. All efforts to define a set of large numbers encounter the same problem: where to put the boundary, that is, how to determine the number above which every number will be regarded as large, and below which every number is not large. Classically, two solutions exist. Some authors, such as Gentilhomme (1968), solve this problem by partitioning the universe of discourse into
three disjoint parts: the certain part, containing the numbers that certainly satisfy the predicate large; the negative part, containing the numbers that are certainly not large; and the set of borderline cases. This tripartitioning of the universe of discourse is based on a Kleene-Dienes three-valued logic. A second category of authors claims that borderline cases must be eliminated by defining concepts that define sharp boundaries, i.e., they make fuzzy concepts crisp. A predicate such as “red,” for instance, has to be defined in the following way: a red object is one that is indistinguishable in hue from a uniformly reflecting surface illuminated with monochromatic light of wavelength between 580.27 nm and 702.35 nm. This kind of precision sounds at least extremely artificial and useless in practice. The foregoing discussion should have convinced the reader that there is a real need for solid mathematical models that are capable of grasping the imprecision and uncertainty intrinsically present in our information systems. In this review we demonstrate the capabilities of fuzzy set theory to achieve some of these goals. The solution proposed by Zadeh (1965) is the following: instead of a sharp boundary or a set of definite borderline cases, he introduces a gradual, rather than abrupt, transition from nonmembership to full membership, allowing partial degrees of membership. The predicate “large,” for example, constitutes a fuzzy set in the underlying universe of discourse (for instance, the set of nonzero natural numbers $\mathbb{N}_0$). The set of large numbers is considered as a mapping $L$, called the membership function, from $\mathbb{N}_0$ into the unit interval $[0, 1]$, defined by
associating with each nonzero natural number $n$ a degree to which it satisfies the description “large number.” The number 100, for example, is considered as a large number to a degree 0.5; the number $10^4$ is a large number to a degree 1,000,000/1,000,001. Clearly, the choice of a membership function is both context- and observer-dependent. These observations indicate that an absolute choice for the membership function is in general impossible: an exact degree of membership does not exist. It suffices to have a rough estimation of the shape of the membership function. As Zadeh states: “In a way that is not well understood at present, humans have a remarkable ability to assign a grade of membership to a given object without a conscious understanding of that way in which the grade is arrived at.” Let us now turn towards a more mathematical discussion.
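As a minimal sketch of such a membership function, the snippet below assumes the form $L(n) = 1/(1 + (100/n)^3)$. This particular formula is an assumption, chosen only because it reproduces the two degrees quoted above (0.5 for 100 and 1,000,000/1,000,001 for $10^4$); the function name `large` is likewise hypothetical.

```python
from fractions import Fraction

def large(n: int) -> Fraction:
    """Degree to which the nonzero natural number n is 'large'.

    Assumed shape (not confirmed by the text):
    L(n) = 1 / (1 + (100 / n)^3) = n^3 / (n^3 + 10^6).
    """
    if n <= 0:
        raise ValueError("n must be a nonzero natural number")
    return Fraction(n**3, n**3 + 10**6)

print(large(100))    # 1/2
print(large(10**4))  # 1000000/1000001
```

Exact rational arithmetic is used so that the two quoted degrees can be compared without rounding; any other increasing function with the same two values would serve equally well as an illustration.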
B. Structural Considerations
1. Original Operations by Zadeh
A mapping $A$ from a universe $X$ into the unit interval $[0, 1]$ is called a fuzzy set in $X$; the value $A(x)$ of $x \in X$ is called the degree of membership of $x$ in $A$. The class of all fuzzy sets in $X$ is denoted $\mathcal{F}(X)$. A fuzzy set $R$ in the Cartesian product $X \times Y$ of two universes $X$ and $Y$ is called a fuzzy relation from $X$ to $Y$. A fuzzy relation from $X$ to $X$ is called a binary fuzzy relation in $X$. In the original approach by Zadeh (1965), the unit interval was chosen to evaluate the degree to which an element belongs to a fuzzy set. However, in some applications this choice is too restrictive. The unit interval is totally ordered and therefore does not allow incomparable degrees of membership. Goguen (1967) has extended the concept of a fuzzy set to an $L$-fuzzy set by using a complete lattice $L$ as a set to evaluate degrees of membership. In this review we will stick to the unit interval. It is well known from classical logic, as well as, for instance, from switching theory, that the structure $(\{0, 1\}, \max, \min, 1 - \cdot)$ constitutes a Boolean algebra and that, as a consequence, the structure $(\mathcal{P}(X), \cup, \cap, \mathrm{co})$, where $\mathcal{P}(X)$ denotes the powerset of $X$, also constitutes a Boolean algebra. The structure $([0, 1], \max, \min, 1 - \cdot)$ no longer satisfies all the axioms of a Boolean algebra. There is some loss of structure with respect to the complementation operation. The structure $([0, 1], \max, \min, 1 - \cdot)$ constitutes a Morgan algebra, sometimes called soft algebra; more specifically, it is a bounded, completely distributive lattice satisfying the following additional properties, for any $x$ and $y$ in $[0, 1]$, where $x'$ is shorthand for $1 - x$:
(i) $0' = 1$ and $1' = 0$;
(ii) $\max(x, y)' = \min(x', y')$;
(iii) $\min(x, y)' = \max(x', y')$;
(iv) $x \leq y \iff y' \leq x'$;
(v) $(x')' = x$;
(vi) $\max(x, y) = 1 \Rightarrow x' \leq y$.
However, the property
$$(\forall x \in [0, 1])(\exists y \in [0, 1])(\max(x, y) = 1 \text{ and } \min(x, y) = 0)$$
does not hold, which means that this structure is only pseudo-complemented. Zadeh’s original extensions of the classical union, intersection, and complementation are based on this Morgan algebra. Consider two fuzzy sets $A$ and $B$ in $X$; then:
(i) the union $A \cup B$ of $A$ and $B$ is the fuzzy set in $X$ defined by $(A \cup B)(x) = \max(A(x), B(x))$;
(ii) the intersection $A \cap B$ of $A$ and $B$ is the fuzzy set in $X$ defined by $(A \cap B)(x) = \min(A(x), B(x))$;
(iii) the complement $\mathrm{co}\,A$ of $A$ is the fuzzy set in $X$ defined by $(\mathrm{co}\,A)(x) = 1 - A(x)$.
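The three definitions above can be sketched on a finite universe as follows. The dictionary representation, the function names, and the sample degrees are illustrative assumptions, not part of the original text.

```python
# Fuzzy sets on a finite universe, stored as {element: membership degree}.
# Both operands are assumed to be defined on the same universe.

def union(a, b):
    """Zadeh union: pointwise maximum."""
    return {x: max(a[x], b[x]) for x in a}

def intersection(a, b):
    """Zadeh intersection: pointwise minimum."""
    return {x: min(a[x], b[x]) for x in a}

def complement(a):
    """Zadeh complement: pointwise 1 - A(x)."""
    return {x: 1 - a[x] for x in a}

A = {"x1": 0.25, "x2": 0.5, "x3": 1.0}
B = {"x1": 0.75, "x2": 0.5, "x3": 0.0}

print(union(A, B))         # {'x1': 0.75, 'x2': 0.5, 'x3': 1.0}
print(intersection(A, B))  # {'x1': 0.25, 'x2': 0.5, 'x3': 0.0}
print(complement(A))       # {'x1': 0.75, 'x2': 0.5, 'x3': 0.0}
```

The degrees are multiples of 0.25 so that the floating-point results are exact; for arbitrary degrees, `fractions.Fraction` would avoid rounding effects.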
Since the structure $([0, 1], \max, \min, 1 - \cdot)$ constitutes a Morgan algebra, it immediately follows that the structure $(\mathcal{F}(X), \cup, \cap, \mathrm{co})$ also constitutes a Morgan algebra. This means that compared to $(\mathcal{P}(X), \cup, \cap, \mathrm{co})$, there is some loss of structure, which is to be expected because the concept of a fuzzy set is an extension of the concept of a classical set, when identified with its characteristic mapping. The corresponding order relation $\subseteq$ on $\mathcal{F}(X)$ is defined by
$$A \subseteq B \iff (\forall x \in X)(A(x) \leq B(x)).$$
The union and intersection of fuzzy sets are extended to arbitrary families of fuzzy sets in the following way. Consider a family $(A_i \mid i \in I)$ of fuzzy sets in $X$; then:
(i) the union $\bigcup_{i \in I} A_i$ of the family $(A_i \mid i \in I)$ is the fuzzy set in $X$ defined by
$$\Big(\bigcup_{i \in I} A_i\Big)(x) = \sup_{i \in I} A_i(x);$$
(ii) the intersection $\bigcap_{i \in I} A_i$ of the family $(A_i \mid i \in I)$ is the fuzzy set in $X$ defined by
$$\Big(\bigcap_{i \in I} A_i\Big)(x) = \inf_{i \in I} A_i(x).$$
For a complete list of the properties of the class of fuzzy sets endowed with Zadeh’s original max-min operations, we refer to Kerre (1993). We mention the most important ones. Consider three fuzzy sets $A$, $B$, and $C$ in $X$; then:
1. de Morgan laws:
$$\mathrm{co}(A \cup B) = \mathrm{co}\,A \cap \mathrm{co}\,B, \qquad \mathrm{co}(A \cap B) = \mathrm{co}\,A \cup \mathrm{co}\,B;$$
2. idempotency laws:
$$A \cup A = A \cap A = A;$$
3. commutativity laws:
$$A \cup B = B \cup A, \qquad A \cap B = B \cap A;$$
4. associativity laws:
$$(A \cup B) \cup C = A \cup (B \cup C), \qquad (A \cap B) \cap C = A \cap (B \cap C);$$
5. absorption laws:
$$A \cup (A \cap B) = A \cap (A \cup B) = A;$$
6. distributivity laws:
$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C), \qquad A \cap (B \cup C) = (A \cap B) \cup (A \cap C).$$
Some of the most important deviations of fuzzy set theory from classical set theory are the following.
1. In fuzzy set theory the law of contradiction $A \cap \mathrm{co}\,A = \emptyset$ and its dual, the law of excluded middle $A \cup \mathrm{co}\,A = X$, no longer hold. This means that, by allowing sets to have ill-defined boundaries, it is possible to have some overlap between a fuzzy set and its complement. This overlap is bounded in the following way:
$$(\forall x \in X)\big((A \cap \mathrm{co}\,A)(x) \leq \tfrac{1}{2}\big).$$
This law is called the weakened law of contradiction. On the other hand, because of this overlap, a fuzzy set and its complement do not fill up the entire universe. We only have that
$$(\forall x \in X)\big((A \cup \mathrm{co}\,A)(x) \geq \tfrac{1}{2}\big).$$
This law is called the weakened law of excluded middle.
2. For classical sets $A$ and $B$ in a universe $X$, the relationship $\mathrm{co}\,A \subseteq B$ allows the conclusion that $A \cup B = X$, and the relationship $B \subseteq \mathrm{co}\,A$ allows the conclusion that $A \cap B = \emptyset$. These properties are no longer valid for fuzzy sets.
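Both deviations can be checked numerically. The sketch below samples membership degrees on an assumed grid of hundredths and uses exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

half = Fraction(1, 2)
grid = [Fraction(k, 100) for k in range(101)]  # sample degrees A(x)

# Weakened law of contradiction: min(a, 1 - a) <= 1/2 for every degree a.
contradiction_ok = all(min(a, 1 - a) <= half for a in grid)
# Weakened law of excluded middle: max(a, 1 - a) >= 1/2 for every degree a.
excluded_middle_ok = all(max(a, 1 - a) >= half for a in grid)

# Second deviation: co A included in B does not force A union B = X.
a, b = half, half           # A(x) = B(x) = 1/2 at some point x
co_a_in_b = (1 - a) <= b    # co A(x) <= B(x) holds at x
union_degree = max(a, b)    # yet (A union B)(x) is only 1/2, not 1

print(contradiction_ok, excluded_middle_ok, co_a_in_b, union_degree)
# True True True 1/2
```

The bound 1/2 is attained exactly at the degree 1/2, which is also the degree used for the counterexample to the classical conclusion $A \cup B = X$.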
As in classical set theory, a lot of operations can be derived from the basic operations union, intersection, and complement. We mention the difference and symmetric difference. Consider two fuzzy sets $A$ and $B$ in $X$; then:
(i) the difference $A \setminus B$ of $A$ and $B$ is defined as $A \cap \mathrm{co}\,B$, i.e.,
$$(A \setminus B)(x) = \min(A(x), 1 - B(x));$$
(ii) the symmetric difference $A \oplus B$ of $A$ and $B$ is defined as $(A \setminus B) \cup (B \setminus A)$, i.e.,
$$(A \oplus B)(x) = \max(\min(A(x), 1 - B(x)), \min(1 - A(x), B(x))).$$
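A short sketch of the two derived operations on a finite universe follows; the dictionary representation and the sample degrees are assumptions made for illustration.

```python
def difference(a, b):
    """Difference of A and B: pointwise min(A(x), 1 - B(x))."""
    return {x: min(a[x], 1 - b[x]) for x in a}

def symmetric_difference(a, b):
    """Union (pointwise max) of the two one-sided differences."""
    d_ab, d_ba = difference(a, b), difference(b, a)
    return {x: max(d_ab[x], d_ba[x]) for x in a}

A = {"x1": 0.75, "x2": 0.5}
B = {"x1": 0.25, "x2": 0.5}

print(difference(A, B))            # {'x1': 0.75, 'x2': 0.5}
print(symmetric_difference(A, B))  # {'x1': 0.75, 'x2': 0.5}
print(difference(A, A))            # {'x1': 0.25, 'x2': 0.5}
```

The last line shows that the difference of a fuzzy set with itself need not be empty, which is the weakened law of contradiction seen from another angle.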
It should be stressed that the maximum and minimum operations are not the only candidates for the generalization of the union and intersection of classical sets to fuzzy sets. A lot of theoretical as well as practical research has been performed on the problem of the confluence of several objectives (Bellman and Giertz, 1973; Zimmermann and Zysno, 1980). In fact, any triangular conorm can be used to model union (disjunction), and any triangular norm can be used to model intersection (conjunction). A triangular conorm $\mathcal{S}$, resp. norm $\mathcal{T}$, is an increasing, associative, and commutative $[0, 1]^2 \to [0, 1]$ mapping satisfying the boundary condition $(\forall x \in [0, 1])(\mathcal{S}(x, 0) = x)$, resp. $(\forall x \in [0, 1])(\mathcal{T}(1, x) = x)$ (Schweizer and Sklar, 1983). The corresponding union $\cup_{\mathcal{S}}$ and intersection $\cap_{\mathcal{T}}$ of two fuzzy sets $A$ and $B$ in $X$ are defined by
$$(A \cup_{\mathcal{S}} B)(x) = \mathcal{S}(A(x), B(x)), \qquad (A \cap_{\mathcal{T}} B)(x) = \mathcal{T}(A(x), B(x)).$$
To every triangular norm $\mathcal{T}$ corresponds a triangular conorm $\mathcal{T}^*$, called the dual conorm, defined by
$$\mathcal{T}^*(x, y) = 1 - \mathcal{T}(1 - x, 1 - y).$$
Popular choices for $\mathcal{T}$ in fuzzy set theory are minimum ($M$), product ($P$), and the bold intersection $W$ defined by $W(x, y) = \max(0, x + y - 1)$. The corresponding dual conorms are maximum, probabilistic sum, and bounded sum. Each such pair of dual operations gives rise to alternative definitions for the union and intersection of fuzzy sets. Notice that for any triangular norm $\mathcal{T}$ it holds that $\mathcal{T}(x, y) \leq \min(x, y)$. For any triangular conorm $\mathcal{S}$ it holds that $\max(x, y) \leq \mathcal{S}(x, y)$.
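The duality construction and the two bounds can be sketched as follows. The function names are hypothetical, and the checks run over an assumed grid of tenths with exact fractions, so they illustrate the statements rather than prove them.

```python
from fractions import Fraction
from itertools import product

def t_min(x, y):
    return min(x, y)

def t_product(x, y):
    return x * y

def t_bold(x, y):
    return max(0, x + y - 1)

def dual(t_norm):
    """Dual conorm T*(x, y) = 1 - T(1 - x, 1 - y)."""
    return lambda x, y: 1 - t_norm(1 - x, 1 - y)

grid = [Fraction(k, 10) for k in range(11)]
pairs = list(product(grid, repeat=2))

# The duals coincide with maximum, probabilistic sum, and bounded sum.
assert all(dual(t_min)(x, y) == max(x, y) for x, y in pairs)
assert all(dual(t_product)(x, y) == x + y - x * y for x, y in pairs)
assert all(dual(t_bold)(x, y) == min(1, x + y) for x, y in pairs)

# On the grid, each norm lies below min and each dual conorm above max.
for t in (t_min, t_product, t_bold):
    assert all(t(x, y) <= min(x, y) for x, y in pairs)
    assert all(dual(t)(x, y) >= max(x, y) for x, y in pairs)

print(t_bold(Fraction(7, 10), Fraction(6, 10)))        # 3/10
print(dual(t_bold)(Fraction(7, 10), Fraction(6, 10)))  # 1
```

Passing the norm as a function keeps the dual conorm a one-line definition, so further norms can be added without touching the checks.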
2. Probabilistic Sum and Algebraic Product Operations
The structure $(\mathcal{F}(X), \cup, \cap, \mathrm{co})$ is based on the order-dependent operations maximum and minimum on the unit interval. Using the arithmetic operations we can introduce alternative operations on $\mathcal{F}(X)$. As mentioned earlier, the (algebraic) product $P$ is a triangular norm. The corresponding dual conorm, the probabilistic sum, is defined by $P^*(x, y) = x + y - xy$. Consider two fuzzy sets $A$ and $B$ in $X$; then
(i) the probabilistic sum $A \cup_{P^*} B$ of $A$ and $B$ is the fuzzy set in $X$ defined by $(A \cup_{P^*} B)(x) = A(x) + B(x) - A(x)B(x)$;
(ii) the algebraic product $A \cap_P B$ of $A$ and $B$ is the fuzzy set in $X$ defined by $(A \cap_P B)(x) = A(x)B(x)$.
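The structure $(\mathcal{F}(X), \cup_{P^*}, \cap_P, \mathrm{co})$ loses some properties compared to the structure $(\mathcal{F}(X), \cup, \cap, \mathrm{co})$. Consider three fuzzy sets $A$, $B$, and $C$ in $X$; then
1. de Morgan laws:
$$\mathrm{co}(A \cup_{P^*} B) = \mathrm{co}\,A \cap_P \mathrm{co}\,B, \qquad \mathrm{co}(A \cap_P B) = \mathrm{co}\,A \cup_{P^*} \mathrm{co}\,B;$$
2. weakened idempotency laws:
$$A \cap_P A \subseteq A \subseteq A \cup_{P^*} A;$$
3. commutativity laws:
$$A \cup_{P^*} B = B \cup_{P^*} A, \qquad A \cap_P B = B \cap_P A;$$
4. associativity laws:
$$(A \cup_{P^*} B) \cup_{P^*} C = A \cup_{P^*} (B \cup_{P^*} C), \qquad (A \cap_P B) \cap_P C = A \cap_P (B \cap_P C);$$
5. weakened absorption laws:
$$A \cap_P (A \cup_{P^*} B) \subseteq A \subseteq A \cup_{P^*} (A \cap_P B);$$
6. weakened distributivity laws:
$$(A \cup_{P^*} B) \cap_P (A \cup_{P^*} C) \subseteq A \cup_{P^*} (B \cap_P C), \qquad A \cap_P (B \cup_{P^*} C) \subseteq (A \cap_P B) \cup_{P^*} (A \cap_P C).$$
The new structure does not obey the idempotency, absorption, and distributivity laws. On the other hand, there is an improvement with respect to the law of contradiction and the law of excluded middle. Indeed, we have
$$(\forall x \in X)\big((A \cap_P \mathrm{co}\,A)(x) \leq \tfrac{1}{4}\big), \qquad (\forall x \in X)\big((A \cup_{P^*} \mathrm{co}\,A)(x) \geq \tfrac{3}{4}\big).$$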
Perhaps one of the most remarkable differences between the max-min-based and the probabilistic sum-product-based operations is that the latter ones show some compensatory behavior. More explicitly, for $A(x) = 0.5$ we have $(A \cap B)(x) \leq 0.5$, independent of the value of $B(x)$; however, enlarging $B(x)$ leads to an increase of $(A \cap_P B)(x)$.
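The compensatory behavior can be made concrete with a small table. The sketch fixes $A(x) = 0.5$ as in the text and lets $B(x)$ grow over a few assumed sample values; the helper name `compare` is hypothetical.

```python
def compare(a, b_values):
    """Rows (B(x), min-based degree, product-based degree) for fixed A(x) = a."""
    return [(b, min(a, b), a * b) for b in b_values]

for row in compare(0.5, (0.5, 0.75, 1.0)):
    print(*row)
# 0.5 0.5 0.25
# 0.75 0.5 0.375
# 1.0 0.5 0.5
```

Once $B(x)$ exceeds 0.5 the minimum no longer reacts to it, whereas the product keeps increasing with $B(x)$.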
3. Bounded Sum and Bold Intersection Operations
The third and final structure that we mention here is $(\mathcal{F}(X), \cup_{W^*}, \cap_W, \mathrm{co})$, based on the bold intersection $W$. The corresponding conorm, the bounded sum, is defined by $W^*(x, y) = \min(1, x + y)$. Consider two fuzzy sets $A$ and $B$ in $X$; then
(i) the bounded sum $A \cup_{W^*} B$ of $A$ and $B$ is the fuzzy set in $X$ defined by $(A \cup_{W^*} B)(x) = \min(1, A(x) + B(x))$;
(ii) the bold intersection $A \cap_W B$ of $A$ and $B$ is the fuzzy set in $X$ defined by $(A \cap_W B)(x) = \max(0, A(x) + B(x) - 1)$.
The max-min operations are entirely based on the order structure of the unit interval; the probabilistic sum and algebraic product operations only use the algebraic structure of it. The bounded sum and bold intersection operations are in this sense hybrid: the order structure as well as the algebraic structure are used. The structure $(\mathcal{F}(X), \cup_{W^*}, \cap_W, \mathrm{co})$ loses some properties compared to the structure $(\mathcal{F}(X), \cup, \cap, \mathrm{co})$. Consider three fuzzy sets $A$, $B$, and $C$ in $X$; then
1. de Morgan laws:
$$\mathrm{co}(A \cup_{W^*} B) = \mathrm{co}\,A \cap_W \mathrm{co}\,B, \qquad \mathrm{co}(A \cap_W B) = \mathrm{co}\,A \cup_{W^*} \mathrm{co}\,B;$$
2. weakened idempotency laws:
$$A \cap_W A \subseteq A \subseteq A \cup_{W^*} A;$$
3. commutativity laws:
$$A \cup_{W^*} B = B \cup_{W^*} A, \qquad A \cap_W B = B \cap_W A;$$
4. associativity laws:
$$(A \cup_{W^*} B) \cup_{W^*} C = A \cup_{W^*} (B \cup_{W^*} C), \qquad (A \cap_W B) \cap_W C = A \cap_W (B \cap_W C);$$
5. weakened absorption laws:
$$A \cap_W (A \cup_{W^*} B) \subseteq A \subseteq A \cup_{W^*} (A \cap_W B);$$
6. no weakened distributivity laws.
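Two features of this structure can be checked directly: the bold intersection of a degree with its complement vanishes while their bounded sum is 1, and neither inclusion between $A \cap_W (B \cup_{W^*} C)$ and $(A \cap_W B) \cup_{W^*} (A \cap_W C)$ holds in general. The sample degrees below are assumptions picked to exhibit both failures; the function names are hypothetical.

```python
from fractions import Fraction

def bold(x, y):
    """Bold intersection W(x, y) = max(0, x + y - 1)."""
    return max(0, x + y - 1)

def bounded_sum(x, y):
    """Bounded sum W*(x, y) = min(1, x + y)."""
    return min(1, x + y)

grid = [Fraction(k, 20) for k in range(21)]
# A degree and its complement: bold intersection 0, bounded sum 1.
assert all(bold(a, 1 - a) == 0 for a in grid)
assert all(bounded_sum(a, 1 - a) == 1 for a in grid)

half = Fraction(1, 2)
# Case A(x) = B(x) = C(x) = 1/2: left side exceeds right side.
lhs1 = bold(half, bounded_sum(half, half))
rhs1 = bounded_sum(bold(half, half), bold(half, half))
# Case A(x) = 1/2, B(x) = C(x) = 1: right side exceeds left side.
lhs2 = bold(half, bounded_sum(1, 1))
rhs2 = bounded_sum(bold(half, 1), bold(half, 1))

print(lhs1, rhs1, lhs2, rhs2)  # 1/2 0 1/2 1
```

Since the comparison goes one way in the first case and the other way in the second, no inclusion between the two sides can hold for all fuzzy sets.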
Compared to the previous structure, this structure also loses the weakened distributivity laws. However, for this structure the law of contradiction and the law of excluded middle hold. The bounded sum-bold intersection operations also show some compensatory behavior.
C. Miscellaneous Concepts
1. Cartesian Product
Consider a fuzzy set $A$ in $X$ and a fuzzy set $B$ in $Y$; then the Cartesian product $A \times B$ of $A$ and $B$ is the fuzzy set in $X \times Y$ (the fuzzy relation from $X$ to $Y$) defined by $A \times B(x, y) = \min(A(x), B(y))$. In fact, as for intersection, any triangular norm $\mathcal{T}$ can be used to model the Cartesian product. The corresponding Cartesian product $\times_{\mathcal{T}}$ is then, of course, defined by
$A \times_{\mathcal{T}} B(x, y) = \mathcal{T}(A(x), B(y))$.
For a complete list of the properties of the Cartesian product we refer to Kerre (1993). We mention for instance the distributivity laws. Consider a fuzzy set $A$ in $X$ and two fuzzy sets $B$ and $C$ in $Y$; then
$A \times (B \cup C) = (A \times B) \cup (A \times C)$,
$A \times (B \cap C) = (A \times B) \cap (A \times C)$.
For a general union and intersection only weakened distributivity laws hold:
$A \times (B \cup_{\mathcal{S}} C) \subseteq (A \times B) \cup_{\mathcal{S}} (A \times C)$,
$A \times (B \cap_{\mathcal{T}} C) \supseteq (A \times B) \cap_{\mathcal{T}} (A \times C)$.
2. Cuts and Strict Cuts

Important concepts that allow us to relate fuzzy sets to classical sets are the cuts and strict cuts. The $\alpha$-cut, $\alpha \in \,]0, 1]$, of a fuzzy set $A$ in $X$ is the subset $A_\alpha$ of $X$ defined as follows:
$A_\alpha = \{x \mid x \in X \text{ and } A(x) \ge \alpha\}$.
The strict $\alpha$-cut, $\alpha \in [0, 1[$, of a fuzzy set $A$ in $X$ is the subset $A_{\overline{\alpha}}$ of $X$ defined as follows:
$A_{\overline{\alpha}} = \{x \mid x \in X \text{ and } A(x) > \alpha\}$.
The 1-cut of a fuzzy set $A$ is also called the kernel of $A$, denoted $\ker(A)$; the strict 0-cut of $A$ is also called the support of $A$, denoted $\mathrm{supp}(A)$.
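Consider a fuzzy set $A$ in $X$ and a fuzzy set $B$ in $Y$, $\alpha$ and $\beta$ in $[0, 1]$; then
1. the (strict) cuts form a descending chain:
$\alpha \le \beta \Rightarrow A_\beta \subseteq A_\alpha$,
$\alpha \le \beta \Rightarrow A_{\overline{\beta}} \subseteq A_{\overline{\alpha}}$;
2. decomposition of a fuzzy set in terms of its (strict) cuts:
$A = \bigcup \{\alpha A_\alpha \mid \alpha \in \,]0, 1]\} = \bigcup \{\alpha A_{\overline{\alpha}} \mid \alpha \in [0, 1[\}$;
3. characterization of inclusion in terms of (strict) cuts:
$A \subseteq B \Leftrightarrow (\forall \alpha \in \,]0, 1])(A_\alpha \subseteq B_\alpha)$,
$A \subseteq B \Leftrightarrow (\forall \alpha \in [0, 1[)(A_{\overline{\alpha}} \subseteq B_{\overline{\alpha}})$;
4. Zadeh's union in terms of (strict) cuts:
$(A \cup B)_\alpha = A_\alpha \cup B_\alpha$,
$(A \cup B)_{\overline{\alpha}} = A_{\overline{\alpha}} \cup B_{\overline{\alpha}}$;
5. Zadeh's intersection in terms of (strict) cuts:
$(A \cap B)_\alpha = A_\alpha \cap B_\alpha$,
$(A \cap B)_{\overline{\alpha}} = A_{\overline{\alpha}} \cap B_{\overline{\alpha}}$;
6. Cartesian product in terms of (strict) cuts:
$(A \times B)_\alpha = A_\alpha \times B_\alpha$,
$(A \times B)_{\overline{\alpha}} = A_{\overline{\alpha}} \times B_{\overline{\alpha}}$.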
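For finite universes the cuts are easy to compute, and the interaction with Zadeh's union and intersection can be verified level by level. The sketch below is an illustration under stated assumptions: the fuzzy sets, the universe, and the helper names are hypothetical, and the rebuilding step uses only the finitely many membership degrees that actually occur as levels.

```python
def cut(A, alpha):
    """alpha-cut: all points with membership degree >= alpha."""
    return {x for x, degree in A.items() if degree >= alpha}

def strict_cut(A, alpha):
    """Strict alpha-cut: all points with membership degree > alpha."""
    return {x for x, degree in A.items() if degree > alpha}

def union(A, B):
    """Zadeh's union (pointwise maximum)."""
    return {x: max(A[x], B[x]) for x in A}

def intersection(A, B):
    """Zadeh's intersection (pointwise minimum)."""
    return {x: min(A[x], B[x]) for x in A}

def rebuild_from_cuts(A):
    """Recover A(x) as the largest occurring level whose cut contains x."""
    levels = sorted({d for d in A.values() if d > 0})
    return {x: max((a for a in levels if x in cut(A, a)), default=0) for x in A}

# Hypothetical fuzzy sets on the universe {a, b, c, d}.
A = {"a": 0.0, "b": 0.3, "c": 1.0, "d": 0.7}
B = {"a": 0.5, "b": 0.3, "c": 0.2, "d": 0.9}

kernel_A = cut(A, 1.0)          # the 1-cut
support_A = strict_cut(A, 0.0)  # the strict 0-cut
```

Only comparisons of membership degrees are involved, so floating-point degrees are unproblematic here.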
3. Height and Plinth

The height and plinth operators occur frequently in fuzzy set theory, for instance in the fuzzy relational calculus and in the generalization of classical inference rules. Their function is mainly to reduce the size of formulas and to facilitate the study of these formulas. For this purpose, the properties of the height and plinth operators come in very handy. Consider a fuzzy set $A$ in $X$; then
(i) the height $\mathrm{Hgt}(A)$ of $A$ is defined as $\mathrm{Hgt}(A) = \sup_{x \in X} A(x)$;
(ii) the plinth $\mathrm{Plt}(A)$ of $A$ is defined as $\mathrm{Plt}(A) = \inf_{x \in X} A(x)$.
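On a finite universe the supremum and infimum are attained, so the two operators reduce to a maximum and a minimum. The following minimal sketch assumes a hypothetical three-element universe and uses exact fractions; the function names are illustrative only.

```python
from fractions import Fraction as F

def height(A):
    """Hgt(A): the supremum (here: maximum) of the membership degrees."""
    return max(A.values())

def plinth(A):
    """Plt(A): the infimum (here: minimum) of the membership degrees."""
    return min(A.values())

# Hypothetical fuzzy set and its standard complement.
A = {"a": F(1, 5), "b": F(3, 4), "c": F(1, 2)}
co_A = {x: 1 - degree for x, degree in A.items()}
```

With these values the duality Hgt(co A) = 1 - Plt(A) can be observed directly, since complementation reverses the order of the degrees.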
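The height and plinth operators show the following properties. Consider two fuzzy sets $A$ and $B$ and a family $(A_i \mid i \in I)$ of fuzzy sets in $X$; then
1. monotonicity:
$A \subseteq B \Rightarrow \mathrm{Hgt}(A) \le \mathrm{Hgt}(B)$,
$A \subseteq B \Rightarrow \mathrm{Plt}(A) \le \mathrm{Plt}(B)$;
2. interaction with Zadeh's union:
3. interaction with Zadeh's intersection:
4. interaction with a general union:
$\mathrm{Hgt}(A \cup_{\mathcal{S}} B) \le \mathcal{S}(\mathrm{Hgt}(A), \mathrm{Hgt}(B))$,
$\mathrm{Plt}(A \cup_{\mathcal{S}} B) \ge \mathcal{S}(\mathrm{Plt}(A), \mathrm{Plt}(B))$;
5. interaction with a general intersection:
$\mathrm{Hgt}(A \cap_{\mathcal{T}} B) \le \mathcal{T}(\mathrm{Hgt}(A), \mathrm{Hgt}(B))$,
$\mathrm{Plt}(A \cap_{\mathcal{T}} B) \ge \mathcal{T}(\mathrm{Plt}(A), \mathrm{Plt}(B))$;
6. interaction with complementation:
$\mathrm{Hgt}(\mathrm{co}\,A) = 1 - \mathrm{Plt}(A)$.

II. FUZZY RELATIONAL CALCULUS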
A. Introduction

From different points of view the concept of a relation is a fundamental one. Sometimes science itself is described as the discovery of relations between objects in a very broad sense. In mathematics, relations, and in particular functional relations or, in short, functions, are undeniably basic concepts.
How could we imagine the development of a mathematical framework without the fundamental relations such as $\in$, $\subseteq$, $=$, $\ne$, $\ge$, $\le$, ...? From a practical point of view relations are even more important: a relation is a suitable tool for describing correspondences between facts. Furthermore, the concept of a relation is a key concept in the framework of relational databases: data are represented by means of two-dimensional tables, expressing relationships between attributes. In this review we will show that the concept of a fuzzy relation has considerably enlarged the potential application areas.

B. Classical Relational Calculus

1. Basic Concepts
It is well known that a (crisp) relation $R$ from a universe $X$ to a universe $Y$ is a subset of $X \times Y$, i.e., $R \subseteq X \times Y$. The formula $(x, y) \in R$ is abbreviated as $xRy$ and one says that $x$ is in relation $R$ with $y$. A relation $R$ from $X$ to $X$ is called a binary relation in $X$. Consider a relation $R$ from $X$ to $Y$, $x \in X$ and $y \in Y$; then
(i) the afterset $xR$ of $x$ is the subset of $Y$ defined as $xR = \{y \mid y \in Y \text{ and } xRy\}$;
(ii) the foreset $Ry$ of $y$ is the subset of $X$ defined as $Ry = \{x \mid x \in X \text{ and } xRy\}$;
(iii) the domain $\mathrm{dom}(R)$ of $R$ is the subset of $X$ defined as $\mathrm{dom}(R) = \{x \mid x \in X \text{ and } (\exists y \in Y)(xRy)\}$;
(iv) the range $\mathrm{rng}(R)$ of $R$ is the subset of $Y$ defined as $\mathrm{rng}(R) = \{y \mid y \in Y \text{ and } (\exists x \in X)(xRy)\}$;
(v) the converse relation $R'$ of $R$ is the relation from $Y$ to $X$ defined as $R' = \{(y, x) \mid (y, x) \in Y \times X \text{ and } xRy\}$.

2. Images of a Set under a Relation

Consider a relation $R$ from $X$ to $Y$ and a subset $A$ of $X$. The classical definition of the direct image of the set $A$ under the relation $R$ is given as follows:
$R(A) = \{y \mid y \in Y \text{ and } (\exists x \in A)(xRy)\}$,
and it can be written in terms of foresets as
$R(A) = \{y \mid y \in Y \text{ and } A \cap Ry \ne \emptyset\}$.
The direct image $R(A)$ is the set of those elements of $Y$ that are in relation $R'$ with at least one element of $A$. The purpose now is to refine the direct image $R(A)$ in order to distinguish those elements of $Y$ that are in relation $R'$ with all elements of $A$ and those elements of $Y$ that are in relation $R'$ with elements of $A$ only. This refinement is achieved in the following definition:
(i) the subdirect image $R_{\triangleleft}(A)$ of $A$ under $R$ is the subset of $Y$ defined as $R_{\triangleleft}(A) = \{y \mid y \in Y \text{ and } \emptyset \subset A \subseteq Ry\}$;
(ii) the superdirect image $R_{\triangleright}(A)$ of $A$ under $R$ is the subset of $Y$ defined as $R_{\triangleright}(A) = \{y \mid y \in Y \text{ and } \emptyset \subset Ry \subseteq A\}$;
(iii) the ultradirect image $R_{\diamond}(A)$ of $A$ under $R$ is the subset of $Y$ defined as $R_{\diamond}(A) = \{y \mid y \in Y \text{ and } \emptyset \subset A = Ry\}$.
For a non-empty set $A$, the subdirect image $R_{\triangleleft}(A)$ can be written as
$R_{\triangleleft}(A) = \{y \mid y \in Y \text{ and } (\forall x \in A)(xRy)\}$.
This explains why the direct and subdirect images are called existential and universal compositions by Izumi et al. (1986). They are also called upper and lower images by Dubois and Prade (1992). The non-emptiness condition $\emptyset \subset A$ in the definition of the subdirect image seems superfluous at first sight and could be evaded by restricting the definition to non-empty sets. Without this condition it would follow that $R_{\triangleleft}(\emptyset) = Y$, which is unacceptable. Neither Izumi et al. nor Dubois and Prade have observed the necessity of this non-emptiness condition. The non-emptiness condition $\emptyset \subset Ry$ in the definition of the superdirect image has stronger consequences. Without this condition it would follow that $\mathrm{co}(\mathrm{rng}(R)) \subseteq R_{\triangleright}(A)$, which is unacceptable again. The condition $\emptyset \subset Ry$ ensures that $R_{\triangleright}(A)$ contains those elements of $Y$ that are in relation $R'$ with elements of $A$ only and that actually do so. The non-emptiness conditions imply that all images are contained in $\mathrm{rng}(R)$ and that the images of the empty set under a relation all yield the empty set.
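The role of the non-emptiness conditions can be made concrete with ordinary Python sets. The sketch below is an illustration only: the patients, the symptoms, the relation, and the function names are all hypothetical, and a relation is represented as a set of pairs.

```python
def foreset(R, y):
    """Ry: all x that are in relation R with y."""
    return {x for (x, yy) in R if yy == y}

def direct_image(R, A, Y):
    """All y whose foreset meets A."""
    return {y for y in Y if A & foreset(R, y)}

def subdirect_image(R, A, Y):
    """All y with A non-empty and A contained in Ry."""
    return {y for y in Y if A and A <= foreset(R, y)}

def superdirect_image(R, A, Y):
    """All y with Ry non-empty and Ry contained in A."""
    return {y for y in Y if foreset(R, y) and foreset(R, y) <= A}

def ultradirect_image(R, A, Y):
    """All y with A non-empty and A equal to Ry."""
    return {y for y in Y if A and A == foreset(R, y)}

# Hypothetical data: three patients, four symptoms.
Y = {"fever", "cough", "rash", "nausea"}
R = {("ann", "fever"), ("bea", "fever"), ("carl", "fever"),
     ("ann", "cough"), ("bea", "cough"), ("ann", "rash")}
females = {"ann", "bea"}

# Without the condition "Ry non-empty", the symptom shown by nobody would slip in.
unguarded_super = {y for y in Y if foreset(R, y) <= females}
```

In this data set nobody shows nausea, so its foreset is empty; the unguarded variant therefore accepts it, whereas the guarded superdirect image does not.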
Example II.1. The images of a set under a relation can be illustrated by an example from medical diagnosis. Consider a set of patients $X$ and a set of symptoms $Y$. Let $R$ be the relation from $X$ to $Y$ defined by
$pRs \Leftrightarrow$ patient $p$ shows symptom $s$.
Let $F$ be the non-empty set of female patients in the population $X$; then the images of $F$ under $R$ are given by
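$R(F)$ is the set of symptoms shown by at least one female patient;
$R_{\triangleleft}(F)$ is the set of symptoms shown by all female patients;
$R_{\triangleright}(F)$ is the set of symptoms shown by at least one female patient and not by any male patient;
$R_{\diamond}(F)$ is the set of symptoms shown by all female patients and not by any male patient.

Now consider a subset $B$ of $Y$. The direct images of $B$ under the converse relation $R'$ are called the inverse images of $B$ under $R$:
(i) the inverse image $R'(B)$ of $B$ under $R$ is the subset of $X$ defined as $R'(B) = \{x \mid x \in X \text{ and } B \cap xR \ne \emptyset\}$;
(ii) the subinverse image $R'_{\triangleleft}(B)$ of $B$ under $R$ is the subset of $X$ defined as $R'_{\triangleleft}(B) = \{x \mid x \in X \text{ and } \emptyset \subset B \subseteq xR\}$;
(iii) the superinverse image $R'_{\triangleright}(B)$ of $B$ under $R$ is the subset of $X$ defined as $R'_{\triangleright}(B) = \{x \mid x \in X \text{ and } \emptyset \subset xR \subseteq B\}$;
(iv) the ultrainverse image $R'_{\diamond}(B)$ of $B$ under $R$ is the subset of $X$ defined as $R'_{\diamond}(B) = \{x \mid x \in X \text{ and } \emptyset \subset B = xR\}$.
For a close examination of the relationships between the direct images and the properties of these images, we refer to De Baets and Kerre (1993a). We mention the most important ones. Consider two relations $R$ and $S$ from $X$ to $Y$, two subsets $A$ and $B$ of $X$, and a family $(A_i)_{i \in I}$ of subsets of $X$; then
1. containment:
$R_{\diamond}(A) = R_{\triangleleft}(A) \cap R_{\triangleright}(A)$,
$R_{\diamond}(A) \subseteq R_{\triangleleft}(A) \subseteq R(A)$,
$R_{\diamond}(A) \subseteq R_{\triangleright}(A) \subseteq R(A)$;
2. relationships:
$R_{\triangleleft}(A) = \mathrm{co}((\mathrm{co}\,R)(A))$, if $A \ne \emptyset$,
$R_{\triangleright}(A) = \mathrm{co}(R(\mathrm{co}\,A)) \cap \mathrm{rng}(R)$,
$R_{\diamond}(A) = \mathrm{co}((\mathrm{co}\,R)(A)) \cap \mathrm{co}(R(\mathrm{co}\,A))$, if $A \ne \emptyset$,
$R_{\triangleleft}(A) = (\mathrm{co}\,R)_{\triangleright}(\mathrm{co}\,A) \cup \mathrm{co}(\mathrm{rng}(\mathrm{co}\,R))$, if $A \ne \emptyset$,
$R_{\triangleright}(A) = (\mathrm{co}\,R)_{\triangleleft}(\mathrm{co}\,A) \cap \mathrm{rng}(R)$, if $A \ne X$,
$R_{\diamond}(A) = (\mathrm{co}\,R)_{\diamond}(\mathrm{co}\,A)$, if $\emptyset \ne A \ne X$;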
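3. monotonicity:
$A \subseteq B \Rightarrow R(A) \subseteq R(B)$,
$\emptyset \subset A \subseteq B \Rightarrow R_{\triangleleft}(B) \subseteq R_{\triangleleft}(A)$,
$A \subseteq B \Rightarrow R_{\triangleright}(A) \subseteq R_{\triangleright}(B)$,
$R \subseteq S \Rightarrow R(A) \subseteq S(A)$,
$R \subseteq S \Rightarrow R_{\triangleleft}(A) \subseteq S_{\triangleleft}(A)$,
$(\mathrm{rng}(R) = \mathrm{rng}(S) \text{ and } R \subseteq S) \Rightarrow S_{\triangleright}(A) \subseteq R_{\triangleright}(A)$;
4. interaction with union:
5. interaction with intersection: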
3. Compositions of Relations
Consider a relation $R$ from $X$ to $Y$ and a relation $S$ from $Y$ to $Z$. The classical definition of the composition of the relations $R$ and $S$ is given as follows:
$R \circ S = \{(x, z) \mid (x, z) \in X \times Z \text{ and } (\exists y \in Y)(xRy \text{ and } ySz)\}$,
and it can be rewritten in terms of after- and foresets as
$R \circ S = \{(x, z) \mid (x, z) \in X \times Z \text{ and } xR \cap Sz \ne \emptyset\}$.
The composition $R \circ S$ is a relation from $X$ to $Z$, consisting of those couples $(x, z)$ for which there exists at least one element of $Y$ that is in relation $R'$ with $x$ and that is in relation $S$ with $z$. The relation $R \circ S$ is read as $R$ before $S$ or $R$ followed by $S$. Bandler and Kohout have introduced the following new compositions:
(i) the Bandler-Kohout subcomposition $R \triangleleft_{bk} S$ of $R$ and $S$ is the relation from $X$ to $Z$ defined as
$R \triangleleft_{bk} S = \{(x, z) \mid (x, z) \in X \times Z \text{ and } xR \subseteq Sz\}$;
(ii) the Bandler-Kohout supercomposition $R \triangleright_{bk} S$ of $R$ and $S$ is the relation from $X$ to $Z$ defined as
$R \triangleright_{bk} S = \{(x, z) \mid (x, z) \in X \times Z \text{ and } Sz \subseteq xR\}$;
(iii) the Bandler-Kohout ultracomposition $R \diamond_{bk} S$ of $R$ and $S$ is the relation from $X$ to $Z$ defined as
$R \diamond_{bk} S = \{(x, z) \mid (x, z) \in X \times Z \text{ and } xR = Sz\}$.
These compositions are called subproduct, superproduct, and square product by Bandler and Kohout. The sub- and superproduct are also called triangular products. The classical composition is sometimes called round composition. It is clear that our definitions of the sub- and superdirect images of a set under a relation have been inspired by these triangular compositions. It is surprising that the definitions of Bandler and Kohout do not mention any non-emptiness condition. This is a regrettable shortcoming. One easily verifies that
$\mathrm{co}(\mathrm{dom}(R)) \times Z \subseteq R \triangleleft_{bk} S$,
$X \times \mathrm{co}(\mathrm{rng}(S)) \subseteq R \triangleright_{bk} S$.
The first expression means that $x$ is in relation $R \triangleleft_{bk} S$ with all elements of $Z$, even if there is no element of $Y$ that is in relation $R'$ with $x$. A similar remark holds for the second expression. In this way, the compositions $R \triangleleft_{bk} S$, $R \triangleright_{bk} S$, and $R \diamond_{bk} S$ can contain a lot of unwanted couples. It is clear that only those couples can be accepted for which both components are involved in the relations. We have suggested (De Baets and Kerre, 1993a) the following improved definitions:
(i) the subcomposition $R \triangleleft S$ of $R$ and $S$ is the relation from $X$ to $Z$ defined as
$R \triangleleft S = \{(x, z) \mid (x, z) \in X \times Z \text{ and } \emptyset \subset xR \subseteq Sz\}$;
(ii) the supercomposition $R \triangleright S$ of $R$ and $S$ is the relation from $X$ to $Z$ defined as
$R \triangleright S = \{(x, z) \mid (x, z) \in X \times Z \text{ and } \emptyset \subset Sz \subseteq xR\}$;
(iii) the ultracomposition $R \diamond S$ of $R$ and $S$ is the relation from $X$ to $Z$ defined as
$R \diamond S = \{(x, z) \mid (x, z) \in X \times Z \text{ and } \emptyset \subset xR = Sz\}$.
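The difference between the Bandler-Kohout compositions and the improved ones shows up as soon as one afterset is empty. The following sketch, with hypothetical patients, symptoms, illnesses, and function names, computes each composition from the defining condition on the afterset xR and the foreset Sz.

```python
def afterset(R, x):
    """xR: all y with (x, y) in R."""
    return {y for (a, y) in R if a == x}

def foreset(S, z):
    """Sz: all y with (y, z) in S."""
    return {y for (y, c) in S if c == z}

# Each condition receives the afterset xR and the foreset Sz.
CONDITIONS = {
    "round": lambda xR, Sz: bool(xR & Sz),
    "sub_bk": lambda xR, Sz: xR <= Sz,
    "super_bk": lambda xR, Sz: Sz <= xR,
    "sub": lambda xR, Sz: bool(xR) and xR <= Sz,
    "super": lambda xR, Sz: bool(Sz) and Sz <= xR,
    "ultra": lambda xR, Sz: bool(xR) and xR == Sz,
}

def compose(kind, R, S, X, Z):
    """Collect all couples (x, z) satisfying the chosen condition."""
    holds = CONDITIONS[kind]
    return {(x, z) for x in X for z in Z if holds(afterset(R, x), foreset(S, z))}

# Hypothetical data: patient p3 shows no symptom at all.
X = {"p1", "p2", "p3"}
Z = {"i1", "i2"}
R = {("p1", "s1"), ("p1", "s2"), ("p2", "s1")}
S = {("s1", "i1"), ("s2", "i1"), ("s3", "i1"), ("s1", "i2")}

unwanted = compose("sub_bk", R, S, X, Z) - compose("sub", R, S, X, Z)
```

Here `unwanted` collects exactly the couples contributed by the symptom-free patient p3, which the improved subcomposition discards.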
Example II.2. The compositions of two relations can also be illustrated by an example from medical diagnosis. Consider a set of patients $X$, a set of symptoms $Y$, and a set of illnesses $Z$. Let $R$ be the relation from $X$ to $Y$ defined by
$pRs \Leftrightarrow$ patient $p$ shows symptom $s$
and $S$ the relation from $Y$ to $Z$ defined by
$sSi \Leftrightarrow$ $s$ is a symptom of illness $i$.
The compositions of $R$ and $S$ are given by
$p(R \circ S)i \Leftrightarrow$ patient $p$ shows at least one symptom of illness $i$;
$p(R \triangleleft S)i \Leftrightarrow$ all symptoms shown by patient $p$ are symptoms of illness $i$ (and patient $p$ shows at least one symptom);
$p(R \triangleright S)i \Leftrightarrow$ patient $p$ shows all symptoms of illness $i$;
$p(R \diamond S)i \Leftrightarrow$ the symptoms shown by patient $p$ are exactly those of illness $i$.

Example II.3. A second example is taken from the field of information retrieval. Consider a set of users $X$, a set of terms $Y$, and a set of documents $Z$. Let $R$ be the relation from $X$ to $Y$ defined by
$uRt \Leftrightarrow$ user $u$ is interested in term $t$
and $S$ the relation from $Y$ to $Z$ defined by
$tSd \Leftrightarrow$ term $t$ is treated in document $d$.
The compositions of $R$ and $S$ are given by
$u(R \circ S)d \Leftrightarrow$ user $u$ is interested in at least one of the terms treated in document $d$;
$u(R \triangleleft S)d \Leftrightarrow$ all the terms user $u$ is interested in are treated in document $d$;
$u(R \triangleright S)d \Leftrightarrow$ user $u$ is interested in all the terms treated in document $d$;
$u(R \diamond S)d \Leftrightarrow$ the terms treated in document $d$ are exactly those user $u$ is interested in.
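For a close examination of the relationships between the compositions and the properties of the compositions, we refer again to De Baets and Kerre (1993a). We mention the most important ones. Consider three relations $R$, $R_1$, and $R_2$ from $X$ to $Y$, a relation $S$ from $Y$ to $Z$, and a family $(R_i)_{i \in I}$ of relations from $X$ to $Y$; then
1. containment:
$R \diamond S = (R \triangleleft S) \cap (R \triangleright S)$,
$R \diamond S \subseteq R \triangleleft S \subseteq R \circ S$,
$R \diamond S \subseteq R \triangleright S \subseteq R \circ S$;
2. relationships:
$R \triangleleft S = \mathrm{co}(R \circ (\mathrm{co}\,S)) \cap (\mathrm{dom}(R) \times Z)$,
$R \triangleright S = \mathrm{co}((\mathrm{co}\,R) \circ S) \cap (X \times \mathrm{rng}(S))$,
$R \diamond S = \mathrm{co}(R \circ (\mathrm{co}\,S)) \cap \mathrm{co}((\mathrm{co}\,R) \circ S) \cap (\mathrm{dom}(R) \times Z)$
$\phantom{R \diamond S} = \mathrm{co}(R \circ (\mathrm{co}\,S)) \cap \mathrm{co}((\mathrm{co}\,R) \circ S) \cap (X \times \mathrm{rng}(S))$,
$R \triangleleft S = ((\mathrm{co}\,R) \triangleright (\mathrm{co}\,S) \cap (\mathrm{dom}(R) \times Z)) \cup (\mathrm{dom}(R) \times \mathrm{co}(\mathrm{rng}(\mathrm{co}\,S)))$,
$R \triangleright S = ((\mathrm{co}\,R) \triangleleft (\mathrm{co}\,S) \cap (X \times \mathrm{rng}(S))) \cup (\mathrm{co}(\mathrm{dom}(\mathrm{co}\,R)) \times \mathrm{rng}(S))$,
$R \diamond S = ((\mathrm{co}\,R) \diamond (\mathrm{co}\,S) \cap (\mathrm{dom}(R) \times Z)) \cup (\mathrm{co}(\mathrm{dom}(\mathrm{co}\,R)) \times \mathrm{co}(\mathrm{rng}(\mathrm{co}\,S)))$.
Note that for the Bandler-Kohout compositions the following relationships hold:
$R \triangleleft_{bk} S = (\mathrm{co}\,R) \triangleright_{bk} (\mathrm{co}\,S)$,
$R \diamond_{bk} S = (\mathrm{co}\,R) \diamond_{bk} (\mathrm{co}\,S)$;
3. convertibility:
$(R \circ S)' = S' \circ R'$,
$(R \triangleleft S)' = S' \triangleright R'$,
$(R \triangleright S)' = S' \triangleleft R'$,
$(R \diamond S)' = S' \diamond R'$;
4. monotonicity:
$R_1 \subseteq R_2 \Rightarrow R_1 \circ S \subseteq R_2 \circ S$,
$(\mathrm{dom}(R_1) = \mathrm{dom}(R_2) \text{ and } R_1 \subseteq R_2) \Rightarrow R_2 \triangleleft S \subseteq R_1 \triangleleft S$,
$R_1 \subseteq R_2 \Rightarrow R_1 \triangleright S \subseteq R_2 \triangleright S$;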
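5. interaction with union:
6. interaction with intersection:
7. associativity:
$R \circ (S \circ T) = (R \circ S) \circ T$,
$R \circ (S \triangleright T) \subseteq (R \circ S) \triangleright T$,
$R \triangleleft (S \circ T) \supseteq (R \triangleleft S) \circ T$,
$R \triangleleft (S \triangleleft T) \subseteq (R \circ S) \triangleleft T$,
$R \triangleleft (S \triangleright T) = (R \triangleleft S) \triangleright T$,
$R \triangleright (S \circ T) \supseteq (R \triangleright S) \triangleright T$.
Note that for the Bandler-Kohout compositions the following associativity properties hold:
$R \triangleleft_{bk} (S \triangleleft_{bk} T) = (R \circ S) \triangleleft_{bk} T$,
$R \triangleleft_{bk} (S \triangleright_{bk} T) = (R \triangleleft_{bk} S) \triangleright_{bk} T$,
$R \triangleright_{bk} (S \circ T) = (R \triangleright_{bk} S) \triangleright_{bk} T$.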
Two sets of equivalent expressions for the improved triangular compositions are particularly interesting, since they lead to two alternative ways of fuzzifying these relational compositions. The first set expresses that the improved compositions can be seen as the intersection of the Bandler-Kohout compositions and the Cartesian product of the domain of the first relation and the range of the second relation:
$R \triangleleft S = (R \triangleleft_{bk} S) \cap (\mathrm{dom}(R) \times \mathrm{rng}(S))$,
$R \triangleright S = (R \triangleright_{bk} S) \cap (\mathrm{dom}(R) \times \mathrm{rng}(S))$.
The second set expresses that they can be seen as the intersection of the Bandler-Kohout compositions and the classical composition:
$R \triangleleft S = (R \triangleleft_{bk} S) \cap (R \circ S)$,
$R \triangleright S = (R \triangleright_{bk} S) \cap (R \circ S)$.
Although Bandler and Kohout have never introduced sub- and superdirect images, we also include the definitions as they would have conceived them, in the same spirit as their triangular compositions. Consider a relation $R$ from $X$ to $Y$ and a subset $A$ of $X$; then
(i) the Bandler-Kohout subdirect image $R_{\triangleleft bk}(A)$ of $A$ under $R$ is the subset of $Y$ defined as
$R_{\triangleleft bk}(A) = \{y \mid y \in Y \text{ and } A \subseteq Ry\}$;
(ii) the Bandler-Kohout superdirect image $R_{\triangleright bk}(A)$ of $A$ under $R$ is the subset of $Y$ defined as
$R_{\triangleright bk}(A) = \{y \mid y \in Y \text{ and } Ry \subseteq A\}$.
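The two sets of equivalent expressions for the improved triangular compositions can be confirmed by brute force on very small universes. The sketch below is an exhaustive check under stated assumptions: the universes X = Y = Z = {0, 1} and all helper names are hypothetical, and the check covers only these universes, so it illustrates the identities rather than proving them.

```python
from itertools import chain, combinations, product

X = Y = Z = (0, 1)

def all_relations(U, V):
    """Every subset of U x V, i.e., every crisp relation from U to V."""
    pairs = list(product(U, V))
    subsets = chain.from_iterable(combinations(pairs, k) for k in range(len(pairs) + 1))
    return [set(s) for s in subsets]

def afterset(R, x):
    return {y for (a, y) in R if a == x}

def foreset(S, z):
    return {y for (y, c) in S if c == z}

def check_representations():
    """Compare the improved compositions with both representations."""
    checked = 0
    for R in all_relations(X, Y):
        for S in all_relations(Y, Z):
            rows = [(x, z, afterset(R, x), foreset(S, z)) for x in X for z in Z]
            sub_bk = {(x, z) for x, z, xR, Sz in rows if xR <= Sz}
            sup_bk = {(x, z) for x, z, xR, Sz in rows if Sz <= xR}
            sub = {(x, z) for x, z, xR, Sz in rows if xR and xR <= Sz}
            sup = {(x, z) for x, z, xR, Sz in rows if Sz and Sz <= xR}
            dom_x_rng = {(x, z) for x, z, xR, Sz in rows if xR and Sz}
            round_ = {(x, z) for x, z, xR, Sz in rows if xR & Sz}
            # First representation (dom x rng) and second one (round composition).
            assert sub == sub_bk & dom_x_rng == sub_bk & round_
            assert sup == sup_bk & dom_x_rng == sup_bk & round_
            checked += 1
    return checked
```

Calling `check_representations()` walks through all 16 x 16 pairs of relations and raises an `AssertionError` if either representation ever disagrees with the defining condition.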
4. Characteristic Mappings

It is well known that a relation $R$ from $X$ to $Y$ can be identified with its characteristic mapping, in the following way:
$R : X \times Y \to \{0, 1\}$, $(x, y) \mapsto 1$ if $xRy$, $(x, y) \mapsto 0$ else.
The characteristic mapping of the round composition is given by
$R \circ S(x, z) = \sup_{y \in Y} R(x, y) \wedge_B S(y, z)$,
where $\wedge_B$ stands for the Boolean conjunction. Bandler and Kohout (1980b) have shown that the characteristic mappings of their compositions can be found in the following way:
$R \triangleleft_{bk} S(x, z) = \inf_{y \in Y} R(x, y) \Rightarrow_B S(y, z)$,
$R \triangleright_{bk} S(x, z) = \inf_{y \in Y} R(x, y) \Leftarrow_B S(y, z)$,
$R \diamond_{bk} S(x, z) = \inf_{y \in Y} R(x, y) \Leftrightarrow_B S(y, z)$,
where $\Rightarrow_B$ and $\Leftrightarrow_B$ stand for the Boolean implication and equivalence, and $b \Leftarrow_B a$ is defined as $a \Rightarrow_B b$. Taking the Boolean conjunction of these expressions with an extra term that takes into account the non-emptiness conditions yields the characteristic mappings of the improved definitions. Similarly, the characteristic mappings of the Bandler-Kohout sub- and superdirect images are given by
$R_{\triangleleft bk}(A)(y) = \inf_{x \in X} A(x) \Rightarrow_B R(x, y)$,
$R_{\triangleright bk}(A)(y) = \inf_{x \in X} R(x, y) \Rightarrow_B A(x)$.

C. Fuzzy Relational Calculus

1. Basic Concepts
As already mentioned in Section I, a fuzzy relation $R$ from a universe $X$ to a universe $Y$ is a fuzzy set in $X \times Y$. $R(x, y)$ is called the degree of relationship between $x$ and $y$. A fuzzy relation from $X$ to $X$ is called a binary fuzzy relation in $X$. Consider a fuzzy relation $R$ from $X$ to $Y$, $x \in X$ and $y \in Y$; then
(i) the afterset $xR$ of $x$ is the fuzzy set in $Y$ defined by $xR(y) = R(x, y)$;
(ii) the foreset $Ry$ of $y$ is the fuzzy set in $X$ defined by $Ry(x) = R(x, y)$;
(iii) the domain $\mathrm{dom}(R)$ of $R$ is the fuzzy set in $X$ defined by $\mathrm{dom}(R)(x) = \mathrm{Hgt}(xR)$;
(iv) the range $\mathrm{rng}(R)$ of $R$ is the fuzzy set in $Y$ defined by $\mathrm{rng}(R)(y) = \mathrm{Hgt}(Ry)$;
(v) the converse fuzzy relation $R'$ of $R$ is the fuzzy relation from $Y$ to $X$ defined by $R'(y, x) = R(x, y)$.
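For finite universes a fuzzy relation is simply a table of degrees, and the five notions above become one-line computations. The following sketch is illustrative only: the universes, the degrees, and the function names (including the trailing underscore in `range_`, chosen to avoid shadowing the Python built-in) are hypothetical.

```python
X = ("x1", "x2")
Y = ("y1", "y2", "y3")

# Hypothetical fuzzy relation from X to Y, stored as {(x, y): degree}.
R = {("x1", "y1"): 0.2, ("x1", "y2"): 0.9, ("x1", "y3"): 0.0,
     ("x2", "y1"): 0.6, ("x2", "y2"): 0.4, ("x2", "y3"): 0.0}

def afterset(R, x):
    """xR as a fuzzy set in Y."""
    return {y: R[x, y] for y in Y}

def foreset(R, y):
    """Ry as a fuzzy set in X."""
    return {x: R[x, y] for x in X}

def height(A):
    """Hgt(A) on a finite universe."""
    return max(A.values())

def domain(R):
    """dom(R)(x) = Hgt(xR)."""
    return {x: height(afterset(R, x)) for x in X}

def range_(R):
    """rng(R)(y) = Hgt(Ry)."""
    return {y: height(foreset(R, y)) for y in Y}

def converse(R):
    """R'(y, x) = R(x, y)."""
    return {(y, x): degree for (x, y), degree in R.items()}
```

Note that `domain(R)` and `range_(R)` are themselves fuzzy sets: an element belongs to the domain only to the degree of its strongest link.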
2. Implication Operators
In Section I we have already seen that the Boolean conjunction can be extended to the unit interval by means of a triangular norm. In this section we investigate the extension of the Boolean implication to the unit interval. A $[0, 1]^2 \to [0, 1]$ mapping $\mathcal{I}$ is called an implication operator if it satisfies the boundary conditions $\mathcal{I}(0, 0) = \mathcal{I}(0, 1) = \mathcal{I}(1, 1) = 1$ and $\mathcal{I}(1, 0) = 0$. These conditions are, of course, the least we can expect from an implication operator. Other interesting potential properties of an implication operator are listed next.

An implication operator $\mathcal{I}$ is called contrapositive if and only if
$(\forall (x, y) \in [0, 1]^2)(\mathcal{I}(x, y) = \mathcal{I}(1 - y, 1 - x))$.
An implication operator $\mathcal{I}$ satisfies the exchange principle if and only if
$(\forall (x, y, z) \in [0, 1]^3)(\mathcal{I}(x, \mathcal{I}(y, z)) = \mathcal{I}(y, \mathcal{I}(x, z)))$.
An implication operator $\mathcal{I}$ is called hybrid monotonous if and only if
$(\forall x \in [0, 1])(\mathcal{I}(x, \cdot) \text{ is increasing})$,
$(\forall y \in [0, 1])(\mathcal{I}(\cdot, y) \text{ is decreasing})$.
An implication operator $\mathcal{I}$ satisfies the neutrality principle if and only if
$(\forall x \in [0, 1])(\mathcal{I}(1, x) = x)$.
To every implication operator $\mathcal{I}$ corresponds an implication operator $\mathcal{I}^*$, defined by $\mathcal{I}^*(x, y) = \mathcal{I}(1 - y, 1 - x)$. If $\mathcal{I}$ is contrapositive, then $\mathcal{I} = \mathcal{I}^*$. Consider a triangular norm $\mathcal{T}$; then the $[0, 1]^2 \to [0, 1]$ mappings $\mathcal{I}_1^{\mathcal{T}}, \ldots, \mathcal{I}_5^{\mathcal{T}}$ defined by
$\mathcal{I}_1^{\mathcal{T}}(x, y) = \mathcal{T}^*(\mathcal{T}(x, y), 1 - x)$,
$\mathcal{I}_2^{\mathcal{T}}(x, y) = \mathcal{T}^*(\mathcal{T}(1 - x, 1 - y), y)$,
$\mathcal{I}_3^{\mathcal{T}}(x, y) = \mathcal{T}^*(1 - x, y)$,
$\mathcal{I}_4^{\mathcal{T}}(x, y) = \sup\{z \mid z \in [0, 1] \text{ and } \mathcal{T}(x, z) \le y\}$,
$\mathcal{I}_5^{\mathcal{T}}(x, y) = \sup\{z \mid z \in [0, 1] \text{ and } \mathcal{T}(1 - y, z) \le 1 - x\}$
are implication operators (De Baets and Kerre, 1993d).
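We mention some of the most important properties of these implication operators:
1. contrapositivity: $(\mathcal{I}_1^{\mathcal{T}})^* = \mathcal{I}_2^{\mathcal{T}}$, $(\mathcal{I}_3^{\mathcal{T}})^* = \mathcal{I}_3^{\mathcal{T}}$, and $(\mathcal{I}_4^{\mathcal{T}})^* = \mathcal{I}_5^{\mathcal{T}}$;
2. exchange principle: if the triangular norm $\mathcal{T}$ has left-continuous partial mappings, then $\mathcal{I}_4^{\mathcal{T}}$ satisfies the exchange principle;
3. monotonicity: the implication operators $\mathcal{I}_3^{\mathcal{T}}$, $\mathcal{I}_4^{\mathcal{T}}$, and $\mathcal{I}_5^{\mathcal{T}}$ are hybrid monotonous, the partial mappings $\mathcal{I}_1^{\mathcal{T}}(x, \cdot)$ are increasing, and the partial mappings $\mathcal{I}_2^{\mathcal{T}}(\cdot, y)$ are decreasing;
4. neutrality principle: the implication operators $\mathcal{I}_1^{\mathcal{T}}$, $\mathcal{I}_2^{\mathcal{T}}$, $\mathcal{I}_3^{\mathcal{T}}$, and $\mathcal{I}_4^{\mathcal{T}}$ satisfy the neutrality principle;
5. continuity: if the triangular norm $\mathcal{T}$ is continuous, then the implication operators $\mathcal{I}_1^{\mathcal{T}}$, $\mathcal{I}_2^{\mathcal{T}}$, and $\mathcal{I}_3^{\mathcal{T}}$ are continuous;
6. boundary behavior:
$\mathcal{I}_i^{\mathcal{T}}(x, 0) = 1 - x$ if $i \in \{1, 2, 3, 5\}$,
$\mathcal{I}_i^{\mathcal{T}}(x, 1) = 1$ if $i \in \{2, 3, 4, 5\}$,
$\mathcal{I}_i^{\mathcal{T}}(0, y) = 1$ if $i \in \{1, 3, 4, 5\}$.
For a complete list of the explicit expressions of these implication operators for the most important triangular norms we refer to De Baets and Kerre (1993d). We mention the implication operators $\mathcal{I}_3^{\mathcal{T}}$ and $\mathcal{I}_4^{\mathcal{T}}$:
1. for the triangular norm $M$:
$\mathcal{I}_3^M(x, y) = \max(1 - x, y)$ and $\mathcal{I}_4^M(x, y) = 1$ if $x \le y$, $\mathcal{I}_4^M(x, y) = y$ else;
2. for the triangular norm $P$:
$\mathcal{I}_3^P(x, y) = 1 - x + xy$ and $\mathcal{I}_4^P(x, y) = 1$ if $x \le y$, $\mathcal{I}_4^P(x, y) = y/x$ else;
3. for the triangular norm $W$:
$\mathcal{I}_3^W(x, y) = \mathcal{I}_4^W(x, y) = \min(1 - x + y, 1)$.
This list includes the well-known Kleene-Dienes operator $\mathcal{I}_3^M$ and the Łukasiewicz operator $\mathcal{I}_3^W$. To conclude this section, we discuss the extension of the Boolean equivalence to the unit interval. A $[0, 1]^2 \to [0, 1]$ mapping $\mathcal{E}$ is called an equivalence operator if it satisfies the boundary conditions $\mathcal{E}(0, 0) = \mathcal{E}(1, 1) = 1$ and $\mathcal{E}(1, 0) = \mathcal{E}(0, 1) = 0$.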
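The explicit operators for the triangular norms M, P, and W are short enough to test directly. In the sketch below the function names are our own labels: "Gödel" and "Reichenbach" are names commonly used in the literature for the residual operator of M and for 1 - x + xy, and are not taken from the text above. The properties are checked only on a coarse grid of exact fractions, so the checks are illustrations, not proofs.

```python
from fractions import Fraction as F

def kleene_dienes(x, y):   # I_3 for the minimum operator M
    return max(1 - x, y)

def goedel(x, y):          # I_4 for M
    return F(1) if x <= y else y

def reichenbach(x, y):     # I_3 for the algebraic product P
    return 1 - x + x * y

def goguen(x, y):          # I_4 for P
    return F(1) if x <= y else y / x

def lukasiewicz(x, y):     # I_3 = I_4 for the bold intersection W
    return min(1 - x + y, F(1))

OPERATORS = [kleene_dienes, goedel, reichenbach, goguen, lukasiewicz]
GRID = [F(k, 4) for k in range(5)]  # 0, 1/4, 1/2, 3/4, 1

def is_implication(op):
    """Boundary conditions of an implication operator."""
    return (op(F(0), F(0)) == op(F(0), F(1)) == op(F(1), F(1)) == 1
            and op(F(1), F(0)) == 0)

def is_contrapositive(op):
    """I(x, y) = I(1 - y, 1 - x), tested on the grid only."""
    return all(op(x, y) == op(1 - y, 1 - x) for x in GRID for y in GRID)

def is_neutral(op):
    """I(1, x) = x, tested on the grid only."""
    return all(op(F(1), x) == x for x in GRID)

def equivalence(x, y, imp=lukasiewicz, t=min):
    """One possible equivalence operator: T(I(x, y), I(y, x))."""
    return t(imp(x, y), imp(y, x))
```

On this grid the three operators of type I_3 come out contrapositive while the residual operators for M and P do not, in line with the contrapositivity property listed above.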
Consider an implication operator $\mathcal{I}$ and a triangular norm $\mathcal{T}$; then the $[0, 1]^2 \to [0, 1]$ mapping $\mathcal{E}$ defined by $\mathcal{E}(x, y) = \mathcal{T}(\mathcal{I}(x, y), \mathcal{I}(y, x))$ is an equivalence operator.

3. Compositions of Fuzzy Relations
The classical composition of relations has been extended to fuzzy relations by Zadeh in his very first paper on fuzzy sets (Zadeh, 1965). Consider a fuzzy relation $R$ from $X$ to $Y$ and a fuzzy relation $S$ from $Y$ to $Z$. The sup-min composition $R \circ S$ of $R$ and $S$ is the fuzzy relation from $X$ to $Z$ defined by
$R \circ S(x, z) = \sup_{y \in Y} \min(R(x, y), S(y, z))$.
This definition can be written in terms of after- and foresets in the following way:
$R \circ S(x, z) = \mathrm{Hgt}(xR \cap Sz)$.
Other authors have introduced the sup-$\mathcal{T}$ composition by replacing the minimum operator by a general triangular norm $\mathcal{T}$:
$R \circ^{\mathcal{T}} S(x, z) = \sup_{y \in Y} \mathcal{T}(R(x, y), S(y, z))$,
which can be written as
$R \circ^{\mathcal{T}} S(x, z) = \mathrm{Hgt}(xR \cap_{\mathcal{T}} Sz)$.
Note that the degree of relationship between $x$ and $z$ in $R \circ^{\mathcal{T}} S$ is determined by the strongest of the connections between $x$ and $z$ via an element $y$ of $Y$, where the strength of such a connection is given by $\mathcal{T}(R(x, y), S(y, z))$. Since
$\mathcal{T}(R(x, y), S(y, z)) \le \min(R(x, y), S(y, z))$,
it follows that the strength of such a connection is not greater than the strength of the connection between $x$ and $y$ and the strength of the connection between $y$ and $z$. This is a mathematical interpretation of the expression "a chain is as strong as the weakest of its links." Bandler and Kohout have extended their triangular compositions to fuzzy relations by replacing the Boolean implication $\Rightarrow_B$ in the characteristic mappings by an implication operator $\mathcal{I}$:
$R \triangleleft^{\mathcal{I}}_{bk} S(x, z) = \inf_{y \in Y} \mathcal{I}(R(x, y), S(y, z))$,
$R \triangleright^{\mathcal{I}}_{bk} S(x, z) = \inf_{y \in Y} \mathcal{I}(S(y, z), R(x, y))$.
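For finite universes the suprema and infima above are maxima and minima over Y, so all of these compositions can be evaluated mechanically. The sketch below assumes hypothetical universes and degrees, represents relations as dictionaries keyed by pairs, and uses the Łukasiewicz operator as one possible choice of implication operator; none of these choices is prescribed by the text.

```python
from fractions import Fraction as F

def t_min(a, b):
    return min(a, b)

def t_prod(a, b):
    return a * b

def t_bold(a, b):
    return max(F(0), a + b - 1)

def lukasiewicz(a, b):
    return min(F(1), 1 - a + b)

def sup_t(R, S, X, Y, Z, t):
    """sup-T composition: strongest connection from x to z via some y."""
    return {(x, z): max(t(R[x, y], S[y, z]) for y in Y) for x in X for z in Z}

def sub_bk(R, S, X, Y, Z, imp):
    """Bandler-Kohout subcomposition with implication operator imp."""
    return {(x, z): min(imp(R[x, y], S[y, z]) for y in Y) for x in X for z in Z}

def super_bk(R, S, X, Y, Z, imp):
    """Bandler-Kohout supercomposition with implication operator imp."""
    return {(x, z): min(imp(S[y, z], R[x, y]) for y in Y) for x in X for z in Z}

# Hypothetical data.
X, Y, Z = ("x1",), ("y1", "y2"), ("z1",)
R = {("x1", "y1"): F(1, 2), ("x1", "y2"): F(4, 5)}
S = {("y1", "z1"): F(9, 10), ("y2", "z1"): F(3, 10)}

sup_min = sup_t(R, S, X, Y, Z, t_min)
sup_prod = sup_t(R, S, X, Y, Z, t_prod)
sup_bold = sup_t(R, S, X, Y, Z, t_bold)
```

Because every triangular norm lies below the minimum, the sup-min composition gives the largest degrees among the three variants computed here.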
There are two possible ways to extend the ultracomposition to fuzzy relations. The first one is to replace the Boolean equivalence by an equivalence operator. The second one is to define the ultracomposition as the intersection of the triangular compositions, inspired by the corresponding relationship for crisp relations. As will become clear further on, we prefer the second possibility. It has already been argued that in the crisp case these compositions show some shortcomings when the aftersets or foresets involved are empty. Additional problems arise in the fuzzy case, as is shown in the following example.
Example II.4. Consider a set of patients $X$, a set of symptoms $Y$, and a set of illnesses $Z$. Let $R$ be the fuzzy relation from $X$ to $Y$ defined by
$R(p, s)$ = the degree to which patient $p$ shows symptom $s$
and $S$ the fuzzy relation from $Y$ to $Z$ defined by
$S(s, i)$ = the degree to which $s$ is a symptom of illness $i$.
Let $Y = \{s_1, \ldots, s_5\}$ and $i_k$ be an illness with
$S i_k = \{(s_1, 1), (s_2, 0.7), (s_3, 0.4), (s_4, 0.8), (s_5, 0.4)\}$.
Let $p_i$ and $p_j$ be two patients with
$p_i R = \{(s_1, 0), (s_2, 0), (s_3, 0), (s_4, 0.1), (s_5, 0)\}$,
$p_j R = \{(s_1, 1), (s_2, 0.5), (s_3, 0.4), (s_4, 0.7), (s_5, 0.5)\}$.
Consider, for example, the Goguen implication operator $\mathcal{I}$, given by
$\mathcal{I}(x, y) = 1$ if $x \le y$, $\mathcal{I}(x, y) = y/x$ else.
Then $R \triangleleft^{\mathcal{I}}_{bk} S(p_i, i_k) = 1$ and $R \triangleleft^{\mathcal{I}}_{bk} S(p_j, i_k) = 0.8$. This means that the degree to which all symptoms shown by patient $p_i$ are symptoms of illness $i_k$ is equal to 1, although patient $p_i$ is only showing symptom $s_4$ to degree 0.1. Such surprising results stem from the fact that the Bandler-Kohout compositions do not take into account the degree of emptiness or non-emptiness of the aftersets and foresets involved. We have introduced two sets of improved definitions for these compositions based on the two alternative sets of representations of the corresponding compositions of crisp relations. But first we introduce a notation that will allow us to write the definitions in a more compact form. A $[0, 1]^2 \to [0, 1]$ mapping $\mathcal{M}$ is (pointwise) extended to an $\mathcal{F}(X)^2 \to \mathcal{F}(X)$ mapping as follows:
$\mathcal{M} : \mathcal{F}(X)^2 \to \mathcal{F}(X)$, $(A, B) \mapsto \mathcal{M}(A, B)$, $\forall (A, B) \in \mathcal{F}(X)^2$,
where $\mathcal{M}(A, B)$ is the fuzzy set in $X$ defined by $\mathcal{M}(A, B)(x) = \mathcal{M}(A(x), B(x))$. The first set of improved definitions is given by
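$R \triangleleft^{\mathcal{I}} S(x, z) = \min\left(\inf_{y \in Y} \mathcal{I}(R(x, y), S(y, z)), \sup_{y \in Y} R(x, y), \sup_{y \in Y} S(y, z)\right)$,
$R \triangleright^{\mathcal{I}} S(x, z) = \min\left(\inf_{y \in Y} \mathcal{I}(S(y, z), R(x, y)), \sup_{y \in Y} R(x, y), \sup_{y \in Y} S(y, z)\right)$,
$R \diamond^{\mathcal{I}} S(x, z) = \min(R \triangleleft^{\mathcal{I}} S(x, z), R \triangleright^{\mathcal{I}} S(x, z))$.
The second set of improved definitions is given by
$R \triangleleft^{\mathcal{T},\mathcal{I}} S(x, z) = \min\left(\inf_{y \in Y} \mathcal{I}(R(x, y), S(y, z)), \sup_{y \in Y} \mathcal{T}(R(x, y), S(y, z))\right)$,
$R \triangleright^{\mathcal{T},\mathcal{I}} S(x, z) = \min\left(\inf_{y \in Y} \mathcal{I}(S(y, z), R(x, y)), \sup_{y \in Y} \mathcal{T}(R(x, y), S(y, z))\right)$,
$R \diamond^{\mathcal{T},\mathcal{I}} S(x, z) = \min(R \triangleleft^{\mathcal{T},\mathcal{I}} S(x, z), R \triangleright^{\mathcal{T},\mathcal{I}} S(x, z))$.
Using the height and plinth operators the expressions for the triangular compositions can be written as
$R \triangleleft^{\mathcal{I}} S(x, z) = \min(\mathrm{Plt}(\mathcal{I}(xR, Sz)), \mathrm{Hgt}(xR), \mathrm{Hgt}(Sz))$,
$R \triangleright^{\mathcal{I}} S(x, z) = \min(\mathrm{Plt}(\mathcal{I}(Sz, xR)), \mathrm{Hgt}(xR), \mathrm{Hgt}(Sz))$
and
$R \triangleleft^{\mathcal{T},\mathcal{I}} S(x, z) = \min(\mathrm{Plt}(\mathcal{I}(xR, Sz)), \mathrm{Hgt}(xR \cap_{\mathcal{T}} Sz))$,
$R \triangleright^{\mathcal{T},\mathcal{I}} S(x, z) = \min(\mathrm{Plt}(\mathcal{I}(Sz, xR)), \mathrm{Hgt}(xR \cap_{\mathcal{T}} Sz))$.
It is easy to see that the second set of definitions is more restrictive than the first one, i.e., yields lower degrees of relationship, and that the first one in turn is more restrictive than the Bandler-Kohout compositions:
$R \triangleleft^{\mathcal{T},\mathcal{I}} S \subseteq R \triangleleft^{\mathcal{I}} S \subseteq R \triangleleft^{\mathcal{I}}_{bk} S$,
$R \triangleright^{\mathcal{T},\mathcal{I}} S \subseteq R \triangleright^{\mathcal{I}} S \subseteq R \triangleright^{\mathcal{I}}_{bk} S$.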
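The effect of the improved definitions can be traced on the data of Example II.4. The sketch below is an illustration under stated assumptions: the membership degrees are our reading of the example (the degree of s3 in the foreset of illness i_k is taken to be 0.4), the Goguen operator is coded as 1 if x <= y and y/x otherwise, the bold intersection is used as one possible triangular norm for the second set, and all names are hypothetical. Each function returns the degree for a single couple, given the afterset xR and the foreset Sz.

```python
from fractions import Fraction as F

def goguen(a, b):
    """Goguen implication operator (a = 0 always falls in the first branch)."""
    return F(1) if a <= b else b / a

def t_bold(a, b):
    return max(F(0), a + b - 1)

def sub_bk(xR, Sz, imp):
    """Bandler-Kohout subcomposition: Plt(I(xR, Sz))."""
    return min(imp(xR[y], Sz[y]) for y in xR)

def sub_first(xR, Sz, imp):
    """First improved definition: also bounded by Hgt(xR) and Hgt(Sz)."""
    return min(sub_bk(xR, Sz, imp), max(xR.values()), max(Sz.values()))

def sub_second(xR, Sz, imp, t):
    """Second improved definition: also bounded by Hgt(xR intersected with Sz)."""
    return min(sub_bk(xR, Sz, imp), max(t(xR[y], Sz[y]) for y in xR))

# Assumed reading of the data of Example II.4.
S_ik = {"s1": F(1), "s2": F(7, 10), "s3": F(2, 5), "s4": F(4, 5), "s5": F(2, 5)}
pi_R = {"s1": F(0), "s2": F(0), "s3": F(0), "s4": F(1, 10), "s5": F(0)}
pj_R = {"s1": F(1), "s2": F(1, 2), "s3": F(2, 5), "s4": F(7, 10), "s5": F(1, 2)}

bk_i = sub_bk(pi_R, S_ik, goguen)
first_i = sub_first(pi_R, S_ik, goguen)
second_i = sub_second(pi_R, S_ik, goguen, t_bold)
```

For patient p_i the Bandler-Kohout degree is 1, whereas the first improved definition lowers it to the height of the afterset, which is 0.1; for patient p_j, who clearly shows symptoms of i_k, both definitions agree on 0.8.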
As already indicated, an alternative way of defining the ultracomposition is by introducing an equivalence operator:
$R \diamond^{\mathcal{E}} S(x, z) = \min(\mathrm{Plt}(\mathcal{E}(xR, Sz)), \mathrm{Hgt}(xR), \mathrm{Hgt}(Sz))$,
$R \diamond^{\mathcal{T},\mathcal{E}} S(x, z) = \min(\mathrm{Plt}(\mathcal{E}(xR, Sz)), \mathrm{Hgt}(xR \cap_{\mathcal{T}} Sz))$.
If the equivalence operator $\mathcal{E}$ is defined by $\mathcal{E}(x, y) = \mathcal{T}(\mathcal{I}(x, y), \mathcal{I}(y, x))$, then the following relationships hold:
$R \diamond^{\mathcal{E}} S \subseteq R \diamond^{\mathcal{I}} S$,
$R \diamond^{\mathcal{T},\mathcal{E}} S \subseteq R \diamond^{\mathcal{T},\mathcal{I}} S$.
Bandler and Kohout consider their triangular compositions as a special case of a more general relational product. They define an abstract fuzzy relational product $R * S$, analogous to the matrix product, in the following way:
$R * S(x, z) = \bigoplus_{y \in Y} R(x, y) \odot S(y, z)$,
where, as Bandler and Kohout (1980b) write, "$\odot$ is likely to be something other than multiplication and $\oplus$ to refer to something other than summation." In the discussion of the characteristic mappings of the triangular compositions of classical relations, we have already seen that for $* \in \{\circ, \triangleleft_{bk}, \triangleright_{bk}\}$, the operator $\oplus$ is an element of the set $\{\sup, \inf\}$, and the operator $\odot$ is an element of the set $\{\wedge_B, \Rightarrow_B, \Leftarrow_B\}$. These compositions have been extended to fuzzy relations using triangular norms and implication operators. Other instances of this general relational product are the mean compositions. In the case of finite universes, Bandler and Kohout suggest another choice for the operator $\oplus$, namely the (arithmetic) mean or averaging operator. Consider a relation $R$ from $X$ to $Y$ and a relation $S$ from $Y$ to $Z$, and assume that the universe $Y$ has a finite cardinality $\#Y$. The mean compositions of $R$ and $S$ are the fuzzy relations from $X$ to $Z$ defined as follows:
(i) the mean subcomposition $R \triangleleft_m S$ of $R$ and $S$ is the fuzzy relation from $X$ to $Z$ defined by
(ii) the mean supercomposition $R \triangleright_m S$ of $R$ and $S$ is the fuzzy relation from $X$ to $Z$ defined by
(iii) the mean ultracomposition $R \diamond_m S$ of $R$ and $S$ is the fuzzy relation from $X$ to $Z$ defined by

These compositions are called mean subproduct and mean superproduct by Bandler and Kohout. Notice that in general a mean composition of crisp relations already yields a fuzzy relation. Bandler and Kohout's motivation for the introduction of these mean compositions is interesting. They argue that in some situations the infimum operator is too strict (harsh membership criterion). For instance, both of the following cases:
1. $\#xR = 10$ and $\#(xR \cap Sz) = 9$,
2. $\#xR = 10$ and $\#(xR \cap Sz) = 0$,
yield $(x, z) \notin R \triangleleft_{bk} S$ even when in the first case 90% of the elements of $xR$ belong to $Sz$. Replacing the infimum operator by the mean operator, they have chosen a less strict operator (moderate membership criterion). Indeed, one easily verifies the following properties, for crisp relations (identified with their characteristic mappings):
$R \triangleleft S \subseteq R \triangleleft_{bk} S \subseteq R \triangleleft_m S$,
$R \triangleright S \subseteq R \triangleright_{bk} S \subseteq R \triangleright_m S$.
In this way $R \triangleleft_m S(x, z)$ is interpreted as the mean degree to which $xR$ is a subset of $Sz$. We have already shown that these compositions suffer from similar problems as the triangular products, because of the absence of non-emptiness conditions (De Baets and Kerre, 1993c). We have also shown that these shortcomings are fundamental and cannot be overcome by adding extra terms, as has been done for the harsh triangular products. We have suggested the following alternative definitions:
(i) the procentual subcomposition $R \triangleleft_p S$ of $R$ and $S$ is the fuzzy relation from $X$ to $Z$ defined by

(ii) the procentual supercomposition $R \triangleright_p S$ of $R$ and $S$ is the fuzzy relation from $X$ to $Z$ defined by
Of course, these definitions are no longer instances of the general relational product. In this way $R \triangleleft_p S(x, z)$ is interpreted as the percentage of elements of $xR$ belonging to $Sz$. Bandler and Kohout have extended the mean subcomposition and supercomposition to fuzzy relations by replacing the Boolean implication $\Rightarrow_B$ by an implication operator $\mathcal{I}$. We have extended the procentual compositions to fuzzy relations in the following way. Consider a fuzzy relation $R$ from $X$ to $Y$, a fuzzy relation $S$ from $Y$ to $Z$, and a triangular norm $\mathcal{T}$; then
(i) the procentual subcomposition $R \triangleleft_p^{\mathcal{T}} S$ of $R$ and $S$ is the fuzzy relation from $X$ to $Z$ defined by

(ii) the procentual supercomposition $R \triangleright_p^{\mathcal{T}} S$ of $R$ and $S$ is the fuzzy relation from $X$ to $Z$ defined by
4. Images of a Fuzzy Set under a Fuzzy Relation
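We can be rather brief here and follow the same line of reasoning as for the compositions. Consider a fuzzy relation $R$ from $X$ to $Y$ and a fuzzy set $A$ in $X$; then
(i) the direct image $R^{\mathcal{T}}(A)$ of $A$ under $R$ is the fuzzy set in $Y$ defined by
$R^{\mathcal{T}}(A)(y) = \mathrm{Hgt}(A \cap_{\mathcal{T}} Ry)$;
(ii) the Bandler-Kohout subdirect image $R^{\mathcal{I}}_{\triangleleft bk}(A)$ of $A$ under $R$ is the fuzzy set in $Y$ defined by
$R^{\mathcal{I}}_{\triangleleft bk}(A)(y) = \mathrm{Plt}(\mathcal{I}(A, Ry))$;
(iii) the Bandler-Kohout superdirect image $R^{\mathcal{I}}_{\triangleright bk}(A)$ of $A$ under $R$ is the fuzzy set in $Y$ defined by
$R^{\mathcal{I}}_{\triangleright bk}(A)(y) = \mathrm{Plt}(\mathcal{I}(Ry, A))$.
We suggest the following two sets of improved definitions:
$R^{\mathcal{I}}_{\triangleleft}(A)(y) = \min(\mathrm{Plt}(\mathcal{I}(A, Ry)), \mathrm{Hgt}(A), \mathrm{Hgt}(Ry))$,
$R^{\mathcal{I}}_{\triangleright}(A)(y) = \min(\mathrm{Plt}(\mathcal{I}(Ry, A)), \mathrm{Hgt}(A), \mathrm{Hgt}(Ry))$,
and
$R^{\mathcal{T},\mathcal{I}}_{\triangleleft}(A)(y) = \min(\mathrm{Plt}(\mathcal{I}(A, Ry)), \mathrm{Hgt}(A \cap_{\mathcal{T}} Ry))$,
$R^{\mathcal{T},\mathcal{I}}_{\triangleright}(A)(y) = \min(\mathrm{Plt}(\mathcal{I}(Ry, A)), \mathrm{Hgt}(A \cap_{\mathcal{T}} Ry))$.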
Example II.5. Consider a set of patients X and a set of symptoms Y. Let R be the fuzzy relation from X to Y defined by R(p, s) = the degree to which patient p shows symptom s. Let O be the fuzzy set of old patients in the population X; then the images of O under R (for either one of the improved definitions) are given by
$R^{\mathcal{T}}(O)$ is the fuzzy set of symptoms shown by at least one old patient;
$R^{\triangleleft}(O)$ is the fuzzy set of symptoms shown by all old patients;
$R^{\triangleright}(O)$ is the fuzzy set of symptoms shown by at least one old patient and not by any non-old patient.
5. Properties
We end this section with an overview of the most important properties of the compositions of fuzzy relations. Whenever a property is valid for the $\triangleleft_k$ subcomposition as well as the $\triangleleft_{kk}$ subcomposition, we will simply write $\triangleleft$, and similarly for the supercomposition. Consider three fuzzy relations $R$, $R_1$, and $R_2$ from X to Y, a fuzzy relation S from Y to Z, and a finite family $(R_i)_{i=1}^{n}$ of fuzzy relations from X to Y; then
'
,
1. containment:
R a?"
E R o",
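2. convertibility:
$$(R \circ^{\mathcal{T}} S)' = S' \circ^{\mathcal{T}} R', \quad (R \triangleleft S)' = S' \triangleright R', \quad (R \triangleright S)' = S' \triangleleft R', \quad (R \diamond S)' = S' \diamond R';$$
3. monotonicity, for a hybrid monotonous implication operator:
$$R_1 \subseteq R_2 \Rightarrow R_1 \circ^{\mathcal{T}} S \subseteq R_2 \circ^{\mathcal{T}} S,$$
$$(\mathrm{dom}(R_1) = \mathrm{dom}(R_2) \text{ and } R_1 \subseteq R_2) \Rightarrow R_2 \triangleleft S \subseteq R_1 \triangleleft S,$$
$$R_1 \subseteq R_2 \Rightarrow R_1 \triangleright S \subseteq R_2 \triangleright S;$$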
4. interaction with union, for a hybrid monotonous implication operator:
$$\left(\bigcup_{i=1}^{n} R_i\right) \circ^{\mathcal{T}} S = \bigcup_{i=1}^{n} (R_i \circ^{\mathcal{T}} S),$$
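5. interaction with intersection, for a hybrid monotonous implication operator:
$$\left(\bigcap_{i=1}^{n} R_i\right) \circ^{\mathcal{T}} S \subseteq \bigcap_{i=1}^{n} (R_i \circ^{\mathcal{T}} S),$$
$$\left(\bigcap_{i=1}^{n} R_i\right) \triangleright S = \bigcap_{i=1}^{n} (R_i \triangleright S).$$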
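Some of the properties listed above can be checked numerically on small matrices. The following is only a sketch under stated assumptions: it uses the plain Bandler-Kohout definitions $\inf_y \mathcal{I}(R(x,y), S(y,z))$ and $\inf_y \mathcal{I}(S(y,z), R(x,y))$ rather than the improved versions, the minimum as triangular norm, the Łukasiewicz implication, and hypothetical relations `R1`, `R2`, and `S`; all function names are illustrative.

```python
def t_min(x, y):
    return min(x, y)

def impl_lukasiewicz(x, y):
    return min(1.0, 1.0 - x + y)

def converse(R):
    # Converse relation: transpose of the matrix.
    return [list(col) for col in zip(*R)]

def union(R1, R2):
    return [[max(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(R1, R2)]

def included(R1, R2):
    return all(a <= b for r1, r2 in zip(R1, R2) for a, b in zip(r1, r2))

def sup_t(R, S, t=t_min):
    # sup-T composition on finite universes.
    return [[max(t(R[i][k], S[k][j]) for k in range(len(S)))
             for j in range(len(S[0]))] for i in range(len(R))]

def sub_bk(R, S, impl=impl_lukasiewicz):
    # Bandler-Kohout subcomposition: inf_y I(R(x, y), S(y, z)).
    return [[min(impl(R[i][k], S[k][j]) for k in range(len(S)))
             for j in range(len(S[0]))] for i in range(len(R))]

def super_bk(R, S, impl=impl_lukasiewicz):
    # Bandler-Kohout supercomposition: inf_y I(S(y, z), R(x, y)).
    return [[min(impl(S[k][j], R[i][k]) for k in range(len(S)))
             for j in range(len(S[0]))] for i in range(len(R))]

# Hypothetical data, with R1 contained in R2.
R1 = [[0.2, 0.5], [1.0, 0.0]]
R2 = [[0.4, 0.5], [1.0, 0.25]]
S = [[0.5, 1.0, 0.0], [0.75, 0.25, 0.5]]

convertible = (converse(sup_t(R1, S)) == sup_t(converse(S), converse(R1))
               and converse(sub_bk(R1, S)) == super_bk(converse(S), converse(R1))
               and converse(super_bk(R1, S)) == sub_bk(converse(S), converse(R1)))
union_distributes = sup_t(union(R1, R2), S) == union(sup_t(R1, S), sup_t(R2, S))
sub_is_antitone = included(sub_bk(R2, S), sub_bk(R1, S))
```

Such a check does not prove anything, of course; it only guards against misreading the direction of an inclusion.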
If the triangular norm and implication operator involved are also continuous, then the foregoing interactions with union and intersection remain valid for an arbitrary family of fuzzy relations. The associativity study is a lot more complicated and depends heavily upon the properties of the implication operator involved. We mention the associativity properties of the Bandler-Kohout compositions. Let us first recall the following properties. If the triangular norm $\mathcal{T}$ has left-continuous partial mappings, then for $\mathcal{I} = \mathcal{I}_{\mathcal{T}}$ the following properties hold:
$$(\forall (x, y) \in [0, 1]^2)(\mathcal{T}(x, \mathcal{I}(x, y)) \leq y),$$
$$(\forall (x, y, z) \in [0, 1]^3)(\mathcal{I}(x, \mathcal{I}(y, z)) = \mathcal{I}(\mathcal{T}(x, y), z)),$$
$$(\forall (x, y, z) \in [0, 1]^3)(\mathcal{I}(x, \mathcal{I}(y, z)) = \mathcal{I}(y, \mathcal{I}(x, z))),$$
$$(\forall (x, y, z) \in [0, 1]^3)(\mathcal{T}(x, \mathcal{I}(y, z)) \leq \mathcal{I}(y, \mathcal{T}(x, z))).$$
Notice that $\mathcal{I}_{\mathcal{T}}$ always has right-continuous second partial mappings, regardless of the continuity of $\mathcal{T}$. The following associativity properties hold, for $\mathcal{I} = \mathcal{I}_{\mathcal{T}}$:
1. if the universes are finite or if $\mathcal{T}$ has left-continuous partial mappings:
$$R \circ^{\mathcal{T}} (S \circ^{\mathcal{T}} T) = (R \circ^{\mathcal{T}} S) \circ^{\mathcal{T}} T;$$
2. if the universes are finite and $\mathcal{T}$ has left-continuous partial mappings or if $\mathcal{T}$ is continuous and $\mathcal{I}$ has left-continuous second partial mappings:
$$R \circ^{\mathcal{T}} (S \triangleright_{bk} T) \subseteq (R \circ^{\mathcal{T}} S) \triangleright_{bk} T,$$
$$R \triangleleft_{bk} (S \circ^{\mathcal{T}} T) \supseteq (R \triangleleft_{bk} S) \circ^{\mathcal{T}} T;$$
3. if $\mathcal{I}$ and $\mathcal{T}$ have left-continuous first partial mappings:
$$R \triangleleft_{bk} (S \triangleleft_{bk} T) = (R \circ^{\mathcal{T}} S) \triangleleft_{bk} T,$$
$$R \triangleright_{bk} (S \circ^{\mathcal{T}} T) = (R \triangleright_{bk} S) \triangleright_{bk} T;$$
4. if $\mathcal{T}$ has left-continuous partial mappings:
To conclude this section, we spend a few lines on the cuttability study of the compositions of fuzzy relations. Consider a fuzzy relation R from X to Y and a fuzzy relation S from Y to Z. It is well known that the following equality holds, for all $\alpha$ in [0, 1], provided that Y is finite:
$$(R \circ S)_{\alpha} = R_{\alpha} \circ S_{\alpha}.$$
This equality means that the $\alpha$-cut of a sup-min composition coincides with the classical composition of the corresponding $\alpha$-cuts of the fuzzy relations. An equality like the foregoing is of extreme practical importance: It is possible to determine an $\alpha$-cut of a sup-min composition without determining the complete sup-min composition. For the strict $\alpha$-cuts the following equality holds, regardless of the finiteness of Y:
$$(R \circ S)_{\overline{\alpha}} = R_{\overline{\alpha}} \circ S_{\overline{\alpha}}.$$
In general the following inclusion holds, provided that Y is finite:
$$(R \circ^{\mathcal{T}} S)_{\alpha} \subseteq R_{\alpha} \circ S_{\alpha}.$$
The converse inclusion only holds when $\mathcal{T}$ possesses the following property:
$$(\forall (x, y) \in [0, 1]^2)((x \geq \alpha \text{ and } y \geq \alpha) \Rightarrow \mathcal{T}(x, y) \geq \alpha).$$
The only triangular norm satisfying this property for all $\alpha$ in [0, 1] is the minimum operator M. The cuttability of the Bandler-Kohout compositions and their improved versions is far more complex and can be found in De Baets and Kerre (1994). We mention, for instance, the cuttability properties of the Bandler-Kohout compositions. If the implication operator $\mathcal{I}$ satisfies
$$(\forall (x, y) \in [\alpha, 1] \times [0, \alpha[)(\mathcal{I}(x, y) < \alpha),$$
then the following inclusions hold:
$$(R \triangleleft_{bk} S)_{\alpha} \subseteq R_{\alpha} \triangleleft_{bk} S_{\alpha} \quad \text{and} \quad (R \triangleright_{bk} S)_{\alpha} \subseteq R_{\alpha} \triangleright_{bk} S_{\alpha}.$$
The converse inclusions hold when
$$(\forall (x, y) \in [0, 1]^2 \setminus ([\alpha, 1] \times [0, \alpha[))(\mathcal{I}(x, y) \geq \alpha).$$
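The cuttability of the sup-min composition, and its failure for another triangular norm, can be illustrated with a short computation. This is a minimal sketch: the relations `R` and `S` are hypothetical, exact fractions are used to avoid rounding effects at the cut levels, and only the levels 0.1, 0.2, ..., 1 are inspected.

```python
from fractions import Fraction as F

def t_min(x, y):
    return min(x, y)

def t_w(x, y):
    # Triangular norm W: max(x + y - 1, 0).
    return max(x + y - 1, F(0))

def compose(R, S, t):
    # sup-T composition on finite universes.
    return [[max(t(R[i][k], S[k][j]) for k in range(len(S)))
             for j in range(len(S[0]))] for i in range(len(R))]

def alpha_cut(R, alpha):
    return [[1 if v >= alpha else 0 for v in row] for row in R]

# Hypothetical fuzzy relations.
R = [[F(8, 10), F(3, 10)], [F(5, 10), F(9, 10)]]
S = [[F(6, 10), F(7, 10)], [F(4, 10), F(5, 10)]]
ALPHAS = [F(i, 10) for i in range(1, 11)]

def cut_commutes(t):
    # Does the alpha-cut of the composition equal the crisp composition of the cuts?
    return all(alpha_cut(compose(R, S, t), a)
               == compose(alpha_cut(R, a), alpha_cut(S, a), min)
               for a in ALPHAS)

def cut_included(t):
    # Is the alpha-cut of the composition contained in the composition of the cuts?
    return all(u <= v
               for a in ALPHAS
               for ru, rv in zip(alpha_cut(compose(R, S, t), a),
                                 compose(alpha_cut(R, a), alpha_cut(S, a), min))
               for u, v in zip(ru, rv))
```

For these data the equality holds at every inspected level for the minimum, while for W only the inclusion survives.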
We have shown that for any implication operator $\mathcal{I}$ there exists at most one $\alpha > 0$ for which both properties hold, and hence for which the Bandler-Kohout compositions are perfectly cuttable.
6. Matrix Representation
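When dealing with finite universes, as is often the case, relations and fuzzy relations can be represented by means of a matrix. A fuzzy relation R from $X = \{x_1, \ldots, x_l\}$ to $Y = \{y_1, \ldots, y_m\}$ can be represented by means of an l by m matrix, as follows:
$$R = \begin{pmatrix} R_{11} & \cdots & R_{1m} \\ \vdots & & \vdots \\ R_{l1} & \cdots & R_{lm} \end{pmatrix},$$
where $R_{ij}$ stands for $R(x_i, y_j)$. A fuzzy set A in X can be represented by means of a row vector with l entries:
$$A = (A_1 \ \cdots \ A_l),$$
where $A_i$ stands for $A(x_i)$. The direct image of A under R can be written as follows:
$$R^{\mathcal{T}}(A) = (A_1 \ \cdots \ A_l) \begin{pmatrix} R_{11} & \cdots & R_{1m} \\ \vdots & & \vdots \\ R_{l1} & \cdots & R_{lm} \end{pmatrix},$$
where the matrix product is calculated using the triangular norm $\mathcal{T}$ as multiplication and the maximum operator as addition. Now consider a fuzzy relation S from Y to $Z = \{z_1, \ldots, z_n\}$. The max-$\mathcal{T}$ composition of R and S can be written as follows:
$$R \circ^{\mathcal{T}} S = \begin{pmatrix} R_{11} & \cdots & R_{1m} \\ \vdots & & \vdots \\ R_{l1} & \cdots & R_{lm} \end{pmatrix} \begin{pmatrix} S_{11} & \cdots & S_{1n} \\ \vdots & & \vdots \\ S_{m1} & \cdots & S_{mn} \end{pmatrix}.$$
The max-$\mathcal{T}$ composition is similar to the well-known matrix product, again by using the triangular norm $\mathcal{T}$ as multiplication and the maximum operator as addition.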
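The matrix view translates directly into code. The following sketch implements the max-$\mathcal{T}$ product for lists of lists and obtains the direct image as a row vector times a matrix; the function names and the numerical data are hypothetical and serve only as an illustration.

```python
def max_t_product(A, B, t=min):
    # Matrix product with t as multiplication and max as addition.
    return [[max(t(A[i][k], B[k][j]) for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def direct_image(A, R, t=min):
    # Direct image of the fuzzy set A (row vector) under the relation R.
    return max_t_product([A], R, t)[0]

def t_w(x, y):
    # Triangular norm W: max(x + y - 1, 0).
    return max(x + y - 1.0, 0.0)

# Hypothetical data: A in X (2 elements), R from X to Y (3 elements),
# S from Y to Z (2 elements).
A = [1.0, 0.5]
R = [[0.25, 1.0, 0.0],
     [1.0, 0.5, 0.75]]
S = [[0.5, 1.0],
     [1.0, 0.0],
     [0.25, 0.75]]

image_of_A = direct_image(A, R)        # [0.5, 1.0, 0.5]
R_max_min_S = max_t_product(R, S)      # [[1.0, 0.25], [0.5, 1.0]]
R_max_w_S = max_t_product(R, S, t_w)
```

Passing the triangular norm as a parameter keeps the same routine usable for the minimum, W, or any other choice.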
Example II.6. Consider the triangular norm W and the Łukasiewicz implication operator $\mathcal{I}(x, y) = \min(1, 1 - x + y)$. The different fuzzy relational compositions are illustrated on the following fuzzy relations: R
=
(0.2 0.5
0.4 0.4)
0.3 0.9 0.6
0 1
and
1 0.8
0 0.2 0.7 0.6 0.3
(0.5 S = 0.6
)
.
The compositions of R and S are given by (De Baets and Kerre, 1993b)
$$R \circ^{W} S = \begin{pmatrix} 0.1 & 0.2 & 0 \\ 0 & 0.5 & 0 \\ 0.7 & 0.9 & 0.3 \end{pmatrix},$$
R QES =
0.4 0.4 0.3 0.5 0.5 0.3 (0.6 0.6 0.1)’
R d T 9 S = ( 0. 01 0.2 0.5
0), 0
RbtS =
R D T g S = ( 0.1 0 0.2 0.4
0.6 0.6 0.1
1
0).
0.7 0.8 0.3
0.7 0.2
0.8
0.6 0.6 0.1
0.4 0.2 0.3 R 0:s = 0.3 0.4 0.3 ( 0 . 6 0.6 0.1)’
0.4 0.2 0.3 0.3 0.4 0.3 , (0.7 0.8 0.3)
1
RO’gS =
(>:
1
0.8
*)l!O
D. Further References
By no means do we want to pretend that this overview of images and compositions is complete. We have only selected the material needed in the following sections. However, we will point the reader to some other interesting sources.
1. Quantified Images
We have already stressed the quantification aspect in the different images of a set under a relation, with the existential quantifier in the direct image and the universal quantifier in the subdirect image. This insight leads to a lot of new possibilities. Notice that
$$R(A) = \{y \mid y \in Y \text{ and } \#(A \cap Ry) \geq 1\}.$$
Suppose that X is finite and has cardinality n. We can consider more general quantifiers, such as "at least i," and define $R_i(A)$, $i \in \{0, 1, \ldots, n\}$, as follows:
$$R_i(A) = \{y \mid y \in Y \text{ and } \#(A \cap Ry) \geq i\},$$
i.e., $R_i(A)$ contains those elements of Y that are in relation $R'$ with at least i elements of A. These generalized direct images constitute a decreasing sequence in $\mathcal{P}(Y)$:
$$R_0(A) \supseteq R_1(A) \supseteq R_2(A) \supseteq \cdots \supseteq R_n(A),$$
with
$$R_0(A) = Y, \quad R_1(A) = R(A), \quad R_{\#A}(A) = R^{\triangleleft}(A),$$
and
$$(\forall i \in \{(\#A) + 1, \ldots, n\})(R_i(A) = \emptyset).$$
Let $R_{n+1}(A) = \emptyset$; then $R_i(A) \setminus R_{i+1}(A)$, $i \in \{0, 1, \ldots, n\}$, is the set of elements of Y that are in relation $R'$ with exactly i elements of A. Dubois and Prade (1992) have also explored this idea for the images of a fuzzy set under a fuzzy relation by introducing fuzzy quantifiers such as "about 5," "a few," and "a lot."
2. Other Compositions
Dual to the sup-$\mathcal{T}$ composition of fuzzy relations, Pedrycz (1989) discusses the inf-$\mathcal{S}$ composition, with $\mathcal{S}$ a triangular conorm. For finite universes, Pedrycz (1993) has introduced the $\mathcal{S}$-$\mathcal{T}$ composition. Since triangular norms and conorms are associative, they can be extended to an arbitrary finite number of arguments in a unique way. The $\mathcal{S}$-$\mathcal{T}$ composition $R \circ^{\mathcal{S},\mathcal{T}} S$ of R and S is then defined by
$$R \circ^{\mathcal{S},\mathcal{T}} S(x, z) = \mathop{\mathcal{S}}_{y \in Y} \mathcal{T}(R(x, y), S(y, z)).$$
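Because a triangular conorm is associative, its extension to finitely many arguments is a simple fold. The sketch below illustrates the $\mathcal{S}$-$\mathcal{T}$ composition in this way; the choice of the probabilistic sum and the product as operators, the function names, and the data are assumptions made for the illustration only.

```python
from fractions import Fraction as F
from functools import reduce

def s_t_composition(R, S, conorm, tnorm):
    # Fold the conorm over the values T(R(x, y), S(y, z)), y in Y (Y finite).
    return [[reduce(conorm, (tnorm(R[i][k], S[k][j]) for k in range(len(S))))
             for j in range(len(S[0]))] for i in range(len(R))]

def prob_sum(x, y):
    # Probabilistic sum, a triangular conorm.
    return x + y - x * y

def product(x, y):
    # Algebraic product, a triangular norm.
    return x * y

# Hypothetical relations, stored as exact fractions.
R = [[F(1, 2), F(1, 2)]]
S = [[F(1, 2)], [F(1, 2)]]

with_prob_sum = s_t_composition(R, S, prob_sum, product)  # [[7/16]]
with_max = s_t_composition(R, S, max, min)                # [[1/2]], the max-min composition
```

With the maximum as conorm the construction reduces to the max-$\mathcal{T}$ composition, which is a convenient sanity check.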
Still other forms of compositions have been introduced by Di Nola et al. (1988). They have replaced the triangular norm 3 in the sup-3 composition by what they call an equality operator, which is in fact an equivalence operator, defined by
w,Y ) = W
m X ,Y),
4).
Similarly, they have replaced the triangular conorm S in the inf-S composition by what they call a difference operator.
3. Semantics
For the influence of the choice of the implication operator $\mathcal{I}$ on the interpretation and meaning of the Bandler-Kohout triangular compositions, we refer to Bandler and Kohout (1980a,b).
III. SPECIAL TYPES OF FUZZY RELATIONS
A. Potential Properties of Fuzzy Relations
1. Properties of Binary Relations
Before we discuss the potential properties of binary fuzzy relations, we recall the underlying properties of binary crisp relations. A binary relation R in a universe X is called (Bandler and Kohout, 1988)
(i) covering if and only if $(\forall x \in X)(\exists y \in X)(xRy)$;
(ii) locally reflexive if and only if $(\forall x \in X)((\exists y \in X)(xRy \vee yRx) \Rightarrow xRx)$;
(iii) reflexive if and only if it is covering and locally reflexive, i.e., if $(\forall x \in X)(xRx)$;
(iv) symmetric if and only if $(\forall (x, y) \in X^2)(xRy \Rightarrow yRx)$;
(v) antisymmetric if and only if $(\forall (x, y) \in X^2)((x \neq y \wedge xRy) \Rightarrow \neg\, yRx)$;
(vi) strictly antisymmetric if and only if $(\forall (x, y) \in X^2)(xRy \Rightarrow \neg\, yRx)$;
(vii) transitive if and only if $(\forall (x, y, z) \in X^3)((xRy \wedge yRz) \Rightarrow xRz)$.
These simple properties can be combined into more complex types of relations, as follows. A binary relation R in a universe X is called (Bandler and Kohout, 1988)
(i) a local tolerance relation if and only if it is locally reflexive and symmetric;
(ii) a tolerance relation if and only if it is a local tolerance relation and it is covering;
(iii) a local preorder relation if and only if it is locally reflexive and transitive;
(iv) a preorder relation if and only if it is a local preorder relation and it is covering;
(v) a local equivalence relation if and only if it is locally reflexive, symmetric, and transitive;
(vi) an equivalence relation if and only if it is a local equivalence relation and it is covering;
(vii) a local order relation if and only if it is locally reflexive, antisymmetric, and transitive;
(viii) an order relation if and only if it is a local order relation and it is covering;
(ix) a strict order relation if and only if it is strictly antisymmetric and transitive.
The reason for using local properties is that very often one or more elements of the universe do not participate in the relation under consideration, so that the relation fails to show some properties which it obviously has on its effective domain.
2. Properties of Binary Fuzzy Relations
Most of the following definitions of properties of binary fuzzy relations are due to Zadeh. A binary fuzzy relation R in a universe X is called (Zadeh, 1971; Bandler and Kohout, 1988)
(i) covering if and only if
$$(\forall x \in X)(\exists y \in X)(R(x, y) = 1);$$
(ii) locally reflexive if and only if
$$(\forall x \in X)\left(R(x, x) = \sup_{y \in X} \max(R(x, y), R(y, x))\right);$$
(iii) reflexive if and only if it is covering and locally reflexive, i.e., if
$$(\forall x \in X)(R(x, x) = 1);$$
(iv) symmetric if and only if
$$(\forall (x, y) \in X^2)(R(x, y) = R(y, x));$$
(v) antisymmetric if and only if
$$(\forall (x, y) \in X^2)(x \neq y \Rightarrow \min(R(x, y), R(y, x)) = 0);$$
(vi) strictly antisymmetric if and only if
$$(\forall (x, y) \in X^2)(\min(R(x, y), R(y, x)) = 0);$$
(vii) transitive if and only if
$$(\forall (x, y, z) \in X^3)(\min(R(x, y), R(y, z)) \leq R(x, z)).$$
These simple properties are combined into more complex types of fuzzy relations, in the same way as for crisp relations.
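On a finite universe the simple properties of a binary fuzzy relation can be tested entry by entry. The predicates below are a minimal sketch of the definitions listed above; the function names and the example relation `R` are hypothetical. The example is chosen so that one element does not take part in the relation, which shows why the local properties are useful.

```python
def is_covering(R):
    return all(any(v == 1 for v in row) for row in R)

def is_locally_reflexive(R):
    n = len(R)
    return all(R[x][x] == max(max(R[x][y], R[y][x]) for y in range(n))
               for x in range(n))

def is_reflexive(R):
    return all(R[x][x] == 1 for x in range(len(R)))

def is_symmetric(R):
    n = len(R)
    return all(R[x][y] == R[y][x] for x in range(n) for y in range(n))

def is_antisymmetric(R):
    n = len(R)
    return all(min(R[x][y], R[y][x]) == 0
               for x in range(n) for y in range(n) if x != y)

def is_transitive(R):
    # sup-min transitivity on a finite universe.
    n = len(R)
    return all(min(R[x][y], R[y][z]) <= R[x][z]
               for x in range(n) for y in range(n) for z in range(n))

# Hypothetical relation: the third element is related to nothing.
R = [[1, 0.5, 0],
     [0.5, 1, 0],
     [0, 0, 0]]

is_local_equivalence = (is_locally_reflexive(R) and is_symmetric(R)
                        and is_transitive(R))
is_equivalence = is_local_equivalence and is_covering(R)
```

Here `R` is a local equivalence relation but not an equivalence relation, because it is not covering.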
The justification of these definitions follows from the following theorem (Bandler and Kohout, 1988).
Proposition III.1. Consider a binary fuzzy relation R in X; then for each of the simple and compound properties P defined earlier, it holds that R possesses P in the fuzzy sense if and only if every $\alpha$-cut ($\alpha \in\ ]0, 1]$) of R possesses P in the crisp sense.
The transitivity of a binary fuzzy relation can also be written as
$$(\forall (x, z) \in X^2)\left(\sup_{y \in X} \min(R(x, y), R(y, z)) \leq R(x, z)\right)$$
and is therefore usually referred to as sup-min transitivity. Other forms of transitivity have been introduced by replacing minimum by another triangular norm. In this context the triangular norm W is rather popular. A binary fuzzy relation is called sup-W transitive if and only if, for all $(x, y, z) \in X^3$,
$$\max(R(x, y) + R(y, z) - 1, 0) \leq R(x, z).$$
3. Closures and Interiors
A crisp or fuzzy relation may of course fail to possess one or more desired properties. The following question then becomes important: Can we modify (enlarge or reduce) the relation so that it now possesses these properties? Consider a fuzzy relation R and a property P; then the P-closure of R is defined as the smallest fuzzy relation containing R and possessing P; the P-interior of R is defined as the greatest fuzzy relation contained in R and possessing P. Of course, P-closures and P-interiors do not necessarily exist for all fuzzy relations. The following proposition easily follows.
Proposition III.2. Consider a property P and a binary fuzzy relation R in X.
(i) The fuzzy relation R possesses P if and only if it is equal to its P-closure.
(ii) The fuzzy relation R possesses P if and only if it is equal to its P-interior.
Proposition III.3 (Bandler and Kohout, 1988). Consider a property P.
(i) A P-closure exists for all binary fuzzy relations in X if and only if the universal relation $X^2$ possesses P and the intersection of every nonempty family of binary fuzzy relations, each of which possesses P, also possesses P.
(ii) A P-interior exists for all binary fuzzy relations in X if and only if the empty relation $\emptyset$ possesses P and the union of every nonempty family of binary fuzzy relations, each of which possesses P, also possesses P.
Bandler and Kohout have investigated which properties satisfy the conditions of the foregoing proposition and have shown how to compute the corresponding closures and interiors. We mention some of these results:
(i) the locally reflexive closure of a binary fuzzy relation R is given by $R \cup E_R$ with $E_R$ defined by
$$E_R(x, x) = \sup_{y \in X} \max(R(x, y), R(y, x))$$
and $E_R(x, y) = 0$ if $x \neq y$;
(ii) the locally reflexive interior of a fuzzy relation R is given by
$$R \cap \mathrm{rowsol}\, R \cap \mathrm{colsol}\, R,$$
with rowsol R the row-solipsism of R defined by rowsol R(x, y) = R(x, x), and colsol R the column-solipsism of R defined by colsol R(x, y) = R(y, y);
(iii) the symmetric closure of a binary fuzzy relation R is given by $R \cup R'$;
(iv) the symmetric interior of a binary fuzzy relation R is given by $R \cap R'$;
(v) the transitive closure of a binary fuzzy relation R is given by
$$R \cup R^2 \cup R^3 \cup \cdots = \bigcup_{k \in \mathbb{N}_0} R^k,$$
where $R^2$ stands for $R \circ R$ and $R^{n+1} = R^n \circ R$.
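For a finite universe the closures mentioned above can be computed directly. The following sketch forms the symmetric closure as a union with the converse and obtains the sup-min transitive closure by accumulating $R \cup R^2 \cup R^3 \cup \cdots$ until nothing changes; the function names and the example relation are hypothetical. The iteration stops because only finitely many membership degrees can occur.

```python
def compose(R, S):
    # sup-min composition on a finite universe.
    return [[max(min(R[i][k], S[k][j]) for k in range(len(S)))
             for j in range(len(S[0]))] for i in range(len(R))]

def union(R, S):
    return [[max(a, b) for a, b in zip(r, s)] for r, s in zip(R, S)]

def converse(R):
    return [list(col) for col in zip(*R)]

def symmetric_closure(R):
    return union(R, converse(R))

def transitive_closure(R):
    # After k passes, closure equals R u R^2 u ... u R^(k+1).
    closure = R
    while True:
        enlarged = union(closure, compose(closure, R))
        if enlarged == closure:
            return closure
        closure = enlarged

# Hypothetical reflexive relation that is not transitive.
R = [[1, 0.8, 0],
     [0, 1, 0.6],
     [0, 0, 1]]

R_transitive = transitive_closure(R)   # adds the degree 0.6 at position (0, 2)
R_symmetric = symmetric_closure(R)
```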
B. Similarity Relations
A binary fuzzy relation that is an equivalence relation, i.e., that is reflexive, symmetric, and sup-min transitive, is usually called a similarity relation.
Example III.1. 1. The first example is a similarity relation R in a finite universe $X = \{a, b, c, d, e\}$ represented by means of the following matrix:
$$R = \begin{pmatrix} 1 & 0.3 & 0.5 & 0.3 & 0.5 \\ 0.3 & 1 & 0.3 & 0.7 & 0.3 \\ 0.5 & 0.3 & 1 & 0.3 & 0.7 \\ 0.3 & 0.7 & 0.3 & 1 & 0.3 \\ 0.5 & 0.3 & 0.7 & 0.3 & 1 \end{pmatrix}.$$
2. The second example is borrowed from Kerre (1993) and is defined on the continuum $X = [0, +\infty[$ in the following way: $R : [0, +\infty[^2 \to [0, 1]$,
(XSY)
ifx=y else
[::m=(x.y),
Proposition III.4 (Properties of Similarity Relations) (Bandler and Kohout, 1988; Kerre, 1993). Consider a binary fuzzy relation R in X.
1. If R is a similarity relation then, for all $(x, y, z) \in X^3$, at least two of the degrees R(x, y), R(y, z), and R(x, z) are equal.
2. R is a similarity relation if and only if $(\forall \alpha \in\ ]0, 1])(R_{\alpha}$ is an equivalence relation).
3. R is a similarity relation if and only if co R is a [0, 1]-valued pseudo ultrametric on X, i.e., co R satisfies, for all $(x, y, z) \in X^3$:
(M1) Non-negativity: $\mathrm{co}\, R(x, y) \geq 0$;
(M2) Pseudo-separation: $x = y \Rightarrow \mathrm{co}\, R(x, y) = 0$;
(M3) Symmetry: $\mathrm{co}\, R(x, y) = \mathrm{co}\, R(y, x)$;
(M4) Strong triangle inequality: $\mathrm{co}\, R(x, z) \leq \max(\mathrm{co}\, R(x, y), \mathrm{co}\, R(y, z))$.
An important concept associated with a similarity relation is its partition tree, the fuzzy analogue of the quotient set of a crisp equivalence relation. As mentioned before, the $\alpha$-cuts $R_{\alpha}$ of a similarity relation R are equivalence relations. To each of these equivalence relations corresponds a partition $\pi_{\alpha}$ of the universe X: the quotient set of the equivalence relation. These partitions become finer with increasing $\alpha$. The fact that these partitions are nested can be visualized by means of a partition tree. We illustrate this construction procedure on the first fuzzy relation from Example III.1. It suffices to consider those $\alpha$-cuts for which $\alpha$ is effectively used as a degree of relationship. These $\alpha$-cuts are given by
$$R_{1} = \{(a, a), (b, b), (c, c), (d, d), (e, e)\},$$
$$R_{0.7} = R_{1} \cup \{(b, d), (d, b), (c, e), (e, c)\},$$
$$R_{0.5} = R_{0.7} \cup \{(a, c), (c, a), (a, e), (e, a)\},$$
$$R_{0.3} = X^2.$$
The corresponding partitions are given by
$$\pi_{1} = \{\{a\}, \{b\}, \{c\}, \{d\}, \{e\}\}, \quad \pi_{0.7} = \{\{a\}, \{b, d\}, \{c, e\}\},$$
$$\pi_{0.5} = \{\{a, c, e\}, \{b, d\}\}, \quad \pi_{0.3} = \{\{a, b, c, d, e\}\}.$$
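The levels of a partition tree can be computed mechanically from the matrix of a similarity relation. The sketch below does this for the first relation of Example III.1; it assumes that the third row of the matrix, which is not fully legible here, is the one forced by symmetry, and the function names are illustrative.

```python
X = ["a", "b", "c", "d", "e"]

# Matrix of Example III.1 (third row filled in by symmetry: an assumption).
R = [[1, 0.3, 0.5, 0.3, 0.5],
     [0.3, 1, 0.3, 0.7, 0.3],
     [0.5, 0.3, 1, 0.3, 0.7],
     [0.3, 0.7, 0.3, 1, 0.3],
     [0.5, 0.3, 0.7, 0.3, 1]]

def alpha_levels(R):
    # Only degrees that actually occur in R give distinct alpha-cuts.
    return sorted({v for row in R for v in row})

def partition(R, alpha):
    # Equivalence classes of the alpha-cut: the aftersets of the crisp relation.
    n = len(R)
    classes = {frozenset(X[j] for j in range(n) if R[i][j] >= alpha)
               for i in range(n)}
    return sorted("".join(sorted(c)) for c in classes)

partition_tree = {alpha: partition(R, alpha) for alpha in alpha_levels(R)}
```

Reading the dictionary from the lowest to the highest level shows the single class `abcde` splitting into `ace` and `bd`, then into `a`, `bd`, and `ce`, and finally into singletons.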
C. Likeness Relations
A likeness relation is a binary fuzzy relation that is reflexive, symmetric, and sup-W transitive. Since sup-min transitivity implies sup-W transitivity, it follows that every similarity relation is a likeness relation.
Example III.2. This example is again borrowed from Kerre (1993). Consider the single-valued attribute BUILD of a person, with the following possible linguistic descriptions: thin, slim, middling, sturdy, and corpulent. It is clear that these terms overlap to a certain extent. In order to take this into account we introduce a likeness relation on the domain of this attribute:
$$R = \begin{pmatrix} 1 & 0.9 & 0.5 & 0.3 & 0 \\ 0.9 & 1 & 0.6 & 0.4 & 0.1 \\ 0.5 & 0.6 & 1 & 0.8 & 0.4 \\ 0.3 & 0.4 & 0.8 & 1 & 0.6 \\ 0 & 0.1 & 0.4 & 0.6 & 1 \end{pmatrix}.$$
One easily verifies that this fuzzy relation is a likeness relation and not a similarity relation. Some $\alpha$-cuts of this fuzzy relation are given next (with t = thin, s = slim, m = middling, st = sturdy, and c = corpulent):
$$R_{1.0} = \{(t, t), (s, s), (m, m), (st, st), (c, c)\},$$
$$R_{0.9} = R_{1.0} \cup \{(t, s), (s, t)\},$$
$$R_{0.8} = R_{0.9} \cup \{(m, st), (st, m)\},$$
$$R_{0.6} = R_{0.8} \cup \{(s, m), (m, s), (st, c), (c, st)\}.$$
The relations $R_{1.0}$, $R_{0.9}$, and $R_{0.8}$ are equivalence relations. The relation $R_{0.6}$, however, is not an equivalence relation, as can be seen from the following example: $(c, st) \in R_{0.6}$ and $(st, m) \in R_{0.6}$ but $(c, m) \notin R_{0.6}$.
Proposition III.5 (Kerre, 1993). A binary fuzzy relation R in X is a likeness relation if and only if co R is a [0, 1]-valued pseudo metric on X, i.e., co R satisfies, for all $(x, y, z) \in X^3$:
(M1) Non-negativity: $\mathrm{co}\, R(x, y) \geq 0$;
(M2) Pseudo-separation: $x = y \Rightarrow \mathrm{co}\, R(x, y) = 0$;
(M3) Symmetry: $\mathrm{co}\, R(x, y) = \mathrm{co}\, R(y, x)$;
(M4) Triangle inequality: $\mathrm{co}\, R(x, z) \leq \mathrm{co}\, R(x, y) + \mathrm{co}\, R(y, z)$.
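The claims of Example III.2 and the link with pseudo metrics can be verified by brute force. The sketch below stores the BUILD relation in tenths, as integers, so that the equalities occurring in the sup-W condition are tested exactly; this scaling, the function names, and the reading of the matrix from the example are assumptions of the sketch.

```python
TERMS = ["thin", "slim", "middling", "sturdy", "corpulent"]
ONE = 10  # membership degrees are stored in tenths

R = [[10, 9, 5, 3, 0],
     [9, 10, 6, 4, 1],
     [5, 6, 10, 8, 4],
     [3, 4, 8, 10, 6],
     [0, 1, 4, 6, 10]]

def triples(n):
    return ((x, y, z) for x in range(n) for y in range(n) for z in range(n))

def is_sup_min_transitive(R):
    return all(min(R[x][y], R[y][z]) <= R[x][z] for x, y, z in triples(len(R)))

def is_sup_w_transitive(R):
    return all(max(R[x][y] + R[y][z] - ONE, 0) <= R[x][z]
               for x, y, z in triples(len(R)))

def complement(R):
    # co R(x, y) = 1 - R(x, y), here in tenths.
    return [[ONE - v for v in row] for row in R]

def satisfies_triangle_inequality(D):
    return all(D[x][z] <= D[x][y] + D[y][z] for x, y, z in triples(len(D)))

def satisfies_strong_triangle_inequality(D):
    return all(D[x][z] <= max(D[x][y], D[y][z]) for x, y, z in triples(len(D)))

co_R = complement(R)
is_likeness = is_sup_w_transitive(R)        # reflexivity and symmetry are evident
is_similarity = is_sup_min_transitive(R)
```

The ordinary triangle inequality holds for `co_R` while the strong one fails, in line with the relation being a likeness relation but not a similarity relation.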
IV. APPLICATIONS OF TRIANGULAR COMPOSITIONS
A. Introduction
Dependent on the context or situation in which they are used, fuzzy relations can be interpreted in two different ways: a disjunctive and a conjunctive interpretation (Dubois and Prade, 1992). These two interpretations can be made clear by viewing a fuzzy relation as a collection of aftersets.
In the disjunctive interpretation, the aftersets are considered as fuzzy sets of more or less possible values of a single-valued variable of which the value is not known precisely. Consider, for instance, a fuzzy relation R describing the relationship between a single-valued variable U in a universe X and a single-valued variable V in a universe Y. When the variable U takes the value x, then the afterset xR is considered as the possibility distribution of the possible values of the variable V for this value of U. The generalized modus ponens and generalized modus tollens, the forward- and backward-chaining inference mechanisms used in fuzzy rule-based knowledge systems, correspond to this disjunctive interpretation. They are extensions of the corresponding classical inference schemes to the case of fuzzy premises. These inference mechanisms are discussed briefly in the next section. The direct image of a fuzzy set under a fuzzy relation and the composition of fuzzy relations play an important role in this interpretation.
In the conjunctive interpretation the aftersets are considered as fuzzy sets of values of a multi-valued variable. Consider, for instance, the fuzzy relation K from a set of software engineers E to a set of programming languages L with K(e, l) the degree to which engineer e is used to programming in the language l. The afterset eK then represents the programming ability of engineer e. In the conjunctive interpretation one often deals with diagnostic problems, where fuzzy relations are used to model the relationships between causes (faults, illnesses) and effects (symptoms, etc.). The potential applications of the triangular compositions are legion in this interpretation. Applications can, for instance, be found in medical diagnosis (Bandler and Kohout, 1980b, 1986) and in information retrieval systems (Kohout et al., 1983, 1984). We also mention the CLINAID knowledge-based system architecture developed upon this fuzzy relational calculus (Kohout et al., 1991). In this section, we briefly discuss an example from medical diagnosis and an example of a thesaurus construction for a small information-retrieval system.
B. An Example from Medical Diagnosis
1. Description of the Experiment
In a medical diagnostic problem the following three finite sets play an important role: the set of patients P; the set of possible symptoms S; the set of possible illnesses I. The fundamental relation is the diagnostic relation D from the set of symptoms to the set of illnesses. This relation is inherently fuzzy. A second important relation is the observation relation O from the set of patients to the set of symptoms. This relation is also inherently fuzzy. In general, the observation relation depends on the observer and on the time of observation. In practice a k-point scale (e.g., {1, ..., 7}, {-3, ..., 0, ..., 3}, or {- - -, - -, -, 0, +, + +, + + +}) is often used to evaluate the observations; in this case a simple linear transformation can be applied to rescale the observations to the unit interval [0, 1]. The diagnostic relation D is the fuzzy relation from S to I defined by
D(s, i) = the degree to which s is a symptom of illness i.
The observation relation O is the fuzzy relation from P to S defined by
O(p, s) = the degree to which patient p shows symptom s.
The fuzzy relational compositions of O and D are the fuzzy relations $O \circ D$, $O \triangleleft D$, $O \triangleright D$, and $O \diamond D$ from P to I with the following interpretation:
$O \circ D(p, i)$ = the degree to which patient p shows at least one symptom of illness i,
$O \triangleleft D(p, i)$ = the (mean) degree to which the symptoms of patient p are symptoms of illness i,
$O \triangleright D(p, i)$ = the (mean) degree to which the symptoms of illness i are shown by patient p,
$O \diamond D(p, i)$ = the (mean) degree to which the symptoms of illness i coincide with those shown by patient p.
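To make these four interpretations concrete, the following sketch evaluates them on a toy problem with two patients, three symptoms, and two illnesses. Everything in it is an assumption made for illustration: the data are invented, the compositions are the plain Bandler-Kohout ones ($\inf_s \mathcal{I}(O(p,s), D(s,i))$, $\inf_s \mathcal{I}(D(s,i), O(p,s))$, and their minimum) rather than an improved, mean, or procentual version, and the operators are the minimum and the Łukasiewicz implication.

```python
def impl(x, y):
    # Lukasiewicz implication.
    return min(1.0, 1.0 - x + y)

def circ(O, D):
    # sup-min composition: at least one shared symptom.
    return [[max(min(O[p][s], D[s][i]) for s in range(len(D)))
             for i in range(len(D[0]))] for p in range(len(O))]

def sub(O, D):
    # Symptoms of the patient are symptoms of the illness.
    return [[min(impl(O[p][s], D[s][i]) for s in range(len(D)))
             for i in range(len(D[0]))] for p in range(len(O))]

def sup(O, D):
    # Symptoms of the illness are shown by the patient.
    return [[min(impl(D[s][i], O[p][s]) for s in range(len(D)))
             for i in range(len(D[0]))] for p in range(len(O))]

def square(O, D):
    # Symptoms of patient and illness coincide.
    return [[min(a, b) for a, b in zip(ra, rb)]
            for ra, rb in zip(sub(O, D), sup(O, D))]

# Hypothetical observation relation (patients x symptoms)
# and diagnostic relation (symptoms x illnesses).
O = [[1, 0.5, 0],
     [0.5, 0, 1]]
D = [[1, 0],
     [0.5, 0.5],
     [0, 1]]

results = {"circ": circ(O, D), "sub": sub(O, D),
           "sup": sup(O, D), "square": square(O, D)}
```

For the second patient and the second illness, for example, the sup-min degree is 1 (one characteristic symptom is fully present), whereas the sub- and supercomposition both give 0.5, because the match between the two symptom profiles is only partial.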
Notice that we have not indicated the specific definitions used for these compositions. They can either be the Bandler-Kohout compositions or, preferably, one of their improved versions, or the mean or procentual compositions. Also, the specific operators used in these compositions have not been indicated. We now consider the following case study:
1. A set of six patients suffering from AIDS:
$$P = \{p_1, \ldots, p_6\}.$$
2. A set of eight symptoms observed in the patients:
$$S = \{s_1, \ldots, s_8\},$$
given as in the accompanying table.
Qualities                   Symptoms
almost normal ability       disabled
independent                 dependent
easy to cope with           difficult to cope with
accepting advice            rejecting advice
interested and exploring    apathetic and unconcerned
almost healthy              very ill
cheerful                    depressed
calm and secure             anxious and worried
The opposites of the symptoms are called qualities or constructs. These symptoms can be divided into three groups: the manageability of the patient ($s_1$, $s_2$, $s_3$); the motivation of the patient ($s_4$, $s_5$); the physical and psychological state of the patient ($s_6$, $s_7$, $s_8$).
3. A set of three (physio)therapists treating the patients:
$$N = \{n_1, n_2, n_3\}.$$
4. A set of four occasions on which the patients were observed:
$$T = \{t_1, t_2, t_3, t_4\}.$$
A great deal of psychological data, whether clinical or not, particularly of the sort derived from repertory grids, such as the list of symptoms above, needs to be analyzed in a way that points out dependencies and implications among the variables. Such dependencies are by nature non-symmetrical. Methods available until now are based on symmetrical operators: The correlation between x and y is the same as the correlation between y and x. This is where the triangular compositions come in. An analysis of the observation relations by means of the triangular compositions reveals two types of information:
1. cognitive knowledge: knowledge about the way the symptoms are attributed by the (physio)therapists;
2. interpersonal knowledge: knowledge about the relationship between the symptoms attributed to the different patients.
2. Compositions of Observations of Different Therapists on the Same Occasion
On every occasion $t_r$ we obtain three observation relations $O_1^r$, $O_2^r$, and $O_3^r$ from the (physio)therapists $n_1$, $n_2$, and $n_3$. These fuzzy relations are defined from P to S in the following way:
$O_k^r(p, s)$ = the degree to which therapist $n_k$ assigns symptom s to patient p on the occasion $t_r$.
The fuzzy relational compositions of the converse of the observation relation of one therapist and the observation relation of a second therapist all yield binary fuzzy relations in the set of symptoms, with the following interpretation:
$(O_k^r)' \triangleleft O_l^r(s_i, s_j)$ = the (mean) degree to which, on the occasion $t_r$, the assignment of symptom $s_i$ by therapist $n_k$ implies the assignment of symptom $s_j$ by therapist $n_l$.
In the same way, we can consider the compositions of the converse of the observation relation and the relation itself of a particular therapist:
$(O_k^r)' \triangleleft O_k^r(s_i, s_j)$ = the (mean) degree of subordination, on the occasion $t_r$, of symptom $s_i$ to symptom $s_j$, according to therapist $n_k$.
This composition allows us to induce a hierarchy in the repertoire of symptoms of a therapist. The fuzzy relational compositions of the observation relation of one therapist and the converse of the observation relation of a second therapist lead to binary fuzzy relations in the set of patients, with the following interpretation:
$O_k^r \triangleleft (O_l^r)'(p_i, p_j)$ = the (mean) degree to which, on the occasion $t_r$, the assignment of symptoms to patient $p_i$ by therapist $n_k$ is included in the assignment of symptoms to patient $p_j$ by therapist $n_l$.
Of particular interest is the diagonal of this relation. The diagonal elements indicate how closely therapist $n_l$ follows therapist $n_k$ in his observations. In the same way, we can consider the composition of the observation relation and its converse of a particular therapist:
$O_k^r \triangleleft (O_k^r)'(p_i, p_j)$ = the (mean) degree to which, on the occasion $t_r$, the symptoms of patient $p_i$ are included in the symptoms of patient $p_j$, according to therapist $n_k$.
3. Compositions of Observations of the Same Patient on Different Occasions
For every patient p and every (physio)therapist $n_k$ we can construct a fuzzy relation $O_k^p$ from the set of symptoms S to the set of occasions T in the following way:
$O_k^p(s_i, t_r)$ = the degree to which therapist $n_k$ assigns symptom $s_i$ to patient p on the occasion $t_r$.
The fuzzy relational compositions of these fuzzy relations and their converses yield binary fuzzy relations in the set of symptoms, with the following interpretation:
$O_k^p \triangleleft (O_k^p)'(s_i, s_j)$ = the (mean) degree to which, for patient p, symptom $s_i$ is a subsymptom of symptom $s_j$, according to therapist $n_k$.
4. Analysis and Synthesis of Fuzzy Relational Compositions: Methodology
Fuzzy relational calculus is more than just different types of fuzzy relational compositions. It also offers techniques to bring the information contained in these compositions to the surface. The analysis of a fuzzy relational composition consists of the study of its $\alpha$-cuts. We will illustrate the general procedure on a few examples. Consider the observation relations $O_1^2$ and $O_2^2$ on the occasion $t_2$, containing the first two groups of symptoms (in order to reduce the size of the matrices):
I
0.7 0.3 0.4 0.5 0.5 0.7 0.3 0.7 0.5 0.4 0.7 0.3 1
0;=
0.8 0.7 0.6
0.5 0.4 0.4
0.7 0.3 0.6 0.6 0.8 0.4
,
I
o;=
\
0.7 0.4 0.7 0.6 0.7
0.6 0.2 0.5 0.5
0.3
0.9 1 0.7 0.6 0.7 0.6 0.7 0.6 0.4 0.2 0.3 0.5 0.7 0.4 0.5 0.8 0.2 0.3 0.7 0.6
0.6 0.3 0.8 0.5
0.8
The most important compositions of these observation relations and their converses are given next. They are given for the triangular compositions with indices b, i.e., a = 4; and D = D,; and the implication operator 9
=
g:: 0.43 0.57 0.6 0.43 0.7
0.7
0.57 0.57
0.75 0.38 0.75 0.8 0.7 0.57 0.7 0.5 0.9
0;‘aO; =
I
o:ao:’=
I
\
0.5 0.7
0.25 0.38 0.67 0.33
0.7 0.43 0.38 0.8
(0.6
0.57
1
0.57 0.291 0.57 0.33 ,
0.7 0.6
0.29 0.43 0.7 0.5 0.33 0.50 0.63 0.8
0.7 0.43 0.43 0.57 0.63 0.57
0.7 1 0.7 0.7 0.8 0.7
0.43 0.6
0.7
0.3
0.43 0.43
0.7 0.6 0.43 0.43
0.57 0.7 0.38 0.5
0.43
0.67 0.8 0.57
0.7 0.57 0.57 0.7 0.7 0.7
\
0.6
0.6 0.5 0.29 0.33 0.7 0.5 0.4 0.7 0.33 0.38 0.25 0.5
0.6
0.2 1 0.29 0.7
0;a 0;‘=
0.4
\
0.7
0.5 0.8 0.38 0.8
0.6
303 0.6
0.2 0.3 0.29 0.43 0.4 0.6 0.8 0.71 0.38 0.8
I
Consider the $\alpha$-cut of $(O_1^2)' \triangleleft O_1^2$ for $\alpha = 0.7$ (the highest value of $\alpha$ such that $((O_1^2)' \triangleleft O_1^2)_{\alpha}$ is a reflexive relation):
1 0 0 0 0
This means that therapist $n_1$ assigns the symptom difficult to cope with ($s_3$) as soon as he assigns any of the symptoms different from the symptom disabled ($s_1$). When he assigns any of the symptoms concerning the motivation of the patient ($s_4$ or $s_5$) he also assigns the symptom disabled ($s_1$). Now consider the $\alpha$-cut of $O_1^2 \triangleleft (O_1^2)'$ for $\alpha = 0.7$:
$$\begin{pmatrix} 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
We can write this relation also in the following way:
$$(O_1^2 \triangleleft (O_1^2)')_{0.7} = \{(p_1, p_1), (p_1, p_2), (p_1, p_5), (p_1, p_6), (p_2, p_2), (p_3, p_2), (p_3, p_3), (p_4, p_2), (p_4, p_4), (p_4, p_6), (p_5, p_2), (p_5, p_5), (p_5, p_6), (p_6, p_2), (p_6, p_6)\},$$
which turns out to be an order relation in the set of patients and can be represented by means of the Hasse diagram of Fig. 1. From this diagram we can read that patient $p_1$ is in better condition than patients $p_2$, $p_5$, and $p_6$, but cannot be compared with patients $p_3$ and $p_4$. Patient $p_2$ is definitely in the worst condition.
FIGURE 1. Hasse diagram for $(O_1^2 \triangleleft (O_1^2)')_{0.7}$.
Consider the a-cut of 0,”a 0;‘ for a = 0.6:
0 1 0 0 0 1
This relation also represents an order relation in the set of patients and is shown in the Hasse diagram of Fig. 2. The only difference in judgment between therapists n, and n, is that therapist n, considers patient p, to be in better condition than patientp,, whereas therapist n, cannot compare them. In the foregoing considerations, both therapists n, and n, came to a consistent conclusion. Also the a-cuts of the compositions of the observaand their converses have been studied. Surprising and tion relations 0;: encouraging is the observation that the data tend to organize themselves spontaneously in understandable patterns, often in a (local) preorder
FIGURE 2. Hasse diagram for $(O_2^2 \triangleleft (O_2^2)')_{0.6}$.
relation or order relation (possibly after identification of structurally equivalent elements). When the dimension of the universes increases, small variations in the fuzzy relations can make it impossible to obtain a preorder relation. It is not in the spirit of fuzzy methods to pay too much attention to small variations in numbers that are rough estimates anyhow. The objective is to offer a technique that can reveal the structure present in the data without, of course, forcing a structure that is not present.

A fuzzy relation that is expected to represent a (local) preorder or order relation is treated in the following way. First we calculate the local preorder closure, as explained in the previous section, hoping that it does not differ too much from the original fuzzy relation. A measure for the difference between the fuzzy relation and its closure is then calculated, for instance the Hamming distance. When this measure is larger than a given small number, the investigation comes to an end. If it is smaller than that number, then the differences between the α-cuts of the given relation and its closure are also considered (for instance, for $\alpha \in \{i \times 0.1 \mid i = 1, \ldots, 10\}$). Only when these differences are small enough does the investigation of the closure continue.

A (local) preorder relation R in a universe X is then analyzed in the following way, by studying its α-cuts:

1. Consider a relevant α-cut $R_\alpha$ of R. This will be a crisp (local) preorder relation.
2. Calculate the symmetric interior S of $R_\alpha$. This will be a (local) equivalence relation.
3. Remove the null class $C_0$ defined by
$$C_0 = \{x \mid x \in X \text{ and } xS = \emptyset\},$$
consisting of the elements that do not take part in the relation S. This class will only be non-empty if S is indeed a local equivalence relation, but not an equivalence relation.
4. Determine the quotient set Q of $X \setminus C_0$ under the equivalence relation S:
$$Q = \{xS \mid x \in X \setminus C_0\} \quad \text{with} \quad xS = \{y \mid y \in X \setminus C_0 \text{ and } xSy\}.$$
Structurally equivalent elements are grouped together here.
5. The relation $R_\alpha$ induces an order relation $\leq_\alpha$ in the quotient set Q, in the following way:
$$xS \leq_\alpha yS \iff x R_\alpha y.$$
6. Draw a Hasse diagram from the order relation $\leq_\alpha$.
BERNARD DE BAETS and ETIENNE KERRE
Following this procedure, the analysis of the different α-cuts leads to a structural synthesis of the relationships present in the original fuzzy relation. One may of course wonder whether the choice of a particular α-cut, for instance for the observation relations, has a (clinical) meaning. The rule of thumb applied by Bandler and Kohout, when using the mean compositions to analyze a fuzzy relation, is to consider the α-cut at the average value of the entries of the fuzzy relation. Theoretical research concerning the choice of significant α-cuts is still going on. One may also wonder whether the analysis of the fuzzy relational compositions leads to similar results when a different implication operator or a different triangular norm is used. Studies have shown that in most cases the results obtained are rather similar and differ mainly in the α-cuts that have to be considered and analyzed. In order to understand the influence of the choice of the implication operator and the semantics of the implication operator, Bandler and Kohout (1980c) have developed the so-called checklist paradigm. In this discussion we have illustrated that it is possible, for instance, to analyze in a meaningful way the profile of a patient or a small group of patients, using techniques from the fuzzy relational calculus, even when there are not sufficient data to apply statistical techniques.
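The six-step analysis of an α-cut described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the dictionary encoding of the fuzzy relation, the function names `alpha_cut` and `analyse_alpha_cut`, and the sample data are all assumptions introduced here, and the α-cut is taken as the set of pairs with degree at least α.

```python
def alpha_cut(R, alpha):
    """Crisp alpha-cut of a fuzzy relation stored as {(x, y): degree}."""
    return {pair for pair, degree in R.items() if degree >= alpha}


def analyse_alpha_cut(X, R, alpha):
    """Return (null class, quotient set, Hasse edges) for one alpha-cut.

    The cut is assumed to be a crisp (local) preorder, as in the text.
    """
    cut = alpha_cut(R, alpha)                                   # step 1
    S = {(x, y) for (x, y) in cut if (y, x) in cut}             # step 2
    null_class = {x for x in X
                  if all((x, y) not in S for y in X)}           # step 3
    rest = [x for x in X if x not in null_class]
    classes = {frozenset(y for y in rest if (x, y) in S)
               for x in rest}                                   # step 4

    def leq(c1, c2):                                            # step 5
        # Any representatives will do, because the cut is a preorder.
        return (next(iter(c1)), next(iter(c2))) in cut

    edges = set()                                               # step 6
    for c1 in classes:
        for c2 in classes:
            if c1 == c2 or not leq(c1, c2):
                continue
            between = any(c3 != c1 and c3 != c2
                          and leq(c1, c3) and leq(c3, c2)
                          for c3 in classes)
            if not between:
                edges.add((c1, c2))   # covering pair: one Hasse edge
    return null_class, classes, edges


# Hypothetical data: a ~ b, both below c, and d takes no part at level 0.6.
X = ["a", "b", "c", "d"]
R = {("a", "a"): 1.0, ("b", "b"): 1.0, ("c", "c"): 1.0,
     ("a", "b"): 0.8, ("b", "a"): 0.9,
     ("a", "c"): 0.7, ("b", "c"): 0.7,
     ("c", "a"): 0.2, ("d", "d"): 0.3}
null_class, classes, edges = analyse_alpha_cut(X, R, 0.6)
```

Pairs missing from the dictionary are treated as having degree 0, which keeps the encoding compact for sparse relations.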
C. An Example from Automated Information Retrieval

1. Classical Information-Retrieval Systems

The purpose of an automatic information-retrieval system is to provide the user with a list of references to documents as answer to a specific request. A good system mentions all relevant documents and as few non-relevant documents as possible. Although some commercial information-retrieval systems are already widely used, several theoretical and practical problems remain unsolved. Classical retrieval systems adopt a search strategy that tries to find all documents that match the query. A query is expressed by means of keywords combined using the logical operators AND, OR, and NOT. These systems show some shortcomings stemming from the limitations imposed by classical logic:

• they cannot express the degree of relevance of a term to the contents of a document;
• individual users' profiles are out of the question;
• the system returns the same list of references to different users as answer to the same request.
Also, the conventional thesaurus construction shows shortcomings. A typical information-retrieval system is an on-line system with a feedback facility that allows the user to modify his request during a session. In order to make this feedback efficient, the user needs access to information that allows him to make his query more general or more specific. This information can be found in the thesaurus, a kind of structured dictionary in which, for every term, the terms can be found that are more general or more specific. The construction of such a thesaurus is usually done by means of a statistical formula that results in statistical associations among the terms. Since these associations are purely based on frequencies, they do not indicate the degree of relevance of the relationship between the general and specific terms of the thesaurus. The need for a fuzzy approach in the automated classification and the automated retrieval of documents is obvious. The recent developments of the fuzzy relational calculus offer the ideal theoretical framework for the development of a fuzzy approach.

2. A Fuzzy Approach

In an information-retrieval system three finite sets play an important role:

• the set of users U of the system;
• the set of documents D stored in the system;
• the set of terms T used to describe the documents.

The fundamental relation in an information-retrieval system is the relevance relation R from the set of terms to the set of documents. This relation is inherently fuzzy and is represented conceptually as a matrix. From an implementational point of view, the compact representation by means of after- and foresets is a lot more efficient. A second important relation is the users' profiles relation P from the set of users to the set of terms. This relation is again inherently fuzzy. The relevance relation R is the fuzzy relation from T to D defined by

R(t, d) = the degree of relevance of term t to document d
        = the degree to which term t is treated in document d.

The users' profiles relation P is the fuzzy relation from U to T defined by

P(u, t) = the degree to which user u is interested in term t.

The aftersets of the users' profiles relation are, of course, the users' profiles.
3. Compositions and Their Semantics

The fuzzy relational compositions of the relevance relation R and its converse R' are the binary fuzzy relations $R \circ R'$, $R \triangleleft R'$, $R \triangleright R'$, and $R \diamond R'$ in the set of terms T with the following interpretation:

$R \circ R'(t_i, t_j)$ = the degree to which both terms $t_i$ and $t_j$ are relevant to at least one document
= the degree to which at least one document treats both terms $t_i$ and $t_j$;

$R \triangleleft R'(t_i, t_j)$ = the (mean) degree to which the relevance of term $t_i$ implies the relevance of term $t_j$
= the (mean) degree to which term $t_i$ is more specific than term $t_j$
= $R \triangleright R'(t_j, t_i)$
= the (mean) degree to which term $t_j$ is more general than term $t_i$;

$R \diamond R'(t_i, t_j)$ = the (mean) degree to which the terms $t_i$ and $t_j$ are synonymous.

Provided that $t_i(R \diamond R') \subseteq t_i(R \triangleleft R')$ (this depends upon the definition and operators chosen), the fuzzy set $G_{t_i}$ defined by

$$G_{t_i}(t_j) = R \triangleleft R'(t_i, t_j) - R \diamond R'(t_i, t_j)$$

can be considered as the fuzzy set of terms that are more general than term $t_i$. Similarly, provided that $t_i(R \diamond R') \subseteq t_i(R \triangleright R')$, the fuzzy set $S_{t_i}$ defined by

$$S_{t_i}(t_j) = R \triangleright R'(t_i, t_j) - R \diamond R'(t_i, t_j)$$

can be considered as the fuzzy set of terms that are more specific than term $t_i$. A typical use of the fuzzy sets $G_t$ and $S_t$ goes as follows. After giving a relevance threshold θ, the system returns the θ-cut of $G_t$ as answer to the question General(t); and as answer to the question Specific(t), the system returns the θ-cut of the fuzzy set $S_t$. By adjusting the threshold θ, this list of terms can be reduced or enlarged. Hence, the use of the relevance threshold leads to an increased flexibility.
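The General(t) and Specific(t) queries can be illustrated with a small Python sketch. Several choices here are assumptions made only for illustration and are not taken from the text: the compositions are computed as Bandler-Kohout style mean compositions, the implication operator is the Łukasiewicz implication min(1, 1 - x + y), and the relevance values, term names, and function names are hypothetical.

```python
def lukasiewicz(x, y):
    """Lukasiewicz implication (an assumed choice of implication operator)."""
    return min(1.0, 1.0 - x + y)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def subcomposition(R, ti, tj, imp=lukasiewicz):
    """Mean degree to which relevance of ti implies relevance of tj."""
    return mean(imp(a, b) for a, b in zip(R[ti], R[tj]))


def supercomposition(R, ti, tj, imp=lukasiewicz):
    """Mean degree to which relevance of tj implies relevance of ti."""
    return subcomposition(R, tj, ti, imp)


def square_composition(R, ti, tj, imp=lukasiewicz):
    """Mean degree to which ti and tj are relevant to the same documents."""
    return mean(min(imp(a, b), imp(b, a)) for a, b in zip(R[ti], R[tj]))


def general(R, t, theta):
    """Theta-cut of G_t: terms that are more general than t."""
    return [tj for tj in R
            if subcomposition(R, t, tj) - square_composition(R, t, tj) >= theta]


def specific(R, t, theta):
    """Theta-cut of S_t: terms that are more specific than t."""
    return [tj for tj in R
            if supercomposition(R, t, tj) - square_composition(R, t, tj) >= theta]


# Hypothetical relevance relation: one row of degrees per term, one column per document.
R = {
    "politics":         [0.9, 0.8, 0.7, 0.6],
    "Belgian politics": [0.9, 0.1, 0.0, 0.2],
    "finances":         [0.1, 0.0, 0.8, 0.3],
}
more_general = general(R, "Belgian politics", 0.4)
more_specific = specific(R, "politics", 0.4)
```

With this mean-based definition the square composition never exceeds the sub- or supercomposition, so the differences defining $G_t$ and $S_t$ are non-negative. Raising the threshold shortens the returned lists, which is the flexibility mentioned above.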
The fuzzy relational compositions of the users' profiles relation P and its converse P' are the binary fuzzy relations $P \circ P'$, $P \triangleleft P'$, $P \triangleright P'$, and $P \diamond P'$ in the set of users U with the following interpretation:

$P \circ P'(u_i, u_j)$ = the degree to which users $u_i$ and $u_j$ share at least one term of interest;

$P \triangleleft P'(u_i, u_j)$ = the (mean) degree to which the profile of user $u_i$ is contained in the profile of user $u_j$
= $P \triangleright P'(u_j, u_i)$;

$P \diamond P'(u_i, u_j)$ = the (mean) degree to which the profiles of the users $u_i$ and $u_j$ coincide.

The fuzzy relational compositions of the users' profiles relation P and the relevance relation R are the fuzzy relations $P \circ R$, $P \triangleleft R$, $P \triangleright R$, and $P \diamond R$ from U to D with the following interpretation:

$P \circ R(u, d)$ = the degree to which user u is interested in at least one of the terms treated in document d;

$P \triangleleft R(u, d)$ = the (mean) degree to which the profile of user u is contained in document d;

$P \triangleright R(u, d)$ = the (mean) degree to which user u is interested in all terms treated in document d;

$P \diamond R(u, d)$ = the (mean) degree to which the profile of user u and the contents of document d coincide.

4. Matrix Representation versus Afterset and Foreset Representation

The actual computation of the fuzzy relational compositions can be done in two different ways:

• based on the matrix representation of fuzzy relations;
• based on the representation by means of aftersets and foresets.
Even a modest information-retrieval system needs gigantic matrices for the representation of the relational information. The extent of this problem becomes clear if we consider the following imaginary system: 400 volumes of scientific journals yielding approximately 3,000 documents and 5,000 terms. This means that we are working with relevance relations of size 5,000 × 3,000. The corresponding matrices are typically sparse. It is therefore advantageous to use the representation by means of after- and foresets.
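A rough idea of why the after- and foreset representation pays off for sparse relations can be given in Python. The sketch below is an assumption-laden illustration, not the implementation referred to in the text: aftersets are stored as nested dictionaries holding only non-zero degrees, the round composition uses the minimum as triangular norm, and all names and data are hypothetical.

```python
from collections import defaultdict


def foresets_from_aftersets(aftersets):
    """Invert {term: {document: degree}} into {document: {term: degree}}."""
    foresets = defaultdict(dict)
    for term, documents in aftersets.items():
        for document, degree in documents.items():
            foresets[document][term] = degree
    return foresets


def round_composition_with_converse(aftersets):
    """Sup-min composition of R with its converse, stored sparsely.

    Only pairs of terms that share at least one document are visited,
    so zero entries of the matrix are never touched.
    """
    result = defaultdict(float)
    for terms in foresets_from_aftersets(aftersets).values():
        for ti, a in terms.items():
            for tj, b in terms.items():
                result[(ti, tj)] = max(result[(ti, tj)], min(a, b))
    return dict(result)


# Hypothetical sparse relevance relation: only non-zero degrees are stored.
aftersets = {
    "t1": {"d1": 0.8, "d2": 0.3},
    "t2": {"d1": 0.6},
    "t3": {"d3": 0.5},
}
composition = round_composition_with_converse(aftersets)
```

Pairs of terms that never occur in the same document, such as ("t1", "t3") here, are simply absent from the result and are read as degree 0.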
An estimate of the required computing time of a fuzzy relational composition of the relevance relation R and its converse R' is given by (Keravnou, 1986)

• $O(td + t^2)$ for the matrix representation,
• $O(t\bar{d} + t^2)$ for the after- and foreset representation,

where t is the number of terms, d the number of documents, and $\bar{d}$ the average number of terms per document. This shows that for sparse matrices, the after- and foreset representation not only saves storage space, but also considerably increases the computing speed of the fuzzy relational compositions.

5. Example of a Thesaurus Construction
As shown before, the triangular compositions of the relevance relation and its converse can be used to answer requests such as General(t) and Specific(t). We can also attempt to construct a global thesaurus from these compositions. We will illustrate this on a small sample relation extracted from an experiment carried out by Zenner et al. (1984). As set of documents they selected 20 articles in Dutch, of varying length, from newspapers and magazines, all dealing to some extent with the following terms: politics, Belgian politics, foreign politics, European Community, finances, economics, cruise missiles, Reagan, Russia, ... . A heterogeneous group of 15 persons (consisting of professors, teachers, secretaries, housewives, and students) was asked to ascertain to which degree, in their opinion, the documents dealt with the terms considered. Then several methods were developed to come up with a common relevance relation. The following relation is extracted from this experiment. The set of terms is given by $T = \{t_1, \ldots, t_5\}$ and the set of documents by $D = \{d_1, \ldots, d_6\}$.

[Matrix of the relevance relation R; entries as extracted, row and column layout lost: 0.8 0.6 0.2 0.3 0.8 0.3 0.6 0.5 0.2 0.3 0.8 0.4 0 0.6 0.1 0.4 0 0.1 0 0 0.1 0 0.6 0.2]
We consider the procentual subcomposition of the relevance relation R and its converse R'. Bandler and Kohout usually consider the mean subcomposition since it has some kind of averaging effect. In practice, the relevance relation is very large, and the other (harsh) compositions tend to produce meaningless small numbers. The procentual compositions, alternatives for the mean compositions, also have this desired averaging effect.
As triangular norm we simply take the minimum operator M. Let Θ denote the procentual subcomposition of R and its converse R'; then

[5 × 5 matrix Θ; surviving entries as extracted: 0.9, 0.07, 0.57, 0.3, 0.89, 0.89, 0.21, 0.11]

The α-cut for α = 0.7 (the mean value of the entries of Θ) is given by

[5 × 5 crisp matrix $\Theta_{0.7}$; surviving row as extracted: 1 1 0 0 0]

After identification of the first two terms (since the corresponding rows and columns are equal) we obtain an order relation with the Hasse diagram of Fig. 3. This diagram has to be interpreted as follows: terms $t_1$ and $t_2$ are the most general terms; terms $t_3$ and $t_5$ are not related, but are both more general than term $t_4$. The α-cut for α = 1 (only perfect generalizations are considered), again an order relation, is given by

[5 × 5 crisp matrix $\Theta_1$; surviving row as extracted: 0 0 0 0 1]

and can be represented by means of the Hasse diagram of Fig. 4.

FIGURE 3. Hasse diagram for $\Theta_{0.7}$.
FIGURE 4. Hasse diagram for $\Theta_1$.

V. FUZZY INFERENCE MECHANISMS

A. Introduction

The kernel of a classical rule-based knowledge system consists of two parts:

• a knowledge base;
• an inference engine.

The knowledge base can be decomposed into a data base and a rule base. The data base contains two types of data:

• static data: data that are usually not influenced by a consultation of the system, such as names and addresses;
• dynamic data: data that can be influenced by a consultation, such as the body temperature of a patient in a medical expert system.

The expertise of a rule-based knowledge system is contained in the rule base. Every rule consists of a number of conditions and a number of conclusions. Such rules are usually called IF-THEN rules. Knowledge about the truth of the conditions allows one to infer knowledge about the truth of the conclusions. Knowledge about the falsity of the conclusions allows one to infer knowledge about the falsity of the conditions. The inference engine is the driving force behind the knowledge system and uses the knowledge contained in the knowledge base to come up with a solution to the given problem, for instance, to give an answer to a question concerning the validity of a hypothesis. There are essentially two different techniques to make use of the knowledge base: forward-chaining and backward-chaining.
When using a forward-chaining inference mechanism, the inference engine tries to recognize a given situation in the data base. To this end, the inference engine tries to find a rule in the rule base of which all conditions correspond to facts present in the data base. If this is the case, then the conclusions of the rule are added to the data base and the process continues. When using a backward-chaining inference mechanism, the inference engine tries to deduce, in a logical fashion, a presupposed hypothesis. If the hypothesis is not present in the data base, then the inference engine tries to find a rule in the rule base that allows it to draw a conclusion about this hypothesis. It then has to repeat the same procedure for the conditions of the rule found. If no suitable rules can be found, then the system lacks knowledge and has to consult the user to obtain additional information, for instance, to instruct him to do some measurements. It is obvious that both mechanisms can be used in one and the same system. A typical example is a diagnosis system: the user supplies information to the system; the system then tries to deduce new facts using a forward-chaining inference mechanism, and finally tries to prove a certain fact by applying a backward-chaining inference mechanism.
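Both chaining strategies can be sketched for a crisp rule base in Python. This is a minimal illustration under assumptions made here: facts are plain strings, a rule is a pair (conditions, conclusions) of sets, and the function names and the medical sample rules are hypothetical. The consultation of the user when knowledge is lacking is left out.

```python
def forward_chain(facts, rules):
    """Add conclusions of rules whose conditions all hold, until nothing changes."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusions in rules:
            if conditions <= facts and not conclusions <= facts:
                facts |= conclusions
                changed = True
    return facts


def backward_chain(goal, facts, rules, pending=frozenset()):
    """Try to establish a hypothesis by recursively proving rule conditions."""
    if goal in facts:
        return True
    if goal in pending:          # avoid looping on circular rules
        return False
    for conditions, conclusions in rules:
        if goal in conclusions and all(
                backward_chain(c, facts, rules, pending | {goal})
                for c in conditions):
            return True
    return False                 # here a real system would consult the user


# Hypothetical rule base.
rules = [
    ({"fever", "cough"}, {"flu suspected"}),
    ({"flu suspected"}, {"advise rest"}),
]
derived = forward_chain({"fever", "cough"}, rules)
proved = backward_chain("advise rest", {"fever", "cough"}, rules)
```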
B. A Fuzzy Knowledge Base

1. A Fuzzy Data Base

The facts stored in a data base are usually propositions concerning the value of a variable: a given attribute of a given object has a given value. Such an assertion is represented as "X = A," where X is a variable standing for the attribute of the object and A is the value of this attribute. In a classical data base these values need to be precise, for example:

• The body temperature is 37.9°C;
• Ronald earns $983 a month;
• Pete is 14 months old;
• Patricia has brown hair;
• Nick weighs 70 kg.

Such exact descriptions are a consequence of our western striving for precision. In a lot of cases this precision has undesired side-effects. If the rule base contains the following rules:

• If the body temperature is at least 38°C, then ... ;
• If the salary is at least $1,000 a month, then ... ;
then these rules cannot be applied to the data mentioned earlier. For the body temperature, for instance, it is obvious that the conclusion will also be valid, possibly with a lower degree of truth, for a body temperature of 37.9°C, but such a conclusion cannot be deduced. This is a classical argument in favor of facts described by means of fuzzy sets, in order to avoid artificial borders. In a fuzzy data base the data do not need to be precise, for example:

• The body temperature is about 38°C.
• Ronald earns about $1,000 a month.
• Pete is a baby.
• Patricia has dark hair.
• Nick weighs between 60 and 70 kg.

Of course not all data have to be imprecise. The data base can still contain precise statements or imprecise non-fuzzy statements such as the last example. We conclude that the contents of a proposition of the form “X is A” impose a restriction, in a fuzzy or non-fuzzy sense, on the possible values of a variable X in a universe U. The facts in a fuzzy data base are then of the following form: “X is A,” where A ∈ ℱ(U). In case of a non-fuzzy restriction we have “X ∈ A” with A ∈ 𝒫(U). When A is a singleton {u}, then we have the precise fact “X = u.”

Combined facts such as “X is A AND Y is B” and “X is A OR Y is NOT B” can be represented by means of fuzzy relations. Because of the limited space, such facts will not be considered here.

2. A Fuzzy Rule Base

The rules stored in a rule base are usually propositions concerning relationships between the values of different variables. A rule is considered as a piece of knowledge about the way in which the possible values of a variable Y in a universe V depend on the values of a variable X in a universe U. Rules are considered as general laws in the sense that they are always valid and not only in special cases. The rules in a fuzzy rule base are of the following form: “IF X is A THEN Y is B,” where A ∈ ℱ(U) and B ∈ ℱ(V).
This representation is not yet suitable for the development of fuzzy inference mechanisms. Notice that the contents of such a rule impose a restriction, in a fuzzy or non-fuzzy sense, on the possible values of the pair of variables (X, Y) in the Cartesian product U × V. The rules are then of the form “(X, Y) is R,” with $R \in \mathcal{F}(U \times V)$. The question that remains is how a fuzzy relation can be distilled from such a rule. For this purpose we first consider a rule with a non-fuzzy condition and a non-fuzzy conclusion: “IF X ∈ A THEN Y ∈ B.” In fact, this rule means that

$$(X, Y) \in (A \times B) \cup (\mathrm{co}\,A \times V).$$

In De Baets and Kerre (1993d) we have shown that the relation $R = (A \times B) \cup (\mathrm{co}\,A \times V)$ can also be written in the following equivalent ways:

$$R = (\mathrm{co}\,A \times \mathrm{co}\,B) \cup (U \times B),$$
$$R = (\mathrm{co}\,A \times V) \cup (U \times B),$$
$$R = \bigcup \{C \mid C \subseteq U \times V \text{ and } (A \times V) \cap C \subseteq U \times B\},$$
$$R = \bigcup \{C \mid C \subseteq U \times V \text{ and } (U \times \mathrm{co}\,B) \cap C \subseteq \mathrm{co}\,A \times V\}.$$

Mamdani and Sembi (1980) and Braae and Rutherford (1979) have suggested another way of representing this rule, namely as $R = A \times B$. This representation is unacceptable, since it represents the statement “X ∈ A AND Y ∈ B” and certainly not a conditional piece of information. In De Baets and Kerre (1993d) we have also shown that these five equivalent representations of a non-fuzzy rule lead to five possibly different representations of a fuzzy rule:

$$R_1 = (A \times_{\mathcal{T}} B) \cup_{\mathcal{T}^*} (\mathrm{co}\,A \times_{\mathcal{T}} V),$$
$$R_2 = (\mathrm{co}\,A \times_{\mathcal{T}} \mathrm{co}\,B) \cup_{\mathcal{T}^*} (U \times_{\mathcal{T}} B),$$
$$R_3 = (\mathrm{co}\,A \times_{\mathcal{T}} V) \cup_{\mathcal{T}^*} (U \times_{\mathcal{T}} B),$$
$$R_4 = \text{greatest solution in } C \text{ of } (A \times_{\mathcal{T}} V) \cap_{\mathcal{T}} C \subseteq U \times_{\mathcal{T}} B,$$
$$R_5 = \text{greatest solution in } C \text{ of } (U \times_{\mathcal{T}} \mathrm{co}\,B) \cap_{\mathcal{T}} C \subseteq \mathrm{co}\,A \times_{\mathcal{T}} V.$$
Using the implication operators introduced in Section II, each of these five relations can be written in the form

$$R_i(u, v) = \mathcal{I}(A(u), B(v)), \qquad i = 1, \ldots, 5,$$

where the implication operator $\mathcal{I}$ depends on the representation chosen. It has to be stressed that any representation

$$R(u, v) = \mathcal{I}(A(u), B(v)),$$

with $\mathcal{I}$ an implication operator, of a single-condition, single-conclusion rule is an extension of the equivalent representations of such a non-fuzzy rule. We usually restrict ourselves to the aforementioned five representations, since they can be linked to a certain interpretation.
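The claim that a relation built from an implication operator extends the non-fuzzy representation can be checked numerically. The Python sketch below uses three textbook implication operators (Kleene-Dienes, Gödel, and Łukasiewicz); these are assumed examples and are not necessarily the operators of Section II. The function names and the sample universes are likewise hypothetical.

```python
def kleene_dienes(x, y):
    return max(1.0 - x, y)


def goedel(x, y):
    return 1.0 if x <= y else y


def lukasiewicz(x, y):
    return min(1.0, 1.0 - x + y)


def rule_relation(A, B, implication):
    """Fuzzy relation R(u, v) = I(A(u), B(v)) for the rule IF X is A THEN Y is B."""
    return {(u, v): implication(a, b)
            for u, a in A.items() for v, b in B.items()}


# Crisp condition and conclusion, encoded by membership degrees 0 and 1.
A = {"u1": 1.0, "u2": 0.0}          # A = {u1}
B = {"v1": 1.0, "v2": 0.0}          # B = {v1}

# Characteristic function of (A x B) united with (coA x V).
crisp = {("u1", "v1"): 1.0, ("u1", "v2"): 0.0,
         ("u2", "v1"): 1.0, ("u2", "v2"): 1.0}

relations = [rule_relation(A, B, imp)
             for imp in (kleene_dienes, goedel, lukasiewicz)]
```

On crisp inputs all three relations coincide with the non-fuzzy representation; they only start to differ once A or B takes degrees strictly between 0 and 1.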
C. A Fuzzy Inference Engine

1. Classical Inference Mechanisms

The most important classical inference mechanisms are the modus ponens and modus tollens. The modus ponens can be represented by means of the following scheme:

$$\begin{array}{l} p \Rightarrow q \\ p \\ \hline q \end{array}$$

and the modus tollens by means of the following scheme:

$$\begin{array}{l} p \Rightarrow q \\ \neg q \\ \hline \neg p \end{array}$$

Let us now consider propositions concerning the value of a variable. The classical modus ponens allows us to derive the following non-fuzzy inference scheme (MP):

$$\begin{array}{l} \text{IF } X \in A \text{ THEN } Y \in B \\ X \in A' \\ \hline Y \in B', \end{array}$$
where

$$B' = \begin{cases} B, & \text{if } A' \subseteq A, \\ V, & \text{else.} \end{cases}$$

Note that $B \subseteq B'$, i.e., the output B' cannot be more restrictive than B. When the rule is represented by means of a relation R as explained before, then one easily verifies that $R(A') = B'$, i.e., the output is the direct image of the input under the relation representing the rule. Similarly, the classical modus tollens leads to the following non-fuzzy inference scheme (MT):

$$\begin{array}{l} \text{IF } X \in A \text{ THEN } Y \in B \\ Y \in B' \\ \hline X \in A', \end{array}$$

where

$$A' = \begin{cases} \mathrm{co}\,A, & \text{if } B' \subseteq \mathrm{co}\,B, \\ U, & \text{else.} \end{cases}$$

Also note that $\mathrm{co}\,A \subseteq A'$, i.e., the output A' cannot be more restrictive than co A. Here one easily verifies that $R'(B') = A'$, i.e., the output is the inverse image of the input under the relation representing the rule.
2. Fuzzy Inference Mechanisms

The MP inference scheme has been extended to fuzzy rules by Zadeh (1973). This fuzzy inference scheme appears in the literature as the generalized modus ponens, and also as the compositional rule of inference. Consider a fuzzy single-condition, single-conclusion rule “IF X is A THEN Y is B.” The fuzzy inference scheme (GMP) is defined as follows:

$$\begin{array}{l} \text{IF } X \text{ is } A \text{ THEN } Y \text{ is } B \\ X \text{ is } A' \\ \hline Y \text{ is } B', \end{array}$$

where B' is the direct image of the input A' under the fuzzy relation R representing the fuzzy rule, i.e., $B' = R(A')$, or more explicitly,

$$B'(v) = \sup_{u \in U} \mathcal{T}(A'(u), R(u, v)).$$

When the relation R is defined as $R(u, v) = \mathcal{I}(A(u), B(v))$, with $\mathcal{I}$ an implication operator, then the GMP inference scheme is an extension of the MP inference scheme.
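On finite universes the GMP output is a plain sup-T composition, which the following Python sketch illustrates. The choices made here are assumptions for illustration only: the minimum is used as triangular norm, the Kleene-Dienes and Gödel implications stand in for the implication operator, and all names and data are hypothetical.

```python
def kleene_dienes(x, y):
    return max(1.0 - x, y)


def goedel(x, y):
    return 1.0 if x <= y else y


def rule_relation(A, B, implication):
    """R(u, v) = I(A(u), B(v))."""
    return {(u, v): implication(a, b)
            for u, a in A.items() for v, b in B.items()}


def gmp(A_prime, R, V, t_norm=min):
    """Direct image of A' under R: B'(v) = sup_u T(A'(u), R(u, v))."""
    return {v: max(t_norm(a, R[(u, v)]) for u, a in A_prime.items())
            for v in V}


# Crisp check: with crisp A, B, A' the scheme reduces to the MP scheme.
A = {"u1": 1.0, "u2": 1.0, "u3": 0.0}
B = {"v1": 1.0, "v2": 0.0}
R = rule_relation(A, B, kleene_dienes)
inside = gmp({"u1": 1.0, "u2": 0.0, "u3": 0.0}, R, B)    # A' contained in A
outside = gmp({"u1": 0.0, "u2": 0.0, "u3": 1.0}, R, B)   # A' not contained in A

# Fuzzy check: a normalized condition fed back as input returns the conclusion
# when the Goedel implication is paired with the minimum.
A_fuzzy = {"u1": 1.0, "u2": 0.5, "u3": 0.0}
B_fuzzy = {"v1": 0.3, "v2": 1.0}
echo = gmp(A_fuzzy, rule_relation(A_fuzzy, B_fuzzy, goedel), B_fuzzy)
```

In the crisp case `inside` equals B and `outside` equals the whole universe V, which is exactly the behavior of the MP scheme.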
Zadeh originally considered $\mathcal{T} = M$, and he modeled the rule as $R(u, v) = \mathcal{I}(A(u), B(v))$ for a specific implication operator $\mathcal{I}$. Mamdani also considered $\mathcal{T} = M$, but he represented the rule as $R(u, v) = \min(A(u), B(v))$. Since M is not an implication operator, his proposal is not an extension of the MP inference scheme, and therefore its results are questionable. In general, any choice of $\mathcal{T}$ and $\mathcal{I}$ gives rise to a GMP inference scheme. In the same way, the MT inference scheme can be extended to a GMT inference scheme, by defining the output A' as the inverse image of the input B' under the fuzzy relation R representing the rule. In particular, the MP inference scheme satisfies the following property: if the input equals the condition of the rule, then the output equals the conclusion of the rule. A GMP inference scheme does not necessarily behave in this way. We mention the following result (De Baets and Kerre, 1993d). We call a fuzzy set A in U normalized if and only if $(\exists u \in U)(A(u) = 1)$.

Proposition V.1. Consider a fuzzy rule IF X is A THEN Y is B and the corresponding five fuzzy relations $R_i$ (i = 1, ..., 5); then the following properties hold:

(i) if A' is normalized then $B \subseteq R_i(A')$, i = 2, 3, 4;
(ii) if B' is normalized then $\mathrm{co}\,A \subseteq R_i'(B')$, i = 1, 3, 5.

If the triangular norm $\mathcal{T}$ has left-continuous partial mappings, then the following additional properties hold:

(iii) if A is normalized then $R_4(A) = B$;
(iv) if co B is normalized then $R_5'(\mathrm{co}\,B) = \mathrm{co}\,A$.
The second half of this proposition indicates that the specific behavior of the MP inference scheme is preserved by the GMP inference scheme under very specific conditions.

3. Multiple Rules

In this discussion we will restrict ourselves to the (generalized) modus ponens inference schemes. Let us first consider n non-fuzzy rules describing the relationship between two variables X and Y:

IF X ∈ $A_1$ THEN Y ∈ $B_1$,
⋮
IF X ∈ $A_n$ THEN Y ∈ $B_n$.

How do we proceed when given an input “X ∈ A'”? First of all, each of the rules is represented by means of a relation $R_i$, as explained before:

$$R_i = (A_i \times B_i) \cup (\mathrm{co}\,A_i \times V).$$
One possible way is to apply the modus ponens for each of the rules separately, yielding n conclusions “Y ∈ $B_i'$” with $B_i' = R_i(A')$. These conclusions are then combined into one global conclusion “Y ∈ B'” by taking the intersection of the individual conclusions:

$$B' = \bigcap_{i=1}^{n} R_i(A').$$

This is an optimistic attitude towards multiple sources of information. A second possibility is to combine the rules into one global rule, as follows: “(X, Y) ∈ R” with

$$R = \bigcap_{i=1}^{n} R_i,$$

and then apply the modus ponens, yielding the conclusion “Y ∈ B'” with

$$B' = \Bigl(\bigcap_{i=1}^{n} R_i\Bigr)(A').$$

From the property

$$\Bigl(\bigcap_{i=1}^{n} R_i\Bigr)(A') \subseteq \bigcap_{i=1}^{n} R_i(A'),$$

it follows that the second method is more restrictive than the first one. We will spend a few lines here to situate the well-known approach of Mamdani. In this approach each of the rules is represented by means of a relation $R_i^*$:

$$R_i^* = A_i \times B_i.$$

When the conditions $A_1, \ldots, A_n$ form a partition of the universe U of the variable X, then one easily verifies that

$$\bigcap_{i=1}^{n} R_i = \bigcup_{i=1}^{n} A_i \times B_i = \bigcup_{i=1}^{n} R_i^*$$

and

$$\Bigl(\bigcap_{i=1}^{n} R_i\Bigr)(A') = \bigcup_{i=1}^{n} R_i^*(A').$$

This means that we obtain the same result as in the second approach mentioned earlier if we apply the modus ponens for each of the rules separately, using the representation of Mamdani, and then combine the individual conclusions into one global conclusion by taking the union. This is a pessimistic attitude towards multiple sources of information. The foregoing reasoning shows how an unacceptable representation combined with an unfavorable attitude can, in some circumstances, lead to a correct result.
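The relationship between the two ways of combining non-fuzzy rules, and the coincidence with Mamdani's representation when the conditions form a partition, can be verified on a toy example. The Python sketch below is only an illustration; the universes, the two rules, the input, and the function names are all hypothetical.

```python
def rule_relation(A, B, U, V):
    """(A x B) united with (coA x V) as a set of pairs."""
    return {(u, v) for u in U for v in V if u not in A or v in B}


def image(R, A_prime):
    """Direct image of the crisp set A' under the crisp relation R."""
    return {v for (u, v) in R if u in A_prime}


U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
rules = [({1, 2}, {"a"}), ({3, 4}, {"b", "c"})]   # conditions partition U
A_prime = {2, 3}

relations = [rule_relation(A, B, U, V) for A, B in rules]

# First method: infer with each rule, then intersect the conclusions.
first = set.intersection(*(image(R, A_prime) for R in relations))

# Second method: intersect the rules, then infer once.
second = image(set.intersection(*relations), A_prime)

# Mamdani: represent each rule as A_i x B_i, infer, then take the union.
mamdani = set.union(*(image({(u, v) for u in A for v in B}, A_prime)
                      for A, B in rules))
```

For this input the first method returns the whole universe V, while the second method and the Mamdani combination both return {"a", "b", "c"}, in line with the inclusion and the partition property stated above.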
When considering fuzzy rules, such an observation is no longer possible, and the approach of Mamdani can no longer be justified. We recommend the first approach mentioned earlier, also for fuzzy rules: apply the generalized modus ponens for each of the rules separately, and then define the global conclusion as the intersection of the individual conclusions.

4. Practical Considerations
As long as the universes are finite, there is no difficulty in computing the membership function of the output produced by the generalized modus ponens or modus tollens inference scheme. In practice, however, the conditions and conclusions are often propositions concerning the value of a numerical variable, expressed by means of a fuzzy set in the real line ℝ or a subset of the real line. A fuzzy set in the real line is usually called a fuzzy quantity. It is not possible then, except under very stringent conditions, to obtain an exact expression for this membership function, since the underlying universe is infinite. We will discuss a few methods that obtain an acceptable approximation of the output, or even an exact analytical expression of the output, under certain conditions. In fact, there are two possible ways to turn, for instance, a GMP inference scheme into a practical tool. Firstly, we can restrict the choice of the operators (triangular norm and implication operator), and secondly, we can restrict the data model and stick to certain specific types of fuzzy quantities. In most cases, a mixed approach is followed. Martin-Clouaire (1987) has shown that for some pairs $(\mathcal{T}, \mathcal{I})$ of triangular norms and implication operators, more specifically pairs involving the triangular norms M, P, and W, practical algorithms can be developed to come up with a suitable approximation of the conclusion B', provided that the trapezoidal fuzzy data model is used. A trapezoidal fuzzy quantity Q can be represented by means of a quadruple $(a, b, \alpha, \beta)$, with $a \leq b$, $\alpha \geq 0$, and $\beta \geq 0$. The degree of membership of $x \in \mathbb{R}$ in Q is then given by

$$Q(x) = \begin{cases} 0, & \text{if } x < a - \alpha, \\ \dfrac{x - a + \alpha}{\alpha}, & \text{if } a - \alpha \leq x < a, \\ 1, & \text{if } a \leq x \leq b, \\ \dfrac{b - x + \beta}{\beta}, & \text{if } b < x \leq b + \beta, \\ 0, & \text{if } b + \beta < x. \end{cases}$$
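The membership function of a trapezoidal fuzzy quantity translates directly into code. The Python sketch below is a minimal illustration; the function name `trapezoid` and the closure-based design are assumptions made here, and the sample quadruple (12, 16, 2, 4) is used purely as data.

```python
def trapezoid(a, b, alpha, beta):
    """Membership function of the trapezoidal fuzzy quantity (a, b, alpha, beta)."""
    if a > b or alpha < 0 or beta < 0:
        raise ValueError("need a <= b, alpha >= 0 and beta >= 0")

    def membership(x):
        if a <= x <= b:                      # kernel
            return 1.0
        if a - alpha <= x < a:               # left slope (alpha > 0 here)
            return (x - a + alpha) / alpha
        if b < x <= b + beta:                # right slope (beta > 0 here)
            return (b - x + beta) / beta
        return 0.0                           # outside the support

    return membership


A = trapezoid(12, 16, 2, 4)
degrees = [A(x) for x in (9, 11, 12, 14, 18, 21)]
```

The slope branches can only be reached when the corresponding spread is strictly positive, so no division by zero occurs for degenerate quantities such as intervals (α = β = 0).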
Now consider a rule “IF X is A THEN Y is B” with A and B trapezoidal fuzzy quantities:

$$A = (a_1, b_1, \alpha_1, \beta_1), \qquad B = (a_2, b_2, \alpha_2, \beta_2),$$

and a trapezoidal input “X is A'” with $A' = (a_1', b_1', \alpha_1', \beta_1')$.

For $\mathcal{T} = M$ and the implication operator $\mathcal{I}$ paired with it, we can obtain the following approximation “Y is B'” of the output:

$$B'(v) = \max(B^*(v), \theta),$$

with $B^*$ the trapezoidal fuzzy quantity defined by $B^* = (a^*, b^*, \alpha^*, \beta^*)$ with

$$a^* = \alpha_2 (u_1 - 1) + a_2,$$
$$b^* = \beta_2 (1 - u_1) + b_2,$$
$$\alpha^* = \alpha_2 Q,$$
$$\beta^* = \beta_2 Q,$$
$$\theta = u_0,$$

where $Q = u_1 / (1 - u_0)$, $u_0$ is the maximal degree of membership of A' outside the support of A, and $u_1$ is the minimal degree of membership of A' inside the kernel of A. For example, for

A = (12, 16, 2, 4), B = (14, 18, 3, 8), A' = (8, 12, 3, 4),

we obtain B* = (6.5, 14, 6, 8) and θ = 0.75. For

A = (10, 12, 9, 8), B = (7, 15, 3, 2), A' = (8, 14, 4, 4),

we obtain B* = (6.5, 15.5, 2.5, 2.5) and θ = 0. Apart from these approximate methods, we have suggested some exact analytical methods for a more restrictive data model, but for a broader selection of triangular norms and implication operators (De Baets and
Kerre, 1993d). A symmetrical triangular fuzzy number A is a fuzzy quantity determined by its center a and its width c, as follows, for $u \in \mathbb{R}$:

This data model is rather popular in application areas such as fuzzy control (Gupta et al., 1986). Our study contains all pairs $(\mathcal{T}, \mathcal{I}_i)$ (i = 1, ..., 5) for $\mathcal{T} = M$ and $\mathcal{T} = W$. We have shown that it is possible to obtain a simple analytical expression for the output, provided that the condition of the rule and the input are both described by means of symmetrical triangular fuzzy numbers with the same width. Consider a rule “IF X is A THEN Y is B” with A a symmetrical triangular fuzzy number with center a and width c. When the input “X is A'” is a symmetrical triangular fuzzy number with center a' and width c, then the output “Y is B'” is defined by

for $\mathcal{T} = M$ and a particular implication operator $\mathcal{I}_i$:

$$B'(v) = \max\bigl(\min\bigl(\tfrac{3}{2} - l, B(v)\bigr), l\bigr),$$

with $l = \min(1, \tfrac{1}{2} + d)$ and $d = |a - a'| / c$;

for $\mathcal{T} = W$ and a particular implication operator $\mathcal{I}_i$:

$$B'(v) = \min(k + B(v), 1),$$

with $k = \min(1, 2d)$ and $d = |a - a'| / c$.
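The closed-form output for the triangular norm W can be compared with a brute-force evaluation of the generalized modus ponens on a grid. The Python sketch below rests on several assumptions that are not stated in the text: the symmetrical triangular fuzzy number with center a and width c is taken to be A(u) = max(0, 1 - 2|u - a|/c), i.e., c is read as the full width of the support; W is the Łukasiewicz triangular norm max(0, x + y - 1); and the implication operator is taken to be the Łukasiewicz implication. All function names and numbers are hypothetical.

```python
def triangle(center, width):
    """Assumed form of a symmetrical triangular fuzzy number (full width c)."""
    return lambda u: max(0.0, 1.0 - 2.0 * abs(u - center) / width)


def W(x, y):
    """Lukasiewicz triangular norm."""
    return max(0.0, x + y - 1.0)


def lukasiewicz(x, y):
    """Lukasiewicz implication (assumed choice of implication operator)."""
    return min(1.0, 1.0 - x + y)


def gmp_brute_force(A, A_prime, b, grid):
    """sup_u W(A'(u), I(A(u), b)) over a finite grid, for a fixed degree b = B(v)."""
    return max(W(A_prime(u), lukasiewicz(A(u), b)) for u in grid)


def gmp_closed_form(a, a_prime, c, b):
    """Closed form min(k + B(v), 1) with k = min(1, 2d), d = |a - a'| / c."""
    d = abs(a - a_prime) / c
    k = min(1.0, 2.0 * d)
    return min(k + b, 1.0)


a, a_prime, c = 10.0, 11.0, 4.0
grid = [i / 100 for i in range(0, 2001)]          # covers both supports
A, A_prime = triangle(a, c), triangle(a_prime, c)
pairs = [(gmp_brute_force(A, A_prime, b, grid), gmp_closed_form(a, a_prime, c, b))
         for b in (0.0, 0.2, 0.5, 0.9)]
```

Under these assumptions the supremum is attained at the center a' of the input, which lies on the grid, so the two values in each pair agree up to floating-point rounding.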
D. Further References

We have only touched upon the modus ponens and modus tollens inference schemes. There exist several other inference schemes that have been extended to fuzzy inference schemes. We mention the syllogism and the method of cases. The syllogism inference scheme is given by

$$\begin{array}{l} p \Rightarrow q \\ q \Rightarrow r \\ \hline p \Rightarrow r, \end{array}$$

and the method of cases inference scheme is given by

$$\begin{array}{l} p \lor q \\ p \Rightarrow r \\ q \Rightarrow r \\ \hline r. \end{array}$$
For a discussion of an extension of the method of cases we refer to Ruan and Kerre (1993a). A unified approach to the extension of any classical inference scheme to a corresponding fuzzy inference scheme can be found in Ruan and Kerre (1993b).
REFERENCES

Bandler, W., and Kohout, L. (1980a). Fuzzy power sets and fuzzy implication operators. Fuzzy Sets Syst. 4, 13.
Bandler, W., and Kohout, L. (1980b). Fuzzy relational products as a tool for analysis and synthesis of the behaviour of complex natural and artificial systems. In “Fuzzy Sets: Theory and Application to Policy Analysis and Information Systems” (P. Wang and S. Chang, eds.), p. 341. Plenum, New York.
Bandler, W., and Kohout, L. (1980c). Semantics of implication operators and fuzzy relational products. Int. J. Man-Mach. Stud. 12, 89.
Bandler, W., and Kohout, L. (1986). A survey of fuzzy relational products in their applicability to medicine and clinical psychology. In “Knowledge Representation in Medicine and Clinical Behavioural Science” (W. Bandler and L. Kohout, eds.), p. 107. Abacus Press, Cambridge and Tunbridge Wells.
Bandler, W., and Kohout, L. (1988). Special properties, closures and interiors of crisp and fuzzy relations. Fuzzy Sets Syst. 26, 317.
Bellman, R., and Giertz, M. (1973). On the analytic formalism of the theory of fuzzy sets. Inf. Sci. (N.Y.) 5, 149.
Braae, M., and Rutherford, D. (1979). Theoretical and linguistic aspects of the fuzzy logic controller. Automatica 15, 553.
De Baets, B., and Kerre, E. (1993a). A revision of Bandler-Kohout compositions of relations. Math. Pannonica 4, 59.
De Baets, B., and Kerre, E. (1993b). Fuzzy relational compositions. Fuzzy Sets Syst. 60, 109.
De Baets, B., and Kerre, E. (1993c). Triangular fuzzy relational compositions revisited. In “Intelligent Systems with Uncertainty” (B. Bouchon-Meunier, R. Yager, and L. Valverde, eds.), p. 257. North-Holland Publ., Amsterdam.
De Baets, B., and Kerre, E. (1993d). The generalized modus ponens and the triangular fuzzy data model. Fuzzy Sets Syst. 59, 305.
De Baets, B., and Kerre, E. (1994). The cutting of compositions. Fuzzy Sets Syst. 62, 295.
Di Nola, A., Pedrycz, W., and Sessa, S. (1988). Fuzzy relation equations with equality and difference operators. Fuzzy Sets Syst. 25, 205.
Dubois, D., and Prade, H. (1992). Upper and lower images of a fuzzy set induced by a fuzzy relation: Applications to fuzzy inference and diagnosis. Inf. Sci. (N.Y.) 64, 203.
Gentilhomme, Y. (1968). Les ensembles flous en linguistique. Cah. Ling. Théor. Appl. 5, 47.
Goguen, J. (1967). L-fuzzy sets. J. Math. Anal. Appl. 18, 145.
Gupta, M., Kiszka, J., and Trojan, M. (1986). Multivariable structure of fuzzy control systems. IEEE Trans. Syst. Man Cybernet. SMC-16, 638.
Izumi, K., Tanaka, H., and Asai, K. (1986). Adjointness of fuzzy systems. Fuzzy Sets Syst. 20, 211.
Keravnou, E. (1986). Computer representation of fuzzy data structures. In “Knowledge Representation in Medicine and Clinical Behavioural Science” (W. Bandler and L. Kohout, eds.), p. 129. Abacus Press, Cambridge and Tunbridge Wells.
Kerre, E. (1993). “Introduction to the Basic Principles of Fuzzy Set Theory and Its Applications.” Communication & Cognition, Gent, Belgium.
Kohout, L., Keravnou, E., and Bandler, W. (1983). Information retrieval system using fuzzy relational products for thesaurus construction. In “IFAC Fuzzy Information, Knowledge Representation and Decision Analysis” (E. Sanchez and M. Gupta, eds.), p. 7. Marseille, France.
Kohout, L., Keravnou, E., and Bandler, W. (1984). Automatic documentary information retrieval by means of fuzzy relational products. TIMS/Stud. Manage. Sci. 20, 383.
Kohout, L., Anderson, J., Behrooz, A., Gao, S., and Trayner, C. (1991). Activity structure based architectures for knowledge-based systems. Part 1. Dynamics of localized fuzzy inference and its interaction with planning. Fuzzy Sets Syst. 44, 405.
Mamdani, E., and Sembi, B. (1980). Process control using fuzzy logic. In “Fuzzy Sets: Theory and Application to Policy Analysis and Information Systems” (P. Wang and S. Chang, eds.), p. 249. Plenum, New York and London.
Martin-Clouaire, R. (1987). “Semantics and Computation of the Generalized Modus Ponens: The Long Paper,” Intern. Rep. No. 270 of the Laboratory “Languages et Systèmes Informatiques,” Université Paul Sabatier, Toulouse, France.
Pedrycz, W. (1989). “Fuzzy Control and Fuzzy Systems.” Research Studies Press, Somerset, England.
Pedrycz, W. (1993). s-t Fuzzy relational equations. Fuzzy Sets Syst. 59, 189.
Ruan, D., and Kerre, E. (1993a). Fuzzy implication operators and generalized fuzzy method of cases. Fuzzy Sets Syst. 54, 23.
Ruan, D., and Kerre, E. (1993b). On the extension of the compositional rule of inference. Int. J. Intell. Syst. 8, 807.
Schweizer, B., and Sklar, A. (1983). “Probabilistic Metric Spaces.” Elsevier, New York.
Zadeh, L. (1965). Fuzzy sets. Inf. Control 8, 338.
Zadeh, L. (1971). Similarity relations and fuzzy orderings. Inf. Sci. (N.Y.) 3, 177.
Zadeh, L. (1973). Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Syst. Man Cybernet. SMC-3, 28.
Zadeh, L. (1978). Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets Syst. 1, 3.
Zadeh, L. (1984). Coping with the imprecision of the real world. Commun. ACM 27, 304.
Zadeh, L. (1986). Is probability theory sufficient for dealing with uncertainty in A.I.: A negative view. In “Uncertainty in Artificial Intelligence” (L. Kanal and J. Lemmer, eds.), p. 103. Elsevier, Amsterdam.
Zenner, R., Kerre, E., and De Caluwe, R. (1984). Practical determination of document description relations in document retrieval systems. In “Proceedings of the Workshop on the Membership Function” (C. Carlsson, ed.), p. 127. European Institute for Advanced Studies in Management, Brussels.
Zimmermann, H.-J., and Zysno, P. (1980). Latent connectives in human decision making. Fuzzy Sets Syst. 4, 37.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 89
Basis Algorithms in Mathematical Morphology
RONALD JONES¹ and IMANTS D. SVALBE
Department of Physics, Monash University, Melbourne, Victoria 3168, Australia
I. Introduction
   A. Mathematical Morphology
   B. Background Theory
   C. Scope and Organization of This Work
II. Basis Algorithms
   A. Existing Basis Algorithms
   B. A General Basis Algorithm
III. Applying the General Basis Algorithm
   A. Union and Intersection of $\tau$-Mappings
   B. Dilation and Erosion
   C. General Opening and Closing
   D. Cascaded $\tau$-Mappings
   E. The Dual Basis
IV. Filtering Properties and the Basis Representation
   A. Extensive and Anti-extensive $\tau$-Mappings
   B. Over-Filters and Under-Filters
   C. Self-Duality
   D. Combining Design Constraints
V. Translation-Invariant Set Mappings
   A. Background Theory
   B. Basis Algorithms
   C. Application of the General Basis Algorithm
VI. Gray-Scale Function Mappings
   A. Background Theory
   B. Basis Algorithms
   C. The General Basis Algorithm
   D. Application of the General Basis Algorithm
VII. Transforming the Basis Representation
   A. Basic Tools
   B. Serial Transformations
   C. Cascaded $\tau$-Mappings
VIII. Conclusion
Appendix
References
¹ Present address: Division of Maths and Statistics, CSIRO, Locked Bag 17, North Ryde, NSW 2113, Australia.
Copyright © 1994 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014731-9
I. INTRODUCTION
A. Mathematical Morphology
Mathematical morphology is a discipline of image processing that has been used successfully in the classification, recognition, and analysis of images. Since its inception in the 1960s, mathematical morphology has become increasingly popular in the image-processing community. This can be attributed both to its proven utility and to its rigorous mathematical description. Initially designed by Matheron (1975) and Serra (1982) to formalize Euclidean set transformations, in particular those which are translation-invariant, it has since been extended to functionals (Sternberg, 1982, 1986) and to complete lattices (Serra, 1988). To extract information from an image, it is generally necessary to transform the image into another from which that information can be more readily accessed. To this end, mathematical morphology uses a range of tools derived from such mathematical disciplines as algebra, topology, and probability theory. A set-theoretic interpretation of images allows access to a range of functionals already well established in set theory. This interpretation has a strong association with shape, and for this reason it has proved very useful for object recognition in images. For a full account of mathematical morphology, the reader is referred to one of the many surveys available, notably the tutorial of Haralick et al. (1987), the introductory text from Serra (1982), and that of Dougherty and Giardina (1988b). Work of a more technical nature can be found in the papers of Heijmans and Ronse (1990, 1991) and the second text from Serra (1988).
B. Background Theory
In mathematical morphology, a binary image is treated as a set. The set can be either the set of white pixels or the set of black pixels; in either case the set represents a complete description of the image. Interpreting the image as a set allows one to utilize a range of well-known functionals such as set union, intersection, complementation, and translation. It is from this basic algebra that mathematical morphology derives its tools. In the following we establish some basic concepts of mathematical morphology that will be used throughout this work. The analysis will be in terms of binary morphology, but these concepts are readily extended to gray-scale morphology. Gray-scale morphology in its relation to this work is discussed in Section VI.
Unless stated otherwise, we assume the underlying space on which all binary images are defined to be a Euclidean space $E$ of arbitrary dimension. Elements of $E$ will be denoted by symbols such as $x$, $y$, and $z$. In particular, the origin is an element of $E$ and will be denoted by the symbol $0$. The space of all subsets of $E$ will be defined by $\mathcal{P}(E) = \{X : X \subseteq E\}$. Both $E$ and the empty set $\emptyset$ are elements of $\mathcal{P}(E)$. Binary images, which will be referred to simply as sets, are also elements of $\mathcal{P}(E)$ and will be denoted by symbols such as $X$, $Y$, and $Z$. We denote the complement of a set $X \in \mathcal{P}(E)$ by $X^c$ and define it by $X \cup X^c = E$ and $X \cap X^c = \emptyset$. The cardinality of $X$ will be denoted by $\chi(X)$, the translation of $X$ through $t \in E$ will be defined as $X_t = \{z \in E : z = x + t,\ x \in X\}$, and the reflection of $X$ will be defined as $\check{X} = \{z \in E : z = -x,\ x \in X\}$.
1. Set Mappings
Morphological filters map one set to another in Euclidean space. We will call morphological filters set mappings and denote them by symbols such as $\Psi$ and $\Phi$. With any set mapping $\Psi : X \to Y$, where $X, Y \in \mathcal{P}(E)$, we may associate a dual set mapping $\Psi^*(X) = [\Psi(X^c)]^c$. A set mapping $\Psi$ will be called increasing if
$$X \subseteq Y \Rightarrow \Psi(X) \subseteq \Psi(Y), \quad \forall X, Y \in \mathcal{P}(E). \tag{I.1}$$
This property, also known as the growth of $\Psi$, means that if a set $X$ increases in size then the set $\Psi(X)$ can only increase as well. A useful result from this property is that $X \subseteq Y \Rightarrow \Psi(X) \cup \Psi(Y) = \Psi(Y)$, and this has been used to represent increasing gray-scale mappings as a summation of increasing binary mappings. Unless otherwise stated it will be assumed that all set mappings are increasing. A set mapping $\Psi$ will be called translation-invariant if
$$\Psi(X_t) = [\Psi(X)]_t, \quad \forall X \in \mathcal{P}(E),\ t \in E. \tag{I.2}$$
If $\Psi$ is translation-invariant, then it can be applied to a set $X_t$ independently of the vector $t \in E$. It is a property which greatly simplifies the analysis of set mappings, and in this work we assume that all set mappings are translation-invariant. A set mapping $\Psi$ will be called idempotent if
$$\Psi[\Psi(X)] = \Psi(X), \quad \forall X \in \mathcal{P}(E). \tag{I.3}$$
If $\Psi$ is idempotent, then successive applications of $\Psi$ have no effect on the initial result $\Psi(X)$. When a set $X$ is transformed by a set mapping $\Psi$, the result $\Psi(X)$ is a characterization of $X$ in terms of $\Psi$. As unwanted information is removed in this process, the net result is in fact a simpler form of the set $X$. If $\Psi$ is idempotent, then this process of simplification is limited to the very first application. Idempotent set mappings have proved very popular for this reason, and indeed there are many in the image-processing community who view idempotence as a defining property of all morphological filters. Finally, $\Psi$ will be called extensive if $\Psi(X) \supseteq X$ for all $X \in \mathcal{P}(E)$, and anti-extensive if $\Psi(X) \subseteq X$ for all $X \in \mathcal{P}(E)$. We will denote the identity mapping, $\Psi(X) = X$ for all $X \in \mathcal{P}(E)$, by $\Psi = \mathrm{id}$.
2. Dilation and Erosion
Dilation and erosion are the two fundamental set mappings used in mathematical morphology. A dilation $\delta$ is defined to be any increasing set mapping that commutes with union,
$$\delta\Big(\bigcup_i X_i\Big) = \bigcup_i \delta(X_i), \quad \forall X_i \in \mathcal{P}(E),$$
and an erosion $\varepsilon$ to be any increasing set mapping that commutes with intersection,
$$\varepsilon\Big(\bigcap_i X_i\Big) = \bigcap_i \varepsilon(X_i), \quad \forall X_i \in \mathcal{P}(E).$$
The set mapping known as Minkowski addition is the only translation-invariant dilation that can be defined and is given by
$$X \oplus B = \bigcup_{b \in B} X_b. \tag{I.4}$$
The set $B$ is called a structuring element and is typically small and simple compared to the set $X$. Although this definition does indeed date from Minkowski (1903), it will be referred to simply as dilation. The set mapping known as Minkowski subtraction, introduced by Hadwiger (1950), is the only translation-invariant erosion that can be defined and is given by
$$X \ominus B = \bigcap_{b \in B} X_{-b}. \tag{I.5}$$
It will be referred to simply as erosion. Note that erosion and dilation are dual mappings, the duality given by $X \oplus B = (X \ominus \check{B})^* = (X^c \ominus \check{B})^c$. The nomenclature for dilation and erosion is slightly different in other works (see, for example, Serra, 1982). An equivalent definition for erosion, used in the tutorial of Haralick et al. (1987), is
$$X \ominus B = \{z \in E : B_z \subseteq X\}. \tag{I.6}$$
The advantage of this representation is that it affords a more intuitive description of erosion. The structuring element $B$ is seen as a template which is translated through a vector $z$ and fitted inside the set $X$. If it fits, that is, $B_z \subseteq X$, then the vector $z$ is an element of the erosion $X \ominus B$. In this way, erosion can be used to find by inclusion instances of a particular structure $B$ in the set $X$. Each element $z$ indicates where this structure can be found. From dilation and erosion we have the relations
$$A \subseteq B \Rightarrow X \ominus A \supseteq X \ominus B, \qquad A \subseteq B \Rightarrow X \oplus A \subseteq X \oplus B. \tag{I.7}$$
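The definitions of dilation and erosion can be tried out on a small discrete example. The following Python sketch is illustrative only: it assumes the space is the integer grid $\mathbb{Z}^2$, stores finite sets as frozensets of coordinate pairs, and uses function names (`translate`, `dilate`, `erode`) that are our own and do not come from the literature cited here. Dilation is computed as Minkowski addition and erosion by the template-fitting form of Eq. (I.6).

```python
def translate(X, t):
    """Translate X through t: X_t = {x + t : x in X}."""
    return frozenset((x + t[0], y + t[1]) for x, y in X)


def dilate(X, B):
    """Minkowski addition: the union of the translates X_b, b in B."""
    return frozenset((x + bx, y + by) for x, y in X for bx, by in B)


def erode(X, B):
    """Erosion as template fitting: all z such that B_z is a subset of X.

    Assumes B is non-empty. Any fitting z must place a chosen element b0
    of B on some point of X, so only those positions are tested.
    """
    b0 = next(iter(B))
    candidates = {(x - b0[0], y - b0[1]) for x, y in X}
    return frozenset(z for z in candidates if translate(B, z) <= X)


# A 3-by-3 square and two structuring elements with A a subset of B.
X = frozenset((i, j) for i in range(3) for j in range(3))
A = frozenset({(0, 0)})
B = frozenset({(0, 0), (1, 0)})

# Growing the structuring element shrinks the erosion and grows the dilation.
print(erode(X, A) >= erode(X, B))            # True
print(dilate(X, A) <= dilate(X, B))          # True
print(len(erode(X, B)), len(dilate(X, B)))   # 6 12
```

The restriction of the search to candidate positions is a design choice for finite sets; it avoids scanning an unbounded grid and does not change the result.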
An important corollary is that $A \subseteq B \Rightarrow (X \oplus A) \cap (X \oplus B) = X \oplus A$ and $A \subseteq B \Rightarrow (X \ominus A) \cup (X \ominus B) = X \ominus A$. In both cases the set $B \supseteq A$ contributes nothing to the result and it can be discarded from the expression. For this reason it will be called redundant. As we will see, the concept of redundancy can be used to form a minimal description of any set mapping, and this is the fundamental concept which underlies the basis representation.
3. Openings and Closings
By combining dilation with erosion, we obtain examples of what are known as openings and closings, respectively:
$$X \circ B = (X \ominus B) \oplus B \quad \text{and} \quad X \bullet B = (X \oplus B) \ominus B.$$
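As a minimal sketch of these two compositions, again assuming finite sets on the integer grid $\mathbb{Z}^2$ and using our own illustrative function names, the opening below removes an isolated point that the structuring element cannot fit inside, and the closing fills a one-point hole.

```python
def dilate(X, B):
    """Minkowski addition of finite point sets."""
    return frozenset((x + bx, y + by) for x, y in X for bx, by in B)


def erode(X, B):
    """All z such that the translate B_z fits inside X (B non-empty)."""
    b0 = next(iter(B))
    candidates = {(x - b0[0], y - b0[1]) for x, y in X}
    return frozenset(
        z for z in candidates
        if all((z[0] + bx, z[1] + by) in X for bx, by in B)
    )


def opening(X, B):
    """Erosion followed by dilation."""
    return dilate(erode(X, B), B)


def closing(X, B):
    """Dilation followed by erosion."""
    return erode(dilate(X, B), B)


B = frozenset((i, j) for i in range(2) for j in range(2))       # 2-by-2 block
square = frozenset((i, j) for i in range(3) for j in range(3))  # 3-by-3 square

noisy = square | {(5, 5)}   # square plus an isolated point
holed = square - {(1, 1)}   # square with its centre removed

print(opening(noisy, B) == square)   # True: the isolated point is removed
print(closing(holed, B) == square)   # True: the hole is filled
# Idempotence: a second pass changes nothing.
print(opening(opening(noisy, B), B) == opening(noisy, B))  # True
```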
More generally, any anti-extensive, idempotent, and increasing set mapping is called an opening, and all translation-invariant openings $\gamma$ are given by (Matheron, 1975)
$$\gamma(X) = \bigcup_{B} X \circ B. \tag{I.8}$$
Likewise, any extensive, idempotent, and increasing set mapping is called a closing, and all translation-invariant closings $\varphi$ are given by
$$\varphi(X) = \bigcap_{B} X \bullet B. \tag{I.9}$$
Opening and closing are used extensively in the morphological processing of images. The reader is referred to the work of Matheron (1975), Serra (1988), and Heijmans and Ronse (1991) for many interesting theoretical results. General openings and closings have also been studied by Song and Delp (1990).
4. The Kernel Representation
The foundation for this work has been laid down in Matheron's theorem (1975), which states that all translation-invariant and increasing set mappings can be represented exactly as a union of erosions, or dually as an intersection of dilations. We will call general translation-invariant and increasing set mappings $\tau$-mappings. Matheron's theorem states that a simple algebra of union, intersection, and translation suffices to implement all $\tau$-mappings, and in parallel. Although this result has long been established, it is instructive to outline how it can be derived. Let us define the kernel $\mathcal{K}(\Psi)$ of a $\tau$-mapping $\Psi$ by
$$\mathcal{K}(\Psi) = \{A \in \mathcal{P}(E) : 0 \in \Psi(A)\}.$$
The kernel is a "list" of all those $A \in \mathcal{P}(E)$ where the origin is an element of $\Psi(A)$. The kernel uniquely defines the $\tau$-mapping $\Psi$ and may be used to implement $\Psi$ on any set. Consider $\Psi$ mapping an initial set $X$ to the resultant set $\Psi(X)$. We are able to tell whether or not the origin is an element of $\Psi(X)$ simply by checking whether or not $X \in \mathcal{K}(\Psi)$, because $\mathcal{K}(\Psi)$ contains all such sets. More formally, $0 \in \Psi(X) \Leftrightarrow X \in \mathcal{K}(\Psi)$. By using the property of translation-invariance we may extend this notion to construct the complete set of elements $z \in \Psi(X)$ using the kernel. An element $z \in \Psi(X) \Leftrightarrow 0 \in [\Psi(X)]_{-z}$, and because $\Psi$ is translation-invariant, $[\Psi(X)]_{-z} = \Psi(X_{-z})$. Therefore, $z \in \Psi(X) \Leftrightarrow 0 \in \Psi(X_{-z})$. But using the kernel, $0 \in \Psi(X_{-z}) \Leftrightarrow X_{-z} \in \mathcal{K}(\Psi)$, and so we may conclude that $\Psi(X) = \{z \in E : X_{-z} \in \mathcal{K}(\Psi)\}$. The set mapping $\Psi$ is now completely defined in terms of the kernel.
We may further extend this result to be in terms of erosions. By separating $\Psi(X) = \{z \in E : X_{-z} \in \mathcal{K}(\Psi)\}$ into the contribution from each kernel element $A \in \mathcal{K}(\Psi)$, we have $\Psi(X) = \bigcup_A \{z \in E : A = X_{-z},\ A \in \mathcal{K}(\Psi)\} = \bigcup_A \{z \in E : A_z = X,\ A \in \mathcal{K}(\Psi)\}$. We may relax the constraint $A_z = X$ a little because $\Psi$ is in fact increasing. If a set $A$ is an element of the kernel, then there are infinitely many supersets of $A$ that are also elements of the kernel [as $B \supseteq A$ and $0 \in \Psi(A)$ imply that $0 \in \Psi(B)$ as well, from which $B \in \mathcal{K}(\Psi)$]. We may use instead $\Psi(X) = \bigcup_A \{z \in E : A_z \subseteq X,\ A \in \mathcal{K}(\Psi)\}$ and, from the definition of erosion as it appears in Eq. (I.6), we have that $\Psi(X) = \bigcup \{X \ominus A : A \in \mathcal{K}(\Psi)\}$.
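The first half of this derivation, $z \in \Psi(X) \Leftrightarrow X_{-z} \in \mathcal{K}(\Psi)$, can be checked numerically. The sketch below is an illustration under stated assumptions: the space is the integer grid $\mathbb{Z}^2$, the mapping chosen is an opening by a two-point structuring element, the kernel is never enumerated (it is infinite) but is queried through the constraint $0 \in \Psi(A)$, and the search is confined to a finite region that we pick by hand. All names are hypothetical.

```python
def translate(X, t):
    """Translate X through t."""
    return frozenset((x + t[0], y + t[1]) for x, y in X)


def dilate(X, B):
    return frozenset((x + bx, y + by) for x, y in X for bx, by in B)


def erode(X, B):
    """All z such that B_z fits inside X (B non-empty)."""
    b0 = next(iter(B))
    candidates = {(x - b0[0], y - b0[1]) for x, y in X}
    return frozenset(z for z in candidates if translate(B, z) <= X)


def opening(X, B):
    return dilate(erode(X, B), B)


ORIGIN = (0, 0)


def in_kernel(psi, A):
    """A belongs to the kernel of psi when the origin lies in psi(A)."""
    return ORIGIN in psi(A)


def apply_via_kernel(psi, X, region):
    """Rebuild psi(X) inside a finite region from kernel membership alone."""
    return frozenset(
        z for z in region
        if in_kernel(psi, translate(X, (-z[0], -z[1])))
    )


B = frozenset({(0, 0), (1, 0)})
psi = lambda X: opening(X, B)   # a translation-invariant, increasing mapping

X = frozenset({(0, 0), (1, 0), (2, 0), (4, 0), (1, 2)})
region = frozenset((i, j) for i in range(-2, 7) for j in range(-2, 5))

print(apply_via_kernel(psi, X, region) == psi(X))  # True
```

The agreement relies on the mapping being translation-invariant and on the chosen region being large enough to contain $\Psi(X)$; neither is checked by the code.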
By duality, $\Psi$ has a similar representation as an intersection of dilations. The complete representation of $\Psi$ is given by
$$\Psi(X) = \bigcup_{A \in \mathcal{K}(\Psi)} X \ominus A = \bigcap_{A^* \in \mathcal{K}(\Psi^*)} X \oplus \check{A}^*. \tag{I.10}$$
Here $\Psi$ is a $\tau$-mapping, the kernel $\mathcal{K}(\Psi)$ is a set of structuring elements derived from $\Psi$, and the dual kernel $\mathcal{K}(\Psi^*)$ is a set derived from the dual mapping $\Psi^*(X) = [\Psi(X^c)]^c$.
5. The Basis Representation
In terms of practical applications, the result in Eq. (I.10) has little utility because the kernel contains an infinite number of elements. However, this result was extended by Dougherty and Giardina (1986) and Maragos and Schafer (1987a, b) to include the concept of a basis. The basis is a minimal subset of the kernel, and it can be finite. As we have seen from Eq. (I.7), for two sets $A, B \in \mathcal{P}(E)$, $A \subseteq B \Rightarrow (X \ominus A) \cup (X \ominus B) = X \ominus A$ and $(X \oplus A) \cap (X \oplus B) = X \oplus A$. In both cases the set $B \supseteq A$ is redundant. If $A$ and $B \supset A$ are two different elements of a kernel then, under the representation of Eq. (I.10), the redundant element $B$ can be discarded. The basis contains only those elements of the kernel that are non-redundant. For most $\tau$-mappings the basis is finite, although it is possible to construct $\tau$-mappings whose bases have an infinite number of elements (for example, the set $\{A\} = \{\{z\} : z \in E\}$ is a basis and has an infinite number of elements). We will use the symbol $\mathcal{B}$ to denote a basis. In particular, $\mathcal{B}(\Psi)$ denotes the basis that represents the set mapping $\Psi$ and $\mathcal{B}(\{A\})$ denotes the set of basis elements in the set $\{A\}$. We will also use, for example, $\mathcal{B}(X \to X \ominus A)$ to denote the basis that represents the set mapping $\Psi : X \to X \ominus A$. The basis $\mathcal{B}(\Psi)$ is defined by
$$\mathcal{B}(\Psi) = \{A \in \mathcal{K}(\Psi) : (B \in \mathcal{K}(\Psi) \text{ and } B \subseteq A) \Rightarrow B = A\}.$$
Using the concept of a basis, the kernel representation of a mapping in Eq. (I.10) becomes the basis representation
$$\Psi(X) = \bigcup_{A \in \mathcal{B}(\Psi)} X \ominus A = \bigcap_{A^* \in \mathcal{B}(\Psi^*)} X \oplus \check{A}^*.$$
Here the basis $\mathcal{B}(\Psi)$ is a minimal set of structuring elements derived from $\Psi$, and the dual basis $\mathcal{B}(\Psi^*)$ is a minimal set derived from the dual mapping $\Psi^*(X) = [\Psi(X^c)]^c$. Note that this representation is parallel, as each erosion $X \ominus A$ (or dilation $X \oplus \check{A}^*$) may be implemented independently. The basis representation is very promising in terms of its potential application to the parallel implementation of morphological set mappings.
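The role of redundancy can be illustrated on a finite family of structuring elements. The Python sketch below is a simplified illustration, not an algorithm from the literature cited here: it assumes the integer grid $\mathbb{Z}^2$, works on a finite family rather than on the infinite kernel, and uses hypothetical names (`prune_redundant`, `union_of_erosions`). It keeps only the minimal members of the family, as in the definition of $\mathcal{B}(\Psi)$, and confirms that the union of erosions is unchanged.

```python
def erode(X, B):
    """All z such that the translate B_z fits inside X (B non-empty)."""
    b0 = next(iter(B))
    candidates = {(x - b0[0], y - b0[1]) for x, y in X}
    return frozenset(
        z for z in candidates
        if all((z[0] + bx, z[1] + by) in X for bx, by in B)
    )


def prune_redundant(family):
    """Keep the minimal members: those with no proper subset in the family."""
    family = {frozenset(A) for A in family}
    return {A for A in family if not any(other < A for other in family)}


def union_of_erosions(X, elements):
    """Evaluate the union of the erosions of X by each structuring element."""
    result = set()
    for A in elements:
        result |= erode(X, A)
    return frozenset(result)


X = frozenset((i, j) for i in range(3) for j in range(3))

# The second element is a superset of the first and is therefore redundant.
family = [{(0, 0)}, {(0, 0), (1, 0)}, {(0, 1), (0, 2)}]
basis = prune_redundant(family)

print(len(basis))                                                    # 2
print(union_of_erosions(X, family) == union_of_erosions(X, basis))  # True
```

The pairwise comparison is quadratic in the size of the family, which is acceptable for a sketch but hints at why the comparative step dominates the cost of practical basis algorithms.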
We will now examine some simple $\tau$-mappings and their basis representations. Consider initially a $\tau$-mapping $\Psi(X) = X \ominus B$ that is already in the form of a union of erosions. The kernel is given by $\mathcal{K}(\Psi) = \{A \in \mathcal{P}(E) : 0 \in A \ominus B\} = \{A \in \mathcal{P}(E) : A \supseteq B\}$, using the definition of erosion in Eq. (I.6). The structuring element $B$ is the only non-redundant element of the kernel because all other elements of the kernel are supersets of $B$. The basis $\mathcal{B}(\Psi)$ is then given by the single element $A = B$, and the basis representation of $\Psi$ by $\Psi(X) = X \ominus A$, where $A = B$. Another simple case is the basis for the identity mapping $\Psi(X) = \mathrm{id}(X)$. The kernel is $\mathcal{K}(\Psi) = \{A \in \mathcal{P}(E) : 0 \in \mathrm{id}(A)\} = \{A \in \mathcal{P}(E) : 0 \in A\}$, as $\mathrm{id}(A) = A$. Every $A \in \mathcal{K}(\Psi)$ is a superset of the origin and, as the origin is itself an element of $\mathcal{K}(\Psi)$, the basis $\mathcal{B}(\Psi)$ is given simply by $\mathcal{B}(\mathrm{id}) = \{\{0\}\}$. The basis representation of the identity mapping is then $\Psi(X) = X \ominus \{0\}$. The kernel for the dilation $\Psi(X) = X \oplus B$ is given by $\mathcal{K}(\Psi) = \{A \in \mathcal{P}(E) : 0 \in A \oplus B\} = \{A \in \mathcal{P}(E) : 0 \in \bigcup_{b \in B} A_b\} = \{A \in \mathcal{P}(E) : -b \in A,\ b \in B\}$. Each singleton $\{-b\}$ is an element of the kernel, and every element of the kernel is a superset of some singleton. The basis is therefore given by
$$\mathcal{B}(X \to X \oplus B) = \{\{-b\} : b \in B\}. \tag{I.11}$$
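Equation (I.11) lends itself to a direct numerical check. The sketch below assumes finite sets on the integer grid $\mathbb{Z}^2$ and uses our own illustrative function names; the structuring element is chosen to be asymmetric so that the sign in $\{-b\}$ matters.

```python
def dilate(X, B):
    """Minkowski addition of finite point sets."""
    return frozenset((x + bx, y + by) for x, y in X for bx, by in B)


def erode(X, B):
    """All z such that the translate B_z fits inside X (B non-empty)."""
    b0 = next(iter(B))
    candidates = {(x - b0[0], y - b0[1]) for x, y in X}
    return frozenset(
        z for z in candidates
        if all((z[0] + bx, z[1] + by) in X for bx, by in B)
    )


X = frozenset({(0, 0), (1, 0), (1, 1), (3, 2)})
B = frozenset({(0, 0), (1, 0), (2, 1)})   # deliberately not symmetric

# Basis of the dilation by B: one singleton {-b} for every b in B.
basis = [frozenset({(-bx, -by)}) for bx, by in B]
via_basis = frozenset().union(*(erode(X, A) for A in basis))

print(via_basis == dilate(X, B))            # True
# The identity mapping is the erosion by the single basis element {0}.
print(erode(X, frozenset({(0, 0)})) == X)   # True
```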
The $\tau$-mapping $\Psi(X) = X \oplus B$ may now be represented as a union of erosions, $\Psi(X) = \bigcup \{X \ominus A : A = \{-b\},\ b \in B\}$. Note that this result can also be obtained directly from the definitions of dilation and erosion in Eqs. (I.4) and (I.5). The basis for the opening $\Psi(X) = X \circ B$ may be derived in a similar way and is given by
$$\mathcal{B}(X \to X \circ B) = \{B_{-b} : b \in B\}. \tag{I.12}$$
The opening may now be represented as $\Psi(X) = \bigcup \{X \ominus A : A = B_{-b},\ b \in B\}$. Each element of the basis for opening is a simple translate of $B$ through a vector. All basis elements then have the same cardinality, namely the cardinality of $B$. Likewise, the bases that represent erosion and dilation consist of a set of elements with constant cardinality. This greatly simplifies the form of the basis: if all basis elements have the same cardinality, then none of them can make another redundant (as distinct elements cannot be supersets of one another). The only other example of such a simple basis representation is that of rank order-statistic filters (see Maragos and Schafer, 1987b, and Section II). In contrast, the basis for the closing $\Psi(X) = X \bullet B$ is an example of a basis where the elements do not have a constant cardinality. The basis for such a $\tau$-mapping can only be computed using an algorithmic approach, as has been presented by Svalbe (1991a). In general, an algorithm such as this involves the selection of basis elements from the kernel using a process that avoids elements that would be redundant in the final basis. The difficulty lies in the comparative process that must take place between elements. If we consider a simplified situation where the kernel is finite and has already been computed, the problem would then be one of finding the most efficient algorithm to sort through the kernel elements to obtain the basis elements. In the case of translation-invariant set mappings, simplification techniques for Boolean functions in switching theory can be, and have been, used to reduce a set of elements to a set of basis elements. In practice, however, the kernel can never be actualized because it is infinite, and what must be used instead is the kernel constraint $0 \in \Psi(A)$. As this constraint depends on the set mapping $\Psi$, a new basis algorithm must be invented for every different type of set mapping. In this work, a more general approach to the computation of basis elements will be introduced.
We conclude the background theory for the basis representation with a few properties of the basis that will be used frequently throughout this work. A $\tau$-mapping $\Psi$ with a basis $\mathcal{B}(\Psi) = \{A\}$ will be denoted by $\Psi_{\{A\}}$. The basis constitutes a complete representation of any $\tau$-mapping and so can be used in place of $\Psi$ to design and implement any $\tau$-mapping. For any $\tau$-mapping there can exist only one solution for the basis, and $\Psi_1 = \Psi_2 \Leftrightarrow \mathcal{B}(\Psi_1) = \mathcal{B}(\Psi_2)$. Finally, for any two mappings $\Psi_{\{A\}}$ and $\Psi_{\{B\}}$, $\Psi_{\{A\}} \subseteq \Psi_{\{B\}}$ if and only if every $A \in \{A\}$ is a superset of some $B \in \{B\}$.
C. Scope and Organization of This Work
Recently, the basis representation of morphological mappings has received renewed interest from the image-processing community, as witnessed by the recent publications of Dougherty and Loce (1992b), Banon and Barrera (1993), and Khosravi and Schafer (1993). However, apart from work by Jones and Svalbe (1992a, c), there currently exist few algorithms for generating bases for set mappings. The initial aim of this work is to help fill this void by establishing a general approach to the computation of bases (and in so doing unify some previous algorithms into one method). The advantage of the basis representation is that it is able to represent in parallel many different types of set mappings, previously considered disparate, with one common set mapping: the union of erosions. As recognized in the work of Dougherty and Loce (1992a, b), this common platform can be used to design and implement a broad range of mappings. However, for such an approach to realize its full potential, a deeper understanding of how the filtering properties are manifest in the basis is required. In this work we reconcile some important filtering properties in terms of the basis and discuss the possibility of modifying a given basis to admit such properties.
The basis exemplifies every possible data structure in the region of support of a set mapping. Unfortunately, this can induce a combinatorial "explosion" of basis elements as the size of this region increases. In general there are two solutions to this problem. The first, adopted in the work of Dougherty and Loce, is to clip the basis down to an optimized subset of the basis which is of a more manageable size. Such an approach sacrifices design optimization for design tractability. An alternative solution, proposed in this work, is to transform the basis into forms that are more tractable than the original representation. This approach retains all basis elements and so represents no loss of design optimization. This work will proceed as follows. In the following section, a general approach to compute bases will be proposed that can be used for any combination of set mappings that are realized as unions of erosions. In Section III, it will be demonstrated how this method can be used to compute bases for such set mappings as the close, close-open, and open-close set mappings and multiple passes of the median filter. A further algorithm will then be introduced to compute the dual basis of any given basis. In Section IV, filtering properties will be examined in terms of how they are manifest in the basis representation. In Section V, the constraint that a set mapping be increasing will be relaxed and the algorithms previously derived will be generalized to such set mappings. The extension to gray-scale morphological mappings will be made in Section VI. The results are readily extended to gray-scale morphology, and in particular to flat gray-scale morphology, which requires no alteration to the results for binary morphology. We will then return to binary $\tau$-mappings and to the problem of basis intractability in Section VII. We conclude with some remarks in Section VIII. The proofs of all lemmas and theorems appear in the Appendix.
Finally, note that although the analysis in this work is oriented towards a union of erosions, it could also have been oriented towards the dual intersection of dilations with no loss of generality. We choose the union of erosions representation to conform with the prevailing literature.
II. BASIS ALGORITHMS
A. Existing Basis Algorithms
Many set mappings used in mathematical morphology are translation-invariant and increasing. Apart from the erosion, dilation, opening, and closing set mappings that have been defined in the introduction, examples include all parallel and serial combinations of these mappings, such as the open-close and close-open mappings. All these $\tau$-mappings have a basis representation, but to actually compute a basis an algorithmic approach is generally required. In this section some existing basis algorithms will be discussed.
1. Rank Order-Statistic Filters
The bases that represent erosion, dilation, and opening all have a simple form, and this is due to the fact that the elements of these bases have a constant cardinality. There exists another type of $\tau$-mapping that is represented by such a basis, and it is called a rank order-statistic filter. It has been used extensively in image processing to suppress impulse noise in images while preserving edges. A binary rank order-statistic filter $\Psi_{r,W}$ is defined by
$$\Psi_{r,W}(X) = \{z \in E : \chi(X \cap W_z) \geq r\}.$$
Here $W \in \mathcal{P}(E)$ is a window consisting of $n$ elements, $r$ is the rank, and $\chi$ denotes cardinality. If $r = (n + 1)/2$ when $n$ is odd, then $\Psi_{r,W}$ is a median filter. For each element $z$, the window $W_z$ is matched with the structure in the set $X$. If there are at least $r$ matching elements, then $z$ is an element of the resultant set $\Psi_{r,W}(X)$. In practice, such a procedure involves counting the number of elements in the set $X \cap W_z$. An alternative approach is to use the basis representation. The basis $\mathcal{B}(\Psi_{r,W})$ of $\Psi_{r,W}$ is given by (Maragos and Schafer, 1987b)
$$\mathcal{B}(\Psi_{r,W}) = \{A \subseteq W : \chi(A) = r\},$$
from which the basis representation of $\Psi_{r,W}$ is then given by
$$\Psi_{r,W}(X) = \bigcup \{X \ominus A : A \subseteq W \text{ and } \chi(A) = r\}.$$
The rank order-statistic filter is now represented in a parallel form. Each basis element is a subset of the window $W$ consisting of exactly $r$ elements (thus, the cardinality of the basis elements is a constant), and there are
$$\frac{\chi(W)!}{r!\,(\chi(W) - r)!}$$
basis elements in all. An example is illustrated in Fig. 1. The window $W$ is defined as a cross shape $W = \{(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)\}$ with $n = 5$ elements, as shown in Fig. 1a. By setting the rank $r = 1$, $\Psi_{r,W}$ is a dilation and the basis consists of the five singletons in Fig. 1b. With a rank $r = 3$, $\Psi_{r,W}$ is a median filter, and the basis consists of the 10 elements shown in Fig. 1c. If $r = n = 5$, then $\Psi_{r,W}$ is a simple erosion with a basis consisting of a single element equal to the original window $W$ in Fig. 1a.
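The equivalence between the counting definition and the basis representation can be verified on a small example. The following Python sketch is illustrative: it assumes finite sets on the integer grid $\mathbb{Z}^2$, takes the cross-shaped window of the example above, and uses function names of our own choosing. The basis is enumerated as all $r$-element subsets of $W$ with `itertools.combinations`.

```python
from itertools import combinations
from math import comb


def erode(X, B):
    """All z such that the translate B_z fits inside X (B non-empty)."""
    b0 = next(iter(B))
    candidates = {(x - b0[0], y - b0[1]) for x, y in X}
    return frozenset(
        z for z in candidates
        if all((z[0] + bx, z[1] + by) in X for bx, by in B)
    )


def rank_filter(X, W, r):
    """Counting definition: all z with at least r points of W_z inside X.

    For r >= 1 only positions where the window touches X can qualify.
    """
    candidates = {(x - wx, y - wy) for x, y in X for wx, wy in W}
    return frozenset(
        z for z in candidates
        if sum((z[0] + wx, z[1] + wy) in X for wx, wy in W) >= r
    )


def rank_basis(W, r):
    """Basis of the rank filter: every subset of W with exactly r elements."""
    return [frozenset(c) for c in combinations(sorted(W), r)]


def union_of_erosions(X, elements):
    result = set()
    for A in elements:
        result |= erode(X, A)
    return frozenset(result)


W = frozenset({(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)})   # cross window
X = frozenset((i, j) for i in range(3) for j in range(3)) | {(6, 6)}

print([len(rank_basis(W, r)) for r in (1, 3, 5)])   # [5, 10, 1]
print(comb(len(W), 3))                              # 10
for r in (1, 3, 5):
    same = union_of_erosions(X, rank_basis(W, r)) == rank_filter(X, W, r)
    print(r, same)                                  # each line ends in True
```

With $r = 3$ the isolated point at (6, 6) is suppressed, which is the impulse-noise behavior expected of a median filter.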
FIGURE 1. Bases for the rank order-statistic filter. (a) Window $W$. (b) Basis with rank $r = 1$. (c) Basis with rank $r = 3$. Elements are marked by circle symbols; the origin is marked by a "+" symbol.
2. Other Binary $\tau$-Mappings
The computation of bases for many other types of $\tau$-mappings is significantly more complex. For example, the basis for a closing can only be computed using an algorithmic approach. The earliest work on this problem was from Maragos (1989), where a one-dimensional convex set was used for a structuring element. The problem is greatly simplified with this choice of structuring element as, apart from the origin, each basis element has a constant cardinality. It was in the work of Svalbe (1991a, b) that an algorithm to compute the basis for the binary closing with an arbitrary N-dimensional structuring element was first introduced. From this work, extensions were made to a basis algorithm for closing using multiple structuring elements (Jones and Svalbe, 1992a) and to algorithms for the open-close and close-open $\tau$-mappings (Jones and Svalbe, 1992b).
3. Gray-Scale $\tau$-Mappings
The extension of the basis representation to gray-scale morphology introduces a new aspect to the basis representation because gray-scale image processing can support image mappings which are generally described as linear (for example, the local average of a function). Typically, these mappings employ operators such as addition and multiplication, and in this sense they are quite different from "morphological" image mappings, which advocate the use of the maximum and minimum operators. There is one unifying link, however, and that is that if a linear mapping is translation-invariant and increasing, it, too, has a basis representation and can be represented as a maximum of erosions. In this way, linear mappings are in fact "morphological." Note that bases for linear mappings are usually infinite (although there has been a considerable amount of literature devoted to methods for making such bases finite; see, for example, Dougherty and Kraus, 1991, and Dougherty and Schafer, 1993). The basis representation of gray-scale $\tau$-mappings is discussed in Section VI.
B. A General Basis Algorithm
The problem with the basis algorithms mentioned earlier is that each can only be used for the specific τ-mapping for which it was designed. A different basis algorithm must be used for every different type of τ-mapping. It would be more useful to have a completely general algorithm that can compute bases for any given τ-mapping. In the following such an algorithm is attempted. We will proceed by deriving the forms of the bases that represent the three τ-mappings $(\Psi_{\{A\}})_t$, $\Psi_{\{A\}} \cup \Psi_{\{B\}}$, and $\Psi_{\{A\}} \cap \Psi_{\{B\}}$, where $\Psi_{\{A\}}$ and $\Psi_{\{B\}}$ are two arbitrary τ-mappings. Note that the algebra of translation, union, and intersection used is exactly that required for the basis representation. The bases for the first two τ-mappings are rather simple, but the last is more complex and exhibits some interesting properties. The three results form the tools which make up the general basis algorithm. The algorithm can be used to compute the basis for any τ-mapping that consists of a combination of τ-mappings for which the bases are known. It cannot be used for all τ-mappings; the rank order-statistic filter is one example. However, if the basis for this τ-mapping is known, then the algorithm can be used to compute the basis for an arbitrary number of iterations of it, or for combinations with any other τ-mapping for which the basis is known. The algorithm is applied to some example τ-mappings in Section III.
1. The Translation of a τ-Mapping
As one might expect, the basis representing $(\Psi_{\{A\}})_t$ is given by translating the elements of the basis $\{A\}$. The result is formalized in the following theorem.
Theorem II.1. The basis for the translation of a τ-mapping $\Psi_{\{A\}}$ through a vector $t \in E$ is given by
$$\mathcal{B}[(\Psi_{\{A\}})_t] = \{A_{-t} : A \in \{A\}\}.$$
The number of elements in the basis $\{A_{-t} : A \in \{A\}\}$ is equal to the number of elements in the basis $\{A\}$.
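Theorem II.1 can be checked numerically with a small sketch. The representation used here is an assumption made for illustration only: a structuring element is a frozenset of integer coordinate tuples, a basis is a set of such frozensets, and the erosion is taken as $X \ominus A = \{x : A_x \subseteq X\}$ on a finite image. The function names are hypothetical and do not come from the chapter.

```python
def shift(A, t):
    """Translate the point set A by the vector t."""
    return frozenset(tuple(p + d for p, d in zip(a, t)) for a in A)


def translate_basis(basis, t):
    """Basis of the translated mapping (Psi_{A})_t: each element shifted by -t."""
    minus_t = tuple(-d for d in t)
    return {shift(A, minus_t) for A in basis}


def erode(X, A):
    """Erosion of a finite set X by A, i.e. {x : A_x is a subset of X}."""
    a0 = next(iter(A))
    # x can only qualify if x + a0 lies in X, so only those candidates are tested.
    candidates = {tuple(p - q for p, q in zip(x, a0)) for x in X}
    return {c for c in candidates if shift(A, c) <= X}


def apply_basis(X, basis):
    """Evaluate the mapping as a union of erosions by the basis elements."""
    return set().union(*(erode(X, A) for A in basis))


# Hypothetical test image and basis.
X = {(x, y) for x in range(4) for y in range(3)}
basis = {frozenset({(0, 0), (1, 0)}), frozenset({(0, 0), (0, 1), (0, 2)})}
t = (2, -1)

translated_output = shift(apply_basis(X, basis), t)
output_of_translated_basis = apply_basis(X, translate_basis(basis, t))
```

Translating the output of the mapping and applying the mapping with the translated basis should give the same set, and the two bases should have the same number of elements.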
The proof of this and subsequent theorems and lemmas is to be found in the Appendix.
2. The Union of Two τ-Mappings
The basis representing the union $\Psi_{\{A\}} \cup \Psi_{\{B\}}$ is given by the basis elements of the set $\{C\} = \{A\} \cup \{B\}$. The problem is then how to compute the basis elements of this union. As illustrated by the example in Fig. 2, the set $\{C\} = \{A\} \cup \{B\}$ does not in general form a basis. In Fig. 2a is a basis $\{A\}$ consisting of three elements, and in Fig. 2b is a second basis $\{B\}$ consisting of four elements. A set $\{C\}$ formed by the union of $\{A\}$ and $\{B\}$ would consist of all seven elements shown. This does not form a basis, however, as some of the elements of $\{B\}$ are supersets of elements of $\{A\}$. The basis $\mathcal{B}(\{C\})$ is illustrated in Fig. 2c. To compute the basis elements of $\{C\}$, some comparative process must take place between the bases $\{A\}$ and $\{B\}$. By definition of a basis, an element $C \in \{C\}$ will be an element of $\mathcal{B}(\{C\})$ if and only if $C \not\supset C'$ (with $\supset$ denoting a proper superset), for all $C' \in \{C\}$. As $\{C\} = \{A\} \cup \{B\}$, each $C \in \{C\}$ is either equal to an $A \in \{A\}$ or to a $B \in \{B\}$. An element $A \in \{A\}$ will be an element of $\mathcal{B}(\{C\})$ iff $A \not\supset B$, for all $B \in \{B\}$. Note that $\{A\}$ itself forms a basis, and so it suffices to compare $A$ only with the basis $\{B\}$ and not with $\{A\}$ as well. The same applies for those elements of $\{B\}$ that are elements of $\mathcal{B}(\{C\})$. We may formalize this in the following theorem.
FIGURE 2. Example illustrating the union and intersection of two τ-mappings. (a) Basis $\{A\}$. (b) Basis $\{B\}$. (c) Basis $\{C\}$ formed as the basis elements of the union of $\{A\}$ and $\{B\}$. (d) The set $\{C\} = \{A \cup B : A \in \{A\}, B \in \{B\}\}$. (e) The basis elements of the set $\{C\}$.
Theorem II.2. The basis for the τ-mapping $\Psi_{\{A\}} \cup \Psi_{\{B\}}$ is given by
$$\mathcal{B}(\Psi_{\{A\}} \cup \Psi_{\{B\}}) = \{A_B\} \cup \{B_A\},$$
where $\{A_B\} = \{A \in \{A\} : A \not\supset B, \forall B \in \{B\}\}$ and $\{B_A\} = \{B \in \{B\} : B \not\supset A, \forall A \in \{A\}\}$, with $\supset$ denoting a proper superset.
Note that there are other ways to compute the basis $\mathcal{B}(\Psi_{\{A\}} \cup \Psi_{\{B\}})$. For example, it can be defined initially as the full union $\{A\} \cup \{B\}$, and then the redundant elements can be subsequently removed. Although simpler, this algorithm is in practice slower, because sorting routines are in general slow and because it does not utilize the fact that both $\{A\}$ and $\{B\}$ are themselves bases. The total number of elements in the basis $\mathcal{B}(\{A\} \cup \{B\})$ is given by the number of basis elements in $\{A_B\}$ and the number of elements in $\{B_A\}$. This is not a simple addition $\chi(\{A_B\}) + \chi(\{B_A\})$, however, as it is possible for the same basis element to be present in both sets. By subtracting the overlapping elements we have $\chi[\mathcal{B}(\{A\} \cup \{B\})] = \chi(\{A_B\}) + \chi(\{B_A\}) - \chi(\{A_B\} \cap \{B_A\})$. The maximum number of possible elements in this basis is given by $\chi(\{A\}) + \chi(\{B\})$, and the minimum possible number by either $\chi(\{A\})$ or $\chi(\{B\})$, depending on which is the lesser.
3. The Intersection of Two τ-Mappings
The basis for an intersection of two τ-mappings is more complex than that for a union of two τ-mappings because each τ-mapping is represented as a union of erosions. Therefore, the intersection of τ-mappings involves the coupling of intersection with union, and not union with union, as was the case earlier. In contrast, if each τ-mapping is represented as an intersection of dilations (which is the dual representation), then the situation is reversed and the case for the union is more complex than that for the intersection. We proceed by first establishing the following theorem.
Theorem II.3. The basis for the τ-mapping $\Psi_{\{A\}} \cap \Psi_{\{B\}}$ is given by
$$\mathcal{B}(\Psi_{\{A\}} \cap \Psi_{\{B\}}) = \mathcal{B}(\{C\}),$$
where $\{C\} = \{A \cup B : A \in \{A\}, B \in \{B\}\}$.
The problem now reduces to one of finding the basis elements of the set $\{C\}$. Illustrated in Fig. 2d is a set $\{C\}$ formed as defined in Theorem II.3, using the bases $\{A\}$ and $\{B\}$ that are shown in Figs. 2a and 2b. It is evident from this example that the set $\{C\}$ does not in general form a basis, as it can contain redundant elements. The basis $\mathcal{B}(\{C\})$ is shown in Fig. 2e. A simple approach to the problem is to first compute the set $\{C\}$ and then to sort this set so as to retain only the basis elements. However, as the number of elements in $\{C\}$ is given by $\chi(\{A\}) \times \chi(\{B\})$, this algorithm becomes intractable as the sizes of the bases $\{A\}$ and $\{B\}$ increase. In many cases a high proportion of these elements will be redundant, and so a lot of time is wasted by storing and sorting such elements. Furthermore, the method does not utilize the fact that both $\{A\}$ and $\{B\}$ are bases.
We propose instead the following approach. The constraint for $C = A \cup B$ to be an element of the basis $\mathcal{B}(\{C\})$ is that, for the two elements $A \in \{A\}$, $B \in \{B\}$, $(A \cup B) \not\supset C$ for all $C \in \{C\}$ $\Leftrightarrow$ $(A \cup B) \not\supset (A' \cup B')$ for all $A' \in \{A\}$, $B' \in \{B\}$ $\Leftrightarrow$ $[(A \cup B) \not\supset A'$ or $(A \cup B) \not\supset B'$ or $A \cup B = A' \cup B']$ for all $A' \in \{A\}$, $B' \in \{B\}$. Here $\supset$ denotes a proper superset. The following basis algorithm is derived from these three constraints.
Consider a given $A \in \{A\}$ and a set $\{C_A\} = \{A \cup B : B \in \{B\}\}$ of potential elements of the basis $\mathcal{B}(\{C\})$. For each $C_A \in \{C_A\}$, the sets $\{A_{C_A}\} = \{A \in \{A\} : C_A \supset A\}$ and $\{B_{C_A}\} = \{B \in \{B\} : C_A \supset B\}$ are computed. If $\{A_{C_A}\}$ is empty, then $C_A \not\supset A$ for all $A \in \{A\}$, and in such a case $C_A$ is an element of the basis $\mathcal{B}(\{C\})$. Similarly, if the set $\{B_{C_A}\}$ is empty, then $C_A \not\supset B$ for all $B \in \{B\}$, and it is therefore a basis element. If both $\{A_{C_A}\}$ and $\{B_{C_A}\}$ are nonempty, then $C_A$ is an element of the basis $\mathcal{B}(\{C\})$ if and only if $C_A = A_{C_A} \cup B_{C_A}$, for all $A_{C_A} \in \{A_{C_A}\}$, $B_{C_A} \in \{B_{C_A}\}$.
This process is repeated for each $C_A \in \{C_A\}$ and for each $A \in \{A\}$ to derive the complete basis $\mathcal{B}(\{C\})$. This process is illustrated in Fig. 3. The basis $\{A\}$ is that from Fig. 2a and $\{B\}$ is that from Fig. 2b. In Fig. 3a is one of the elements $A \in \{A\}$, and in Fig. 3b is the set $\{C_A\} = \{A \cup B : B \in \{B\}\}$ of potential basis elements. We take from this set the last element $C_A = A \cup B$, as shown in Fig. 3c, to test whether or not it is an element of the basis $\mathcal{B}(\{C\})$. The sets $\{A_{C_A}\}$ and $\{B_{C_A}\}$ are shown in Figs. 3d and 3e, respectively. As these sets are nonempty, the only way that $C_A$ can be a basis element is if $C_A = A_{C_A} \cup B_{C_A}$ for both $A_{C_A} \in \{A_{C_A}\}$. In this case it is not true, and therefore $C_A$ is not a basis element. A second element of $\{C_A\}$ is shown in Fig. 3f, and the sets $\{A_{C_A}\}$ and $\{B_{C_A}\}$ in Figs. 3g and 3h, respectively. Once again these sets are nonempty, but this time $C_A = A_{C_A} \cup B_{C_A}$, and so it is an element of the basis $\mathcal{B}(\{C\})$.
FIGURE 3. Example bases illustrating the basis algorithm for the intersection of two τ-mappings. See text for details.
Although more complex than the algorithm suggested initially, the preceding algorithm need only handle a maximum of $2 \times \chi(\{B\}) + \chi(\{A\})$ sets at any one time, and only the basis elements are stored. When the bases $\{A\}$ and $\{B\}$ are large, this represents a substantial increase in efficiency.
In terms of cardinality, the basis for the intersection of two τ-mappings presents an interesting property. The maximum number of basis elements in $\mathcal{B}(\{C\})$ is given by $\chi(\{A\}) \times \chi(\{B\})$, and this occurs for the case when the set $\{C\}$ contains no redundant elements. Essentially, then, the cardinality of $\mathcal{B}(\{C\})$ diverges with increasing $\chi(\{A\})$ and $\chi(\{B\})$. However, there are cases where the cardinality of $\mathcal{B}(\{C\})$ is actually less than that of either $\{A\}$ or $\{B\}$. Consider when two elements $A \in \{A\}$ and $B \in \{B\}$ satisfy $A \supseteq B$. The set $C = A \cup B = A$, and so for any $B' \in \{B\}$, the union $C' = A \cup B' \supseteq C$. Therefore, we need not check any other union $C' = A \cup B'$, because these will all be redundant. Furthermore, because $\{A\}$ is a basis, $A \not\supset A'$ for all $A' \in \{A\}$, and so $C = A \not\supset A' \cup B'$ for any combination of $A'$ and $B'$. The set $C = A \cup B = A$ cannot be redundant and so is an element of the basis $\mathcal{B}(\{C\})$. Alternatively, if $B \supseteq A$, then $B \in \mathcal{B}(\{C\})$ and we need not check any other associations of $B$ with the elements of $\{A\}$. It is for this reason that the cardinality of $\mathcal{B}(\{C\})$ can be substantially reduced. For the example in Fig. 2, there are three elements of the basis $\{B\}$ where $B \supseteq A$ for some $A \in \{A\}$, and all three of these are elements of the final basis $\mathcal{B}(\{C\})$. By using such basis elements, it is easy to construct two bases $\{A\}$ and $\{B\}$ where the cardinality of $\mathcal{B}(\{C\})$ is less than that of either $\{A\}$ or $\{B\}$. This will be looked at again in Section III,C in relation to the basis for closing.
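The union and intersection rules of Theorems II.2 and II.3 can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: structuring elements are assumed to be frozensets of integer tuples, the function names are hypothetical, and `minimal` is the brute-force "form everything, then discard redundant elements" approach that the text describes as slower, included only for comparison. The two example bases are invented and are not those of Fig. 2.

```python
def minimal(sets):
    """Brute-force reduction: keep the sets with no proper subset present."""
    sets = set(sets)
    return {S for S in sets if not any(T < S for T in sets)}


def union_basis(basis_a, basis_b):
    """Theorem II.2: drop A if it properly contains some B, and vice versa."""
    a_kept = {A for A in basis_a if not any(B < A for B in basis_b)}
    b_kept = {B for B in basis_b if not any(A < B for A in basis_a)}
    return a_kept | b_kept


def intersection_basis(basis_a, basis_b):
    """Theorem II.3 using the element-by-element test described in the text."""
    result = set()
    for A in basis_a:
        for B in basis_b:
            C = A | B
            smaller_a = [A2 for A2 in basis_a if A2 < C]  # the set {A_{C_A}}
            smaller_b = [B2 for B2 in basis_b if B2 < C]  # the set {B_{C_A}}
            if (not smaller_a or not smaller_b or
                    all(A2 | B2 == C for A2 in smaller_a for B2 in smaller_b)):
                result.add(C)
    return result


def fs(*points):
    """Shorthand for building a structuring element."""
    return frozenset(points)


# Two hypothetical bases (each is an antichain under inclusion).
basis_a = {fs((0, 0), (1, 0)), fs((0, 1))}
basis_b = {fs((0, 0)), fs((0, 1), (1, 1)), fs((2, 2))}

union_result = union_basis(basis_a, basis_b)
intersection_result = intersection_basis(basis_a, basis_b)
```

Only the candidate $C = A \cup B$ under test and the two comparison lists are held at any one time, which is the storage advantage the text attributes to this approach over sorting the full set $\{C\}$.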
III. APPLYING THE GENERAL BASIS ALGORITHM
The results for the translation and combination of τ-mappings form the basic tools of the general basis algorithm and can now be applied to compute bases for τ-mappings. As noted in Section II,B, the approach will not work for all τ-mappings. It requires that they be formed as a combination of τ-mappings, for each of which the basis is known, under the algebra of translation, union, and intersection. In this section some examples are given to illustrate how the general basis algorithm can be applied to compute bases for a range of such τ-mappings.
A. Union and Intersection of τ-Mappings
An annular opening is defined in Serra (1988) to be a τ-mapping $\gamma(X) = X \cap (X \oplus A)$, where the structuring element $A$ is a symmetric set (i.e., $A = \check{A}$) that does not include the origin. As $\gamma(X)$ is anti-extensive, idempotent, and increasing, it is by definition an opening. We may express the annular opening as $\gamma(X) = \mathrm{id}(X) \cap (X \oplus A)$. This is an intersection of two τ-mappings, for each of which the basis is known, and so Theorem II.3 may be used to compute its basis. The basis for the identity mapping $\mathrm{id}(X) = X$ is the origin, and the basis for the dilation $X \oplus A$ is given by Eq. (I.11) as $\mathcal{B}(X \to X \oplus A) = \{\{-a\} : a \in A\}$. Using these two bases, Theorem II.3 yields $\mathcal{B}(X \to \mathrm{id}(X) \cap (X \oplus A)) = \{\{a, 0\} : a \in A\}$. (Note that, as pointed out in Section I,B,3, a translation-invariant opening can be represented as a union of elementary openings, and indeed it is possible to show that the preceding basis is identical to that which represents a τ-mapping $\Psi(X) = \bigcup_B X \circ B$.) For example, in Fig. 4a is a symmetric element $A$, and the basis $\mathcal{B}(X \to \mathrm{id}(X) \cap (X \oplus A))$ consists of the eight elements shown in Fig. 4b.
FIGURE 4. Basis for an annular opening. (a) Structuring element $A$. (b) Resultant basis.
The theorems for the union and intersection of two τ-mappings are easily extended to an arbitrary number of τ-mappings. Consider, for example, the computation of the basis for the union of three τ-mappings, $\Psi_{\{A\}} \cup \Psi_{\{B\}} \cup \Psi_{\{C\}}$. The basis for $\Psi_{\{A\}} \cup \Psi_{\{B\}}$ can be computed using Theorem II.2 and is given by $\mathcal{B}(\Psi_{\{A\}} \cup \Psi_{\{B\}}) = \{A_B\} \cup \{B_A\}$. Denoting this basis by $\{D\}$, $\Psi_{\{A\}} \cup \Psi_{\{B\}} = \Psi_{\{D\}}$ and $\Psi_{\{A\}} \cup \Psi_{\{B\}} \cup \Psi_{\{C\}} = \Psi_{\{D\}} \cup \Psi_{\{C\}}$. The basis for this can be obtained using Theorem II.2 again and is given by $\mathcal{B}(\Psi_{\{D\}} \cup \Psi_{\{C\}}) = \mathcal{B}(\Psi_{\{A\}} \cup \Psi_{\{B\}} \cup \Psi_{\{C\}}) = \{D_C\} \cup \{C_D\}$. The basis for the union of an arbitrary number of τ-mappings can be obtained by such a process of iteration. In the same way, Theorem II.3 can be used to compute the basis for the intersection of an arbitrary number of τ-mappings. By combining Theorem II.2 with Theorem II.3, the basis for any combination of τ-mappings under intersection and union can be obtained. For example, the basis for $(\Psi_{\{A\}} \cup \Psi_{\{B\}} \cup \Psi_{\{C\}}) \cap \Psi_{\{D\}}$ can be computed by combining the basis for $\Psi_{\{A\}} \cup \Psi_{\{B\}} \cup \Psi_{\{C\}}$, obtained using Theorem II.2 twice, with the basis $\{D\}$ in Theorem II.3.
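As a hedged illustration of this section, the sketch below computes the annular-opening basis and folds the two-argument rules over several bases. Several things are assumptions: structuring elements are frozensets of integer tuples, the 8-neighbour ring is only a stand-in for the element of Fig. 4a (which is not recoverable here), the function names are hypothetical, and for brevity the union and intersection are computed by brute-force reduction to minimal elements rather than by the more economical tests of Section II.

```python
from functools import reduce


def minimal(sets):
    """Keep only the sets that have no proper subset in the collection."""
    sets = set(sets)
    return {S for S in sets if not any(T < S for T in sets)}


def union_basis(basis_a, basis_b):
    """Basis of the union of two mappings (brute-force form of Theorem II.2)."""
    return minimal(basis_a | basis_b)


def intersection_basis(basis_a, basis_b):
    """Basis of the intersection (brute-force form of Theorem II.3)."""
    return minimal({A | B for A in basis_a for B in basis_b})


def annular_opening_basis(A):
    """Basis of X -> id(X) intersected with (X dilated by A)."""
    origin = (0,) * len(next(iter(A)))
    identity_basis = {frozenset({origin})}
    # Basis of the dilation, following Eq. (I.11): the singletons {-a}.
    dilation_basis = {frozenset({tuple(-c for c in a)}) for a in A}
    return intersection_basis(identity_basis, dilation_basis)


def basis_of_union(*bases):
    """Iterate the two-argument rule over any number of bases."""
    return reduce(union_basis, bases)


def basis_of_intersection(*bases):
    """Iterate the two-argument rule over any number of bases."""
    return reduce(intersection_basis, bases)


# Assumed symmetric structuring element: the 8 neighbours of the origin.
ring = frozenset((x, y) for x in (-1, 0, 1) for y in (-1, 0, 1) if (x, y) != (0, 0))
annular = annular_opening_basis(ring)
```

For a symmetric element without the origin, each basis element pairs the origin with one point of the element, in line with the expression $\{\{a, 0\} : a \in A\}$ above.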
B. Dilation and Erosion
An application of the general basis algorithm is to the simple dilated τ-mapping
$$\Psi_{\{A\}} \oplus B = \bigcup_{b \in B} (\Psi_{\{A\}})_b. \tag{III.1}$$
The basis for this τ-mapping may be obtained by applying Theorem II.1 to each translation $(\Psi_{\{A\}})_b$, and then Theorem II.2 to the union $\bigcup_{b \in B} (\Psi_{\{A\}})_b$. For example, consider the basis $\{A\}$ to be a single structuring element $A$. The preceding equation becomes $\Psi_{\{A\}}(X) \oplus B = (X \ominus A) \oplus B = \bigcup_{b \in B} (X \ominus A)_b$. The basis for each translation $(X \ominus A)_b$ is given by Theorem II.1 as the single set $A_{-b}$. The basis for the union $\bigcup_{b \in B} X \ominus (A_{-b})$ is then given by Theorem II.2 as $\mathcal{B}(X \to \bigcup_{b \in B} X \ominus (A_{-b})) = \{A_{-b} : b \in B\}$. When $B = A$, $\Psi_{\{A\}} \oplus B$ becomes the opening $X \circ A$ and we have the basis $\mathcal{B}(X \to \bigcup_{a \in A} X \ominus (A_{-a})) = \{A_{-a} : a \in A\}$ that was previously given for the basis for opening in Eq. (I.12).
The counterpart to Eq. (III.1) is the eroded τ-mapping
$$\Psi_{\{A\}} \ominus B = \bigcap_{b \in B} (\Psi_{\{A\}})_{-b}. \tag{III.2}$$
The basis for this τ-mapping may be obtained by applying Theorem II.1 to each translation $(\Psi_{\{A\}})_{-b}$ and then Theorem II.3 to the intersection $\bigcap_{b \in B} (\Psi_{\{A\}})_{-b}$. For example, consider $\{A\}$ to be a set of singletons $\{A\} = \{\{-b\} : b \in B\}$. This set defines the basis for a dilation $X \oplus B$, and so $\Psi_{\{A\}}(X) = X \oplus B$ and $\Psi_{\{A\}}(X) \ominus B = (X \oplus B) \ominus B = X \bullet B$, which is a closing. Consider, for example, the “L”-shaped structuring element $B = \{(0, 0), (1, 0), (0, 1)\}$ as shown in Fig. 5a. The basis $\{A\} = \{\{-b\} : b \in B\}$ is as shown in Fig. 5b, and the basis for the translation $(\Psi_{\{A\}})_{-b}$ is shown in Fig. 5b for $b = (0, 0)$, in Fig. 5c for $b = (1, 0)$, and in Fig. 5d for $b = (0, 1)$. Theorem II.3 is then used to compute the basis for the intersection $\bigcap_{b \in B} (\Psi_{\{A\}})_{-b}$, taking two bases at a time. In Fig. 5e is the basis for the intersection $(\Psi_{\{A\}})_{-(0,0)} \cap (\Psi_{\{A\}})_{-(1,0)}$, using two elements of $B$. This resultant basis is then combined with the basis for the translation $(\Psi_{\{A\}})_{-(0,1)}$, using the last element of $B$. The final basis, illustrated in Fig. 5f, is the basis $\mathcal{B}(X \to X \bullet B)$.
FIGURE 5. Procedure for computing the basis for a closing. See text for details.
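The dilated and eroded τ-mappings of Eqs. (III.1) and (III.2) can be sketched in a few lines. This is a minimal sketch under stated assumptions: structuring elements are frozensets of integer tuples, the function names are hypothetical, and the union and intersection steps use brute-force reduction to minimal elements in place of the tests of Theorems II.2 and II.3.

```python
from functools import reduce


def shift(A, t):
    """Translate the point set A by the vector t."""
    return frozenset(tuple(p + d for p, d in zip(a, t)) for a in A)


def minimal(sets):
    """Keep only the sets that have no proper subset in the collection."""
    sets = set(sets)
    return {S for S in sets if not any(T < S for T in sets)}


def intersection_basis(basis_a, basis_b):
    return minimal({A | B for A in basis_a for B in basis_b})


def dilated_mapping_basis(basis, B):
    """Eq. (III.1): union over b of (Psi)_b, whose basis is {A_{-b}}."""
    parts = [{shift(A, tuple(-c for c in b)) for A in basis} for b in B]
    return minimal(set().union(*parts))


def eroded_mapping_basis(basis, B):
    """Eq. (III.2): intersection over b of (Psi)_{-b}, whose basis is {A_b}."""
    parts = [{shift(A, b) for A in basis} for b in B]
    return reduce(intersection_basis, parts)


def opening_basis(B):
    """Opening by B: the erosion by B (basis {B}) dilated by B."""
    return dilated_mapping_basis({frozenset(B)}, B)


def closing_basis(B):
    """Closing by B: the dilation by B (basis {{-b}}) eroded by B."""
    dilation = {frozenset({tuple(-c for c in b)}) for b in B}
    return eroded_mapping_basis(dilation, B)


L_shape = frozenset({(0, 0), (1, 0), (0, 1)})   # the element B of Fig. 5a
closing_L = closing_basis(L_shape)
opening_L = opening_basis(L_shape)
```

For the “L”-shaped element this sketch yields a closing basis consisting of the origin together with eight three-point sets, which agrees with the count of nine basis elements quoted for such closings in Section III,C.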
C. General Opening and Closing
More general forms of opening and closing are given by Eqs. (I.8) and (I.9). The bases for both these set mappings can be obtained using the general basis algorithm, but of particular interest is the basis for the closing $\varphi(X) = \bigcap_B X \bullet B$. Illustrated in Fig. 6a is a series of four “L”-shaped structuring elements (see Song and Delp, 1990, for more details on the use of these structuring elements). A basis $\mathcal{B}(X \to X \bullet B)$ for each structuring element $B$ may be obtained as directed in the previous section. The basis for the intersection $\bigcap_B X \bullet B$ may then be obtained using Theorem II.3 and is as shown in Fig. 6b. Note that there are only eight elements in the basis for the intersection of closings, and this is fewer than for any one of the component closings, each of which has nine basis elements. As pointed out in Section II, if for two τ-mappings $\Psi_{\{A\}}$ and $\Psi_{\{B\}}$ there exist an $A$ and a $B$ where $A \supseteq B$, then the number of basis elements in the basis $\mathcal{B}(\Psi_{\{A\}} \cap \Psi_{\{B\}})$ can be substantially reduced. At times it can be less than the number of basis elements in either $\{A\}$ or $\{B\}$. When more than two τ-mappings are combined under intersection, the likelihood of this happening increases.
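D. Cascaded τ-Mappings
A cascade of two τ-mappings $\Psi_{\{A\}}$ and $\Psi_{\{B\}}$ is itself a τ-mapping, and its basis may be derived using the general basis algorithm. Using the definitions of dilation and erosion, we may express a cascade $\Psi_{\{B\}}(\Psi_{\{A\}})$ as
$$\Psi_{\{B\}}(\Psi_{\{A\}}) = \bigcup_{B \in \{B\}} \Psi_{\{A\}} \ominus B = \bigcup_{B \in \{B\}} \bigcap_{b \in B} (\Psi_{\{A\}})_{-b}. \tag{III.3}$$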
FIGURE 6. Basis for a general closing. (a) Four “L”-type structuring elements. (b) Basis for closing using these four structuring elements.
It is clear, then, that the basis for this τ-mapping can be obtained through a process of translating, uniting, and intersecting τ-mappings, as supported by the general basis algorithm. For example, consider the simple structuring element $C = \{(0, 0), (1, 0)\}$ shown in Fig. 7a. The basis for opening $\mathcal{B}(X \circ C)$ is shown in Fig. 7b, and the basis for closing $\mathcal{B}(X \bullet C)$ in Fig. 7c. If $\{A\} = \mathcal{B}(X \circ C)$ and $\{B\} = \mathcal{B}(X \bullet C)$, then Eq. (III.3) corresponds to the close-open τ-mapping $(X \circ C) \bullet C$. Figures 7d to 7h illustrate the process whereby the basis for this τ-mapping can be computed. In Fig. 7d is the basis for the τ-mapping $\bigcap_{b \in B_1} (\Psi_{\{A\}})_{-b}$, where $B_1$ is the first element of $\{B\}$. As $B_1$ is the origin, the result is simply the basis $\{A\}$. In Fig. 7e is the basis for $(\Psi_{\{A\}})_{-b}$, where $b = (-1, 0)$ is one element of $B_2 \in \{B\}$, and in Fig. 7f is the basis for $(\Psi_{\{A\}})_{-b}$, where $b = (1, 0)$ is the other element of $B_2 \in \{B\}$. The basis for the intersection $\bigcap_{b \in B_2} (\Psi_{\{A\}})_{-b}$ is shown in Fig. 7g. The basis for the union $[\bigcap_{b \in B_1} (\Psi_{\{A\}})_{-b}] \cup [\bigcap_{b \in B_2} (\Psi_{\{A\}})_{-b}]$ forms the basis for the complete close-open τ-mapping and is as shown in Fig. 7h. Conversely, if $\{A\} = \mathcal{B}(X \bullet C)$ and $\{B\} = \mathcal{B}(X \circ C)$, then the cascaded representation in Eq. (III.3) corresponds to the open-close τ-mapping. The basis is obtained using the same procedure and is as shown in Fig. 7i.
FIGURE 7. Procedure for computing the bases for the cascaded τ-mappings $(X \circ C) \bullet C$ and $(X \bullet C) \circ C$. See text for details.
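The cascade procedure just described (translate the inner basis for each point of an outer basis element, intersect, then unite over the outer elements) can be sketched as follows. The representation (frozensets of integer tuples) and the function names are assumptions made for illustration, and the reduction to a basis is done by brute force. The opening and closing bases for $C = \{(0, 0), (1, 0)\}$ are written out by hand from the description in the text.

```python
from functools import reduce


def shift(A, t):
    """Translate the point set A by the vector t."""
    return frozenset(tuple(p + d for p, d in zip(a, t)) for a in A)


def minimal(sets):
    """Keep only the sets that have no proper subset in the collection."""
    sets = set(sets)
    return {S for S in sets if not any(T < S for T in sets)}


def intersection_basis(basis_a, basis_b):
    return minimal({A | B for A in basis_a for B in basis_b})


def cascade_basis(inner, outer):
    """Basis of Psi_outer(Psi_inner): for each outer element B, intersect the
    translates (Psi_inner)_{-b}, b in B, then unite the results."""
    result = set()
    for B in outer:
        # By Theorem II.1 the basis of (Psi_inner)_{-b} is the inner basis shifted by +b.
        translates = [{shift(A, b) for A in inner} for b in B]
        result |= reduce(intersection_basis, translates)
    return minimal(result)


def fs(*points):
    return frozenset(points)


# Bases for opening and closing by C = {(0,0), (1,0)}, as described in the text.
opening_C = {fs((0, 0), (1, 0)), fs((-1, 0), (0, 0))}
closing_C = {fs((0, 0)), fs((-1, 0), (1, 0))}

opened_then_closed = cascade_basis(inner=opening_C, outer=closing_C)  # (X o C) . C
closed_then_opened = cascade_basis(inner=closing_C, outer=opening_C)  # (X . C) o C
```

Because the result of one cascade is again a basis, it can be passed back in as `inner` to build longer cascades.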
Using the basis representation, direct comparisons can be made between τ-mappings that would otherwise be considered disparate. In the preceding example, both $(X \circ C) \bullet C$ and $(X \bullet C) \circ C$ are represented as a union of erosions, and the difference between these two τ-mappings is described entirely by the differences in their bases. This unified description can reveal relationships between τ-mappings that would be difficult to establish theoretically. For example, we may conclude that $(X \circ C) \bullet C \subseteq (X \bullet C) \circ C$, as every element of the basis $\mathcal{B}((X \circ C) \bullet C)$ in Fig. 7h is a superset of some element of the basis $\mathcal{B}((X \bullet C) \circ C)$ in Fig. 7i. Note that such a result depends on the structuring element $C$ and does not hold for an arbitrary structuring element.
The bases $\{A\}$ and $\{B\}$ are not restricted to the form of simple open and close τ-mappings and can represent any pair of τ-mappings. For example, they can correspond to the generalized open and close τ-mappings as defined by Eqs. (I.8) and (I.9), or each to a further cascade of τ-mappings. In this way, the basis for an arbitrary number of iterations of τ-mappings can be computed, as illustrated by the following example. In Fig. 8a is the basis $\mathcal{B}(\Psi_{r,W})$ for a one-dimensional median filter $\Psi_{r,W}$ defined with a window $W = \{(-1, 0), (0, 0), (1, 0)\}$ and a rank $r = 2$. If $\{A\} = \mathcal{B}(\Psi_{r,W})$ and $\{B\} = \mathcal{B}(\Psi_{r,W})$, then the cascaded τ-mapping in Eq. (III.3) represents two passes of this median filter. The basis is given by the set of elements shown in Fig. 8b. If the basis $\{A\}$ is then defined as this resultant basis and $\{B\}$ remains as $\mathcal{B}(\Psi_{r,W})$, then the cascaded τ-mapping represents three passes of the median filter. The basis is as illustrated in Fig. 8c. If this process were to be repeated again, then the resulting basis would represent four passes of the median filter, and so on. In such a way the basis representing an arbitrary number of iterations of the median filter, or of any other τ-mapping, can be computed.
FIGURE 8. Bases for iterations of a 1-D median filter. (a) Basis for one iteration. (b) Basis for two iterations. (c) Basis for three iterations.
These bases may be compared with one another, with the bases derived earlier for $(X \circ C) \bullet C$ and $(X \bullet C) \circ C$, or with the basis for any other τ-mapping. For example, it is apparent from the bases shown that $(X \circ C) \bullet C \subseteq \Psi^n_{r,W}(X) \subseteq (X \bullet C) \circ C$, where $\Psi^n_{r,W}$ denotes $n$ iterations of $\Psi_{r,W}$. This confirms the result in Maragos and Schafer (1987b) that iterations of a one-dimensional median filter are bounded by the open-close and close-open τ-mappings. Such a direct and visual comparison is only possible with the basis representation.
E. The Dual Basis
Any set mapping $\Psi$ has a dual $\Psi^*$ given by $\Psi^*(X) = [\Psi(X^c)]^c$. For example, the dual of the dilation $\Psi(X) = X \oplus B$ is the erosion $\Psi^*(X) = X \ominus \check{B}$, and the dual of the opening $\Psi(X) = X \circ B$ is the closing $\Psi^*(X) = X \bullet \check{B}$. The dual of the basis representation $\Psi(X) = \bigcup_A X \ominus A$ is an intersection of dilations $\Psi^*(X) = \bigcap_A X \oplus \check{A}$. Alternatively, as $\Psi^*(X)$ is translation-invariant and increasing, it has its own basis representation, given by $\Psi^*(X) = \bigcup_{A^*} X \ominus A^*$, where the set $\{A^*\}$ forms the dual basis $\mathcal{B}(\Psi^*)$. The duality between $\Psi$ and $\Psi^*$, in terms of bases, is captured by the two equations
$$\Psi(X) = \bigcup_{A \in \mathcal{B}(\Psi)} X \ominus A = \bigcap_{A^* \in \mathcal{B}(\Psi^*)} X \oplus \check{A}^*,$$
$$\Psi^*(X) = \bigcup_{A^* \in \mathcal{B}(\Psi^*)} X \ominus A^* = \bigcap_{A \in \mathcal{B}(\Psi)} X \oplus \check{A}. \tag{III.4}$$
As $\Psi^{**} = \Psi$, the dual basis of the dual mapping $\Psi^*$ is the original basis $\mathcal{B}(\Psi)$.
In the work to follow it will be necessary to compute the dual basis $\{A^*\}$ from any given basis $\{A\}$, and vice versa. Such a process can be used, for example, to compute the basis for closing from that for opening, or the basis for the open-close from that for the close-open. It will now be demonstrated that the general basis algorithm can be used to compute the dual basis $\mathcal{B}(\Psi^*)$ from any given basis $\mathcal{B}(\Psi)$. We require that the dual τ-mapping $\Psi^*$ be expressed in terms of the basis $\mathcal{B}(\Psi)$, and so we will represent $\Psi^*$ using the dual representation $\Psi^*(X) = \bigcap \{X \oplus \check{A} : A \in \mathcal{B}(\Psi)\}$ in Eq. (III.4). An intersection of dilations is a τ-mapping, and it has a basis representation that may be computed using the general basis algorithm. For each dilation $X \oplus \check{A}$, there corresponds a basis $\mathcal{B}(X \to X \oplus \check{A}) = \{\{a\} : a \in A\}$. The basis for the intersection $\bigcap \{X \oplus \check{A} : A \in \mathcal{B}(\Psi)\}$ may be computed using Theorem II.3, using two input bases $\mathcal{B}(X \to X \oplus \check{A})$ at a time. The dual τ-mapping $\Psi^*$ then becomes $\Psi^*(X) = \bigcup \{X \ominus B : B \in \mathcal{B}(X \to \bigcap_A X \oplus \check{A})\}$. The basis $\mathcal{B}(X \to \bigcap_A X \oplus \check{A})$ may therefore be used to represent $\Psi^*$, and, as the basis representation is unique, we may conclude that $\mathcal{B}(X \to \bigcap_A X \oplus \check{A}) = \mathcal{B}(\Psi^*)$.
In Fig. 9a is illustrated a basis $\{A\}$ of four structuring elements. The dual basis $\mathcal{B}(\Psi^*)$ is given by the basis $\mathcal{B}(X \to \bigcap_A X \oplus \check{A})$. For each $A \in \{A\}$, there corresponds a dilation $X \oplus \check{A}$ which has a basis $\mathcal{B}(X \to X \oplus \check{A}) = \{\{a\} : a \in A\}$. The process of intersecting the four τ-mappings $X \oplus \check{A}$ is accomplished using Theorem II.3, and the result is the basis $\mathcal{B}(\Psi^*)$ that is shown in Fig. 9b. Note that, as the basis $\{A\}$ in Fig. 9a is that for an opening $\Psi(X) = X \circ B$ (where $B$ can be any one of the four structuring elements shown), the dual basis in Fig. 9b corresponds to that for a closing $\Psi^*(X) = X \bullet \check{B}$. Finally, as $\Psi^{**} = \Psi$, if the basis in Fig. 9b is used as the basis $\{A\}$, then the resultant dual basis $\mathcal{B}(\Psi^*)$ is the original basis shown in Fig. 9a.
FIGURE 9. Dual basis. (a) Original basis $\{A\}$. (b) Dual basis $\{A^*\}$.
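The dual-basis procedure of this section can be sketched briefly: each basis element $A$ contributes the singleton basis $\{\{a\} : a \in A\}$, and these are intersected two at a time. The representation (frozensets of integer tuples), the function names, and the brute-force reduction are assumptions made for illustration. The example bases are those for $C = \{(0, 0), (1, 0)\}$ and for the three-point median filter of Section III,D, not the four-element basis of Fig. 9.

```python
from functools import reduce


def minimal(sets):
    """Keep only the sets that have no proper subset in the collection."""
    sets = set(sets)
    return {S for S in sets if not any(T < S for T in sets)}


def intersection_basis(basis_a, basis_b):
    """Brute-force form of Theorem II.3."""
    return minimal({A | B for A in basis_a for B in basis_b})


def dual_basis(basis):
    """Dual basis: intersect, over all A in the basis, the mappings whose
    bases are the singletons {{a} : a in A}."""
    singleton_bases = [{frozenset({a}) for a in A} for A in basis]
    return reduce(intersection_basis, singleton_bases)


def fs(*points):
    return frozenset(points)


opening_C = {fs((0, 0), (1, 0)), fs((-1, 0), (0, 0))}        # opening by C
median_3 = {fs((-1, 0), (0, 0)), fs((0, 0), (1, 0)), fs((-1, 0), (1, 0))}

dual_of_opening = dual_basis(opening_C)
dual_of_median = dual_basis(median_3)
```

Applying `dual_basis` twice should return the original basis, reflecting $\Psi^{**} = \Psi$, and a basis that equals its own dual represents a self-dual τ-mapping.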
IV. FILTERING PROPERTIES AND THE BASIS REPRESENTATION
A set mapping can exhibit many different properties. Most of these have been introduced in Section I (for example, increasingness, translation-invariance, and idempotence). Equipping a set mapping with certain properties allows one to design set mappings for particular filtering tasks or for particular types of images. The basis constitutes a complete description of a set mapping and it contains all the information necessary to distinguish one set mapping from any other. Filtering properties such as extensivity, anti-extensivity, and idempotence are manifest as sometimes complex interrelations between basis elements. Slight changes to the basis can result in the loss or gain of these properties.
The basis representation is a common platform from which the study, design, and implementation of τ-mappings can be approached. Dougherty (1992a, b) and Loce (1993) have studied in detail the design of optimal τ-mappings by using the basis representation. In the work of Dougherty (1992a), design constraints were placed on the basis so as to produce an optimal τ-mapping that is also an open. In this section, we will introduce further design constraints by demonstrating how some important filtering properties occur in the basis and what relationships between basis elements cause these properties to be present. We will begin with well-known results for extensive and anti-extensive properties and then proceed with more complex results for over-filters, under-filters, and self-duality. Following this, the results can then be combined to give design constraints for τ-mappings such as opens and closes.
A. Extensive and Anti-extensive τ-Mappings
Among the simplest of filtering properties are the extensive and anti-extensive properties. Recall from Section I,B,1 that a τ-mapping $\Psi$ is called extensive when $\Psi(X) \supseteq X$, for all $X \in \mathcal{P}(E)$. Examples of extensive τ-mappings include dilation by a structuring element which contains the origin and any closing. In terms of the basis representation, a τ-mapping is extensive if its basis contains an element which is equal to the origin. For example, the bases for closing in Figs. 5f and 6b in Section III each contain the origin as a basis element, as closing is an extensive τ-mapping. This result is formalized in Property IV.1.
Property IV.1. A τ-mapping $\Psi_{\{A\}}$ is extensive iff the origin $\{0\} \in \{A\}$.
A τ-mapping is called anti-extensive when $\Psi(X) \subseteq X$, for all $X \in \mathcal{P}(E)$. Examples include erosion (when the origin is an element of the structuring element) and any opening. A τ-mapping is anti-extensive if every element of its basis contains the origin. For example, the origin is contained in each element of the basis for the annular opening in Fig. 4b, as all openings are anti-extensive. This result is formalized next.
Property IV.2. A τ-mapping $\Psi_{\{A\}}$ is anti-extensive iff $0 \in A$, for all $A \in \{A\}$.
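Properties IV.1 and IV.2 translate directly into two membership tests. The sketch below assumes, for illustration only, that structuring elements are frozensets of integer tuples and that a basis is a set of them; the function names are hypothetical. The example bases are those for opening and closing by $C = \{(0, 0), (1, 0)\}$ and for the three-point median filter.

```python
def is_extensive(basis, origin=(0, 0)):
    """Property IV.1: the singleton holding the origin is a basis element."""
    return frozenset({origin}) in basis


def is_anti_extensive(basis, origin=(0, 0)):
    """Property IV.2: every basis element contains the origin."""
    return all(origin in A for A in basis)


def fs(*points):
    return frozenset(points)


opening_C = {fs((0, 0), (1, 0)), fs((-1, 0), (0, 0))}
closing_C = {fs((0, 0)), fs((-1, 0), (1, 0))}
median_3 = {fs((-1, 0), (0, 0)), fs((0, 0), (1, 0)), fs((-1, 0), (1, 0))}

properties = {
    name: (is_extensive(b), is_anti_extensive(b))
    for name, b in [("opening", opening_C), ("closing", closing_C), ("median", median_3)]
}
```

The closing basis passes the first test and the opening basis the second, while the median basis passes neither.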
Properties IV.1 and IV.2 can be used as design constraints to design and implement τ-mappings. For example, any extensive τ-mapping can be designed by constructing a basis which satisfies Property IV.1. It can then be implemented as a union of erosions using this basis. Similarly, any anti-extensive τ-mapping can be designed as a basis satisfying Property IV.2 and implemented as a union of erosions using this basis. In the following, we look at filtering properties that are manifest as more complex interrelations between basis elements.
B. Over-Filters and Under-Filters
Iteration is an important aspect of the morphologic processing of images, as by iteration it is possible to construct a large class of useful τ-mappings. Different τ-mappings do not behave the same under iteration. Some, like openings and closings, are idempotent after the very first iteration. Others, like median filters, exhibit an image-dependent idempotence which can occur after a finite number of applications or which can oscillate between fixed bounds (refer to Maragos and Schafer, 1989b, and Astola et al., 1987). Two important filtering properties that concern the iteration of a τ-mapping are over-filtering and under-filtering. A τ-mapping $\Psi_{\{A\}}$ is called an over-filter if
$$\Psi_{\{A\}}(\Psi_{\{A\}}) \supseteq \Psi_{\{A\}}.$$
Examples of over-filters include all extensive and all idempotent τ-mappings. A τ-mapping $\Psi_{\{A\}}$ is called an under-filter if
$$\Psi_{\{A\}}(\Psi_{\{A\}}) \subseteq \Psi_{\{A\}}.$$
Any idempotent or anti-extensive τ-mapping is an under-filter. If $\Psi_{\{A\}}$ satisfies both the preceding equations, then $\Psi_{\{A\}}(\Psi_{\{A\}}) = \Psi_{\{A\}}$, and so $\Psi_{\{A\}}$ is idempotent.
1. A Design Constraint for Over-Filters
The constraint for a τ-mapping $\Psi_{\{A\}}$ to be an over-filter can be formulated entirely in terms of the basis $\{A\}$. As the cascade $\Psi_{\{A\}}(\Psi_{\{A\}})$ is a τ-mapping, it may be represented as a union of erosions, $\Psi_{\{A\}}(\Psi_{\{A\}}) = \bigcup \{X \ominus B : B \in \{B\}\}$. Although the basis elements of the set $\{B\}$ may be computed using the general basis algorithm as shown in Section III, we require an explicit expression for the set $\{B\}$ in terms of the basis $\{A\}$. This is given in the following lemma.
Lemma IV.1. Defining the basis $\{A\} = \{A_i : i = 1, \ldots, I = \chi(\{A\})\}$ and $A_i \in \{A\}$ by $A_i = \{a_{i,j} : j = 1, \ldots, J = \chi(A_i)\}$,
$$\Psi_{\{A\}}(\Psi_{\{A\}}) = \bigcup \{X \ominus B : B \in \{B\}\},$$
where $\{B\} = \{B : B = \bigcup_j (a_{i,j} + A_{i_j})\}$.
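The set $\{B\}$ of Lemma IV.1 can be enumerated directly: pick an $A_i$, assign one basis element to each of its points, translate each assigned element through its point, and take the union. The sketch below is illustrative only; structuring elements are assumed to be frozensets of integer tuples, the function names are hypothetical, and the example is the three-point median filter of Section III,D.

```python
from itertools import product


def shift(A, t):
    """Translate the point set A by the vector t."""
    return frozenset(tuple(p + d for p, d in zip(a, t)) for a in A)


def minimal(sets):
    """Keep only the sets that have no proper subset in the collection."""
    sets = set(sets)
    return {S for S in sets if not any(T < S for T in sets)}


def cascade_generators(basis):
    """The set {B} of Lemma IV.1, before any redundant elements are removed."""
    elements = list(basis)
    generated = set()
    for A_i in elements:
        points = sorted(A_i)
        # One basis element is chosen independently for every point of A_i.
        for choice in product(elements, repeat=len(points)):
            B = frozenset().union(*(shift(A, a) for a, A in zip(points, choice)))
            generated.add(B)
    return generated


def fs(*points):
    return frozenset(points)


median_3 = {fs((-1, 0), (0, 0)), fs((0, 0), (1, 0)), fs((-1, 0), (1, 0))}
generators = cascade_generators(median_3)
two_pass_basis = minimal(generators)
```

The set produced by the lemma is generally not itself a basis, so the redundant elements are discarded here to obtain the basis for two passes of the filter.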
An element $B \in \{B\}$ is formed by first selecting some $A_i \in \{A\}$. Then, for each element $a_{i,j} \in A_i$, a basis element $A_{i_j} \in \{A\}$ is selected and translated through $a_{i,j}$. The union $\bigcup_j (a_{i,j} + A_{i_j})$ forms a $B \in \{B\}$. The set $\{B\}$ is completed by using all possible combinations of $a_{i,j}$ and $A_{i_j}$, for all $A_i \in \{A\}$.
The over-filter constraint is now given by $\Psi_{\{A\}}(\Psi_{\{A\}}) \supseteq \Psi_{\{A\}} \Leftrightarrow \bigcup \{X \ominus B : B \in \{B\}\} \supseteq \Psi_{\{A\}} \Leftrightarrow$ for all $A \in \{A\}$, there exists some $B \in \{B\}$ such that $B \subseteq A$. Any set $B$ is defined as a union of translated $A' \in \{A\}$, and so $B \subseteq A$ implies that all these translated $A'$ are subsets of $A$. In essence, then, the basis for an over-filter $\Psi_{\{A\}}$ consists of basis elements that can translate to be subsets of other elements of the basis. The added complication is that the solution space of allowed translations is restricted. This result is formalized next.
Property IV.3. A τ-mapping $\Psi_{\{A\}}$ is an over-filter iff for every $G_A$, $A \in \{A\}$, we can find some $A'' \in \{A\}$ where $A'' \subseteq G_A$. For each $A \in \{A\}$, we define the set of translates $G_A = \{t : A \supseteq A'_t$, for some $A' \in \{A\}\}$.
In Fig. 10a is an example basis $\{A\}$. Directly below each basis element, in Fig. 10c, is the corresponding set $G_A = \{t : A \supseteq A'_t, A' \in \{A\}\}$. The constraint in Property IV.3 is satisfied for the sets $G_1$ and $G_2$, but for the last set, $G_3$, there exists no $A \in \{A\}$ where $G_3 \supseteq A$. This basis does not represent an over-filter as it is, but it can be modified as follows. The first option is to add an element to the basis $\{A\}$ which is a subset of $G_3$. In this instance, the origin would be added and, as the basis would then represent an extensive τ-mapping, it would be an over-filter. Alternatively, the last basis element $A_3$ can be changed to the set shown in Fig. 10b. The new set $G_3$, in Fig. 10d, is now a superset of some $A \in \{A\}$, and so the modified basis represents an over-filter. The basis for the cascade $\Psi_{\{A\}}(\Psi_{\{A\}})$ using the modified basis is shown in Fig. 10e. By chance, this is equal to the modified basis itself and so $\Psi_{\{A\}}(\Psi_{\{A\}}) = \Psi_{\{A\}}$. We may conclude then that the new τ-mapping $\Psi_{\{A\}}$ is both an over-filter and idempotent.
FIGURE 10. Procedure to design an over-filter. (a) Original basis $\{A\}$. (b) Modified element of the basis $\{A\}$. (c) Sets $G_A$ associated with the original basis $\{A\}$. (d) Set $G_A$ for the modified element. (e) Basis for a cascade of two τ-mappings using the modified basis.
In general, any basis can be modified using the two approaches just given, but in most cases it is more difficult. In the first approach, a set $A' \subseteq G_A$ is added to the basis $\{A\}$. But the constraint in Property IV.3 must also be satisfied for this new element $A'$, and this dictates that there must be some basis element $A$ where $A \subseteq G_{A'}$. If there is no such basis element then the modified basis still does not represent an over-filter. In the second approach, a basis element $A$ is changed to a set $A' \supseteq A$ which is used in place of $A$ in the basis. The set $A'$ is chosen so that $G_{A'} \supseteq A$, for some element $A$ of the modified basis. But changing an element $A$ to $A' \supseteq A$ can invalidate the results for the other sets $G_A$, and again the modified basis may not represent an over-filter. In practice, an iterative process which combines the two approaches must be used to find a suitable solution for the modified basis.
2. A Design Constraint for Under-Filters
The constraint for a τ-mapping $\Psi_{\{A\}}$ to be an under-filter may be expressed entirely in terms of the basis $\{A\}$.
A result may be derived in a similar fashion to that above for an over-filter, but it is easier to utilize the duality that exists between under-filters and over-filters. If $\Psi$ is an over-filter then its dual $\Psi^*$ is an under-filter, and the dual of the under-filter constraint $\Psi_{\{A\}}(\Psi_{\{A\}}) \subseteq \Psi_{\{A\}}$ is the over-filter constraint $\Psi_{\{A^*\}}(\Psi_{\{A^*\}}) \supseteq \Psi_{\{A^*\}}$. The dual basis $\{A^*\}$ may then be used to formulate the constraint on a basis $\{A\}$ so that it will represent an under-filter, using the result already given in Property IV.3. The result is as summarized next.

Property IV.4. A τ-mapping $\Psi_{\{A\}}$ is an under-filter iff for every $G_i^*$ we can find some $A^* \in \{A^*\}$ where $A^* \subseteq G_i^*$. The set $G_i^* = \{t : A_i^* \supseteq A^{*\prime}_t,\ A^{*\prime} \in \{A^*\}\}$, and $\{A^*\}$ is the dual basis of $\{A\}$.

A basis can be designed so as to represent an under-filter by designing its dual basis to represent an over-filter. From a given basis $\{A\}$, the dual basis
$\{A^*\}$ can be computed using the general basis algorithm. After $\{A^*\}$ has been modified so as to represent an over-filter, the dual of the modified basis is computed, and this resulting basis will represent an under-filter.
C. Self-Duality

The dual of a set mapping $\Psi$ is given by $\Psi^*(X) = [\Psi(X^c)]^c$, and a set mapping is called self-dual when $\Psi^* = \Psi$. Self-dual set mappings are useful in image analysis because they treat the foreground $X$ and background $X^c$ of an image equally. For example, a self-dual set mapping that removes "salt and pepper" noise (or impulse noise) from an image will remove the same amount of "salt" as "pepper" from the image. In terms of the basis representation, a τ-mapping $\Psi_{\{A\}}$ is self-dual when $\Psi_{\{A\}} = \Psi_{\{A^*\}}$. As a basis uniquely represents a τ-mapping, we have that $\Psi_{\{A\}} = \Psi_{\{A^*\}}$ if and only if $\{A\} = \{A^*\}$, and so the constraint for self-duality may be formulated in terms of the basis $\{A\}$.

1. Formulation for the Dual Basis
The dual basis $\{A^*\}$ may be computed from any given basis $\{A\}$, as demonstrated in Section III,E, but we require an explicit expression for $\{A^*\}$ in terms of $\{A\}$. This is given in the following lemma.

Lemma IV.2. Defining the basis $\{A\} = \{A_i : i = 1, \ldots, I = \chi(\{A\})\}$ and each $A_i \in \{A\}$ by $A_i = \{a_{i,j} : j = 1, \ldots, J = \chi(A_i)\}$,

$$\{A^*\} = \mathcal{B}(\{B\}),$$

where $\{B\} = \{B : B = \bigcup_i a_{i,j_i}\}$.
Each $B \in \{B\}$ is formed by selecting some element $a \in A$ from every $A \in \{A\}$ and combining these elements together to form a single set. There are $\chi(A_1) \times \chi(A_2) \times \cdots \times \chi(A_I)$ possible ways to make such a selection, each forming some $B \in \{B\}$. The basis elements of the resultant set $\{B\}$ form the dual basis $\{A^*\}$. We can compare this selection procedure with the result for the dual basis that was shown in Fig. 9. Each element of the dual basis $\{A^*\}$ in Fig. 9b can be formed by selecting one element from each $A \in \{A\}$ in Fig. 9a. In this example, there are 4 × 4 × 4 × 4 = 256 possible ways to select a set $B$. Most of these selections are redundant, as they are supersets of some other possible selection. Only 12 are basis elements of $\{B\}$, and these form the dual basis $\{A^*\}$ as shown in Fig. 9b.

The procedure for selecting the elements of $\{B\}$ can be modified so that only the basis elements $A^*$ are selected. A basis element $A^* \in \{B\}$ may be defined by the fact that for all $C \subset A^*$, $C \notin \{B\}$. In this way, it is impossible for $A^*$ to be redundant, as there is no subset of $A^*$ that is an element of $\{B\}$. In terms of the foregoing selection procedure, this is equivalent to the constraint that no $C \subset A^*$ satisfies the selection procedure for the set $\{B\}$. As each $B \in \{B\}$ is formed by selecting one element from every $A \in \{A\}$, we may conclude then that $A^*$ is a basis element if and only if every $a \in A^*$ exists "alone" in some $A \in \{A\}$ without the other elements of $A^*$. In such a case, no element can be dropped from $A^*$ while the selection procedure remains satisfied. Conversely, if there existed some $a \in A^*$ that did not exist alone in some $A \in \{A\}$, then it could be dropped from $A^*$ and the selection procedure would still be satisfied. Notice that for any given $A^* \in \{A^*\}$ in Fig. 9b, every $a \in A^*$ exists without the other elements of $A^*$ in some $A \in \{A\}$ in Fig. 9a.

2. A Design Constraint for Self-Duality

The constraint for self-duality can now be formulated in terms of a basis $\{A\}$. If $\{A\} = \{A^*\}$ then the procedure for computing the dual basis $\{A^*\}$ must yield the original basis $\{A\}$ itself. Equivalently, every $A \in \{A\}$ must be able to be selected as an element of $\{A^*\}$ and only $A \in \{A\}$ can be selected as elements of $\{A^*\}$. For a given $A \in \{A\}$ to be selected, every other element of $\{A\}$ must contain some $a \in A$, and every $a \in A$ must exist alone in some element of $\{A\}$. In this way, each $A$ is able to be selected as an element of $\{A^*\}$ and, when selected, will not be redundant. We can now show that this also implies that only $A \in \{A\}$ can be selected as elements of $\{A^*\}$, and that therefore $\{A^*\} = \{A\}$. As $\Psi^{**} = \Psi$, the dual of the dual basis $\{A^*\}$ is the original basis $\{A\}$. Therefore, if the dual basis procedure is implemented on $\{A^*\}$, only elements of $\{A\}$ can be selected.
But every $A \in \{A\}$ is an element of $\{A^*\}$, and so the dual basis procedure implemented on $\{A\}$ can only select subsets of elements of $\{A\}$. As every $A \in \{A\}$ is an element of $\{A^*\}$, this then implies that only $A \in \{A\}$ can be selected as elements of $\{A^*\}$. The final result is as summarized next.

Property IV.5. A τ-mapping $\Psi_{\{A\}}$ is self-dual iff for all $A \in \{A\}$, (i) every $A' \in \{A\}$ contains some $a \in A$ and (ii) for every $a \in A$ there exists some $A' \in \{A\}$ such that $a = A \cap A'$.
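The selection procedure of Lemma IV.2 and the test in Property IV.5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: basis elements are stored as frozensets of integer coordinate pairs, the function names are hypothetical, and the brute-force enumeration of all selections is exponential in the size of the basis, so it is only practical for small examples. The three-element basis used here is the one quoted in the text for the median filter of Fig. 8a.

```python
from itertools import product


def minimal_sets(sets):
    """Keep the sets that have no proper subset in the collection."""
    unique = set(sets)
    return {s for s in unique if not any(t < s for t in unique)}


def dual_basis(basis):
    """Selection procedure: take one point from every basis element,
    pool the picks into a set B, and keep only the non-redundant B."""
    picks = (frozenset(choice) for choice in product(*basis))
    return minimal_sets(picks)


def satisfies_property_iv5(basis):
    """Check conditions (i) and (ii) of Property IV.5 for every element."""
    basis = [frozenset(a) for a in basis]
    for A in basis:
        # (i) every basis element shares at least one point with A
        if any(not (A & other) for other in basis):
            return False
        # (ii) every point of A is the whole intersection of A with some element
        for a in A:
            if not any(A & other == {a} for other in basis):
                return False
    return True


# Basis quoted in the text for the 3-point median filter.
MEDIAN = [
    frozenset({(0, 0), (1, 0)}),
    frozenset({(-1, 0), (0, 0)}),
    frozenset({(-1, 0), (1, 0)}),
]
# A single two-point element (an erosion), which is not self-dual.
EROSION = [frozenset({(0, 0), (1, 0)})]

print(dual_basis(MEDIAN) == set(MEDIAN))
print(satisfies_property_iv5(MEDIAN), satisfies_property_iv5(EROSION))
```

For the median basis, the computed dual basis coincides with the basis itself, in agreement with Property IV.5; for the single-element erosion basis, the selection procedure returns the two singletons of the corresponding dilation.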
For example, the basis for the median filter in Fig. 8a is self-dual, as every element of this basis satisfies the two constraints in Property IV.5. Consider the first element $A = \{(0,0), (1,0)\}$. Here constraint (i) is satisfied because every element of the basis contains either the singleton $a = (0,0)$ or the singleton $a = (1,0)$. Constraint (ii) is satisfied because both $a = (0,0)$ and $a = (1,0)$ exist alone in some element of the basis. That is, $a = (0,0)$ exists in the element $A' = \{(-1,0), (0,0)\}$ without the presence of $a = (1,0)$, and $a = (1,0)$ exists in the element $A' = \{(-1,0), (1,0)\}$ without the presence of $a = (0,0)$. More formally, for each element $a \in A$, there exists some $A' \in \{A\}$ such that $a = A \cap A'$, as required by Property IV.5. As the two constraints are also satisfied for the other two elements of the basis, it may be concluded that the basis $\{A\}$ represents a self-dual τ-mapping.

The result in Property IV.5 can be used as a design constraint to design τ-mappings that are self-dual. Alternatively, a given τ-mapping that is not self-dual can be made so by modifying the elements of its basis. One method of modifying a given τ-mapping, noted in Serra and Vincent (1992), is given by

$$\rho = [\mathrm{id} \cup (\Psi \cap \Psi^*)] \cap (\Psi \cup \Psi^*) = \rho^*.$$
Here $\rho$ is a self-dual τ-mapping based on $\Psi$ and its dual $\Psi^*$. Fig. 11 gives an example in terms of basis representations. A basis $\mathcal{B}(\Psi) = \{A\}$ is shown in Fig. 11a, and its dual $\mathcal{B}(\Psi^*) = \{A^*\}$ in Fig. 11b. The basis $\mathcal{B}(\rho)$ may be obtained with the general basis algorithm as follows. The bases for the union $\Psi \cup \Psi^*$ and the intersection $\Psi \cap \Psi^*$ are obtained using Theorems II.2 and II.3, respectively. The basis for the union $\mathrm{id} \cup (\Psi \cap \Psi^*)$ is then obtained using Theorem II.2 (where the basis for the identity mapping id is given by the origin), and the basis for the subsequent intersection of $\mathrm{id} \cup (\Psi \cap \Psi^*)$ and $\Psi \cup \Psi^*$ using Theorem II.3. This final basis is the basis $\mathcal{B}(\rho)$, as shown in Fig. 11c. As $\mathcal{B}(\rho)$ is self-dual, it satisfies the constraints in Property IV.5. Note that this is only one method of modifying a given τ-mapping, and other solutions which satisfy Property IV.5 also exist.

FIGURE 11. Self-dual basis. (a) Original basis $\{A\}$. (b) Dual basis $\{A^*\}$. (c) Self-dual basis $\mathcal{B}(\rho)$.
D. Combining Design Constraints

The design constraints in Properties IV.1 to IV.5 may now be combined with one another to give design constraints for other filtering properties. Note that many combinations do not lead to useful results. For example, the only possible basis solution when Property IV.1 is combined with Property IV.2 is the origin. This reflects the fact that the identity mapping is the only τ-mapping that is both extensive and anti-extensive. Similarly, the origin is the only possible solution when Property IV.5 is combined with either Property IV.1 or Property IV.2.

A more useful combination is Property IV.2 and Property IV.3. A τ-mapping that exhibits both these properties is an opening, and all openings are defined by a basis that exhibits these two properties simultaneously. Recall that an opening is any anti-extensive and idempotent τ-mapping. The condition for idempotence is equivalent to the condition that the τ-mapping be both an over-filter and an under-filter. However, if the anti-extensive constraint is satisfied, then the under-filter constraint is satisfied automatically. An opening may then be defined as an anti-extensive over-filter, or as any τ-mapping that satisfies both Property IV.2 and Property IV.3. We have then the following result.

Property IV.6. A τ-mapping is an opening iff $0 \in A$ for all $A \in \{A\}$, and for every set $G_i = \{t : A_i \supseteq A'_t,\ A' \in \{A\}\}$, there is some $A \in \{A\}$ where $A \subseteq G_i$.
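Property IV.6 lends itself to a direct test on a finite basis. The sketch below is a hedged illustration, not the authors' algorithm: it assumes basis elements are non-empty frozensets of integer coordinate pairs, takes the translate $A'_t$ to mean $\{a + t : a \in A'\}$, and uses hypothetical function names throughout.

```python
def translate(A, t):
    """Translate the set A by the vector t (assumed convention: a + t)."""
    return frozenset((a[0] + t[0], a[1] + t[1]) for a in A)


def g_set(A_i, basis):
    """G_i = {t : A'_t is contained in A_i for some A' in the basis}."""
    G = set()
    for A_prime in basis:
        anchor = next(iter(A_prime))
        # If A'_t fits inside A_i, the anchor point must land on a point of A_i.
        for a in A_i:
            t = (a[0] - anchor[0], a[1] - anchor[1])
            if translate(A_prime, t) <= A_i:
                G.add(t)
    return frozenset(G)


def is_over_filter(basis):
    """Every G_i must contain some basis element."""
    return all(any(A <= g_set(A_i, basis) for A in basis) for A_i in basis)


def is_anti_extensive(basis):
    """Every basis element must contain the origin."""
    return all((0, 0) in A for A in basis)


def is_opening(basis):
    """Anti-extensive over-filter, as in Property IV.6."""
    return is_anti_extensive(basis) and is_over_filter(basis)


# Basis of the opening by the two-point set {(0,0), (1,0)}: its two translates
# that contain the origin.
OPENING = [frozenset({(0, 0), (1, 0)}), frozenset({(-1, 0), (0, 0)})]
# A single two-point element is an erosion, which is not idempotent.
EROSION = [frozenset({(0, 0), (1, 0)})]

print(is_opening(OPENING), is_opening(EROSION))
```

The erosion basis fails because its only set $G$ is the origin, which contains no basis element; the opening basis passes because each $G_i$ equals the corresponding $A_i$.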
In the same way, a τ-mapping that exhibits both Property IV.1 and Property IV.4 is a closing, and all closings are defined by a basis that exhibits these two properties simultaneously. As Property IV.4 is defined in terms of the dual basis, we require Property IV.1 to be defined in a similar fashion so that the two properties can be combined. This is a simple matter as, by duality, a τ-mapping is extensive if and only if its dual basis is anti-extensive. We then have the following result.

Property IV.7. A τ-mapping is a closing iff $0 \in A^*$ for all $A^* \in \{A^*\}$, and for every set $G_i^* = \{t : A_i^* \supseteq A^{*\prime}_t,\ A^{*\prime} \in \{A^*\}\}$, there is some $A^* \in \{A^*\}$ where $A^* \subseteq G_i^*$.
A basis $\{A\}$ can be designed so as to represent any closing by designing its dual basis $\{A^*\}$ to represent an opening. Once a basis $\{A^*\}$ is found, the basis $\{A\}$ may be computed using the general basis algorithm.

All idempotent τ-mappings are defined by a basis that exhibits both Property IV.3 and Property IV.4 simultaneously. The openings and closings discussed earlier are particular examples of such τ-mappings. Other examples include the open-close and close-open τ-mappings. If Property IV.5 is also satisfied, then the basis would represent a self-dual idempotent τ-mapping. However, even though these properties are all defined explicitly in terms of basis elements, the interrelations between these elements are complex and, as yet, we have no practical method of combining these properties.
V. TRANSLATION-INVARIANT SET MAPPINGS

Banon and Barrera (1991) have extended the basis representation to translation-invariant set mappings that are not necessarily increasing. A similar representation using a special case of translation-invariant set mappings known as window transformations was attempted by Crimmins and Brown (1985) some years earlier, though neither the basis nor the dual representation was established. Dougherty and Loce (1992b) have investigated the optimal representation of translation-invariant set mappings by optimizing a basis of hit-or-miss set mappings. Other work includes that of Jones and Svalbe (1994a), where the basis representation is implemented on images using a template-matching procedure in combination with a look-up table. Such an approach allows a parallel implementation of a set mapping via its basis representation. In this section, the basis representation of translation-invariant set mappings as developed by Banon and Barrera (1991) will be outlined, a general basis algorithm will be presented, and then some examples will be given to demonstrate the algorithm.

A. Background Theory
A translation-invariant set mapping will be called a TI mapping and denoted by the symbol $\Phi$. A fundamental TI mapping is the sup-generating set mapping

$$X \mathbin{\dot\ominus} (A, B) = (X \ominus A) \cap (X^c \ominus B^c). \tag{V.1}$$

This is made up of an ordinary erosion $X \ominus A$ and an additional set mapping $X^c \ominus B^c$ that is not increasing. It is similar to the well-known hit-or-miss set mapping $X \circledast (A, B) = (X \ominus A) \cap (X^c \ominus B)$ of Serra (1982), and in fact $X \mathbin{\dot\ominus} (A, B) = X \circledast (A, B^c)$. The dual of the sup-generating set mapping is the inf-generating set mapping $X \mathbin{\dot\oplus} (A, B)$. In terms of dilations, $X \mathbin{\dot\oplus} (A, B) = (X \oplus A) \cup (X^c \oplus B^c)$.
Here it is the additional set mapping $X^c \oplus B^c$ that is not increasing.

For two structuring elements $A, B \in \mathcal{P}(E)$, we may define an interval $[A, B] = \{C \in \mathcal{P}(E) : A \subseteq C \subseteq B\}$ to contain all the elements of $\mathcal{P}(E)$ that lie between $A$ and $B$. The concept of an interval is a convenient formalism to introduce, as it simplifies considerably the algebra of TI mappings. For example, using an interval, the sup-generating set mapping can be defined as $X \mathbin{\dot\ominus} (A, B) = \{z \in E : X_{-z} \in [A, B]\}$. When $A \nsubseteq B$, the interval $[A, B] = \emptyset$, and it follows immediately that $X \mathbin{\dot\ominus} (A, B) = \emptyset$. This result is not as obvious from the definition of $X \mathbin{\dot\ominus} (A, B)$ as it appears in Eq. (V.1).

1. The Kernel Representation

All TI mappings can be represented as a union of simple sup-generating set mappings,

$$\Phi(X) = \bigcup \{X \mathbin{\dot\ominus} (A, B) : [A, B] \subseteq \mathcal{K}(\Phi)\}.$$

The kernel $\mathcal{K}(\Phi) = \{A \in \mathcal{P}(E) : 0 \in \Phi(A)\}$ is the same as that defined for τ-mappings. An interval $[A, B] \subseteq \mathcal{K}(\Phi)$ if every element of the interval is an element of the kernel. By duality, there is also a dual kernel representation for TI mappings as an intersection of inf-generating set mappings,

$$\Phi(X) = \bigcap \{X \mathbin{\dot\oplus} (A^*, B^*) : [A^*, B^*] \subseteq \mathcal{K}(\Phi^*)\},$$

where $\mathcal{K}(\Phi^*)$ is the kernel of the dual mapping $\Phi^*(X) = [\Phi(X^c)]^c$.

2. The Basis Representation

The kernel representation has little practical application because it contains an infinite number of intervals. However, by discarding redundant intervals from the kernel representation, it is possible to represent the TI mapping with a set of intervals that can be finite. In the case of τ-mappings, we have seen that for two kernel elements $A$ and $B \supset A$, the element $B$ can be discarded because it is redundant. For TI mappings the situation is slightly different. If $[A, B] \subseteq [A', B'] \subseteq \mathcal{K}(\Phi)$, then $X \mathbin{\dot\ominus} (A, B) \subseteq X \mathbin{\dot\ominus} (A', B')$ and $X \mathbin{\dot\ominus} (A, B) \cup X \mathbin{\dot\ominus} (A', B') = X \mathbin{\dot\ominus} (A', B')$. The interval $[A, B]$ is therefore redundant. Note that redundancy is now governed by a pair of constraints coinciding with the two limits of an interval. This leads to an interesting basis property that will be discussed in the following section. The basis $\mathcal{B}(\Phi)$ is defined as the set of non-redundant intervals contained in the kernel,

$$\mathcal{B}(\Phi) = \{[A, B] \subseteq \mathcal{K}(\Phi) : ([A, B] \subseteq [A', B'] \subseteq \mathcal{K}(\Phi)) \Rightarrow [A, B] = [A', B']\}. \tag{V.2}$$
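The representation of a TI mapping then becomes the basis representation

$$\Phi(X) = \bigcup \{X \mathbin{\dot\ominus} (A, B) : [A, B] \in \mathcal{B}(\Phi)\} = \bigcap \{X \mathbin{\dot\oplus} (A^*, B^*) : [A^*, B^*] \in \mathcal{B}(\Phi^*)\}. \tag{V.3}$$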
Note that when $\Phi$ is increasing, this expression reduces to that for the basis representation for τ-mappings. A TI mapping $\Phi$ with a basis $\mathcal{B}(\Phi) = \{[A, B]\}$ will be denoted by $\Phi_{\{[A,B]\}}$.

For example, the inf-generating set mapping $\Phi(X) = X \mathbin{\dot\oplus} (A, B)$ is a TI mapping and has a basis representation that may be derived as follows. The kernel is $\mathcal{K}(\Phi) = \{C \in \mathcal{P}(E) : 0 \in C \mathbin{\dot\oplus} (A, B)\}$. The constraint $0 \in C \mathbin{\dot\oplus} (A, B) \Leftrightarrow 0 \in [(C \oplus A) \cup (C^c \oplus B^c)] \Leftrightarrow (0 \in C \oplus A \text{ or } 0 \in C^c \oplus B^c) \Leftrightarrow ((-a) \in C,\ a \in A, \text{ or } (-z) \notin C,\ z \in B^c)$. The kernel can now be defined as $\mathcal{K}(\Phi) = \{C \in \mathcal{P}(E) : (-a) \in C,\ a \in A, \text{ or } (-z) \notin C,\ z \in B^c\}$. The maximal intervals contained in this kernel are $[-a, E]$, where $a \in A$, and $[\emptyset, (-z)^c]$, where $z \in B^c$. The basis $\mathcal{B}(X \to X \mathbin{\dot\oplus} (A, B))$ consists of the complete set of these intervals,

$$\mathcal{B}(X \to X \mathbin{\dot\oplus} (A, B)) = \{[-a, E] : a \in A\} \cup \{[\emptyset, (-z)^c] : z \in B^c\}. \tag{V.4}$$

3. The Minimal Representation

In contrast to τ-mappings, the basis representation of TI mappings is not always the minimal representation. For, if two different intervals $[A, B], [A', B'] \in \mathcal{B}(\Phi)$ can combine to form a single interval $[A'', B''] = [A, B] \cup [A', B']$, then $[A, B]$ and $[A', B']$ would both be made redundant by $[A'', B'']$. Other intervals in the basis $\mathcal{B}(\Phi)$ may also be made redundant by the new interval $[A'', B'']$. Two intervals can combine to form a single interval when $X \mathbin{\dot\ominus} (A, B) \cup X \mathbin{\dot\ominus} (A', B') = \{z \in E : X_{-z} \in ([A, B] \cup [A', B'])\} = \{z \in E : X_{-z} \in [A'', B'']\}$, where $[A, B] \cup [A', B']$ forms a single interval $[A'', B'']$. By combining intervals, a basis may be further reduced to a sub-basis $\mathcal{N}(\Phi)$ which is both a basis and a minimal representation of $\Phi$.

Shown in Fig. 12a is a set of four intervals $\{[A, B]\}$ which form a basis $\mathcal{B}$. For each interval the set of ones denotes a set $A$ and the set of zeros denotes a set $B^c$. The complement of $B$ is depicted because in all of these intervals the limit $B$ is an infinite set. (This is invariably the case when a sup-generating mapping $X \mathbin{\dot\ominus} (A, B)$ is used in practice. In contrast, for the hit-or-miss set mapping $X \circledast (A, B) = X \mathbin{\dot\ominus} (A, B^c)$, both $A$ and $B$ are finite sets.) As there exist no two intervals $[A, B]$ and $[A', B']$ in this set where $[A, B] \subset [A', B']$, all intervals are non-redundant. Note that the relation $[A, B] \subset [A', B']$ has a more practical form $[(A \supseteq A' \text{ and } B^c \supset B'^c) \text{ or } (A \supset A' \text{ and } B^c \supseteq B'^c)]$. Shown in Fig. 12b is a sub-basis $\mathcal{N}$ of this basis which consists of a single interval. The two bases $\mathcal{B}$ and $\mathcal{N}$ are equivalent when used in the basis representation of Eq. (V.3).
FIGURE 12. Minimal basis. (a) Basis consisting of four intervals. (b) Minimal basis with only one interval. The origin is marked by the box in bold type.
In this work the analysis will be restricted to the computation of any basis as defined by Eq. (V.2), and the basis produced may not necessarily be a minimal basis.
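To make the interval formalism concrete, the following sketch evaluates a sup-generating set mapping through the interval definition $\{z : X_{-z} \in [A, B]\}$. It is an illustration under several assumptions that are not taken from the text: the space $E$ is the integers, the image $X$ and the set $A$ are finite with $A$ non-empty, the infinite limit $B$ is stored through its finite complement `Bc`, and all names are hypothetical.

```python
def is_empty(interval):
    """[A, B] is empty exactly when A is not a subset of B,
    i.e. when A meets the stored complement Bc."""
    A, Bc = interval
    return bool(set(A) & set(Bc))


def sup_generating(X, A, Bc):
    """Points z such that A <= X_{-z} <= B, with B given by its complement Bc.

    A <= X_{-z} means a + z lies in X for every a in A;
    X_{-z} <= B means b + z lies outside X for every b in Bc."""
    if not A:
        raise ValueError("A must be non-empty so that the result is finite")
    X = set(X)
    a0 = next(iter(A))
    result = set()
    for x in X:
        z = x - a0  # only these z can place the point a0 inside X
        if all(a + z in X for a in A) and all(b + z not in X for b in Bc):
            result.add(z)
    return result


X = {0, 1, 2, 4}
right_edges = sup_generating(X, A={0}, Bc={1})     # z in X with z + 1 outside X
stricter = sup_generating(X, A={0}, Bc={1, 2})     # z + 1 and z + 2 both outside X
print(sorted(right_edges), sorted(stricter))
```

The second interval is contained in the first (same $A$, larger $B^c$), so its output is a subset of the first output and adds nothing to their union, which is the redundancy described for Eq. (V.2).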
B. Basis Algorithms

Having established that TI mappings have a basis representation, the problem then becomes one of how to compute bases for given TI mappings. In this section we will establish tools for a general basis algorithm based on the approach adopted for τ-mappings. We proceed by deriving the following four results on the basis representation of TI mappings. The first three are generalizations of the translation, union, and intersection results established in Section II. The fourth result, on the complementation of TI mappings, is a necessary response to the fact that complementation is now in the algebra of the set mappings (although complementation can also be used in the algebra of τ-mappings, it is not an essential element of that algebra). Note that the basis solutions given in the following results may only be instances of basis solutions and that there may be other possible solutions, for example, sub-bases of those given.
Theorem V.1. A basis for the translation of a TI mapping $\Phi_{\{[A,B]\}}$ through a vector $t \in E$ is given by

$$\mathcal{B}((\Phi_{\{[A,B]\}})_t) = \{[A, B]_{-t} : [A, B] \in \mathcal{B}(\Phi)\}.$$

The translation $[A, B]_{-t} = [A_{-t}, B_{-t}]$ can be obtained from any given interval $[A, B]$.
Theorem V.2. A basis for the TI mapping $\Phi_{\{[A,B]\}} \cup \Phi_{\{[C,D]\}}$ is given by

$$\mathcal{B}(\Phi_{\{[A,B]\}} \cup \Phi_{\{[C,D]\}}) = \{[A, B]\}_{\mathcal{B}} \cup \{[C, D]\}_{\mathcal{B}},$$

where

$$\{[A, B]\}_{\mathcal{B}} = \{[A, B] \in \{[A, B]\} : [A, B] \not\subset [C, D],\ \forall [C, D] \in \{[C, D]\}\}$$

and

$$\{[C, D]\}_{\mathcal{B}} = \{[C, D] \in \{[C, D]\} : [C, D] \nsubseteq [A, B],\ \forall [A, B] \in \{[A, B]\}\}.$$

Here the relation $[A, B] \not\subset [C, D]$ has the more practical form $[A \nsupseteq C \text{ or } B \nsubseteq D \text{ or } (A = C \text{ and } B = D)]$.

Theorem V.3. A basis for the TI mapping $\Phi_{\{[A,B]\}} \cap \Phi_{\{[C,D]\}}$ is given by

$$\mathcal{B}(\Phi_{\{[A,B]\}} \cap \Phi_{\{[C,D]\}}) = \mathcal{B}(\{[E, F]\}),$$

where

$$\{[E, F]\} = \{[A \cup C, B \cap D] : [A, B] \in \{[A, B]\},\ [C, D] \in \{[C, D]\}\}.$$
Here, the interval $[E, F] = [A \cup C, B \cap D]$ can be empty even when the intervals $[A, B]$ and $[C, D]$ are non-empty, as $A \subseteq B$ and $C \subseteq D$ does not imply that $(A \cup C) \subseteq (B \cap D)$. In such a case, $X \mathbin{\dot\ominus} (E, F) = \emptyset$ and, as this contributes nothing to the basis representation, the interval $[E, F]$ may be discarded. In practice, a great number of intervals may be empty, and it is even possible to have a completely empty set of resultant intervals $\{[E, F]\}$.

In general, $\{[E, F]\}$ does not form a basis, and an algorithmic approach must be used to compute the basis $\mathcal{B}(\{[E, F]\})$. Such an algorithm can be modeled on that given for the intersection of τ-mappings, but we will omit it from this work. As an alternative approach, the set of intervals $\{[E, F]\}$ can be computed as defined earlier, and the redundant intervals subsequently removed.

Theorem V.4. A basis for the complement of a TI mapping $\Phi_{\{[A,B]\}}$ is given by

$$\mathcal{B}[(\Phi_{\{[A,B]\}})^c] = \{[B^{*c}, A^{*c}] : [A^*, B^*] \in \mathcal{B}(\Phi^*)\}.$$
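The union and intersection results (Theorems V.2 and V.3) can be sketched with intervals stored as pairs `(A, Bc)`, where `Bc` is the finite complement of the upper limit $B$. This is a hedged illustration with hypothetical names: the intersection follows the "compute all $[A \cup C, B \cap D]$, then remove empty and redundant intervals" alternative mentioned above, and no attempt is made to merge intervals into a minimal sub-basis.

```python
def iv(A, Bc):
    """Build a hashable interval [A, B] from A and the complement Bc of B."""
    return (frozenset(A), frozenset(Bc))


def is_empty(interval):
    """[A, B] is empty when A is not contained in B, i.e. A meets Bc."""
    A, Bc = interval
    return bool(A & Bc)


def contains(outer, inner):
    """inner is a sub-interval of outer (both non-empty):
    the lower limit grows and the complement of the upper limit grows."""
    (A, Bc), (C, Dc) = inner, outer
    return A >= C and Bc >= Dc


def reduce_intervals(intervals):
    """Drop empty intervals, duplicates, and intervals inside another one."""
    live = {i for i in intervals if not is_empty(i)}
    return {i for i in live
            if not any(i != other and contains(other, i) for other in live)}


def union_basis(basis1, basis2):
    """Intervals representing the union of two TI mappings."""
    return reduce_intervals(set(basis1) | set(basis2))


def intersection_basis(basis1, basis2):
    """Intervals [A u C, B n D]; the complement of B n D is Bc u Dc."""
    pairs = {(A | C, Bc | Dc) for (A, Bc) in basis1 for (C, Dc) in basis2}
    return reduce_intervals(pairs)


right_edge = {iv({0}, {1})}     # origin in X, right neighbour outside X
left_edge = {iv({0}, {-1})}     # origin in X, left neighbour outside X
needs_right = {iv({1}, set())}  # right neighbour inside X

print(intersection_basis(right_edge, left_edge))   # isolated points
print(intersection_basis(right_edge, needs_right)) # contradictory, so empty
print(union_basis(right_edge, {iv({0}, {1, 2})}))  # second interval is redundant
```

Storing $B^c$ rather than $B$ turns the intersection of upper limits into a union of finite sets, which is one way to realize the intervals as finite data structures.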
This basis may be obtained directly from a solution for the dual basis. However, as we assume that only the given basis $\mathcal{B}(\Phi) = \{[A, B]\}$ is known, to complete this result we require an algorithm to compute a dual basis $\mathcal{B}(\Phi^*)$ from $\mathcal{B}(\Phi)$. Such an algorithm will be proposed in the following section.

Finally, if the foregoing algorithms are to be implemented using a computer, then of course the limits $A$ and $B$ of any interval $[A, B]$ need to be realized as finite data structures in the computer. However, as the limit $B$ is always an infinite set, the complement $B^c$ must be stored instead of the set $B$ and the algorithms implemented with this change in mind.

C. Application of the General Basis Algorithm

The preceding results form the basic tools of the general basis algorithm and can now be applied to compute bases for given TI mappings. As with the general basis algorithm for τ-mappings, the algorithm requires that the TI mapping be formed as a combination of TI mappings, for each of which the basis is known, under the algebra of translation, union, intersection, and complementation. For example, Theorems V.2 and V.3 on the combination of two TI mappings are easily extended to an arbitrary number of TI mappings, from which basis solutions for cascades of dilation or erosion can then be derived. In the following we give two examples illustrating how the general basis algorithm can be applied to compute bases for given TI mappings. The first is for the basis of the dual TI mapping, and the second is for the basis representing cascaded TI mappings.

1. The Dual Basis
The dual of the TI mapping $\Phi(X)$ is defined by $\Phi^*(X) = [\Phi(X^c)]^c$ and has a basis representation $\Phi^*(X) = \bigcup \{X \mathbin{\dot\ominus} (A^*, B^*) : [A^*, B^*] \in \mathcal{B}(\Phi^*)\}$. We can derive a solution for the dual basis $\mathcal{B}(\Phi^*)$ from any given basis $\mathcal{B}(\Phi)$ using the general basis algorithm. The dual TI mapping $\Phi^*$ can be expressed in terms of the basis $\mathcal{B}(\Phi)$ by using the dual basis representation $\Phi^*(X) = \bigcap \{X \mathbin{\dot\oplus} (\check{A}, \check{B}) : [A, B] \in \mathcal{B}(\Phi)\}$, where $\check{A} = \{-a : a \in A\}$ denotes the reflection of $A$. An intersection of inf-generating set mappings is a TI mapping, and it has a basis that may be computed using the general basis algorithm. The basis for a single inf-generating mapping $X \mathbin{\dot\oplus} (\check{A}, \check{B})$ is given by Eq. (V.4) as $\mathcal{B}(X \to X \mathbin{\dot\oplus} (\check{A}, \check{B})) = \{[a, E] : a \in A\} \cup \{[\emptyset, z^c] : z \in B^c\}$. A solution for the basis representing the intersection $\bigcap \{X \mathbin{\dot\oplus} (\check{A}, \check{B}) : [A, B] \in \mathcal{B}(\Phi)\}$ can then be obtained from Theorem V.3, using two input bases $\mathcal{B}(X \to X \mathbin{\dot\oplus} (\check{A}, \check{B}))$ at a time. If we denote the basis produced by such a procedure by $\{[E, F]\}$, then we have that $\bigcap \{X \mathbin{\dot\oplus} (\check{A}, \check{B}) : [A, B] \in \mathcal{B}(\Phi)\} = \bigcup \{X \mathbin{\dot\ominus} (E, F) : [E, F] \in \{[E, F]\}\}$. The dual TI mapping $\Phi^*(X)$ then becomes $\Phi^*(X) = \bigcup \{X \mathbin{\dot\ominus} (E, F) : [E, F] \in \{[E, F]\}\}$. Therefore, a solution for the dual basis $\mathcal{B}(\Phi^*)$ is given by the basis $\{[E, F]\} = \mathcal{B}(\bigcap \{X \mathbin{\dot\oplus} (\check{A}, \check{B}) : [A, B] \in \mathcal{B}(\Phi)\})$ and may be obtained from any given basis $\mathcal{B}(\Phi)$ using the general basis algorithm. This result for the dual basis completes Theorem V.4 on the complementation of TI mappings. An example illustrating the dual basis is given next.
2. Cascaded TI Mappings

If $\Phi_{\{[A,B]\}}$ and $\Phi_{\{[C,D]\}}$ are two TI mappings, then the basis representing the cascade $\Phi_{\{[C,D]\}}(\Phi_{\{[A,B]\}})$ may be obtained using the general basis algorithm. We illustrate this with the following example. Fig. 13a shows a set of six templates which can be used to form the convex hull of a given set $X$ (see Serra, 1982, for more details). Each template is defined on a hexagonal grid and represents a pair of structuring elements $(A, B)$. In this figure, a set of ones denotes a set $A$ and a set of zeros denotes a set $B^c$. Again, we depict $B^c$ rather than $B$ because in all these templates the set $B$ is an infinite set.
FIGURE 13. Basis for TI mappings. (a) The basis for the convex hull TI mapping. (b) Basis for a cascaded erosion. (c) Dual basis of the convex hull basis. (d) Basis for the complement of a convex hull TI mapping. The origin is marked by the box in bold type, and the "." symbol indicates that there is no value at that coordinate.
Each template $(A, B)$ is used as a pair of structuring elements in the sup-generating mapping $X \mathbin{\dot\ominus} (A, B)$, and the union over the six templates forms the TI mapping $\Phi_{\{[A,B]\}}(X) = \bigcup \{X \mathbin{\dot\ominus} (A, B) : [A, B] \in \{[A, B]\}\}$. The convex hull of a set $X$ is given by applying $\Phi_{\{[A,B]\}}$ to the image repeatedly until it has no further effect on the image. When $\Phi_{\{[A,B]\}}$ is iterated twice, the result is given by $\Phi_{\{[A,B]\}}(\Phi_{\{[A,B]\}}) = \bigcup \{\Phi_{\{[A,B]\}} \mathbin{\dot\ominus} (A, B) : [A, B] \in \{[A, B]\}\} = \bigcup \{(\Phi_{\{[A,B]\}} \ominus A) \cap ((\Phi_{\{[A,B]\}})^c \ominus B^c) : [A, B] \in \{[A, B]\}\}$. A solution for the basis that represents this cascade may be obtained as follows. A basis for a TI mapping $\Phi_{\{[A,B]\}} \ominus A = \bigcap_{a \in A} (\Phi_{\{[A,B]\}})_{-a}$ can be computed by applying Theorem V.1 to each translation $(\Phi_{\{[A,B]\}})_{-a}$ and then Theorem V.3 to the intersection $\bigcap_{a \in A} (\Phi_{\{[A,B]\}})_{-a}$. A result is illustrated in Fig. 13b using a set $A$ from the first template in Fig. 13a. From Theorem V.4, a basis for the complementation $(\Phi_{\{[A,B]\}})^c$ is given by $\mathcal{B}[(\Phi_{\{[A,B]\}})^c] = \{[B^{*c}, A^{*c}] : [A^*, B^*] \in \mathcal{B}(\Phi^*)\}$ and may be obtained directly from the dual basis. A solution for the dual basis $\mathcal{B}(\Phi^*)$ is computed as directed in Section V,C,1 and is as shown in Fig. 13c. Using this result, the basis $\mathcal{B}[(\Phi_{\{[A,B]\}})^c]$ is computed and is as shown in Fig. 13d. The basis representing the erosion $(\Phi_{\{[A,B]\}})^c \ominus B^c$ is then computed. In this example, $B^c$ is the origin, and so $(\Phi_{\{[A,B]\}})^c \ominus B^c = (\Phi_{\{[A,B]\}})^c$ and is given simply by the basis for $(\Phi_{\{[A,B]\}})^c$, as shown in Fig. 13d. Then, the basis for the intersection $(\Phi_{\{[A,B]\}} \ominus A) \cap ((\Phi_{\{[A,B]\}})^c \ominus B^c)$ is obtained using Theorem V.3.

FIGURE 14. Solution for basis representing two passes of the convex hull. The origin is marked by the box in bold type, and the "." symbol indicates that there is no value at that coordinate.
This procedure is repeated for each of the six templates $(A, B)$. A basis representing the final union $\bigcup \{(\Phi_{\{[A,B]\}} \ominus A) \cap ((\Phi_{\{[A,B]\}})^c \ominus B^c) : [A, B] \in \{[A, B]\}\}$ is then computed using Theorem V.2 and is as shown in Fig. 14. By denoting the basis shown in Fig. 14 by $\{[C, D]\}$, we may then say that $\Phi_{\{[A,B]\}}(\Phi_{\{[A,B]\}}) = \Phi_{\{[C,D]\}}$. That is, two passes of the TI mapping $\Phi_{\{[A,B]\}}$ can be represented by a single TI mapping $\Phi_{\{[C,D]\}}$. The basis representing any number of iterations of $\Phi_{\{[A,B]\}}$ may be obtained in the same way. For example, a basis representing three passes of $\Phi_{\{[A,B]\}}$ can be obtained by computing a basis for $\Phi_{\{[A,B]\}}(\Phi_{\{[C,D]\}})$, where $\{[C, D]\}$ is the basis representing $\Phi_{\{[A,B]\}}(\Phi_{\{[A,B]\}})$.

VI. GRAY-SCALE FUNCTION MAPPINGS

Up until now, the images that we have considered have all been binary images. In a binary image, each pixel can have only one of two values, and so we have been able to represent the image by a set $X \in \mathcal{P}(E)$. In gray-scale morphology, a gray-scale image is treated as a function which maps a subset of Euclidean space, which represents the coordinates in an image, to a continuous set of numbers, which represents gray-scale values. A binary image can be represented as a function which has only two values, and this function is called the characteristic function. Meyer (1978) and Sternberg (1982) were among the first to extend binary morphology to gray-scale morphology. The theory was further developed by many authors, notably Haralick et al. (1987), Dougherty and Giardina (1988b), and more recently Heijmans (1991). In the original work of Sternberg the extension from the binary to the gray-scale was made using the concept of an umbra transform, and many authors still use such a notion instead of working with the functions explicitly. This approach is unnecessary, however, and in this work it will not be used.

A. Background Theory

Functions that represent gray-scale images will be denoted by symbols such as $f$, $g$, and $k$. We define $f : F \to \mathbb{R}$, where $F \in \mathcal{P}(E)$ will be called the support of $f$. The symbol $\mathbb{R}$ denotes the set of real numbers including $-\infty$ and $+\infty$. In this work we adopt the convention that any function has a value of $-\infty$ outside its region of support. This simplifies greatly the analysis between functions that have different supports and yet remains a suitable description of a gray-scale image. The space of all functions will be denoted by $\mathrm{Fun}(E)$. An image transformation is considered as a function mapping $\psi : f \to g$, $f, g \in \mathrm{Fun}(E)$, which maps one function to another in $\mathrm{Fun}(E)$. A function mapping $\psi$ will be called increasing if

$$f \le g \Rightarrow \psi(f) \le \psi(g), \quad \forall f, g \in \mathrm{Fun}(E).$$

If $f \in \mathrm{Fun}(E)$, $h \in E$ and $v \in \mathbb{R}$, then we may define horizontal translation $f_h(x) = f(x - h)$ and vertical translation $(f + v)(x) = f(x) + v$. A function mapping $\psi$ will be called translation-invariant if

$$\psi(f_h + v) = [\psi(f)]_h + v, \quad \forall f \in \mathrm{Fun}(E),\ h \in E,\ v \in \mathbb{R}.$$

In this work we assume all function mappings to be translation-invariant and increasing, and we will call such function mappings gray-scale τ-mappings.

1. Flat Gray-Scale τ-Mappings
We consider first the case when $\psi$ is a flat gray-scale τ-mapping. Defining the threshold sets of a function $f$ by

$$X_t(f) = \{x \in E : f(x) \ge t\},$$

we will call $\psi$ a flat gray-scale τ-mapping if

$$\psi(f)(x) = \bigvee \{t : x \in \Psi(X_t(f))\}.$$

Here the symbol "$\vee$" denotes supremum. $\Psi$ is a set mapping which uniquely corresponds to the function mapping $\psi$. The two are related by $\psi(c_X) = c_{\Psi(X)}$, where $c_X(z)$, the characteristic function of a set $X$, is equal to one if $z \in X$ and zero if $z \notin X$. If $\psi$ is a flat gray-scale τ-mapping, then it can be implemented on a function $f$ by decomposing $f$ into its threshold sets $X_t(f)$, mapping each set $X_t(f)$ with the set mapping $\Psi$, and then reconstructing $\psi(f)$ from the transformed sets $\Psi(X_t(f))$. Such an approach avoids the need to use function mappings to transform functions. More importantly, however, it allows many theoretical results that apply to binary morphology to be applied directly to flat gray-scale τ-mappings.

The basis representation for flat gray-scale τ-mappings is given directly by that for binary τ-mappings. The two fundamental flat gray-scale τ-mappings are dilation and erosion, defined respectively as

$$(f \oplus B)(x) = \bigvee_{z \in B} f(x - z), \tag{VI.1}$$

$$(f \ominus B)(x) = \bigwedge_{z \in B} f(x + z). \tag{VI.2}$$
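The threshold decomposition described above can be sketched for a one-dimensional image with integer gray levels. The example is an illustration only, with several assumptions of its own: functions are Python dictionaries from integer positions to integer values, positions missing from the dictionary stand for the value $-\infty$, the set mapping is a binary erosion by a non-empty flat structuring element, and the direct formula used for comparison is the infimum of $f(x + z)$ over $z \in B$.

```python
def threshold_set(f, t):
    """X_t(f): positions where f is at least t."""
    return {x for x, value in f.items() if value >= t}


def binary_erosion(X, B):
    """Points z such that z + b lies in X for every b in B."""
    b0 = next(iter(B))
    return {x - b0 for x in X if all(x - b0 + b in X for b in B)}


def flat_mapping(f, set_mapping):
    """psi(f)(x) = sup{t : x in Psi(X_t(f))}, using the levels present in f."""
    g = {}
    for t in sorted(set(f.values())):
        for x in set_mapping(threshold_set(f, t)):
            g[x] = t  # levels are visited in increasing order, so the last write is the sup
    return g


def flat_erosion_direct(f, B):
    """Minimum of f(x + b) over b in B, defined where every x + b is in the support."""
    b0 = next(iter(B))
    candidates = {y - b0 for y in f}
    return {x: min(f[x + b] for b in B)
            for x in candidates if all(x + b in f for b in B)}


f = {0: 3, 1: 5, 2: 4, 3: 1}
B = {0, 1}
via_thresholds = flat_mapping(f, lambda X: binary_erosion(X, B))
direct = flat_erosion_direct(f, B)
print(via_thresholds == direct)
```

Only the gray levels that actually occur in $f$ need to be visited, because the threshold sets do not change between consecutive levels.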
In Eqs. (VI.1) and (VI.2), $B \in \mathcal{P}(E)$ is a structuring element and the symbol "$\wedge$" denotes infimum. If $\psi$ is a flat gray-scale τ-mapping then it has the basis representation

$$\psi(f) = \bigvee_{A \in \mathcal{B}(\Psi)} f \ominus A = \bigwedge_{A^* \in \mathcal{B}(\Psi^*)} f \oplus A^*. \tag{VI.3}$$

The basis for $\psi$ is simply that for its corresponding set mapping $\Psi$. The basis algorithms that have been developed for binary τ-mappings can be used to compute it. Moreover, any other results derived for the basis representation of binary τ-mappings apply directly to flat gray-scale τ-mappings.

2. General Gray-Scale τ-Mappings

The theory for the basis representation of general gray-scale τ-mappings was developed simultaneously by Dougherty and Giardina (1988a, b), by Maragos (1989), and by Maragos and Schafer (1987a, b). We begin with the definitions of gray-scale dilation and erosion. Respectively,

$$(f \oplus g)(x) = \bigvee_{z \in E} [f(x - z) + g(z)], \tag{VI.4}$$

$$(f \ominus g)(x) = \bigwedge_{z \in E} [f(x + z) - g(z)]. \tag{VI.5}$$
The function $g$ is known as a structuring function. If it is assumed that all functions have a discrete range, then the more familiar maximum and minimum operators may be used in place of the supremum and infimum operators. Further, considering that any function is defined to have a value of $-\infty$ outside its region of support, the preceding definitions can be reduced to the expressions
$(f \oplus g)(x) = \max_{z \in G,\ (x - z) \in F} (f(x - z) + g(z))$,
$(f \ominus g)(x) = \min_{z \in G,\ (x + z) \in F} (f(x + z) - g(z))$,
where $F$ and $G$ denote the regions of support of $f$ and $g$.
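The reduced max/min expressions can be illustrated with a short sketch. The assumptions here are not from the text: functions are stored as Python dictionaries mapping integer points to values (points missing from the dictionary stand for $-\infty$), and the erosion is kept only at points where the translated support of $g$ fits inside the support of $f$, which is what the $-\infty$ convention implies.

```python
def dilate(f, g):
    """(f (+) g)(x) = max over z in G with x - z in F of f(x - z) + g(z)."""
    out = {}
    for y, fy in f.items():          # y plays the role of x - z
        for z, gz in g.items():
            x = y + z
            out[x] = max(out.get(x, fy + gz), fy + gz)
    return out

def erode(f, g):
    """(f (-) g)(x) = min over z in G of f(x + z) - g(z), where every x + z lies in F."""
    out = {}
    for x in {y - z for y in f for z in g}:
        if all(x + z in f for z in g):
            out[x] = min(f[x + z] - gz for z, gz in g.items())
    return out

f = {0: 1, 1: 3, 2: 2, 3: 0, 4: 5}   # hypothetical signal
g = {0: 0, 1: 2}                     # hypothetical structuring function
print(dilate(f, g))
print(erode(f, g))
```

Only additions, subtractions, maxima and minima are needed, which is the property that makes these operators attractive for implementation.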
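These expressions are the definitions of dilation and erosion as they appear in the tutorial of Haralick et al. (1987). If $\psi$ is any gray-scale τ-mapping, then by defining the kernel $\mathcal{K}(\psi)$ as
$\mathcal{K}(\psi) = \{g : \psi(g)(0) \ge 0\}$,
where $g$ ranges over the functions on $E$, we may say that $\psi$ admits the kernel representation
$\psi(f) = \bigvee_{g \in \mathcal{K}(\psi)} f \ominus g = \bigwedge_{g^* \in \mathcal{K}(\psi^*)} f \oplus \check{g}^*$.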
Here the dual τ-mapping is $\psi^*(f) = -(\psi(-f))$ and its kernel is given by $\mathcal{K}(\psi^*) = \{g^*\}$. The reflection of a function $g$ is defined by $\check{g}(z) = g(-z)$, $z \in E$. In this representation, a kernel element $g \in \mathcal{K}(\psi)$ will be redundant if there exists another element $g' \in \mathcal{K}(\psi)$ where $g' \le g$, as $g' \le g \Rightarrow f \ominus g' \ge f \ominus g \Rightarrow (f \ominus g') \vee (f \ominus g) = f \ominus g'$, and similarly $g' \le g \Rightarrow (f \oplus g') \wedge (f \oplus g) = f \oplus g'$. The basis is defined as the set of non-redundant elements of the kernel,
$\mathcal{B}(\psi) = \{g \in \mathcal{K}(\psi) : (g' \in \mathcal{K}(\psi) \text{ and } g' \le g) \Rightarrow g' = g\}$. (VI.6)
The gray-scale τ-mapping $\psi$ is then given by the basis representation
$\psi(f) = \bigvee_{g \in \mathcal{B}(\psi)} f \ominus g = \bigwedge_{g^* \in \mathcal{B}(\psi^*)} f \oplus \check{g}^*$. (VI.7)
The basis representation of $\psi$ is unique and minimal. A gray-scale τ-mapping $\psi$ with a basis $\{g\} = \mathcal{B}(\psi)$ will be denoted by $\psi_{\{g\}}$. For example, any dilation $\psi(f) = f \oplus k$ is a gray-scale τ-mapping and so may be represented by a basis. By defining a pulse function $p_z(y) = -k(-y)$ if $y = z$ and $p_z(y) = -\infty$ otherwise, the basis is then given by $\mathcal{B}(f \to f \oplus k) = \{p_z : -z \in K\}$. The dilation $\psi(f) = f \oplus k$ may now be represented $\psi(f) = \bigvee \{f \ominus p_z : -z \in K\}$. In Fig. 15a is an example for a structuring function $k$ that is defined as a set of four pulse functions. The support of $k$ is given by $K = \{(0, 0), (0, 1), (1, 0), (1, 1)\}$ and the values of $k$ by $k((0, 0)) = 0$, $k((0, 1)) = 1$, $k((1, 0)) = 2$, $k((1, 1)) = 3$ and $k(z) = -\infty$ for $z \notin K$. The basis for dilation consists of the four pulse functions $p_z$, $-z \in K$, as shown in Fig. 15b.

B. Basis Algorithms
FIGURE 15. Basis for gray-scale dilation. (a) Structuring function k. (b) Basis for dilation. Numbers indicate function values; the "." symbol indicates that the function is undefined at that coordinate, and the origin is marked by the box in bold type.

Because certain types of linear mappings fall into the class of gray-scale τ-mappings, there is a considerable amount of literature devoted to basis algorithms for gray-scale τ-mappings. For example, Dougherty and Kraus (1991) have presented work on the representation of digital moving average filters, and Khosravi and Schafer (1993) on linear τ-mappings with a discrete range. The advantage of using the basis representation is that it uses an algebra of supremum, infimum, and additions and does not employ any of the multiplications generally required for the standard implementation of linear mappings. However, the disadvantage is that only under certain conditions is the basis finite and thus of practical utility. In the work of Khosravi and Schafer (1993), it is demonstrated that a finite basis representation can be obtained by quantizing functions into a finite number of discrete intervals (which is most often the real situation when processing images).

C. The General Basis Algorithm
Despite the fact that the basis representation of nonlinear gray-scale τ-mappings is almost invariably finite, there has been little work on basis algorithms for these mappings. In the following we present a general basis algorithm based on the approach adopted for binary τ-mappings in Section II and TI mappings in Section V. We proceed by deriving the following three results for the translation, supremum, and infimum of gray-scale τ-mappings. Some examples of the algorithm are given in Section VI,D. The work generalizes some results that have previously appeared in Jones and Svalbe (1992c, 1994b).
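Theorem VI.1. The basis for the translation of $\psi_{\{g\}}$ horizontally through $h \in E$ and vertically through $v \in \mathbb{R}$ is given by
$\mathcal{B}((\psi_{\{g\}})_h + v) = \{g_{-h} - v : g \in \mathcal{B}(\psi)\}$.

Theorem VI.2. The basis for $\psi_{\{g\}} \vee \psi_{\{h\}}$ is given by
$\mathcal{B}(\psi_{\{g\}} \vee \psi_{\{h\}}) = \{g_a\} \cup \{h_a\}$,
where $\{g_a\} = \{g \in \{g\} : g \not> h, \forall h \in \{h\}\}$ and $\{h_a\} = \{h \in \{h\} : h \not> g, \forall g \in \{g\}\}$.

Theorem VI.3. The basis for $\psi_{\{g\}} \wedge \psi_{\{h\}}$ is given by
$\mathcal{B}(\psi_{\{g\}} \wedge \psi_{\{h\}}) = \mathcal{B}(\{k\})$,
where $\{k\} = \{g \vee h : g \in \{g\}, h \in \{h\}\}$.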
This last result requires further attention as it still remains to find an algorithm to compute the basis $\mathcal{B}(\{k\})$. The simplest of such algorithms is to compute the set $\{k\}$ and subsequently remove redundant elements. A more efficient algorithm can be modeled on that suggested for the intersection of binary τ-mappings. For any two elements $g \in \{g\}$, $h \in \{h\}$, $(k = g \vee h) \in \mathcal{B}(\{k\}) \Leftrightarrow (g \vee h) \not> (g' \vee h')$, for all $g' \in \{g\}$, $h' \in \{h\}$ $\Leftrightarrow$ [$(g \vee h) \not> g'$ or $(g \vee h) \not> h'$ or $g \vee h = g' \vee h'$], for all $g' \in \{g\}$, $h' \in \{h\}$. For any given $g \in \{g\}$ we may define a set $\{k_g\} = \{g \vee h : h \in \{h\}\}$ of potential basis elements of $\{k\}$. For each $k_g \in \{k_g\}$ the sets $\{g_{k_g}\} = \{g \in \{g\} : k_g > g\}$ and $\{h_{k_g}\} = \{h \in \{h\} : k_g > h\}$ are computed. If either $\{g_{k_g}\}$ or $\{h_{k_g}\}$ is empty, then $k_g$ is an element of the basis $\mathcal{B}(\{k\})$. If they are both non-empty, then $k_g$ can only be a basis element if $k_g = g_{k_g} \vee h_{k_g}$, for all $g_{k_g} \in \{g_{k_g}\}$, $h_{k_g} \in \{h_{k_g}\}$. This process is repeated for each $k_g \in \{k_g\}$ and for all $g \in \{g\}$ to derive the complete basis $\mathcal{B}(\{k\})$. Note that a condition $g \not> h$ has the more practical form $g \not> h \Leftrightarrow$ ($g = h$ or $g(z) < h(z)$ for some $z \in H$) and a condition $g > h \Leftrightarrow$ ($g \ne h$ and $g(z) \ge h(z)$ for all $z \in H$).
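The three tools of Theorems VI.1 to VI.3 can be sketched as follows. This is a minimal illustration, not the more efficient procedure outlined above: the infimum uses the simplest approach (form every $g \vee h$ and discard redundant elements), structuring functions are assumed to be dictionaries over integer points with missing points standing for $-\infty$, and all names are hypothetical.

```python
def leq(g, h):
    """g <= h pointwise, with -inf outside each support."""
    return all(z in h and v <= h[z] for z, v in g.items())

def minimal(elements):
    """Drop duplicates and redundant elements (those with another element below them)."""
    uniq = []
    for k in elements:
        if k not in uniq:
            uniq.append(k)
    return [k for k in uniq if not any(m != k and leq(m, k) for m in uniq)]

def translate_basis(basis, h, v):
    """Theorem VI.1: basis of the mapping shifted through h and raised by v."""
    return [{z - h: val - v for z, val in g.items()} for g in basis]

def sup_basis(G, H):
    """Theorem VI.2: non-redundant elements of the two bases taken together."""
    return minimal(G + H)

def join(g, h):
    """Pointwise supremum g v h of two structuring functions."""
    out = dict(g)
    for z, v in h.items():
        out[z] = max(out.get(z, v), v)
    return out

def inf_basis(G, H):
    """Theorem VI.3, simplest form: all g v h, then remove redundant elements."""
    return minimal([join(g, h) for g in G for h in H])

def apply_basis(basis, f, x):
    """Value at x of the supremum of erosions of f (f is defined on all integers)."""
    return max(min(f(x + z) - v for z, v in g.items()) for g in basis)

G = [{0: 0, 1: 1}]                    # hypothetical bases
H = [{0: 2}, {-1: 0, 0: 0}]
f = lambda x: (7 * x * x + 3 * x) % 11  # hypothetical test signal
print(sup_basis(G, H))
print(inf_basis(G, H))
```

Evaluating `apply_basis` on the combined bases and comparing with the pointwise maximum or minimum of the two original mappings gives a quick numerical check of each theorem.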
D. Application of the General Basis Algorithm

The translation, supremum, and infimum of gray-scale τ-mappings form the basic tools of the general decomposition algorithm. As with the previous basis algorithms, the approach requires that the gray-scale τ-mapping be formed as a combination of such mappings, for each of which the basis is known, under the algebra of translation, supremum, and infimum. It is clear that both Theorem VI.2 and Theorem VI.3 can be extended to an arbitrary number of gray-scale τ-mappings. From these extensions, bases for the gray-scale equivalents to the dilated and eroded set mappings in Section III,B can be derived. These results can then be applied to gray-scale opening and closing and to cascades of gray-scale τ-mappings. In the following, we give two examples demonstrating how the general basis algorithm can be applied to compute bases for given TI mappings. As for binary τ-mappings and TI mappings, and because of its theoretical importance, we will also derive a result for the dual basis. This result will then be applied to derive the basis for gray-scale closing. The differences between bases for flat and general gray-scale τ-mappings will then be highlighted.

1. The Dual Basis

The dual function mapping $\psi^*(f) = -(\psi(-f))$ is a gray-scale τ-mapping and has a basis representation $\psi^*(f) = \bigvee \{f \ominus g^* : g^* \in \mathcal{B}(\psi^*)\}$. The general basis algorithm can be used to compute the dual basis $\mathcal{B}(\psi^*)$ from any given basis $\mathcal{B}(\psi)$, as follows.
The dual τ-mapping $\psi^*$ may be expressed in terms of the basis $\mathcal{B}(\psi)$ by using the dual basis representation $\psi^*(f) = \bigwedge \{f \oplus \check{k} : k \in \mathcal{B}(\psi)\}$. An infimum of dilations is a gray-scale τ-mapping and has a basis that may be derived using the general basis algorithm. Each dilation $f \oplus \check{k}$ has a basis $\mathcal{B}(f \to f \oplus \check{k}) = \{p_z : z \in K\}$, where the pulse function $p_z(y) = -\check{k}(-y)$ if $y = z$ and $p_z(y) = -\infty$ otherwise. The basis for the infimum of dilations $\bigwedge \{f \oplus \check{k} : k \in \mathcal{B}(\psi)\}$ can then be obtained by using Theorem VI.3, using two input bases $\mathcal{B}(f \to f \oplus \check{k})$ at a time. The dual τ-mapping $\psi^*$ then becomes $\psi^*(f) = \bigvee \{f \ominus g : g \in \mathcal{B}(f \to \bigwedge_k f \oplus \check{k})\}$. The basis $\mathcal{B}(f \to \bigwedge_k f \oplus \check{k})$ may therefore be used to represent $\psi^*$ and, as the basis representation is unique, $\mathcal{B}(f \to \bigwedge_k f \oplus \check{k}) = \mathcal{B}(\psi^*)$. Finally, note that as $\psi^{**} = \psi$, the dual basis of the dual mapping $\psi^*$ is the original basis $\mathcal{B}(\psi)$. An example demonstrating the computation of the dual basis from any given basis is given in the following section.
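The dual basis procedure can be sketched in a few lines under the same illustrative assumptions as before (dictionary-valued structuring functions over integer points, naive redundancy removal, hypothetical names). Each basis element $k$ contributes the pulse basis of its dilation, a pulse at $z$ taking the value $-k(z)$ for $z \in K$, and the pulse bases are combined two at a time with the infimum rule of Theorem VI.3.

```python
from functools import reduce

def leq(g, h):
    """g <= h pointwise, with -inf outside each support."""
    return all(z in h and v <= h[z] for z, v in g.items())

def minimal(elements):
    """Drop duplicates and redundant (non-minimal) elements."""
    uniq = []
    for k in elements:
        if k not in uniq:
            uniq.append(k)
    return [k for k in uniq if not any(m != k and leq(m, k) for m in uniq)]

def join(g, h):
    """Pointwise supremum g v h."""
    out = dict(g)
    for z, v in h.items():
        out[z] = max(out.get(z, v), v)
    return out

def inf_basis(G, H):
    """Theorem VI.3 in its simplest form."""
    return minimal([join(g, h) for g in G for h in H])

def dual_basis(basis):
    """Basis of the dual mapping: infimum over k of the pulse bases of the dilations."""
    pulse_bases = [[{z: -v} for z, v in k.items()] for k in basis]
    return reduce(inf_basis, pulse_bases)

def apply_basis(basis, f, x):
    """Supremum of erosions of f, evaluated at x (f is defined on all integers)."""
    return max(min(f(x + z) - v for z, v in g.items()) for g in basis)

def as_set(basis):
    """Order-independent view of a basis, for comparisons."""
    return {frozenset(g.items()) for g in basis}

B = [{0: 0, 1: 1}, {-1: 2, 0: 0}]        # hypothetical basis
f = lambda x: (7 * x * x + 3 * x) % 11   # hypothetical test signal
B_dual = dual_basis(B)
print(B_dual)
print(dual_basis(B_dual))
```

Applying `dual_basis` twice returns the original basis in this example, in line with the observation that the dual of the dual mapping is the original mapping.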
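2. Gray-Scale Opening and Closing

The gray-scale τ-mapping $\psi = f \circ g$ is an opening and its dual $\psi^* = f \bullet \check{g}$ is a closing. In terms of basis representations,
$\psi = f \circ g = \bigvee_{k \in \mathcal{B}(\psi)} f \ominus k = \bigwedge_{k^* \in \mathcal{B}(\psi^*)} f \oplus \check{k}^*$, (VI.8)
$\psi^* = f \bullet \check{g} = \bigvee_{k^* \in \mathcal{B}(\psi^*)} f \ominus k^* = \bigwedge_{k \in \mathcal{B}(\psi)} f \oplus \check{k}$. (VI.9)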
The basis for a closing $f \bullet \check{g}$ is given by the dual basis $\mathcal{B}(\psi^*)$ and so may be obtained from the basis for opening $\mathcal{B}(\psi)$ using the dual basis algorithm in Section VI,D,1. Note that the basis for gray-scale closing can also be obtained using the general basis algorithm in the same manner that was demonstrated for binary closing in Section III,C. The opening $f \circ g = (f \ominus g) \oplus g = \bigvee_{z \in E} ((f \ominus g)_z + g(z))$, using the definition of dilation in Eq. (VI.4). As the structuring function $g$ is defined to be $-\infty$ outside its region of support, this reduces to $f \circ g = \bigvee_{z \in G} ((f \ominus g)_z + g(z))$ and is a finite supremum of gray-scale τ-mappings. Theorem VI.1 can be used to compute the basis for the translation $\mathcal{B}((f \ominus g)_z + g(z))$ for each $z \in G$, and the resulting bases can be combined under supremum using Theorem VI.2. In Fig. 16a is a structuring function $g$ defined as a set of four pulse functions. Figure 16b illustrates the four elements of the basis for opening $\mathcal{B}(\psi)$, obtained using the general basis algorithm. The basis representing the closing $f \bullet \check{g}$ is given by the basis $\{h\} = \mathcal{B}(\psi^*)$ and can be obtained from $\mathcal{B}(\psi)$ using the dual basis algorithm proposed in Section VI,D,1. The result is shown in Fig. 16c.
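The construction of the opening basis from Theorems VI.1 and VI.2 can be sketched as follows. The gray values of Fig. 16a are not recoverable here, so the sketch borrows the four values given for the structuring function of Fig. 15a as a stand-in; that choice, the dictionary representation over pixel pairs, and all names are assumptions made for illustration.

```python
def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])

def leq(g, h):
    """g <= h pointwise, with -inf outside each support."""
    return all(z in h and v <= h[z] for z, v in g.items())

def minimal(elements):
    """Keep only non-redundant elements."""
    return [k for k in elements if not any(m != k and leq(m, k) for m in elements)]

def opening_basis(g):
    """One translated copy of g per z in G (Theorem VI.1), combined under supremum (VI.2)."""
    shifted = [{sub(y, z): val - gz for y, val in g.items()} for z, gz in g.items()]
    return minimal(shifted)

def erode_at(f, g, x):
    return min(f(add(x, z)) - v for z, v in g.items())

def open_at(f, g, x):
    """Opening evaluated directly as an erosion followed by a dilation."""
    return max(erode_at(f, g, sub(x, z)) + v for z, v in g.items())

def apply_basis(basis, f, x):
    """Supremum of erosions by the basis elements, evaluated at x."""
    return max(erode_at(f, k, x) for k in basis)

g = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3}   # stand-in values
f = lambda p: (5 * p[0] * p[0] + 3 * p[0] * p[1] + 7 * p[1]) % 13  # test image
basis = opening_basis(g)
for k in basis:
    print(k)
```

Each basis element is a copy of $g$ shifted so that one of its pixels sits at the origin with value zero, which is why a four-pixel structuring function yields four elements.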
FIGURE 16. Bases for gray-scale opening and closing. (a) Structuring function g. (b) Basis for opening. (c) Basis for closing. (d) and (e) Other examples of g.
If different gray-scale values are used in the structuring function, then the number of basis elements may change dramatically. For example, using the structuring function shown in Fig. 16d, the basis for the closing $f \bullet \check{g}$ has 30 elements. The basis using the original structuring function in Fig. 16a has only 12 elements, despite the apparent similarity between the two structuring functions. The reason for the difference lies in the sensitivity of the constraint on basis elements. If $k$ and $k'$ are two elements of the kernel, then $k < k'$ implies that $k'$ is redundant and is not an element of the basis. Even a slight change in gray-scale values can nullify $k < k'$, however, and admit $k'$ as an element of the basis. If a gray-scale τ-mapping is to be implemented via its basis representation, then gray-scale values must be chosen with care as, in terms of implementation, it is preferable to have the minimum possible number of basis elements. The minimal basis is always given by the case that corresponds to a flat gray-scale τ-mapping. In Fig. 16e is an example of a structuring function where all the gray-scale values are zero. The use of such a structuring function reduces a general gray-scale τ-mapping to a flat gray-scale τ-mapping. Consider, for example, the expressions for general gray-scale dilation and erosion in Eqs. (VI.4) and (VI.5). If $g(z)$ is defined as zero for all $z \in G$, then these
equations reduce to those for flat dilations and erosions in Eqs. (VI.1) and (VI.2). The basis for the closing $\psi(f) = f \bullet \check{g}$ using the flat structuring function $g$ in Fig. 16e may be obtained in two ways. If $g$ is treated as a function with zero values, as depicted, then it may be obtained as before for the structuring function in Fig. 16a. The result is the same basis as that shown in Fig. 16c, except that all the gray-scale values are zero. Alternatively, $g$ can be treated as a set $G = \{(0, 0), (0, 1), (1, 0), (1, 1)\}$ and the closing $\psi(f) = f \bullet \check{g}$ as a flat gray-scale closing $\psi(f) = f \bullet \check{G}$. The basis $\mathcal{B}(\psi)$ is given by $\mathcal{B}(\Psi)$, where $\Psi = X \bullet \check{G}$ is the set mapping which uniquely corresponds to $\psi$ via the characteristic function (in short, $\Psi$ is the binary equivalent to $\psi$). Both descriptions may be used as a representation for $\psi$, the former using the basis representation in Eq. (VI.7) and the latter using that in Eq. (VI.3).
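The reduction from general to flat operators when all gray values are zero can be checked with a short sketch. It assumes a one-dimensional signal defined on all integers and a hypothetical support $G$; the function names are illustrative only.

```python
def gray_dilate_at(f, g, x):
    """General gray-scale dilation at x: max of f(x - z) + g(z) over the support of g."""
    return max(f(x - z) + v for z, v in g.items())

def gray_erode_at(f, g, x):
    """General gray-scale erosion at x: min of f(x + z) - g(z) over the support of g."""
    return min(f(x + z) - v for z, v in g.items())

def flat_dilate_at(f, B, x):
    """Flat dilation at x: max of f(x - b) over b in B."""
    return max(f(x - b) for b in B)

def flat_erode_at(f, B, x):
    """Flat erosion at x: min of f(x + b) over b in B."""
    return min(f(x + b) for b in B)

G = {-1, 0, 2}                  # hypothetical support
g_zero = {z: 0 for z in G}      # structuring function with all values zero
f = lambda x: (3 * x * x + x) % 7

for x in range(-4, 5):
    print(x, gray_dilate_at(f, g_zero, x), gray_erode_at(f, g_zero, x))
```

With zero gray values the additions and subtractions drop out, leaving the moving maximum and minimum over the set $G$.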
VII. TRANSFORMING THE BASIS REPRESENTATION

Any τ-mapping can be represented as a union of erosions using an appropriate basis. The design and implementation of all τ-mappings can be considered from this common representation, and in this respect the basis representation is potentially very useful. However, the basis representation can become intractable as the size of the region of support of the τ-mapping increases. Consider, for example, a five by five region of pixels in a binary image. In this relatively small region there can form $2^{25}$ different possible data patterns. If this is the region of support of a τ-mapping, then the potential number of elements in its basis is similarly very large. In general there are two solutions to the tractability problem. The first, adopted by Dougherty and Loce, is to introduce an incomplete representation of the τ-mapping by omitting elements of the basis. Exactly which basis elements are dropped is determined by restricting the type of basis elements studied and by using an optimization process. It has been shown that a high degree of optimality can be obtained using a surprisingly small number of basis elements, even five or six basis elements for some representations (Dougherty and Loce, 1992a). The second approach, adopted in this work (see also Jones, 1993), is to transform the basis representation into alternative forms that offer a more tractable representation. Although the new representation sacrifices the parallel form of a union of erosions, the transformation retains all basis elements, and so there is no loss of design optimization. For example, a closing using a square three by three structuring element can be represented as a union of erosions using a basis consisting of 270 elements. Given this basis, the original closing is one possible
representation to which the basis can be transformed. This original representation is neither parallel nor comparable with other τ-mappings, but it is clearly the more tractable representation. The optimal representation of any τ-mapping depends on the particular basis at hand, but a general approach stems from the result that reducing the region of support spanned by the basis elements will generally reduce the total number of basis elements. In the following, this approach is realized by transforming the basis into the form of cascaded τ-mappings. Such representations offer an immediate reduction in the total region of support and are therefore the more likely to yield tractable representations.
A. Basic Tools

It has been shown that a process of translating, uniting, and intersecting τ-mappings constitutes a general basis algorithm that can be used to compute bases for a range of given τ-mappings. To transform the basis representation into different representations, these same basic transforms are used, only in reverse. All possible ways to transform the basis can be accommodated using such an approach. In the following, we establish methods for reversing the union and intersection tools. Note that a new result is not required for the translation of a basis, as this is already a reversible process.

1. Reversing the Union of Two τ-Mappings
From any basis $\{A\}$ it is possible to form two further bases $\{B\}$ and $\{C\}$ such that $\Psi_{\{A\}} = \Psi_{\{B\}} \cup \Psi_{\{C\}}$. This is a reversal of the procedure for the union of two τ-mappings described in Section II. For example, if the basis $\{A\}$ is given by a single structuring element $A$, then a possible representation of $\Psi_A$ is $\Psi_A = \Psi_B \cup \Psi_C$, where $B = A$ and $C \supseteq A$, as then $\Psi_B \cup \Psi_C = (X \ominus B) \cup (X \ominus C) = X \ominus A = \Psi_A$. Note that there are infinitely many solutions for the structuring element $C \supseteq A$. Although somewhat trivial, this example highlights the fact that any basis can be partitioned into two further bases and, because solutions like $C \supseteq A$ are allowed, there are infinitely many ways to perform such a partition. A more general result is summarized in the following theorem.
Theorem VII.1. Any τ-mapping $\Psi_{\{A\}}$ can be represented by $\Psi_{\{A\}} = \Psi_{\{B\}} \cup \Psi_{\{C\}}$, where for all $B \in \{B\}$, $B \supseteq A$ for some $A \in \{A\}$, and for all $C \in \{C\}$, $C \supseteq A$ for some $A \in \{A\}$, and for all $A \in \{A\}$, either $B = A$ for some $B \in \{B\}$ or $C = A$ for some $C \in \{C\}$.

Illustrated in Fig. 17a is an example for a basis $\{A\}$. The total region of support spanned by the basis elements is shown in Fig. 17d. In Figs. 17b and 17c, respectively, are solutions for the bases $\{B\}$ and $\{C\}$ which combine to give $\Psi_{\{A\}} = \Psi_{\{B\}} \cup \Psi_{\{C\}}$. The advantage of using this new representation is evident from the relationship between the bases $\{B\}$ and $\{C\}$. These bases are related by a translation $(-1, 0)$, that is, $\{C\} = \{B_{(-1,0)} : B \in \{B\}\}$. The union $\Psi_{\{B\}} \cup \Psi_{\{C\}}$ then becomes $\Psi_{\{B\}} \cup (\Psi_{\{B\}})_{(1,0)} = \Psi_{\{B\}} \oplus D$, where $D = \{(0, 0), (1, 0)\}$. The total region of support required by this τ-mapping is shown in Fig. 17e, and this region is smaller than that required for the original representation of $\Psi_{\{A\}}$. The point to note from this example is that a reduction in the region of support will generally induce a reduction in the number of basis elements. The extraction of dilations (and erosions) in such a way will be looked at in greater detail next.

2. Reversing the Intersection of Two τ-Mappings

From any basis $\{A\}$ it is possible to form two further bases $\{B\}$ and $\{C\}$ such that $\Psi_{\{A\}} = \Psi_{\{B\}} \cap \Psi_{\{C\}}$. This is the reversal of the procedure for the intersection of two τ-mappings that was discussed in Section II. Recall that the case for the intersection of two τ-mappings is significantly more complex than that for the union of two τ-mappings, and that this is because of the coupling of an intersection with a union. The same situation arises
FIGURE 17. Reversing the union of two τ-mappings. (a) Basis {A}. (b) Basis {B}. (c) Basis {C}. (d) Region of support of basis {A}. (e) Total region of support of basis {B} and structuring element D.
for the reversal of the intersection procedure, but we may avoid any complications with the aid of the dual basis. The relation $\Psi_{\{A\}} = \Psi_{\{B\}} \cap \Psi_{\{C\}}$ has a dual form $\Psi_{\{A^*\}} = \Psi_{\{B^*\}} \cup \Psi_{\{C^*\}}$, where $\{A^*\}$, $\{B^*\}$, and $\{C^*\}$ are the dual bases of $\{A\}$, $\{B\}$, and $\{C\}$. Solutions for $\{B\}$ and $\{C\}$ may therefore be obtained by first transforming the given basis $\{A\}$ to the dual basis $\{A^*\}$, partitioning $\{A^*\}$ into two bases $\{B^*\}$ and $\{C^*\}$ that satisfy $\Psi_{\{A^*\}} = \Psi_{\{B^*\}} \cup \Psi_{\{C^*\}}$, and then transforming $\{B^*\}$ and $\{C^*\}$ to their respective dual bases $\{B\}$ and $\{C\}$. More formally, we have the following theorem.
Theorem VII.2. Any τ-mapping $\Psi_{\{A\}}$ can be represented by $\Psi_{\{A\}} = \Psi_{\{B\}} \cap \Psi_{\{C\}}$, where for all $B^* \in \{B^*\}$, $B^* \supseteq A^*$ for some $A^* \in \{A^*\}$, and for all $C^* \in \{C^*\}$, $C^* \supseteq A^*$ for some $A^* \in \{A^*\}$, and for all $A^* \in \{A^*\}$, either $B^* = A^*$ for some $B^* \in \{B^*\}$ or $C^* = A^*$ for some $C^* \in \{C^*\}$.

3. Multiple Transformations

Any given τ-mapping can be transformed into a union or an intersection of two further τ-mappings. This process can then be repeated for each new τ-mapping created, and in this way a limitless number of possible transformations of the original τ-mapping can be made. But the ultimate goal is of course a tractable representation, and we need to establish a method of combining the elementary transformations in Theorems VII.1 and VII.2 so that they will ultimately lead to this goal. Furthermore, the role of translation in this process needs to be developed. In the remainder of this work we will address these issues by restricting the space of solutions to cascaded solutions. More precisely, Theorems VII.1 and VII.2 will be modified so as to yield a cascade of two τ-mappings. Such transformations reduce the region of support spanned by the basis, and in so doing reduce the total number of basis elements.

B. Serial Transformations
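We can modify Theorem VII.1 to the form $\Psi_{\{A\}} = (\Psi_{\{B\}})_{c_1} \cup (\Psi_{\{B\}})_{c_2} \cup \cdots \cup (\Psi_{\{B\}})_{c_N} = \Psi_{\{B\}} \oplus C$, where $C = \{c_i : i = 1, \ldots, N\}$. This transformation, an example of which was given in Section VII,A, is a cascade of a τ-mapping $\Psi_{\{B\}}$ and a dilation by $C$. In a similar way, Theorem VII.2 can be modified to the cascaded form $\Psi_{\{A\}} = (\Psi_{\{B\}})_{-c_1} \cap (\Psi_{\{B\}})_{-c_2} \cap \cdots \cap (\Psi_{\{B\}})_{-c_N} = \Psi_{\{B\}} \ominus C$, where $C = \{c_i : i = 1, \ldots, N\}$. As, in fact, $\Psi_{\{A\}} = \Psi_{\{B\}} \ominus C \Leftrightarrow \Psi_{\{A^*\}} = \Psi_{\{B^*\}} \oplus \check{C}$, we need only solve the first equation $\Psi_{\{A\}} = \Psi_{\{B\}} \oplus C$.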
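The first cascaded form can be illustrated on binary images in the spirit of the Fig. 17 example: a basis $\{C\}$ obtained by translating every element of $\{B\}$ through $(-1, 0)$ gives a union equal to $\Psi_{\{B\}}$ followed by a dilation with $D = \{(0, 0), (1, 0)\}$. The basis elements and test image below are hypothetical, chosen only to make the sketch runnable.

```python
def translate(X, t):
    """Translate of a set of pixels X through the vector t."""
    return {(x + t[0], y + t[1]) for x, y in X}

def erode(X, A):
    """X (-) A: intersection of the translates X_{-a}, a in A."""
    return set.intersection(*(translate(X, (-a[0], -a[1])) for a in A))

def dilate(X, C):
    """X (+) C: union of the translates X_c, c in C."""
    return set().union(*(translate(X, c) for c in C))

def psi(X, basis):
    """Union of erosions by the basis elements."""
    return set().union(*(erode(X, A) for A in basis))

basis_B = [{(0, 0), (0, 1)}, {(0, 0), (1, 1)}]            # hypothetical basis {B}
basis_C = [translate(B, (-1, 0)) for B in basis_B]        # {C} = {B_(-1,0)}
D = {(0, 0), (1, 0)}

X = {(i, j) for i in range(5) for j in range(4)} - {(2, 2)}  # test image
lhs = psi(X, basis_B) | psi(X, basis_C)
rhs = dilate(psi(X, basis_B), D)
print(sorted(lhs) == sorted(rhs))
```

The cascade evaluates the erosions of $\{B\}$ once and reuses the result for every translate, which is where the reduction in the region of support comes from.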
In general, it is unlikely that any given τ-mapping $\Psi_{\{A\}}$ will be of the exact form $\Psi_{\{A\}} = \Psi_{\{B\}} \oplus C$, and so we will consider the two equations
$\Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}$ (VII.1)
and
$\Psi_{\{B\}} \oplus C \supseteq \Psi_{\{A\}}$. (VII.2)
If an exact solution $\Psi_{\{A\}} = \Psi_{\{B\}} \oplus C$ does exist it will also be a solution for either (VII.1) or (VII.2). In (VII.1), $\Psi_{\{A\}}$ is underestimated by a serial τ-mapping, and in (VII.2), it is overestimated. The representation of $\Psi_{\{A\}}$ may be completed by combining solutions under either union or intersection, as will be detailed next.
1. Solutions Underestimating Ψ

In order to solve Eq. (VII.1), a basis $\{B\}$ and a structuring element $C$ that satisfy $\Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}$ must be found. Note that for any basis $\{A\}$ there will always exist some solution to this equation, as there is always the trivial solution where $\{B\} = \{A\}$ and $C$ is the origin. The range of all solutions is given in the following lemma.
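Lemma VII.1. For any given structuring element $C$, all solutions for $\Psi_{\{B\}}$ that satisfy $\Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}$ are given by $\Psi_{\{B\}} \subseteq \Psi_{\{B_{\max}\}}$, where $\Psi_{\{B_{\max}\}} = \Psi_{\{A\}} \ominus C$.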
For any structuring element $C$ there exists a solution for the basis $\{B_{\max}\}$, and any basis $\{B\}$ that admits $\Psi_{\{B\}} \subseteq \Psi_{\{B_{\max}\}}$ (that is, every $B \in \{B\}$ is a superset of some element of $\{B_{\max}\}$) is also an allowed solution. The basis $\{B_{\max}\}$ itself is given by $\mathcal{B}(\Psi_{\{A\}} \ominus C)$, and this can be obtained from any $\{A\}$ and $C$ using the general basis algorithm, as was demonstrated in Section III,B. Although a basis $\{B\}$ and structuring element $C$ may satisfy the constraint $\Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}$, we must also consider the fact that we require these solutions to combine to complete the representation of $\Psi_{\{A\}}$. We will therefore impose a further constraint that $\Psi_A \subseteq \Psi_{\{B\}} \oplus C$ for some $A \in \{A\}$. In this way, if a solution for $\Psi_{\{B\}} \oplus C$ is computed for each $A \in \{A\}$ then a union $\bigcup_A \{\Psi_{\{B\}} \oplus C : \Psi_A \subseteq \Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}\} = \Psi_{\{A\}}$, and the representation of $\Psi_{\{A\}}$ is complete. The range of possible solutions is given in the following lemma.

Lemma VII.2. For any given structuring element $C$, all solutions for $\Psi_{\{B\}}$ that satisfy $\Psi_A \subseteq \Psi_{\{B\}} \oplus C$ for some $A \in \{A\}$ are given by $\Psi_{\{B\}} \supseteq \Psi_{B_{\min}}$, where $B_{\min} = A_z$, $z \in C$.
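Lemma VII.1 rests on the adjunction between dilation and erosion: for sets $Y$ and $Z$, $Y \oplus C \subseteq Z$ holds exactly when $Y \subseteq Z \ominus C$, applied with $Y = \Psi_{\{B\}}(X)$ and $Z = \Psi_{\{A\}}(X)$ for each image $X$. The sketch below checks this relation by brute force over all subsets of a small one-dimensional window; the window size, the element $C$, and the function names are illustrative assumptions.

```python
from itertools import combinations

def dilate(Y, C):
    """Y (+) C = {y + c}."""
    return {y + c for y in Y for c in C}

def erode(Z, C):
    """Z (-) C = {x : x + c in Z for every c in C}."""
    candidates = {z - c for z in Z for c in C}
    return {x for x in candidates if all(x + c in Z for c in C)}

def subsets(points):
    """All subsets of a finite collection of points."""
    pts = list(points)
    return [set(s) for r in range(len(pts) + 1) for s in combinations(pts, r)]

C = {0, 1}                       # hypothetical structuring element
window = subsets(range(5))       # the 32 subsets of {0, ..., 4}

adjunction_holds = all(
    (dilate(Y, C) <= Z) == (Y <= erode(Z, C))
    for Y in window for Z in window
)
print(adjunction_holds)
```

Because the largest admissible $Y$ is $Z \ominus C$ itself, the upper bound $\Psi_{\{B_{\max}\}} = \Psi_{\{A\}} \ominus C$ follows directly.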
A solution for a basis $\{B\}$ must satisfy both Lemma VII.1 and Lemma VII.2 simultaneously if it is to admit $\Psi_A \subseteq \Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}$. We find in such a case that the solution space for the structuring element $C$ becomes constrained. The complete result is given in the following lemma.
Lemma VII.3. A basis $\{B\}$ and structuring element $C$ satisfy $\Psi_A \subseteq \Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}$ for some $A \in \{A\}$ iff (i) $A_z \supseteq B$ for some $B \in \{B\}$, (ii) all $B \in \{B\}$ are supersets of some $B_{\max} \in \{B_{\max}\}$, and (iii) $C \subseteq T_A + z$, where the set $T_A = \{t : A \supseteq A'_t, A' \in \{A\}\}$ and $z \in C$. The basis $\{B_{\max}\}$ is as defined in Lemma VII.1.

The first two constraints dictate that the basis $\{B\}$ must lie within the range of the structuring element $B_{\min}$ and the basis $\{B_{\max}\}$, and the third constraint restricts the space of solutions for the structuring element $C$. Shown in Fig. 18a is a simple basis $\{A\}$. Directly below each basis element, in Fig. 18b, is a corresponding set $T_A = \{t : A \supseteq A'_t, A' \in \{A\}\}$. For any given $A$, the set $T_A$ is given by the set of translates $t$ that satisfy $A \supseteq A'_t$, where $A'$ is another element of $\{A\}$. To admit $\Psi_{A_1} \subseteq \Psi_{\{B\}} \oplus C$ for example, where $A_1$ is the first basis element shown, the structuring element $C$ must be a subset of $T_{A_1} = \{(0, 0), (-1, 0)\}$ translated through some $z \in C$. One solution is the set $C = T_{A_1}$ itself, where $z$ is the origin. The basis $B_{\min}$ is then given by $B_{\min} = A_z = \{(0, 0)\}$. The basis $\{B_{\max}\}$, given by $\mathcal{B}(\Psi_{\{A\}} \ominus C)$, can be obtained using the general basis algorithm and is as shown in Fig. 18c. All other solutions for $\{B\}$ lie between $B_{\min}$ and $\{B_{\max}\}$ and will admit $\Psi_{A_1} \subseteq \Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}$. A τ-mapping $\Psi_{\{A\}}$ can now be represented $\Psi_{\{A\}} = \bigcup_A \{\Psi_{\{B\}} \oplus C : \Psi_A \subseteq \Psi_{\{B\}} \oplus C \subseteq \Psi_{\{A\}}\}$, where for each $A \in \{A\}$ there corresponds a different
FIGURE 18. Solution underestimating Ψ. (a) Basis {A}. (b) Corresponding sets T_A. (c) Solution for a basis {B_max} using the first element of {A}.
solution for the basis $\{B\}$ and structuring element $C$. Under certain conditions this representation may be simplified. Consider two solutions $\Psi_{A_1} \subseteq \Psi_{\{B\}_1} \oplus C_1$ and $\Psi_{A_2} \subseteq \Psi_{\{B\}_2} \oplus C_2$, where $A_1$ and $A_2$ are two different elements of $\{A\}$. The bases $\{B\}_1$ and $\{B\}_2$ are each bounded by a lower and upper extreme, given by $\Psi_{\{B_{\min}\}_1} \subseteq \Psi_{\{B\}_1} \subseteq \Psi_{\{B_{\max}\}_1}$ and $\Psi_{\{B_{\min}\}_2} \subseteq \Psi_{\{B\}_2} \subseteq \Psi_{\{B_{\max}\}_2}$. If a basis $\{B\}$ satisfies both $\Psi_{\{B_{\min}\}_1} \subseteq \Psi_{\{B\}} \subseteq \Psi_{\{B_{\max}\}_1}$ and $\Psi_{\{B_{\min}\}_2} \subseteq \Psi_{\{B\}} \subseteq \Psi_{\{B_{\max}\}_2}$, then it is an allowed solution for both the basis $\{B\}_1$ and the basis $\{B\}_2$. The range for $\{B\}$ is given by $\Psi_{\{B_{\min}\}} \subseteq \Psi_{\{B\}} \subseteq \Psi_{\{B_{\max}\}}$, where $\Psi_{\{B_{\min}\}} = \Psi_{\{B_{\min}\}_1} \cup \Psi_{\{B_{\min}\}_2}$ and $\Psi_{\{B_{\max}\}} = \Psi_{\{B_{\max}\}_1} \cap \Psi_{\{B_{\max}\}_2}$. The two may then be combined: $(\Psi_{\{B\}_1} \oplus C_1) \cup (\Psi_{\{B\}_2} \oplus C_2) = (\Psi_{\{B\}} \oplus C_1) \cup (\Psi_{\{B\}} \oplus C_2)$. Finally, as dilation commutes with union, this may be further simplified to $\Psi_{\{B\}} \oplus C$, where $C = C_1 \cup C_2$. Other solutions may be combined in the same way and the representation further simplified. If there exists a solution for basis $\{B\}$ that is common to all other solutions, then the τ-mapping $\Psi_{\{A\}}$ may be represented by the single solution $\Psi_{\{A\}} = \Psi_{\{B\}} \oplus C$. Note that, if such a solution exists, it may be found more directly by placing further restrictions on the space of solutions for $\{B\}$ and $C$ in Lemma VII.3. However, we will not pursue this further as the solution for Eq. (VII.2), which is detailed next, gives a more general result.
2. Solutions Overestimating Ψ

The range of solutions that satisfy Eq. (VII.2) is given in the following lemma.
Lemma VII.4. For any given structuring element $C$, all solutions for $\Psi_{\{B\}}$ that satisfy $\Psi_{\{B\}} \oplus C \supseteq \Psi_{\{A\}}$ are given by $\Psi_{\{B\}} \supseteq \Psi_{\{B_{\min}\}}$, where $\Psi_{\{B_{\min}\}} = \bigcup_A (X \ominus A) \oplus \{-c_A\}$, $c_A \in C$.

The basis $\{B_{\min}\}$ is formed by sequentially selecting each $A \in \{A\}$ in turn, translating it through a vector $c \in C$, and including $A_c$ as an element of $\{B_{\min}\}$. There are many ways to form the basis $\{B_{\min}\}$, as each element of $\{A\}$ can be translated through any $c \in C$. A basis $\{B\}$ that satisfies $\Psi_{\{B\}} \supseteq \Psi_{\{B_{\min}\}}$ can be formed by adding any arbitrary structuring elements to the basis $\{B_{\min}\}$.
In order to complete the representation of $\Psi_{\{A\}}$, we will impose the additional constraint that $\Psi_{\{B\}}(X) \oplus C \subseteq X \oplus \check{A}^*$, where $A^*$ is an element of the dual basis $\{A^*\}$. In such a way, if a solution for $\Psi_{\{B\}} \oplus C$ is computed for each $A^* \in \{A^*\}$, then an intersection of all these solutions is a subset of $\bigcap_{A^*} X \oplus \check{A}^*$. But by duality, $\bigcap_{A^*} X \oplus \check{A}^* = \bigcup_A X \ominus A = \Psi_{\{A\}}$ and, as all solutions also satisfy $\Psi_{\{B\}} \oplus C \supseteq \Psi_{\{A\}}$, we have that the intersection of all these solutions is equal to $\Psi_{\{A\}}$.
The range of solutions that satisfy $\Psi_{\{B\}}(X) \oplus C \subseteq X \oplus \check{A}^*$ is given in the following lemma.

Lemma VII.5. For any given structuring element $C$, all solutions for $\Psi_{\{B\}}$ that satisfy $\Psi_{\{B\}}(X) \oplus C \subseteq X \oplus \check{A}^*$ for some $A^* \in \{A^*\}$ are given by $\Psi_{\{B\}} \subseteq \Psi_{\{B_{\max}\}}$, where $\Psi_{\{B_{\max}\}} = (X \oplus \check{A}^*) \ominus C$.

The basis $\{B_{\max}\} = \mathcal{B}((X \oplus \check{A}^*) \ominus C)$ can be obtained from any $A^*$ and $C$ using the general basis algorithm. Again, if a solution for $\{B\}$ is to satisfy both Lemma VII.4 and Lemma VII.5 simultaneously, then the solution space for the structuring element $C$ becomes constrained. The complete result is as given next.

Lemma VII.6. A basis $\{B\}$ and structuring element $C$ satisfy $\Psi_{\{A\}}(X) \subseteq \Psi_{\{B\}}(X) \oplus C \subseteq X \oplus \check{A}^*$ for some $A^* \in \{A^*\}$ iff (i) all $B_{\min} \in \{B_{\min}\}$ are supersets of some $B \in \{B\}$, (ii) all $B \in \{B\}$ are supersets of some $B_{\max} \in \{B_{\max}\}$, and (iii) $C \subseteq B_{\min} \oplus \check{A}^*$, for all $B_{\min} \in \{B_{\min}\}$. The bases $\{B_{\min}\}$ and $\{B_{\max}\}$ are as defined in Lemmas VII.4 and VII.5, respectively.

A τ-mapping $\Psi_{\{A\}}$ may now be represented $\Psi_{\{A\}} = \bigcap_{A^*} \{\Psi_{\{B\}} \oplus C : \Psi_{\{A\}}(X) \subseteq \Psi_{\{B\}}(X) \oplus C \subseteq X \oplus \check{A}^*\}$, where for each $A^* \in \{A^*\}$ there corresponds a solution for a basis $\{B\}$ and structuring element $C$. This representation may be simplified by combining and comparing solutions, in the same way that was detailed in Section VII,B,1 for solutions to Eq. (VII.1). In the following section we focus on a particular simplification of this representation that yields a cascade of two arbitrary τ-mappings.

C. Cascaded τ-Mappings
If there exists a basis $\{B\}$ that is common to all solutions, then Eq. (VII.3) may be simplified to the form of the cascaded τ-mapping
$\Psi_{\{A\}} = \bigcap_{D \in \{D\}} \Psi_{\{B\}} \oplus D$. (VII.4)
Here the τ-mapping $\Psi_{\{A\}}$ is represented by the common τ-mapping $\Psi_{\{B\}}$ and a second τ-mapping in the form of an intersection of dilations. The set of structuring elements $\{D\}$ is given by the basis elements of the set $\{C\}$. Note that there will always be some solution to this equation, as there is always the trivial solution where the basis $\{B\} = \{A\}$ and the basis $\{D\}$ is a single structuring element $D = \{0\}$. The cascaded representation just given may be obtained more directly by restricting the space of allowed solutions for $\{B\}$ and $C$ in Lemma VII.6. The result is given in the following theorem.
This theorem may be implemented as follows. First a solution for the set $\{C_{A^*}\}$ and its dual $\{C^*_A\}$ must be found that satisfies the constraint that for all $A \in \{A\}$ and all $A^* \in \{A^*\}$, $\check{C}^*_A \oplus C_{A^*} \subseteq A \oplus \check{A}^*$. A structuring element $C_{A^*} \in \{C_{A^*}\}$ exists for each $A^* \in \{A^*\}$ and a $C^*_A \in \{C^*_A\}$ for each $A \in \{A\}$. As the basis $\{A\}$ and its dual $\{A^*\}$ are known, each dilation $A \oplus \check{A}^*$ is known and solutions for $\check{C}^*_A \oplus C_{A^*} \subseteq A \oplus \check{A}^*$ can be computed by searching through a finite solution space. The bases $\{B_{\min}\}$ and $\{B_{\max}\}$ can then be computed from the solutions for $\{C_{A^*}\}$ and $\{C^*_A\}$ using the general basis algorithm. Illustrated in Fig. 19 is an example using the basis $\{A\}$ that was shown in Fig. 18. Arranged above the box that is shown are the four basis elements of $\{A\}$, and down the left-hand side of the box are the four elements of the dual basis $\{A^*\}$. Directly below each $A$ and to the right of each $A^*$ is the set $A \oplus \check{A}^*$. For any given row of dilations $A \oplus \check{A}^*$, each solution for $\check{C}^*_A \oplus C_{A^*} \subseteq A \oplus \check{A}^*$ must use the same structuring element $C_{A^*}$. Similarly, for any column of dilations, each solution for $\check{C}^*_A \oplus C_{A^*} \subseteq A \oplus \check{A}^*$ must use the same structuring element $C^*_A$. In addition, $\{C_{A^*}\}$ and $\{C^*_A\}$ must be chosen so that $\{C^*_A\}$ is the dual basis of $\{C_{A^*}\}$.
FIGURE 19. Solution space for {C_{A*}} and {C*_A}.
FIGURE 20. Solution for the cascaded τ-mapping. (a) Solution for {C_{A*}}. (b) Dual basis {C*_A}. (c) Basis {B_min}. (d) Basis {B_max}.
In Fig. 20a is a solution for the set (C,*), where for each A* E (A*) there corresponds a structuring element C,, . In this instance we can find a solution for C,, that is the same for every A* E (A*),and so there need be only one structuring element in the set (Cj).In Fig. 20b is the corresponding solution for the dual basis (CJ). For each A E ( A )there exists some CJ, and the set is shown so as to correspond to the elements of ( A ) in the same order as they appear in Fig. 18. The cascaded representation in Eq. (VII.4) becomes yAl= ye,0 D, where D = C is the single structuring element in the set (CA*). The basis (B,,,] is given by the basis C%(X + UA(X0A ) 0 and is as shown in Fig. 20c. The basis (B,,) is given by @(X + nA4X 0 A*) 0 C,*)and is as shown in Fig. 20d. Any basis that lies within the range of (Bmin) and (BmJis an allowed solution for [ B ] ,and this defines the complete space Note that there exists another solution of solutions using the chosen set (C,,]. for the set (CA4which is also non-trivial, but the example given is the simpler. is used in the representation yA1 = yB1 0 D, then If the basis [Bmm) clearly it would be a less tractable representation of y,. On the other hand, the use of the basis (B,,) will indeed result in a representation that involves fewer basis elements than the original representation. The reduction in the size of the basis here is relatively small when compared to the original, but it should be borne in mind that the original basis ( A ) consists of only four elements and was chosen to admit a pictorial representation rather than a large gain in tractability. Fuller testing of the cascaded transformation awaits the development of a more systematic algorithm to compute the elements of the sets (CA4and [CA*].
VIII. CONCLUSION

In this work we have proposed a general basis algorithm to compute bases for a range of τ-mappings and have subsequently extended these results to TI mappings and to gray-scale τ-mappings. Although the basis representation of a τ-mapping has been established for a long time, there has hitherto been no general method of computing bases for arbitrary τ-mappings. The general basis algorithm proposed is composed of three tools, based on the algebra of union, intersection, and translation. All of these tools involve the combination of just two bases but, when combined with one another and used repeatedly, they can be used to compute bases for τ-mappings such as opening, closing, open-close, close-open, and all other cascaded τ-mappings (such as multiple passes of the median filter). In particular, the algorithm can also be used to compute the dual basis, and this has allowed us to exploit the duality between τ-mappings whenever required.

The general basis algorithm cannot be used to compute bases for all τ-mappings, as it requires that they be formed as a combination of τ-mappings, for each of which the basis is known, under the algebra of translation, union, and intersection. However, if all τ-mappings were to be designed using the basis representation, which is the approach advocated in this work, then the basis representation would always be known. Moreover, it is not by chance that the algebra of the general basis algorithm is precisely the algebra required in the union of erosions representation (recall that an erosion is defined as an intersection of translations). The tools used by the general basis algorithm and the reversal of these tools constitute a complete description of all possible transformations involving the basis representation. In this sense, the methodology that we have proposed is completely self-contained.

The basis affords a parallel representation of a τ-mapping. Any τ-mapping is represented by a union of erosions (or dually an intersection of dilations), and each erosion (or dilation) may be implemented independently.
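The union-of-erosions representation can be illustrated with a small sketch. The code below is not taken from the chapter: it assumes binary images stored as sets of integer points, uses the erosion convention $X \ominus A = \{z : A_z \subseteq X\}$, and relies on the standard fact that the opening by a finite structuring element $B$ has the basis $\{B_{-b} : b \in B\}$. All function names are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Point = Tuple[int, int]
StructElem = FrozenSet[Point]  # a non-empty, finite set of offsets


def translate(a: Iterable[Point], t: Point) -> StructElem:
    """Return the translate A_t = {a + t : a in A}."""
    return frozenset((ax + t[0], ay + t[1]) for ax, ay in a)


def erode(x: Set[Point], a: StructElem) -> Set[Point]:
    """Erosion of X by A: every shift z whose translate A_z lies inside X."""
    a0x, a0y = next(iter(a))  # any one point of A fixes the candidate shifts
    candidates = {(px - a0x, py - a0y) for px, py in x}
    return {z for z in candidates
            if all((ax + z[0], ay + z[1]) in x for ax, ay in a)}


def union_of_erosions(x: Set[Point], basis: Iterable[StructElem]) -> Set[Point]:
    """Evaluate a mapping from its basis; each erosion is independent."""
    out: Set[Point] = set()
    for a in basis:
        out |= erode(x, a)
    return out


def opening(x: Set[Point], b: StructElem) -> Set[Point]:
    """Direct opening: union of all translates of B that fit inside X."""
    out: Set[Point] = set()
    for z in erode(x, b):
        out |= translate(b, z)
    return out


def opening_basis(b: StructElem) -> Set[StructElem]:
    """Assumed basis of the opening by B: the translates B_{-b}, b in B."""
    return {translate(b, (-bx, -by)) for bx, by in b}
```

Because the erosions in `union_of_erosions` do not depend on one another, the loop over the basis could be distributed across workers without changing the result, which is the parallelism referred to in the text.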
The general basis algorithm proposed herein may be applied immediately to compute bases for well-used τ-mappings such as opening, closing, open-close, and close-open in order that they can be implemented in a parallel form. The basis representation is a common platform from which the study, design, and implementation of all τ-mappings can be approached. Direct comparisons can be made between τ-mappings that would otherwise be considered disparate, and this unified description can reveal relationships between τ-mappings that may otherwise be difficult to establish. Moreover, the basis contains all the information necessary to distinguish one τ-mapping from any other. Filtering properties such as over-filtering, under-filtering, and self-duality can be defined completely in terms of the basis representation. By developing this further and incorporating any new properties, it should be possible to establish a complete design methodology that is applicable to all τ-mappings.

The basis exemplifies every possible data structure in the region of support of a τ-mapping, and when this region becomes large, the basis can grow to an unmanageable size. In this work we have attempted to address this problem by transforming the basis representation into forms that offer a more tractable representation, specifically cascaded representations. Such transformations reduce the region of support spanned by the basis elements and thus can induce a reduction in the size and total number of basis elements. This can be a complex affair, and there are still issues that have not been fully resolved. For example, despite the fact that the space of transformations is restricted to those of a cascaded form, many of these solutions may not offer a more tractable representation. A systematic mechanism for avoiding such solutions would be useful. Moreover, a given basis may not admit any cascaded form that is more tractable than the original representation and, if such a basis is to be pursued further, an incomplete representation would need to be used. The optimal incomplete representation would then be an issue to consider.

The parallel representation of the basis uses explicit rather than contingent logic. The examination of the basis for any τ-mapping exposes the detailed workings of that τ-mapping, and this may be used to select optimal structuring elements for τ-mappings or to create new and useful τ-mappings. Alternatively, it may be that the techniques for the serialization of the basis advocated here can be incorporated into the search procedures for an optimal basis that have been developed in the work of Dougherty and Loce. Tractability is of major concern, and the merging of large groups of basis elements into more compact representations would play a central role if full design optimization is to be realized.
APPENDIX
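Proof of Theorem II.1. $(\Psi_{\{A\}})_t = (\bigcup_A X \ominus A)_t = \bigcup_A (X \ominus A)_t = \bigcup_A X \ominus A_{-t} = \bigcup \{X \ominus A_{-t} : A \in \{A\}\}$. If $\{A\}$ is a basis, then $\{A_{-t} : A \in \{A\}\}$ is a basis because, for any $A_{-t}, A'_{-t} \in \{A_{-t} : A \in \{A\}\}$, $A'_{-t} \subseteq A_{-t} \Rightarrow A' \subseteq A$, where $A, A' \in \{A\}$. But $\{A\}$ is a basis, and so $A' \subseteq A \Rightarrow A' = A \Rightarrow A'_{-t} = A_{-t}$, and therefore $\{A_{-t} : A \in \{A\}\}$ forms a basis.

Proof of Theorem II.2. $\Psi_{\{A\}} \cup \Psi_{\{B\}} = (\bigcup_A X \ominus A) \cup (\bigcup_B X \ominus B) = \bigcup \{X \ominus C : C \in (\{A\} \cup \{B\})\} = \bigcup \{X \ominus C : C \in \mathcal{B}(\{A\} \cup \{B\})\}$ by definition of a basis. The basis $\mathcal{B}(\{A\} \cup \{B\}) = \mathcal{B}(\Psi_{\{A\}} \cup \Psi_{\{B\}})$. An element $A \in \{A\}$ is an element of $\mathcal{B}(\{A\} \cup \{B\})$ iff $A \not\supseteq B$, $\forall B \in \{B\}$: as $\{A\}$ itself forms a basis, no comparison among the elements of $\{A\}$ is needed and this constraint suffices. The set $\{A_a\} = \{A \in \{A\} : A \not\supseteq B, \forall B \in \{B\}\}$ therefore defines all possible solutions from $\{A\}$. Similarly, the set $\{B_a\} = \{B \in \{B\} : B \not\supset A, \forall A \in \{A\}\}$ defines all possible solutions from $\{B\}$. The union $\{A_a\} \cup \{B_a\}$ forms the complete set of solutions.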
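Proof of Theorem II.3. $\Psi_{\{A\}} \cap \Psi_{\{B\}} = (\bigcup_A X \ominus A) \cap (\bigcup_B X \ominus B) = \bigcup_{A,B} [(X \ominus A) \cap (X \ominus B)]$ from the property of general distributivity. But $(X \ominus A) \cap (X \ominus B) = X \ominus (A \cup B) = X \ominus C$, where $C = A \cup B$. The union $\bigcup_{A,B} [(X \ominus A) \cap (X \ominus B)] = \bigcup_C X \ominus C$, where $\{C\} = \{A \cup B : A \in \{A\}, B \in \{B\}\}$. By definition of a basis, $\bigcup_C X \ominus C = \bigcup \{X \ominus D : D \in \mathcal{B}(\{C\})\}$, and so $\Psi_{\{A\}} \cap \Psi_{\{B\}} = \bigcup \{X \ominus D : D \in \mathcal{B}(\{C\})\}$.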
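The three tools established by Theorems II.1 to II.3 can be sketched in a few lines. This is an illustrative reading of the proofs rather than the authors' implementation: bases are assumed to be Python sets of `frozenset` structuring elements, a basis is taken to be the family of minimal elements under set inclusion, and the helper names are hypothetical. `apply_mapping` is included only so that the results can be checked against direct evaluation.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Point = Tuple[int, int]
StructElem = FrozenSet[Point]
Basis = Set[StructElem]


def minimal_elements(family: Iterable[StructElem]) -> Basis:
    """Keep only the sets that strictly contain no other member of the family."""
    uniq = set(family)
    return {a for a in uniq if not any(b < a for b in uniq)}


def translate_basis(basis: Basis, t: Point) -> Basis:
    """Theorem II.1: the mapping translated by t has the basis {A_{-t}}."""
    return {frozenset((x - t[0], y - t[1]) for x, y in a) for a in basis}


def union_basis(basis_a: Basis, basis_b: Basis) -> Basis:
    """Theorem II.2: minimal elements of the pooled family {A} U {B}."""
    return minimal_elements(basis_a | basis_b)


def intersection_basis(basis_a: Basis, basis_b: Basis) -> Basis:
    """Theorem II.3: minimal elements of {A U B : A in {A}, B in {B}}."""
    return minimal_elements({a | b for a in basis_a for b in basis_b})


def apply_mapping(x: Set[Point], basis: Basis) -> Set[Point]:
    """Union of erosions, with X eroded by A = {z : A_z is a subset of X}."""
    out: Set[Point] = set()
    for a in basis:
        a0x, a0y = next(iter(a))
        candidates = {(px - a0x, py - a0y) for px, py in x}
        out |= {z for z in candidates
                if all((ax + z[0], ay + z[1]) in x for ax, ay in a)}
    return out
```

Pooling the families into a Python set removes exact duplicates, so an element that appears in both bases is kept once; this plays the role of the asymmetric comparison ($\not\supseteq$ on one side, $\not\supset$ on the other) in the proof of Theorem II.2.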
o
h o o f of Lemma IV.1. ~,,,(y,~) = UA,(UAX O A ) O A,. But (UAX O A ) A; = (U,X A ) (Ujai,)= n j [ ( U A X@ A ) Q a,,,] = n j [ ( U A X O A ) O (-u,,~)I = n j [ U A ( ( X OA ) O (-ai,j)JI = Ui,,.,,, .,i [flj[(x@Ai,) O (-ai,j)ll = U ;,,..., ;x, [nj7,1(XO (Ai, O ui,j)II = Ui,,.... O + a,,,)]. Therefore, = u IX B : B E 1 ~ 1 1 ,where (BI = [B : B = U j a i , + A;,).
o
o o
. .
~ ~ ~ ( o4 ~ ~ ) iW3k
Proof of Lemma IV.2. As demoystrated in Section III,E, the dual basis @(Y*)= @(X -r n,x @ A). By definition, f I A X O A = ni(U,X @ (a,,,)) = Uj,,..,,jx(,lJn;XO (a,,,)) = Uj,,...,jx(,,,xO (Ui(ai.j)) = U U . 0 B, where [ B )is as defined. Therefore, @(Y*)= &(X n A X @ A) = & ( U B X @ B) = &((B)). +
Proof of Theorem V. I. (@{IA,Bll)r = [ U [ X @(A,B) : [ A ,B] E &(@))I, = U ( ( X @ ( A , B)), : [ A ,B] E &(@)I. But ( X 8 ( A , B)), = ((z + t ) E E : X-, E [ A ,B ] ) = lz' E E:X-,l E [ A ,Bl-,I = X @ ( A ,B)-, and so (@(H,B1})t = U ( X @ ( A , B)-, : [ A , B] E &(a)).Furthermore, the set ( [ A ,B]-, : [ A , 81 E @(@)I forms a basis because for any [ A ,B]-,, [A',B']-, E ( [ A B]-,: , [ A ,B] E &(@)I,[ A ,B]-I G [A',B']-, a [ A ,B] G [A',B'] a [ A ,B] = [A',B'] * [A, B]-, = [A',Bt]-,. Proof of Theorem V.2. a l c 4 , B n U @crc,s} = [ U (X@(A, B) : [ A ,B] E ( [ A , E ([C,DlJll = U ( X @ ( E , F ) : [ E , F E l (([A,BII Blll1U[U~X@(C,D):[C,Dl u ( [ C ,Dll)l = U (X@ ( E , F ) : [E, Fl € @ ( ( [ ABll , U ([C,Dl))). BY the , ] ]U ( [ C ,Dl)) iff [ A , B] (t [C,Dl, definition of a basis, [ A , B] E & ( ( [ A B v[C,Dl E [[C, Dl) (as [ [ A B]] , is a basis we need only compare [ A ,B] with ([c, Dl)). Similarly, [C,Dl E @(( [ A ,BII U ( [ C ,DI1) iff [C,Dl C [ A ,Bl, , The complete set of basis elements is given by &(@([A,B1) v [ A ,B] E ( [ AB]). u @ ( , C , D ~ ]=) ( ( A ,B) E [ ( A ,B)I : [ A ,BI (t [C,Dl, V[C,DI E I[C,DII U [(C,D) E [(c, 0 1 : [C,Dl (t [ A ,BI, v [ A ,BI E ( [ A ,BIII = [ [ A Blal , U([C, by , and ( [ C ,Dl,). definition of ( [ A B],]
.
Proof of Theorem V1.3. @ ( u , ~n] @ ] ( , c , ~=) ( U u , B l x8 ( A , B)) n ( U [ C , D I X @ (C, = Uw,BI,[C,Dl[X @ ( A B, X @ (C, from the property of general distributivity. But X @(A,B) n X @(C,D ) = ( X Q A n Xc O B C ) n ( X @C n X c O P ) = ( X Q A n X O C ) n ( X c Q B C n X c Q DC)= X @ ( A U C ) n X c @ ( B C U p )= X O ( A U C ) n X c @
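Proof of Theorem VI.1. $(\Psi_{\{g\}})_h + v = (\bigvee_g f \ominus g)_h + v = [\bigvee_g (f \ominus g)_h] + v = \bigvee_g [(f \ominus g)_h + v]$. The erosion $(f \ominus g)_h(x) = (f \ominus g)(x - h) = \bigwedge_{z \in E} (f(x - h + z) - g(z)) = \bigwedge_{z' \in E} (f(x + z') - g(z' + h)) = \bigwedge_{z' \in E} (f(x + z') - g_{-h}(z')) = (f \ominus g_{-h})(x)$, and $((f \ominus g_{-h}) + v)(x) = (f \ominus g_{-h})(x) + v = \bigwedge_{z \in E} (f(x + z) - g_{-h}(z)) + v = \bigwedge_{z \in E} (f(x + z) - (g_{-h}(z) - v)) = \bigwedge_{z \in E} (f(x + z) - (g_{-h} - v)(z)) = (f \ominus (g_{-h} - v))(x)$. From which $\bigvee_g [(f \ominus g)_h + v] = \bigvee \{f \ominus (g_{-h} - v) : g \in \{g\}\}$. Furthermore, as $\{g\}$ is a basis, $\{g_{-h} - v : g \in \mathcal{B}(\Psi)\}$ forms a basis because, for any two elements $(g_{-h} - v)$ and $(g'_{-h} - v)$, $(g_{-h} - v) \le (g'_{-h} - v) \Rightarrow g \le g' \Rightarrow g = g' \Rightarrow (g_{-h} - v) = (g'_{-h} - v)$.

Proof of Theorem VI.2. $\Psi_{\{g\}} \vee \Psi_{\{h\}} = (\bigvee_g f \ominus g) \vee (\bigvee_h f \ominus h) = \bigvee \{f \ominus k : k \in (\{g\} \cup \{h\})\} = \bigvee \{f \ominus k : k \in \mathcal{B}(\{g\} \cup \{h\})\}$. By the definition of a basis and using the fact that both $\{g\}$ and $\{h\}$ are themselves bases, $g \in \mathcal{B}(\{g\} \cup \{h\})$ iff $g \not\ge h$, $\forall h \in \{h\}$, and similarly $h \in \mathcal{B}(\{g\} \cup \{h\})$ iff $h \not> g$, $\forall g \in \{g\}$, so that $\mathcal{B}(\{g\} \cup \{h\}) = \{g \in \{g\} : g \not\ge h, \forall h \in \{h\}\} \cup \{h \in \{h\} : h \not> g, \forall g \in \{g\}\} = \{g_a\} \cup \{h_a\}$.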
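Proof of Theorem VI.3. $\Psi_{\{g\}} \wedge \Psi_{\{h\}} = (\bigvee_g f \ominus g) \wedge (\bigvee_h f \ominus h) = \bigvee_{g,h} [(f \ominus g) \wedge (f \ominus h)]$ from the property of general distributivity for arbitrary supremum and infimum. But from the definition of erosion $(f \ominus g) \wedge (f \ominus h) = f \ominus (g \vee h)$, and so $\bigvee_{g,h} [(f \ominus g) \wedge (f \ominus h)] = \bigvee_{g,h} f \ominus (g \vee h) = \bigvee_k f \ominus k$, where $\{k\} = \{g \vee h : g \in \{g\}, h \in \{h\}\}$. The basis $\mathcal{B}(\Psi_{\{g\}} \wedge \Psi_{\{h\}})$ is therefore given by the basis elements of $\{k\}$.

Proof of Theorem VII.1. $\Psi_{\{A\}} = \Psi_{\{B\}} \cup \Psi_{\{C\}} \Leftrightarrow \Psi_{\{A\}} \supseteq \Psi_{\{B\}} \cup \Psi_{\{C\}}$ and $\Psi_{\{A\}} \subseteq \Psi_{\{B\}} \cup \Psi_{\{C\}}$. $\Psi_{\{A\}} \supseteq \Psi_{\{B\}} \cup \Psi_{\{C\}} \Leftrightarrow \Psi_{\{A\}} \supseteq \Psi_{\{B\}}$ and $\Psi_{\{A\}} \supseteq \Psi_{\{C\}} \Leftrightarrow$ for all $B \in \{B\}$, $B \supseteq A$ for some $A \in \{A\}$, and for all $C \in \{C\}$, $C \supseteq A$ for some $A \in \{A\}$. $\Psi_{\{A\}} \subseteq \Psi_{\{B\}} \cup \Psi_{\{C\}} \Leftrightarrow$ for all $A \in \{A\}$, either $B \subseteq A$ for some $B \in \{B\}$ or $C \subseteq A$ for some $C \in \{C\}$. Combining the two constraints yields the final result.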
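The two erosion identities that carry the gray-scale proofs, $(f \ominus g)_h + v = f \ominus (g_{-h} - v)$ from Theorem VI.1 and $(f \ominus g) \wedge (f \ominus h) = f \ominus (g \vee h)$ from Theorem VI.3, can be checked numerically. The sketch below makes several simplifying assumptions that are not in the text: signals are one-dimensional integer lists treated as periodic, structuring functions are dictionaries from offset to value and are taken to be $-\infty$ outside their support, and all names are hypothetical.

```python
from math import inf
from typing import Dict, List

StructFn = Dict[int, int]  # offset -> value; -infinity outside the support


def gray_erode(f: List[int], g: StructFn) -> List[int]:
    """(f eroded by g)(x) = min over z of f(x + z) - g(z), with f periodic."""
    n = len(f)
    return [min(f[(x + z) % n] - gz for z, gz in g.items()) for x in range(n)]


def shift(g: StructFn, h: int) -> StructFn:
    """Translate a structuring function: g_h(z) = g(z - h)."""
    return {z + h: gz for z, gz in g.items()}


def offset(g: StructFn, v: int) -> StructFn:
    """Subtract a gray-level offset: (g - v)(z) = g(z) - v."""
    return {z: gz - v for z, gz in g.items()}


def supremum(g: StructFn, k: StructFn) -> StructFn:
    """Pointwise supremum of two structuring functions."""
    return {z: max(g.get(z, -inf), k.get(z, -inf)) for z in g.keys() | k.keys()}
```

With these helpers, the left-hand side of the first identity is the eroded signal read at $x - h$ plus $v$, and the right-hand side is `gray_erode(f, offset(shift(g, -h), v))`; the second identity compares the pointwise minimum of two erosions with a single erosion by `supremum(g, k)`.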
yB1 OC c YAl0 U,,c(Y;Bl)c E yAl (qB1), E yAl,Vc EC e’€iBI 5 (yAI)-,, Vc E C e yBlE yAI 0 C . Therefore, all solutions for ‘IfBl are given by yBlE yAl0 C and the maximum such solution is given by Y;B,,I = yAI 0 C. Proof of Lemma VZZ.2. The t-mapping yBI0 C = UcECfl,Bl)C= U c E c ( ~ B - c= l ) UB,ceCX 0 (B-,) by Theorem 11.1. Therefore, YA E yBl Proof of Lemma VII.2.
0 C e X 0 A E U B , c E c X0 (B-,) e A 2 B-,, for some z E C, B E ( B ) e A, 2 B, for some z E C, B E (B] e YBmhC y e ] ,where Bmin= A,,z E
c.
Proof of Lemma V11.3. To find a non-empty solution for y e ] ,we must satisfy YBmin E 481, E yiBm,Iand YEminE yBmal. (i) and (ii) follow directly from the first two constraints. YE-. E Y;Bm,I e for some B, E (B,,), B,, 2 B,, * 3B,, E Bcy,,,, O c): A , 2 B,, @ v c E c, 3A’ E ( A ): A , 2 A: e C E + z, where the set = ( t : A 2 A;, A’ E ( A ) ) .This is the constraint on C as it appears in (iii). H
o
Roof of Lemma VZIA c 2 y A 1 0 Uc(yBl)c 2 YAI* UcyB-cl 2 y A 1 * VA E ( A ) ,3~ E C , B E ( B ] : B - ,E A * VA E ( A ) ,~ c EA C , B E [ B ) : B E A O (CAI 6 2 UA(XO A ) 0 ( - c ~ )AS . (U,(X O A ) O ( - cA)) 0 C = UA((XOA)O((-CA)OC)) 2 U A X O A , U A ( X O A ) O ( - c A ] i s also an allowed solution for (B), and because yet2 UA(X0A ) 0 ( - ca), the minimum solution for (B) is given by yBminl = uA(X 0 A ) 0
(-d. Proof of Lemma y11.5.
o
o
y B l ( ~ )c E x A* U , E _ ~ y B I ( ~E) )xc ( y * l ( X ) ) ~ ” E x O A * , v c E c ey B l ( X ) ~ ( ~ O A * ) - c , V C € C * y B l ( X )C (X 0 A*) 0 C. Therefore, all solutions for yBlare given by y B I ( XE ) (X0A*) 0C, and the maximum solution is given by yBWl(X) @A* =
@
( X O A*) 0 c.
Proof of Lemma V11.6. To find a non-empty solution for yBl,we must E yB,l, and C yBmI. (i) and (ii) follow satisfy Y(IBminl E Y E ) , E 0 u ( X 0&in :B,;, E directly from the first two constraints. IBminlI E TB,,I e for all Bmin ,X 0B- E yBmI e for all Emin, X 0B, E ( X O A * ) @ C ef o r a l l B ~ , , 3 B , , ~ a d ( ( X ~ ~ * ) ~ C ) : B m i , 2 B , , o for all B,, , for all c E C, l a E A* :(c + a) E B,, e for all Bmin,C E (Bmin 0A*),which is the constraint on C in (iii).
Proof of Theorem V1Z.3. A basis ( B ] that is common to all solutions in Lemma VII.6 is bounded by two bases [BJ and (B,,]. (i) By Lemma VII.4, a common solution for the basis (B,J is obtained from the expression = UCUA(X0A ) 0( - cA), where cA E C, and can be different
for each structuring element A . UcU,(X 0A ) 0 ( - c,) = U,(X 0A ) 0 (UC- cA) = U,(X 0A ) @ CJ, by using the dual basis (C*}and by discarding redundant sets - c,) 2 PI. (ii) By Lemma VII.5, a common solution for the basis (B,,} is obtained from the expression \yIBW1 = &4X 0A*) 0C,,, where C,, E (Cl can be different for each A* E (A*}. (iii) Solutions for ( B } will exist between the range of (B,J and (BmJ iff A*) 0 c,, e) for all \yIBml e) U,(X 0 A ) @ Cl c n,dX \yIBminl A E [ A } ,A* E (A*],( X 0A ) @ C ’ c ( X @ A*) 0c,, o for all A E ( A } , A* E (A*],( X 0A ) 0 (CJ 0 CA*)C X @ A*. This must hold for all X E W E ) and so (X 0A ) @ ( C’ @ C,*)E X @ A* =$ (A 0A ) @ (C’ @ Cp) C A @ A* * (Cl@ Cp) C A 0 A*. AS (X0A ) @ (A @ A*) = ( X O A ) 0 A* E X @ A*, C’@ c,* E A @A* * ( X O A ) @ (C’ @ CA*)C X 0 A*, and so for all A E ( A } ,A* E (A*},(X0 A ) @ (c’ @ CA*)E X @ A* o for all A E ( A } ,A* E (A*],Cj @ C,, C A @ A*.
REFERENCES

Astola, J., Heinonen, P., and Neuvo, Y. (1987). IEEE Trans. Acoust., Speech, Signal Process. ASSP-35, 1199.
Banon, G. J. F., and Barrera, J. (1991). SIAM J. Appl. Math. 51, No. 6.
Banon, G. J. F., and Barrera, J. (1993). In "International Workshop on Mathematical Morphology" (P. Salembier and J. Serra, eds.), p. 234. Barcelona.
Crimmins, T. R., and Brown, W. R. (1985). IEEE Trans. Aerosp. Electron. Syst. AES-21, 60.
Dougherty, E. R. (1992a). Comput. Vision, Graphics, Image Process.: Image Understanding 55, 36.
Dougherty, E. R. (1992b). Comput. Vision, Graphics, Image Process.: Image Understanding 55, 55.
Dougherty, E. R., and Giardina, C. R. (1986). IEEE Conf. Comput. Vision Pattern Recognition, Miami, p. 534.
Dougherty, E. R., and Giardina, C. R. (1988a). IEEE Conf. Comput. Vision Pattern Recognition, Michigan.
Dougherty, E. R., and Giardina, C. R. (1988b). "Morphological Methods in Image and Signal Processing." Prentice-Hall, Englewood Cliffs, NJ.
Dougherty, E. R., and Kraus, E. (1991). SIAM J. Appl. Math. 51, 1764.
Dougherty, E. R., and Loce, R. P. (1992a). IEEE Conf. Comput. Vision Pattern Recognition, Vol. 3, p. 256.
Dougherty, E. R., and Loce, R. (1992b). In "Mathematical Morphology in Image Processing" (E. Dougherty, ed.). Dekker, New York.
Hadwiger, H. (1950). Math. Z. 53, 210.
Haralick, R. M., Sternberg, S. R., and Zhuang, X. (1987). IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 523.
Heijmans, H. (1991). IEEE Trans. Pattern Anal. Mach. Intell. PAMI-13, 568.
Heijmans, H., and Ronse, C. (1990). Comput. Vision, Graphics, Image Process. 50, 245.
Heijmans, H., and Ronse, C. (1991). Comput. Vision, Graphics, Image Process.: Image Understanding 54, 14.
Jones, R. (1993). In "International Workshop on Mathematical Morphology" (P. Salembier and J. Serra, eds.), p. 239. Barcelona.
Jones, R., and Svalbe, I. (1992a). Pattern Recognition Lett. 13, 123.
Jones, R., and Svalbe, I. (1992b). Pattern Recognition Lett. 13, 175.
Jones, R., and Svalbe, I. (1992c). IAPR Int. Conf. Pattern Recognition, The Hague, Vol. 3, p. 264.
Jones, R., and Svalbe, I. (1994a). IEEE Trans. Pattern Anal. Mach. Intell. 16, 438.
Jones, R., and Svalbe, I. (1994b). IEEE Trans. Pattern Anal. Mach. Intell. 16, 581.
Khosravi, M., and Schafer, R. W. (1993). In "International Workshop on Mathematical Morphology" (P. Salembier and J. Serra, eds.), p. 228. Barcelona.
Loce, R. P. (1993). Ph.D. Thesis, RIT, New York.
Maragos, P. A. (1989). IEEE Trans. Pattern Anal. Mach. Intell. PAMI-11, 586.
Maragos, P. A., and Schafer, R. W. (1987a). IEEE Trans. Acoust., Speech, Signal Process. ASSP-35, 1153.
Maragos, P. A., and Schafer, R. W. (1987b). IEEE Trans. Acoust., Speech, Signal Process. ASSP-35, 1185.
Matheron, G. (1975). "Random Sets and Integral Geometry." Wiley, New York.
Meyer, F. (1978). In "Quantitative Analysis of Micro-structures in Materials Science, Biology and Medicine" (J. L. Chermant, ed.), p. 374. Riederer-Verlag, Stuttgart.
Minkowski, H. (1903). Math. Ann. 57, 447.
Serra, J. (1982). "Image Analysis and Mathematical Morphology." Academic Press, New York.
Serra, J. (1988). "Image Analysis and Mathematical Morphology," Vol. 2. Academic Press, New York.
Serra, J., and Vincent, L. (1992). Circ. Syst. Signal Process. 11, 47.
Song, J., and Delp, E. J. (1990). Comput. Vision, Graphics, Image Process. 50, 308.
Sternberg, S. R. (1982). Lect. Notes Med. Inf. 17, 294.
Sternberg, S. R. (1986). Comput. Vision, Graphics, Image Process. 35, 333.
Svalbe, I. D. (1991a). IEEE Trans. Pattern Anal. Mach. Intell. PAMI-13, No. 12.
Svalbe, I. D. (1991b). In "DICTA-91," p. 258. Melbourne.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS,VOL. 89
Mirror-Bank Energy Analyzers

S. P. KARETSKAYA, L. G. GLICKMAN, L. G. BEIZINA, and YU. V. GOLOSKOKOV

Laboratory of Mass-Spectroscopy, Institute of Nuclear Physics, National Nuclear Center, Republic of Kazakhstan, Alma-Ata 480082, Kazakhstan
I. Introduction
II. Equations for Charged Particle Trajectories in an Electrostatic Field Having a Symmetry Plane
   A. Paraxial Approximation for a Monoenergetic Beam
   B. Dispersion, Geometric, and Chromatic Aberrations
III. Peculiarities of Charged Particle Focusing and Energy Separation in a Mirror with a Two-Dimensional Electric Field
   A. An Image Equation and Dispersion in Energy
   B. Aberrations
IV. Energy Analyzers Based on Mirrors with Two-Plate Electrodes Separated by Direct Slits
   A. Mirrors with Parallel Electrode Plates
   B. The Mirror "with a Wall"
   C. Application of Mirrors with Two-Plate Electrodes Separated by Direct Slits in a Mass Spectrometer
   D. The Two-Cascade Energy Analyzer with Electrostatic Wedge-Shaped Mirrors
V. Peculiarities of Charged Particle Focusing and Separation in Energy in a Transaxial Mirror
   A. Paraxial Approximation
   B. Aberrations
VI. Energy Analyzers Based on Transaxial Mirrors with Two-Plate Electrodes
   A. Two-Electrode Mirrors
   B. Three-Electrode Mirrors
   C. Multicascade Energy Analyzers and Schemes for Mass Analyzers with Transaxial Mirrors
VII. Conclusion
References
I. INTRODUCTION
This review is devoted to a new class of electrostatic energy analyzers that have been proposed and studied at our Institute in the last few years: energy analyzers based on various mirrors with two-plate electrodes. These novel analyzers are characterized by high-quality focusing, large dispersion in energy, design simplicity, and high adaptability to manufacture. In contrast to other mirrors, they have neither openings in an electrode nor grids. Hence, the scattered and secondary particles that usually distort the spectrum are also absent.

The electric field of a two-plate electrode mirror possesses a symmetry plane (the mid-plane), near which the charged particles move. With respect to the character of the field symmetry, the mirrors can be divided into two types: mirrors with a two-dimensional field independent of one of the Cartesian coordinates, and transaxial mirrors, whose field is symmetric relative to an axis perpendicular to the mid-plane. Examples of the two types of mirrors are presented in Fig. 1. Each electrode of the mirrors considered is composed of a pair of identical plates, located symmetrically with respect to the mid-plane. This plane contains the axial trajectory of the charged particle beam. In the first type of mirror, the plates of neighboring electrodes are separated by direct slits, whereas in the second type the slits are curvilinear: their projections onto the mid-plane are sections of concentric rings. Various modifications of the mirrors have been considered: concave and convex transaxial mirrors, mirrors with direct slits having parallel electrode plates, as in Fig. 1a, mirrors with the electrode plates inclined to the mid-plane (wedge-shaped mirrors), and so on.

General properties of charged-particle focusing in systems having a symmetry plane, as well as the peculiarities of charged-particle focusing and energy separation in a mirror with a two-dimensional field and in a transaxial mirror, have been investigated. The relationships revealed have been used in studying the electron-optical properties of the two-plate electrode mirrors.
The geometrical and electrical parameters of various mirrors for energy analyzers have been found by mathematical simulation. It is important that in all cases the function describing the potential distribution can be written in a simple analytical form. A typical feature of the mirrors considered is that the field decreases rapidly and smoothly as the distance from the slits separating the plates of neighboring electrodes increases. At a certain distance from the slits, the field has practically no effect on the trajectory form. The position of the field boundary has been determined by calculation. With correctly chosen electrode sizes, which turn out to be sufficiently small, the fringing field has no noticeable effect on the trajectory form.

FIGURE 1. Mirrors with two-plate electrodes. (a) A mirror with a two-dimensional electric field; (b) a transaxial concave mirror. 1-3, the mirror electrodes; 4, the axial trajectory of a charged particle beam.

The novel analyzers have been shown to be able to surpass many known ones (Afanas'ev and Yavor, 1978; Ballu, 1980; Leckey, 1987; Baranova and Yavor, 1988; Roy and Tremblay, 1990) in focusing quality and in the magnitude of relative dispersion in energy. Beyond that, they possess a number of advantages that open new possibilities for improving the quality of energy and mass analyzers. The plane near which the charged particles move is parallel to the electrode plates, and a beam enters and leaves the field freely. The deviation angle and the distances between the object (or its image) and the boundary of the analyzer field vary within wide limits. The opportunity to control the electron-optical parameters of the mirror effectively via the electrode potentials is also worthy of mention. Experience with the application of the novel mirrors in various devices has shown that they are reliable and stable in operation. In our opinion, in many cases these energy analyzers are preferable to those in widespread use. The purpose of this review is to draw the attention of researchers to these analyzers.
II. EQUATIONS FOR CHARGED PARTICLE TRAJECTORIES IN AN ELECTROSTATIC FIELD HAVING A SYMMETRY PLANE

The electron-optical parameters of all the mirror-type energy analyzers covered by this review have been calculated by a single routine, which is based on the results of the article by Karetskaya and Fedulina (1982b). There, in the framework of the nonrelativistic approximation, the focusing and dispersing properties of electrostatic systems having a symmetry plane were considered, neglecting the space-charge effect. The curvilinear axial trajectory of a charged particle beam is assumed to be located in the symmetry plane of the field (the mid-plane). A curvilinear orthogonal coordinate system s, x, y is introduced, in which the s-axis coincides with the axial trajectory of the beam and is directed along the particle motion. It is assumed that s = 0 at the point where the axial trajectory intersects the field boundary at the mirror entrance. The x-axis is located in the mid-plane, and the y-axis is perpendicular to it. The positive direction of the x-axis is obtained by rotating the tangent to the axial trajectory by 90° counterclockwise (Fig. 2).
An electric field is described by a scalar potential $\varphi^* = \varphi^*(s, x, y)$. On the axial trajectory, $\varphi^* = \varphi^*(s, 0, 0) = \varphi(s)$. The potential is normalized in such a way that the kinetic energy of a particle moving along the axial trajectory (the basic particle) is $W = -e\varphi$. For an arbitrary particle of the beam, $W = -e(\varphi^* + \varepsilon)$, where the constant $\varepsilon$ characterizes the deviation of the initial energy of this particle from that of the basic particle. The set of exact equations for an arbitrary trajectory is written as follows:

$$2(\varphi^* + \varepsilon) q^2 (x'' + k - k^2 x) + x' \frac{d}{ds}\left[(\varphi^* + \varepsilon) q^2\right] = \frac{\partial \varphi^*}{\partial x}, \tag{II.1}$$

$$2(\varphi^* + \varepsilon) q^2 y'' + y' \frac{d}{ds}\left[(\varphi^* + \varepsilon) q^2\right] = \frac{\partial \varphi^*}{\partial y}, \tag{II.2}$$

where

$$q = \left[(1 - kx)^2 + (x')^2 + (y')^2\right]^{-1/2}. \tag{II.3}$$
Primes denote differentiation with respect to $s$; $k = k(s)$ is the curvature of the axial trajectory. Its sign at every point of the trajectory coincides with the sign of the $x$-coordinate of the center of curvature corresponding to this point. At $x = y = 0$, $\varepsilon = 0$, and $\varphi^* = \varphi(s)$, Eqs. (II.1) and (II.2) result in the axial trajectory equation:

$$\varphi_{10} = 2k\varphi, \qquad \varphi_{01} = 0, \tag{II.4}$$

where the notation

$$\varphi_{ij} = \left.\frac{\partial^{\,i+j} \varphi^*}{\partial x^i\, \partial y^j}\right|_{x = y = 0} \tag{II.5}$$

is used. For the trajectories of beam particles, $x$, $y$, $x'$, $y'$, and $\varepsilon$ are considered as quantities of the first order of smallness. The potential $\varphi^*(s, x, y)$ in the vicinity of the axial trajectory is represented as the power series

$$\varphi^* = \varphi + \varphi_{10} x + \frac{\varphi_{20}}{2} x^2 + \frac{\varphi_{02}}{2} y^2 + \frac{\varphi_{30}}{6} x^3 + \frac{\varphi_{12}}{2} x y^2 + \cdots, \tag{II.6}$$

where the terms involving odd powers of $y$ are absent because of the field symmetry. The power-series expansions for $\varphi^*$ and $q$ are substituted into Eqs. (II.1) and (II.2). After the necessary transformations, the approximate trajectory equations are derived, in which the terms needed for the determination of geometrical aberrations of second and third order, as well as second-order chromatic aberrations, are kept (Karetskaya and Fedulina, 1982b). Here, third-order aberrations are not presented. The approximate equations, with quantities of the second order of smallness kept, have the following form:

$$x'' + u_1 x' + u_2 x = -\frac{k\varepsilon}{\varphi} + F^{(2)}, \tag{II.7}$$

$$y'' + u_1 y' + u_3 y = M^{(2)}. \tag{II.8}$$

Here

$$u_1 = \frac{\varphi'}{2\varphi}, \qquad u_2 = 3k^2 - \frac{\varphi_{20}}{2\varphi}, \qquad u_3 = -\frac{\varphi_{02}}{2\varphi},$$

$$F^{(2)} = v_1 x^2 + v_2 x x' + v_3 (x')^2 + v_4 y^2 + v_5 (y')^2 + v_6 \varepsilon x + v_7 \varepsilon x' + v_8 \varepsilon^2,$$

$$M^{(2)} = \eta_1 x y + \eta_2 x y' + \eta_3 x' y' + \eta_4 \varepsilon y + \eta_5 \varepsilon y',$$
where

$$\ldots, \qquad v_7 = \frac{\varphi'}{2\varphi^2}, \qquad v_8 = \frac{k}{\varphi^2}.$$
A. Paraxial Approximation for a Monoenergetic Beam

The set of Eqs. (II.7) and (II.8) is solved by the method of successive approximations. The solutions of the homogeneous linear equations

$$x'' + u_1 x' + u_2 x = 0, \tag{II.9}$$

$$y'' + u_1 y' + u_3 y = 0 \tag{II.10}$$

are written as follows:

$$x = x_0' g + x_0 h, \tag{II.11}$$

$$y = y_0' G + y_0 H, \tag{II.12}$$

where $g(s)$, $h(s)$ and $G(s)$, $H(s)$ are particular solutions of Eqs. (II.9) and (II.10), respectively. The index "0" denotes values of the variables in the object plane $Q_0$. The latter is assumed to be located in the object space, perpendicular to the axial trajectory (or its rectilinear continuation), corresponding to a real (imaginary) object. The particular solutions satisfy the following conditions in the plane $s = 0$:

$$g = L_0, \qquad g' = 1, \qquad h = 1, \qquad h' = 0, \tag{II.13}$$

$$G = L_0, \qquad G' = 1, \qquad H = 1, \qquad H' = 0. \tag{II.14}$$
The modulus of $L_0$ equals the distance between the object plane and the plane $s = 0$. For a real object, $L_0 > 0$, while for an imaginary object, $L_0 < 0$. The particular solutions are found by numerical techniques. Integration is performed from the point where the axial trajectory enters the field ($s = 0$) to the exit point ($s = s_b$). The planes of the Gaussian images $Q_1$ and $Q_2$ for the x- and y-directions of focusing are considered to be located in the image space. These planes are perpendicular to the axial trajectory or its rectilinear continuation. The plane locations are determined as follows:

$$L_1 = -\frac{g_b}{g_b'}, \qquad L_2 = -\frac{G_b}{G_b'}, \tag{II.15}$$

where $|L_1|$ and $|L_2|$ are the distances between the plane $s = s_b$ and the plane $Q_1$ or $Q_2$, respectively. $L_1$ ($L_2$) is positive if the image is real, and negative if it is imaginary. The subscript "b" denotes values of the variables in the plane $s = s_b$. The planes $Q_1$ and $Q_2$, generally speaking, do not coincide. Only in certain particular cases, when some special conditions are satisfied, can $Q_1$ and $Q_2$ be matched. As a result, stigmatic focusing is achieved. For the mirrors considered, the stigmatic focusing conditions are satisfied without difficulty at any location of the object plane. In transaxial mirrors, focal plane coincidence is also of special interest. In the image space, the locations of the foci $F_{x1}$ and $F_{y1}$ are determined by the equalities

$$L(F_{x1}) = -\frac{h_b}{h_b'}, \qquad L(F_{y1}) = -\frac{H_b}{H_b'}, \tag{II.16}$$

where $|L(F_{x1})|$ and $|L(F_{y1})|$ are the distances between the plane $s = s_b$ and the corresponding focal plane; $L(F_{x1})$ and $L(F_{y1})$ are positive for real foci. Evidently, if in a mirror the foci of the image space coincide, then the foci $F_{x0}$ and $F_{y0}$ of the object space coincide, too. The focal lengths for the x- and y-directions are calculated by the formulas

$$f_x = -\frac{1}{h_b'}, \qquad f_y = -\frac{1}{H_b'}. \tag{II.17}$$
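The numerical step described above, integrating the paraxial equation from $s = 0$ to $s = s_b$ with the initial conditions (II.13) and reading off the image position and focal length, might be sketched as follows. This is only an illustration under stated assumptions: the coefficient functions `u1` and `u2` are supplied by the caller (in practice they would come from $\varphi(s)$, $\varphi_{20}(s)$, and $k(s)$), a classical fourth-order Runge-Kutta scheme stands in for whatever integrator the authors used, and the image and focus distances are taken as the points where the straight-line continuations of $g$ and $h$ beyond the field cross zero. All function names are hypothetical.

```python
import math
from typing import Callable, Dict, Tuple

Coeff = Callable[[float], float]


def integrate_paraxial(u1: Coeff, u2: Coeff, s_b: float,
                       w0: float, dw0: float, steps: int = 2000) -> Tuple[float, float]:
    """RK4 integration of w'' + u1(s) w' + u2(s) w = 0 from s = 0 to s = s_b.

    Returns the pair (w, w') at s = s_b.
    """
    ds = s_b / steps

    def rhs(s: float, w: float, dw: float) -> Tuple[float, float]:
        return dw, -u1(s) * dw - u2(s) * w

    s, w, dw = 0.0, w0, dw0
    for _ in range(steps):
        k1w, k1d = rhs(s, w, dw)
        k2w, k2d = rhs(s + ds / 2, w + ds / 2 * k1w, dw + ds / 2 * k1d)
        k3w, k3d = rhs(s + ds / 2, w + ds / 2 * k2w, dw + ds / 2 * k2d)
        k4w, k4d = rhs(s + ds, w + ds * k3w, dw + ds * k3d)
        w += ds / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        dw += ds / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        s += ds
    return w, dw


def x_direction_optics(u1: Coeff, u2: Coeff, s_b: float, L0: float) -> Dict[str, float]:
    """Paraxial x-direction quantities from the particular solutions g and h."""
    g_b, dg_b = integrate_paraxial(u1, u2, s_b, L0, 1.0)   # g(0) = L0, g'(0) = 1
    h_b, dh_b = integrate_paraxial(u1, u2, s_b, 1.0, 0.0)  # h(0) = 1,  h'(0) = 0
    # Beyond the field g and h are straight lines; the image (focus) lies
    # where the continuation of g (h) crosses zero.
    L1 = -g_b / dg_b if dg_b != 0.0 else math.inf
    L_focus = -h_b / dh_b if dh_b != 0.0 else math.inf
    f_x = -1.0 / dh_b if dh_b != 0.0 else math.inf  # telescopic when h' = 0
    return {"L1": L1, "L_focus": L_focus, "f_x": f_x}
```

The same routine applies to the y-direction by passing $u_3$ in place of $u_2$. A case with $h_b' = 0$ is reported with an infinite focal length, which corresponds to a telescopic system.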
In the two-dimensional field mirror, the foci $F_{x0}$ and $F_{x1}$ are always located at infinity. In other words, in the x-direction such a mirror is always a telescopic system.

B. Dispersion, Geometric, and Chromatic Aberrations
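(II.20)

$$K_i = \frac{1}{\sqrt{\varphi_0}} \left( g \int_0^s h \sqrt{\varphi}\, \kappa_i \, ds - h \int_0^s g \sqrt{\varphi}\, \kappa_i \, ds \right), \qquad i = 1, 2, \ldots, 9, \tag{II.21}$$

$$M_j = \frac{1}{\sqrt{\varphi_0}} \left( G \int_0^s H \sqrt{\varphi}\, \mu_j \, ds - H \int_0^s G \sqrt{\varphi}\, \mu_j \, ds \right), \qquad j = 1, 2, \ldots, 6, \tag{II.22}$$

where $\varepsilon_0 = \varepsilon / \varphi_0$, $\varphi_0$ is the potential of the object space,

$$\kappa_8 = 2 v_1 h D + v_2 (hD)' + 2 v_3 h' D' + v_6 h + v_7 h',$$
$$\kappa_9 = v_1 D^2 + v_2 D D' + v_3 (D')^2 + v_6 D + v_7 D' + v_8,$$
$$\mu_1 = \eta_1 g G + \eta_2 g G' + \eta_3 g' G',$$
$$\mu_2 = \eta_1 h G + \eta_2 h G' + \eta_3 h' G',$$
$$\mu_3 = \eta_1 g H + \eta_2 g H' + \eta_3 g' H',$$
$$\mu_4 = \eta_1 h H + \eta_2 h H' + \eta_3 h' H',$$
$$\mu_5 = \eta_1 D G + \eta_2 D G' + \eta_3 D' G' + \eta_4 G + \eta_5 G',$$
$$\mu_6 = \eta_1 D H + \eta_2 D H' + \eta_3 D' H' + \eta_4 H + \eta_5 H'.$$

The integrands in Eqs. (II.20)-(II.22) are equal to zero beyond the field, so, when calculating $D$, $K_i$, $M_j$ in the image space, it is sufficient to integrate up to the point where the axial trajectory leaves the field, $s = s_b$.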
Linear dispersion $D$ and the aberration coefficients $K_i$ in the plane $Q_1$ are calculated by the formulas

(II.23)

$$K_i = -\frac{m_x}{\sqrt{\varphi_0}} \int_0^{s_b} g \sqrt{\varphi}\, \kappa_i \, ds, \qquad i = 1, 2, \ldots, 9. \tag{II.24}$$

In the plane $Q_2$ the aberration coefficients are as follows:

$$M_j = -\frac{m_y}{\sqrt{\varphi_0}} \int_0^{s_b} G \sqrt{\varphi}\, \mu_j \, ds, \qquad j = 1, 2, \ldots, 6, \tag{II.25}$$

where $m_x$ and $m_y$ are the linear magnifications in the x- and y-directions. In the focal plane containing the focus $F_{x1}$,

(II.26)

$$K_i = \frac{f_x}{\sqrt{\varphi_0}} \int_0^{s_b} h \sqrt{\varphi}\, \kappa_i \, ds, \qquad i = 1, 2, \ldots, 9. \tag{II.27}$$

In the focal plane containing the focus $F_{y1}$,

$$M_j = \frac{f_y}{\sqrt{\varphi_0}} \int_0^{s_b} H \sqrt{\varphi}\, \mu_j \, ds, \qquad j = 1, 2, \ldots, 6. \tag{II.28}$$

When calculating aberrations in focal planes, the initial coordinates of the trajectory $(x_0, y_0)$ are specified in a certain plane, located in the object space at the distance $L_0$ from the point where the axial trajectory enters the mirror field.
Performing straightforward transformations, one can show that the coefficients $M_1$, $M_3$ in the plane $Q_2$ and the coefficients $K_4$, $K_5$ in the plane $Q_1$ are connected by the simple relations (II.29)
One also can show that the coefficients M , and M4 in the focal plane that contains the focus FyI are related to the coefficients K5 and K6 in the focal plane that contains the focus F,, as follows:
(II.30)

Thus, elimination of certain types of aberrations in the x- and y-directions comes about simultaneously. As has already been mentioned, the electron-optical parameters for all mirrors were calculated by means of a single routine based on the formulas given earlier. The mirrors differ only in how the potential and its derivatives are calculated; this is described in detail in the corresponding sections of this chapter. In addition, the peculiarities of the focusing and dispersing action of the mirrors that are related to the symmetry of the corresponding electric field will be considered. The results of these studies allow a better understanding of the laws of charged particle motion for every mirror type. These results were also used to check the correctness of the computations.
III. PECULIARITIES OF CHARGED PARTICLE FOCUSING AND ENERGY SEPARATION IN A MIRROR WITH A TWO-DIMENSIONAL ELECTRIC FIELD

FIGURE 3.

In the mirrors of the type (a) shown in Fig. 1, the electric field within the region of charged particle motion can be considered two-dimensional (planar), provided the electrode plates are of sufficient length. The intensity vector of such a field is always parallel to a certain plane and does not vary along the straight line perpendicular to this plane. Let us introduce the Cartesian coordinate system X, Y, Z, the plane XZ being matched with the mirror mid-plane (Fig. 3). In this coordinate system, the potential is assumed to be independent of X and is described by the function $V^*(Y, Z)$. The angle between the axis Z and the tangent to the axial trajectory is denoted by $\vartheta$ and is taken to be positive if it is counted counterclockwise relative to the axis Z. Then the Cartesian coordinates $X^*$, $Y^*$, $Z^*$ of a point of an arbitrary trajectory are determined by the following equations:

$$X^* = X + x\cos\vartheta, \qquad Y^* = y, \qquad Z^* = Z - x\sin\vartheta, \tag{III.1}$$

where $X(s)$ and $Z(s)$ are the coordinates of the corresponding point of the axial trajectory. Note that

$$\frac{dX}{ds} = \sin\vartheta, \qquad \frac{dZ}{ds} = \cos\vartheta. \tag{III.2}$$
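The change of coordinates (III.1) and the tangent relations (III.2) can be illustrated with a short numerical sketch. The function names and the sample numbers below are illustrative assumptions and do not come from the original text; the sketch only checks that a displacement x in the local frame is perpendicular to the tangent of the axial trajectory.

```python
import math

def local_to_cartesian(X_axis, Z_axis, theta, x, y):
    """Map local coordinates (x, y), attached to the axial-trajectory point
    (X_axis, Z_axis) with tangent angle theta, to (X*, Y*, Z*) as in Eq. (III.1)."""
    X_star = X_axis + x * math.cos(theta)
    Y_star = y
    Z_star = Z_axis - x * math.sin(theta)
    return X_star, Y_star, Z_star

def axial_tangent(theta):
    """Return (dX/ds, dZ/ds) of the axial trajectory, Eq. (III.2)."""
    return math.sin(theta), math.cos(theta)

# Hypothetical sample point: axial point (1, 2), tangent angle 30 degrees.
theta = math.radians(30.0)
Xs, Ys, Zs = local_to_cartesian(1.0, 2.0, theta, 0.5, 0.25)
dX, dZ = axial_tangent(theta)
# The x-displacement in the mid-plane is orthogonal to the tangent direction.
print((Xs - 1.0) * dX + (Zs - 2.0) * dZ)
```

The orthogonality printed at the end is what makes x a transverse coordinate in the mid-plane, with s measured along the axial trajectory.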
The function $V^*(Y, Z)$ is assumed to be known. It will be defined in analytical form when we describe the results of calculations for mirrors of the type (a). Let the potential on the axial trajectory, $V^*(0, Z)$, be denoted by $V(Z)$. The functions $\varphi'(s)$, $\varphi_{jk}(s)$, and $k(s)$ entering the formulas of Section II will be expressed via the potential $V(Z)$ and its derivatives. Taking into account the Laplace equation and Eqs. (III.1) and (III.2), we find

$$\frac{d\varphi}{ds} = \frac{dV}{dZ}\cos\vartheta, \tag{III.3}$$
$$\varphi_{10} = -\frac{dV}{dZ}\sin\vartheta, \tag{III.4}$$
$$\varphi_{20} = \frac{d^2V}{dZ^2}\sin^2\vartheta, \tag{III.5}$$
$$\varphi_{02} = -\frac{d^2V}{dZ^2}, \tag{III.6}$$
$$\varphi_{12} = \frac{d^3V}{dZ^3}\sin\vartheta, \tag{III.7}$$
$$\varphi_{30} = -\frac{d^3V}{dZ^3}\sin^3\vartheta, \tag{III.8}$$
$$k = -\frac{\sin\vartheta}{2V}\frac{dV}{dZ}. \tag{III.9}$$
For a two-dimensional electrostatic field, constancy of the momentum, corresponding to the coordinate X,allows us to write the following firstorder differential equation for the trajectory projection onto the mid-plane (Vandakurov, 1956): q[(1 - kx) sin 6
+ x' cos 61 = B.
(111.10)
Here B is the constant that will be determined by the initial conditions in the object space B=
d & T i (sin 6, + x; cos 6,) .\jl + (xa2 + (Y$
9
(111.1 1)
(III.11)

where $\vartheta_0$ is the beam entrance angle into the mirror field. On the axial trajectory,

$$\sqrt{\varphi}\,\sin\vartheta = \sqrt{\varphi_0}\,\sin\vartheta_0. \tag{III.12}$$
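Relation (III.12) fixes the local inclination of the axial trajectory once the local potential is known, and it gives the turning-point potential when the angle reaches 90 degrees. A minimal sketch follows, assuming (III.12) in the form sqrt(phi) sin(theta) = sqrt(phi0) sin(theta0); the function names are hypothetical and are not taken from the original text.

```python
import math

def turning_point_potential(phi0, theta0):
    """Potential at the turning point (theta = pi/2), from Eq. (III.12)."""
    return phi0 * math.sin(theta0) ** 2

def local_angle(phi, phi0, theta0, after_turning=False):
    """Angle between the Z axis and the axial trajectory where the potential is phi.

    Before the turning point the angle lies in [theta0, pi/2]; after it the
    supplementary branch pi - theta is taken.
    """
    s = math.sqrt(phi0 / phi) * math.sin(theta0)
    if s > 1.0 + 1e-9:
        raise ValueError("potential is below the turning-point value")
    theta = math.asin(min(s, 1.0))
    return math.pi - theta if after_turning else theta

theta0 = math.radians(30.0)
print(turning_point_potential(1.0, theta0))                          # close to 0.25
print(math.degrees(local_angle(1.0, 1.0, theta0, after_turning=True)))  # close to 150
```

With the object-space potential taken as the unit, the turning-point potential depends only on the entrance angle, and the exit angle in the image space is the supplement of the entrance angle.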
A. An Image Equation and Dispersion in Energy

Let us consider the electron-optical properties of a two-dimensional electric field in the linear approximation (Karetskaya and Saichenko, 1990). We expand Eq. (III.10) in a power series in the small parameters and perform the required transformations, keeping only terms of the first order of smallness. As a result we obtain the ordinary linear first-order differential equation

$$x'\cot\vartheta + kx + \frac{\varepsilon_0\varphi_0}{2\varphi} = B_1, \tag{III.13}$$
where
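The solution of this equation is found in quadratures,

$$x = \cos\vartheta\left[\int\left(B_1 - \frac{\varepsilon_0\varphi_0}{2\varphi}\right)\frac{\sin\vartheta}{\cos^2\vartheta}\,ds + C\right],$$

where $C$ is an arbitrary constant. If the solution is represented in the form

$$x = x_0' g + x_0 h + D\varepsilon_0, \tag{III.14}$$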
which coincides with the linear part of the solution (II.15), one can define $g$, $h$, and $D$:

$$g = \cot\vartheta_0\,J, \tag{III.15}$$
$$h = \frac{\cos\vartheta}{\cos\vartheta_0}, \tag{III.16}$$

(III.17)

Here the following notation is introduced:

$$J = \frac{\cos\vartheta}{k_u}\left[\int_{s_0}^{s}\frac{(k_u - k)\sin\vartheta}{\cos^2\vartheta}\,ds + \frac{1}{\cos\vartheta} - \frac{1}{\cos\vartheta_0}\right],$$

where $k_u$ is the curvature of the axial trajectory at the turning point ($\vartheta = \pi/2$, $s = s_u$). The integral entering the particular solution $g(s)$ has been transformed into such a form that the integrand remains finite at the turning point. Integration is performed beginning with the object plane $s = s_0 = -L_0$. Here and later in this chapter, for definiteness, an object and its image are considered to be real. However, the obtained results will also be valid for an imaginary object and image. The coordinate $X$ of the axial trajectory in the object plane is denoted by $X_0$. Note that in the plane $s = s_u$ containing the turning point of the axial trajectory, the following equalities are valid:

$$g = \frac{\cot\vartheta_0}{k_u}, \qquad h = 0, \tag{III.18}$$

where $\varphi_u$ is the potential at the turning point, $\varphi_u = \varphi_0\sin^2\vartheta_0$.
D =
-Lo
- (s - Sb)
-*(.
(111.19) = 71
- 60,
+ c,,
=),
+ x - x,
(111.20)
2 h=-l,
g’=-l,
h’=0,
D’ = 0.
The constant C , is as follows: (k, - k ) sin 9 ds cos2 9
The mirror with a two-dimensional field always remains a telescopic system with the linear magnification equal to -1 in the x-direction. Its angular dispersion in the image space is absent. Setting $g(s_1) = 0$, we obtain the image equation

$$L_0 + L_1 = C_1, \tag{III.21}$$

from which it follows that an increase of $L_0$ by a certain value results in a decrease of $L_1$ by the same value. At $L_0 = C_1/2$ the points at which the axial trajectory intersects the object plane $Q_0$ and the Gaussian image plane $Q_1$ turn out to be equidistant from the mirror. The coordinate $Z$ of these points is denoted by $Z_A$. A somewhat different form of the image equation was obtained earlier by Kel'man et al. (1972). The linear dispersion in the image space is constant and can be found by the formula

(III.22)

The values of variables relating to the image plane $Q_1$ ($s = s_1$) are denoted by the subscript "1", as earlier. Thus, in an energy analyzer based on any mirror with a two-dimensional field the relative dispersion,

$$D_0 = \frac{D_1}{X_1 - X_0} = -\frac{\cot\vartheta_0}{2\sin\vartheta_0}, \tag{III.23}$$

is determined solely by the angle of entrance of the axial trajectory into the field. The quantity $X_1 - X_0$ is sometimes called the analyzer base. The validity of Eq. (III.23) for certain types of mirrors with two-dimensional fields has been noted by many authors (see, e.g., Afanas'ev and Yavor, 1978, and Golikov et al., 1981). For a wide class of two-electrode mirrors, in which a beam intersects one of the field-specifying surfaces, the validity of Eq. (III.23) was proved by Fishkova (1987).

B. Aberrations
Constancy of the momentum corresponding to the coordinate X means that, for all particles moving in the mirror mid-plane, the angle at which a particle enters the field equals the angle at which it exits. So angular aberrations of any order in the image space are absent in the case of motion in the mid-plane. The angular aberrations in the x-direction that occur when a particle leaves the mid-plane are easily found from Eq. (III.10). If one restricts oneself to the second-order aberrations, then $x' = x_1'$ is determined in the image space by the following expression:

$$x_1' = -x_0' + K_4'(y_0')^2 + K_5' y_0' y_0 + K_6' y_0^2, \tag{III.24}$$

where

$$K_4' = -\tfrac{1}{2}(\gamma_y^2 - 1)\tan\vartheta_0, \qquad K_5' = \frac{\gamma_y}{f_y}\tan\vartheta_0, \qquad K_6' = -\frac{\tan\vartheta_0}{2f_y^2},$$

and $\gamma_y = G_1'$ is the angular magnification in the y-direction. It should be noted that when $f_y = \infty$, i.e., when the mirror is a telescopic system in the y-direction too, $|\gamma_y| = 1$, and the coefficients $K_4'$, $K_5'$, and $K_6'$ turn out to be equal to zero. This case is also of special interest because if an image is stigmatic for a certain position of the object, it remains stigmatic for any other position. In fact, it is easy to see that, when $f_y$ tends to infinity, the image equation for the y-direction has the same form as Eq. (III.21),

$$L_0 + L_2 = C_2,$$

where $C_2$ is a constant. So if $L_2 = L_1$ for a certain $L_0$, then $C_2 = C_1$. The absence of a majority of angular aberrations in a system with a two-dimensional field operating in the telescopic regime was noted by Kel'man and Rodnikova (1963). This fact gave rise to the construction of quite perfect energy and mass analyzers (Kel'man and Yavor, 1968; Kel'man et al., 1979, 1985).

1. Relations between the Coefficients of Linear Spherical and Chromatic Aberrations

Glickman and Goloskokov (1991a) have found simple relations connecting the coefficients of chromatic aberrations $K_7$, $K_9$ and the coefficient of spherical aberration $K_1$ in the plane of the Gaussian image $Q_1$. The exact solution of the trajectory equation for a charged particle moving in the mid-plane of a two-dimensional field (Coggeshall, 1946) served as the initial relation. This solution for a point object with the coordinates $X_0$, $Z_0$ in the image space can be written as follows:
$$X^* = X_0 + \sqrt{1 + \varepsilon_0}\,\sin(\vartheta_0 + \alpha_0)\,I(C, Z). \tag{III.25}$$

Here $\alpha_0 = \arctan x_0'$ is the angle between an arbitrary trajectory and the axial one in the object space,

$$I(C, Z) = \int_{Z_0}^{Z^*_{max}}(F + C)^{-1/2}\,dZ - \int_{Z^*_{max}}^{Z}(F + C)^{-1/2}\,dZ, \tag{III.26}$$

$Z^*_{max}$ is the coordinate of the turning point of an arbitrary trajectory,

$$F = \frac{V(Z)}{\varphi_0}, \qquad C = (1 + \varepsilon_0)\cos^2(\vartheta_0 + \alpha_0) - 1. \tag{III.27}$$

In the plane of the Gaussian image $Z = Z_1$, whose position is determined by the condition

$$\left.\frac{\partial X^*}{\partial\alpha_0}\right|_{\varepsilon_0 = 0,\ \alpha_0 = 0,\ Z = Z_1} = 0, \tag{III.28}$$

the coordinate of an arbitrary trajectory $X^* = X_1^*$ can be represented as the series

$$X_1^* = X_1 + D_X\varepsilon_0 + A_1\alpha_0^2 + A_2\alpha_0\varepsilon_0 + A_3\varepsilon_0^2 + \cdots. \tag{III.29}$$

The dispersion linear in energy $D_X$, the coefficient of spherical aberration $A_1$, and the coefficients of chromatic aberrations $A_2$, $A_3$ are determined by the following relations:

(III.30)
(III.31)
(III.32)
(III.33)

The parameters $\varepsilon_0$ and $\alpha_0$ enter the integrand $(F + C)^{-1/2}$ only via the constant $C$. Because of this fact, one can find simple relations between the aberration coefficients. In order to avoid integral divergence in the vicinity
of the turning point, where $F(Z^*_{max}) + C = 0$, the function $I(C, Z)$ must be transformed before differentiating with respect to $C$. It is written as follows:

The assumption is made that $F'(Z_k) \neq 0$ and $F'(Z^*_{max}) \neq 0$. Primes in Section III,B,1 denote differentiation with respect to $Z$, in contrast to the general text, where the primes denote differentiation with respect to $s$. Let $i$ denote the integral

$$i = \int_{Z_k}^{Z^*_{max}}(F + C)^{-1/2}\,dZ.$$
On triple integration by parts it is reduced to the form
From Eqs. (III.25), (III.28), (III.30), and (III.31) it follows that

$$X_1 - X_0 = I(C_0, Z_1)\sin\vartheta_0, \tag{III.34}$$
Z(C, Z)- 2 sin2 do
(111.35) z=z,
(I I I. 36) z=z,
,
A = sin o0 cos2 a0
9
c=c, z=z,
(111.37)
where C,
=-
sin2 6, is the constant C with Z(C,, 2,) =
E, =
0, a. = 0. Then
W,sin Go,
(111.38) (111.39)
1
(*
2sin36, cos 6o + z=z,
3D~)9
(111.40)
(III.41)

The last is consistent with the formula (III.22) obtained earlier. Further, from Eqs. (III.32), (III.33), and (III.38)-(III.40), one can obtain the relations connecting the coefficients $D_X$, $A_1$, $A_2$, and $A_3$:

$$A_2 = -(2D_X + A_1)\cot\vartheta_0, \tag{III.42}$$
$$A_3 = \tfrac{1}{4}\left[D_X(3 - \tan^2\vartheta_0) + A_1\right]\cot^2\vartheta_0. \tag{III.43}$$

Using Eqs. (III.29) and (III.41)-(III.43) and going from aberrations in the X-direction to those in the x-direction, one can find that in the plane of the Gaussian image $Q_1$,

$$K_7 = -\left[K_1 + (2 + \tan^2\vartheta_0)D_1\right]\cot\vartheta_0, \tag{III.44}$$
$$K_9 = \tfrac{1}{4}\left[K_1 + (3 - \tan^2\vartheta_0)D_1\right]\cot^2\vartheta_0. \tag{III.45}$$

Note that if second-order focusing in the beam divergence angle $\alpha_0$ is achieved, i.e., $K_1 = 0$, then

$$\frac{K_7}{D_1} = -(2\cot\vartheta_0 + \tan\vartheta_0), \tag{III.46}$$
$$\frac{K_9}{D_1} = \tfrac{1}{4}(3\cot^2\vartheta_0 - 1). \tag{III.47}$$

Thus, in this case $K_7/D_1$ and $K_9/D_1$ do not depend on the field distribution within the mirror and are determined solely by the beam entrance angle. For small $\vartheta_0$, when the condition $\cot\vartheta_0 \gg 1$ is satisfied,

$$\frac{K_7}{D_1} \approx -2\cot\vartheta_0, \qquad \frac{K_9}{D_1} \approx \tfrac{3}{4}\cot^2\vartheta_0. \tag{III.48}$$

In this case the effect of chromatic aberrations on the analyzer resolving power can be significant.
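The relations between the chromatic and spherical coefficients lend themselves to a quick numerical check. The sketch below assumes Eqs. (III.44) and (III.45) in the form K7 = -[K1 + (2 + tan^2 theta0) D1] cot theta0 and K9 = (1/4)[K1 + (3 - tan^2 theta0) D1] cot^2 theta0; the function name and the sample values K1 = 0, D1 = 1 are illustrative assumptions.

```python
import math

def chromatic_coefficients(K1, D1, theta0):
    """Return (K7, K9) in the Gaussian image plane from K1, D1 and the entrance
    angle theta0 (radians), following the assumed form of Eqs. (III.44)-(III.45)."""
    t = math.tan(theta0)
    cot = 1.0 / t
    K7 = -(K1 + (2.0 + t * t) * D1) * cot
    K9 = 0.25 * (K1 + (3.0 - t * t) * D1) * cot * cot
    return K7, K9

# With K1 = 0 the ratios K7/D1 and K9/D1 depend on the entrance angle only,
# as in Eqs. (III.46)-(III.47).
for deg in (20.0, 30.0, 45.0, 60.0):
    K7, K9 = chromatic_coefficients(0.0, 1.0, math.radians(deg))
    print(f"theta0 = {deg:4.1f} deg: K7/D1 = {K7:8.3f}, K9/D1 = {K9:8.3f}")
```

The loop makes the closing remark of the paragraph concrete: both ratios grow quickly in magnitude as the entrance angle decreases, since they scale with cot and cot squared of the angle.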
The relations between the coefficients of spherical and chromatic aberrations of higher orders can evidently be found in the same manner. Glickman and Goloskokov (1991a) also obtained such relations for the third-order aberration coefficients.

2. Aberrations Related to the Object Width

Glickman and Goloskokov (1991b) proposed a technique that makes it possible to find the aberration coefficients related to the object width when the aberration coefficients for an infinitesimally narrow linear object, located perpendicularly to the mid-plane, are known. Systems are considered in which the two-dimensional electric and magnetic static fields have a common mid-plane, which is the symmetry (antisymmetry) plane for the electric (magnetic) field. The idea of the technique is very simple. We can illustrate it by the case of an electrostatic mirror. We shall consider two trajectories whose projections onto the mid-plane are shown by dashed lines in Fig. 4. One of them starts from the point $X_0$, $Y_0$, $Z_0$ of the linear object A, while the other passes through the point having the coordinates $X_0 + P$, $Y_0$, $Z_0$. For both trajectories, $x_0'$ and $y_0'$ in the object space are assumed to be the same. In the object plane $Q_0$, for the first trajectory $x = \bar{x}_0 = 0$, $y = \bar{y}_0 = Y_0$, and for the second one

$$x = x_0 = P\cos\vartheta_0\,(1 - x_0'\tan\vartheta_0), \tag{III.49}$$
$$y = y_0 = \bar{y}_0 - y_0' P\sin\vartheta_0. \tag{III.50}$$
Because of the character of the field symmetry, both trajectories are evidently under quite identical conditions, so they are identical. Thus, if in the image space the coordinates together with their derivatives for the first trajectory are known, one can find the corresponding values for the second trajectory. Let us assume that for the first trajectory in the plane $Q_1$, $x = \bar{x}_1$; in the plane $Q_2$, $y = \bar{y}_2$; and in the image space, $x' = \bar{x}_1'$, $y' = \bar{y}_2'$. Then for the second trajectory in the plane $Q_1$,

$$x = x_1 = \bar{x}_1 - \frac{x_0(1 + \bar{x}_1'\tan\vartheta_0)}{1 - x_0'\tan\vartheta_0}, \tag{III.51}$$

in the plane $Q_2$,

$$y = y_2 = \bar{y}_2 - \frac{\bar{y}_2' x_0\tan\vartheta_0}{1 - x_0'\tan\vartheta_0}, \tag{III.52}$$

and in the image space

$$x' = x_1' = \bar{x}_1', \qquad y' = y_2' = \bar{y}_2'. \tag{III.53}$$

FIGURE 4.
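The quantities related to the first trajectory are determined by the following equalities:

$$\bar{x}_1 = D_1\varepsilon_0 + K_1(x_0')^2 + K_4(y_0')^2 + K_5 y_0'\bar{y}_0 + K_6\bar{y}_0^2 + K_7 x_0'\varepsilon_0 + K_9\varepsilon_0^2 + \cdots, \tag{III.54}$$
$$\bar{x}_1' = -x_0' + K_4'(y_0')^2 + K_5' y_0'\bar{y}_0 + K_6'\bar{y}_0^2 + \cdots, \tag{III.55}$$
$$\bar{y}_2 = m_y\bar{y}_0 + M_1 y_0' x_0' + M_3\bar{y}_0 x_0' + M_5 y_0'\varepsilon_0 + M_6\bar{y}_0\varepsilon_0 + \cdots, \tag{III.56}$$
$$\bar{y}_2' = \cdots + M_6'\bar{y}_0\varepsilon_0 + \cdots. \tag{III.57}$$

Here it has been taken into account that $\bar{x}_0 = 0$ and $K_i' = 0$ for $i = 1, 2, 3, 7, 8, 9$. From the relations (III.49) and (III.50), it follows that

$$\bar{y}_0 = y_0 + \frac{y_0' x_0\tan\vartheta_0}{1 - x_0'\tan\vartheta_0}. \tag{III.58}$$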
On replacing $\bar{y}_0$ by the expression (III.58) and substituting the expansions (III.54)-(III.57) into Eqs. (III.51)-(III.53), one can find the aberration coefficients relating to the object width. Here we shall restrict ourselves to the second-order aberrations:

$$K_2 = 0, \quad K_3 = 0, \quad K_8 = 0, \quad M_2 = (m_y - \gamma_y)\tan\vartheta_0, \quad M_4 = \frac{\tan\vartheta_0}{f_y}, \quad M_2' = -\frac{\tan\vartheta_0}{f_y}, \quad M_4' = 0. \tag{III.59}$$

In the article by Glickman and Goloskokov (1991b), the formulas for the third-order aberration coefficients were derived, and a series of general conclusions concerning angular and linear aberrations of any order were made.
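The geometric step behind the object-width technique, shifting a trajectory along X and re-expressing its start in the object plane, can be sketched numerically. The sketch assumes Eqs. (III.49) and (III.50) in the form x0 = P cos(theta0)(1 - x0' tan(theta0)) and y0 = ybar0 - y0' P sin(theta0); the function names and sample values are hypothetical. Eliminating the shift P between the two relations gives the starting height of the unshifted trajectory.

```python
import math

def shifted_object_coordinates(P, theta0, x0_prime, y0_prime, ybar0):
    """Object-plane coordinates (x0, y0) of a trajectory shifted by P along X,
    for slopes x0', y0' and unshifted height ybar0 (assumed Eqs. (III.49)-(III.50))."""
    x0 = P * math.cos(theta0) * (1.0 - x0_prime * math.tan(theta0))
    y0 = ybar0 - y0_prime * P * math.sin(theta0)
    return x0, y0

def recover_ybar0(x0, y0, theta0, x0_prime, y0_prime):
    """Invert the two relations above by eliminating the shift P."""
    t = math.tan(theta0)
    return y0 + y0_prime * x0 * t / (1.0 - x0_prime * t)

theta0 = math.radians(30.0)
x0, y0 = shifted_object_coordinates(0.2, theta0, 0.01, 0.02, 0.5)
print(x0, y0, recover_ybar0(x0, y0, theta0, 0.01, 0.02))
```

The round trip returns the original height, which is the substitution used when the expansions for the narrow object are rewritten in terms of the coordinates of the shifted trajectory.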
IV. ENERGY ANALYZERS BASED ON MIRRORS WITH TWO-PLATE ELECTRODES SEPARATED BY DIRECT SLITS

In this section, the results of calculations of the electron-optical parameters of two-dimensional field mirrors are presented. The advantages and disadvantages of the mirrors are discussed, and examples of their applications are given. All calculations have been made for the symmetric case, in which the planes $Q_0$ and $Q_1$ (the object plane and the Gaussian image plane, respectively) are located symmetrically. In this case the axial trajectory (or its rectilinear continuations) intersects the planes $Q_0$ and $Q_1$ at the same coordinate $Z = Z_A$ (Fig. 5). Simple relations between the aberration coefficients of two-dimensional field mirrors are available, so there is no need for expressions for all coefficients. The expressions for $M_1$ and $M_3$ are not given here because, according to Eqs. (II.29),

$$M_1 = -2m_y K_4, \qquad M_3 = m_y K_5. \tag{IV.1}$$

The expressions for $K_7$ and $K_9$ are omitted, too, as these coefficients are simply related to $K_1$ and $D_1$ by Eqs. (III.44) and (III.45). It should be noted that the coefficients $K_2$, $K_3$, and $K_8$ in a mirror with a two-dimensional field are always equal to zero. If in the considered symmetric case a mirror creates a stigmatic image, then $m_y = \pm 1$ and, as follows from Eq. (III.59), $M_2 = 0$. Also, one can show that in such a case

$$K_4 = \mp f_y K_5. \tag{IV.2}$$
In a symmetric case for a system that is telescopic in the y-direction ($f_y = \infty$), the coefficient $K_5$ is always equal to zero. In addition, according to Eqs. (III.59), the coefficients $M_2$ and $M_4$ are equal to zero for $f_y = \infty$ at any location of an object.

FIGURE 5. A three-electrode mirror with parallel electrode plates in two projections. 1-3, the mirror electrodes; the effective plane of reflection is also marked.

A. Mirrors with Parallel Electrode Plates

Let us consider the electron-optical properties of two- and three-electrode mirrors with each electrode consisting of a pair of identical plates. The two plates of an electrode are situated symmetrically with respect to the mid-plane, at a distance $d$ from each other. The notation used here is explained with the example of a three-electrode mirror, which is depicted in two projections in Fig. 5. The numbers 1, 2, and 3 denote, respectively, the first, second, and third electrodes of a mirror with the corresponding applied potentials $\varphi_1$, $\varphi_2$, and $\varphi_3$. The potential of the first (with respect to the beam direction) electrode coincides with that of the mirror object and image spaces ($\varphi_1 = \varphi_0$). The distance between the middles of the slits separating neighboring electrode plates is denoted by $P$; the slit width, being the same for all electrodes, is denoted by $\delta$. The electrode plates must be large enough for the field in the region of charged particle motion to be two-dimensional. The axial trajectory and one of the adjacent trajectories for the symmetric case are shown by dashed lines. In Fig. 5 the Cartesian coordinate system X, Y, Z used here is also shown. The plane XY passes through the middles of the rectilinear slits separating the plates of the first
and second electrodes, whereas the plane XZ coincides with the mirror mid-plane. The potential distribution in a three-electrode mirror with parallel electrode plates is described by the function (Glickman et al., 1967)

$$V^*(Y, Z) = \frac{\varphi_1 + \varphi_3}{2} + \frac{\varphi_2 - \varphi_1}{\pi}\arctan\frac{\sinh(\pi Z/d)}{\cos(\pi Y/d)} + \frac{\varphi_3 - \varphi_2}{\pi}\arctan\frac{\sinh[\pi(Z - P)/d]}{\cos(\pi Y/d)}. \tag{IV.3}$$

This formula has been derived under the assumption that the slit width $\delta$ is infinitesimal. In practice, the electron-optical parameters of the considered systems calculated with Eq. (IV.3) do not differ from the real ones if $\delta \le 0.1d$. The calculations show that the mirror field can be considered to be concentrated within the region

$$-3 \le \frac{Z}{d} \le 3 + \frac{P}{d}.$$

The plane $Z = -3d$ is viewed as the mirror field entrance boundary. The plane $s = 0$ passes through the point of intersection of the rectilinear section of the axial trajectory in the object space with this boundary. In calculations related to mirrors with parallel electrode plates, the distance $d$ is adopted as the unit of length, and the potential of the first electrode $\varphi_1$ plays the role of the unit of potential.

1. Two-Electrode Mirrors

The properties of two-electrode mirrors were investigated by Kel'man et al. (1982). A scheme of such a mirror is not given here, because it is a particular case of a three-electrode one: if $\varphi_3 = \varphi_2$ and the slit between the plates of the second and third electrodes is absent, a two-electrode mirror is obtained. The plane XY is considered as passing through the middle of the slits separating the plates of the first and second electrodes. The retarding potential $\varphi_2$ is applied to the second electrode of the mirror. The value of the potential $\varphi_2$ must be less than the potential $\varphi_u$ at the turning point of the axial trajectory, $\varphi_u = \sin^2\vartheta_0$. In Figs. 6-8 the coordinate of the axial trajectory turning point $Z_u$, the dispersion in energy $D_1$, and the coordinate $Z_A$ versus the potential $\varphi_2$ are given for five values of the entrance angle $\vartheta_0$. The numbers 1, 2, 3, 4, and 5 denote curves that correspond to angles $\vartheta_0$ equal to 20°, 30°, 40°, 50°, and 60°, respectively. As Fig. 6 demonstrates, $Z_u$ increases when $\varphi_2$ grows at
FIGURE 6. The graph for determination of the turning point coordinate of the axial trajectory in a two-electrode mirror with parallel electrode plates. 1, $\vartheta_0$ = 20°; 2, $\vartheta_0$ = 30°; 3, $\vartheta_0$ = 40°; 4, $\vartheta_0$ = 50°; 5, $\vartheta_0$ = 60°.

FIGURE 7. The dependence of the dispersion in energy on $\vartheta_0$ and $\varphi_2$ in a two-electrode mirror with parallel electrode plates. Notation as for Fig. 6.

FIGURE 8. The position of the plane $Z = Z_A$ in a two-electrode mirror with parallel electrode plates. Notation as for Fig. 6.
fixed $\vartheta_0$. As $\varphi_2$ comes closer to the potential $\varphi_u$ of the turning point of the axial trajectory, $Z_u$ increases faster and tends to infinity when $\varphi_2 \to \varphi_u$. At large values of $Z_u$, the turning point of the axial trajectory finds itself in a very weak field, which decreases rapidly with the growth of $Z$. This is the reason for the rapid growth of the modulus of the dispersion $D_1$ when $\varphi_2$ approaches $\varphi_u$ (Fig. 7). As the coordinate $Z_u$ increases, the coordinate $Z_A$ decreases rapidly: an object and its image move away from the mirror (Fig. 8). The coefficient of spherical aberration $K_4$ is always positive in the considered two-electrode mirror. The coefficient of spherical aberration $K_1$ becomes equal to zero once for every value of $\vartheta_0$ in the specified interval of variation of $\varphi_2$. The condition $|K_1| < 0.1$ is satisfied if

$$\begin{aligned}
\vartheta_0 &= 20°, & \varphi_2 &= -0.005,\\
\vartheta_0 &= 30°, & \varphi_2 &= -0.02,\\
\vartheta_0 &= 40°, & \varphi_2 &= -0.01,\\
\vartheta_0 &= 50°, & \varphi_2 &= 0.08,\\
\vartheta_0 &= 60°, & \varphi_2 &= 0.3.
\end{aligned} \tag{IV.4}$$

In this case the image is not stigmatic, and the object and its image are imaginary. The modulus of the dispersion in energy reaches its greatest value, 7.3, at $\vartheta_0 = 20°$. The value of the coefficient $K_1$ does not depend on the positions of an object and its image, because the coefficients $K_2$, $K_3$, and $K_8$ equal zero. So, with an asymmetric location of an object and its image, when an object or an image can be made real, $|K_1|$ is also less than 0.1 in mirrors satisfying the conditions (IV.4). With values of $\varphi_2$ close to the potential of the turning point of the axial trajectory, the moduli of $K_1$ and $K_4$ increase sharply, and $K_1 < 0$, whereas $K_4 > 0$. The considered two-electrode mirrors create, generally speaking, nonstigmatic images, i.e., $G_1 \neq 0$. Such mirrors can be used in complex systems incorporating several electron-optical elements. Under certain conditions a two-electrode mirror with parallel plates can create a stigmatic image of a point ($G_1 = 0$). In Table I the electron-optical parameters of two-electrode mirrors that form such an image in the symmetric case are presented. In accordance with the number of zeros of the function $G_1$ in the applied region of $\varphi_2$, there are three groups of stigmatic mirrors. In the first group $m_y = 1$, and an object and its image are always imaginary. At fixed $\vartheta_0$ the values of $\varphi_2$, $Z_u$, and $-Z_A$ corresponding to this group are the least for all three groups. For $\vartheta_0 \ge 40°$ the coordinate $Z_u$ is negative, i.e., the axial trajectory does not reach the plane $Z = 0$, being the
TABLE I TWO-ELECTRODE MIRRORS WITH PARALLEL ELECTRODE PLATES CREATING A STIGMATIC IMAGE OF AN OBJECT $0
9
Group
deg.
P2
1
20 30 40 50 60
-0.0987 -0.1581 -0.1923 -0.1799 -0.1694
20
30 40 50 60
-0.0226 0.0212 0.1832 0.4093 0.6383
20 30 40 50 60
0.0508 0.1875 0.3630 0.5511 0.7283
2
3
-z,
Z,
-0.376 -0.049 0.126 0.275 0.454
0.364 0.153 -0.008 -0.156 -0.335
0.096 0.467 0.992 1.61 2.20
0.485 0.304 0.238 0.210 0.204
1.49 2.04 3.52 6.31 11.1
0.703 0.671 0.663 0.660 0.658
4.37 12.7 29.3 58.3 107
2.95 8.13 14.7 21.6 28.2
X , - X,, 0.813 0.972 1.17 1.48 2.01
-D, 3.27 1.68 1.08 0.81 0.67 6.00 3.53 3.26 3.46 3.71 17.5 22.0 21.2 31.9 35.8
my 1
1 1 1 1 -1 -1 -1 -1 -1 1 1
K4
K,
4
4.2 1.8 1.2 1.4 2.5
2.4 1.1 1.1 1.5 2.4
2.3
1.9 6.3 32 130 460
- 1.7 - 9.0 - 26 - 69 - 42 -200
1
-600
1 1
-1500 -3700
200 2300 12,000 44,000 140,Ooo
-6.8 0.89 8.3 18 32 4.1 15 43 100 200 120 490 1300 2600 4900
Ks 10 15 49 120 210
M 4
1.o -0.45 -6.3 -15 -23
12 12 15 19 22
0.77 1.4 1.1 0.88 0.75
19 26 33 39 44
-0.22 -0.12 -0.089 -0.070 -0.060
equipotential surface $\varphi = (1 + \varphi_2)/2$. If an electrode with such a potential is matched to this plane, a two-electrode mirror "with a wall" is obtained. Its electron-optical parameters coincide with those of the considered mirror with parallel plates, but its overall dimensions are smaller. In the interval 20° < $\vartheta_0$ < 30°, the coefficients $K_5$ and $M_4$ reverse their signs for stigmatic mirrors of the first group, and both turn to zero simultaneously at $\vartheta_0$ = 28°. As was mentioned at the beginning of this section, this case corresponds to a system that is telescopic in the y-direction ($f_y = \infty$). It should be restated that this case is of special interest because here an image is stigmatic at any location of an object. For the second group, $m_y$ equals -1, and an object and its image are always imaginary. The absolute value of the dispersion is somewhat greater than for the first group at any value of $\vartheta_0$. The coefficient $K_1$ reverses its sign in the interval 20° < $\vartheta_0$ < 30° and turns to zero at $\vartheta_0$ = 25°. For the third group, $m_y$ equals 1, and an object and its image are real at $\vartheta_0$ > 20°. In these mirrors the absolute value of the dispersion is the largest; however, the aberration coefficients are also large. The latter increase sharply as $\vartheta_0$ grows. For illustration, the dispersion $D_1$ and the ratios $K_1/|D_1|$ and $K_4/|D_1|$ versus $\vartheta_0$ are presented graphically in Figs. 9 and 10. The numbers 1, 2, and 3 denote the curves relating to the first, second, and third group, respectively. The ratios $K_1/|D_1|$ and $K_4/|D_1|$ determine the spherical aberration contributions to the analyzer resolution. In the first and second groups, the dispersion varies insignificantly, whereas in the third $|D_1|$ increases rapidly with $\vartheta_0$, almost linearly. Despite this fact, the ratios $|K_1/D_1|$ and $|K_4/D_1|$ in the third group increase faster than in the first and second groups. Thus, it seems reasonable to use mirrors of the first and second groups as components in complicated electron-optical systems.
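The remark that the plane Z = 0 of a two-electrode mirror is an equipotential at (1 + phi2)/2 can be checked directly from the potential of the parallel-plate mirror. The sketch below assumes Eq. (IV.3) in the form V* = (phi1 + phi3)/2 + ((phi2 - phi1)/pi) arctan[sinh(pi Z/d)/cos(pi Y/d)] + ((phi3 - phi2)/pi) arctan[sinh(pi (Z - P)/d)/cos(pi Y/d)]; the function name is hypothetical, and the potentials used for the three-electrode limits are arbitrary sample values.

```python
import math

def potential(Y, Z, phi1, phi2, phi3, P, d=1.0):
    """Potential between the plates (|Y| < d/2) of a three-electrode mirror with
    infinitesimal slits at Z = 0 and Z = P, following the assumed form of Eq. (IV.3)."""
    c = math.cos(math.pi * Y / d)

    def step(z):
        # Smooth step that tends to -pi/2 far before a slit and +pi/2 far after it.
        return math.atan(math.sinh(math.pi * z / d) / c)

    return (0.5 * (phi1 + phi3)
            + (phi2 - phi1) / math.pi * step(Z)
            + (phi3 - phi2) / math.pi * step(Z - P))

# Two-electrode mirror as the special case phi3 = phi2 (phi2 taken from Table I,
# first group, 40 degrees); the plane Z = 0 is then an equipotential.
phi2 = -0.1923
for Y in (0.0, 0.2, -0.4):
    print(Y, potential(Y, 0.0, 1.0, phi2, phi2, P=1.0), (1.0 + phi2) / 2.0)

# Far from the slits the potential tends to that of the nearest electrode.
print(potential(0.0, -10.0, 1.0, 0.3, 0.1, P=1.2), potential(0.0, 11.2, 1.0, 0.3, 0.1, P=1.2))
```

Because the hyperbolic sine vanishes at Z = 0 for every Y, the first step term drops out there, which is why a flat electrode at that potential can replace half of the structure.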
Third-group mirrors, characterized by real objects and images, can be used as a base for the construction of simple energy analyzers. However, the operational characteristics of such analyzers will not be high. Later it will be shown that analyzers with high operational characteristics can be created on the basis of three-electrode mirrors.

FIGURE 9. The dependence of dispersion in energy on $\vartheta_0$ in a two-electrode mirror with parallel electrode plates creating a stigmatic image. 1-3, the groups of mirrors.

FIGURE 10. The graphs for determination of (a) $|K_1/D_1|$ and (b) $|K_4/D_1|$ in a two-electrode mirror with parallel electrode plates creating a stigmatic image. Notation as for Fig. 9.

2. Three-Electrode Mirrors

The focusing and dispersing properties of three-electrode mirrors with parallel electrode plates were studied by Kel'man et al. (1982). Incorporation of a third electrode introduces additional parameters: the potential of the second (intermediate) electrode $\varphi_2$ and the distance $P$ between the middles of the slits separating the plates of neighboring electrodes (Fig. 5). As a result, it was found that three-electrode mirrors noticeably surpass two-electrode ones in their focusing properties. Three-electrode mirrors were studied at various values of the entrance angle $\vartheta_0$ and of the distance $P$. For the considered mirror, the conditions necessary to form a stigmatic image with the second-order spherical aberrations eliminated were found (symmetric case): $G_1 = 0$, $K_1 = K_4 = M_1 = 0$. The calculations have shown that for each $\vartheta_0$ an optimal value of the parameter $P$ exists for which a mirror with a specified focusing quality is found by appropriate choice of the potentials. Within a wide range of $\vartheta_0$ (25°-50°), the optimal value of $P$ varies insignificantly, from 1.20 to 1.17. Let us describe briefly an algorithm for the search for high-quality focusing mirrors. For example, it is assumed that $\vartheta_0$ = 30°, $P$ = 1.15. In Fig. 11a the potential $\varphi_2$ that provides a stigmatic image in the symmetric case is given as a function of $\varphi_3$. The position of the plane $Z = Z_A$ for such mirrors is given in Fig. 11b. As in the case of two-electrode mirrors, three groups of stigmatic mirrors can also be distinguished for three-electrode ones. For two-electrode mirrors in each group every fixed value of
FIGURE 11. Interrelation between the electrode potentials of a three-electrode mirror with parallel electrode plates creating a stigmatic image of an object (a) and the position of the plane $Z = Z_A$ for such a mirror (b). $\vartheta_0$ = 30°, $P$ = 1.15. 1-3, the groups of mirrors; a-c, their various branches.
$\vartheta_0$ corresponds to one mirror (see Table I), whereas for three-electrode mirrors a fixed value of $\vartheta_0$ corresponds to an infinite set of mirrors. The numbers 1, 2, and 3 in Fig. 11 denote the curves referring to the first, second, and third group of mirrors, respectively. As the presented dependences of $\varphi_2$ on $\varphi_3$ and of $Z_A$ on $\varphi_3$ are many-valued, the curves are split into branches denoted by letters. For instance, the branch a of the group 3 is denoted by 3a. The first and third groups are characterized by the magnification $m_y = 1$, whereas the second one has $m_y = -1$. As in two-electrode mirrors, the plane $Z = Z_A$ in three-electrode mirrors of the first and second groups is located in the region occupied by the mirror field. In the third group, this plane is beyond the field for the greater part of the considered range of $\vartheta_0$. Later we shall restrict ourselves to investigation of the third group of mirrors. Let us consider the coefficients $K_1$ and $K_4$ versus $\varphi_3$ for the third-group mirrors with $P$ equal to 1.0, 1.15, and 1.5, the entrance angle being equal to 30°, as before. The dependences are presented in Fig. 12. The branches 3c are not considered, because they are associated with aberrations that are too high. The branches 3a and 3b, 3a' and 3b', 3a" and 3b" correspond to $P$ = 1.15, 1.5, and 1.0, respectively. The graph presented in Fig. 12a
FIGURE 12. The influence of $P$ on the coefficients $K_1$ (a) and $K_4$ (b) in three-electrode mirrors with parallel electrode plates. $\vartheta_0$ = 30°; 3a and 3b, $P$ = 1.15; 3a' and 3b', $P$ = 1.5; 3a" and 3b", $P$ = 1.0.
demonstrates that the coefficient $K_1$ varies significantly with the change of both the potential $\varphi_3$ and the distance $P$. It can easily be made equal to zero over a wide interval of the values of $P$ by appropriate choice of the electrode potentials. At the same time, the coefficient $K_4$ (Fig. 12b) of the branches 3b, 3b', 3b" depends weakly on the potential $\varphi_3$ over a wide interval of its variation, and the dependence of $K_4$ on $P$ is such that one can choose a value of $P$ making $K_4$ close to zero through a larger part of the branch. In turn, for a chosen $P$ one can also obtain $K_1$ close to zero by the appropriate choice of $\varphi_3$. It is worth noting that in the considered case, according to Eqs. (IV.1) and (IV.2), the coefficients $K_5$, $M_1$, and $M_3$ become small along with $K_4$. The calculations showed that for other values of the entrance angle in the interval 25° ≤ $\vartheta_0$ ≤ 50°, the dependences of the mirror electron-optical parameters on the value of $P$ and on the electrode potentials are analogous to those considered for $\vartheta_0$ = 30°. The mirror electron-optical parameters providing stigmatic focusing and elimination of the second-order spherical aberration in both the x- and the
S. P. KARETSKAYA et al.
TABLE II
THREE-ELECTRODE MIRRORS WITH PARALLEL ELECTRODE PLATES CREATING A STIGMATIC IMAGE OF AN OBJECT WHILE ELIMINATING SECOND-ORDER SPHERICAL ABERRATIONᵃ

θ₀, deg   P       φ₂       φ₃       −Z_A   Z_eff   X₁ − X₀   −D₁    −10K₆   −10M₄
25        1.206   0.2129   0.1050   2.43   1.15    6.68      16.9   −0.23   4.0
30        1.188   0.2833   0.1725   3.91   1.11    9.66      16.7    1.8    2.9
35        1.179   0.3593   0.2535   5.67   1.09    14.0      17.4    1.8    2.2
40        1.175   0.4398   0.3440   7.60   1.08    19.8      18.4    1.5    1.8
45        1.173   0.5227   0.4386   9.62   1.07    27.6      19.5    1.3    1.6
50        1.171   0.6055   0.5348   11.7   1.06    37.6      20.6    1.1    1.4

ᵃ From Kel'man et al. (1982).
y-directions are given in Table II. For all mirrors with the parameters presented in Table II, the following smallness conditions for the aberration coefficients are satisfied: |K₁| ≤ 2 × 10^(−…), |K₄| ≤ 2 × 10^(−…), |K₅| ≤ 2 × 10^(−…), |M₁| ≤ 4 × 10^(−…), |M₃| ≤ 2 × 10^(−…). The values of the coefficients K₆ and M₄ presented in the table are also not large. Note that the coefficient K₆ reverses its sign in the interval 25° < θ₀ < 30°, being equal to zero at θ₀ ≈ 25.5°. At θ₀ = 25°, the object and its image are still in a weak field. As θ₀ increases, they move away from the mirror, and at θ₀ = 50° the coordinate Z_A reaches the value −11.7. The modulus of the dispersion varies weakly in the interval 25° ≤ θ₀ ≤ 35° and then grows with θ₀, reaching the value 20.6 at θ₀ = 50°. The high focusing quality, the large dispersion in energy, and the wide range of variation of the coordinate Z_A, which characterizes the distance from the object and its image to the mirror, make it possible to use the energy analyzer with parallel electrode plates for quite different problems related to the separation of a charged particle beam in energy and in mass.

B. The Mirror "with a Wall"
The possibility of controlling the focusing quality of a three-electrode mirror with parallel electrode plates by varying the electrode potentials is the most important advantage of such a mirror. By an appropriate choice of the potentials, one can achieve a stigmatic image and make one of the coefficients of spherical aberration, K₁, as small as one needs. However, if one wants at the same time to make the second coefficient of spherical aberration, K₄, sufficiently small, the geometrical parameter (the width P of the intermediate electrode) must be specified with high accuracy (see Table II).
The desire to eliminate the unwanted dependence of the focusing quality on the geometrical parameter and, simultaneously, to reduce the mirror size resulted in the proposal to modify the mirror construction by introducing a fourth electrode (Karetskaya and Saichenko, 1989). The new mirror and the Cartesian coordinate system X, Y, Z specified for it are shown in Fig. 13. Each of the three electrodes 1-3 is formed by a pair of identical parallel plates, located symmetrically with respect to the mid-plane at a distance d from each other. Electrode 4 consists of one plate (the "wall") located perpendicularly to the mid-plane. The axial trajectory and one adjacent to it are also shown in the figure. The XY-plane coincides with the internal surface of the fourth electrode, and the XZ-plane coincides with the mid-plane. With a sufficient extension of the electrode plates in the direction of the X-axis, the electric field of the considered mirror within the region of charged particle motion is two-dimensional and is described by the potential V*(Y, Z). Under the assumption that the slit width δ between the plates of neighboring electrodes is infinitesimal, the following relation is valid:

$$
\begin{aligned}
V^{*}(Y,Z) = \varphi_4
&+ \frac{\varphi_2-\varphi_1}{\pi}\left[\arctan\frac{\sinh(\pi(Z+Z_1)/d)}{\cos(\pi Y/d)} + \arctan\frac{\sinh(\pi(Z-Z_1)/d)}{\cos(\pi Y/d)}\right] \\
&+ \frac{\varphi_3-\varphi_2}{\pi}\left[\arctan\frac{\sinh(\pi(Z+Z_2)/d)}{\cos(\pi Y/d)} + \arctan\frac{\sinh(\pi(Z-Z_2)/d)}{\cos(\pi Y/d)}\right] \\
&+ \frac{2(\varphi_4-\varphi_3)}{\pi}\arctan\frac{\sinh(\pi Z/d)}{\cos(\pi Y/d)},
\end{aligned}
\tag{IV.5}
$$
where φ_i is the potential of the ith electrode (i = 1, 2, 3, 4), and Z₁, Z₂ are the coordinates of the slits between the plates of the first and second electrodes and of the second and third electrodes, respectively. As before, d is used in calculations as the unit of length, and φ₁ plays the role of the unit of potential.

An investigation of the capabilities of the new mirror was carried out for the entrance angle θ₀ = 30°. Mirrors forming a stigmatic image in the symmetrical case were considered. For such mirrors, at fixed widths P₂ and P₃ of the second and third electrodes and at a fixed value of φ₄, the dependence of φ₂ on φ₃ can be represented by several curves analogous to those depicted in Fig. 11. For the four-electrode mirror, the branch analogous to branch b of the third group is the most interesting from the practical point of view. The properties of just such mirrors are discussed next.
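The structure of Eq. (IV.5) can be checked numerically against the boundary conditions it has to satisfy. The sketch below is illustrative only: it assumes the reading of Eq. (IV.5) in which the mirror occupies the half-space Z < 0 with the wall at Z = 0, and it assumes (this is not stated in the text) that the slit coordinates are Z₂ = −P₃ and Z₁ = −(P₂ + P₃). The function name and the sample potentials, taken from the first row of Table III, are chosen for illustration.

```python
import math

def wall_mirror_potential(Y, Z, phi, Z1, Z2, d=1.0):
    """Potential V*(Y, Z) of the mirror with a "wall", following Eq. (IV.5)
    as read here. phi = (phi1, phi2, phi3, phi4); the wall is the plane Z = 0."""
    phi1, phi2, phi3, phi4 = phi
    c = math.cos(math.pi * Y / d)

    def A(u):
        # arctan(sinh(pi*u/d) / cos(pi*Y/d)); atan2 gives the correct
        # limit of +-pi/2 on the plates, where the cosine tends to zero.
        return math.atan2(math.sinh(math.pi * u / d), c)

    return (phi4
            + (phi2 - phi1) / math.pi * (A(Z + Z1) + A(Z - Z1))
            + (phi3 - phi2) / math.pi * (A(Z + Z2) + A(Z - Z2))
            + 2.0 * (phi4 - phi3) / math.pi * A(Z))

# Assumed geometry: P2 = 1.2, P3 = 1.5, all lengths in units of d.
Z2, Z1 = -1.5, -2.7
phi = (1.0, 0.2820, 0.1623, 0.4280)  # phi1 is the unit of potential

on_wall = wall_mirror_potential(0.2, 0.0, phi, Z1, Z2)      # expected phi4
on_plate3 = wall_mirror_potential(0.5, -0.75, phi, Z1, Z2)  # expected phi3
on_plate2 = wall_mirror_potential(0.5, -2.0, phi, Z1, Z2)   # expected phi2
on_plate1 = wall_mirror_potential(0.5, -4.0, phi, Z1, Z2)   # expected phi1
```

Each electrode surface comes out at its own potential, which is the consistency condition used here to fix the signs of the three coefficients.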
FIGURE 13. A general scheme of the mirror with a "wall" (a) and its projection onto the planes YZ and XZ (b). 1-4, the mirror electrodes; Z = Z_eff, the effective plane of reflection. (From Karetskaya and Saichenko, 1989.)
In Figs. 14-16 the parameters of the four-electrode mirror with P₂ = 1.2 and P₃ = 1.5 are shown versus the potential φ₃ for four values of the potential φ₄. The digits labeling the curves correspond to the following values of φ₄: 1, 0.000; 2, 0.428; 3, 1.000; 4, 1.500. Figure 14 gives the relation between the mirror electrode potentials for which the mirror creates (in the symmetrical case) a stigmatic image of the object. Figure 15 shows the position of the plane Z = Z_A with respect to the plane Z = Z₁. Figure 16 gives an idea of the values of the coefficients K₁ and K₄. The coefficient K₄ has a minimum at each of the indicated values of φ₄, and K₁, along with the minimal value of K₄, depends on φ₃ and φ₄ in such a way that the mirror electron-optical properties can be controlled effectively by varying the potentials. With φ₄ = 0.428, near φ₃ = 0.162, the coefficients K₁ and K₄ pass through
FIGURE 14. Interrelation between the electrode potentials of the mirror with a "wall" creating a stigmatic image of an object. θ₀ = 30°, P₂ = 1.2, P₃ = 1.5. 1, φ₄ = 0.000; 2, φ₄ = 0.428; 3, φ₄ = 1.000; 4, φ₄ = 1.500. (From Karetskaya and Saichenko, 1989.)
FIGURE 15. The position of the plane Z = Z_A in a mirror with a "wall" creating a stigmatic image of an object. The parameter values and notation are as in Fig. 14. (From Karetskaya and Saichenko, 1989.)
zero and can be made as small as one needs by an appropriate choice of the potentials. Table III presents the parameters of four-electrode mirrors that have different widths of the second and third electrodes and small coefficients of second-order spherical aberration. As in the three-electrode mirror with parallel electrode plates, the width of the second electrode's plates affects the value of the coefficient K₄; however, the effect is now weaker. For each mirror, the interval of φ₃ values is shown within which the coefficients K₁ and K₄ reverse their signs at a certain definite φ₄. It is worth mentioning that within this interval the coefficients K₅, M₁, and M₃ turn out to be small, along with K₁ and K₄. Comparison between the parameters of mirrors with the same width of the second electrode, P₂ = 1.20, and various widths of the third electrode, P₃ = 1.30, 1.50, and 2.00, shows that the
FIGURE 16. The dependence of the spherical aberration coefficients K₁ (a) and K₄ (b) on the potentials φ₃ and φ₄ in the mirror with a "wall". The parameter values and notation are as in Fig. 14. (From Karetskaya and Saichenko, 1989.)
TABLE III
FOUR-ELECTRODE MIRRORS CREATING A STIGMATIC IMAGE WHILE ELIMINATING SECOND-ORDER SPHERICAL ABERRATIONS, θ₀ = 30°ᵃ

P₂     P₃     φ₂       φ₃       φ₄       Z₁ − Z_A   Z_eff − Z₁   X₁ − X₀   −D₁    10^(…)K₁   10^(…)K₄   −10K₆   −10M₄
1.20   1.50   0.2820   0.1623   0.4280   3.89       1.10         9.60      16.6    6.0        4.3       1.8     2.7
              0.2821   0.1624   0.4280   3.90       1.10         9.60      16.6   −0.48      −13        1.7     2.7
1.20   1.30   0.2820   0.1625   0.3030   3.89       1.10         9.60      16.6    2.4        3.9       1.9     2.8
              0.2822   0.1626   0.3030   3.90       1.11         9.61      16.6   −1.5       −4.9       1.7     2.7
1.20   2.00   0.2820   0.1624   1.425    3.89       1.10         9.60      16.6    7.0        2.5       1.8     2.8
              0.2821   0.1625   1.425    3.90       1.10         9.61      16.6   −0.43      −1.6       1.7     2.7
1.21   1.50   0.2812   0.1538   0.6650   3.89       1.10         9.57      16.6    3.4        0.11      1.8     2.8
              0.2813   0.1539   0.6650   3.89       1.10         9.58      16.6   −4.5       −4.3       1.8     2.8
1.30   1.50   0.2766   0.0373   4.815    3.87       1.07         9.45      16.4    2.9        0.70      1.9     2.9
              0.2767   0.0373   4.815    3.87       1.07         9.45      16.4   −8.0       −4.0       1.8     2.9

ᵃ From Karetskaya and Saichenko (1989).
TABLE IV
THREE-ELECTRODE MIRRORS WITH A "WALL"-ENCLOSED THIRD-ELECTRODE PLATE CREATING A STIGMATIC IMAGE WHILE ELIMINATING SECOND-ORDER SPHERICAL ABERRATION; θ₀ = 30°ᵃ

P₂      P₃     φ₂       φ₃ = φ₄   Z₁ − Z_A   Z_eff − Z₁   X₁ − X₀   −D₁    10^(…)K₁   10^(…)K₄   −10K₆   −10M₄
1.188   1.50   0.2833   0.1724    3.91       1.11         9.66      16.7    1.3        1.6       1.8     2.8
               0.2833   0.1725    3.91       1.11         9.66      16.7   −0.045     −2.1       1.8     2.9
1.187   1.00   0.2833   0.1729    3.91       1.11         9.65      16.7    3.5        2.1       1.8     2.8
               0.2835   0.1730    3.91       1.11         9.66      16.7   −3.2       −1.4       1.7     2.8
1.171   0.50   0.2855   0.1847    3.93       1.13         9.73      16.8    1.7        3.9       1.7     2.8
               0.2856   0.1848    3.93       1.13         9.73      16.9   −4.0       −8.1       1.7     2.8

ᵃ From Karetskaya and Saichenko (1989).
positions of the object and its image do not vary, and the same is true for the linear dispersion. In the mirrors with P₃ = 1.50 and P₂ = 1.20, 1.21, and 1.30, the differences in the coordinate Z_A and in the dispersion are more noticeable. The range of P₂ in which the required focusing quality is achievable is much narrower than the range of P₃. The calculation shows that with P₂ = 1.1 or 1.4 one cannot make the coefficients K₁ and K₄ simultaneously as small as desired.

Table IV gives the parameters of the mirrors depicted in Fig. 13 with equal potentials of the third and fourth electrodes, φ₃ = φ₄. Recall that in a mirror with two-plate electrodes the width (the extension in the Z-direction) of the edge electrode plates must be not less than 3d in order to preserve the required field symmetry in the region of particle motion. Closing the third electrode plates in the manner indicated above allows the overall dimensions of the system to be reduced. Table IV lists the parameters of mirrors having various widths P₃ of the third electrode at θ₀ = 30°. In this mirror, as in the three-electrode mirror with parallel electrode plates, P₂ is selected by the condition of smallness of the coefficient K₄. The results presented demonstrate that when P₃ decreases to 0.5, the coefficients K₁ and K₄ reverse their signs within a common range of φ₃. Comparison between the electron-optical parameters of the mirror "with a wall" having P₃ = 1.5 and those of the three-electrode mirror with parallel electrode plates at the same incidence angle θ₀ = 30° (Table II) shows their complete identity. It is evident that in the other mirrors from Table II (with other values of θ₀) one can close the third electrode plates in the same manner at a distance ≥ 1.5d from the second slit, with the mirror electron-optical parameters being conserved.

C. Application of Mirrors with Two-Plate Electrodes Separated by Direct Slits in a Mass Spectrometer
In the work by Karetskaya et al. (1984) it was proposed to use a mirror with the focusing and dispersing properties discussed earlier in a static mass spectrometer with a sector magnetic field in order to achieve focusing in energy. Schemes with a real intermediate image and with an imaginary one were discussed. Most attention was paid to matching the designs of the mirror and the magnetic analyzer: their mid-planes coincide, and ions move freely between the magnet poles and the mirror electrode plates. The broad capabilities of the mirror were noted. It can be used for beam focusing in the direction perpendicular to the mid-plane if the beam is not focused by the magnetic field in this direction. The mirror solves
this problem with ease, and its optical strength in this direction can be varied by a small change in the electrode potentials. One can significantly suppress the aberrations of the magnetic field with the help of the mirror. The size of a mirror that provides energy focusing is smaller than that of the cylindrical or spherical deflectors traditionally used for the same purpose. Further, for a mass-spectrometer arrangement with a sector analyzer intended for secondary-ion analysis, an energy filter based on a three-electrode mirror with two-plate electrodes separated by direct slits was designed, built, and tested by Daukeev et al. (1985). The design of the mirror and the arrangement scheme are shown in Figs. 17 and 18. The mirror electrodes, 1-3, are made of annealed iron, and the vacuum chamber, 5, is manufactured from stainless steel. Each electrode consists of a pair of identical parallel plates. The plates of the last electrode, 3, are joined by the iron bridge, 4. The first electrode of the mirror and the chamber are grounded. The calculated value of the potential at the intermediate electrode, 2, is 0.477U; at the third electrode the potential is 0.56U, where U is the accelerating potential. Here potentials are measured with respect to ground. The angle at which the beam axial trajectory enters
FIGURE 17. The vacuum chamber with a mirror. (From Daukeev et al., 1985.)
FIGURE 18. The scheme of a mass-spectrometric installation with a sector magnetic analyzer and a three-electrode mirror. 1, the studied sample; 2, an ion gun; 3, the electrostatic lens system forming a secondary ion beam; 4, the entrance slit of the mass analyzer; 5, the plate of the spectrometer analytical stand; 6, the electromagnet poles; 7, the electromagnet yoke; 8, the excitation winding; 9, an intermediate slit; 10, the three-electrode mirror; 11, the exit slit; 12, the receiver. (From Daukeev et al., 1985.)
the mirror field equals 45°, and the dispersion in energy is 200 mm. Sizes in Fig. 17 are given in millimeters. All the data characterizing the mirror ion-optical properties, including aberrations, can be found in Table II. The mass spectrometer arrangement (Fig. 18) was made by reconstruction of an industrial mass spectrometer MI-1201 ("Electron," Sumy) intended for layer-by-layer analysis of solids by the secondary-ion mass-spectrometry technique. The surface of the studied sample, 1, is bombarded by argon ions with an energy of 1-10 keV at a current density of 30-50 μA/cm². After acceleration and focusing by a system, 3, composed of three diaphragms and a quadrupole lens, the secondary ions produced by the bombardment enter the magnetic analyzer via an entrance slit, 4, of width 1 mm and height 7 mm. The accelerating potential U varies within 1-5 kV. The electromagnet poles have a sector form. The beam is deflected by the magnetic field by 90°. It enters the field at an angle of 26.5° and leaves it at the same angle. Stigmatic focusing is provided. The curvature radius of ion trajectories in the homogeneous magnetic field equals 200 mm. The dispersion of the magnetic analyzer in mass (energy) is 400 mm. The mirror, 10, with two-plate electrodes is located after the magnet. With such an arrangement, all new assemblies could be placed on the plate of the analytical stand of the mass spectrometer MI-1201. The intermediate slit, 9, in whose plane the magnetic field creates the image of the entrance slit, 4, has a width of 0.5 mm and a height of 5 mm, and it serves as the entrance slit for the mirror. The exit slit, 11, of the spectrometer is located in front of the receiver, 12, and has the same dimensions. With the mass spectrometer
tuning, which is optimal with respect to the ratio of resolving power to sensitivity, the values of the potentials at the mirror electrodes differ from the calculated ones by several percent. In Fig. 19a a section of the molybdenum secondary-ion mass spectrum measured at the arrangement without a mirror is given. The same section, measured at the same arrangement supplemented by a three-electrode mirror, is shown in Fig. 19b. The conditions of spectrum acquisition are almost the same. In the second case the lines of all molybdenum isotopes are separated completely, and the shape of the lines is improved noticeably. In Fig. 20, a section of the mass spectrum of the secondary ions from a sample containing niobium and zirconium, obtained without a mirror (a) and with a mirror (b), is shown. A comparison between the spectra demonstrates that the resolving power can be improved significantly by means of a mirror, particularly near the base of a line. For instance, at 10% of the Nb⁺ peak height, the resolving power is 40 for the installation without a mirror and 220 with it. Simultaneously, the peak heights and the sensitivity increase. A visual demonstration of the resolving-power improvement near the line bases is the appearance of the peaks of hydrocarbons with mass numbers 93, 99, 101 (Fig. 19b), and 94 (Fig. 20b), which, when measured without a mirror, are masked by the long "tails" of the Mo and Nb lines. It was established that the tested electrostatic analyzer is easy to tune, convenient, and operationally stable.

FIGURE 19. A section of the Mo secondary-ion spectrum measured at the installation without a mirror (a) and with a mirror (b). (From Daukeev et al., 1985.)
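The calculated electrode potentials quoted for this mirror (0.477U and 0.56U relative to ground, entrance angle 45°) can be cross-checked against the 45° row of Table II. The sketch below rests on two assumptions that are not spelled out in the text: that the normalized potentials of Table II are counted from the level at which the particle kinetic energy is zero, so that the grounded first electrode corresponds to φ₁ = 1, and that |D₁| in Table II is the linear dispersion in units of the plate separation d. The function name is illustrative.

```python
def ground_referenced(phi_normalized):
    """Electrode voltage relative to ground, in units of the accelerating
    potential U, for a normalized potential phi (assumed: phi = 1 on the
    grounded first electrode, phi = 0 where the particles would stop)."""
    return 1.0 - phi_normalized

# Row theta0 = 45 deg of Table II.
phi2_45, phi3_45, abs_D1_45 = 0.5227, 0.4386, 19.5

u2 = ground_referenced(phi2_45)  # compare with the quoted 0.477 U
u3 = ground_referenced(phi3_45)  # compare with the quoted 0.56 U

# Plate separation implied by a 200 mm linear dispersion, if |D1| is in units of d.
d_mm = 200.0 / abs_D1_45
```

Under these assumptions the table reproduces the quoted potentials to the stated precision, and the implied plate separation is about 10 mm.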
FIGURE 20. A section of a Nb and Zr secondary-ion spectrum measured at the installation without a mirror (a) and with a mirror (b). (From Daukeev et al., 1985.)
Later another three-electrode mirror with two-plate electrodes separated by direct slits was used to achieve focusing in energy in the mass spectrometer MI-1201 E manufactured by the Sumy plant "Electron" (Zhukovsky et al., 1989). The device is intended for secondary-emission investigation of solid and liquid samples sputtered by a beam of fast neutral atoms. The device is also equipped with sources of other types. The sector-type magnetic analyzer with a homogeneous field is as shown in the scheme in Fig. 18. For design reasons, a mirror with an entrance angle θ₀ = 34° was chosen for a mass spectrometer of this type. The mirror dispersion in energy was made equal to that of the magnetic analyzer, i.e., 400 mm, providing ion focusing in energy in the plane of the device exit slit. A mirror with very small second-order aberrations was chosen, so the device aberrations are determined mainly by those of the deflecting magnet. The mirror significantly improves the resolving power of the mass spectrometer: at the level of 10% of the peak height, it increases from 200 to 2000. The mirror transmission coefficient, i.e., the fraction of ions passing through the mirror and being registered, was determined. It turned out to be close to unity, which testifies to the quality of ion beam focusing in two mutually perpendicular directions. Long-term operation of the device has shown that the state of the surface of the mirror electrodes has a very weak effect on the mirror operation. This is because charged particles practically do not hit the electrode surfaces, and hence electrical charges do not build up on the electrodes even if they are strongly contaminated.
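The parameters of the θ₀ = 34° mirror are not listed in the text, but a rough idea of them can be obtained from Table II. The sketch below simply interpolates linearly between the tabulated rows; this is an assumption made for illustration, not the procedure used by the authors, and the plate separation it yields additionally assumes that |D₁| is the linear dispersion in units of d. The names `THETA`, `TABLE_II`, and `interpolate` are hypothetical.

```python
from bisect import bisect_right

# Columns of Table II (theta0 in degrees; P, phi2, phi3, |D1| as tabulated).
THETA = [25, 30, 35, 40, 45, 50]
TABLE_II = {
    "P":      [1.206, 1.188, 1.179, 1.175, 1.173, 1.171],
    "phi2":   [0.2129, 0.2833, 0.3593, 0.4398, 0.5227, 0.6055],
    "phi3":   [0.1050, 0.1725, 0.2535, 0.3440, 0.4386, 0.5348],
    "abs_D1": [16.9, 16.7, 17.4, 18.4, 19.5, 20.6],
}

def interpolate(theta0, column):
    """Piecewise-linear interpolation of one Table II column at angle theta0."""
    if not THETA[0] <= theta0 <= THETA[-1]:
        raise ValueError("theta0 outside the tabulated range 25-50 deg")
    # Index of the left end of the interval containing theta0.
    i = min(bisect_right(THETA, theta0) - 1, len(THETA) - 2)
    t = (theta0 - THETA[i]) / (THETA[i + 1] - THETA[i])
    ys = TABLE_II[column]
    return ys[i] + t * (ys[i + 1] - ys[i])

estimate_34 = {name: interpolate(34.0, name) for name in TABLE_II}

# Plate separation that would give a 400 mm linear dispersion.
d_mm = 400.0 / estimate_34["abs_D1"]
```

With these assumptions the estimate gives P ≈ 1.18, φ₂ ≈ 0.34, φ₃ ≈ 0.24, and a plate separation of roughly 23 mm; the values actually used in the instrument may differ.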
Experience with the manufacture and long-term maintenance of the serial mass spectrometer MI-1201 E has confirmed the advantages of the new energy analyzer. It is simple in design, easy to manufacture, possesses very good focusing properties, and is reliable and stable in operation. In contrast to the energy analyzers widely used for analogous purposes, it allows electrical tuning of the mass spectrometer to compensate for inaccuracies introduced in mechanical assembly. With a little practice, adjustment of the device with the mirror is not difficult. The optimal voltages at the mirror electrodes chosen during adjustment are in good agreement with the calculated values and require no retuning over a long time.
D. The Two-Cascade Energy Analyzer with Electrostatic Wedge-Shaped Mirrors
In energy analyzers based on mirrors with two-dimensional electrostatic fields, as follows from Eq. (III.23), the modulus of the relative dispersion in energy D₀ increases as the beam entrance angle θ₀ decreases. So, to reduce the overall dimensions of the analyzer at a specified dispersion in energy D₁, θ₀ should be decreased. However, in three-electrode mirrors with parallel electrode plates that provide a stigmatic image and high-quality focusing, the plane Z = Z_A approaches the mirror field boundary (Table II) as θ₀ decreases. Already at θ₀ = 26°, when |D₀| = 2.34, this plane turns out to be located practically at the electric field boundary, and Z_A = −2.71 (Kel'man et al., 1982). One can increase |D₀| substantially, while providing the same high-quality focusing, in wedge-shaped electrostatic mirrors if the plane Z = Z_A is beyond the field (Glickman et al., 1989, 1990, 1992a-d). In a wedge-shaped mirror the plates are inclined to the mid-plane and lie on two half-planes forming a dihedral angle α. The larger the angle α, the faster the field decays on approaching the edge of the dihedral angle. Thus, in a three-electrode wedge-shaped mirror possessing the mentioned properties, the plane Z = Z_A can be located beyond the field even for θ₀ = 14° (α = 45°). On going from θ₀ = 26° to θ₀ = 14°, the modulus of D₀ increases by a factor of three. However, despite the large gain in D₀, making α larger than 45° is unreasonable. At large α, the field boundary at which the plates of the first (with respect to the beam) electrode can be cut off without distorting the mirror field is close to the edge of the dihedral angle. The resulting clearance between the plates of the first electrode becomes too small for beam injection.
FIGURE 21. The two-cascade energy analyzer with wedge-shaped mirrors in two projections. 1, 2, object and image; 3-6, the mirror electrodes. (From Glickman et al., 1993.)
The advantages of wedge-shaped mirrors can be exemplified by a compact two-cascade energy analyzer composed of two identical two-electrode wedge-shaped mirrors. This analyzer is depicted schematically in Fig. 21 (see Glickman et al., 1993). The object, 1, and its image, 2, are located at the boundary of the energy analyzer field. The numbers 3 and 4 denote the electrodes of the first-cascade wedge-shaped mirror, and the numbers 5 and 6 refer to the second cascade. The dashed lines show the axial trajectory and one of the adjacent trajectories. The potential φ₁ is applied to the electrodes 3 and 5, whereas the reflecting potential φ₂ is applied to the electrodes 4 and 6. In each mirror the plates of neighboring electrodes are separated by direct slits and lie on two half-planes forming a dihedral angle α. The plates of electrode 4 (in the second mirror, electrode 6) are closed by a cylindrical surface of radius R whose symmetry axis coincides with the edge of the dihedral angle. The plane bisecting the dihedral angles is the mid-plane of the energy analyzer. In the energy analyzer, wedge-shaped mirrors with a dihedral angle α equal to 20° are used. The angle of beam entrance into each mirror, θ₀, equals 15°, with φ₂/φ₁ = 0.0044 and a relative dispersion in energy D₀ = −7.2. Every mirror provides stigmatic focusing as well as second-order focusing in the beam divergence angle in the mid-plane. The reader should be reminded that in two-electrode mirrors with parallel electrode plates an object and its image are always imaginary at this quality of focusing. The energy analyzer shown in Fig. 21 was built and applied (Shevelev et al., 1991) to improve the characteristics of the monopole secondary-ion mass spectrometer MS-7201 M manufactured by the plant "Electron," Sumy.
Neutral and scattered particles present in the registration channel of the device and the large ion energy spread in the basic beam meant that the device did not have high operational characteristics. To eliminate these harmful effects, a miniature energy analyzer with high-level focusing and dispersing properties was needed. It had to fit into a small volume of the device vacuum chamber, after the mass analyzer. The energy analyzer applied was small, 23 × 35 × 7 mm, and had a linear dispersion in energy equal to 153 mm.
FIGURE 22. Various types of energy analyzers with the same dispersion in energy. 1, 2, object and image.
In the improved device, the signal-to-noise ratio for pure silicon was increased 60-fold, and the limit of detection for boron in silicon was lowered by two orders of magnitude. The studied sample was sputtered by Ar⁺ ions with an energy of 8 keV (Shevelev et al., 1991). Figure 22 gives an idea of the size of the energy analyzers based on mirrors with two-plate electrodes separated by direct slits. Here energy analyzers with the same dispersion in energy are shown in projection onto the mid-plane: (a) a semi-spherical deflector (Afanas'ev and Yavor, 1978); (b) a mirror with parallel electrode plates closed by a "wall," θ₀ = 26° (Kel'man et al., 1982; Karetskaya and Saichenko, 1989); and (c) a wedge-shaped mirror, α = 20°, θ₀ = 20° (Glickman et al., 1990). Dashed lines depict the axial trajectories of the charged particle beams. In all cases an object and its image are located practically at the field boundary, and stigmatic focusing of the beam is provided. In the analyzers shown in Figs. 22b and 22c, the second-order spherical aberration in the dispersing direction is eliminated completely. In the analyzer shown in Fig. 22a, only one of the two coefficients of spherical aberration is equal to zero. Based on the results of the theoretical and experimental investigations described in this section, one can conclude that the analyzers treated here can be applied successfully both in the modernization and in the design of new electron and mass spectrometers.
V. PECULIARITIES OF CHARGED PARTICLE FOCUSING AND SEPARATION IN ENERGY IN A TRANSAXIAL MIRROR

The electrostatic field of a transaxial mirror within the region of charged particle motion is symmetric with respect to a certain axis and also to the plane perpendicular to it (the mid-plane). The particles move near this plane, almost perpendicularly to the field symmetry axis, which gives rise to the term "transaxial system" (Strashkevich, 1966). A theoretical investigation of the focusing and dispersing properties of the electrostatic transaxial system for the case of a curvilinear axial trajectory of the charged particle beam was carried out by Karetskaya and Fedulina (1982a). In this section the basic results of that work relating to mirrors are stated. Some results from the articles by Beizina et al. (1985, 1986) concerning transaxial mirror theory are also used here. For definiteness, an object and its image are assumed to be real when deriving the formulas. The foci and the principal points are also considered to be real, i.e., they are located on the axial trajectory itself rather than on its rectilinear continuations. The formulas obtained remain valid when an object, an image, or the cardinal elements are imaginary.

Let us introduce the cylindrical coordinate system R, Ψ, Y, the axis Y being coincident with the field symmetry axis, and the plane Y = 0 with the mid-plane. In this coordinate system, the electric field of the transaxial system is described by the potential V*(R, Y), which is a function of two variables. For the mirrors considered, the function V*(R, Y) is found analytically (Glickman et al., 1971). At the axial trajectory, R = r, Y = 0, and at the adjacent trajectory,
The angle between the radius vector r and the tangent to the axial trajectory is denoted by β (see Fig. 23); it is positive if it is counted counterclockwise from the radius vector. At the axial trajectory the potential is denoted as follows: V*(r, 0) = V(r) = φ.
FIGURE 23.
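The functions φ′(s), k(s), and φ_jk(s) entering the calculation formulas of the second section can be expressed via the potential V(r) and its derivatives. Taking into account Eq. (V.1) together with the Laplace equation for the potential V*(R, Y), we find that

$$
\begin{aligned}
\varphi' &= \frac{dV}{dr}\cos\beta, \qquad
\varphi_{20} = \frac{d^{2}V}{dr^{2}}\sin^{2}\beta + \frac{dV}{dr}\,\frac{\cos^{2}\beta}{r}, \qquad
\varphi_{02} = -\frac{d^{2}V}{dr^{2}} - \frac{1}{r}\frac{dV}{dr}, \\
\varphi_{12} &= \left(\frac{d^{3}V}{dr^{3}} + \frac{1}{r}\frac{d^{2}V}{dr^{2}} - \frac{1}{r^{2}}\frac{dV}{dr}\right)\sin\beta, \qquad
k = -\frac{\sin\beta}{2V}\,\frac{dV}{dr}.
\end{aligned}
\tag{V.2}
$$

In an electrostatic field possessing axial symmetry, the following integral of motion exists:

$$
R^{2}\dot{\Psi} = \text{constant}. \tag{V.3}
$$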
So, in the coordinate system s, x, y, one can write for the projection of an arbitrary trajectory onto the mid-plane of the transaxial system the following first-order differential equation:

$$
\sqrt{\frac{V^{*} + \varphi_0\varepsilon_0}{(1-kx)^{2} + x'^{2} + y'^{2}}}\;\bigl[(1-kx)(r\sin\beta - x) + x'\,r\cos\beta\bigr] = B. \tag{V.4}
$$

Here B is a constant that can be expressed via the object plane variables as follows:

Also, the functions a(s) = r sin β and b(s) = r cos β are introduced; their values in the object plane are denoted by a subscript "0." Beyond the field, at the rectilinear sections of the axial trajectory, these functions have simple meanings: a(s) is a constant equal, in modulus, to the length of the
perpendicular dropped from the point O onto the axial trajectory; |b(s)| is the distance between the foot of the perpendicular and the current point (Fig. 23). From Eq. (V.4) it follows that at the axial trajectory

$$
\sqrt{\varphi}\;r\sin\beta = \sqrt{\varphi_0}\;a_0. \tag{V.6}
$$

Both in the object space and in the image space of the mirror, the values of a are the same and equal to a₀. The sign of the constant a₀ determines the direction of motion along the axial trajectory.

A. Paraxial Approximation

In the linear approximation Eq. (V.4) acquires the form
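$$
x' + \frac{ka - 1}{b}\,x = \frac{B_0}{b\sqrt{\varphi}} + \frac{B_\varepsilon}{b\sqrt{\varphi}}, \tag{V.7}
$$

where

Taking into account that

$$
r' = \cos\beta, \qquad \beta' = k - \frac{\sin\beta}{r}, \tag{V.10}
$$

we shall find that

$$
b' = 1 - ka. \tag{V.11}
$$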
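The step from Eq. (V.10) to Eq. (V.11) can be written out explicitly. The short derivation below is a sketch that uses only the definitions a = r sin β and b = r cos β given in the text and assumes the reading β′ = k − sin β / r of Eq. (V.10); the companion relation for a′ is added here for illustration and is not quoted from the source.

```latex
\begin{align*}
b' &= (r\cos\beta)' = r'\cos\beta - r\,\beta'\sin\beta
    = \cos^{2}\beta - r\sin\beta\left(k - \frac{\sin\beta}{r}\right) \\
   &= \cos^{2}\beta + \sin^{2}\beta - k\,r\sin\beta = 1 - ka, \\[4pt]
a' &= (r\sin\beta)' = r'\sin\beta + r\,\beta'\cos\beta
    = \sin\beta\cos\beta + r\cos\beta\left(k - \frac{\sin\beta}{r}\right) = kb .
\end{align*}
```

In a field-free region k = 0, so a′ = 0 and b′ = 1: a stays constant and b grows with the path length, in agreement with the geometric meaning of a(s) and b(s) on the rectilinear sections of the axial trajectory.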
Now the solution of Eq. (V.7) can be written in the following form: (V.12) where C is an arbitrary constant. One can then find explicit forms for the particular solutions g and h introduced in the second section, and for the linear dispersion D as well. However, we shall not do so here, because all the information on the specific properties of transaxial mirrors can be obtained most simply by analyzing Eq. (V.7) directly. From this equation it follows that in the mirror image space, (V.13)
The subscript "1" again denotes the values of variables relating to the plane of the Gaussian image Q₁. As

$$
x_1 = x_0 m_x + D_1\varepsilon_0, \tag{V.14}
$$

then

$$
x_1' = \frac{x_0' b_0 + x_0 (m_x - 1) + D_1\varepsilon_0}{b_1}. \tag{V.15}
$$

On the other hand, it is known that

$$
x_1' = x_0'\gamma_x - \frac{x_0}{f_x} + D_1'\varepsilon_0. \tag{V.16}
$$

On comparing the coefficients of x₀′, x₀, and ε₀ in Eqs. (V.15) and (V.16), we find that in a transaxial mirror the angular magnification is

$$
\gamma_x = \frac{b_0}{b_1}. \tag{V.17}
$$

The linear dispersion in energy is

$$
D_1 = D_1' b_1, \tag{V.18}
$$

and the image equation has the following form:

$$
\frac{1}{b_1} - \frac{1}{b_0} = \frac{1}{f_x}. \tag{V.19}
$$

Setting b₀ = −∞, we find that b₁ = b_F1 for the focus F₁, and setting b₁ = ∞, we obtain b₀ = b_F0 for the focus F₀, i.e.,

$$
b_{F1} = -b_{F0} = f_x. \tag{V.20}
$$

It should be noted, returning to Eq. (V.7), that in the image space

$$
b x_1' - x = b_0 x_0' - x_0. \tag{V.21}
$$

It follows that if b₀ = 0 in the object plane, then in the image plane

$$
b_1 = 0, \qquad x_1 = x_0, \qquad D_1 = 0. \tag{V.22}
$$

Thus, in a transaxial mirror, the planes that contain the field symmetry axis and are perpendicular to the axial trajectory, or to its rectilinear continuation, in the object space and in the image space are always principal planes. Note that the linear dispersion in the principal plane of such a mirror equals zero. The arrangement of the cardinal elements in concave and convex transaxial mirrors is depicted in Fig. 24. Here the mirrors themselves are shown schematically in projection onto the mid-plane; only the slits of average
FIGURE 24. Cardinal points and planes of electrostatic transaxial mirrors: (a) concave, (b) convex.
radii $R_1$, separating neighboring electrode plates, are shown. The principal planes are denoted by $X_0$ and $X_1$, while the anti-principal ones, for which $m_x = -1$, are denoted by $P_0$ and $P_1$. In addition, the effective surface of reflection, $R = R_{\rm eff}$, and the angle $\alpha$ of the axial trajectory deviation in a mirror field are shown. For a concave mirror $\alpha > 0$, whereas for a convex one $\alpha < 0$. Evidently, $R_{\rm eff}$ and $\alpha$ are connected by the following relation:
$$R_{\rm eff} = \frac{a_0}{|\cos(\alpha/2)|}. \tag{V.23}$$
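The relation (V.23) is easy to check against the tabulated axial-trajectory data. The sketch below is a minimal Python illustration; the function name is hypothetical, and the sample rows are read from Table VI under the assumption that its columns are ordered $R_{\rm eff}$, $r_{\max}$, $\alpha$ and that $a_0 = 0.6 R_1$.

```python
import math

def effective_reflection_radius(a0: float, alpha_deg: float) -> float:
    """Eq. (V.23): R_eff = a0 / |cos(alpha / 2)|, with alpha given in degrees."""
    return a0 / abs(math.cos(math.radians(alpha_deg) / 2.0))

# (R1, alpha in degrees, tabulated R_eff) for phi_2 = -0.08, concave mirrors
rows = [(10, 113.3, 10.92), (30, 108.5, 30.81), (50, 107.6, 50.79)]
for r1, alpha, tabulated in rows:
    computed = effective_reflection_radius(0.6 * r1, alpha)
    print(f"R1 = {r1}: computed {computed:.3f}, tabulated {tabulated:.2f}")
```

For these three rows the computed values agree with the tabulated ones to within about 0.01, which is the rounding accuracy of the tabulated angles.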
B. Aberrations

Let us perform the required expansions in powers of small parameters in Eq. (V.4), keeping terms up to and including the second order of smallness. Then for the mirror image space we obtain the following equation:
$$b x' - x - \frac{a_0}{2}\left[(x')^2 + (y')^2\right] + \frac{b}{2} x' \varepsilon_0 - \frac{1}{2} x \varepsilon_0 = b_0 x_0' - x_0 - \frac{a_0}{2}\left[(x_0')^2 + (y_0')^2\right] + \frac{b_0}{2} x_0' \varepsilon_0 - \frac{1}{2} x_0 \varepsilon_0. \tag{V.24}$$
Then we shall take into account that an arbitrary trajectory coordinate in the principal plane of the object space is
$$x = x_{X_0} = x_0 - b_0 x_0', \tag{V.25}$$
and that in the principal plane of the image space in the paraxial approximation,
$$x = x_{X_1} = x_{X_0}. \tag{V.26}$$
On making use of Eqs. (V.25) and (V.26) and on solving Eq. (V.24) by the method of successive approximations, we determine $x_{X_1}$ in the second approximation: (V.27) On substituting
$$x_1' = x_0' \gamma_x - \frac{x_0}{f_x} + D_1' \varepsilon_0, \qquad y_1' = y_0' \gamma_y - \frac{y_0}{f_y},$$
we have obtained the expressions for the coefficients $K_i$ ($i = 1, 2, \ldots, 9$) in the plane $X_1$:
$$\ldots, \quad K_4 = \frac{a_0}{2}\left(1 - \gamma_y^2\right), \quad \ldots \tag{V.28}$$
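The structure of these coefficients can be checked in the simplest case. The short derivation below is a sketch under stated assumptions: Eq. (V.24) is taken in the form quoted in the comment lines, only the terms quadratic in the initial slopes are followed ($x_0 = y_0 = 0$, $\varepsilon_0 = 0$), and the label $K_1$ for the coefficient of $(x_0')^2$ is assumed from the ordering of the coefficients used later in the text.

```latex
% Assumed form of Eq. (V.24):
%   b x' - x - (a_0/2)[x'^2 + y'^2] + (b/2) x' eps_0 - (1/2) x eps_0
%     = b_0 x_0' - x_0 - (a_0/2)[x_0'^2 + y_0'^2] + (b_0/2) x_0' eps_0 - (1/2) x_0 eps_0
\begin{align*}
% principal planes: b = b_0 = 0, with \varepsilon_0 = 0
x_{X_1} &= x_{X_0} + \frac{a_0}{2}\left[(x_0')^2 - (x_1')^2 + (y_0')^2 - (y_1')^2\right], \\
% paraxial slopes for x_0 = y_0 = 0
x_1' &= \gamma_x x_0', \qquad y_1' = \gamma_y y_0', \\
x_{X_1} - x_{X_0} &= \frac{a_0}{2}\left(1 - \gamma_x^2\right)(x_0')^2
                   + \frac{a_0}{2}\left(1 - \gamma_y^2\right)(y_0')^2 .
\end{align*}
```

Under these assumptions $K_1 = (a_0/2)(1 - \gamma_x^2)$ and $K_4 = (a_0/2)(1 - \gamma_y^2)$, so that $K_1$ vanishes for $\gamma_x = 1$ and $K_4$ vanishes for $\gamma_y = \pm 1$.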
Thus, if the electron-optical parameters characterizing mirror properties in the paraxial approximation are known, then the magnitudes of the coefficients $K_i$ are known too. For the particular case when an object is located in the principal plane $X_0$, $\gamma_x = 1$ and the aberration coefficient $K_1 = 0$. If, in addition, a stigmatic image is produced in the plane $X_1$, then, evidently, $\gamma_y = \pm 1$, and the coefficient $K_4$ also equals zero. The coefficient $M_1$ equals zero, too [see Eq. (II.29)]; i.e., the second-order spherical aberration in both the x- and y-directions is absent in this case. As was shown earlier, in the plane $X_1$ the linear energy dispersion also equals zero. With $D_1' = 0$, all second-order chromatic aberrations are absent, too. It follows that an excellent deflecting system without beam separation in energy can be constructed on the basis of a transaxial mirror. In the case of a concave mirror, an object and its image are real, while for a convex mirror both are imaginary. In an arbitrary plane of the image space that is perpendicular to the axial trajectory, the coefficients $K_i$ are determined as follows:
$$K_i = K_i\big|_{X_1} + b K_i' \qquad (i = 1, 2, \ldots, 9), \tag{V.29}$$
where $K_i'$ are the angular aberration coefficients in this space. Following symmetry ideas, Beizina et al. (1986) have shown that in a transaxial mirror the coefficients $K_i$ are interconnected by the following relations: (V.30) (V.31) and the coefficients $K_i'$ are determined as follows: (V.32) (V.33) Here
$$p_{x0} = s(F_{x0}) - s_0, \qquad p_{x1} = s_1 - s(F_{x1}), \tag{V.34}$$
$$p_{y0} = s(F_{y0}) - s_0, \qquad p_{y1} = s_1 - s(F_{y1}). \tag{V.35}$$
Let us consider two particular cases.
In Case 1, an object is in the plane $P_0$, whereas its stigmatic image is in the plane $P_1$. For this case,
$$m_x = -1, \quad m_y = \mp 1, \quad p_{x0} = f_x, \quad p_{y0} = \pm f_y, \quad p_{x1} = f_x, \quad p_{y1} = \pm f_y, \tag{V.36}$$
$$K_1 - f_x K_2 = 0, \qquad K_4 \mp f_y K_5 = 0. \tag{V.37}$$
It follows that if the spherical aberration is eliminated, i.e., $K_1 = K_4 = 0$, then $K_2 = K_5 = 0$, too. In addition, in accordance with Eqs. (II.29), $M_1 = 0$ and $M_3 = 0$. Such mirrors will be exemplified in the next section. In Case 2, the initial coordinates of a trajectory are specified in the focal plane $s = s(F_{x0})$, and the aberration coefficients are calculated for the focal plane $s = s(F_{x1})$. The focal planes of the focusing in the x- and y-directions coincide: $s(F_{x0}) = s(F_{y0})$, $s(F_{x1}) = s(F_{y1})$. Then
$$p_{x0} = p_{x1} = p_{y0} = p_{y1} = 0, \tag{V.38}$$
(V.39) (V.40) (V.41) From Eqs. (V.29), (V.28), and (V.20), it follows that in this case the coefficients of the angular and linear geometric aberrations are interrelated in the simplest way, i.e., (V.42)
From here it is obvious, in particular, that elimination of the linear spherical aberration in the focal plane of the image space ($K_3 = K_6 = 0$) results in the elimination of the angular spherical aberration, $K_3' = K_6' = 0$, too. The latter possibility is significant for collimator objectives. According to Eq. (II.30), with $K_6 = 0$, the coefficient $M_4$ of spherical aberration in the y-direction also turns out to be equal to zero. Mirrors with such properties have been found through calculation and will be described later. The relations presented in this section not only allow one to check the correctness of calculations of the electron-optical parameters of a particular transaxial mirror, but also make the search for its optimal operating regimes in systems designed for particular purposes more effective.
VI. ENERGY ANALYZERS BASED ON TRANSAXIAL MIRRORS WITH TWO-PLATE ELECTRODES
A mirror with a two-dimensional field cannot transform a divergent beam into a parallel one or a parallel beam into a convergent one, because it represents an afocal (telescopic) system in the mid-plane. Neither does such a mirror possess angular dispersion in energy. These facts significantly restrict the region of its applicability. A transaxial mirror is free of such shortcomings. We shall now consider the focusing and dispersing properties of a number of particular transaxial mirrors.

A. Two-Electrode Mirrors
The convex and concave two-electrode transaxial mirrors are shown in Fig. 25. Their electron-optical properties were studied by Beizina et al. (1985, 1986, 1987) and Beizina and Karetskaya (1987, 1991a) by mathematical simulation. Each electrode of the mirror is formed of two identical parallel plates, situated symmetrically with respect to the mid-plane. For both electrodes the distance between the plates is the same and is denoted by $d$. The projections onto the mid-plane of the slits that separate the plates of different electrodes are shaped like sections of rings. The average radius of the slit is denoted by $R_1$. The electrode plates must be large enough for the required type of symmetry to be preserved in the region of charged particle motion. The charged particle beam enters the mirror field at a certain angle relative to the radial direction. It enters the concave mirror field from the side where the centers of curvature of the slits are located, whereas it enters the convex mirror field from the opposite side. The potential of the first
FIGURE 25. Two-electrode transaxial mirrors: (a) the concave mirror, (b) the convex mirror. 1, 2, mirror electrodes; 3, axial trajectory of the charged particle beam.
(second) electrode is denoted by $\varphi_1$ ($\varphi_2$). The potential of the object space, as well as of the image space of the mirror, denoted earlier by $\varphi_0$, coincides with $\varphi_1$.

1. The Potential Distribution
According to the theory stated earlier, when calculating the electron-optical properties of transaxial mirrors, just the distribution of the potential V(r) along the axial trajectory is needed. For the considered two-electrode mirror, in accord with the work by Glickman et al. (1971), it is determined by the formulas
for the concave mirror, and
(VI.1)
for the convex mirror, where $J_0$ and $J_1$ are the Bessel functions of the first kind of the zeroth and first orders. The formulas (VI.1) are derived under the assumption that the width $\delta$ of the slits between the plates of neighboring electrodes is infinitesimal. However, these formulas also determine the potential distribution with sufficient accuracy for a finite slit width if $\delta \le 0.1d$. Calculating the electron-optical parameters of a mirror on a computer is time-consuming when $V(r)$ is taken from Eqs. (VI.1), so in most cases the potential distribution (VI.1) has been approximated by simpler
functions:
$$V(r) = \frac{\varphi_2 - \varphi_1}{\pi} \arctan \sinh \frac{\pi (r - R_1)}{d} + \frac{\varphi_1 + \varphi_2}{2}$$
for the concave mirror, and
$$V(r) = \frac{\varphi_1 - \varphi_2}{\pi} \arctan \sinh \frac{\pi (r - R_1)}{d} + \frac{\varphi_1 + \varphi_2}{2} \tag{VI.2}$$
for the convex mirror. The replacement of Eqs. (VI.1) by Eqs. (VI.2) reduces the run time by approximately a factor of 50. For mirrors with $R_1 \ge 5d$, the difference between the results obtained with Eqs. (VI.1) and with Eqs. (VI.2) does not exceed 2% for the paraxial characteristics and 10% for the aberration coefficients. In the calculations the potential $\varphi_1$ of the first electrode of a mirror has been adopted as the unit of potential, and the distance between the plates forming an electrode as the unit of length, i.e., $\varphi_1 = 1$ and $d = 1$. The electric field is assumed to be absent beyond the region
$$R_1 - 3 \le r \le R_1 + 3.$$
Table V compares typical results obtained with Eqs. (VI.1) and (VI.2). In addition to the quantities whose meanings were explained above, the following characteristics are presented: $\varphi_u$ is the potential at the turning point of the axial trajectory, and $r_u$ is the distance between this point and the center O. From Eq. (V.6) it follows that
$$\varphi_u = \frac{a_0^2}{r_u^2}. \tag{VI.3}$$
In a concave mirror, the turning point is the point of the axial trajectory farthest from the center, $r_u = r_{\max}$. In a convex mirror, the turning point is closest to the center, $r_u = r_{\min}$. The values of $r_{\max}$ and $r_{\min}$ give an idea of the depth of particle penetration into the mirror field, whereas $\varphi_u$ characterizes the extent of particle braking.
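The approximate potential and the turning-point condition can be combined in a few lines. The following Python sketch is illustrative only: the function names are hypothetical, the potential is assumed in the arctan-sinh form of Eqs. (VI.2) with $\varphi_1 = d = 1$, and the turning point is assumed to satisfy $V(r_u) = a_0^2 / r_u^2$, i.e., Eq. (VI.3) with $\varphi_u = V(r_u)$. The sample parameters ($R_1 = 10$, $a_0 = 6$, $\varphi_2 = 0.15$, convex mirror) are an assumed reading of the first rows of Table V.

```python
import math

def v_two_electrode(r, r1, phi2, phi1=1.0, d=1.0, convex=False):
    """Approximate mid-plane potential of a two-electrode mirror (Eqs. VI.2)."""
    # arctan(sinh(u)) / pi runs from -1/2 (r << r1) to +1/2 (r >> r1)
    step = math.atan(math.sinh(math.pi * (r - r1) / d)) / math.pi
    jump = (phi1 - phi2) if convex else (phi2 - phi1)
    return jump * step + 0.5 * (phi1 + phi2)

def turning_radius(a0, r1, phi2, convex, half_width=3.0, tol=1e-10):
    """Bisection root of V(r) = a0**2 / r**2 inside R1 - 3 <= r <= R1 + 3."""
    def f(r):
        return v_two_electrode(r, r1, phi2, convex=convex) - (a0 / r) ** 2
    lo, hi = r1 - half_width, r1 + half_width
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("no turning point in the field region (no reflection)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_lo * f(mid) <= 0.0:
            hi = mid          # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f(mid)  # root lies in [mid, hi]
    return 0.5 * (lo + hi)

r_u = turning_radius(a0=6.0, r1=10.0, phi2=0.15, convex=True)
print(f"r_u = {r_u:.2f}, phi_u = {(6.0 / r_u) ** 2:.3f}")
```

Under these assumptions the sketch gives $r_u \approx 9.75$ and $\varphi_u \approx 0.379$, close to the values listed in Table V for the calculation with Eqs. (VI.2).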
2. Cardinal Elements and Dispersion in Energy

Let us present the results of calculations for concave and convex mirrors with the same $R_1$, equal to 10, 30, and 50, and with the same ratio $a_0/R_1 = 0.6$. The reader will then be able to compare the properties of two-electrode concave and convex mirrors, and to see how these properties are affected by the geometrical and electrical parameters of the mirror. In the calculations, $R_1$ and the ratio $a_0/R_1$, which determines the angle at which the axial trajectory enters the mirror field, have been specified. The angle $\alpha$ of axial trajectory deflection has been calculated.
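How the ratio $a_0/R_1$ fixes the entrance angle can be illustrated with elementary geometry. The sketch below assumes that $a_0$ is the impact parameter of the rectilinear part of the axial trajectory, i.e., its distance from the center O; this interpretation is an assumption consistent with Eq. (V.23), and the function name is hypothetical. For a straight line, the angle $\theta$ to the radial direction at radius $r$ then satisfies $\sin\theta = a_0/r$.

```python
import math

def incidence_angle_deg(a0: float, r: float) -> float:
    """Angle (degrees) between a straight trajectory with impact parameter a0
    and the radial direction at radius r; requires r >= |a0|."""
    if r < abs(a0):
        raise ValueError("the straight line never reaches radius r")
    return math.degrees(math.asin(a0 / r))

r1 = 10.0
a0 = 0.6 * r1
# At the field boundaries r = R1 - 3 (concave) and r = R1 + 3 (convex),
# and for the rectilinear continuation at the slit radius r = R1.
for r in (r1 - 3.0, r1, r1 + 3.0):
    print(f"r = {r:.0f}: angle to radial direction = {incidence_angle_deg(a0, r):.1f} deg")
```

The value at $r = R_1$ refers to the rectilinear continuation of the trajectory only, since inside the field the trajectory is already curved.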
TABLE V
THE ELECTRON-OPTICAL PARAMETERS OF TWO-ELECTRODE TRANSAXIAL MIRRORS^a
4
P2
V
deg.
r,,
10
0.1500
I I1 I
96.5 96.6 92.9 93.1
9.74 9.75 9.61 9.62
0.2500
I1
(pu
f,
0.380 0.379 0.390 0.389
4.90 4.79 6.12 6.09
0;
K,
K,
10 K3
-lo-' K4
-lo-' K5
-Ks
K,
K,
K9
0.378 0.374 0.658 0.654
5.6
0.22 0.19 6.2
-0.46 -0.47 2.3 2.2
12 12
6.7 6.7 8.5 8.6
9.5 9.6
16 16 48 48
1.6 1.6
2.1 2.1 6.9 6.9
5.5
39 38
6.2
^a I: V(r) has been calculated by Eq. (VI.1); II: V(r) has been calculated by Eq. (VI.2).
I5 16
12
12
4.5
4.5
FIGURE 26. The focal length $f_x$ and the angular dispersion $D_1'$ of the concave (a, b) and convex (a', b') mirrors. (From Beizina et al., 1985.)
In Fig. 26 the focal length $f_x$ and the angular dispersion in the image space $D_1'$ are depicted versus $\varphi_2$. In Fig. 27, the focal length $f_y$ is given together with the value of the particular solution $H$ (see Section II) in the focal plane $s = s(F_{x1})$. With the value of $H[s(F_{x1})]$, one can define the astigmatic difference,
$$\Delta F = s(F_{x1}) - s(F_{y1}) = s(F_{y0}) - s(F_{x0}) = f_y H[s(F_{x1})], \tag{VI.4}$$
and hence the positions of the foci $F_{y0}$ and $F_{y1}$. Digits near the curves denote the slit radius values: for 1, $R_1 = 10$; for 2, $R_1 = 30$; and for 3, $R_1 = 50$. On the side of small values of $\varphi_2$, the range of potential variation is limited by the rapid growth of aberrations of certain types; on the side of larger $\varphi_2$, it is limited by the change of regime from reflection (mirror) to refraction. In Tables VI and VII the parameters $R_{\rm eff}$, $\alpha$, and $r_{\max}$ ($r_{\min}$) characterizing the axial trajectory are presented. The data given demonstrate that the focal length $f_x$ of both the concave and convex mirrors is determined, over a wide range of values of $\varphi_2$, mainly by the radius of the slit separating the electrodes; it is almost independent of $\varphi_2$, the value of $f_x$ being somewhat less than $R_1/2$. With $\varphi_2 > 0.2$, the focal length $f_x$ of the concave mirror starts to increase. The curve of the
FIGURE 27. The value of $H = H[s(F_{x1})]$ and the focal length $f_y$ for the concave (a, b) and convex (a', b') mirrors. (From Beizina et al., 1985.)
focal length $f_x$ of the convex mirror splits into two branches, one of which lies in the region of positive values of $f_x$, while the other lies in the region of negative values. At a certain value of $\varphi_2$, the convex mirror transforms into a telescopic system in the x-direction: $f_x = \infty$. The angular dispersion $D_1'$ of the concave mirror increases as $\varphi_2$ grows, i.e., as the depth of particle penetration into the mirror field grows. At a specified $\varphi_2$, the greater $R_1$, the smaller the angular dispersion, which always remains positive. The angular dispersion of the convex mirror has a minimum for every $R_1$ and is always positive, too. It should be recalled that the linear dispersion in the focal plane of a transaxial mirror is $D = D_1' f_x$. Generally speaking, a space parallel beam is focused by a transaxial mirror in the plane $s(F_{x1})$ into a prime focus, with the height $p$ being determined by the condition
$$p = p_0 H[s(F_{x1})],$$
where $p_0$ is the initial size of the parallel beam in the y-direction. With $H[s(F_{x1})] = 0$, the parallel beam is focused into a point. In this case the focal planes of the focusing in the x- and y-directions coincide, $s(F_{x0}) = s(F_{y0})$, $s(F_{x1}) = s(F_{y1})$. For every value of $R_1$ in the concave mirror, there
TABLE VI
CONCAVE TWO-ELECTRODE MIRRORS: PARAMETERS OF AXIAL TRAJECTORIES, $a_0/R_1 = 0.6$^a

               R_1 = 10                  R_1 = 30                  R_1 = 50
 phi_2    R_eff  r_max  alpha,deg   R_eff  r_max  alpha,deg   R_eff  r_max  alpha,deg
 -0.08    10.92  10.1   113.3       30.81  30.1   108.5       50.79  50.1   107.6
 -0.04    10.99  10.1   113.8       30.86  30.1   108.6       50.84  50.1   107.7
  0.00    11.07  10.2   114.4       30.93  30.1   108.8       50.90  50.1   107.8
  0.02    11.12  10.2   114.7       30.97  30.2   108.9       50.94  50.2   107.8
  0.04    11.18  10.2   115.1       31.01  30.2   109.0       50.98  50.2   107.9
  0.06    11.21  10.2   115.5       31.06  30.2   109.2       51.03  50.2   108.0
  0.08    11.31  10.2   115.9       31.11  30.2   109.3       51.07  50.2   108.1
  0.10    11.39  10.2   116.5       31.17  30.2   109.4       51.13  50.2   108.1
  0.12    11.49  10.3   117.1       31.24  30.3   109.6       51.19  50.3   108.3
  0.14    11.61  10.3   117.8       31.31  30.3   109.8       51.27  50.3   108.4
  0.16    11.76  10.3   118.7       31.41  30.3   110.1       51.35  50.3   108.5
  0.18    11.96  10.4   119.8       31.52  30.3   110.3       51.45  50.3   108.7
  0.20    12.22  10.4   121.2       31.66  30.4   110.7       51.58  50.4   108.9
  0.22    12.61  10.5   123.2       31.83  30.4   111.1       51.73  50.4   109.1
  0.24    13.28  10.6   126.3       32.06  30.5   111.7       51.93  50.5   109.4
  0.26    14.84  10.7   132.3       32.39  30.5   112.5       52.20  50.5   109.8
  0.28                              32.92  30.6   113.7       52.60  50.6   110.8
  0.30                              33.96  30.7   116.0       53.28  50.7   111.5

^a From Beizina et al. (1985).
are two values of $\varphi_2$ at which $H[s(F_{x1})] = 0$. In the convex mirror, the number of such values reaches four. For both concave and convex mirrors, the focal length $f_y$ depends on $R_1$ relatively weakly, but it varies within very wide limits and changes its sign twice as $\varphi_2$ varies. Note that the properties of the transaxial convex mirror are quite different from those of the light-optical convex mirror. The latter always scatters a parallel light beam. The transaxial convex mirror, depending on the ratio of the electrode potentials, is capable of scattering a parallel charged particle beam (creating an imaginary focus), of converging it (creating a real focus), or of leaving the beam parallel (Fig. 28). This property is connected with the difference in the depth of penetration into the mirror field between a particle moving along the axial trajectory and one moving along an adjacent trajectory. A significant feature inherent in electrostatic mirrors is their capability to separate particles in energy, as the depth of particle penetration into the field also depends on energy. Here a substantial difference between the concave and convex mirrors is prominent. Whereas the concave mirror always deflects a more energetic particle to a
TABLE VII
CONVEX TWO-ELECTRODE MIRRORS: PARAMETERS OF AXIAL TRAJECTORIES, $a_0/R_1 = 0.6$

               R_1 = 10                  R_1 = 30                  R_1 = 50
 phi_2    R_eff  r_min  alpha,deg   R_eff  r_min  alpha,deg   R_eff  r_min  alpha,deg
 -0.08    9.351  9.91   100.3       29.28  29.9   104.1       49.26  49.9   105.0
 -0.04    9.311  9.89    99.8       29.23  29.9   104.0       49.22  49.9   104.9
  0.00    9.267  9.87    99.3       29.18  29.9   103.8       49.16  49.9   104.8
  0.04    9.215  9.84    98.7       29.12  29.8   103.6       49.09  49.8   104.7
  0.08    9.156  9.81    98.1       29.04  29.8   103.4       49.01  49.8   104.5
  0.12    9.085  9.78    97.3       28.95  29.8   103.1       48.92  49.8   104.3
  0.16    9.000  9.74    96.4       28.83  29.7   102.7       48.79  49.7   104.1
  0.20    8.895  9.69    95.2       28.68  29.7   102.2       48.62  49.6   103.8
  0.24    8.762  9.63    93.6       28.46  29.6   101.5       48.38  49.6   103.3
  0.28    8.589  9.56    91.4       28.13  29.5   100.4       48.00  49.5   102.6
  0.30    8.482  9.52    90.0       27.90  29.4    99.6       47.71  49.4   102.1
  0.32    8.357  9.47    88.2       27.57  29.3    98.5       47.28  49.3   101.2
  0.34    8.210  9.41    86.1       27.10  29.2    96.8       46.59  49.2    99.8
  0.35    8.127  9.38    84.8       26.77  29.2    95.5       46.04  49.1    98.7
  0.36    8.037  9.35    83.4       26.35  29.1    93.4       45.24  49.0    96.9
  0.38    7.837  9.27    80.1       25.07  28.8    88.2       41.92  48.5    88.6
  0.40    7.609  9.17    75.9       23.15  28.4    77.9       37.53  47.4    73.9
  0.42    7.364  9.06    70.9       21.58  27.8    66.9       35.43  46.3    64.3
  0.44    7.120  8.92    65.1       20.67  27.1    58.9       34.20  45.2    57.4
  0.46    6.899  8.78    59.1       20.10  26.5    52.8
  0.48    6.715  8.62    53.3       19.69  27.0    41.9
  0.50    6.570  8.46    48.1       19.39  25.5    43.6
larger angle, the convex mirror deflects a more energetic particle to a smaller angle (Fig. 29).

3. Geometrical and Chromatic Aberrations in the Focal Plane
Let us consider the behavior of the aberration coefficients of the two-electrode concave and convex mirrors with the paraxial properties discussed earlier. In Figs. 30, 31, and 32, the corresponding coefficients of geometrical and chromatic aberrations are presented as functions of $\varphi_2$. The coefficient values have been calculated in the focal plane $s = s(F_{x1})$. The coordinates of an arbitrary trajectory have been specified in the plane $s = 0$, at the mirror field entrance, where $r = r_a = R_1 - 3$ for the concave mirror and $r = r_a = R_1 + 3$ for the convex mirror. Let us denote the coordinates in this plane by $x_a$ and $y_a$. Then the coordinate $x = x_{F1}$ of an arbitrary trajectory in the plane $s = s(F_{x1})$, with an accuracy up to terms of
FIGURE 28. Various cases of parallel beam transformation by the concave (a, b) and convex (a', b', c', d') two-electrode mirrors. $R_1 = 10$, $a_0 = 6$. $\varphi_2$ = 0.04 (a); 0.25 (b); 0.20 (a'); 0.31 (b'); 0.34 (c'); 0.41 (d'). (From Beizina et al., 1987.)
the second order of smallness, is determined as follows:
$$x_{F1} = x_a' f_x + D_1' f_x \varepsilon_0 + K_1 (x_a')^2 + K_2 x_a' x_a + K_3 x_a^2 + K_4 (y_a')^2 + K_5 y_a' y_a + K_6 y_a^2 + K_7 x_a' \varepsilon_0 + K_8 x_a \varepsilon_0 + K_9 \varepsilon_0^2, \tag{VI.5}$$
where $K_i$ are the coefficients given in Figs. 30-32. When comparing the concave and convex mirrors, we note that, on the whole, the aberrations of the convex mirror are larger than those of the concave one. It should also be noted that the exit of charged particles from the mid-plane is accompanied
FIGURE 29. Particle deflection in the concave and convex mirrors for various energies of particles. The trajectories of the particles having greater energies are denoted by “2.” (From Beizina et al., 1987.)

FIGURE 30. The geometrical aberration coefficients determining the aberration correction for particles moving in the mid-plane (a, b, c, concave mirrors; a', b', c', convex ones). (From Beizina et al., 1986.)

FIGURE 31. The coefficients occurring on particle exit from the mid-plane (a, b, c, concave mirrors; a', b', c', convex ones). (From Beizina et al., 1986.)

FIGURE 32. The chromatic aberration coefficients (a, b, c, concave mirrors; a', b', c', convex ones). (From Beizina and Karetskaya, 1991a.)
by substantial deterioration of the focusing quality of both concave and convex mirrors. Large values of the coefficients $K_4$, $K_5$, and $K_6$, in comparison with the rest, confirm this observation. However, with a correct choice of parameters one can obtain a mirror with quite good focusing properties for a space beam. At definite values of $\varphi_2$, the aberration coefficients decrease substantially; many of them reverse their signs.

4. Various Regimes of Operation
A special calculation has been carried out in order to select two-electrode concave and convex mirrors that collect a space parallel beam into a real stigmatic focus. The parameters of such mirrors are presented in Tables VIII and IX. The stigmatic property is provided by a definite choice of the $\varphi_2$ values. According to these tables, the focal length and the angular dispersion can be controlled by varying $a_0$ and $R_1$. However, the calculations have demonstrated that in two-electrode transaxial mirrors one does not manage to focus a parallel beam into a point and simultaneously minimize both coefficients determining the spherical aberration in the focal plane ($K_3$ and $K_6$). In Table X the parameters of concave mirrors focusing a parallel beam into a prime focus are given. For such mirrors the coefficients $K_3$ and $K_6$
TABLE VIII
CONCAVE TWO-ELECTRODE MIRRORSFOCUSING A SPACE PARALLEL BEAMINTO ff,
R,
adR,
deg.
cp2
4%
10
0.4
137.9 146.3 115.2 128.8 134.5 137.4 109.1
- 0.2330 0.0669 0.0448 0.2504 - 0.2483 0.0790 0.0442
114.0 133.8 135.6 107.9 111.0
-0.2515 0.0817 0.0437 0.2915
0.155 0.141 0.347 0.320 0.158 0.153 0.356 0.345 0.160 0.156 0.359 0.351
0.6 P
VI
h,
30
0.4 0.6
50
0.4 0.6
0.2840
"From Beizina el al. (1986).
rm,
10.2 10.7 10.2 10.6 30.2 30.7 30.2 30.6 50.2 50.6 50.2 50.6
R,ii
-L
11.13 13.80 11.19 13.88 31.00 33.05 31.02 33.07 50.98 52.93 50.99 52.94
4.83 3.32 3.80 1.39 14.0 12.1 11.8 7.39 23.1 21.1 19.8 14.6
0; 0.175 1.94 0.326 3.80 0.056 0.609 0.099 1.10 0.033 0.360 0.058 0.640
A
POINT'
r,
K,
K2
lOK,
lO-'K,
lO-'K,
K6
4.41 -5.94 4.23 -6.68 13.6 -15.5 12.3 -16.7 22.9 -24.7 20.4 -25.6
2.1 -7.4 0.45 -25 3.4 -20 1.4 -110 4.7 -23 1.5 -140
0.16 3.1 - 0.33 17 0.68 2.1 0.49 13 1.7 1.6 0.59 9.2
- 0.020 10 1.7 29 - 0.37 0.64 0.50 3.8 - 2.0 0.29 0.29 1.6
0.77 1.o 2.0 1.8 0.67 1.1 1.4 1.8 0.66 1.1 1.3 1.9
2.2 2.9 4.4 3.9 1.9 3.1 3.5 4.5 1.9 3.2 3.3 4.8
6.2 8.3 9.9 8.8 5.7 9.2 8.6 11 5.6 9.5 8.5 12
CONVEXTWO-ELECTRODE ~ O
R
TABLE IX Focus~ro S A SPACE PARALLEL BEAMINTO
A
P o d
-ff,
6
R,
adR,
deg.
ps
pu
10
0.40 0.45 0.40 0.45 0.55 0.65 0.40 0.45 0.55 0.65
117.1 108.9 125.9 119.4 105.6 89.36 128.0 121.5 107.9 93.06
0.1225 0.1734 0.1319 0.1708 0.2670 0.4OOO 0.1384 0.1780 0.2730 0.3919
0.185 0.234 0.169 0.214 0.317 0.443 0.166 0.210 0.312 0.436
30
W
50
'From Beizina et al. (1987).
r,
9.30 9.31 29.1 29.2 29.3 29.3 49.0 49.1 49.2 49.3
Refl
f,
7.668 7.742 26.41 26.76 27.28 27.43 45.60 46.03 46.73 47.24
12.6 23.9 27.7 27.3 27.9 62.9 45.8 44.8 42.4 42.4
Di
-fy
1.87 3.51 1.89 15.0 1.41 0.550 1.21 0.762 0.973 3.00 1.04 40.2 0.440 1.34 0.506 1.16 0.877 0.91 1 0.722 4.60
K, 10-'K2 0.21 0.54 0.45 0.43 0.46 1.7 0.94 0.87 0.77 0.82
3.4 8.8 3.0 2.9 3.2 13 3.8 3.6 3.4 3.9
K3 -1O-'K., 1.4 3.6 0.47 0.47 0.55 2.4 0.38 0.37 0.37 0.46
2.6 5.2 3.3 3.2 3.6 10 4.4 4.3 4.2 4.9
-lo-' K5
-10-'K6
1.6 3.1 2.0 1.9 2.0 5.3 2.6 2.5 2.4 2.5
2.4 4.7 3.0 2.8 2.8 6.7 3.9 3.7 3.2 3.2
TABLE X
CONCAVE TWO-ELECTRODE MIRRORS FOCUSING A PARALLEL BEAM INTO A PRIME FOCUS^a
8 9 10 20 30 40
2.544 2.772 3.000 5.500 8.100 10.80
150.3 150.9 151.3 151.4 151.0 150.5
0.1081 0.0912 0.0846 0.0613 0.0521 0.0456
0.092 0.086 0.082 0.072 0.071 0.071
8.39 9.43 10.4 20.5 30.5 40.5
9.920 11.05 12.10 22.26 32.32 42.35
4.17 4.65 5.12 9.97 14.8 19.5
0.426 0.437 0.421 0.277 0.207 0.168
0.326 0.342 0.360 0.504 0.611 0.700
10.7 11.8 12.7 19.2 24.0 23.0
1.3 1.6 1.9 3.4 3.8 4.0
-4.2 -4.1 -3.8 -1.1 0.16 0.97
0.64 - 1.9
-3.3 -11 -15 - 19
0.094 0.49 0.71 1.9 2.6 3.2
^a From Beizina et al. (1986).
TABLE XI
CONVEX TWO-ELECTRODE MIRRORS OPERATING IN THE TELESCOPIC REGIME^a
10 20 30 40
50
4.875 12.25 20.70 29.40 39.00
120.2 91.3 82.0 76.3 69.1
0.2188 0.3584 0.4622 0.5271 0.5988
"From Beizina et of. (1987).
0.274 0.403 0.499 0.560 0.626
9.30 19.3 29.3 39.3 49.3
7.764 17.53 27.44 37.39 47.36
1.96 1.36 1.11 0.951 0.867
2.8 2.9 3.5 3.6 4.8
4.7 3.0 2.8 2.3 2.7
20 7.9 5.4 3.6 3.7
2.2 1.7 1.7 1.6 1.9
13 9.6 8.9 7.7 8.5
1.9 1.3 1.0 0.89 0.91
are sufficiently small. In the calculations, the parameters $R_1$, $a_0$, and $\varphi_2$ are chosen so that $|K_3|$ and $|K_6|$ do not exceed specified small bounds. Calculations have also been made to find the parameters of convex mirrors that leave a parallel beam parallel (see Table XI). The specified regime of operation is provided by selecting the required values of $a_0$ and $\varphi_2$. The angular dispersion of these mirrors, as follows from the table, is sufficiently large. The required value of $D_1'$ can be obtained by an appropriate choice of $R_1$. Now a few words about the deflection and focusing of a divergent charged particle beam by two-electrode transaxial mirrors. Two cases have been considered. In the first case, an object and its image are located in the anti-principal planes $P_0$ and $P_1$, $m_x = -1$, the particles in the image plane being separated in energy. In the second, an object and its image are in the principal planes $X_0$ and $X_1$, $m_x = +1$, the particles in the image plane not being separated in energy. In Tables XII and XIII the mirror parameters that provide stigmatic focusing in the plane $P_1$, $m_x = -1$, are given. In the calculations, the values of $\varphi_2$ and $a_0$ have been varied in order to choose systems with $G_1 = 0$ and with a certain specified deflection angle $\alpha$. For all mirrors from Tables XII and XIII, the condition $m_y = +1$ is valid. It implies that when a stigmatic image is achieved, the anti-principal planes $P_0$ and $P_1$ coincide with the principal planes of the y-direction of focusing. Unfortunately, in this case the focusing quality is unsatisfactory, judging by the too-large values of the spherical aberration coefficients $K_1$ and $K_4$. It will be shown later that the introduction of a third electrode allows one to improve the focusing properties of a mirror significantly. In Table XIV the parameters of the concave mirrors forming a stigmatic image, $m_x = +1$, in the plane $X_1$ are given.
Only in the concave mirror are an object and its image, both located in the principal planes $X_0$ and $X_1$, real. In the convex mirror the planes $X_0$ and $X_1$ are located in such a way that both an object and its image are imaginary. Because of this, convex mirrors have been excluded from consideration. In Table XIV, mirrors are divided into two groups. In group I, $m_y = -1$, i.e., the principal planes of the x-direction coincide with the anti-principal planes of the y-direction, whereas in group II, $m_y = +1$, i.e., the principal planes of the x- and y-directions coincide. According to Eqs. (V.18) and (V.28), for these mirrors in the plane $X_1$ the following relations are valid:
CONCAVE
TABLE XI1 TWO-ELECTRODE MIRRORS FOFWING A STIGMATIC IMAGE OF A POINT;AN O m C T AND ITS h ANTIPRINCIPAL PLANES Po AND Pl"
a,
$
R,
deg.
a0
P2
P.2
rum
Re,
-f*
-Dl
-4
10
140.0 130.0 140.0 120.0 100.0 80.0 140.0 120.0 100.0 80.0
4.800 5.920 11.40 16.57 21.30 25.46 18.18 26.50 34.06 40.61
0.1347 0.2435 0.0713 0.2304 0.4333 0.6570 0.0621 0.2115 0.4021 0.6076
0.203 0.310 0.138 0.292 0.484 0.694 0.129 0.274 0.452 0.644
10.7 10.6 30.7 30.7 30.6 30.6 50.7 50.7 50.6 50.6
14.04 14.02 33.30 33.15 33.14 33.23 53.20 53.03 53.00 53.02
2.30 1.37 12.1 8.34 4.53 1.54 21.6 16.8 11.1
13.0 10.9 16.6 17.2 15.0 11.5 17.0 19.1 19.0 15.5
2.14 2.71 1.87 4.22 6.10 6.44 1.77 4.44 7.26 9.10
30
50
From Beizina and Keretskaya (1987).
5.56
-1O-'K, 0.29 0.27 0.43 1.1 1.2 0.75 0.43 1.4 2.2 2.1
lo-' K2 1.3 2.0 0.36 1.3 2.6 4.9 0.20 0.83 2.0 3.8
4
3
1.6 4.4 0.094 0.44 1.7 10 0.033 0.15 0.52 2.0
G E ARE PI THE
lo-' K~ 0.31 0.51 0.26 1.6 1.7 3.4 0.23 1.8 5.3 7.9
10-'
K5 lo-' K ,
1.5 1.9 1.4 3.7 5.4 5.2 1.3 4.1 7.3 8.7
1.7 1.7 1.9 2.2 2.2 2.0 1.9 2.3 2.5 2.4
I1 9.S L'Z OEZ
S'E 8'Z E'Z ZI 6'5 07 O'E E'Z 6'1 L'E
02
S'P
65
E'9
IP
WI ZE E'6 8'2 LL'O 001 I
PL'O 6L O't
0s
ZL El I'E
SI 1'9 9'Z L'L
96-0 ZP'O OZ'O 060'0 SEO'O
E'P
E'I
95'0 PZ'O L60'0
8'5 9'1
91 L'S P'Z 0'1 W'O 66 91
0s 6'1 IL'O LP 5'9
S'EZ
06'2
"I
E X
667
P'W OIE
P'EI L1'8
89 61 OL 8'2 OLS LP
I1
E'W
P'LI 11'6 I8'Z 8'21 ZZ'S
P'6 E'9
EI'S
P'E Z'I
2'85
9'OZ 611 6'PS 1'9E 8-92
I'IZ S'SL 6'EE
O'ZP EE'LP I'EE ZC'LP 1-62 OE'LP O'LZ LZ'LP L'SZ IZ'LP L'LS EP'LZ 5-62 IP'LZ 8-12 6E'LZ P'81 SE'LZ 9'91 6Z'LZ WL'L POL'L
I'OZ
6L.6
E'6P E'6P E'6P E'6P E'6P E'62 E.6Z E'6Z E'6Z E'6Z I E'6 ZE'6
IW.0 08E'O ZOE'O 612'0
WI'O 8EP'O 19E'O L8Z'O 81Z'O
SSI'O LZZ'O
ILI'O
SLIP'O L6ZE'O
I SPZ'O w1-0
ILWO W6E'O
EOIE'O 2622'0
IPSI'O LL80'0 1991'0 LPOI'O
LP'EE IP'OE EI'LZ E9'EZ 96'61 6E.61 Z9'LI
IL'SI 89'EI
ES'II W P SS8'E
0'06 0'001 0'01I O'OZI
O'OCI
0s
0'06
0'001 0'01 I O'OZI
O'OEI
OE
IIn -t
0'01I
O'OZI
01
TABLE XIV
CONCAVE TWO-ELECTRODE MIRRORS FORMING A STIGMATIC IMAGE OF A POINT; AN OBJECT AND ITS IMAGE ARE IN THE PRINCIPAL PLANES $X_0$ AND $X_1$^a
$\alpha$,
R,
deg.
10
140.0 130.0
NN Group
I II
I I1
30
120.0 140.0 120.0 100.0 80.0
50
140.0 120.0 100.0
I I
II I II I I1
I I1 I II I II
I I1
80.0
I
60.0
I1 I
uo
3.810 4.740 4.710 5.887 5.590 10.60 11.30 15.50 16.52 19.94 21.27 23.80 25.44 17.42 18.12 25.49 26.48 32.76 34.03 39.08 40.59 44.23
cp2
- 0.2542 0.1266 -0,1472 0.2388 -0.0210 - 0.2999 0.0606 - 0.0918 0.2262 0.1639 0.43 11 0.4384 0.6568 -0.3087 0.0516 -0.1040 0.2082 0.1463 0.4001 0.4141 0.6065 0.6673
cp"
0.140 0.198 0.214 0.307 0.301 0.124 0.136 0.265 0.291 0.437 0.483 0.622 0.693 0.121 0.128 0.258 0.273 0.426 0.453
0.606 0.643 0.777
rmm
Rcff
10.2 10.6 10.2 10.6 10.2 30.2 30.7 30.2 30.6 30.2 30.6 30.2 30.6 50.2 50.7 50.2 50.6 50.2 50.6 50.2 50.6 50.2
11.13 13.86 11.15 13.93 11.17 31.00 33.05 3 1.01 33.07 31.03 33.10 31.07 33.21 50.99 52.93 50.98 52,94 50.99 52.96 51.01 52.99 51.07
0; 0.165 2.53 0.219 3.78 0.287 0.0535 0.567 0.0786 0.962 0.120 1.60 0.188 3.18 0.0105 0.320 0.0456 0.527 0.0680 0.822 0.102 1.36 0.170
-fx
4.90 2.47 4.53 1.44 4.06 14.5 12.5 12.9 8.58 10.7 4.64 8.03 1.57 24.6 22.1 21.5 17.1 18.5 11.3 14.6 5.65 9.67
^a From Beizina and Karetskaya (1987).
From the relation (II.29) it follows that for these mirrors, in addition, $M_1 = 0$. Thus, such mirrors are high-quality deflecting systems capable of providing second-order focusing with respect to the beam divergence angles in both the x- and y-directions.
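The role of the individual aberration coefficients discussed in this subsection can be made concrete by evaluating the second-order focal-plane expansion of $x_{F1}$ from Section VI,A,3. The Python sketch below is a minimal illustration: the function name is hypothetical, and all numerical values (the coefficients $K_i$, $f_x$, $D_1'$, and the initial coordinates) are invented for the example rather than taken from the tables.

```python
def focal_plane_x(xa, xa_p, ya, ya_p, eps0, fx, d1_prime, k):
    """Second-order focal-plane coordinate x_F1.

    xa, ya     : initial coordinates at the field entrance
    xa_p, ya_p : initial slopes
    eps0       : relative energy deviation
    k          : mapping from index i = 1..9 to the coefficient K_i
    """
    paraxial = fx * (xa_p + d1_prime * eps0)
    geometric = (k[1] * xa_p**2 + k[2] * xa_p * xa + k[3] * xa**2
                 + k[4] * ya_p**2 + k[5] * ya_p * ya + k[6] * ya**2)
    chromatic = k[7] * xa_p * eps0 + k[8] * xa * eps0 + k[9] * eps0**2
    return paraxial + geometric + chromatic

# Hypothetical coefficients: only K3 and K6 are nonzero
k = {i: 0.0 for i in range(1, 10)}
k[3], k[6] = 0.05, -0.2

# Monoenergetic parallel beam: zero slopes, zero energy deviation
x = focal_plane_x(xa=0.1, xa_p=0.0, ya=0.2, ya_p=0.0, eps0=0.0,
                  fx=5.0, d1_prime=0.4, k=k)
print(x)
```

For a monoenergetic parallel beam only the terms with $K_3$ and $K_6$ survive, which is why these two coefficients determine the spherical aberration in the focal plane.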
B. Three-Electrode Mirrors

Investigation of two-plate-electrode transaxial concave and convex mirrors (Section VI,A) has shown that such mirrors manage to solve various tasks of charged particle beam transformation. The focal lengths $f_x$ and $f_y$, as well as the angular dispersion $D_1'$, can be varied within wide ranges, stigmatic focusing can be provided, and the second-order spherical aberrations can be eliminated when charged particles are deflected without separation in energy. However, in a number of practically important cases the focusing
quality leaves much to be desired. Thus, in a two-electrode mirror with sufficiently large angular dispersion, one cannot focus a space parallel beam into a point and, at the same time, make the second-order spherical aberration small. Designing a large-angular-aperture energy analyzer on the basis of two-electrode mirrors is also impossible. Introduction of a third electrode allows these deficiencies to be eliminated (Beizina and Karetskaya, 1988a, b, 1991b).
1. The Potential Distribution

In Fig. 33 axonometric views of three-electrode concave and convex mirrors are given along with the appropriate notation. In both the concave and the convex mirrors, the potentials of the first, second, and third electrodes (counted along the beam) are denoted by $\varphi_1$, $\varphi_2$, and $\varphi_3$, respectively, and the average radii of the slits separating the plates of neighboring electrodes are denoted by $R_1$ and $R_2$. Note that in the concave mirror $R_1 < R_2$, while in the convex one $R_1 > R_2$. The distance $d$ between the electrode plates is the same for all electrodes. The potential at the current point of the axial trajectory, at a distance $r$ from the center O, has been calculated, as a rule, as in the two-electrode case, by approximate formulas representing $V(r)$ by a sum of elementary functions. For a three-electrode mirror,
V(r) = " ~
"+ " - " arctan sinh n 2
+
n(r - R , ) d
n(r - R,) +% arctanI sinh -% d 71
ty
a
b
tY
FIGURE 33. Three-electrodetransaxial mirrors. (a) A concave mirror; (b) aconvex one. (From Beizina and Karetskaya, 1988b.)
S. P. KARETSKAYA et al.
460
(VI.6)
for the concave mirror, and V(r) = " "+ " - "arctansinh 2 R
x(r - R , )
+
+
R
arctan sinh
d
n(r - R2) d
for the convex mirror. In a number of cases, to verify the approximate expressions, the potentials have been calculated by exact formulas having the forms
for the concave mirror, and
(VI.7)
for the convex mirror, where J_0 (J_1) is the Bessel function of the first kind of zeroth (first) order. It should be noted that, in deriving Eqs. (VI.6) and (VI.7), the width δ of the slit between two neighboring electrodes has been assumed to be equal to zero. In the calculations, as before, φ_1 serves as the unit of potential, and d is the unit of length.

2. Mirrors for Collimators and Cameras

The focusing and dispersing properties of three-electrode concave and convex mirrors have been studied over a wide range of variation of their geometrical and electrical parameters. The first slit radius R_1, the width of the middle electrode ℓ = |R_2 − R_1|, and the value of a, as well as the electrode potentials φ_2 and φ_3, have been varied. A part of the results obtained in the article by Beizina and Karetskaya (1988b) gives an idea of the effect of the middle electrode and its potential on the electron-optical properties of a mirror. Here we shall restrict ourselves to stating the main conclusions following from the studies mentioned. First of all, it should be noted that the middle electrode width ℓ and the potential φ_2 strongly influence the electron-optical parameters of a mirror. The latter vary within wide ranges
when ℓ increases from 1.2 to 2.0 in both the concave and the convex mirror. With ℓ greater than a certain value, telescopic systems occur: for a specified φ_3, two values of φ_2 appear at which the focal length f_x becomes infinite. With the other φ_2 values, the focal length f_x acquires very small values, either positive or negative. Also, we would like to note that in two-electrode mirrors the angular dispersion in the image space conserves its sign, so in the concave mirror the particles with greater energy are deflected by a greater angle, and in the convex mirror it is vice versa. As for the three-electrode mirrors, for corresponding ℓ the sign of D′ can be reversed as φ_2 changes. One can find mirrors with D′ = 0, possessing the interesting property of bending particles without their separation in energy. In such mirrors, if the trajectories of particles with different energies coincide before they enter the mirror field, they coincide, in paraxial approximation, on leaving the field, too. This phenomenon is related to the fact that in the principal plane X I of a transaxial mirror, the linear dispersion always equals zero, and if D′ = 0, it is absent in any plane of the image space.

It is significant that, in three-electrode mirrors, one can choose the geometrical and electrical parameters in such a way that high-quality stigmatic focusing is provided in the focal plane (Beizina and Karetskaya, 1988a). In Tables XV and XVI the appropriate parameters for concave and convex mirrors with such properties are presented. For some specified R_1 (or R_2) and α, the middle electrode width ℓ, the value of a, and the potentials φ_2 and φ_3 have been varied, parameter sets being chosen at which an image becomes stigmatic in the focal plane and the second-order spherical aberration is small. To be more exact, the moduli of the coefficients K_3 and K_6 become less than a certain specified small value.

For all mirrors presented in Tables XV and XVI, in the focal plane |K_3| < 10-’ and |K_6| < 10-,. The focus 4, (&,) is matched to 4, all foci being real. For both the concave mirrors and the convex ones, the smaller radius of the two concentric slits separating the plates of neighboring electrodes is given in the tables. For a concave mirror, R_1 is the smaller radius, whereas for a convex mirror, it is R_2. The middle electrode width is ℓ = |R_2 − R_1|. In the calculations, the slits are assumed to be infinitesimally narrow. In practice, the slit width δ should not exceed 0.1d. With a finite slit width, R_1 and R_2 are the average slit radii, the real width of the middle electrode being equal to ℓ − δ. As was said before, the sign of the parameter a determines the direction of particle motion in the mirror field. The calculations have been carried out for a > 0. Note that in a concave mirror, particles move counterclockwise with a > 0, whereas in the convex mirror it is vice versa: particles move clockwise with a < 0. Making use of the data in the tables, one can calculate the coordinates x_F1 and y_F1 of the point at which the trajectory, which is close to the axial one, intersects the focal
(el),
TABLE XV CONCAVE THREE-ELECTRODE MIRRORS FOR COLLIMATORS AND CAMERAS^a
20
P
8
30
50
140.0 130.0 120.0 110.0 100.0 130.0 120.0 110.0 100.0 90.0 100.0 90.0 80.0
1.141 1.130 1.121 1.113 1.183 1.155 1.141 1.140 1.131 1.150 1.171 1.158 1.155
0.2014 0.2816 0.3754 0.4827 0.5971 0.2476 0.3359 0.4275 0.5197 0.6347 0.4818 0.5789 0.6754
0.0600 0.1490 0.2520 0.3680 0.4659 0.1060
0.2080 0.3100 0.4250 0.5350 0.3700 0.4840 0.5940
8.530 10.52 12.43 14.28 16.01 14.54 17.24 19.76 22.18 24.42 34.80 38.32 41.55
24.92 8.94 24.89 7.28 24.87 5.52 24.90 3.75 2.16 24.91 34.41 12.3 34.49 10.3 34.46 8.18 34.46 5.95 43.54 3.79 54.14 13.8 54.20 10.6 54.89 7.39
^a From Beizina and Karetskaya (1988a, 1991b).
0.628 8.64 0.843 9.22 1.12 9.63 1.55 9.74 2.27 9.33 0.471 13.4 0.620 13.7 0.797 13.8 13.7 1.06 I .47 13.0 0.510 21.5 0.651 20.8 0.872 19.7
9.9 13 20 28 34 21 22 36 46 61 72 86 110
0.97 1.3 2.2 3.7 7.4 1.3 1.1 2.6 3.6 6.3 3.1 4.1 5.9
-0.55 0.24 1.4 1.9 0.32 1.7 2.1 4.0 5.2 5.5 8.3
0.68 0.96 1.3 1.7 2.3 0.86 1.1 1.4 1.7 2.3 1.5
10
1.8
12
2.4
2.4 4.4 12 20 31 20 15 29 33 41 63 64 68
-0.20 -0.34 -0.14 -0.24 -0.99 0.60 0.18 0.73 0.59 0.43 1.4 1.4 1.4
- 5.5 - 2.8
3.9 9.2 17 1 .o 1.6 33 - 1.8 50 9.8 7.9 16 4.3 9.5 28 8.2 46 6.4 75 20 46 18 64 15 110
1.1
1.3 1.3 1.7 1.5 1.0 1.6 1.5 1.6 1.6 2.4 1.7 2.7
TABLE XVI CONVEX THREE-ELECTRODE MIRRORS FOR COLLIMATORS AND CAMERAS^a
7
P
8
90.0 80.0 10 90.0 80.0 20 90.0 80.0 70.0 30 80.0 70.0 50 70.0
1.315 1.339 1.289 1.311 1.253 1.265 1.291 1.246 1.263 1.238
0.3046 0.3670 0.3438 0.4091 0.4052 0.4793 0.5493 0.5096 0.5841 0.6156
0.2067 0.2729 0.2517 0.3233 0.3258 0.4043 0.4851 0.4409 0.5237 0.561 1
3.703 5.237 4.110 5.365 5.666 8.013 6.240 8.146 12.43 17.58 13.59 17.74 14.63 17.86 21.08 27.51 22.66 27.66 38.83 47.41
^a From Beizina and Karetskaya (1988a, 1991b).
10.9 15.7 13.6 19.0 20.4 23.7 36.1 28.1 31.6 39.8
1.65 2.75 7.46 1.48 3.18 1.38 1.27 8.58 0.994 2.56 0.925 6.48 0.905 19.6 0.766 4.28 0.740 12.1 0.578 5.60
2.2 0.42 2.1 0.98 0.31 10 2.0 0.31 3.8 1.3 0.36 14 8.8 0.83 5.1 4.7 0.56 15 -0.69 0.26 42 12 0.89 14 7.0 0.76 33 15 0.85 27
1.5 1.6 1.8 1.8 2.6 2.3 2.1 2.9 2.5 3.7
6.9 6.9 6.1 6.3 6.3 5.3 6.2 5.7 5.4 5.5
7.0 7.1 5.3 5.5 3.6 3.3 3.9 2.7 2.7 2.0
2.9 11 2.7 10 -0.36 6.2 15 3.8 9.8 6.6
4.3 11 6.6 15 4.8 20 34 17 39 33
1.3 0.55 1.8 0.68 4.9 2.1 0.70 5.0 1.8 6.9
where x_0, y_0, x′_0, y′_0 are the linear and angular coordinates of the adjacent trajectory at the entrance of the mirror field, with

\[
r_a = \begin{cases} R_1 - 3 & \text{for the concave mirror,} \\ R_1 + 3 & \text{for the convex one.} \end{cases}
\]
The coefficients M_2 and M_4, which are absent in the tables, can be calculated by the formulas (II.30)

\[
M_2 = \frac{f_y}{f_x}\,K_5, \qquad M_4 = \frac{2 f_y}{f_x}\,K_6 .
\]
The coefficients of the chromatic aberrations in the y-direction, M_5 and M_6, have not been calculated.

It is known that a collimator objective converts a homocentric beam into a parallel one, and a camera objective focuses a parallel beam into a point or into a line. In the first case an object is located in the focal plane, and in the second one an image is formed in the focal plane. The three-electrode transaxial mirrors described in Tables XV and XVI form a set of mirror objectives having dispersion in energy, which can be used in the collimators and cameras of spectrometers designed for various purposes. In this set α, f_x, f_y, and D′ vary in wide ranges. Concave and convex mirrors supplement each other. With the same values of D′, the focal length f_x in a concave mirror is always less than in a convex one. The difference can reach a large value. All concave mirrors from Table XV deflect a more energetic particle by a larger angle, whereas all convex mirrors from Table XVI act vice versa: the greater the particle energy, the smaller the angle of deflection. It is interesting to note that with a larger focal length f_x, the focal planes in the convex mirror are located closer to the electrodes than in the concave one. This fact is important both for reducing device size and for increasing the relative aperture of the instruments where such mirrors are applied. Concrete schemes for mass and energy analyzers will be discussed in Section VI,C.

3. Mirrors for Energy Analyzers

In articles by Beizina and Karetskaya (1988a, 1991b), the parameters were found for three-electrode concave and convex mirrors that provide
homocentric beam stigmatic focusing (G, = 0) and have small spherical aberrations in both the x- and y-directions. The case was considered in which an object and its image are situated in the planes P_0 and P_1, with m_x = −1. Only mirrors with these planes located beyond the field were selected. The chosen mirror parameters are given in Tables XVII and XVIII. All mirrors possess a linear magnification in the y-direction m_y = +1, and the moduli of the spherical aberration coefficients K_1 and K_4 are less than  . From Eqs. (II.29), (V.36), and (V.37), it follows that if the conditions given earlier are satisfied, then the following relations are valid:
implying that the coefficient M_1 of spherical aberration in the y-direction, as well as the coefficients M_3, K_2, and K_5, are also small enough for the effects of the corresponding aberrations to be neglected: |M_1| < 2 × , |M_3| < 1 × 10-’, |K_2| < |10^-2/f_x|, |K_5| < |10^-2/f,|. The values of the aberration coefficients K_3, K_6, K_7, K_8, K_9, M_2, and M_4 in the image plane are presented in the tables. The coefficients M_5 and M_6 of chromatic aberrations in the y-direction were not calculated.

Mirrors with the parameters given in Tables XVII and XVIII can serve as energy analyzers. They can also be used to achieve ion focusing in energy in a mass spectrometer with an intermediate image. This problem will be discussed in more detail later. The diversity of properties of the selected mirrors is attractive: one can choose the angle of beam deflection, the overall mirror dimensions, the distance between the entrance and exit slits, and the electrode system. The linear dispersion in energy is the largest in convex mirrors, an interval of D_1 between 25d and 42d being typical. Figure 34 exemplifies energy analyzers based on various three-electrode transaxial mirrors. The electrode systems of the mirrors in the projection onto the mid-plane, the axial trajectories, and the location of the conjugate planes P_0 and P_1 are shown. In Fig. 34a a concave mirror is depicted, and in Figs. 34b, c, and d convex mirrors are presented. All these analyzers have the same linear dispersion in energy, and second-order spherical aberration is eliminated. The figure is drawn to scale; it gives an idea of the overall dimensions of the various systems. Analyzers with D_1 = 80 mm are shown. If a larger dispersion is needed, the sizes should be increased in proportion to the dispersion.

Let us compare an energy analyzer based on a mirror with parallel electrode plates and a widespread flat capacitor operating in the mirror regime.
In both cases, mirror designs are rather simple because of plate electrodes. See Fig. 35, which depicts the compared analyzers with the same dispersion in energy measured in the direction perpendicular to the beam
TABLE XVII CONCAVE THREE-ELECTRODE MIRRORS FOR ENERGY ANALYZERS^a
R,
a, deg.
20 120.0 110.0 30 110.0 100.0 90.0 50 120.0 110.0 100.0 90.0 80.0
0
qz
q,
qu
a,
1.117 1.085 1.126 1.103 1.070 1.163 1.150 1.137 1.121 1.098
0.3931 0.5077 0.4436 0.5472 0.6602 0.3190 0.4049 0.4969 0.5928 0.6904
0.2846 0.4079 0.3417 0.4564 0.5826 0.2089 0.3014 0.4038 0.5125 0.6239
0.365 0.408 0.417 0.527 0.644 0.209 0.377 0.473 0.573 0.675
12.70 14.59 20.04 22.46 24.82 27.39 31.34 35.09 38.61 41.88
rmar R,, 21.0 20.6 31.0 31.0 30.9 51.1 51.0 51.0 51.0 51.0
^a From Beizina and Karetskaya (1988a, 1991b).
-fx
25.40 4.85 25.45 3.18 34.93 7.45 34.96 5.31 35.10 3.15 54.75 18.4 54.64 15.8 54.60 13.0 54.61 9.85 54.67 6.64
-D,
-4 - l d K ,
-Ks
lO-'K,
-K,
-10-'K9
ldM,
-lOM,
13.7 12.2 14.5 13.4 11.5 15.9 15.9 15.7 15.0 13.6
2.67 3.42 3.47 4.36 4.98 2.32 3.40 4.57 5.72 6.67
0.23 0.25 0.21 0.21 0.24 0.20 0.19 0.18 0.17 0.18
3.9 2.9 4.0 3.2 2.5 5.7 4.9 4.2 3.6 3.0
5.9 9.0 4.0 5.7 11 1.9 2.0 2.5 3.3 5.5
3.6 3.8 2.7 2.7 3.8 3.3 2.4 1.9 1.8 2.2
0.50 0.56 0.11 0.90 1.1 2.1 1.9 1.8 1.7 1.8
4.9 6.7 3.9 4.8 7.9 3.2 2.9 3.0 3.4 4.7
13 36 9.1 20 63 2.0 3.1 5.3 10 24
TABLE XVIII CONVEX THREE-ELECTRODE MIRRORS FOR ENERGY ANALYZERS^a
7
5
90.0 80.0 10 110.0 100.0 90.0 30 110.0 100.0 90.0 80.0 50 120.0 110.0 100.0 90.0 80.0 70.0
1.313 1.339 1.270 1.272 1.286 1.215 1.217 1.226 1.240 1.209 1.202 1.202 1.208 1.217 2.232
0.3124 0.3684 0.2350 0.2923 0.3512 0.3041 0.3732 0.4442 0.5156 0.2550 0.3240 0.3966 0.4715
0.2052 0.2727 0.1226 0.1824 0.2501 0.1949 0.2718 0.3542 0.4390 0.1439 9.2160 0.2972 0.4465 0.5466 0.4723 0.6199 0.5596
0.277 0.337 0.197 0.257 0.319 0.271 0.343 0.418 0.493 0.220 0.292 0.368
3.730 4.118 4.472 5.107 5.699 15.65 17.62 19.45 21.14 23.50 27.05 30.38 0.446 33.48 0.526 36.32 0.603 38.90
7.09 7.09 10.1 10.1 10.1 30.1 30.1 30.1 30.1 50.1 50.1 50.1 50.1 50.1 50.1
^a From Beizina and Karetskaya (1988a, 1991b).
5.275 5.376 7.796 7.945 8.061 27.30 27.42 27.51 27.60 46.99 47.15 47.26 47.34 47.42 47.49
8.55 13.9 7.10 8.40 10.9 16.4 17.5 19.4 23.1 24.7 24.9 25.4 26.5 28.8 33.6
25.8 4.41 40.1 9.73 18.6 1.65 21.3 3.03 27.4 5.63 18.5 2.66 20.9 4.31 24.8 6.85 31.4 11.2 17.2 1.74 18.2 2.88 20.2 4.49 23.0 6.80 27.3 10.3 34.6 16.4
13 5.3 22 18 12 14 14 13 9.6 9.6 11 12 12 10 8.1
11 6.4 7.1 14 10 17 14 11 7.4 14 17 14 11 8.5
6.0
1.7 3.3 1.0 1.2 1.6 0.84 0.86 1.o 1.4 0.77 0.74 0.76 0.84 1.0 1.5
-0.34 10 -14 12 25 7.7 7.3 12 7.6 0.91 2.7 22 2.7 12 2.9 4.8 3.3 -2.7 1.7 35 1.7 22 1.7 13 1.9 6.3 2.1 0.14 2.5 -6.6
0.00 -0.07 -0.04 0.11 0.04 0.46 0.34 0.20 -0.07 1.6 1.2 1.o 0.58
0.15 -0.05
0.49 0.15 0.19 0.10
0.46 1.8 1.2 0.73 0.41 2.7 1.9 1.3 0.93 0.62 0.36
FIGURE 34. Energy analyzers with the same dispersion in energy based on various three-electrode transaxial mirrors.
axial trajectory. First of all, the considerably smaller sizes of the novel analyzers should be noted. Further, a plane capacitor provides second-order focusing in the angle of beam divergence in the figure plane only at a definite value of the entrance angle of the axial trajectory, namely θ_0 = 60° (this case is shown in Fig. 35), and it does not focus the beam in its divergence angle in the orthogonal direction. The newly designed analyzers can provide second-order stigmatic focusing in both angles of beam divergence at various values of the entrance angle. Thus, with the same relative aperture, higher resolution is achieved. Besides, in the new systems the plane in the vicinity of which charged particles move is parallel to the electrode plates, so a beam freely enters the electric field and then leaves it, and charged particles do not hit the electrode plates even at a very large
FIGURE 35. Energy analyzers with the same dispersion in energy based on (a) a plane capacitor operating in the mirror regime; (b) a mirror with parallel electrode plates and direct slits; (c) a transaxial convex mirror.
spread of particle energies. In other well-known mirrors (flat, cylindrical, spherical, toroidal) a beam intersects one of the electrode plates twice, so appropriate openings are needed in this plate, and these inevitably distort the electric field. In addition, in these mirrors a fraction of the particles hit the electrode plates when the energy spread is large, leading to spurious effects (Froitzheim et al., 1975; Bargeron and Nall, 1981). Further, in the novel mirrors, the availability of three electrodes allows one, simply by varying their potentials, to change the mirror electron-optical parameters and, in particular, to compensate in this way for various errors of mechanical assembly. Although the novel mirror designs are slightly more complicated than the plane capacitor (an increased number of electrode plates, a more complicated power system), their advantages outweigh these complications. Comparison with other widely used energy analyzers shows that the analyzers based on transaxial mirrors surpass many of them in relative dispersion in energy while providing a higher quality of focusing.
C. Multicascade Energy Analyzers and Schemes for Mass Analyzers with Transaxial Mirrors

The possibility of achieving large energy dispersions and high-quality focusing in transaxial mirrors, the capability of controlling their electron-optical properties by electrical means, as well as the other advantages and peculiarities of the electrode system structure, have allowed a series of new technical solutions to be worked out (Beizina et al., 1988, 1989, 1990, 1992).

1. Multicascade Energy Analyzers

These are electrostatic energy analyzers composed of transaxial mirrors located in such a way that charged particles, leaving the field of one mirror, enter the field of another. Such systems are usually called multicascade ones. From one cascade to another, dispersion in energy is accumulated (Beizina et al., 1990). First of all it should be noted that a multicascade system composed of any number of transaxial mirrors can be represented, as before, by two parallel plates located symmetrically with respect to the mid-plane near which charged particles move. These plates are divided into parts in the same way by slits whose projections onto the mid-plane are shaped as rings or sections of rings. A section of one plate and the symmetrical section of the other plate are at the same potential and form an electrode. Two examples of multicascade analyzers based on transaxial mirrors are given in Figs. 36 and 37. For each of them the source exit slit, 1, the receiver entrance slit, 2, and the analyzer electrode system in the projection onto
FIGURE 36. A four-cascade energy analyzer conserving the direction of beam motion. (From Beizina et al., 1990.)
two mutually perpendicular planes are shown. Also, the axial trajectory of a charged particle beam is depicted. The electrode system of the analyzer shown in Fig. 36 consists of two parallel plates, 3 and 4, divided by slits, shaped as sections of rings with average radii R_1, R_2, R_3, and R_4, into nine electrodes, 5-13. The slits form four groups; in each group the slit centers lie on a common axis perpendicular to the surfaces of the plates 3 and 4. The centers of the slits of the first, second, third, and fourth groups lie on the axes O_1, O_2, O_3, and O_4, respectively. Every group of slits corresponds to one cascade of the analyzer. The cascades are separated by field-free spaces. The potentials φ_1, φ_2, φ_3, φ_4, and φ_5 are applied, respectively, to the electrodes 5, 6 (and 12), 7 (and 13), 8 (and 10), and 9 (and 11). Thus, despite the large number of electrodes, the scheme of the electric power supply is not so complicated.
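The supply scheme just described can be written down as a small lookup table. The Python sketch below is illustrative bookkeeping only, not something taken from the original work: the names `SUPPLY_TO_ELECTRODES` and `invert_supply_map` are hypothetical, and the labels `phi_1` to `phi_5` simply stand for the five potentials applied to the nine electrodes numbered 5-13 in Fig. 36.

```python
# Illustrative bookkeeping for the Fig. 36 analyzer (all names are hypothetical).
# Keys are the five supply potentials; values are the electrode numbers
# connected to each supply, as listed in the text.
SUPPLY_TO_ELECTRODES = {
    "phi_1": [5],
    "phi_2": [6, 12],
    "phi_3": [7, 13],
    "phi_4": [8, 10],
    "phi_5": [9, 11],
}


def invert_supply_map(supply_to_electrodes):
    """Return {electrode: supply}, rejecting an electrode wired to two supplies."""
    electrode_to_supply = {}
    for supply, electrodes in supply_to_electrodes.items():
        for electrode in electrodes:
            if electrode in electrode_to_supply:
                raise ValueError(f"electrode {electrode} assigned twice")
            electrode_to_supply[electrode] = supply
    return electrode_to_supply


ELECTRODE_TO_SUPPLY = invert_supply_map(SUPPLY_TO_ELECTRODES)
```

Inverting the table makes the point of the paragraph explicit: every one of the nine electrodes is covered, yet only five independent supplies are required.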
FIGURE 37. A compact seven-cascade energy analyzer. (From Beizina et al., 1990.)
Each of the cascades bends the beam by 90°. The screen, 14, stops charged particles traveling directly from the source to the receiver. Such a multicascade analyzer operates as follows. The charged particle beam emitted by the source enters the field of the first (with respect to the beam direction) cascade, is deflected, and is then focused between the first and second cascades. The intermediate image, 16, produced by particles of the definite energy to which the analyzer is tuned, is matched to the front principal focus of the second cascade, which for this reason forms a parallel beam in the section between the second and third cascades. The third cascade collects this beam in its back principal focus, 17, which is matched with the object plane of the fourth cascade. The latter focuses the beam in the plane of the entrance slit of the receiver. Particles having other energies are deflected differently and do not hit the receiver slit. By varying the energy to which the analyzer is tuned, one can record the energy spectrum. For the analyzer considered, the linear dispersion D is determined by the equality
\[
D = (D_1 + 2D_2)\,m_4 + D_4,
\]
where D_1 and D_4 are the linear dispersions in energy of the identical first and fourth cascades, 2D_2 is the total linear dispersion of the identical second and third cascades, and m_4 is the linear magnification of the fourth cascade in the x-direction. The linear magnification of the entire analyzer in this direction equals −1. For the analyzer shown in Fig. 36,
\[
D_1 = 18.4d, \qquad 2D_2 = 37.6d, \qquad D_4 = 41.1d, \qquad m_4 = 2.23.
\]
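As a quick arithmetic check of the four-cascade formula, the quoted values can be substituted directly. The Python sketch below is only an illustration: the function name `four_cascade_dispersion` is hypothetical, and the value of 2D_2 is read as 37.6d.

```python
def four_cascade_dispersion(d1, two_d2, d4, m4):
    """Total linear dispersion D = (D1 + 2*D2) * m4 + D4.

    All dispersions are in units of the plate spacing d; m4 is the linear
    magnification of the fourth cascade in the x-direction.
    """
    return (d1 + two_d2) * m4 + d4


# Values quoted for the Fig. 36 analyzer (2*D2 read as 37.6d).
D_TOTAL = four_cascade_dispersion(d1=18.4, two_d2=37.6, d4=41.1, m4=2.23)
print(f"D = {D_TOTAL:.2f} d")
```

The result, about 165.98d, rounds to the value of 166d given for this analyzer; most of it comes from the first three cascades after multiplication by m_4.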
The dispersions of the first three cascades are multiplied by the magnification of the last cascade, so the analyzer dispersion is very large: D = 166d. The mirrors selected form a stigmatic image free of second-order spherical aberration. Because of this, one can create a device possessing very high resolving power.

The electrode system of the multicascade analyzer shown in Fig. 37 is also composed of two parallel plates, 3 and 4. The plates are divided by ring-like slits of radii R_1 and R_2 and by slits of radius R_3 into six electrodes, 5-10. The centers of the slits lie on the axes O_1, O_2, O_3, and O_4. The potentials φ_1, φ_2, φ_3, and φ_4 are applied to the electrodes 5, 6, 7, and 8 (and 9, 10), respectively. A charged particle beam leaving the source is reflected four times in the field created by the electrodes 5-7 (four cascades), and it is reflected, in consecutive order, in the fields created by the electrodes 5 and 8, 5 and 9, and 5 and 10 (another three cascades) before reaching the receiver. At the first, third, fifth, and seventh reflections, real stigmatic images are created, serving as virtual objects for the second, fourth, and sixth cascades. These cascades produce virtual stigmatic images that
serve, in turn, as objects for the odd cascades. The virtual character of the intermediate images allows one to reduce the device dimensions significantly. In the field of each odd cascade the beam is deflected by 90°, and in the field of each even cascade the deflection is 164.4°. The dispersions of all seven cascades are summed:
\[
D = 4D_1 + 3D_2,
\]
where D_1 (D_2) is the linear dispersion for the odd (even) cascades. For the analyzer depicted in Fig. 37, D_1 = 22.9d and D_2 = 26.8d, i.e., D = 172d. The ratio of the dispersion D to the distance between the source and receiver (the relative dispersion) equals 11.5. It should be noted that this seven-cascade energy analyzer is extremely compact.

2. Static Mass Analyzers with Energy Focusing

Three-electrode transaxial mirrors can be used successfully in static mass spectrometers in order to achieve focusing in energy (Beizina et al., 1992). In Fig. 38 three different ion-optical schemes for mass spectrometers with
FIGURE 38. Various ion-optical schemes of static mass analyzers with energy focusing realized by transaxial mirrors. (From Beizina et al., 1992.)
energy focusing are shown in the projection onto the mid-plane. Instead of the usual electrostatic analyzers, three-electrode transaxial convex mirrors are applied. In schemes (a) and (b), cylindrical, spherical, or toroidal deflectors are conventionally used, while in scheme (c) a refracting system with a lens is usually applied (Herzog and Hauk, 1938; Ewald, 1959; Sachenko and Fridlyanski, 1980; Kel’man et al., 1985). For all three schemes shown in Fig. 38 the following notation is used: 1 and 2, the entrance and exit slits of the spectrometer; 3, the poles of the magnetic analyzer with a uniform magnetic field; 4, 5, 6, the transaxial mirror electrodes; 7, the axial trajectory of the ion beam.

In the first case (Fig. 38a), the mirror produces an intermediate image in the plane of the diaphragm, 8, serving as an object for the sector magnetic analyzer. The latter produces its image in the plane of the entrance slit, 2, of the ion receiver. To achieve energy focusing, one must equalize the linear dispersions of the mirror and the magnetic analyzer.

The second case (Fig. 38b) is the scheme of a Mattauch-Herzog-type mass spectrograph. The mirror converts a homocentric ion beam into a parallel one, which is further focused by the magnetic analyzer in the plane, 2. The scheme allows one to record simultaneously a large range of the mass spectrum. The mirror plays the role of the collimator objective. To achieve focusing in energy, one must satisfy the condition
\[
D' f' + \mathcal{D} = 0,
\]

where D′ is the mirror angular dispersion in energy in the image space, f′ is the focal length of the magnetic analyzer, and 𝒟 is its linear dispersion in energy (mass) in the focal plane. The device linear magnification is m_x = f′/f_x, where f_x is the mirror focal length. One can gain considerably in the ratio of the spectrometer dispersion 𝒟 to its magnification by choosing a mirror with a large focal length f_x. It is significant that even with large f_x the focal planes (in one of which the source slit is situated) in the convex transaxial mirror can be close to the electrode system. In such a case the increase of f_x is not accompanied by losses in device sensitivity.

In the third case (Fig. 38c), mirrors are used in the scheme of a prism mass spectrometer. The first mirror (a collimator objective), with the ion source slit in its focal plane, transforms a divergent space beam into a parallel one. The input and output edges of the magnet poles are parallel, so the parallelism of the beam, homogeneous in energy and in mass, is conserved in the mid-plane on passing through the magnetic analyzer. The angle at which the beam enters the magnetic analyzer is chosen in such a way that the latter represents a telescopic system in the direction perpendicular to the mid-plane, too. So a space parallel beam entering the magnetic field remains parallel on leaving the field, as in an optical prism. The second
mirror (the camera objective) collects the beam in the focal plane with which the receiver entrance slit is matched. The device linear dispersion in mass 𝒟 is proportional to the mirror focal length,

\[
\mathcal{D} = \mathcal{D}' f_x,
\]

where 𝒟′ is the angular dispersion in mass (energy) of the magnetic analyzer. In order to achieve energy focusing in the symmetrical scheme shown in the figure, the angular dispersion D′ of every mirror must be equal to half of the angular dispersion 𝒟′ of the magnetic analyzer. For all three schemes the authors have managed to find the needed mirrors, which not only provide ion energy focusing but also significantly suppress aberrations of the magnetic analyzer. The design compatibility of a static magnetic analyzer and a mirror should also be mentioned. Their mid-planes coincide, and ions move freely between the electrode plates and the magnet poles, so the width of an ion beam can be sufficiently large in this plane. Note also the small size of the transaxial mirror. One of the three-electrode transaxial mirrors was manufactured and tested in a scheme with a sector magnet deflecting an ion beam by 90° and providing its stigmatic focusing. The radius of the ion trajectory in the homogeneous magnetic field was equal to 200 mm, and the dispersion of the magnetic analyzer in mass (energy) was 400 mm. The size of the mirror electrode system with the same dispersion, in the projection onto the mid-plane, did not exceed that of a page of this book. Test results were in accord with the calculations.

3. Mass Analyzer with Ion Multiple Passage of Magnet Field

A mass analyzer with multiple passage of ions through the same magnetic field (Beizina et al., 1989) represents one more interesting example of a possible application of transaxial mirrors with two-plate electrodes. Analogous schemes with mirrors of other types were studied earlier in detail by Celles et al. (1975). In Fig. 39 one of the possible ion-optical schemes of a mass analyzer with a transaxial mirror is shown in a projection onto the mid-plane.
The analyzer entrance slit, 1, and exit slit, 2; the magnet circular poles, 3; the transaxial concave mirror with electrodes 4, 5, and 6; and the ion beam axial trajectory, 7, are shown. In this case the ions pass through the same magnetic field four times. The transaxial mirror returns ions into the magnetic field three times and simultaneously provides the beam energy focusing. Let us clarify the main conditions that must be satisfied in schemes of this type. The following notation is introduced: R, the effective radius of the region with a homogeneous magnetic field; ω, the angle of beam
FIGURE 39. A mass analyzer with ion multiple passage of the magnetic field. (From Beizina et al., 1989.)
entrance into the magnetic field and the exit angle, equal to the first one; and θ, the angle of beam deflection in the magnetic field. For every passage of ions through the magnetic field, an object and its image, formed by the field, will be considered to be situated symmetrically with respect to the magnet. Suppose that the condition
\[
\frac{1}{2}\tan\frac{\theta}{2} = \tan\omega, \tag{VI.9}
\]

providing the image stigmatic character, is satisfied (Enge, 1967). Then the dispersion of the magnetic analyzer in mass, which is the same as its dispersion in energy, for one passage is as follows:

\[
\mathcal{D} = 2\rho = R\,(2 + \cot^2\omega)\sin\omega, \tag{VI.10}
\]
where ρ is the radius of curvature of ion trajectories in the homogeneous magnetic field. The image created by the magnetic field serves as an object for the mirror, which, in turn, creates its stigmatic image, and so on, until the beam is focused in the plane of the receiver entrance slit after the nth passage of the magnetic field. For the ion beam to enter the magnetic field again at the angle ω after its reflection in the mirror, the condition

\[
\frac{R_{\mathrm{eff}}}{R} = \frac{\sin\omega}{\cos(\alpha/2)} \tag{VI.11}
\]

must be satisfied.
The magnet image plane is located at a distance l from the point where the axial trajectory intersects the edge of the magnetic field. It is matched with the anti-principal plane of the concave mirror, which lies at a distance 2|f_x| from the mirror principal plane. Thus, the requirement follows:

\[
R\cos\omega + l = -2f_x. \tag{VI.12}
\]

It is known that

\[
l = 2\rho\cot\frac{\theta}{2} = \frac{R}{2}\,(2 + \cot^2\omega)\cos\omega \tag{VI.13}
\]
(Enge, 1967), so from Eq. (VI.12) we obtain the following condition,
\[
\frac{f_x}{R} = -\frac{\cos\omega}{4}\,(4 + \cot^2\omega), \tag{VI.14}
\]
for the mirror focal length. As a beam passes the magnetic field repeatedly, dispersion in mass (energy) increases proportionally to the number of passages, n, so finally, the dispersion is as follows:
\[
D_n = n\mathcal{D} = 2n\rho. \tag{VI.15}
\]
For n passages in the magnetic field and (n - 1) reflections in the electric field, energy focusing in the exit slit plane of the mass analyzer is reached if
\[
n\mathcal{D} = (n - 1)D_1,
\]

where D_1 is the mirror dispersion in energy in the anti-principal plane. For the mirror dispersion the condition is obtained:

\[
\frac{D_1}{R} = \frac{n}{n - 1}\,(2 + \cot^2\omega)\sin\omega. \tag{VI.16}
\]
Thus, in a mass analyzer with a transaxial mirror having parameters that satisfy conditions (VI.11), (VI.14), and (VI.16), stigmatic focusing and energy focusing in the plane of the entrance slit of an ion receiver are provided. In the scheme shown in Fig. 39 a dispersion in mass D_4 = 400 mm is achieved with R = 50 mm. A transaxial mirror, in contrast to a toroidal one (Celles et al., 1975), allows ions to move freely along the entire ion-optical path, as there are no grids or slits in the ion path. So a minimum of scattering and knocking-out of secondaries, as well as of losses of analyzed ions, is achieved. This offers a chance to increase the sensitivity, accuracy, and reliability of measurements. Other examples of potential application of transaxial mirrors in mass analyzers were discussed in work by Glickman et al. (1991).
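The chain of conditions above lends itself to a short numerical check. The Python sketch below evaluates relations (VI.9), (VI.10), and (VI.13)-(VI.16) in the form in which they are read here, for a circular field region. It is an illustration under stated assumptions, not the authors' design procedure: the function name `multipass_conditions` is hypothetical, R = 50 mm and n = 4 are the values quoted for Fig. 39, and the entrance angle of 60° is an assumed value, since the angle used in that design is not given in the text. Signs follow the relations as written above, with f_x negative for the concave mirror.

```python
import math


def multipass_conditions(R, omega_deg, n):
    """Evaluate conditions (VI.9)-(VI.16), as read here, for a circular field.

    R         -- effective radius of the homogeneous-field region
    omega_deg -- entrance (= exit) angle omega in degrees, 0 < omega < 90
    n         -- number of passages through the magnetic field (n >= 2)
    Lengths are returned in the units of R.
    """
    omega = math.radians(omega_deg)
    cot2 = 1.0 / math.tan(omega) ** 2
    theta = 2.0 * math.atan(2.0 * math.tan(omega))            # (VI.9)
    rho = 0.5 * R * (2.0 + cot2) * math.sin(omega)             # (VI.10)
    l = 2.0 * rho / math.tan(theta / 2.0)                      # (VI.13)
    f_x = -0.25 * R * math.cos(omega) * (4.0 + cot2)           # (VI.14)
    D_n = 2.0 * n * rho                                        # (VI.15)
    D_1 = n / (n - 1) * R * (2.0 + cot2) * math.sin(omega)     # (VI.16)
    return {"theta_deg": math.degrees(theta), "rho": rho, "l": l,
            "f_x": f_x, "D_n": D_n, "D_1": D_1}


# R = 50 mm and n = 4 are quoted in the text; omega = 60 degrees is assumed.
R_MM, OMEGA_DEG, N_PASS = 50.0, 60.0, 4
result = multipass_conditions(R_MM, OMEGA_DEG, N_PASS)
for name, value in result.items():
    print(f"{name:>9s} = {value:8.2f}")
```

With these assumed inputs the computed values satisfy the matching requirement (VI.12), R cos ω + l = −2f_x, and the total dispersion comes out at roughly 404 mm, of the same order as the 400 mm quoted for Fig. 39. Since (2 + cot²ω) sin ω is never less than 2, the relations as read here also imply that D_n cannot be smaller than 2nR.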
VII. CONCLUSION

Investigation of two-plate electrode mirrors has shown that they can serve as the basis of a new class of energy analyzers in which focusing is stigmatic and second-order spherical aberrations are absent in both directions of focusing. Some other kinds of aberration are absent, too. A common feature of this class of analyzers is that the charged particles move between the electrode plates, which makes it unnecessary to provide openings for beam entrance and exit. The electron-optical properties of the mirror can be controlled effectively via the electrode potentials.

The various two-plate electrode mirrors complement one another. The choice of mirror is determined by the conditions of the specific problem and the requirements imposed on the analyzer operating characteristics. With the same high-quality focusing, the beam deviation angle can be varied within a wide range, from 70° to 150°. The range of distances between the object (or its image) and the analyzer field is also large. To limit the number of versions considered, we have investigated only the case of symmetric location of the object and its image with respect to the mirror. However, an energy analyzer of evidently the same high quality can be found for an asymmetric location, too.

The field of application of transaxial mirrors is wider than that of mirrors with a two-dimensional field. They can be used as objectives having energy dispersion in collimators, or in cameras of spectrometers of various types. Dispersionless deflecting systems, free of second-order spherical and chromatic aberrations, can be designed on the basis of concave transaxial mirrors.

Peculiarities of the mirror electrode system structure allow the creation of interesting multicascade energy analyzers. A miniature two-cascade analyzer possessing high-level operating parameters can be produced on the basis of a wedge-shaped mirror.
A compact multicascade energy analyzer can be made on the basis of a transaxial mirror. Interesting new solutions arise when two-plate electrode mirrors are used in mass analyzers.

An energy analyzer of the new class undoubtedly cannot compete in relative aperture with analyzers having ring-shaped diaphragms. However, in those cases where beam divergence is restricted by a rectangular or circular diaphragm, its application is rather promising.

In the course of studying the novel energy analyzers, interesting results related to the theory of charged particle beam focusing have been obtained as well. In any electrostatic field having a symmetry plane, elimination of spherical aberration in the direction parallel to this plane is shown to be accompanied by its elimination in the orthogonal direction as well. Also, it has been shown that in a mirror having a two-dimensional field and a symmetry plane, the relative dispersion in energy is determined solely by the angle of
beam entrance into the mirror field. The relations connecting dispersion and the coefficients of spherical and chromatic aberration in a two-dimensional field with a mid-plane have been derived. In addition, it has been shown that in a transaxial mirror, the plane that contains the field symmetry axis and is perpendicular to the rectilinear sections of the axial trajectory in the object and image spaces is always a principal plane. In the principal plane, independently of the field distribution in the mirror, the dispersion in energy and the coefficient of second-order spherical aberration related to the angle of beam divergence in the mid-plane are both equal to zero. When an image formed in the principal plane is adjusted to be stigmatic, the second-order spherical aberration is eliminated completely in both directions of focusing.

Investigation of the electron-optical properties of the novel energy analyzers cannot be considered complete. Preliminary studies have shown that in certain mirrors the coefficients of third-order spherical aberration can be made sufficiently small. Studies in this direction will be continued.

ACKNOWLEDGMENTS
The authors owe a pleasant debt of gratitude to all their colleagues involved in the study of the energy analyzers to which this review is devoted. We especially thank our teacher, Dr. V. M. Kel’man, and Drs. N. Yu. Saichenko and L. V. Fedulina. We are also grateful to Dr. P. W. Hawkes for the chance to publish this review.
REFERENCES

Afanas’ev, V. P., and Yavor, S. Ya. (1978). “Electrostatic Energy Analyzers for Charged Particle Beams” (in Russian). Nauka, Moscow.
Ballu, Y. (1980). Adv. Electron. Electron Phys. B 13, 257.
Baranova, L. A., and Yavor, S. Ya. (1988). Zh. Tekh. Fiz. 58(2), 217.
Bargeron, C. B., and Nall, B. H. (1981). Rev. Sci. Instrum. 52(11), 1777.
Beizina, L. G., and Karetskaya, S. P. (1987). Zh. Tekh. Fiz. 57(10), 1972.
Beizina, L. G., and Karetskaya, S. P. (1988a). Zh. Tekh. Fiz. 58(5), 870.
Beizina, L. G., and Karetskaya, S. P. (1988b). Zh. Tekh. Fiz. 58(5), 877.
Beizina, L. G., and Karetskaya, S. P. (1991a). Zh. Tekh. Fiz. 61(7), 171.
Beizina, L. G., and Karetskaya, S. P. (1991b). Zh. Tekh. Fiz. 61(7), 191.
Beizina, L. G., Karetskaya, S. P., and Kel’man, V. M. (1985). Zh. Tekh. Fiz. 55(9), 1681.
Beizina, L. G., Karetskaya, S. P., and Kel’man, V. M. (1986). Zh. Tekh. Fiz. 56(7), 1249.
Beizina, L. G., Karetskaya, S. P., and Kel’man, V. M. (1987). Zh. Tekh. Fiz. 57(3), 434.
Beizina, L. G., Karetskaya, S. P., and Kel’man, V. M. (1988). USSR Pat. 1,436,148.
Beizina, L. G., Karetskaya, S. P., and Kel’man, V. M. (1989). USSR Pat. 1,525,774.
Beizina, L. G., Karetskaya, S. P., and Kel’man, V. M. (1990). USSR Pat. 1,550,589.
Beizina, L. G., Karetskaya, S. P., and Kel’man, V. M. (1992). USSR Pat. 1,438,522.
Celles, M., Baril, M., and Bolduc, L. (1975). Nucl. Instrum. Methods 125, 535.
Coggeshall, N. D. (1946). Phys. Rev. 70, 270.
Daukeev, D. K., Karetskaya, S. P., Kasimov, S. I., Kel’man, V. M., Mit’, A. G., Saichenko, N. Yu., and Shevelev, G. A. (1985). Zh. Tekh. Fiz. 55(3), 632.
Enge, H. A. (1967). In “Focusing of Charged Particles” (A. Septier, ed.), Vol. 2, p. 203. Academic Press, New York and London.
Ewald, H. (1959). Z. Naturforsch. 14A, 198.
Fishkova, T. Ya. (1987). Zh. Tekh. Fiz. 57(7), 1358.
Froitzheim, H., Ibach, H., and Lehwald, S. (1975). Rev. Sci. Instrum. 46(10), 1325.
Glickman, L. G., and Goloskokov, Yu. V. (1991a). Nauchn. Priborostroenie 1(2), 99.
Glickman, L. G., and Goloskokov, Yu. V. (1991b). Zh. Tekh. Fiz. 61(10), 169.
Glickman, L. G., Kel’man, V. M., and Yakushev, E. M. (1967). Zh. Tekh. Fiz. 37(6), 1028.
Glickman, L. G., Karetskaya, S. P., Kel’man, V. M., and Yakushev, E. M. (1971). Zh. Tekh. Fiz. 41(2), 330.
Glickman, L. G., Kel’man, V. M., Karetskaya, S. P., Iskakova, Z. D., and Goloskokov, Yu. V. (1989). USSR Pat. 1,471,234.
Glickman, L. G., Goloskokov, Yu. V., Iskakova, Z. D., Karetskaya, S. P., and Kel’man, V. M. (1990). Dokl. Akad. Nauk SSSR 312(4), 869.
Glickman, L. G., Karetskaya, S. P., and Kel’man, V. M. (1991). Zh. Tekh. Fiz. 61(1), 144.
Glickman, L. G., Goloskokov, Yu. V., and Iskakova, Z. D. (1992a). Zh. Tekh. Fiz. 62(1), 113.
Glickman, L. G., Goloskokov, Yu. V., and Iskakova, Z. D. (1992b). Zh. Tekh. Fiz. 62(1), 119.
Glickman, L. G., Goloskokov, Yu. V., and Iskakova, Z. D. (1992c). Zh. Tekh. Fiz. 62(1), 137.
Glickman, L. G., Goloskokov, Yu. V., and Iskakova, Z. D. (1992d). Zh. Tekh. Fiz. 62(1), 146.
Glickman, L. G., Goloskokov, Yu. V., and Karetskaya, S. P. (1993). Pis’ma Zh. Tekh. Fiz. 19(7), 68.
Golikov, Yu. K., Ivanov, V. G., Kolomenkov, V. Yu., and Matishev, A. A. (1981). Zh. Tekh. Fiz. 51(5), 1010.
Herzog, R., and Hauk, V. (1938). Ann. Phys. (Leipzig) [5] 33, 89.
Karetskaya, S. P., and Fedulina, L. V. (1982a). Zh. Tekh. Fiz. 52(4), 735.
Karetskaya, S. P., and Fedulina, L. V. (1982b). Zh. Tekh. Fiz. 52(4), 740.
Karetskaya, S. P., and Saichenko, N. Yu. (1989). Zh. Tekh. Fiz. 59(10), 98.
Karetskaya, S. P., and Saichenko, N. Yu. (1990). All-Union Semin. Methods Calculation Electron-Opt. Syst., 10th, Lvov, Report theses, p. 69.
Karetskaya, S. P., Kel’man, V. M., and Saichenko, N. Yu. (1984). USSR Pat. 1,091,257.
Kel’man, V. M., and Rodnikova, I. V. (1963). Zh. Tekh. Fiz. 33(4), 387.
Kel’man, V. M., and Yavor, S. Ya. (1968). “Electron Optics” (in Russian). Nauka, Leningrad.
Kel’man, V. M., Fedulina, L. V., and Yakushev, E. M. (1972). Zh. Tekh. Fiz. 42(2), 297.
Kel’man, V. M., Karetskaya, S. P., Fedulina, L. V., and Yakushev, E. M. (1979). “Electron-optical Elements of the Prism Spectrometers of Charged Particles” (in Russian). Nauka, Alma-Ata.
Kel’man, V. M., Karetskaya, S. P., Saichenko, N. Yu., and Fedulina, L. V. (1982). Zh. Tekh. Fiz. 52(11), 2140.
Kel’man, V. M., Rodnikova, I. V., and Sekunova, L. M. (1985). “Static Mass Spectrometers” (in Russian). Nauka, Alma-Ata.
Leckey, R. C. G. (1987). J. Electron Spectrosc. Relat. Phenom. 43, 183.
Roy, D., and Tremblay, D. (1990). Rep. Prog. Phys. 53(12), 1621.
Sachenko, V. D., and Fridlyanski, G. V. (1980). Zh. Tekh. Fiz. 50(9), 1974.
Shevelev, G. A., Senin, N. A., Glickman, L. G., Goloskokov, Yu. V., and Karetskaya, S. P. (1991). All-Union Semin. Secondary Ion Ion-Proton Emission, 6th, Kharkov, Report theses, p. 245.
Strashkevich, A. M. (1966). “Electron Optics for Electrostatic Systems” (in Russian). Energiya, Moscow and Leningrad.
Vandakurov, Yu. V. (1956). Zh. Tekh. Fiz. 26(11), 2578.
Zhukovsky, A. G., Karetskaya, S. P., Kel’man, V. M., Koval’, N. A., Mikhailichenko, A. I., Ryabishev, A. G., Saichenko, N. Yu., Solovei, S. D., and Tantsirev, G. D. (1989). Zh. Tekh. Fiz. 59(6), 110.
Index
A Aberrations electron off-axis holography, 8, 9, 21, 48 mirror-bank energy analyzers, 403 -4 10 equations, 397-399 object width, 408 -4 10 transaxial mirrors, 437-441 wave aberrations, 8, 9, 21, 48 Acousto-optical cells, optical symbolic substitution, 8 1- 82 Addition Minkowski addition, 328 optical symbolic substitution, 59-64, 70 Aharonov-Bohm effect, 144-146 devices based on, 106, 142-178 electrostatic, 145, 146, 157-178 magnetostatic, 146, 149-156 Aharonov-Bohm interferometer, 172-1 78, 180, 186, 199 ALE, 222 Algebra, Boolean, fuzzy set theory, 258 Algebraic product, fuzzy set theory, 261-262 Algorithms basis algorithms, 334-349, 383, 384 general basis algorithms, 337-349, 363-365, 370-374 general decomposition algorithm, 371 for OSS rules, 59-71 Amplitude transfer function, 8
Amplitude transmittance, 3, 10, 33, 34 AND gate, 229-231 Angular aberrations, mirror-bank energy analyzers, 403 -404 Anti-extensive ?-mapping, 350-35 1 Antiferromagnetism. ground-state computing, 226-228 Arithmetic combination digital systems, 233-234 Minkowski addition and subtraction, 328 optical symbolic substitution, 59-7 1 Associativity, fuzzy relations, 286 Atomic layer epitaxy, 222 Automated information retrieval, 306-307 fuzzy relations, 272, 306-310
B Ballistic transport, 107 Aharonov-Bohm effect-based devices, 144-145, 151-152, 166 space charge effects, quantum mechanical analysis, 130-133 Bandler-Kohout compositions, 27 1-276 associativity. 286 cuttability, 287-288 fuzzy relations, 279-289 Basis algorithms, 334-349, 383-384 binary ?-mappings, 336, 366 general basis algorithm, 337-349, 363-365, 370-374 48 1
482
INDEX
Basis algorithms (continued) gray-scale 7-mappings, 336-337, 366-374 rank order-statistic filters, 335 translation-invariant set mapping, 361-366 Basis representation, 33 1-333 dual basis, 348-349 filtering properties, 349-358 gray-scale 7-mapping, 368 transforming, 374-383 translation-invariant set mapping, 359 -360 Binary images, mathematical morphology, 326-327 Binary morphology, 326 Binary number system half adder, 233 optical symbolic substitution, 58-59 spin-polarized single-electron logic devices, 223-235 Binary relations, 291-292 fuzzy relations, 292-293 Binary wnappings, 336, 366 Bipolar transistors, 136 Biprism, off-axis holography. 5, 7, 9-10. 22 Bistability granular electron devices, 207 quantum-coupled devices, 219, 220, 224-225 resonant tunneling devices, 130-133 Bold intersection, fuzzy set theory, 263 -264 Boltzmann transport equation, 111 Boolean algebra, fuzzy set theory, 258 Bounded sum, fuzzy set theory, 263-264 BTE, 111 C Calculus, 266-267 classical relational, 267-276 fuzzy relational, 266, 276-289 CAM, OSS, 70-72, 88, 90 Cameras, mirrors in, 460-464 Capacitance, interconnect, 209 Carrier frequency, electron off-axis holography, 22 Carry-free addition, 59, 61 Cartesian product, fuzzy set theory, 264 Cascaded ?-mapping, 345 -348, 363-366, 381-383 CCD, 221,244
Cellular automata, 217, 220, 224, 229, 240-241 Change effect transistor, 204 Characteristic function, 366 Characteristic mapping classical relational calculus, 275-276, 282 fuzzy relations, 286-287 Charged-coupled device, 22 1, 244 Charged particles focusing, 392 trajectories, 393-399 Chip architecture, 217-243 Chromatic aberrations. mirror-bank energy analyzers, 404-408 equations, 397- 399 transaxial mirrors, 448-45 1 Chromatic partial coherence, 25-26 Closing, mathematical morphology, 329-330, 336. 345 gray-scale, 372- 374 Closure, math, fuzzy relations, 293-294 Coding, optical symbolic substitution, 52-59, 68-70 Collimators. mirrors, 460 Computation combination digital systems, 233-234 optical symbolic architecture, 54-91 spin-polarized single-electron chips, 224 -225 reading and writing, 235-237 Conductance, quantum conductance, 113-117 Confined systems, 96 Conjunction, fuzzy sets, 261, 297-298 Content-addressable memory, 70-7 1 Contradiction, weakened law of, 260 Contrapositivity, 277, 278 Cornlator, multiplexed, optical symbolic substitution, 82-84 Coupling, electromagnetic coupling coefficient, 213-214 optical interconnects, 210-213 quantum devices, 209-243 quantum mechanical coupling coefficient, 2 13-2 17 between quantum wells, 195-199 shortcomings, 221-223.244-245 spin -phonon, 241-242 Coupling coefficients, 213-217 Crosstalk, electromagnetic,quantum devices, 209
INDEX
Crystals, electron off-axis holography, 37-38,44-47 Current vortices, electron transport devices, 120 Cuts, fuzzy set theory, 264-265
D Database, fuzzy relations, 313-314 Defects, crystals, electron off-axis holography, 44-47 De Morgan’s law, 231, 259 Detective quantum efficiency, 33, 34 Diffraction grating, optical symbolic substitution, 72-73 Diffusive transport, 107 Aharonov -Bohm effect-based devices, 151-152 Digital reconstruction, electron off-axis holography, 5-6, 12-18, 21-24, 33-34, 36,47 Digital system electron microscopy, 6 spin-polarized single-electron device, 233-234 Dilation, mathematical morphology, 328-329. 343-344 Directional couplers, 193-199 Disjunction, fuzzy sets, 261, 297 Dispersion, mirror-bank energy analyzers, 397,403 Double-barrier resonant tunneling device, 130-135 Double quantum wells, Aharonov-Bohm effect, 149-178 Double quantum wire Aharonov-Bohm interferometer, 172-178, 180, 186, 199 DQE, 32-33 Drift diffusion formalism, 110-1 13 Dual basis, 348-349, 354-358 gray-scale .r-mappings, 37 1-372 translation-invariantset mappings, 363 Dual conorm, fuzzy set theory, 261 Dual rail coding, 68-69 Dwell time, 136 Dynamical phase effects, 38-44
E Edge detection, OSS, 87-89 Elastic collisions, electrons, 100-103 Electric field, electron-optical properties, 401-403
483
Electromagnetic coupling coupling coefficient, 213-214 optical interconnects, 210-213 quantum devices, 208-243 Electromagnetic crosstalk, quantum devices, 209 Electron microscopy, 2, 4, 6 Electron off-axis holography, 1-48 applications, 36-47 crystal defects, 44-47 dynamical phase effects, 38-43 thickness measurement, 36-38 phase distribution, displaying, 18-19 problems, 25-35 hologram recording, 32-35 limited coherence, 25-3 1 noise problems, 31-32 reconstruction digital, 5-6, 12-18, 21-24, 33-34, 36, 47 light optical, 5-6, 10-12, 21-24, 33-34 Electrons drift diffusion formalism, 110-1 13 elastic-inelastic collisions, 100-103 particle behavior, 110-1 11, 118 quantumcoupled spin-polarized singleelectron logic devices, 223-235 transport, 103-1 19 wave behavior, 103-1 10, 118 Electron transport ballistic, 107 diffusive, 107 drift diffusion formalism, 110-1 13 electron wave devices, 103-1 10 quasi-dissipative, 118-1 19 Electron wave devices, 99-120 Aharonov-Bohm devices, 142-178 current and conductance formulas, 113-118 directional couplers, 193-199 electron transport, 99-1 13 energy dissipation, 1 19-120 potential drop, 119-120 quasi-dissipative transport, 118-1 19 resonant tunneling devices, 123-124 T-structure transistors, 178-193 Electron wave directional couplers, 193-199 Electron wave guides, 193 Electro-optic light modulator, 199-200 Electro-optic switch, T-structure transistor, 191-193
484
INDEX
Electrostatic Aharonov - Bohm effect, 145-146 disordered structures, 165-178 double quantum wells, 157-165 interferometer, 172-178, 180, 186, 199 Electrostatic energy analyzers, mirror bank, 391-478 charged particle focusing, 399-410 charged particle trajectory equations, 393-399 Energy analyzers, mirror bank, see Mirrorbank energy analyzers Envelope function, electron off-axis holography, 25-27, 29 Equivalence, fuzzy relations, 278-279 Erosion, mathematical morphology, 328-329, 343-344, 314 Exchange principle, 277, 278 Excluded middle, weakened law of, 260 Exclusive OR gate, 232-233 Extensive 7-mapping, 350-35 I Extinction thickness, 41
F Fabry-Perot resonance condition, 125, 126, 128 Field-effect transistors, 98, 136 Filtering, mathematical morphology, 349-358 anti-extensive mappings, 350-35 1 extensive 7-mappings, 350-35 1 over-filtering, 35 1-354 self-duality, 35 1-354 under-filtering, 35 1-354 Flatband theory, 132 Flat gray-scale mapping. 367-368 Focusing, charged particle, 392, 433-441 Four-electrode mirrors, energy analyzers, 421-425 Friedel’s law. 17 Fringe contrast, electron off-axis holography, 32 Fuzzy inference engine, 3 17- 3 18 Fuzzy knowledge base, 3 I3 - 3 16 Fuzzy relational calculus, 266, 276-289 Fuzzy relations binary relations, 292-293 calculus, 266, 276-289 closures and interiors, 293-294 likeness relations, 296-297 similarity relations, 294-296 theory, 255-264
Fuzzy set theory, 255-264
G Gates, logic, spin-polarized single electrons, 229-232 General basis algorithm, 337 gray-scale function mapping dual basis, 371-372 opening and closing, 372-374 7-mapping cascaded mappings, 345-348 dilation, 343-345 dual basis, 348-349 erosion, 343-345 intersection, 339-343 translation, 337-338 union, 338-339, 342-343 translation-invariant mapping cascaded mapping, 363-365 dual basis, 363 Goguen implication operator, 280 Granular electron devices, 203 -208, 244 quantum-coupled, shortcomings, 221, 244 -245 Gray-scale morphology, 326, 336 Gray-scale 7-mapping, 336-337, 366-374 general basis algorithm, 370-375 Ground state computing, 218-220 antiferromagnetism. 226-228
H Hamming distance, 305 Height, fuzzy set theory, 265-266 Hit-or-miss mapping, 358 Hologram definition, 8 optical symbolic substitution, 74, 77-79 recording, electron holograms, 32-35, 47 Hologram fringes, electron off-axis holography, 23-24, 32 Holography, 2 content-addressable memory, 70-7 1 historical review, 2-3 in-line, 3, 5 off axis, 4, 5 applications, 36-47 phase distribution, displaying, 18-19 problems, 25-35 reconstruction, 10-18, 2 1-24 optical symbolic substitution, 70-71, 74, 77-79
INDEX phase detection, 5-6 principles, 2-3
I Image equation, mirror-bank energy analyzers, 403 Image-plane off-axis holography, 4, 6 applications, 36-47 crystal defects, 44-47 dynamical phase effects, 38-43 thickness measurement, 36-38 phase distribution, displaying, 18-19 problems, 25 -35 hologram recording, 32-35 limited coherence, 25-3 1 noise problems, 31-32 reconstruction digital, 12-18 light optical, 10-12, 21-24 Image processing electron off-axis holography, 7-47 gray-scale morphology, 326, 336, 366-374 mathematical morphology, 325 -389 optical symbolic substitution, 82-84 Image of set classical relational calculus, 267-276 fuzzy relations, 289-290 fuzzy sets, 284-285 Implication operator, fuzzy relations, 277-279 Incoherent tunneling, 128 Inelastic collisions, electrons, 100-103 Inelastic scattering, quantum electron transport, 134-135 Inference classical, 3 16-3 17 fuzzy, 3 17-3 18 method of cases, 322-323 modus ponens, 3 16, 318-320 syllogism, 322 Inf-generating mapping, 378 Information retrieval, 306-307 fuzzy relations, 272, 307-310 In-line holography, 3, 5 Input, single-electron devices, isolation, 238-240 Integrated circuits quantum-coupled architectures, 2 17-243 quantum devices, 183-184, 208 T-structure transistors, 183-184 ULSI, 208
485
VHSIC, 208-210, 215-217 Intensity coding, optical symbolic substitution, 58, 77 Intensity transmittance, 33, 34 Interconnect capacitance, 209 Interconnectless architecture, 219, 224 Interconnects, optical, coupling, 210-213 Interference, electron wave devices, 103-1 10 Interferometer double quantum wire Aharonov-Bohm interferometer, 172-178, 180, 186, 199 Mach-Zender interferometer, 11, 146-149 optical symbolic substitution, 77 Interferometric reconstruction, 13-16 Intersection, ?-mapping, 339-343, 376-377 Irreproducibility, quantum-coupled devices, 222,244
J Johnson limit, 243
K Kadanoff-Baym-Keldysh formalism, 118-119, 134-135 Karnaugh maps, 66 Kernel constraint, 333 Kernel representation, 330-33 1 translation-invariant set mapping, 359 Kirchoff’s current law, 115 Kleene-Dienes operator, 278
L Landauer formula, 117, 119 Lateral semiconductor quantum devices, Aharonov-Bohm effect-based, 106, 142-178 Law of contradiction, 260 Light-emitting diode, optical symbolic substitution, 81-82, 84, 86 Light-optical reconstruction, electron offaxis holography, 5-6, 10-12, 21-24, 33-34 Likeness relations, 296-297 Linear response regime, 103 Linear transport response, 109 Liouville equation, 1 18 Logic devices, spin-polarized single-electron devices, 223-235 AND and NAND gates, 229-231 digital systems, 233-234
486
INDEX
Logic devices, spin-polarized single-electron devices (continued) input and output isolation, 238-240 NOT gates, 229 OR gates, 231-232 performance figures, 241-243 Logic gates, spin-polarized single electron, 229-232 Luckasiewicz operator, 278, 288
M Mach-Zender interferometer, 11, 146-149 Magnetostatic Aharonov-Bohm effect, 146 double quantum wells, 149-156, 166 Magnetostriction, single-electron cells, 237 Mapping, characteristic mapping classical relational calculus, 275-276, 282 fuzzy relations, 286-287 T-mapping, 330-332, 383 anti-extensive, 350-35 1 binary, 336, 366 cascaded, 345-348, 363-366, 381-383 dual, 348-349, 354-358 extensive, 350-35 1 gray-scale, 336-337, 366-374 intersection, 339-343, 376-377 over-filtering, 35 1-353 translation, 337-338 underfiltering, 35 I , 353 -354 union, 338-339, 342-343, 375-376 Mass analyzers, with transaxial mirrors, 469-476 Mass spectrometry, two-plate electrodes, 425 -430 Matched filtering, optical symbolic substitution, 74 Mathematical model, fuzzy set theory, 255-264 Mathematical morphology, 325 -389 basis algorithms, 334-349, 383-384 binary ?-mappings, 336, 366 general basis algorithm, 337-349, 363-365, 370-374 gray-scale 7-mappings, 336-337, 366-374 rank order-statistic filters, 335 basis representation filtering properties, 349-358 transforming, 374-3383 filtering, 349-358 anti-extensive 7-mappings, 350-35 I
extensive 7-mappings, 350-35 1 over-filtering, 351-354 self-duality, 354- 358 under-filtering, 35 1-354 general basis algorithm, 337 gray-scale function mapping, 371-374 .r-mapping, 337-349 translation-invariant mapping, 363 - 365 gray-scale function mappings, 366-374 theory, 326-333 basis representation, 33 1-333 closing, 329-330, 336, 345, 372-374 dilation, 328-329, 343-344 erosion, 328-329, 343-344, 374 kernel representation, 330-331 Matheron’s theorem, 330 opening, 329-330, 336, 345, 372-374 translation-invariant set mappings, 361-366 Matheron’s theory, 330 Matrix, fuzzy relations, 288-289 Maximum spatial frequency, electron offaxis holography, 24 MBE, semiconductor quantum devices, 95, 97 Mean composition, 282-283 Median filter, mathematical morphology, 335 Medical diagnosis, fuzzy relations, 272, 298-300 Memory storage, single-spin single-electron devices, 233, 234 Mesoscopic devices, 94, 99 Metal oxide semiconductor field-effect transistor, 98 Method of cases inference, 322-323 Michelson interferometer, optical symbolic substitution, 77 Microscopy electron microscope, 2, 4. 6 spin-polarized scanning tunneling microscope, 225, 236-237 Minimal representation, translation-invariant set mapping, 360-362 Minimization, truth-table, 66-68 Minkowski addition, 328 Minkowski subtraction, 328 Minterms, 66-68 Mirror-bank energy analyzers, 39 1-478 charged particle focusing and energy separation, 399-410 charged particle trajectory equations, 393-399
487
INDEX
multicascade, 430-433, 469-472 static mass analyzer, 472-476 transaxial mirrors, 441-476 two-cascade, 430-432 two-plate electrodes separated by direct slits, 410-432 transaxial mirrors, 441-476 Mirror-based electron analyzers, twocascade, 430-432 Mirrors, charged particle focusing and energy separation, 399-410 transaxial mirrors, 433-441 charged particle trajectory equations, 393-399 electron-optical parameters, 393 -399, 442-443 four-electrode mirror, 42 1-425 mirror with a ‘‘wall,” 420-425 three-electrode mirrors, 417-420, 458-468 transaxial, 392, 396 charged particle focusing and energy separation, 433-441 two-electrode mirrors, 392, 41 1-417, 441-458 two-plate electrodes, 392 in mass spectrometer, 425 -430 parallel electrode plates, 41 1-420 wedge-shaped mirrors, 430-432 MODFET, 98 Modified signed-digit number system, 59-65 content-addressable memory, 70-7 1 OSS coding, 68-70 Modulation-doped field-effect transistor, 98 Modulation transfer function, electron offaxis holography, 32, 34-35 Modus ponens inference, 3 16, 3 18-320 Molecular beam epitaxy, semiconductor quantum devices, 95, 97 Monotonicity, 277, 278 Morgan algebra, fuzzy set theory, 256 MOSFET, 98 MSD arithmetic, 61-65, 70-71 content-addressable memory, 70-7 1 OSS coding, 68-70 MTE 32. 34-35 Multicascade energy analyzers, 430-432, 469-472 Multifunctionality, semiconductor quantum devices, 164 Multiple transformations, mathematical morphology, 377
Multiple value fixed radix number system, 59 Multiplexed correlator, optical symbolic substitution, 82-84 Multiterminal formulas, electron-wave devices, 117-1 18
N NAND gate, 229-231 Nanostructure electronic devices, 94, 99 NDR, 126. 128 Negative differential resistance, 126, 128 Neutrality principle, 277, 278 Noise, electron off-axis holography, 3 1-32 Nonbinary number systems, optical symbolic substitution, 59 Nonisoplanatism, 9 NOR gate, 231-232 NOT gate, 229 Number systems, binary number systems half adder, 233 OSS, 58-59 spin polarized singleelectron logic device, 223-235 modified signed-digit number systems, 59-65 content-addressable memory, 70-71 OSS coding, 68-70 optical symbolic substitution, 58-59 redundant number systems, 60 residue number system, 59 ternary signed-digit number systems, 59 0 Off-axis holography, 4, 5 applications, 36-47 crystal defects, 44-47 dynamical phase effects, 38-43 thickness measurement, 36-38 phase distribution, displaying, 18-19 problems, 25-35 hologram recording, 32-35 limited coherence, 25-31 noise problems, 3 1-32 reconstruction digital, 12-18 light optical, 10-12, 21-24 Onsager relation, 227 Opening, mathematical morphology, 329-330, 336, 345, 372-374 gray-scale, 372-374
488
INDEX
Optical interconnects, coupling between, 210-213 Optical processors, optical symbolic substitution, 72-87 Optical symbolic substitution, 54-58 architecture, 71-91 with acousto-optic cells, 81-82 with diffraction grating, 72-73 image processing, 87-91 with matched filtering, 74 with multiplexed comelator, 82-83 with opto-electronic devices, 79-80 with phase-only holograms, 74, 77-79 with shadow-casting and polarization, 84, 86, 87 coding techniques, 57-58, 68-70 content-addressable memory, 70-7 1 image processing, 87-9 I signed-digit arithmetic, 59-71 algorithm for OSS rules, 61 higher order MSD arithmetic, 61-65 MSD OSS rule coding, 68-70 optical implementation, 70-71 theory, 60 truth-table minimization, 66-68 Opto-electronic devices, optical symbolic substitution, 79-80 OR gate, 231-232 OSS, see Optical symbolic substitution Output, single-electron devices, isolation, 238-240 Over-filtering, mathematical morphology, 351, 353-354
P Partial coherence, electron off-axis holography, 25 - 3 1 Particle-wave duality, 99, 110, 118 Phase amplifications, electron off-axis holography, 19-21, 47 Phase breaking length, 100 Phase coherence length, 100 Phase detection, electron off-axis holography, 5-7 Phase distribution, electron off-axis holography, 18-19. 41 Phase memory, quantum devices, 100, I I I Phase uncertainty, electron off-axis holography, 23 Photographic noise, electron off-axis holography, 31-32
Pixels electron off-axis holography, 24, 32 optical symbolic substitution, 55-56 Plinth, fuzzy set theory, 265-266 Polarization coding, optical symbolic substitution, 57, 59, 84-87 Probabilistic sum, fuzzy set theory, 261-262 Procentual composition, 283 -284
Q Quantified image, fuzzy relations, 289-290 Quantum bubble, 97 Quantum chips, 208-243 Quantum conductance, 113-1 17 Quantum-confined systems, 96 Quantum-coupled architectures, 2 17- 243 spin-polarized single-electron logic devices, 223-235 Quantum dashes, 219, 220, 224 Quantum devices, definition, 98 semiconductor, 93 -245 Aharonov-Bohm effect-based devices, 106, 142-178 connecting on a chip, 208-243 coupling, 208-243 electron wave devices, 99-120 directional couplers, 193-199 granular electronic devices, 203-208 quantum-coupled devices, 208 -243 resonant tunneling devices, 121-142, 222, 245 shortcomings, 210-222 spin precession devices, 106, 199-203 transistors, 106, 157, 160-165, 167, 189, 244 T-structure transistors, 178-193 superconductor, 98 tunnel diode, 98 Quantum dots, 97, 219, 220, 224 Quantum interference effects, 106-1 10, 120 Quantum interference transistor, 106, 157, 160-165, 167, 178, 244 Quantum mechanical coupling, 213-217 shortcomings, 221-223 Quantum mechanical tunneling, 216 tunneling time, 136-142 Quantum mechanics, 99 resonant tunneling devices, space charge efect, 130-133 Quantum noise, electron off-axis holography, 3 1- 32
INDEX
Quantum well, 97 coupling, 195-199 double quantum wells electrostatic, 157-178 magnetostatic Aharonov -Bohm effect, 149-1 56 resonant tunneling devices, 121-142 Quantum wire, 97 Aharonov-Bohm interferometer, 172, 178, 180, 186, 199 Quasi-dissipative electron transport, 118-1 19
R Random access memory, sequential digital systems, 234 Rank order-statistic filters, 335 READWRITE mechanism, single-electron cells, 235-237 Reconstruction digital, 5-6, 12-18, 21-24, 33-34, 36, 47 interferometric, 13-16 light optical, 5-6, 10-12, 33-34 Redundant number system, 60 Relations classical relational calculus, 270-275 fuzzy relations, 276, 279-288 Reproducibility, quantum-coupled devices, 222, 244 Residue number system, 59 Resonant tunneling, 126, 128 Resonant tunneling devices, 121-142, 222, 245 applications, 135-136 inelastic scattering, 134-135 reproducibility, 222, 244 space-charge effects, 130-134 spectroscopy, 135-136 transistors, I36 tunneling time, 136-142 Resonant tunneling electron spectroscopy, 135-136 Resonant tunneling transistors, 136 Roberts operator, 87-88, 90, 91
S Sampling interval, electron off-axis holography, 23, 24 Scanning tip lithography, semiconductor quantum devices, 95, 97 Scanning tunnel microscopes, 237 Schrodinger equation, 112
489
Self-duality, mathematical morphology, 354-358 Semiconductor quantum devices, 93 -245 Aharonov-Bohm effect-based devices, 106, 142-178 connecting on a chip, 208-243 electron wave devices, 99-120 directional couplers, 193-199 granular electronic devices, 203-208, 245 quantum-coupled devices, 208-243 reproducibility, 222 resonant tunneling devices, 121-142, 222, 245 spin precession devices, 106, 199-203 transistors, 106, 157, 160-165, 167, 189, 244 T-structure transistors, 178-193 Sequential resonant tunneling, 129 Serial mass spectrometer, 430 Serial transformations, mathematical morphology, 377-381 Set mapping, see also .r-mapping filtering properties, 349-358 mathematical morphology, 327-328 translation-invariant, 358 -366 Set theory, fuzzy set theory, 255-264 Shadow-casting, optical symbolic substitution, 84 -87 Shapiro matrix, 181, 183 Signed digit arithmetic, optical symbolic substitution, 59-7 I Silicon, Johnson limit, 243 Similarity relations, 294-296 Single-electron logic devices, quantumcoupled spin-polarized, 223-235, 24 1- 243 Soft algebra, fuzzy set theory, 256 Space charge effect, electron transport devices, 120, 130-134 Spatial partial coherence, 26-3 1 Spectrometry, two-plate electrodes, 425 -430 Spectroscopy, resonant tunneling electron spectroscopy, 135-136 Spherical aberrations, mirror-bank energy analyzers, 404-408 Spin-phonon coupling, 241-242 Spin polarization, single-electron logic devices, 223-235, 241-243 Spin-polarized scanning tunneling microscope, 225, 236-237 Spin precession devices, 106, 199-203
Spin-spin coupling, quantum devices, 225-226 SPSTM, 225, 236-237 SQUID, 98 S-SEED, 79-80 Static mass analyzers, 472-476 STL, semiconductor quantum devices, 95, 97 STM, 236 Strict cuts, fuzzy set theory, 264-265 Structuring function, 368 Stub-tuning, 179 Subcomposition, 271, 283-284 Subtraction Minkowski subtraction, 328 optical symbolic substitution, 60, 61, 64-65, 70 Supercomposition, 271, 283-284 Superconducting quantum interference device, 98 Sup-generating mapping, 378 Switching, quantum-coupled devices, 221, 222, 225, 241-242 Switching speed, single-electron cells, 241-242 Symbolic substitution, optical, see Optical symbolic substitution Symmetric self-electro-optic effect device, 79-80
T τ-mapping, 330-332, 383 anti-extensive, 350-351 binary, 336, 366 cascaded, 345-348, 363-366, 381-383 dual, 348-349, 354-358 extensive, 350-351 gray-scale, 336-337, 366-374 intersection, 339-343 reversing, 376-377 over-filtering, 351-353 translation, 337-338 under-filtering, 351, 353-354 union, 338-339, 342-343 reversing, 375-376 Ternary signed-digit number system, optical symbolic substitution, 59 Texas Instruments, quantum-coupled integrated circuits, 218, 221-222 Thesaurus construction, fuzzy relations, 307, 310-311
Thickness, measurement with electron off-axis holography, 37-38 Thomas-Fermi model, space charge effects, 132-134 Thouless energy, 108 Thouless temperature, 107 Three-electrode mirrors, energy analyzers, 417-420, 458-469 TI mapping, 358-366 general basis algorithm, 363-365 Transaxial mirrors, 392, 396 charged particle focusing and energy separation, 433-441 energy analyzers, 441-469 mass analyzer, 469-476 Transformation, τ-mapping, 374-385 multiple transformations, 377 serial transformations, 377-381 Transistors Aharonov-Bohm effect-based devices, 145, 146, 189 bipolar transistors, 136 field-effect transistors, 98, 136 granular electron transistors, 205-208 quantum interference transistors, 106, 157, 160-165, 167, 178 resonant tunneling transistors, 136, 245 spin precession transistors, 106, 199 T-structure transistors, 178-193 Translation, τ-mapping, 337-338 Translation-invariant set mapping, 358-366 basis algorithms, 361-363 general basis algorithm, 363-365 Transmission amplitude, 124-125 Trapezoidal fuzzy quantity, 320 Trapping, quantum-coupled devices, 221-222 Triangular compositions, applications, 297-311 Triangular conorm, fuzzy set theory, 261 Triangular norm, fuzzy set theory, 261, 278, 289 Truth table minimization, 66-68 NAND logic gate, 230-231 T-structure transistors, 178-193 analog, 188-189 digital applications, 189-191 electro-optic applications, 191-193 Tsu-Esaki formula, 115 quantum mechanical tunneling time, 136-142
Tunnel diode, 98 Tunneling incoherent tunneling, 128 quantum mechanical tunneling, 136-142, 216 resonant tunneling, 126, 128 resonant tunneling devices, 121-142, 222, 245 sequential resonant tunneling, 129 spin-polarized scanning tunneling microscope, 225, 236-237 Tunneling time, quantum mechanical, 136-142 Two-electrode mirrors, energy analyzers, 392, 411-417, 441-458 Two-plate electrodes, energy analyzers, 392, 410-432, 441-476
U ULSI, quantum devices, 208-210, 215-217 Ultracomposition, 271, 280, 282 Ultralarge-scale integrated chips, 208, 209 Ultrathin film, MBE, 97 Umbra transform, 366
Uncertainty, fuzzy set theory, 256 Under-filtering, mathematical morphology, 351, 353-354 Union, τ-mapping, 338-339, 342-343
V Vertical semiconductor quantum devices, resonant tunneling devices, 121-142 Very high-speed integrated circuits, 208-209 VHSIC, quantum devices, 208-209
W Wave aberration, off-axis holography, 8, 9, 21, 48 Wave-particle duality, 99, 110, 118 Weakened law of contradiction, 260 Weakened law of excluded middle, 260 Window, mathematical morphology, 335 Window transformation, 358 WRITE mechanism, single-electron cells, 235-237
Z Zero-field spin splitting, 201