ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS
VOLUME 48
CONTRIBUTORS TO THIS VOLUME

J. Arsac, C. Baud, Ch. Galtier, Ronald E. Rosensweig, H. Rougeot, G. Ruggiu, P. R. Thornton, Tran Van Khai, J. P. Vasseur, T. A. Welton
Advances in Electronics and Electron Physics

EDITED BY L. MARTON
Smithsonian Institution, Washington, D.C.

Associate Editor: CLAIRE MARTON

EDITORIAL BOARD
T. E. Allibone, E. R. Piore, H. B. G. Casimir, M. Ponte, W. G. Dow, A. Rose, A. O. C. Nier, L. P. Smith, F. K. Willenbrock
VOLUME 48
1979
ACADEMIC PRESS New York San Francisco London A Subsidiary of Harcourt Brace Jovanovich, Publishers
COPYRIGHT © 1979, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD.
24/28 Oval Road, London NW1 7DX

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 49-7504
ISBN 0-12-014648-7
PRINTED IN THE UNITED STATES OF AMERICA
79 80 81 82 83 84 85   9 8 7 6 5 4 3 2 1
CONTENTS

CONTRIBUTORS TO VOLUME 48 .... vii
FOREWORD .... ix

Negative Electron Affinity Photoemitters
H. ROUGEOT AND C. BAUD

I. Introduction .... 1
II. Photocathodes Using Negative Electron Affinity .... 2
III. Reflection Photocathodes .... 9
IV. Photoemission by Transmission .... 17
V. Angular Energy Distribution .... 21
VI. NEA Photocathode Technology .... 22
VII. Photoemission Stability and Dark Current .... 31
VIII. Conclusion .... 32
References .... 33

A Computational Critique of an Algorithm for Image Enhancement in Bright Field Electron Microscopy
T. A. WELTON

I. Introduction .... 37
II. Image Theory for Bright Field Electron Microscopy .... 39
III. Effect of Partial Coherence .... 47
IV. Statistical Error .... 50
V. Object Reconstruction .... 55
VI. Programs for Numerical Tests of the Reconstruction Algorithm .... 67
VII. Presentation and Discussion of Data .... 74
Appendix A. Estimates of Quadratic Effects .... 87
Appendix B. The Wiener Spectrum of the Object Set .... 94
Appendix C. Programs for Determining W(R) .... 97
References .... 100

Fluid Dynamics and Science of Magnetic Liquids
RONALD E. ROSENSWEIG

I. Structure and Properties of Magnetic Fluids .... 103
II. Fluid Dynamics of Magnetic Fluids .... 122
III. Magnetic Fluids in Devices .... 157
IV. Processes Based on Magnetic Fluids .... 186
References .... 195

The Edelweiss System
J. ARSAC, CH. GALTIER, G. RUGGIU, TRAN VAN KHAI, AND J. P. VASSEUR

I. General-Purpose Operating Systems .... 202
II. New Trends in Computer Architecture .... 203
III. New Trends in Programming .... 205
IV. Principles of EXEL .... 207
V. Large-Scale Systems: The EDELWEISS Architecture .... 223
VI. The Single-User Family .... 256
Appendix A .... 263
Appendix B .... 264
Appendix C .... 267
References .... 269

Electron Physics in Device Microfabrication. I. General Background and Scanning Systems
P. R. THORNTON

I. General Introduction .... 272
II. Photon and Electron Beam Lithography for Device Microfabrication .... 275
III. Interactions between an Electron Beam and a Resist-Coated Substrate .... 280
IV. Electron Beam Methods for Device Microfabrication .... 288
V. The Development of a Fast Scanning System-General .... 300
VI. Design of a Fast Scanning System-Use of Thermal Cathodes .... 326
VII. The Development of a Fast Scanning System-Use of Field Emitter Cathodes .... 336
VIII. The Development of a Fast Scanning System-The Role of Computer-Aided Design .... 341
IX. The Deflection Problem .... 352
X. High-Current Effects .... 367
References .... 376

AUTHOR INDEX .... 381
SUBJECT INDEX .... 391
CONTRIBUTORS TO VOLUME 48

Numbers in parentheses indicate the pages on which the authors' contributions begin.

J. ARSAC, Institut de Programmation, Université Paris VI, Tour 45-55, 11, Quai Saint-Bernard, 75005 Paris, France (201)
C. BAUD, Laboratoire de Recherches, Thomson-CSF Division Tubes Electroniques, 38120 St. Egrève, France (1)
CH. GALTIER, Thomson-CSF, Laboratoire Central de Recherches, Domaine de Corbeville, B.P. 10, 91401 Orsay, France (201)
RONALD E. ROSENSWEIG, Corporate Research Laboratories, Exxon Research and Engineering Company, Linden, New Jersey 07036 (103)
H. ROUGEOT, Laboratoire de Recherches, Thomson-CSF Division Tubes Electroniques, 38120 St. Egrève, France (1)
G. RUGGIU, Thomson-Brandt, 173, Boulevard Haussmann, B.P. 700-08, 75360 Paris Cedex 08, France (201)
P. R. THORNTON, GCA Corporation, Burlington, Massachusetts 01803 (271)
TRAN VAN KHAI, Thomson-Brandt, 173, Boulevard Haussmann, B.P. 700-08, 75360 Paris Cedex 08, France (201)
J. P. VASSEUR, Thomson-Brandt, 173, Boulevard Haussmann, B.P. 700-08, 75360 Paris Cedex 08, France (201)
T. A. WELTON, Physics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, and Department of Physics, University of Tennessee, Knoxville, Tennessee 37916 (37)
FOREWORD

Although many aspects of photoelectricity were treated as recently as Volumes 40A and B (1976) of the Advances, those treatments related mainly to devices. A more thorough treatment of the physics of photoemitters has not appeared since the eleventh volume (1959), and this omission is partially filled by the review of H. Rougeot and Ch. Baud, entitled "Negative Electron Affinity Photoemitters." The authors show that studies utilizing advances in solid-state physics have improved both the understanding and the performance of such photoemitters.

The next review is by T. A. Welton: "A Computational Critique of an Algorithm for Image Enhancement in Bright Field Electron Microscopy." A closely related subject is treated in the recent monograph by W. O. Saxton (Supplement 10 to Advances in Electronics and Electron Physics), but Welton's approach is different: he explores in some detail a small subset of computational procedures, with special attention to their practical difficulties, in an effort to obtain better image reconstruction.

The title of the third review, "Fluid Dynamics and Science of Magnetic Liquids," by R. E. Rosensweig, may give the impression that its subject is far removed from the usual contents of the Advances. There are, however, at least two reasons for its inclusion: first, the similarity of the fluid dynamics of magnetic fluids to magnetohydrodynamics, and second, the intriguing possibility of coupling ferrofluidic devices to electronic devices. The expected interest of electronics engineers in this new class of devices is ample justification for this review's appearance.

Since the last review of large-scale computer organization in the eighteenth volume (1963), little has been published here on computer architecture except for a review of minicomputers (Vol. 44, 1977). It is timely, therefore, to examine the architecture of one particular modern computer, and the review titled "The Edelweiss System," by J. Arsac, Ch. Galtier, G. Ruggiu, Tran Van Khai, and J. P. Vasseur, does so. The authors discuss trends in computer architecture, in programming, and in the use of such systems.

The first part of a two-part review on "Electron Physics in Device Microfabrication," by P. R. Thornton, completes this volume. The reduction in size of present-day integrated circuitry requires ever-increasing sophistication in the methods of production. With components approaching almost molecular dimensions, the most advanced techniques in electron optics are needed to meet these extreme requirements. The author discusses such techniques and their limitations.
Following our custom we list again the titles of future reviews, with the names of their authors. This time the listings are given in three categories: first, regular critical reviews; second, as usual, supplementary volumes; and third, a special listing of Volume 50 of this serial publication. This fiftieth volume, marking a kind of anniversary, will be devoted entirely to historical presentations of different subjects in electronics and electron physics.

Critical Reviews:

The Gunn-Hilsum Effect .... M. P. Shaw and H. Grubin
A Review of Application of Superconductivity .... W. B. Fowler
Sonar .... F. N. Spiess
Electron Attachment and Detachment .... R. S. Berry
Electron-Beam-Controlled Lasers .... Charles Cason
Amorphous Semiconductors .... H. Scher and G. Pfister
Electron Beams in Microfabrication. II .... P. R. Thornton
Design Automation of Digital Systems. I and II .... W. G. Magnuson and Robert J. Smith
Spin Effects in Electron-Atom Collision Processes .... H. Kleinpoppen
Electronic Clocks and Watches .... A. Gnadinger
Review of Hydromagnetic Shocks and Waves .... A. Jaumotte and Hirsch
Beam Waveguides and Guided Propagation .... L. Ronchi
Recent Developments in Electron Beam Deflection Systems .... E. G. Ritz, Jr.
Seeing with Sound .... A. F. Brown
Large Molecules in Space .... M. and G. Winnewisser
Recent Advances and Basic Studies of Photoemitters .... H. Timan
Application of the Glauber and Eikonal Approximations to Atomic Collisions .... F. T. Chan, W. Williamson, G. Foster, and M. Lieber
Josephson Effect Electronics .... M. Nisenoff
Signal Processing with CCDs and SAWs .... W. W. Brodersen and R. M. White
Flicker Noise .... A. van der Ziel
Present Stage of High Voltage Electron Microscopy .... B. Jouffrey
Noise Fluctuations in Semiconductor Laser and LED Light Sources .... H. Melchior
X-Ray Laser Research .... Ch. Cason and M. Scully
Ellipsometric Studies of Surfaces .... A. V. Rzhanov
Medical Diagnosis by Nuclear Magnetism .... G. J. Béné
Energy Losses in Electron Microscopy .... B. Jouffrey
The Impact of Integrated Electronics in Medicine .... J. D. Meindl
Design Theory in Quadrupole Mass Spectrometry .... P. Dawson
Ionic Photodetachment and Photodissociation .... T. M. Miller
Electron Interference Phenomena .... M. C. Li
Electron Storage Rings .... D. Trines
Radiation Damage in Semiconductors .... N. D. Wilsey
Solid-State Imaging Devices .... E. H. Snow
Particle Beam Fusion .... A. J. Toepfer
Resonant Multiphoton Processes .... P. Lambropoulos
Magnetic Reconnection Experiments .... P. J. Baum
Cyclotron Resonance Devices .... R. S. Symons and H. R. Jory
The Biological Effects of Microwaves .... H. Fröhlich
Advances in Infrared Light Sources .... Ch. Timmermann
Heavy Doping Effects in Silicon .... R. Van Overstraeten
Spectroscopy of Electrons from High Energy Atomic Collisions .... D. Berenyi
Solid Surfaces Analysis .... M. H. Higatsberger
Surface Analysis Using Charged Particle Beams .... F. P. Viehböck and F. Rüdenauer
Low Energy Atomic Beam Spectroscopy .... E. M. Hörl and E. Semerad
Sputtering .... G. H. Wehner
Photovoltaic Effect .... R. H. Bube
Electron Irradiation Effect in MOS Systems .... J. N. Churchill, F. E. Holmstrom, and T. W. Collins
Light Valve Technology .... J. Grinberg
High Power Lasers .... V. N. Smiley
Visualization of Single Heavy Atoms with the Electron Microscope .... J. S. Wall
Spin Polarized Low Energy Electron Scattering .... D. T. Pierce and R. J. Celotta
Defect Centers in III-V Semiconductors .... J. Schneider and V. Kaufmann
Atomic Frequency Standards .... C. Audoin
Interfaces .... M. L. Cohen
Reliability .... H. Wilde
High Power Millimeter Radiation from Intense Relativistic Electron Beams .... T. C. Marshall and S. P. Schlesinger
Solar Physics .... L. E. Cram
Auger Electron Spectroscopy .... P. Holloway
Fiber Optic Communication Systems .... P. W. Baier and M. Pandit
Microwave Imaging of Subsurface Features .... A. P. Anderson
Novel MW Techniques for Industrial Measurements .... W. Schilz and B. Schiek
Diagnosis and Therapy Using Microwaves .... M. Gautherie and A. Priou
Electron Scattering and Nuclear Structure .... G. A. Peterson
Electrical Structure of the Middle Atmosphere .... L. C. Hale
Microwave Superconducting Electronics .... R. Adde

Supplementary Volumes:

Image Transmission Systems .... W. K. Pratt
High-Voltage and High-Power Applications of Thyristors .... G. Karady
Applied Corpuscular Optics .... A. Septier
Microwave Field Effect Transistors .... J. Frey

Volume 50:

History of the Revolution in Electronics .... P. Grivet
Early History of Accelerators .... M. S. Livingston
Power Electronics at General Electric 1900 to 1950 .... J. E. Brittain
History of Thermoelectricity .... B. S. Finn
Evolution of the Concept of the Elementary Charge .... L. Marton
The Technological Development of the Short-wave Radio .... E. Sivowitch
History of Photoelectricity .... W. E. Spicer
From the Flat Earth to the Topology of Space-Time .... H. F. Harmuth
History of Noise Research .... A. van der Ziel
Ferdinand Braun: Forgotten Forefather .... Ch. Süsskind
As in the past, we have enjoyed the friendly cooperation and advice of many friends and colleagues. Our heartfelt thanks go to them, since without their help it would have been almost impossible to issue a volume such as the present one.
L. MARTON C. MARTON
Negative Electron Affinity Photoemitters

H. ROUGEOT AND C. BAUD

Laboratoire de Recherches, Thomson-CSF Division Tubes Electroniques, St. Egrève, France
I. Introduction .... 1
II. Photocathodes Using Negative Electron Affinity .... 2
III. Reflection Photocathodes .... 9
IV. Photoemission by Transmission .... 17
V. Angular Energy Distribution .... 21
VI. NEA Photocathode Technology .... 22
   A. The Material and Growing Technology
   B. Transmission Photocathodes with Heteroe...
   C. Investigation of Material Characteristics
   D. High-Vacuum Enclosures .... 29
VII. Photoemission Stability and Dark Current .... 31
VIII. Conclusion .... 32
References .... 33
I. INTRODUCTION

In 1887, Hertz discovered that the surface of a conductor emitted negatively charged particles when irradiated with ultraviolet light. Two years later, a similar effect was observed with visible light and alkali metals by Elster and Geitel. For many years, research was concerned with investigating the fundamental nature of this phenomenon, being aimed at explaining the interaction between electromagnetic waves and corpuscles, and at verifying the photon-to-free-electron conversion concept that was proposed by Einstein in 1905. Then, in 1929, the silver-oxygen-cesium photocathode was discovered.

From 1930 onward, photoemission technology progressed rapidly. Quantum yields were improved, and the sensitivity was extended into the red end of the visible spectrum by the introduction of antimony-cesium, bismuth-gold-oxygen-cesium, and multialkaline photocathodes. The discovery of these new materials and the performance improvements were purely empirical. It was only after the rapid expansion of and major progress in solid-state
physics that photocathodes were considered as semiconductors. It was the semiconductive nature of the photoemissive layers that permitted explaining the photon-electron interaction. Emphasis was then placed on finding practical applications for photoemission and on creating models of the potential profiles across the photocathode-vacuum interface that were compatible with experimental results. Numerous investigations were carried out, but no matter what type of material was used for the layer, the energy required to create photoelectrons was always found to be greater than the ionization energy in the layers. The difference between the two values corresponds to the photocathode material's positive affinity for electrons. This affinity must be reduced to obtain improvements in photocathode yield and extension of red sensitivity.

In 1965, Scheer and Van Laar discovered the phenomenon of negative electron affinity (NEA) at the surface of semiconductors. They found that by depositing a layer of cesium on a crystal of p-type gallium arsenide, followed by a layer of cesium and oxygen, photoemission with an excellent quantum yield could be obtained up to the cutoff wavelength (absorption limit) of the material. This limit to the photoemission indicated that the excitation energy did not depend on the surface properties, but on the forces between the electrons and atoms of the semiconductor. From this it was deduced that the vacuum level at the surface of the semiconductor had been lowered below that of a free electron in the conduction band of the crystal. This drop is a measure of the negative electron affinity at the surface of the material. With such an affinity, all free electrons can escape from the crystal without any supplementary excitation. Apker et al. (1948) had already tried to obtain photoemission with vacuum-deposited semiconductor films, but they had not discovered negative electron affinity.

II. PHOTOCATHODES USING NEGATIVE ELECTRON AFFINITY
Consider a photocathode (Fig. 1) consisting of, for example, a semiconducting monocrystal of gallium arsenide (GaAs) with an activating layer of cesium and oxygen. With a photocathode designed for operation by "reflection" (emission from the irradiated face), an incident photon traverses the activating layer with a very low probability of being absorbed and penetrates the semiconductor. There, after a path length that depends on its energy and on the absorption coefficient of the material, the photon is stopped and creates a free electron-hole pair. After a certain number of collisions, one of these charges, a hot electron, can thermalize to near the lower level of one of the conduction bands that are available to it. Two such bands, known as X and Γ, exist in gallium arsenide.

FIG. 1. GaAs photocathode operating by reflection.

Each band keeps the electron for a specific time, during which it may "diffuse" to the surface of the semiconductor. To escape, an electron that is near the surface must penetrate a surface perturbation zone, which controls the surface electron-emission yield. Several hypotheses have been proposed to explain the potential profile of the perturbation zone. The basis for these hypotheses is given in Fig. 2, which represents the energy band configurations in gallium arsenide, a thin layer of ionized cesium, and a layer of cesium oxide. The bending of the bands at the crystal surface is due to surface centers that are initially electrically neutral, but which have electrons that are weakly bound as compared to those of the lattice molecules. These weakly bound electrons migrate to acceptor centers (impurities that have been deliberately introduced into the lattice), so creating a negative space charge near the surface, and causing the band bending and reduction in vacuum level shown in Fig. 2. This may be the situation before deposition of the activating layer.

As a first hypothesis, suppose that this layer consists of cesium oxide, considered as a semiconductor, whose energy bands (Fig. 2) are as proposed by Uebbing and James (1970a,b). Because the electron affinity of cesium oxide is lower than that of gallium arsenide, we can assume that a flux of electrons will pass from the first into the second material. This will stop when the gallium arsenide's valency band, which constitutes an important electron reservoir, reaches the Fermi level. The resulting potential profile is shown in Fig. 3. Sonnenberg (1969a,b) was apparently the first to propose this model, which is also found in work by Bell and Spicer (1970). Notice in particular that the vacuum level, which depends on the cesium
FIG. 2. Band configuration in (a) GaAs, (b) ionized cesium, (c) cesium oxide.
oxide, has been lowered to a value below the bottom of gallium arsenide's conduction band, thus creating negative electron affinity conditions. In other words, an electron excited from the valency band to the conduction band of the semiconductor will be in an energy level above that of vacuum. However, this model does not completely explain many experimental observations, including those of the authors. It indicates that 5 to 6 nm of Cs2O, considered as a semiconductor, would be required to provide maximum band bending and create negative electron affinity. In reality, a whole or partial monomolecular Cs-O layer is sufficient. Emission already starts with a partial monatomic layer of cesium and, after addition of an equivalent number of oxygen atoms, this effect is increased by a factor of 20. Further improvement can be obtained by adding successive
FIG. 3. Heterojunction model.
layers of cesium and oxygen until a maximum is reached for a total thickness of a few layers. This noticeable improvement in photoemission, obtained by adding oxygen atoms to the cesium-covered surface of semiconductors, was first pointed out by Turnbull and Evans (1968), who alternated layers of cesium and oxygen to obtain the equivalent of 4 to 10 monomolecular layers of cesium. James et al. (1969) obtained optimum results with six Cs-O layers, whereas Garbe and Frank (1969) used three to four. The increased yields marked the departure from the explanation and model given by Scheer and Van Laar.

According to Uebbing and James (1970a,b), NEA could be explained by the existence of a heterojunction between two semiconductors: the substrate (GaAs in this case) and n-type cesium oxide (Cs-O) with a very low electron affinity (0.85 eV). These authors were, in fact, clarifying a hypothesis that was first suggested by Sonnenberg (1969a,b), developed by Bell and Spicer (1970), and further developed by the same authors and others in several articles (Bell et al., 1971; Milton and Baer, 1971). By using quantitative chemical analysis in solution, Sommer et al. (1970) showed that the Cs-O layer giving optimum photoemission was equivalent to four to five monatomic layers of Cs. However, because of the high density of the Cs2O compound, they concluded that the surface deposit of this oxidized form was in fact a monomolecular layer. So they contested the existence of a Cs2O-GaAs heterojunction. James and Uebbing (1970a,b), working on the low-bandgap materials GaSb and GaAsSb, found new arguments. In particular, they showed that the NEA could be increased by depositing many successive layers of Cs-O. Brown et al. (1971) found conflicting evidence, showing once again that a simple monomolecular layer was sufficient to obtain optimum photoemission from GaAs, a material that also has a low band gap.

They were more explicit about the role played by oxygen in the hypothesis of a surface dipole layer, suggesting that it acted as a shield between the surface cesium ions, so permitting an increase in dipole moment. As an alternative, they envisaged the possibility that oxygen atoms introduced themselves into the Cs-semiconductor interface. Considering the structures of Fig. 2 (GaAs and Cs) and the low electronic affinity of Cs, we could imagine that the surface layer of Cs becomes ionized and forms a dipole layer with the GaAs by giving up its valency electrons to the semiconductor. This dipole layer would shift the vacuum level to the level of the bottom of the conduction band of the semiconductor, as shown in Fig. 4. This hypothesis was proposed by Scheer and Van Laar (1965). The role of the first layer of oxygen would not, in this case, be to form cesium oxide, Cs2O. Instead, because of the dangling surface bonds of the GaAs in the interstices of the cesium layer, it increases the dipole effect by
FIG. 4. Approximate band profiles.
introducing itself into the interstices. Because the oxygen molecule is small, the image force, which depends on the distance between the substrate and the absorbed oxygen layer, is large. The vacuum level is lowered by a value corresponding to this image force.

Although NEA photocathodes have since been made from many different materials, no simple explanation has been accepted by all the specialists. However, it is generally accepted that the Cs-O surface layer is particularly rich in cesium (Sonnenberg, 1971, 1972a,b; Fisher et al., 1972). The structure of this deposit was examined by Uebbing and James (1970b), Goldstein (1973), and Martinelli (1973a, 1974), who concluded that it is amorphous, whereas in bulk form Cs2O, Cs3O, Cs4O3, and Cs7O are crystalline (Simon, 1973) with well-defined structures (Simon, 1971; Simon and Westerbeck, 1972); Cs2O is a semiconductor with a band gap of 2 eV (Borziak et al., 1956; Hoene, 1970; Heiman et al., 1973); Cs3O has a distinctly metallic nature (Tsai et al., 1956). Clark (1975) suggested that the Cs-O activating layer be considered to be an amorphous phase, Cs2+xO, where, depending on the value of x, the structure varies uniformly between the semiconducting Cs2O structure and the metallic Cs2+xO (x > 0) structure, the material having a distinct metallic nature for x > 0.1. He deduced that the layer can easily pass from the metallic to the nonmetallic state, so that two cases must be considered: a heterojunction of two semiconductors (Cs2O and a III-V material) and a Schottky dipole (Cs3O and a III-V material).
Using electron spectroscopy with ultraviolet-irradiated oxidized cesium on a silver substrate, Ebbinghaus et al. (1976) concluded that the stable compound is Cs11O3, this being similar to the ideas of Goldstein so far as the nature of the oxidized compound is concerned. However, this assumes that the composition is unaffected by the substrate, this being compatible with a heterostructure of the form GaAs/Cs11O3. According to Fisher et al. (1972) and Sommer et al. (1970), optimum emission with a III-V material is obtained with a monomolecular Cs3O layer, the interface barrier potential being given by the difference between the electronic affinity of the III-V material and that of the Cs3O, reduced by the image force effect. Fisher et al. (1972) calculated the effect of this barrier on photoemission by assuming it to be rectangular and taking the case of GaxIn1-xAs. James et al. (1969, 1971a,b) and Clark (1975) have pointed out that if the theory of a III-V/Cs3O Schottky dipolar barrier is to be examined in greater detail, the surface states of the semiconductors must be taken into account because they can partially mask the difference in electronic affinity of the two materials.

Bell (1973) assembled all the ideas on NEA current in 1973 in a work entitled "Negative Electron Affinity Devices." Sommer (1973c) has also prepared a similar, excellent work. Whether the superficial structure is a dipole or a heterojunction, the photoemission yield is due to a photoemission probability that was described by Fisher et al. (1974). These authors reconstructed a potential profile (not concerning themselves with its origin) to explain the observed emission probabilities.

Some workers have tried to go beyond the hypothesis stage and to observe the surface structure directly. Ranke and Jacobi (1973) used Auger spectrometry and thermal desorption to study the surface-bonding forces of gallium arsenide for different orientations. Their work is very interesting as it establishes desorption conditions under ultrahigh vacuum. Among other things, it shows that the Fermi level has a fixed value of 0.5 eV on the (1̄1̄1̄) (As) face of GaAs. Although these values are of use in establishing a potential profile for the activated material, they do not permit one to decide if the cesium-covered surface is of a dipole or heterojunction nature. Using low-energy electron diffraction (LEED), Papageorgopoulos and Chen (1973) studied the absorption of Cs and H2 on W(100) and showed the existence of a regular structure. The minimum value of the work function corresponds to a regular Cs structure evenly deposited on a regular W structure. When only the Cs has a regular structure and when its lattice constant does not correspond to that of W, the work function increases. Although Martinelli (1974) found, by using
LEED, that Cs on silicon also has a regular structure, they could find no regular structure with any orientation of GaAs. However, Mityagin et al. (1973) and Derrien et al. (1977) contest this fact, indicating a regular structure for cesium on GaAs(ll0). Derrien et al. (1975) found an amorphous structure for cesium on Gap. In an article entitled “Absorption Kinetics of Cs on GaAs,” Smith and Huchital (1972) concluded that for Cs on GaAs, two states of absorption exist. The first has a sticking coefficient of one and an ionization of the absorbed cesium atoms which have a low work function and high bond strength. The second has a low sticking function, coinciding with the establishment of a second layer. The bonds weaken and the work function increases. Paradoxically, the second state starts to appear before the first state is completed. Using flash desorption, Goldstein and Szostak (1975) reached the same conclusion. They found, experimentally, that for covering factors of over 0.5, the electron affinity increases and photoemission diminishes. They admit that once this value has been passed, the polarization of the surface dipoles diminishes because of electrostatic repulsion between the cesium atoms, and the work function goes to a minimum. At the same time, it is found that thermal desorption of cesium can be carried out at lower temperatures, perhaps because the cesium atoms are not so strongly bonded. After addition of oxygen to the Cs layer, flash desorption indicates an increase in the quantity of strongly bonded cesium atoms. Oxygen thus increases cesium bonding strengths. Another interesting observation made by these authors is that oxygen can be desorbed in the G a 2 0state. The fact that there is no desorption in the Cs,O, state tends to indicate that there is no bonding in the Cs20 surface stage. Auger analysis of the surface confirms that oxygen does not start to escape until nearly all of the cesium has gone. 
The desorption of oxygen in the Ga2O stage does not, however, imply that oxygen fixes itself to the surface in the state Ga2O. These conclusions may appear to contradict those of Gregory et al. (1974), who noted that if oxygen is supplied to a GaAs surface, an oxygen-arsenide bond is formed, leaving the gallium free. In this way, they explain why the Fermi level at the surface of GaAs does not change when oxygen is added. The surface states due to the gallium atoms keep this characteristic unchanged during the oxidation. Desorption of oxygen in the form Ga2O could thus be due to the formation of the chemical compound by heating the GaAs above 400°C. On the other hand, Gregory et al. (1974) found, like Uebbing and James (1970b) and Bell et al. (1971), that cesium coating of p-type GaAs pinned the Fermi level at 0.5 eV above the valency band. This seems to be widely confirmed by other authors (Van Laar and Scheer, 1967; Dinan et al., 1971).
III. REFLECTION PHOTOCATHODES

The overall quantum yield of a reflection photocathode was established by Baud and Rougeot (1976), working from hypotheses due to James et al. (1969) and Kressel and Kupsky (1966). It is given by the expression

ρ(E) = (1 - R){[FΓPΓαLΓ/(1 + αLΓ) + FXPXαLX/(1 + αLX)]e^(-αx0) + T(E)(1 - e^(-αx0))}   (1)
where E is the energy of the incident photon, R is the reflection coefficient, PX, PΓ are the electron escape probabilities for the X and Γ conduction bands, α is the optical absorption coefficient, LX, LΓ are the electron diffusion lengths in the X and Γ conduction bands, FX, FΓ are the fractions of total electron excitation occurring in the X and Γ conduction band minima, x0 is the thickness of the space-charge zone, and T(E) is the escape probability for an electron of energy E. One can consider that the electrons have a Boltzmann distribution in each of the conduction bands, X and Γ:

n(ΔE) ∝ exp(-ΔE/kT)   (2)
where ΔE is the kinetic energy of the electron, k is Boltzmann's constant, and T is the absolute temperature. Electrons entering the space-charge zone are accelerated toward the surface by the internal field. They become hot electrons and interact with phonons. The electron will have a mean free path lp, and an energy ΔEp will be lost at every electron-phonon interaction. We will take lp = 43 Å and ΔEp = 0.036 eV, values given for GaAs by Kressel and Kupsky (1966). The thickness of the space-charge zone is given by

x0 = [2εε0Vb/(qn_a)]^(1/2)   (3)
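For orientation, eq. (3) is easy to evaluate numerically. In the sketch below the relative permittivity of 12.9 for GaAs is an assumed textbook value; the 0.45-eV band bending and the doping levels are those used in the text and figures.

```python
import math

# Space-charge (band-bending) zone thickness, eq. (3):
#   x0 = sqrt(2 * eps_r * eps0 * Vb / (q * na))
q = 1.602e-19          # electron charge, C
eps0 = 8.854e-12       # vacuum permittivity, F/m
eps_r = 12.9           # GaAs relative permittivity (assumed value)
Vb = 0.45              # band bending, V (value quoted in the text for GaAs)

def space_charge_thickness(na_cm3):
    """Return x0 in metres for a doping level given in cm^-3."""
    na = na_cm3 * 1e6  # convert cm^-3 to m^-3
    return math.sqrt(2.0 * eps_r * eps0 * Vb / (q * na))

x0 = space_charge_thickness(5e18)  # doping level used in Figs. 8-10
```

For n_a = 5×10¹⁸ cm⁻³ this gives a zone on the order of 10 nm, and, as expected from eq. (3), heavier doping narrows the zone.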
where n_a is the doping level, Vb is the band bending (0.45 eV for GaAs), εε0 is the dielectric constant, and q is the charge of an electron. The solution given by Williams and Simon (1967) and Bartelink et al. (1963) is used to calculate the surface energy distribution of photoelectrons that have traversed the band-bending zone. In this treatment one defines

E0 = q²F²lp²/3ΔEp   (4)
where F is the electrostatic field in the space-charge zone, E(0) is the kinetic energy of electrons at the surface, and Ep is the energy lost by an electron; the energy distribution is of the form

N(Ep) ∝ exp(-Ep/E0)   (5)

This distribution must be normalized. Applied to the X and Γ conduction bands, it represents the probability of finding an electron from one of these bands at an energy level E at the surface of the material before escape into the vacuum:

E = EX,Γ - Ep   (6)

where EX,Γ is the minimum energy of the X and Γ bands at the bottom of the conduction band after bending. The probabilities of emission of an electron from the X and Γ bands are

PΓ = Σi NΓ(Ei)T(Ei)   (7)

PX = Σi NX(Ei)T(Ei)   (8)
where NΓ(Ei) and NX(Ei) are the probabilities of finding an electron from the Γ and X bands at an energy level Ei. The transmission probability, T(E), was worked out by Baud and Rougeot (1976) in the following manner. Applying quantum mechanical principles to regular structures permits associating a wave, defined by Schrödinger's equation, to an electron:

d²ψ/dx² + (2m/ℏ²)[E - V(x)]ψ = 0   (9)
where V is the potential perturbation encountered by an electron on its trajectory, h is Planck's constant, E is the energy of an electron at the surface of a material, and m is the mass of the electron. Integrating the equation for the three zones in Fig. 4, we have

zone 1: ψ1 = a1 exp(ik1x) + b1 exp(-ik1x),  k1² = (2m/ℏ²)E

zone 2: ψ2 = a2 exp(ik2x) + b2 exp(-ik2x),  k2² = (2m/ℏ²)[E - q(V1 + V2)]

zone 3: ψ3 = a3 exp(ik3x) + b3 exp(-ik3x),  k3² = (2m/ℏ²)[E - q(V1 - V0)]
where ℏ = h/2π, and m is the electron mass. Zero energy is taken as being the bottom of the conduction band after bending. The constants a1, b1, a2, b2, a3, and b3 are obtained by equating the wave functions and their derivatives at the limits of the zones. The emission probability (probability of a photoelectron of energy E being emitted into the vacuum) is given by

T = (k3/k1)|a3/a1|²   (10)

where we have

T = 4k1k3k2² / [(k1k2 + k2k3)² + (k1² + k2²)(k2² + k3²)sinh²(k2a0)]   (11)

where a0 is the thickness of the Cs-O dipole layer (see Table I).

TABLE I
SOME NUMERICAL VALUES OF EMISSION PROBABILITY T AS A FUNCTION OF ELECTRON ENERGY(a)

Electron energy (eV) relative       Transmission
to top of the valency band          coefficient, T
1.41                                0.44319
1.50                                0.48452
1.60                                0.52429
1.73                                0.56834
1.80                                0.58915
1.90                                0.61595
2.00                                0.63983
2.10                                0.66123
2.20                                0.68052
2.30                                0.69800
2.40                                0.71390
2.50                                0.73844
2.60                                0.74177
2.70                                0.75405
2.80                                0.76538

(a) For a0 = 0.1 nm and barrier height, V1 + V2, of 2.8 eV (see Fig. 4).
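The rectangular-barrier transmission of eq. (11) can be sketched numerically as follows. The energy partition between the zones is an illustrative assumption (the text does not specify it here); only the 2.8-eV barrier and the 0.1-nm thickness are taken from Table I.

```python
import math

HBAR = 1.055e-34   # reduced Planck constant, J*s
M_E = 9.109e-31    # electron mass, kg
EV = 1.602e-19     # J per eV

def k_of(E_eV):
    """Wave number for kinetic energy E (eV): k = sqrt(2mE)/hbar."""
    return math.sqrt(2.0 * M_E * E_eV * EV) / HBAR

def transmission(E1_eV, E3_eV, barrier_eV, a0_m):
    """Rectangular-barrier transmission in the form of eq. (11):
    T = 4 k1 k3 k2^2 / [(k1 k2 + k2 k3)^2
        + (k1^2 + k2^2)(k2^2 + k3^2) sinh^2(k2 a0)],
    with k2 evaluated from the energy deficit below the barrier top."""
    k1 = k_of(E1_eV)                 # kinetic energy inside the solid
    k3 = k_of(E3_eV)                 # kinetic energy in vacuum
    k2 = k_of(barrier_eV - E1_eV)    # decay constant inside the barrier
    s2 = math.sinh(k2 * a0_m) ** 2
    num = 4.0 * k1 * k3 * k2**2
    den = (k1 * k2 + k2 * k3) ** 2 + (k1**2 + k2**2) * (k2**2 + k3**2) * s2
    return num / den

# Illustrative (assumed) energies: 0.2 eV inside, 0.7 eV in vacuum.
T_thin = transmission(0.2, 0.7, 2.8, 0.1e-9)   # a0 = 0.1 nm, as in Table I
T_thick = transmission(0.2, 0.7, 2.8, 0.3e-9)  # a0 = 0.3 nm
```

The sketch reproduces the qualitative behavior discussed around Fig. 8: thickening the Cs-O dipole layer sharply reduces the tunneling transmission.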
Figure 5 shows the normalized energy distribution of surface electrons as a function of thermalization energy in the space-charge zone for several different doping levels. These curves are applicable for electrons from either the X or the Γ conduction band.
FIG. 5. Normalized energy distribution of surface electrons as a function of doping N (N = 2×10¹⁸ to 10¹⁹ cm⁻³).
Figure 6 shows the quantum yield (reflection operation) as a function of wavelength for different doping levels. Activating layer thickness a0 was 0.1 nm and the barrier height V2 was 2.35 eV. Figure 7 shows the quantum yield (reflection operation) as a function of wavelength for different diffusion lengths, thickness and barrier height being the same as in Fig. 6. Figure 8 shows the quantum yield (reflection operation) as a function of wavelength for different activating layer thicknesses (a0 = 0.1, 0.15, 0.2 nm), other parameters having the following values: n_a = 5×10¹⁸ cm⁻³, LΓ = 3 μm, V2 = 2.35 eV. Figure 9 shows the quantum yield (reflection operation) as a function of wavelength for different barrier heights (V2 = 2.35, 2.05, 1.75, and 1.45 eV),
FIG. 6. Quantum yield as a function of wavelength for different doping levels (2×10¹⁸ to 3×10¹⁹ cm⁻³; Cs-O thickness a0 = 0.1 nm, barrier height V2 = 2.35 eV) (photoemission by reflection).
FIG. 7. Quantum yield as a function of wavelength for different diffusion lengths L (photoemission by reflection).
FIG. 8. Quantum yield as a function of wavelength for different Cs-O thicknesses a0 (0.1, 0.15, 0.2, 0.3 nm; doping n_a = 5×10¹⁸ cm⁻³, diffusion length L = 3 μm, barrier height V2 = 2.35 eV) (photoemission by reflection).
FIG. 9. Quantum yield as a function of wavelength for different barrier heights V2 (doping n_a = 5×10¹⁸ cm⁻³, diffusion length L = 3 μm, Cs-O thickness a0 = 0.1 nm) (photoemission by reflection).
other parameters having the following values: a0 = 0.1 nm, n_a = 5×10¹⁸ cm⁻³, LΓ = 3 μm. Figures 10 and 11 are comparisons of theoretical values and experimental results due to Baud and Rougeot (1976). The substrate of the NEA photocathode was gallium arsenide, obtained by liquid-phase epitaxy. The theoretical curves were obtained by selecting values for the four parameters as explained below.
FIG. 10. Comparison of theoretical and observed spectral sensitivities (Cs-O thickness a0 = 0.1 nm; SL = 1200 μA/lm) (photoemission by reflection).
First, the thickness of the space-charge zone was deduced from the doping level, taking the spontaneous polarization of the 111B face of GaAs to be the generally accepted value of 0.45 V. It should be noted that the value taken for the spontaneous polarization does not greatly affect the theoretical curves. Second, the height of the potential barrier that must be traversed by the emerging electrons was taken to be 1.35 eV in all cases, in conformity with the contribution of the electron affinity of the different materials present, GaAs 111B and Cs-Cs2O (see Section II). Third, the thickness of this barrier, of utmost importance according to the theory, was evaluated to be 0.1 nm for a clean surface before activation, but was corrected for each figure to obtain the best match between theory and experiment. Fourth, the diffusion length was chosen whose value gives the theoretical curves the characteristic shape that is also found in most of the corresponding experimental curves.

FIG. 11. Comparison of theoretical and observed spectral sensitivities (Cs-O thickness a0 = 0.1 nm; SL = 1200 μA/lm) (photoemission by reflection).

Figure 10 shows the excellent agreement that is obtained over the whole spectral range. The close similarity of theoretical and experimental spectral sensitivities is also shown by the curves in Fig. 11, being more obvious at low sensitivities because of the falloff in wavelength response. The pioneers in NEA photoemission (Scheer and Van Laar, 1965; Apker et al., 1948; James et al., 1968) were first concerned with depositing cesium on clean semiconductor substrates, obtained by vacuum cleavage. Turnbull and Evans (1968) were able to obtain 500 μA/lm. Using ion bombardment, Garbe and Frank (1969) obtained sensitivities of several hundred microamperes per lumen. Many workers were also primarily concerned about the purity of the cesium and the oxygen, and about the deposition conditions (speed, temperature, etc.), the work of Fisher (1974) being noteworthy. However, it was quickly realized that the crystalline quality of the material was the most important factor. Sensitivities of better than 2000 μA/lm are now obtainable in reflection operation; see James et al. (1971c) and Olsen et al. (1977).
IV. PHOTOEMISSION BY TRANSMISSION

Figure 12 is a schematic cross section of a photocathode for transmission operation. It shows a transparent substrate carrying the active semiconductor (GaAs, for example), which is covered by a Cs-O layer that causes negative electron affinity.

FIG. 12. Photocathode operating by transmission (incident radiation enters through the transparent substrate; the Cs-O activating layer covers the active semiconductor).
Some of the conditions that must be satisfied for operation by reflection and operation by transmission are common to the two techniques. However, transmission operation has some extra requirements. Photons traversing the transparent support will excite photoelectrons on the illuminated surface of the semiconductor. They must then diffuse all the way through this active material without recombining and must arrive at the other face with a significant probability of escape. It can be assumed that virtually all of the electrons are thermalized into the bottom of the conduction band because of the thickness of the active layer as compared with the thermalization length. Expressions giving the photoemissive yield in transmission operation were established by Antypas et al. (1970) and Allen (1971). For permanent excitation, the density Δn of these electrons at a point of distance x from the illuminated face is given by (see Fig. 12)

D d²Δn/dx² - Δn/τ + (1 - R)(N/A)αe^(-αx) = 0   (12)

where the three terms represent, respectively, the electrons diffusing at distance x, the electrons recombining at x, and the electrons excited at x.
where D is the diffusion coefficient of the electrons, τ is the lifetime of the electrons, R is the reflection coefficient of the GaAs surface, α is the optical absorption coefficient of GaAs, N is the incident photon flux, and A is the illuminated area. An electric field that instantly absorbs any electrons approaching it exists to a depth of several nanometers at the emission surface. Many of these electrons are emitted into the vacuum. The electron concentration at the emitting face x = t is therefore

Δn(t) = 0   (13)

Those which diffuse toward the illuminated face recombine at a speed S, and their flux toward this face is given by

D(dΔn/dx)|x=0 = SΔn(0)   (14)
The flux of electrons diffusing toward the emitting surface is given by

I = -D(dΔn/dx)|x=t   (15)

If Ts is the transparency of the support and P the probability of emission, the photoemissive quantum yield is

ρ = IPTs/(N/A)   (16)

which becomes

ρ = PTs(1 - R)[αL/(α²L² - 1)] {[(αD + S) - e^(-αt)(S cosh(t/L) + (D/L) sinh(t/L))] / [(D/L) cosh(t/L) + S sinh(t/L)] - αLe^(-αt)}   (17)
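Eq. (17) can be evaluated directly. In the sketch below the parameter values (escape probability, reflection coefficient, diffusion coefficient, and so on) are plausible assumptions in the spirit of Figs. 13-16, not values fixed by the text; the second call illustrates how a poor interface (large S) depresses the yield.

```python
import math

def transmission_yield(P, Ts, R, alpha, L, D, S, t):
    """Photoemissive quantum yield in transmission, eq. (17).
    Units: alpha [cm^-1], L and t [cm], D [cm^2/s], S [cm/s]."""
    aL = alpha * L
    ch, sh = math.cosh(t / L), math.sinh(t / L)
    e = math.exp(-alpha * t)
    bracket = ((alpha * D + S) - e * (S * ch + (D / L) * sh)) \
              / ((D / L) * ch + S * sh) - aL * e
    return P * Ts * (1.0 - R) * aL / (aL**2 - 1.0) * bracket

# Assumed illustrative values: P = 0.4, fully transparent support,
# R = 0.3, alpha = 1e4 cm^-1, L = t = 3 um, D = 50 cm^2/s.
rho = transmission_yield(P=0.4, Ts=1.0, R=0.3, alpha=1e4,
                         L=3e-4, D=50.0, S=1e3, t=3e-4)
rho_bad_interface = transmission_yield(P=0.4, Ts=1.0, R=0.3, alpha=1e4,
                                       L=3e-4, D=50.0, S=1e6, t=3e-4)
```

With these assumed numbers the yield comes out in the 10-20% range, and raising the interface recombination speed by three orders of magnitude roughly halves it, consistent with the discussion of parameter S below.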
1. Diffusion Length, L
In transmission photocathodes, the diffusion length must be as long as possible, so low doping with a very carefully chosen doping material is preferable. However, a limited space-charge zone at the emission surface calls for a high concentration of free carriers. The doping must therefore be a compromise.
The diffusion length depends to a large extent on the crystalline purity of the material, this having been discussed by Abrahams et al. (1971). Figure 13 shows theoretical curves of quantum yield for an escape probability of 0.4 and total transmission in the substrate. Target thickness was taken to be 3 μm and recombination speed at the interface to be 10³ cm/sec. These curves are for diffusion lengths L of 1, 3, 5, and 7 μm.
FIG. 13. Quantum yield as a function of wavelength for different diffusion lengths L, target thickness of 3 μm (photoemission by transmission).
2. Thickness of the Active Layer, t

A high quantum yield will be obtained if the thickness of the active layer is less than the diffusion length. Figures 13, 14, and 15, for thicknesses of 3, 1, and 6 μm, respectively, show that red sensitivity drops if the photocathode becomes too thin. The target thickness must be longer than the absorption length of the photons, but shorter than the diffusion length of the electrons:

1/α < t < L
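As a quick numerical illustration of this thickness window, the sketch below uses assumed values of α and L for GaAs near the band edge, not figures quoted in the text.

```python
# The condition 1/alpha < t < L bounds the useful active-layer thickness.
alpha = 1e4            # optical absorption coefficient, cm^-1 (assumed)
L = 3e-4               # electron diffusion length, cm (3 um, assumed)
t = 2e-4               # candidate layer thickness, cm (2 um)

t_min = 1.0 / alpha    # photon absorption length, cm
t_max = L              # electron diffusion length, cm
inside_window = t_min < t < t_max
window_um = (t_min * 1e4, t_max * 1e4)   # window expressed in micrometres
```

For these assumptions the window runs from about 1 μm to 3 μm, so a 2-μm layer satisfies both constraints.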
FIG. 14. Quantum yield as a function of wavelength for different diffusion lengths L, target thickness of 1 μm (photoemission by transmission).
FIG. 15. Quantum yield as a function of wavelength for different diffusion lengths L, target thickness of 6 μm (photoemission by transmission).
3. Quality of the Substrate/Active-Layer Interface: Recombination Speed, S

An interface with a high recombination speed will absorb photoelectrons to the detriment of emission at the other face. The importance of this parameter S is shown in Fig. 16. The quality of an interface can thus be appreciated by looking at the form of the quantum yield curve. It depends mainly on the crystal structure. Constant quantum yield over a large part of the spectrum indicates a low recombination speed. The quantum yield itself depends to a large extent on the emission probability and on the transparency of the support.
FIG. 16. Quantum yield as a function of wavelength for different recombination velocities S at the interface (diffusion length L = 3 μm, target thickness t = 3 μm) (photoemission by transmission).
V. ANGULAR ENERGY DISTRIBUTION

Csorba (1970) has calculated the resolution characteristics of a proximity focused tube incorporating an NEA photocathode. Assuming a perfectly flat photocathode and a Lambertian electron distribution with a spread of 2 eV, the modulation transfer function (MTF) is given by

MTF(νs) = exp[-12(EM/V)(νsd)²]   (18)
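As a sketch of how eq. (18) behaves, the following evaluates the MTF for assumed tube parameters; the emission energy, screen voltage, and gap are illustrative values, and the numeric factor 12 is taken as printed.

```python
import math

def proximity_mtf(nu, E_M, V, d):
    """Proximity-focus MTF in the form of eq. (18):
    MTF = exp[-12 (E_M / V) (nu * d)^2],
    with nu in cycles/mm, d in mm, E_M in eV, V in volts."""
    return math.exp(-12.0 * (E_M / V) * (nu * d) ** 2)

# Assumed illustrative values: 0.2 eV maximum emission energy,
# 5 kV screen accelerating voltage, 0.5 mm cathode-to-screen gap.
m10 = proximity_mtf(10.0, 0.2, 5000.0, 0.5)   # MTF at 10 cycles/mm
m40 = proximity_mtf(40.0, 0.2, 5000.0, 0.5)   # MTF at 40 cycles/mm
```

The Gaussian-like rolloff with spatial frequency, and its dependence on EM/V and on the gap d, is what makes low tangential emission energy attractive for proximity focused tubes.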
where EM is the maximum emission energy in the Lambertian distribution, νs is the spatial frequency, V is the screen accelerating voltage, and d is the cathode-to-screen separation. Under these conditions, it should be possible to obtain much higher resolutions than with similar tubes incorporating multialkaline photocathodes. Pollard (1972) has calculated the tangential component of emission energy and has shown that values of 1 to 2 meV may be envisaged for GaAs photocathodes. These values were confirmed by experiment (Pollard, 1972), using a LEED Auger angular analyzer, the emission half-angle being less than 5°. However, measurements made by Holeman et al. (1974) on proximity focused tubes gave tangential emission energies on the order of 100 meV. Bradley et al. (1977) analyzed these discrepancies between experimental results and suggested two reasons: measurement errors in the LEED Auger analyses for energies below 1 eV, and faceting of the emissive surface of the photocathode. In fact, Chen (1971) and Fisher and Martinelli (1974c) had already shown that no matter what the original orientation may be, heating causes "faceting" along the (110) planes of GaAs. The second point is thus the main reason for the high measured values (100 meV) (Bradley et al., 1977). Martinelli (1973b) has calculated the effect of this "faceting" on the resolution of photocathodes. The resolutions hoped for cannot be obtained in this case and are in fact similar to those of multialkaline photocathodes. The desorption heating must be optimized to reduce "faceting."

VI. NEA PHOTOCATHODE TECHNOLOGY

A. The Material and Growing Technology
The semiconducting detector of multialkaline photocathodes is formed by evaporation, usually onto glass, of the constituent parts. The technique for NEA photocathodes is completely different. Here, the starting point is a semiconducting material, surface treatment being used to give NEA properties. The crystal quality of the semiconductor must be extremely high so that its electrical properties are compatible with its function as a detector. Optimum doping and a large diffusion length are particularly important. Initially, vacuum-cleaved monocrystals were used (Scheer and Van Laar,
1965; Van Laar and Scheer, 1968). But it soon became apparent that epitaxial layers were preferable: doping is easier to control, and the structure lends itself better to transmission operation. The substrates are sliced from monocrystals that can be obtained in various ways: the Czochralski method, in which an oriented seed crystal draws the monocrystal vertically out of a liquid bath; the Bridgman technique, a horizontal drawing method using a crucible. Several techniques can be used to deposit the epitaxial layers: liquid phase, vapor phase, organometallic compounds, molecular beam. The last one is not really suitable for photoemitters, being best adapted to microwave devices. Liquid-phase epitaxy with GaAs uses a solution of molten gallium, and was developed by Nelson (1963) to obtain p-n junctions in GaAs. The principle of this technique is shown in Fig. 17: points A, B, and C are the fusion
FIG. 17. Phase diagram for GaAs.
points of gallium (30°C), arsenic (800°C), and gallium arsenide (1240°C), respectively. At a temperature T, a liquid of composition X is in equilibrium with the solid S. If the temperature falls to T′, the liquid partially crystallizes to point L and becomes enriched in gallium with a new composition X′. This method permits deposition at much lower temperatures than that using the fusion component. Tiller (1968), Tiller and Kang (1968), and Potard (1972), drawing on the
mathematical treatment of Pohlhausen (1921), have studied the thermodynamics of this method. This work was taken up again by Crossley and Small (1972). The substrate, which acts as the seed, can be brought into contact with the bath in various ways. Nelson (1963), Kang and Green (1967), and Goodwin et al. (1970) use a tilting method, whereas a vertical dipping technique was used by Deitch (1970) and King et al. (1971). A crucible of the form shown in Fig. 18 can also be used (Hayashi et al., 1970). This is particularly suitable for preparing multiple epitaxial layers, used in transmission devices, during a single thermal cycle.
FIG. 18. Horizontal liquid-phase epitaxy system (quartz tube with heater elements).
Among the numerous publications dealing with these techniques are those of Pinkas et al. (1972), Miller et al. (1972), and Casey et al. (1973). Antypas (1970) has studied the phase diagram of InGaAs. Panish and Ilegems (1971) have established the basis for the calculation of quaternary-compound phase diagrams. Vapor-phase deposition was used as early as 1966 by Tietjen and Amick to prepare a layer of GaAsP using arsine and phosphine. The technique that they used is described in Bell (1973). Binary and ternary III-V compounds can be made in this way. More recently, epitaxial growth using organometallic compounds has been employed by Manasevit and Simpson (1972). The III-V compound is obtained from a group-III organometallic compound and a group-V hydride by means of the reaction

(CH3)3Ga + AsH3 → GaAs + 3CH4
The GaAs substrate is hot. This method has the advantage that it permits preparing aluminum compounds, something that is not possible with chlorides. Encouraging results have been obtained by Allenson and Bass (1976) who, using homoepitaxial structures, have obtained diffusion lengths similar to those obtained with liquid-phase epitaxy and a sensitivity of 1150 μA/lm. Andre et al. (1976) have made GaAs/Ga(1-x)Al(x)As heterostructures with electrical
qualities comparable to those of homoepitaxial GaAs obtained by the vapor-phase method. However, this technique is still in the experimental stage. Liquid-phase and vapor-phase epitaxy have their own individual advantages. Liquid-phase epitaxy gives better electrical characteristics, but the resulting surface finish is of lower quality, which could be a disadvantage for optical applications (Ettenberg and Nuese, 1975). Gutierrez et al. (1974) recommend a mixed technique for a heterostructure, GaAs/GaAlAs/GaP, that is suitable for transmission operation; the GaAlAs is deposited in liquid phase, and the GaAs in vapor phase. At present, both liquid-phase and vapor-phase epitaxy are used, each technique having given good results (Olsen et al., 1977).

B. Transmission Photocathodes with Heteroepitaxial Structures
Various structures could be envisaged for a transmission photoemitter. The simplest technique would be to polish a monocrystal until the desired thickness, 2 to 5 μm for III-V materials, is obtained. Although this would be possible for silicon, it is not suitable for III-V materials because of their fragility. Some form of support, or substrate, is needed that is monocrystalline and transparent to radiation in the spectral sensitivity range of the photocathode. Sapphire (Al2O3) and spinel (MgAl2O4), which have the advantage of being transparent in the ultraviolet and visible region of the spectrum, have been used. However, the results, 70 to 200 μA/lm, are low (Syms, 1969; Liu et al., 1970; Andrew et al., 1970; Hyder, 1971). This is due to a high carrier recombination speed at the interface, caused by the difference between the substrate and GaAs lattice parameters. Materials of type III-V are preferred for photocathodes, GaP, GaAs, or InP being chosen depending on the characteristics required. Here we will discuss the most common case, GaAs, although silicon and photocathodes optimized for neodymium-doped YAG laser (λ = 1.06 μm) detection should also be mentioned. The most common substrate material is GaP, which is transparent beyond 0.55 μm. Gutierrez and Pommering (1973) have made a GaAs/GaP heterostructure using vapor-phase epitaxy. However, the difference between the lattice constants of the two compounds (see Fig. 19) results in an elevated recombination speed. A buffer layer is employed to reduce this speed. The buffer, which must be transparent, is used to absorb the defects due to the lattice mismatching. The intermediate layer can have one of several compositions. GaAlAs is very suitable (see Fig. 19); its band gap is greater than that of GaAs and its lattice
FIG. 19. Bandgap variation as a function of lattice constant for a number of ternary III-V compounds.
constant is virtually the same. Allenson et al. (1972) and Frank and Garbe (1973) have made GaAs/GaAlAs/GaP heterostructures using liquid-phase epitaxy and have obtained transmission-operation sensitivities of better than 400 μA/lm. Gutierrez et al. (1974) used the same structure, but tried a hybrid, liquid-phase plus vapor-phase technique to obtain a GaAs surface that was blemish-free. GaAsP is also used as an intermediate layer. Although its recombination speed is less suitable than that of GaAlAs, it is transparent over a greater range of the spectrum (see Fig. 19). Gutierrez and Pommering (1973), who used vapor-phase epitaxy, were the first to give results for this structure. The work of Fisher (1974) should also be noted. By varying the composition of the ternary layer, it can be optimized on the GaP/GaAsP and GaAsP/GaAs interfaces so as to minimize the dislocation density. The formation of a potential well for photoelectrons at the interface can be avoided by using p-type doping of the intermediate layer. A systematic study of the use of In(x)Ga(1-x)P as a buffer layer material has been carried out by Enstrom and Fisher (1975). Liu et al. (1973) explored the possibility of using structures of the type GaAs/GaAlAs/GaAs-substrate. The opaque substrate must be eliminated from the useful surface of the photocathode, the GaAlAs acting as the support. However, this structure remains fragile (Baud and Rougeot, 1976).
To get over this disadvantage while keeping the advantages of such a structure, Antypas and Edgecumbe (1975) proposed sealing it onto a glass support to make it rigid. A heteroepitaxial structure of the type GaAlAs/GaAs/GaAlAs/GaAs-substrate is prepared by liquid-phase epitaxy. The last layer (GaAlAs) is sealed onto glass, and then the substrate and the first layer of GaAlAs are eliminated by selective chemical baths. The sealed layer of GaAlAs acts as a buffer layer, absorbing damage caused by sealing. Sensitivities of better than 600 μA/lm have been measured in transmission operation with photomultipliers using this type of photocathode structure, higher values of sensitivity being classified.
C. Investigation of Material Characteristics
The preceding sections have shown the major importance of the diffusion length, doping, and electron escape probability. To be able to measure these parameters and to correlate them with photoemission is therefore of the utmost importance. If possible, the measurements should be nondestructive.
1. Diffusion Length

Garbe and Frank (1969) proposed using the photoemission to measure the diffusion length and, if desired, the electron emission probability. This method only works for wavelengths which only excite the Γ band of the semiconductor. For example, if the material is gallium arsenide the wavelength must be longer than 700 nm. The inverse of the quantum yield, measured by using photoemission by reflection, is given by the expression

1/ρ(λ) = [1/((1 - R)P)][1 + 1/(α(λ)L)]   (19)
The curve of 1/ρ(λ) with respect to 1/α(λ) is linear, and if it is extrapolated so as to cut the abscissa [1/ρ(λ) = 0], it can be used to measure the diffusion length. The electron emission probability P can be calculated from where the curve intersects the ordinate at the origin [1/α(λ) = 0]. Mayamlin and Pleskov (1967) have shown that the potential difference ΔV(λ) developed between a semiconductor and an electrolyte into which it is immersed is of the same nature as the photovoltaic effect observed in a semiconductor placed in a vacuum or gaseous environment (Johnson, 1958). The inverse of the photovoltaic potential at a given wavelength is related to
the inverse of the absorption coefficient at the same wavelength by the expression (Allenson, 1973)

1/ΔV(λ) = (k/N0)[1 + 1/(α(λ)L)]   (20)
where N0 is the incident photon flux, and k is an electrolyte/semiconductor interaction coefficient, eliminated by the measurement method. This relationship is linear, so if the curve is extrapolated, the diffusion length can be calculated from the intersection with the abscissa. A third way to measure the diffusion length uses the photomagnetoelectric effect. If the mobility of the charge carriers in the material is known and if certain approximations are made, the measurement of current or voltage can give the diffusion length (Agraz and Li, 1970). This method, unlike the previous ones, has the disadvantage of being destructive. Another destructive technique is to measure the photocurrent at the epitaxial-layer/substrate junction of a beveled sample. If the response is plotted as a function of distance from the junction in log-linear coordinates, the diffusion length can be derived from the resulting straight line (Ashley and Biard, 1967; Ashley et al., 1973):
R = 2 exp(-x/L)(1 + S/VD)^(-1)   (21)
where R is the relative response, S is the surface recombination speed, and VD = D/L, with D being the diffusion coefficient. Laser excitation may be replaced by an electron beam (Hackett, 1972). The smaller dimensions mean that oblique polishing is no longer necessary, simple cleaving being sufficient.
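The extrapolation procedure behind eq. (19) can be demonstrated on synthetic data: generate 1/ρ versus 1/α for a known L and P, fit a straight line, and recover both parameters from the intercepts. All numbers below are illustrative assumptions, not measurements from the text.

```python
# Eq. (19): 1/rho = A + B*(1/alpha), with A = 1/((1-R)P) and B = A/L.
# The abscissa intercept is -A/B = -L; the ordinate intercept gives P.
L_true, P_true, R = 3e-4, 0.4, 0.3    # diffusion length [cm], escape prob., refl.

alphas = [5e3, 1e4, 2e4, 4e4]                  # absorption coefficients, cm^-1
u = [1.0 / a for a in alphas]                  # abscissa values, 1/alpha
y = [(1.0 + ui / L_true) / ((1.0 - R) * P_true) for ui in u]   # 1/rho values

# Least-squares straight line y = A + B*u through the synthetic points.
n = len(u)
um, ym = sum(u) / n, sum(y) / n
B = sum((ui - um) * (yi - ym) for ui, yi in zip(u, y)) \
    / sum((ui - um) ** 2 for ui in u)
A = ym - B * um

L_est = A / B                  # zero crossing of the line is at u = -L
P_est = 1.0 / ((1.0 - R) * A)  # from the ordinate intercept
```

Because the synthetic points are exactly linear, the fit returns the input L and P; on real data, the scatter of the points about the line indicates how well the single-band assumption holds.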
2. Doping
The doping can be measured by the Hall effect, using the now classic technique of van der Pauw (1958). With epitaxial materials, the substrate must be semi-insulating. Measurement at 300 K gives the carrier density and mobility. Measurement at 77 K gives the compensation coefficient. This technique is destructive. Another method, nondestructive this time, exploits the plasma resonance that occurs at the surface of a semiconductor which is irradiated with far-infrared radiation. This phenomenon affects the reflection at a wavelength that is a function of the doping. Various ways of applying this technique are described in Black et al. (1970), Schuman (1970), Riccus et al. (1966), and Kesamanly et al. (1966).
3. The Nature of the Surface

The characteristics mentioned previously concern the bulk of the semiconductor and its electrical properties. The nature of the surface affects the electron escape probability, depending on its crystalline quality and purity. Auger electron spectroscopy is now widely used for the study of photocathodes. Numerous articles, such as Chang (1971), Auger (1975), and Harris (1968), describe this secondary emission process. The energy of each peak is characteristic of the emitting body, so the identification of chemical species at the surface can be performed with an accuracy of 1/100 of a monolayer. The influence of contaminants can thus be shown with this technique (Uebbing, 1970). The lattice structure at the surface can be checked by low energy electron diffraction, a method which gives an image of the surface reciprocal lattice. A description of this is given in Estrup (1971) and Fiermans (1974). This nondestructive control instrumentation should be installed inside the enclosure in which the photocathode is prepared, so that the desorption and activation processes can be checked (Goldstein, 1975; Stocker, 1975; Van Bommel and Crombeen, 1976).

D. High-Vacuum Enclosures
The incorporation of NEA photocathodes in sealed tubes presents several problems, among which are maintaining a vacuum of 10⁻¹⁰ torr inside the enclosure, and the desorption under high vacuum of the monocrystalline surfaces that are to be activated.

1. The Need for High Vacuum
By high vacuum, we mean residual pressures of less than 10⁻⁹ torr. The necessity of using high vacuum becomes evident on inspecting Fig. 20, which gives the number of impacts per square centimeter as a function of the partial pressure of an element of molecular weight M, and Fig. 8, which shows the rapidity with which photoelectron transmission decreases, due to the tunnel effect, as a function of the thickness of the potential barrier. From Fig. 20, it can be seen that only 1 sec is required to deposit a monomolecular layer of H₂O on a cold surface at a pressure of 2 × 10⁻⁶ torr. This time becomes 10⁴ sec at 2 × 10⁻¹⁰ torr.

2. High-Vacuum Sealing Equipment
Various ways of sealing tubes under high vacuum have been recommended. As an example, the authors describe here their own equipment (see
30
H. ROUGEOT AND
C. BAUD
FIG. 20. Variation of the number of impacts per square centimeter with partial pressure (torr).
Fig. 21). It consists of a liquid-helium cryogenic pump; a quadrupole gas analyzer; an L-shaped, metal-jointed, bakeable valve with a high throughput; and a bell housing. Half of the tube is mounted on a fixed support. The other half is mounted on a support that can be made to slide along two guide columns by means of a mechanical press, mounted outside the vacuum enclosure, that transmits the motion via a metal bellows. Initially, the two halves of the tube are kept separated, the space between them permitting the introduction of the photocathode before sealing. The photocathode itself is degassed and prepared outside the tube. It is fixed to a turntable, controlled manually from outside, that can be rotated so as to position the photocathode for the different operations of desorption, coating with cesium, and insertion in the tube. Pure oxygen can be introduced into the vacuum enclosure via a leak valve. Various viewing ports permit positioning the photocathode, measuring its sensitivity, and checking the sealing operation. Photocathodes of several different structures have now been tried in tubes, including GaP/GaAsP/GaAs with a sensitivity of 270 µA/lm in transmission operation (Hughes et al., 1972, 1974) and GaP/GaAlAs/GaAs with a sensitivity of better than 300 µA/lm (Holeman et al., 1974).
FIG.21. Equipment for ultrahigh vacuum sealing with indium joints: 1, Cryogenic pump; 2, quadrupole gas analyzer; 3, metallic-joint ultrahigh vacuum valve; 4, bell jar; 5, oxygen leak valve; 6, inspection window; 7, workpiece manipulation system; 8, sorption pumps.
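The impact rates plotted in Fig. 20, and the resulting monolayer-formation times, follow from the kinetic-theory impingement flux Φ = P/√(2πmkT). A small sketch (H₂O at 300 K, a surface site density of 10¹⁵ cm⁻², and unit sticking coefficient are our assumptions, not values from the text):

```python
import math

def impingement_rate(pressure_torr, molar_mass_kg, temp_k=300.0):
    """Kinetic-theory impingement flux P / sqrt(2*pi*m*k*T), returned in
    molecules per cm^2 per second (the quantity plotted in Fig. 20)."""
    k_b = 1.381e-23                  # Boltzmann constant, J/K
    n_a = 6.022e23                   # Avogadro number, 1/mol
    p_pa = pressure_torr * 133.322   # torr -> Pa
    m = molar_mass_kg / n_a          # mass of one molecule, kg
    flux_m2 = p_pa / math.sqrt(2.0 * math.pi * m * k_b * temp_k)
    return flux_m2 * 1e-4            # per m^2 -> per cm^2

def monolayer_time(pressure_torr, sites_per_cm2=1e15, sticking=1.0):
    """Time to deposit one monolayer of H2O, assuming every impact
    sticks (site density and sticking coefficient are assumptions)."""
    return sites_per_cm2 / (sticking * impingement_rate(pressure_torr, 18e-3))
```

Under these assumptions the monolayer time comes out near one second at 2 × 10⁻⁶ torr, and ten thousand times longer at 2 × 10⁻¹⁰ torr, which is the reason sealed NEA tubes demand such low residual pressures.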
VII. PHOTOEMISSION STABILITY AND DARK CURRENT

Factors affecting stability are listed in publications by Sommer (1973a,b) and Yee and Jackson (1971). It appears that the main cause of photocathode destruction is bombardment by residual ions; Shade et al. (1972) studied this problem and found that if the photocathode is to conserve its efficiency, the current drawn must not exceed 10⁻⁹ A/cm². Spicer (1974) studied the origin of the instabilities in greater detail by irradiating metallic cesium and III-V materials with ultraviolet light and measuring how their photoemission evolved with degree of oxidation. He found that cesium keeps its metallic nature after a short exposure to oxygen, which apparently dissolves into the bulk of the material. A larger exposure to oxygen first reduces the work function to a minimum value of 0.7 eV and then tends to increase it. Working with p-type gallium arsenide, he noted that the energy bands tended to bend toward a lower level. A special feature of NEA photocathodes is their low dark current. This dark current has several origins: thermionic emission, which varies with the height of the band gap of the material and which is the main source of dark current in GaInAsP photocathodes intended for operation at 1.06 µm (Escher et al., 1976); charge carriers created by generation-recombination centers at the interface; and the Cs-O layer, whose effect increases with thickness.
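The band-gap dependence of the thermionic contribution can be made concrete with a simple Boltzmann factor exp(−Eg/kT). This is only an order-of-magnitude sketch of the scaling between two materials (the emission prefactor, which a full Richardson-type treatment would supply, is assumed to cancel in the ratio):

```python
import math

def thermionic_ratio(eg_narrow_ev, eg_wide_ev, temp_k=300.0):
    """Approximate ratio of thermionic dark currents for two band gaps,
    modeled as exp(-Eg/kT); prefactors are assumed identical."""
    kt_ev = 8.617e-5 * temp_k   # Boltzmann constant in eV/K, times T
    return math.exp((eg_wide_ev - eg_narrow_ev) / kt_ev)
```

Comparing a 1.17-eV material (matched to 1.06 µm) with GaAs at 1.4 eV gives a ratio of several thousand at room temperature, which is consistent with the much larger dark currents quoted below for narrow-gap photocathodes.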
The dark current of GaAs photocathodes is on the order of A/cm² (Van Laar, 1973). For materials with narrower band gaps, it may reach 10⁻¹⁰ A/cm² (Martinelli, 1974). This subject is treated in publications by Bell (1970) and Spicer and Bell (1972).

VIII. CONCLUSION

Although much of this discussion has been based on gallium arsenide, other materials can, of course, be used. They must, however, fulfill the following conditions:
(1) The material must be highly absorbent and have a diffusion length compatible with the thickness of the photoemission layer.
(2) The width of the band gap must not be less than the work function of the activated surface.
(3) Because the active material is either sealed or grown epitaxially onto a substrate that is transparent to the wavelengths of interest, their expansion coefficients must be well matched.

So far as the second point is concerned, the work function of the oxygen-cesium layer, deposited on the active layer, is on the order of 0.85 eV. This sets a lower limit for the band gap of the semiconductor, below which an equivalent spectral response (into the infrared) can hardly be hoped for. This limit corresponds to a wavelength of 1.46 µm. It must also be remembered that, according to the theories developed up to now, any extension of the spectral response toward longer wavelengths will, because of the reduction in NEA (as the band gap approaches 0.85 eV), be accompanied by a drop in electron escape probability, and hence a reduction in quantum yield. Figure 22 shows the main semiconductors that satisfy conditions 1 and 2. With ternary compounds, the band gap, and hence the cutoff wavelength, can be adjusted to a desired value by simply altering the composition. The most commonly used ternary compounds are GaInAs and InAsP, deposited by epitaxy on GaAs and InP, respectively. Condition 3 limits this possibility because the lattice dimensions and the absorption limits of ternary materials both vary with composition (Vegard's law). To make the parameters independent of each other, the use of quaternary compounds has been envisaged. These materials permit varying the spectral absorption limit while keeping the lattice structure constant. One quaternary material that has been investigated in some detail is GaInAsP. The main contribution of the study of NEA at the surface of semiconductors has been to permit a better understanding of photoemission and to set its limits.
Improvements now depend on the metallurgy of the active layers, the efficiency of surface-cleaning techniques, and, so far as operating stability is concerned, the cleanliness of the enclosures incorporating such photocathodes. It may be possible to increase the value of the limiting wavelength, presently 1.46 µm, beyond which NEA disappears. This would require the discovery of new activating materials with lower electron affinity, or the creation in some way of an electron-extraction field in the substrate. Several laboratories are working on these ideas at present.

FIG. 22. Band gap of commonly used semiconductors in NEA: CdSe, 1.74 eV (0.712 µm); CdTe, 1.5 eV (0.83 µm); GaAs, 1.4 eV (0.9 µm); InP, 1.27 eV (0.975 µm); Si, 1.12 eV (1.1 µm). The wavelength of the Nd:YAG laser corresponds to 1.17 eV (1.06 µm), and the limit of NEA (electron affinity of the Cs-O layer) to 0.85 eV (1.46 µm).
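The correspondence between band gap (or limiting work function) and cutoff wavelength used throughout this discussion is simply λ = hc/E. A one-line sketch of the conversion:

```python
def cutoff_wavelength_um(energy_ev):
    """Photoemission threshold wavelength (micrometers) from an energy
    in eV, via lambda = hc/E with hc = 1.23984 eV*um."""
    return 1.23984 / energy_ev
```

Applied to the 0.85-eV work function of the Cs-O layer this gives 1.46 µm, the NEA limit cited above, and applied to the 1.27-eV gap of InP it gives 0.975 µm, matching the values collected in Fig. 22.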
REFERENCES

Abrahams, M. S., Buiocchi, C. J., and Williams, B. F. (1971). Appl. Phys. Lett. 18, 220.
Agraz, J. C., and Li, S. S. (1970). Phys. Rev. B 2, 1847.
Allen, G. A. (1971). J. Phys. D 4, 308.
Allenson, M. B. (1973). SERL Technol. J. 23, 11.1.
Allenson, M. B., and Bass, S. J. (1976). Appl. Phys. Lett. 28, 113.
Allenson, M. B., King, P. G. R., Rowland, M. C., Steward, G. J., and Syms, C. H. A. (1972). J. Phys. D 5, L89.
Andre, J. P., Gallais, A., and Hallais, J. (1976). "Proc. Gallium Arsenide and Related Compounds" (C. Hilsum, ed.), Conf. Ser. No. 33a, Edinburgh, 1.
Andrew, D., Gowers, J. P., Henderson, J. A., Plummer, M. J., Stocker, B. J., and Turnbull, A. A. (1970). J. Phys. D 3, 320.
Antypas, G. A. (1970). J. Electrochem. Soc. 117, 1393.
Antypas, G. A., and Edgecumbe, J. (1975). Appl. Phys. Lett. 26, No. 7, 371.
Antypas, G. A., James, L. W., and Uebbing, J. J. (1970). J. Appl. Phys. 41, 2888.
Apker, L., Taff, E., and Dickey, J. (1948). Phys. Rev. 74, 1462.
Ashley, K. L., and Biard, J. R. (1967). IEEE Trans. Electron Devices ED-14, 429.
Ashley, K. L., Carr, D. L., and Moran, R. R. (1973). Appl. Phys. Lett. 22, 23.
Auger, P. (1975). Surf. Sci. 48, 1.
Bartelink, D. T., Moll, J. L., and Meyer, N. I. (1963). Phys. Rev. 130, 972.
Baud, C., and Rougeot, H. (1976). Rev. Thomson-CSF 8, 449.
Bell, R. L. (1970). Solid State Electron. 13, 397.
Bell, R. L. (1973). "Negative Electron Affinity Devices." Oxford Univ. Press (Clarendon), London and New York.
Bell, R. L., and Spicer, W. E. (1970). Proc. IEEE 58, 1788.
Bell, R. L., James, L. W., Antypas, G. A., Edgecumbe, J., and Moon, R. L. (1971). Appl. Phys. Lett. 19, 513.
Black, J. F., et al. (1970). Infrared Phys. 10, 126.
Borziak, P. J., Bibik, V. F., and Dramarenko, G. S. (1956). Izv. Akad. Nauk SSSR, Ser. Fiz. 20, 1039.
Bradley, D. J., Allenson, M. B., and Holeman, B. R. (1977). J. Phys. D 10, No. 4, 111.
Brown, F., Williams, and Tietjen, J. J. (1971). Proc. IEEE 59, 1489.
Casey, H. C., Miller, B. I., and Pinkas, E. (1973). J. Appl. Phys. 44, 1281.
Chang, C. C. (1971). Surf. Sci. 25, 53.
Chen, J. M. (1971). Surf. Sci. 25, 305.
Clark, M. G. (1975). J. Phys. D 8.
Crossley, J., and Small, M. B. (1972). J. Cryst. Growth 15, 275.
Csorba, I. P. (1970). RCA Rev. 31, 534.
Deitch, R. H. (1970). J. Cryst. Growth 7, 69.
Derrien, J., Arnaud d'Avitaya, F., and Glachan, A. (1975). Surf. Sci. 47, 162.
Derrien, J., Arnaud d'Avitaya, F., and Bienfait, M. (1977). Colloq. Int. Phys. Chim. Surf. Solides, 3rd, 1977, p. 181.
Dinan, J. H., Galbraith, L. K., and Fisher, F. E. (1971). Surf. Sci. 26, 587.
Ebbinghaus, G., Braun, W., and Simon, A. (1976). Phys. Rev. Lett. 37, 1770.
Enstrom, R. E., and Fisher, D. J. (1975). J. Appl. Phys. 46, 1976.
Escher, J. S., Antypas, G. A., and Edgecumbe, J. (1976). Appl. Phys. Lett. 29, 153.
Estrup, P. (1971). Surf. Sci. 25, 1.
Ettenberg, M., and Nuese, C. J. (1975). J. Appl. Phys. 46, 3500.
Fiermans, L., and Vennik, J. (1974). Silicates Industriels 3, 75.
Fisher, D. G. (1974). IEEE Trans. Electron Devices, August 1974, 541.
Fisher, D. G., and Martinelli, R. U. (1974). "Negative Electron Affinity Materials. Image Pick Up and Display" (B. Kazan, ed.), Vol. 1. Academic Press, New York.
Fisher, D. G., Enstrom, R. E., Escher, J. S., and Williams, B. F. (1972). J. Appl. Phys. 43, 3815.
Fisher, D. G., Enstrom, R. E., Escher, J. S., Gossenberger, H. S., and Appert, J. A. (1974). IEEE Trans. Electron Devices 21, 641.
Frank, G., and Garbe, S. (1973). Acta Electron. 16, 237.
Garbe, S., and Frank, G. (1969). Solid State Commun. 7, 615.
Goldstein, B. (1973). Surf. Sci. 35, 227.
Goldstein, B. (1975). Rapport AD/A 026-710. National Technical Information Service.
Goldstein, B., and Szostak, D. J. (1975). Appl. Phys. Lett. 26, 111.
Goodwin, A. R., Gordon, J., and Dobson, C. D. (1970). Br. Appl. Phys. Lett. 17, 109.
Gregory, P. E., Spicer, W. E., Ciraci, S., and Harrison, W. A. (1974). Appl. Phys. Lett. 25, 511.
Gutierrez, W. A., Wilson, H. L., and Yee, E. M. (1974). Appl. Phys. Lett. 25, 482.
Hackett, J. (1972). J. Appl. Phys. 43, 1649.
Harris, L. A. (1968). J. Appl. Phys. 39, 1419.
Hayashi, I., Panish, M. B., Foy, P. W., and Sumsky, S. (1970). Appl. Phys. Lett. 17, 109.
Heiman, W., Hoene, E. L., and Kansky, E. (1973). Exp. Technol. Phys. 21, 193.
Hoene, E. L. (1970). Trans. IMEKO Symp. Photon Detect., 4th, 1969, p. 29.
Holeman, B. R., Conder, P. C., and Skingsley, J. D. (1974). SERL Technol. J. 24, 6.1.
Hughes, F. R., Savoye, E. D., and Thoman, D. L. (1972). "Application of Negative Electron Affinity Materials to Imaging Devices." AIME, Boston, Massachusetts.
Hughes, F. R., Savoye, E. D., and Thoman, D. L. (1974). J. Electron. Mater. 3, 9.
Hyder, S. B. (1971). J. Vac. Sci. Technol. 8, 228.
James, L. W., Moll, J. L., and Spicer, W. E. (1968). Inst. Phys. Conf. Ser. 7, 230.
James, L. W., Antypas, G. A., Edgecumbe, J., and Bell, R. L. (1971a). Inst. Phys. Conf. Ser. 9, 195.
James, L. W., Antypas, G. A., Uebbing, J. J., Yep, T., and Bell, R. L. (1971b). J. Appl. Phys. 42, 580.
James, L. W., Antypas, G. A., Edgecumbe, J., Moon, R. L., and Bell, R. L. (1971c). J. Appl. Phys. 42, 4976.
Johnson, E. O. (1958). Phys. Rev. 111, 153.
Kang, C. S., and Green, P. E. (1967). Appl. Phys. Lett. 11, 171.
Kesamanly, F. P., Maltsev, Yu. V., Masledov, D. M., and Ukhanov, Yu. I. (1966). Phys. Status Solidi 13, 4119.
King, S., Dawson, L. R., DiLorenzo, J. V., and Johnson, W. A. (1971). Inst. Phys. Conf. Ser. 9, 108.
Kressel, A., and Kupsky, G. (1966). Int. J. Electron. 20, 535.
Liu, Y. Z., Moll, J. L., and Spicer, W. E. (1970). Appl. Phys. Lett. 17, 60.
Liu, Y. Z., Hallish, C. D., Stein, N. W., Badger, D. E., and Greene, P. O. (1973). J. Appl. Phys. 44, 5619.
Manasevit, H. M., and Simpson, W. I. (1972). J. Cryst. Growth 13/14, 306.
Martinelli, R. U. (1973a). J. Appl. Phys. 44, 2566.
Martinelli, R. U. (1973b). Appl. Opt. 12, 1841.
Martinelli, R. U. (1974). J. Appl. Phys. 45, 1183.
Mayamlin, V. A., and Pleskov, Yu. (1967). In "Electrochemistry of Semiconductors" (P. J. Holmes, ed.). Academic Press, New York.
Miller, B. I., Pinkas, E., Hayashi, I., and Capik, R. J. (1972). J. Appl. Phys. 43, 2817.
Milton, A. F., and Baer, A. D. (1971). J. Appl. Phys. 42, 5095.
Mityagin, A. Ya., Orlov, V. P., Panteleev, V. V., Khronopulo, K. A., and Cherevatskii, N. Ya. (1973). Sov. Phys. Solid State (Engl. Transl.) 14, 1623.
Nelson, H. (1963). RCA Rev. 24, 603.
Olsen, G. H., Szostak, D. J., Zamerowski, T. J., and Ettenberg, M. (1977). J. Appl. Phys. 48, 1007.
Panish, M. B., and Ilegems, M. (1971). Inst. Phys. Conf. Ser. 9, IPPS, London.
Papageorgopoulos, C. A., and Chen, J. M. (1973). Surf. Sci. 39, 283.
Pinkas, E., Miller, B. I., Hayashi, I., and Foy, P. W. (1972). J. Appl. Phys. 43, 2827.
Pohlhausen, K. (1921). Z. Angew. Math. Mech. 1, 252.
Pollard, J. H. (1972). AD 750 364.
Potard, C. (1972). J. Cryst. Growth 13/14, 804.
Ranke, W., and Jacobi, K. (1973). Solid State Commun. 5.
Riccus, H. D., et al. (1966). Can. J. Phys. 44, 1665.
Scheer, J. J., and Van Laar, J. (1965). Solid State Commun. 3, 189.
Schuman, P. A., Jr. (1970). Solid State Technol. 13, 50.
Shade, H., Nelson, H., and Kressel, H. (1972). Appl. Phys. Lett. 20, 385.
Simon, A. (1971). Naturwissenschaften 58, 622.
Simon, A. (1973). Z. Anorg. Allg. Chem. 395, 301.
Simon, A., and Westerbeck (1972). Angew. Chem., Int. Ed. Engl. 11, 1105.
Smith, K. L., and Huchital, D. A. (1972). J. Appl. Phys. 43, 2624.
Sommer, A. H. (1973a). Appl. Opt. 12, 90.
Sommer, A. H. (1973b). RCA Rev. 34, 95.
Sommer, A. H. (1973c). Inst. Phys. Conf. Ser. 17, 143.
Sommer, A. H., Whitaker, H. H., and Williams, B. F. (1970). Appl. Phys. Lett. 17, 273.
Sonnenberg, H. (1969a). J. Appl. Phys. 40, 3414.
Sonnenberg, H. (1969b). Appl. Phys. Lett. 14, 289.
Sonnenberg, H. (1971). Appl. Phys. Lett. 19, 431.
Sonnenberg, H. (1972a). Appl. Phys. Lett. 21, 103.
Sonnenberg, H. (1972b). Appl. Phys. Lett. 21, 278.
Spicer, W. E. (1974). "Study of the Electronic Surface of III-V Compounds" (AD A 010 802, Oct.). National Technical Information Service.
Spicer, W. E., and Bell, R. L. (1972). Publ. Astron. Soc. Pac. 84, 110.
Stocker, B. J. (1975). Surf. Sci. 47, 501.
Syms, C. H. A. (1969). Adv. Electron. Electron Phys. 28A, 399.
Tietjen, J. J., and Amick, J. A. (1966). J. Electrochem. Soc. 113, No. 7, 724.
Tiller, W. A. (1968). J. Cryst. Growth 2, 69.
Tiller, W. A., and Kang, C. (1968). J. Cryst. Growth 2, 345.
Tsai, K. R., Harris, P. M., and Lassetre, E. N. (1956). J. Phys. Chem. 60, 345.
Turnbull, A. A., and Evans, G. B. (1968). J. Phys. D 1, 155.
Uebbing, J. J. (1970). J. Appl. Phys. 41, 802.
Uebbing, J. J., and James, L. W. (1970a). Appl. Phys. Lett. 16, 370.
Uebbing, J. J., and James, L. W. (1970b). J. Appl. Phys. 41, No. 11, 4505.
Van Bommel, A. J., and Crombeen, J. E. (1976). Surf. Sci. 57, 109.
Van der Pauw, L. J. (1958). Philips Res. Rep. 13, No. 1, 1.
Van Laar, J. (1973). Acta Electron. 16, 215.
Van Laar, J., and Scheer, J. J. (1967). Surf. Sci. 8, 342.
Van Laar, J., and Scheer, J. J. (1968). Philips Tech. Rev. 28, No. 12, 355.
Williams, B. F., and Simon, R. E. (1967). Phys. Rev. Lett. 18, 485.
Yee, E. M., and Jackson, D. A. (1971). Solid State Electron. 15, 245.
A Computational Critique of an Algorithm for Image Enhancement in Bright Field Electron Microscopy*

T. A. WELTON

Physics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, and Department of Physics, University of Tennessee, Knoxville, Tennessee
I. Introduction .................................................. 37
II. Image Theory for Bright Field Electron Microscopy ............ 39
IV. Statistical Error
V. Object Reconstruction
VI. Programs for Numerical Tests of the Reconstruction
VII. Presentation and Discussion of Data
Appendix B. The Wiener Spectrum of the Object Set ................ 94
Appendix C. Programs for Determining W(k) ........................ 97
References ...................................................... 100
I. INTRODUCTION

Since the elucidation by Scherzer (1936) of the role played by the aperture defect in limiting the resolution of the electron microscope, a number of approaches have been explored to obviate such limitation. Scherzer himself proposed several possible improvements (1947, 1949), that of greatest interest being probably the use of multipole corrector lenses (Scherzer, 1947). Subsequent work has unfortunately made it all too clear that this approach, while admirable from a theoretical viewpoint, is highly complex in realization (Seeliger, 1951; Burfoot, 1952; Archard, 1955; Deltrap, 1964). No resolution improvement in an actual microscope has, in fact, been realized, and
* Research sponsored by the Division of Physical Research, U.S. Department of Energy, under contract W-7405-eng-26 with the Union Carbide Corporation.
there is presently good reason for skepticism regarding the true value of this initially promising idea. Other suggested improvements (Thon and Willasch, 1971) were based on the idea of altering the image wave in the back focal plane by selective obstruction and/or retardation, in order to at least partially compensate the extremely unfavorable phase variations introduced by the aperture defect, defocus, and axial astigmatism. When, however, the severe fabrication problems inherent in this approach were ingeniously surmounted, the actual results were disappointing. Some skepticism as to these approaches seems accordingly also presently justified. A third class of improvements has also received considerable attention (Hahn and Baumeister, 1973; Hahn, 1973; Thon and Siegel, 1970, 1971; Stroke and Halioua, 1973; Stroke et al., 1974; Welton, 1970; Langer et al., 1971; Erickson and Klug, 1971) namely, that in which the imperfect micrograph is processed in some way to at least partially compensate for the effect of aberrations. These suggestions have always possessed some attractive features. They do not involve delicate fabrication problems, since they operate on the micrograph, an entity of macroscopic size. These methods further bear a close relationship to methods currently under study in light-optical applications, with much of this experience being directly transferable to the electron-optical problem. As with the other methods mentioned, practical results thus far obtained with these micrograph enhancement methods do not yet live up to their promise, and it is this frustrating situation which the present article is intended to address. We conclude this section by noting that two distinct methods exist for carrying out the work of enhancing a micrograph. The first utilizes coherent illumination of the micrograph (in transparency form), followed by suitable focusing, retarding, and absorbing elements. 
The result is a real image which provides a finished picture by exposure of a film. This method has been carefully explored by several workers (Hahn and Baumeister, 1973; Hahn, 1973; Thon and Siegel, 1970,1971; Stroke and Halioua, 1973; Stroke et al., 1974). No fully conclusive results appear to have been obtained yet, and it is of considerable importance to understand the difficulties and potential strengths of the method. Further discussion is beyond the purview of this article, and definitive answers must clearly be obtained by practitioners of the methods. The methods to be discussed in detail (Welton, 1970; Langer et al., 1971; Erickson and Klug, 1971) in the present article involve carrying out the required enhancement operations by large-scale computation on a digital representation of the micrograph. As always, these methods carry their peculiar advantages and difficulties, and these desperately require realistic evaluation, especially in view of the fact that here again substantial successes are difficult to find.
Accordingly, the plan of this article will be to explore in some detail a small (but promising) subset of the computational procedures, with a view to enumerating and evaluating the practical difficulties. Parallel discussions of the optical processing and the purely electron-optical methods are badly needed, as well as detailed comparisons with the purely computational procedures. The method to be used is based on computer synthesis of the micrograph to be processed, a procedure that allows careful control of the quality of the data used. For a number of years, such results (Welton, 1971a) were the only existing substantial evidence for the possible value of computational methods, although a few interesting results have been more recently reported (Welton, 1975), using actual electron micrographs.

II. IMAGE THEORY FOR BRIGHT FIELD ELECTRON MICROSCOPY

It is a truism that the enhancement or improvement of any image can only be undertaken if a reasonably detailed theory is available for the process of image formation. We shall see, in fact, that the aberrations of the imaging process, the statistical error level (noise), and some statistical information concerning the object structure must all be reasonably well known before any progress can be made. At this point, we specialize to the case of bright field imaging, the requisite theory being then relatively simple. Although the dark field image mode is more attractive to the eye, it has been found by numerous tests that the bright field mode is not inferior in information content. Once the bright field micrograph is digitized, moreover, the background density level can be subtracted and the contrast level of the remainder expanded to the limits of the final display medium, so that the eye can be fully satisfied. A similar contrast enhancement could, of course, be performed by optical printing onto a high contrast emulsion.
We will shortly exhibit the advantage of the bright field mode from the standpoint of computer processing. The basic theory needed is, of course, basically that given by Abbé for the optical microscope, as adapted by Scherzer (1949) for the case of electron imaging. Some changes will be necessitated by the essential aberrations of electron-optical systems and by the special properties of high resolution electron microscopic samples. Most of these differences will be seen to cause additional difficulties (ranging from moderate to severe) in the electron-optical case, the single exception being the small numerical aperture of the electron microscope. The simple Abbé theory is nevertheless a most useful starting point, and we proceed to set up a treatment adequate for our purpose. We assume the object to be some arrangement of atoms, lying nearly in a plane (to within 100 Å or less). The electrons are assumed to be incident all with very nearly the same direction and energy, so that the illumination is nearly coherent. The first Born approximation will be
assumed to be reasonably valid for the description of the interaction between the beam electrons and the sample atoms, but we shall discuss briefly the effects of departures from this assumption. We further assume that the field is small enough that coma is small in a sense easily made quantitative. In order to describe the effect of partial coherence, we calculate the image intensity distribution expected from each of a continuous set of perfectly coherent illuminations and finally average these intensities over an assumed distribution function for the energy and direction spread of the actual illumination. These averages must in general be done numerically if high accuracy is to be achieved, but it will be seen that a certain degree of crudity in this calculation can be tolerated for the purpose at hand. A basic approximation will therefore be that of a normal (Gaussian) distribution for the energies and directions of the incident electrons, a procedure first exploited in the numerical synthesis of bright field micrographs (Welton, 1971a) and in the use of informational concepts to characterize microscope performance (Welton, 1969). Full derivations have been given for the case of energy spread by Welton (1971b) and by Hanszen and Trepte (1971), and for the case of angular spread by Welton (1971b) and by Frank (1973). We now present a version of bright field image theory in a form adequate for our purpose. A particular electron of the illuminating beam is described by a wave function

exp[i(κ + ζ)z + iξ · x]   (2.1)

where

κ = p/ħ   (2.2)

and m is the electron rest mass, c the light speed, ħ (= h/2π) Planck's constant, p the mean electron momentum, v the electron speed, and β = v/c. The quantities ζ and ξ are deviations for an individual electron from the mean momentum of the beam electrons. It will be extremely convenient to characterize a beam electron by a longitudinal (or axial) momentum component ħ(κ + ζ) and a pair of rectangular transverse components ħξ, written for compactness as a two-dimensional vector quantity. In all that follows it will be assumed that ζ and ξ are distributed in normal fashion. Thus, with N a normalization factor,

P(ζ, ξ) = N exp(−ζ²/2δ² − ξ_x²/2q_x² − ξ_y²/2q_y²)   (2.3)

is the probability per unit volume in the ζ, ξ space of finding a given beam electron. With a carefully adjusted illumination system, q_x can be made equal to q_y, but in any event a particular orientation for the xy axes has been chosen to slightly simplify Eq. (2.3). We may anticipate that the above form for the ζ distribution may be quite good
generally, and that the ξ distribution will be closely of the chosen form for the interesting practical case of the field emission source (Young and Muller, 1959). We note the equivalence of the parameters δ and q to more familiar forms, thus

δ ≈ δp/ħ ≈ δE/ħv,   q ≈ κ δθ ≈ 1/2δx   (2.4)

where δp and δE are the rms momentum and energy spreads, respectively, and δθ and δx are the illumination angle and the transverse coherence length, respectively.* In the Abbé theory, as used by Scherzer, the interaction of the illuminating beam with the sample is described by computing the phase retardation of the wave corresponding to a particular electron, produced by passage through the electrostatic fields of the sample atoms. We consider a sample atom with coordinates (x_a, z_a) and electrostatic potential V(x − x_a, z − z_a). It will be convenient (and presumably adequately accurate) to take the potentials of the separate atoms as additive and spherically symmetric, thus ignoring the details of chemical binding. It is further plausible that the first Born approximation will be of useful accuracy.† Finally, it is easily shown that a simple calculation of the retardation produced by the sample atom for the undeflected incident wave yields a convenient and accurate form for the scattered wave. For the moment, we take ζ = ξ = 0, and write

exp(iκz) → exp[iκz + i ∫₋∞^∞ dz′ δκ(x, z′)]   (2.5)
where the arrow indicates the change in the form of the incident wave produced by passage through the sample. The retardation function δκ is simply defined by energy conservation. Thus

[ħc(κ + δκ)]² + m²c⁴ = (E + mc² + eV)²   (2.6)

and

(ħcκ)² + m²c⁴ = (E + mc²)²   (2.7)
* The approximate equivalence indicated for δ would be precise in the nonrelativistic limit, and that indicated for q_x, q_y reflects a confusion in the literature between several nearly equivalent definitions of the width of the angular distribution.
† Exceptions will obviously arise when a number of atoms are nearly aligned in the beam direction, or when the average number of atoms per unit area, transverse to the beam, becomes too large.
where E is the beam kinetic energy, e is the magnitude of the electron charge, and V is the electrostatic potential (positive for an atom with its positive central charge). A simple calculation yields

δκ = (eV/ħv)[1 + eV/2(E + mc²)]   (2.8)

where terms in δκ with powers of V past the second have been ignored. The term quadratic in V will in fact be ignored, as will some further such quadratic terms, and it is of some importance to understand the basis for this neglect. First, as β → 1, the term in question clearly becomes of vanishing importance. For more modest β values, however, another argument is needed. As an extreme assumption, take

V = Ze/r   (2.9)

so that we are to compare Ze²/mc²r with unity. Clearly the quadratic term will be unimportant for

r > Ze²/mc²   (2.10)

or, taking Z = 92 as the worst case, for

r > 92 × 2.42 × 10⁻¹³ cm   (2.11)

or r > 0.0022 Å
which figure is far beyond the resolution capability of any imaginable electron microscope. Thus the very small regions around the sample nuclei in which the quadratic term of (2.8) is important simply are not imaged. Following Scherzer (1949), we return to Eq. (2.5) and make a series expansion of the exponential factor containing δκ. We thus find that an incident wave exp(iκz) is transformed, by passage through the sample, into the sum of the incident source wave plus an object wave

exp(iκz) → exp(iκz) + iO(x) exp(iκz)   (2.12)

where

iO(x) = exp[i ∫₋∞^∞ dz δκ(x, z)] − 1   (2.13)

or

O(x) ≅ ∫₋∞^∞ dz δκ(x, z)   (2.14)

where the form (2.14) again involves neglect of a term quadratic in V.* Unlike the previous neglect, which involved a comparison of potential

* See Appendix A for a discussion of this point.
energy with rest energy (and was therefore relativistic in origin), this one involves neglect of the mechanism by which the incident wave is systematically diminished in amplitude by interaction with the sample. Again, a "worst case" analysis suggests that for imaging of single uranium atoms, the linearized analysis will be valid except for unresolvable disks of diameter about 0.03 Å surrounding each nucleus. We will assume in the following that all samples (of necessity biological, rather than metallurgical) are sufficiently thin that attenuation of the incident beam is of no consequence. The assumptions made as to the negligibility of terms quadratic in V will be collectively referred to as the "weak object" approximation. For the moment, we take all sample atoms to be located in the plane z = 0. The optical system following the sample will be characterized by numerous aberrations, most of which can be ignored. The primary spherical aberration and defocus must be considered, and we choose to include the axial astigmatism, since a practical system can never be completely free of anisotropy. For the present, the illumination is fully coherent, so that chromatic aberration does not yet appear. As is usual in microscopic imaging, aberrations involving location in the object plane are of little importance, so that distortion, field curvature, and nonaxial astigmatism will be ignored completely. Coma will be the field aberration which sets the practical limit to the usefulness of the image processing methods to be considered. A simple wave-optical calculation of the propagation through the system will now yield the image-plane intensity I(x) which is produced by the source wave and object wave O(x). For convenience, we take the system to have unit magnification. As in Abbé's original theory, it is extremely convenient to decompose the object wave into its Fourier components, each of which can be considered to propagate independently through the system. Thus, if
O(x) = ∫ dk Õ(k) exp(ik·x)    (2.15)

we can calculate the image-plane amplitude

A(x) = i ∫ dk ℰ(k) Õ(k) exp(ik·x)    (2.16)

by multiplying each Fourier component by an appropriate complex function ℰ(k), before recombining the individual components. A given component describes an electron wave propagating at a small angle to the system axis, a convenient description being one that uses an angle θ with two components, such that

θ = k/κ    (2.17)
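A minimal numerical sketch of the filtering operation implied by (2.15)–(2.16): transform the object wave, multiply each Fourier component by a complex transfer factor, and recombine. (The sketch is illustrative only; the trivial transfer function ℰ(k) = 1 stands in for the aberrated case treated below.)

```python
import numpy as np

def apply_transfer(O, E_of_k, dx=1.0):
    """Propagate an object wave O(x) through the system, Eq. (2.16):
    multiply each Fourier component O~(k) by the complex transfer
    factor E(k), then recombine.  O is a 2-D complex array."""
    n = O.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=dx)      # spatial frequencies
    kx, ky = np.meshgrid(k, k, indexing="ij")
    O_k = np.fft.fft2(O)                         # O~(k), inverse of Eq. (2.15)
    A_k = 1j * E_of_k(kx, ky) * O_k              # i * E(k) * O~(k)
    return np.fft.ifft2(A_k)                     # A(x), Eq. (2.16)

# With E(k) identically 1 (no aberrations), A(x) = i O(x):
O = np.random.rand(8, 8) + 0j
A = apply_transfer(O, lambda kx, ky: np.ones_like(kx))
assert np.allclose(A, 1j * O)
```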
In the absence of aberrations, this component wave would focus to a diffraction-limited “point” image in the back focal plane (z = 2F, where F is the focal length of the assumed thin objective lens; in a slightly more complete treatment, the object plane would be at a distance F preceding the first principal plane, and the back focal plane would be at a distance F following the second principal plane). The center of the point focus for the component k would then be at a transverse point

x_B = Fθ    (2.18)

It will naturally be convenient to think of k as a position in the back focal plane, although the precise focusing implied by (2.18) will be disrupted by the aberrations. The effect of axial aberration is now simply described by computing the amplitude that would be present at the back focal plane, in the absence of aberrations, and multiplying it by a complex function of position, which involves the aberrations. Thus, the image amplitude will be determined by (2.16) with

ℰ(k) = exp[iφ(k)]    (2.19)

and

φ(k) = (κ/2)(θ·C₁·θ + ½C₃|θ|⁴)    (2.20)

The scalar constant C₃ is simply the usual coefficient of primary spherical aberration (aperture defect), having a magnitude of the order of the focal length for an optimized objective lens. The quantity C₁ (actually a 2 × 2 symmetric matrix) describes the defocus and axial astigmatism. Thus, we have for the first term in the parentheses of Eq. (2.20)

θ·C₁·θ = C₁ₓₓθₓ² + 2C₁ₓᵧθₓθᵧ + C₁ᵧᵧθᵧ²    (2.21)

A more usual notation would be

C₁ₓₓ = C₁ + (A/2) cos 2α
C₁ₓᵧ = (A/2) sin 2α    (2.22)
C₁ᵧᵧ = C₁ − (A/2) cos 2α

where C₁ is now the scalar mean defocus, A is the astigmatism, and α is the rotation angle from the x axis to the principal axis of the astigmatism. The absolute sign of φ is not usually of great importance (although it does determine the sign of the contrast), but it is necessary to note that C₁ and C₃ will have the same sign for the case of overfocus. If it were not for the possible anisotropy of the illumination as described by Eq. (2.3), it would be natural to choose axes so that α would vanish. However, to allow for illumination anisotropy, it will be convenient to keep the form (2.3), which already selects an orientation and necessitates use of the more general form (2.22). In order to complete the treatment of bright field imaging of a weak object by fully coherent illumination, we must propagate the source wave through the system, add it to the image amplitude, and take the absolute square of the sum to obtain the image intensity. The source wave, from Eq. (2.12), has unit amplitude over the object plane, so that its Fourier transform is given by

(2π)⁻² ∫ dx exp(−ik·x) = δ(k)    (2.23)
The resulting image plane amplitude will then be

∫ dk ℰ(k) δ(k) exp(ik·x) = ℰ(0) = 1    (2.24)

corresponding to the fact that a unit amplitude wave propagates through the system of unit magnification as a unit amplitude wave. Finally, the image plane intensity can be written as

I(x) = |1 + A(x)|²

or

I(x) = 1 + A(x) + A*(x) + |A(x)|²
     = 1 + A(x) + A*(x) + quadratic terms    (2.25)
We will effectively neglect the quadratic terms in what follows, consistently with our assumption of a weak object, but some discussion will be given of the probable error thus incurred. It will be convenient at this point to obtain an expression for Õ(k) in terms of the actual object structure. We accordingly write for the electrostatic potential of the sample

V(x, z) = Σₙ Vₙ(x − xₙ, z)    (2.26)

where Vₙ is the potential of the nth atom, at transverse position xₙ and centered on the plane z = 0. We now have, from (2.8) and (2.14),

O(x) = (e/ℏcβ) Σₙ ∫₋∞^∞ dz Vₙ(x − xₙ, z)    (2.27)
and from the inverse of (2.15)

Õ(k) = (2π)⁻² ∫ dx O(x) exp(−ik·x)
     = (e/ℏcβ)(2π)⁻² Σₙ exp(−ik·xₙ) ∫ dx ∫₋∞^∞ dz Vₙ(x, z) exp(−ik·x)    (2.28)

The second line of (2.28) is obtained by an obvious origin shift, with the assumption of spherical symmetry for each Vₙ about its nucleus. Finally, we use the definition of the amplitude for electron scattering from an atom, in Born approximation,

F(k) = (2me/ℏ²) ∫₀^∞ r² dr V(r) [sin(kr)/(kr)]    (2.29)

to obtain

Õ(k) = (ℏ/mcβ)(2π)⁻¹ Σₙ Fₙ(k) exp(−ik·xₙ)    (2.30)

We now combine (2.16), (2.19), and (2.25) to obtain

I(x) = ∫ dk Ĩ(k) exp(ik·x)    (2.31)

= i ∫ dk ℰ(k)Õ(k) exp(ik·x) − i ∫ dk ℰ*(k)Õ*(k) exp(−ik·x) + quadratic terms    (2.32)

(the unit background contributes only to the k = 0 component and is omitted here).
Equation (2.28) defines Õ(k) as the Fourier transform of a real function of x, so that

Õ*(k) = Õ(−k)    (2.33)

and a reversal in sign of k allows the first two terms of (2.32) to be combined. Thus

I(x) = i ∫ dk [ℰ(k) − ℰ*(−k)]Õ(k) exp(ik·x) + quadratic terms    (2.34)

which, by use of (2.19) and (2.20), becomes

I(x) = ∫ dk i{exp[iφ(k)] − exp[−iφ(k)]}Õ(k) exp(ik·x) + quadratic terms    (2.35)
We neglect the quadratic terms for the time being, and obtain

Ĩ(k) = P(k)·Õ(k)    (2.36)

where

P(k) = −2 sin φ(k)    (2.37)
which nearly completes the derivation of the conventional bright field imaging formula, with neglect of all quadratic terms. We can allow for sample thickness by taking zₙ as the axial displacement of the nth atom from the mean sample position (defocus C₁). An effective axial resolution can be defined as the axial displacement δz which causes the phase factor exp(iδz·k²/2κ) to be −1, for the largest value of k of interest. If δ is the spatial resolution required, then

k ≤ π/δ = k_max    (2.38)

and

δz = 2πκ/k²_max = 4δ²/λ    (2.39)

For δ = 1 Å and λ = 0.037 Å (100 keV), we obtain δz ≈ 100 Å. It appears quite feasible to handle thicker samples by use of sample tilt, but we will not attempt to so burden the present discussion.
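The central result of this section, the phase-contrast transfer function P(k) = −2 sin φ(k) of Eqs. (2.20)–(2.22) and (2.37), is easily evaluated numerically. The following sketch is an illustration only; the default parameter values (C₁ = 700 Å, C₃ = 1 mm, λ = 0.037 Å) are the 100-keV numbers used later in the text, and the astigmatism defaults to zero.

```python
import numpy as np

# Bright field transfer function P(k) = -2 sin phi(k), Eqs. (2.20)-(2.22)
# and (2.37).  Illustrative sketch; parameter values are not prescriptive.
def ctf(kx, ky, C1=700.0, C3=1.0e7, A=0.0, alpha=0.0, lam=0.037):
    """kx, ky in 1/Angstrom; C1 (defocus), C3, A in Angstrom."""
    kappa = 2 * np.pi / lam                    # incident wavenumber
    thx, thy = kx / kappa, ky / kappa          # theta = k/kappa, Eq. (2.17)
    # defocus/astigmatism matrix, Eq. (2.22)
    c_xx = C1 + 0.5 * A * np.cos(2 * alpha)
    c_xy = 0.5 * A * np.sin(2 * alpha)
    c_yy = C1 - 0.5 * A * np.cos(2 * alpha)
    quad = c_xx * thx**2 + 2 * c_xy * thx * thy + c_yy * thy**2  # Eq. (2.21)
    th2 = thx**2 + thy**2
    phi = 0.5 * kappa * (quad + 0.5 * C3 * th2**2)  # Eq. (2.20)
    return -2.0 * np.sin(phi)                       # Eq. (2.37)

# No contrast transfer at zero spatial frequency:
assert ctf(0.0, 0.0) == 0.0
```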
III. EFFECT OF PARTIAL COHERENCE
Partial coherence of the incident electron beam is easily handled in the approximation used in the previous section. The essential idea, as used by Hanszen and Trepte (1971), by Frank (1973), and by the author (Welton, 1971b), is to calculate the sum of the image intensities due to each energy and direction of illumination, the distribution of these quantities being given by Eq. (2.3), for the case of a field emission source without condenser aperture, which is the case to be considered here. The case of a single electron of the ensemble now representing the beam is handled by consideration of the more general incident wave function (2.1). For the case of approximately axial illumination, the illumination angle δθ will be very small, and the fractional momentum deviation will be smaller still. Under these conditions, two potentially troublesome complications are avoided. First, the perspective alteration as the beam direction varies will be so small that for a reasonable size field (order of 1000 Å) the resulting shifts in image position will be undetectably small. Second, the magnification variation as the beam energy varies will again produce undetectably small image plane shifts. These two observations
make it possible to use a single object function for all electron energies and directions present in the illuminating beam. More precisely, Eq. (2.5) now becomes

exp(iκz + iζz + iξ·x) → exp(iκz + iζz + iξ·x) exp[i(e/ℏcβ) ∫₋∞^∞ dz′ V(x, z′)]    (3.1)

The subsequent argument is slightly altered by the presence of the factor exp(iξ·x) in O(x). It is easily seen that Eq. (2.16) must be altered to allow for the modified propagation through the optical system. Thus, to replace (2.16), we have

A(x) = i ∫ dk ℰ(k + ξ, κ + ζ) Õ(k) exp(ik·x) exp(iξ·x)    (3.2)
The indicated modification of the transmission factor ℰ leads to a rather complex form when taken literally, but some neglections are very much in order. We neglect the change in C₃ with electron energy and assume that

C₁ ≪ p(∂F/∂p)    (3.3)

where F is the objective focal length. This condition is very nearly equivalent to

C₁ ≪ F    (3.4)

which is clearly extremely well satisfied. Finally, we ignore all terms in φ which are quadratic in ξ and ζ. The result for φ is

φ(k + ξ, κ + ζ) = (κ/2)[κ⁻²k·C₁·k + (κ⁻⁴/2)C₃|k|⁴]
  + [κ⁻¹k·C₁·ξ + κ⁻³C₃|k|²(k·ξ)] + (|k|²/2κ²)ζC_c    (3.5)
The contribution to the image intensity I(x) is altered slightly from Eq. (2.25), since the source wave in the image plane now becomes

ℰ(ξ, κ + ζ) exp(iξ·x)    (3.6)

where ℰ(ξ, κ + ζ) can be replaced by unity, in view of Eq. (3.4). Finally, Eq. (2.35) becomes, with neglect of the quadratic terms,

I(x) = ∫ dk i{exp[iφ(k + ξ, κ + ζ)] − exp[−iφ(−k + ξ, κ + ζ)]} Õ(k) exp(ik·x)    (3.7)
There remains only to average over ξ and ζ, using Eq. (2.3). We use the lemma

⟨exp(iαs)⟩ = [∫₋∞^∞ ds exp(iαs) exp(−β²s²/2)] / [∫₋∞^∞ ds exp(−β²s²/2)] = exp(−α²/2β²)    (3.8)
which is simply obtained by completing the square in the exponent of the integrand of the numerator and deforming the contour in a clearly legal fashion. We thus obtain

⟨I(x)⟩ = ∫ dk Ĩ(k) exp(ik·x)    (3.9)

where

Ĩ(k) = T(k)·Õ(k)    (3.10)

with

T(k) = −2 exp[−E(k)] sin φ(k)    (3.11)

and

E(k) = (⟨ξₓ²⟩/2)[κ⁻¹(C₁·k)ₓ + κ⁻³C₃|k|²kₓ]²
     + (⟨ξᵧ²⟩/2)[κ⁻¹(C₁·k)ᵧ + κ⁻³C₃|k|²kᵧ]²
     + (⟨ζ²⟩/2)(C_c|k|²/2κ²)²    (3.12)
The notation (C₁·k)ₓ, of course, signifies

(C₁·k)ₓ = C₁ₓₓkₓ + C₁ₓᵧkᵧ = [C₁ + (A/2) cos 2α]kₓ + [(A/2) sin 2α]kᵧ    (3.13)
and similarly for (C₁·k)ᵧ. The case of isotropic illumination, with negligible astigmatism, is of particular interest as a limiting case, although the more complete formalism is needed in practical calculations. We then obtain

E(k) = (⟨ξ²⟩/2κ²)|k|²(C₁ + C₃|k|²/κ²)² + (⟨ζ²⟩/2)(C_c|k|²/2κ²)²    (3.14)
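The damped transfer function (3.11), in the isotropic limit (3.14), can be sketched numerically as follows. The variance parameters sig_xi2 (angular spread) and sig_zeta2 (momentum spread) are free illustrative values chosen here, not quantities from the text.

```python
import numpy as np

# Partially coherent transfer function T(k) = -2 exp(-E(k)) sin(phi(k)),
# Eq. (3.11), with the isotropic, astigmatism-free envelope of Eq. (3.14).
def transfer(k, C1, C3, Cc, lam, sig_xi2, sig_zeta2):
    """k: radial spatial frequency (1/Angstrom); lengths in Angstrom."""
    kappa = 2 * np.pi / lam
    phi = 0.5 * kappa * (C1 * (k / kappa) ** 2
                         + 0.5 * C3 * (k / kappa) ** 4)        # Eq. (2.20)
    E = (sig_xi2 / (2 * kappa**2)) * k**2 * (C1 + C3 * k**2 / kappa**2) ** 2 \
        + (sig_zeta2 / 2) * (Cc * k**2 / (2 * kappa**2)) ** 2  # Eq. (3.14)
    return -2.0 * np.exp(-E) * np.sin(phi)                     # Eq. (3.11)

k = np.linspace(0.0, 3.0, 301)
T = transfer(k, C1=1650.0, C3=1.0e7, Cc=1.0e7, lam=0.037,
             sig_xi2=1e-5, sig_zeta2=3e-8)
# The envelope exp(-E) only damps the sinusoid: |T| <= 2 everywhere,
# and T vanishes at zero spatial frequency.
assert np.all(np.abs(T) <= 2.0) and T[0] == 0.0
```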
As will be shown, the form (3.11) will be extremely useful as a theoretical basis for the problem of object reconstruction. The factored form with an
envelope function exp(−E) multiplying a simple sinusoid follows simply from exclusion of quadratic terms in the deviation quantities from the exponent of ℰ(k). Although we shall use the simple form (3.11) in the numerical work to follow, it should be emphasized that the factored form is actually more general than the derivation here given. It was in fact shown by Frank (1973) that a relaxation of the requirement that φ(k) be precisely the function appropriate to the fully coherent case will allow the form (3.11) to be retained. This result will be seen to allow retention of the bulk of the formalism to be displayed herein. A word concerning the quadratic term in I(x) will be useful. The ξ, ζ averaging which has been carried out for the linear terms leads to no particular simplification for the quadratic term. As a plausible empirical method for estimating the importance of this term, we adopt the procedure of replacing the proper average of the term A*(x)A(x) by the absolute square of the average of A(x). Thus, we assume

⟨I(x)⟩ = ∫ dk T(k) Õ(k) exp(ik·x) + ⟨A*(x)⟩·⟨A(x)⟩    (3.15)

where

⟨A(x)⟩ = i ∫ dk exp[−E(k)] exp[iφ(k)] Õ(k) exp(ik·x)    (3.16)
We may anticipate that the quadratic term of Eq. (3.15) will be an underestimate of the true term, but that the order of magnitude will be correct.

IV. STATISTICAL ERROR

In the absence of an optimum coding scheme, any recorded information, such as a micrograph, must be more or less corrupted by noise, or statistical error of some sort. In the case of an electron micrograph, the number of electrons impinging on a unit area of the emulsion will of necessity be finite, and only its expectation value is related to the object structure. At this point, it will be extremely convenient to modify our formalism so that it is oriented to practical computations. We note that we cannot deal with an image in complete detail, but must rather work with a two-dimensional table of sampled image values from a set of picture elements (“pixels”). A given image value will normally be the average of the actual image density over the pixel, and for not too heavily exposed emulsions, this average will be nearly proportional to the actual number of electrons incident per unit area. We now assume that the object function O(x) is periodic in the x plane, and in fact repeats when x or y are increased by an amount S. Such a repeating object will be expected to yield an image structure with one of its
unit cells, sensibly identical to that which would obtain if only a single unit (S × S) of the object were present. This assumption in fact constitutes a potentially important limitation on the smallness of the sample area that can be practically handled, and every case must be examined individually to determine the width of margin about the edge of the sample area which is seriously contaminated by nonexisting information from the surrounding sample. It can in fact be seen (Harada et al., 1974) that the transverse coherence length is a plausible measure of the width of the defective margin. In addition to this artificial periodicity, we assume that the domain of spatial frequency k required for imaging is finite, and we are thus immediately led to the representation of O(x) and I(x) by a finite Fourier series. Thus, we make the following assumptions

O(x) = Σₖ Õ(k) exp(ik·x)    (4.1)

I(x) = Σₖ Ĩ(k) exp(ik·x)    (4.2)
with the obvious inverses

Õ(k) = N⁻² Σₓ O(x) exp(−ik·x)    (4.3)

Ĩ(k) = N⁻² Σₓ I(x) exp(−ik·x)    (4.4)
We take the dimension of the unit object cell S to equal NQ, where Q is the chosen pixel dimension. The discrete x values are then given by

x = (x, y) = (mQ, nQ)    (4.5)

with m, n = 0, 1, …, N − 1. The discrete k values are correspondingly given by

k = (kₓ, kᵧ) = (k·2π/S, l·2π/S)    (4.6)

with k, l = 0, 1, …, N − 1. Mention must be made at this point of the phenomenon of “aliasing,” by which is signified the equivalence of half the range of positive spatial frequencies to negative spatial frequencies. Thus, for the spatial points defined by (4.5),
exp(ikₓ·x) = exp(ik·(2π/S)·mQ) = exp(2πik·m/N)
           = exp(−2πiN·m/N)·exp(2πik·m/N)
           = exp[−2πi(N − k)·m/N]    (4.7)
We therefore think of values of k in the range

0 ≤ k ≤ (N/2) − 1    (4.8)

as corresponding to positive kₓ, and values in the range

N/2 ≤ k ≤ N − 1    (4.9)

as corresponding to negative kₓ. The precise correspondence for this latter range is obviously

kₓ = −(N − k)·2π/S    (4.10)

The limiting value k = N/2 yields

kₓ = −N·π/S = −π/Q    (4.11)

which can be equally well thought of as a positive kₓ value, since

exp(i(π/Q)·mQ) = exp(imπ)    (4.12)

which simply alternates between +1 and −1 as we pass from a pixel to its neighbor. Note that, if k_max is the maximum spatial frequency present in an image, then a natural definition of the spatial resolution available is

δ = Q = π/k_max    (4.13)
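The aliasing bookkeeping of Eqs. (4.7)–(4.10) is exactly the index-to-frequency mapping used by standard FFT libraries, as the following sketch confirms (an illustration; numpy's fftfreq performs the same mapping).

```python
import numpy as np

# Eqs. (4.8)-(4.10): a DFT index k in 0..N-1 maps to a signed spatial
# frequency k_x = 2*pi*k/S for k <= N/2 - 1, and to
# k_x = -(N - k)*2*pi/S otherwise.
def signed_freq(k, N, S):
    if k <= N // 2 - 1:
        return 2 * np.pi * k / S        # positive branch, Eq. (4.8)
    return -(N - k) * 2 * np.pi / S     # negative branch, Eq. (4.10)

N, Q = 8, 1.0
S = N * Q                               # unit-cell dimension, S = N*Q
ours = [signed_freq(k, N, S) for k in range(N)]
numpy_version = 2 * np.pi * np.fft.fftfreq(N, d=Q)
assert np.allclose(ours, numpy_version)
```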
We can now introduce noise in a natural way. Let N_e be the number of electrons incident per unit area on the sample, so that N_eQ² is the expected number incident on a given pixel. We designate by ⟨I(x)⟩ the image intensity calculated in the previous section, since it is just the expected intensity distribution, averaging over statistical error. The expected number of electrons striking a given pixel of the image plane will then be

N_eQ²[1 + ⟨I(x)⟩]    (4.14)

The actual number detected will differ randomly from (4.14), with Poisson distribution and standard deviation

δN = (N_eQ²)^{1/2}[1 + ⟨I(x)⟩]^{1/2}    (4.15)

The distribution can be taken as gaussian, to high accuracy, and we accordingly write for the observed number of electrons per pixel

N_eQ²[1 + ⟨I(x)⟩][1 + R(x)]    (4.16)

where R(x) is normally distributed with

⟨R⟩ = 0,  ⟨R²⟩ = (N_eQ²)⁻¹[1 + ⟨I(x)⟩]⁻¹    (4.17)
If we now divide by N_eQ², we obtain

1 + I(x) = [1 + ⟨I(x)⟩][1 + R(x)]    (4.18)

where I(x) is the measured image intensity distribution. At this point, the special convenience of bright field imaging becomes apparent. We assume ⟨I(x)⟩ ≪ 1 and N_eQ² ≫ 1, with two important consequences. First, the noise level becomes independent of position, since we now have

⟨R²⟩ = (N_eQ²)⁻¹    (4.19)

and second, the noise becomes additive, since the cross product of I(x) and R(x) in (4.18) can be ignored. We finally obtain

I(x) = ⟨I(x)⟩ + R(x)    (4.20)
which form will allow some interesting manipulation. It is clearly of importance to obtain an estimate of the error incurred by the use of Eq. (4.20); this is done analytically in Appendix A, and computational evidence on this question will be found in Section VII. At any rate, further progress is extraordinarily difficult without the above assumption, either taken literally, or used as the starting point for a sequence of successive approximations. Allowance can easily be made for the imperfection of the image plane detector by dividing the right-hand side of Eq. (4.19) by the detective quantum efficiency (DQE) of the detector (approximately 0.7 for good quality electron emulsions), so that the variance of R now refers to the statistics of silver halide grain development. We now introduce a concept that will be central in the following work. Write

⟨R(x)R(x′)⟩ = C(|x − x′|)    (4.21)

where C will be referred to as the autocorrelation coefficient for the error function R(x). The indicated average can be defined in several ways, and these will be assumed to be substantially equivalent. The first definition considers a single micrograph and averages over position in it. Thus

⟨R(x)R(x′)⟩ = A⁻¹ ∫_A dS R(x + S)R(x′ + S)    (4.22)

where A is the area chosen for averaging. The second definition considers a large collection (ensemble) of micrographs, identical save for statistical error, and the average is now the average over this ensemble. The connection between the two definitions lies in the fact that many subareas of a single micrograph can be thought of as the members of a small ensemble. The form chosen for C, as depending only on the magnitude of the displacement between the two points in question, reflects the fundamentally satisfying
assumption that the statistical error is somehow independent of position and orientation in the micrograph. A convenient and plausible further assumption is that the error in one pixel is uncorrelated with that in any other pixel. This assumption can, in principle, fail if electron scattering in the detector allows a single electron to cause response (e.g., expose silver halide grains) in two adjacent pixels. In practice, the author has never seen evidence of a requirement for this degree of generality, and we accordingly assume

C(|x − x′|) = (N_eQ²)⁻¹ δ(x − x′)    (4.23)

where discrete values have been assumed for x and x′ and δ(x − x′) is the Kronecker symbol, defined by

δ(x − x′) = 1,  x = x′
δ(x − x′) = 0,  x ≠ x′    (4.24)
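The statistical model of Eqs. (4.16)–(4.23) is easily checked by simulation: Poisson counts of mean N_eQ² per pixel give a relative error R(x) of variance (N_eQ²)⁻¹, uncorrelated between pixels. A Monte-Carlo sketch (values illustrative only):

```python
import numpy as np

# Per-pixel Poisson statistics of Section IV, for a weak object (<I> ~ 0).
rng = np.random.default_rng(0)
Ne, Q = 500.0, 1.0                 # electrons per unit area, pixel size
mean = Ne * Q**2                   # expected count per pixel, Eq. (4.14)
counts = rng.poisson(mean, size=(256, 256))
R = counts / mean - 1.0            # relative error R(x), Eq. (4.16)

var_R = R.var()
assert abs(var_R - 1.0 / mean) < 0.1 / mean   # <R^2> = (Ne Q^2)^-1, Eq. (4.19)

# neighboring pixels are uncorrelated, Eq. (4.23):
corr = np.mean(R[:, :-1] * R[:, 1:])
assert abs(corr) < 3.0 / mean
```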
The fundamental imaging equation (4.20) lends itself beautifully to a treatment in Fourier space. We have

I(x) = Σₖ Ĩ(k) exp(ik·x)    (4.25)

⟨I(x)⟩ = Σₖ T(k)·Õ(k) exp(ik·x)    (4.26)

R(x) = Σₖ W(k) exp(ik·x)    (4.27)

where (4.25) is a definition [I(x) being the measured image plane intensity distribution], where (4.26) is just (3.9) and (3.10) rewritten, and where (4.27) is also a definition. These three equations can be immediately combined to yield

Ĩ(k) = T(k)·Õ(k) + W(k)    (4.28)
and the statistical properties of W(k) are easily deducible from those of R(x). Thus

C(|x − x′|) = ⟨Σ_{k,k′} W(k) exp(ik·x) W*(k′) exp(−ik′·x′)⟩
            = Σ_{k,k′} ⟨W(k)W*(k′)⟩ exp[i(k·x − k′·x′)]    (4.29)

We have simply inserted the definition (4.27) in Eq. (4.21), using R*(x′) instead of R(x′) for convenience (R is real, in any event). It is clear that Eq. (4.29) can be obeyed only if

⟨W(k)W*(k′)⟩ = 𝒩(|k|) δ(k − k′)    (4.30)
in which case, Eq. (4.29) becomes

C(|x − x′|) = Σₖ 𝒩(|k|) exp[ik·(x − x′)]

or

𝒩(k) = N⁻² Σₓ C(x) exp(−ik·x)    (4.31)

Finally, by use of Eq. (4.23), we obtain

𝒩(k) = (N_e N²Q²)⁻¹    (4.32)
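A numerical sketch of Eq. (4.32): uncorrelated pixel noise of variance (N_eQ²)⁻¹ has a flat (“white”) spectrum of value (N_eN²Q²)⁻¹ under the transform convention (4.27). (Illustrative values throughout.)

```python
import numpy as np

rng = np.random.default_rng(1)
N, Ne, Q = 64, 500.0, 1.0
R = rng.normal(0.0, (Ne * Q**2) ** -0.5, size=(N, N))  # R(x), Eq. (4.19)
W = np.fft.fft2(R) / N**2                  # W(k) with the N^-2 convention
power = np.abs(W) ** 2

expected = 1.0 / (Ne * N**2 * Q**2)        # flat Wiener spectrum, Eq. (4.32)
# the average over all k of |W(k)|^2 reproduces the flat spectrum:
assert abs(power.mean() - expected) < 0.1 * expected
```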
The usual terminology at this point would refer to 𝒩(k) as the power spectrum (electrical analogy in the time domain) or Wiener spectrum of the function R(x). Finally, we note that a normal, possibly spatially correlated distribution for the values of R(x) will imply a normal distribution for the values of W(k), uncorrelated in the k domain.

V. OBJECT RECONSTRUCTION
We are now prepared to attack the problem of finding a suitable algorithm for the extraction of object structure information from a measured noisy image. This process is frequently referred to as image enhancement, but the name “object reconstruction” will be used in the following. Such reconstruction will of necessity be incomplete, but we propose to develop simple techniques for evaluating the degree of reconstruction that should be possible, and then processing the micrograph as simply as possible to achieve something like optimum reconstruction. The measured image intensity contains sample information which has been degraded in two distinct ways. By reference to Eq. (4.28), we see that T(k), the modulation transfer function (MTF) of the microscope, will act to make the image function differ substantially from the object function. In fact, T(k) is simply the Fourier transform of the point spread function P(x), which the optical system (in the absence of noise) convolutes with O(x) to yield I(x). In a conventional optical system T(k) will be slowly varying and of one sign up to some value k = k_max. Beyond k = k_max, T(k) will drop rapidly to zero, with or without rapid oscillation, usually with the assistance of an aperture stop in the back focal plane. The normal result of this behavior for T(k) is that each point feature of O(x) produces a somewhat diffuse feature with a radial extent of approximately π/k_max. In the conventional electron microscope, the necessarily nonvanishing aperture defect [C₃, from Eq. (2.20)], in conjunction with the defocus (C₁), will cause such a rapid oscillation of T(k), and the rapid increase of the envelope function E(k), usually because of beam energy spread [as conclusively shown by Frank (1975)], will act to cut off contributions to Ĩ(k). An objective aperture is normally used to prevent confusion of the image by successive sign reversals of T(k), but we will assume the aperture to be absent. This omission has no serious effect on the image quality; it simplifies the theory and can easily be inserted in the course of computer processing. In Fig. 1, the oscillatory curves are essentially the function

F²(k) exp[−2E(k)] sin² φ(k)    (5.1)

with a small added constant, irrelevant to the present argument. The variation of F²(k), as indicated, is not important to the argument.

FIG. 1. Theoretical diffractogram densities for the conventional and high-coherence cases.

The dashed curve (labeled “conventional”) and the solid curve (labeled “high coherence”) are both calculated for microscopes having

W = beam energy = 100 keV
F = objective focal length = 2 mm
C₃ = spherical aberration coefficient = 1 mm
C_c = (p/2)(∂F/∂p) = chromatic aberration coefficient = 1 mm
The conventional microscope is further characterized by

C₁ = 700 Å (underfocus)
δp/p (rms) = 6.4 × 10⁻⁶
δx = transverse coherence length = 20 Å

while the high-coherence microscope has

C₁ = 1650 Å (underfocus)
δp/p (rms) = 1.0 × 10⁻⁶
δx = 200 Å

At a later stage in the argument, it will be necessary to define the sample and the conditions of exposure. The basic sample is an amorphous carbon film of thickness 10 Å, with N_e = 500 electrons/Å². This basic sample will be assumed, without further adornment, unless otherwise specified. The conventional curve in Fig. 1 is characterized by a single dominant peak, with a following train of small further peaks (invisible on this scale, except for the first such). The choice of C₁ (the precise criterion will be discussed shortly) is qualitatively such as to extend the large peak to as high a value of k as possible, subject to the requirement that the peak also remains “high” as “long as possible” (with these meanings to be clarified). This last requirement tends to be thwarted by the chromatic aberration contribution to E(k), the relevant term being

E_chr(k) ∝ (δp/p)² (C_c|k|²/2κ)²    (5.2)

The high-coherence curve of Fig. 1, on the other hand, is permitted by the lesser value of δp/p [which yields 2E_chr(k) = (k/4.129)⁴] to display a train of peaks still of substantial importance at k = 3.0 Å⁻¹. It will subsequently appear that the integral

∫₀^{k_max} k dk exp(−2E) sin² φ    (5.3)
is of prime importance in defining image quality, so that the maintenance of substantial values for the critical envelope function exp(−2E) over the widest possible range is extremely desirable. We now proceed to derive a suitable expression for the image quality, and in so doing, we shall have found the essentials of an interesting reconstruction procedure. Consider again the fundamental imaging equation (4.28). With the assumption of normal distribution for the noise function W(k), we can write an expression for the probability distribution of the function Ĩ(k) about its expectation T(k)Õ(k). We use an obvious bracket
notation for the probability, with specified conditions first and result second, thus

{Õ(k) | Ĩ(k)} = Z⁻¹ exp[−Σₖ |Ĩ(k) − T(k)Õ(k)|² / 2𝒩(|k|)]    (5.4)

where Z is a suitable normalization constant. The computation of Z yields useful practice with this formalism. First, note that the sum over k includes each term twice, because of the reality condition

Ĩ*(k) = Ĩ(−k)    (5.5)

We consider a single k vector only, and write

Ĩ(k) − T(k)Õ(k) = W(k) ≡ x + iy    (5.6)

The full normalization constant Z is clearly a simple product of such constants, one for each k vector, thus

Z = ∏ₖ z(k)    (5.7)
where

z(k) = ∫₋∞^∞ dx ∫₋∞^∞ dy exp[−|x + iy|²/𝒩(k)]
     = 2π ∫₀^∞ r dr exp[−r²/𝒩(k)]
     = π𝒩(k)    (5.8)
We now verify the form (5.4) by calculating the variance of W(k). Thus

⟨|W(k)|²⟩ = ⟨x² + y²⟩ = ⟨r²⟩
          = [2π ∫₀^∞ r³ dr exp(−r²/𝒩(k))] / [2π ∫₀^∞ r dr exp(−r²/𝒩(k))]
          = 𝒩(k)    (5.9)
Equation (5.9) agrees precisely with Eq. (4.30) if we note that the distribution (5.4) has no correlation between the variables for different k vectors.
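The Gaussian bookkeeping of (5.8)–(5.9) can be verified by sampling: for a complex W = x + iy with density proportional to exp(−|W|²/𝒩), each quadrature carries variance 𝒩/2 and the second moment ⟨|W|²⟩ equals 𝒩. A Monte-Carlo sketch with an arbitrary illustrative value of 𝒩(k):

```python
import numpy as np

rng = np.random.default_rng(2)
Nk = 0.004                                  # noise spectrum value N(k)
x = rng.normal(0.0, np.sqrt(Nk / 2), 200_000)
y = rng.normal(0.0, np.sqrt(Nk / 2), 200_000)
W = x + 1j * y

second_moment = np.mean(np.abs(W) ** 2)
assert abs(second_moment - Nk) < 0.02 * Nk  # <|W|^2> = N(k), Eq. (5.9)
```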
The solution of the reconstruction problem requires a probability distribution of a type different from that of Eq. (5.4). We need {Ĩ(k) | Õ(k)}, i.e., the probability that the observed Ĩ(k) implies a given Õ(k). Elementary intuition requires a close connection with (5.4), since a narrow distribution of Ĩ − T·Õ would seem to imply a close relationship of the implied Õ(k) to the measured Ĩ(k). At this point we use the fundamental theorem of Bayes, concerning inverse probability. Thus

{Ĩ(k) | Õ(k)} = Ẑ⁻¹ {Õ(k)}·{Õ(k) | Ĩ(k)}    (5.10)

where

Ẑ = ∫ dÕ·{Õ(k)}·{Õ(k) | Ĩ(k)}    (5.11)
The integration is over all possible values of all the Õ(k), and is hopeless from the viewpoint of practical computation unless the integrand is of very simple form. The functional {Õ} is a somewhat shadowy quantity whose meaning we now attempt to make clear. It will be called (after Bayes) the prior probability of the specified object structure Õ(k). We may consider that the procedures used for preparing the samples define an average number of molecular species of various allowed types to be expected on a typical sample of the set. The locations and orientations of these species in a particular sample of the set cannot be known in advance, and it is the task of microscopy to attempt to define these parameters. We can make a simple statement about the statistics of the object set, by considering the quantity

⟨Õ(k)Õ*(k′)⟩    (5.12)

Consideration of the spatial isotropy and homogeneity of the object set requires

⟨Õ(k)Õ*(k′)⟩ = 𝒮(k) δ(k − k′)    (5.13)
where the angular brackets now imply an average over the object set. It is usually trivial to subtract off ⟨Õ(k)⟩ from Õ(k) itself, so that without loss of generality, we can take the distribution of Õ(k) to have zero mean. In some hypothetical simple cases, the absence of correlation implied by (5.13) may actually persist for higher moments of Õ, but in practical cases the simplicity of an uncorrelated normal distribution, which obtained for W(k), is not to be expected for Õ(k). We nevertheless proceed by assuming the simple normal distribution to hold. Thus, we propose to write

{Õ(k)} ∝ exp[−Σₖ |Õ(k)|²/2𝒮(k)]    (5.14)
with an obvious normalization constant required to convert the proportionality to an equality. We now take seriously the relation (5.10), with insertion of (5.4) and (5.14). The result can be beautifully simplified because of the several normal distributions which are being compounded. Thus

{Ĩ(k) | Õ(k)} ∝ exp[−Σₖ (|Õ(k)|²/2𝒮(k) + |Ĩ(k) − T(k)Õ(k)|²/2𝒩(|k|))]    (5.15)

with a suitable normalization again required. Any function of the Ĩ(k) alone can be factored out, to be absorbed in the normalization, and the essential Õ dependence will be left as
{Ĩ(k) | Õ(k)} ∝ exp[−U/2]    (5.16)

with

U = Σₖ {(𝒮⁻¹ + 𝒩⁻¹T²)|Õ|² − 𝒩⁻¹T(Ĩ*·Õ + Ĩ·Õ*)}
  = Σₖ (𝒮⁻¹ + 𝒩⁻¹T²)[|Õ|² − (𝒮⁻¹ + 𝒩⁻¹T²)⁻¹𝒩⁻¹T(Ĩ*·Õ + Ĩ·Õ*)]
  = Σₖ (|Õ − ⟨Õ⟩|²/Δ²) − (|⟨Õ⟩|²/Δ²)    (5.17)
The vector k appears as the argument of every quantity in Eq. (5.17) and is therefore conveniently omitted. The quantities ⟨Õ⟩ and Δ are given by

⟨Õ⟩ = T𝒩⁻¹Ĩ(𝒮⁻¹ + T²𝒩⁻¹)⁻¹ = T𝒮Ĩ/(𝒩 + T²𝒮)    (5.18)

Δ = (𝒮⁻¹ + T²𝒩⁻¹)^{−1/2}    (5.19)
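Equation (5.18) is a Wiener-type filter: where noise is negligible it approaches the naive inverse filter, and near zeros of T(k) it is damped toward zero rather than diverging. A sketch with illustrative one-dimensional arrays:

```python
import numpy as np

def reconstruct(I_k, T_k, S_k, N_k):
    """Most probable object estimate <O~(k)> of Eq. (5.18), elementwise:
    T S I~ / (N + T^2 S)."""
    return T_k * S_k * I_k / (N_k + T_k**2 * S_k)

T_k = np.array([1.5, 0.5, 1e-6])      # transfer fn, last value near a zero
S_k = np.ones(3)                      # object power spectrum (prior)
N_k = np.full(3, 1e-4)                # noise power spectrum
I_k = T_k * 1.0                       # noiseless image of O~ = 1

O_est = reconstruct(I_k, T_k, S_k, N_k)
# good transfer: estimate ~ O~ = 1; dead zone of T: estimate ~ 0
assert abs(O_est[0] - 1.0) < 1e-3 and abs(O_est[1] - 1.0) < 1e-3
assert abs(O_est[2]) < 0.01
```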
The term in |⟨Õ⟩|², left in U after completing the square, is a function of Ĩ only and can be dropped by absorbing it in the overall normalization constant. The drastic simplifications thus far introduced have clearly led to some simple results; their utility is yet to be determined. We note that ⟨Õ(k)⟩, from (5.18), is a mathematically simple estimate of the most probable object function following from a specified image, while Δ is an equally simple expression for the rms error present in this estimation. Both forms are plausible. Consider Eq. (5.18) for a range of k such that the statistical error 𝒩 is very small. We then obtain

⟨Õ(k)⟩ ≈ T⁻¹(k)·Ĩ(k)    (5.20)

which corresponds to a naive attempt to compensate for the attenuation of the various Fourier components introduced by the imperfections of the optical system. Such an attempt must fail and is in fact always frustrated by
noise. We rewrite Eq. (5.18) as

⟨Õ⟩ = [1 + (𝒩/𝒮T²)]⁻¹ T⁻¹ Ĩ
(5.21)
where the ratio

𝒩/𝒮T²    (5.22)

plays the role of a dimensionless noise figure (noise/signal), which becomes overwhelmingly large as we approach any zero of T(k). The rms object spread Δ correspondingly approaches 𝒮^{1/2}, as it should, when the noise figure (5.22) becomes large. For values of the noise figure small compared with unity, on the other hand, the object spread Δ can be very much less than the limiting value 𝒮^{1/2}. These observations correspond nicely to our expectation that only where T(k) is sufficiently different from zero can any sharpening of our prior object distribution be expected. An extremely important concept, easily introduced at this point, is that of the informational content of a micrograph (Welton, 1969, 1971b). On this we follow the work of Fellgett and Linfoot (1955), who were first to apply the now standard ideas of information theory to optical images. We consider the normalization integrals required for the prior object distribution (Z₀) and that for the distribution which holds as a result of the micrograph. Thus
Z₀ = ∫ dÕ exp[−Σₖ |Õ(k)|²/2𝒮(k)]    (5.23)

and

Z = ∫ dÕ exp[−Σₖ |Õ(k) − ⟨Õ(k)⟩|²/2Δ²(k)]    (5.24)
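The Gaussian integrals (5.23)–(5.24) evaluate in closed form, and (anticipating (5.30) and (5.33) below) their log-ratio reduces to ½ Σₖ log[𝒮(k)/Δ²(k)]. A numerical sketch with illustrative spectra:

```python
import numpy as np

S = np.full(100, 1.0)                   # prior object power spectrum
T = np.exp(-np.linspace(0, 3, 100))     # transfer function magnitude
Nn = np.full(100, 0.01)                 # noise power spectrum
Delta2 = 1.0 / (1.0 / S + T**2 / Nn)    # square of Eq. (5.19)

dI = 0.5 * np.sum(np.log(S / Delta2))
dI_direct = 0.5 * np.sum(np.log1p(T**2 * S / Nn))
assert np.isclose(dI, dI_direct)        # the two forms of Eq. (5.33)
assert dI > 0.0                         # a micrograph never loses information
```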
The concept of information content of the micrograph is subject to a difficulty of the same sort encountered in giving a classical definition of the entropy, and we will here also content ourselves with computing the increase in sample information resulting from the micrograph. Thus, we write

I₀ = ∫ dÕ {Õ} log{Õ}    (5.25)

and analogously for I, using {Ĩ | Õ} for the distribution appearing in the integral. As in the usual statistical mechanical derivation of the entropy, we obtain

I₀ = −log Z₀ − Σₖ ⟨|Õ(k)|²⟩/2𝒮(k)    (5.26)

I = −log Z − Σₖ ⟨|Õ(k) − ⟨Õ(k)⟩|²⟩/2Δ²(k)    (5.27)
where the averages indicated on the right are easily done. The results are

I₀ = -log Z₀ - C   (5.28)

I = -log Z - C   (5.29)

where the constant C is infinite (or at least uncomfortably large) but is the same number for both distributions. Finally, we write

δI = I - I₀ = log(Z₀/Z)   (5.30)
for the information content of the micrograph and proceed to evaluate Z₀ and Z, using the result (5.7). We obtain

Z₀ = Π_k [πY(k)]^(1/2)   (5.31)

Z = Π_k [πΔ²(k)]^(1/2)   (5.32)

each of which is wildly infinite. We finally obtain for δI

δI = ½ Σ_k log[1 + P²(k)Y(k)/N(k)]   (5.33)

Note the use of the exponent ½ to allow use of an unrestricted sum over k values. Finally, we pass to an integral over k, so that

δI = (A/8π²) ∫ dk log[1 + P²(k)Y(k)/N(k)]   (5.34)

and we have a natural definition for information density (δI/A), A being the area of the micrograph. It is now of considerable interest to evaluate Y(k) for a simple class of object, in order to see a little more of the meaning to be attached to Eq. (5.34). Consider the object set to be that in which each micrograph, of area A, is known to contain a single atom of known species, but with completely uncertain location. We then ask for the positional accuracy which can be achieved by study of a single micrograph. Consider Eq. (2.30) for O(k) in terms of the electron scattering amplitude F(k) and the atomic position. An obvious modification is required to convert (2.30) to the form appropriate for the discrete Fourier series representation and we obtain

O(k) = (h/mcβ)A⁻¹ F(k) exp(-ik·x_a)   (5.35)
The average over the object set here means averaging over x_a, the position of the single unknown atom. Thus

Y(k) = A⁻¹ ∫_A dx_a (h/mcβ)²A⁻²F²(k) = (h/mcβ)²A⁻²F²(k)   (5.36)
Equation (5.34) now becomes, using Eq. (4.32) for N(k),

δI = (A/8π²) ∫ dk log[1 + N_e A⁻¹(h/mcβ)²F²(k)P²(k)]   (5.37)

This equation simplifies if we assume

A ≫ N_e(h/mcβ)²F²(0) ≈ 600 Ų   (5.38)

where N_e = 500 electrons/Ų, the beam energy is 100 keV, and F is taken to be that for mercury. The logarithm in the integrand of (5.37) can now be expanded, so that

δI = N_e(4π)⁻¹(h/mcβ)² ∫₀^∞ k dk F²(k)P²(k)   (5.39)

with the assumption that P depends only on k (= |k|). We now consider the information increase corresponding to localization of the atom within area δA. The probability of finding the atom within any cell of area δA is just δA/A. As a result of analysis of the micrograph, the probability becomes unity for the cell actually occupied, and zero for all other cells. The information change is then

δI = 1·log 1 + 0·log 0 + ⋯ + 0·log 0 - (A/δA)·(δA/A) log(δA/A)
   = log(A/δA)   (5.40)

where 0·log 0 = lim_{ε→0} ε log ε = 0, and the factor A/δA is simply the total number of cells over which the summation is to be performed. Finally, we define an effective accuracy of location δx = (δA)^(1/2), given by

δx = A^(1/2) exp(-δI/2)   (5.41)
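Equation (5.41) is easy to exercise numerically. The short sketch below is our own check, using the δI values that appear in Table I for a 512 × 512 Ų field; it essentially reproduces the tabulated accuracies:

```python
import math

def delta_x(delta_i, field_side=512.0):
    """Effective accuracy of location, Eq. (5.41):
    dx = A^(1/2) exp(-dI/2), with A^(1/2) = field_side in angstroms."""
    return field_side * math.exp(-delta_i / 2.0)

# delta-I values quoted in Table I (CONV and HC cases, t = 0, 5, 10 angstroms)
for di in (32.3, 63.9, 5.6, 12.1, 3.3, 7.0):
    print(f"dI = {di:5.1f}  ->  dx = {delta_x(di):.3g} angstroms")
```

The computed values agree with the last column of Table I to within rounding.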
Some typical numbers will be given for δx shortly, but we first wish to emphasize the utility of the simple expression (5.39) as a convenient measure of microscope performance. We should also emphasize that while the quality of conventional imaging depends on the range of k in which P(k) has no sign change, the criterion (5.39) has no such requirement. We should then think
of δx, as given above, as the true resolution parameter, capable of being realized by suitable processing of the micrograph, even though the image is badly blurred by sign changes in P(k). This informational approach to the definition of image quality was the original motivation for the work reported in Welton (1969). The recognition that image quality in a real sense does not directly depend on absence of aberration leads immediately to the question of how to extract in a useful fashion the full information content of a blurred micrograph. A parallel question was also immediately asked, namely, how to design a microscope with the best possible informational performance. The result of these considerations was the so-called high-coherence microscope (Worsham et al., 1972, 1973). These same considerations have clearly been explicitly or implicitly important in the work of Siegel (1971), and of Chiu and Glaeser (1977). The essential consideration in microscope design is the reduction of the effect on P(k) produced by instabilities and incoherence. Thus, the form of the envelope exponent E(k) given in (3.12) imposes the necessity for adequately small rms spread in focal length and small illumination angle. These matters have recently been carefully discussed by Chiu and Glaeser (1977). In view of their considerations, it would seem probable that the most important gain yet to be made may lie in the reduction of the chromatic aberration coefficient by use of the composite magnetic and electrostatic objective proposed by Rose (1971). The traditional effort to eliminate the primary spherical aberration now appears less important in itself, although not without a point, as will be seen. Unfortunately, such an elimination of the aperture defect remains a formidable task. Similarly, a substantial increase in beam energy seems to be less important in its own right than as another possible method for reducing the information loss caused by chromatic aberration.
We do not wish to give an extensive discussion of radiation damage, but here note that for given P²(k), δI is simply proportional to N_e/β². The damage produced by the illumination is (N_e/β²)f(β), where f(β) increases very slowly with energy up to about 500 keV, and then more rapidly. In other words, the ratio of δI to damage decreases only moderately with energy, until the electrons become relativistic. The computation of δI is easily extended (Welton, 1971b) to a case of much greater practical interest. An atom of interest will of necessity reside on a substrate of some sort. The substrate will not normally be resolved into its atoms, but will constitute an important contribution to the noise level of the micrograph. We assume as substrate an amorphous carbon film of thickness t Å. We further assume the atoms of the carbon film to be randomly distributed (not strictly true, of course, but probably not in serious
error, for present purposes). We return to the basic imaging equation (4.35) and rewrite it as

I(k) = P(k)·O(k) + P(k)·O_s(k) + R(k)   (5.42)

where O(k) is the object function for the atom of interest, and O_s(k) is the object function for the substrate. Previous results will be unchanged on making the substitution

N(k) = (N_e N²Ω²)⁻¹ + P²(k)⟨|O_s(k)|²⟩   (5.43)

The procedure leading from (5.35) to (5.36) immediately yields

⟨|O_s(k)|²⟩ = (N²Ω²)⁻¹(h/mcβ)²N_c F_c²(k)   (5.44)
where F_c(k) is the electron scattering amplitude for carbon, and N_c (= t/10) is the number of substrate carbon atoms per Ų. We proceed to give a small tabulation (Table I) for the two standard cases described earlier in this section [conventional (CONV) and high coherence (HC)]. We consider substrate thicknesses of 0, 5, and 10 Å, and

TABLE I

  Case    t (Å)    δI       δx = 512 exp(-δI/2) (Å)
  CONV      0      32.3     5.0 × 10⁻⁵
  HC        0      63.9     7.0 × 10⁻¹²
  CONV      5       5.6     31.0
  HC        5      12.1     1.2
  CONV     10       3.3     99.0
  HC       10       7.0     15.0
assume that a single mercury atom is to be located within a field of 512 × 512 Ų. For mercury, we use a simple but adequate approximation for F(k), namely,

(5.45)

We see that the approximate doubling of δI achieved by passing from the conventional case to the high-coherence case has a striking effect on the potential resolution δx. Note also the very serious degradation produced by even very nominal substrate thickness. The very small values of δx obtained for t = 0 are, of course, meaningless (aside from the unavailability of such samples) unless suitably small pixels (Ω < δx) are used. We return now to the expression (5.18) for the most probable object function. The derivation given is suggestive, albeit based on a seemingly
crude assumption for the prior probability {O(k)}. In this regard, it is of considerable interest that the same formula follows by application of an argument originally given by Wiener (1949).* Wiener's argument seeks to find the convolution on the image that yields an estimate of the object function with the smallest possible mean squared deviation from the true object function. It is assumed that the object set distribution is uncorrelated with the noise distribution, and the squared deviation is assumed to be averaged over both distributions. We also note that Wiener's derivation was originally given for the case of an electrical signal, in an accurately linear system, with strictly additive gaussian noise. In addition, he proposed realizing his filter with a passive circuit, so that the desired convolution was required to be over the past history only of the corrupted signal. Modern communications usage takes full advantage of storage and computation to allow a realization fully analogous to that expressed by Eq. (5.18). Accordingly, we define

W(k) = Y(k)P(k)[Y(k)P²(k) + N(k)]⁻¹   (5.46)
and make the quantity W(k) the central concept of the reconstruction algorithm to be tested. A further specialization must be made, in order to achieve practical results. We must have a simple standard assumption for Y(k), which can be an extraordinarily complicated object if its definition is taken literally. As discussed further in Appendix A, optimal use of a micrograph requires that Y(k) takes into account all prior information on the probable numbers of various molecular species present, as well as available information on bond lengths and angles. We here content ourselves with a minimum of such information, namely, the probable numbers of atoms of various species present, assuming each atom to be randomly distributed over the field of the micrograph. The result will be

Y(k) = (h/mcβ)²(N²Ω²)⁻¹ Σ_a N_a F_a²(k)   (5.47)

where N_a is the number of atoms of species a per square angstrom and F_a(k) is the electron scattering amplitude for an atom of species a. In actual practice we will take advantage of the qualitative similarity in shape of the F_a(k) curves for all Z values, for the k range of interest, and define the sample by an equivalent number of carbon atoms. Thus

N_c = Σ_a N_a ⟨F_a²(k)/F_c²(k)⟩   (5.48)
* Rohler (1967) has given a proof directly applicable to the optical image problem.
with the indicated average probably best defined as

C_a = ⟨F_a²(k)/F_c²(k)⟩ = k_max⁻² ∫₀^{k_max²} d(k²) F_a²(k)/F_c²(k)   (5.49)
The values of the C_a are not strongly sensitive to the value of k_max, but for illustrative purposes, we give some typical values appropriate for 1-Å resolution (k_max = π Å⁻¹) (see Table II).

TABLE II

  a      C        N        O        P        Br       Hg        Th
  C_a    1.0000   1.1878   1.2765   3.1100   9.9092   28.9600   34.9200
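For concreteness, the filter of Eq. (5.46), combined with a simple flat prior of the general kind assumed in (5.47), amounts to only a few lines of NumPy. The sketch below is ours: the grid size, the Gaussian stand-in for the transfer function P(k), and the flat noise level are illustrative assumptions, not the chapter's measured quantities.

```python
import numpy as np

def wiener_estimate(image, P, Y, N):
    """Most probable object via Eq. (5.46): W = Y P / (Y P^2 + N),
    applied multiplicatively in Fourier space, <O(x)> = IFFT(W * I(k))."""
    W = Y * P / (Y * P**2 + N)
    return np.real(np.fft.ifft2(W * np.fft.fft2(image)))

# --- illustrative stand-ins (our assumptions, not the text's functions) ---
n = 64
kx = np.fft.fftfreq(n)
KX, KY = np.meshgrid(kx, kx, indexing="ij")
k2 = KX**2 + KY**2
P = np.exp(-40.0 * k2)            # smooth low-pass "transfer function"
Y = np.ones((n, n))               # flat prior object power
N = np.full((n, n), 1e-2)         # flat noise power

rng = np.random.default_rng(0)
obj = np.zeros((n, n))
obj[20, 20] = obj[40, 45] = 1.0   # two point "atoms"
blurred = np.real(np.fft.ifft2(P * np.fft.fft2(obj)))
noisy = blurred + 0.01 * rng.standard_normal((n, n))
estimate = wiener_estimate(noisy, P, Y, N)
```

Note that W stays finite at zeros of P(k): the noise term in the denominator supplies exactly the suppression described below Eq. (5.22).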
It is apparent that there is no serious danger that we have thus overestimated the available prior information. In fact, almost all the atomic arrangements considered as possible in (5.47) could be ruled out by considerations of bond lengths, repulsion radii, and similar information. It will, however, become clear from Appendix A, and from the discussion of Section VII, that any attempt to build into Y(k) more detailed prior information will incur severe computational difficulty, as well as serious danger of artifact production. It is finally not at all obvious that the crude assumption (5.47) is sufficiently refined to give a useful result, and it will only become clear from the evidence of Section VII that real progress is possible.

VI. PROGRAMS FOR NUMERICAL TESTS OF THE RECONSTRUCTION ALGORITHM
From the work of the preceding sections, it is a simple matter, in principle, to set up simple numerical tests of the Wiener reconstruction algorithm, as modified by the assumption (5.47) for Y(k). Several such tests have been made (Welton, 1971a), using computed bright field micrographs of simple objects as the input data, with promising results. Several tests (Welton, 1975; Welton et al., 1973) have, in addition, been made using actual micrographs, without notable success in the first case, but with modest success in the latter attempt. It is unfortunately fair to say that no conclusive proof (or disproof) has yet been given of the practical utility of the Wiener reconstruction algorithm, and the present work accordingly had for its principal motivation the evaluation of that algorithm. As a starting point, it was decided to use a group of computer programs developed by the author (Welton, 1974; Welton and Harris, 1975) for use in
processing actual micrographs. The flexibility and generality of this system of programs recommends it as a basis for careful testing, particularly in view of the success it has yielded in the handling of several simple micrographs (Welton, 1975). The author has, however, experienced considerable difficulty in procuring micrographs taken under adequately controlled conditions, and it was thought advisable to conduct the present tests on easily controllable synthetic micrographs. We accordingly describe in useful detail at this point the computational procedures followed.* These procedures are to be thought of as job steps (in the IBM 360 sense), and they should be cataloged in the disk system of the computer so that they can be conveniently invoked in various combinations. The first such job step is given the name OBJECT, with the task of supplying the atomic coordinates for the desired sample, exclusive of substrate. One available option positions the atoms for a DNA double helix having a size of 600 nucleotide pairs (Plate 1a). This molecule is built around an axis that is smoothly bent into a space-filling curve occupying a roughly rectangular plane region approximately 240 × 280 Å. The molecule is about 20 Å thick and is assumed to be placed on a carbon film no more than 10 Å thick, so that a single defocus value can be used for all atoms of the sample, at the assumed pixel size of 1 Å [cf. Eq. (2.39)]. The axial displacement of each atom is actually calculated and transmitted by the program, so that relatively thick objects can be studied, if desired. The micrograph to be produced will have 512 pixels in each direction, this number being large enough to allow interesting results and small enough to keep costs down. The reconstruction programs have been tested at size 1024 pixels, and size 2048 pixels could be handled with only minor changes.
To allow for a range of pixel sizes and numbers, the atomic coordinates are actually calculated and transmitted to an accuracy of 1 part in 4096, although the subsequent job step may not use such accuracy. Note that for a picture 1024 × 1024 Å, the available position accuracy would be 0.25 Å, which certainly cannot be resolved. In the sample shown in Plate 1a, a mercury atom has been substituted for the phosphorus atom of each phosphate group. This replacement is naturally not intended to be chemically realistic, but does indicate the sort of atomic spacing and identification with which we would like our methods to deal. Another version of OBJECT produces text (Plate 4a) composed of letters, each of which is represented by a dot matrix whose dots are single atoms of thorium, mercury, bromine, phosphorus, oxygen, nitrogen, or

* These programs have all been carefully optimized and rather fully tested, and can be made available on request to any interested investigator. Unfortunately, proper optimization goes outside the usual Fortran language and has only been done for the IBM 360, 370 system. Optimization for another system should not be difficult.
carbon, as desired. The separation of these dots is chosen as 2 Å in Plate 4a, to make objects that are directly recognizable on a fairly coarse scale as Roman capital letters, while on a finer scale, the individual atoms may become detectable. In both versions, as many as 25,200 individual atoms can be accommodated, so that rather complex and interesting objects can be produced, by simple modifications of the basic program. A second job step (IMAGE) carries out the formulation of the preceding sections to produce I(x) in the form of a file containing density information for the computed bright field micrograph. The input is the file containing the atomic locations produced by OBJECT, plus the parameters describing the microscope used and the conditions of exposure. The output file will normally have one byte (0-255) integers for the various pixels, arranged in 512 records (scan lines) of 512 bytes each. The statistical error would have to be incredibly small for greater precision to be required. Considerable simplification is required in the description of the atomic scattering amplitudes, because of the complexity of the problem. It has been found that a reasonable approximation is the following

F_a(k) = A_a/(k² + α²) + B_a/(k² + β²)   (6.1)

where it is essential that α and β be independent of the atomic species a. The representation (6.1) is not required to be accurate over all k, but only over the range corresponding to the desired resolution. We have followed the practice of choosing α and β to allow a reasonable shape difference between the two terms and then adjusting A_a and B_a to yield precise values of F_a(k) for k = 0 and k = π Å⁻¹. Because of the reasonable shapes of the two terms, the resulting fit is quite good, certainly more than adequate for the exploratory purpose we have in mind. Table III lists the values chosen (α = 3.0781 Å⁻¹, β = 3.9738 Å⁻¹).
TABLE III

  a      A_a         B_a
  C      26.1966     -4.8222
  N      10.6589     17.0567
  O       0.0034     31.3891
  P      88.8238    -62.0841
  Br     58.4816     14.7710
  Hg    143.5691    -29.1951
  Th    429.9931   -375.9774
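The fitting practice just described reduces, for each species, to a 2 × 2 linear solve: with α and β fixed, A_a and B_a are determined by requiring Eq. (6.1) to reproduce prescribed values of F_a(k) at k = 0 and k = π Å⁻¹. A sketch of this (the target values used here are illustrative, not the data behind Table III):

```python
import math

ALPHA, BETA = 3.0781, 3.9738     # angstrom^-1, as quoted in the text

def fit_AB(F0, Fpi):
    """Solve F(0) = F0 and F(pi) = Fpi for A, B in Eq. (6.1):
    F(k) = A/(k^2 + alpha^2) + B/(k^2 + beta^2)."""
    a2, b2, p2 = ALPHA**2, BETA**2, math.pi**2
    m11, m12 = 1.0 / a2, 1.0 / b2
    m21, m22 = 1.0 / (p2 + a2), 1.0 / (p2 + b2)
    det = m11 * m22 - m12 * m21          # nonzero since alpha != beta
    return (F0 * m22 - Fpi * m12) / det, (Fpi * m11 - F0 * m21) / det

def F(k, A, B):
    """Two-term approximation (6.1) to the scattering amplitude."""
    return A / (k**2 + ALPHA**2) + B / (k**2 + BETA**2)

A, B = fit_AB(2.46, 1.17)    # illustrative carbon-like target values
print(A, B)                  # coefficients of the same general size as Table III's carbon row
```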
PLATE 1. (a) Object-DNA, 300 Å × 300 Å. (b) Image-EMPTY-HC, 10 Å-NORMAL, 300 Å × 300 Å. (c) Image-DNA-CONV, 5 Å-NORMAL, 300 Å × 300 Å. (d) Diffractogram, EMPTY-HC-10 Å-NORMAL, 2π Å⁻¹ × 2π Å⁻¹.
The first task of IMAGE is to calculate two object functions, by use of the given atomic coordinates and the A_a and B_a values. Thus

O_A(x) = Σ_a A_a δ(x - x_a)

O_B(x) = Σ_a B_a δ(x - x_a)   (6.2)
For simplicity, it is assumed that each atom is repositioned to the center of the pixel in which it lies, an assumption that cannot introduce serious error at the resolution level implied by the assumed pixel size. The sum over atoms is to include the sum over all substrate atoms, which are introduced by taking (with the help of a random number generator) the number of substrate carbon atoms in each pixel from a Poisson distribution, with mean determined by the assumed thickness of the film (a 10-Å film is assumed to have an average of 1 carbon atom/Ų). The two real functions (6.2) are taken to be the real and imaginary parts of a complex object function, which is then Fourier transformed (fast Fourier transform). From the transform of this complex object function, it is simple to extract the separate transforms

O_A(k) = Σ_a A_a exp(-ik·x_a)

O_B(k) = Σ_a B_a exp(-ik·x_a)   (6.3)
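The packing of the two real functions (6.2) into a single complex array, followed by unpacking the two transforms (6.3), is the standard Hermitian-symmetry economy for real FFTs. A minimal sketch of the unpacking (our notation, not the original Fortran):

```python
import numpy as np

def two_real_ffts(oa, ob):
    """Transform two real arrays with one complex FFT.
    If Z = FFT(oa + i*ob), then since the transform of a real array
    is Hermitian-symmetric,
        OA(k) = (Z(k) + conj(Z(-k))) / 2
        OB(k) = (Z(k) - conj(Z(-k))) / (2i)
    where Z(-k) means index reversal modulo N along each axis."""
    Z = np.fft.fft2(oa + 1j * ob)
    Zrev = np.conj(np.roll(np.flip(Z, axis=(0, 1)), 1, axis=(0, 1)))
    return (Z + Zrev) / 2, (Z - Zrev) / 2j

rng = np.random.default_rng(1)
oa = rng.random((8, 8))
ob = rng.random((8, 8))
OA, OB = two_real_ffts(oa, ob)
```

The two returned arrays agree with the separately computed transforms of oa and ob, at half the FFT cost.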
From these it is a simple matter to form the full object function (A = N²Ω² = area of sample)

O(k) = (h/mcβ)A⁻¹ Σ_a F_a(k) exp(-ik·x_a)
     = (h/mcβ)A⁻¹[O_A(k)/(k² + α²) + O_B(k)/(k² + β²)]   (6.4)

The amplitude function A(k) is then formed by introducing the complex instrumental MTF b(k), as implied by Eq. (3.16). Thus
b(k) = exp[-E(k)] exp[iφ(k)]   (6.5)

and

A(k) = i b(k) O(k)   (6.6)

Note that the amplitude (6.6) has been averaged over the energy and angular spreads of the illuminating beam. At this point, we find the Fourier transform of A(k), yielding

⟨A(x)⟩ = i Σ_k b(k) O(k) exp(ik·x)   (6.7)
which is just the discrete analog of Eq. (3.16). We then obtain ⟨I(x)⟩ [or rather the estimate given by Eq. (3.15)]. The final step is to form I(x) according to Eq. (4.15), with R(x) drawn from a normal distribution, again with the help of a random number generator. More explicitly, we take

I(x) = ⟨I(x)⟩ + (N_e Ω²)^(-1/2)[1 + ⟨I(x)⟩]^(1/2) · r(x)   (6.8)

with r(x) a random function, without correlation, and with normal distribution for each x value. Thus

⟨r(x)⟩ = 0   (6.9)

⟨r(x)r(x')⟩ = δ(x - x')   (6.10)

and the distribution function for r is just

P(r) = (2π)^(-1/2) e^(-r²/2)   (6.11)
(6.12) B(x) = 255[1(x)- Iminl/(lmax - Imin) and output as the previously defined disk file. The input parameters for IMAGE include the complete list of parameters thus far introduced, plus one other artificial, but rather useful, constant. In order to make convenient checks of the importance of the quadratic term in Z(x), all scattering amplitudes can be multiplied by a factor FCV, and at the same time the electron dose N, is divided by (FCV)'. This has the effect of leaving all pictures unchanged (signal/noise unchanged) if the quadratic term is negligible. The third job step is named XFORM, and has the simple function of obtaining the Fourier transform of a disk file in the format produced by IMAGE. The resulting transform file consists of N 2 ( N = 512) eight-byte complex numbers, and is therefore too large to be conveniently saved. Instead, it is passed to subsequent job steps as a scratch file, to be deleted at the end of the complete job. The step XFORM is the starting point for three important procedures. These all have as a probable end the production of a two-dimensional display from a data file in the standard format produced by IMAGE. The job step for display is called PLOT and takes considerable advantage of the useful characteristics of a rather ancient cathode ray tube plotter (Calcomp, Model 835). A Fortran-callable subroutine generates a tape that will direct the electron beam to any desired pixel and produce the desired optical
density. This process is surprisingly economical and has produced all the plates in this article. Some tests by the author indicate that use of one of the more modern film writers (Perkin-Elmer Model 1010A, for example) would produce neater results, with an unfortunate capital cost (or rental fee) attached. One available procedure passes the transform file from XFORM to a job step XPLOT, which has the function of forming a new file, which is essentially the absolute square of the transform file (put on a logarithmic scale to avoid some major uncertainties in scaling). This new file is then rescaled and output as a new file in standard one-byte format. With PLOT as the third job step, a display is produced of a function which is essentially the diffractogram of the starting data. A second procedure follows IMAGE directly by PLOT to display the I(x) file, as a synthetic micrograph. A third procedure has as its purpose the determination from a given micrograph of the best values for the instrumental and exposure constants required to compute ⟨O(x)⟩, the Wiener estimate for the object function. These job steps are called SPEC (power spectrum) and WIENER. These are sufficiently described in Appendix C, and suffice it to say at this point that SPEC has the transform file from XFORM as its input and WIENER has as its output the constants required to compute W(k). A fourth procedure passes from the transform file to another job step FILT (filter), which accepts as input the constants for W(k), as well as the transform file I(k), from XFORM. It constructs a new file W(k)I(k), and Fourier transforms it to produce ⟨O(x)⟩. This file is then put in standard one-byte form and output for use as input by PLOT.

VII. PRESENTATION AND DISCUSSION OF DATA
We are now ready to describe and discuss some typical results obtained by application of the formalism developed in Sections III, IV, and V, together with the programs of Section VI and Appendix C. We summarize the basic assumptions. The objects studied consist of one of three ordered arrays of atoms mounted on one of three substrates. The ordered arrays are designated as DNA, TEXT, or EMPTY, while the three substrates consist of random carbon films of nominal 0, 5, and 10 Å thickness. Each micrograph consists of a square field 512 × 512 Å, with a pixel size 1 × 1 Å. The full field may not be displayed, the actual displayed area being indicated in the caption. The arrays called DNA and TEXT are described under the job step OBJECT in Section VI, and they are displayed at high resolution in Plates 1a and 4a, respectively. The array called EMPTY has no atoms present, and the corresponding display for it would be simply a blank
square. The two parameter sets describing the conventional microscope (CONV) and the high-coherence microscope (HC) have already been listed in Section V, and the electron dose N_e has been uniformly set to 500 electrons/Ų. Finally, the cases are distinguished by the value chosen for the parameter FCV, the value 1.0 being designated as NORMAL (some of the quadratic terms being included approximately correctly), while the value 0.1 will be designated as LINEAR (the quadratic terms now being artificially reduced by a factor of ten with respect to the linear terms). The two displays named "object" are in reality very special images in which the highest possible resolution is provided for. The substrate thickness is taken as zero, all aberration coefficients are made to vanish, the electron illumination N_e (electrons/Ų) is taken to be an extremely large number, and a phase shift of π/2 is inserted in the source wave (as though a retarding film of suitable thickness were inserted in the center of the back focal plane) in order to produce a bright field image intensity in the absence of aberrations. The display then would contain all the available sample information if the resolution of the display system were adequate. Comparison of the displays of a 300 × 300 Å square with a 100 × 100 Å square suggests that the display system is not yet limiting at 300 × 300 Å. The individual mercury atoms are in fact quite apparent where they extend (in projection) to the outside of the double helical structure. Note also the easy visibility of the base planes, which are seen edge-on in the straight sections of the molecule. The curved sections show peculiar phenomena arising from the fact that the 33.4 Å required for the helix to complete one turn about the helical axis is a substantial fraction of the 50 Å required for the shorter radius turns and 150 Å required for the longer radius.
At the 512 × 512 Å level chosen for the TEXT displays, however, it is apparent that the display resolution is limiting. As elsewhere described, the characters are composed of atoms located on a 2 × 2 Å lattice, so that in horizontal or vertical lines, the atomic spacing is 2 Å, while in the diagonal direction, it will be 2.82 Å. Inspection with a magnifier shows (in the originals at least) no significant discrete structure for horizontal and vertical lines, while a definite discrete structure is seen for the diagonal lines. The displays labeled "image" are actual computed micrographs with sample, exposure, and instrumental conditions as indicated in the captions. The displays whose captions refer to "original image" are the result of an attempted object reconstruction based on the indicated micrograph, and using a Wiener function W whose origin is indicated in the following parentheses as "theory," or a plate number. If a plate number is given, then W was actually calculated (by use of SPEC and WIENER) from the indicated micrograph. We are now ready to begin an orderly presentation of the evidence. Plate
1b shows the typical "phase grain" seen in bright field micrographs of amorphous films. The sample has one carbon atom per pixel on the average, 0, 1, and 2 being then the dominant occupation numbers. It is important to note that no structure resolvable with the stated instrumental conditions is present, and we see only a badly blurred image of the atomic array forming the substrate. This micrograph is probably best characterized by its diffractogram, as displayed in Plate 1d, or the computed smoothed function shown as the "high-coherence" curve in Fig. 1. The connections between the features of the radial plot of Fig. 1 and the circular pattern of Plate 1d are quite apparent, but we must again emphasize the extremely noisy character of the diffractogram. This noise originates primarily from the random locations of the substrate atoms and secondarily from the statistical error imposed by the 500 electrons/Ų illumination. For later comparison, we show in Plate 1c the conventional image of the DNA object placed on a 5-Å carbon substrate. Atomic resolution is naturally not present, but examination with a magnifier reveals features with an extent of about 3.5 Å which appear to result from the merging of two mercury atoms which occasionally lie close in projection. Less prominent features can be found with an extent of roughly 2.0 Å, which is reasonably consistent with the curve labeled "conventional" in Fig. 1. In Plate 2a, we display a high-coherence micrograph with real content. The apparent resolution is very poor, as would be anticipated from the phase reversals seen in Fig. 1. The first such reversal in fact comes at k = 0.82 Å⁻¹, which would correspond to a d value of π/0.82 or 3.8 Å. As noted in the caption, the micrograph computed with FCV = 0.1 (LINEAR) has an appearance indistinguishable from that of Plate 2a.
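A diffractogram of the kind shown in Plate 1d — the log-scaled absolute square of the image transform, as produced by the XPLOT job step — can be sketched in a few lines; the epsilon floor is our addition to keep the logarithm finite:

```python
import numpy as np

def diffractogram(image, eps=1e-12):
    """XPLOT-style display: power spectrum |FFT|^2 on a log scale,
    k = 0 shifted to the center, rescaled to one-byte (0-255) pixels."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    d = np.log(power + eps)
    d = 255.0 * (d - d.min()) / (d.max() - d.min())
    return d.astype(np.uint8)

img = np.random.default_rng(3).random((256, 256))   # stand-in micrograph
D = diffractogram(img)
```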
Substantial differences do, however, appear in the portion of the diffractogram with k ≲ 1.0 Å⁻¹, and it was accordingly felt important that the effect of the approximated quadratic terms on the object reconstruction should be studied. In Plate 2b is displayed the result of processing with the W(k) which results from the use of the job steps SPEC and WIENER (Appendix C). Note the unfortunate reversal of the gray scale, which somewhat hampers comparison with other reconstructions. At this point, it will be useful to discuss briefly the calculation of W(k) by the adaptive procedure (Appendix C) and compare the results with the W(k) obtained more accurately, but less directly. Figure 2 displays the function S(k) obtained by SPEC, from the image of Plate 2a, plotted as a function of k along four radial lines. The plots are somewhat crude but illustrate clearly the difficulties involved. The four plots coincide reasonably well for k ≳ 0.6 Å⁻¹, and have generally the shape of the idealized curve of Fig. 1. The complex behavior for k ≲ 0.6 Å⁻¹ can be shown to be due entirely to the detailed object structure (DNA organization of the object atoms, rather than some other arrangement of the same atoms) and cannot easily be used in a statistical investigation. The irregularities that remain for k > 0.6 Å⁻¹ do not prevent a reasonably accurate estimate of the essential parameters for the construction of W(k). A better estimate can be obtained from the image of the substrate alone (Plate 1b), which completely lacks the complex behavior at small k, and which yields better agreement of the four radial plots for larger k. Finally the true function W(k) can be calculated from Eq. (5.46) and the parameters listed for the high-coherence instrument. By simple division, we write in a slightly more convenient form

(7.1)
where φ(k) is assumed to have the standard form defined by Eqs. (2.20), (2.21), and (2.22). The function F_c(k) is the amplitude for electron scattering from carbon, and the form (5.47) for Y(k) has been assumed, with the additional simplification embodied in Eq. (5.48). The constant C is a composite of the equivalent carbon atom count N_c and the electron count N_e, and the envelope exponent E(k) is given the general form of Eq. (3.12), although the special form (3.14) suffices for our assumption of rotational invariance. For purposes of easy comparison, we write φ and E in a simple standard form

φ(k_x, k_y) = φ₁k_x² + φ₂k_x k_y + φ₃k_y² + φ₄k⁴   (7.2)

E(k_x, k_y) = E₁k_x² + E₂k_x k_y + E₃k_y² + k²(E₄k_x² + E₅k_x k_y + E₆k_y²) + E₇k⁶   (7.3)
and present the values of the A i , B i , and C in Table IV. TABLE IV Case
Constant
2a
lb
4.9166 0.0028 4.9319 -0.5157 -0.5236
4.9631 0.0015 4.9690 -0.5197 -0.1196 0.0063 -0.1397 0.06342 -0.00195 0.06721 - 0.00668 0.1388
Theory ~
0.0008
-0.4469 0.1924 O.oo00
0.1808 - 0.01823 0.5 164
4.9062 O.oo00
4.9062 -.0.5110 0.001204 0.000000
0.001204 0.0012336 0.0000000
0.0012336 O.oo005223 0.1072
PLATE 2. (a) Image-DNA-HC, 10 Å-NORMAL (LINEAR has identical appearance), 300 Å × 300 Å. (b) Original image from (2a). Processed with W (2a). Note photographic reversal from other processing results. 300 Å × 300 Å. (c) Original image from (2a). Processed with W (theory). 300 Å × 300 Å. (d) Original image had same assumptions as (2a), except LINEAR instead of NORMAL. Processed with W (theory). 300 Å × 300 Å.
T. A. WELTON
[Figure: smoothed diffractogram density plotted against k (Å⁻¹) from 0 to 2.8 Å⁻¹; the four symbol sets correspond to the four radial lines.]

FIG. 2. Diffractogram densities obtained by smoothing from the Fourier transform of a high-coherence image.
The rotational symmetry of the system is reflected in the equality of φ₁ and φ₃, E₁ and E₃, and E₄ and E₆, as well as the vanishing of E₂ and E₅. These rules are reasonably well obeyed in the approximate forms (2a) and (1b), the accuracy being considerably better in the φᵢ than in the Eᵢ. A false anisotropy is introduced into the system by the necessary anisotropy of the object. This anisotropy is anticipated to be markedly greater for the DNA + substrate than for the substrate alone, and this is well borne out, for the φᵢ at least. It should be clear that φ(k) must be accurately represented because of the requirement that the factor sin φ in W(k) should strongly suppress contributions to the reconstruction from k values near the zeros of sin φ.
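These invariance rules can be verified mechanically against the theory column of Table IV; a trivial Python check, with the values copied directly from the table:

```python
# Theory-column constants from Table IV: phi_1..phi_4 and E_1..E_7.
phi = [4.9062, 0.0000, 4.9062, -0.5110]
E = [0.001204, 0.000000, 0.001204, 0.0012336, 0.0000000, 0.0012336, 0.00005223]

# Rotational invariance: phi_1 = phi_3 and phi_2 = 0; E_1 = E_3, E_4 = E_6,
# and the cross terms E_2 and E_5 vanish.
assert phi[0] == phi[2] and phi[1] == 0.0
assert E[0] == E[2] and E[3] == E[5]
assert E[1] == 0.0 and E[4] == 0.0
```

The same checks applied to the (2a) and (1b) columns fail, which is exactly the false anisotropy discussed above.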
The accuracy of the fit to E(k) is clearly less crucial, since it is not required to produce any precise cancellations. This is fortunate, since the false anisotropies introduced in the Eᵢ deduced from (2a) and (1b) are substantial. We also note that the detailed values of the Eᵢ in the first two columns bear no obvious relation to the exact values in the third column. This failure reflects the fact that the number of parameters allowed for the fitting process is larger than the smoothness of the data would justify. Nevertheless, the E(k) from Plate 1b lies close to the true function over the range of k of interest. The variation in the value of C is to be regarded more seriously. This parameter is directly proportional to the noise-to-signal ratio, and the value obtained from the actual micrograph (2a) appears to be significantly higher than the theoretical value given. Since the known ambiguities of the problem suggest only a downward revision of the theoretical C value, the discrepancy between C = 0.5164 and 0.1072 must be regarded as potentially serious. The effect produced in reconstruction by use of too high a C value is to reduce unduly the use of image information near the zeros of φ(k), and it is reassuring to find the essential features of the displays of (2c) and (2d) present in (2b) as well. A word is necessary concerning the assumptions used in obtaining C theoretically in the various cases considered. For a simple carbon substrate of thickness t (Å), we have
N_c = t/10 (carbon atoms/Å²)   (7.4)
which corresponds to a total of 262,144 carbon atoms distributed over an area of 512 × 512 Å². The DNA molecule contains approximately the equivalent of 50,000 carbon atoms [Eq. (5.48) and Table III], and TEXT contains the equivalent of 122,000 carbon atoms. In Table V, we give the resulting theoretical C values for the various micrographs. The values given were not the ones actually used, the discrepancies being due to ambiguities of definition not apparent until late in the process of data collection.

TABLE V

Plate               C(theory)     C(used)
3b                  0.6683        0.3010
3b, except 5 Å      0.1847        0.2127
2a                  0.1072        0.1160
4c                  0.1323        0.1128
5b                  0.08713       0.07826

Some further discussion of the quadratic effects is now in order. A comparison (not shown) of the estimated Wiener spectra for Plates 1b, 2a,
and the corresponding micrographs with FCV = 0.1 (the LINEAR cases) shows that the quadratic effects are important in the low-k region where the gross irregularities of Fig. 2 are apparent. The irregularities are due to the real object structure, as previously stated, but their magnitude changes with FCV as would be expected from the quadratic terms. The comparison of the NORMAL and LINEAR versions (1b), on the other hand, shows no irregularity, but does have a smoothly varying quadratic contribution. It should be noted that the quadratic terms will not show zeros in the Wiener spectrum at the zeros of φ(k), so that their presence can falsify the estimate of the noise level (Appendix C) and hence alter the C value. This effect will presumably raise the estimated C values above the theoretical values and plausibly account for the discrepancies noted in the last line of Table IV. The very much improved C value obtained by computation from the micrograph of the substrate alone appears to result from the very much smaller quadratic effects associated with the absence of heavy atoms. It is therefore strongly suggested that a single actual micrograph be divided into two adjacent squares, one containing sample and the other consisting of substrate only. To reasonable accuracy, these will have the same instrumental, exposure, and substrate parameters. The adaptive procedure of Appendix C can then be profitably applied to the square of substrate only. The scattering power of the sample square should be estimated from higher k values (k > 1 Å⁻¹, say), and in this way reliable C values should be experimentally obtainable. These questions clearly deserve more detailed investigation than has been permitted by the time and cost limitations of this study. We are at least prepared to state empirically that the uncertainties in the C value discussed above do not fatally affect the results of reconstruction. This is clear by detailed inspection of Plates 2b, c, d, and 3a.
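The role of C in the filter can be made concrete with a small numerical sketch. This is not the production code of Appendix C; it is a minimal numpy illustration assuming the rotationally invariant special case φ(k) = φ₁k² + φ₄k⁴ and a simple quadratic envelope, with all parameter values hypothetical:

```python
import numpy as np

def wiener_filter(kx, ky, phi1, phi4, E1, C):
    """Isotropic Wiener filter W(k) ~ e^-E sin(phi) / (e^-2E sin^2(phi) + C).

    phi1 and phi4 are the quadratic and quartic phase coefficients, E1 a
    simple quadratic envelope coefficient, and C the noise-to-signal
    constant; all values used here are placeholders.
    """
    k2 = kx**2 + ky**2
    phi = phi1 * k2 + phi4 * k2**2        # rotationally symmetric case of Eq. (7.2)
    p = np.exp(-E1 * k2) * np.sin(phi)    # effective transfer function
    return p / (p**2 + C)                 # small near the zeros of sin(phi)

def reconstruct(image, dx, phi1, phi4, E1, C):
    """Apply W(k) to a micrograph by FFT; returns the filtered estimate."""
    n = image.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=dx)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    W = wiener_filter(kx, ky, phi1, phi4, E1, C)
    return np.real(np.fft.ifft2(W * np.fft.fft2(image)))
```

Raising C widens the suppressed bands around the zeros of sin φ; lowering it too far admits the noisy k values near those zeros, which is the origin of the fringe artifacts noted for Plate 3b.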
Comparison of Plates 2c and d in addition demonstrates that the quadratic terms do not seriously interfere with reconstruction, at least with the thin and relatively weak samples considered here. We now note Plate 3b, which is the only case of a reconstruction attempted with a C value substantially smaller than the correct theoretical value. This discrepancy, of more than a factor of two, will cause k values too close to the zeros of φ to be used in the reconstruction. The resulting amplification of noise close to these contours produces a characteristic artifact consisting of fringes having spatial frequencies for which φ vanishes. It may be of interest in this connection to display W(k) in some detail, together with some properties of its Fourier transform, the Wiener kernel W(x). Figure 3 shows a plot of W(k), calculated with the HC parameter set from Eq. (5.46). The absolute magnitude is plotted for convenience, a sign reversal being understood at each dip to zero. The values assumed by φ at each of these zeros are indicated. The absolute magnitude shows minima near points where φ is an odd multiple of π/2, with a rise as k approaches a multiple of π. This rise is terminated by a catastrophic fall to zero (and reversal in sign) as the available signal strength falls below the noise level. Consider the Wiener kernel
W(r) = (2π)⁻² ∫ dk W(k) exp(−ik·r)   (7.5)
where W(k) is given by (5.46), which can be rewritten as

W(k) = (1/4)[exp(−E) sin φ + i(N/4Y)^1/2]⁻¹ + c.c.   (7.6)

We see that W(k) has a series of simple poles, each near a value kₙ of k which makes φ equal a multiple of π. These poles are displaced off the real axis of k by an amount proportional to (N/Yₙ)^1/2, where Yₙ ≡ Y(kₙ). More precisely, W(k) can be represented as a sum of simple poles

W(k) = Σₙ Rₙ(k² − ζₙ²)⁻¹ + c.c.   (7.7)

where ζₙ is approximately given by

ζₙ = kₙ + iγₙ   (7.8)

and

γₙ = exp[E(kₙ)] [N/4Y(kₙ)]^1/2 / |φ′(kₙ)|   (7.9)

The residues Rₙ are of no importance to our argument, but the values of the γₙ, the imaginary displacements of the poles, are in fact central. If the Fourier integral (7.5) is performed on W(k) in the form (7.7), a sum of terms will be obtained of the form (7.10), with (7.11).
The function H₀^(1,2) is the usual Hankel function, with 1 or 2 being taken (according to the sign of γₙ) to force an exponential decay of Wₙ for large r.

PLATE 3. (a) Original image identical with (2a), except LINEAR. Processed with W (1b). 300 Å × 300 Å. (b) Image-DNA-HC-0 Å-LINEAR, 300 Å × 300 Å. (c) Original image from (3b). Processed with W (theory). 300 Å × 300 Å. (d) Original image identical with (3b), except 5 Å substrate. Processed with W (theory). 300 Å × 300 Å.

[Figure: plot of |W(k)|; the value of φ at each zero and the characteristic lengths |γₙ|⁻¹ are indicated.]

FIG. 3. The Wiener function for reconstruction of a high-coherence micrograph.

Thus, Wₙ(r) behaves at large r as
Wₙ(r) ~ Rₙ r^−1/2 exp(±ikₙr) exp(−|γₙ|r)   (7.12)

The characteristic lengths |γₙ|⁻¹ are also displayed in Fig. 3. The kₙ values are of interest in defining the periodicities of artifacts which can appear in reconstructions for which inappropriate noise figures have been used. The lengths |γₙ|⁻¹ are of importance in deciding how large a "frame" is required around an area to be reconstructed. We would anticipate that a micrograph of side S would allow reasonable reconstruction of a smaller square of side S − 2γ_min⁻¹, where γ_min is the minimum of the |γₙ|. We have not encountered this limitation in the computations here presented, because of our assumptions of periodicity for the data, but in practical work, the data square used for I(x) should be substantially larger than 2γ_min⁻¹ if gross inefficiency is to be avoided. Another possible procedure for reconstruction would perform the convolution of W with I directly, without recourse to Fourier transformation. This direct convolution can, of course, be more efficient if γ_min⁻¹ is relatively small compared with the size of the data square, but the author knows of no real test of this method.

Finally, comparison of Plates 3c and d with the preceding reconstructions shows the obscuring power of the substrate. A simple estimate, using the ideas of Eqs. (5.43) and (5.44), indicates that the noise introduced by 1 Å of carbon is roughly equivalent to that arising from an illumination of 500 electrons/Å², and accordingly, 10 Å of carbon corresponds to about 5000 electrons/Å². Plates 4 and 5 make use of the object TEXT [shown in (4a)] to test in more graphic form some of the foregoing conclusions. In (4b) is shown the conventional micrograph of TEXT on a 5-Å substrate, while (5a) shows the effect of a 10-Å substrate. No discrete dot structure is ever visible, and lines of dots appear as lines of about 2 Å width. The effect of the substrate in blurring the object is quite apparent. With high-coherence imaging of TEXT, in (4c) and (5b), we see an image apparently devoid of meaning. The rectangles corresponding to the individual characters retain their identity, but the aberrations have led to an impenetrable disguise for each character. Note that the image (4c) was formed with the quadratic terms in full force. The reconstruction (4d) is a striking illustration of the potential power of the Wiener algorithm. The characters formed of thorium and mercury atoms are fully visible over a 5-Å carbon substrate, while the bromine atom characters are partly discernible. The lines of the individual characters have the minimum width of 1 Å, and no real ambiguities are present, in spite of the many opportunities available. Some artifacts are, however, visible, including a slightly distorted rendering of "S," and it is easy to imagine that a character set two or four times as large could show definite ambiguities. Finally, a pair of micrographs (5b) of identical appearance were computed, both with 10 Å of substrate, but one NORMAL and the other LINEAR. The extremely gratifying reconstructions (5c) and (5d) show a little more obscuration by the thicker substrate, but no important differences ascribable to the quadratic terms.

APPENDIX A. ESTIMATES OF QUADRATIC EFFECTS
We have more or less systematically ignored terms in the image function that are not linear in the object function. The quadratic effects of relativistic origin [Eqs. (2.8) ff.] seem to be always very small at resolution levels presently available, or likely to be attained, and will not be further discussed. Several effects, however, remain as real worries. The first is seen from Eq. (2.13). We define

η(x) = ∫_−∞^∞ dz K(x, z)   (A.1)
PLATE 4. (a) Object-TEXT, 512 Å × 512 Å. (b) Image-TEXT-CONV-5 Å-NORMAL, 512 Å × 512 Å. (c) Image-TEXT-HC-5 Å-NORMAL, 512 Å × 512 Å. (d) Original image from (4c). Processed with W (theory). 512 Å × 512 Å.
PLATE 5. (a) Image-TEXT-CONV, 10 Å-NORMAL, 512 Å × 512 Å. (b) Image-TEXT-HC, 10 Å-NORMAL (LINEAR has identical appearance), 512 Å × 512 Å. (c) Original image from (5b). Processed with W (theory). 512 Å × 512 Å. (d) Original image identical with (5b), except LINEAR. Processed with W (theory). 512 Å × 512 Å.
so that, by Eq. (2.13),

O(x) = −i[exp(iη) − 1]   (A.2)

Past this point, we systematically approximated the object function by

O(x) = η   (A.3)
so that our first question concerns the magnitude of the next term in the expansion of the exponential. A seemingly unrelated question concerns the possible error incurred by the neglect of the quadratic term in the image plane intensity I(x). There is, in fact, a connection between these two questions arising from the requirement of electron conservation. We consider the simplest case of coherent imaging, with no inelastic scattering allowed. The wave function for a typical electron is taken as

ψ = A(x) exp(iκz)   (A.4)

where

A(x) = 1   (A.5)

before the sample plane, and

A(x) = exp[iη(x)]   (A.6)

after the sample plane. If no inelastic scattering is present, η(x) will be real, and the intensity of the electron wave is unaltered by passage through the sample. Thus

|1|² = |exp(iη)|² = |1 + iη − η²/2 + ⋯|² = |1 + iO(x)|²   (A.7)
where the quadratic term η²/2 has the obvious function of canceling the absolute square of the linear term iη, when the cross term between 1 and η²/2 is formed. A similar interplay of linear and quadratic terms occurs in the image plane, to enforce overall electron conservation, and an interesting theorem (essentially the requirement of unitarity for the electron-atom scattering matrix) results. Wherever the real amplitude F(k) appears, it should have added to it an imaginary term

F₁(k) = (4πκ)⁻¹ ∫ dk′ F(k′) F(|k − k′|)   (A.8)
Specialization to the direction k = 0 yields

F₁(0) = (4πκ)⁻¹ ∫ dk F²(k)   (A.9)

which is easily seen to be identical with

F₁(0) = κ(4π)⁻¹ σ_T   (A.10)
where σ_T is the total elastic scattering cross section for an electron incident on the atom in question. Considerations of electron conservation dictate that the form (A.10) will hold in general, if σ_T includes the inelastic scattering, although the proper generalization of Eq. (A.8) is too complex to be usefully presented here. It can, however, be qualitatively indicated that the contribution to F₁(k) arising from inelastic scattering will decrease much more rapidly with increasing k than will the contribution from elastic scattering. This is so because the spatial extent of the inelastic mechanism (outer electrons, mainly, and largely excited by rather remote collisions) is relatively large. In the main body of the text, we have at all times assumed that F(k) is purely real, and we now see that examination of the magnitudes of the quadratic terms is, in fact, inseparable from the question of the error incurred by ignoring the imaginary part of F(k). We here give sample results from two methods of investigation. First, we tabulate η(r) for the cases of a carbon atom and a mercury atom on axis (xₐ = 0; r = |x|). The simple analytic amplitude expression (6.1) will be used, with the constants from Table III. The function η(r) becomes logarithmically infinite at r = 0, but, as previously, we assume that only the features beyond a reasonable resolution circle will be of importance. We obtain Table A.1, from which it appears that η² ≪ η for conditions of interest. Exception would clearly occur for cases where too many atoms are approximately aligned, but this situation is unlikely for thin amorphous samples.

TABLE A.1

r (Å)     η_C(r)     η_Hg(r)
0.50      0.0106     0.0572
0.65      0.0057     0.0310
We also give some approximate results for F(k) and F₁(k), again for carbon and mercury. The F₁ values given in Table A.II are those calculated on the assumption of elastic scattering only.

TABLE A.II

k (Å⁻¹)    F_C(k)    F₁,C(k)    F_Hg(k)    F₁,Hg(k)    (F₁/F)_C    (F₁/F)_Hg
0.0        2.765     0.107      14.931     3.120       0.0387      0.209
1.0        2.501     0.105      13.505     3.062       0.0420      0.227
2.0        1.944     0.100      10.498     2.916       0.0514      0.278
3.0        1.418     0.092       7.657     2.683       0.0649      0.350

The value of F₁,C(0) will be increased by approximately 0.1 Å if inelastic scattering is included, with however a rapid decrease of this addition as k increases. The addition is roughly the same for mercury, so that the dominant effect in the heavy element is that arising from the elastic scattering. The impression conveyed by Table A.II is similar to that of Table A.1, namely, the effect of the nonlinearities is likely to be relatively unimportant, but not negligible. As indicated in Sections V and VI, the effect of the simplest nonlinearity (neglect of the |A|² term in I) can be directly checked, and this has been extensively used in Section VII. The modification of the test programs to allow use of exp(iη) instead of 1 + iη is a simple one and obviously should be carried out at an early date.
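The comparison suggested in the last sentence, exp(iη) against 1 + iη, can be made concrete with the η values of Table A.1; a short sketch:

```python
import cmath

# eta(r) values from Table A.1: carbon and mercury at r = 0.50 and 0.65 Angstrom.
etas = {"C, r=0.50": 0.0106, "C, r=0.65": 0.0057,
        "Hg, r=0.50": 0.0572, "Hg, r=0.65": 0.0310}

for label, eta in etas.items():
    exact = -1j * (cmath.exp(1j * eta) - 1)   # object function, Eq. (A.2)
    linear = eta                               # linearized form, Eq. (A.3)
    rel_err = abs(exact - linear) / eta        # fractional error of the linearization
    # Worst case (Hg at r = 0.50) is just under 3%: small but not negligible,
    # in agreement with the discussion above.
    assert rel_err < 0.03
```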
APPENDIX B. THE WIENER SPECTRUM OF THE OBJECT SET

The Wiener spectrum of the object set Y(k) is a central concept for the present discussion and deserves somewhat fuller treatment. We first ask how Y(k) will reflect prior assumptions about the object set more restrictive than that actually used to obtain Eq. (5.47). As an example, suppose that the sample is known to contain precisely N_m "molecules." A "molecule" is a planar array of atoms with accurately known interatomic distances, and each such structure can be translated and rotated (about an axis perpendicular to the sample plane) at random. Thus

O(x) = Σ_{n=1}^{N_m} K(x − xₙ, φₙ)   (B.1)

or

O(k) = N⁻² Σₓ O(x) exp(−ik·x)
     = N⁻² Σₙ Σₓ K(x − xₙ, φₙ) exp(−ik·x)
     = Σₙ exp(−ik·xₙ) N⁻² Σₛ K(s, φₙ) exp(−ik·s)
     = Σₙ exp(−ik·xₙ) X(k_x cos φₙ − k_y sin φₙ, k_x sin φₙ + k_y cos φₙ)   (B.2)
We now form

⟨O(k)O*(k′)⟩ = Σₙ Σₙ′ ⟨exp(ik′·xₙ′ − ik·xₙ)⟩ ⟨X(k, φₙ) X*(k′, φₙ′)⟩   (B.3)

where we are to average over all xₙ and φₙ, and X(k, φₙ) abbreviates the rotated transform appearing in Eq. (B.2). A simple lemma is needed, namely,

⟨exp(ik′·xₙ′ − ik·xₙ)⟩ = δₙₙ′ δ(k′ − k) + (1 − δₙₙ′) δ(k) δ(k′)   (B.4)

so that

⟨O(k)O*(k′)⟩ = δ(k′ − k) Σₙ ⟨|X(k, φₙ)|²⟩   (B.5)

plus a term which vanishes except for k = k′ = 0. It will immediately be recognized that the angular average on the right-hand side of Eq. (B.5) must give a result independent of the direction of k, and of n. We then obtain, from the definition (5.13),

Y(k) = N_m (2π)⁻¹ ∫₀^2π dφ |X(k, φ)|²   (B.6)
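The content of Eqs. (B.4) and (B.5) — random translations average the cross terms away, leaving N_m times the single-molecule power at k ≠ 0 — can be checked by simulation. A sketch with single-atom "molecules" on a discrete grid (grid size, molecule count, and trial count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, Nm, trials = 16, 5, 2000   # grid size, "molecules" per sample, ensemble size

acc = np.zeros((N, N))
for _ in range(trials):
    obj = np.zeros((N, N))
    # drop Nm unit "atoms" at random pixels (random translations of a point molecule)
    for _ in range(Nm):
        i, j = rng.integers(N), rng.integers(N)
        obj[i, j] += 1.0
    acc += np.abs(np.fft.fft2(obj)) ** 2

mean_power = acc / trials
# At k = 0 every trial gives exactly Nm^2; for k != 0 the random phases
# average the cross terms away and <|O(k)|^2> tends to Nm, as in Eq. (B.5).
assert abs(mean_power[0, 0] - Nm**2) < 1e-6
offzero = mean_power.copy()
offzero[0, 0] = np.nan
assert abs(np.nanmean(offzero) - Nm) < 1.0
```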
If, now, we assume specific atomic species and coordinates for the structure of a "molecule," we obtain

X(k, φ) = (h/mcβ) A⁻¹ Σₐ Fₐ(k) exp(−ik·xₐ)   (B.7)

where k can be taken as the vector (k cos φ, k sin φ). We then find

Y(k) = N_m (h/mcβ)² A⁻² Σ_{aa′} Fₐ(k) Fₐ′(k) (2π)⁻¹ ∫₀^2π dφ exp[ik·(xₐ − xₐ′)]
     = N_m (h/mcβ)² A⁻² Σ_{aa′} Fₐ(k) Fₐ′(k) J₀(k|xₐ − xₐ′|)   (B.8)

In order to describe the form of Y(k), it is convenient to consider the terms with a = a′ separately. Since J₀(0) = 1, we obtain

Y(k) = (h/mcβ)² (N_m/A²) Σₐ Fₐ²(k) + (h/mcβ)² (N_m/A²) Σ_{a≠a′} Fₐ(k) Fₐ′(k) J₀(kR_{aa′})   (B.9)
The first term on the right is immediately identifiable as the definition (5.47), with an obvious notational change. The second term has a much more complex structure and will oscillate violently for all k values satisfying

k R_min ≳ 1   (B.10)

where R_min is the minimum interatomic distance R_{aa′}. It is clear that if we were certain of all the R values for a "molecule," the form (B.9) would yield superior results in an attempt to reconstruct a micrograph. In actual fact, the unavoidable presence of such factors as radiation damage and rotations around other axes than the single axis considered must operate to smear badly the oscillatory part of (B.9). The result must certainly be that the oscillatory term is sharply reduced from the idealized form given, and we shall not further agonize over the truncation used.

Finally, we attempt to evaluate the gaussian assumption for the prior distribution of O(k). We first point out the obvious and unfortunate fact that the set of O(k) corresponding to randomly distributed atoms fails to satisfy some obvious relations between its moments, which would be required for a normal distribution. Consider the first moment, with the nonessential simplification that all atoms (N_a per unit area) are assumed identical. The factor h/mcβ is also dropped, for convenience. We accordingly obtain
O(k) = (N²Ω²)⁻¹ F(k) Σₙ exp(−ik·xₙ)   (B.11)

and use the obvious lemma

⟨exp(−ik·xₙ)⟩ = δ(k)   (B.12)

to obtain

⟨O(k)⟩ = (N²Ω²)⁻¹ (N_a N²Ω²) F(0) δ(k)   (B.13)

If we do the inverse transform, we find

O(x) = N_a F(0)   (B.14)

which corresponds precisely to the phase shift produced in an electron wave by propagating through the mean electrostatic potential of the sample atoms. A trivial subtraction of ⟨O(k)⟩ from O(k) itself yields a set of quantities with zero mean. Unfortunately, the cubic average of obvious interest

⟨[O(k) − ⟨O(k)⟩][O(k′) − ⟨O(k′)⟩][O(k″) − ⟨O(k″)⟩]⟩   (B.15)

still fails to vanish, and the hoped-for uncorrelated normal distribution is therefore skewed and correlated. It is finally of interest to ask what sort of object set would yield a normal, uncorrelated distribution for the O(k). The following form

O(k) = (h/mcβ)(N²Ω²)⁻¹ Σₐ ζₐ F(k) exp(−ik·xₐ)   (B.16)
has the desired properties if the xₐ are randomly distributed, as previously, and if the numbers ζₐ are normally distributed, with zero mean, and without correlation. We are saying here that all atoms have the same shape for their charge distributions, but have normally distributed strengths (ζₐ). We can obtain an extraordinarily good approximation to a normal distribution for the O(k) if we assume just four discrete possible values for the ζₐ, namely,

ζₐ = ±a, ±3.146264a   (B.17)

with probabilities of occurrence 0.908248 and 0.091752, respectively. Products of 1, 2, …, 7 O(k) factors would then all have the desired averages, while the eightfold product is of the desired form, but with a numerical coefficient 81, rather than the gaussian value 105. An element of unrealism arises with the introduction of negative scattering amplitudes. This can be partially repaired by adding a suitable constant to all amplitudes and subtracting ⟨O(k)⟩ from the definition (B.16), but full realism for a realizable object set seems difficult to achieve. In view of the apparent importance we have attached to the quantity Y(k), and the further assumption of a normal distribution for the O(k), it would appear to be of considerable interest to carry through reconstruction of a synthetic object, constructed according to (B.16). The comparison of interest would be with the corresponding reconstruction of a more realistic object. This comparison could clearly be carried out by a trivial modification of the job step OBJECT, but has not yet been performed. The crucial nature of the assumption of normal distribution is actually somewhat reduced by the existence of the original derivation of our reconstruction algorithm by Wiener, who would have worked only with Y(k), with the assumption of a best least-squares fit between the estimated and true O.

APPENDIX C. PROGRAMS FOR DETERMINING W(k)
Two job steps were not described in Section VI, and a brief outline will be attempted here. Bright field electron micrographs have the extraordinarily useful property that essentially complete information on the instrumental and exposure parameters for a given micrograph can be extracted from that micrograph. There thus becomes feasible a procedure which we shall term "adaptive reconstruction" (Welton, 1974; Welton and Harris, 1975) and which has been called "blind deconvolution" by Stockham, Cannon, and Ingebretson (Stockham et al., 1975). The basic procedure to be described would use as input the absolute square of the Fourier transform of the measured image function I(x). This can quickly be obtained in reasonable form on film exposed in the focal plane of a focused coherent light beam, in the path of which is placed the transparent micrograph. The resulting diffractogram (or its computed equivalent) has a striking appearance (see Plate 1d for a display of a diffractogram computed from a high-coherence synthetic micrograph). The function displayed is extremely noisy, and the innocuous appearance it presents to the eye is the result of a very large amount of sophisticated averaging and filtering which is performed automatically in the eye-brain system of the viewer in an attempt to create some sort of order out of a most disorderly situation. If, for example, a plot is made of the density values along a radial line passing through the center of the display, only a vague hint of the apparently systematic light-dark alternation will survive. The diffractogram can nevertheless be used to obtain reasonably accurate estimates of C₁, Δ, α [Eq. (2.22)], C₃ [Eq. (2.20)], the envelope parameters of Eq. (3.12), N_n [Eq. (4.11)], and N_c [Eq. (5.48)]. The basic observation is that if we define the diffractogram density as

D(k) = ⟨|I(k)|²⟩   (C.1)

where the angular brackets indicate a "suitable" smoothing of the measured values, then

D(k) ∝ Y(k) P²(k) + N(k)   (C.2)

The constant of proportionality in this relation turns out to be irrelevant, and Y(k), P(k), and N(k) have the meanings given in Eqs. (5.13), (3.11), and (4.39), respectively. Because of the vanishing of P(k) when φ(k) is a multiple of π, we can obtain values of N(k) along such contours ("dark rings" in the diffractogram). With the assumption of reasonable smoothness for N(k), subtraction of an estimated N(k) can be made along contours where φ(k) is an odd multiple of π/2, and Y(k) exp[−2E(k)] can there be found by using sin φ = ±1. The function Y(k) is then represented in terms of F_c(k) and an equivalent density of carbon atoms N_c, and a sampling of values for E(k) can be obtained by division. The known analytic form for φ(k), Eq. (2.20), is of great importance in using the contours of minimum D(k) to determine the aberration parameters and also in reliably locating the contours where sin φ = ±1. The procedure here outlined is simple, in principle, but surprisingly difficult to carry out. Conceptually it is clear, but rigorous it very definitely is not. Central to the whole procedure is a conjecture which states (approximately, at least) that all reasonable forms of averaging will lead to essentially the same result. A rather similar assertion of the equivalence of time averaging and ensemble averaging underlies statistical mechanics, without proven fault. For the problem at hand, we do not expect quite the same equivalence of spatial and ensemble averaging, but the procedure used is plausible and not easy to improve on.
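The separation underlying Eq. (C.2) can be sketched for a one-dimensional radial profile: sample D(k) at the dark rings for the noise floor, and at the extrema of sin φ for Y exp(−2E). The helper below assumes φ(k) is already known, whereas the actual procedure fits φ from the dark-ring positions; all names are illustrative:

```python
import numpy as np

def separate_signal_noise(k, D, phi, ngrid=2048):
    """Split a smoothed radial diffractogram D(k) into noise and signal samples.

    Per Eq. (C.2), D(k) ~ Y(k) exp(-2E) sin^2(phi(k)) + N(k):
      - where sin(phi) = 0 ("dark rings"), D gives the noise floor N;
      - where sin(phi) = +-1, D minus that floor gives Y exp(-2E).
    k and D are 1D radial samples; phi is a callable giving the phase.
    """
    kk = np.linspace(k[0], k[-1], ngrid)
    s = np.sin(phi(kk))
    dark = kk[np.where(np.diff(np.sign(s)) != 0)]            # zero crossings of sin(phi)
    noise = np.interp(dark, k, D).mean()                     # assume N(k) roughly flat
    ds = np.diff(s)
    bright = kk[1:-1][np.where(np.diff(np.sign(ds)) != 0)]   # extrema, |sin(phi)| = 1
    signal = np.interp(bright, k, D) - noise                 # Y exp(-2E) samples
    return noise, bright, signal
```

On a synthetic profile with a known phase the recovered noise floor and signal samples reproduce the inputs, which is the self-consistency the adaptive procedure relies on.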
The definition of the smoothing process to be used for D(k) is then central to our procedure. If there were no axial astigmatism in the microscope, a simple circular averaging would be clearly appropriate. Although expert operation and careful selection of micrographs should allow such circular averaging in principle, it was felt that a less demanding procedure would be much more generally useful. The averaging procedure chosen was therefore convolution of |I(k)|² with a gaussian kernel of adjustable width. Thus

D(k) = (R²/π) ∫ dk′ exp[−R²|k − k′|²] |I(k′)|²   (C.3)
which operation is naturally carried out by double use of the fast Fourier transform. Thus, define the autocorrelation function for I(x) by

D(x) = Σₖ |I(k)|² exp(ik·x)   (C.4)

multiply by the transform of the kernel of (C.3), and calculate the inverse transform. Thus

D(k) = N⁻² Σₓ exp(−|x|²/2R²) D(x) exp(−ik·x)   (C.5)
which can be calculated with great efficiency to yield a smoothed version of the diffractogram. Some caution is necessary in the choice of the parameter R. If R is chosen too large, the smoothing will be inadequate, whereas with too small an R, the essential light-dark alternation will be strongly attenuated. Simple inspection of the easily obtained optical diffractogram will yield a value for the shortest wavelength effectively present, and the R value can then be easily set. The job step just described (SPEC) produces as output a temporary disk file containing D(k), which is then available as input for the step WIENER. This final step is simple in principle, but quite complex in execution. Essentially, it explores the variation of D(k) along a set of 20 radial lines (through k = 0), covering an angular range of 180° in the k space, at uniform angular increments. The other half space is identical and can contribute no further information. Along each radial line, the k values for the minima are noted, and the values of D(k) recorded. A decision is made for each minimum as to what multiple of π the phase φ there equals. Once this tabulation is complete, the values of k at each angle at which φ is an odd multiple of π/2 can be found and the corresponding values of D(k) can be recorded. A least-squares fit (linear) can then be made to φ(k), by adjusting C₁, Δ, α, and C₃. If the function E(k) is taken as a sixth-degree polynomial in k_x and k_y (even powers only), another linear least-squares fit
will determine the coefficients, as well as the required estimate for N_c. The noise parameter N_n is, of course, determined from the values of D at the minima. A useful degree of immunity to the twin hazards of too large or too small an R value has been built into the program, which appears to perform easily and reliably. Mention should be made of several other procedures that have been proposed. The first (Frank et al., 1971) has been tested and appears workable. Since it requires a nonlinear least-squares fit, it may perhaps prove a bit balky in some cases. Another (Frank, 1976) appears to be potentially as useful as the method we have described. It requires two micrographs taken under identical conditions, but appears applicable to cases where the number of minima in D(k) may be too low for the above procedures to work well.
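The SPEC smoothing operation, Eqs. (C.3)–(C.5) — gaussian convolution of |I(k)|² carried out by two fast Fourier transforms through the autocorrelation function — can be sketched as follows (array size and the width parameter R are arbitrary here):

```python
import numpy as np

def smoothed_diffractogram(image, R):
    """Gaussian-smoothed diffractogram via double use of the FFT [Eqs. (C.3)-(C.5)].

    Convolving |I(k)|^2 with a gaussian in k is done by transforming to the
    autocorrelation D(x), multiplying by a gaussian taper, and transforming back.
    """
    n = image.shape[0]                             # square array assumed
    power = np.abs(np.fft.fft2(image)) ** 2        # |I(k)|^2, the raw diffractogram
    auto = np.fft.ifft2(power)                     # autocorrelation D(x), Eq. (C.4)
    x = np.fft.fftfreq(n) * n                      # pixel offsets, centered at 0
    xx, yy = np.meshgrid(x, x, indexing="ij")
    taper = np.exp(-(xx**2 + yy**2) / (2 * R**2))  # real-space gaussian taper, Eq. (C.5)
    return np.real(np.fft.fft2(auto * taper))      # smoothed D(k)
```

As noted above, R trades the degree of smoothing against attenuation of the light-dark alternation; the mean level of the diffractogram is preserved exactly, while the pixel-to-pixel fluctuations are suppressed.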
REFERENCES

Archard, G. D. (1955). Proc. Phys. Soc. London, Ser. B 68, 156.
Burfoot, J. C. (1952). Thesis, University of Cambridge.
Chiu, W., and Glaeser, R. M. (1977). Ultramicroscopy 2, 207.
Deltrap, J. H. M. (1964). Thesis, University of Cambridge.
Erickson, H. P., and Klug, A. (1971). Philos. Trans. R. Soc. London, Ser. B 261, 105.
Fellgett, P. B., and Linfoot, E. H. (1955). Philos. Trans. R. Soc. London, Ser. A 247, 369.
Frank, J. (1973). Optik (Stuttgart) 38, 519.
Frank, J. (1975). Proc., Electron Microsc. Soc. Am. No. 2, 182.
Frank, J. (1976). Proc., Electron Microsc. Soc. Am. No. 2, 478.
Frank, J., Bussler, P. H., Langer, R., and Hoppe, W. (1971). Electron Microsc., Proc. Int. Congr., 7th, 1970, p. 17.
Hahn, M. (1973). Nature (London) 241, 445.
Hahn, M., and Baumeister, W. (1973). Cytobiologie 7, 224.
Hanszen, K. J., and Trepte, L. (1971). Optik (Stuttgart) 32, 519.
Harada, Y., Goto, T., and Someya, T. (1974). Proc., Electron Microsc. Soc. Am., p. 388.
Langer, R., Frank, J., Feltynowski, A., and Hoppe, W. (1971). Electron Microsc., Proc. Int. Congr., 7th, 1970, Vol. 1, p. 19.
Röhler, R. (1967). "Informationstheorie in der Optik," p. 175. Wiss. Verlagsges., Stuttgart.
Rose, H. (1971). Optik (Stuttgart) 34, 285.
Scherzer, O. (1936). Z. Phys. 101, 593.
Scherzer, O. (1947). Optik (Stuttgart) 2, 114.
Scherzer, O. (1949). J. Appl. Phys. 20, 20.
Seeliger, R. (1951). Optik (Stuttgart) 5, 490.
Siegel, B. M. (1971). Philos. Trans. R. Soc. London, Ser. B 261, 5.
Stockham, T. G., Jr., Cannon, T. M., and Ingebretson, R. B. (1975). Proc. IEEE 63, 678.
Stroke, G. W., and Halioua, M. (1973). Optik (Stuttgart) 37, 192.
Stroke, G. W., Halioua, M., Thon, F., and Willasch, D. (1974). Optik (Stuttgart) 41, 319.
Thon, F., and Siegel, B. M. (1970). Ber. Bunsenges. Phys. Chem. 74, 1116.
Thon, F., and Siegel, B. M. (1971). Electron Microsc., Proc. Int. Congr., 7th, 1970, p. 13.
Thon, F., and Willasch, D. (1971). Proc., Electron Microsc. Soc. Am., p. 38.
IMAGE THEORY FOR BRIGHT FIELD ELECTRON MICROSCOPY
Welton, T. A. (1969). Proc., Electron Microsc. Soc. Am., p. 182.
Welton, T. A. (1970). Proc., Electron Microsc. Soc. Am., p. 32.
Welton, T. A. (1971a). Proc., Electron Microsc. Soc. Am., p. 94.
Welton, T. A. (1971b). Proc. Workshop Conf. Microsc. Cluster Nuclei Defected Cryst., Chalk River Nucl. Lab. CRNL-622-1, p. 125.
Welton, T. A. (1974). Proc., Electron Microsc. Soc. Am., p. 338.
Welton, T. A. (1975). Proc., Electron Microsc. Soc. Am., p. 196.
Welton, T. A., and Harris, W. W. (1975). Electron Microsc., Proc. Int. Congr., 8th, 1974, p. 318.
Welton, T. A., Ball, F. L., and Harris, W. W. (1973). Proc., Electron Microsc. Soc. Am., p. 270.
Wiener, N. (1949). "The Interpolation, Extrapolation, and Smoothing of Stationary Time Series." Wiley, New York.
Worsham, R. E., Mann, J. E., and Richardson, E. G. (1972). Proc., Electron Microsc. Soc. Am., p. 426.
Worsham, R. E., Mann, J. E., Richardson, E. G., and Ziegler, N. F. (1973). Proc., Electron Microsc. Soc. Am., p. 260.
Young, R. D., and Müller, E. W. (1959). Phys. Rev. 113, 115.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 48
Fluid Dynamics and Science of Magnetic Liquids

RONALD E. ROSENSWEIG

Corporate Research Laboratories
EXXON Research and Engineering Company
Linden, New Jersey

I. Structure and Properties of Magnetic Fluids ........................... 103
   A. Introduction ...................................................... 104
   B. Stability of the Colloidal Dispersion ............................. 108
   C. Equilibrium Magnetic Properties ................................... 111
   D. Magnetization Kinetics ............................................ 114
   E. Viscosity ......................................................... 117
   F. Tabulated Data and Other Properties ............................... 120
II. Fluid Dynamics of Magnetic Fluids ................................... 122
   A. Magnetic Stress Tensor and Body Force ............................. 122
   B. Alternate Forms ................................................... 126
   C. Generalized Bernoulli Equation .................................... 128
   D. Summary of Inviscid Relationships ................................. 130
   E. Basic Flows ....................................................... 131
   F. Instabilities and Their Modification .............................. 141
III. Magnetic Fluids in Devices ......................................... 157
   A. Seals ............................................................. 158
   B. Bearings .......................................................... 163
   C. Dampers ........................................................... 175
   D. Transducers ....................................................... 177
   E. Graphics .......................................................... 179
   F. Other ............................................................. 183
IV. Processes Based on Magnetic Fluids .................................. 186
   A. Magnetohydrostatic Separation ..................................... 187
   B. Liquid/Liquid Separations ......................................... 189
   C. Energy Conversion ................................................. 189
   D. Other ............................................................. 190
List of Symbols ......................................................... 192
References .............................................................. 195
I. STRUCTURE AND PROPERTIES OF MAGNETIC FLUIDS
In the past, classical mechanics and thermodynamics of fluids have dealt mainly with fluids having no appreciable magnetic moment. In the last ten to fifteen years, problems of the mechanics and physics of liquids with strong magnetic properties have been attracting increasing attention, and the fluid
dynamics of magnetic fluids, similar to magnetohydrodynamics, has begun to be considered as a branch of mechanics. Reviews of the subject are given by Rosensweig (1966a, 1971a), Bertrand (1970), Shliomis (1974), and Khalafalla (1975).
Fluid media composed of solid magnetic particles of subdomain size colloidally dispersed in a liquid carrier are the basis for the highly stable, strongly magnetizable liquids known as magnetic fluids or ferrofluids. The number density of particles in suspension is on the order of 10²³/m³. It is the existence of these synthetic materials that makes the study of magnetic-liquid fluid dynamics (ferrohydrodynamics) possible.
The practitioner of ferrohydrodynamics may well be content to accept the available ferrofluids with their empirically reported properties as given quantities. Others may desire a fuller treatment. While a thorough discussion of magnetic fluid structure and properties could occupy an entire chapter, in this section an intermediate path is followed that emphasizes topics mainly concerning fluid dynamical behavior. In addition, an indication is provided of information in the literature relating to broader aspects of the fluids, e.g., preparation, physicochemical behavior, optical and acoustic properties, etc.
A. Introduction
The synthesis and systematic study of the properties of magnetic fluids was started in the 1960s (Papell, 1965; Rosensweig et al., 1965). These ferrofluids have little in common with the magnetic suspensions of particles used in the magnetic clutches that came into use in the 1940s. There, the suspensions were of a ferromagnetic powder, such as carbonyl iron, in a mineral oil, and the particle dimensions were in the range 0.5–40 μm. The technical application of such suspensions was based on their property of congealing under the influence of a magnetic field (Rabinow, 1949). Ferrofluids differ from these coarse suspensions primarily by the thousand-times-smaller dimensions of the suspended particles (a billion times smaller in volume). Depending on the ferromagnetic material and the method of preparation, the mean diameter varies from less than 3 to 15 nm. A properly stabilized ferrofluid undergoes practically no aging or separation, remains liquid in a magnetic field, and after removal of the field completely recovers its characteristics; e.g., there is no magnetic remanence.
The magnetic particles in the colloidal dispersion of a ferrofluid are constantly attracted in the direction of an applied field gradient. Their tendency to drift in the gradient is counteracted by diffusive motion due to thermal agitation. Boltzmann statistics gives a criterion for maximum particle size that may be stated as follows:
d ≤ [6kT/(μ₀M_dH)]^{1/3}    (1)

(In this equation and throughout this chapter, SI metric units are employed.) This criterion demands that in a monodisperse mixture the difference in concentration anywhere in the system will not exceed the average concentration. With k = 1.38 × 10⁻²³ N·m/K, T = 298 K, H = 1.59 × 10⁶ A/m (20,000 Oe), and considering magnetite particles with domain magnetization of 4.46 × 10⁵ A/m (5600 G), Eq. (1) gives d < 3.0 × 10⁻⁹ m, or 3.0 nm, which falls at the lower boundary of the range for actual ferrofluids noted above. The finite volume occupied by the particles in a ferrofluid will often limit concentration variations more than indicated here.
It is interesting to compare the magnitude of the magnetic force to the gravitational force on a particle in the ferrofluid. The ratio of these forces is independent of particle size:

magnetic force / gravitational force = μ₀M|∇H| / gΔρ    (2)
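As a numerical check, the size criterion of Eq. (1) and the force ratio of Eq. (2) can be evaluated directly. The sketch below is illustrative only, assuming the SI values quoted in this section for magnetite (domain magnetization 4.46 × 10⁵ A/m, field 1.59 × 10⁶ A/m, seal-like gradient 1.2 × 10¹⁰ A/m², density difference 4500 kg/m³).

```python
import math

MU0 = 4 * math.pi * 1e-7   # vacuum permeability, T*m/A
K_B = 1.38e-23             # Boltzmann constant, J/K


def max_stable_diameter(M_d, H, T=298.0):
    """Largest particle diameter (m) for which thermal agitation still
    counteracts drift in a field of intensity H (A/m), per Eq. (1)."""
    return (6 * K_B * T / (MU0 * M_d * H)) ** (1.0 / 3.0)


def magnetic_to_gravity_ratio(M_d, grad_H, delta_rho, g=9.81):
    """Ratio of magnetic to gravitational force on a particle, Eq. (2);
    the particle size cancels out of this ratio."""
    return MU0 * M_d * grad_H / (g * delta_rho)


# Magnetite in a 1.59e6 A/m (20,000 Oe) applied field
d_max = max_stable_diameter(4.46e5, 1.59e6)                # about 3.0 nm
# Seal-like gradient, organic carrier density difference
ratio = magnetic_to_gravity_ratio(4.46e5, 1.2e10, 4500.0)  # about 1.5e5
```

Both numbers reproduce the values stated in the text, confirming that size cancels from the force ratio while entering the stability criterion as d³.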
Under the extreme conditions of high field gradient occurring in certain devices, such as ferrofluid seals, the field gradient ∇H reaches magnitudes of 1.2 × 10¹⁰ A/m². For magnetite particles in a typical organic fluid carrier, using Δρ = 4500 kg/m³, the force ratio exceeds 1.5 × 10⁵. This is a huge number, and many colloids ordinarily considered stable under gravity settling conditions cannot perform as ferrofluids.

Preparation of Magnetic Fluids
There are two broad ways to make a magnetic fluid: size reduction of coarse material and chemical precipitation of small particles. Size reduction has been done by spark evaporation–condensation, electrolysis, and grinding. Chemical routes include decomposition of metal carbonyls (Thomas, 1966; Hess and Parker, 1966) and precipitation from salt solutions (Khalafalla and Reimers, 1973, 1974). Thus far the grinding technique introduced by Papell (1965) has been most used, but for large-scale production the synthetic methods may be more suitable.
Grinding should be done in a liquid, preferably one with a low viscosity, and in the presence of a suitable dispersing agent. Long processing times are the rule (5 to 20 weeks). Specialized techniques have been developed for finish treating the product. Exchange of the solvent increases the concentration of magnetic solids and removes excess dispersant from solution (Rosensweig,
1970). Under some conditions it is possible to exchange the surfactant on the particle surface (Rosensweig, 1975). In addition, evaporative removal of solvent and dilution with carrier fluid are commonly used to adjust the particle number concentration. An electron micrograph of the particles in a ferrofluid prepared by grinding is shown in Fig. 1a, and a histogram of the particle sizes in Fig. 1b.
Ferrofluids have been prepared in numerous diverse solvents including water, glycerol, paraffinics, aromatics, esters, halocarbons, and some silicones (Rosensweig and Kaiser, 1967; Kaiser and Rosensweig, 1968). To be a good dispersant, molecules should have
(1) A "head" that adsorbs on the particle surface. Examples are molecules containing a polar group of carboxyl, sulfosuccinate, phosphonate, phosphoric acid, or amine. With polymers, less active groups like succinimide or vinyl acetate can be sufficient.
(2) A "tail" (or loop) 1–2 nm in length that is compatible with the base fluid. Chemical structural similarity is a good criterion for this.
It is found that bulges (methyl groups on polyisobutylene) or kinks (conjugated bonds in oleic and linoleic acid) in the tails or loops prevent crystallization (association with their own species) and therefore are favorable (Scholten, 1978). Polymers having several anchor groups are the most tenacious stabilizers available; a disadvantage of polymers is the excessive space they tend to occupy. The use of "solubility parameters" should be useful in predicting the compatibility of the anchored tail with the solvent. Finally, while the search for a dispersing agent can be aided by these general rules, only an experiment can be decisive.
Aqueous molecular solutions of paramagnetic salts provide magnetization M greater than 2.3 × 10³ A/m in an applied field of 8 × 10⁵ A/m and have been used for sink/float separation of minerals (Andres, 1976a,b). These paramagnetic solutions may be preferred when high homogeneity is essential and a high magnetic field is available. In another direction, statistical mechanical studies such as that of Hemmer and Imbro (1977) are tantalizing in defining the molecular parameters of a liquid in order that ferromagnetism may exist; theoretically, the Curie temperature exceeds the melting temperature of a substance if the exchange interaction is sufficiently strong, and

FIG. 1. (a) Electron micrograph of Fe₃O₄ magnetic particles prepared by grinding in aqueous carrier fluid. The bar represents 100 nm. (Ferrofluid A01, Ferrofluidics Corporation. Photo courtesy EXXON Research and Engineering Company.) (b) Particle size histogram with the gaussian fit (solid line). Open circles represent the number of particles. The bars in each interval represent the statistical uncertainty in sampling. (From McNab et al., 1968.)
crystalline order is not required. A gold–cobalt eutectic melt is reported as ferromagnetic by more than one investigation, but the finding is controversial (Kraeft and Alexander, 1973).
B. Stability of the Colloidal Dispersion

Four different interparticle forces are encountered in magnetic dispersions: van der Waals attraction, magnetic attraction, steric repulsion, and electric repulsion.

1. Van der Waals Attraction
The van der Waals–London or dispersion force is due to the interaction between orbital electrons in one particle and the oscillating dipoles they induce in the other. For two equal spherical particles it is given by the expression of Hamaker (Kruyt, 1952):

E_A = −(A/6)[2/(l² + 4l) + 2/(l + 2)² + ln((l² + 4l)/(l + 2)²)]    (3)
where A is a dimensional quantity that can be calculated from the (UV) optical dielectric properties of the particles and medium. For most material combinations A is known only to within a factor of 3. For iron, as well as for γ-Fe₂O₃ and Fe₃O₄ in hydrocarbon media, a value of 10⁻¹⁹ N·m is taken as representative. l is the relative surface separation, defined as
l = (r_c/r) − 2    (4)
where r_c is the center-to-center distance and r is the radius of a particle. From Eq. (3) it follows that E_A is the same for any pair of equal-size spheres at the same l. This potential, plotted in Fig. 2, is powerful but effectively works over a short range only.

2. Magnetic Attraction
Calculation of the critical dimensions, below which a particle becomes absolutely single domain, leads to values of d ranging from tens of nanometers [33 and 76 nm for iron and nickel, respectively; see Brown (1969)] to several hundred nanometers for materials with strong magnetic anisotropy [4 × 10² and 13 × 10² nm for manganese–bismuth alloy and for barium ferrite; see Wohlfarth (1959)]. Thus the particles of a ferrofluid may be regarded as single domain and, hence, uniformly magnetized.

FIG. 2. Influence of film thickness δ on the agglomeration stability for magnetic particles with radius r = 5 nm.

The magnetic
potential energy E_M of a particle pair is then exactly described by the formula for point dipoles. When the magnetic moments of the particles are collinear, the energy is maximum and can be written as follows:

E_M = −μ₀m²/2πr_c³ = −μ₀m²/[2πr³(l + 2)³]    (5)

This potential, plotted in Fig. 2, is relatively long-range, changing slowly with separation of the particles. In the absence of a field, thermal motion tends to disorient the dipoles and the attraction energy is lower; Scholten (1978) gives an expression for that case.

3. Steric Repulsion
Steric repulsion is encountered with particles that have long, flexible molecules attached to their surface. As mentioned previously, the molecules can be simple linear chains with an anchoring polar group at one end, e.g., fatty acids, or long polymers with many polar groups along the chain so that adsorption occurs with loops. Except for the anchor part, the adsorbed molecules perform thermal movements. When a second particle approaches closely, the positions the chains can take up are restricted. Just as when the volume of a gas is decreased, this loss of space (entropy) requires work when done at constant temperature. For chains that have a tendency to bind solvent molecules, the approach also involves the energy of breaking these chain–solvent bonds. Polyethylene oxide in water is an example where this occurs. This second (enthalpic) effect can work both ways: if the polymer
molecules would rather associate with their own species, repulsion is reduced or even changed to attraction. This term makes steric stabilization very sensitive to the solvent composition.
Calculation of the repulsion energy for adsorbed polymers is difficult and the results are uncertain. For the short chains often used in magnetic fluids, however, an estimate can be made of the entropic effect. The theoretical result of Mackor (1951) for planar geometry was extended to spheres with the following result (Rosensweig et al., 1965):

E_S/kT = 2πr²N[2 − ((l + 2)/t) ln((1 + t)/(1 + l/2)) − l/t]    (6)
N is the number of adsorbed molecules per unit area and t = δ/r, where δ is the length of the chain, regarded as a rigid rod. l is the relative surface separation defined previously. The cross-sectional area of an oleic acid molecule is about 46 × 10⁻²⁰ m² and the extended length about 2 nm. For the sake of calculation it is assumed that N = 1 × 10¹⁸ molecules per square meter, corresponding to a fractional surface coverage of about 50%. Figure 2 gives the steric repulsion curve with δ = 2 nm for a particle having a radius r of 5 nm. The steric repulsion decreases with increasing separation of the particles, becoming zero at l = 0.8, corresponding to a separation of 2δ. A second curve illustrates steric repulsion for a shorter chain, δ = 0.5 nm.
The electric repulsion mechanism has not been used in actual magnetic fluids. It does play a role, however, in several preparative methods. The electric repulsion between particles is the Coulomb repulsion of charged surfaces; the surface charges result from ions removed from the surface or adsorbed from the liquid. The repulsive force is reduced by the screening action of the surrounding ions by a mechanism that is well understood (Verwey and Overbeek, 1948).

4. Stability Related to Net Potential Curves
The algebraic sum of the repulsion and attraction energies yields the net potential curves shown as dashed lines in Fig. 2. For δ = 2 nm the net curve displays an energy barrier of about 25kT, more than sufficient to prevent agglomeration. In comparison, the net curve for δ = 0.5 nm corresponds to attraction between the particles at all separation distances, and hence the system fails to stabilize. These trends are rather informative even though the calculations are only crudely quantitative.
The steric stabilization mechanism is not available in liquid metals, and to date no truly stable dispersions have been produced (Rosensweig et al.,
1965; Shepherd et al., 1972). Although fine iron particles produced by electrodeposition in mercury are well wetted by the mercury, in a magnetic field gradient an iron-free portion of mercury is expelled from the mixture. The concentrated magnetic portion that remains is stiff (a Bingham plastic). Exploitation of work function differences may permit electric charge stabilization in the future, and there is need for a fundamental investigation of the surface and colloidal physics of these systems.
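The net-potential construction of Fig. 2 can be reproduced numerically. The sketch below sums Eqs. (3), (5), and (6) in units of kT, assuming the representative inputs used in this section (A = 10⁻¹⁹ N·m, r = 5 nm, δ = 2 nm, N = 10¹⁸ m⁻², magnetite domain magnetization); the exact barrier height depends on these assumed values.

```python
import math

K_B, T = 1.38e-23, 298.0
KT = K_B * T


def e_vdw(l, A=1e-19):
    """Van der Waals attraction between equal spheres, Eq. (3), in units of kT."""
    s = l * l + 4.0 * l
    q = (l + 2.0) ** 2
    return -(A / (6.0 * KT)) * (2.0 / s + 2.0 / q + math.log(s / q))


def e_mag(l, r=5e-9, M_d=4.46e5):
    """Collinear dipole-dipole attraction, Eq. (5), in units of kT."""
    m = M_d * (4.0 / 3.0) * math.pi * r ** 3      # moment of a uniformly magnetized sphere
    mu0 = 4 * math.pi * 1e-7
    return -mu0 * m * m / (2.0 * math.pi * (r * (l + 2.0)) ** 3 * KT)


def e_steric(l, r=5e-9, delta=2e-9, N=1e18):
    """Entropic steric repulsion, Eq. (6), in units of kT; zero beyond l = 2t."""
    t = delta / r
    if l >= 2.0 * t:
        return 0.0
    return 2.0 * math.pi * r * r * N * (
        2.0 - ((l + 2.0) / t) * math.log((1.0 + t) / (1.0 + l / 2.0)) - l / t)


# Height of the net energy barrier for delta = 2 nm (the text quotes ~25 kT)
barrier = max(e_vdw(l) + e_mag(l) + e_steric(l)
              for l in [i / 1000.0 for i in range(10, 800)])
```

The maximum of the net curve lands in the low-20s of kT, consistent with the "about 25kT" figure quoted above for the δ = 2 nm coating.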
C. Equilibrium Magnetic Properties
Suspended in a fluid, each particle with its embedded magnetic moment m is analogous to a molecule of a paramagnetic gas. At equilibrium the tendency for the dipole moments to align with an applied field is partially overcome by thermal agitation. Langevin's classical theory can be applied to give the superparamagnetic result provided there is negligible particle-to-particle magnetic interaction (Jacobs and Bean, 1963). The orientation energy of a particle of volume V with dipole moment m = M_dV making an angle θ with the magnetic field H is

U = −μ₀mH cos θ    (7)

Boltzmann statistics gives the angular distribution function over an ensemble of particles and from it the average component m̄ in the direction of the field:

m̄/m = coth α − 1/α,  α = μ₀mH/kT    (8)

Unlike in the original application to paramagnetism, here the magnetic moment per particle is a function of temperature. For spherical grains

m = πd³M_d/6    (9)
MfMd = &i/m Combining Eqs. (8), (9), and (10) gives
M
__ = coth (PMd
1
a -U
= L(a)
(10) II po MdHd3
a =-
6
kT
(11)
where L(a) denotes the Langevin function. Figure 3 gives magnetization curves computed from Eq. (1 1) for various particle sizes.
112
RONALD E. ROSENSWEIG
0
01
02
03
04
05
Applied induction, B(teslas1
FIG. 3. Calculated magnetization curves for monodisperse spherical particles with the domain magnetization of Fe₃O₄ (4.46 × 10⁵ A/m). (After Kaiser and Rosensweig, 1968.)
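For orientation, Eq. (11) can be evaluated directly. The following sketch, assuming the Fe₃O₄ domain magnetization of the figure, reproduces the trend of Fig. 3 that larger grains approach saturation at much lower applied induction B = μ₀H.

```python
import math


def langevin(a):
    """L(a) = coth(a) - 1/a, with the small-argument limit a/3."""
    return 1.0 / math.tanh(a) - 1.0 / a if a > 1e-6 else a / 3.0


def reduced_magnetization(B, d, M_d=4.46e5, T=298.0):
    """M/(phi*Md) from Eq. (11) for monodisperse spheres of diameter d (m)
    at applied induction B = mu0*H (teslas)."""
    k = 1.38e-23
    alpha = math.pi * M_d * B * d ** 3 / (6.0 * k * T)
    return langevin(alpha)


m_small = reduced_magnetization(0.05, 5e-9)    # fine particles: far from saturation
m_large = reduced_magnetization(0.05, 10e-9)   # larger particles orient easily
```

At B = 0.05 T the 10-nm grains are already well oriented while the 5-nm grains remain in the nearly linear regime, which is the physical basis of the particle-sizing arguments that follow.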
The asymptotic form of the Langevin function for values of the parameter α small compared to one is L(α) ≈ α/3. Thus the initial susceptibility is given by

χ_i = M/H = πφμ₀M_d²d³/18kT    (12)

and the approach to saturation, for α ≫ 1, by

M = φM_d(1 − 1/α) = φM_d[1 − 6kT/πμ₀M_dHd³]    (13)
Bibik et al. (1973) determined particle size from a plot of M versus 1/H using Eq. (13). In weak fields the chief contribution to the magnetization is made by the larger particles, which are more easily oriented by a magnetic field, whereas the approach to saturation is determined by the fine particles, orientation of which requires large fields. Thus d computed from Eq. (12) always exceeds the value found from Eq. (13).
When the initial permeability is appreciable, it is no longer permissible to neglect the interaction between the magnetic moments of the particles. Shliomis (1974) discusses this case, assuming the particles are monodisperse, by a method similar to that used in the Debye–Onsager theory of polar liquids. As a result, formula (12) is replaced by the Onsager-type relation

χ(2χ + 3)/(χ + 1) = 3χ_L    (14)

where χ_L is the Langevin susceptibility of Eq. (12).
In actual ferrofluids there are two additional influences that must be accounted for in relating composition to the magnetization curve (Kaiser and Miskolczy, 1970a). One of these influences is the distribution of particle size, which may be determined by means of an electron microscope. The other is the decrease of the magnetic diameter of each particle by the amount 2Δ, where Δ is the thickness of a nonmagnetic surface layer formed by chemical reaction with the adsorbed dispersing agent; for example, for magnetite one takes Δ = 0.83 nm, corresponding to the lattice constant of the cubic structure. The dispersant may be oleic acid, which enters into a reaction with the Fe₃O₄; the iron oleate formed possesses negligible magnetic properties. For a solid particle of 10-nm diameter the volume fraction of magnetic solids is then 0.58. Thus, better agreement is observed between the experimental magnetization curves and theoretical curves calculated with the formula

M = M_d Σ_i φ_i [(d_i − 2Δ)/d_i]³ L(α_i),  α_i = πμ₀M_d(d_i − 2Δ)³H/6kT    (15)

where φ_i is the volume fraction of particles of solid diameter d_i.
The dead surface layer mechanism is consistent with experiments of various authors, discussed in Bean and Livingston (1959), that detect no decrease of the spontaneous magnetization of subdomain particles having no sorbed layers for diameters down to at least 2 nm. Mössbauer and magnetic data on NiFe₂O₄ dispersions indicate that the loss of magnetization due to sorption of dispersing agent on the surface layer is not due to a magnetic "dead" layer as such; the cations at the particle surface are magnetically ordered but pinned so as to remain at large angles with respect to the direction of an applied field (Berkowitz et al., 1975). The net effect in Eq. (15) is unchanged.
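The magnitude of the dead-layer correction in Eq. (15) is easy to check. This minimal sketch assumes the Δ = 0.83 nm magnetite value quoted above.

```python
def magnetic_volume_fraction(d, delta=0.83e-9):
    """Fraction of a particle's volume that remains magnetic after subtracting
    a nonmagnetic surface layer of thickness delta from a grain of solid
    diameter d (m), per the bracketed factor in Eq. (15)."""
    return ((d - 2.0 * delta) / d) ** 3


frac_10nm = magnetic_volume_fraction(10e-9)   # about 0.58, as quoted in the text
frac_5nm = magnetic_volume_fraction(5e-9)     # the correction is harsher for small grains
```

The steep cubic dependence shows why the fine end of the size distribution contributes disproportionately little to the saturation magnetization.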
Formation of Chains and Clusters
Interesting results pertaining to the formation of chains of colloidal magnetic particles and to the effect of a uniform magnetic field on this process were obtained by de Gennes and Pincus (1970) and Jordan (1973). They considered some of the properties of the equation of state of a "rarefied gas" of ferromagnetic particles suspended in an inert liquid. Allowance was made for the departure of the gas from ideality insofar as this resulted from the magnetic attraction between the particles, neglecting any other forces that may be present. By considering pair correlations between particle positions, it was found that in strong external fields the ferromagnetic grains tend to form chains parallel to the field direction. The mean number of particles n_∞
in the chain is

n_∞ = [1 − (4φλ/3)e^{2λ}]⁻¹    (16)

where λ is the (dimensionless) coupling coefficient

λ = μ₀m²/4πd³kT    (17)

which measures the strength of the grain–grain interaction. When the second term on the right side of Eq. (16) exceeds unity, the approximations break down; it may be that clusters rather than chains will then form in the liquid. At zero external field and λ ≫ 1, there also exists a certain number of chains according to the prediction. Their mean length, given by

n₀ = [1 − (4φ/3)e^{2λ}]⁻¹    (18)

is smaller than in a strong field, and they are oriented in a random manner.
is smaller than in a strong field, and they are oriented in a random manner. For a magnetite particle with outer diameter 10 nm and A = 0.8 nm, the magnetic diameter is d = 8.4 nm, so with M = 446 kA/m and T = 298 K the coupling coefficient from Eq. (17) is 1= 0.78. With 4 = 0.05 in Eq. (16), the value of n , = 1.35 and from Eq. (18), no = 1.50.The particles are essentially monodisperse and the colloid has little clustering or agglomeration. A particle of elemental iron must be smaller to avoid clustering. The domain magnetization of iron is about four times that of magnetite, so from Eq. (17) the magnetic diameter yielding the same value of 1 is reduced by a factor of 42/3to 3.4 nm. Peterson and Krueger (1978) studied in situ particle clustering of ferrofluids in a vertical tube subjected to an applied magnetic field. The clusters redistribute under gravity by sedimentation in the tube, with concentration detected by a Colpitts oscillator circuit. The clustering is pronounced for present-day water-base ferrofluids and nearly absent for many other compositions such as well-stabilized dispersions in diester or hydrocarbon carrier fluid. Clustering, when it occurs, is reversible with removal of the field, thermal agitation being effective in redispersing the agglomerates. The technique should be useful in evaluating stability of new ferrofluid compositions.
D. Magnetization Kinetics
A ferrofluid may be defined as superparamagnetic if its magnetization obeys the Langevin magnetization law, Eq. (11). This superparamagnetic behavior may have two origins.
1. Intrinsic Superparamagnetism of the Grains

This corresponds to reversal of the magnetic moment within the grain, there being no mechanical rotation of the grain itself. This relaxation mechanism for sufficiently small subdomain particles was first pointed out by Néel. Reversal of magnetization is possible by surmounting an energy barrier KV between different directions of easy magnetization relative to the crystalline axes of the grain material; K is the crystalline anisotropy constant and V the volume of a grain. The relaxation time τ_N is given by

1/τ_N = f₀ exp(−KV/kT)    (19)

where f₀ is a characteristic frequency of order 10⁹ Hz. The transition between ferro- and superparamagnetism occurs near KV/kT = 20:

KV/kT > 20,  τ_N → ∞  (ferromagnetism)    (20)

At a constant temperature, around the value KV/kT of 20, τ_N varies by a factor of 10⁹ for a variation of the volume by a factor of 2. The critical diameter is in the range of the actual grain diameter (10 nm) of ferrofluids. τ_N for an oleic acid stabilized ferrofluid in kerosene carrier is given as τ_N ≈ 10⁻⁹ sec by Martinet (1977, 1978). Brown (1963a,b) obtained a theoretical result relating the frequency f₀ to the precessional decay process that accompanies return of the magnetic moment of a particle to the equilibrium orientation following an initial perturbation.
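The extreme volume sensitivity of Eq. (19) can be demonstrated directly. A minimal sketch, assuming the f₀ = 10⁹ Hz characteristic frequency cited above:

```python
import math


def neel_time(kv_over_kt, f0=1e9):
    """Neel relaxation time from Eq. (19): tau_N = (1/f0) * exp(KV/kT)."""
    return math.exp(kv_over_kt) / f0


tau_20 = neel_time(20.0)                  # grain right at the transition value
tau_40 = neel_time(40.0)                  # same grain with twice the volume
volume_doubling_ratio = tau_40 / tau_20   # e^20, roughly 5e8
```

Doubling the volume near KV/kT = 20 changes τ_N by e²⁰, i.e., nearly nine orders of magnitude, which is why the ferro/superparamagnetic boundary in particle size is so sharp.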
2. Superparamagnetism Induced by Brownian Motion

Since the particles are suspended in a liquid, they are free to rotate, and that offers an additional mechanism for reversing the orientation of their magnetic moment. The Brownian rotational relaxation time τ_B is of hydrodynamic origin (Frenkel, 1955):

τ_B = 3Vη₀/kT    (21)
where η₀ is the liquid carrier viscosity. The main mechanism for varying τ_B is change of η₀. For kerosene or aqueous base ferrofluids, values calculated from Eq. (21) give τ_B ≈ 10⁻⁶ sec. The rate of Brownian rotational relaxation when a field is present is determined by solutions of the Fokker–Planck equation (Martsenyuk et al., 1974).
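Equation (21) gives the order of magnitude directly. The sketch below assumes illustrative carrier viscosities (kerosene ≈ 1.6 × 10⁻³ Pa·s, water ≈ 1.0 × 10⁻³ Pa·s) and a bare 10-nm diameter; including the surfactant coating, the hydrodynamic diameter and hence τ_B are somewhat larger.

```python
import math


def brownian_time(d, eta0, T=298.0):
    """Brownian rotational relaxation time, Eq. (21): tau_B = 3*V*eta0/kT,
    for a sphere of hydrodynamic diameter d (m) in a carrier of viscosity
    eta0 (Pa*s)."""
    k = 1.38e-23
    V = math.pi * d ** 3 / 6.0
    return 3.0 * V * eta0 / (k * T)


tau_kerosene = brownian_time(10e-9, 1.6e-3)  # kerosene-like carrier
tau_water = brownian_time(10e-9, 1.0e-3)     # aqueous carrier
```

Both results fall in the sub-microsecond range, several orders slower than the Néel time quoted above for low-anisotropy grains, so in a fluid carrier the faster of the two mechanisms dominates.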
Shliomis (1974) reviews the mechanisms for relaxation of magnetization. He also develops numerical estimates of relaxation for iron particles in a ferrofluid. Above 8.5 nm the orientation of the magnetic moment is controlled by Brownian rotation of the particles; for smaller iron particles the chief relaxation process is the Néel mechanism.

3. Experimental Measurements

Martinet (1977, 1978) experimentally investigated the lag angle between the magnetization and the field. In a rotating magnetic field this influence appears as a perpendicular component of the susceptibility. Measurements were facilitated by rotating the fluid sample (100 to 700 Hz) in a static magnetic field. In other tests the carrier is polymerized to a solid that mechanically traps the grains (a styrene–divinylbenzene mixture replacing the kerosene carrier, polyvinyl alcohol substituted for the water carrier). For polymerized samples τ_B → ∞, so that τ_B ≫ τ_N. A polymerized sample containing 6.5-nm cobalt particles possessing a high crystalline anisotropy (KV/kT ≈ 15) furnished a reference sample in which the magnetic moment could not fluctuate spontaneously. Experimental ratios of M⊥/M∥ followed the theoretical trend: invariance with angular rate Ω and decreasing ratio with increase of field intensity H. A sample polymerized from kerosene parent fluid, having a magnetic moment that could easily fluctuate inside the grain (KV/kT = 0.7), gave a result in support of the theory: M⊥/M∥ at Ω = 10³ Hz was too small to observe, whereas the sample of cobalt particles in fluid carrier gave ratios of M⊥/M∥ in excess of 8 × 10⁻².
Mössbauer spectroscopy was utilized by Winkler et al. (1976) to distinguish between diffusional rotational relaxation and collisional relaxation due to particle-particle impacts; spectral data are given for a diester-base ferrofluid. Mössbauer investigations reported by McNab et al. (1968) for a ferrofluid composed of Fe₃O₄ in kerosene carrier give the values [see Eq. (19)]

K = (6.0 ± 1.0) × 10³ N/m²    (energy barrier)
1/f₀ = τ₀ = (9.5 ± 1.5) × 10⁻¹¹ sec    (frequency factor)
in agreement with the order-of-magnitude calculations of Néel and Brown. Water-base ferrofluids normally contain particles of larger size than the particles in organic carriers; a commercially available water-base ferrofluid gave a distribution with geometric mean diameter 10.8 nm and volume-weighted mean diameter 16.5 nm (Keller and Kündig, 1975). This reconciles with the statement of Sharma and Waldner (1977) that water-base ferrofluids give no (intrinsic) superparamagnetism in Mössbauer experiments.
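The Mössbauer-derived anisotropy constant reconciles with the superparamagnetism criterion of Section D.1. A quick check, assuming a 10-nm grain:

```python
import math

K_ANIS = 6.0e3        # anisotropy constant from the Mossbauer fit, N/m^2 (= J/m^3)
K_B, T = 1.38e-23, 298.0

d = 10e-9                             # typical grain diameter, m
V = math.pi * d ** 3 / 6.0            # grain volume, m^3
barrier = K_ANIS * V / (K_B * T)      # KV/kT, dimensionless
# This is far below the transition value of about 20, so the kerosene-base
# grains relax by the Neel mechanism (intrinsic superparamagnetism).
```

The result is consistent with the KV/kT = 0.7 figure quoted for the kerosene parent fluid in the Martinet experiments.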
Bogardus et al. (1975) describe a pulsed field magnetometer to measure the magnetic moment of a ferrofluid as a function of time. After the applied field is removed from a particular water-base ferrofluid, the magnetization is characterized by a fast decay (<1 μsec) and a gradual decay on the order of 4 msec [see Eq. (21)]. The fast component is attributed to intrinsic processes within the particle, and the slow part to particle rotation. The mechanisms and behavior of the magnetic relaxation processes are important to no-moving-part pumps (Moskowitz and Rosensweig, 1967) and gyroscopes (Miskolczy et al., 1970), as well as to topics of magnetoviscosity and flows having internal rotation.
E. Viscosity

It is axiomatic that a ferrofluid is a material having concomitant magnetic and fluid properties. A ferrofluid retains its flowability in the presence of a magnetic field even when magnetized to saturation. Nonetheless, the rheology is affected by the presence of the field. The following sections discuss the viscosity of ferrofluid first in the absence, then in the presence, of an applied magnetic field.
1. No External Field

This situation is the same as for nonmagnetic colloids of solid particles suspended in a liquid (Rosensweig et al., 1965). Thus, theoretical models are available for determining the viscosity, the earliest being that of Einstein (1906, 1911), derived from the flow field of pure strain perturbed by the presence of a sphere. The resulting formula relates the mixture viscosity η_s to the carrier fluid viscosity η₀ and solids fraction φ (assuming for the moment that the particles are bare of coatings):

η_s/η₀ = 1 + (5/2)φ    (22)
This relationship is valid only for small concentrations. For higher concentrations a two-constant expression may be assumed: rls/'to =
1/(1 + a 4 + M 2 )
(23)
It is insisted that this expression reduce to Eq. (22) for small values of 4 and this determines a = - 5. At a concentration 4cthe suspension becomes effectively rigid and so the ratio qo /qs goes to zero. This determines the second constant as b = ($& - l)/&. A value of +c = 0.74 corresponds to close packing of spheres. Uncoated spherical particles of radius r, when present in a ferrofluid at volume fraction 4, will when coated with a uniform layer of dispersing agent having thickness 6, occupy a fractional volume in the fluid
of φ(1 + δ/r)³. Combining these relationships gives

(η_s − η₀)/φη_s = (5/2)(1 + δ/r)³ − [((5/2)φ_c − 1)/φ_c²](1 + δ/r)⁶φ    (24)

A plot of measured values of (η_s − η₀)/φη_s versus φ yields a straight line (see Fig. 4). Values of δ/r determined from the intercept at φ = 0 and from the slope using φ_c = 0.74 are in good agreement, yielding δ/r = 0.84. With δ = 2 nm the particle diameter found in this manner is 4.8 nm. This is less than the mean size determined from an electron microscope count. This directional variance is expected due to the presence of a particle size distribution; small particles with their sorbed coatings tie up a disproportionate share of the total volume in the dispersion.
FIG. 4. Reduced experimental viscosity data for oleic acid stabilized ferrite dispersions. (After Rosensweig et al., 1965.)
When the suspended particles are nonspherical, theory predicts an increase in the coefficient of φ in Eq. (22); due to Brownian rotation a larger volume is swept out by a particle of given size. As an example, an axial ratio of 5 increases the coefficient from 2.5 to 6.0 (Kruyt, 1952). From the above considerations it follows that highly concentrated (high saturation moment) ferrofluids of greatest possible fluidity are favored by small coating thickness δ, large particle radius r, and spherical particle shape. These desired trends for δ and r are opposite to the conditions favoring stabilization as a colloid, so in any actual ferrofluid compromises must be made using intermediate values of these parameters.
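In present-day terms the concentration dependence of Eqs. (22)–(24) is easy to evaluate numerically. The following Python sketch is illustrative only and is not part of the original treatment; the values δ/r = 0.84 and φ_c = 0.74 are those quoted in the text above.

```python
def viscosity_ratio(phi, delta_over_r=0.84, phi_c=0.74):
    """Mixture-to-carrier viscosity ratio eta_s/eta_0 from the two-constant
    form, Eq. (23), with the coated-particle volume fraction
    phi*(1 + delta/r)**3 substituted for the bare solids fraction phi."""
    phi_eff = phi * (1.0 + delta_over_r) ** 3   # coated-particle fraction
    a = -2.5                                    # fixed by the Einstein limit, Eq. (22)
    b = (2.5 * phi_c - 1.0) / phi_c ** 2        # fixed by rigidity at phi = phi_c
    return 1.0 / (1.0 + a * phi_eff + b * phi_eff ** 2)
```

The ratio diverges as the coated-particle fraction approaches φ_c, reproducing the effective rigidity of a close-packed suspension.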
2. External Field Present

When magnetic field is applied to a sample of magnetic fluid subjected to shear deformation, the magnetic particles in the fluid tend to remain rigidly aligned with the direction of the orienting field. As a result larger gradients
in the velocity field surrounding a particle are to be expected than if the particle were not present, and dissipation increases in the sample as a whole. Rosensweig et al. (1969) measured the effect of vertically oriented magnetic field on the viscosity of thin horizontal layers of ferrofluid subjected to uniform shear in a horizontal plane. Dimensional reasoning leads to the hypothesis that

η_H/η_s = f(γ̇η₀/μ₀MH)

where η_H is viscosity in presence of the field, η_s the viscosity of the ferrofluid in absence of field, γ̇ the shear rate, η₀ the carrier fluid viscosity, M the ferrofluid's magnetization, and H the applied field. The data, shown in Fig. 5, roughly define a single curve with the following ranges in the torque modulus τ = γ̇η₀/μ₀MH:

Small τ:          relative viscosity constant at its maximum value    (26)
Intermediate τ:   viscosity is magnetic field and shear rate dependent    (27)
Large τ:          viscosity is constant at its field-free value    (28)
FIG. 5. Correlation of relative viscosity of ferrofluids with torque modulus γ̇η₀/μ₀MH. (After Rosensweig et al., 1969.)
A theoretical treatment from first principles giving the viscosity of dilute suspensions of single-domain spheres and accounting for Brownian rotational motion was developed by Shliomis (1972) and represents an advance over the earlier treatment of Hall and Busenberg (1969). When the fluid vorticity (or fluid rotation Ω = ½∇ × u) and the magnetic field are parallel, the particles can rotate freely and magnetism exerts no influence on the viscosity, which then is given by the Einstein relationship. When the directions are perpendicular to each other, the magnetic contribution to the
viscosity is greatest and is given by the formula

η_H = η₀[1 + (5/2)φ + (3/2)φ(α − tanh α)/(α + tanh α)]    (29)

where α is the argument of the Langevin function (α = μ₀mH/kT). For small values of α, η_H = η₀(1 + (5/2)φ), which again is the Einstein relationship; for large values of α the value of η_H reaches its greatest value, η_H = η₀(1 + 4φ). Thus the Einstein contribution is augmented by a magnetoviscous contribution of 3φ/2. Physically the particles are pinned by the field and prevented from rotating. Measurements of viscosity in flow through a capillary tube directionally confirm the prediction of theory that orientation of field perpendicular to the vorticity vector yields a larger value of viscosity than for the parallel orientation (McTague, 1969). Taking η_s/η₀ = 1 + (5/2)φ, the ratio η_H/η_s becomes 1.2 for a reference value of φ = 0.2. This is less than the value of approximately 4 derived from Fig. 5. The higher ratios are supported by the data of Călugăru et al. (1976), where values of η_H/η_s greater than 2 were measured with no apparent saturation. The data of Fig. 5 are shear rate dependent, and in all cases the product of shear rate and rotational relaxation time is small, while Eq. (29) predicts no dependence on shear rate in this range. At this time the relationship of the transition zone in Fig. 5 to the theoretical results is unclear. Since the particles of a ferrofluid may possess an angular momentum that differs from that of the surrounding fluid, it may be expected that unusual new modes of response exist. Yakushin (1974) analyzes motion and magnetization of ferrofluid in response to oscillatory motion of a plane wall in its own (x-y) plane. Using the formalism of Shliomis it is found that a z component of magnetization appears at twice the frequency of the oscillating plane.
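As a numerical illustration (again a modern sketch, not part of the original treatment), Eq. (29) can be evaluated directly; α = μ₀mH/kT and φ is the hydrodynamic volume fraction.

```python
import math

def eta_H(alpha, phi, eta0=1.0):
    """Viscosity with field perpendicular to vorticity, Eq. (29) (Shliomis, 1972).
    Recovers the Einstein value eta0*(1 + 2.5*phi) as alpha -> 0 and the
    pinned-particle limit eta0*(1 + 4*phi) as alpha -> infinity."""
    if alpha == 0.0:
        rot = 0.0                        # no magnetoviscous contribution
    else:
        t = math.tanh(alpha)
        rot = (alpha - t) / (alpha + t)  # Brownian rotation factor
    return eta0 * (1.0 + 2.5 * phi + 1.5 * phi * rot)
```

At φ = 0.2 the high-field to zero-field ratio is 1.8/1.5 = 1.2, the value quoted in the text.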
F. Tabulated Data and Other Properties
Table I lists fluid and thermal physical properties for numerous types and concentrations of commercially available magnetic fluids. The magnetic saturation may be proportionately varied by dilution with the carrier fluid. Diester-base fluid has low vapor pressure, hence may be exposed to the environment for long periods of time at normal temperatures with negligible evaporation. The hydrocarbon-base fluids have electrical resistivity of 10⁶ Ω·m at 60 Hz and relative dielectric constant of 20 at 1 kHz. Sonic velocity in a hydrocarbon-base ferrofluid was 1.201 × 10³ m/sec compared to 1.275 × 10³ m/sec in the carrier liquid (Chung and Isler, 1977). A fluoroalkylpolyether constitutes the base carrier of the fluorocarbon fluids. The ester fluids are based on silicate esters and hence are susceptible to hydrolytic decomposition. They provide fluidity to low temperatures. The carrier fluid polyphenyl ether has a radiation resistance in excess of 10⁸ rad.
TABLE I
NOMINAL PROPERTIES OF FERROFLUIDS (298 K)^a

Carrier          Magnetic    Density   Viscosity^b  Pour      Boiling  Surface  Thermal    Specific   Thermal
fluid            saturation  (kg/m³)   (N·sec/m²)   point^c   point^d  tension  conduct.   heat       expansion^e
                 (A/m)                              (K)       (K)      (mN/m)   (W/m·K)    (kJ/m³·K)  (m³/m³·K)

Diester           15,900      1185      0.075        236       422      28       0.15       1715       9.0 × 10⁻⁴
Hydrocarbon       15,900      1050      0.003        278       350      28       0.15       1840       8.6 × 10⁻⁴
Hydrocarbon       31,800      1250      0.006        281       350      —        —          —          —
Fluorocarbon       7,960      2050      2.50         239       456      18       0.20       1966      10.6 × 10⁻⁴
Ester             15,900      1150      0.014        217       422      26       0.31       3724       8.1 × 10⁻⁴
Ester             31,800      1300      0.030        217       422      26       0.31       3724       8.1 × 10⁻⁴
Ester             47,700      1400      0.035        211       422      21       0.31       3724       8.1 × 10⁻⁴
Water             15,900      1180      0.007        273^f     299^g    26       1.40       4184       5.2 × 10⁻⁴
Water             31,800      1380      0.010        273^f     299^g    26       1.40       4184       5.0 × 10⁻⁴
Polyphenyl ether   7,960      2050      7.50         283       533      —        —          —          —

^a Adapted from Technology Handbook/Ferrofluids Catalog, Ferrofluidics Corporation, Burlington, Massachusetts.
^b Measured in absence of magnetic field at shear rate > 10 sec⁻¹.
^c Viscosity 100 N·sec/m².
^d Under pressure of 133 Pa (1 torr).
^e Average over range 298 K to 367 K.
^f Freezing point.
^g 3.2 kPa.
The pH of water-base ferrofluids may be adjusted over a range of acidic and alkaline values. Electrical conductivity of a particular sample having saturation magnetization of 16,000 A/m was about constant at 0.2 S/m at ac frequencies (Kaplan and Jacobson, 1976). The same investigators report some new magnetoelectric effects on capacitance. The origin of osmotic pressure appears to be a controversial subject. Scholander and Perez (1971) measure osmotic pressure of a water-base ferrofluid and discuss the results in terms of alternate theories. Any of the ferrofluids in Table I may be freeze-thawed without damage. Goldberg et al. (1971) report the polarization of light by magnetic fluid. A number of investigators have subsequently published studies of optical properties. Hayes (1975) relates transmission and scattering of light to particle clustering.

II. FLUID DYNAMICS OF MAGNETIC FLUIDS
The fluid dynamics of magnetic fluids differs from that of ordinary fluids in that stresses of magnetic origin appear and, unlike in magnetohydrodynamics, there need be no electrical currents (Neuringer and Rosensweig, 1964). While theoretical expressions are known for the forces acting between isolated sources of electromagnetic field, there is no universal law describing the magnetic stress set up within a magnetized medium. However, satisfactory relationships may be derived using the principle of energy conservation, taking into account the storage of energy in the magnetostatic field. Relationships for stress obtained in this manner are found to depend on detailed characteristics of the material, particularly the dependence of magnetization on state variables.

A. Magnetic Stress Tensor and Body Force
Cowley and Rosensweig (1967) derived the following expression for the stress tensor of a magnetic fluid having arbitrary single-valued dependence of magnetization on magnetic field, under the condition that the local magnetization vector is collinear with the local field vector in any volume element:

T_ij = −{p(ρ, T) + μ₀∫₀ᴴ (∂(Mv)/∂v)_{H,T} dH + (μ₀/2)H²}δ_ij + H_iB_j    (30)

where in Cartesian coordinates T_ij is the j component of the vectorial force per unit area (traction) on an infinitesimal surface whose normal is oriented in the i direction. The Kronecker delta δ_ij is unity when the subscripts are equal and vanishes when they are not equal. B_j and H_i are components of the magnetic field of induction B and the magnetic field intensity H, respectively.
In the SI metric system the units of B are Wb/m² or T, and H is given in A/m. The constant μ₀ has the value 4π × 10⁻⁷ H/m. Specific volume v has units of m³/kg. Since instruments for measuring magnetic field intensity are called gaussmeters, and probably will remain so for a long time, it is useful to remember that one tesla (T) is equivalent to 10⁴ G. Another derivation of the stress tensor leading to the result of Eq. (30) is developed by Penfield and Haus (1967) based on the principle of virtual power. The magnetization M describes the polarization of the fluid medium and is related to B and H through the defining equation
B = μ₀(H + M)    (31)

For a ferromagnetic fluid without hysteresis the magnitude of M, denoted by M, has the properties

M = 0    at H = 0    (32)

dM/dH ≥ 0    (33)

M → M_s    as H → ∞    (34)

where M_s is the saturation magnetization. As a consequence of collinearity,

B = μH    (35)

M = χH    (36)

with the permeability μ and the susceptibility χ representing scalar quantities generally dependent on H and v. μ and χ are related to each other from their definitions,

χ = (μ/μ₀) − 1    (37)
At a given point H_iB_j = μH_iH_j = H_jB_i; hence T_ij is symmetric, so the fluid medium is free of body torque. According to its definition the magnetic stress tensor T gives the total magnetic force F_m on a volume V₁ of magnetic fluid as expressed by the following surface integral:

F_m = ∮_S T·n dS    (38)

where the surface S encloses V₁ and n denotes the unit normal vector facing outward from the volume. In formulating solutions to given problems it is sometimes most convenient to evaluate stress over an enveloping surface, as indicated by Eq. (38), for example, if the field is known everywhere at the surface or has a simpler expression there. Alternatively, using Gauss's divergence
theorem, the surface integral in Eq. (38) may be transformed to a volume integral,

F_m = ∫∫∫_{V₁} ∇·T dV    (39)

where ∇·T = f_m appears as the magnetic body force density having components

f_i = ∂T_ij/∂x_j    (40)

Thus from Eq. (30),
f_i = −(∂/∂x_i){p + μ₀∫₀ᴴ (∂(Mv)/∂v)_{H,T} dH + (μ₀/2)H²} + ∂(H_iB_j)/∂x_j    (41)

The last term in the above may be expanded as

∂(H_iB_j)/∂x_j = B_j(∂H_i/∂x_j) + H_i(∂B_j/∂x_j) = B_j(∂H_i/∂x_j)    (42)

where the Maxwell relationship ∇·B = 0 was used. Collecting the components of B_j(∂H_i/∂x_j) into vectorial form gives the term (B·∇)H. Since the field vectors are collinear by assumption,

(B·∇)H = (B/H)(H·∇)H    (43)

Using the vector identity

(H·∇)H = ½∇(H·H) − H × (∇ × H)    (44)

with ∇ × H = 0 permits (B·∇)H to be expressed in terms of the vector magnitudes:

(B·∇)H = (B/2H)∇(H·H) = B∇H    (45)

Combining −∇(μ₀H²/2) with B∇H yields μ₀M∇H, so that the magnetic part of the force (the pressure gradient is accounted for separately below) may be written as

f_m = −∇{μ₀∫₀ᴴ (∂(Mv)/∂v)_{H,T} dH} + μ₀M∇H    (46)
Newton's law of motion applied to an infinitesimal element of the magnetic fluid gives

ρ Dq/Dt = f_p + f_g + f_v + f_m    (47)

where q is the vector velocity of a fluid element and D/Dt = ∂/∂t + q·∇ the
substantial derivative following the fluid motion. The right side of Eq. (47) is the sum of the body forces acting upon a unit volume. The terms familiar from fluid mechanics are

f_p = pressure gradient force = −∇p(ρ, T)    (48)

f_g = gravity force = −∇ψ,  with ψ = ρgh    (49)

f_v = viscous force = η∇²q  (viscosity assumed isotropic for simplicity)    (50)

where p(ρ, T) is the thermodynamic pressure and the magnetic force density f_m was given previously. Substituting the force expressions into Eq. (47) gives the equation of motion

ρ ∂q/∂t + ρ(q·∇)q = −∇p* + μ₀M∇H − ∇(ρgh) + η∇²q    (51)
where p* is defined as

p* = p(ρ, T) + μ₀∫₀ᴴ (∂(Mv)/∂v)_{H,T} dH    (52)
An isolated volume element δV with magnetization M subjected to an applied field H₀ experiences the magnetic force μ₀(M·∇)H₀ (see Fig. 6). For
FIG. 6. A small cylindrical volume of magnetically polarized substance with geometric axis δs aligned with the magnetization vector M. Poles of density σ = μ₀M appear in equal number and opposite polarity on the ends of area a_d. Field H₀ may be taken as the force on a unit pole; hence the force experienced by the volume element is

δF = −H₀σa_d + (H₀ + δH₀)σa_d = σa_d δH₀

where δH₀ is the change of H₀ along the direction of δs. Thus δH₀ = (δs·∇)H₀ = (δs/M)(M·∇)H₀, and the differential force becomes δF = μ₀(M·∇)H₀ a_d δs. Thus the force per unit volume, δF/a_d δs, becomes

force/volume = μ₀(M·∇)H₀

The volume of the element is a_d δs and the dipole moment is σa_d δs = μ₀Ma_d δs, so that μ₀M represents the vector moment per unit volume.
soft magnetic material M is collinear with H₀, and by the same argument that led to Eq. (45), the volumetric force density is expressible in terms of field magnitudes as μ₀M∇H₀. The resemblance between this expression and the term μ₀M∇H in Eq. (46) motivated the expressing of Eq. (46) in that form. However, it is noted that the external applied field H₀ rather than the local field H enters into the force expression for the isolated volume element. For a whole body the summation of forces produced on the body by itself must vanish, so that ∫∫∫_{V₁} μ₀M∇H dV₁ must give the same total force as ∫∫∫_{V₁} μ₀M∇H₀ dV₁ when the integration is carried out over the whole volume of the body. These results are consistent with the force on the whole body obtained by integrating Eq. (46).
∫∫∫_{V₁} {−∇(μ₀∫₀ᴴ (∂(Mv)/∂v)_H dH)} dV₁ = −∮_S {μ₀∫₀ᴴ (∂(Mv)/∂v)_H dH} n dS = 0    (54)
Here the surface S enclosing volume V₁ is taken just outside the body in a surrounding nonmagnetic medium.

B. Alternate Forms
There is arbitrariness in the grouping of magnetic terms in Eqs. (46) and (51) that relates to alternate expressions for stress and force seen in the literature. Thus

f_m = −∇{μ₀∫₀ᴴ (∂(Mv)/∂v)_H dH} + μ₀M∇H
    = −∇{μ₀∫₀ᴴ (v ∂M/∂v)_H dH} − ∇{μ₀∫₀ᴴ M dH} + μ₀M∇H    (55)
    = −∇{μ₀∫₀ᴴ (v ∂M/∂v)_H dH} − μ₀∫₀ᴴ (∇M)_H dH    (56)

where ∇M is evaluated at constant H. The term μ₀∫₀ᴴ v(∂M/∂v) dH represents magnetostriction. The magnetostriction term may be omitted in problems of incompressible flow with no effect on the results. For uniform
magnetic fluid μ₀∫₀ᴴ ∇M dH = 0 within the fluid region; this integral takes on a finite value only at an interface. From this perspective the magnetic forces originate only at interfaces. The general force density of Cowley and Rosensweig reduces to the expression of Korteweg and Helmholtz, valid for linearly magnetizable media, in the following manner:

f_m = −(H²/2)∇μ + ∇{(H²/2)ρ(∂μ/∂ρ)}    (57)

In the above, use was made of B = μ₀(H + M) = μH, so that M = [(μ/μ₀) − 1]H, and the permeability μ is assumed to depend only on ρ and not on H. Collecting terms establishes the desired identity.
Another modification of force density found in the literature corresponds to adding and subtracting μ₀H∇H = ∇{μ₀∫₀ᴴ H dH} in the expression for magnetic force of Eq. (56). Neglecting striction in the second equality of Eq. (56), this procedure gives

f_m = −∇{μ₀∫₀ᴴ M dH} + μ₀M∇H = −∇{∫₀ᴴ B dH} + B∇H    (58)

Formulation of magnetic force in terms of the coenergy w′,

w′ = ∫₀^{H²} (μ/2) d(H′²),    μ = μ(α₁, …, α_n, H′²)    (59)
by Zelazo and Melcher (1969) generalizes the magnetic treatment to account for spatial variation of properties within the magnetic fluid region. The αᵢ's represent intensive properties of the fluid such as temperature and composition. Table II summarizes alternate expressions for stress and body force density in magnetic fluids. Similar to striction, magnetization force density terms which take the form of the gradient of a pressure have no influence on the hydrostatics or hydrodynamics of incompressible magnetic liquids. Byrne (1977) gives a survey of the force densities indicating relationships to early literature dealing with magnetic stresses in solids.

C. Generalized Bernoulli Equation

For inviscid flows the equation of motion (51) may be rewritten with the aid of vector identities as

ρ ∂q/∂t − ρq × ω = −grad{p* + ρq²/2 + ρgh − μ₀∫₀ᴴ M dH}    (60)
where ω = ∇ × q is the vorticity. For irrotational flow ω = 0 there will exist a velocity potential Φ such that

q = −grad Φ    (61)

Then if grad T = 0 or ∂M/∂T = 0, there is obtained as the integral of the equation of motion the generalized Bernoulli equation which follows (Neuringer and Rosensweig, 1964):

p* + ρq²/2 + ρgh − μ₀M̄H − ρ ∂Φ/∂t = g(t)    (62)

M̄ denotes the field-averaged magnetization defined by

M̄ = (1/H)∫₀ᴴ M dH    (63)

Asymptotic values of M̄ may be found from Eqs. (34) and (63):

M̄ → χᵢH/2    for small H    (64a)

M̄ → M_s    for large H    (64b)
χᵢ is the initial susceptibility, (∂M/∂H)₀. For time-invariant flow ∂Φ/∂t = 0 and g(t) = const, so the generalized Bernoulli equation reduces to

p* + ρq²/2 + ρgh − μ₀M̄H = const    (65)

In the absence of an applied field p* = p and the term proportional to M̄ disappears. With one or another term absent the remaining terms provide
TABLE II
MUTUALLY CONSISTENT STRESS TENSORS AND ASSOCIATED FORCE DENSITIES FOR ISOTROPIC MAGNETIC FLUIDS

1. Compressible nonlinear media [Cowley and Rosensweig (1967); Penfield and Haus (1967)]
   Stress tensor:  T_ij = −{μ₀∫₀ᴴ (∂(Mv)/∂v)_H dH + (μ₀/2)H²}δ_ij + H_iB_j
   Force density:  f_m = −∇{μ₀∫₀ᴴ (∂(Mv)/∂v)_H dH} + μ₀M∇H
   Assumptions:    ∇ × H = 0, M ∥ H

2. Incompressible nonlinear media
   Stress tensor:  T_ij = −{μ₀∫₀ᴴ M dH + (μ₀/2)H²}δ_ij + H_iB_j
   Force density:  f_m = −∇{μ₀∫₀ᴴ M dH} + μ₀M∇H
   Assumptions:    ∇ × H = 0, M ∥ H

3. Incompressible nonlinear media [Chu (1959)]
   Stress tensor:  T_ij = −{∫₀ᴴ B dH}δ_ij + H_iB_j
   Force density:  f_m = −∇{∫₀ᴴ B dH} + B∇H = −∫₀ᴴ (∇B)_H dH
   Assumptions:    ∇ × H = 0, M ∥ H

4. Incompressible nonlinear media [Zelazo and Melcher (1969)]
   Stress tensor:  T_ij = −w′δ_ij + μH_iH_j,  w′ = ∫₀^{H²} (μ/2) d(H′²)
   Force density:  f_m = −Σₖ (∂w′/∂αₖ)∇αₖ
   Assumptions:    ∇ × H = 0, B = μH, μ = μ(α₁, …, α_n, H²)

5. Compressible linear media [Korteweg and Helmholtz (see Melcher, 1963)]
   Stress tensor:  T_ij = −(μ/2){1 − (ρ/μ)(∂μ/∂ρ)}H²δ_ij + μH_iH_j
   Force density:  f_m = −(H²/2)∇μ + ∇{(H²/2)ρ(∂μ/∂ρ)}
   Assumptions:    ∇ × H = 0, B = μH, μ = μ(ρ) in the fluid

6. Incompressible linear media [Maxwell]
   Stress tensor:  T_ij = −(μ/2)H²δ_ij + μH_iH_j
   Force density:  f_m = −(H²/2)∇μ
   Assumptions:    ∇ × H = 0, B = μH, μ = constant in the fluid

7. Vacuum stresses
   Stress tensor:  T_ij = −(μ₀/2)H²δ_ij + μ₀H_iH_j
   Force density:  f_m = 0
   Assumptions:    M = 0
several important examples from ordinary fluid mechanics. With h = const, the remaining relationship between pressure and velocity describes the operation of the Venturi meter, Pitot tubes, and the pressure at the edge of a boundary layer. In hydrostatics with q = 0 the pressure term combined with the gravity term describes, for example, the pressure distribution in a tank of liquid, while the gravity term combined with the term containing speed yields an expression for the efflux rate of material from a hole in the tank. In similar manner, combination of the "fluid magnetic pressure" p_m = μ₀M̄H with each of the remaining terms produces additional classes of fluid phenomena. As an additional feature of magnetic fluid flow that must be considered in concert, the existence of jump boundary conditions is crucial, and this topic is developed next. Thus from Eq. (30) the traction on a surface element with unit normal n is

T·n = −{p + μ₀∫₀ᴴ (∂(Mv)/∂v)_{H,T} dH + (μ₀/2)H²}n + H_nB    (66)

The difference of this magnetic stress across an interface between media is a force oriented along the normal which may be expressed as follows:
[T·n] = −[T_nn]n = {μ₀∫₀ᴴ (∂(Mv)/∂v)_H dH + (μ₀/2)M_n²}n    (67)
The square brackets denote the difference of the quantity across the interface, and subscript n denotes the normal direction. It is noted that the argument of the bracketed quantity vanishes in a nonmagnetic medium. In deriving this relationship use is made of the magnetic field boundary conditions [B_n] = 0 and [H_t] = 0, where subscript t denotes the tangential direction. When the contacting media are both fluids, the stress difference from Eq. (67) may be balanced by actual thermodynamic pressures p(ρ, T), giving the following result when one medium is nonmagnetic:

p* = p₀ − (μ₀/2)M_n²    (68)

p₀ is the pressure in the nonmagnetic fluid medium and p* was defined previously. While it is familiar to require continuity of pressure across a plane fluid boundary when considering ordinary fluids, this is no longer the case with fluids possessing magnetization. Instead, it is seen from Eq. (68) that magnetic stress at the interface produces a traction (μ₀/2)M_n².

D. Summary of Inviscid Relationships
The following equations represent a consistent set of governing relationships for the inviscid flow of magnetic fluids.
Stress tensor:

T_ij = −{p + μ₀∫₀ᴴ (∂(Mv)/∂v)_{H,T} dH + (μ₀/2)H²}δ_ij + H_iB_j

Force density:

f_m = −∇{μ₀∫₀ᴴ (∂(Mv)/∂v)_{H,T} dH} + μ₀M∇H

Bernoulli equation (incompressible, steady flow):

p* + ρq²/2 + ρgh − μ₀M̄H = const

Boundary condition:

p* = p₀ − (μ₀/2)M_n²

Definitions:

p* = p + μ₀∫₀ᴴ (∂(Mv)/∂v)_{H,T} dH,    M̄ = (1/H)∫₀ᴴ M dH

These relationships are applied to several basic responses of the magnetic fluids as detailed in the following section.

E. Basic Flows

1. The Conical Meniscus
Neuringer and Rosensweig (1964) conducted experiments and performed analysis of a vertical current-carrying wire which emerges from a pool of magnetic liquid. In response to the magnetic field, the liquid rises up in a symmetric conical meniscus around the wire (see Fig. 7). The steady current I produces an azimuthal field with magnitude

H = I/2πr_c    (69)

where r_c is radial distance. At the free surface M_n = 0, so the boundary condition (68) reduces to p* = p₀. With q = 0 the constant of the Bernoulli equation (65) is evaluated at h(∞), where H = 0, giving p₀ + ρgh(∞) = const. Then, evaluating terms of the Bernoulli equation at a surface point where the field is finite gives, with minor rearrangement,

Δh = h − h(∞) = μ₀M̄H/ρg
(70)
FIG. 7. Sketch (a) and photograph (b) of the free surface surrounding a current-carrying rod. (After Neuringer and Rosensweig, 1964.)
Then from (64a) and (69) for small applied fields,

Δh = μ₀χᵢI²/8π²r_c²ρg    (71)

while from (64b) and (69) for saturated fluids,

Δh = μ₀M_sI/2πr_cρg    (72)

the latter representing a hyperbolic cross section. Krueger and Jones (1974) calculated the surface shape and found good agreement with experiment assuming the Langevin theory of superparamagnetism and a realistic distribution of particle sizes. Next the problem is solved using the stress tensor of Chu,

T_ij = −{∫₀ᴴ B dH}δ_ij + H_iB_j    (73)

to illustrate the use of a different formulation from Table II.
The force density corresponding to the stress tensor of (73) is

f_m = −∇{∫₀ᴴ B dH} + B∇H = −∫₀ᴴ (∇B)_H dH

If evaluation of f_m is restricted to points within the magnetic fluid and the
magnetic fluid has the same composition at all points, then (∇B)_H = 0 and the body force ascribed to the fluid by the Chu formulation disappears:

f_m = 0    (74)

Therefore the Bernoulli equation consistent with this formulation has no magnetic term and is the same as for a nonmagnetic fluid. The boundary condition to be satisfied at the free surface must be worked out anew, consistent with (73). Denoting "normal" by subscript n, "tangential" by subscript t, "liquid" with l, and "vapor" with v, (73) gives the following stress elements:

Liquid side:

T_nt = 0    since δ_nt = 0 and H_n = 0

T_l = T_nn = −∫₀ᴴ B_l dH    since H_n = 0

Vapor side:

T_nt = 0

T_v = T_nn = −∫₀ᴴ B_v dH
If the density of the vapor is neglected, then
P1- P v = - PBAh
(76)
while the balance of all forces at the interface require P l - P Y = T - T ,
(77)
Combining (75), (76), and (77) gives Ah = p O M H / p g This is just the result given previously as (70).
(78)
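The two limiting rise laws (71) and (72) lend themselves to a quick numerical check. The Python sketch below is a modern illustration rather than part of the original treatment, and the input values used in any call are arbitrary:

```python
import math

MU0 = 4e-7 * math.pi   # permeability of free space, H/m
G = 9.81               # gravitational acceleration, m/sec^2

def rise_small_field(I, r, chi_i, rho):
    """Eq. (71): meniscus rise for small fields, M_bar = chi_i*H/2,
    with the azimuthal field H = I/(2*pi*r) of Eq. (69)."""
    return MU0 * chi_i * I ** 2 / (8.0 * math.pi ** 2 * r ** 2 * rho * G)

def rise_saturated(I, r, Ms, rho):
    """Eq. (72): meniscus rise for saturated fluid, M_bar = Ms,
    giving the hyperbolic profile noted in the text."""
    return MU0 * Ms * I / (2.0 * math.pi * r * rho * G)
```

Both expressions follow directly from Δh = μ₀M̄H/ρg with the appropriate limit of M̄ inserted.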
2. The Classical Quincke Problem
This problem has practical utility in measurement of magnetization in magnetic liquids (Bates, 1961). Figure 8 illustrates an idealized geometry consisting of two parallel magnetic poles. The height and width of the poles
FIG. 8. The classical Quincke experiment showing the rise of magnetic fluid between the poles of a magnet. (After Jones, 1977.)
is much greater than the separation, and also, the poles are assumed to be highly permeable so the magnetic field between them is uniform. The poles are immersed part-way into a reservoir of magnetic liquid of density ρ. Pole spacing is sufficiently wide that capillarity is not important, so the height to which the liquid rises between the poles is a function of the applied magnetic field H. Because the density of air is very small compared to that of the magnetic liquid, the external ambient pressure may be assumed constant at p₀. This problem is easily solved with the Bernoulli equation (65), considering a point 1 chosen at the free surface outside the field and a point 2 at the free surface in the field region. From Eq. (65),
p₁* + ρgh₁ = p₂* + ρgh₂ − μ₀M̄H    (79)

From the boundary condition formulated as Eq. (68),

p₁* = p₀    and    p₂* = p₀    (80)

Combining these relationships,

Δh = h₂ − h₁ = μ₀M̄H/ρg    (81)

Normally the experiment is carried out with fluid in a vertical glass tube.
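For a superparamagnetic fluid the field-averaged magnetization M̄ of Eq. (63) can be evaluated in closed form from the Langevin law M = M_s(coth α − 1/α), which integrates to M̄ = M_s ln(sinh α/α)/α. The Python sketch below is illustrative; the closed form is standard, but the parametrization (a single effective particle moment, encoded in `alpha_per_H`) and any input values are assumptions, not data from the chapter:

```python
import math

MU0 = 4e-7 * math.pi   # H/m
G = 9.81               # m/sec^2

def mean_magnetization(H, Ms, alpha_per_H):
    """Field-averaged magnetization, Eq. (63), for the Langevin law;
    alpha_per_H = mu0*m/(k*T) so that alpha = alpha_per_H * H."""
    a = alpha_per_H * H
    if a < 1e-6:
        return Ms * a / 6.0                        # series limit: M_bar -> chi_i*H/2
    if a > 30.0:
        log_term = a - math.log(2.0) - math.log(a) # ln(sinh a / a) without overflow
    else:
        log_term = math.log(math.sinh(a) / a)
    return Ms * log_term / a

def quincke_rise(H, Ms, alpha_per_H, rho):
    """Eq. (81): rise height between the poles, delta_h = mu0*M_bar*H/(rho*g)."""
    return MU0 * mean_magnetization(H, Ms, alpha_per_H) * H / (rho * G)
```

The small-field branch reproduces M̄ → χᵢH/2 of Eq. (64a), and the large-argument branch approaches M_s as in Eq. (64b).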
3. Surface Elevation in Normal Field
For the problem shown in Fig. 9 (Jones, 1977) the plane-parallel poles of a magnet produce a vertical field, so the magnetic field is perpendicular to the magnetic liquid interface. The magnetic fluid responds with the surface elevation change Δh. The magnetic fields above and below the interface are H₂ and H₁, respectively. They are related by the boundary condition on the normal component of B, that is,

μ₀(H₁ + M) = μ₀H₂    (82)
FIG. 9. Uniform magnetic field imposed normal to free ferrofluid interface. The fluid magnetization is assumed smaller than required to produce the normal field instability. (After Jones, 1977.)
Using the Bernoulli equation of (65) gives

p₁* + ρgh₁ = p₂* + ρgh₂ − μ₀M̄H₁    (83)

while Eq. (68) for the boundary conditions gives

p₁* = p₀    and    p₂* = p₀ − (μ₀/2)M²    (84)

Combining (83) and (84) gives an expression for Δh:

Δh = h₂ − h₁ = (μ₀M̄H₁ + μ₀M²/2)/ρg    (85)

Compared to the Quincke result of Eq. (81), the surface elevation is greater in this problem by the amount μ₀M²/2ρg for the same value of field H in the fluid. Berkovsky and Orlov (1973) analytically investigate a number of problems in the shape of a free surface of a magnetic fluid.

4. Jet Flow

The jet flow of Fig. 10 illustrates the coupling that may occur between flow speed and applied magnetic field. The magnetic field is provided by a uniformly wound current-carrying solenoid. Attraction of the fluid by the field accelerates the fluid motion along the direction of its path, which is assumed horizontal. The magnetic boundary condition on the field at station 2 requires continuity of the tangential component of H. Thus

H₂′ = H₂    (86)

where the prime denotes the field just inside the jet. From the Bernoulli relationship of Eq. (65),

p₁* + ρq₁²/2 = p₂* + ρq₂²/2 − μ₀M̄H₂    (87)
FIG. 10. Free jet of magnetic fluid changes cross section and velocity with no mechanical contact. (After Rosensweig, 1966b.)
The boundary conditions on the fluid parameters from Eq. (68) give

p₁* = p₀,    p₂* = p₀    (88)

Thus

q₂² − q₁² = 2μ₀M̄H₂/ρ    (89)

The incompressible fluid velocity satisfies the continuity relationship ∇·q = 0. This may be integrated using the divergence theorem as follows, assuming the jet possesses a round cross section everywhere:

q₁d₁² = q₂d₂²    (90)

Combining (89) and (90) gives for the ratio of jet diameters,

d₂/d₁ = [1 + 2μ₀M̄H₂/ρq₁²]^(−1/4)    (91)

5. Modified Gouy Experiment

The technique illustrated in Fig. 11 provides a means for gravimetric measurement of the field-averaged magnetization M̄. Originally the technique was used for measurement of weakly paramagnetic liquids having constant permeability (Bates, 1961), while the present treatment extends the analysis to nonlinear media of high magnetic moment. A tube of weight w_t, having cross-sectional area a_t, and containing ferrofluid is suspended vertically by a filament between the poles of an electromagnet furnishing a source of applied field H_a. The top surface of the ferrofluid at plane 1 experiences negligible field intensity, while at plane 2 the field H₂ within the fluid is uniform at its maximum value. The force F is given as the sum of
5 . Modified Gouy Experiment The technique illustrated in Fig. 11 provides a means for gravimetric measurement of the field averaged magnetization A. Originally the technique was used for measurement of weakly paramagnetic liquids having constant permeability (Bates, 1961), while the present treatment extends the analysis to nonlinear media of high magnetic moment. A tube of weight w, having cross-sectional area a, and containing ferrofluid is suspended vertically by a filament between the poles of an electromagnet furnishing a source of applied field Ha.The top surface of the ferrofluid at plane 1 experiences negligible field intensity, while at plane 2 the field H 2 within the fluid is uniform at its maximum value. The force F is given as the sum of
138
RONALD E. ROSENSWEIG
Gravity
Tube containing ferrofluid
FIG. 11. Analysis of the modified Gouy relationship.
pressure forces and the weight of the containing tube:

F = (p₂* − p₀)a_t + w_t    (92)

Note that p* is regarded as capable of exerting a normal stress on a surface in the same manner as ordinary pressure. From the Bernoulli equation of (65) applied between sections 1 and 2,

p₁* + ρgh₁ = p₂* + ρgh₂ − μ₀M̄H₂    (93)

From (68) the boundary condition at the free surface is

p₁* = p₀    (94)

Combining the above and solving for M̄ gives

M̄ = (F − F₀)/μ₀H₂a_t    (95)

where F₀ = w_t + ρg(h₂ − h₁)a_t and represents the force when field is absent. H₂ is less than the applied field H_a due to the influence of the sample shape:

H₂ = H_a − DM    (96)

For circular cylinders the demagnetization coefficient D equals ½. Additional examples of magnetic fluid hydrostatics and hydrodynamics are developed later in discussing devices.
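As a closing numerical illustration of these inviscid flows (again a modern sketch with assumed input values, not part of the original), the jet-thinning relation of Eq. (91) may be evaluated as follows:

```python
import math

MU0 = 4e-7 * math.pi   # H/m

def jet_diameter_ratio(M_bar, H2, rho, q1):
    """d2/d1 from Eqs. (89)-(91): the Bernoulli speed-up
    q2**2 - q1**2 = 2*mu0*M_bar*H2/rho combined with continuity
    q1*d1**2 = q2*d2**2 for a round jet."""
    return (1.0 + 2.0 * MU0 * M_bar * H2 / (rho * q1 ** 2)) ** -0.25
```

A slow jet entering a strong solenoid field thins markedly, while a fast jet is barely affected, since the magnetic term competes against ρq₁².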
6. Convective Flows

The isothermal flow of magnetic fluid without free surfaces is, to first approximation and assuming the absence of magnetorheological influence, independent of applied magnetic field. However, given a magnetic fluid having a temperature-dependent magnetic moment, body forces may appear when temperature gradients are present, and a number of new phenomena have been investigated. (See also Section IV,C regarding thermomagnetic
pumping.) Neuringer (1966) analyzes stagnation point flow of a warm ferrofluid against a cold wall, and parallel flow of warm ferrofluid along a cold flat plate. Numerical results are calculated describing the velocity and temperature profiles when the field source is a dipole. An increase in the magnetic field strength leads to a decrease in the heat flux and skin friction. More recently, Buckmaster (1978) in a study of boundary layers shows that magnetic force can significantly delay or enhance separation. Indeed, if the force is unfavorable, the separation point can be moved all the way forward to a front stagnation point; whereas if the force is favorable, separation can be delayed to a point arbitrarily close to a rear stagnation point. A study of Berkovsky et al. (1973) examines heat transfer across vertical ferrofluid layers placed in a gradient magnetic field (Fig. 12). Equation (51)
FIG. 12. Geometry for convective heat transfer in a closed volume. (After Berkovsky and Bashtovoi, 1973.)
together with the convective conductive equation for heat flow are numerically integrated subject to the Boussinesq approximation. Experiments with a kerosene-base ferrofluid agree well with the computed values. Heat transfer is increased when the temperature gradient and magnetic field gradient are in the same direction; there is a decrease when the directions are opposite. With increased heat transfer the results are correlated over the range R > 10³, R* > 10³, and 2 < c/w < 10 by the formula

Nu = 0.42(c/w)^(−0.??)[R + 4(R*)^0.91]^0.23    (97)

where Nu is the Nusselt number, R is the usual Rayleigh number [see Eq. (129)],
RONALD E. ROSENSWEIG
and R* = |μ₀ L ΔH/ρwgβ₀| R, with L = (k₀ + β₀)M₀, where k₀ = −M⁻¹ ∂M/∂T is the pyromagnetic coefficient and β₀ is the thermal expansion coefficient. Berkovsky and Bashtovoi (1973) review additional results and prospects for convective heat-transfer processes in magnetic fluids.
7. Other Studies
While the literature is too large to permit a review of all work, several interesting directions for further research are indicated by the following studies.

As part of their pioneering work, Papell and Faber (1966) simulated zero- and reduced-gravity pool boiling using ferrofluid in a field gradient. It is likely that a complete simulation is not possible due to the influence of magnetism on various surface instabilities; as a potential benefit, however, the new mechanisms should permit enhancement and control of the rate of boiling.

Miller and Resler (1975) investigated the ferrofluid surface pressure jump in a uniform magnetic field. The surface pressure jump at one surface cannot be measured by itself, as any method of measurement will naturally involve two surfaces. A differential manometer was attached to a glass sphere containing ferrofluid, with one tube connected at the equator and another at the top of the sphere. The experimental pressure jump followed the directional trend predicted by theory but exceeded prediction by a factor of about 2.2. Magnetization of the ferrofluid as a function of field was not measured or reported by these investigators, and hence there is a question concerning the interpretation of the results. The experiment furnishes a fundamental means to check the continuum theory and deserves to be repeated.

Jenkins (1972) formulates constitutive relations for the flow of ferrofluids. His treatment permits anisotropy, hence gives another approach to the incorporation of antisymmetric stress. In considering particular flows (Jenkins, 1971) a peculiar conclusion is reached: that swirl flow in a rotating uniform magnetic field is theoretically not possible. There is need to reconcile this conclusion with the experimental evidence.

The ferroelectrohydrodynamics of suspended ferroelectric particles is treated by Dzhaugashtin and Yantovskii (1969).
To date an electrically polarizable analog of ferrofluid does not seem to have been produced; it is likely that depolarization by free charge may defeat such efforts. In this sense it may be said that ferrofluid owes its existence to the absence of magnetic monopoles in the environment. The relativistic hydrodynamic motion of magnetic fluids is developed by Cissoko (1976).
F. Instabilities and Their Modification
Subtle and unexpected fluid dynamic phenomena are associated with flow instabilities of magnetic fluids. These phenomena offer practical value in a number of cases where magnetization of the appropriate orientation and intensity prevents instability and extends the operating range of the equilibrium flow field. Conversely, in other situations magnetization upsets the fluid configuration, setting limitations that would not exist in the absence of the field. The user of magnetic fluids can benefit from an awareness of both of these aspects. In the following, attention is initially devoted to two uniform layers having an equilibrium plane interface of infinite extent. This apparently simple situation is actually rich in physical interest, as will be seen. The stability of the equilibrium is examined in response to flow speed, gravitational force, interfacial tension, magnetic properties, and magnetizing field.

1. Formulating the Problem
The stability of flow may be ascertained from the behavior of the interfacial boundary between the fluid layers. The sketch of Fig. 13 illustrates nomenclature for these systems. Since an arbitrary initial disturbance of the interface may be represented as the superposition of harmonic terms, it is only necessary to consider the evolution of one such term having an arbitrary wavelength.
FIG.13. Nomenclature for interfacial stability of magnetic fluids. (After Zelazo and Melcher, 1969.)
The deflection of the interface may be represented as
ξ = ξ₀ exp(σt) cos(γt − k_y·y − k_z·z)   (98)

Each of the parameters σ, γ, ξ₀ and the wave numbers k_y and k_z are taken as real valued. Hence Eq. (98) represents a traveling wave having a velocity termed the phase velocity, of magnitude γ/k, where k = (k_y² + k_z²)^(1/2), and amplitude ξ₀ at time t = 0. If σ = 0 the disturbance is neutrally stable and Eq. (98) describes a traveling wave of constant amplitude; while for σ > 0 the disturbance grows in amplitude with time, and the flow is said to be
142
RONALD E. ROSENSWEIG
unstable. Values of σ < 0 correspond to stable flows. In circumstances where γ = 0 the disturbance is termed static. Definite expressions for the parameters are obtained by solving the ferrohydrodynamic equations in their small disturbance (linearized) form. The algebraic work is simplified when Eq. (98) is represented in the alternate form
ξ = ξ₀ Re exp i(ωt − k_y·y − k_z·z)   (99)

where Re denotes the real part, i is the imaginary number, and ω is complex as indicated by the following:

ω = γ − iσ   (100)
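The equivalence of the real representation (98) and the complex representation of Eqs. (99)-(100) can be checked numerically; all parameter values in the sketch below are arbitrary illustrations, not data from the text.

```python
import cmath
import math

# Arbitrary illustrative values (not from the text)
gamma, sigma = 2.0, 0.3      # angular frequency gamma, growth rate sigma
ky, kz = 1.5, 0.8            # wave numbers
xi0 = 0.01                   # initial amplitude
y, z, t = 0.2, 0.4, 1.7      # a sample point and time

# Real form, Eq. (98): xi = xi0 * exp(sigma*t) * cos(gamma*t - ky*y - kz*z)
xi_real_form = xi0 * math.exp(sigma * t) * math.cos(gamma * t - ky * y - kz * z)

# Complex form, Eqs. (99)-(100): xi = xi0 * Re exp[i(omega*t - ky*y - kz*z)],
# with omega = gamma - i*sigma
omega = complex(gamma, -sigma)
xi_complex_form = xi0 * cmath.exp(1j * (omega * t - ky * y - kz * z)).real

assert abs(xi_real_form - xi_complex_form) < 1e-12
```

Since exp[i(γ − iσ)t] = e^(σt) e^(iγt), the growth factor e^(σt) emerges automatically from the complex frequency, which is why the complex form simplifies the algebra.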
2. General Dispersion Relation for Moving Nonlinear Media with Oblique Magnetic Field

Zelazo and Melcher (1969) develop a dispersion relation for stationary layers that may be generalized (M. Zahn, personal communication, 1977) to include motion of the media; the result appears as follows:

(ω − k_y U_a)² ρ_a coth ka + (ω − k_y U_b)² ρ_b coth kb = gk(ρ_b − ρ_a) + k³𝒯 − kΣ/Y   (101)

where Σ, Y, and β are given by Eqs. (102a)-(102d); these expressions involve the equilibrium tangential-field differences (H°_y,a − H°_y,b) and (H°_x,a − H°_x,b), the chord and tangent susceptibilities, and hyperbolic functions of βa and βb, together with

B = μ(H²)H   (102e)

μ = μ₀(χ + 1)   (102f)
The magnetic fluid has magnetization density M that depends on H as illustrated in Fig. 14, where χ₀ is the chord susceptibility and χ_t the tangent susceptibility. Fluid velocities U_a and U_b are directed along the y direction and the field is oriented in the xy plane. δ_jk is the Kronecker delta. The superscript degree (°) denotes the equilibrium flow value; subscript a denotes "above" and b "below."
FIG. 14. Nomenclature and typical appearance of the magnetization curve for a magnetic fluid; the abscissa is H (10³ A/m). (After Zelazo and Melcher, 1969.)
Thus the dispersion equation (101) relates the angular frequency ω to the wave vector components of the harmonic disturbances (perturbations) of the interface. In general the individual waves propagate at different velocities, dispersing away from each other.
3. Reduction of the Dispersion Relation for a Linear Medium

The dispersion relation is rather complicated, so that initially an understanding of its content is facilitated by considering the simpler form it takes when the magnetic media are linearly magnetizable (μ = const). In that case

Y = k(μ_b sinh ka cosh kb + μ_a cosh ka sinh kb)   (103)

and the dispersion relation reduces to

(ω − k_y U_a)² ρ_a coth ka + (ω − k_y U_b)² ρ_b coth kb
  = gk(ρ_b − ρ_a) + k³𝒯 + k_y² H_y°²(μ_a − μ_b)²/(μ_b tanh ka + μ_a tanh kb)
    − k² μ_a μ_b (H°_x,a − H°_x,b)²/(μ_b coth kb + μ_a coth ka)   (104)

It is noted that the MH-IIa result in Melcher (1963) differs from Eq. (104) in the denominator of the final term (surface current is absent at the infinitely permeable wall so the tangential field is continuous). Equation (104) reduces further for thick layers (a → ∞, b → ∞) to the following:

(ω − k_y U_a)² ρ_a + (ω − k_y U_b)² ρ_b
  = gk(ρ_b − ρ_a) + k³𝒯 + [k_y² H_y°²(μ_a − μ_b)² − k² μ_a μ_b (H°_x,a − H°_x,b)²]/(μ_b + μ_a)   (105)
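Treating the thick-layer relation (105), in the reconstructed form shown here with U_a = U_b = 0, as a sketch, a few lines of code can confirm its qualitative content: long waves are gravitationally unstable when dense fluid lies on top, short waves are held by interfacial tension, and a tangential field raises ω² for waves running along it. All parameter values are illustrative assumptions.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m

def omega_sq(k, ky, rho_a, rho_b, mu_a, mu_b, Hy, dHx, T, g=9.8):
    """omega^2 from the thick-layer dispersion relation, Eq. (105), with
    U_a = U_b = 0 (reconstructed form; subscript a = upper layer).
    Hy: tangential applied field; dHx: jump in the normal field component;
    T: interfacial tension."""
    rhs = (g * k * (rho_b - rho_a) + T * k**3
           + (ky**2 * Hy**2 * (mu_a - mu_b)**2
              - k**2 * mu_a * mu_b * dHx**2) / (mu_a + mu_b))
    return rhs / (rho_a + rho_b)

# Dense magnetic fluid above air (Rayleigh-Taylor configuration), no field:
rho_a, rho_b, T = 1200.0, 1.2, 0.025            # illustrative values
k_star = math.sqrt(9.8 * (rho_a - rho_b) / T)   # Taylor wavenumber, Eq. (109)

# Long waves (k < k*) are unstable (omega^2 < 0); short waves are stabilized
# by interfacial tension (omega^2 > 0):
assert omega_sq(0.5 * k_star, 0.5 * k_star, rho_a, rho_b, 2*MU0, MU0, 0, 0, T) < 0
assert omega_sq(2.0 * k_star, 2.0 * k_star, rho_a, rho_b, 2*MU0, MU0, 0, 0, T) > 0

# A tangential field stiffens waves propagating along it (ky = k):
w2_0 = omega_sq(0.5 * k_star, 0.5 * k_star, rho_a, rho_b, 2*MU0, MU0, 0.0, 0, T)
w2_H = omega_sq(0.5 * k_star, 0.5 * k_star, rho_a, rho_b, 2*MU0, MU0, 5.0e4, 0, T)
assert w2_H > w2_0
```

The last pair of calls anticipates Sections 4 and 5: a wave along the field that is gravitationally unstable without the field can be rendered stable by a sufficiently strong tangential field.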
4. No Mean Flow: The Rayleigh-Taylor Problem
The classical Rayleigh-Taylor problem treats the stability of a dense fluid overlaying a less dense fluid. The instability of this fluid configuration provides an explanation for the familiar fact that liquid spills from an inverted bottle despite the fact that atmospheric pressure can support a water column 10 m in length. The instability is prevented in a familiar demonstration using a layer of stiff paper placed in contact over the vessel mouth. The instability phenomenon has broader ramification than might be thought, particularly when kinematic acceleration or deceleration is considered in place of gravitational acceleration. The following simplifications are imposed on Eq. (105), corresponding to the absence of mean flow and the presence of tangential applied field.
U_a = 0,   U_b = 0,   H°_x,a = 0,   H°_x,b = 0,   μ_a = μ₀   (106)
From (105) there is obtained for the dispersion equation

ω²(ρ_a + ρ_b) = gk(ρ_b − ρ_a) + k³𝒯 + k_y² μ₀ M²/(2 + χ)   (107)

where M is oriented along the y direction. In the usual nonmagnetic case (Lamb, 1932) M = 0 and (107) reduces to

ω²(ρ_a + ρ_b) = gk(ρ_b − ρ_a) + k³𝒯   (108)

When ρ_b > ρ_a, corresponding to the more dense layer on the bottom, ω is real so ξ = ξ₀ cos(ωt − k_y·y − k_z·z) and the solution describes traveling waves that are neutrally stable. With dense fluid overlaying less dense fluid, ρ_b < ρ_a and the right side of (108) is negative; ω is imaginary (not complex), γ = 0, and σ is real. Thus ξ = ξ₀ e^(σt) cos(k_y·y + k_z·z), corresponding to static waves that are unstable. This is Rayleigh-Taylor instability. As Fig. 15 illustrates, the expression given by Eq. (108) leads to negative values of ω², hence imaginary values of ω, when the (negative) gravitational term of (108) is larger in magnitude than the stabilizing interfacial tension term. Incipient instability corresponds to the value k = k* obtained from (108) when ω = 0:

k* = [g(ρ_a − ρ_b)/𝒯]^(1/2)   (109)
λ* = 2π/k* is the Taylor wavelength. Wavelengths shorter than λ* are stabilized by interfacial tension; another familiar demonstration stabilizes the interface with capillary forces created by an open mesh screen placed over
FIG. 15. Dispersion in the ω-k plane for Rayleigh-Taylor instability.
the interface. The effect is dramatized by passing a fine wire through the openings of the mesh into and out of the fluid. As seen from the last term of (107), magnetization provides a stabilizing (stiffening) influence for disturbances propagating along the field lines, while self-field effects are absent for perturbations propagating across the lines of field intensity. Convenient experiments for verifying the dispersion relations with tangential and normally applied magnetic fields use rectangular containers, partly filled with magnetic fluid, driven by a low-frequency transducer to vibrate in the horizontal plane. By shaking the container at appropriate frequencies, it is possible to excite resonances near the natural frequencies of the interface. These occur as the box contains an integral number n_y of half-waves over its length such that k_y = n_y·π/l_y, k_z = 0. In a typical measurement as illustrated in Fig. 16a, the resonant condition is established by varying the driving frequency in order to approach the resonance from above and, again, from below. In all cases the fluid depth was great enough to ignore the presence of the container bottom. Magnetic field was produced by Helmholtz coils. From Eq. (107) the relative frequency shift to produce standing waves is given as

[(ω₁² − ω₀²)/ω₀²]^(1/2) = F   (110)

Figure 16b displays satisfactory agreement between experimental values and the theoretical prediction given as the solid line. The resonance frequencies shift upward with increasing magnetization.
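The upward resonance shift can be illustrated with the reconstructed form of Eq. (107); the container size, mode number, and fluid properties below are illustrative assumptions, not the experimental values of Zelazo and Melcher.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m

def resonance_omega(n_y, l_y, rho_a, rho_b, M, chi, T, g=9.8):
    """Standing-wave resonance frequency from the reconstructed Eq. (107):
    omega^2 (rho_a + rho_b) = g k (rho_b - rho_a) + T k^3 + k^2 mu0 M^2/(2 + chi),
    with k = k_y = n_y * pi / l_y and k_z = 0 (tangential field along y)."""
    k = n_y * math.pi / l_y
    w2 = (g * k * (rho_b - rho_a) + T * k**3
          + k**2 * MU0 * M**2 / (2.0 + chi)) / (rho_a + rho_b)
    return math.sqrt(w2)

# Illustrative values: air above a magnetic fluid, third mode in a 10-cm box
w0 = resonance_omega(3, 0.10, 1.2, 1200.0, 0.0,   1.0, 0.025)   # no field
wM = resonance_omega(3, 0.10, 1.2, 1200.0, 1.0e4, 1.0, 0.025)   # magnetized

# Resonance shifts upward with magnetization, as in Fig. 16b
assert wM > w0 > 0.0
```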
FIG. 16. (a) Experimental arrangement to determine the influence of tangential field on resonance of surface waves on a magnetic fluid. (b) Data illustrating shift of resonance to higher frequencies with increase of fluid magnetization. (After Zelazo and Melcher, 1969.)
5. Gradient Field Stabilization

Whereas uniform tangential magnetic field stiffens the fluid interface for wave propagation along the field direction, waves propagating normal to the field remain uninfluenced by the field and become unstable when dense fluid overlays less dense fluid. However, an imposed magnetic field having a gradient of intensity is capable of stabilizing the fluid interface against growth in amplitude of waves having any orientation. The theory for gradient field stabilization is more complex than for the uniform field self-interaction and includes the self-field influence as a special case (Zelazo and Melcher, 1969). To be stabilizing, the field intensity must increase in the direction of the magnetizable fluid, whether the magnetizable layer is more dense than the underlaying fluid or the magnetizable layer is less dense and underlays the nonmagnetic fluid, in which case buoyant mixing should be prevented. Normal field possessing a gradient of intensity is less satisfactory for this purpose than tangential field having the requisite gradient, due to the destabilizing tendency of uniform field oriented normal to the interface (see Section II,F,6). The following expression gives the interface criterion in order for tangential gradient field to prevent the Rayleigh-Taylor instability:

μ₀ M (dH_y/dy) > g(ρ_a − ρ_b)   (112)

Surface tension, which was neglected, only further increases the stability. Zelazo and Melcher (1969) tested the relationship of (112) in adverse gravitational acceleration using wedge-shaped steel pole pieces to provide the gradient in imposed field intensity; they report quantitative agreement between theory and experiment. However, it appears that the gradient field extended over the whole volume of the magnetic fluid, so the experiment was unable to distinguish between field gradient support of the liquid and gradient field stabilization of the interface. Rosensweig (1970, unpublished) devised a demonstration to illustrate the effectiveness of employing gradient field that is localized at the magnetic fluid interface. Figure 17 illustrates the stably supported liquid column. In
FIG. 17. Field gradient stabilization of Rayleigh-Taylor instability illustrating mechanism for a magnetic fluid contactless valve.
one apparatus a sealed glass tube T of 8 mm i.d. and 330 mm length contained magnetic fluid with ρ = 1200 kg/m³, μ₀M = 0.012 Wb/m² (120 G). The field is furnished by a ring magnet M₁, slid over the tube, having face-to-face magnetization and made of oriented barium ferrite, 25 mm o.d. by 7.5 mm thick. As shown in Fig. 17a the fluid column of length l₁ is initially supported against gravity by the pressure difference p₁ less p₂, the lower interface stabilized by the gradient field. At Fig. 17b the magnet slid to a lower
position, resulting in a lowering of the fluid column as a whole. Figure 17c depicts the system after the next change of magnet position in this sequence. As the magnet is raised, fluid that is passed over flows to the tube bottom while the overlaying fluid remains in place. The features are each in accord with the expectations of the interface stabilization phenomena. The containment of the fluid is effectively accomplished with a magnetic barrier which may serve as a nonmaterial valve.

6. Normal Field Surface Instability

Magnetic field oriented perpendicular to the flat interface between a magnetizable and a nonmagnetic fluid has a destabilizing influence on the interface. The phenomenon was first reported by Rosensweig (1966a), who observed it upon producing a magnetizable fluid that was severalfold more concentrated than the fluids previously available. The phenomenon is evoked in its essential form when the applied magnetic field is uniform (see Fig. 18a). As shown in the photograph of Fig. 18b, at the onset of transition the interface displays a repetitive pattern of peaks. The spatial pattern is invariant with time at constant field. Cowley and Rosensweig (1967) gave the analysis and confirming experiments for a nonlinearly magnetizable fluid forming a hexagonal array of peaks (spikes). Consider Eq. (105) with

U_a = 0,   U_b = 0,   H_y° = 0,   μ_a = μ₀,   μ_b = μ   (113)
The dispersion relation then becomes

ω²(ρ_a + ρ_b) = gk(ρ_b − ρ_a) + k³𝒯 − k² μμ₀M²/(μ + μ₀)   (114)

In Eq. (114), ω appears as a squared term only, while the right-hand side is a real number. Thus the value of ω is either real or imaginary but never complex. From Eq. (100), when ω is imaginary, an arbitrary disturbance of the interface initially grows with time as described by the factor e^(σt), where σ is the imaginary part of ω. Thus the onset of instability corresponds to the dependence of ω² on k as sketched in Fig. 19a. It is evident from the figure that transition occurs if both the following conditions are met:

ω² = 0   (115a)

and

∂ω²/∂k = 0   (115b)

The instability occurs as a spatial pattern that is static in time. Applying the conditions of (115) gives

M_c² = (2/μ₀)(1 + μ₀/μ)[g(ρ_b − ρ_a)𝒯]^(1/2)   (116)

k_c = [g(ρ_b − ρ_a)/𝒯]^(1/2)   (117)
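As a numerical check on the onset conditions, the script below evaluates the critical wavenumber and the critical magnetization, using the nonlinear-medium form of Eq. (118) given later in this section together with the property values of the worked example there (ρ_b = 1200 kg/m³, ρ_a ≈ 0, 𝒯 = 0.025 N/m, μ_cμ_t = 2μ₀²); the algebraic form is a reconstruction, but it reproduces the value quoted in the text.

```python
import math

MU0 = 4e-7 * math.pi   # permeability of free space, H/m
g = 9.8                # m/s^2

# Fluid properties from the worked example in the text
rho_b, rho_a = 1200.0, 0.0    # magnetic fluid below, air above
T = 0.025                     # interfacial tension, N/m
r0 = math.sqrt(2.0)           # (mu_c * mu_t / mu0^2)^(1/2) for mu_c*mu_t = 2*mu0^2

# Critical wavenumber, Eq. (117): the Taylor wavenumber
k_c = math.sqrt(g * (rho_b - rho_a) / T)

# Critical magnetization, nonlinear-medium form of Eq. (118a) as reconstructed:
# M_c^2 = (2/mu0) * (1 + 1/r0) * sqrt(g * (rho_b - rho_a) * T)
M_c = math.sqrt((2.0 / MU0) * (1.0 + 1.0 / r0)
                * math.sqrt(g * (rho_b - rho_a) * T))

print(round(M_c))               # -> 6826, matching the quoted M_c = 6825 A/m (86 G)
print(round(2 * math.pi / k_c, 4))   # -> 0.0092 m, the peak-spacing scale
```

That the reconstructed formula lands within 1 A/m of the text's quoted critical magnetization lends confidence to the reconstruction of Eqs. (116)-(118).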
FIG.18. (a) Experimental apparatus for producing the normal field instability in vertical applied magnetic field. Power source supplies current through the ammeter A to the coils C and subjects the ferromagnetic fluid F to an approximately uniform magnetic field. (b) Photograph illustrating the normal field instability of a magnetic fluid free surface. A small source of light at the camera lens is reflected from local flats on the fluid surface. (From Cowley and Rosensweig, 1967.)
Comparing Eq. (117) to Eq. (109) it is seen that k_c = k*, i.e., the critical wave number for the normal field instability corresponds to the Taylor wave number. Equation (116) gives the critical value of magnetization M_c; this is the lowest value of magnetization at which the phenomenon can be observed. Experimentally and theoretically, when magnetization is increased from zero by increasing the applied magnetic field, the fluid interface is perfectly flat over a wide range of applied field intensities up to the point where transition suddenly occurs. The phenomenon is striking in this regard, and due to its critical nature, the condition for onset is easily and accurately detectable. The corresponding instability at the free surface of a liquid dielectric in a constant vertical electric field has been studied experimentally by Taylor and McEwan (1965) and theoretically by Melcher (1963).

FIG. 19. (a) Dispersion in the ω-k plane at onset of the normal field instability. (b) Experimental data for appearance of the normal field instability agree with the predictions of theory; the abscissa is ρ/ρ₀. (After Cowley and Rosensweig, 1967.)

Equation (116) applies for the linear medium. In comparison, the results
of Cowley and Rosensweig (1967) for the nonlinear medium are

M_c² = (2/μ₀)(1 + 1/r₀)[g(ρ_b − ρ_a)𝒯]^(1/2)   (118a)

r₀ = (μ_c μ_t/μ₀²)^(1/2)   (118b)
where μ_c = B₀/H₀ and μ_t = (dB/dH)₀. According to Eq. (118) the role of permeability in Eq. (116) is replaced by the geometric mean of the chord and tangent permeabilities when the medium is nonlinear. As a numerical example, for a pool of magnetic fluid exposed to air at atmospheric pressure with ρ_b = 1200 kg/m³, ρ_a ≈ 0, 𝒯 = 0.025 N/m, μ_cμ_t = 2μ₀², and g = 9.8 m/sec², the critical magnetization is M_c = 6825 A/m (86 G). If the fluid's saturation magnetization were less than this value, it could not display the instability regardless of the intensity of the applied magnetic field.

One can offer the following physical explanation for the appearance of peaks on the liquid surface. Suppose that in a uniform vertical field there arises a wavy perturbation of the magnetic fluid surface. The field intensity near the bulges of the perturbations is increased, but in the hollows it is decreased in comparison with the equilibrium value. Therefore the perturbation of the magnetic force is directed upward at the bulges but downward at the hollows; that is, a tendency exists to amplify the perturbation of the surface. On the other hand, the surface-tension and gravity forces are directed opposite to the displacement of the parts of the surface from the equilibrium; that is, they impede the displacement. As long as the warping of the surface is small, all the forces produced by it are proportional to the value of the displacement. The elastic coefficients representing the ratio of force to displacement for surface tension and gravity are independent of applied field intensity. However, the coefficient in the perturbation of the magnetic force is proportional to the square of the magnetization of the magnetic fluid. Therefore, at sufficiently large magnetization, the destabilizing magnetic force exceeds the sum of the other two forces and instability sets in.

Comparisons of the predicted critical magnetization from Eq. (118a) with experiments are shown in Fig.
19b for magnetic fluid with air and water interfaces. Subscript 0 denotes properties of the kerosene carrier liquid. Density ρ of the magnetic fluid was varied by changing the particle concentration in the kerosene carrier. The comparison between theory and experiment in these tests is excellent, as were the predictions for spacing between peaks. Additional study of the normal field instability develops conditions for appearance of a square array of peaks versus the hexagonal array using an
energy minimization principle, and calculates the amplitude of the peaks from nonlinear equations (Gailitis, 1977). Zaitsev and Shliomis (1969) analyze hysteresis of peak disappearance as field is decreased in terms of bifurcation solutions; a critical experiment is needed to confirm this prediction.

7. Kelvin-Helmholtz Instability

Classical Kelvin-Helmholtz instability relates to the behavior of a plane interface between moving fluid layers. Wind-generated ocean waves and the flapping of flags are two manifestations of the instability. A rather basic situation in ferrohydrodynamics is the inviscid wave behavior at the interface between layers of magnetized fluid having permeabilities μ_a and μ_b. The following considers the case of applied magnetic field with intensity H oriented parallel with the unperturbed surface. Gravity is oriented normal to the field and the fluid layers move at speeds U_a and U_b relative to fixed boundaries. The dispersion relationship is obtainable as a special case of Eq. (104) for linear media with H°_x,a = H°_x,b = 0. This problem was originally treated abstractly in the monograph of Melcher (1963).

(ω − k_y U_a)² ρ_a coth ka + (ω − k_y U_b)² ρ_b coth kb
  = gk(ρ_b − ρ_a) + k³𝒯 + k_y² H²(μ_a − μ_b)²/(μ_b tanh ka + μ_a tanh kb)   (119)
This expression is a quadratic in ω. For simplicity, considering the case when k = k_y, a → ∞, and b → ∞, ω becomes complex, and concomitantly the flow is incipiently unstable, when the following conditions are satisfied:

[ρ_a ρ_b/(ρ_a + ρ_b)](U_a − U_b)² = g(ρ_b − ρ_a)/k + k𝒯 + H²(μ_a − μ_b)²/(μ_a + μ_b)   (120a)

∂/∂k [g(ρ_b − ρ_a)/k + k𝒯] = 0   (120b)

The critical wave number is found from (120b) as

k_c = [g(ρ_b − ρ_a)/𝒯]^(1/2)   (121)

which again corresponds to the critical wavelength 2π/k_c in Rayleigh-Taylor instability. Eliminating k from (120a) then produces a criterion for instability in the magnetic Kelvin-Helmholtz problem:

[ρ_a ρ_b/(ρ_a + ρ_b)](U_a − U_b)² > 2[g(ρ_b − ρ_a)𝒯]^(1/2) + H²(μ_a − μ_b)²/(μ_a + μ_b)   (122)
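A short numeric illustration of the magnetic Kelvin-Helmholtz criterion, written here in the reconstructed deep-layer form of Eq. (122); the fluid properties are assumed for illustration only.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m

def critical_velocity_difference(rho_a, rho_b, mu_a, mu_b, H, T, g=9.8):
    """Maximum speed difference before Kelvin-Helmholtz instability, from the
    reconstructed deep-layer criterion (122):
    rho_a*rho_b/(rho_a+rho_b) * dU^2 = 2*sqrt(g*(rho_b-rho_a)*T)
                                       + (mu_a-mu_b)**2 * H**2 / (mu_a+mu_b)."""
    rhs = (2.0 * math.sqrt(g * (rho_b - rho_a) * T)
           + (mu_a - mu_b) ** 2 * H ** 2 / (mu_a + mu_b))
    return math.sqrt(rhs * (rho_a + rho_b) / (rho_a * rho_b))

# Illustrative: air streaming over a magnetic fluid; a tangential field
# raises the threshold velocity difference
dU0 = critical_velocity_difference(1.2, 1200.0, MU0, 2 * MU0, 0.0,   0.025)
dUH = critical_velocity_difference(1.2, 1200.0, MU0, 2 * MU0, 2.0e4, 0.025)

print(round(dU0, 2))   # threshold with no field, ~5.3 m/s
assert dUH > dU0       # field raises the threshold
```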
The larger the difference in permeability and applied field across the interface, the greater is the velocity difference that can be accommodated before instability occurs. Low density of a layer also promotes stability.

8. Stabilization of Fluid Penetration through a Porous Medium
The interface between two fluids can be unstable when the more viscous fluid is driven through the voids in a porous medium by a less viscous fluid. The phenomenon was analyzed by Saffman and Taylor (1958) using small signal stability analysis. Recently, Rosensweig et al. (1978) demonstrated that if a layer of magnetizable fluid is used to push the more viscous fluid, the interface can be stabilized for sufficiently small wavelengths with an imposed magnetic field. The simplest case, that of a two-region problem, is illustrated in Fig. 20.
FIG. 20. Perturbations on the interface separating two dissimilar fluids penetrating through a porous medium. (After Rosensweig et al., 1978.)
Magnetic fluid pushes nonmagnetic fluid in the presence of tangential applied field. The fluid motion is normal to the interfacial boundary between the two fluids. In a porous medium the details of the interstices are not known, but the local average fluid velocity q is adequately described by Darcy's law

0 = −∇p − βq + F   (123)

where p is the hydrodynamic pressure, β = η/K is the ratio of the fluid viscosity η to the permeability K, which depends on the geometry of the interstices, and F is any other internal force density. In the present case F is composed of gravitational and magnetization forces. Additional governing equations are the incompressible continuity relationship ∇·q = 0 and the magnetostatic field relations ∇ × H = 0 and ∇·B = 0. The interfacial perturbation again may be represented by Eq. (99). Linearization and solution of the governing equations subject to requiring force
balance at the interface leads to the dispersion relation, Eq. (124), for the growth rate ν. If ν is negative any perturbation decays with time, while if it is positive the system is unstable and any perturbation grows exponentially with time. Thus surface tension and magnetization tend to stabilize the system; gravity also stabilizes the system if the more dense fluid is below (ρ_b > ρ_a). If ρ_b < ρ_a, there results an instability of a more dense fluid supported by a less dense fluid. Unlike in Rayleigh-Taylor instability, in fluid penetration it is viscous drag rather than inertia that controls the dynamics. The magnetic field only stabilizes those waves oriented along the direction of the field. Then with k_z = 0, so that k = k_y, Eq. (124) reduces to Eq. (126), where r₀ = (μ_cμ_t/μ₀²)^(1/2) as given previously, and

G = (β_a − β_b)V + g(ρ_a − ρ_b)   (127)
Surface tension stabilizes the smallest wavelengths (largest wave numbers), and magnetic field stabilizes intermediate wavelengths. However, the system is unstable over a range of small wave numbers, readily found from the above relationships, when G > 0. When G ≤ 0 the system is stable for all wave numbers whether magnetization is present or not. Additional analysis for a magnetic fluid layer having a finite thickness results in the same stability condition discussed above; the interfacial stability is independent of the layer thickness.

The photographs of Fig. 21a illustrate an experimental verification of the magnetic stabilization of fluid penetration. The test utilizes a Hele-Shaw cell consisting of two parallel plates separated by a small distance d₀ in the z direction as shown in Fig. 21b. The flow in the cell models Darcy's law with the correspondence given by Eqs. (128a) and (128b).
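The qualitative content of Eqs. (124)-(127) can be sketched in code. The exact coefficient of the magnetic term below is an assumption (taken proportional to μ₀M²k²/(1 + 1/r₀), acting only on waves along the field); only the qualitative conclusions — an unstable band at small wave numbers when G > 0, narrowed by magnetization — are relied upon.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m

def growth_rate(k, G, T, M, r0, beta_a, beta_b):
    """Sketch of the penetration growth rate for waves along the field (k = k_y),
    assuming nu*(beta_a + beta_b) = k*G - T*k**3 - c*M**2*k**2 with
    c = mu0/(1 + 1/r0); the magnetic coefficient is an assumed form."""
    c = MU0 / (1.0 + 1.0 / r0)
    return (k * G - T * k**3 - c * M**2 * k**2) / (beta_a + beta_b)

def max_unstable_k(G, T, M, r0):
    """Largest unstable wave number: positive root of G - T*k^2 - c*M^2*k = 0."""
    cM2 = MU0 * M**2 / (1.0 + 1.0 / r0)
    return (-cM2 + math.sqrt(cM2**2 + 4.0 * T * G)) / (2.0 * T)

# Illustrative parameters: G > 0 (unstable drive), beta = eta/K drag coefficients
G, T, r0, ba, bb = 50.0, 0.03, math.sqrt(2.0), 1e6, 1e4

k0 = max_unstable_k(G, T, 0.0, r0)    # no field
kM = max_unstable_k(G, T, 1e4, r0)    # strong magnetization

assert kM < k0                                              # band narrowed by field
assert growth_rate(0.5 * kM, G, T, 1e4, r0, ba, bb) > 0     # inside band: grows
assert growth_rate(2.0 * kM, G, T, 1e4, r0, ba, bb) < 0     # outside band: decays
```

Note that the smallest wave numbers remain unstable for any finite magnetization, in agreement with the statement above that stability for all wave numbers requires G ≤ 0.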
FIG. 21. (a) Fluid penetration from left to right through a horizontal Hele-Shaw cell. Plate spacing d₀ = 0.52 mm, aqueous-base magnetic fluid viscosity η_b = 1.18 mN·sec/m², oil viscosity η_a = 219 mN·sec/m², interfacial tension 𝒯 = 30 mN/m, velocity V = 0.3 mm/sec. (b) Schematic drawing of Hele-Shaw cell. (After Rosensweig et al., 1978.)
9. Thermoconvective Instability
In ordinary fluid mechanics there is a well-known convective instability which arises in a fluid supporting a temperature gradient. Owing to thermal expansion, the hotter portion of the fluid has a smaller body force acting on it per unit volume than does the colder fluid. The fluid, if it is heated from below, may then be considered top-heavy, subject to a tendency to redistribute itself to offset this imbalance, a tendency which is counteracted by the viscous forces acting in the fluid. Theoretical treatment of this phenomenon predicts that the fluid will undergo this convective redistribution when the value of a dimensionless number R, the Rayleigh number, exceeds a certain critical value R₀. The Rayleigh number in ordinary fluids acted upon only by the gravitational body force is given by

R = β₀ρg A₀ w⁴/κη   (129)
R₀ is 1708 for a horizontal layer of fluid and 1558 for a vertical layer. By analogy it is clear that a similar phenomenon may occur in a ferrofluid subjected to a body force μ₀M∇H. The body force depends on the thermal state of the fluid since M = M(T, H) with (∂M/∂T)_H < 0. Thus an increase of temperature T in the direction of the magnetic field gradient tends to produce an unstable situation; the colder fluid is more strongly magnetized and is drawn to the higher field region, displacing the warmer fluid. Thermoconvective instability of magnetic fluids may be investigated through linearizing the equation of motion, the equation of heat conduction, and the equation of continuity. Shliomis (1974) gives linearized relations for normalized perturbation velocity, temperature, and pressure valid at the limit of stability, when equilibrium is replaced by stationary convective motion and the excitations neither decay nor build up with time. The set of equations has the same form as in the problem of ordinary convective stability when a generalized combination of parameters R_g, given by Eq. (130), plays the role of Rayleigh's number, with

A₀ = −dT/dz,   G₀ = −dH/dz,   β₀ = (1/v)(∂v/∂T),   k₀ = −(1/M)(∂M/∂T)   (131)
Instability is indicated by the following criterion:

R_g > R₀,   R₀ = 1708 (horizontal layer), 1558 (vertical layer)   (132)
Mechanical equilibrium of an isothermal liquid (A₀ = 0) is always stable since the "Rayleigh number" is then always negative,

R = −(T₀w⁴/ρc₀κη)(β₀ρg + μ₀Mk₀G₀)²   (133)
so that the inequality R < R₀ is known to be satisfied (Cowley and Rosensweig, 1967). In the absence of magnetism (M = 0) Eq. (130) reduces to Eq. (129) provided

T₀β₀g/c₀A₀ ≪ 1   (134)

Taking values T₀ = 300 K, β₀ = 5 × 10⁻⁴ K⁻¹, g = 9.8 m/sec², c₀ = 4 × 10³ W·sec/kg·K, and A₀ = 10³ K/m gives T₀β₀g/c₀A₀ = 3.7 × 10⁻⁷, thus justifying the neglect of the adiabatic expansion term.
The ratio of the third term (magnetocaloric cooling) to the first term in the brackets of (130) is also generally a negligibly small number:

μ₀Mk₀G₀T₀/ρc₀A₀ ≪ 1   (135)

Taking k₀ = 10⁻³ K⁻¹, M = 29,900 A/m, G₀ = 8 × 10⁵ A/m², T₀ = 300 K, ρ = 10³ kg/m³, and c₀, A₀ the same as above gives a ratio of 2.3 × 10⁻⁶ from Eq. (135). Thus both adiabatic terms in (130) can be neglected, so the effective Rayleigh number takes the simpler form given by Lalas and Carmi (1971) and Curtis (1971).
R_e = (A₀w⁴/κη)(β₀ρg + μ₀Mk₀G₀)   (136)

Using the numerical values considered above, μ₀Mk₀G₀/β₀ρg = 6.2, so it is seen that the magnetic mechanism dominates over the gravitational mechanism. In the preceding discussion, the field gradient G₀ was considered as constant throughout the fluid layer. This approach is justified if G₀ ≫ G_i, where G_i is the gradient of the magnetic intensity induced by the temperature gradient A₀.
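The order-of-magnitude estimates above can be reproduced directly. The pyromagnetic coefficient k₀ = 10⁻³ K⁻¹ below is an assumption inferred from the quoted ratios (it reproduces both the 2.3 × 10⁻⁶ and the 6.2 figures) rather than read from the text.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m

# Values quoted in the text for the magnitude estimates of Eqs. (134)-(136)
T0    = 300.0    # K
beta0 = 5e-4     # 1/K, thermal expansion coefficient
g     = 9.8      # m/s^2
c0    = 4e3      # W*sec/kg*K
A0    = 1e3      # K/m, temperature gradient
k0    = 1e-3     # 1/K, pyromagnetic coefficient (assumed value)
M     = 29900.0  # A/m
G0    = 8e5      # A/m^2, field gradient
rho   = 1e3      # kg/m^3

# Eq. (134): adiabatic-expansion term is negligible
ratio_134 = T0 * beta0 * g / (c0 * A0)
# Eq. (135): magnetocaloric term is negligible
ratio_135 = MU0 * M * k0 * G0 * T0 / (rho * c0 * A0)
# Magnetic vs. gravitational buoyancy mechanism in Eq. (136)
ratio_mag = MU0 * M * k0 * G0 / (beta0 * rho * g)

print(ratio_134)   # ~3.7e-7, as quoted
print(ratio_135)   # ~2.3e-6, as quoted
print(ratio_mag)   # ~6.1, i.e. the magnetic mechanism dominates
```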
Finlayson (1970) analyzed thermal convective instability in the case where the applied field is uniform and the magnetically induced temperature change is appreciable. The governing parameter is the dimensionless group R_f of Eqs. (137) and (138), where

χ_t = (∂M/∂H)_T   (139)

For values of M > 3 × 10⁵ A/m the critical value of R_f approaches 1708 for any value of χ_t. Then the magnetic mechanism produces convection provided the following criterion is satisfied:

R_f > 1708   (horizontal layer)   (140)
III. MAGNETIC FLUIDS IN DEVICES

The unique fluid dynamic phenomena of magnetic fluids have led to numerous exploratory device applications and several proven technological applications. Often a small amount of magnetic fluid plays a critical role and
suffices to make rather massive devices possible. The fluid contained within the recesses of the device may not be visible to the user. A good example is provided by magnetic fluid rotary shaft seals. These seals and several other device applications make up the subject matter of this section.

A. Seals
The concept of a magnetic fluid shaft seal was developed about the time that ferrofluid became available (Rosensweig et al., 1968b). Sealing regions of differing pressure is now the most well developed of the proposed ferrohydrodynamic applications (Moskowitz, 1974). Figure 22 illustrates schematically how ferromagnetic liquid can be employed as a leak-proof dynamic seal between a rotary shaft and stationary
piece
Ferroiagnetic liquid
permeable
f Cylindrical per manen t ring magnet
FIG. 22. Two basic types of magnetic fluid rotary shaft seals: (a) nonmagnetic shaft, (b) magnetizable shaft. (After Rosensweig et al., 1968b.)
surroundings. The space between the shaft and stationary housing is loaded with ferromagnetic liquid held in place as a discrete ring(s) by the magnetic field. The design in Fig. 22a is adapted for the sealing of shafts that are nonmagnetic. External field generated by the permanent magnet emanates from a pole piece of the outer member and reenters another pole piece located on the same member. The magnetic field in the gap region is oriented tangential to the shaft surface. The alternate configuration of Fig. 22b is essentially that employed in commercially available ferrofluid seals. A magnetically permeable shaft and outer housing are used as part of a low-reluctance magnetic circuit containing an axially magnetized ring magnet mounted between stationary pole blocks. Focusing structures are employed to concentrate the field in a small annular volume, with the ferrofluid introduced into this region. In this configuration the magnetic field is oriented transversely across the gap. Using this arrangement, pressure differences up to about 10⁵ Pa can be supported across a single-stage seal. The design
shown in Fig. 22b is particularly adaptable to staging, whereby much larger pressure differences can be sustained (Rosensweig, 1971b, 1977). Multistage seals of this type have reached widespread commercial usage (see Rotary Seals Catalog Handbook, Ferrofluidics Corporation, Burlington, Massachusetts).

1. Principle of Magnetic Fluid Seals
To analyze seals utilizing the transverse orientation of field consider the sketch of Fig. 23 which illustrates a path through one liquid stage. Pressure 4 is greater than pressure 1 with the fluid displaced somewhat toward direction 1. The interface 4/3 is in a region where field is assumed uniform and
FIG.23. Relationship for pressure difference across one stage of a ferrofluid seal.
oriented tangential to the interface between ferrofluid and the surrounding nonmagnetic medium. Interface 2/1 is located in a relatively weak portion of the fringing field, and gravitational force is negligible. From the Bernoulli equation (65) applied between points 3 and 2 within the ferrofluid,

$$p_3^* - \mu_0(\bar{M}H)_3 = p_2^* - \mu_0(\bar{M}H)_2 \quad\text{or}\quad p_3^* - p_2^* = \mu_0[(\bar{M}H)_3 - (\bar{M}H)_2] \tag{141}$$

Since the normal component of magnetization is zero at both interfaces, the boundary condition (68) gives

$$p_3^* = p_4 \quad\text{and}\quad p_2^* = p_1 \tag{142}$$

Hence

$$p_4 - p_1 = \mu_0 \int_{H_2}^{H_3} M\,dH \tag{143}$$
The integral in this equation appears frequently in ferrohydrodynamics; Fig. 24 illustrates its meaning as an area under the magnetization curve. In a well-designed seal, the field H₂ is negligible compared to H₃, so the burst pressure of the static seal is given closely as

$$\Delta p = \mu_0 \bar{M}H \tag{144}$$

where M̄H is evaluated at the peak value of field (Rosensweig, 1971a).
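Equations (143) and (144) lend themselves to a numerical illustration. The Langevin magnetization curve and the field values below are assumptions for illustration only (the chapter's actual fluid data are not restated in this excerpt); the integral of Eq. (143) is taken by the trapezoidal rule:

```python
from math import pi, tanh

MU0 = 4 * pi * 1e-7  # vacuum permeability, T*m/A

def langevin_M(H, Ms=4.0e4, gamma=3.0e-5):
    """Assumed Langevin magnetization curve M(H) = Ms*(coth(a) - 1/a), a = gamma*H."""
    a = gamma * H
    if a == 0.0:
        return 0.0
    return Ms * (1.0 / tanh(a) - 1.0 / a)

def stage_pressure(H2, H3, n=2000):
    """Eq. (143): delta_p = mu0 * integral_{H2}^{H3} M dH, trapezoidal rule."""
    dH = (H3 - H2) / n
    s = 0.5 * (langevin_M(H2) + langevin_M(H3))
    s += sum(langevin_M(H2 + i * dH) for i in range(1, n))
    return MU0 * s * dH

H3 = 8.0e5                          # assumed peak gap field, A/m
dp_143 = stage_pressure(1.0e5, H3)  # finite H2 at the weak-field interface
dp_144 = stage_pressure(0.0, H3)    # Eq. (144) limit: H2 negligible
print(dp_143, dp_144)  # both a few times 1e4 Pa for these values
```

For these assumed values a single stage supports roughly 3 × 10⁴ Pa, and the H₂ correction is small when the weak-field interface sits in a fringing region, consistent with the text; the staged commercial seals discussed below multiply the single-stage capability by the number of stages.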
FIG. 24. The shaded area under the magnetization curve relates to pressure difference across a stage of a magnetic fluid seal.
The development above assumes that the ferrofluid is uniform in composition. Actually the particle concentration is greater in high-intensity portions of the field, leading to values of Δp that exceed the theoretical value of Eq. (144) in seals that have been idle for a period of time. Since viscosity increases rapidly as particle concentration increases [see Eq. (24) of Section I,E], a seal is stiff to rotate initially. Rotary motion of the seal shaft stirs the fluid, tending to equalize the particle concentration, reduce the torque, and
FIG.25. Experiment to determine static loading of a one-stage magnetic fluid (plug) seal. (After Perry and Jones, 1976.)
return the operating pressure capability to the value predicted by Eq. (144). The relationship of Eq. (144) was tested quantitatively by Perry and Jones (1976) utilizing a ferrofluid plug held magnetically by external pole pieces and contained in a vertical glass tube (see Fig. 25). Static loading of this seal was accomplished with a leg of immiscible liquid overlaying the plug. From Fig. 26 it is seen that calculated and measured static burst pressures are in very good agreement for various ferrofluid seals operated over a range of magnetic inductions.

[Fig. 26 data sets: hydrocarbon-base, ID = 1.5 mm; hydrocarbon-base, ID = 2.8 mm; water-base, ID = 1.5 mm.]
FIG.26. Calculated and measured static burst pressures of magnetic fluid seals are in very good agreement. (After Perry and Jones, 1976.)
2. Seal Applications

Bailey (1976) describes studies in the development of ferrohydrodynamic sealing undertaken for application to a 150-mm diameter feedthrough for a cryogenic liquid helium system. The rotating member in this case was driven up to 3000 rpm with cooling water continuously circulated near the seal to maintain the temperature at a low enough level to avoid evaporative loss of the ferrofluid carrier liquid. Performance and cost of ferrohydrodynamic seals were compared with other more conventional techniques in Bailey's excellent review article. A ferrohydrodynamic seal was also developed for a superconducting generator as a feedthrough to a low-temperature environment (Bailey, 1978). The 48-mm diameter seal was tested at about 3500 rpm for a period greater than a year. No sign of wear or any imminent failure was evident on final
inspection. General Electric Company's Neutron Devices Department at St. Petersburg, Florida has had seals running 24 hours per day, 7 days per week for over 4 years with no operational failures due to the seals. The seals operate at 200 rpm on 12.5-mm diameter shafts and seal against a pressure of 100 kPa on a resin degasser system (Perry, 1978). Levine (1977) reports the use of seals requiring high reliability on two space simulators for testing communications satellites. With 1600 hours running time accumulated in continuing tests, the seals had performed reliably and consistently. Ten stages of diester-based ferrofluid in 0.1-mm radial gaps around a 120-mm diameter shaft running at 3000 rpm were tested when sealing helium gas at 100 kPa differential pressure. No leak was detected using a mass spectrometer to search for helium leaks. This indicates that the leakage rate was below 6 x m³ per year and represents an improvement of many orders of magnitude over mechanical face seals when sealing gases (Bailey, 1978). Ferrofluid seals are suggested for long-term use in inertial energy storage wherein high-speed flywheels rotate in a vacuum enclosure (Rabenhorst, 1975). A patent advocates porous pole pieces to serve as reservoirs of fluid, but no working example is given (Miskolczy and Kaiser, 1973). Another patent (Hudgins, 1974) utilizes a ferrofluid seal combined with a labyrinth seal and a pressurized air cavity for use in gearbox transmission systems. The pressurized air cavity prevents the internal fluids from contacting the ferrofluid. In developing a high-power gas laser, NASA finds that the key to achieving a completely closed-cycle system without using makeup of the CO₂-helium-nitrogen gas mixture is a multistage ferrofluid seal surrounding the blower drive shaft (Lancanshire et al., 1977). The system operates over a pressure range of 13 to 106 kPa, coupled to a 187-kW motor.
Moskowitz and Ezekiel (1975) review successful commercial applications of vacuum sealing as well as the sealing of high differential pressure. The seals have also found wide application as exclusion seals preventing liquid, vapor, metallic, and nonmetallic contaminants from reaching machinery parts such as grinding spindles, textile wind-up heads, and digital disk drives. Thus, the disk in a computer magnetic disk drive whirls at high speed, with the read/write head floating on a cushion of air 2.5 μm above it. The magnetic fluid seals keep out contaminants such as 5-μm smoke particles or dust that can cause a "crash" with computer memory loss (Person, 1977). Evidence of technical seal activity in the Soviet Union is given by Avramchuk et al. (1975). The development in Japan of a magnetic fluid seal for use in the liquid helium transfer coupling of a superconducting generator is described by Akiyama et al. (1976). Rotary seals of ferrofluid in contact with water or other liquids generally
leak at modest rates of rotation. Calculation from Eq. (122) reveals this trend is consistent with a mechanism of leakage due to Kelvin-Helmholtz instability. The same calculation for sealing against gas yields stability to high rotational rates, again in accord with experience.

B. Bearings
Passive bearings based on magnetic fluid flotation phenomena produce hydrostatic levitation of movable members. Dynamic bearings using the magnetic retention of the fluids offer additional novel characteristics.

1. Review of the Phenomenology
Samuel Earnshaw as early as 1839 propounded the theorem that stable levitation of isolated collections of charges (or poles) is not possible by static fields [Jeans, 1948; also Stratton, 1941; a recent discussion of Earnshaw's theorem is given by Weinstock (1976)]. As one consequence, it is impossible to find "all-repulsion" combinations of magnets to float objects free of contact with any solid support. However, it is interesting to note that Braunbeck in 1938 deduced that diamagnetic materials and superconductors escape the restrictions underlying Earnshaw's theorem. These special materials may be successfully suspended, although the diamagnetics suffer from very low load support and superconductors require cryogenic refrigeration. More recently Rosensweig (1966a,b) discovered first the levitation of nonmagnetic objects immersed in magnetic fluids subjected to an applied magnetic field and then (Rosensweig, 1966c) the self-levitation of immersed permanent magnets when no external field is present. These phenomena are reviewed in the following:

a. Passive levitation of a nonmagnetic body. Consider a container of magnetic fluid as shown in Fig. 27a. Assume magnetic field and gravitational field are absent so the pressure is uniform at a constant value everywhere within the magnetic fluid region. Two opposed sources of magnetic field are brought up to the vicinity of the container as shown in Fig. 27b. If these sources have equal strength, the field is zero midway between and increases in intensity in every direction away from the midpoint. Since the magnetic fluid is attracted toward regions of higher field, the fluid is attracted away from the midpoint. However, the fluid is incompressible and fills the container, so the response it provides is an increase of pressure in directions away from the center. Next, the fate of a nonmagnetic object introduced into this environment may be considered. In Fig. 27c the object is located at the center point.
Since pressure forces are symmetrically distributed over its surface, the object
FIG. 27. Passive levitation of a nonmagnetic body in magnetized fluid. (a) Initial field-free state in absence of applied magnetic field, (b) magnetized state of fluid containing null point of field at center, (c) equilibrium position of the nonmagnetic object in the ferrofluid space, (d) restoring force arises when the levitated object is displaced from the equilibrium position. (From Rosensweig, 1978.)
attains a state of stable equilibrium. Displaced from the equilibrium position as shown in Fig. 27d, the object experiences unbalanced pressure forces that establish a restoring force. This is the phenomenon of passive levitation of a nonmagnetic body.

b. Self-levitation in magnetic fluid. In Fig. 28a the magnetic field again is absent and pressure is constant throughout the fluid. The field of a small permanent magnet, for example a disk magnetized from face to face, is shown in Fig. 28b. With this magnet immersed at the center point in the fluid as in Fig. 28c, the field is symmetrically disposed and pressure in the fluid, although altered by the field, is symmetrically distributed over the surfaces of the magnet. Accordingly, the magnet experiences an equilibrium of forces in this position. When the magnet is displaced from the center as shown in Fig. 28d, the field distribution no longer remains symmetric; consideration of the permeable path encountered by the magnetic flux readily leads to the conclusion that field is greater over that magnet surface facing away from the center. Again employing the notion that fluid pressure is greatest where the magnetic field is greatest, it is clear that the magnet is subjected to
FIG.28. Self-levitation in magnetic fluid. (From Rosensweig, 1978.)
restoring forces that will return it to the center. This is the phenomenon of self-levitation in magnetic fluid. Figure 29 illustrates a corollary to the two types of levitation introduced above, the mutual repulsion of a magnet and a nonmagnetic object when both are immersed in magnetic fluid in a region removed from any fluid
FIG.29. Generalization of the levitational phenomena recognizes the mutual repulsion of a magnet and a nonmagnetic object when both are immersed in ferrofluid. (From Rosensweig, 1978.)
boundary. This interaction is without analog in ordinary magnetostatics wherein both bodies must possess magnetic moments if there is to be a static interaction between them. In the present instance, it will be realized that the mutual force is not the result of a direct interaction between the two bodies but is due to magnetic fluid attracted into the space between these bodies. Nonetheless, the net effect is the mutual repulsion of the bodies.
2. Formulation of the Force on an Immersed Body; Levitation
Consider a body, either magnetic or nonmagnetic, with surface S immersed in magnetic fluid, as shown in Fig. 30. The net force acting on the
FIG. 30. Force on arbitrary (magnetized or unmagnetized) body immersed in magnetic fluid in presence (or absence) of an external source of magnetic field.
body is given generally by the expression

$$\mathbf{F}_m = \oint_S (\mathbf{T}\cdot\mathbf{n} - p\,\mathbf{n})\,dS \tag{145}$$
where n is the outward facing unit normal vector, and T is the magnetic fluid stress tensor having components given by Eq. (30).
$$\mathbf{T}\cdot\mathbf{n} = T_n\mathbf{n} + T_t\mathbf{t} \tag{146}$$

$$T_n = -\mu_0\int_0^H \left(\frac{\partial(Mv)}{\partial v}\right)_{H,T} dH - \frac{\mu_0 H^2}{2} + H_n B_n \tag{147a}$$

$$T_t = H_t B_n \tag{147b}$$
The expression for normal force may be simplified using the Bernoulli expression of Eq. (65) applied between a field-free region of the fluid where
pressure is p₀ and a point in the fluid near the surface of the body. Thus,

$$p = p_0 + \mu_0\int_0^H M\,dH \tag{149}$$
Substituting (149) and (147b) into (146) and the result into (145) gives a result (J. V. Byrne, private communication, 1978) which may be expressed as Eq. (150), where the constant term p₀ vanishes using the divergence theorem. Equation (150) allows force on an arbitrary immersed body to be computed from field solutions; an analogous expression in terms of the surface tractions may be written to determine torque. If the immersed body is nonmagnetic, the integral of Eq. (151) over a surface Sᵢ just inside the body disappears, since there is no magnetic force on the matter within the surface.
Subtracting (151) from (150), using [Bₙ] = 0 and [Hₜ] = 0, where brackets denote outside value minus inside value, proves that the tangential force is zero, hence that the surface force is purely normal; what remains may be expressed as follows after some reduction:

$$\mathbf{F}_m = -\mu_0\oint_S \left(\bar{M}H + \tfrac{1}{2}M_n^2\right)\mathbf{n}\,dS \tag{152}$$

The surface integral of Eq. (152) depends on the object's shape and size as well as the magnetic field variables. Present magnetic fluids acted upon by laboratory magnet sources are able to levitate, against the force of gravity, any nonmagnetic element in the periodic table. A condition for stable levitation at an interior point of the fluid space is now simply developed. In addition to Fₘ = 0 in Eq. (152) it is required, at the equilibrium point, that a positive restoring force accompany any small displacement. Since (M̄H + Mₙ²/2) increases asymptotically with H, it follows that to levitate, a magnetic field must possess a local minimum
of field magnitude, i.e., for any displacement,

$$\delta H > 0 \tag{153}$$

As a special case, the magnetic force on a nonmagnetic immersed object can be given explicit expression in the limit of intense applied field, i.e.,

$$(M_n^2/2)/\bar{M}H \ll 1 \tag{154}$$

Then with uniform grad pₘ, where pₘ = μ₀M̄H is the fluid magnetic pressure, the magnetic force from (152) using the divergence theorem is

$$\mathbf{F}_m = -\oint_S p_m\,\mathbf{n}\,dS = -\int_V \mathrm{grad}\,p_m\,dV = -V\mu_0 M\,\mathrm{grad}\,H \tag{155}$$
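Equation (155) permits a quick feasibility check of the claim that laboratory sources can levitate any nonmagnetic element. A minimal sketch, with an assumed near-saturation magnetization M and an assumed laboratory field gradient (neither value is taken from the text):

```python
from math import pi

MU0 = 4 * pi * 1e-7  # vacuum permeability, T*m/A
G = 9.81             # gravitational acceleration, m/s^2

def can_levitate(rho_obj, M, gradH):
    """Eq. (155): the upward magnetic force per unit volume on a nonmagnetic
    body is mu0*M*gradH; levitation requires it to exceed the weight density.
    (Ordinary fluid buoyancy is neglected, which is conservative.)"""
    return MU0 * M * gradH > rho_obj * G

# Assumed: M = 4e4 A/m (near saturation), gradH = 1e7 A/m^2 (lab magnet).
print(can_levitate(rho_obj=19_300, M=4.0e4, gradH=1.0e7))  # tungsten: True
```

For these assumed values μ₀M grad H ≈ 5 × 10⁵ N/m³, which exceeds the ≈1.9 × 10⁵ N/m³ weight density of tungsten, the densest member of the flotation series discussed in the measurements below.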
Due to the minus sign in (155) [and (152)] the force is equal and opposite to the magnetic body force on an equivalent volume of fluid. Repulsion and stable levitation owe their existence to the presence of the minus sign. Expressions for levitation force on spheres and ellipsoids in a linearly polarizable medium are given by Jones and Bliss (1977). Expressions for levitation force in the case of nearly saturated magnetic fluid are developed for objects shaped as cylinders, spheres, and plates by Curtis (1974).

3. Measurements
A ceramic disk magnet having density 4700 kg/m³ with direction of magnetization perpendicular to the faces self-levitated about 5 mm off the bottom of the vessel containing magnetic fluid of density 1220 kg/m³. The force of repulsion between magnet and the bottom surface of the container, measured for various displacements of the magnet from its equilibrium position, gave the results shown in Fig. 31 (Rosensweig, 1966b). The theoretical curve results from a calculation by the method of images and gives excellent agreement with the data. The flotation of nonmagnetic objects was investigated using a sink/float technique in a field gradient that could be varied. Spheres of glass, ceramic, and coral gave the data of Fig. 32. The ordinate of the graph represents buoyant weight of the immersed object and the abscissa approximates the theoretical force. It is seen that the experimental values fall reasonably close to the parity line. However, it would be desirable to determine the force relationship to a greater precision. Another investigation (Kaiser and Miskolczy, 1970b) demonstrated the flotation of dense solids in the series from diamond (sp. gr. = 3.5), zirconia, niobium, copper, molybdenum, silver, tantalum, and tungsten (sp. gr. = 19.3).
FIG. 31. Levitational force of a magnetized disk immersed in magnetizable fluid versus distance from the horizontal lower plane of the fluid space. (After Rosensweig, 1966b.)
FIG.32. Flotation of relatively dense, nonmagnetic materials in magnetizable fluid subjected to an applied magnetic field gradient.
4. Computations for a Model Bearing (Rosensweig et al., 1968a)
A model for analysis of a bearing is illustrated in Fig. 33. The bearing has two-dimensional geometry with magnetic poles distributed along the top and bottom surfaces of the levitated member. The surface density of poles is specified as sinusoidal in the x direction. The gaps of thickness δ between an upper wall at surface S₂ and a lower wall at surface S₁ are filled with
magnetic fluid. The float is displaced by an amount Δ, setting up a restoring force per unit area F_b. The distribution of magnetic field in the spaces occupied by magnet, fluid gaps, and the nonmagnetic wall is solved exactly, on the assumption of uniform fluid magnetic permeability μ. Inserting the field component values into the stress tensor of Eq. (30) and summing contributions over surfaces S₁ and S₂ gives the normalized force of Eqs. (157a)-(157c), where k = 2π/λ, λ is the wavelength of the repetitive pole pattern, and M_m is the amplitude of the magnet's magnetization. It can be shown that peak field at the surface in the absence of magnetic fluid is M_m/2. The selection of surfaces S₁ and S₂ adjacent to the nonmagnetic walls is convenient for evaluating the force. However, a property of the stress tensor approach is that other choices must lead to the same result. The relationship for net force on the floated member as predicted by the foregoing analysis is shown in the graph of Fig. 34 in terms of bearing stiffness. For a magnetic fluid having a given permeability, the bearing stiffness is maximized for a particular choice of pole spacing relative to gap length. The optimum pole face spacing falls in the range of kδ between
FIG.34. Prediction of model bearing stiffness versus pole spacing with fluid relative permeability as parameter. (After Rosensweig et al., 1968a.)
about 0.1 and 0.5, or λ/δ in the range from 60 down to 12. It is fortunate that the magnet dimensions are much larger than the dimensions of the gap, as this makes fabrication of optimized bearings not too difficult. Perhaps opposite to intuition, high permeability of the ferrofluid reduces the bearing stiffness in some cases; the crossover as kδ increases along the curve for μᵣ = 10 is an example.
5. Reductions to Practice

A magnetic fluid spindle bearing was developed incorporating principles described above (Rosensweig, 1973, 1978); it is shown in the photograph of Fig. 35. The cylindrical outer race is nonmagnetic, while samarium cobalt ring magnets of alternating magnetization stacked on a rotatable shaft make up the inner race. With magnetic fluid in the gap, the inner and outer members float apart from each other, completely free of mechanical contact with each other. The levitational support is totally passive, requiring no energy input. Starting friction is astonishingly low (or perhaps nonexistent); the outer member rotates under its own minute imbalance, oscillating as a damped pendulum before coming to rest. The device operates silently and was driven to 10,000 rpm with a fiber loop as a prototype for use in textile machinery. Unlike ball bearing spindles, there are no parts that wear or become noisy in operation.
FIG.35. A magnetic fluid spindle bearing. The outer race is supported purely by magnetic fluid forces, with no mechanical contact of relatively movable members and no source of energy.
A cylindrical slide bearing utilizing the magnetic fluid repulsion idea furnished the basis for an integrating accelerometer to sense increments of linear velocity by inertial response (Rosensweig et al., 1968a). The application required a very low order of sticking friction, or stiction as it is termed by instrumentation scientists. Experimentally the device was responsive to
FIG. 36. Magnetic fluid levitational force centers the voice coil in high-fidelity loudspeakers. The captive fluid also transfers heat away from the voice coil and dampens vibrations. (After Teledyne Acoustic Research, Bulletin 700056, Norwood, Massachusetts.)
an input acceleration of less than 10⁻⁵ g. This is the tilt equivalent to the angle subtended by a 10-mm coin at a distance of 1 km. A recent patent (Hunter and Little, 1977) claims novelty in the use of closed-loop operation with electromagnet drivers to measure acceleration in such applications as inertial platform leveling and thrust termination. Another study evaluates the sensitivity of a floated magnet as a material sensor (Cook, 1972). A very different type of application for the bearing principle is illustrated in the sketch of Fig. 36. Magnetic fluid centers the voice coil of an otherwise conventional loudspeaker, replacing the mechanical spiders ordinarily used for
FIG.37. Components of an inertial damper based on levitation of a magnetic mass in magnetic fluid. (After Lerro, 1977.)
this purpose. As a key benefit, the heat generated in the voice coil is effectively transferred through the fluid to the surrounding structure. This innovation increases the amplifier power the coils can accommodate, hence the sound level the speaker produces. The application was recently adopted into commercial practice (see, for example, Teledyne Acoustic Research, Bulletin 700056, Norwood, Massachusetts). A new inertial damper that mounts on the end of a stepping motor shaft or similar device has been developed by Ferrofluidics Corporation (Lerro, 1977). A schematic diagram of the damper is shown in Fig. 37. The ferrofluid acts in a dual function: it provides bearing support for the floated seismic mass and acts as an energy absorber that dissipates energy through viscous shear in the damper. The unit is lighter and less expensive than conventional devices using mechanical bearings.

6. Other Developments in Bearings
The one-fluid hydrostatic bearings already discussed and other principles leading to magnetic fluid bearings are summarized in Table III. The hydrostatic magnetic fluid bearings eliminate wear, noise, starting torque, power input, and lubricant flow. The hydrodynamic-type magnetic fluid bearings achieve an increase in load carrying ability relative to the

TABLE III
MAGNETIC FLUID BEARING CONFIGURATIONS

Hydrostatic, one fluid. Mutual repulsion of magnet and nonmagnetic member. [Rosensweig (1978)]
Hydrostatic, two fluids. Ferrofluid seals periphery of nonmagnetic fluid that carries the load. [Rabenhorst (1972); Rosensweig (1973, 1974); Schmieder (1972)]
Hydrodynamic, one fluid. Ferrofluid held in working gap by magnetic attraction and pumping action. Fluid develops load support from dynamic motion. [Styles et al. (1973)]
Hydrodynamic, two fluids. Ferrofluid seals nonmagnetic fluid in which dynamic support force develops as a result of bearing rotational motion. [Styles (1969)]
hydrostatic type while sacrificing low starting torque and wear at start-up or shutdown. Demonstrated state-of-art monofluid bearings of the hydrostatic type support 1 kPa with an ultimate potential capability in excess of 650 kPa, assuming efficient magnets and highly magnetic ferrofluid equivalent to 50 vol % iron in colloidal suspension. A bifluid hydrostatic bearing supported 25 kPa using eight sealed stages, with conceivable potential for approaching 6500 kPa. It is hoped more investigators will turn their attention to the problems and opportunities offered by this field.
C. Dampers

Dampers utilize the viscous property of magnetic fluids to dissipate kinetic energy of unwanted motion or oscillations to thermal energy. A spectrum of applications has been studied, ranging from delicate instrumentation to mass transportation vehicles. There appears to be significant opportunity for further developments in this area. A survey of damper applications is given in the following:

1. Satellite Damper
This ferrofluid viscous damper was developed by Avco Corporation for application on NASA's Radio Astronomy Explorer satellite. The damper (Fig. 38) consists of a small quantity of magnetic fluid hermetically sealed in
FIG. 38. Viscous damper designed for service in a radio astronomy Earth satellite; the period of oscillation is approximately 90 min. (After Coulombre et al., 1967.)
a vane mounted on the central body of the satellite. The ferrofluid is acted upon by a permanent magnet mounted on a long damper boom. Relative angular motion of the damper boom with respect to the satellite central body produces magnetic force on the ferrofluid, causing the fluid to dissipate energy by flowing through a constriction in the vane (Coulombre et al., 1967). Operation of the device resulted in smooth fluid damping with no residual oscillations, in contrast to devices wherein mechanical friction is present.

2. Stepping Motor Damper
A stepping motor provides discontinuous angular positioning of an electrically torqued rotor. Due to the sudden on-off motion required, settling time is often a problem. Ferrofluid between the stator and rotor of a stepper motor reduces settling time by as much as a factor of three or four and thus provides damping without addition of any hardware to the basic motor. The ferrofluid is retained within the gap between the moving and stationary parts by the magnetic field that already is present there. The ferrofluid damping concept has also been applied to D'Arsonval meter movements and to the damping of flappers using squeeze-film viscous motion.

3. Instrument Damper

A wave and tide gauge developed by Bass Engineering Co. employs a Bourdon gauge coupled by a linkage to an optical readout. The linkage cannot be supported at all points, hence is subject to oscillations that interfere with the readings. A ferrofluid damper module was developed with viscous fluid held in place magnetically, allowing the gauge to operate at any angle of tilt. The principle is widely applicable.

4. Electromagnetic Transportation
Several national groups are investigating the feasibility of concepts for electromagnetic flight at ground level. In the MIT concept a vehicle carrying 200 passengers is levitated with 300 mm clearance above a trough-shaped guideway using superconductive magnets onboard the vehicle. The vehicle levitates at speeds above 15 km/hr, and with modifications of a partially evacuated enclosure can travel at speeds above 200 km/hr. Means for damping roll, pitch, and yaw oscillations must be provided, and one proposal of the developers is to utilize magnetic fluid in a manner similar to the satellite damper described previously. Magnetic field is already present as part of the system to provide coupling to the fluid. At present, however, electrical feedback control methods are dominant.
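A feature common to every damper in this section is energy dissipation by viscous shear of a thin, magnetically retained ferrofluid film. A minimal plane-Couette sketch of the resulting damping coefficient, with wholly assumed viscosity and geometry (none of these values appear in the text):

```python
def film_damping_coefficient(eta, area, gap):
    """Plane-Couette estimate: shear stress = eta*v/gap acting over the
    wetted area, so the retarding force is F = c*v with c = eta*area/gap."""
    return eta * area / gap

# Assumed values: 0.1 Pa*s ferrofluid, 5 cm^2 wetted area, 0.2 mm film.
c = film_damping_coefficient(eta=0.1, area=5.0e-4, gap=2.0e-4)
print(c)  # damping coefficient in N*s/m
```

The estimate shows why such small fluid volumes suffice: the coefficient scales inversely with film thickness, and magnetic retention allows the film to be made very thin without risk of loss.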
D. Transducers

Transducers convert an input of one physical sort to an output of another sort. Ferrofluids have furnished the basis for transducers of many kinds, and a survey of devices is given below. Opportunity is ample for the conceptualization and development of additional devices to perform specific functions.

1. Acoustic Transducers
Cary and Fenlon (1969) assess the suitability of ferrofluids for acoustic transducer and receiver applications. The piston motion of a ferrofluid induced by an applied field gradient provides a feasible alternative to conventional magnetostrictive transducers. For an applied static induction field of 0.4 T and a gradient of 10 T/m, it is possible to achieve an overall efficiency that is greater than provided by ferromagnetic solids. Ferrofluids also appear to provide a desirable alternative for pressure-sensing applications in severe environments, for example detonations, where piezoelectric and pyroelectric materials suffer fatigue and failure. These workers also concluded that ferrofluids offer the prospect of obtaining broadband frequency response when radiating or receiving in liquid media.

2. Pressure Generator

A simple method of generating faithful sinusoidal pressure variations utilizes the magnetic fluid's property of linear magnetization at small applied field intensities (Hok, 1976). A schematic diagram of the pressure generator is shown in Fig. 39. Using a drop of magnetic fluid instead of a diaphragm, the pressure chamber can be assembled from a few pieces of tubing and an O-ring, and this part does not have to be fixed to the electromagnetic actuator. A working device was found useful over at least the frequency range 0 to 100 Hz in the dynamic calibration of manometers for the clinical environment.
FIG. 39. Fluidic pressure signal generator using magnetic liquid. (After Hok, 1976.)
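The driving pressure in both the acoustic transducer and the pressure generator follows from the magnetic body force on the fluid column. A minimal numerical sketch (the 10 T/m gradient is the value cited by Cary and Fenlon; the function names and the assumed saturation magnetization of 4.0e4 A/m are illustrative, not from the text):

```python
def magnetic_force_density(M, dB_dz):
    """Magnetic body force per unit volume, f = M * dB/dz (N/m^3), for a
    fluid magnetized to M (A/m) in an applied-field gradient dB/dz (T/m);
    valid when the fluid is near saturation so M is essentially constant."""
    return M * dB_dz

def piston_pressure(M, dB_dz, column_length):
    """Pressure (Pa) developed across a ferrofluid column of the given
    length when the whole column sits in the gradient."""
    return magnetic_force_density(M, dB_dz) * column_length

# Assumed saturation magnetization 4.0e4 A/m (mu0*M ~ 0.05 T) in the
# 10 T/m gradient cited by Cary and Fenlon.
f = magnetic_force_density(4.0e4, 10.0)    # 4.0e5 N/m^3
dp = piston_pressure(4.0e4, 10.0, 0.01)    # 4.0e3 Pa across a 10-mm column
```

At these assumed values the body force is roughly forty times the weight density of water, which is what makes the ferrofluid piston competitive with solid magnetostrictive drivers.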
RONALD E. ROSENSWEIG
3. Level Detector
An elegantly simple device proposed for measuring angle of tilt consists of a hollow canister partially filled with a magnetic fluid (Stripling et al., 1974). Using magnetic induction pickoff, the device provides readout signals to remote areas in a manner not possible with conventional surveying and leveling devices. The device eliminates the need for an air supply required by air-bearing level indicators.

4. Current Detectors
An indicator of current flow described by Sargent (1976) employs the magnetic field of the current to draw ferrofluid into a chamber having a transparent face. With the current off, the ferrofluid flows out of the chamber by capillary action. The device furnishes on-off electrovisual indications and has a close relationship to the display devices described in Section III,E,1 on graphics. Bibik (1975) describes an ammeter device in which the current produces a field that is focused in a straight, narrow gap between permeable pole pieces. A column of magnetic fluid is drawn to various lengths along the gap, which is provided with a calibrated scale along its length.

5. Accelerometers

Accelerometers employing the levitation of an inertial proof mass in magnetized fluid are described previously in the section on bearings. A sketch of an accelerometer is shown in Fig. 40.

FIG. 40. Integrating accelerometer based on levitation of an inertial proof mass in magnetic fluid. (Fluid omitted for clarity.) (After Rosensweig et al., 1966.)

A prototype integrating accelerometer was built and tested. This device performed up to its design expectations, with a lower limit to detectable input measured as less than 10⁻⁵ g. The scale factor of the instrument is sensitive to changes in fluid viscosity induced by particle concentration gradients and temperature change. These disadvantages are mitigated by operation of the device as an accelerometer rather than an incremental velocity meter, using forcing coils to maintain the proof mass at a central position. Another principle for an accelerometer is suggested in which magnetic particles of a ferrofluid are contained in a hollow chamber completely filled with the liquid suspension (Schmieder, 1970). An acceleration of the case generates a signal due to relative motion between the particles and a set of sensing coils. No analysis of performance was given.

6. Liquid Level Sensor

Ferrofluid is specified in a liquid level sensor consisting of a magnetic float surrounding a guide cylinder (Carrico, 1976). The ferrofluid is introduced into the gap between the movable members in order to exclude the process fluid, giving a wider range of temperature operation.
E. Graphics

In the field of graphics, ferrofluids are being considered for display devices and as a means for printing hard copy. The important properties of the fluid utilized in these devices include magnetic positioning, magnetic sensing, capillary latching, and deformability. The opaque optical property is important to all these applications.

1. Displays
An early patent for displays specifies the use of an immiscible, transparent fluid and an opaque magnetic fluid of about the same density enclosed in a flat container having a transparent face (Rosensweig and Resnick, 1972). The transparent fluid preferentially wets the wall, displacing the opaque fluid when the fluids are moved. The magnetic fluid is positioned with magnets to mask and unmask alphanumerics, or the fluid itself may be shaped into a predetermined pattern. A fascinating property of magnetic fluid is revealed when the flat container of the two fluids is subjected to a uniform magnetic field oriented normal to the flat surface (Romankiw et al., 1975). As shown in Fig. 41 the fluid collects into an intricate maze, a liquid magnetic labyrinth similar to that observed in a demagnetized bubble platelet.

FIG. 41. Liquid magnetic labyrinth. Water-base ferrofluid (μ0M = 0.02 T at saturation) together with kerosene transparent fluid contained between glass plates with spacing of 400 μm and subjected to 0.01-T induction field. (See Romankiw et al., 1975.)

According to Romankiw et al. (1975) the stable configuration is determined by minimizing the total energy

    E_total = E_interfacial + E_demag + E_applied    (158)

where

    E_interfacial = (γ_wall)(wall length)(wall height)
    E_demag = -(1/2) μ0 ∫ M H_demag dV
    E_applied = -μ0 ∫ M H_applied dV
Thus, although the ferrofluid has no spontaneous magnetization or anisotropy, the application of a constant bias field perpendicular to the plates induces a moment in the ferrofluid. Since the relative susceptibility of ferrofluid is low, typically less than two, the magnetization in the magnetic fluid is reasonably uniform, so that the general form of these energy relationships is the same as for magnetic bubble domains. Indeed, at higher magnetic fields, liquid magnetic bubbles are formed; that is, one of the stable configurations is the cylindrical domain appearing as a bubble or a hole when viewed in either reflected or transmitted light.

An analogy between liquid magnetic bubbles and magnetic bubble domains is more than superficial (Romankiw et al., 1975). Many of the general properties are similar, such as mutual repulsion between the cylindrical domains, attraction to permeable alloy overlays, damping of bubble motion, and the relationship of optimum cylindrical domain diameter to the height of the cylinder and overlay thickness. The size of the liquid magnetic bubbles is ideal for direct use as picture elements. A display using this technology would utilize an array of shift registers, moving the generated elements into the display from behind a shadow mask that can be etched onto the cover plate. Uniform illumination behind the plates is sufficient since the contrast is 100%.

Olah (1975) developed magnetic fluid display devices requiring no external force or energy to maintain the readout in a preset condition. The device can be configured as a pattern with seven compartments arranged in the form of a numeral 8 (see Figs. 42 and 43). By selectively filling the compartments with opaque liquid, the numerals from 0 to 9 can be formed, similar to the more familiar liquid crystal displays, light-emitting displays, and the like.
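The near-uniformity of the induced magnetization can be checked with the standard demagnetization relation for a flat layer; the susceptibility and field values below are assumed for illustration, not taken from Romankiw et al.:

```python
MU0 = 4e-7 * 3.141592653589793  # permeability of free space, H/m

def induced_magnetization(chi, H_applied, N=1.0):
    """Magnetization M = chi*H_a/(1 + N*chi) induced in a flat layer by a
    field applied normal to the plates; N is the demagnetization
    coefficient (N = 1 for a thin layer magnetized across its thickness)."""
    return chi * H_applied / (1.0 + N * chi)

# Assumed values: chi = 1.5 (under the "less than two" bound stated in the
# text) and the 0.01-T applied induction of Fig. 41.
H_a = 0.01 / MU0                       # ~7.96e3 A/m
M = induced_magnetization(1.5, H_a)    # ~4.77e3 A/m
```

Because χ is less than two, the denominator 1 + Nχ varies little over the labyrinth, so M is nearly uniform and the bubble-domain energy analysis carries over.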
Cavities are provided in a housing of two joined blocks permitting fluid to circulate from one cavity to another in response to a pulse of magnetic field. The cavities are filled with two immiscible fluids, one of which is ferrofluid as specified by Rosensweig and Resnick (1972). The cavities are geometrically configured to provide surface tension forces that latch the display in a given state. Units with character size of a few millimeters have been produced as prototypes for use in electronic calculators and related machines. Sensing the impedance of a driver coil can determine the state of the element.

FIG. 42. Magnetic fluid numeric display based on remote positioning of captive fluids using a pulse of current. (After Olah, 1975.)

FIG. 43. Cross section of the display device illustrating the cavities and passage configuration leading to latching based on fluid surface-energy effect. (After Olah, 1975.)
Long-term compatibility and stability of the contacting liquid phases are a key requisite in these graphics applications. More research is needed in defining and controlling the state of the interfacial surface separating the phases. Each phase is a complex, multispecies solution containing surface-active molecules and other constituents. The influence of fluid motion and fluid properties on emulsification may play an important role.
2. Printing

In data processing equipment there is a need for an output printer that is high speed and quiet in handling graphical and pictorial data. Print speeds in excess of 5000 lines per minute are desired, as compared to the 2000 lines per minute available with mechanical impact printers. Until recently, ink jets considered for this application were restricted to the use of electrostatic principles. Electrostatic technology has some technical problems with the use of high-voltage electronics and the interaction of charged droplets during transit that motivate the investigation of magnetic systems. Fan (1975; see also Fan and Toupin, 1974) discusses the development of a magnetic ink jet printer employing ferrofluid in which a uniform stream of ferrofluid droplets is produced by introducing periodic disturbances along a flowing jet. The droplets are deflected by passing through a magnetic field gradient of a transducer coil. To eliminate interaction between successive droplets, the path length in the transducer is made shorter than the separation distance between the droplets. Fan's process, illustrated in Fig. 44, introduces a deflector as an additional element to select droplets generated as a steady stream. Directionally, the response time is expected to be faster than in the earlier system suggested by Johnson (1968).

FIG. 44. Magnetic fluid ink jet printing. (After Fan, 1975; see also Fan and Toupin, 1974.)
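The droplet-selection geometry can be expressed as a simple spacing criterion; the jet velocity and excitation frequency below are assumed for illustration, not values from Fan:

```python
def droplet_spacing(jet_velocity, excitation_freq):
    """Center-to-center droplet spacing (m) for a jet broken up by a
    periodic disturbance: one droplet forms per excitation cycle."""
    return jet_velocity / excitation_freq

def independent_deflection(transducer_length, spacing):
    """Fan's criterion: successive droplets do not interact in the
    deflector when the field region is shorter than the spacing."""
    return transducer_length < spacing

# Assumed operating point: 20 m/s jet modulated at 100 kHz.
s = droplet_spacing(20.0, 1.0e5)        # 2.0e-4 m = 0.2 mm between drops
ok = independent_deflection(1.0e-4, s)  # True for a 0.1-mm transducer
```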
F. Other

A few other diverse device applications for ferrofluids are indicated in this section, including use as a mechanical actuator, as a sorter in the production of semiconductor chips, and as a superior means for magnetic inspection of metallurgical alloy structure.

1. Actuators
An actuator that converts electrical energy to mechanical motion and work is the subject of a patent application (Sabelman, 1972), in which ferrofluid and a coil are contained within an elastomeric capsule. Shown in Fig. 45, energization of the coil by flow of current distorts the capsule to give, for example, radial expansion and axial contraction. The distortion is due to redistribution of the ferrofluid under the influence of the magnetic field. The uses for this device appear to be very broad, with a particular application as an artificial muscle for prosthetic devices illustrated in the sketch of Fig. 46. This actuator has an advantage over conventional solenoids employing a movable core, since sliding parts that wear are eliminated and the force at full stroke may be maximized.

FIG. 45. Magnetic fluid deforms shape of elastomeric capsule to provide actuating force. (After Sabelman, 1972.)

FIG. 46. Ferrofluid actuator used as an artificial muscle. (After Sabelman, 1972.)

A recent patent describes a magnetic fluid-actuated control valve having utility in controlling flow from a pressurized reservoir implanted in the body, e.g., artificial pancreas, sphincter for bladder control, or other orthotic devices (Goldstein, 1977).

2. Chip Sorter

A complete circuit for a quartz electronic watch may be about 2.5 x mm thick, formed by the hundreds on a 50-mm silicon wafer. Each circuit may have up to 10³ transistors, and each transistor may have to be tested five or six times. This precision job is done by needle probes that hover over the circuit and imprint each defective circuit with a magnetic fluid. Then magnets rather than hand-held tweezers lift out the imperfect circuits (Broy, 1972). The usual coarse suspension of micron-size particles such as found in magnetic inks is conventionally used in this application, and it seems likely that ferrofluids (particle size ~10 nm) could prevent problems associated with clogging of the quills.
3. Observing Ferromagnetic Microstructures
Gray (1972) has developed a technique using ferrofluid that improves the resolution and magnification of the Bitter technique by tenfold in microscopic observation of magnetic patterns on ferromagnetic alloy microstructures. Experimental apparatus for the magnetic examination of specimens using ferrofluid is shown in the sketch of Fig. 47; the photograph of Fig. 48 illustrates the procedure for applying the ferrofluid to the sample. Previous broad use of the Bitter technique was limited because of the difficulty in preparing fresh colloid that did not agglomerate, whereas the ferrofluid has indefinitely long shelf life. In addition, the ferrofluid eliminates undesirable chemical etching of the sample surface caused by the old preparations. Gray shows photomicrographs of delta ferrite domain patterns; the technique is useful in the laboratory and may be of value in locating precursors to failure in weldments of structures.

FIG. 47. Magnetic etching apparatus for observing ferromagnetic microstructures. (After Gray, 1972.)
FIG. 48. Technique for applying magnetic fluid to metallurgical sample. (From Gray, 1972.)
In magnetic bubble garnet films, ion implantation can produce a layer with planar magnetization near the surface of the film. The first direct detection of the layer by observation of planar domains associated with bubbles in the underlying garnet was accomplished by the Bitter pattern method using a ferrofluid (Wolfe and North, 1974). The method uses a thin layer of the fluid pulled by capillary action into the space between a glass slide and the sample. The magnetized particles of the ferrofluid interact with local magnetic fields of the garnet domains which are made visible by transmitted unpolarized light.
IV. PROCESSES BASED ON MAGNETIC FLUIDS

The magnetic fluid in a process application contacts other streams of matter or energy undergoing physical, or possibly chemical, change. Whereas device applications sometimes utilize but a minute quantity of magnetic fluid, the typical process application employs the fluid in volume quantity. As the first example, sink/float separation processes are discussed; they are sometimes referred to as magnetohydrostatic separations.
A. Magnetohydrostatic Separation
One of the most attractive large-scale applications of magnetic fluids involves their capability to separate materials and minerals. Rather than depend on the inherent magnetism of the particles to be separated, a ferrofluid technique in which the medium itself is magnetized separates nonmagnetic particles having a broad density range. Central to the concept of magnetohydrostatic separation is the magnetic levitation force discussed in detail in Section III,B on bearings. The levitation force is exerted on a body immersed in a magnetic fluid placed in an inhomogeneous magnetic field. As an approximation, the apparent density ρa of the magnetic fluid is given by the expression

    ρa = ρt - μ0(M/g)(dH/dz)    (159)

in which ρt is the true density, g is the gravitational constant, z is vertical distance in the upward direction, and dH/dz is the field gradient. With H decreasing in the direction of positive z, the sign of dH/dz is negative, so the apparent density exceeds the true density. Using available ferrofluids and state-of-the-art electromagnets, it is possible to float any known material that is less magnetic than the ferrofluid. The technique has advantage compared to use of conventional heavy liquids such as bromoform, methylene iodide, and thallium formate aqueous solutions, and slurries such as ferrosilicon or magnetite in water. These media provide a limited range of densities (sp. gr. up to 5.0), and the halogenated organics and the heavy salt solutions are toxic.

A system for continuous separation of solid feed is based on a series of patents assigned to Avco Corporation (Rosensweig, 1969; Kaiser, 1969; Kaiser et al., 1976). A module (see Fig. 49) was constructed to separate shredded automobile nonferrous scrap metal and also many types of industrial scrap metals. For a throughput capacity of 1 metric ton of mixed metals per hour the power requirement is about 40 kW, with cooling water consumption of 400 liters per hour. Greater hourly throughput is possible in larger systems. Standard degreasing equipment may be used to clean the separated solids and recover ferrofluid for recycle to the process. Development of magnetohydrostatic separation along closely similar lines has proceeded in Japan at Hitachi Ltd. (Nogita et al., 1977). A prototype separation system with a throughput capacity of 0.5 metric ton per hour was used to separate components of discarded automobiles and household electrical appliances. Nonmagnetic metals such as aluminum, zinc, and copper were recovered at a yield of 80% and a purity of 90%. The solids ranged in size from 6 to 30 mm, and the resolution is stated as 0.3 specific gravity units.
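Equation (159) makes the sink/float principle easy to quantify. The sketch below uses assumed fluid properties and field gradient (not figures from the text) to show that even a dense metal floats:

```python
MU0 = 4e-7 * 3.141592653589793  # permeability of free space, H/m
G = 9.8                         # m/s^2, the value used in the chapter

def apparent_density(rho_true, M, dH_dz):
    """Apparent density of a magnetic fluid in a vertical field gradient,
    Eq. (159): rho_a = rho_t - mu0*(M/g)*(dH/dz). A negative dH/dz
    (field decreasing upward) raises the apparent density."""
    return rho_true - MU0 * (M / G) * dH_dz

# Assumed illustrative values: rho_t = 1200 kg/m^3, M = 3.0e4 A/m, and a
# field decreasing upward with dH/dz = -1.0e7 A/m^2 (about -12.6 T/m in
# mu0*H terms).
rho_a = apparent_density(1200.0, 3.0e4, -1.0e7)   # tens of thousands kg/m^3
# A copper fragment (8900 kg/m^3) floats whenever rho_a exceeds its density.
floats = rho_a > 8900.0
```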
FIG. 49. Material handling system in the continuous sink/float separation process using magnetic fluid. (From Metal Separation System Brochure, Avco Corporation, Lowell, Massachusetts.)
In a different configuration developed by Khalafalla and Reimers (1973b), the magnetic displacing force is directed in a horizontal direction, causing particles of various densities to follow different trajectories in the fluid under the influence of the vertical gravitational force. An advantage is that a weaker, hence cheaper, magnetic fluid may be used and the applied magnetic field intensity can be less. Since the process is dynamic, it likely suffers some loss of resolution due to the influence of particle shape on the fluid mechanical drag force. Aluminum-copper-zinc mixtures from solid waste incineration were separated in bench scale tests using a magnetic induction of 0.24 T and a magnetic fluid with saturation induction of 0.0215 T.

Study of magnetohydrostatic methods of mineral separation developed in the Soviet Union using aqueous solutions of paramagnetic salts is described in a monograph of Andres (1976a,b). For Ho(NO₃)₃ at a solution density of 1930 kg/m³ the cgs magnetic susceptibility is 233 × 10⁻⁶ emu/cm³; the corresponding magnetization in an applied field of 1 T is μ0M of 0.00293 T (29.3 G). The paramagnetic solutions are optically transparent and permit striking visual demonstrations of separations when an intense field gradient is applied to a vial containing solid pebbles of different densities; glass, pyrite, galena, and other substances may be levitated in this manner. Additional studies reported by Zimmels and colleagues (1977) calculate the distribution of force due to pole pieces of particular shapes.
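The quoted susceptibility and magnetization for the holmium nitrate solution are mutually consistent, as a short cgs-to-SI conversion shows (the function name is illustrative):

```python
import math

def cgs_chi_to_mu0M(chi_cgs, B_applied_tesla):
    """Return mu0*M in tesla for a paramagnetic solution with cgs volume
    susceptibility chi_cgs (emu/cm^3) in an applied induction B (tesla).
    In cgs units H(Oe) = 1e4 * B(T), the magnetization contributes
    4*pi*M = 4*pi*chi*H gauss to the induction, and 1 gauss = 1e-4 T."""
    H_oersted = 1.0e4 * B_applied_tesla
    four_pi_M_gauss = 4.0 * math.pi * chi_cgs * H_oersted
    return four_pi_M_gauss * 1.0e-4

mu0M = cgs_chi_to_mu0M(233e-6, 1.0)   # ~0.00293 T, i.e., 29.3 G
```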
B. Liquid/Liquid Separations

In liquid/liquid separation systems, magnetic fluid is mixed with oil-contaminated water. This renders the oil phase magnetic, allowing a physical separation of the oil and water in a magnetic separator. Specific applications include removing suspended oil from shipboard ballast and bilge waters, removing lubricating oils and cutting oils from factory wastewaters, and cleanup of oily wastewater discharged from rolling and pickling processes in steel mills. A skid-mounted commercial unit is offered (Houston Research, Inc.) for reducing oil in water from 200 ppm to less than 5 ppm. A published paper discusses equipment means to separate oil-water emulsion and to remove an oil spill from the surface of the ocean (Kaiser et al., 1971). An 88-kg portable system was developed to collect oil polluting the water around loading docks and oil storage facilities (Anonymous, 1971). Additional details are given in a U.S. patent (Kaiser, 1972). Another study relates oil removal efficiency to flow rate of the emulsion, amount of ferrofluid added, and the strength of the applied magnetic field (Yorime and Tozawa, 1976). It is interesting to note that buoyant, ferromagnetic, sorbent particles with an affinity for oil are the basis for an alternative technique for control of maritime oil spills (Turbeville, 1973).

C. Energy Conversion (Resler and Rosensweig, 1964)
A direct conversion of thermal energy to energy of fluid motion can be based on the change of ferrofluid magnetization with temperature, which is most pronounced near the Curie temperature. The conversion process may be analyzed with reference to flow in a tilted tube of uniform cross section as sketched in Fig. 50. Cold fluid enters at section 1, isothermal entry into a magnetic field is complete at section 2, heat addition at constant magnetic field is done between sections 2 and 3, and isothermal flow of heated fluid out of the field is completed at section 4.

FIG. 50. Direct conversion of heat energy to flow work is accomplished by a magnetocaloric pump in which ferrofluid having temperature-dependent magnetic moment is heated in the presence of magnetic field.

From Eq. (65) with q = const, there is obtained for sections 1 to 2 and 3 to 4, respectively,

    p1 = p2* + ρg(h2 - h1) - μ0M(T2)H    (160)

    p4 = p3* + ρg(h3 - h4) - μ0M(T3)H    (161)

where M(T3) < M(T2). In the heated zone of 2 to 3 with grad H = 0, direct utilization of the equation of motion gives the relationship

    p3* - p2* = ρg(h2 - h3)    (162)

Eliminating the starred variables gives the following expression for the overall pressure increase of the process:

    p4 - p1 = μ0H ΔM - ρg(h4 - h1)    (163)

where ΔM = M(T2) - M(T3).
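Equation (163) is readily evaluated; the field and magnetization values below are assumed for illustration and are not from the text:

```python
MU0 = 4e-7 * 3.141592653589793  # permeability of free space, H/m
G = 9.8                         # m/s^2, the value used in the chapter

def pump_pressure_rise(H, M_cold, M_hot, rho, dh):
    """Overall pressure increase of the magnetocaloric pump, Eq. (163):
    p4 - p1 = mu0*H*(M_cold - M_hot) - rho*g*(h4 - h1)."""
    return MU0 * H * (M_cold - M_hot) - rho * G * dh

# Assumed values: mu0*H = 1 T, magnetization falling from 4.0e4 A/m (cold)
# to 1.0e4 A/m (hot) across the heated zone, and a level tube (dh = 0).
dp = pump_pressure_rise(1.0 / MU0, 4.0e4, 1.0e4, 1200.0, 0.0)   # 3.0e4 Pa
```

At these assumed numbers the no-moving-part pump develops about 0.3 bar per pass, which suggests why heat pipes and thermal-loop circulators are cited as natural applications.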
Analysis of the process conducted as a closed cycle with provision for regenerative heat transfer and accounting for the magnetocaloric energy effect reveals that conversion efficiency can approach the Carnot limit set by the second law of thermodynamics (Resler and Rosensweig, 1967). The process is scientifically interesting in the coupling of thermodynamics with magnetics. Additional cycle analysis and a design configuration have been studied using this no-moving-part converter in application to topping cycles for nuclear reactors and high-reliability, lightweight space power supplies (Donea et al., 1968). A laboratory proof-of-principle demonstration has been given (Rosensweig et al., 1965), and current effort is reported in development of magnetic fluids based on a liquid metal carrier to increase the heat release rate in the process (Popplewell et al., 1977). The concept is also suggested for removal of heat from nuclear fusion reactors where a high-intensity magnetic field is available as part of the process (Roth et al., 1970). In more modest use of the principle there would seem to be much opportunity for applications in heat pipes, self-actuated pumps in thermal loops, and other devices.

D. Other
Several other processes using magnetic fluids deserve review. For the most part these processes have been studied by few investigators. However, the concepts appear promising and worthy of further development.
1. Magnetic Separation by Fluid Coating (Shubert, 1975)

In this type of process, particulate mixtures of essentially nonmagnetic materials are separated by selectively coating the surfaces of a component of the mixture with a magnetic fluid. Thereafter, the particulate mixture is subjected to a magnetic separation yielding a magnetic fluid-coated fraction and a nonmagnetic fraction. The process is especially intended for beneficiation wherein a mineral concentrate is recovered from its ore. The selective wetting of surfaces may be achieved by techniques well known in froth flotation practice. Coatings that yield a hydrophobic but aerophilic surface are also organophilic and so are readily wet by a hydrocarbon-base ferrofluid. The minimum amount of magnetic fluid required is that sufficient to form a thin coating on the surfaces of those particles wettable by the fluid. Copper ore of chalcocite in a siliceous matrix was ground and separated from a water slurry using a kerosene-base magnetic fluid following a ferric chloride pretreatment. Zinc sulfide in sphalerite ore separates from the gangue after a dilute sulfurous acid treatment. In another example, waste anthracite coal fines are separated directly from ash.
2. Biochemical Processing
Enzymes function as highly specific catalysts for chemical change. A difficulty of the technique in the past has been their separation from the reaction mixture after the chemical change occurred. A trend in recent work is the immobilization of the enzyme in a polymer support that can be readily removed from the reaction mixture (Adalsteinsson et al., 1977). The separation is facilitated by entrapping ferrofluid in the polymer to form a suspended gel containing the enzyme. Typically, a 1-μm gel particle contains 10² to 10³ magnetic particles. The resulting magnetic gels can be manipulated using either conventional magnetic filtration or more effective high-gradient procedures (Liu, 1976). The technique also circumvents difficulties encountered in packed bed contacting wherein the pressure drop is too great due to deformation of the polymer beads. Procedures for preparing and separating small, magnetically responsive polymer particles should be useful for manipulating a variety of immobilized biochemicals besides enzymes, for example, in radioimmunoassay procedures and affinity chromatography (Mosbach and Anderson, 1977).
3. Lubrication

A concern of lubrication engineers has been to introduce the appropriate lubricant in the desired location and keep it there. Conventional lubricant retention methods include mechanical seals, oil-impregnated retainers, wicks, pumping of lubricant, splashing of lubricant, surface treatment to prevent creep, and the like. Magnetic lubricants add the capability of being retained in the desired location by means of an external magnetic field. Ezekiel (1974) reviews concepts using magnetic lubricants in pivots, hinges, ball bearings, gears, and pistons.
LIST OF SYMBOLS

Thickness of fluid layer above interface, m
Dipolar element cross section area, m²
Cross section area of tube, m²
Hamaker constant, N·m
Temperature gradient, K/m
Thickness of fluid layer below interface, m
Vector induction field, Wb/m²
Component of induction in jth direction, Wb/m²
Height of heat transfer zone, m
Specific heat of magnetic fluid, J/kg·K
Spherical particle diameter, m
Wall spacing, m
Diameter of fluid jet at sections 1 and 2, respectively, m
Reduction of particle diameter due to inert surface layer, m
Demagnetization coefficient
Energy per particle pair, N·m; energy terms associated with magnetic fluid bubbles when subscripted, N·m
ith component of f, N
Characteristic frequency, sec⁻¹
Force, N
Restoring force on unit area of levitated slab, N/m²
Magnetic body force density, N/m³
External force density, N/m³
Magnetic force on a body immersed in ferrofluid, N
Magnetic force on a volume of ferrofluid, N
Gravitational constant, 9.8 m/sec²
Terms defined by Eq. (127), kg/sec²·m²
Spatial gradient of magnetic field intensity, A/m²
Magnetic field magnitude, A/m
Component of magnetic field in ith direction, A/m
Component of magnetic field at interface in normal and tangential direction, A/m
Vector magnetic field, A/m
Applied magnetic field, A/m
Electric current, A
Boltzmann constant, 1.38 × 10⁻²³ N·m/K; also, wave number, rad/m
Critical wave number for onset of normal field instability, rad/m
Pyromagnetic coefficient, K⁻¹; k0 = -M⁻¹(∂M/∂T)
Thermal conductivity, W/m·K
Wave vector component in y- and z-direction, rad/m
Crystalline anisotropy constant, N/m²
Relative surface separation of spherical particles, (rc - 2r)/r
Length of region containing standing waves, m
Length of heated fluid layer, m
Magnetic moment of a particle, A·m²
Average magnetic moment in direction of field, A·m²
Magnetization magnitude, A/m
Domain magnetization, A/m
Magnetization of permanent magnet, A/m
Field-averaged magnetization, A/m; M̄ = H⁻¹ ∫ M dH
Vector magnetization, A/m
Number of particles with diameter di in a sample
Number of particles in a chain in absence of external field
Number of particles in a chain in strong external field
Number of nodes less one in standing wave
Outward-facing unit normal vector
Surface concentration of sorbed molecules, m⁻²
Nusselt number (convective to conductive heat transfer ratio)
Fluid pressure, N/m²
Fluid magnetic pressure, μ0M̄H, N/m²
Pseudopressure defined by Eq. (52), N/m²
Fluid speed, m/sec
Fluid velocity, m/sec
Radius of a spherical particle, m
Center-to-center separation of spherical particles, m
Geometric mean of chord and tangent permeabilities
Radial distance from line current, m
Rayleigh number [defined by Eq. (129)]
Dimensionless group [defined by Eq. (138)]
Length of elementary dipole, m
Term defined by Eq. (125)
Surface area, m²
Surface layer thickness normalized to particle radius
Absolute temperature, K
Surface tension, N/m
Component of magnetic stress tensor representing ith component of stress on surface having normal oriented in jth direction, N/m²
Magnetic stress tensor, N/m²
Orientation energy of isolated dipole, N·m
Fluid velocity above and below interface, respectively, m/sec
Specific volume, m³/kg
Volume of a particle, m³
Control volume, m³
Velocity of fluid interface, m/sec
Liquid volume, m³
Width of heat transfer zone, m
Weight of tube, kg
Coenergy, N/m²
Distance from origin along Cartesian coordinate axes, m
Defined by Eqs. (102b,a)

Greek Symbols

Argument of Langevin function
Ratio of ferrofluid viscosity to porous medium flow permeability, N·sec/m⁴; defined by Eq. (102c) if subscripted a or b
Thermal expansion coefficient, K⁻¹
Angular frequency, rad/sec
Distance between float surface and wall with no displacement, m
Displacement of levitated mass, m
Defined by Eq. (102d)
Viscosity coefficient of ferrofluid in isotropic range, N·sec/m²
A viscosity coefficient of magnetic fluid in presence of external magnetic field, N·sec/m²
Viscosity coefficient of magnetic fluid in absence of external magnetic field, N·sec/m²
Viscosity coefficient of carrier fluid, N·sec/m²
Crystalline anisotropy constant, N/m²
Coupling coefficient
Magnetic permeability, B/H, H/m*
Permeability of free space, 4π × 10⁻⁷ H/m
Relative magnetic permeability, μ/μ0
Deflection of interface, m
Amplitude of interfacial deflection mode, m
Expressions defined by Eqs. (157b,c)
Particle mass density, kg/m³
Apparent density, kg/m³
Liquid density, kg/m³
Surface density of (apparent) magnetic poles, poles/m²
Brownian rotation relaxation time, sec
Néel (intrinsic) relaxation time, sec
Characteristic precessional time, sec
Frequency, rad/sec
Volume fraction magnetic solids
Velocity potential, m²/sec
Magnetic susceptibility, M/H
Apparent susceptibility
Initial susceptibility, ∂M/∂H at H = 0; subscript d dilute, c concentrated
Gravitational potential function, N·m
Complex frequency, rad/sec

* The symbol H for the unit of henry should not be confused with the symbol H for the parameter of magnetic field magnitude.
Frequency of standing waves in absence of magnetization, rad/sec
Frequency of standing waves in magnetized medium, rad/sec
Vorticity vector, sec⁻¹
Fluid rotational rate, rad/sec
REFERENCES

Adalsteinsson, O., Lamotte, A., Baddour, R. F., Colton, C. K., and Whitesides, G. M. (1977). MIT Industrial Liaison Report NSF (RANN) No. GI34284. Massachusetts Institute of Technology, Cambridge, Massachusetts.
Akiyama, S., Fujino, J., Ishihara, A., Ueda, K., Nishio, M., Shindo, Y., and Fuji, H. (1976). Proc. Int. Cryog. Eng. Conf., 6th, 1976, p. 432.
Andres, U. Ts. (1976a). "Magnetohydrodynamic and Magnetohydrostatic Methods of Mineral Separation." Wiley, New York.
Andres, U. Ts. (1976b). Mater. Sci. Eng. 26, 269.
Anonymous (1971). Ordnance, July-August.
Avramchuk, A. Z., Kalinkin, A. K., Mikhalev, O., Orlov, D. V., and Sizov, A. P. (1975). Instrum. Exp. Tech. (Engl. Transl.) 18, Part 2, 900.
Bailey, R. L. (1976). Proc. ASME Design Technol. Transfer Conf., 2nd (Montreal).
Bailey, R. L. (1978). In "Thermomechanics of Magnetic Fluids" (B. Berkovsky, ed.), p. 299. Hemisphere, Washington, D.C.
Bates, L. F. (1961). "Modern Magnetism," 4th ed. Cambridge Univ. Press, London and New York.
Bean, C. P., and Livingston, J. D. (1959). J. Appl. Phys. 30, 120S.
Berkovsky, B. M., and Bashtovoi, V. G. (1973). Heat Transfer-Sov. Res. 5, No. 15, 137.
Berkovsky, B. M., and Orlov, L. P. (1973). Magnetohydrodynamics (Engl. Transl.), No. 4, 38.
Berkowitz, A. E., Lahut, J. A., Jacobs, I. S., Levinson, L. M., and Forester, D. W. (1975). Phys. Rev. Lett. 34, No. 10, 594.
Bertrand, A. R. V. (1970). Rev. Inst. Fr. Pet. 25, 16.
Bibik, E. E. (1975). Russian Patent 473,098.
Bibik, E. E., Matygullin, B. Ya., Raikher, Yu. L., and Shliomis, M. I. (1973). Magnetohydrodynamics (Engl. Transl.), No. 1, p. 68.
Bogardus, E. H., Scranton, R., and Thompson, D. A. (1975). IEEE Trans. Magn. MAG-11, No. 5, 1364.
Brown, W. F., Jr. (1963a) Phys. Rev. 130, 1677. Brown, W. F., Jr. (1963b). J . Appl. Phys. 34, 1319. Brown, W. F., Jr. (1969). Ann. N.Y. Acad. Sci. 147,463. Broy, A. (1972). N.Y. Times, March 26, p. 3. Buckrnaster, J. (1978). In “Thermomechanics of Magnetic Fluids” (B. Berkovsky, ed.), p. 213. Hemisphere, Washington, D.C. Byme, J. V. (1977). Proc. IEE 24, No. 11, 1089. Calugkru, Gh., Badescu, R., and Luca, E. (1976). Rev. Roum. Phys. 21, No. 3, 305. Carrico, J. P. (1976). U.S. Patent 3,946,177. Cary, B. B., Jr. and Fenlon, F. H. (1969). J. Acoust. Soc. Am. 45, No. 5, 1210. Chu, B.-T. (1959). Phys. Fluids 2, No. 5, 473. Chung, D. Y., and Isler, W. E. (1977) Phys. Lett. A61, No. 6, 373. Cissoko, M. (1976). C. R. Hebd. Seances Acad. Sci., Ser. A 283,413. Cook, E. J. (1972) “Feasibility Evaluation of a Ferromagnetic Materials Sensor,” Rep. No. 0101-F. Arthur D. Little, Inc., Cambridge, Massachusetts.
196
RONALD E. ROSENSWEIG
Coulombre, R. E., d’Auriol, H., Schnee, L., Rosensweig, R. E., and Kaiser, R. (1967). “Feasibility Study and Model Development for a Ferrofluid Viscous Damper,” Rep. No. NAS5-9431, AVSSD-0222-67-CR. Goddard Space Flight Center, Greenbelt, Maryland. Cowley, M. D., and Rosensweig, R. E. (1967) J. Fluid Mech. 30,671. Curtis, R. A. (1971) Phys. Fluids 14, No. 10,2096. Curtis, R. A. (1974) Appl. Sci. Res. 29, No. 5, 342. de Gennes, P. G., and Pincus, P. A. (1970) Phys. Kondens. Mater. 11, 188. Donea, J., Lanza, F., and Van der Voort, E. (1968) “Evaluation of Magnetocaloric Converters,” Rep. EUR 4039e. Euratom, Ispra Establishment, Varese, Italy. Dzhauzashtin, K. E., and Yantovskii, E. I. (1969) Magn. Gidrodin. 5, No. 2, 19. Einstein, A. (1906) Ann. Phys. (Leipzig) [4] 19, 289. Einstein, A. (1911) Ann. Phys. (Leipzig) [4] 34, 591. Ezekiel, F. D. (1974). Des. Eng. Con$, Chicago Paper No. 74DE21. Fan, G. J. (1975). Proc. Magn. Magn. Mater., 21st Annu. Con$ AIP No. 29. Fan, G. J., and Toupin, R. A. (1974) German Patent 2,340,120. Finlayson, B. A. (1970). J . Fluid Mech. 40, No. 4, 753. Frenkel, Y a I. (1955) “Collection of Selected Works” (transl.), Dover, New York. Gailitis, A. (1977). J. Fluid Mech. 82, Part 3,401. Goldberg, P., Hansford, J., and van Heerden, P. J. (1971) J . Appl. Phys. 42, No. 10, 3874. Goldstein, S. R. (1977) U.S.Patent 4,053,952. Gray, R. J. (1972) Proc. 4th Annu. Int. Metallogr. Meet., 1971 ORNL-TM-3681. Hall, W. F., and Busenberg, S. N. (1969). J . Chem. Phys. 51, No. 1, 137. Hayes, C. F. (1975). J. Colloid Interjiace Sci. 52, No. 2, 239. Hemmer, P. C., and Imbro, D. (1977). Phys. Rev. A 16, No. 1, 380. Hess, P. H., and Parker, P. H. (1966). J. Appl. Polym. Sci. 10, 1915, Hok, B. (1976). Med. Biol. Eng., March, 193. Hudgins, W. A. (1974). U.S.Patent 3,858,879. Hunter, J. S., and Little, J. L. (1977). US.Patent 4,043,204. Jacobs, I. S., and Bean, C. P. (1963) Magnetism 3,272. Jeans, J. H. (1948). 
“The Mathematical Theory of Electricity and Magnetism,” Chapter 7, Sect. 192. Cambridge Univ. Press, London and New York. Jenkins, J. T. (1971). J. Phys. (Paris) 32,931. Jenkins, J. T. (1972) Arch. Ration. Mech. Anal. 46, No. 1,42. Johnson, C. E., Jr. (1968). US.Patent 3,510,878. Jones, T. B. (1978). I n “Thermomechanics of Magnetic Fluids” (B. Berkovsky, ed.),p. 255. Hemisphere, Washington, D.C. Jones, T. B., and Bliss, G. W. (1977). J . Appl. Phys. 48, No. 4, 1412. Jordan, P. C. (1973). Mol. Phys. 25, No. 4,961. Kaiser, R. (1969). U.S.Patent 3,483,968. Kaiser, R. (1972). U.S.Patent 3,653,819. Kaiser, R., and Miskolczy, G. (1970a). J . Appl. Phys. 41, No. 3, 1064. Kaiser, R., and Miskolczy, G. (1970b). IEEE Trans. Magn. mag-6,No. 3, 694. Kaiser, R., and Rosensweig, R. E. (1968) “Study of Ferromagnetic Liquid, Phases I1 and 111,” Rep. No. NASW-1581.NASA Office of Advanced Research and Technology,Washington, D.C. Kaiser, R., Mir, L., and Curtiss, R. A. (1976). U.S.Patents 3,951,784-5. Kaiser, R., Miskolczy, G., Curtiss, R. A., and Colton, C. K. (1971). Proc. J t . Con$ Preo. Control Oil Spills, Washington. Kaplan, B. Z., and Jacobson, D. M. (1976). Nature (London) 259,654.
FLUID DYNAMlCS AND SCIENCE OF MAGNETIC LIQUIDS
197
Keller, H., and Kundig, W. (1975). Solid Srute Commun. 16, 253. Khalafalla, S. E. (1975). Chemtech 5, 540. Khalafalla, S. E., and Reimers, G. W. (1973a). U.S.Patent 3,764,540. Khalafalla, S. E., and Reimers, G. W. (1973b). Sep. Sci. 8, No. 2, 161. Khalafalla, S. E., and Reimers, G. W. (1974). U.S.Patent 3,843,540. Kraefi, B., and Alexander, H. (1973). Phys. Kondens. Materie 16,281. Krueger, D. A., and Jones, T. B. (1974). Phys. Fluids 17, No. 10, 1831. Kruyt, H. R. (1952). “Colloid Science,” Vol. I. Am. Elsevier, New York. Lalas, D. P., and Carmi, S. (1971). Phys. Fluids 14, No. 2,436. Lamb, Sir H. (1932) “Hydrodynamics,” 6th ed. Dover, New York. Lancanshire, R B., Alger, D. L., Manista, E. J., Slaby, J. G., Dudning, J. W., and Stubbs, R. M. (1977). Opt. Eng. 16, No. 5, 505. Lerro,J. P. (1977). Des. News, Sept. 5, 46. Levine, M. B. (1977). 9th Space Simulation Con$ Los Angeles. Liu, Y. A., ed. (1976). Theory Appl. Magn. Separ. IEEE Trans. Magn. mag-12, No. 5. Mackor, E. L. (1951). J. Colloid Sci. 6,492. McNab, T. K., Fox, R. A., and Boyle, J. F. (1968). J. Appl. Phys. 39, No. 12, 5703. McTague, J. P. (1969). J. Chem. Phys. 51, 133. Martinet, A. (1977). J. Colloid lntevace Sci. 41, 391. Martinet, A. (1978). In “Thermomechanics of Magnetic Fluids” (B. Berkovsky, ed.). Hemisphere, Washington, D.C. Martinet, A. (1977b). J. Colloid Interface Sci. 41,391. Martsenyuk, M.A., Raikher, Yu. L.,and Shliomis, M.I. (1974). Sou. Phys.-JETP 38,No. 2, 413. Melcher, J. R. (1963) “Field Coupled Surface Waves.” MIT Press, Cambridge, Massachusetts. Miller, C. W.,and Resler, E. L., Jr. (1975). Phys. Fluids 18, No. 9, 1112 Miskolczy, G., and Kaiser, R. (1973). U.S.Patent 3,740,060. Miskolczy, G., Litte, R., and Kaiser, R. (1970) Ferrofluid Particle Gyro,” Tech. Rep. AFFDL TR-70-5. Air Force Flight Dynamics Laboratory, Wright Patterson Air Force Base, Ohio. Mosbach, K., and Anderson, L. (1977). Nature (London) 270,259. Moskowitz, R. (1974) ASLE Trons. 18, No. 2, 135. 
Moskowitz, R., and Ezekiel, F. D. (1975). 1975 SAE Of-Highway Vehicle Meet. Paper No. 750851. Moskowitz, R., and Rosensweig, R. E. (1967) Appl. Phys. Lett. 11,301. Neuringer, J. L. (1966). lnt. J . Non-Linear Mech. 1, No. 2, 123. Neuringer, J. L., and Rosensweig, R. E. (1964) Phys. Fluids 7, No. 12, 1927. Nogita, S., Ikeguchi, T., Muramori, K.,Kazama, S.,and Sakai, H. (1977). Hitachi Rev. 26, No. 4, 139. Olah, E. E. (1975). US.Patent 3,863,249. Pappel, S. S. (1965). U.S. Patent 3,215,572. Pappel, S. S., and Faber, 0. C., Jr. (1966). N A S A Tech. Note I)-3288 Penfield, P., and Haus, H. A. (1967). “Electrodynamics of Moving Media.” MIT Press, Cambridge, Massachusetts. Perry, M.P. (1978). In “Thermomechanics of Magnetic Fluids” (B. Berkovsky, ed.),p. 219. Hemisphere, Washington, D.C. Perry, M.P., and Jones, T. B. (1976). IEEE Trans. Magn. mg-12,798. Persson, N. C. (1977). Des. News, April 18. Peterson, E. A., and Krueger, D. A. (1978). J . Colloid Interface Sci. (in press) Popplewell, J., Charles, S. W., and Chantrell, R. (1977). Energy Conuers. 16, 133. Rabenhorst, D. W. (1972). U.S. Patent 3,682,518.
198
RONALD E. ROSENSWEIG
Rabenhorst, D. W. (1975). Energy Sources 2, No. 3, 251. Rabinow, J. (1949). J . Franklin Inst. 248, 155. Resler, E. L., Jr., and Rosensweig, R. E. (1964). AIAA J . 2,No. 8, 1418. Resler, E. L.,Jr., and Rosensweig, R. E. (1967). J . Eng. Power A3 89,399. Romankiw, L. T., Slusarczuk, M. M. G., and Thompson, D. A. (1975). IEEE Trans. Magn. mag-11,No. 1, 25. Rosensweig, R. E. (1966a). Int. Sci. Technol. 55, 48. Rosensweig, R. E. (1966b). AIAA J . 4, No. 10, 1751. Rosensweig, R. E. (1966~).Nature (London) 210,613. Rosensweig, R. E. (1969). U.S.Patent 3,483,969. Rosensweig, R. E. (1970). US.Patent 3,531,413. Rosensweig, R. E. (1971a). “Ferrohydrodynamics.” Encycl. Dictionary Phys., Suppl. 4, 411. Pergamon, Oxford. Rosensweig, R. E. (1971b). US. Patent 3,620,584. Rosensweig, R. E. (1973). US.Patent 3,734,578. Rosensweig, R. E. (1974). U.S. Reissue Patent 27,955 (Original No.3,612,630 dated October 12, 1971) Rosensweig, R. E. (1975). U.S.Patent 3,917,538. Rosensweig, R. E. (1977). Japanese Letters Patent 862,559. Rosensweig,R. E.(1978). In “Thermomechanics of Magnetic Fluids” (B. Berkovsky, ed.),p. 231. Hemisphere, Washington, D.C. Rosensweig, R. E., and Kaiser, R. (1967). “Study of Ferromagnetic Liquid. Phase I,” Rep. No. NASW-1219. NASA Office of Advanced Research and Technology, Washington, D.C. Rosensweig, R. E., and Resnick, J. Y. (1972). U.S. Patent 3,648,299. Rosensweig, R. E., Nestor, J. W., and Timmins, R. S. (1965). Mater. Assoc. Direct Energy Convers., Proc. Symp. AIChE-I. Chem. E . Ser. 5, 104. Rosensweig, R. E., Litte, R., and Gelb, A. (1966). Proc. 4th Symp. Unconventional Inertial Sensors, Washington, D.C., Avco Corp. Report No. AVSSD-0291-66-PP. Rosensweig, R. E., Litte, R., Miskolczy, G., and Pellegrino, J. J. (1968a).“ FHD Sensor Develop ment,” AFFDL-TR-67-162. Air Force Flight Dynamics Lab., Wright Patterson Air Force Base, Ohio. Rosensweig, R. E., Miskolczy, G., and Ezekiel, F. D. (1968b). Mach. Des. 40, 145. Rosensweig, R. 
E., Kaiser, R., and Miskolczy, G. (1969). J . Colloid Interface Sci. 29,No.4,680. Rosensweig, R. E., Zahn, M., and Vogler, T. (1978). In “Thermomechanics of Magnetic Fluids” (B. Berkovsky, ed.), p. 195. Hemisphere, Washington, D.C. Roth, J. R., Rayk, W. D., and Reiman, J. J. (1970) NASA Tech. Memo 2106. Sabelman, E. E.(1972). NASA, Jet Propulsion Laboratory, S/N235,295,Pasadena, California. Saffman, P. G., and Taylor, G. I. (1958). Proc. R. SOC.London,Ser. A 245,312. Sargent, R. W. (1976). U.S.Patent 3,935,571. Schmieder, R. W. (1970). U.S. Patent 3,516,294. Schmieder, R. W. (1972). Nucl. Instrum. & Methods 102,313. Scholander, P. F., and Perez, M. (1971). Proc. Natl. Acad. Sci. U.S.A. 68, 1093. Scholten, P. C. (1978). In “Thermomechanics of Magnetic Fluids’’ (B. Berkovsky, ed.), p. 1. Hemisphere, Washington, D.C. Sharma, V. K., and Waldner, F. (1977). J . Appl. Phys. 48, No. 10, 4298. Shepherd, P. G., Popplewell, J., and Charles, S.W. (1972). J. Phys. D 5, 2273. Shliomis, M. I. (1972). Sou. Phys.-JETP (Engl. Transl.) 34, No. 6, 1291. Shliomis, M. I. (1974). Sou. Phys.-Usp. (Engl. Transl.) 17,No. 2, 153. Shubert, R. H.(1975). U.S.Patent 3,926,789. Stratton, J. A. (1941). “Electromagnetic Theory,” p. 116. McGraw-Hill, New York.
FLUID DYNAMICS AND SCIENCE OF MAGNETIC LIQUIDS
199
Stripling, W. W., White, H. V., and Hunter, J. S. (1974). U.S. Patent 3,839,904. Styles, J. C. (1969). U.S. Patent 3,439,961. Styles, J. C., Tuffias, R. H., and Blakely, R. W., Jr. (1973). U.S. Patent 3,746,407. Taylor, G. I., and McEwan, A. D. (1965). J. Fluid Mech. 22, 1. Thomas, J. R. (1966). J . Appl. Phys. 37, 2914. Turbeville, J. E. (1973). Enuiron. Sci. Technol. 7 , No. 5 , 433. Verwey, E. J. W., and Overbeek, J. Th. G. (1948). “The Theory of the Stability of Lyophobic Colloids.” Am. Elsevier, New York. Weinstock, R. (1976). Am. J . Phys. 44, No. 9, 392. Winkler, H., Heinrich, H.-J., and Gerdau, E. (1976). J . Phys. (Paris), C 6, Suppl. 12, 261. Wohlfarth, E. P. (1959). Adu. Phys. 8, 87. Wolfe, R., and North, J. C. (1974). Appl. Phys. Lett. 25, No. 2, 122. Yakushin, V. I. (1974). Magnetohydrodynamics (Engl. Transl.) No. 4, p. 19. Yorizane, M.,and Tozawa, 0. (1976). Bull. Jpn. Pet. Inst. 18, 183. Zaitsev, V. M.,and Shliomis, M.I. (1969). Dokl. Akad. Nauk SSSR 188, 1261. Zelazo, R. E., and Melcher, J. R. (1969). J . Fluid Mech. 39, 1. Zimmels, Y., Tuval, Y., and Lin, I. J. (1977). IEEE Trans. Magn. mag-13, No. 4, 1045.
The Edelweiss System

J. ARSAC,* CH. GALTIER,† G. RUGGIU,† TRAN VAN KHAI,† AND J. P. VASSEUR†‡

I. General-Purpose Operating Systems ............................ 202
II. New Trends in Computer Architecture ......................... 203
   A. Von Neumann Architecture .................................. 203
   B. Syntax-Oriented Architecture .............................. 204
   C. Indirect Execution Architecture ........................... 204
   D. Direct Execution Architecture ............................. 204
   E. Choosing the High-Level Language .......................... 204
III. New Trends in Programming .................................. 205
   A. Empirical Programming ..................................... 205
   B. Top-Down Programming ...................................... 205
   C. Control Structures ........................................ 206
   D. Program Manipulation ...................................... 206
IV. Principles of EXEL .......................................... 207
   A. Description of EXEL ....................................... 207
   B. Procedure Calls ........................................... 211
   C. Example ................................................... 213
   D. Systems of Regular Program Equations ...................... 217
   E. Programming Methodology ................................... 222
   F. Is EXEL a GO-TO-less Language? ............................ 223
V. Large-Scale Systems: The EDELWEISS Architecture .............. 223
   A. Description of EDELWEISS .................................. 223
   B. The Working Set in the EDELWEISS System ................... 232
   C. Analytical Model of EDELWEISS System ...................... 244
VI. The Single-User Family ...................................... 256
   A. Description of EXELETTE ................................... 256
   B. Operating EXELETTE ........................................ 258
   C. Internal Management of EXELETTE ........................... 258
   D. Three-Processor Implementation ............................ 262
Appendix A ...................................................... 263
Appendix B ...................................................... 264
Appendix C ...................................................... 267
References ...................................................... 269
* Institut de Programmation, Université Paris VI, Paris, France.
† Thomson-CSF, Laboratoire Central de Recherches, Orsay, France.
‡ Present address: Thomson-Brandt, 173 Boulevard Haussmann, B.P. 700-08, 75360 Paris Cedex 08, France.

Copyright © 1979 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014648-7
I. GENERAL-PURPOSE OPERATING SYSTEMS

Most operating systems have been designed to take charge of all possible programs, written in all possible languages. Programmers are not concerned with problems of memory management: programs are written without any consideration of memory requirements or storage resources. The concept of virtual memory is the fundamental issue: first, programs are translated by compilers for a quasi-infinite memory, using logical addresses; these are then translated into physical addresses before execution. Implantation decisions may be taken statically at load time, before program execution starts. The program and its data are loaded, and space is assigned to the program as working area, so that it can run for a long time (compared with the time needed for loading). This is possible only if the computer has a huge central memory. Fragmenting the program and its data so that a coherent part may be loaded and run for a long enough time is possible only if the programmer has given indications (overlays). The other way is to take implantation decisions dynamically at run time. This is possible if some hardware mechanism exists that maps logical addresses onto physical ones. The central memory is split into pages, and only a small number of them are assigned to a program at any time. The hardware mechanism detects that some logical address cannot be translated into a physical address, because the corresponding page has not been assigned to the program and may not have been loaded from disk. Execution is then interrupted, and the operating system decides how to find a free page in core memory, assign it to the program, and, if necessary, load its contents from disk. It may be necessary to save the previous contents of this page to disk before assigning it to the program. Thus, each page fault may cost one or two disk accesses, a very substantial overhead.
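The run-time mapping just described can be sketched in a few lines. This is a toy model written for illustration only; the names PageTable and PageFault, and the 4096-byte page size, are assumptions, not taken from any particular machine:

```python
# Minimal model of demand paging: logical addresses are translated through
# a page table; a reference to an unmapped page raises a page fault that
# the "operating system" must service before execution can resume.

PAGE_SIZE = 4096

class PageFault(Exception):
    """Raised when a logical page has no assigned physical frame."""
    pass

class PageTable:
    def __init__(self):
        self.map = {}                  # logical page number -> physical frame

    def translate(self, logical_addr):
        page, offset = divmod(logical_addr, PAGE_SIZE)
        if page not in self.map:
            raise PageFault(page)      # the hardware detects the missing page
        return self.map[page] * PAGE_SIZE + offset

pt = PageTable()
pt.map[3] = 7                          # logical page 3 resides in frame 7
print(pt.translate(3 * PAGE_SIZE + 10))   # -> 28682  (7 * 4096 + 10)
```

Servicing the fault (choosing a frame to free, writing its old contents back, loading the new page) is exactly the one-or-two-disk-access cost described above.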
If the execution time between two page faults is not long compared with disk access time, the program execution time is considerably increased (system thrashing) (Denning, 1968). Several strategies have been proposed to avoid this phenomenon, such as replacing the page that has been least recently used (the LRU algorithm) (Hansen, 1973). It has been shown that, whatever the program, the set of pages needed for the execution of a small part of it [the working set (Denning, 1968)] has a lifetime longer than the execution time of this part, and more or less independent of the program. The following orders of magnitude may be given:

time needed for the execution of one machine statement: 10⁻⁶ sec
lifetime of a working set: 10⁻³ sec
disk access time: 10⁻² sec
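The LRU strategy and the thrashing phenomenon can both be seen in a small simulation. This is an illustrative Python sketch; the function name and the toy reference string are assumptions:

```python
from collections import OrderedDict

def count_faults(reference_string, n_frames):
    """Simulate LRU page replacement and count the page faults."""
    frames = OrderedDict()              # pages in memory, least recently used first
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)    # mark as most recently used
        else:
            faults += 1
            if len(frames) == n_frames:
                frames.popitem(last=False)   # evict the least recently used page
            frames[page] = True
    return faults

# A tight loop over 3 pages thrashes with 2 frames but not with 3:
refs = [1, 2, 3] * 10
print(count_faults(refs, 2))   # -> 30  (every reference faults)
print(count_faults(refs, 3))   # -> 3   (only the initial cold faults)
```

With one frame too few, every reference costs a disk access; with enough frames to hold the working set, only the cold misses remain.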
Operating systems may be improved if the working-set lifetime is increased (since disk technology does not allow faster access). This lifetime is a general statistical property of programs. If working sets are instead derived from considerations of program semantics, they may be made to include all the statements of a repetitive part of the program (some loops) together with the corresponding data and working area, and so have a much longer lifetime. This is exactly the same thing as fragmenting the program into smaller segments, or atoms, for consistent execution. Thus, both static and dynamic memory allocation lead to the same idea: any possibility of splitting a program into segments will improve the performance of the operating system. Unfortunately, this is a difficult problem. It cannot be solved easily for classical languages such as FORTRAN. Adding a sophisticated program to the operating system is not a solution; the time saved on page management would be lost in program processing. Moreover, such a program is not independent of the language in which the processed program is written. It is conceivable only if a unique language is used, designed in such a way that program segmentation is simple.
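The statistical notion of working set referred to above can be stated concretely. The sketch below follows Denning's usual definition of W(t, τ) as the set of distinct pages referenced in the last τ references up to time t; it is illustrative Python, and the names are assumptions:

```python
def working_set(reference_string, t, tau):
    """Denning's working set W(t, tau): the distinct pages referenced
    in the window of tau references ending at time t."""
    window = reference_string[max(0, t - tau + 1) : t + 1]
    return set(window)

# A program whose locality shifts from pages {5, 9} to page {3}:
refs = [5, 5, 5, 9, 9, 5, 3, 3, 3, 3]
print(sorted(working_set(refs, 5, 4)))   # -> [5, 9]
print(sorted(working_set(refs, 9, 4)))   # -> [3]
```

A semantically derived working set would be chosen to cover a whole loop, its data, and its working area at once, rather than relying on the reference window alone.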
II. NEW TRENDS IN COMPUTER ARCHITECTURE

With large-scale integration, the cost of hardware has been drastically reduced and the realization of specialized computer architectures has been considerably simplified. A computer may be specialized to accept only one high-level language, and so take into account every facility of this language. Four kinds of high-level language computer architectures have been recognized by Chu (1975).

A. Von Neumann Architecture
This is a classical architecture, based on a central memory made of addressable cells in which the program is stored. There is no direct connection between the high-level language and the machine language except the one coming from the history of computer development. Most of the high-level languages (FORTRAN, ALGOL, COBOL, PL/1, etc.) have been designed as more or less powerful abbreviations of machine languages, using the same universal concepts: variables, assignments, labels, GO TOs, conditional jumps, subroutines, etc. The compatibility between the high-level language and the machine language is provided by a compiler. Specializing such an architecture to a single language simplifies the operating system: it does not have to manage a library of compilers and run-time environments. These systems do not fundamentally differ from general-purpose systems.
B. Syntax-Oriented Architecture

This is still an architecture based on von Neumann machines, but with some specialization of the machine code for lexical analysis: processing of character strings, Polish notation of operators, etc. With such machines, the role of software is reduced, but there is still an important distance between the high-level language and the machine language. For instance, the user's program is written with parentheses and translated into Polish notation.

C. Indirect Execution Architecture
Such a system still uses two languages: the external high-level language and the internal machine language. But the distance between these languages has been considerably reduced by bringing the internal language closer to the external one. Most of the translation operations between these languages are performed by hardware (the SYMBOL system) (Rice and Smith, 1971). These systems differ from von Neumann architecture because they do not start from a machine architecture onto which a high-level language is mapped through more or less sophisticated compilers. The high-level language is given first; then the machine architecture is designed so that the internal language will be close to the external one.

D. Direct Execution Architecture
In these systems, there exists only one language, the high-level language, directly executed by hardware. No translation is made, and a program is stored without any modification, which greatly simplifies debugging (Chu, 1972).
E. Choosing the High-Level Language

Most of the high-level languages presently in use have been designed to be easily translated into the machine language of von Neumann machines. They use assignments [except LISP (McCarthy, 1960) and some very new assignment-free languages (Arsac, 1977; Ashcroft and Wadge, 1975)], GO TO statements, and procedures, considered as sequences of statements written separately to reduce program length, or even compiled separately to economize compile time. As far as other architectures are concerned, there is no reason why such considerations should remain while designing a high-level programming language. It is made for the description of algorithms, not for compatibility with
a machine architecture. Therefore, it must take into account the new trends in programming.

III. NEW TRENDS IN PROGRAMMING

A. Empirical Programming
Dijkstra was the first to insist on the poor state of the art in programming [Notes on Structured Programming (Dijkstra et al., 1972), first published in 1969 as a publication of the University of Maryland]. A symposium in Monterey (Goldberg, 1973) discussed the high cost of software and gave poor programming techniques as the main reason for such cost. The usual programming methodology may be characterized as follows:

(i) Program design. The problem is analyzed and a program project built, generally represented by flowcharts. Then a program is written.

(ii) Program debugging. The program is compiled. Syntactic errors, and perhaps some very simple semantic errors, are detected by compilers. They are corrected until the program is finally accepted by the compiler. The program is run on test data, and dumps are requested when bugs are detected at execution time. Some anomaly is localized in the dump and removed by code modification. This process is iterated until a correct result is obtained for the test data. This does not prove that the program is correct (Dahl et al., 1972), but only that no more bugs can be detected from these data. Software unreliability is the direct consequence of this methodology. More structured or systematic programming (Wirth, 1973) must be used.
B. Top-Down Programming

Instead of describing the whole program architecture by flowcharts, exhibiting very precisely what is only implementation detail (the GO TOs of the program), its structure must be given in terms of global actions, which will be refined later. A number of implementation problems may thus be delayed. This is an important point. In the early stages of development, attention is paid only to the main actions, which can be checked carefully. Implementation decisions are taken later, when a good knowledge of what has to be done has been collected. For instance, in the early stages, we shall speak of a stack, onto which elements will be pushed or from which they may be popped. Later, when sufficient knowledge of the structure of the working space has been obtained, it will be decided whether the stack is represented as a vector or as a linked list of cells.
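The stack example can be sketched in two steps, mirroring the refinement order: first the abstract interface the early design relies on, then one possible representation chosen later. This is illustrative Python; the class names are assumptions:

```python
# Early design stage: the program is written against an abstract stack.
# Only the operations matter; the representation is deliberately left open.
class Stack:
    def push(self, x): raise NotImplementedError
    def pop(self): raise NotImplementedError

# Later refinement: a representation is chosen (here, a vector; a linked
# list of cells would satisfy exactly the same interface).
class VectorStack(Stack):
    def __init__(self):
        self.cells = []
    def push(self, x):
        self.cells.append(x)
    def pop(self):
        return self.cells.pop()

s = VectorStack()
s.push(1)
s.push(2)
print(s.pop())   # -> 2
```

Code written in the early stage against Stack runs unchanged whichever refinement is finally adopted, which is what makes delaying the decision safe.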
The top-down approach may be used with any language. It is just a discipline of programming. It is recommended even for FORTRAN programmers (Ledgard, 1975). In this case, it appears in intermediate stages of the programming process and will not be very apparent in the final program. A good programming language should facilitate top-down programming, allowing the programmer to describe an algorithm in terms of actions whose refinements are given later.

C. Control Structures
The GO TO statements have a strong negative effect on program readability (Dahl et al., 1972). They are used for three main reasons: (1) selecting sequences of statements according to the result of predicate evaluation, (2) making loops, (3) avoiding copies of parts of programs.
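Each of these three uses has a structured replacement, which can be illustrated in a GO-TO-less notation (Python here, purely for illustration; all function names are invented):

```python
# (1) Selection instead of conditional jumps:
def classify(x):
    if x > 0:
        return "positive"
    else:
        return "non-positive"

# (2) A loop statement instead of a backward jump:
def first_index(items, wanted):
    i = 0
    while i < len(items):
        if items[i] == wanted:
            return i          # a structured exit replaces a jump out of the loop
        i += 1
    return -1

# (3) A shared subroutine instead of jumping into common code; the two
# call sites cannot silently diverge the way two merged GO TO paths can:
def report(label, value):
    return f"{label}: {value}"
```

The third case is the one the text flags as dangerous: a named subroutine makes the shared code explicit, whereas jump-merged sequences only look identical.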
Clarity is gained by using a selection statement:

IF ... THEN ... ELSE ... FI
in the first case, and some loop statement in the second case. The third case is more ambiguous; if it corresponds to some loop, it is better to use a loop. If not, there is a real danger that two sequences merged by use of GO TO are not really identical, giving bugs difficult to detect and more difficult to remove. Many loop statements have been proposed for GO TO elimination (Knuth, 1974; Ledgard and Marcotty, 1975), and their ability to represent every flowchart has been widely discussed (a summary of these discussions is given in Kosaraju, 1974).

D. Program Manipulation
Manipulating a program is now considered a normal way to improve it (Knuth, 1974; Loveman, 1977; Standish et al., 1976). It is done by applying successive transformations that preserve the program's meaning. These may be divided into two classes:

(i) Syntactic transforms. These preserve the history of the computation. They do not modify the sequence of assignments and test evaluations, and so do not modify the corresponding execution time. They may act on the overhead introduced by action interpretation in the system.
(ii) Semantic transforms. We consider only local semantic transforms, which depend on a local program property. For instance,

a(i) := x + 1 ; a(j) := y

may be changed into

a(j) := y ; a(i) := x + 1

if and only if i ≠ j
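That the side condition matters can be checked concretely. The sketch below is an illustrative Python rendering of the two assignment orders, with list indexing standing in for the a(i) notation; the function names are invented:

```python
def original(a, i, j, x, y):
    """a(i) := x + 1 ; a(j) := y"""
    a = list(a)
    a[i] = x + 1
    a[j] = y
    return a

def transformed(a, i, j, x, y):
    """The reordered form: a(j) := y ; a(i) := x + 1"""
    a = list(a)
    a[j] = y
    a[i] = x + 1
    return a

a = [0, 0]
print(original(a, 0, 1, 5, 7), transformed(a, 0, 1, 5, 7))  # equal when i != j
print(original(a, 0, 0, 5, 7), transformed(a, 0, 0, 5, 7))  # differ when i == j
```

This is why the transform is called semantic: its validity cannot be read off the program text alone, but depends on a property (i ≠ j) that must be established locally.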
These transforms change the number of assignments or test evaluations, and so act on the execution time. The EXEL language described below provides a good system for program manipulation.

IV. PRINCIPLES OF EXEL
The system EDELWEISS is designed for the use of a single language named EXEL (which stands for Experimental Language). EXEL is described in Nolin and Ruggiu (1973) and Arsac (1974). A presentation of its principles and some examples illustrating its capabilities are given here.
A. Description of EXEL

EXEL is a GO-TO-less control structure language. It may describe any control structure, either flowchart type or recursive. In EXEL, three different hierarchical levels exist: formulas, actions, and procedures.
1. Formulas
A formula is a sequence of operands, operators, and assignments. A formula is not expressed in EXEL, but in any convenient computational language (BASIC, FORTRAN, APL, etc.). This language is called, in EXEL terminology, the formula language. In the general philosophy of EXEL, operands are structured data. Operators perform operations on these operands. These operations may be the usual arithmetic ones: addition, multiplication, logarithms, circular functions, etc. They may also be operations related to the structure of the operand (transposition, global operations on arrays, tree scanning, etc.). In general, the set of available formula operators depends on the data structures that may be described in the formula language. EXEL control structures describe the order in which formulas are executed. The various possible choices of the formula language lead to as many different EXEL systems: EXEL/BASIC, EXEL/FORTRAN, EXEL/APL. A formula describes a sequence of operations leading to a result; in an
EXEL program, no control transfer may occur inside a formula. Thus GO TOs, IFs, or similar branching or test instructions are not to be found inside a formula.
2. Actions

In order to give a feeling of what an action is, one can say that an action is a portion of a flowchart having only one entry point and one exit point. Names are given to actions. Actions are sequences of formulas and calls to other actions or procedures, linked together by EXEL control structures. Before giving a proper definition of what an action is, it is necessary to introduce the EXEL operators.

a. Composition. If f and g are two formulas, the flowchart

→ f → g →

is interpreted as: compute f, then compute g. This will be written in EXEL as

f . g

The EXEL "." is equivalent to the ALGOL ";".

b. Alternation. If t, f, g are three formulas, and if t evaluates to a Boolean value (true, false), the flowchart
is interpreted as: compute t; if the result is true, compute f; else compute g. Proceed after this. This will be denoted in EXEL as

⊂ t → f ◇ g ⊃
Or, if one does not like symbols,

IF t THEN f ELSE g FI

Semantic extensions of the alternation operator are possible, for example, multiple alternations such as

⊂ t → f₁ ◇ f₂ ◇ ⋯ ◇ fₙ ◇ f₀ ⊃
t must in this case evaluate to an integer i; if 1 ≤ i ≤ n, compute fᵢ, else compute f₀. The first formula in an alternation (t in the examples above) is called a test formula.

c. Iteration Operator. Composition and alternation, combined with action call, have been shown to be sufficient to describe any program (Ruggiu, 1974). However, a redundant iteration operator has been added to EXEL, both for theoretical reasons and for programming convenience. Three symbols, "{", "}", and "EXIT", are used to describe this operator. On occurrence of a "}", control comes back just after the corresponding "{". This goes on until an EXIT is met; then control is given after the "}". "{" and "}" may be nested in a parenthesis structure, thus allowing the definition of several levels of iteration. For this purpose, the symbol EXIT is followed by a positive integer n; EXIT n gets control out of n iteration levels. EXIT 1 is equivalent to EXIT. For abbreviation, EXIT n can be written as aⁿ. EXIT n is defined for n ≥ 1. During program manipulations, operations may occur on this n (for example, the operation n + 1; see Section IV,C). It has been found a natural extension of this notation to allow also EXIT 0. This notation can be interpreted as a "do nothing" or an empty formula. (In some cases, the symbol Ω will be used as an EXIT 0 for the sake of clarity.) In program manipulations, EXIT 0 is treated exactly the same way as EXIT n, n ≥ 1.

d. Action Definition. The three operations composition, alternation, and iteration can be recursively composed, leading to sophisticated sentences:
f₁ . ⊂ t₁ → Ω ◇ { f₂ . ⊂ t₂ → EXIT 0 ◇ { f₃ . ⊂ t₃ → f₄ . EXIT 2 ◇ f₅ . EXIT 1 ⊃ } . f₆ ⊃ } ⊃
Such sentences are called actions; the formulas (f₁, f₂, ...) are atomic actions.
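The semantics of the iteration operator, in which EXIT n escapes n nesting levels at once, has no direct equivalent in most languages; one common way to sketch it is with an exception carrying the level count. This is illustrative Python only (EXEL itself is not executable here), and the Exit and run_level names are invented:

```python
class Exit(Exception):
    """Models EXIT n: escape n enclosing iteration levels."""
    def __init__(self, n):
        self.n = n

def run_level(body):
    """Model one { ... } level: repeat body until an EXIT reaches n == 1
    here; an EXIT with n > 1 is re-raised as EXIT n-1 for the enclosing
    level, just as the count is decremented per level in EXEL."""
    while True:
        try:
            body()
        except Exit as e:
            if e.n > 1:
                raise Exit(e.n - 1)
            return                      # EXIT 1: leave this level only

# Sketch of { f2 . { f3 . EXIT 2 } . f6 }: the inner EXIT 2 leaves
# both levels, so f6 is never reached.
log = []
def inner():
    log.append("f3")
    raise Exit(2)
def outer():
    log.append("f2")
    run_level(inner)
    log.append("f6")    # skipped: EXIT 2 escapes past this level too

run_level(outer)
print(log)   # -> ['f2', 'f3']
```

Note that a level with no EXIT at all loops forever, which matches the stated semantics of "{" and "}".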
3. Procedures
Actions are grouped in procedures. An EXEL procedure has three distinct parts:

a header. It defines a result list, a data argument list (or data list), and the name of the procedure.
a body. It defines a set of actions with their names. The order of definition of the actions is irrelevant.
an entry point. It gives the name of the first action to be executed.

The Backus Normal Form (BNF) description of the procedures is as follows:
(header) ::= ∇((result list))(procedure name)((data list))

The data list and the result list are lists of identifiers separated by commas. They are the formal parameters of the procedure. For example,

∇(X1,Y1,Z)FON(F1,A,X1,T)

This header defines the procedure FON, which has four data arguments (F1,A,X1,T) and three results (X1,Y1,Z). Data arguments can be either variable type or procedure type. Result arguments are only variable type.

(body) ::= (action) | (body)(action)
(action) ::= (action name) ⊢ (action body) ⊣
(entry point) ::= (action name)∇
(procedure) ::= (header)(body)(entry point)

The (action body) is defined according to the rules given in the previous sections. At this point, two new atomic actions appear: the action calls and the procedure calls.

(atomic action) ::= (formula) | (action call) | (procedure call)
(action call) ::= (action name)
(procedure call) ::= ((result list))(procedure name)((data list))

The action calls define the control of the computations between the different actions of the same procedure. For example, let us consider the two actions:

A1 ⊢ t → g . A1 ◇ A2 ⊣
A2 ⊢ h ⊣

A1 defines the following computation: while t is true, compute g; when t becomes false, compute A2. A2 means compute h.
THE EDELWEISS SYSTEM
Likewise, the procedure calls define the control between the different procedures of the same program, a program being defined by a set of procedures. The execution of the program itself is defined by a procedure call which starts the computation. Actions look like parameterless procedures on global variables. The ALGOL copy rule may be used: there is no renaming, all the variables being global to the actions of the same procedure. Thus, an action call may always be replaced by the sequence of statements defining it. This is exactly the mechanism of substitution in mathematics: a variable name is replaced by the expression associated with this name. This has interesting applications in the field of program transformation. It has been seen that actions know only global variables. It goes the opposite way at the procedure level: all variables in a procedure are local to this procedure. This prevents the undesirable side effects and collisions of names which are usual in procedure-type languages. There is, however, one exception to this rule. In order to give EXEL the ability to handle files without having to duplicate them in memory, the user may define global variables. They must be declared before execution, and their names are syntactically differentiated. In the currently implemented systems, file names start with the character "0." They are the only global variables to be found in the programs.
B. Procedure Calls

1. Syntax
The procedure calls are atomic actions. They may be used as regular formulas or to compute an operand inside a formula. In this case, the special character "*" indicates which result is to be used.

A := A + (B,*,C)FON(G,A,C,D)

where FON is the procedure the header of which has been described before, stands for the sequence:

(B,VT,C)FON(G,A,C,D) . A := A + VT

VT is an auxiliary variable used for composition of the procedure call and the formula which follows it. Likewise, actual parameters may be formulas:

(A[3 + 1],B,D)FON(G,A + B,H(C),D)
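EXEL evaluates actual parameters like these only when they are first needed by the called procedure (the delay rule discussed in the next subsection). A hedged Python sketch of such call-by-need argument transfer, with memoizing thunks of our own design:

```python
# Illustrative sketch (names and mechanism are ours, not EXEL's): actual
# parameters are passed as thunks and evaluated only on first use.

class Thunk:
    def __init__(self, expr):
        self.expr = expr        # zero-argument function: the actual parameter
        self.done = False
        self.value = None
    def force(self):
        if not self.done:       # evaluate at most once, when first needed
            self.value = self.expr()
            self.done = True
        return self.value

evaluations = []

def noisy(name, value):
    # record when an actual parameter is really evaluated
    def expr():
        evaluations.append(name)
        return value
    return Thunk(expr)

def fon(x, y):
    # uses x twice but never y: y's actual parameter is never evaluated
    return x.force() + x.force()

result = fon(noisy('x', 21), noisy('y', 0))
# result == 42 and evaluations == ['x']: x computed once, y not at all
```

The memoization makes the rule "generally optimal" in the sense the text cites: each actual parameter is computed at most once, and not at all if unused.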
2. Argument Transfer and Evaluation

a. Data. The computation rule of procedure calls is as follows. At the procedure call, a new context is defined by the global variables and the actual parameters. The point is that these actual parameters are evaluated only when they are needed during the execution of the called procedure. Therefore the context is recursively defined by the variables occurring in the actual parameters. This rule is consistent with the β-reduction of the λ-calculus, so it is correct; and like the delay rule (Vuillemin, 1974), it is generally optimal.

b. Results. Generally the results of procedures are array-structured variables. They can be indexed by expressions. These expressions are evaluated simultaneously: they are logically computed in parallel, and in the context of the calling procedure, before the call of the called procedure. There is no side effect or ambiguity when the same variable occurs several times as a result of the call. For example, let the header of F1 be
(X,Y)F1(· · ·)

Then the call

(A[1 2],A[2 3])F1(· · ·)

defines A if the values of X and Y of F1 satisfy the relation

X[2] = Y[1]

In that case, after the call, A is a three-element vector, the value of which is

A = X[1],X[2],Y[2]

If X[2] ≠ Y[1], the call will generate an error.

3. Type Expression
In EXEL, the procedures and the variables are typed. These types define the functionality of the objects. An order relation is defined between the types: roughly, a procedure P1 has a type higher than that of P2 if P1 has more data arguments than P2, or if these arguments have types higher than the types of the corresponding arguments of P2. In EXEL a procedure call is meaningful if the types of the real parameters are higher than or equal to those of the formal parameters. This rule is more general than the usual rule, which states that the types must be equal. It is a very powerful mechanism, allowing one to write procedure expressions. When the type of a real parameter is strictly higher than that of the corresponding formal parameter, this is called a type extension.
Example. Let the procedure DER, which computes the derivative of functions of one variable, be

( )DER(F, x) ⊢ df(x)/dx ⊣

where F is the procedure which computes the function f. The procedure DER can be directly called to compute the partial derivatives of functions of two variables. Let g be a function of two arguments.
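As a concrete numerical illustration of these nested calls (our own Python sketch, since the text treats DER abstractly), the inner closure plays the role of the call ( )DER(g, b), and the outer call differentiates the result in x:

```python
# Numerical sketch (ours; EXEL's DER is symbolic/abstract in the text):
# der(f, x) approximates df/dx by a central difference, and applying it
# to a two-argument function through a closure mirrors the "type
# extension" that freezes one argument.

def der(f, x, h=1e-6):
    # central difference approximation of the derivative at x
    return (f(x + h) - f(x - h)) / (2 * h)

def g(x, y):
    return x * x * y        # d/dy = x*x; d/dx of that is 2*x

b = 3.0
# inner call: the partial derivative dg(x, b)/dy, as a function of x
dg_dy = lambda x: der(lambda y: g(x, y), b)
# outer call: derivative in x of that partial derivative
mixed = der(dg_dy, 2.0, 1e-3)   # d/dx d/dy (x^2 y) = 2x, i.e. 4 at x = 2
```

A larger step is used for the outer difference so that rounding noise from the inner difference does not dominate.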
In the second call, ( )DER(g, b), there is a type extension, since g is a function of two arguments, while f has one argument. As a matter of fact, this call defines an auxiliary procedure that computes the partial derivative ∂g(x, b)/∂y, the derivative in x of which is computed in the first call.

C. Example
An example of the development of an EXEL program is given below. First the problem is stated in English, and the actions to be taken are expressed informally. Then they are stepwise replaced by formal EXEL/ALGOL-like programming. The program is a sorting problem. Let a[1 : n] be a vector of n elements to be sorted in ascending order. The following program sorts it. Assertions are written between quotes.

Sort ⊢ "Only permutations have been made on a"
       look for an i : a(i) > a(i + 1) . Reorder ⊣
Reorder ⊢ IF none THEN R ELSE swap(a(i),a(i + 1)). Sort FI ⊣
Reorder decreases the number of inversions in a (pairs of consecutive elements in reverse order) by one, so that the program stops. R is the empty formula: in this particular case, when control reaches R, it goes after the FI, and there is nothing else to do. R is said to be in final position (this term is more formally defined in Section IV,D,1). Thus, an R in final position acts as a STOP instruction. The program is made of two actions, Sort and Reorder, to be refined at the various stages of its development. Now, let us develop our Sort program a little more. The action "look for an i : a(i) > a(i + 1)" is not entirely defined. Which i must be selected, if it is not unique? We decide to take the first one:
Sort ⊢ look for the first i : a(i) > a(i + 1). "Sorted(1,i)" Reorder ⊣
Reorder ⊢ IF none THEN R ELSE swap(a(i),a(i + 1)). Sort FI ⊣
The assertion "Sorted(1,i)" indicates the fact that, for every j (we selected the first i : a(i) > a(i + 1)),

1 ≤ j < i,  a(j) ≤ a(j + 1)

The effect of Reorder on this assertion is considered. Assuming none is false, swap(a(i),a(i + 1)) leaves a(1 : i - 1) unchanged, so that it remains sorted (if it is not void, that is to say i > 1). Thus,

i > 1 ⇒ Sorted(1,i - 1)
+
Now,if i = 1, swap(a(i),a(i 1))gives a sorted part from 1 to 2 = i i = 1 =$ Sorted(1,i + 1). Assertions are put in Reorder: Reorder I-
“
+1
Sorted(1j)”
IF none THEN R ELSE swap(a(i),a(i
+ 1)).
“Sorted(1, IF i = 1 THEN i + 1 ELSE i - 1)”
Sort FI -I
Sort is replaced by its value in Reorder:

Reorder ⊢ IF none THEN R ELSE swap(a(i),a(i + 1)).
          "Sorted(1, IF i = 1 THEN i + 1 ELSE i - 1)"
          look for the first i : a(i) > a(i + 1). Reorder
          FI ⊣

We refine the action "look for the first i" a little more. It is made by starting from some initial value, then scanning a until an inversion is found:

Look for the first i : a(i) > a(i + 1) ⊢ initialize i ; scan a until a(i) > a(i + 1) ⊣
The initialization is not the same in Sort and in Reorder. In the action Sort, we do not have any information on a, hence the scan must be started from the beginning.

Sort ⊢ i := 1 ; scan a until a(i) > a(i + 1) . Reorder ⊣
Reorder ⊢ IF none THEN R ELSE swap(a(i),a(i + 1)).
          IF i = 1 THEN i := i + 1 ELSE i := i - 1 FI .
          scan a until a(i) > a(i + 1) . Reorder
          FI ⊣
FI -I In the action Reorder, we know that a is sorted from 1 to a given value, so that it does not have to be scanned for an inversion in this part. The same sequence occurring in Sort and Reorder, a new action is introduced: Sort
I-
Scan
I- scan
i:=1 ; Scan4
Reorder
I-
a until a(i) > a(i + 1). Reorder -I
IF none THEN R ELSE swap(a(i),a(i+ 1 ) ) .
IFi=lTHENi:=i+lELSEi:=i-lFI. scan FI + Now, the action scans a until a(i) > a(i + 1) is refined: Scan I- IF j = n THEN Reorder
ELSE IF a(i) > a(i + 1) THEN Reorder ELSE i:=i
FI FI --I
+ 1 . Scan
The predicate "none" may be taken as i = n. Reorder is replaced by its value into Scan:

Scan ⊢ IF i = n THEN IF i = n THEN R
                     ELSE ... FI
       ELSE IF a(i) > a(i + 1)
            THEN IF i = n THEN R
                 ELSE swap(a(i),a(i + 1)).
                      IF i = 1 THEN i := i + 1
                      ELSE i := i - 1 FI . Scan
                 FI
            ELSE i := i + 1 . Scan
       FI FI ⊣

The value of the inner predicates i = n is known: in the THEN alternate of the first selection, the value is TRUE. The selection is removed, and only the TRUE alternate is used. The same thing is done for the ELSE alternate.
Scan ⊢ IF i = n THEN R
       ELSE IF a(i) > a(i + 1)
            THEN swap(a(i),a(i + 1)).
                 IF i = 1 THEN i := i + 1 . Scan
                 ELSE i := i - 1 . Scan FI
            ELSE i := i + 1 . Scan
       FI FI ⊣
A new action

Advance ⊢ i := i + 1 . Scan ⊣

is introduced. The action Scan being much too intricate, the test on i = n is set apart by introduction of a new action. So we have:

Sort ⊢ i := 1 . Scan ⊣
Scan ⊢ IF i = n THEN R ELSE Test FI ⊣
Test ⊢ IF a(i) > a(i + 1)
       THEN swap(a(i),a(i + 1)).
            IF i = 1 THEN Advance
            ELSE i := i - 1 . Scan FI
       ELSE Advance FI ⊣
Advance ⊢ i := i + 1 . Scan ⊣

We notice that Test is called only if i ≠ n. After decrementation of i, this property remains true, so that in Test, Scan is called with i ≠ n, and the predicate i = n of Scan has the value false. Thus, Scan may be replaced by its ELSE alternate in Test, giving:

Sort ⊢ i := 1 . Scan ⊣
Scan ⊢ IF i = n THEN R ELSE Test FI ⊣
Test ⊢ IF a(i) > a(i + 1)
       THEN swap(a(i),a(i + 1)).
            IF i = 1 THEN Advance
            ELSE i := i - 1 . Test FI
       ELSE Advance FI ⊣
Advance ⊢ i := i + 1 . Scan ⊣
D. Systems of Regular Program Equations
We will first try to give an informal presentation of the underlying idea. If we look at the three actions Sort, Scan, and Test of the previous example, we find the action names both in definition position and in the action bodies themselves. If we consider these names as variables, we can think of the whole program as defining a system of equations, the variables of which are the action names. Note that one action, defined as the entry point, plays the role of the program itself. Now, if we give ourselves a set of algebraic rules to manipulate the various elements of these equations (formulas and EXEL symbols or keywords), we can eliminate the variables and end up with only one action, which will be the program itself. Of course, depending on the order in which we perform these substitutions, we may end up with various shapes for the program. But the idea is to do only transformations that keep the logical function of the programs. In order to be more precise, we will need some definitions.
1. Definitions and Scope of Application

For the sake of clarity, and to avoid being too technical, we will suppose that the program we start with does not have any iteration operators but only action calls. This will allow us to give only the definitions needed for understanding the paper.

Final position. The final positions of "a . b" are those of "b"; the final positions of "t → a1 ◇ a2 ◇ ··· ◇ an" are those of "a1," "a2," ..., and "an." Also, the EXIT n occurring in p nested iterations are in final position if and only if p ≤ n.

Regular action. An action is regular if and only if all the action calls in its body are in final position. For example,

ACT1 ⊢ f1 . t → ACT1 ◇ ACT2 ⊣

is regular;

ACT3 ⊢ f2 . t → ACT1 ◇ ACT2 . f3 ⊣

is not regular. (f1, f2, f3, t are formulas; the ACTi are action calls.)

Systems of regular equations. An equation is said to be regular if the corresponding action is regular. Many programs may be defined by a system of regular program equations, as in the preceding example. Let it be schematized as

Xi = fi(X1, ..., Xn, R),   i = 1, 2, ..., n

Solving this system is eliminating all action variables except the one which is distinguished as the program, i.e., the entry point.

Substitution rules in systems of regular equations. There are different substitution rules, depending on whether the action name occurs in its own definition or not.
Rule (a). If the action Xi is such that Xi does not occur in fi, then fi may be substituted for Xi in all other equations. The equation Xi = fi is then removed, giving a new equivalent system with one less equation.

Rule (b). If Xi occurs in fi, substitution will give new calls of Xi. Hence this variable cannot be eliminated by a simple substitution. Let

Xi = fi(Xi, Xj≠i, R)

be such an equation. It has been proved (Arsac, 1977) that the following rule transforms this equation into an equivalent one: in fi,

Xi is replaced by EXIT (0),
Xj≠i is replaced by the conventional notation Xj + 1,
R is also replaced by R + 1;

then the resulting formula is enclosed between iteration brackets:

Xi = {fi[Xi/EXIT (0), Xj≠i/Xj + 1, R/R + 1]}

By repeated use of this rule, a variable X may occur in an equation as X + k with an arbitrary positive increment k. Replacing X by X + 1 in X + k gives X + 1 + k = X + (k + 1). Now we must explain the notation X + n, i.e., what is this operation "+"? When substituting Xi by fi [rule (a)], the effect of +n is to transform all the EXIT p (p ≥ 0) which are in final position into EXIT (p + n). An R in final position being equivalent to an EXIT (0), it becomes EXIT n. Rule (b) and the "+" operation allow replacement of the equation

Xi = fi(Xi, Xj≠i, R)

by a new equivalent equation without occurrence of Xi. After this, Xi may be eliminated by substitution. Using these rules, the whole system may be replaced by a unique equation X1 = φ, which is the wanted program. And, which is quite interesting, both rules and the associated "+" operation can be performed in an automatic way, i.e., by a program.

2. Example

This method is applied to the system obtained in Section IV,C:

Sort ⊢ i := 1 . Scan ⊣
Scan ⊢ IF i = n THEN R ELSE Test FI ⊣
Test ⊢ IF a(i) > a(i + 1)
       THEN swap(a(i),a(i + 1)).
            IF i = 1 THEN Advance
            ELSE i := i - 1 . Test FI
       ELSE Advance FI ⊣
Advance ⊢ i := i + 1 . Scan ⊣
Test occurs in the equation defining Test. The right-hand formula is enclosed between loop brackets, Advance is replaced by Advance + 1, Test by EXIT (0) [rule (b)]:

Test ⊢ {IF a(i) > a(i + 1)
        THEN swap(a(i),a(i + 1)).
             IF i = 1 THEN Advance + 1
             ELSE i := i - 1 . EXIT (0) FI
        ELSE Advance + 1
        FI} ⊣

Advance is replaced by its value [rule (a)]. By doing so, Advance + 1 is changed into i := i + 1 . Scan + 1. Then Test is copied into Scan [rule (a)]:
Scan ⊢ IF i = n THEN R
       ELSE {IF a(i) > a(i + 1)
             THEN swap(a(i),a(i + 1)).
                  IF i = 1 THEN i := i + 1 . Scan + 1
                  ELSE i := i - 1 . EXIT (0) FI
             ELSE i := i + 1 . Scan + 1
             FI}
       FI ⊣
There are occurrences of Scan in the right-hand member, so rule (b) applies. The body of Scan is enclosed between loop brackets; Scan is replaced by EXIT (0), and Scan + 1 leads to EXIT (0) + 1 = EXIT 1 = EXIT. Rule (b) states that only the EXIT n in final position are incremented. The R in the first line is in final position, hence it becomes EXIT. The EXIT (0) in the ELSE part of the innermost IF is not in final position, so it is not incremented; being equivalent to an empty formula, it is simply dropped. EXIT (0) is removed from the result:

Sort ⊢ i := 1 .
       {IF i = n THEN EXIT
        ELSE {IF a(i) > a(i + 1)
              THEN swap(a(i),a(i + 1)).
                   IF i = 1 THEN i := i + 1 . EXIT
                   ELSE i := i - 1 FI
              ELSE i := i + 1 . EXIT
              FI}
        FI} ⊣

So we have a program Sort, made of two nested loops, that has been mechanically derived from the system of program equations. It may be made a little clearer by simple syntactic transforms (Arsac, 1977; Cousineau, 1977a). They are given here without formalism or justification, being considered as "obvious." Instead of doing i := i + 1 and then going out of the loop, the loop is first exited, then i := i + 1 is performed.
Sort ⊢ i := 1 .
       {IF i = n THEN EXIT FI .
        {IF a(i) ≤ a(i + 1) THEN EXIT FI .
         swap(a(i),a(i + 1)).
         IF i = 1 THEN EXIT ELSE i := i - 1 FI} .
        i := i + 1} ⊣
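A direct transliteration of this solved program into a conventional language makes the two nested loops visible. Here is our Python rendering (not from the original text), with break standing for EXIT and 0-based indexing replacing the text's 1-based a(i):

```python
# Our transliteration of the solved two-loop program: each EXEL
# iteration bracket becomes a while-loop and each EXIT becomes break.

def sort(a):
    """In-place sort; a(i) in the text is a[i - 1] here (0-based)."""
    n = len(a)
    i = 1                                   # i := 1
    while True:                             # outer { ... }
        if i == n:                          # IF i = n THEN EXIT FI
            break
        while True:                         # inner { ... }
            if a[i - 1] <= a[i]:            # IF a(i) <= a(i + 1) THEN EXIT FI
                break
            a[i - 1], a[i] = a[i], a[i - 1] # swap(a(i), a(i + 1))
            if i == 1:                      # IF i = 1 THEN EXIT
                break
            i = i - 1                       # ELSE i := i - 1
        i = i + 1                           # i := i + 1

v = [5, 1, 4, 2, 3]
sort(v)
# v is now [1, 2, 3, 4, 5]
```

The algorithm walks the vector forward and, on each inversion, swaps and backs up; it is the behavior the derivation produced, not a hand-written sort.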
This program may be greatly improved. k being the value of i before entering the inner loop, it may be proved that a is sorted from 1 to k + 1 after this loop, so that the last statement may be changed into i := k + 1.

Sort ⊢ i := 1 .
       {IF i = n THEN EXIT FI .
        k := i .
        {IF a(i) ≤ a(i + 1) THEN EXIT FI .
         swap(a(i),a(i + 1)).
         IF i = 1 THEN EXIT ELSE i := i - 1 FI} .
        i := k + 1} ⊣

E. Programming Methodology
This example illustrates the EXEL programming methodology. A first version of a program is written in terms of actions, in a top-down way, without any consideration of efficiency; the only important point is correctness. Then the program is carefully examined for possible simplification or for detection of sources of inefficiency. Some kind of algebraic simplification is made on the system of program equations. The program may be kept as a system of equations and executed by the EDELWEISS system. Alternatively, the system of equations may be solved, giving a structured program. It will be more efficiently performed by the system (action calls introduce some overhead), and other sources of inefficiency are possibly more easily detected on such a form. We have been able to extend the solution to some systems of program equations which are not regular, i.e., in which action calls do not have to be in final position. For these cases, it is a very powerful way to transform recursive procedures into iterative ones (Arsac, 1977). A lot of experiments have been made with this programming methodology. It appears to be very powerful, simplifying the programmer's task by separation of concerns: (1) write a correct program; (2) improve it, not only for better efficiency, but for even greater simplicity or readability. It is used now in programming lectures, even for FORTRAN programmers. It is very easy to transform an EXEL program into a FORTRAN one by "hand compiling."
F. Is EXEL a GO-TO-less Language?
It has been said that EXIT statements are some kind of branch statements, and so they reintroduce GO TOs. This could be said of every loop statement, which realizes a branch from the end of the iterand to its beginning, and out of the loop according to the value of some predicate. Using top-down programming, only one loop must be written at a time. If at some point of the loop the final assertion (i.e., the goal of the loop) is reached, then an EXIT statement is written. It is not at all a GO TO to some point, but just the indication that the loop is completed. EXIT (p) statements, with p > 1, should not be written directly by the programmer; they appear during program manipulation. In this way, the programmer is not faced with the question: how many loops must be exited? More interesting is the relation between action names (or calls) and GO TOs. Actions have been said to be like parameterless procedures on global variables. This has been proposed by Floyd and Knuth (1971) as a way to avoid GO TO statements. We may also consider action names as labels. An equation
Xi = fi(X1, ..., R)

is then a sequence of statements labeled Xi. Right occurrences of action names are interpreted as statements GO TO Xj, GO TO R being the exit of the program. Conversely, a GO TO program may be replaced by a system of regular program equations, then solved into an EXEL program (Nolin and Ruggiu, 1973; Arsac, 1977). This is the simplest way to structure a FORTRAN program (Baker, 1977; Urschler, 1973).

V. LARGE-SCALE SYSTEMS: THE EDELWEISS ARCHITECTURE

A. Description of EDELWEISS

1. Principle of Operating

EDELWEISS has been designed according to the structure of EXEL. EXEL programs are wholly parenthesized structures built on parenthesized statements:

selection: IF ... THEN ... ELSE ... FI
iteration: { ... EXIT ... }

Using the three hierarchical levels, an EXEL program may be split into segments very easily. For example, every action may be a segment. Because
of its parenthesized structure, it may be represented by a tree, and every node of the tree is the root of a subtree, thus a possible segment. That is to say, every sequence of statements in a program is a segment of the program. Thus, an EXEL program is easily split into parts which can be loaded separately in the central memory, with a good probability of a long enough lifetime. The lifetime of working sets is thus increased, the rate of page faults is reduced, and the performance of systems based on virtual memory is increased. The choice of efficient data structures can also greatly improve the lifetime of the working sets. The presentation and discussion of EXEL in Section IV does not assume anything about data structures. Thus their choice is open, and it will usually depend on the chosen formula language. A good example is the APL array. Such structures used in an EXEL-APL system (Section V,B) give a long execution time for each segment as well as useful information on the way they are used; in particular, sequential processing allows efficient prediction of the segments of data to be loaded if data segmentation is wished.

FIG. 1. The EDELWEISS system.

EDELWEISS is a large-scale multiuser, multiprocessor machine (Fig. 1). Each processor performs a special task. Processors work asynchronously according to a producer-consumer relation scheme. In Fig. 1, only the data paths have been represented. Three processors are associated with the three logical levels of EXEL:
SCRIBE processes procedure calls. GREFFIER interprets control structures in action bodies. ROBOT executes the formulas.
225
THE EDELWEISS SYSTEM
Two classical functions are performed by two other processors:

SOUTIER is the memory manager.
HUISSIER handles I/O and user communication, and associates to procedures the information needed by the processors during execution (cf. Section V,A,3): control structure for GREFFIER, descriptors and complexity coefficients for INTENDANT, data access functions for SOUTIER.
A sixth processor, INTENDANT, performs program and data segmentation, adjusting online the data flow to ROBOT's resources. Roughly, the execution of a program goes as follows:

(i) HUISSIER receives an execution request. It determines the first procedure to be executed and asks SCRIBE to perform the corresponding call. It sends to GREFFIER the control structure to be executed.
(ii) SCRIBE asks GREFFIER for control structure execution.
(iii) GREFFIER, during this execution, sends to INTENDANT the sequence of formulas to be executed. This sequence of formulas is a program segment.
(iv) INTENDANT adapts segments to ROBOT's resources and defines a fragment.
(v) SOUTIER creates the working set and the memory requirements necessary for the execution of this fragment.
(vi) ROBOT executes the fragment, sends the results to SOUTIER and the control information (if any) to GREFFIER (results of test formula evaluation for conditional branching).

There can be several ROBOTs working together, independently. The main properties of this architecture are the following:

(i) Multiprocessing without conflicts between processors, because they are specialized in independent functions.
(ii) Natural multiprogramming, because changing contexts between the various segments is automatic.
(iii) Resource allocation through a prevision of program behavior based on their structure.
(iv) The communications between processors are based on a producer-consumer scheme. They need only to communicate through FIFO files. The length of these files is determined by an equilibrium condition between the multiprogramming rate and the service rate of each processor. A mathematical study of this point is presented in Section V,C.
2. Management of EDELWEISS
The multiprogramming, in the classical architecture, is defined by a timing device which controls the multiplexing between the user programs. So the system must be able to handle the successive states of programs during their execution. Referring to EXEL, we can distinguish four states: initialization (which starts execution), procedure, action, and formula (see Fig. 2).

FIG. 2. States of different programs.
At one time the processor contains different programs which are in different states: for example, it may contain the programs circled in Fig. 2.

FIG. 3. Program states and the processors of EDELWEISS.

To multiplex the processor, the transitions between the different states must
be executed by a special program, generally called the supervisor; this latter program requires that the processor be in a special state, the master state. The states of the user programs are called slave states. Some instructions are available only in the master state. This multiplexing method is the main reason that the efficiency of multiprogramming is limited. In EDELWEISS the multiprogramming is defined through a logical segmentation of programs, based on the EXEL structure. Each state of the user programs is taken into account by some processor, as in Fig. 3. So we can say that each processor works in monoprogramming, the multiprogramming being spread among the different processors. Each program is split into segments, and a segment defines a task for the corresponding processor. The processors communicate among themselves through waiting queues (Fig. 4).

FIG. 4. Processor communications.
The general processing of a processor is:

Repeat forever:
  get a segment from an input queue;
  complete the execution of the task defined by this segment;
  put the resulting segment into the output queue for the corresponding processor.

So the processors are not multiplexed between the different tasks. In the same way, there is no supervisor or basic monitor which controls the synchronization of concurrent processes. Instead, the management of the system is distributed in each processor, at the level at which the synchronization is reduced to the producer-consumer rule. Each processor has its local data. A data set, which constitutes the common data of the system, is shared by the processors. It consists of:
the set of waiting queues;
the internal representation of the user programs.

This latter has a tree structure described by the diagram in Fig. 5.
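The get/work/put loop and the producer-consumer queues can be sketched in miniature. The processor names below follow the text, but the queue encoding, the STOP sentinel, and the stand-in "formulas" are our own:

```python
# Miniature sketch (ours) of the EDELWEISS scheme: specialized workers
# synchronized only by the producer-consumer rule over FIFO queues.
import queue
import threading

greffier_out = queue.Queue()   # GREFFIER -> INTENDANT: segments
intendant_out = queue.Queue()  # INTENDANT -> ROBOT: fragments
results = queue.Queue()

STOP = object()                # end-of-program sentinel (our convention)

def greffier(segments):
    for s in segments:                 # emit segments in program order
        greffier_out.put(s)
    greffier_out.put(STOP)

def intendant():
    while True:                        # repeat forever: get, work, put
        s = greffier_out.get()
        if s is STOP:
            intendant_out.put(STOP)
            return
        for frag in s.split():         # "cut the segment into fragments"
            intendant_out.put(frag)

def robot():
    while True:
        f = intendant_out.get()
        if f is STOP:
            return
        results.put(f.upper())         # stand-in for executing a formula

threads = [threading.Thread(target=greffier, args=(["a b", "c t", "f g"],)),
           threading.Thread(target=intendant),
           threading.Thread(target=robot)]
for t in threads:
    t.start()
for t in threads:
    t.join()

executed = []
while not results.empty():
    executed.append(results.get())
# executed == ['A', 'B', 'C', 'T', 'F', 'G']
```

No supervisor appears anywhere: each worker blocks on its input queue, which is exactly the distributed producer-consumer management the text describes.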
FIG. 5. System data structure.

The tree ends with the sequence of procedures called during execution of the program. Of course this sequence increases at each procedure call and decreases at each procedure return. It is empty when the program is finished. The information associated with programs is defined by the following:

Fi = the table of files used by the user Ui
V1 = the table of local variables used by the procedure P1
...
Vn = the table of local variables used by the procedure Pn

As a procedure only knows its local variables and the files, the data of a segment are completely defined by the user (Ui), its file table (Fj), and the local variables of the running procedure (Pk). So there is no difference between the procedure calls and the multiprogramming: in the first case Pk is changed; in the second it is Ui. Therefore the segments have the following structure:

Ui₁ Fj₁ Vk₁ s₁ ... sₚ Ui₂ Fj₂ Vk₂ s'₁ ... s'ₚ Ui₃ ···
Ui₁, Fj₁, Pk₁ define the context of s₁, ..., sₚ, etc. Each segment contains the name of the statement, the list of operators occurring in the statement, the names of the data occurring in the statement, and the descriptors of these data.

3. Segmentation and Fragmentation
a. The Function of GREFFIER. The main difficulty in implementing EDELWEISS is the segmentation performed by GREFFIER to feed ROBOT. Roughly, the segmentation points of GREFFIER for ROBOT are the tests. For example, suppose that the control structure of an action of the procedure P3 of the user Ui, the file of which is Fi, is as follows:

a . b . {c . t → e . EXIT ◇ f . g} . h

GREFFIER generates the segment Ui Fi V3 a b c t ... The test t defines the end of the segment for Ui. GREFFIER will then execute the program of user Uj. When it gets the value of t, it will be able to resume the execution of the program of Ui: if t is true, it will generate

Ui Fi V3 e h ...

if t is false, it will generate

Ui Fi V3 f g c t Uk ...
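The cutting of a formula stream at test formulas can be sketched as follows (the encoding is ours; GREFFIER's real segments also carry the Ui Fi Vk context and span procedure calls and returns):

```python
# Toy sketch (encoding ours): cut a linear stream of formulas into
# segments, each ending at a test, since a test's outcome decides
# which formulas come next.

def segment(stream, is_test=lambda op: op == 't'):
    """Yield runs of formulas, each closed by a test (or stream end)."""
    seg = []
    for op in stream:
        seg.append(op)
        if is_test(op):       # a test closes the segment for this user
            yield seg
            seg = []
    if seg:
        yield seg

# For the control structure a . b . {c . t -> e . EXIT <> f . g} . h,
# the first pass through the loop emits the segment a b c t,
assert list(segment(['a', 'b', 'c', 't'])) == [['a', 'b', 'c', 't']]
# and the false branch resumes with f g c t.
assert list(segment(['f', 'g', 'c', 't'])) == [['f', 'g', 'c', 't']]
```

Between two tests the future references are known exactly, which is what makes the prevision described below possible.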
UiF', 1/,fgct determines the next references which will happen after the test t, when it is false. But this prevision is limited to the segmentation points. So this algorithm draws near again to the ideal one which would replace the pages which will remain unreferenced for the longest period-if they could be known.
This leads to a new type of working set, which is described in Section V,B. The classical working sets are based on the past references of programs. This policy is good if these references are strongly correlated; but generally, for large programs, it is not the case. The EDELWEISS working set is defined by logical analysis of the future behavior of programs; it is no longer probabilistic, but deterministic. Statistical evaluations on the execution of APL programs have shown (cf. Section V,B,4) that this working set is 10-100 times better than the classical ones. This makes possible the use, as the main memory of EDELWEISS, of a very large and slow memory, such as a bubble memory, and for the ROBOT local memory a small and fast memory; such a main memory would be of the order of 10⁷ bytes, and the ROBOT memory of the order of 10⁵ bytes. We have not built the EDELWEISS machine. Instead we have built a small single-user machine, EXELETTE, which is described in Section VI. The working of EXELETTE is thus simplified, but the memory management is based on the future behavior of programs, according to the EDELWEISS principles. So the main memory of EXELETTE can be a floppy disk (of 256 K bytes), and the memory of the microprocessor which stands for ROBOT is about 12 K bytes. EXELETTE is not yet finished, but the first results of this management are very encouraging.

b. The Function of INTENDANT. GREFFIER generates segments for INTENDANT. From these segments, INTENDANT creates the fragments, which fit, in time and size, the ROBOT resources. For that purpose, every object belonging to a segment (data and operators) is described by two complexity coefficients, one for the time, the other for the size. By composition, INTENDANT determines the complexity χₛ of each segment:

χₛ = (χₛᵗ, χₛᵐ)

where χₛᵗ is the time complexity of the segment, and χₛᵐ is the memory complexity. These complexities are computed according to the semantics of the operators occurring in the statements of the segments. For example, the complexity of the APL statement s:

D ← A + B + C

will be

χₛᵗ = (χᵗ(←) + 2χᵗ(+)) χᵐ(A)

where χᵗ(←) and χᵗ(+) are the complexities of APL assignment and addition, and χᵐ(A) is the size of the array A;

χₛᵐ = 6χᵐ(A)

since there are four arrays (A, B, C, D) and two intermediary arrays (for the
two additions), the size of all of them being the same (and equal to the size of A).

This complexity is compared to the two parameters of the system: the size of the local memory of ROBOT and the time slice allocated to a fragment. These parameters determine the power χ0 of ROBOT. If χs < χ0, then INTENDANT groups consecutive segments together until it gets a fragment whose complexity, equal to the sum of the complexities of the segments belonging to it, reaches χ0. If χs = χ0, the segment itself is the fragment. If χs > χ0, then INTENDANT cuts the segment into as many fragments as necessary for the complexity of each of them to be less than χ0. Let us suppose, in the preceding example, that χs is about three times χ0.
Then INTENDANT would have cut the segment S into three fragments:

(D)₁³ ← (A)₁³ + (B)₁³ + (C)₁³
(D)₂³ ← (A)₂³ + (B)₂³ + (C)₂³
(D)₃³ ← (A)₃³ + (B)₃³ + (C)₃³

where (X)ⱼⁱ designates the jth part of X when it is cut into i parts.

c. The Function of SOUTIER. SOUTIER feeds ROBOT from the fragments generated by INTENDANT. However, the data belonging to a same fragment may be scattered in the main memory, which by hypothesis is a slow, sequential-access memory, like a disk. The essential function of SOUTIER is the dynamic reorganization of the memory to lower the transfer time of the data between the ROBOT local memory and the main memory. For that purpose SOUTIER stores together, as much as possible, in contiguous blocks of the main memory, the data produced by ROBOT (after execution of the last fragment) which will be used by the following segments of the same procedure. SOUTIER settles the placement of data from a proximity index associated with each variable of the segment. The corresponding algorithm has been described in Widory and Roucairol (1977). Furthermore, for large arrays, SOUTIER can decide to access them by columns, or rows, or planes, etc., according to the computations defined in the segment. This is important when a segment has to be cut into several fragments. For example, let us suppose that the APL formula

C ← (⌽A) + B + A

must be cut into three fragments. In this formula, A, B, and C are vectors. The three fragments will be

α = (C)₁³ ← (⌽A)₁³ + (B)₁³ + (A)₁³
β = (C)₂³ ← (⌽A)₂³ + (B)₂³ + (A)₂³
γ = (C)₃³ ← (⌽A)₃³ + (B)₃³ + (A)₃³

But the APL operator ⌽ reverses the order of the components, so that the data fragments associated with α, β, γ that SOUTIER will get for ROBOT will be

α: (A)₃³, (B)₁³, (A)₁³
β: (A)₂³, (B)₂³, (A)₂³
γ: (A)₁³, (B)₃³, (A)₃³
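The grouping-and-cutting rule of INTENDANT described above can be sketched in a few lines. This is a hypothetical Python sketch: the function name `fragment`, its inputs, and the equal-size split of an oversized segment are illustrative assumptions, not the actual EDELWEISS algorithm.

```python
import math

def fragment(complexities, chi0):
    """Sketch of INTENDANT's fragmentation rule.

    complexities: per-segment complexity values chi_s.
    chi0: the "power" of ROBOT (local memory size / time slice).
    Returns a list of fragments; each fragment is a list of
    (segment_index, fraction) pairs, where fraction < 1.0 marks a
    segment that had to be cut into several pieces.
    """
    fragments, current, total = [], [], 0.0
    for i, chi in enumerate(complexities):
        if chi > chi0:
            if current:                      # flush pending small segments
                fragments.append(current)
                current, total = [], 0.0
            k = math.ceil(chi / chi0)        # cut into k pieces, each <= chi0
            fragments.extend([[(i, 1.0 / k)]] * k)
        else:
            current.append((i, 1.0))
            total += chi
            if total >= chi0:                # grouped enough consecutive segments
                fragments.append(current)
                current, total = [], 0.0
    if current:
        fragments.append(current)
    return fragments
```

For instance, with segment complexities (2, 3, 12, 4) and χ0 = 5, the first two segments are grouped into one fragment, the third is cut into three pieces, and the last stands alone.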
B. The Working Set in the EDELWEISS System

1. Introduction
The working set is a parameter that characterizes the locality properties of the working space needed for executing a program (or process). This working space is defined by two segments: the program segment Sp(t), which contains the statement to execute at time t, and the data segment Sd(t), which contains the values necessary for executing that statement. These segments are defined by a logical segmentation. It is necessary to map these logical segments onto physical segments of equal length, called page frames, so the logical segments are described in units of the same length, called pages. Therefore the segment S(t) = Sp(t) ∪ Sd(t) is defined by the pages it contains. Let us designate by |ψ(t)| the number of pages contained in ψ(t), the set of pages of S(t). During program execution, the machine must access information contained in the page set ψ(t). These accesses are characterized by the name r_t of the page to which the information belongs: r_t is called a reference. If the set of all the pages is numbered

N = {1, 2, ..., n},   |N| = n

these references are numbers r_t ∈ N, and ψ(t) is a subset of N:

ψ(t) ⊆ N,   |ψ(t)| ≤ n
The main problem is therefore to build a segmentation algorithm such that the sequence of segments S(t) takes into account the behavior of programs, or at least of a class of programs. The difficulties come from (a) too great a dispersion of the data related to program activity in a computer system, and the paucity of convincing statistical measures; and (b) the poor structure of the usual programming languages, which prevents analysis of this behavior at compile time, or even at run time. In the EDELWEISS system, it has been shown that cause (b) is in great part overcome, thanks to the good structure that EXEL gives to programs (cf. Section V,A,3). Cause (a) is much weakened, for it will be shown (cf. Section V,B,1,d) that it is enough to meet some threshold conditions; and if APL is chosen as the formula language, these conditions will be satisfied. In classical systems, one could think that the difficulties can be surmounted by asking programmers to give sufficient information to determine the working set. As a matter of fact, this method is impracticable for many reasons: pieces of programs can be written by different persons on different machines; the programmers may not know how to use the resources of the system optimally; and last, the system must optimize the resources for the whole set of users, and this global optimum is not necessarily the sum of the local optimums of each program. On the contrary, the EDELWEISS system is able to optimize the management of resources as described in Section V,A,3. Therefore, the working set in EDELWEISS is very different from the classical working sets.

2. The Working Set in the Classical System

a. Definitions and Hypotheses. In a classical system, it is very difficult to make predictions. A retarded working set is used, i.e., one determined by the recent past.
It is based on the fact that the frequency with which a page is referenced changes slowly with time (quasi-stationarity hypothesis); besides, neighboring references are strongly correlated, but as their distance increases they become asymptotically uncorrelated. Generally, to explain this, it is noted that neighboring references usually belong to the same module of a program, and thus to neighboring program and data segments. Besides, it is usually considered that compilation acts like a low-pass filter which eliminates large variations in the addresses of the referenced information. It is worth remarking that these arguments are related to the method of programming and to the properties of languages and compilers. The classical working set is determined by a parameter T (called the window size), which is a processing-time interval that can be supposed identical to real time. This time is discrete: t₁, t₂, ... at regular intervals.
The retarded working set (RWS) is then the set of distinct pages referenced in the time interval [t − T, t] (a window of size T ending at the current time t).

Execution of a program gives the sequence ρ of references

ρ = r₁, r₂, ..., r_t, ...

in the set of pages N = {1, 2, ..., n}. The working set at time t, W(t, T), is then

W(t, T) = {i | ∃x, 0 ≤ x ≤ T − 1, i = r_{t−x}}

The instantaneous segment is this working set: ψ(t) = W(t, T). Its size is the number of pages in this segment:

w(t, T) = |W(t, T)|
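The window definition above translates directly into executable form. A Python sketch (names are illustrative; the reference string is 0-indexed here, whereas the text numbers references from 1):

```python
def retarded_working_set(refs, t, T):
    """W(t, T): the distinct pages among the last T references up to time t,
    i.e. {r_{t-x} : 0 <= x <= T-1}."""
    lo = max(0, t - T + 1)
    return set(refs[lo:t + 1])

def ws_size(refs, t, T):
    """w(t, T) = |W(t, T)|."""
    return len(retarded_working_set(refs, t, T))
```

For example, on the string 1, 2, 1, 3, 2, 2, 4 with T = 2, the working set at the fourth reference is {1, 3}.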
Three hypotheses are assumed (Denning and Schwartz, 1972):

H1: the sequence ρ is unending;
H2: r_t is a stationary stochastic variable, i.e., it is independent of the absolute time origin;
H3: ∀t > 0, r_t and r_{t+x} become uncorrelated as x → ∞.
From Denning and Schwartz (1972), we recall here some properties useful for what follows.

b. Average Size. Let

s(k, T) = (1/k) Σ_{t=1}^{k} w(t, T)

denote the working-set size averaged over the first k references. H2 allows us to state

s(T) = lim_{k→∞} s(k, T)

Generally this average size depends on T; it is an increasing function:

T₁ ≤ T₂ ⇒ s(T₁) ≤ s(T₂)

Besides, we have the inequalities

1 = s(1) ≤ s(T) ≤ min{n, T}
Under hypotheses H1, H2, and H3, it can be proved that the average size is equal to the stochastic average, i.e.,

s(T) = Σ_{i=1}^{n} i p(i)
where p(i) is the probability that the working-set size is i.

c. Missing-Page Rate. A page fault occurs when at time t + 1 a reference is made to a page that does not belong to W(t, T). It is thus described by the binary variable Δ:

Δ(t, T) = 1 if r_{t+1} ∉ W(t, T),   Δ(t, T) = 0 otherwise

If Δ = 1, there is one page fault. Thus the fundamental assumption in the theory of the working set, i.e., the quasi-stationarity of references to the same page, is equivalent to the assumption that the probability of Δ = 1 is small:

a = Pr(Δ(t, T) = 1) ≪ 1     (1)

But Δ(t, 0) = 1, and

T₁ ≤ T₂ ⇒ Δ(t, T₁) ≥ Δ(t, T₂)

When T → ∞, Pr(Δ(t, T) = 1) → 0. Hence, to satisfy Eq. (1), we generally need to take T large enough, which may lead to too large an average size; some trade-off must be found. The missing-page rate is defined by

m(T) = lim_{k→∞} (1/k) Σ_{t=0}^{k−1} Δ(t, T)

It characterizes the variation of the average size:

w(t + 1, T + 1) = w(t, T) + Δ(t, T)

By summing and taking the limit, we get

m(T) = s(T + 1) − s(T)

and, by induction,

s(T) = 1 + Σ_{x=1}^{T−1} m(x)

d. Interreference Interval. Suppose that two successive references to page i occur at times t − x_i and t. We call x_i the interreference interval at time t for page i. Let F_i(x) be the distribution function of the variable x_i and f_i(x) its probability density:

f_i(x) = F_i(x) − F_i(x − 1)
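The working-set identities of this subsection, in particular m(T) = s(T + 1) − s(T), can be checked empirically on a random reference string. A Python sketch (a finite string only approximates the limiting quantities, so the check is approximate; all names are illustrative):

```python
import random

def w(refs, t, T):
    # w(t, T): number of distinct pages among the last T references up to t.
    return len(set(refs[max(0, t - T + 1):t + 1]))

def s(refs, T):
    # Empirical s(k, T): working-set size averaged over the whole string.
    return sum(w(refs, t, T) for t in range(len(refs))) / len(refs)

def m(refs, T):
    # Empirical missing-page rate: fraction of references causing a fault.
    k = len(refs) - 1
    faults = sum(1 for t in range(k)
                 if refs[t + 1] not in refs[max(0, t - T + 1):t + 1])
    return faults / k

random.seed(0)
refs = [random.randrange(8) for _ in range(20000)]
# m(T) should be close to s(T+1) - s(T) on a long stationary string.
gap = abs(m(refs, 16) - (s(refs, 17) - s(refs, 16)))
```

On this string the gap is small (edge effects of order 1/k only), and the monotonicity of Δ in T shows up as m decreasing in T.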
The relative frequency of references to page i is

λ_i = lim_{k→∞} (1/k) y_i

where y_i is the number of references to page i in the sequence r₁, ..., r_k. Note that

Σ_{i=1}^{n} λ_i = 1

The overall density and distribution functions are defined, respectively, to be

f(x) = Σ_{i=1}^{n} λ_i f_i(x)   and   F(x) = Σ_{i=1}^{n} λ_i F_i(x)

The mean interreference intervals are

x̄_i = Σ_{x>0} x f_i(x)   and   x̄ = Σ_{i=1}^{n} λ_i x̄_i

Page i will be called recurrent if it is referenced an infinite number of times in ρ. The stationarity hypothesis shows that in this case λ_i ≠ 0 and x̄_i = 1/λ_i. If a page is nonrecurrent, then λ_i = 0. Let N_R = {j₁, ..., j_{n_r}} be the set of recurrent pages, and n_r = |N_R| their number. It can be shown that x̄ = n_r. It follows that nonrecurrent pages make no contribution to s(T) and m(T); therefore it may be assumed that N = N_R.

3. The Working Set in the EDELWEISS System
a. Definition. The management of EDELWEISS is based upon an anticipated allocation of resources. Here, the resources that concern us are the local memory size and the time slice of ROBOT. The fragmentation algorithm enables us to cut programs into pieces that fit these resources. Therefore, in EDELWEISS it is an advanced working set (AWS) that represents the behavior of programs.

The fragments are multiplexed at times t₁ and t₂. For an intermediary time t_i, the working set Φ(t_i) is the fragment ψ(t₁) called at t₁:

t₁ ≤ t_i ≤ t₂ ⇒ Φ(t_i) = ψ(t₁)
Let τ(ψ) be the execution time of ψ:

τ(ψ(t₁)) = t₂ − t₁

The fragmentation algorithm leads us to bound this number:

∀t_i,   τ₁ ≤ τ(ψ(t₁)) ≤ τ₂

Now, τ₁ ≤ τ(ψ) ≤ τ₂ if and only if

W(t₂, τ₁) ⊆ Φ(t_i) ⊆ W(t₂, τ₂)     (2)

Considering the sizes, with φ(t_i) = |Φ(t_i)|:

w(t₂, τ₁) ≤ φ(t_i) ≤ w(t₂, τ₂)

Let us characterize the AWS by the mean size σ:

σ = lim_{k→∞} (1/k) Σ_{i=0}^{k−1} φ(t_i)

The relations (2) lead us to write

s(τ₁) ≤ σ ≤ s(τ₂)

So the asymptotic behavior of the AWS can be bounded by the RWS related to the time intervals τ₁ and τ₂. Statistical measures on the execution of APL programs, on which the segmentation has been simulated, have shown (cf. Section V,B,3,d) that the bounds τ₁ and τ₂ are

τ₁ ≃ 3·10³,   τ₂ ≃ 3·10⁵

These quantities are measured in numbers of instruction cycles of the central processing unit (IBM 370). The first value τ₁ is very precise, the second much less so, and τ₁ yields a greatest lower bound of the AWS. But the size of the ROBOT memory is the true least upper bound of σ. Let n₀ be this size. The relations (2) then become

w(t₂, τ₁) ≤ φ(t_i) ≤ min(n₀, w(t₂, τ₂))     (3)

So we get

s(τ₁) ≤ σ ≤ min(n₀, s(τ₂))
b. The RWS Equivalent to the AWS. Let us suppose a program is perfectly regular, that is, the execution of each fragment lasts the same time τ₀; so we can choose τ₁ = τ₂ = τ₀, and in (3) the two bounds become equal:

φ(t_i) = w(t₂, τ₀) ≤ n₀

We shall say that w(t₂, τ₀) is the retarded working set equivalent to φ(t_i); and in this case, generalizing to the average sizes, we put

σ = s(τ₀) ≤ n₀

In the general case, the probability that a fragment lasts the time τ is f(τ), with

∫₀^∞ f(t) dt = 1

The average retarded working set s̄ equivalent to σ is then defined by

s̄ = ∫₀^∞ f(t) s(t) dt     (4)

c. Efficiency of the AWS. To compare the EDELWEISS management to that of classical machines, we define the efficiency Q of the AWS as the ratio

Q = s̄/σ

In fact, the actual efficiency can be less than Q if the ROBOT memory surplus n₀ − φ(t_i) is not used. In that case the true efficiency Q′ is

Q′ = s̄/n₀

Therefore, if Q > 1 (or Q′ > 1), there is a gain; and if Q < 1, there is a loss.

4. Valuation of the Efficiency of the EDELWEISS AWS

a. Independent Reference Model. To evaluate the efficiency, we must get the values f(t) and s(t) which occur in the integral (4). Therefore, we have to define some hypotheses and get some statistical measures on EDELWEISS. It is known that s(t) can be expressed in the independent reference model (Denning and Schwartz, 1972). This model is close enough to reality as soon as t is great enough, for then the references are very nearly uncorrelated; we shall see in the following that the EDELWEISS AWS indeed yields great values of t. Under this hypothesis, the probability of a reference to page i is λ_i. The distribution function F_i(x) of the interreference interval is then

F_i(x) = 1 − (1 − λ_i)^x

and the density function

f_i(x) = F_i(x) − F_i(x − 1) = λ_i (1 − λ_i)^{x−1}
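The geometric interreference law above implies that the f_i(x) sum to 1 and that the mean interval is 1/λ_i, as stated earlier for recurrent pages. A few lines of Python confirm this numerically (the value of λ is an arbitrary illustration):

```python
def F(lam, x):
    """Interreference distribution under independent references:
    F_i(x) = 1 - (1 - lam_i)^x."""
    return 1.0 - (1.0 - lam) ** x

def f(lam, x):
    """Density f_i(x) = F_i(x) - F_i(x-1) = lam_i (1 - lam_i)^(x-1)."""
    return lam * (1.0 - lam) ** (x - 1)

lam = 0.05
total = sum(f(lam, x) for x in range(1, 2000))   # ~1 (truncated tail)
mean = sum(x * f(lam, x) for x in range(1, 2000))  # ~1/lam = 20
```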
From that we deduce the missing-page rate and the average working-set size:

m(t) = Σ_{i=1}^{n} λ_i (1 − λ_i)^t

s(t) = n − Σ_{i=1}^{n} (1 − λ_i)^t = n − Σ_{i=1}^{n} exp[t log(1 − λ_i)]

In particular, when the recurrent pages are equiprobable, s(t) becomes s*(t):

∀i,   λ_i = 1/n

hence

s*(t) = n − Σ_{i=1}^{n} exp{t log[1 − (1/n)]}

But the interesting case is n ≫ 1, which gives

s*(t) = n{1 − exp[−(t/n)]}

It is worth noticing that s(t) ≤ s*(t).
This inequality shows that the equiprobability hypothesis yields a working set s*(t) corresponding to the worst case.

b. Measures. To estimate the EDELWEISS AWS, we used APL programs. They were run on the IBM 370 machine under the APL-SV system. This system supplies working spaces of 128 K bytes. The program sample included a compiler for arithmetical and logical APL statements, programs for the simulation of logical circuits, programs of statistical analysis, and some management programs. All these programs represented about 1000 APL statements and 5000 segmentation points. The segmentation of APL programs is not like the EXEL segmentation; however, with structured programs, it is possible to have a good simulation of what happens with true EXEL programs in the EDELWEISS system. The APL-SV system contains a system variable that enables the user to know, at all times, the CPU time consumed by the program. To get the results, a supervisor controlled the execution of the user programs; it simulated the segmentation and, at each segmentation point, noted the value of this variable. From these measures, we could deduce a histogram, which we replaced by a continuous function after an adequacy test. This function, which gives the probability of segment lifetimes, is given in Fig. 6.

FIG. 6. Probability function of lifetime segments (with θ = 1).

One observes that the high probabilities are concentrated around the highest value, called θ. When t is less than θ/2, the probabilities fall off quickly, and below θ/10 they are nearly zero. The value of θ is very high, about 10⁴-10⁵ machine cycles. These values come from the fact that APL statements must be interpreted and the basic functions work on arrays, so that each APL operation generates many elementary operations. A good approximation of the probability function can be supplied by the formula

f(t) = (a/θ) g(t/θ)     (5)
where g(x) can be chosen to equal

g(x) = exp[−(log x)²/2],   x ≥ 0     (6)

The constant a is determined by the condition

a ∫₀^∞ g(x) dx = 1

We get

a = 1/√(2πe) ≈ 1/4.13 ≈ 0.24

The mean lifetime q,

q = ∫₀^∞ t f(t) dt

is then, with Eq. (6),

q = θ e^{3/2} ≈ 4.48 θ
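The constants a = 1/√(2πe) and q = θe^{3/2} follow from the integrals of g(x) and of x·g(x); they can be checked by crude numerical quadrature. A Python sketch (the integration bounds and step counts are arbitrary choices, large enough for the tails to be negligible):

```python
import math

def g(x):
    # The assumed lifetime shape: g(x) = exp(-(log x)^2 / 2), x > 0.
    return math.exp(-0.5 * math.log(x) ** 2)

def integrate(fn, lo, hi, steps=200000):
    # Plain composite-midpoint rule; crude but adequate for smooth integrands.
    h = (hi - lo) / steps
    return h * sum(fn(lo + (k + 0.5) * h) for k in range(steps))

# Normalization a * integral g = 1 gives a = 1/sqrt(2*pi*e) ~ 0.24.
a = 1.0 / integrate(g, 1e-9, 200.0)

# With f(t) = (a/theta) g(t/theta), the mean lifetime is
# q = theta * a * integral x g(x) dx = theta * e^(3/2) ~ 4.48 theta.
mean_factor = a * integrate(lambda x: x * g(x), 1e-9, 2000.0)
```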
So q is about 5 × 10⁴ to 5 × 10⁵ elementary cycles. As these cycles last about 1 μsec, the average lifetime of segments is of the order of 100 msec. These results confirm the main thesis of the EDELWEISS system: the segmentation can define fragments in such a way that their lifetime is great enough to allow the use of a sequential-access memory as the central memory of the system.

c. Efficiency. Now we can get an order of magnitude of the efficiency of the AWS:

s̄ = ∫₀^∞ (a/θ) g(t/θ) {n − Σ_{i=1}^{n} exp[t log(1 − λ_i)]} dt

But for all t,

g(t/θ) ≤ 1   [for t = θ, g(1) = 1]

so it becomes

s̄ > n + (a/θ) Σ_{i=1}^{n} 1/log(1 − λ_i)

As soon as the number n of pages is great enough, the probabilities λ_i are small and close to 1/n:

λ_i = (1/n)(1 + ε_i)   and   1/λ_i ≈ n(1 − ε_i)

with ε_i ≪ 1. Therefore we have log(1 − λ_i) ≈ −λ_i. Then we get

s̄ > n − (a/θ) Σ_{i=1}^{n} n(1 − ε_i)

Define b = −(1/n) Σ_{i=1}^{n} ε_i. We get

s̄ > n[1 − (an/θ)(1 + b)]

Since b ≪ 1, we can write

s̄ > n[1 − (an/θ)]

Therefore, the minimum value of the efficiency of EDELWEISS is

Q′ > (n/n₀)[1 − (an/θ)]     (7)
d. Discussion. The statistical measures allow us to evaluate the gain. Let us suppose n/n₀ ≈ 10 and n ≈ 100, which is not unlikely. We know that θ is greater than 10⁴. In that case an/θ is negligible before 1, so

Q′ > n/n₀ = 10

So the EDELWEISS system would be, in those conditions, ten times better than the classical ones. But the efficiency chiefly comes from the relation (5), since this is what yields the formula (7); the precise analytical expression does not much matter. The most important fact is the shape of the function. But we must have

an/θ ≪ 1,   θ ≫ an ≈ n/4

Let us choose a factor of ten:

θ ≈ 10(n/4) = 2.5n

But the measures show that the execution time of segments most often exceeds θ/10. Thus, the minimum execution time e_min must satisfy the threshold condition

e_min > θ/10 ≈ n/4     (8)

and the mean execution time ē will be

ē ≈ 4θ     (9)

If we take n = 100, we obtain

e_min > 25   and   ē ≈ 4θ = 10³
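The bound (7) with the numbers just discussed gives an efficiency very close to n/n₀. A Python sketch (the helper name `q_min` is hypothetical; n, n₀, θ are the illustrative values from this discussion):

```python
import math

def q_min(n, n0, theta, a=1.0 / math.sqrt(2.0 * math.pi * math.e)):
    """Minimum efficiency bound of relation (7): Q' > (n/n0)(1 - a*n/theta)."""
    return (n / n0) * (1.0 - a * n / theta)

# n ~ 100 pages, n0 ~ 10 pages of ROBOT memory, theta ~ 1e4 cycles.
bound = q_min(100, 10, 1e4)   # just under n/n0 = 10
```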
These values are achieved under the independent reference hypothesis. We notice that the experimental value of θ is of the order of 10⁴, much greater than the threshold of (9); so we think this hypothesis is likely. Moreover, we can try to remove the equiprobability hypothesis. Let us examine some cases.

(a) We have

s̄ > n[1 − (2an/θ)]

so

e_min > n/2   and   ē > 5n

(b) Let us suppose that

λ_i = μ_i/n   with   μ_i = 1 + ε_i

Let μ₀ be the minimum of the μ_i's. For all i,

λ_i ≥ μ₀/n,   with μ₀ ≪ n

We get

Q′ > (n/n₀)[1 − an/(μ₀θ)]

To get Q′ > 1, it is enough to have

μ₀ > (an/θ)/(1 − n₀/n)

With θ of the order of 10⁴, we get a gain even if the probability of some pages is very low, near 2 × 10⁻⁵.

(c) Examine the case where some probabilities λ_i are negligible before θ⁻¹.
Let k be the number of such pages. These behave like nonrecurrent pages, so the efficiency coefficient becomes

Q′ > [(n − k)/n₀]{1 − [a(n − k)/θ]}

We have a gain if k < n − n₀; if not, there may be a loss. At the limit, when k = n − 1, we have

s(t) = 1,   s̄ = 1,   and   Q′ = 1/n₀

But this case is highly unlikely.

(d) Suppose a page has a probability very near to one (there is only one such page). Let n be this page and λ_n its probability. The probabilities of the other pages are small before one but not negligible. We then have

s(t) = n − Σ_{i=1}^{n−1} (1 − λ_i)^t − (1 − λ_n)^t ≈ n − Σ_{i=1}^{n−1} (1 − λ_i)^t

So we get

s̄ > n − (a/θ) Σ_{i=1}^{n−1} 1/λ_i

Let λ₀ be the minimum of the λ_i's; it remains

s̄ > n{1 − [a(n − 1)/(nλ₀θ)]}

And this comes back to case (b) above.

Summing up, the resource allocation policy of EDELWEISS nearly always leads to a high gain of the working set, the order of magnitude being n/n₀, where n is the average size of the programs and n₀ that of the ROBOT memory. Of course, this gain is achieved with the EXEL-APL language because of its array data structures.

C. Analytical Model of the EDELWEISS System
1. Introduction

EDELWEISS is a multiprocessor system in which each processor is qualified to perform one specific type of task. The entire system can be viewed as a multiserver queueing system (Fig. 1). A program's service requirement is composed of a sequence of segments. In the model, a segment is to be regarded as a customer. A program's segments join the input queue, remain there until served, then go to the next center on their routing until the last service is completed, at which time they leave the system.
a. Generalities on Queueing Systems. Let M be the number of service centers connected to one another by N waiting queues. Let n_i be the number of customers in the ith queue. A state of the system is represented by an N-dimensional vector

N = (n₁, n₂, ..., n_i, ..., n_N)

When a service is completed at center i, the customer is supposed to join a random center j, unless center i is the terminal one of the routing, in which case the customer definitively leaves the system. Customer arrivals at the different centers are modeled as Poisson processes with mean arrival rates λ_i. Service times are exponentially distributed with means 1/μ_i. With P(N, t) denoting the probability that the system is in state N at the instant t, we consider all the states (J, t) which may evolve into state N at (t + h):

P(N, t + h) = Σ_J P(J, t) P(N, t + h | J, t)     (10)
,
a customer arrival at center i: probability p o , a customer departure at center i: probability pi, + service completion at center i, next service required at centerj: probability pi, combination of these events Let N = (nl, n 2 , ..., n m , .. ., n,, . .., nM)
K, I,,
1, . . ., n,, .. ., nM)
= (nl, n,,
..., n,
= (nl, n,,
.. ., nm + 1, .. ., n,, . .., n M )
-
QmI= (nl, n,, . . ., n, + 1, . . ., n, - 1, . . ., n M ) with m, 1 = {1, 2 ..., M } . Taking the limit of (10) when h tends to zero, we obtain
J. ARSAC ET AL.
246
Long-run stationary solution is given by
Which means a given state P,(N) of the system being represented by a lattice point N in a N-dimensional space, the expected number of transitions per unit time into N (from neighboring points) equals the expected number of transitions out of N. b. Restrictions to EDELWEISS System. EDELWEISS system is schematically sketched in Fig. 7.
FIG. 7. The EDELWEISS system.
The following notations will be used in the text. The five servers are numbered 1 to 5, server 1 being the input/output server 𝒢. The waiting queues g₁, g₂ (in front of 𝒢), i, s₁, s₂, and r have lengths n₁₁, n₁₂, n₂, n₃₁, n₃₂, and n₄, respectively; n₅ denotes the length of the S₂ queue. It may be noted that 𝒢 is the single input/output port of the system:

p_{0,m} = 0,   ∀m ≠ 1
p_{m,M+1} = 0,   ∀m ≠ 1
Customers always proceed to a different stage j when service is completed at stage i:

p_{m,m} = 0,   ∀m

Assuming the total number of customers circulating in the system is maintained fixed (i.e., as soon as a departure occurs, a new customer is admitted), Eq. (10) becomes, in the stationary regime,

0 = Σ_{m=1}^{M} Σ_{l=1}^{M} μ_m(n_m + 1) p_{m,l} P∞(N_{ml}) − Σ_{m=1}^{M} μ_m(n_m) P∞(N)     (13)

Solutions of Eq. (11) are thoroughly discussed in Gordon and Newell (1967) under various assumptions.
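The fixed-population balance equation can be illustrated on a toy closed network. The following Python sketch solves a two-server cycle (an illustrative assumption, not the EDELWEISS topology) by power iteration on the uniformized chain, and recovers the Gordon-Newell product-form solution:

```python
def stationary(states, rate):
    """Stationary distribution of a small continuous-time Markov chain,
    via power iteration on the uniformized jump matrix.
    rate(s, s2) is the transition rate from state s to state s2."""
    lam = max(sum(rate(s, s2) for s2 in states if s2 != s) for s in states) + 1.0
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(5000):
        new = {}
        for s2 in states:
            out = sum(rate(s2, s3) for s3 in states if s3 != s2)
            inflow = sum(pi[s] * rate(s, s2) / lam for s in states if s != s2)
            new[s2] = pi[s2] * (1.0 - out / lam) + inflow
        pi = new
    return pi

# Toy closed network: K = 2 customers cycling between two exponential
# servers (routing probability 1 between them); mu[i] are service rates.
mu = (1.0, 2.0)
K = 2
states = [(n, K - n) for n in range(K + 1)]

def rate(s, s2):
    n1, n2 = s
    if n1 > 0 and s2 == (n1 - 1, n2 + 1):
        return mu[0]   # server 1 completes; customer moves to queue 2
    if n2 > 0 and s2 == (n1 + 1, n2 - 1):
        return mu[1]   # server 2 completes; customer moves to queue 1
    return 0.0

pi = stationary(states, rate)
# Product form: pi(n1, n2) proportional to (1/mu1)^n1 * (1/mu2)^n2,
# i.e. (4/7, 2/7, 1/7) for the states (2,0), (1,1), (0,2).
```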
2. Partition of EDELWEISS into Subsystems

In order to take into account the local constraints of the servers, the EDELWEISS system is now broken down into subsystems in which the waiting rooms are limited in length, so that, on leaving center i, if a customer requires servicing by center j and the incoming queue j is full, he is prevented from joining it and blocks server i. Three subsystems will be considered: S₂-𝒢, 𝒢-I-S, and S-R. Notations:

μ_i = mean service rate of the ith server
ρ_{ij} = μ_i/μ_j
g_N(a, b) = a^{N−1} + a^{N−2}b + ... + a^{N−1−i}b^i + ... + b^{N−1}
         = (a^N − b^N)/(a − b)   if a ≠ b
         = N a^{N−1}             if a = b
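The function g_N(a, b) is just the geometric sum above; a direct Python transcription (the helper name is illustrative):

```python
def g_n(a, b, n):
    """g_N(a, b) = a^(N-1) + a^(N-2) b + ... + b^(N-1)
                 = (a^N - b^N)/(a - b) if a != b, and N a^(N-1) if a = b."""
    if a == b:
        return n * a ** (n - 1)
    return (a ** n - b ** n) / (a - b)
```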
a. 𝒢-S₂ Subsystem (Fig. 8). 𝒢 is the I/O port of the system and is concerned with customers coming from queue g₁, queue g₂, and the newcomers' queue. Those of the first two classes are given priority over newcomers (i.e., a new customer is admitted only when g₁ and g₂ are empty) and are served alternately by 𝒢 with mean shared service rate μ₁/2. Moreover, we assume that the system is working under heavy load, i.e., the outside incoming queue is never empty; as a consequence, 𝒢 is always busy. Let n₁₁ and n₅ denote the respective numbers of customers in queues g₁ and s₂, and let L be the length of both queues (n₁₁ or n₅ includes the customer under servicing in 𝒢 or S₂).
FIG. 8. The 𝒢-S₂ subsystem.
𝒢 is said to be blocked if both of its output queues are full at the same time. Then, if s₂ is full, 𝒢 is allowed to go on processing; at the completion of the service, the customer may or may not join s₂. Because of this assumption, s₂ is given a buffer so that its maximum length is brought up to L + 1.

FIG. 9. The state transitions of the 𝒢-S₂ subsystem.
This constraint causes that of g₁ to be limited to L + 1 too. So

1 ≤ n₁₁ ≤ L + 2,   0 ≤ n₅ ≤ L + 1

Figure 9 shows the state transitions of the system.

b. Average Queue Lengths (Appendix A). Let p(n₁₁, n₅) be the state probability of lattice point (n₁₁, n₅), and let

P_i^{11} = Σ_{n₅} p(n₁₁ = i, n₅)   and   P_i^{5} = Σ_{n₁₁} p(n₁₁, n₅ = i)

The expected values of n₁₁ and n₅ are defined as

E(n₁₁) = Σ_{i=1}^{L+2} i P_i^{11},   E(n₅) = Σ_{i=0}^{L+1} i P_i^{5}

and are readily obtained in closed form (Appendix A).

c. 𝒢-I-S Subsystem. This subsystem is shown in Fig. 10. Assuming waiting rooms i and s₁ limited to N + 1 places (N places plus a buffer) leads to Fig. 11.
FIG. 10. The 𝒢-I-S subsystem.

FIG. 11. Transition diagram for the 𝒢-I-S subsystem.

The expected values of n₂ and n₃₂ are derived in Appendix B.
d. 𝒢-S-R Subsystem. See Fig. 12.

FIG. 12. The 𝒢-S-R subsystem.

The three-dimensional diagram of Fig. 13 is obtained under the following assumptions: each waiting room is limited in length to M + 1 places (M places plus a buffer); for the sake of simplicity, n₁₂ exceptionally does not include the customer being served in 𝒢. Let

p(i, j, k) = p(n₄ = i, n₃₂ = j, n₁₂ = k)
FIG. 13. Transition diagram for the 𝒢-S-R subsystem.
 
for the general case where α ≠ β ≠ γ. If α = β, or α = γ, or β = γ, or α = β = γ, limits of p(0, 0, 0) must be taken. Finally (Appendix C), the average queue lengths are given by closed-form expressions, valid if α ≠ 1 and γ ≠ β. Permuting α, β, γ gives E(n₃₂) and E(n₁₂). Limits of these expressions must be considered if α = 1, β = 1, or γ = 1, and/or α = β, α = γ, or β = γ.
3. Global Performances

Of particular interest is the blocked state of the input/output server 𝒢, because it conditions the throughput of the system. It is easily seen that the associated probability P_B can be expressed in closed form in terms of the ratios ρ₁₂, ρ₁₅, ρ₃₄ and the boundary probability p(0, 0, 0).
a. Mean Throughput. Let p_s = 1 − (p₁₂ + p₁₅) be the probability that a customer ends his routing at the I/O station and leaves the system. The throughput of the system is defined as the average number of departures per unit of time; in the long run it is given by

Φ_s = p_s μ₁ (1 − P_B)
b. Waiting Times. Let τ denote the interval of time elapsing between two consecutive departures. The expected value of τ is given by

E(τ) = 1/Φ_s = 1/[μ₁ p_s (1 − P_B)]

With N denoting the total number of customers circulating in the system, the expected time a customer remains in the system is related to the expected value of N by Little's theorem:

E(T) = E(τ) E(N),   with E(N) = E(Σ n_i) = Σ E(n_i)

Therefore, the expected time a customer is waiting in the ith queue is

E(T_i) = E(τ) E(n_i)
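These throughput and Little's-theorem relations transcribe directly into code. A Python sketch (function names and the numeric values used below are illustrative):

```python
def throughput(mu1, p12, p15, p_blocked):
    """Phi_s = p_s * mu1 * (1 - P_B), with p_s = 1 - (p12 + p15)."""
    ps = 1.0 - (p12 + p15)
    return ps * mu1 * (1.0 - p_blocked)

def sojourn_time(mu1, p12, p15, p_blocked, mean_queue_lengths):
    """Little's theorem: E(T) = E(tau) * E(N), where E(tau) = 1/Phi_s
    and E(N) is the sum of the mean queue lengths E(n_i)."""
    e_tau = 1.0 / throughput(mu1, p12, p15, p_blocked)
    return e_tau * sum(mean_queue_lengths)
```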
c. Servers' Utilization. Let a customer's requirement be composed of m₁₂ tours through the 𝒢-I-S-R circuit and m₁₅ tours through the 𝒢-S₂ circuit (hence m₁₂ + m₁₅ + 1 services by 𝒢); the expected values of m₁₂ and m₁₅ are defined as

E(m₁₂) = Σ_{m₁₂=0}^{∞} Σ_{m₁₅=0}^{∞} m₁₂ C_{m₁₂+m₁₅}^{m₁₂} p₁₂^{m₁₂} p₁₅^{m₁₅} p_s

E(m₁₅) = Σ_{m₁₅=0}^{∞} Σ_{m₁₂=0}^{∞} m₁₅ C_{m₁₂+m₁₅}^{m₁₅} p₁₂^{m₁₂} p₁₅^{m₁₅} p_s

Developing these expressions yields

E(m₁₂) + E(m₁₅) + 1 = 1/p_s,   E(m₁₂) = p₁₂/p_s,   E(m₁₅) = p₁₅/p_s

Hence the proportion of time a server remains busy follows from these expected tour counts and the throughput; for the I/O server, for example, U₁ = (1/p_s) Φ_s/μ₁ = 1 − P_B, and of course each utilization is at most 1.
d. A Comparison with the Monocustomer System. By monocustomer system we mean a discipline in which only one customer may occupy the system at a time, cycling through it until his requirement is complete, at which time a newcomer is admitted. The mean occupation time of the system by such a customer is

E(τ′) = (1/μ₁)[E(m₁₂) + E(m₁₅) + 1]

The gain over the monocustomer discipline is then given by

F = E(τ′)/E(τ)
4. Concluding Remarks

As stated previously, a program to be run in the EDELWEISS system is broken down into a sequence of logical segments, each of them the equivalent of a customer in the model. In a multiprogramming situation, at a given instant, the processors' waiting rooms are filled with mixed segments which have arisen from different programs. Those segments which correspond to procedure calls are dispatched by 𝒢 into the S₂ queue, the others into the I-S-R circuit. Assuming the mean frequency of calls is such that the relation E(m₁₂) = kE(m₁₅) holds, and that the complete execution of a program requires q services by 𝒢, we obtain

1/p_s = q

Therefore the input parameters of the model are twofold: the relative (and, within certain limits, sizable) processors' speeds, and statistics on EXEL programs. For example, if k = 4, q = 20, L = N = 8 and assuming the processors have identical speeds (i.e., ρ_{ij} = 1 ∀i, j), the previous formulas yield

P_B ≈ 0.35,   F = 3.6
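The routing probabilities implied by this numerical example follow from E(m₁₂) = kE(m₁₅) and 1/p_s = q, together with E(m₁₂) = p₁₂/p_s and E(m₁₅) = p₁₅/p_s. A Python sketch (the helper name is hypothetical):

```python
def routing_probabilities(k, q):
    """From E(m12) = k E(m15) and 1/p_s = q (q services by the I/O server
    per program): p_s = 1/q, p12 + p15 = 1 - p_s, and p12 = k * p15."""
    ps = 1.0 / q
    p15 = (1.0 - ps) / (k + 1.0)
    p12 = k * p15
    return ps, p12, p15

# The example of the text: k = 4 and q = 20 give p_s = 0.05,
# p12 = 0.76, p15 = 0.19.
ps, p12, p15 = routing_probabilities(k=4, q=20)
```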
It may be seen that in our case the high value of P_B is due to saturation of processors I and S. In a realistic situation, a remedy may be to increase the speed of processors I and S, or to give the waiting rooms more places.

VI. THE SINGLE-USER FAMILY

If the multiuser environment is not wanted, the general concepts of EDELWEISS may be used to define a family of smaller machines. A prototype of such a machine has been built in the Thomson-CSF Central Laboratory. This machine is called EXELETTE.
A. Description of EXELETTE
The machines of the EDELWEISS family being language oriented, it is natural to start by describing the language used in EXELETTE. The control language is EXEL, and the formula language is APL. APL is a conversational, mathematically oriented language using array-structured data. Data sizes are unknown at compile time and vary dynamically during execution. Data are processed through a set of operators that are part of the language itself. Operators are regular arithmetical operators (+, −, ×, etc.) or specialized array operators (reductions, rotations, transpositions, etc.). For a detailed description of the APL language, refer to Section IV,1. When designing EXELETTE, we have tried to stay as close as possible to the operating mode of classical APL systems. Hence, the usual notions of working space, command mode, and definition or execution mode are to be found in EXELETTE also.

1. Hardware (Fig. 14)
The version of EXELETTE currently implemented has the following hardware resources: two INTEL 8080 microprocessors and a 64 K byte MOS memory, two 256 K byte floppy disks, one printer, one console, and one keyboard.

(i) Particularities of the peripherals. On the console, EXEL characters are distinguished from APL characters by video inversion. Likewise, on the printer, EXEL and APL appear on different lines, in order to give a better visualization of the control structure of a procedure. The first floppy disk contains the system and the working spaces defined by the user. Extension of this system/storage disk is provided for, up to seven floppy disks.
FIG. 14. EXELETTE architecture (hardware).
The second floppy disk contains the active working space. It has been created during the current session or has been loaded from the storage floppy disk by the APL command ")LOAD (name)". This disk is organized as a virtual memory and is addressed online by the user, so the size of the active working space is 256 K bytes. Each of these floppy disks is organized in 64 tracks of 16 sectors of 256 bytes each.

(ii) Biprocessor central unit. The first microprocessor (μC) performs the calculations specified by the user's programs. The other (μG) deals with the management of the peripherals. Both have access to the main memory; a priority scheme takes care of access conflicts. In one zone of 8 K bytes, μG has priority over μC; in the second zone (56 K bytes) μC has priority. The dialog between μG and μC consists of external commands (like ABORT) sent to μG by the user, requests from μC to μG through hardware built-in "mailboxes", and answers from μG to μC to these requests. Each I/O request is realized by an interruption from μC to μG. An interruption level is attributed to each peripheral device according to a classical hierarchical scheme.

2. Software Architecture
A modular description of the system is given in Fig. 15. The original features will be discussed in Sections VI,B and C.
J. ARSAC ET AL.
FIG. 15. Modular description of EXELETTE software: general monitor, compilers monitor, editor, compilers (APL and others), procedure calls, APL monitor, allocator, and calculator.
B. Operating EXELETTE

Similarly to usual conversational APL systems, EXELETTE can be used under two working modes.

1. Definition Mode
By inputting the character “∇,” the user sets the calculator in definition mode. In this mode, one can define a procedure or modify it. A procedure is composed of a header and a set of actions. Each action is entered, processed, and stored separately. The presence of two microprocessors allows one action to be processed by pC while the user is entering the following action at the keyboard. A partial compilation of each action text is performed. This compilation recognizes external identifiers and gives the corresponding objects internal names. It also generates two intermediate texts, one corresponding to the APL, the other to the EXEL. These two texts are interpreted at execution time by different interpreters. The intermediate APL (I-APL) is a set of n-tuples

⟨opr⟩ ⟨op₁⟩ ... ⟨opₙ⟩ ⟨res⟩

where ⟨opr⟩ is an APL operator, the ⟨opᵢ⟩ are operands, and ⟨res⟩ is the result.
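To make the n-tuple form concrete, here is a small illustrative sketch (the example formula and the temporary-name convention T1, T2, ... are invented, not taken from the EXELETTE compiler):

```python
# Illustrative sketch of the I-APL n-tuple form: a formula is flattened
# into (operator, operands..., result) tuples, with compiler-generated
# names (T1, T2, ...) standing for intermediate results.

def compile_formula():
    """Hand-compiled n-tuples for the formula (A + B) * C."""
    return [
        ("+", "A", "B", "T1"),   # T1 <- A + B
        ("*", "T1", "C", "T2"),  # T2 <- T1 * C
    ]

def execute(tuples, env):
    """Interpret the n-tuples against an environment of variables."""
    ops = {"+": lambda x, y: x + y, "*": lambda x, y: x * y}
    for opr, left, right, res in tuples:
        env[res] = ops[opr](env[left], env[right])
    return env

env = execute(compile_formula(), {"A": 2, "B": 3, "C": 4})
print(env["T2"])   # 20
```

Note that, as in the text, the parenthesis structure has disappeared: the evaluation order is fully encoded in the linear sequence of tuples.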
In the I-APL, identifiers have been recognized, the parenthesis structure has been removed, and internal names have been given to intermediate results (temporary variables). After this compilation has been performed, source and object codes are stored on the active floppy, along with various internal directories and information. If a syntactic error is recognized, no message is sent at compile time; instead, the error diagnostic is stored as well.

2. Execution Mode
One exits from definition mode by input of a “∇.” The machine is then in execution mode. In this mode, calculations are performed: both I-EXEL and I-APL texts are executed. I-EXEL indicates which procedure, action, or formula is to be executed next. If a compilation error is found during execution, a message is sent. This way, compilation error messages are sent only if control is actually given to an incorrectly composed action or formula. It is in this mode also that the usual APL commands starting with a “)” are performed.

C. Internal Management of EXELETTE
One of the most interesting features of this machine is its internal management. The three logical levels of segmentation of EXEL have been used at various levels of the internal organization in execution mode. Variable management on external storage is achieved at the procedure level (see Section VI,C,4). User program overlay is achieved in EXELETTE at the action level. This means that the segments of programs brought into central memory are action texts. Depending on the level of sophistication, one or several actions may be present in memory at the same time. Management of variables in central memory is achieved at the formula level. The method has been described in Galtier (1977). The idea is to recognize, by a short preprocessing of the formula, which variables will be necessary for its execution, and to initiate the corresponding disk accesses (load or swap). While allocation or execution proceeds on pC, pG takes care of these transfers asynchronously. This is possible because at the formula level no branching is to be found. Thus a simple syntactic analysis of the I-APL text of a formula is sufficient to know which variables are to be brought in. Let it be recalled here that APL variables are arrays, the sizes of which are unknown at compile time. Hence the problem of memory allocation has to be solved at execution time.
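The preprocessing described above can be sketched in a few lines (the variable names and the "T" prefix marking compiler temporaries are invented for illustration): because a formula contains no branches, one linear scan of its I-APL tuples suffices to list the external variables to prefetch.

```python
# Sketch of the formula-level prefetch analysis: scan the I-APL tuples
# once and collect every external operand not already resident in
# central memory; each such name would trigger an asynchronous disk
# access handled by pG while pC continues allocating or calculating.

def variables_to_prefetch(tuples, in_memory):
    """External operands not already resident in central memory."""
    needed = []
    for opr, *operands, res in tuples:
        for name in operands:
            if name.startswith("T"):          # compiler temporary, created locally
                continue
            if name not in in_memory and name not in needed:
                needed.append(name)           # would initiate a disk access
    return needed

i_apl = [("+", "A", "B", "T1"), ("*", "T1", "C", "T2")]
print(variables_to_prefetch(i_apl, in_memory={"B"}))   # ['A', 'C']
```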
The APL interpreter has thus been split into two distinct programs: an allocator and a calculator.

1. Allocator (Central Memory Management)
The allocator realizes a dynamic and anticipatory management of central memory at the formula level. It performs the previously mentioned preprocessing and recognizes which variables are necessary in this formula and which are not. This management is done logically: variables are manipulated as logical objects and are not scattered into various physical parts having no logical significance, as happens, for example, when pagination techniques are used. Space for the creation of temporary variables is reserved, and the disk accesses necessary to bring in missing objects are initiated. Upon failure of an allocation, the allocator stops and control is given to the calculator. In the current version of EXELETTE, allocation also stops on the occurrence of an assignment, but it is possible to avoid this. Actually, in a more elaborate system, it is possible to allocate up to the next occurrence of an EXEL alternation.

2. Calculator (Execution of the n-Tuples of the I-APL)
The calculator is composed of a set of microprograms realizing the APL operators, and an interface with the allocator and the microprocessor pG. Before initiating execution of an operator, the calculator tests whether all accesses necessary for this execution have been successfully completed. These accesses have been previously set up by the allocator. If processor pG has not yet taken care of all accesses, pC waits before starting execution of the APL operator. However, it is hoped that during the preceding calculations pG was able to carry out the transfers, thus preventing pC from having to wait for outside variables to be brought in. This method is fully efficient if allocation time is short compared to calculation and transfer times, and if calculation and allocation overlap the transfer times. After completing the execution of an operator, the calculator frees the space occupied by temporary results that have been used as operands. Allocator and calculator communicate through a set of internal states which specify which variables are necessary in the near future and which are not.

3. Dialog between Processors during Execution
During an allocator run, pC sets a number of disk accesses in a queue of software mailboxes. These mailboxes are chained together. One interruption is sent to pG, which executes the first access and then tests for chaining. If other mailboxes are waiting, they are processed by pG without any further intervention of pC. Hence only one signal from pC is required to process a chain of accesses. The two processes of requesting an access and honoring it are asynchronous. The critical variable between the two is the chaining condition: it must not be changed by one microprocessor while being tested by the other. The sharing of this variable is done with T. J. Dekker’s algorithm as cited by E. W. Dijkstra (1968).
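Dekker's algorithm, cited above, achieves mutual exclusion between two processors using only ordinary shared reads and writes. The following is an illustrative Python sketch, not the EXELETTE microcode; CPython's byte-code-level atomicity stands in for the 8080's memory behavior, and real implementations on modern hardware would additionally need memory barriers.

```python
# Dekker's algorithm for two processes sharing a critical variable.
import threading

want = [False, False]   # want[i]: process i wants the critical section
turn = 0                # which process must yield when both want in
counter = 0             # stands in for the shared chaining condition

def process(i):
    global turn, counter
    other = 1 - i
    for _ in range(1000):
        want[i] = True
        while want[other]:          # contention?
            if turn == other:
                want[i] = False     # back off ...
                while turn == other:
                    pass            # ... and busy-wait for our turn
                want[i] = True
        counter += 1                # critical section: exclusive access
        turn = other                # let the other side insist next time
        want[i] = False

threads = [threading.Thread(target=process, args=(i,)) for i in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 2000: no update of the shared variable was lost
```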
4. Floppy-Disk Management
Three types of variables are to be found in EXELETTE: global variables, local variables, and temporary variables. Global and local variables have external names; temporaries are generated by the compiler. Each of these variable types has a different scope, to which corresponds a method of disk management. A global variable's size declaration is required prior to execution; if it is missing, a standard value of 256 bytes is assumed. Global variables do not cease to exist after completion of a session and are saved on the storage disk upon request of the user. On loading a working space onto the active working space disk, space for global variables is reserved at the top of the area. The other types of variables are stored at the two ends of the variable zone. The structure of the active working space disk is shown in Fig. 16, in which LV1, . . . , LVn show the stacking of the local variables corresponding to n procedure calls.
FIG. 16. Disk management: layout of the active working space disk, showing procedures, files, the stacked local variables LV1, . . . , LVn, the temporary variables, and the available space.
When a copy of a local or a temporary variable is to be created on disk, it is created at the current position of PLV or PTV. No hole management is involved. On completion of a formula, PTV is reset to its original value; this way, the disk space of temporary variables is automatically reclaimed. A similar scheme is used to reclaim local variable space. On a procedure return, the local variables of this procedure cease to exist, and the corresponding disk space is reclaimed by simply moving the pointer PLV back to the last stacked procedure.
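The pointer-based reclamation just described can be sketched as follows (the class and its field names are invented; only the PLV/PTV discipline is from the text):

```python
# Stack-style disk-space management: temporaries vanish wholesale at the
# end of a formula (PTV reset), locals vanish on procedure return (PLV
# moved back). No free-list or hole management is ever needed.

class VariableZone:
    def __init__(self):
        self.ptv = 0            # top of the temporary-variable stack
        self.plv = 0            # top of the local-variable stack
        self.frames = []        # saved PLV value for each procedure call

    def push_temp(self, size):
        addr, self.ptv = self.ptv, self.ptv + size
        return addr

    def end_of_formula(self):
        self.ptv = 0            # all temporaries reclaimed at once

    def call_procedure(self):
        self.frames.append(self.plv)

    def push_local(self, size):
        addr, self.plv = self.plv, self.plv + size
        return addr

    def return_procedure(self):
        self.plv = self.frames.pop()   # the callee's locals cease to exist

zone = VariableZone()
zone.call_procedure()
zone.push_local(256)
zone.push_local(512)
zone.push_temp(128)
zone.end_of_formula()
zone.return_procedure()
print(zone.ptv, zone.plv)   # 0 0: all space reclaimed by pointer moves
```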
FIG. 17. Three-processor implementation: the allocator fills a request queue in memory, which the calculator scans in order.
It is interesting to point out that some of the specialized processors of the larger-sized EDELWEISS have their counterparts in EXELETTE, especially in the three-processor implementation. The calculator is the equivalent of ROBOT and of the part of SCRIBE that processes procedure calls. The allocator microprocessor performs the functions associated with segment definition and management, which are done in EDELWEISS by GREFFIER and INTENDANT. The segment limits are the points where the allocator cannot decide where to go; these points are the EXEL alternations, as mentioned in Section VI,C,1. It is also the allocator’s job to set up all the floppy-disk accesses necessary to get the calculator’s memory ready. This is performed in EDELWEISS by GREFFIER. Processor pG is responsible for accesses to mass storage and peripherals. The first of these functions is done in EDELWEISS by SOUTIER, the second one by HUISSIER. This is summed up in the following table:

Single user      EDELWEISS
Calculator       ROBOT + SCRIBE
Allocator        GREFFIER + INTENDANT
Processor pG     SOUTIER + HUISSIER
APPENDIX A

Let

$$P_{15} = \sum_i p_{15}^i$$

Performing horizontal cuts in Fig. 9 leads to the following set of equations:

$$p_{15}^0\,p_1 = p_5\,p_{15}^1, \qquad p_{15}^1\,p_1 = p_5\,p_{15}^2, \qquad \ldots, \qquad p_{15}^L\,p_1 = p_5\,p_{15}^{L+1}$$

Successive elimination of $p_{15}^1,\, p_{15}^2,\, \ldots,\, p_{15}^{L+1}$ results in

$$p_{15}^i = (p_1/p_5)^i\, p_{15}^0$$

As

$$\sum_{i=0}^{L+1} p_{15}^i = 1$$

then

$$p_{15}^0 = \frac{1}{\sigma_{L+2}(1,\, p_1/p_5)}$$
Vertical cuts yield a similar relation involving $\sigma_{L+1}$ and $\sigma_{L+2}$, and the expected values of $n_5$ and $n_{14}$ are then readily obtained as sums of the form $\sum_{i=0}^{L+1} i\,p_{15}^i$.
APPENDIXB We consider the two-server system and its associated transition diagram as sketched in Fig. 18. Equating the expected number of transitions into and out of each state
FIG. 18. Two-server system and its transition diagram.
yields a set of $(K + 1)(K + 2)/2$ linear equations in the same number of unknowns $p(i, j)$. An additional relation between the unknowns is

$$\sum_{i=0}^{K} \sum_{j=0}^{K-i} p(i, j) = 1$$
Therefore a unique solution of the system is expected to exist. For $K = 1$, this solution is obvious:

$$p(1, 0) = (u/w)\,p(0, 0) = a\,p(0, 0)$$
$$p(0, 1) = (u/v)\,p(0, 0) = b\,p(0, 0)$$
$$p(0, 0) = 1/(1 + a + b)$$

with $a = u/w$ and $b = u/v$.
For $K = 2$, we have

$$u\,p(0, 0) = v\,p(0, 1)$$
$$(u + w)\,p(1, 0) = u\,p(0, 0) + v\,p(1, 1)$$
$$w\,p(2, 0) = u\,p(1, 0)$$
$$(u + v)\,p(0, 1) = w\,p(1, 0) + v\,p(0, 2)$$
$$v\,p(0, 2) = w\,p(1, 1)$$
$$(v + w)\,p(1, 1) = u\,p(0, 1) + w\,p(2, 0)$$
Solving this system is straightforward:

$$p(1, 0) = a\,p(0, 0), \qquad p(2, 0) = a^2\,p(0, 0)$$
$$p(0, 1) = b\,p(0, 0), \qquad p(0, 2) = b^2\,p(0, 0)$$
$$p(1, 1) = ab\,p(0, 0)$$
$$p(0, 0) = 1/(1 + a + b + ab + a^2 + b^2)$$
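The product-form solution can be checked numerically with exact rational arithmetic. The transition rates used below are an assumed tandem reading of Fig. 18 (which is not reproduced here): arrivals at rate u while fewer than K jobs are present, server x at rate w handing its job to server y, and departures from server y at rate v.

```python
# Verify that p(i, j) proportional to a^i b^j, with a = u/w and
# b = u/v, balances the flow into and out of every state.
from fractions import Fraction as F

u, v, w, K = F(2), F(3), F(5), 2
a, b = u / w, u / v

def p(i, j):
    """Unnormalized stationary probability a^i * b^j."""
    return a**i * b**j

def net_flow(i, j):
    """Rate out of state (i, j) minus rate into it; zero at balance."""
    out = p(i, j) * ((u if i + j < K else 0)
                     + (w if i > 0 else 0)
                     + (v if j > 0 else 0))
    into = F(0)
    if i > 0:
        into += u * p(i - 1, j)        # an arrival brought us here
    if j > 0:
        into += w * p(i + 1, j - 1)    # server x handed a job to y
    if i + j < K:
        into += v * p(i, j + 1)        # a departure brought us here
    return out - into

states = [(i, j) for i in range(K + 1) for j in range(K + 1 - i)]
print(all(net_flow(i, j) == 0 for i, j in states))   # True
print(1 / sum(p(i, j) for i, j in states))           # p(0,0) for these rates
```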
For arbitrary $K$, extraction of an explicit solution is a tedious task. The results obtained for $K = 1, 2$ suggest a solution of the form

$$p(i, j) = a^i b^j\, p(0, 0), \qquad 0 \le i + j \le K$$
We have to prove the following:

$$(u + v + w)\,a^i b^j = u\,a^{i-1} b^j + v\,a^i b^{j+1} + w\,a^{i+1} b^{j-1} \qquad \text{for } i + j < K,\ i \ne 0,\ j \ne 0$$

$$(u + v)\,b^j = v\,b^{j+1} + w\,a b^{j-1} \qquad \text{for } i = 0,\ j \ne 0$$

$$(u + w)\,a^i = u\,a^{i-1} + v\,a^i b \qquad \text{for } i \ne 0,\ j = 0$$

$$(v + w)\,a^i b^{K-i} = u\,a^{i-1} b^{K-i} + w\,a^{i+1} b^{K-i-1} \qquad \text{for } i + j = K \tag{14}$$
Eqs. (14) can be rewritten as

$$u + v + w = \frac{u}{a} + v b + \frac{w a}{b}$$

$$u + v = v b + \frac{w a}{b}$$

$$u + w = \frac{u}{a} + v b$$

$$v + w = \frac{u}{a} + \frac{w a}{b} \tag{15}$$
Inspection of (15) suggests

$$\frac{u}{a} = w, \qquad v b = u, \qquad \frac{w a}{b} = v$$

or finally,

$$a = \frac{u}{w} \qquad \text{and} \qquad b = \frac{u}{v}$$
These results can also be obtained by recurrence on $K$.

Average Queue Length. The probability of a diagonal cut is given by

$$P_m = P(i + j = m) = p(0, 0)\,(a^m + a^{m-1} b + \cdots + a b^{m-1} + b^m) = p(0, 0)\,\sigma_{m+1}(a, b)$$
Substitution of the $P_m$ in $\sum_{m=0}^{K} P_m = 1$ yields

$$p(0, 0) = \frac{a - b}{a\,\sigma_{K+1}(1, a) - b\,\sigma_{K+1}(1, b)} \qquad \text{if } a \ne b$$

$$p(0, 0) = \frac{1}{d\sigma_{K+2}(1, a)/da} \qquad \text{if } a = b$$

The expected value of $x$ is defined as

$$E(x) = \sum_{i=0}^{K} i\,P_i^x$$
As

$$P_i^x = \sum_{j=0}^{K-i} a^i b^j\, p(0, 0) = a^i\,\sigma_{K+1-i}(1, b)\, p(0, 0)$$

so,

$$E(x) = \frac{p(0, 0)}{1 - b}\left[a\,\frac{d\sigma_{K+1}(1, a)}{da} - a b\,\frac{\partial \sigma_{K+1}(a, b)}{\partial a}\right] \qquad \text{if } b \ne 1$$
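The closed forms above can be checked against direct summation. The reading of the sigma notation is assumed from its use with the diagonal cuts: $\sigma_n(a, b) = a^{n-1} + a^{n-2} b + \cdots + b^{n-1}$.

```python
# Check sigma_{m+1}(a, b) against the diagonal-cut sums, and the closed
# form for p(0,0) (case a != b) against brute-force normalization.
from fractions import Fraction as F

def sigma(n, a, b):
    """sigma_n(a, b) = sum of a^(n-1-k) * b^k for k = 0 .. n-1."""
    return sum(a**(n - 1 - k) * b**k for k in range(n))

a, b, K = F(2, 5), F(2, 3), 4
states = [(i, j) for i in range(K + 1) for j in range(K + 1 - i)]
norm = sum(a**i * b**j for i, j in states)          # equals 1 / p(0,0)

# Diagonal cuts: direct summation agrees with sigma_{m+1}(a, b).
for m in range(K + 1):
    direct = sum(a**i * b**j for i, j in states if i + j == m)
    assert direct == sigma(m + 1, a, b)

# Closed form for p(0,0) when a != b, as stated in the appendix.
closed = (a - b) / (a * sigma(K + 1, 1, a) - b * sigma(K + 1, 1, b))
print(closed == 1 / norm)   # True
```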
APPENDIX C

Given the three-server system and its three-dimensional transition diagram as sketched in Fig. 19, it can be shown that a recurrent state $(i, j, k)$ possesses probability

$$p(i, j, k) = \alpha^i \beta^j \gamma^k\, p(0, 0, 0)$$

FIG. 19. Three-server system and its transition diagram.
with

$$\alpha = a/b, \qquad \beta = a/c, \qquad \gamma = a/d$$

A convenient way to determine $p(0, 0, 0)$ consists in cutting along one coordinate, for example, $x$:

$$P_0^x = p(0, 0, 0) \times \alpha^0\,\bigl(\sigma_1(\gamma, \beta) + \sigma_2(\gamma, \beta) + \cdots + \sigma_{N+1}(\gamma, \beta)\bigr)$$

$$P_i^x = p(0, 0, 0) \times \alpha^i\,\bigl(\sigma_1(\gamma, \beta) + \sigma_2(\gamma, \beta) + \cdots + \sigma_{N+1-i}(\gamma, \beta)\bigr)$$

Each line gives rise to a partial sum that can be evaluated as in Appendix B, and summing the $P_i^x$ to unity finally determines $p(0, 0, 0)$.
The expected value of $x$ is readily obtained as

$$E(x) = \sum_{i=0}^{N} i\,P_i^x$$

or in a more concise closed form when $\alpha \ne 1$ and $\gamma \ne \beta$. The limits $\lim_{\alpha \to 1} E(x)$ and $\lim_{\gamma \to \beta} E(x)$ are not difficult to derive if $\alpha = 1$ or $\gamma = \beta$. Circularly permuting $\alpha$, $\beta$, $\gamma$ gives $E(y)$ and $E(z)$.
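A small numerical companion to this appendix (the truncated state space $i + j + k \le N$ is assumed from the structure of the sums): $p(0, 0, 0)$ follows from normalizing the product form, and the circular-permutation trick gives the other coordinate means.

```python
# Brute-force moments of the three-server product form over the
# assumed state space i + j + k <= N, with exact rationals.
from fractions import Fraction as F
from itertools import product

def moments(alpha, beta, gamma, N):
    """Return (p(0,0,0), E(x)) for the truncated product form."""
    states = [(i, j, k) for i, j, k in product(range(N + 1), repeat=3)
              if i + j + k <= N]
    norm = sum(alpha**i * beta**j * gamma**k for i, j, k in states)
    Ex = sum(i * alpha**i * beta**j * gamma**k for i, j, k in states) / norm
    return 1 / norm, Ex

al, be, ga = F(1, 2), F(1, 3), F(1, 4)
p000, Ex = moments(al, be, ga, 3)
_, Ey = moments(be, ga, al, 3)     # circular permutation gives E(y)
print(p000, Ex, Ey)
```

Since the state space is symmetric in the three coordinates, $p(0, 0, 0)$ is invariant under any circular permutation of $\alpha$, $\beta$, $\gamma$, which is easy to confirm numerically.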
REFERENCES

Adiri, I., Hori, M., and Yadin, M. (1973). J. Assoc. Comput. Mach. 20, No. 4.
Arsac, J. (1977a). In “Automata, Languages and Programming” (J. Loeckx, ed.), Lect. Notes Comput. Sci., Vol. 14, pp. 112-128. Springer-Verlag, Berlin and New York.
Arsac, J. (1977b). “La construction de programmes structurés.” Dunod, Paris.
Ashcroft, E., and Wadge, W. W. (1975). “Lucid, a Non Procedural Language with Iteration.” University of Waterloo, Ontario.
Baker, B. S. (1977). J. Assoc. Comput. Mach. 24, 98-120.
Boussinot, F. (1977). Thèse de 3ème cycle, Paris.
Chu, Y. (1975). “High-Level Language Computer Architecture,” pp. 1-14. Academic Press, New York.
Chu, Y. (1972). “Computer Organization and Microprogramming.” Prentice-Hall, Englewood Cliffs, New Jersey.
Cousineau, G. (1977a). In “Programmation” (B. Robinet, ed.), pp. 53-74. Dunod, Paris.
Cousineau, G. (1977b). Thèse d’État, Paris.
Denning, P. J. (1968). Proc. AFIPS pp. 915-922.
Dahl, O.-J., Dijkstra, E. W., and Hoare, C. A. R. (1972). “Structured Programming.” Academic Press, New York.
Dijkstra, E. W. (1968a). Commun. ACM 11, 147-148.
Dijkstra, E. W. (1968b). In “Programming Languages” (F. Genuys, ed.), pp. 43-112. North-Holland Publ., Amsterdam.
Denning, P. J., and Schwartz, S. C. (1972). Commun. ACM 15, 191-198.
Floyd, R. W., and Knuth, D. E. (1971). Inf. Process. Lett. 1, 23-31.
Galtier, C. (1977). Proc. IFIP pp. 291-296.
Goldberg, J., ed. (1973). “The High Cost of Software.” Stanford Res. Inst., Stanford, California.
Gordon, W. J., and Newell, G. F. (1967). Oper. Res. 15, 254-265.
Hansen, P. B. (1973). “Operating System Principles,” pp. 188-189. Prentice-Hall, Englewood Cliffs, New Jersey.
Jackson, J. R. (1963). Manage. Sci. 10, No. 1.
Knuth, D. E. (1974). ACM Comput. Surv. 6, No. 4.
Kosaraju, R. S. (1974). J. Comput. Syst. Sci. 9, 232-255.
Ledgard, H. (1975). “Programming Proverbs for FORTRAN Programmers.” Hayden, Rochelle Park, New Jersey.
Ledgard, H., and Marcotty, M. (1975). Commun. ACM 18, 629-639.
Little, J. D. C. (1961). Oper. Res. 9, 383-387.
Loveman, D. B. (1977). J. Assoc. Comput. Mach. 24, 121-145.
McCarthy, J. (1960). Commun. ACM 3, 184-195.
Nolin, L., and Ruggiu, G. (1973). ACM Symp. Princ. Program. Lang., 1973, pp. 108-119.
Rice, R., and Smith, W. R. (1971). Proc. SJCC, 1971, pp. 575-587.
Ruggiu, G. (1973). C.R. Hebd. Séances Acad. Sci., Ser. A 211, 371-373.
Ruggiu, G. (1974). Thèse d’État, Paris.
Standish, T. A., Harriman, D. C., Kibler, D. F., and Neighbors, J. M. (1976). Proc. ACM Natl. Conf., 1976, pp. 509-516.
Urschler, G. (1975). IBM J. Res. Dev. 19, 181-194.
Vasseur, J. P. (1973). “Le système EDELWEISS.” Colloq. Int. Mém., Paris.
Vasseur, J. P., and Ruggiu, G. (1973). C.R. Hebd. Séances Acad. Sci., Ser. A 211, 345-347.
Vuillemin, J. (1974). Thèse d’État, Paris.
Widory, A., and Roucairol, G. (1977). Comput. Sci. 11, 209-233.
Wirth, N. (1973). “Systematic Programming.” Prentice-Hall, Englewood Cliffs, New Jersey.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 48

Electron Physics in Device Microfabrication. I
General Background and Scanning Systems

P. R. THORNTON

GCA Corporation
Burlington, Massachusetts
I. General Introduction
II. Photon and Electron Beam Lithography for Device Microfabrication
  A. Photolithographic Methods
  B. Electron Beam Lithography
  C. Comparison between Lithographic Approaches
III. Interaction between an Electron Beam and a Resist-Coated Substrate
  A. Interactions within the Resist Itself
  B. Role of the Underlying Substrate
  C. The Proximity Effect
IV. Electron Beam Methods for Device Microfabrication
  A. Pattern Generation by a Scanning Electron Beam System
  B. Electron Projection Systems Based on a Photocathode
  C. Electron Beam Projection Systems
  D. X-Ray Lithography
  E. The Central Role of the Scanning Electron Beam Approach
V. The Development of a Fast Scanning System-General
  A. The Electron-Optical Design Problem
  B. Practical Aspects of the Electron-Optical Design
  C. Question of Cathode Choice
  D. Basic Electron-Optical Properties of Guns
  E. Field Emission Cathodes
  F. Thermal Cathodes
  G. Fiducial Mark Detectors, Beam Blanking, and Alignment
VI. The Development of a Fast Scanning System-Use of Thermal Cathodes
  A. Introduction
  B. Application of Spot Shaping and the Use of a Variable Aperture to Obtain High Throughput
  C. Use of Koehler Illumination
  D. Experimental Results
VII. The Development of a Fast Scanning System-Use of Field Emitter Cathodes
  A. Introduction and Potential Performance
  B. Practical Difficulties Associated with Field Emitter Microfabrication Systems
  C. Experimental Work
  D. Future Possibilities
VIII. The Development of a Fast Scanning System-The Role of Computer-Aided Design
  A. General Approach
  B. The Combination of Contributions to the Final Image Size, Shape, and Position
  C. The Computer-Aided Design of Individual Electron-Optical Components
IX.
  A. Unique Features Required of a Microfabrication Deflector
  B. Deflector Design Philosophy and a Statement of the Electron-Optical Problem
  C. The Question of Scan Strategy
  D. Analytical Treatments of the Deflector Problem
  E. Recent Experimental Data
  F. Fabrication Aspects
  G. Electronic Implications
  H. Calibration and Diagnostics
  I. Future Possibilities
X. High-Current Effects
  A. The Role of Coulombic Effects
  B.
  C. Additional Data Pertaining to Coulombic Interactions
  D. Application to Microfabrication System Design
References
I. GENERAL INTRODUCTION

The electronics industry has, in the last thirty or more years, seen an astounding number of new inventions and developments which have confounded all predictions as to saturation of growth within the industry. After the enforced stimulus of World War II, a happier growth resulted from the invention of the transistor in the late 1940s. In the following decade, this inventiveness was exploited with increasing skill and miniaturization, culminating in the late 1950s with the realization of the integrated circuit. In the following years, workers improved on this approach under the normal stimulus of commercial competition, under the drive imposed by the telephone industry in its need for fast, ultrareliable switching with reduced space requirements, and under the impetus resulting from space exploration, where ultrareliability coupled with minimum weight and power needs represents the ultimate goal. The next growth factors were the successful introduction of the MOS technology, which both complemented and simplified the bipolar approach, and the successful introduction of GaAs into the forefront of semiconductor technology, resulting in the large-scale manufacture of electroluminescent devices and in the possibility of optical coupling in LSI circuits. In the early 1970s, these technologies resulted in the introduction of the first of the hand-held calculators, the HP-35. This new form of computing facility had significant impact first on the scientific and engineering communities and then, as price reduction occurred, on the general public. At present, the industry is striving to fully realize and exploit the microprocessor in all its possible variants. The microprocessor will have a similar impact on both scientific and consumer markets. New technologies involving bubble memories, CCD arrays, fiber optics, Josephson junctions, GaAs LSI, and silicon on sapphire are waiting in the wings in varying degrees of readiness to expand our capability in the area of miniaturized devices. This continuing drive toward miniaturization is very emphatically stressed today, as it can be shown that by making the basic chips on which the new instrumentation depends still smaller, significant gains can be achieved in terms of speed, reduction of power consumption, increase in memory capability, and reduction of cost. In somewhat oversimplified terms, this need implies that we have to reduce the “minimum element” size from 5 to 2 to 1 µm, then to submicron dimensions, and then to 1000 Å, successively, in the next 10 to 15 years. This specification, in turn, implies an ability to make photomasks (or to develop an equivalent process) to this level of resolution. In addition, techniques for the actual fabrication of the device elements themselves have to be established concurrently with the ability to outline the necessary pattern. In relation to a mask-making capability of increasing resolution by optical means, we come up against the fundamental limit set by the diffraction of the light used in the photoprocess. The exact point at which diffraction becomes the limitation depends to some extent on the details of the particular application envisaged. But, in general, it is difficult to make complex circuitry with resolutions better than 1-2 µm. In special cases of simple devices, device elements of just less than 1 µm can be obtained.
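The scale of the diffraction limit mentioned above can be illustrated with the Rayleigh criterion, d = 0.61 λ / NA. The wavelengths and numerical aperture below are assumed for illustration, not taken from the text.

```python
# Rayleigh resolution limit for representative mercury-lamp UV lines
# (g, h, i at 436, 405, 365 nm) and a modest projection-optics NA.
def rayleigh_um(wavelength_nm, numerical_aperture):
    """Minimum resolvable feature size, in micrometers."""
    return 0.61 * wavelength_nm / numerical_aperture / 1000.0

for wl_nm in (436, 405, 365):
    print(f"{wl_nm} nm, NA 0.16: d = {rayleigh_um(wl_nm, 0.16):.2f} um")
```

With these assumed parameters the limit comes out in the 1-2 µm range quoted in the text, and shortening the wavelength improves it, as discussed below.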
But it is not possible to fabricate complete circuits with submicron detail in a commercial environment with photomasks as we understand them today. It should be stressed that optical mask makers and suppliers of equipment to make such masks have matched the general inventiveness of the electronics industry by a continuing improvement in the specification of mask makers for commercial environments. The exposure area has been continually enlarged to accommodate larger and larger masks. The development of good ultraviolet sources of decreasing wavelength has improved the resolution capability set by diffraction. Good engineering, both optical and mechanical, has simplified the process of successive mask alignment. Automation has been introduced to some degree to minimize human error, while wear-and-tear problems associated with contact printing have been largely eliminated by projection printers in which no contact occurs between the mask and the resist-coated substrate. Finally, the last year has seen the development of optical systems that eliminate the mask by directly “stepping and repeating” a group of “chip” patterns onto the wafer
itself. Optical processing has tremendous advantages in that it is a parallel process, does not involve vacuum systems, and is relatively inexpensive. Nevertheless, the need to go beyond the diffraction limit has led to a search for other methods of device “microfabrication” that can complement or supersede optical methods. Obvious approaches to this problem involve the use of electron beams and X rays. Briefly, two approaches can be considered: either a scanning electron beam can be used to write a predetermined pattern directly onto a resist-coated wafer, or it can be made to expose a similar pattern onto a coated mask. Microfabrication is a considerable challenge involving skills at (and beyond) the current state of the art in many disciplines. Leaving aside the exacting skills required in the fabrication of the devices themselves, we need a heavy involvement with the skills listed in Table I.

TABLE I

SKILLS INVOLVED IN ELECTRON BEAM MICROFABRICATION, IN ADDITION TO DEVICE FABRICATION TECHNIQUES, ELECTRON-OPTICAL SKILLS, AND BEAM-MATERIAL INTERACTIONS EXPERTISE

Vacuum technology
Laser interferometry
Precision mechanical engineering
X-Ray generation
Computers, computer architecture and software
Digital electronics
Analog electronics

In these two articles, we shall concern ourselves only with the electron-optical problems posed, the
development of electron sources, the interaction of electrons with solids and electron resists, and the use of X rays in microlithography. The other technologies will only be discussed when they interact with the electron physics aspect of the system in a significant or new way. The division of content between the two articles is somewhat arbitrary. Part I consists of a general description of the approaches available and a detailed description of the various methods whereby scanning beam systems can be exploited. Part II covers projection methods, approaches dependent on X rays, and electron beam resists. No consideration is given to the economics of the relative worth of electron beam methods compared to optical approaches. The comparison, on the basis of economics, of an established technology with a new, unproven one of greater potentiality is a difficult one that can only be made
by individuals cognizant of the details of the particular use or application under consideration. The conclusion reached will depend heavily on these details. Such details are often protected information within the commercial domain. Here, the basic assumption is made that both optical and electron beam systems will complement each other in the next ten years, that there are ready applications for both approaches, and that a mutual compatibility will increase the value of each. Another omission is also to be noted. No attempt will be made to compare electron beam systems commercially available at the present time. The reason for this is twofold. Our main preoccupation here is with the physics and engineering implicit in present and future systems, and in a rapidly changing field it is extremely difficult to give an up-to-date and fair perspective. At this stage, it is perhaps wise to stress that limits other than the resolution limit of the pattern generation may become the limiting factor in the process of microminiaturization. Here we can note as examples three possibilities: one practical, one a question of design, and the third fundamental. As the devices get smaller and smaller, their susceptibility to yield loss because of particle contamination increases. Inherent in the process of microminiaturization is the possibility that the increased cost of increased cleanliness and dust filtration may become prohibitive. As we seek to pack more and more devices onto a single chip, that chip can perform an increasing number of functions. But it can only perform these functions by communication with the outside world by means of connectors of one form or another. Is there a point at which we become connector limited? Finally, there are fundamental limits associated with device processes themselves. Keyes (1975) has considered this aspect in the case of digital devices.

II. PHOTON AND ELECTRON BEAM LITHOGRAPHY FOR DEVICE MICROFABRICATION

A. Photolithographic Methods
Figure 1 shows in schematic form the methods used to print the required device pattern onto the wafer surface. The starting point in each approach is the use of the computer to reduce the physical design to a series of patterns that can be reproduced on the wafer surface. Each layer of masking is also reproduced as a magnified plot of the total layer for checking purposes. The left-hand column of Fig. 1 shows a two-stage process in which soft emulsions are used to produce first an intermediate-sized “reticle,” which consists of, say, a single ×10 image of the required pattern. Subsequently, this image is reduced by a further factor of ten and exposed onto a mask plate. The
FIG. 1. Schematic representation of modern photolithographic processes for device microfabrication.
plate is then accurately “stepped” over to the next area and the pattern is repeated. This process of “step and repeat” is continued over the whole of the mask plate to give, for example, about 400 images of a required LSI device with a 0.5-cm chip size on a 10-cm (4 in.) mask. Working masks are then reproduced from this master mask. The working mask is then placed in contact with the resist-covered wafer, and the whole wafer is exposed in one “shot” using an intense and uniform ultraviolet source. Contact printing has some disadvantages in that the actual physical contact between mask and wafer leads to wear-and-tear problems which reduce the yield.
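The arithmetic behind the quoted figure of about 400 images is simple (scribe lanes and edge exclusion are ignored here, so the count is an upper bound):

```python
# How many whole chip images fit on a square mask when stepped
# edge to edge.
def images_per_mask(mask_cm, chip_cm):
    """Number of whole chip images on a square mask of given side."""
    per_side = int(mask_cm // chip_cm)
    return per_side * per_side

print(images_per_mask(10, 0.5))   # 400: the "about 400 images" quoted
print(images_per_mask(2, 0.5))    # 16: as for a 2 cm x 2 cm field
```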
The center column in Fig. 1 shows a variant of the above approach that avoids physical contact between the mask and wafer. Here a hard-surface reticle and mask can be used to advantage and can be photoprinted in one of two ways. Either the mask can be held close to the wafer but definitely separated, and then exposed (proximity printing); or the mask can be located far from the wafer and used as the effective source in a 1:1 projection system in which the wafer is located in the image plane (projection printing). Both methods outlined so far involve the use of a final masking stage. The right-hand column in Fig. 1 shows a process in which the final masking stage is eliminated: direct writing by a step-and-repeat process onto the wafer. Hitherto we have used the word “reticle” to describe a single, magnified image of the chip pattern. In this approach, the reticle consists of a wider field than an individual chip. This pattern is then demagnified and exposed directly onto the wafer by a step-and-repeat camera. This process effectively eliminates one stage of masking at the expense of additional alignment and exposure time. Although limited by problems associated with source distribution and dimensional repeatability, the final field can still be of the order of 2 cm × 2 cm and could, therefore, contain sixteen 0.5 cm × 0.5 cm chips. It should be stressed that the numbers given in this section are intended only to indicate the scale of the changes involved. In practice, a wider range of changes is used, and in detail, variants of the processes given in Fig. 1 are exploited as the need arises. But the figure does give the required overall picture and can be used to compare and contrast lithographic methods based on electron beam techniques.

B. Electron Beam Lithography

Figure 2 summarizes possible electron beam approaches. Once again the starting point is the use of computer-derived artwork and data for pattern generation.
The data are used to control an electron beam pattern generator. This pattern generator is a scanning system that exposes the pattern sequentially, point by point, over the target area. The output from this pattern generator can take one of two forms: either a hard-surface mask (×1) or a pattern written directly onto the wafer. Consider the use of a mask first. Three ways of exposing a pattern using an electron beam-generated mask have been proposed. In the first method, the diffraction-limited problem is avoided by the use of soft X rays, which effectively reduce the operative wavelength by a factor of 100 or more. In this case, the mask is different from the conventional type in that it is specifically designed to give high contrast under X-ray exposure, to be stable under such bombardment, and to have the necessary alignment capability. The second approach involves the use of
FIG. 2. Schematic representation of device microfabrication methods involving electron beam systems.
an electron beam projection system. Here again a rather special mask is used to define the throughput of an electron beam projector, which exposes every element of a chip in parallel. The wafer is then stepped and exposed over its whole area. In the third of the possible exposure methods, the pattern is generated on the surface of a photocathode, which is then incorporated in an electron beam projection system as the electron source. Exposure of the entire wafer (or a substantial fraction of it) is obtained by flooding the source with UV, causing the photocathode to emit electrons at the required locations. These electrons are projected to give, for example, a 1:1 image on the wafer. In this case, the diffraction limit is circumvented because of the thinness of the photocathode and its overlying patterns. In essence, the resultant angular apertures used are large, and diffraction effects are avoided (see Section IV,B).
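The scale factors involved in the X-ray route can be illustrated with a rough Fresnel-diffraction estimate. The Python sketch below is a back-of-envelope aid, not a result from the text: the wavelength and mask-wafer gap values are assumptions chosen only to show how a 100-fold reduction in wavelength affects the proximity-printing limit.

```python
import math

def min_linewidth_m(wavelength_m, gap_m):
    """Fresnel estimate of the smallest printable feature in
    proximity printing across a mask-wafer gap: w ~ sqrt(lambda * g)."""
    return math.sqrt(wavelength_m * gap_m)

GAP = 25e-6          # assumed 25-um mask-wafer gap
UV = 400e-9          # assumed near-UV exposure wavelength
XRAY = UV / 100      # soft X ray, ~100x shorter, as in the text

w_uv = min_linewidth_m(UV, GAP)
w_x = min_linewidth_m(XRAY, GAP)

print(f"UV proximity limit   : {w_uv * 1e6:.2f} um")
print(f"X-ray proximity limit: {w_x * 1e6:.2f} um")
```

Note that a 100-fold reduction in wavelength buys only a 10-fold improvement in the diffraction limit, since the limit scales as the square root of the wavelength.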
The final way of using an electron beam pattern generator is to write the pattern directly onto the wafer itself. Here all masking is avoided at the expense of exposure time. A significant factor in the commercial effectiveness of this approach is the speed with which such a “direct writer” can operate. It is this factor that really poses the challenge in electron beam lithography. An electron beam mask maker can be a fundamentally slow machine: resolution, stability, and repeatability can be stressed at the expense of speed. In the case of a fast machine for direct writing, this freedom of design is removed. As a consequence, the first-generation machines for the commercial market have been slow mask makers. While device engineers are gaining experience with such machines, fast direct writing machines (with mask-making capability) are being developed. We shall return to the question of system speed and stability frequently in the following pages, particularly in Section V. For the present, it is sufficient to state that electron beam mask makers will probably have a significant but limited role in the next few years, and that they will be difficult to upgrade to direct exposure machines without fundamental change. On the other hand, the development of fast commercial machines uncompromised in design for the sake of immediate market entry is some two to three years from completion. At this stage, it is pertinent to compare and contrast the photon and electron beam approaches in a general manner, in order to stress some straightforward factors that are often overlooked.

C. Comparison between Lithographic Approaches
The essential similarities between the approaches come from the underlying nature of the problem. It is a very exact pattern we are seeking to produce. To write 0.5-µm detail over 0.25-cm² chips requires data for at least 10⁸ sites. To overlay such patterns to a repeatability of 0.1 µm involves physical systems measuring to 2 parts in 10⁵. To achieve this specification on a day-to-day basis without loss due to system downtime implies that we cannot work at the limit of our measuring systems; we must have factors of 2, 4, or 8 in hand. Few physical systems match up to these needs. This severity of specification has two immediate implications. One is that, in both approaches, such systems are not going to be produced without cost and great attention to detail. The second implication is a heavy reliance on laser interferometry. Without the absolute scaling ability of the laser interferometer, further advances in microlithography would be immensely complicated.

Photon-based approaches, particularly those not requiring vacuum systems, have a significant advantage in terms of cost effectiveness in that the
“number crunching” aspects of the problem are reduced by the use of parallel exposures. To be competitive, a purely scanning system working at the limit of resolution over the entire chip area must be capable of working at data rates of the order of 100 MHz or more. This speed requirement is at the limit of our present capability and so poses reliability problems for computer, digital, and analog engineers. As a result, a third approach, lying between the completely parallel exposure and the completely sequential, is developing which seeks to achieve the required performance by writing as many elements as possible in parallel without loss of resolution. It is of interest to note that while this remark is to be stressed for electron beam (EB) systems, it is also true of photon-based approaches. The most recent technique in photolithography sacrifices the complete parallel exposure, eases the uniformity-of-intensity and repeatability problems, and improves resolution by breaking up the wafer field into a relatively few “super chips.” Further discussion of this point must wait until we have established the necessary background in detail. We begin this process by outlining the basic interactions between an electron beam and a resist-coated wafer.
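The orders of magnitude quoted above can be checked with a few lines of arithmetic. The Python sketch below uses the figures from the text (0.5-µm detail on a 0.5 cm × 0.5 cm chip, 0.1-µm overlay, a 100-MHz data rate); the chips-per-wafer count is an assumption added purely for illustration.

```python
chip_side_m = 0.5e-2        # 0.5 cm chip side (from the text)
feature_m = 0.5e-6          # 0.5-um detail (from the text)
overlay_m = 0.1e-6          # 0.1-um overlay repeatability (from the text)
data_rate_hz = 100e6        # 100-MHz sequential writing rate (from the text)

sites_per_side = chip_side_m / feature_m        # ~1e4 addressable positions per side
sites_per_chip = sites_per_side ** 2            # ~1e8 sites, as in the text

raw_chip_time_s = sites_per_chip / data_rate_hz   # ~1 s of pure writing per chip

overlay_fraction = overlay_m / chip_side_m        # ~2e-5, i.e., 2 parts in 1e5

chips_per_wafer = 100       # assumed count for a 7.5-cm wafer (illustrative)
raw_wafer_time_s = raw_chip_time_s * chips_per_wafer

print(sites_per_chip, raw_chip_time_s, overlay_fraction, raw_wafer_time_s)
```

Even at 100 MHz the writing alone costs about a second per chip; stage settling, alignment, and other overheads come on top, which is why partially parallel schemes are attractive.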
III. INTERACTIONS BETWEEN AN ELECTRON BEAM AND A RESIST-COATED SUBSTRATE
A. Interactions within the Resist Itself
In addition to sensitivity and acid resistance, two of the fundamental properties required of a resist layer are that it be smooth and free of pinholes. As a result of this stipulation, most resists are polymeric organic materials with long-chain molecules. The molecular weight range is from, say, 10,000 to 500,000. When an electron beam is used to irradiate an organic film, two basic mechanisms occur. One process is the expulsion of an electron from a molecular bond, giving rise to a free radical (“chain scission”). The second mechanism is the re-forming of the broken bonds. The macroscopic behavior of the film upon irradiation will depend on which of these two mechanisms predominates and on the detailed manner in which the bond rejoining occurs. If the bond-breaking mechanism predominates, the net result is that the molecules become increasingly fragmented, the molecular weight drops, and there is an increased solubility in solvents; the material is classified as a positive resist. By this expression, we mean that windows can be developed in the resist in the regions where the film has been irradiated. By way of contrast, negative resists are materials in which the mean molecular weight increases in the irradiated regions. This increase in molecular weight occurs when the bonds are predominantly remade between molecular
chains, i.e., when “cross-linking” predominates. In such materials, islands of resist film are left in the irradiated areas after suitable solvent treatment. The quantities that define the behavior of a resist are shown in Fig. 3 for the case of a positive resist.
FIG. 3. Parameters used to specify the properties of a positive resist (film thickness remaining vs. electron dose, C/cm²).
Experiments are performed in which all parameters, such as beam properties, resist properties, resist pre- and posttreatments, and development schedules, are kept constant, and the film thickness after development is recorded as a function of the absorbed electron dose. Three parameters define the behavior. D₀ is the minimum dose required to render the resist completely soluble in the given solvent. Dᵢ is the dose at which significant changes in solubility first become apparent. The contrast parameter (the γ value) is defined by

γ = [log₁₀(D₀/Dᵢ)]⁻¹   (1)

and is a measure of the contrast that results from the use of the particular resist under the given conditions of exposure and development. D₀ and Dᵢ, particularly D₀, determine the sensitivity of the resist. The lower these quantities, the shorter the exposure time for a given beam current, the greater the throughput, and the greater the cost effectiveness of the total system. An assessment of how γ affects the figure of merit of a resist can be made from Fig. 4. In high resolution work, for example, the edge profile of the resist is important. Some typical resist profiles are shown schematically in Fig. 4a. The profile will be specified by the fraction of film thickness required after development and by the allowable width of the edge region, for example,
FIG. 4. (a) Typical resist profiles observed using a positive resist such as PMMA. (b) Good, indifferent, and poor profiles resulting from the use of a negative resist.
region ab in Fig. 4b. The thickness of film remaining will determine the ability of the resist layer to withstand the necessary wet or dry etchants used to outline the structure in the silicon wafer itself, while the width of the transition range ab will affect the sharpness of definition of the edge of the resultant structure and, hence, the ability to align the structure to coincide with or to avoid adjacent device elements. In determining the final thickness, the device engineer is free to specify where on the sensitivity curve shown in Fig. 3 he wishes to operate. But the specification of this point also largely determines the edge resolution obtainable. We can see this interaction from the drawings shown in Figs. 5 and 6. In Fig. 5a, we have illustrated a reasonable operating point and an associated operating or tolerance range. Imagine the application in which we are seeking to write a narrow line with a single pass of the beam. In general, unless we design otherwise, the current distribution in an electron beam is gaussian (see Fig. 5b). By adjustment of the exposure time (scanning rate), a width AA′ will receive a dosage lying in the specified range. But there are regions outside AA′ that receive a finite fraction of the required charge density and so will lead to films with varying masking capabilities. The
FIG. 5. (a) A suggested definition of the allowed operating point and tolerance range, (C₁ + C₂)/2 and (C₂ − C₁). (b) and (c) Showing how, for a given exposure time ΔT and a gaussian distribution of scale determined by σ, a region AA′ will be correctly exposed.
extent of these transition regions is directly related to the γ factor of the resist and will extend from the operating point down to a dosage approaching Dᵢ. Figure 6 illustrates a situation that compares two resists of near equal sensitivity but with significantly different γ values. In the two cases shown, the edge regions will extend down to points corresponding to doses C₁ and C₂; or, in terms of the spot distribution, down to points D₁ and D₂ from the line center. After development, these figures will translate into the line shapes schematically shown in Fig. 6c. This question of the interaction between resist properties, beam profile, and the required edge resolution can be
FIG. 6. (a) Definition of two resists of near equal sensitivities and significantly different γ values. (b) Corresponding lateral positions of the exposed and unexposed regions for the two resists. (c) Schematic representation of the corresponding line profiles resulting in the two cases.
summed up from the viewpoint of the resist by saying that an ideal resist is sensitive and has a very steep slope, i.e., a high γ value. The treatment given above represents an idealized oversimplification of the situation that is met experimentally, in that it neglects the presence of an underlying substrate and its interaction with the electron beam. We consider this complication in the next section.
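The interplay just described between beam profile and resist contrast can be put into numbers. The Python sketch below is illustrative only (the beam scale, the peak dose, and the two Dᵢ values are assumptions, not measured resist data): a line written with a gaussian dose profile is fully exposed out to where the dose falls to D₀, and partially exposed on out to where it falls to Dᵢ, so a higher γ in the sense of Eq. (1) gives a narrower transition region.

```python
import math

def half_width(d_peak, level, sigma):
    """Distance from line center at which a gaussian dose profile
    D(x) = d_peak * exp(-(x/sigma)**2) falls to `level`."""
    return sigma * math.sqrt(math.log(d_peak / level))

def gamma(d0, di):
    # Contrast parameter of Eq. (1): gamma = 1 / log10(D0/Di)
    return 1.0 / math.log10(d0 / di)

SIGMA = 0.1      # gaussian beam scale, um (assumed)
D_PEAK = 3.0     # peak dose at line center, in units of D0 (assumed)
D0 = 1.0         # dose for complete development, normalized

# Two hypothetical resists of equal sensitivity (same D0) but different Di
for di in (0.5, 0.1):
    edge = half_width(D_PEAK, D0, SIGMA)       # fully exposed half-width
    fringe = half_width(D_PEAK, di, SIGMA)     # partially exposed half-width
    print(f"gamma = {gamma(D0, di):.2f}: "
          f"transition width = {fringe - edge:.3f} um")
```

With these numbers the γ = 3.32 resist leaves a transition region roughly a third as wide as the γ = 1 resist, which is the behavior sketched in Fig. 6.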
B. Role of the Underlying Substrate

The underlying substrate can complicate the exposure problem in the two ways outlined in Figs. 7 and 8. Figure 7a shows the role that the substrate plays in a smooth-surface situation. Depending on the beam energy, the thickness of the resist, and the atomic number of the underlying substrate, there will be a contribution to the dose received by the resist film
FIG. 7. (a) An indication of the scattering that occurs in a resist-covered substrate. (b) The lateral extent of the region exposed by incident electrons. (c) The region in which backscattered electrons contribute significantly to the exposure. (d) A summation of the effects of incident and backscattered electrons.
FIG. 8. The role of localized surface topography during the latter stages of device fabrication: (a) the nature of the scattering, (b) the resultant resist profile viewed sideways.
from electrons backscattered from the substrate. So the net exposure consists of a narrow cylinder with dimensions largely determined by the incident beam traversing the resist layer and by scattering within the resist itself (Fig. 7b), and a wide domelike distribution centered on the lower surface of the resist and with larger dimensions determined by the backscattering (see Fig. 7c). Figure 8 outlines, in a particularly schematic manner, the way in which the surface topography of the device can affect the resist exposure. In the deposition of the final device layers, the surface is no longer planar. Previous processing procedures have led to surface steps that can approach a micron in height. Figure 8a shows the scattering situation that exists as the beam
approaches such an edge. The net result is that the region A near the top of the step is “starved” of electrons, while the region B near the bottom of the inclined face receives an overdose. The resultant resist profile viewed sideways is shown in Fig. 8b. This distribution could, for example, lead to a narrowing of an emitter lead in the region A and a broadening in the region B. In the former case, device failure problems associated with current continuity and current crowding can result. In the latter case, interference problems with adjacent elements can occur. Of these two effects, the former has received the fuller treatment in the literature (Chang, 1975b; Ozdemir et al., 1973; Chang et al., 1974). Here we follow the work of Chang (1975b) using the PMMA resist. This backscattering effect becomes increasingly important as we seek to make submicron structures with micron-sized spacings between the adjacent elements. Under these conditions, there is a contribution to the dose received by a given element from the exposure used to make adjacent elements. This cooperative effect has come to be known in the literature as the proximity effect. The orders of magnitude associated with this effect are shown in the next section.

C. The Proximity Effect
The approach adopted by Chang (1975b) was to determine experimentally the variation of dose required for exposure as a function of linewidth and gap separation between adjacent lines. The conditions used are given in the caption of Fig. 9. The data can be summarized by saying that for devices requiring 2-µm lines or wider with 2-µm spacings or wider, the variation in
FIG. 9. Data indicative of the proximity effect (required dose vs. line separation, µm), after Chang (1975b). Experimental conditions: resist, PMMA; beam voltage, 25 kV.
dose needed as a function of linewidth is small and can be accommodated within the working range of the resist. If we seek to write 0.5-µm lines even with 3-µm spacings, a considerable increase (of the order of 30%) is required in the exposure doses because proximity contributions are absent. As the spacing is reduced, the required dose is reduced for all linewidths, but the variation only falls outside the resist tolerance for lines of the order of 1 µm or less. For the present, it is sufficient to note that scattering from the substrate, and to a lesser extent in the resist itself, imposes a need for a dynamic correction to be applied when we are concerned with submicron geometries. And this correction has to be “tailored” to meet the needs of each resist, its exposure and development schedule, and to meet the needs of the local device geometry at a given site on the wafer. In quantitative terms, using the conditions given in Fig. 9 and making the exposure with a 0.2-µm beam, Chang determined that the exposure intensity distribution could be expressed as the sum of two gaussian distributions:

C₁ exp[−(r/B₁)²] from the incident beam

and

C₂ exp[−(r/B₂)²] from the backscattered beam,

where r is the radial distance from the beam position. B₁ lies between 0.1 and 0.2 µm, B₂ is 1–1.2 µm, and C₁/C₂ ≈ 1.5–3. The backscattered contribution can be as much as 30% of the total dose.

IV. ELECTRON BEAM METHODS FOR DEVICE MICROFABRICATION
A. Pattern Generation by a Scanning Electron Beam System
Figure 10 shows the essential elements of an electron column for pattern generation by sequential scanning. The system shown resembles, in many ways, a scanning electron microscope. An electron gun provides the required current of electrons at the required beam voltage. From the viewpoint of column components further down the optic axis, the gun can be represented by an effective crossover of given diameter operating at a given brightness. Two types of electron gun can be used. Guns based on the use of thermal cathodes have been generally used (Broers, 1965; Herriott et al., 1975; Varnell et al., 1973; Beasley and Squire, 1975; Ozdemir et al., 1973; Piwczyk and Mcquhae, 1973), although some work has been done using field emission cathodes (Friedman et al., 1973; Stille and Astrand, 1977). A system of magnetic lenses delivers a demagnified image of the cross-
FIG. 10. The basic components of a scanning electron beam microfabrication system (magnetic lenses, blanking and limiting apertures, deflection system, fiducial mark detectors, and laser-driven stage).
over onto the target plane. Typically these lenses will demagnify a source of 40 µm in diameter into a beam diameter of 0.2–1.0 µm. A deflection system is suitably located to move the electron beam over an area of, say, 1 mm × 1 mm to 5 mm × 5 mm depending on the system and application. Within the column itself, there are three other important components. A beam-limiting aperture is used to define the beam half-angle at the target. This component is largely instrumental in determining the engineering trade-off between beam current and aberration properties. A beam blanker acts under computer control to turn the beam on or off at the required
location in synchronism with the deflector unit. Associated with the deflector unit are subsidiary coils to provide dynamic correction as the beam is scanned toward the edges of the electronic scan field. The specimen itself is mounted on a laser-controlled stage which provides an absolute and repeatable measure of the wafer movement during exposure and a means of electronic calibration. Located immediately above the wafer itself is an electron detection system which receives electrons backscattered from the target. The target itself has a series of markers (“fiducial marks”) imposed on it in a predetermined pattern. These fiducial marks are printed onto the target during the first exposure and are used to mechanically align subsequent patterns with the first layer. The method of operation of such a machine can be outlined by considering the particular “direct writing” application in which a second layer is to be placed upon the first completed layer. A typical sequence of events can be summarized in list form as follows:

(1) After loading, pumpout, etc., initial routines determine that the column, gun, electronics, and data processing are performing to specification.
(2) After inclusion of any particular data required for the application, the laser stage is directed to bring the first chip into the scan area of the beam.
(3) The beam is directed to find the fiducial marks bounding the first chip area. Taking due care to turn the beam off where appropriate to avoid inadvertent exposure of the resist, the fiducial marks are located, the locations compared to previous values, and corrections made for changes in origin shift, scale, and rotation.
(4) Next the beam is allowed to free run and expose the required pattern, correction for dynamic focusing in large scan area applications and for the proximity effect being applied as the scan proceeds.
(5) On completion of the pattern, the fiducial marks could be relocated, compared to the previous locations, and a computer printout obtained if the system has moved out of specification.
(6) On completion of the first chip, the laser stage moves the wafer to position the second chip under the beam, and the whole process is repeated until every chip has been exposed.
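The sequence above amounts to a per-chip control loop. The following Python sketch mimics it; every class, function, and data structure here is hypothetical, invented purely for illustration, and is not the interface of any real machine.

```python
# Hedged sketch of the direct-writing exposure loop; all names are invented.

class Stage:
    """Stand-in for the laser-controlled stage."""
    def __init__(self, chip_origins):
        self.chip_origins = chip_origins
    def move_to(self, origin):
        self.position = origin          # stage settling time ignored here

def locate_fiducials(origin):
    # Stand-in for the beam-based fiducial search (step 3); a real system
    # blanks the beam between marks to avoid inadvertent resist exposure.
    return [(origin[0] + dx, origin[1] + dy)
            for dx, dy in ((0, 0), (1, 0), (0, 1))]

def correction_from(marks, expected):
    # Origin-shift correction only; a real system also fits scale and rotation.
    return (expected[0][0] - marks[0][0], expected[0][1] - marks[0][1])

def expose_wafer(stage, expected_marks):
    exposed = []
    for origin in stage.chip_origins:        # step 6: repeat for every chip
        stage.move_to(origin)                # step 2: position the chip
        marks = locate_fiducials(origin)     # step 3: find fiducial marks
        shift = correction_from(marks, expected_marks[origin])
        exposed.append((origin, shift))      # step 4: write with correction
    return exposed

origins = [(0, 0), (10, 0), (0, 10)]
expected = {o: [(o[0], o[1]), (o[0] + 1, o[1]), (o[0], o[1] + 1)]
            for o in origins}
print(expose_wafer(Stage(origins), expected))
```

The dead time while the stage moves and settles, step (5) checks, and beam-current monitoring would slot naturally into this loop, as described in the text.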
The sequence can be made more elaborate by the inclusion of various subroutines to establish accuracy, column behavior, etc. In particular, the “dead time” involved while the laser stage moves and settles can be used to monitor the beam current by recording the current delivered to the blanking aperture in the “beam off” position. This sequence of events can be permuted to a considerable degree, but
it does establish the contributions to the time required to complete an exposure:

(a) The dead time associated with loading, pumpout, and system checkout. This overhead is shared among the number of wafers loaded at a given time.
(b) The finite time for the laser stage to move and settle for each chip.
(c) The time taken to locate the fiducial marks, calculate the necessary corrections, and establish the coordinate system for the chip exposure.
(d) The actual time required to make the exposure.

In general terms for a mask-making machine, we would like to complete a 10-cm (4 in.) mask within an hour. For a direct writing machine for micron or submicron application, a good design aim is a total exposure time [(a)-(d) above] of less than 2–4 min per 7.5-cm (3 in.) wafer. The electron-optical design factors that have to be considered and the limiting components include the gun, its brightness, cathode life, and stability of emission; the blanker speed; the deflection system, its speed, alignment, fabrication, aberration performance, and stability; the need to make dynamic correction for deflection defocusing and for the proximity effect in real time on a minicomputer of limited capability; the spherical aberration properties of the final lens as a function of working distance; the signal/noise ratio in the fiducial mark detector system; and the development of an efficient scan strategy to increase throughput and to decrease data rate.

B. Electron Projection Systems Based on a Photocathode
1. A General Description
Originally proposed by O'Keefe (O'Keefe et al., 1969), electron projection systems using a photocathode have been extensively studied over the last eight years. A schematic representation is shown in Fig. 11. The central feature is the photoemissive cathode, which consists of a hard-surface mask made on a good quality quartz substrate. A photoemissive material is deposited onto the mask surface. Two materials have been used. Initial work was carried out using a palladium layer excited by the mercury 2536-Å line (O'Keefe et al., 1969). More recent work has exploited cesium iodide under mercury 1849-Å line excitation (Scott, 1975, 1977; Wardly, 1975a). The cathode is illuminated by UV from the rear face and, in the transparent regions, emits electrons. The resist-coated silicon wafer forms the anode and is placed about 1 cm away from the cathode. A potential difference on the order of 10–20 kV is maintained between the two. A homogeneous magnetic field is
FIG. 11. An electron beam projection system using a photocathode: a cathode projection system (UV excitation, focus coil, magnetic alignment coils, photoemissive layer over the masked areas, resist-covered wafer, and X-ray detectors).
applied along the axis of the electron field. As a result, the electron emission from the cathode is focused point by point on the resist layer. By suitable timing of the UV excitation, the cathode pattern can be transferred in one parallel exposure to the resist layer. A means of aligning sequential layers of a device is required. To achieve this alignment, a set of deflection coils is incorporated to provide small displacements of the beam. The detection of alignment is carried out by using the bremsstrahlung resulting from the beam bombardment. Markers made of a heavy metal are placed on the silicon and are brought into coincidence with complementary markers on the alignment system by maximizing the total X-ray signal received by detectors placed behind the wafer. The alignment correction has to include both linear displacement and rotation. The basic challenges facing this approach arise from four main sources: (a) the chromatic spread of the photoemitted electrons, (b) small perturbations that arise in the electromagnetic fields due to minor mechanical faults, electrical variances, and thermal warping, (c) the complications introduced
by the interactions between electrons backscattered from the target and the combined electromagnetic fields, and (d) the signal-to-noise problem in the alignment system. We briefly indicate the mechanisms and implications in the next section.

2. Possible System Limitations

The existence of a finite energy distribution on the photoemitted beam leads to an inability to focus all the emission from a point source on the cathode into a point on the resist layer. This blurring of the image imposes two limitations. It determines the magnitude of the smallest element that can be obtained on a resist layer by this technique, and it limits the edge resolution that can result. If we consider the numbers appropriate to the CsI photocathode and an accelerating potential of 20 kV, we obtain a minimum device element size of 0.5 µm; in order to avoid loss of contrast at the edges, it is considered necessary to keep the fractional variation in electron energy to less than 1 part in 10⁴. Over and above the short-range effects (the proximity effect outlined in Section III,C), there are long-range effects associated with electrons backscattered out of the target. Such electrons face a retarding electric field, which forces them back toward the resist, and a magnetic field, which causes the trajectories to spiral. As a result, they impact the surface up to distances of the order of half the anode-cathode separation away from the intended image point. The net result is a somewhat variable overall background to the required exposure over the bulk of the wafer. Fortunately, because of the short duration of the exposure, we can use an insensitive resist with a good γ value. Under these conditions, the main implication is the existence of a 5–10% exposure background, which places a restriction on the exposure and development tolerances that can be allowed (Wardly, 1975a).
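The half-gap landing range quoted for backscattered electrons can be checked with a back-of-envelope model. In the Python sketch below (the 20-kV, 1-cm, first-order-focus numbers are assumptions chosen to be consistent with the figures in the text, not values taken from a real system), the axial field B is set so that one cyclotron period equals the cathode-to-wafer transit time; a worst-case elastically backscattered electron, carrying full energy at 45°, then spirals within about two Larmor radii of its launch point.

```python
import math

E_OVER_M = 1.7588e11        # electron charge-to-mass ratio, C/kg

def landing_radius(volts, gap_m, theta_deg=45.0):
    """Maximum transverse excursion (~2 Larmor radii) of an electron
    backscattered from the wafer with full energy at angle theta, in a
    projector whose axial B gives one cyclotron period per transit."""
    v_final = math.sqrt(2.0 * E_OVER_M * volts)   # classical; ~5% slow at 20 keV
    transit = 2.0 * gap_m / v_final               # uniform-field transit time
    omega_c = 2.0 * math.pi / transit             # first-order focus condition
    v_perp = v_final * math.sin(math.radians(theta_deg))
    return 2.0 * v_perp / omega_c                 # full diameter of the spiral

r = landing_radius(20e3, 1e-2)
print(f"landing radius ~ {r * 1e3:.1f} mm for a 10-mm gap")
```

With these numbers the worst-case excursion is about 0.45 of the anode-cathode gap, in line with the "half the anode-cathode separation" figure above; the actual background dose distribution depends, in addition, on the energy and angle spectrum of the backscattered electrons.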
Another effect, due in part to the backscattering and in part to the incident beam itself, is pattern distortion due to surface charge-up. The total current in such a projector system can be quite large. A cesium iodide photocathode can deliver up to 5 µA/cm², so a 7.5-cm (3 in.) wafer with a 50% transparency would deliver over 100 µA of primary current. In contrast to scanning systems, the column where the beam is operative is “uncluttered,” and the scattered electrons can impinge on large regions of the “column” surface. As a result, surface charge-up can be a problem. This comment applies particularly to the resist layer itself. To avoid significant distortions, the surface voltage variations due to charge-up have to be kept to less than a few volts (Wardly, 1975a). The quality of the focusing, and hence the resolution of the resist pattern, depends on the field homogeneity and on its lack of variation across the
exposure field. Various factors can disturb the required constancy both on a micro and on a macro scale. For example, both the photocathode and the anode have to be supported mechanically and located to a high precision. The necessary chucks have to have edges or “lips” to contain these elements. These lips represent a displacement of the potential distribution from the essentially planar geometry required, so edge effects exist around the wafer circumference. These effects are reproducible from run to run and are of the order of 10 µm. Other macroscopic effects include the overall parallelism of the structure. In the case of the cathode, the required planar surface can be obtained by use of an optical flat. Two types of problem arise in the case of the wafer. Surface waviness can result from thermal processing treatments and can lead to pattern distortions which are unacceptable. A similar surface waviness can occur because of thermal expansion if attempts are made to increase the current and thereby decrease the exposure time. This thermal “bowing” of the target is unlikely to be a major impediment, because improvement in exposure rate is not a primary concern. The total throughput is determined by the sum of the exposure time, the pumpout time per wafer, and the alignment time; even eliminating the exposure time completely would not increase the throughput significantly. Therefore, the emphasis here is on achieving the spatial resolution even at the expense of a slightly increased exposure time. The question of “out-of-plane” wafer distortion due to thermal processing has been attacked by using an electrostatic chuck to flatten the wafer against an insulating support to the required tolerance of 1 µm. This approach has been shown to be successful in the laboratory (Wardly, 1975b; Davey, 1971). Its practicality on the factory floor has yet to be established. Microscopic distortions can result if the film thickness of the mask is too large. In practical terms, the necessary contrast in the mask (before deposition of the photoemissive layer) must be obtained with film thicknesses of the order of 0.2 µm or less. One final distortion factor arises if the rotational alignment correction is made by magnetic coils. It turns out that the focus quality is a very sensitive function of the positioning of the gradient-free coil used. The specification needed is that the positioning be correct to within ±2 µm. Bearing in mind that the coil concerned is over 30 cm (12 in.) in diameter, the difficulty is apparent. Currently a mechanical rotation is used to avoid this difficulty. One final comment is pertinent in relation to “in-plane” wafer distortions caused by thermal processing. Any projector approach that aligns the whole wafer in one operation can cope with in-plane distortions only to some degree. These in-plane distortions can be partially random in direction and of the order of 1–2 µm in magnitude. Under these conditions, a single alignment of the whole wafer will lead to regions of the wafer where mis-
ELECTRON PHYSICS IN DEVICE MICROFABRICATION. I
295
matches of this order will occur and some reduction in yield will occur. Scanning systems, by way of contrast, can have fiducial marks associated with each and every chip if the need arises. This factor means that a much more localized correction (on a scale of every 0.5 cm for example) can be applied to combat in-plane distortions. Thus, the yield loss, due to misregistration, should be significantly lower. The same ability to correct in-plane distortions on the scale of the device chip rather than the wafer is also implicit in principle in the next approach to be considered-electron beam projection. C . Electron Beam Projection Systems This approach is the electron-optical equivalent of an optical projection camera. A self-supporting foil mask is flooded by an essentially parallel beam of electrons provided by a gun and a lens system. This illuminated mask acts as a large-area, distributed electron source for the second half of the column which projects a demagnified image of the mask pattern onto the wafer. A system giving a x 10 reduction has been described by Heritage (1975) and is shown schematically in Fig. 12. The analogy to the optical stepand-repeat camera is apparent. The self-supporting foil is the counterpart of the reticle of the optical system. The system gives an image of the required chip pattern. The pattern is subsequently repeated after stepping by a laser-controlled movement of the wafer. The design features that have to be considered are: (1) the development of the high-quality projection optics to give the required resolution, (2) the ability to align mask and substrate, and (3) the fabrication of the mask itself. The alignment procedure has been solved in an elegant manner (Heritage, 1975; Koops et al., 1968). The alignment system includes a set of lower alignment coils centered on a registration aperture and located between the two projector lenses. 
A pair of secondary electron detectors is placed between the final projector lens and the wafer, and on the input side of the second condenser lens, above the foil mask, an upper set of scan coils is included. The alignment required affects only the bottom half of the column, so an alignment procedure that varies the optical properties of the column above the foil itself is allowable. As a result, the excitation of the condenser lenses can be varied to focus the beam onto the foil itself and the upper scan coil activated to turn the system into a rather specialized scanning electron microscope (see Fig. 12b). In this way an image of markers on the foil can be projected onto corresponding areas on the wafer. The signal received by the secondary electron detectors is modulated by details on both the mask and the wafer. As a result, the mask image can be accurately located on the wafer by use of the lower coils. In detail, the procedure is: focus the foil image using
P. R. THORNTON
FIG. 12. A demagnifying electron projection system: (a) in the exposure mode, (b) in the alignment mode.
the third condenser lens, focus the wafer image using the projector system, shift the mask image into the required coincidence on the wafer, and reset the condensers to give the parallel beam mode. The question of mask design has been addressed by Lischke et al. (1977). The basic problem is that, in the general case, the required pattern that has to be projected onto the resist layer contains a large number of rectangular structures isolated from each other. Yet, if the mask itself is to exist, the structures must be physically connected. To meet both of these needs, the
FIG. 12. (b)
“support bars” in the mask have to be rendered transparent to the beam and have to be sufficiently strong to avoid warpage errors. Two approaches have been suggested. One method, which is applicable to a projector system giving a 10-fold demagnification, is to fabricate the mask by UV lithography on a grid or mesh of fine lines. After demagnification, exposure, and development, the only residue of the line structure is a small additional “ripple” on the edges of the required structure of the order of 1000 Å or less. This approach cannot be used in systems with a 1:1 projection stage. In this
case, the bars can be removed from the pattern by a multiple-exposure technique in which the beam is tilted or translated to expose the regions under the support bars. It should be stressed that the electron projection approach is not limited to chip exposure only. It has been used to expose 5-cm-diameter wafers to give 2-μm lines with adequate edge resolution (Lischke et al., 1977). The present limit is the mask accuracy. In the case of individual chip exposure, 3-μm detail over a 3 mm × 3 mm chip size has been obtained (Heritage, 1975).
D. X-Ray Lithography

Smith and co-workers have been largely instrumental in establishing the factors involved in this technique (Smith et al., 1973; Sullivan and McCoy, 1977; Flanders and Smith, 1977; Austin et al., 1977). The basic idea is shown in Fig. 13. It is essentially proximity printing by X rays, with no degradation due to diffraction. Figure 13 also shows two design factors that have to be considered. The system geometry has to be chosen to reduce the penumbra distortion of the image to an acceptable level. Briefly, this implies that the separation between mask and wafer is small, that the separation between mask and source is large, and that the source of X rays is an intense
FIG. 13. Schematic drawing of pattern exposure by X rays illustrating an alignment technique and the “penumbra blurring” that results if a distributed X-ray source is used.
“point” source of very little spatial extent. There is also a small but repeatable difference between the pattern on the mask and that projected onto the wafer. The separation between source and mask and the need to limit the size of the X-ray source impose design problems on the X-ray unit used. To obtain exposure times of less than a minute with resists of reasonable sensitivity and contrast, it is necessary to use a rotating-anode X-ray set, with its inherent problems of stability, anode life, and downtime. One alignment method is illustrated in Fig. 13. It is similar to that used in the cathode projection system. X-ray detectors are used to bring marks on the mask into coincidence with marks on the wafer to within a repeatability of ±0.5 μm. This method has been superseded by a laser diffraction method capable of giving a superposition repeatability approaching 100 Å (Austin et al., 1977). The underlying idea is to align periodic gratings on both mask and wafer by examination of the diffracted beams. Repeatability of the alignment has been shown to be as good as 200 Å under laboratory conditions. In principle, the method can be extended to alignment in three dimensions. The question of mask fabrication has been examined in depth (Smith et al., 1973; Flanders and Smith, 1977; Austin et al., 1977). The early work was carried out using Mylar, Kapton, Al₂O₃, SiO₂, Si₃N₄, and Si itself. The more recent work has exploited the smoothness, strength, and transparency to X rays of polyimide films. Such films can be made with thicknesses of the order of 0.5–5.0 μm and are transparent to light, a fact that has been used to measure the distortions in the film by laser interferometry. It has been shown that areas of the order of 7.5 mm in diameter can have fractional distortions of less than 2 parts in 10⁵. In most of the mask work, gold has been used to give the opaque regions. The thickness of gold required is of the order of 1000 Å.
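The penumbra geometry sketched in Fig. 13 can be checked numerically. A source of finite diameter a, at distance D from the mask, with a mask-to-wafer gap g, blurs each printed edge by roughly g·a/D; the three values below are assumptions chosen only to show the scale:

```python
# Penumbra blur in proximity printing: each edge is blurred by roughly
# delta = g * a / D for a source of diameter a at distance D from the mask
# and a mask-to-wafer gap g.  All three values are assumed.
a = 3e-3     # X-ray source diameter, m (assumed)
D = 0.50     # source-to-mask distance, m (assumed)
g = 40e-6    # mask-to-wafer gap, m (assumed)

delta = g * a / D   # blur per edge, m
print(f"penumbra blur per edge: {delta * 1e6:.2f} um")
```

With these numbers the blur is about 0.24 μm per edge, illustrating why the gap must be small and the source distant and compact.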
A fuller treatment of this important approach will be given in the second of these articles.
E. The Central Role of the Scanning Electron Beam Approach

An electron beam lithography system, in order to be competitive, has to achieve a given throughput, which will depend on the resolution required, the economic viability of other methods, etc. The necessary throughput can be achieved either by developing a fast machine for the direct writing of submicron structures, or by using such a system to make the necessary masks and subsequently exposing the masks by one of the projection methods outlined in the previous sections. In each approach, the scanning system can play a central role. Over and above these roles of fast direct writing and mask making, the scanning system has an implicit versatility that has found
ready application in the research laboratory. Of these approaches, it is the fast direct writer that promises the greatest advances and presents the most problems. Implicit in such a machine is the ability to make masks and to quicken the research and development process by allowing rapid turnaround of trial designs. In the following sections, we outline the progress made in the development of such a machine.
V. THE DEVELOPMENT OF A FAST SCANNING SYSTEM: GENERAL

A. The Electron-Optical Design Problem
It is the device engineer who sets the specification by making the best educated estimate possible of the required device properties needed in the foreseeable future. In particular, the engineer specifies:

(a) the minimum linewidth,
(b) the quality of the edge resolution,
(c) the minimum separation between lines,
(d) the chip size,
(e) the pattern repeatability from layer to layer,
(f) the wafer size,
(g) the resists to be employed.

The overall economics of the device family being developed then determines what degree of trade-off can be made between device specification and throughput. This engineering specification then translates into an electron-optical specification, an electronics specification in relation to speed and stability, and an overall mechanical specification for stability and repeatability. From the electron-optical viewpoint, the starting point is the fact that the minimum linewidth, increasing edge resolution, and the use of a resist with poor contrast properties all lead to a reduction in the allowed spot size. The minimum line separation and the related question of overlay capability impact the stability and repeatability specification and the extent to which it is necessary to correct for the proximity effect. Strictly speaking, these are not basic electron-optical problems; but the question of chip size is largely an electron-optical problem. A great simplification exists if the whole of a device chip can be written in
a single electronic scan without interim mechanical movements or corrections being made. As the chip sizes get bigger, the deflection aberrations at the edge of the scan field cause the size of the focused spot to grow and affect the linewidth characteristics that can be obtained in these regions. The properties of the resist will largely determine the amount by which the electron spot can be allowed to grow in size as it approaches the edge of the scan field. In general terms, the allowed growth will be about 10% of the undeflected spot. This limitation will impose a specification on the properties of the electron gun that can be used. We can see briefly how this condition arises as follows. If we define β as the scan angle needed to give the required scan field and α_i as the beam half-angle at the target, then we can write the deflection aberrations and distortions as terms in α_i²β, α_iβ², and β³. Increasing β within the constraint that the total quadrature sum of these aberrations has to be small compared to the undeflected spot size imposes a limitation on α_i, the beam half-angle. Now we can define the effective brightness at the target as B, which can be expressed in terms of the spot current I_s as

B = 4I_s/(π²d_i²α_i²)
where d_i is the required spot size at the target. Assuming that brightness is conserved down the column, we can equate B to the effective gun brightness B_e. So, if α_i and d_i are already prescribed, an increase in speed through an increase in I_s can only be obtained by increasing B_e; a high-brightness gun is therefore a critical component of a fast system. The choice of gun design first resolves itself into a choice of the cathode type to be used. Here, some subjectivity can enter into the design, because the fundamental choice is the frequently encountered one of making a judgment between the use of an established type of component (thermal cathodes) of limited capability and a new type of component (field emitter cathodes) with some unknown properties but greater potentialities. Here factors other than just the physics of the design enter the scene, but once this choice is made, the detailed design of the column and its components can be carried out on a very scientific basis. In recent years, we have seen a remarkably effective application of computer-aided design to high-resolution electron beam instrumentation, including microfabrication. The role of computer-aided design of electron-optical components is the subject of Section VIII. The question of cathode choice is discussed in Sections V,D, E, and F. However, it must be remembered that such a design capability must be used against the background of the practical problems that exist.
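As a rough numerical sketch of the brightness argument, the spot current follows from I_s = (π²/4)·B·d_i²·α_i²; the brightness, spot size, and half-angle below are illustrative assumptions, not figures from the text:

```python
import math

# Spot current from an effective brightness B (A/cm^2/sr), spot diameter
# d_i, and beam half-angle alpha_i:  I_s = (pi^2 / 4) * B * d_i^2 * alpha_i^2
B = 5e5          # effective brightness, A/cm^2/sr (assumed)
d_i = 0.5e-4     # spot diameter, cm (0.5 um, assumed)
alpha_i = 5e-3   # beam half-angle, rad (assumed)

I_s = (math.pi ** 2 / 4) * B * d_i ** 2 * alpha_i ** 2
print(f"I_s = {I_s:.2e} A")
```

With these values I_s is about 7.7 × 10⁻⁸ A, which shows why, with d_i and α_i fixed by resolution and aberrations, any increase in writing speed must come from the gun brightness.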
B. Practical Aspects of the Electron-Optical Design

From a practical viewpoint, an electron beam microfabrication system has the engineering problems typical of this type of instrumentation and the additional problems arising from the very quantitative nature of the technology. In general terms, the need to match patterns to a repeatability of ±0.1 μm or better over, say, a 0.5-cm chip imposes strict limitations on magnetic pickup, ground-loop noise, vibration, temperature control, general system noise, and instabilities due to electrical charge-up. Vibration and magnetic pickup, particularly of mains-induced ac fields, can induce departures from the predicted beam placement in the frequency range of, say, 0.01–400 Hz. General system noise and ground-loop problems can affect both the beam placement and the ability to locate the fiducial marks. The absence of adequate temperature control can lead to drift problems affecting calibration and repeatability from day to day, while electrical charge-up can produce a variety of unwanted beam movements, varying from a slow relaxation type of oscillation of small magnitude with a period of several seconds to a complete “blanking” of the beam which happens on a submicrosecond time scale. Peculiar to electron beam microfabrication systems are practicalities associated with the use of high currents: the need to have an acceptable depth of focus, to have an efficient scan strategy coupled with a fast and repeatable deflection system, and the need to rapidly and accurately locate fiducial marks. Finally, consideration in depth has to be given to the problem of “patching.” In the previous discussion, the implicit assumption was made that we could accommodate the required chip size within the area of a single electronic scan. If we are unable to do this, then it is necessary to mechanically move the wafer during the chip exposure and to “patch” the chip together from several subfield exposures.
This approach requires that lines can be butted up against each other with a placement accuracy of ±0.1 μm or better without affecting the linewidth or edge resolution. These specialist problems, accentuated by or unique to microfabrication work, merit consideration in a separate section (see Section IX). We should stress that a microfabrication system has to include an alignment and aperturing system that can function under the control of a single operator sitting at a computer console, possibly at a location removed from the column itself. It must also include a beam blanker, which can involve the location of high-speed electronics on the column itself. Finally, we should emphasize a topic that has not received the attention it merits, and that is the question of system diagnostics and/or setup. A practical working system has to have an automated means of diagnosing which component is down and, once the component has been replaced, of establishing that it is now in specification or bringing it into specification by suitable adjustment and recalibration.
C. Question of Cathode Choice
This problem is really a combined one involving both gun design and cathode. The design considerations fall into two main categories: (1) basic electron-optical properties and (2) engineering properties specific to microfabrication. The most important basic properties include the gun brightness, the chromatic spread, the noise, the stability of the emission, and the life of the cathode. From the engineering viewpoint, it is desirable that the gun and the total system can operate over a range of beam voltages, and we have to consider the need to automate the control of the guns. Probably the most difficult aspect of gun operation under automatic control is maintaining a constant brightness and/or emission current. The difficulty arises because the potential of the electrode which controls the emission level affects the position of the effective source along the axis; it can affect the size of the source and, unless the gun is very accurately aligned, it can cause a lateral movement of the effective source. These changes in source properties will be reflected in corresponding changes in image size and position. These image shifts may have to be corrected automatically by minor adjustments and possible recalibration of an appropriate lens and of an appropriate deflection or alignment system. Before considering the properties of individual cathodes, we need to establish the importance of the basic electron-optical properties listed above in microfabrication work. The engineering problems outlined above will be considered in relation to the use of a field emission cathode.
D. Basic Electron-Optical Properties of Guns

We considered the question of brightness briefly in Section V,A. We can define the effective gun brightness B_e in terms of the spot current I_s emitted into a solid angle πα_0² by a gun with an effective source size d_eff as

B_e = 4I_s/(π²d_eff²α_0²)

In Section V,A, we assumed without proof that brightness is conserved along the column and that the brightness at the target is equal to the gun brightness (see Davey, 1971). Therefore,

I_s = (π²/4)B_e d_i²α_i²
Also in Section V,A, we indicated that with d_i fixed by the minimum linewidth and required edge resolution, and with α_i limited by the aberration performance of the system, a high value of I_s requires a high gun
brightness. We can compare the experimental values of B_e achieved with a theoretical estimate made by Langmuir (1937):

B_e ≈ (j_c/π)(eV_b/kT_c)    (5)

In Eq. (5), eV_b is the beam energy emitted from the gun, T_c is the cathode temperature, k is Boltzmann's constant, and j_c is the current density at the cathode surface. Two thermal cathodes have been mainly exploited in the microfabrication work to date, the tungsten hairpin filament and the LaB₆ cathode in various configurations. Assume for the present that the operating temperature for LaB₆ is 1900°K and for W is 2930°K; then for a system operating at 20 kV, we obtain the relationship between the theoretical brightness and the cathode emission density shown in Fig. 14. We see that
FIG. 14. Predicted values of gun brightness plotted against required cathode loading. The horizontal arrows indicate the brightness value used in recent comparative gun studies (Pfeiffer, 1971). See Section X.
for both types of cathode, to obtain brightness values approaching 5 × 10⁵ A/cm²·sr requires current densities of the order of 10–20 A/cm². These are substantial values and imply limitations in relation to cathode life and stability. In fact, many experimental data indicate that the brightness predicted by Eq. (5) cannot be achieved at current densities of this order because the emitted electrons repel each other, space-charge effects occur, and the energy spread of the beam increases (see Section X). The question of an increasing energy spread of the beam is important because it not only affects the ability of the lenses to provide an unaberrated
image but also, of particular importance in microfabrication systems involving wide-angle scans, complicates the growth of the spot that occurs as the deflection angle is increased. Figure 15 depicts the degradation of performance that results in the case of an electrostatic deflection system. If we have a Gaussian distribution
FIG. 15. The role of an energy spread on the deflected beam characteristics; the growth of chromatic “tails” in the scan direction.
of energy on the beam with a half-width ΔV at the 1/e point, then the beam will be delivered onto the target over a region 2ΔS centered on S₀, where ΔS is given by

ΔS = −S₀(ΔV/V_b)    (6)

In general terms, the implication is that the total energy spread on the beam at the final deflector must be ≤1.5 eV for a 20-kV beam using electrostatic deflection, and approximately double this value when the deflection is done magnetically. We shall see later that the chromatic spread in the final stages of the column is the sum of components arising from the gun, from the instabilities associated with the gun and lens supplies, and, with high-current beams, from interactions between electrons within the beam itself. Under such conditions, any gain in gun brightness obtained at the expense of an increase in energy spread must be regarded as being in the nature of a Pyrrhic victory. The drive toward higher brightness, with its implicit need to increase current density, also increases both noise and stability problems. This comment is particularly true in the case of field emitter systems. More detail is given in the section relating to the specific properties of such cathodes
(Section V,E,2). Here it is sufficient to summarize the position by saying that a combination of factors involving chromatic spread, beam noise, emission stability, and cathode life places an effective upper limit on the brightness that can usefully be obtained, and that the decision to use a thermal cathode implies a willingness to work within this brightness-limited situation and find other ways to increase throughput without substantial increases in brightness.
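The chromatic "tail" of Eq. (6) is easy to evaluate; the deflection S₀ below is an assumed half scan field, while ΔV and V_b echo the 20-kV, 1.5-eV case discussed above:

```python
# Chromatic half-width from Eq. (6): |dS| = S_0 * dV / V_b.
S0 = 2.5e-3   # deflection at the target, m (assumed half scan field)
dV = 1.5      # energy half-width of the beam, eV
Vb = 20e3     # beam voltage, V

dS = S0 * dV / Vb   # half-width of the chromatic tail, m
print(f"chromatic half-width: {dS * 1e6:.3f} um")
```

With these assumed numbers the tail is roughly 0.19 μm, already a sizable fraction of a submicron linewidth, which is why the ≤1.5-eV energy budget is so tight.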
E. Field Emission Cathodes

1. General

We suggested in Section V,D that an adequate description of the behavior of a thermal gun could be given in terms of the gun brightness, effective source size, and allowed half-angle subtended at the gun. This description is valid if the area of emission on the cathode is about 1 μm or more in diameter. A different situation occurs with field emitter cathodes. Because of the very strong dependence of the field emission current density on the local field strength at the cathode surface (Nordheim, 1928; Cutler and Nagy, 1964), the actual emission area is very small. In the case of recently developed TF field emitters (Swanson and Martin, 1975; Veneklasen and Siegel, 1972; Swanson, 1973), the actual area of emission can be of the order of 10 to 100 Å in diameter. Under these conditions, the aberrations introduced by the gun have to be considered (Butler, 1966; Worster, 1969; Everhart, 1967). An electron gun is, in essence, an immersion lens and will, in particular, give rise to spherical and chromatic aberration. If we define C_sg as the spherical aberration coefficient of the gun referred to the object, then the effective source size at full beam potential seen by components further down the column will be given by

d_eff = [d_0² + (½C_sg α_0³)²]^(1/2) ≈ ½C_sg α_0³ = d_sa    (7)
where d_0 is the diameter of the actual emission area, and α_0 is the half-angle subtended at the gun. Figure 16 shows the basic difference between field emitter guns and guns using thermal cathodes. In Fig. 16a, we see that for a wide range of conditions, the effective source for field emission guns is determined by the aberration properties of the gun rather than the size of the emission area. By way of contrast, Fig. 16b shows that for thermal cathodes, over the relevant range of α_0, the gun aberrations contribute little. In the case where the effective source size is determined by the aberration properties of the gun, the gun brightness does not have the same significance
FIG. 16. The differing behavior in thermal and field emitter guns: (a) typical dependence of source size on half-angle for a thermal gun (hairpin filament), (b) corresponding behavior for a field emitter gun. The intervals between the vertical arrows indicate the ranges of half-angles used in practice.
it has when the size of the emission area is the major contributor. The apparent brightness depends on the beam half-angle used. In addition, when the emission area is very small, it is difficult to obtain good quantitative estimates of the emission area and, hence, of the brightness. Under these conditions, it is more appropriate to evaluate gun performance in terms of
the angular distribution available from the gun. If we write the current emitted at beam potential per unit solid angle (the specific emission) as I_k, then for a gun with spherical aberration constant C_sg using a beam half-angle α_0, we have

I_s = πI_k α_0²    (8a)

and

I_s = π2^(2/3) I_k (d_eff/C_sg)^(2/3)    (8b)

by eliminating α_0 from Eq. (7). One other factor needs to be introduced. Figure 17 shows the simplest form that a field emission gun can assume. In the notation adopted in Fig. 17, the first anode is run at a positive potential V_e and the second is run at V_b, the required beam potential. I_k and α_0 refer to the electron behavior at full beam potential, but most of the pertinent data on the testing of cathodes are given in terms of V_e. If we define I_k0 and α_00 as the specific emission and effective half-angle at a beam potential of V_e, then

I_s = πI_k α_0² = πI_k0 α_00²    (9a)

and, since α_00/α_0 = (V_b/V_e)^(1/2),

I_s = π2^(2/3) I_k0 (V_b/V_e)(d_eff/C_sg)^(2/3)    (9b)

This equation relates the spot current to the required cathode loading, the gun operating conditions, the effective source and, where spherical aberration is dominant, to the gun aberration properties.
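These relations can be exercised numerically, assuming the aberration-limited effective source d_eff ≈ ½C_sg·α₀³ as in Eq. (7) and the spot current I_s = πI_k·α₀² of Eq. (8a); all three input values are assumptions chosen for scale:

```python
import math

# Field emitter gun: aberration-limited effective source and spot current.
C_sg = 50.0      # gun spherical aberration coefficient, cm (assumed)
alpha_0 = 5e-3   # beam half-angle at the gun, rad (assumed)
I_k = 20e-6      # specific emission, A/sr (assumed)

d_eff = 0.5 * C_sg * alpha_0 ** 3   # cm; aberration-limited source, Eq. (7)
I_s = math.pi * I_k * alpha_0 ** 2  # A; spot current, Eq. (8a)

print(f"d_eff = {d_eff * 1e8:.1f} angstroms")
print(f"I_s   = {I_s:.2e} A")
```

The effective source comes out at a few hundred Å, much larger than the 10–100 Å emission area, illustrating why the gun aberrations rather than the emission area set d_eff in the field emission case.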
2. Detailed Properties of Field Emission Cathodes
Summaries of the earlier work on the application of field cathodes can be found in Dyke and Dolan (1956) and Good and Müller (1956). An updated review has been given by Swanson and Bell (1973). Of particular importance to microfabrication work are studies in two areas: one is the application of the (310) tungsten emitter as an electron source for scanning electron microscopes, and the second is the development of “mode-confined,” thermally aided field emitters (TF emitters) (Cutler and Nagy, 1964; Swanson and Martin, 1975; Veneklasen and Siegel, 1972). Quantitatively, the behavior of a field emitter can be expressed in terms of the Fowler–Nordheim (1928) expression for the emitted current density,

J ≈ (1.54 × 10⁻⁶ F²/φ) exp(−6.83 × 10⁷ φ^(3/2)/F)  A/cm²    (10)

where φ is the surface work function in electron volts, and F is the surface field strength in volts/cm. Van Oostrom (1965) has shown that this equation has a wide range of validity. The main departures become apparent at very
FIG. 17. Simple form of a field emitter gun: (a) electrical configuration, (b) definition of parameters used in text.
high current levels, where space-charge effects cause the characteristic to approach a Child's law relationship (Dolan and Dyke, 1939; Barbour et al., 1953), and where instabilities due to overheating and thermal runaway cause additional departures (Dolan et al., 1953). The above theoretical treatment gives a good and near-complete analysis of the processes that are fundamental to field emission. The analysis has been extended to treat experimental situations where both field
emission and Schottky emission occur simultaneously (Swanson and Bell, 1973). Where uncertainty enters the model is when we seek to determine the total current I_e as a function of the applied voltage V_e. If we define A_e as the emission area and β = F/V_e, then we can rewrite Eq. (10) as

I_e = A_e(1.54 × 10⁻⁶ β²V_e²/φ) exp(−6.83 × 10⁷ φ^(3/2)/βV_e)    (11)
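As a rough numerical check of the Fowler–Nordheim expression, Eq. (10), one can evaluate J for a clean tungsten surface; the field strength below is an assumed illustrative value:

```python
import math

# Fowler-Nordheim current density, Eq. (10):
#   J = 1.54e-6 * F**2 / phi * exp(-6.83e7 * phi**1.5 / F)   [A/cm^2]
# with phi in eV and F in V/cm.
phi = 4.5   # work function of clean tungsten, eV
F = 5e7     # surface field strength, V/cm (assumed)

J = 1.54e-6 * F ** 2 / phi * math.exp(-6.83e7 * phi ** 1.5 / F)
print(f"J = {J:.2e} A/cm^2")
```

The result is on the order of 10³ A/cm²; because F sits in the exponent, a small change in field strength moves J by orders of magnitude, which is the origin of the extreme sensitivity to local surface conditions discussed below.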
Various workers (Charbonnier and Martin, 1962; Kuznetsov et al., 1969) have shown that if φ is assumed to be constant, independent of temperature, and of known value, then approximate methods are available to find β and A_e, provided they are assumed constant and independent of V_e, temperature, etc. Such methods lead to approximate data, particularly at low to medium field strengths, i.e., at low to medium currents (say, up to 25 μA). At high current levels, other factors have to be considered. This is particularly true in application work, where a long downtime to obtain a very good vacuum is not, in general, acceptable. As a result, the field emitter may not be operating under conditions appropriate to the modeling. The assumptions made in the physical modeling as to the constancy of the surface conditions can be approximated experimentally in two cases: at low currents in a 10⁻¹⁰ torr vacuum, and over a wider range of current at higher pressures if care is taken to establish a situation of dynamic rather than static balance at the emission tip. Over and above the processes essential to field emission, other mechanisms can occur at the same time. Consider the case of an emitter operating at, say, 1600°K in a vacuum of the order of 10⁻⁸ torr. Among the physical processes occurring at or near the tip we can list the following:
(1) A diffusion of matrix and surface contaminant atoms down the “shank” of the tip and then onto the tip itself; in some cases this diffusion is followed by evaporation into the vacuum.
(2) A continual rearrangement, on an atomic scale, of the edges of the emission area due to surface migration and to evaporation.
(3) A process of ionic bombardment arising, in the main, from two causes. Neutral atoms in the vacuum can approach the tip, be field ionized, and impact the surface, enhancing the rate of change. This process is compounded by positive ions released from the nearby electrodes by bombardment from the field-emitted electrons themselves, which are accelerated back onto the tip or the adjacent shank.
(4) At high field strengths and temperatures above 1600°K, the surface tension of the tip is reduced by the mechanical forces associated with the field. Depending on whether the electrostatic field component is less than, equal to, or greater than the conventional surface tension force, the effective tip radius can either grow, remain constant, or become sharper.
(5) Under these conditions of dynamic equilibrium, there need not be a
unique relationship between the current that is emitted into a narrow cone down the optic axis of the column and the total current. Figure 18 shows the type of situation that can develop. Depending on the crystal orientation, field strength, temperature, etc., the relative strengths of the main emission lobe and the side lobes can vary. It is, therefore, possible that the total current can change significantly while the spot current down the bore can be effectively constant and vice versa. Under these conditions, there is a dynamic balance between those processes which bring atoms of a given species onto the surface and those which remove them. It is also possible to work in a range of field strength in which the microscopic radius is effectively constant or, at the worst, slowly varying.
FIG. 18. The role of side lobes in field emission guns: (a) side view showing side lobes as an additional source of backbombardment, (b) emission patterns viewed along the axis showing increasing degrees of mode confinement from left to right.
P. R. THORNTON
The scientific challenge here is to establish situations in which a dynamic balance can be maintained for a length of time sufficient for useful work to be done. Consider first the experience gained with the (310)-oriented single-crystal tungsten emitter.

3. The (310) Emitter
Such an emitter, when operated at room temperature, will give a time dependence of spot current as shown in Fig. 19. After application of the emission voltage, there is a monotonic fall-off of the emission current
FIG. 19. Time dependence of the emission from a (310) tungsten field emitter operated at room temperature.
followed by a constant plateau which lasts for an "effective lifetime" of ΔT seconds. At the completion of this regime, the current starts to increase again, the noise increases, relatively gross instabilities occur, and the tip has to be renewed or "flashed" to reconvert it to its initial starting condition. This recycling process consists of cleaning the tip by applying a short electrical pulse to the cathode filament with the emission voltage still applied. After this treatment, the emission repeats the behavior shown in Fig. 19 very closely. The process can be repeated indefinitely; tips have been operated in this fashion for a year or more, and the voltage required to give a given level of emission stays constant to within a few volts. At pressures of the order of 10⁻¹⁰ torr, ΔT is of the order of 4 hr for total emission currents of ~100 μA (105). This situation finds ready application in scanning electron microscopy, following mainly the work of
Crewe (1964, 1966) and Crewe et al. (1968b). If we seek to extend this technique to high-current instrumentation such as microfabrication work, we approach several limits. Scanning electron microscopy utilizes spot currents on the order of 10⁻¹²–10⁻¹⁰ A into an image spot on the order of 100 Å or less. Microfabrication work requires, say, 2 × 10⁻⁶ A into a submicrometer spot under degraded vacuum conditions. Under these conditions, the effective lifetime between flashings can become severely reduced, as the total emission current required is of the order of 300–500 μA. In addition to the need for more frequent cleanup, there is always the problem of the side lobes: the significant fraction of the emitted current in these lobes contributes only to the instabilities of the system, not to the useful emission. It has therefore been argued that, rather than use a (310) emitter, which gives a central lobe with a half-angle of 15°–20° plus side lobes, a way should be found to confine the operative emission to the central mode and also, if possible, to reduce the angle over which this mode emits. Two factors can be exploited here to increase the effective emission along the optic axis (i.e., to "mode confine" the beam). We can increase the electric field at the tip itself significantly compared to the regions from which the side lobes derive. An equivalent process is to reduce the work function at the tip significantly compared to the surrounding regions. These two approaches have been successfully exploited, particularly by the Linfield group (Swanson and Crouser, 1967, 1969; Barbour et al., 1960; Bettler and Charbonnier, 1960). In general, both approaches use a (100)-oriented tungsten crystal.
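The leverage behind both routes to mode confinement can be made quantitative with the elementary Fowler–Nordheim dependence J ∝ (F²/φ) exp(−Bφ^(3/2)/F). The sketch below is illustrative only: the operating field, the 0.15-eV work-function offset (within the 0.1–0.2 eV contrast discussed below), and the 10% field enhancement are assumed numbers, not data from the text.

```python
import math

B_FN = 6.83e9  # standard Fowler-Nordheim constant, eV^(-3/2) V/m

def fn_density(field, phi):
    """Relative (unnormalized) Fowler-Nordheim current density for a
    field in V/m and a work function phi in eV."""
    return (field**2 / phi) * math.exp(-B_FN * phi**1.5 / field)

F = 5e9          # ~5 V/nm operating field (assumed)
phi_main = 4.5   # main-lobe face work function, eV (illustrative)
phi_side = 4.35  # a side-lobe face ~0.15 eV lower

# Work-function contrast alone roughly doubles the local current density:
ratio_phi = fn_density(F, phi_side) / fn_density(F, phi_main)
# A 10% local field enhancement at the apex does even more:
ratio_field = fn_density(1.1 * F, phi_main) / fn_density(F, phi_main)

print(f"J ratio for 0.15 eV lower phi: {ratio_phi:.2f}")
print(f"J ratio for 10% higher field:  {ratio_field:.2f}")
```

The exponential dependence is the point: modest local changes in field or work function at the apex swing the emitted density by factors of several, which is what makes mode confinement practical.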
In one approach, the tip is "built up" to a very fine point under the combined action of applied field, temperature, and, possibly, the presence of O₂. In the second approach, the known fact that Zr (φ = 2.73 eV) is preferentially adsorbed on the (100) tungsten face is used to create a mode-confined beam down the optic axis.
4. Mode-Confined Field Emitters
Figure 20a outlines the tip configuration that exists at a conventional field emitter tip. Near the tip itself, the curvature can be represented approximately as that of a sphere. Since the local field varies as 1/r, there is little variation in field over this region, and side lobes can develop from crystal faces which have a slightly smaller work function (by 0.1–0.2 eV) than that corresponding to the main lobe. On the other hand, if the tip is built up in the fashion shown in Figs. 20b and 20c, then the center lobe will be enhanced at the expense of the side lobes. This configuration can be achieved by the migration of W atoms up the shank. With a reasonable physical model
FIG. 20. The localization of the high-field region at an emitter tip by surface migration under the combined action of temperature and electrostatic stress: (a) initial configuration, (b) equilibrium shape with an intermediate value of the applied field, (c) corresponding shape with a higher value of applied field.
of the crystal in the absence of an applied field, it can be shown that the rate at which the tip becomes blunter is given by

dr/dt ≃ (Ω₀/αA₀)(D₀/kT)(1.25γ/r³) exp(−Q/kT) = Cγ/r³    (12)

In Eq. (12), Ω₀ is the volume per atom, A₀ the average area per atom, α the shank half-angle, D₀ and Q the diffusivity and activation energy for self-diffusion, γ the surface tension, k Boltzmann's constant, and T the absolute temperature of the tip. In the presence of an electrostatic field F, there is an electrostatic stress which opposes the surface tension, and dr/dt becomes

dr/dt ≃ (Cγ/r³)(1 − rF²/8πγ)    (13)

Therefore, under a given electrostatic field, there is an equilibrium radius r₀ given by

r₀ = 8πγ/F²    (14)
and the rate at which the equilibrium is achieved depends mainly on the factor C. This analytical treatment does not include surface changes brought about by electron beam-induced bombardment or by vacuum-induced phenomena. Figure 21 shows the effects that occur if a series of constant-radius tips are heated to a suitable temperature (usually 1600–1800°K) and subjected to a range of field strengths which corresponds to an equilibrium radius smaller than that existing on the crystal. As the applied voltage is increased, the time taken to bring about the required build up is reduced, and the final current obtained down the bore increases. Figure 21 does not give a complete description of the phenomena that occur. For example, it has been found that the presence of carbon on the surface can inhibit the build-up process (Okuyama and Hibi, 1965). The carbon effectively increases the energy for surface diffusion and "freezes" the configuration. The introduction of oxygen into the chamber "burns off" the carbon, and the build up can then occur. Such cathodes can give specific emission values I′ₖ of 800–1500 μA/sr at total emission currents of 100–150 μA. Take a mean figure of 1000 μA/sr and assume that the emission voltage V₀ ≃ 5 kV. Then, using Eqs. (8) and (9) for a gun with Cₛ ≃ 2 cm operating at 20 kV, I′ₖ ≃ 4000 μA/sr, and we predict a gun performance of 4π × 10⁻⁷ A from an effective source size of ~100 Å with α₀ ≃ 10⁻² rad. A second approach to a mode-confined cathode has been pioneered by Schrednik (1961) and exploited mainly by Swanson and co-workers (Bettler and Charbonnier, 1960; Swanson and Crouser, 1967). In this case, a (100) W
[FIG. 21. Spot current (logarithmic scale) versus time during tip build up for a range of applied field strengths; cathode temperature kept constant.]
crystal is modified in the way shown in Fig. 22. A deposit of ZrH₂ is placed on the emitter shank at some distance back from the tip. After suitable preactivation, the cathode can be operated at temperatures of 1600–1800°K; Zr diffuses onto the tip, is selectively adsorbed onto the (100) face, and lowers the work function to that of Zr (2.73 eV). The total emission from this
FIG.22. The physical configuration of a zirconated (100) W field emitter.
type of cathode consists of an intense field emission oriented down the z axis and a widely distributed Schottky emission. The bulk of the Schottky emission is eliminated by placing a suppressor grid about 0.5 mm back from the tip. If this grid is operated at the cathode potential, only a small Schottky contribution is made to the central beam. The disposition and retention of the Zr on the (100) face is not as straightforward as this synopsis implies. It is found that small amounts of oxygen are usually beneficial to the process (Bettler and Charbonnier, 1960; Swanson and Crouser, 1967). The details are not understood in depth (Swanson and Martin, 1975). These cathodes give specific emissions of the same order as the built-up emitters previously described, and in each case the half-angle of the center lobe is 5°–7°. There are differences in the detailed behavior, and a direct comparison between the two types of cathode as regards suitability for microfabrication work can only be made indirectly in terms of the research data. Both are very effective contenders for microfabrication work at present. The basic problems facing the application of such emitters in microfabrication systems are the interrelated questions of emitter noise, stability, and lifetime. These problems are discussed in the next section. The points made there should be considered against what we require of a field emitter cathode in a practical microfabrication system. The detailed specifications vary, but in general terms, we need I′ₖ ≃ 10³ μA/sr, a total rms noise figure out to 100 MHz of ±2%–5% depending on the resist, a lifetime (between replacements) of ≳ 400 hr, and a low downtime due to cathode processing and reactivation.
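For small half-angles, the spot current implied by a specific emission I′ₖ is approximately I = I′ₖ · πα₀². A quick sketch (the 10-mrad acceptance half-angle is an assumption) shows what the requirement I′ₖ ≃ 10³ μA/sr buys, and reproduces the 4π × 10⁻⁷ A gun estimate quoted earlier for I′ₖ ≃ 4000 μA/sr:

```python
import math

def spot_current(i_sr, alpha0):
    """Current accepted into a cone of half-angle alpha0 (rad) from an
    emitter with specific (angular) emission i_sr (A/sr); for small
    angles the accepted solid angle is ~ pi * alpha0**2."""
    return i_sr * math.pi * alpha0**2

# The stated requirement, I'_k ~ 10^3 uA/sr, with an assumed 10-mrad
# acceptance half-angle:
I_req = spot_current(1e-3, 1e-2)
print(f"spot current at 10^3 uA/sr:  {I_req*1e6:.2f} uA")

# Cross-check against the 4*pi*1e-7 A figure quoted in the text for
# I'_k ~ 4000 uA/sr at the same half-angle:
I_check = spot_current(4e-3, 1e-2)
print(f"spot current at 4000 uA/sr:  {I_check*1e6:.2f} uA")
```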
5. Field Emitter Noise, Stability, and Life

The basic feature of a field emitter is that it derives an intense current density from a very small emission area. Indirect estimates of the actual emission area from Fowler–Nordheim plots suggest that the diameter lies between 10 Å and 100 Å. Under these conditions, the atomicity of the structure will play a role, and we must expect statistical variation in the behavior. In general, the noise behavior of a field emitter takes the form shown in Fig. 23a. Above a certain frequency, the noise level approaches the shot noise limit appropriate to the current in the beam. Below this frequency, an additional component becomes increasingly important. This component is proportional to 1/fⁿ, with n varying between 0.5 and 1.5 (Gomer, 1973; Timm and Van Der Ziel, 1966; Kleint, 1963). This low-frequency noise therefore formally resembles the excess noise observed in semiconductor devices. If noise data are obtained at a variety of current levels, the results appear as illustrated in Fig. 23b. The frequency range over which the excess noise is
FIG.23. Field emission noise characteristics: (a) general nature of the field emission noise, (b) dependence of noise behavior on emission level.
measurable increases, and the measured noise currents increase in proportion to the current. At low frequencies, we come up against a difficulty in describing the behavior because we are unsure as to when noise components become changes in mean level. Without concerning ourselves with this problem too greatly, we can say that the noise behavior described above is superimposed on the additional variations shown in Fig. 24. There can be a slow drift (Fig. 24a); there can be intermittent or random bursts of noise in a given frequency range which are uncorrelated with changes in mean beam current (Fig. 24b). Finally, there can be a very characteristic "noise" of the form shown in Fig. 24c. At statistical intervals t, there is an abrupt change in the mean level. When viewed over a period of time long compared to t, the magnitude of the change is seen to be essentially constant but random in frequency; t can vary from subsecond to, say, once per hour. The magnitude of the excursion can be as high as 20% of the mean level but is more typically between 5 and 10%. If the temperature dependence of the noise is recorded, a variety of behavior is observed depending on cathode temperature, operating point,
FIG. 24. Additional types of current variation observed: (a) steady variation of mean level, (b) intermittent bursts in a given frequency range, (c) "popcorn" noise, random-frequency changes in mean level with near-constant amplitudes.
and vacuum level (Barbour et al., 1960; Bettler and Charbonnier, 1960; Swanson and Crouser, 1967). There are no published data giving the total rms noise current integrated up to frequencies of the order required for a fast microfabrication system, but an approximate estimate obtained by extrapolating from published data indicates that the total noise level could lie between ±4% and ±10%, with the possibility of greater variation associated with the "popcorn" noise illustrated in Fig. 24. While the particular figures quoted here need confirmation, there is little doubt that the noise levels observed with mode-confined field emitters at the 100- to 200-μA total emission level complicate the question of fiducial mark detection and raise questions about the compatibility of such sources with low-contrast electron resists. This question of compatibility with resist properties is one that merits further study. Most of the work reported to date has been concerned with relatively sharp field emitter tips, i.e., emitters with small emission areas. Intuitively, we would expect that the smaller the emission area, the greater the statistical fluctuations. This surmise is borne out by an analysis by Gomer (1973) of the practically important case where there is an adsorbed layer on the surface. Depending on the energy states occupied and the nature of the adsorbed species, it was found that the spectral density function W(f) could be proportional to Aₑ⁻¹, Aₑ⁻², or Aₑ⁻³, where Aₑ is the emitting area. Therefore, the use of blunter emitters with increased emission area could lead to a reduction of the noise levels. Lifetime figures obtained under experimental conditions indicate that we can reasonably expect lifetimes of the order of several hundred hours if we can establish the same kind of environment within the microfabrication systems.
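The total rms figure discussed above comes from integrating the noise power spectrum, full shot noise 2eI plus an excess 1/fⁿ term, over the measurement bandwidth. The sketch below is purely illustrative: the excess-noise coefficient K is an assumed number, chosen so that the 1/f term crosses the shot-noise floor near 30 MHz, not a measured value.

```python
import math

E_CHARGE = 1.602e-19  # electron charge, C

def rms_noise(current, k_excess, n, f1, f2):
    """RMS noise current from full shot noise (PSD 2eI, white) plus an
    excess component with PSD k_excess / f**n, integrated from f1 to
    f2 (Hz)."""
    shot_var = 2 * E_CHARGE * current * (f2 - f1)
    if n == 1:
        excess_var = k_excess * math.log(f2 / f1)
    else:
        excess_var = k_excess * (f2**(1 - n) - f1**(1 - n)) / (1 - n)
    return math.sqrt(shot_var + excess_var)

I_beam = 1e-6        # 1-uA spot current (illustrative)
f1, f2 = 1.0, 1e8    # integrate from 1 Hz out to 100 MHz
K = 1e-17            # assumed excess-noise coefficient (A^2); with n = 1
                     # this crosses the 2eI ~ 3.2e-25 A^2/Hz shot floor
                     # near 30 MHz

shot_only = rms_noise(I_beam, 0.0, 1.0, f1, f2)
total = rms_noise(I_beam, K, 1.0, f1, f2)
print(f"shot noise alone: {shot_only*1e9:.1f} nA rms")
print(f"with 1/f excess:  {total*1e9:.1f} nA rms "
      f"({100*total/I_beam:.2f}% of the beam)")
```

Even this modest assumed excess term roughly triples the integrated rms noise over the shot-noise baseline, which is the qualitative point made in the text.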
F. Thermal Cathodes

1. The Lanthanum Hexaboride Thermal Emitter

Following early work by Buckingham (1965), this cathode was successfully introduced by Broers (1969; Blair and Broers, 1971; Vogel, 1970) to provide additional brightness for advanced scanning electron microscopy. The technical development of this source has been considerable (Vogel, 1970; Pfeiffer, 1971; Stickel and Pfeiffer, 1973; Yonezawa et al., 1977), but for space reasons, we have to be somewhat selective in the studies to be considered in detail. One particular study merits attention because of its extreme relevance to microfabrication applications: that carried out by Pfeiffer (1971), relating the brightness and chromatic spread to the gun configuration and cathode type (Pfeiffer, 1971; Stickel and Pfeiffer, 1973) (see Fig. 25). In this investigation, three factors were varied: the nature of the emitting material, the geometrical configuration, and the electrical biasing arrangement. One gun studied was the traditional tungsten hairpin gun with a
FIG. 25. Chromatic spread (ΔE_gun) plotted against gun brightness for three gun types. After Pfeiffer (1971).
self-biased grid run at a negative potential. Two gun configurations using LaB₆ were examined; one was a pointed tip, indirectly heated and operated with a negatively biased grid. The final gun used was a "hairpin" of LaB₆ in front of which was a positively biased plate with a small aperture. This "aperture-limited" gun is highly reminiscent of the gun configuration used in cathode ray tubes (Moss, 1968). Measurements were made of the effective source size d_eff in each case, and the chromatic spread from the gun was determined as a function of gun brightness. Finally, the total chromatic spread arising from the entire system was determined. The data gave a clear indication of the existence of larger chromatic spreads than usually associated with magnetic column applications. The three gun structures investigated give very different energy spreads as a function of brightness. Both the total emission current and the chromatic spread are reduced as we go from the tungsten hairpin gun through the LaB₆ pointed filament to the aperture-limited LaB₆ gun. The most natural explanation is that, while the configuration and cathode details can cause small variations in the observed spread, the main factor is the magnitude of the total emission current; the smaller the total current, the smaller the total energy spread, down to a low plateau value characteristic of the emission properties and structure. Some indication of the depth of the studies on the LaB₆ cathode can be obtained from Yonezawa et al. (1977), which outlines the variety of cathode configurations studied and the variation in source size with bias conditions as a function of cathode shape. Also indicated is the nature of the interpretative difficulties that exist. One area in which good data are missing is the detailed examination of the noise properties of the source. A further discussion of these data is given in Section X, in relation to coulombic effects which become significant at high current levels.
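Why the gun energy spread matters can be gauged from the standard chromatic aberration estimate, d_c ≈ C_c · α · (ΔE/E). The values below are assumptions chosen for illustration, not Pfeiffer's measured data; only the 20-kV operating voltage echoes the estimates earlier in the text.

```python
def chromatic_disk_nm(C_c_m, alpha, dE_eV, E_eV):
    """Diameter of the chromatic aberration disk,
    d_c = C_c * alpha * (dE / E), returned in nm.
    Standard first-order estimate; all inputs are assumptions."""
    return C_c_m * alpha * (dE_eV / E_eV) * 1e9

C_c = 2e-2      # 2-cm chromatic aberration coefficient (assumed)
alpha = 5e-3    # 5-mrad aperture half-angle (assumed)
E = 20e3        # 20-kV beam

for dE in (0.5, 2.0, 10.0):
    d = chromatic_disk_nm(C_c, alpha, dE, E)
    print(f"dE = {dE:4.1f} eV -> chromatic disk ~ {d:.0f} nm")
```

The disk grows linearly with ΔE, so reducing the gun spread from tens of eV toward the low plateau value directly protects submicrometer edge definition.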
2. The Tungsten Hairpin Filament

Recent work on this type of cathode has been limited, and it has not received the emphasis that has been accorded to the LaB₆ source. On the fundamental side, Andersen and Mol (1968) have reported careful simultaneous studies of the obtainable brightness and the energy spread of the beam. The measurements were made as a function of cathode temperature and of bias condition. The range of brightness covered was 1 × 10⁴ to 2 × 10⁵ A/cm² sr, and the energy spreads recorded were of the same order as those given by Pfeiffer (1971) and quoted in Fig. 25. Hanszen and Lauer (1967) have reported brightness data on spherical tungsten emitters over a range of radius from 250 μm to 1.5 μm. Data are reported as a function of cathode temperature and bias conditions.
The observations are of interest in that they indicate the difficulty of obtaining reliable data as a function of bias conditions and cathode temperature when the cathode radius gets small, i.e., ≲ 5 μm. With regard to our understanding of the tungsten emitter, we lack two basic sets of data. We do not have good, readily available data relating to the noise properties of a tungsten thermal emitter. The implicit assumption that the noise on the beam can be given by the shot noise approximation may well not be valid; compare the case of the BaO dispenser emitter, which reveals the presence of some excess noise at low frequencies (Brodie and Jenkins, 1956; Brodie, 1961). The other observations that are lacking pertain to cathode life as a function of operating point and vacuum environment. The early work of Bloomer (1957) does not relate to modern vacuum techniques. Nor have the potent modern techniques for surface analysis been applied to the question of the noise, stability, and lifetime behavior of the tungsten thermal emitter.

3. Possible Application of Other Cathodes

Recent work in the field of cathode technology has resulted in the development of several rugged cathodes which can operate at levels approaching 10 A/cm². Glascock (1969) has stressed the use of a ThC emitter and has given very full fabrication details. The indications are that a reasonable life could be obtained at an emission level of 10 A/cm². Since its introduction by Levi (1955), the dispenser cathode has seen continuous technological development and can also operate at this level of emission with a reasonable life [see Hughes (1972)]. Ahmed and co-workers (1972; Ahmed and Munro, 1973; Ahmed and Nixon, 1973) have suggested that other borides in addition to LaB₆ may find useful application in both electron beam microscopy and microfabrication.
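The brightness figures quoted in this section fix the scale of the spot currents available from thermal guns. For a uniformly filled round spot, I ≈ B(πd²/4)(πα²); the sketch below uses the top of the Andersen and Mol brightness range with an assumed spot size and convergence angle, and shows that only tens of nanoamperes result, which is the root of the throughput limitation discussed in Section VI.

```python
import math

def spot_current_from_brightness(brightness, d_cm, alpha):
    """Current into a round spot of diameter d_cm (cm) at convergence
    half-angle alpha (rad) from a gun of brightness
    brightness (A cm^-2 sr^-1): I = B * (pi d^2 / 4) * (pi alpha^2)."""
    return brightness * (math.pi * d_cm**2 / 4) * (math.pi * alpha**2)

B = 2e5        # top of the 1e4 - 2e5 A/cm^2 sr range quoted above
d = 0.5e-4     # 0.5-um spot diameter, in cm (illustrative)
alpha = 5e-3   # 5-mrad convergence half-angle (assumed)

I_spot = spot_current_from_brightness(B, d, alpha)
print(f"spot current: {I_spot*1e9:.1f} nA")
```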
Finally, it can be remarked that thermal gun performance should improve in the near future, not only because of improved cathode technology but also because of the continued application of computer-aided design to the gun problem.

G. Fiducial Mark Detectors, Beam Blanking, and Alignment
Three important features of a fast scanning microfabrication system have not been discussed so far: the beam blanker, the fiducial mark detector system, and the alignment system. The beam blanker presents some difficulties because of the speed and stability requirements, particularly in the case of a system based on a field emitter gun. In general terms, the requirements are for ±15-V pulses with
rise times on the nanosecond time scale. A case has been made that the blanking should be carried out at a point in the beam where an image point can be made to coincide with the center of deflection of the blanker (Lin and Beauchamp, 1973). In this situation, the final image spot on the target does not move during the blanking process but simply decreases in intensity (see Fig. 26). The absolute necessity of blanking on a crossover has yet to be established, because some successful approaches do not use this method (Cahen et al., 1972; Chang and Stewart, 1969; Herriott et al., 1975); but it is also true that most of these machines were not concerned with ultrafast
FIG.26. The ideal beam blanking situation with center of deflection coinciding with an image plane.
operation or with the use of sensitive resists; i.e., they did not operate under conditions where "tails" due to beam movement during blanking would be most troublesome. In addition to the fast rise time, it is also necessary that the blanking circuit restore the beam to the unblanked condition rapidly and to a very close tolerance. The blanking problem, from the electronic viewpoint, can be eased by the use of long plates close to the beam and by reducing the aperture size in the blanking aperture plate. However, reducing the plate separation and increasing the length makes the alignment problem proportionately more difficult. In addition, unless the pressure in the region of the blanker is low and the vacuum free from hydrocarbons, these changes can increase the possibility of charge-up difficulties and the possibility of having to change components more frequently than is acceptable. The blanking system provides another useful function. During those periods in which the beam is blanked off the target, the current flows into the blanking aperture plate. If this plate is "floated" from ground, the aperture current can, after amplification, be used to monitor the beam current itself and the beam noise and to provide the necessary feedback should correction be required. Other potential methods of beam blanking exist but have yet to be reported on in this context. Two possibilities are the use of slow-wave structures (MacMaster and Dudley, 1973) and the application of EBRD devices (Selzars et al., 1974). Figure 27 shows the physical basis of the fiducial mark detector system. It is an adaptation of a technique used for observing topographic and compositional contrast in a scanning electron microscope (Kimura and Tamura, 1967). A pair of electron detectors is symmetrically placed about the scan center above the target. They are biased to detect the backscattered electrons; the secondary electrons are not detected.
Figures 27b through 27e illustrate very schematically the variation in signal from the individual detectors as the beam is scanned along a line between the detector centers, in the case where the fiducial mark is a simple V-shaped groove lying at right angles to the scan direction. Two such detector pairs are needed, one for each scan axis, to fully locate the beam position relative to the laser stage. In practice, several different types of detector array have been applied to this problem. The collector-scintillator-light pipe-photomultiplier system which has found such effective application in scanning electron microscopy (Everhart and Thornley, 1960) has been adapted where possible because of its gain-bandwidth product, its excellent noise performance, the optical decoupling inherent in the method, and its reliability. The major factor limiting the use of this approach is space: if the working distance is short, the scintillator-light pipe combination cannot be fitted into the available
FIG.27. An idealized fiducial mark detector system: (a) physical configuration, (b) and (c) signals from individual detectors, (d) summed signal, (e) resultant digital output.
volume. In this circumstance, recourse has to be made to modern forms of electron multiplier, the channel multiplier for example (Chang, 1975a), or to semiconductor particle counters (Wolfe et al., 1975). These approaches do not give as good a signal-to-noise performance, and some signal integration is necessary to make the location measurement to < ±0.1 μm. The alignment system is conventional in design but has to be automated. The active elements are either magnetic or electrostatic deflection systems
suitably placed in the column. The automation of such a system has been reported (Doran et al., 1975; Michail et al., 1975; Pfeiffer and Woodard, 1975; Collier and Herriott, 1975) and involves digital control with a considerable degree of sophistication.
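The mark-location logic of Fig. 27 reduces, in idealized form, to summing the two backscatter signals, digitizing against a threshold, and taking the center of the resulting pulse as the mark position. The sketch below is a toy model with invented signal levels, not an account of any reported implementation:

```python
def mark_center(summed, threshold):
    """Locate a fiducial mark from a summed backscatter signal sampled
    along a scan line: digitize against a threshold and return the
    midpoint (as a sample index) of the below-threshold run, since the
    V-groove darkens both detectors. Returns None if no mark is seen."""
    below = [i for i, s in enumerate(summed) if s < threshold]
    if not below:
        return None
    return (below[0] + below[-1]) / 2

# Illustrative scan: flat background with a dip where the groove lies.
background = 100.0
signal = [background] * 50
for i, depth in zip(range(20, 29), [30, 55, 70, 80, 80, 80, 70, 55, 30]):
    signal[i] = background - depth

print(mark_center(signal, threshold=background * 0.6))  # -> 24.0
```

In a real system the summed signal is noisy, which is why some integration is needed to reach the < ±0.1-μm location tolerance quoted above; averaging repeated scans before thresholding is the obvious refinement.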
VI. DESIGN OF A FAST SCANNING SYSTEM: USE OF THERMAL CATHODES
A. Introduction
We indicated earlier that a basic choice exists between the use of a field emitter, with the advantages and uncertainties outlined in Section V,E, and the application of a thermal emitter, with the performance limitation inherent in its use. In the latter case, the problem resolves itself into finding ways of speeding the throughput with the limited brightness available. The following sections describe a very successful approach to the problem developed by Pfeiffer and co-workers (Mauer et al., 1976, 1977; Pfeiffer, 1974; Pfeiffer and Loeffler, 1971). The basic ideas are to "tailor" the shape and current distribution of the electron spot away from its traditional gaussian properties toward those more suitable for this application, and to use as large an electron beam as possible at a given site on the chip commensurate with maintaining the required edge resolution. The approach has the additional advantage that the required data rate and storage capability are considerably reduced.

B. The Application of Spot Shaping and the Use of a Variable Aperture to Obtain High Throughput
Figure 28 outlines schematically the way in which the required edge resolution restricts the throughput in the conventional approach. In Fig. 28a we have illustrated the ideal resist pattern cross section required to give, for example, a 1-μm structure, while Fig. 28b shows the approximate current distribution in a spot with a width approaching 1 μm. If such a spot is used to write a 1-μm line, the resulting profile takes the form shown in Fig. 28c, and the width of the sloping region at the line edge is too large; the "edge resolution" is insufficient. The width of the transition region scales with the size of the beam spot used, so that the edge resolution can be improved by writing a 1-μm line with, for example, four passes of a ¼-μm spot (see Figs. 28d and 28e). Assuming that the current density remains the same for both spot sizes, the increase in edge resolution is gained by increasing the time required for exposure by a factor of four in the example given above. It is the smallest elements that are the most critical and hence
FIG. 28. The restriction of throughput by edge resolution: (a) idealized resist profile required, (b) 1-μm spot diameter with gaussian distribution, (c) resultant profile obtained with spot shown in (b), (d) superposition of four ¼-μm spots, (e) resultant profile obtained with system shown in (d).
require greatest attention to edge detail. So if we use a machine that retains a single spot size appropriate to the smallest element, the very broad areas have to be written with very many passes, and the time required is nearly directly proportional to the area to be scanned. Two factors can be introduced to improve the situation. The electron beam spot itself can be "tailored" to this application, and a system can be devised in which the final tailored spot can be scaled locally to match the size of the element to be written; i.e., we can increase the size of the spot to write the large elements. This approach has been exploited very effectively in recent years by Pfeiffer and his co-workers. Consider first the tailoring of the spot in terms of the diagrams given in Fig. 29. The first of these figures shows the current distribution within the conventional round beam. Independent of orientation, the fall-off in current density is gaussian on a scale determined by the half-width σ. The shape of the spot is determined by the shape of the effective source delivered by the electron gun, and the distribution within the spot is determined by the distribution of the emission from this source. Figure 29b illustrates the ideal spot from the viewpoint of microlithography. The spot is square in terms of physical shape and square in terms of current distribution. Such a spot could be used to write a line of width equal to the spot size in a single pass and still give the required edge resolution. Figure 29c shows how the addition of a second aperture in an image plane can lead to a "sharpened" distribution approaching the required ideal. In the next section, we outline an optical system due originally to Koehler (1893) which provides a square spot and a sharpened distribution approaching the ideal discussed above.
In addition, the approach is extended so that the size of the square spot can be varied over a substantial range and so that rectangular spots can also be developed.
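The edge-resolution argument of Fig. 28 can be checked numerically by superposing gaussian doses. The sketch below compares the 10%–90% edge width of a single 1-μm gaussian pass with that of four ¼-μm passes covering the same line; the σ values are rough stand-ins for the spot widths in the figure, not measured quantities.

```python
import math

def dose(x, centers, sigma):
    """Summed gaussian exposure at position x (um) from unit-amplitude
    spots centered at `centers` with half-width sigma (um)."""
    return sum(math.exp(-(x - c)**2 / (2 * sigma**2)) for c in centers)

def edge_width(centers, sigma, lo=0.1, hi=0.9):
    """10%-90% width of the left edge of the written line, evaluated
    on a 1-nm grid from -1.0 to +2.0 um."""
    xs = [i * 0.001 - 1.0 for i in range(3000)]
    doses = [dose(x, centers, sigma) for x in xs]
    peak = max(doses)
    x_lo = next(x for x, d in zip(xs, doses) if d >= lo * peak)
    x_hi = next(x for x, d in zip(xs, doses) if d >= hi * peak)
    return x_hi - x_lo

# One pass of a 1-um spot (sigma ~ 0.5 um) versus four passes of a
# 1/4-um spot (sigma ~ 0.125 um) stepped to cover the same 1-um line:
w_single = edge_width([0.5], 0.5)
w_multi = edge_width([0.125, 0.375, 0.625, 0.875], 0.125)
print(f"edge width, one 1-um pass:      {w_single:.3f} um")
print(f"edge width, four 1/4-um passes: {w_multi:.3f} um")
```

The multi-pass edge comes out several times sharper, confirming that the transition width scales with the individual spot size while the exposure time scales with the number of passes.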
C. Use of Koehler Illumination

Figures 30 and 31 illustrate the ideas employed in successive stages. Consider first the three-lens system shown in Fig. 30. An electron gun illuminates a square aperture placed in the principal plane of the first lens. The gun has an effective source which is circular in shape and has a gaussian distribution. The first and second lenses combine to throw a magnified image of this source onto the plane of a round beam-limiting aperture which, because it has a smaller diameter than the image, truncates the transmitted distribution into a "domed" distribution of the required type. Interspersed with this series of gun source images of increasing size are two demagnified images of the square aperture facing the gun. The final one of these images is formed beyond the beam-limiting aperture and so has a sharpened distribution.

FIG. 29. Beam "tailoring": (a) conventional round beam with a gaussian distribution, (b) ideal distribution for microfabrication work, (c) "tailoring" of the distribution edges by inclusion of a second aperture in the image plane.

This three-lens system provides a square spot with a sharpened distribution which can subsequently be further demagnified and deflected. No facility has been included so far to vary the final spot size. Figure 31 shows an elegant way of achieving this specification. The previous column has been extended at the gun end by the inclusion of a further lens and an additional
P. R. THORNTON

FIG. 30. Use of Koehler illumination to provide a square spot with a sharpened current distribution.
square aperture. The first lens now performs two functions: it projects an image of the gun source into the plane of the source in the previous system, and at the same time it forms an image of the first square aperture in the plane of the second square aperture. An additional component is a two-dimensional scanning/deflector system located so that its center of deflection is in the image plane containing the first image of the gun source. Imagine, for example, that the image of the first square aperture is larger than the actual size of the second; then as the scan is activated along a direction parallel to one side of the second aperture, the net size of source "seen" by
FIG.31. Extension of Fig. 30 to include capability of varying the final spot size by control of input source size.
the rest of the column downstream is that indicated in Fig. 32b. A similar line-shaped source can be produced in a perpendicular direction simply by scanning in that direction. If the scan is along the diagonal, then a series of squares of varying dimensions is obtained depending on scan position (see Fig. 32d). Since an LSI chip consists mostly of lines/elements running in two
FIG. 32. The variable aperture method: (a) relative positions of the image of the first square aperture and the second square aperture with zero input to the relevant deflector, (b) and (c) formation of a long rectangular spot in the Y direction by movement in the X direction and vice versa, (d) production of increasingly small square spots by movement along the diagonal.
directions at right angles, great savings in time can be made because the number of elements to expose is greatly reduced. If the current density can be held constant across the range of spot sizes, then the saving is directly proportional to the ratio of the numbers of elements needed. There are pleasing features about the system shown in Fig. 31. In note form we can list these as follows:

(1) The change in final size is brought about by a fast (electrostatic) deflector effectively varying the source size. The lenses in the system remain
at constant excitation, so no slow drift or hysteresis is implicit in this approach, as is the case in more conventional methods involving magnification changes with a constant source size.

(2) Because the deflector operates on an image of the electron gun source, all subsequent images of this source do not move as the effective rectangular source is varied. In particular, the final image impinging on the beam-limiting aperture does not move or change size.

(3) When the effective source size is increased by varying the size of the square aperture seen by the system, the effective solid angle supplying electrons to the final "clipped" image decreases by the same factor as the source size increases. So, provided the distribution is effectively flat-topped, the current density remains constant and the time required to expose each element is independent of its size.

(4) There is lateral movement of the effective square source when the size of this source is changed. However, this lateral shift is demagnified down the column in direct ratio of the total demagnification of the square source. The resultant movement in the final target plane is small and is predictable for each change in source size. It can, therefore, be corrected for.

(5) All electron guns require periodic adjustment to maintain the spot current within prescribed limits. Such adjustments can change the size of the effective gun source and can alter its position along the z axis. Therefore, gun adjustment can lead to refocusing adjustments, with inherent complications in automation and loss of time. In the variable aperture approach, the effective source size and position are determined by aperture position alone and are independent of gun settings. Therefore, gun settings can be varied without refocusing.

(6) One question that arises whenever a system is assessed is: What limitation does the approach impose on the range of linewidths that can be used?
In particular, what increments in linewidths can be used? In the variable aperture approach, the linewidth is determined by the initial source size and the system demagnification. The source size, and hence the linewidth increment, can be made so small that a continuous variation of linewidth is available. These features represent a considerable achievement in terms of elegance of concept, practicality, and ability to break the submicron barrier. However, the approach is not without its complications. Again in note form, we can comment on these:

(1) The gain in speed that can be realized by this approach depends on the scan strategy employed and on the nature of the pattern to be exposed. There will be a "dead time" every time the beam size is varied. Devices with a wide range of line sizes with small increments in line size will take longer to
expose than those with wider increments between widths. Also, the approach has decreasing value as the fraction of exposed area to be written with the minimum linewidth increases. Intuitively, the best scan strategy is to write all elements requiring one size spot or shape, change the spot size/shape, write all elements appropriate to the new shape, and so on. Such a scheme requires retracing over the scan area with large excursions. Resultant variations in deflector loading and thermal effects may play a role (see Section IX). Careful housekeeping in relation to data storage, particularly when proximity effect corrections are incorporated, has to be observed.

(2) Just how wide a range of variable spots can be built usefully into one machine is problematic. In practice we would like to have a range of, say, from 4 μm up to 10 μm. To achieve this specification implies that the square apertures can be made with considerable finesse with regard to edge roughness. And, possibly more important, the constancy of illumination through the beam-forming aperture has to be maintained. The approach has the good feature that the current density remains the same as we change the spot size. If the distribution is not uniform, greater and greater statistical variations in the current density occur as the spot size is decreased, involving "time out" for checking, correction, and return to the writing site.

(3) The question of more exotic patterns is still unresolved. The pressure here arises currently from magnetic bubble work, which involves the use of T-bars and chevrons involving lines at 45° to the main structure. In principle, one approach is to develop and project an aperture of the required shape into the plane of the second square aperture and then use the rest of the column to expose the entire subpattern at the required sites by a process resembling a miniature step and repeat.
(4) Just how well the shape could be maintained down to submicron dimensions depends on the aberrations of the final stages; reduction of these aberrations could lead to limitations as to speed and/or scan area. In addition, some loss of detail due to the proximity effect will result.

(5) This approach does not avoid complications due to coulombic interactions. The top half of the column down to the fourth lens can obtain the necessary current density by exploiting both gun brightness and a relatively large angular aperture. It is the final stages that impose limitations on the angular aperture that can be used. The net result is that a high gun brightness is required, and we must expect some loss of monochromaticity. In addition, the path length through the column is considerable and further electron-electron interactions will increase the energy spread.

(6) From the electron-optical design viewpoint, two factors have to be stressed. Extensive computer-aided design is required to optimize the total system in regard to aberration reduction. Second, the column is a very exact instrument in relation to the first-order properties. The optics diagrams
shown in Figs. 30 and 31 contain one important oversimplification in that they do not reflect the rotation inherent in magnetic lenses. As a result of this rotation, the alignment of the first and second square apertures and of the final square image with the movement axes of the stage is an electrical as well as a mechanical problem. Some correcting system may well have to be incorporated, particularly in situations where the beam voltage has to be varied. The dual role of the first lens gives an additional constraint. The primary purpose of this lens is to project an image of the first square aperture into the plane of the second aperture. Its secondary purpose is to throw an image of the gun source into the center of deflection of the first deflector. These are both critical functions, and the overall reliability of the system is increased by effectively separating out the fine adjustment involved in the second purpose. This has been achieved by incorporating a secondary set of deflector plates which, by electrical adjustment alone, enables the center of deflection to be brought into coincidence with the relevant image (Pfeiffer and Loeffler, 1971).

(7) The system is not without its engineering difficulties. Over and above the problems associated with multilens columns, there are unique problems in the variable aperture approach. We have already outlined the attention to detail that has to be paid in relation to the square aperture fabrication. The fabrication details of the final projector/deflector combination will be considered in Section IX. The inherently short working length restricts the space available for the fiducial mark detector system. The incorporation of the blanker, if blanking at a crossover is required, may present some problems.
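The trade-off raised in point (1) above, fixed minimum spot versus a variable spot with a dead time per shape change, can be sketched with a toy timing model. All machine parameters below (dwell time, dead time, element mix) are assumed round numbers for illustration, not values from the text:

```python
# Toy model: writing time with a fixed minimum-size spot (time ~ total area)
# versus a variable-shaped spot (one flash per element at constant current
# density, plus a dead time for every change of spot size/shape).

def write_time_fixed(elements, spot_area_um2, dwell_s_per_flash):
    """Every element is filled with the minimum spot, many flashes each."""
    flashes = sum(area / spot_area_um2 for area, _ in elements)
    return flashes * dwell_s_per_flash

def write_time_variable(elements, dwell_s_per_flash, dead_time_s):
    """One flash per element; constant current density makes the dwell per
    flash independent of element size (point (3) of the features list)."""
    distinct_shapes = {size for _, size in elements}
    return len(elements) * dwell_s_per_flash + len(distinct_shapes) * dead_time_s

# elements: (area in um^2, nominal linewidth in um), a hypothetical pattern
elements = [(100.0, 2.0)] * 50 + [(1.0, 0.5)] * 200

t_fixed = write_time_fixed(elements, spot_area_um2=0.25, dwell_s_per_flash=1e-6)
t_var = write_time_variable(elements, dwell_s_per_flash=1e-6, dead_time_s=1e-3)
print(t_fixed, t_var)  # the variable spot wins despite the dead-time penalty
```

With many small increments in line size the set of distinct shapes grows, and the dead-time term erodes the advantage, which is the point made in the text.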
D. Experimental Results

We can indicate the current state of progress being made with variable aperture systems by considering not only the results obtained with a full variable aperture system (Pfeiffer, 1974; Pfeiffer and Loeffler, 1971), but also those obtained with its predecessor EL1 (Mauer et al., 1976, 1977), which gave a shaped beam with a fixed final spot size. Working with a tungsten cathode at a brightness of 3 × 10⁵ A/cm² sr and a beam voltage of 25 kV, patterns have been exposed over scan areas of 0.4 cm × 0.4 cm to 0.5 cm × 0.5 cm without loss of pattern fidelity and edge resolution (Pfeiffer and Loeffler, 1970). In these studies, square elements of sizes from 0.6 μm × 0.6 μm to 2.0 μm × 2.0 μm in increments of 0.2 μm were exposed. It was found that the spot maintains its position and its edge resolution during high-speed switching between spot sizes to within the limit set by experimental error, i.e., less than 0.1 μm. No quantitative indication is given of the actual switching rate, nor is there any clear suggestion of what resist was used.
Experiments indicated that the gun used in this approach (Mauer et al., 1977) delivered the required 3 μA into a 2.5 μm × 2.5 μm spot, for example, with a stability of ±1%. The illumination within the spot varied by less than ±2.5%. One slightly disturbing feature is the fact that the cathode life was low, between 30 and 40 hr. The reason for this lack of duration is not clear, but a superficial examination of the published photographs (Weber and Yourke, 1977) indicates that the designers may have sacrificed pumping speed in order to avoid the complications that can arise when ion pumps are placed in juxtaposition with the electron beam. It is noteworthy that the column is carefully designed to meet the high mechanical specification required in microfabrication work. In particular, a considerable number of mechanical movements have been included to avoid further complication of the corrections needed. The square aperture(s) can be rotated to align with the deflection axes of the main magnetic deflector. The same facility is applied to the electrostatic deflection system. The stigmator can be mechanically prepositioned so that corrections for astigmatism, when applied, do not cause lateral movement of the beam. The same degree of freedom is applied to the round beam-limiting aperture. The question of aperture contamination is resolved, in System EL1 at least (Mauer et al., 1977), by allowing the impinging beam to heat the square aperture to some degree, while heat can be applied directly to the beam-limiting aperture. Details regarding the final lens in the variable aperture column (the combined projector/deflector system) are considered in Section IX, as, conceptually, they belong to a discussion of the deflector problem.
VII. THE DEVELOPMENT OF A FAST SCANNING SYSTEM: USE OF FIELD EMITTER CATHODES

A. Introduction and Potential Performance
The basic properties of field emitter cathodes were outlined in Section V. It was shown that the recent developments associated with field emitters have led to a mode-confined beam with a relatively narrow emission cone. This development reduces some of the difficulties facing the use of field emitters in microfabrication systems. There are no published accounts of the use of such emitters in microfabrication work, but we have considerable background data with which to establish the merits, potentiality, and difficulties associated with the approach. Briefly, the case for the use of a field emitter is mainly concerned with device fabrication where the critical elements have sizes of the order of 1000-2000 Å. Under these conditions, a field emitter
system will outperform a conventional thermal emitter system (Veneklasen, 1971, 1972; Broers, 1972; Drechsler et al., 1960; Cosslett and Haine, 1956). At this level of resolution, the gains inherent in the variable aperture approach become less significant. So, on an initial examination, the field emitter approach seems to be the major contender at this resolution level. In quantitative terms, it has been shown that specific emissions of the order of 1000 μA/sr can be sustained under experimental conditions for some hundreds of hours, with effective source sizes ≤250 Å and with half-angles subtended at the gun of the order of 10⁻² rad. Neglecting for the present the role played by coulombic interactions on the beam, these figures translate, for example, into a 2500-Å image spot with a current density of the order of 10³ A/cm², compared to the 50 A/cm² that is the upper limit obtainable by the application of thermal cathodes. It is this factor that highlights the attraction of the field emission approach. The questions that arise to dampen this enthusiasm are the technological difficulties associated with the operation of field emitters at high current levels, the possible limitation imposed by coulombic interactions, and the question of noise and its effect on linewidth when fast resists of poor contrast are used. The expertise currently being applied to these problems is formidable. The next section outlines the more important practical difficulties facing the use of such systems.

B. Practical Difficulties Associated with Field Emission Microfabrication Systems
Referring back to Fig. 16, we see that at the large half-angles appropriate to our application, the effective source is a spherically aberrated disk of least confusion located a short distance behind the actual cathode tip. There are extensive data, both experimental (Nomura et al., 1973; Nishida, 1971) and analytical (Shimizu et al., 1973; Kuo, 1976), which show that the actual position of this disk along the z axis is a function of the ratio of the extraction voltage to the final beam voltage for a very wide range of gun designs. In microfabrication work, the beam voltage has to be fixed, so that any perturbation of the extraction voltage will alter the source position. This variance of source position complicates both the ability to control the emission level of the gun and the ability to retain the large depth of focus implicit in a field emission system. The sensitivity to this type of change arises because a field emission microfabrication system is a magnifying system. In a system delivering a 0.5-μm spot from a 0.05-μm source, the total system magnification Mf is 10, and if the source position is perturbed by an amount ΔZs, the image is moved out of focus by an amount ΔZi given by
ΔZi = Mf² ΔZs ≈ 10² ΔZs

Therefore, a displacement of ΔZs ≈ 25 μm (10⁻³ in.) results in a movement of
the final focus plane by 2500 μm (10⁻¹ in.), which is of the order of 20 times the depth of focus of the system. It is unlikely that a simple gun can be developed which holds the source position fixed for the voltage excursions required to keep the specific emission constant. The implication is clear that additional lens elements are required to make the gun control and the focus stability requirements compatible. An interesting solution has been suggested by Kuo (1976), following on work by Veneklasen (1972). This worker argued that there is a premium to be placed on obtaining a gun with a low aberration coefficient and suggested that a magnetic lens be incorporated below the field emitter at a positive voltage relative to the tip. As magnetic lenses have better aberration performance than electrostatic lenses, such a selection should, with suitable optimization, give the highest spot current available. At the same time, as the extraction voltage is varied to keep the emission current constant, the input to the lens winding could be programmed to change in such a way as to keep the first crossover down the beam in a constant position. This approach would greatly reduce the refocusing necessary after making beam current corrections. The fact that a field emitter system has an approximately ×10 magnification also compounds difficulties associated with vibration, magnetic pickup, and alignment. In this approach, all movements and misalignments in the gun area of the column become magnified further down the column. Consider the vibration specification. We need to position 1-μm size elements to within ±0.1 μm, taking account of all errors. Even if we use up all the available uncertainty to accommodate cathode movement and vibration, this specification implies that the amplitude of any vibration of the cathode tip must be kept below ±100 Å. A similar sensitivity applies to the alignment specification.
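The quadratic scaling of defocus with magnification can be checked in round numbers; the values used are those of the worked example in the text:

```python
# Defocus sensitivity of a magnifying field-emission column: a longitudinal
# source shift dZ_s appears at the image plane magnified by Mf squared,
# dZ_i = Mf**2 * dZ_s.

def image_defocus(source_shift_um, magnification):
    """Longitudinal image displacement for a given source displacement."""
    return magnification ** 2 * source_shift_um

dZ_i = image_defocus(source_shift_um=25.0, magnification=10.0)
print(dZ_i)  # 2500.0 um of defocus for a 25-um source shift at Mf = 10
```

The same quadratic factor is why gun-end vibration and misalignment budgets are so tight in this approach.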
A misalignment of 25 μm at the gun end can project into a 250-μm displacement in the final lens/deflector system. Veneklasen (1972) has outlined the magnetic problem with particular reference to a mains-frequency "wobble" imposed on the beam by magnetic pickup. The suggestion is that the magnetic shielding should be such as to reduce the ac magnetic fields to the order of 10 μG at the beam. There is a different set of problems that occur with the use of modern field emitters in microfabrication systems. These problems are essentially of "a reduction to practice" nature, in which the results obtained under somewhat idealized conditions by experienced personnel in the laboratory have to be transferred to the factory floor. The main area of interest is in establishing and maintaining the mode-confined condition. Briefly, the situation can be summarized as follows: In the laboratory, cathodes are assessed and tested in a diode configuration with an open structure, so there is little or no backbombardment onto the cathode. Provision is made so that the emission pattern can be observed directly, and the practicalities involved in the alignment of cathode and anodes are avoided. In a practical system, the emission pattern cannot be observed directly and a triode gun is required. Significant backbombardment occurs, the question of alignment adds complexity, and a premium is placed on an extensively long pump-down time. The progress toward buildup can only be observed indirectly. Under these conditions, it cannot be claimed that the activation process, involving a careful adjustment of field, temperature, and gas content, has been reduced to practice with the same degree of finesse that the use of the (310) emitter has achieved, giving low downtime and good yield. The final consideration relates to the noise figures and linewidth variation that can result when sensitive resists with poor γ values are used. The situation can be summed up here by saying that a determined research effort to understand and control the noise properties of advanced field emitters would be a good investment. Considerable progress has been made. This topic is reviewed in Section VII,D. Section V,E,5 also indicated a way in which an engineering compromise between total current and noise can be made.
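One reason noise matters with fast, low-γ resists is that the electron count per exposed element is small, so shot noise alone produces an appreciable dose fluctuation on top of any emitter noise. The resist sensitivity and element size below are assumed illustrative values, not figures from the text:

```python
import math

E_CHARGE = 1.602e-19  # electron charge, coulombs

def electrons_per_element(dose_C_per_cm2, area_cm2):
    """Number of electrons delivered to one exposed element."""
    return dose_C_per_cm2 * area_cm2 / E_CHARGE

def relative_shot_noise(n_electrons):
    """Relative rms dose fluctuation from Poisson statistics, 1/sqrt(N)."""
    return 1.0 / math.sqrt(n_electrons)

# A hypothetical fast resist at 1e-7 C/cm^2 exposing a 0.5 um x 0.5 um element:
n = electrons_per_element(1e-7, (0.5e-4) ** 2)
print(n, relative_shot_noise(n))  # ~1.6e3 electrons, ~2.5% rms fluctuation
```

Any cathode noise adds to this floor, and a low-contrast resist translates the total dose fluctuation directly into linewidth variation.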
C. Experimental Work

Results on the use of mode-confined field emitters in a magnetic column have been published by Wolfe (1975), who worked with a gun originally intended for data storage micromachining. This worker studied both the (100) built-up emitter and the zirconated emitter under near-identical conditions. The cathodes were operated in a diode mode and were assessed with the tip within the high-field region of a magnetic lens. This lens produced a collimated beam which was subsequently focused by a final lens onto the target plane. Under these conditions, effects due to anomalous energy spreads were minimized. Working in a pressure range of 10⁻⁹-10⁻⁸ torr, Wolfe determined that the diameter of the emission area for the built-up emitter is 55 Å at an emission level corresponding to a specific emission of 700 μA/sr. For the zirconated emitter at the same specific emission, the diameter of the emission area was found to be 285 Å. In each case, the beam half-angle subtended at the gun was 1.5 × 10⁻² rad. No pertinent noise data were given. We can utilize Wolfe's work in another way to give some insight into the problem of high-current effects, particularly coulombic interactions. Using a system magnification of 3.2, this worker obtained 5 × 10⁻⁷ A into a 350-Å spot using a built-up emitter and achieved the same current into a 950-Å spot using a zirconated emitter. The beam half-angle at the target was 4.7 × 10⁻³ rad in each case. The corresponding current densities are 5 × 10⁴ and 7 × 10³ A/cm², respectively. In Section X, we shall compare these experimental values with theoretical estimates. For the present, we can
use these data to indicate the power of a field emitter based system if we can avoid or overcome effects due to coulombic interactions. Using the values obtained by Wolfe for cathodes operating in a diode mode with an emission voltage of 5 kV, then for a microfabrication system operating at 20 kV, we would obtain 5 × 10⁻⁷ A into a beam half-angle of 7.5 × 10⁻³ rad with an effective source size of 300 Å, using a gun of spherical aberration constant Cs ≈ 5 cm. If the system magnification used is ×10, we would obtain 5 × 10⁻⁷ A into a 0.3-μm spot with a beam half-angle of 7.5 × 10⁻⁴ rad. The resultant current density is 700 A/cm². This figure is approximately 10 times higher than the corresponding figure obtainable with thermal cathodes. These results are indirectly supported by the work of Kuo (1976) using the built-up emitter at somewhat lower beam currents. Kuo obtained specific emission values of the order of 200 μA/sr at 20 kV with total emission currents of the order of 10 μA. Other investigations of field emitter sources for microfabrication work include an experimental assessment by Wardly (1973a) and the earlier work of Liversey (1973). Both of these investigations were concerned with conventional field emitters of (111) orientation (Wardly, 1973a) and (310) orientation (Liversey, 1973). Wardly indicated that no "current-dependent" spot growth occurred at specific emissions of less than 10³ μA/sr in a column consisting of a triode gun, a scanning system, and a target plane. The vacuum level used was 10⁻¹⁰ torr. Adequate lifetime was obtained and the periodic flashing required did not interfere significantly with the system's usefulness. One test at 300 μA total emission gave 100 hr of operation with an average time between flashings of 4 hr. One cause for concern was the noise level quoted. In a 10-min period, the noise level was 10% peak to peak. This noise figure was sustained for a period up to 20 hr.
This figure is probably marginal if the use of low-contrast resists is envisaged. Liversey's pioneering application (1973) of field emitters to microfabrication work does not give insight into the question of high-current effects or into the problem of field emitter noise and its role in affecting linewidth in fast resists. Liversey worked with a slow, high-contrast resist (PMMA) and produced good-quality 1-μm lines with 1-μm separation.
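The current densities quoted in this section follow directly from current over spot area, and can be checked with a few lines. The 5 × 10⁻⁷ A figure and the spot diameters are taken from the text; the uniform-disk assumption is the simplification:

```python
import math

# Consistency check on the figures quoted above: 5e-7 A focused into spots
# of 350 A and 950 A diameter (Wolfe's measurements), and the projected
# 0.3-um (3000-A) spot for a x10 microfabrication column.

def current_density(current_A, diameter_angstrom):
    """Mean current density in A/cm^2 for a uniform round spot."""
    radius_cm = diameter_angstrom * 1e-8 / 2  # 1 A = 1e-8 cm
    return current_A / (math.pi * radius_cm ** 2)

print(current_density(5e-7, 350))   # ~5e4 A/cm^2, built-up emitter
print(current_density(5e-7, 950))   # ~7e3 A/cm^2, zirconated emitter
print(current_density(5e-7, 3000))  # ~700 A/cm^2, projected 0.3-um spot
```

All three agree with the values stated in the text, and the last one shows the roughly order-of-magnitude margin over the ~50 A/cm² thermal-cathode limit.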
D. Future Possibilities

The question of coulombic interactions is considered in Section X. Here we can address the problems of noise, stability, and cathode life. We have indicated that an engineering solution to the problem can be considered by using tips with larger emission areas at the expense of some loss of performance. A long-term solution can only be obtained by achieving a soundly based knowledge of the detailed conditions existing at the tip as a function of temperature, field, environment, etc. This scientific investigation is already
well under way. Using modern surface analysis techniques, surface physicists have provided extensive data pertinent to the systems considered here. A brief and incomplete survey will indicate the extent and relevance of the recent work being undertaken. The adsorption isotherms of oxygen on the (100), (110), and (111) surfaces of tungsten over a pressure range of 10⁻⁹-10⁻⁴ torr for temperatures between 1500 and 2600°K have been published by Banninger and Bas (1975), while Baur et al. (1976) have examined the kinetics and resultant structures that occur when oxygen is adsorbed on (100) tungsten. The behavior of carbon monoxide, nitrogen, and hydrogen on (100) tungsten has been reported (Froitzheim et al., 1977; Singh-Boparai and King, 1976; Jaeger and Menzel, 1977). The interaction of O₂ and CO and of O₂ and CO₂ on (111) tungsten has received attention (Hopkins and Watts, 1976). The diffusion of oxygen on (110) tungsten and the electron desorption of CO⁺ and O⁺ from (111) tungsten covered by a monolayer of CO have been studied (Madey et al., 1976). In another area, Mee (1976) has shown how scanning electron microscopy can be usefully applied to the study of failure modes in (100) built-up field emitters.
VIII. THE DEVELOPMENT OF A FAST SCANNING SYSTEM: THE ROLE OF COMPUTER-AIDED DESIGN

A. General Approach
The power of computer-aided design at the present time is such that workers can, in principle, leave most of the design of the total system to the computer, the only limitations being cost and access to suitable computer facilities. But with the development of better programming techniques, tasks that previously required a large computer can now be completed on the more advanced minicomputers, so the question of cost and accessibility is becoming increasingly less significant. As a result, workers now have a freedom of choice as to how the work load is to be reduced by the computer and as to when, and in what manner, intuition and physical insight can be incorporated into the design. The choices are as extensive and as individual as the designers themselves. But, in general, the computer-aided design not only considers the theoretical limitations but, in successive iterations, examines the practical aspects of the electron-optical design and the limitations imposed on the electron-optical design by other technologies. The basic design has to include such factors as the properties required of the final image spot; the limitations imposed on the system by the available cathode technology; the need to incorporate "secondary" but essential components; fabrication difficulties; the question of performance compromise for gains in
reliability and cost; and the interaction of the electron-optical design with the electronic, software, and mechanical engineering aspects of the system. To see how the approach develops, we can list the possible iterations in the following manner:

Stage 1. An intuitive or educated choice of the general approach: type of gun, number of lenses, type of deflector, etc.

Stage 2. System analysis in terms of component parameters to maximize the current into the required undeflected spot.

Stage 3. Incorporation of the deflection problem.

Stage 4. A repetition of Stages 1-3 to compare several general approaches, and to select the best.

Stage 5. A more detailed analysis to fix the specifications of individual components.

Stage 6. The detailed computer-aided design of individual components to meet the required specification.

Stage 7. Consideration of the high-current problem.

Stage 8 and onwards. Further iterations to relieve undue burden on one element, to increase ease of fabrication, ease of assembly, and ease of alignment, and to increase tolerance to errors, etc.

At the completion of Stage 2, we should have listed the first- and third-order properties of column elements that are required to give the specified undeflected spot. This list will identify the critical components and will specify the required properties in terms of focal length, spherical aberration, chromatic aberration, distortion, off-axis aberrations, etc. Stage 3 will extend this list to the deflector system. Stage 4 allows a choice to be made between several general approaches; or, if ambiguity remains, the better possibilities can be carried forward. Stage 5 substantiates or modifies, to the required degree of sophistication, the requirements placed on each component. Stage 6 is the detailed computer-aided design of individual components or groups of components to meet the specification.
Stage 7 is, in essence, a consideration of the limitations of the analysis and, in particular, of how high-current effects could impose additional limitations on the system performance. Subsequent stages are basically an examination of the role of perturbations of the optimum design and of the trade-offs to be made for the reasons stated. The computer-aided design capability really shows its worth in Stage 6, but it can also be of considerable help in Stages 2 and 3 and, of particular interest in microfabrication work, in Stage 7. Before considering the detailed design of individual components, it is convenient, both from the viewpoint of the present discussion and the consideration of high-current effects in Section X,
to examine the way in which the detailed properties of a given component can be integrated into the total design by treating a particular problem. The problem has been selected on the grounds of its illustrative nature, but it is an important case in practice; it is the question of the adaptation of a scanning electron microscope into a microfabrication system. In this case, an initial analysis quickly shows that the critical components are the final magnetic lens, which has to be modified to incorporate the required deflection area into the system, and the deflector system itself. The pertinent data are discussed in the next section with particular reference to Eq. (16) and to Fig. 33.
FIG. 33. Summation of aberration contributions to determine the system performance as a function of beam half-angle (log spot size against log beam half-angle ai). The broken curves indicate the additional aberrations that can arise at high currents due to coulombic effects and are pertinent to the discussion in Section X.
P. R. THORNTON
B. The Combination of Contributions to Final Image Size, Shape, and Position

Equation (16) gives the undeflected final image size di in terms of the various contributions to the total aberration:

di² = dio² + dd² + dsa² + dc²    (16a)

dio = dgun(ao/ai) = (4Is/π²β)^(1/2) × (1/ai)    (16b)

dsa = ½Cs(Mf)ai³    (16c)

dc = Cc(Mf)(ΔV/Vi)ai    (16d)

dd = Ci/ai    (16e)

In establishing Eq. (16), we have assumed that the various contributions to the final image size can be added in quadrature. And we have assumed that the beam-limiting aperture position has been chosen to reduce off-axis aberrations to below the significance level. dio, dd, dsa, and dc are the contributions to the image size from the demagnified image of the gun source, diffraction, spherical aberration, and chromatic aberration, respectively. In Eq. (16), Mf is the magnification at which the final lens is operative, Ci is a dimensional constant, and ΔV is the rms value of the energy spread from all sources. The other parameters have been defined elsewhere. Figure 33 gives plots of (di² − dio²)^(1/2), dd, dsa, and dc against ai for selections of the system parameters. In the case of the scanning electron microscope, where increase in resolution is the aim, the process of system optimization is to minimize di with respect to ai, bearing in mind that the spot current has to be sufficient to overcome loss of contrast due to noise limitation in the detector system; i.e., the optimization condition is

∂di/∂ai = 0   provided   Is ≥ Iso    (17)

where Iso is some lower limit usually set by shot-noise considerations. In the case of a microfabrication system, the ultimate in resolution is not required, so resolution is sacrificed to increase the available current. The net result, within the limitation of Fig. 33, is to move the operating point to higher values of di where, in general, the performance is limited by spherical aberration. In this situation, the optimization condition is obtained by increasing the spot current by increasing the angular aperture ai to its maximum value consistent with keeping di constant and equal to a given fixed value. We can briefly indicate how the deflection problem can be included. Assume, for the present, that deflection aberrations can be summed in quadrature with the aberration of the undeflected spot.
Also assume that the deflection system is such that astigmatism and coma are the main contributions to the deflection aberration. We can, therefore, write the size of the deflected spot dii in terms of dio as

(dii² − dio²)/dio² = (k1aβ²)² + (k2a²β)² = (1 + f)² − 1    (18)

where f ≈ 1/10. In this equation, k1 and k2 are functions of the deflector nature, its configuration, and its location (see Section IX). The deflector design problem is then to reduce k1, k2 (and equivalent parameters) so that the allowed value of ai is reduced as little as possible (see Fig. 34). A full development of these arguments will set the design goals for
(1) the lens, as the reduction of Cs(Mf) and possibly of Cc(Mf),
(2) the deflector, as the reduction of k1, k2, etc.
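As a rough numerical illustration of how Eqs. (16) and (17) are used, the sketch below evaluates the quadrature sum for an assumed set of component constants and scans the beam half-angle for the minimum spot. All numerical values are invented for illustration; they are not taken from the text.

```python
import math

# Illustrative constants (assumed, not from the text):
Cs = 5.0e-2         # spherical aberration term Cs(Mf), m
Cc = 1.0e-2         # chromatic aberration term Cc(Mf), m
dV_over_V = 1.0e-4  # relative energy spread, ΔV/Vi
Ci = 1.0e-11        # diffraction constant Ci, m·rad
K_gun = 5.0e-11     # gun-image constant, m·rad, so that dio = K_gun/ai

def spot_diameter(ai):
    """Total undeflected spot size di, contributions added in quadrature (Eq. 16a)."""
    dio = K_gun / ai            # demagnified gun image, Eq. (16b)
    dsa = 0.5 * Cs * ai**3      # spherical aberration, Eq. (16c)
    dc = Cc * dV_over_V * ai    # chromatic aberration, Eq. (16d)
    dd = Ci / ai                # diffraction, Eq. (16e)
    return math.sqrt(dio**2 + dsa**2 + dc**2 + dd**2)

# Crude stand-in for the SEM optimization condition of Eq. (17): scan ai for
# the minimum spot (a real design would also check the current against Iso).
angles = [1.0e-4 * 1.05**k for k in range(150)]
ai_opt = min(angles, key=spot_diameter)
```

In the microfabrication case described above, one would instead hold di at the required (larger) spot size and push ai, and hence the spot current, to the largest value for which di stays within specification.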
FIG. 34. Extension of Fig. 33 to indicate the inclusion of the deflection problem. (Both panels plot spot size against beam half-angle ai, indicating the required spot size and range.)
Here the computer aids in the solution mainly by solving Laplace's equation in a variety of increasingly complex geometries and combinations of fields.

C. The Detailed Computer-Aided Design of Individual Electron-Optical Components

1. General

The computation of the properties of individual electron-optical components, or of small groups of components, proceeds through three main stages. The first stage is the determination of the field distribution that exists within a predetermined geometry under a given set of boundary conditions with a chosen degree of excitation. The second stage is the calculation of the "first-order" or "linear" or "gaussian" optical properties of the system, by the calculation of suitably selected trajectories within the validity of the approximation that sin a can be written as a, where a is the inclination of the beam to the optic axis. The final stage is the calculation of the third-order aberrations of the system under varying conditions of excitation within the range where sin a can be written as a − a³/6. Expressions for the contributions to the total differences between the first- and third-order trajectories are examined for their dependence on a, β, x, y, etc. and are identified largely in terms of the analogous geometrical optical distortions and aberrations. Of these three tasks, the determination of the relevant field distribution is far and away the most difficult and time-consuming. The underlying problem is the solution of Laplace's equation to a high degree of accuracy in structures of increasing geometrical complexity and increasingly complicated boundary conditions. Early workers utilized resistance network analogs and actual physical measurements to determine the field distribution (Lenz, 1950; Dugas et al., 1961) in simple geometries of high symmetry, but the major advance was made by the application of the digital computer (Kamminga et al., 1968; Read, 1969; Munro, 1971a; see, for example, Hawkes, 1973).
2. The Computation of the Field Distribution

The initial work was limited to simple geometries because of the storage needs that arise when complex geometries are considered. Figure 35 outlines the problem. In Fig. 35a, we have a simple structure with a high degree of symmetry, i.e., a symmetrical magnetic lens with the bore diameter D equal to the separation S between the pole pieces. The presence of two
FIG. 35. (a) Simple, highly symmetrical magnetic lens structure discussed in text, (b) more complex structure used to discuss the finite element approach.
symmetry axes enables us to reduce the problem to the consideration of any one quadrant, such as ABCD. In the absence of saturation effects, we can work in terms of the magnetic scalar potential with suitable boundary conditions. A relaxation method (or equivalent process) can then be employed by dividing the region ABCD into a "mesh" of lattice points and solving for the potential throughout this region. We need to have the greatest accuracy in the region of the magnetic gap, so the mesh is chosen sufficiently fine to give the required accuracy in this region. But since the gap and its lateral extension are substantial fractions of the total dimensions, the fineness of mesh can be extended throughout the whole of the region ABCD without overloading storage capability. For example, if we take ten lattice points across S/2 (= D/2), then the total area can be contained within 50 × 40 lattice points. In Fig. 35a, the magnetic boundary conditions lie along lines of mesh points, so that no additional complexity or approximation results.
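The relaxation process just described can be sketched in a few lines. The 50 × 40 mesh matches the example above, but the boundary values and the simple Cartesian five-point Laplacian are assumptions for illustration; a real magnetic-lens solver would use the axisymmetric form of the operator and the true pole-piece boundary values.

```python
# Minimal Gauss-Seidel relaxation sketch for the scalar potential on one
# quadrant (all boundary values assumed for illustration).
NZ, NR = 50, 40
V = [[0.0] * NR for _ in range(NZ)]

# Dirichlet boundaries: one edge held at unit scalar potential (standing in
# for the pole-piece face); the remaining edges stay at zero.
for j in range(NR):
    V[0][j] = 1.0

for sweep in range(1500):              # repeated sweeps relax the interior
    for i in range(1, NZ - 1):
        for j in range(1, NR - 1):
            # five-point Laplacian: each point becomes the mean of its neighbors
            V[i][j] = 0.25 * (V[i - 1][j] + V[i + 1][j]
                              + V[i][j - 1] + V[i][j + 1])
```

The same loop structure carries over to successive over-relaxation, which simply adds an acceleration factor to each update.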
3. Calculation of First- and Third-Order Optical Properties

The calculation of the first- and third-order properties relies heavily on early work by Glaser (1940, 1956), Scherzer (1936), and others (see, in particular, Zworykin et al., 1945). These workers gave the necessary formulas to calculate the required properties of lenses as integral functions of the axial field and its derivatives. Haantjes and Lubban (1957, 1959) and Kelly (1972) have given the corresponding formulas for magnetic and electrostatic deflection, respectively (see Section IX). Recent advances include an analysis by Munro (1974) which enables the properties of a combined magnetic lens and magnetic deflection system to be examined. Some idea of the current capability of computer-aided design can be obtained by noting that the total system optimization process can now be included in the actual program (Munro, 1975). The basic input data include the physical dimensions and location of the components of an assembly. The output data include a listing of selected aberrations. Values are added in quadrature and are used as input to a program that minimizes the sum of the squares of functions of several variables (Powell, 1965). The program, therefore, predicts the optimum geometry and location to give the minimum aberration. Due regard to the physical realizability of the required configuration is made by not allowing the size, location, etc. to assume unrealistic values. If, for example, we attempt to apply the above approach to the structure shown in Fig. 35b, which represents a lens analyzed by Heritage (1973) for possible application to microfabrication work, we meet additional complexities. The first factor is the absence of a second symmetry plane, which at least
doubles the complexity of the problem. The boundaries no longer lie along the mesh lines in all cases, leading to some additional complexity and degree of approximation. Finally, if we seek to extend a fine mesh appropriate to the important regions (the magnetic gap and the pinhole aperture on the right of the lens) over the entire region, the required storage can increase by a factor of 10²-10³. A multimesh approach using rectangular (or square) meshes can be used, but the boundary approximation problem remains. And the multimesh approach in this form can get quite complex in cases of current importance, such as the case of a field emitter gun. In the region of the cathode tip (~500 Å in radius), the mesh has to be ultrafine to give the required accuracy. In the region of the first anode, the mesh size has to be appropriate to a scale of the order of 1 mm, while the separation of the first and second anodes can be of the order of centimeters, requiring yet another scale of mesh. Recently, we have seen the establishment of an approach that avoids these complications and allows a generality of approach which satisfies the complexities inherent in microfabrication work. This method, the finite element method, was first developed by Zienkiewicz (1967; Zienkiewicz and Cheung, 1965) to solve structural engineering problems. But its extensive application to electron optics by Munro (1971b, 1972, 1973; Owen, 1975) has been a significant feature in the development of advanced systems. In brief, the basis of the method is to allow the mesh to depart from a square or rectangular configuration in a manner which is specific to each structure, in that the mesh is distorted to merge with the geometry of the structure. Figure 36 shows how this distortion can be achieved for the lens illustrated in Fig. 35b. The structure is broken up into a few, very large grid shapes, and the computer is used to give the fine mesh within this framework.
Figure 36a shows how the allocation of mesh points could be made along the z-axis and r-axis directions, while Fig. 36b shows how the computer generates the variable mesh lines in the r-axis direction. A complementary process in the z direction would surround each point by an array of four quadrilaterals, which can, in turn, be divided into triangular arrays, giving each point surrounded by six neighboring points. The meshes are concentrated where required, the boundary conditions are integrated into the mesh in an elegant way, and the computer deals with the "number crunching" aspects. For more detail, the reader is referred to Munro (1971a, 1972, 1973) and Owen (1975). The pleasing features of this approach include its extendability. We have already stressed the flexibility in relation to geometric configuration. The approach can be extended to include the variation of permeability of the iron as the excitation is varied. This process can be extended to compute the flux distribution throughout the total circuit, including the windings themselves. Electrostatic lenses can be examined by
FIG. 36. The fitting of a variable-size mesh to the geometry of an electron-optical structure: (a) generation of the coarse mesh by indication of the number of mesh points to be assigned between any pair of coordinates, (b) computer generation of the fine mesh by linear interpolation; only a partial fine z-mesh is illustrated. The number of mesh points chosen in the figure is determined by clarity of exposition, not by the more realistic basis of actually determining the distribution.
allowing the permeability to assume a very large value. The approach has already found application in a wide variety of situations (Munro, 1971, 1972, 1975; Owen, 1975, 1977).
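The coarse-to-fine mesh generation of Fig. 36a can be sketched in a few lines: the designer lists a handful of coarse coordinates ("break points") and the number of mesh intervals to place between each pair, and the fine mesh lines follow by linear interpolation. The coordinates, counts, and the helper name fine_mesh below are invented for illustration.

```python
# Generate fine mesh coordinates from a coarse specification, as in Fig. 36a.
def fine_mesh(breaks, counts):
    """Return mesh coordinates with counts[k] equal intervals between
    breaks[k] and breaks[k+1]."""
    pts = []
    for (a, b), n in zip(zip(breaks, breaks[1:]), counts):
        # n equally spaced points starting at a (endpoint b begins next segment)
        pts.extend(a + (b - a) * k / n for k in range(n))
    pts.append(breaks[-1])
    return pts

# Concentrate mesh lines near an assumed gap at z = 2.0-2.5 mm:
z_lines = fine_mesh([0.0, 2.0, 2.5, 10.0], [4, 10, 6])
```

A nonuniform spacing rule (geometric grading toward the gap, say) could replace the linear interpolation without changing the structure of the routine.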
D. Limitations

Probably the major limitation facing the analysis outlined in the previous sections is that it pertains to a low-current situation, whereas in microfabrication applications we are continually striving to increase the
current. The basic problem is that the analysis to date, in general, neglects the interaction between electrons within the beam. This topic is the subject of some controversy and merits consideration in a separate section, as it probably represents the major limitation on the development of a fast scanning system by any approach. This topic is developed in Section X. The second uncertainty arises when we seek to combine the various contributions to the total aberration on the beam. In general, the consensus is that the contributions should be added in quadrature (Klemperer and Barnett, 1971; Pease, 1963), but this opinion is by no means universal. Workers such as Grivet (1972), Mulvey (1967), and Owen (1975, also private communication) contend that a linear combination of aberrations is more appropriate. The resolution of this question is particularly important in microfabrication work, where factors of two or less in throughput can make the difference between economic success or not. Harte (1973) has quoted an example involving spherical aberration in which the available current differs by a factor of just greater than two, depending on whether a quadrature or a linear combination is more exact. From the viewpoint of microfabrication, the situation is equally obscure. Consider the question of the deflection-induced growth of the spot at the edges of the required scan field. Adopting the notation of previous sections and taking an empirically derived estimate that the growth of the deflected spot in the radial direction cannot exceed approximately 10% of the undeflected spot, we have
dii = dio(1 + f) ≈ 1.1dio    (19a)

If the deflection-induced aberrations lead to a total contribution dd, then the relationship between dii, dio, and dd is

dii² = dio² + dd²    (19b)

if addition in quadrature is appropriate, and

dii = dio + dd    (19c)

if a linear combination is correct. The significance of the difference can be seen by combining Eqs. (19a), (19b), and (19c). If the linear approach is valid, then

dd/dio ≈ 0.10

whereas with addition in quadrature

dd/dio ≈ 0.46

The difference is considerable and indicates the need to resolve this question. An initial analysis has been made by Harte (1973). The result is that for centrosymmetric aberrations involving Gaussian distributions, addition in
quadrature is correct. From the viewpoint of the present application, in which coma (a noncentrosymmetric aberration) is important, we can only state that the question is unresolved, and that the correct solution lies within the limits indicated above. Further limitations of the application of computer-aided design become apparent when we consider the approach in detail. In microfabrication applications, there has been an increasing stress on chromatic effects. Kuo (1976) has indicated that some care is necessary in regions where this contribution to the total aberration becomes important. The important point here is that the axial position at which the aberration due to chromatic spread is minimized need not coincide with either the Gaussian plane or the position of the disk of least confusion due to spherical aberration. Under these conditions a straightforward addition of terms in quadrature leads to an underestimate of the total spot size. In microfabrication work, the drive toward high currents and high angular apertures implies that spherical aberration, an uncorrectable error within the framework of the practicalities of our application, plays a significant role. We have accurate estimates of the spherical aberration properties of individual lenses but are uncertain as to the total spherical aberration resulting from combinations of two or more lenses, except within the thin-lens approximation (Liebmann, 1949).
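The 0.10 versus 0.46 figures quoted above follow directly from Eqs. (19a)-(19c) and are easily checked numerically:

```python
import math

# Check the linear vs. quadrature combination, normalizing the undeflected
# spot to dio = 1 and allowing 10% deflected-spot growth.
f = 0.10
dio = 1.0
dii = (1 + f) * dio                   # Eq. (19a)
dd_linear = dii - dio                 # Eq. (19c): linear combination
dd_quad = math.sqrt(dii**2 - dio**2)  # Eq. (19b): quadrature combination
```

The quadrature rule tolerates a deflection contribution more than four times larger for the same 10% spot growth, which is why the choice of combination rule matters so much to throughput estimates.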
IX. THE DEFLECTION PROBLEM

A. Unique Features Required of a Microfabrication Deflector System
The choice of the deflection system to be used in a microfabrication system is a critical one, as it is often this component which imposes the performance limitation on the system. The problem is not a uniquely electron-optical one, as the application imposes limitations on the freedom of design that is available. These limitations arise from several areas, but a natural starting point is the limitations imposed by the wafer itself. Figure 37 outlines the problem. In Fig. 37a, we have shown the idealized situation in which a well-behaved beam is scanned over the target, follows a curved focal plane, and so moves out of focus in a predictable manner which can be corrected dynamically as the beam moves. Here the target is assumed to be perfectly flat. In reality, the wafer will, in general, be warped. The warpage can take various forms, but a common form of distortion is shown in Fig. 37b. Here the wafer is bowed, with a deflection of up to 60 µm at the wafer edge. The curvature is a slowly varying quantity over the wafer
FIG. 37. Complications introduced by out-of-plane wafer distortions and stage errors: (a) idealized and correctable situation, (b) an indication of the wafer "bowing" problem, (c) parameters used to define the placement uncertainty arising from wafer and/or stage errors.
surface. Superimposed on this macroscopic variation are small, localized variations of the order of 1 µm due to the essential surface topography of the device fabrication. A further complication is that the mechanical stage can have uncertainties in pitch and yaw. The total result is that, from run to run, there are variations in working distance which are unpredictable, random, and cannot readily be corrected. These variations in working distance lead to variations in scan size. The possibility, therefore, exists that
errors in spacing and "butting" can arise when a device layer is being imposed on a previous layer. We can see the implication in quantitative terms by means of the parameters defined in Fig. 37c. If we specify that the location error due to this cause has to be less than 0.05 µm, then dW < 0.05 × 10⁻⁴ × (W/S) cm. With S = 0.25 cm, Table II gives the allowable variation in working distance as a function of working distance. Table II clearly indicates that a long working distance is compatible with the realities of wafer distortion, but that a significant problem exists at short working distances. A long working distance is acceptable for a microfabrication system using a field emitter cathode, but is not really useful for a system exploiting a thermal emitter. One way around this dilemma is to make use of a "telecentric" system, that is to say, a system which deflects the beam sideways and concurrently straightens the beam up so that it hits the wafer at normal incidence. Under these conditions the "butting" error at high deflection angle is reduced. Usually such an approach involves the use of a double deflector system (Fig. 38a) or a deflector-focusing lens combination (Fig. 38b). Pfeiffer (1974) has argued that this latter combination can provide the electron-optical equivalent of an achromatic lens, in the sense that the aberration introduced into the deflected spot by the existence of a chromatic spread can be largely counteracted by the spherical aberration introduced by the lens. Another problem arises when the problems of speed and deflection aberration are considered.

TABLE II
ALLOWABLE VARIATION IN WORKING DISTANCE AS A FUNCTION OF WORKING DISTANCE FOR A SYSTEM WITH A 5 mm × 5 mm SCAN AREA
W (cm)    1.0   2.0   5.0   10.0   20.0
dW (µm)   0.2   0.4   1.0   2.0    4.0
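Table II follows directly from the placement criterion dW < 0.05 × 10⁻⁴ × (W/S) cm; expressed in micrometers this is dW = 0.05(W/S) µm, with half-scan S = 0.25 cm for the 5 mm × 5 mm field:

```python
# Reproduce Table II from the placement criterion above.
S = 0.25  # cm, half-scan for the 5 mm x 5 mm scan area

# Allowable working-distance variation dW (µm) for each working distance W (cm):
table = {W: 0.05 * W / S for W in (1.0, 2.0, 5.0, 10.0, 20.0)}
```

The linear scaling with W is the whole story: doubling the working distance doubles the tolerable wafer bow or stage error.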
Here the basic decision is between electrostatic and electromagnetic deflection. Electrostatic deflection is intrinsically fast: no significant current has to be supplied, and voltages of the order of ±50 V into a total capacitance of the order of 10-15 pF have to be delivered with 18-bit DAC accuracy. Unfortunately, the aberration properties of electrostatic systems are about a factor of five worse than those of electromagnetic deflectors, unless special precautions and configurations are used. On the other hand, electromagnetic systems, although adequate from the viewpoint of deflector aberrations and distortions, have intrinsic speed limitations in that a significant
FIG. 38. "Telecentric" scan systems giving beam displacement with normal landing: (a) by the use of a double deflector, (b) by the combined action of a deflector and a focusing field.
current has to be delivered at a voltage sufficient to overcome the back emf generated by Lenz's law. At the same time, inertia due to self-inductive damping and due to eddy currents has to be overcome. Attempts to realize the good aberration properties of magnetic systems and the inherent speed of electrostatic systems have led to the use of combination deflection systems. Here a "major" deflector unit, usually electromagnetic in action, is used relatively infrequently to provide the major deflection from "block" to "block" within a total electrical scan field. A "secondary" electrostatic deflector is used to give small but rapid excursions to write detail within the addressed block. Provided the necessary synchronization and the necessary accuracy of calibration can be obtained, this approach is acceptable. It is against this background of interrelating problems and design compromises that a microfabrication deflector system has to be developed. The first and major problem is the reduction of the relevant aberrations. The topic is considered in the following section within the general context of the total design problem.
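The block-addressing idea behind such combination systems can be sketched as follows. The block size and the helper name split_deflection are hypothetical, chosen only to illustrate how a target coordinate divides into a slow "major" deflection and a fast "secondary" residual.

```python
# Hypothetical sketch of combination deflection: the slow electromagnetic
# deflector addresses the nearest block centre, and the fast electrostatic
# deflector supplies the small residual excursion within the block.
BLOCK = 64.0  # µm, assumed electrostatic sub-field size (not from the text)

def split_deflection(x_um):
    """Split a target coordinate into (major, minor) with major + minor = x_um."""
    major = BLOCK * round(x_um / BLOCK)  # coarse block-to-block deflection
    minor = x_um - major                 # rapid small excursion within the block
    return major, minor

major, minor = split_deflection(137.0)
```

The electronics benefit is visible in the decomposition: the electromagnetic drive sees a small number of large, infrequent steps, while the electrostatic drive sees only excursions bounded by half a block.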
B. Deflector Design Philosophy and Statement of the Electron-Optical Problem

The factors that enter into the design of the deflector system can be listed as follows:

(1) The sensitivity, involving deflector type, configuration, and location in the system.
(2) The aberrations, involving deflector type, configuration, and location in the system.
(3) The use of dynamic correction to reduce deflection aberrations, followed by system optimization of the design to reduce the remaining aberrations.
(4) The fabrication aspects, in which the difficulties of fabrication, assembly, and stability are examined.
(5) The electronic implications in terms of speed, stability, and repeatability.
(6) The need to periodically recalibrate the deflectors against the absolute standard inherent in the laser-driven stage, and relative to each other, together with the required ability to diagnose the residual errors in a deflector system, to determine the constancy of such errors, and to establish an ability to correct such errors.
The starting point in this development is the question of the geometric distortions and aberrations that develop on a deflected beam. Table III summarizes the aberrations of the ideal spot, indicates their dependence on scan angle and on beam half-angle, and suggests whether they can be dynamically corrected or not. A reasonable philosophy to develop in relation to this problem is to attempt to dynamically correct as many as possible of the aberrations by the inclusion of the necessary hardware/software combinations, and then to optimize the system performance. This outlook is not universal. For example, Farnell et al. (1973) elected to minimize barrel distortion, a correctable distortion, and applied four dynamic corrections for field curvature, scan rotation, isotropic astigmatism, and anisotropic astigmatism. But, in general, the fullest use of dynamic correction followed by reduction of the residual aberrations is the better approach. One possible exception exists, and that is when a field emitter system is being considered. In this case, using a round beam with gaussian distribution, the data rates are high (about 200 MHz). When we recall that the dynamic corrections have to be superimposed in real time on the pattern data using a computer of limited capability, then the
TABLE III
ABERRATIONS IN MAGNETIC DEFLECTOR SYSTEMS, THEIR DEPENDENCE ON SCAN SIZE X, Y, AND BEAM HALF-ANGLE a, AND THE POSSIBILITY OF DYNAMIC CORRECTION

Aberration | Dependence on scan size and on beam half-angle | Dynamic correctability
Field curvature | X and Y components varying as X²a and Y²a, respectively | Readily corrected by use of small dynamic focus coil
Isotropic astigmatism | Square-law dependence in X and Y, linear dependence on a | Readily corrected by use of dynamic stigmator
Anisotropic astigmatism | Square-law dependence in X and Y (cross terms), linear dependence on a | Correctable by use of dynamic stigmator
Isotropic and anisotropic coma | Interactive terms with linear dependence on scan and square dependence on a | Difficult to correct
Isotropic transverse chromatic aberration | | In special cases, some interactive correction between chromatic aberration and spherical aberration is possible
Anisotropic transverse chromatic aberration | | As above
Axial chromatic aberration | | As above
Spherical aberration | Varies as a³ | As above
question of dynamic correction has to be reviewed from the electronic standpoint. As a result, it is better to analyze each configuration both with and without dynamic corrections. For completeness and later reference, we have included Tables IV and V, which list the distortions fundamental to the deflection process itself and the additional distortions that arise from misassembly and fabrication difficulties. Tables III-V give a statement of the electron-optical design problem.
TABLE IV
FUNDAMENTAL DISTORTIONS OCCURRING IN DEFLECTOR SYSTEMS

Distortion | Dependence on scan angle and on beam half-angle | Cause and possible correctability
Isotropic distortion | X³, Y³ | Readily corrected by additional terms to scan input waveforms
Barrel distortion | XY², X²Y | As for isotropic distortion
TABLE V
FABRICATION AND ASSEMBLY DISTORTIONS OCCURRING IN DEFLECTOR SYSTEMS

Distortion | Cause | Comment
Tilt distortion | Tilt of deflector assembly relative to target | Can be corrected for by suitable calibration technique
Quadrilateral, curvilinear, and sensitivity distortion | Misplacement of one set of coils relative to the other | First-order correction possible if effect is small
Rhombic distortion | Departure from perpendicularity of coil sets | First-order correction possible if effect is small
Random, localized distortion | Localized faults within deflector windings | Compounds problems associated with in-plane wafer distortions
C. The Question of Scan Strategy
There is a series of considerations with regard to deflector design that arise from software problems associated with the real-time processing of data to control the beam position and the beam blanker, and to incorporate the corrections required for the proximity effect. The problem can be stated
in this way. The amount of storage required and the rates at which data have to be delivered are such that there is cause for concern in relation to reliability and the loss of system usage due to computer downtime. Any factor that can be exploited to reduce the storage required and to reduce the data flow should be examined in depth. A scan strategy that allows considerable data compression must be a good contender for application in a fast system. From the viewpoint of deflection design, a major implication is the way in which the pattern is actually written. Figure 39 summarizes the approaches that have been exploited or discussed in depth. Figure 39a is the traditional raster scan, in which every point on the chip is sequentially scanned and exposed or not by control of the beam blanker. After completion of the raster scan, the wafer is mechanically relocated to bring the next chip under the beam. Figure 39b shows a simple variant in which each point is again sequentially exposed, by a "serpentine" raster in this case. Figure 39c illustrates an approach developed at the Bell Telephone Research Laboratory (Collier and Herriott, 1975; Henderson et al., 1975) which utilizes a combined mechanical and electronic scan system. In the figure, the mechanical stage provides a continuous movement from left to right under laser control. At the same time, an electronic scan provides a small deflection in the perpendicular direction. As a result, a narrow "strip" of each chip is written, and the same strip is then repeated on the next and sequential chips. In this way, only the data pertaining to this strip have to be contained in active memory. The next strip is written in the reverse direction and "butted" onto the first. Another approach is shown in Fig. 39d. This is the vector scan approach (Speth et al., 1975; Chang et al., 1975), which seeks to compress the data needs by recognizing that not every point in the chip has to be scanned.
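The scale of the data compression available to a block-based description can be illustrated with a toy comparison; the grid size, rectangle list, and 10-bit coordinate width below are all assumed values, not figures from the text.

```python
# Illustrative data-volume comparison: full raster exposure vs. a block
# (vector) description of the same sparse pattern.
GRID = 1024                                           # addresses per side (assumed)
rects = [(100, 100, 300, 140), (500, 200, 520, 800)]  # (x1, y1, x2, y2) blocks

raster_bits = GRID * GRID          # one expose/blank bit per address point
vector_bits = len(rects) * 4 * 10  # four 10-bit coordinates per block
compression = raster_bits / vector_bits
```

For sparse patterns the ratio is enormous; for dense, irregular patterns the block count grows and the advantage shrinks, which is why the choice of strategy depends on the pattern statistics.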
By breaking up the pattern into a series of blocks that can be defined by four coordinates (the a, b, c, and d of Fig. 39d), the data requirements are reduced. Each block can be addressed and "filled in" by the beam in turn. Figure 39e illustrates the use of patching or "butting" to complete a large chip. In the example shown, the chip is broken up into 16 subfields; each subfield is addressed and written in either a raster or a vector scan mode. The patterns have to be matched or patched together at the edges of the subfields. Figure 39f shows that there is a relationship between the size of the subfields that can efficiently be used and the microscopic details of the pattern being exposed. The illustration is a gross oversimplification, but it shows that, both from the viewpoint of data compression and of "butting" accuracy, it makes sense to choose the subfield so that the boundary passes between structures rather than through a structure. It is not always possible to achieve this type of subdivision, because it implies restrictions on the location and size of the microscopic device elements. This limitation may
FIG. 39. Suggested scan strategies and a factor affecting performance: (a) the traditional raster scan, (b) a serpentine raster, (c) a combined laser-controlled mechanical scan and a limited electronic scan, (d) the vector scan approach, (e) the completion of a chip exposure by the "butting" of subchip fields, (f) a schematic presentation of the way in which subfield size should be selected to minimize butting problems.
well not be acceptable. One final factor has to be considered in relation to the drive electronics. The analog electronics driving the deflector units has to be of high stability as well as fast in operation. Here we have to consider drifts and variations due to thermal effects. If the amplifiers have to work in a pattern that requires a variety of deflection excursions, then the system is more prone to thermal effects than when the pattern is written by means of excursions of approximately equal magnitude.

D. Analytical Treatments of the Deflector Problem
Frequent reference is made in the literature to two papers by Haantjes and Lubben (1957, 1959), which formally outline the magnetic deflection problem. These papers list the first- and third-order properties of magnetic deflector systems and give expressions for the relevant aberration and sensitivity constants. The treatment is limited in only one respect, in that it deals with deflection as a separate problem: no attempt was made to include combination fields involving both deflection and focusing action. This work has been extended by Kelly (1972) to include electrostatic deflection. Again, the formalism of the problem is completely established without reference to the superposition of focusing fields. Recently, Munro (1974, 1975) has extended the available analytical basis to include combination fields, in particular the case where magnetic deflection and magnetic lens action occur together. On the application side, other recent work includes an analysis of pre- and postlens magnetic systems by Owen (1975). An examination by Thomson (1975) of a double deflector system used above the final lens analyzed not only the aberration behavior but also the tolerance to mechanical misalignment. Munro (1975) has made a more complete comparison between the six possible configurations listed here:
(1) single postlens deflection,
(2) prelens double deflection,
(3) prelens double deflection, with one set of coils rotated,
(4) double deflection with the second coil located within the lens field,
(5) in-lens deflection,
(6) in-lens deflection with a predeflector coil.

Amboss (1975) has considered the configurational dependence of the aberration performance of a postlens deflector system. Crewe and Parker (1976) have shown how it is possible, in principle, to eliminate all third-order aberrations from a scanning electron beam system by the addition of further electron-optical elements. Owen (1977) has examined a telecentric system in which a magnetic deflector is followed by a large-bore electrostatic lens.
E. Recent Experimental Data
Here we shall stress approaches that give a final deflected spot that lands normally or near normally. The value of such "telecentric" systems was outlined in Section IX,A. In Sections VI,B–D, we outlined the variable-aperture approach to microfabrication as far as the production of the "tailored" spot of variable size. No indication was given of the method used to scan the beam. The missing feature is the final lens/deflector combination that completes the column shown in Fig. 32. It is an application of the ideas outlined by Pfeiffer (1975b) and indicated schematically in Fig. 38b; i.e., a displaced beam is obtained by the combined actions of a deflector field and a focusing field. It is also a practical application of the "finite difference" approach to the calculation of complex fields (see Section VIII). The resultant lens is a wide-bore solenoid lens in which good symmetry and uniformity of field are obtained by using semi-insulating ferrite spacers to give a distributed array of lens gaps down the length of the solenoid. These ferrite spacers transfer the magnetic potential without leading to complications from eddy currents. The deflector used is a two-stage system consisting of an electrostatic predeflector and a main electromagnetic unit. Also included are a dynamic focus coil to correct for field curvature and a dynamic stigmator to correct for deflection-induced astigmatism. Distortion errors due to deflection can be corrected by additional input to the electrostatic predeflector. While no details of the actual landing angle have been published, this system has produced detail as small as 0.6 µm × 0.6 µm over a 0.5 cm × 0.5 cm scan without loss of edge resolution at the edges of the scan field and with placement errors less than the quoted experimental error of ±0.1 µm. Further details are given in Section VI,D. The major uncertainties in the published data are the actual speed of operation and the system behavior with sensitive resists.
An alternative approach to a combined focusing/deflector system has been proposed by Ohiwa et al. (1971). A main magnetic lens and an array of eight subsidiary coils are used to move the magnetic center of the total combination and to focus the beam. Over and above the questions of speed, stability, etc., the major uncertainty arises from the number of components involved. These components have to be located relative to each other with sufficient accuracy to avoid two possible complications: the development of mechanical aberrations which outweigh the advantages of the method, and the need to provide extensive dynamic correction to overcome the distortions associated with mechanical misplacement. This approach by Ohiwa et al. has been tested experimentally in a situation only of indirect relevance to microfabrication. These workers have
ELECTRON PHYSICS IN DEVICE MICROFABRICATION. I
363
established a basic verification of their method by applying it to the improvement of CRT performance, i.e., to a situation in which the scale of scan, spot size, and placement error is much larger than that appropriate to microfabrication work. As a result, its usefulness to microfabrication work cannot be said to have been established. A simpler approach, analyzed by Owen (1977), using a single wide-bore electrostatic lens and a magnetic deflector, has not been fully tested experimentally. Owen stresses that the use of an electrostatic lens avoids the complications inherent in the rotational nature of magnetic focusing. The computer design predicts that for an off-axis distance of 5 mm (i.e., a 0.7 cm × 0.7 cm scan area), with a chromatic spread of 1 part in 10⁴ and a beam half-angle of 10⁻³ rad, the spot growth at the field edge due to geometric aberration is 0.1 µm and that due to chromatic aberration is 0.2 µm. The distortion arising during the scan can be corrected either by precorrection of the scan data or by varying the lens excitation during the scan. Experimentally, it has been confirmed that the wide-bore lens can operate safely and reliably at 10 kV and that the first-order focal properties agree with theoretical prediction to within 6%. No aberration or distortion data are currently available.

F. Fabrication Aspects

We can restress the accuracy required of a scanning electron beam microfabrication system by noting that we wish to have a total placement error due to all causes of ±0.1 µm within a scan field of 5 × 5 mm. In an ideal situation, we would like the errors due to mechanical fabrication to be significantly less than the 1 part in 5 × 10⁴ that this specification implies. For an electrostatic structure with bores of the order of 2.5 cm (1 in.) and lengths of the same order, this specification means that structures must be made and assembled to within 5 µm (0.0002 in.). This tolerance is attainable but expensive.
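The tolerance arithmetic above is easy to check. A small sketch follows; the helper function is ours, for illustration only:

```python
def parts_in(tolerance, dimension):
    """Express a tolerance as '1 part in N' of a dimension (same units)."""
    return dimension / tolerance

# Beam-placement spec: 0.1 um total error within a 5 mm scan field.
placement = parts_in(0.1e-6, 5e-3)    # ~5 x 10^4
# Quoted assembly tolerance for a 2.5 cm electrostatic structure: 5 um.
assembly = parts_in(5e-6, 2.5e-2)     # ~5 x 10^3
```

Note that the quoted 5 µm assembly figure is an order looser than the beam-placement fraction itself; the calibration procedure of Section H, which is intended to reduce fabrication distortions by an order, is what bridges the gap.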
In the case of magnetic systems, we cannot wind coils to this level of accuracy. This comment is particularly true of the traditional saddle coil configuration, which involves curved structures fabricated on a three-dimensional former. To overcome this difficulty, the use of planar toroidal structures has been successfully exploited (Pfeiffer, 1974; Munro, 1975). Even on this basis, it is unlikely that the structure can be made to a greater precision than 1 part in 10³. In this situation, we can arrange to provide a first-order correction for the linear distortions. For the distortions that are localized and nonlinear, there is no solution but preselection of components and special fabrication methods. Material considerations are of high priority in this area. Ferromagnetic materials are excluded because of the inherent hysteresis introduced into the deflector by
their presence. Electrical conductors have to be excluded in simple form because of the additional inertia that results from the resultant eddy current losses. At the same time, we need materials that are good thermal conductors to distribute the applied wattage uniformly and repeatably. Wardly (1973b, 1974, 1975c) has discussed the question of eddy current losses and compensation in depth. Pfeiffer (1972) has shown how the eddy current problem can be greatly reduced by the use of semi-insulating ferrites to provide the necessary transfer of magnetic potential without the electrical conductivity inherent in the use of soft iron.

G. Electronic Implications
It is instructive here to adapt an analysis due to Jones and Owen (1977), who outlined the nature of the total deflection problem in terms of a figure similar to that given in Fig. 40.

FIG. 40. A statement of the magnetic deflection problem, indicating schematically the various limitations (the current drive limit I ≤ I_max, the sensitivity limit NI ≥ S₀/(dS/dI), and the settling-time and wattage limits). After Jones and Owen (1977).

Here the abscissa represents the current
through the coils and the ordinate represents the number of turns. To a good approximation, we can represent the coils as a critically damped parallel resonant circuit with a decay constant τ given by

τ ≈ (KN²c)^(1/2)    (20)

where c is the total capacity of the coils, the leads, and the associated vacuum feedthroughs, and the self-inductance of the coil L has been written L = KN². If we specify that we want τ to be equal to or less than τ₀ to give a required settling time, then Eq. (20) gives

N ≤ τ₀/(Kc)^(1/2)    (21)

The second limitation arises from the required deflection scan S and the
sensitivity dS/dI available in the system. If S₀ is the required maximum scan, then NI ≥ S₀/(dS/dI). The drive amplifier will only be able to deliver a given current I_max and perform to the remaining elements of the specification, i.e., I ≤ I_max. The wattage limitation on the coil can be derived as follows. If W₀ is the maximum allowed wattage in the coil, then W₀ ≥ I²R, where the resistance per turn is R = lρ/u, with ρ the resistivity of the wire, l the length per turn, and u the cross section of the wire used; that is, NI² ≤ W₀u/ρl. A final criterion exists in that the voltage applied has to be greater than the back emf generated. This argument, when developed, puts an upper limit on N²I, which is represented by a line in the upper part of the right-hand side of Fig. 40. The actual figures used can vary somewhat subjectively, but in general the allowed current is approximately 5 A or less, the allowed wattage is, say, 3–5 W, and the number of turns, say, 10–30. In the case of electrostatic deflection, the problem is simplified in that double-ended voltages of the order of ±50 V have to be supplied to the main deflectors, with a total capacity of the order of 10–15 pF, with 16-bit DAC accuracy, and reduced voltages of the order of ±5 V have to be delivered to the predeflector, again with a total capacity of the order of 10 pF, with 10- to 12-bit DAC accuracy. The calibration of both deflectors has to conform with the frequently repeated specification of a total placement error of less than ±0.1 µm.

H. Calibration and Diagnostics
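Taken together, the settling-time, sensitivity, drive-current, and wattage limits above define an allowed region in the (N, I) plane of Fig. 40. A minimal sketch of such a feasibility check follows; all numerical parameter values (K, c, τ₀, S₀, dS/dI, and so on) are illustrative placeholders, not figures from the text:

```python
import math

def coil_design_ok(N, I, *, K=1e-7, c=50e-12, tau0=1e-7,
                   S0=0.005, dS_dI=1e-3, I_max=5.0,
                   W0=4.0, rho=1.7e-8, l_turn=0.05, u=1e-7):
    """Check a candidate deflection-coil operating point (N turns, I amps)
    against the four limits sketched in Fig. 40.  All default values are
    illustrative assumptions, not data from the text."""
    settling = N <= tau0 / math.sqrt(K * c)        # Eq. (21): tau = N*sqrt(K*c) <= tau0
    sensitivity = N * I >= S0 / dS_dI              # NI must deliver the full scan S0
    drive = I <= I_max                             # amplifier current limit
    wattage = N * I**2 <= W0 * u / (rho * l_turn)  # coil dissipation limit, NI^2 <= W0*u/(rho*l)
    return settling and sensitivity and drive and wattage
```

With these placeholder values, a point such as N = 20 turns at I = 1 A lies inside the window, while too many turns violates the settling-time limit and too few ampere-turns fails to produce the required scan.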
A calibration and diagnostic capability is an inherent part of any microfabrication deflector system. The calibration procedure should reduce, by an order, any distortions due to fabrication of the types listed in Table V. It should also take account of any rotational error between the directions of stage movement and the axes of the pattern to be written, which is determined by the locations and orientation of the fiducial marks. It is probable that a two-stage calibration procedure is best fitted to meet microfabrication needs: A detailed and extensive calibration of the total deflection system including electronics against the laser system on, say, a monthly basis to establish the constancy or lack of constancy of the various components and a quicker calibration combined with error correction between the actual writing on chips. If the system moves out of specification, or if during the initial system “run up,” it becomes necessary to establish where the deflection fault lies, then the procedure outlined by Jones and Owen (1977) can be used with advantage. The basic approach is to expose a matrix of lattice points in a resist layer across an area equal to the required scan. Subsequently, the actual positions of the matrix points are carefully measured and compared with the theoretical or ideal positions of the matrix points (see Fig. 41). Then a computer plot of the error vector at each point
can be made which illustrates both the magnitude and direction of the vector. This plot enables a series of computer programs to analyze the data on the assumption that the errors are due to tilt, misplacement, rotation, nonorthogonality, etc. The output from such analysis is a clear indication of the major mechanical faults and an idea of the magnitude of the residual random errors. Thus, the faults can be corrected and a judgment made whether the deflection unit is acceptable or not.
FIG.41. Schematic but typical “map” of the array of error vectors obtained in analysis of deflector performance.
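The error-vector analysis just described can be sketched as a least-squares fit of a linear distortion model (translation plus a 2 × 2 matrix carrying scale, rotation, and nonorthogonality) to the measured lattice-point errors. The model and all names here are illustrative, not the actual procedure of Jones and Owen (1977):

```python
# Least-squares fit of a linear distortion model to measured error
# vectors: e = t + A.p, where t is a translation and the 2x2 matrix A
# carries scale, rotation, and nonorthogonality.  Plain normal
# equations, stdlib only.

def fit_distortion(points, errors):
    """points, errors: lists of (x, y) pairs.  Returns (t, A) minimising
    sum |e_i - t - A.p_i|^2 by least squares."""
    def lstsq3(ys):
        # Fit one error component as e = c0 + c1*x + c2*y.
        rows = [(1.0, x, y) for (x, y) in points]
        M = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        b = [sum(r[i] * v for r, v in zip(rows, ys)) for i in range(3)]
        # Gaussian elimination with partial pivoting on the 3x3 system.
        for k in range(3):
            p = max(range(k, 3), key=lambda r: abs(M[r][k]))
            M[k], M[p] = M[p], M[k]
            b[k], b[p] = b[p], b[k]
            for r in range(k + 1, 3):
                f = M[r][k] / M[k][k]
                for j in range(k, 3):
                    M[r][j] -= f * M[k][j]
                b[r] -= f * b[k]
        c = [0.0, 0.0, 0.0]
        for k in (2, 1, 0):
            c[k] = (b[k] - sum(M[k][j] * c[j] for j in range(k + 1, 3))) / M[k][k]
        return c

    ex = lstsq3([e[0] for e in errors])
    ey = lstsq3([e[1] for e in errors])
    return (ex[0], ey[0]), [[ex[1], ex[2]], [ey[1], ey[2]]]
```

The antisymmetric part of A, (A[1][0] − A[0][1])/2, estimates the rotation error; the symmetric off-diagonal part estimates nonorthogonality; the residuals after subtracting the fit measure the random placement error that no calibration can remove.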
I. Future Possibilities
Two possibilities should be outlined in passing, as the potential of each has yet to be fully exploited. Both involve advanced electron optics of a type not considered here: the multiaperture lens and the distributed lens. Heynick and his co-workers (1975) at SRI have been early explorers of the aperture lens in device microfabrication. The image of a single preformed aperture plate or mask is projected in parallel onto a large number of sites by an array of aperture lenses. By the incorporation of a suitable deflector system, the pattern can be repeated in small groups centered on each aperture lens. Different patterns can be made by the use of different masks. A pleasing feature of this work has been the quality and variety of work performed at the research and development level with very simple, inexpensive equipment. The possible incorporation of a laser-driven stage could provide a system of considerable versatility. The possibility merits further consideration.
If we ask ourselves how large a scan a telecentric deflection system can give, we should recall a system used in advanced electron beam memory work (Wang et al., 1973). A conventional gun, lens, and predeflector deliver an apertured beam to a wide-bore distributed lens. This lens is terminated by a multiaperture plate and can deliver a normal beam through each aperture in turn under the control of the predeflector. It has been shown that the beam can be held parallel to within 50 µrad over a 7.5-cm diameter circle. Subsequent electron-optical elements below the distributed lens would differ in the case of microfabrication from the "fly's eye" lens (Newberry, 1966) used in memory work. One concept would be to move a small, lightweight lens/deflector/detector combination by a laser-controlled stage under each aperture in turn. The wafer could be held in a relatively inexpensive handling mechanism below this unit.

X. HIGH-CURRENT EFFECTS
A. The Role of Coulombic Interactions
Work with cathode ray tubes (Thomson and Headrick, 1949) and Pierce guns (Pierce, 1949) has led to careful attention being paid to space charge problems. More recently, the available analyses have been generalized to treat more complex situations (Schwartz, 1957; Hollway, 1952; Bosi, 1975). The bulk of this work has regarded the coulombic scattering essentially as an elastic collision between two particles, as a result of which the trajectories get displaced in angular direction and in radial extent. There is no interchange of energy between the scattered electrons, and the electron beam is modeled to retain its essentially monochromatic nature. Recently, more attention has been paid to energy changes that can occur along the optic axis as a result of these collisions. Such changes lead to an increased energy spread in the beam, with resultant additional complication in our ability to focus and to deflect the beam. We can gain qualitative insight into the effects these interactions can have on column performance by considering Fig. 42, which describes the interaction in terms of parameters first used by Loeffler (1969). One electron, the reference electron, is considered to travel along the optic axis of the column. The second electron, the interacting electron, travels along a general trajectory. The closeness of approach is measured by an impact parameter b, which is defined in the figure. The scattering leads to three interactions. One interaction is parallel to the optic axis and leads to energy variations. Scattering along the radial component of b leads to changes in trajectory angle, while the component perpendicular to both gives a trajectory displacement (Loeffler, 1969). Loeffler (1969) and Loeffler and Hudgin
FIG. 42. Definition of the spatial relationship between the position of the reference electron moving along the z axis and an electron following a general trajectory.
(1971) have analyzed the case of a paraxial, monochromatic, rotationally symmetric crossover in a field-free region. Placing heavy reliance on Monte Carlo methods and suitable averaging, these workers have calculated the expected mean value for each of the three components outlined above:
(a) The interaction energy change:

ΔE_i ≈ (e²/4πε₀) × F₁(λr₀)/a₀r₀    (22a)

(b) The angular displacement:

Δα ≈ (e/4πε₀) × F₂(λr₀)/2V_b a₀r₀    (22b)

(c) The radial displacement:

Δr ≈ (e/4πε₀) × F₃(λr₀)/2V_b a₀²    (22c)

In these equations, r₀ is the radius of the crossover, a₀ is the beam half-angle subtended at the crossover, V_b is the beam voltage, and λ is given by

λ = (I_b/V_b^(3/2)) × (2 × 10⁷ e³/m)^(-1/2)    (23)
where I_b is the beam current. The functions F₁, F₂, and F₃ have the forms shown in Fig. 43 when plotted against λr₀. At high current levels, all three functions tend to a (I_b r₀/V_b^(1/2))^(1/2) dependence. At lower currents, the dependence is steeper and more variable: F₁, F₂, and F₃ vary with λr₀ (i.e., I_b r₀/V_b^(1/2)) to the approximate first, second, and third power, respectively.

FIG. 43. A schematic representation of the dependence of F₁, F₂, and F₃ on λr₀. See text.

In the high-current regime, we can use the relationship between beam current and brightness to eliminate a₀ from Eq. (22a) and to obtain

ΔE_i ∝ (Br₀)^(1/2)/V_b^(1/4)    (24)
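Because the constant of proportionality in Eq. (24) depends on the system, the safest use of the relation is as a scaling law between two operating points. A minimal sketch, ours rather than any published code:

```python
def boersch_energy_ratio(B1, r1, V1, B2, r2, V2):
    """Ratio dE2/dE1 of interaction energy spreads for two operating
    points, using the high-current scaling of Eq. (24):
    dE proportional to (B * r0)**0.5 / Vb**0.25.  Units cancel as long
    as they are consistent between the two points."""
    return ((B2 * r2) / (B1 * r1)) ** 0.5 * (V1 / V2) ** 0.25
```

For example, doubling the brightness at fixed r₀ and V_b raises the spread by √2, while raising the beam voltage by a factor of 16 at fixed B and r₀ halves it.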
This result has been confirmed experimentally (see Pfeiffer, 1971). These coulombic interaction aberrations are "diffractionlike" in the sense that, unlike all other aberrations except diffraction, they produce a loss of focusing ability which increases as the beam angle is decreased. Therefore, in qualitative terms, coulombic effects introduce contributions with the nature shown as dashed lines in Fig. 33 in Section VIII,B. As the coulombic
interactions increase, the range of angular aperture that is available decreases and the smallest spot size achievable may be significantly increased. In addition, when we include the deflector problem, further restriction is placed on the design. To see just how important these effects are, we have to establish, as far as possible, quantitative estimates from both theory and experiment and to examine the agreement. In practice, we shall see that considerable uncertainty exists in this work area.
B. Quantitative Estimates of Coulombic Effects

We have stressed the analysis originated by Loeffler because it is the only one that isolates all three coulombic induced aberrations, and it has already been discussed in relation to the design of microfabrication systems. Estimates obtained by earlier workers (Thomson and Headrick, 1949) will be examined later. Here we seek to compare experimental data with the estimates made by Loeffler (1969). In Section V,F,1, we outlined a careful study by Pfeiffer (1971) in which experimental estimates were made of gun brightness, the energy spread from the gun, and the energy spread due to the total system. In these tests, the total system was a four-lens column with additional lenses before the target plane. Table VI gives a synopsis of the results. It can be seen that for sources varying in effective diameter by a factor of 10 and with total emission currents varying by a factor of 100, the relationship predicted in Section X,A, ΔE_gun ∝ (B_g r₀)^(1/2), is obtained to within ±20%. Bearing in mind the complexity of the measurements, this agreement can be deemed satisfactory. It should be stressed, perhaps, that the constant of proportionality between ΔE_gun and (B_g r₀)^(1/2) is 0.115 ± 0.014; Pfeiffer (1971) quotes a column of values which is 10 times too high. Also included in Table VI are theoretical estimates derived from Loeffler (1969). In comparing these data, we come up against several uncertainties. In the first place, it can be argued that it is unjustified to seek comparison between Loeffler's theory and data derived from gun studies, because one basic premise used in the theoretical development was that the analyzed crossover is in a field-free region. Such a premise cannot be applied, in general, to electron guns. Second, there are the varying properties of the guns studied.
TABLE VI
COMPARISON BETWEEN EXPERIMENTAL DATA OBTAINED ON ELECTRON GUNS AND THE PREDICTIONS MADE FOR A FIELD-FREE CROSSOVER*

Line   Tungsten hairpin   Conventional LaB₆ cathode with pointed tip   Aperture-limited LaB₆ gun
 1         300                 25                                          2.7
 2          60                  9                                          7
 3           3.0                0.95                                       0.65
 4           0.14               0.116                                      0.09
 5           3.5                2.2                                        1.6
 6           4.5                2.4                                        1.85
 7           0.50               0.32                                       0.32
 8           2.95               0.89                                       Not applicable
 9           1.00               0.65                                       0.65
10           2.83               0.69                                       Not applicable
11           5.10               1.30                                       Not applicable
12           1.80               1.88                                       –

* Gun brightness held constant throughout at 1.5 × 10⁵ A/cm²·sr.

Of the three guns studied, the tungsten hairpin and the conventional LaB₆ guns have conventional crossovers in front of the cathode under the bias conditions appropriate here, whereas the aperture-limited LaB₆ gun has a virtual source located behind the cathode itself. In the first case, the observed energy spread is the square root of the quadrature sum of the thermal spread from the cathode itself and that arising from the interactions at the crossover. In the second situation, the measurement is most simply interpreted in terms of the chromatic spread from the cathode itself. In seeking to understand this situation without more complete data, the present reviewer is tempted to analyze these data further with the object of determining the discrepancy between theory and experiment, albeit under conditions of some uncertainty. Provided the results are not quoted out of context, we can argue as follows. We can obtain a lower limit to the chromatic spread from the cathode by assuming the theoretical value ≈ 2kT_c, which is appropriate to low current densities. These values are given in Table VI, line 7. Using these estimates, we obtain the upper limits to the chromatic spread resulting from the crossover given in line 8 of Table VI. There is an immediate discrepancy of a factor of two between the theoretical estimate of the chromatic spread predicted for the third gun and that measured experimentally. Recent studies have shown, in general, that the chromatic spread from cathodes is somewhat greater than 2kT_c at all but the lowest current densities. If we accept the experimental estimate as the more appropriate and
apply the same factor to the other two guns, we obtain the alternative estimates of the thermal chromatic spreads and the crossover chromatic spreads given in lines 10 and 11 of Table VI, respectively. The immediate conclusions are that uncertainties in the magnitude of the thermal spread at the cathode are of secondary importance and that the theory predicts values of ΔE_crossover which are larger than those observed experimentally, but are larger by less than a factor of two. Bearing in mind the difficulty of making the measurements and the complexity of the calculation, this degree of agreement should be given due credit. Against this background, Pfeiffer's comment (1971) that the theory is not in quantitative agreement with the experimental data should be taken as a measure of exacting standards rather than as a criticism of the theory. In view of this degree of agreement and of the successful prediction that ΔE_gun ∝ (B_g r₀)^(1/2), it seems reasonable to state that the theory has considerable application to interaction problems in high-field regions but that, to give exact predictions, some correction is needed. Whether this correction involves reanalysis to include the presence of a high field or whether it is fundamental to the theoretical approach is problematical. However, further insight can be gained by examination of the data obtained by Wolfe (1975) (see Section VII,C). In this case, the departure from the assumptions of the model is minimal, in that any interaction with a high-field region is concerned, in the worst case, only with the "tail" of the field of a magnetic lens. It should be recalled that Wolfe obtained a current of 5 × 10⁻⁷ A at 5 kV into a spot of 350 or 950 Å, depending on whether a built-up field emitter or a zirconated emitter was used. In both cases, the half-angle subtended at the crossover is 4.7 × 10⁻³ rad. In this case, the ambiguity in interpretation is significantly reduced, and we can derive the estimates shown in Table VII.
The theory predicts that under the experimental conditions, there will be no significant trajectory distortion or angular displacement of the trajectories, while the predicted energy spread is of the order of 0.5 eV, which is consistent with an absence of chromatic defocusing provided the lens used had a chromatic aberration constant of ≈ 20 cm or less. Such a value is not unreasonable for the focal lengths employed. We can sum up the position by saying that we have experimental observations that 5 × 10⁻⁷ A can be delivered into a 1000-Å spot with a half-angle of 4.7 × 10⁻³ rad. No measurements of the energy spread were given. There is no inconsistency between these observations and the predictions made by Loeffler's analysis, in that the theory predicts no coulombic effects under these conditions except for an acceptable prediction of the energy spread. But the theory cannot be said to have been established on a quantitative basis as a result of these data, because the measurements are too limited to make a complete comparison.
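One small internal-consistency check can be made on the Table VII entries: both guns share the same I_b and V_b, hence the same λ, so the tabulated λr₀ values should scale as the measured spot radii. A sketch, with the values transcribed from the table:

```python
# Consistency check on Table VII: lambda*r0 should scale as the
# measured spot radius r0, because both guns share I_b and V_b and
# therefore the same lambda.  Values transcribed from the table.
spot_diameter_A = {"built-up W": 350.0, "zirconated W": 950.0}
lambda_r0 = {"built-up W": 1.3e-3, "zirconated W": 3.5e-3}

ratio_r0 = spot_diameter_A["zirconated W"] / spot_diameter_A["built-up W"]
ratio_lr = lambda_r0["zirconated W"] / lambda_r0["built-up W"]
assert abs(ratio_r0 - ratio_lr) / ratio_r0 < 0.02  # agreement to ~1%
```

The two ratios (about 2.71 and 2.69) agree to roughly 1%, which supports the transcription of the table.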
TABLE VII
COMPARISON BETWEEN EXPERIMENTAL DATA AT A FIELD-FREE CROSSOVER AND THEORETICAL PREDICTION

                      Built-up (100) W    Zirconated (100) W
I_b (µA)                   0.5                 0.5
V_b (kV)                   5.0                 5.0
2r₀ (expt) (Å)             350                 950
a₀ (rad)                   4.7 × 10⁻³          4.7 × 10⁻³
λr₀                        1.3 × 10⁻³          3.5 × 10⁻³
ΔE (eV)                    0.18                0.53
Δr (Å)                     <1                  <1
Δα (rad)                   <10⁻⁶               <10⁻⁶
One apparent inconsistency does exist in the literature, and that is a theoretical prediction published by Pfeiffer (1972) that it is theoretically unlikely that we can deliver a current of 1 µA into a 1-µm spot with a beam half-angle of 10⁻² rad. The prediction takes the form of a curve in a figure similar to Fig. 33. This figure has been used as a basis for debate in relation to the role of coulombic interaction in fast microfabrication systems, and its validity is a matter of some importance. It is the present reviewer's opinion that the figure is in error, that it does not agree with Loeffler's predictions (on which the curve is based), and, indirectly, that it is contrary to the results reported by Wolfe. In more detail, the disputed curve relates to a 20-kV beam, and the major source of spot growth at high currents is determined to be the trajectory displacement given in Eq. (22c). If we ask how the trajectory displacement increases as we seek to force increasing current into a 1-µm spot while keeping the beam half-angle constant at 10⁻² rad, we obtain the approximate values given in Table VIII. According to these data, the trajectory displacement contribution becomes significant at values of I_b greater than 20 µA. This conclusion is borne out by estimates derived from the theory of Thomson and Headrick (1949). The suggestion is made that the trajectory displacement contribution has been overestimated in Pfeiffer (1972), the error arising from a possible misinterpretation of the constant relating 1/λ to V_b^(3/2). This is a work area where further study is urgently needed. Irrespective of the detailed comparison between theory and experiment, the Loeffler model provides clear indication of the way in which a microfabrication system should be designed to reduce the role of coulombic interactions (see Section X,D). At this point, it is convenient to quickly survey some of the other reported data and opinions pertaining to the "Boersch effect." This is the name that has been given to these high-current effects, particularly those observed in electron guns.

TABLE VIII
PREDICTIONS OF TRAJECTORY DISPLACEMENT ACCORDING TO LOEFFLER'S MODEL*

I_b (µA)   λ (cm⁻¹)       λr₀           2Δr (approximate)
 100       7.44 × 10⁴     3.72          ≈0.6 µm
  50       3.72 × 10⁴     1.86          ≈0.36 µm
  20       1.48 × 10⁴     0.744         ≈720 Å
  10       7.44 × 10³     0.372         ≈220 Å
   5       3.72 × 10³     0.186         –
   2       1.48 × 10³     9.3 × 10⁻²    ≈22 Å
   1       7.44 × 10²     3.72 × 10⁻²   ≈5 Å

* V_b = 2 × 10⁴ V, a₀ = 10⁻² rad.

C. Additional Data Pertaining to Coulombic Interactions
The discussion of the previous section centered mainly on the behavior at a field-free crossover. However, coulombic interactions occur at high current levels during the process of electron emission itself (Boersch, 1949, 1954), in the gun itself (Pfeiffer, 1971), and in unfocused beams traversing a drift space (Ulmer and Zimmermann, 1964). There is substantial uncertainty as to the magnitude of the effects and as to the point at which they become significant. The initial work by Boersch (1949) at a beam energy of 10 kV was later substantiated by Hartwig and Ulmer (1963) using electrons of 100 eV. In addition, Dietrich (1958), working at 35 kV, and Simpson and Kuyatt (1967), at 10 eV, observed a significant increase in energy spread at high currents. After further work relating to the velocity analyzer, Beck and Maloney (1967) made measurements at 20 eV. These workers measured the energy spread as a function of emitted current and derived indirect estimates of the cathode temperature assuming an experimental dependence of the form I_s ∝ I₀ exp(−E/kT_c). At the same time, the temperature was measured directly by optical pyrometry. It was found that the two values agreed closely up to operating levels well above those examined by earlier workers, who had reported up to a tenfold increase in the indirect estimate of the cathode temperature. Ichinokawa (1969), using a beam energy of 50 kV, examined the behavior of three types of cathodes: the traditional tungsten hairpin, a pointed tungsten emitter operating in a Schottky mode, and a similar pointed tip, but oxide coated. The range of brightness studied was from 10⁴ to 5 × 10⁵ A/cm²·sr. There was a monotonic increase in energy spread from 0.5 eV to 2–6 eV (depending on cathode type) as the brightness increased. These values are in agreement with the values reported by Pfeiffer (1971). Ichinokawa examined the energy spread as a function of beam half-angle and concluded that considerable care has to be exercised in the interpretation of data obtained with beams with relatively large beam half-angles, particularly in the case of a retarding field analyzer. In summary, the basic argument is that those electrons of mean total energy eV_b which are inclined at an angle α to the optic axis have a radial energy which imposes a limitation on the sensitivity of the analyzer. Quoting the work of T. Ohno and R. Abe (private communication), Ichinokawa (1969) estimates that the existence of beam half-angles as small as 5 × 10⁻³ rad can contribute approximately 1 eV of energy broadening on a 30-kV beam. One other aspect should be noted in passing. If the detailed dependence of the growth of the energy distribution as a function of I_b r₀/V_b^(1/2) is examined, it is found that different workers report different dependences. When it is recalled that the data are obtained over a limited range of I_b r₀/V_b^(1/2), a natural explanation follows from Loeffler's curves of F₁, F₂, and F₃. See Fig. 43, which illustrates that F_n is not a uniquely determined function of I_b r₀/V_b^(1/2) but varies as (I_b r₀/V_b^(1/2))^a, with a itself dependent on I_b r₀/V_b^(1/2).
D. Application to Microfabrication System Design
In practice the implication is clear. In the absence of a theory that has been quantitatively tested over a period of time, it behooves the designer to take "common sense" precautions to minimize these effects, particularly in relation to chromatic spread:

(1) Ground all unwanted current as quickly as possible after emission from the gun.
(2) Minimize the total column length and reduce the number of crossovers as far as is compatible with the required versatility.
(3) If the column length has to be increased to accommodate necessary components, use a parallel beam configuration whenever possible.
(4) Include variable magnification, variable aperturing, and possibly dynamic correction to allow for empirical optimization.
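These precautions center on chromatic spread because, to first order, an energy spread dE produces a chromatic aberration disc of diameter d_c ~ Cc * alpha * (dE/E0) at the target, where Cc is the chromatic aberration coefficient of the final lens. This is the standard electron-optics estimate rather than a result derived in this section, and the parameter values below are illustrative only:

```python
def chromatic_disc_nm(cc_mm, half_angle_mrad, spread_ev, beam_kv):
    """First-order chromatic-aberration disc diameter in nm:
    d_c = Cc * alpha * (dE / E0), with convenient mixed units."""
    cc_nm = cc_mm * 1e6                 # mm -> nm
    alpha = half_angle_mrad * 1e-3      # mrad -> rad
    return cc_nm * alpha * (spread_ev / (beam_kv * 1e3))

# Illustrative column: Cc = 20 mm, 5 mrad half-angle, 20-kV beam.
# Doubling the energy spread from 1 eV to 2 eV doubles the disc.
print(chromatic_disc_nm(20.0, 5.0, 1.0, 20.0))  # 5.0 nm
print(chromatic_disc_nm(20.0, 5.0, 2.0, 20.0))  # 10.0 nm
```

Because d_c scales linearly with the energy spread, any coulombic growth of the energy distribution in the column translates directly into lost edge resolution, which is why precautions (1)-(3) concentrate on removing unwanted current and minimizing path length.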
ACKNOWLEDGMENTS

The author would like to thank colleagues for advice and help: Per Gloersen, John Kelly, and Bernie Piwczyk, who clarified an earlier text; John Buntel and Walter Worth, who drew the figures; and Gail Butler, who typed the text.
P. R. THORNTON
REFERENCES

Ahmed, H., and Munro, E. (1973). J. Vac. Sci. Technol. 10, 972-974.
Ahmed, H., and Nixon, W. C. (1973). Scanning Electron Microsc. 6, 217.
Ahmed, H., Blair, W., and Lane, R. (1972). Rev. Sci. Instrum. 43, 1048.
Amboss, K. (1975). J. Vac. Sci. Technol. 10, 1152-1155.
Andersen, W. H. J., and Mol, A. (1968). Proc. Eur. Reg. Conf. Electron Microsc., 4th, 1968, p. 339.
Austin, S., Smith, H. I., and Flanders, D. C. (1978). J. Vac. Sci. Technol., p. 984.
Banninger, U., and Bas, E. B. (1975). Surf. Sci. 50, 279-295.
Barbour, J. P., Dolan, W. W., Trolan, J. K., Martin, E. E., and Dyke, W. P. (1953). Phys. Rev. 92, 45.
Barbour, J. P., Charbonnier, F. M., Dolan, W. W., Dyke, W. P., Martin, E. E., and Trolan, J. K. (1960). Phys. Rev. 117, 1452-1459.
Bauer, E., Poppa, H., and Viswanath, Y. (1976). Surf. Sci. 58, 517-549.
Beasley, J. P., and Squire, D. G. (1975). IEEE Trans. Electron Devices ED-22, 376-384.
Beck, A. H., and Maloney, C. E. (1967). Br. J. Appl. Phys. 18, 845-847.
Bettler, P. C., and Charbonnier, F. M. (1960). Phys. Rev. 119, 85-93.
Blair, W., and Broers, A. N. (1971). IBM Tech. Disclosure Bull. 14, No. 2.
Bloomer, R. N. (1957). Proc. IEE, Part B 104, 153.
Boersch, H. (1949). Optik (Stuttgart) 5, 436-450.
Boersch, H. (1954). Z. Phys. 139, 115.
Bosi, G. (1975). J. Appl. Phys. 46, 4689-4696.
Brodie, I., and Jenkins, R. O. (1956). J. Appl. Phys. 27, 417.
Brodie, K. (1961). J. Appl. Phys. 32, 2039-2046.
Broers, A. N. (1965). Microelectron. Reliab. 4, 103.
Broers, A. N. (1969). J. Sci. Instrum. [2] 2, 273-276.
Broers, A. N. (1972). Electron, Ion Beam Sci. Technol., p. 3.
Buckingham, J. D. (1965). Br. J. Appl. Phys. 16, 1821-1932.
Butler, J. W. (1966). Int. Congr. Electron Microsc., Proc., 6th, 1966, p. 191.
Butz, R., and Wagner, H. (1977). Surf. Sci. 63, 448-459.
Cahen, O., Sigalle, R., and Trotel, J. (1972). Electron, Ion Beam Sci. Technol., p. 92.
Chang, T. H. P. (1975a). Proc. Int. Congr., 8th, 1974, Vol. 1, pp. 650-651.
Chang, T. H. P. (1975b). J. Vac. Sci. Technol. 12, 1271-1275.
Chang, T. H. P., and Stewart, A. D. G. (1969). Rec. Symp. Electron, Ion, Laser Beam Technol., 10th, 1969, pp. 97-106.
Chang, T. H. P., Wilson, A. D., Speth, A., and Kern, A. (1974). Electron, Ion Beam Sci. Technol., p. 580.
Chang, T. H. P., Speth, A., Wilson, A. D., and Kern, A. (1975). Proc. Symp. Electron, Ion, Laser Beam Technol., 13th, 1975.
Charbonnier, F. M., and Martin, E. (1962). J. Appl. Phys. 33, 1897.
Collier, R. J., and Herriott, D. R. (1975). U.S. Patent 3,900,737.
Cosslett, V. E., and Haine, M. E. (1956). Proc. Int. Conf. Electron Microsc., 3rd, 1954, pp. 639-644.
Cosslett, V. E., and Thomas, R. N. (1964). Br. J. Appl. Phys. 15, 1283.
Crewe, A. V. (1964). J. Appl. Phys. 35, 3075.
Crewe, A. V. (1966). Science 154, 729.
Crewe, A. V., and Parker, N. W. (1976). Optik (Stuttgart) 46, 183-184.
Crewe, A. V., Eggenberger, D. N., Wall, J., and Welter, L. M. (1968a). Rev. Sci. Instrum. 39, 576.
Crewe, A. V., Wall, J., and Welter, L. M. (1968b). J. Appl. Phys. 39, 5861-5868.
Cutler, P. H., and Nagy, D. (1964). Surf. Sci. 3, 71.
Davey, T. P. (1971). Optik (Stuttgart) 33, 580-590.
Dietrich, W. (1958). Ann. Phys. (Leipzig) [7] 152, 306.
Dolan, W. W., and Dyke, W. P. (1939). Phys. Rev. 55, 473.
Dolan, W. W., Dyke, W. P., and Trolan, J. K. (1953). Phys. Rev. 91, 1054.
Doran, S., Perkins, M., and Stickel, W. (1975). J. Vac. Sci. Technol. 12, 1174-1176.
Drechsler, M., Cosslett, V. E., and Nixon, W. C. (1960). Int. Conf. Electron Microsc., Proc., 4th, 1958, p. 13.
Dugas, J., Durandeau, P., and Fert, C. (1961). Rev. Opt. 40, 227.
Dyke, W. P., and Dolan, W. W. (1956). Adv. Electron. Electron Phys. 8, 89.
Everhart, T. E. (1967). J. Appl. Phys. 38, 4944.
Everhart, T. E., and Thornley, R. F. M. (1960). J. Sci. Instrum. [1] 37, 246.
Fischke, B., Anger, K., Frosien, J., and Oelmann, A. (1977). Proc. Int. Conf. Microlithogr., 1977, pp. 163-166.
Flanders, D. C., and Smith, H. I. (1978). J. Vac. Sci. Technol. 15, 1001.
Fowler, R. H., and Nordheim, L. (1928). Proc. R. Soc. London, Ser. A 119, 173.
Friedman, E. B., Liversey, W. R., and Rubiales, A. L. (1973). J. Vac. Sci. Technol. 10, 1020-1024.
Froitzheim, H., Ibach, H., and Lehwald, S. (1977). Surf. Sci. 63, 56-66.
Glascock, H. H., Jr. (1969). J. Sci. Instrum. [2] 2, 273.
Glaser, W. (1940). Z. Phys. 116, 56.
Glaser, W. (1956). In "Handbuch der Physik" (S. Flügge, ed.), Vol. 33, p. 123. Springer-Verlag, Berlin and New York.
Gomer, R. (1973). Surf. Sci. 38, 373.
Good, R. H., and Müller, E. W. (1956). In "Handbuch der Physik" (S. Flügge, ed.), Vol. 21, p. 176. Springer-Verlag, Berlin and New York.
Grivet, P. (1972). "Electron Optics," 2nd ed., pp. 726-856. Pergamon, Oxford.
Haantjes, J., and Lubben, G. J. (1957). Philips Res. Rep. 12, 46.
Haantjes, J., and Lubben, G. J. (1959). Philips Res. Rep. 14, 65.
Hanszen, K. J., and Lauer, R. (1967). Z. Naturforsch., Teil A 22, 238-254.
Harte, K. J. (1973). J. Vac. Sci. Technol. 10, 1098-1101.
Hartwig, D., and Ulmer, K. (1963). Z. Phys. 173, 294-320.
Hawkes, P. W., ed. (1973). "Image Processing and Computer-Aided Design in Electron Optics." Academic Press, New York.
Henderson, R. C., Voschenkow, A. M., and Mahoney, G. E. (1975). J. Vac. Sci. Technol. 12, 1261.
Heritage, M. B. (1973). In "Image Processing and Computer-Aided Design in Electron Optics" (P. W. Hawkes, ed.), p. 324. Academic Press, New York.
Heritage, M. B. (1975). J. Vac. Sci. Technol. 12, 1135-1140.
Herriott, D. R., Collier, R. J., Alles, D. S., and Stafford, J. W. (1975). IEEE Trans. Electron Devices ED-22, 385-392.
Heynick, L. N., Westerberg, E. R., Hartelius, C. C., Jr., and Lee, R. E. (1975). IEEE Trans. Electron Devices ED-22, 399-409.
Hollway, D. L. (1952). Aust. J. Sci. Res., Ser. A 5, 430-536.
Hopkins, B. J., and Watts, G. D. (1976). Surf. Sci. 55, 729-734.
Hughes, W. C. (1978). Proc. Int. Conf. Electron, Ion Beam Sci. Technol., 10th, 1969, p. 441.
Ichinokawa, T. (1969). Jpn. J. Appl. Phys. 8, 137-144.
Jaeger, R., and Menzel, D. (1977). Surf. Sci. 63, 232-243.
Jones, G. A. C., and Owen, G. (1977). Proc. Symp. Electron, Ion, Photon Beam Technol., 1977, p. 896.
Kamminga, W., Verster, J. L., and Francken, J. C. (1968). Optik (Stuttgart) 28, 442.
Kelly, J. (1972). "Electrostatic Deflection." Stanford Res. Inst., Menlo Park, California.
Keyes, R. W. (1975). Proc. IEEE 63, 740-767.
Kimura, H., and Tamura, H. (1967). IEEE 9th Annu. Symp. Electron, Ion, Laser Beam Technol., 1967, p. 62.
Kleint, C. (1963). Ann. Phys. (Leipzig) [7] 10, 309.
Klemperer, O., and Barnett, M. E. (1971). "Electron Optics," 3rd ed., p. 269. Cambridge Univ. Press, London and New York.
Koehler, A. (1893). Z. Wiss. Mikrosk. Mikrosk. Tech. 10, 433.
Koops, H., Möllenstedt, G., and Speidel, R. (1968). Optik (Stuttgart) 28, 518.
Kuo, H. P. (1976). Ph.D. Thesis, Cornell University, Ithaca, New York.
Kuznetsov, V. A., Ovchinnikov, A. P., and Tishin, E. A. (1969). Radio Eng. Electron. Phys. (Engl. Transl.) 14, 333-334.
Langmuir, D. B. (1937). Proc. IRE 25, 977.
Lenz, F. (1950). Z. Angew. Phys. 2, 448.
Levi, R. (1955). J. Appl. Phys. 26, 639.
Liebmann, G. (1949). Proc. Phys. Soc. London, Ser. B 62, 213.
Lin, L. H., and Beauchamp, H. L. (1973). J. Vac. Sci. Technol. 10, 987-990.
Liversey, W. R. (1973). J. Vac. Sci. Technol. 10, 1028-1032.
Loeffler, K. H. (1969). Z. Angew. Phys. 27, 145.
Loeffler, K. H., and Hudgin, R. M. (1971). Proc. Int. Congr. Electron Microsc., 7th, 1970, pp. 67-68.
MacMaster, G., and Dudley, K. (1973). Proc. Int. Electron Devices Meet., 1972.
Madey, T. E., Czyzewski, J. J., and Yates, J. T. (1976). Surf. Sci. 57, 580-590.
Mauer, J. L., Pfeiffer, H. C., and Stickel, W. (1976). Int. Electron Devices Meet., 1976, p. 434.
Mauer, J. L., Pfeiffer, H. C., and Stickel, W. (1977). IBM J. Res. Dev. 21, 514-521.
Mee, P. B. (1976). J. Appl. Phys. 47, 3904-3910.
Michail, M. S., Woodard, O. C., and Yourke, H. S. (1975). U.S. Patent 3,900,736.
Moss, H. (1968). Adv. Electron. Electron Phys., Suppl. 3.
Mulvey, T. (1967). In "Focusing of Charged Particles" (A. Septier, ed.), Vol. 1, p. 474. Academic Press, New York.
Munro, E. (1971a). Ph.D. Thesis, Cambridge University.
Munro, E. (1971b). In "Electron Microscopy and Analysis" (W. C. Nixon, ed.), pp. 84-87. Inst. Phys., London.
Munro, E. (1972). Proc. Eur. Reg. Conf. Electron Microsc., 5th, 1972, pp. 22-23.
Munro, E. (1973). In "Image Processing and Computer-Aided Design in Electron Optics" (P. W. Hawkes, ed.), pp. 284-323. Academic Press, New York.
Munro, E. (1974). Optik (Stuttgart) 39, 450-466.
Munro, E. (1975). J. Vac. Sci. Technol. 12, 1146-1150.
Nakhov, A. F. (1960). Sov. Phys.-Solid State (Engl. Transl.) 2, 1934.
Newberry, S. P. (1966). Proc. Fall Joint Comput. Conf. AFIPS, 1966, Vol. 29, p. 717.
Nishida, J. (1971). Electron. Commun. Jpn. 54B, 65-72.
Nomura, S., Konoda, T., Kamiryo, T., and Nakaizumi, T. (1973). Scanning Electron Microsc. 6, 66-72.
Nordheim, L. (1928). Proc. R. Soc. London, Ser. A 121, 626.
Ohiwa, H., Goto, E., and Ono, A. (1971). Electron. Commun. Jpn. 54B, 44.
O'Keefe, T. W., Vine, J., and Handy, R. M. (1969). Solid State Electron. 12, 841-850.
Okuyama, F., and Hibi, T. (1965). Jpn. J. Appl. Phys. 4, 337-342.
Owen, G. (1975). Ph.D. Thesis, Cambridge University.
Owen, G. (1977). Proc. Int. Conf. Microlithogr., 1977, p. 187.
Ozdemir, F. S., Perkins, W. E., Yim, R., and Wolf, E. D. (1973). J. Vac. Sci. Technol. 10, 1008-1011.
Pease, R. F. W. (1963). Ph.D. Thesis, Cambridge University.
Pfeiffer, H. C. (1971). Rec. Symp. Electron, Ion, Laser Beam Technol., 11th, 1971.
Pfeiffer, H. C. (1972). Scanning Electron Microsc. 5, 113-120.
Pfeiffer, H. C. (1975a). Proc. Electron Microsc., Int. Congr., 8th, 1974, Vol. 1, pp. 46-67.
Pfeiffer, H. C. (1975b). J. Vac. Sci. Technol. 12, 1170-1173.
Pfeiffer, H. C., and Loeffler, K. H. (1971). Electron Microsc., Proc. Int. Congr., 7th, 1970, p. 63.
Pfeiffer, H. C., and Woodard, O. C. (1975). U.S. Patent 3,894,271.
Pierce, J. R. (1949). "The Theory and Design of Electron Beams." Van Nostrand-Reinhold, Princeton, New Jersey.
Piwczyk, B. P., and McQuhae, K. G. (1973). J. Vac. Sci. Technol. 10, 1016-1019.
Powell, M. J. D. (1965). Comput. J. 7, 303-307.
Read, F. H. (1969). J. Sci. Instrum. [2] 2, 679-684.
Scherzer, O. (1936). Z. Phys. 101, 593.
Schrednik, V. N. (1961). Sov. Phys.-Solid State (Engl. Transl.) 3, 1268.
Schwartz, J. W. (1957). RCA Rev., pp. 3-11.
Scott, J. P. (1975). IEEE Trans. Electron Devices ED-22, 409-413.
Scott, J. P. (1978). Rec. Symp. Electron, Ion, Photon Beam Technol., 14th, 1977, p. 1016.
Shimizu, R., Kuroda, K., Suzuki, T., Nakamura, S., Suganuma, T., and Hashimoto, H. (1973). Scanning Electron Microsc. 6, 53-80.
Silzars, A., Bates, D. J., and Ballonoff, A. (1974). Proc. IEEE 62, 1119-1158.
Simpson, J. A., and Kuyatt, C. E. (1967). J. Appl. Phys. 18, 1573.
Singh-Boparai, S. P., and King, D. A. (1976). Surf. Sci. 61, 375-378.
Smith, H. I., Spears, D. L., and Bernacki, S. E. (1973). J. Vac. Sci. Technol. 10, 913-917.
Speth, A. J., Wilson, A. D., Kern, A., and Chang, T. H. P. (1975). J. Vac. Sci. Technol. 12, 1235-1239.
Spicer, D. F., Rodger, A. C., and Varnell, G. L. (1973). J. Vac. Sci. Technol. 10, 1052.
Stickel, W., and Pfeiffer, H. C. (1973). J. Vac. Sci. Technol. 10, A15.
Stille, G., and Astrand, B. (1977). Rec. Symp. Electron, Ion, Photon Beam Technol., 14th, 1977, p. 921.
Sullivan, P. A., and McCoy, J. H. (1977). J. Vac. Sci. Technol. 12, 1325-1328.
Swanson, L. W. (1973). J. Vac. Sci. Technol. 12, 1228-1233.
Swanson, L. W., and Bell, A. E. (1973). Adv. Electron. Electron Phys. 32, 193.
Swanson, L. W., and Crouser, L. C. (1967). Phys. Rev. 163, 622-641.
Swanson, L. W., and Crouser, L. C. (1969). J. Appl. Phys. 40, 4741-4749.
Swanson, L. W., and Martin, N. A. (1975). J. Appl. Phys. 46, 2029-2050.
Thomson, B. J., and Headrick, L. B. (1940). Proc. IRE 28, 318-324.
Thomson, M. G. R. (1975). J. Vac. Sci. Technol. 12, 1156-1160.
Timm, G. W., and Van Der Ziel, A. (1966). Physica (Utrecht) 32, 1333-1344.
Ulmer, K., and Zimmermann, B. (1964). Z. Phys. 182, 194-216.
Van Ostrom, A. G. J. (1965). Ph.D. Thesis, University of Amsterdam.
Varnell, G. L., Spicer, D. F., and Rodger, A. C. (1973). J. Vac. Sci. Technol. 10, 1048-1051.
Veneklasen, L. H. (1971). Ph.D. Thesis, Cornell University, Ithaca, New York.
Veneklasen, L. H. (1972). Optik (Stuttgart) 36, 410-433.
Veneklasen, L. H., and Siegel, B. M. (1972). J. Appl. Phys. 43, 4989.
Vogel, S. F. (1970). Rev. Sci. Instrum. 41, 585.
Wang, C. C. T., Harte, K. J., Curland, N., Lokustic, R. K., and Dougherty, E. C. (1973). J. Vac. Sci. Technol. 10, 1110-1113.
Wardly, G. A. (1973a). J. Vac. Sci. Technol. 10, 975-978.
Wardly, G. A. (1973b). J. Appl. Phys. 44, 5606.
Wardly, G. A. (1974). J. Appl. Phys. 45, 2316-2320.
Wardly, G. A. (1975a). IEEE Trans. Electron Devices ED-22, 414-417.
Wardly, G. A. (1975b). Rev. Sci. Instrum. 44, 1506.
Wardly, G. A. (1975c). Proc. Electron Microsc., Int. Congr., 8th, 1974, Vol. 1, pp. 141-142.
Weber, E. V., and Yourke, H. S. (1977). Electronics, pp. 96-101.
Wolfe, E. D., Coane, P. J., and Ozdemir, F. S. (1975). J. Vac. Sci. Technol. 12, 1266-1270.
Wolfe, J. E. (1975). J. Vac. Sci. Technol. 12, 1169.
Worster, J. (1969). Optik (Stuttgart) 29, 498-505.
Yonezawa, A., Nakagawa, S., and Suzuki, M. (1977). Proc. Electron Microsc. Soc. Am. 35, 62-63.
Zienkiewicz, O. C. (1967). "The Finite Element Method in Structural and Continuum Mechanics." McGraw-Hill, New York.
Zienkiewicz, O. C., and Cheung, Y. K. (1965). Engineer 220, 407-510.
Zworykin, V. K., Morton, G. A., Ramberg, E. G., Hillier, J., and Vance, A. W. (1945). "Electron Optics and the Electron Microscope." Wiley, New York.
AUTHOR INDEX

Numbers in italics refer to the pages on which the complete references are listed.
A A b r a h a m , M. S . , 19.33 Adalsteinsson, O., 191, 195 Adiri, I., 269 Agraz, J. C., 28,33 Ahmed, H., 322,376 Akiyama, S., 162, 195. Alexander, H., 108, 197 Alger, D. L., 162, 197 Allen, G. A., 17, 33 Allenson, M. B., 22, 24, 26, 28,33, 34 Alles, D. S., 288, 323, 377 Amboss, K., 361, 376 Amick, J. A., 24, 36 Andersen, W. H . J., 321, 376 Anderson, L., 191, 197 Andre, 3. P., 24, 33 Andres, U. T., 106, 188, 195 Andrew, D., 25,34 Anger, K., 296, 298,377 Angew, L., 24,35 Antypas, G. A., 5, 7, 16, 17, 24, 27,34, 35 Antypas, J . , 31, 34 Apker, L., 2, 16,34 Appert, J. A., 7 , 3 4 Archard, G. D., 37, 100 Arnaud D’Avitaya, F., 8, 34 Arsac, J . , 204, 207, 221, 222, 223, 269 Ashcroft, E., 204, 269 Ashley, K . L., 28, 34 Astrand, B., 288,379 Auger, P., 7, 29, 34 Austin, S., 298, 299, 376 Avramchuk, A. Z., 162, 195
B Baddour, R. F., 191, 195 Bgdescu, R., 120,195 Badger, D. E., 26, 35 381
Baer, A. D., 5.35 Bailey, R. L., 161, 162, 195 Baker, B. S., 223, 269 Ball, F. L., 67, 101 Ballonoff, A., 324,379 Banninger, U., 341, 376 Barbour, J. P., 309, 313, 319, 376 Barnett, M. E., 351, 378 Bartelink, D. T., 9, 34 Bas, E. B., 341, 376 Bashtovoi, V. G., 139, 140, 195 Bass, S. J . , 24, 33 Bates, D. J., 324,379 Bates, L. F., 134, 137, 195 Baud, C., 9, 10, 15, 26, 34 Bauer, E., 341, 376 Baumeister, W., 38, 100 Bean, C. P., 111, 113, 195, 196 Beasley, J. P., 288, 376 Beauchamp, H. L., 323,378 Beck, A. H., 374, 376 Bell, A. E., 308. 310,379 Bell, R. L., 3, 5, 7, 8, 16, 24, 32.34, 35, 36 Berkovsky, B. M . , 136, 139, 140, 195 Berkowitz, A. E., 113, 195 Bernacki, S. E., 298, 299,379 Bertrand, A. R. V . , 104, 195 Bettler, P. C., 313, 315, 317, 319,376 Biard. J. R., 28, 34 Bibik, E. E., 112, 178, 195 Bibik, V. F., 6, 34 Bienfait, M., 8, 34 Black, J. F., 28.34 Blair, W., 320, 322, 376 Blakely, R. W., Jr., 174, 198 Bliss, G. W., 168, 1% Bloomer, R. N., 322, 376 Boersch, H . , 374,376 Bogardus, E. H., 117, 195 Borziak, P. J . , 6, 34 Bosi, G., 367, 376 Boussinot, F., 269
Boyle, J. F., 106, 116, 197 Bradley, D. J., 22, 34 Braun, W., 7, 34 Brodie, I., 322, 376 Brodie, K., 322, 376 Broers, A. N., 288, 320, 337, 376 Brown, F., 5,34 Brown, W. F., Jr., 108, 115, 195 Broy, A., 185, 195 Buckingham, J. D., 320, 376 Buckmaster, J. D., 139, 195 Buiocchi, C. J., 19, 33 Burfoot, J. C., 37, 100 Busenberg, S. N., lf9, 196 Bussler, P. H., 100, 100 Butler, J. W., 306, 376 Butz, R., 376 Byme, J. V., 128, 167, 195
C
Cosslett, V. E., 337, 376, 377 Coulombre, R. E., 175, 176, 196 Cousineau, G., 221, 270 Cowley, M. D., 122, 127, 129, 148, 149, 150, 151, 156, 196 Crewe, A. V.,313, 361,376 Crossley, J., 24, 34 Crouser, L. C., 313, 315, 317, 319,379 Csorba, I. P., 21, 34 Curland, N., 367, 379 Curtis, R. A., 157, 168, 187, 189, 196 Cutler, P. H., 306, 308, 377 Czyzewski, J. J., 341,378
D Dahl, 0.-J., 205, 206, 270 d’Auriol, H., 175, 176, 196 Davey, T. P., 294, 303,377 Dawson, L. R., 24, 35 deGennes, P. G., 113, 196 Deitch, R. H., 24, 34 Deltrap, J. H. M., 37, 100 Denning, P. J., 202, 234, 238,270 Derrien, J., 8, 34 Dickey, J., 2, 16, 34 Dietrich, W., 374,377 Dijkstra, E. W., 205, 206, 261,270 Dinan, J. H., 8, 34 Dobson, C. D., 24,34 D o h , W. W., 308, 309, 313, 319,376, 377 Donea, J., 190, 196 Doran, S., 326, 377 Dougherty, E. C., 367,379 Dramarenko, G. S., 6,34 Drechsler, M., 337, 377 Dudley, K., 324, 378 Dugas, J., 346, 377 Dunning, J. W., 162, 197 Durandeau, ,P., 346, 377 Dyke, W. P., 308, 309, 313, 319,376, 377 Dzhauzashtin, K. E., 140, 196
Cahen, O., 323,376 Cillugiiru, 120,195 Cannon, T. M., 97, 100 Carmi, S., 157, 197 Carr, D. L., 28,34 Camco, J. P., 179, 195 Cary, B. B., Jr., 177, 195 Casey, H. C., 24, 34 Chang, C.C., 29,34 Chang, T. H. P., 287, 288, 323, 325, 359, 376, 379 Chantrell, R., 190, 197 Charbonnier, F. M., 310, 313, 315, 317, 319, 376 Charles, S. W., 111, 190, 197. 198 Chen, J. M., 7, 22, 34, 35 Cheung, Y. K., 349, 380 Chiu, W., 64,100 Chu, B.-T., 129, 132, 134, 195 Chu, Y.,203, 204, 270 Chung, D. Y.,120, 195 Ciraci, S., 8, 34 Cissoko, M., 140, 195 Clark, M. G., 6, 7, 34 E Coane, P. J., 325, 380 Collier, R. J., 288, 323, 326, 359, 376, 377 Ebbinghaus, G., 7,34 Colton, C. K., 189, 191, 195, 196 Edgecumbe, J., 5 , 7, 16, 27, 31,34, 35 Conder, P. C., 22, 30,35 Eggenberger, D. N., 376 Cook, E. J., 173, 195 Einstein, A., 117, 196
AUTHOR INDEX Enstrom, R. E., 6, 7, 26, 34 Erickson, H. P., 38, 100 Escher, G. A., 31,34 Escher, J. S., 6, 7, 34 Ettenberg, M., 16, 25, 34, 35 Evans, G. B., 5, 16,36 Everhart, T. E., 306, 324,377 Ezekiel, F. D., 158, 162, 192, 196, 197, 198
F Faber, 0. C., Jr., 140, 197 Fan, G. J., 183, 1% Farnell, G. L., 356, 377 Fellgett, P. B., 61, 100 Feltynowski, A., 38, 100 Fenlon, F. H., 177, 195 Fert, C., 346, 377 Finlayson, B. A., 157, 196 Fischke, B., 296, 298, 377 Fisher, D. G., 6, 7, 16, 22, 26, 34 Fisher, F. E., 8, 34 Flanders, D. C., 298, 299, 323, 376, 377 Floyd, R. W., 223, 270 Forester, D. W., 113, 195 Fowler, R. H., 308,377 Fox, R. A., 106, 116, 197 Foy, P. W., 24, 35 Francken, J. C., 346,378 Frank, G., 5, 16, 26, 27, 34 Frank, J., 38, 40, 47, SO, 56, 100, I00 Frenkel, Y. I., 115, 196 Friedman, E. B., 288, 377 Frosien, J., 296, 298, 377 Frotizheim, H., 341, 377 Fuji, H., 162, 195 Fujino, J., 162, 195
G Gailitis, A., 152, 196 Galbraith, L. K., 8, 34 Gallais, A., 24, 33 Galtier, C., 259, 270 Gapik, R. J., 24, 35 Garbe, S., 5, 16, 26, 27, 34 Gelb, A., 178, 198 Gerdau, E., 116, 199 Glachant, A., 8,34
Glaeser, R. M., 64,100 Glascock, H. H., Jr., 322,377 Glaser, W., 348, 377 Goldberg, J., 205, 270 Goldberg, P., 122, 196 Goldstein, B., 6, 8, 29, 34 Goldstein, S. R., 185, 196 Gomer, R., 317, 319,377 Good, R. H., 308,377 Goodwin, A. R., 24,34 Gordon, J., 24, 34 Gordon, W. J., 247. 270 Gossenberger, H. S., 7, 34 Goto, E., 362, 378 Goto, T., 51, 100 Gowers, J. P., 25, 34 Gray, R. J., 185, 186, 196 Green, P. E., 24, 35 Greene, P. O., 26, 35 Gregory, P. E., 8, 34 Grivet, P., 351, 377 Gutierrez, W. A., 25, 26, 35
H Haantjges, J., 348, 361, 377 Hahn, M.,38, 100 Haine, M. E., 337, 376 Halioua, M., 38, 100 Hall, W. F., 119, 196 Hallais, J., 24, 33 Hallish, C. D., 26, 35 Handy, R. M., 291, 378 Hansen, P. B., 202, 270 Hansford, J., 122, 196 Hanszen, K. J., 40, 47. 100, 321, 377 Harada, Y.,51, 100 Hamman, D. C., 206,270 Hams, P. M., 6, 36 Hams, W. W., 67, 97, 101 Harrison, W. A., 8, 34 Harte, K. J., 351, 367, 377, 379 Hartelius, C. C., Jr., 366, 377 Hartwig, D., 374,377 Hashinioto, H., 337, 379 Haus, H. A., 123, 129, 197 Hawkes, P. W., 346,377 Hayashi, I., 24, 35 Hayes, C. F., 122, 196 Headrick, L. B., 367, 370, 373,379
Heimann, W., 6,35 Heinrich, H.-J., 116, 199 Hemmer, P. C., 106, 196 Henderson, J. A., 25,34 Henderson, R. C., 359,377 Heritage, M. B., 295, 298, 348,377 Heniott, D. R., 288,323, 326,359,376,377 Hess, P. H., 105, 196 Heynick, L. N., 366,377 Hibi, T., 315, 378 Hiller, J., 348, 380 Hoare, C. A. R., 205, 206, 270 Hok, B., 177, 1% Hoene, E. L., 6,35 Holeman, B. R., 22, 30,34, 35 Hollway, D. L., 367,377 Hopkins, B. J., 341, 377 Hoppe, W.,38, 100, 100 Hori, M., 269 Huchital, D. A., 8.35 Hudgin, R. M., 368, 378 Hudgins, W. A., 162, 196 Hughes, F. R., 30,35 Hughes, W. C., 322,377 Hunter. J. S., 173, 178, 196, 198 Hyder, S. B., 25, 35
1
Ibach, H., 341,377 Ichinokawa, T., 374, 375, 377 Ikeguchi, T., 187, 197 Imbro, D., 106, 1% Ingebretson, R. B., 97, 100 Ishihara, A., 162, 195 Isler, W.E., 120, 195
J Jackson, D. A., 31, 36 Jackson, J. R., 270 Jacobi, K., 7.35 Jacobs, I. S., 111, 113, 195, 196 Jacobson, D. M., 122, 196 Jahnson, W. A., 24,35 James, L.W.,3 , 5 , 6 ,7 , 9 , 16, 17,29,34,35, 36 Jasger, R., 341,377 Jeans, J. H.,163, 1%
Jenkins, J. T., 140, 196 Jenkins, R. 0..322,376 Jerics Kansky, E., 6,35 Johnson, C. E., Jr., 183, 196 Johnson, E. O., 27,35 Jones, G. A. C., 364, 366,377 Jones, T. B., 132, 135, 136, 160, 161, 168, 196, 197 Jordan, P. C., 113, 196
K Kaiser, R., 106, 112, 113, 117, 119, 162, 168, 175, 176, 187, 189, 196, 197, 198 Kalinkin, A. K., 162, 195 Kamiryo, T.,337, 378 Kamminga, W., 378 Kang, C., 23, 24,35, 36 Kaplan, B. Z., 122, 196 Kazama, S., 187, 197 Keller, H., 116, 197 Kelly, J., 348, 361, 375, 378 Kern, A., 287, 359,376, 379 Keyes, R. W., 275, 378 Khalafalla, S. E., 104, 105, 188, 197 Kibler, D. F., 206, 270 Kilorenzo, J. W., 24, 35 Kimura, H., 324, 378 King, D. A., 341,379 King, P. G. R., 26, 33 King, S., 24,35 Kleint, C., 317, 378 Klemperer, O., 351, 378 Klug, A., 38, 100 Knuth, D. E., 206, 223, 270 Koehler, A., 328, 378 Konoda, T., 337,378 Koops, H., 295,378 Kosaraju, R. S., 206, 270 Kraeft, B., 108,197 Kressel, A., 9, 35 Kressel, H., 31, 35 Krueger, D. A., 114, 132, 197 Kruyt, H. R., 108, 118, 197 Kundig, W., 116, 197 KUO,H. P., 337, 338, 352, 378 Kupsky, G., 9,35, 36 Kuroda, K., 337, 379 Kuyatt, C. E.,374,379 Kuznetsov, V. A., 310,378
L Lahut, J. A., 113, 195 Lalas, D. P., 157, 197 Lamb, Sir H., 144, 197 Lamotte, A., 191, 195 Lancanshire, R. B., 162, 197 Lane, R., 322, 376 Langer, R., 38, 100, 100 Langmuir, D. B., 378 Lanza, F., 190, 1% Lassetre, E. N., 6, 36 Lauer, R., 321, 377 Ledgard, H . , 206,270 Lee, R. E., 366,377 Legems, M. I., 24, 35 Lehwald, S., 341, 377 Lenz, F., 346, 355, 378 Lerro, I. P., 173, 174, 197 Levi, R., 322,378 Levine, M. B., 162, 197 Levinson, L. M., 113, 195 Li, S. S., 28, 33 Liebmann, G., 352, 378 Lin, I. J., 188, 199 Lin, L. H., 323,378 Linfoot, E. H., 61, 100 Little, J. D. C., 270 Little, J. L., 173, 1% Little, R., 117, 169, 170, 171, 172, 178, 197. 198 Liu, Y. A . , 191, 197 Liu, Y. Z., 25, 26, 35 Liversey, W. R., 288, 340,377 Livingston, J . D., 113, 195 Loeffler, K. H., 326,335,368,370,372,373, 375, 378, 379 Lokustic, R. K., 367, 379 Loveman, D. B., 206,270 Lubban, G. J., 348, 361,377 Luca, E., 120, 195
M McCarthy, J., 204,270 McCoy, J. H., 298,379 McEwan, A. D., 150, 198 Mackor, E. L., 110, 197 MacMaster, G., 324, 378 McNab, T. K., 106, 116, 197
McQuhae, K. G., 288, 379 McTague, J. P., 120, 197 Madey, T. E., 341, 378 Mahoney, G. E., 359,377 Maloney, C. E., 374, 376 Manasevit, H. M., 24, 35 Manista, E. J., 162, 197 Mann, J. E., 64,101 Marcotty, M., 206, 270 Martin, E. E., 309, 310, 313, 319,376 Martin, N. A., 306, 308, 317, 379 Martinelli, R. U., 6, 7, 22, 32, 34, 35 Martinet, A . , 115, 116, 197 Martsenyuk, M. A., 115, 197 Matygullin, B. Y., 112, 195 Mauer, J. L., 326, 335, 336, 378 Mayamlin, V. A., 27,35 Mee, P. B., 341, 378 Melcher, J. R., 128, 129, 141, 142, 143, 150, 152, 197, 199 Menzel, D., 341,377 Meyer, N. I., 9, 34 Michail, M. S., 326, 378 Mikhalev, O., 162, 195 Miller, B. I., 24, 34, 35 Miller, C. W.,140, 197 Milton, A. F., 5, 35 Mir, L., 187, 196 Miskolczy, G., 113, 117, 119, 158, 162, 168, 169, 170, 171, 172, 189, 196, 197, 198 Miyyagin, N. Y. A., 8, 36 Mollensteft, G., 295, 378 Mol, A., 321,376 Moll, J. L., 5 , 7, 9, 16, 25, 34, 35 Moon, R. L., 5, 16,34, 35 Morton, G. A., 348,380 Mosbach, K., 191, 197 Moskowitz, R., 117, 158, 162, 197 Moss, H., 321,378 Miiller, E. W., 41, 101, 308, 377 Mulvey, T . , 351, 378 Munro, E., 322,346, 348,349,350, 361, 363, 376. 378, 379 Muramori, K., 187, 197 Murray, L. A., 28, 35
N Nagy, D., 306, 308,377 Nakagawa, S . , 320, 321, 380
Nakaizumi, T., 337, 378 Nakamura, S., 337,379 Nakhov, A. F., 378 Neighbors, J. M., 206, 270 Nelson, H., 23, 24, 31, 35 Nestor, J. W., 104, 110, 111, 117, 118, 190, 198 Neuringer, J. L., 122, 128, 131, 132, 139,197 Newberry, S. P., 367, 378 Newell, G. F., 247, 270 Nishida, J., 337, 378 Nishio, M., 162, 195 Nixon, W. C., 322, 337, 376, 377 Nogita, S., 187, 197 Nolin, L., 207, 223, 270 Nomura, S., 337, 378 Nordheim, L., 306, 308, 377, 378 North, J. C., 186, 199 Nuese, C. J., 25, 34
Perkins, W. E., 287, 288, 379 Perry, M. P., 160, 161, 162, 197 Persson, N. C., 162, 197 Peterson, E. A., 114, 197 Pfeiffer, H. C., 304, 320, 321, 326, 328, 335, 336, 354, 362, 363, 364, 369, 370, 372, 373, 374, 375, 378, 379 Philbrick, J . W., 35 Pierce, J. R., 367, 379 Pincus, P. A., 113, 196 Pinkas, E., 24, 34, 35 Piwczyk, B. P., 288, 375,379 Pleskov, Y., 27, 35 Plummer Stocker, B. J., 25, 34 Pohlhausen, K., 24.35 Pollard, J. H., 22, 35 Pornmering, D., 25, 26, 35 Poppa, H., 341, 376 Popplewell, J., 111, 190, 197, 198 Potard, C., 23.35 Powell, M. J. D., 348, 379
0
Oelmann, A., 296, 298,377 Ohiwa, H., 362, 378 O’Keefe, T. W., 291, 378 Okuyarna, F., 315, 378 Olah, E. E., 181, 182, 197 Olsen, G. H., 16, 25,35 Ono, A., 362,378 Orlov, D. V., 162, 195 Orlov, L. P., 136, 195 Ouchinnikov, A. P., 310, 378 Overbeek, J. T. G., 110, 199 Owen, G., 349, 350, 351, 361, 363, 364, 366, 377, 378, 379 Ozdemir, F. S., 287, 288, 325,379, 380
P Panish, M. B., 24,35 Papageorgopoulos, C. A., 7 , 3 5 Papell, S. S., 104, 105, 140, 197 Parker, N. W., 361, 376 Parker, P. H., 105, 196 Pease, R. F. W., 351, 379 Pellegrino, J. J., 169, 170, 171, 172, 198 Penfield, P., 123, 129, 197 Perez, M., 122, 198 Perkins, M., 326, 377
R Rabenhorst, D. W., 162, 174, 197 Rabinow, J., 197 Radger, A. C., 288,379 Raikher, Y. L., 112, 115, 195. 197 Ramberg, E. G., 348,380 Ranke, W., 7,35 Rayk, W. D., 190, 198 Read, F. H., 346, 379 Reiman, J. J., 190, 198 Reimers, G. W., 105, 188, 197 Resler, E. L.,Jr., 140, 189, 190, 197 Resnick, J. Y., 179, 181, 198 Riccus, H. D., 28,35 Rice, R., 204, 270 Richardson, E. G., 64, I01 Rodger, A. C., 356,377 Rohler, R., 66, 100 Romankiw, L. T., 179, 180, 181, 198 Romano-Moran, R., 28,34 Rose, H., 64,100 Rosensweig, R. E., 104, 105, 106, 110, 1 I I , 112, 117, 118, 119, 122, 127, 128, 129, 131, 132, 147, 148, 149, 150, 151, 153, 155, 156, 158, 159, 163, 164, 165, 168, 169, 170, 171, 172, 174, 175, 176, 178, 179, 181, 187, 189, 190, 196, 197, 198
Roth, J. R., 190, 198 Roucairol, G., 231, 270 Rougeot, H., 9, 10, 15, 26, 34 Rowland, M. C., 26, 113 Rubiales, A. L., 288, 377 Ruggiu, G., 207, 209, 223, 270
S Sabelman. E. E., 183, 184, 198 Saffman, P. G., 153, 198 Sakai, H., 187, 197 Sargent, R. W., 178, 198 Savoye, E. D., 30,35 Scheer, J. J., 2, 5 , 8, 16, 22, 23, 35, 36 Scherzer. O., 37, 39, 41, 42, 100, 348, 379 Schmieder, R. W., 174, 179, 198 Schnee, L., 175, 176, 1% Scholander, P. F., 122, 198 Scholten. P. C., 106, 109, 198 Schrednik, V. N., 315, 379 Schuman, P. A., Jr., 28, 35 Schwartz, J. W., 367,379 Schwartz, S. C., 234, 238, 270 Scott, J. P., 291, 379 Scranton, R., 117, 195 Seeliger, R., 37, 100 Selzars, A,, 324,379 Shade, H., 31,35 Sharma, V. K., 116, 198 Shephard, P. G., 111, 198 Shimizu, R., 337,379 Shindo, Y., 162, 195 Shliomis, M. I., 104, 113, 115, 116, 119, 120, 152, 156, 195, 197, 198, 199 Shubert, R. H., 191, 198 Siegel, B. M. J., 38,64,100. 306,308,379 Sigalle, R., 323, 376 Simon, A., 6, 7, 9,34, 35 Simpson, J. A., 374, 379 Simpson, W. I., 24.35 Singh-Boparai, S. P., 341, 379 Sizov, A. P., 162, 195 Skingsley, J. D., 22, 30, 35 Slaby, J. G., 162, 197 Slusarczuk, M. M. G., 179, 180, 181, 198 Small, M. B., 24, 34 Smith, H. I., 298, 299, 323, 376, 377, 379 Smith, K. L., 8, 35 Smith, W. R.,204, 270
387
Someya, T., 51, 100 Sommer, A. H., 5 , 7, 31,36 Sonnenberg, H., 3, 5 , 6, 36 Spears, D. L., 298, 299,379 Speidal, R., 295, 378 Speth, A. J.. 287, 359, 376 Spicer, D. F., 288, 356, 377, 379 Spicer, W. E., 3,5,7,8,9, 16, 25,31,32,34, 35, 36 Squire, D. G., 288, 376 Stafford, J. W., 288, 323. 377 Standish, T. A., 206, 270 Stein, N. W., 26, 35 Steward, G. J., 26, 33 Stewart. A. D. G., 323, 376 Stickel, W., 320,326,335,336,377,378,379 Stille, G., 288, 379 Stockham, T. G., Jr., 97, 100 Stratton, J. A., 163, 198 Stripling, W. W., 178, 198 Stroke, G . W., 38, I00 Stubbs, R. M., 162, 197 Styles, J . C., 174, 198 Suganuma, T., 337,379 Sullivan, P. A., 298, 379 Sumsky, S . , 24,35 Suzuki, T., 337,379 Swanson, L. W., 306, 308, 310, 313, 315, 317, 319, 379 Swzuit, M., 320, 321, 380 Syms, C. H. A.. 25, 26, 33, 36 Szostak, D. J., 8, 16, 25,34, 35
T Taff, E., 2, 16,34 Tamura, H., 324,378 Taylor, G. I., 150, 153, 198 Thoman, D. L., 30,35 Thomas, J. R., 105, 198 Thomas, R. N., 376 Thompson, D. A., 117, 179, 180, 181, 195, 198 Thomson, B. J., 367, 370, 373,379 Thomson, M. G. R., 361,379 Thon, F., 38, 100 Thomley, R. F. M., 324,377 Tietjen, J. J., 5 , 24, 34, 36 Tiller, W. A., 23, 36 Timm, G. W., 317,379
Timmins, R. S., 104, 110, 111, 117, 118, 190, Walker, J. S., 139, 195 198 Wall, J., 313, 376 Tishin, E. A., 310,378 Wang, C. C. T., 367,379 Toupin, R. A., 183, 196 Wardly, G. A., 291, 293, 294, 340, 364,380 Tozawa, 0..189, 199 Watta, G. D., 341,377 Trepte, L., 40, 47, 100 Weber, E. V.,336,380 Trolan, J. K., 309, 313, 319, 376, 377 Weinstock, R., 163, 199 Trotel, J., 323, 376 Welter, L. M., 313, 376 Tsai, K. R., 6, 36 Welton, T. A., 38, 39, 40, 47, 61, 64,67, 68, Tuffias, R. H., 174, 198 97, 101 Turbeville, J. E., 189, 199 Westerbeck, E., 6,35 Turnbull, A. A., 5 , 16, 25, 34, 36 Westerberg, E. R., 366, 377 Tuval, Y., 188, 199 Whitaker, H. H., 5 , 7, 36 White, H. V., 178, 198 Whitesides, G. M., 191, 195 Widory, A., 231, 270 U Wiener, N.,66, 67, 87, 94, 97, 99, 101 Willasch, D., 38, 100 Uebbing, J. J . , 3,5,6,7,8, 17,29,34,35,36 Williams, B. F., 5 , 6, I, 9, 19, 33, 34, 36 Ueda, K., 162, 195 Wilson, A. D., 287, 359, 376, 379 Ukhanov, I., 28,36 Wilson, H. L., 25, 26, 35 Ulmer, K., 374, 377, 379 Winkler, H., 116, 199 Urschler, G . , 223, 270 Wirth, N., 205, 270 Wohlfarth, E. P., 108, 199 Wolfe, E. D., 287, 288, 325,37Y, 380 V Wolfe, J. E., 339, 372, 373, 380 Wolfe, R., 186, 199 Vance, A. W., 348,380 Woodard, 0. C., 326, 378, 379 Van der Pauw, L. J., 28,36 Worsham, R. E., 64,101 Van der Voort, E., 190, 196 Worster, J., 306, 380 Van Der Ziel, A., 317,379 van Heerden, P. J., 122, 196 Van Laar,J., 2, 5 , 8, 16, 22, 23, 32,35, 36 Y Van Ostrom, A. G. J., 308, 379 Varnell, G. L., 288, 379 Yadin, M., 269 Vasseur, J. P., 270 Yakushin, V. I., 120, 199 Veneklasen, L. H., 306, 308, 337, 338,379 Yantovskii, E. I., 140, 196 Verster, J. L., 346, 378 Yates, J. T., 341,378 Venvey, E. J. W., 110, 199 Yee, E. M., 25, 26, 31, 35, 36 Vine, J., 291, 378 Yep, T. 0.. 7,35 Viswanath, Y., 341, 376 Yim, R., 287, 288,379 Vogel, S. F., 320,379 Yonezawa, A., 320, 321, 380 Vogler, T., 153, 155, 198 Yonzane, M., 189, 199 Voschenkow, A. M., 359,377 Young, R. 
D., 41, 101 Vuillernin, J., 212, 270 Yourke, H. S., 326, 336, 378, 380 Yu, A., 8,36
W Wadge, W. W., 204, 269 Wagner, H., 376 Waldner, F., 116, 198
2 Zahn, M.,142, 153, 155, 198 Zaitsev, V. M., 152, 199
AUTHOR INDEX Zelazo, R. E., 128, 129, 141, 142, 143, 146, 147, 199
Ziegler, N. F., 64, 101 Zienkiewicz, 0. C., 349, 380
Zimmels, Y., 188, 199 Zimmermann, B., 374, 379 Zmerowski, T. J., 16, 25, 35 Zworykin, V. K., 348,380
389
SUBJECT INDEX

A
Abbe bright field theory, 40-41
Accelerometers, magnetic fluids and, 178-179
Actuators, magnetic fluids as, 183-185
Adsorbed polymers, repulsion energy for, 110
Advanced working set, in EDELWEISS system, 238-244
ALGOL language, 203
Angular energy distribution, for negative electron affinity photoemitters, 21-22
Artificial muscle, ferrofluid activator as, 184
Atomic actions, in EXEL, 209-211
Avco Corporation, 175, 187
AWS, see Advanced working set

B
Basic flows, conical meniscus and, 131-134
Bearings, magnetic fluids in, 163-175
Bernoulli equation, for magnetic fluids, 128-130
Biochemical processing, magnetic liquids in, 191
Blind deconvolution, 97
Bright-field electron image theory, see also Bright-field electron microscopy
  object reconstruction in, 55-67
  presentation and discussion of data in, 74-87
  programs for determining W(k), 97-100
  programs for numerical tests of reconstruction algorithm, 67-74
  quadrature effects in, 87-94
  statistical error in, 50-55
  wave function for, 40
Bright-field electron microscopy
  image enhancement in, 37-100
  image theory for, 39-47
  partial coherence and, 47
Bright-field electron micrographs, phase grain in, 76
Brownian rotational motion, in ferrofluids, 119
Bulges, crystallization and, 106

C
CCD arrays, 273
Cesium oxide photocathode, 3
Chip sorting, magnetic fluids in, 185
Classical Quincke problem, 134-135
COBOL language, 203
Colloidal dispersion, stability of, 108-111
Computer-aided design of fast scanning systems, 341-352
  limitations of, 350-352
Computer architecture
  choosing high-level language in, 204-205
  new trends in, 203-205
Computer programming, see Programming
Conical meniscus, in magnetic liquids, 131-134
Control structures, 206
CONV, see Conventional microscope
Convection flows, in magnetic liquids, 138-140
Conventional microscope, parametric sets describing, 75
Coulomb repulsion, in magnetic fluids, 110
Coulombic effects, quantitative estimates of, 370-374
Coulombic interactions
  in high-current effects during microfabrication, 367-370
  recent research in, 374-375
Crystallization, bulges or kinks in, 106
Current detectors, 178
Czochralski method, 23
D
Dampers, magnetic fluid, 175-176
Darcy's law, 154
Deflection problem
  analytical treatments of, 361-363
  calibration and diagnostics in, 365-367
  electronic implications of, 364-365
  experimental data in, 362-363
  fabrication aspects of, 363-364
  future possibilities in, 366-367
  possible configurations in, 361
Demagnifying electron projection system, 296
Device microfabrication, see Microfabrication
Diffractogram, 98
  in bright-field electron image theory, 80
Direct execution architecture, 204
Displays, magnetic fluids in, 179-183
DNA (deoxyribonucleic acid), as ordered array, 74
DNA molecule, number of carbon atoms in, 81
DNA organization, of object atoms, 76

E
EDELWEISS architecture, 223-256
EDELWEISS AWS, efficiency of, 238-244
EDELWEISS system, 201-269
  analytical model of, 244-256
  anticipated allocation of resources in, 236
  description of, 223-232
  EXEL language in, 207
  EXELETTE machine and, 256-263
  global performances in, 253-257
  management of, 226-229, 236
  monocustomer system and, 255
  as multiprocessor system, 244
  operating principle of, 223-225
  partition into subsystems, 247-253
  restrictions to, 246-247
  segmentation and fragmentation in, 229-232
  server's utilization in, 256
  single-user family and, 256-264
  three-processor implementation in, 262-264
  waiting times in, 254
  working set in, 232-244
Elastomeric capsule, 184
Electromagnetic transportation, 176
Electron beam
  in electron projection systems based on photocathode, 291-295
  interactions with resist-coated substrate, 280-288
  as microfabrication device, 288-300
Electron beam lithography, 277-279
Electron beam system
  pattern generation of, 288-291
  proximity effect and, 287-288
Electron beam projection systems, 291-298
Electronics industry, new inventions and developments in, 272
Electron micrograph, enhancing of, 38
Electron microscope
  aperture defect in, 37
  conventional, 55-56, 75
Electron microscopy
  resolution improvement in, 37
  high-field, see High-field electron microscopy
Electron-optical design, practical aspects of, 302-303
Electron-optical design problems
  in fast-scanning system development, 300-301
  in microfabrication, 356-358
Electron projection systems
  demagnifying, 296
  photocathode and, 291-295
Emission probability, equation for, 11
Empirical programming, 205
EMPTY, as ordered array, 74
Epitaxy system, horizontal liquid-phase, 24
Equilibrium magnetic properties
  formation of chains and clusters in, 113-114
  in magnetic fluids, 111-114
EXEL
  actions in, 209-211
  argument transfer and evaluation in, 212
  atomic actions in, 209-211
  EXIT symbol in, 209
  formulas and, 207
  as GO-TO-less language, 223
  iteration operator in, 209
  logical levels of, 224
  procedures in, 210-213
  processors in, 224-225
  symbols used in, 208-209
  type expression in, 212
EXEL/APL, 207
EXEL/BASIC, 207
EXELETTE machine, 256-262
  allocator in, 260
  calculator in, 260
  definition mode in, 258-259
  description of, 256-258
  execution mode and, 259
  floppy disk management in, 259-262
  internal management of, 259-262
  operation of, 258-259
  processors in, 260-261
EXEL/FORTRAN, 207
EXEL language, as GO-TO-less control structure language, 207
EXEL program
  definitions and scope of applications in, 218-219
  example of, 213-217
  programming methodology in, 222
  system of regular program equations in, 217-222
= symbol, in EXEL, 209, 219-222
External magnetic field, in ferroliquids, 117, see also Magnetic fluids
F
Fast-scanning system
  cathode choice in, 303
  computer-aided design in, 341-352
  computer limitations in design of, 350-352
  deflection problem in, 352-367
  development of, 300-326
  and electron-optical properties of gases, 303-306
  fiducial mark detectors, beam blanking, and alignment in, 322-326
  field distribution in, 346-348
  field emission cathodes in, 306-319, 336-341
  final image size and, 344-346
  final position in, 344-346
  first- and third-order properties in, 348-350
  image shape in, 344-346
  individual components in, 346-350
  magnetic deflection problem in, 352-367
  scan strategy in, 358-361
  thermal cathodes in, 320-336
Ferrofluids, 104-105, see also Magnetic fluids; Magnetic liquids
  angular momenta of particles in, 120
  in biochemical processing, 191
  as liquid level sensors, 179
  magnetic particles in, 104
  nominal properties of, 121
  pH of, 122
  preparation of, 105-106
  viscosity of, 117-120
Ferrofluid seals, 158-163
Ferrohydrodynamic seal, for superconductor generator, 161
Ferromagnetic microstructures, observing of, 185-186
Ferromagnetism, in magnetic fluids, 105-106
Fiber optics, 273
Field distribution, in fast-scanning system design, 346-348
Field emission cathodes
  detailed properties of, 308
  in fast-scanning systems, 306-319, 336-341
  mode-confined field emitters in, 313-317
  potential performance of, 336-337
  practical difficulties with, 337-339
  (310) emitter in, 312-313
Field emitter noise, 317-319
Field emitters
  mode-confined, 313-317
  stability and life of, 317-319
  (310) type, 312-313
Field emitter system
  experimental work with, 339-340
  future possibilities of, 340-341
  magnification of, 338
Fluid, magnetic, see Magnetic fluid
Fluid dynamics
  magnetic attraction in, 108-109
  of magnetic fluids, 103-192
Fluid motion, conversion of thermal energy to, 189-190
Fluid penetration, stabilization of through porous medium, 153
Formula, in EXEL, 207
FORTRAN, 203
G
Gallium arsenide, in semiconductor technology, 272
Gallium arsenide photocathode, 2-3
Gas laser, high-power, 162
General Electric Company, 162
General-purpose operating systems, 202-203
GO-TO-less control structure language, EXEL as, 207
GO TO statements, control structures and, 203, 206
Gouy experiment, modified, 137-138
Gradient field stabilization, 146-148
Graphics, magnetic fluids in, 179-183
GREFFIER processor, 224, 262
  function of, 229-230

H
HC, see High-coherence microscope
Heteroepitaxial structures, transmission photocathodes with, 25-27
High-coherence microscope, 75
High-current effects, Coulombic interactions in, 367-375
Hitachi, Ltd., 187
Houston Research, Inc., 189
HUISSIER processor, 225, 263
Hydrostatic bearings, one-fluid, 174

I
Image enhancement, in bright-field electron microscopy, 37-100, see also Bright-field electron image theory
IMAGE job step, 69, 73-74
Image plane amplitude, 45
Image plane intensity, 45
  in bright-field electron image theory, 92
Image quality, informational approach to, 64
Image theory, for bright-field electron microscopy, 39-47, see also Bright-field electron image theory
Indirect execution computer architecture, 204
Instabilities
  formulation of problem in, 141-142
  gradient field stabilization and, 146
  in magnetic liquids, 141
  Rayleigh-Taylor problem in, 144-146
  reduction of dispersion relation for linear medium, 143
Instrument damper, 176
INTENDANT processor, 230-231, 262
Inviscid relationships, summary of, 130-131

J
Josephson junctions, 273

K
Kelvin-Helmholtz instability, 152
Koehler illumination, thermal cathodes and, 328-335
L
Langevin theory, 132
Lanthanum hexaboride thermal emitter, 320-321
Least recently used algorithm, 202
LEED, see Low energy electron diffraction
Level detectors, 178
Levitation
  and force on immersed body, 166-168
  in magnetic fluid, 163-168
LINEAR value, in bright-field electron microscopy, 75-76, 82, 91
Liquid level sensors, ferrofluids as, 179
Liquid/liquid separation systems, 189
Liquid-phase epitaxy system, for NEA photocathodes, 24
Loudspeakers, magnetic fluid for voice coil in, 173-174
Low energy electron diffraction, 7
  measured error for, 22
LRU algorithm, see Least recently used algorithm

M
Magnetic etching device, 185
Magnetic field, perpendicular orientation of, 148
Magnetic field viscosity, with external field present, 118-120
Magnetic fluid bearings
  computations for, 169-171
  configurations of, 174
  hydrostatic, 174
  measurements of, 168
  new developments in, 173-175
  reduction of model to practice, 171-174
Magnetic fluid center, for voice coil, 173-174
Magnetic fluid coating, magnetic separation by, 191
Magnetic fluid devices, 163-175, see also Magnetic fluid bearings
Magnetic fluid flotation phenomena, passive bearings based on, 163-175
Magnetic fluids, see also Ferrofluids; Magnetic liquids
  as actuators, 183-185
  alternate forms of grouping magnetic terms for, 126-128
  basic flows in, 131-140
  in biochemical processing, 191
  chains and clusters in, 113
  in chip sorting, 185
  for dampers, 175-176
  in devices, 157-186
  in displays, 179-183
  in electromagnetic transportation, 176
  equilibrium magnetic properties in, 111-114
  experimental measurements in, 116
  ferromagnetic microstructures and, 185-186
  fluid dynamics of, 122-157
  generalized Bernoulli equation for, 128-130
  inviscid flow relationships in, 130-131
  in lubrication, 192
  in magnetic separation by fluid coating, 191
  magnetic stress tensor and body force in, 122-128
  magnetization kinetics in, 114-117
  in magnetohydrostatic separation, 187-188
  preparation of, 105-108
  in printing, 182-183
  processes based on, 186-192
  pushing of fluid by, 153
  self-levitation in, 164-166
  shaft seals for, 158-163
  stability related to net potential curves, 110-111
  structure and properties of, 103-122
  tabulated data and other properties of, 120-122
  as transducers, 177-179
Magnetic fluid seals
  applications of, 161-163
  principle of, 159
Magnetic force/gravitational force ratio, 105
Magnetic liquid bearings, measurements in, 168-169
Magnetic liquid flows, surface elevation in normal field, 135-137
Magnetic liquids, see also Magnetic fluids
  convective flows in, 138-140
  in energy conversion, 189
  fluid dynamics of, 103-192
  in graphics, 179-183
  instabilities and modifications in, 141-157
  Kelvin-Helmholtz instability of, 152-153
  magnetic attraction in, 108-109
  modified Gouy experiment in, 137-138
  normal field surface instability in, 148-152
  peaks on surface of, 151
  stabilization of fluid perturbation through a porous medium, 153-155
  steric repulsion in, 109-110
  thermoconvective instability of, 155-156
  viscosity in, 117-120
Magnetic lubricants, 192
Magnetic separation, by fluid coatings, 191
Magnetic stress tensor, body force and, 122-126
Magnetization kinetics
  experimental measurements in, 116-117
  in magnetic fluids, 114-117
  superparamagnetism induced by Brownian motion in, 115-116
Magnetocaloric cooling, 157
Magnetohydrostatic separation, 187-188
Methyl groups, 106
Microfabrication
  alignment in, 322-326
  beam blanking in, 322-326
  challenge of, 274
  Coulombic effects in, 367-374
  deflection problem in, 352-367
  deflector design philosophy in, 356-358
  electron beam lithography in, 277-279
  electron-optical problem in, 356-358
  electron physics in, 272-375
  fast-scanning system in, 300-352
  fiducial mark detectors in, 322-326
  high-current effects in, 367-375
  lithographic approaches to, 279-280
  magnetic deflection problem in, 352-367
  photolithographic methods in, 275-277
  photon and electron beam lithography for, 275-280
  scanning electron beam approach in, 299-300
  scan strategy in, 358-361
  thermal cathodes in, 326-336
  X-ray lithography in, 298-299
Microfabrication deflector system, 352-367
Microscope design, essential consideration in, 64, see also Electron microscope
Modulation transfer function, 55
N
NASA Radio Astronomy Explorer Satellite, 175
NEA, see Negative electron affinity
Negative electron affinity
  discovery of, 2
  photocathodes using, 2-8, 22-31
Negative electron affinity photocathode
  diffusion length in, 27-28
  doping of, 28
  high vacuum for, 29-31
  high-vacuum sealing equipment for, 29-30
  investigation of instrumental characteristics of, 27-29
  low dark current of, 31-32
  materials used in, 6
  photoemission stability for, 31-32
  semiconducting material for, 22
  surface of, 29
  technology for, 22-31
Negative electron affinity photoemitters, 1-33, see also Negative electron affinity photocathode
  angular energy distribution for, 21-22
  gallium arsenide in, 32
  material requirements for, 32
  photoemission stability and dark current for, 31-32
  pioneers in, 16
Nonmagnetic body, passive levitation of, 163
NORMAL (parameter value), 75, 82, 89, 91
Normal field surface instability, 148-152
Normal pulse surface elevation, 135-137
Nusselt number, 139

O
OBJECT job step, 68, 97
  Wiener spectrum for, 94-97
Object reconstruction, in bright-field image theory, 55-67
Operating systems, general-purpose, 202-203
Optical masks, 273

P
Partial coherence, effect of, 47-59
Photocathode
  cesium-gallium arsenide, 8
  in electron projection systems, 291-295
  of gallium oxide, 2-4
  multialkaline, 22
  negative electron affinity type, 2-8
  reflection type, 9-16
  as semiconductors, 2
  transmission, see Transmission photocathodes
Photoemission, by transmission, 17-21
Photoemission stability, 31-32
Photoemission technology, development of, 1-2
Photolithographic methods, 275-277
PL/I, 203
PLOT job step, 73
Polyethylene oxide, in water, 109
Polyisobutylene, 106
Polymers, as stabilizers, 106
Printing, magnetic fluids in, 182-183
Procedure calls, in EXEL, 211-213
Program debugging, 205
Program manipulation, 206-207
Programming
  empirical, 205
  new trends in, 205-207
  top-down, 205-206
Proximity effect, in electron beam-substrate interactions, 287-288

Q
Quantum yield, for reflection photocathode, 14
Queue lengths, average, 249, 266-267
Queuing systems, generalities on, 245
R
Rayleigh-Taylor instability, 144, 154
Reconstruction algorithm, numerical tests of, 67
Reflection photocathodes, 9-16
  quantum yield for, 12-14
  space-charge zone in, 9
Resist-coated substrate, electron beam interactions with, 280-288
Resource allocation, in EDELWEISS system, 236
ROBOT processor
  defined, 224-225
  function of, 229-230

S
Satellite damper, 175
Scanning, see Fast-scanning system
Scanning electron beam approach, central role of, 299-300
Scanning electron beam system, pattern generated by, 288-291
SCRIBE processor, 224-225
Semantic transforms, 207
Semiconductor material, for NEA photocathodes, 22-23
Silver-oxygen-cesium photocathode, 1
Smoothing process, defined, 99
Solubility parameters, of polymers, 106
SOUTIER processor, 224-225, 263
  function of, 231-232
SPEC job step, 76, 99
  in image theory, 75
Stepping motor damper, 176
Superparamagnetism, Langevin theory of, 132
Syntactic transforms, 206-207
Syntax-oriented computer architecture, 204

T
Taylor wavelength, 144
TEXT displays, 74-75, 81
  high-coherence imaging of, 87
  in image theory, 75, 89, 91
  as ordered array, 74
TEXT job step, 81
Thermal cathodes
  experimental results in, 331-336
  in fast-scanning systems, 320-336
  Koehler illumination in, 328-335
  lanthanum hexaboride thermal emitter and, 320-321
  vs. other types, 322
  spot shaping in, 326-328
  tungsten hairpin filament in, 321-322
  variable aperture for, 326-328
Thermal energy, conversion of to fluid motion, 189
Thermoconvective instability, 155-157
(310) emitter, as field emission cathode, 312-313
Top-down programming, 205-206
Transducers
  accelerometers and, 178-179
  acoustic, 177
  current detectors and, 178
  level detectors and, 178
  magnetic fluids as, 177-179
  pressure generator and, 177
Transmission photocathodes, 17-21
  active layer thickness for, 19-20
  diffusion length in, 18-19
  with heteroepitaxial structures, 25-27
  quality of substrate/active layer interface, 21
  quantum yield for, 20
  recombination speed of, 21
Transmission, photoemission by, 17-21
Tungsten hairpin filament, 321-322
Type expression, in EXEL, 212

U
Ultraviolet light, radiation from, 1

V
Van der Waals attraction, 108
Van der Waals-London or dispersion force, 108
Viscosity
  with external field, 118-120
  of magnetic liquid, 117-120
  with no external field, 117-118
Von Neumann computer architecture, 203

W
Water-base ferroliquids, pH of, 122
Wiener algorithm, 87
Wiener function, for reconstruction of high-coherence micrograph, 86
WIENER job step, 76, 99
  in bright-field image theory, 75
Wiener kernel, 83
Wiener spectra, 78-81
  of object set, 94
Working set
  in classical system, 233-236
  defined, 232
  in EDELWEISS system, 232-233

X
XFORM job step, 73-74
XPLOT job step, 74
X-ray lithography, 298-299