ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS
VOLUME 63
CONTRIBUTORS TO THIS VOLUME
E. CARLEMALM, C. COLLIEX, FRED E. GARDIOL, DONALD GREENSPAN, E. KELLENBERGER, H. TIMAN, FRANCIS T. S. YU
Advances in Electronics and Electron Physics
EDITED BY PETER W. HAWKES
Laboratoire d’Optique Electronique du Centre National de la Recherche Scientifique
Toulouse, France
VOLUME 63 1985
ACADEMIC PRESS, INC. (Harcourt Brace Jovanovich, Publishers)
Orlando San Diego New York London Toronto Montreal Sydney Tokyo
COPYRIGHT © 1985, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC.
Orlando, Florida 32887
United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. 24/28 Oval Road, London NW1 7DX
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 49-7504 ISBN 0-12-014663-0
PRINTED IN THE UNITED STATES OF AMERICA
85 86 87 88    9 8 7 6 5 4 3 2 1
CONTENTS

CONTRIBUTORS TO VOLUME 63
PREFACE

Recent Advances in White-Light Image Processing
FRANCIS T. S. YU
I. Introduction
II. White-Light Optical Image Processing
III. Coherence Requirement
IV. Coherence Measurement
V. Source Encoding, Image Sampling and Filtering
VI. Advances in Image Processing
VII. Concluding Remarks
References

A Survey of Recent Advances in the Theory and Practice of Vacuum Photoemitters
H. TIMAN
I. Introduction
II. Stability of Photocathodes and the Influence of External and Environmental Conditions
III. Physical Properties of PC
IV. Enhancement of Photoemission
V. Formation, Composition, and Spectral Response
VI. Theoretical Attempts and Models
References

Open-Ended Waveguides: Principles and Applications
FRED E. GARDIOL
I. Introduction
II. Definitions and Generalities
III. Measurements
IV. Applications
V. Theoretical Development for a Flanged Waveguide Radiating into an Infinite Homogeneous Medium
VI. Application to Particular Structures
VII. Conclusion
Appendix: Infinite Sample in Waveguide
References

Discrete Mathematical Physics and Particle Modeling
DONALD GREENSPAN
I. Introduction
II. Newtonian Mechanics
III. Special Relativistic Mechanics
IV. Quantum Mechanics: A Speculative Model of Vibrations in the Water Molecule
V. Concluding Remarks
References

Contrast Formation in Electron Microscopy of Biological Material
E. CARLEMALM, C. COLLIEX, and E. KELLENBERGER
I. Introduction
II. Theory
III. Scattering Cross Sections and Constants
IV. Contrast with Unstained Sections of Aldehyde-Fixed Biological Material
V. Experimental Confirmations
VI. Discussion of the Consequences for the Interpretation of Micrographs
VII. Discussion of Limitations
VIII. Concluding Remarks
References

AUTHOR INDEX
SUBJECT INDEX
CONTRIBUTORS TO VOLUME 63

Numbers in parentheses indicate the pages on which the authors’ contributions begin.

E. CARLEMALM, Department of Microbiology, Biozentrum, University of Basel, CH-4056 Basel, Switzerland (269)
C. COLLIEX, Laboratoire de Physique des Solides, Université de Paris-Sud, F-91405 Orsay, France (269)
FRED E. GARDIOL, Laboratoire d’Electromagnétisme et d’Acoustique, Ecole Polytechnique Fédérale, CH-1007 Lausanne, Switzerland (139)
DONALD GREENSPAN, Department of Mathematics, The University of Texas at Arlington, Arlington, Texas 76019 (189)
E. KELLENBERGER, Department of Microbiology, Biozentrum, University of Basel, CH-4056 Basel, Switzerland (269)
H. TIMAN, Thomson-CSF Components Corporation, DuMont Division, Dover, New Jersey 07801 (73)
FRANCIS T. S. YU, Department of Electrical Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802 (1)
PREFACE

The spread of topics in this volume illustrates well the wide range of subjects that we attempt to cover in these Advances in Electronics and Electron Physics. I have mentioned repeatedly that chapters on image processing are more than welcome, and as Francis T. S. Yu’s review indicates, analog processing is by no means excluded. Two chapters are devoted to devices, which have always occupied a major place. Vacuum photoemitters are discussed by H. Timan, and open-ended waveguides are discussed by Fred E. Gardiol. The essay by Donald Greenspan will, I hope, help to propagate thinking in discrete terms, a very necessary complement these days to the traditional emphasis on analysis and continuous functions. Finally, we have a chapter on biological electron microscopy by E. Carlemalm, C. Colliex, and E. Kellenberger, which is appropriate here because the recently developed techniques for electron image processing are being applied almost exclusively to biological specimens.
Recent Advances in White-Light Image Processing
FRANCIS T. S. YU
Department of Electrical Engineering
The Pennsylvania State University
University Park, Pennsylvania
I. Introduction
II. White-Light Optical Image Processing
III. Coherence Requirement
   A. Propagation of Mutual Coherence Function
   B. General Formulation
   C. Coherence Requirement for Image Deblurring
IV. Coherence Measurement
   A. Measurement Technique
   B. Coherence Measurement
   C. Summary
V. Source Encoding, Image Sampling and Filtering
   A. Source Encoding
   B. Image Sampling and Filtering
   C. An Illustrative Application
VI. Advances in Image Processing
   A. Color Image Deblurring
   B. Color Image Subtraction
   C. Color Image Retrieval
   D. Pseudocolor Encoding
   E. Real-Time Pseudocolor Encoding
VII. Concluding Remarks
References
I. INTRODUCTION
Signal processing originated with a group of electrical engineers whose interest was mainly centered on electrical communication. Nonetheless, from the very beginning of the development of signal processing, its application to image processing has never been totally disregarded. With recent advances in image processing, the relationship between signal processing and image processing has become more profound.

Since its invention as a strong coherent source, the laser has become a fashionable tool for many scientific applications, particularly coherent optical image processing. However, coherent optical processing systems are usually plagued with coherent noise, which frequently limits their processing capability. As noted by the late Gabor, winner of the 1971 Nobel Prize in physics for his invention of holography, coherent noise is the number one enemy of modern optical image processing (1). Aside from the coherent noise, the coherent source is usually expensive, and the coherent processing environment is very stringent. For example, a heavy optical bench and a dust-free environment are generally required.

Recently, we have looked at optical processing from a different standpoint. A question arises: is it necessarily true that all optical image processing requires a coherent source? The answer is that many optical processing operations can be carried out with a white-light source (2). The basic advantages of white-light image processing are that (i) it is capable of suppressing coherent noise, (ii) the white-light source is usually inexpensive, (iii) the processing environment is not very demanding, (iv) the white-light system is relatively easy and economical to operate, and (v) the white-light processor is particularly suitable for color image processing.

The reader may ask: since the white-light system offers all these merits, why has it been ignored for so long? The answer is that it was generally accepted that an incoherent source cannot process the image in complex amplitude. However, no practical source is strictly incoherent, not even a white-light source. In fact, we were able to utilize the partial coherence of a white-light source to perform complex amplitude processing.
The proposed white-light processor is, on the one hand, capable of suppressing coherent noise like an incoherent processor; on the other hand, it is capable of processing the signal in complex amplitude like a coherent processor. There is, however, a different approach toward the application of the white-light processor. In coherent processing, virtually no one cares about the coherence requirement, since the laser provides a very strong coherent source. For white-light processing, however, an understanding of the coherence requirement is usually needed. Therefore, in white-light image processing we would first like to know the image processing operation. For example, is it a 1D or 2D processing operation? Is the image filtering a point or point-pair operation? What is the spatial bandwidth of the image signal? We would then be able to calculate the coherence requirements at the Fourier and input planes. From these calculated results, we would be able to utilize an image sampling function and a source encoding mask to meet these requirements. The objective of using an image sampling function is to achieve a higher degree of temporal coherence in the Fourier plane so that the image can be processed in complex amplitude. Source encoding serves to alleviate the shortcomings inherent in an extended source.

Although much optical image processing can be implemented by systems that use incoherent light (3-6), such systems have severe drawbacks of their own. The incoherent processing system is capable of reducing the inevitable artifact noise, but it generally introduces a dc bias buildup problem, which results in poor noise performance. Techniques have been developed for coherent operation with light of reduced coherence (7, 8); however, these techniques also possess severe limitations. Two approaches to reducing the temporal coherence requirement on the light source in optical image processing should be mentioned. One, the use of incoherent instead of coherent optical processing, has been pursued by Lowenthal and Chavel (9) and Lohmann (10), among others. The other, the reduction of coherence while still operating in the linear amplitude mode, has been pursued by Leith and Roth (11) and by Morris and George (12). The latter is the one that we believe holds the most promise, and it is this concept of complex amplitude processing that we shall discuss.

In the following section, we shall describe a technique that permits certain image processing operations (13, 14) to be carried out with a spectrally broadband light source (i.e., white light). This method is capable of performing image processing that obeys the concepts of coherent rather than incoherent optics. The proposed white-light image processing system is linear in complex amplitude rather than in intensity, as in the conventional incoherent processing system. In other words, the white-light processor operates in a partially coherent mode rather than an incoherent mode, and so the image can be processed in complex amplitude.
II. WHITE-LIGHT OPTICAL IMAGE PROCESSING

We shall now describe an optical image processing technique that can be carried out with a white-light source, as illustrated in Fig. 1. The white-light image processing system is similar to a coherent system, except for the use of a white-light source, a source encoding mask, a signal sampling grating, multispectral band filters, and achromatic transform lenses.

FIG. 1. A white-light optical image processor.

For example, if we place an image transparency $s(x, y)$ in contact with a sampling phase grating, the complex light field for every wavelength $\lambda$ at the Fourier plane $P_2$ would be (assuming a pinhole source encoding mask)

$$E(p, q; \lambda) = \iint s(x, y) \exp(i p_0 x) \exp[-i(p x + q y)]\, dx\, dy = S(p - p_0, q) \tag{1}$$
where the integral is over the spatial domain of the input plane $P_1$, $(p, q)$ denotes the angular spatial frequency coordinate system, $p_0$ is the angular spatial frequency of the sampling phase grating, and $S(p, q)$ is the Fourier spectrum of $s(x, y)$. If we write Eq. (1) in terms of the spatial coordinates $(\alpha, \beta)$, we have

$$E(\alpha, \beta; \lambda) = S[\alpha - (\lambda f/2\pi) p_0, \beta] \tag{2}$$

where $p = (2\pi/\lambda f)\alpha$, $q = (2\pi/\lambda f)\beta$, and $f$ is the focal length of the achromatic transform lens. Thus, we see that the Fourier spectra disperse into rainbow colors along the $\alpha$ axis, and each Fourier spectrum for a given wavelength $\lambda$ is centered at $\alpha = (\lambda f/2\pi) p_0$.

In image filtering, we assume that a set of narrow-spectral-band complex spatial filters is available. In practice, all images are spatial frequency limited; the spatial bandwidth of each spectral band filter $H(p_n, q_n)$ is therefore

$$H(p_n, q_n) = \begin{cases} H(p_n, q_n), & \alpha_2 \le \alpha \le \alpha_1 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

where $p_n = (2\pi/\lambda_n f)\alpha$, $q_n = (2\pi/\lambda_n f)\beta$, $\lambda_n$ is the main wavelength of the filter, and $\alpha_1 = (\lambda_n f/2\pi)(p_0 + \Delta p)$ and $\alpha_2 = (\lambda_n f/2\pi)(p_0 - \Delta p)$ are the upper and lower spatial limits of $H(p_n, q_n)$, respectively; $\Delta p$ is the spatial bandwidth of the input image $s(x, y)$. Since the limiting wavelengths of each $H(p_n, q_n)$ are

$$\lambda_h = \lambda_n \frac{p_0 + \Delta p}{p_0 - \Delta p} \quad \text{and} \quad \lambda_l = \lambda_n \frac{p_0 - \Delta p}{p_0 + \Delta p} \tag{4}$$

the spectral bandwidth of each $H(p_n, q_n)$ can be approximated by

$$\Delta\lambda_n = \lambda_n \frac{4 p_0 \Delta p}{p_0^2 - (\Delta p)^2} \simeq \frac{4 \Delta p}{p_0} \lambda_n \tag{5}$$
If we place this set of spectral band filters side by side, properly positioned over the smeared Fourier spectra, then the intensity distribution of the output light field can be shown to be

$$I(x, y) \simeq \sum_{n=1}^{N} \Delta\lambda_n \left| s(x, y; \lambda_n) * h(x, y; \lambda_n) \right|^2 \tag{6}$$

where $h(x, y; \lambda_n)$ is the spatial impulse response of $H(p_n, q_n)$ and $*$ denotes the convolution operation. Thus, the proposed white-light processor is capable of processing the image in complex amplitude. Since the output intensity is the sum of the mutually incoherent narrow-band spectral irradiances, the annoying coherent artifact noise can be suppressed. It is also apparent that, because the white-light source contains all the visible wavelengths, the white-light processor is very suitable for color image processing.

We have stated earlier that in white-light image processing we would approach the operation from a different standpoint. For example, if the image filtering is two-dimensional (e.g., a 2D correlation operation), we would synthesize a set of narrow-spectral-band filters, one for each $\lambda_n$, covering the entire smeared Fourier spectra, as illustrated in Fig. 2. On the other hand, if the image filtering is one-dimensional (e.g., deblurring due to linear motion), a broadband fan-shaped spatial filter can be utilized to accommodate the scale variation due to wavelength, as illustrated in Fig. 3. There is, however, a temporal coherence requirement imposed on the image filtering in the Fourier plane. Since the scale of the Fourier spectrum varies with wavelength, a temporal coherence requirement should be imposed on each narrow-spectral-band filter in the Fourier plane. Thus $H(p_n, q_n)$ is subject to the temporal coherence requirement, i.e.,

$$\frac{\Delta\lambda_n}{\lambda_n} = \frac{4 \Delta p}{p_0} \ll 1 \tag{7}$$
Thus, we see that a high degree of temporal coherence can be attained in the Fourier plane by simply increasing the spatial frequency of the sampling grating. Needless to say, the temporal coherence requirement of Eq. (7) also applies to the broadband fan-shaped filter of Fig. 3.

FIG. 2. A set of narrow-spectral-band filters.

FIG. 3. A broadband fan-shaped filter.

There is also a spatial coherence requirement imposed on the input plane of the white-light image processor. The spatial coherence function at the input plane can be shown to be (15, 16)

$$\Gamma(x_1 - x_2) = \int \gamma(x_0) \exp\left[-i \frac{2\pi}{\lambda f} x_0 (x_1 - x_2)\right] dx_0 \tag{8}$$
which is essentially the Van Cittert-Zernike theorem (17, 18), where $\gamma(x_0)$ denotes the intensity distribution of the source encoding function. From the above equation, we see that the spatial coherence and source encoding functions form a Fourier transform pair, i.e.,

$$\gamma(x_0) = \mathscr{F}^{-1}[\Gamma(x_1 - x_2)] \tag{9}$$

and

$$\Gamma(x_1 - x_2) = \mathscr{F}[\gamma(x_0)] \tag{10}$$

where $\mathscr{F}$ denotes the Fourier transformation. This Fourier transform pair implies that, if a spatial coherence function is calculated, then the source encoding function corresponding to it can be determined through the Fourier transformation of Eq. (9), and vice versa. We note that the source encoding function can consist of apertures of any shape or of a complicated gray-scale transmittance. However, in practice, the source encoding is only realizable if it is a positive real function, i.e.,

$$0 \le \gamma(x_0) \le 1 \tag{11}$$

In white-light image processing, we would search for a reduced spatial coherence function for the processing operation. From the calculated reduced spatial coherence function, a source encoding function that satisfies the physical realizability condition of Eq. (11) can be obtained. One of the basic objectives of source encoding is to compensate for the extended source, that is, to improve the utilization of light power so that the image processing can be carried out with a physical white-light source.
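The Fourier transform relation between the source encoding function and the spatial coherence function can be checked numerically. The sketch below is a minimal one-dimensional illustration under stated assumptions: the sign of the exponent, the normalization to unity at zero separation, the function names, and the slit example are all choices made here for illustration, and the source samples are assumed to be uniformly spaced.

```python
import numpy as np


def slit_source(width, n):
    """Hypothetical helper: midpoint samples of a uniform slit source of the
    given width, with transmittance 1 everywhere inside the slit."""
    step = width / n
    x0 = (np.arange(n) + 0.5) * step - width / 2.0
    return x0, np.ones(n)


def spatial_coherence(gamma, x0, separations, wavelength, f):
    """Normalized spatial coherence at the input plane for point-pair
    separations, computed as a Fourier sum over the source intensity
    gamma(x0). Assumes uniformly spaced x0 and a nonzero total intensity."""
    gamma = np.asarray(gamma, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    separations = np.asarray(separations, dtype=float)
    # Physical realizability: the encoding mask is a transmittance in [0, 1].
    if np.any(gamma < 0.0) or np.any(gamma > 1.0):
        raise ValueError("source encoding must satisfy 0 <= gamma <= 1")
    step = x0[1] - x0[0]
    kernel = np.exp(-2j * np.pi / (wavelength * f) * np.outer(separations, x0))
    coherence = kernel @ gamma * step
    return coherence / (gamma.sum() * step)
```

For a slit of width $w$ this reproduces the familiar sinc dependence, with the first zero of the coherence function at a separation of $\lambda f / w$. A narrower slit therefore gives a wider coherence region at the cost of light power, which is the trade-off source encoding is meant to manage.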
III. COHERENCE REQUIREMENT
So far, most optical image processing techniques have confined themselves to either extremely coherent or incoherent light. Surely we must expect a continuous transition between these two extreme limits. Such a transitional region exists and is known as the field of partial coherence.

The earliest investigation bearing on the subject of partial coherence may be the work of Verdet in 1865 (19), who studied the region of coherence for light from an extended source. It was, however, in 1890 that Michelson (20) first established the relationship between the visibility of interference fringes and the intensity distribution of an extended source. Mention must be made of a significant contribution to the theory of partial coherence due to Berek in 1926 (21). He utilized the concept of correlation in relation to microscope image formation. However, the most important development in our knowledge of partial coherence is due to the works of Van Cittert (17) in 1934 and Zernike (18) in 1938. They determined the degree of coherence for light disturbances at any two points on a screen illuminated by an extended light source. The Van Cittert-Zernike method was later simplified and applied to the study of image formation and resolving power by Hopkins (22, 23) in the 1950s. Nevertheless, it was Wolf's mutual coherence function that in 1957 (15) brought coherence theory into a broader scope of applications. This is simply because the mutual coherence function depicts the correlation properties of the light disturbances at any two points and obeys the general wave equation. Such properties of the mutual coherence function are adequate for the analysis of any optical experiment involving interference and diffraction phenomena.

Before investigating the behavior of a white-light optical image processor, it is necessary to establish a transformational relationship for the mutual coherence function (or correlation function), which determines the degree of coherence under partially coherent illumination. We shall use Wolf's theory (15) of partially coherent light to develop a transformational formula for the mutual coherence function propagating through an ideal thin lens. We then apply this transformational formula to derive an operational formula for the white-light optical image processor that we have proposed.

A. Propagation of Mutual Coherence Function

One of the most remarkable and useful properties of a converging lens is its inherent ability to perform the two-dimensional Fourier transformation of a complex image. For partially coherent illumination, we show that a four-dimensional Fourier transformation of the mutual coherence function can be established by an ideal thin lens. A monochromatic plane wave passing through a thin lens suffers a phase delay transformation such as (2)

$$T(\xi, \eta) = \exp(i k \eta \Delta_0) \exp[-i k (\xi^2 + \eta^2)/2f] \tag{12}$$
where the $\eta$ in the first exponential factor of Eq. (12) is the refractive index of the thin lens, $f$ is its focal length, $\Delta_0$ is its maximum thickness, $k = 2\pi/\lambda$, $\lambda$ is the wavelength, and $(\xi, \eta)$ is the spatial coordinate system of the thin lens. Let us now consider an object transparency inserted at a distance $d_0$ in front of the lens and illuminated by spatially partially coherent light. If the mutual coherence function at the object plane is $\Gamma_1(x_1, y_1; x_2, y_2)$, as depicted in Fig. 4, then the mutual coherence function at the output $(\alpha, \beta)$ plane can be evaluated.

FIG. 4. A partially coherent optical system.

Since the mutual coherence function at the front of the lens can be written as

$$\cdots \tag{13}$$

then the mutual coherence function immediately behind the lens is

$$\cdots \tag{14}$$

It is apparent that the mutual coherence function at the output plane can therefore be written as

$$\Gamma_4(\alpha_1, \beta_1; \alpha_2, \beta_2) = \cdots \tag{15}$$

By further substituting Eqs. (12) and (14) into Eq. (15) we have

$$\Gamma_4(\alpha_1, \beta_1; \alpha_2, \beta_2) = \cdots \tag{16}$$
10
We further note that
= exp[ - i k ( 2
.JYm
+:)I(; + do
1 -
1
exp{ik(i
j)]
:(
+ -&- ;)[tl
-
+
:)I(;
1
+
-;)I2}dtr
r2,
and similar formulas can also be written for q l , and v2 coordinates. Therefore the transformational formula for the mutual coherence function at the output plane is
r&l,81; a29 8 2 )
W
-m
where $C$ is an appropriate constant. This equation illustrates that the output mutual coherence function can be calculated by a four-dimensional integral. To this point we have neglected the finite extent of the lens aperture. Such an approximation is accurate if the distance $d_0$ is sufficiently small to place the input transparency deep within the region of Fresnel diffraction with respect to the lens aperture. This condition is satisfied in the vast majority of problems of interest, particularly in optical image processing. The limitation of the effective input object by the finite lens aperture is known as the vignetting effect. Vignetting in the input space can be minimized if the input object is placed close to the lens or if the lens aperture is much larger than the input object. In practice it is often preferred to place the object transparency directly against the lens to minimize vignetting.

There are, however, two special cases of Eq. (18) worth mentioning:

(1) For $d_1 = f$, the output plane is located at the back focal plane. Equation (18) reduces to

$$\Gamma_4(\alpha_1, \beta_1; \alpha_2, \beta_2) = C \exp\left\{ i \frac{k}{2f} \left(1 - \frac{d_0}{f}\right) \left[(\alpha_1^2 + \beta_1^2) - (\alpha_2^2 + \beta_2^2)\right] \right\} \iiiint_{-\infty}^{\infty} \Gamma_1(x_1, y_1; x_2, y_2) \exp\left\{ -i \frac{k}{f} \left[(\alpha_1 x_1 + \beta_1 y_1) - (\alpha_2 x_2 + \beta_2 y_2)\right] \right\} dx_1\, dy_1\, dx_2\, dy_2 \tag{19}$$

Thus we see that, except for a quadratic phase factor, the output mutual coherence function is essentially the Fourier transform of the input mutual coherence function.

(2) For $d_0 = d_1 = f$, the input and output planes are located at the front and back focal planes of the lens. The quadratic phase factor vanishes and Eq. (19) reduces to

$$\Gamma_4(\alpha_1, \beta_1; \alpha_2, \beta_2) = C \iiiint_{-\infty}^{\infty} \Gamma_1(x_1, y_1; x_2, y_2) \exp\left\{ -i \frac{k}{f} \left[(\alpha_1 x_1 + \beta_1 y_1) - (\alpha_2 x_2 + \beta_2 y_2)\right] \right\} dx_1\, dy_1\, dx_2\, dy_2 \tag{20}$$

which is exactly a four-dimensional Fourier transformation between the input and output mutual coherence functions.

In partially coherent optical processing (e.g., white-light processing) an extended incoherent source is usually used at the front focal plane of a collimating lens to illuminate the input object transparency. Thus the mutual coherence function at the source plane $(x_0, y_0)$ can be written as

$$\Gamma(x_0, y_0; x_0', y_0') = \gamma(x_0, y_0)\, \delta(x_0 - x_0', y_0 - y_0') \tag{21}$$
where $\gamma(x_0, y_0)$ is the intensity distribution of the incoherent light source and $(x_0, y_0)$ is the spatial coordinate system of the source plane. If we assume that the input object transparency is located at the back focal plane of the collimator, then the mutual intensity function at the input plane reduces to

$$\Gamma_1(x_1 - x_2, y_1 - y_2) = \iint_{-\infty}^{\infty} \gamma(x_0, y_0) \exp\left\{ -i \frac{k}{f} \left[x_0 (x_1 - x_2) + y_0 (y_1 - y_2)\right] \right\} dx_0\, dy_0 \tag{22}$$

which is essentially the Van Cittert-Zernike theorem (17, 18). It is also evident that the mutual coherence function at the input plane, illuminated by an extended incoherent source, is a space-invariant function.
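The focal-plane-to-focal-plane relation of Eq. (20) and the space invariance just noted can both be illustrated with a one-dimensional analogue, in which the mutual coherence function becomes a matrix and the four-dimensional transform becomes a double Fourier sum. The sketch below is an illustration under assumptions: it drops the $y$ and $\beta$ coordinates, sets the constant $C$ to the squared sample spacing, and uses a function name chosen here.

```python
import numpy as np


def propagate_mutual_coherence(gamma_in, x, alpha, wavelength, f):
    """1-D analogue of the focal-plane transform of the mutual coherence:
    G_out(a1, a2) = sum over x1, x2 of
        G_in(x1, x2) * exp{-i (k/f) (a1*x1 - a2*x2)} * dx**2.
    gamma_in is a (len(x), len(x)) matrix; x is uniformly spaced."""
    k = 2.0 * np.pi / wavelength
    dx = x[1] - x[0]
    # F[a, x] = exp(-i k a x / f); its conjugate transpose supplies the
    # +i (k/f) a2 x2 factor for the second pair of coordinates.
    F = np.exp(-1j * k / f * np.outer(alpha, x))
    return F @ gamma_in @ F.conj().T * dx ** 2
```

Two checks follow from this form. For a fully coherent input, $\Gamma(x_1, x_2) = u(x_1) u^*(x_2)$, the output factorizes into the product of the ordinary Fourier transform of $u$ and its conjugate. For a spatially incoherent input (a diagonal matrix), the output depends only on the difference $\alpha_1 - \alpha_2$, which is the space invariance stated by the Van Cittert-Zernike theorem.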
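B. General Formulation

We shall now develop a general formulation for the propagation of mutual coherence functions through a white-light optical image processor. With reference to the white-light image processor of Fig. 1, the mutual coherence function at the input plane $P_1$ due to the source irradiance $\gamma(x_0, y_0; \lambda)$ is

$$\Gamma(x_1, y_1; x_2, y_2; \lambda) = \iint \gamma(x_0, y_0; \lambda) \exp\left\{ -i \frac{2\pi}{\lambda f} \left[x_0 (x_1 - x_2) + y_0 (y_1 - y_2)\right] \right\} dx_0\, dy_0 \tag{23}$$

where the integration is over the source plane $P_0$. The mutual coherence function immediately behind the sampling phase grating can be written as

$$\Gamma'(x_1, y_1; x_2, y_2; \lambda) = \Gamma(x_1, y_1; x_2, y_2; \lambda)\, s(x_1, y_1)\, s^*(x_2, y_2) \exp(i 2\pi \nu_0 x_1) \exp(-i 2\pi \nu_0 x_2) \tag{24}$$

where the superscript asterisk denotes the complex conjugate and $\nu_0$ is the spatial frequency of the sampling grating. Similarly, the mutual coherence function at the Fourier plane $P_2$ can be written as

$$\Gamma(\alpha_1, \beta_1; \alpha_2, \beta_2; \lambda) = \iiiint \Gamma'(x_1, y_1; x_2, y_2; \lambda) \exp\left\{ -i \frac{2\pi}{\lambda f} \left[(\alpha_1 x_1 + \beta_1 y_1) - (\alpha_2 x_2 + \beta_2 y_2)\right] \right\} dx_1\, dy_1\, dx_2\, dy_2 \tag{25}$$

where the integration is over the input plane $P_1$. By substituting Eqs. (23) and (24) into Eq. (25), we have

$$\Gamma(\alpha_1, \beta_1; \alpha_2, \beta_2; \lambda) = \iint \gamma(x_0, y_0; \lambda)\, S(x_0 + \alpha_1 - \lambda f \nu_0, y_0 + \beta_1)\, S^*(x_0 + \alpha_2 - \lambda f \nu_0, y_0 + \beta_2)\, dx_0\, dy_0 \tag{26}$$

where the integration is over the source plane $P_0$, and $S(\alpha, \beta)$ is the Fourier spectrum of the input image $s(x, y)$. If we assume that a set of narrow-spectral-band spatial filters $H_n(\alpha, \beta)$ of Eq. (3) is inserted at the Fourier plane, as shown in Fig. 1, the mutual coherence function immediately behind each of the spectral band filters would be

$$\Gamma'(\alpha_1, \beta_1; \alpha_2, \beta_2; \lambda) = \Gamma(\alpha_1, \beta_1; \alpha_2, \beta_2; \lambda)\, H_n(\alpha_1, \beta_1)\, H_n^*(\alpha_2, \beta_2) \tag{27}$$

It is therefore evident that the mutual coherence function due to the $n$th spatial filter at the output plane $P_3$ would be

$$\Gamma(x_1', y_1'; x_2', y_2'; \lambda) = \iiiint \Gamma'(\alpha_1, \beta_1; \alpha_2, \beta_2; \lambda) \exp\left\{ -i \frac{2\pi}{\lambda f} \left[(\alpha_1 x_1' + \beta_1 y_1') - (\alpha_2 x_2' + \beta_2 y_2')\right] \right\} d\alpha_1\, d\beta_1\, d\alpha_2\, d\beta_2 \tag{28}$$

Since the interest is usually centered on the output irradiance, by letting $x_1' = x_2' = x'$ and $y_1' = y_2' = y'$ the output image irradiance due to the $n$th spectral band filter is

$$I_n(x', y'; \lambda) = \iint \gamma(x_0, y_0; \lambda) \left[ \iint S(x_0 + \alpha_1 - \lambda f \nu_0, y_0 + \beta_1)\, H_n(\alpha_1, \beta_1) \exp\left[-i \frac{2\pi}{\lambda f} (\alpha_1 x' + \beta_1 y')\right] d\alpha_1\, d\beta_1 \right] \left[ \iint S^*(x_0 + \alpha_2 - \lambda f \nu_0, y_0 + \beta_2)\, H_n^*(\alpha_2, \beta_2) \exp\left[i \frac{2\pi}{\lambda f} (\alpha_2 x' + \beta_2 y')\right] d\alpha_2\, d\beta_2 \right] dx_0\, dy_0, \quad \text{for } \lambda_{ln} \le \lambda \le \lambda_{hn} \tag{29}$$

where $\lambda_{ln}$ and $\lambda_{hn}$ are the lower and upper wavelength limits of $H_n(\alpha, \beta)$ given by Eq. (4). By changing the variables $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ to $(\alpha, \beta)$, Eq. (29) reduces to

$$I_n(x', y'; \lambda) = \iint \gamma(x_0, y_0; \lambda) \left| \iint S(x_0 + \alpha - \lambda f \nu_0, y_0 + \beta)\, H_n(\alpha, \beta) \exp\left[-i \frac{2\pi}{\lambda f} (\alpha x' + \beta y')\right] d\alpha\, d\beta \right|^2 dx_0\, dy_0, \quad \text{for } \lambda_{ln} \le \lambda \le \lambda_{hn} \tag{30}$$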
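Let us denote by $S(\lambda)$ the relative spectral intensity of the light source and by $C(\lambda)$ the relative spectral response of the output detector or recording material; the output image irradiance resulting from each narrow-spectral-band filter is then

$$I_n(x', y') = \int_{\lambda_n - \Delta\lambda_n/2}^{\lambda_n + \Delta\lambda_n/2} \iint \gamma(x_0, y_0; \lambda)\, S(\lambda)\, C(\lambda) \left| \iint S(x_0 + \alpha - \lambda f \nu_0, y_0 + \beta)\, H_n(\alpha, \beta) \exp\left[-i \frac{2\pi}{\lambda f} (\alpha x' + \beta y')\right] d\alpha\, d\beta \right|^2 dx_0\, dy_0\, d\lambda, \quad \text{for } n = 1, 2, \ldots, N \tag{31}$$

where $\lambda_n$ and $\Delta\lambda_n$ are the center wavelength and the bandwidth of the $n$th narrow-spectral-band filter $H_n(\alpha, \beta)$. Therefore the overall output image irradiance would be the incoherent addition of the image irradiances of Eq. (31), i.e.,

$$I(x', y') = \sum_{n=1}^{N} I_n(x', y') \tag{32}$$

where $N$ is the total number of spectral band filters. Furthermore, if the image processing is a one-dimensional operation, then a fan-shaped spatial filter of the type illustrated in Fig. 3 can be utilized; thus Eq. (32) can be reduced to the form

$$I(x', y') = \iiint \gamma(x_0, y_0; \lambda)\, S(\lambda)\, C(\lambda) \left| \iint S(x_0 + \alpha - \lambda f \nu_0, y_0 + \beta)\, H(\alpha, \beta) \exp\left[-i \frac{2\pi}{\lambda f} (x' \alpha + y' \beta)\right] d\alpha\, d\beta \right|^2 dx_0\, dy_0\, d\lambda \tag{33}$$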
Now we are in a position to ask a fundamental question: to what degree can the coherence requirements be relaxed without sacrificing the overall output image quality of the processing system? The answer is that the nature of the image processing operation governs the temporal and spatial coherence requirements necessary to obtain satisfactory results. We shall apply the general formulation of Eq. (32) or (33) to suit the specific need of the processing operation under consideration. In the following we discuss the temporal and spatial coherence requirements for one such need, image deblurring. The coherence requirements for other processing operations can be deduced by a similar approach. However, that lengthy illustration is beyond the scope of the present chapter, and we refer interested readers to the references (24-26).
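To make the structure of Eq. (33) concrete, the following one-dimensional sketch accumulates the output irradiance as an incoherent sum over wavelengths and source points, each term being the squared modulus of a filtered, transformed spectrum. It is an illustration under assumptions: the $y$ and $\beta$ coordinates are dropped, the spectral factors $S(\lambda) C(\lambda)$ are folded into a single weight per wavelength, the integrals are replaced by Riemann sums on a uniform grid, and the function names and the Gaussian test object are choices made here.

```python
import numpy as np


def gaussian_spectrum(alpha, wavelength, f, sigma):
    """Spectrum, written in Fourier-plane coordinates, of the assumed test
    object s(x) = exp(-x**2 / (2 sigma**2))."""
    q = 2.0 * np.pi * alpha / (wavelength * f)
    return sigma * np.sqrt(2.0 * np.pi) * np.exp(-0.5 * (q * sigma) ** 2)


def output_irradiance(spectrum, filt, x_out, alpha, wavelengths, f, nu0,
                      source_x0=(0.0,), source_gamma=(1.0,), weights=None):
    """Incoherent sum over wavelengths and source points of
    | sum_alpha S(x0 + alpha - lambda f nu0) H(alpha)
                exp(-i 2 pi alpha x' / (lambda f)) d_alpha |**2.
    spectrum(alpha, wavelength) and filt(alpha) are callables on arrays."""
    alpha = np.asarray(alpha, dtype=float)
    x_out = np.asarray(x_out, dtype=float)
    d_alpha = alpha[1] - alpha[0]
    if weights is None:
        weights = np.ones(len(wavelengths))
    h = filt(alpha)
    total = np.zeros(x_out.shape)
    for lam, w in zip(wavelengths, weights):
        kernel = np.exp(-2j * np.pi / (lam * f) * np.outer(x_out, alpha))
        for x0, g in zip(source_x0, source_gamma):
            shifted = spectrum(x0 + alpha - lam * f * nu0, lam)
            field = kernel @ (shifted * h) * d_alpha
            total += g * w * np.abs(field) ** 2  # irradiances add, not fields
    return total
```

With an all-pass filter and a single on-axis source point, each wavelength contributes $(\lambda f)^2 |s(x')|^2$, so the bands reinforce the same image even though their spectra sit at different positions $\lambda f \nu_0$ in the Fourier plane. Replacing the all-pass filter with a band-limited or inverse filter turns the same loop into a sketch of the filtering operations discussed in the text.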
C. Coherence Requirement for Image Deblurring
Since linear smeared-image deblurring is primarily a one-dimensional processing operation and the inverse filtering takes place at every image point, the spatial coherence requirement depends on the smeared length of the object. Thus, this optical image processing operation is limited by the source size and by the spectral bandwidth of each of the narrow-spectral-band filters $H_n(\alpha, \beta)$, that is, by the spatial and temporal coherence requirements.
1. Temporal Coherence Requirement

There is a temporal coherence requirement imposed on white-light image processing for photographic image deblurring. This temporal coherence requirement is a measure of the output deblurred image quality due to the spectral bandwidth of each narrow-spectral-band inverse filter $H_n(\alpha, \beta)$. For the analysis of the temporal coherence requirement, we assume that the white-light image processor utilizes a broad-spectral-band point source [i.e., $\gamma(x_0, y_0) = \delta(x_0, y_0)$] and that its spectral distribution is uniform throughout the spectral bandwidth [i.e., $S(\lambda) = K$]. Without loss of generality, we also assume that the relative spectral response of the output detector is uniform [i.e., $C(\lambda) = K$]. If the smeared length $W$ is known a priori, the output deblurred image irradiance of Eq. (32) can be written as
where the proportionality constant has been ignored and $\lambda_{ln}$ and $\lambda_{hn}$ are the lower and upper wavelength limits of the $n$th narrow-spectral-band filter $H_n(\alpha, \beta)$;

$$A_n(x', y'; \lambda) = \iint S(\alpha - \lambda f \nu_0, \beta)\, H_n(\alpha, \beta) \exp\!\left[-i\,\frac{2\pi}{\lambda f}(x'\alpha + y'\beta)\right] d\alpha\, d\beta, \quad \text{for } \lambda_{ln} \le \lambda \le \lambda_{hn}, \qquad (35)$$

is the complex light distribution of the deblurred image resulting from the $n$th narrow-spectral-band filter; and $S(\alpha - \lambda f \nu_0, \beta)$ is the blurred image spectrum. Thus it is apparent that the temporal coherence requirement for the image deblurring operation is set by the spectral bandwidth of each of the spectral band filters. If we denote by $\Delta\alpha_n$ the spatial width of the narrow-spectral-band filters, the spectral bandwidth $\Delta\lambda_n$ of $H_n(\alpha, \beta)$ can be written as
$$\Delta\lambda_n = \frac{\Delta\alpha_n}{f \nu_0} \qquad (36)$$

which is proportional to $\Delta\alpha_n$ and inversely proportional to $\nu_0$. It is now apparent that the temporal coherence (i.e., $\Delta\lambda_n$) of each $H_n(\alpha, \beta)$ is controlled by both the filter width $\Delta\alpha_n$ and the sampling grating frequency $\nu_0$. Each $H_n(\alpha, \beta)$ is designed to fit a specific wavelength $\lambda_n$, and the filter width $\Delta\alpha_n$ is determined by the spatial frequency content of the input image $s(x, y)$; thus the higher the spatial frequency content of the input image, the wider the filter $H_n(\alpha, \beta)$ must be. In other words, to maintain a higher degree of temporal coherence for a high-spatial-frequency image, a higher sampling grating frequency $\nu_0$ is needed. Since the image deblurring takes place at every image point, the image deblurring operation can be evaluated with respect to every blurred image point, i.e.,

$$s(x, y) = \mathrm{rect}(y/W) \qquad (37)$$

where $W$ denotes the smeared length, and

$$\mathrm{rect}(y/W) \triangleq \begin{cases} 1, & \text{for } |y| \le W/2 \\ 0, & \text{otherwise} \end{cases}$$

The corresponding Fourier spectrum for every $\lambda$ is

$$S(\alpha - \lambda f \nu_0, \beta) = \mathrm{sinc}\!\left[\frac{\pi W}{\lambda f}\,\beta\right] \qquad (38)$$

where $\mathrm{sinc}(x) \triangleq (\sin x)/x$.

Thus the $n$th-spectral-band deblurring filter is
where $\lambda_n$ denotes the central wavelength of the $n$th deblurring filter $H_n(\alpha, \beta)$. From Eq. (38), it becomes evident that smeared image deblurring is a one-dimensional processing operation. For simplicity, we adopt a 1D notation in the following. By substituting Eqs. (38) and (39) into Eq. (35), we have
for $\lambda_{ln} \le \lambda \le \lambda_{hn}$, $n = 1, 2, \ldots, N$ (40)
Equation (40) can be further reduced to the following form (24): 4f;i 2M (- I)”sinL nmA, . 2nmAn , sin-y sgn(y’), for Iy’l > W/2 LW m=l (41) Afl(y’;4= nmA, 2nmA,, 4f1, 2M -( - I)”C0ST cosfor (y’(5 W/2 A y’, n m=l ~
c
c
for $\lambda_{ln} \le \lambda \le \lambda_{hn}$, $n = 1, 2, \ldots, N$; $M$ is an arbitrarily large positive integer. By substituting Eq. (41) in Eq. (34), the output deblurred image irradiance can be shown as
where
Si(A’) 4
lo x “sin
FRANCIS T.S. YU = n(m - m’)(l - 2y’/ W )
a:,!,,( y’) = n(m - m’)( 1
+ 2y’/ W )
+ m’)(l - 2y’/W) a:,!,.( y’) = n(m + m’)( 1 + 2y’/ W ) &,!,,(y’) = n(m
and a:,!,,(y‘) = n[m(l - 2y’/W) - m‘(1
+ 2y‘/W)]
+ 2y’/W) - m’(1 - 2y’/W)] agA,(y’) = n [m( 1 - 2y’/ W ) + m’( 1 + 2y‘/ W)] agL,(y’) = n[m(l + 2y‘/W) + m’(1 - 2y’/W)] ac,!,,(y’)= n[m(l
Equations (42) and (43) provide the mathematical basis for the evaluation of the temporal coherence requirement of smeared image deblurring with a white-light processor. Plots of the normalized deblurred image irradiance, as a function of $y'$ and the bandwidth $\Delta\lambda_n$, are shown in Fig. 5; $\lambda_n = 5461$ Å was used for the calculation. It is evident that if the light source is strictly coherent (i.e., $\Delta\lambda_n = 0$), the deblurred point image is infinitesimal. Also
FIG. 5. Output intensity distribution of the deblurred image: $\Delta\lambda_n$, spectral bandwidth of the narrow-spectral-band deblurring filter $H_n(\alpha, \beta)$; $W$, smeared length. [Panel label: $W = 1.0$.]
FIG. 6. Plots of the deblurring width $\Delta W$ as a function of the spectral bandwidth $\Delta\lambda_n$ of the narrow-spectral-band deblurring filter for various values of smeared length. [Horizontal axis: spectral bandwidth $\Delta\lambda_n$ (Å); panel label: $W = 2.0$ mm.]
TABLE I
EFFECT OF TEMPORAL COHERENCE REQUIREMENT

ΔW/W          1/20    1/15    1/10    1/8     1/5
Δλ_n (Å)      270     400     640     750     990
note that the degree of deblurring decreases as the temporal coherence of the deblurring filter is reduced. The deblurred length $\Delta W$ represents the spread of the deblurred image irradiance. It is formally defined as the separation between the 10% points of $I(y')$. The deblurred length $\Delta W$ can be shown to increase monotonically with $\Delta\lambda_n$, so that it shrinks as the bandwidth is narrowed (see Table I). Plots of $\Delta W$ as a function of the spectral bandwidth $\Delta\lambda_n$ for various values of smeared length $W$ are shown in Fig. 6. It may be shown that the deblurred length $\Delta W$ is linearly proportional to the smear length $W$ for a given value of $\Delta\lambda_n$. The greater the smear, the more difficult the deblurring process. In principle this may be corrected by decreasing the spectral bandwidth of each of the deblurring filters $H_n(\alpha, \beta)$. Table I numerically summarizes the preceding analysis. The value of $\Delta\lambda_n$ can be regarded as the temporal coherence requirement for each $H_n(\alpha, \beta)$ for the deblurring process.

2. Spatial Coherence Requirement

The relationship between the source size and the image intensity distribution is the key factor in the calculation of the degree of spatial coherence requirement for a white-light image deblurring system. For simplicity in notation, we again use a 1D representation in the following:
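Assume the intensity distribution of the light source is given by

$$\gamma(x_0, y_0) = \mathrm{rect}(y_0/\Delta s)\, \delta(x_0) \qquad (44)$$

where $\delta(x_0)$ is the Dirac delta function and

$$\mathrm{rect}(y_0/\Delta s) \triangleq \begin{cases} 1, & |y_0| \le \Delta s/2 \\ 0, & \text{otherwise} \end{cases}$$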
If an extended monochromatic light source is used, the output intensity distribution of Eq. (31) becomes
where
and the asterisk denotes the convolution operation. Equation (45) can be reduced to the form of
By substituting Eq. (46) into Eq. (34), the deblurred image irradiance is found to be 16f3A3 Ib')(y',As) = ~ ~ ~ { @ ~ , ! , r ( A s ) c o s [ 2-nm')y'/W] (m Wn m m '
+ @!,$(As)cos[2z(m
+ m')y'/W]},
for Iy'l > W/2
(47)
and 161.:f3 I~')(y', As) = T x x { @ g , ! , , ( A s ) Wn m m '
+ @:,!,(As)
sin[2n(m' - m)y'/W]
+ @$,!,,(As) cos[2n(m' m)y'/ W] + Oc,!,,(As)cos[2n(rn + m')y'/W]}, -
for Jy'J 5 W/2
(48)
where - [( - l)"Cj(m,
As) - ( - l)""Cj(m', As)]
- [( - I)"Cj(m,
As) - ( - 1)"'Cj(m', As)]
@?,!,,(AS) = ( - I)"
+ (-1)"
m [C: (m', As) m2 - m'
m
2
+ C; (m', As)]
+ @(As) + [Cj(rn,As) - Cj(m', As)] - [Cj(m,As)
+ Cj(m', As)]
Sj(m,As) Li Si(2z[@(As) - m)]) - Si{2n[-@(As) Cj(m,As) Si(u) 4
Ci[2nl@(As)
lo7 sin v
dv
+ ml]
-
-
Ci[2nlC(As) - mll
ml)
22 Ci(u) A C(As)
1;
dv
ASW/2Lf
+ m]} f I)[-C(As) + m]}
S:(m,As) a Si{n(2y’/W f I)[C(As) -Si(n(2y’/W
and
C:(m,As) a Ci[n(2y’/W f IIIC(As) + mi] -Ci[n12y’/W f III-C(As)
+ ml]
Thus the spatial coherence requirement of an image deblurring process may be evaluated by using Eqs. (47) and (48). The output intensity distribution of this process is plotted in Fig. 7. With reference to the definition of the deblurred length $\Delta W$ stated previously (i.e., the separation between the 10% points of the output image irradiance), Fig. 8 shows the plots of $\Delta W$ as a function of the source size $\Delta s$ and the smear length $W$. From this figure it can be seen that when the spatial width of the light source increases beyond a
FIG. 7. Output intensity distribution of the deblurred image for various values of the source size $\Delta s$. [Horizontal axis: $y'$ (mm).]
FIG. 8. Plots of the deblurring width as a function of the source size for various values of the smear length $W$. [Horizontal axis: source size $\Delta s$ (mm), with the critical size $\Delta s_c$ marked.]

TABLE II
EFFECT OF SPATIAL COHERENCE REQUIREMENT

              Δs (mm) for ΔW/W =
W (mm)        1/20    1/15    1/10    1/5
0.5           0.2     0.38    0.6     0.92
1             0.1     0.18    0.26    0.40
2             0.05    0.08    0.12    0.18
critical size $\Delta s_c$, the deblurred length becomes independent of $\Delta s$ and equal to $W$. Table II provides a brief numerical summary of the key parameters for the determination of the spatial coherence in image deblurring. From this table it is evident that the source size permitted by the spatial coherence requirement is approximately inversely proportional to the smear length $W$. That is, the longer the smear length, the smaller the source size required. We stress that, in principle, the coherence requirement for any specific white-light image processing operation can be evaluated by the general formulation of Eq. (31) or (33). We refer interested readers to refs. 24-26 for the evaluation of the coherence requirement for image subtraction and image correlation.
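As a rough numerical check of the inverse proportionality read from Table II, the sketch below multiplies each tabulated source size by its smear length. The table values are transcribed from Table II as reconstructed here (an assumption, since the printed table layout had to be inferred), and the function name is hypothetical. A constant product along a row would mean exact inverse proportionality.

```python
# Source sizes (mm) transcribed from Table II, keyed by smear length W (mm)
# and by the tolerated relative deblurred width dW/W. The transcription is an
# assumption based on the reconstructed table layout.
TABLE_II = {
    0.5: {"1/20": 0.20, "1/15": 0.38, "1/10": 0.60, "1/5": 0.92},
    1.0: {"1/20": 0.10, "1/15": 0.18, "1/10": 0.26, "1/5": 0.40},
    2.0: {"1/20": 0.05, "1/15": 0.08, "1/10": 0.12, "1/5": 0.18},
}


def source_size_times_smear(table):
    """Return {ratio: [W * ds for W in ascending order]}.

    If ds were exactly proportional to 1/W, each list would be constant.
    """
    ratios = list(next(iter(table.values())).keys())
    return {
        ratio: [round(w * row[ratio], 3) for w, row in sorted(table.items())]
        for ratio in ratios
    }


products = source_size_times_smear(TABLE_II)
for ratio, values in products.items():
    print(ratio, values)
```

The product is constant for the tightest tolerance (ΔW/W = 1/20) and drifts downward with W for the looser ones, so the inverse proportionality is best read as an approximate trend.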
IV. COHERENCE MEASUREMENT
In the 1930s Van Cittert (17) and Zernike (18) predicted a profound relationship between the spatial coherence and the intensity distribution of the light source. However, it was the work of Thompson and Wolf (27, 28) that demonstrated a two-beam interference technique to measure the degree of partial coherence. They showed that, under quasimonochromatic illumination, the degree of spatial coherence depends on the source size and the distance between two arbitrary points, whereas the degree of temporal coherence depends on the spectral bandwidth of the light source. They also illustrated several coherence measurements that are consistent with the Van Cittert-Zernike predictions. In this section we describe a dual-beam technique for the coherence measurement of a white-light image processor (29). We shall show that a high degree of coherence can be obtained in the spatial frequency plane, so that the image can be processed in complex amplitude with the entire spectral band of a white-light source.

A. Measurement Technique

We shall now describe a visibility measurement to determine the degree of coherence in the $\beta$ direction at the Fourier plane of the white-light image processor of Fig. 1. For simplicity, we shall utilize a narrow slit as a one-dimensional object at the input plane, as shown in Fig. 9a. The complex light distribution at the Fourier plane can be written as
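$$E(\alpha, \beta; \lambda) = C \left\{ \mathrm{sinc}\!\left[\frac{\pi D}{\lambda f}\,\beta\right] \right\} * \delta\!\left[\alpha - \frac{\lambda f}{2\pi}\,p_0,\ \beta\right] \qquad (49)$$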
where $C$ represents an appropriate complex constant, $D$ is the slit width, $f$ is the focal length of the achromatic transform lens, $p_0$ is the angular spatial frequency of the phase sampling grating, and the asterisk denotes the convolution operation. It is clear that Eq. (49) describes a fan-shaped smeared Fourier spectrum of the input slit for which the scale of the sinc factor increases with the wavelength $\lambda$ and decreases as the size of the object $D$ (i.e., slit width) increases. To increase the efficiency of the visibility measurement along the $\beta$ direction, we would use a pair of slanted narrow slits at the smeared Fourier spectra, as
FIG. 9. Visibility measurement along the $\beta$ direction: (a) input plane and (b) Fourier plane. $D$, input slit width; $d$, mean slit separation; $\lambda_0$, mean wavelength of the light source; $p_0$, angular spatial frequency of the sampling grating.
illustrated in Fig. 9b. The inclination angle of this pair of slits should be adjusted with the separation of the slits as

$$\theta = \tan^{-1}(2\pi d/\lambda_0 p_0) \qquad (50)$$

where $d$ is the mean separation of the slits, $\lambda_0$ is the mean wavelength of the light source, and $p_0$ is the angular spatial frequency of the sampling grating. The transfer function of this pair of slanted slits can be written as
[ (p $!)
H ( a , p) = 6
-
+ 6 (p +
$-)I
* b( u - ;Po
1
P)
(5 1)
The output intensity distribution can be written as I(x’, y ‘ ) = K[1
+ COS(~X~X’)]
(52) where K is an appropriate proportionality constant. Equation (52) represents an achromatic fringe pattern of the entire spectral band of the light source. Strictly speaking, all practical white-light sources are extended sources. For simplicity of illustration, we assume that an extended square source is used. Thus the output intensity distribution can be written as
m
+ y’p)
1 l2
du dp dx, d y , d A
(53)
where $H(\alpha, \beta)$ is given by Eq. (51),
and $a$ is the dimension of the extended light source. From this equation, we see that the visibility (i.e., degree of coherence) is dependent upon the source size $a$, the object size $D$ (i.e., slit width), and the angular spatial frequency $p_0$ of the sampling grating. To investigate the degree of coherence variation in the $\alpha$ direction, again we insert a narrow slit aperture as an input object but parallel to the sampling direction of the sampling grating, as shown in Fig. 10a. The smeared Fourier spectra can be written as

$$E(\alpha, \beta; \lambda) = \mathrm{sinc}\!\left[\frac{\pi D}{\lambda f}\,\alpha\right] * \delta\!\left[\alpha - \frac{\lambda f}{2\pi}\,p_0,\ \beta\right], \qquad (54)$$
which describes narrow smeared rainbow color spectra along the $\alpha$ axis. For the visibility measurement, a pair of narrow slits is inserted at the Fourier plane perpendicular to the $\alpha$ axis and centered at $\alpha = \lambda_0 f p_0/2\pi$, as shown in Fig. 10b, where $\lambda_0$ is the center wavelength of the light source and $f$ is the focal length. The transfer function of this pair of slits can be written as
(
2:) ( 2:)
H(a,P)=h a----Po
+ 6 a+---p0
(55)
where h is the separation between the slits. Again with the assumption of a
FIG. 10. Visibility measurement along the $\alpha$ direction: (a) input plane and (b) Fourier plane. $D$, input slit width; $h$, slit separation; $\alpha_0$, center location of the pair of narrow slits; $\lambda_0$, mean wavelength of the light source; $p_0$, angular spatial frequency of the sampling grating.
square light source, the output intensity distribution can be shown as
where $H(\alpha, \beta)$ is defined by Eq. (55). From this equation, again we see that the degree of coherence in the $\alpha$ direction is dependent on the source size $a$, the object size $D$, and the angular spatial frequency $p_0$ of the sampling grating.

B. Coherence Measurement
The optical setup for the measurement of the degree of coherence at the Fourier plane of a white-light optical image processor is shown in Fig. 11. This setup utilizes the dual-beam interference technique of Thompson and Wolf for coherence measurement. The output interference fringe pattern can be traced by a linear scanning photometer and displayed on an oscilloscope for the visibility measurement. In the experiment, the photometer is made by mounting a photomultiplier on the top of a motor-driven linear translator. Since it is a one-dimensional fringe pattern, a narrow slit can be utilized at the input end of the photomultiplier for the visibility measurement, i.e.,

$$V = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}} \qquad (57)$$

where $I_{\max}$ and $I_{\min}$ are the maximum and minimum irradiances of the fringes. The visibility of the fringes is in fact a measure of the degree of coherence (2, 15), i.e.,

$$V = |\gamma| \qquad (58)$$

where $\gamma$ is the complex degree of coherence. We shall, in the following, illustrate the visibility (i.e., coherence) measurement in the $\beta$ and in the $\alpha$ directions at the Fourier plane.

1. β-Direction Coherence Measurement

We shall now describe the coherence measurement in the $\beta$ direction, as illustrated in Fig. 11. We shall show that the visibility varies as a function of the mean separation $d$ of the pair of slanted slits, the source size $a$, the object size $D$ (i.e., slit width), and the spatial frequency of the sampling grating. Figure 12
FIG. 11. An optical setup for the coherence measurement along the $\beta$ direction: $\gamma(x_0, y_0)$, source encoding mask; $a$, source size; $D$, input slit size; L, achromatic transform lenses; $H(\alpha, \beta)$, pair of slanted slits; MD, photometer; OSC, oscilloscope.
shows the variation of the degree of coherence as a function of mean separation $d$ for various values of source size $a$. From this figure we see that the degree of coherence decreases as the separation $d$ increases. A further increase in $d$ causes the coherence to reappear, and a still further increase in $d$ causes repeated fluctuations of the visibility. In this figure, we also see that the degree of coherence increases as the source size $a$ decreases. There is an interesting phenomenon in this coherence measurement: as the source size $a$ decreases further, the reappearing visibility lobes grow, while the overall irradiance of the smeared Fourier spectra decreases. This phenomenon is primarily due to the finite object size under uniform source size illumination. Furthermore, if the source size $a$ is increased further, for example beyond 1.0 mm in Fig. 12, the decrease in coherence due to the source size is not apparent. This is primarily due to a comparably broad object size $D$ (e.g., $D = 0.6$ mm) as compared with the narrower sinc factor derived from the source size $a$. Nevertheless, if the input slit size is decreased further, the changes in degree of coherence for large source sizes can be seen. We shall now provide a set of output fringe patterns with a set of normalized photometer traces, as shown in Fig. 13. The fringe patterns of
FIG. 12. Plots of degree of coherence $|\gamma|$ along the $\beta$ direction as a function of mean slit separation $d$ for various values of source size $a$: $f_\beta = d/\lambda_0 f$, mean slit separation in spatial frequency. ($D = 0.6$ mm; $p_0 = 136$ lines/mm; $f = 381$ mm; $\lambda_0 = 5461$ Å.)
FIG. 13. Samples of fringe visibility patterns at the output plane. The upper portions of (a)-(f) show the fringe visibility patterns. The lower portions show the corresponding photometer traces. In these experiments, the fringe visibility and the spatial frequency were varied by changing the mean separation distance between the two slanted slits. (a)-(c) were obtained at points a, b, and c on the first lobe of Fig. 12. These figures show the decrease in fringe visibility and corresponding increase in spatial frequency as the separation $d$ increases. (d) and (e) were obtained at points d and e on the second lobe of Fig. 12. These two figures also show the increase of spatial frequency as $d$ further increases. (f) was obtained at point f on the third lobe of Fig. 12.
Figs. 13a-c were taken at visibilities of 0.88, 0.68, and 0.08, corresponding to mean separations $d = 0.36$, 0.72, and 1.44 mm at points a, b, and c indicated on the main lobe of the plot for $a = 0.4$ mm in Fig. 12. Figures 13d and e were taken on the second lobe of the visibility reappearance, which corresponds to points d and e in Fig. 12. The degrees of coherence at these two points are 0.42 and 0.15, respectively, and the corresponding mean slit separations are 2.16 and 2.52 mm. Figure 13f was taken on the third-lobe visibility at point f. The degree of coherence is 0.3 and the mean slit separation is 2.88 mm. Let us now investigate the degree of coherence as a function of mean separation $d$ for various input object sizes $D$ (i.e., slit widths), as plotted in Fig. 14. Again, we see that the degree of coherence decreases as $d$ increases. Further increase in $d$ also causes side lobes to reappear. In this figure, we also see that the degree of coherence increases as object size $D$ decreases. Finally, Fig. 15 shows the visibility measure as a function of mean separation $d$ for two values of sampling grating frequency. From this figure, we see that the degree of coherence is dramatically improved in the Fourier plane by the insertion of the sampling grating.

2. α-Direction Coherence Measurement

We shall now measure the degree of coherence in the $\alpha$ direction at the Fourier plane. The measurement technique is essentially
FIG. 14. Plots of degree of coherence along the $\beta$ direction as a function of mean slit separation $d$ for various values of input slit width $D$. ($a = 0.4$ mm; $p_0 = 136$ lines/mm; $f = 381$ mm; $\lambda_0 = 5461$ Å.)
FIG. 15. Plots of degree of coherence along the $\beta$ direction as a function of mean slit separation $d$ for two values of sampling grating frequency $p_0$. ($a = 0.4$ mm; $D = 0.6$ mm; $f = 381$ mm; $\lambda_0 = 5461$ Å.)
identical to that of Fig. 11, except that the input object is replaced by a pair of horizontal slits, as shown in Fig. 10. In the coherence measurement, we centered the pair of slits at the center of the smeared Fourier spectra, a point corresponding to $\lambda = 5461$ Å. Figure 16 shows plots of degree of coherence as a function of slit separation $h$ for various values of source size $a$. From this figure, we see that the degree of coherence decreases as $h$ increases. Further increase in $h$ again causes the reappearance of the visibility side lobes. However, the degree of coherence is generally not affected by the variation of the source size $a$. Figure 17 shows a set of the visibility fringe patterns that we have obtained at the output image plane. This set of pictures was taken at points a, b, c, and d as shown in Fig. 16. The corresponding degrees of coherence are 0.76, 0.50, 0.14, and 0.05. The respective separations are $h = 0.36$, 0.72, 1.44, and 2.16 mm. We shall now plot the degree of coherence as a function of $h$ for various values of object size $D$ (i.e., slit width), as shown in Fig. 18. From these plots, we see that the degree of coherence decreases as the object size $D$ increases. Figure 19 shows the variation of coherence, due to spatial frequency of the sampling grating, as a function of slit separation $h$. From this figure, we see
FIG. 16. Plots of degree of coherence $|\gamma|$ along the $\alpha$ direction as a function of slit separation $h$ for various values of source size $a$: $f_\alpha = h/\lambda_0 f$, corresponding slit separation in spatial frequency. ($D = 0.6$ mm; $p_0 = 136$ lines/mm; $f = 381$ mm; $\lambda_0 = 5461$ Å.)
FIG. 17. Samples of fringe visibility patterns. The upper portions of (a)-(d) show fringe visibility patterns; the lower portions show the corresponding intensity profiles. (a) and (b) were obtained at points a and b on the first lobe of Fig. 16. (c) was obtained at point c on the second lobe of Fig. 16, and (d) at point d on the third lobe of Fig. 16.
FIG. 18. Plots of degree of coherence $|\gamma|$ along the $\alpha$ direction as a function of slit separation $h$ for various values of object size $D$. ($a = 0.4$ mm; $p_0 = 136$ lines/mm; $f = 381$ mm; $\lambda_0 = 5461$ Å.)
FIG. 19. Plots of degree of coherence along the $\alpha$ direction as a function of slit separation $h$ for two values of sampling grating frequency $p_0$. ($a = 0.4$ mm; $D = 0.6$ mm; $f = 381$ mm; $\lambda_0 = 5461$ Å.)
that a higher degree of coherence is achievable with the insertion of a high-spatial-frequency grating at the input plane. Let us now briefly discuss the overall effect of coherence in the $(\alpha, \beta)$ spatial frequency plane. By comparing the visibility measurements of Figs. 12 and 16, we see that the degree of coherence increases substantially in the $\beta$ direction as the source size decreases. There is, however, no significant improvement in the $\alpha$ direction for smaller source sizes. Although both cases show an increase in coherence for smaller object sizes, the increase is larger in the $\beta$ direction than in the $\alpha$ direction, as shown in Figs. 14 and 18. With reference to the plots of Figs. 15 and 19, both cases show significant improvement in the degree of coherence with the insertion of a sampling grating. However, the improvement in coherence in the $\beta$ direction is somewhat higher than in the $\alpha$ direction. This is primarily due to overlapping of the smeared rainbow Fourier spectra.
C. Summary
We have devised a dual-beam interference technique to measure the degree of coherence in the Fourier plane of a white-light optical image processor. The effects of coherence variation due to source size, input object size, and spatial frequency of the sampling grating are plotted as a function of distance in the $\beta$ and $\alpha$ directions of the Fourier plane. We have shown that the degree of coherence increases as the spatial frequency of the sampling grating increases. Although the improvement in degree of coherence at the Fourier plane is quite evident, the improvement in the $\beta$ direction (i.e., the direction perpendicular to the light dispersion) is somewhat more effective than in the $\alpha$ direction. The results indicate that this white-light optical image processing technique is somewhat more effective in the $\beta$ direction than in the $\alpha$ direction. Nevertheless, the existence of the high degree of coherence in the Fourier plane allows us to process the image in complex amplitude rather than in intensity. The white-light processing technique is also very suitable for color image processing, as will be shown in Section VI.
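The visibility measurement summarized above can be sketched numerically. The example below is only an illustration under stated assumptions: the photometer trace is synthetic, modeled as $K[1 + |\gamma| \cos(\cdot)]$ with a hypothetical mean level and fringe count, and the function names are not from the original experiment. It estimates the fringe visibility as $(I_{\max} - I_{\min})/(I_{\max} + I_{\min})$, which for such a trace equals $|\gamma|$.

```python
import math


def fringe_visibility(irradiance):
    """Visibility (I_max - I_min) / (I_max + I_min) of a sampled fringe trace."""
    i_max = max(irradiance)
    i_min = min(irradiance)
    return (i_max - i_min) / (i_max + i_min)


def synthetic_trace(gamma_abs, fringes=5, samples=2001, mean_level=3.0):
    """Hypothetical noiseless photometer trace K * [1 + |gamma| * cos(phase)]."""
    return [
        mean_level
        * (1.0 + gamma_abs * math.cos(2.0 * math.pi * fringes * k / (samples - 1)))
        for k in range(samples)
    ]


trace = synthetic_trace(gamma_abs=0.68)   # 0.68 is one of the measured values quoted above
visibility = fringe_visibility(trace)
print(f"estimated visibility = {visibility:.3f}")
```

With a real photometer trace, noise biases the raw maximum and minimum, so some smoothing or averaging over several fringes would normally precede this estimate.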
V. SOURCE ENCODING, IMAGE SAMPLING AND FILTERING
In this section, we shall describe a linear transformation relationship between the spatial coherence function and the source encoding intensity transmittance function. Since the spatial coherence requirement depends upon the image processing operation, a more relaxed spatial coherence function
may be used for a specific image processing operation. By Fourier transforming this spatial coherence function, a source encoding intensity transmittance function may be found. The purpose of source encoding is to relax the coherence requirement, so that an extended white-light source can be used for the image processing. In other words, the source encoding technique is capable of generating an appropriate spatial coherence function for a specific image processing application and at the same time it utilizes the available light power more effectively. We shall provide an example to demonstrate that complex image processing can actually be carried out with an encoded extended white-light source.

A. Source Encoding

We shall begin our discussion with Young's experiment under extended incoherent source illumination, as shown in Fig. 20. First, we assume that a narrow slit is placed at plane $P_0$ behind an extended source. To maintain a high degree of spatial coherence between the slits $Q_1$ and $Q_2$ at $P_1$, the source size should be very narrow. If the separation between $Q_1$ and $Q_2$ is large, then a narrower slit size $S_1$ is required. Thus, to maintain a high degree of spatial coherence between $Q_1$ and $Q_2$, the slit width should be

$$w \le \frac{\lambda R}{2h_0} \qquad (59)$$

FIG. 20. Young's experiment with extended source illumination.
where $R$ is the distance between planes $P_0$ and $P_1$, and $2h_0$ is the separation between $Q_1$ and $Q_2$. Let us now consider two narrow slits $S_1$ and $S_2$ located in source plane $P_0$. We assume that the separation between $S_1$ and $S_2$ satisfies the following path length relation:

$$r_1' - r_2' = (r_1 - r_2) + m\lambda \qquad (60)$$
where the $r$ are the respective distances from $S_1$ and $S_2$ to $Q_1$ and $Q_2$ as shown in the figure, $m$ is an arbitrary integer, and $\lambda$ is the wavelength of the extended source. Then the interference fringes due to each of the two source slits $S_1$ and $S_2$ should be in phase, and a brighter fringe pattern can be seen at plane $P_2$. To further increase the intensity of the fringe pattern, one would simply increase the number of source slits at appropriate locations in the source plane $P_0$ such that every separation between slits satisfies the coherence or fringe condition of Eq. (60). If the separation $R$ is large, i.e., $R \gg d$ and $R \gg 2h_0$, then the spacing $d$ between the source slits becomes

$$d = m\,\frac{\lambda R}{2h_0} \qquad (61)$$
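To give a feeling for the magnitudes in Eq. (61), the following sketch evaluates the slit spacing $d = m\lambda R/2h_0$ and lays out a small set of equally spaced source slits. The distance $R = 300$ mm and the separation $2h_0 = 5$ mm are hypothetical values chosen for illustration, and the function names are assumptions, not part of the original text.

```python
def source_slit_spacing(wavelength_mm, distance_mm, separation_2h0_mm, order=1):
    """Spacing d = m * lambda * R / (2 h0) between in-phase source slits, Eq. (61)."""
    return order * wavelength_mm * distance_mm / separation_2h0_mm


def source_slit_positions(n_slits, spacing_mm):
    """Centers of n_slits equally spaced slits, symmetric about the optical axis."""
    offset = (n_slits - 1) / 2.0
    return [(k - offset) * spacing_mm for k in range(n_slits)]


wavelength_mm = 5461e-7    # 5461 Å expressed in mm
distance_mm = 300.0        # R: hypothetical separation of planes P0 and P1
separation_mm = 5.0        # 2 h0: hypothetical separation of Q1 and Q2

d = source_slit_spacing(wavelength_mm, distance_mm, separation_mm)
positions = source_slit_positions(5, d)
print(f"d = {d * 1e3:.2f} micrometers")
print([round(p * 1e3, 2) for p in positions])
```

Under these assumed values the spacing is a few tens of micrometers, which indicates the scale at which an encoding mask of this kind would have to be fabricated.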
From the above illustration, we see that by properly encoding an extended source, it is possible to maintain the spatial coherence between $Q_1$ and $Q_2$ and at the same time increase the intensity of illumination. Thus the use of a specific source encoding technique for a given image processing operation may result in better utilization of an extended source. To encode an extended source, we would first search for a spatial coherence function for an information processing operation. With reference to the extended-source optical image processor of Fig. 1, the spatial coherence function at input plane $P_1$ can be written as (31)

$$\Gamma(x_1, x_1') = \iint \gamma(x_0)\, K(x_0, x_1)\, K^*(x_0, x_1')\, dx_0 \qquad (62)$$
where the integration is over the source plane $P_0$, $\gamma(x_0)$ is the intensity transmittance function of a source encoding mask, and $K(x_0, x_1)$ is the transmittance function between source plane $P_0$ and input plane $P_1$, which can be written

$$K(x_0, x_1) = \exp\!\left[i 2\pi \frac{x_0 x_1}{\lambda f}\right] \qquad (63)$$

By substituting $K(x_0, x_1)$ into Eq. (62), we have

$$\Gamma(x_1 - x_1') = \iint \gamma(x_0) \exp\!\left[i 2\pi \frac{x_0}{\lambda f}(x_1 - x_1')\right] dx_0 \qquad (64)$$

From Eq. (64), we see that the spatial coherence function and source encoding
intensity transmittance functions form a Fourier transform pair

$$\gamma(x_0) = \mathcal{F}[\Gamma(x_1 - x_1')] \qquad (65)$$

and

$$\Gamma(x_1 - x_1') = \mathcal{F}^{-1}[\gamma(x_0)] \qquad (66)$$

where $\mathcal{F}$ denotes the Fourier transformation operation. We note that the Fourier transform relationship of Eqs. (65) and (66) is also known as the Van Cittert-Zernike theorem. Thus we see that if a spatial coherence function for an image processing operation is provided, then a source encoding intensity transmittance function can be found through Fourier transformation. We further note that in practice the source encoding function should be a positive real function which satisfies the following physically realizable condition:

$$0 \le \gamma(x_0) \le 1 \qquad (67)$$
B. Image Sampling and Filtering
There is, however, a temporal coherence requirement for white-light image processing. In optical image processing, the scale of the image spectrum varies with the wavelength of the light source. Therefore a temporal coherence requirement should be imposed on every processing operation. If we restrict the image spectra, due to wavelength spread, to within a small fraction of the fringe spacing $d$ of a narrow-spectral-band filter $H_n(\alpha, \beta)$ (e.g., a deblurring filter), then we have

$$\frac{p_m f\, \Delta\lambda_n}{2\pi} \ll d \qquad (68)$$
where $1/d$ is the highest spatial frequency of the filter, $p_m$ is the angular spatial frequency limit of the input image transparency, $f$ is the focal length of the achromatic transform lens, and $\Delta\lambda_n$ is the spectral bandwidth of the narrow-spectral-band filter $H_n(\alpha, \beta)$. The spectral width or the temporal coherence requirement of the spatial filter is, therefore,

$$\frac{\Delta\lambda_n}{\lambda_n} \ll \frac{\pi}{h_0 p_m} \qquad (69)$$
where $\lambda_n$ is the center wavelength of the $n$th narrow-spectral-band filter, $2h_0$ is the main separation of the input image transparencies, and $2h_0 = \lambda_n f/d$. In order to gain some feeling of magnitude, we provide a numerical example. Let us assume that the size of the input image is $2h_0 = 5$ mm, the center wavelength of the filter $H_n(\alpha, \beta)$ is $\lambda_n = 5461$ Å, and we introduce a factor of 10 into Eq. (69), that is,

$$\Delta\lambda_n = \frac{\pi \lambda_n}{10\, h_0 p_m} \qquad (70)$$
TABLE III
TEMPORAL REQUIREMENT FOR $H_n(\alpha, \beta)$

p_m/2π (lines/mm)    0.5      1        5       20      100
Δλ_n (Å)             218.4    109.2    21.8    5.46    1.09
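The entries of Table III can be reproduced with a few lines of arithmetic. The sketch below assumes Eq. (70) in the form $\Delta\lambda_n = \pi\lambda_n/(10\, h_0 p_m)$, with $p_m = 2\pi \times$ (lines/mm), $2h_0 = 5$ mm, and $\lambda_n = 5461$ Å as stated in the text. The function and parameter names are hypothetical.

```python
import math


def spectral_width_angstrom(lines_per_mm, center_wavelength_angstrom=5461.0,
                            image_size_mm=5.0, safety_factor=10.0):
    """Spectral width (Å) from pi * lambda_n / (safety_factor * h0 * p_m)."""
    p_m = 2.0 * math.pi * lines_per_mm   # angular spatial frequency, rad/mm
    h0 = image_size_mm / 2.0             # half of the separation 2 h0, mm
    return math.pi * center_wavelength_angstrom / (safety_factor * h0 * p_m)


frequencies = (0.5, 1, 5, 20, 100)       # p_m / 2 pi in lines/mm, as in Table III
widths = [spectral_width_angstrom(nu) for nu in frequencies]
for nu, width in zip(frequencies, widths):
    print(f"{nu:>5} lines/mm -> {width:.2f} Å")
```

The computed widths agree with the tabulated values to the precision printed in Table III, which supports reading the factor of 10 as a divisor in Eq. (70).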
Several values of the spectral width requirement $\Delta\lambda_n$ for various spatial frequencies $p_m$ are tabulated in Table III. From Table III, we see that, if the spatial frequency of the input image transparency is low, a broader spectral width of the narrow-spectral-band filters can be used. In other words, if a higher spatial frequency is required for an image processing operation, then narrower-spectral-width spatial filters are needed. Evidently, a narrower spectral spread $\Delta\lambda_n$ corresponds to a higher temporal coherence requirement, which can be obtained by increasing the image sampling frequency $p_0$. However, a higher image sampling frequency may require larger-aperture transform lenses in the optical system, which tend to be more expensive. Nevertheless, in practice, high-quality images have been obtained with relatively low-cost transform lenses, as will be shown in Section VI.

C. An Illustrative Application
We shall now illustrate an application of source encoding, image sampling, and filtering in a white-light image processor. Let us consider polychromatic image subtraction. The image subtraction of Lee (32) that we consider is essentially a one-dimensional processing operation, in which a 1D fan-shaped diffraction grating should be utilized, as illustrated in Fig. 21. We note that the fan-shaped grating (i.e., filter) is imposed by the temporal coherence condition of Eq. (69). Since image subtraction is a point-pair processing operation, a strictly broad spatial coherence function at the input plane is not required. In other words, if one maintains the spatial coherence between the corresponding image points to be subtracted at the input plane, then the subtraction operation can be carried out at the output image plane. Thus instead of using a strictly broad spatial coherence function, a reduced spatial coherence function may be utilized, such as

$$\Gamma(y - y') = \delta(y - y' - h_0) + \delta(y - y' + h_0) \tag{71}$$
where $2h_0$ is the main separation between the two input color image transparencies.

FIG. 21. A white-light image subtraction processor: $T(x)$, phase grating; $L_1$, imaging lens; $L_2$, collimating lens; $L_3$ and $L_4$, achromatic transform lenses; $\gamma(y)$, source encoding mask; G, fan-shaped diffraction grating.

The source encoding function can therefore be evaluated through the Fourier transform of Eq. (65) as
Unfortunately Eq. (72) is a bipolar function which is not physically realizable. To ensure a physically realizable source encoding function, we introduce a reduced spatial coherence function with the point-pair coherence requirement (31):
with $N \gg 1$ a positive integer and $w \ll d$. Equation (73) represents a sequence of narrow pulses which occur at every $|y - y'| = nh_0$, where $n$ is a positive integer, and whose peak values are weighted by a broader sinc factor, as shown in Fig. 22. Thus, a high degree of spatial coherence can be achieved at every point pair between the two input color image transparencies. By taking the Fourier transformation of the reduced spatial coherence function of Eq. (73), the corresponding source encoding function is
FIG. 22. A spatial coherence function.
where $w$ is the slit width, $d = \lambda f/h_0$ is the separation between slits, and $N$ is the number of slits. Since $\gamma(|y|)$ is a positive real function which satisfies the constraint of Eq. (67), the proposed source encoding function of Eq. (74) is physically realizable. In view of Eq. (74), we also note that the slit separation $d$ is linearly proportional to $\lambda$. The source encoding mask is therefore a fan-shaped function, as shown in Fig. 23. To obtain lines of rainbow color spectral light sources for the signal processing, we would utilize a linear extended white-light source with a dispersive phase grating, as illustrated in Fig. 21. Thus with the described broadband source encoding mask, image sampling grating, and fan-shaped sinusoidal grating, a color-subtracted image can be seen at the output image plane.

FIG. 23. A source encoding mask.
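The point-pair coherence produced by a multislit source can be illustrated numerically. The sketch below is not the author's derivation: it assumes that the normalized coherence function is the Fourier transform of the source intensity with kernel $\exp[i2\pi yu/(\lambda f)]$, where $u = y - y'$, and it models the mask as $N$ equal slits of width $w$ spaced by $d = \lambda f/h_0$. The function name and all numerical values are hypothetical.

```python
import cmath
import math

def coherence(u_mm, wavelength_mm, f_mm, h0_mm, n_slits, slit_width_mm):
    """Normalized coherence at separation u for an N-slit source (assumed model)."""
    d = wavelength_mm * f_mm / h0_mm          # slit spacing quoted in the text
    x = slit_width_mm * u_mm / (wavelength_mm * f_mm)
    # Single-slit envelope: the broad sinc factor.
    envelope = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
    # Array factor: narrow pulses at u = n * h0.
    phase_step = 2.0 * math.pi * d * u_mm / (wavelength_mm * f_mm)
    array = sum(cmath.exp(1j * n * phase_step) for n in range(n_slits)) / n_slits
    return envelope * array

params = dict(wavelength_mm=5461e-7, f_mm=300.0, h0_mm=6.6,
              n_slits=120, slit_width_mm=2.5e-3)
on_pair = abs(coherence(6.6, **params))    # separation equal to h0
off_pair = abs(coherence(3.3, **params))   # halfway between two pulses
print(on_pair, off_pair)
```

Under these assumptions the coherence is close to unity at a separation of $h_0$ and essentially vanishes halfway between pulses, which is the behavior sketched in Fig. 22.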
VI. ADVANCES IN IMAGE PROCESSING

It would take a prohibitive number of pages to describe all the recent advances in white-light image processing. We shall restrict our discussion to a few recent results we consider to be of interest. Since the white-light image processor is particularly suitable for color processing, we shall restrict our discussion mostly to color image processing.

A. Color Image Deblurring
One of the interesting applications of white-light image processing is in the restoration of blurred color photographic images. We have recently presented a broadband white-light image processing technique for deblurring smeared color photographic images (33, 34). Since linear motion deblurring is a 1D processing operation and its deblurring filter is a point-by-point filtering, a fan-shaped deblurring filter can be utilized.

We note that monochrome photographic image deblurring by a coherent image processor was illustrated by Tsujiuchi (35) in 1963. The inverse spatial filter was synthesized by the combination of an amplitude and a pure phase filter. This can also be accomplished by a holographic synthesis technique. The preparation of such a filter by holographic techniques had been studied by Stroke and Zech (36), and by Lohmann and Paris (37). Nevertheless, the holographic synthesis technique also suffers from one disadvantage, namely a low diffraction efficiency. Mention must also be made of image deblurring with a computer-generated phase filter obtained by Tsujiuchi, Honda, and Fukaya (38). They have shown various effects due to amplitude, phase, and amplitude-phase filtering. Another interesting result obtained by Horner (39, 40) should also be mentioned. He has shown how optimum image deblurring may be obtained with least-mean-squares error filtering.

Let us now discuss a broadband image deblurring technique utilizing a fan-shaped deblurring filter for color image deblurring. Let a linear smeared color image be described as
where $\hat{s}(x, y)$ and $s(x, y)$ are the smeared and unsmeared images,

$$\mathrm{rect}\left(\frac{y}{W}\right) \triangleq \begin{cases} 1, & |y| \le W/2 \\ 0, & \text{otherwise} \end{cases} \tag{76}$$
and $W$ is the smeared length. Let us insert the smeared image transparency of Eq. (75) into the input plane $P_1$ of a white-light optical processor as shown in Fig. 1. The complex light distribution for every wavelength $\lambda$ at the back focal plane of the transform lens would be

$$E(\alpha, \beta; \lambda) = C \iint \hat{s}(x, y; \lambda)\exp(ip_0 x)\exp\left[-i\frac{2\pi}{\lambda f}(\alpha x + \beta y)\right] dx\,dy \tag{77}$$
where $p_0$ is the angular spatial frequency of the phase grating, $\alpha = (\lambda f/2\pi)p$ and $\beta = (\lambda f/2\pi)q$ represent the spatial coordinate system of the Fourier plane $P_2$, $(p, q)$ is the corresponding angular spatial frequency coordinate system, $f$ is the focal length of the achromatic transform lens, and $C$ is an appropriate complex constant. Thus, Eq. (77) can be written as

$$E(\alpha, \beta; \lambda) = C\hat{S}\left[\alpha - \frac{\lambda f}{2\pi}p_0, \beta\right] \tag{78}$$
where
is the linear smeared image spectrum. Since the scale of the image spectrum is proportional to the wavelength of the light source, the corresponding image spectra would smear into a fan-shaped rainbow color, as can be seen from Eq. (79). In other words, the top of the smeared spectra is in red and the bottom is in violet. Let us assume that a fan-shaped broad-spectral-band deblurring filter (i.e., a broadband inverse filter) that accommodates the variation of the scale of the signal spectra is available. This fan-shaped filter is described by the following equation:
In image deblurring we would insert this deblurring filter of Eq. (80) in the spatial frequency plane $P_2$. The complex light distribution for every $\lambda$ at the output image plane $P_3$ can be written as
where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform; by substituting Eqs. (79) and (80) into Eq. (81), we have

$$g(x, y; \lambda) = s(x, y)\exp(ip_0 x) \tag{82}$$

which is independent of the wavelength of the light source. The resultant output intensity distribution can be shown as

$$I(x, y) = \int_{\Delta\lambda} |g(x, y; \lambda)|^2\,d\lambda \simeq \Delta\lambda\,|s(x, y)|^2 \tag{83}$$
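The inverse-filtering principle behind Eqs. (81) and (82) can be illustrated for a single wavelength with a small discrete example. This is a minimal sketch under stated assumptions, not the optical fan-shaped filter itself: the signal is sampled, the smear is circular, and the sample count `N` and smear length `W` are hypothetical values chosen so that the smear spectrum has no zeros.

```python
import cmath

def dft(values, inverse=False):
    """Naive discrete Fourier transform (adequate for a short signal)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for m, v in enumerate(values)) for k in range(n)]
    return [x / n for x in out] if inverse else out

def smear(signal, length):
    """Circular linear-motion blur: average over `length` consecutive samples."""
    n = len(signal)
    return [sum(signal[(i - j) % n] for j in range(length)) / length
            for i in range(n)]

def inverse_filter_deblur(blurred, length):
    """Divide the blurred spectrum by the spectrum of the rect smear kernel."""
    n = len(blurred)
    kernel = [1.0 / length if i < length else 0.0 for i in range(n)]
    kernel_spectrum = dft(kernel)
    restored_spectrum = [b / h for b, h in zip(dft(blurred), kernel_spectrum)]
    return [x.real for x in dft(restored_spectrum, inverse=True)]

N, W = 32, 5   # gcd(N, W) = 1 keeps the kernel spectrum free of zeros
original = [1.0 if 10 <= i < 16 else 0.0 for i in range(N)]
restored = inverse_filter_deblur(smear(original, W), W)
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(max_error)
```

In the optical processor the same division is carried out by the filter transmittance, and the fan shape rescales the filter for each wavelength; the sketch only covers the single-wavelength case.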
which is proportional to the entire spectral bandwidth $\Delta\lambda$ of the white-light source. Thus we see that this proposed white-light deblurring technique is capable of processing the image with the entire visible spectral band, and it is very suitable for application to color image deblurring. Since the integration of Eq. (83) is taken over the entire spectral band of the white-light source, the coherent artifact noise can in principle be eliminated.

We shall now provide a result of color image deblurring obtained with this white-light image processing technique. Figure 24 shows a black-and-white picture of a blurred color image of a building, due to linear motion of the camera, as an input object. From this figure we see that the white window frames, the front doors, bushes, two white beams, trees, etc. are severely smeared. Figure 25 shows a black-and-white picture of the deblurred image that we have obtained with this white-light processing technique. It can be seen that the color reproduction is very faithful and the deblurring effect is spectacularly good; for example, the window frames, bushes, beams, front doors, trees, etc., can be clearly identified.

FIG. 24. A black-and-white picture of a blurred color image.

FIG. 25. A black-and-white picture of a deblurred color image.

B. Color Image Subtraction
Another interesting application of white-light image processing is in image subtraction. Image subtraction may be of value in many applications such as urban development, highway planning, earth resources studies, remote sensing, meteorology, automatic surveillance, inspection, etc. (41). Image subtraction may also be applied to communications, as a means of bandwidth compression. For example, it would be necessary to transmit only the differences between images in successive cycles, rather than the entire image in each cycle.
Optical image synthesis by complex amplitude subtraction was described by Gabor et al. (42). The technique involves successive recordings of two or more complex diffraction patterns on a holographic plate and the subsequent reproduction of the composite holographic images. A few years later, Bromley et al. (43, 44) described a holographic Fourier subtraction technique by which a real-time image and a previously recorded holographic image could be subtracted. Although good image subtraction was reported, it appears that the illumination for the holographic image reconstruction must be carefully arranged. In a more recent paper, Lee et al. (45, 46) proposed a technique by which image subtraction and addition can be achieved with a diffraction grating. This technique involves the insertion of a diffraction grating in the spatial frequency domain of a coherent optical processor. The advantage of this technique is a real-time subtraction capability. Since space does not permit us to review all the various techniques of image subtraction, we refer to the review paper by Ebersole (47). However, most optical image synthesis techniques use a coherent source to carry out the image subtraction, and coherent sources also introduce artifact noise.

In this section we will apply the white-light processing technique to color image subtraction (48, 49). We will now insert two color image transparencies, side by side, in contact with a phase grating at the input plane $P_1$ of the white-light optical processor of Fig. 1. At the spatial frequency plane $P_2$, the complex light distribution for each wavelength $\lambda$ of the light source may be described as
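$$E(\alpha, \beta; \lambda) = S_1\left(\alpha - \frac{\lambda f}{2\pi}p_0, \beta\right)\exp\left(-i\frac{2\pi}{\lambda f}h_0\beta\right) + S_2\left(\alpha - \frac{\lambda f}{2\pi}p_0, \beta\right)\exp\left(i\frac{2\pi}{\lambda f}h_0\beta\right) \tag{84}$$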
where $S_1(\alpha, \beta)$ and $S_2(\alpha, \beta)$ are the Fourier spectra of the input color images $s_1(x, y)$ and $s_2(x, y)$, and $2h_0$ is the main separation of the two color images $s_1$ and $s_2$. Again we see that the two input spectra disperse into rainbow colors along the $\alpha$ axis of the spatial frequency plane. For image subtraction, we insert a diffraction grating in the spatial frequency plane. Since the dispersed Fourier spectra vary with respect to the wavelength of the light source, we must insert a fan-shaped grating to compensate for the wavelength variation. If we let this fan-shaped grating be

$$H(\alpha, \beta; \lambda) = 1 + \sin\left[\frac{2\pi}{\lambda f}h_0\beta\right] \tag{85}$$
for all $\alpha$, then the output irradiance would be

$$I(x, y) = \Delta\lambda\left[|s_1(x, y - h_0)|^2 + |s_2(x, y + h_0)|^2 + \tfrac{1}{4}|s_1(x, y) - s_2(x, y)|^2 + \tfrac{1}{4}|s_1(x, y - 2h_0)|^2 + \tfrac{1}{4}|s_2(x, y + 2h_0)|^2\right] \tag{86}$$

where $\Delta\lambda$ is the spectral bandwidth of the white-light source. Thus, the subtracted color image $|s_1(x, y) - s_2(x, y)|^2$ can be seen at the optical axis of the output plane.

In practice, it is difficult to obtain a true white-light point source. However, this shortcoming can be overcome with the source encoding technique discussed in Section V. We shall now describe a color image subtraction operation with two encoded extended incoherent sources as depicted in Fig. 26. We note that the extension to the entire spectral band of a white-light source, as described in Section V, is currently under investigation. In color image subtraction, we insert the color image transparencies $O_1(x, y)$ and $O_2(x, y)$ in the open apertures of the input plane $P_1$, which can be described as

$$f(x, y) = O_1(x - h_0, y) + O_2(x + h_0, y) \tag{87}$$
FIG. 26. Color image subtraction with encoded extended incoherent sources: BS, beam splitter; MS, source encoding mask; L, achromatic transform lens; G, diffraction gratings.
where $2h_0$ is the main separation between the two input transparencies. Two source encoding masks (i.e., $MS_1$ and $MS_2$) and two sinusoidal gratings (i.e., $G_1$ and $G_2$) for the red and the green wavelengths are used; these can be written as

and

for $\lambda = \lambda_r$ and $\lambda_g$, where $\lambda_r$ and $\lambda_g$ are the red and green wavelengths, $W$ is the slit width, and $2h_0$ is the main separation of the two input transparencies. By a straightforward but rather cumbersome evaluation, the irradiance around the origin of the output image plane $P_3$ can be shown to be

$$I(x, y) = I_r(x, y) + I_g(x, y) = K|O_{1r}(x, y) - O_{2r}(x, y)|^2 + K|O_{1g}(x, y) - O_{2g}(x, y)|^2 \tag{90}$$
where $I_r(x, y)$ and $I_g(x, y)$ denote the red and green subtracted color image irradiances; $O_{1r}$, $O_{2r}$, $O_{1g}$, and $O_{2g}$ are the corresponding red and green color input objects. From Eq. (90) we see that the subtracted color image can be obtained at the output image plane.

In the experiment, a mercury arc lamp with a green filter (5461 Å) and a zirconium arc lamp with a red filter (6328 Å) were used as the color light sources. The intensity ratio of the two resulting light sources was adjusted to about unity with a variable beam splitter. The slit widths for the source encoding masks were about 2.5 µm, and the spacings of the slits were 25 µm for the green wavelength and 29 µm for the red wavelength. The overall size of the source encoding masks was about 3 × 3 mm²; they contained about 120 and 100 slits, respectively. The focal length of the transform lenses was 300 mm. A liquid gate containing two color image transparencies, about 6 × 8 mm² each and with a separation of about 13.2 mm, was placed behind the collimator. Two sinusoidal gratings with spatial frequencies of 1/(25 µm) and 1/(29 µm) were used for the green and red color image subtraction operations, as shown in Fig. 26.

For an experimental demonstration, we provide two continuous-tone color images of two sets of fruit, shown as black-and-white pictures in Fig. 27a and b. By comparing the two figures, we see that a dark green cucumber and a red tomato are missing in Fig. 27b. Figure 28 shows a black-and-white picture of the subtracted color image obtained with this incoherent image subtraction technique. In the subtracted color image, we see the profiles of a cucumber in dark green and a tomato in red at the output image plane.
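The quoted slit spacings can be checked against the relation $d = \lambda f/h_0$ given for the source encoding mask in Section V. The sketch below assumes $h_0 = 6.6$ mm (half of the 13.2 mm separation) and $f = 300$ mm; the function names are hypothetical.

```python
def slit_spacing_um(wavelength_angstrom, focal_length_mm, h0_mm):
    """Slit spacing d = lambda * f / h0, returned in micrometers."""
    wavelength_mm = wavelength_angstrom * 1e-7
    return wavelength_mm * focal_length_mm / h0_mm * 1e3

def slits_across(mask_size_mm, spacing_um):
    """Approximate number of slits that fit across the mask."""
    return mask_size_mm * 1e3 / spacing_um

green = slit_spacing_um(5461, 300.0, 6.6)
red = slit_spacing_um(6328, 300.0, 6.6)
print(f"green: {green:.2f} um, about {slits_across(3.0, green):.0f} slits")
print(f"red:   {red:.2f} um, about {slits_across(3.0, red):.0f} slits")
```

Under these assumptions the spacings come out near 24.8 µm and 28.8 µm, in line with the quoted 25 µm and 29 µm and with roughly 120 and 100 slits across a 3 mm mask.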
FIG. 27. Black-and-white pictures of the input color objects.

FIG. 28. A black-and-white picture of the subtracted color image.
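The action of the sinusoidal grating of Eq. (85) on the spectrum of Eq. (84) can be reproduced in a discrete, single-wavelength model. The following sketch rests on several assumptions that are not in the text: the images are one-dimensional sample lists, shifts are circular, and the values of `N`, `H0`, and the two test images are hypothetical.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive discrete Fourier transform (adequate for this small example)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * math.pi * k * m / n)
               for m, v in enumerate(values)) for k in range(n)]
    return [x / n for x in out] if inverse else out

N, H0 = 64, 12                      # number of samples and half-separation
s1 = [0.0] * N
s2 = [0.0] * N
s1[0:6] = [1, 2, 3, 3, 2, 1]        # first image
s2[0:6] = [1, 2, 0, 3, 2, 1]        # second image, one feature missing

# Input plane: s1 displaced by +H0 and s2 by -H0 (circular indexing).
field = [s1[(m - H0) % N] + s2[(m + H0) % N] for m in range(N)]

# Fourier plane: multiply the spectrum by the grating 1 + sin(2*pi*k*H0/N).
spectrum = dft(field)
filtered = [F * (1 + math.sin(2 * math.pi * k * H0 / N))
            for k, F in enumerate(spectrum)]

# Output plane: irradiance around the optical axis (samples 0..5).
output = dft(filtered, inverse=True)
center = [abs(output[m]) ** 2 for m in range(6)]
print([round(v, 6) for v in center])
```

Around the optical axis the irradiance is one quarter of the squared difference of the two images, which corresponds to the $\tfrac{1}{4}|s_1 - s_2|^2$ term of Eq. (86); the remaining terms appear displaced by $\pm h_0$ and $\pm 2h_0$.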
C. Color Image Retrieval
Archival storage of color films has long been an unresolved problem for film industries around the world. The major reason is that the organic dyes used in color films are usually unstable under prolonged storage, which causes a gradual color fading in the film. Although there are several available techniques for preserving the color images, all of them possess certain definite drawbacks. One of the most commonly used techniques is to preserve the color images, by repetitive application of primary color filters, in three separate rolls of black-and-white film. To reproduce the original color image, a system with three primary color projectors should be used for these three rolls of film. These films should be projected in perfect unison so that the primary color images will be precisely recorded on a fresh roll of color film. However, this technique has two major disadvantages: First, the storage volume for each film is tripled. Second, the reproduction system is rather elaborate and expensive.

In this section, we shall describe a white-light processing technique for archival storage (50) and color image retrieval. We shall show that this technique is the most efficient and effective technique developed to date. This technique also allows direct viewing by simple white-light image processing. This capability is particularly attractive for portable color image projection and home movie applications.

We shall now describe a spatial encoding technique utilizing a white-light source (51). A color transparency is used as an object to be encoded, by sequential exposure to primary color lights, onto a black-and-white film, as illustrated in Fig. 29. The encoding is effected by spatial sampling of the primary color images of the color transparency with a specific sampling frequency and a prescribed direction onto a monochrome film. In order to avoid the moiré fringe pattern in the retrieved color image, we propose to sample one of the primary color images in one independent spatial coordinate, and the remaining two primary color images in the other independent spatial coordinate. Since any mixture of red with green or with blue colors produces a wide range of intermediate colors, we propose to encode the red color image in one independent spatial coordinate direction, and the blue and green color images in the other remaining independent coordinate direction. Thus a small amount of color spread (i.e., color cross-talk) from blue to green (but not from green to blue) may be unavoidable. However, this small amount of color spread will not cause significant adverse effects in the retrieved color image, for two reasons: First, a slight mixture of blue into green will not produce significant color changes. Second, strictly speaking, the colors of color transparencies are not natural colors; thus a small amount of color deviation would not be noticeable by human perception.

FIG. 29. A spatial encoding technique.

We shall now let the intensity transmittance of the encoded film be

$$T_n(x, y) = K\{T_r(x, y)[1 + \mathrm{sgn}(\cos p_r y)] + T_b(x, y)[1 + \mathrm{sgn}(\cos p_b x)] + T_g(x, y)[1 + \mathrm{sgn}(\cos p_g x)]\}^{-\gamma} \tag{91}$$

where $T_n(x, y)$ is the encoded black-and-white negative transparency; $K$ is an appropriate proportionality constant; $T_r$, $T_b$, and $T_g$ are the red, blue, and green color image exposures; $p_r$, $p_b$, and $p_g$ are the respective carrier spatial frequencies; $(x, y)$ is the spatial coordinate system of the encoded film; $\gamma$ is the film gamma; and

$$\mathrm{sgn}(\cos x) \triangleq \begin{cases} 1, & \cos x \ge 0 \\ -1, & \cos x < 0 \end{cases}$$
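The sampling structure of Eq. (91) can be made concrete with a short sketch. It assumes the reading in which the red image is sampled along $y$ and the blue and green images along $x$; the carrier frequencies `P_R`, `P_B`, and `P_G` and the function names are hypothetical illustration values, not parameters reported in the text.

```python
import math

# Hypothetical angular carrier frequencies in rad/mm.
P_R = 2.0 * math.pi * 40.0
P_B = 2.0 * math.pi * 40.0
P_G = 2.0 * math.pi * 26.7

def sgn_cos(x):
    """sgn(cos x): +1 where cos x >= 0, -1 elsewhere."""
    return 1.0 if math.cos(x) >= 0 else -1.0

def encoded_exposure(t_r, t_b, t_g, x, y, p_r=P_R, p_b=P_B, p_g=P_G):
    """Bracketed sum of Eq. (91): each exposure gated by a binary grating."""
    return (t_r * (1 + sgn_cos(p_r * y))
            + t_b * (1 + sgn_cos(p_b * x))
            + t_g * (1 + sgn_cos(p_g * x)))

def negative_transmittance(exposure, gamma=1.0, k=1.0):
    """K * exposure**(-gamma); meaningful only for exposure > 0."""
    return k * exposure ** (-gamma)

# At the origin all three gratings are open; half a red period away in y,
# the red grating is closed and only blue and green contribute.
open_all = encoded_exposure(0.2, 0.3, 0.4, x=0.0, y=0.0)
red_closed = encoded_exposure(0.2, 0.3, 0.4, x=0.0, y=math.pi / P_R)
print(open_all, red_closed, negative_transmittance(open_all))
```

Each factor $1 + \mathrm{sgn}(\cos\,\cdot)$ takes only the values 0 and 2, so every primary image is recorded through a binary grating with its own orientation or frequency.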
Instead of obtaining a positive-image transparency through a contact printing process, we shall bleach the encoded negative-image film to obtain a phase object transparency (52, 53). Let us assume that the bleached transparency is encoded in the linear region of the diffraction efficiency versus log exposure curve (i.e., the D-E curve). The amplitude transmittance of the bleached transparency can therefore be written as

$$t(x, y) = \exp[i\phi(x, y)] \tag{92}$$

where $\phi(x, y)$ represents the phase delay distribution, which is proportional to the exposure of the encoded film (54), such as

$$\phi(x, y) = M\{T_r(x, y)[1 + \mathrm{sgn}(\cos p_r y)] + T_b(x, y)[1 + \mathrm{sgn}(\cos p_b x)] + T_g(x, y)[1 + \mathrm{sgn}(\cos p_g x)]\} \tag{93}$$
FIG. 30. A white-light processor for color image retrieval.
where $M$ is an appropriate proportionality constant. If we place this bleached encoded film at the input plane $P_1$ of a white-light optical processor, as illustrated in Fig. 30, then the first-order complex light distribution, for every $\lambda$, at the spatial frequency plane $P_2$ would be

where $\hat{T}_r$, $\hat{T}_b$, and $\hat{T}_g$ are the Fourier transforms of $T_r$, $T_b$, and $T_g$, respectively, an asterisk denotes the convolution operation, and the proportionality constants have been neglected for simplicity. We note that the last cross-product term of Eq. (94) would introduce a moiré fringe pattern, which can be easily masked out at the Fourier plane. Thus by proper color filtering of the smeared Fourier spectra, a true color image can be retrieved at the output image plane $P_3$. The corresponding output image irradiance can be shown as
$$I(x, y) = T_r^2(x, y) + T_b^2(x, y) + T_g^2(x, y) \tag{95}$$

which is a superposition of three primary encoded color images. Thus we see that a moiré-free color image can indeed be reproduced.

In our experiment, the encoding transparency is made with Kodak Technical Pan film 2415. The advantage of using Kodak 2415 film is that it is a high-resolution film with a relatively flat spectral response. The disadvantage is that this film is coated with a thin layer of dyed-gel backing, which introduces low-level noise in the bleaching process. In order to avoid the toe region of the D-E curve, the film is slightly pre-exposed. Otherwise, a low-exposure nonlinear effect would be introduced that would cause color imbalance in the retrieved image. The D-E curve for Kodak 2415 film with a 40 lines/mm sampling frequency is plotted in Fig. 31. From this figure, we see that the bleached encoded film offers a higher diffraction efficiency, with an optimum value at an exposure of $8.50 \times 10^{-3}$ mcs. It is possible to optimize the encoding process with respect to the D-E curve in the following steps: First, by pre-exposing the film beyond the toe region. Second, by subdividing the remaining dynamic range of exposure into three equal parts for the red, green, and blue color image encodings.

FIG. 31. Diffraction efficiency versus log exposure (mcs) for Kodak 2415 film. (○, bleached; ×, nonbleached.)

For experimental demonstration, Fig. 32 shows a black-and-white picture of the retrieved color image obtained with Kodak 2415 encoding film. From the retrieved image we see that a high-quality moiré-free color image can be obtained. However, there is still some remnant noise present in the retrieved image, which is primarily due to the thin layer of dyed-gel backing on the Kodak 2415 film. This thin layer of dyed-gel backing introduces additional noise through the bleaching process. Nevertheless, the remnant noise may be avoided by utilizing other types of encoding films, a topic currently under investigation. For comparison, we also provide a black-and-white picture of the original color image, as shown in Fig. 33. By comparing the two color images of Fig. 32 and Fig. 33, we see that the retrieved color image is spectacularly faithful, with virtually no color cross-talk. Although the resolution and contrast are still below the acceptable level, these drawbacks may be overcome by utilizing a more suitable film, which we are still pursuing.

FIG. 32. A black-and-white picture of a retrieved color image.
FIG. 33. A black-and-white picture of the original color image.

D. Pseudocolor Encoding

Most of the optical images obtained in various scientific applications are gray-level density images, for example, scanning electron microscopic images, multi-spectral-band aerial photographic images, x-ray transparencies, and infrared scanning images. However, humans can perceive variations in color better than variations in gray level. In other words, a color-coded image can provide greater opportunity for visual discrimination.

In current practice, most pseudocoloring is performed by digital computer techniques (55). If the images are initially digitized, the computer technique may be a logical choice. However, for continuous-tone images, an optical color encoding technique would be more advantageous for at least three major reasons: First, the technique can in principle preserve the spatial frequency resolution of the image to be color coded. Second, the optical system is generally easy and economical to operate. Third, an optical pseudocolor encoder is generally less expensive than its digital counterpart.

1. Pseudocoloring through Spatial Encoding

We shall now describe a white-light density pseudocolor encoding technique for monochrome images (56). We assume that a gray-level x-ray transparency (called a positive image) is available for pseudocoloring. By the contact printing process, negative and product x-ray image transparencies can be made. We shall now describe a spatial encoding technique to obtain a three-gray-level-image encoding transparency for the pseudocoloring. The spatial encoding is performed by respectively sampling the positive, the negative, and the product image transparencies onto a black-and-white photographic film, with specific sampling frequencies oriented at specific azimuthal directions. To avoid the moiré fringe pattern (see Section VI,C), we shall sample these three images in orthogonal directions with different specific sampling frequencies, as shown in Fig. 34.

FIG. 34. A spatial encoding technique.

The intensity transmittance of the encoded film can therefore be written as
$$T(x, y) = K\{T_1(x, y)[1 + \mathrm{sgn}(\cos p_1 y)] + T_2(x, y)[1 + \mathrm{sgn}(\cos p_2 x)] + T_3(x, y)[1 + \mathrm{sgn}(\cos p_3 x)]\}^{-\gamma} \tag{96}$$
where $K$ is an appropriate proportionality constant; $T_1$, $T_2$, and $T_3$ are the positive, negative, and product image exposures; $p_1$, $p_2$, and $p_3$ are the respective carrier spatial frequencies; $(x, y)$ is the spatial coordinate system of the encoded film; $\gamma$ is the film gamma; and

$$\mathrm{sgn}(\cos x) \triangleq \begin{cases} 1, & \cos x \ge 0 \\ -1, & \cos x < 0 \end{cases}$$
To obtain a surface relief phase object transparency, the encoded transparency is bleached with an R-10 formula (52, 53), and we assume that the bleached transparency is encoded in the linear region of the D-E curve. The encoded phase transmittance is therefore

$$t(x, y) = \exp[i\phi(x, y)] \tag{97}$$

where

$$\phi(x, y) = M\{T_1(x, y)[1 + \mathrm{sgn}(\cos p_1 y)] + T_2(x, y)[1 + \mathrm{sgn}(\cos p_2 x)] + T_3(x, y)[1 + \mathrm{sgn}(\cos p_3 x)]\} \tag{98}$$
where $M$ is an appropriate proportionality constant. If we insert this encoded phase transparency at the input plane $P_1$ of a white-light optical processor, as illustrated in Fig. 30, the complex light distribution due to $t(x, y)$, for every $\lambda$, at the spatial frequency plane $P_2$ can be shown as
where $\hat{T}_1$, $\hat{T}_2$, and $\hat{T}_3$ are the smeared Fourier spectra of the positive, negative, and product images, respectively; an asterisk denotes the convolution operation; and the proportionality constants have been neglected for simplicity. Again, we see that the last cross product (i.e., the moiré fringe pattern) can be avoided by spatial filtering. Needless to say, by proper color filtering of the first-order smeared Fourier spectra, a moiré-free pseudocolor-encoded image can be obtained at the output plane $P_3$. The corresponding pseudocolor image irradiance is therefore

$$I(x, y) = T_{1r}^2(x, y) + T_{2b}^2(x, y) + T_{3g}^2(x, y) \tag{100}$$
where $T_{1r}^2$, $T_{2b}^2$, and $T_{3g}^2$ are the red, blue, and green intensity distributions of the three spatially encoded images.

For experimental demonstration, Figs. 35 and 36 show a set of black-and-white pictures of color-coded images of a woman's pelvis. The x-ray was taken following a surgical procedure. A section of the bone between the sacroiliac joint and spinal column has been removed. In Fig. 35, the positive image is encoded in red, the negative image is encoded in blue, and the product image is encoded in green. By comparing the pseudocolor-coded image with the original black-and-white x-ray picture, it appears that the soft tissues can be better differentiated in the color-coded image, as demonstrated by the fact that the image contrast in the region containing the gastrointestinal tract is evidently superior in the color-coded image. On the other hand, there seems to be a degradation in the resolution of the color-coded image along the edges of the hard tissues. There are perhaps two reasons for this: First, high-frequency information may be eliminated owing to the low-spatial-frequency encoding gratings (40 lines/mm and 26.7 lines/mm) employed. Second, the image may be smeared as a result of the film development process. These two problems can be corrected by selecting higher-frequency encoding gratings and by gaining more experience in film processing.
FIG. 35. A black-and-white picture of a density pseudocolor-coded image. Positive image is coded in red, negative image is coded in blue, and the product image is coded in green.

FIG. 36. A black-and-white picture of a reversal color-coded image of Fig. 35. Positive image is coded in blue, negative image is coded in red, and the product image is again coded in green.
Another point worth noting is that a reversal of the color encoding can be easily implemented as shown in Fig. 36, where the positive and negative images are encoded in blue and red while the product image remains in green. This color mixture capability could be beneficial because an image in different color combination may reveal subtle features which are otherwise undetected.
For instance, the air pockets in the colon of the patient can be identified more easily with Fig. 36 than with Fig. 35. Moreover, a wide variety of other pseudocolor-encoded images can also be obtained by simply alternating the color filters in the Fourier plane of the white-light processor. Finally, we would stress that this white-light pseudocolor encoder offers several advantages over the digital counterpart. The encoder is far less expensive and in principle the technique offers a higher color-coded image resolution.
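The color assignment of Eq. (100), and the reversal shown in Fig. 36, can be mimicked with a toy mapping from gray level to color. The sketch below is only an idealization: it assumes a normalized positive amplitude in [0, 1], models the negative as its complement and the product as their product, and the function name and `assignment` argument are hypothetical.

```python
def pseudocolor(gray, assignment=("positive", "product", "negative")):
    """Return (red, green, blue) irradiances for a gray level in [0, 1].

    `assignment` names the image sent to the red, green, and blue channels.
    """
    images = {
        "positive": gray,
        "negative": 1.0 - gray,          # idealized contact-print negative
        "product": gray * (1.0 - gray),  # positive times negative
    }
    # Each channel carries the squared amplitude of its assigned image.
    return tuple(images[name] ** 2 for name in assignment)

# Assignment of Fig. 35: positive in red, product in green, negative in blue.
print(pseudocolor(1.0), pseudocolor(0.5), pseudocolor(0.0))
# Assignment of Fig. 36: positive and negative channels exchanged.
print(pseudocolor(1.0, ("negative", "product", "positive")))
```

In this model the brightest regions map to red and the darkest to blue, with mid-gray levels picking up the green product term; swapping the first and last entries of `assignment` reproduces the reversal.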
2. Pseudocoloring through Halftone Screen

It is also interesting to describe a technique utilizing a specially fabricated halftone screen to perform pseudocolor encoding by density (57, 58). A halftone transparency of the original image is first obtained using a specially constructed halftone screen as described by Kato and Goodman (59). The amplitude transmittance of the halftone image in one dimension can be described as

$$h(x) = \left[\sum_{n=-\infty}^{\infty} \delta(x - nx_0)\right] * \mathrm{rect}\left(\frac{x}{w}\right) \tag{101}$$

where $\delta(x)$ is the Dirac delta function, $x_0$ is the pulse period, $w$ is the pulse width, and
Note that the pulse period $x_0$ is determined by the period of the halftone screen and the pulse width $w$ is a function of the density of the original image at $x$. The halftone image transparency is then inserted into the white-light processor shown in Fig. 37. The complex amplitude distribution at the back focal plane of the achromatic lens $L_1$ can be written as

$$E(\alpha, \lambda) = \exp\left[-iK_1\left(\frac{\alpha}{\lambda}\right)^2\right] \int h(x)\exp\left(-i\frac{2\pi}{\lambda f}\alpha x\right) dx \tag{102}$$
where $\alpha$ is the spatial frequency coordinate, $K_1$ is an appropriate constant, and $f$ is the focal length of the lens. The integration is performed over the spatial limits of the input transparency and the spectral width of the white-light source. For a given wavelength $\lambda$, the various orders of diffracted light at the spatial frequency plane are

$$E(\alpha, \lambda) = \exp\left[-iK_1\left(\frac{\alpha}{\lambda}\right)^2\right] \sum_{n=-\infty}^{\infty} \frac{\lambda^2 f^2}{\pi\alpha x_0}\sin\left(\frac{\pi w\alpha}{\lambda f}\right)\delta\left(\alpha - \frac{n\lambda f}{x_0}\right) \tag{103}$$
FIG. 37. White-light optical processor: L, achromatic collimating lens; $L_1$, achromatic transform lens; $P_1$, halftone input transparency; $P_2$, spatial frequency plane; $P_3$, output image plane.
Thus we can see that, except for the zeroth-order term, the higher-order diffractions are dispersed into rainbow colors along the $\alpha$ direction at the spatial frequency plane $P_2$. Within the visible region of the diffracted light, there is no overlapping between diffraction orders up to the third order. For a given wavelength, the irradiance of the $n$th diffraction order at $\alpha = n\lambda f/x_0$ is

$$I_n = \left(\frac{\lambda f}{n\pi}\right)^2 \sin^2\left(\frac{n\pi w}{x_0}\right) \tag{104}$$
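Equation (104) shows how the local pulse width $w$, and hence the image density, controls the balance between diffraction orders. The sketch below evaluates the expression for a given order and duty cycle $w/x_0$; the function name, the unit focal length, and the sample wavelengths are hypothetical.

```python
import math

def order_irradiance(order, duty_cycle, wavelength_nm, focal_length_mm=1.0):
    """Eq. (104): I_n = (lambda*f/(n*pi))**2 * sin(n*pi*w/x0)**2."""
    wavelength_mm = wavelength_nm * 1e-6
    amplitude = wavelength_mm * focal_length_mm / (order * math.pi)
    return (amplitude * math.sin(order * math.pi * duty_cycle)) ** 2

# Compare a red second-order component with a blue-green third-order one
# as the duty cycle w/x0 (set by the local image density) changes.
for duty in (0.1, 0.25, 0.5, 0.75):
    red_2nd = order_irradiance(2, duty, 650.0)
    blue_green_3rd = order_irradiance(3, duty, 480.0)
    print(f"w/x0 = {duty:4}: red {red_2nd:.3e}, blue-green {blue_green_3rd:.3e}")
```

Because the two orders peak at different duty cycles, passing one color from each order turns density variations into color variations, which is the mechanism described in the paragraph that follows.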
To perform pseudocolor encoding, we may select two different colors (e.g., red, green) from two different orders of diffraction by placing narrow slits at the appropriate positions to bandpass the desired colors. At the output plane P3, we have a pseudocolor image formed by the addition of the intensities of the two filtered color images. In our experiments a tungsten arc lamp is used as the white-light source and a specially fabricated one-dimensional multilevel halftone screen is used to produce the halftone input, as shown in Fig. 38. For simplicity, we use a single slit with adjustable width as the spatial filter. The slit is placed between the second and third orders as shown in Fig. 39, such that only the red portion of the second-order diffraction and the blue-green portion of the third-order diffraction can pass through. In Fig. 40, we show a black-and-white picture of a pseudocolor-encoded image obtained with this white-light processing technique.
FIG.38. Halftone input transparency of an x-ray picture.
FIG.39. Spatial filtering of dispersed spectrum by adjustable slit filter.
Aside from the complication of generating a halftone image, this technique also suffers from two major drawbacks: a loss of spatial resolution and the presence of sampling lines in the color-coded image. However, it provides another method of producing density pseudocolor-coded images with simple white-light processing.
FIG.40. A black-and-white picture of a color-coded image.
E. Real-Time Pseudocolor Encoding
In the previous section we have described simple white-light pseudocolor density encodings through image sampling. We have shown that the white-light pseudocoloring techniques are very simple and economical to operate. However, the techniques still suffer from one major drawback: they are not real-time pseudocolor encoding techniques. We shall now describe a real-time white-light pseudocoloring technique for spatial frequency and density encodings (60). We stress that this real-time pseudocoloring technique may offer advantages in some specific applications.
1. Spatial Frequency Pseudocoloring
In spatial frequency pseudocolor encoding, we place a gray-level image transparency s(x, y) in contact with a two-dimensional high-diffraction-efficiency grating T(x, y) at the input plane P1 of a white-light optical processor, as shown in Fig. 41. For simplicity, we assume that the amplitude transmittance of the two-dimensional diffraction grating is
\[
T(x, y) = 1 + \tfrac{1}{2}\cos p_0 x + \tfrac{1}{2}\cos q_0 y
\]
where p0 and q0 are the carrier spatial frequencies of the diffraction grating.
FIG.41. A real-time white-light pseudocolor encoder.
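The effect of the carrier grating can be checked numerically. The sketch below is a one-dimensional simplification, not the two-dimensional optical system: it multiplies a uniform test signal by a grating of the form 1 + (1/2) cos(p0 x) and computes a discrete Fourier transform, which shows a zero order of unit weight and two first orders of weight 1/4 at the carrier frequency. The coefficient 1/2, the sample count and the carrier bin are assumptions made for this illustration.

```python
import cmath
import math


def dft(samples):
    """Normalized discrete Fourier transform (direct O(N^2) evaluation)."""
    n = len(samples)
    return [
        sum(samples[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n)) / n
        for m in range(n)
    ]


def grating_spectrum(n_samples=64, carrier_bin=16):
    """Spectrum magnitude of a uniform signal sampled through a cosine grating."""
    signal = [1.0] * n_samples  # uniform test image, s(x) = 1
    grating = [
        1.0 + 0.5 * math.cos(2 * math.pi * carrier_bin * k / n_samples)
        for k in range(n_samples)
    ]
    product = [s * t for s, t in zip(signal, grating)]
    return [abs(c) for c in dft(product)]


# Zero order at bin 0, first orders at bins +16 and -16 (that is, bin 48).
spectrum = grating_spectrum()
```

Replacing the uniform signal by an arbitrary s(x) places a copy of its spectrum around each of these three bins. In the optical system, the physical position of each first-order copy scales with wavelength, which produces the rainbow smearing.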
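The corresponding complex light distribution for a given wavelength λ at the spatial frequency plane P2 is
\[
E(\alpha,\beta;\lambda) = S(\alpha,\beta) + \tfrac{1}{4}\left[ S\!\left(\alpha - \tfrac{\lambda f}{2\pi}p_0,\ \beta\right) + S\!\left(\alpha + \tfrac{\lambda f}{2\pi}p_0,\ \beta\right) + S\!\left(\alpha,\ \beta - \tfrac{\lambda f}{2\pi}q_0\right) + S\!\left(\alpha,\ \beta + \tfrac{\lambda f}{2\pi}q_0\right) \right]
\]
where S(α, β) is the Fourier spectrum of the input monochrome image s(x, y). From this equation, we see that four first-order signal spectra are dispersed in rainbow color proportional to wavelength λ along the α and β axes. Since the spatial filtering is effective in the direction perpendicular to the color-smeared spectrum, we adopt one-dimensional spatial filters for pseudocolor encoding as shown in Fig. 42. The complex light amplitude distribution immediately behind the spatial frequency plane is then
\[
E(p,q;\lambda) = S_r(p - p_0, q)\,H_1(q) + S_r(p, q - q_0)\,H_1(p) + S_b(p + p_0, q)\,H_2(q) + S_b(p, q + q_0)\,H_2(p) \tag{107}
\]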
where S_r and S_b are the selected color-band image spectra (e.g., red and blue), H_1 and H_2 are the one-dimensional spatial filters, and (p, q) is the angular
FIG.42. Spatial frequency pseudocolor encoding.
spatial frequency coordinate system of P2. The output image irradiance can be shown to be
\[
I(x,y) \simeq \Delta\lambda_r \left| \exp(ip_0 x)\, s_r(x,y) * h_1(y) + \exp(iq_0 y)\, s_r(x,y) * h_1(x) \right|^2 + \Delta\lambda_b \left| \exp(-ip_0 x)\, s_b(x,y) * h_2(y) + \exp(-iq_0 y)\, s_b(x,y) * h_2(x) \right|^2 \tag{108}
\]
where Δλ_r and Δλ_b are the color (e.g., red and blue) spectral bands of the signal spectra and h_1 and h_2 are the corresponding impulse responses of H_1 and H_2. Thus we see that two spatially filtered images are incoherently superimposed to form a color-encoded image at the output plane P3 of the white-light image processor. Figure 43 shows a black-and-white picture of a spatial frequency pseudocolor-encoded radar image obtained by this processing technique, where the high-spatial-frequency contents are encoded in red and the low-spatial-frequency contents are encoded in blue.
2. Density Pseudocoloring
We shall now describe a real-time density pseudocolor encoding technique. In density pseudocoloring we insert two narrow strips of half-wave phase objects in the centers of the selected color-band image spectra to provide the image contrast reversal, as shown in Fig. 44. The complex
FIG.43. A black-and-white picture of a real-time spatial frequency pseudocoloring radar image. The high-spatial-frequency terrains are encoded in red and the low-spatial-frequency terrains are encoded in blue.
FIG. 44. Contrast reversal density pseudocolor encoding.
amplitude light distribution immediately behind the spatial frequency plane is
\[
E(p,q;\lambda) = S_r(p - p_0, q) + S_r(p, q - q_0) + S_g(p - p_0, q)\,H(q) + S_g(p, q - q_0)\,H(p) \tag{109}
\]
where S_r and S_g are the selected color-band image spectra (e.g., red and green) and
are the narrow strips of half-wave phase objects. At the output imaging plane P3, the complex light amplitude distribution can be approximated by
where ⟨s_g(x, y)⟩ denotes the spatial ensemble average (i.e., the dc level) of s_g(x, y). Since the two images s_r and s_gn are diffracted from two different color spectral bands of the light source, they are mutually incoherent. The output image irradiance is, therefore,
where I_r(x, y) is a positive color (e.g., red) image irradiance, I_gn(x, y) is an (approximate) contrast-reversed or negative color (e.g., green) image irradiance, and Δλ_r and Δλ_g are the narrow color spectral bands of the signal spectra (e.g., red and green). Thus we see that a density pseudocolor-coded image is formed with incoherent addition of a positive image in one color and a negative image in another. Figure 45 shows a black-and-white picture of a real-time pseudocolor density-encoded x-ray image of a hand. From this figure we see that a broad range of density pseudocolor-encoded images can be obtained with this technique. Finally we would note that, with this white-light pseudocolor encoding technique, there is a freedom to select different color spectral bands for pseudocolor encoding. Thus in practice a wide range of different pseudocolor-encoded images can easily be obtained.
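The contrast reversal can be illustrated with a small numerical sketch. It rests on a simplifying assumption that is not spelled out in the text above: an ideal half-wave strip multiplies only the dc component of the green-band image by -1, so the green amplitude becomes s - 2⟨s⟩, while the red band passes unchanged. Carrier phase factors and the finite strip width are ignored, and the function and parameter names are hypothetical.

```python
def density_pseudocolor(image, band_red=1.0, band_green=1.0):
    """Return (red, green) irradiance maps for a real amplitude image.

    Assumes an ideal half-wave strip that flips the sign of the dc term
    in the green band only; both bands add incoherently at the output.
    """
    pixels = [value for row in image for value in row]
    dc_level = sum(pixels) / len(pixels)  # spatial average <s>
    red = [[band_red * value ** 2 for value in row] for row in image]
    green = [
        [band_green * (value - 2.0 * dc_level) ** 2 for value in row]
        for row in image
    ]
    return red, green


# Dark, mid-gray and bright pixels: red follows the input, green is reversed.
red_map, green_map = density_pseudocolor([[0.0, 0.5, 1.0]])
```

For this three-pixel example the bright pixel is strongest in red and the dark pixel is strongest in green, which is the positive-plus-negative color coding described above.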
FIG. 45. A black-and-white picture of a density pseudocolor-encoded image of an x-ray transparency. In color the thicker bones are displayed in red and the fingers in green.
VII. CONCLUDING REMARKS
We have described the basic principle of a white-light image processor, which is very suitable for color image processing. The coherence requirement for white-light image processing is briefly analyzed through the partial coherence theory of Wolf (15). We have shown that the proposed white-light image processor is capable of processing the image in complex amplitude as a coherent image processor; at the same time the processor is capable of suppressing artifact noise as an incoherent processor. In order to alleviate the spatial coherence constraint of a physical light source, a source encoding concept is developed so that the image processing can be carried out with an extended source. To remove the constraint of the temporal coherence of a white-light source, we utilize an image sampling grating to improve the degree of temporal coherence in the Fourier plane so that the image can be processed in complex amplitude. We have also
demonstrated that the white-light image processor is very economical and easy to operate, in contrast with its coherent counterparts. Therefore we expect that the white-light image processor will offer a broader range of applications to many kinds of scientific imagery. In this chapter, we have also described some of the recent advances in white-light image processing. We have demonstrated that color image deblurring, color image subtraction, color image retrieval, and pseudocolor encoding can be easily carried out by the proposed white-light processing technique. We have also shown that these white-light image processing operations can be evaluated from the standpoint of conventional linear systems. In spite of the flexibility of digital image processing, optical methods offer the advantages of capacity, color, simplicity, and cost. Instead of the two confronting each other, we can expect a gradual merging of the optical and digital techniques. The continued development of optical-digital interfaces and various electro-optic devices will lead to a fruitful result: hybrid optical-digital image processing techniques, utilizing the strengths of both processing operations. Furthermore, I believe that white-light image processing is at the threshold of widespread application. I hope that this chapter will serve as a basic foundation, already established in part, to help guide interested readers toward various imaginative image processing applications. In view of the great number of contributors, I apologize for possible omissions of appropriate references.
ACKNOWLEDGMENTS
The support of the U.S. Air Force Office of Scientific Research in the area of white-light optical processing is gratefully acknowledged.
REFERENCES
1. D. Gabor, Laser speckle and its elimination. IBM J. Res. Dev. 14, 509 (1970).
2. F. T. S. Yu, “Optical Information Processing.” Wiley (Interscience), New York, 1983.
3. G. L. Rogers, Non-coherent optical processing. Opt. Laser Technol. 7, 153 (1975).
4. K. Bromley, An optical incoherent correlation. Opt. Acta 21, 35 (1974).
5. M. A. Monahan, K. Bromley, and R. P. Bocker, Incoherent optical correlations. Proc. IEEE 65, 121 (1977).
6. G. L. Rogers, “Noncoherent Optical Processing.” Wiley, New York, 1977.
7. E. N. Leith and J. Upatnieks, Holography with achromatic-fringe systems. J. Opt. Soc. Am. 57, 975 (1967).
8. R. E. Brooks, L. O. Heflinger, and R. F. Wuerker, Pulsed laser holograms. IEEE J. Quantum Electron. QE-2, 275 (1966).
9. S. Lowenthal and P. Chavel, Noise problems in optical image processing, in “Conference on Holography and Optical Processing” (R. Wiener and J. Shamir, eds.), p. 45. Plenum, New York, 1977.
10. A. Lohmann, Incoherent optical processing of complex data. Appl. Opt. 16, 261 (1977).
11. E. N. Leith and J. Roth, White-light optical processing and holography. Appl. Opt. 16, 2565 (1977).
12. G. M. Morris and N. George, Matched filtering using band-limited illumination. Opt. Lett. 5, 202 (1980).
13. F. T. S. Yu, A new technique of incoherent complex signal detection. Opt. Commun. 27, 23 (1978).
14. F. T. S. Yu, A technique of white-light optical processing with diffraction grating method, in “1980 International Optical Computing Conference,” Book II, Proc. Soc. Photo-Opt. Instrum. Eng. 232, 9 (1980).
15. M. Born and E. Wolf, “Principles of Optics,” 2nd rev. ed. Pergamon, Oxford, 1964.
16. F. T. S. Yu, S. L. Zhuang, and S. T. Wu, Source encoding for partially coherent optical processing. Appl. Phys. [Part] B 27, 99 (1982).
17. P. H. Van Cittert, Die wahrscheinliche Schwingungsverteilung in einer von einer Lichtquelle direkt oder mittels einer Linse. Physica (Amsterdam) 1, 201 (1934).
18. F. Zernike, The concept of degree of coherence and its application to optical problems. Physica (Amsterdam) 5, 785 (1938).
19. M. E. Verdet, Constitution de la lumière non polarisée et de la lumière partiellement polarisée. Ec. Norm. Super., Paris 2, 291 (1865).
20. A. A. Michelson, On the application of interference methods to astronomical measurements. Philos. Mag. [5] 30, 1 (1890).
21. M. Berek, Über Kohärenz und Konsonanz des Lichtes. Z. Phys. 36, 824 (1926).
22. H. H. Hopkins, The concept of partial coherence in optics. Proc. R. Soc. London, Ser. A 208, 263 (1951).
23. H. H. Hopkins, On the diffraction theory of optical image. Proc. R. Soc. London, Ser. A 217, 408 (1953).
24. S. L. Zhuang and F. T. S. Yu, Coherence requirements for partially coherent optical processing. Appl. Opt. 21, 2587 (1982).
25. G. M. Morris and N. George, Space and wavelength dependence of a dispersion-compensated matched filter. Appl. Opt. 19, 3843 (1980).
26. F. T. S. Yu, Y. W. Zhang, and S. L. Zhuang, Coherence requirement for partially coherent correlation detection. Appl. Phys. [Part] B 30, 23 (1983).
27. B. J. Thompson and E. Wolf, Two-beam interference with partially coherent light. J. Opt. Soc. Am. 47, 895 (1957).
28. B. J. Thompson, Illustration of the phase change in two-beam interference with partially coherent light. J. Opt. Soc. Am. 48, 95 (1958).
29. F. T. S. Yu, F. K. Hsu, and T. H. Chao, Coherence measurement of a grating-based white-light optical signal processor. Appl. Opt. 23, 333 (1984).
30. F. T. S. Yu, S. L. Zhuang, and S. T. Wu, Source encoding for partially coherent optical processing. Appl. Phys. [Part] B 27, 104 (1982).
31. S. T. Wu and F. T. S. Yu, Source encoding for image subtraction. Opt. Lett. 6, 652 (1981).
32. S. H. Lee, S. K. Yao, and A. G. Milnes, Optical image synthesis (complex amplitude addition and subtraction) in real time by a diffraction-grating interferometric method. J. Opt. Soc. Am. 60, 1037 (1970).
33. F. T. S. Yu, S. L. Zhuang, and T. H. Chao, Color-photographic-image deblurring by white-light processing technique. J. Opt. 13, 57 (1982).
34. T. H. Chao, S. L. Zhuang, S. Z. Mao, and F. T. S. Yu, Broad spectral band color image deblurring. Appl. Opt. 22, 1439 (1983).
35. J. Tsujiuchi, Correction of optical images by compensation of aberrations and spatial frequency filtering. Prog. Opt. 2, 133 (1963).
36. G. W. Stroke and R. G. Zech, A posteriori image-correcting deconvolution by holographic Fourier-transform division. Phys. Lett. A 25A, 89 (1967).
37. A. W. Lohmann and D. P. Paris, Computer generated spatial filters for coherent optical data processing. Appl. Opt. 7, 651 (1968).
38. J. Tsujiuchi, T. Honda, and T. Fukaya, Restoration of blurred photographic images by holography. Opt. Commun. 1, 379 (1970).
39. J. L. Horner, Optical spatial filtering with the least mean-square-error filter. J. Opt. Soc. Am. 59, 553 (1969).
40. J. L. Horner, Optical restoration of images blurred by turbulence using optimum filter theory. Appl. Opt. 8, 167 (1970).
41. F. T. S. Yu, N. G. Wang, and S. L. Zhuang, IC circuit board inspection with incoherent optical processing, in “Robotics and Industrial Inspection,” SPIE Proc. Sens. Robot Technol. 360, 310 (1982).
42. D. Gabor, G. W. Stroke, R. Restrick, A. Funkhouser, and D. Brumm, Optical image synthesis (complex amplitude addition and subtraction) by holographic Fourier transformation. Phys. Lett. 18, 116 (1965).
43. K. Bromley, M. A. Monahan, J. F. Bryant, and B. J. Thompson, Complex spatial filtering by holographic Fourier subtraction. Appl. Phys. Lett. 14, 67 (1969).
44. K. Bromley, M. A. Monahan, J. F. Bryant, and B. J. Thompson, Holographic subtraction. Appl. Opt. 10, 174 (1971).
45. S. H. Lee, Image processing, in “Handbook of Optical Holography” (H. J. Caulfield, ed.), p. 537. Academic Press, New York, 1979.
46. S. K. Yao and S. H. Lee, Synthesis of a spatial filter for combined operations of subtraction and correlation. Appl. Opt. 10, 1154 (1971).
47. J. F. Ebersole, Optical image subtraction. Opt. Eng. 15, 436 (1975).
48. S. T. Wu and F. T. S. Yu, Image subtraction with encoded extended incoherent source. Appl. Opt. 20, 4082 (1981).
49. F. T. S. Yu and S. T. Wu, Color image subtraction with extended incoherent sources. J. Opt. 13, 183 (1982).
50. F. T. S. Yu, White-light processing technique for archival storage of color films. Appl. Opt. 19, 2457 (1980).
51. F. T. S. Yu, X. X. Chen, and S. L. Zhuang, Progress report on archival storage of color films. J. Opt. Soc. Am. 72, 1721 (1982).
52. J. Upatnieks and C. Leonard, Diffraction efficiency of bleached, photographically recorded interference patterns. Appl. Opt. 8, 85 (1969).
53. B. J. Chang and K. Winick, Silver-halide gelatin holograms, in “Recent Advances in Holography,” Proc. Soc. Photo-Opt. Instrum. Eng. 215, 172 (1980).
54. H. M. Smith, Basic holographic principles, in “Holographic Recording Materials” (H. M. Smith, ed.), p. 8. Springer-Verlag, Berlin and New York, 1977.
55. H. C. Andrews, A. B. Tescher, and R. P. Kruger, Image processing by digital computer. IEEE Spectrum 9, 20 (1972).
56. F. T. S. Yu, X. X. Chen, and T. H. Chao, White-light pseudocolor encoding with three primary colors. J. Opt. 15, 55 (1984).
57. H. K. Liu and J. W. Goodman, A new coherent optical pseudocolor encoder. Nouv. Rev. Opt. 7, 285 (1976).
58. A. Tai, F. T. S. Yu, and H. Chen, White-light pseudocolor density encoder. Opt. Lett. 3, 190 (1978).
59. H. Kato and J. W. Goodman, Nonlinear filtering in coherent optical systems through halftone screen processes. Appl. Opt. 14, 1813 (1975).
60. F. T. S. Yu, S. L. Zhuang, T. H. Chao, and M. S. Dymek, Real-time white-light spatial frequency and density pseudocolor encoder. Appl. Opt. 19, 2986 (1980).
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 63

A Survey of Recent Advances in the Theory and Practice of Vacuum Photoemitters
H. TIMAN
Thomson-CSF Components Corporation
DuMont Division
Dover, New Jersey
I. Introduction
II. Stability of Photocathodes and the Influence of External and Environmental Conditions
    A. Stability and Influence of Gases
    B. Temperature Effects
III. Physical Properties of PC
    A. Secondary Emission of PC Films
    B. Optical Characteristics and Constants
    C. Time Response of PC
    D. Uniformity of PC
    E. Structure of Base Layers
    F. Hall Effect; Photoconductivity
    G. Resistance or Conductivity
    H. Structure of Photocathodes
IV. Enhancement of Photoemission
    A. Field Enhancement
    B. Optical Enhancement
V. Formation, Composition, and Spectral Response
VI. Theoretical Attempts and Models
References
I. INTRODUCTION
The external photoeffect has played an important role in the birth of modern physics, and has subsequently been utilized in a wide-ranging variety of devices. There is hardly a scientist or engineer who has not used these devices in his laboratory or factory. Materials with high photoemissive yield can be grouped conveniently in three categories:
(1) Metals in bulk or thin-film form, which show appreciable sensitivity only in the UV.
(2) Semiconductors in single-crystal or polycrystalline form, activated by alkali metals. These materials have recently been reviewed by Rougeot-Baud in Volume 48 of this series.
(3) So-called conventional photocathodes, consisting of a thin metal film, activated by alkali metals. This group is the subject of this chapter.
While the practice of photocathodes (especially in the application of photomultipliers and image tubes) has become familiar and widespread, the same cannot be said of our knowledge and theoretical understanding of these thin-film emitters. Ironically, the practical success of these devices has contributed to this situation. The technology of photocathodes and their production, while highly specialized, can be learned simply by practical observation and judgment, and reasonably satisfactory results can be achieved in this way. Of course such a technological approach can hardly ascertain whether results correspond to optimum conditions and what such conditions may be. Photocathode (PC) thin films are also among the most complex semiconductor combinations. Repeatability of physical data and spectral yield is still largely unattainable. The experimental methods themselves very often must be suspected of influencing or changing the very properties studied because of the aggressive and volatile nature of free alkalis and surface coverages. It thus appears that valid conclusions should be based on large samples, and unfortunately very few investigations of such scope have been made. We find also that many studies are performed in photomultipliers, so to speak as a sideline of their performance studies, where many additional variables are introduced. Most companies active in the manufacture of photoemissive devices indeed show little inclination to pursue basic studies or at least to make them available to the scientific community. This is certainly a consequence of the fact that in almost all cases activation procedures are still considered trade secrets.
This explains why we have so few physical data on the more promising new developments which exceed the state of the art. Under these circumstances it is small wonder that our knowledge is rather sketchy and disjointed, that many findings do not agree with others, and that we have only a small number of individuals actively contributing. In sum we may say that the basic study of vacuum photoemitters has not been a favorite subject of researchers, certainly not in proportion to the practical importance of photoemissive devices. In part this is of course due to the fact that quick and eye-catching results are hard to come by in this discipline. For a readable and still very useful review of the field as perceived in 1968 we refer to A. Sommer’s book “Photoemissive Materials” (I-1). Our review deals primarily with the ideas and concepts developed since then for the modern “head-on” semitransparent photocathode, and we have tried to avoid too much comparison with older sources because of the advances in vacuum and processing technology. We show in Table I the present status of performance for the most significant semitransparent PC deposited on glass or similar substrates. Data are culled from manufacturers’ specifications and it is only realistic to assume that average surfaces will fall below (but on occasion also exceed) these “typical values.” It is clear that we have achieved great advances in peak and in low-photon-energy sensitivities without, however, removing the basic limitations inherent in the thin-film semiconductor.
II. STABILITY OF PHOTOCATHODES AND THE INFLUENCE OF EXTERNAL AND ENVIRONMENTAL CONDITIONS
A. Stability and Influence of Gases
The stability of PC has found attention with respect to inoperational or storage stability, operational stability, and stability under varying conditions. Sommer (II-1) points out that PCs usually remain stable under recommended storage conditions (darkness, low humidity, temperatures below 40°C) over long periods of time if proper processing and vacuum techniques have been employed. Similar experiences with MPTs are well known to manufacturers, where tubes often show unchanged characteristics for as long as 15 years with or without occasional moderate usage. In continuous or intermittent operation under recommended conditions MPTs and photodiodes more often than not show excellent life and stability. Specifically Pertsev et al. (II-2) report eight years of continuous use of a FEU1A MPT with unchanged spectral sensitivity and gain but greatly improved noise characteristics. Even for extreme tail response of the S-20 at 0.91 μm, Melamid and Potopov (II-3) report stability for levels of 10^-7 to 10^-8 W of pulsed laser light for 250 h of operation. PC instability, often mentioned as “fatigue” or “slump,” is not a sharply defined concept, usually denoting changes in spectral sensitivity, reversible or irreversible. Positive-ion bombardment of the PC because of residual gases in the vacuum envelope is a major factor. Coates (II-4) suggests lowering of the dynode 1 potential in MPTs to minimize this effect and he explains long-term slump as caused by permeation of He. Positive-ion bombardment is of course even more damaging in high-voltage devices where it causes “ion burn” spots or nonuniformities.
TABLE I. SENSITIVITY RANGE (mA/W) FOR HIGH-YIELD PHOTOEMITTERS (FROM MANUFACTURERS' SPECIFICATIONS) (a)
.
Cs3Sb (S-4) Cs3Sb-MnO (S-11)
0.2
0.25
0.30
0.35
0.40
0.45
0.50
0.55
0.60
0.65
28* 32b
306 32b
35b 306 35b
38 35 55 6 18
40 50 68 18 22 58
36 55 65 20 25 42 58 60 65 20 60 80 95 75
27 50 60 18 22 27 45 55 60 30 55 45 65 65
15 30 40
45 10 15 8 15 5
0.5 1.0 1.5 4 7 0
Ag-Bi-0 (S-10) Na-K-Sb (S-24)
35b
39
406
58
s-20
306
30"
3Sb
45 55 5 55 70 80 70
S-25 or ERMA 111
K-Cs-Sb (Bialkali)
44Y
406
Cs-Bb-Sb Cs-Te
15b
17b
a
506 70b 45 4
60 10 7 60 85 100 75
l7
14 30 40 50 35 50 25 40 35
15 32 42
42 5 10 15
5 22 35 35 40
0.70
2.0 3.0
14 22 20 27
0.75
0.80
-
-
6 12 18 23
2.0 1 12 17
3.0
(a) For S-1 data see Fig. 16.
(b) Data require high-quality UV-transmitting glass.
(c) QE can be derived from milliamperes per watt: QE (%) = (mA/W) × 0.124/λ, with λ in micrometers.
0.85
0.90
0.35 1.o 1 14
1 5
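The conversion quoted in the table footnote follows from QE = (hc/e)·S/λ, where hc/e is about 1.24 V·μm; with S in mA/W and QE expressed in percent the factor becomes 0.124. A minimal sketch, with a hypothetical function name and an illustrative input rather than a table entry:

```python
def quantum_efficiency_percent(sensitivity_ma_per_w, wavelength_um):
    """Quantum efficiency in percent from radiant sensitivity.

    sensitivity_ma_per_w: radiant sensitivity in mA/W
    wavelength_um: wavelength in micrometers
    Uses QE(%) = 0.124 * S / lambda, i.e. hc/e rounded to 1.24 V*um.
    """
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    return 0.124 * sensitivity_ma_per_w / wavelength_um


# Illustrative input: 80 mA/W at 0.40 um.
example_qe = quantum_efficiency_percent(80.0, 0.40)
```

Because the factor scales as 1/λ, equal radiant sensitivities correspond to a lower quantum efficiency in the red than in the blue.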
Arkhipova (II-5) describes the release of multiple electron bunches and subsequent changes in spectral response from alkali antimonides upon deliberate bombardment with 8-15-kV ions. The emission of electrons for the S-20 is considerably higher than for the K-Cs-Sb and very much higher than for the S-11. No explanation is given for these differences, which could be due to the lower work function of the S-20. Maysokaya and Privoleva (II-6) measured the effects of intense illumination on inoperative alkali antimonides (exposure levels are not given) and found strong but reversible increases in conductivity and response near the threshold. The same effects were even more pronounced at low temperatures (-183°C). In a later paper Jones (II-7) reports strongly enhanced dark emission immediately after exposure to intense illumination with a 0.69-μm ruby laser. He postulates the presence of electrons excited and held in trap levels which are “flushed out” when voltages are finally applied. Again this effect of initial instability after light exposure, especially in high-sensitivity S-20, is well known in MPT testing. In an interesting experiment Gelikonov and Khandokhin (II-8) describe the loss of linearity of S-20 response if He-Ne laser power of 0.3-7 μW is focused on a spot 3 × 10^-3 mm in diameter. They interpret this as a consequence of an intense small space charge in front of the illuminated spot. In practice sustained very strong illumination can also result in “burn spots” or loss of sensitivity similar to the effects of positive-ion bombardment, most likely electrolytically caused (II-9). Budde (II-10) reports loss of red sensitivity in S-10 cathodes for up to three years. Similarly, infrared “slump” of the S-1 seems to occur spontaneously. Timan (II-11) shows that loss of IR sensitivity is initially accompanied by gains in shorter wavelengths and can be remedied temporarily by addition of Cs.
In these and similar cases surface conditions which determine the low-photon-energy emission are of course most immediately affected. Transfer of PC from the processing vessel, normally well saturated with alkali vapors, into an alkali-free or nearly free environment has found attention because of the practical interest for imaging devices. Theodorou (II-12) reports considerable loss of red sensitivity of S-20 in such transfers and detrimental effects of any released gases (e.g., in tipping) which normally are gettered by excess alkalis. The success reported by Van Huyssteen (II-13) in transferring S-25 cathodes with more than 300 μA/lm and measuring stability for more than six months appears to be due to the fact that no special precautions were taken to exclude alkali vapors from the transfer position. Holtom and Hopkins (II-14) report loss in sensitivity for the S-25 in changing to a Cs-free environment. We have no positive reports about stability of the S-1 after transfer.
There have been attempts to correlate cathode performance with gas release from alkali generators or channels. Della Porta (II-15) sensibly suggests such releases be minimized. Ghosh and Varma (II-16) explain their good S-20 performance (250-300 μA/lm) as a result of a K prewet on the glass before Sb evaporation which avoids exposure of the Sb layer to initial gas release from the K channel. The practice of alkali prewetting of the substrate before evaporation of the base layer is a concept widely and successfully employed in industry. The thorough and complete outgassing of alkali generators (channels or pellets) is of course standard procedure. Several authors have studied the influence of gases on PC under controlled conditions. There is not always agreement between different sources. Detailed studies were made by Decker (II-17), Garfield (II-18), and McMullan and Powell (II-19), primarily on the S-20. There is agreement about the rather immediate and detrimental effect of H2O and CO2 vapor. For these gases there is also no recovery after removal. Decker also investigated less active gases, such as H2, N2, CO, and methane, and found them harmless in small quantities. These findings have generally been confirmed in practice for all photoemissive cathodes. There appears to be considerable disagreement about the effect of oxygen on the S-20. Decker and McMullan both report loss of sensitivity on O2 exposure, but the computed oxygen pressures, as well as “sticking factors” and “gas monolayers,” differ widely. Both find that the red sensitivity, with emission centers obviously closer to the vacuum face, is affected first.
McMullan states “that the sticking factor of O2 is reduced to 50% when red sensitivity is reduced by about 30%” (indicating of course saturation of alkalis close to the vacuum interface) and that “much higher O2 quantities are needed to affect blue sensitivity.” In contrast, Garfield states that “S-20 sensitivities can often be enhanced by light oxidation if excess Cs is present in the cathode surface layer,” a concept not easy to measure or define. He also notes the usual effect of lowering surface conductivity upon oxidation. This effect is always observed in oxygen exposure and again points to the partial conversion of metallic alkali to alkali oxides (an effect especially noticeable in the formation of the S-1). It thus appears that the actual mechanism of oxygen interaction at the interface and in the bulk of the layer and possible compound formation are still unclear. The calculated values of gas monolayers and/or sticking factors may therefore have little practical meaning. Practitioners nowadays, in contrast to many earlier studies and Garfield’s findings, tend to agree that O2 enhancement is observed only on “poor” (improperly or incompletely processed) S-20, while “good” S-20 are influenced detrimentally. In the case of Cs3Sb (S-4), which often shows considerable extension of threshold with
increase of red sensitivity (see, e.g., ref. II-19), there is serious doubt about the stability of this oxygen enhancement and about the introduction of additional thermionic emission. We find therefore that today in manufacturing “superficial” oxidation of alkali antimonides after bulk formation is apparently not used to any noticeable degree. Oxygen, of course, is a major constituent of the S-1 and S-10 in their formation and of the III-V cathodes. In conclusion, we can say that the loss of Cs (or possibly other alkali combinations) from surface coverages or states and the effect of residual gases, either simply present or ionized in operation, are cited as the most likely causes of instability, which always affects low-photon-energy response first. Both of these effects will be aggravated in transfer systems, where the gettering through excess alkalis is usually eliminated. In turn excess alkali attracted to the cathode can also be an instability factor. An example is given by Kansky et al. (II-20) for the S-1. Instability can, of course, also be introduced externally, by extreme conditions or temperature changes. The effects of space environment and radiation damage are not directly related to the physics of the PC and its vacuum enclosure. Besides actual decomposition of the PC most effects are caused in the envelope material. For a discussion with sources we reference Johnson (II-21). Authors and manufacturer manuals agree on the advantages of dark storage and long moderate aging to achieve best stability in photoemissive devices.
B. Temperature Effects

The temperature behavior of PC has been investigated for two main reasons:
(i) Determination of activation energies from the temperature dependence of photoemission, thermionic emission, photoconductivity, and resistance for band models, defect levels, etc. (see Section VI).
(ii) Practical reasons of MPT usage in cooling or at elevated temperatures.

Budde (II-22) has shown that all alkali antimonides lose sensitivity at -190°C if deposited on glass. For detailed data the classical work by Murray and Manning (II-23) still remains significant (Figs. 1-3). The curves given there clearly show the difference between cathodes on conductive substrates and on glass, and detail spectral features. They report loss in sensitivity near the threshold for the S-11 and S-20, with generally much better performance for the less resistive S-20 (see, e.g., ref. II-24). The blue and green “bulk” sensitivity remains virtually unaffected for PC on conductive substrates. The very small increases in blue sensitivity reported here, and to a larger extent for Cs₃Sb by Spicer and Wooten (II-25), if indeed real changes of quantum yield, could be explained vaguely by decreased phonon losses.
FIG. 1. S-20 on glass: RCA C-7261 (tube number 2, multialkali cathode). Abscissa: temperature (°C), from 40 to -200. [From Murray and Manning (II-23). © 1960 IEEE.]
FIG. 2. S-11 on glass: RCA 6655 (tube number 1, diode operation). [From Murray and Manning (II-23). © 1960 IEEE.]
In moderate heating S-11 and S-20 cathodes show similar tendencies, with blue sensitivity virtually unchanged and with red sensitivity increasing considerably together with a shift in the threshold. For examples we refer to Kanev et al. (II-26) and Garbuny et al. (II-27). Any practical use of increased threshold sensitivity is of course negated by a corresponding increase in thermionic emission I_th. It is generally assumed that the temperature dependence of I_th of PCs is described by the well-known Richardson plots, which then allow the determination of a work function φ
FIG. 3. S-11 on conductive substrate: DuMont K-1428 (diode operation). Abscissa: temperature (°C), from 40 to -200. [From Murray and Manning (II-23). © 1960 IEEE.]
from their slope. For numerical values here and in the following we refer to Table II.

S-1. Lowering of the high I_th is here of great practical importance. Sherring (II-28) mentions a reduction by 10^4 at -50 versus 25°C. The values of φ vary considerably according to “doping” and formation. Examples of the relation between threshold response and φ are given in Timan (II-29). Timan and Kansky (II-11, II-20) show that IR sensitivity drops with increasing temperature. The reason, as in “slump,” appears to be the high mobility within the cathode film and change in optimized surface conditions. However, cooling does not affect IR sensitivity adversely and is used extensively in practice.
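The Richardson-plot procedure referred to above can be illustrated by a short numerical sketch. It assumes the idealized Richardson-Dushman form J = A T² exp(-φ/kT) with the free-electron constant A = 120 A cm⁻² K⁻², which real semiconducting photocathodes are not expected to follow exactly. The function names and the sample work function of 1.0 eV are hypothetical and serve only as illustration.

```python
import math

K_B_EV = 8.617333e-5      # Boltzmann constant in eV/K
A_RICHARDSON = 120.0      # assumed free-electron Richardson constant, A cm^-2 K^-2


def thermionic_current_density(phi_ev, temp_k, a=A_RICHARDSON):
    """Idealized Richardson-Dushman current density in A/cm^2."""
    return a * temp_k ** 2 * math.exp(-phi_ev / (K_B_EV * temp_k))


def work_function_from_richardson_plot(temps_k, currents):
    """Least-squares slope of ln(J/T^2) versus 1/T, returned as phi in eV."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(j / t ** 2) for j, t in zip(currents, temps_k)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return -slope * K_B_EV   # the slope of the Richardson plot is -phi/k


def cooling_reduction_factor(phi_ev, warm_c=25.0, cold_c=-50.0):
    """Ratio J(warm)/J(cold) for a given work function."""
    warm_k, cold_k = warm_c + 273.15, cold_c + 273.15
    return (thermionic_current_density(phi_ev, warm_k)
            / thermionic_current_density(phi_ev, cold_k))


if __name__ == "__main__":
    temps = [250.0, 275.0, 300.0, 325.0, 350.0]
    synthetic = [thermionic_current_density(1.0, t) for t in temps]  # synthetic data
    print("fitted phi (eV):", work_function_from_richardson_plot(temps, synthetic))
    print("reduction factor, 25 C to -50 C:", cooling_reduction_factor(1.0))
```

Because the exponential term dominates, modest changes in φ shift the reduction obtained on cooling by orders of magnitude in this model, which is consistent with the wide spread of figures quoted for different cathode types.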
Alkali antimonides. Davies (II-30) reports that no further improvement in MPTs can be achieved by cooling beyond -40°C. This has to be understood in the context that thermionic emission becomes insignificant against other noise sources. A description of PC cooling techniques can be found in Benci et al. (II-31). High-temperature applications have also been discussed for the bialkali combinations. Matheson (II-32) suggests the use of a 99.9% pure Al₂O₃ window instead of glass and obtains good performance with the K-Cs-Sb up to the surprisingly high temperature of 200°C. Others find the same cathode on glass to change drastically at 125-160°C, with strongly enhanced dark emission and red response indicating a change in the structure and ordering of the thin film. Persyk (II-33) discusses the S-24 (K-Na-Sb). His surfaces could
TABLE II
THERMIONIC EMISSION AND WORK FUNCTION φ FROM RECENT SOURCES

Cathode                      I_th at 25°C (A/cm²)                             Richardson-plot work function φ at 25°C (eV)
Cs₃Sb (S-4, S-5)             10^-15 to 10^-16                                 1.30-1.50 (I-6)
Cs₃Sb-MnO (S-11)             10^-14 to 10^-15                                 est. 1.50
Cs-Rb-Sb                     10^-14 to 10^-15                                 1.20-1.70
Na₂K-Sb (S-24)               10^-18 to 10^-19; 10^-1? (II-33)a; (II-33)b      1.80-2.00
K-Cs-Sb (bialkali)           10^-16 to 10^-17; 10^-?? (II-32)c; (II-32)c      1.30-1.55
S-20                         10^-14 to 10^-15                                 1.05(?)-1.45
S-25, ERMA                   10^-13 to 10^-14                                 1.0
S-1
  Cs prewet-Cs processing    10^-10 to 10^-11                                 0.85-1.10
  K prewet-Cs processing     10^-11 to 10^-12                                 0.95-1.15
  Cs prewet-K processing     10^-12 to 10^-14                                 1.10-1.30

a At 70°C. b At 180°C. c At 150°C.
be cycled repeatedly between 20 and 180°C without deterioration. Jedlicka (II-34), although evaluating obviously much too heavy layers, arrives at similar data. The S-24 apparently has the highest φ and thus the smallest change with temperature of all high-yield PCs and has been used in applications up to
175°C. The rather large variations of φ reported in the S-24 can probably be explained by minute contamination of K or Na by Cs, which would tend to decrease φ. Effects similar to heating can also be introduced by intense illumination, possibly equivalent to “localized” heating of the PC (II-6). Such effects appear especially pronounced in oxidized alkali antimonides.

Generally we find good agreement between sources. Changes in spectral response are most pronounced near the threshold and are obviously caused by increase or decrease in available defect or impurity levels. In the blue and green region alkali antimonides can be used to -180°C with conductive substrates, similar to the higher-conductivity S-1. In heating cycles care must be taken to stay well below temperatures which change PC structure or composition. It appears that performance changes will then remain reversible.
From formation technology we may then conclude that 110-130°C for the S-11 and related PCs, 140-160°C for the S-20, 150-170°C for the K-Cs-Sb, and 160-180°C for the S-24 are the highest temperatures at which these PCs can be used. Numerical values for the thermionic emission and the work function are derived only by measurement, and presently we have no clear understanding of their relationship to other physical parameters of the PC thin film, except a general parallelism with the photoelectric threshold.
III. PHYSICAL PROPERTIES OF PC
A. Secondary Emission of PC Films

Although alkali films or combinations are formed on Sb-coated multiplier dynodes and also constitute an active film on CuBe-O or AgMg-O dynodes, their secondary emission (SE) has not received much attention in recent research. S-1 cathodes were investigated by Sommer (III-1), who finds no correlation between photoelectric yield and SE yield Δ. Addition of Ag reduces Δ as it does UV photoemission, while superficial oxidation increases Δ. Sommer concludes that the SE is due only to Cs-O (or other cesium oxides). Sommer (III-2) also reports very high Δ (up to Δ = 30 at 900 V) for lightly oxidized K-Cs-Sb, claiming much better performance than for conventional CsSb dynodes. SE of the S-20 was measured by Vorob’yeva et al. (III-3) as early as 1964. They report peak values of Δ = 30 at 1.4 kV for “thick” S-20 films (95% Sb coverage) on glass and Δ = 36 at 1.6 kV for Cr substrates. They also find much lower values for the thin films of 200-400 Å which are representative of semitransparent PC, thus indicating a very large escape depth l of the secondaries. They observe a clear temperature dependence for higher primary energies and attribute the lower Δ at elevated temperatures to increased phonon losses, especially scattering by lattice vibrations. In a recent report, Ghosh and Varma (III-4), measuring S-20 with 300 µA/lm, give peak values of Δ = 39 at E_p = 1.8 kV with Δ = 20 at E_p = 0.5 kV. Using the empirical Dekker-Lye formula and estimating the range R of primaries in the film as R = C_1 E_p^(C_2), they arrive at l = 400 Å. This value is at the high end of thicknesses measured for good S-20. Unfortunately Ghosh fails to give even an estimate of the PC thickness derived from other film data.
B. Optical Characteristics and Constants
Although it would seem that optical data should be readily measurable with reasonable accuracy, we again have only few and rather widely varying data. Although individual surfaces will differ in their optical data as a function of composition and formation, we nonetheless seem to arrive at models for the different cathode types which represent their behavior in the region of significant absorption and can serve as guidelines there, while considerable variations are found in regions of weaker absorption. For the computation of the optical constants (film thickness d, absorption coefficient k, refractive index n) we have to make the assumption of a homogeneous, isotropic film. This is definitely not the case for the S-1, while the alkali antimonides come closer. In principle the optical absorption A should be an important, probably determining factor in the explanation of the spectral response. For example, optical characteristics alone are sufficient to explain the fact, previously considered baffling, that QE for glass incidence is much higher than for direct vacuum incidence. The absorption in the first case is considerably higher, simply because the difference between n (PC) and n (vac) is larger than that between n (PC) and n (glass), so that more light is reflected at vacuum incidence. For examples of QE/A we refer to Timan (III-5) and Jung and Stadlmann (III-6). Generally reports refer to randomly oriented light with normal incidence. There are indications that absorption of coherent, strictly monochromatic light in the PC thin film may differ (see Section V). For a suitable program to compute the optical constants one must have three independent variables: substrate reflection R_s, vacuum reflection R_v, and transmission T. Many authors have used only two sets of data and assumed validity of some data taken from other work. Timan, in a fairly recent report (III-5), gives tabulated data of characteristics and constants for model surfaces of the important PC types.
Only triplets d, n, k with d staying fairly constant over the measured region of 0.4-1.0 µm were allowed. Measurement techniques and corrections are also described.

S-1. The author (III-5) finds a low reflection and a minimum in T around 2.0 eV typical for his large sample. A detailed effort is made there to compute one model surface #113A under the assumption of a homogeneous film (Figs. 4, 5). The surprising result was that a near-interference condition prevails for the region 0.4-1.0 µm, with very high n values in the infrared. This near-interference condition in nd/λ was also verified for other S-1 surfaces in unpublished measurements. These results were measured on PC prepared differently from the “classical method” (see Section V) and are not in agreement with work by Kondrashov and Shafov from the 1950s. Data are certainly subject to revision if a better mathematical model for the microcrystalline structure of the S-1 can be derived (see Section III, E).
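The argument made earlier in this subsection, that absorption is higher for glass incidence than for vacuum incidence, can be checked with a minimal single-film calculation. The sketch below assumes a homogeneous, isotropic absorbing film at normal incidence, treats the glass as a semi-infinite lossless medium (the outer air/glass reflection is ignored), and uses hypothetical values n = 3.0, k = 1.0, d = 30 nm at 500 nm. None of these numbers is taken from the cited measurements.

```python
import cmath
import math


def film_rta(n0, n_film, k_film, n2, d_nm, wavelength_nm):
    """Reflectance, transmittance and absorptance of one absorbing film.

    Light travels from a lossless medium n0 through the film into a
    lossless medium n2 at normal incidence (Airy summation).
    """
    big_n = complex(n_film, k_film)            # complex index n + ik
    beta = 2.0 * math.pi * big_n * d_nm / wavelength_nm
    r01 = (n0 - big_n) / (n0 + big_n)          # Fresnel amplitude coefficients
    r12 = (big_n - n2) / (big_n + n2)
    t01 = 2.0 * n0 / (n0 + big_n)
    t12 = 2.0 * big_n / (big_n + n2)
    phase = cmath.exp(2j * beta)
    denom = 1.0 + r01 * r12 * phase
    r = (r01 + r12 * phase) / denom
    t = t01 * t12 * cmath.exp(1j * beta) / denom
    refl = abs(r) ** 2
    trans = (n2 / n0) * abs(t) ** 2
    return refl, trans, 1.0 - refl - trans     # A follows from A + R + T = 1


if __name__ == "__main__":
    # Hypothetical film: n = 3.0, k = 1.0, d = 30 nm, wavelength 500 nm.
    n_glass, n_vac = 1.5, 1.0
    print("vacuum incidence R, T, A:", film_rta(n_vac, 3.0, 1.0, n_glass, 30.0, 500.0))
    print("glass incidence  R, T, A:", film_rta(n_glass, 3.0, 1.0, n_vac, 30.0, 500.0))
```

In this model the transmission is the same for both directions, so the lower reflection on the glass side appears entirely as additional absorption in the film.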
FIG. 4. S-1 surface #113A (Ag-O-Cs); initial luminous sensitivity 56 µA/lm, 2540 IR filter 14%. Abscissa: photon energy (eV). [From Timan (III-5).]
S-10, S-11, K-Cs-Sb. For details and tabulated data see Timan (III-5). Virtually no earlier or other data have been reported. The S-10 shows rather slowly varying R and T values over large portions of the visible spectrum (in agreement with its “panchromatic response”). For the S-11 (Cs₃Sb on MnO substrate; only Cs₃Sb on glass had been investigated previously) and the K-Cs-Sb the author finds near-interference conditions in nd/λ in the blue region of heavy absorption. Correlating with the rather high values of n and k, we have large values of substrate reflection and much higher values of vacuum reflection. The K-Cs-Sb has the least amount of red absorption of the high-yield alkali antimonides, in agreement with its very low thermionic emission.

Na-K-Sb (S-24), Rb-Cs-Sb. Jedlicka (III-7) reports optical properties of the S-24. The value of d = 170 nm, however, appears much too heavy to be
FIG. 5. Optical constants of S-1 (#113A): refractive index n(λ) and absorption coefficient k(λ). [From Timan (III-5).]
representative, which also seems evident from the reversal of the QE response for vacuum versus glass incidence. For this heavy layer Jedlicka also reports a reflection minimum at 0.45 µm. Similarly, Jedlicka’s report on the Cs-Rb-Sb cathode (recently brought out as “super S-11” or “modified S-11” by EMI, DuMont, and other MPT manufacturers) deals with an extremely heavy layer and cannot be considered representative of the cathodes used in MPT practice. Additional work is necessary here.

S-20, S-25. Here we have more activity and several sets of data that do not always agree. Hofmann and Deutscher (III-9) give data for the 0.5-0.8-µm region and report n and k as being independent of cathode thickness (see also Section VI).
Varma (III-10) gives typical optical data for 150-200-µA/lm cathodes. The reported minimum of reflection at 0.45 µm has only been found by Hoehne (III-11) for much heavier layers. However, the typical “hump” in the absorption below 2.0 eV, which stretches the S-20 toward much lower energies than the other alkali antimonides, has been confirmed in the data of the author (III-5). Varma postulates two different absorption processes: a “residual” absorption from 0.85 µm on (photoelectrically inactive) and an “active” component rising quickly above 0.85 µm. However, his values at 0.9 µm and beyond seem to be questionable, with A + R + T adding up to more than 100%. Hoehne (III-11), with similar thinking, measures heavy layers (135 nm). For discussion see Section V. Interestingly, Hoehne describes actual measurements of changes of photosensitivity and absorption (9 months apart) and finds a direct relationship. To our knowledge this is the only reference in which time changes of electric and optical characteristics of a PC are correlated (Fig. 6).
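Since the sources quoted here switch freely between wavelengths in µm and photon energies in eV, and since the consistency of a data set rests on A + R + T = 100%, a small helper may be useful when comparing published values. This is only a convenience sketch. The function names are hypothetical, and the sample triplet in the usage lines is invented for illustration and is not Varma's data.

```python
HC_EV_UM = 1.239842   # h*c expressed in eV * micrometer


def photon_energy_ev(wavelength_um):
    """Photon energy in eV for a vacuum wavelength in micrometers."""
    return HC_EV_UM / wavelength_um


def wavelength_from_energy_um(energy_ev):
    """Vacuum wavelength in micrometers for a photon energy in eV."""
    return HC_EV_UM / energy_ev


def energy_balance_excess(absorption, reflection, transmission):
    """Amount by which A + R + T exceeds unity (zero for consistent data)."""
    return absorption + reflection + transmission - 1.0


if __name__ == "__main__":
    print("0.85 um corresponds to", photon_energy_ev(0.85), "eV")
    print("2.0 eV corresponds to", wavelength_from_energy_um(2.0), "um")
    # Invented fractions, only to show how an inconsistent triplet is flagged.
    print("excess over 100%:", energy_balance_excess(0.30, 0.35, 0.40))
```

A positive excess indicates that at least one of the three measured quantities must be in error, which is the objection raised above against the long-wavelength values.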
FIG. 6. Photosensitivity S and optical absorption A of an S-20 photocathode 8 days (I) and 9 months (II) after fabrication. Abscissa: photon energy (eV). [From Hoehne (III-11).]
FIG. 7. Optical characteristics of S-20 trialkali (Sb-K-Na-Cs) surface: (a) #SS20, initial luminous sensitivity 125 µA/lm; (b) #086, initial luminous sensitivity 160 µA/lm. Abscissa: photon energy (eV). [From Timan (III-5).]
The author (III-5) gives substrate reflection R_s, vacuum reflection R_v, T, and A values for several S-20 surfaces with sensitivities of 125-170 µA/lm and computes n, d, and k from these data (Figs. 7 and 8). These thin cathodes (30-45 nm) do not show any indication of a reflection minimum; n and k seem to be complementary, with lower n values correlating with lower k values for the thicker cathodes, thus keeping the physical absorption rather than individual values independent of cathode thickness. As for all alkali antimonides we find reasonable constancy for the
FIG. 8. Optical constants of SS-20 trialkali (K-Na-Cs-Sb), d = 32.4 nm; 086 trialkali (K-Na-Cs-Sb), d = 45.35 nm: refractive index n(λ) and absorption coefficient k(λ) versus wavelength λ (µm). [From Timan (III-5).]
refractive index only beyond 0.55 µm, while values toward the blue increase rapidly. This behavior would be expected for an absorption edge. For the S-20, numerical differences of n seem to correlate with the numerical values of the reflection (higher n, higher R for thinner cathodes). There is a deplorable lack of information about the high-sensitivity S-25: no optical data have been reported, although the cathodes have been produced under various trade names (ERMA I, II, Extended Red, etc.) for nearly ten years.
C. Time Response of PC

The time response of PC is usually not a limiting factor in pulse operation. Ronkin (III-12) states that all major PC types perform equally well for pulse rates up to 10^8 pps. Duchet (III-13) reports that the high resistivity of one of his S-20 adversely affects time response. Similarly, Garfield and Folkes (III-14) report considerably improved performance for high-current fast pulsing of S-20 with a copper mesh (55% T, 750 mesh/in.) embedded in the glass substrate. They also confirm the unsuitability of Nesa (tin oxide) undercoating for the S-20 because of its poor resistance to Na. Earlier sources estimated the time interval between light impact and photoelectron release as 10^-12 to 10^-14 sec.
D. Uniformity of PC

Reliable measurements of PC uniformity per se are hard to come by in recent work. Most data refer to uniformity of MPT, where the collection efficiency and gain uniformity of dynode 1 distort the PC data. This has been recognized by Gulakov et al. (III-15). Budde (III-16) reports ±25% from a median as typical for virtually all well-processed cathodes. This is certainly true only for regions of medium and high absorption, while changes for lower absorption are much greater. For the nonuniformity of the S-1 we refer to Section V, where an explanation of the observed strongly nonuniform IR response is reviewed. The formation of the S-1 resembles that of the III-V cathodes, where severe nonuniformity is also observed after initial Cs-O cycles (III-17). As a curiosity we mention the idea of Lazarenko and Tokareva (III-18) of photographically reproducing the response contour of the PC and then using this as a mask to reduce nonuniformity.
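The uniformity figure quoted as a deviation from a median, and the masking idea mentioned as a curiosity, can both be made concrete in a few lines. The sketch below is only one possible reading: it assumes that uniformity is expressed as the largest relative deviation from the median of a point-by-point scan, and that the mask is an ideal neutral-density pattern. The scan values and function names are hypothetical.

```python
from statistics import median


def deviation_from_median(responses):
    """Largest relative deviation of a response scan from its median."""
    mid = median(responses)
    return max(abs(r - mid) / mid for r in responses)


def compensating_mask(responses):
    """Mask transmission per point that levels the response to its minimum."""
    floor = min(responses)
    return [floor / r for r in responses]


if __name__ == "__main__":
    scan = [0.80, 0.95, 1.00, 1.10, 1.20]   # hypothetical relative responses
    print("max deviation from median:", deviation_from_median(scan))
    mask = compensating_mask(scan)
    print("masked response:", [r * m for r, m in zip(scan, mask)])
```

Such a mask can only attenuate, so the response is levelled at the value of the least sensitive point, and the gain in uniformity is paid for in overall sensitivity.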
E. Structure of Base Layers
Authors more recently have considered the effects of the nature and structure of base layers (Ag, Sb thin films) on subsequent PC formation. The usual technique of monitoring the base layer by optical transmission T of a (usually not even well-defined) “white-light” source permits only crude control. Monitoring the reflection changes of the glass (or other) substrate, as practiced in the production setup of the DuMont laboratories, is slightly preferable because it better indicates the bridging of the nuclei. Generally the study of these Ag and Sb films belongs to the realm of thin films. We will concern ourselves only with reports directly relating to PC formation techniques.

S-1. For Ag films the classical work by Sennett and Scott (III-19) is still a good guide. The author studies superficially oxidized Ag films (III-20) in different preparations and concludes that their properties determine the quality of the PC formed. An alkali prewetting of the glass substrate (Cs or K) appears essential for best performance. He finds Ag films with type A properties (clear blue to light blue-violet color, preservation of island structure, low conductivity, low IR absorption) and with type B properties (grayish blue-metallic, at least partial bridging of islands, high conductivity, high IR absorption), and postulates that only type A layers allow formation of high-sensitivity PCs (see Section V). Type B layers resemble layers prepared by the “classical” method with its intermediate complete oxidation of the Ag film. For physical data and electron micrographs of type A layers see Timan (III-20); for a description of the classical method see, e.g., Sommer (III-21). Unsuccessful experiments with other metal base layers are described by Sommer (III-22).

S-10. No detailed study of the nature of the Bi-Ag layer is known; for preparation techniques compare (III-21, p. 169).
Alkali antimonides. Sb films are less complex in growth and properties and we have more studies with reasonable consensus. Garfield justifiably criticizes the use of light transmission T as a criterion for the condition of the Sb layer (III-23). In his microbalance study he shows that the nature of the substrate (specifically also the often used pretreatment with K or Cs) changes the relationship between surface density or weight and T (Fig. 9). This demonstrates the dubious value of relating weight or weight-computed surface densities to the effective optical thickness d. Such correlations are usually based on the work by Condas (III-24).
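For orientation, the weight-computed thickness criticized above is simply the surface density divided by a density. The sketch below assumes a uniform slab with the bulk density of antimony (taken here as 6.69 g/cm³), which is precisely the assumption that island films need not satisfy. The function name and the sample value of 3 µg/cm² are hypothetical.

```python
SB_BULK_DENSITY_G_CM3 = 6.69   # assumed bulk density of antimony


def mass_equivalent_thickness_angstrom(surface_density_ug_cm2,
                                       density_g_cm3=SB_BULK_DENSITY_G_CM3):
    """Thickness of a uniform slab carrying the given mass per unit area."""
    thickness_cm = surface_density_ug_cm2 * 1.0e-6 / density_g_cm3
    return thickness_cm * 1.0e8      # 1 cm = 1e8 angstrom


if __name__ == "__main__":
    print("3 ug/cm^2 of Sb, slab model:",
          mass_equivalent_thickness_angstrom(3.0), "angstrom")
```

The result is a mass-equivalent figure only; an island film with the same weight can have quite different optical behavior, as the transmission curves for differently pretreated substrates indicate.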
FIG. 9. Light transmission as a function of antimony surface density (µg/cm²) for Sb films. A, Pyrex (K flush); B, silica (K flush); C, mica (K flush); D, silica (K flush); E, Pyrex (no K flush); F, mica (no K flush); G, silica (no K flush). [From Garfield (II-18).]
McCarroll (III-25) discusses Sb films on glass and on MnO thin films (S-11 preparation). He determines (by quartz crystal oscillator balance) the weight of these Mn films as 2-2.5 µg/cm² for the usual 90-94% T and their thickness as 43-58 Å for 90-80% T. Similarly to Garfield, he makes the interesting observation that Sb films on MnO are “optically denser” than those directly on glass. He shows the oxidized Mn to be MnO from direct measurement of the lattice constant a as 4.44 Å (bulk 4.45 Å). He finds Sb films on glass, MnO, and C to be amorphous to at least 30 Å and observes transition to crystalline form for heavier layers upon heating (annealing) to 150°C (only partial for layers below 30 Å). Hall and Eastment (III-26) find island structure and the same transition in the Sb thin films for a much higher computed value of d ~ 12 nm. This value is also mentioned by Robbie and Beck (III-27). Hall observes an interesting coincidence of higher conductivity and higher photoelectric response (at 248 nm) at and beyond the crystallization transition. Robbie confirms the amorphous nature of thin Sb films (relating, e.g., 85% T to d ~ 4.2 nm) and adds that they remain so when heated to formation temperature, while he confirms McCarroll on crystallization of heavier layers. He postulates preferred growth of the crystallites into states of lowest free energy. Ghosh and Varma (III-28) studied Sb films by electron microscopy under different vacuum conditions. They also confirm growth in island structure (average
width 30 nm) and report joining of islands to an electrically continuous film (?) at 12 nm (at 10 nm in ultrahigh vacuum). They find that gases, either from poor vacuum or from alkali channels, will prevent crystallization altogether. In practice light transmission of Sb bases for all high-yield PCs ranges from 90 to 70%. We can therefore assume that these films will nearly always be amorphous with island structure. Little attention has been given to other physical properties of these layers (specifically absorption) and their possible influence on PC performance. This has been shown to be of significance at least in the case of the S-1. In recent years formation techniques have been perfected which eliminate base layers altogether. For example, the S-20 can be formed successfully by alternating K and Sb in small steps, initially directly onto the substrate (see, e.g., ref. III-28), and similar techniques should be applicable to all other PC types (see Section V). As an interesting thought we mention the deposition of Cs₃Sb on a lattice-matching substrate of CsI (see Section III, H).

F. Hall Effect; Photoconductivity
We found only one attempt to measure charge carrier density p by means of the Hall effect in the S-20. Hofmann and Deutscher (III-29) confirm p-type conductivity with p ~ 5.4 × 10^1? to 1.6 × 10^18. Sommer (III-21, p. 86) points to the inherent difficulties of measurement of photoconductivity (Pc) in photoemitters. It is not clear how much of the old work by Spicer (III-30) is still valid, and more recent data are sparse. For example, Spicer’s reported value of E_g ~ 1.6 eV for the Cs₃Sb would imply an enormous amount of defect or impurity absorption in view of considerable optical absorption extending to 1.1-1.2 eV (see, e.g., ref. III-5). The photoconductivity of high-sensitivity (200-300 µA/lm) S-20 was measured by Bhatia et al. (III-31), who found a Pc peak at 0.55 µm (close to the photoemissive peak of 0.5 µm) and a Pc threshold at 1.1-1.5 eV interpreted as a “fundamental absorption edge.” Ghosh and Varma (III-28) show band gap or Pc threshold E_g for the alkali antimonides. For numerical values see Table III. However, the reported low value of E_g ~ 1.05 eV for the S-24 appears unlikely in view, for example, of the optical data given in Jedlicka (III-7). As Fisher had already pointed out (III-32), this and the postulated photoelectric threshold (smaller than for K₂CsSb) contradict the general rule of lower E_g for PCs containing Cs. These data and the high luminous sensitivity (> 100 µA/lm) reported strongly suggest Cs contamination of the S-24. It is known that even minute additions of Cs (or Rb) drastically alter the spectral response, thermionic emission, and
TABLE III
COMPILATION OF E_g, E_a, φ_th FROM RECENT SOURCES

Cathodes: Cs₃Sb; S-11; S-11-Bx3 (30 µA/lm); S-11-#2 (30 µA/lm); Rb-Cs-Sb; Na₂KSb; K-Cs-Sb; B-1 (45 µA/lm); B-3 (48 µA/lm); S-20; T-1 (170 µA/lm); Surface (190 µA/lm); S-25; Surface 300 µA/lm; 280 µA/lm
E_g (eV): 1.60; 1.60; 1.10; 1.05; 1.80-2.00; 1.40; 1.30 (C)a; 1.80-2.0; 1.00; 1.00-1.00; 1.30 (C); 1.00; 1.20; 1.20; 1.20; 1.00; 1.35-1.45; 1.05; 1.00; 1.10; 1.10; 1.00
Remaining columns (eV): -; 0.45; < 0.45; 0.80; 0.85; -; -; -; -; 0.25; -; 1.23; -; -; 1.40-1.75; 1.00; 0.70-0.75; -; -; 1.75-2.00; -; -; 1.10; 0.70; 0.85; 0.80; 0.55; 0.45; 0.27; 0.24; 0.30; 0.23; -; 1.56; 1.44; 1.03; -; 1.00
References: VI-1, VI-29; VI-29; VI-7; VI-7; VI-19; VI-19; V-25; II-26; VI-29, VI-1; VI-8, VI-31; V-20; VI-29; VI-8; VI-7; VI-7; VI-29; VI-2; VI-7; VI-14; VI-8; V-36; VI-14

a (C): data from conductivity measurements.
thermal performance of the S-24. Neither Fisher nor Ghosh and Varma discuss the possibility of Cs contamination or precautions taken. Pc of the S-1 and S-10 has not been reported.

G. Resistance or Conductivity
Very similar problems are encountered in the measurement and interpretation of PC resistance R. The usual method consists of measuring the surface resistivity (ohm per centimeter) between conductive strips on the PC substrate. It is then difficult to judge whether we actually measure bulk film resistivity or some surface or contact phenomenon. The island or microcrystalline structure of the films, depending on its degree of preservation, may also cause R to change spontaneously or with time (see, e.g., S-1). It seems difficult under these circumstances to assign detailed theoretical meaning to the data on finished
cathodes as they are probably of a qualitative nature only. Still, PC types show significant differences in their average R values and we find important practical consequences: for example, observation of R during formation has been used to establish the composition and nature of the PC film.

S-1. Numerous data are given in Timan (III-33) which show a considerable spread from 10^4 to 10^1? Ω/cm², with individual values certainly depending on the state of oxidation of the interfaces and on the nature of the island structure. R values are given for several years of observation and they show a disturbing trend to increase with time, indicating deteriorating contacts rather than actual surface changes. Similar problems may occur in other PCs, which would make individual data rather suspect. Resistance effects limiting PC response do not seem to exist for the S-1 or the similar S-10. Indeed, initial instability or dark-current surges after illumination, as reported for the alkali antimonides, are not reported (see Section II). Only one attempt at determination of activation energy is known. Timan (III-34) arrives at ΔE ~ 0.175 eV for a cathode formed on a type A Ag layer and at a very small value of ΔE ~ 0.5 × 10^-? eV for one formed on a type B layer. The curves do not show a bend for the range of 0-90°C.
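The determination of an activation energy from the temperature dependence of R can be sketched as a straight-line fit of ln R against 1/T. The example assumes the simple Arrhenius form R = R₀ exp(ΔE/kT), which the films discussed here may follow only approximately. The data in the usage lines are synthetic, and the function names and the knee temperature are hypothetical.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant in eV/K


def activation_energy_ev(temps_k, resistances):
    """Slope of ln R versus 1/T, converted to an activation energy in eV."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(r) for r in resistances]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return slope * K_B_EV


def segment_energies(temps_k, resistances, knee_k):
    """Separate fits below and above an assumed knee temperature.

    Each side of the knee must contain at least two data points.
    """
    low = [(t, r) for t, r in zip(temps_k, resistances) if t <= knee_k]
    high = [(t, r) for t, r in zip(temps_k, resistances) if t > knee_k]
    return (activation_energy_ev(*zip(*low)),
            activation_energy_ev(*zip(*high)))


if __name__ == "__main__":
    temps = [273.0 + 10.0 * i for i in range(10)]           # 0 to 90 C
    synthetic = [1.0e6 * math.exp(0.175 / (K_B_EV * t)) for t in temps]
    print("fitted activation energy (eV):", activation_energy_ev(temps, synthetic))
```

Where a plot shows a knee, fitting the two branches separately gives two slopes, but the caution expressed in the text about reproducibility applies equally to any energies obtained this way.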
Alkali antimonides. Most measurements try to establish activation energies of defect levels from the plot of R(1/T). These measurements do not seem to be reproducible with any degree of accuracy; numerical values are rather uncertain and the temperature range limited. These problems have already been pointed out by Sommer (III-21, pp. 79ff), e.g., with a Cs₃Sb example where values change drastically from one run to the next. It appears that most sources (but not all) indicate a slightly material-dependent knee or bend in their plots around 80-100°C (see, e.g., ref. III-35; also III-8, III-28). This would imply generally the existence of at least two distinct defect levels separated by ΔE ~ 0.25-0.8 eV. It must be noted that actual physical changes at these more elevated temperatures cannot be excluded. Changes of this nature would tend to be reversible upon cooling. We have few surface resistivity data on finished cathodes in practical use. However, a rank order of R magnitude seems to be established: S-20 < S-24 < S-4, S-11 < oxidized Cs₃Sb < K₂CsSb, the most resistive. In practice the same order determines nonlinearities of PC response to higher light intensities. Here we also observe the effect of “voltage dependences” denoting an abnormal current-collector voltage curve, even for low voltage ranges, e.g., 20-400 V, indicating marked resistance effects. Values may differ by as much as two or three orders of magnitude for the same surface type, with S-20: R ~ 10^5 Ω/cm² up to K-Cs-Sb: R ~ 10^1? Ω/cm². Values of R of course also
influence PC changes at low temperatures, which again are least pronounced for the S-20 (see Section I).
H. Structure of Photocathodes

Here we review sources describing the physical structure and appearance of PC films.

S-1. In agreement with older sources the author shows (III-34) in numerous electron micrographs that the island or granular structure of the Ag base layer is preserved in finished cathodes. The structural and optical resemblance between base layer and finished cathode seems to be independent of substrate pretreatment, evaporation speed, oxygen interaction, and processing. The microcrystals apparently grow to an optimal thickness by taking up cesium relative to their initial thickness (see Section V, also ref. III-36). Cathodes of this structure have the best sensitivities reported so far. In macroscopic appearance these films range from light-medium clear blue to clear violet-blue. Heavy, metallic areas almost always indicate breakup of the structure, most likely oversaturation with Cs (III-34). Electron diffraction studies have shown clear evidence only of Ag. While Cs-O sites determine the interface property of the finished cathode in an obviously highly critical way, the bulk Ag content does not appear to be critical. Ag atoms obviously have great freedom of diffusion in the film. This has been shown by direct observation of surface resistance after additional Ag evaporation in the Asao treatment and by corresponding changes in the spectral response (III-34, III-37). The recently investigated idea of forming heavy Cs-O layers on very thin Ag substrates or adding small amounts of Ag to such layers most likely produces more homogeneous films. So far structural studies of these films have not been reported. For a description of these ideas (largely still unproven in practical use but very likely of good potential) see Heiman (III-38), Pakhomov and Melamid (III-39), and Timan (III-33).

S-10. We have no reports at all about film structure. The color of these cathodes is reported to be gray with a bluish tint.
From optical and spectral data it appears that this cathode is a “mix” of the two PC groups. It would appear to be of theoretical interest to study its structure.
Alkali antimonides. Here we find some degree of agreement among different sources. It seems assured that the effective photoemitters in this group have cubic crystal structure with p-type conductivity, while the low-efficiency combinations have hexagonal symmetry with n-type conductivity. The physical and crystal structure of these PC thin films has been studied by
electron diffraction and by electron microscopy. Unfortunately all of these methods are of an invasive nature and/or require special substrates or poorer vacuum conditions. Beck and Robbie (111-40) describe the apparatus and the difficulties encountered in diffraction PC studies. Generally diffraction patterns are typical of a randomly oriented polycrystalline material; lines possibly indicating a preferred orientation or ordered state are usually difficult to discern or interpret. Most authors assume-simply by transferring bulk concepts-that the bulk of the PC film, in its optimally processed state, consists of the alkali,-Sb form. The dangers of assuming a well-defined composition-especially in multialkali cathodes-are discussed in Section V; also compare Kansky (1Z1-41).
The most recent structural study was made by Dowman et al. (III-42) using scanning electron diffraction. For the monoalkali films in particular, their results agree well with older sources.
Na₃Sb. Na₃Sb has been observed only in hexagonal form with n-type conductivity; the color of the film is an indistinct light yellow.

Rb₃Sb. Dowman states that in thicknesses producing good photoemission this material exists only in cubic form with a lattice constant d = 8.80 Å. Older sources, measuring heavier layers, mention the existence of a hexagonal form and coexistence of, or transitions between, the two states (compare Sommer, III-21, p. 104). The color is similar to that of Cs₃Sb: reddish brown with a violet tinge.
K₃Sb. This compound now has little practical value per se, but is formed as an intermediate in bialkali and trialkali cathodes. Sommer and McCarroll (III-43) report a hexagonal crystal structure (violet in color) upon the usual exposure of an Sb film to K, and a different, more sensitive, cubic form of brown color, formed when an intermediate oxygen glow discharge is applied to the Sb base layer. Ninomiya (III-44) in turn reports formation of a cubic compound of violet color with d = 8.49 Å. Dowman maintains that a cubic compound (again with d = 8.50 Å) forms first and then changes spontaneously into the hexagonal form (a = 6.03 Å, c = 10.69 Å). In contrast to Sommer he finds higher sensitivity for the cubic form. Fortunately, these considerable differences do not have much practical significance. In multialkali formation the color of the initial unoxidized K-Sb compound is almost always violet, indicating cubic structure.
Cs₃Sb. For this well-studied material, old and new sources overwhelmingly agree on the cubic structure (D0₃) with a lattice constant d = 9.14-9.15 Å and p-type conductivity. These findings also agree excellently with the analysis of bulk Cs₃Sb, which shows very similar data.
McCarroll (III-45) has studied electron diffraction patterns of Cs₃Sb on Mn-O or directly on thin carbon film substrates, with PC thicknesses from 150 to 350 Å. He and Dowman both report the same polycrystalline phase, an FCC cell with d = 9.14 Å, with predominant lines {h + k + l} = 4n. An ordered state was achieved only after the PC films were annealed for 3 days at 220°C (which of course destroyed all photosensitivity). In a later report McCarroll (III-46) estimates the minimum size of crystallites as 75-85 Å. He, as well as Dowman (III-42), finds that processing to maximum photoemission coincides with the formation of the cubic structure. Dowman finds “superlattice” rings {h + k + l} = 4n + 2, as well as {h + k + l} = 4n ± 1, in his scans of some Cs₃Sb surfaces, indicating some degree of preferred orientation. He also finds that, on cooling of the PC, addition of Cs was necessary to preserve optimum sensitivity and cubic structure. This “slump” is of course not usually encountered in formation and is obviously due to the relatively poor vacuum conditions in those diffraction studies. Kansky (III-41) describes cathode formation as a “penetration of alkali atoms between stationary Sb atoms” which leads to an “isotropic swelling” of the Sb film, increasing the volume 5.0-6.3 times.

In the group of multialkalis more difficulties in interpreting results are encountered.
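The line families quoted above for the cubic Cs₃Sb phase can be enumerated directly. The following is a minimal sketch, assuming the standard face-centred-cubic selection rule (h, k, l all even or all odd) and the cubic spacing formula d = a / sqrt(h² + k² + l²). The function names are illustrative only, and the lattice constant is the 9.14 Å value quoted in the text.

```python
from itertools import product
from math import sqrt

A_CS3SB = 9.14  # lattice constant in angstroms, as quoted in the text


def fcc_allowed(h, k, l):
    """FCC selection rule: the indices are all even or all odd."""
    return h % 2 == k % 2 == l % 2


def classify(h, k, l):
    """Label a reflection by the residue of h + k + l modulo 4."""
    r = (h + k + l) % 4
    if r == 0:
        return "4n"
    if r == 2:
        return "4n+2"
    return "4n+/-1"


def reflections(a, max_index=4):
    """Return (hkl, family, d-spacing) for FCC reflections, largest d first."""
    rows = []
    for h, k, l in product(range(max_index + 1), repeat=3):
        # Keep one representative per permutation and skip (0, 0, 0).
        if (h, k, l) == (0, 0, 0) or not (h >= k >= l):
            continue
        if fcc_allowed(h, k, l):
            d = a / sqrt(h * h + k * k + l * l)
            rows.append(((h, k, l), classify(h, k, l), d))
    return sorted(rows, key=lambda row: -row[2])


for hkl, family, d in reflections(A_CS3SB):
    print(hkl, family, f"{d:.3f} angstrom")
```

For an FCC cell, all-even indices always fall into the 4n or 4n + 2 families and all-odd indices into the 4n ± 1 family, so the three labels cover every allowed reflection.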
Rb-Cs-Sb. This material has increasingly become the replacement for the simple S-11 cathode, but there is only one brief mention by Dowman, indicating a cubic structure similar to that of Cs₃Sb.

K-Cs-Sb. McCarroll (III-47) and Dowman (III-42) agree on a cubic (D0₃) symmetry with a lattice constant d = 8.61 Å for the most efficient samples, with a presumed composition K₂Cs-Sb.
Na-K-Sb. More effort has been expended on this mixture, as it is considered to be the basis of the S-20. Upon annealing powdered Na-K-Sb cathode material for several hours at 160-220°C, McCarroll (III-46) obtains a cubic structure with d = 7.727 Å, in excellent agreement with values for bulk material produced by direct synthesis. In earlier work McCarroll and Simon found direct evidence that the cubic phase and optimum photosensitivity occur only for Na:K ratios close to 2:1. These findings have been confirmed by later work of Dowman (III-42).
S-20. There is still considerable argument about the distribution of constituents and the physical structure. It is of course very likely that the many different ways of producing reasonably good S-20 cathodes cause the wide variations in results and interpretations. It should be noted that similar problems of widely varying preparation techniques are much less evident in the other alkali antimonides. Ninomiya (III-44) states that the crystal structure of his S-20 (with sensitivities of 160-280 µA/lm) is the same as that of Cs₃Sb but with d = 8.72 Å instead of 9.14 Å. He concludes that, to account for the “shrinking” of the lattice, about 30% of the Cs atoms have been replaced by K and Na atoms. From rather unclear transmission electron micrographs, the size of the S-20 crystallites is estimated to be 1000-2000 Å. These views have not been supported by later reports. Garfield (III-23) finds cubic symmetry with d = 7.73 Å, very similar to that of Na₂K-Sb. McCarroll (III-46) reports a small but in his opinion significant swelling of the lattice constant from d = 7.72 to 7.745 Å upon conversion of the Na₂K-Sb into the S-20 by Cs-Sb cycling. Dowman finds that diffractograms from his cathodes differ considerably despite similar preparation techniques (III-42). For example, he finds rings indicating Na₂K-Sb (cubic), NaK₂Sb (hexagonal), and K₂Cs-Sb (cubic) all together in some samples. His very best preparations (≈300 µA/lm), however, show signs of cubic Na₂K-Sb only. These divergent views only confirm that our present knowledge about the S-20 is simply not sufficient to permit us to arrive at a valid model for its physical structure. It is doubtful whether a single such structure for the S-20 can be expected at all, or whether present-day techniques instead result in a broad spectrum of a whole related family of trialkali systems.
IV. ENHANCEMENT OF PHOTOEMISSION

A. Field Enhancement
The application of external fields leads to a reduction in the surface barrier or work function φ (the so-called Schottky effect) and consequently changes spectral response and yield. The effect is of theoretical and also of practical interest, especially in proximity-focused image tubes. It is generally assumed that the effects are governed by the simple Schottky equations, derived from the image force at the PC-vacuum interface:

$$ i_E = i_0 \exp\left(c E_0^{1/2}\right), \qquad \phi_E = \phi_0 - \left(\frac{e^3}{4\pi\varepsilon_0}\right)^{1/2} E_0^{1/2}, $$

where $E_0$ denotes external field, $e$ electron charge, and $\varepsilon_0$ vacuum permittivity. Burroughs (IV-1) and Borzyak (IV-2) discuss the S-1. Borzyak reports a gain G = 5.0 at 0.9 µm but only G = 1.10 at 0.7 µm in the peak region, accompanied by a strong increase in thermionic emission. Burroughs describes an extension of the threshold (0.08 µm for $E_0$ = 10⁴ V/cm for a 37-µA/lm cathode) inconsistent with the Schottky effect. Of course the microcrystalline
structure of the S-1 hardly allows the use of an average φ, because we certainly have patches of widely differing surface barriers. In addition, the infrared and thermionic electrons, with their small “eigenenergy,” will be the most affected by decreases in the vacuum barrier. Ghosh and Varma (IV-3) and Cochrane and Thumwood (IV-4) have studied the S-20 and S-25. Both find the expected enhancement of emission in the low-photon-energy region, with little or no measurable effect in the region below 0.5 µm. They confirm the validity of the Schottky equations for the dependence on external fields. Cochrane observes a “selective maximum” of enhancement near the threshold without any noticeable extension of the threshold itself. In contrast, the older work by Hollisch and Crowe (IV-5), for fields of 4.3 kV/cm, indicates considerable extension of the threshold, with the most pronounced increases in the very threshold region of 0.95-1.0 µm (Fig. 10). Theoretical considerations would tend to support Hollisch’s findings, as it is difficult to understand why the predicted change in the work function should not express itself in the photoelectric threshold. Holtum and Hopkins (IV-6) use the validity of the Schottky equations to argue against the existence of a heterojunction barrier at the vacuum interface. The useful application of field enhancement is limited by a corresponding increase in thermionic emission. Cochrane (IV-4) reports an apparent “breakdown” of the surface barrier at a field of 5.5 kV/cm for the S-20, probably caused by asperities at the interface due to sharp points or similar foreign objects. Generally it must be expected that thermionic emission will be enhanced at least as much as the near-threshold photoyield. An improvement in S/N can therefore be expected only for pulsed operation or with cooling.
For alkali antimonides Cochrane asserts that the type of cathode material and the level of sensitivity have little influence on field effects. While properly processed and aged cathodes usually show a low-voltage saturation plateau (roughly extending from 50 to 300 V/cm) for all wavelengths, it is known that improperly or incompletely processed cathodes show strong voltage dependence. This is more pronounced if the cathode film also displays high resistivity.

In conclusion, the literature agrees completely on the validity of the Schottky equations for field effects at any given spectral line. There is general agreement about the spectral dependence of the effect; detailed data, however, vary widely. It does not seem possible to explain the spectral structure without much more detailed assumptions about the physical distribution of emission centers and of photoelectron energies. Indeed, measurements of field effects in the transmission and reflection incidence modes could help to verify emission models. Work of this nature has not yet been performed. For a detailed, still useful, more theoretical discussion of field effects we recommend ref. IV-7.
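To give a feeling for the magnitudes involved, the image-force barrier lowering discussed in this section can be evaluated numerically. The sketch below assumes the standard SI form of the Schottky lowering, sqrt(eE / 4πε₀) expressed in electron volts, and converts it into a shift of the threshold wavelength. The function names and the 1.1 µm starting threshold are illustrative assumptions, not values taken from the cited papers.

```python
from math import pi, sqrt

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
HC_EV_UM = 1.239841984      # h*c expressed in eV * micrometre


def schottky_lowering_ev(field_v_per_cm):
    """Image-force barrier lowering in eV for a field given in V/cm."""
    field_v_per_m = field_v_per_cm * 100.0
    return sqrt(E_CHARGE * field_v_per_m / (4.0 * pi * EPS0))


def shifted_threshold_um(threshold_um, field_v_per_cm):
    """Threshold wavelength after lowering the barrier by the Schottky term."""
    phi0_ev = HC_EV_UM / threshold_um
    return HC_EV_UM / (phi0_ev - schottky_lowering_ev(field_v_per_cm))


# Fields mentioned in the text: 4.3 kV/cm, 5.5 kV/cm, and 10^4 V/cm.
for field in (4.3e3, 5.5e3, 1.0e4):
    lowering = schottky_lowering_ev(field)
    shifted = shifted_threshold_um(1.1, field)  # 1.1 um is an assumed threshold
    print(f"{field:8.0f} V/cm: lowering {lowering:.4f} eV, threshold {shifted:.4f} um")
```

For a field of 10⁴ V/cm the lowering comes out at roughly 0.04 eV, which under the assumed starting threshold corresponds to a shift of a few hundredths of a micrometre; this is the scale against which the reported threshold extensions can be compared.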
FIG. 10. Sensitivity (logarithmic scale) as a function of wavelength (400-1000 nm), measured (1) at normal incidence and low field, 226 µA/lm; (2) with field enhancement only, 244 µA/lm; (3) with optical enhancement only, 478 µA/lm; (4) with both optical and field enhancement, 516 µA/lm. [From Hollisch and Crowe (IV-5).]
B. Optical Enhancement
Optical gain would appear to be a painless and logical way to increase the QE of photocathodes. Nonetheless, optical enhancement methods have found only limited applications in practice. Two approaches are possible: (1) Making use of the fact that total reflection between an optically denser medium (PC) and the substrate (glass) occurs at larger angles of incidence, to ensure multiple light passes through the PC. The beam is usually introduced into the PC via a prism. (2) Tuning the PC thin film to a maximum response at a desired wavelength by means of one or more dielectric spacers.
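Approach (1) rests on total internal reflection, and the onset angle can be estimated with a short calculation. The sketch below assumes a stack of parallel layers, for which Snell's law keeps n·sin θ constant from layer to layer, so that the condition for total reflection at the final vacuum interface depends only on the angle and index in the glass. The glass index of 1.5 and the function name are assumptions for illustration.

```python
from math import asin, degrees


def critical_angle_deg(n_dense, n_rare=1.0):
    """Critical angle (degrees) for total internal reflection, dense to rare medium."""
    if n_dense <= n_rare:
        raise ValueError("total internal reflection needs n_dense > n_rare")
    return degrees(asin(n_rare / n_dense))


# Assumed glass index; vacuum on the far side of the photocathode.
n_glass = 1.5
print(f"critical angle in glass: {critical_angle_deg(n_glass):.1f} degrees")
```

With the assumed index this gives an angle of about 42°. Because n·sin θ is conserved across the layers, the refractive index and thickness of the PC film do not enter the onset condition, although they do affect how much light is absorbed on each pass.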
FIG. 11. Relative gain A as a function of wavelength (nm) for different angles of incidence; S-11. [From Dvorak (IV-8).]
Dvorak (IV-8) beautifully details method (1) for the S-11 and S-20. His curves clearly show the onset of amplification at the angle of incidence θ = 40°, corresponding to the angle of total internal reflection for glass, and the effect of lighter absorption on the enhancement ratio. As he correctly notes, the existence of the effect is independent of the refractive index and thickness of the PC. The actual yield of course does depend on the physical properties of the PC (Figs. 11 and 12). A later study by Jones (IV-9) confirms in principle Dvorak’s findings for the S-20 but also demonstrates the significant differences between the two states of polarization. Several questionable assumptions (e.g., a constant value of n over a wide range) and the fact that only one PC (as MPT output) was investigated give his findings only qualitative value. Apparently alone, Shaw et al. (IV-10) have investigated the blue and near-UV enhancement of the S-20 with oblique incidence. They find a nearly constant gain of 1.5 for the region 0.22-0.45 µm investigated. This strongly
FIG. 12. As in Fig. 11 for S-20. [From Dvorak (IV-8).]
suggests that the optical characteristics of the S-20 do not change there as radically as in the visible. Hollisch (IV-5) gives tabulated data for the S-20 but arrives at much lower gain figures than the others. One report about the S-1 claims gains of 2.3-2.8 in the 0.8-1.05 µm region with θ = 50° (IV-11). For a detailed description of the formation and properties of the S-1 formed on a reflective substrate with dielectric spacer (Sp R) we refer to Timan (IV-12). The author did not find significant improvement of IR sensitivity in the coated as compared with the uncoated areas of the cathode. This result is also mentioned by Raffan and Gordon (IV-13). Pakhomov and Melamid (IV-14) introduced the original thought of increasing the QE of the S-1 by reducing optical absorption due to colloidal Ag particles and only then applying optical enhancement. The idea is based on
the high mobility of Ag atoms in the S-1, here utilized in a narrow strip cathode. To the best of my knowledge the idea has not found practical realization in available devices. Kossel et al. (IV-15) develop the theory of interference cathodes with one or two dielectric spacers, with or without a reflective substrate. They arrive at a theoretical gain of 3.7 in an Sp R S-20 tuned for λ_max = 0.73 µm, which they claim has been reached in practice. The Sp R can of course be used only for light falling directly on the PC-vacuum interface. Greschat et al. (IV-16) reproduce some of their curves and results for the S-11 and theorize on “corner cube enhancement,” concluding as expected that enhancement is most pronounced for very thin PC films. “Corner cubes” here means surfaces of hexagonal prisms used as the glass substrate. Unfortunately these authors do not present measured data. Raffan and Gordon (IV-13) report application of Sp R in image tubes, using an anodized Al mirror as dielectric substrate and reflector. Figure 13
FIG. 13. Spectral response enhancement of S-20 on glass and Al₂O₃-Al substrates, plotted against wavelength λ (Å). Curves: Al; Al₂O₃, 400 µA/lm; glass (trans.), 200 µA/lm; glass (vac.), 132 µA/lm. [From Raffan and Gordon (IV-13).]
shows their results. They claim wide-band enhancement with a “thin” dielectric substrate under S-20 cathodes with thicknesses ranging from 33 to 42 nm. In contrast, Deltrap and Hanna (IV-17), reporting application of Sp R in a large image intensifier, state that enhancement in the red can be achieved only with thick cathodes. They cite d = 90 nm for optimum enhancement at 0.8 µm with an SiO spacer of 150 nm. Most of these uncertainties seem to be due to the fact that efforts to measure n, d, and k independently were not made or were not feasible. Feldner et al. (IV-18) discuss enhancement of Cs₃Sb through a transmissive dielectric layer between glass and PC. Kossel (IV-15) gives the theoretical limit for enhancement of absorption in this case: $A_{\max} = A_{PC}/(1 - R_{PC})$, where $A_{PC}$ and $R_{PC}$ stand for the absorption and reflection of the PC itself. Feldner et al. used SnO₂ (gain in the red only) and CeO₂ (integral enhancement of 32%) as λ/4 spacers. They deny the existence of any improvement with an MnO layer, in contrast to experiments in which S-11 cathodes show considerable enhancement over plain Cs₃Sb films. For a detailed but inconclusive study of cesium antimonide on MnO we refer to McCarroll (IV-19). We mention also Chen and Wang (IV-20) in this connection. They report peak emission and much more stable operation for the thermionic cathode (W-Cs-O₂) if Cs reacts with preabsorbed oxygen. The S-11 could be partially a somewhat related effect. Metal films of 90-95% transparency have been used, especially on the highly resistive UV cathodes, to enhance linearity. For achieving optimum performance in the visible range these films are not suitable (see, e.g., ref. IV-19 for a discussion of Mn films). Greschat (IV-16) reports strongly enhanced response of Cs₃Sb when deposited on CsI crystallites. He explains it by a match of lattice constants [d(Cs₃Sb) ≈ 2d(CsI)] and the resulting lack of recombination at the CsI-Cs₃Sb interface.
This idea of matching substrate and PC is an interesting possibility which deserves further attention, as it may allow formation of conventional cathodes as monocrystalline films similar to III-V cathodes. Enhancement of S/N is discussed by Popov (IV-21) and Demyanova (IV-22), who suggest that only a small portion of the PC should be used in signal detection and that unwanted thermionic emission should be suppressed, either by magnetic defocusing (IV-21) or by metallizing the unused portion (IV-22). A similar concept of limiting the presence of active materials in an MPT to a small area is commercially realized in the DuMont KM 3054. Practical difficulties, as well as the nature of most MPT applications, have prevented full exploitation of the optical enhancement potential. Most MPT applications call for direct coupling of source and PC substrates. There is also
evidence that dielectric films are affected by alkalis and that theoretical predictions are seldom achieved; there is also of course the practical difficulty of measuring film thicknesses. We find “side-window” MPTs (PC deposited on a metallic reflector) available with enhanced low-absorption response, and EMI offers an S-20 with “prismatic” substrates with moderately enhanced red response; these, however, suffer from a much larger PC area with correspondingly increased thermionic emission. A good general discussion of design factors for optical enhancement can be found in Gunter et al. (IV-23). We feel that, especially for laser detection and similar narrow-band signals, optical enhancement still has great potential.
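The absorption limit attributed to Kossel in this section, the absorption of the bare PC divided by one minus its reflection, is easy to evaluate for trial values. The sketch below is only an illustration of that ratio: the function names are hypothetical and the sample absorption and reflection figures are assumed, not measured data from any of the cited studies.

```python
def max_absorption(a_pc, r_pc):
    """Upper limit on absorption when the reflection loss is fully recovered."""
    if not (0.0 <= a_pc <= 1.0 and 0.0 <= r_pc < 1.0):
        raise ValueError("absorption and reflection must be fractions, with r_pc < 1")
    if a_pc + r_pc > 1.0:
        raise ValueError("absorption plus reflection cannot exceed 1")
    return a_pc / (1.0 - r_pc)


def limiting_gain(r_pc):
    """Ratio of the limiting absorption to the bare absorption."""
    return 1.0 / (1.0 - r_pc)


# Assumed example: a film absorbing 30% and reflecting 25% of the incident light.
a_pc, r_pc = 0.30, 0.25
print(f"limit: {max_absorption(a_pc, r_pc):.3f}, gain: {limiting_gain(r_pc):.3f}")
```

The gain in this limit depends only on the reflection of the bare film, so a weakly reflecting cathode has little to win from an antireflecting spacer, whatever its absorption.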
V. FORMATION, COMPOSITION, AND SPECTRAL RESPONSE

While it is obvious that formation techniques and schedules will influence composition and emission characteristics, we do not have clear correlations among these parameters. In general every manufacturer, or even every researcher, has his own “special” method (most of them never disclosed), and common denominators exist only in the outlines.

S-1. Here it appears that the importance of certain absorption and optical characteristics has been established. A number of different formation techniques have been reported and discussed. For a description of the “classical” method, still being used by many researchers in its nearly fifty-year-old form, we refer to Sommer (V-1, p. 134ff). Timan (V-2) investigates a rather different formation schedule, essentially with much shallower oxidation of the Ag base layer, together with many variations. In his view the formation consists of a “buildup” of absorption in the lower-photon-energy region through Cs interaction or Cs-O cycling (the so-called “DuMont” method). This formation can be monitored by simple transmission (T) measurements, which confirm that the sensitivity peaks when the change in T (roughly equivalent to the change in absorption) is complete. It is stated that the amount of Cs which is active in the absorption changes depends on the initial condition of the Ag base layer. Thinner layers require more Cs-O cycles than thicker ones for optimization, both eventually reaching similar final optical characteristics (Fig. 14). It is implied that all microcrystallites of the Ag base (see Section III, H) are thus brought to similar final conditions. Through this “self-adjusting” of the Cs content we can give more physical meaning to the concept of film thickness for the inhomogeneous S-1. However, if initial conditions are too different, “patches” or severe response nonuniformities will result.
FIG. 14. Development of S-1 for different base-layer thicknesses. #423: pretreatment of 4-µA Cs on glass; 1-2, Ag evaporation for 8.75 min; 2-3, rearrangement of Ag layer; 3-4, 160-µm O₂ introduced; 4-5, glow discharge; 5-6, Cs drift at 120°C; 6-7, peaking response with O₂; 7-8, excess O₂; 8-9, Cs drift at 120°C; 9-10, traces of Ag and O₂. Thermionic emission 5.5 × 10^-[exponent illegible] A/cm², sensitivity 50 µA/lm; 2540, 2.1 µA/lm. #446: pretreatment of 4-µA Cs on glass; 1-2, Ag evaporation for 2.5 min; 2-3, rearrangement of Ag layer; 3-4, 160-µm O₂ introduced; 4-5, glow discharge; 5-6, Cs drift at 140°C; 6-7, O₂ interaction. Thermionic emission 5.4 × 10^-[exponent illegible] A/cm², sensitivity 52 µA/lm; 2540, 6.2 µA/lm.
For an example of the behavior of QE and absorption during the formation steps we reproduce Fig. 15. Here we see that Cs interaction produces major increases in green-to-IR absorption while blue absorption actually declines. Oxygen interaction strongly affects interface conditions, i.e., optically speaking, the reflectivity. The optical data of type A layers are similar to those of the optimized S-1. We can thus understand why only Ag allows S-1 type formation and why a “minimum thickness” is always a prerequisite for IR response, two points which were always raised in the older literature [cf., e.g., Sommer (V-1, p. 147)]. The observation of certain limitations of Cs uptake by the Ag base has also been discussed by Heiman (V-3, V-4), who in a different interpretation relates it to the amount of oxygen present in the classically prepared Ag base, and by Jennings and Dean (V-5). There are also reports that S-1 cathodes can be built by Ag-Cs or Cs-O layering (V-2, V-3, also early work by Asao). Timan again shows typical buildup of absorption in Ag-Cs cycling, with sensitivity peaking after one final superficial oxidation (V-2). Heiman and Timan discuss buildup of heavy
FIG. 15. Development of A, QE during S-1 formation (horizontal axis in eV). State I, Ag base layer; II, PC before oxidation; III, PC after oxidation.
Cs-O layers on a very thin Ag base (2-3% coverage). Both report that such films can reach good S-1 sensitivities by final Ag addition and also that too heavy Cs-O layers become insensitive, apparently as a consequence of destroying optimum optical conditions. Indeed, Timan shows that these cathodes, although formed in such a different way, again reach optical data close to those of the normal S-1. While photoemission between 0.3 and 0.4 µm is considered to be due to elementary Ag (see ref. V-1, p. 168), it would appear that beyond 0.45-0.50 µm only a certain part of the film absorption A, for example, the increase in A observed in Cs interaction with an Ag base, is indeed photoelectrically active. Extending these thoughts, Pakhomov (V-6) suggests minimizing the Ag content of the cathode, thus minimizing “useless” IR absorption. This idea unfortunately has not been tested technologically. These widely differing preparation techniques already indicate the difficulty of arriving at a nominal composition. Older literature as well as the extensive work by Heiman (V-3) suggest Cs₂O and Ag as the major constituents; however, the actual determination of physical amounts is rather questionable because of the aggressive nature of alkalis and their adherence to
virtually any type of surface. These problems have been discussed by Soboleva (V-7) and by Koosman (V-8). It therefore appears hardly possible to exclude the coexistence of other cesium oxides or to make a meaningful quantitative analysis. Specifically, it is unlikely that the Cs content of the finished cathode is determined by the amount of oxygen bound in the silver oxide of the Ag base, as Heiman suggests. Results by Timan indicate rather that this amount depends on the initial “optical thickness” and probably on some other physical factors (such as structure and conductivity) of the microcrystalline Ag base. This view is supported by the different absorption growth of thicker versus thinner layers, as shown before, and by the possibility of growing the cathode by Ag-Cs layering only. In the same vein Timan (V-9) shows that good cathodes are produced with widely varying contributions of Ag and Cs to the absorption. For example, surface 5235: 26.5% Ag, 40.5% Cs-O, sensitivity 40 µA/lm, IR 5.2 µA/lm; surface 5642: 35.5% Ag, 28.5% Cs-O, sensitivity 45 µA/lm, IR 5.5 µA/lm; surface A446: 50% Ag, 18% Cs-O, sensitivity 52 µA/lm, IR 6.2 µA/lm (percentages measured as absorption contributions). In view of this, and of the possibility of forming good S-1 cathodes with partial replacement of Cs by K or Rb (V-2), it is doubtful that anything like a definite composition can be determined. The effects of such “doping” with higher-work-function alkalis on spectral response and thermionic emission are shown in Fig. 16. Timan (V-9) reports the highest luminous sensitivity (60 µA/lm, with 12.5 µA/lm through the Corning 2540 IR filter) for the (Cs-Cs) combination on a type A Ag base layer.

Klimkin (V-10) described the existence of a “two-photon” effect, peculiar to the S-1, upon excitation with high energy density from a 1.76-µm laser. This effect cannot be produced in the S-20. Malherbe (V-11) reports S-1 image converter use out to 1.7 µm when heated to 50°C.
A careful analysis of the “tail response” of the S-1 types and its correlation with other physical characteristics (absorption, conductivity) is still lacking. Specifically, Malherbe’s observation contradicts other reports of the loss of IR sensitivity on heating (see Section I). Considerable changes are produced by additional evaporation of Ag on finished cathodes. Verma (V-12), Srinivasan (V-13), and Timan (V-14) discuss this very old Asao treatment. A direct proof of the great diffusion mobility of Ag in the S-1 is given in Timan (V-14). The results in most cases point to increased luminous sensitivity with small or negative effects beyond 0.95 µm. Timan (V-24) discusses in detail similar treatments with Bi, which lower thermionic emission. When small amounts of metals are evaporated, the vacuum-PC interface is most immediately affected, through corresponding changes in the surface states (see also Section VI).
FIG. 16. Spectral yield of S-1 as a function of processing, plotted against wavelength (µm). Curve A: (Cs,Bi prewet)-Cs processing; I_th 10⁻¹⁰ to 10⁻¹² A/cm². Curve B: (K,Bi)-Cs; I_th 10⁻¹¹ to 10⁻¹³ A/cm². Curve C: (Cs)-Cs; I_th 10⁻⁹ to 10⁻¹¹ A/cm². Curve D: (Cs or K)-K or (Rb or Cs)-Rb; I_th 10⁻¹² to 10⁻¹⁴ A/cm². Curve E: SEDEC S-1 curve.
S-10. Nothing new to report here. All the questions raised by Sommer (V-1, p. 172) are still unanswered. Because of the advances in S-20 technology, this cathode has lost most of its applications to that higher-performance competitor.
Cs₃Sb and related. The formation of Cs₃Sb and the related S-11 is the least complicated among the alkali antimonides and has been covered extensively in the older literature. Robbie (V-15) discusses electron diffraction studies of Sb films and Cs₃Sb (see also Section III,E). His conclusion that layers which remain amorphous at formation temperature (85% or more light transmission) give better cathodes is generally in agreement with manufacturers’ experience. Sommer’s elegant method of determining n- or p-type conductivity from the behavior of resistance relative to photoemission (see ref. V-1, pp. 84ff) has incidentally been used by later researchers for the other alkali antimonides. His derivation, as well as other methods, suggests a stoichiometric excess of Sb in the Cs₃Sb cathode. Garfield and Thumwood (V-16), in their microbalance study, arrive at a Cs:Sb ratio of 2.6 to 2.9 for optimum sensitivity, which also coincides with highest optical absorption. This result, if confirmed, would again point to the close correlation between optical data and photoresponse. They identify resistance peaks with the formation of the intermediate Cs-Sb compounds and state that “excess” (or better, “free”) Cs in the form of a surface coverage eventually determines the quality of the PC. The problems in measuring actual amounts of alkalis in the PC thin film, pointed out in the S-1 case, of course exist here also.

The spectral response of the Cs₃Sb and the S-11 is nowadays of interest only in the blue region, and major improvements have not been achieved, although recently the combination (Cs-Rb)₃Sb seems to show promise (cf. Table I). Superficial oxidation of the final cathode, for red extension, is no longer much used because of the increased thermionic emission. Artem’ev (V-17, V-18) reports two-electron emission from Cs₃Sb when illuminated with strictly monochromatic narrow light pulses. The contribution of these pulses increases with the QE of the cathode and with the degree of coherence of the monochromatic source. He explains the effect as due to correlation of photons. This may significantly affect the response of PCs to short laser pulses and its linearity, but apparently there have been no further investigations. The effects of the MnO substrate have been studied by McCarroll and others (see Section III). It is shown that the MnO does not act as an oxygen source for the Cs₃Sb (see, e.g., ref. V-1, p. 74).
It appears not unlikely that the better performance of the S-11 compared to the Cs,Sb may be simply due to an optical buffer effect between the lowrefractive-index glass and the high-refractive-index PC (cf. Sections III,B and 111,E). Cs-Rb-Sb. Formation and properties of this cathode have been described by Jedlicka and Vlim (V-19,also V-20).Their method, consisting of a layering of Sb + activation with Cs + Sb + activation with Rb + final superficial oxidation, and his physical data, specifically a stated thickness d = 245 nm, hardly seem representative of techniques used now. We have no detailed description of more modern technology here, only the releases and manuals of MPT manufacturers (EMI, DuMont) claiming “improved S-11” or “super S-1 1” performance. Again the applications interest here is only directed to the blue region and the “hump” existing in the region of
112
H. TIMAN
0.55-0.6 µm (see, e.g., ref. V-19), resulting in nominally higher luminous sensitivities, is of little practical impact. There is good support for a composition very similar to Cs₃Sb in a mixture (Rb₃₋ₓCsₓ)-Sb, where x can vary considerably.

K-Cs-Sb. This is undoubtedly now the most important blue-sensitive cathode; peak QE values as high as 30-33% are claimed by MPT manufacturers. With a glass incidence absorption of about 60% at 3.0 eV (for detailed data see Section III,B; V-21) this would already amount to the usually assumed theoretical “limit” of 50% for the yield per absorbed photon; indeed, even higher QE/A values are found in the case of vacuum incidence. Because of the high resistivity of the cathode (see Section III,G) all sensitivity figures can only be measured and achieved at very low light levels. Understandably this cathode is mostly used in low-activity scintillation counting and similar applications. Sommer still suggests an increase in QE and threshold through final oxidation, but the effects on thermionic emission and resistivity are considered undesirable nowadays. Little has been published about formation technology. Ghosh and Varma (V-22) suggest formation of a K₃Sb cathode to peak by K-Sb coevaporation at 180°C (sensitivity 25 µA/lm, peak QE 10%) followed by Cs-Sb coevaporation to peak without any oxidation. They contrast their method (well known in principle for nearly a decade in S-20 preparation) with the conventional one, wherein an Sb film of given thickness (or light transmission) is initially deposited; they claim superior performance by achieving 100-120 µA/lm. Unfortunately, they do not provide data about peak QE, resistivity, or load performance of their cathodes. At present we do not have any published information as to which methods are being used by major MPT manufacturers. Old work by McCarroll (V-23) suggests the composition K₂Cs-Sb from chemical and crystallographic analysis. Sommer (V-1, p. 126) hypothesizes that the low thermionic emission and high resistivity of the K-Cs-Sb film are due to a low concentration of defect levels and to rather close adherence to the stoichiometric ratio 3:1 for alkali:antimony. Despite the practical importance of this cathode, we did not find any further studies to prove or disprove these assumptions. Lately some researchers (see S-20) have suggested that K₂Cs-Sb is a major constituent in the S-20; for a discussion see below under S-20 and Holtom and Hopkins (V-24).
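The arithmetic behind the 50% “limit” quoted for K-Cs-Sb can be made explicit with a minimal sketch. It uses only the figures given in the text (peak QE of 30-33%, about 60% absorption at 3.0 eV); the function names are illustrative, and the conversion to radiant sensitivity relies on the standard relation S [A/W] = QE / (photon energy in eV), which is an addition and not a figure from the text.

```python
# Yield per absorbed photon (QE/A) and radiant sensitivity for a photocathode.
# Input numbers are the K-Cs-Sb figures quoted in the text; names are illustrative.

def yield_per_absorbed_photon(qe, absorption):
    """Electrons emitted per absorbed (rather than incident) photon."""
    if not 0.0 < absorption <= 1.0:
        raise ValueError("absorption must lie in (0, 1]")
    return qe / absorption


def radiant_sensitivity_ma_per_w(qe, photon_energy_ev):
    """Radiant sensitivity in mA/W, from S [A/W] = QE / E_photon [eV]."""
    if photon_energy_ev <= 0.0:
        raise ValueError("photon energy must be positive")
    return 1000.0 * qe / photon_energy_ev


if __name__ == "__main__":
    for qe in (0.30, 0.33):
        ratio = yield_per_absorbed_photon(qe, absorption=0.60)
        s = radiant_sensitivity_ma_per_w(qe, photon_energy_ev=3.0)
        print(f"QE = {qe:.2f}: QE/A = {ratio:.2f}, S = {s:.0f} mA/W")
```

With the lower claimed QE the yield per absorbed photon already sits at the 50% figure; the upper claim exceeds it.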
Na₂K-Sb, S-20, and trialkali compounds. Na₂K-Sb, undisputedly the basis for the most successful trialkali combination (Na-K-Cs-Sb or S-20) but also a cathode in its own right because of its good high-temperature performance, is included here. We can quickly dispose of other combinations. The Sb-K-Rb-Cs cathode was studied by Dvorak (V-25). Results show only mediocre luminous sensitivity with no remarkable properties. Indeed, in
numerous experiments it has been quite firmly established that the introduction of Rb for any of the partners in the S-20 leads to inferior performance. As in many other questions concerning this most complicated PC, we have no theoretical explanation of this fact.

The Na-K-Sb system, also called the S-24, is considered the foundation of the S-20 by all researchers and constitutes the bulk of the S-20 thin film. It is in the effect and compositional nature of the third component, Cs, that we find divergent and diverse opinions. Interestingly enough, it has so far not been a subject of curiosity why Na has to be present in the trialkali system in order to give the red sensitivity of the S-20. This is certainly surprising if we consider that Na itself forms only poor photocathodes and that it is also very ineffective in any other combination but Na-K. A possible explanation could be the growth of Na₂K-Sb in the cubic crystal phase (DO₃), like all other efficient alkali antimonides, while Na₃Sb and probably all the other mixtures grow only hexagonally. In discussion of the S-24 in the literature we find only one mention of the danger of Cs “contamination,” which can drastically affect results (V-26). Some of the data directly pertaining to the S-24 must therefore be taken with a grain of salt, as most investigations have been done in connection with S-20 research. We distinguish two types of formation:

(1) Initial evaporation of an Sb base layer of variable thickness, followed by activation.
(2) Initial formation of a K-Sb cathode by “coevaporation” of Sb in K vapor, followed by activation similar to the one used in (1).

While method (1) is still used extensively in the production of MPT, it appears that the more modern form (2) now has more adherents and may give generally better sensitivities. However, many older papers and findings are still based on type (1) formation. A typical (1) formation for the S-24 is described, for example, by Dowman et al. (V-26):

(i) Sb to 85% “white-light” transmission.
(ii) K added at 140-160°C to form K₃Sb to peak sensitivity.
(iii) Addition of Na at 180°C; sensitivity first increases, then drops to low values. This coincides with the electron diffraction profile, indicating excess of Na and formation of Na₃Sb.
(iv) Cooling to 140-160°C and addition of small amounts of Sb and K (up to twenty times!) with large increases in photosensitivity.

Dowman finds that attainment of maximum photosensitivity always corresponds to the cubic phase Na₂K-Sb (d = 7.74 ± 0.01 Å). He notes that the increase in photosensitivity during formation is directly related to the disappearance of diffraction peaks attributed to NaK₂-Sb and Na₃Sb. Other practitioners find higher temperatures necessary for optimum Na interaction or use a much heavier
initial base (see, e.g., refs. V-27, V-28), but the given schedule is basically representative of (1). A description of method (2) can be found in the Ghosh-Varma report (V-22). They report the coevaporation method as a new technique although it had already been used extensively since about 1972, for example, in image converter production. Ghosh and Varma also claim values of 100-120 µA/lm for their S-24. With these and similar claims we have to suspect the presence of minute amounts of Cs from outgassing Cs channels or similar sources. Jedlicka et al. (V-20, V-29) report more modest sensitivities of 40-60 µA/lm. Unfortunately their data were measured on very heavy cathodes. Structural studies of cathodes formed by (2) are not known, but we have no reason to assume formation of a different phase. The S-24 itself is of practical interest because of its high-temperature behavior (see Section I), but its performance in temperature cycling will be adversely affected even by trace amounts of Cs or Rb in the cathode.

While we have generally good agreement about the composition of the S-24, we cannot say the same about the S-20. The initial formation steps are those of the S-24 as given above and we again distinguish the two methods (1) and (2). As an example of the uncertainty, there is already considerable disagreement about the effect of Cs if present or released during the early formation steps. Dowman (V-26) and Feldner (V-30) emphasize that the Na-K-Sb cathode must be optimized before addition of Cs, while Sommer (V-1, p. 115) does not find any optimized alkali sequence. The author himself observes that Cs prewetting is used in S-20 production with good results. Generally, however, the opinion prevails that the Na-K-Sb phase (with or without traces of Cs) should indeed be optimized before the final (?) Cs-Sb cycling. Even such an unambiguous property as physical appearance seems ill defined.
While the author and most other practitioners find the color of good cathodes to be light to dark yellow (or golden?) in reflected and transmitted light (for thicknesses of 300-450 Å), Holtom and Hopkins (V-24) report “deep blue” for good sensitivity layers of 300-1000 Å thickness. Unfortunately this study does not clarify how the cathodes were viewed. S-20 cathodes indeed appear deep blue when viewed on an opaque metallic substrate. Similar wide variations are reported for determination of the film thickness d (cf. Section III). One older report is still of great interest. Theodorou (V-31) shows the spectral development of photoemission. The appearance of red sensitivity (beyond 0.65 µm) is almost entirely due to the final Cs-Sb cycling, while blue and green sensitivity remain approximately on the Na-K-Sb level (Fig. 17). This observation alone would be hardly reconcilable with the formation of a mostly blue-sensitive compound such as K₂Cs-Sb on the surface of the Na₂K-Sb film, as suggested by Dowman et al. (V-26) and Varma and Ghosh
FIG. 17. Development of spectral response during S-20 processing (horizontal axis: wavelength). [From Theodorou (II-12).]
(V-28), but more so with a Cs-Sb surface effect as suggested by Holtom and Hopkins (V-24). With both assumptions the “red extension” still remains rather inexplicable, unless one assumes drastic changes of red absorption. We know from formation practice that “cycling” of small amounts of K and later Cs with Sb is necessary for buildup of the red sensitivity, and we should assume that red absorption is also developed in the same process. Unfortunately we lack any serious study of the growth of spectral absorption during formation (although it would seem that such experiments would be easy to perform) and a meaningful evaluation of spectral response versus film thickness. Garfield (V-32) tries to relate weight, transmission, and photosensitivity; however, white-light sensitivity hardly gives relevant information. In two recent reports the author connects optical constants and data with spectral emission. Timan (V-33) shows that absorption remains fairly unaffected by changes in film thickness (with R and T changing correspondingly) and concludes that this fact accounts for the variances in physical data observed between equally good S-20 cathodes and vice versa. In Timan (V-34) the light or photon distribution in the PC thin film, determined by the optical constants, is evaluated, and it is shown that the red emission comes mostly from surface regions, with green and blue emission distributed through the entire thickness. This would support the view of
differentiation in the S-20 as postulated by many authors and by practical formation experiences. We can conclude that the refractive index and obviously the absorption coefficient are determining factors of the spectral response, as already suggested by Holtom and Hopkins (V-24). At present we need further studies of this aspect, especially of the “thickening” of the S-20 into the S-25. It would appear that monitoring of spectral reflectivity and transmission could supply additional information.

The literature contains numerous prescriptions for making S-20 cathodes, usually agreeing in principle only. For method (1) (compare S-24) we cite Garfield (V-32), Varma and Ghosh (V-28), and Ninomiya et al. (V-34), with wide variations as already mentioned for the S-24. For method (2) we reference Ghosh and Varma (V-22) and Bhatia et al. (V-36). These two sources attach special import to the sensitivity of the first K-Sb layer and find a direct relationship between final result and performance at this point. Luminous sensitivities as high as 400 µA/lm are reported, for example, by Dowman (V-26), although it is impossible to reconcile his figure with his description of a “very weak red response.” Rather undocumented reports from Philips Laboratories claim point response up to 700-800 µA/lm. The higher luminous sensitivity values for the S-25 are due to the shifting of peak response to longer wavelengths and a corresponding widening and lowering of the response curves. For these curves we can only cite MPT manufacturers’ manuals (see Table I). It is our opinion that these effects are almost entirely due to changes in the absorption and photon distribution because of changes in the optical constants of the thin film. There is little doubt that computations similar to the ones cited above (V-34) will then result in the shifted response curves. Unfortunately we do not have optical data for the S-25 available. In this context we mention the study of Dolizy et al. (V-37), who predict a sensitivity of 700 µA/lm for a PC thickness of 120 nm. It is difficult to see how such a prediction could be made without detailed knowledge of the change of optical constants with PC thickness.

The composition of the S-20 has been the subject of several papers. McCarroll (V-38) discusses the effects and concentration of Cs in the S-20, arriving at a lattice widening of 0.2% of the Na₂K-Sb lattice through Cs and a Cs content of 1%. He rejects the presence of Cs as a surface layer only, because of the higher conductivity of the S-20 and the possibility of adding Cs before the completion of the Na-K-Sb phase. Neither argument seems convincing, as there is obviously a strong mobility of the alkalis in the PC (note all the displacements and transitions taking place!), and as discussed earlier conductivity measurements could indeed be surface rather than bulk. Through later work it is, however, quite well established that the Cs content is indeed much more than a monolayer. Kansky (V-39) claims that the amount of Cs increases toward the vacuum interface and that correspondingly the lattice
constant d = 7.74 Å of Na₂K-Sb changes to the d = 9.15 Å of Cs₃Sb. Much more recent Auger surface studies by Holtom and Hopkins (V-24) seem to support the view of a Cs-richer surface layer and the hypothesis of a Cs-Sb dipole layer. Their measurements indicate that Cs is concentrated into a thin surface film (5-10 Å) with increased Sb concentration directly below. Because of their detectability limitation Cs may also be present in smaller concentrations throughout the bulk. They find higher Cs concentrations for the S-25 and speculate that there is a “free dipole matrix” (meaning Cs not tightly bound to surface states) which accounts for the large variations of spectral response and for the poorer stability of the S-25. In his interesting work Hoehne (V-40) unfortunately studies very heavy layers (1000-1350 Å) not representative of the usual technology, but he reports quite good sensitivities. We show some of his data in Table IV. Surprisingly, he finds ratios of Na:K much smaller than 2:1; Cs content is between 3 and 7% and is higher for layers lower in Na. Dowman et al. (V-26) and Ghosh and Varma (V-22) suggest a surface layer of K₂Cs-Sb on top of the Na₂K-Sb phase. Dowman’s data, however, indicate strong diffractions for a K-Cs-Sb phase only for mediocre S-20. It is hard to reconcile this assumption with the distinctly higher conductivity and red response of the S-20 compared with Na₂K-Sb and K₂Cs-Sb PCs; this criticism has also been voiced by later sources (see, e.g., ref. V-24). Critically, it should be noted that a different composition should also exist at the glass interface; see also Sommer (V-41). Garfield’s interesting report (V-32) gives weight relations for the different stages of S-20 formation (Fig. 18). He arrives at the conclusion that the Na-K-Sb system is closer to NaK₂-Sb than to Na₂K-Sb, but points out possible error due to readsorption of Na from walls, etc. His final Cs-Sb cycling accounts for about 10% of the weight of the finished cathode.
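The statements about Hoehne's compositions can be checked directly against the Table IV entries. The following is a small sketch, not part of the cited work: the (x, y, z) triples are the tabulated values, reading x, y, z as Na, K, Cs follows the table title, and interpreting “Cs content” as the Cs share of the alkali atoms (rather than of all atoms) is an assumption that happens to reproduce the quoted 3-7% range.

```python
# Consistency check on the Table IV compositions Na_x-K_y-Cs_z-Sb.
# Each tuple is (x, y, z) as tabulated; x, y, z are read as Na, K, Cs.
COMPOSITIONS = [
    (1.63, 1.28, 0.09),
    (1.71, 1.19, 0.10),
    (1.43, 1.44, 0.13),
    (1.34, 1.45, 0.21),
    (1.71, 1.08, 0.21),
]


def alkali_sum(x, y, z):
    """Total alkali atoms per Sb atom; the table assumes this equals 3.0."""
    return x + y + z


def cs_percent_of_alkali(x, y, z):
    """Cs share of the alkali atoms in percent (an assumed reading of 'Cs content')."""
    return 100.0 * z / (x + y + z)


def na_to_k_ratio(x, y, z):
    """Na:K ratio; the ideal Na2K-Sb phase would give 2.0."""
    return x / y


if __name__ == "__main__":
    for x, y, z in COMPOSITIONS:
        print(
            f"x+y+z = {alkali_sum(x, y, z):.2f}, "
            f"Cs = {cs_percent_of_alkali(x, y, z):.1f}% of alkali, "
            f"Na:K = {na_to_k_ratio(x, y, z):.2f}"
        )
```

Every row sums to 3.0 and gives an Na:K ratio well below 2, in line with the remark in the text.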
TABLE IV
Na_x-K_y-Cs_z-Sb TRIALKALI; x + y + z = 3.0 ASSUMED

Tube number   Sensitivity (µA/lm)   x      y      z      E_g(a) (eV)   E_0(b) (eV)
286           205                   1.63   1.28   0.09   1.32          1.36
776           195                   1.71   1.19   0.10   1.35          1.36
718           330                   1.43   1.44   0.13   1.38          1.395
298           200                   1.34   1.45   0.21   1.38          1.41
304           165                   1.71   1.08   0.21   1.30          1.335

(a) E_g interpolated from steep rise of absorptivity.
(b) E_0 equivalent to sensitivity of 0.5 mA/W.

FIG. 18. Variation in weight, photoemission, and light absorption for S-20 during processing (horizontal axis: time in minutes; vertical axis: weight). [From Garfield (II-18).]

Through
observation of the resistance behavior during formation, Garfield concludes that changes from n-type to p-type conductivity take place during K-Sb cycling after completion of the Na-K-Sb phase. The final step of Cs-Sb cycling does not change the nature of the conductivity but increases its value. In contrast to other sources Garfield also finds increases in conductivity and sensitivity upon oxygen exposure, but only if his cathodes contain excess Cs.

In conclusion, we can say that our knowledge of the composition of the S-20 is still very much in flux. Apparently there is a wide range of formation schedules which can be optimized to similar “peak” sensitivity and response patterns. This in turn points of course to considerable differences in physical composition. It would seem that a major cause of “bad” cathodes is excess Na in the cathode which cannot be replaced by other alkalis. The general consensus is that we have indeed a bulk Na₂K-Sb film with a different (most likely Cs-richer) composition at the interfaces. The nature of these border regions is still quite uncertain and is almost certainly very dependent on formation technology. It is fortunate that the rather limited understanding of the S-20 formation has not prevented us from obtaining excellent practical results. Especially the
reasonably good sensitivities achievable out to 0.85 µm have increased the range of applications for the S-20. A more thorough understanding can only be expected if our present rather piecemeal approach is changed to one which combines better controls with a simultaneous study of all physical aspects. In view of the considerable experimental difficulties (and apparatus) involved this is a formidable task, as in the case of the S-1, and we do not foresee an early successful conclusion, especially in view of the rather indifferent attitude of the manufacturers of related devices.

Finally, we present a brief review of more recent work on thin-film UV cathodes. Little has been reported and Sommer’s articles (V-1, V-41) still remain basic information. The UV response of the S-20 has been discussed by Shaw et al. (V-42), who find enhancement with a Suprasil prism possible only to 0.22 µm, and by Sobieski (V-43), who reports an astonishing number of response peaks and valleys on MgF₂ substrates. Sobieski also reports QE of the K-Cs-Sb surface. Bartlett (V-44) suggests the use of appropriate filters to make the K-Cs-Sb surface “solar blind” and make use of the better QE compared with the standard UV cathodes. Charman (V-45) shows that excess Cs impairs the solar blindness of Cs-Te cathodes. The most significant study, about Rb-Te, comes from Denisov and Klimin (V-46). They analyze electron emission velocities and conclude that photoemission in the region 2.0-3.6 eV originates from the conduction band (excess Rb acting as a donor), while higher-energy response comes from the valence band. They determine E_a = 0.4 ± 0.1 eV and E_g = 3.2 ± 0.2 eV. Lapson and Timothy (V-47) report on the properties and applicability of UV window materials.
VI. THEORETICAL ATTEMPTS AND MODELS
By now the reader has realized the rather uncertain state of our knowledge. Most researchers have had to limit themselves to specific theoretical aspects only, and often far-reaching conclusions have been drawn from isolated facts or observations. Because of the nature of the photoemissive film only noninvasive experimental techniques in excellent vacuum should be used, and the temperature range of measurements is also severely limited because of the strong possibility of irreversible structural and compositional changes, conditions which were not always realized in older work. Nonetheless, we often find that numerical values are quoted from older sources, especially the classical work by Spicer (VI-1) from 1958, although
formation technology has greatly changed and advanced since then. As we have seen from the uncertainties of composition and structure, we are of course unable to support any serious computational model of energy band structures. The mathematically very simplified and simplistic approach to emission models is based on the introduction of an “electron affinity” term E_a. It should be understood that this term is really purely empirical and is introduced to explain discrepancies between photoelectric threshold E_0 and band gap E_g in a flat band structure. The experimental techniques used in the more recent literature for theoretical derivations on thin-film PCs are as follows:

(1) Optical measurements to arrive at the absorption edge E_og and determine differences between “fundamental” and impurity absorption; also to compute optical constants under the assumption of an isotropic, homogeneous layer (certainly only true in the idealized case).
(2) Photoconductivity (pc) measurements to determine the band gap E_g from the rapid rise of the pc curve at this energy value. This measurement is vulnerable to misinterpretation, as already pointed out in previous chapters.
(3) Measurements of thermionic emission I_th as a function f(T) to derive the thermionic work function, which is determined by the location of the Fermi level E_f, which in turn depends on impurity and defect levels in the band gap.
(4) Determination of the temperature dependence of resistivity. The activation energy of these curves can generally be evaluated to give E_g for near intrinsic conditions (at higher temperatures) or to determine impurity or other levels at lower temperatures. Similar problems as in (2) arise when the usual technique of measuring conductivity between metal strips on the cathode is used; see also Section III,G. A less ambiguous method would be to measure conductivity or a derivative thereof from PC response versus light intensity as f(T), but no such studies have been made.
(5) Measurement of the energy distribution of photoelectrons E_pe. The threshold E_0 should be equivalent to E(hν) - E_pe(max); in principle the curves could also be examined for escape depth, energy losses, etc., especially if taken from the two different sides of the PC film.
(6) Measurement of the Hall effect to determine the nature and density of the carriers; only one report is known (see Section III,F).
(7) Photoelectric measurements to find QE and threshold E_0 directly. Unfortunately, and very importantly, E_0 (or λ_0) is difficult to determine exactly and of course varies from surface to surface. The often applied Kane formula QE ~ (E_λ - E_0)^m, with the exponent m determined from the QE curve(!), is hardly more accurate than direct measurement. Several authors have simply
substituted a cutoff value of response for the actual threshold (see, e.g., ref. VI-2). Of course nonuniformities are most pronounced in the threshold region and must be guarded against.
(8) Spectral response and yield for substrate incidence versus vacuum incidence as a function of PC thickness. This would again appear a good noninvasive method to estimate the location of emission centers.
(9) Recently mathematical methods have been applied to determine the photon distribution and then derive spectral response and escape depth from the threshold E_0 and an empirical probability of escape (VI-3).

Comprehensive studies combining several of these techniques (on the same surface or sample of surfaces) are still largely unavailable. Optical measurements are evidently among the least ambiguous and least invasive. With most researchers in agreement, we know that PhEm (photoemission), except in the immediate vicinity of the threshold region, where the small value of the QE may allow the assumption of a surface or defect effect, is due to absorption in the “bulk” of the PC thin film and is determined by the properties of the bulk “compound.” Sommer and Spicer (VI-4) view PhEm as a multistep process:
(1) Absorption of the photon and excitation of an electron.
(2) Movement of the electron to the vacuum interface, with loss of energy. It is generally assumed that the prevailing loss mechanism for high-energy excitation (several times band-gap energy) is pair production with large energy loss per collision, while for exciting energies below the pair production threshold, lattice collision (phonon generation) with small loss per collision and correspondingly increased escape depth (VI-4 to VI-6) is dominant.
(3) Escape through the vacuum barrier.

If E_0 is the “smallest” energy still causing PhEm then we set E_0 = E_g + E_a; if E_g can be determined independently we find E_a as a function of the threshold. It has to be understood that in this way we can only find the “effective” E_a and cannot draw any conclusions about actual band structure and shape. Sommer and Spicer have shown, on the example of Cs₃Sb and Cs₃Bi, that a smaller E_a : E_g ratio signifies better yield for similar band structures. Spicer (VI-1) in his 1958 paper distinguishes artificially between photoelectrically “effective” and ineffective parts of the absorption and derives response approximations by simply assuming an exponential law for the intensity I(z) within the bulk PC:

    I(z) = I(0) e^{-kz}
where k is an absorption-related constant. However, as shown in Timan (VI-3), such an assumption is obviously not valid because the thin-film intensity distributions, as derived from optical constants, are very different indeed (for curves, see ref. VI-3 and Figs. 20 and 21). For the alkali antimonides some researchers similarly hold that we have (at least) two types of absorption process, one due to “fundamental” or direct transitions across the band gap and another due to states closer to the conduction band. For Varma and Ghosh’s and Hoehne’s studies see Section III,B. Experimentally we can interpolate the optical band gap or absorption edge E_og from two sets of data: vanishing absorption, A = 1 - R - T, and the vanishing difference between the two measured reflectances. In (VI-7) a comparison of these two sets of data finds excellent agreement. In all cases we find that optical absorption extends much further than E_0. If bulk absorption is predominant, E_og will generally be close to E_g. Table III lists values mentioned in the more recent literature. Where such data were not available older data are given. It is of interest to compare the values of Timan (VI-7), which were derived from optical and photoelectric measurements only, with the values determined from photoconductive measurements by Spicer (VI-1) and Ghosh and Varma (VI-8). We find reasonable agreement except for the S-11 versus Cs₃Sb. The data given in Timan (VI-7) apparently are the only ones measured for the more modern S-11 and are not at all close to the older Cs₃Sb data (compare, e.g., VI-9). It is difficult to understand why the Mn-O substrate should change the band gap, but the values of Timan (VI-7) fit well between the K-Cs-Sb and S-20, in rather good agreement with the PhEm properties. Indeed, Spicer expressed “uncertainty” about the band gap of Cs₃Sb in Spicer (VI-10). With all of the older values we have little information about cathode thickness and efficiency and about the accuracy of optical measurements. Hardly any of the older sources determine the absorption correctly from A = 1 - R - T, where R can have considerable magnitude. Indeed, many of these data were not taken on semitransparent PCs, which are now the ones of practical and theoretical importance. These error sources have already been mentioned by Goerlich in his 1959 review (VI-9). Technology has also resulted in higher luminous sensitivities for the S-11, which should be expressed in its physical data. Similar uncertainties apply to data given by Jedlicka for the Cs-Rb-Sb. Here we know that PC thickness was much greater than used in modern technology. New releases on this “bialkali” by EMI and DuMont claim performance superior to K-Cs-Sb or Cs₃Sb (see Table I). Optical measurements or band-gap values for this new development are not known. If we disregard these discrepancies, we find indeed that the band gap E_g or E_og is
a fundamental property of the alkali antimonides, with a range of only 1.0-1.2 eV, while the electron affinity and with it the threshold (and the red sensitivity) may vary considerably. For instance, the very low E_0 (or good red sensitivity) reported by Ghosh and Varma (VI-8) still does not reflect any change in the band gap. Optical data for the S-1 and S-10 are difficult to interpret. One example in Timan (VI-7) only allows determination of E_g = 0.5-0.6 eV, E_0 = 1.05 eV, E_a = 0.45-0.55 eV, these values having only schematic significance. Heiman’s extensive work of 1973 (VI-11) does not attempt any interpretation of optical data. For the S-10 we find in Timan (VI-7) E_g = 0.8-0.85 eV, E_0 = 1.50-1.55 eV, with E_a = 0.65-0.75 eV, in reasonable agreement with the only other investigation of this PC, by Sommer and Spicer (VI-12). The lower values of E_g as compared with the alkali antimonides explain quite readily the higher thermionic emission of these two surfaces. Recent pc measurements for the S-10 and the S-1 have not been reported. Only a few work-function or Richardson plots have been reported recently. For two surfaces reported by Timan (VI-7, VI-13) we can construct a more complete band model (cf. also Fig. 24): T-1 (S-20): sensitivity 170 µA/lm, d ≈ 30 nm, E_g = 1.05 eV, E_a = 0.45 eV, φ = 1.44 eV, E_f = 0.06 eV above the valence band; B-3 (K-Cs): sensitivity 48 µA/lm, d ≈ 38 nm, E_g = 1.20 eV, E_a = 0.80 eV, φ = 1.56 eV, E_f = 0.44 eV above the valence band. Holtom and Hopkins (VI-14) report an interesting study of φ versus E_0; their results are presented in Table V. As there is no obvious correlation between E_a and the position of E_f with the sensitivity, we must assume the influence of rather ill-defined band bending or considerable changes in composition which may influence the band gap. This study is a good example of the difficulties encountered in drawing theoretical conclusions from a larger sample without complete physical data.
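The flat-band bookkeeping behind the T-1 and B-3 figures can be written out as a short sketch. The relation E_0 = E_g + E_a is the one stated in the text; the relation E_f = E_0 - φ (Fermi level measured from the valence-band edge, with the vacuum level lying E_0 above that edge) is not spelled out in the text and is used here as an assumption that reproduces the quoted numbers. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlatBandModel:
    """Flat-band photocathode model; all energies in eV."""

    e_gap: float          # band gap E_g
    e_affinity: float     # empirical electron affinity E_a
    work_function: float  # thermionic work function phi

    @property
    def threshold(self) -> float:
        """Photoelectric threshold E_0 = E_g + E_a (relation given in the text)."""
        return self.e_gap + self.e_affinity

    @property
    def fermi_above_valence(self) -> float:
        """Fermi level above the valence-band edge, assuming E_f = E_0 - phi."""
        return self.threshold - self.work_function


# Values quoted in the text for the two surfaces.
T1_S20 = FlatBandModel(e_gap=1.05, e_affinity=0.45, work_function=1.44)
B3_KCS = FlatBandModel(e_gap=1.20, e_affinity=0.80, work_function=1.56)

if __name__ == "__main__":
    for name, model in (("T-1 (S-20)", T1_S20), ("B-3 (K-Cs)", B3_KCS)):
        print(
            f"{name}: E_0 = {model.threshold:.2f} eV, "
            f"E_f = {model.fermi_above_valence:.2f} eV above the valence band"
        )
```

Under these assumptions the sketch returns the E_f positions quoted for both surfaces, which suggests the published values were obtained the same way.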
TABLE V
BAND MODEL DATA WITH ASSUMPTION OF CONSTANT BAND GAP

Sensitivity (µA/lm)   φ (eV)   E_0 (eV)   E_a (eV; E_g = 1.0)   E_f (eV; position above valence band)
280                   1.0      1.23       0.23                  0.23
195                   1.15     1.31       0.31                  0.16
190                   1.03     1.30       0.27                  0.27
138                   1.20     1.50       0.50                  0.30
130                   1.30     1.43       0.43                  0.13
125                   0.90     1.40       0.40                  0.50
123                   1.30     1.46       0.46                  0.16
For the S-1 Timan reports numerous values of φ for different compositions, and their changes upon surface treatment with Bi. Little can be deduced from these values for a theoretical model except the extreme dependence of IR tail response on surface conditions and “doping” (VI-15) (see also Table II). Some conductivity versus temperature band-gap values from the “intrinsic” range of the semiconductor PC are given in Table III. They usually differ somewhat from the optical or pc data. Ghosh and Varma (VI-8) report identical activation energies of 0.1, 0.21, and 0.35 eV for the S-24 and the S-20. They draw the far-reaching conclusion that Cs does not affect the band gap or even defect levels. Similarly, Hofmann and Deutscher (VI-16) determine acceptor levels of 0.28-0.33 eV on five samples, with the activation energies apparently increasing with cathode thickness (ranging from 250 to 1800 Å). A similar tendency of course then applies to the E_f values determined, which are 0.14-0.17 eV above the valence band.

Mostly only older reports are available about the photoelectron energy pe and its distribution function. Spicer’s 1961 report (VI-5), based on earlier findings by Apker and Taft, studies pe for high-energy photons (4.0-5.7 eV). From the pronounced peak of pe at 0.5 eV (slow electrons) for hν > 4.0 eV he deduces thresholds E_l for pair production: for the S-20 E_l = 3.0 eV, for Cs₃Sb E_l = 2.0 eV. He explains the difference by the degree of crystal order, with Cs₃Sb partially disordered (?). It is difficult to evaluate his data for mean free path or escape probability, derived from general considerations, in view of the total lack of absorption or thickness data for the surfaces investigated.

S-20. Ezard (VI-17) mentions that for hν < 2.5 eV most pe < 0.2 eV, concluding that there is no direct relationship to the photon energy. In contrast, Ghosh and Varma (VI-18) report a most probable pe = 1.0 eV at 0.4 µm and pe = 0.1 eV at 0.8 µm. Here we obviously lack a more detailed analysis of the semitransparent S-20 and also of the S-25.

Rb-Cs-Sb. Jedlicka and Vlim (VI-19) report a mean pe of 0.4 eV with a very narrow distribution for hν between 2 and 3 eV, and the appearance of higher pe up to a maximum of 1.8 eV and a much wider distribution for hν > 3.0 eV. They find E_0 ≈ 1.65 eV from their pe curves. Their curves are in sharp contrast to the very broad maxima displayed for all photon energies > 2.0 eV in the curves by Kanev et al. (VI-20). These differences may be due to the different preparation of the PC, especially their different thicknesses.
Cs₃Sb. Needham (VI-21) cites a mean pe = 0.4 eV with σ = 0.2 eV for visible light photons; for high-energy photons we expect a pe peak around
1.0 eV because of the onset of pair production, according to Spicer (VI-5). Taft and Philipp (VI-22) deduce a lower limit of E_a = 0.4 eV from distribution peak separation in the pe curves.

K-Cs-Sb. Only one report, by Kumar and Thumwood (VI-23), mentions a mean pe = 0.6 eV with σ = 0.3 eV for photon energies between 2 and 3 eV. From this fact the authors speculate about a lower surface barrier compared with Cs₃Sb, which hardly seems in agreement with the much higher work function of this PC.

S-1, S-10. We could not find any more recent reports about pe. Image tube resolution on infrared converters indicates a low emission energy and a rather large σ for IR photons. For an old source we mention Soboleva (VI-24).

Escape depth, probability of escape, and location of emission centers are closely related. Older reports have determined escape depths mostly from experiments on “wedge-type” variable-thickness PC thin films where response for substrate versus vacuum incidence was compared. In this manner, for example, Burton (VI-25) estimates escape depths l of ~250 Å for Cs₃Sb. It is not possible to rely on these older sources nowadays, as we know that PC thickness may strongly influence optical constants and absorption characteristics. Love and Sizelove (VI-26) devise a mathematical model of escape probability P under the assumption of complete energy transfer from photon to electron and from the transmission probability of such particles across a square-wave surface potential barrier. They arrive at similar probability curves with shifted thresholds for the S-11 and S-20, suggesting a similar escape mechanism for the alkali antimonides. This is in agreement with more recent sources. Spicer generally prefers (VI-10) to split the absorption α into a photoelectrically effective part α_p and an ineffective remainder, assigning P = 1.0 to the former and P = 0 to the latter, where α_p is determined from the spectral response pattern. It is obvious that this approach cannot result in any theoretical insight.
An interesting and sensible report comes from Hofmann and Deutscher (VI-27). They first determine the optical constants n and k for the S-20, and find that they are independent of PC thickness, to a first approximation, only in the region 0.5-0.8 μm. Their values here are in rather good agreement with the data tabulated in a later report (VI-7); compare also Section II. Hofmann then continues by equating the photoresponse with

$$\int_{z=0}^{z=d} E_\lambda^{2}(z)\,P(z)\,dz$$

where E_λ is the electric vector of the incident light wave and P(z) the probability of escape, or, in Hofmann’s terminology, the transport function.
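An integral of this form is easy to evaluate numerically once shapes are chosen for the intensity and for P(z). The sketch below is illustrative only: it assumes a simple exponential (Beer-Lambert) intensity without interference and an exponential escape probability exp(-z/τ), neither of which is Hofmann's actual choice, and all numerical values are hypothetical.

```python
import math


def photoresponse(alpha, tau, d, substrate_side, steps=2000):
    """Trapezoidal estimate of the integral of I(z) * P(z) over 0 <= z <= d.

    z is the distance from the vacuum interface (same length unit throughout).
    alpha : assumed intensity attenuation coefficient (1/length)
    tau   : assumed escape depth (length)
    substrate_side : True if the light enters through the substrate (z = d)
    """
    h = d / steps
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        path = (d - z) if substrate_side else z   # depth travelled by the light
        intensity = math.exp(-alpha * path)       # assumed: no reflections, no interference
        escape = math.exp(-z / tau)               # assumed escape probability
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * intensity * escape
    return total * h


if __name__ == "__main__":
    # Hypothetical film: 400 A thick, 250 A absorption length, 300 A escape depth.
    vac = photoresponse(1 / 250, 300.0, 400.0, substrate_side=False)
    sub = photoresponse(1 / 250, 300.0, 400.0, substrate_side=True)
    print(f"substrate/vacuum response ratio: {sub / vac:.3f}")
```

Under these assumptions the ratio of the two responses depends on the escape depth, which is the principle behind extracting an escape depth from measurements at both incidences.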
[Figure 19: energy-band sketch showing the vacuum level, conduction band, and band gap, with curves labeled n(E); the energies marked are E_g = 1 eV and E_g + E_a = 1.55 eV.]
FIG. 19. Illustration of the photoelectrically effective part of the absorption. [From Hofmann and Deutscher (III-29).]
Hofmann selects P(z) = Φ(z/2τ^{1/2}), the error function, independent of photon energy, with τ as the escape depth of photoelectrons. From the ratio of the photoresponse for substrate incidence to that for vacuum incidence he arrives at a constant τ = 300 Å over the range 0.5-0.7 μm for his sample of four different PC thicknesses. In a very similar way the same authors determine the escape depth for Cs₃Sb as only 150 Å (VI-28). An instructive, if simplistic, model of the photoelectrically effective absorption is given, graphically demonstrating the larger QE of higher-energy photons (Fig. 19). Here we find a clear understanding of the fact that knowledge of the optical data of the PC thin film is necessary for a valid theoretical approach. In the same vein, with more detail, Timan (VI-3) finds a probability function, and with it the location of emission centers, as a function of the threshold only. As examples we show the bialkali and S-20 intensity distributions in Figs. 20 and 21. These intensity distributions clearly show the interference effects for the region of higher refractive indices and heavier absorption. This results in a fairly uniform intensity across the whole film for the blue region, while for the S-20 the long-wavelength intensity is favorably higher toward the vacuum interface.
FIG. 20. Intensity distribution B-1; K-Cs surface. T = measured transmission. A = absorption, front incidence. [From Timan (V-34). © 1981 IEEE.]

Wavelength λ    Absorption A (front inc.)    Refractive index n    Absorption coefficient k
0.544 μm        37.5%                        2.75                  0.57
0.48 μm         46%                          4.84                  0.66
0.408 μm        59%                          4.125                 1.02
0.3885 μm       62%                          3.75                  1.06
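To give a feeling for the length scales implied by the tabulated k values, the following sketch converts k into an intensity absorption length 1/α with α = 4πk/λ. It assumes that k is the imaginary part of the complex refractive index and that the k values are listed in the same order as the wavelengths; the function name is hypothetical.

```python
import math


def absorption_length_angstrom(k: float, wavelength_um: float) -> float:
    """Return 1/alpha in angstroms, with alpha = 4*pi*k / wavelength.

    alpha is the intensity attenuation coefficient of a plane wave in a
    medium whose complex refractive index has imaginary part k.
    """
    alpha_per_um = 4.0 * math.pi * k / wavelength_um
    return 1.0e4 / alpha_per_um  # 1 um = 1e4 angstroms


if __name__ == "__main__":
    # (wavelength in um, k) pairs as listed with Fig. 20
    for wavelength_um, k in [(0.544, 0.57), (0.48, 0.66), (0.408, 1.02), (0.3885, 1.06)]:
        length = absorption_length_angstrom(k, wavelength_um)
        print(f"{wavelength_um:.4f} um: 1/alpha is about {length:.0f} A")
```

Absorption lengths of a few hundred angstroms are comparable to the film thickness and to the escape depths quoted above, which is why interference and escape have to be treated together.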
FIG. 21. Intensity distribution T-2; S-20 surface. T = measured transmission. [From Timan (V-34). © 1981 IEEE.]

Wavelength λ    Absorption A (front inc.)    Refractive index n    Absorption coefficient k
0.891 μm        6.5%                         3.10                  0.09
0.693 μm        19.0%                        3.20                  0.26
0.567 μm        27.5%                        3.35                  0.50
0.48 μm         39%                          3.60                  0.80
0.41 μm         52%                          4.35                  1.65

To derive the spectral shape and magnitude, I(z) is again matched with a P(z) which is assumed to be a function of the distance z from the vacuum interface and of photon energy:

F(λ) is found to consist of a steeply declining component in the vicinity of λ_0, and a much more slowly changing one in the higher-energy regions, possibly supporting the view of two different absorption mechanisms. For his sample surfaces the author finds an identical functional dependence for all alkali antimonides, resulting in “parallel” P(z) curves differing only in the value of λ_0, and confirms earlier suggestions by Love and Sizelove (VI-26). The product I(z)P(z) reflects the location of emission centers and probabilities of escape as functions of photon energy (Figs. 22 and 23). The integration

$$\eta(\lambda) = q\,k(\lambda)\int_{0}^{d} I(z)\,P(z)\,dz$$

with η(λ) the electron yield, q a constant conversion factor independent of λ, and k(λ) the absorption coefficient, conforms quite closely to the measured spectral response pattern.

We find the following significant results: Effective escape depth indeed increases with photon energy in the lower-energy region; the closer we come to the threshold, the smaller the effectively contributing lamella becomes. This may support the assumption of participation of defect levels or band bending or compositional changes near the vacuum interface.
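The intensity distribution I(z) inside a thin absorbing film, including the interference between the forward and the reflected wave, can be sketched with the standard single-film (Airy) formula at normal incidence. This is a minimal illustration, not the computation of the cited reports: the film thickness, the substrate index of 1.5, and the side of incidence are assumptions, and only the n and k values are taken from the table of Fig. 21.

```python
import cmath
import math


def film_field(z, d, wavelength, n_film, k_film, n_in=1.0, n_out=1.5):
    """Complex field at depth z in an absorbing film (normal incidence).

    z is measured from the interface on which the light falls (0 <= z <= d);
    the incident wave has unit amplitude. Lengths share one unit.
    """
    big_n = complex(n_film, k_film)             # complex index, exp(-i*omega*t) convention
    beta = 2.0 * math.pi * big_n / wavelength   # complex propagation constant in the film
    r01 = (n_in - big_n) / (n_in + big_n)       # Fresnel coefficients at the entrance face
    t01 = 2.0 * n_in / (n_in + big_n)
    r12 = (big_n - n_out) / (big_n + n_out)     # reflection at the exit face
    forward = t01 / (1.0 + r01 * r12 * cmath.exp(2j * beta * d))  # sums multiple reflections
    return forward * (cmath.exp(1j * beta * z) + r12 * cmath.exp(1j * beta * (2.0 * d - z)))


def film_intensity(z, d, wavelength, n_film, k_film, n_in=1.0, n_out=1.5):
    """|E(z)|^2 relative to the incident wave."""
    return abs(film_field(z, d, wavelength, n_film, k_film, n_in, n_out)) ** 2


if __name__ == "__main__":
    # S-20 values at 0.48 um from the table of Fig. 21; d = 0.03 um is hypothetical,
    # light assumed incident from glass (n = 1.5) with vacuum behind the film.
    d = 0.03
    for i in range(6):
        z = d * i / 5
        print(f"z = {z:.3f} um  |E|^2 = {film_intensity(z, d, 0.48, 3.60, 0.80, 1.5, 1.0):.3f}")
```

Because the field is continuous across both faces, the same function gives the reflectance as |E(0) - 1|² and the transmitted amplitude as E(d), which offers a simple check of energy conservation for a lossless film.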
[Figure 22: curves labeled 0.445 μm, 71 mA/W and 0.48 μm, 57.2 mA/W. From Timan (V-34). © 1981 IEEE.]
The blue maximum displayed by all alkali antimonides follows from the optimized I(z)P(z) in the peak region, which gets slightly worse for the shorter-λ regions with even higher absorption. While these are good results, this approach still needs further evaluation. Timan fails to give comparative data for vacuum incidence of the light vector and fails to supply a valid model for the absorption in the thin film. However, it would seem that the different response patterns for the S-25, Rb-Cs, etc., and most likely also variations between surfaces of the same type, can be explained similarly. These computations by Deutscher, Hirschberg, and Timan, which are based on classical field theory (evolving the behavior of the electrical field vector in a semi-infinite homogeneous, isotropic slab of PC), depict the relation between absorption and photoemission quite interestingly, but of course cannot explain the nature of the photoemissive effect. They are based on the continuity of the electric field vector (E² = I) in the thin film and the assumption that photoelectron generation is strictly proportional to the field intensity at every point, in the same way as this is firmly established for the thin-film response in toto.

FIG. 23. Electron generation function B-1; K-Cs. d = 408 Å. Curves labeled 0.48 μm, 35.5 mA/W and 0.408 μm, 44 mA/W. [From Timan (V-34). © 1981 IEEE.]

We discuss here the “band bending” proposed at the vacuum (and substrate) interface for the alkali antimonides. The “old masters” Sommer (VI-29) and Spicer (VI-30) point out that band bending cannot be distinguished from actual changes in E_a. Fisher (VI-31) postulates band bending over ≈60 Å for the Na-K-Sb cathode to explain its higher conductivity and lower threshold compared to the K-Cs-Sb (Fig. 24). Similarly, Ghosh and Varma (VI-8) take up a composition model from Dowman et al. (VI-32) to construct a band scheme of the S-20 with a surface layer of K-Cs-Sb. The rather serious objections against these ideas have been mentioned in Section V.

FIG. 24. Suggested energy band models for K₂Cs-Sb and Na₂K-Sb. [From Fisher et al. (III-32).]

S-1. For a discussion of the rather unclear state of earlier theories we refer the reader to Sommer’s discussion (VI-29, pp. 159-166). More recently, Heiman and Timan have done extensive work, but the matter has become only slightly clearer. Heiman (VI-33) investigates primarily heavy Cs-O layers on different metallic substrates and finds strongly increased yield only on Ag substrates (see also Section V). His further thoughts about the Cs-O emission centers and the nature of the technical S-1 must be questioned, as his cathodes do not reach the best possible quantum yield. Timan (VI-34) reports strongly improved QE and finds that cathode thicknesses are optimized to an interference condition (see also Sections III, V). For a model surface (113 A) he again evaluates the product I(z)P(z) and finally arrives at the typical S-1 response pattern with an escape probability.

Results are presented in Fig. 25. From different premises Gugel et al. (VI-35) arrive at band bending at the interface due to surface-absorbed Cs. Summarizing the more recent findings about this most difficult surface: Ag and Cs oxides may vary widely in composition, but certain minimum amounts appear necessary. A certain “patchiness” of the Ag base (apparently a specific property of Ag only) is preserved in processing and appears to be a condition for optimum performance. This of course complicates the validity of any mathematical model.
FIG. 25. Electron generation function S-1 (#133A), d = 58.0 nm; normalized for peak sensitivity 5.2 mA/W at 0.8 μm.
An optimum thickness or “interference” condition results from IR-optimized processing. Physical absorption is largely “inactive” Ag absorption, which remains dominant, though modified, in the finished cathode. IR emission emanates from “levels” of cesium oxide close to the vacuum interface; the lower the photon energy, the thinner the contributing lamella. This is also empirically confirmed by the extreme sensitivity of the IR emission to any surface changes. Our knowledge seems decidedly too sketchy to design any detailed band or emission models. The large variations achievable through processing again point to a whole “family” of S-1 types rather than some better-defined composition. With interest declining in the practical application of the S-1 because of its low QE, it is doubtful that the open questions here will be answered soon.

No new theoretical ideas have been reported for the S-10, which again finds even less practical interest now.

While progress can be detected in our theoretical models, many basic questions remain unanswered. In his 1976 study P. J. Vernier (VI-36) details the difficulties still unresolved. Exact calculations are possible only for the simplest photoemitters, i.e., metals. The efficient semiconductors, with their higher yields and escape depths, lower work function, and more favorable E_g : E_a ratio, offer specific problems of surface interactions for which we have no adequate model. As Vernier points out, it is too simplistic to describe absorption in surface layers with bulk constants, and it does not help to treat photoelectron generation as a three-step process if photon-electron conversion occurs in one as yet ill-defined process which should follow the conservation rules of energy and momentum. This field certainly still offers a great many opportunities for original research, and we can only hope that many of the data now still sadly missing will be collected and published and will allow a more balanced view of thin-film photoemitters, which are presently considered mostly as somewhat exotic practical tools.
REFERENCES

Section I
1. A. Sommer, “Photoemissive Materials,” Wiley, New York, 1968.

Section II
1. A. H. Sommer, Appl. Opt. 12, 90 (1973).
2. A. N. Pertsev, I. V. Reznikov, M. N. Suzin, and S. N. Cherenkevich, Instrum. Exp. Tech. (Engl. Transl.) 14, 1255 (1971).
3. A. E. Melamid and A. M. Potopov, Instrum. Exp. Tech. (Engl. Transl.) 18, 1210 (1975).
4. P. B. Coates, Appl. Phys. 6, 1862 (1973).
5. T. A. Arkhipova et al., Radio Eng. Electron. Phys. (Engl. Transl.) 18, 925 (1973).
6. K. A. Maysokaya and V. Ye. Privoleva, Radio Eng. Electron. Phys. (Engl. Transl.) 9, 436 (1964).
7. D. P. Jones and G. S. Kent, J. Phys. E 7, 744 (1974).
8. V. M. Gelikonov and P. A. Khandokhin, Radio Eng. Electron. Phys. (Engl. Transl.) 23(6) (1978).
9. A. H. Sommer, “Photoemissive Materials,” p. 79. Wiley, New York, 1968.
10. W. Budde, Appl. Opt. 12, 2108 (1973).
11. H. Timan, “Study and Improvement of the S-1,” Final Rep. on Ctr. DA 44-009-AMC-136(T). USA-ERDL, Ft. Belvoir, Virginia, 1968.
12. D. G. Theodorou, Adv. Electron. Electron Phys. 22A, 477 (1966).
13. C. F. Van Huysteen, Adv. Electron. Electron Phys. 40A, 419 (1976).
14. R. Holtom and G. P. Hopkins, J. Phys. D 12(7) (1979).
15. P. Della Porta, Proc. Conf. Tube Tech., 1968 (1968).
16. C. Ghosh and B. P. Varma, J. Appl. Phys. 49(8), 4549 (1978).
17. R. W. Decker, Adv. Electron. Electron Phys. 28A, 357 (1969).
18. B. R. C. Garfield, Adv. Electron. Electron Phys. 33A, 343 (1971).
19. D. McMullan and J. R. Powell, Adv. Electron. Electron Phys. 40A, 427 (1976).
20. E. Kansky, B. Otrin, S. Jeric, and P. Gspan, Nuovo Cimento, Suppl. 5, 139 (1967).
21. S. M. Johnson, Jr., IEEE Trans. Nucl. Sci. NS-20(1) (1973).
22. W. Budde and P. Kelly, Appl. Opt. 10, 2612 (1972).
23. R. B. Murray and J. J. Manning, IEEE Trans. Nucl. Sci. NS-7(2-3) (1960).
24. E. Feldner, P. Gortich, and T. Muller, Proc. IMEKO Symp. Photon Detect. 5 (1971).
25. W. E. Spicer and F. Wooten, IEEE 51, 1127 (1963).
26. V. Kanev, K. Nanev, and R. Petrova, Radio Eng. Electron. Phys. (Engl. Transl.) 9, 338 (1964).
27. M. Garbuny, T. P. Vogel, and J. R. Hansen, J. Opt. Soc. Am. 51, 261 (1961).
28. C. N. Sherring, J. Phys. E 3, 1016 (1970).
29. H. Timan, to be published.
30. M. W. Davies, Rev. Sci. Instrum. 43, 556 (1971).
31. S. Benci, P. A. Benedetti, and M. Mantredi, Appl. Opt. 13, 1554 (1974).
32. R. M. Matheson and F. Helvy, IEEE Trans. Nucl. Sci. NS-15, 195 (1968).
33. D. E. Persyk, J. L. McDonie, and A. F. Faulkner, IEEE Trans. Nucl. Sci. NS-23(1) (1976).
34. M. Jedlicka, Adv. Electron. Electron Phys. 28A, 323 (1969).
Section III
1. A. H. Sommer, J. Appl. Phys. 42, 467 (1971).
2. A. H. Sommer, J. Appl. Phys. 43, 2479 (1972).
3. O. B. Vorob’yeva, A. A. Mostovskiy, and G. B. Stuchinskiy, Radio Eng. Electron. Phys. (Engl. Transl.) 9, 414 (1964).
4. C. Ghosh and B. P. Varma, J. Appl. Phys. 49(8) (1978).
5. H. Timan, Rev. Tech. Thomson-CSF 8, 4 (1976).
6. L. Jung and H. Stadlmann, Proc. IMEKO Symp. Photon Detect., 4th, 1969 (1969).
7. See Section II, Ref. 34.
8. H. H. Hofmann, K. Deutscher, and A. Scharmann, Z. Phys. 236, 298 (1970).
9. H. H. Hofmann and K. Deutscher, Z. Phys. 236, 288 (1970).
10. B. P. Varma, J. Phys. D 6, 628 (1973).
11. E. L. Hoehne, Adv. Electron. Electron Phys. 33A, 369 (1971).
12. Z. M. Ronkin, Radio Eng. Electron. Phys. (Engl. Transl.) 9, 1102 (1964).
13. M. Duchet, Adv. Electron. Electron Phys. 22A, 499 (1966).
14. B. R. C. Garfield and J. R. Folkes, Adv. Electron. Electron Phys. 28A, 375 (1969).
15. N. A. Gulakov et al., Instrum. Exp. Tech. (Engl. Transl.) 15, 1187 (1973).
16. W. Budde, Proc. IMEKO Symp. Photon Detect., 4th, 1969 (1969).
17. C. D. Hollisch, Rev. Sci. Instrum. 45(11) (1974).
18. V. R. Lazarenko and L. G. Tokareva, Instrum. Exp. Tech. (Engl. Transl.) 19(5) (1978).
19. R. S. Sennett and G. D. Scott, J. Opt. Soc. Am. 40, 203 (1950).
20. See Section II, Ref. 29.
21. Section II, Ref. 9, p. 134.
22. A. H. Sommer, Final Rep. 44-009-eng-3642. USA-ERDL, Ft. Belvoir, Virginia, 1960.
23. See Section II, Ref. 18.
24. G. A. Condas, Rev. Sci. Instrum. 33, 987 (1962).
25. W. H. McCarroll, J. Appl. Phys. 39, 3414 (1968).
26. T. W. Hall and R. M. Eastment, Phys. Status Solidi A 2, 327 (1970).
27. J. C. Robbie and A. H. Beck, J. Phys. D 6, 1381 (1973).
28. See Section II, Ref. 16.
29. H. H. Hofmann, K. Deutscher, and A. Scharmann, Z. Phys. 236, 298 (1970).
30. W. E. Spicer, Phys. Rev. 112, 114 (1958).
31. T. B. Bhatia, G. K. Bhide, C. Ghosh, G. N. Kelkar, M. Srinivasan, B. P. Varma, and R. L. Verma, Adv. Electron. Electron Phys. 40A, 409 (1974).
32. D. G. Fisher, A. F. McDonnie, and A. H. Sommer, J. Appl. Phys. 45, 487 (1974).
33. H. Timan, “Study of the S-1,” Final Rep. Ctr. DA 44-009-AMC-1811 (E). USA-ERDL, Ft. Belvoir, Virginia, 1970.
34. See Section II, Ref. 11.
35. R. E. Simon and R. Suhrmann, “Der Lichtelektrische Effekt.” Springer-Verlag, Berlin and New York, 1958.
36. W. Heiman and E. L. Hoehne, Proc. IMEKO Symp. Photon Detect., 4th, 1969 (1969).
37. Wu-Quan-de, Acta Physiol. Sin. 28(4) (1979).
38. W. Heiman, Exp. Tech. Phys. 21, 193, 325, 431 (1973).
39. M. T. Pakhomov and A. E. Melamid, Radio Eng. Electron. Phys. (Engl. Transl.) 20(9) (1976).
40. A. H. Beck and J. C. Robbie, Int. J. 33, 361 (1972).
41. E. Kansky, Adv. Electron. Electron Phys. 33A, 356 (1971).
42. A. A. Dowman, T. H. Jones, and A. H. Beck, J. Phys. D 8(1) (1975).
43. A. H. Sommer and W. H. McCarroll, J. Appl. Phys. 37, 174 (1966).
44. T. Ninomiya, K. Taketoshi, and H. Tachiya, Adv. Electron. Electron Phys. 28A, 337 (1969).
45. W. H. McCarroll, J. Appl. Phys. 39, 3414 (1968).
46. W. H. McCarroll, R. J. Pfaff, and A. H. Sommer, J. Appl. Phys. 42, 569 (1971).
47. W. H. McCarroll, J. Phys. Chem. Solids 26, 191 (1965).
Section IV
1. E. G. Burroughs, Appl. Opt. 8, 261 (1969).
2. P. G. Borzyak, Proc. IMEKO Symp. Photon Detect., 4th, 1969 (1969).
3. C. Ghosh and B. P. Varma, Thin Solid Films 46(2) (1978).
4. J. A. Cochrane and R. F. Thumwood, Adv. Electron. Electron Phys. 40A, 441 (1976).
5. C. D. Hollisch and J. R. Crowe, Appl. Opt. 8, 1750 (1969).
6. See Section II, Ref. 14.
7. See Section III, Ref. 35, pp. 120ff.
8. M. Dvorak, Proc. IMEKO Symp. Photon Detect., 5, 76 (1971).
9. D. P. Jones, Appl. Opt. 15(4) (1976).
10. S. A. Shaw, G. R. Grant, and W. D. Gunter, Jr., Appl. Opt. 10, 2559 (1971).
11. E. G. Burroughs, Appl. Opt. 7, 2429 (1968).
12. See Section III, Ref. 33.
13. W. P. Raffan and A. W. Gordon, Adv. Electron. Electron Phys. 28A, 433 (1969).
14. See Section III, Ref. 39.
15. D. Kossel, K. Deutscher, and K. Hirschberg, Adv. Electron. Electron Phys. 28A, 419 (1969).
16. W. Greschat, H. Heinrich, and P. Romer, Adv. Electron. Electron Phys. 40A, 397 (1976).
17. J. M. Deltrap and A. H. Hanna, Adv. Electron. Electron Phys. 28A, 443 (1969).
18. E. Feldner, P. Gortich, and K. Wagner, Phys. Status Solidi A 1, 221 (1970).
19. See Section III, Ref. 45.
20. J. M. Chen and S. Wang, J. Appl. Phys. 48(4) (1977).
21. Yu. V. Popov et al., Instrum. Exp. Tech. (Engl. Transl.) 14, 231 (1971).
22. T. A. Demyanova, Instrum. Exp. Tech. (Engl. Transl.) 14, 1160 (1971).
23. W. D. Gunter, Jr., G. R. Grant, and S. A. Shaw, Appl. Opt. 9, 251 (1970).
Section V
1. See Section II, Ref. 9, pp. 134ff.
2. See Section II, Ref. 29.
3. W. Heiman, E. L. Hoehne, S. Jeric, and E. Kansky, Exp. Tech. Phys. 21(3-5), 193, 325, 431 (1973).
4. See Section III, Ref. 38.
5. A. E. Jennings and R. J. Dean, Adv. Electron. Electron Phys. 22A, 441 (1966).
6. See Section III, Ref. 39.
7. N. A. Soboleva, Bull. Acad. Sci. USSR 22, 575 (1958).
8. J. Koosman, IEEE Trans. Nucl. Sci. NS-11, 56 (1964).
9. See Section III, Ref. 33.
10. V. M. Klimkin and V. E. Prokopev, Instrum. Exp. Tech. (Engl. Transl.) 19(5) (1978).
11. A. Malherbe and M. Tessier, Adv. Electron. Electron Phys. 22A, 493 (1966).
12. R. L. Verma and B. P. Varma, Indian J. Pure Appl. Phys. 11(9) (1973).
13. M. Srinivasan, B. M. Bhat, and N. Govindarajan, J. Phys. E 7(10) (1974).
14. See Section II, Ref. 11.
15. See Section III, Ref. 27.
16. B. R. C. Garfield and R. F. Thumwood, Adv. Electron. Electron Phys. 22A, 459 (1966).
17. V. V. Artem’ev, Radio Eng. Electron. Phys. (Engl. Transl.) 8, 614 (1963).
18. V. V. Artem’ev, Radio Eng. Electron. Phys. (Engl. Transl.) 13, 316 (1968).
19. M. Jedlicka and P. Vlim, Adv. Electron. Electron Phys. 22A, 449 (1966).
20. See Section II, Ref. 34.
21. See Section III, Ref. 5.
22. See Section II, Ref. 16.
23. See Section III, Ref. 47.
24. See Section II, Ref. 14.
25. M. Dvorak, Adv. Electron. Electron Phys. 28A, 347 (1969).
26. See Section III, Ref. 42.
27. See Section III, Ref. 10.
28. B. P. Varma and C. Ghosh, Indian J. Pure Appl. Phys. 13(1) (1975).
29. M. Jedlicka and J. Raus, Proc. IMEKO Symp. Photon Detect., 4th, 1969 (1969).
30. See Section IV, Ref. 18.
31. See Section II, Ref. 12.
32. See Section II, Ref. 18.
33. See Section III, Ref. 5.
34. H. Timan, IEEE Trans. Nucl. Sci. NS-28(1), 652 (1981).
35. See Section III, Ref. 44.
36. See Section III, Ref. 31.
37. P. Dolizy, O. Delura, and M. Deloron, Acta Electron. 20, 265 (1978).
38. See Section III, Ref. 46.
39. See Section III, Ref. 41.
40. See Section III, Ref. 11.
41. A. H. Sommer, RCA Rev. 28, 75 (1967).
42. See Section IV, Ref. 10.
43. S. Sobieski, Appl. Opt. 15, 2298 (1976).
44. R. Bartlett, Sci. Instrum. 45, 779 (1974).
45. W. N. Charman, J. Phys. E 2, 157 (1969).
46. V. P. Denisov and A. I. Klimin, Radio Eng. Electron. Phys. (Engl. Transl.) 23(2) (1978).
47. L. B. Lapson and J. G. Timothy, Appl. Opt. 12, 388 (1973).
Section VI
1. See Section III, Ref. 30.
2. See Section III, Ref. 11.
3. See Section V, Ref. 34.
4. A. H. Sommer and W. E. Spicer, in “Photoelectronic Materials and Devices” (S. Larach, ed.), pp. 175-222. Van Nostrand-Reinhold, Princeton, New Jersey, 1966.
5. W. E. Spicer, J. Phys. Chem. Solids 22, 365 (1961).
6. R. Niedermayer and H. Mayer, eds., “Basic Problems in Thin Films,” Göttingen, Vandenhoeck & Ruprecht, 1966.
7. See Section III, Ref. 5.
8. See Section II, Ref. 16.
9. P. Goerlich, Adv. Electron. Electron Phys. 11, 10 (1959).
10. W. E. Spicer, J. Appl. Phys. 31, 2077 (1960).
11. See Section V, Ref. 3.
12. A. H. Sommer and W. E. Spicer, J. Appl. Phys. 32, 1036 (1961).
13. H. Timan, Rep. No. 33, DA-44-009-136 (T). USA-ERDL, Ft. Belvoir, Virginia, 1967.
14. See Section II, Ref. 14.
15. See Section III, Ref. 33.
16. See Section III, Ref. 9.
17. L. Ezard, RCA Rev. 36, 711 (1975).
18. See Section IV, Ref. 3.
19. M. Jedlicka and P. Vlim, Czech. Conf. Electron. Vac. Technol., 3rd, 1966, p. 351 (1966).
20. See Section II, Ref. 26.
21. M. J. Needham and R. F. Thumwood, Adv. Electron. Electron Phys. 28A, 129 (1969).
22. E. A. Taft and H. R. Philipp, Phys. Rev. 115, 1563 (1959).
23. K. Kumar and R. F. Thumwood, J. Phys. E 5, 536 (1972).
24. N. A. Soboleva, Radio Eng. Electron. Phys. (Engl. Transl.) 4, 204, 223 (1959).
25. J. Burton, Phys. Rev. 72, 531 (1947).
26. J. A. Love and J. R. Sizelove, Appl. Opt. 7, 1559 (1968).
27. See Section III, Ref. 9.
28. K. Hirschberg and K. Deutscher, Phys. Status Solidi 26, 257 (1968).
29. See Section II, Ref. 9.
30. W. E. Spicer, RCA Rev. 19(4) (1958).
31. See Section III, Ref. 32.
32. See Section III, Ref. 42.
33. See Section III, Ref. 38.
34. See Section II, Ref. 29.
35. B. M. Gugel, A. E. Melamid, and B. M. Stepanov, Radio Eng. Electron. Phys. (Engl. Transl.) 22, 101 (1977).
36. P. Vernier, Prog. Opt. 14 (1976).
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 63

Open-Ended Waveguides: Principles and Applications

FRED E. GARDIOL
Laboratoire d'Electromagnétisme et d'Acoustique
Ecole Polytechnique Fédérale
Lausanne, Switzerland

I. Introduction
II. Definitions and Generalities
    A. Antenna Analysis: The Internal and External Problems
    B. Description of the Mode Structure
    C. Equivalent Circuit
    D. Flanged and Unflanged Apertures
III. Measurements
    A. Slotted-Line Techniques
    B. Reflectometry
    C. Time-Domain Reflectometry
    D. Resonant Cavities
    E. Presentation of Results
IV. Applications
    A. Antenna Elements
    B. Measurement of Small Distances
    C. Radiation into a Plasma
    D. Diathermy and Hyperthermia
    E. Thermography
    F. Measurement of Materials
    G. About Rigorous Solutions
V. Theoretical Development for a Flanged Waveguide Radiating into an Infinite Homogeneous Medium
    A. Description of the Geometry
    B. Theoretical Development
    C. Normalization
    D. Green's Function Notation
    E. Point Matching
    F. Moment Method
    G. Variational Principle
    H. Characteristic Modes
    I. Variational Principle and Characteristic Modes
    J. Transform Methods
    K. Transmission into Oversized Waveguide
    L. Comments
VI. Application to Particular Structures
    A. TM01 Mode in Circular Waveguide
    B. Coaxial Line
    C. TE11 Mode in Circular Waveguide
    D. TE10 Mode in Rectangular Waveguide
VII. Conclusion
Appendix: Infinite Sample in Waveguide
References

Copyright © 1985 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014663-0
I. INTRODUCTION

A waveguide suddenly ends, abruptly cut open in a plane perpendicular to its longitudinal direction. The electromagnetic waves which were traveling along within the waveguide suddenly become free to spread out across space in all directions. The structure realized just by cutting the waveguide is a simple and crude antenna, in which the transition from guided to free wave propagation takes place within the plane of the discontinuity.

Open-ended waveguides are used for quite a variety of applications, such as simple antennas, elements radiating in a plasma, diathermy and hyperthermia applicators, thermography detectors, probes for the measurement of distances or of material properties, etc. These applications are considered in more detail in Section IV.

The basic structure, sketched in Fig. 1, consists of a waveguide, terminated or not by a flange in the plane of the aperture. The inside of the waveguide is filled with a homogeneous dielectric, assumed to be lossless (quite often air), having a relative permittivity ε_r. It radiates into a material of complex permittivity ε_d, which may vary with position: the situations generally covered by theory are either homogeneous media (ε_d independent of position) or stratified ones (ε_d then has a step-function dependence on the coordinate z). The structures encountered in practice are often more complex.

FIG. 1. Open-ended waveguide.

The waveguides considered here are metallic, with any cross section, including closed two-conductor lines such as coaxial lines. The general theoretical development is given in Section V, whereas particular applications to circular and rectangular waveguides and to coaxial lines are presented in Section VI. The treatment could in fact be extended to cover inhomogeneously filled guides and open structures as well, such as striplines and optical fibers. However, this will not be done here.

Many papers have been devoted to the study of open-ended waveguides, both from the experimental and from the theoretical points of view. The purpose of the present chapter is to provide a general survey of the present state of the art in this particular area of microwaves. It does not claim to be complete, but rather should serve as a broad introduction for researchers entering this field.

Since the field is rather wide, it was necessary to define some borders. As a result, the following related areas are not covered in the present survey: parallel-plate waveguides (for which many theoretical developments have appeared in the literature which unfortunately all too often cannot be extended to closed waveguides or to two-media problems), cavity-backed antennas, and transmission techniques for the measurement of materials (which make use of two open-ended waveguides on both sides of a sample of material). This means that the materials measurements considered here rely on the reflection within the waveguide.
II. DEFINITIONS AND GENERALITIES

A. Antenna Analysis: The Internal and External Problems
The analysis of radiating systems is made in two steps: the internal and the external problems. These two problems are basically distinct and must be treated separately (Mosig and Gardiol, 1982). The aim of the internal problem is the determination of the charges and currents on the antenna conductors or, in the case of an aperture, the electric field in the plane of the opening. This is done by solving Maxwell’s equations in the close vicinity of the antenna and then applying the boundary conditions. Since the fields must be determined in the very region where their sources are located, the mathematical formulation contains singularities and discontinuities, which make calculations difficult, in particular in the numerical evaluation of integrals (Yaghjian, 1980).
The antenna currents and charges, as well as the aperture fields, are of course related to the antenna excitation. They are, however, not identical with the excitation itself. For the open-ended waveguides considered here, the excitation is a forward wave propagating along the waveguide, generally in one mode only (most often the dominant mode). Within the aperture itself, other modes are excited, which are required to satisfy the boundary conditions (Section II,B). The aperture field then contains components of many waveguide modes, not only of the exciting one.

The external problem considers the radiated fields at large distances from the sources. The components of the fields associated with reactive power accumulation close to the antenna decrease as 1/r² or 1/r³: at a sufficiently great distance only the radiation components, which exhibit a 1/r decay, remain significant. This leads to important simplifications, valid in the far-field region only. The radiated power density, which is a function of the direction with respect to the antenna, is then compared to the power fed to the antenna, defining the directivity as follows:
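In the conventional far-field definition, assumed here to be the form intended, the power density in a given direction is referred to that of an isotropic radiator of the same total power at the same distance r:

$$D(\theta,\varphi) = \frac{4\pi r^{2}\,P(\theta,\varphi)}{P_t}$$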
where P(O,+) is the real part of the Poynting vector (radiated power density) and Pt is the total radiated power. In order to determine the fields at large distances from the antenna, one has to know either the currents and charges on the antenna conductors or the fields in the aperture. The external problem can therefore only be solved when the results of the internal problem are available (Rhodes, 1974). The internal problem is inherently far more difficult to solve than the external one; in many instances its resolution is practically impossible, even with the most powerful computer techniques presently available. Therefore, quite often a distribution of antenna currents and fields is assumed: the internal problem is not solved, but some educated guess is made as to which form this distribution might take. This approach, even if it is quite approximate, has proved most useful as far as the radiated fields are concerned: these are obtained through integration over the whole antenna structure, so that discrepancies tend to be averaged out. This is no longer true for the reflection factor in the connecting line. The input impedance to an antenna is strongly affected by the near-field components, which contribute to its reactive part. Approximations are then far less satisfactory. For the open-ended waveguide considered here, many publications assume that the electric field within the aperture is actually the one of the excitation signal only. This assumption is all the more insidious in that it is
often not introduced as such, but rather stated as being an obvious fact (maybe the authors did not even realize that they were making an approximation...). The direct result is to neglect all higher-order modes and their effects (energy storage, Section II,C). This assumption may be satisfied in some particular situations, but definitely not in the general case. The limits to its validity have not been determined.

B. Description of the Mode Structure
In any lengthwise uniform waveguide, extending in theory from minus infinity to plus infinity along one direction (taken here as the coordinate axis $z$), the general structure of the fields is the superposition of modes of propagation. These modes, obtained by solving Maxwell’s equations in the presence of the boundary conditions on the waveguide walls, form an orthogonal set. This means that any mode may propagate independently, without mixing or interfering in any manner with the other modes of the set. Their characteristics may be studied separately for each mode. As soon as any disturbance is introduced within the waveguide, the uniformity is broken. New boundary conditions are introduced which an individual mode of the uniform waveguide does not satisfy. However, these new boundary conditions can always be satisfied by a combination of several modes, since the modes form a complete set. The disturbance produces coupling between modes, so that modes are excited in its vicinity (Collin, 1960). This is true for all disturbances: termination of a waveguide by an aperture is no exception. Which modes are excited at a discontinuity depends upon its geometry, in particular upon possible symmetry conditions (Section VI). In the structures used to carry out measurements of materials properties, precautions are taken to have a single propagating mode, most often the dominant one (or lowest-order mode). In this situation, the higher-order modes excited at the discontinuity are evanescent; i.e., they decay exponentially with distance away from the perturbed region. At a sufficient distance, only the waves of the propagating mode, incident and reflected, remain. The reflection factor is then easily measured, in a plane where the higher-order mode components have become vanishingly small (measurements in a multimode structure are difficult to manage).
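As a rough numerical illustration of how quickly evanescent modes die away, the following sketch evaluates the standard TE$_{m0}$ dispersion relation for an air-filled rectangular guide. The width a = 22.86 mm is the X-band value quoted in the caption of Fig. 5; the frequency of 10 GHz, the choice of TE$_{30}$ as the higher-order mode, and all function names are assumptions made for this example only.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def te_m0_propagation(freq_hz, a_m, m=1):
    """Classify the TE_m0 mode of an air-filled rectangular guide of width a_m.

    Returns ("propagating", guide wavelength in m) above cutoff, or
    ("evanescent", attenuation constant in Np/m) below cutoff.
    """
    k = 2.0 * math.pi * freq_hz / C0   # free-space wave number
    kc = m * math.pi / a_m             # cutoff wave number of TE_m0
    if k > kc:
        beta = math.sqrt(k * k - kc * kc)
        return "propagating", 2.0 * math.pi / beta
    return "evanescent", math.sqrt(kc * kc - k * k)


def residual_amplitude(alpha, distance_m):
    """Relative amplitude exp(-alpha * distance) left after a given distance."""
    return math.exp(-alpha * distance_m)


# Assumed operating point: X-band guide at 10 GHz (illustrative values).
a = 22.86e-3
f = 10e9
kind_10, lambda_g = te_m0_propagation(f, a, m=1)   # dominant mode
kind_30, alpha_30 = te_m0_propagation(f, a, m=3)   # example higher-order mode

# Residual TE30 amplitude at candidate reference planes n * lambda_g / 2.
residuals = [residual_amplitude(alpha_30, n * lambda_g / 2.0) for n in (1, 2, 3)]
for n, r in zip((1, 2, 3), residuals):
    print(f"n = {n}: distance = {n * lambda_g / 2 * 1e3:.2f} mm, residual = {r:.2e}")
```

Under these assumptions the residual amplitude falls by the same factor for every additional half guide wavelength, which is why a plane one or two half-wavelengths back is usually sufficient.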
Since the reflection factor varies with position along the waveguide, a reference plane must be chosen so that the problem is fully specified. This reference plane must be taken far enough from the opening that the fields of higher-order evanescent modes have decayed to amplitudes small enough to be neglected. As the field structure of the propagating waves has a $\lambda_g/2$
period, the reference plane is taken at a distance $n\lambda_g/2$ from the opening, where the reflection factor for the propagating mode is the same as within the aperture itself.

C. Equivalent Circuit
When a transmission line is terminated by an ideal open circuit, no current flows across the end of the line. The load impedance is then infinite, and its admittance is equal to zero. Open-ended waveguides are not ideal open circuits, however, owing to the presence of two effects:
(1) Part of the incident power is radiated into the right-hand half-space $z > 0$.
(2) Some energy is stored in the electromagnetic fields on both sides of the aperture (near-field regions).
These two effects are more or less additive, and are thus best understood in terms of shunt admittances connected across the line in the reference plane (Section II,B). When the material filling the right-hand half-space is lossless, i.e., when its relative permittivity $\varepsilon_r$ is a real quantity, the equivalent circuit is made of two elements (Fig. 2):
(1) The radiated power (Section II,A) is represented by a conductance $G_r$. This does not mean that this power is actually dissipated, but that it has vanished as far as the waveguide is concerned.
(2) Energy is stored by evanescent modes close to the aperture (Section II,B): this is represented by a susceptance $B_s$.
Evanescent TE modes predominantly store magnetic energy, TM modes electric energy (Collin, 1960). One would expect, through continuity of the fields, that the same type of energy would be stored close to the opening in the right-hand half-space. A discontinuity at which TE modes are excited predominantly is thus represented by a shunt inductance in the equivalent circuit, while a capacitance
FIG. 2. Equivalent circuit of an open-ended waveguide radiating into a lossless medium.
FIG. 3. Equivalent circuit of an open-ended waveguide radiating into a lossy medium.
appears for structures in which TM modes dominate. When no higher-order modes are excited (infinite sample in waveguide, see the Appendix), or when the energies stored in the two types of modes balance each other out (resonant circuit), the susceptance in the equivalent circuit vanishes. When the material is lossy, i.e., when its relative permittivity $\varepsilon_r$ is complex, two components must be added to the equivalent circuit (Fig. 3):
(1) Radiation into a lossy material produces an inductive susceptance $B_r$, in addition to the resistive element $G_r$ (Appendix).
(2) Losses connected with reactive energy storage (inductive or capacitive) are represented by a conductance $G_s$.
Integral relationships link the equivalent-circuit elements to the field structure. They are not particularly useful in practice, as one would first have to know the field distribution itself. Nevertheless, the equivalent circuit may help in understanding the basic interaction mechanisms. On the other hand, it is possible to determine the overall equivalent circuit once the reflection factor is known. An example for a simple situation is given in the Appendix.

D. Flanged and Unflanged Apertures
The transition from waveguide to space can have different shapes: the waveguide may be terminated by a flat metal plate, or flange, or it may simply end (Fig. 4). The theoretical developments usually consider a simple model: a waveguide terminated by an infinite flange (Fig. 4a, see also Sections V and VI). The waveguide, which extends from $z = -\infty$ to $z = 0$, is terminated by a flat metallic plate, generally assumed to be a perfect electric conductor (PEC), in the plane $z = 0$. The complete domain is thus divided into two different regions: region I is the inside of the waveguide, while region II is the complete right-hand half-space $z > 0$. The fields are determined separately in the two
FIG.4. Flanged and unflanged apertures: (a) infinite flange, (b) unflanged waveguide, and (c) finite flange.
regions, and then the conditions of continuity are imposed in the plane of the opening. The unflanged waveguide of Fig. 4b is definitely a different structure in which a third region is present: region III extends over the part of the left-hand half-space $z < 0$ which surrounds the waveguide. The electromagnetic fields must then also be determined in region III and continuity be ensured across the whole plane $z = 0$. In between are all finite flanges (Fig. 4c). Region III also exists in this case, and a fourth region appears behind the flange, adding a further degree of complexity to the problem. It must be noted here that, while most theoretical developments are devoted to the first geometry (infinite flange, Fig. 4a), measurements can obviously only be carried out with finite flanges or without any flange. The “infinite” flange then becomes, in practical terms, a “large” flange, having a size much larger than the waveguide cross section. One may then admit an approximate
FIG. 5. Comparison of the reflection from an open-ended X-band rectangular waveguide (a = 22.86 mm, b = 10.16 mm): (a) with 100 x 100 mm flange and (b) without flange. [Plot: VSWR on the vertical axis; horizontal axis ticks from 8 to 12.]
equivalence between the two. The correspondence is best when the material is lossy; it is less favorable with low-loss materials (Frood and Wait, 1956). The size of the flange significantly affects the field distribution and therefore the reflection within the waveguide (Fig. 5). In a circular waveguide, the effect of the thickness of the waveguide wall was considered by James and Greene (1978). The width of the beam radiated by an open-ended waveguide may be significantly reduced by adding a flange (Risser, 1949).
III. MEASUREMENTS
Several different approaches are available to measure the reflection in a waveguide. The most important ones shall be briefly outlined in this section.

A. Slotted-Line Techniques
Well known to every microwave student is the slotted line (Montgomery, 1947; Giordano, 1963), in which a sliding probe samples the amplitude of the electric field as a function of position along the line. The ratio of maximum to minimum voltage along the line is the voltage standing-wave ratio $s$ (VSWR). The phase of the reflection factor $\rho$ is given by the distance between minima, measured with the line terminated by the device tested ($z_d$) and with a short circuit in the reference plane ($z_c$):

$$2\phi = \pi\left[4(z_c - z_d)/\lambda_g + 1 \pm 2n\right] \tag{2}$$

where $\lambda_g$ is the guide wavelength and $n$ is an integer, selected to keep the phase within a specified range. The reflection factor is then

$$\rho = \left[(s - 1)/(s + 1)\right]\exp(j2\phi) \tag{3}$$

and the impedance is given by

$$Z = Z_c\,\frac{1 + \rho}{1 - \rho} = Z_c\,\frac{s - j\tan\phi}{1 - js\tan\phi} \tag{4}$$
where $Z_c$ is the characteristic wave impedance of the guide. The slotted-line technique is quite accurate, but involves a number of manipulations, so considerable manual skill and care are required from the operator. The technique would be quite difficult to automate. An alternative approach has been proposed, using two fixed probes and swept-frequency signals (de Ronde, 1965). While quite accurate, this technique is rather difficult to calibrate and is therefore seldom used.
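The data reduction of Eqs. (2) and (3) can be sketched in a few lines. The function below is only an illustration: its name, argument names, and the default normalization $Z_c = 1$ are assumptions, and it takes the shift of the minimum as the short-circuit position minus the device position, following the ordering in Eq. (2). The impedance is obtained directly from $(1+\rho)/(1-\rho)$.

```python
import cmath
import math


def slotted_line(s, z_short, z_device, guide_wavelength, z_c=1.0, n=0):
    """Reflection factor and impedance from slotted-line readings.

    s                -- measured VSWR (>= 1)
    z_short          -- position of a minimum with the short circuit in place
    z_device         -- position of the corresponding minimum with the device
    guide_wavelength -- guide wavelength, same length unit as the positions
    z_c              -- characteristic wave impedance (1.0 gives normalized Z)
    n                -- integer used to keep the phase in a chosen range
    """
    # Eq. (2): twice the angle phi, from the shift of the minimum.
    two_phi = math.pi * (4.0 * (z_short - z_device) / guide_wavelength + 1.0 + 2.0 * n)
    # Eq. (3): magnitude from the VSWR, phase from Eq. (2).
    rho = (s - 1.0) / (s + 1.0) * cmath.exp(1j * two_phi)
    # Impedance in the reference plane.
    z = z_c * (1.0 + rho) / (1.0 - rho)
    return rho, z


# A matched load (s = 1) gives rho = 0 and Z = Z_c.
rho_matched, z_matched = slotted_line(1.0, 0.0, 0.0, 0.04)

# A minimum that does not move relative to the short-circuit case
# corresponds to a real impedance Z_c / s.
rho_min, z_min = slotted_line(2.0, 0.0, 0.0, 0.04)
```

Keeping the impedance as $(1+\rho)/(1-\rho)$ avoids the tangent form, which becomes singular when $\phi$ is an odd multiple of $\pi/2$.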
It was assumed here that only one mode propagates in the waveguide. If several modes can propagate, a nonperiodic variation (chaotic behavior) of the fields is observed.

B. Reflectometry
Reflectometer techniques have also been in use for a long time (Engen and Beatty, 1959); they have actually become the ones most commonly used since the introduction of the network analyzer (Anderson and Dennison, 1967). The reflected signal is sampled by a directional coupler and compared, in amplitude and phase, to the incident signal, sampled by another coupler (Fig. 6). The accuracy of the technique is limited by the directivity of the two couplers (Staeger and Kartaschoff, 1977). The resulting error may, however, be compensated to some extent by calibration and computer techniques. The network analyzer is, however, a rather sophisticated, expensive, and bulky piece of equipment, specifically designed for the microwave laboratory but ill-suited for “field” measurements.

FIG. 6. Two-coupler reflectometer (blocks shown: network analyzer, source, couplers, probe).
For some time now, six-port junctions have been proposed to measure reflection and transmission parameters (Engen, 1977). The incident and reflected signals are linearly combined in four different ways (Fig. 7) and the four resulting signals detected. The amplitude and the phase of the reflection factor are then determined, which requires rather complicated calculations. The six-port approach, combined with a microprocessor, is capable of providing accurate measurements with a portable instrument for field operation, as was recently demonstrated by Zürcher et al. (1983). In principle, waveguide reflection methods are capable of providing measurements at any frequency within the operating range of the waveguide. Usually, however, the phase of the reflection is difficult to determine with sufficient accuracy. Here, too, only one mode should be able to propagate in the waveguide.
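One common way to picture the six-port calculation is as the intersection of three circles in the complex $\rho$ plane. The sketch below works under simplifying assumptions that are not taken from the text: one detector is treated as an ideal reference reading the incident power only, and each of the other three readings, divided by the reference, is taken to equal $|\rho - q_k|^2$, with the complex points $q_k$ known from a prior calibration. The function name and the numerical values of $q_k$ are hypothetical.

```python
import cmath
import math


def six_port_rho(p_ratios, q_points):
    """Recover rho from three normalized power readings p_k = |rho - q_k|**2.

    Subtracting the first circle equation from the other two removes the
    |rho|**2 term and leaves two linear equations in x = Re(rho), y = Im(rho).
    """
    p1, p2, p3 = p_ratios
    q1, q2, q3 = q_points
    rows = []
    for pk, qk in ((p2, q2), (p3, q3)):
        a = 2.0 * (qk.real - q1.real)
        b = 2.0 * (qk.imag - q1.imag)
        c = (abs(qk) ** 2 - abs(q1) ** 2) - (pk - p1)
        rows.append((a, b, c))
    (a1, b1, c1), (a2, b2, c2) = rows
    det = a1 * b2 - a2 * b1
    if det == 0.0:
        raise ValueError("q points are collinear; rho is not uniquely determined")
    x = (c1 * b2 - c2 * b1) / det   # Cramer's rule
    y = (a1 * c2 - a2 * c1) / det
    return complex(x, y)


# Hypothetical calibration points, spaced 120 degrees apart on a circle of radius 2.
q_points = [cmath.rect(2.0, math.radians(angle)) for angle in (90.0, 210.0, 330.0)]

# Simulate the readings for a known reflection factor, then recover it.
rho_true = 0.3 - 0.4j
p_ratios = [abs(rho_true - q) ** 2 for q in q_points]
rho_est = six_port_rho(p_ratios, q_points)
```

With noisy readings the three circles no longer meet in a single point, and a least-squares fit would replace the exact two-equation solve used here.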
FIG.7. Six-port junction in microstrip.
C. Time-Domain Reflectometry
While in the previous methods a single-frequency signal is used for the measurements (at any given time the signal has only one frequency), in time-domain reflectometry (TDR) measurements are made with all the frequencies at the same time. A step with a very short rise time is sent down the transmission line, reaches the sample to be measured, and is reflected (Fig. 8). The Fourier transforms of the incident and reflected pulses are taken and compared, yielding $\rho(\omega)$ (Bucci et al., 1972; Iskander and Stuchly, 1972; Waldmeyer and Zschokke-Granacher, 1975). The measurement is actually completed before the arrival of the reflection from the back end of the sample: for all practical purposes, the sample can be considered as being infinite, and a simple relationship links its permittivity to the reflection factor (Appendix).
FIG.8. Principle of the time-domain reflectometer (TDR).
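The comparison of the two Fourier transforms can be sketched as a ratio of discrete spectra. This is a minimal illustration that assumes both waveforms are sampled on the same time grid and fit entirely within the record; it uses a pulse of finite duration rather than a true step, and a plain discrete Fourier transform written out in full so that no external library is needed. Function names and the threshold value are assumptions.

```python
import cmath
import math


def dft(samples):
    """Plain discrete Fourier transform (O(n^2), adequate for short records)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def reflection_spectrum(incident, reflected, dt, floor=1e-6):
    """Return [(frequency in Hz, rho)] as the ratio of reflected to incident spectra.

    Bins where the incident spectrum falls below floor * (its peak) are
    skipped, since the ratio is meaningless where no energy was sent.
    """
    n = len(incident)
    v_inc, v_ref = dft(incident), dft(reflected)
    peak = max(abs(v) for v in v_inc)
    spectrum = []
    for k in range(n // 2 + 1):
        if abs(v_inc[k]) > floor * peak:
            spectrum.append((k / (n * dt), v_ref[k] / v_inc[k]))
    return spectrum


# Synthetic check: a Gaussian pulse, reflected with factor -0.5 and a 3-sample delay.
n, dt, delay = 64, 10e-12, 3
incident = [math.exp(-0.5 * ((i - 20) / 2.0) ** 2) for i in range(n)]
reflected = [-0.5 * incident[(i - delay) % n] for i in range(n)]
spectrum = reflection_spectrum(incident, reflected, dt)
```

In this synthetic case the magnitude of the ratio is 0.5 in every retained bin, and the delay appears only as a phase that grows linearly with frequency.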
This measurement technique is quite broadband and therefore applicable only to broadband transmission lines: to coaxial lines, but not to hollow waveguides. It could quite well be utilized with open-ended coaxial lines.

D. Resonant Cavities

When the open-ended waveguide is terminated at the other end by a short circuit in a plane $z = -d$ (Fig. 9), a resonant cavity is obtained when the reflection at the opening is large enough. In this case, the material measured perturbs the fields outside of the open cavity (external perturbation).
FIG.9. Open-ended cavity, coupled to coaxial lines by inductive loops.
A cavity resonance is defined by two quantities: the resonant frequency $f_0$ and the unloaded quality factor $Q_0$. While the standard laboratory techniques used to measure these quantities are rather long and tedious (Ginzton, 1957), several approaches are now available to carry out the measurements in a fully automatic way (Ney and Gardiol, 1977; Bernier et al., 1982). The input impedance of a transmission line of length $d$ terminated into a load impedance $Z_L$ is given by (Ramo et al., 1965)

$$Z_{in} = Z_0\,(Z_L + Z_0\tanh\gamma d)/(Z_0 + Z_L\tanh\gamma d) \tag{5}$$
In that plane, a short circuit is placed so that the transverse voltage vanishes, specifying $Z_{in} = 0$ and therefore

$$Z_L + Z_0\tanh\gamma d = 0 \tag{6}$$

This relationship actually provides a simple way to determine $\gamma$ in terms of the reflection factor $\rho$:

$$\gamma = \alpha + j\beta = \frac{1}{2d}\ln\frac{Z_0 - Z_L}{Z_0 + Z_L} = \frac{1}{2d}\ln(-\rho) \tag{7}$$
The propagation factor $\gamma$ is in turn linked to the (complex) cavity resonant frequency by the waveguide equation:

$$\omega = c_0\sqrt{p^2 - \gamma^2} = c_0\sqrt{p^2 + \beta^2 - \alpha^2 - 2j\alpha\beta} = \omega_r + j\omega_i \tag{8}$$
where $p$ is the transverse wave number of the waveguide mode considered. Resonance is only obtained when $p^2 + \beta^2 > \alpha^2$. The real part of the complex angular frequency, when losses are small, is approximately related to the resonant frequency by

$$\omega_r \cong 2\pi f_0 \tag{9}$$

The unloaded quality factor is directly given by the ratio of real to imaginary parts:

$$Q_0 = \omega_r/2\omega_i \tag{10}$$
Cavity resonance techniques only permit measurements at certain discrete frequencies, the ones at which resonance occurs. They are therefore not well suited to measuring materials having strongly frequency-dependent properties. Accurate measurements may only be obtained with large reflections; i.e., the open-ended waveguide should not radiate too much (Section II,C).
E. Presentation of Results

In the literature, results are presented in a variety of forms: reflection factor, VSWR, load capacitance, admittance, impedance, equivalent circuit, etc. This variety makes comparison difficult. One may therefore wonder how results should be presented to facilitate their use in practice. To answer this question, let us just consider which quantities are actually measured at microwaves. All waveguide measurements (Sections III,A, III,B, and III,C) provide the reflection factor $\rho$, while cavity resonance yields $f_0$ and $Q_0$, from which $\rho$ can also be determined (Section III,D). The “natural” quantity to be presented thus appears to be the reflection factor (amplitude and phase). All other quantities are intermediate values, which are not directly related to the measurement or the material properties. They may be derived fairly easily once the reflection factor has been determined (for instance on a Smith chart). In lower frequency ranges, measurements are generally made with impedance bridges, which directly yield the capacitance and the conductance.
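The derived quantities listed above follow from the reflection factor through standard transmission-line relations. The helper below is a small sketch of those conversions; its name and the returned dictionary keys are assumptions, and it is restricted to $|\rho| < 1$, which is the case for an aperture that radiates.

```python
def from_reflection(rho, z_c=1.0):
    """Derive VSWR, impedance and admittance from a complex reflection factor.

    rho -- complex reflection factor in the reference plane, with |rho| < 1
    z_c -- characteristic impedance (1.0 gives normalized values)
    """
    magnitude = abs(rho)
    if magnitude >= 1.0:
        raise ValueError("this sketch assumes |rho| < 1")
    vswr = (1.0 + magnitude) / (1.0 - magnitude)
    impedance = z_c * (1.0 + rho) / (1.0 - rho)
    admittance = 1.0 / impedance
    return {
        "vswr": vswr,
        "impedance": impedance,
        "conductance": admittance.real,   # G of the equivalent shunt circuit
        "susceptance": admittance.imag,   # B of the equivalent shunt circuit
    }


real_case = from_reflection(0.5 + 0.0j)
reactive_case = from_reflection(0.5j)
```

For a purely imaginary reflection factor of 0.5j the normalized impedance is 0.6 + 0.8j, so the same VSWR of 3 corresponds to a very different load than in the real case.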
IV. APPLICATIONS

Open-ended waveguides are used in a large number of practical situations: a general review is given in this section. For some applications, the important parameter is the radiation itself: how does one manage to get a certain field amplitude to (or from) some specified region of space? In others, radiation provides a means to carry out some measurement: distance or material properties. In both instances, the structures may be

Homogeneous: the medium in the waveguide is the same as that into which it radiates ($\varepsilon_d = \varepsilon_c$), most often air;
Inhomogeneous: the waveguide is filled with a medium of relative permittivity $\varepsilon_c$ and radiates into another medium of permittivity $\varepsilon_d$ (which may vary with position).

These two properties serve to set up categories in this section. Subsection A considers radiation in the homogeneous case, and B its use to measure small distances or displacements. For inhomogeneous structures, radiation is considered in Subsections C (plasmas), D (diathermy), and E (thermography), while its application to the measurement of materials is covered under F. This section ends, in Subsection G, with general considerations as to why it is desirable to obtain values as close as possible to the exact ones.

A. Antenna Elements
First of all, an open-ended waveguide is a simple antenna, which may be used to radiate an electromagnetic wave. This is a homogeneous situation: the waveguide is air filled and radiates into an air-filled region. The radiation properties of open-ended waveguides have been considered by a number of authors, for circular, coaxial, and rectangular cross sections.

1. Circular Waveguides

The radiated field components are given by Risser (1949), who does not consider the resulting reflection within the waveguide. On the other hand, Marcuvitz (1950) provides equivalent-circuit representations for radiation from TM$_{01}$, TE$_{01}$, and TE$_{11}$ modes. Walls of zero thickness are considered and a transform method is applied, yielding equivalent-circuit elements in a reference plane which is not the plane of the opening (except for the TE$_{11}$ case). The mathematical expressions provided, for the circuit elements and the reference plane location, involve integrations (over an infinite interval) of expressions containing modified Bessel functions. They are not immediately usable, but fortunately results are also presented in graphical form.
A rigorous approach to the study of unflanged open-ended circular waveguides with zero-thickness walls is presented by Weinstein (1969a,b). The mathematical development is based on the surface currents over the waveguide walls. This technique appears to be specific to the homogeneous problem. Open-ended circular waveguides are used as feed elements in high-performance reflector antennas for satellite communications (Balling and Jacobsen, 1977). The metal flange is recessed from the plane of the aperture. The radiated fields are determined from a combination of the method of moments and diffraction theory. A similar structure, with offset chokes, was considered by Wohlleben et al. (1972), also as a primary feed for high-efficiency antennas. An integral equation approach, developed by Bird (1979) to study mode coupling between several waveguides, is applied, among others, to the open-ended flanged circular waveguide excited by the dominant TE$_{11}$ mode. Taking into account some higher-order modes (TM$_{11}$, TE$_{12}$, TM$_{12}$, ...), the results confirm the change of sign of the susceptance observed experimentally by Bailey and Swift (1968). Inductive at low frequencies, the susceptance vanishes (resonance) and then turns capacitive as frequency increases. The effect of an iris within the aperture was considered by Deshpande and Das (1975), who made use of the conservation of the flux of reaction. The input admittance is determined by taking into account the effect of higher-order modes.
2. Coaxial Line
The equivalent circuit for an unflanged open-ended coaxial line is given by Marcuvitz (1950). The same procedure was used as for circular waveguides. The radiation from the TEM mode in a coaxial line terminated into a flat infinite metal plate was considered by Levine and Papas (1951) and later on by Galejs (1969). A stationary form for the admittance is obtained, and the aperture fields are assumed to be those of the dominant TEM mode only (higher-order modes are neglected). The admittance at the aperture, normalized with respect to the characteristic admittance of the coaxial line, is then given by
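$$\frac{Y}{Y_c} = \cdots + \frac{j}{\pi\ln(b/a)}\int_0^{\pi}\left[2\,\mathrm{Si}\!\left(ka\sqrt{1 + (b/a)^2 - 2(b/a)\cos\psi}\right) - \mathrm{Si}\!\left(2ka\sin\frac{\psi}{2}\right) - \mathrm{Si}\!\left(2kb\sin\frac{\psi}{2}\right)\right]d\psi$$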
where $k = \omega/c_0 = \omega\sqrt{\varepsilon_0\mu_0}$, $a$ and $b$ being, respectively, the radii of the inner and outer conductors (Fig. 13). An approximate expression is given when $kd < 1$, where $d = \tfrac{1}{2}(a + b)$:
the characteristic admittance of the coaxial line itself is
$$Y_c = 2\pi\left[Z_0\log(b/a)\right]^{-1} \tag{13}$$

Harrington (1961) also considers a coaxial line opening into a ground plane, treating it in terms of the equivalent problem of a magnetic current loop. The aperture field is also assumed to be the one in the coaxial line. The far fields are determined, the power radiated is found to vary as $1/\lambda^4$ ($\lambda = 2\pi/k$), the directive gain is 3, and a radiation conductance is determined by

$$G = \frac{4\pi^5}{3Z_0}\left(\frac{b^2 - a^2}{\lambda^2\log(b/a)}\right)^2$$
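The radiation conductance quoted from Harrington (1961) is easy to evaluate numerically. The sketch below assumes that $Z_0$ denotes the wave impedance of free space and that the logarithm is the natural one; the function name and the example radii are hypothetical.

```python
import math

Z0_FREE_SPACE = 376.730313668  # wave impedance of free space, ohms


def coax_radiation_conductance(a, b, wavelength):
    """Radiation conductance (siemens) of a small coaxial aperture in a ground plane.

    a, b       -- inner and outer conductor radii, m
    wavelength -- free-space wavelength, m
    Valid only for apertures that are small compared with the wavelength.
    """
    ratio = (b * b - a * a) / (wavelength ** 2 * math.log(b / a))
    return 4.0 * math.pi ** 5 / (3.0 * Z0_FREE_SPACE) * ratio ** 2


# Hypothetical line: 1.5 mm inner radius, 3.5 mm outer radius.
g_3ghz = coax_radiation_conductance(1.5e-3, 3.5e-3, 0.10)   # wavelength 10 cm
g_1p5ghz = coax_radiation_conductance(1.5e-3, 3.5e-3, 0.20)  # wavelength 20 cm
```

Doubling the wavelength reduces the conductance by a factor of 16, which is the $1/\lambda^4$ dependence of the radiated power stated in the text.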
3. Rectangular Waveguide

The radiation characteristic of a rectangular waveguide was considered by Barrow and Greene (1938), applying Huygens' principle in the formulation given by Kirchhoff. The aperture fields were assumed to be those of the TE$_{10}$ mode in the waveguide; the far-field components were obtained, and were found to agree remarkably well with experimental values. Chu (1940) considered the same structure, in terms of modified Kirchhoff formulas, making the same assumption. Schelkunoff (1936, 1939) proposes the use of equivalence and induction theorems. The far-field components are also given by Risser (1949). None of these publications considers the resulting reflection within the waveguide itself. Lewin (1951) and later on Galejs (1969) considered a rectangular waveguide opening into an infinite metal flange. A stationary expression for the input admittance is obtained, and the aperture field is assumed to be the one of the dominant TE$_{10}$ mode (higher-order modes neglected). The values of conductance and capacitive susceptance are determined. Mautz and Harrington (1976) consider the same problem in terms of a generalized network formulation. An integral equation is obtained and is solved by the moment method. A computer program is available (Mautz and
Harrington, 1978), yielding the admittance, the tangential electric field in the aperture, and the radiation gain patterns. Various approaches to the problem were compared by Fernez (1978): a Green’s function and a plane-wave superposition technique were found to provide the most efficient formulations, in terms of computer calculations. Particular precautions were taken to get rid of the singularities appearing within the integrals. Since then, an exact solution by the correlation matrix method has become available (MacPhie and Zaghloul, 1980). The interaction between rectangular waveguide feeds was considered by Mailloux (1969a,b). The problem is formulated in terms of simultaneous integral equations, solved approximately by expanding the aperture fields in a Fourier series. Close agreement with experiment was demonstrated. Cross-polarization in rectangular openings was investigated by Jamieson and Rozzi (1977), in terms of a Rayleigh-Ritz variational resolution. The analysis showed the importance of higher-order modes with orthogonally polarized electric fields.
B. Measurement of Small Distances

As was shown by Decreton (1975), the reflection in an unflanged waveguide illuminating a flat metal plate is formed of a constant term resulting from multiple diffraction at the waveguide edges, and of terms due to the reflection from the plate (Fig. 10). When the constant term is compensated by matching, the phase of the remaining reflection is proportional to the distance from the waveguide end to the plate, except at very small distances (Ramachandraiah and Gardiol, 1972; Gardiol, 1978).
FIG. 10. Diffraction at the open end of a waveguide facing a reflecting plate: (a) multiple diffraction by ends of waveguide walls (independent of d ) and (b) rays reflected by the plate, phase proportional to distance d .
To measure very small distances (in the millimeter range), the cavity technique proposed by Soga (1973) may be utilized: a flanged open-ended circular waveguide cavity resonating in the TE,,, mode is placed close to a metal plate. The resonant frequency was found to vary linearly with gap width (between 0 and 15 mm).
C. Radiation into a Plasma

When a space vehicle re-enters the earth's atmosphere, radio transmission is perturbed by the formation of a sheath of ionized particles (plasma layer) surrounding the vehicle. This plasma modifies the antenna admittance, which in turn affects the signal. A flanged rectangular waveguide covered by a plasma slab was considered by Villeneuve (1965). The plasma is assumed to be a lossless dielectric with a relative permittivity $\varepsilon_r$ smaller than unity (above plasma cutoff). An expression for the self-reaction of the source is obtained by using a variational technique. It is found that the plasma thickness has little effect on the admittance, except for very thin sheaths (in this respect, a plasma layer is quite different from a dielectric slab having $\varepsilon_r > 1$). The more complex situation of a stratified plasma is covered by Galejs (1965, 1969). The radiating rectangular waveguide is connected to another rectangular waveguide of large size containing the plasma layers. An inhomogeneous lossy plasma is considered by Crosswell et al. (1968). The theoretical development derived by Compton (1964) for lossy dielectrics is applied to the plasma situation, and results are compared with experimental values. The aperture field is taken as a combination of TE$_{10}$ and TE$_{30}$ modes: it was found that the TE$_{30}$ mode (excited at the junction) actually has very little effect on the calculated conductance and susceptance. The techniques used to study radiation into plasmas are those developed for lossless or lossy dielectrics (Section IV,F).

D. Diathermy and Hyperthermia
In the diathermy process, some parts of the body are heated for therapeutic purposes, utilizing nonionizing radiation, often in the microwave range (Licht, 1965; Schwan, 1968). Power densities up to several watts per square centimeter are locally applied to routinely treat a large variety of chronic ailments, among them arthritic and rheumatic diseases. The resultant temperature elevation produces increased metabolic activity and dilatation of blood vessels, thereby causing increased blood flow. Healing and defense reactions of the human body are found to be stimulated, yielding beneficial results.
The more recent hyperthermia process was developed for the treatment of cancer. Heat is applied preferentially to the diseased tumoral region, the temperature of which is increased above a critical threshold, around 42°C. This treatment is generally associated with x rays and chemotherapy. Tumors have actually been observed to decrease in size and, in some situations, to vanish completely (Sterzer et al., 1980). Diathermy applicators have usually been radiator antennas. The microwave radiation is concentrated to some extent on the patient, but a large part of the radiated power leaks to the surroundings, providing no useful heating to the process, but creating hazards to the operators. More recent designs are open-ended waveguides, which are placed in tight contact with the part to be heated. The aperture may further be surrounded by chokes to prevent leakage and unnecessary exposure (Guy et al., 1978; Kantor et al., 1978; Kantor and Witters, 1980; Stuchly et al., 1980). Small coaxial probes have also been designed for deep treatments (Swicord and Davis, 1981). The use of a dielectric rod was proposed by Fray et al. (1980). The distribution of the fields and the heating pattern produced by a rectangular aperture in contact with tissues were determined by Guy (1971). The analysis follows the approach used for plasmas by Villeneuve (1965), considering two layers of biological tissue. Another study was published by Audet et al. (1980), determining the reflection coefficient and the near-field configuration. Diathermy and hyperthermia applicators must meet two basic requirements: they must be matched to the waveguide feeding the signal (reflections would mean reduced efficiency) and they must selectively heat the region to be treated. The last requirement is particularly stringent for hyperthermia, where one wishes to thoroughly destroy a tumor, but without damaging the surrounding tissues. It is therefore necessary to know how living organisms respond to microwaves.
Measurement probes were designed to determine the permittivity of tissues (Section IV,F,2). One must also be able to continuously monitor the internal temperature during the treatment. This may be done with thermography (Section IV,E).

E. Thermography
Any physical body above absolute zero emits electromagnetic noise. The equivalent noise temperature, which may be determined by means of sensitive radiometers, is related to the actual physical temperature of the emitting body (Ulaby et al., 1981). Radiometry is also used to detect the microwave radiation of the human body itself. Since microwaves can propagate to some extent through living
tissues, one can detect hot spots deep within the body, produced either by malignant tumors (Barrett et al., 1977; Gautherie et al., 1979) or by rheumatic irritations of joints (Edrich and Smyth, 1977). Infrared thermography, on the other hand, only shows the temperature close to the surface. Thermography is a rather peculiar technique in which the microwave apparatus actually remains passive: the signal itself is produced outside, within the biological material; it flows into the waveguide, and a very sensitive receiver is required to detect it. In order to carry out accurate measurements of temperature, the probe itself must be matched to the body: this requires a careful design and a good knowledge of the microwave properties of living tissues (Section IV,F,2). Flanged open-ended rectangular waveguides are currently used. Monitoring of the temperature during hyperthermia treatments is most important to ensure their success, and thermography is used for this purpose. Combined hyperthermia-thermography equipment has been designed in which a probe is periodically switched between a high-power microwave generator and a sensitive radiometer receiver (Audet et al., 1980; Nguyen et al., 1980; Chive et al., 1982). The temperature information obtained is used to control the output of the hyperthermia generator.

F. Measurement of Materials
To clearly show the interest of a simple, easy to use, nondestructive technique to measure materials, here is a true story: While investigating propagation through loaded waveguides (Gardiol and Parriaux, 1973), a bar of absorbing material was sliced to close tolerances. A large number of measurements were taken; after a few days' work, they were found to be useless, because the material was too lossy. Preliminary measurements with a slab of supposedly the same material had nevertheless given quite promising results. This story may sound familiar to many: lack of uniformity between batches of materials all too often produces great problems in practice. These would easily be avoided if material properties could be determined, even roughly, before starting a complex procedure. The “classical” techniques for the measurement of materials at microwaves always require cutting and machining of samples to be placed in a waveguide or in a cavity (Altschuler, 1963; Appendix): they are obviously not adequate to provide a quick response (in real time, for instance). In contrast, nondestructive measurement techniques make it possible to measure materials which should not be disturbed or damaged: pressure-sensitive compounds, walls of buildings (ancient or new), people. They may be used to continuously monitor a fabrication process. The availability of inexpensive solid-state microwave sources, together with the phenomenal growth of microprocessor-based measuring systems, makes nondestructive techniques an interesting and practical approach to evaluating material properties at microwaves. They provide a means of solving rather difficult problems in a simple fashion. This fact was certainly not overlooked and a large number of publications have appeared in the literature, presenting instruments and theoretical derivations. The measurement of the reflection factor within the open-ended waveguide (Section III) is the experimental part of the process. The final quantity one wishes to know is the permittivity of the material, which is in some way related to its physical characteristics, among them moisture content. The relationship between reflection and permittivity could be determined from measurements, by carrying out a large number of them with accurately known calibrated materials and then interpolating. This is a comparative approach, which is limited by the availability of calibrated materials. Furthermore, measurement errors tend to add up in the process, finally providing rather low accuracy (this could be somewhat improved by repeating the measurements a large number of times and running statistics, but this obviously takes time). It is therefore useful to determine the true relationship by analyzing the electromagnetic fields around the aperture (Sections V and VI).

1. Circular Waveguide
The open-ended circular waveguide with infinite flange excited by the dominant TE$_{11}$ mode was considered by several authors. Mishustin (1965) set up a stationary expression for the input admittance, then evaluated it, assuming the aperture fields to be those of the dominant mode (the introduction of higher-order modes was contemplated, but found to lead to a rather awkward formulation). Bailey and Swift (1968) considered radiation into a dielectric slab, also assuming TE$_{11}$ aperture fields. This study was later extended to absorber panels (Rudduck and Yu, 1974). The complete study, also taking into account the higher-order modes excited at the aperture, is more recent and was carried out with two different techniques: Hankel transforms (Fray et al., 1981; Khandavalli, 1983) and characteristic modes (Gardiol et al., 1983). Circular waveguide apertures have also been utilized for some time in the measurement of magnetized semiconductors (Vernon and Dorschner, 1971; Laurinavicius and Pozela, 1974). The magneto-Kerr and the Faraday effects
produce a rotation of the plane of polarization. Their measurement can in turn be used to determine material properties. The possible effects of refraction due to the waveguide edges have not been evaluated. The radiation from a circular waveguide excited in the TM$_{01}$ mode (fields independent of the azimuthal coordinate $\phi$) was considered rigorously by a point matching technique (Gex-Fabry et al., 1979) and by Hankel transforms (Fray et al., 1982). This mode is not the dominant mode in the circular waveguide, so this excitation may only be used for open-cavity measurements (Section III,D).
2. Coaxial Lines
Cavities made of a section of coaxial line terminated by a short circuit at one end and an opening at the other were utilized to measure moisture in paper sheets (Bosisio et al., 1970), fiberglass content in reinforced plastics (Forssell, 1974), humidity in building walls (Zürcher and Gardiol, 1980), and water content in snow (Aebischer and Maetzler, 1983). A similar resonator, with a capacitive gap coupling at the input, was used by Moschüring and Wolff (1981) to measure thin low-loss slabs. Recently, a marked interest has been observed in measurements of biological tissues in vivo using coaxial impedance probes: questions have arisen concerning differences in the measured permittivity of the same tissue in vivo and in vitro. A large number of publications are devoted to this topic (Burdette et al., 1980; Stuchly and Stuchly, 1980; Brady et al., 1981; Athey et al., 1982; Kraszewski et al., 1982, 1983; Stuchly et al., 1982a,b; Gajda and Stuchly, 1983). Small unflanged probes are generally used: a ground plane is considered unsuited for in vivo measurements, as it impedes the placement of the probe in the tissue (Athey et al., 1982). The effects of radiation and of higher-order modes are neglected, an assumption generally satisfied at sufficiently low frequencies.
The effect of the aperture is represented by a lossy capacitor linearly proportional to the complex permittivity $\underline{\varepsilon}_d$ of the material. On the theoretical side, a static analysis based on the relaxation method was presented by Tanabe and Joines (1976) for a flanged coaxial line. The capacitance found is a logarithmic function of the permittivity $\varepsilon_d$. The approach developed by Galejs (1969) was extended to radiation into dielectrics, considering a single mode (Yepez et al., 1977). A rigorous development, taking all modes into account, was presented by Mosig et al. (1981). Coaxial-line probes have a significant advantage over their waveguide counterparts in their size and their frequency coverage. They also have a significant drawback: possible difficulty in obtaining a good contact to the center conductor.
3. Rectangular Waveguides
The radiation from a flanged rectangular waveguide was treated by Compton (1964), assuming the aperture field to be that of the dominant TE$_{10}$ mode. Admittance values are obtained, both for an infinite lossy medium and a finite lossy slab. A rectangular waveguide covered by a dielectric slab was considered by Crosswell et al. (1967), in terms of a plane-wave expansion of the fields (also with the field of the dominant mode in the aperture). The surface waves on the slab are taken into account. Some of the higher-order modes in the opening, i.e., the TE modes having only one transverse E-field component, were taken into account in an analysis by Decreton and Gardiol (1974) for the infinite medium situation. Later on, the development was extended to cover slabs and cavity measurements as well (Decreton and Ramachandraiah, 1975; Ramachandraiah and Decreton, 1975). Since then, the more complex but quite rigorous characteristic mode technique has been applied to both situations: infinite medium and finite slab (Teodoridis et al., 1983; Teodoridis, 1984). Open-ended rectangular waveguides were also utilized to measure the properties of biological tissues (Hey-Shipton et al., 1982). The waveguide opening covered by a dielectric plate was also considered by Bodnar and Paris (1970), who used a variational principle.
G. About Rigorous Solutions
Some may feel that the search for rigorous solutions is really an academic exercise, a waste of time and effort. After all, who cares for three-figure accuracy on calculations when measurement error may reach 10%? Also, the mathematical techniques required to obtain accurate values are long and complex (Sections V and VI); no actual designer would be expected to use them in practice: they would be much better off with a simple formula, even an approximate one. Many of these remarks are indeed quite true. Still, the fact remains that it is worthwhile, whenever feasible, to look for rigorous solutions, just because they are rigorous. The results obtained are the ultimate ones: no new technique may further improve the accuracy. The search has then ended; there is no point in developing further approximate treatments. A rigorous solution is comparable to a primary standard: it provides a solid reference against which other less sophisticated methods may be checked. Most of the early theoretical treatments of open-ended waveguides neglected higher-order modes. This was done as a matter of course by many authors, who simply took the dominant mode fields as aperture fields, without
any particular comment. Some authors indicate that, actually, higher-order modes should also be taken into account, but that the formulation would then become too involved or awkward (which was true at the time). Still, none of the publications covered in this survey even tried to evaluate the resulting error. This is what a rigorous solution can provide: the evaluation of the effect of higher-order modes on the reflection factor. Other approximate treatments neglect the radiation itself: when is this approximation permissible? Also, what is the effect produced by a metal flange? Still, there is not much interest, for the user, in knowing that some complex mathematical approach does exist, so complicated that he cannot have access to it. What is needed at that point are simple extrapolation formulas, based on exact values, suitable for programming on a pocket calculator. To see what can be done, let’s look at what happened for microstrip lines. In the early 1970s approximate techniques for microstrips bloomed all over the place. A conformal mapping formulation was developed by Schneider (1969) for the quasi-TEM approximation. It provided values in implicit form, not directly usable by the designer. But approximate formulas could be derived from it, to accuracies of 1% or better, which are practically always used nowadays (Schneider, 1969; Hammerstad, 1975; Owens, 1976). This is what one would actually need for open-ended waveguides.
V. THEORETICAL DEVELOPMENT FOR A FLANGED WAVEGUIDE RADIATING INTO AN INFINITE HOMOGENEOUS MEDIUM
A. Description of the Geometry
A metallic waveguide with perfect electric conductor (PEC) walls is terminated in the transverse plane z = 0 by a flat metal flange (also a PEC) extending to infinity in the transverse direction (Fig. 4a). The waveguide is filled with a homogeneous lossless nonmagnetic dielectric of relative permittivity $\varepsilon_c$. The material filling the right-hand half-space ($0 \le z \le \infty$) is assumed to be homogeneous, isotropic, linear, and nonmagnetic, defined completely by its complex relative permittivity $\underline{\varepsilon}_d = \varepsilon'_d - j\varepsilon''_d$. The time dependence of the waveguide signal and of all the radiated fields is a sine wave of frequency $f = \omega/2\pi$. Complex notation is used throughout, the actual fields being obtained in terms of their phasor equivalent by the following relation:
$$X(t) = \mathrm{Re}[\sqrt{2}\,\underline{X}\exp(j\omega t)] \tag{15}$$
Underlining indicates a complex quantity; the time dependence $\exp(j\omega t)$ can then be omitted. Since the waveguide is lossless, it may support both propagating modes (above cutoff), for which the propagation factor is pure imaginary ($\gamma = j\beta$), and evanescent ones (below cutoff), with a real propagation factor $\gamma = \alpha$.
B. Theoretical Development
Only the transverse components of the fields are required to entirely define the problem. Within the waveguide, they are given by
where $U_0$ is the amplitude of the incident wave, $\Gamma_1$ is the reflection factor for the incident mode, $\Gamma_n$ ($n \neq 1$) are the relative amplitudes of the other modes excited at the aperture and reflected into the waveguide, while $\mathbf{E}_n(\mathbf{r}_t)$ is the normalized transverse electric field of mode n, $\gamma_n$ its propagation factor, $Y_n$ its characteristic wave admittance, and $\mathbf{H}_n = \mathbf{e}_z \times \mathbf{E}_n$. The subscript t indicates that a vector is transverse (no component along z). The propagation factor of the mode n is given by
$$\gamma_n = (p_n^2 - \omega^2 \varepsilon_0 \varepsilon_c \mu_0)^{1/2} \tag{18}$$
where $p_n$ is the transverse wave number of mode n. It is assumed here that the incident signal in the waveguide, denoted by the subscript 1, is of a single mode. Since the whole system is linear, the simultaneous incidence of several waveguide modes could be treated by superposition. The incident mode is not necessarily the dominant mode of the waveguide; it must however be propagating (above cutoff). The transverse modal functions $\mathbf{E}_n(\mathbf{r}_t)$ and $\mathbf{H}_n(\mathbf{r}_t)$ are part of complete orthonormal sets of modes, satisfying the normalization condition
$$\int_{S'} \mathbf{E}_m \cdot \mathbf{E}_n \, dS' = \langle \mathbf{E}_m, \mathbf{E}_n \rangle = \delta_{mn} \tag{19}$$
$$\int_{S'} \mathbf{H}_m \cdot \mathbf{H}_n \, dS' = \langle \mathbf{H}_m, \mathbf{H}_n \rangle = \delta_{mn} \tag{20}$$
with
$$\delta_{mn} = \begin{cases} 1 & \text{if } m = n \\ 0 & \text{if } m \neq n \end{cases}$$
The integration is taken over the complete cross section S’ of the waveguide. To simplify notation further on, integrations of this type will be denoted by angle brackets as defined in Eq. (19).
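As a minimal numerical sketch of Eq. (18), the fragment below evaluates the propagation factor and tells propagating modes from evanescent ones. The guide radius, the frequencies, and the helper names `propagation_factor` and `classify` are illustrative assumptions, not taken from the text; the mode used is the TM$_{01}$ mode of an air-filled circular guide, whose transverse wave number is the first zero of $J_0$ divided by the radius.

```python
import cmath
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU0 = 4e-7 * math.pi      # vacuum permeability, H/m (classical value)


def propagation_factor(p_n, freq_hz, eps_c=1.0):
    """Eq. (18): gamma_n = sqrt(p_n^2 - omega^2 eps0 eps_c mu0)."""
    omega = 2.0 * math.pi * freq_hz
    return cmath.sqrt(p_n ** 2 - omega ** 2 * EPS0 * eps_c * MU0)


def classify(gamma):
    """Imaginary gamma: propagating mode; real gamma: evanescent mode."""
    return "propagating" if abs(gamma.imag) > abs(gamma.real) else "evanescent"


# Hypothetical example: air-filled circular guide of radius 10 mm, TM01 mode.
a = 10e-3
p_01 = 2.404825557695773 / a   # first zero of J0 divided by the radius

gamma_low = propagation_factor(p_01, 5e9)     # below the cutoff (about 11.5 GHz)
gamma_high = propagation_factor(p_01, 15e9)   # above the cutoff
print(classify(gamma_low), gamma_low)
print(classify(gamma_high), gamma_high)
```

The principal branch of the complex square root returns a positive real value below cutoff and a positive imaginary value above it, which matches the convention $\gamma = \alpha$ or $\gamma = j\beta$ used here.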
The fields radiated into the material through the open end of the waveguide are determined by means of aperture theory (Harrington, 1961; Lewin, 1951). The opening itself is replaced by a perfectly conducting metal wall, over which circulates an equivalent magnetic surface current $\mathbf{M} = \mathbf{e}_z \times \mathbf{E}_t$, proportional to the actual tangential field within the aperture. The magnetic field radiated into the material is then given by
where $k_d = \omega\sqrt{\varepsilon_0 \underline{\varepsilon}_d \mu_0}$, $\mathbf{r}(x, y, z)$ denotes any observation point within the right-hand half-space, $\mathbf{r}'_t$ is a point within the waveguide aperture (over which the integral is evaluated), and $\nabla$ is the differential operator del, or nabla. In the plane of the aperture, z = 0, the transverse components of the electric and magnetic fields must both be continuous (in the absence of surface electric current). The continuity of the electric field is satisfied just by introducing (16) into (21). The continuity of the magnetic field is then obtained by equating (17) and (21) with z = 0, which yields
(22)
The transverse vector $\mathbf{r}_t$ is located within the opening. The only unknowns in Eq. (22) are the reflection factors $\Gamma_n$; the equation may be expressed in a simplified symbolic form as
$$\sum_{n=1}^{\infty} \Gamma_n \mathbf{A}_n(\mathbf{r}_t) = \mathbf{D}(\mathbf{r}_t) \tag{23}$$
where the two transverse expressions $\mathbf{A}_n(\mathbf{r}_t)$ and $\mathbf{D}(\mathbf{r}_t)$ are obtained by inspection of (22). Solving the equation means determining the values of the $\Gamma_n$; usually one looks mostly for $\Gamma_1$. The mathematical resolution of (22) is rather difficult as
(1) The expression of the integrand is too complex for the integration to be carried out analytically (except maybe in some simple situations). Numerical integration is therefore required.
(2) When the source and the observer happen to coincide, i.e., when $\mathbf{r}_t = \mathbf{r}'_t$, the integrand is singular. The singularity is not an essential one, in the sense that the integral remains defined. Still, adequate precautions must be taken, and the singularity should be extracted before numerical integration is carried out (Mosig and Gardiol, 1982).
(3) The integration is followed by a double differentiation ($\nabla\nabla\cdot$ = grad div). This operation must also be carried out numerically: the differentiation process is quite sensitive to noise, hence even minor rounding errors tend to build up, leading to large instabilities in the computation process. One may introduce the $\nabla\nabla\cdot$ operator under the integration sign and carry out the differentiation before the integration: higher-order poles, more difficult to extract, are obtained in this manner. Integration by parts may be used to eliminate the singularities (Fernez, 1978).
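The sensitivity of numerical double differentiation to rounding errors, mentioned in point (3), can be illustrated on a one-dimensional toy case. The sketch below is only an analogy (a central second difference applied to a sine function, not the $\nabla\nabla\cdot$ operator of the text); the function and step sizes are assumptions chosen for the illustration.

```python
import math


def second_difference(f, x, h):
    """Central-difference estimate of the second derivative f''(x) with step h."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


x0 = 1.0
exact = -math.sin(x0)   # exact second derivative of sin at x0

# Moderate step: the error is dominated by the (small) truncation term.
err_moderate = abs(second_difference(math.sin, x0, 1e-3) - exact)

# Very small step: rounding errors in f are divided by h*h and dominate.
err_tiny = abs(second_difference(math.sin, x0, 1e-8) - exact)

print(err_moderate, err_tiny)
```

Reducing the step does not improve the result indefinitely: below some step size the rounding noise, amplified by the division by the squared step, swamps the derivative itself.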
C. Normalization
It is worthwhile, at this point, to closely examine expression (22) in order to determine whether the variables may be grouped, reducing the complexity, and find out what scaling of sizes, frequencies, and permittivities is feasible. The characteristic mode admittances are given by
where $Y_0 = \sqrt{\varepsilon_0/\mu_0}$ is the plane-wave admittance in vacuum and $f_n$ the cutoff frequency for mode n in the empty waveguide, which is inversely proportional to the linear dimensions of the waveguide cross section. A reduced frequency-size variable is therefore defined as
$$K = \omega d \sqrt{\varepsilon_0 \varepsilon_c \mu_0} = \omega d \sqrt{\varepsilon_c}/c_0 \tag{26}$$
where d is a characteristic dimension of the waveguide cross section. The mode admittances then take the form
$$Y_n = Y_0 \sqrt{\varepsilon_c}\, F_n(K) \tag{27}$$
where the function $F_n(K)$ may be deduced from (24) or (25), once the type of mode and the waveguide shape have been specified. Additional reduced quantities are then defined:
The expression to be solved then takes the form
It is a function of the two reduced variables K and $\eta$. This means that a change of size may be offset by a change in frequency (as long as the cross section retains the same shape and aspect ratio), that a change in permittivity may be compensated by a change of frequency or of size, as long as the permittivity ratio $\eta$ remains the same, and so on. Of particular interest is the fact that the permittivity ratio $\eta$ appears within (33): large values of permittivity may be measured accurately by loading the waveguide with a material having a high permittivity.
D. Green's Function Notation
Equation (22) may be written in a symbolic manner by introducing dyadic Green's functions (Gardiol et al., 1983):
where the external and internal dyadic Green's functions are defined by
A dyadic quantity is obtained through juxtaposition of two vectors (Felsen and Marcuvitz, 1973); the unit dyadic appearing here is the two-dimensional one in the transverse plane; $\mathbf{M} = \mathbf{e}_z \times \mathbf{E}_t$ is the equivalent magnetic surface current. Introducing the value of the transverse electric field $\mathbf{E}_t$ given by (16) into (34) and developing the expression obtained yields (22) or (23). This is an alternative way of expressing the same boundary condition.
E. Point Matching
The simplest approach to solve Eq. (23) is point matching: instead of applying the boundary condition over the complete aperture, i.e., for all values
of $\mathbf{r}_t$, a discrete set of points $\mathbf{r}_j$ is selected at which the expressions $\mathbf{A}_n$ and $\mathbf{D}$ are evaluated, yielding equations
$$\sum_{n=1}^{\infty} \Gamma_n \mathbf{S}_{nj} = \mathbf{T}_j \tag{37}$$
where $\mathbf{S}_{nj} = \mathbf{A}_n(\mathbf{r}_j)$ and $\mathbf{T}_j = \mathbf{D}(\mathbf{r}_j)$. One then must also truncate the summation over the waveguide modes n in order to obtain a self-consistent system of N equations with N unknowns $\Gamma_n$. The values of the $\Gamma_n$ are then obtained by matrix inversion. Obviously, applying the continuity condition for the transverse magnetic field at a finite number of points means that it is not applied elsewhere on the aperture. It is generally felt that, since the field is usually well behaved, taking a sufficient number of points ensures that the field will also be approximately continuous between the points used for matching. By evaluating the magnetic field on both sides of other points within the aperture and taking the difference, it is possible to check whether this is true. The procedure is, however, rather long, so the verification is seldom made. The convergence depends quite critically on the selection of the matching points $\mathbf{r}_j$: for the TM$_{01}$ mode in circular waveguide (Section VI,A), a particular sequence gives an upper bound, another one a lower bound; by taking an intermediate sequence, the number of modes required for convergence was considerably reduced (Gex-Fabry et al., 1979). It is also possible to take a number of equations larger than the number of modes (unknowns), in which case the system of equations is overspecified, and is solved by least-squares methods.
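The mechanics of point matching can be sketched on a scalar toy problem. Everything specific below is an assumption made for illustration: the functions `a_n` and `d_func` are simple stand-ins for $\mathbf{A}_n$ and $\mathbf{D}$ (cosines on a unit segment, not the aperture expressions of Eq. (22)), and the coefficients in `true_gamma` are invented so that the exact answer is known. The square case uses as many points as modes; the overspecified case is solved in the least-squares sense through the normal equations.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0j] * n
    for row in reversed(range(n)):
        acc = aug[row][n] - sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = acc / aug[row][row]
    return x


def a_n(n, t):
    """Hypothetical stand-in for A_n: a cosine on the segment 0 <= t <= 1."""
    return math.cos((n - 1) * math.pi * t)


true_gamma = [0.4 - 0.2j, -0.1 + 0.3j, 0.05 + 0.0j]   # invented test coefficients


def d_func(t):
    """Right-hand side D, built from true_gamma so the exact answer is known."""
    return sum(g * a_n(n, t) for n, g in enumerate(true_gamma, start=1))


def point_match(points, n_modes):
    """Match at the given points; s[j][n-1] plays the role of S_nj, rhs[j] of T_j."""
    s = [[a_n(n, t) for n in range(1, n_modes + 1)] for t in points]
    rhs = [d_func(t) for t in points]
    if len(points) == n_modes:
        return solve_linear(s, rhs)
    # Overspecified system: least squares via the normal equations S^H S x = S^H T.
    rows = range(len(points))
    sh_s = [[sum(s[j][m].conjugate() * s[j][n] for j in rows)
             for n in range(n_modes)] for m in range(n_modes)]
    sh_t = [sum(s[j][m].conjugate() * rhs[j] for j in rows) for m in range(n_modes)]
    return solve_linear(sh_s, sh_t)


square = point_match([(j + 0.5) / 3 for j in range(3)], 3)
overspecified = point_match([(j + 0.5) / 6 for j in range(6)], 3)
print(square)
print(overspecified)
```

In this toy case the right-hand side lies exactly in the span of the three retained functions, so both solutions reproduce the invented coefficients; in the actual problem the truncation error makes the result depend on the choice of matching points, as discussed above.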
F. Moment Method
Rather than matching at some particular points on the aperture, one may impose the continuity of some weighted averages called moments (Harrington, 1968). Equation (23) then yields
$$\sum_{n=1}^{\infty} \Gamma_n \langle \mathbf{W}_j, \mathbf{A}_n \rangle = \langle \mathbf{W}_j, \mathbf{D} \rangle \tag{38}$$
The functions $\mathbf{W}_j$ are weighting functions, for which different choices are possible. If one chooses impulse functions $\delta(\mathbf{r}_j)$ one simply obtains the point matching technique (Section V,E). In the Galerkin method, the weighting
functions are taken equal to the transverse mode functions, $\mathbf{W}_j = \mathbf{H}_j$. This approach takes advantage of the orthogonality properties of the transverse modal functions $\mathbf{H}_n$ [Eq. (20)]. The system of equations (38) is truncated; one considers N unknowns $\Gamma_n$ and N equations (38). The system is then self-consistent and may be solved by matrix inversion.
G. Variational Principle
When Eq. (22) is dot multiplied by the magnetic current in the aperture and integrated over the waveguide cross section, a variational principle is obtained (Galejs, 1969; Bodnar and Paris, 1970). The values obtained for the $\Gamma_n$ are then stationary; i.e., a first-order error in the field values will only produce a second-order error in the reflection factors. One then has
Introducing the development for $\mathbf{M}$ then yields
As this expression is stationary for $\Gamma_n$, it is differentiated with respect to this quantity, and the derivative is set equal to zero, yielding
This provides a set of equations quite similar to those obtained by the Galerkin method, letting here
$$A_{nj} = 2\langle \mathbf{H}_j, \mathbf{A}_n \rangle \tag{44}$$
$$D_j = \langle \mathbf{H}_j, \mathbf{D} \rangle - \langle \mathbf{H}_1, \mathbf{A}_j \rangle \tag{45}$$
It is then also truncated and solved by matrix inversion. This approach is quite powerful; however, it has not been used to its full potential in the resolution of open-ended waveguide problems. Many authors did set stationary expressions (Section IV) and then introduced the fields of the waveguide mode. If the fields of other modes, which are neglected in this process, are indeed of little significance, then the variational principle provides good accuracy. On the other hand, if the effect of other modes is large, the use of a variational principle may not be sufficient to guarantee adequate accuracy. One other thing must be kept in mind: the variational principle provides a better approximation of the eigenvalues, but not of the eigenfunctions. The
values of the fields obtained by this approach may still be rough approximations, even while rather accurate values for the reflection factors are obtained (and in particular for the lowest-order one).
H. Characteristic Modes
In still another approach, the boundary condition equation is developed, in its Green’s function formulation, over the set of characteristic modes of the aperture (Harrington and Mautz, 1971a,b). Equation (34) is then written in the following way:
$$Y_1 \mathbf{H}_1 = \int_{S'} (\bar{\bar{G}} + \bar{\bar{B}}) \cdot \mathbf{M} \, dS' = y_{op}(\mathbf{M}) = g_{op}(\mathbf{M}) + j b_{op}(\mathbf{M}) \tag{46}$$
The term $y_{op}$, which operates on the unknown magnetic current $\mathbf{M}$, is a complex admittance operator, possessing a real part $g_{op}$, the conductance operator, and an imaginary part, the susceptance operator $b_{op}$. The characteristic mode functions $\mathbf{M}_k$ of the aperture are defined by
$$b_{op}(\mathbf{M}_k) = \lambda_k \, g_{op}(\mathbf{M}_k) \tag{47}$$
where $\lambda_k$ is a real constant (eigenvalue) associated with the eigenvector $\mathbf{M}_k$. The eigenvectors are normalized, forming a complete orthonormal set of functions for this problem:
with $P_{kk}$ = 1 W. The unknown magnetic current density $\mathbf{M}$ is then expanded over the set of eigenvectors, yielding after some calculations (Gardiol et al., 1983)
The eigenvalues $\lambda_k$ and eigenvectors $\mathbf{M}_k$ are determined in terms of the transverse-mode functions $\mathbf{H}_n(\mathbf{r}_t)$:
$$\mathbf{M}_k = \sum_{n=1}^{\infty} u_{kn} \mathbf{H}_n \tag{50}$$
Introducing this expression into (47) and integrating over the cross section of the opening, one obtains a set of linear equations for the $u_{kn}$ terms:
$$\sum_{n=1}^{\infty} u_{kn} b_{mn} = \lambda_k \sum_{n=1}^{\infty} u_{kn} g_{mn} \tag{51}$$
where the $g_{mn}$ and $b_{mn}$ terms are, respectively, the real and imaginary parts of the admittance term $y_{mn}$ given by
$$y_{mn} = g_{mn} + j b_{mn} = \langle \mathbf{H}_m, y_{op}(\mathbf{H}_n) \rangle \tag{52}$$
In the above development, it was possible to simplify somewhat the term with differentiations $\nabla\nabla\cdot$ (Dubost and Zisler, 1976). The set of equations (51) is truncated to its first N terms and solved for the $u_{kn}$ terms by matrix inversion. The reflection factor for the incident mode is then given by
I. Variational Principle and Characteristic Modes
The two previous approaches may actually be combined. A variational principle for the input admittance is given by (Gardiol et al., 1983)
$$\frac{Y}{Y_1} = \frac{1 - \Gamma_1}{1 + \Gamma_1} = \frac{U_0 Y_1 \int_S \int_{S'} \mathbf{M} \cdot (\bar{\bar{G}} + \bar{\bar{B}}_{ho}) \cdot \mathbf{M}' \, dS \, dS'}{\left( \int_S \mathbf{M} \cdot Y_1 \mathbf{H}_1 \, dS \right)^2} \tag{54}$$
where $\bar{\bar{B}}_{ho}$ is the part of the internal Green's function (36) that operates on the $n \neq 1$ modes only. The base function used in the determination of characteristic modes is then
$$(1 - \Gamma_1) \int_S \mathbf{M} \cdot Y_1 \mathbf{H}_1 \, dS = \int_S \int_{S'} \mathbf{M} \cdot (\bar{\bar{G}} + \bar{\bar{B}}_{ho}) \cdot \mathbf{M}' \, dS \, dS' \tag{55}$$
A set of eigenvalues $\lambda_k$ and of amplitude factors is then obtained, in terms of which the input admittance is given by
The eigenvectors were also assumed to be normalized to 1 W.
J. Transform Methods
Rather than solving the boundary problem in real space, one may transform the expressions for the fields in the waveguide and in the material, and carry out the resolution in transformed space. For circular apertures, the radiated fields are expressed as a sum of circular cylindrical waves and Hankel transforms are applied, given by
$$U^{(m)}(s) = \int_0^{\infty} \rho J_m(\rho s) u(\rho) \, d\rho \tag{57}$$
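A Hankel transform of the form (57) can be checked numerically on a case with a known closed form. The sketch below is an assumption-laden illustration rather than the procedure of the cited papers: the Bessel function is computed from its integral representation, the infinite range is truncated at a hypothetical `rho_max`, and the test function is a Gaussian, whose order-zero transform $\int_0^\infty \rho J_0(\rho s) e^{-\rho^2/2} d\rho = e^{-s^2/2}$ is a standard result.

```python
import math


def bessel_j(m, x, n_steps=64):
    """J_m(x) for integer m from (1/pi) * int_0^pi cos(m*tau - x*sin(tau)) dtau,
    evaluated with the trapezoidal rule (adequate for moderate |x|)."""
    h = math.pi / n_steps
    total = 0.5 * (1.0 + math.cos(m * math.pi))   # end points tau = 0 and tau = pi
    for k in range(1, n_steps):
        tau = k * h
        total += math.cos(m * tau - x * math.sin(tau))
    return total * h / math.pi


def hankel_transform(u, m, s, rho_max, n_intervals=400):
    """Eq. (57) truncated to 0 <= rho <= rho_max, by composite Simpson's rule
    (n_intervals must be even)."""
    h = rho_max / n_intervals
    total = 0.0
    for k in range(n_intervals + 1):
        rho = k * h
        weight = 1.0 if k in (0, n_intervals) else (4.0 if k % 2 else 2.0)
        total += weight * rho * bessel_j(m, rho * s) * u(rho)
    return total * h / 3.0


def gaussian(rho):
    """Test function with a known order-zero Hankel transform."""
    return math.exp(-0.5 * rho * rho)


value = hankel_transform(gaussian, 0, 1.0, rho_max=8.0)
print(value, math.exp(-0.5))
```

The Gaussian decays fast enough for the truncation to be harmless; the integrands met in the aperture problem are singular and slowly decaying, which is why they call for the particular care mentioned in the text.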
Thus, instead of working with the radial variable $\rho$, one utilizes the transform variable s in the resolution of the problem. Some of the integrations can be carried out analytically. The reflection factors for the different modes excited at the aperture are given by one-dimensional integral expressions. The range of integration is, however, infinite, while the integrands contain infinite series of products of Bessel functions, most of them singular within the range of integration. Particular care is needed to evaluate the integrals (Fray et al., 1981, 1982; Khandavalli, 1983). For rectangular waveguide openings, a technique involving Fourier transforms of the correlation functions of the tangential electric fields in the aperture was developed by MacPhie and Zaghloul (1980). The technique is also quite involved mathematically.
K. Transmission into Oversized Waveguide
The flanged open-ended waveguide problem is considered by some authors as the limiting case of a transition between two waveguides for which a solution is available (Wexler, 1967). The radiation integral of (21) is replaced by a sum over the modes of the second waveguide; as the latter's size is increased and tends toward infinity the sum tends toward an integral (Galejs, 1965; Audet et al., 1980). When the second waveguide is lossless, singularities appear at the cutoff of each waveguide mode. This affects the behavior of the reflection factor in the vicinity of the cutoff frequencies. This effect should obviously disappear at the large size limit: some smoothing technique might be used to reach this end result. This effect would be less marked for a second waveguide containing lossy materials.
L. Comments
This section has presented mathematical techniques of increasing sophistication for the rigorous resolution of flanged open-ended waveguide problems. The first approach, point matching, is the simplest one, involving the
least amount of mathematical development. At the other end of the range, transform techniques carry the analytical treatment quite far, reducing the number of integrals to be evaluated numerically. In the process, however, integrands become increasingly complicated. All expressions, however, have several things in common:
(1) Infinite series, which must be truncated in the computation process. The number of terms required for accurate results must be determined through a study of the convergence. When upper and lower bounds are available, the absolute accuracy of the final result can be evaluated.
(2) Integrands too complicated to allow for analytical integration. Numerical integration techniques are therefore needed.
(3) Singular integrands, requiring particular precautions in the integration process. If the location of integration points is selected without consideration of the pole structure, a nonzero possibility exists for an integration point to fall very near, or even right on, the pole. Any result between $-\infty$ and $+\infty$ may then be obtained. It is therefore imperative to first extract the pole, and integrate only a well-behaved function by numerical methods.
The most difficult part in the whole resolution of the problem is certainly the computation, about which little appears in the publications: which technique should be selected, and what are the differences between the various approaches available? So far, every author has made more or less use of one particular method, so that no comparison between the respective merits and drawbacks of the various approaches is available as yet. It would seem reasonable to start with a simple approach at first, in which successive steps may easily be checked, and then progressively optimize the computing efficiency, i.e., obtain a faster convergence. Increased sophistication in the method used does not necessarily mean improved efficiency.
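The remark in point (1) on upper and lower bounds can be illustrated by a simple analogy, which is not the aperture calculation itself: for an increasing integrand, left and right Riemann sums bracket the exact integral, and their average converges much faster than either bound. The integrand and the number of intervals below are arbitrary choices for the illustration.

```python
import math


def riemann_sums(f, a, b, n):
    """Left and right Riemann sums of f over [a, b] with n equal intervals."""
    h = (b - a) / n
    left = h * sum(f(a + k * h) for k in range(n))
    right = h * sum(f(a + (k + 1) * h) for k in range(n))
    return left, right


exact = math.e - 1.0                          # integral of exp(x) over [0, 1]
left, right = riemann_sums(math.exp, 0.0, 1.0, 8)
average = 0.5 * (left + right)                # intermediate estimate

print(left, exact, right)
print(abs(average - exact), right - left)     # actual error and guaranteed bound
```

The gap between the two bounds gives a guaranteed accuracy, while the intermediate estimate is considerably closer to the exact value than either bound.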
VI. APPLICATION TO PARTICULAR STRUCTURES
A. TM$_{01}$ Mode in Circular Waveguide
In an open-ended circular waveguide, the significant dimension is the guide radius a (thus the reduced variable K becomes, in this case, $f a \sqrt{\varepsilon_c}/c_0$). A simple situation is encountered when the incident signal is in the TM$_{01}$ mode, for which the fields are independent of the azimuthal coordinate $\phi$. It only has one component for the transverse electric field, $E_\rho$, and one for the magnetic field, $H_\phi$. Because of the axial symmetry of the structure, these particularities are retained for the complete field distribution; hence only modes of the TM$_{0n}$
set are excited at the aperture (Gex-Fabry et al., 1979). The transverse-mode function is here given by (Gardiol, 1984)
$$\mathbf{E}_n = \mathbf{e}_\rho \frac{J_1(p_{0n}\rho)}{\sqrt{\pi}\, a\, J_1(p_{0n} a)} \tag{58}$$
with the transverse wave number $p_{0n} = z_{0n}/a$, where $z_{0n}$ is the nth zero of the Bessel function $J_0$. Carrying out the development, one finds the following values for the functions appearing in the boundary condition equation:
where
with
$$u = \sqrt{t^2 + t'^2 - 2tt'\cos\psi} \tag{62}$$
This equation is solved by using the point matching approach (Section V,E): a sequence of N values is selected for $t = \rho/a$, corresponding to concentric rings on the waveguide aperture. The main integration problem is the presence of a singularity at u = 0, i.e., when the observation point coincides with the source point. The most efficient way to deal with it was by extracting the pole in the following way:
where g(&,, $,) = 0. The first term on the right-hand side of (63) is no longer singular and can easily be evaluated numerically. The remaining integral is of a simpler form; it can further be decomposed and the remaining singular part is evaluated analytically. The waveguide opening is divided into a number of segments, and the points are taken either at the ends or in the middle of the segments. In the first case, the values for $\Gamma_1$ converge from the lower side (for modulus and argument), for the second from the upper side (Fig. 11). By taking a distribution with average locations a much faster convergence was obtained.
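The idea of extracting a singularity before integrating numerically can be shown on a one-dimensional analogue. The integrand below, $\cos x/\sqrt{x}$ on the unit interval, is an assumption chosen because its singular part $1/\sqrt{x}$ integrates analytically to 2; it is not the integrand of Eq. (63). The reference value is obtained through the substitution $x = u^2$, which removes the singularity altogether.

```python
import math


def midpoint(f, a, b, n):
    """Composite midpoint rule; never evaluates f at the end points."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def singular(x):
    """Full integrand, with an integrable singularity at x = 0."""
    return math.cos(x) / math.sqrt(x)


def regular(x):
    """What is left once the singular part 1/sqrt(x) has been subtracted."""
    return (math.cos(x) - 1.0) / math.sqrt(x)


n = 2000
naive = midpoint(singular, 0.0, 1.0, n)
# Regular part numerically, singular part analytically: int_0^1 x**-0.5 dx = 2.
extracted = midpoint(regular, 0.0, 1.0, n) + 2.0

# Reference value: with x = u**2 the integral becomes 2 * int_0^1 cos(u**2) du.
reference = midpoint(lambda u: 2.0 * math.cos(u * u), 0.0, 1.0, 20000)

print(abs(naive - reference), abs(extracted - reference))
```

With the same number of integration points, the direct evaluation is in error in the second decimal place, whereas the extracted form agrees with the reference to many more figures.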
[Figure 11: horizontal axis, number of points N.]
FIG. 11. Convergence of the calculated values for the TM$_{01}$ mode in circular waveguide: (a) N segments, internal points, (b) N - 1 segments, points at the ends of segments, and (c) distribution with average location of points.
FIG. 12. Reflection factor as a function of material permittivity for the TM$_{01}$ mode in circular waveguide.
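The normalization of the TM$_{0n}$ mode functions, with a radial dependence $J_1(p_{0n}\rho)/[\sqrt{\pi}\,a\,J_1(p_{0n}a)]$ and $p_{0n}a$ a zero of $J_0$, can be verified numerically against the orthonormality condition (19). The sketch below assumes a unit radius and uses only standard-library helpers written for the occasion (`bessel_j`, `bisect`, `inner_product`); none of these names come from the text.

```python
import math


def bessel_j(m, x, n_steps=64):
    """J_m(x) for integer m from (1/pi) * int_0^pi cos(m*tau - x*sin(tau)) dtau,
    evaluated with the trapezoidal rule (adequate for moderate |x|)."""
    h = math.pi / n_steps
    total = 0.5 * (1.0 + math.cos(m * math.pi))
    for k in range(1, n_steps):
        tau = k * h
        total += math.cos(m * tau - x * math.sin(tau))
    return total * h / math.pi


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming a sign change over the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def mode_function(rho, p, a):
    """Radial dependence of the TM0n mode function: J1(p rho) / (sqrt(pi) a J1(p a))."""
    return bessel_j(1, p * rho) / (math.sqrt(math.pi) * a * bessel_j(1, p * a))


def inner_product(p_m, p_n, a, n_intervals=400):
    """<E_m, E_n> over the circular cross section (Simpson's rule in rho)."""
    h = a / n_intervals
    total = 0.0
    for k in range(n_intervals + 1):
        rho = k * h
        weight = 1.0 if k in (0, n_intervals) else (4.0 if k % 2 else 2.0)
        total += (weight * mode_function(rho, p_m, a) * mode_function(rho, p_n, a)
                  * 2.0 * math.pi * rho)
    return total * h / 3.0


a = 1.0                                             # assumed unit radius
z01 = bisect(lambda x: bessel_j(0, x), 2.0, 3.0)    # first zero of J0
z02 = bisect(lambda x: bessel_j(0, x), 5.0, 6.0)    # second zero of J0
print(z01, z02)
print(inner_product(z01 / a, z01 / a, a), inner_product(z01 / a, z02 / a, a))
```

The first inner product should come out close to one and the second close to zero, as required of an orthonormal set.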
Calculated values for the reflection factor as a function of material properties are given in Fig. 12. Since the higher-order modes excited at the aperture are all TM modes, the equivalent circuit is capacitive (Section II,C). It must be remembered here that the TM$_{01}$ mode is not the dominant mode of the circular waveguide. Its use would therefore be restricted to cavity measurements.
B. Coaxial Line
In a coaxial line (Fig. 13), the dominant mode is transverse electromagnetic (TEM). Its two field components, $E_\rho$ and $H_\phi$, only depend upon the radial variable $\rho$. The whole structure is axially symmetrical, so the complete field
FIG. 13. Open-ended coaxial line.
distribution is also independent of the azimuthal coordinate $\phi$ and has transverse field components $E_\rho$ and $H_\phi$. The modes excited at the aperture are part of the TM$_{0n}$ set and, therefore, the equivalent susceptance of the aperture radiating into a lossless medium is capacitive (Section II,C). The transverse-mode functions in the coaxial line are given by
$$\mathbf{E}_1 = \mathbf{e}_\rho \frac{1}{\rho\sqrt{2\pi \ln(b/a)}} \qquad \text{(TEM mode)} \tag{64}$$
$$\mathbf{E}_n = \mathbf{e}_\rho C_n [J_1(p_n\rho) Y_0(p_n a) - Y_1(p_n\rho) J_0(p_n a)] \qquad (\text{TM}_{0n} \text{ modes}, \; n \neq 1) \tag{65}$$
where $J_m(x)$ is the Bessel function of the first kind and $Y_m(x)$ the Bessel function of the second kind of order m. The transverse wave numbers $p_n$ are solutions of the equation
$$Y_0(p_n a) J_0(p_n b) = J_0(p_n a) Y_0(p_n b) \tag{66}$$
where n is a positive integer; the first solution corresponds to n = 2, and so on. The normalization factor $C_n$ is given by
The reduction of the variables is carried out with respect to the outer diameter b [= d in Eq. (26)]. The ratio b/a then becomes a parameter; in practice, most coaxial lines have a ratio $b/a \simeq 3.59$, which provides the lowest possible attenuation. The mathematical development and the calculations closely follow the ones for the TM$_{01}$ mode in circular waveguide (Section VI,A). Here also the pole in the integrand is extracted, and the convergence is accelerated by a careful selection of radii in the point matching process (Mosig et al., 1981).
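The value of about 3.59 quoted for the ratio b/a can be recovered from the textbook expression for the conductor-loss attenuation of the TEM mode, which at fixed outer conductor size is proportional to $(1 + x)/\ln x$ with $x = b/a$. This proportionality is a standard transmission-line result brought in here as an assumption (it is not derived in the text), and the function names are illustrative. Setting the derivative to zero gives $\ln x = 1 + 1/x$, solved below by bisection.

```python
import math


def relative_attenuation(x):
    """Conductor-loss attenuation of a coaxial line at fixed outer conductor
    size, up to a constant factor, as a function of x = b/a (textbook TEM result)."""
    return (1.0 + x) / math.log(x)


def optimum_ratio(lo=2.0, hi=5.0, tol=1e-10):
    """Root of ln(x) = 1 + 1/x, the stationarity condition of (1 + x)/ln(x)."""
    def g(x):
        return math.log(x) - 1.0 - 1.0 / x

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


ratio = optimum_ratio()
print(ratio)
print(relative_attenuation(3.0), relative_attenuation(ratio), relative_attenuation(4.5))
```

The minimum is rather flat, so ratios somewhat different from the optimum increase the attenuation only slightly.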
FIG. 14. Comparison of reflection factors in coaxial line: (a) this method, (b) method of Tanabe and Joines (1976), and (c) method of Marcuvitz (1950).
FIG.15. Reflection factor in coaxial line as a function of material permittivity.
The reflection factor for a coaxial line with $a = 1.4364$ mm, $b = 4.7250$ mm is presented in Fig. 14 for the homogeneous situation $\varepsilon_c = \varepsilon_d = 2.05$ as a function of frequency. Results obtained by this method are compared to those of Tanabe and Joines (1976) and of Marcuvitz (1950). Since Tanabe and Joines do not consider radiation losses, the modulus of the reflection factor remains at unity for a lossless dielectric. A polar diagram of the reflection factor is presented in Fig. 15 for SR7 coaxial cable ($a = 1.05$ mm, $b = 3.675$ mm, $\varepsilon_c = 2.3$) at 10 GHz, as a function of $\varepsilon_d''$ and $\varepsilon_d'$. The effect of radiation may readily be detected, as the curve for $\varepsilon_d'' = 0$ moves away from the unit circle as $\varepsilon_d'$ increases.
C. TE$_{11}$ Mode in Circular Waveguide
FIG. 16. Open-ended circular waveguide with TE$_{11}$ mode: (-) electric field lines, (---) transverse magnetic field lines.

A more complex situation is encountered with the dominant TE$_{11}$ mode in the circular waveguide. The fields have an azimuthal dependence $m = 1$ and, as a consequence, possess transverse components along $\mathbf{e}_\rho$ and $\mathbf{e}_\phi$ for both the electric and the magnetic field. The $m = 1$ field dependence is retained, by symmetry, for all the modes excited at the aperture: they are either TE$_{1n}$ or TM$_{1n}$ modes. The TE$_{11}$ mode, like all TE$_{1n}$ and TM$_{1n}$ modes, is spatially degenerate, and so other field structures may be obtained by a rotation of the coordinate axes. As far as reflection is concerned, all possible degenerate modes will yield the same result. The functions given here correspond to one particular orientation (Fig. 16). The transverse functions for the modes are given here by (Gardiol, 1984).
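For the TE$_{1n}$ modes (even $n$):

$$\mathbf{E}_n = C_n\left[\mathbf{e}_\rho \frac{1}{\rho} J_1(p_n\rho)\sin\phi + \mathbf{e}_\phi\, p_n J_1'(p_n\rho)\cos\phi\right] \tag{68}$$

where

$$J_1'(p_n a) = 0 \tag{69}$$

$$C_n = \sqrt{2}\left\{\sqrt{\pi}\left[(p_n a)^2 - 1\right]^{1/2} J_1(p_n a)\right\}^{-1} \tag{70}$$

and the prime denotes differentiation with respect to the argument. For the TM$_{1n}$ modes (odd $n$):

$$\mathbf{E}_n = -C_n\left[\mathbf{e}_\rho\, p_n J_1'(p_n\rho)\sin\phi + \mathbf{e}_\phi \frac{1}{\rho} J_1(p_n\rho)\cos\phi\right] \tag{71}$$

where

$$J_1(p_n a) = 0 \tag{72}$$

$$C_n = \sqrt{2}\left[\sqrt{\pi}\, p_n a\, J_1'(p_n a)\right]^{-1} \tag{73}$$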
The cutoff frequencies of TE$_{1n}$ and TM$_{1n}$ modes alternate, since the transverse wave numbers $p_n$ are proportional respectively to extrema and zeros of the Bessel function $J_1$. Odd values of the numbering index $n$ correspond therefore to TE modes, even values to TM modes.

The point-matching approach utilized in the previous sections was also considered for this geometry (Besson et al., 1980). However, serious problems were encountered in the computation process. Here, too, the integrands possess a singularity, which is extracted in the manner indicated in previous sections. Results obtained through numerical integration are then differentiated twice, so numerical errors tend to build up. Very stringent precautions are needed to obtain accurate results, and the computation time increases accordingly. It then becomes worthwhile to consider a more elaborate method, in which the effect of differentiation is less critical. The method of characteristic modes was selected for this purpose, and the values of the fields were introduced into the expression for $y_{nq}$, Eq. (52). Considerable simplifications are possible, finally yielding the following value (Gardiol et al., 1983):
y,,
=
-k,2[(-
+
l)"+"Tlnql T 3
+ [l
- (-
I)""]p,p,T;;
+ +y),,6,,
with integrals of the form

with $l = 0, 1, 2$ and

$$R = \sqrt{\rho^2 + \rho'^2 - 2\rho\rho'\cos\phi}$$
The integrands are singular for $R = 0$: the pole is extracted as in the previous sections and the integration over the remaining well-behaved functions is carried out numerically. A fast convergence was observed in the computation process (Fig. 17). The modulus of the reflection factor exhibits an alternating character, thus providing an immediate evaluation of the convergence. In this case, a good approximation can already be obtained with only five modes. The reflection factor is presented in polar form as a function of reduced permittivity $\varepsilon_d/\varepsilon_c$ in Fig. 18. Due to the presence of both evanescent TE and TM modes, the susceptance for the lossless case ($\varepsilon_d'' = 0$) is inductive for small values of $\varepsilon_d'$, then becomes capacitive for larger values. The results provided by theory were compared with experimental values measured on a slotted line in circular waveguide, and good agreement was found. A polynomial approximation was set up using best-fit methods, providing simple (third-order) relationships for $\varepsilon_d'/\varepsilon_c$ and $\tan\delta$ as functions of VSWR and phase angle (at one frequency).
FIG. 17. Alternating convergence for the TE$_{11}$ mode in circular waveguide at K = 2.17.

FIG. 18. Reflection factor as a function of material permittivity for TE$_{11}$ mode in circular waveguide at K = 2.15.
D. TE$_{10}$ Mode in Rectangular Waveguide

FIG. 19. Open-ended rectangular waveguide.

The dominant mode in the rectangular waveguide (Fig. 19) has a single component for the electric field ($E_y$) and for the magnetic field ($H_x$). At the opening, however, higher-order TE$_{mn}$ and TM$_{mn}$ modes with $m$ odd and $n$ even are excited which (except for $n = 0$) have two-component fields given by the modal equations (Gardiol, 1984)
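TE$_{mn}$ modes:

$$\mathbf{E}^{TE}_{mn} = \frac{2}{p_{mn}\sqrt{ab\,[1+\delta(m)][1+\delta(n)]}}\left(\mathbf{e}_x \frac{n\pi}{b}\cos\frac{m\pi x}{a}\sin\frac{n\pi y}{b} - \mathbf{e}_y \frac{m\pi}{a}\sin\frac{m\pi x}{a}\cos\frac{n\pi y}{b}\right) \tag{77}$$

TM$_{mn}$ modes:

$$\mathbf{E}^{TM}_{mn} = \frac{2}{p_{mn}\sqrt{ab}}\left(\mathbf{e}_x \frac{m\pi}{a}\cos\frac{m\pi x}{a}\sin\frac{n\pi y}{b} + \mathbf{e}_y \frac{n\pi}{b}\sin\frac{m\pi x}{a}\cos\frac{n\pi y}{b}\right) \tag{78}$$

where $m$ and $n$ are both nonnegative integers, with $m + n \neq 0$ for TE modes and $mn \neq 0$ for TM modes, $\delta(0) = 1$ and $\delta(n) = 0$ if $n \neq 0$, while

$$p_{mn} = \sqrt{(m\pi/a)^2 + (n\pi/b)^2} \tag{79}$$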
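As a small numerical illustration of Eq. (79), the sketch below evaluates the transverse wave numbers and the corresponding cutoff frequencies for the X-band cross section quoted in this section (a = 22.86 mm, b = 10.16 mm). It assumes an empty (air-filled) guide, so that the cutoff frequency is $c\,p_{mn}/2\pi$; the function names and the choice of modes are illustrative only.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum (m/s)

def transverse_wavenumber(m, n, a, b):
    """Eq. (79): p_mn = sqrt((m*pi/a)^2 + (n*pi/b)^2), in rad/m."""
    return math.hypot(m * math.pi / a, n * math.pi / b)

def cutoff_frequency(m, n, a, b):
    """Cutoff frequency in Hz, assuming an empty (air-filled) guide."""
    return C0 * transverse_wavenumber(m, n, a, b) / (2.0 * math.pi)

# X-band cross section, in metres
a, b = 22.86e-3, 10.16e-3

# A few modes with m odd and n even, the family excited at the opening
modes = [(1, 0), (1, 2), (3, 0), (3, 2)]
cutoffs = {mn: cutoff_frequency(*mn, a, b) for mn in modes}
```

At 10 GHz only the TE$_{10}$ mode of this family is above cutoff under these assumptions; the others are evanescent in the guide but still contribute to the field at the aperture.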
FIG.20. Convergence of calculated values in rectangular waveguide: (a) with one mode only, (b) with 3 modes, and (c) with 10 modes. Crosses indicate measured values.
FIG. 21. Reflection factor, as a function of material permittivity, for the TE$_{10}$ mode in rectangular waveguide at 10 GHz.
The method of characteristic modes was selected to solve this problem, following the same general procedure as in the previous section (Teodoridis et al., 1983). Here, however, the selection of an adequate sequence among the TE and TM modes is not directly evident: modes were selected according to their relative coupling with the TE$_{10}$ mode in the plane of the opening. Here, too, the integrands possess a singularity, which could however be suppressed by means of a transformation into polar coordinates. The convergence obtained for the results is shown in Fig. 20, which represents calculated curves (solid lines) for 1, 3, and 10 characteristic modes and measured values (crosses). The complex reflection factor is represented in Fig. 21 for an X-band waveguide ($a = 22.86$ mm, $b = 10.16$ mm) with a 10-GHz signal. Here also a third-order polynomial was determined by means of best-fit methods (Teodoridis, 1984).
VII. CONCLUSION

The theoretical analysis of open-ended waveguides can nowadays be carried out rigorously for the usual shapes, taking into account the higher-order modes, for a flanged waveguide radiating into a homogeneous half-space. In some instances it was extended to stratified media and radiation through reduced-size apertures. The computation times required are, however, quite large, even on fast processors. The use of more sophisticated mathematical expressions may provide a somewhat faster convergence and thus reduce the computation time required to reach the desired accuracy. On the other hand, users would like simple relations and, until these become available, will keep using simple approximations. A need exists therefore to extract the most significant information from the tables and graphs published so far and put it in more manageable form.

One must also remember that theoretical developments always yield the reflection factor in terms of material permittivity, while in practice one measures the reflection factor and would like to determine from it the material properties: one thus has to solve the inverse problem. A step in this direction was recently taken by providing polynomial expansions for circular and rectangular waveguides (Gardiol et al., 1983; Teodoridis et al., 1983). At this time, however, values are only available for one particular frequency.

Several problems remain to be treated, among them the effect of finite flanges. One may expect that the study of the fields close to the flange could provide an approximate estimation, allowing one to choose a flange size at which the infinite flange assumption is still valid. On the other hand, a quite different approach would be required for unflanged waveguides: maybe an extension of diffraction techniques to inhomogeneous media, or a transform method. Finally, radiation of an open-ended waveguide into anisotropic media has apparently not yet been considered from a theoretical point of view.
APPENDIX: INFINITE SAMPLE IN WAVEGUIDE
When a material having sufficiently large losses is placed in a waveguide, the fields decay rapidly within the material and reflections from the back end of the sample may be neglected (Fig. 22). One then only has to consider the front end transition, assuming in fact that the material sample is infinite (Altschuler, 1963). The validity of the assumption is easily checked by moving a sliding short circuit at the back of the sample: the reflection measured at the input should not change (Fig. 22b).

FIG. 22. Infinite sample in waveguide: (a) idealized structure and (b) actual measurement setup.
The transition from empty to filled waveguide is a flat surface; the waveguide cross section does not change between the two regions. The boundary conditions at the discontinuity are then satisfied by the fields of a single mode: this is a particular case, in which no higher-order modes are excited at the junction. In metallic waveguides, the dominant mode is a TE mode, for which the characteristic wave impedance is given by
$$Z_0 = j\omega\mu_0/\gamma_0 \qquad \text{(empty guide)} \tag{A1}$$

$$Z_f = j\omega\mu_0/\gamma_f \qquad \text{(filled guide)} \tag{A2}$$

where $\gamma_0$ and $\gamma_f$ are, respectively, the propagation factors in the empty and the filled guide, $Z_0$ and $Z_f$ being similarly the characteristic wave impedances in the two sections. The admittance ratio in the plane of the discontinuity is then given by
where $k_0 = \omega\sqrt{\varepsilon_0\mu_0}$; $p$ is the transverse wave number, which depends upon the waveguide section and the mode considered (for the dominant TE$_{10}$ mode in rectangular waveguide, $p_{10} = \pi/a$); and $\zeta^2 = p^2/k_0^2$. As the mode used to make the measurements must be above cutoff, $k_0 > p$. The equivalent circuit is given by the real and imaginary parts of the relative admittance (A3):
where
The transmission of power into the dielectric is represented by the conductance $G_1$, while the losses produce an inductive susceptance $B_1$ (negative) which vanishes when $\varepsilon_d'' = 0$. The complex permittivity $\varepsilon_d$ may be extracted from (A3), yielding
The admittance ratio is obtained from measurement of the reflection factor $\rho$ in the plane of the discontinuity (Section III,A):

$$\rho = \frac{Y_0 - Y_1}{Y_0 + Y_1} = \frac{1 - s}{1 + s}\exp(2j\theta) \tag{A8}$$

where $s$ is the voltage standing-wave ratio (VSWR) and $2\theta$ the phase angle of the reflection factor. Combining (A7) and (A8) one obtains (Altschuler, 1963; Gardiol, 1973)
This method, developed for lossy materials, may also be extended to measure low-loss liquids, powders, or granular products (Stuchly, 1970). A reflectionless back end is provided by inserting a tapered lossy load within the material measured. When used for solid materials, on the other hand, this method requires a suitably long sample, machined to fit exactly within the waveguide cross section: any gap between the material and the waveguide walls produces a measurement error, difficult to evaluate or to correct. In TEM mode lines, for instance in coaxial line, $\zeta = 0$.
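The extraction of the permittivity from a measured VSWR and phase can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's formula: it assumes the dominant-mode relations $\gamma_0^2 = p^2 - k_0^2$ and $\gamma_f^2 = p^2 - k_0^2\varepsilon_d$, so that the relative admittance $y = \gamma_f/\gamma_0$ satisfies $y^2 = (\varepsilon_d - \zeta^2)/(1 - \zeta^2)$, together with a reflection factor of the form $\rho = [(1 - s)/(1 + s)]\exp(2j\theta)$. The function names and the numerical values are hypothetical.

```python
import cmath
import math

def reflection_from_vswr(s, theta):
    """Reflection factor rho = (1 - s)/(1 + s) * exp(2j*theta)."""
    return (1.0 - s) / (1.0 + s) * cmath.exp(2j * theta)

def permittivity_from_reflection(rho, zeta2):
    """Invert the assumed relation y^2 = (eps - zeta^2)/(1 - zeta^2),
    with relative admittance y = (1 - rho)/(1 + rho)."""
    y = (1.0 - rho) / (1.0 + rho)
    return zeta2 + (1.0 - zeta2) * y * y

def reflection_from_permittivity(eps, zeta2):
    """Forward model used here only to generate a self-consistent test case."""
    y = cmath.sqrt((eps - zeta2) / (1.0 - zeta2))
    return (1.0 - y) / (1.0 + y)

# Illustrative case: zeta = p/k0 = 0.6557 (cutoff near 6.557 GHz, signal at 10 GHz)
zeta2 = 0.6557 ** 2
eps_true = 4.0 - 1.0j          # hypothetical lossy sample, eps' - j*eps''

# Simulate the "measurement": VSWR s and half phase angle theta
rho = reflection_from_permittivity(eps_true, zeta2)
s = (1.0 + abs(rho)) / (1.0 - abs(rho))
theta = 0.5 * (cmath.phase(rho) - math.pi)   # (1 - s)/(1 + s) is negative

# Recover the permittivity from s and theta alone
eps_back = permittivity_from_reflection(reflection_from_vswr(s, theta), zeta2)
```

Setting `zeta2 = 0` gives the TEM (coaxial) case mentioned above, in which the recovered permittivity is simply the square of the relative admittance.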
ACKNOWLEDGMENTS

The author wishes to thank the Swiss National Science Foundation for continuous support over the years under Grants 2.086-0.78, 2.244-0.79, and 2.657-0.80, and all the members of the Laboratory of Electromagnetism and Acoustics who at one time or another took part in the project, and whose names appear in the references.
REFERENCES Aebischer, H., and Maetzler, C. (1983). Proc. Eur. Microwave Con& 13th, 1983 p. 483. Altschuler, H. M. (1963). In “Handbook of Microwave Measurements” (M. Sucher and J. Fox, eds.), 3rd ed., Vol. 11, p. 51 1. Polytechnic Press, Brooklyn, New York. Anderson, R. W., and Dennison, 0.T. (1967). Hewlett Packard J . 18(6), 1. Athey, T. W., Stuchly, M. A., and Stuchly, S. S . (1982). I E E E Trans. Microwave Theory Tech. MTT-30,82. Audet, J., Bolomey, J. C., Pichot, C., Nguyen, D. D. Robillard, M., Chive, M., and Leroy, Y. (1980). J . Microwave Power 15, 177. Bailey, M. C., and Swift, C. T. (1968). I E E E Trans. Antennas Propag. AP-16, 386. Balling, P., and Jacobsen, 1. (1977). Electron. Lett. 13, 719. Barrett, A. H., Myers, P. C., and Sadowski, N. L. (1977). Radio Science 12, 167. Barrow, W. L., and Greene, F. M. (1938). Proc. IRE 26, 1498. Bernier, L.-G., Sphicopoulos, T., and Gardiol, F. E. (1982). Arch. Elektron Ubertragungstech. 36, 479. Besson, J. C. E., Mamane, S., and Gardiol, F. E. (1980). Proc. Eur. Microwave Conf, 10th 1980 p. 317. Bird, T. S. (1979). I E E Microwaves, Opt. Acoust. 3, 172. Bodnar, D. G., and Paris, D. T. (1970), IEEE Trans. Antennas, Propag. AP-18,216. Bosisio, R. G., Giroux, M., and Couderc D. (1970). J . Microwave Power 5,25. Brady, M. M., Symons, S. A., and Stuchly, S. A. (1981). IEEE Trans. Biomed Eng. BME28,305. Bucci, 0.M., Cortucci, G., Franceschetti, G., Savarese, C., and Tiberie R. (1972). IEEE Trans. Instrum. Meas. IM-21,237.
Burdette, E. C., Cain, F. L., and Seals, J. (1980). IEEE Trans. Microwave Theory Tech. MTT-28, 414. Chive, M., Plancot, M., Leroy, Y., Giaux, G., and Prevost, B. (1982).Proc. Eur Microwaue ConJ, 12th, 1982 p. 547. Chu, L. J. (1940), J . Appl. Phys. 11,603. Collin, R. E. (1960). “Field Theory of Guided Waves.” McGraw-Hill, New York. Compton, R. T. (1964). Ph.D. Thesis, Ohio State University, Columbus. Crosswell, W. F., Rudduck, R. C., and Hatcher, D. M. (1967) I E E E Trans. Antennas Propag. AP15, 627. Croswell, W. F., Taylor, W. C., Swift, C. T., and Cockrell C. R. (1968). I E E E Trans. Antennas Propag. AP-16,475. Decreton, M. C. (1975) A G E N Mitt. 19, 57. Decreton, M. C., and Gardiol, F. E. (1974).I E E E Trans. Instrum. Meas. 1M-23,434. Decreton, M. C., and Ramachandraiah, M. S. (1975). I E E E Trans. Microwaue Theory Tech. MTT-23, 1077. de Ronde, F. C. (1965).IEEE Trans. Microwave Theory Tech. MTT-11,435. Deshpande, M. D., and Das, B. N. (1975). Proc. I E E (London) 122,795. Dubost, G., and Zisler, S. (1976).“Antennes a large bande,” Masson, Paris. Edrich, J., and Smyth, C. J. (1977). Proc. Eur. Microwave Cons., 7th., 1977 p. 713. Engen, G. F. (1977). I E E E Trans. Microwave Theory Tech. MTT-25, 1075. Engen, G. F., and Beatty, R. W. (1959). IRE Trans. Microwave Theory Tech. MTT-7,351. Felsen, L. B., and Marcuvitz, N. (1973). “Radiation and Scattering of Waves.” Prentice-Hall, Englewood Cliffs, New Jersey. Fernez, C. (1978). Dr.-Engr. Thesis, University Paris-Sud, Orsay Center, France. Forssell, B. (1974). Proc. Eur. Microwaue Conf., 4th., 1974 p. 132. Fray, C., Khayata, N., and Papiernik, A. (1980). Electron. Lett. 16, 741. Fray, C., Chandrasekhar, K., and Papiernik, A. (1981). Electron. Lett. 17, 718. Fray, C., Khayata, N., and Papiernik, A. (1982). Arch. Elektron. Ubertragungstech. 36, 107. Frood. D. G., and Wait, J. R. (1956). Proc. I E E 103, 103. Gajda, G. B., and Stuchly, S. S. (1983). I E E E Trans. Microwave Theory Tech. MTT-31,380. Galejs, J. 
(1965). IEEE Trans. Antennas Propag. AP-13,64. Galejs, J. (1969). “Antennas in Inhomogeneous Media.” Pergamon, Oxford. Gardiol, F. E. (1973). Microwaves 12(11), 68. Gardiol. F. E. (1978). Bull. Swiss Electr. Assoc. 69, 634. Gardiol, F. E. (1984).“Introduction to Microwaves.” Artech House, Dedham, Massachusetts. Gardiol, F. E., and Parriaux, 0.(1973). IEEE Trans. Microwave Theory Tech. MTT-21,457. Gardiol, F. E., Sphicopoulos, T., and Teodoridis, V. (1983). In “Reviews of Infrared and Millimeter Waves”(K. J. Button, ed.), Vol. I , p. 343. Plenum, New York. Gautherie, M., Edrich, J., Zimmer, R., Guerquin-Kern, J. L., and Robert, J. (1979). J . Microwave Power 14, 123. Gex-Fabry, M., Mosig, J. R., and Gardiol, F. E. (1979). Arch. Elektron. Ubertragungstech. 33,473. Ginzton, E. L. (1957). “Microwave Measurements.” McGraw-Hill, New York. Giordano, A. B. (1963). In “Handbook of Microwave Measurements” (M. Sucher and J. Fox, eds.), 3rd ed., Vol. I, p. 83. Polytechnic Press, Brooklyn, New York. Guy, A. W. (1971). I E E E Trans. Microwave Theory Tech. MTT-19,214. Guy, A. W., Lehmann, J. F., Stonebridge, J. B., and Sorensen, C. C. (1978). I E E E Trans. Microwave Theory Tech. MTT-26, 550. Hammerstad, E. 0.(1975). Proc. Eur. Microwave Conf., 5th.. 1975 p. 268. Harrington, R. F. (1961). “Time-harmonic Electromagnetic Fields.” McGraw-Hill, New York. Harrington, R. F. (1968). “Field Computation by Moment Methods.” Macmillan, New York.
Harrington, R. F., and Mautz, J. R. (1971a).IEEE Trans. Antennas Propag. AP-19,622. Harrington, R. F., and Mautz, J. R. (1971b). IEEE Trans. Antennas Propag. AP-19,629. Hey-Shipton, G. L., Matthews, P. A,, and McStay, J. (1982). Phys. Med. Biol. 27, 1067. Iskander, M. F., and Stuchly, S. S. (1972). IEEE Trans. Instrum. Meas. IM-21,425. James, G . L., and Greene, K. J. (1978). Electron. Lett. 14,90. Jamieson, A. R., and Rozzi, T. E. (1977). Electron. Lett. 13, 744. Kantor, G., and Witters, D. M. (1980). IEEE Trans. Microwave Theory Tech. MTT-28, 1418. Kantor,G., Witters, D. M., and Greiser, J. W. (1978).IEEE Trans. Microwave Theory Tech. MTT26, 563. Khandavalli, C. (1983).Dr.-Engr. Thesis, University of Limoges, France. Kraszewski, A., and Stuchly, S. S. (1983). I E E E Trans. Instrum. Meas. IM-32, 517. Kraszewski, A,, Stuchly, M. A., and Smith, A. M. (1982). Bioelectromagnetics (N. Y.)3,421. Laurinavicius, A,, and Pozela, J. (1974). Phys. Status Solidi A 21,733. Levine, H., and Papas, C. H. (1951). J. Appl. Phys. 22,29. Lewin, L. (1951). “Advanced Theory of Waveguides.” Iliffe, London. Licht, S. (1965).“Therapeutic Heat and Cold,” Waverly Press, Baltimore, Maryland. MacPhie, R. H., and Zaghloul, A. I. (1980). IEEE Trans. Antennas Propag. AP-28,497. Mailloux, R. J. (1969a). IEEE Trans. Antennas Propag. AP-17,49. Mailloux, R. J. (1969b).IEEE Trans. Antennas Propag. AP-17,740. Marcuvitz, N. (1950). “Waveguide Handbook.” McGraw-Hill, New York. Mautz, J. R., and Harrington, R. F. (1976). “Transmission from a Rectangular Waveguide into Half Space through a Rectangular Aperture,” Rep. TR-76-5. Dept. of Electrical and Computer Engineering, Syracuse University, New York. Mautz, J. R., and Harrington, R. F. (1978).IEEE Trans. Microwave Theory Tech. MTT-26,44. Mishustin, B. A. (1965). Sou. Radiophys. 8,1178. Montgomery, C. G . (1947). “Theory of Microwave Measurements.” McGraw-Hill, New York. Moschiiring, H., and Wolff, I. (1981). Proc. Eur. Microwave Conf..,Ilth, 1981 p. 
183. Mosig, J. R., and Gardiol, F. E. (1982). Adu. Electron. Electron Phys. 59, 139. Mosig, J. R., Besson, J. C. E., Gex-Fabry, M., and Gardiol, F. E. (1981). IEEE Trans. Instrum. Meas. IM-30,46. Ney, M., and Gardiol, F. E. (1977). IEEE Trans. Instrum. Meas. IM-26, 10. Nguyen, D. D., Robillard, M., Chive, M., Leroy, Y., Audet, J., Pichot, C., and Bolomey, J. C. (1980). Proc. Eur. Microwave Conf., 10th 1980 p. 232. Owens, R. P. (1976). Radio Electron. Eng. 46, 360. Ramachandraiah, M. S., and Decreton, M. C. (1975). IEEE Trans. Instrum. Meas. IM-24,287. Ramachandraiah, M. S., and Gardiol, F. E. (1972). Microwaue Power Symp., 1972, p. 48. Ramo, S., Whinnery, J. R., and Van Duzer, T. (1965). “Fields and Waves in Communications Electronics.” Wiley, New York. Rhodes, D. R. (1974). “Synthesis of Planar Antenna Sources.” Oxford Univ. Press (Clarendon), London and New York. Risser, J. R. (1949). In “Microwave Antenna Theory and Design” (S. Silver,ed.), p. 334. McGrawHill, New York. Rudduck, R. C., and Yu, C. L. (1974). IEEE Trans. Antennas Propag. AP-22,251. Schelkunoff, S. A. (1936). Bell Syst. Tech. J. 15,92. Schelkunoff, S. A. (1939). Phys. Rev. 56,308. Schneider, M. V. (1969). Bell Syst. Tech. J . 48, 1421. Schwan, H. P. (1968). In “Microwave Power Engineering” (E. C. Okress, ed.), Vol. 2, p. 215. Academic Press, New York. Soga, H. (1973). J . Microwave Power 8,253. Staeger, C., and Kartaschoff, P. (1977). Microwaves 16(4),41.
Sterzer, F., Paglione R., Nowogrodzi, M., Beck, E., Mendecki, J., Friedenthal, E., and Boststein, C. (1980). Microwaue J . 23( l), 39. Stuchly, M. A., and Stuchly, S. S. (1980). I E E E Truns. Instrum. Meas. IM-29, 176. Stuchly, M. A., Stuchly, S. S., and Kantor, G. (1980). IEEE Trans. Microwave Theory Tech. MTT-28,267. Stuchly, M. A., Athey, T. W., Samaras, J. M., and Taylor, G. E. (1982a). IEEE Trans. Microwave Theory Tech. MTT-30.87. Stuchly, M. A., Brady, M. M., Stuchly, S. S., and Gajda, G. B. (1982b). IEEE Trans. Instrum. Meas. IM-31, 116. Stuchly, S . S . (1970).J . Microwaue Power 5, 62. Swicord, M. L., and Davis, C. C. (1981). I E E E Trans. Microwave Theory Tech. MTT-29, 1202. Tanabe, E., and Joines, W. T. (1976). IEEE Trans. Instrum. Meas. IM-25,222. Teodoridis, V. (1984).Sc.D. thesis, Ecole Polytech. Fed. Lausanne, Switzerland. Teodoridis, V., Sphicopoulos. T., and Gardiol, F. E. (1983). Inr U R S I Symp., 1983 p. 573. Ulaby, F. T., Moore, R. K., and Fung, A. K. (1981). “Microwave Remote Sensing-Active and Passive.” Addison-Wesley, Reading, Massachusetts. Vernon, R. J., and Dorschner, T. A. (1971). IEEE Trans. Microwaue Theory Tech. MTT-I9,287. Villeneuve, A. T. (1965).IEEE Trans. Antennas Propag. AP-13, 115. Waldmeyer, J., and Zchokke-Granacher, 1. (1975).J. Phys. D 8, 1513. Weinstein, L. A. (1969a). “Open Resonators and Open Waveguides.” Golem Press, Boulder, Colorado. Weinstein, L. A. (1969b). “The Theory of DiRraction and the Factorization Method.” Golem Press, Boulder, Colorado. Wexler, A. (1967). IEEE Trans. Microwaue Theory Tech. MTT-15, 508. Wohlleben, R., Mattes, H., and Lochner, 0.(1972).Electron. Lett. 8,474 Yaghjian, A. D. (1980). Proc. IEEE 68,248. Yepez, H. M., Decotignie, J. D., and Gardiol, F. E. (1977). Proc. Int. Symp. Microwave Diagn. Semicond, I977 p. 263. Ziircher, J. F., and Gardiol, F. E. (1980). Meus. Prog. Sci. Technol., Proc. I M E K O Congr. Int. Meas. Confed., 8th. 1979 p. 393. Zurcher, J. F., Borgeaud, M., and Gardiol, F. E. 
(1983).Mikrowellen Mag. 9(2), 168.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 63

Discrete Mathematical Physics and Particle Modeling

DONALD GREENSPAN
Department of Mathematics
The University of Texas at Arlington
Arlington, Texas
I. Introduction
II. Newtonian Mechanics
   A. Gravity
   B. Particle Modeling of Solids, Liquids, and Gases
   C. Generalizations
   D. Conservative Modeling: Laminar and Turbulent Fluid Flow, Heat Conduction, Elastic Vibration
   E. Nonconservative Modeling: Heat Convection, Shock-Wave Generation, the Liquid Drop Problem, Porous Flow, Interface Motion of a Melting Solid, Soap Films, String Vibrations, and Solitons
   F. Models with Self-Reorganization: Celestial Phenomena, Biological Cell Sorting
   G. Quantitative Modeling
III. Special Relativistic Mechanics
   A. Theory in One Space Dimension
   B. Relativistic Harmonic Oscillation
   C. Theory in Three Space Dimensions
IV. Quantum Mechanics: A Speculative Model of Vibrations in the Water Molecule
V. Concluding Remarks
References
I. INTRODUCTION

It is somewhat startling that the foundations of both Newtonian and special relativistic physics can be reformulated by using only arithmetic, with the very same conservation laws and symmetry following as in continuum physics. Moreover, not only does the arithmetic approach simplify the tools necessary for such theoretical considerations, but, from it, new types of models of natural phenomena follow readily and in a reasonable way. In this chapter we will explore both the theory and the application of the arithmetic approach, which is motivated by, and implemented through, modern high-speed digital computer technology.

Copyright © 1985 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014663-0
II. NEWTONIAN MECHANICS
Though superseded by relativistic mechanics for the study of cosmological and electromagnetic phenomena, and by quantum mechanics for the study of atomic and molecular phenomena, Newtonian mechanics still retains exceptional value. Its formalism enables one to model accurately and to solve the resulting dynamical equations for a large and practical spectrum of phenomena between those of the microcosm and the macrocosm. It is then to Newtonian mechanics that we direct attention first.

A. Gravity

Since we wish to reformulate Newtonian mechanics using only arithmetic, and since it is always difficult to know how to begin correctly, let us first develop some intuition by studying the following simple experiment with a force with which we are all familiar, namely, gravity. If a particle of mass $m$, situated $h$ feet above ground, is dropped from a position of rest, one can approximate its height $x$ above ground every $\Delta t$ seconds as it falls. For example, if $h = 400$ and if one has a camera whose shutter time is $\Delta t$, then one can take pictures of the fall at the times $t_k = k\,\Delta t$, $k = 0, 1, 2, \ldots$, and, from the photographs and the knowledge that $h = 400$, approximate the heights $x(t_k) = x_k$ above ground by simple ratio and proportion. Suppose, then, that this has been done for $\Delta t = 1.0$, that is, for a very slow camera, and that, to the nearest foot, one finds $x_0 = 400$, $x_1 = 384$, $x_2 = 336$, $x_3 = 256$, $x_4 = 144$, $x_5 = 0$. These data are recorded in column A of Table I. Since it is always convenient mathematically to know how far a particle has traveled from its initial position, we first rewrite our data as

$$x_0 = 400 - 0, \quad x_1 = 400 - 16, \quad x_2 = 400 - 64,$$
$$x_3 = 400 - 144, \quad x_4 = 400 - 256, \quad x_5 = 400 - 400$$

in which each term preceded by a negative sign is the distance traveled in time $t_k$. Each of these terms, however, is seen readily to have a factor of 16, so that we may rewrite our data next as

$$x_0 = 400 - 16(0), \quad x_1 = 400 - 16(1), \quad x_2 = 400 - 16(4),$$
$$x_3 = 400 - 16(9), \quad x_4 = 400 - 16(16), \quad x_5 = 400 - 16(25)$$

But each term in parentheses is a perfect square, so we now have

$$x_0 = 400 - 16(0)^2, \quad x_1 = 400 - 16(1)^2, \quad x_2 = 400 - 16(2)^2,$$
$$x_3 = 400 - 16(3)^2, \quad x_4 = 400 - 16(4)^2, \quad x_5 = 400 - 16(5)^2$$
DISCRETE MATHEMATICAL PHYSICS TABLE I VELOCITY AND ACCELERATION CALCULATIONS
Time
A Measured height
t,=O
x,=400
v,=o
1 t,=2 t,= 3
X, =
384 x 2 = 336 x j = 256
U, =
t4=4 t,=5
x 4 = 144 x,=o
ti
=
B Velocity by calculus -32
C Acceleration by calculus a,= -32 a, = -32
U, =
-32
-32
U, =
-64
u3
-96
u2 = -64
a2 =
-96 u 4 = -128 u5 = -160
a3 = -32 a 4 = -32
~3
=
a5 =
-32
Finally, since At = 1, we note that t o = 0, t , = 1, t , which implies xo = 400 - 16(to),, x 1 = 400 - 16(t1),, x3 = 400 - 16(t3),,
x4 = 400 - 16(t4),,
E Acceleration by arithmetic
D Velocity by arithmetic
-32 -32 a, = -32 a,= -32 a 4 = -32
v,=O
=
a, =
a,
u 4 = -128 v5 = -160
= 2, t3 =
=
3, t4 = 4, t 5 = 5,
x2 = 400
-
16(t2),
x5 = 400 - 16(t5),
or, more concisely,

$$x_k = 400 - 16(t_k)^2, \qquad k = 0, 1, 2, 3, 4, 5 \tag{1}$$

Our problem next is how to proceed with (1). From the continuum point of view, one interpolates and extrapolates to yield the classical formula

$$x = 400 - 16t^2 \tag{2}$$

which is now amenable to the full power of the calculus. In this fashion one has, almost immediately,

$$v = -32t \tag{3}$$

$$a = -32 \tag{4}$$

Using (3) and (4), we have recorded in column B of Table I the particle's velocities, and in C the accelerations, at the times $t_0, t_1, t_2, t_3, t_4, t_5$ during its fall.

If our thinking, however, had been fashioned in a computer-dominated environment, then, because the real number system is not the number system of any computer, we might have chosen to store (1) in a memory bank and then tried to develop simple algebraic formulas for velocity and acceleration. We will show now how this can be done in such a fashion that the results are identical with those obtained from (3) and (4). At the times that the pictures were taken, let the velocity of $P$ be denoted by $v_k = v(t_k)$, $k = 0, 1, 2, 3, 4, 5$. Since $P$ was dropped from a position of rest, let

$$v_0 = 0 \tag{5}$$
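The continuum side of the comparison is easy to reproduce. The short sketch below checks that formula (1) fits the measured heights exactly and tabulates columns B and C of Table I from (3) and (4); the variable and function names are illustrative only.

```python
# Column A of Table I: heights (feet) photographed every dt = 1 second.
times = [0, 1, 2, 3, 4, 5]
heights = [400, 384, 336, 256, 144, 0]

def height(t):
    """Formulas (1) and (2): x = 400 - 16 t^2."""
    return 400 - 16 * t ** 2

def velocity(t):
    """Formula (3): v = -32 t."""
    return -32 * t

def acceleration(t):
    """Formula (4): a = -32, independent of t."""
    return -32

# Does the closed-form expression reproduce every measured height?
fits_data = all(height(t) == x for t, x in zip(times, heights))

# Columns B and C of Table I
column_b = [velocity(t) for t in times]
column_c = [acceleration(t) for t in times]
```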
For $k > 0$, let $v_k$ be defined as an average rate of change of position with respect to time by

$$\tfrac{1}{2}(v_{k+1} + v_k) = (x_{k+1} - x_k)/\Delta t, \qquad k = 0, 1, 2, 3, 4 \tag{6}$$

The left-hand side of (6) is, of course, a smoothing operator, which is perfectly reasonable when dealing with experimental data. However, (6) is not as convenient from the computer viewpoint as its equivalent form

$$v_{k+1} = -v_k + (2/\Delta t)(x_{k+1} - x_k), \qquad k = 0, 1, 2, 3, 4 \tag{6'}$$

which is a recursion formula. Substitution of $k = 0, 1, 2, 3, 4$ into (6') yields, in order, $v_1 = -32$, $v_2 = -64$, $v_3 = -96$, $v_4 = -128$, $v_5 = -160$, which are recorded in column D of Table I and are identical to the entries in column B.

Next, from a deterministic viewpoint, one would know a particle's initial position and velocity, but not its initial acceleration. The acceleration is intimately related to the force, which is at present under study. Thus, $a_0$ is not known and must be generated by some formula. If $a(t_k) = a_k$, we assume simply that

$$a_k = (v_{k+1} - v_k)/\Delta t, \qquad k = 0, 1, 2, 3, 4 \tag{7}$$

From (7) and the values of $v$ just found, we have $a_0 = a_1 = a_2 = a_3 = a_4 = -32$, which are recorded in column E of Table I and are identical with the corresponding entries in column C. Formula (7) does not allow a determination of $a_5$, because this would require knowing $v_6$. Nevertheless, the entries indicate quite clearly that the acceleration due to gravity is constant with the value $-32$.

Now, just because the arithmetic formulas (5)-(7) have given the same results as the continuous formulas (3) and (4) does not mean that we have a formulation which has physical significance, since physics is characterized by conservation laws and symmetry. Surprisingly enough, our approach to gravity will also yield conservation and symmetry (Greenspan, 1973a, 1980a). We will, however, confine attention now only to the conservation of energy. Recall first the fundamental Newtonian dynamical equation
$$F = ma \tag{8}$$

the kinetic energy formula

$$K = \tfrac{1}{2}mv^2 \tag{9}$$

and, for a falling body with $a = -32$, the potential-energy formula

$$V = 32mx \tag{10}$$

The classical energy conservation law is then simply

$$K + V = K_0 + V_0, \qquad t > 0 \tag{11}$$
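However, the data in column A of Table I were obtained from a sequential set of photographs, that is, at distinct times $t_k = k\,\Delta t$, and so in place of (8)-(10) we can only assume

$$F_k = ma_k, \qquad k = 0, 1, 2, \ldots \tag{12}$$

$$K_k = \tfrac{1}{2}mv_k^2, \qquad k = 0, 1, 2, \ldots \tag{13}$$

$$V_k = 32mx_k, \qquad k = 0, 1, 2, \ldots \tag{14}$$

Next, define work $W_n$, $n = 1, 2, 3, \ldots$, by

$$W_n = \sum_{i=0}^{n-1} (x_{i+1} - x_i)F_i \tag{15}$$

Then, by (6), (7), and (12),

$$W_n = m\sum_{i=0}^{n-1} \frac{x_{i+1} - x_i}{\Delta t}(v_{i+1} - v_i) = \frac{m}{2}\sum_{i=0}^{n-1}(v_{i+1} + v_i)(v_{i+1} - v_i) = \tfrac{1}{2}mv_n^2 - \tfrac{1}{2}mv_0^2$$

and so

$$W_n = K_n - K_0, \qquad n = 1, 2, 3, \ldots \tag{16}$$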
which, incidentally, is independent of the structure of $F$ and is a fundamental result in continuum mechanics. On the other hand, since $a_k = -32$, one has from (12) and (15) that

$$W_n = -32m\sum_{i=0}^{n-1}(x_{i+1} - x_i) = -32mx_n + 32mx_0$$

so that from (14),

$$W_n = -V_n + V_0, \qquad n = 1, 2, 3, \ldots \tag{17}$$

Finally, elimination of $W_n$ between (16) and (17) yields

$$K_n + V_n = K_0 + V_0, \qquad n = 1, 2, 3, \ldots \tag{18}$$
in complete analogy with (11). Moreover, since $K_0$ and $V_0$ are determined entirely from the initial data $x_0$ and $v_0$, it follows that $K_0$ and $V_0$ are the same in both (11) and (18); thus our arithmetic approach conserves exactly the same total energy, independently of $\Delta t$, as does classical Newtonian theory.
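The recursion (6'), the acceleration formula (7), and the discrete energy law (18) can be checked in a few lines. The sketch below is illustrative only: Table I is not reproduced here, so the positions $x_k = -16k^2$, the time step $\Delta t = 1$, the initial velocity $v_0 = 0$, and the unit mass are assumptions chosen to be consistent with the velocities quoted above.

```python
# Sketch of recursion (6'), formula (7), and the energy check (18).
# ASSUMPTIONS (not taken from Table I): x_k = -16 k^2, dt = 1, v_0 = 0, m = 1.
dt = 1.0
m = 1.0
x = [-16.0 * k**2 for k in range(6)]      # assumed stand-in for column A

v = [0.0]                                 # assumed initial velocity v_0
for k in range(5):
    # recursion (6'): v_{k+1} = -v_k + (2/dt)(x_{k+1} - x_k)
    v.append(-v[k] + (2.0 / dt) * (x[k + 1] - x[k]))

# formula (7): a_k = (v_{k+1} - v_k)/dt
a = [(v[k + 1] - v[k]) / dt for k in range(5)]

K = [0.5 * m * vk**2 for vk in v]         # kinetic energy (13)
V = [32.0 * m * xk for xk in x]           # potential energy (14)
total = [Kk + Vk for Kk, Vk in zip(K, V)] # K_k + V_k, constant by (18)

print(v)
print(a)
print(total)
```

Under these assumptions every entry of `total` equals $K_0 + V_0$, which is the content of (18).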
B. Particle Modeling of Solids, Liquids, and Gases
Before extending the ideas of Section II,A to forces which are more complex than gravity, we must explain first our general approach to modeling. We begin from the fundamental physical principle (Feynman et al., 1963) that the gross dynamical behavior of any solid, liquid, or gas is determined from the dynamical behavior of its constituent atoms and molecules. In applying this principle, we will assume, without loss of generality, that the fundamental constituents are molecules.

Classically, the forces that act on molecules are of two types, the long range and the short range. Long-range forces, like gravity and gravitation, are those which act uniformly on every molecule. Short-range forces are those which act only between any molecule and each of its immediate neighbors. This local interaction is of the following general nature (Hirschfelder et al., 1954). If two molecules are pushed together they repel each other, if pulled apart they attract each other, and mutual repulsion is of a greater order of magnitude than is mutual attraction. Mathematically, this behavior is often formulated as follows. The magnitude $F$ of the force $\mathbf{F}$ between two molecules which are locally $r$ units apart is of the form

$$F = -\frac{\alpha}{r^p} + \frac{\beta}{r^q} \tag{19}$$

where, typically, $\alpha > 0$, $\beta > 0$, $q > p \ge 7$.
The major problem in the simulation of any physical body is that there are too many component molecules to incorporate into the model. The classical mathematical approach, then, is to replace the large, but finite, number of molecules by an infinite set of points. In so doing, the rich physics of molecular interaction is lost. A viable computer alternative is to replace the large number of molecules by a much smaller number of particles, continue the long-range forces without change, but readjust the parameters in (19) to compensate. It is this latter approach which we will follow.

C. Generalizations
In this section we will consider forces which are more complex than gravity and which will be fundamental in our approach to dynamical modeling. First all the general definitions and results will be summarized. Then, for clarity and simplicity, we will prove the conservation laws for the three-body problem. The proofs given for this fundamental problem extend, but with algebraic complexity, to all other cases (Greenspan, 1973b, 1974c, 1981e).
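For $\Delta t > 0$, let $t_k = k\Delta t$, $k = 0, 1, 2, \ldots$, and consider a system of particles $P_i$, $i = 1, 2, 3, \ldots, n$. Let $P_i$ have mass $m_i$, and, at time $t_k$, be located at $\mathbf{r}_{i,k}$, with velocity $\mathbf{v}_{i,k}$ and acceleration $\mathbf{a}_{i,k}$. In analogy with (6) and (7), assume that

$$\tfrac{1}{2}(\mathbf{v}_{i,k+1} + \mathbf{v}_{i,k}) = (\mathbf{r}_{i,k+1} - \mathbf{r}_{i,k})/\Delta t \tag{20}$$

$$\mathbf{a}_{i,k} = (\mathbf{v}_{i,k+1} - \mathbf{v}_{i,k})/\Delta t \tag{21}$$

If $\mathbf{F}_{i,k}$ is the force acting on $P_i$ at time $t_k$, then force and acceleration are assumed to be related by

$$\mathbf{F}_{i,k} = m_i\mathbf{a}_{i,k} \tag{22}$$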
If $\mathbf{F}_{i,k}$ is a central, "$1/r^2$" force, like gravitation or Coulombic interaction, then the arithmetic, conservative force formula (Greenspan, 1972b, 1980a) is

$$\mathbf{F}_{i,k} = \frac{C(\mathbf{r}_{i,k+1} + \mathbf{r}_{i,k})}{r_{i,k}\,r_{i,k+1}\,(r_{i,k} + r_{i,k+1})}$$
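More generally (Greenspan, 1973b), if $P_i$ interacts with the other $n - 1$ particles and the force is attractive like $1/r^p$ and repulsive like $1/r^q$, then the arithmetic, conservative force on $P_i$ is given by

$$\mathbf{F}_{i,k} = m_i\sum_{j \ne i} m_j\left[-\frac{G\sum_{\xi=0}^{p-2} r_{ij,k}^{\xi}\,r_{ij,k+1}^{p-\xi-2}}{r_{ij,k}^{p-1}\,r_{ij,k+1}^{p-1}\,(r_{ij,k+1} + r_{ij,k})} + \frac{H\sum_{\xi=0}^{q-2} r_{ij,k}^{\xi}\,r_{ij,k+1}^{q-\xi-2}}{r_{ij,k}^{q-1}\,r_{ij,k+1}^{q-1}\,(r_{ij,k+1} + r_{ij,k})}\right](\mathbf{r}_{i,k+1} + \mathbf{r}_{i,k} - \mathbf{r}_{j,k+1} - \mathbf{r}_{j,k}) \tag{23}$$

where $G \ge 0$, $H \ge 0$, $q > p \ge 2$, and $r_{ij,k}$ is the distance between $P_i$ and $P_j$ at time $t_k$. Note that (23) is of such generality that the choices $p = 2$ and $H = 0$ yield, as a special case, collisionless gravitational interaction.

Note also that, with regard to the motion of a single particle, arithmetic conservative formulas are special cases of the following general formula (LaBudde and Greenspan, 1974). For any Newtonian potential $\phi(r)$, let

$$\mathbf{F}_k = -\frac{\phi(r_{k+1}) - \phi(r_k)}{r_{k+1} - r_k}\cdot\frac{\mathbf{r}_{k+1} + \mathbf{r}_k}{r_{k+1} + r_k} \tag{24}$$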
Then arithmetic formula (24) conserves exactly the same energy, linear momentum, and angular momentum as does its continuous, limiting counterpart

$$\mathbf{F} = -\frac{\partial\phi}{\partial r}\,\frac{\mathbf{r}}{r}$$
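A short numerical sketch may help fix the idea of a potential-based arithmetic force. It assumes that (24) has the difference-quotient form $\mathbf{F}_k = -[\phi(r_{k+1}) - \phi(r_k)]/(r_{k+1} - r_k)\cdot(\mathbf{r}_{k+1} + \mathbf{r}_k)/(r_{k+1} + r_k)$; the function name, the constant `C`, and the sample positions are hypothetical. For the assumed potential $\phi(r) = C/r$ the result is compared with the "$1/r^2$" formula given earlier in this section, and the work done over the step is compared with the drop in potential.

```python
import math

def arithmetic_force(phi, r_old, r_new):
    """Discrete force over one step, assuming the difference-quotient form of (24).

    phi   : potential as a function of the distance r from the origin
    r_old : position (x, y) at t_k
    r_new : position (x, y) at t_{k+1}
    """
    d_old = math.hypot(r_old[0], r_old[1])
    d_new = math.hypot(r_new[0], r_new[1])
    if d_new == d_old:
        raise ValueError("difference quotient undefined when the distance is unchanged")
    slope = (phi(d_new) - phi(d_old)) / (d_new - d_old)
    scale = -slope / (d_new + d_old)
    return (scale * (r_new[0] + r_old[0]), scale * (r_new[1] + r_old[1]))

C = -1.0                       # hypothetical constant; C < 0 gives attraction

def phi(r):
    return C / r               # assumed potential for a "1/r^2" force

r_old, r_new = (3.0, 4.0), (6.0, 8.0)    # hypothetical positions, distances 5 and 10
F = arithmetic_force(phi, r_old, r_new)

# Closed form C (r_{k+1} + r_k) / [r_k r_{k+1} (r_k + r_{k+1})] for comparison
d_old, d_new = 5.0, 10.0
closed = tuple(C * (r_new[i] + r_old[i]) / (d_old * d_new * (d_old + d_new))
               for i in range(2))

# Work done over the step should equal the drop in potential energy
work = F[0] * (r_new[0] - r_old[0]) + F[1] * (r_new[1] - r_old[1])
print(F, closed, work, -(phi(d_new) - phi(d_old)))
```

The equality of `work` and the potential drop is the single-step version of the energy argument used for gravity in Section II,A.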
Finally, observe that only for simple forces, like gravity, do the continuous and discrete approaches yield exactly the same dynamical behavior. In general (LaBudde and Greenspan, 1976a,b), the two approaches yield results which differ by terms of order $(\Delta t)^3$ in both position and velocity.

Let us now show in detailed fashion how the conservation laws are established. The essential details for all cases of interest follow from the specific details for the planar three-body problem, which will be given next. In this problem, three particles $P_1$, $P_2$, $P_3$, of masses $m_1$, $m_2$, $m_3$, respectively, interact simultaneously with one another in a collisionless fashion under the force of gravitation. For $\Delta t > 0$ and $t_k = k\Delta t$, $k = 0, 1, 2, \ldots$, and for each of $i = 1, 2, 3$, let particle $P_i$ of mass $m_i$ be located at $\mathbf{r}_{i,k} = (x_{i,k}, y_{i,k})$, have velocity $\mathbf{v}_{i,k} = (v_{i,k,x}, v_{i,k,y})$, and acceleration $\mathbf{a}_{i,k} = (a_{i,k,x}, a_{i,k,y})$ at time $t_k$. In analogy with (6) and (7), let

$$\tfrac{1}{2}(\mathbf{v}_{i,k+1} + \mathbf{v}_{i,k}) = (\mathbf{r}_{i,k+1} - \mathbf{r}_{i,k})/\Delta t, \qquad i = 1, 2, 3, \quad k = 0, 1, 2, \ldots \tag{25}$$

$$\mathbf{a}_{i,k} = (\mathbf{v}_{i,k+1} - \mathbf{v}_{i,k})/\Delta t, \qquad i = 1, 2, 3, \quad k = 0, 1, 2, \ldots \tag{26}$$

To relate force and acceleration, we assume a discrete Newtonian equation

$$\mathbf{F}_{i,k} = m_i\mathbf{a}_{i,k}, \qquad i = 1, 2, 3, \quad k = 0, 1, 2, \ldots \tag{27}$$

where

$$\mathbf{F}_{i,k} = (F_{i,k,x}, F_{i,k,y}) \tag{28}$$
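This time, the work $W_n$ is defined by

$$W_n = \sum_{i=1}^{3} W_{i,n} \tag{29}$$

where

$$W_{i,n} = \sum_{k=0}^{n-1}\left[(x_{i,k+1} - x_{i,k})F_{i,k,x} + (y_{i,k+1} - y_{i,k})F_{i,k,y}\right]$$

The derivation that yielded (16) implies now

$$W_{i,n} = \tfrac{1}{2}m_i(v_{i,n,x}^2 + v_{i,n,y}^2) - \tfrac{1}{2}m_i(v_{i,0,x}^2 + v_{i,0,y}^2)$$

so that if the kinetic energy $K_{i,k}$ of $P_i$ at $t_k$ is defined by

$$K_{i,k} = \tfrac{1}{2}m_i(v_{i,k,x}^2 + v_{i,k,y}^2)$$

then

$$W_{i,n} = K_{i,n} - K_{i,0}$$

Defining the kinetic energy $K_k$ of the system at time $t_k$ by

$$K_k = \sum_{i=1}^{3} K_{i,k} \tag{33}$$

yields the classical result

$$W_n = K_n - K_0 \tag{34}$$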
Next, the precise structure of the force components of (28) is given as follows. If $r_{ij,k}$ is the distance between $P_i$ and $P_j$ at time $t_k$, then set

$$F_{1,k,x} = -\frac{Gm_1m_2[(x_{1,k+1} + x_{1,k}) - (x_{2,k+1} + x_{2,k})]}{r_{12,k}\,r_{12,k+1}\,(r_{12,k} + r_{12,k+1})} - \frac{Gm_1m_3[(x_{1,k+1} + x_{1,k}) - (x_{3,k+1} + x_{3,k})]}{r_{13,k}\,r_{13,k+1}\,(r_{13,k} + r_{13,k+1})} \tag{35}$$

$$F_{2,k,x} = -\frac{Gm_1m_2[(x_{2,k+1} + x_{2,k}) - (x_{1,k+1} + x_{1,k})]}{r_{12,k}\,r_{12,k+1}\,(r_{12,k} + r_{12,k+1})} - \frac{Gm_2m_3[(x_{2,k+1} + x_{2,k}) - (x_{3,k+1} + x_{3,k})]}{r_{23,k}\,r_{23,k+1}\,(r_{23,k} + r_{23,k+1})} \tag{36}$$

$$F_{3,k,x} = -\frac{Gm_1m_3[(x_{3,k+1} + x_{3,k}) - (x_{1,k+1} + x_{1,k})]}{r_{13,k}\,r_{13,k+1}\,(r_{13,k} + r_{13,k+1})} - \frac{Gm_2m_3[(x_{3,k+1} + x_{3,k}) - (x_{2,k+1} + x_{2,k})]}{r_{23,k}\,r_{23,k+1}\,(r_{23,k} + r_{23,k+1})} \tag{37}$$

while $F_{1,k,y}$, $F_{2,k,y}$, $F_{3,k,y}$ are defined by interchanging $y$ and $x$ in (35), (36), (37), respectively.

To establish the conservation of energy, consider again (29). Substitution of (35)-(37) and the corresponding formulas for $F_{1,k,y}$, $F_{2,k,y}$, and $F_{3,k,y}$ into (29) yields readily
$$W_n = -Gm_1m_2\sum_{k=0}^{n-1}\frac{r_{12,k+1} - r_{12,k}}{r_{12,k}\,r_{12,k+1}} - Gm_1m_3\sum_{k=0}^{n-1}\frac{r_{13,k+1} - r_{13,k}}{r_{13,k}\,r_{13,k+1}} - Gm_2m_3\sum_{k=0}^{n-1}\frac{r_{23,k+1} - r_{23,k}}{r_{23,k}\,r_{23,k+1}}$$

Defining the potential energy $V_{ij,k}$ of the pair $P_i$ and $P_j$ at $t_k$ by

$$V_{ij,k} = -Gm_im_j/r_{ij,k}$$

implies then that

$$W_n = V_{12,0} + V_{13,0} + V_{23,0} - V_{12,n} - V_{13,n} - V_{23,n} \tag{38}$$

If the potential energy $V_k$ of the system at time $t_k$ is defined by

$$V_k = V_{12,k} + V_{13,k} + V_{23,k}$$

then (38) implies

$$W_n = V_0 - V_n \tag{39}$$

Finally, elimination of $W_n$ between (34) and (39) yields

$$K_n + V_n = K_0 + V_0, \qquad n = 1, 2, 3, \ldots$$
which is the classical law of conservation of energy.

Next, observe that (27) and (35)-(37) imply

$$m_1a_{1,k,x} + m_2a_{2,k,x} + m_3a_{3,k,x} = 0, \qquad k \ge 0$$

Hence

$$m_1(v_{1,k+1,x} - v_{1,k,x}) + m_2(v_{2,k+1,x} - v_{2,k,x}) + m_3(v_{3,k+1,x} - v_{3,k,x}) = 0, \qquad k \ge 0 \tag{40}$$

Summing both sides of (40) over $k$ from 0 to $j - 1$, where $j \ge 1$, yields

$$m_1(v_{1,j,x} - v_{1,0,x}) + m_2(v_{2,j,x} - v_{2,0,x}) + m_3(v_{3,j,x} - v_{3,0,x}) = 0, \qquad j \ge 1$$

However, since the last identity is valid also for $j = 0$, it follows that

$$m_1v_{1,j,x} + m_2v_{2,j,x} + m_3v_{3,j,x} = m_1v_{1,0,x} + m_2v_{2,0,x} + m_3v_{3,0,x}, \qquad j \ge 0 \tag{41}$$

Similarly,

$$m_1v_{1,j,y} + m_2v_{2,j,y} + m_3v_{3,j,y} = m_1v_{1,0,y} + m_2v_{2,0,y} + m_3v_{3,0,y}, \qquad j \ge 0 \tag{42}$$
However, Eqs. (41)-(42) are the classical equations for the conservation of linear momentum, and so the second conservation law is valid.

With regard to conservation of angular momentum, consider a particle $P$ of mass $m$ which, at $t_k$, is located at $\mathbf{r}_k$ and has velocity $\mathbf{v}_k$. Then the angular momentum $\mathbf{L}_k$ of $P$ at $t_k$ is defined by

$$\mathbf{L}_k = m(\mathbf{r}_k \times \mathbf{v}_k)$$

Next, at $t_k$, let particle $P_i$ of mass $m_i$ be located at $\mathbf{r}_{i,k}$, have velocity $\mathbf{v}_{i,k}$, and have angular momentum $\mathbf{L}_{i,k}$, that is,

$$\mathbf{L}_{i,k} = m_i(\mathbf{r}_{i,k} \times \mathbf{v}_{i,k}) \tag{43}$$

For any system of three particles, let the system angular momentum $\mathbf{L}_k$ at $t_k$ be defined by

$$\mathbf{L}_k = \sum_{i=1}^{3}\mathbf{L}_{i,k} \tag{44}$$
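What we wish to show is that

$$\mathbf{L}_k = \mathbf{L}_0, \qquad k = 1, 2, 3, \ldots \tag{45}$$

and this is done as follows. From (43) and the laws of vector cross products,

$$\mathbf{L}_{i,k+1} - \mathbf{L}_{i,k} = m_i(\mathbf{r}_{i,k+1} \times \mathbf{v}_{i,k+1}) - m_i(\mathbf{r}_{i,k} \times \mathbf{v}_{i,k}) = m_i\left[(\mathbf{r}_{i,k+1} - \mathbf{r}_{i,k}) \times \tfrac{1}{2}(\mathbf{v}_{i,k+1} + \mathbf{v}_{i,k}) + \tfrac{1}{2}(\mathbf{r}_{i,k+1} + \mathbf{r}_{i,k}) \times (\mathbf{v}_{i,k+1} - \mathbf{v}_{i,k})\right]$$

From (25)-(28), then

$$\mathbf{L}_{i,k+1} - \mathbf{L}_{i,k} = m_i\left[(\mathbf{r}_{i,k+1} - \mathbf{r}_{i,k}) \times \frac{\mathbf{r}_{i,k+1} - \mathbf{r}_{i,k}}{\Delta t} + \tfrac{1}{2}(\mathbf{r}_{i,k+1} + \mathbf{r}_{i,k}) \times (\mathbf{a}_{i,k}\Delta t)\right] = \tfrac{1}{2}\Delta t\,(\mathbf{r}_{i,k+1} + \mathbf{r}_{i,k}) \times \mathbf{F}_{i,k}$$

For notational simplicity set

$$\mathbf{T}_{i,k} = \tfrac{1}{2}(\mathbf{r}_{i,k+1} + \mathbf{r}_{i,k}) \times \mathbf{F}_{i,k}$$

so that

$$\mathbf{L}_{i,k+1} - \mathbf{L}_{i,k} = (\Delta t)\mathbf{T}_{i,k} \tag{47}$$

Hence, if

$$\mathbf{T}_k = \sum_{i=1}^{3}\mathbf{T}_{i,k} \tag{48}$$

then (44), (47), and (48) imply

$$\mathbf{L}_{k+1} - \mathbf{L}_k = (\Delta t)\mathbf{T}_k$$

Now, if

$$\mathbf{T}_k = \mathbf{0}, \qquad k = 0, 1, 2, \ldots \tag{49}$$

then

$$\mathbf{L}_{k+1} = \mathbf{L}_k, \qquad k = 0, 1, 2, \ldots$$

which implies (45), and the discussion would be complete. It remains for us to show then that, for the three-body problem, (49) is valid.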
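Before the algebraic argument, condition (49) and the vanishing total force behind (40) can be spot-checked numerically. The sketch below builds the pairwise arithmetic gravitational forces in the form of (35)-(37) for one arbitrary pair of configurations; the function name `pair_force` and the sample positions are hypothetical, while $G = 1$ and $m_i = 10$ follow the parameter choices quoted later in the text. It is a check for one configuration, not a proof.

```python
import math

def pair_force(ri_old, ri_new, rj_old, rj_new, G, mi, mj):
    """Arithmetic gravitational force on P_i due to P_j, in the form of (35)-(37)."""
    d_old = math.hypot(ri_old[0] - rj_old[0], ri_old[1] - rj_old[1])   # r_{ij,k}
    d_new = math.hypot(ri_new[0] - rj_new[0], ri_new[1] - rj_new[1])   # r_{ij,k+1}
    c = -G * mi * mj / (d_old * d_new * (d_old + d_new))
    return (c * ((ri_new[0] + ri_old[0]) - (rj_new[0] + rj_old[0])),
            c * ((ri_new[1] + ri_old[1]) - (rj_new[1] + rj_old[1])))

G = 1.0
masses = [10.0, 10.0, 10.0]
old = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]       # hypothetical positions at t_k
new = [(0.1, 0.2), (2.9, 0.3), (-0.2, 3.8)]      # hypothetical positions at t_{k+1}

forces = []
for i in range(3):
    fx = fy = 0.0
    for j in range(3):
        if j != i:
            px, py = pair_force(old[i], new[i], old[j], new[j], G, masses[i], masses[j])
            fx += px
            fy += py
    forces.append((fx, fy))

# Sum of the forces: zero implies (40) and hence conservation of linear momentum
total_force = (sum(f[0] for f in forces), sum(f[1] for f in forces))

# Planar (scalar) value of T_k = sum_i (1/2)(r_{i,k+1} + r_{i,k}) x F_{i,k}
torque = sum(0.5 * ((new[i][0] + old[i][0]) * forces[i][1]
                    - (new[i][1] + old[i][1]) * forces[i][0]) for i in range(3))

print(total_force, torque)
```

Both quantities vanish to rounding error because each pair contributes equal and opposite forces directed along the difference of the two position sums.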
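To do this observe that

$$\mathbf{T}_k = \mathbf{T}_{1,k} + \mathbf{T}_{2,k} + \mathbf{T}_{3,k} = \tfrac{1}{2}(\mathbf{r}_{1,k+1} + \mathbf{r}_{1,k}) \times \mathbf{F}_{1,k} + \tfrac{1}{2}(\mathbf{r}_{2,k+1} + \mathbf{r}_{2,k}) \times \mathbf{F}_{2,k} + \tfrac{1}{2}(\mathbf{r}_{3,k+1} + \mathbf{r}_{3,k}) \times \mathbf{F}_{3,k} \tag{50}$$

where $\mathbf{F}_{1,k}$, $\mathbf{F}_{2,k}$, and $\mathbf{F}_{3,k}$ are, by (35)-(37) and the corresponding formulas for the $y$ components,

$$\mathbf{F}_{1,k} = -\frac{Gm_1m_2(\mathbf{r}_{1,k+1} + \mathbf{r}_{1,k} - \mathbf{r}_{2,k+1} - \mathbf{r}_{2,k})}{r_{12,k}\,r_{12,k+1}\,(r_{12,k} + r_{12,k+1})} - \frac{Gm_1m_3(\mathbf{r}_{1,k+1} + \mathbf{r}_{1,k} - \mathbf{r}_{3,k+1} - \mathbf{r}_{3,k})}{r_{13,k}\,r_{13,k+1}\,(r_{13,k} + r_{13,k+1})} \tag{51}$$

$$\mathbf{F}_{2,k} = -\frac{Gm_1m_2(\mathbf{r}_{2,k+1} + \mathbf{r}_{2,k} - \mathbf{r}_{1,k+1} - \mathbf{r}_{1,k})}{r_{12,k}\,r_{12,k+1}\,(r_{12,k} + r_{12,k+1})} - \frac{Gm_2m_3(\mathbf{r}_{2,k+1} + \mathbf{r}_{2,k} - \mathbf{r}_{3,k+1} - \mathbf{r}_{3,k})}{r_{23,k}\,r_{23,k+1}\,(r_{23,k} + r_{23,k+1})} \tag{52}$$

$$\mathbf{F}_{3,k} = -\frac{Gm_1m_3(\mathbf{r}_{3,k+1} + \mathbf{r}_{3,k} - \mathbf{r}_{1,k+1} - \mathbf{r}_{1,k})}{r_{13,k}\,r_{13,k+1}\,(r_{13,k} + r_{13,k+1})} - \frac{Gm_2m_3(\mathbf{r}_{3,k+1} + \mathbf{r}_{3,k} - \mathbf{r}_{2,k+1} - \mathbf{r}_{2,k})}{r_{23,k}\,r_{23,k+1}\,(r_{23,k} + r_{23,k+1})} \tag{53}$$

However, direct substitution of (51)-(53) into (50) yields, by the laws of vector cross products,

$$\mathbf{T}_k = \mathbf{0}, \qquad k = 0, 1, 2, \ldots$$

Thus, the classical law of conservation of angular momentum is valid.

For completeness, note that symmetry is also valid, but because proofs for the special algebraic formulas needed are relatively lengthy, the interested reader is referred to the references (see, e.g., Greenspan, 1973a,c; 1981e).

D. Conservative Modeling: Laminar and Turbulent Fluid Flow, Heat Conduction, Elastic Vibration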
Let us show now how to model, as described in Section II,B, some basic physical phenomena by use of the formulas of Section II,C. Note first, however, that the use of these formulas requires that one solve a system of nonlinear or transcendental equations at each time step. For example, from (25)-(28) and (35)-(37), the planar three-body problem with the parameter
choices $G = 1$, $m_1 = m_2 = m_3 = 10$ would require the solution at each time step of the following twelve equations in the 12 unknowns $x_{i,k+1}$, $y_{i,k+1}$, $v_{i,k+1,x}$, $v_{i,k+1,y}$ ($i = 1, 2, 3$):

$$x_{i,k+1} - x_{i,k} - \tfrac{1}{2}\Delta t\,(v_{i,k+1,x} + v_{i,k,x}) = 0, \qquad i = 1, 2, 3$$

$$y_{i,k+1} - y_{i,k} - \tfrac{1}{2}\Delta t\,(v_{i,k+1,y} + v_{i,k,y}) = 0, \qquad i = 1, 2, 3$$

$$v_{i,k+1,x} - v_{i,k,x} + 10\Delta t\sum_{j \ne i}\frac{(x_{i,k+1} + x_{i,k}) - (x_{j,k+1} + x_{j,k})}{r_{ij,k}\,r_{ij,k+1}\,(r_{ij,k} + r_{ij,k+1})} = 0, \qquad i = 1, 2, 3$$

$$v_{i,k+1,y} - v_{i,k,y} + 10\Delta t\sum_{j \ne i}\frac{(y_{i,k+1} + y_{i,k}) - (y_{j,k+1} + y_{j,k})}{r_{ij,k}\,r_{ij,k+1}\,(r_{ij,k} + r_{ij,k+1})} = 0, \qquad i = 1, 2, 3$$

where

$$r_{ij,k}^2 = (x_{i,k} - x_{j,k})^2 + (y_{i,k} - y_{j,k})^2$$
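Such systems are solved easily by the Newton-Lieberstein method (Lieberstein, 1960; Greenspan, 1974d), which, in its most general form, is described as follows. Consider the nonlinear system of $k$ equations in the $k$ unknowns $x_1, x_2, x_3, \ldots, x_k$:

$$\begin{aligned}
f_1(x_1, x_2, x_3, \ldots, x_k) &= 0\\
f_2(x_1, x_2, x_3, \ldots, x_k) &= 0\\
f_3(x_1, x_2, x_3, \ldots, x_k) &= 0\\
&\;\;\vdots\\
f_k(x_1, x_2, x_3, \ldots, x_k) &= 0
\end{aligned}$$

Then the Newton-Lieberstein iteration formulas for this system are

$$\begin{aligned}
x_1^{(n+1)} &= x_1^{(n)} - \omega\,\frac{f_1(x_1^{(n)}, x_2^{(n)}, x_3^{(n)}, \ldots, x_k^{(n)})}{\partial f_1/\partial x_1}\\
x_2^{(n+1)} &= x_2^{(n)} - \omega\,\frac{f_2(x_1^{(n+1)}, x_2^{(n)}, x_3^{(n)}, \ldots, x_k^{(n)})}{\partial f_2/\partial x_2}\\
&\;\;\vdots\\
x_k^{(n+1)} &= x_k^{(n)} - \omega\,\frac{f_k(x_1^{(n+1)}, x_2^{(n+1)}, \ldots, x_{k-1}^{(n+1)}, x_k^{(n)})}{\partial f_k/\partial x_k}
\end{aligned}$$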
in which $\omega$ is an overrelaxation factor in the range $0 < \omega < 2$ and each partial derivative is evaluated at the same point as is the function in the numerator of the term in which the partial derivative occurs. Computer implementation is always practical because no matrix inversion routines are required.

For our first study in modeling, let us consider the flow of a liquid out of a nozzle. By restricting attention only to the immediate flow out of the nozzle, we can, at present, neglect the effect of gravity. Indeed, for simplicity, the modeling examples in this section will all emphasize the effects of local forces only. The complexities that result when both local and long-range forces are essential will be considered later.

Consider, then, a two-dimensional liquid in motion, a small portion of which is shown in Fig. 1. Let particles $P_1$-$P_{11}$ be called the first row, $P_{12}$-$P_{23}$ the second row, and $P_{24}$-$P_{34}$ the third row. In (23), let $G = H = m_i = 1$, $i = 1, 2, \ldots, 34$, and $p = 7$, $q = 10$. The initial positions of $P_1$-$P_{34}$ are set so that $P_{13}$-$P_{22}$ are centers of regular hexagons of radii $r = (1.5)^{1/3}$, while the remaining particles are centered at the vertices of the hexagons.

FIG. 1. A small fluid section.

Since the initial positions are fixed, the motion of the particles will be completely determined once we fix all the initial velocities $v_{i,0,x}$ and $v_{i,0,y}$. To do this, let us suppose that the particles have just been emitted horizontally from the nozzle. If this were the case, then $v_{i,0,x}$ would dominate $v_{i,0,y}$. Moreover, not all particles would have exactly the same velocities because of possible collisions with the nozzle housing. So let us choose

$$v_{i,0,x} = V + \varepsilon_{i,1}, \qquad v_{i,0,y} = \varepsilon_{i,2}, \qquad i = 1, 2, \ldots, 34$$

where $V$ is a parameter which assures relatively horizontal motion, while $\varepsilon_{i,1}$ and $\varepsilon_{i,2}$ are relatively small random numbers which give the particles small perturbations from purely horizontal motion. For simplicity, let the computer generate all the $\varepsilon_{i,1}$ and $\varepsilon_{i,2}$ in a random fashion, small in magnitude.

In the examples that follow, our interest will center on increasing values of $V$ and on initial time steps only. Figure 2 shows the particle motion for $V = 50$ and $\Delta t = 0.02$ at $t = 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4$. A gentle wave motion develops in each row, while the rows maintain their relative positions. A flow of this nature is said to be laminar. Figure 3 shows the motion for $V = 300$ and $\Delta t = 0.02$ at $t = 0.2, 0.4, 0.6, 0.8, 1.0$. Repulsion between the particles has assumed a greater significance and, though the rows still maintain their relative positions, the motion is becoming more chaotic. Figure 4 shows the motion for $V = 1000$ with $\Delta t = 0.01$ at $t = 0.2, 0.4, 0.6, 0.8, 1.0$. Here the laminar character of the flow has disappeared in that the rows no longer maintain their relative positions, and the motion becomes extremely chaotic, or, more descriptively, turbulent. Thus, with the increase in velocity, particles can come nearer to other particles, which results in increased repulsive forces and more complex motion.
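Each time step of the simulations above requires the solution of a nonlinear system by the Newton-Lieberstein method described earlier in this section. A minimal sketch of one plausible reading of that iteration follows: each unknown is updated in turn by a one-dimensional Newton step scaled by $\omega$, always using the most recent values of the other unknowns. The function name, the stopping rule, and the two-equation test system (constructed so that its solution is $(1, 1)$) are hypothetical and are not taken from the text.

```python
def newton_lieberstein(funcs, diag_partials, x0, omega=1.0, tol=1e-12, max_sweeps=200):
    """Component-by-component Newton sweeps with overrelaxation factor omega.

    funcs[i](x)         : value of f_i at the current point x
    diag_partials[i](x) : value of the partial derivative of f_i with respect to x_i
    """
    x = list(x0)
    for sweep in range(1, max_sweeps + 1):
        largest = 0.0
        for i in range(len(x)):
            # function and derivative are evaluated at the same, most recent point
            step = omega * funcs[i](x) / diag_partials[i](x)
            x[i] -= step
            largest = max(largest, abs(step))
        if largest < tol:            # assumed stopping rule: all updates negligible
            return x, sweep
    raise RuntimeError("no convergence within max_sweeps")

# Hypothetical test system with solution (1, 1):
#   4 x1 - x2 + x1^3 - 4 = 0
#  -x1 + 4 x2 + x2^3 - 4 = 0
funcs = [lambda x: 4 * x[0] - x[1] + x[0] ** 3 - 4,
         lambda x: -x[0] + 4 * x[1] + x[1] ** 3 - 4]
diag_partials = [lambda x: 4 + 3 * x[0] ** 2,
                 lambda x: 4 + 3 * x[1] ** 2]

solution, sweeps = newton_lieberstein(funcs, diag_partials, [0.0, 0.0])
print(solution, sweeps)
```

Only the diagonal partial derivatives are needed, which is why no matrix inversion appears anywhere in the loop.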
FIG. 2. Laminar flow. Panels (a)-(g): t = 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4.

FIG. 3. Laminar flow with limited chaos. Panels (a)-(e): t = 0.2, 0.4, 0.6, 0.8, 1.0.

FIG. 4. Turbulent flow. Panels (a)-(e): t = 0.2, 0.4, 0.6, 0.8, 1.0.
Intuitively, turbulence is often thought of as a type of fluid flow which is characterized by the rapid appearance and disappearance of many small vortices, or whirlpools. If one defines a vortex as a counterclockwise, or a clockwise, motion exhibited by three or more relatively close particles, then such configurations do indeed appear in our example for $V = 1000$, do break down quickly owing to the very large effects of repulsion, and then do reappear in different particle groupings (Greenspan, 1974a). Note also that, though turbulent flow is known to be the most common type of fluid flow, there is as yet no continuous model of it which is realistic (Birkhoff, 1983; Saffman, 1968; von Kármán, 1963).

As a second example, let us consider heat conduction in a bar. Since heat conduction of this type is a molecular phenomenon, it is reasonable again to consider local force interaction only. The prototype problem is formulated as follows. Let the planar region bounded by rectangle $OABC$, as shown in Fig. 5a, represent a bar. Let $|OA| = a$, $|OC| = c$. A section of the boundary of the bar is heated. The problem is to describe the flow of heat through the bar.

Our discrete approach proceeds as follows. First, we wish to replace the region $OABC$ by a configuration of particles. Moreover, the configuration must be such that it has the characteristics of a solid. This can be accomplished readily by triangulating the region, setting the particles at the vertices of the triangles, and choosing the distance between any two neighboring particles so that the local force of interaction is zero. It follows from (23), with $n = 2$, that two particles have a zero-force interaction if the distance $r$ between them is given (Greenspan, 1974b, 1980a) by

$$r = \left\{\frac{H(q - 1)}{G(p - 1)}\right\}^{1/(q-p)} \tag{54}$$
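Formula (54) is easily evaluated. The short sketch below (the function name is hypothetical) computes the zero-force spacing for the two parameter sets used in the examples of this section, $G = H = 1$ and $G = 425$, $H = 1000$, each with $p = 7$, $q = 10$.

```python
def zero_force_distance(G, H, p, q):
    """Equilibrium spacing from (54): r = {H(q-1) / [G(p-1)]}^(1/(q-p))."""
    return (H * (q - 1) / (G * (p - 1))) ** (1.0 / (q - p))

r_heat = zero_force_distance(G=1.0, H=1.0, p=7, q=10)         # heat-conduction bar
r_elastic = zero_force_distance(G=425.0, H=1000.0, p=7, q=10)  # elastic bar

print(round(r_heat, 4), round(r_elastic, 4))
```

The two values agree, to four decimal places, with the spacings 1.1447 and 1.5225 quoted in the text.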
Next let us fix the parameter choices $m_i = 1$, $G = H = 1$, $p = 7$, $q = 10$, so that, from (54), $r = 1.1447$, and, for $a \approx 11$ and $c \approx 2$, let us subdivide the given region into regular triangular subsections, with the length of each side of any triangle equal to 1.1447. One possible such arrangement, and the one we will use, is shown in Fig. 5b.

Now by heating a section of the boundary of the bar, we will mean increasing the velocity, and hence the kinetic energy, of some of the particles whose centers are on $OABC$. By the temperature $T_{i,k}$ of particle $P_i$ at time $t_k$, we will mean the following. Let $M$ be a fixed positive integer and let $K_{i,k}$ be the kinetic energy of $P_i$ at $t_k$. Then $T_{i,k}$ is defined as the arithmetic mean of the kinetic energies of $P_i$ at $M$ consecutive time steps. By the flow of heat through the bar we will mean the transfer to other particles of the bar of the kinetic energy added at the boundary. Finally, to follow the flow of heat through the bar one need only follow the motion of each particle and, at each time step, record its temperature.

FIG. 5. Heat flow in a bar.
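The temperature just defined is a running mean of a particle's kinetic energy. The following sketch shows one way to compute it; the choice of the $M$ steps ending at $t_k$ as the averaging window is an assumption made here for illustration (the text specifies only that the steps are consecutive), and the function names and sample history are hypothetical.

```python
def kinetic_energy(m, vx, vy):
    """Kinetic energy of one particle with mass m and velocity (vx, vy)."""
    return 0.5 * m * (vx * vx + vy * vy)

def temperature(kinetic_history, k, M):
    """Mean of the kinetic energies at M consecutive steps.

    ASSUMPTION: the window is taken to be steps k-M+1, ..., k (a trailing window).
    kinetic_history[j] holds the kinetic energy of the particle at t_j.
    """
    if k - M + 1 < 0:
        raise ValueError("not enough history for a window of M steps")
    window = kinetic_history[k - M + 1 : k + 1]
    return sum(window) / M

# Hypothetical history for a unit-mass particle whose speed grows step by step
history = [kinetic_energy(1.0, 0.1 * j, 0.0) for j in range(5)]
print(temperature(history, k=4, M=2))
```

Averaging over several steps smooths the rapid oscillation of an individual particle's kinetic energy, which is presumably why $M$ is taken larger than 1 in practice.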
To illustrate, consider the bar shown in Fig. 5b with the parameter choices given above. Assume that a strong heat source is placed above $P_6$, and then removed, in such a fashion that

$$\mathbf{v}_{5,0} = (-\sqrt{2}/2, -\sqrt{2}/2), \qquad \mathbf{v}_{6,0} = (0, -1), \qquad \mathbf{v}_{7,0} = (\sqrt{2}/2, -\sqrt{2}/2)$$

while all other initial velocities are 0. As regards the choice of $M$, which is relatively arbitrary, set $M = 20$. From the resulting calculations with $\Delta t = 0.025$, Figs. 5c-g show the constant-temperature contours $T = 0.1, 0.06, 0.025, 0.002$ at $t_5$, $t_{10}$, $t_{15}$, $t_{20}$, $t_{25}$, respectively. The resulting wave motion is clear.

As a final example in this section, let us develop the basic mechanics of discrete elasticity by concentrating on the vibration of an elastic bar. The problem is formulated physically as follows. Let the region bounded by rectangle $OABC$, as shown in Fig. 5a, represent a bar which can be deformed, and which, after deformation, tends to return to its original shape. The problem is to describe the motion of such a bar after an external force, which has deformed the bar, is removed. Equivalently, the problem is to describe the motion of an elastic bar after release from a position of tension.

Our discrete approach proceeds as follows. The given region is first subdivided into triangular subregions. Then, deformation results in the compression of certain particles and the stretching apart of others. Release from a position of deformation, or tension, results in repulsion between each pair of particles which has been compressed and attraction between each pair which has been stretched, the net effect being the motion of the bar.

As a particular example, let $m_i = 1$, $p = 7$, $q = 10$, $G = 425$, $H = 1000$. From (54), $r = 1.5225$. Consider, for variety, the 30-particle bar which results by deleting $P_{11}$ and $P_{32}$ from the configuration of Fig. 5b. The particles $P_1$, $P_{12}$, and $P_{22}$, whose respective coordinates are $(0, 2.63711)$, $(0.76127, 1.31855)$, and $(0, 0)$, are fixed throughout. In order to obtain an initial position of tension like that shown in Fig. 6a, first set $P_{13}$, $P_{14}$, $P_{15}$, $P_{16}$, $P_{17}$, $P_{18}$, $P_{19}$, $P_{20}$, $P_{21}$ at $(2.28357, 1.29198)$, $(3.80588, 1.26541)$, $(5.32632, 1.18573)$, $(6.84052, 1.02658)$, $(8.33992, 0.76219)$, $(9.81058, 0.36813)$, $(11.23199, -0.17750)$, $(12.57631, -0.89228)$, $(13.80807, -1.78721)$, respectively. Any two consecutive points $P_k$, $P_{k+1}$, $k = 13, 14, \ldots, 20$, are positioned $r$ units apart. The points $P_2$-$P_{10}$ and $P_{23}$-$P_{31}$ are positioned as follows: $P_{k-10}$ and $P_{k+11}$ are the two points which are $r$ units from both $P_k$ and $P_{k+1}$ for each of $k = 12, 13, \ldots, 20$. Each consecutive pair of points in the $P_2$-$P_{10}$ set are then separated by a distance greater than $r$, while each consecutive pair of points in the $P_{23}$-$P_{31}$ set are separated by a distance less than $r$. Thus $P_2$-$P_{10}$ are in a stretched position, while $P_{23}$-$P_{31}$ are compressed.

From the initial positions shown in Fig. 6a, the oscillatory bar motion is determined from (20)-(23) with all initial velocities set equal to 0 (Greenspan, 1974b, 1980a). Figure 6 shows the first upward swing of the bar and indicates clearly that each row of particles exhibits wave oscillation and reflection during the gross bar motion; that is, wave oscillations go through the bar during its upward swing.

FIG. 6. Vibration of a bar.

Finally, note that formulas (20)-(23) have proved to be particularly useful for the study of a variety of astronomical phenomena, especially when the conservation of angular momentum is required. The study of such phenomena usually needs, in addition, the use of a variable time step, because of large variations in the forces involved (Albrycht and Marciniak, 1981b; Carusi and Valsecchi, 1980). Also, note that higher-order formulas have been
developed which conserve energy and linear momentum, but, thus far, only (20)-(23) conserve all the fundamental physical invariants (LaBudde and Greenspan, 1976a,b; Marciniak, 1981).

E. Nonconservative Modeling: Heat Convection, Shock-Wave Generation, the Liquid Drop Problem, Porous Flow, Interface Motion of a Melting Solid, Soap Films, String Vibrations, and Solitons
In each of the conservative models of Section II,D, the implicit structure of the dynamical difference equations required that a system of nonlinear equations in positions and velocities be solved at each time step. Such a requirement can be relatively expensive if the number of particles is large. In this section, then, we will show how to formulate and study discrete models by using explicit formulas. The price one pays for the resulting economy is that one no longer retains exact conservation. Nevertheless, it should be understood that if exact conservation is essential, then each model to be discussed can be reformulated and reanalyzed by using (20)-(23). It should be noted, also, that in this section we will introduce long-range forces in addition to local forces.

Explicit formulas for the positions and velocities of particles can be introduced in a variety of ways. For example, consider a set of $n$ particles $P_1, P_2, \ldots, P_n$ of masses $m_1, m_2, \ldots, m_n$, respectively. For positive time step $\Delta t$, let $t_k = k\Delta t$, $k = 0, 1, \ldots$. At $t_k$, let $P_i$ be located at $\mathbf{r}_{i,k}$, have velocity $\mathbf{v}_{i,k}$, and have acceleration $\mathbf{a}_{i,k}$, for $i = 1, 2, \ldots, n$. Assume that position, velocity, and acceleration are related by the leap-frog formulas (Greenspan, 1980a):

$$\mathbf{v}_{i,1/2} = \mathbf{v}_{i,0} + \tfrac{1}{2}(\Delta t)\mathbf{a}_{i,0} \tag{55}$$

$$\mathbf{v}_{i,k+1/2} = \mathbf{v}_{i,k-1/2} + (\Delta t)\mathbf{a}_{i,k}, \qquad k = 1, 2, \ldots \tag{56}$$

$$\mathbf{r}_{i,k+1} = \mathbf{r}_{i,k} + (\Delta t)\mathbf{v}_{i,k+1/2}, \qquad k = 0, 1, 2, \ldots \tag{57}$$

The name "leap-frog" is derived from the way the position and velocity are defined at alternate, sequential time points. Next, if $\mathbf{F}_{i,k}$ is the force acting on $P_i$ at time $t_k$, then we assume, as usual, that

$$\mathbf{F}_{i,k} = m_i\mathbf{a}_{i,k} \tag{58}$$

Once an exact structure is given to $\mathbf{F}_{i,k}$, the motion of each particle will be determined recursively by (55)-(58) from prescribed initial data. The special structures to be described will yield recursion formulas which are entirely explicit. Of course, Runge-Kutta, Taylor expansion, and predictor-corrector formulas may be used in place of (55)-(57), though we will utilize only the above in all models to be described.
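The explicit character of (55)-(57) is easiest to see in one dimension. The sketch below assumes the usual half-step start $v_{1/2} = v_0 + \tfrac{1}{2}\Delta t\,a_0$ for (55); the function name and the test problem (a body falling from rest with constant acceleration $-32$ and $\Delta t = 1$, as in Section II,A) are chosen here for illustration only.

```python
def leap_frog(x0, v0, accel, dt, steps):
    """Advance one particle in one dimension with the leap-frog formulas.

    accel(x) returns the acceleration F/m at position x, as in (58).
    Returns the list of positions x_0, x_1, ..., x_steps.
    """
    xs = [x0]
    v_half = v0 + 0.5 * dt * accel(x0)           # (55): starting half step
    for k in range(steps):
        if k > 0:
            v_half = v_half + dt * accel(xs[k])  # (56): velocity at t_{k+1/2}
        xs.append(xs[k] + dt * v_half)           # (57): position at t_{k+1}
    return xs

# Illustrative test problem: constant gravitational acceleration -32, dt = 1
positions = leap_frog(x0=0.0, v0=0.0, accel=lambda x: -32.0, dt=1.0, steps=5)
print(positions)
```

No equation has to be solved at any step: each new velocity and position follows directly from quantities already known, in contrast with the implicit formulas (20)-(23).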
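Consider now the process of heat convection. In particular, let us consider heat transfer and fluid motion of a gas which is contained in a two-dimensional, cylindrical container of radius $A$ which is open at the top. The local-force interaction formula will be taken to be

$$\mathbf{F}^*_{ij,k} = \left(\frac{-Gm_im_j}{r_{ij,k}^p} + \frac{Hm_im_j}{r_{ij,k}^q}\right)\frac{\mathbf{r}_{ji,k}}{r_{ij,k}}, \qquad i = 1, 2, \ldots, n \tag{59}$$

where $\mathbf{r}_{ji,k}$ is the vector from $P_j$ to $P_i$ at $t_k$. The total local force on $P_i$ due to all the other particles is taken to be

$$\mathbf{F}^*_{i,k} = \sum_{\substack{j=1\\ j \ne i}}^{n}\mathbf{F}^*_{ij,k} \tag{60}$$

Finally, we include the long-range force of gravity and define $\mathbf{F}_{i,k}$ in (58) by

$$\mathbf{F}_{i,k} = \mathbf{F}^*_{i,k} - 980m_i\mathbf{y} \tag{61}$$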
where, in two dimensions, $\mathbf{y} = (0, 1)$.

Particles which have collided with the walls of the container will be reflected and damped in the following simple fashion. Since the radius of the cylinder is $A$, let the ends of the cylinder base be located at $(-A, 0)$ and $(A, 0)$ in the $XY$ plane. If at $t_k$ particle $P_i$ is at $(x_{i,k}, y_{i,k})$, then

(a) $x_{i,k} > A \Rightarrow P_i$ reset at $(2A - x_{i,k}, y_{i,k})$
(b) $x_{i,k} < -A \Rightarrow P_i$ reset at $(-2A - x_{i,k}, y_{i,k})$
(c) $y_{i,k} < 0 \Rightarrow P_i$ reset at $(x_{i,k}, -y_{i,k})$

If $\mathbf{v}^*_{i,k}$ is taken as the velocity of $P_i$ at $t_k$ and $\mathbf{v}_{i,k}$ is to be the reset velocity, then

(a) $x_{i,k} > A$ or $x_{i,k} < -A \Rightarrow v_{i,k,x} = -\delta v^*_{i,k,x}$, $v_{i,k,y} = \delta v^*_{i,k,y}$
(b) $y_{i,k} < 0 \Rightarrow v_{i,k,x} = \delta v^*_{i,k,x}$, $v_{i,k,y} = -\delta v^*_{i,k,y}$

where $0 \le \delta \le 1$. Of course, the value of the damping constant $\delta$ will depend entirely on the nature of the wall.

In particular, let us now consider the parameter choices $n = 50$, $A = 1$, $G = 0$, $H = 1$, $p = 1$, $q = 6$, $m_i = 2.5$, $\delta = 0.1$, $\Delta t =$ [illegible]. The major problem which confronts us immediately is that of determining all initial positions and velocities so that the fluid is physically stable. Without computers, this problem is entirely intractable. For this reason, we fix the particles' initial data in some approximate fashion and then let the particles interact in accordance with (55)-(58) and (61) until physical stability results. This was done on the UNIVAC 1110 in the following way. First, the particles were distributed uniformly and in rows so that the height of the top row was $h$. For each of $h = 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4$, the particles were allowed to interact with 0 initial velocities. Only the top row was observed. For $h = 0.5, 1.0, 1.5, 2.0$, the top row exhibited upward motions only. For $h = 3.0, 3.5, 4.0$, the top row showed downward motions only. So $h = 2.5$ was selected as a first approximation for the height of the fluid. With this height fixed, the particles were then reset in a more physical way by having their density increase with the depth, but with no particle set on a wall of the container. Initial velocities were determined at random in the range $-200 \le v_{i,0,x} \le 200$, $-200 \le v_{i,0,y} \le 200$. The particle arrangement that resulted after 14.4 sec of physical time is shown in Fig. 7. Particles without direction arrows showed no significant motion. The exact positions and velocities of all 50 particles are given in Greenspan (1978a).

With regard to Fig. 7, it is worth noting that the wall particles achieved a relatively stable formation first, due to the strong damping, that the particles below $y = 1$ did so next, and that the particles above $y = 1$ did so last. The central particles do exhibit strong, but stable, motion, the entire range of particle velocities having been reduced to $-19.0 \le v_{i,x} \le 22.5$, $-19.5 \le v_{i,y} \le 17.5$. Moreover, the density increases with the depth, indicating that the fluid is either a heavy gas or a light liquid.

We will now demonstrate that the fluid so generated and shown in Fig. 7 possesses the very fundamental property of buoyancy. For this purpose a new particle, $P_{51}$, whose mass is 0.25, is inserted into the fluid at $(0, 0.5)$ and is assigned an initial velocity of 0. The resulting motion of $P_{51}$ at every tenth time step, for a total of 330 time steps, is shown in Fig. 8. The very rapid ejection of $P_{51}$ is seen to include relatively large lateral motions in the center of the fluid, due, of course, to the fact that the fluid particles in this region still exhibit such motions. Heuristically, the lighter particle is "carried with the currents" in this central area. The final ejection of $P_{51}$ is to a position above the fluid.
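The wall-reflection and damping rules stated above for the cylindrical container translate directly into a small routine. The sketch below is one possible reading: the function name is hypothetical, and the handling of a particle that violates a side wall and the base in the same step (both rules applied in turn) is an assumption, since the text does not address that case.

```python
def reflect(x, y, vx, vy, A, delta):
    """Reset position and velocity of a particle that has crossed a wall.

    Side walls are at x = -A and x = A, the base at y = 0; the top is open.
    delta is the damping constant, 0 <= delta <= 1.
    """
    if x > A:                         # right wall: mirror about x = A
        x = 2.0 * A - x
        vx, vy = -delta * vx, delta * vy
    elif x < -A:                      # left wall: mirror about x = -A
        x = -2.0 * A - x
        vx, vy = -delta * vx, delta * vy
    if y < 0.0:                       # base: mirror about y = 0
        # ASSUMPTION: applied after a side-wall reset if both occur together
        y = -y
        vx, vy = delta * vx, -delta * vy
    return x, y, vx, vy

# Parameter choices A = 1 and delta = 0.1 from the text; sample states are hypothetical
side = reflect(1.2, 0.5, 10.0, 5.0, A=1.0, delta=0.1)
base = reflect(0.0, -0.1, 2.0, -3.0, A=1.0, delta=0.1)
print(side, base)
```

With $\delta = 0.1$ a particle loses most of its speed at every wall contact, which is consistent with the remark that the wall particles are the first to settle into a stable formation.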
FIG. 7. A heavy gas.

FIG. 8. Buoyancy effect in a heavy gas.
We will show now that, in general, heating the fluid shown in Fig. 7 results in fluid expansion, while "judicious" heating can yield convective currents. By heating the fluid we mean increasing the velocity, and hence the kinetic energy, of various fluid particles. First, let us do this in the following way. Every time a particle is found in the region bounded by $0.5 \le x \le 1.0$ and $0 \le y \le 0.2$, let its component of velocity $v_{i,y}$ be increased by 100. The resulting effect is to have the particles in this special region move out vertically and rapidly, and to have the resulting partial vacuum filled by particles directly to the left of the region. Computation for 600 time steps yields the fluid expansion shown in Fig. 9.

Suppose next that we decrease the amount of heat and the size of the heating area. This time let each particle which collides with the base of the cylinder in that portion where $0.7 \le x \le 1$ be reflected with its component of velocity $v_{i,y}$ increased by 25. The resulting particle motions are now much slower and less erratic than those shown in Fig. 9. Indeed, Fig. 10 shows the resulting motion after 900 time steps, where a counterclockwise vortex appears and is so indicated, while Fig. 11 shows the development of a larger motion of this type after 1600 time steps.

FIG. 9. Gas expansion due to heating.

Another problem of wide general interest which is associated with gases is that of shock-wave generation. Consider, then, a gas as shown in a long tube in Fig. 12a. Into this tube insert a piston, as shown in Fig. 12b. If one first moves the piston down the tube slowly, as shown in Fig. 12c, the gas particles increase in density per unit volume in a relatively uniform way. However, if, as shown in Fig. 12d, the piston is moved at a very high rate of speed, gas particles compact on the cylinder head, with the result that the original gas consists of two distinct portions, one with a very high density, the other with about the same density as at the start. The boundary between these two portions, which is shown as a dotted line in Fig. 12d, is called a shock wave.

More generally, a shock wave can be thought of as follows. Assign to a given gas a positive measure of average particle density. Let a body $B$ pass through the gas at a very high rate of speed. In certain regions about $B$, there may occur sets of gas particles whose densities are not average. Then, a
FIG. 10. Vortex generation in a gas.
FIG. 11. Circulatory motion in a gas.
FIG. 12. Shock-wave generation.
boundary between sets of particles with average density and those with “greater than average” density is called a shock wave. Let us illustrate this “greater than average” density concept and the development of a shock wave by considering next a particular shock-tube problem. Consider the tube configuration in Fig. 12b. For convenience, a coordinate system will be fixed relative to the piston head, as shown in Fig. 13, so that the particles will be considered to be in motion relative to the piston. Let the tube be 100 units long, so that AO = 100, and 10 units high, so that
FIG. 13. Positioning of axes.
AB = 10. Now, every $\Delta t = 0.01$ sec, let a column of particles, each of radius $r = 0.35$ and of unit mass $m$, enter the tube at AB. Each such column is determined as follows. At each time $t_k$, each position $(-100, n + \tfrac{1}{2})$, $n = 0, 1, 2, \ldots, 9$, is either filled by a particle or left vacant by a random process, like the toss of a coin. Once it has been determined that a particle $P_i$ is at such a location, its velocity $v_{i,0}$ is determined by $v_{i,0} = (100 + \epsilon_{i,1}, \epsilon_{i,2})$, where $\epsilon_{i,1}$ and $\epsilon_{i,2}$ are random, but are small in magnitude relative to 100, thus assuring that the gas has a relatively high speed in a relatively uniform direction. Because of the high speeds and short time durations in shock-wave development, gravity will be neglected in the formulation. Allowing a repulsive force to simulate the effects of particle collision, we assume
where $r_{ij,k}$ is the distance between $P_i$ and $P_j$ in the tube at $t_k$ and
$$H = \begin{cases} 0, & \text{if } r_{ij,k} \ge 2r = 0.7 \\ 1, & \text{if } r_{ij,k} < 2r = 0.7 \end{cases}$$
The parameter $\xi$ is a measure of how close the centers of two particles can come and serves to conserve mass. If and when a particle impacts on either the top or the bottom of the tube, or on the piston head, we will assume that it rebounds, as shown in Fig. 14, with
$$\beta = \alpha \pm \gamma \tag{62}$$
The quantity $\gamma$ is determined at random in the range $0 \le \gamma \le \pi/40$, subject to the restriction $0 \le \beta \le \pi/2$. If this last restriction is satisfied for both choices of
FIG.14. Wall reflection.
sign in (62), then the sign is to be determined at random. If the incident speed is $v_i$, while the reflected speed is $v_r$, it will be assumed that
$$|v_r| = 0.2\,|v_i|$$
which can be interpreted as a transference of kinetic energy from the particles of the gas to the particles of the container. Note that the damping factor, which is 0.2 in this case, depends entirely on the nature of the boundary surface. For the above simple formulation with $q = 1$ and $\xi = 0.1$, Fig. 15 shows the shock-wave structure at time $t_{160}$, that is, after 1.6 sec, when “greater than average” density is defined to mean that the distance between a particle and at least five other particles is less than unity (Greenspan, 1980a). It should be noted that the heating of the walls of the tube, a phenomenon of fundamental importance in shock-tube generation, is usually too difficult to incorporate into continuous models.
Next, let us consider a liquid problem. In so doing we will develop phenomena related to free-surface motion, wave motion over a wall, stratified flow, and diffusion of a pollutant. The long-range force is gravity. Consider, then, a liquid in a cavity, or well, as shown in Fig. 16a. The liquid consists of 190 particles, each of unit mass (Greenspan, 1980b), each shown as an unshaded particle. Consider, also, a liquid drop which consists of 15 particles, each of mass $m = 2$, each represented as a shaded particle. At $t_0$ (Fig. 16a), a configuration is assumed in which the drop has already hit the surface and has flattened. The parameters are $H = G = 100$, $p = 1$, $q = 3$, $\Delta t$ = [value illegible in source], with
FIG.15. Shock-wave simulation.
FIG.16. Fluid drop dispersion and free-surface wave generation.
gravity equal to $-980m_i$. This time, however, we restrict the local forces to a radial distance of 0.25; that is, if the distance between two particles is greater than 0.25, then the local force between them is taken to be zero. All initial data are given in Greenspan (1980b). Figures 16a-d show the resulting dynamical behavior. Figures 16a and 16b show the entry and initial dispersion of the drop into the well. Figures 16b and 16c show clearly the wave reaction of the well liquid. Figure 16c shows two developing reactions within the well, one of which is a backflow over the sinking drop; the other is a wave flow over the right-hand wall. These actions continue in Fig. 16d, which shows, in addition, a distinct decrease in the drop’s vertical velocity, which, in the case of a pollutant, causes a layered effect. Figures 17a-d clarify the gross motion shown in Fig. 16 by illustrating the deformation with time of various liquid columns. The accordion-type deformation directly below the drop suggests turbulent vortex circulation, while the column motion to the right of the drop indicates that particles on the tops of columns tend to flow over the right barrier, while those below tend to flow over the submerging drop.
An entirely different type of liquid motion which is also of wide interest is porous flow. In general, porous flow is the study of liquid transport through a solid or an earthlike conglomerate of small solids, like sand.
FIG.17. Fluid drop analysis.
FIG.18. Porous flow.
As an example, consider a particular porous flow problem in which the liquid and the solid are as shown in Fig. 18a. The liquid is contained within a square region above a triangular, porous ground section. The configuration is complicated by the presence of a solid shelf, shown above and to the right of the ground, through which the liquid cannot flow. The problem is to describe the flow of the liquid around the shelf and through the ground.
A total of 128 particles are chosen, 100 being liquid and 28 being solid. The parameter choices are $H = 100$, $G = 0$, $q = 3$, $\Delta t$ = [value illegible in source], a local interaction distance of 0.25, and a velocity damping factor off the walls of 0.1. The mass of each liquid particle is taken to be unity. No motion is allowed for any solid particle, which enables one to enhance the liquid-solid interaction by taking the mass of each solid particle to be 0.2 (Greenspan, 1980c). Gravity is fixed at $-980m_i$. The solid particles are arranged uniformly, as shown in Fig. 18a, and all initial positions and velocities of the fluid particles are recorded in Greenspan (1980a). Figure 18 shows the resulting flow from $t_0$ to $t_{12,000}$. Initially, the fluid enters the porous area and quickly saturates the left corner, forming a small dead zone. Then, because the porous area has open horizontal channels, there is a rapid horizontal flow to the right, as the liquid follows a path of least resistance. The flow vertically then continues in the fashion shown. Using an analogous formulation, Vargas (1983) has developed very recently an effective two-fluid oil recovery model.
As a third example of liquid motion, let us consider the very difficult problem of describing the motion of the boundary between the liquid and the solid portions of a melting solid. This problem is usually called the Stefan problem. From the continuous point of view, a major difficulty lies in setting correct conditions at the interface, that is, along the moving boundary, since such conditions are not clearly understood physically. In our study of the Stefan problem, we will consider first the subsidiary problem of constructing a solid in which both local and long-range forces act on all particles, for only in such a configuration will the melted particles flow downward. Let us begin with a triangular body of 28 particles, as shown in Fig. 19.
The distance between each horizontal pair of adjacent particles is 1.2 units and the distance between each row of particles is 0.8 units (Greenspan and Rosati, 1978). All initial positions are given in Table II. All initial velocities are taken to be zero. The initial parameter choices are $p = 4$, $q = 6$, $m_i = 0.05$, $G = 2500$, $H = 2900$. For variety, the gravity term in (61) is chosen to be $-32m_i$. We assume also that the body is supported by the X axis; this will be implemented as follows. If any particle $P_i$ falls below the X axis at time $t_k$, then it is reflected and its velocity reset in the following manner:
$$x_{i,k} \to x_{i,k}, \qquad y_{i,k} \to -y_{i,k}, \qquad v_{i,k,x} \to 0, \qquad v_{i,k,y} \to (0.1)\,v_{i,k,y}$$
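As a cross-check on Table II, the initial triangular configuration can be regenerated from the two spacings quoted above (1.2 units between horizontal neighbors, 0.8 units between rows). The following Python lines are only an illustrative sketch; the function name `triangular_body` and the bottom-row-first, left-to-right numbering are assumptions chosen to agree with the table, not code taken from the cited papers.

```python
def triangular_body(rows=7, dx=1.2, dy=0.8):
    """Return (x, y) positions of a triangular body, bottom row first.

    Row 0 holds `rows` particles, row 1 holds `rows - 1`, and so on,
    each row centered on x = 0.
    """
    positions = []
    for row in range(rows):
        count = rows - row                  # particles in this row
        x_left = -0.5 * dx * (count - 1)    # leftmost x so the row is centered
        y = row * dy
        for col in range(count):
            # Rounding only removes floating-point noise such as 2.4000000000000004.
            positions.append((round(x_left + col * dx, 10), round(y, 10)))
    return positions


body = triangular_body()
print(len(body))   # 28 particles in total
```

With the default arguments, `body[0]` corresponds to particle 1 of Table II and `body[27]` to particle 28 at the apex.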
With this modification, then, particles $P_1$-$P_{28}$ were allowed to interact in accordance with (55)-(61) in order to find the equilibrium position of the figure. The result was interesting, but not satisfying. By time $t_{10,000}$ all the particles had fallen to the X axis and the body had gone through a transformation like that of honey, in which it had deformed gradually into a flat surface. The problem was that gravity was overcoming the interparticle
FIG.19. Initial triangular configuration
TABLE II
INITIAL PARTICLE POSITIONS OF A TRIANGULAR BODY

  i      x      y        i      x      y
  1    -3.6    0.0      15    -1.2    1.6
  2    -2.4    0.0      16     0.0    1.6
  3    -1.2    0.0      17     1.2    1.6
  4     0.0    0.0      18     2.4    1.6
  5     1.2    0.0      19    -1.8    2.4
  6     2.4    0.0      20    -0.6    2.4
  7     3.6    0.0      21     0.6    2.4
  8    -3.0    0.8      22     1.8    2.4
  9    -1.8    0.8      23    -1.2    3.2
 10    -0.6    0.8      24     0.0    3.2
 11     0.6    0.8      25     1.2    3.2
 12     1.8    0.8      26    -0.6    4.0
 13     3.0    0.8      27     0.6    4.0
 14    -2.4    1.6      28     0.0    4.8
forces of attraction and repulsion. To remedy this situation, since interparticle forces increase with $m^2$ while gravity increases only with $m$, the mass was increased to 0.25. The calculations were repeated and this time a configuration like that in Fig. 19 did result. However, to test that the body was cohesive, particle $P_{19}$ was removed. Further iteration resulted in $P_{23}$, $P_{26}$, and $P_{28}$ sliding down to replace $P_{19}$, $P_{23}$, and $P_{26}$, respectively. Thus, the body simulated a pile of sand. To remedy this situation the mass was increased to
unity, and the calculations were repeated. The results now simulated a gravitational collapse in which each particle showed exceptionally strong attraction to all other particles. Indeed, the entire body initially rose upward and gradually settled down into a triangular body similar to, but smaller than, that shown in Fig. 19. To test that the resulting body was cohesive, particle $P_{19}$ was removed again. Further iteration showed that $P_1$, $P_8$, and $P_{14}$ had moved up the side of the triangle while $P_{23}$, $P_{26}$, and $P_{28}$ had moved down, thereby filling in the void left by $P_{19}$. Thus, the resulting body was cohesive, but not in the way one would expect a large solid body to be. The problem was that $P_1$ and $P_{28}$, for example, were attracting each other very strongly, whereas for a large body this would not happen, since interparticle forces are local. There are then two possible ways to remedy this last problem. One can take n much larger than 28. In this way the distance between the furthest-separated particles increases and the resulting interparticle force becomes
FIG.20. Equilibrium points of a triangular solid.
negligible. To economize, however, the following alternative was used. We merely set $F_{ij,k}$ equal to zero if $r_{ij,k} > 1.5$. With this modification, the calculations were repeated again and were finally successful. The resulting equilibrium position is shown in Fig. 20 and all positions and velocities are recorded in Table III. Removal of particle $P_{19}$ resulted in no additional structural movement in two thousand additional time steps.
We are now ready to consider the Stefan problem relative to the triangular configuration shown in Fig. 20. Consider, then, placing a heat source near but to the right of the uppermost particle $P_{28}$ in Fig. 20. Suppose the heat source is at (0.2, 5.7). Heat will be transferred to the body by increasing the velocity and hence the kinetic energy of various body particles as follows. At each time step $t_k$, consider the distance $R_i$ of particle $P_i$ from the heat source. If $R_i \ge 2.5$, then the velocity of $P_i$ is left unchanged. If $R_i < 2.5$, then its velocity components are
TABLE III
EQUILIBRIUM POSITIONS AND VELOCITIES: TRIANGULAR BODY
  i        x         y        v_x       v_y
  1    -3.2652    0.0000    0.0000    0.0005
  2    -2.1924    0.0000    0.0000    0.0065
  3    -1.0895    0.0000    0.0000    0.0018
  4     0.0000    0.0000    0.0000    0.0016
  5     1.0895    0.0000    0.0000    0.0018
  6     2.1924    0.0000    0.0000    0.0065
  7     3.2652    0.0000    0.0000    0.0005
  8    -2.6754    0.8909   -0.9437    1.0055
  9    -1.6310    0.8990    2.0010    3.7973
 10    -0.5445    0.9032    4.1844   -0.9128
 11     0.5445    0.9032   -4.1844   -0.9128
 12     1.6310    0.8990   -2.0010    3.7973
 13     2.6754    0.8909    0.9437    1.0055
 14    -2.1366    1.8374    0.9267    0.4191
 15    -1.0763    1.8146   -2.5412   -2.1506
 16     0.0000    1.7923    0.0000   -1.5731
 17     1.0763    1.8146    2.5412   -2.1506
 18     2.1366    1.8374   -0.9267    0.4191
 19    -1.6674    2.7530   -0.9468    0.6165
 20    -0.5441    2.6879   -1.8075    2.1719
 21     0.5441    2.6879    1.8075    2.1719
 22     1.6674    2.7530    0.9468    0.6165
 23    -1.0877    3.6478    0.2061   -2.6777
 24     0.0000    3.6174    0.0000    0.4901
 25     1.0877    3.6478   -0.2061   -2.6777
 26    -0.5536    4.5068   -0.5620    0.7410
 27     0.5536    4.5068    0.5620    0.7410
 28     0.0000    5.4351    0.0000   -2.8452
reset to
so that the intensity varies as an inverse square, with r being a positive variation constant, and along the line joining the center of the particle and the center of the heat source. With r = 2, the result is intensive heating and, as shown in Fig. 21, a resultant splash effect. The motion of the interface is clear, as is a final buckling effect in the base (Greenspan, 1978b).
For our next modeling example, let us explore the phenomenon of surface tension in a liquid. Surface tension is most easily isolated when a liquid is already in the thin-film state, for which reason soap films have been studied extensively, both experimentally and theoretically. The problem we will explore is called the minimal surface problem or the Plateau problem, and is described as follows. Determine the shape of the film through a closed, three-dimensional wire which has been dipped into a soapy liquid. Classically, the Plateau problem can be formulated as a steady-state boundary value problem for the nonlinear partial differential equation
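or one in which the functional
$$J = \iint_{R \cup S} \left[ 1 + \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 \right]^{1/2} dA$$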
is to be minimized. No mathematical methods exist for solving the Plateau problem using either of the above formulations. Let us show, then, how easily the particle approach is applicable by considering next a specific, nontrivial problem. Let us construct the minimal surface determined by a wire in the shape of a skew quadrilateral, as shown in Fig. 22. Any soap film bounded by this wire will contract in all possible ways to minimize its surface area. To simulate this surface, consider a planar arrangement of 81 particles, arranged in a triangular mosaic, as shown in Fig. 23. Let the distance between any two adjacent particles be unity and the local interaction distance be 1.3. The configuration is folded along the diagonal from $P_{66}$ to Pal into two planar sections and set into a spatial configuration like that shown in Fig. 22, with the line segments joining the points $P_{66}$, $P_{50}$, Pal, and PSI, consecutively, forming the skew quadrilateral. Next the force parameters $p = 2$, $q = 4$, $G = 1.0$, $H = 0.5$, and the time step 0.001 are fixed. The 32 boundary particles are fixed for all time and the interior
FIG.21. Melting of a solid.
FIG.22. A skew quadrilateral.
FIG.23. A triangular mosaic of particles.
FIG.24. Minimal surface through a skew quadrilateral.
particles are assigned zero initial velocities. The interior particles are then allowed to vibrate to steady state. Gravity is neglected because the weight of the film is negligible. Each interior particle is allowed to interact only with neighbors within the interaction distance. The resulting steady-state configuration is shown in Fig. 24 and is in agreement with experimental results. The steady-state positions of the particles are recorded in Coppin and Greenspan (1983). Figure 25 shows the steady-state configuration of a stretched elastic sheet, generated in a fashion similar to that described above.
As a final application, let us consider fundamental motion of elastic strings. A discrete string is an ordered set of $m + 2$ homogeneous particles $P_j$, $j = 0, 1, \ldots, m, m + 1$, as shown in Fig. 26, with respective centers $(x_{j,k}, y_{j,k})$ at time $t_k$. The problem to which we will direct our attention is, primarily, that of describing the return of a discrete string to a position of equilibrium from an initial position of tension. For the present, assume that $P_0$ and $P_{m+1}$ are fixed, while $P_1, P_2, \ldots, P_m$ are free to move. Thus, $x_{0,k} = y_{0,k} = y_{m+1,k} = 0$. Also for the present, assume that
FIG. 25. Steady-state, stretched elastic sheet configuration.
any horizontal motion of each moving particle is negligible, so that the transverse vibration to be described will be the gross result of each particle’s oscillation in the Y direction only. In studying the motion of each $P_j$, $j = 1, 2, \ldots, m$, we will consider gravity, tensile, and viscous forces. For this purpose, let $T_1$ be the tensile force between $P_{j-1}$ and $P_j$, and let $T_2$ be the tensile force between $P_j$ and $P_{j+1}$, as shown in Fig. 27. Let the viscosity vary with the velocity. Then, for each $P_j$, the force at $t_k$ is taken to be
$j = 1, 2, \ldots, m$; $k = 0, 1, 2, \ldots, n - 1$
where $a \ge 0$ and $m$ is the mass of each particle (Greenspan, 1970, 1971).
FIG. 26. A discrete string.
FIG. 27. Force on a typical string particle $P_j$.
FIG.28. Nonlinear motion of a 21-particle string.
Figure 28 shows wave motion of a 21-particle string with $a = 0.15$, $m = 0.05$, $g = 0$, and
The initial configuration is that shown at t = 0.00 and the initial velocities are all zero (Greenspan, 1973a). Figure 29 shows how the wave motion displayed in Fig. 28 varies if one chooses different force laws and increases the number of particles to 1001. The wave labeled C was generated by using Hooke’s law. The waves labeled A and B were generated by using different nonlinear force laws. Only the nonlinear waves display the common phenomenon of trailing waves. Figure 30 shows the motion and trailing-wave generation of a particular solitary wave, or soliton (Greenspan, 1972a). Figure 31 shows the results of a completely different kind of string motion. In it are shown the motions of two heavy, elastic strings in which one end has been released while the other remains fixed. Of course, both gravity and motion in the X direction are included in this model (Schubert and Greenspan, 1972).
FIG. 29. Wave configurations with different force laws and 1001 particles.
FIG. 30. Soliton motion.
FIG. 31. Motion of heavy, elastic strings fixed at one end.
F. Models with Self-Reorganization: Celestial Phenomena, Biological Cell Sorting
Since both celestial and molecular systems exhibit remarkable self-reorganization phenomena, it is natural that modeling which incorporates both long-range and local forces possesses the same self-reorganization capabilities. It is to such models that we turn now. Binary star systems in which the two rotating stars are visible separately have long been of interest to astronomers. For many such visual binaries, judicious combination of observational data and Kepler’s laws allows the
determination of the mass of each star of the system. Interest in binary systems, however, has increased since applications of modern telescopic, spectroscopic, and computer techniques have revealed that most stars are binary systems, classified now as visual, spectroscopic, and eclipsing pairs. Especially in the case of close binaries, a variety of interesting phenomena also have been discovered, including strong sources of X-rays, evolutionary and dynamical phenomena which occur in very short periods of time (on the order of two days), and the presence of circumscribed gaseous material which emits radio waves and exhibits strong tidal and streaming motions (Kopal, 1978).
Let us show, then, how to model close binary systems in the spirit described in Section II, B. The long-range force will now be gravitation. Let D be a positive constant. Let $r_{ij,k}$ be the distance between $P_i$ and $P_j$. Then, if $r_{ij,k} > D$, the force on $P_i$ due to $P_j$ is taken to be gravitation, and is given by
$$F_{ij,k} = -\frac{G^* m_i m_j}{(r_{ij,k})^2}\,\frac{\mathbf{r}_{ji,k}}{r_{ij,k}}$$
where $G^*$ is a gravitational constant. If $r_{ij,k} \le D$, then the force on $P_i$ exerted by $P_j$ is taken to be the local force
$$F_{ij,k} = \left[ -\frac{G m_i m_j}{(r_{ij,k})^p} + \frac{H m_i m_j}{(r_{ij,k})^q} \right] \frac{\mathbf{r}_{ji,k}}{r_{ij,k}}$$
where G, H, p, and q are positive constants. The total force $F_{i,k}$ on $P_i$ at $t_k$ due to all particles different from $P_i$ is defined by
$$F_{i,k} = \sum_{\substack{j=1 \\ j \ne i}}^{n} F_{ij,k}$$
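The switch between the long-range and the local force at the distance D, followed by the summation over all other particles, can be sketched in a few lines of Python. This is a minimal illustration and not the author's program: the names `pair_force` and `total_force` are hypothetical, the long-range branch is written as ordinary inverse-square gravitation (an assumption of this sketch), the unit vector is taken to point from $P_j$ to $P_i$, and the default parameter values are the ones quoted later in this section for the binary-star computation.

```python
import math


def pair_force(pos_i, pos_j, m_i, m_j,
               G_star=0.08, G=5.0, H=5.0, p=4, q=6, D=2.3):
    """Force on P_i due to P_j; positions are (x, y) tuples."""
    dx = pos_i[0] - pos_j[0]      # components of the vector from P_j to P_i
    dy = pos_i[1] - pos_j[1]
    r = math.hypot(dx, dy)
    if r > D:
        # Long-range branch: inverse-square attraction (assumed form).
        magnitude = -G_star * m_i * m_j / r**2
    else:
        # Local branch: attraction ~ r**-p plus repulsion ~ r**-q.
        magnitude = -G * m_i * m_j / r**p + H * m_i * m_j / r**q
    return (magnitude * dx / r, magnitude * dy / r)


def total_force(i, positions, masses, **params):
    """Sum of the pair forces on particle i from every other particle."""
    fx = fy = 0.0
    for j, (pos_j, m_j) in enumerate(zip(positions, masses)):
        if j != i:
            f = pair_force(positions[i], pos_j, masses[i], m_j, **params)
            fx += f[0]
            fy += f[1]
    return (fx, fy)
```

With G = H, the local force vanishes at unit separation, which is consistent with a mosaic in which each particle sits one unit from its immediate neighbors.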
Let us consider modeling the evolution of a close binary system from a planar, rotating, gaseous body whose density is not uniform. Let the particles $P_1, P_2, \ldots, P_{239}$ be arranged in a triangular mosaic, as shown in Fig. 32. Each particle is one unit from each of its immediate neighbors and the entire configuration is relatively circular. To simulate the nonuniformity of the density, the particles have been assigned masses, scaled appropriately and distributed at random, from the values 10,000, 8000, 6000, 4000, and 2000. The numbers of particles assigned these masses are 2, 4, 12, 26, and 195, respectively. The exact assignment is given in Greenspan (1983) and is shown in Fig. 32 by the use of circles of decreasing radii, the largest corresponding to a mass of 10,000 and, in decreasing order, the smallest corresponding to a mass of 2000. For clarity in later discussions, circles representing masses of 2000 will be left unshaded and will be called light particles, while all other circles will be shaded and will be called heavy particles. The entire system is then set into counterclockwise rotation about the origin with an angular speed $\dot{\theta} = 4$, but with random small perturbations
FIG.32. Initial configuration.
superimposed (Greenspan and Collier, 1978). The initial velocity directions are shown in Fig. 32. Next, let us set the parameter values as $G^* = 0.08$, $G = H = 5$, $p = 4$, $q = 6$, $D = 2.3$ (Greenspan, 1983). The resulting particle motions were then determined with $\Delta t$ = [value illegible in source] through $t_{32,000}$. Figure 33 shows at $t_{5000}$ that the heavy particles are self-reorganizing into four groups, while light particles have formed an outer layer of the system. Figure 34 shows clearly at $t_{12,000}$ the type of fluid distortion which is characteristic of the system as it rotates. Also, it shows that the four groups of heavy particles are now reorganizing into two larger subgroups. Figures 35-37 show the very slow motion of the two concentrations of heavy particles toward each other at the respective times $t_{16,000}$, $t_{24,000}$, and $t_{32,000}$. At these times the numbers of light particles which either have escaped or are too far out to be included in the graphs are 15, 31, and 42, respectively. The rapid rotation, loss of mass, and formation of two heavy cores are characteristic of close binary systems and are already present in the figures.
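The initial data just described can be mimicked as follows. This Python sketch is only a stand-in: the exact mass assignment is the one recorded in Greenspan (1983), whereas here the masses are merely shuffled at random with a hypothetical seed; the names `assign_masses` and `initial_velocity` and the `jitter` parameter (the size of the small random perturbation) are likewise assumptions.

```python
import random

# Number of particles assigned each mass, as listed in the text.
MASS_COUNTS = {10000: 2, 8000: 4, 6000: 12, 4000: 26, 2000: 195}


def assign_masses(seed=0):
    """Return the list of particle masses in random order (illustrative only)."""
    masses = [mass for mass, count in MASS_COUNTS.items() for _ in range(count)]
    random.Random(seed).shuffle(masses)
    return masses


def initial_velocity(x, y, omega=4.0, jitter=0.0, rng=None):
    """Counterclockwise rigid rotation about the origin, v = omega * (-y, x),
    with an optional small random perturbation of each component."""
    vx, vy = -omega * y, omega * x
    if rng is not None and jitter > 0.0:
        vx += rng.uniform(-jitter, jitter)
        vy += rng.uniform(-jitter, jitter)
    return (vx, vy)


masses = assign_masses()
print(len(masses))   # 239 particles in total
```

A particle is "heavy" in the sense of the text whenever its mass exceeds 2000, so the shaded circles of Fig. 32 correspond to the 44 entries of `masses` above that value.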
FIG. 33. $G^* = 0.08$, $T = t_{5000}$.
From a similar calculation (Greenspan and Collier, 1978), but with small modifications in both the force parameters and the initial mass distribution, Figure 38 shows an evolutionary stage of a moonlike body from a hot, swirling gas. The hexagonal shapes denote solid particles, while the circular shapes denote liquid particles. Since the determination of a physical state at the molecular level is exceedingly complex (Barker and Henderson, 1976), the following simplistic rules were implemented. A gas particle is one whose velocity is sufficiently large to prevent bonding. A liquid particle is one which is not a gas particle, but whose velocity allows it to change bonds readily. All other particles are classified as solid particles. The solid particles shown in Fig. 38 do, indeed, move as a rigid configuration.
Next let us explore the biological phenomenon called cell sorting. Cell sorting is a form of biological self-reorganization which has received wide attention in recent years as a result of some startling experimental results (Steinberg, 1963). Holtfreter, for example, began with a mixture of individual mesoderm, endoderm, and ectoderm cells and was able to induce self-reorganization in a normal tissue in which the mesoderm cells were interior to
FIG. 34. $G^* = 0.08$, $T = t_{12,000}$.
the tissue and the endoderm and ectoderm were at the periphery. Weiss and Taylor more recently cultured aggregates of cells from different organs and were able to induce organogenesis which approached the complexity of normal organization. Thus certain biological cells, which originally were arranged by type, will tend to self-reorganize into their original structure after being separated and mixed. The process is called cell sorting. To model cell sorting, consider 81 particles of three different masses, as shown in Fig. 39a. Let us choose the local parameters to be $p = 4$, $q = 6$, $G = H = 5$ and neglect the long-range force. Let the central velocities of the particles be chosen so that the entire configuration is in a liquid state (Greenspan, 1981a). Then Fig. 39 shows the natural self-reorganization of the particles into a layered configuration similar to that of the Holtfreter experiment. Note, finally, that if one is willing to allow the local-force parameters G and H to vary with time, then one can simulate changing surface potentials. This
FIG. 35. $G^* = 0.08$, $T = t_{16,000}$.
has been applied to model that stage in the development of volvox in which it matures and inverts (Greenspan, 1980e).
G. Quantitative Modeling
Thus far, the models developed have all been qualitative in nature. In order for them to be quantitative, one must be able to incorporate experimental results into the local-force formulas of any model. In this section we will show how to do this for a problem involving stress waves in aluminum bars. Consider a set of n ordered particles $P_1, P_2, \ldots, P_n$ along the X axis. Let each particle interact locally only with its immediate neighbors. If $P_i$ and $P_{i+1}$ are a consecutive pair of particles, let the force between them have magnitude F given by
$$F = -\frac{G m_i m_{i+1}}{r^p} + \frac{H m_i m_{i+1}}{r^q} \tag{63}$$
This equation is a nonlinear function of the separation distance r between $P_i$ and $P_{i+1}$. The equilibrium distance $r_0$, that is, the value of r for which F = 0,
FIG. 36. $G^* = 0.08$, $T = t_{24,000}$.
satisfies
$$G/H = r_0^{\,p-q} \tag{64}$$
Note, however, that in a small neighborhood about $r_0$, F in (63) is approximately linear and can be related to Young's modulus as follows. Classically, the modulus of elasticity E is the derivative of the stress with respect to the strain at the zero-strain point. The strain $\epsilon$ is defined by
$$\epsilon = (r - r_0)/r_0 \tag{65}$$
and the stress as $F/A$, where A is the area over which the force acts. Hence
FIG. 37. $G^* = 0.08$, $T = t_{32,000}$.
But, from (63),
while from (65),
Thus (64)-(67) yield
FIG. 38. Fluid-solid phase of planetary evolution.
Substitution into (63) yields then
in which E, A, and $r_0$ represent physical properties of the problem. Application of the leap-frog formulas with $p = 2$ and $q = 4$ in (68) to a 20-particle simulation of a 10-in. cylindrical aluminum bar with diameter 4 in. yields the bar strain history shown in Fig. 40 in response to a sinusoidal impulsive force on the left end (Reeves and Greenspan, 1982). The results are in complete agreement with theoretical prediction.
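The near-linearity of (63) about the equilibrium distance, on which the link to Young's modulus rests, is easy to check numerically. The sketch below is an illustration only: the function names are hypothetical, the values G = 1 and H = 0.25 are arbitrary choices rather than calibrated aluminum data, and unit masses are assumed. It solves (64) for $r_0$ and compares a numerical slope of F at $r_0$ with the closed form $H m_i m_{i+1}(p - q)/r_0^{q+1}$, which follows from differentiating (63) and using (64).

```python
def bond_force(r, G, H, p, q, m_i=1.0, m_j=1.0):
    """Magnitude F of the neighbor force in (63)."""
    return -G * m_i * m_j / r**p + H * m_i * m_j / r**q


def equilibrium_distance(G, H, p, q):
    """Solve G/H = r0**(p - q), relation (64), for r0."""
    return (G / H) ** (1.0 / (p - q))


def strain(r, r0):
    """Strain as defined in (65)."""
    return (r - r0) / r0


# Hypothetical parameters; p = 2 and q = 4 are the exponents used in the text.
G, H, p, q = 1.0, 0.25, 2, 4
r0 = equilibrium_distance(G, H, p, q)

# Central-difference slope dF/dr at r0 versus the closed form (unit masses).
h = 1e-6
slope_numeric = (bond_force(r0 + h, G, H, p, q)
                 - bond_force(r0 - h, G, H, p, q)) / (2 * h)
slope_exact = H * (p - q) / r0 ** (q + 1)

print(r0, bond_force(r0, G, H, p, q), slope_numeric, slope_exact)
```

Since $dr/d\epsilon = r_0$ by (65), multiplying the slope by $r_0/A$ gives the derivative of stress with respect to strain at zero strain; the sign convention that turns this into a positive modulus is not fixed by the sketch.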
III. SPECIAL RELATIVISTIC MECHANICS
Precise modeling of very distant phenomena or of objects whose speed is close to the speed of light cannot be accomplished through Newtonian mechanics, but requires a relativistic formulation. For this reason, attention is
FIG. 39. Biological self-sorting. (a) $t = t_0$, (c) $t = t_{7,000}$, (d) $t = t_{12,000}$, (f) $t = t_{32,000}$; the time labels of panels (b) and (e) are not legible.
directed now to special relativistic mechanics, and, for clarity, assumptions and results necessary for the discussion will be recalled first. Consider two rectangular, Cartesian, inertial coordinate systems XYZ, X'Y'Z' which are initially coincident. For the present, assume that X'Y'Z' is in uniform motion with speed u in the X direction only relative to XYZ. We call XYZ the lab frame and X'Y'Z' the rocket frame. At the origins of the reference frames are observers with identical, synchronized clocks.
FIG. 40. Uniform bar strain history (free-free end conditions).
Let particle P be at $(x, y, z)$ at time $t$ in the lab frame and let it be at $(x', y', z')$ at corresponding time $t'$ in the rocket frame. Then $(x, y, z, t)$ and $(x', y', z', t')$ are called events. The precise mathematical formulas relating events are the linear Lorentz transformation formulas:
$$x' = \frac{c(x - ut)}{(c^2 - u^2)^{1/2}}, \quad y' = y, \quad z' = z, \quad t' = \frac{c^2 t - ux}{c(c^2 - u^2)^{1/2}} \tag{69}$$
which are, of course, equivalent to
$$x = \frac{c(x' + ut')}{(c^2 - u^2)^{1/2}}, \quad y = y', \quad z = z', \quad t = \frac{c^2 t' + ux'}{c(c^2 - u^2)^{1/2}} \tag{70}$$
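Formulas (69) and (70) translate directly into code. The following Python sketch, with hypothetical function names and units in which c = 1 by default, applies (69) to a lab-frame event and then (70) to the result, so that the original event should be recovered up to rounding.

```python
import math


def lab_to_rocket(x, y, z, t, u, c=1.0):
    """Lorentz formulas (69): lab-frame event -> rocket-frame event."""
    s = math.sqrt(c * c - u * u)          # requires |u| < c
    return (c * (x - u * t) / s, y, z, (c * c * t - u * x) / (c * s))


def rocket_to_lab(xp, yp, zp, tp, u, c=1.0):
    """Inverse formulas (70): rocket-frame event -> lab-frame event."""
    s = math.sqrt(c * c - u * u)
    return (c * (xp + u * tp) / s, yp, zp, (c * c * tp + u * xp) / (c * s))


lab_event = (1.0, 2.0, 3.0, 4.0)                 # (x, y, z, t), arbitrary values
rocket_event = lab_to_rocket(*lab_event, u=0.6)
round_trip = rocket_to_lab(*rocket_event, u=0.6)
print(rocket_event)
print(round_trip)
```

Apart from the single square root shared by both directions, only additions, multiplications, and divisions are involved.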
Observe that the Lorentz formulas involve only arithmetic operations. Note also that it will be mathematically convenient to avoid division by zero, so we will assume that $|u| < c$. Though, initially, special relativistic mechanics was applied to electromagnetics (Einstein, 1905), it was shown subsequently that, for particle motion in the X direction, if the Newtonian dynamical equations were modified in the lab frame to (71)
then the motion of P in the rocket frame was governed by the dynamical equation
In (71) and (72), c is the speed of light and $m_0$ is a constant called the rest mass of P. The fundamental result incorporated in (71)-(72), called symmetry, is that the structures of the laws of motion in both systems are identical.
A. Theory in One Space Dimension
Let us develop first a completely arithmetic approach to special relativistic mechanics under the simplistic assumption that the rocket frame is moving with speed u in the X direction relative to the lab frame. We will see later that the arithmetic approach not only simplifies the formalism, but that it is important also because the solution of dynamical problems will necessitate the availability of identical computers in the lab and rocket frames. For $\Delta t > 0$, let $t_k = k\,\Delta t$, $k = 0, 1, \ldots$. Let particle P be at $(x_k, y_k, z_k)$ at time $t_k$ in the lab frame and let it be at $(x'_k, y'_k, z'_k)$ in the rocket frame at the corresponding time $t'_k$. Then $(x_k, y_k, z_k, t_k)$ and $(x'_k, y'_k, z'_k, t'_k)$ are called (discrete) events. The formulas relating these events are, of course,
$$x'_k = \frac{c(x_k - u t_k)}{(c^2 - u^2)^{1/2}}, \quad y'_k = y_k, \quad z'_k = z_k, \quad t'_k = \frac{c^2 t_k - u x_k}{c(c^2 - u^2)^{1/2}} \tag{73}$$
$$x_k = \frac{c(x'_k + u t'_k)}{(c^2 - u^2)^{1/2}}, \quad y_k = y'_k, \quad z_k = z'_k, \quad t_k = \frac{c^2 t'_k + u x'_k}{c(c^2 - u^2)^{1/2}} \tag{73'}$$
In the rocket frame, the proper time of $(x'_k, y'_k, z'_k, t'_k)$ is defined by
$$\tau_k = (c^2 t'^2_k - x'^2_k - y'^2_k - z'^2_k)^{1/2}$$
provided that
$$c^2 t'^2_k - x'^2_k - y'^2_k - z'^2_k > 0 \tag{74}$$
The proper time $\tau_k$ is invariant under the Lorentz transformation; that is, $\tau_k$ is also the proper time of event $(x_k, y_k, z_k, t_k)$:
$$\tau_k = (c^2 t_k^2 - x_k^2 - y_k^2 - z_k^2)^{1/2} \tag{75}$$
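To prove this, note that, by (73),
$$c^2 t'^2_k - x'^2_k - y'^2_k - z'^2_k = \frac{(c^2 t_k - u x_k)^2 - c^2 (x_k - u t_k)^2}{c^2 - u^2} - y_k^2 - z_k^2 = c^2 t_k^2 - x_k^2 - y_k^2 - z_k^2$$
from which the invariance follows immediately. Note, also, that this invariance and (74) imply
$$c^2 t_k^2 - x_k^2 - y_k^2 - z_k^2 > 0 \tag{76}$$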
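The invariance can also be confirmed numerically for any particular event. In the Python sketch below, the event and the speed u are arbitrary illustrative values, the function names are hypothetical, and c = 1; the quantity $c^2t^2 - x^2 - y^2 - z^2$ is evaluated in both frames and the proper time is taken as its square root, as in (74)-(75).

```python
import math


def lab_to_rocket(x, y, z, t, u, c):
    """Lorentz formulas relating a lab-frame event to the rocket frame."""
    s = math.sqrt(c * c - u * u)
    return (c * (x - u * t) / s, y, z, (c * c * t - u * x) / (c * s))


def interval_squared(x, y, z, t, c):
    """The quantity c^2 t^2 - x^2 - y^2 - z^2 appearing in (74)-(76)."""
    return c * c * t * t - x * x - y * y - z * z


c, u = 1.0, 0.6
lab_event = (1.0, 2.0, 3.0, 4.0)                  # (x, y, z, t)
rocket_event = lab_to_rocket(*lab_event, u, c)

s2_lab = interval_squared(*lab_event, c)
s2_rocket = interval_squared(*rocket_event, c)
tau = math.sqrt(s2_lab)                           # defined because s2_lab > 0

print(s2_lab, s2_rocket, tau)
```

The two printed interval values agree up to rounding error, which is the content of (75).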
Next, in the usual forward difference operator notation

$$\Delta F(k) = F(k + 1) - F(k)$$

note that the quantity $\delta\tau_k$ defined by

$$\delta\tau_k = [c^2(\Delta t'_k)^2 - (\Delta x'_k)^2 - (\Delta y'_k)^2 - (\Delta z'_k)^2]^{1/2} \tag{77}$$

is also invariant, that is, under the Lorentz transformation,

$$\delta\tau_k = [c^2(\Delta t_k)^2 - (\Delta x_k)^2 - (\Delta y_k)^2 - (\Delta z_k)^2]^{1/2} \tag{78}$$
As in (74), we will assume that

$$c^2(\Delta t'_k)^2 - (\Delta x'_k)^2 - (\Delta y'_k)^2 - (\Delta z'_k)^2 > 0 \tag{79}$$

which implies

$$c^2(\Delta t_k)^2 - (\Delta x_k)^2 - (\Delta y_k)^2 - (\Delta z_k)^2 > 0 \tag{80}$$

The quantity $\delta\tau_k$ is called the proper time between successive events $(x'_k, y'_k, z'_k, t'_k)$ and $(x'_{k+1}, y'_{k+1}, z'_{k+1}, t'_{k+1})$. Of course, it is also the proper time between successive events $(x_k, y_k, z_k, t_k)$ and $(x_{k+1}, y_{k+1}, z_{k+1}, t_{k+1})$. Note that $\delta\tau_k$ is not, in general, the same as $\Delta\tau_k$.

At first, to study the motion of a particle $P$, we assume that this motion is in the $X$ direction only. To analyze the motion of $P$ it will be convenient to have arithmetic concepts of velocity, acceleration, and a dynamical equation. Let us then define velocity and acceleration first. If $P$ is at $(x_k, y_k, z_k)$ at $t_k$ in the lab frame and at $(x'_k, y'_k, z'_k)$ at the corresponding time $t'_k$ in the rocket frame, then at $t_k$, the velocity $v(t_k) = v_k$ and acceleration $a(t_k) = a_k$ of $P$ are defined in the lab frame by the forward difference quotients

$$v_k = \Delta x_k / \Delta t_k \tag{81}$$

$$a_k = \Delta v_k / \Delta t_k \tag{82}$$
while in the rocket frame, the velocity $v'(t'_k) = v'_k$ and acceleration $a'(t'_k) = a'_k$ of $P$ are defined by

$$v'_k = \Delta x'_k / \Delta t'_k \tag{83}$$

$$a'_k = \Delta v'_k / \Delta t'_k \tag{84}$$
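In order to develop connecting relationships between $v_k$ and $v'_k$ and between $a_k$ and $a'_k$, note first that the Lorentz transformation (73) implies

$$\Delta x'_k = \frac{c(\Delta x_k - u\,\Delta t_k)}{(c^2 - u^2)^{1/2}} \tag{85}$$

$$\Delta y'_k = \Delta y_k \tag{86}$$

$$\Delta z'_k = \Delta z_k \tag{87}$$

$$\Delta t'_k = \frac{c^2\,\Delta t_k - u\,\Delta x_k}{c(c^2 - u^2)^{1/2}} \tag{88}$$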
Hence, (83), (85), and (88) imply

$$v'_k = \frac{c^2(v_k - u)}{c^2 - u v_k} \tag{89}$$

while (84), (88), and (89) imply

$$a'_k = \frac{c^3(c^2 - u^2)^{3/2}}{(c^2 - u v_{k+1})(c^2 - u v_k)^2}\, a_k \tag{90}$$
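Formulas (89) and (90) can be checked against difference quotients formed directly from transformed events. The sketch below assumes $c = 1$ and three illustrative lab events; the helper names are hypothetical and the data are not from the text.

```python
import math

def lab_to_rocket_xt(x, t, u, c=1.0):
    """One-dimensional part of the Lorentz transformation (73)."""
    s = math.sqrt(c * c - u * u)
    return c * (x - u * t) / s, (c * c * t - u * x) / (c * s)

def rocket_velocity(v, u, c=1.0):
    """Formula (89)."""
    return c * c * (v - u) / (c * c - u * v)

def acceleration_factor(v_k, v_k1, u, c=1.0):
    """Coefficient of a_k in formula (90)."""
    return c ** 3 * (c * c - u * u) ** 1.5 / ((c * c - u * v_k1) * (c * c - u * v_k) ** 2)

# Three successive lab events of P (illustrative values, c = 1).
c, u, dt = 1.0, 0.5, 0.1
ts = [0.0, dt, 2 * dt]
xs = [0.0, 0.03, 0.07]

# Lab-frame difference quotients (81) and (82).
v = [(xs[k + 1] - xs[k]) / dt for k in range(2)]
a0 = (v[1] - v[0]) / dt

# The quotients (83) and (84), formed from the transformed events.
ev = [lab_to_rocket_xt(x, t, u, c) for x, t in zip(xs, ts)]
vp = [(ev[k + 1][0] - ev[k][0]) / (ev[k + 1][1] - ev[k][1]) for k in range(2)]
ap0 = (vp[1] - vp[0]) / (ev[1][1] - ev[0][1])
```

Here `vp[0]` should agree with `rocket_velocity(v[0], u, c)`, and `ap0` with `acceleration_factor(v[0], v[1], u, c) * a0`, up to rounding.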
Of course, (89) is equivalent to

$$v_k = \frac{c^2(v'_k + u)}{c^2 + u v'_k} \tag{91}$$

while (90) is equivalent to

$$a_k = \frac{c^3(c^2 - u^2)^{3/2}}{(c^2 + u v'_{k+1})(c^2 + u v'_k)^2}\, a'_k \tag{92}$$

Next, let $m(k)$ be the mass of particle $P$ which has velocity $v_k$ at time $t_k$ in the lab frame. Also, let $m'(k)$ be the mass of $P$ when its velocity is $v'_k$ at corresponding time $t'_k$ in the rocket frame. Then the linear momentum $p_k$ of $P$ at $t_k$ in the lab frame is defined by

$$p_k = m(k) v_k \tag{93}$$

while the linear momentum of $P$ in the rocket frame is defined by

$$p'_k = m'(k) v'_k \tag{94}$$

It is now worth noting that formula (89) for the transformation of velocities is identical with that of continuous relativistic mechanics. Thus the usual arguments (Taylor and Wheeler, 1966), in which identical objects undergo elastic collisions, yield the results that if momentum is to be conserved, then it is necessary that $m(k)$ and $m'(k)$ satisfy

$$m(k) = \frac{c\,m_0}{(c^2 - v_k^2)^{1/2}} \tag{95}$$

$$m'(k) = \frac{c\,m_0}{(c^2 - {v'_k}^2)^{1/2}} \tag{96}$$
where $m_0$ is the rest mass of $P$. We do assume that the law of conservation of momentum is valid, whether or not the colliding objects are identical and whether or not the collision is elastic, so (95) and (96) are valid. Next, note that the Newtonian dynamical difference equation $F_k = m a_k$ is not symmetric with respect to the Lorentz transformation. Thus we must define a new dynamical equation. This equation, together with (81) and (82), or (83) and (84), will enable one to determine the motion of a given particle from given initial data. In the rocket frame, the equation we select is

$$F'_k = \frac{c^2 m'(k)}{[(c^2 - {v'_k}^2)(c^2 - {v'_{k+1}}^2)]^{1/2}}\, \frac{\Delta v'_k}{\Delta t'_k} \tag{97}$$

because (Greenspan, 1976a), under the Lorentz transformation, (97) maps into

$$F_k = \frac{c^2 m(k)}{[(c^2 - v_k^2)(c^2 - v_{k+1}^2)]^{1/2}}\, \frac{\Delta v_k}{\Delta t_k} \tag{98}$$

in the lab frame. Of course, (97) and (98) are structurally the same, thus establishing symmetry. Note that taking limits in, say, (98) yields the particular form

$$F = \frac{c^2 m}{c^2 - v^2}\, \frac{dv}{dt}$$
of the invariant relativistic differential equation (71), where $m$ is defined as in (95), but with the index $k$ deleted. For completeness, let us consider energy next. At the time $t_k$ in the lab frame, the total energy $E$ of particle $P$ is defined by

$$E = m(k) c^2 \tag{99}$$

From (95) and (99), then,

$$E = \frac{c^2 m_0}{(1 - v_k^2/c^2)^{1/2}}$$

so

$$E = c^2 m_0 \left(1 + \tfrac{1}{2}(v_k^2/c^2) + \cdots\right)$$

or

$$E = c^2 m_0 + \tfrac{1}{2} m_0 v_k^2 + \cdots \tag{100}$$
The quantity $\tfrac{1}{2} m_0 v_k^2$ is, of course, the classical Newtonian kinetic energy. For the special case $v_k = 0$, (100) reduces to

$$E_0 = m_0 c^2$$

which is called the rest energy, or proper energy, of $P$. Another convenient formula for expressing energy is (Greenspan, 1976a)

$$E = m_0 c^3\, \Delta t_k / \delta\tau_k$$

Next let us derive relationships which connect energy and momentum. Elimination of $m(k)$ between (93) and (99) yields

$$p_k c^2 = v_k E$$

A second interesting relationship is

$$E^2 = p_k^2 c^2 + m_0^2 c^4 \tag{101}$$
which follows from (93), (95), and (99). The special significance of (101) is that it and the conservation of linear momentum imply the conservation of energy. Finally, let us consider the momentum-energy vector. Since, for all practical purposes, the restricted type of motion studied in this section never requires consideration of $y_k$ and $z_k$, we will confine attention now to the event $(x_k, t_k)$, rather than to $(x_k, y_k, z_k, t_k)$. The event $(x_k, t_k)$ maps into the event $(x'_k, t'_k)$ under the Lorentz transformation. Also, thus far we have not placed any emphasis on any particular set of measurement units. In this connection, we will now be specific in the following way. Let

$$E^* = E/c^2 \tag{102}$$

be a normalized energy in the sense that the units of $E^*$ are units of mass. Our present purpose is to show that the number couple $(p_k, E^*)$, where $p_k$ is given by (93) and $E^*$ is given by (102), is a vector, called the momentum-energy vector. Precisely, this means that $(p_k, E^*)$ maps under the Lorentz transformation exactly as does $(x_k, t_k)$. Thus we wish to show that

$$p'_k = \frac{c(p_k - u E^*)}{(c^2 - u^2)^{1/2}} \tag{103}$$

$$E^{*\prime} = \frac{c^2 E^* - u p_k}{c(c^2 - u^2)^{1/2}} \tag{104}$$

From (89), (93), (94), and (102), then

$$p'_k = \frac{c\,m_0}{(c^2 - v_k^2)^{1/2}} \cdot \frac{c^2 - u v_k}{c(c^2 - u^2)^{1/2}} \cdot \frac{c^2(v_k - u)}{c^2 - u v_k}$$

from which (103) follows. From (89), (93), (94), (99), and (102),

$$E^{*\prime} = m'(k) = \frac{c\,m_0}{(c^2 - v_k^2)^{1/2}} \cdot \frac{c^2 - u v_k}{c(c^2 - u^2)^{1/2}}$$

from which (104) follows, and the result is established.
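The result can be illustrated numerically. The following is a minimal sketch, assuming $c = m_0 = 1$ and arbitrarily chosen speeds $u$ and $v_k$; it computes the rocket-frame momentum and normalized energy once directly from (89), (94), and (96), and once from (103) and (104), and also evaluates both sides of (101). The variable names are illustrative.

```python
import math

def mass(v, m0=1.0, c=1.0):
    """Formulas (95) and (96): mass of P moving with speed v."""
    return c * m0 / math.sqrt(c * c - v * v)

c, m0 = 1.0, 1.0
u, v = 0.4, 0.7                      # illustrative values (assumed)

# Lab-frame momentum (93) and normalized energy (99), (102).
p = mass(v, m0, c) * v
e_star = mass(v, m0, c)

# Direct rocket-frame values via the velocity formula (89).
v_prime = c * c * (v - u) / (c * c - u * v)
p_direct = mass(v_prime, m0, c) * v_prime
e_direct = mass(v_prime, m0, c)

# The same quantities from the vector transformation (103), (104).
s = math.sqrt(c * c - u * u)
p_103 = c * (p - u * e_star) / s
e_104 = (c * c * e_star - u * p) / (c * s)

# Both sides of the energy-momentum relation (101).
energy = mass(v, m0, c) * c * c
lhs_101 = energy ** 2
rhs_101 = p * p * c * c + m0 * m0 * c ** 4
```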
B. Relativistic Harmonic Oscillation

Oscillation is one of the fundamental types of motion in physics, and harmonic oscillation is fundamental in Newtonian modeling. The classical equation of continuous harmonic motion, that is,

$$\ddot{x} + k^2 x = 0, \quad k > 0$$

is solved easily for any initial or boundary conditions. Let us see, however, what the consequences are of considering a harmonic oscillator whose velocity is large relative to the speed of light, thus requiring a relativistic formulation. Let $P$ be a particle of mass $m$ which is in motion along an $X$ axis in the lab frame. Let the force $F$ acting on $P$ be given by $F = -k^2 x$, $k > 0$. The resulting motion of $P$ is called harmonic motion. By (71), the differential equation of the motion of $P$ is

$$\frac{d}{dt}(mv) = -k^2 x$$

which, in absolute coordinates $m_0 = c = 1$, becomes

$$\ddot{x} + k^2 x (1 - \dot{x}^2)^{3/2} = 0 \tag{105}$$
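One arithmetic way to advance a discrete analogue of (105) in the lab frame is to combine (81) with (98). The sketch below is only an illustration under stated assumptions: the force is evaluated as $F_k = -k^2 x_k$, the constants are $k = m_0 = c = 1$, and the closed-form expression for $v_{k+1}$ is obtained here by solving (98), with $m(k)$ from (95), as a quadratic in $v_{k+1}$; none of the function names or step sizes come from the text.

```python
import math

def step(x, v, dt, k=1.0, m0=1.0, c=1.0):
    """Advance (x_k, v_k) one step using (81) and (98) with F_k = -k^2 x_k (assumed)."""
    force = -k * k * x
    g = force * dt * (c * c - v * v) / (c ** 3 * m0)
    # (98) with m(k) from (95) reads v_next - v = g * sqrt(c^2 - v_next^2);
    # the expression below is the root whose increment has the sign of g.
    v_next = (v + g * math.sqrt(c * c * (1.0 + g * g) - v * v)) / (1.0 + g * g)
    x_next = x + v * dt              # (81): v_k = (x_{k+1} - x_k) / dt
    return x_next, v_next

def residual_98(x, v, v_next, dt, k=1.0, m0=1.0, c=1.0):
    """Difference between the two sides of (98) for one step."""
    m_k = c * m0 / math.sqrt(c * c - v * v)
    rhs = (c * c * m_k / math.sqrt((c * c - v * v) * (c * c - v_next * v_next))
           * (v_next - v) / dt)
    return -k * k * x - rhs

def simulate(x0, v0, dt, n):
    """Return the lists of positions and velocities for n steps."""
    xs, vs = [x0], [v0]
    for _ in range(n):
        x, v = step(xs[-1], vs[-1], dt)
        xs.append(x)
        vs.append(v)
    return xs, vs

# Illustrative run: start at the origin with speed 0.99 c.
xs, vs = simulate(0.0, 0.99, 1e-3, 5000)
```

Because each new velocity satisfies $(v_{k+1} - v_k)^2 = g^2 (c^2 - v_{k+1}^2)$, the computed speeds cannot exceed $c$, which is one practical reason to prefer a step of this form over a generic integrator applied to (105).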
For given initial data $x(0) = \alpha$, $\dot{x}(0) = \beta$, one cannot solve (105) analytically, and so numerical procedures are essential. Thus, a computer must now be introduced into the lab frame. For consistency with relativistic assumptions (Taylor and Wheeler, 1966), an identical computer must be introduced also into the rocket frame. But if one then applies any of the usual numerical methods, like the Runge-Kutta, Taylor, or predictor-corrector methods, to approximate the solution of (105), the numerical results in the lab frame are not related by the Lorentz transformation to the numerical results in the rocket frame. Thus the introduction of computer methodology, in general, leads to a violation of symmetry, with an accompanying loss of the basic physical relationship connecting lab and rocket frame events. The solution to this dilemma is to do the computations by use of (81), (82), and (98) in the lab and (83), (84), and (97) in the rocket. Use of these formulas yields numerical results which are related by the Lorentz transformation, thus preserving the symmetry even between computed events (Greenspan, 1980a,d). A comparison of first oscillation amplitude values for Newtonian and relativistic oscillators, determined in the lab with the aid of (98), is given in Fig. 41, and shows large differences when speeds are close to the speed of light (Greenspan, 1980a).

FIG. 41. Comparison of relativistic and Newtonian harmonic oscillation. (Curves: relativistic case; Newtonian case.)

C. Theory in Three Space Dimensions
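Let us now suppose that the uniform velocity of the rocket frame $X'Y'Z'$ relative to the lab frame $XYZ$ is given by $\mathbf{u} = (u_1, u_2, u_3)$. Let

$$\boldsymbol{\beta} = (\beta_1, \beta_2, \beta_3) = \mathbf{u}/c, \quad u^2 = u_1^2 + u_2^2 + u_3^2 = c^2(\beta_1^2 + \beta_2^2 + \beta_3^2) = c^2\beta^2, \quad \gamma = (1 - \beta^2)^{-1/2}$$

With regard to events $(x_k, y_k, z_k, t_k)$ and $(x'_k, y'_k, z'_k, t'_k)$, let

$$\begin{pmatrix} x'_k \\ y'_k \\ z'_k \\ t'_k \end{pmatrix} = L^* \begin{pmatrix} x_k \\ y_k \\ z_k \\ t_k \end{pmatrix}, \quad L^* = (L^*_{ij}) = \begin{pmatrix} 1 + \beta_1^2 \dfrac{\gamma^2}{\gamma + 1} & \beta_1\beta_2 \dfrac{\gamma^2}{\gamma + 1} & \beta_1\beta_3 \dfrac{\gamma^2}{\gamma + 1} & -c\beta_1\gamma \\ \beta_1\beta_2 \dfrac{\gamma^2}{\gamma + 1} & 1 + \beta_2^2 \dfrac{\gamma^2}{\gamma + 1} & \beta_2\beta_3 \dfrac{\gamma^2}{\gamma + 1} & -c\beta_2\gamma \\ \beta_1\beta_3 \dfrac{\gamma^2}{\gamma + 1} & \beta_2\beta_3 \dfrac{\gamma^2}{\gamma + 1} & 1 + \beta_3^2 \dfrac{\gamma^2}{\gamma + 1} & -c\beta_3\gamma \\ -\dfrac{\beta_1}{c}\gamma & -\dfrac{\beta_2}{c}\gamma & -\dfrac{\beta_3}{c}\gamma & \gamma \end{pmatrix} \tag{106}$$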
The transformation (106) is convenient from the physical point of view.
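As a rough illustration of (106), the sketch below builds the $4 \times 4$ matrix $L^*$ as nested lists and applies it to an event $(x, y, z, t)$. The function names `boost_matrix`, `apply`, and `interval` are hypothetical, the sample velocities are arbitrary, and units with $c = 1$ are assumed in the examples.

```python
import math

def boost_matrix(u, c=1.0):
    """Return L* of (106) for rocket velocity u = (u1, u2, u3)."""
    b = [ui / c for ui in u]
    gamma = 1.0 / math.sqrt(1.0 - sum(bi * bi for bi in b))
    k = gamma * gamma / (gamma + 1.0)
    rows = []
    for i in range(3):
        row = [(1.0 if i == j else 0.0) + b[i] * b[j] * k for j in range(3)]
        row.append(-c * b[i] * gamma)            # fourth column
        rows.append(row)
    rows.append([-b[j] * gamma / c for j in range(3)] + [gamma])   # fourth row
    return rows

def apply(matrix, event):
    """Matrix-vector product for a 4 x 4 matrix and an event (x, y, z, t)."""
    return [sum(matrix[i][j] * event[j] for j in range(4)) for i in range(4)]

def interval(event, c=1.0):
    """c^2 t^2 - x^2 - y^2 - z^2 for an event (x, y, z, t)."""
    x, y, z, t = event
    return c * c * t * t - x * x - y * y - z * z

event = (0.3, 0.1, -0.2, 1.5)                     # illustrative lab event
along_x = apply(boost_matrix((0.6, 0.0, 0.0)), event)
general = apply(boost_matrix((0.3, -0.2, 0.4)), event)
```

For a velocity along $X$ the result should reduce to the one-dimensional formulas (73), and for a general velocity the quantity $c^2t^2 - x^2 - y^2 - z^2$ should be unchanged.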
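From the geometric point of view, a more convenient form can be given as follows. Let new coordinates, called Minkowski coordinates, be defined by

$$x_1 = x, \quad x_2 = y, \quad x_3 = z, \quad x_4 = ict$$

and let $R_k = (x_{1,k}, x_{2,k}, x_{3,k}, x_{4,k})^T$. Then the Lorentz transformation $L = (L_{ij})$ is given by (Arzelies, 1966):

$$R'_k = L R_k, \quad L = \begin{pmatrix} 1 + \beta_1^2 \dfrac{\gamma^2}{\gamma + 1} & \beta_1\beta_2 \dfrac{\gamma^2}{\gamma + 1} & \beta_1\beta_3 \dfrac{\gamma^2}{\gamma + 1} & i\beta_1\gamma \\ \beta_1\beta_2 \dfrac{\gamma^2}{\gamma + 1} & 1 + \beta_2^2 \dfrac{\gamma^2}{\gamma + 1} & \beta_2\beta_3 \dfrac{\gamma^2}{\gamma + 1} & i\beta_2\gamma \\ \beta_1\beta_3 \dfrac{\gamma^2}{\gamma + 1} & \beta_2\beta_3 \dfrac{\gamma^2}{\gamma + 1} & 1 + \beta_3^2 \dfrac{\gamma^2}{\gamma + 1} & i\beta_3\gamma \\ -i\beta_1\gamma & -i\beta_2\gamma & -i\beta_3\gamma & \gamma \end{pmatrix}$$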
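Consider now two rectangular Cartesian coordinate systems $XYZ$ and $X'Y'Z'$ which initially coincide. Let $X'Y'Z'$, the rocket frame, be in relative uniform motion with respect to $XYZ$, the lab frame, and let this relative velocity be $\mathbf{u} = (u_1, u_2, u_3)$. Assume that particle $P$ is in motion in the lab frame and at time $t_k$ is at $(x_k, y_k, z_k)$. Then the velocity $\mathbf{v}_k$ and acceleration $\mathbf{a}_k$ at time $t_k$ are defined by

$$\mathbf{v}_k = \begin{pmatrix} v_{1,k} \\ v_{2,k} \\ v_{3,k} \end{pmatrix} = \begin{pmatrix} \Delta x_k/\Delta t_k \\ \Delta y_k/\Delta t_k \\ \Delta z_k/\Delta t_k \end{pmatrix}, \quad \mathbf{a}_k = \begin{pmatrix} a_{1,k} \\ a_{2,k} \\ a_{3,k} \end{pmatrix} = \begin{pmatrix} \Delta v_{1,k}/\Delta t_k \\ \Delta v_{2,k}/\Delta t_k \\ \Delta v_{3,k}/\Delta t_k \end{pmatrix}$$

If, at corresponding time $t'_k$, $P$ is at $(x'_k, y'_k, z'_k)$ in the rocket frame, then the velocity $\mathbf{v}'_k$ and acceleration $\mathbf{a}'_k$ of $P$ in the rocket frame at corresponding time $t'_k$ are defined by

$$\mathbf{v}'_k = \begin{pmatrix} v'_{1,k} \\ v'_{2,k} \\ v'_{3,k} \end{pmatrix} = \begin{pmatrix} \Delta x'_k/\Delta t'_k \\ \Delta y'_k/\Delta t'_k \\ \Delta z'_k/\Delta t'_k \end{pmatrix}, \quad \mathbf{a}'_k = \begin{pmatrix} a'_{1,k} \\ a'_{2,k} \\ a'_{3,k} \end{pmatrix} = \begin{pmatrix} \Delta v'_{1,k}/\Delta t'_k \\ \Delta v'_{2,k}/\Delta t'_k \\ \Delta v'_{3,k}/\Delta t'_k \end{pmatrix}$$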
The respective magnitudes $v_k$, $v'_k$, $a_k$, and $a'_k$ of $\mathbf{v}_k$, $\mathbf{v}'_k$, $\mathbf{a}_k$, and $\mathbf{a}'_k$ are defined in the customary way by

$$v_k^2 = v_{1,k}^2 + v_{2,k}^2 + v_{3,k}^2, \quad {v'_k}^2 = {v'_{1,k}}^2 + {v'_{2,k}}^2 + {v'_{3,k}}^2$$

$$a_k^2 = a_{1,k}^2 + a_{2,k}^2 + a_{3,k}^2, \quad {a'_k}^2 = {a'_{1,k}}^2 + {a'_{2,k}}^2 + {a'_{3,k}}^2$$

The quantity $\tau_k$, defined in the lab frame by

$$\tau_k = (c^2 t_k^2 - x_k^2 - y_k^2 - z_k^2)^{1/2}$$

is invariant under $L^*$, since

$$c^2 {t'_k}^2 - {x'_k}^2 - {y'_k}^2 - {z'_k}^2 = c^2 t_k^2 - x_k^2 - y_k^2 - z_k^2$$

When

$$c^2 t_k^2 - x_k^2 - y_k^2 - z_k^2 > 0$$

$\tau_k$ is called the proper time of event $(x_k, y_k, z_k, t_k)$. The quantity $\delta\tau_k$, defined by

$$\delta\tau_k = [c^2(\Delta t_k)^2 - (\Delta x_k)^2 - (\Delta y_k)^2 - (\Delta z_k)^2]^{1/2}$$

is, similarly, an invariant of $L^*$ and is called the proper time between successive events $(x_k, y_k, z_k, t_k)$ and $(x_{k+1}, y_{k+1}, z_{k+1}, t_{k+1})$. Note that

$$\delta\tau_k = \Delta t_k (c^2 - v_k^2)^{1/2} = \Delta t'_k (c^2 - {v'_k}^2)^{1/2}$$
Finally, note also that
In Minkowski coordinates,
One converts the laboratory frame, for example, into a Minkowski space as follows. Minkowski space is the set of quadruples ( X l , k , X 2 , k t X 3 , k , X 4 , k ) with the distance d between any two such quadruples, say, ( X l , k , X 2 , k , X 3 , k , x 4 , k ) and
li2 i - 1 Xi,k
- xi.k)2]
In this geometric space, $\tau_k$ is the distance between $(x_{1,k}, x_{2,k}, x_{3,k}, x_{4,k})$ and the origin $(0, 0, 0, 0)$. We assume completely similar definitions in the rocket frame so that it too can be considered to be a Minkowski space. Now, any quantity which has four components and is given in the lab frame by

$$\mathbf{W} = (w_1, w_2, w_3, w_4)^T$$

and in the rocket frame by

$$\mathbf{W}' = (w'_1, w'_2, w'_3, w'_4)^T$$

is called a 4-vector if and only if

$$\mathbf{W}' = L\mathbf{W}$$

Given two 4-vectors $\mathbf{W}^{(1)}$ and $\mathbf{W}^{(2)}$ in, say, the lab frame, we define the inner product $\mathbf{W}^{(1)} \cdot \mathbf{W}^{(2)}$ by

$$\mathbf{W}^{(1)} \cdot \mathbf{W}^{(2)} = w_1^{(1)} w_1^{(2)} + w_2^{(1)} w_2^{(2)} + w_3^{(1)} w_3^{(2)} + w_4^{(1)} w_4^{(2)}$$
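In developing special relativistic mechanics further, we will proceed by using only Minkowski coordinates. Suppose now that particle $P$ is in motion and in the lab frame is at $(x_k, y_k, z_k)$ at time $t_k$ while in the rocket frame it is at $(x'_k, y'_k, z'_k)$ at the corresponding time $t'_k$. At time $t_k$ in the lab frame, we define the Minkowski 4-velocity $\mathbf{V}_k$ and Minkowski 4-acceleration $\mathbf{A}_k$ of $P$ by

$$\mathbf{V}_k = \begin{pmatrix} V_{1,k} \\ V_{2,k} \\ V_{3,k} \\ V_{4,k} \end{pmatrix} = \begin{pmatrix} \Delta x_{1,k}/\delta\tau_k \\ \Delta x_{2,k}/\delta\tau_k \\ \Delta x_{3,k}/\delta\tau_k \\ \Delta x_{4,k}/\delta\tau_k \end{pmatrix}, \quad \mathbf{A}_k = \begin{pmatrix} A_{1,k} \\ A_{2,k} \\ A_{3,k} \\ A_{4,k} \end{pmatrix} = \begin{pmatrix} \Delta V_{1,k}/\delta\tau_k \\ \Delta V_{2,k}/\delta\tau_k \\ \Delta V_{3,k}/\delta\tau_k \\ \Delta V_{4,k}/\delta\tau_k \end{pmatrix}$$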
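$\mathbf{V}'_k$ and $\mathbf{A}'_k$ are defined in the rocket frame by

$$\mathbf{V}'_k = \begin{pmatrix} \Delta x'_{1,k}/\delta\tau_k \\ \Delta x'_{2,k}/\delta\tau_k \\ \Delta x'_{3,k}/\delta\tau_k \\ \Delta x'_{4,k}/\delta\tau_k \end{pmatrix}, \quad \mathbf{A}'_k = \begin{pmatrix} \Delta V'_{1,k}/\delta\tau_k \\ \Delta V'_{2,k}/\delta\tau_k \\ \Delta V'_{3,k}/\delta\tau_k \\ \Delta V'_{4,k}/\delta\tau_k \end{pmatrix}$$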
Direct computation reveals easily that both $\mathbf{V}_k$ and $\mathbf{A}_k$ are 4-vectors. The magnitude $V_k$ of 4-vector $\mathbf{V}_k$ is defined by

$$V_k = [-(\mathbf{V}_k \cdot \mathbf{V}_k)]^{1/2}$$

An analogous definition holds for the magnitude $V'_k$ of $\mathbf{V}'_k$. We proceed now under the assumption that, without the presence of an external force, the interaction of two particles conserves linear momentum. To be precise, let particle $P$ of mass $m(k)$ at time $t_k$ be in motion in the lab frame. At time $t_k$, the linear momentum $\mathbf{p}_k$ of $P$ is defined in the lab frame by

$$\mathbf{p}_k = m(k)\mathbf{v}_k$$

and in the rocket frame by

$$\mathbf{p}'_k = m'(k)\mathbf{v}'_k$$

where

$$m(k) = \frac{c\,m_0}{(c^2 - v_k^2)^{1/2}}, \quad m'(k) = \frac{c\,m_0}{(c^2 - {v'_k}^2)^{1/2}}$$

and $m_0$ is the rest mass of $P$. The total energy $E$ of particle $P$ of mass $m(k)$ at time $t_k$ is defined by

$$E = m(k)c^2$$

Then, as in Section III,A, the rest energy formula

$$E_0 = m_0 c^2$$

follows readily, as do the formulas

$$E = m_0 c^3\, \Delta t_k/\delta\tau_k$$

and

$$E^2 = p_k^2 c^2 + m_0^2 c^4$$

where $p_k$ is the magnitude of $\mathbf{p}_k$. We proceed next to establish the momentum-energy results of Section III,A. Let

$$E^* = E/c^2$$
We wish to show that
is a 4-vector under L*, called the momentum-energy vector. This is valid, since
Finally, let us examine the development of a dynamical equation which is symmetric under the Lorentz transformation. It is a most unfortunate mathematical and physical consequence of special relativity theory that the simple generalization

$$\mathbf{F} = \frac{d}{dt}(m\mathbf{v})$$

of (71) does not, in general, transform under $L^*$ into

$$\mathbf{F}' = \frac{d}{dt'}(m'\mathbf{v}')$$

even in the continuous case (Bergmann, 1942). To resolve this failure, two approaches have been followed. First, one might proceed under the approximating assumption that if a rocket frame were attached to $P$, so that it can have accelerated motion, and if at time $t$ the velocity of $P$ is $\mathbf{v}$, then one can treat the rocket frame at time $t$ as being instantaneously in uniform relative motion with velocity $\mathbf{v}$ with respect to the lab frame. Second, one can formulate equations of motion directly in Minkowski space. Because the second approach has a firmer mathematical basis than does the first, we will explore it. In Minkowski space, we assume the dynamical difference equation
where

$$\alpha_k m(k) = m_0$$

We also define a three-dimensional projection of (108) by

$$\mathbf{F}_k^p = c^2\{m(k)\mathbf{A}_k^p - [\Delta m(k)/\delta\tau_k]\mathbf{V}_k^p\} \tag{109}$$

where $c^2$ and $\mathbf{V}_k^p$ have replaced $\alpha_k$ and $\tfrac{1}{2}(\mathbf{V}_{k+1} + \mathbf{V}_k)$, respectively, in (108) and where the superscript $p$ denotes the dropping of the fourth component of the given 4-vector. Let us show first that (109) reduces to (98) when $P$ and the rocket frame both have velocities in the $X$ direction only. We must show then that

$$F_{1,k} = c^2\{m(k)A_{1,k} - [\Delta m(k)/\delta\tau_k]V_{1,k}\} \tag{110}$$

is the same as (98). Now

$$c^2\{m(k)A_{1,k} - [\Delta m(k)/\delta\tau_k]V_{1,k}\} = (c^2/\delta\tau_k)[m(k)V_{1,k+1} - m(k+1)V_{1,k}]$$

Note next that

$$m(k) = c\,m_0\, \Delta t_k/\delta\tau_k$$

so that

$$F_{1,k} = \frac{c^2}{\delta\tau_k}[m(k)V_{1,k+1} - m(k+1)V_{1,k}] = \frac{c^2 m(k)}{(\delta\tau_k/\Delta t_k)(\delta\tau_{k+1}/\Delta t_{k+1})} \times \frac{(\Delta x_{k+1}/\delta\tau_{k+1})(\delta\tau_{k+1}/\Delta t_{k+1}) - (\Delta x_k/\delta\tau_k)(\delta\tau_k/\Delta t_k)}{\Delta t_k}$$

which implies

$$F_{1,k} = \frac{c^2 m(k)}{[(c^2 - v_k^2)(c^2 - v_{k+1}^2)]^{1/2}}\, \frac{\Delta v_{1,k}}{\Delta t_k} \tag{111}$$
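The reduction of (110) to (98) can be spot-checked numerically. The sketch below assumes motion along $X$ only, so that $\delta\tau_k = \Delta t_k (c^2 - v_k^2)^{1/2}$ and $V_{1,k} = v_k/(c^2 - v_k^2)^{1/2}$; the sample speeds, time step, and function names are illustrative assumptions.

```python
import math

def mass(v, m0=1.0, c=1.0):
    """m(k) = c m0 / sqrt(c^2 - v_k^2)."""
    return c * m0 / math.sqrt(c * c - v * v)

def force_98(v_k, v_k1, dt_k, m0=1.0, c=1.0):
    """Right-hand side of (98), equivalently (111)."""
    return (c * c * mass(v_k, m0, c)
            / math.sqrt((c * c - v_k * v_k) * (c * c - v_k1 * v_k1))
            * (v_k1 - v_k) / dt_k)

def force_110(v_k, v_k1, dt_k, m0=1.0, c=1.0):
    """(c^2 / dtau_k) [m(k) V_{1,k+1} - m(k+1) V_{1,k}], the rewritten form of (110)."""
    s_k = math.sqrt(c * c - v_k * v_k)
    s_k1 = math.sqrt(c * c - v_k1 * v_k1)
    dtau_k = dt_k * s_k                      # proper time step for motion along X
    big_v_k = v_k / s_k                      # V_{1,k} = dx_k / dtau_k
    big_v_k1 = v_k1 / s_k1                   # V_{1,k+1}
    return c * c / dtau_k * (mass(v_k, m0, c) * big_v_k1 - mass(v_k1, m0, c) * big_v_k)

f98 = force_98(0.3, 0.45, 0.01)
f110 = force_110(0.3, 0.45, 0.01)
```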
But (111) is simply (98) in the new notation, and the result is proved. In Minkowski space there is a problem in the study of (108), which will now be written as

$$\mathbf{F}_k = m_0\mathbf{A}_k - (\Delta m_0/\delta\tau_k)(\mathbf{V}_{k+1} + \mathbf{V}_k)/2 \tag{112}$$

There is some question about whether or not rest mass should, in fact, depend on time through $\mathbf{F}_k$ (Arzelies, 1966; Synge, 1965). In some sense this seems to undermine our physical intuition about what rest mass should be, but mathematically we could continue under this assumption as follows. If rest mass does depend on time through $\mathbf{F}_k$, then, by taking inner products, one finds

$$\mathbf{F}_k \cdot (\mathbf{V}_{k+1} + \mathbf{V}_k)/2 = -(\Delta m_0/\delta\tau_k)(\mathbf{V}_{k+1} + \mathbf{V}_k) \cdot (\mathbf{V}_{k+1} + \mathbf{V}_k)/4$$

If then one restricts attention to forces $\mathbf{F}_k$ which satisfy

$$\mathbf{F}_k \cdot (\mathbf{V}_{k+1} + \mathbf{V}_k) = 0 \tag{113}$$

so that the force is normal to the average velocity, one can always choose $\Delta m_0 = 0$, so that $m_0$ continues to be invariant. Restriction of attention to such forces also reduces (112) to

$$\mathbf{F}_k = m_0\mathbf{A}_k$$

which is invariant under the Lorentz transformation and is completely analogous in structure to Newton's equation of motion. Note that, in the limit, (113) is the exact condition satisfied by the motion of a charged particle in an electromagnetic field (Synge, 1965). In the case in which $\Delta m_0 \neq 0$, alternatives to (109) which are more consistent with continuum results have been developed by Albrycht and Marciniak (1981a).
IV. QUANTUM MECHANICS: A SPECULATIVE MODEL OF VIBRATIONS IN THE WATER MOLECULE

The deepest understanding of nature seems to be available only through an understanding of the dynamical interaction between atoms and molecules. Indeed, chemical and biological phenomena seem to have their primary driving mechanics at this level. So let us turn finally to the very important area of atomic and molecular dynamics. At the outset, we state the obvious, which is that we know very little about dynamical mechanisms at the molecular level. In particle physics, the rapidity of changes in unifying theories is indicative of how little we know. Even the belief that quarks and leptons are the fundamental structural units of nature is challenged by many (see, e.g., Harari, 1983). The greatest successes in the atomic and molecular areas have been through quantum mechanics, and these have been in the nondynamical areas of energy calculation, in spectroscopy, and in steady-state aspects of solid-state physics (French and Taylor, 1978; Pauling, 1960; Slater, 1960). Indeed, quantum mechanics has proved to be distinctly unsuccessful for dynamical phenomena (French and Taylor, 1978), for which reason various combinations of quantum mechanics and classical mechanics have been devised recently. Levine and Bernstein (1974), for example, determine energies from
Schrodinger’s equation and then scattering trajectories from Newton’s equations. Kirschenbaum and Wilets (1980) incorporate the Heisenberg and Pauli principles into classical n-body models. Greenspan (1981b,c) assumes the energy to be known and incorporates a local repulsion term into classical equations to prevent electron capture by a nucleus. However, because of the great need to be able to simulate dynamical interaction on the molecular level, we will now venture into the area by making a distinct departure from all existing methods and theories, developing a new approach which is based on very broad physical assumptions. Our modeling will be discrete, since the components of molecules are discrete. Because of the speculative nature of the discussion, however, we will not dwell on whether or not difference equations or differential equations are more appropriate to use. However, much of the mathematical methodology will be motivated from discussions in previous sections. But, most importantly, we will establish the viability of our approach by producing a computer example of a dynamical water molecule in which the vibrations are in complete agreement with experimental results. First, we make two fundamental physical assumptions. We assume that each atom or molecule under consideration is stable, that is, it neither collapses nor explodes. Experimentally, this is valid for most atoms and molecules which are isolated from radiation and from other atoms and molecules (French and Taylor, 1978).Also, we assume that for any given set of like particles (say electrons, or helium atoms, or water molecules), any two particles of the set have substructures which are either identical to or mirror images of each other. From this assumption it follows, for example, that the force between a pair of identical electrons need not be the same as that between a mirror image pair. 
With regard to this second assumption, recall that the basic double-slit experiments for electrons, for example, yield particle distributions like those shown in Fig. 42. Such a distribution is interpreted in quantum mechanics as a wave interference pattern, which motivates the usual assumption that electrons have both particle and wave properties. We shall, however, interpret the distribution in Fig. 42 as a separation phenomenon, resulting from the assumption that electrons are of two different types, that is, identical and mirror image types (Greenspan, 1981a). In all other respects, the quantum nature of matter and energy is still assumed to be valid.

FIG. 42. Electron distribution in special double-slit experiments.

From our two basic assumptions, then, we begin as follows. Electrons and nuclei will be treated as point sources. Given $n$ particles $P_i$, $i = 1, 2, \ldots, n$, let $m_i$ and $e_i$ be the mass, measured in grams, and the charge, measured in electrostatic units, of $P_i$ for each value of $i$. For positive time step $\Delta t$ and $t_k = k\,\Delta t$, $k = 0, 1, 2, \ldots$, let $P_i$ at time $t_k$ be located at $\mathbf{r}_{i,k} = (x_{i,k}, y_{i,k}, z_{i,k})$, measured in centimeters, have velocity $\mathbf{v}_{i,k} = (v_{i,k,x}, v_{i,k,y}, v_{i,k,z})$, measured in centimeters per second, and have acceleration $\mathbf{a}_{i,k} = (a_{i,k,x}, a_{i,k,y}, a_{i,k,z})$, measured in centimeters per square second. If the force $\mathbf{F}_{i,k} = (F_{i,k,x}, F_{i,k,y}, F_{i,k,z})$ acts on $P_i$ at $t_k$, then we assume the dynamical relationship

$$\mathbf{F}_{i,k} = m_i \mathbf{a}_{i,k}, \quad i = 1, 2, \ldots, n, \quad k = 0, 1, 2, \ldots \tag{114}$$

Let us consider first a Coulombic-type component in $\mathbf{F}_{i,k}$ in (114). For this purpose, let $\mathbf{F}^{(e)}_{ij,k}$, the electrostatic force exerted on $P_i$ by $P_j$, be

$$\mathbf{F}^{(e)}_{ij,k} = \frac{e_i e_j}{r_{ij,k}^2}\, \frac{\mathbf{r}_{ji,k}}{r_{ij,k}}$$
where $\mathbf{r}_{ij,k}$ is the vector from $P_i$ to $P_j$ at time $t_k$. Then the total electrostatic force $\mathbf{F}^{(e)}_{i,k}$ on $P_i$ due to all the other particles is taken to be

$$\mathbf{F}^{(e)}_{i,k} = \sum_{\substack{j = 1 \\ j \neq i}}^{n} \frac{e_i e_j}{r_{ij,k}^2}\, \frac{\mathbf{r}_{ji,k}}{r_{ij,k}} \tag{115}$$

Unfortunately, even if (115) were the only force we wished to consider (and it is not), the numerical solution of (114) in the present units would produce severe computational underflow and overflow problems. So, before proceeding further, let us make the following changes of variables:

$$M_i = 10^{28} m_i, \quad E_i = 10^{10} e_i/4.80286, \quad \mathbf{R} = (X, Y, Z) = 10^{12}(x, y, z) = 10^{12}\,\mathbf{r}, \quad T = 10^{22} t$$
DONALD GREENSPAN
In the new variables, M(e1ectron) = 9.1085, M(proton) = 16,742, E(e1ectron) = - 1, E(proton) = 1, and the particular dynamical form of (114) transforms into EiEj Mid 2 R i * k= (4.80286)2 dT2 j,= 1 (Rij,k)'
+'"
]#i
Rji,k Rij,k
where
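The effect of the change of variables can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the cgs values implied by the scaled constants in the text (electron charge magnitude $4.80286 \times 10^{-10}$ esu, electron mass $9.1085 \times 10^{-28}$ g, and proton mass $1.6724 \times 10^{-24}$ g as implied by $M_2 = M_3 = 16{,}724$), places two protons an assumed $10^{-8}$ cm apart, and compares the acceleration magnitude in both unit systems. The function name `to_scaled` is hypothetical.

```python
# cgs values implied by the scaled constants quoted in the text.
E_CHARGE_ESU = 4.80286e-10
M_ELECTRON_G = 9.1085e-28
M_PROTON_G = 1.6724e-24

def to_scaled(mass_g, charge_esu, r_cm, t_s):
    """Apply M = 1e28 m, E = 1e10 e / 4.80286, R = 1e12 r, T = 1e22 t."""
    return 1e28 * mass_g, 1e10 * charge_esu / 4.80286, 1e12 * r_cm, 1e22 * t_s

# Two protons at an assumed separation of 1e-8 cm.
r_cm = 1e-8
accel_cgs = E_CHARGE_ESU ** 2 / (M_PROTON_G * r_cm ** 2)      # cm / s^2

M, E, R, _ = to_scaled(M_PROTON_G, E_CHARGE_ESU, r_cm, 0.0)
accel_scaled = 4.80286 ** 2 * E * E / (M * R ** 2)            # magnitude from (116)

# d^2R/dT^2 = 1e12 / (1e22)^2 * d^2r/dt^2 = 1e-32 * d^2r/dt^2
ratio = accel_scaled / (accel_cgs * 1e-32)
```

In cgs units the acceleration is of order $10^{21}$, while the scaled masses, charges, and distances stay within a few orders of magnitude of unity, which is the practical point of the rescaling.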
The numerical solution of (116) can now be generated readily without underflow or overflow problems, so that we now modify (116) as follows, to include additional effects. We will consider it in the more general form
where the $A_{ij,k}$ will, in general, be related to the boundedness of the molecule. Since the actual structure of the $A_{ij,k}$ depends on the molecule under consideration, we will deal with it more specifically next in connection with a water molecule model. Consider an ultrasimplistic model of the water molecule which consists of only seven particles, that is, an O nucleus, two H nuclei, and four electrons. The reason we do so is as follows. Extensive preliminary calculations with larger models yielded the following phenomenon consistently. Of the six second-ring electrons from the O atom and the two electrons from the H atoms, four were invariably oscillating close to the O nucleus, while the other four were invariably oscillating far from the O nucleus. Such a phenomenon followed from a kinetic energy transfer which resulted whenever any electron came between the O nucleus and a second electron. The four-electron model follows by absorbing both the inner-ring O electrons and the four oscillating near the O nucleus into the O nucleus. Thus, let $P_1$ represent the O nucleus; let $P_2$, $P_3$ represent H nuclei; and let $P_4$, $P_5$, $P_6$, $P_7$ represent electrons. Since six electrons have been absorbed into $P_1$, it follows readily that the respective particle charges are

$$E_1 = 2, \quad E_2 = E_3 = 1, \quad E_4 = E_5 = E_6 = E_7 = -1$$

By assuming also that neutron and proton masses are approximately equal, it follows that the respective particle masses are

$$M_1 = 267{,}584, \quad M_2 = M_3 = 16{,}724, \quad M_4 = M_5 = M_6 = M_7 = 9.1085$$
For the initial positions and velocities of $P_1$-$P_7$, let us take $\mathbf{R}_{1,0} = (0, 0, 0)$, $\mathbf{R}_{2,0} = (5354, 4759, 4759)$, $\mathbf{R}_{3,0} = (5354, -4759, -4759)$, $\mathbf{R}_{4,0} = (36{,}000, 0, 15{,}000)$, $\mathbf{R}_{5,0} = (36{,}000, 0, -15{,}000)$, $\mathbf{R}_{6,0} = (36{,}000, 15{,}000, 0)$, $\mathbf{R}_{7,0} = (36{,}000, -15{,}000, 0)$, and $\mathbf{V}_{i,0} = (0, 0, 0)$, $i = 1, 2, 3, \ldots, 7$. The bond angle of this initial configuration is 103° and the initial bond length is 8600 units. From these initial data, the motions of $P_1$-$P_7$ can be approximated readily from the nonlinear system (117) once the $A_{ij,k}$ are defined. We will assume that $\mathbf{F}_{ij,k} = -\mathbf{F}_{ji,k}$, so the $A_{ij,k}$ need be defined only for $i < j$. Now, we have no way of knowing how to choose the $A_{ij,k}$, since there are no related experimental results. Thus, our choices will be entirely speculative, and are made as follows. If $i = 1$ and $j = 4, 5, 6, 7$, set $p_{ij} = 0$ and

$$A_{ij,k} = -(1400/R_{ij,k})^4 + (R_{ij,k}/30{,}000)^2 \tag{118}$$
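The stated bond length and bond angle follow directly from the initial coordinates of $P_1$, $P_2$, and $P_3$. The sketch below recomputes them and also evaluates the expression (118) at two sample distances; it only evaluates (118) and makes no assumption about how $A_{ij,k}$ enters the force law (117). The helper names and the sample distances are illustrative.

```python
import math

R1 = (0.0, 0.0, 0.0)                 # O nucleus P_1
R2 = (5354.0, 4759.0, 4759.0)        # H nucleus P_2
R3 = (5354.0, -4759.0, -4759.0)      # H nucleus P_3

def sub(a, b):
    """Component-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length."""
    return math.sqrt(dot(a, a))

oh2, oh3 = sub(R2, R1), sub(R3, R1)
bond_length = norm(oh2)
bond_angle_deg = math.degrees(math.acos(dot(oh2, oh3) / (norm(oh2) * norm(oh3))))

def a_118(r):
    """Expression (118) for an O-nucleus/electron pair at scaled distance r."""
    return -(1400.0 / r) ** 4 + (r / 30000.0) ** 2

charges = [2, 1, 1, -1, -1, -1, -1]  # E_1, ..., E_7 of the seven-particle model
```

The expression (118) is negative at short range, where the fourth-power term dominates, and positive at long range, where the quadratic term dominates, consistent with the two roles described for its terms.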
In (118), the term with exponent 4 preserves the domain of the inner electrons, even though these have been absorbed into the nucleus. The term with exponent 2 preserves the boundedness of the molecule by limiting the radii of the electrons. If $i = 1$ and $j = 2, 3$, set $p_{ij} = 0$ and
which preserves the boundedness of the molecule by limiting the radii of the H nuclei. If i = 2 and j = 3, set pij = & and, for reasons indicated in Greenspan (1981d), set
If i = 2,3 a n d j = 4, 5,6,7, set pij = 0 and Aij,k = 0. If i = 4, 5 a n d j = 6, 7 set and Aij,, = 0. If i = 4 and j = 5, or, if i = 6 and j = 7, set pij = - & and Aij,, = 0. Thus, all particle interactions are now defined. From the initial data and force formula symmetries, the motions of the hydrogen nuclei P2 and P3 are restricted to the plane Y = Z . Figure 43 shows the computed motion, using the leap-frog formulas, of P2 in this plane. The circle in the figure encloses the allowable domain of oscillation, as determined by Shibata and Bartell (1965),and the computer results agree, then, completely with their experimental measurements. With regard to electron motions, it was observed that these were almost entirely in coordinate planes. Figure 44, then, shows the projection in the X Y plane of the motion of electron P6. Each electron oscillated no more than about 45,000 units from the oxygen nucleus, which is in complete agreement
with the quantum-mechanical estimate of Eisenberg and Kauzmann (1969). Figure 44 shows clear orbital precession which, though not persistent, is recurrent.

FIG. 43. Motion of $P_2$ relative to oxygen nucleus.
FIG. 44. Precessional motion of electron orbits.

V. CONCLUDING REMARKS

With regard to Section II on Newtonian mechanics, the following should be noted. A conservative, arithmetic formulation of the Hamiltonian approach has been developed by LaBudde (1980), while Cadzow (1970) has developed a discrete calculus of variations. Other discrete models developed include classical cavity flow (Greenspan, 1976b), conservative harmonic oscillation (Greenspan, 1972c, 1980a; Mahar, 1982), discrete biological population dynamics (May, 1975), biological circularization and gastrulation (Greenspan, 1982b,c), atmospheric streaming (Peterson and Uccellini, 1979), ocean wave generation due to earthquakes (Greenspan et al., 1976), and conservative interactions in which the potentials are anisotropic (LaBudde and Greenspan, 1978). With regard to string vibrations in particular, energy transfer has been studied by Fermi et al. (1955), while resonance and stability have been investigated by Auret and Snyman (1978). From the relativistic point of view, a discrete approach distinctly different from that given in Section III has been developed by Lorente (1974). Radiative transfer on discrete spaces has been investigated by Preisendorfer (1965). Unfortunately, all attempts to reformulate electromagnetic theory using only arithmetic processes have had limited success. The basic reason is that Maxwell’s equations can be reformulated as an alternate system of wave equations, and when one replaces the wave equation by a difference equation, the principle of rod contraction does not allow this equation to remain symmetric under the Lorentz transformation. However, discretization of certain integral relationships can lead to arithmetic formulations which are
“conservative” in the sense that they conserve “discrete energy” (Ardelyan and Gushchin, 1982). However, this “discrete energy” is different from the energy of the continuous formulation. With regard to phenomena usually modeled quantum mechanically, a three-bead-two-rod discrete, classical approach has been developed for modeling macromolecules and has been applied successfully to the study of polymers (Gottlieb, 1977). Strictly from a computer point of view, a conservative computer logic and two types of sequential circuitry have been developed by Fredkin and Toffoli (1982), while computer systems analysis theory is being developed for the discrete modeling of both natural and social systems (Barto, 1975; Ziegler, 1976). In addition, current technological progress in the development of parallel computers and laser-type transistors, for example, promises a future of scientific progress beyond all previous expectations.
REFERENCES

Albrycht, J., and Marciniak, A. (1981a). Int. J. Theor. Phys. 20, 821.
Albrycht, J., and Marciniak, A. (1981b). Celest. Mech. 24, 391.
Ardelyan, N. V., and Gushchin, I. S. (1982). Vestn. Mosk. Univ., Ser. 15: Vychislitel. Mat. Kibernet. No. 3, p. 3.
Arzelies, H. (1966). “Relativistic Kinematics.” Pergamon, Oxford.
Auret, F. D., and Snyman, J. A. (1978). Appl. Math. Modelling 2, 7.
Barker, J. A., and Henderson, D. (1976). Rev. Mod. Phys. 48, 587.
Barto, A. G. (1975). Ph.D. Dissertation, University of Michigan, Ann Arbor.
Bergmann, P. (1942). “Introduction to the Theory of Relativity.” Prentice-Hall, Englewood Cliffs, New Jersey.
Birkhoff, G. (1983). SIAM Rev. 25, 1.
Cadzow, J. A. (1970). Int. J. Control 11, 393.
Carusi, A., and Valsecchi, G. B. (1980). Moon Planets 22, 133.
Coppin, C., and Greenspan, D. (1983). Appl. Math. Comput. 13, 17.
Einstein, A. (1905). Ann. Phys. (Leipzig) 4, 891.
Eisenberg, D., and Kauzmann, W. (1969). “The Structure and Properties of Water.” Oxford Univ. Press, London and New York.
Fermi, E., Pasta, J., and Ulam, S. (1955). [Rep.] LA-1940. Los Alamos Sci. Lab.
Feynman, R. P., Leighton, R. B., and Sands, M. (1963). “The Feynman Lectures on Physics.” Addison-Wesley, Reading, Massachusetts.
Fredkin, E., and Toffoli, T. (1982). Int. J. Theor. Phys. 21, 219.
French, A. P., and Taylor, E. F. (1978). “An Introduction to Quantum Physics.” Norton, New York.
Gottlieb, M. (1977). Comput. Chem. 1, 155.
Greenspan, D. (1970). Comput. J. 13, 195.
Greenspan, D. (1971). BIT 11, 399.
Greenspan, D. (1972a). Tech. Rep. 167. Dept. Comput. Sci., University of Wisconsin, Madison.
Greenspan, D. (1972b). Util. Math. 2, 105.
Greenspan, D. (1972c). Bull. Poly. Inst. Iasi 17 (22), Sect. I, 205.
Greenspan, D. (1973a). “Discrete Models.” Addison-Wesley, Reading, Massachusetts.
Greenspan, D. (1973b). Bull. Am. Math. Soc. 79, 423.
Greenspan, D. (1973c). Found. Phys. 3, 247.
Greenspan, D. (1974a). Comput. Methods Appl. Mech. Eng. 3, 293.
Greenspan, D. (1974b). Comput. Struct. 4, 243.
Greenspan, D. (1974c). Bull. Am. Math. Soc. 80, 553.
Greenspan, D. (1974d). “Discrete Numerical Methods in Physics and Engineering.” Academic Press, New York.
Greenspan, D. (1976a). Int. J. Theor. Phys. 15, 557.
Greenspan, D. (1976b). Tech. Rep. 265. Dept. Comput. Sci., University of Wisconsin, Madison.
Greenspan, D. (1978a). Appl. Math. Comput. 4, 15.
Greenspan, D. (1978b). Comput. Methods Appl. Mech. Eng. 13, 95.
Greenspan, D. (1980a). “Arithmetic Applied Mathematics.” Pergamon, Oxford.
Greenspan, D. (1980b). Math. Comput. Simulation XXII, 200.
Greenspan, D. (1980c). Appl. Math. Modelling 4, 95.
Greenspan, D. (1980d). Int. J. Gen. Syst. 6, 25.
Greenspan, D. (1980e). Tech. Rep. 130. Math. Dept., University of Texas at Arlington.
Greenspan, D. (1981a). J. Math. Biol. 12, 227.
Greenspan, D. (1981b). J. Comput. Appl. Math. 7, 41.
Greenspan, D. (1981c). J. Comput. Appl. Math. 7, 129.
Greenspan, D. (1981d). Appl. Math. Comput. 9, 301.
Greenspan, D. (1981e). “Computer-Oriented Mathematical Physics.” Pergamon, Oxford.
Greenspan, D. (1982a). Tech. Rep. 189. Dept. Math., University of Texas at Arlington.
Greenspan, D. (1982b). In “Discrete Simulation and Related Fields” (A. Javor, ed.), p. 153. North-Holland Publ., Amsterdam.
Greenspan, D. (1982c). In “Nonlinear Phenomena in Mathematical Sciences” (V. Lakshmikantham, ed.), p. 471. Academic Press, New York.
Greenspan, D. (1982d). Tech. Rep. 193. Dept. Math., University of Texas at Arlington.
Greenspan, D. (1983). Tech. Rep. 195. Dept. Math., University of Texas at Arlington, Arlington, Texas.
Greenspan, D., and Collier, C. (1978). J. Inst. Math. Appl. 22, 235.
Greenspan, D., and Rosati, M. (1978). Comput. Struct. 8, 107.
Greenspan, D., Cranmer, M., and Collier, J. (1976). Tech. Rep. 277. Dept. Comput. Sci., University of Wisconsin, Madison.
Harari, H. (1983). Sci. Am. 248, 56.
Hirschfelder, J. O., Curtiss, C. F., and Bird, R. B. (1954). “Molecular Theory of Gases and Liquids.” Wiley, New York.
Kirschenbaum, C. L., and Wilets, L. (1980). Phys. Rev. 21, 834.
Kopal, Z. (1978). “Dynamics of Close Binary Systems.” Reidel Publ., Dordrecht, Netherlands.
LaBudde, R. A. (1980). Int. J. Gen. Syst. 6, 3.
LaBudde, R. A., and Greenspan, D. (1974). J. Comput. Phys. 15, 134.
LaBudde, R. A., and Greenspan, D. (1976a). Numer. Math. 25, 323.
LaBudde, R. A., and Greenspan, D. (1976b). Numer. Math. 26, 1.
LaBudde, R. A., and Greenspan, D. (1978). V. J. Sci. 29, 18.
Levine, R. D., and Bernstein, R. B. (1974). “Molecular Reaction Dynamics.” Oxford Univ. Press (Clarendon), London and New York.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 63
Contrast Formation in Electron Microscopy of Biological Material

E. CARLEMALM
Department of Microbiology, Biozentrum, University of Basel, Basel, Switzerland

C. COLLIEX
Laboratoire de Physique des Solides, Université de Paris-Sud, Orsay, France

E. KELLENBERGER
Department of Microbiology, Biozentrum, University of Basel, Basel, Switzerland
I. Introduction
II. Theory
    A. Basic Equations for the Single-Scattering Approximation
    B. Multiple Scattering
    C. Contrast Formation
    D. Treatment of Superpositions: The Dilution Theorem
III. Scattering Cross Sections and Constants
    A. Atomic Scattering Cross Sections
    B. Cross Sections for Composite Matter
IV. Contrast with Unstained Sections of Aldehyde-Fixed Biological Material
    A. Calculated Values of Scattering Constants for Different Materials
    B. Thickness Dependence of Scattering
    C. Influences of Thickness, Density, and Scattering on BF and DF Contrasts
    D. Influence of Thin-Section Relief Compared for the Two Imaging Modes
    E. How to Calculate Contrast of Real Biological Structures
V. Experimental Confirmations
    A. Definition of Test Object
    B. Materials and Methods
    C. Results
    D. Positive Stain in Thin Sections
VI. Discussion of the Consequences for the Interpretation of Micrographs
    A. Introduction: Different Densities
    B. Characteristics of the Specimen That Produce Contrast
    C. Contrast in Conventional Imaging
    D. Contrast in Ratio Imaging
VII. Discussion of Limitations
    A. Beam-Induced Destruction
    B. Plastic Deformations in Thin Sections
    C. Limitations due to Positive Stain Overcome by the Possibility of Observing Unstained Sections
    D. Limitations due to Negative Stain and Potential of Observing Frozen-Hydrated Material
    E. Limitations due to Noise
VIII. Concluding Remarks
References

Copyright © 1985 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014663-0
I. INTRODUCTION
Besides fulfilling the requirements of geometric optics, imaging is possible only when some properties of the specimen are reflected as intensity differences which are eventually visible to the eye. Vision and classical light microscopy are based on light absorption. In the thickness range which allows sharp imaging, electrons are not absorbed but only scattered. Among the authors of the basic treatises on electron microscopy (Zworykin et al., 1945; von Ardenne, 1940; von Borries, 1949a,b), the last emphasized contrast formation with biological specimens; he, and later Hall (1953), derived the basic equations expressing the relation between specimen properties and the resulting contrast. Assuming that contrast is mainly due to the elimination of electrons scattered into wide angles, these authors postulated that contrast in conventional bright-field imaging depends essentially on the mass density $\rho$ and the thickness $x$ of the specimen. As we will see in this chapter, this is true when only the unscattered electrons are considered or, approximately, when multiple scattering is neglected. A critical mass thickness $(\rho x)_c$ can be defined at which, on average, each electron is scattered once. This value is about 90-100 nm g/ml and is nearly independent of the matter considered. However, already at that time the big problem was the density of a specimen which has suffered extensive damage from the beam. For the contemporaries of these pioneers, it was rather obvious to take dry distillation as a model for such processes; the remains would essentially be a carbon skeleton comparable to charcoal. The thickness would have been preserved, and thus only a reduced density has to be considered. It was common practice to assume that this “biological carbon” would have a reduced density of about 1 g/ml.
As we will discuss in this review, the role of hydrogen in determining the scattering properties of organic matter has now become central. Indeed, among the atoms involved in biological matter, hydrogen presents peculiar scattering properties. Studies of beam-induced destruction also tend to show that hydrogen does not leave the matter preferentially (Dubochet, 1975); oxygen seems to depart most easily (Egerton, 1980). The ways in which contrast has been dealt with in regard to instrumentation are highly tortuous. In the first European instruments with electromagnetic lenses (Siemens and Philips) the objective aperture had to ensure that only the central part of the lens was used, and thus to eliminate the extremely high spherical aberration inherent in electron optics. It was also believed that contrast was produced by this central objective aperture by adequately eliminating scattered electrons. These microscopes had a tendency to expose the specimen to much higher doses than needed for imaging. Electrostatic lenses have the particular feature that it is not possible to accommodate an aperture in a central position. Despite this, and by using a sufficiently narrow illuminating beam, it was possible with electrostatic lenses to produce about the same contrast as with an electromagnetic lens with central aperture. This fact may have influenced the developments that followed. Hillier, one of the constructors of the Canadian electron microscope (Burton, Hillier, Prebus, et al.) and later director of the development of the RCA electron microscope, was working with biological specimens and therefore rethought contrast formation (Hillier, 1949; Hillier and Ramberg, 1949). He introduced an adequately narrow illuminating beam and showed that the electromagnetic lens without aperture is also able to achieve a usable contrast. 
Later he introduced the contrast aperture, which, in the objective lens, is positioned in the plane of the image of the condenser aperture (approximately in the back focal plane). At this place, a maximum of scattered electrons can be eliminated. This position is now currently favored. Experimenting with long- and normal-focal-length lenses and small contrast apertures, workers found limitations mainly on the practical side: apertures below some 15 µm are difficult to make round, and in particular are extremely sensitive to contamination-induced irregularity and thus to astigmatism. The possibility of compensating contamination-caused astigmatism by a stigmator is extremely limited; contamination layers do not produce a stable electrostatic field, but rather oscillating ones. In addition, the Abbe theory of imaging shows that the contrast aperture in the back focal plane (the Fourier plane) determines resolution by cutting off high spatial frequencies. The compromise with theory is fortunately beyond practical limitations. The above-mentioned findings led at that time to a certain consensus as to how the electron microscope should be conceived for biologists, particularly cytologists: a relatively long focal length (6-9 mm) and a 20-30-µm
contrast aperture; illumination with rad. With such microscopes, the stain produced by OsO₄ fixation provided sufficient contrast for observing thin sections. At that time it was already well known that the spherical aberration decreases with shorter focal lengths of the lens. Designers of electron microscopes thus put a great deal of effort into constructing such “high-power” lenses. Given the practical limitations in producing contrast apertures, it is obvious that the instrumental ability to produce contrast decreased again. This relative lack of contrast was compensated by more intensive heavy-metal staining and by using still narrower illuminating beams. Two side effects promoted the consensus for using this new type of microscope (narrow beam, short focal length) in biology: (i) The micrographs showed very fine interference noise which was believed to represent real images of the fine “atomic” structure of matter; this interference noise also allowed the definition of a new type of resolution test on carbon films, which demonstrated instrumental resolving powers in the range of a few angstroms. (ii) For reasons still unexplained, the unequal thicknesses within one thin section no longer led to “cloudiness”; the yield of good micrographs of thin sections increased markedly. Narrow-beam, high-coherence imaging was adopted by cytologists for these reasons. With the advent of the laser, high-coherence optics was developed very far at the same time. It had the advantage over incoherent optics of allowing much more far-reaching calculations. No wonder that this imaging mode in electron microscopy was calculated and investigated very thoroughly (Lenz, 1954; Thon, 1966; Hanszen and Trepte, 1971; Misell, 1975; Burge, 1973). 
Since interference involves phase differences, it was reasonable to call this imaging mode “phase contrast.” Unfortunately, this was misleading to the biologist, because, besides phase differences, there is no common basis with what is called phase contrast in optical microscopy. Although Zernike (1935) developed the principle theoretically by using the coherent case, the practically usable phase-contrast light microscope requires a maximally incoherent illumination (according to Köhler) in order to give meaningful images (Kellenberger et al., 1985). The theoretical basis of electron microscopic phase contrast showed, however, through the contrast transfer functions, that contrast is produced for specimen details above the interference noise, details which could be related to real structural features (Ericson and Klug, 1971). This was and still is used as a basis for image processing, where statistical noise from all sources, including interference noise, can be eliminated by averaging over repeated structures.
It is obvious that phase contrast occurs as a result of interference between electrons of equal velocity; i.e., phase contrast must be the result of interference between unscattered and elastically scattered electrons. High-resolution specimens should be thin; in such a case, the number of elastically scattered electrons used in imaging is very small when compared to the unscattered ones. This was obvious to many and stimulated some to undertake thorough investigations of dark-field imaging (Ottensmeyer, 1969; Dubochet et al., 1971; reviewed by Dubochet, 1973); when unscattered and most inelastically scattered electrons are eliminated, imaging depends essentially on the elastically scattered electrons. This should produce a theoretically simpler situation that is more accessible to experimental tests. Dark-field imaging turned out to be rewarding with such small biological molecules as small polypeptides (Ottensmeyer et al., 1975) and DNA (reviewed in Brack, 1981), but very disappointing with larger structures and thin sections (Weibull, 1974; Sjöstrand et al., 1978; Jones and Leonard, 1978). One major limitation is due to the recording mode of conventional transmission microscopes: a dose is needed in dark field which is 30-80 times that for bright-field imaging. This obviously leads to beam-induced specimen destruction (Thach and Thach, 1971; Dubochet, 1975; Isaacson, 1977; Egerton, 1980a, 1982a). With the STEM, this problem can be overcome. Still, with thicker specimens, and particularly with unstained thin sections (below 50 nm), satisfactory sharpness could not be routinely achieved, as was possible with bright-field CTEM on stained material. Most specialists explained this failure by multiple scattering. In the present chapter we will review our findings (Carlemalm and Kellenberger, 1982), which show that the surface relief plays an equally determinant role in this lack of sharpness with completely unstained specimens. 
In 1970 Crewe and Wall introduced scanning transmission electron microscopy (STEM). One of the several potential advantages of this imaging mode resides in the possibility of collecting separately electrons scattered in different ways. The collected signals can then be processed individually or interactively to modulate the image intensities. For very thick specimens with a high proportion of multiple scatter, Crewe and Groves (1974) and Groves (1975) compared the respective advantages of CTEM and STEM. Many of their considerations are taken up in the present chapter, which concerns current biological specimens prepared so as to be optimal for retrieving relevant structural information. Their thickness is such that multiple scatter is not predominant. For the other extreme, the imaging of atoms, Crewe proposed using the ratio of elastically to inelastically scattered electrons. The contrast is then mainly a function of the atomic number $Z$ (Crewe et al.,
1975). We introduced this mode of imaging for the observation of biological material, particularly of sections (Carlemalm and Kellenberger, 1982), and have shown that by use of this imaging mode the influence of the surface relief can apparently be reduced very substantially. By ratio contrast it was thus possible to obtain a higher resolution of completely unstained specimens than with the dark-field mode. Ratio contrast on unstained normal biological specimens was rightly criticized (Egerton, 1982b; Ottensmeyer and Arsenault, 1983) because, when first presented (Carlemalm and Kellenberger, 1982), multiple scattering was not considered. Real specimens that are useful in biology, even the most perfect ones of 10-50 nm thickness, show some multiple scattering. This multiple scattering is relevant whether we consider conventional imaging and phase contrast or ratio contrast. For a given specimen thickness the influence obviously becomes stronger with denser matter. The influence of multiple scatter is thus much more considerable with positively or negatively stained material (as is needed in phase-contrast bright-field imaging) than with unstained samples which are viewed with ratio contrast. Ratio or $Z$ contrast was also not acceptable to those believing that organic materials cannot show sufficient differences in their atomic composition to produce contrast. These questions will be discussed in the present chapter by comparing the influences of varying composition of matter on contrast formation in conventional and ratio-contrast imaging. To do this on real specimens, we had to progress by successive approximations; we thus obtain qualitative answers. First, we will use throughout only the particle aspect of the electrons. 
We felt that it is today possible to describe biological matter in regard to its electron scattering abilities fairly simply, while its description by charges and potentials is likely to be not only overcomplicated but also feasible with approximations only, which would lead to much less precision than the Heisenberg uncertainty involved in the particle approach. Second, we restricted contrast evaluations in relation to matter to considering elastically and inelastically scattered electrons only. For conventional contrast formation the elastically scattered electrons are essential, either directly in the dark-field mode or indirectly in producing phase contrast in narrow-beam, high-coherence imaging. For ratio contrast both types of scattered electrons are involved. Third, after reviewing multiple scatter in Section II,B, we neglect it in the following sections related to practical examples. Let us repeat once more that we are fully aware that treatment of contrast by the particle interaction only, as we do here, is an approximation. It is a relatively good one for ratio-contrast and dark-field imaging, but much less so for conventional bright-field imaging. In our opinion, however, this most widespread type of imaging is also the least understood despite all the calculations
available. Some of us firmly believe that the assumptions to be made for such calculations are much too approximate to be applicable to real biological specimens. While we believe that most conventional electron microscopy specimens behave as near ideal scatterers, the theoreticians have to assume ideal phase objects to be able to use wave theory. Only very recently, a real phase phenomenon was observed in a new type of embedding of membranes where lipid seems to be preserved. By defocusing, contrast reversal was easily produced (Westphal et al., 1984). Frozen-hydrated particle suspensions (Adrian et al., 1984) also showed contrast in conventional bright field, which suggests that they are phase objects: puzzling unpublished observations (in collaboration with J. Dubochet, EMBL, Heidelberg) suggest a lack of contrast correlation with the compactness (i.e., concentration) of proteins. We also owe the reader some explanation as to how we “quantify” contrast. We start from a physiological definition of contrast: two areas of gray can be distinguished as different when the two grays, measured as optical densities (negative logarithm of transparency or of reflectance), differ by at least a minimal, threshold value. The physiologically defined law of Weber-Fechner states that for a wide range of intensities this threshold is a constant. In the case of photographic electron recording, it is known (Valentine, 1966) that the optical density obtained on the negative is proportional to the electron dose received. We have also investigated the situation in STEM and found that here too the optical density on the photograph of the screen is strictly proportional to the voltage of the arriving signal $S$. In these cases, it is obvious that to the threshold of optical density $\Delta G_{th}$ corresponds a $\Delta I_{th}$ of the electron dose. Therefore we base our contrast considerations on $\Delta I$ ($\Delta S$). 
(The signal $S$ is derived as a linear combination of scattered and/or unscattered electrons.) In the literature this definition is not frequently used; very often authors use $\Delta I/I$ instead, in an incorrect analogy with photographic recording of light. In another paper we shall discuss these problems in more detail, comparing, e.g., direct recording of electrons with recording by intermediate light produced by a fiber plate and outside photography. We are obviously aware that the above-mentioned way of considering contrast is applicable and relevant for the visual detection of structural details of micrographs which are large relative to the noise. For small details, and obviously for the detection of atoms, the signal-to-noise ratio is more important than the contrast as defined above. In a forthcoming paper we will show examples with real, very small biological structures where it is obvious that the contrast is by far the predominant criterion for judging detectability.
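The contrast criterion described above can be illustrated with a minimal Python sketch. It assumes a strictly linear recorder, so that the optical density is proportional to the dose; the function names, the proportionality factor, and the threshold value are hypothetical and chosen only for illustration.

```python
def density_difference(dose_a, dose_b, density_per_dose):
    """Optical density difference for a linear recorder, G = density_per_dose * I."""
    return abs(density_per_dose * (dose_a - dose_b))


def is_distinguishable(dose_a, dose_b, density_per_dose, threshold_density):
    """Two gray areas are distinguishable when their densities differ by the threshold."""
    return density_difference(dose_a, dose_b, density_per_dose) >= threshold_density


def threshold_dose_difference(density_per_dose, threshold_density):
    """Dose difference that corresponds to the threshold density difference."""
    return threshold_density / density_per_dose


# Hypothetical recorder: 0.01 density units per unit dose, threshold of 0.02.
SENSITIVITY = 0.01
THRESHOLD = 0.02

# The same dose difference of 3 units on a weak and on a strong background.
weak_pair = is_distinguishable(100.0, 103.0, SENSITIVITY, THRESHOLD)
strong_pair = is_distinguishable(1000.0, 1003.0, SENSITIVITY, THRESHOLD)
too_small = is_distinguishable(100.0, 101.0, SENSITIVITY, THRESHOLD)
```

Under this assumption both pairs with a dose difference of 3 units pass the test, although their relative differences (3% and 0.3%) differ by a factor of ten, which is why the absolute difference is used here rather than the relative one.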
II. THEORY

A. Basic Equations for the Single-Scattering Approximation
The probability of scattering for an incident electron by an atom is governed by a cross section $\sigma$, which is defined by

$$dI = -I\,\sigma N\,dx \tag{1}$$

where $N$ is the number of atoms per unit volume, $dx$ is the length of the electron path, and $I$ is the number of incident electrons. One generally considers elastic scattering (with its cross section $\sigma_{el}$) for electrons scattered at large angles without energy loss, and inelastic scattering (with its cross section $\sigma_{in}$) for electrons which have suffered measurable energy loss. The total scattering cross section is then

$$\sigma = \sigma_{el} + \sigma_{in}$$
In Section III we shall discuss how these cross sections can be calculated from $Z$, the atomic number. The number $N$ of atoms in compact matter is described by the experimentally determinable mass density $\rho$, from which we obtain

$$N = (L/A)\,\rho$$

with $L$ being Avogadro's number, $6.023 \times 10^{23}$, and $A$ the atomic weight. With $I = I(x)$, the basic equation (1) then becomes

$$dI = -I\,\sigma\,(L/A)\,\rho\,dx \tag{1'}$$
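As a small numerical illustration of $N = (L/A)\rho$, the following Python sketch converts a mass density into a number of atoms per unit volume. The carbon density of 2.0 g/ml is an assumed round value used only for the example, and the function name is hypothetical.

```python
AVOGADRO = 6.023e23  # atoms per mole, the value quoted in the text
NM3_PER_CM3 = 1.0e21  # 1 cm = 1e7 nm, so 1 cm^3 = 1e21 nm^3


def atoms_per_cm3(density_g_per_ml, atomic_weight):
    """Number density N = (L/A) * rho, with rho in g/ml (equal to g/cm^3)."""
    return AVOGADRO / atomic_weight * density_g_per_ml


# Assumed example: carbon (A = 12) at an illustrative density of 2.0 g/ml.
n_carbon_cm3 = atoms_per_cm3(2.0, 12.0)
n_carbon_nm3 = n_carbon_cm3 / NM3_PER_CM3
```

With these assumed inputs the result is about $10^{23}$ atoms per cubic centimeter, or roughly 100 atoms per cubic nanometer.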
For a convenient separation of variables, one introduces the $K$ factors defined as

$$K = \sigma L/A, \qquad K_{el} = \sigma_{el} L/A, \qquad K_{in} = \sigma_{in} L/A$$

As we will see later (Fig. 5 and Table II), $K$ is to a first approximation a universal constant independent of the matter considered, except for hydrogen and helium. $K_{el}$ and $K_{in}$, however, depend on $Z$ and $A$, as also shown in Fig. 5. The basic differential equation for unscattered electrons then becomes

$$dI = -I K \rho\,dx, \qquad \frac{dI}{I} = d\ln I = -K\rho\,dx \tag{1''}$$

which, integrated, gives

$$I_{un} = I_0 \exp(-K\rho x) \tag{3}$$

$$I_{un} = I_0 \exp[-\rho x/(\rho x)_c] \tag{3'}$$
Despite its simplicity, this equation deserves some discussion.¹ The unscattered electrons will reach the value of $e^{-1} = 0.37$ when $\rho x = (\rho x)_c = 1/K$; $\rho x$ was called the “mass thickness” by the pioneers of electron microscopy, and is nothing else than the mass per unit surface; $(\rho x)_c$, the critical mass thickness, with $x$ expressed in nanometers and $\rho$ in grams per milliliter, is 100 ± 10 for all relevant substances. The symmetry of $\rho$ and $x$ is of extreme interest for rapid calculations of contrast, for we can use it to answer the question of how much $\rho$ must vary to provide the same effect as a variation of the thickness $x$, or the reverse. Or one can ask how much heavy-metal stain must be added to any embedded substance to double the contrast, e.g., in a thin section. Let us take up this last example: A compact protein with $\rho = 1.3$ in a resin of $\rho = 1.2$ is sectioned to 50 nm. We obtain a $\rho x$ of 65 for the protein and of 60 for the resin, i.e., a $\Delta(\rho x) = 5$. How much must the density of the protein be changed by a heavy-metal stain to reach $\Delta(\rho x) = 10$? When the volume change is neglected, the density of the stained protein must be 1.4. Next question: How large a thickness variation of the resin would also produce a $\Delta(\rho x) = 5$? Answer: 4 nm! From these simple considerations it becomes obvious how useful the symmetry of $\rho$ and $x$ in this formula and the constancy of $(\rho x)_c$ are. We will see later, in Section IV, that most of the calculations presented there can be very well approximated by the above considerations. The authors of today's literature often introduce the “mean free path $\Lambda$,” which is the average distance $x_c = \Lambda$ traveled by a beam electron between two consecutive interactions. From the above, $\Lambda = 1/K\rho$, which says essentially that the mean free path is inversely proportional to the density $\rho$. It seems obvious to us, from what we said above, that $\Lambda$ does not help us in any qualitative considerations while $(\rho x)_c$ does.
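The worked example above can be retraced with a few lines of Python. This is only a sketch: it assumes the round value $(\rho x)_c = 100$ nm g/ml quoted in the text, and all function and variable names are hypothetical.

```python
import math

CRITICAL_MASS_THICKNESS = 100.0  # (rho x)_c in nm g/ml, approximate value from the text


def mass_thickness(density, thickness_nm):
    """Mass thickness rho * x, with rho in g/ml and x in nm."""
    return density * thickness_nm


def unscattered_fraction(density, thickness_nm, critical=CRITICAL_MASS_THICKNESS):
    """I_un / I_0 = exp(-rho x / (rho x)_c), as in Eq. (3')."""
    return math.exp(-mass_thickness(density, thickness_nm) / critical)


def mean_free_path_nm(density, critical=CRITICAL_MASS_THICKNESS):
    """Mean free path 1 / (K rho), written as (rho x)_c / rho."""
    return critical / density


SECTION_NM = 50.0
protein = mass_thickness(1.3, SECTION_NM)       # about 65
resin = mass_thickness(1.2, SECTION_NM)         # about 60
difference = protein - resin                    # about 5

# Density the stained protein needs for a difference of 10 (volume change neglected).
stained_density = (resin + 10.0) / SECTION_NM   # about 1.4

# Thickness variation of the resin that gives the same difference of 5.
equivalent_relief_nm = difference / 1.2         # about 4 nm
```

The same helper functions give the unscattered fraction for either material, so the effect of a density change and of a relief of a few nanometers can be compared directly.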
As long as the number of scattered electrons remains small in comparison to the number of unscattered electrons, Eq. (1') can also be integrated to provide the thickness behavior of both elastic and inelastic electrons:

$$dI_{el} = I_{un} K_{el}\,\rho\,dx, \qquad dI_{in} = I_{un} K_{in}\,\rho\,dx$$

Replacing $I_{un}$ by Eq. (3) and integrating, we get

$$I_{el}(x) = (K_{el}/K)\,I_0\,[1 - \exp(-K\rho x)] \tag{4}$$

$$I_{in}(x) = (K_{in}/K)\,I_0\,[1 - \exp(-K\rho x)] \tag{5}$$

We check that $I_{un}(x) + I_{el}(x) + I_{in}(x) = I_0$.

¹ The formal analogy of the law of Eq. (3') with that of Lambert-Beer, applicable, e.g., in the case of the absorption of light in solutions, might be noted here: $I/I_0 = \exp(-kcx)$, with $c$ the concentration of a solution and $x$ the thickness of the cell. This becomes $-\ln(I/I_0) = \mathrm{od} = kcx$ (see also Section A).
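The single-scattering expressions for the unscattered, elastic, and inelastic intensities can be checked numerically with the short Python sketch below. The values of $K_{el}$ and $K_{in}$ are purely illustrative assumptions (chosen so that $K = K_{el} + K_{in} = 1/100$ in units of ml/(g nm)); they are not taken from the chapter's tables, and the function name is hypothetical.

```python
import math


def single_scattering_intensities(rho_x, k_el, k_in, i0=1.0):
    """Return (I_un, I_el, I_in) from Eqs. (3), (4) and (5), with K = K_el + K_in."""
    k = k_el + k_in
    unscattered = i0 * math.exp(-k * rho_x)
    scattered = i0 - unscattered
    elastic = k_el / k * scattered
    inelastic = k_in / k * scattered
    return unscattered, elastic, inelastic


# Illustrative assumption only: K_el = 0.004 and K_in = 0.006, so that K = 0.01.
K_EL = 0.004
K_IN = 0.006

i_un, i_el, i_in = single_scattering_intensities(rho_x=60.0, k_el=K_EL, k_in=K_IN)
total = i_un + i_el + i_in
```

The sum of the three parts returns the incident intensity for any mass thickness, which is the check stated in the text; the elastic and inelastic parts share the same dependence on $\rho x$.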
Forming the ratio of elastic and inelastic intensities, one shows that

$$r = I_{el}/I_{in} = K_{el}/K_{in} = \sigma_{el}/\sigma_{in} \tag{6}$$

from which formula both $\rho$ and $x$ have disappeared. The result now depends only on the cross sections, which in turn depend on $Z$; this is nothing else than Crewe's $Z$ contrast (Crewe et al., 1975). It is therefore only dependent on the elemental composition of the specimen.

B. Multiple Scattering
The simplified description presented above is only valid as long as the number of scattered electrons is small compared to that of the unscattered ones. With increasing thickness, multiple scattering becomes more frequent and one has to use statistical methods to describe more satisfactorily the interaction of the incident electrons with the specimen. Every biologist with a modern education knows the Poisson distribution and uses it in many of his problems such as virus infection and one- and multiple-hit phenomena (as, for example, first-order kinetics in chemistry and inactivation in radiobiology). The interaction of electrons with matter also obeys the Poisson distribution (von Borries, 1949a; Lamvik and Langmore, 1977). The probability $P(n)$ for $n$ rare events to occur on one individual target is

$$P(n) = (m^n/n!)\exp(-m) \tag{7}$$
with $m$ the average number of events per target in a population. Within the specimen we associate with each electron a cylinder of cross section $\sigma$ ($\sigma_{el}$, $\sigma_{in}$) and height $x$. Scatter of the electron will occur once or more, according to the number of atoms (each represented by one point) contained in this cylinder. When using the definition of cross sections, one finds that

$$m = \sigma\,(L/A)\,\rho x = K\rho x \tag{7'}$$

As discussed in the preceding section, we obtain the critical mass thickness $(\rho x)_c = 1/K$ for the case $m = 1$. As we see in Table II, this critical mass thickness is, within ±10%, a matter-independent constant and therefore of large practical value for qualitative estimations. This definition will be used throughout in preference to $\Lambda = 1/K\rho$. Purely formally, $m$ can be expressed by $x/\Lambda$, which is mathematically elegant, but practically of little use. Since we know $(\rho x)_c$ to be about 100 when $\rho$ is expressed in grams per milliliter and $x$ in nanometers, we can immediately estimate $m$ for any practically used specimen as $m = \rho x/(\rho x)_c \approx \rho x/100$. Knowing $m$, we can read in Table I the distribution of single- and multiple-scattering events. For sections with little or no stain and 100 kV, $m$ lies between 0.2
and 0.4. For $m = 0.4$, only 20% of the scattered electrons are multiply scattered. For negative stain layers around structures of some 20 nm we reach values of $m = 1$. Multiple scatter here already affects 40% of the scattered electrons. Considering scattering per se, with a given $m$ value, one easily calculates the proportion of unscattered electrons,

$$P(0) = \exp(-m)$$

that is, $I_{un}/I_0 = \exp(-K\rho x)$ as above; the proportion of single-scattering events,

$$P(1) = m\exp(-m)$$

that is, $I_1/I_0 = K\rho x\exp(-K\rho x)$; the proportion of scattering events, including single and multiple events,

$$\frac{I_{scat}}{I_0} = \sum_{i=1}^{\infty} P(i) = 1 - P(0) = 1 - \exp(-m) = 1 - \exp(-K\rho x)$$

and, similarly, the proportion of multiple events,

$$\sum_{i=2}^{\infty} P(i) = 1 - P(0) - P(1) = 1 - (1 + m)\exp(-m)$$

which can be developed, when $m$ remains small, into

$$P_{mult} \approx \tfrac{1}{2}(1 - m)\,m^2 \tag{8}$$

It increases approximately with the square of $m$. Some typical values are tabulated in Table I.

TABLE I
PROPORTIONS OF UNSCATTERED, SINGLY, DOUBLY, AND MULTIPLY SCATTERED ELECTRONS CALCULATED WITH THE POISSON DISTRIBUTION

m         0.1       0.2      0.4     1
P(0)      0.905     0.82     0.67    0.37
P(1)      0.0904    0.16     0.26    0.37
P(2)      0.0045    0.016    0.05    0.18
P(>2)     0.0001    0.004    0.02    0.08

To apply Poisson statistics, the assumption of a straight electron path, and thus also of a straight associated vertical cylinder, is needed; it is entirely correct for single-scattering events. This is only approximately true for multiple scattering: for a single-scattering event we know that the large majority of scattered electrons are still within a scattering angle below
10- rad. Treating double-scattering events with Poisson statistics is thus still a very good approximation, leading only to very small errors. For practical purposes we rarely deal with specimen thicknesses allowing for substantial numbers of more than two consecutive scattering events. At least this is so as long as we use no, or only restricted, amounts of heavy metals, i.e., when the density p is within the range 0.9-1.3, as we have to consider for unstained biological material embedded in resins with and without heavy metal. We can then also apply the Poisson distribution to calculate the relative proportions of I,,, Zin, and Iel. For a single-scattering event we obtain P( 1) = rn exp( - rn) = p x ( K i ,
+ Kel)exp( - K p x )
(9)
with Pi,( 1) = Iin/Io= pxKi, exp( - K p x )
and
(9’) peI(1) = IeI/Io = pxKelexP(-Kpx)
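The proportions tabulated in Table I follow directly from the Poisson terms given above. The short Python sketch below recomputes them; the function names are illustrative and do not come from the original text.

```python
import math


def poisson_proportions(m):
    """Return (P(0), P(1), P(2), P(>2)) for a mean number m of scattering events."""
    p0 = math.exp(-m)            # unscattered electrons
    p1 = m * p0                  # singly scattered
    p2 = 0.5 * m * m * p0        # doubly scattered
    p_more = 1.0 - p0 - p1 - p2  # more than two events
    return p0, p1, p2, p_more


def multiple_share_of_scattered(m):
    """Fraction of the scattered electrons that were scattered more than once."""
    p0, p1, _, _ = poisson_proportions(m)
    return (1.0 - p0 - p1) / (1.0 - p0)


if __name__ == "__main__":
    for m in (0.1, 0.2, 0.4, 1.0):
        p0, p1, p2, p_more = poisson_proportions(m)
        print(f"m={m}: P(0)={p0:.3f} P(1)={p1:.3f} P(2)={p2:.3f} "
              f"P(>2)={p_more:.4f} "
              f"multiple/scattered={multiple_share_of_scattered(m):.2f}")
```

Small differences from the rounded entries of Table I are to be expected in the last digit.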
For a double-scattering event we have to introduce the binomial distribution and we obtain the following relative proportions:

$P(2) = \tfrac{1}{2}(\rho x)^2\,(K_{el}^2 + 2K_{el}K_{in} + K_{in}^2)\exp(-K\rho x)$ (10)

From this we deduce for the proportion of double elastic scatter

$P_{el/el}(2) = I_{el/el}/I_0 = \tfrac{1}{2}(\rho x)^2 K_{el}^2\exp(-K\rho x)$ (11a)

for the elastic-inelastic scattered proportion

$P_{el/in}(2) = I_{el/in}/I_0 = (\rho x)^2 K_{el}K_{in}\exp(-K\rho x)$ (11b)

and for the double inelastic scatter

$P_{in/in}(2) = I_{in/in}/I_0 = \tfrac{1}{2}(\rho x)^2 K_{in}^2\exp(-K\rho x)$ (11c)

For the remaining multiple-scatter events above 2, binomial distributions with n are obtained in analogy, but, as observed above, will become increasingly approximate:

$P(n) = \frac{(\rho x)^n}{n!}\exp(-K\rho x)\sum_{i=0}^{n}\binom{n}{i}K_{el}^{\,i}K_{in}^{\,n-i}$

with i from 0 to n. From Eqs. (7) and (9) we might calculate the proportion of electrons which have suffered at least one elastic scatter:

$1 - P(0) - \sum_{i=1}^{\infty} P_{in}(i)$
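As a numerical illustration of Eqs. (10) and (11), the sketch below splits the double-scattering proportion into its elastic and inelastic parts and evaluates the proportion of electrons with at least one elastic event. Two points are assumptions rather than statements of the text: the purely inelastic i-fold term is taken as $P_{in}(i) = (\rho x K_{in})^i \exp(-K\rho x)/i!$ by analogy with Eq. (11c), and the example constants are the protein values of Table II converted to units of (nm g/ml)^-1.

```python
import math


def double_scatter_split(rho_x, k_el, k_in):
    """Return (el/el, el/in, in/in) proportions of doubly scattered electrons."""
    damp = math.exp(-(k_el + k_in) * rho_x)
    el_el = 0.5 * (rho_x * k_el) ** 2 * damp
    el_in = rho_x ** 2 * k_el * k_in * damp
    in_in = 0.5 * (rho_x * k_in) ** 2 * damp
    return el_el, el_in, in_in


def at_least_one_elastic(rho_x, k_el, k_in, n_max=60):
    """1 - P(0) - sum of purely inelastic i-fold events (assumed form, see lead-in)."""
    damp = math.exp(-(k_el + k_in) * rho_x)
    purely_inelastic = sum(
        (rho_x * k_in) ** i / math.factorial(i) * damp
        for i in range(1, n_max + 1)
    )
    return 1.0 - damp - purely_inelastic


if __name__ == "__main__":
    # Assumed example: protein, rho = 1.3 g/ml, x = 40 nm, K in (nm g/ml)^-1.
    rho_x, k_el, k_in = 1.3 * 40.0, 3.64e-3, 7.27e-3
    print(double_scatter_split(rho_x, k_el, k_in))
    print(at_least_one_elastic(rho_x, k_el, k_in))
```

Under the assumed form of $P_{in}(i)$ the series sums to the closed expression $1 - \exp(-K_{el}\rho x)$, which the test below checks.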
CONTRAST FORMATION IN ELECTRON MICROSCOPY
Useful approximations to these formulas were given for the first time by Crewe and Groves (1974). Similar ones have been used by several other authors (Egerton, 1982a; Colliex and Mory, 1984). When the angular distribution of the scattered electrons has also to be taken into account, only approximations can be made. Elastic electrons are distributed into rather wide angles, a few rad, while the inelastic ones remain mainly within a few rad. This is actually an important factor when one wants to develop calculations for quantitative agreement. Under experimental conditions one has then to consider the given angular acceptance of the detectors. Moreover, multiple-scattering events may redistribute the scattered electrons between the categories defined above; for example, double elastic scattering can return an electron to the forward direction, so that it can no longer be distinguished from the unscattered beam, or on the contrary, the two angles can add up so that it is scattered outside the maximum acceptance angle of the detector.

The real situation is much more satisfactorily described by using the Monte Carlo approach. Each type of scattering event is characterized by a set of parameters (energy loss, angular deviation, length between successive interactions) and one simulates many possible trajectories for an incident electron arriving on the specimen by random access to these parameters. The transmitted electrons are then classified according to the angular acceptance and energy window of the detectors used (Jouffrey, 1983; Reichelt and Engel, 1984) for application to typical organic materials, as illustrated in Fig. 1.

C. Contrast Formation
In any imaging disposition of electron microscopy, contrast is achieved by eliminating or selecting part of the transmitted beam. With the conventional EM column (CTEM), scattering contrast has long been distinguished from phase contrast. In the first case, one deals with intensities within well-defined solid angles (bright field, tilted dark field), while in the second mode, one uses interference effects between the unscattered and the elastically scattered wave, with high-coherence illumination. This latter contrast seems to govern high-resolution studies; it is particularly important for periodic structures and out-of-focus conditions. It will not be considered in this chapter. On the other hand, recent developments with STEMs offer new possibilities for using scattering contrast. These are mainly due to the flexible possibilities for efficient angular and energy selection on the transmitted beam: annular dark-field detectors (ADF) can collect all electrons scattered at large angle; magnetic spectrometers can discriminate electrons of any energy loss and, as a special case, can separate all inelastic electrons from those with no energy loss.
FIG. 1. Representation of the calculated angular and energy distribution after passing through a 50-nm-thick protein layer. (Courtesy of Dr. R. Reichelt.)
Figure 2 shows some typical angular configurations for the most widely used imaging modes in CTEM and STEM. The signals in CTEM can be described by using a single-scattering approximation as

$S_{BF}^{CTEM} = I_{un} + aI_{in} + bI_{el}$ (12)

$S_{DF}^{CTEM} = a'I_{in} + b'I_{el}$ (12')

where a, b, a', and b' must be calculated from the geometry of the incident and collected beams. In CTEM modes, it is only possible to discriminate electrons of different energies by a rather complicated column modification [introduction of a prism-mirror configuration (Castaing and Henry, 1962; Ottensmeyer and Andrew, 1980; Henkelman and Ottensmeyer, 1974), or of its completely equivalent magnetic filter].

FIG. 2. Image formation in (a) bright-field CTEM with parallel illumination and (b) STEM using annular dark-field detector and EEL spectrometer. In the CTEM all electrons deflected at angles larger than $\theta_0$ are removed by the objective aperture and cause scattering contrast. In STEM the dark-field signal is formed by the electrons scattered to a larger angle than $\theta_0$ and recorded on the dark-field detector; electrons scattered to an angle smaller than $\theta_0$ are recorded by the spectrometer.

The important signals in STEM are
When the system has been designed so that the spectrometer acceptance angle is equal to the minimum acceptance angle of the annular dark-field detector,

$c = 1 - c'$

Each of the collection factors a, b, c, d is smaller than one and depends mainly on the optical geometry of the microscope and slightly on the elementary composition of the matter considered. The angular distribution of scattered electrons varies with the atomic composition of the matter. Elastically scattered electrons usually have a much broader distribution than the inelastic ones. Eusemann et al. (1982) have introduced instead the notion of effective cross section. In our nomenclature $\sigma_{eff} = b\sigma$ in the above example and accordingly in the other cases described below. We have modified the formulas of Eusemann et al. accordingly, so as to get a mathematical expression for these collection factors; they can easily be calculated from the known geometry of the microscope (see Section III,A). In dark-field CTEM (beam tilt), a' and b' represent a rather complicated geometry (Langmore et al., 1973), which will not be dealt with here. For hollow-cone dark field in CTEM and for dark field in STEM, the factors are easily described by the geometric parameters of the dark-field detector. For STEM Z contrast, we form the ratio between the signals obtained from the dark-field detector and the spectrometer
By ideal collection, this ratio would be equal to $r = \sigma_{el}/\sigma_{in}$. In reality the annular dark-field detector collects less than 100% of the elastically scattered electrons and in addition some inelastically scattered electrons, as shown by Eqs. (13) and (14). We define

$r' = (c + dr)/c'$ (15)
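Equation (15) can be checked against the tabulated collection factors. The sketch below evaluates r' for a few rows of Table III; the assignment of the flattened table columns to c, c', d, r, and r' is an assumption made here, and only rows for which this reading is self-consistent are used.

```python
def ratio_signal(c, c_prime, d, r):
    """Eq. (15): r' = (c + d*r) / c'."""
    return (c + d * r) / c_prime


# substance: (c, c', d, r, tabulated r'), values as read from Table III
ROWS = {
    "Carbon": (0.095, 0.905, 0.433, 0.748, 0.463),
    "Carbohydrate": (0.095, 0.905, 0.476, 0.517, 0.376),
    "Protein": (0.090, 0.910, 0.460, 0.500, 0.353),
    "Lipid (lecithin)": (0.085, 0.915, 0.450, 0.449, 0.314),
}

if __name__ == "__main__":
    for name, (c, c_prime, d, r, tabulated) in ROWS.items():
        computed = ratio_signal(c, c_prime, d, r)
        print(f"{name}: computed r'={computed:.3f}, tabulated r'={tabulated:.3f}")
```

For these rows the computed and tabulated values agree to within rounding.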
This constant is, however, of very little practical importance, since it does not help very much for the cases considered here (Table III). The signal S is easily modified electronically by the gain and by subtracting a constant value. Such electronic manipulations may change the contrast but obviously cannot increase the information contained in the signal. When considering the effect of multiple scattering (as described by either the Poisson or the Monte Carlo method), one has to point out that there is no possibility to discriminate in energy those electrons collected by the annular dark-field detector. Consequently,

$I_{ADF} = I_{el} + I_{in/el}$

as defined by Eqs. (9) and (11).

D. Treatment of Superpositions: The Dilution Theorem
In practice, it rarely occurs that different substances of a specimen are nicely arranged side by side. Embedded biological structures will be surrounded by embedding material above and below and on the sides. In thin sections of such embedded material, many biological structures might span the slice from surface to surface, but many others will be partly or wholly included in the slice. The matter-specific constants $\rho$, $K$, and $K_{el}$ are thus discontinuous along x.
FIG. 3. Schematic drawing illustrating the "dilution theorem" described in the text in Section II,D. Arrows indicate the path of one electron with its associated cylinder.
Locally, around a "point" of the specimen, the matter has to be considered in a strict treatment as being layered. This situation is mathematically easy to deal with in the case of the unscattered electrons $I_{un}$ [Eq. (3)], but it becomes very complicated for $I_{el}$ and $I_{in}$ [Eqs. (4) and (5)]; in these formulas we already made the simplification of only considering single-scattering events. For thin specimens we can easily evaluate this distribution of discontinuities or layered specimens. In Fig. 3 we see part of a specimen composed of one matter (crosses) with a layer of a second matter (circles). The associated cylinders of the crosses and circles are shown. The numbers of crosses and circles contained in the corresponding associated cylinders determine the frequencies when entered in the Poisson distribution. It seems obvious to us that another distribution of the circles, as indicated in Fig. 3 by the cylinder to the right, will not affect the frequencies. In a first approximation, valid for thin specimens, the distribution of matter along the depth x of the section is thus irrelevant. This distribution only becomes relevant when multiple scatter has to be considered.

In theory, any discontinuous distribution of matter along x would be reflected in different characteristics of the image according to the two possible orientations of the specimen during observation. This was indeed a point of contention in the early days of electron microscopy and comes up again as a question in every practical electron microscopy course. With a strong influence of multiple scatter, the resulting image should be strongly and significantly different according to whether the supporting film or the structure per se were turned toward the lens. Such a difference should be particularly strong, e.g., with shadowed or negatively stained specimens on thick organic supporting films. No significant, striking difference was ever found and/or published.
Based on the theoretical considerations above, the consensus nevertheless recommends orienting the side with the specimen toward the lens.
We therefore think that neglecting multiple scatter in treating the examples in Section IV,E and Fig. 13, is a fully justified approximation which allows for correct qualitative assessments.
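The dilution theorem can be illustrated numerically. In the single-scattering (Poisson) picture only the sum of $K\rho x$ over the layers traversed enters, so reordering the layers leaves the result unchanged. The sketch below is a minimal illustration under that assumption; the layer values are the HM20 and protein constants of Table II expressed in (nm g/ml)^-1, and the function names are illustrative.

```python
import math
from itertools import permutations


def mean_events(layers):
    """Mean number of scattering events for layers given as (K, rho, thickness_nm)."""
    return sum(k * rho * t for k, rho, t in layers)


def unscattered_fraction(layers):
    """I_un / I_0 for the layered specimen."""
    return math.exp(-mean_events(layers))


if __name__ == "__main__":
    resin_top = (11.69e-3, 1.09, 10.0)     # HM20, 10 nm
    protein = (10.91e-3, 1.30, 10.0)       # protein layer, 10 nm
    resin_bottom = (11.69e-3, 1.09, 20.0)  # HM20, 20 nm
    for order in permutations([resin_top, protein, resin_bottom]):
        print(f"{mean_events(order):.4f}  {unscattered_fraction(order):.4f}")
```

Every ordering prints the same pair of values; a depth dependence could only enter through multiple scattering, which this sketch deliberately ignores.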
III. SCATTERING CROSS SECTIONS AND CONSTANTS

A. Atomic Scattering Cross Sections
The scattering power of an isolated atom on an incident high-energy electron is defined by its scattering cross section, which represents its effective target area. It can only be deduced from complex calculations involving refined potential descriptions of the Coulombic field of the nucleus and its orbital electrons. Results are presented either in terms of differential cross sections for scattering within a given solid angle or of total cross sections which measure the probability of elastic or inelastic events as introduced in the theory above. Another way of adapting these calculations for practical use is to weight the total values with an efficiency factor which depends essentially on the experimental conditions.

1. Elastic Cross Sections

There are several theoretical studies available in the literature (Burge and Smith, 1962; Crewe et al., 1975; Eusemann et al., 1982; Schafer et al., 1971), and the problem is to decide which one is the most suitable for practical applications. In the case of elastic cross sections, there is not too much difference between the published results when one is not interested in accurate values for forward scattering at very small angles. In the following we give the resulting cross sections $\sigma$ and scattering constants K as a function of the atomic number Z, as obtained (i) by the coarse approximation of Crewe et al. (1975) and (ii) by the equations used by Eusemann et al. (1982). For the approximation according to Crewe et al. (1975) we used
From Eusemann et al. (1982), we have kept the following equations in which the angular parameters are taken into account. For CTEM, bright-field configuration:
where f (0) is the forward scattering amplitude and 8, is the characteristic scattering angle, estimated from the tables of Schafer et al. (1971). In the simple screened Coulombic potential, 8, can be estimated analytically as
8, = 1/2na where a = a(Z) = 0.885802-'/3 is the screening radius for atomic number Z. Similarly for STEM, with the annular dark-field configuration:
$\theta_0$ is in CTEM the limiting angle of the objective aperture; in STEM it represents the angle of the illumination cone.

2. Inelastic Cross Sections

Though most of the energy transfer between the incident electron and the specimen takes place with the peripheral atomic electrons, which are not well described in any of the single models, it is useful to consider first an atomic description, the validity of which will be discussed later on. We have therefore tried to estimate, similarly to the elastic case, some values for total inelastic cross sections by summing over all possible excitations. From Wall et al. (1974) we can calculate a general Z dependence following

$\sigma_{in} = (1.5 \times 10^{-4}/\beta^2)\,Z^{1/2}\ln(2/\theta_E)$ (20)
In this formula, a characteristic inelastic angle $\theta_E = \Delta E/2E_0$, $\beta = v/c$, $2E_0 = \beta^2(V_0 + mc^2)$, with relativistic correction, and the average energy loss $\Delta E$ is supposed to be equal to 6Z. This assumption is valid for low-Z materials, such as most organic substances as checked by Isaacson (1977), but cannot be extended simply to high-Z atoms. The angular efficiency for inelastic scattering has been evaluated from the formulas of Eusemann et al. (1982). They are

CTEM:

STEM:

The angular parameters involved, $\theta_E$ together with the characteristic and limiting angles, have already been defined; $\lambda_c$ is the Compton wavelength ($2.4263 \times 10^{-12}$ m) and q is a fitting constant which has been found to be q ≈ 1.09. For comparison with the efficiency factors introduced above [Eqs. (12)-(14)], for CTEM:

for STEM:

TABLE II
RELEVANT CONSTANTS FOR CONTRAST FORMATION IN STEM AND CTEM AT 100 kV

Columns: average atomic number Z; average atomic weight A; density ρ; σ_el (×10^-19); σ_in (×10^-19); K_el (×10^4); K_in (×10^4); K (×10^4); (ρx)_c = 1/K (×10^-6)

Substance                    Z      A      ρ      σ_el   σ_in   K_el   K_in   K      (ρx)_c
Hydrogen                     1      1      -      0.35   6.10   2.12   36.41  38.59  -
Carbon                       6      12.01  -      7.89   10.55  3.96   5.29   9.25   10.8
Nitrogen                     7      14.01  -      8.43   11.40  3.62   4.90   8.52   -
Oxygen                       8      16.0   -      8.95   12.16  3.37   4.58   7.94   -
Phosphorus                   15     30.97  -      33.10  13.31  6.44   2.58   9.02   -
Carbohydrate                 4.14   7.72   1.4    4.56   8.82   3.55   6.88   10.43  9.6
Protein                      3.83   7.10   1.3    4.29   8.57   3.64   7.27   10.91  9.2
Lipid (lecithin)             3.21   5.75   1      3.60   7.98   3.76   8.36   12.12  8.3
Nucleic acid (a)             5.26   10.24  1.6    6.65   9.57   3.91   5.63   9.54   10.5
HM20                         3.34   6.12   1.09   3.70   8.16   3.64   8.04   11.69  8.6
K4M                          3.57   6.53   1.23   3.91   8.34   3.61   7.69   11.30  8.8
Epon                         4.01   7.52   1.24   4.65   8.74   3.73   7.00   10.73  9.3
Ice                          3.33   6.01   0.92   3.22   8.12   3.23   8.15   11.38  8.8
Sn-methacrylate resin        3.78   7.27   1.24   5.92   8.02   4.95   6.70   11.64  8.6
Os-fixed protein (b)         4.51   8.80   1.63   6.46   8.64   4.42   5.91   10.34  9.7
Potassium phosphotungstate   22.86  53.47  5.1 (c) 60.91 12.87  6.86   1.45   8.32   12.0

(a) Density value represents Na salt.
(b) Os-fixed protein containing 20% (w/w) Os.
(c) Data from Brown et al. (1977).
(ρx)_c is the mass per surface area that statistically leads to one scattering event per incident electron.
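The scattering constants of Table II can be recomputed from the cross sections. The sketch below assumes that K is obtained as $K = \sigma L/A$, with L Avogadro's number, that the tabulated cross sections are in units of 10^-19 cm^2, and that K is therefore in cm^2/g; these unit readings are inferred here from the numbers and are not stated explicitly in the table.

```python
AVOGADRO = 6.022e23  # atoms per mole


def scattering_constant(sigma_cm2, atomic_weight):
    """K = sigma * L / A, in cm^2/g for sigma in cm^2 (assumed units)."""
    return sigma_cm2 * AVOGADRO / atomic_weight


def critical_mass_thickness(k_total):
    """(rho x)_c = 1/K, in g/cm^2 for K in cm^2/g."""
    return 1.0 / k_total


if __name__ == "__main__":
    # Carbon row of Table II: sigma_el = 7.89e-19, sigma_in = 10.55e-19, A = 12.01
    k_el = scattering_constant(7.89e-19, 12.01)
    k_in = scattering_constant(10.55e-19, 12.01)
    rho_x_c = critical_mass_thickness(k_el + k_in)
    print(f"K_el = {k_el / 1e4:.2f} x 10^4, K_in = {k_in / 1e4:.2f} x 10^4, "
          f"(rho x)_c = {rho_x_c * 1e6:.1f} x 10^-6")
```

With these assumptions the carbon row of Table II is reproduced to within rounding.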
The results of these calculations are shown both in Fig. 4 and in Tables II and III, which concern, respectively, total cross sections and geometry-dependent factors (a, b, c', d) in the case of single scattering. We also considered the scattering constants K:
These constants have also been plotted in Fig. 5, which shows that K is approximately constant, except for hydrogen and helium. On the contrary, the ratio $r = K_{el}/K_{in}$ displayed in Fig. 6 varies substantially. Of the two, the more sophisticated calculations using Hartree-Fock models clearly reveal the periodicity of electron shells with increasing Z.

FIG. 4. Elastic scattering cross sections $\sigma_{el}$ and inelastic scattering cross sections $\sigma_{in}$ as a function of Z for 100 kV calculated according to (a) Langmore et al. (1973) and (b) Eusemann et al. (1982).

TABLE III
GEOMETRICAL PARAMETERS DETERMINING ELECTRON COLLECTION AND RATIO CONTRAST

                             CTEM (a)        STEM (b)                Ratio (c)
Substance                    a      b        c      c'     d        r      r'     r/r'
Hydrogen                     0.887  0.170    0.065  0.934  0.409    0.058  0.025  2.32
Carbon                       0.835  0.162    0.095  0.905  0.433    0.748  0.463  1.62
Nitrogen                     0.805  0.123    0.110  0.890  0.483    0.739  0.401  1.35
Oxygen                       0.780  0.099    0.123  0.877  0.526    0.736  0.442  1.67
Phosphorus                   0.790  0.168    0.122  0.878  0.486    2.565  1.418  1.81
Carbohydrate                 0.835  0.133    0.095  0.905  0.476    0.517  0.376  1.38
Protein                      0.843  0.143    0.090  0.910  0.460    0.500  0.353  1.42
Lipid (lecithin)             0.854  0.153    0.085  0.915  0.450    0.449  0.314  1.43
Nucleic acid                 0.824  0.141    0.100  0.900  0.480    0.699  0.484  1.44
HM20                         0.850  0.148    0.087  0.913  0.453    0.453  0.319  1.42
K4M                          0.846  0.143    0.089  0.911  0.460    0.470  0.334  1.40
Epon                         0.844  0.151    0.090  0.910  0.450    0.532  0.361  1.47
Ice                          0.834  0.104    0.094  0.906  0.519    0.397  0.331  1.20
Sn-methacrylate resin        0.849  0.149    0.087  0.913  0.518    0.739  0.514  1.44
Os-fixed protein             0.838  0.134    0.093  0.907  0.521    0.747  0.531  1.41
Potassium phosphotungstate   0.720  0.129    0.150  0.850  0.624    4.733  3.474  1.36

(a) CTEM, $\theta_0$ = 7.5 mrad.
(b) STEM, $\theta_0$ = 28 mrad.
(c) $r = \sigma_{el}/\sigma_{in}$; $r' = S_{ADF}/S_{LEL}$, with the signals collected on the annular dark-field detectors and the LEL in the spectrometer.

FIG. 5. Scattering constants at 100 kV plotted as a function of Z. Filled columns represent the $K_{el}$ and open the $K_{in}$ parts of the total K. The data in (a) are calculated according to Langmore et al. (1973) and in (b) according to Eusemann et al. (1982).

B. Cross Sections for Composite Matter
The different organic matters that we consider are composed essentially of H, C, N, and O with additional P in nucleic acids. Other components like S are present only in small amounts. In some cases, we will consider the influence of heavy metals, like Os and Sn, introduced during specimen preparation either as a fixative-stain or covalently bound in an embedding resin, respectively. We will also take into account the possible staining brought about by charge-neutralizing ions, stemming from the buffer used.

A general assumption is to calculate average cross sections by summing over the cross sections of all atoms present in a molecule and dividing by the total number of atoms. This procedure implies that the scattering properties of single atoms are the same as those of atoms bound to others. In the case of elastic scattering, most of the cross section is due to interaction with the Coulombic field due to the charge of nuclei; the influence of the neighboring atoms is to modify the distribution of the outer electrons, so that the potential at long distance is slightly modified. Consequences can be found for small-angle scattering but they do not involve noticeable changes concerning the large-angle elastic scattering considered here. For inelastic scattering, the situation is rather different because most of the relevant cross section is due to these outer orbitals.

A typical energy loss spectrum of organic matter (Fig. 7) exhibits the following major features: a narrow and intense zero-loss peak, a major inelastic contribution in the low-energy-loss (LEL) range, that is, between 10 and 40 eV, consisting in a large and smooth maximum which does not contain any atom-related feature, and, at higher energy losses (above 50 eV), a continuously decreasing tail on which are superposed a succession of characteristic small edges due to atomic core electrons. One notices that at least 30% of the inelastic intensity is contained within this low-energy-loss peak, the general shape of which remains very similar in all organic substances. Consequently, the recording of an "inelastic" image consists in collecting at the exit of the spectrometer LEL electrons contained within an energy window from typically 10 to 50 eV.

For contrast quantitation, the relevant inelastic cross sections can be calculated in two ways. The first method is to average atomic inelastic cross sections in the same way as for the elastic term. The second approach consists in evaluating the total intensity in a real EEL spectrum with a Lorentzian-type fit (Wall et al., 1974). If one compares the two calculations for a typical adenine specimen, the results fit within 10% (Colliex et al., 1984). When one is interested in organic and biological substances in the absence of heavier elements, the atomic calculation constitutes a satisfactory source of inelastic scattering cross sections.

FIG. 6. Z-dependent signal $r = \sigma_{el}/\sigma_{in}$ as a function of the atomic number.
FIG. 7. Average EEL spectra of HM20 showing the low energy-loss (LEL) electrons used in ratio contrast; 80 keV. Horizontal axis: energy loss (eV). (Courtesy of Dr. R. Reichelt.)

IV. CONTRAST WITH UNSTAINED SECTIONS OF ALDEHYDE-FIXED BIOLOGICAL MATERIAL

A. Calculated Values of Scattering Constants for Different Materials
As did von Borries (1949a) and Hall (1953) in the pioneering days of electron microscopy, we chose as variables in our theoretical equations the thickness x, the mass density $\rho$, and introduced the scattering constants $K = K_{el} + K_{in}$. The latter are obtained from the matter-specific average scattering cross sections $\sigma = \sigma_{el} + \sigma_{in}$ by multiplication by L/A [Eq. (2)]. This choice revealed itself judicious because then the signal $I_{el}/I_{in}$ is reduced to the very simple form $\sigma_{el}/\sigma_{in}$. For conventional imaging the signal depends on $\rho x$ as a product, that is, commutatively on both $\rho$ and x with equal influence, but also on the scattering constants $K_{el}$, $K_{in}$, and K. In the next section we will explore the degree to which these scattering constants are involved in contrast formation.

Tables II and III present the calculated values of these matter-specific constants for substances involved in biological electron microscopy. From these data and using the simple equations in Section III it should be easy to calculate any signal for any other substance not listed in Tables II and III. For the range of biological materials, the ratio $r = \sigma_{el}/\sigma_{in}$ can be presented either as a function of an average Z (Fig. 8a), or, more conveniently, as a function of decreasing hydrogen content (Fig. 8b). Only those matters which contain P or Sn (or any "abnormal" atoms other than N, C, O) are not on the curve, but more or less far above it. As we will show further below, this fact illustrates the extremely high sensitivity of Z contrast to Z, which is much stronger than in conventional imaging contrast. This has important practical consequences, in that small amounts of metals incorporated into organometallic resins or into ice produce with Z imaging many times the contrast that is obtained with conventional imaging (Carlemalm et al., 1982b). It implies also that specific labeling with heavy metals requires very much smaller probes or tags when imaging with Z contrast than with the conventional modes. For the same reason, a general heavy-metal stain or shadowing, if still required, can be used in very much smaller amounts than usual.

FIG. 8. Z-dependent signal $r = \sigma_{el}/\sigma_{in}$ as (a) a function of the average Z and (b) of the hydrogen content of organic matter. On the right scale in (b) we indicate the elastic and inelastic scattering cross sections $\sigma_{el}$ and $\sigma_{in}$.

B. Thickness Dependence of Scattering
By using a Poisson distribution for the description of multiple scattering (Section II,B), we have introduced the critical mass thickness $(\rho x)_c = 1/K$ as the one at which, on average, one scattering event occurs per incident electron. In this case, 37% of the electrons are still passing unscattered, 37% are singly scattered, and the remaining 26% are multiply scattered. Some typical values of $(\rho x)_c$ are listed in Table II. Its variance for all matters considered is 12%, but with extremes of 8.3 and 12.0, indicating that for more precise calculations, we have also to take into account the variations, rather small, of K besides the more important ones in $\rho$, x, and $K_{el}$. In Fig. 9a we have plotted the curves of $I_{un}$, $I_{el}$, and $I_{in}$ as a function of x, calculated from single scattering. In this case, $I_{el}/I_{in} = \sigma_{el}/\sigma_{in}$ = const. Only very thin specimens of about $\rho x$ = 10 nm g/ml can be approximated correctly by these equations. Specimens of such thicknesses are seldom available or can be made only at the cost of preparation-method-induced deformations. Thin sections, for example, have two surfaces which are heavily distorted by plastic deformation to a depth reaching easily 5 nm on each side (Section IV,D). A good compromise is reached with sections of 30-50 nm. This is still about half to two-thirds of $(\rho x)_c$, when considering unstained material.

FIG. 9. Plots of calculated signals as a function of thickness x. (a) Scattered electrons calculated with the statistical method (Section II,B). HM20, 100 kV, 28 mrad. (b) Calculated signals, taking $\theta_0$ into account. HM20, 100 kV, 28 mrad. Note that the curve for the ADF signal is almost straight when compared with the curve for elastically scattered electrons in (a); this is due to electrons multiply scattered to a larger angle than $\theta_0$ and thus falling onto the ADF detector. (c) The thickness dependence of the ADF signal is much larger than that of the ratio signal at the same contrast. The curve 2.3 x ADF signal is the ADF signal recalculated to the same contrast as the ratio signal at x = 50 nm. HM20-protein, 100 kV, 28 mrad (protein, dashed line; HM20, solid line). (d) The signals calculated with the Monte Carlo method for a different $\theta_0$. Protein, 100 kV, 13 mrad. (Courtesy of Dr. R. Reichelt.)

In Fig. 9b we have plotted corresponding curves obtained with Eqs. (8)-(11), by which up to six multiple scattering events have been included. We have assumed that all scattered electrons with at least one elastic scatter will fall on the annular dark-field (ADF) detector. The remaining unscattered and inelastically scattered electrons will be collected in the central aperture; they are dispersed by the spectrometer and the LEL electrons detected. The ratio $S_{ADF}/S_{LEL}$ between the two signals is measured and provides the ratio contrast. In Fig. 9c we have plotted the curves calculated for both the embedding resin HM20 and for an average protein. When comparing 9b with 9c, we first find that the ADF signal increases nearly linearly with the specimen thickness, far further than the $I_{el}$ alone. This is due to multiple scatter which removes some of the electrons from the central, near-axial part and shifts them to larger angles. In consequence, the proportion of collected LEL electrons decreases. The ratio $S_{ADF}/S_{LEL}$ increases with x, with an x-dependent contrast comparable to that of the ADF signal alone. This confirms the results of the authors who calculated this dependence on x (e.g., Egerton, 1982a). Why then do we maintain the claim that the ratio contrast is relief independent, as is observed experimentally? This is easily understandable when comparing the contrast between sections of a protein and of resin HM20 of an equal thickness of 50 nm. In dark field, we obtain a contrast $\Delta S_D$ of about 0.02. The same contrast is obtained by a thickness variation of $\Delta x$ = 9 nm of HM20. For the ratio contrast, however, we obtain $\Delta S_R$ = 0.055, which is 3.3 times the contrast obtained in dark field.
We can electronically amplify the dark-field signal ("more gain") by 3.3, so as to obtain $\Delta S_D = \Delta S_R$ = 0.055. The same contrast is achieved by a variation of $\Delta x_D$ = 9 nm of the resin. To obtain the same with ratio contrast, $\Delta x_R$ = 27 nm is required! This simply means that, e.g., knife marks, strongly visible in the dark-field mode, will disappear in the ratio contrast. This is demonstrated in Fig. 10, where we show an image with knife marks, one half recorded with ratio contrast, the other half with dark field. In conclusion, we can say that the relative, matter-specific contribution to contrast is very much larger with ratio contrast than with dark field; in just the reverse manner, the influence of thickness variations is very much higher with dark field than with ratio contrast. The practical consequence of this situation is that the influence of relief is negligible with ratio contrast applied to unstained material. It is obvious, but worth mentioning, that with the heavy stain used in current routine methods, the influence of relief becomes negligible in any case. The limitations due to stain will be discussed later (Section VII,C).
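The statement that sections of 30-50 nm correspond to about half to two-thirds of the critical mass thickness can be checked with the Table II values. The sketch below converts $(\rho x)_c$ into a critical thickness $x_c = (\rho x)_c/\rho$; it assumes that the tabulated $(\rho x)_c$ values are in units of 10^-6 g/cm^2 (so that one unit equals 10 nm g/ml), a reading inferred from the K values rather than stated in the table.

```python
NM_G_PER_ML_PER_UNIT = 10.0  # assumed: 1e-6 g/cm^2 equals 10 nm g/ml


def critical_thickness_nm(rho_x_c, rho):
    """Thickness (nm) at which one scattering event occurs on average."""
    return NM_G_PER_ML_PER_UNIT * rho_x_c / rho


def mean_events(thickness_nm, rho_x_c, rho):
    """Mean number of scattering events m for a section of the given thickness."""
    return thickness_nm / critical_thickness_nm(rho_x_c, rho)


if __name__ == "__main__":
    # (rho x)_c and density from Table II
    materials = {"HM20": (8.6, 1.09), "Protein": (9.2, 1.3), "Ice": (8.8, 0.92)}
    for name, (rho_x_c, rho) in materials.items():
        x_c = critical_thickness_nm(rho_x_c, rho)
        print(f"{name}: x_c = {x_c:.1f} nm, "
              f"m(30 nm) = {mean_events(30, rho_x_c, rho):.2f}, "
              f"m(50 nm) = {mean_events(50, rho_x_c, rho):.2f}")
```

For HM20 this gives a critical thickness near 79 nm, so that a 50-nm section corresponds to m of roughly 0.6.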
The above calculations are based on somewhat arbitrary decisions about the distribution of scattered electrons on the two detectors. This can be done much more precisely by applying Monte Carlo procedures (Reichelt and Engel, 1984). These authors have done this for our Vacuum Generators HB-5 STEM and our homemade spectrometer. The ratio-contrast results obtained with different matters are given in Fig. 9d. There we see again that the contrast $\Delta S_R$ is nearly independent of x and therefore also approximately the same as the Z contrast at x = 0. From the above it becomes quite clear that contrast calculations based on the formulas of single scatter, i.e., formulas strictly valid only for very thin specimens, give overestimates throughout. This estimated error in calculated contrast is, however, barely larger than some 10%, and thus certainly within other errors due to other approximations in the basic assumptions of the theories. In what follows, we therefore use again the single-scatter approximations.

C. Influences of Thickness, Density, and Scattering on BF and DF Contrasts
In conventional bright field, most inelastic electrons are collected in the recorded image and only contribute to a general background. The contrast is then essentially produced by partly eliminating the elastically scattered electrons, while in conventional dark-field imaging we use practically only the elastically scattered electrons. In one way or the other, the I , , are thus responsible for contrast. From Eq. (4) we see that their amount depends on p, x, K,,, and K . As we see from Table I1 for organic materials, neither K , , nor K is constant when we consider hydrogen in the calculations. Biological material is variably rich in hydrogen. Hydrogen is no more sensitive to beam-induced loss than is phosphorus (Thach and Thach, 1971), and phosphorus behaves comparably to carbon (Dubochet, 1975). Most sensitive among the atoms involved is oxygen (Egerton, 1980), obviously depending on its bonding. A preferential, beam-induced loss of oxygen would therefore increase the relative amount of hydrogen. It is thus reasonable to calculate contrast for biological materials without yet considering beam-induced modification. Beam damage would indeed tend to improve contrast. These simplifying assumptions are in good agreement with the experimental results reported in Section V. In order to appreciate the influences of the four variables ( p , x, KelrK ) on contrast in conventional imaging, we differentiate the expression (4) for I,, and obtain the sum of the partial differentials for each of the four variables. To
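obtain simpler expressions we divide the resulting ΔI_el by I_un [Eq. (33)]:

$$\frac{\Delta I_{el}}{I_{un}} = K_{el}\,\rho\,\Delta x + K_{el}\,x\,\Delta\rho + \frac{1}{K}\left[\exp(K\rho x) - 1\right]\Delta K_{el} + \frac{K_{el}}{K^{2}}\left[1 - \exp(K\rho x) + K\rho x\right]\Delta K \tag{25}$$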
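The step from Eq. (4) to Eq. (25) can be written out term by term. The sketch below assumes that Eq. (4) has the single-scatter form $I_{el} = I_0 (K_{el}/K)[1 - \exp(-K\rho x)]$ and that the unscattered intensity is $I_{un} = I_0 \exp(-K\rho x)$. Neither expression is restated in this section, so both are assumptions, chosen because they reproduce the four terms of Eq. (25).

```latex
% Assumed forms (not restated in this section):
%   I_el = I_0 (K_el / K) [1 - exp(-K rho x)],   I_un = I_0 exp(-K rho x)
\begin{align*}
\frac{\partial I_{el}}{\partial x}      &= I_0\, K_{el}\, \rho\, e^{-K\rho x}, \\
\frac{\partial I_{el}}{\partial \rho}   &= I_0\, K_{el}\, x\, e^{-K\rho x}, \\
\frac{\partial I_{el}}{\partial K_{el}} &= \frac{I_0}{K}\left[1 - e^{-K\rho x}\right], \\
\frac{\partial I_{el}}{\partial K}      &= \frac{I_0 K_{el}}{K^{2}}
      \left[K\rho x\, e^{-K\rho x} - 1 + e^{-K\rho x}\right].
\end{align*}
% Dividing by I_un multiplies every term by exp(+K rho x) / I_0:
\begin{align*}
\frac{\Delta I_{el}}{I_{un}}
  &= K_{el}\rho\,\Delta x + K_{el}x\,\Delta\rho
   + \frac{1}{K}\left[e^{K\rho x} - 1\right]\Delta K_{el}
   + \frac{K_{el}}{K^{2}}\left[1 - e^{K\rho x} + K\rho x\right]\Delta K .
\end{align*}
```

Under these assumptions the bracket multiplying ΔK is negative for any Kρx > 0, which is consistent with the negative sign of the ecu for K.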
Keeping all Δ’s of the variables at 0 except one, we can calculate, for each of the four variables, the unit that produces the same contrast. For these equivalent contrast units (ecu’s) we obtain

$$u_x = \frac{0.012}{K_{el}\,\rho}, \qquad u_\rho = \frac{0.012}{K_{el}\,x}, \qquad u_{K_{el}} = \frac{0.012\,K}{\exp(K\rho x) - 1}, \qquad u_K = \frac{0.012\,K^{2}}{K_{el}\left[1 - \exp(K\rho x) + K\rho x\right]} \tag{26}$$
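Equation (25) can now be written by means of these equivalent contrast units as

$$\frac{\Delta I_{el}}{I_{un}} = 0.012\left(\frac{\Delta x}{u_x} + \frac{\Delta\rho}{u_\rho} + \frac{\Delta K_{el}}{u_{K_{el}}} + \frac{\Delta K}{u_K}\right) \tag{25'}$$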
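Formulas (26) are straightforward to evaluate numerically. The following is a minimal sketch, not the authors' calculation: the function names are hypothetical, and the input values (K_el = 3.2 × 10^-3, K = 11.4 × 10^-3, ρ = 0.92 g/ml, x = 40 nm) are illustrative assumptions chosen to resemble ice, not data taken from Tables II and III.

```python
import math

CONTRAST_PER_UNIT = 0.012  # contrast produced by one ecu, as stated in the text


def equivalent_contrast_units(k_el, k, rho, x):
    """Evaluate formulas (26) for one material at thickness x."""
    y = k * rho * x
    return {
        "x": CONTRAST_PER_UNIT / (k_el * rho),
        "rho": CONTRAST_PER_UNIT / (k_el * x),
        "K_el": CONTRAST_PER_UNIT * k / (math.exp(y) - 1.0),
        "K": CONTRAST_PER_UNIT * k ** 2 / (k_el * (1.0 - math.exp(y) + y)),
    }


def relative_signal_change(k_el, k, rho, x, d_x=0.0, d_rho=0.0, d_K_el=0.0, d_K=0.0):
    """Evaluate Eq. (25): Delta I_el / I_un for small changes of the four variables."""
    y = k * rho * x
    return (
        k_el * rho * d_x
        + k_el * x * d_rho
        + (math.exp(y) - 1.0) / k * d_K_el
        + k_el / k ** 2 * (1.0 - math.exp(y) + y) * d_K
    )


# Assumed, ice-like inputs (hypothetical values, see lead-in).
units = equivalent_contrast_units(k_el=3.2e-3, k=11.4e-3, rho=0.92, x=40.0)
for name, value in units.items():
    print(f"u_{name} = {value:.3g}")
```

With these assumed inputs u_x comes out near 4.1 nm and u_ρ near 0.094 g/ml, and changing any single variable by exactly one unit returns a relative signal change of 0.012.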
In Eq. (25’), the differences of the four variables are expressed as multiples of the ecu’s. If a difference is exactly one unit, the resulting contrast is 0.012. In Table IV we give the equivalent contrast units for different biological substances and embedding media, as calculated with formulas (26) and using the data computed in the previous tables and figures of this paper. These units also depend on the thickness; in the table we have provided ecu’s for 40 nm.

TABLE IV
EQUIVALENT CONTRAST UNITS FOR Δx, Δρ, ΔK_el, AND ΔK

Substance          u_x (nm)   u_ρ (g/ml)   u_Kel (x 10^-3)   u_K (x 10^-3)
Ice                4.0        0.093        0.26              -4.70
HM20               3.0        0.082        0.21              -2.90
K4M                2.7        0.083        0.18              -2.26
Epon               2.6        0.080        0.18              -2.17
Sn-methacrylate    2.0        0.061        0.18              -1.6
Carbohydrate       2.4        0.085        0.16              -1.76
Lipid              3.2        0.080        0.23              -3.37
Nucleic acid       1.9        0.077        0.14              -1.21
Protein            2.5        0.082        0.17              -2.00
Os-fixed protein   1.7        0.068        0.13              -1.01

We see, for example, that u_x for HM20 is 3 nm, a thickness difference which, under favorable circumstances in dark-field observation, we estimate to lead to a just visible contrast. All other ecu’s tabulated lead to exactly the same contrast. For practical biological work, x = 40 nm is a good average. As we will see, this value seems to be optimal for thin sections. Even if beam damage resulted in complete carbonization, we should not be induced to compare such 40-nm organic slices with 40-nm-thick carbon. During carbonization of most organic material, the dimensions are approximately kept and the resulting carbon is porous. Such porous carbon has been called “biological carbon” (von Borries, 1949a) and has been considered to have a density of less than 1 g/ml (charcoal!). In a few cases, the organic material might melt during carbonization. Then the density might be higher, but the thickness is reduced accordingly.

From Table IV we can already learn several interesting facts: it shows, e.g., that u_x is highest for ice, followed by lipid and HM20. This signifies that these substances are not very efficient in transforming thickness variations into contrast. More informative, however, is Table V, which expresses the contrast between embedded biological substances and embedding material in ecu’s taken from Table IV. To do this, we had to make approximations: First, we kept x constant throughout at 40 nm. Second, the calculated differences in ρ, K_el, and K had to be expressed in ecu’s in order to be useful for interpretation. Neither the ecu of the embedding medium nor that of the embedded substance is completely correct for this purpose, but they are the only ones available. We therefore simply tabulated both, giving some sort of lower and higher estimate of the resulting contrast.
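The two-sided estimate described above can be sketched for the density term alone. The example assumes the u_ρ values of Table IV and the densities quoted in Table VI (protein 1.3, ice 0.92, HM20 1.09 g/ml); the dictionary and function names are hypothetical, and only the Δρ contribution is treated.

```python
# u_rho in g/ml for a 40-nm section, as listed in Table IV
U_RHO = {"ice": 0.093, "HM20": 0.082, "protein": 0.082}

# Densities in g/ml, as quoted in Table VI
DENSITY = {"ice": 0.92, "HM20": 1.09, "protein": 1.3}


def density_difference_in_ecu(substance, medium):
    """Express the density difference in ecu's of the medium and of the substance.

    Returns a pair (using the medium's u_rho, using the substance's u_rho),
    i.e. the two alternative estimates of the same contribution.
    """
    d_rho = DENSITY[substance] - DENSITY[medium]
    return d_rho / U_RHO[medium], d_rho / U_RHO[substance]


for medium in ("ice", "HM20"):
    from_medium, from_substance = density_difference_in_ecu("protein", medium)
    print(f"protein in {medium}: {from_medium:.1f} or {from_substance:.1f} ecu")
```

Because the two units differ only slightly, the two estimates stay close to each other for the density term; a larger spread can arise for the terms in K_el and K, which are not treated here.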
With regard to contrast, it becomes obvious from Table V that ice is consistently the best embedding medium for unstained biological substances. It is interesting that lipids would lead to negative contrast with all organic embedding media (HM20, K4M, Epon). This fact is obviously relevant only in those cases in which lipids are not extracted. Extraction always occurs (Weibull et al., 1983) except at temperatures below -60°C (Weibull et al., 1984). Epon is particularly unsuitable for the observation of unstained proteins. In very general terms, it is obvious from Table V that ρ does not have a very much greater influence than K_el. Nor is there any correlation at all between the two differences. For conventional imaging of unstained material, it is therefore an extremely crude approximation to speak of essentially ρx-dependent contrast formation.
TABLE V
RELATIVE CONTRAST BETWEEN BIOLOGICAL SUBSTANCES AND EMBEDDING MEDIA IN CONVENTIONAL IMAGING

[Columns: carbohydrate, lipid, nucleic acid, protein, and Os-fixed protein, each given as “ecu from a” and “ecu from b.” Rows: ice, HM20, K4M, Epon, and Sn-methacrylate, each with entries for Δρ, ΔK_el, ΔK, and Total.]
We conclude that the contrast of embedded biological material in conventional imaging is not sufficiently described by considering only the density ρ. For material not containing heavy atoms, the scattering constants have a particularly important influence. They reflect here the hydrogen content of the material. For heavy-atom-containing material, either obtained by Os fixation or represented by an Sn-containing embedding resin, both ρ and K_el gain additional influence.
D. Influence of Thin-Section Relief Compared for the Two STEM Imaging Modes

In a first paper (Carlemalm and Kellenberger, 1982), we proposed that the unsharp imaging of unstained thin sections (40 nm) by dark-field imaging is not due to optical causes and multiple scattering, but is rather a consequence of the surface relief on both faces of the slice. Due to plastic deformations associated with the tearing forces of the cleavage process of ultramicrotomy, this relief obviously would not faithfully represent the biological structures. Since in ratio-contrast imaging the influence of Δx vanishes, and with it most of the influence of such relief, this would explain why unstained sections lead to very well-defined micrographs (Fig. 10). In the meantime, Villiger, from our laboratory, has investigated these reliefs by careful fine-grain shadowing (Fig. 11) and estimated the thickness variations to be between 3 and 5 nm. We evaluated the influence of such relief on contrast in STEM ADF imaging by assuming an embedded compact protein in a section of constant thickness. We calculated its contrast with the surrounding embedding material by using formulas (4), (13), and the data of Tables II and III. We then reduced the thickness of the protein by a Δx which was calculated so as to reduce the contrast by exactly the same amount as that provided between protein and embedding material in a section without thickness variation. When the thickness of the embedded protein, but not of the embedding, is reduced by this calculated matching thickness difference, the contrast between protein and surroundings disappears. If this difference is comparable in magnitude to that of the relief, then our hypothesis is verified: the influence of a relief of 3-5 nm provides contrast differences similar to those due to the density differences of the embedded material. In Table VI we give several numerical examples.
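To first order, and assuming that K and K_el are the same for protein and resin, Eq. (25) gives equal contrast contributions when K_el ρ Δx = K_el x Δρ, that is, Δx = x Δρ / ρ. The sketch below evaluates this simplified relation with the densities listed in Table VI, taking ρ as the protein density. It is an approximation of the constant-K case only, not the full calculation with formulas (4) and (13), and the function name is hypothetical.

```python
def matching_thickness(section_nm, rho_protein, rho_resin):
    """First-order thickness change whose contrast equals that of the density difference.

    Assumes constant K and K_el, so that Delta x = x * Delta rho / rho.
    """
    return section_nm * (rho_protein - rho_resin) / rho_protein


# (label, section thickness in nm, protein density, resin density), from Table VI
cases = [
    ("unstained protein in HM20", 40.0, 1.3, 1.09),
    ("unstained protein in Epon", 40.0, 1.3, 1.24),
    ("unstained protein in K4M", 40.0, 1.3, 1.23),
    ("unstained protein in ice", 40.0, 1.3, 0.92),
    ("Os-stained protein in Epon", 40.0, 1.63, 1.24),
]

for label, x, rho_p, rho_r in cases:
    print(f"{label}: {matching_thickness(x, rho_p, rho_r):.1f} nm")
```

For a 40-nm section this gives about 6.5 nm for HM20, 1.8 nm for Epon, and 11.7 nm for ice, close to the values given in parentheses in Table VI, which the table footnote describes as calculated with a constant K.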
From the examples above we have learnt that the relief of sections of unstained embedded biological material has an influence on contrast which is comparable to that due to specific differences of the matter, as reflected in ρ, K, and K_el. When we consider also the quality of this relief, as shown in Fig. 15,
FIG. 10. (a) Ratio-contrast and (b) ADF image of a section from rat duodenum, with knife marks showing up as strong diagonal lines in the ADF image but very much suppressed in the ratio-contrast image. (a) and (b) do not represent two independent micrographs; they were obtained as a single photographic record from one complete scan during which the imaging mode was switched. This is demonstrated also by the very weak continuations of the knife marks from the right into the left part of the micrograph. Note also the white “halos” or “shadows” associated with nearly all structures of the right part and their absence, with one exception, in (a). This is due to thickness variations as discussed in Sections IV,D and V,C.
the unsharp imaging is understandable as a consequence of the surface deformations. As theory predicts and experiments have shown (Carlemalm and Kellenberger, 1982), imaging by ratio contrast reduces the influence of the distorted surface reliefs when compared to dark-field imaging. To further explore this situation we calculated the resulting contrast, expressed by ΔS, for a protein of 5-nm dimension positioned differently with respect to the surface of a 40-nm-thick section. In this case we calculated the contrast for both modes of imaging (ADF and ratio) and for different embedding media. From Fig. 12 we learn immediately that the position of the protein has virtually no effect in ratio contrast. Letting half of the protein protrude from the surface leads with ADF imaging to a doubling and in some cases even to a more than tenfold increase of contrast. Increasing protrusion
FIG. 11. Section of HM20-embedded phage T4 adsorbed onto E. coli envelopes and shadowed with tungsten-tantalum alloy, revealing the relief of one of the two surfaces.
of the protein out of a tin resin or Epon even produces a contrast reversal when conventionally imaged, but a nearly unaltered contrast when using ratio-contrast imaging. From these calculations it becomes obvious that for unstained structures in the range of 5-10 nm, dark-field imaging of thin sections cannot be expected to provide meaningful results, because the contrast differences between the same structures inside a section and those partly protruding are so strong that interpretation becomes most difficult. Either a very strong staining of the structure or ratio contrast leads to interpretable results, in which the distorted surface reliefs are not predominant in contrast formation. It should be noted here that even a relief of 1-2.5 nm is sufficient to create this peculiar contrast situation. From the examples in Fig. 12 we learn also that the tin-containing embedding resin is perfect for ratio contrast, but not usable in conventional
TABLE VI
INFLUENCE OF Δρ AND Δx ON CONTRAST (a)

Compact proteins

Protein      Density   Section thickness (nm)   Resin   Resin ρ   Δρ     Δx (nm) leading to the same contrast
Unstained    1.3       40                       HM20    1.09      0.21    7.06 (6.4)
Unstained    1.3       40                       Epon    1.24      0.06    0.38 (1.8)
Unstained    1.3       40                       K4M     1.23      0.07    2.94 (2.1)
Unstained    1.3       40                       Ice     0.92      0.38   15.73 (11.6)
Os-stained   1.63      40                       Epon    1.24      0.39   15.70 (9.6)
Unstained    1.3       100                      Ice     0.92      0.38   29.2

The table additionally marks a range of predominance of relief (5 nm), a border range, and a range of predominance of structures inside of the section.

(a) A cylinder formed of compact protein is resin embedded and sectioned, such that it spans the section; the thickness variation Δx of the protein is calculated, leading to the same contrast as does the density difference Δρ between protein and resin; the values in parentheses are calculated with a constant K (K = 10.65 x
FIG. 12. Contrast ΔS of a large (5-nm) embedded protein (stippled square) in relation to the fraction of the protein that protrudes out of a 40-nm-thick section. Negative values mean that the embedded material is bright against a dark background. The values are calculated for an annular dark-field collection angle of 28 mrad to ∞ and 100 kV.

Contrast ΔS (x 10^-2) for protruding fractions 0, 0.25, and 0.5:

Embedded in                 0: ADF   0: ratio   0.25: ADF   0.25: ratio   0.5: ADF   0.5: ratio
Ice                         -0.15    -0.25      -0.25       -0.24         -0.35      -0.24
HM20                        -0.094   -0.38      -0.18       -0.37         -0.26      -0.36
K4M                         -0.029   -0.21      -0.12       -0.20         -0.20      -0.20
Epon                        -0.017    0.12      -0.11        0.12         -0.20       0.12
Sn-methacrylate              0.23     2.21       0.11        2.14          0          2.08
Os-fixed protein in Epon    -0.34    -1.85      -0.43       -1.79         -0.        -1.14
imaging because of the contrast reversal that occurs as soon as the structures protrude from it. This is confirmed experimentally.

E. How to Calculate Contrast of Real Biological Structures
In order to simplify argumentation, we have considered until now only extended compact proteins. This is a putative entity which never exists in this simple form as a biological structure. It is well known, however, that protein molecules are generally rather compact in that they contain in the interior of
their domains no, or at best only very few, water molecules. Bound water is in the form of a thin “hydration shell” of two to three layers on the outside of hydrophilic areas of proteins. Larger biological structures are composed of many protein molecules or subunits acting as building blocks. Therefore, water-filled spaces necessarily exist between subunits in such “supramolecular protein structures.” The water will eventually be replaced by resin or negative stain, or transformed into ice. Because of the superposition in depth, the “effective” concentration of such protein assemblies is therefore easily reduced by some 20-30%. The same is true for polysaccharides, which form gels containing frequently more than 80% water, and for any form of DNA and chromatin. As long as the effective resolution has not reached that of individual molecules, reduced concentrations have to be used for contrast calculations. In proportion to the initial water “content,” a resulting density has to be calculated as a mixture of biological substance and embedding material. All biological structures, except lipids and proteins within lipid layers, are therefore not compact entities but rather “spongy” structures which contain water in the form of cell sap. When dried (in vacuum!) most of this water will disappear and the structure will collapse (Kistler and Kellenberger, 1977; Kellenberger and Kistler, 1979). This collapse is best avoided by keeping the structures embedded either in ice or in a resin. Biological structures, when observed in the electron microscope, have therefore to be considered as mixtures of biological material (protein, nucleic acids, and polysaccharides) with variable amounts of embedding medium. In an artifact-free preparation, the amount of the latter, measured as volume, should correspond exactly to the native water content.
In Section II,D we demonstrated the “dilution theorem,” which is valid for thin specimens (up to 50 nm): After passage of the electron through a cylinder of the specimen, the frequency distribution of electrons into the three classes I_un, I_el, and I_in is not dependent on the in-depth distribution of the constituent material of this cylinder. If we assume protein and resin, it does not matter whether this protein is equally distributed within the resin or localized in a compact form somewhere in the cylinder. We have illustrated such situations in Fig. 13. In Fig. 13a we show a compact substance spanning the whole thickness of a slice of resin HM20. Except for lipids and lipids with proteins, such compact substances are not encountered in nature. In Fig. 13b we show schematically two biological structures: one composed of globular subunits (b1), the other represented by a coil of a fibrous type (b2). Both are supposed to have contained 50% of water when native. These structures exemplify in b1 a protein structure, such as a virus capsid, composed of protein subunits, and in b2 a nucleic acid filament packed into a chromosome or virus. In order to treat these two cases numerically, we make use of the dilution theorem: either we distribute the biological matter of Fig. 13b equally over the
whole depth of the specimen (c), or we “compact” each and separate them, as shown in Fig. 13d. When the biological structures in b had initially 50% (v/v) of water, then the compacted structures are half of their native extent. That is, the 10-nm structure is now reduced to 5 nm, and the relative volumes of resin and biological matter are now 35 : 5. This ratio allows us to calculate the average values of the corresponding matter-specific constants ρ, K_el, K_in, and K of the mixture by using the data of Tables II and III. This is exemplified in Fig. 13 for protein, DNA, and carbohydrate by calculating the conventional dark-field and Z contrast, expressed as a signal difference with the surrounding embedding medium (Lowicryl HM20). When considering biological matter as compact, as is unfortunately frequently done, we see in column a that compact DNA would produce in dark field about 2.5 times the contrast of compact protein. However, as soon as we consider a biologically realistic situation, as in b, the difference between DNA and protein becomes much smaller. A water content of some 75% (w/w) is found in the case of one of the most compact forms of DNA known, namely that found in the head of a bacteriophage (Earnshaw et al., 1978). For the values in Fig. 13 we note that the contrasts obtained for “real” samples by annular dark-field imaging and in the ratio-contrast mode differ by only a factor of 3. This is exactly the factor found when we studied in Fig. 9c the contrast obtained between protein and resin HM20 in the two imaging modes, but when using multiple scatter. There we also explained that by
FIG. 13. Examples of contrast calculations for various “real” structures, embedded in HM20. (a) Compact matter spanning the section. (b) Schematic representation of “porous” or “spongy” structures consisting of matter and resin in the ratio 1 : 1; b1 represents globular and b2 fibrous matter. (c) How the matter of the structures in (b) can be regarded as diluted into resin and spanning the entire section; for details see Section II,D of the text. (d) How the matter of cases (b) is compacted and as such allows more easily for calculations.

Signal differences ΔS given in the figure:

                          Compact matter (a)        “Real” structures (b)-(d)
Substance                 ΔS ratio    ΔS ADF        ΔS ratio      ΔS ADF
Protein (ρ = 1.3)         -0.033      -0.0073       -0.0038       -0.00091
DNA (ρ = 1.61)            -0.16       -0.019        (illegible)   -0.0036
Carbohydrate (ρ = 1.44)   -0.057      -0.013        -0.0062       -0.0016
electronic amplification (“gain”), the dark-field contrast can be increased. But we should remember here that then the variations in thickness Ax also lead to enhanced contrast and thus become visible, while this is not the case in ratio contrast. Practically, the difference between the two modes of imaging lies mainly in the quality and sharpness of the images, and not so much in the contrast.
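The bookkeeping behind the compacted case of Fig. 13d can be sketched as follows. The example assumes that the density of the mixture is the volume-weighted mean of its components and that a scattering constant, which multiplies the mass thickness ρx in the exponent, is averaged with mass-thickness weights. The function names are hypothetical, and the two K values in the last line are placeholders, not data from Tables II and III.

```python
def compacted_thickness(native_nm, water_volume_fraction):
    """Thickness that remains when the water-filled volume is removed from a structure."""
    return native_nm * (1.0 - water_volume_fraction)


def mixture_density(layers):
    """Volume-weighted mean density of stacked layers [(rho, thickness_nm), ...]."""
    total_thickness = sum(t for _, t in layers)
    return sum(rho * t for rho, t in layers) / total_thickness


def mixture_constant(values, layers):
    """Mean of a scattering constant, weighted by the mass thickness rho * t of each layer."""
    weights = [rho * t for rho, t in layers]
    return sum(k * w for k, w in zip(values, weights)) / sum(weights)


section_nm = 40.0
protein_nm = compacted_thickness(10.0, 0.5)   # 10-nm structure with 50% (v/v) water
layers = [
    (1.09, section_nm - protein_nm),          # resin HM20, 35 nm
    (1.3, protein_nm),                        # compact protein, 5 nm
]
rho_mix = mixture_density(layers)
k_mix = mixture_constant([1.0, 2.0], layers)  # placeholder K values, hypothetical units

print(f"compacted protein: {protein_nm:.0f} nm, mixture density: {rho_mix:.3f} g/ml")
```

With mass-thickness weighting, the product of the averaged constant, the mixture density, and the section thickness equals the sum of Kρt over the layers, so the exponent Kρx is the same whether the matter is treated as diluted or as compacted.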
V. EXPERIMENTAL CONFIRMATIONS

A. Definition of Test Object
With the following we will demonstrate the consistency between theory and experiment. To do this in a reasonable manner, we have selected a biological system which is biochemically and biophysically well defined, such that the structures visible on micrographs can be evaluated with respect to their scientific meaningfulness. Bacteriophage T4 is one of the most thoroughly and extensively investigated viruses. Its protein composition is known and carefully related to the physical parameters of its structure (Eiserling, 1979). The agreement between structures viewed in electron microscopy by negative stain (Kellenberger et al., 1965), thin sections (Wunderli et al., 1977), and freeze fracturing (Branton and Klug, 1975) is good. The envelope of the host of this bacteriophage, E. coli, is also well investigated. There is no doubt about the existence of a biochemically defined outer membrane (Steven et al., 1977); a periplasmic space separates this outer membrane from the inner one, the plasma membrane. Only the structure of the periplasmic space and of the peptidoglycan located between inner and outer membrane are not yet completely agreed on. New electron microscopy procedures have probably settled these unanswered questions, as we discuss elsewhere (Hobot et al., 1984). Fortunately, they are not relevant to the arguments which we present below. When a phage infects a bacterium, an inner tail tube is “drilled” through part of the envelope, probably up to the plasma membrane, but not piercing it (Furukawa et al., 1979; Labedan and Goldberg, 1979). The “drilling” is associated with a contraction of the tail sheath. This mechanism has been studied in detail (reviewed in Caspar, 1980, and Eiserling, 1979). In the micrographs presented in the following, we show phages, most of which have adsorbed on “empty” bacterial envelopes and which, in consequence, have a contracted tail sheath. Some of these phages have partly or fully ejected their DNA.
Triggering of the DNA release from the head
FIG. 14. Schematic drawing of the structures that, embedded and sectioned, give the images in Figs. 11 and 16-19. (a) A T4 phage. (b) A T4 phage after adsorption onto an E. coli cell envelope; the tail sheath is now contracted and the tail core (a tube!) has penetrated the outer layers down to the cytoplasmic membrane. (c) The E. coli envelope with the outer membrane, the periplasmic gel, and the inner or cytoplasmic membrane. The inner part of the periplasmic gel is lost upon separation of the two membranes.
indeed requires contact of the tip of the tail tube with the plasma membrane (Furukawa et al., 1979). During preparation of empty cell envelopes, most of the plasma membrane becomes detached from the outer membrane and the peptidoglycan. This procedure is used in biochemistry to separate the envelopes into two fractions which in their composition are distinct. The known structural features of this phage-envelope system are summarized in Fig. 14.
B. Materials and Methods

Bacteriophages T4 were produced by infection of E. coli B. The phages were purified on a sucrose gradient, which was later removed by dialysis against Sorensen phosphate buffer, pH 7.0. Envelopes from E. coli B were prepared by shaking bacteria, harvested in exponential phase, with ballotini beads in a Mickle apparatus. The envelopes were washed twice by centrifugation. This material was kindly provided by Dr. Jan Hobot and Cornelia Kellenberger of our laboratory. Bacteriophages were adsorbed onto the envelopes, fixed with 2% glutaraldehyde, and centrifuged to a loose pellet in a table-top centrifuge.
This pellet was serially dehydrated in ethanol at progressively lower temperatures, embedded in Lowicryl HM20 or a tin-containing resin (Carlemalm et al., 1982b), and sectioned with an LKB microtome. Shadowed sections were obtained by electron-gun evaporation of a tantalum-tungsten source at an angle of 14° (Villiger et al., 1984). Micrographs were recorded with a Philips EM 301 with a 30-μm objective aperture (corresponding to a half angle of about 7.5 mrad) and a Vacuum Generators HB 501 STEM (situated in the Laboratoire de Physique des Solides, Orsay, France) equipped with a Gatan EEL spectrometer. The spectrometer slit is set to collect electrons with energy losses of 5-60 eV; the collector aperture is the inner hole of the annular dark-field detector and corresponds to about 25 mrad. The outer angle of detection for ADF is about 200 mrad. For a full description of the STEM configuration used see Colliex and Mory (1984).

C. Results
From our theoretical and numerical considerations we are led to postulate for conventional imaging a strong influence of the surface relief of both faces of a thin slice, an influence that vanishes when using ratio contrast. To demonstrate this influence experimentally, we will first consider micrographs of sections obtained from embeddings in a tin-containing resin. Variations of thickness of the embedding medium will be more strongly manifest than those obtained with other embedding media, as we will see afterward. In Fig. 15 we show the putative profile of a slice of the sort that leads to the micrographs of Figs. 16 and 17. Figure 16 presents the same specimen region
FIG. 15. A drawing of a cross view of a thin section as it has to be imagined to explain our experimental data. The letters point out features seen in Figs. 11 and 16.
FIG. 16. STEM images of sectioned T4 phages adsorbed on E. coli envelopes embedded in an organometallic methacrylate resin. (a) Dark-field image formed by mainly elastically scattered electrons. (b) Image formed by inelastically scattered electrons. (c) Z-contrast image formed by the ratio of elastically over inelastically scattered electrons. (d) As in (c), but with photographically reversed contrast. Surface artifacts produced during sectioning are well emphasized due to the scattering properties of the Sn resin, which enhances the effect of the relief. Note that these effects are eliminated in (c). The letters mark features which are discussed in the text.
imaged in STEM as dark field with elastic (16a) and inelastic (16b) electrons. In Fig. 16c the ratio-contrast image is given. In regions marked (E) the outer membrane is seen in a vertical superposition. In Fig. 16c it has the expected dimension of 10-11 nm. The same dimension is also obtained on stained sections (Fig. 17). In P1 we see (partially) empty phages with contracted tail sheaths and some undefined material that remains inside the head. In Fig. 16a
FIG. 17. CTEM bright-field image of the same specimen as in Fig. 16, but embedded in Lowicryl HM20 and stained with uranyl acetate and lead citrate.
these phages, marked P2 and P3, clearly show something which could be the head membrane or head shell made of protein. The unwarned observer could believe that this represents new information. This is unfortunately not the case. The proteinous head shell is only about 4.5 nm thick, as can be recalculated from the known protein content. This value is a lower estimate, because the shell is obviously a porous assembly of subunits, as explained in Section IV,E. The “porosity” and “unevenness” can, however, only be estimated by considering the size of substances which can or cannot penetrate through the shell (Leibo et al., 1979) and from shadowed micrographs of shells (Scraba et al., 1973; Branton and Klug, 1975; Kistler et al., 1978). Even if this is done very optimistically, we cannot even reach half of the values measured in Figs. 16a and 16b, while these dimensions are of the correct order of magnitude in 16c. The thickness in 16c corresponds to that of stained shells in Fig. 17 and other published micrographs (Wunderli et al., 1977; Carrascosa and Kellenberger, 1978). The phage tails in P2 of Fig. 16 show again in 16a a black core, surrounded by a white cylinder, while it is only a black cylinder of 17 nm diameter in Fig. 16c. From chemistry and information processing of negatively stained
tails, a diameter of 18 nm is determined (Amos and Klug, 1975; Smith et al., 1976). Again, the “beautiful” image in Fig. 16a is a misleading artifact, explainable by the relief, while that of 16c is in perfect agreement with the known features. The normal and flipped-out baseplates of the phages [schematically drawn in Fig. 14, according to Kellenberger et al. (1965); reviewed in Eiserling (1979)] are clearly visible in Fig. 16c, particles P1, P2, and Bp, but not in 16a or 16b. Still considering known structures, we have the somewhat puzzling situation that full and empty phage heads can be distinguished in both Figs. 16a and 16b, but not in 16c. This can be explained by the fact that the ratio contrast of the phage-head DNA happens to be matched by that of the tin resin. Empty heads are obviously filled with resin, with the exclusion of remaining internal proteins. In order to make a distinction, the tin content of the resin has to be chosen differently. Some other features of biologically undefined structures are also of interest. The particles designated A and P2 (Fig. 16) have an associated dark region, always in the same direction, like a shadow. This is easily explained by a sectioning artifact resulting in a thinner region of the resin behind the particle with respect to the direction of sectioning (Fig. 15). The regions G and G2 in Fig. 16 can, in our opinion, be interpreted much more readily in 16c than in 16a and 16b. We think that the filaments visible in region G2 of 16c are likely to be bundles of DNA. To finish the discussion of Fig. 16, we should mention that the tin resin used for these micrographs is still beam sensitive. Larger magnifications reveal a granularity above noise which precludes higher resolutions, as would be needed for resolving the tail fibers of phage T4. A new tin-containing resin is under development and has already demonstrated the absence of beam-induced granularity. Results with biological material will be reported later.
In Fig. 18 we show a section of the same biological material as in Figs. 16 and 17, but this time embedded in Lowicryl HM20, which is particularly rich in hydrogen. Figures 18a and 18b are the elastic dark-field and Z-contrast pictures, respectively. It is interesting that these two micrographs were taken on an area which was accidentally contaminated at low magnification by working with too few lines per frame. This contamination is obvious as a line pattern in 18a, but not in 18b. This contamination layer had a variable thickness which reflects in the elastic dark-field image, but not in ratio contrast. As a whole, 18b is better defined than 18a. It confirms our arguments made above (Fig. 10) and in previous papers. Finally, in Fig. 19, we present the same material embedded in Epon and observed in the ratio-contrast mode. We immediately note that the phages are now only represented by their DNA content; the proteins no longer have
FIG. 18. STEM images of an unstained section from the same specimen as in Fig. 17: (a) an ADF image with contamination and etching marks which in the ratio-contrast image (b) are hardly visible due to the ability of ratio contrast to suppress differences of mass thickness.
sufficient contrast to be visible. As we discussed in Section IV,A, this is not unpredictable: indeed, Epon, as an epoxy resin, has a hydrogen content which is comparable to that of proteins and substantially lower than that of HM20. This fact has also been used in Fig. 10. By these observations we eliminate the possibility that contrast of the biological material might have been due only to adsorbed metal ions.
FIG. 19. STEM ratio-contrast image of Epon-embedded T4 phages illustrating the influence of the elementary composition on the contrast. The DNA inside the phage head is the only visible structure. For composition and scattering properties see Fig. 8 and Tables II and III.
The advantage of ratio contrast over staining will obviously become really apparent only at higher magnifications and with adequate specimens such as, e.g., the septate junctions (Garavito et al., 1982).

D. Positive Stain in Thin Sections
As we have learned, the contrast in conventional modes of imaging of slices of embedded biological material depends on the differences of the mass densities ρ and of the average scattering cross sections, expressed by K_el, of the two components (the biological matter and the embedding medium). In conventional bright-field imaging, the contrast of unstained resin-embedded material is so low that one cannot work with such a material. It happens that fixation with OsO₄ was found very early to be preferable to all the other fixatives already known from light microscopy. It was understood by most that Os must also become deposited into or onto the fixed material (Bahr, 1955) and thus increase both the density ρ and K_el. As discussed in the Introduction, at some period in the history of electron microscopy, the Os deposited through fixation with OsO₄ was sufficient for producing adequate micrographs. In our laboratory we titrated the resulting Os deposit by neutron activation (Carlemalm et al., 1985) and found for Os some 10 ± 2% (w/w). Uranyl acetate treatment in aqueous solution adds about an equal amount. These measurements were done on a defined protein structure, the capsid of bacteriophage T4, and on whole bacteria.

To find out to what extent this added amount of Os is reflected in increased contrast, we embedded Os-fixed bacteria simultaneously with glutaraldehyde-fixed ones. The result is shown in Fig. 20a. At first sight it is astonishing how little contrast is gained by this 10% addition of heavy metal. As mentioned in the Introduction, it became customary to stain the sections with uranyl acetate and lead citrate. In Fig. 20b we show the result of a section stained with uranyl acetate and lead. Now the micrographs look acceptable. Interestingly enough, we are no longer able to distinguish by contrast whether the bacteria were fixed with OsO₄ or with glutaraldehyde. The small contrast increase due to the 10% Os deposit is therefore very much smaller than that due to the stain added on the sections. From these observations we estimate that section staining leads to deposits which are 4-10 times the initial amounts; in other words, an about equal amount of heavy metal is added to the biological material.
With such large amounts of heavy metal, the question arises about the location of the deposits in relation to the macromolecule. This problem can now be studied by comparing stained, conventionally observed material with unstained material observed in the ratio mode. With septate junctions it was found that hydrophobic parts of proteins are not stained by uranyl acetate (Garavito et al., 1982), but it was not yet possible to decide where the stain was with respect to the hydrophilic part of the molecule (Fig. 21). Further work on different structures is needed. Studies of material that is only Os-fixed, but not poststained, should be made not only in dark field (Ottensmeyer and Pear, 1975), but now also in ratio contrast (Ottensmeyer and Arsenault, 1983). It is most likely that new information will be gained. Indeed, from the values in Tables II and III we can see that the relative influences of thickness variations, i.e., of the surface reliefs, are already so strongly reduced with 20% stained material that they are negligible.
FIG. 20. CTEM bright-field images of glutaraldehyde- and OsO₄-fixed E. coli cells embedded together in Epon. Image (a) is recorded from a section which has not been poststained; (b) is a recording from a section that had been poststained with uranyl acetate and lead citrate. Note the minute contrast difference between the cells in (a). Despite the fact that the cell marked “Os” contains about 10% (w/w) osmium, almost all the contrast in (b) is due to the poststaining. Note that the bacteria marked Glu in (a) when observed in the ratio mode would look similar to that marked “Os” in (b)!
FIG. 21. Septate junctions of the testis of Drosophila melanogaster. The left micrograph is glutaraldehyde-fixed, fully unstained material embedded in Lowicryl HM20 and imaged by Z-contrast in STEM. The right micrograph is from sections of the same block, but now poststained with uranyl acetate and observed in CTEM. Note that the hydrophobic parts of the junction are here unstained, and that it is difficult to decide between positive or negative staining.
VI. DISCUSSION OF THE CONSEQUENCES FOR THE INTERPRETATION OF MICROGRAPHS

A. Introduction: Different Densities
The final goal for studying different modes of producing contrast is to find out which type of contrast formation provides the most biologically useful information about the object. In conventional electron microscopy, a first obstacle, however, takes precedence over the above-mentioned goal: the differences of constitution between different unstained biological materials are so small that it is difficult to achieve any contrast! The development of electron microscopy has only followed the line of increasing resolution, completely forgetting that the limitation with biological material per se is not resolution, but contrast. The biologists have compromised with the physicists by introducing continuously more and stronger heavy-metal staining. But now they have become aware that the relation of heavy stain to the stained object is again a new, nearly unknown limitation. New specimen preparation methods are being devised and new imaging modes proposed that should also overcome these limitations; we shall discuss these now.

One of the most noble purposes of microscopy is to provide an image. On micrographs morphological patterns are discovered which frequently indicate new biological structures. By biochemical and biophysical procedures they can then be isolated, but only if the morphological definition of the structure is kept as a continuous guide. This all sounds trivial and is illustrated by the history of mitochondria, chloroplasts, microsomes, ribosomes, viruses, various vesicles, and so on. But the limitations of electron microscopy become obvious when considering smaller objects. We cannot discuss these here and refer the reader to a review (Kellenberger and Chiu, 1982). The correct interpretation of finer details becomes crucial in any further interaction between microscopy and biochemistry toward elucidating structures below the 10-nm range.

For physiological reasons, an image is composed of gray tones (or colors) associated with a morphological pattern. Gray tones, or grays as we will call them for short, are physiological impressions or sensations. It was found more than a century ago that a linear progression of grays is seen when the (reflected or transmitted) intensity of light progresses exponentially (or “geometrically”). This Weber-Fechner law is valid only in a limited range, but seems to hold for nearly all our sensory systems. It simply states that the physiological sensation is proportional to the logarithm of the physical intensity.
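The statement that a geometric progression of intensities is perceived as a linear progression of grays can be checked numerically. The following is a minimal sketch only; the function name, the reference intensity, and the scaling constant k are hypothetical and are not taken from the text.

```python
import math


def perceived_gray(intensity, reference=1.0, k=1.0):
    """Weber-Fechner sketch: sensation = k * ln(intensity / reference).

    `reference` and `k` are hypothetical scaling constants chosen only
    for illustration.
    """
    return k * math.log(intensity / reference)


# Light intensities in geometric progression: each step doubles the intensity.
intensities = [2.0 ** n for n in range(6)]
grays = [perceived_gray(i) for i in intensities]

# Differences between successive grays; in this model each one equals ln(2).
gray_steps = [b - a for a, b in zip(grays, grays[1:])]

for i, g in zip(intensities, grays):
    print(f"intensity {i:5.1f} -> gray {g:.3f}")
```

Within this simple model every doubling of intensity adds the same increment of gray, which is all that the law asserts; the limited range of validity mentioned above is not modeled.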
This physiological law is frequently confused by physicists with the Beer-Lambert law (see footnote to p. 277) of the decrease of intensity in absorbing matter, which shows that the optical density, defined as the negative logarithm of the transmitted intensity, increases linearly with the thickness. According to this law, a linearly increasing thickness of an absorbing matter produces a linear progression of optical densities and therefore a linearly increasing series of grays. For our eye “twice as thick is twice as gray.” But also “twice as concentrated (in a solution) is twice as gray or twice as colored.”

It is obvious that electron microscopists would like to correlate the gray of a micrograph with specific properties of the specimen. Here is where unpleasant confusions start! These began by calling dark parts of a micrograph electron dense, instead of electron opaque, meaning “highly electron scattering.” It was clear that confusion had to arise with data obtained by x-ray diffraction, where the density of electrons, the real electron density, is directly determinant. As we will see below, the mass density ρ is important in determining the grays in micrographs and for electron diffraction. No wonder then that the measured optical densities of micrographs are transformed nearly automatically into mass densities when producing 3D reconstructions.

Experts in thin sectioning frequently realized that the gray (wrongly named electron density) obtained on sections of cells is in the first approximation a reflection of the concentration of matter in the initial aqueous cytosol, and in the second, a consequence of stain and specific chemical composition. A “dense,” in the sense of concentrated, cytoplasm is darker on the micrograph than a more diluted, “less dense” part of it elsewhere in the cell! Water is the main component of living matter; on average it makes up 80% of a cell! We have introduced the dilution theorem (Section II,D) in the hope that it will help to dissipate some of these confusions, and thus to stop their inhibitory consequences for the progress of science.

B. Characteristics of the Specimen That Produce Contrast
Knowing on each area of a specimen the number of each sort of atom and their scattering cross sections (elastic, inelastic, and total) would suffice to describe the proportions of different species of transmitted electrons. The number of atoms has, however, to be determined from the mass density ρ and the thickness x. It is interesting to note that atoms do not have an exactly determined volume, and the density of protein, e.g., cannot yet simply be calculated as a consequence of the relative amounts of constituent C, N, O, and H. Apparently, packing is variably compact or, in other words, the space-filling volume of the different atoms depends on their surroundings.

Unfortunately, determining the density ρ experimentally is not very simple, because most biological macromolecules have a hydration shell, of which the “organized” water behaves differently than the surrounding water. In vacuum, or after substitution by organic liquids, hydration shells are supposed to be totally or at least mostly removed. But in reality, very little is still known about these shells; neutron diffraction and also observation of still-frozen specimens in the electron microscope after transformation of water into vitreous ice have been recently introduced as experimental methods for studying these problems.

Thickness x and density ρ substitute for the number of atoms when we know also the atomic weights. The latter, together with the atom-specific cross sections, are combined in the matter-specific scattering constants K_el + K_in = K. As the strict theory (Section II) shows, the unscattered electrons decrease as P(0) = exp(-Kρx). The calculations show in addition that the K's for stained and unstained biological matter are nearly constant, with deviations of only ±10%.
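To give a feeling for the relation P(0) = exp(-Kρx), the following minimal sketch evaluates it for a few section thicknesses and for a ±10% spread in K. The numerical value of K, the density, and the thicknesses are hypothetical placeholders chosen for illustration, not values from this chapter; the helper names are likewise assumptions.

```python
import math


def mass_thickness_ug_cm2(rho_g_cm3, x_nm):
    """Mass thickness rho * x in micrograms per cm^2 (1 nm = 1e-7 cm, 1 g = 1e6 ug)."""
    return rho_g_cm3 * x_nm * 1e-7 * 1e6


def unscattered_fraction(K_cm2_ug, rho_g_cm3, x_nm):
    """P(0) = exp(-K * rho * x): fraction of electrons that remain unscattered."""
    return math.exp(-K_cm2_ug * mass_thickness_ug_cm2(rho_g_cm3, x_nm))


K = 0.10   # hypothetical total scattering constant, cm^2 per microgram
RHO = 1.3  # illustrative mass density, g/cm^3

for x_nm in (25.0, 50.0, 100.0):
    low = unscattered_fraction(1.1 * K, RHO, x_nm)   # K increased by 10%
    mid = unscattered_fraction(K, RHO, x_nm)
    high = unscattered_fraction(0.9 * K, RHO, x_nm)  # K decreased by 10%
    print(f"x = {x_nm:5.1f} nm: P(0) = {mid:.3f} (range {low:.3f} to {high:.3f})")
```

Because only the product Kρx enters the exponent, doubling the thickness squares the unscattered fraction, and a 10% change in K has the same effect as a 10% change in mass thickness.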
The inelastically and elastically, singly and multiply scattered electrons are represented by more complex functions of the type 1 - exp(-Kρx), multiplied by either K_el or K_in, and combinations thereof. This type of equation is mathematically unwieldy; in general, by series expansions, they are simplified into linear functions with corrections for nonlinearity. The collected electrons, or those remaining after removal of a certain class and which are used for imaging, are given by rather long formulas. In STEM, the electrons collected on the annular dark-field detector happen to have a largely extended linear range, because of multiple scatter, which moves scattered electrons from the LEL detector to the annular dark-field detector. By electronic manipulations it is easy to transform a 1 - exp function back into a simple exponential by subtracting it from a constant.

The essential determinants for scattered electrons are thus ρ and x and the relative amounts of K_el and K_in (K_in = K - K_el). In conventional imaging ρ and x contribute most to contrast, while in ratio contrast it is K_el/K_in (Sections II,C and IV,C). The contrast-forming weight of matter-specific properties (K_el and K_in) relative to that of thickness is more than three times higher in ratio contrast than in dark field. A thickness (or density) variation has to be more than three times as high in ratio contrast to produce the same contrast as in the dark-field mode (Section IV,B).
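The behavior of a function of the type 1 - exp(-Kρx) can be sketched in a few lines: its series expansion, the quality of the linear approximation, and the recovery of a simple exponential by subtraction from a constant. This is an illustration only; the function names are hypothetical, the dimensionless argument t stands for Kρx, and the multiplying factors K_el or K_in are left out.

```python
import math


def scattered_fraction(t):
    """1 - exp(-t), with t = K * rho * x (dimensionless)."""
    return -math.expm1(-t)


def first_order(t):
    """Leading (linear) term of the expansion 1 - exp(-t) = t - t**2/2 + ..."""
    return t


def second_order(t):
    """Linear term plus the first correction for nonlinearity."""
    return t - t * t / 2.0


def back_to_exponential(signal, constant=1.0):
    """Subtracting a (1 - exp) signal from a constant recovers exp(-t)."""
    return constant - signal


for t in (0.05, 0.2, 1.0):
    exact = scattered_fraction(t)
    print(
        f"t = {t:4.2f}: exact {exact:.4f}, linear {first_order(t):.4f}, "
        f"corrected {second_order(t):.4f}, complement {back_to_exponential(exact):.4f}"
    )
```

The linear form is adequate only while Kρx is small; the extended linear range of the annular dark-field signal mentioned above results from multiple scattering and is not captured by this sketch.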
C. Contrast in Conventional Imaging
In a forthcoming paper we will detail the precise reasons why in STEM and CTEM ΔG is proportional to ΔS, which, we recall, is ΔS ≈ Δ(ρx) exp(-Kρx). This means that the contrast decreases with increasing mass thickness ρx. This can be verified by comparing a small particle, e.g., a virus, positioned on something with variable thickness, e.g., a holey film. The signal S can be transformed into light of intensity L, with L ≈ S, by a fluorescent screen. The emitted light can then be transmitted to the photographic film by a fiber plate (Guetter and Menzel, 1978). The photographic response to light has historically been developed so as to achieve an extended range of linearity with G = ln L. With so-called outside photography, as it is performed in the Zeiss EM 109, we obtain therefore ΔG ≈ Δ(ρx). This is also experimentally verified and reported in the above-mentioned paper. These differences of response are practically relevant only for specimens with very high contrast differences. Most specimens, however, have typically a very small range. In such cases, the micrographs obtained under these two conditions have no visibly different characteristics. Over small ranges, part of an exponential curve can always be approximated by a straight line.

In a first approximation, contrast is due either to density differences, to thickness differences, or to both: Δ(ρx) = ρ(Δx) when ρ is constant, or x(Δρ) in a specimen of constant thickness but variable substance. Constant thickness is given by embedded material, if the two surfaces are perfectly flat. Ideally, this should be the case with thin sections. As we have summarized here (Section IV,D) and shown in detail elsewhere, this is not really so, and the surface reliefs (fracture surfaces) on both sides have amplitudes of the order of 3-5 nm.

The density range of unstained biological matter goes only from 0.9 to 1.4. With OsO₄ fixation and uranyl acetate postfixation, about 20% w/w of heavy metal is deposited, which leads to densities of proteins in the range 1.5-1.7. This is not yet sufficient for conventional microscopy (Section V,D). With additional section staining, very high densities are reached which now are sufficient for eliminating most contrast problems in bright-field imaging. Obviously, we then observe the heavy-metal deposits only and not the biological material itself, which singularly narrows the possibilities for obtaining high-resolution information.

Sections of unstained biological material are of extremely low contrast; only with some particularly compact structures, like bacteria, can an image exceptionally be made in the CTEM bright-field mode (see Fig. 20a). In the dark-field mode, more contrast is achieved; even unstained material is easily observable (Weibull, 1974; Sjöstrand et al., 1978; Jones and Leonard, 1978). The results were, however, disappointing in their lack of definition; we have shown here (Section IV,D) and elsewhere that the surface relief is mostly responsible: thickness differences have a much stronger influence than the density differences between unstained biological material and resin.
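The two recording conditions compared at the beginning of this section, a response proportional to the signal (ΔG proportional to ΔS) and a logarithmic response (G = ln L), can be contrasted with a small numerical sketch of a particle lying on supports of different mass thickness. The signal model S = exp(-Kρx), the value of K, and the mass thicknesses are assumptions made for illustration only.

```python
import math

K = 0.10  # hypothetical scattering constant, cm^2 per microgram


def signal(mass_thickness):
    """Assumed bright-field signal S = exp(-K * rho * x)."""
    return math.exp(-K * mass_thickness)


def linear_contrast(background, particle):
    """Delta S between the support alone and the support plus particle."""
    return signal(background) - signal(background + particle)


def log_contrast(background, particle):
    """Delta G for a logarithmic recording, G = ln S."""
    return math.log(signal(background)) - math.log(signal(background + particle))


PARTICLE = 1.0  # mass thickness of the small particle, ug/cm^2 (illustrative)

for background in (2.0, 5.0, 10.0):
    print(
        f"support {background:4.1f} ug/cm^2: "
        f"Delta S = {linear_contrast(background, PARTICLE):.4f}, "
        f"Delta G = {log_contrast(background, PARTICLE):.4f}"
    )
```

In this model the linearly recorded contrast of the particle falls as the support becomes thicker, whereas the logarithmically recorded contrast depends only on the mass thickness of the particle itself. For the small ranges of mass thickness typical of most specimens the two responses are practically indistinguishable, as stated above.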
Rapidly frozen thin films of aqueous particle suspensions without supporting film (Lepault et al., 1983b) should lead to flat surfaces, provided the particles are smaller than the film thickness and that sublimation during or before observation is completely avoided. High-coherence, narrow-beam imaging of viruses prepared in this manner has given astonishing results (Adrian et al., 1984). Although the contrast per se is low, the background interference noise is so small that a photographic contrast amplification is easily possible (from discussions with J. Dubochet, EMBL, Heidelberg). This could possibly be due to vitreous ice being really a phase object, in contrast to carbon films, which are likely to be excellent scatterers. A thorough theoretical consideration of these new experimental facts by physicists is needed and of the highest interest.
With cryosections observed in the frozen-hydrated state, the problems are similar, but much more pronounced than with resin sections: (i) the surfaces of cryosections are what we are used to observing in replicas from cryofracturing, and (ii) it is still difficult to obtain sections thinner than some 100 nm. Nevertheless, biologically interesting results have been achieved with conventional bright-field imaging (Dubochet et al., 1983). With ratio contrast, in opposition to conventional imaging, the influences of thickness variations due to surface reliefs are very much decreased and the contribution of matter-specific contrast is increased. As we will discuss in the next section, ratio contrast is thus particularly indicated for observing thin sections of unstained material.

The procedure of negative staining, despite its manual simplicity, leads in some cases to very intricate situations which are frequently barely interpretable; we should not forget that negative stains and sustains are aqueous solutions of a few percent of matter which during dry-down lose their water completely and thus become rearranged continuously under the influence of surface tensions (including wettability) and viscosity. One should simultaneously keep in mind that the surface tension deforms either the particle (Kellenberger and Kistler, 1979) or the supporting film by wrapping it around part of the particle (Kellenberger et al., 1982; Fig. 22). In most cases, both will occur to degrees which depend on the relative deformabilities and elasticities of specimen and film. Every microscopist dreams of negatively stained specimens which end up as an embedding in heavy-metal salt, limited by two flat surfaces! This event might occur exceptionally with two-dimensional biological crystals with relatively smooth surfaces, like the purple membrane (Unwin and Henderson, 1975), but certainly does not do so as a rule. This is the case, however, with frozen-hydrated films of particle suspensions.
One should also not forget to mention that the technique using self-supporting frozen-hydrated suspensions of particles is also applicable without difficulty when using heavy-salt solutions as the suspension medium. Negative contrast should become achievable with some 3-4 atom percent of heavy metals, but only when using ratio contrast. In the case of nonflat surfaces of negatively stained preparations and 3D reconstructions by tilt, one assumes the Beer-Lambert law to be applicable and thus also the validity of our dilution theorem (Section II,D). In this case, curves of equal optical density are interpretable as curves of equal mass density, but only when both the particle and the distribution of negative stain around it are outlined individually. Any 3D reconstruction should provide the entire profile, comprising particle, negative stain, and supporting film, comparable to the schemes of Fig. 23.
FIG. 22. CTEM image of shadowed particles and schematic drawing illustrating the wrapping phenomena of a thin carbon support film. (a) 126-nm polystyrene spheres adsorbed on both sides of a thin carbon film (about 6 nm thick). The particles on one side are negatively stained with uranyl acetate (particle to the left). The particle to the right is unstained and the specimen is shadowed on this unstained side. Note that both particles throw a shadow. (b) Same as (a) but shadowed on both sides. (c) Virus particle (TYMV) prepared as (b).
FIG. 23. Scheme showing the expected stain distributions for extreme cases of (a) rigid and (b) deformable specimens; for (c) single-support films and (d) the sandwich technique.
D. Contrast in Ratio Imaging

As we have seen in the theoretical discussion (Section II), Reichelt and Engel (1984) have shown that the signal of ratio contrast leads to response curves as a function of x which are parallel for different materials (Fig. 9d). The difference ΔS between the curves of two biological matters stays more or less constant and nearly equal to the ratio σ_el/σ_in. ΔS leads in our STEM to a contrast ΔG ≈ ΔS (G being measured as optical density in micrographs obtained on photographic emulsions). As we have explained by the dilution theorem, this contrast reflects the concentration of the biological material in the resin (or ice), as long as the biological macromolecules are not individually resolved. Ratio contrast therefore should allow direct determination of concentrations, provided the general category of the involved biological matter is known (nucleic acids, proteins, polysaccharides, lipids). These concentrations are meaningful only when the region to be investigated spans integrally through the slice or layer. Such concentration determinations are expected to be very useful biologically and particularly easily made with thin sections. If the structure does not span the slice or layer, the embedding above and below the structure “dilutes” the signal, as does the embedding material mixed directly with the macromolecules.

The main advantage of ratio imaging is, however, a consequence of the fact that the influence of Δ(ρx) is very much reduced when compared to dark-field imaging: the surface relief resulting from the fracture involved in the cleavage process of cutting is practically eliminated, as are also the knife marks. The other advantages are given by the much higher influence of heavy metals: 25% of Sn integrated into an organic resin provides too much “negative” contrast in the ratio mode, but nearly none with conventional bright-field imaging (Carlemalm et al., 1982b).

We might extrapolate this knowledge to the so-called negative stain, or what was frequently called a sustain. We understand that negative stain not only acts to provide contrast, but apparently sustains the structure and thus preserves it from surface-tension-induced collapses. Pure sucrose and glucose were used for that purpose (Unwin and Henderson, 1974). To provide contrast, some heavy atoms were frequently mixed into such sustains with little success. This we understand now, and have learned in addition that small amounts of heavy metals lead to very strong ratio contrast. This was verified in our laboratory by the work of M. Wurtz and M. Hanner on “sustained” bacteriophages, which gave nearly no contrast in conventional bright field, but a very strong one with ratio contrast. At the same time, the preservation of the head and its dimensions was exceptionally good because of the sustain.

The extreme contrast of heavy metals in ratio contrast is also the reason why cytochemically specific heavy-metal tags used on unstained sections lead to a much stronger signal with ratio contrast. The consequence of this is that smaller tags should give usable signal-to-noise ratios with ratio-contrast imaging. It should be emphasized here that the results of ratio-contrast imaging might depend on the collection angles associated with the geometries of the objective lens and the annular detector used. Much more work is needed to definitively optimize this problem.
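The reduced influence of Δ(ρx) in ratio imaging, described at the beginning of this subsection, can be made plausible with a deliberately idealized sketch. It assumes a thin specimen in which the elastic and inelastic signals are each simply proportional to the mass thickness, so that their quotient reduces to K_el/K_in. The material constants below are hypothetical, and the model ignores multiple scattering and detector geometry.

```python
def dark_field_signal(K_el, mass_thickness):
    """Thin-specimen approximation: elastic signal proportional to K_el * rho * x."""
    return K_el * mass_thickness


def inelastic_signal(K_in, mass_thickness):
    """Thin-specimen approximation: inelastic signal proportional to K_in * rho * x."""
    return K_in * mass_thickness


def ratio_signal(K_el, K_in, mass_thickness):
    """Quotient of elastic and inelastic signals; rho * x cancels in this limit."""
    return dark_field_signal(K_el, mass_thickness) / inelastic_signal(K_in, mass_thickness)


# Hypothetical (K_el, K_in) pairs in cm^2 per microgram, for illustration only.
MATERIALS = {"resin": (0.030, 0.070), "biological": (0.035, 0.065)}

for name, (K_el, K_in) in MATERIALS.items():
    for mass_thickness in (5.0, 6.0):  # a 20% relief in mass thickness
        print(
            f"{name:10s} rho*x = {mass_thickness:.1f}: "
            f"dark field = {dark_field_signal(K_el, mass_thickness):.3f}, "
            f"ratio = {ratio_signal(K_el, K_in, mass_thickness):.3f}"
        )
```

In this limit a 20% surface relief changes the dark-field signal by 20% but leaves the ratio untouched, while a difference in composition changes the ratio directly. The cancellation is complete only in this idealization; the response curves discussed above indicate that in practice the influence of thickness is strongly reduced rather than removed.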
VII. DISCUSSION OF LIMITATIONS

A more detailed discussion of limitations in the electron microscopy of biological material is provided by Kellenberger and Chiu (1982); thus we will summarize here only the most important of them.

A. Beam-Induced Destruction
Ultimately all electron microscopy of biological material will be limited by the extent of beam-induced destruction, although numerous other limitations are still predominant and, potentially, seem to be avoidable. We will nevertheless start to discuss beam-induced distortions because we have only mentioned (Section IV,C) but not yet emphasized this old, but again fashionable, phenomenon. Beam-induced distortions have been observed by one of us (E.C.) on HM20-embedded, unstained gap junctions and other fine structures, when observed with ratio contrast under low-dose conditions. The doses were about the same as those used in minimal beam conditions of conventional bright field. It will now be very important to compare scanning with conventional imaging and to keep the specimen at different temperatures. It is presently possible to observe at 93 K with cryostages and conventional lenses in STEM and CTEM and, with cryolenses (Dietrich et al., 1977) kept at some 10-20 K, already in CTEM and soon also in a cryo-STEM (built by A. Jones and collaborators at EMBL, Heidelberg). From these studies we should obtain more information on the influence of the dose rate (or intensity) of the beam. Presently, with CTEM, all measurable effects are only dose dependent. The range of intensity variations in CTEM is, however, too small to permit definitive analysis.

On frozen-hydrated material observed in CTEM, the “bubbling” effect (Chang et al., 1983) occurs suddenly when certain critical doses are reached. What will this effect become in STEM? First observations at 93 K showed that the effect is at least strongly modified. Only further work will elucidate whether the effect disappears or is replaced by “microbubbling.” Nothing definitive is known about the cryoprotection factor at 10-20 K, which was claimed originally to be around 30 (Knapek and Dubochet, 1980). With better experimental systems, this factor was very much reduced (Lepault et al., 1983a), but there are still claims of substantial improvement (Chiu and Jeng, 1982). It is obvious that these studies have to be continued with various, adequate specimens.
It is possible to construct resins that are more beam resistant, e.g., by using cyclic components as has been done with the new Sn-containing resins (J. D. Acetarin and E. Carlemalm, unpublished). This is, however, only possible with a compromise: this resin, before curing, is viscous and cannot be used in low-temperature procedures (Carlemalm et al., 1982a).

B. Plastic Deformations in Thin Sections
The process of microtome cutting of material embedded in ice or resins is in reality a cleavage. Cleavage happens when the tensile force attains a critical value at which rupture occurs. This value is strongly matter dependent and occurs after a phase of so-called plastic flow, which, after release of tension, is not, or only partly, reversible and thus leads to a permanent deformation. The extent of the phase of plastic flow is extremely matter dependent. Glass, and probably also vitreous ice, is, at the speeds of cutting actually used, nearly not in the plastic range. In general, anything which is called brittle has these characteristics. Very little is known as yet about embedded biological material, except that the “complementary” cleavage surfaces do in fact not complement their reliefs (Williams and Kallman, 1955) because of the different plastic flow of embedded biological matter and embedding. The cohesion between these two is also likely to play an important role and would explain why epoxy embeddings show less relief than others. Indeed, epoxies like Araldite are reputed to be among the best “glues” or cements. This would also explain why Epon embeddings have to be etched to react well with antibodies, while cross-linked methacrylate embeddings (Lowicryl K4M) do so immediately without any prior treatment.

It is obvious that the above-described plastic flows and ensuing deformations are likely to produce some “cracks” within and between embedded matter and resin, cracks which would help to produce massive uptake of heavy-metal stain. Although very little experimental work has been published on these problems (possibly because the referees do not like to face an unpleasant reality?), we have to be aware that deformations will always be present in slices produced by current microtomy techniques. The depth of the perturbed layer might decrease by an optimization of techniques. Since ratio contrast images not only the surfaces but also the inside, improved information is expected, as shown already on septate junctions (Garavito et al., 1982). With vitreous ice as embedding material, phase contrast might be sufficient to produce contrast with thick sections. In this case, the interior would also be predominant in forming image contrast. But, again, experimental studies (not to speak of theories) are only at their very beginning.
C. Limitations due to Positive Stain Overcome by the Possibility of Observing Unstained Sections

We have already mentioned several times that nearly equal amounts of heavy metal are needed to visualize, e.g., a protein in conventionally observed sections (Fig. 20). It is clear that we know nothing about the location of heavy metal relative, e.g., to protein, particularly if we remember what we said in Section VII,B. It is obvious to us and some others that it is high time to observe also unstained material, so as to start understanding something about these relative locations. Since ratio contrast allows the study of unstained material, this limitation is removed, but might be replaced by those described in Sections VII,A and VII,B.
D. Limitations due to Negative Stain and Potential of Observing Frozen-Hydrated Material

We have discussed negative stain in Sections VI,C and VI,D, emphasized the difficulties of interpretation encountered, and discussed their elimination through the new prospects offered by the observation of frozen-hydrated thin layers of suspensions (Lepault et al., 1983b). More experimental work has to be done with different imaging modes and temperatures of observation. It seems clear to us that this technique will revolutionize electron microscopy of small, biochemically isolated structures because here the situation is simple enough to be both understandable and interpretable. By comparison with thin sections and extrapolation of the knowledge related to them, we might also learn to understand and reduce the present sectioning artifacts, so as to provide better sectioning techniques. We have to be aware that microtomy and cryofracturing are the only techniques available to study fine structures in their natural context inside the cells. No other biophysical technique is as yet available for such direct in situ studies.
E. Limitations due to Noise

To reduce beam damage one lowers the dose as far as possible, as a compromise with quantum noise, which becomes more pronounced as the dose decreases. It might eventually reach a level at which direct imaging is no longer possible; redundant objects might then be subjected to information processing, the noise thereby being reduced as the number of images of the many identical objects used increases. In ratio contrast we have white (statistical) quantum noise in both the elastic and the inelastic signal when we work with minimal doses. The quotient of two white noises becomes a new noise of a special type, following an F distribution (Guenther, 1964). This noise is not as pleasant as white noise, because its distribution is skewed, which could be bothersome in cytochemical labeling with heavy metals (Section VI,D). These ratio-contrast noises have to be further explored, experimentally and theoretically. The problem has been overshadowed by the observation that, compared with the STEM in Basel, the STEM in Orsay (Paris) had an inelastic signal much less noisy than its elastic one. The causes of this are presently being explored. We tend to believe that the main reason for the differences should be sought in the different geometries of the detectors and the ensuing differences in collection angles, as well as the lower beam currents used in Basel.
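The two statistical arguments of this section can be illustrated numerically. The sketch below is not taken from the work described here: it assumes that quantum noise can be modeled as independent Poisson counts per pixel, that both signals have the same arbitrary mean of 20 counts, and all function names are hypothetical. Under these assumptions it estimates the skewness of the pixel-wise inelastic/elastic ratio and the noise reduction obtained by averaging images of identical objects. The exact F distribution holds for quotients of scaled chi-square variables; Poisson counts only approximate that situation.

```python
import math
import random
import statistics


def poisson(mean, rng):
    """Draw one Poisson count (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def skewness(values):
    """Third central moment divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sigma ** 3)


def ratio_samples(mean_inelastic, mean_elastic, n, rng):
    """Pixel-wise inelastic/elastic ratios; pixels with zero elastic counts are skipped."""
    ratios = []
    while len(ratios) < n:
        inelastic = poisson(mean_inelastic, rng)
        elastic = poisson(mean_elastic, rng)
        if elastic > 0:
            ratios.append(inelastic / elastic)
    return ratios


def averaged_noise(mean, n_images, n_trials, rng):
    """Standard deviation of one pixel after averaging n_images noisy copies."""
    averages = [
        statistics.fmean(poisson(mean, rng) for _ in range(n_images))
        for _ in range(n_trials)
    ]
    return statistics.pstdev(averages)


if __name__ == "__main__":
    rng = random.Random(0)
    ratios = ratio_samples(20.0, 20.0, 20000, rng)
    print("skewness of ratio signal:", round(skewness(ratios), 2))
    single = averaged_noise(20.0, 1, 4000, rng)
    stacked = averaged_noise(20.0, 16, 4000, rng)
    print("noise reduction from averaging 16 images:", round(single / stacked, 2))
```

In this model the ratio signal comes out positively skewed, and averaging 16 images lowers the noise by a factor close to 4, that is, by the square root of the number of images rather than by the number itself.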
VIII. CONCLUDING REMARKS

The imaging of single atoms has been the dream of physicists occupied with the theory or the design of electron microscopes during the last two decades. Some efforts have also been made toward a better physical understanding of the microscopy of very thick specimens. Since the times of such pioneers of biophysical electron microscopy as B. von Borries, C. Hall, and J. Hillier, the “normal” biological specimen, on the preparation of which the electron microscopist spends all his imagination and skill, has almost never been treated by physicists. This is very understandable, because no precise theory can be formulated, and physics can become involved only in an old-fashioned qualitative manner! We have attempted this venture of understanding and describing in physical terms what hundreds or thousands of scientists do every day when they try to interpret electron micrographs. Along with conventional imaging we have included the ratio-contrast techniques, which together with the recently developed cryotechniques open new avenues to biological electron microscopy. The possibility of extending our visual knowledge of the microworld from the usual 10 nm or so to some 2-5 nm no longer seems to us to be a mere dream. And this not only with regularly arrayed two-dimensional crystals, where 1-2-nm imaging is currently achieved, but also with the direct imaging of structures in situ in the cell. This is a level at which electron microscopy has a monopoly: only thin sections and cryofractures are able to provide direct information. Such information is very important for rounding out all the indirect evidence gathered from genetics and biochemistry.
ACKNOWLEDGMENTS

We gratefully acknowledge the very competent assistance of Werner Villiger in various instances and of Renate Gyalog in thin sectioning. We also gratefully acknowledge the many and rewarding discussions with Professor T. Tschudi and Professor H. Rose (Darmstadt), Dr. J. Dubochet (EMBL Heidelberg), and Dr. Rudolf Reichelt in our laboratory. We are most indebted to Dr. R. Reichelt, Robert Wyss, and Dr. Andreas Engel for designing and constructing a spectrometer for the STEM in Basel and for EMBL, Heidelberg, so that work could also continue in these two places. We also greatly appreciate the opportunity to share the results obtained in our laboratory by Dr. R. Reichelt and Dr. A. Engel with the Monte Carlo method prior to publication. We are grateful to Elvira Amstutz and Marianne Schafer for their patient and efficient typewriting, to Dr. Michel Wurtz for his help with the graphic art, and to Margrit Steiner for excellent darkroom work. The Basel group is supported by Grant No. 3.069.81 from the Swiss National Science Foundation, and the Orsay group by a grant from CNRS (US120041).
REFERENCES

Adrian, M., Dubochet, J., Lepault, J., and McDowall, A. W. (1984). Nature (London) 308, 32.
Amos, L. A., and Klug, A. (1975). J. Mol. Biol. 99, 51.
Bahr, G. F. (1955). Exp. Cell Res. 9, 277.
Brack, C. (1981). CRC Crit. Rev. Biochem. 10, 113.
Branton, D., and Klug, A. (1975). J. Mol. Biol. 92, 559.
Brown, G. M., Noe-Spinlet, M. R., Busing, W. R., and Levy, H. A. (1977). Acta Crystallogr., Sect. B B33, 1038.
Burge, R. E. (1973). J. Microsc. (Oxford) 98, 251.
Burge, R. E., and Smith, G. H. (1962). Proc. Phys. Soc., London 79, 673.
Carlemalm, E., and Kellenberger, E. (1982). EMBO J. 1, 63.
Carlemalm, E., Garavito, R. M., and Villiger, W. (1982a). J. Microsc. (Oxford) 126, 123.
Carlemalm, E., Acetarin, J. D., Villiger, W., Colliex, C., and Kellenberger, E. (1982b). J. Ultrastruct. Res. 80, 339.
Carlemalm, E., Baschong, W., Seiler, H., Hohl, C., and Kellenberger, E. (1985). In preparation.
Carrascosa, J. L., and Kellenberger, E. (1978). J. Virol. 25, 831.
Caspar, D. L. D. (1980). Biophys. J. 32, 103.
Castaing, R., and Henry, L. (1962). C. R. Hebd. Seances Acad. Sci., Ser. B 255, 76-80.
Chang, J.-J., McDowall, A. W., Lepault, J., Freeman, R., Walter, C. A., and Dubochet, J. (1983). J. Microsc. (Oxford) 132, 109.
Chiu, W., and Jeng, T. W. (1982). Ultramicroscopy 10, 63.
Colliex, C., and Mory, C. (1984). In “Quantitative Electron Microscopy” (J. N. Chapman and A. J. Craven, eds.). Scottish Univ. Summer School Phys. 25, 149-216.
Colliex, C., Jeanguillaume, C., and Mory, C. (1984). J. Ultrastruct. Res. 83.
Crewe, A. V., and Groves, T. (1974). J. Appl. Phys. 45, 3662.
Crewe, A. V., and Wall, J. (1970). J. Mol. Biol. 48, 375.
Crewe, A. V., Langmore, J. P., and Isaacson, M. S. (1975). In “Physical Aspects of Electron Microscopy and Microbeam Analysis” (M. Siegel and D. R. Beaman, eds.), p. 47. Wiley, New York.
Dietrich, I., Fox, F., Knapek, E., Lefranc, G., Nachtrieb, K., Weyl, R., and Zerbst, H. (1977). Ultramicroscopy 2, 241.
Dubochet, J. (1973). In “Principles and Techniques of Electron Microscopy: Biological Applications” (M. A. Hayat, ed.), Vol. 3, p. 115. Van Nostrand-Reinhold, Princeton, New Jersey.
Dubochet, J. (1975). J. Ultrastruct. Res. 52, 276.
Dubochet, J., Ducommun, M., Zollinger, M., and Kellenberger, E. (1971). J. Ultrastruct. Res. 35, 147.
Dubochet, J., McDowall, A. W., Menge, B., Schmid, E. N., and Lickfeld, K. G. (1983). J. Bacteriol. 155, 381.
Earnshaw, W. C., King, J., and Eiserling, F. A. (1978). J. Mol. Biol. 122, 417.
Egerton, R. F. (1980). Ultramicroscopy 5, 521.
Egerton, R. F. (1982a). Ultramicroscopy 10, 297-300.
Egerton, R. F. (1982b). J. Microsc. (Oxford) 126, 95.
Eiserling, F. A. (1979). In “Comprehensive Virology” (H. Fraenkel-Conrat and R. R. Wagner, eds.), Vol. 13, p. 543. Plenum, New York.
Erickson, H. P., and Klug, A. (1971). Philos. Trans. R. Soc. London, Ser. B 261, 105.
Eusemann, R., Rose, H., and Dubochet, J. (1982). J. Microsc. (Oxford) 128, 232.
Furukawa, H., Mamada, H., and Mizishiman, S. (1979). J. Bacteriol. 140, 1071.
Garavito, R. M., Carlemalm, E., Colliex, C., and Villiger, W. (1982). J. Ultrastruct. Res. 80, 344.
Groves, T. (1975). Ultramicroscopy 1, 15.
Guenther, W. C. (1964). In “Analysis of Variance,” p. 18. Prentice-Hall, Englewood Cliffs, New Jersey.
Guetter, E., and Menzel, M. (1978). Proc. Int. Congr. Electron Microsc., 9th, 1978 Vol. I, p. 92.
Hall, E. C. (1953). In “Introduction to Electron Microscopy,” p. 226. McGraw-Hill, New York.
Hanszen, K. J., and Trepte, L. (1971). Optik 33, 166.
Henkelman, R. M., and Ottensmeyer, F. P. (1974). J. Microsc. (Oxford) 102, 79.
Hillier, J. (1949). J. Bacteriol. 57, 313.
Hillier, J., and Ramberg, E. G. (1949). Proc. Delft Conf. Electron Microsc., 1949, pp. 42-51.
Hobot, J. A., Carlemalm, E., Villiger, W., and Kellenberger, E. (1984). J. Bacteriol. 160, 143-152.
Isaacson, M. (1975). In “Techniques in Electron Microscopy and Microprobe Analysis” (B. Siegel and D. Beaman, eds.), Chapter 14, p. 247. Wiley, New York.
Isaacson, M. (1977). In “Principles of Electron Microscopy” (M. A. Hayat, ed.), Vol. 7, p. 1. Van Nostrand-Reinhold, Princeton, New Jersey.
Jones, A. V., and Leonard, K. R. (1978). Nature (London) 271, 659.
Jouffrey, B. (1983). In “Microscopie Electronique en Science des Materiaux” (B. Jouffrey, A. Bourret, and C. Colliex, eds.), p. 85. CNRS, Paris.
Kellenberger, E., and Chiu, W. (1982). Ultramicroscopy 10, 165.
Kellenberger, E., and Kistler, J. (1979). In “Unconventional Electron Microscopy for Molecular Structure Determination” (W. Hoppe and R. Mason, eds.), p. 49. Vieweg, Braunschweig.
Kellenberger, E., Bolle, A., Boy de la Tour, E., Epstein, R. H., Franklin, N. C., Jerne, N. K., Reale-Scafati, A., Sechaud, J., Bendet, I., Goldstein, D., and Lauffer, M. A. (1965). Virology 26, 419.
Kellenberger, E., Haner, M., and Wurtz, M. (1982). Ultramicroscopy 9, 139.
Kellenberger, E. et al. (1985). In preparation.
Kistler, J., and Kellenberger, E. (1977). J. Ultrastruct. Res. 59, 70.
Kistler, J., Aebi, U., Onorato, L., ten Heggeler, B., and Showe, M. K. (1978). J. Mol. Biol. 126, 571.
Knapek, E., and Dubochet, J. (1980). J. Mol. Biol. 141, 147.
Labedan, B., and Goldberg, E. B. (1979). Proc. Natl. Acad. Sci. U.S.A. 76, 4669.
Lamvik, M. K., and Langmore, J. P. (1977). Scanning Electron Microsc. 1, 401.
Langmore, J. P., Wall, J., and Isaacson, M. S. (1973). Optik 38, 335.
Leibo, S. P., Kellenberger, E., Kellenberger-van der Kamp, C., Frey, T. G., and Steinberg, C. M. (1979). J. Virol. 30, 327.
Lenz, F. (1954). Z. Naturforsch., A 9A, 185.
Lepault, J., Dubochet, J., Dietrich, I., Knapek, E., and Zeitler, E. (1983a). J. Mol. Biol. 163, 511.
Lepault, J., Booy, F. P., and Dubochet, J. (1983b). J. Microsc. (Oxford) 129, 89.
Misell, D. L. (1975). In “Physical Aspects of Electron Microscopy and Microbeam Analysis” (B. M. Siegel and D. R. Beaman, eds.), p. 63. Wiley, New York.
Ottensmeyer, F. P. (1969). Biophys. J. 9, 1144.
Ottensmeyer, F. P., and Andrew, J. W. (1980). J. Ultrastruct. Res. 72, 336.
Ottensmeyer, F. P., and Arsenault, J. C. (1983). Scanning Electron Microsc. 4, 1867.
Ottensmeyer, F. P., and Pear, M. (1975). J. Ultrastruct. Res. 51, 253.
Ottensmeyer, F. P., Whiting, R. F., Schmidt, E. E., and Clemens, R. S. (1975). J. Ultrastruct. Res. 52, 193.
Reichelt, R., and Engel, A. (1984). Ultramicroscopy 13, 279.
Schafer, L., Yates, A. C., and Bonham, R. A. (1971). J. Chem. Phys. 55, 3055.
Scraba, D. G., Raska, I., Kellenberger, E., and Moor, H. (1973). J. Ultrastruct. Res. 44, 27.
Sjostrand, F. S., Dubochet, J., Wurtz, M., and Kellenberger, E. (1978). J. Ultrastruct. Res. 65, 23.
Smith, P. R., Aebi, U., Josephs, R., and Kessel, M. (1976). J. Mol. Biol. 106, 243.
Steven, A. C., ten Heggeler, B., Müller, R., Kistler, J., and Rosenbusch, J. P. (1977). J. Cell Biol. 72, 292.
Thach, R. E., and Thach, S. S. (1971). Biophys. J. 11, 204.
Thon, F. (1966). Z. Naturforsch., A 21A, 476.
Unwin, P. N. T., and Henderson, R. (1975). J. Mol. Biol. 94, 425.
Valentine, R. C. (1966). Adv. Opt. Electron Microsc. 1, 180.
von Ardenne, M. (1940). “Electronen Uebermikroskopie.” Springer-Verlag, Berlin and New York.
von Borries, B. (1949a). “Uebermikroskopie.” Verlag Editio Cantor, Aulendorf/Württ.
von Borries, B. V. (1949b). Z. Naturforsch., A 4A, 51.
Wall, J., Isaacson, M., and Langmore, J. P. (1974). Optik 39, 359.
Weibull, C. (1974). J. Bacteriol. 120, 527.
Weibull, C., Christiansson, A., and Carlemalm, E. (1983). J. Microsc. (Oxford) 129, 201.
Weibull, C., Carlemalm, E., and Villiger, W. (1984). J. Microsc. (Oxford) 134, 213.
Westphal, C., Bachhuber, K., and Frosch, D. (1984). J. Microsc. (Oxford) 133, 111.
Williams, R. C., and Kallman, F. (1955). J. Biophys. Biochem. Cytol. 1, 301.
Wunderli, H., van den Broek, J., and Kellenberger, E. (1977). J. Supramol. Struct. 7, 135.
Zernike, F. (1935). Z. Tech. Phys. 16, 454.
Zworykin, V. K., Morton, G. A., Ramberg, F. G., Hillier, J., and Vance, A. W. (1945). “Electron Optics and the Electron Microscope.” Wiley, New York.
Subject Index A Acceleration, in analysis of particle motion, 246-247 AFD, see Annular dark-field detector Alkali, effect on photocathode stability, 77 79 Alkali antimonide photocathode base layers, 9 1 instability, 77, 79, 81 -82 photoconductivity, 93 - 94 resistance, 95 - 96 structure, 96 - 97 theoretical aspects, 122-123, 125, 129130 Angular momentum, conservation of, see Conservation of angular momentum Annular dark-field detector (AFD), in electron microscopy, 281, 283-284 bacterium - bacteriophage study, 3 1 1 , 315 linear range of electrons, 322 real biological structures, 308- 309 thickness dependence of signal, 296 -297 thin-section relief and contrast, 302- 306 Antenna, open-ended waveguide, 141- 143, 152-155 Antimony, photocathode base, 91 -93, 98 Astigmatism, in electron microscopy, 27 1 Astronomy, particle modeling, 234-242 Atom contrast formation in electron microscopy of, 273,275 scattering cross sections, 286-29 1 Atomic number contrast, in electron microscopy, see Z contrast scatteringcross sectionsand, 287,290-292
B Bacteria, electron microscopy, 309 - 3 18, 323 Bacteriophage, electron microscopy, 309 316 Beam-induced destruction, in electron microscopy, 328
Beer-Lambert law, 320,324 Bialkali photocathode, 76 formation, composition, and spectral response, 1 12, 119 optical data, 85 photoconductivity, 94 resistance, 95 secondary emission, 83 stability, 8 1 - 83 structure, 98 theoretical aspects, 122, 125-126, 130 Binary star system, 234.-237 Biological and medical applications electron microscopy, see Electron microscopy, contrast formation in open-ended coaxial lines, 160 open-ended waveguides, 156- 158, 161 particle modeling of cell sorting, 237-239, 243 Biological carbon, 300 Bright-field imaging, in electron microscopy, 282-283 bacteria, 3 18 bacterium - bacteriophage study, 3 13 contrast, 316, 323-324 elastic cross sections, 286-287 influences of thickness, density, and scattering on, 298 - 302 C
Carbohydrate, contrast formation in electron microscopy, 288, 290, 299, 301, 308 Carbon, contrast formation in electron microscopy, 288, 290, 298, 300, 325 Cell sorting, in biology, 237-239, 243 Cesium antimonide photocathode, see S-4 photocathode Cesium-rubidium-antimony photocathode, 76 formation, composition, and spectral response, 111 optical data, 86 photoconductivity, 94
Cesium-rubidium-antimony photocathode (Continued) temperature effects, 82 theoretical aspects, 122, 124 Circular waveguide, open-ended, 152-153, 156 measurement of materials, 159-160 TE11 mode, 176-179 TM01 mode, 172-174 Coaxial line, open-ended, 153-154, 174-176 measurement of materials, 160 TEM mode, 184 Coherence, white-light image processing, 2-3, 5-36 image deblurring, 15-24 measurement, 24-36 mutual coherence function, 8-14 Coherent noise, 2 suppression, in white-light image processing, 5 color image deblurring, 45 Coherent optical image processing, 2 Color film, fading of, 51 Color image, in white-light image processing deblurring, 43-46 retrieval, 51-55 subtraction, 46-50 Conservation laws, three-body problem, 194-200 Conservation of angular momentum, 198-200 Conservation of energy, 192-193, 195, 197-198, 249 Conservation of linear momentum, 198, 247-249 Contrast, in electron microscopy, see Electron microscopy, contrast formation in Contrast aperture, in electron microscopy, 271-272 Conventional transmission electron microscope (CTEM), 273 bacteria, 318 bacterium-bacteriophage study, 313 beam-induced destruction, 328 contrast formation, 281-284, 322-326 mass thickness, 322 relevant constants for, 288 elastic cross sections, 286-287 geometrical parameters determining electron collection and ratio contrast, 290 inelastic cross sections, 287, 289
Critical mass thickness, in electron microscopy, 277-278, 295 Cryosection, in electron microscopy, 324 CTEM, see Conventional transmission electron microscope
D
Dark-field imaging, in electron microscopy, 273-274, 284 bacterium-bacteriophage study, 312, 315-316 contrast, 323 elastic cross sections, 287 influences of thickness, density, and scattering on, 298-302 linear range of electrons, 322 osmium-stained material, 317 real biological structures, 308-309 thickness of material, 297 thin-section relief and contrast in, 302-306 Density, in electron microscopy, 298-302, 319-321, 323 Diathermy, application of open-ended waveguide to, 156-157 Dilution theorem, in electron microscopy, 284-286, 307, 321, 324, 326 Discrete mathematical physics, 189-268 conclusion, 264-266 introduction, 189-190 Newtonian mechanics, 190-242, 264 quantum mechanics, 259-264 special relativistic mechanics, 242-259 DNA bacteriophage infection of a bacterium, 309, 314, 316 contrast formation in electron microscopy, 307-308 Drosophila melanogaster, electron microscopy studies, 319
E
Elastic scattering, in electron microscopy, 274, 276, 283-284 cross sections, 286-287, 291, 294 influences on contrast, 298 multiple scattering, 280-281 Elastic vibration, 209-211, 229-234 Electromagnetic lens, contrast formation with, 271
Electron microscopy, contrast formation in, 269-334 conclusions, 331 contrast, quantification of, 275 experimental confirmations, 309-319 instrumentation, 271-272 introduction, 270-276 limitations, 327-331 micrographs, interpretation of, 319-327 scattering cross sections and constants, 286-293 theory, 276-286 contrast formation, 281-284 multiple scattering, 278-281 single-scattering approximation, basic equations for, 276-278 superposition and the dilution theorem, 284-286 unstained sections of aldehyde-fixed materials, 294-309 Electrostatic lens, contrast formation with, 271 Energy conservation, see Conservation of energy Equivalent circuit open-ended coaxial line, 153 open-ended waveguide, 144-145, 152, 174, 183 Escherichia coli, electron microscopy, 309-316, 318
F
Film, photographic, color image retrieval, 51, 54-55 Filter, white-light image processing, 4-6, 13-16, 18-19, 39-40 color image deblurring, 43-44 pseudocolor encoding, 58, 61, 64 Fourier spectrum, white-light image processing, 4-5, 16, 24, 26, 33, 36, 47, 58, 64
G
Gas heat transfer and fluid motion, 212-215 particle modeling, 194, 212-219 shock-wave generation, 214-219 Gas particle, 237 Gravitation, 195-196 Gravity, 190-193, 196
H
Halftone screen, white-light image processing, 60-63 Harmonic oscillation, relativistic, 250-252 Heat conduction, 207-209 Heat convection, 212-215 Heavy metal, in electron microscopy specimens, 292, 317, 320, 327, 329; see also Osmium-fixed material Helium, photocathode instability, 75 Holography image subtraction, 47 spatial filter, synthesis of, 43 Hydration shell, of biological macromolecules, 321 Hydrogen contrast formation in electron microscopy, 288, 290, 298, 302 scattering properties in organic matter, 271 Hyperthermia, application of open-ended waveguide to, 157-158
I
Ice, contrast formation in electron microscopy, 288, 290, 299-301, 306, 323 Image deblurring, in white-light image processing, 15-24 color images, 43-46 spatial coherence requirement, 19-24 temporal coherence requirement, 15-19 Image filtering, white-light image processing, 4-5, 13-16, 18-19, 39-40 color image deblurring, 43-44 color image retrieval, 53 pseudocolor encoding, 58, 61, 64 Image processing, white-light, see White-light image processing Image retrieval, white-light image processing, 51-55 Image sampling, white-light image processing, 39-40, 68 Image subtraction, white-light image processing, 40-43, 46-50 Impedance bridge, open-ended waveguide measurements, 151 Incoherent light, image processing, 2-3 Inelastic scattering, in electron microscopy, 274, 276, 283-284 cross sections, 287-294
Inelastic scattering, in electron microscopy (Continued) influences on contrast, 298 multiple scattering, 280-281 Interference noise, in electron microscopy, 272, 323
L
Laminar fluid flow, 202-205 Laser, 2 Lens, in electron microscopy, 271-272 Light exposure, photocathode stability, 77 Linear momentum, conservation of, see Conservation of linear momentum Lipid, contrast formation in electron microscopy, 288, 290, 299-301 Liquid drop problem, 219-221 interface motion of a melting solid, 222-227 laminar flow, 202-205 particle modeling of, 194, 202-207, 219-229 porous flow, 220-222 surface tension, 226, 228-229 turbulent flow, 203, 206-207 Liquid particle, 237
M
Manganese, photocathode base, 92 Manganese oxide, photocathode substrate, 111 Mass per unit surface, in electron microscopy, 277 Mass thickness, in electron microscopy, 277, 322 Mean free path, in electron microscopy, 277 Medical applications, see Biological and medical applications Metallic waveguide, TE mode, 182 Microstrip, 162 Microtome cutting, of electron microscopy specimens, 328-329 Microwaves measurement of materials at, 158-159 medical applications, 156-158 quantities measured at, 151 Molecular dynamics, 259-265 Momentum, conservation of, see Conservation of angular momentum; Conservation of linear momentum Multiple scattering, in electron microscopy, 274, 278-281, 284-286, 322 Mutual coherence function, 7-8 propagation, 8-14
N
Negative staining, in electron microscopy, 324, 327, 330 Newtonian mechanics, 189-242 conservative modeling, 200-211 generalizations, 194-200 gravity, 190-193 nonconservative modeling, 211-234 particle modeling of solids, liquids, and gases, 194 quantitative modeling, 239-242 self-reorganization, models with, 234-243 Noise coherent optical image processing, 2 electron microscopy, 272, 323, 330-331 white-light image processing, 5 color image deblurring, 45 color image retrieval, 54 Nucleic acid, contrast formation in electron microscopy, 288, 290, 299, 301, 307-308
O
Open-ended waveguide, 139-187 antenna analysis, 141-143 applications, 152-162 antenna elements, 152-155 circular waveguide modes, 172-174, 176-179 coaxial line, 174-176 diathermy, 156-157 hyperthermia, 156-157 measurement of materials, 158-161 measurement of small distances, 155-156 radiation into a plasma, 156 rectangular waveguide mode, 179-181 rigorous solutions, 161-162 thermography, 157-158 basic structure, 140 conclusions, 181-182 definitions, 141-147 diffraction, 155
equivalent circuit, 144-145 flanged and unflanged apertures, 145-147 homogeneous medium, 152, 162-172 infinite sample in waveguide, 182-184 inhomogeneous medium, 152 introduction, 140-141 measurements, 147-151 mode structure, 143-144 reflectometry, 148-150 resonant cavities, 150-151 slotted-line measurement techniques, 147-148 theoretical development of flanged waveguide, 162-172 characteristic modes, 169-170 comments, 171-172 geometry, 162-163 Green’s function notation, 166 moment method, 167-168 normalization, 165-166 point matching, 166-167, 178 transform methods, 171 transmission into oversized waveguide, 171 variational principle, 168-170 Optical image processing, 2; see also White-light image processing Osmium-fixed material, in electron microscopy, 272, 288, 290, 299, 301-302, 306, 317, 323 Oxygen contrast formation in electron microscopy, 288, 290, 298 photocathode instability, 78-79
P
Partial coherence, 7-8, 11; see also White-light image processing Particle modeling biological cell sorting, 237-239, 243 celestial phenomena, 234-242 conservative modeling, 200-211 elastic vibration, 209-211 generalizations, 194-200 gravity, 190-193 heat conduction, 207-209 heat convection, 212-215 interface motion of a melting solid, 222-227 laminar fluid flow, 202-205
liquid drop problem, 219-221 nonconservative modeling, 211-234 porous flow, 220-222 quantitative modeling, 239-242 quantum mechanics, 260-265 self-reorganization, models with, 234-243 shock-wave generation, 214-219 soap films, 226, 228-229 solids, liquids, and gases, 194 solitons, 232-233 special relativistic mechanics, 244-259 string vibrations, 229-234 turbulent fluid flow, 203, 206-207 PC, see Photocathode Perfect electric conductor (PEC), open-ended waveguide, 145, 162 Phase contrast, in electron microscopy, 272-273, 281 Phosphorus, contrast formation in electron microscopy, 288, 290, 298 Photocathode (PC), 74 base layers, structure of, 91-93 energy sensitivity range, 76 fatigue, 75 field enhancement of photoemission, 99-101
formation, composition, and spectral response, 106-119, 121 gases, influence of, 75-79 Hall effect, 93, 120 infrared radiation, 81 optical characteristics, 84-90, 120-123, 126 optical enhancement of photoemission, 101-106 photoconductivity, 93-94, 120 physical properties, 83-99 resistance or conductivity, 94-96, 120 secondary emission, 83 stability, 75-83 structure, 96-99 temperature effects, 79-83, 120 theoretical attempts and models, 119-133 thermionic emission, 80-81, 120 time response, 90 uniformity, 90 Photodiode stability, 75 Photoemitter, vacuum, see Vacuum photoemitter Plasma, open-ended waveguide radiation into, 156
Plastic deformation, of thin electron microscopy specimens, 328-329 Poisson distribution, in electron microscopy, 278-280 Polychromatic image subtraction, 40-43 Porous flow, 220-222 Positive-ion bombardment, photocathode instability, 75, 77 Potassium-cesium-antimony photocathode, 76 formation, composition, and spectral response, 112, 119 optical data, 85 photoconductivity, 94 resistance, 95 secondary emission, 83 stability, 81-83 structure, 98 theoretical aspects, 122, 125, 130 Protein, contrast formation in electron microscopy, 288, 290, 299, 301-308, 329 Pseudocoloring, in white-light image processing, 55-68 density pseudocoloring, 65-68 halftone screen, 60-63 real-time encoding, 63-67 spatial encoding, 56-60 spatial frequency encoding, 63-66
Q
Quantum mechanics, 259-264
R
Ratio contrast, in electron microscopy, 274, 326-327 bacterium-bacteriophage study, 312, 315 geometric parameters determining, 290 noise, 330 osmium-fixed material, 317 real biological structures, 308-309 thickness of material, 297-298, 322, 324 thin-section relief and contrast, 302-304 Radiometry, medical applications, 157 Rectangular waveguide, open-ended, 146, 154-155 measurement of materials, 161 TE10 mode, 179-181, 183
Reflectometry, in open-ended waveguide measurements, 148-150 Resolution, in electron microscopy, 271-272, 319 Resonant cavity, open-ended waveguide, 150-151
S
Sampling grating frequency, in white-light image processing, 16 Scanning transmission electron microscopy (STEM), 273, 275 bacterium-bacteriophage study, 311-312, 315-316 beam-induced destruction, 328 contrast formation, 281-284 mass thickness and, 322 relevant constants for, 288 thin-section relief and, 302-306 Drosophila melanogaster, 319 elastic cross sections, 287 geometrical parameters determining electron collection and ratio contrast, 290 inelastic cross sections, 289 linear range of electrons, 322 noise, 330-331 Scattering, in electron microscopy, 271, 273-275, 321-322 basic equations for single-scattering approximation, 276-278 constants for different materials, 294-295 cross sections and constants, 286-293 atomic scatter cross sections, 286-291 composite matter, 291-293 elastic cross sections, 286-287 inelastic cross sections, 287-291 multiple scattering, 278-281 thickness dependence of, 295 Scattering contrast, in electron microscopy, 281 Schottky effect, in photoemitters, 99-100 Semiconductor, magnetized, measurement of, 159 Shock-wave generation, 214-219 Signal processing, 1-2 Signal-to-noise ratio, in contrast formation in electron microscopy, 275 Silver, photocathode base, 91, 96, 106-109, 131-132
Single scattering, in electron microscopy, 280, 282 basic equations, 276-278 Slotted-line technique, in measurements of open-ended waveguide, 147-148 Soap film, 226, 228-229 Sodium-potassium-antimony photocathode, see S-24 photocathode Solid interface motion of a melting solid, 222-227 particle modeling of, 194, 222-227 Solid particle, 237 Soliton, 232-233 Source encoding, in white-light image processing, 6-7, 36-42, 68 color image retrieval, 51-52 color image subtraction, 48-49 Spatial coherence function, white-light image processing, 6-7, 36-42 Spatial coherence requirement, white-light image processing, 19-24 Spatial filter, white-light image processing, 4-6, 13-14, 39 color image deblurring, 43 pseudocolor encoding, 58, 61, 64 Special relativistic mechanics, 189, 242-259 harmonic oscillation, 250-252 theory in one space dimension, 245-250 theory in three space dimensions, 252-259 Spectral band filter, white-light image processing, 4-5, 13-16, 18-19, 39-40 color image deblurring, 44 Spherical aberration, in electron microscopy, 271-272 S-1 photocathode base layers, 91 field enhancement of photoemission, 99-100 formation, composition, and spectral response, 106-109 optical data, 84 optical enhancement of photoemission, 103-104 resistance, 95 secondary emission, 83 stability, 77-79, 81-82 structure, 96 theoretical aspects, 123-125, 130-132 uniformity, 90
S-4 (Cs3Sb) photocathode, 76 formation, composition, and spectral response, 110-111 optical enhancement of photoemission, 105 photoconductivity, 93-94 resistance, 95 structure, 97-98 theoretical aspects, 122, 124-126 thermionic emission, 82 S-10 photocathode, 76, 110 optical data, 85 stability, 77, 79 structure, 96 theoretical aspects, 123, 125, 133 S-11 photocathode, 76 base layers, 92 formation, composition, and spectral response, 110-111 optical data, 85 optical enhancement of photoemission, 102, 104-105 photoconductivity, 94 resistance, 95 stability, 79-83 theoretical aspects, 122, 125 S-20 photocathode, 76 base layers, elimination of, 93 field enhancement of photoemission, 100 formation, composition, and spectral response, 112-119 Hall effect, 93 optical data, 86-90 optical enhancement of photoemission, 102-106 photoconductivity, 93-94 resistance, 95 secondary emission, 83 stability, 77-80, 82-83 structure, 98-99 theoretical aspects, 124-126, 128-129 thermionic emission, 100 time response, 90 S-24 (Na-K-Sb) photocathode, 76 formation, composition, and spectral response, 113-114, 116 optical data, 85 photoconductivity, 93 resistance, 95 stability, 81-83
S-24 (Na-K-Sb) photocathode (Continued) structure, 98 theoretical aspects, 124, 130 S-25 photocathode, 76 field enhancement of photoemission, 100 formation, composition, and spectral range, 116 optical data, 90 stability, 77, 82 theoretical aspects, 129 Stefan problem, 222-226 STEM, see Scanning transmission electron microscopy Stress, particle modeling, 239-242, 244 String vibration, 229-234 Subtracted color image, see Color image subtraction Superposition, in electron microscopy, 284-286 Surface tension, 226, 228-229, 324 Symmetry, in special relativistic mechanics, 245
T
TDR, see Time-domain reflectometry TE mode metallic waveguide, 182 open-ended waveguide, 144, 165 circular waveguide, 177-178 rectangular waveguide, 179-181 TE01 mode, open-ended circular waveguide, 152, 167 TE10 mode, open-ended waveguide, 154, 156, 161, 179-181, 183 TE11 mode, open-ended circular waveguide, 152-153, 159, 176-179 TE,, mode, open-ended waveguide, 156 TE103 mode, open-ended waveguide, 156 TEM mode, open-ended coaxial line, 153, 174-175, 184 Temporal coherence requirement, in image deblurring, 15-19, 39-40, 68 Thermography, application of open-ended waveguide to, 157-158 Thickness, in electron microscopy, 295-302, 323-324 Three-body problem, 194-200 Time-domain reflectometry (TDR), in open-ended waveguide measurements, 149-150
TM mode open-ended coaxial line, 175 open-ended waveguide, 144-145, 165 circular waveguide, 177-178 rectangular waveguide, 179-181 TM01 mode, open-ended circular waveguide, 152, 160, 172-174 Trialkali photocathode, 112-118 Turbulent fluid flow, 203, 206-207 Two-coupler reflectometer, 148
U
Ultraviolet cathode, 119
V
Vacuum photoemitter, 73-137 formation, composition, and spectral response, 106-119 introduction, 73-75 photocathodes physical properties, 83-99 stability of, 75-83 photoemission, enhancement of, 99-106 theoretical attempts and models, 119-133 Van Cittert-Zernike theorem, 39 Velocity, in analysis of particle motion, 246-247 Vignetting effect, in optical image processing, 10-11
W
Water molecule, speculative mode of vibrations in, 259-265 Waveguide, see also Circular waveguide; Open-ended waveguide; Rectangular waveguide infinite sample in, 182-184 Waveguide mode, see also specific modes circular open-ended waveguide, 172-174 flanged waveguide radiating into an infinite homogeneous medium, 163, 165, 168-171 higher-order modes, 161-162 Weber-Fechner law, 320 White-light image processing, 1-72 advances, 43-67, 69 advantages, 2 coherence measurement, 24-36
coherence requirement, 7-24 image deblurring, 15-24 mutual coherence function, 8-14 color image deblurring, 43-46 color image retrieval, 51-55 color image subtraction, 46-50 conclusions, 68-69 fringe patterns, 29-31, 34, 52-53, 56, 58 image sampling and filtering, 39-43 introduction, 1-3 optical image processor, 4-5, 69 processing technique, 3-7
pseudocoloring, 55-68 source encoding, 36-42
X
X-ray, white-light image processing, 56, 58-60, 67-68
Z
Z contrast, in electron microscopy, 273-274, 278, 284 Drosophila melanogaster, 319