ADVANCES IN IMAGING AND ELECTRON PHYSICS VOLUME 109
EDITOR-IN-CHIEF
PETER W. HAWKES CEMESlLahoratoire d'Optique Electronique du Centrz National de la Recherche Sc ientrfrqrte Toulouse. France
ASSOCIATE EDITORS
BENJAMIN KAZAN Xero.1 Corporation Palo Alto ResearcA Center Palo Alto. Califoinia
TOM MULVEY Department of Electronic, Engineering and Applied Physics Aston Uniwrsity Birminghanr, United Kingdom
Advances in
Imaging and Electron Physics EDITEDBY PETER W. HAWKES CEMESlLahor~~itnire (1’ Optique Elec~tr~otiiyue du Centre National de lu Recherche Scientifque Toulousr. F1.unc.e
VOLUME 93
ACADEMIC PRESS San Diego New York Boston London Sydncy Tokyo Toronto
This book is printed on acid-free paper.
@
Copyright 0 1995 by ACADEMIC PRESS, INC All Rights Reserved. No part of this publicarion may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press, Inc. A Division of Harcourt Brace & Company 525 B Street, Suite 1900, San Diego, California 92101-4495 United Kingdon? Edition pihlislzed by Academic Press Limited 24-28 Oval Road, London NW 1 7DX
International Standard Serial Number: 1076-5670 International Standard Book Number: 0- 12-0 14735- I PRINTED IN THE UNITED STATES OF AMERICA 95 96 91 98 99 00 B C 9 8 7 6 5
4
3 2
1
CONTENTS CONTRIBUTORS . . . . . . . . . . . . . . . . . . . . . . . PREFACE. . . . . . . . . . . . . . . . . . . . . . . . . .
vii ix
Group Invariant Fourier Transform Algorithms R . TOLIMIERI. M . AN. Y. ABDELATIF. C. Lu. G . KECHRIOTIS. AND N . ANUPINDI I. I1 . 111. IV. V. VI . VII . VIII . IX .
Introduction . . . . . . . . . . . . Group Theory . . . . . . . . . . . FT of a Finite Abelian Group . . . FFT Algorithms . . . . . . . . . . Examples and Implementations . . . Affine Group RT Algorithms . . . . Implementation Results . . . . . . Affine Group CT FFT . . . . . . . Incorporating ID Symmetries in FFT References . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 3 14 16
21 30 42 46 53 55
Crystal-Aperture STEM JACOBUST. FOURIE I . Introduction . . . . . . . . . . . . . . I1 . Theoretical Considerations and Experimental 111. Experimental Results in Imaging . . . . IV. Summary and Conclusions . . . . . . . References . . . . . . . . . . . . . .
. . . . . . . . 57 . . . . 59 . . . . . . . . . 90 . . . . . . . . . 106 . . . . . . . . 107
Evidence
Phase Retrieval Using the Properties of Entire Functions N . NAKAJIMA I . Introduction . . . . . . . . . . . . . . . . . . . . . . I1 . Theoretical Background . . . . . . . . . . . . . . . . . I11. Extension to Two-Dimensional Phase Retrieval . . . . . . . V
109 112 131
vi
CONTENTS
IV. Application to Related Problems . . . . . . . . . . . V. Conclusions . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . .
. . 139 167 168
Multislice Approach to Lens Analysis GIULIOPozzr I . Introduction . . . . . . . . . . . . . . . . . . . . . . I1 . Standard Multislice and BPM Equations and First Applications . . . . . . . . . . . . . . . . . . . . . . 111. Application of the Multislice Equations to Round Symmetric Electron Lenses . . . . . . . . . . . . . . . . . . . . 1V. Improved BPM Equations and Application to Gradient Index Lenses . . . . . . . . . . . . . . . . . . . . . V. Beyond the Paraxial Approximation . . . . . . . . . . . . VI . Conclusions . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . .
202 207 215 216
Orientation Analysis and Its Applications in Image Analysis N . KEITH TOVEY.MARKW. HOUNSLOW. A N D JIANMINWANG Introduction . . . . . . . . . . . . . . . . . . . . . . Definition of the Task . . . . . . . . . . . . . . . . . . Image Acquisition . . . . . . . . . . . . . . . . . . . Image Processing and Analysis of Orientation . . . . . . . . Generalized Intensity Gradient Operators . . . . . . . . . . Enhanced Orientation Analysis-Domain Segmentation . . . . Applications of Orientation Analysis . . . . . . . . . . . . Implementation and Automation of Orientation Analysis . . . . Concluding Remarks . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . .
220 224 228 231 246 287 300 319 323 326
I. I1. I11 . 1V. V. V1. VII . VIII . IX .
INDEX . . . . . . . . . . . . . . . . . . . . . . . . . . .
173 176 186
331
CONTRIBUTORS Numbers in parentheses indicate the pages on which the authors’ contributions begin.
Y.ARDELATIF ( I ) , AWARE Inc., Cambridge, Massachusetts 02142 M. AN ( I ) , AWARE Inc., Cambridge, Massachusetts 02142 N. ANUPINDI ( I ) , AWARE Inc., Cambridge, Massachusetts 02142 JACOBUS T. FOURIE(57), CSIR Division of Materials Science and Technology, Pretoria 0001, South Africa MARKW. HOUNSLOW(219), School of Environmental Sciences, University of East Anglia, Norwich NR4 7TJ, United Kingdom
G. KECHRIOTIS( I ) , AWARE Inc., Cambridge, Massachusetts 02142 C. Lu ( I ) , AWARE Inc., Cambridge, Massachusetts 02142 N. NAKAJIMA(109), College of Engineering, Shizuoka University, Hamamatsu 432, Japan
GIULIOPozzr (173), Department of Physics, University of Bologna, 40126 Bologna, Italy
R. TOLIMIERI ( 1 ), AWARE Inc., Cambridge, Massachusetts 02 142 N. KEITH TOVEY(219), School of Environmental Sciences, University of East Anglia, Norwich NR4 7TJ, United Kingdom JIANMIN WANG(2 l9), School of Environmental Sciences, University of East Anglia, Norwich NR4 7TJ, United Kingdom
This Page Intentionally Left Blank
PREFACE
The five chapters that make up this volume cover advanced topics in crystallographic computing, image restoration and analysis, particle optics, and a revolutionary new idea concerning the scanning transmission imaging mode. The volume opens with a chapter by a group of authors, most of whom are no strangers to these Advances. M. An and colleagues have already contributed a survey on discrete FFT algorithms; here they present in detail their work on group-invariant Fourier transform algorithms, which are of vital interest in crystallography. By linking the crystal symmetry to the algorithm itself, higher dimensional Fourier transforms can be performed very efficiently. The authors set out the underlying mathematics and its practical implementation fully and this account will no doubt be helpful for many users of these techniques. The second chapter is by J . T. Fourie, who has been publishing articles in the electron microscopy literature over the past few years on a revolutionary way of attaining high resolution information. He has not, however, previously prepared a long connected account of these ideas and the associated experimental tests; here, he has brought together both the theoretical background and the related experiments, which will excite widespread interest in his approach. N. Nakajima, author of the third contribution, has been working on phase retrieval for several years and has prepared a detailed account of this research; the theory is recapitulated with care and a variety of types of application are then examined. Despite the immense amount of thought that has been devoted to this problem, difficulties still remain, as Nakajima points out. Complementary contributions on this theme are planned for future volumes, notably from the school of the late Richard Bates, who contributed to these Advances in 1986. Electron lens properties have been very thoroughly studied for more than half a century, essentially by calculating trajectories through the lens fields and then evaluating the various cardinal elements and aberration coefficients. In the fourth chapter, G. Pozzi demonstrates that this is not the only way of analyzing lenses. For many years, it has been usual to calculate the propagation of electron waves through specimens by picturing the latter cut into very thin slices and then propagating the wave through the potential in each slice. Pozzi applies this idea to the calculation of lens properties. A full account of this new approach is presented, including aberrations, which had not previously been fully covered in this way. The volume ends with a magisterial account of orientation analysis and the associated image processing methods by N . K . Tovey, M. W. Houslow, and IX
X
PREFACE
J. Wang. The whole subject is reviewed: first restoration, enhancement, and edgedetection, then simple and more advanced applications in the specific domain of orientation analysis (for mineralogical samples in particular but of course the techniques are of much wider applicability). This is virtually a short monograph on the subject and will be heavily used in the specialist area in question. I am most grateful to all the authors for the trouble that they have taken, not only in preparing these surveys but also in ensuring that they are accessible to readers who are not specialists in the same subject area. 1 thank them all most sincerely and conclude as usual with a list of forthcoming articles. The volume numbers of those already in press are indicated. Peter W. Hawkes
FORTHCOMING ARTICLES Nanofabrication Use of the hypermatrix Image processing with signal-dependent noise The Wigner distribution Parallel detection Discontinuities and image restoration
Hexagon-based image processing Microscopic imaging with mass-selected secondary ions Modern map methods for particle optics Nanoemission Magnetic reconnection Cadmium selenide field-effect transistors and display
ODE methods Electron microscopy in mineralogy and geology The artificial visual system concept Projection methods for image processing Space-time algebra and electron physics The study of dynamic phenomena in solids using field emission Gabor filters and texture analysis
H. Ahmed D. Antzoulatos H. H. Arsenault M. J. Bastiaans P. E. Batson L. Bedini, E. Salemo, and A. Tonazzini S. B. M. Bell M. T. Bernius M. Berz and colleagues Vu Thien Binh A. Bratenahl and P. J. Baum T. P. Brody, A. van Calster, and J. E Farrell J. C. Butcher P. E. Champness J. M. Coggins P. L. Combettes C. Doran and colleagues M. Drechsler J. M. H. Du Buf
PREFACE
Group algebra in image processing Miniaturization in electron optics The critical-voltage effect Amorphous semiconductors Stack filtering Median filters RF tubes in space Mirror electron microscopy Relativistic microwave electronics Rough sets The quantum flux parametron The de Broglie-Bohm theory Contrast transfer and crystal images Morphological scale space operations Algebraic approach to the quantum theory of electron optics Signal representation Electron holography in conventional and scanning transmission electron microscopy Quantum neurocomputing Surface relief
Spin-polarized SEM Sideband imaging Ernst Ruska, a memoir Regularization Near-field optical imaging Vector transformation Seismic and electrical tomographic imaging SEM image processing Electronic tools i n parapsychology
xi D. Eberly (vol. 94) A. Feinerman A. Fox W. Fuhs M. Gabbouj N. C. Gallagher and E. Coyle A. S. Gilmour R. Godehardt (vol. 94) V. L. Granatstein J. W. GrzymalaBusse (vol. 94) W. Hioe and M. Hosoya F! Holland K. Ishizuka I? Jackway R. Jagannathan and S. Khan W. de Jonge and P. Scheuermann E Kahl and H. Rose (vol. 94) S. Kak (vol. 94) J. J. Koenderink and A. J. van Doom K. Koike W. Kmkow L. Lambert and T. Mulvey A . Lannes A. Lewis W. Li McCann and colleagues N. C. MacDonald R. L. Morris
xii
PREFACE
Image formation in STEM
C. Mory and
The Growth of Electron Microscopy
T. Mulvey (ed.) (vol. 95)
The Gaussian wavelet transform
R. Navarro, A. Taberno and G. Cristobal G. Nemes T. Oikawa and N. Mori S. J . Pennycook G. A. Peterson H. Rauch H. G. Rudenberg D. Saldin G. Schmahl J. I? E Sellschop J. Serra M. I. Sezan H. C. Shen T. Soma J. Toulouse J. K. Tsotsos Y. Uchikawa D. van Dyck L. Vincent L. Vriens, T. G. Spanjer, and R. Raue A. Zayezdny and I. Druckmann (vol. 94) A. Zeilinger, E. Rasel, and H. Weinfurter
C. Colliex
Phase-space treatment of photon beams Image plate Z-contrast in materials science Electron scattering and nuclear structure The wave-particle dualism Scientific work of Reinhold Rudenberg Electron holography X-ray microscopy Accelerator mass spectroscopy Applications of mathematical morphology Set-theoretic methods in image processing Texture analysis Focus-deflection systems and their applications New developments in ferroelectrics Knowledge-based vision Electron gun optics Very high resolution electron microscopy Morphology on graphs Cathode-ray tube projection TV systems
Signal description
The Aharonov-Casher effect
ADVANCES I N IMAGING AND ELECTRON PHYSICS. VOL . 93
Group Invariant Fourier Transform Algorithms' R . TOLIMIERI. M . AN. Y . ABDELATIF. C . LU. G . KECHRIOTIS. and N . ANUPINDI. A WARE Inc., One Memorial Drive. Cambridge. Massachusetts
. . . . . I . Introduction . . . . 11. GroupTheory . . . . . . . . . . A . Finite Abelian Group . . . . . . . B . Character Group . . . . . . . . C . Point Group . . . . . . . . . . D . Affine Group . . . . . . . . . E . Examples . . . . . . . . . . . 111. FT of a Finite Abelian Group . . . . . . A . Periodization-Decimation . . . . . 1V . FFTAlgorithms . . . . . . . . . . A . Introduction . . . . . . . . . . B . RT Algorithm . . . . . . . . . C . CT FFT Algorithm . . . . . . . . D . Good-Thomas Algorithm . . . . . . V . Examples and Implementations . . . . . A . RT Algorithm . . . . . . . . . B . CT FFT Algorithm . . . . . . . V1 . Affine Group RT Algorithms . . . . . . A . Introduction . . . . . . . . . . B . Point Group RT Algorithm . . . . . C . AffineGroup RT Algorithm . . . . . D . X'.l nvariant RT Algorithm . . . . . VII . Implementation Results . . . . . . . A . Complexity . . . . . . . . . . VIII . Affine Group CT FFT . . . . . . . . A . Extended CT FFT: Abelian Point Group . B . CT FFT with Respect to Pmmm . . . . C . Extended CT FFT: Abelian Affine Group D . C T FFT with Respect to Fmmm . . . . IX . Incorporating ID Symmetries in FFT . . . References . . . . . . . . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . . . . . . . . . . . . . . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 3 3 6 9 10 11 14 15 16 16 16 17 19 21 21 21
30 30 31 39 41 42 45 46 41 48 49 52 53 55
'
This research was supported by the Advanced Research Projects Agency of the Department of Defense and was monitored by the Air Force Office of Scientific Research under contract number F49620.91.0098 . The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied. of the Advanced Research Projects Agency or the U.S. Government . 1
Copyright 0 1995 by Academic Press. Inc . All rights of reproduction in any form reserved .
2
R . TOLIMIERI et al.
I . INTRODUCTION The design of algorithms for computing the crystallographic Fourier transform is a subject in applied group theory. In previous works (An et al., 1991; Tolimieri el al., 1993) we exploited several elementary results in finite abelian group theory and developed the basic abstract constructs underlying the class of divide and conquer algorithms for computing the multidimensional (MD) discrete Fourier transform (DFT). This setting provides a convenient landscape for introducing a class of divide and conquer crystallographic algorithms. In An et al. (1991), we outlined a systematic approach for classifying three-dimensional (3D) crystallographic groups. Applications to 3D crystallography require a detailed understanding of this classification. Similar classifications exist to some extent in higher dimensions and are equally important for applications to quasicrystallography. The theory developed in this work will operate within the abstract formulation presented in An et al. (1991), Tolimieri et af. (1993). Finite abelian groups will serve as data indexing sets. A class of affine group fast Fourier transform (FFT) algorithms will be introduced which fully use data invariance with respect to subgroups of the affine group of data indexing sets. The affine subgroup need not come from a crystallographic group. This approach removes dimension, transform size, and crystallographic group from algorithm design and serves to bring out fundamental algorithmic procedures rather than produce an explicit algorithm. These procedures provide tools for writing code which scales over dimension, transform size, and crystallographic group and which can be targeted to various architectures. In fact these methods apply to all 230 3D crystallographic groups and to composite transform sizes. We will show the power of these tools by way of an extensive list of implementation examples. We distinguish three algorithm strategies. The first is based on the well-known Good-Thomas (GT) or prime factor algorithm which breaks up an FT computation into a sequence of smaller size DFT computations determined by the relatively prime factors of the initial transform sizes. In An et al. (1991) we developed an abstract formulation of the GT and applied it as a tool for crystallographic algorithms. Our treatment here will be brief and mostly contained in examples. Reduced transform (RT) algorithms were considered in detail in An el al. (1991), Tolimieri et al. (1993). A simple generalization of the RT approach based on collections of subgroups will be presented, which provides a universal framework for affine group Fourier transform (FT) algorithms. In applications to 3D crystallography this class of algorithms replaces the problem of computing the FT of 3D group invariant data by that of computing in parallel the FT of a collection of 1D or 2D group-invariant
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
3
data sets. The latter problem is substantially simpler and several efficient implementations are widely practiced. A third approach, based on a generalization of Cooley-Tukey fast FT (CT FFT), will be discussed which performs generalized periodizations (Tolimieri et al., 1993) with respect to affine subgroups. This method applies to abelian affine subgroup invariant data and hence to about 100 of the 230 3D crystallographic groups. A C T FFT algorithm associated to an abelian subgroup X of the affine group provides code for Y invariant data with respect to every subgroup Y of X . In applications, we choose X such that the associated CT FFT is easy to code and efficient and such that X contains a large collection of subgroups Y of interest. X itself need not be a crystallographic group. An example will be provided which shows how one code applies to 71 of the crystallographic groups. This work is organized as follows: In Section 11, we will review all the necessary group theory. Finite abelian group theory will be briefly considered as it is covered in many elementary texts. We reference Tolimieri et a / . (1993) as it contains all the necessary results. The affine group of a finite abelian group will be defined. Constructs related to the action of affine subgroups on data indexing sets will be introduced. In Section 111 we define the Fourier transform of an abelian group and study its fundamental role in interchanging periodization and decimation operations (duality). The RT, CT, FFT, and GT algorithms are presented in Section IV as applications of this duality to different global decomposition strategies. Affine group FFT algorithms based on the RT algorithm are discussed in Section VI, while those coming from the application of the affine group CT FFT are introduced in Section VIII. In Section IX, we briefly sketch a method of incorporating 1D symmetry into FFT computations, which calls on lower order existing FFT routines using the symmetry condition. Throughout this work, we will provide many examples. These examples have been chosen to reflect both the theory and our experience and others over several years in writing code for the 3D crystallographic FT. 11. GROUPTHEORY
A . Finite Abelian Group Denote by Z / N the group of integers modulo N consisting of the set (0, 1 ,
..., N
-
11,
with addition taken modulo N. Z / N is a cyclic group of order N and every cyclic group of order N is isomorphic to Z / N . For example, the
4
R. TOLlMlERI et
a/.
multiplicative group UN o f complex N t h roots of unity ( 1 , w,
WN- 1 - a * ,
= eZri/N
I,
9
is a cyclic group of order N a n d the mapping 0: Z/N
--t
UN,
defined by o ( n ) = wn, 0 In < N , is a group isomorphism from Z / N onto U,,. The direct product of two finite abelian groups A , xA 2
is the set of all pairs ( a l ,a,), a, E A , , a, E A , with componentwise addition. By the fundamental theorem of finite abelian groups, every finite abelian group A is isomorphic to a direct product of cyclic groups, A = Z/N, x
*.*
x Z/NR.
(1)
We call Eq. ( 1 ) a presentation of A . A finite abelian group can have several presentations which vary as to the number of cyclic group factors as well as the orders of the cyclic groups. For example, 2 / 3 0 = Z / 2 x Z / 1 5 = Z / 3 x Z/10 = Z / 5 x Z / 6 = Z/2 x Z / 3 x Z / 5 . In general, we have Theorem 11.1. The direct product of cyclic groups having relatively prime orders is a cyclic group. Theorem 11.1 is a special case of the Chinese remainder theorem (CRT). Theorem 11.2 (Chinese Remainder Theorem). Let N = N , N , . NR be a factorization of N into pairwise relatively prime integers. Then there exist uniquely determined integers 0
Ie,,e,,
..., eR < N
satisfying e, = 1 mod N,, e, = OmodN,,
1 5 r, s IR , r # s.
The set (el, e,, ..., e R )is called the complete system of idempotents for the factorization N = N , N , .. . NR.
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
5
Let (el, e,, . .., eR1 be the complete system of idempotents for the factorization N = N, N, ...N R . By CRT,
= e, mod N,
e;
(2)
eres = OmodN,
1
I
r, s
IR ,
r#s
(3)
R
e, = 1 modN.
(4)
r= 1
It follows that every n n
= nlel
E
Z / N has a unique expansion of the form
+ n,e, + .--+ n R e R m o d N ,
n,EZ/N,.
In fact, n,
= nmodN,,
1 Ir
R.
I
CRT shows that the mapping X: Z / N
+
Z/N1
X
Z/N, x
X
Z/N,
defined by x(n)
=
( n l , n,,
n, = n mod N,,
n,),
Ir IR
(5)
+ n2e2 + ... + n R e R m o d N .
(6)
1
is an isomorphism having inverse ~ - ~ ( n , , n , , . - - n ,=) rile,
CRT is the basis for many theoretic and applied results in algorithm design. It is a major tool for interchanging between 1D and MD arrays which is the core of the GT algorithm. The use of idempotents in describing this interchange is most important in implementation (Tolimieri et al., 1993). CRT can be used to derive the primary factorization of a finite abelian group. Suppose A is a finite abelian group of order N, and we write N
where P I ,P,, e.g.,
= ppIp,"2
.. . P G M ,
. . .,PM are distinct A = Z/N,
X
X
a, 2 1,
(7)
primes. Choose any presentation of A , Z/NR,
N
=
Ni... N R
and write N, = PY~(') P;~"',
a,(r)
2
0, 1
Im
Then Z/N,
=
Z/PPI"' x
*
- x Z/PGM'",
I M.
(8)
6
R . TOLIMIERI ef al.
and we have, by rearranging factor, the primary factorization of A , where The primary factorization of A is unique as the factors A,,, can be described as the set of all elements in A having order which is a power of the prime P,,, . B. Character Group Consider a finite abelian group A of order N . The character group A * of A is the set of all group homomorphisms
a*:A
+
Or,
which group addition defined by
(a* + b*)(a) = a*(a)b*(a),
a*, b*
E
A*, a
E
A.
(10)
The character group A* is the natural indexing set for FT as we can view A as the time parameter space and A* as the frequency parameter space. We will usually write a*(a) as ( a , a*>. The mapping 4: Z / N ( Z / N ) * defined by 2?ri(mn/ N ) , Osn,m
= e -+
establishes an isomorphism
Z/N = (Z/N)*.
More, generally, the mapping
4: Z/Nl x ... X Z/NR
+
(Z/N,X
X
Z/NR)*
defined by
( ( m l , * * * , m R ) , 4 ( n. l. -, , n R ) > = e
27ri(m,n,”,)
.. . e 2 n ; ( ~ R n n ” R )
(1 1)
establishes an isomorphism
Z/N, X
a * *
X
Z/NR = (Z/Nl X ... X Z/NR)*.
By the fundamental theorem, every finite abelian group A is isomorphic to its character group A * .
7
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
1. Duality
Fix an isomorphism $ from A onto A * . The dual B' of a subgroup B of A is defined by B' = ( a E A : ( 6 , $(a)) = 1, for all b E Bj. (12) Since $ is an isomorphism, $(B')
=
( $ ( b ' ) :b'
E
B')
is the subgroup of all characters of A that act trivially on B. Consider the quotient group A/B of B-cosets
+ B = {a + b :b E B]
a with abelian group addition
(a + B)
+ (a' + B) = ( a + a ' ) + B.
The isomorphism $ induces isomorphisms B'
-+
(A/B)*,
I&: A/BL
-+
B*,
by the formulas
(a
+ B$,(b'))
=
(a,$(b')),
+ B'))
=
(6, $(a)>,
( b , &(a
a E A , b' a
E
A, b
E
E
B',
(13)
B.
(14)
The characterization of $(B') by Eqs. (13) and (14) implies both induced isomorphisms are well defined, i.e., independent of coset representation. The induced isomorphisms 4, and $* play fundamentral roles in the description of divide and conquer FT algorithms. 2. The Vector Space L ( X )
Denote the space of all complex valued functions on a finite set X by L ( X ) . L ( X ) is a vector space over C with addition and scalar multiplication defined by
(f + g)(x) = f ( 4+ g(x), (af)(x)= 4 f ( X ) ) ,
Q!
f,g E U X ) ,x E X, E c,f E U X ) ,x E x.
Consider a finite abelian group A and a subgroup B of A . For f define
PerLtf(4
=
c f(a + b)
beB
E
L(A) (15)
8
R . TOLIMIERI et al.
and
The periodization operator Per, and the decimation operator Dec, are fundamental operators on L ( A ) . Suppose A has order N.L ( A ) has dimension N.The evaluation basis of L(A) is the collection of functions
(e, : a E A ) defined by
We will denote the evaluation basis by A . The character basis of L(A) is the collection A* of characters of A . Relative to the inner product on L ( A ) defined by (f9g)
=
c f(a)g(a),
f,g
E
UA),
(18)
O€A
where s(a)denotes the complex conjugate of g(a), the evaluation basis is an orthonormal basis of L ( A ) . Since for a*, b* E A * ,
N,
(a*,b*) =
0,
a* = b*, a* # b*,
the set 1
-A*
JN
is an orthonormal basis of L ( A ) . 3 . Canonical Isomorphism
The evaluation basis A and the character basis A* are canonical in the sense that they depend solely on group structures and not on presentation. Although the groups A and A* are isomorphic, there is no canonical isomorphism. Duality is defined relative to a particular choice of isomorphism from A onto A * . By extension, the groups A and A * * , the dual of A * , are also isomorphic, and in fact a canonical isomorphism can be defined. The canonical isomorphism, as we will see in Section 111, defines the FT of A .
9
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
For a E A , the mapping @(a)of A* a*
@(a)(a*)= ( a , a*>,
(19)
A*,
E
is a character of A * . The mapping @ : A + A**
(20)
is a canonical isomorphism, since it is defined without reference to presentation. Consider the evaluation basis A of L ( A ) and the character basis A** of L(A*). The canonical isomorphism 0 of A onto A** defines a linear isomorphism L(@)from L ( A ) onto L(A*). C. Point Group
Denote the automorphism group of a finite abelian group A by Aut(A). Subgroups of Aut(A) are called point groups. For a point group H a n d a point a E A , the isotopy subgroup Ha of a in H is defined by Ha = ( a E H : a(a) = a). (21)
H, is a subgroup of H . A point a E A is called a fixed point of H if H = H,. The H-orbit of a, denoted by H(a), is defined by H(u)
( a ( ~: a)
=
H).
E
The mapping a
+
a(a):H
-+
A
induces a bijection from the space of right cosets aH,, a E H , onto H(a). Fix a group isomorphism 9: A + A * . For a E Aut(A),define the adjoint a+ E Aut(A) by (a, d(a+(c))>=
Set a'
=
(44,$(c)>,
a, c
E
A.
(24)
( a + ) - ' ,and observe that
(cup)# = a'p',
( a - y = (a')-'.
For a point group H, define
H'
=
(a' : a
E
H).
The H-orbit H(B) of a subgroup B of A is the collection of subgroups
H ( B ) = (a@): a
E
H).
(25)
10
R. TOLIMIERI et al.
Under duality
H # ( B * ) = (H(B))'. A collection G3 of subgroups of A is called H-invariant if
h E H, B
h(B) E 63,
E
63.
if G3 is H-invariant, the action of H partitions 63 into disjoint H-orbits. Define a complete system of H-orbit representatives in 63 as any collection of subgroups in G3
B , , ..-,BR such that 63 is the disjoint union of the collection of H-orbits
H(B,),
9
H(BR).
A covering of A is a collection of subgroups 63 of A such that
A = U B . B€63
Set
63'
=
(B' : B
E
631.
We say that G3 is a dual covering of A if 63' is a covering of A . We can always construct an H-invariant covering 63 of A .
D. Affine group The affine group of A ,
Aff(A) = A QAut(A), is the set of all (a, a), a
EA
,a
E
Aut(A), with group composition
(a, a)(a',a ' ) = (a + a(a'),aa'). A f f ( A ) acts on A by ( a ,a)(c)= a
+ a(c),
a, c
E
A , a E Aut(A).
(29)
For x E A f f ( A ) , we write x = (a,, a,), a, E A , a, E Aut(A). We define two actions of Af f ( A ) on L ( A ) .For f E L ( A ) and x E Af f ( A ) , define xfW
= f(x(a)),
a EA,
x#f(a)= ( a , , $
=f
and is x#-invariant if x#f = f .
(30) (31)
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
11
Choose a subgroup X of Aff(A). An f E L ( A ) is X-invariant if f is x-invariant for all x E X , and is X#-invariant i f f is x#-invariant for all x E X. The point group X of X is defined by
x = (ax:xEX). k is a subgroup of Aut(A), but in general
(32)
is not contained in X .
E. Examples
Example ZI.2. (P6,). Crystallographic group P 6 , (Henry and Lonsdale, 1952) is generated by
x
=
(O,O,
M2r
a)
acting on Z/3N x Z/3N x Z/6M for natural numbers N and M , x(a, , a,, a3) = (a,
-
a 2 ,a3 + M ) .
Throughout the rest of this example, we will set A
For (a,, a , , a31
E
=
2/12 x 2/12 x 2/12 x 12.
A,
x ( a l , a z , a 3 ) = (a, - a Z , a 1 , a 3+ 2),
x 2( a 1 9 a 2 , a 3 )
= (-a,,a,
-
a 2 , a 3+ 4),
2
x ( a , ,a,, a3) = ( - a , , - a 2 , a3 + 61, x4@, a,, a,) 5
x (a,, a,, a,) x 6(a,,a,,a,)
=
(a2 - a , , - a , , a4 + 81,
= (a,, a, - a , , a,
+
101,
= (a,,a,,a,).
P 6 , acting on A decomposes A into distinct P6,-orbits each of order 6. P6, is also a crystallographic group denoted by P 6 (Henry and Lonsdale, 1952). P 6 is generated by a.
a @ , ,a,, a,)
= (a, -
a2,a,).
P6-orbits also dsecompose A into distinct orbits. A P6-orbit may have one, two, three, or six elements. P6(0, 0, a3) = ((0,0, a,)],
0
Ia3 I11,
and (0, 0, u3) are fixed points of P6. P6(4,8, a,)
=
((498,
(8,4, 0 3 ) ) -
0 5 a3 5 11.
12
R . TOLIMIERI et al.
The isotropy subgroup of (4,8, a,) is generated by a2. P6(6,6, a31 = ((6, 6,
0 5 a3 5 1 1 .
(016, 0311 (630, a,)],
The isotropy subgroup of (6,6, a,) is generated by a,. The nontrivial isotropy subgroups, (1, a2,a4)and ( 1 , a,], where 1 denotes the identity automorphism, are again crystallographic groups denoted by P3 and P2 (Henry and Lonsdale, 1952), respectively. With respect to 4 defined in Eq. (1 l), (a-'(al =
3
02
a319
9
d(b1, b2
I
- 2ni exp[- 12 b2bI +
= ((01 9
4(-&
3
b3))((02
(a2 -
1
bl
= ((a1 7 a2 9 a319 4(a#(b1 1
and a # ( b l b, 2 ,b,)
= (-
I
02
-
02
9
a3),
aJb2 + a,b,l
$(b1 b2 b3)) 3
1
1
+ 6 2 b3))
b2
9
9
b3)))9
b 2 ,b1 + b 2 ,b,).
Example 11.2 (P6/mmm). to the abstract group
Crystallographic group P6/mmm is isomorphic 2 / 6 4 Z/2 x 2/2.
We will describe the group by listing the three generators: a, P(UI,Q2103) Y(Q1 9 0 2 9
=
(a29
a , , -a,),
a31 = ( a ] ,0 2 , -a,),
P# = B, Y# =
y.
This is a nonabelian group, and we have the following commuting relations:
pa
= a-lp,
ya = ay,
yp
=
py.
Set A = 2/12 x 2/12 x 2/12. We will consider isotropy subgroups of elements.
13
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
For a # 0 , 6 ,
where P3ml is a crystallographic point group. Example 11.3 (Pmmm). Let A = Z / 2 N x Z/2M x Z/2L, for natural numbers N , M , and L . Pmmm c Aut(A) is generated by p l ( a , ,a 2 ,u3) = a2 3 a 3 ) , P2(a1 3 u 3 ) = (a1 -a2 a3), P3(a1 0 2 0 3 ) = (a1 a2 -a3). Each of the generators is of order 2 and Pmmm has eight elements. With respect to the isomorphism defined in Eq. (1 l ) , u i = 1,2, 3 . pi = p i , The subgroup 9
9
9
9
3
B = ( ( b , N ,b2M,b 3 L ) :bj = 0 , 1 , i
9
9
1, 2 , 3 )
=
9
(33)
is the group of fixed points of Pmmm. Let B , = ( ( b , ,bZM, b3L) 0 I 61
Pmmm,,
=
I2N
-
1, b2 = 0, 1, b3 = 0 , 1 ) .
(1, P2, P3, P2P3l-
B2 = ( ( b , , n ,b 2 ,b 3 L ) : 05 b2 I2 M - 1, b ,
PmmmB,
=
( l , PI
9
P3
9
=
0 , 1 , b3 = 0, 1 1 .
PIP3).
Example 11.4 (Fmmm). Set a = Z / 2 N x Z/2M x Z / 2 L , for natural numbers, N , M , and L . The crystallographic affine group Fmmm < A f f ( A ) is B x Pmmm, where B < A is the fixed subgroup of Pmmm given in Eq. ( 3 3 ) . Each of the generators is of order 2 and Fmmm has 64 elements. An element of Fmmm is of the form (b,~ [ ' P ~ P ~b)E,B,
rk =
0, 1, k
=
1,2,3.
We will denote the elements of Fmmm by an ordered 6-tuple of 1's and 0's by listing the values of bj and rk in order, i.e., (b1,N , b2M, b3L, P';'P;~P;') (bi b2 b3 + +
9
9
9
rl,
r2 r3)9
14
R. TOLIMIERI er 01.
In this notation, the group composition in Fmmm is given by componentwise addition modulo 2 in each of the six components. We will also index the elements of Fmmm from 0 to 63 by the binary expansion of the 6-tuple, (b,,b 2 ,b 3 ,rl , r,, r3)
+ 2f2 + 4t3 + 8rl + 16r2 + 32r3.
tl
++
In this notation =
sl
9
s29 s3, s4
s5
3
9
s6, S T ) .
There are no fixed points of Fmmm.
Fmmm,
=
Pmmm.
Fmmm
=
Pmmm
111. FT
lSO
=
OF A
9
s8
9
s l 6 9 s24 3 s 3 2 9 s40
9
s48 I s56).
FINITEABELIAN GROUP
View A as a basis of L ( A ) and A** as a basis of L(A*). In Section 11, we defined the canonical isomorphism @ : A + A**
by a
@@)(a*)= (a, a*>,
E
A , a*
E
A*.
The Fourier transform FA of A is the unique linear extension
FA : L ( A ) + L ( A * )
(34)
of 0. It follows that FA is a linear isomorphism given by
FA f(a*) =
c f(a)(a, a*>,
f E LV), a*
E
A*,
(35)
aeA
with inverse given by
The coefficients off over the character basis are given by (1 / N )FAf(- a*), a* E A * . For an isomorphism 4: A 4 A * , define the FTpresentation
F4:L ( A )
-+
L(A)
(37)
by
(F+f)(a)= (FAf)($(a)),
f E L(A), a
A.
(38)
15
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
FT presentations associated to different isomorphisms differ by input permutations. The choice of 4 can be an important parameter in algorithm design especially in crystallographic applications where 4 can be matched to crystallographic symmetry to simplify coding. Throughout this chapter we A*. fix an isomorphism 4: A For a subgroup B of A , define the induced Fourier transforms -+
F,, : L ( A / B ) FG2:L ( B )
+
+
L(B*),
(39)
L(A/B'),
(40)
by the formulas
(F,,f)(b')
=
(h,,f)(4,@L)), f
(F+,f)(a+ B')
=
(FBf)(&(a f EL)),
E L(A/B),
f
bL
E
B',
E L(B),a E A ,
(41) (42)
where 4, and dZ are defined in Eqs. (13) and (14). F,, and F,, are linear isomorphisms. We will write Ff, for F,, and Ff, for F,, when we want to bring out the dependence on the subgroup B. A . Periodization-Decimation
Divide and conquer algorithms for computing the action of F, decompose the computation into a collection of induced FT computations. In this Section we will see how the FT interchanges the fundamental operations of periodization and decimation. For a subgroup B of A and f E L ( A ) ,Per, f E L(AS)is B-periodic and we can view PerBf E L ( A / B ) . Theorem 111.1. For f E L ( A ) , F,(PerB F ) vanishes off of B' and
=
6'
F,,(PerB f ) ( b ' ) ,
E
EL.
Theorem 111.1 implies we compute F, f on the subgroup B' by computing the induced FT F,,(Per, f ) . For f E L ( A ) , we can view Dec, f E L ( B ) . Theorem 111.2. For f
E
L ( A ) , F,(Dec, f ) is EL-periodic and
PerBl (F+f)(a) = o(BL)F,(DecBf )(a) = o(B')F,,(Dec, f ) ( a
+ B'),
a EA.
Theorem 111.2 computes the periodization of F, f relative to B' computing the induced FT F,,(Dec, f ) .
by
16
R. TOLIMIERI et al.
IV. FFT ALGORITHMS A . Introduction Algorithms are distinguished by their strategies for decomposing the global computation. Cooley-Tukey fast FT (CT FFT) algorithms partition the computations into FT of periodizations or decimations relative to the cosets of some subgroup B of A . Recently formulated reduced transform (RT) algorithms decompose the computation into FT of periodizations or decimations relative to a collection of subgroups covering A . Details including implementation stages on RISC and massively parallel multiprocessors can be found in Kechriotis et al. (1993) with performance results. In this chapter, we will briefly outline the structure of the RT, CT FFT, and GT algorithms. Detailed derivations of these algorithms can be found in An et al. (1990) and Tolimieri et al. (1993). B. R T Algorithm RT algorithms decompose the computation of FT into a collection of induced FT taken over the subgroups of a covering or dual covering of the indexing set. One form of the RT algorithm begins with a dual covering 63 of A and computes F+f by forming the collection of periodizations Per, f
E
L(A/B),
B
E (33;
computing the collection of induced FT Ff,,(Per,f),
BE
a.
This completes the computation since Ff,(Per, f)equals F+f on B' and 63 is a dual covering of A . A dual form RT algorithm begins with a covering (33 of A . For each a E A define the integer valued function p on A by p(a) = the number of subgroups in 63 containing a.
Define the weighted decimations off by
t 0,
otherwise.
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
Since
(I3
17
covers A ,
and we can compute F4f by 0
forming the collection of decimations BE
Deci f E L ( B ) , 0
(I3;
computing the collection of induced FT Ft(Decif ) ,
B E 63.
Redundant computation is a necessary part of RT algorithms. An analysis of the advantages and disadvantages of RT algorithms can be found in Tolimieri et al. (1993). Typically these algorithms are targeted to large size MD DFT computations on shared memory multiprocessors but have been implemented on distributed memory multiprocessors with significant speedup as compared to standard CT FFT implementations. The RT algorithm on some machines can be bottlenecked by the I/O bandwidth required in the initial stage periodizations but offers complete parallelization (subject to the number of processors and granularity) afterwards and can be easily scaled to transform size and machine configuration. This should be compared with standard approaches which interleave communication and computation by global transpositions. In applications, say, to the M-dimensional FT, the collection 63 is usually taken such that duals are a covering set of K-dimensional (K < M ) planes through the origin. The dimension K is an important design parameter as it affects local granularity and global parallelism.
C. CT FFT Algorithm Choose a subgroup B < A . One form of the CT FFT begins by subjecting data to generalized periodizations relative to B . This step can be implemented by a collection of Fourier transform computations. However we choose to express this step as a collection of generalized periodizations to bring out the analogy with the RT algorithm and to clearly distinguish stages requiring full data access from stages acting, in parallel, on localized data.
18
R . TOLIMIERI et al.
Choose a subgroup B of A . For f E L(A) and b* E B*, define fb* by
&(a)
=
c f ( a + b)(b,b*),
E
L(A)
a E A.
(45)
beB
We call fb* a generalized periodization since
fb*(a+ 6 ) = ( b , b*)fb*((a), Theorem IV.l.
a EA, b
E
B.
(46)
For f E L(A),
It follows that we can compute F4f by computing the collection of FT F+fb*, b* E B*. Consider the group isomorphism &:A/B' --* B*. Choose a complete system of B'-coset representatives z(b*) E $;'(b*),
Theorem IV.2.
b* E B*.
(47)
Fdfb* vanishes off of the B'-coset, z(b*) + B', and
1 F+f(Z(b*)+ b * ) = -F6 fb*(z(b*)+ b'),
0)
6'
E
B'.
F+fb* determines F+f on the B'-coset z(b*) + B', b* E B*. Since the B'-cosets form a disjoint partition of A , the computations F+fbs,
b* E B*
can be implemented in parallel and the second sum in Theorem I V . l requires no computation. Once the generalized periodizations are computed, the computation can be completed in parallel by induced FT computations which output F+f on B'-cosets. This is accomplished by first performing a twiddle factor multiplication of generalized periodizations defined as follows: For b* E B*, define g,, E L ( A ) by
19
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
Theorem IV.3. F+fb*(Z(b*)+ 6')
o(B)F+lgb*(bL), 6'
1
E
B'.
The CT FFT algorithm combines Theorems IV.2 and IV.3 and computes F+f by independent computations of F+f on the disjoint BL-cosets z(b*) + B by the collection of induced smaller size FT computations F+lgb*(bL), 6'
E
B L , b* E B*.
(49)
CT FFT Algorithm
f
E
L(A) I
I
fb* E
gb*
E
b* E B*
L(A), I
b*
L(A/B), I
E
B*
b* E B*
F+,gb*E L ( B L ) ,
D. Good-Thomas Algorithm The GT will be derived as a special case of the CT FFT. In An et a/. (1991) and Tolimieri et al. (1993), a direct proof was given. Choose a subgroup B < A . We require that A has a direct product decomposition, A=BxC, where C is a subgroup of A . Choose group isomorphisms
d B :B
4
B*,
&-: C
-+
C*.
The mapping +:A+A* defined by ( ( b ' ,c ' ) , d(b, c)> = ( b ' d(b)>(c',d(c)>, is a group isomorphism. Relative to 4 C=B',
b=CL.
b, 6'
E
B, c, c'
E
C
20
R. TOLIMIERI e t a / .
Since A/B = B' and A/B' = B, 4; = $ B ~and 4; the notation of the previous section, we can take
=
In particular, in
b* E B*,
z(b*) = 4i'(b*),
which amounts to taking B as a complete system of B'-coset representatives in A. Under these assumptions, the CT FFT takes the form
F, f ( b
+ b')
b E B, 6' E B L .
F,,igg,(b)(b'),
=
Compute
b
g,,(b) E W ' ) ,
E
B.
Compute F+BL(gdB(b))E L(B'),
The second stage is a collection of FT computations over B I . We will see that the first stage is a collection of FT computations over B . By definition
which equals F,Bfb'
(b) 9
where
fbi(b) = f ( b + b'),
b E B, b' E B'.
The precise statement of the stages of the GT can now be given as follows:
GT algorithm Form the slices
fbL
E
L(B),
b'
E
B'.
Compute the collection of FT over B Fg, fbl
E
L(B),
b' E B'
Form the functions
gg,(b) E L(B'),
b
E
B
This step requires data transpose (or permutation).
21
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
Compute the collection of FT over B'
bE
F4HLgr$B(b) E L(B'), Set
Fr$f(b +, b*) = F,,I gg,Cb)(b*)* This step requires data transpose (or permutation).
V. EXAMPLES AND IMPLEMENTATIONS
For applications to X-ray crystallography, we will take a 3D case to illustrate the theory presented here. In particular, the smallest nontrivial case, 2 / 1 2 x 2 / 1 2 x 2 / 1 2 is used in many of the examples, while Z / 3 N x Z / 3 N x Z / 6 M and Z / 2 N , x Z / 2 N 2 x Z / 2 N 3 are used in the implementation for several natural numbers. In all the examples, we will take the fixed isomorphism $I given in Eq. ( 1 1). To simplify notation, especially in presenting covering subgroups, we will use the following definition and notation. Let A be a finite abelian group. For a E A denote by ( a ) , the subgroup of A generated by a,
( a ) = { a ,2a, 3a, ..., ( K - l ) a ) , where K is the smallest positive integer such that Ka the order of a.
=
0 E A . K is called
A . RT Algorithm Two forms of RT algorithm wil be derived for A = Z / 3 x Z / 3 x Z / 3 . Using CRT, we will extend our current example to groups of the form Z / 3 * 2N x Z / 3 * 2N x Z / 6 M for integers N a n d M .
Example V.2. RT algorithm I for A = Z / 3 x Z / 3 x Z / 3 . Set A Z / 3 x Z / 3 . The following four subgroups cover A : ((0,1 ) ) x Z / 3 ,
B:
= ((1,
B t = ((2, 1 ) ) x Z/3,
Bt
=
B:
=
1)) x Z/3,
((1,O)) x Z / 3 ,
=
Z/3 x
22
R . TOLIMIERI et a/.
c2=o c , = o
Example V.2. RT algorithm I1 for A = Z/3 x Z/3 x Z/3. We list a collection of 13 covering subgroups along with their dual groups. Each of the covering subgroups is of order 3, while the dual group is a subgroup
23
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
of order 9. For a
=
0, 1 , 2 and b l , b2 = 0 , 1,2,
p ( u l , a 3 ,u3) = 1 for all ( a l ,a2, u3) E A , except p(O,O, 0) = 13. We will show two of the computations explicitly. The rest follows in exactly the same way. To index the periodizations with respect to D,, set
A103 : ((O,O, 01, (1,0,0), (2,O,O)J,
(52)
Usually, coset representatives are not unique. Note that although the collection in Eq. (52) can be used as AID5 as well as A / D 3 ,Eq. (53) cannot be used for A / D , . For a, c = 0 , 1,2, 2
2
per,, f(c, 0 , 0 ) =
C C f(bl bl=0 bl=0
PerD5f(0,0 , c)
C C f ( b l , b 2 , h + b2 + d. bl=O b , = o
2
=
+ c, 2bl, b2),
2
2
F + , , J ~ ( ~a,, 0 ) =
c f3(c,
0,0)~(-2*i/3)oc,
c=o 2
F+,,,f5(2a,2a, a)
=
C f 5 ( 0 ,0 , ~ c=o
) e ( - ~ ~ ~ / ~ ) ~ ~ .
24
R. TOLIMIERI et al.
Remaining cases follow in the same way, and the induced FT computations are implemented by 13 independent 3-point FTs. The above two derivations show uniform decomposition of a 3D problem into 2D and 1D problems, respectively. However, the above two cases can be combined to provide various decompositions.
Example V.3. RT algorithm for A = Z/2N x Z/2N. We will list a collection of covering subgroups of A and their dual subgroups of order 2N by listing their generators. A is covered by the 2N + 2N-' subgroups shown in Table I. To organize the periodizations, we will set
The collection of induced FT is implemented by 2N + 2N-' independent 2N-point FT computation. For the dual RT algorithm, we list the values of the function p on A with respect to the collection of covering subgroups given in Table I. Denote by Uo the multiplicative units of Z/2N, i.e.,
U,
= (a E
Z/2N : a = 1 mod 2).
For 1 5 n 5 N - 1, set
U,
= (a E
Z/2N: GCD(a,2N) = 2"). TABLE I.
COVERING SUBGROUPS OF
~
/ x ~2
/~
2
~
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
25
Then
u u,.
N- 1
Z/2N
=
n=O
For a,, E U,,, a,, # 0 ,
o Ij < 2 N , o II <
p(a,,j, a,,) = 2", p ( a n ,2a,,l)
=
2",
p(0,O) = 2 N + 2 N - 1 . Let 03 be the collection of covering subgroups of Z / 2 N x Z / 2 N given in Table I . For B E 03, compute Dec; f .
To index the induced FT computations, we will fix A/B*-coset representatives, 0 I j I2 N - 1, A / ( ( - l , j ) ) : ( ( 0 , l)), 0
A / ( ( - 2 1 , 1)) : ( ( l , O ) ) ,
II I2 N - ' -
1.
The collection of induced FT computation is implemented by 2 N + 2 N - 1 independent 2N-point FT. To complete the computation of F+, we use the periodicity F[(Decg f ) ( a + B') = F[(Dec; f ) ( a ) and the formula
F,f
c F,,(DeGf).
=
BE@
Example V.4. Hybrid RT/GT algorithm. Set A = Z / 3 * 2 N x Z / 3 * 2 N for a natural number N . By the fundamental theorem, A =AIXA,,
where A ,
=
Z / 2 N x Z / 2 N and A , B
=
=
(54)
Z / 3 x Z / 3 . The subgroup
( ( a l e l a, 2 e 1 )E A : 0
Ia , , a, 5
21
,
is isomorphic to A , while BL
=
((n1e2,n,e2) E A : 0
5
n , , n,
5
2N - 1)
is isomorphic to A , , where e, and e, are the idempotents associated with the isomorphism in Eq. (54). We have A
=
BxB'.
26
R . TOLIMIERI et at. TABLE 11. COVERING SUBGROUPS OF 2 / 3 x 213
k
Subgroup
Generator
Dual group generator
Using GT algorithm, we can compute FA by computing F A , followed by F A 2 . The induced FT computations FA, and FA2 are implemented by RT algorithm.
Example V.5. Covering subgroup computation via CRT. Covering subgroups and their dual subgroups for A , are given in Table 11. A , x A , is covered by (A,xL::Osj13],
while dual subgroups are given by
((0,0) X N
L k
05k
5 3).
We can also decompose A , into covering subgroups. To see this, let = 2 (see Table 111). The idempotents in this case are el = 9 , e, = 4 and the collection B,+k =9Mt + 4 L : ,
O s j s 5 , O r k s 3 ,
of 24 subgroups covers A . Each subgroup has order 12, given in Table IV. TABLE 111. COVERING SUBGROUPS OF A , = 214 x 214 j
Subgroup
Generator
Dual group generator
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
27
TABLE IV. COVERING SUBGROUPS OF 2/12 x 2/12 (j,k)
Subgroup
Generator
Dual group generator
B. CTFFTAlgorithm
Example V.6. CT algorithm for 2/12. Set w = e-2*i”2 . F o r f E L(Z/12), 11
(F,f)(b) =
c mwab
a , b ~ A .
a=O
For B = to, 4, 81, B’ = ( 0 , 3 , 6 , 9 ) , relative to 6 defined in Eq. (11). Generalized periodization o f f gives rise to three functions
fo4d
=
f ( 4+ f(a + 4) + f ( a + 81,
+ 4) + w8f(a + 8), fs*(a) = f ( a ) + w8f(a + 4) + w4f(a + 8), f4*(a)
= f ( a ) + w4f(a
a
E
2/12.
28
R. TOLIMIERI el al.
By Eq. (46), fb*(a) needs to be computed only on a set of B-coset representatives, say, ( 0 , 1,2, 3). Thus the periodization is usually implemented by four independent 3-point Fourier transform of the strided values off. Choosing z(O*) = 0, 2(4*) = 1, 2(8*) = 2, &?,*(a)= fo*(a)* g4:(4
= f4*(a)(a,d41)) = f4*(a)wU,
a
&*(a) = fg*(a)(a,6(2)) = fx*(a)w2",
E
2/12.
(a, b(z(b*)))is the so-called "twiddle factor".
3
The quotient group A / B contains 4 elements, B, 1 + B, 2 + B, and + B. Via the homomorphism 4I and the B-periodicity of g b * , we have
F+f(z(b*)+ bl) = F+,g,*(bL) 3
=
c gb*(a + B)(a +
o=o
3
=
c gb*(a)(a,$l(b'))*
a=O
Since b' = 3b, for some b E A and w3 = e-2?ri'4,the computation of F+ is completed by the three independent 4-point Fourier transform of g b , , b* E B*.
Example V. 7.
Multidimensional CT FFT.
Z/2Nl x Z/2N2 x Z/2N3,
A
=
B
= ( ( O , O , O),
(Nl 0 , O,), (0,N2 0 ) -(N1 N2 O), (O,O, (N, 0 , Nd, (0, N2 N3), (Nl N2 Ndl 9
9
9
=
9
3
9
[ ( b , , N , ,b 2 N 2 ,b 3 N 3 ) b, :
=
9
N3)9
(55)
9
0 or 1, n
=
1,2, 3).
Label the elements of B by b k r0 5 k 5 7 in the order given in Eq. (55). Note that the matrix of values of the characters in Table V is F(2)
0 F(2) 0 F(2),
where 0 denotes the matrix tensor product and F(2) denotes the 2-point FT matrix, F(2)
=
[
1
-1
1.
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
29
TABLE V. VALUESON B OF CHARACTERS OF A bX
bt
b:
bo b, b*
1 1
I
-I
b,
I
-1
-1
b4
1
I
b,
1 1 1
b6
b7
1
I
1
-I -1
-1
1 1 -1 -1
-1
1
1
-I
I
-I
b:
1
I
-1
1
b:
b:
1 1 1 1 -1 -1
bz 1
1 1
-1 1
1 -1 -1 1 -1
-1
-1
-1
-1
-1
-I
1
-I -I
6:
-I
1 1 -1
I 1
1
By Eq. (46), we need to compute f,,; on a set of B-coset representatives, say, C = ((01 , a2, a,) 0 5 ~j 5 Nj - 1, j = 1,2,3 1.
Order C antilexicographically. Denote by f,, the vector of values o f f on C listed in order by the ordering of C. Similarly, define the vectors f k , 0 5 k 5 7 by listing the values in order of C , f,
=
[f(c
+ bk)],
cE
c.
Then the periodization is obtained by the matrix operation, fb$
f0
fbi
f,
fbt
f2
f,;
f3
f,;
f4
fb;
fS
fbz
f4
fb;
f7
where IK denotes the K x K identity matrix. B'
=
((2a,,2a2,2a3): 0 5 a ,
INj -
1, j
=
1,2,3].
With the following choice of B'-coset representatives,
z(b,*) = (O,O, 01,
z(b,*)= (1,0,0), Z ( b 2 * )
=
(0,1, O),
Z(b3*)= (1, 1, O),
Z(b4') = (090, 1)-
Z(b,*) = (1,0,0), Z(bfj*)
=
(0,1 , l),
Z(b7*) =
(1,1, 1).
30
R. TOLIMIERI et al.
9
where T is the 8N,N2N3 x 8N,N2N3 diagonal matrix whose entry at position a, ~ 2 + ~kN,3 N2N3is ((01
9
a29 0 3 1 , z(bk*)),
0 5 k 5 7.
Since A / B = B'
= Z/N, x Z/N2x Z / N 3 ,
the induced FT is of size N 1 x N2 x N 3 applied to the eight independent functions g b ; , 0 I k I 7 .
VI. AFFINE GROUPRT ALGORITHMS A . Introduction A class of affine group RT algorithms will be constructed which act on data
f E L ( A ) invariant under the action of affine subgroups X < A f f ( A ) . The effect will be twofold as follows: reduction in the number of required induced FT computations; the induced FT computations will be on data invariant under a collection of subgroups of X .
For x E A f f ( A ) , we define two actions on L(A):
xf(4 = f ( x 4 , x#f(a) =
(0,,d(&))f(cYu,#a).
The first main result we have is
Theorem VI.l. F&f)
=
x"FJf 1.
(56)
(57)
31
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
=
x#F,(c).
Corollary. f is x-invariant i f and only i f F, f is x#-invariant. RT algorithms provide a general framework for computing the FT of data invariant under affine subgroups. We begin with data invariant under point groups. B. Point Group RT Algorithm Choose a dual covering 63 of A . The RT algorithm computes F, f , f by the collection of induced FT computations
F t Per, f,
B
E
E
L(A),
63.
We will now describe how to modify this form of the RT algorithm when f is invariant under the action of a point group H < Aut(A). This invariance will reduce the number of required induced FT computations to a set of induced FT computations on data invariant under subgroups of H . Suppose f in H-invariant. Choose a dual covering 63 invariant under H such that h E H , B E 63. h(B) E 63,
The collection of dual subgroups 6 3 ' is invariant under H' and we can choose a subset a0c 63 such that 63; is a complete system of H#-orbit representatives in 63'. Since f is H invariant, F, f is H#-invariant and it suffices to compute the following collection of induced FT: FmB,(perBf
1,
BE
630
*
(58)
This has the effect of reducing the number of induced FT required to complete the computation.
32
R. TOLIMIER1 et al.
The periodized data Per, f, B E a0inherits some of the data redundancy off. For a subgroup B < A , define
HE = (h E H : h(B) H E
B).
=
induces a group of automorphisms of A/B by h(a+ B ) = h a + B ,
Theorem VI.2.
hEH,,aEA.
Iff is H-invariant a n d B is a subgroup of A, then
Per, f(ha) In particular, Per, f
E
=
a
Perh-](,)f(a),
E
A, h
E
H.
L(A/B) is HB-invariant.
By the theorem, the induced FT in Eq. (58) is computed on H,-invariant Per, f, B E a0. To make full use of the H-invariance off we must supply code which makes full use of this HE. In crystallographic applications we can choose 63 such that A / B is 1D or 2D. Standard point group FFT algorithms can be applied in the 1D case (see the Appendix). 2D point group invariant FFT algorithms have recently been implemented using variants of Winograd’s multiplicative FFT (An et al., 1990; An el al., 1992b). H-Invariant RT Algorithm. Choose a dual covering 63 of A invariant under H and a complete system of H-orbit representatives a0in 63. Form the periodizations Per, f
E
L(A/B),
b
E
CB0.
Compute the H,-invariant induced FTs Fi(PerBf),
BE
Fi(Per,f),
B E 63,
@O-
Compute by H#-invariance.
Example VI.2. P6-invariant R T algorithm I . Set A3
=
Z/6M,
A
=
Z/3
*
2 N Z/3 ~
2N X A3,
for integers N and M . Using the Chinese remainder theorem, we can write A as @,A, + %A,) x A39 where A , = Z/2N x Z/2N and A, = Z/3 x Z/3. A is covered by the following collection of subgroups, where Lk (k = 0 , 1 , 2 , 3 ) , are given in Table 11;
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
B: Bk
P6#(B;)
=
33
+ e2L; x A , .
=
e,A,
=
((0,0 ) ) + e2Lk x ( 0 )
{ B t ,B : , B:)
P6#(B:)
=
{Bt),
and (B: : 0 Ik 5 3 ) is a P6#-invariant covering of A . Hence for P6-invariant f E L ( A ) , we need to compute FAf only on B,I and B i . fo = To index the periodization, set
f2 =
A / B , : A , + e2L:, A/B,:A , For 0
In , , n2 I
r
+ e2L;,
N - 1, 0 I k I 2 , 0
Per,J 0 , 1,
=
s = 2,3.
Im 5
6M - 1,
2
c f ( e , n , + e2k,e1n2+ e2a,m ) c f ( e l n l + e2(k + 2 4 , e 1 n 2+ e2a,m).
f o ( e l n ,+ e2k , e 1 n 2 m , )=
a=O 2
f 2 ( e lnl , e l n 2 + e2k,m) =
a=O
fo(a3(eln , + e2k9el n2, m ) ) = fl(-e,nl
-
e2k, - e , n 2 , m )
2
=
f ( - e , n , - e2k, -e,n2
+ e2a,m )
a=O 2
=
c f ( e l n , + e2k,e,n2
-
e2a,m)
a=O
+ e 2 k ,el ,n2, m),
=fl(eln,
f 2 ( 4 e ,n , el n2 + e2k, m ) ) 9
+ e2k - e 1 n 2m, )
= f 3 ( - e , n 2 ,e1n2
c f ( - e l n 2 + 2e2a,e l n 2+ e2k e,n2 + e2a,m ) c f ( e l n , + e2k + 2e2a,e,n2+ e2a,m), 2
=
-
a=O 2
=
a=o
= f d e l n , , el n2
+ e2k,m )
P ~ B=, P6B, = P6,,
=
(1, a 3 )= P 2 ,
The induced FT computations F? invariant data, respectively.
P6B,
=
P6.
and F Z are made on P2 and P 6
34
R. TOLIMIERI et at. TABLE VI. DECOMPOSITION OF SUBGROUPS
IN
z/4 X z/4
P6'-ORBIT DECOMPOSITION OF SUBGROUPS IN
z/3 X z/3
P6'-ORBIT
TABLE VII.
Example V1.2. P6-invariant RT algorithm 11. We can further reduce invariance condition on the periodized functions by applying RT on A , . T o this end, we will set A , = Z/4 x Z/4, and use the covering subgroups that are given in Table IV. The collection Osj15,01k13
@=DA J,k = B A J,k x A , , covers
x 2/12 x A , .
2/12
The dual subgroups are given by 0 I j I5, 0 5 k I3. Dj,k = B,,k X (01, Let d M j X A , = M j 8x A , and a#L, x A , = Lk#x A , . Then we have
a'((e,Mj + e2 Lk )x A , )
=
( e , M j , + e 2 L k #x) A , .
Thus to compute the P6#-orbit decomposition of 03 (see Tables VI and VII), we first decompose the collections (Mjx A , : 0 Ij I5 ) and (Lkx A , : 0 Ik I4) independently, then place the decomposition into @ by CRT. We have the following P6#-orbit decomposition of A : P6#(Dto) = { D t o Di,2 D,',,), P6#(D3',0)= IDio D,',2 D,',,I, 3
P6#(Dt,O) =
lDt,O,
9
Dt,2
9
311
P6#(D:,o) = IDiL,o,N,2*Dt3I,
P6#(DtO) =
(Df,O
9
,O:,2 9
Dt,
319
P6#(G,o) = ( D i , o - D t z , D $ 3 ) ,
P6#(DO',i) = ~ D ~ , ~ ~ D ~ P~6 #,( DDt i )~= ,[ DI~ ) >: , i ~ D ~ , i l . i,D
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
35
We will choose as P6'-orbit representatives, a 0
=
(D&o D:,o Dt,o Dt.0 D42,o,Dk,o, Dt, I ,D:, 9
9
9
9
(59)
I).
It is easy to show that the periodizations of P6-invariantf E L ( A )with respect to the duals of the previous P6#-orbit representatives are P2-invariant, and the induced FT computations are made on this invariant data. Let f b e the FT of a P6-invariant function f E L ( A ) ,f o n DjSkE CB0 is determined by the induced FT of pj,k-periOdiZed function f D . By the P6'4nvariance o f f , for example, f of Ilko determines f on and f on Dk3.
;i":,2
f ( o , l , m ) = f i l l , 1,m) = f ~ , o , m ) , (0, 1, m)
E
DiO9
(11, 1, m) E DiS2,
(11,0, m) E Di,3.
Example VZ.3. P3-invariant RT algorithm. Crystallographic group P 3 is generated by a 2 . Since P 3 is a subgroup of P6, P6#-invariant covering of 2/12 x 2/12 x A3 is also P3-invariant. In fact, the P3#-orbits and the P6#-orbits of the covering subgroups are the same. Thus as in the case of P6, the induced FTs are computed only on the collection a0.However, the periodized functions have only the trivial invariance, and symmetry specific FT routines are not required.
Example VZ.4. P6lmrnm-invariant covering for 2/12 x 2/12 x A 3 . The above two examples leads to the following unifying strategy. Choose a point group H that contains sufficiently many subgroups. Since H#-invariant covering is invariant under any subgroup K' < H', f o r K-invariant data, RTalgorithm proceeds by disabling the computations except on the K#-orbit representatives. As an example, we will consider the crystallographic P6/mmm which contains all the trigonal and hexagonal point groups, which comprise 16 of the 53 3D crystallographic point groups:
P6/mmm#(Dto) = P t o , Di,2,D i 3 , Dk,o,D t 2 ,Dt,3), P6/mmm#(Dt,o) = (D$,o,D;,2,D&3, D,',o,D;,2,Di3), P6/mmm#(D;,o)
=
P6/mmm#(Dt,o)
=
DkS2,D,',,l,
lD,',o, D:,2, D,',,,I,
P6/mmm#(D& = ID;, , D$, , D;, 1, P6/mmm#(Dt
= ID:,
1,
Di
02:A.
36
R. TOLIMIERI et al.
A collection of P6/mmmu-orbit representatives is w;,o
1
o:,o
1
0210 9
D:,o Dt, 1 ,D:, 3
*I*
and the computation is required only on this collection of subgroups for P6/mmm-invariant functions. To simplify notation, set Hj,k = P6/mmmDjfk, the invariant group of the DL,-periodized functions: H0,o
=
H , ,= ~
H2,o
H,,l
= H0,l =
H3.0 =
(1,
= I l , a 3 ,PI
a3P,Y,a3Y,PY, (Y3PY1.
Y,a3Y1.
a3,
The induced FT computations are made on the Ho,o or H,,,-invariant functions.
Example VZ.5.
Implementation of RT with respect to P6/mmm A = Z/3
*
2N x Z/3 * 2N x Z/6M.
By the fundamental theorem, A = Z/2N x Z/2N x Z/3 x Z/3 x Z/3
- 2M
Let el and e, be the system of idempotents associated with the isomorphism z / 3 * 2N = Z/2N x z / 3 and again set A 3 = Z/6M.
a>
=
(e,L:
+ e,M;)
x A,,
where L: and M; are a collection of covering subgroups in Z/2N x Z/2N and Z/3 x Z/3, respectively, as listed in Tables VIII and IX. For easier reference, we repeat the tables here. It is straightforward to show that B is a P6/mmmu-invariant dual covering of A . We will give the P6/mmm'-orbit decomposition of B. Recall Pu = /3 and y u = y. TABLE VIII. ~
COVERING SUBGROUPS OF
/ x2
Table Note: We will denote this collection by 03
~
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
37
TABLE IX. COVERING SunGKouPs OF Z/3 x 2/3
k
Subgroup
Generator
Dual group generator
P6/mmm#-orbit structure in 213 x Z/3 is the same as that of P3#, since actions by or y do not change the orbit structure. P6/mmmu(L,) W,)
=
L3
P6/mrnmu(L,)
{Lo,L2,L,},
=
PW,)
I
=
L,,
P(L2)
=
=
[L,].
L2.
P6#-orbit of ( ( j ,l ) ) , p6'((j9 1))
=
{<(-i,I)),
+ I)), ( ( - j
((-1,j
- 1,j))I
contains three distinct subgroups. T o see this, note first ((-1,j
+
((-j
1 , j = ~ ( ( j - ' ( - j - I), 1)).
-
1)) = ((1, - j - I)),
As j ranges through U,, j - ' ( - j - 1) ranges through Z/2N - U , , and - j - 1 ranges through 21,O II 5 2N-1 - 1. In fact, we have the following partitioning of 63 into P6#-orbits:
u
[ ( ( j ,I)), ( ( - 1 , j
+
I)), ( ( - j
-
1,j))).
.iE UO
/3 maps ( ( j ,1)) onto ( ( 1 , j ) ) . We will show that there are exactly four subgroups of the form ((j , 1)) with j E U, that are p-invariant. Suppose ( ( j , 1)) = ((1,j))
=
K-I,
1)).
T h e n j 2 = 1 r n 0 d 2 ~ . jE U, can be written as 21 + 1, 0 terms of 1, the following congruences hold: (21
+ 1)2
=
412
I1 s
+ 41 + 1 = 1 mod 2N. 441 + 1) = 0 mod 2N. /(I + 1)
= 0 mod 2N-2.
2 N - L- 1. In
38
R. TOLIMIERI et a/.
The last congruence has exactly four solutions for 0
I
1 = 0,
I
1 I 2N-1 - 1,
* j = l ,
1 = 2N-2 - 1, =. j
=
2N-1 - 1,
+ 1,
1 = 2N-2,
*
j = 2N-1
/ = 2N-1 - 1,
3
j = 2N - 1.
Partitioning of 63 into P6/mmm#-orbits gives 1I j
I
2N-1 - 3,
( ( ( 2 + 1, I)), ((-1,2j + 2)), ( ( - Y - 2 , 2 j + I)), ((1,2j + I)), ((2j + 2, -I)), ((2j + 1, - 2 j - 2))). (((1, I)), ((--1,2)), ((-2, I))), (((2N-1 - 1, l)), ((-1, 2N-1)), ( ( - 2 y 2N-1 - l))), (((2N-1 + 1, l)), ((-1, 2N-1 + 2)), ((-2N-1 - 2, 2N-1 + 1))).
.
(((-19
I)), ((-1, O)), ((0, -1))).
There are 2N-1 P6-mmm#-orbits in 63, four of which contain three subgroups. Action by y does not change the orbit structure. We list two examples of P6/mmm#-orbits in a. Set 1 = 2 j + 1. From the orbit of ((I, 1)) in 63 and L o , we obtain
((el1, - 1) x A , ,
((-el + 2 e 2 , e l l + e 2 ) ) x A 3 ,
+ e 2 ,ell)) x A , , ((ell + 2e2, -el + e2)>x A , ,
((1, e l l ) ) x A , ,
((-ell
((ell, -ell + e2)>x A 3 .
From the orbit of (( 1, 1)) and L o , we obtain
((-el + 2e2,e1 + 1 ) ) x A 3 ,
((el,1))xA3, ((-2e,
+ e2,el)> x A 3 .
In a, there are 4 ... 2N-1 P6mmm#-orbits, four of which contain three subgroups; the rest contain six subgroups. For completeness, we list the values of idempotents as follows: 1. If 2N = 1 mod3, then
el
=
2N- 1
+ 1,
e2 = 2N.
2. If 2N = 2 mod 3, then el
=
2N + 1,
e,
=
2N-1.
39
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
Choose a P6/mmm-invariant function f E L(A). By the invariance, the induced FT computation only on a collection of P6/mmm#-orbit representatives determines the FT off. As in Example VI.4, the periodized functions are invariant under one of the two subgroups of P6/mmm, H,,, or H I , , . Specifically, a periodized function F D is HI,,-invariant if the P6/mmm# orbit of D contains six subgroups, while fD is H,,,-invariant if the P6/mmm# orbit of D contains three subgroups.
C . Affine Group R T Algorithm
Choose a subgroup X of Aff(A) and denote the point group of X by For X-invariant f E L(A) we have F+f(a!a)
=
a E A, x E X .
(ax,4(a!a)>F4f(a),
k.
(60)
F+f i.s not invariant under Xi."but F+f(a) determines F+f at each point in the X#-orbit of a. Choose an k-invariant dual covering 63 of A and a complet? system a, of k-orbit representatives in 63. 63; is a complete system of X' representatives in the covering 63' of A . In the presence of X-invariance, the RT algorithm can be implemented by first computing the induced FT Fi(Per, f ),
B
E
630,
The remaining induced FT computations can be determined by complex multiplications implied by Theorem VI. 1. the X-invariance off reduces the number of required induced FT computations. For any subgroup B < A, define X,
= (X E
X : a,@) = B ) .
X , is a subgroup of X and acts on L(A/B). Theorem VI.3.
Iff is X-invariant then Per, f E L(A/B) is X,-invariant.
By the theorem the induced FT computations
are taken on X,-invariant data. To make full use of the X-invariance o f f we must provide a code which makes full use of the X,-invariance of Per, f, B E 63,. In 1D or 2D, affine group invariant FFT algorithms are substantially simpler because of the restricted class of 1D or 2D affine group actions.
40
R . TOLIMIER1 ef al.
X-Invariant RT Algorithm. Choose an k-invariant dual covering 63 of A and a complete system a0of k-orbit representatives in 63. Form the periodizations Per, f
E
L(A/B),
B
E
a,,.
Compute X,-invariant FT
%.
Ft(PerBf),
B
F;(Per,f),
B E 63,
E
Compute by Eq. (60).
Example VZ.6. Affine group-invariant RT. There are five affine crystallographic groups whose point group is P 6 (see Table X). RT algorithm proceeds as in the case of P6. Now the invariance condition on FT is given by Eq. (60). For 0 II I5, a P6,-invariant f E L ( A ) , the induced FT of the Dj,,-periodization o f f determines f^ on DLk E a,,. To determine ?on P6#-orbits of Djtk set ((el, c 2 ,c,), ~ ( o , oM , ))= w ?(el,
~ 2~,3 = )
=
e-2ai’6.
wC’!f(a#(c1 , c2, ~ 3 ) )
, c 2 ,c,))
=
wzc3~((LyZ)#(cI
=
~ ~ ‘ ~ ! f ( (, ca2~,c,)) )#(~~
= w4‘3!f((a4)#(c1,
c2,
c,))
= w5C3!f((CY5)#(c1,
c2,
c,)),
1 5 1 I5 .
The group that contains all of the 48 tetragonal crystallographic groups is P4/rnmrn. As in the case of P6/rnrnrn, once a P4/rnrnrn#-invariant covering subgroup is partitioned into P4/rnrnrn#-orbits, a code for the RT TABLE X . AFFINEGROUPSWITH POINTGROUPP6 Group
Generator
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
41
algorithm with respect to this partitioning contains codes for FT computation of functions invariant under subgroups of P4/mmm. One can also choose a group that contains all the crystallographic point groups; this group need not be a crystallographic group.
D. X#-Invariant RT Algorithm Consider a subgroup X of A f f ( A ) . In many applications we will have to compute the inverse FT of X#-invariant data. Up to index reversal, this problem is equivalent to computing the FT of X'hvariant data. We will embed this problem in the second form RT algorithm. In problems requiring several stages of FT and inverse FT, it makes sense to follow the first form RT algorithm which outputs decimated data by the second form RT algorithm which inputs decimated data and conversely, removing the necessity of data rearrangement steps at each cycle. In the second form of RT algorithm we compute F+f , f E L (A ) by first computing the collection of induced FT
Ft(Dec; f ) , Theorem VI.4.
B
For a subgroup B < A , i f f
F+(Dec, f ) ( - a )
=
63.
E
E
L ( A ) is X#-invariant, then a
F+(Dec,:, f ) ( - x a ) ,
E
A , x E X.
(61)
Proof. F+(DecBf)(-c) =
C
f ( b ) ( b ,6 ( C ) )
C
f(a 'b )(b , d
beB
=
c - ai'a,))
bEB
=
C,
f ( b ) ( b ,6(a,c - a,))
b E a,B
=
F+(Decaf,f (-
XC).
Choose an k'#-invariant covering 63 of A and a complete system cR0 of k'#-orbit representatives in a.It suffices to compute the collection of induced FT F;(Dec,f), B E (Ro. The remaining induced FT computations can be computed from the theorem. Set X B = [ X E X : a,@) = B ) .
42
R. TOLIMIERI et a/.
Theorem VI.5. in variant.
For X#-invariant f E L(A) and B < A , Dec, f is
Dec, f ( b ) = (a,, 4(a:b))DeC, f(a,#b),
b
E
B, X
E
X,
i'-
.
In 3D crystallographic applications, specialized routines as described in the preceding two subsections can be applied to these induced FT computations.
VII. IMPLEMENTATION RESULTS We have implemented symmetrized 3D crystallographic FFTs for the case of P6 symmetric data. The data is assumed to be defined on the Z/3N x Z/3N x Z/6M lattice, where N and M are powers of two.
Algorithm 1 1. Use CRT to re-index the data set such that the problem is transformed to an equivalent 5D computation:
Z/3N x Z/3N x Z/6M
+
Z / 3 x Z / 3 x Z / N x Z / N x Z/6M.
Although this step is computationally expensive, involving irregular accessing of the data stored in the main memory, it should be noted that in many applications where a large number of iterations of the forward and inverse FFT are required, the CRT re-indexing can be carried out only once and then the optimization can be performed in the 5 D domain. 2. Apply the RT algorithm to the Z / 3 x Z / 3 to compute the periodized data on two out of the total four subgroups. The periodization results in two distinct data sets, A , and A , , each defined on Z / 3 x Z / N x Z / N x Z/6M. 3 . Perform two 4D FFTs on the data sets A , and A , to implement the induced FT. The sets A , and A , are P2 and P6 symmetric correspondingly, such that efficient symmetrized FFT code can be used for the computations. If symmetrized FFT code is not used in step 3 , the computational savings are roughly on the order of 1/2. In Fig. 1 we plot the speed up over the nonsymmetrized FFT versus the size of the data set. The second implementatioon results in even more speedups over the nonsymmetrized FFT:
43
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS Speedup
2.6
1.6'
I 1
0.5
1.5
Data Size
2
lo5
FIGURE 1. Speedup of the P6 symmetrized FFT over the nonsymmetrized FFT versus the data size. Symmetrized RTA on Z/3 x Z/3.
Algorithm 2 1. Use the CRT to re-index the data set such that the problem is transformed to an equivalent 5D computation:
Z/3N x Z/3N x Z/6M
--t
Z/3 x Z/3 x Z / N x Z/N x Z/6M.
2. Apply the RT algorithm on Z/3 x Z/3 x Z / N x Z / N and compute the periodized data on one-third of the total 4 x (3/2)N subgroups. The periodization results in 2 N distinct data sets, each defined on Z/6M. 3. Perform 2 N independent 1D FFTs on data of length 6M. These distinct data sets are P 2 symmetric, so that efficient P2-symmetrized FFT code can be used. If symmetrized FFT code is not used in step 3, the computational savings are roughly on the order of 1/3. In Fig. 2 we plot the speedup over the nonsymmetrized FFT versus the size of the data set. If P2-symmetrized FFT code is used, the computational savings are roughly on the order of 1/6, which is the theoretical maximum since the original data are P 6 symmetric. The P 6 symmetrized RT algorithm-based FFTs share the highly parallelizable structure of the general RT algorithm. A variety of choices of a
44
R . TOLIMIER1 et al.
Speedup
2.510
2
4
6
8
10 Data Size
12
lo4
FIGURE 2. Speedup of the P6 symmetrized FFT over the nonsyrnrnetrized FFT versus the data size.
multiprocessor algorithm are available allowing for efficient implementations depending on the characteristics of the particular platform. Consider for example Algorithm 1. If two processors are available and all of the 2 * 3 N N 6M data set is stored in each processor, no interprocessor communication is needed since each processor can independently compute the periodization and 4D FFT. If only half of the data is stored in the memory of each processor, then in order to compute the periodizations, each processor has to send its data to the other, resulting in a total amount of communication (number of processors x size of messages) equal to 2 * 3 * N * N * 6M. If P > 2 processors are available, the data can be divided along the last dimension into sets of size 2 * 3 N N * 6 M / P , each set being stored into the local memory of one processor. After the computation of the periodizations, each processor keeps 3 N N * 6 M / P of local data, and then performs local FFTs along the first three dimensions. To complete the computation, FFTs along the last dimension have to be performed. Since the data are distributed among the processors along the last dimension, a global transposition is required: Each processor keeps 1/ P of its local data, and sends ( P - 1 ) / P data to other processors. The total communication
- - -
-
- -
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
45
requirements are then ( P - 1) x local data size = ( P - 1) x 3 * N * N 6 M / P . In an alternative implementation, P processors are being divided into P / 2 clusters of two processors, with local data being duplicated within each cluster. In this implementation, each node stores twice as much data as before, but the efficiency can be increased in certain multiprocessor networks since now the global transposition step is replaced with two independent global transpositions each involving only P / 2 nodes.
A . Complexity 1. Row-Column Algorithm
Set
A
=
Z/3N x Z/3N x Z/3M.
The computation of the 3D FT using a conventional row-column algorithm of processing the data dimension at a time on many parallel systems exacts a considerably higher price on interprocessor communication than FT computation. RT algorithm offers an alternate data movements in MD FT computation. We list some performance results here.
2 . GT-RT Algorithm I Using CRT,
A = A , x A 2 = (Z/3 x Z/3) x (Z/3 x Z/N x Z / N x Z/M). Data reduction (periodization) stage costs 4 x 2 x 3N 2M additions, which can be combined with data loading operation in a broadcasting mode; on some parallel systems it is given for free. In a 4-processor system, each processor carries out 2 x 3N2Madditions, while receiving input data, followed by a local 5D 3 x 3 x n x N x M FT computation. This algorithm eliminates interprocessor communication completely, and each processor has a balanced load with uniform computation format. 3. GT-RT Algorithm 11
A = A , x A 2 = (Z/3 x Z/3 x Z/3) x (Z/N x Z / N x Z/M). In this decomposition, each processor carries out ( 2 x 3) x N2Madditions to implement periodization while receving input data, followed by a local 4D 3 x N x N x M FT Computation. This decomposition is well suited on a 13-processor system. Both reduction and FT computation are carried out in parallel.
46
R. TOLIMIERI et al. TABLE XI. TIMINGRESULTSON iPSC/860 (3D) (4 NODES) GT-RT (4 nodes) Size 48 x 48 x 48 48 x 48 x 96 48 x 96 x 96
Row-Column (4 nodes)
Time (ms)
Size
Time (ms)
3 60 512 980
64 x 64 x 64 6 4 x 64 x 128 64 x 128 x 128
566 1122 2202
TABLE XI1. TIMINGRESULTSON iPSC/860 (3D) (4 NODES) GT-RT (4 nodes) Size 48 x 48 x 48 x 96 x
Row-Column (8 nodes) Size
Time (ms) 48 x 48 x 96 x 96 x
48 96 96 96
360 512 980 2029
64x 64 x 64 x 128 x
64x 64 x 128 x 128 x
Time (ms) 64 128 128 128
282 585 1152 2216
The RT Algorithms I and I1 show uniform decomposition of a 3D problem into subsets. The combination of RT algorithms with other fast algorithms will provide a highly scalable feature that can be matched to various degrees of parallelism and granularity of a parallel system. The RT algorithm partitions input data at the global level to match each subset into node processors, carrying out loading and reduction operations concurrently at each node; then FT computations are performed in parallel. In Tables XI and XII, timing results on the Intel iPSC/860 with 4- and 8-node implementations are given. The timing results of the next power of 2 sizes of Intel FFT library are also included for comparison. (Non-power of 2 routines are not available in the standard library.) The GT-RT algorithm I was implemented on the 4-node hypercube architecture. The periodization (reduction stage) is coded in standard Fortran, whereas the FFT and 3-point FT calls on the Kuck & Associates optimized assembly routines and our own vectorized 3-point FT routines, respectively. VIII. AFFINEGROUPCT FFT The global decompostion stage of a CT FFT algorithm computes pseudoperiodizations relative to a subgroup B of the indexing group A . In this section we present a CT FFT algorithm whose pseudoperiodizations
47
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
are taken relative to an abelian subgroup X c A f f ( A ) . In the classic case, X consists of pure translations. If Y is a subgroup of X , the CT FFT algorithm associated to X can easily be adopted to produce an FFT algorithm for Y-invariant data. The code which implements this CT FFT produces, by a process of disabling, Y-invariant FFT code for every subgroup Y of X. For applications, the choice of X is motivated by two factors. First, the code for the CT FFT associated to X should be simple to write, scalable, and efficient. Second, X should contain a large collection of subgroups of interest in applications. A . Extended CT FFT: Abeliun Point Group Choosef E L ( A ) and an abelian subgroup G of Aut(A).For y* the pseudoperiodizations fy* E L ( A ) by fy*(a)
=
c f(Y)(Y, Y*),
0
E
G* define
€A.
(62)
yeG
Since
o(G),
y = identity map,
otherwise,
y * E G*
we can write
We can compute F+f by computing the collection of FTs fy*,
y* E G*.
(65)
We have replaced a single FT computation by a collection of FT computations. However, the pseudoperiodizations satisfy the following group invariance property:
Theorem VIII.1.
For y*
E
G*,
f,*Ow)= ( Y , y*>f,*(a), F+fy*(y"(a))= ( Y ,
Y*>F+fT*(U),
U E A ,~ E G . a E A , Y E G.
We will say thatf, is G-invariant with character. The CT FFT associated to G decomposes the computation of F+f into a collection of FT computations on G-invariant with character data which can be implemented by simple modifications of the point group RT algorithm.
48
R. TOLIMIERI et al.
Suppose K is a subgroup of G . If we begin with a K-invariant data, we can reduce the number of FT computations. Set K,
= (y* E
G* : ( K , y*>
= 1,
for all
K E K).
(66)
K, is a subgroup of G* isomorphic to the character group (G/K)*. Choose a complete set of representatives of K-cosets in G Y o , Y1, Then every g
E
(67)
YL-I.
G can be written uniquely in the form y = KY/,
K
EK, 0
I < L.
I
(68)
Theorem VIII.2. I f f E L ( A ) is K-invariant then the pseudoperiodization f,. vanishes unless y* E K, .
Proof. L-1
f,*(a)
=
c c f(KY/a)(KY/,Y*)
/ = 0K E K L-l
=
c f~Yra)(rr,Y*>c
I=O
(K,Y*>
KEK
by K-invariance. Since C, ( K , y * > vanishes unless y* E K, , the proof of the theorem is complete. Code f o r the CT FFT algorithm associated to G applies to the computation of the FT of the K-invariant data, K < G , by disabling all the pseudoperiodizations corresponding to y* B K, .
B. CT FFT with Respect to Pmmm For p, p
E
Pmmm, p = p;'ppp;3,
T = p;1ppp:3,
define ( p , r*> = (-
1)rltl+r2t2+r3f3
Associate with the function f E L ( A ) , the column vector fo of length K = 8NML by listing f ( a l ,a 2 ,a,), antilexicographic ordering of (a, ,a 2 ,a,) E A . Also define the vectors f , , 0 Ij 5 7 by listing f(s,(a, , a 2 ,a3),in order of ( a , ,a 2 ,a,) E A . The generalized periodizations off with respect t o Pmmm can be implemented by the vector additions
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
where F(2) denotes the 2-point FT matrix, F(2)
=
[
1 - 1
49
'1
and I, is the K x K identity matrix. Crystallographic group P 2 (Henry and Lonsdale, 1952) is a subgroup of Pmmm. P2
=
(1,s24].
p2*
=
(
9
s 2 4 9 s32
9
s56)*
I f f € L(A) is P2-invariant, then four of the periodizations vanish. Each of the non-vanishing periodizations are Pmmm-invariant up to multiplication by k 1, and FT is computed with this invariance. Another crystallographic subgroup of Pmmm is P222: p222 P222,
= (
3
s24
9
s40 9 s481,
= ( 1, S 56).
For P222-invariant f , all the periodizations except f,; and f s f 6 vanish. Iff is Pmmm-invariant, then computation is carried out only for f S ; . C. Extended CT FFT: Abelian Affine Group
The discussion of Section A will be extended to abelian subgroups X of Aff(A) of the form X = B x K where B is a subgroup of A and K is a subgroup of Aut(A). The CT FFT algorithm associated to X combines features of the standard CT FFT associated to B and the abelian point group CT FFT associated to K . The pseudoperiodizations are now taken with respect t o the affine subgroup X . The motivation is to unify the writing of FT code for affine group invariant data.
50
R. TOLIMIERI et a/.
Choose an abelian subgroup X of A ff (A)of the form X = B x K. Then X * = B* x K * . We will usually write bk for (b,k) and b*k* for (b*,k*). Denote a complete set of B'-coset representatives by
z(b*) = 4i1(b*),
b* E B*.
For f E L ( A ) , define the pseudoperiodizations fx*
fx*(a)=
(70)
E L ( A ) ,x* E X*, by
a E A , x* E x*.
f(xa)<x,x*),
(71)
x EX
a
fx*(x(a))= (x,x*)fx*(a),
E
A , x*
E
x*
Since 1
c
f=-
(73)
fX*l
O ( X ) x* EX*
we can compute F+f by the collection of FT computations
F+fx*,
EX*.
X*
A direct computation shows that f,. satisfies the group invariance with character condition. In particular,
f,*(b Define g,
E
+ a) = ( b , b*)fx*(a),
b
E
B,X*
=
b*k* E X * .
(74)
L ( A ) ,x* E X*,by
a
g,*(a) = fx*(a)(a,4(z(b*))),
E
A , x*
=
b*k*.
(75)
g,. is B-invariant and can be viewed as a function in L ( A / B ) .
Theorem VIII.3. and we have
For x*
=
b*k*
F+fx*(z(b*)+ b')
=
E
X*,F+fx* vanishes off of z(b*) + B'
o(B)F+lgxo(b'), 6'
E B'.
Proof. Choose a complete system of representatives for the B-cosets in A
m ,, ...,m,. Setting
c
=
~ ( b f+) b',
a = mj
in
+ b,
bf
E
B*, b' E B'
1 Ij s J , b E B ,
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
51
we have, applying Eq. (74),
which vanishes unless b: = b*, proving F+f vanishes off of z(b*) + B I . Then by Theorem IV.2, J
F+f(z(b*)+ b') = o(B) C g,*(mj)(mj, d(b*)), j= 1
completing the proof of the theorem. For b* E B* define S(b*) = ( g b t k s : k* E K * ] . By Theorem VIII.3,
(76)
b* E B ' ,
(77)
which implies that F+f on the coset z(b*) + B',
b* E B * ,
is determined by the induced FT of functions in S(b*). The pseudoperiodization operations introduce data redundancies which we will now describe. Set C = A / B . K acts by the identity mapping on B and induces a group of automorphisms of C denoted also by K . For b* E B* and k E K , there exists a unique C b * ( k ) E B' such that
Theorem VIII.4. For x*
=
b*K* E X * and
K E
K,
52
R. TOLIMIER1 et
al.
Proof. By Eqs. (72), ( 7 9 , and (78) g,*(K(a)) =
(K,
K*>(K(a),4(z(b*))>fx*(a), a E A ,
= (K,
K*>(a,W # ( z ( b * ) ) ) > f x * ( a )
= (K,
K*)F&, 4(Cb*(K)))gx*(a).
K EK
The second statement can be proved by usual arguments. A modified RT algorithm can be applied to the induced FT computations. For a subgroup Y of X,set
Y*
=
(x* E X * :( y , x * ) = 1, for ally
E
Y).
(83)
Arguing as in Theorem VIII.2, we have the following theorem:
Theorem VIII.5. I f X is a subgroup o f A f f ( A )and Y is a subgroup of X, then f o r Y-invariant f E L ( A ) , the pseudoperiodizations f x * , x* E X* vanishes unless x* E Y, . Affine group CT FFT code for X can be used to compute the FT of Y-invariant data, for any subgroup Y of X . In several important applications, the group X can be chosen such that the corresponding CT FFT algorithm can be implemented by simple 1D routines, while more complicated code is required for a direct implementation of the FT of Y-invariant data Y.
D. CT FFT with Respect to Fmmm We will continue with the notations established in Example 11.4:
Fmmm
=
B x Pmmm.
We will use the B-periodization computation of Example V.7 as the first stage of the two-stage pseudoperiodizations with respect to Fmmm. Recall the ordering of the elements of Fmmm given in Example 11.4: B
= (SO, sl,s 2 ,
Pmmm
=
Fmmm
= (SS/+k:
s3
9
Iso,ss,s16, $ 4 ,
s4,
s5, s6, s 7 ) ,
s32, 3 4 0 , s48,
~
~
~
0 5 k, 1 571.
For (ala2a3) E A , observe that
~ ~ ~ ( a ~ = , sas /~+,/ (aa l~, a)z , a 3+)s I ,
~ E B .
1
,
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
53
In Example V.7, periodizations 0 5 15 7
fb,\
are made on the collection of B-coset representatives
c = ( ( a l , a , , a , ) : O I a i I N ; , i =1 , 2 , 3 ) . 7
f,?+,(a) =
7
c c c c
f(S6na
-k S m ) ( S m v
Sk*)(s8n9 $/)
n = O m=O
7
=
fb;f(s6na)(s6n
9
$/)
n=O
7
=
fbtf(s8n+na)<sn
9
bz)(s6n3
s&>*
n=O
CT FFT with respect to F m m m was implemented on a Sun4 station (Abdelatif, 1994).
IX. INCORPORATING1D SYMMETRIES IN FFT We have developed various FFT algorithms incorporating certain 1D symmetry. In this appendix, we give an example of incorporating invariance conditions in data without giving up the use of highly efficient FFT routines. Set A = Z/N, for a natural number N . For f E L ( A ) , the invariance conditions we will consider here are
f ( a ) = * f ( - a).
(84)
An efficient algorithm was given by Cooley et al. (1970) and Rabiner (1979) which reduced the computation to that for an N/2-point FFT with preprocessing and postprocessing. The procedures are summarized as follows:
(a) Compute N/4- I
c
V(0) = 2
f(2a
+ 1).
a=O
(b) For a
=
1,2, ..., N/4 - 1 , formulate the sequence g(a) as g(a)
=f(24
+ [fW + 1) - f(2a
g(N/2 - a) =f(2a) - [f(2a g(0) = f ( O ) , g(N/4)
= f(N/2).
+ 1) -f(2a
-
1)1,
- l)],
54
R. TOLIMIER1 el a/.
(c) Take the N/2-point FFT of g(a); call this result G(b). (d) Form two sequences b
U(b) = Re[G(b)],
V(b) = (e) For b
=
Im [G(b)l 2 sin(2nb/N) '
0, 1,2, ..., N / 4 ,
b = 1 , 2) . . . )N / 4 - 1.
1,2, . . . , N / 4 , the transformed data sequence F(b) is given as
=
F(b) = U(b) + V(b), F(N/2
-
b) = U(b) - V(b),
F(0) = U(0) + V(O),
F(N/2)
=
U(0) - V(0).
Notice that in step (d), the computation involves division by {sin(2nb/N)J.This may case a stability problem for large size N . We summarize here an algorithm proposed in Lu and Tolimieri (1992) to overcome the stability problem. (a) Form two sequences h(a) = f ( a )
+ f(N/2
g(a) = [ f ( a )- f(N/2
a = 0, 1,2,
- a), -
a)[ cos(2na/N),
..., N / 4 , a = 0, 1,2, . . ., N/4,
where both h(a) and g(a) have invariance conditions. (b) Take the N/2-point (half size) symmetric FT of h(a) and g(a). (c) The transformed data sequence F(b) is given as F(2b) = H(b),
b
=
0, 1,2, ..., N / 4 - 1,
F(1) = G(O), F(2b
+ 1) = 2G(b) - F(2b - l),
b = 1, 2,
...,N/4
-
1.
This algorithm can be recursively used for transform size of N = 2'" or > 1 and I is an odd number. In step (a), multiplications by (cos(2na/N)) are required to formulate g(a). If, however, n is twice an odd number, then an alternative procedure, based on the Good-Thomas prime factor algorithm (Good, 1958; Thomas, 1963), can be used to avoid these multiplications. In this case, n = 2ml, where rn
GROUP INVARIANT FOURIER TRANSFORM ALGORITHMS
55
the computational procedures can be stated as (a) Take the N/2-point (half size) symmetric FFT of fl(a)= f(2a) and f 2 ( a )= f(N/2 + 2a); call them F,(b) and F2(b)respectively. (b) For b = 0, 1,2, ..., (N/2 - 1)/2, the transformed data sequence F(b) is given as F(2b) = F(N - 2b) = FI(2b) + F,(2b), F(N/2
+ 26) = F(N/2
- 26) =
FI(2b) - F2(26).
If the data is real, the same algorithm can be used with half size real FFTs. The saving in FFT computation will be approximately 50% in comparison with complex data. REFERENCES Abdelatif, Y. (1994). Periodization and Decimation for FFTs and crystallographic FFTs. Ph.D Thesis, CCNY, CUNY. An, M., Gertner, I . , Rofheart, M.. and Tolimieri, R. (1991). Discrete fast Fourier transform algorithms: A tutorial survey. In “Advances in Electronics and Electron Physics” (P. Hawkes, Ed.), Vol. 80. Academic Press, New York. An, M., Cooley, J . W., and Tolimieri, R. (1990). Factorization method for crystallographic Fourier transforms, A d v . Appl. Math. 11, 358-371. An, M., Lu, C., Prince, E., and Tolimieri, R. (1992a). Fast Fourier transform algorithms of real and symmetric data. Acta Cryst. A48, 415-418. An, M., Lu, E., Prince, E., and Tolimieri, R. (1992b). Fast Fourier transforms for space groups containing rotation axes of order three and higher. Acta Cryst. A48, 346-349. Anupindi, N., and Prabhu, K. M. (1990). Split-radix FHT algorithm for real-symmetric data. Electron. Leu. 26, 1973-1975. Bricogne, G. (1974). Geometric sources of redundency in intensity data and their use of phase determination. Acta Cryst. A30, 395-405. Bricogne, G., and Tolimieri, R. (1990). Symmeterized FFT Algorithms. “The IMA Volumes in Mathematics and Its Applications,” Vol. 23. Springer-Verlag, New York/Berlin. Burrus, C. S. (1977). Index mappings for multidimensional formulation of the DFT and convolution. IEEE Trans. ASSP ASSP-25, 239-242. Cooley, J . W., Lewis, P. A,, and Welch, P . D. (1970). The fast Fourier transform algorithms: programming considerations in the calculation of sine, cosine and Laplace transforms. J . Sound Vib. 12, 315-337. Gertner, I. (1988). A new efficient algorithm to compute the two-dimensional discrete Fourier transform. IEEE Trans. ASSP 37(7), 1036-1050. Good, I . J . (1958). The interaction algorithm and practical Fourier analysis. J . R . Statis. SOC. B. 20(2), 000-000. Henry, N. F. M., and Londsdale, K. (ed.) (1952). “International Tables for X-Ray Crystallography,” Vol. I. The Kynoch Press, England. Kechriotis, G . , An, M., Bletsas, M., Manolakos, E., and Tolimieri, R. (1993). A hybrid approach for computing multidimensional DFTs on parallel machines and its implementation on the iPSC/860 hypercube. IEEE Trans. Signal Proc. 00, 000-000.
56
R. TOLIMIERI et a / .
Lu, C., and Tolimieri, R. (1992). New algorithms for the FFT computation of symmetric and translational complex conjugate sequences. Proc. IEEE 1992 Int. Conf. ASSP, 23-26. Rabiner, L. (1979). On the use of symmetry in FFT computation. IEEE Trans. ASSSP, ASSSP-27, 000-OOO. Ten Eyck, L. F. (1973). Crystallographic fast Fourier transforms, ACTA Crystullogr. Sect. A 29, 183-191. Thomas, L. H. (1963). “Using a Computer to Solve Problems in Physics, Application of Digital Computers.” Ginn, Waltham, MA. Tolimieri, R., An, M., and Lu, C. (1993). “Mathematics of Multidimensional Fourier Transform Algorithms.” Springer-Verlag, New York/Berlin. Tolimieri, R., An, M., and Lu, C. (1989). “Algorithms for Discrete Fourier Transform and Convolutions.” Springer-Verlag, New York/Berlin.
ADVANCES IN IMAGING A N D ELECTRON PHYSICS. VOL. 93
Crystal-Aperture STEM JACOBUS T. FOURIE Division of Materials Science and Technology, CSIR. Pretoria, South Africa
I. Introduction . . . . . . . . . . . . . . 11. Theoretical Considerations and Experimental Evidence
. . . . . . . . . . . . . . . . .
A. Strong Absorption of Electron Waves and the Nature of Transmitted Radiation B. Crystal-Aperture Optical Systems of Atomic Dimensions . . . . . . . C. Predictions on Zone Axis Patterns from Electron-Ray Simulation . . . . D. Atomic Structure of Zone Axis Tunnels through a (110) Foil . . . . . . E. Electron-Source Requirements and the Virtual Source in [3 101 Field-Emission F. Auto-Magnification Effects in Direct Imaging of the Nucleus . . . . . . I l l . Experimental Results in Imaging . . . . . . . . . . . . . . . . A. Experimental Method in Crystal-Aperture STEM . . . . . . . . . . B. Improved Resolution in Crystal-Aperture STEM . . . . . . . . . . C. Imaging of Single Adatoms of Gold . . . . . . . . . . . . . . D. Imaging of Subatomic Detail . . . . . . . . . . . . . . . . IV. Summary and Conclusions . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . .
57 59 59 63 66 73 79 87 90 90 91 94 100
106 107
I . INTRODUCTION The science of electron microscopy has progressed, in terms of resolution, by about one order of magnitude since the middle 1940s. For example, the RCA EMU commercial electron microscope of that period, as described by Hall (1953), provided a resolution of slightly less than 2 nm, whereas modern microscopes can resolve about 0.1 nm. The method of crystal-aperture scanning transmission electron microscopy (STEM) is directed toward the obtaining of resolutions that are considerably better than the present optimum level. To this end, an attempt has been made to obtain images under conditions where electron optical diffraction would be absent. Under such circumstances, the incident aperture could be reduced to obtain minimal spherical and chromatic aberration, without incurring the usual diffraction broadening associated with a reduction in the magnitude of the aperture. At this point it should be stated emphatically, that there is no intention, within this chapter, to call into question the validity of the Heisenberg uncertainty principle which forms the basis of diffraction effects. Instead, the exploring of the crystal-aperture STEM method is simply an empirical procedure to establish whether, along 57
Copyright 6 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
58
JACOBUS T. FOURIE
crystal zone-axis directions in thin foils less than 20 nm in thickness and within atomic size optical systems, conditions might exist under which the diffraction effects might be absent. This empirical approach is guided by the information already known about electron wave propagation through crystal lattices, as discussed, for example, by Whelan (1979). Thus, the dynamical theory of electron diffraction dictates that for s < 0, where s is the diffraction error, the incident wave is strongly absorbed (Whelan, 1979). Consequently, when a high-intensity transmission of electrons occurs in spite of conditions existing where s < 0, such as in the centers of (110) or (100) zone axis patterns (ZAPs), a strong probability may exist that the particle nature of the electron could become a dominant factor in the transmission of radiation. A further aspect to consider, and one which may have an influence, is that the crystalaperture optical system is of near atomic size and thus of a magnitude which is many orders smaller than standard systems. The following aspects of imaging through a crystal aperture by STEM are carefully considered within this chapter: (i) The bent-foil ZAP forms the basis of the practical application of the crystal-aperture STEM method. In such patterns, the condition s < 0 exists for the sets of reflection planes involved in producing the configuration. Thus, in the present chapter, a detailed analysis will be made of ZAPs within the region where s < 0. This will be done in terms of computer-simulated straight line trajectories, as well as in experiemental electron microscopy at 100 to 200 kV. (ii) The cold field-emission electron source is considered at length and particular attention is given to field-emission along the [310] axis in tungsten. The tips used in commercial field-emission guns have this axial orientation and are positioned so that the tip axis coincides with the optical axis of the microscope. Attention is drawn in the discussion to the unique aspect of [310] emission, which is at least an order of magnitude brighter than along any other crystal axis in bodycentered-cubic (bcc) tungsten. It is pointed out, firstly, that this bright emission is incorporated within a finite electron current which is largely paraxial. Hence, this current may be related to a virtual point source at infinity, where the virtual current density of that point source would tend toward infinity. These aspects of the source are of cardinal importance in explaining the subatomic resolution that is shown in the results. Secondly, the paraxial nature of the radiation is essential for focusing an adequate electron current into the small aperture which the zone axis tunnel presents to the incoming focussed cone of rays, which, at focus, will form the probe. Thirdly, an analysis of the paraxial state of the radiation suggests that the system would
CRY STAL-APERTURE STEM
59
be insensitive to transverse or longitudinal vibrations of the source relative to the objective lens. Lastly, it follows from the crystalaperture configuration that the spherical aberration error of the objective lens would be reduced considerably by the narrow aperture, that is, as the cube of the aperture, whereas the chromatic aberration error would be reduced in direct proportion to that aperture. (iii) Experiments on the imaging by crystal-aperture STEM of thin gold deposits on ( 1 10) copper foils are discussed. These experiments involve the imaging of gold particles by means of STEM systems based on three different types of electron source, namely, heated tungsten, heated lanthanum hexaboride and a cold field-emission, [310] orientated, single crystal tungsten tip. In the discussion of these experiments, it is pointed out that the heated tungsten source, because of a lack of brightness, showed a reduced resolution when used in the crystal-aperture mode. On the other hand, the lanthanum hexaboride source, which is 10 times brighter than the tungsten source, produced images where the resolution was improved over that normally obtainable on the instrument. For the cold field-emisison STEM source, a greatly improved resolution was demonstrated when the related STEM machine was used in the crystal-aperture mode. Furthermore, there were strong experimental indications that resolutions better than 0.01 nm are obtainable. Consequently, the possibility exists, not only of imaging the positions of single adatoms on surfaces but also of resolving the structure within a given adatom. (iv) In the final application of the method to be discussed in this chapter, an attempt was made to resolve the structure within the gold atom itself. The relevant images were obtained at an instrument magnification of lo’, where the scan line density in object space was sufficiently high to allow the resolution of structure within the atom. These images suggest that a hexagonal, orbitlike structure is present in the gold atom, and this conclusion could be confirmed by the Fourier transform of a digitized image of that atom.
11. THEORETICAL CONSIDERATIONS AND EXPERIMENTAL EVIDENCE
A . Strong Absorption of Electron Waves and the Nature of Transmitted Radiation In this section, the basis of the present method is considered, and for this purpose, the transmission of electrons through a crystal lattice is analyzed. A clear distinction is made between situations where electrons are expected to demonstrate wave properties and where their particle properties
60
JACOBUS T. FOURIE
would be dominant. Firstly, the origin of bend extinction contours and related bent-foil ZAPS are considered. Secondly, the likelihood of electron transmission, as particles, through the center of zone axis tunnels is discussed. The well-known bend extinction contour, relevant to a single set of diffraction planes, has been analyzed (Whelan, 1979) in terms of the dynamical theory of electron diffraction. For further discussion it is necessary to define the expression w = s&, where s is the diffraction error, and $I is the extinction distance relating to the diffraction vector, g, of a set of ( h k l ) reflecting planes. In regard to the extinction-contour, it is significant that the dynamical theory predicts that this phenomenon will occur where w is negative. Thus, if the incident direction of the electron wave on the crystal is such that this condition is met, the Bloch wave that is most strongly absorbed is also the one that is excited predominantly. To interpret the appearance of the bend extinction-contour, a pair of rocking curves are placed back-toback (Whelan, 1979; Fourie and Terblanchk, 1992), as in Fig. 1. It is clear from this figure (Fourie and Terblanchk, 1992) that the maximum absorption of scattered waves would occur around the direction where 6 = 0 for a given set of reflection planes. Here, 6 is the angle between the incident direction and the reflection planes. A further classification of the problem is obtained by a consideration of the electron ray diagram in Fig. 2. The argument there is particularly significant in regard to the electron particle model that forms the basis of the present method. With reference to Fig. 2, then, the vertical lines are envisaged to represent a set of (200) reflecting planes in copper. The rays C’R’and CR are incident at exactly the Bragg angle, BB, which at 100 k V is 10.2 mrad, and for which direction, w = 0. Referring to Fig. 1, it will be noted that the transmitted intensity at 8, is considerable, but that, for B = 5.0mrad, where w = -0.5, this intensity is about zero. Similarly, for
>-,:
;
e
c (I)
s o
3
2
1
294
230
166
0 -1-16-1 1 0 2 ~ 1 ~ 3o 8 38 8 (mrad)
0
1
ioz(e,ps
2 250
~ 294
3
FIGURE1. Rocking curves based on the dynamical theory of electron diffraction. The curves are placed back-to-back for the purpose of representing bend extinction-contours in a crystal foil. Courtesy Fourie and Terblanche (1992).
CRYSTAL-APERTURE STEM
61
‘q \
R’
/
FIGURE2. A representation of directions within a cone of electron rays incident on a copper crystal, where the top surface has a {OOl) orientation.
6 = 3.8 mrad and w = -1, the intensity is about zero. The latter incident direction would correspond approximately with that of the rays B‘R’ and BR, whereas for rays A‘R’ and AR, 0 = 0 and w assumes the maximum negative value of - 1.6. Here, also, the transmitted intensity is about zero. The regions of incident angles between B’R’ and A’R’, on the left, and from BR to AR, on the right, correspond to regions where the strongly absorbed Bloch wave is primarily excited, as argued earlier. On the other hand, for incident directions D’R’ to C’R’ and DR to CR (in Fig. 2), the strongly transmitted Bloch wave is primarily excited. It ensues from Fig. 1 that the transmitted intensity reaches a maximum within these latter angular regions or at the position where w = 0.5. If the cone of rays O’PO’’ in Fig. 2 is considered, it is obvious that the ray directions within that cone would fall within the low-intensity transmittance regions of B’R’A’ and BRA, as discused previously. However, this prediction of low transmittance, on the basis of electron wave theory, apparently does not hold for the centers of bent-foil zone axis patterns, for zone axes such as (110). This situation exists even though the ray directions there would coincide with those within O’PO’’ and even though the bent foil ZAP is a combination of bend extinction contours related to a number of sets of reflection planes, such as the (002), ( l i l ) , and (711) planes for a [110] ZAP. Experimentally it is found that the brightest
62
JACOBUS T. FOURIE
FIGURE3 . A bent-foil ( 1 10) zone axis pattern in a copper single crystal foil, covered in gold particles on one side.
transmittance occurs, in fact, exactly along the ( 1 10) zone axis, as is clearly demonstrated in Fig. 3 for a ( 1 10) ZAP. On the basis of the arguments and experimental observations presented earlier, the following assumptions are made, which, within the experiments of crystal-aperture STEM, are shown,empirically, to be valid for the results obtained. Firstly, if the electrons with incident directions corresponding to the cone O’PO” exhibited a wave nature, they would be strongly absorbed by the crystal lattic, and, hence, the transmitted intensity would approach zero, as in Fig. 1, for those incident directions. Secondly, if the electrons, exclusively, exhibited particle properties, they would be transmitted with maximum intensity through the zone axis tunnel. Thirdly, for such electrons of an exclusively particle nature, there would be no manifestation, within the atom-size zone axis tunnel, of those electron-optical diffraction phenomena which are normally observed in the focusing of electron beams. On the basis of these assumptions, then, the following empirical conclusion can be made: within the crystal-aperture formed by a zone axis tunnel, electron optical conditions would be of such a nature that the point focusing of electrons originating from a point source might be approached. This achievement would be made possible by the absence of diffraction combined with the smallness, in size and angle, of the crystal aperture, which, thus, would minimize the spherical and chromatic aberration errors.
CRYSTAL-APERTURE STEM
63
B. Crystal-Aperture Optical Systems of Atomic Dimensions The method of crystal-aperture STEM differs markedly from other more conventional modes of imaging in three main aspects. The first is the fact that the final objective aperture is a zone axis tunnel within a crystal. The second is the fact that the sample, an atom, is mounted (or, more specifically, adsorbed) on the bottom surface of the crystal and in the center of the aperture (or zone axis tunnel). The third is the fact that the volume of the final aperture system is about 16 orders of magnitude smaller than that of a conventional STEM system. These three aspects are discussed in this section. A brief consideration of the zone axis tunnel is presented here, with reference to Fig. 4. The detailed structure will be discussed further in Sections II,D and II,E. For the zone axis tunnel there are two options, as shown in Figs. 4a and 4b. In Fig 4a, filled circles indicate copper atoms
FIGURE4. A simplified representation of a crystal-aperture, of magnitude a,, in the form of a zone axis tunnel. (a) In a copper foil of thickness i,, with a gold atom (open circle) adsorbed in the mouth of the tunnel; (b) in a copper foil of thickness f,, coincident with an equivalent tunnel, in a gold particle of thickness f,.
64
JACOBUS T. FOURIE
which line the tunnel for the full thickness, t,, of the foil. Within the exit mouth of this tunnel, a gold atom (open circle) has been adsorbed in a stacking fault position. The incident beam is limited to an aperture of a,by the zone axis tunnel, and is focused exactly on the adsorbed gold atom at the exit end of the tunnel. In Fig. 4b, a similar situation is depicted, except that, now, the presence of a thin, epitaxially-grown, gold particle of thickness t,, is present on the exit surface of the copper foil. The diagram portrays a position where the lattices of copper and gold, which differ parameter-wise, are in phase. Thus, the zone axis tunnel in the copper foil is extended by an additional three atomic spacings by the gold crystal. The atom upon which the beam is focused, in Fig. 4b, is a gold atom which is presumed to have been adsorbed on the exit surface of the gold crystal during the process of vapor-deposition of gold onto the copper foil, as discussed in detail in Section II1,A. As in Fig. 4a, the atom was adsorbed in a stacking fault position in the center of the zone axis tunnel. The optical characteristics and volume of a standard electron-optical system will now be compared with that of the crystal-aperture STEM system. In Fig. 5a is shown the classic broadening of a parallel beam of particles (electrons for example) which has been directed to pass through a slit of A y . According to the Heisenberg principle, the individual electron will undergo upward or downward deflection at the slit. Thus, it will acquire component momentum, perpendicular to its original direction of flight, of amount A p , with the resultant momentum, p , remaining constant. The well-known Heisenberg relation A p Ay I h , where h is Planck’s constant, is then valid. This process may be described as the diffraction of electrons at a slit. The electron-optical system, of atomic dimensions, used by Fourie (1992b, 1993) is shown diagrammatically in Fig. 5b. The design criteria of this system has been discussed in detail by Fourie (1993). The essential elements for subatomic resolution are (i) the cold field-emission electron source forms a virtual source of vanishing dimensions, as discussed later; (ii) the final probe formation occurs within a (1 10) zone axis tunnel of gold; and (iii) the adatom requiring study is placed centrally, at S, within the exit mouth of the zone tunnel, a position which must coincide with the image plane of the STEM system. It is noted from Fig. 5b that the standard optics of the field-emission STEM system is envisaged to focus a beam (outer cone unhatched, inner cone hatched) onto the crystal, with the focal point at S, a position which coincides with the exit surface of the crystal and the sample position. For the purpose of subatomic resolution, it is probably necessary that the diameter of the standard beam at the entrance surface, E, does not exceed 0.4 nm. The zone axis tunnel, with an effective apertureopening Ay, will then select the central (hatched) cone of the beam from the standard beam. Since the aperture involved will be about 1 mrad, the
65
CRYSTAL-APERTURE STEM
-
Lb
-
FIGURE5 . (a) Diffraction broadening of a beam of rays through a rectangular slit; (b) the suggested electron-ray paths through a crystal-aperture, in the absence of diffraction phenomena.
objective lens will be able to focus the beam to a spot of subatomic dimensions at S. Thus, focusing is still performed by the lens and it is not believed that the zone axis tunnel is involved in the focusing process. The only function of the tunnel is in providing an aperture for the objective lens. The combination of components involved in this last event of the focusing process may be seen as an optical system which contains an aperture, Ay, and an image plane at a distance, L b , from the aperture, where the sample is situated. The total volume of this ultramicro optical system would be Vb = (Ay)’Lb. Similarly the volume of the macroscopic system in Fig. 5a would be V, = (Ay)’L,. For standard STEM systems, as in Fig. 5a, the objective aperture t o focal plane distance would be about La = 10 mm, and the aperture opening, Ay, about 0.02mm, from which it follows that V, = 4 x mm3. However, for Fig. l b, where the (110) tunnel width
66
JACOBUS T. FOURIE
for copper is about 1.8 x lo-’ mm and L b , the foil thickness, is about mm, Vb = 3 x mm3. Thus & / V , = 8 x lo-’’. The latternumber emphasizes the smallness of the crystal-aperture optical system and its difference in volume relative to standard macroscopic systems by 16 orders of magnitude. As discussed earlier, the method requires that diffraction phenomena be absent within the final probe formation. The following are unique aspects of that system that either individually, or in combination, may be responsible for the absence of diffraction: (i) The crystal aperture and the sample are coherently connected by a single crystal atomic lattice. Thus, the sample is adsorbed in a stacking fault site on the exit surface of the crystal, whereas the aperture is situated on the entrance surface of an underlying, epitaxially related zone axis tunnel of copper which leads into a coincident tunnel of the adsorbent gold crystal. (ii) The volume size of the crystal aperture system is 16 orders of magnitude smaller than standard systems. (iii) The vanishingly small virtual source aspect of the cold field-emission tip probably only begins to have significance within the crystalaperture system, as in Fig. 4(b). Within standard systems, the aberrations probably override any benefits which otherwise might be associated with the virtual source concept. C. Predictions on Zone Axis Patterns From Electron-Ray Simulation In Section II,A a case was made for regarding the electron transmission through the centers of ZAPS to be essentially that of particles, with the transmission of waves being suppressed by strong absorption along the zone axis. In the present section, a two-dimensional arrangement of atoms along a [loo] zone axis is considered, to establish what configuration of pattern may be expected when the electron interaction with the atomic structure is of a purely particle nature. Thus the simplified model is, first, described together with the method of computer simulation. Second, results of the computer simulation are presented in graphical form and an indication is given of the qualitative configuration of the patterns that might be expected. Third, experimental observations (Fourie, 1992a) on ZAPS are presented, which appear t o confirm the predictions. For the computer simulation (Fourie and TerblanchC, 1992) of the rectilinear transmission of electrons through a crystal, a two-dimensional lattice as shown in Fig. 6 is used. Here, the (100) plane of the face-centered cubic (fcc) copper lattice is shown, and the plane of the figure bisects the
67
CRYSTAL-APERTURE STEM
3'
2'
M' 1'1
0
0
0 0
0
0
0 0 0 0
0 0
0
n'
c ' o 0'0 B' -9
1
D'
/ \
-
9
FIGURE6 . A diagrammatic representation of the proposed rectilinear paths of electrons through the fcc crystal lattice of a thin foil of copper. Dark circles indicate atoms involved in electron-atom encounters. Courtesy Fourie and Terblancht (1992).
atoms in that (100) plane. The (020) planes are perpendicular to the plane of the figure and are assumed to bisect the atoms along the columns AA', BB', etc. The optical axis of the probe, 00', is assumed to be parallel to [OOl], to coincide with the plane of the figure and to be positioned at 1/4a with respect to the column of atoms DD'. That is, the optical axis is positioned symmetrically between the atom columns DD' and EE'. Under these conditions, the probe will be bisected by the plane of the figure. The outermost rays within this section of the probe are 3 ' 0 ' and 30', as shown in Fig. 6.
68
JACOBUS T. FOURIE
In the simplified model which was used for the simulation (Fourie and Terblanche, 1992), it was assumed that the interaction of the probe rays within the section 3'O'3 with the bisected atoms within the plane of the figure may be equated with the interaction of the equivalent probe rays in the (020) plane with the atoms in the (020) planes. That is, the two-dimensional situation depicted in Fig. 6 was assumed to be comparable with the three-dimensional situation, where the (020) planes would extend above and below the plane of the diagram and where the probe rays would form a solid cone. For the computer simulation, the ray OO' is tilted from the direction it occupies parallel to [001] through prescribed angular increments, e.g., δθ1, δθ2, and δθ3, to positions 1, 2, and 3, respectively, as in Fig. 6. At every position the number of encounters with atoms is recorded. Clearly, the number of encounters will be a function of θ, r, and t, where θ is the angular position of the ray, r is the atomic radius, and t is the foil thickness. Note that there would be no encounters with atoms for rays falling only within the central section, M'O'M, for r and t as in Fig. 6. It is clear that a variation in r in the simulation is equivalent to a variation in V, the accelerating voltage. This follows from the formulae for the elastic cross section of atoms in relation to the electron velocity (see, for example, Reimer (1984), pp. 21 and 150), from which the relationship r ∝ V^(-1/2) may be deduced. The results obtained from the computer simulation will now be discussed.

In Fig. 7, for constant r = 0.025 nm, the number of encounters as a function of θ is plotted for foils of different thicknesses, where t = 25, 50, and 120 nm for curves A, B, and C, respectively. The results show that for thin crystals, such as for curve A, there is a wide angular region around the zone axis where no electron-atom interaction occurs, that the peaks of encounters are widely spaced, and that there is little contrast between the zero-encounter regions and the peaks of encounter. For thicker crystals, such as curve B or C, the region of zero encounters contracts, the peaks of encounters lie closer together, and the contrast between the zero-encounter region and the peaks increases. In Fig. 8, curves for constant t = 50 nm and for r = r1, r2, and r3, where r1 = 0.025 nm (curve A), r2 = 0.050 nm (curve B), and r3 = 0.075 nm (curve C), are plotted. On the basis of the formula r ∝ V^(-1/2), it follows that r1/r2 = (V2/V1)^(1/2). Thus, for the values given, it follows that r1/r2 = 0.5, and thus that V2/V1 = 0.25. Thus if V1 is set equal to 200 kV, V2 would be 50 kV. It is clear from Fig. 8, therefore, that the diameter of the central, encounter-free region would increase with decreasing atomic radius, or equivalently, with increasing voltage.

It follows that, in terms of the electron-ray or particle model, the results in Figs. 7 and 8 provide definite predictions on how the central bright region of the ZAP would react to variations in foil thickness or in the accelerating voltage.
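The encounter-counting procedure lends itself to a very compact numerical sketch. The following Python fragment is not the published simulation code; the column spacing (taken as the Cu (020) spacing, a/2 ≈ 0.18 nm), the assumption that atoms repeat with that same spacing along the columns, and the quarter-spacing offset of the probe axis are assumptions made here only to illustrate the counting principle.

    import numpy as np

    def encounters(theta_mrad, r, t, d=0.18075, x0=None):
        """Count atoms that a straight ray passes within r of (all lengths in nm).

        Toy geometry: atom columns at x = n*d, atoms stacked every d in depth
        through a foil of thickness t; the ray starts at x0 (a quarter of the
        column spacing off a column, as for the axis OO' in Fig. 6) and is
        tilted by theta from the column direction [001]."""
        theta = theta_mrad * 1e-3
        if x0 is None:
            x0 = d / 4.0
        depths = np.arange(0.0, t, d)             # atom depths along the columns
        x_ray = x0 + depths * np.tan(theta)       # lateral ray position at each depth
        nearest_col = np.round(x_ray / d) * d     # nearest atom column at that depth
        return int(np.sum(np.abs(x_ray - nearest_col) < r))

    thetas = np.linspace(-15.0, 15.0, 301)        # mrad, as in Figs. 7 and 8
    for t in (25.0, 50.0, 120.0):                 # nm, curves A, B, C of Fig. 7
        curve = [encounters(th, r=0.025, t=t) for th in thetas]
        print(f"t = {t:5.1f} nm: peak encounter count = {max(curve)}")

    # The voltage dependence enters only through r ~ V**(-1/2):
    r1, r2 = 0.025, 0.050
    print("V2/V1 =", (r1 / r2) ** 2)              # 0.25, i.e. 200 kV -> 50 kV

Even in this toy geometry the same qualitative trends appear: the central zero-encounter window narrows as t increases and widens as r (i.e., decreasing voltage) is reduced.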
FIGURE 7. The effect of the thickness, t, in computer-simulated electron-atom encounters, as a function of θ, for the (020) atomic column model, for the atomic radius r = 0.025 nm. For curves A, B, and C, respectively, t = 25, 50, and 120 nm. Courtesy Fourie and Terblanche (1992).
Subsequently, it was possible to carry out experiments which confirmed these predictions. These experiments will now be discussed. For the purpose of obtaining real-space ZAPs in transmission electron microscopy (TEM), it is necessary that dome- or cup-shaped dimples should be present in the foil (Reimer, 1984). If these dimples are not present from a chance bending of the foil, the foil may be purposely deformed by a slight plastic bending, in order to introduce such dimples. Thus, the (110) bent-foil ZAP could easily be obtained and then used in assessing the conclusions drawn from theory. In this regard it was necessary to obtain ZAPs at various t's and V's. This was achieved, for t, by tilting the sample around an appropriate crystallographic direction at constant V, causing the ZAP to shift to either larger or smaller t. For altering V at constant t, a ZAP at an appropriate t was photographed, and then, while maintaining the tilt and position of the sample constant, V was altered before taking another photograph.
FIGURE 8. The effect of the atomic radius, r, in computer-simulated electron-atom encounters, as a function of θ, for the (020) atomic column model, for t = 50 nm. For curves A, B, and C, respectively, r = 0.025, 0.050, and 0.075 nm. Courtesy Fourie and Terblanche (1992).
In the following discussion, observations in TEM on zone-axis patterns will be described in which the variation of such patterns, as a function of t or V, was recorded. These results will be discussed concerning their significance in relation to the crystal-aperture STEM method. A good example of the effect of t on the diameter of the central bright region of a ZAP is shown in Figs. 9a and 9b. Here, (110) ZAPs were recorded at 200 kV for t approximately equal to 30 and 50 nm, respectively. The corresponding microdensitometer traces taken along X'X are shown in Figs. 10a and 10b, respectively. These curves should be compared with the theoretical curves in Fig. 7, where it should be noted that the condition of zero encounters represents the highest intensity and thus that the curves are inverted with respect to Figs. 10a and 10b. Thus the diameter of the central bright region, in an experimental study, is seen to decrease with increasing thickness. That is, with reference to Figs. 10a and 10b, the FWHM (full width at half maximum) of the central peak decreases from Fig. 10a to Fig. 10b. Also, the ratio of the central peak height to that of the neighboring peak height increases with increasing thickness. This observation is clearly supported by the same trend in the corresponding ratio of the theoretical curves B and C in Fig. 7.
FIGURE 9. (a) A (110) ZAP at 200 kV in a foil of thickness, t1, estimated at about 30 nm; (b) a (110) ZAP at 200 kV in a position near to that of Fig. 9a, but where the foil thickness, t2, was estimated to be about 50 nm. Courtesy Fourie (1992a).
The variation of the central peak diameter with a variation in voltage can clearly be assessed from the experimental images in Figs. 11a and 11b, where the (110) ZAP was photographed under identical orientation conditions for accelerating voltages of 150 and 200 kV, respectively. From these images it is clear that the diameter of the central bright region has increased markedly with increased voltage. This result should be compared with the theoretical curves A and B (for example) in Fig. 8. Note that the width of the region of zero encounters (or the highest intensity of transmission) increases from curve B to curve A, where the value of r decreases from 0.050 nm to 0.025 nm, respectively. Since a decrease in r represents an increase in voltage, the theoretical curves in Fig. 8 support qualitatively the experimental observations in Figs. 11a and 11b. In summary, concerning the electron particle theory of ZAPs and the associated experimental results, it may be asserted that there is a close correspondence of the central regions of simulated zone-axis patterns with experimental patterns. This fact lends strong support to the underlying assumption for the simulation procedure; namely, that within a certain narrow aperture, electrons will penetrate the crystal zone axis tunnels along rectilinear paths and, in the process, will demonstrate particle properties.
D. Atomic Structure of Zone Axis Tunnels through a (110) Foil

The predictions concerning electron-ray trajectories and the associated interactions with atoms surrounding (100) zone axis tunnels were considered in Section II,C. The selection of (100) tunnels for the computer simulation procedure had been decided upon since the two-dimensional approach used was physically more reasonable for the (100) tunnels than for (110) tunnels. However, all of the experimental work, thus far, has been confined to the exclusive use of (110) tunnels. In this section, then, a detailed consideration of the (110) zone axis tunnel is presented. This will include an empirical consideration of the electronic structure between atoms. In Fig. 12, ABCDEF are atoms at the corners of unit cells in the face-centered cubic lattice, with G and H representing atoms in the face centers. Atoms at corners of unit cells are, for convenience, represented by larger circles than those at face centers. The rest of the structure is built up in an obvious manner, from this starting structure, by the addition of unit cells and by the sectioning of the structure along the planes DIJK and U'UQTT', resulting in surfaces of {110} orientation. The (110) tunnel of interest is along the direction QM. That is, it is a tunnel in a (110) direction and is perpendicular to the plane of the foil DIJK.
FIGURE 11. (a) A (110) ZAP at 150 kV in a foil thickness estimated at about 30 nm; (b) a (110) ZAP at 200 kV in exactly the same position and tilt orientation as for Fig. 11a. Courtesy Fourie (1992a).
FIGURE 12. Perspective diagram of zone axis tunnels in the fcc crystal lattice. The {110} faces DIJK and U'UQTT' are on opposite sides of the thin foil. A (110) zone axis tunnel is shown, with triangle QRS as the entrance mouth and MNP the exit mouth of the tunnel, which extends along the apices QVCM. The crosses 1, 2, and 3 represent gold atoms adsorbed at tunnel positions on the exit surface.
The entrance to the tunnel is defined atomically by the triangle of atoms QRS on the (110) entrance surface, upon which the STEM probe is incident. The internal tunnel within the bulk is defined by repeating, identical triangles with apices Q, V, C, and M, with M the apex of the triangle NMP on the exit surface, DIJK. Alternatively, an identical but inverted tunnel could be defined by NPW. Furthermore, the tunnels defined by DNE, or the inverse, KPY, could also function as crystal apertures. The crosses 1, 2, and 3 on the exit surface DIJK represent gold atoms adsorbed in the mouths of tunnels. When the entrance and exit surfaces are within close proximity of each other as, for example, for a 20-nm-thick foil, it is envisaged that the internal electronic structure of the foil would assume a surfacelike state and that a cylinderlike tunnel, in the electronic sense, would come into existence, as indicated by the broken circles along the direction QVCM. It is further envisaged that high-energy electrons may travel along this electronic tunnel without significant interaction with the electric field within the bulk of the material. This would correspond with the conditions for forming a (110) bent-foil ZAP in TEM, with the very marked bright center, as in Fig. 3. Also, it is clear from Fig. 3 that the central bright region consists, roughly, of two concentric regions, with the central region showing extreme brightness.
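The column arrangement that bounds such a tunnel can be generated directly from the fcc lattice. The short Python sketch below is an illustration only; the copper lattice parameter a ≈ 0.3615 nm is an assumed standard value. It projects a small fcc block along [110] and lists the distinct atom-column positions; the open (110) tunnels lie between neighboring columns, at the centers of the projected triangles, as in Fig. 12.

    import numpy as np

    a = 0.3615                                    # assumed Cu lattice parameter, nm
    basis = np.array([[0, 0, 0], [.5, .5, 0], [.5, 0, .5], [0, .5, .5]]) * a

    # In-plane axes of the projection normal to [110]: u along [001], v along [-110].
    u_hat = np.array([0.0, 0.0, 1.0])
    v_hat = np.array([-1.0, 1.0, 0.0]) / np.sqrt(2.0)

    columns = set()
    for i in range(3):
        for j in range(3):
            for k in range(3):
                for b in basis:
                    p = b + a * np.array([i, j, k])
                    columns.add((round(float(np.dot(p, u_hat)), 4),
                                 round(float(np.dot(p, v_hat)), 4)))

    # Distinct projected column positions (nm); tunnel mouths sit between them.
    for c in sorted(columns)[:10]:
        print(c)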
FIGURE 13. A superposition of two (220) planes at the surface of a crystal, with atom positions at line intersections. The filled circles are within the uppermost (220) plane. The configuration at A represents an envisaged electronic interaction zone between two atoms within the (220) planes. The configuration at B represents the envisaged electronic interaction zones between a group of atoms. The circles 1, 2, and 3 shown at B are envisaged to be electron-free tunnels at the surface and to be within thin foils of thickness < 20 nm.
With reference to Fig. 13, where a diagrammatical representation of a (110) foil is viewed vertically from above, the electronic structure near the surface and between atoms is envisaged, empirically, to roughly resemble that at A, for two neighboring atoms, and that at B for an assembly of six atoms. It is further assumed that the circular regions 1, 2, and 3 at B are free of electrons. This presentation, then, resembles that of the three-dimensional structure discussed with reference to Fig. 12. Thus, the circular regions 1, 2, and 3 at B in Fig. 13 would be tunnels through the foil. Experimental support for the foregoing discussion on the electronic structure within the zone axis tunnels may be found, first, in Fig. 3, already discussed, and second, more clearly, in Figs. 14a and 14b. In Fig. 14a is shown a (110) ZAP obtained at 200 kV in a copper foil, coated with gold to an average thickness of 1 nm on the exit surface. The separate gold particles are visible within the ZAP image. What is very noticeable in Fig. 14a and pertinent to the present argument is that the region within the innermost dark ring consists of two, approximately circular and clearly defined, concentric regions. The centermost is very bright, whereas the region adjoining the dark ring is less bright. This description is emphasized by the microdensitometer scan along X'X, which is shown in Fig. 14b.
FIGURE 14. (a) The (110) bent-foil ZAP obtained at 200 kV in a copper thin foil (t < 20 nm), coated with gold particles on one side. Within the central dark ring, two concentric regions may be distinguished, where the centermost region shows extreme brightness; (b) a microdensitometer trace along X'X of Fig. 14a.
FIGURE 15. (a) The computer-simulated electron-atom encounters curve as a function of θ (see Fig. 7) for t = 25 nm; (b) the envisaged electron-atom encounters curve as a function of θ, for zone axis tunnels as at B in Fig. 13.
The empirical interpretation of the phenomena in Fig. 14a may be given with reference to Figs. 6 and 7. In Fig. 6, the central aperture within which there is no interaction of electron rays with atoms, i.e., the aperture M'O'M, would be reduced, owing to the electronic structure between atoms, as discussed earlier. Hence in Fig. 7, for curve A, which is reproduced in Fig. 15a, the effect of the electronic structure around the atoms would be that of introducing the shoulders in the curves, as in Fig. 15b. This latter argument thus produces a result which corresponds closely with the experimental result shown in Figs. 14a and 14b. As explained earlier, the curve in Fig. 14b would be the inverse of that in Fig. 15b, since the brightest region in Fig. 14b would represent the region of zero encounters in Fig. 15b. To summarize, the empirical considerations on the distribution of the electronic structure between atoms in thin foils are supported by the experimental results on ZAPs. Apparently, therefore, the effect of this electronic structure would be, essentially, to form a lining to the zone axis tunnel, thus reducing the aperture of the tunnel.
On the other hand, the density of the electronic structure within the cylinderlike central part of the tunnel would be extremely low, and the interaction of the scanning beam with that structure would be negligible. Thus completely unhindered rectilinear trajectories of high-energy electrons would be possible through the central region of the tunnel.
E. Electron-Source Requirements and the Virtual Source in [310] Field-Emission

In the paper by Crewe et al. (1968), the authors stressed the important relation that exists, in STEM, between the source brightness and the achievable resolution. Within that context, the same authors emphasized that any attempt to improve the resolution of the scanning electron microscope must involve an increase in brightness of the source. This important aspect of STEM will be discussed in this section, where the crystallographic nature of cold field-emission from body-centered cubic tungsten will be emphasized. Thus, firstly, the bright (310) facet, which forms during the "flashing" (or preheating) of the field-emission tip of [310] axial orientation, results in paraxial field-emission which appears to be coming from a point source of extremely high brightness, and which is situated at infinity on the optical axis. Secondly, the fact that the relevant (310) facet is orthogonal to the optical axis results in a situation where the position of the virtual point source becomes insensitive to either transverse or longitudinal vibrations of the tip. These aspects of the cold field-emission tip of [310] orientation, as used in VG-STEM instruments, will now be discussed. The practical aspects of the manufacturing of field-emission tips suitable for use in STEM machines have been discussed in detail by Crewe et al. (1968). Further, a comprehensive survey of the fundamentals of field-emission is to be found in the review paper of Dyke and Dolan (1956). A very important aspect of field-emission from tungsten tips, which is emphasized by Crewe et al. (1968), is that the emission along the [310] axis and, correspondingly, from a (310) facet, is considerably brighter than the emission along other axes or from other facets. In this regard, measurements of Dyke et al. (1954) provide quantitative information on the variation of the current density, J, with the polar angle, for several azimuths. In particular, these measurements were applied to a clean hemispherical tungsten cathode with a (110) axis, and indicate very convincingly that field-emission from the (310) facet is the brightest, and probably orders of magnitude brighter than for (100) or (110) facets.
Crewe et al. (1968) have discussed in detail two forms of the (310) tip which they used. The first type, which they described as a "normal" (310) tip, is formed, initially, by electro-etching at 12 V dc in a sodium hydroxide solution. The etched tip is then formed to the desired final configuration by "flashing" in a high vacuum. The term flashing indicates a heating of the filament to a temperature where some crystallographic faceting occurs at the tip, in the high vacuum, and where all surface contaminants are driven off. The second type of tip, which is described as a "remolded" tip, is obtained by flashing the tip while simultaneously applying a positive dc voltage of between 1 and 7 kV to the tip. This causes field ion evaporation (see, for example, Müller, 1960), resulting in the tip becoming narrower and more markedly faceted, as may clearly be seen from the profiles of the two types of tip, as shown by Crewe et al. (1968). In Figs. 16a and 16b are shown stereographic projections which refer directly to the previously mentioned [310] tip profiles (Crewe et al., 1968). Further, these projections refer also to the diagrammatically presented tip profiles in Figs. 17 and 18. Thus, in Fig. 16a, the [310] direction is in the center of the projection. This direction coincides with the axis of the tip, and the reader is viewing the tip vertically from above and along the optical axis. The poles of other important facets are shown on the equatorial line. However, in Fig. 16b, the projection is such that the viewer is observing the tip along [001̄], which is at right angles to the axis of the tip. Thus, for this projection, the important facets are positioned on the top half of the circumferential great circle. This latter projection corresponds to the plane of projection of the tip profiles in Figs. 17 and 18. These profiles are considered below in considerable detail. In Fig. 17 is shown a faceted tip that corresponds closely with the profile of the normal tip discussed by Crewe et al. (1968). The indices of the facets are indicated on the profile. As emphasized earlier, the most important facet is the (310) facet, where the [310] direction coincides with the optical axis of the system in which the tip is mounted. Further, the current density from the (310) facet is orders of magnitude greater than the density from the adjoining (110) and (100) facets. The transition regions P and Q, which include the [130] and [31̄0] pole directions in Fig. 17, would have reasonably high current densities. However, because of the lower density distribution of the equipotential surfaces compared with the density of equipotential surfaces above the (310) facet, these current densities would be small compared with that from the central (310) facet. Concerning the application of the normal tip, it is of interest that the VG-STEM series of microscopes uses tips of that nature. In these microscopes, the tip is flashed with the extraction voltage switched off but with the main accelerating voltage left on. Discussion of the characteristics of the normal tip is of importance regarding the present review, because the results described in Section III were obtained in VG-STEM machines. Further discussion of this point will follow.
FIGURE 16. (a) The stereographic projection relating to a [310] field-emission tungsten tip when viewed along the optical axis in a [3̄1̄0] direction with respect to the tip. The poles of important crystal facets are shown on the equatorial line; (b) the stereographic projection relating to the same tip as in (a) but viewed at right angles to the optical axis, that is, along the [001̄] direction. Here, the poles of important field-emitting facets are on the upper semicircumferential great circle. This projection relates to the profiles of field-emission tips shown by Crewe et al. (1968).
FIGURE 17. Facets on the so-called "normal" field-emission tip of [310] axial orientation.
It is opportune at this point to consider the "remolded" tip described by Crewe et al. (1968), since this tip could be even more suitable to the crystal-aperture method than the normal tip. The profile of the remolded tip is shown in Fig. 18. This tip is more distinctly faceted than the normal tip, and the profile in Fig. 18 corresponds closely with that of a remolded tip, as shown by Crewe et al. (1968).
FIGURE 18. Facets on the so-called "remolded" tip of [310] axial orientation.
FIGURE 19. A model of the "remolded" tip showing the (310) facet (bright) and four (110) facets.
It is clear that the (310) facet is about the same size as that for the normal tip and that it is bounded directly by (110) and (11̄0) facets, without the in-between transition regions P and Q shown in Fig. 17 for the normal tip. In three dimensions, it is likely that the faceting of the remolded tip would correspond to that of the model in Fig. 19, which shows the (310) facet (bright), bounded by four (110) facets. Such a tip would provide, essentially, a high current density from a single (310) facet, which is orthogonally orientated to the optical axis, while the current density from the {110} facets would be orders of magnitude smaller. Remolded tips will not be discussed further since such tips are not in general use; however, a further discussion of normal tips is required. In Fig. 20, the optical properties of the faceted normal tip are considered. Here, the approximate shape and positions of equipotential surfaces above and around the tip are indicated. It is known from electricity theory that for such a sharp protuberance as represented by the tip, the equipotential surfaces will be the most closely spaced at the extremity of the protuberance. This, obviously, will occur immediately in front of the (310) facet, and thus the highest electric field would be present there. This occurrence would further amplify the natural high current density obtainable from that facet. Furthermore, because the (310) facet is orthogonal to the optical axis, the equipotential surfaces immediately above that facet will also be orthogonal to the axis.
FIGURE 20. A section through the "normal" [310]-axial tip, showing the equipotential surfaces adjacent to the tip. The highest density of these surfaces occurs immediately adjacent to the (310) facet. The surfaces adjacent to that facet are orthogonal to the tip axis and the optical axis. Thus, field-emission from the (310) facet is essentially paraxial.
Hence, electrons from the (310) facet will be emitted paraxially and will remain essentially paraxial while travelling through the column of the microscope. A further consideration of the tip surface in Fig. 20 leads to the conclusion that, apart from the front (310) facet, there would be another two regions of strong emission, namely, the regions P and Q in Fig. 17 which, respectively, contain the pole directions [130] and [31̄0]. The field-emission from these two regions is indicated by B and C, respectively, in Fig. 20. However, as argued before, the strongest emission of electrons would be from the front (310) facet, and this would occur paraxially. On the basis of geometrical optics, this paraxial electron current may be considered to originate from a virtual point source situated at minus infinity. This point source is designated by the symbol VSA. Similarly, the electrons within the diverging beams B and C, which are emitted from approximately spherical surfaces, will appear to be coming from virtual sources VSB and VSC, respectively. Because of the lower electric fields involved, the electron currents into B and C would be less than into A, where the associated field is high. Further, because of the large angles to the optical axis of the directions of emission into B and C, it is likely that the emission current into A will dominate overwhelmingly in probe formation where the objective aperture is very small, as in the method of crystal-aperture STEM. Thus, for these two reasons, further discussion will be confined only to the emission into A.
FIGURE 21. The complete field-emission crystal-aperture STEM system, where T is the field-emission tip, G the gun assembly, L the objective lens, X the thin [110] copper foil, M the zone axis tunnel, C the gold atom adsorbed in the center of the exit mouth of the tunnel, and A the paraxially emitted electron rays.
It is believed that in the crystal-aperture method the zone axis tunnel forms an aperture of about 1 mrad. Furthermore, it is required, for a detectable signal, that a current of at least 10⁻¹² A should enter that aperture when the probe is focused. Thus the beam, A, in Fig. 20, should probably contain a current of at least 2 × 10⁻¹² A to allow for losses through the system. Clearly, since this finite current is associated with the virtual point source at minus infinity, it follows that the virtual current density at that point source would approach infinity. In Fig. 21, a diagram depicts the electron gun at G. The extraction voltage is V1 and the accelerating voltage is V0. The electron paths from the field-emission tip, T, to the first cross-over at P and through the objective lens, L, can be followed. It is noted that the central paraxial cone of rays, A, is focused by the lens L, through the crystal zone axis tunnel, M, onto the adatom at C, which coincides with the exit surface. The paraxiality of the central cone is important for electron-optical reasons and for the minimization of vibration effects. These aspects will now be discussed. Firstly, the field-emission tip has a lens effect on the emitted rays through the equipotential surfaces shown in Fig. 20. Hence, for large apertures a considerable spherical aberration error would occur. Thus, in Fig. 20, the virtual sources VSB and VSC would, effectively, not be point sources but would be considerably enlarged because of the spherical aberration within the equipotential surfaces. However, for the paraxial rays and for the equipotential surfaces which are orthogonal to the optical axis, as in Fig. 20, there would, ideally, be no spherical aberration error, because of the small aperture involved. Thus the concept of the virtual point source VSA, as discussed earlier, would remain valid. Secondly, the paraxiality of rays coming from the (310) facet has a further, important consequence in the ultrahigh-resolution domain, namely, that of rendering the system insensitive to mechanical vibration.
FIGURE 22. A diagrammatical representation of the paraxial beam of electron rays, used to illustrate the argument of vibration insensitivity in such an arrangement.
This aspect is considered with reference to Fig. 22. Here, the (310) facet is shown at P, the objective aperture at Q, and the crystal aperture at CA. It follows from mathematical theory that parallel lines intersect at infinity. Thus, as discussed, the virtual point source, VSA, will be situated there. Similarly, the optical axis, which is part of the set of parallel lines along the X-direction, will pass through that point source. If the (310) facet at P experiences small transverse vibrations within the YZ plane and relative to the objective lens at Q (to which the optical axis is fixed), it follows that the point source at minus infinity would remain stationary. This would be the situation also for longitudinal vibrations along the X-axis. Thus, it may be concluded that the resolution of the system would be singularly insensitive to vibrations of the real-space source relative to the objective lens. Vibrations within the objective lens of the sample relative to the optical axis would be minimized because the sample C, in Fig. 22, is mounted on the final aperture, which is the crystal CA. This crystal, again, is mounted in the top-entry sample stage. It is unlikely that the top-entry stage would vibrate relative to the objective lens and scan coils, since these elements are part of the compact objective lens assembly. The general conclusion, therefore, is that the system as a whole, when used in the crystal-aperture mode, would be insensitive to small vibrations. The errors normally introduced by lens aberrations within standard STEM systems have been summarized by Oatley (1972). These are due, mainly, to the spherical aberration of the objective lens, the chromatic aberration of the lens resulting from a spread in energy of the electrons, and the effects of diffraction. The spherical aberration error in STEM is proportional to the cube of the aperture of the final focused probe. Thus, a reduction of the aperture from, say, 20 mrad to 1 mrad would reduce the focused diameter of the probe by a factor of about 10⁴. This improvement may be expected only if diffraction effects were absent. As discussed, the absence of diffraction was assumed, empirically, for the special case where the final aperture is imposed by a zone axis tunnel of atomic dimensions.
The chromatic aberration error is directly proportional to the final incident aperture. Hence, an improvement by a factor of 20 would be obtained by using the crystal aperture.
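These two scalings can be put into numbers directly; the short sketch below simply evaluates the aperture ratios quoted above (the absolute aberration coefficients of the lens are not needed for the comparison and are therefore left out).

    # Aberration disc diameters scale as d_s ~ Cs * alpha**3 (spherical) and
    # d_c ~ Cc * (dE/E) * alpha (chromatic); only the aperture ratios matter here.
    alpha_conventional = 20e-3   # rad, typical probe-forming aperture
    alpha_crystal = 1e-3         # rad, aperture imposed by the zone axis tunnel

    print("spherical reduction:", (alpha_conventional / alpha_crystal) ** 3)  # 8000, ~1e4
    print("chromatic reduction:", alpha_conventional / alpha_crystal)         # 20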
F. Auto-Magnification Effects in Direct Imaging of the Nucleus

In Section III,D, micrographs are discussed in which subatomic detail is visible. Among other detail, the atomic nucleus is also imaged. If the spacing between scan lines within object space at the maximum instrument magnification of 10⁷ is considered, it appears that the spacing is about two orders of magnitude larger than the diameter of the gold nucleus. An explanation of why it is nevertheless possible to image the nucleus is to be found in the automagnification effect, which could be induced by the high electric field around the nucleus. The automagnification effect may be understood with reference to Figs. 23a and 23b. At the sample, the scanning process occurs in three dimensions, and in Fig. 23a the horizontal XY plane, or image plane, is presented, with the Z-direction, which is the beam direction, going into the plane of the paper. The scan direction, X, is from left to right, and the scan advance direction, Y, from top to bottom. The probable approximate paths that the scanning probe would follow on the sample, as it moves into the field around the nucleus, N, are shown. Scan lines 6' and 5' are not influenced by the field, but 4' and 3' are. These scan paths are caused to diverge strongly from the optical-system-designated paths, causing a considerable increase in effective spacing of the scan lines within the region immediately preceding the nucleus. This phenomenon would cause a decrease in magnification within that region. At the nucleus, where the angle at which the incoming beam is incident on the equipotential surfaces surrounding the nucleus would be fairly constant, the spacing between the scan lines would be fairly uniform. However, this spacing would be considerably smaller than the optical-system-designated spacing, thereby causing an increase in magnification at the nucleus. In Fig. 23b, the probable electron beam directions around the nucleus and within the YZ plane which bisects the nucleus, are shown. The incident beams 1 to 6 and 1' to 6' in Fig. 23b are numbered in correspondence with the scan lines in Fig. 23a. The electron beam paths, as shown, are deflections from the optical axis as a result of the positive charge on the nucleus. These deflections may be considered in terms of the equipotential surfaces in the electric field around the nucleus and also in terms of Snell's law of refraction at such surfaces.
FIGURE 23. (a) The scan lines on the sample are distorted by the electric field around the nucleus; (b) the beam directions corresponding to Fig. 23a are shown.
This was done in an earlier paper by Fourie (1979), concerning a related physical situation, and is represented in Figs. 24a and 24b. The experimental situation involved a silicon monoxide insulator thin foil that was placed on a copper grid and then caused to charge positively (Fourie, 1979) by the transmission of an electron beam through the foil and within the central region of the copper grid square. The charge generated on the surface of the foil was envisaged to result in equipotential surfaces around the grid square, as shown in Fig. 24a. Assuming, in Fig. 24a, that there is only one equipotential surface which separates two spaces at potentials V1 and V0, it is obvious that if V1 were increased, the quantity (V1/V0)^(1/2) = sin i/sin r (which represents Snell's law) would increase in magnitude. Therefore, for a fixed angle of incidence, i, the angle of refraction, r, would decrease to maintain the equality. Thus, the total deviation of the refracted beam from the incident beam direction would increase. The beam at position a, for example, would then strike the aperture diaphragm further out from the center after refraction. The position of the scanning beam, where transmission through the aperture would just occur, would thus shift inwards, for example, from a to a', i.e., to smaller i; and likewise from c to c'.
[Figure 24a labels: aperture diaphragm; electron detector, transmitted-electron bright-field image. Figure 24b panels recorded at probe currents of 6.7, 24, and 31 × 10⁻¹² A; scale bar 1 µm.]
FIGURE 24. (a) The equipotential lines which surround an insulator foil when it is positively charged; (b) the transmitted bright region alters in size with increasing current in the scanning probe. Courtesy Fourie (1979).
This would mean that the diameter of the bright region, which represents transmission through the aperture below the sample, would shrink if the potential, V1, were increased. An increase in V1 would occur if the primary beam current, Ip, and thus the rate of secondary electron production out of the thin foil, were increased. This would result in a higher density of positive charge within the electron-bombarded region. According to the previous arguments, it can therefore be expected that the central bright region would shrink with increased Ip. This is shown to occur in Fig. 24b. The example discussed earlier indicates that a significant deviation of the beam direction in a scanning beam may be caused even at a relatively low charge density in the sample. It may therefore be concluded that very sharp deviations in beam direction are probable in the vicinity of the atomic nucleus, where a very high charge density exists. It will be shown in Section III,D that the crystal-aperture FE-STEM method is capable of producing images of the nucleus and surrounding electron orbits of a gold atom. For some of the photographic recordings in that study, the detail was of such a high degree that certain aspects of atomic structure were observable. The relevant observations will be discussed in Section III,D.
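A quick numerical check of the refraction argument, using the electron-optical form of Snell's law quoted above, is sketched below; the incidence angle and the potential ratios are arbitrary illustrative values, not measurements.

    import numpy as np

    def deviation_deg(i_deg, V1_over_V0):
        """Angular deviation i - r at a single equipotential surface,
        from sin(i)/sin(r) = sqrt(V1/V0)."""
        i = np.radians(i_deg)
        r = np.arcsin(np.sin(i) / np.sqrt(V1_over_V0))
        return i_deg - np.degrees(r)

    for ratio in (1.001, 1.01, 1.05):     # V1/V0 grows as the foil charges up
        print(f"V1/V0 = {ratio:5.3f}: deviation at i = 60 deg is "
              f"{deviation_deg(60.0, ratio):.3f} deg")

The deviation grows monotonically with V1/V0, which is the behavior invoked above to explain the shrinking of the transmitted bright region with increasing probe current.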
III. EXPERIMENTAL RESULTS IN IMAGING
A. Experimental Method in Crystal-Aperture STEM

The options in the experimental method are limited by the fact that the application of crystal-aperture STEM is directed toward the study of single atoms. Furthermore, for such a study, these atoms need to be lodged, centrally, at the exit end of zone axis tunnels, as adatoms. Hence, there is no question of using standard materials of a known structure in the present method, because none of the materials used in either TEM or STEM for obtaining images of columns of atoms are suitable. This follows, firstly, because the images of columns indicate average atomic positions within a large assembly of lattice sites. Secondly, the column imaging results inevitably in a situation in which there is no direct information on the structure of any given single atom within the column. It follows from these considerations that the system of sample used by the present author (Fourie, 1989, 1992a, 1993, 1994) for single-atom imaging in crystal-aperture STEM is probably the most straightforward that is available for this special purpose. The system consists of an electron-transparent (110) single crystal foil of copper, onto one side of which a very thin layer of gold is evaporated from a thermal source. The use of a thermal source rather than a sputtered source is important, because only the thermal source provides a deposit where discrete particles of gold, about 10 nm in lateral diameter, will form epitaxially on the (110) copper foil. These particles, when imaged in high resolution within the [110] ZAP, will manifest moire fringes (see, for example, Hirsch et al. (1965)), which originate from diffraction associated with the (002), (111), and (111̄) crystallographic planes. These sets of fringes will occur simultaneously and with equal strength when the STEM beam is parallel to the zone axis with a high degree of accuracy (Fourie, 1993). This exact alignment of the crystal zone axis with the optical axis is probably essential for the attainment of ultrahigh resolution. It is unlikely that such accurate alignment could be obtained without the aid of moire fringes. A practical example of such alignment is given in Section III,C. The practical aspects of producing samples for crystal-aperture STEM will now be discussed. Firstly, a circular single crystal disc of 3.1-mm diameter of a (110) copper crystal that is about 0.25 mm thick is obtained. Secondly, this sample is then thinned in a jet polisher using 10% nitric acid in ethyl alcohol. The cross section of the sample, then, would resemble approximately that shown in Fig. 25, where the plane surface was facing upwards. However, the planarity of that surface is not important, and samples thinned in a double-jet polisher have been used with equal success. Thirdly, one side of the sample is shadowed with pure gold, evaporated from a heated tungsten filament, G, at an angle of 20° to the surface and to an average thickness of about 1 nm.
FIGURE 25. A schematic representation of the (110) single crystal foil of copper, which has an electron-transparent region at H and is also shadowed with gold from the thermal source G, forming a small gold particle, A. Courtesy Fourie (1989).
The relevant configuration is shown in Fig. 25. It should be noted that the side upon which the gold is deposited will later be the exit surface for electrons, when the sample is placed inside the microscope. The preparation described results in a surface covered with small particles of gold. These particles appear to grow epitaxially on the copper. Furthermore, it was observed that some of the gold atoms, which had been adsorbed on gold particles from the gold vapor during the very last stages of deposition, had settled, not into regular lattice positions, but into stacking fault positions (Fourie, 1993). Such positions, then, would coincide with the exit centers of zone axis tunnels. These sites are really the only positions where the ultrahigh resolution of the crystal-aperture method may effectively be employed. An extension of the previous situation would be to use a second evaporation of foreign material, such as silicon, onto the deposited gold crystals, as discussed by Fourie (1993). This would involve a sensitive method, which would allow a carefully controlled evaporation of less than a monolayer, as used by Krishnamurthy et al. (1990). The impinging foreign atoms, in this process, might then also be adsorbed into stacking fault sites on suitable gold crystals which would already be in place on the copper single crystal foil. These newly adsorbed foreign atoms would then be in positions where imaging by crystal-aperture STEM could be applied. Thus the structure of single atoms of most elements could be made the subject of study at ultrahigh resolution by the crystal-aperture method.
B. Improved Resolution in Crystal-Aperture STEM

The imaging of gold particles by STEM has been attempted under various conditions of STEM operation (Fourie, 1992, 1993). In the first such attempt two standard TEM-STEM machines were used. One such machine was the 1976 JSM-200 Jeol machine, which was provided with a factory-installed STEM facility with a maximum magnification of 5 × 10⁵ at 200 kV. This machine used a heated tungsten filament as source, and the nominal resolution was given as 5 nm.
FIGURE 26. The appearance of a (110) ZAP in a single crystal foil of copper as obtained in STEM at 200 kV, using a thermionic tungsten electron source, at a direct electron optical magnification of 6 × 10⁴. Courtesy Fourie (1992b).
The second machine was the Philips 420 TEM-STEM of 1986 vintage, equipped with a standard twin lens and capable of a maximum magnification in STEM of 8 × 10⁵ at 120 kV. The electron source was a heated lanthanum hexaboride tip and the nominal resolution was 2 nm. Some results obtained from these two instruments in the crystal-aperture mode are now considered. In Fig. 26 is shown the (110) ZAP obtained, in STEM, in the JSM-200 at 200 kV. It is clear that there is a central bright region at A, although the contrast between this bright region and the dark extinction fringes associated with the ZAP is poor. The detail of the gold particles within the bright zone is unsatisfactory. This aspect is exemplified further by the image in Fig. 27, where a higher direct magnification was used to image the same area. In fact, the detail is poorer than what would have been obtained from gold particles on a thin carbon foil, using standard STEM imaging. It is believed that this degraded image is due to insufficient brightness in the thermal tungsten source. Thus the small aperture presented by the (110) zone axis tunnel could not accept sufficient current from the low-intensity beam to generate a reasonable signal, resulting in excessive noise and poor contrast.
FIGURE 27. The same area as in Fig. 26 but at a direct electron optical magnification of 2 × 10⁵. Courtesy Fourie (1992b).
A considerable improvement was observed when the lanthanum hexaboride source of the Philips 420 microscope was used (Fourie, 1992b). The relevant bent-foil ZAP, in STEM, is shown in Fig. 28. Clearly, there is considerable fine detail with high contrast within the central bright region, B. In Fig. 29a, a micrograph taken at a direct magnification of 1 × 10⁵ is shown. The particle at B shows exceptionally sharp edges, and a microdensitometer trace across that particle is shown in Fig. 29b. It is clear that the slope of the curve B to C indicates a resolution of, at worst, 0.5 nm. This value must be considered in the context of the standard resolution of 1.5 nm quoted for the particular TEM-STEM machine which was used (Fourie, 1992b). Thus, clearly, an enhancement by a factor of 3 in probe resolution was obtained by using (110) tunnels as crystal apertures in conjunction with the lanthanum hexaboride source.
FIGURE 28. The appearance of a (110) ZAP in a single crystal foil of copper as obtained in STEM at 120 kV, using a lanthanum hexaboride source, at a direct electron optical magnification of 5 × 10⁴. Courtesy Fourie (1992b).
The fact that the same improvement of the image was not possible for the Jeol TEM-STEM, where the lower-brightness thermal tungsten source was used, suggested that the source brightness was an important factor. Thus, it was predicted (Fourie, 1992b) that a cold field-emission source, which is 100 times brighter than the lanthanum hexaboride source, would be dramatically more successful in improving the resolution. This prediction proved accurate (Fourie, 1993, 1994) and the relevant results are discussed in Sections III,C and III,D.

C. Imaging of Single Adatoms of Gold

From the considerations regarding electron-source requirements in Section II,E, and the observations concerning the improved resolution with increased source brightness in Section III,B, it was apparent that a high source brightness was of cardinal importance in crystal-aperture STEM.
FIGURE 29. (a) The bright central region within a (110) ZAP taken at a direct magnification of 1 × 10⁵; (b) a microdensitometer trace along X'X (in Fig. 29a) across the particle at B; the original trace, as well as a straight-line approximation, is shown. Courtesy Fourie (1992b).
FIGURE 30. The appearance, from above, of the two top (220) planes in copper, where the filled circles are in the uppermost plane. The crosses 1, 2, and 3 represent atoms adsorbed in stacking fault positions, forming a linear configuration along the [001] direction. Courtesy Fourie (1993).
Thus, it was expected that a major advance in resolution could be achieved when using a [310] tungsten tip in cold field-emission, in conjunction with the crystal-aperture method. The results in this section, then, represent the first application (Fourie, 1993) of a system in which such a combination existed. As discussed earlier, it is found that when a thin layer of gold is vacuum deposited onto a (110) single crystal foil of copper, small epitaxial particles are formed. These are about 10 nm in horizontal diameter and 1-3 nm thick. Because of the difference in lattice parameter between copper and gold, the lattices of the gold particles and the lattice of the copper substrate will go in and out of register, repeatedly, at regular intervals across the particle. The fcc lattice, when viewed along the [1̄10] zone axis, will present a pattern such as in Fig. 30. If two such lattices, of spacing corresponding to gold and copper, are superimposed, as for epitaxial gold particles on copper, a pattern such as in Fig. 31 is generated. Three regions where the lattices are in approximate coincidence are marked A, B, and C. Within these regions the zone axis tunnels within the gold particle and the copper substrate are in coincidence. This situation would correspond with the simplified crystal aperture shown in Fig. 4b. The phenomenon of superimposed lattices, as discussed, will show moire fringes. For (110) foils and within the (110) ZAP, these moire fringes will originate from diffraction associated with the (002), (111), and (111̄) crystallographic planes.
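For orientation, the expected spacing of these parallel moire fringes follows from the (002) plane spacings of the two metals. The bulk lattice parameters used below are assumed textbook values, and any epitaxial strain in the particles would shift the result somewhat.

    a_cu, a_au = 0.3615, 0.4078             # assumed bulk lattice parameters, nm
    d_cu, d_au = a_cu / 2.0, a_au / 2.0     # (002) interplanar spacings
    moire = d_cu * d_au / abs(d_au - d_cu)  # parallel (translational) moire spacing
    print(f"(002) moire fringe spacing ~ {moire:.2f} nm")   # roughly 1.6 nm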
FIGURE 31. A pattern which is generated when two lattices such as in Fig. 30, appropriately sized to represent the lattice parameters of copper and gold, are superimposed with zero relative twist. Courtesy Fourie (1993).
Such patterns are shown in Fig. 32, which is a STEM image taken within a (110) ZAP. As discussed, it is important to obtain conditions where the beam is directed down the zone axis with a high degree of accuracy. To achieve this, a gold particle within the ZAP is sought out where a symmetric triangular pattern is manifested. An example of this is to be found at P in Fig. 32. This particle was photographed at a direct electron optical magnification of 5 × 10⁵. A higher direct magnification of the same particle produced considerably more detail, as shown in Fig. 33. Here the direct magnification was 1.7 × 10⁶ and the triangular symmetry of the particle, P, is very evident. Further, single adatoms are clearly visible. For ease of viewing, the image in Fig. 33 was optically enlarged to obtain the image in Fig. 34. Here, the dark spots 1, 2, and 3 in region A are bright-field images of single adatoms, which are individually resident in the center of zone axis tunnels. These resident positions correspond with that of the atom in the mouth of the tunnel in Fig. 4b. In Fig. 35, a densitometer trace across the dark spots 1, 2, and 3 in Fig. 34 is shown.
FIGURE 32. A micrograph of gold particles on a copper foil taken at a direct electron optical magnification of 5 × 10⁵. Moire fringes are evident overall. Courtesy Fourie (1993).
This trace is significant, because it shows that the spacings between 1, 2, and 3 are consistent with stacking fault positions along 1, 2, and 3. Such positions have been marked by crosses 1, 2, and 3 in Fig. 30 and also in Fig. 12. It should be noted that the observed spacings in a [001] direction are not possible for lattice positions, but only for stacking fault positions on the {110} plane. This observation, therefore, lends strong support to the fundamental configuration put forth for the crystal aperture and shown in Figs. 4b and 12. Also, in Fig. 35, note that the FWHM of adatom images 2 and 3 are 0.082 and 0.1 nm, respectively. It follows, thus, that resolutions below 0.1 nm are attainable in crystal-aperture field-emission STEM. Furthermore, the effect of the crystal aperture is clearly significant here, because the STEM instrument used is rated for a resolution of 0.5 nm when applied under standard conditions. Finally, the electron optical details used in producing the bright-field image in Fig. 34 are αs = 10 mrad, βc = 1 mrad, and M = 1.7 × 10⁶, where αs is the objective aperture, βc the collector aperture, and M the direct instrument magnification.
FIGURE 33. A micrograph of the same area as Fig. 32, taken at 1.7 × 10⁶ direct magnification. The moire fringes at P in Fig. 32 are clearly visible in this image, as shown by the arrow at P. Courtesy Fourie (1993).
In summary, the indication in the earlier work on the Philips TEM-STEM machine was that an instrument using a brighter source than the lanthanum hexaboride source would be even more successful in increasing the resolution in the crystal-aperture mode. The results discussed earlier indicate that this conclusion is valid for the field-emission source, which is 100 times brighter than the lanthanum hexaboride source. The clarity of the images in Fig. 34, and the fact that the single atom images at A, i.e., 1, 2, and 3, are so completely resolved, suggest that the probe diameter is so small that it is not a limiting factor in the resolution obtained. However, as suggested by Fourie (1993), the direct instrument magnification of 1.7 × 10⁶ was probably inadequate, and it was thus expected that a direct magnification of 1 × 10⁷ would provide even more detail in the image. Such an experiment is discussed in Section III,D.
FIGURE 34. An increased optical enlargement of the micrograph in Fig. 33. The images of single adatoms are visible in the bright regions A, B, and C. Courtesy Fourie (1993).
D. Imaging of Subatomic Detail

In this section, consideration is given not so much to the imaging of the positions of adatoms on a crystal surface as to the structural detail within a given single adatom.
FIGURE 35. Microdensitometer traces across the images of atoms 1, 2, and 3 in region A of Fig. 34. Courtesy Fourie (1993).
For imaging in the STEM mode, the instrument magnification at which an image is recorded is one of the decisive factors which determines the resolution within object space. Thus, if the diameter of a gold atom is, say, 0.10 nm, and if subatomic detail is required to be observed within that atom, it is preferable that at least 10 scan lines should traverse that atom in object space during the recording process. At a magnification of 1 × 10⁷, and with a monitor containing 2000 lines on a screen 100 mm in height, the line spacing in object space will be 0.005 nm. Thus an atom of 0.1 nm diameter would be traversed by 20 lines. Obviously, for an instrument magnification of 2 × 10⁷, as for the VG601, there would be about 40 scan lines crossing the atom in object space, and the quality of the recorded image would be improved correspondingly. Results on the structure within a gold atom, which were obtained at a direct magnification of 10⁷ in a VG HB501 UX crystal-aperture STEM system, are described in the following paragraphs. Firstly, attention is directed to a higher enlargement of Fig. 34, as shown in Fig. 36. Here, because of the relatively low direct instrument magnification used, the adatom images 1, 2, and 3 have, at most, 8 scan lines traversing the diameter of every atom. Thus, the detail of atomic structure within the images is limited. However, based on common features within 1, 2, and 3, certain conclusions may be drawn.
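The scan-line bookkeeping in the preceding paragraph can be reproduced with a few lines; the 2000-line, 100-mm display and the 0.1-nm atom diameter are the figures quoted above.

    def scan_lines_per_atom(M, screen_mm=100.0, lines=2000, atom_nm=0.10):
        spacing_nm = (screen_mm / lines) / M * 1e6   # scan-line spacing in object space
        return spacing_nm, atom_nm / spacing_nm

    for M in (1e7, 2e7):
        spacing, n = scan_lines_per_atom(M)
        print(f"M = {M:.0e}: spacing = {spacing:.4f} nm, "
              f"~{n:.0f} lines across a 0.1 nm atom")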
FIGURE 36. The region A, from Fig. 34, at higher enlargement, showing dark central regions within atom images 1, 2, and 3.
Thus, there is a dark center in every atom, which indicates a strong scattering of electrons, presumably near the nucleus. Further, the dark centers are surrounded by less electron-dense regions, which have a shape ranging between circular and hexagonal. It is noticeable that the dark centers of 1, 2, and 3 form an array of considerable linear exactness along [001]. This linearity, together with the [001] directionality of the array, is further proof that the images are those of atoms adsorbed in lattice stacking-fault sites on the crystal surface, as in Figs. 12 and 30. The dark centers of the structures suggest that the probe is sufficiently narrow to record a difference in electron scattering near the nucleus, as compared to the scattering farther out. This deduction will be confirmed from images obtained at a direct magnification of 10⁷, as shown and discussed below. In Fig. 37, the image of the structure at A corresponds in total magnification with the images 1, 2, and 3 in Fig. 36. However, the direct instrument magnification in Fig. 37 was 1 × 10⁷, or about a factor of 6 more than for Fig. 36. Thus, according to the arguments already presented on the influence of instrument magnification on attainable detail, greatly improved detail of structure should be visible within A of Fig. 37. Inspection of the structure confirms, indeed, that there is much more detail.
FIGURE 37. The image of an atom at region A, obtained at a direct magnification of 1 × 10⁷ and enlarged to the same total magnification as in Fig. 36.
The general appearance of A in Fig. 37 is very similar to that of images 1, 2, and 3 in Fig. 36. There is a dark center, indicating a strong scattering of the probe near the nucleus. This dark center is surrounded by a lighter region where the scattering is less pronounced. A higher enlargement of A, in Fig. 37, is shown in Fig. 38. Here, the considerable detail existing within the structure may be discerned more easily. For example, there is an extremely dark spot within the very center of the structure, which has a diameter of 0.0013 nm, or 1.3 × 10⁻⁹ mm.
FIGURE 38. An increased enlargement of A in Fig. 37, showing considerable subatomic detail, including a pronounced directionality along AA'.
This dark region, probably, is the nucleus, and, probably, was recordable only because of the suggested automagnification effect of the electric field around the nucleus, as discussed in Section II,F. This observation is sufficiently important to merit a separate discussion below. The actual diameter, D, of the nucleus of the gold atom, which has an atomic number of 79, may be calculated from the equation D = 2r0A^(1/3) (see, for example, Schiff, 1968), where r0 = 1.3 × 10⁻¹² mm and A is the atomic number. This leads to D = 1.12 × 10⁻¹¹ mm. As indicated previously, the measured diameter, Dm, of the nucleus in Fig. 38 is 1.3 × 10⁻⁹ mm, where, in measurement, the nominal direct instrument magnification, M, was assumed. To maintain consistency, therefore, it is required to assume an automagnification effect within the nuclear field of Dm/D = 1.1 × 10². Thus, on this basis, the direct electron optical magnification at the nucleus would have been about 10⁹ during the recording of the central dark spot in Fig. 38. Other details in Fig. 38 which stand out clearly are orbitlike patterns around the central dark spot and a very marked linear structure running along AA'. To determine whether the structure observed was, in fact, the image of an authentic structure, the image in Fig. 38 was digitized by means of a video camera coupled to a Kontron Ibas image-processing system. This digitized image is displayed in Fig. 39, and the corresponding Fourier transform is shown in Fig. 40, in consistent orientation with respect to Fig. 39. The structure of Fig. 40 is that of concentric hexagons, which demonstrates, decisively, that the image presented in Figs. 38 and 39 is that of an authentic structure, showing considerable order. The nature of this structure may be explored empirically as in Figs. 41a and 41b. In Fig. 41a is shown the digitized image of a set of concentric hexagons and in Fig. 41b the Fourier transform of Fig. 41a, in consistent orientation. The apices of the hexagons in Fig. 41b are in the same orientation as those of the Fourier transform in Fig. 40. Thus it may be concluded that the orientation of the structure in Figs. 38 and 39 is the same as that of the hexagons in Fig. 41a. Also, with reference to Fig. 41b, it should be noted that there is a pronounced "star" structure of lines at 60° to one another, and these lines intersect centrally. In Fig. 40, where the Fourier transform of the experimental image is shown, the line corresponding to MM' in Fig. 41b may clearly be discerned. This observation is a further indication of the existence of a structure of concentric hexagons in the real image of the gold adatom in Fig. 38. This hexagonlike structure was also deduced by Fourie (1994) by a direct consideration of real images of gold atoms.
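The arithmetic behind these numbers is reproduced below with the values used in the text; note that, in the standard nuclear-radius formula, the exponent acts on the mass number (197 for gold), which would give a somewhat larger D, whereas the value quoted above follows from inserting 79.

    r0 = 1.3e-12                 # mm (1.3 fm), nuclear radius constant used in the text
    A = 79                       # value inserted in the text; the mass number of gold is 197
    D = 2.0 * r0 * A ** (1.0 / 3.0)
    D_measured = 1.3e-9          # mm, apparent diameter of the central dark spot (0.0013 nm)

    print(f"D = {D:.2e} mm")                              # ~1.12e-11 mm
    print(f"automagnification ~ {D_measured / D:.0f}x")   # ~1.1e2
    # With A = 197 instead, D ~ 1.51e-11 mm and the implied factor is ~86x.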
FIGURE 40. The Fourier transform of Fig. 39, in consistent orientation with that figure.
FIGURE 41. (a) A digital image of a set of concentric hexagons; (b) the Fourier transform of Fig. 41a, in a consistent orientation.
IV. SUMMARY AND CONCLUSIONS
In the present article a survey was made of studies that have been attempted using the crystal-aperture STEM method, indicating the results that have been achieved. It was indicated that, in order for the method to function effectively, the final process of probe formation, which occurs over a distance of about 20 nm, is required to be free of diffraction effects. However, it was emphasized that there was no intention of calling into question the Heisenberg uncertainty principle. Instead, within the present chapter, there has been an experimental exploration, on an empirical basis, of conditions where diffraction effects might be suppressed. It is believed that this situation does occur under the described conditions. This conviction is supported strongly by experimental results obtained from the application of the method of cold field-emission crystal-aperture STEM. This conclusion applies, in particular, to systems like the VG STEM system, where the cold field-emission tip has a [310] axial orientation. It was argued that for such a tip, the axis of which coincides with the optical axis, an extremely bright paraxial emission of electrons may be expected. Also, it was foreseen that a "remolded" tip, as considered by Crewe et al. (1968), would provide even greater brightness for the [310] paraxial radiation, and thus even higher resolutions in crystal-aperture STEM than are obtained from the so-called "normal" tip in present use.
CRYSTAL-APERTURE STEM
107
The experimental results, as reported here, need to be augmented with further results involving (110) zone axis tunnels as crystal apertures. In addition, a different and essential view of single adatoms should be obtained by using (100) zone axis tunnels, where atoms would be adsorbed in a different atomic orientation, with respect to the direction of imaging. It was demonstrated, in the present experiments, that by progressing from a direct instrument magnification of 1.7 x lo6 to 1 x lo7 in crystalaperture STEM, a profound influence was exercised on the amount of observable detail in the image. Consequently, it is expected that similar images produced at a magnification of 2 x lo’, which is within the capability of the VG 601, would provide a corresponding improvement in observable detail.
REFERENCES Crewe, A. V., Eggenberger, D. N., Wall, J., and Welter, L. M. (1968). Rev. Sci. Instr. 39, 576. Dyke, W. P. and Dolan, W. W. (1956). Adv. Electron. and Electron Phys. 8, 89 Dyke, W. P., Trolan, J . K . , Dolan, W. W., and Crundhauser, F. J . (1954). J. Appl. Phys. 25, 106. Fourie, J . T. (1979). In “Scanning Electron Microscopy” (0.Johari, Ed.), p. 87. SEM Inc., AMF O’Hare. Fourie, J. T . (1989). Scanning 11, 281. Fourie, J . T . (1992a). Optik 90, 85. Fourie, J. T. (1992b). Optik 90, 134. Fourie, J. T . (1993). Optik 95, 128. Fourie, J . T. (1994). Proc. 13th Int. Congress on Electr. Miscroscopy (B. Jouffrey, Ed.), p. 415. Les Editions de Physique, France. Fourie, J . T., and Terblanche (1992). Optik 90, 37. Hall, C. (1953). “Introduction to Electron Microscopy,” p. 205. McGraw-Hill, London. Hirsch, P. B., Howie, A., Nicholson, R. B., Pashley, D. W., and Whelan, M. J. (1965). “Electron Microscopy of Thin Crystals,” p. 361. Butterworths, London. Krishnamurthy, M., Drucker, J . S., and Venables, J . A. (1990). Proc. 12fb Int. Congress on Electron Microscopy (L. D. Peachey and D. B. Williams, Eds.), p. 308. San Francisco Press. Miiller, E. W. (1960). Adv. Electron. and Electron Phys. 13, 83. Oatley, C. W. (1972). “The Scanning Electron Microscope.” Cambridge University Press, London. Reimer, L. (1984). “Transmission Electron Microscopy.” Springer-Verlag, Berlin. Schiff, L. I. (1968). “Quantum Mechanics,” p. 456. McCraw-Hill, New York. Whelan, M. J . (1979). In “Diffraction and lmaging Techniques in Materials Science” (S. Amelinckx, R. Gevers, and J. van Landuyt, Eds.), p. 43. North-Holland, Amsterdam.
This Page Intentionally Left Blank
ADVANCES IN IMAGING A N D ELECTRON PHYSICS. VOL. 93
Phase Retrieval Using the Properties of Entire Functions N. NAKAJIMA College of Engineering, Shizuoka University 3-5-I Johoku, Hamamatsu 432, Japan
I . Introduction . . . . . . . . . . . . 11. Theoretical Background . . . . . . . . A . Logarithmic Hilbert Transform . . . . B. Exponential Filter . . . . . . . . . C. Fourier Series Expansion . . . . . . D. Lorentzian Filter . . . . . . . . . E. Simulated Example . . . . . . . . 111. Extension to Two-Dimensional Phase Retrieval A. Algorithm . . . . . . . . . . . B. Simulated Example . . . . . . . . C. Experimental Example . . . . . . . IV. Application to Related Problems . . . . . A. Hartley Transform . . . . . . . . B. Stellar Speckle Interferometry . . . . . C. Blind Deconvolution . . . . . . . . D. Coherent Imaging through Turbulence . . V . Conclusions . . . . . . . . . . . . References . . . . . . . . . . . .
. . . . . . . . , . . . ,
.
.
,
.
,
. . . . . .
. . . . . . .
. . .
. .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . , . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
109 112 112 116 118 124 127 131 131 133 134 139 139 143 144 152 167 168
I . INTRODUCTION The wavefront of a monochromatic wave is expressed as a complex amplitude with two parameters, modulus and phase. For high frequency phenomena such as light, X-ray, and electron waves, however, the only physical quantity that can be directly observed is the intensity, which is proportional to the square of the modulus of the complex amplitude, and the phase information inherent in wave phenomena is lost on an intensity recording. For instance, in a light or an electron microscope with monochromatic illumination, the directly measurable quantity is only the intensity distribution in the image plane or some other plane; however, a knowledge of the phase of the complex amplitude in such a plane is indispensable for the structure determination of a scattering object. A standard technique for solving this problem is interferometry or holography; that is, a second 109
Copyright (r) lY95 by ALademic Press, Inc All rights of reproducmn in any form reserved
110
N. NAKAJIMA
coherent wavefront of known modulus and phase is added to the unknown wavefront. The intensity of the sum of the two waves, therefore, depends on both the modulus and the phase of the complex amplitude of the unknown wave. In this chapter, another approach is considered, in which, without such a coherent reference wave, the phase of the complex amplitude is retrieved from one or more intensity distributions. The problem is referred to as phase retrieval and is particularly useful in some situations where a coherent reference wave is rarely available (for example, electron microscopy or X-ray diffraction). In optics, the fundamentals of the phase retrieval problem were first discussed by Wolf (1962) and Walther (1963) who were concerned with the problem of retrieving the phase of the Fourier transform of an object function from the Fourier modulus. This problem is equivalent to reconstructing the object from the Fourier modulus by using the inverse Fourier transform. Since their studies, various investigations of the phase retrieval problem have been performed. Up to now two types of approach to this problem have mainly been studied. The first type is the algorithmic approach using iterative procedures. The second type is the analytic (noniterative) approach based on the mathematical properties of bandlimited functions. Although this chapter is concerned with the latter approach, a brief review of the two approaches is presented here. Gerchberg and Saxton (1972) first proposed an iterative transform algorithm, which bounces back and forth between the object and the Fourier domains, where the object and Fourier moduli are applied as the constraints in the two domains. Iterations are continued until a solution is found that agrees with both the object and the Fourier modulus data. A modified version of the Gerchberg-Saxton algorithm was presented by Fienup (1978), which uses the nonnegativity and/or the support constraints in the object domain instead of the object modulus. He found that there was a great improvement in the reconstruction of two-dimensional (2-D) objects as compared with one-dimensional (1-D) objects. Subsequently, it was shown (Bruck and Sodin, 1979; Hayes, 1982; Sanz and Huang, 1983) that almost all 2-D objects with finite support are uniquely defined (to within some trivial ambiguities) by the modulus of their Fourier transforms in the absence of noise. In spite of this uniqueness property, it is generally very difficult to reconstruct a 2-D object from its Fourier modulus alone, and in consequence, the additional object constraints described previously are necessary. Many successful reconstructions of 2-D objects have been demonstrated by using various kinds of iterative algorithms: for example, there are the reconstructions of nonnegative objects by Fienup’s algorithm (Fienup, 1978, 1982), generalized projections (Levi and Stark, 1984), simulated annealing (Nieto-Vesperinas and Mendez, 1986), maximum
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
111
entropy (Bryan and Skilling, 1986), conjugate gradients (Lane, 1991), and the reconstructions of complex objects by the Gerchberg-Saxton algorithm and the similar iterative algorithm based on measurement of two defocused images (Misell, 1973). However, it is still unknown (even in noiseless cases) what conditions are sufficient in a practical sense to ensure uniqueness for the solution obtained by those iterative algorithms, and hence those algorithms sometimes stagnate in a local minimum solution different from a true one. On the other hand, the analytic (noniterative) approach is derived from the theoretical studies (Wolf, 1962; Walther, 1963) based on the properties of entire functions in one dimension. One of the properties, which is useful in phase retrieval, is a logarithmic Hilbert transform relationship between the Fourier modulus and phase. A number of researchers (see, for example, Hoenders, 1975; Burge et al., 1976) have formulated several versions of the logarithmic Hilbert transform relationship, whereby the Fourier phase can be evaluated from the Fourier modulus and the positions of zeros in the complex lower half plane of the Fourier transform function extended its real variable into a complex one. However, it is impossible to deduce the positions of zeros from only the Fourier modulus. A zero location method using an exponential filter has been proposed (Walker, 1981; Wood et al., 1981; Nakajima and Asakura, 1982). This method is to locate the complex zeros from comparison between two Fourier intensity distributions of the exponential filtered and unfiltered objects. In this method, however, searching many zeros in the complex plane becomes troublesome, because the relation between the Fourier modulus and the positions of zeros is nonlinear. A linear method of retrieving the phase from two Fourier intensities of the filtered and unfiltered objects without zero location has been proposed (Nakajima, 1987). In contrast to the iterative algorithms for phase retrieval, the analytic approach using the properties of entire functions ensures the uniqueness of the solution in a 1-D case but has some difficulties in extending to a 2-D case. For example, the useful generalization of the logarithmic Hilbert transform to the 2-D case cannot be accomplished mathematically (NietoVesperinas, 1980). Thus, Walker (1982) has shown the reconstruction of a 2-D object by combining the exponential filter method with an iterative algorithm. Nakajima and Asakura (1985, 1986) have presented an algorithm, for applying 1-D phase retrieval methods to a 2-D case. Using this algorithm with the linear method for phase retrieval by exponential filtering (Nakajima, 1987), we can determine the 2-D Fourier phase from three Fourier intensities measured without a filter and with two exponential filters decaying in the horizontal and vertical directions (Nakajima, 1989). Deighton et al. (1985) have proposed an algorithm for generating all
112
N. NAKAJIMA
possible solutions of a 2-D phase by the zero location in the whole complex plane of 1-D strips of a single 2-D Fourier intensity distribution. Lane el a/. (1987) considered another approach to phase retrieval by tracking the zero sheets of the 2-D Fourier intensity extended analytically into 2-D complex space (four real dimensions), in which the zero sheet of the Fourier transform of an object is separated from the zero sheet of its complex conjugate, thereby allowing the Fourier transform function to be reconstructed from one Fourier intensity distribution. This approach, however, tends to be computationally intensive and sensitive to noise, like other methods that employ the complex zeros. This chapter is devoted to a review of phase retrieval using the logarithmic Hilbert transform and the exponential filter method with their application to some related problems. A characteristic of the present phase retrieval method is that the uniqueness of the solution by this method is ensured mathematically although it requires two (in 1-D cases) or three (in 2-D cases) Fourier intensity distributions. In Section I1 the theoretical background for the properties of entire functions is presented. Section I11 introduces the algorithm for applying the 1-D phase retrieval method to the 2-D case, and presents the simulated and experimental examples of reconstructing 2-D objects. Finally, the application of the present method to the problems related to phase retrieval is considered in Section IV. Because this chapter can only cover part of some facets of the phase retrieval problem, the reader is referred to previously presented excellent reviews, among which are the books by Saxton (1978) and by Hurt (1989), and the book chapters by Ferwerda (1978), Ross et al. (1980), Bates and Mnyama (1986), Hayes (1987), Dainty and Fienup (1987), Levi and Stark (1987), Fiddy (1987), and Fienup (1991).
11. THEORETICAL BACKGROUND
A . Logarithmic Uilbert Transform The phase retrieval problem is only solvable if the complex amplitude function at the observation plane belongs to a particular class. This particular function is called an entire function of exponential type. The mathematical properties of entire functions of exponential type have been consistently studied so far in the phase retrieval problems for onedimensional situations. The discussion in this section is also restricted to a one-dimensional case.
113
PHASE RETRIEVAL BY ENTLRE FUNCTIONS
We now set the following two basic assumptions: 1. A scattering object is of finite extent. 2. A Fourier transform relationship exists between the object function and the scattered complex amplitude function in the diffraction region.
According t o these two assumptions, we consider the phase retrieval problem
on a Fourier-transforming optical system with a converging lens as shown in Fig. 1. We assume quasi-monochromatic fully spatially coherent illumination and a transparent object. The complex amplitude in the object plane is regarded as the object functionf(u), where the object plane is defined as the plane immediately behind the object perpendicular to the optical axis. Then, the scattered complex amplitude function F(x) in the Fourier plane is defined by
F(x) =
Sob
f(u) exp( - 2nixu) du,
(1)
where the interval a Iu Ib of integration represents the object extent, x is a variable normalized with the product of the wavelength I of illuminating light and the focal length f of lens, and the multiplicative factors outside the integral and the effect of the lens aperture have been neglected since these are not essential to discussions in this review. By changing the real variable x to the complex one with z = x + iy in Eq. (l), the function F(x) on the real axis x is extended to the function in the complex plane. Then the function F(z) becomes an entire function of exponential type b from a theorem formulated originally by Paley and Fourier plane
Object plane
Object
I
Lens
LX
-1
Holder
1
FIGURE1. Schematic arrangement of Fourier transforming optical system: f is the focal length of lens.
114
N. NAKAJIMA
Wiener (1934) if and only if it is given by
F(z) =
S‘
f(u) exp( -2nizu) du,
where 0 Ila1 Ib < ao, f(u) is an integrable function in (a, b), and the function F(z) is square-integrable on the real axis [i.e., F(x) E L2(- 0 0 , m)]. From Eq. (2), it is understood that the complex amplitude function appearing in optics belongs to entire functions. The entire function is analytic in the whole finite complex plane with the remarkable properties. One of them, which is useful in the phase retrieval problem, is the fact that, if the lower limit a of the interval (a, b) is nonnegative, the real and imaginary parts of F(x) are related by the well-known Hilbert transforms or dispersion relations (Titchmarsh, 1948),
dx‘,
(3)
dx’,
(4)
where Re and Im indicate taking the real and imaginary parts, respectively, and P denotes that the Cauchy principal value is to be taken. These relationships can be obtained from the calculation of a contour integral in the complex lower half-plane. If either the real or imaginary part of F(x) is obtained in an experiment, the complex function F(x) can be calculated from the relation of Eq. (3) or (4) and, finally, the object functionf(u) can be reconstructed from F(x) by the inverse Fourier transform. In actual situations, only the modulus of F(x) is directly obtained from detection of the intensity. Therefore the relationship between the modulus and the phase of F(x) given by F(x)
=
IW)Ie x P [ k w l
(5)
is more desirable than that between the real and imaginary parts of F(x). For this purpose, F(x) is modified by taking its natural logarithm as follows: ln F(x) = In IF(x)I
+ i4(x).
(6)
The Hilbert transform relationship between the real and imaginary parts of InF(x) can be obtained as the same form of Eq. (4) by 4(x)
=
--P n
1,
-m
ln’F(x’)l
x’(x - x’)
dx’ - 2nax
+ 4(0),
(7)
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
115
where $(O) is the constant phase at x = 0, and -2nux is the linear phase term due to the lower limit a of the object interval in the object plane. Equation (7) is called the modified logarithmic Hilbert transform for the function In F(x), which was formulated by Burge et ul. (1976). Since In F(z) has the same region of analyticity as F(z) except at the points where F(z) = 0, the relation of Eq. (7) can be established only in the case that lnF(z) does not have any singularities in the complex lower half plane. Unfortunately, the actual situation is not so simple, because many functions generally have zeros in the complex lower half plane. Consequently, Eq. (7) cannot always be used to calculate the phase +(x) from the modulus of F(x), and the logarithmic Hilbert transform should be considered by taking into account the influence of zeros in the complex lower half plane on the derivation process of the actual phase. In consideration of this point, we now introduce the Hilbert function given by Fh(x)
=
IF(x)l exp[i6h(x)l,
(8)
where &(X) is the Hilbert phase calculated from Eq. (7). In other words, the Hilbert function corresponds to a function all of whose zeros in the complex lower half plane are reflected onto the upper half plane. It is well known that an entire function of exponential type may be described everywhere by its zeros with the expression being known as a Hadamard product (Boas, 1954), m
F(z) = B
rI (1 j = I
-
dzj),
(9)
where we assumed that there is no zero at the origin of the complex plane, and B is a scaling constant. Using the Hadamard product, we may represent the relation between the Hilbert function Fh(x) with zeros only in the complex upper half plane and the actual complex function F(x) with zeros in both upper and lower planes as
where N is the number of zeros in the complex lower half plane, zj is the vector notation of the j t h zero in the complex lower half plane [i.e., F(zj) = 01, and the asterisk denotes the complex conjugate. Substitution of Eqs. ( 5 ) and (8) into Eq. (10) yields
116
N. NAKAJIMA
where the modulus of the product term in Eq. (10) is unity and the symbol arg denotes the argument of the complex function (zj - x). The phase terms in Eq. (1 1) are given by N
$h(X)
Larg(zj -
= $(XI -
-
arg(zj)l*
(12)
j =1
Since the Hilbert phase $h(X) is calculated by using Eq. (7) from the modulus of F(x), the general Hilbert transform involving the influence of zeros of F(x) in the complex lower half plane can be finally obtained from Eqs. (7) and (12) as $(x)
x
= -- P 7r
j-
lnlF(x‘)l dx’ x’(x - x ’ )
N
+2 C
[arg(zj - x)
-
arg(zj)] - 2nax
+ $(O).
(13)
j =1
The first term on the right hand side of Eq. (13) corresponding to the Hilbert phase implies the fundamental minimum condition of the phase, in which an object function to be reconstructed from the observed Fourier modulus and the retrieved phase must have the finite extent ( b - a) in the object plane. The second term in Eq. (13) supplements the information about the object function determined by the first term. This complemental information corresponds to the effect of the zeros of F(z) in the complex lower half plane, which does not appear in the modulus IF(x)) and is only contained in the phase $(x). The rest of the terms represent the linear and constant phases, and the effect of these phases do not appear in the positions of zeros of F(z) and the modulus IF(x)(. The ambiguity concerned with the linear and constant phases is situated outside the phase retrieval from the intensity distributions and will be regarded here as unimportant components. To use Eq. (13) in the phase retrieval problem, the positions of zeros of F(z) in the complex lower half plane must be known beforehand. Therefore, in the next section we consider the problem of finding the zeros of F(z).
B. Exponential Filter Since the intensity distribution IF(x)I2 is generally recorded in a conventional experiment, we expand IF(x)I2 into the complex plane by a process of analytic continuation and investigate the positions of zeros of the expanded intensity distribution IF(z)I2 (where z = x + iy). From Eq. (2),
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
117
the function IF(z)(’ is given by
=
1;
i : f ( u ) f * ( u ’ ) exp[-2xi(~ - u’)z] dudu’.
With the introduction of a variable r rewritten as
IF(z)I2=
jb-=
-(b-a)
1
= u - u’,
(14)
this equation can be
b
exp(2xyr)
f(u’ + r)f*(u’) du’ exp(-2nixz) dr, (15)
a
where b - II and - ( b - a) are the maximum and the minimum in the variation of r, respectively. This equation indicates that the function IF(z)l’ in the complex plane is obtained by taking a Fourier transform of the product of the exponential function exp(2xyr) and the autocorrelation of the object function f(u). It is noted that the autocorrelation of the object function is obtained from an inverse Fourier transform of the observed intensity distribution IF(x)I2. Hence, the function IF(z)I2can be easily calculated from the observed intensity IF(x)I2 by using a computer so that the positions of zeros of IF(z)(’ in the complex plane can be known. Since the function IF(z)1’ takes zeros where F(z) and F*(z*) become zeros, two sets of zeros for F(z) = F*(z*) = 0 appear at the conjugate positions of the complex plane. In other words, the distribution of zeros of IF(z)I2is always symmetrical about the real axis x in the complex plane. Evaluating Eq. (15), we can then determine the values of xj f i l y j l , ( j = 1,2, 3, . . . , n) for the positions of all zeros of F(z) in the complex plane. The observed intensity IF(x)I2contains the information about two possible positive and negative values ( y j > 0 and y j < 0) of yj for each zero of F(z). To use Eq. (13) in the phase retrieval of the function F(x), however, the zeros of F(z) have to be located in the complex plane rather than those of IF(z)I’ and the position vectors of zeros of F(z) in the complex lower half plane must be substituted into Eq. (13). Consequently, the sign of y j is next asked to be determined for the phase retrieval. If N complex zeros are present for the function F(z), there are 2N different functions, corresponding to the observed intensity IF(x)1’, with the same modulus on the real axis but with a different distribution of the phase. The ambiguity of solving the phase problem becomes 2N and hence one true solution for the zero distribution cannot be directly obtained from the observed intensity distribution IF(x)I’. Of course, any real zeros are seen as zeros on the x axis and hence their location is immediate. Since real zeros coincide with their complex conjugate, they are not a source of ambiguity in the phase retrieval.
118
N . NAKAJIMA
A method of zero location for the function F(z) by means of an exponential filter was proposed independently by Walker (1981), Wood et al. (1981), and Nakajima and Asakura (1982). This method is based on two intensity measurements in the Fourier plane of the object. The first is a measurement of the intensity distribution IF(x)I2 of a Fourier transform of the object. The second measurement is made with the object modulated by an exponential filter exp( -2ncu) (a mask with an exponentially decaying transmittance), where c is a known constant. The complex amplitude p(x) of the Fourier transform of the filtered object is given by
p ( x ) = F(x - ic) =
lb
f(u) exp( -2ncu) exp( - 2nixu) du.
(16)
.a
This modulation by the exponential filter has the effect of shifting the Fourier transform function along the imaginary axis of the complex plane of its argument. In the case of the positive value of c, the function F(z), which is expanded into the complex plane by a process of analytic continuation, has the zero distribution shifted toward the positive direction of the imaginary axis with a distance ic from the zero distribution of F(z). Then the function I ~ S ’ ( Z calculated )~~ from the intensity 1F(x)l2has the zeros (zj = xj f ily, + c l , j = 1, 2, 3, . . ., n ) symmetric about the real axis. Using the known constant c of the exponential filter and the two values lyjl and Iyj + cI, the sign of y j can be determined and then the zeros of the function F(z) can be located. Consequently, the phase $(x) of the complex amplitude F(x) in the Fourier plane can be determined by calculating Eq. (13) with the modulus IF(x)l and the positions of the zeros in the complex lower half plane. Substituting the zeros of F(z)into Eq. (9), we can also determine both the modulus IF(x)I and the phase @(x)except for linear and constant phases. Although the two phases retrieved by using Eqs. (9) and (13) are equal in principle, the phase retrieval using Eq. (13) has the advantage that the number of zeros to be taken into account in Eq. (13) is less than in Eq. (9) because the influence of zeros on the real axis and in the complex upper half plane is automatically taken into account by the logarithmic Hilbert transform.
C. Fourier Series Expansion The phase retrieval based on the zero location by means of an exponential filter in the previous section mathematically ensures the uniqueness of the solution in the one-dimensional case. The zero location method, however, has a problem in that the location of many zeros in the complex plane
119
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
becomes troublesome because the relation between the observable intensity distribution and the position of its zeros is nonlinear. In this section, the linear method (Nakajima, 1987, 1988a) of retrieving a Fourier phase from Fourier intensity distributions of an object is presented. This method is based on the solution of linear equations consisting of unknown coefficients in the Fourier series of the phase and two Fourier intensities obtained with and without expontial filtering in the object plane. 1. Phase Retrieval Using a Fourier Series Basis For the derivation of the following equations, we first rewrite Eq. ( 5 ) as
F(x) = M(x) exp[i+(x)l,
(17)
where M(x) denotes the Fourier modulus IF(x)I. From Eqs. (16) and (17), the complex amplitude of the Fourier transform of the object modulated by an exponential filter exp( - 2ncu) is given by 4 x 1 = F(X - ic) =
M(X
-
ic) exp[i+(x - ic)].
(18)
The observable modulus is then written as
lF(x)l
=
I M (X
-
ic)l exp[-Im +(x - ic)],
(19)
where Im denotes the imaginary part of the phase function 4(x - ic), which is the complex function owing to the expansion of the real variable x into the complex one x - ic. It can be seen from Eq. (19) that the Fourier modulus of the filtered object contains the information of the Fourier phase 4(x). Equation (19) can be rewritten as
The left-hand side of this equation can be calculated from the observed data, because 1F(x)1 is the square root of the Fourier intensity measured when the exponential filter is used in the object plane, and because IM(x - ic)l is related to M(x), the Fourier modulus measured without the filter, by the relationship M(x
-
ic) =
Im[ lm
1
M ( x ’ )exp(27riux’) dx’
,-m
-m
x exp( - 2ncu) exp( - 2nixu) du.
(21)
Thus M(x - ic) is the Fourier transform of the product of the inverse Fourier transform of M(x) and the exponential function exp( - 2ncu).
120
N. NAKAJIMA
Finally we consider a method of computing the unknown phase function +(x).One approach to solving Eq. (20) is to represent +(x), for - I < x < I , in terms of an appropriate basis function, e.g., a Fourier series basis (Nakajima, 1987), j = 1
where n is large enough to encode the phase distribution. Thus the unknown phase function 4(x) is represented by the unknown coefficients aj and bj ( j = 1, ..., n). Substituting Eq. (22) into Eq. (20) and evaluating the imaginary part of 4 ( x - ic) we obtain In
IM(x - ic)l
j=1
By using the two moduli Ip(x)l and lM(x - ic)l at 2n values of x we obtain 2n simultaneous equations from which the unknown coefficients aj and bj ( j = 1, . . ., n) can be determined. The phase 4(x) is retrieved by substituting the results of the solution into Eq. (22). Consequently, the object function is reconstructed by an inverse Fourier transform of the function with the observed modulus IF(x)I and the retrieved phase 4(x).
2. Phase Retrieval Using the Logarithmic Hilbert Transform and the Fourier Series Expansion The procedure using Eq. (23) is applicable for all kinds of object functions except for a Hermitian function. When the object function is a Hermitian function [i.e., f(u) = f * ( - u ) ] , the simultaneous equations of Eq. (23) cannot be solved because its modulus 1F(x)l becomes equal to 1M(x - ic)l in all values of x . There is a method of phase retrieval for such functions (Nakajima, 1988a) that is based on use of the logarithmic Hilbert transform and a similar Fourier series expansion. We first rewrite Eq. (13) as
$(x) =
$h(X)
+ &(x),
(24)
where & ( X ) is the Hilbert phase and can be calculated from the observable modulus IF(x)I by using Eq. (7) except for the trivial factors (i.e., constant and linear phases), and 4,(x) is the phase with the influence of the zeros in the complex lower half plane. Substitution of Eq. (24) into Eq. (17) gives
F(x) = FhW exp[i4,(x)l,
(25)
where Fh(x) is the Hilbert function of Eq. (8). Using Eqs. (18) and (25), the observable Fourier modulus of the filtered object can be rewritten as
lF(x)l
=
IF, (x
- ic)l
exp[-Im 4,(x - ic)].
(26)
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
121
The function Fh(x - ic) can be calculated from the Hilbert function Fh(x) by the same process of analytic continuation as in Eq. (21); that is, Fh(x - ic) is obtained by taking a Fourier transform of the product of the inverse Fourier transform of the Hilbert function Fh(x)and the exponential function exp( - 27ccu). Representing &(x) in terms of the same Fourier series basis as in Eq. (22) and substituting the imaginary part of +,(x - ic) into Eq. (26) we obtain
The unknown coefficients aj and bj ( j = 1 , ..., n) can be solved from the , which simultaneous equations for the 2n data of ln[lF(x)l/lFh(x - i c ) ( ] in is the modulus observed with the exponential filter and IFh(x - ic)(is evaluated from the modulus IF(x)l by using Eqs. (7), (8), and (21). The phase +,(x) with the influence of zeros in the complex lower half plane is determined from the results of the solution. Consequently, the phase d(x) of the Fourier transform F(x) of the object can be obtained by adding the phase 4,(x) to the Hilbert phase &(x). 3. Analysis of the Information Contained by the Ratio of Fourier Moduli
Equation (27) can be applied to a case of any distribution of zeros, and hence it permits the treatment of retrieving the Fourier-transform phase of a Hermitian object function, because the ratio in the left-hand side of Eq. (27) involves the influence of only zeros situated in the complex lower half plane and it never becomes unity for all values of x. These features of Eqs. (23) and (27) can be proved by using the Hadamard product of Eq. (9) as follows: By representing the zeros in the lower and upper half planes separately, we can rewite Eq. (9) as
where B was set to be unity for simplicity, zuj and z l j are the vector notations of thejth zero in the upper and lower half planes, respectively, N , and N , are the numbers of the corresponding zeros, and the zeros situated on the real axis (z = x) are included among the group of zuj. The function F(x) in Eq. (18) is expressed with the help of Eq. (28) as
122
N. NAKAJIMA
where
zuj = xuj + iyuj, zlj = xlj + iylj. We now introduce a new real axis
x’
=
x
-
ic,
(31)
which is obtained by shifting the old real axis x by a distance c toward the negative direction in the complex plane. Then the observable intensity Ip(x)12is given by
where
+ i(yuj+ c), z [ j = X i j + q y I j+ c).
zLj
=
xij
(33)
Also, in Eq. (33)
x U’J . = x U J. - ic X i j = XI,
-
(34)
ic.
The square root of Eq. (32) corresponds to the numerator of Eqs. (23) and (27). The function M(x - ic) in the denominator of Eq. (23) is written from Eqs. (14) and (28) as M(X - ic) = [F(X- i c ) ~ * (+x i c ) ~ ” ~
The square of the modulus of the function M(x - ic) on the new real axis x’ = x - ic is then given by
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
123
where = x i j - i(yU J.
26' = X i j
-
c),
(37) -
i ( y l j - c).
It is found that the coefficients of the Fourier series of the phase in Eq. (23) are calculated from the ratio of the distributions of zeros in Eqs. (32) and (36). We consider the case of Hermitian object functions. A Fourier-transformed function F(x) of the Hermitianf(u) has a symmetric distribution of zeros about the real axis x in the complex plane (Fiddy and Ross, 1979); I.e.,
z
U J'
(38)
= 26.
Equation (36) is then reduced to
where the relations zLj = z;', Z i j = z C , and zAj = z:j (for the zeros on the real axis x) have been used. It is evident that this equation is equal to Eq. (32). Consequently, in the case of Hermitian objects, the unknown coefficients of phase cannot be solved by using Eq. (23) because the lefthand side of Eq. (23) becomes zero for all values of x. We next consider the case of Eq. (27). There is a difference betweeen the denominators of Eqs. (23) and (27). The Hilbert function Fh(x)involved in Eq. (27) can be expressed with the help of Eq. (28) as
since the function Fh(x) is obtained by reflecting the zeros of F(z) in the complex lower half plane onto the upper half plane (Burge er al., 1976). On the new real axis, x' = x - ic, Eq. (40) becomes
The square of the modulus of this function on the new real axis x' is given by IFh(x - ic)I2 =
I
(l j = I
111'>12 I z:j
.2I)$
(lj = I
21j
(42)
124
N. NAKAJIMA
Even in the case of Hermitian objects , the ratio of Eqs. (32) and (42) never becomes unity for all values of x because it involves the influence of only zeros situated in the complex lower half plane. Therefore the unknown coefficients of Eq. (27) can be solved in the case of Hermitian objects. If the function F(z) has no zeros in the complex lower half plane, the ratio of Eqs. (32) and (42) become unity for all values of x , but the phase 4(x) can be determined only from the modulus IF(x)l by using the logarithmic Hilbert transform of Eq. (7). It is found from Eqs. (32) and (42) that the phase retrieval method in Section II,C,2 can be applied to a case of any distribution of zeros in the complex plane.
D. Lorentzian Filter In the previous section we considered the method for retrieving the phase of the Fourier transform of an object from Fourier intensity distributions obtained with and without an exponential filter in the object plane. This method, however, is not appropriate for the cases in which the direct operation of exponential filtering in the object plane is not possible. If, for instance, the object is isolated from its Fourier transform plane (i.e., observation plane) such as in remote sensing, the direct filtering operation on the object for measuring the Fourier modulus becomes difficult. In such a situation, a filtering operation in the Fourier transform plane of the object may be preferable. In this section an alternative phase retrieval method (Nakajima, 1992) is presented that allows for the retrieval of the Fourier phase of an object from its Fourier modulus and the modulus of a convolution between the Fourier transform of the object and a Lorentzian filter. The Fourier transform of the exponential filtered object is not equivalent precisely to the convolution between the Fourier transform of the object and the Lorentzian filter because the inverse Fourier transform of a Lorentzian function is not an exponential function but a symmetrical double exponential. Assuming that the extent of the object is finite, we can consider a quasi-equivalence between the Lorentzian and the exponential filtering. Figure 2 depicts an example of an optical system in one dimension that gathers the types of data needed for the phase retrieval described here. We assume quasimonochromatic fully spatially coherent illumination and a transparent object with a finite extent. The complex amplitude function in the object plane is regarded as the object function f(u). The intensity data in the Fourier plane are collected by scanning the system of detector 1 and detector 2 with a Lorentzian filter and a converging lens L2 along the x axis
125
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
Fourier plane
C ject I ine
X
1
Object
Detector 1
Holdei
1-
‘1
\ kf2-4 ++
Lorentzian filter
‘1
FIGURE2. Schematic configuration of the phase retrieval using a Lorentzian filter:.&andJ2 are focal lengths of lenses L , and L,, respectively.
in Fig. 2. The data collected by detector 1 are the Fourier intensity IF(x)I2 of the object functionf(u). From detector 2, we can obtain the squared modulus of a convolution integral between the Fourier transform F(x) of the object and a Lorentzian filter with the amplitude transmittance c2/(x2 + c 2 ) ,where c is a known positive constant. The process related to detector 2 is formally explained as follows: In Fig. 2, the Lorentzian filter is placed immediately in front of lens L2 and detector 2 is placed in the back focal plane of lens L , . Then the intensity distribution of field amplitude G(x, u ’ ) on detector 2 can be written as
IW, u‘)I2 =
I 1.
C2
F(x’) (x‘- x)2
+
c2
l2
exp( -2niu‘x’) dx’ ,
(43)
where u’ is the coordinate axis on detector 2, x is the coordinate of a center of the Lorentzian filter, and r is the finite extent of the aperure of lens L , . When the intensity at the origin of u’ = 0 is observed, its modulus is given from Eq. (43)as
IG(x9 0)l
=
I s.
m’) (x’-
dx’l
C2
x)2
+
c2
.
(44)
We assume that the aperture extent is large enough to show the relation
I a x ,0) I = IF(x)I
9
(45)
126
N . NAKAJIMA
where
in which the term (x’ - x)’ is rewritten as (x - x’)’ by virtue of the symmetry of the Lorentzian function. Since Eq. (46) is the convolution integral between two functions F(x) and c’/(x’ + c’), the inverse Fourier transform of Eq. (46) is given by
L m
p(x)exp(2niux) dx = f ( u ) c nexp(-
I2ncuI).
(47)
Under the condition that the object is set on one side of the object plane, as shown in Fig. 2, the Fourier transform of Eq. (47) becomes p(x)
=
I’ 1
f ( u ) c nexp( - I2ncuI) exp( -2nixu) du b
=
cn
f ( u )exp[- 2ni(x - ic)u]du
a
=
cnF(x - ic),
(48)
where the interval (a, 6) of the integral denotes the finite extent of the object functionf(u) which lies above the optical axis in Fig. 2 (i.e., 0 < a < b). It can be seen from Eqs. ( 4 9 , (46), and (48) that the Fourier modulus IF(x - ic)l obtained from the object function multiplied by an exponential function exp(-2ncu) is equivalent to the modulus 1G(x, 0)l observed by making use of the Lorentzian filter at the Fourier plane, provided that the object is set on one side of the object plane. Consequently, we can also utilize the numerical procedure used in the previous section for the phase retrieval by exponential filtering. Thus, p(x) is rewritten from Eqs. (18) and (48) as F(x) = c n ~ ( x ic) exp[i+(x - ic)]. (49) Substituting Eq. (49) into Eq. (45) we obtain
Since the left-hand side of this equation can be calculated from the observed data, the unknown phase function +(x) can be determined by a Fourier series expansion procedure as shown in Section II,C. The phase retrieval for Hermitian object functions can also be carried out by the method based on use of the logarithmic Hilbert transform and the Fourier series expansion.
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
127
E. Simulated Example In this section, a computer simulation of an object reconstruction is done by using the phase retrieval method described in Section II,C,l, since the principal part of phase retrieval in this review is to use an exponential filter and a Fourier series expansion. The following data processing by a computer will be carried out with 128 sampling points. The object function for the reconstruction is a phase object with a finite extent exp(i2[cos(2nu)
+ sin(O.lnu)])
-0.625 Iu 5 0.625 otherwise.
(51)
The modulus and phase of the original object function f(u) are shown in Figs. 3a and 3b, respectively. The modulus of the object function multiplied by an exponential filter exp( - 2ncu) with parameter c = 0.08 is shown in Fig. 3c. Studies of numerical experiments on phase retrieval using exponential filtering indicate that suitable values of c are in the range O.O2/w < c < 0.25/w, where w denotes the width of an object function. The modulus M(x) of the Fourier transform F(x) of the object function is shown in Fig. 4a. Figure 4b shows the Fourier modulus IF(x)l of the object multiplied by the exponential filter. Figure 4c shows the modulus
n
a
"I
C
FIGURE 3. The original phase object function of Eq. (51): (a) modulus, (b) phase of the object, and (c) modulus of the object multiplied by an exponential filter exp(-2ncu) with parameter c = 0.08.
128
N. NAKAJIMA
FIGURE4. The Fourier moduli of the object in Fig. 3 for phase retrieval: (a) ordinary Fourier modulus IF(x)l of the object; (b) modulus I&)[ of the Fourier transform of the object multiplied by the exponential filter as shown in Fig. 3c; (c) modulus IM(x - ic)( of the function calculated from the Fourier modulus in (a) by using Eq. (21); (d) retrieved Fourier phase.
IM(x - ic)l calculated from the modulus in Fig. 4a by using Eq. (21). Thus the simultaneous equations with 124 lines were constructed by using Eq. (23), the data of the two moduli Ip(x)>land IM(x - ic)l, and 62 unknown coefficients each for aj and b j . A LU-decomposition method, which comprises the lower and the upper triangularization techniques of matrix, was used for solving the simultaneous equations. Note that the data of the two Fourier moduli at two or three sampling points in the neighborhood
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
129
FIGURE5 . Reconstructed phase object function: (a) modulus and (b) phase of the object obtained by taking a n inverse Fourier transform of the complex function with the modulus of Fig. 4a and the retrieved phase of Fig. 4d.
of both sides of the interval (-I < x < I) must be neglected to stabilize the solution of the simultaneous equations. In this simulation the Fourier modulus data of two sampling points from both sides of the interval were omitted. Figure 4d shows the Fourier phase 4(x) retrieved by substituting the resultant coefficients aj and bj ( j = 1, ...,62) into Eq. (22). Figures 5a and 5b show the modulus and phase of the reconstructed object, respectively, which is obtained by taking an inverse Fourier transform of the complex function with the Fourier modulus of Fig. 4a and the retrieved phase +(x) of Fig. 4d. It is seen from comparison of Figs. 3 and 5 that the object is almost faithfully reconstructed with the exception of a Gibbs phenomenon that appears in the modulus of Fig. 5a. The ambiguity of the constant phase difference between Figs. 3b and 5b is unavoidable in phase retrieval from intensity as is mentioned in Section I1,A. To simulate phase retrieval from measurements of the noisy intensity, a complex normal random noise n(x) with n(x) = nl(x) + in2(x),nl(x) and n2(x) being normal random numbers and independent from each other, is produced by a computer and added to the Fourier transform F(x) of the object with a form of F,(x) = F(x) + n(x). Another complex normal random noise n’(x) with a power level similar to that of n(x) is also added to the Fourier transform p(x) of the object modulated by the same exponential filter as in Fig. 3c. A factor of the signal-to-noise ratio (SNR) defined by SNR = 1, Ip(x)12/C, ln’(x)(’ is now introduced at the Fourier plane. Figures 6a and 6b show the noisy Fourier moduli IF,(x)l and l&(x)l for the object function in Fig. 3, respectively. The SNR is 136.6 in Fig. 6b. The noisy modulus IM,(x - ic)l shown in Fig. 6c was evaluated from the data in Fig. 6a by using Eq. (21). The object function reconstructed from the
130
N. NAKAJIMA
FIGURE 6. The noisy Fourier moduli of the object in Fig. 3 for phase retrieval: the moduli (a), (b), and (c) are the same as in Figs. 4a, 4b, and 4c, respectively, but for the presence of noise.
moduli in Figs. 6b and 6c by the same procedure as in the noiseless case is shown in Fig. 7. The reconstructed object in Fig. 7 is disturbed by the noise, but the features of the original object are retrieved even in the noisy case. The reconstruction using the present phase retrieval method is comparatively robust for noise.
b
:p!+ -71
FIGURE 7. Same as in Fig. 5 but for reconstruction from the noisy moduli in Fig. 6.
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
131
111. EXTENSION TO TWO-DIMENSIONAL PHASE RETRIEVAL
A. Algorithm The two-dimensional (2-D) phase retrieval can be applied in many fields; for example, in scattering phenomena problems and optical astronomy. The phase retrieval based on the properties of entire functions of exponential type ensures the uniqueness of the solution in a one-dimensional (1-D) case. However, there are some difficulties in extending it to a two-dimensional (2-D) case. Nieto-Vesperinas (1980) has tried a generalization to the 2-D formula of the logarithmic Hilbert transform, but it has been shown that the 2-D formulation of the logarithmic Hilbert transform does not provide an expression which is useful for retrieving the 2-D phase from the measured 2-D modulus. Deighton et al. (1985) have proposed a method for generating all possible solutions of a 2-D phase, which uses the direct zero location in the whole complex plane of one-dimensional strips of a single 2-D intensity distribution. Lane et al. (1987) have presented an approach to phase retrieval by tracking the zero sheets of the Fourier intensity extended analytically into 2-D complex space (four real dimensions), wherein the zero sheet of the Fourier transform of an object function is separated from the zero sheet of the complex conjugate of the Fourier transform. Because approaches of this kind must employ the complex zeros in many 1-D complex planes or the zero sheets in 2-D complex space, they tend to be computationally intensive and sensitive to noise. In this section, an extension (Nakajima, 1989) of the 1-D method described in Section I1 to 2-D phase retrieval is presented. The extension algorithm uses three Fourier moduli of the object, which are measured without a filter and with two exponential filters decaying in the horizontal and vertical directions. The advantage of this algorithm is that it is relatively fast and insensitive to noise. First, the 2-D Fourier transform relationship is defined by
F ( x l ,x2) =
I s.
f(ul,u,) exp[-2ni(x, u I
+ x2u2)1du, du,,
(52)
wheref(u, , u,) is a 2-D object function with a finite extent R at the object plane, and F(x, ,x,) is a complex function of the form W
I > x2) = I W I
9
x2)I
exp[idJ(x,, X d l l
(53)
in which lF(x,, x,)l and q5(xl,x,) are the observable modulus and the phase of F(x, ,x,) in the Fourier transform plane.
132
N. NAKAJIMA
The phase retrieval method described in Section II,C is implemented in the 2-D case by sectioning the Fourier modulus into a set of 1-D parallel slices. We first obtain 2-D Fourier modulus of the object from the measurement of the ordinary Fourier intensity according to the relationship of Eq. (52). A second measurement is made with the 2-D object modulated by an exponential filter with an amplitude transmittance exp( -2ncul). The result of this modulation is given from Eq. (52) as F(xl - ic,x2) =
s 5.
exp(-2ncul)f(ul, u2)
x exp[-2ni(xlul
+ x2u2)]du, du,,
(54)
where c is a known positive or negative constant. These two Fourier moduli IF(x,, x,)l and IF(x, - ic, x2)l are used to calculate the Fourier phases on a set of 1-D slices (lines) in the direction of the x1 axis. Then we consider a function on an arbitrary line of x, = C i n Eq. (54). From Eqs. (53) and (54), F(x, - ic, C ) can be rewritten as F(x, - ic, C ) = M(x, - ic, C) exp[i+(x, - ic, C ) ] ,
(55)
where we introduced the modulus function defined by M(xl,xz) IF(x,, x2)l. The observable modulus is then given by IF(xl - ic, C)l
=
IM(x, - ic, C)l exp[-Im+(x, - ic, C ) ] ,
=
(56)
where Im denotes an imaginary part of the complex function +(xl - ic, C ) . The function M(x, - ic, C ) can be calculated by taking the l-D Fourier transform of the product of the l-D inverse Fourier transform of M(xl, C ) and the exponential function exp( - 2ncu1) in the same manner as in Eq. (21). As described in Section II,C, we assume that the l-D phase +(xl, C ) within an observational region ( - I < x1 < I ) of the modulus M(x, , C ) can be represented approximately by using the Fourier series basis as
where a j ( C ) and b j ( C )(n = 1, ...,n) denote unknown coefficients for the l-D phase on an arbitrary line of x2 = C. Substituting Eq. (57) into Eq. (56) and evaluating the imaginary part of +(xl - ic, C ) we obtain
j = 1
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
133
Consequently, the phase + ( x l ,C ) can be obtained by substituting the results of the solution of Eq. (58) into Eq. (57). This procedure is conducted for a set of the lines parallel to the x , axis. Then there are unknown constant phase differences among these lines independent of each other. To resolve this ambiguity, a third measurement of the Fourier intensity is made with an exponential filter exp(-2ncu2), which is obtained by a 90" rotation of the mask exp( -2ncu,). This measurement, together with the measurement of the Fourier intensity in the absence of the filter, is used to determine the phase distribution along one line in the direction of the x , axis. If we choose the phase distribution 4(0, x2) on the x2 axis as the constant phase differences, the 2-D phase is determined by adding the constant phase to the phases on each line parallel to the x 1 axis, +(XI,
xz) = 44x1 C ) + +(O, xz), 9
(59)
where the coordinates x, and C have to change with the same value. Then the object function f(ul,u2) is reconstructed by an inverse Fourier transform of the function with the observed modulus IF(x,,x,)I and the retrieved phase + ( x l ,x2). The present method allows us to retrieve the 2-D phase from three moduli, IF(x,, xz)l, IF(x, - ic, xz)l, and IF(xl,x2 - ic)l, but the phase retrieval for Hermitian object functions [i.e., f(u,, u2) = f*(- u l , -u2)] has to be carried out by the method based on use of the logarithmic Hilbert transform and the Fourier series expansion as described in Section II,C,2.
B. Simulated Example The numerical performance of the 2-D phase retrieval method described in the previous section is presented here by reconstructing a 2-D real object function. The data processing was carried out with 128 x 128 pixels, but the results in Fig. 8 are illustrated by use of a part of the whole (47 x 47 pixels). Figure 8a shows the original real object function defined within the extent u: + uf I0.8. Figure 8b shows the reconstructed object from three Fourier moduli, ~ F ( x l , x 2 )IF(x, ~ , - ic,x2)l, and IF(xl,x2- ic)l, of the object function using Eqs. (54)-(59), where 62 unknown coefficients each for a j ( C ) and b,(C) are used for the calculation of 1-D phase on each line. In this case, the two moduli, IF(x, - ic,x2)l and IF(x,, x, - ic)l, were evaluated by using exponential filters exp( -2ncu1 ) and exp( - 27ccuz) (with parameter c = 0.04) at the object plane, respectively. Figure 8c shows the object reconstructed in the same way as in Fig. 8b from three noisy moduli, which were produced by adding complex normal random noises n ( x l ,x,) to the functions F(x, ,x2), F(x, - ic,x2), and F(x, ,x2 - ic). The signal-to-noise
134
N . NAKAJIMA
FIGURE8. Reconstruction of a real function by using the two-dimensional phase retrieval algorithm: (a) original object function; (b) reconstructed object function from noise-free Fourier moduli obtained with and without an exponential filter in the object plane; (c) reconstructed object function from noisy Fourier moduli in the same way as in (b).
ratio defined by SNR = C,,,,, IF(x, - ic,X ~ ) ) ~ / C/ n~( ,x , ,x2)I2 , ~ is 161 in the case of Fig. Sc. Errors in Fig. 8b are due to the use of finite sampling points for the phase retrieval evaluation. In Fig. Sc, the reconstructed object is blurred by noise, but the outlines of the original object are seen to be retrieved in the noisy case. C . Experimental Example
In this section, the reconstruction of a phase object from experimental far field intensities (Nakajima, 1990) is demonstrated with the 2-D phase retrieval method described in Section III,A. The optical system used in
135
PHASE RETRIEVAL BY ENTIRE FUNCTIONS Exponential Filter
El .L
CCD Camera
Digital Memory Personal Computer
FIGURE9. Schematic diagram of the phase-retrieval experiment: f is the focal length of lens L , .
performing the experiment is shown in Fig. 9. In this experiment, the phase object was composed of a coverging lens of focal length f, = 202.8 mm and a 1-mm-diameter circular aperture situated at the center of the lens. A He-Ne laser beam of wavelength I = 0.6328 pm was collimated by lenses L , and L2 and used to illuminate the phase object. The strength of the laser light was controlled by a polarizer placed in front of the laser. Lens L , of focal length f = 404.8 mm produced a Fourier transform of the complex amplitude at the plane of the circular aperture. The Fourier intensity data were collected by a charge-coupled device (CCD) TV camera (NEC T1-22A11). The video signal was converted to a 128 x 128 eight-bit digital image by using a digital memory (Mitani Shouji IMM-256V8). Calculating the phase retrieval and reconstructing the complex amplitude at the plane of the circular aperture was carried out by a personal computer. A minicopy film was used as the exponential filter for phase retrieval, in which the exponential intensity distribution displayed on a TV monitor was recorded with a camera. The accuracy of this filter was checked by observing the intensity distribution of the object image with the CCD camera. The amplitude transmittance of the exponential filter across the area used in the experiment falls to 0.74 of its initial value over a 1-mm distance. This filter can be regarded as a continuous tone transparency. To perform the exponential filtering, the film was placed in contact with the plane of the circular aperture, and the Fourier intensity in this situation was observed with the CCD camera. Figures 10 and 11 show the reconstruction of the phase object from experimental Fourier intensities. Each of the figures in Fig. 10 represents collected intensity data consisting of 128 x 128 pixels. The data processing was carried out with the same pixels as in Fig. 10. Figure 10a shows the
136
N. NAKAJIMA
FIGURE10. Fourier intensity data for the object consisting of the converging lens of the focal lengthy, = 202.8 mm and the circular aperture of 1-mm diameter: (a) Fourier intensity of the object; (b), (c) Fourier intensities of the object multiplied by an exponential filter at the object plane in the direction parallel to the x,- and x,-axes, respectively. [continued
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
137
FIGURE10.-continued.
Fourier intensity data of the object. Figures 10b and 1Oc show Fourier intensities of the object multiplied by the exponential filter made of minicopy film at the object plane in the directions parallel to the x1 and x, axes, respectively. The signal power of the Fourier intensity is reduced by the effect of the exponential filter. To increase the signal-to-noise ratio on the data, the laser light used in Figs. 10b and 1Oc was made more intense than that in the unfiltered case of Fig. 10a by using the polarizer. Note that the constant multiplicative difference between the intensities in Fig. 10a and in Fig. 10b or 1Oc is not important for the present phase retrieval, because, as we know from Eqs. (57) and ( 5 8 ) , it merely yields a linear phase factor by which the object is shifted parallel in the object plane. The Fourier phase was calculated from the data in Fig. 10 using the Fourier series expansion. The unknown coefficients (62 each for u j ( C ) and b j ( C )in Eq. (57)) were solved from the observed data on each line. Before calculating the phase, a uniform bias component was subtracted from the Fourier data in Fig. 10, because this was noise produced by the CCD camera. The level of the subtracted bias was the same as observed when no light entered the CCD camera. No other calibration was performed on the data. Figure 11 shows the object reconstructed by taking an inverse Fourier transform of the complex function with the retrieved phase and the modulus obtained via the intensity in Fig. 10a. The computing time of the
FIGURE 11. Reconstructed phase object from the data in Fig. 10: (a) modulus and (b) phase of the reconstructed object; (c) cross-sectional profile of the phase in (b), taken along a line passing through the center. The solid and dashed curves represent the reconstructed phase and the phase calculated from the focal length f₁ of the lens used as the object, respectively.
The computing time of the phase retrieval and the object reconstruction was about 35 min on the personal computer (i-80286 at 10 MHz, FORTRAN). Figures 11a and 11b are the modulus and phase of the reconstructed object, respectively, where only part of the full array is shown (69 × 69 pixels for the modulus and 37 × 37 pixels for the phase).
Owing to noise in the data, the reconstructed phase has values (modulo 2π) even outside the extent of the circular aperture used at the object plane. The unnecessary phase values outside the expected object support were therefore cut off for the illustration in Fig. 11b. In Fig. 11c, a cross section at the center of the phase distribution in Fig. 11b is shown by the solid curve, and the dashed curve shows the phase distribution at the object plane calculated from the focal length f₁ = 202.8 mm of the converging lens used as the phase object. From the reconstructed object in Fig. 11, it is found that the reconstruction of the object phase appears more robust to noise than that of the object modulus. This is because the Fourier moduli used for the phase retrieval are influenced more strongly by a change of the object phase than by a change of the object modulus. The same tendency toward robustness of the reconstructed object phase has already been seen in computer simulations (Nakajima, 1989).
IV. APPLICATION TO RELATED PROBLEMS

A. Hartley Transform
Use of the Hartley transform (Hartley, 1942) in digital and optical image processing has recently been presented (Bracewell, 1983, 1984; Bracewell et al., 1985; Li and Eichmann, 1985). Because the Hartley and the Fourier transforms are related by a simple additive operation, optical analog implementation of the Hartley transform is possible, and several optical systems for performing the transform have been proposed (Bracewell et al., 1985; Li and Eichmann, 1985). Despite the appositeness of the Fourier transform for describing the operation of a lens, the Hartley transform has a convenient feature not found in the Fourier transform. In many imaging applications, only the intensity of the Fourier transform of an object function can be observed, and the phase information of the transform is lost. Even if the object function has only real values, the phase of the Fourier transform of the object is needed to reconstruct the object function, because the Fourier transform of a real function is usually complex. On the other hand, the Hartley transform of a real object function is always real, and only the sign is lost when the Hartley intensity is recorded. The sign ambiguity is a much less serious defect than the absence of phase knowledge when one records the intensity of the Fourier transform of a real object function. Millane (1986) indicated that low-frequency estimates of a real object function can be determined from the moments or from the low-frequency Hartley intensity, and he suggested that these estimates can be improved by using an iterative algorithm to impose support and positivity constraints on
the object. A direct method for reconstructing a real object function from the intensity of its Hartley transform was proposed by Nakajima (1988b). This method takes the form of a closed-form expression obtained by phase retrieval using the properties of entire functions, and it requires no a priori object information, such as nonnegativity or the extent of the object. In this section, the direct method is described for the one-dimensional case.

1. Relation between the Hartley and the Fourier Transforms

The 1-D Hartley transform of a real function f(u) is defined as (Hartley, 1942)
$$H(x) = \int_{-\infty}^{\infty} f(u)\,[\cos(2\pi x u) + \sin(2\pi x u)]\, du. \tag{60}$$
The inverse relation is
$$f(u) = \int_{-\infty}^{\infty} H(x)\,[\cos(2\pi u x) + \sin(2\pi u x)]\, dx. \tag{61}$$
We assume that the Fourier transform of f(u) is defined by the function F(x) in Eq. (1). Then the function H(x) can be related to the function F(x) as
$$H(x) = \tfrac{1}{2}(1 + i)\,F(x) + \tfrac{1}{2}(1 - i)\,F(-x). \tag{62}$$
Furthermore, from Eq. (1), this expression can be written as
$$H(x) = \int_{-\infty}^{\infty} \left[\tfrac{1}{2}(1 + i)f(u) + \tfrac{1}{2}(1 - i)f(-u)\right] \exp(-2\pi i x u)\, du. \tag{63}$$
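Before introducing the effective object function, the relation in Eq. (62) can be checked numerically. The following minimal sketch compares a direct evaluation of the cas-kernel transform of Eq. (60) with the combination of Fourier transforms in Eq. (62); the test object, the grid size, and the use of NumPy's DFT conventions are illustrative assumptions, not details from the text.

```python
import numpy as np

# Minimal numerical check of the Hartley-Fourier relation of Eq. (62) on a
# discrete grid, with the DFT kernel exp(-2*pi*i*k*n/N) standing in for the
# continuous Fourier transform.

N = 128
n = np.arange(N)
f = np.exp(-(n - 40.0) ** 2 / 50.0)            # a real test "object"

F = np.fft.fft(f)                              # F(x), kernel exp(-2*pi*i*x*u)
F_neg = F[(-n) % N]                            # F(-x) on the discrete grid

# Eq. (62): H(x) = (1/2)(1 + i) F(x) + (1/2)(1 - i) F(-x)
H_from_F = 0.5 * (1 + 1j) * F + 0.5 * (1 - 1j) * F_neg

# Direct Hartley transform with the cas kernel of Eq. (60)
arg = 2.0 * np.pi * np.outer(n, n) / N
H_direct = (np.cos(arg) + np.sin(arg)) @ f

print(np.allclose(H_from_F.imag, 0.0, atol=1e-9))   # Hartley transform of a real f is real
print(np.allclose(H_from_F.real, H_direct))          # Eq. (62) holds on the grid
```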
If we introduce an effective object function
$$E(u) = \tfrac{1}{2}(1 + i)f(u) + \tfrac{1}{2}(1 - i)f(-u), \tag{64}$$
we find from Eqs. (63) and (64) that the Hartley transform H(x) can be regarded as the Fourier transform of the effective object function E(u). The Fourier transform of a real function is complex except for the case of Hermitian functions. On the other hand, the Hartley transform of a real function is real. So when f(u) is a real function, the effective function E(u) has the Hermitian property
$$E(u) = E^{*}(-u). \tag{65}$$

2. Reconstruction Method for Real Functions

We first assume that the Hartley transform of a real object function f(u) is given by
$$H(x) = \int_{a}^{b} f(u)\,[\cos(2\pi x u) + \sin(2\pi x u)]\, du, \tag{66}$$
where (a, b) in the integral denotes the finite interval of the object. The Hartley transform H(x) is written as
$$H(x) = |H(x)| \exp[i\phi(x)], \tag{67}$$
where |H(x)| and φ(x) denote the modulus and the phase of H(x), respectively. The quantity observed experimentally is only the modulus |H(x)|. To reconstruct the object function f(u), phase retrieval from the intensity is needed. When the object function f(u) is real, its Hartley transform H(x) is a real function whose phase is either 0 or π. In this case, if the positions and the orders of the zeros of the intensity |H(x)|² are determined by examining the behavior of the Hartley intensity near each zero, then the phase of H(x) can in principle be determined from one lobe to the next. Such a procedure, however, would be error sensitive in practical cases. In this subsection, a more robust closed-form method is presented. Using Eqs. (63) and (64), we can rewrite Eq. (66) as
$$H(x) = \int_{-b}^{b} E(u) \exp(-2\pi i x u)\, du, \tag{68}$$
where (−b, b) denotes the interval of the effective object function E(u), provided that |a| < |b| holds in Eq. (66). This expression shows that the problem of retrieving the phase of the Hartley transform H(x) can be regarded as the problem of phase retrieval from the Fourier transform intensity of the effective object function E(u). Thus we use the mathematical properties described in Section II. The phase in Eq. (67) is rewritten as
$$\phi(x) = \phi_{h}(x) + \phi_{z}(x), \tag{69}$$
where φ_h(x) is the Hilbert phase, which can be calculated directly from the modulus |H(x)| by using Eq. (7) except for constant and linear phase terms, and φ_z(x) is the phase contributed by the zeros in the complex lower half plane. Substitution of Eq. (69) into Eq. (67) gives
$$H(x) = H_{h}(x) \exp[i\phi_{z}(x)], \tag{70}$$
$$H_{h}(x) = |H(x)| \exp[i\phi_{h}(x)], \tag{71}$$
in which H_h(x) is the Hilbert function for |H(x)|. In the case of Fourier phase retrieval for a real object function, the phase φ_z(x), which includes the influence of zeros in the complex lower half plane, cannot be directly determined from a Fourier modulus of the object. In the Hartley phase retrieval for a real object function, however, the phase φ_z(x) can be evaluated from the modulus |H(x)| by using the Hermitian property of the
effective object function in Eq. (65). This is because the zeros of the Hartley transform have a symmetric distribution about the real axis x in the complex plane. The phase retrieval procedure is as follows. We consider the Fourier transform of the effective object function E(u) modulated by the exponential filters exp(−2πcu) and exp(2πcu), where c is a constant. The resulting equations are
$$H(x \pm ic) = \int_{-b}^{b} E(u) \exp(\pm 2\pi c u) \exp(-2\pi i x u)\, du. \tag{72}$$
Using this equation and the Hermitian property of the effective object function E(u) in Eq. (65), we can obtain the relation
$$|H(x - ic)| = |H(x + ic)|. \tag{73}$$
Expanding the real variable x into the complex variable x ± ic, Eq. (70) becomes
$$H(x \pm ic) = H_{h}(x \pm ic) \exp[i\phi_{z}(x \pm ic)]. \tag{74}$$
Substitution of Eq. (74) into Eq. (73) yields
$$|H_{h}(x - ic)| \exp[-\operatorname{Im}\phi_{z}(x - ic)] = |H_{h}(x + ic)| \exp[-\operatorname{Im}\phi_{z}(x + ic)], \tag{75}$$
where Im denotes the imaginary part of the phase function φ_z(x ± ic). This equation can be rewritten as
$$\ln\frac{|H_{h}(x - ic)|}{|H_{h}(x + ic)|} = \operatorname{Im}\phi_{z}(x - ic) - \operatorname{Im}\phi_{z}(x + ic). \tag{76}$$
The function |H_h(x ± ic)| on the left-hand side of this equation can be calculated from the modulus |H(x)| by using Eqs. (7), (21), and (71). Representing φ_z(x) in terms of the same Fourier series basis as in Eq. (22) and substituting the imaginary parts of φ_z(x ± ic) into Eq. (76), we obtain a set of linear equations, Eq. (77), for the unknown series coefficients.
The phase φ_z(x) is determined from the solution of Eq. (77). The phase φ(x) of the function H(x) is then derived by adding the phase φ_z(x) to the Hilbert phase φ_h(x). Thus the object function f(u) is reconstructed by an inverse Hartley transform of the function with the observed modulus |H(x)| and the retrieved phase φ(x). Consequently, we find that a real function is uniquely determined, except for an ambiguity in sign [i.e., +φ(x) or −φ(x)], from only one modulus of its Hartley transform by the present method. This is in contrast to the Fourier phase retrieval for a real function, in which two Fourier moduli must be observed, with and without an exponential
filter. The reconstruction of a 2-D real object function from the modulus of its 2-D Hartley transform can also be carried out by using the present method with the extension algorithm described in Section III,A.
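The Hermitian symmetry that underlies Eqs. (72) and (73) is easy to verify numerically. The sketch below builds the effective object of Eq. (64) for an arbitrary real object, applies the two exponential filters of Eq. (72), and checks that the two filtered moduli coincide as in Eq. (73). The test object, grid, and filter constant are illustrative assumptions, not values from the text.

```python
import numpy as np

# For a real object f(u), the effective object E(u) of Eq. (64) satisfies
# E(u) = E*(-u) [Eq. (65)], and therefore |H(x - ic)| = |H(x + ic)| [Eq. (73)].

M = 100
u = np.linspace(-1.0, 1.0, 2 * M + 1)            # symmetric object-domain grid
du = u[1] - u[0]

def obj(t):
    # a real, non-symmetric test object supported on roughly (0.1, 0.6)
    return np.where((t > 0.1) & (t < 0.6), 1.0 + 0.5 * t, 0.0)

E = 0.5 * (1 + 1j) * obj(u) + 0.5 * (1 - 1j) * obj(-u)   # Eq. (64)
assert np.allclose(E, np.conj(E[::-1]))                   # Eq. (65): E(u) = E*(-u)

c = 0.2
x = np.linspace(-20.0, 20.0, 401)
kernel = np.exp(-2j * np.pi * np.outer(x, u))             # exp(-2*pi*i*x*u)

H_minus = (kernel * np.exp(-2 * np.pi * c * u)) @ E * du  # H(x - ic), Eq. (72)
H_plus  = (kernel * np.exp(+2 * np.pi * c * u)) @ E * du  # H(x + ic), Eq. (72)

print(np.allclose(np.abs(H_minus), np.abs(H_plus)))       # Eq. (73)
```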
B. Stellar Speckle Interferometry

The atmosphere of the earth restricts the resolution of a conventional astronomical image to about 1 arcsec even if a large telescope is used. To overcome this difficulty, speckle interferometry was invented by Labeyrie (1970). This technique, however, gives only an autocorrelation (or, equivalently, a power spectrum) of an object up to the diffraction limit of the telescope, but does not give the object itself. To address the object reconstruction problem, one must use a phase retrieval method. Various methods for reconstructing stellar objects from interferometric data in optical astronomy have been studied (see, for example, Dainty and Fienup, 1987). In this section, it is shown that the phase retrieval method described in Section II,C is applicable to stellar speckle interferometry. In stellar speckle interferometry, we obtain a sequence of instantaneous astronomical images degraded by atmospheric turbulence. The degraded short-exposure image i(u) through a telescope can be written as the convolution
$$i(u) = \int_{-\infty}^{\infty} f(u')\, s(u - u')\, du', \tag{78}$$
where f(u) is the object intensity, and s(u) is the instantaneous point-spread function of the atmosphere, including the telescope transfer function. Also, for simplicity and brevity, one-dimensional notation has been employed; the results are equally valid for two dimensions. In this method, the instantaneous image intensity i(u) is recorded in the usual way, and the ensemble average of the Fourier intensity of the image is calculated as
$$\langle |I(x)|^{2} \rangle = |F(x)|^{2}\, \langle |S(x)|^{2} \rangle, \tag{79}$$
where the Fourier transforms of the functions are represented by their corresponding uppercase letters, ⟨· · ·⟩ denotes the ensemble average, and ⟨|S(x)|²⟩ is a transfer function that can be measured by observing a point source. We also compute the ensemble average of a second Fourier intensity of the image modulated by an exponential function exp(−2πcu). Using the characteristics of an exponential function (Walker, 1982), the filtered image can be written from Eq. (78) as
$$i(u) \exp(-2\pi c u) = \int_{-\infty}^{\infty} f(u') \exp(-2\pi c u')\, s(u - u') \exp[-2\pi c (u - u')]\, du', \tag{80}$$
and hence the ensemble average of the Fourier intensity of the filtered image is given by
$$\langle |I(x - ic)|^{2} \rangle = |F(x - ic)|^{2}\, \langle |S(x - ic)|^{2} \rangle. \tag{81}$$
The ensemble averages ⟨|S(x)|²⟩ and ⟨|S(x − ic)|²⟩ in Eqs. (79) and (81) are calculated from the data of the point-spread function corresponding to a reference single star in stellar speckle interferometry. Thus, dividing Eqs. (79) and (81) by the respective ensemble averages obtained from the reference star, we obtain the Fourier moduli of the unfiltered and filtered objects. Although Walker (1982) used an iterative algorithm for the phase retrieval from these two Fourier moduli, the Fourier phase of the object can be calculated directly from the moduli by using the Fourier series expansion as described in Section II,C. Finally, the object intensity f(u) is reconstructed by an inverse Fourier transform of the complex function with the averaged Fourier modulus and the retrieved phase. The reconstruction of a two-dimensional stellar object can also be performed by combining the present method with the extension algorithm described in Section III,A. It was demonstrated (Ohtsubo et al., 1991) that, by this method, a double-star image was successfully reconstructed from data obtained in an actual observation using speckle interferometry.
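As an informal illustration of the averaging in Eqs. (79) and (81), the following sketch simulates short-exposure speckle frames of a simple one-dimensional "double star" and recovers the object's Fourier modulus by dividing the averaged image spectrum by that of a reference point source. The object, the random-pupil model of the atmosphere, the frame count, and the use of discrete FFTs are illustrative assumptions, not details from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, frames = 256, 400

f = np.zeros(N)
f[100], f[120] = 1.0, 0.6                      # one-dimensional "double star" object

def speckle_psf():
    # crude instantaneous atmospheric PSF: a small pupil with random phase
    pupil = np.zeros(N, complex)
    pupil[:32] = np.exp(1j * rng.normal(0.0, 2.0, 32))
    s = np.abs(np.fft.ifft(pupil)) ** 2
    return s / s.sum()

avg_obj = np.zeros(N)                          # accumulates <|I(x)|^2>
avg_ref = np.zeros(N)                          # accumulates <|S(x)|^2> (reference star)
for _ in range(frames):
    s = speckle_psf()
    i_obj = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(s)))   # Eq. (78)
    avg_obj += np.abs(np.fft.fft(i_obj)) ** 2 / frames
    avg_ref += np.abs(np.fft.fft(s)) ** 2 / frames

low = np.arange(32)                            # frequencies well inside the OTF support
modulus = np.sqrt(avg_obj[low] / avg_ref[low]) # |F(x)| recovered via Eq. (79)
print(np.allclose(modulus, np.abs(np.fft.fft(f))[low]))
# In practice detector noise makes the ensemble averaging essential; the
# filtered measurement of Eq. (81) is obtained in the same way after
# multiplying each frame by exp(-2*pi*c*u).
```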
C. Blind Deconvolution

Convolution appears frequently in many branches of science and engineering. In optics, for example, as a mathematical model of the degrading process in an imaging system, we often use a convolution integral with an object function and a space-invariant point-spread function. When the point-spread function is estimated from another observation, the removal of the degradation from the convolution can be carried out by using one of a number of restoration methods (see, for example, Andrews and Hunt, 1977). If the point-spread function is unknown and only the convolution is available, the deconvolution problem becomes more difficult; this is called the blind-deconvolution problem. This problem is closely related to the phase retrieval problem. Lane and Bates (1987a) have considered possible solutions to the blind-deconvolution problem using the idea of the zero sheet in two-dimensional complex space. Inspired by their work, Ghiglia et al. (1993) have developed a systematic approach and an operational code for performing the deconvolution of multiply convolved two-dimensional complex data sets in the absence of noise. The procedure used in this approach is quite involved, and hence carries a large computational burden.
Ayers and Dainty (1988) have made a significant advance by proposing an iterative blind-deconvolution algorithm, which is analogous in concept to the iterative phase-retrieval algorithms (Gerchberg and Saxton, 1972; Fienup, 1978, 1982). Davey et al. (1989) have extended the algorithm of Ayers and Dainty by incorporating a support constraint and a Wiener-type filter, permitting the deconvolution of a contaminated complex-valued image. McCallum (1990) has described an alternative algorithm for the solution of the blind-deconvolution problem using simulated annealing. Lane (1992) has presented a technique for applying a conjugate gradient algorithm to the problem; the technique is less computationally intensive than the technique based on simulated annealing. Maximum-likelihood estimation techniques have been applied to blind deconvolution in photon-limited situations (Holmes, 1992) and to the multiframe blind deconvolution of turbulence-degraded images (Schulz, 1993). Although the results reported for these algorithms so far are encouraging, their uniqueness and convergence properties are at present uncertain. In this section we consider the blind-deconvolution problem under the restricted conditions that the components of the convolution are a Hermitian and a non-Hermitian function and that the support of the non-Hermitian function is known. It is demonstrated that this restricted problem can be solved by a method consisting of the following two steps (Nakajima, 1991): In the first step, the Fourier phase of the non-Hermitian function is retrieved from the convolution by using the symmetry property of the Hermitian function, based on the analytic theory of entire functions. In the second step, the non-Hermitian function is reconstructed from its support constraint and the retrieved phase by using the phase-only reconstruction algorithm developed by Hayes (1982) and Oppenheim et al. (1982). The characteristic of the combined method is that the uniqueness of its solution is understood from the theory of analytic functions. Although this method solves only a restricted version of the blind-deconvolution problem, it may be applicable to some practical situations, such as the restoration of images blurred by focusing error and/or linear motion, because the point-spread functions of blurring processes of this kind are Hermitian functions and original images may generally be regarded as non-Hermitian.
1. Deconvolution Procedure
The discussion in this section is restricted, for simplicity, to a one-dimensional case. The results are straightforwardly extended to a two-dimensional case by using the algorithm in Section III,A. The convolution g(u) of two
functions, f(u) and h(u), is given by
$$g(u) = \int_{-\infty}^{\infty} f(u')\, h(u - u')\, du'. \tag{82}$$
The blind-deconvolution problem is the recovery of the unknown function f(u) from a given g(u) without prior knowledge of the function h(u). The purpose of this subsection is to describe a method for reconstructing the object function f(u) from the convolution g(u) on the assumption that the functions f(u) and h(u) are non-Hermitian and Hermitian, respectively, and that the support of the object function f(u) is known a priori. The deconvolution method consists of the following two steps: (1) retrieve the Fourier phase of the non-Hermitian function f(u) from the data g(u) by using the mathematical properties of entire functions; and (2) reconstruct the object function f(u) from its Fourier phase and the support constraint by the phase-only reconstruction algorithm. The Fourier-transform representation of Eq. (82) becomes
$$G(x) = F(x)\, H(x), \tag{83}$$
where the Fourier transforms of the functions are represented by their corresponding uppercase letters. The phase of the function H(x) is zero or π because of the Hermitian property of the function h(u). In general, therefore, the phase of G(x) does not equal the phase of F(x), owing to the phase jumps of H(x). In the first step we retrieve the phase of F(x) by using the mathematical properties of the entire functions F(z) and H(z), in which the real variable x is changed to a complex variable z = x + iy. We first calculate two functions from the data g(u). One of these is the Fourier transform of the product of the data g(u) and an exponential function exp(−2πcu),
$$G(x - ic) = \int_{a}^{b} g(u) \exp(-2\pi c u) \exp(-2\pi i x u)\, du, \tag{84}$$
where c is an arbitrary constant, and (a, b) denotes the interval of g(u). The other is the function obtained by exchanging the real variable of the modulus of G(x) for the complex one x − ic in the same way as in Eq. (21),
$$M_{G}(x - ic) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} M_{G}(x') \exp(2\pi i u x')\, dx' \right] \exp(-2\pi c u) \exp(-2\pi i x u)\, du, \tag{85}$$
where the modulus of G(x) is rewritten as M_G(x). From Eqs. (83)–(85), the ratio of the moduli of the two complex functions G(x − ic) and M_G(x − ic) can be written as
$$\frac{|G(x - ic)|}{|M_{G}(x - ic)|} = \frac{|F(x - ic)|\,|H(x - ic)|}{|M_{F}(x - ic)|\,|M_{H}(x - ic)|}, \tag{86}$$
except where the value of the denominator is zero, where M_F(x) and M_H(x) denote the modulus functions |F(x)| and |H(x)|, respectively. Substituting H(x) into Eq. (85) instead of M_G(x) and using the Hermitian property of the function h(u) [i.e., h(u) = h*(−u)], we can prove the relation
$$H(x - ic) = H^{*}(x + ic). \tag{87}$$
When the intensity [M_H(x)]² is expanded in the complex plane by a process of analytic continuation in the same way as Eq. (14), we obtain a relation between the two functions M_H(x − ic) and H(x − ic):
$$[M_{H}(x - ic)]^{2} = H(x - ic)\, H^{*}(x + ic). \tag{88}$$
From Eqs. (87) and (88), we obtain
$$|M_{H}(x - ic)| = |H(x - ic)|. \tag{89}$$
Thus, when, in the case of a Hermitian function, the real variables of its Fourier transform H(x) and of its Fourier modulus M_H(x) are exchanged for a complex one, the moduli of the two resulting functions are equal, as in Eq. (89). Substitution of Eq. (89) into Eq. (86) produces
$$\frac{|G(x - ic)|}{|M_{G}(x - ic)|} = \frac{|F(x - ic)|}{|M_{F}(x - ic)|}. \tag{90}$$
Consequently, it can be seen that the influence of the Fourier transform of the Hermitian function is eliminated from the ratio in Eq. (90). Let F(x) be written as
$$F(x) = M_{F}(x) \exp[i\phi(x)], \tag{91}$$
where M_F(x) and φ(x) are the modulus and the phase of F(x), respectively. The modulus of the function obtained by exchanging x of F(x) for x − ic is given by
$$|F(x - ic)| = |M_{F}(x - ic)| \exp[-\operatorname{Im}\phi(x - ic)]. \tag{92}$$
Substitution of this equation into Eq. (90) gives
$$\frac{|G(x - ic)|}{|M_{G}(x - ic)|} = \exp[-\operatorname{Im}\phi(x - ic)]. \tag{93}$$
From this equation, the phase φ(x) can be determined by the procedure using the Fourier series expansion in Section II,C.
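As an illustration of this first step, the sketch below computes, from simulated convolution data g(u) alone, the two moduli |G(x − ic)| and |M_G(x − ic)| of Eqs. (84) and (85) and forms the logarithm of their ratio, which by Eq. (93) estimates −Im φ(x − ic). Discrete FFTs stand in for the continuous transforms, and the object, the blur, and the filter constant c are illustrative assumptions rather than the values used later in the numerical example.

```python
import numpy as np

# First deconvolution step (Eqs. 84, 85, 93) on a discrete grid.

N = 256
u = np.arange(N)
rng = np.random.default_rng(1)

f = np.zeros(N)
f[20:45] = rng.uniform(0.2, 1.0, 25)            # non-Hermitian object
h = np.zeros(N)
h[:8], h[-7:] = 1.0, 1.0                        # real and even: h(u) = h*(-u)
g = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(h)))   # convolution, Eq. (82)

c = 0.002
G_shift = np.fft.fft(g * np.exp(-2 * np.pi * c * u))       # G(x - ic), Eq. (84)

M_G = np.abs(np.fft.fft(g))                     # M_G(x) = |G(x)|
m_g = np.fft.ifft(M_G)                          # inner (inverse) transform of Eq. (85)
MG_shift = np.fft.fft(m_g * np.exp(-2 * np.pi * c * u))    # M_G(x - ic), Eq. (85)

ok = np.abs(MG_shift) > 1e-6 * np.abs(MG_shift).max()      # skip zeros of the denominator
D = np.log(np.abs(G_shift[ok]) / np.abs(MG_shift[ok]))     # estimate of -Im phi(x - ic), Eq. (93)
print(D.shape, D[:5])
```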
In the second step of the present deconvolution method we reconstruct the object function f(u) from its retrieved phase φ(x) and its known support in the object plane. It has been shown mathematically that if a function is constrained to have finite support and to have no zero-Fourier-phase factors, then it is uniquely defined by its Fourier phase and its support (Hayes et al., 1980; Hayes, 1982). A number of different algorithms have been proposed for reconstructing a function from the phase of its Fourier transform (Hayes, 1982; Oppenheim et al., 1982; Levi and Stark, 1983). For example, there is a closed-form solution that requires solving a set of linear equations (Hayes, 1982). We use here an iterative approach (Oppenheim et al., 1982) that is similar in style to the iterative phase-retrieval algorithms. Since the convergence properties of the iterative algorithms for reconstruction from phase can be understood theoretically, convergence is guaranteed by the theory, in contrast to the iterative phase-retrieval algorithms. This phase-only reconstruction algorithm consists of the following four steps: (1) take the Fourier transform of f_k(u), the current estimate of f(u), yielding F_k(x); (2) exchange the phase ψ_k(x) of F_k(x) for the retrieved phase φ(x), yielding F'_k(x); (3) take the inverse Fourier transform of F'_k(x), yielding the corresponding image f'_k(u); and (4) modify f'_k(u) so that it satisfies the support constraint, forming f_{k+1}(u), a new estimate of the object. For the kth iteration, these steps are
$$F_{k}(x) = |F_{k}(x)| \exp[i\psi_{k}(x)] = \mathrm{F.T.}[f_{k}(u)], \tag{94}$$
$$F'_{k}(x) = |F_{k}(x)| \exp[i\phi(x)], \tag{95}$$
$$f'_{k}(u) = \mathrm{I.F.T.}[F'_{k}(x)], \tag{96}$$
$$f_{k+1}(u) = \begin{cases} f'_{k}(u) & \text{if } u \in \gamma,\ u \neq 0 \\ \alpha & \text{if } u = 0 \\ 0 & \text{otherwise,} \end{cases} \tag{97}$$
where γ is the set of points inside the object support, α is an arbitrary constant, and f_k(u), F_k(x), and ψ_k(x) are estimates of f(u), F(x), and the phase φ(x), respectively. F.T.[· · ·] and I.F.T.[· · ·] stand for the Fourier and inverse Fourier transform operators. The iterative algorithm is started from an inverse Fourier transform of the complex function consisting of the Fourier modulus |G(x)| of the data g(u) and the retrieved phase φ(x), and the algorithm is repeated until the error of the reconstructed object outside the support in the object domain decreases to a sufficiently low level.
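A compact implementation of the iteration in Eqs. (94)–(97) is sketched below. For the demonstration, the "retrieved" phase is simply taken from a known test object; the object, its support, the starting estimate, and the iteration count are illustrative assumptions (the text itself starts the iteration from |G(x)| exp[iφ(x)]).

```python
import numpy as np

# Phase-only reconstruction iteration of Eqs. (94)-(97).

N = 256
rng = np.random.default_rng(2)
support = np.zeros(N, bool)
support[:25] = True                                   # known object support (includes u = 0)

f_true = np.zeros(N)
f_true[support] = rng.uniform(0.2, 1.0, support.sum())
phi = np.angle(np.fft.fft(f_true))                    # stands in for the retrieved phase
alpha = 1.0                                           # arbitrary constant fixing the scale

f_k = np.real(np.fft.ifft(np.exp(1j * phi)))          # simple starting estimate
for _ in range(200):
    F_k = np.fft.fft(f_k)                             # Eq. (94)
    F_prime = np.abs(F_k) * np.exp(1j * phi)          # Eq. (95): impose the retrieved phase
    f_prime = np.real(np.fft.ifft(F_prime))           # Eq. (96)
    f_k = np.where(support, f_prime, 0.0)             # Eq. (97): impose the support
    f_k[0] = alpha                                    # Eq. (97): fix the u = 0 sample

# the iterates approach f_true up to the overall scale set by alpha
scale = alpha / f_true[0]
err = np.linalg.norm(f_k - scale * f_true) / np.linalg.norm(scale * f_true)
print(f"relative error after 200 iterations: {err:.2e}")
```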
2. Numerical Example

The performance of the deconvolution method described in the previous subsection is demonstrated by computer simulations in a one-dimensional case. Figure 12a shows the non-Hermitian object function f(u) that is to be reconstructed. The data processing by computer is carried out with 128 sampling points. Figure 12b shows a Hermitian function h(u) that corresponds to the point-spread function of an incoherent imaging system with a focusing error. The convolution g(u) of the two functions in Figs. 12a and 12b is shown in Fig. 12c. The modulus of the Fourier transform G(x) of g(u) is shown in Fig. 13a. Figures 13b and 13c show the moduli |G(x − ic)| and |M_G(x − ic)|, respectively, which were calculated from the function g(u) by use of an exponential function exp(−2πcu) with parameter c = 0.04 in Eqs. (84) and (85). The Fourier phase φ(x) retrieved from the moduli in Figs. 13b and 13c by using the Fourier series expansion is shown in Fig. 13d. Figure 13e shows the phase of G(x). From a comparison of Fig. 13d with Fig. 13e, it can be seen that the phase shift of π in Fig. 13e is corrected by the first deconvolution procedure.
FIGURE 12. Original functions for blind deconvolution: (a) non-Hermitian object function to be reconstructed; (b) Hermitian function corresponding to a point-spread function of an incoherent imaging system with a focusing error; (c) convolution of the two functions in (a) and (b).
FIGURE 13. Object reconstruction process: (a) Fourier modulus |G(x)| of the convolution in Fig. 12(c); (b) modulus |G(x − ic)| of the Fourier transform of the product of the convolution data in Fig. 12(c) and an exponential function exp(−2πcu) with parameter c = 0.04; (c) modulus |M_G(x − ic)| calculated from (a) by using Eq. (85); (d) retrieved Fourier phase of the object from the moduli in (b) and (c); (e) Fourier phase of the convolution in Fig. 12(c); (f) reconstructed object function from the phase in (d) and the support constraint of the object.
Note that the data near the zeros of |G(x − ic)| or |M_G(x − ic)| have to be discarded in evaluating Eq. (93) for stability of the solution. From the retrieved phase φ(x) and the known object support constraint, the object function is reconstructed by using the iterative algorithm described in Eqs. (94)–(97). Figure 13f shows the reconstructed object after 200 iterations. The portions at both ends of the retrieved phase in Fig. 13d, in which oscillations due to typical phase-evaluation errors appear, must be discarded for the iterative algorithm to converge to the solution. Reconstruction of the object from noisy data is shown in Fig. 14. Figure 14a shows the noisy convolution produced by adding the absolute value of normal random numbers to the convolution g(u) in Fig. 12c.
FIGURE 14. Noisy case of the blind deconvolution in Figs. 12 and 13: (a) noisy convolution produced by adding random noise to the convolution in Fig. 12c; (b) retrieved Fourier phase by the same procedure as in Fig. 13; (c) reconstructed object function from the phase in (b) and the same support constraint as used in Fig. 13f.
The signal-to-noise ratio (SNR) is defined by SNR = Σ_j g_j / Σ_j n_j, where g_j and n_j are the values of the convolution g(u) and of the noise at each sampling point, respectively, summed over the extent of g(u). The SNR in Fig. 14a is 25. The Fourier phase retrieved by the same procedure as in the noiseless case is shown in Fig. 14b. Figure 14c shows the object reconstructed from the retrieved phase and the support constraint after 200 iterations. The two-dimensional blind deconvolution of a Hermitian and a non-Hermitian function can be performed by using the present method with the extension algorithm described in Section III,A. From 2-D computer-simulation results (Nakajima, 1991), it was found that the reconstructed 2-D objects are influenced by the shape of the object support constraint used in the second step of the method, but, if the support is close to the actual one, the outline of the object function is stably reconstructed. If the object support is unknown a priori, it may be effective to use the technique
(Lane and Bates, 1987b) of applying the phase-only reconstruction algorithm with a successively decreasing support-constraint region until a large increase outside the support is observed.
D. Coherent Imaging through Turbulence

Imaging through turbulence is an important problem that has been the subject of extensive studies. Most reconstruction methods have been developed under conditions of incoherent illumination, but a number of recent studies have examined the applicability of these methods to coherent or partially coherent illumination. Two systems of illumination have been considered. In the direct-illumination system, the object is illuminated directly by a plane or spherical wave, for example, and viewed through turbulence (Mavroidis et al., 1990; Solomon and Dainty, 1992). In double-passage imaging, the object is illuminated through turbulence and viewed through either the same or different turbulence (Fante, 1985; Mavroidis and Dainty, 1990; Mavroidis et al., 1991; Solomon et al., 1991). The difficulties with the conventional reconstruction techniques in one-passage coherent imaging have been pointed out (Mavroidis et al., 1990). In practice there is more interest in double-passage coherent imaging through turbulence. It was shown (Mavroidis and Dainty, 1990) that the average intensity spectrum for an object in the double-passage imaging case contains diffraction-limited information on the Fourier modulus of the object. Moreover, it was demonstrated (Mavroidis et al., 1991) that the use of a nonredundant aperture simplifies the retrieval of the object Fourier modulus. In such an imaging system, however, neither the long- nor the short-exposure image intensity carries information on the phase of the object Fourier transform, so that the reconstruction of a complex-valued object is difficult, although a real and nonnegative object can be reconstructed (Solomon et al., 1991) from only its Fourier modulus by using Fienup's algorithm (Fienup, 1978, 1982). The Gerchberg-Saxton algorithm (Gerchberg and Saxton, 1972), for example, is also not applicable, because the object image used in the algorithm is severely blurred by turbulence. There is a method (Nakajima and Saleh, 1994) for the reconstruction of a complex-valued object in one-passage coherent imaging through a random phase screen (representing turbulence). The reconstruction method is based on the phase retrieval method described in Section II. In this section, an extension of the reconstruction method used in one-passage imaging to double-passage imaging (Nakajima and Saleh, 1995) is presented.
FIGURE 15. Geometry of double-passage coherent imaging through a random phase screen and reconstruction by measurement of the intensities of the image Fourier transform with and without an exponential filter.
One can then also treat the reconstruction of a complex-valued object in one-passage imaging as a special case of double-passage imaging. The imaging system is illustrated in Fig. 15. A random phase screen of complex amplitude transmittance H(x) = exp[iθ(x)] is placed in the pupil plane of the illumination and viewing system. The phase θ(x) is a random function that is assumed to be stationary (homogeneous). An object with deterministic complex amplitude reflectance f(u) is situated at a distance R from the random phase screen. The illumination is assumed to be provided by a beam of monochromatic laser light of wavelength λ, which is focused onto the pupil plane by a lens L₁ and a half-silvered mirror and transmitted through the random phase screen. Assuming that the distance R is sufficiently large for the far-field approximation to be satisfied, the complex amplitude of the beam at the object plane is given by
$$q(u) = \int_{-\infty}^{\infty} B(x)\, H(x) \exp\!\left(-\frac{2\pi i u x}{\lambda R}\right) dx, \tag{98}$$
where B(x) is the illumination beam amplitude in the pupil plane. Unimportant constants and a phase curvature term outside the integral have been ignored. Also, for simplicity and brevity, one-dimensional notation has been employed; the results are equally valid for two dimensions. The amplitude of the reflected light is imaged through the random phase screen using a lens L₂ of focal length f₁. Assuming the usual isoplanatic approximation, the complex amplitude g(u_i) in the image plane can be written as a superposition integral [Eq. (99)] of the modulated object f(u)q(u) with a kernel [Eq. (100)] proportional to the Fourier transform of the product P(x)H(x), where P(x) is a pupil function of unit value inside the lens aperture and zero elsewhere. It is evident from Eqs. (98), (99), and (100) that, using appropriate coordinate transformations, g(u_i) is the output of a linear filter with input f(u), modulated by the illumination function q(u), and with a transfer function proportional to P(x)H(x). The randomness of the phase of the transfer function has two effects: it degrades the spatial coherence of the object illumination, and it reduces the resolution of an imaging system based on measurement of the average (long-exposure) image intensity ⟨|g(u_i)|²⟩. If the width of the illuminating beam at the screen is sufficiently smaller than the correlation length of the random screen, so that the coherence area in the object plane is greater than the extent of the object, the imaging system is effectively coherent and is linear in the complex amplitude. Under this quasi-coherent illumination, the system is equivalent to the single-passage coherent imaging system. For this double-passage imaging system, we consider here the reconstruction method (Nakajima and Saleh, 1995) using a technique of phase retrieval based on the measurement of two averaged intensities: the intensity of the Fourier transform of the image field, and the intensity of the Fourier transform of the image after transmission through an exponential filter (a mask with an exponentially decaying transmittance). At the "receiver" side of the imaging system in Fig. 15, measurements of the average Fourier intensities of the unfiltered and filtered image fields can be used to reconstruct the complex object f(u).

1. Measured Intensities
Instead of measuring the average image intensity ⟨|g(u_i)|²⟩, we use here a reconstruction method based on measurement of two average Fourier intensities: the intensity of the Fourier transform of the image and the intensity of the Fourier transform of the product of the image with an exponential function exp(−2πcu_i), where c is a known constant. The measurement is implemented optically by use of two identical lenses L₃ of focal length f₂, as illustrated in Fig. 15. The multiplication is realized by use of a transparency of complex amplitude transmittance exp(−2πcu_i).
Note that the phase curvature term exp(iπu_i²/λf₁) in the image plane must be eliminated before g(u_i) is Fourier transformed. This is accomplished, for example, by use of a lens with transmittance exp(−iπu_i²/λf₁). The complex amplitudes of the Fourier transforms of the unfiltered and filtered images are
$$G(x') = \int_{-\infty}^{\infty} g(u_{i}) \exp\!\left(-\frac{2\pi i u_{i} x'}{\lambda f_{2}}\right) du_{i} \tag{101}$$
and
$$\bar{G}(x') = \int_{-\infty}^{\infty} g(u_{i}) \exp(-2\pi c u_{i}) \exp\!\left(-\frac{2\pi i u_{i} x'}{\lambda f_{2}}\right) du_{i}, \tag{102}$$
respectively. Substituting Eqs. (99) and (100) into Eqs. (101) and (102), and assuming that P(x) ≈ 1, i.e., that the lens aperture is sufficiently larger than the extent of the Fourier transform of the object, we obtain expressions for G(x') and Ḡ(x') in terms of H and the object spectrum [Eqs. (103) and (104)],
where
$$F_{1}(x) = \int_{-\infty}^{\infty} f(u)\, q(u) \exp\!\left(-\frac{2\pi i u x}{\lambda R}\right) du \tag{105}$$
is the Fourier transform of the product f₁(u) = f(u)q(u) of the object and the complex amplitude of the illumination, evaluated at spatial frequency x/λR. The notation can be simplified considerably by scaling the functions F₁, H, G, Ḡ, and θ and the variable x', so that Eqs. (103) and (104) take the form
$$\hat{G}(\hat{x}) = \hat{H}(\hat{x})\, \hat{F}_{1}(\hat{x}), \tag{106}$$
$$\hat{\bar{G}}(\hat{x}) = \hat{H}(\hat{x} - ic)\, \hat{F}_{1}(\hat{x} - ic), \tag{107}$$
where
$$\hat{G}(\hat{x}) = G(\lambda f_{2}\hat{x}), \quad \hat{\bar{G}}(\hat{x}) = \bar{G}(\lambda f_{2}\hat{x}), \quad \hat{F}_{1}(\hat{x}) = F_{1}(-\lambda f_{1}\hat{x}), \quad \hat{H}(\hat{x}) = H(-\lambda f_{1}\hat{x}), \quad \hat{x} = \frac{x'}{\lambda f_{2}}.$$
The average Fourier intensities are now given by
$$I_{1}(\hat{x}) = \langle |\hat{G}(\hat{x})|^{2} \rangle = \langle |\hat{H}(\hat{x})|^{2}\, |\hat{F}_{1}(\hat{x})|^{2} \rangle \tag{108}$$
and
$$I_{2}(\hat{x}) = \langle |\hat{\bar{G}}(\hat{x})|^{2} \rangle = \langle |\hat{H}(\hat{x} - ic)|^{2}\, |\hat{F}_{1}(\hat{x} - ic)|^{2} \rangle. \tag{109}$$
Using Eqs. (98) and (105) and Ĥ(x̂) = exp[iθ̂(x̂)], in which θ̂(x̂) = θ(−λf₁x̂), Eqs. (108) and (109) can be rewritten as bilinear transformations of the object function [Eqs. (110) and (111)],
where
$$\Gamma(u, u') = \langle q(u)\, q^{*}(u') \rangle = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} B(x)\, B^{*}(x')\, \langle \exp\{i[\theta(x) - \theta(x')]\} \rangle \exp\!\left[-\frac{2\pi i (u x - u' x')}{\lambda R}\right] dx\, dx' \tag{112}$$
is the correlation function of the illumination, and ξ(x̂; u, u') [Eq. (113)] is the corresponding kernel for the exponentially filtered measurement, given by a similar double integral
in which Im denotes the imaginary part of a complex function. Equations (110) and (111) indicate that each of the measurements I₁(x̂) and I₂(x̂) is related to the object function f(u) by a bilinear transformation, which is characteristic of a partially coherent imaging system. The transformation kernel depends on the spatial correlation of the random phase screen and on the width of the illumination beam.
Expressions for the expectations in Eqs. (112) and (113) can be derived (Nakajima and Saleh, 1995) by using the following assumptions. The phase of the phase screen is described by a zero-mean Gaussian random process with a Gaussian autocorrelation, where σ² and τ are the variance and correlation length of the phase θ(x); the variance σ² is assumed to be large. We further assume that the illumination beam is Gaussian,
$$B(x) = B_{0} \exp[-(x - x_{0})^{2}/W^{2}], \tag{116}$$
where B₀ and x₀ indicate the central value and the central position, respectively, and W is the beam width. The resultant expressions for the expectations in Eqs. (112) and (113) are then given by Eqs. (117) and (118),
where σ_s is the standard deviation of the derivative of θ(x). Substituting Eqs. (116) and (117) into Eq. (112), we obtain
$$\Gamma(u, u') = I_{i}^{1/2}(u)\, I_{i}^{1/2}(u')\, \gamma(u' - u), \tag{119}$$
where I_i(u) [Eq. (120)] is the mean intensity of the illumination beam in the object plane; W_i [Eq. (121)], which approaches (λR/πW) for large r, is its width; γ(u) [Eq. (122)] is its degree of spatial coherence; W_ci [Eq. (123)], equal to rW_i, is the spatial coherence length of the illumination in the object plane; and r = W_c/W [Eq. (124)],
in which W_c = τ/σ is the correlation length of the random phase screen. Similarly, the kernel for the filtered image intensity is obtained by substituting Eqs. (116) and (118) into Eq. (113), giving
$$\xi(\hat{x}; u, u') = \exp[2(\sigma_{s}\lambda f_{1} c)^{2}]\, I_{i}^{1/2}(u + d)\, I_{i}^{1/2}(u' + d)\, \gamma(u' - u), \tag{125}$$
where the displacement d [Eq. (126)] is a constant determined by the statistics of the random phase screen, the geometry of the illuminating beam, and the filter constant c.
Using Eqs. (119) and (125), we can rewrite Eqs. (110) and (111) as Fourier transforms of the functions S₁(l) and S₂(l) [Eqs. (127) and (128)], where
$$S_{1}(l) = \gamma(l) \int_{-\infty}^{\infty} f(u)\, I_{i}^{1/2}(u)\, f^{*}(u + l)\, I_{i}^{1/2}(u + l)\, du, \tag{129}$$
$$S_{2}(l) = \exp[2(\sigma_{s}\lambda f_{1} c)^{2}]\, \gamma(l) \int_{-\infty}^{\infty} f(u)\, I_{i}^{1/2}(u + d)\, f^{*}(u + l)\, I_{i}^{1/2}(u + l + d)\, du, \tag{130}$$
and l = u' − u. It can be shown (Born and Wolf, 1980) that Eq. (127) corresponds to the Fourier intensity of the object function f(u) illuminated with partially coherent light characterized by the mean modulus I_i^{1/2}(u) and the degree of spatial coherence γ(l). The function γ(l) depends on both the width W of the illuminating beam and the correlation length W_c = τ/σ of the random phase screen [see Eq. (117)]. The height and width of γ(l) are also dependent on W and W_c and on their ratio r = W_c/W.
FIGURE 16. Dependence of the width W_i of the illumination beam in the object plane and its spatial coherence length W_ci on the ratio r = W_c/W of the random phase screen correlation length W_c and the original width of the illumination beam W.
The modulus of the object function is modulated by the modulus function I_i^{1/2}(u) via the mean intensity of the illuminating beam in Eq. (120). Figure 16 illustrates the dependence of the illumination beam width W_i (the width of I_i(u)) and its spatial coherence length W_ci (the width of γ(l)) on the ratio r = W_c/W. For large r, i.e., when the beam at the phase screen intercepts a small fraction of a coherence area, the light is effectively coherent, its width approaches the limit (λR/πW), and its coherence length W_ci → ∞ as r increases. When r decreases, the illumination loses coherence, the beam width increases, and the coherence length W_ci decreases. On the other hand, it can be seen that the intensity in Eq. (130) is proportional to the Fourier intensity of the filtered object function f(u) under illumination with the degree of spatial coherence γ(l) and the mean intensity I_i(u + d), which is the illumination intensity displaced by a distance d. When the width of the mean intensity I_i(u) is comparable with the extent of the object f(u), the displacement d may have a significant effect on the distribution of the reflected light. This causes the average intensity I₂(x̂) to change in the neighborhood of the position x̂ = −x₀/(λf₁). In our computer simulation, a dip appeared in I₂(x̂) when the illuminating beam width W was comparable with the correlation length W_c of the random phase screen. Such changes of the average intensity influence the phase retrieval quality. There are two techniques that reduce these deleterious effects. One is the use of an illuminating beam of small width, so that the width of the mean intensity I_i(u) is sufficiently wide for the
displacement d to have no effect. The other is shifting the position of the illuminating beam in the pupil plane to counteract the displacement effect. The computer simulations in Section IV,D,3 were performed by use of the latter technique. In the following discussion, the effect of d on the phase retrieval is neglected.
2. Reconstruction

We now consider the estimation of the complex object function f(u) from the measured intensities I₁(x̂) and I₂(x̂), assuming that the width W of the illuminating beam and the correlation length W_c of the random screen are known constants (i.e., the degree of spatial coherence γ(l) and the mean modulus I_i^{1/2}(u) can be estimated). From the measurements I₁(x̂) and I₂(x̂), the functions S₁(l) and S₂(l) can be computed by inverting Eqs. (127) and (128). Let I₁'(x̂) and I₂'(x̂) be the Fourier transforms of S₁(l)/γ(l) and S₂(l)/γ(l), respectively. Using Eqs. (127)–(130), we then obtain
$$I_{1}'(\hat{x}) = |\hat{F}_{e}(\hat{x})|^{2}, \tag{131}$$
$$I_{2}'(\hat{x}) = \exp[2(\sigma_{s}\lambda f_{1} c)^{2}]\, |\hat{F}_{e}(\hat{x} - ic)|^{2}, \tag{132}$$
where
$$F_{e}(x) = \int_{-\infty}^{\infty} f(u)\, I_{i}^{1/2}(u) \exp\!\left(-\frac{2\pi i u x}{\lambda R}\right) du \tag{133}$$
and F̂_e(x̂) = F_e(−λf₁x̂). If the degree of spatial coherence γ(l) is nearly unity within the extent of the autocorrelation of the object, Eqs. (131) and (132) are approximately equal to the observed intensities I₁(x̂) and I₂(x̂), respectively,
$$I_{1}'(\hat{x}) \simeq I_{1}(\hat{x}), \tag{134}$$
$$I_{2}'(\hat{x}) \simeq I_{2}(\hat{x}). \tag{135}$$
Let M(x̂) and φ(x̂) be the modulus and phase of F̂_e(x̂), i.e.,
$$\hat{F}_{e}(\hat{x}) = M(\hat{x}) \exp[i\phi(\hat{x})]. \tag{136}$$
Using Eq. (136) in Eqs. (131) and (132), we obtain
$$I_{1}'(\hat{x}) = M^{2}(\hat{x}), \tag{137}$$
$$I_{2}'(\hat{x}) = \exp[2(\sigma_{s}\lambda f_{1} c)^{2}]\, |M(\hat{x} - ic)|^{2} \exp[-2\operatorname{Im}\phi(\hat{x} - ic)]. \tag{138}$$
It follows that Eq. (138) can be rewritten as
$$\tfrac{1}{2}\ln\frac{I_{2}'(\hat{x})}{|M(\hat{x} - ic)|^{2}} = -\operatorname{Im}\phi(\hat{x} - ic) + (\sigma_{s}\lambda f_{1} c)^{2}. \tag{139}$$
Using the known function γ(l), the intensity I₂'(x̂) is calculated from I₂(x̂). Also, |M(x̂ − ic)|² is related to M²(x̂), which is evaluated from the average intensity I₁'(x̂), which in turn is computed from I₁(x̂), with the relationship of Eq. (21) in Section II,C. Then M(x̂ − ic) is the Fourier transform of the product of the inverse Fourier transform of M(x̂) and the exponential function exp(−2πcu). One remaining problem is that the term (σ_sλf₁c)², which appears in Eq. (139), is unknown. Since this term is constant, it is convenient to write it in the form (σ_sλf₁c)² = 2πsc, where s is another unknown constant. Then the right-hand side of Eq. (139) becomes
$$-\operatorname{Im}\phi(\hat{x} - ic) + 2\pi s c = -\operatorname{Im}\phi(\hat{x} - ic) - 2\pi s \operatorname{Im}(\hat{x} - ic) = -\operatorname{Im}[\phi(\hat{x} - ic) + 2\pi s(\hat{x} - ic)] = -\operatorname{Im}\psi(\hat{x} - ic), \tag{140}$$
where ψ(x̂) = φ(x̂) + 2πsx̂. Thus the term (σ_sλf₁c)² corresponds to a linear phase factor 2πsx̂ added to the phase function φ(x̂). The addition of a phase shift 2πsx to the Fourier transform of the object function f(u) is equivalent to a shift of the object, i.e., to a displaced object f(u + s). The shift is determined by a statistical average of a function of the random phase θ(x̂) and the constant c. In many applications, however, the exact registration of the object is unimportant. This effect will be ignored in the remainder of this section. We write Eq. (139) in the form
$$D(\hat{x}) = -\operatorname{Im}\psi(\hat{x} - ic), \tag{141}$$
where D(x̂) = ½ ln[I₂'(x̂)/|M(x̂ − ic)|²] is a known function. Equation (141) can be solved by the same procedure as in Section II,C. Finally, the modulated object function f(u)I_i^{1/2}(u) in Eq. (133) is reconstructed by an inverse Fourier transform of the function consisting of the modulus M(x̂), obtained via the average intensity in Eq. (137), and the retrieved phase ψ(x̂). The object function f(u) can be obtained by compensating for the Gaussian envelope of I_i^{1/2}(u), which can be evaluated from the known values of the width W of the illuminating beam and the correlation length W_c of the random screen. As described in Section III,A, this reconstruction method can be readily extended to the two-dimensional case by sectioning the Fourier intensity
into a set of one-dimensional parallel slices, and using a third measurement of the Fourier intensity with an exponential filter decaying in the orthogonal direction (obtained by a 90° rotation of the mask).

3. Simulated Example

We have tested the reconstruction system presented in the previous subsection by computer simulation of an example of a two-dimensional complex-valued object illuminated and imaged through a random phase screen. The modulus and phase of the original object function f(u), shown in Figs. 17a and 17b, are sampled at 64 × 64 points. The modulus is unity within a square of size 7 × 7 pixels and zero elsewhere, and the phase takes three values: π within an inner square of size 3 × 3 pixels, and 3π/4 and π/2 within a middle and an outer region, respectively. Figure 17c shows the squared modulus (intensity) of the object Fourier transform. For simplicity, we set the scale of the x axis equal to that of the x' axis in Fig. 15. The random phase screen used in this simulation is an array of 64 × 64 statistically identical random numbers with zero mean, standard deviation σ = 1.08π, and a circularly symmetric Gaussian correlation function of 1/e-width τ = 14.1 pixels. Since the phase standard deviation σ = 1.08π is large in this case, the autocorrelation function of the screen may be approximated by the Gaussian function in Eq. (117), and the spatial coherence length W_c = τ/σ of the screen is about 4.2 pixels. To generate this random function, we begin with an array of identical and statistically independent random numbers and convolve it with a two-dimensional Gaussian function. Figure 17d is a sample of the image intensity |g(u_i)|² in double-passage imaging through one realization of the random phase screen. Here, a two-dimensional Gaussian beam of width W = 0.2 pixel was used as the illuminating light, so that the ratio r = W_c/W = 21, i.e., the illumination beam width is about 21 times smaller than the correlation length of the screen. The illumination in the object plane is a Gaussian beam of width W_i = 102 pixels and coherence length W_ci = 21 × 102 = 2142 pixels, thereby extending well beyond the object size of 7 pixels. In the present simulations, the central position of the illuminating beam was fixed at the coordinates (0.5, 0.5) (where 0.5 corresponds to 5 pixels) to reduce the effect of the dip that appears in the average Fourier intensity of the filtered object image, as described in Section IV,D,1. The object is reconstructed by use of an exponential filter exp(−2πcu_i) in the image plane with parameter c = 0.04 (corresponding to a 1/e-width of about 25.5 pixels, because the unit length of a pixel in the image plane is set to 10/64). Two tests of object reconstruction are conducted. In these tests, the only source of noise is assumed to be the fluctuations of the phase screen.
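The screen-generation step mentioned above (independent random numbers convolved with a two-dimensional Gaussian) can be sketched with a short routine. The grid size, the FFT-based convolution, and the exact correspondence between the smoothing width and the screen correlation length are illustrative assumptions.

```python
import numpy as np

# Gaussian-correlated random phase screen: smooth white noise with a
# two-dimensional Gaussian filter, then rescale to the desired standard
# deviation of the phase.

def gaussian_phase_screen(n=64, sigma=1.08 * np.pi, tau=14.1, seed=0):
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((n, n))               # independent random numbers

    # circularly symmetric Gaussian low-pass filter (frequency domain),
    # chosen so the resulting autocorrelation 1/e-width is roughly tau
    k = np.fft.fftfreq(n)                             # cycles per pixel
    kx, ky = np.meshgrid(k, k, indexing="ij")
    smoother = np.exp(-(np.pi * tau) ** 2 * (kx ** 2 + ky ** 2) / 2.0)

    theta = np.real(np.fft.ifft2(np.fft.fft2(white) * smoother))
    theta *= sigma / theta.std()                      # set the phase standard deviation
    return theta

theta = gaussian_phase_screen()
screen = np.exp(1j * theta)                           # H(x) = exp[i*theta(x)]
print(round(theta.std() / np.pi, 2))                  # 1.08 by construction
```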
FIGURE 17. Original object function used in the computer simulation: (a) modulus; (b) phase; (c) intensity of the Fourier transform; and (d) one sample of the image taken through a random phase screen with standard deviation σ = 1.08π and correlation length τ = 14.1 pixels under the illumination of a Gaussian beam of width W = 0.2 pixel.
The Fourier intensity data are averaged over N = 100 realizations of the random phase screen. The modulus and phase of the object reconstructed in the first test, with an illuminating beam of width W = 0.2 pixel, are shown in Figs. 18a and 18c, respectively, and cross-sectional profiles of the original and reconstructed modulus and phase along horizontal lines passing through the center are compared in Figs. 18b and 18d. In the second test, we used an illumination beam of width W = 2 pixels; the ratio r in this case is about 2.1. Since the coherence length W_ci in this test is about 25.8 pixels,
FIGURE 18. Reconstruction of the object shown in Figs. 17a and b, when the only source of noise is the fluctuations of the phase screen. Fourier intensities are averaged over N = 100 realizations of the phase screen, and the illumination is a Gaussian beam of width W = 0.2 pixel. (a) Modulus and (c) phase of the reconstructed object; (b) and (d) are cross-sectional profiles of the functions in (a) and (c), respectively, taken along horizontal lines passing through the center. The dashed and solid curves represent the original and reconstructed objects, respectively.
the Fourier intensity of the object is observed under partially coherent illumination. We therefore need to use the compensation method described in Section IV,D,2. The effect of the degree of spatial coherence γ(l) was eliminated from the average Fourier intensities by dividing the inverse Fourier transforms of these intensities by γ(l) within the extent of the autocorrelation function of the object. In addition, we compensated the reconstructed object function for the mean modulus I_i^{1/2}(u) of the illumination. Figures 19a and 19c show the modulus and phase of the reconstructed object for W = 2. Profiles of the reconstructed and original object modulus and phase are compared in Figs. 19b and 19d.
FIGURE 19. Same as in Fig. 18 but for illumination using a Gaussian beam of width W = 2 pixels and compensating for the effects of the degree of spatial coherence γ(l) and the mean modulus I_i^{1/2}(u) of the illumination.
We find that even under partially coherent illumination the edge sharpness of the reconstructed modulus is recovered. A measure of the quality of reconstruction is the normalized root-mean-squared (rms) error E_R [Eq. (142)], computed from the squared difference between the reconstructed and original object functions, in which s and α are an admissible constant displacement and phase shift that are inherent in this reconstruction method but are irrelevant to the quality of reconstruction. The rms errors for the reconstructed objects in Figs. 18 and 19 are 0.3047 and 0.2986, respectively.
FIGURE 20. Dependence of the rms reconstruction error E_R on the standard deviation σ of the random phase for two values of the correlation length τ = 4.2 and 14.1 pixels and illumination beams of width W = 0.2 and 2 pixels; 100 frames are observed.
We have also calculated the rms error for reconstructions using illuminating beams of width W = 0.2 and 2.0 pixels for various statistical realizations of the random phase screen, provided that the approximate analysis of Eqs. (134) and (135) (assuming that the illumination is coherent) was used. Figure 20 shows the dependence of the error on the standard deviation σ of the random phase for two correlation lengths τ = 4.2 and 14.1 pixels. The error increases with increasing σ and with decreasing τ. This is a result of the increase of the standard deviation of the phase derivative σ_s with increasing σ and its decrease with increasing τ (since a large τ indicates a smoother function). As in one-passage imaging (Nakajima and Saleh, 1994), the signal-to-noise ratio decreases as σ_s increases. As shown in Fig. 16, the coherence length W_ci in the object plane decreases with decreasing ratio r = W_c/W, reducing the reconstruction quality. This is why the error for W = 2 pixels is larger than that for W = 0.2 pixels, and why the difference between the two error curves for τ = 4.2 pixels is wider than that for τ = 14.1 pixels. We have also investigated the reconstruction error for various statistical realizations of the random phase screen when averaging over a finite, instead of infinite, exposure time. The general behavior is similar to that of one-passage coherent imaging (Nakajima and Saleh, 1994). The reconstruction is robust and the problem is not ill-posed.
V. CONCLUSIONS
Up to now, various studies of the phase retrieval problem in optics have been performed. In particular, studies of phase retrieval using iterative algorithms have been actively conducted in recent years. One reason for this seems to be that the iterative algorithms can easily handle the 2-D phase retrieval problem. Although the relationship between the Fourier phase and modulus of a 2-D object is almost always unique, the iterative algorithms still suffer from stagnation in a local minimum different from the true solution, especially in the noisy case. On the other hand, the analytic approach using the properties of entire functions ensures the uniqueness of the solution, but it is not easy to carry out 2-D phase retrieval from a single Fourier modulus. In that case, it is necessary to track the zero sheets of the Fourier modulus analytically extended into 2-D complex space, or to locate the zero points in the complex planes of 1-D strips of the Fourier modulus. Since the relationship between the Fourier modulus and the zero sheets or zero points is nonlinear, tracking zero sheets or locating zero points tends to be computationally intensive and sensitive to noise. As a result, this kind of approach is impractical at present. The phase retrieval method described in this chapter is computationally efficient and comparatively robust to noise, because it retrieves the phase linearly from the Fourier intensities of the exponentially filtered and unfiltered objects, without zero location. In applying the present method to the 2-D case, however, two observations of the modulus in the Fourier transform plane, in addition to the ordinary Fourier modulus of the object, are needed, with the object multiplied by an exponential filter in two mutually orthogonal directions. With these three Fourier modulus data sets, the present method has sufficient information about the object, and hence requires no a priori object information, such as the nonnegativity or the extent of the object, which is used in the iterative algorithms for phase retrieval. However, an optimum inclination of the exponential filter for the phase evaluation from the Fourier data has not yet been determined theoretically. Thus, we have to use a filter with a suitable inclination obtained empirically. The phase retrieval problem arises in various fields of science and engineering. Some applications were described in Section IV. There are other important fields that are not mentioned in this review: for example, X-ray crystallography and femtosecond-pulse measurement. In X-ray crystallography, phase retrieval from the diffraction data of a crystalline specimen is the central problem. Unfortunately, the phase retrieval methods
developed in optics cannot be directly applied to this problem, because the crystalline specimen does not have compact support and has a periodic structure (so that the continuous Fourier intensity cannot be measured). In such a problem, additional a priori information seems necessary to constrain the solution (Millane, 1990, 1993; Perez-Ilzarbe, 1992; Harrison, 1993). The phase retrieval problem in femtosecond-pulse measurement arises not in the space domain but in the time domain. It is now possible to create laser pulses only a few femtoseconds in length, and such pulses have found a wide range of applications in which one needs a technique for measuring the time-dependent intensity and phase of an individual pulse. An iterative algorithm modified by use of a priori information about the measurement system has been applied to that problem (Trebino and Kane, 1993), although the algorithm can suffer from stagnation in some cases. In these various applications, however, there are few examples of applying the iterative or analytic phase-retrieval methods to experimental data obtained with actual measurement systems. In order to put those methods to practical use as measurement systems, one needs to investigate the reconstruction precision on more experimental data and thereby to improve the methods.

ACKNOWLEDGMENTS

The author is very grateful to Professor T. Asakura, Hokkaido University, for his valuable comments and encouragement. The author also wishes to express his gratitude to Professor B. E. A. Saleh, Boston University, for his helpful suggestions.

REFERENCES

Andrews, H. C., and Hunt, B. R. (1977). "Digital Image Restoration." Prentice-Hall, Englewood Cliffs, New Jersey.
Ayers, G. R., and Dainty, J. C. (1988). Iterative blind deconvolution method and its applications. Opt. Lett. 13, 547-549.
Bates, R. H. T., and Mnyama, D. (1986). The status of practical Fourier phase retrieval. In "Advances in Electronics and Electron Physics" (P. W. Hawkes, Ed.), Vol. 67, pp. 1-64. Academic Press, New York.
Boas, R. P. (1954). "Entire Functions." Academic Press, New York.
Born, M., and Wolf, E. (1980). "Principles of Optics" (Sixth Ed.). Pergamon, New York.
Bracewell, R. N. (1983). Discrete Hartley transform. J. Opt. Soc. Am. 73, 1832-1835.
Bracewell, R. N. (1984). The fast Hartley transform. Proc. IEEE 72, 1010-1018.
Bracewell, R. N., Bartelt, H., Lohmann, A. W., and Streibl, N. (1985). Optical synthesis of the Hartley transform. Appl. Opt. 24, 1401-1402.
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
169
Bruck, Y. M., and Sodin, L. G. (1979). On the ambiguity of the image reconstruction problem. Opt. Commun. 30, 304-308. Bryan, R. K., and Skilling, J. (1986). Maximum entropy image reconstruction from phaseless Fourier data. Opt. Acta 33, 287-299. Burge, R. E., Fiddy, M. A., Greenaway, A. H., and Ross, G . (1976). The phase problem. Proc. R. SOC. London Ser. A 350, 191-212. Dainty, J. C., and Fienup, J . R. (1987). Phase retrieval and image reconstruction for astronomy. In “Image Recovery: Theory and Application” (H. Stark, Ed.), pp. 231-275. Academic Press, New York. Davey, B. L. K., Lane, R. G . , and Bates, R. H. T. (1989). Blind deconvolution of noisy complex-valued images. Opt. Commun. 69, 353-356. Deighton, H. V., Scivier, M. S., and Fiddy, M. A. (1985). Solution of the two-dimensional phase-retrieval problem. Opt. Lett. 10, 250-25 1. Fante, R. L. (1985). Imaging of an object behind a random phase screen using light of arbitrary coherence. J . Opt. SOC. Am. A 2, 2318-2329. Ferwerda, H. A. (1978). The phase reconstruction problem for wave amplitudes and coherence functions. In “Inverse Source Problems in Optics” (H. P . Bakes, Ed.), pp. 13-39. Springer-Verlag, Berlin. Fiddy, M. A. (1987). The role of analyticity in image recovery. In “Image Recovery: Theory and Application” (H. Stark, Ed.), pp. 499-529. Academic Press, New York. Fiddy, M. A., and Ross. G. (1979). Analytic Fourier optics: the encoding of information by complex zeros. Opt. Acta 26, 1139-1 146. Fienup, J. R. (1978). Reconstruction of an object from the modulus of its Fourier transform. Opt. Lett. 3, 21-29. Fienup, J . R. (1982). Phase-retrieval algorithms: a comparison. Appl. Opt. 21, 2758-2769. Fienup, J . R. (1991). Phase-retrieval imaging problems. In “International Trends in Optics” (J. W. Goodman, Ed.), pp. 407-422. Academic Press, New York. Gerchberg, R. W., and Saxton, W. 0. (1972). A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik 35, 237-246. Ghiglia, D. C., Romero, L. A., and Mastin, G. A. (1993). Systematic approach to twodimensional blind deconvolution by zero-sheet separation. J . Opt. SOC. Am. A 10, 1024- 1036. Harrison, R. W. (1993). Phase problem in crystallography. J. Opt. SOC. A m . A 10, 1046-1055. Hartley, R. V. L. (1942). A more symmetrical Fourier analysis applied to transmission problems. R o c . IRE 30, 144-150. Hayes, M. H. (1982). The reconstruction of a multidimensional sequence from the phase or magnitude of its Fourier transform. IEEE Trans. ACOUSI. Speech Signal Process. ASSP-30, 140-154. Hayes, M. H. (1987). The unique reconstruction of multidimensional sequences from Fourier transform magnitude or phase. In “Image Recovery: Theory and Application” (H. Stark, Ed.), pp. 195-230. Academic Press, New York. Hayes, M. H., Lim, J. S., and Oppenheim, A. V. (1980). Signal reconstruction from the phase or magnitude of its Fourier transform. IEEE Trans. Acoust. Speech Signal Process. ASSP-28, 670-680. Hoenders, B. J. (1975). On the solution of the phase-retrieval problem. J. Math. Phys. 16, 1719- 1725. Holmes, T. J . (1992). Blind deconvolution of quantum-limited incoherent imagery: maximumlikelihood approach. J. Opt. SOC.A m . A 9 , 1052-1061. Hurt, N. E. (1989). “Phase Retrieval and Zero Crossings.” Kluwer, Dordrecht. Lane, R. G. (1991). Phase retrieval using conjugate gradients. J. Mod. Opt. 38, 1797-1813.
170
N. NAKAJIMA
Lane, R. G. (1992). Blind deconvolution of speckle images. J. Opt. SOC.Am. A 9, 1508-1514. Lane, R. G., and Bates, R. H. T. (1987a). Automatic multidimensional deconvolution. J. Opt. SOC.Am. A 4, 180-188. Lane, R. G., and Bates, R. H . T. (1987b). Relevance for blind deconvolution of recovering Fourier magnitude from phase. Opt. Commun.63, 11-14. Lane, R. G., Fright, W. R., and Bates, R. H. T. (1987). Direct phase retrieval. IEEE Trans. Acoust. Speech Signal Process. ASSP-35, 520-526. Labeyrie, A. (1970). Attainment of diffraction-limited resolution in large telescopes by Fourier analyzing speckle patterns in star images. Astron. and Aslrophys. 6, 85-87. Levi, A., and Stark, H. (1983). Signal reconstruction from phase by projections onto convex sets. J. Opt. SOC.Am. 73, 810-822. Levi, A., and Stark, H. (1984). Image restoration by the method of generalized projections with application to restoration from magnitude. J. Opt. SOC. Am. A 1, 932-943. Levi, A., and Stark, H. (1987). Restoration from phase and magnitude by generalized projections. In “Image Recovery: Theory and Applications” (H. Stark, Ed.), pp. 277-320. Academic Press, New York. Li, Y., and Eichmann, G. (1985). Coherent optical generation of Hartley transform of real images. Opt. Commun.56, 150-154. Mavroidis, T., and Dainty, J. C. (1990). Imaging after double passage through a random screen. Opt. Lett. 15, 857-859. Mavroidis, T., Dainty, J . C., and Northcott, M. J. (1990). Imaging of coherently illuminated objects through turbulence: plane-wave illumination. J. Opt. SOC.Am. A 7, 348-355. Mavroidis, T., Solomon, C. J., and Dainty, J. C. (1991). Imaging a coherently illuminated object after double passage through a random screen. J. Opt. SOC. Am. A 8, 1003-1013. McCallum, B. C. (1990). Blind deconvolution by simulated annealing. Opt. Commun. 75, 101-105. Millane, R. P. (1986). Image reconstruction from the Hartley transform intensity. Opt. Commun. 60, 269-274. Millane, R. P. (1990). Phase retrieval in crystallography and optics. J. Opt. SOC.Am. A 7, 394-41 1. Millane, R. P. (1993). Phase problems for periodic images: effects of support and symmetry. J. Opt. SOC.Am. A 10, 1037-1045. Misell, D. L. (1973). An examination of an iterative method for the solution of the phase problem in optics and electron optics. J . Phys. D 6, 2200-2225. Nakajima, N. (1987). Phase retrieval from two intensity measurements using the Fourier-series expansion. J. Opt. SOC. A m . A 4, 154-158. Nakajima, N. (1988a). Phase retrieval using the logarithmic Hilbert transform and the Fourierseries expansion. J. Opt. SOC.Am. A 5 , 257-262. Nakajima, N. (1988b). Reconstruction of a real function from its Hartley-transform intensity. J. Opt. SOC. Am. A 5, 858-863. Nakajima, N. (1989). Two-dimensional phase retrieval by exponential filtering. Appl. Opt. 28, 1489-1493. Nakajima, N. (1990). Reconstruction of phase objects from experimental far field intensities by exponential filtering. Appl. Opt. 29, 3369-3374. Nakajima, N. (1991). Blind deconvolution of a Hermitian and a non-Hermitian function. J. Opt. SOC. Am. A 8, 808-813. Nakajima, N. (1992). Phase retrieval using a Lorentzian filter. Jpn. J . Appl. Phys. 31, 3348-3353. Nakajima, N., and Asakura, T. (1982). Study of zero location by means of an exponential filter in the phase retrieval problem. Optik 60, 289-305.
PHASE RETRIEVAL BY ENTIRE FUNCTIONS
171
Nakajima, N., and Asakura, T. (1985). A new approach to two-dimensional phase retrieval. Opt. Acta 32, 647-658. Nakajima, N., and Asakura, T. (1986). Two-dimensional phase retrieval using the logarithmic Hilbert transform and the estimation technique of zero information. J. Phys. D 19, 319-331. Nakajima, N., and Saleh, B. E. A. (1994). Reconstruction of a complex-valued object in coherent imaging through a random phase screen. Appl. Opt. 33, 821-828. Nakajima, N., and Saleh, B. E. A. (1995). Reconstruction of a complex-valued object in double passage coherent imaging through a random phase screen. Appl. Opt. 34, (in press). Nieto-Versperinas, M. (1980). Dispersion relations in two dimensions: application to the phase problem. Optik 56, 377-384. Nieto-Versperinas, M., and Mendez, J. A. (1986). Phase retrieval by Monte Carlo methods. Opt. Commun. 59, 249-254. Ohtsubo, J., Takahasi, Y., and Nakajima, N. (1991). Reconstruction of stellar images by phase retrieval based on the Fourier decomposition method. Opt. Eng. 30, 1332-1336. Oppenheim, A. V., Hayes, M. H., and Lim, J. S. (1982). Iterative procedures for signal reconstruction from Fourier transform phase. Opt. Eng. 21, 122-127. Paley, R. E. A. C., and Wiener, N. (1934). “Fourier Transforms in the Complex Domain.” American Mathematical Society, Ann Arbor, Michigan. Perez-Ilzarbe, M. J. (1992). Phase retrieval from the power spectrum of a periodic object. J . Opt. SOC.Am. A 9, 2138-2148. Ross, G., Fiddy, M. A., and Nieto-Vesperinas, M. (1980). The inverse scattering problem in structural determinations. In “Inverse Scattering Problems in Optics” (H. P. Bakes, Ed.), pp. 15-71. Springer-Verlag, Berlin. Sanz, J. L. C., and Huang, T. S. (1983). Unique reconstruction of a band-limited multidimensional signal from its phase or magnitude. J. Opt. SOC.Am. 73, 1446-1450. Saxton, W. 0. (1978). “Computer Techniques for Image Processing in Electron Microscopy.” Academic Press, New York. Schulz, T. J. (1993). Multiframe blind deconvolution of astronomical images. J. Opt. SOC. Am. A 10, 1064-1073. Solomon, C. J., and Dainty, J. C. (1992). Imaging a coherently illuminated object through a random screen by using a dilute aperture. J. Opt. SOC. Am. A 9, 1385-1390. Solomon, C. J., Lane, R. G., Mavroidis, T., and Dainty, J. C. (1991). Double passage imaging through a random screen using a nonredundant aperture. J. Mod. Opt. 38, 1993-2008. Titchmarsh, E. C . (1948). “Introduction to the Theory of Fourier Integrals” (2nd Ed.). Oxford Univ. Press, London. Trebino, R., and Kane, D. J. (1993). Using phase retrieval to measure the intensity and phase of ultrashort pulses: frequency-resolved optical gating. J. Opt. SOC. A m . A 10, 1101-1 11 1. Walther, A. (1963). The question of phase retrieval in optics. Opt. Acta 10, 41-49. Walker, J. G. (1981). The phase-retrieval problem: a solution based on zero location by exponential apodization. Opt. Acta 28, 735-738. Walker, J. G. (1982). Computer simulation of a method for object reconstruction from stellar speckle interferometry data. Appl. Opt. 21, 3132-3 137. Wolf, E. (1962). Is a complete determination of the energy spectrum of light possible from measurements of the degree of coherence? Proc. Phys. SOC. London 80, 1269-1272. Wood, J. W., Fiddy, M. A., and Burge, R. E. (1981). Phase retrieval using two intensity measurements in the complex plane. Opt. Left. 6, 514-516.
This Page Intentionally Left Blank
ADVANCES I N IMAGING AND ELECTRON PHYSICS. VOL . 93
Multislice Approach to Lens Analysis GIULIO POZZI Department of Physics. University of Bologna via Irnerio 46. 40126 Bologna. Italy
I . Introduction . . . . . . . . . . . . . . . . . . . . . . . I1 . Standard Multislice and BPM Equations and First Applications . . . . . A . Basic Equations . . . . . . . . . . . . . . . . . . . . B. Intermezzo: the Aharonov-Bohm Effect . . . . . . . . . . . C . Application to the Quadrupole Electron Lenses . . . . . . . . . D . Light Propagation in Quadratic Index Media . . . . . . . . . . 111. Application of the Multislice Equations to Round Symmetric Electron Lenses . A . Inadequacy of the Standard Equations . . . . . . . . . . . . B. The Improved Phase-Object Approximation . . . . . . . . . . C . Thick Lens Theory . . . . . . . . . . . . . . . . . . D . Propagation of a Spherical Wave in the Lens Field . . . . . . . . E . The Glaser-Schiske Diffraction Integral . . . . . . . . . . . . F . The Multislice Method and the Paraxial Schrodinger Equation . . . . IV . Improved BPM Equations and Application to Gradient Index Lenses . . . . A . General Considerations . . . . . . . . . . . . . . . . . B. Propagation of Gaussian Wavefronts . . . . . . . . . . . . C . Application to an Integrated Optics Lens . . . . . . . . . . . V . Beyond the Paraxial Approximation . . . . . . . . . . . . . . A . Phase-Object Approximation with Spherical Input Waves . . . . . . B. Equation for the Spherical Aberration Coefficient . . . . . . . . C . Comparison with the Classic Results . . . . . . . . . . . . . V I . Conclusions . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . .
. . .
. . . .
. .
.
. .
. . .
.
.
. .
173 176 176 178 181 185 186 186 190 192 194 195 200 202 202 204 205 207 208 210 212 215 216
I . INTRODUCTION The basic motivations for this work have stemmed mainly from the teaching experience of the author (Pozzi. 1985. 1988. 1992a) and from the related recognition (confirmed by many talks with young and more experienced colleagues working in electron microscopy) that it is generally accepted that all that is needed for a proper understanding of an electron lens is (i) to demonstrate its focussing properties by solving the equations of motion for paraxial electrons crossing the axis at the object point; and (ii) to transfer the light optical results for analyzing its wave optical behavior (see Reimer. 1984). 173
Copyright W 1995 by Academic Press. Inc . All rights of reproduction in any form reserved .
174
GIULIO POZZI
At the basis of the aforementioned attitude is the fact that the problem of image formation within a transmission electron microscope has customarily been divided into two parts (Glaser, 1952, 1956; Glaser and Schiske, 1953). These are (a) the interaction problem, in which the aim is to calculate the object wavefunction at the exit plane of the specimen; and (b) the propagation problem, in which the objective is to investigate the transfer of the object wavefunction from the object to the recording plane. Separate approximate theoretical methods have been applied to the two problems; namely, the Born or eikonal approximation for treating the interaction problem (Glauber, 1959; Zeitler and Olsen, 1964, 1967; Reimer, 1984), and the solving of the Schrodinger equation for treating propagation in the lens field (Glaser, 1952, 1956; Glaser and Schiske, 1953). Unfortunately, their elegant electron optical treatment of the problem of image formation has not received the consideration it deserves in the electron microscopy community having been written in German and hence being rather unknown to the majority of researchers, although it is the starting point for all the subsequent analyses of the process of image formation according to the transfer theory (see Hawkes, 1973). In addition, the analogy t o the optical lens theory was so close that the basic features of the electron optical case have been almost completely neglected. In the author’s opinion, the previous outlined approach, although effective, is too restrictive and overshadows some relevant peculiarities of the electron optical realm which deserve better consideration. In particular, the analogy between light optics and electron optics breaks down completely when magnetic fields are involved and vector potentials must be considered (Ehrenberg and Siday, 1949; Born and Wolf, 1980; Hawkes and Kasper, 1989, 1994). It follows that the rays of particles are no longer orthogonal trajectories of the equiphase surfaces, a fact which is still a source of trouble for those trying to push the optical analogy beyond its validity range (Christenson and Eades, 1986, 1988). Moreover, we should remember that it was during the analysis of the concept of refractive index in electron optics that Ehrenberg and Siday (1949) were led to the discovery of what is now called the Aharonov-Bohm effect, after the influential paper entitled “Significance of Electromagnetic Potentials in the Quantum Theory’’ (Aharonov and Bohm, 1959). This effect, peculiar to electron optics, raises fundamental questions about locality in quantum mechanics and in the electromagnetic interaction, and it has stimulated a still vigorous theoretical and experimental debate reviewed by Olariu and Popescu (1985) and by Peshkin and Tonomura (1989).
The need for a unified approach to the general problem of image formation is also pointed out by the consideration that in the electron microscope
MULTISLICE APPROACH TO LENS ANALYSIS
175
(and this is particularly evident in magnetic immersion objective lenses) the electrons of the beam interact at the same time with the electromagnetic fields of the specimen at the microscopic (atomic) or mesoscopic level and with the fields of the lenses, which are macroscopic and usually possess cylindrical symmetry around the optic axis. Since interaction and propagation differ only by their different scale dimensions (both of them much larger than the de Broglie electron wavelength) it is desirable to encompass all parts of the process of image formation in the same theoretical framework. Starting from the above considerations, it has been investigated whether the phase-object approximation and multislice method (Cowley, 1981), which have proved to be so successful in the calculation of the object wavefunction at the exit of the specimen and also in the simulation of high resolution electron microscopy images (see for reviews Van Dyck, 1985 and Watanabe, 1993), can also be successfully applied to the propagation problem in magnetic (Pozzi, 1989) and electric (Pozzi, 1990) round symmetric electron lenses. The results of this analysis are summarized in the present chapter, which can be considered, with respect to the classical treatment, as an alternative approach, starting from first principles and obtaining the same results, yet using more up-to-date conceptual tools. It is curious to note that the basic motivations for the physico-optical approach, known in the electron optical community as the multislice method (Cowley and Moodie, 1957b) originally stemmed from an analysis of the problem of image formation (Cowley and Moodie, 1957a). From this point of view this work may be considered the completion of the program laid down by Cowley and Moodie (1957a) in their seminal paper, in which they also pointed out the interdisciplinary character of their method. In the same spirit, the companion field of light optics has been surveyed, because, in spite of the different statistics, when the occupation number in the phase space is very low, photons and electrons have the same behavior, as pointed out by Gabor (1961). Therefore it is not surprising that, in many cases, essentially the same methods have been developed independently in the two related fields, starting from the particular problem to be solved. It has thus been found that the multislice method has, as its counterpart in optics, the beam propagation method (BPM) proposed to investigate the light propagation in graded index optical fibers (Feit and Fleck, 1978) and then extended to the analysis of integrated optics structures, such as thin film waveguides and gratings (Van Roey et al., 1981). Therefore, the chapter is organized as follows. In the second section the standard multislice approach is recalled and used to analyze electromagnetic cyclindrical lenses from the wave optical standpoint. It is shown how the
176
GIULIO POZZI
results obtained by Storbeck (1973) are recovered without any modification of the basic equations. It is also shown how Aharonov-Bohm effects are framed quite naturally in this formulation. The shortcomings arising in the analysis of round symmetric electromagnetic lenses are analyzed at the beginning of the third section, where it is shown how to implement the basic equations to obtain the correct results (Pozzi, 1989, 1990). The propagation of general spherical waves within a thick electromagnetic lens is then investigated, as a first step toward a general theory of image formation. The analogy between electron and light optics is employed in the fourth section to write improved BPM equations (Pozzi, 1992b) and to study the propagation of a Gaussian beam in an integrated optic lens (Di Sebastiano and Pozzi, 1992). It will be shown that this approach removes an inconsistency pointed out by Gribble and Arnold (1988) between the predictions of standard BPM and those of geometrical optics. Finally, the fifth section presents an extension of the phase-object approximation that is able to encompass the problem of lens aberrations. In particular, the case of spherical aberration is considered, and it is shown that the same results are recovered as those predicted by the geometric optical analysis (Glaser, 1952, 1956). Throughout this paper, in the spirit of the multislice method, a heuristic approach is followed at the expense of mathematical rigor. In addition, the author apologizes for the personal bias in the reference list; acknowledgments to original works and authors for light optics may fortunately be found in the book by Born and Wolf (1980) and, for electron optics, in the classic work of Glaser (1952) as well as in the encyclopedic treatise by Hawkes and Kasper (1989, 1994).
11. STANDARD MULTISLICE AND BPM EQUATIONS AND
FIRSTAPPLICATIONS
A . Basic Equations
The basic idea of the physico-optical approach known in electron optics as the multislice method and in light optics as the beam propagation method (BPM) is t o approximate the propagation of the electron or light beam through a continuous medium, whether an electron or gradient index lens or a specimen (Fig. la), by means of a discontinuous process during which the beam alternatively interacts with thin phase screens and propagates freely through the vacuum regions between them (Fig. lb).
MULTISLICE APPROACH TO LENS ANALYSIS
177
a
w
LENS
b
FIGURE1. (a) Propagation of a spherical wavefront W in a continuous medium (lens). (b) Its approximation by a series of thin phase screens at z , _ ] ,z,, z , + ~whose exit points act as Huygens sources for secondary spherical waves.
This is accomplished by dividing the field (or the specimen) into thin slices perpendicular to the direction of the incident beam and by projecting each slice into the entrance plane, which acts as a two-dimensional phase object . In the case of electron lenses, a fixed Cartesian coordinate system X , Y , z is considered, whose z axis is coincident with the optic axis. Let V ( X , Y , z) and A ( X , Y , z ) be the electrostatic potential and the magnetic vector potential associated to the lens field. The transmission T,, function of the i-slice in the standard phase-object approximation (Cowley, I98 l), implemented for the magnetic case by Wohlleben (1971), is given by
where, in the nonrelativistic approximation, E is the accelerating voltage, I is the de Broglie electron wavelength, and h = h / 2 z is the reduced Planck constant. The sign of the phase shift is consistent with the assumption that - e is the electron charge, so that eE is the kinetic energy and these
178
GIULIO POZZI
quantities are related by
h2 L2 2meE 4 n 2 ' where m is the electron mass. Equation (1) is explained as follows: let w ( X , Y , z;) represent the wavefunction immediately before the i-plane, then, the wavefunction immediately after, at zi+,is given by
w w , y , z;+)= w(X, y , Z;)T,,W,y , 2;).
(3)
The propagation of the electron wavefunction between two neighboring slices whose distance is E = zi+, - zi is then calculated according to the Huygens-Fresnel principle in the paraxial (Fresnel) approximation (Born and Wolf, 1980; Goodmann, 1968); that is,
which is the convolution between the object wavefunction after the slice and the Fresnel propagator between the two slices. In the light optical case, if the refractive index changes of the medium are small, so that it can be written as
n(X, Y,z )
=
no
+ 6n(X, Y , z),
(5)
it is found that the propagation is still described by Eq. (4), where L = Ao/no now denotes the wavelength corresponding to the mean refractive index no ( A , = vacuum wavelength), and the transmission function Top,is given by
where ko = 2n/L, is the wave number in vacuum.
B. Intermezzo: The Aharonov-Bohm Effect The fact that potentials instead of fields appear in the transmission function, Eq. (l), as well as in the Schrodinger equation is not trivial, as pointed out by Ehrenberg and Siday (1949), who first noted that effects
MULTISLICE APPROACH TO LENS ANALYSIS
Z
179
Z b
a
FIGURE2. Coordinate system and geometry for the Aharonov-Bohm effect involving (a) an enclosed magnetic flux and (b) the electrostatic field due to a bimetallic wire, equivalent to that produced by two positively and negatively charged lines.
could be observed, which were associated, at least in principle, with a flux. Let us consider the experiment, rediscovered by Aharonov and Bohm (1959), of introducing a long solenoid into a coherent electron beam (Bayh, 1962) or better, a superconducting hollow cylinder with trapped magnetic flux (Lischke, 1969), as sketched in Fig. 2a. The z-component of the vector potential around an infinite ideal solenoid lying along the Y axis can be written as
where 0 is the enclosed magnetic flux. By applying Eq. (1) to calculate the phase shift undergone by the electrons, it turns out that the transmission function of the solenoid is given by ToI(X,Y ) = exp[ne@ sign(X)/h] (8)
so that an observable phase difference Aa,
=
2ne@/h
(9)
is detectable if an interference experiment is carried out, where the wavefunction passing to the left of the solenoid is overlapped onto that passing to the right. The paradox is caused by the fact that electrons are not locally influenced by the magnetic field, which is zero outside the solenoid, and they always propagate in field-free regions if the solenoid is made impenetrable t o them. This interference effect has been verified and confirmed by a number of experiments (see for reviews Olariu and Popescu, 1985; Peshkin and
180
GIULIO POZZI
Tonomura, 1989), the last of which investigated completely shielded superconducting toruses by means of electron holography (Tonomura et al., 1986). This experiment definitely demonstrates that electrons do not experience any magnetic field, and therefore the phase difference cannot be attributed to an external leakage field or to Lorentz force effects on the portion of the electron beam going through the magnet, as some authors have defended. It is interesting to note that a rather similar effect arises also in the electrostatic case. Let us consider the field of a bimetallic wire having zero net charge and lying parallel to the Y axis, Fig. 2b, which is equivalent, at large distances, to that of a dipole line having a strength proportional to R AV, where R is the wire radius and A V is the contact potential difference. More precisely, 2R AV(Xcos 8 - z sin 8 ) V(X, Y , z ) = n ( X 2 + z2) 3
where 8 is the angle between the bimetallic wire and the electron beam. If Eq. (10) is introduced into Eq. (1) t o calculate the transmission function, there results 2nR AVsigg(X) cos 8 T b w ( X , Y ) = exp
[
so that the constant phase difference is now given by Ap
=
1
4nR AVcos 8 AE ’
Therefore, by rotating the wire, it is possible to have a phase difference varying continuously between zero and its maximum value, which is a few n for wires about 1 pm in diameter, with a contact potential difference of A V = 0.5 V, observed at 100 kV accelerating voltage. Experiments have fully confirmed the above predictions (Matteucci et al., 1982; Matteucci and Pozzi, 1985; Matteucci et al., 1992). It is interesting to note analogies and differences between the magnetic and the electrostatic case. In both cases the space is multiply connected owing to the presence of the solenoid or the bimetallic wire, and the effect is detectable as a displacement of the interference fringe pattern with respect to the unperturbed diffraction envelope. However, the electrostatic case is not as paradoxical as the magnetic one, because the electrons experience the electrostatic field, and the effect depends on the kinetic energy eE of the electrons, i.e., it is a dispersive phase shift. As pointed out by Boyer (1973) in his analysis using wavepackets, this means that in the electric case a classical lag effect is present, which can be
MULTISLICE APPROACH TO LENS ANALYSIS
181
made larger than the wavepacket length, thus destroying the interference pattern (Schmid, 1984), whereas such effect is not expected in the magnetic case, where the phase shift is independent of the electron energy. C. Application to the Quadrupole Electron Lenses
A magnetic or electrostatic quadrupole lens is characterized by two perpendicular planes of symmetry in the potential or field (Septier, 1971). The simplest field distribution for the electrostatic case can be generated by a pair of conductors at potential U having the hyperbolic profile ( X / d ) 2- ( Y / d ) 2= 1, and by another pair at potential - U having ( X / d ) 2- ( Y / d ) 2= -1, Fig. 3a. Moreover, the field is assumed to be of infinite extent in the z direction. The scalar potential is therefore given by V(X, Y , 2 )
=
U d
7(X2 - Y2).
Important features of the electron motion follow directly from the symmetry properties of the field: namely, electrons travelling in a symmetry plane will always stay in that plane. Moreover, whereas those in the plane Y = 0 are attracted by the charged surfaces and thus diverge from the axis (defocusing action), those in the X = 0 plane are repelled from the charged surfaces toward the axis so that a focusing action is present in this plane. A magnetic field of analogous properties can be derived from the vector potential (Septier, 1971) A ( X , Y , z)
Po nI
= -( X 2 -
d2
Y2)e,,
IY
a
b
I
FIGURE3. (a) Electric and (b) magnetic quadrupole hyperbolic fields.
(14)
182
GIULIO POZZI
which can be generated by a four-pole magnet with hyperbolic shapes and alternating polarity, rotated by 77/4 with respect to the electrostatic case, Fig. 3b. n l is the total ampere-turns in the coils wound on the poles, and ,uo is the permeability of free space. The vector potential has the same form as the scalar potential, but the symmetry properties of the magnetic field are different. It turns out that the magnetic field is everywhere perpendicular to the symmetry planes, so that, if an electron is moving in the symmetry plane, it will remain in it owing to the Lorentz force. These hyperbolic fields have the property that the magnitude of the electric (or magnetic) field is everywhere proportional to the radial distance measured from the axis; hence, the gradient of the field is a constant given for the electrostatic case by
and for the magnetic case by
KdO)
as
=ar =
2ponNd2.
Let us apply the multislice method to the wave optical analysis of the propagation of the electron in such a quadrupole field, by considering that the wavefunction at the plane z = zo is given by the Gaussian exp[-(X2
+ y2)/w2],
(17)
where w, the distance at which the amplitude is decreased by a factor l/e compared with its value on the axis, is usually referred to as the beam spotsize (Yariv, 1985). When the input wavefunction Eq. (17) is first multiplied by the transmission function of the slice, where Eqs. (13) and/or (14) have been introduced in Eq. ( l ) , the phase of the output wavefunction is still quadratic in form in the variables X and Y , but with coefficients which have become complex and are no longer equal. The following convolution with the Fresnel factor, Eq. (2),which can be carried out analytically using standard definite integrals (Gradshteyn and Ryzhik, 1980) does not alter this general conclusion, but only modifies the values of the coefficients of the quadratic terms and adds an amplitude factor. This overall behavior is maintained at every step, so, instead of doing all the calculations, we start from a generic plane zi, where the input Gaussian wavefunction is given in its most general form by
183
MULTISLICE APPROACH TO LENS ANALYSIS
where a(zJ is the amplitude term and Px(zi) and B y ( z i ) are two complex numbers whose real and imaginary parts are inversely proportional t o the radius of curvature and the width of the wavefunction in the corresponding direction. More precisely,
To fix the ideas, let us consider the electrostatic case, Eq. ( 1 3 ) . If the wavefunction Eq. (18) is multiplied with the transmission function of the i-slice and then convoluted with the Fresnel propagator between the slices, it results, after some simple but slightly involved algebraic calculations, that the wavefunction at the plane zi+, is given by the same form as Eq. ( 1 8 ) , where the new coefficients are related to the old ones by the relations a(zj + E )
=
42;)
J1 + E&(z;) + E2U/Ed2J1+ &Py(zi)
-
I
E2U/Ed2
(20)
and
Pdz; + E) When E equations
+
Py(2;)
=
1
-
EU/Ed2
+ EP~(ZJ E2U/Edi’ -
0 the former finite difference equations become the differential 42)
a’(z) = --((Bx(Z)
2
+ PY(Z)),
and where the prime denotes the derivative with respect to z. Equations (24) and ( 2 5 ) , with the change of variables
become
U Ed2
u“(2) - - U ( Z )
v”(2)
U
=
0,
+ Ed 2v(2) = 0.
(23)
184
GIULIO POZZl
Those equations are simply the equations of the trajectories (Septier, 1971), which can easily be integrated, giving
+ B sinh ~ E z ,
U(Z) = A cash ~ E z
(29)
and V(Z) =
CCOSPEZ + DsinPEz,
where A , B , C , and D are arbitrary constants to be determined by the initial conditions, and
Also, the amplitude, Eq. (23), can easily be integrated after the changes of variables, Eq. (26), resulting in a(z) = 1/-.
Determining the arbitrary constants by imposing the condition that at the 0 plane the general solution equals Eq. (17), it turns out that the axial Gaussian beam within the electric quadrupole lens is given by
z
=
expl(in/A)[Px(z)X2 + P Y ( Z ) Y 2 l 1 ~ ( C O~ SE +z (iA/nw2pE) sin BEZ)(COsh ~ E +z (iA/nw2BE) sinh ~
9
(33)
E z )
where BE sinh ~ px(z) = COSh DE.2
E + z (iA/nw2) cash ~ E z + (iA/nW2pE) sinh pEZ '
(34)
and
If the magnetic case is considered, the same results are obtained, provided with p M , given by
BE is replaced
2ep0nlA
&4
=
KM(0)e
hd2 -- -. mu
The evolution of the spotsize w of a Gaussian beam through a quadrupole lens is sketched in Fig. 4, which shows the focusing in the vertical plane and the divergence in the horizontal plane. The results obtained can easily be generalized to a real quadrupole lens of finite length, because in the paraxial regions the same expansions for the scalar and vector potentials hold, provided they are multipled by the z-dependent function k(z),which is equal to unity at the lens center and zero outside.
MULTISLICE APPROACH TO LENS ANALYSIS
185
FIGURE4. Evolution of the spotsize of a Gaussian beam through a quadrupole lens, showing focusing in the Y and divergence in the X direction.
Also, the uncoupled set of differential Equations (27) and (28) should be accordingly modified so that in the final expression of the Gaussian beam the trigonometric and hyperbolic functions (which correspond to the rectangular model approximation) are replaced by the two independent solutions of the new equations for the trajectories. The above calculations can easily be extended to more general quadratic expressions for the input beam (at the expense of some complications in the algebra), including spherical and plane waves and combinations thereof. By a procedure explained later in connection with the wave optical imaging properties of round symmetric lenses, Section III,E, the results obtained by Storbeck (1973) for the case of pure quadrupoles can then be easily recovered.
D. Light Propagation in Quadratic Index Media The results obtained previously for quadrupole lenses can be transferred immediately and easily into the optical field in order to analyze lenslike media (Yariv, 1985). Taking for the variation of the refractive index the expression -n k 6n(X, Y , z ) = 2(x2 + YZ), (37) 2ko where k2 is a constant, it turns out (see also Section IV) that k2/2ko plays the role of and the sign of kz gives origin, if it is positive, to an oscillating ray, Eq. (30), or, if it is negative, to a diverging ray, Eq. (29). Therefore, according to the sign, a thin section of the medium acts as a positive or negative lens (Yariv, 1985).
186
GIULIO POZZl
The relationship between input and output values for the Gaussian beam parameter p(z), Eqs. (34) and ( 3 9 , is known in optics as the ABCD law and is very useful for the tracing of Gaussian beams through a complicated sequence of lenslike elements.
OF THE MULTISLICE EQUATIONS TO 111. APPLICATION ROUNDSYMMETRIC ELECTRON LENSES
A . Inadequacy of the Standard Equations
In the foregoing section we have seen how the multislice method can be applied successfully t o the case of quadrupole electron lenses. Let us analyze more closely the consequences of the basic equations, and more properly, of the transmission function, Eq. ( l ) , when the electrostatic potential is due to a lens field of round symmetry. In the paraxial approximation V(X, Y , z )
=
Vo(z)- &d'(z)(X'
+ Y2),
(38)
where &(z) is the electrostatic potential on the optic axis and the prime denotes, as usual, the derivative with respect to z. Inserting Eq. (38) in Eq. ( l ) , it is found that the slice is responsible for a transmission function given by T,,(X,Y , zi)= exp
Vo(z)dz
-
in (X2+ Y2) 41E
~
(39)
Recalling that the amplitude transmission function of a thin lens having focal length f is given by (Goodmann, 1968)
it is ascertained that the transmission function Eq. (39) contains a quadratic term which predicts at least the expected lens effect, the focal lengthf, of the i-slice being given by
This is the well-known expression for the lens effect of a single aperture (Glaser, 1952; Farago, 1970). However, when the focal length of a lens is
MULTISLICE APPROACH TO LENS ANALYSIS
187
calculated in the weak, thin lens approximation, Eq. (39) does not give the correct expression, which, in the image space, is instead given by (Glaser, 1952; Grivet, 1972) 1
where
@(z),defined
1
according to the relation
@(z)= E + V,(Z)
(43)
corresponds to the standard choice of the electrostatic potential, whose zero level has been fixed so that it is directly proportional to the kinetic energy. The situation is even more dramatic for the magnetic lens case; in fact, the components of the magnetic vector potential in the paraxial region are given by
where B,(z) is the axial component of the magnetic field. Therefore the main phase shift given by Eq. (1) vanishes identically so that no lens effect is expected at all. In spite of these shortcomings the foregoing analysis suggests the direction to take in order to improve the basic equations of the multislice method. First, for the magnetic lens case, there is evidently a failure of the standard phase-object approximation, which should be capable at least of predicting the weak, thin lens effect. Second, for the electric lens case, it is evident that the variation of the electron wavelength along the optic axis should be properly taken into account. Therefore, let us reconsider how the standard phase-object or eikonal approximation is derived for the case of the electric field due to the specimen atoms (Glauber, 1959). The starting point is the Schrodinger equation, written in the nonrelativistic case
2e
V 2 w - -A
hi
*
Vv
e2 + V v + -2me (V + E ) v - 7 J A 2 W = 0, h2 h
(45)
where the additional constraint divA = 0 has been imposed on the magnetic vector potential. In the purely electrostatic case (A = 0 ) , the crystal potential energy eV(X, Y , z) is considered as a small perturbation with respect to the kinetic energy eE of the incident electron beam. Therefore, if the plane wave solution of the unperturbed Schrodinger equation propagating parallel to
188
GIULIO POZZI
the optic axis z is given by yo = e x p ( T ) ,
then the solution of the perturbed Schrodinger equation is looked for in the form
w
(47)
= wox.
The resulting equation for x is given by 4niax 4n2 v2x + - + -vx
I az
EL2
= 0.
The phase-object approximation is obtained when the V 2 x term is neglected. In fact, in this case, the equation for x results as
which can be immediately integrated to give the first phase term in Eq. (l),
Let us also introduce the magnetic field in the same approximation. The equation for x, once the V 2 x term is neglected, is found to be
i.e., with respect to Eq. (49) we have three additional terms. The first, ie
- jA , W , y, 21x9
(52)
can be simply added to the electrostatic term in Eq. (49) and the integration can be carried out in the same manner as before. It ensues that the resulting phase shift is the magnetic contribution displayed in Eq. (1). Therefore, the standard phase-object approximation for magnetic fields amounts to neglecting the other terms, as shown by Wohlleben (1971). Of course this can no longer be done for the magnetic lens case, where the main shift vanishes identically. Therefore, let us investigate the effect of the other terms. The second term,
MULTISLICE APPROACH TO LENS ANALYSIS
189
is an additional phase shift which can immediately be integrated along z. The resulting multiplicative factor in the transmission function
is exactly the term responsible for the lens effect of the weak magnetic lens, with focal length (Glaser, 1952; Farago, 1970; Grivet, 1972)
If also the third term
is taken into account, it turns out that the z integration can no longer be carried out in the same manner as before. In order to find the effect on the wavefunction also in this case, it is necessary, as shown by Glaser (1952, 1956), to introduce a new rotating or screw coordinate system x , y , z linked to the original fixed Cartesian system X , Y, z by the relations X
=
x cos e(z) - y sin e(z),
Y
=
x sin e(z) + y cos B(z),
z
= 2,
where e(z) is a function of z to be suitably determined (see Fig. 5).
FIGURE5. Fixed ( X , Y , z ) and rotating (x, y , z ) coordinate systems.
(57)
190
GIULlO POZZI
By introducing the new wavefunction $(x,Y , z ) = X(X9 y , 21,
the resulting equation for $ in the rotating system is found to be
a$ az
-
xi V $ - - e2A iB,(x 2 2 +y2)$LE 16nh2 (59)
and if 0(z) is chosen in such a way that
8’(z)
=
1 el, B,, 4 7th - -
it follows that
which can again be directly integrated along z as before. In conclusion, the effect of the magnetic field of the lens in the phase-object and paraxial approximations is twofold: (a) it introduces a quadratic phase shift responsible for the lens effect; and (b) the wavefunction at the exit plane of the slice zi should be rotated by the angle A 0 as expressed by
This second effect is automatically included by the adoption of the rotating coordinate system x, y , z. B. The Improved Phase-Object Approximation
In order to treat the more general case of the presence of both electric and magnetic fields, let us introduce the magnetic field and the z-dependence of the electron wavelength. First we generalize Eq. (46) to find a solution of the Schrodinger equation of the form
where p(z), given by p(z) = [ 2 m e ~ z ) 1 ” ~ ,
represents the classical electron momentum along the optic axis.
MULTlSLlCE APPROACH TO LENS ANALYSIS
The resulting equation for v2x
191
x is given by
2 m ) ax ip’(z) + -+h az A
“‘2
2h
e
@,,(z)(x2
+
y2)x
+ Y 2 ) x= 0 ,
(65)
where the identity W ( z ) = c ( z ) has been used. As for Eq. (48) the phase-object approximation is recovered when the V 2 x term is neglected. In fact, in this case, the equation for x results as
which, however, cannot be immediately integrated as before owing to the presence of the term linear in B, with mixed partial derivatives. If the rotating or screw coordinate system x, y , z is introduced, Eq. (57), the resulting equation for the new wavefunction 4(x, y , z), Eq. (58), in the screw system is
and if 0(z) is chosen in such a way that
it follows that the mixed term vanishes again, and the equation can be directly integrated along z as before. Therefore
and By applying Eqs. (58), (64), and (69) between the planes ziand by determining the constant A by the value of the wavefunction at the plane zi, the improved transmission function for the electromagnetic potential of
192
GIULIO POZZI
the i-slice is given by
C. Thick Lens Theory It has been shown in the previous section that the improved phase-object approximation includes the two basic effects of a thin electromagnetic lens; namely, the focusing action and the rotation of the image plane with respect to the object plane. Let us show how all the main features of thick lens theory are recovered by the multislice method. It is convenient to perform the calculations in the rotating coordinate system x , y, z. It can be ascertained that owing to the rotational invariance, the Fresnel propagator also, Eq. (2), has the same form in the new system, i.e.,
where, in addition, the electron wavelength not only is no longer constant but also varies along the optic axis from slice to slice, being given by 2nh
A(z;) = -.
P(Z;)
Let us consider a spherical electron wavefunction in the paraxial approximation at the plane z = z;. We can express this wavefunction as
4(X,Y,z;) = 4 z ; )
where a , y , a,, a,,,and P are real functions of z. If the multiplicative effect of the thin electromagnetic field of the slice is taken into account, and then the convolution through the Fresnel factor, Eq. (71), with A(z,), is carried out, the resulting wavefunction at the plane
MULTISLICE APPROACH TO LENS ANALYSIS
193
,
zi+ can be calculated analytically after lengthy, but not difficult, manipulations using standard definite Fresnel integrals (Gradshteyn and Ryzhik, 1980). It ensues that the wavefunction is still a spherical wave of the form indicated in Eq. (73) and the relations between the old and new coefficients are given by
The above system of finite difference equations can be transformed into the differential equations which the various coefficients obey by allowing E --t 0. Therefore,
and
194
GIULlO POZZI
This set of differential equations is essentially identical to that found by Glaser and Schiske in their analysis of the imaging process through the solution of the paraxial Schrodinger equation (Glaser, 1952, 1956; Glaser and Schiske, 1953). D. Propagation of a Spherical Wave in the Lens Field To solve the set of differential equations (80)-(84),we start with the Riccati equation (84), which, after the substitution
p = - Pr ’ 2r gives for r the second order differential equation
Using Eq. (64), Eq. (86) can be rewritten in the more familiar form
or equivalently, d -(fir’) dz
+446
This equation is the paraxial ray equation for electromagnetic lenses; the general solution of Eq. (88) can be written as the linear combination of two independent solutions p(z) and a(z), between which one of the following relations hold: @ ( p ’ a - P O ’ ) = constant. (89) or p ( z ) ( p ’ a- p a ’ ) = constant = K . (90) Therefore, if we put
it can be easily verified that
MULTISLICE APPROACH T O LENS ANALYSIS
195
and
where A , B , and C a r e constants to be determined by the initial conditions. Noting that
we have that (96) which can be immediately integrated to give (97) Hence, the final result for the spherical wavefunction is
(23'
+ C2)a+ 2Bx + 2Cy + pp'(x2+ y 2 ) (98)
This equation means that once the spherical wave is known in some starting plane zo (usually the object plane), the values of the different parameters are fixed, and then the spherical wave in any plane z is known and given by Eq. (98). The identity between this result and that obtained by Glaser and Schiske (Glaser, 1952, 1956; Glaser and Schiske, 1953) in their solution of the paraxial Schrodinger equation demonstrates that the heuristic approach using the improved multislice equations leads exactly to the same conclusion.
E. The Glaser-Schiske Diffraction Integral As shown by Glaser and Schiske (Glaser, 1952, 1956; Glaser and Schiske, 1953) the importance of the spherical waves of the form given in Eq. (98) lies in the fact that every wavefunction known in some plane zo can be decomposed as a linear superposition of suitable spherical waves. As they form a complete set of solutions of the paraxial Schrodinger equation, the value of the wavefunction in any plane z can be accordingly calculated.
196
GIULIO POZZI
FIGURE6 . Image formation according to particle optics, using the two independent solutions of the paraxial ray equation g(z) and h(z) in the rotating coordinate system (A’, y , z). Note that electron trajectories emerging parallel from the object plane zo converge in points in the Fraunhofer plane zF,whereas trajectories emerging from a generic point recombine again in the image plane ti,with magnification g(zi).
1. Point representation of the object wavefunction
Let us consider, Fig. 6, the two standard solutions of the paraxial ray equation, Eq. (88), satisfying at the object plane zo the initial conditions
then the spherical waves essentially are functions of only the two parameters B and C , and their most general superposition can then be written as
+ p(z)h’(z)(x2+ y 2 ) Using the relation
(101)
MULTISLICE APPROACH TO LENS ANALYSIS
197
derived from Eq. (89), the integral Eq. (101) can be rewritten in the more useful form
The value of the coefficient A(B, C ) can now be determined from the condition that in the plane z = zo the wavefunction given in Eq. (103) equals the wavefunction assigned in that plane, i.e., the object wavefunction. But for z zo the solution h(z) --t 0 so that both amplitude and phase of the spherical waves diverge. Nevertheless, the double integral can still be evaluated by means of asymptotic approximation methods (Born and Wolf, 1980; Murray, 1974). In particular, by the stationary phase method, it can be shown that the main contribution to the value of the integral arises from the points at which the rapidly oscillating phase of the integrand is stationary. In this case for z = zo only one stationary point is present, whose coordinates in the (B, C ) plane are given by -+
B
c = -YP(Zo).
= --XP(ZO),
( 104)
As only a small region around this point contributes to the integral, the amplitude factor can be taken out from the integral and made equal to its value at that point; the remaining integral can then be directly calculated by using standard definite integrals, and the final result reads
v ( x ,Y , zo) = A(-xp(zo), - Y P ( Z o ) ) i m i J . By putting B becomes
=
-xop(zo) and C
=
(105)
-yop(zo), the integral Eq. (103)
which is the Glaser-Schiske generalization of the Kirchhoff-Fresnel integral in the case where an imaging field is present. This integral allows the calculation of the wavefunction at any plane once the wavefunction at the object plane is given.
198
GlULiO POZZI
FIGURE7. Representation in the rotating coordinate system of the wave surfaces and geometric optical rays associated with the basic solution in the point representation. Dashed lines represent the two independent solutions of the paraxial ray equation g(z) and h(z).
In particular, at the image plane, characterized by the condition h(zi) = 0, the same reasoning given before leads to the conclusion that the condition of ideal imaging is realized, because the image wavefunction is a scaled replica, with magnification g(zi), of the object wavefunction (Glaser, 1952, 1956; Glaser and Schiske, 1953). Moreover, by putting iy(xo,y o , zo) = 6 ( x o ,y o ) in Eq. (106), the generic spherical wave used in the expansion Eq. (103) is recovered, so that the physical origin of the divergence of the spherical wave at the object and image planes is due to its Dirac delta behavior. As shown in Fig. 7, the spherical wave has its foci at the object and image planes, and clearly, this corresponds to the representation of the object wavefunction as a superposition of object points in real space.
2. Spatial frequency representation of the object wavefunction If instead of Eq. (100) the opposite choice is made, i.e., P(Z) =
4 z ) = h(z),
dz),
then the most general superposition can be written as
1
(B2 + C2)h(z)+ 2Bx
11
+ p(z)g’(z)(x2+ y 2 )
+ 2Cy
dBdC.
(108)
MULTISLICE APPROACH TO LENS ANALYSIS
At the object plane, expressed as
z
=
zo,
199
these spherical waves become plane waves
Therefore, if we compare Eq. (109) with the spatial frequency representation of the object wavefunction in the object plane,
the parameters A ( B , C), B, and C can be fixed, i.e.,
It ensues that the wavefunction in the whole space is given by
and represents the solution of the paraxial Schrodinger equation in the spatial frequency representation. Figure 8 sketches the generic wave of the representation of Eq. (1 12). Obviously, the point and spatial frequency representations are completely equivalent from the mathematical point of view, and probably for this reason, on historical grounds, the point representation has been preferred to the spatial frequency one. However, considering the approximations made to obtain the previous results (particularly, neglecting V2x) the latter choice may be preferable, as it presents fewer mathematical troubles near the object and image planes, where, in the point representation, the spherical wavefunctions diverge. In addition, this representation naturally leads to the information theory approach of the lens as a spatial frequency linear filter (Lenz, 1971; Hawkes, 1973).
200
GIULIO POZZI
FIGURE8. Representation in the rotating coordinate system of the wave surfaces and geometric optical rays associated with the basic solution in the plane wave representation. It is also shown that for thick lenses the real Fraunhofer plane zF is not coincident with the plane z, where the centers of the spherical wave in the image plane are located.
F. The Multislice Method and the Paraxial Schrodinger Equation It is worthwhile exploring a little further the identity noted at the end of Section III,D between the solution obtained by the multislice method in the limit of vanishing slice thickness and the solution of the paraxial Schrodinger equation, which has been written by Glaser and Schiske (1953) by neglecting only the term a2x/az2 in Eq. (65) instead of the whole Laplacian V2x. The resulting equation reads in the rotating coordinate system
_-
and being of parabolic, instead of elliptic type, allows an efficient numerical solution by noniterative marching techniques applied in the fields of light propagation in optical fibers (Fleck et al., 1976; Feit and Fleck, 1978) and of underwater acoustics (see for a review Tappert, 1977). At the basis of these methods are the following considerations: the formal solution of Eq. (113) can be written as
MULTISLICE APPROACH TO LENS ANALYSIS
20 1
where the operator appearing at the second member of Eq. (1 13) has been split into
and
For a small step E = zi+, - zi the following approximations hold: (Fleck et al., 1976; Makri, 1991): 4(x, Y , z;+,) = ~ X P M Q + ~ ) , I + ( xY, , ZJ =
exp[icnj]exp(ierj]4(x,y , 2;)
+ 0(c2).
( 1 17)
Therefore, at the first order in E the wavefunction at the plane zi+, can be found through the two separate steps +(xi ,A , zi+) = ~ X P W M X ~Y,; , z;),
(1 18)
and 4(xiYyi9 zi+J
=
e x p [ i ~ Q ~ l 4 ( x ~zj+h ,y~,
(1 19)
Equation (1 18) is the solution of the differential equation
with initial condition $(xi, y j ,z;), which is exactly our phase-object approximation, Eq. (67), describing the interaction part, whereas Eq. (1 19) corresponds to the differential equation
which is simply the propagation part in our multislice approach, whose solution in the real space is given in the paraxial approximation by Eq. (71). Fast Fourier transform methods are usually used for the numerical evaluation of the propagation. As the above considerations guarantee the convergence for small E of the numerical solution to the analytical solution of the paraxial Schrodinger equation, they also justify, on a sound basis, the convergence of our analytical approximate solution. It is worthwhile mentioning that these low-order discretization schemes have been developed in the broader context of the path integral formalism
202
GIULIO POZZI
where the Trotter formula (Kleinert, 1990; Makri, 1991) provides a useful key to derive the path integral representation of quantum mechanical systems (Feynman and Hibbs, 1965). Path integral formalism has been introduced also in electron optics by Van Dyck (1975) as a new description for the diffraction of high energy electrons in crystals; the previous considerations clearly show that this description can be easily and profitably extended also to the problem of propagation of the wavefunction in the lens field. IV. IMPROVED BPM EQUATIONS AND APPLICATION TO GRADIENT INDEXLENSES In light optics (as well as in electron microscopy) the BPM has been used as a numerical algorithm, in conjunction with the powerful fast Fourier transform method. The aim of the section is to use the analogy between electron and light optics t o transfer the results obtained in this latter field in order to write improved BPM equations and to solve them in the limit of the slice thickness tending to zero. It is shown that in the paraxial approximation they lead to the analytical solution for the propagation of Gaussian as well as spherical wavefronts in optically inhomogeneous media with a transverse quadratic variation of the refractive index (Pozzi, 1992b; Di Sebastiano and Pozzi, 1992). It will also be shown that this implementation removes two shortcomings of the basic equations pointed out by Gribble and Arnold (1988), who noticed that the BPM does not predict the correct focal length for an integrated optics lens and that the ray equation that can be attributed to the BPM does not agree with the standard geometrical optics ray equation.
A . General Considerations In spite of the close analogy between electrostatic and gradient index lenses it is preferred here t o rederive the main results from the beginning. The starting point is the scalar Helmholtz equation V’W
+ Gn’(r)ty
=
0,
(122)
where n(r) = n(X, Y , z ) is the refractive index and k,,is the wavenumber in vacuum. For a medium with cylindrical symmetry, taking the z axis coincident with the symmetry axis (i.e., the optic axis), it is convenient to write in the Gaussian approximation
n2(X, Y , z)
=
ni(z) - n,(z)(X’
+ Y’),
(123)
203
MULTlSLlCE APPROACH TO LENS ANALYSIS
where we may consider n,(z) as the leading term and the remainder as a perturbation. In order to calculate the transmission function of the i-slice, it is better to start from a solution of the Helmholtz equation of the form
The equation for x is given by V2x
ax + ik, dn,O x - G n 2 ( z ) ( X 2+ Y2)x = 0. + 2ik0n,(z) az dz
Neglecting the V2x term the equation for
(125)
x results in
which can be immediately integrated to give ik,
A
x ~ Y, ,z ) = -ex,[
4x3
-2 (X*
+ Y2)
-dz] .
(127)
Therefore, if w(X, Y, z;) represents the wavefunction immediately before the i-plane, the wavefunction after the slice is given by w
W
9
y, z;+)= w(X, Y, Z;)T,,(X, y, z ; ) ,
( 1 28)
where the improved transmission function Topis
exp[ ik,
"'1
2,
no(z)dz - ik, ( X 2+ Y2) 2
11;'
-dz]
.
(129)
The propagation of the wavefunction between two neighbouring slices is calculated according to the Huygens-Fresnel principle in the paraxial (Fresnel) approximation, Eq. (2), where the wavelength depends on the z-coordinate and is given by 1; = 27r/[kOn0(zj)]. The following changes with respect to the standard BPM equations (which can be recovered from the previous ones by putting n,(z) = constant) should be noticed: (a) the presence of the amplitude factor and the integrands in the phase terms of the transmission function; and (b) the dependence of the wavelength on the axis position.
204
GIULIO POZZI
B. Propagation of Gaussian Wavefronts Let us consider an axial Gaussian wavefront at the plane 9
~
Y , zi) 3
=
z
=
zi.Then
a(zi) expliko[y(zi) + P(zi)(x2 + Y’III,
(130)
where the real and imaginary parts of the coefficient p are inversely proportional to the radius of curvature and the width, as in Eq. (19). If the multiplicative effect of the transmission function of the slice is taken into account and the convolution through the Fresnel factor with Ai, is carried out, the wavefunction at the plane zi+, can be calculated analytically as done before. It ensues that the wavefunction is still Gaussian, and the relations between the old and new coefficients are given by a system of finite difference equations, which can be transformed into differential 0. These equations that the various coefficients satisfy, by allowing E differential equations are given by +
da(z) dno(z) - 2a(z)-, P(Z) - - --a(z) dz
2n0(z) dz
no(z)
(131)
and
By making the substitution
p = - no - dr
(134)
2r dz’
Eq. (133) gives for r the second order differential equation
no(z)d2r 2r dz2
-- +
1 dno(z)dr
2r
dz
n2(z) + -= 0, dz 2n&)
(135)
or equivalently,
d ( n o $ ) + :r
=
0.
dz
It is important to note that this equation is the paraxial ray equation of geometrical optics whose general solution can be expressed as a linear combination of two independent solutions. Therefore, r being a solution
MULTISLICE APPROACH TO LENS ANALYSIS
205
of Eq. (136), the final result for the Gaussian wavefunction is
(137)
where the values of the free coefficients can be determined by matching Eq. (137) with Eq. (130) at the chosen reference plane. It can also be ascertained that Eq. (137) is a solution of the paraxial Helmholtz wave equation. Moreover, it should also be pointed out that the standard BPM equations would give for r a different differential equation and hence a disagreement between BPM and geometrical optics calculations (Gribble and Arnold, 1988). Therefore, for an axial Gaussian wavefront at zo, with a complex beam parameter P(zo), 4 4 x 9
y , 20) = N Z o ) exp
ri
~ P ( Z O ) ( X ’+
The wavefunction at an arbitrary plane
1
Y2) .
(138)
z is given by
where the prime denotes differentiation with respect to z , and g and h are the solutions of the paraxial ray equation of geometrical optics, Eq. (136), which satisfy the initial conditions
C. Application to an Integrated Optics Lens
Let us apply the previous considerations to the propagation of a Gaussian wavefront through a Maxwell fish-eye lens profile, whose refractive index is given by
n(X,Y , z) = ii/(l
+ x2
+ + ”’> a2 y2
206
GlULIO POZZI
In the paraxial approximation n(X, Y , z ) is given by
so that it follows no(z) =
iia2 a’ + z 2 ’
(143)
~
and
n,(z)
=
2ii 2a4 (a2 + z213
The ray equation for the Maxwell lens is therefore (0’
+ z2)7d2r - 2 2 -dr dz
+ 2r = 0,
dz
whose solutions g and h which satisfy the initial conditions Eq. (140) at the object plane zo are
a2 a2 + z0
a2 + z i ’
( 1 46)
a2zo a2 + z i *
(147)
and
h(z) =
a2 + zo
FIGURE 9. Intensity distribution in the plane Y = 0 of the Gaussian wavefunction through the Maxwell fish-eye lens, showing that the focus is at z = 18.5 rnm. The X coordinate varies between -0.03 and +0.03 rnm,whereas z varies between 17 and 20 rnm.
MULTISLICE APPROACH TO LENS ANALYSIS
207
Taking the same data as Gribble and Arnold (1988), i.e., z, = - 5 m m , I = 0.633pm, ii = 2.3111, and a = 22.96mm, &,) = (iA)/(nw& where w, = 0.3 mm, the intensity distribution of the Gaussian beam Eq. (139) in the plane Y = 0 shown in Fig. 9 clearly indicates that the axial focus is at z = 18.5 mm, as predicted by geometrical optics. Remember that the focus calculated by the standard BPM code is at z = 21.32mm (Gribble and Arnold, 1988).
V. BEYOND THE PARAXIAL APPROXIMATION
The successful results discussed in the previous sections on the application of the multislice method to the wave optical analysis of the paraxial properties of electron lenses stimulated a further effort to ascertain whether the nonparaxial properties of the lenses could also be treated within the same framework, with attention focused mainly on the spherical aberration coefficient. Di Sebastiano (1991) made a first attempt by taking the standard multislice equations and simply adding the fourth order terms for the fields in the slice transmission function. Accordingly, also, the kernel in the Kirchhoff-Fresnel integral describing the propagation between adjacent slices was expanded up to the fourth order. The effects of these changes on the propagation of an axial spherical wave, again developed up to the fourth order, were then evaluated by the method of stationary phase, as calculations could not be made analytically. In this way it was possible to find relations between the values of the various coefficients before and after the slice. When the limit of vanishing thickness of the slice is taken as in the paraxial case, it is possible to obtain, in particular, a differential equation for the spherical aberration coefficient, which has almost the same structure and leads almost to the same results as the classic expression (Glaser, 1952). However, only two terms instead of four are recovered. The analysis of the missing terms indicated that this first approach did not properly take into account the obliquity factor and the focusing properties of the slice. In fact, the standard expression for the transmission function of the slice has been derived for the case of a plane wave parallel to the optical axis. When the correction for an inclined wave was properly taken into account, an additional term was recovered so that only one term was left out. In the following section it will be shown that by pushing the heuristic ideas behind the multislice method a step forward, i.e., by simply extending
208
GIULIO POZZI
the phase object approximation to spherical instead of plane input waves, it is possible to obtain more than the correct transmission function of the slice. In fact, when the electromagnetic potentials are expanded up to the second order, the approximations made guarantee that the paraxial wavefunction is the solution of the Schrodinger equation in the whole lens, whereas the correct differential equation for the spherical aberration coefficient is recovered when fourth order terms are taken into account.
A . Phase-Object Approximation with Spherical Input Waves We recall that the basic steps leading to the phase-object approximation (Sections II1,A and II1,B) are as follows: Start with a suitable input wave t,v0 (a plane wave in Sections II1,A and II1,B) and look for a solution of the Schrodinger Eq. (45) in the form yox. 2. Obtain a simpler differential equation for x by neglecting the LapIacian term V’X. 1.
These steps will be repeated and improved here to treat the case of spherical aberration, by considering as yo the axially symmetric spherical wavefunction given by
and by including in the Schrodinger equation the development up to the fourth order of the electromagnetic potentials, given by 1 4
1 64
@(X,Y , Z) = @(z) - - @ ” ( z ) ( X 2+ Y 2 ) + -@IV(z)(X2
+ YZ)’
(149)
A , = -+Y(B,(z) - $B;(z)(X2 + Y 2 ) )
Ay
=
$X(B,(Z) - $B;(z)(X2
+ Y’))
(150)
A , = 0.
With these premises, in order to avoid overload in the formulas, we first consider the case of electric lenses, showing at the end the modifications necessary to include the magnetic field contribution.
209
MULTISLICE APPROACH TO LENS ANALYSIS
Setting w0x into the fourth order Schrodinger equation, the resulting equation for x is 4D2(z) + 2 r ’ ( z ) P ’ ( z )+
+
ti
[4P(z)
y@”(z) 1(x2+
+ r”(z)+ P”(Z)(x2 + Y 2 ) ]+ V2X = 0. ~
X
y2)
(151)
Let us introduce explicitly the fact that yo is a solution of the paraxial equation; then
r’(z) = p(z),
(152)
and
which results in the survival of only the fourth order terms in the radial coordinate in the factor multiplying l / h 2 in Eq. (151). By comparing Eq. (153) with Eq. (84), it may be ascertained that Eq. ( 1 5 3 ) , with the substitution of Eq. (85) leads to the paraxial ray equation for an electrostatic lens. In order to obtain a simpler equation for x, the following approximations should be made: first, as in the standard phase-object approximation, the Laplacian should be neglected; second, the quadratic factors should also be neglected with respect to the zeroth order terms, i.e., P’(z)(X2+ Y2)e
r’(z) = p (z )
(154)
and P”(z)(X2+ Y 2 )e 4P(z)
+ p’(z).
(155)
The meaning of these additional approximations can be evaluated by considering a spherical wavefront centered at z = zo in an equipotential space so that p(z) = p o . In this case
210
ClULlO POZZl
so that both conditions Eq. (154) and Eq. ( 1 5 5 ) are equivalent to
x 2+ Y 2 2(z - zo)
4 1;
(157)
i.e., they are valid insofar as the radius of curvature of the spherical wavefront is large. With these assumptions, making use of Eqs. (152) and (153), the equation for x becomes
B. Equation f o r the Sperical Aberration Coefficient Equation (158) can now be solved, provided the position is taken that
Li
x ~ Y, ,z) = a(z)exp 5p ( z ) ( x 2+ Y’)’
1
,
(159)
which reflects the fact that we are interested in the evolution of the fourth order term in the phase, linked to the spherical aberration, and that we consider negligible the amplitude variations with the radial coordinate. In fact, by inserting Eq. (159)in Eq. (158), the vanishing of the imaginary terms gives Eq. (80) for the amplitude a(z), whose solution is Eq. (92), i.e., the amplitude factor of the paraxial spherical wavefunction. The vanishing of the real term gives the differential equation for the spherical aberration coefficient p(z) as
which, with the use of Eqs. (64) and (91), and noting that
1 d
- -(p4p) p4 dx
=
p’
+ 4p-P’ P
MULTISLICE APPROACH TO LENS ANALYSIS
21 1
can be put, after some calculations, in the form
Let us consider the modifications required to take into account the presence of the magnetic field. First of all, owing to the choices in Eqs. (148) and (159) for yo and x respectively, only the A' term in the Schrodinger equation gives a nonzero contribution, with the consequence that the foregoing formulas are still valid provided the substitutions
and
are made. Equation (162) can then finally be written in the form
where L , M , and N are the Glaser coefficients (Glaser, 1952) given by
and
m
N=-
2 When ~ ( z =) 0, the foregoing calculations and approximations confirm that the paraxial spherical wavefunction found by the multislice method is the solution of the Schrodinger equation when the electromagnetic fields are expanded up to the second order; by the same token, whenp(z) # 0, the solution of the fourth order equation in the whole lens is obtained, and the correct differential equation for the spherical aberration coefficient is recovered.
212
GIULIO POZZl
ZO
ZB
ZA
Z1
FIGURE10. Propagation of a spherical wave through an aberrated lens, whose field lies between the planes zA and z B ,The disc radius in the Gaussian image plane is evaluated by the stationary phase method.
C . Comparison with the Classic Results Following Glaser (1952), let us show the correspondence between wave and geometric optical results. Let us consider, Fig. 10, the propagation of an axial spherical wave originating at the object plane z = zo in the field-free space in front of an electron lens lying between the planes z = zA and z = z ~ Let . z = z1 be the Gaussian image plane, again in field-free space. In this case the standard solution h(z), Eq. (99), whose zeros are in correspondence with the object and Gaussian image planes, should be taken in place of p(z). In the field-free space before the entry plane z = zA of the lens,
h(z) = z
-
zo,
and
p(z0)= p(zA),
(1 69)
whereas, after the exit plane z = z B , we have
The wavefunction at the entry plane of the lens is given, neglecting nonessential amplitude and phase factors, by
(171)
which can be obtained in two equivalent ways: (a) by developing the spherical wavefront opriginating at zo up to the fourth order; or (b) by considering w = w0x and integrating Eq. (165) between the planes zo and z A . Since in the field-free space only the N term is different from zero,
213
MULTISLICE APPROACH TO LENS ANALYSIS
it is found that
which gives, owing to Eq. (169), the correct factor for the fourth order term. Moreover, from Eqs. (85) and (169) it follows immediately that
The wavefunction at the exit plane
z
=
zB is given by
and p(zB) can be obtained by integrating Eq. (165) between the entry and exit planes. It ensues that
owing to Eq. (172). In order to find the wavefunction in the image plane the method followed fails because both p and p are divergent, since h(z,) = 0. Nonetheless, the correspondence with the geometric optical results can be obtained by evaluating the propagation integral by the stationary phase method. Thus, W(Xi, YI 21) = 9
where
s1’
W ( ~ BYB 2 s ) 9
is the fourth order propagator.
9
*
h(Xi YI ; X B YB)dxB dYB 9
9
I
(177)
214
GIULIO POZZI
In fact, the leading term of the wavefunction in the Gaussian image plane is different from zero provided the phase is stationary, that is, the conditions are satisfied that
and, owing to the symmetry, a similar equation, with X a n d Y interchanged. For points at the rim of the illuminated area in the image plane, i.e., XI Q X Band Y, a YE,and taking into account Eq. (175),Eq. (179) becomes
As the coordinates in the exit plane zB are limited by a circular aperture of radius R B , the illuminated area in the image plane is also a circle of radius r given by
r
=
BRA,
(181)
where B is given by
Using Eqs. (165) and (170) and recalling that the space between exit and image planes is field-free, it turns out that as for Eq. (176), this expression can be rewritten in a more general form as
B=
- ")
P(Z 1)h4(ZB)
(Lh4 + 2Mh"h2
+ Nhf4)dz,
(183)
and the identity with the classical Glaser (1952) expression is recovered when the new function
is introduced, for which
so that it finally ensues that
MULTISLICE APPROACH TO LENS ANALYSIS
215
It should be noted that the coincidence of this result with that obtained by the eikonal method (Glaser, 1952) is not surprising, because the semiclassic approach to lens theory (Glaser, 1952, 1956; Pozzi, 1985) rests essentially on similar basic assumptions. VI. CONCLUSIONS It has been shown in this work that the multislice method and the improved phase-object approximation are able to give, in the limit of the slice thickness tending to zero, the analytical solutions of the paraxial Schrodinger and Helmholtz equations in the fields of electron and light optics, respectively. The modifications with respect to the standard equations, which are used mainly for approximate numerical work, can be easily incorporated in the computer programs in optics, thus reconciling the predictions between numerical and geometric optical calculations, a discrepancy first noticed in the case of integrated optics lenses by Gribble and Arnold (1988). The modifications of the programs in the electron optical case should be slightly more complicated, since it has been demonstrated that the effect of each slice is to introduce (a) a quadratic phase shift responsible for the focusing action; and (b) a rotation of the image wavefunction at the exit with respect to the entrance plane. Both these effects are neglected, to the author’s knowledge, in the multislice programs, where the propagation of the electrons is affected only by the atomic and not by the lens field. Although this approximation is valid for thin specimens, the overlooked effects may become relevant when thicker specimens or larger fields of view are considered. For instance, the limited field of view in the z direction observable in reflection electron microscopy, is interpreted from the particle point of view as due to the spiralling of the electron in the lens field. From the wave optical point of view, this same effect can be accounted for as due to the rotation of the propagating wavefunction in the fixed coordinate system, which locally changes the Braggs diffraction conditions. Finally, the first steps have been taken to extend this heuristic approach to the treatment of the spherical aberration. The obtained results raise the hope that nonisoplanatic aberrations can also be taken into account in the same framework, where the basic idea is to use the phase-object approximation extended to spherical instead of plane waves. The cumbersome, but otherwise purely mechanical work necessary to compute the various coefficients and their evolution in the lens, could be made easier by the use of modern computer algebra methods and programs, thus focusing the attention on the more important task of analyzing the approximations involved and their validity range.
216
GlULIO POZZI
Although not his main research field, the problem of image formation and related issues has always been one of the favorites of the author, who regards with awe and admiration the variety of methods that have been and are employed in its treatment. In addition to those shortly and incompletely referred to in this paper, further methods applying Lie group theory (Sanchez-Mondragon and Wolf, 1986) or operatorial ordering techniques (Dattoli el al., 1988; 1990) have been introduced, which discuss the formal analogy between wave propagation and quantum mechanics. In an ideal scale of increasing mathematical sophistication, the present heuristic approach can be placed at the lower end. For this reason it is hoped that on one hand it may serve as introduction to this fascinating subject and on the other it may be useful for young people working in electron microscopy to allow them to encompass in a unitary framework the problem of propagation and interaction of the electrons and to develop a “wave optical-minded” attitude in the interpretation of their experimental results. ACKNOWLEDGMENTS The author thanks Dr. Annamaria Di Sebastiano for her collaboration and particularly for useful discussions which helped him to clarify several issues concerning the extension of the multislice approach to the aberration problem. The critical reading of the manuscript and the helpful comments of Drs. Rodney Herring and Giorgio Matteucci, together with Professor GianFranco Missiroli, are gratefully acknowledged. Finally, the skillful technical assistance of Mr. Stefan0 Patuelli in preparing the drawings has been highly appreciated. This research has been supported by funds from the Minister0 della Universita e Ricerca Scientifica e Tecnologica, coordinated by Consorzio Interuniversitario Nazionale per la Fisica della Materia and Gruppo Nazionale di Struttura della Materia-Consiglio Nazionale delle Ricerche. REFERENCES Aharonov, Y., and Bohm, D. (1959). Phys. Rev. 115, 485-491. Arnaud, J . A. (1976). “Beam and Fiber Optics.” Academic Press, New York. Bayh, H . (1962). Z . Phys. 169, 492-510. Born, M., and Wolf, E. (1980). “Principles of Optics.” Pergamon, New York Boyer, T. (1973). Phys. Rev. D 8 , 1679-1693. Christenson, K . K . , and Eades, J . A. (1986). Ultramicroscopy 19, 191-194. Christenson, K . K . , and Eades, J . A. (1988). Ultramicroscopy 26, 113-132. Cowley, J . M . (1981). “Diffraction Physics.” North-Holland, Amsterdam.
MULTISLICE APPROACH TO LENS ANALYSIS
217
Cowley, J. M., and Moodie, A. F. (1957a). Proc. Phys. Soc. 71, 533-545. Cowley, J . M., and Moodie, A. F. (1957b). Acta Crystallogr. 10, 609-619. Dattoli, G., Gallardo, J . C., and Torre, A. (1988). Riv. Nuovo Cimento 11, 1-79. Dattoli, G., Di Lazzaro, P., and Torre, A. (1990). I / Nuovo Cimento 105B, 165-178. Di Sebastiano, A. (1991). M.Sc. thesis. University of Bologna. Di Sebastiano, A,, and Pozzi, G. (1992). Opt. Lett. 17, 472-474. Eherenberg, W., and Siday, R. E. (1949). Proc. Phys. Soc. 62, 8-21. Farago, P. S. (1970). “Free-electron Physics.” Penguin, Baltimore. Feit, M. D., and Fleck, J . A. (1978). Appl. Opt. 17, 3990-3998. Feynmann, R. P., and Hibbs, A. R. (1965). “Quantum Mechanics and Path Integrals.” McGraw-Hill, New York. Fleck, J . A , , Morris, J . R., and Feit, M. D. (1976). Appl. fhys. 10, 129-160. Gabor, D. (1961). In “Progress in Optics” (E. Wolf, Ed.), Vol. 1, pp. 111-153. NorthHolland, Amsterdam, Glaser, W. (1952). “Grundlagen der Elektronenoptik.” Springer-Verlag. Vienna and Berlin. Glaser, W. (1956). In “Handbuch der Physik” (S. Fliigge, Ed.), Vol. 33, pp. 123-395. Springer-Verlag, Berlin.. Glaser, W., and Schiske, P. (1953). Ann. Physik 12, 240-280. Glauber, R. J. (1959). In “Lectures in Theoretical Physics” (W. E. Brittin and L. G . Dunham, Eds.), Vol 1, pp. 315-414. Interscience, New York. Goodman, P., and Moodie, A. F. (1974). Acta Crystallogr., Sect. A 30, 280-290. Goodmann, J. W. (1968). “Introduction to Fourier Optics.” McGraw-Hill, New York. Gradshteyn, I . S., and Ryzhik, 1. M. (1980). “Table of Integrals, Series and Products.” Academic Press, New York. Gribble, J . J . , and Arnold, J . M. (1988). Opt. Lett. 13, 611-613. Grivet, P. (1972). “Electron Optics.” Pergamon Press, New York. Hawkes, P. W. (1973). In “Image Processing and Computer-Aided Design in Electron Optics” (P. W. Hawkes, Ed.), pp. 1-33. Academic Press, New York and London. Hawkes, P . W., and Kasper, E. (1989, 1994). “Principles of Electron Optics.” Academic Press, New York. Kleinert, H. (1990). “Path Integrals in Quantum Mechanics, Statistics and Polymer Physics.” World Scientific, Singapore. Lenz, F. (1971). In “Electron Microscopy in Materials Science” (U. Valdre, Ed.), pp. 540-569. Academic Press, New York. Lischke, B. (1969). Phys. Rev. Lett. 22, 1366-1368. Makri, N. (1991). Compul. Phys. Commun. 63, 389-414. Matteucci, G., Missiroli, G. F., and Pozzi, G. (1982). Ultramicroscopy 10, 247-251. Matteucci, G., and Pozzi, G. (1985). Phys. Rev. Lelt. 54, 2469-2472. Matteucci, G., Medina, F. F., and Pozzi, 0. (1992). Ultramicroscopy 41, 255-268. Murray, J . D. (1974). “Asymptotic Analysis.” Oxford Univ. Press (Clarendon), London and New York. Olariu, S., and Popescu, I. I. (1985). Rev. Mod. Phys. 57, 339-436. Peshkin, M., and Tonomura, A. (1989). “The Aharonov-Bohm Effect.” Springer-Verlag, Berlin. Pozzi, G. (1985). In “Microscopia Elettronica in Trasmissione e Tecniche di Analisi di Superficie nella Scienza dei Maleriali” (P. G. Merli and M. Vittor Antisari, Ed.), parte B, pp. 57-100. ENEA, Roma. Pozzi, G. (1988). In “Physics of Metals” (E. S. Giuliano and C. Rizzuto, Eds.), pp. 344-361. World Scientific, Singapore. Pozzi, G. (1989). Ultramicroscopy 30, 417-424.
218
GIULIO POZZI
Pozzi, G. (1990). Optik (Stuttgart) 85, 15-18. Pozzi, G. (1992a). In “Electron Microscopy in Materials Science” (P. G. Merli and M. Vittor Antisari, Eds.), 183-192. World Scientific, Singapore. Pozzi, G. (1992b). In “Proceedings of the International Symposium ‘Huygens Principle 1690-1990, Theory and Applications’” (H. Blok, H. A. Ferwerda, and H. K. Kuiken, Eds.), pp. 521-526. North-Holland, Amsterdam. Reimer, L. (1984). “Transmission Electron Microscopy.” Springer-Verlag, Berlin and New York. Sanchez Mondragon, J., and Wolf, K. B. (1986). “Lie Methods In Optics.” Springer-Verlag, Berlin and New York. Schmid, H. (1984). In “Proceedings of the Eighth European Congress on Electron Microscopy” (A. Csanady, P. Rohlich, and D. Szabb, Eds.), pp. 285-286. Prograrnrn Committee of the Congress, Budapest. Septier, A. (1971). In “Electron Microscopy in Materials Science” (U. Valdrk, Ed.), pp. 14-98. Academic Press, New York. Storbeck, F. (1973). Ann. Physik 29, 63-74. Tappert, F. D. (1977). In “Wave Propagation and Underwater Acoustics” (J. B. Keller and J . S. Papadakis, Eds.), pp. 224-287. Springer-Verlag, New York and Berlin. Tonomura, A., Osakabe, N., Matsuda, T., Kawasaki, T., Endo, J . , Yano, S., and Yamada, H. (1986). Phys. Rev. Lett. 56, 792-795. Van Dyck, D. (1975). Phys. Status Solidi B 72, 321-336. Van Dyck, D. (1985). Adv. Electron. Electron Phys. 65, 295-355. Van Roey, J . , Van der Donk, J., and Lagasse, P. E. (1981). J . Opt. SOC. Am. 71, 803-810. Watanabe, K . (1993). Adv. Electron. Electron Phys. 86, 173-224. Wohlleben, D. (1971). In “Electron Microscopy in Materials Science” (U. Valdre, Ed.), pp. 712-757. Academic Press, New York. Yariv, A. (1985). “Optical Electronics.” CBS College Pub., New York. Zeitler, E., and Olsen, H. (1964). Phys. Rev. A 136, 1546-1552. Zeitler, E., and Olsen, H. (1967). Phys. Rev. 162, 1439-1447.
ADVANCES IN IMAGING AND ELECTRON PHYSICS. VOL . 93
Orientation Analysis and its Applications in Image Analysis N . KEITH TOVEY. MARK W . HOUNSLOW. and JIANMIN WANG School of Environmental Sciences. University of East Anglia. Norwich NR4 7TJ. UK
I. I1. I11 . IV .
V.
VI .
VII .
VIII . IX .
Introduction . . . . . . . . . . . . . . . . . . . . Definition of the Task . . . . . . . . . . . . . . . . . Image Acquisition . . . . . . . . . . . . . . . . . . Image Processing and Analysis of Orientation . . . . . . . . . . A . Introduction . . . . . . . . . . . . . . . . . . . B. Elementary Edge Detection Operators . . . . . . . . . . . C . Difficulties in the use of Edge Detection . . . . . . . . . . D . Presentation of Results from Orientation Analysis or Edge Detection . E . Simple Quantitative Parameters from Orientation Analysis . . . . Generalized Intensity Gradient Operators . . . . . . . . . . . A . introduction . . . . . . . . . . . . . . . . . . . B. Development of Generalized Intensity Gradient Operators . . . . . C . Nomenclature of Different Formulae for Intensity Gradient Analysis . D . Practical Considerations of Intensity Gradient Analysis . . . . . E . A Comparison of the Various Formulae . . . . . . . . . . F . Treatment of Boundaries . . . . . . . . . . . . . . . G . Use of Pixels with Rectangular Aspect Ratio . . . . . . . . . H . Resolution of Images . . . . . . . . . . . . . . . . I . Noisy Images . . . . . . . . . . . . . . . . . . . J . Statistical Analysis of Orientation Data . . . . . . . . . . . K . Extension of Intensity Gradient Analysis to Three Dimensions . . . Enhanced Orientation Analysis-Domain Segmentation . . . . . . A . Introduction . . . . . . . . . . . . . . . . . . . B. Domain Segmentation using a Modal Filter . . . . . . . . . C . Domain Segmentation using the Rayleigh Statistical Test . . . . . D . Domain Segmentation Weighted According to Vector Magnitude . . E . Choice of Radius in Domain Segmentation . . . . . . . . . F . Presentation of Domain-Segmented images . . . . . . . . . G . Some Practical Points about Domain Segmentation . . . . . . . H . Relationship between Domain Segmentation and index of Anisotropy . I . Extensions to Domain-SegmentationTechniques . . . . . . . . Applications of Orientation Analysis . . . . . . . . . . . . A . Introduction . . . . . . . . . . . . . . . . . . . B. Orientation Analysis Combined with Porosity Analysis . . . . . C . Orientation Analysis Combined with Multispectral Processing . . . Implementation and Automation of Orientation Analysis . . . . . A . Implementation of Algorithms for Orientation Analysis . . . . . B . Automation of Orientation Analysis . . . . . . . . . . . Concluding Remarks . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . .
219
. . . .
220 224 228 231 231 232 235 239 244 246 246 241 252 254 256 261 212 215 211 218 284 281 281 288 292 293 293 296 298 298 298 300 300 301 311 319 319 320 323 326
. .
. . . . . . . . . .
. . . . . . . . . .
.
.
. . . . . .
. . . . . . . . . . . . . . . .
. . . . . .
. . . . . . .
. . . .
. . . . . . . . . . . . .
.
Copyright 0 1995 by Academic Press Inc . All rights of reproduction in any form reserved
.
220
N. KEITH TOVEY ef a/.
I. INTRODUCTION
Edge detection is important in many image processing and analysis applications. In some cases the detection forms part of an image enhancement by highlighting edges. In other cases, the detection of edges is a prerequisite for subsequent processing using the Hough transform to detect lines (Swift, 1992), while edge detection information may also be used in edge-linking algorithms (Gonzalez and Wintz, 1987). Though edge detection routines have found widespread use in many applications, their use in quantitatively describing the general alignments of large numbers of features within images is much less well developed. Such applications are of importance in particulate materials, particularly where the particles are small in size and where the total number within the image may well exceed several thousand. The interest in such materials is of importance as the macroscopic properties of many of these materials are singificantly affected by the nature, size, orientation, and shape of the features seen at the microscopic scale. In some materials these features may be individual particles; in others, individual fibers, and in still others, an arrangement of touching grains. Associated with these features are often a collection of voids or pores which themselves may be isolated or interconnected. Although the development of many of the techniques described here were derived from microfabric analyses, there is no reason why many of the techniques described should not have wider application, particularly those related to orientation analysis. Many of the techniques described in this article are applicable to the study of images of materials from many different disciplines of science, and to avoid possible confusion over terminology, it is important to define the general terms microfabric and microstructure. It is generally recognized that microfabric refers to the geometric arrangement of the constituent particles, fibers, or grains and the associated voids. Some disciplines imply by the term microstructure a definition which embraces the geometric definition covered in microfabric and also covers statements relating to the nature of and/or the strength of structural bonding between the constituent parts. Other disciplines treat the terms microfabric and microstructure as being synonymous. Since the object of this current article is to discuss geometric parameters only, the term microfabric, as defined here, will be used throughout as this is accepted in most disciplines. Use of the term microstructure will be avoided. In addition to the applications of orientation analysis for microfabric studies, there is included in this chapter a discussion of many different formulations of edge detectors as the merits of these are of widespread interest. Included are discussions of enhanced orientation analysis in which edge
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
22 1
detection is but a small part of the overall analysis as well as a review of the integration of the techniques to overall microfabric studies. The influence of microfabric on macroscopic properties is well known. In some materials strength variation with direction can be related to the orientation of the constituent parts. In others it is the flow of fluid that may be of interest. Following the advent of microscopic techniques using optical microscopy, other methods of observation such as transmission electron microscopy, scanning electron microscopy, and more recently, many other imaging systems including confocal microscopy have been widely used to observe the microfabric of materials. In most cases a photographic image is recorded for subsequent interpretation although increasing use is being made of digital image acquisition. With the advent of image processing and analysis techniques, there are now several methods whereby quantification of microfabric is possible, and a review of the latest developments in this field and particularly those related to orientation analysis are the subject of this chapter. One of the more difficult types of microfabric to study is that of soils and sediments, not only because there are problems with preparation for observation but also because of the large range of particle and feature sizes which makes it necessary to use complementary image analysis techniques to study the microfabric of the material. It is for this reason that most examples will be taken from this discipline. However, before a discussion of the methods is given, it is relevant to summarize the alternative methods of quantification that have been used as a prelude to the discussion of image analysis. Most studies of microfabric carried out to date have been largely qualitative descriptions of observed features as the presence of certain types of features may be all that is required in the study in hand. In other situations, a simple form of quantification based on subjective classification is used. This may sometimes include a parametric statement relating to the abundance of or the number of features of a particular type, but this quantification is usually done subjectively by counting or judging proportions of features in a given area. A refinement of this is the use of fabric indices where several different attributes of microfabric are identified (e.g., grain size, shape, degree of weathering, etc.). Each attribute such as grain shape will vary over a range (in this case from very angular to well rounded), and a rank order number can be assigned accordingly (e.g., 1 for very angular to 10 for well rounded, or vice versa). A single attribute may be used as a fabric index and related to external macroscopic factors, but more commonly, an aggregated index taking allowance of several different attributes is constructed. In such cases care must be taken to ensure that the range for all attributes is in the same direction. Thus, for example, if weathering causes particles to become both more rounded and smaller, then the rank
222
N. KEITH TOVEY
el a/.
parameter for shape should increase with roundness, while the rank parameter should decrease as the particle size increases. The use of composite indices such as these is based on the judgement of the observer and also on the subjective decisions made on relative weighting of the different attributes. Although quantification of microfabric based on the methods described earlier are an important complement to qualitative interpretation, there are several possibilities for direct quantification either by using image analysis methods or by other direct measurement techniques, particularly those using the orientation of features. In optical microscopy the optical properties of individual grains have long been used to identify grains seen in thin section, but some of the notable early studies in quantitative orientation anlysis in the study of microfabric include the work of Lafeber (1967) and Morgenstern and Tchalenko (1976a, b). In transmission electron microscopy, quantification of domains of subparallel features was attempted by Smart (1966) and McConnachie (1974) who both used a hand-mapping technique to define the extent of each domain. In scanning microscopy, Boyde (1967) defined the true three-dimensional shape of objects using simple photogrammetric relationships. These were extended by Lane (1969) to allow more general surfaces to be observed, while Tovey (1973a) developed general photogrammetric equations which were applicable for all specimen geometries within the microscope. A comprehensive review of all the equations in their general form is given in Smart and Tovey (1982). Simultaneous with these developments were several applications of photogrammetric techniques in the study of soil microfabric and, in particular, the computation of the threedimensional orientation of all particles within a field of view (e.g., Tovey, 1973b; Tovey and Wong, 1974; and Tovey and Sokolov, 1980, 1981). One difficulty with all these quantitative studies was the requirement for much action by the operator, which inevitably leads to subjectivity. In the case of photogrammetry, it was necessary to define a minimum of three complementary points on each image in a stereo pair and on each particle within the field of view. This had to be done using a stereoscope, and even small errors in positioning could lead to significant errors in the third dimension and, consequently, the orientation of the feature so defined. A further serious difficulty was the time taken for a single analysis. Although care could be taken to ensure consistent results, the analysis was only relevant for a very small area of the sample, and it was not possible to obtain sufficient analyses to enable statistically significant results to be obtained. An alternative approach was therefore sought whereby many images could be scanned rapidly to examine gross variation, and the detailed analysis was then confined to individually selected images from within the original group. Two techniques were explored: namely, optical transform techniques and optical convolution square techniques (e.g., Tovey, 1971, 1972, 1973b;
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
223
Tovey and Wong, 1974,1978; and Tovey and Sokolov, 1981). For these early studies, transparencies of the images were processed on an optical bench to generate either the optical transform pattern or the convolution square pattern. Because both these patterns give a rapid indication of orientation, it was hoped that such analysis could provide the necessary statistical basis for the detail photogrammetric work. The distribution patterns in the transforms have an overall shape approximating to an ellipse, and the approach taken was to delineate the overall shape by hand and then to measure the ratio of the axes of the elliptical shape drawn to define an index of anisotropy. Clearly this last action involved subjectivity and was not entirely satisfactory. Early studies which involved image analysis of particulate materials were the work of Foster and Evans (197 1) and Bennet et al. (1977). Essentially both these studies involved the use of transmission electron micrographs and the thresholding of the images into solid features and voids. One aim in these studies was to examine porosity, while a second aim was to examine orientation patterns. The difficulty of thresholding images was apparent and alternative methods were sought. In particular, Unitt (1975, 1976) and Unitt and Smith (1976) developed a method for orientation analysis using intensity gradients and thereby obviated the need for thresholding. They demonstrated the technique using images from one of the present authors N. Keith Tovey. Although this technique essentially involved the modification of edge detection operators, the emphasis on orientation measurement rather than edge detection has required different approaches to image analysis. Measurement of orientation of features seen in images is of importance, but there are many consequential applications for which orientation analysis is but one step in an overall study of microfabric or related aspects. Although traditional feature analysis methods available on most image analyzers may be used to determine orientation, there are several difficulties which arise when the technique is applied to the study of small features in images. The first key problem arises from the selection of an objective threshold to separate the individual features. Even with the best techniques currently available (see Section VII,B), there are still many features which still touch each other, and these need to be separated before analysis proceeds. This stage will often introduce subjectivity into the process, an effect which should be avoided if at all possible. A second and equally significant problem is the need to store large amounts of data in the form of results (e.g., area, perimeter, orientation, etc., of each particle) from the tens of thousands of features typically present in a single image. It is partly for this reason that the techniques first proposed by Unitt (1975) for orientation analysis have been developed for orientation analysis of microfabric. On the other hand, the traditional feature analysis methods are often adequate for larger features.
224
N. KEITH TOVEY et al.
In all quantitative techniques using image analysis, there are five key states, namely: 1. Definition of the task. 2. Image acquisition. 3. Image processing, including image restoration, enhancement, edge detection, etc. 4. Image analysis, including analysis of orientation data and enhanced application of orientation analysis (e.g., domain-segmentation). 5. Interpretation of results.
The primary aim of this article is to concentrate on both image processing and analysis. However, brief comments regarding both the definition of the task (particularly in cases of complex microfabric analysis) and of image acquisition are given in separate sections at the start. The question of interpretation of the results is beyond the scope of this article; it will vary from one subject area to another and will not be discussed further even though this is often the most important justification of the methodology. The image processing and analysis sections are divided into several component parts namely: 1. A review of edge detection and orientation analysis algorithms (Section IV). This section includes the development of basic formulae and also the postprocessing of results in the form of indices of anisotropy. 2. A discussion of the development of general formulae for orientation analysis and a comprehensive test of many of the available formulae (Section V). Included in this section is the extension of analysis using rectangular pixels and also the extension of the algorithm to three dimensions. An advanced summary of methods to post process the results from orientation analysis is also given. 3. Enhanced orientation analysis, including domain segmentation and domain mapping (Section VI). 4. Applications of orientation analysis with other techniques in image analysis including porosity and multispectral methods (Section VII).
11. DEFINITION OF
THE
TASK
Orientation analysis of images may be important in its own right, but of equal importance is the application of orientation analysis as a tool in a wider set of methods in the processing of images. The definition of an image analysis task may thus be far wider than a basic task to analyze orientation alone.
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
225
FIGURE1. Examples of images of microfabric having a large range of particle sizes. (a) Blacktoft series sample-there appears t o be little evidence of orientation; (b) Beccles series sample-there appears t o be alignment in a direction running from top left to bottom right.
226
N. KEITH TOVEY et al.
In images where the range of feature size varies considerably, it is usually desirable to separate the image into two parts, one containing the larger features and the other the finer features. This may be achieved using the mineral segmentation techniques described in Section VII,C. Typical examples of the need for this type of approach include the detailed study of the microfabric of natural soils. These materials often contain particles with a range of sizes stretching over several orders of magnitude. The larger particles of silt and sand size are usually less elongate than the finer particles. During deposition there is often evidence that flow of finer material occurs around the larger grains and a knowledge of the extent of this is important in understanding how the material will deform in the future. Two illustrations of this type of feature are shown in Fig. 1. In Fig. la, there are many larger grains in a matrix of finer material. There is no clear evidence of orientation within the matrix in this image. In Fig. 1b there appears to be a dominant orientation direction running from top left to bottom right. While this is apparent, a feature which only becomes noticeable after orientation analysis and related processing (see Fig. 41 and Section VII,C) is the relative flow of fine-grained material around the coarser grains in this image. This flow is absent in the material shown in Fig. la.
FIGURE2a. Examples of images show microfabric of soils. (a) Marine clay from Hong Kong.
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
227
A second example is shown in Fig. 2a. There is a large ball-like feature near the center and there was speculation as to whether the orientation of the material within the feature was different from that outside it. It is far from clear from a descriptive interpretation of the image that this was the case. In all the examples described earlier, it was usually necessary to separate the larger grains from the fine-grained matrix so that different analytical techniques could be applied separately to the two groups of features. Such a separation was achieved using the complementary multispectral methods described in Section VII,C. In a different application, it was the overall orientation of the clay particles seen in Fig. 2b which were of interest. Such particles may sometimes respond t o external stressing as individuals, while in other cases, the deformation is controlled by the relative movement of groups of subparallel particles, called domains, which act as integral units during deformation. In this particular example it is clear that there is a consistent orientation at a microscopic scale over the whole image, but this situation is often not the case (see Figs. Sa, Sb, and 8c). Of the applications of orientation analysis mentioned previously, the last one involving the consolidated sample of kaolin (e.g., Fig. 2) includes only orientation analysis and will be considered first. A flow diagram of the steps
FIGURE2b. Consolidated kaolin.
228
N. KEITH TOVEY e t a / .
1)
I
Image Acquisition
2)
Intensity Gradient Analysis
3)
Draw Rosette Diagram
6)
Threshold one Domain Class
7)
8)
I
I
I
Evaluate Domain Areas
I
Domain Mapping
I Next Domain Class
I
FIGURE 3 . Flow chart for intensity gradient analysis followed by domain segmentation. This represents the procedure on a single image. Batch processing may be achieved by repeating the stages on new images.
needed is shown in Fig. 3. It begins with image acquisition (Step l), followed by raw orientation analysis using algorithms based on intensity gradients (Step 2). Images suitable for such analysis are those containing large numbers of fine particle or fibers. From the basic analysis it is possible to compute several parameters relating to overall orientation trends (steps 3 and 4) and also to use the resulting images as the starting point for domain segmentation and related topics (Steps 6-8). In later sections a discussion of the use of orientation analysis as part of an overall package will be presented.
111. IMAGEACQUISITION For any successful image analysis, an adequate digital image must be acquired. Images are digitized into pixels and in a typical arrangement there will be an array of 512 by 512 pixels. However, this is by no means universal and rectangular images are not uncommon (e.g., 768 x 512). Newer systems tend to have higher pixel resolutions of 1024 x 1024 pixels or even
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
229
higher. Some image processing and analysis systems have pixels which have a unity aspect ratio (i.e., the pixels are square), but there are also many systems which have pixels which have a rectangular aspect ratio or even a hexagonal pixel array. Where possible, systems having arrays of rectangular pixels should be avoided even when they are correctly displayed using framestores, which also have rectangular display formats. Many analytical techniques involving orientation analysis can be particularly sensitive to the use of rectangular pixels and will give erroneous results unless corrections are applied (see Section V,G). Even then the results may give values which differ significantly from corresponding analyses using square pixels. This problem will manifest itself if the effective resolution in the two directions is very different. The intensity distribution within an image is usually represented on a gray scale ranging from 0 (as black) to 255 (as white) although some systems allow a full 16-bit resolution with a gray scale ranging from 0 to 65,535 (or -32,768 to +32,767). It is always possible to acquire a digital image using a TV camera attached to a suitable frame store to grab images of photographs, but this method should be used as a last resort. In many situations, images can be obtained directly by a TV camera (e.g., one attached to an optical microscope), thereby bypassing the inevitable loss in dynamic range arising from the intermediate photographic stage. There is a further advantage in that optical distortion in both the enlarging and digitizing stages are avoided. In satellite imagery and electron microscopy, digital images can usually be recorded directly. In the authors’ laboratory, digital images are acquired from two scanning electron microscopes, from optical microscopes, directly from weather satellites, from video tapes, and also from a TV camera. All facilities are connected to a local peer-to-peer network. Image processing and analysis is done on extended versions of SEMPER (the image processing and analysis software available from Synoptics (271, Cambridge Science Park, Cambridge, England). This software has been extended extensively to incorporate the algorithms described later in this paper. Currently, five P C computers run the software as does a SUN workstation. On the PC computers, a variety of hardware configurations are available including the SYNAPSE, SYNERGY, and SPRYNT image grabbing cards, while one computer can run both the VGA and Windows versions of SEMPER. The simplified schematic of the arrangement in use in the authors’ laboratory is shown in Fig. 4. Two scanning electron microscopes are connected to the network of PC-based computers and also to the SUN workstation. Recently, one of these microscopes has been adapted so that operation is automatic and under the control of the image-processing system (see Section VII1,B).
230
N. KEITH TOVEY el a/.
FIGURE 4 . Simplified schematic diagram of network used for image analysis in the authors’ laboratory. All P C computers can connect to each other and share each others resources. There are several other computers attached to the network used for microanalytical purposes, some of which can run the Windows version of SEMPER. A, 486 computers with Synoptics SYNAPSE card; B, 286 computer with Synptics SYNERGY card and facilities for direct control of SEM; C, SUN work-station with 3 Gbyte hard disk storage and exobyte archiving facilities; D, 286 computer acting as print server and CD-ROM reader; E, 286 computer for communication with LINK Analytical computer and SEM-also server for optical disk; F, 286 gateway computer linking P C network and SUN workstation and outside world; G, Locally built hardware interface to control S800 SEM; H, Camera and switching facility t o select image sources; J, 486 computer with CD-WRITING facilities; K, 486 computer with Synoptics SPRYNT card.
It is important to ensure that the conditions of image acquisition are adequate for the task in hand as, although some subsequent processing may be possible, there will inevitably be some loss of information. For analytical work using the traditional feature analysis packages to detect orientation and other characteristics, it is usually necessary to preprocess the image by segmentation by selecting an appropriate threshold to distinguish between one phase and another. This segmentation can often be assisted by ensuring that most of the brightest and darkest particles are at their respective saturation levels at the time of image capture. Although ideal for analytical purposes, such images are usually less than satisfactory for display and interpretation purposes. Equally, for orientation analysis using the intensity gradient method, saturation should be avoided whenever possible as such
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
23 1
areas will be devoid of information after analysis. It is thus important to clearly define the task under consideration and to set the image acquisition parameters with the particular task in hand. When more than one type of analysis is required there must be an inevitable compromise. For microfabric analysis of images from the scanning electron microscope, both secondary electron and back-scattered electron images may be used. The latter type of images is often preferred as prepared planes in precise orientations may be cut through the sample. This last aspect is particularly important for orientation analysis. For the samples of soils and sediments frequently used by the authors and shown in many of the photographs in this paper, sample preparation proceeded as follows: The wet samples were successively replaced with acetone and then Araldite AY 18 with hardener HZ18 (see Smart and Tovey, 1982 for details of preparation). When hard the samples were then cut in the desired direction and polished before being coated with 20nm of carbon. To ensure adequate contrast without saturation, a standard sample consisting of large quartz particles embedded in the same resin was used as a calibration. When centered on a quartz particle, the intensity was adjusted to a level of 230 on the gray scale, while on the resin it was adjusted to be about 20. This ensured a full dynamic range within the image, allowing for a few brighter iron-rich particles and also for dark organic matter in the sample. In any set of observations, under the same operating conditions of contrast and brightness, an image is also captured of a region of the specimen containing only resin. This image provides information for determining the signal-tonoise ratio for possible later use in image restoration (see Section VII,B,l).
IV. IMAGEPROCESSING AND ANALYSIS OF ORIENTATION A . Introduction
Much interest has been shown in the detection of edges in images and much has been written as to the meaning of the word “edge” in a digital image (e.g., Haralick, 1984). An edge is often used to distinguish between regions which lie within a bounded area and are brighter (or darker) than those in the surrounding region. Haralick called these step edges. On the other hand, edges can be defined where the brightness changes gradually, increasing up to a certain point and then decreasing thereafter. Haralick called these edges roof edges. Essentially, it is the derivative of the change in intensity, or the intensity gradient, which is of importance in defining these edges. As with normal image processing, a particular threshold may be set so that only
232
N. KEITH TOVEY et at.
intensity gradients which exceed a specified threshold value are considered. This can lead to the delineation of the sharpest edges, which may then become fragmented into segments even when the edges themselves are continuous in the real object. Various techniques such as the use of the Hough transform (1962) can then be used to join these segments if required. Although there is extensive literature describing edge detection, the use of related operators as a tool in orientation analysis is much less developed. Nevertheless, the extension to orientation analysis is becoming a powerful tool in image analysis which has widespread application. Before a generalized discussion of orientation analysis and related edge detection operators is presented, the next sections will cover a basic introduction to some of the more simple operators and some of the attendant problems with edge detection in general.
B. Elementary Edge Detection Operators In its most basic form, a simple edge detector kernel may be specified by examining the change in intensity in both the X-direction and the Ydirection. Since the intensity will change by the greatest amount in a direction orthogonal to the orientation of any edge or line feature, all that is required is to determine the gradient of the intensity change in the two orthogonal directions on the image. Let these gradients be AZ/Ax and AZ/Ay, respectively. The direction (d) of the greatest change in intensity is given by
Although this equation gives the direction of the maximum intensity gradient vector, it is often desirable t o specify as an alternative the direction of the feature at that point which will be orthogonal to the above direction. The magnitude of the intensity gradient vector is also of importance and this may be specified as
' ,J
v=
+
(g)2.
All edge detectors rely on the ability to identify the changes in intensity, but it is frequently the magnitude of the intensity gradient vector ( V ) which is used in analysis; a particular threshold for V may be set and the edge defined if this value is exceeded. While the magnitude of the intensity gradient vector is of importance in edge detection, it is the estimation of B that is of interest here. Many edge operators have been proposed. These
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
233
range from the relatively simple ones involving no more than three pixels to well-established operators such as the Roberts (1965), Sobel (see Duda and Hart, 1972), Prewitt (1970), Isotropic (see Jain, 1989), and Unitt (1975) operators, which are based on 4-8 pixels. More involved operators such as those of Haralick and his co-workers (e.g., Haralick, 1984; Zuniga and Haralick, 1987) and those of Smart and Tovey (1988), Tovey et al. (1989, 1992a,b) involve up to 24 or more pixels. A 5 x 5 array of pixels is shown in Fig. 5. These are numbered according to a specific sequence which enables efficient encoding of the several algorithms for edge detection. The most basic edge operator may be computed from a knowledge of the intensity at pixels 0, 1, and 2. Thus for an image with square pixels,
AI Ax
-
I,
-
I,
h
and
where I , , I , , and I2 are the intensities at points 0, 1, and 2, respectively, and h is the spacing between the pixels. These formulations are asymmetric about the central pixel 0 and may introduce some bias. For reasons which will become apparent later, this formula type is known as the 2,2 formula. The first figure in this notation refers to the number of points (excluding the reference pixel) which are used, while the second figure is the minimum number of pixels needed to make the analysis determinate. In this case, since both numbers are equal, the solution is just determinate. Unitt (1975, 1976) and Unitt and Smith (1976) proposed a symmetric operator using the four pixels 1-4. In this case,
AI Ax
-
I, - I, 2h
and
AI I2 - 14 ---
Ay
2h
'
FIGURE5 . Pixel numbering system for intensity gradient analysis. The pixels are numbered in a symmetrical fashion and are also shaded according t o their distance from the central pixel. These pixel numbers are used to denote row numbers in Table 6 .
234
N. KEITH TOVEY et al.
An alternative way to present the formulae is in the form of processing kernels which are passed across the image operator sequential on every pixel. These formulae (Eq. (3c)), although derived initially by Unitt, are now known as the 4’2 formulae. Since the number of points used is greater than the number required for solution, there is a degree of filtering present which improves the precision over the simple formula. For Eq. (3c) the processing kernels are indicated in Table I. Most of the coefficients of the kernels are 0 but have been included here for completeness with more complex kernels. Unitt (1976) recognized that this formulation was susceptible t o noise and proposed an extended analysis using the expansion of Taylor’s theorem. Assuming the intensity is a function of the position x and the intensity at point 0 (Io),the intensity I , at pixel 1 is given by I,
= I0
d I h2 d21 h3 d31 + h -dx + -+ -2 ! dx2 3! dx3
(4)
+ . * a
Similarly, at point 9 the intensity becomes dI dx
I, = I o + 2h-
d21 d31 + 4 h2 -+ 8 -h3 2! dx2 3! dx3
+ . . a
Similar equations may be written for points 3 and 11 by substituting - h for h in the above equations. Then, subtracting the pairs of equations for points 1 and 3 and points 9 and 11 gives
I , - I3
=
dl dx
d31 + 2 -h3 +..3! dx3
dI dx
d31 + 16--h3 + 3 ! dx3
2h-
and
I, - I , , = 4 h -
TABLE I PROCESSING KERNELS FOR UNITT’S BASIC FORMULA
(NOW KNOWN AS
AI/Ax
0 0
0 0
0
-0.5
0 0
0 0
AI/Ay
0 0 0
0
0
0
4,2 FORMULA)
0 0
0
0 0.5
0
0 0
0
0 0
0
0
0
0 0 0 0 0
0 0.5
0 0
0 0
0
0 0
0
0
0
-0.5 0
0
23 5
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
The equations contain no second or fourth order derivatives, and if the fifth and higher odd numbered derivatives are neglected, then the third order derivatives may be eliminated from the above two equations to give
dl dx
1 12h
- = -(8(Z,
-
13)
-
(19
-
(7)
111)).
A similar expression may be obtained for the intensity gradient in the Ydirection using points 2, 4, 10, and^ 12. Unitt (1976) estimated that using this later formula involving the nine points around the central pixel in the form of a cross improved the accuracy over the five-point method by a factor of 9. These formulae are known as the 12,9 formulae and are accurate to the fourth order. Theoretically 9 pixels are needed for a complete solution to the fourth order and there are 12 available in this formula, but of these, four coefficients are in fact zero and it would seem that a fourth order solution is possible using just 8 pixels in a cruciform shape. The corresponding kernels are given in Table 11. There are several other well known operators in common use: e.g. Roberts, Sobel, Prewitt, Isotropic, and the relevant kernels for these are shown in Table 111. A point to note with the Sobel, Prewitt, and Isotropic operators is that the sign of the coefficients is consistent on both sides of the axis of symmetry. In the case of the extended Unitt method (12,9) described earlier, although the operator is symmetric, the sign of the coefficients does vary on each side of the axis of symmetry. The Roberts operator is asymmetric.
C. Difficulties in the use of Edge Detection The selection of edge operators has often been a matter of preference, but some are clearly better than others. There are three potential problems with the use of any of these operators: TABLE 11 PROCESSING KERNELS FOR UNITT’S EXTENDED FORMULA
(NOW KNOWN AS
AI/Ax 0
0
0 0
0.083 -0.667 0
0
0
0
0 0 0 0 0
12,9 FORMULA)
AI/Ay
0 0 0 0 0.667 -0.083 0 0 0 0
0 0 0 0 0
0 0 0
0 0
-0.083 0.667 0 -0.667 0.083
0 0 0 0 0
0 0 0
0 0
236
N. KEITH TOVEY e t a / .
TABLE 111 WELLK N O W N
EDGE-DETECTION OPERATORS
Roberts Operator AI/Ax 0 0 0 0
0 0 0
0
0
0
AI/Ay
0 0
0 0
0 0
1
0 0 0
0 -1 0
0
0
0
0
0
0
0
0 0 0 0 0
0
0
0
0
0 -1 0
0
0 1 0 -1 0
0 0 0 0 0
0
0
1 0 -1 0
0 0 0 0
0 0 1.414 1 0 0 -1.414 - 1 0 0
0 0 0 0 0
0 0 1 0 0
0
0
Sobel Operator AI/Ax 0
AI/Ay
0 -1
0
0
0
0
0
0
0
1
0
0
-2
0
-1
0 0
2 1
0 0 0
0
0
0
1 0 -1 0
0
0
0 0 0
0 2 0 -2
0
Prewitt Operator AI/Ax
0 0 0 0
0
0 -1 -1 -1 0
AI/Ay
0
0
0
I
0
1 1 0
0 0
0 0 0 0 0
0
0
0 0 0
0 1
0 1
0 -1 0
0 -1 0
Isotropic Operator AI/Ax 0 0 0 0 0
0 -1 -1.414 -1 0
AI/Ay
0
0
0
1
0 0
1.414 1 0
0
0 0 0 0 0
0 0
0
0
0 -1 0
0 0
I
(i) When the intensity gradient in both the X - and Y-directions is zero, as it will be in regions of uniform contrast, it is impossible to define a direction as both the numerator and denominator in Eq. (1) are zero. For applications where the delineation of edges or the identification of individual lines are the subject of study, this rarely presents a problem as such ambiguities arise in regions of uniform contrast which are of no interest. On the other hand, these uniform areas may present problems in subsequent analysis during
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
237
microfabric studies. However, in nearly 10,000 images analyzed by the authors, this problem has rarely affected more than 0.1070 of an image. The solution is to treat such areas separately and encode them with an out-of-range value during analysis. (ii) When the signal-to-noise ratio is low, the orientation data obtained will be significantly affected by the random nature of the noise, and in some applications this may affect the overall analysis. Some of the formulae discussed later are better than others for dealing with noisy images. (iii) When the magnitude of the intensity gradient vector is low, only a few angles can be obtained, and this selectivity may introduce some bias in the analysis. This was first recognized by Unitt (1975, 1976), who noted that regions of low contrast could present certain directions t o be selected in preference t o others. Indeed, it was to overcome these problems that he developed the extended 12,9 analysis. Tovey (1980) applied both the 4,2 and the 12,9 formulae to the study of soil microfabric. This was later extended by Tovey and Sokolov (1980, 1981), Smart and Tovey (1982), and Tovey and Smart (1986). Tovey (1980) showed that the problem of certain angles being selected in preference to others still existed even with the extended 12,9 analysis. Part of the problem lay with the limited range of values of intensity that each pixel can have in byteencoded images where the range is normally between 0 and 255 for the whole image. The actual range, however, between adjacent pixels is normally less than 100. So for an angle to be specified to 1" the denominator in Eq. (1) must be 57 times that of the numerator, and even with random data, the chances of obtaining 2" are twice as great as 1". If a region of an image has low contrast and the gray-scale intensity varies only between 0 and 2 for the 4 pixels used in the 4,2 formula, then the only possible values for AZ/Ax and A I / A y in the first quadrant are 0,
0.5/h,
and
l/h,
and the only possible angles which can be defined are tan-' (0),
tan-' (0.5),
tan-' (l),
tan-' ( 2 ) ,
and
tan-'(oo);
i.e., a maximum of five angles in any one quadrant. If intensity values vary between 0 and 3 , the number of possible angles increases to 7 , while for a range of 0-4, the number becomes 9. T o overcome the difficulty of selectivity, Tovey (1980) recommended discarding orientation information where the magnitude was less than 2 / h . Subsequently, Tovey and Smart noted that with the 12,9 method (termed the 9 s method by them) and a magnitude of less than 2 / h , apart from 0" itself, it
238
N. KEITH TOVEY et a/.
was not possible to define another angle less than 2". Indeed, with this threshhold magnitude the first angle possible after 0" is 2.4". This angle may be achieved if the intensity difference between points 2 and 4 is 3 and that between 9 and 1 1 is 1 . The magnitude of the associated vector is then 1.52/h. In microfabric analysis, it is important in orientation analysis to ensure that angles are assigned to an appropriate orientation class. Since no angle can be generated between 0 and 2.4", it was recommended that a class width of 5" (i.e., double the minimum angle computed) would be consistent with a rejection of angles computed from vectors having a magnitude less than 2/h. For an ideal situation, the frequency at which any 5" class width is encountered with a random set of intensity values should be equal. There are 36 possible classes (0-180°), and for vectors with magnitudes less than 2/h, the frequency at which any angular class is selected should be the same for all classes i.e., 2.8% ( = 100/36). With the 12,9 algorithm, this percentage varied between 2.4% and 2.8% for all angles except 0", 45" and 90", where the percentages were 4.0%, 3.3% and 4.0%, respectively. For the 4,2 algorithm the frequency of occurrence of a particular class was as high as 16.7% for 0" and go", but zero for many angular classes (see Table IV). TABLE IV RESULTSFROM
COMPUTER SIMULATION TO INCLUDE ONLY INTENSITY GRADIENT
VECTORS WITH A MAGNITUDE IN THE RANGE
2/h
(AFTER
TOVEY AND SMART, 1986)
Percentage of times angle fell in class Midpoint of class (Degrees)
4,2 method
12,9 method
0
16.7 0 0 0 4.2 4.2 0 4.2 0 8.3 0 4.2 0 4.2 4.2 0 0 0 16.7
4.0 2.6 2.4 2.1 2.8 2.8 2.7 2.1 2.6 3.3 2.6 2.7 2.7 2.8 2.8 2.1 2.4 2.6 4.0
5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
239
Clearly the 12,9 formula is much better in delineating direction, but this analysis emphasized the need to ensure that histogram class widths were consistent with the threshold magnitude of the intensity gradient vector. It also indicates that images of low contrast are unlikely to give good estimates of orientation. Two further formulae were also considered by Tovey and Smart (1986). One involved the use of all points 1-8 and was generated by using the original Unitt formula (4,2) for points 1-4 and then by rotating the axes through 45" to obtain a second estimate of orientation. The two results were then combined. Although this gave an improvement over the 4 , 2 method, it had little advantage over the 12,9. The final formula involved the combination of the 12,9 formula with the equivalent one rotated through 45" to include points 5-8 and 21-24. This was originally known as the 17point method and it had some benefits over the 12,9 method but this has now been superseded by more general formulae described in Section V.
D. Presentation of Results from Orientation Analysis or Edge Detection The method for displaying results from orientation analysis is largely dependent on the nature of the task involved. Where edge detection is important, a new image with pixels encoded to the magnitude of the intensity gradient vector is displayed. Using a given threshold, magnitudes less than a given value may be excluded so that only the most well-defined edges are displayed. Throughout this article the term magnitude image will be used to describe such an image. For microfabric studies it is the orientation at each pixel that is important, and it is convenient to generate an angles-coded image, where the intensity value at each pixel relates to the orientation as computed from the direction of the intensity gradient vector. In some applications, the vector will have a value in the range 0-360", but in most applications, and particularly most microfabric applications, the computed directions only fall into the region 0-180" since features such as a line can point in either of two directions. This makes it possible to store the angles-coded image in byte form with orientation specified to the nearest degree (0-179 in intensity values representing 0-179"). In practical implementation of the algorithms, a pixel value of 255 is used to code those few areas where the intensity gradient vector is indeterminate and also code those areas where the magnitude falls below a given theshold value (a default threshold cutoff value of 2/h has been used in much of the authors' work). An example of intensity gradient orientation analysis is shown in Fig. 6. Fig. 6a shows a simulated image consisting of a single particle with the
240
N. KEITH TOVEY et al.
FIGURE6. Simulation of images to illustrate orientation analysis using intensity gradients. (a) Single particle; (b) Angles-coded image; (c) Collection of identical, parallel, but randomly spaced particles; (d) Angles-coded image; (e) Magnitude image; (9 Rosette diagram.
associated angles-coded image shown in Fig. 6b. In the latter image two shades of gray are present: one for the vertical sides, the other for the horizontal ones. The intensity correctly depicts the differing orientations of the two sides. In early microfabric studies (Tovey, 1980), it was conventional to evaluate the angular direction in the angles-coded image in terms
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
24 1
of the vector normal to the actual feature (i.e., the vector defining the greatest change in intensity). This was done to conform to an earlier practice adopted from three-dimensional photogrammetric analysis where plateshaped particles may be uniquely defined by their vector normal. However, current wisdom is to display the angular values as the actual orientation of the feature (i.e., orthogonal to previous convention) as this makes interpretation easier in two-dimensional images. In Fig. 6c an image consists of a series of randomly spaced but parallel particles, all of which have the same shape as the one in Fig. 6a. The associated angles-coded image is shown in Fig. 6d, which replicates the image in Fig. 6b. The magnitude image (or edge-detection image) is displayed in Fig. 6e. All boundaries have been correctly defined in this simple image. For microfabric analysis it is often helpful to generate a histogram of orientation values to assess whether there is any preferred orientation. However, it is usually more helpful to display the results as a radial histogram or rosette diagram where the direction is specified by the orientation of the vector, and the length of the radial line is proportional to the frequency. In Fig. 6f there are just two “spikes” corresponding to the directions of the sides of the particles. If all particles are perfectly parallel (even if they are randomly spaced and have different length-to-breadth ratios), then the shape of the rosette diagram will indicate not only the direction of parallelism but also the mean shape of the particles. The mean length-to-breadth ratio would be obtained from the ratio of the spikes in the diagram. In Fig. 7a there is a collection of similarly sized particles, but this time they are not only randomly spaced but also randomly orientated. The angles-coded images show a collection of lines in varying shades of gray. Each line color is associated with its orientation, while all lines of similar orientation have the same color. The rosette diagram in this example is shown in Fig. 7d. Here the two dominant directions can be seen clearly. The generation of an angles-coded image is also an important intermediate step in definition of lines (see Swift, 1992), and as the starting point for domain-segmentation (see Section VI). Three examples of images of real sediments which contain collections of elongate particles are shown in Figs. 8a, Sb, and 8c. These are all samples of clay particles which have been consolidated and then impregnated and polished before observation in the back-scattered mode in the scanning electron microscope. The particles generally have a higher atomic number than the embedding medium and appear relatively light. Some particles richer in iron are much brighter than others. The spacing of features in an image, when analyzed for orientation using the intensity gradient method,
242
N. KEITH TOVEY e t a / .
FIGURE 7. Simulation of randomly orientated particles of similar size and shape. (a) Original simulation; (b) Angles-coded image-regions of similar orientation have the same gray scale value; (c) Magnitude image; (d) Rosette diagram.
affects the choice of kernel for optimum analysis (Section V,E). It is thus important to spend some time checking typical line and/or edge spacings in the image. Figure 9 shows a typical line traverse across part of Fig. 8c. There seems to be a spacing of features in this image which varies from about 4 pixels up to about 10-12 pixels. The angles-coded image of Fig. 8a is shown in Fig. 8d. Generally it is difficult to appreciate the range of orientation with the monochrome presentation shown here. A false color look-up table greatly assists in interpretation, but nevertheless, with the monochrome representation here, regions of similar shades of gray can be discerned and these highlight particular features in the original image. The associated rosette diagrams from the image in Fig. 2b is shown in Fig. 10a, while those from Figs. 8a, 8b, and 8c are displayed as Figs. lob, lOc, and 10d, respectively. Each rosette diagram is constructed from approximately 250,000 separate measurements of orientation within the respective images, and in microfabric work the
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
243
FIGURE8. Examples of images of consolidation kaolin. Figures (a)-(c) are from different regions on the same sample. Unlike Fig. 2b there is n o obvious evidence of overall orientation, but analysis does indicate that Fig. 8c has a preferred orientation direction which is nearly vertical (see Fig. 10d). Figure 8d is the angles-coded image from Fig. 8a.
rosette diagrams are frequently circular or elliptical in shape. Displayed in this manner it becomes a simple matter t o observe the direction, if any, of preferred orientation. In Fig. 2b, the preferred orientation direction is clearly seen in the original micrograph to be inclined slightly to the horizontal. This direction is also evident in the rosette diagram (Fig. 10a). In Fig. 8c there are localized regions which are highly orientated, but it is far from clear if there is any overall orientation within the image as a whole. The rosette diagram (Fig. 10d) shows that there is indeed noticeable orientation, and that the preferred orientation direction is at approximately 165" t o the upward vertical. Despite a similarity in texture in the microfabric shown in Figs. 8a and 8c, it is clear that the overall orientation is very different. In Fig. 10b the rosette diagram is nearly circular, indicating that the features in the image (Fig. 8a) exhibit few signs of overall orientation.
244
N. KEITH TOVEY ef a/.
260
Qrey Level
0
6
10
16
20
26
SO
36
40
46
LO
66
Number of Pixels FIGURE9. Typical wavelength trace across an image. Spacings between maxima typically range from 3 to 12 pixels in this type of image.
FIGURE10. Rosette diagrams. (a) Orientation of features in Fig. 2b; (b)-(d) Orientation of features in Figs. 8a-8c, respectively. The indices and preferred orientation directions are (a) 0.525 at 167"; (b) 0.093 at 164"; (c) 0.059 at 47"; and (d) 0.313 at 165.2".
E. Simple Quantitative Parameters from Orientation Analysis In fabric orientation analysis, it is often desirable to obtain a simple parameter which may be related to external physical factors such as stressing, etc. The form of the rosette diagrams gives a clue as to how to
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
245
proceed. The majority of rosette diagrams of particulate materials approximate to an ellipse in overall shape. (There are exceptions; see for instance Tovey and Krinsley, 1990, and also Fig. 12). Using a least squares analysis, one may compute the best fitting ellipse to the rosette diagram to give the lengths of the major and minor principal axes and also the direction of the major axis. In the earlier work of Tovey (1980), an index of anisotropy was developed which was merely the ratio of the major to the minor principal axes of the fitted ellipse. For a completely random fabric, the index would be unity, but for a completely orientated fabric the index would become large (+ 1). Since most of the particulate materials studied by the authors were mostly platy in shape with an aspect ratio of 1 : 10, from the previous discussion, it seemed unlikely that the index in this form would exceed about 10. Indeed, the maximum value rarely exceeded 2.5 on this scale. The direction of preferred orientation is readily derived from the direction of the major axis. Thus the analysis of the whole image reduces to two parameters-a direction and an index of anisotropy The initial choice for defining the index of anisotropy (I,) arose from its simplicity, but other formulations are possible, such as
b a
(ii)
I,
(iii)
I & =a- -- b - I - - b a a I,
= -
a-b a+b'
= -
(where a is the major axis and b is the minor axis). Formula (i) is the original definition. Despite its simplicity it suffers from the fact that the scale is essentially open ended (i.e., theoretically the value could reach 00). Formula (ii) has merits in that it ranges from zero to unity, but has the disadvantage that as the degree of orientation increases, the index decreases in value from unity for a randomly oriented set of features to zero for a perfectly aligned set. Formula (iii) overcomes the disadvantages of both formulae (i) and (ii) and rises from zero for a random fabric to unity for a perfectly aligned one. This formula is now the preferred one. In typical samples of microfabric, the index rarely goes below about 0.05, and the highest recorded for a wellaligned sample was approximately 0.8. Formula (iv) would seem to have little immediate benefit in image analysis although it closely follows an
246
N. KEITH TOVEY et al.
index used in optical microscopy. While the index of anisotropy computed using this formula also rises from zero for a random sample to unity for a perfectly aligned one, apart from at the extremes, the index is always less than that provided by formula (iii). Indeed for an index value of 0.8 measured with formula (iii), this translates to a value of 0.667 for formula (iv). Thus the dynamic range of possible values achievable in reality is significantly less with formula (iv) than formula (iii). In Fig. 10, the indices specified were all computed using formula (iii).
V. GENERALIZED INTENSITY GRADIENT OPERATORS A . Introduction
The intensity gradient operators based on the nearest 2-8 pixels described in Section IV,B form the basis of analysis. However, there is enough scope to extend the size of the kernel to include additional, if not all, points in the 5 x 5 or an even larger array. There are merits in this extension as there is the possiblity of including filtering effects to overcome problems of noise. On the other hand, the extension to larger arrays does prevent the orientation information from fine detail being seen unless higher order solutions are adopted. Zuniga and Haralick (1987) adopted an approach by evaluating the coefficients of the polynomial fitting the range of intensities found within the area covered by the kernel. They proposed the following operator t o cover a 5 x 5 array which they claim had advantages over other operators including an extension of the Sobel operator. Their operator in the X-direction is defined by the coefficients shown in Table V with the corresponding coefficients in the Y-direction obtained by rotating the matrix counter-clockwise through 90". A different approach was adopted by Smart and Tovey (1988) and provided a method which could be readily extended to any precision and to TABLE V COEFFICIENTS OF ZUNICAAND HARALICK (1987) AI/Ax 8 -10 -16 - 10 8
-16 -25 -28 -25
0 0 0
- 16
0
0
16 25 28 25 16
-8 10 16 10 -8
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
247
any size of kernel, although for most purposes of orientation analysis of microfabric, a 5 x 5 kernel size is ideal. Their method is based on a n extension of the expansion of Taylor’s theorem shown in Section IV,B into two dimensions.
B. Development of Generalized Intensity Gradient Operators The expansions of Taylor’s theorem with specific reference to points 1 and 9 were given in Eqs (4) and (5). A more general relationship for the intensity (Ih) at any point which is at a distance h from the central pixel may be found by using the differential operator
D = -d dx
(9)
so that the expansion becomes h2D2 h3D3 I.e.,
In two dimensions, the intensity Zat the point h of an image is a function of both x and y . The intensity at this point Zh will thus depend on the distance h, in the X-direction from the central pixel Po and the corresponding distance hy in the Y-direction. Following the convention adopted in Eq. (9), it is convenient to write
a
D,
= -
Dy
= -
ax
a
aY
and the extension of equation 10 becomes I,,
=
ehxDxe”yDy10
248
N. KEITH TOVEY et at.
Equation (12) may be expanded in full as Zh
=
+ h,D, + hyDy + ih:Dz + h,hyD,Dy + $h:D: + ihlD: + 3h: hyD:Dy + +h, h:D,G + ih:D: + &h:@ + i h l hyD:Dy + ah: h i e 0,’+ i h , h: DxD: + &h; 0; +&h: D: + &h: hy@ Dy + Ah: hy’Dl + hh;h:DiD: + &h, h; Dxl$ +‘h5 Y DsY + ... + higher order terms)Zo (13)
(1
120
A separate equation such as Eq. (13) may be written for each pixel in the near vicinity of Po. All that is needed is to substitute the relevant distances h, and hy . In Eq. (13) there are 20 terms involving the partial differential coefficients: these are of the form k @I$Zo (where k is a constant). There are 2 coefficients of the first order D, and D y r 3 of the second order, 4 of the third, 5 of the fourth, and 6 of the fifth order. Although only the first order terms are needed for intensity gradient analysis, all the other terms must be eliminated and thus the minimum number of equations for a first order solution is two. For second, third, fourth, and fifth orders, the minimum numbers of equations required are 5 , 9, 14, and 20, respectively. There seems to be little point in finding just one or two coefficients of a particular order, and so either all of a particular order are included or all are disregarded. Within the 5 x 5 pixel array, there are 24 points surrounding the central pixel, and hence 24 possible equations of the form shown in equation (13). Thus theoretically it should be possible to determine solutions for D, and Dy which are accurate to the fifth order. However, it seems logical that any array of pixels chosen should be symmetric otherwise preferred orientation may be induced from the shape of the kernel itself. Only five different symmetric arrays of pixels may be obtained by including all the pixels 1-4, or 1-8, or 1-12, or 1-20, or 1-24 (see Fig. 5). This explains why the pixel numbering sequence has been done in the way shown in Table VI. In the first symmetric array, there are only 4 pixels and although there are more equations than needed for a first order solution, these are insufficient for the five needed for a solution to the second order. Similarly, including all the pixels 1-8 allows a solution to the second order, but falls short of the required number for a third order solution. The group of pixels 1-12 allows a third order solution while, in theory, the 20 pixels in the group 1-20 should provide adequate information for either a fourth order solution (14 equations needed) or a fifth order solution (20 equations needed). Using all pixels in the array should allow a fifth order solution.
-
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
249
To simplify the derivation of the next part of the analysis, it is helpful to replace the product of the relevant partial coefficients multiplied by the corresponding fractional constant by the variable X j , where j can take any value from 1 to 20 for solutions up to and including the fifth order. The value of j represents the position of the relevant coefficient in Eq. (13). Thus, X , = DxIo X2 = DyIo x 6 =
+D:Zo
X7
iGDyZo
=
= h@Dy10
X,,
= &D:Zo.
All other values of X, may be found by reference to equation (13). Of all the coefficients, only X , and X 2 are needed in normal intensity gradient analysis as the second and higher order terms must be eliminated from the set of equations. For some applications (e.g. Haralick, 1984) these higher order coefficients are needed and may be obtained in a similar manner. Smart and Leng (1993) also present the higher order derivative coefficients, this time associated directly with the 20,14 formula discussed below. There are i possible equations each with its own set of coefficients h: * h i . Let these coefficients be B,, where i refers to the particular point and j follows the same convention as before:
B;, = h, B;Z
=
hy
Bj6 = h:
B;,
=
h:hy
BiI6= h:hy Bi2, = h,,5 .
There is a set of i equations for each of the i pixels chosen:
z;
= I,
+
c BjjXj. J
Since the term I, appears in all equations, the notation may be simplified
250
N. KEITH TOVEY e t a / .
further such that BijXj
=
Ci,
j
where
c;= zi - z,. There are thus i equations which may be solved to obtain X , and X , . This is conventiently done in matrix form:
BX
=
C.
(17)
Here B is a matrix with m rows and n columns having coefficients B,, X is a column matrix (1 x m),and C is a column matrix (1 x n). If the number of rows and columns in B are equal, then the solution is just determinate and the relevant expression for X may be found from
X
=
B-'C.
(18)
The solution can follow standard procedures, the only problem arises when the matrix is singular and there is no inverse. In the case of the 5 x 5 array, this problem only occurs in one situation, i.e., when points 1-20 are used to define the 20 equations needed for a fifth-order solution. If the number of rows in B is less than the number of columns, then no solution is possible and a lower order solution must be found with the points available. More usually, the number of rows is greater than the number of columns and there is some redundancy so a least squares solution is possible. Premultiplying both sides of Eq. 18 by the transpose of B gives
BTBX = BTC and
X
=
[BTB]-'BTC.
Once again the computation follows standard practice, the only difficulty arising when the matrix [BTB]is singular. This occurs only once within the 5 x 5 pixel array for symmetric solutions for the situation, when there are 24 points and a fifth order solution is required (i.e., 20 equations). There are alternative methods for solving equations including the Singular Value Decomposition (SVD) method (Press et al., 1986), and these may be used as an alternative if problems exist. Since the matrix B is common for all solutions it is only necessary to work out the coefficients once as shown by Smart and Tovey (1988). Thus for pixel 17, the values of h, and hy are -2 and -1, respectively. If the pixels are square, then h, = hy and the fourth order coefficient for parameter 13 (i.e., h,hj equals -2 x ( -1)3 = +2. Similarly for pixel 19 and parameter 14, the coefficient becomes 16. A full set of all coefficients is given in Table VI.
TABLE VI COEFFICIENTS B,, OF THE MATRIX B.
j i
1 h,
2
3
hy
1 1 0 2 0 1 0 3 - 1 4 0 - 1 5 1 1 -1 1 6 7 - 1 - 1 8 1 -1 9 2 0 1 0 0 2 1 1 - 2 0 12 0 - 2 13 2 1 1 4 1 -1 2 15 1 16 -2 17 -2 -1 18 -1 -2 1 -2 19 -1 20 2 21 2 2 22 -2 2 23 -2 -2 24 2 -2
4 h: 1 0
5 6 7 8 h,hy h: h: 0 0
0 1
0 - 1
1
0
0
0
1 1
1
1 1 4 0 4 0 4 2 1 4 4 1 1 4 4 4 4 4
-1 1 -1 0 0 0
0 2 1 -2 -2 2
2 -2 -2 4 -4 4 -4
1 0
9 h:hy
1 0 1 h,h: h:
0 0 0 0 1 1
0 0 0
1 0 0 1 1 1 1 -1 -1 1 - 1 - 1 - 1 1 1 -1 1 0 8 0 0 4 0 0 0 0 - 8 0 0 4 0 0 0 1 8 4 2 2 4 1 2 4 -1 2 -4 1 -8 4 -2 1 -8 -4 -2 4 -1 -2 -4 4 1 -2 4 1 8 -4 2 8 8 4 8 -8 8 -8 4 4 -8 -8 -8 4 8 -8 8
0 1 0
-
1
4
1 h:
1 2 1 h:hy hth:
3 h,h:
1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 1 1 1 1 -1 1 -1 1 1 1 1 -1 1 -1 1 -1 0 16 0 0 0 8 0 0 0 0 0 1 6 0 0 0 8 0 0 0 0 1 1 6 8 4 2 8 1 2 4 8 8 1 -2 4 -8 1 16 -8 4 -2 -1 16 8 4 2 -8 1 2 4 8 -8 1 -2 4 -8 -1 16 -8 4 -2 8 1 6 1 6 1 6 1 6 8 16 -16 16 -16 -8 16 16 16 16 16 -16 16 -16 -8
14 h,"
0 1
0 1 1
1 1
1 0 16 0 16 1 16 16 1
1 16 16 1
16 16 16 16
15 h: 1
0
17 hlh:
16 h:hy 0
0
18 h:h; 0
0
19 hxh$ 0
0
20 h: 0
0 1 1 0 0 0 0 0 0 0 0 0 0 - 1 1 1 1 1 1 -1 1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 1 -1 3 2 0 0 0 0 0 0 0 0 0 0 3 2 3 2 0 0 0 0 o o o o 0 - 3 2 3 2 1 6 8 4 2 1 1 2 4 8 1 6 3 2 -1 2 -4 8 -16 32 -32 16 -8 4 -2 1 -32 -16 -8 -4 -2 -1 -1 -2 -4 -8 -16 -32 1 -2 4 -8 16 -32 8 -4 32 -16 -1 2 32 32 32 32 32 32 -32 32 -32 32 -32 32 -32 -32 -32 -32 -32 -32 32 -32 32 -32 32 -32
0
$ k
z
1
~
2
v]
< 7
0
m
2z Z
t<
2
h)
Nore: See Eq. (17), [after Smart and Tovey (1988)l.
~
ch e
~
252
N. KEITH TOVEY el al.
To solve for the coefficients for any particular order, all that is required is to select the appropriate section from the top left of the matrix in Table VI. Thus for a second order solution all the columns up to column 5 should be included, and the number of rows should be selected to provide a symmetric array. The horizontal and vertical lines delineate the key parts of this matrix. The solution to the equation will provide the coefficients for the relevant kernel to be used in intensity gradient analysis. C. Nomenclature of the Duferent Formulaefor Intensity Gradient Analysis
For most applications it will be desirable to use symmetric arrays of pixels. Thus as noted earlier, five different symmetric arrays are possible, and for each of these, there will be a series of solutions involving differing orders of precision. Thus for points 1-4, there is only one solution possible requiring two equations and this will be a first order solution. This is termed the 4,2 solution as there are four points, and two equations are needed. The kernel may be developed from matrix B, by taking the first two columns and the first four rows in Table VI. With the next symmetric array (i.e., with pixels 1-8) there are in fact two solutions possible to both the first order (two equations) and second order (five equations). These are termed the 8,2 and 8,5 solutions respectively. With 12 points, solutions up to and including the third order are possible (12,2, 12,5, and 12,9) while with 20 and 24 pixels there are theoretically five solutions possible in each case (20,2, 20,5, 20,9, 20,14, 20,20, 24,2, 24,5, 24,9, 24,14, 24,20). The matrices generated by both the 20,20 and 24,20 methods are singular and no solution using matrix inversion is possible. However, it is possible to compute the coefficients for the 24,20 method using the singular value decomposition method. At first sight it appears that there are 15 possible formulae (excluding the 20,20 method), however, the coefficients generated for the kernels for several pairs of formulae are identical. Thus both the 8,2 and 8,5 methods give the same coefficients as do the 20,9 and 20,14. A full list of the coefficients for all formula types is given in Table VII. Only the coefficients for the gradient in the X-direction are given as the corresponding ones for the Y-direction are obtained by simple rotation of the kernel. In all there are nine different formulae, but one additional asymmetric formula is also included in the table. This is denoted 2,2 and refers to the most basic formula shown in Eqs. (3a) and (3b). For consistency with earlier work it should be noted that the 4,2 formula corresponds to the basic original method of Unitt (1975) and the 5-point method of Tovey and Smart (1986). The 12,9 formula corresponds to Unitt’s extended analysis (also called the 9C formula by Tovey and Smart). Finally, it should be noted that the 8,5
253
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
TABLE VII
THEKERNELS
0
0 0 0 0
FOR THE FULL RANGE OF ANALYSES DESCRIBED BY
0 0 0
0 0
0 0 0
0
0
0
0
0
0
1000
0
0
0
0
0
0
0 0
0 0
(8,5)/(8,2) ~~
~~~
0
0 0 0 0
0 -167 -167 -167 0
-500 0
0
0
0 0 0 0
167 167 167
0
0
0
0 -667 0 0
0 0 0 0
0
0 0 0 0 0
-148 -162 -148 -105
0 0
0 0
0 0 0
0
0
0 -143 0
-71 -71 -71
0 0 0
0 71 71 71
0
0
0
0
0 0 143 0 0
29 29 29 29 29
0 59 59 59 0
20 20 20 20 20
40 40 40 40 40
-23 93 194 93 -23
12 -47 58 -47 12
(20,2)/(20,5) 0 0 667 0 0
0 0 -83 0 0
0 -59 -59 -59 0
0 0 0 0 0
-29 -29 -29 -29 -29
0 0 0 0 0
0 0 0 0 0
(24,2)/(24,5) -13 207 280 207 -13
0 -77 70 -77 0
-40 -40 -40 -40 -40
-20 -20 -20 -20 -20
(24,9)/(24,14) -105
0
0 0 500 0 0
0
(20,141
74 -12 -40 -12 74
0 0
(12,2)/(12,5)
0
0 83 0 0
13 -207 -280 -207 13
0 0
~
(123
0 77 -70 77 0
SMARTAND TOVEY (1988).
105
148 162 148 105
0 0 0 0 0
(24.20) -74 12 40 12 -74
-12 47 -58 47 -12
23 -93 -194 -93 23
0 0 0 0 0
Note the coefficients for several pairs of formulae are identical. The coefficients shown are for dl/dx. The coefficients for dI/dy are identical but rotated through 90° counter-clockwise from those shown above. All values are lo00times actual coefficients so integer arithmetic may be used.
formula is essentially similar to the Prewitt operator, while the 20,5 formula is similar to the extended Sobel operator (see Zuniga and Haralick, 1987). Smart and Tovey have also used the term “forward difference formula” to refer to the 2.2 formula.
254
N. KEITH TOVEY el al.
D. Practical Considerations of Intensity Gradient Analysis Once a particular kernel has been selected, either one from Table VII or one of the well-known kernels (e.g., Roberts, etc.), the extraction of the two derivative coefficients at each pixel is straightforward. However, two separate conventions exist as to the calculation of direction and these are discussed in detail in Section V, J. It is sufficient to note here that one convention favors using the upward vertical as the reference direction and measures all angles clockwise from this point; the other uses the positive X-axis and measures angles counter-clockwise from this. For most applications, intensity gradient information will only be available in a range from 0 to 180" and it is convenient to mirror the data to produce a symmetric rosette diagram. There is no ambiguity over the computation of the magnitude image from intensity gradient analysis. Although this magnitude image is the key result for edge detection work, it is frequently discarded in orientation work for microfabric analysis. For microfabric analysis, results may be displayed treating each intensity gradient vector as a unit vector or weighted according to a simple function. In some applications of microfabric analysis, the magnitude image may become significant. Figure 11 shows a display of rosette diagram generated for the image in Fig. 2b. In Fig. l l a (which is the same as Fig. lOa), the aggregate rosette diagram for the whole image is displayed. This was computed using all vectors which have a magnitude greater than the selected cutoff level of 2. Figure 1 l b displays the orientation information of vectors with a magnitude less than this cutoff and would thus not be included in the aggregate diagram. This particular figure shows a strong direction aligned vertically and it is a manifestation of the problems discussed in section IV, C. The next diagrams (Figs. 1 lc-11 k) are the individual rosette diagrams for ranges of vector magnitude. In the example shown, the ranges are 2-5, 5-8, 8-13, 13-21,21-34, 34-55,55-90,90-148, and above 148, respectively. The low-magnitude diagrams show almost random distributions, but these become progressively more anisotropic as the magnitude range increases. An alternative method of displaying the results might be to weight the rosette diagram according to the magnitude of the vector before display. In this example, the weighted rosette diagram corresponding to Fig. l l a is shown in Fig. 111. The diagram is now more elongate than in the unweighted version in Fig. 1 la. Tovey and Krinsley (1990) used rosette diagrams from a selected magnitude range to extract information about characteristic V-shaped patterns sometimes found on the surface of sand grains (Fig. 12a). The overall rosette diagram was of little help in extracting information about the mean angle between the limbs of the patterns. This was the case despite the
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
255
FIGURE11. Illustration of rosette diagrams constructed using different magnitude ranges. (a) Cumulative rosette diagram as in Fig. 10a (this was constructed for all magnitudes values in excess of 2/h); (b) Diagram for vectors rejected in analysis because their magnitude was below threshold value of 2/h; (c) Diagram for magnitudes range 2-5; (d) 5-8; (e) 8-13; (f)13-21; (9) 21-34; (h) 34-55; (i) 55-90; (j)90-148; (k) Diagram using vectors with magnitudes greater than 148; (I) Rosette diagram with vectors weighted according t o magnitude. Figure 1 I 1 should be compared with Fig. I l a which is the diagram assuming all vectors are of unit magnitude.
fact that over 60% of computed vectors were rejected because they fell below the cutoff threshold magnitude intensity of 2. Figure 12b shows the distribution of vectors having a magnitude less than 5 (representing 88.9% of all vectors). This is close to a random distribution and illustrates the problem. Figures 12c and 12d show only those vectors with magnitudes in excess of 5 and 8, respectively, when the desired information becomes apparent. The data in these two rosette diagrams were computed from 1 1.1 Yo and 4.0% of vectors, respectively. To the naked eye, brighter and more contrasting features are more obvious than those with low contrast and for some purposes it may be desirable to weight the intensity gradient vector according to some function of the magnitude as described previously. On the other hand this may not be what is required and the unweighted diagram may be more appropriate. As yet, there is no guidance as to what weighting function should be used. In cases of doubt it is probably best to select a series of ranges of magnitude and compute the statistics from each range separately (see for example Tovey, 1980).
256
N. KEITH TOVEY er al.
FIGURE12. Orientation analysis on the surface of a sand grain where it was the distributions and orientation of the V-shape patterns which was of interest. (a) Original micrograph; (b) Rosette diagram from all magnitudes less than S/h (this represents 88.9% of all vectors and shows a nearly circular distribution; (c) Rosette diagram using only vectors with magnitudes greater than 5 / h (1 1.1070 of all vectors); (d) Rosette diagram using only magnitudes greater than 8 / h (4.0% of all vectors).
E. A Comparison of the Various Formulae With the large number of intensity gradient formulae available, it is important t o have some guidance as to selection of an appropriate operator. Some operators commend themselves by their simplicity, others provide
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
257
high-order solutions and would appear more suitable in some situations. Conflicting articles have been written as to the merits of the various formulae. Zuniga and Haralick (1987) demonstrate that under their test conditions their kernel (see Table V) was to be preferred to other operators, including the extended Sobel and the Prewitt operators. Smart and Tovey (1988) inferred that the 20,14 method was an important method as the kernel was approximately circular and therefore did not impose any direction information on the analysis. Furthermore this was a fourth order solution and should be suitable in cases where there is fine detail. On the other hand they recognized that in noisy images, a formula with a high number of pixels but of low order would be better as some degree of filtering would then be present (e.g., the 20,5 formula). In a simple test, this last statement was confirmed by Tovey et al. (1989). Some results of a comparative test undertaken by Tovey and Smart (1986) have already been mentioned (Section IV, C), where the 12,9 formula was shown to be superior to the 4,2 formula although the now obsolete 17-point analysis had some merit. Tovey and Martinez (1991) carried out a more extensive series of tests of all the formulae mentioned in Smart and Tovey (1988), using nearly a hundred different images covering a large range for the index of antisotropy from 0 to 0.8. In their test they assumed that the 20,14 method was the standard and compared all other formulae with this one. When the indices of anisotropy from any of the other formulae were plotted against that for the 20,14 formula, the result should have been a straight line with a gradient of 1.O. Although some formulae did show a strong correlation with the 20,14 algorithm, others did not. They concluded that formulae such as the 8,5 and 12,9 gave an overall result comparable with those of the 20,14 method, and had some benefits in terms of the time taken. Other methods, particularly the 12,s and 20,5 analyses showed little correlation, and thus had little to recommend them. It should be noted that these tests of Tovey and Martinez were conducted on a series of real images covering a wide range of microfabrics. Finally, Smart and Leng (1993) carried out a series of tests on several formulae using a simple synthetically generated sine wave as the input image. Details of their experiment are sparse, but the evidence presented did seem to reinforce the superiority of the 20,14 method although they did not test the kernel of Zuniga and Haralick (1987). Smart and Leng noted that in many kernels the coefficients were not always of the same sign on a particular side of the kernel (e.g., the positive value of 0.013 at pixels 15 and 18 in the 20,14 formula). They proposed three new empirical formulae as shown in Table VIII. They denoted these as the 20S, 20T, and 20U formulae. These are based on the 20 pixels forming a near circular array and thus have some merit. However, in their analysis none of these appeared to be as good as the 20,14 analysis.
TABLE VIII EMPIRICAL FORMULAE
SUGGESTED BY SMART AND
20s" 0 -33 -34 -33 0
-33 -61 -100 -61 -33
LENC (1993)
ZOT 0 0 0 0 0
33 67 100 61 33
0 33 34 33 0
0 -1 -2 -1 0
20u
-1
0
1
-6 -10 -6 -1
0
6 10 6 1
0 0 0
The 20s coefficients should be divided by 1000, the 2OT by 64, and the 20U by 22.
0 1 2 1 0
0 -1 -1 -1 0
-1 -1 -1 -1 -1
0 0 0
0 0
1 1 1 1 1
0 1 1 1 0
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
259
To resolve the question as to the merits of a particular formula it is necessary to consider several points: (i) For pure edge detection the precision of defining a particular angle is not of much importance provided that the edge is adequately defined. (ii) For orientation analysis to detect lines which are reasonably well spaced (e.g., the example of Swift, 1992), a kernel which gives a high accuracy in angular measurement should be chosen. (iii) For microfabric analysis where there are features spaced at a variety of frequencies in the image, the analyzing kernel should perform well at both high and low frequences. (iv) The performance of the kernels in noisy images should be considered. A series of artificial test images was generated to test the formulae. The first image in each series had an intensity which varied in a sinusoidal manner in the X-direction but remained constant in the Y-direction. Ten separate images of this type were generated each with a different frequency (varying from 3 pixels peak-to-peak at the high frequency to 12 pixels peak-to-peak at the lowest frequency). This frequency range corresponds with typical images analyzed for microfabric (see Fig. 9). An amplitude of 40 on the gray scale was selected as this represented a typical range in intensities and corresponded to the smaller perturbations (see Fig. 9). Images with shorter wavelength were not studied in detail as at a wavelength of 2 pixels the wave form would be equivalent to a square wave and all formulae largely failed at this close spacing. The 10 initial test images were then used to generate a further 179 images, each with the orientation of the sine waves turned through 1" between each successive image. Examples of the synthetically generated images area are shown in Figs. 13a-13h. The test of each formula then proceeded by analyzing all 180 images for all wavelengths and computing the histogram showing the distribution of angles computed for each image. A resultant image was computed which showed the computed angle in the X-direction and the image in the series angle (0-180-i.e., the actual angle) in the Y-direction (i.e., the resulting image contains the data from all 180 images). If there were a perfect estimation of angles then there would be a single line at 45",and furthermore the intensity of each point on the line would be 15,376 (the size of the test images was 128 x 128 pixels and estimates within 2 pixels of a boundary were not included). All formulae departed from the ideal, and to illustrate the errors, the resultant images were thresholded at a maximum of 50, which readily highlights any error present. A selection of the results from
260
N. KEITH TOVEY er a/.
FIGURE13. Examples of synthetically generated images. These represent a sinusoidally varying intensity at different wavelengths and orientations. Figures (a) and (b) are at a wavelength of 3 pixels; Figures (c), (d), and (e) are at a wavelength of 5 pixels; Figures (0,(g), and (h) are at a wavelength of 10 pixels.
the simulations for the high frequency (wavelength 3 pixels) is shown in Fig. 14, while corresponding values for a wavelength of 10 pixels are presented in Fig. 15. For a wavelength of 3 pixels, it is clear that all analyses apart from the 20,14 and Sobel formulae depart significantly from the expected straight line. At a wavelength of 10 pixels, however, many formulae give good results. The 20,14 formula is a higher order solution and it is to be expected that this will perform better at the shorter wavelengths than lower order solutions such as the 20,5. Equally, the 12,9 formula is a higher order solution than the 12,5 formula, and as excepted, the former formula is more accurate than the latter at the short wavelengths. To reduce the data set for ease of comparison, the proportion of estimated angles which were exactly correct for each image was estimated as were all values which differed by 1", 2", 3", or more from the correct value. To simplify presentation, all results which differed by more than 10" were aggregated together. The analysis shown in Fig. 16 was done on each line of the output image, separately representing the results from all 180 initial images for each wavelength. Subsequently the results were averaged to obtain an aggregate error value for each wavelength. Selected distributions are shown in the figure. The X-axis indicates the error in degrees. For a near perfect analysis, the theoretical curve would show a high value approaching 100% for those estimates which had a maximum error of 1" or less and decreased to zero thereafter. For a wavelength of 3 pixels, the 20,14 formula was the
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
26 1
FIGURE14. Example of analyses on selected formulae for a wavelength of 3 pixels. A perfect estimation would give a single line at 45". Images have been enhanced to emphasize errors. (a) 2,2 formula; (b) 4,2 formula; (c) 8,5 formula; (d) 12,5 formula; (e) 12.9 formula; (f) 20,s formula; (9) 20,14 formula; (h) Zuniga and Haralick formula; (i) Sobel formula.
only one to give a significant value (60%) for estimates with less than 1" error. However, even with this formula, there are still a significant number of estimates which are in error by 10" or more. For all other formulae, the precision is very much worse and many have the majority of results in error by over 10" (including the Zuniga and Haralick formula). When the wavelength is increased to 10 pixels, most formulae show the expected high value for errors less than lo, and some formulae decrease very rapidly to low values thereafter. In all formulae, very few results are in error by more than 10". The results from other formulae are not shown in Fig. 16 as the results all fall within the general statements made earlier. To simplify the presentation further, angular estimation was considered to be correct provided that the computed angle fell within f 2" of the true value. Using this figure to indicate a correct estimate, it is possible to see
262
N. KEITH TOVEY e t a / .
FIGURE15. Example of analyses on selected formulae for a wavelength of 10 pixels. A perfect estimation would give a single line at 45". Images have been enhanced to emphasize errors. (a) 2,2 formula; (b) 4.2 formula; (c) 8,s formula; (d) 12,s formula; (e) 12,9 formula; (f) 20,s formula; (g) 20,14 formula; (h) Zuniga and Haralick formula; (i) Sobel formula.
how the precision of each formula varies with wavelength. These results are shown in Fig. 17. The 20,14 formula is substantially better than any other formula for a wavelength of 3 pixels, being nearly three times better than the next best (Sobel). However, by the time a wavelength of 8 pixels is reached, other formulae (including that of Zuniga and Haralick and the 20,s formula) appear marginally better. The values plotted in this figure are the raw proportions computed (i.e., these are the proportions of the maximum number of possible estimates of 15,376). In all formulae it is not possible to obtain estimates at all pixels as the magnitude of the intensity gradient vector is too low, and for most purposes, including the computation of orientation statistics, it is more realistic to compare the weighted proportions (i.e., the values of the total number of angles actually computed). The corresponding weighted results are shown in Fig. 18 and this weighting improves the performance of the 20,14 formula at higher
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
700
Frequency
- 20,14 formula
12,9 formula 20,5 formula _ _
000
263
1
I
I
I
,
2
3
4
5
Zuniga and Haralick
.. I
6
7
8
9 1 0 +
Error (degrees)
a
_ _
12,9 formula
- 20,14 formula
wavelengths. Finally, the proportion of correct estimates averaged over all wavelengths for each formula is shown in Table IX. The data in column 2 show the percentage of points at which an estimate or orientation was possible (i.e., above the threshold, which was set at 2).
264
N. KEITH TOVEY el al.
96 correct loo
7 / /
40
20
12,9 formula
_ _
- 20,14 formula I
0
3
4
80
I
I
,
6
7
8
1
I
9
10
11
12
Wavelength
a 100
5
24,20 formula Sobel
% correct
1 -
FIGURE 17. Variation in correct predictions with wavelength for various formulae. Predictions within +2" of actual values are considered to be correct. The percentages are proportions of all possible angles which are correct.
Clearly for some formulae a higher number of orientation angles are actually measured, but these are not always the most accurate. It might be thought that since the Prewitt Formula and the 8,s formula are essentially the sam.e, they should give the same values. The difference arises because the magnitudes of the coefficients do differ (see Tables I11 and VII). It
265
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
20-
~
I
12,9 formula 20,14 formula
,
- -
,
I
24,20 formula Sobel I
I
I
% correct
100 I
I
_-_-
-
I
I
/
,- -8,5 -
- 20,14 formula
formula
203 formula
0
3
b
4
5
6
Zuniga and Haralick
~
I
8
9
10
11
12
Wavelength
FIGURE 18. Data are same as in Fig. 1 / Out witn resuitb welgnreu to anow for esLlildeS actually made. There is little difference between the formulae above a wavelength of 8 pixels, but for fine detail the 20,14 formula is superior.
could be argued that using wavelengths only up to 12 pixels does help to bias the preference toward the 20,14 formula, but this upper limit is consistent with the spacings seen in many of the transects done on images of microfabric (Fig. 9). The data in column 3 show the average proportion of estimates which fell
266
N. KEITH TOVEY et al.
TABLE IX
SUMMARY RESULTS
FROM COMPARISON OF ESTIMATES OF ANGLE FROM INTENSITY GRADIENT ANALYSIS W I T H ACTUAL ANGLES.
Formula type 292 42 8,s 12,5 12,9 20,s 20,14 24.5 24,14 24,20 Zuniga & Haralick Isotropic Prewitt Roberts Sobel Smart & Leng 20s Smart & Leng 20T Smart & Leng 20U
Percentage of estimates
Correct angle predicted within + 2 " (Yo)
Weighted prediction
85.7 90.1 88.3 83.8 90.6 75.5 90.0 72.9 37.4 83.0 94.7 98.5 98.1 90.5 98.6 83.1 78.8 99.0
28.0 54.0 56.7 44.1 62.3 64.3 73.7 32.8 67.2 59.3 13.7 64.8 59.9 56.2 71.3 58.0 63.8 58.3
30.7 60.7 64.9 51.6 69.3 76.8 82.1 40.0 75.7 72.7 71.3 65.9 61.3 62.5 72.4 68.2 80.8 58.9
(TO)
within +2" of the correct value for all wavelengths. Two formulae are noticeably better than others (20,14 and Zuniga and Haralick, 1987). On the other hand, although on average the Zuniga and Haralick method contains estimates from more points overall, the weighted percentage of correct estimates (i.e., weighted according to total number of estimates actually made) is lower than for the 20,14 formula. The choice of formula will depend on the task in hand. Where the definition of a high proportion of features which are widely space is required with a high precision, then the Zuniga and Haralick formula would seem to offer some advantages. On the other hand, where microfabric analyses are being conducted, the ability to have a reasonably consistent precision over a large range of feature frequencies is of importance. The 20,14 formula clearly comes out on top here. The fact that a smaller proportion of angles are actually specified is of less concern as the weighted data would be used in computation of indices of anisotropy. The Smart and Leng 20T formula is also good for overall orientation statistics but this is achieved with far fewer estimates of orientation and may be less reliable in other situations. It is to be expected that higher order solutions such as the 20,14 would
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
267
perform better at high frequencies and this is certainly the case for the 20,14 formula. However, if this were correct, then one might expect the 24,20 to perform even better. This is clearly not the case and the reason lies in the fact that the shape of the structuring kernel is no longer circular. In conclusion it would appear that for microfabric analyses, the 20,14 formula appears best on several counts. Firstly, it has a circular structuring element. Secondly, it is by far the best formula to deal with high frequency data. Thirdly, over a range of frequencies its performance varies much less than other formulae. Finally, although other formulae are marginally better at lower frequencies, overall the 20,14 formula is among the best. Smart and Leng (1993) in a restricted range of tests also suggested that this formula was best, but their claim that it is accurate to 1070 is not borne out when a full range of frequencies is considered. For applications other than microfabric analysis, where features may be widely spaced, then the Zuniga and Haralick formula would seem to have advantages. Where detection of edges is the key factor, then formulae which use results from a high number of pixels are t o be preferred. Here the Sobel, Prewitt, Isotropic, and Smart and Leng 20U formulae have merit.
F. Treatment of Boundaries It is not possible to use intensity gradient formulae right up to the edge of the image, and for some purposes, it is important to consider how to deal with the problem. Generally for a 5 x 5 pixel kernel, any estimate of orientation within the boundary 2 pixels will be less precise than elsewhere in the image, and for most microfabric work it is often sensible to disregard the information in this region. In this way for an image of size m pixels by n pixels, the output angles-coded and magnitude images will be of a size (m - 4)by (n - 4)pixels. This represents a reduction in size of 1.5% for a 512 x 512 pixel image. The results from the reduced image may be used in the normal way to compute indices of anisotropy and other parameters, and the small amount of data from the boundaries will not affect the results. On the other hand, if further processing is required, it is often more convenient to ensure that the output image is of the same size as the input image. To compute the orientation of features in the edge region, the various kernels must be modified to produce asymmetric kernels. In theory, any one of the above formulae could be adapted using the appropriate rows and columns in the matrix in Table VI. Figure 19 shows a representation of an image. In this figure the numbers represent the modifications needed for each pixel. Where numbers are the same, then the same formula may be used. Within the central region there
268
N. KEITH TOVEY et al.
are a collection of blanks representing the standard formula. A total of 24 modified formulae are needed for each of the standard formulae. Along the top edge, for instance, the left hand corner requires a unique kernel just for that pixel as does the pixel immediately to its right. Thereafter, the pixels along the top edge all can use the same kernel except for the last two, which are again unique. These last two will be mirror images of the ones at the start of the line of pixels. In the second row of pixels, a similar situation arises. For the third and subsequent rows (apart from the last two), the situation is identical. For the first two and last two pixels in each line, the kernels are different, but elsewhere, the standard formula can be used.
FIGURE19. Position of pixels at boundary of image. Pixels with same number can use the same edge formula.
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
269
There are a large number of possible boundary formulae derived from many of the original formulae discussed in Section V,C, and some examples are shown in Table X for the kernels derived for the edge conditions based on the standard 20,9 and 2 4 3 formulae. In this table, edge formulae are identified by a number which relates to its position as shown in Fig. 19. Unlike the symmetric kernels, where pairs of formulae have the same coefficients, all solutions appear unique, and the reasons for choosing the above two formulae for display was that they did appear somewhat better in defining direction than others, although tests are still continuing. Indeed there are no solutions for the reduced 20,14 formula at many of the boundary sites. Some preliminary testing suggests that the reduced 24,5 formula for the outermost boundary pixels is better than other formulae for these pixels but that the reduced 20,9 formulae are better for the next ring of pixels (i.e., numbers 7, 8, 9, 12, 13, 16, 17, and 18 in Fig. 19). At the time of writing, it has not been possible to test the boundary formulae in the same manner as the standard formulae but tests are currently under way and full details will be published elsewhere. To achieve this test, the various boundary formulae are scanned across the whole of each image (not just the relevant boundary pixels) in the full set of 180 images as in the standard tests. This analysis should thus give a full comparison of boundary formulae with the symmetric kernels. Preliminary results are shown in Table XI and summarize the effective accuracy for the different edge formulae based as indicated in Table X for a wavelength of 10 pixels. Although there is reasonable agreement with the true angles, it is significantly less than for the non boundary formulae. Noteworthy is the fact that the formulae from the penultimate ring of pixels give better estimates than d o those at the boundary. T o indicate the variation between formulae, three separate formulae have been generated for the corner pixels (i.e., reduced 24,5, 20,5, and 20,9 formulae). The results show that there is little from which to choose between the different formulae. When computation of orientation right up to the image edges is required, the following guidelines seem appropriate: (i) Use the 20,14, Zuniga and Haralick or Smart and Leng 20T formulae as appropriate (see discussion in Section V,E) for all the pixels except those in the outer two layers. (ii) Use the 8,5, Sobel, Prewitt, or Isotropic formulae for the penultimate layer. From initial tests all of these seem preferable to the reduced 20,9 (or any other asymmetric formulae). (iii) Use the asymmetric formulae based on the 24,5 analysis for the outermost boundary pixels apart from the four corner pixels.
N 4 0
TABLE Xa
SELECTED EDGE
FORMULAE WITH THE KERNELS CHOSEN AS THE BEST ONES FOR THE RESPECTIVE PIXELS I N
Point 1-24,5
-
-
0 -293 -276
Point 2-24,s
847 563 563
57 -270 -287
-
-
-
-
-
-
-
0 50 100
167 117 67
Point 4-24,5
-167 -17 133
-167 -117 -67
0 -50 -100
-
-63 -171 -155 55
-62 0 -19 -50
333 183 33
-
270 287
-464 -607 -571 -358
139 0
39 256
-167 -67 33
-563 -563
293 276
254 431 321 -5
-128 108 -147 0
89 -76 71 0
-199 -282 -208 22
-
0
0 0 0
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-194 -48 0 -48 -194
-
-83 -33 17
0 0 0
83 33 -17
167 67 -33
-
-39 -256
z 571 358
-52 -262
199 282 208 -22
-89 76 -71
128 -108 147
0
0
-254 -431 -321 5
62 0 19 50
z
-
-
63 171 155 -55
-
6 297 394 297 6
0 -114 62 -114 0
-
-
Point 12-20,9 303 448 497 448 303
-197 -52 -3 -52 -197
-
56 -128 -189 -128 56
-71 -18 0 -18 -71
2<
n -e
Point 9-20,9
Point 11-24,5 -
-
Point 6-24,s
Point 8-20,9
Point 10-24,5
166 20 52 262
-333 -183 -33
167 17 -133
Point 5-24,5
Point 7-20,9 -
Point 3-24,5
-
-
FIG. 19.
?
Point 14-20,9
0 114 -62 114 0
-6 -297 -394 -297 -6
71 18 0 18 71
Point 15-24,5 -56 128 I89 128 -56
-
-
-
197 52 3 52 197
Point 17-20,9
-
-
55 -155 -171 -63 -
-50 -19
0 -62 -
-
-358 -571 -607 -464 -
256 39 0 139 -
-5 321 431 254 -
0 -147 108 -128
-
0 71 -76 89
-
17 -33 -83 -
0 0
0 -
0
48 194
-
-
-
-
22 -208 -282 -199 -
0 0 0 0
-
-
-
-
-
-
-
-276 -293 0 -
-
-
-
-33 67 167 -
133 -17 -167 -
-67 -117 -167 -
-100 -50 0 -
-256 -39 0 -139 -
358 571 607 464 -
-262 -52 -20 -166 -
9
8 9
Point 19-20,9 -22 208 282 199 -
0 -71 76 -89 -
0 147 -108 128
-
5 -321 -431 -254 -
50 19 0 62 -
z -55 155 171 63 -
-
-
-
563 563 874 -
-287 -270 57 -
-
-
287 270 -57 -
l
E z
7 !?
m
-33 -183 -333 -
100 50 0
-
67 117 167
-133 17 167
-
-
-
-
-563 -563 -874 -
276 293 0 -
5
> =!
0
2 9
z
Point 25-24,5 33 183 333 -
.e i? v
V
Point 22-24,5
Point 24-24,5 -17 33 83 -
-
-
Point 21-24,5
Point 23-24,5 33 -67 -167 -
194 48
Point 18-20,9
Point 20-24,5 262 52 20 166
-303 -448 -497 -448 -303
Point 16-24.5
-
-
-
.e K? cli
272
N. KEITH TOVEY et al. TABLE Xb ALTERNATNE KERNELS FOR CORNER
-
-
PIXELS
Point 1-20,s
Point 5-20,s
Point 21-20,5
Point 25-20,5
-246 -96 0
246 684 1070
0 -588 88
0 588 -88
-246 -684 -1070
246 96 0
Point 1-24,9
Point 5-24,9
Point 21-24,9
Point 25-24.9
-
-
-
G . Use of Pixels with Rectangular Aspect Ratio The structuring kernels for all the formulae require the aspect ratio of the pixels t o be unity (i.e., square), otherwise, although the edges may be detected without difficulty, the directions will be anomalous. There are three ways to solve this problem. Firstly, for overall microfabric analysis, a correction may be applied to the overall index of anisotropy in a similar way to that described by Tovey and Sokolov (1981) when correcting for tilt in scanning electron micrograph images. Secondly, the source images may be stretched in the appropriate direction by affine transformation to generate
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
TABLE XI PRELIMINARY RESULTS FROM SELECTED
Formula typea
Position of edge formula (see Fig. 19)
24,s 24,s 24,s 24,s 24,s 24.5
213
EDGE FORMULAE
Percentage of estimates
Correct angle predicted within f 2" (To)
Weighted prediction
2 3 4 5 6
19.81 18.52 18.86 18.29 19.78 17.72
4.22 4.08 4.17 4.05 4.23 4.05
23.52 24.74 27.59 24.73 23.58 24.66
20,9 20,9 20,9
7 8 9
13.39 12.26 13.26
4.11 4.28 4.04
33.36 37.24 33.1 I
24,s 24,s
10 11
17.80 18.55
4.03 4.23
24.47 27.77
20,9 20,9
12 14
11.59 11.54
4.10 4.01
37.47 36.94
24,s 24,s
15 16
18.48 17.69
4.16 3.98
27.34 24.21
20,9 20,9 20,9
17 18 19
13.27 12.10 13.26
4.02 4.12 4.03
33.10 36.22 33.02
24,s 24,s 24,s 24,s 24,s 24,s
20 21 22 23 24 25
17.84 19.74 18.52 18.76 18.30 19.66
4.04 4.18 4.05 4.04 4.05 4.23
24.44 23.43 24.54 26.91 24.70 23.64
20.5 20,s 20.5 20,s
1 5 20 25
21.46 21.43 21.49 21.43
4.14 4.13 4.15 4.11
21.63 21.61 21.74 21.59
20,9 20.9 20,9 20.9
1 5 20 25
23.92 23.92 23.90 23.92
4.15 4.11 4.12 4.15
21.47 21.30 21.22 21.52
I
(To)
The formulae denoted in column 1 are the reference formulae from which the truncated versions have been generated, while the position number refers to the location in Fig. 19 which would be relevant for such a formula. For all pixels in the outermost layer the truncated formulae based on the 24,s kernel appeared to give the best results and only these are shown. For the penultimate layer, the truncated 20,14 formulae were best. At the corners, there is a choice of three different formulae. All data refer to a wavelength of 10 pixels.
274
N. KEITH TOVEY et al.
square pixels. Thirdly, the standard set of formulae derived in Section V,B may be adapted by keeping h, and hy separate thoughout. However, this is inconvenient as a separate set of coefficients for the matrix B are required for each aspect ratio. Smart and Tovey (1988) suggested that it should be possible t o proceed as follows: Let
h,
= s,a
hy
= Syb,
where a and b are now the distances between pixels and s, and sy are integers. The analysis described in Section V,C is then modified to include a and b in the equations for Xi,; for example, XI = UD,IO
X2
=
bDyIo,
and
X,,
=
b5
1200:I , .
(22a)
As before we need to determine D, and Dy ,i.e., XI 0,= a * I,
D y = -x .2 b * I, Using this approach the coefficients B , in the matrix remain the same, and all that is required is to remember that the integers s, and sy replace h, and hy . The angle B is given by
A test of the effectiveness of the formula was done by deliberately modifying the image in Fig. 2b by affine transformation to generate Fig. 20. The Ydirection information was sampled at a spacing of 2 pixels and the information was contained in pixels which were spaced twice as far in the Y-direction as in the X-direction. Table XI1 shows the result from intensity gradient analysis using the 20,14 formula for both the original and the transformed image. In the latter case the parameters a and b were set as 1 and 2, respectively. The agreement in the results is poor. A possible explanation for this was that the reduced information obtained in the Y-direction was affecting the analysis. In order to test this hypothesis, the transformed image was then expanded back to the original shape and the analysis conducted again. Here the information from both the index of anisotropy and the direction of preferred orientation was close to those of the transformed image, confirming that the earlier discrepancy was largely due to the reduced resolution in the Ydirection.
275
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
FIGURE20. Same image as Fig. 2b after affine transformation. The scale in the Y-direction has been reduced by a factor of 2. TABLE XI1 EFFECTS OF
ORIENTATION ANALYSIS USING FORMULAE MODIFIED TO ACCEPT RECTANGULAR ASPECT PIXELS
lndex of anisotropy
Preferred orientation direction (degrees)
Original image unprocessed
0.525
06.9
After affine transformation to generate an image 5 12 x 256 pixel in siLe (i.e., pixels have a 2 : I aspect ratio)
0.281
22.3
Original image reconstructed by reversing affine transformation
0.276
116.3
In row two the original image was transformed so that the pixels had a 2 : I aspect ratio (i.e., height was twice width). Results are very different for original values. After reconstruction of image to original aspect ratio, results remain the same. This indicates the importance of overall resolution in orientation analysis.
H . Resolution of Images The effective pixel resolution is of importance in microfabric analysis. Tovey and Sokolov (1981) showed that the same effective resolution could be obtained either with a low magnification and small aperture for digitizing or by using a larger magnification and a coarse aperture. In a separate test, Tovey el al. (1992b) examined the effect of changes in pixel resolution on computed values of anisotropy. A series of concentric images of
276
N. KEITH TOVEY et al.
scanning electron micrographs of clay microfabric were produced, each one with a different magnification and orientation analysis conducted on the whole set. A similar series of tests on 18 separate areas was also conducted for the research reported here. For these images (also of kaolin microfabric), there was little change in the results as the magnification fell from 2000x to around l5OOx (see Fig. 21). However, below l 5 O O x the computed index generally fell (for those images with an initial high index) and showed quite wide variation at 1000 x or less. Indeed some images with a low index at 2000x show anisotropy when the magnification was reduced. The reason for this was that the individual clay particles, which were visible as separate units at a magnification of 2000 x , could no longer be resolved separately, and the analysis was thus being conducted on aggregates of particles and represented a different type of microfabric. The results shown in Fig. 21 clearly show that the effective resolution of the image must be chosen with care according to the task in hand and the nature of the features to be examined. In microfabric analysis it is desirable to cover as large an area as possible, and if a magnification of 2000x proves to be satisfactory when the images are digitized as a 512 x 512 pixel array, then the same effective resolution may be obtained by halving the magnification but doubling the pixel array to 1024 x 1024. The latter format has advantages as four times the area is then covered at the same level of detail.
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . .
0
so0
lo00
1#0
2000
Magnification FIGURE21. Variation of index of anisotropy with magnification for 6 different images of samples of consolidated kaolin. The index varies little as the magnification is reduced below 2000x until a magnification of around l00Ox is reached when the index varies widely. In some images, the index falls; in others it rises as the nature of the material analyzed varies.
277
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
I. Noisy Images
Most images have noise present and any orientation analysis must be robust with respect to such noise. Smart and Tovey (1988) predicted that for noisy images, the lower order solutions with a large number of points such as the 20,5, 24,5 would be the best. This was confirmed in a simple experiment on one micrograph by Tovey et al. (1989). To provide a more rigorous test, a set of the 180 standard orientation images used for the comparisons in Section V,E were modified to include noise. In keeping with the tests of Zuniga and Haralick (1987) a zero mean Gaussian distribution of noise was added to all images. The amplitude of the noise was 10% of the range of the actual signal. Analyses were conducted on the full range of formulae for wavelengths of 3, 5 , and 10 pixels. The comparable results to those shown in Table 9 are presented in Table 13. For a wavelength of 3 pixels, the 20,14 formula is still clearly superior, but its advantage is less clear as the wavelength increases. The noise in these images represents a fairly severe test, and would confirm that for most microfabric applications where high frequency components are present, the 20,14 formula is the best. TABLE XI11
SUMMARY DATA
FROM TEST ON NOISY IMAGES
Weighted prediction of correct results (070) Formula
232 42 8,s 12.5 12.9 20,s 20,14 24,s 24,14 24,20 Smart and Leng 20s Smart and Leng S20T Smart and Leng S20U Zuniga and Haralick Isotropic Prewitt Roberts Sobel
Wavelength 3
Wavelength 5
Wavelength 10
17.38 11.42 10.83 1.48 20.79 1.80 55.10 3.98 12.48 20.76 13.18 33.11 4.17 15.29 15.17 10.54 23.19 26.16
24.47 39.90 38.34 18.71 61.19 73.38 73.27 11.73 70.41 55.52 40.96 71.04 28.88 72.46 47.88 36.55 24.04 61.38
28.41 54.94 72.12 49.74 46.45 90.51 63.11 67.95 72.49 64.87 85.73 85.71 78.71 80.22 66.24 65.79 18.87 65.61
Only the weighted results are shown.
278
N. KEITH TOVEY ef al.
J. Statistical Analysis of Orientation Data 1. Introduction
Analysis and statistical description of angular or orientation data is not commonly used outside of certain specialized fields, and is probably not familiar to many image-processing workers. However, important information about the orientation of features can be gained once the measurements have been made. Some simple techniques have been considered already in the form of best-fitting ellipses, but there are many other processing methods which are worthy of note. A few of the more important ones are considered in this section. Angular data may be acquired in a variety of ways; for example, the orientation, or trend of a linear feature may be expressed as aligned at some angle from a reference direction. Alternatively, as in this discussion of orientation measurement from images, a vector property may be determined by intensity gradient analysis, or through some other algorithm. In intensity gradient analysis it is normal to use only directions in the range 0-1 80" but in other applications where additional information is available the "pointing direction" may be known and the full 0-360" orientation may be used. In specifying the direction, intensity gradient analysis also gives the magnitude of the vector specifying this direction. As a result of different types of angular data, four basic classes of orientation data may exist in image processing as follows: (i) Axial data, where the orientation does not have a pointing direction, such as a line segment, perhaps in a binary image. Axial data may have an angular range of 0-180". This is the situation with much of the intensity gradient analysis discussed so far. In many cases the estimates of orientation at all pixels can be considered as having equal value, i.e., each direction is represented by a vector of unit magnitude. Alternatively, in gray-scale images, the magnitude of the vectors specifying orientation may also be used to weight the resulting analysis. (ii) Directional data, where the orientation points in a particular direction. A set of directional data may have magnitude (i.e., vector-type data), or it may not, in which case the vectors can be considered as unit magnitude. Directional data will have an angular range of 0 to 360". (iii) Two-dimensional data, such as vectors determined from a twodimensional image. (iv) Three-dimensional data, such as might be acquired from a multidimensional image or the real world.
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
279
The two cases of dimensional data may be axial or directional in form, leading to slightly different treatment in some cases. In the analysis of orientation data from images, concern centers primarily around two-dimensional data, although brief reference will also be made to three-dimensional statistics, which may be of increasing use for specialized image analysis tasks. 2. Presentation of orientation data The inherent ambiguity in orientation data is that they are dependent upon the origin chosen, i.e., the reference direction from which all other angles are measured. There are essentially two origin points which are commonly used in describing orientation data: 1. the conventional +ve Y-direction, which is expressed as the “North”. This is widely used in the environmental sciences, earth, and biological sciences, mainly because directions are measured in the environment with reference to the north magnetic pole or geographic reference system. This convention is used here, because the samples that we deal with are from the natural environment and commonly have to be related to a geographic reference system (Fig. 22a). 2. the conventional +ve X-direction. This is commonly used in engineering or more classic science areas (Fig. 22b). Graphical presentation of 2-D orientation data may be sufficient for analyzing the directions from a group of samples such as the radial histogram (or rosette diagrams) shown in Fig. 10. Axial data may be displayed as a full 0-180” rose, or mirrored about a line drawn through the origin to generate a full set of results in the range 0-360”. Some researchers prefer to use a standard histogram (e.g., Smart and Leng, 1993), but although variations in the proportions of vectors in particular directions are clearly seen, it is less easy to relate the information to the physical direction than with rosette diagrams. Vector-type data may be shown so that rather than treating each vector as having unit magnitude, the frequency of the histogram class corresponds to the sum of the magnitudes of the vectors in that angular class. In the case of the rosette diagrams, this corresponds to the length of the radius vector. Examples of the use of weighted rosette diagrams have been shown in Fig. 111. Although simple weighting of the rosette diagram by the magnitude is possible, the vectors may also be weighted by some other function of magnitude. There is an advantage in doing this in that the results are then weighted according to the brighter and more contrasting features which are prominent in qualitative interpretation; however, such weighting may well be what one is attempting to avoid. Simple weighting of the rosette
280
N. KEITH TOVEY el a/.
0
90
FIGURE22. Axis conventions in orientation analysis. (a) Convention used in earth and environmental sciences with 0" pointing vertically upwards, and directions measured clockwise; (b) Convention used in other sciences-reference direction points towards positive X-axis and angles are measured counterclockwise; Figures (c) and (d) illustrate the problem of axial data where vectors specify direction only between 0" and 180". If vectors are plotted in a semicircle, then different resultant vectors are obtained depending on the origin used. If magnitude of all angles is doubled first, then the correct resultant vector can be computed (direction of this vector is then halved at end to conform to convention used).
diagrams according to magnitude means that the results are dependent on the actual brightness and contrast of the image, whereas the results are independent when unweighted data are used. If weighted data are required, then an improvement on raw weighted data would be to normalize the magnitude image and use the normalized function for weighting purposes. With the development of interest in three-dimensional image analysis, methods are needed to display such data. The results from three-dimensional analysis are more involved and are normally displayed on stereographic projections (Cheeny, 1983). These can also be modified, by additional manipulation, to display magnitude information. Unfortunately, graphical off-the-shelf software packages rarely come with means of displaying orientation data. 3. Statistical description of two-dimensional orientation data Calculation of the arithmetic mean of orientation data, particularly when this is limited to the range 0-180", as it will be in many cases, is not a suitable procedure as it depends upon the choice of the origin (Rock, 1988). Description must be based on vectorial properties, and these have been described extensively in Watson (1966), Mardia (1972), Cheeny (1983), and
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
28 1
Rock (1988). Circular analogues of the familiar linear statistics can be derived. These include parameters such as the mean, median, mode, variance, skewness, kurtosis, etc. The circular mean 6 is defined by Mardia (1972) and may be derived by splitting the vector representing each orientation point into its components in the X - and Y-directions. Thus for the ith pixel, where the orientation is B i , the components parallel to the X - and Y-directions are cos Bj and sin B j , respectively. The components from all pixels are then summed separately:
C=
g, cos ei
S=
N
, sin ei N
and R = & T 3 . Hence,
6 = cos-'(C/R]
=
sin-'(S/R).
R is called the mean resultant length and is a measure of dispersion; it takes values between 0 and 1, and may be used as an alternative to the index of anisotropy when describing the strength of orientation of features within an image. High values of R are associated with small dispersions, i.e., the data are all of similar orientation, and small values of R with large dispersions. The value of 6 specifies the direction of preferred orientation and is an alternative and more robust way of specifying the direction compared to that computed from the principal axis of the ellipse as this alternative method does not rely on the data approximating to an ellipse. However, in tests on over 10,000 images of soil microfabric of consolidated kaolin, the preferred orientation directions computed separately from the resultant vector and the major axis of the ellipse never varied by more than 0.6" and the discrepancy was usually only 0.1-0.2". On the other hand, this was not the case where there were two dominant directions (Tovey and Krinsley, 1990-see also Fig. 12). The mean resultant vector R is uniquely related to the index of anisotropy I, (Smart and Tovey, 1991). However, though the range of values for both R and the index is the same (i.e., 0-1), except at the extremes, the value of R is always much less than I , for the range normally associated with real materials. There is thus some advantage in retaining I, for microfabric analyses provided that checks are made to ensure that the rosette diagram does indeed approximate to an ellipse (see Tovey et al. 1992a). Where wediteighting of data according to magnitude is required, the magnitude information mi can be incorporated by replacing cos Bj and sin Bi
282
N. KEITH TOVEY et ul
with mi cos Oi and mi sin O i , and N with C mi.Smart and Tovey (1982), and Smart and Leng (1993) have also referred to the mean resultant length R as the consistency ratio, a term dating from Reiche (1938), but this term is not widely used in the literature. The circular variance is defined as 1 - R. A measure analogous to standard deviation in Cartesian space is the circular standard deviation (CSD), which, according to Mardia (1972), is defined as -2 In (R)'12. Axial data (i.e., 0-180" range) is particularly troublesome in analysis. This is illustrated in Figs. 22c and 22d. In Fig. 22c, the vectors are plotted in the range from 0-180" with the origin pointing vertically upwards. If only these vectors are used, then the resultant vector will point in the direction R . If data are plotted using the X-axis as the origin, then the resultant vector points in a very different direction. To overcome this problem, several workers (e.g., Curray, 1956; Tovey, 1973c; Mardia, 1972; and Rock, 1988) have suggested pretreating the data by multiplying the values by 2 to obtain a range of angles which are suitable for analysis by Eq. (25) to obtain the circular mean. Once this has been done the result is then halved t o get the mean orientation (Mardia, 1972; Rock, 1988). In the axial case the circular standard deviation is defined as (-2 In (R)'12)/2. The normal distribution in Cartesian space does not have an exact equivalent for orientation data, but there are two distributions which possess the properties of the normal distribution. These are the von Mises distribution and the wrapped normal distribution (Mardia, 1972; Cheeny, 1983). The von Mises distribution is described by
where
e
where is the mean direction, k is the concentration parameter, and Zo(k)is a Bessel function (Mardia, 1972). Of particular significance to orientation analysis is to determine whether a distribution of data shows any statistically significant mean direction, i.e., whether the directions are statistically different from a random selection of directions. This is referred to as uniformity. If a set of directions is uniform then the mean direction has no real statistical significance. The standard parametric test for uniformity, which assumes a von Mises distribution, is the Rayleigh test. This is based on the value of the mean resultant vector length as defined in Eq. (24). Critical values of R are obtained from statistical tables (Mardia, 1972), or may be calculated directly from the
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
283
formula given in Rock (1988). The Rayleigh test is limited by the assumption that the orientations approximate to the von Mises distribution. Appropriate nonparametric, and therefore more powerful, tests for uniformity are the Kuipers test, and the Watson U 2test (see Cheeny, 1983 and Rock, 1988). Looking forward to domain-segmentation (as described in Section VI,C), the Rayleigh statistical test is also used in one method to segment the anglescoded image into a series of domains of subparallel particles. In this application it is necessary to check whether the orientation information within a mask of given circular radius is random. In this application as encoded to date, no specific check is made to determine whether the distribution approximates to the von Mises, although tests are available (Harvey and Ferguson, 1976). In some situations images may show more than one preferred orientation of feature edges (see Fig. 12), which will give rise to multimodal distributions. The statistical means for distinguishing a multimodal distribution from a single-moded distribution, have been addressed by Hsu et al. (1986) and Spurr (1981).
4. Description of three-dimensional orientation statistics Two-dimensional images frequently are taken as a stage in the analysis of orientation data which are in reality three-dimensional. The twodimensional approximation is useful but consideration must be given to the extension to three dimensions. There are two possible approaches here: One is to follow the example given in Tovey and Sokolov (1981) where they conducted intensity gradient analysis on three images taken from orthogonal planes in a sample of consolidated clay. They were able to show that the ellipses from the first two orthogonal planes were sufficient to predict the anisotropy present in the rosette diagram generated from the third plane. In hindsight the agreement here between just three images seems a little fortuitous as with modern facilities it has been possible to show that there is considerable variation in orientation in just short distances within a sample. Nevertheless the approach is still valid as the combined data from several images could be used. This approach has also been mentioned recently by Smart and Leng (1993). The other development toward three-dimensional analysis is in the use of confocal microscopes where images from several sequential planes of a sample may be captured. A brief summary of the extension of intensity gradient analysis to three dimensions is given in Section V,K. With three-dimensional orientation data, there will be both the orientation of the vector in the X - Y plane (0) and the additional
284
N. KEITH TOVEY et nl.
information of the angle between the X - Y plane and the vector (4, the dip), which can take values of -t 90". Methods of description of orientation data rely on first converting the 8, 4 value into direction cosines L , M and N ( L = cos 4 cos 8, M = cos 4 sin 8, N = sin 4). If the data points, when plotted on a stereographic projection (Cheeny, 1983), fall into a cluster, then it is reasonable to assume they form part of a Fisher distribution, which is the equivalent of the von Mises distribution in the 3-D case (Cheeny, 1983; Rock, 1988). The terms circular mean, concentration parameter, and mean resultant length in two-dimensions have their equivalent counterparts in three-dimensions. Normally the term circular mean is replaced by the term spherical mean. The three direction cosines are given by
The mean azimuth, 8 = tan-'(fi/L), and the mean dip, 4 = sin-'(N). More complicated distributions of vectors in 3-D space require the use of Bingham distribution statistics, which can describe clusters and elongated stringlike clusters of vectors in 3-D space (Mardia, 1972; Rock, 1988), in terms of three eigenvectors.
K. Extension of Intensify Gradient Analysis to Three Dimensions With the increasing use of confocal microscopes, there is the potential for true 3-D orientation analysis. Three-dimensional images may be generated as stacks of layers where each layer is separated from its neighbors by a distance comparable to the pixel spacing in the X - Y plane. Equation (12) showed how a generalized formula for the two-dimensional intensity gradient analysis could be developed from a double expansion of Taylor's theorem. The equation may be extended readily into three-dimensions as follows:
When expanded this becomes
285
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
Following the same procedure as for the two-dimensional case, the full expansion of Eq. (29) will allow the construction of a matrix equivalent to B (Eq. 17). In this case there will be 124 rows corresponding to the 124 pixels in the 5 x 5 x 5 cubic array. The total number of coefficients up to and including those of a particular order are 3, 9, 19, 34, 55, 83, and 119 for the 3rd, 4th, 5th, 6th, and 7th orders, respectively. In accordance with the pixel numbering convention used for two dimensions, the pixels are now numbered as shown in Fig. 23. Symmetric arrays may be obtained as in the
LAYER +2
LAYER + I
LAYER 0
I
I
49
I
93-116
I
LAYER - I
LAYER -2
FIGURE23. 3-D pixel numbering system The numbers are chosen in sequence so that symmetric arrays (in 3-D) are selected using numbers up to and including a given number. The pixels shown shaded form a spherical kernel and all are within d5 of the central pixel.
286
N. KEITH TOVEY et a/.
two-dimensional case by including points 1-6, 1-18, 1-26, 1-32, 1-56, 1-80, 1-92, 1-116, and 1-124. A particularly important arrangement is the collection of pixel 1-56 which gives a near spherical distribution (equivalent to the near circular 20,14 solution for two dimensions), but in this case it should be possible to obtain a fifth order solution (i.e., 56,55 using the same notation as for two dimensions). However, this, like the 20,20 formula for two dimensions, is indeterminate. A fourth order solution (56,34) is the nearest equivalent to the 20,14 two-dimensional formula, and coefficients for this arrangement are shown in Table XIV. At the time of writing one preliminary analysis has been undertaken on a confocal image (see Tovey, 1994) and further details of the method are to be reported elsewhere. One point to note is that solutions must be suitable to deal with rectangular pixels as the spacing in the third dimension will normally be TABLE XIV COEFFICIENTS FOR
a u a x FOR
THE FOURTH ORDER SOLUTION FORMULA
0
0
0 0
0 38 0
0
0
0 0 0 0
ANALYSIS
Layer + I
Layer + 2
0 0
56,34 FOR 3-D
0 0 -38 0
0
0
0
0
0
0
0
0
0 0
59 107 59
0 -36 0
0
-59 -107 -59 0
0
0
0 36
0
0
-38 107 156 107 -38
0 -36 61 -36 0
0 0
0
0 0 -38 0
0
0
0
0 0 0
Layer 0
0 36 -61 36 0
38 -107 -156 -107 38
0 0 0
0 0
Layer -1
0 0
36 0 0
0 -59 -107 -59
0
0 0 0
0 0
Layer -2 0 59 107 59
0
0
0
0
0 0 0 0
-36 0 0
0 0 38 0 0
0 0 0
0
0 0
The coefficients are shown in the five layers around the central pixel. The shape of the pixel array selected is close to that of a sphere, and this kernel is the equivalent of the 20,14 kernel for two dimensions. The coefficients for aI/ay are obtained by rotating each of the five layers separately through 90" counterclockwise. The coefficients for N/az may be obtained by moving the coefficient in row i and column j in layer k in aI/ax t o row k and column i of layer j .
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
287
very different even if square pixels are present in the individual layers of the confocal image. VI . ENHANCED ORIENTATION ANALYSIS-DOMAIN SEGMENTATION A . Introduction
Intensity gradient analysis is a powerful tool in its own right, not only for defining edges and orientations but also for deriving simple parameters to describe microfabric. However, it may also be used as the starting point in a segmentation of images based on feature orientation. This process is called domain segmentation. A domain may be defined as a collection of subparallel particles, or alternatively, as a region of an image which has essentially a uniform texture. In the discussion here, an orientation of the features will be implied. The particles may align themselves into a specific direction and in some instances behave as an integral unit in response to external factors such as stressing. It is thus of interest to find methods to automatically segment images into regions in which features have essentially the same orientation. There are essentially two methods whereby this may be achieved: (a) a more basic and faster approach which was developed earlier when computing power was a limitation; and (b) a more rigorous approach. Both methods involve the passage of a large radius filter across the angles-coded image to define the general direction, if any, of regions having a dominant orientation. The more basic method uses a modal filter within the mask area for discrimination and was termed top-contouring by Smart et al. (1990) and this term is still used (Smart and Leng, 1993), although it is more correctly termed the modal filter method. The more rigorous approach uses the mean resultant vector (i.e., a mean filter) as the basis of discrimination using the Rayleigh statistical test (see Section V,J,3). This method has been referred to as consistency ratio mapping (Smart et al., 1990; Smart and Leng, 1993) following the use of the term consistency ratio by Reiche (1938). Since the term mean resultant vecfor is in more general use than the term consistency ratio, albeit not in domain segmentation, it will be the former term which will be used throughout this section. The term domain segmentation will be used as the collective term for both methods when addressing issues common to both techniques. In this discussion, four basic orientation directions will be assumed, i.e., vertical, horizontal, and two directions inclined at 45". The aim is to divide an image into these four basic orientation directions, and include, if
288
N. KEITH TOVEY
el al.
necessary, a fifth class where there is no dominant orientation. The reasons for adopting just four orientation classes are in order to simplify presentation in this paper. Normally, 8, 12, or 16 orientations are used, but displayed images are confusing unless they can be displayed in color. Domain segmentation by either method starts with the angles-coded image derived from intensity gradient analysis. Over this image is passed a large radius mask, and all the pixels within this mask are examined to see if there is any obvious orientation. If there is, then the central pixel is coded to the propriate orientation class, otherwise the pixel is coded with the value reserved for random areas. The mask is then translated by one pixel and the procedure repeated.
B. Domain Segmentation using a Modal Filter (top-contouring) The basic method first processes the angles-coded image so that all pixels having a value in the range 0-22.5" or 157.5-180" (i.e., either side of vertical) are coded 1 in the four-orientation direction case. If eight directions are used, the ranges for class 1 are 0-11.25" and 168.75-180" while other ranges will be relevant if 12 or 16 orientation classes are used. The next sector (class 2), will contain all the pixels having values in the range 22.5-67.5" in the four-orientation direction case. Class 3 will then relate to those pixels with angles in the range 67.5-112.5", and class 4 to those with a range between 112.5 and 157.5". In the examples shown the convention adopted follows the common use in the earth sciences where the reference zero direction points toward the upwards vertical (see Sectin V , J , l ) . For regions of an image where the angle is indeterminate arising from very low contrast, the pixel is initially coded as zero. Across this intermediate coded image is passed a large radius mask similar to the one shown in Fig. 24. A frequency distribution of the number in each class is constructed. In the example shown with a radius of 6 pixels, there are 8 pixels coded with a value 1 , 16 pixels coded with a value 2, 28 with a value 3, and 59 with a value 4. In this example the directions within the general class 4 clearly dominate as they represent over 50% of all pixels in the area. The central pixel is now recoded to a value corresponding to this class value. In the example shown, this pixel remains the same value, but in most cases it will not. Where two or more classes are present in approximately equal proportions, a value of 5 is coded to the central pixel representing a random area or one in which there is no dominant direction. For four-direction segmentation, the resulting image is coded as shown in Table XV. When displayed on the screen it is convenient to have each direction as a separate color and those used in the authors' laboratory are shown in column 3 of the table.
IMAGE ANALYSlS WITH ORIENTATION ANALYSIS
2 2 2 1
2 2 2 1
2 2 2 2
2 2 2 2
2 2 2 2
4 4 2 2
4 4 4 4
3 2 2 2
3 3 4 4
3 4 4 4
3 4 3 3
3 3 3 3
3 3 3 3
3 3 3 2
3 2 2 3
2 2 2 2
2 4 4 4
1 4 4 4
4 4 4 4
4 4 4 4
4 4 4 4
4 4 4 4
4 4 4 4
4 4 4 4
4 4 4 4
4 4 4 4
3 4 4 4
3 3 3 3
3 3 3 3
3 3 3 3
3 3 2 3
2 2 3 3
289
Fic,uRE 24. Illustration of the large radius modal filter used in the approximate method for domain segmentation. Each pixel is first coded according to a general direction (four directions in this case); then those pixels falling within the mask are examined and coded to the modal class provided this conforms to certain criteria.
While this description implies that an intermediate image is generated, this is not the case for the implemented version as the intermediate coding and the classification may be done in a single pass of the image. The criteria for selecting whether or not a particular masked region had a dominant direction was initially selected somewhat arbitrarily. Two points are important here. First, there must be sufficient points in the masked area TABLE XV
CODING OF
PIXELS USED FOR FOUR DIRECTION DOMAIN
SEGMENTATION TOGETHER WITH COLORS USED IN AUTHORS’ LABORATORY FOR DISPLAYING OUTPUT IMAGE.
General direction Vertical Bottom left to top right Horizontal T o p left to bottom right Random
Pixel code 1
2 3 4 5
Color
red green blue yellow turquoise
290
N. KEITH TOVEY et al.
which are nonzero, and second, the modal class frequency must exceed a given theshold (t). For a four-direction domain segmentation, the modal class must be above 25%, and initial work (e.g., Smart et al., 1990; Tovey el al., 1992a, b) use a simplified formula as follows:
t
=
(100
+ e)/400
(30)
More elaborate tests could be chosen which examined the distribution in classes at each position of the mask, but this takes time, and in early implementations, it was desirable to optimize the timing of the algorithm, and a global test as defined by Eq. (30) was used. The value of e could be chosen arbitrarily, but it is dependent on the number of pixels in the mask (and hence its radius). In the case illustrated here the value can theoretically take any number from - 100 to +300. If the value of e is negative, then the modal class will automatically be selected, and there will be no regions coded as random. As the value of e increases, the proportion of areas coded as random increases. Figure 25 shows examples of varying the value of e over a range from -100 to +loo. Between - 100 and 0 there is no change and no random areas are present. The extent of the latter areas grows as e increases such that by the time e = 100, the whole image is coded as random. Unlike this model filter method, the more rigorous, mean filter, method involves the computation of the mean resultant vector and the use of a uniformity test (i.e., the Rayleigh test-see Section V,J,3), and the question of the choice of a value for e does not arise. With limited computing power available at the time, however, the mean resultant method took 3-4 times as long to execute and it was important to explore more efficient ways of proceeding. Tovey et al. (1992b) thus carried out an extensive series of tests on six different images where they first measured the proportion of random areas using the mean filter method (using the 95% significance level). Thereafter, they used the faster modal filter method but varied the value of e over the full range from - 100 to +300 in the four-direction case. The area covered by the random areas for each value of e was computed and compared with the corresponding area from the mean filter approach. The value of e when the computed areas for the random regions was the same by both of the methods was noted. For all six images, this critical value of e turned out to be 10 for the case of four directions and a mask radius of 19. The value of e must be determined for each radius, for each group of orientation classes, and where relevant, for each level of significance. Using a similar approach and using the same formulation for convenience, values of e = -41 and -59 were found to be the most suitable for 8-direction analysis and 12-direction analysis, respectively.
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
29 1
FIGURE25. Effect of choice of parameter e. (a) Original image; (b) e = - 100 (the minimum); (c) e = 0; (d) = 10 (closest approximation to Rayleigh statistical method; (e) e = 25; (f) e = 100.
The previous empirical formula can be modified to deal directly with varying numbers of directions, and Smart and Leng (1993) indicate that e may be computed from
292
N. KEITH TOVEY et al.
e
=
=71d o g
(:)
d * sin(:)
,
where p is the probability, n is the number of pixels in the mask area, and d is the number of direction classes. Smart and Leng (1993) indicate that this is an approximation of a simplified statistical probability formula used to determine R , the mean resultant vector (consistency ratio), expressed as R > J { I / N JIn ( ~ / p ) ,
(32)
where N i s the number of pixels in the mask area which have valid values (i.e., in the intensity gradient analysis, the magnitude is greater than the cutoff threshold), a n d p is the probability of obtaining a greater value of the mean resultant vector magnitude (consistency ratio) by chance (e.g., 0.05, etc.). It should be noted that Eq. (32) is only valid for Ngreater than about 15. Other more complex formulae are available, but in practice only a few levels of significance are likely to be used (e.g., 90%, 95%, and 99%), and one should use a lookup table of values specifically obtained for these levels. At the 95% level, Eq. (32) becomes R > (3/N)”2 for N > 15, while at 99’4’0, it becomes R > (4.61/N)”2 for N > 15. As the mask is passed over the image, a histogram of the frequency of the different angular classes is built up. Since with large radius filters, the majority of pixels are covered in the next position of the mask, a particularly efficient algorithm may be generated by storing the histogram from one mask position t o another, stripping off the values associated with the trailing edge of the mask, and adding the new values to their appropriate classes for the new position. In this way the number of computations is reduced from a value which is proportional to the square of the radius to one which is proportional to twice the diameter. A single algorithm to define the shape of the circular mask is available if 0.4 is added to the radius when generating this mask. This algorithm was first suggested by Smart (1987) and gives a good approximation to the circle. The current authors have found that an even better approximation to a circle, irrespective of radius, is achieved using the above algorithm if only odd-valued radii are used for the mask.
C. Domain Segmentation using the Rayleigh Statistical Test(Consistency Ratio Mapping) The basic approach in using the orientation statistics is to compute two separate histograms, one indicating the components of the vectors parallel to the X-direction and the other, the components parallel to the Y-direction.
IMAGE ANALYSIS WITH ORIENTATION ANALYSIS
293
A summation of the components is then made so that the mean resultant vector (R)within the mask area may be determined. The magnitude of this resultant vector is then compared either with statistical tables or with the value from a computed formula (see section V,J,3). If the statistical test (at the appropriate level of confidence) indicates that the direction is significantly different from uniform, then the relevant domain class to which the direction of the mean resultant vector belongs is coded t o the central pixel. The analysis using this method does not rely on empirical values for the threshold number of valid pixel points or a threshold chosen for significance. Since only integer values of orientation are present it is sensible to generate a single lookup table for the sine and cosine of each whole degree just once at the start of the analysis.
D. Domain Segmentation Weighted According to Vector Magnitude Just as both weighted and unweighted computations of the index of anisotropy or the mean resultant vector may be made (see Fig. 111 and Section V,J,3), domain segmentation by either method may be done using vectors weighted according to their magnitudes rather than to their treatment as unit vectors. Using weighted vectors, the more contrasting features become more dominant in defining domains. It is also possible t o do domain segmentation using only vectors having a particular range of magnitudes, and there may be some advantage in images where there are several groups of features, each with a different brightness range. Using the pixels which have the highest magnitude values would generate a domainsegmented image relating to the brightest features. E. Choice of Radius in Domain Segmentation While the choice of values for the variable e for the modal filter method are arbitrary, they can nevertheless be derived empirically by comparing them with the values derived from the rigorous statistical analysis. However, the choice of radius is also a key parameter and would seem somewhat arbitrary. Figure 26 shows a test image where the analysis has been conducted using varying radii from 7 up to 39 pixels in radius. For smaller radii, the image is segmented into many domains, but beyond a radius of 20 pixels, the image changes little. The big question is what is the correct radius. In general, the smaller domains are lost as the radius increases, but in some situations larger irregularly shaped domains may split into two or more parts, thereby reducing the mean domain size as the radius increases. A series of tests using six different images was conducted where the radius
294
N. KEITH TOVEY et a/.
FIGURE26. Effect of radius of mask on domain segmentation. (a) Original image; (b) Radius 7 pixels; (c) Radius 11 pixels; (d) Radius 19 pixels; (e) Radius 29 pixels; ( f ) Radius 39 pixels. All segmentations were done using Rayleigh method.
used in the analysis was progressively increased. The results were reported in Tovey et al. (1992b), but may be summarized as follows: The number of domains in each image was high for small radii, and at first decreased
rapidly for all images, but changed little after about 15 pixels. Conversely, the mean area of each domain rose rapidly for small radii and then became almost constant. Based on this and a parallel set of investigations by Smart and his co-workers at Glasgow, a value of 19 or 20 pixels radius was chosen as being most suitable for the type of image in hand. This radius also gave a segmentation which was judged to be realistic by a group of experienced microscopists. Smart and Leng (1993) report a test similar to that just described but using a magnification of 400x. For this test they indicated a radius of 5, which would scale up to approximately 25 for the 2000x magnification used here. They also attempted to generate an error function during a complementary series of tests, this time using an image at a magnification of 2000x. They produced a diagram which was sharply pointed toward a minimum at a value between 15 and 20 pixels and suggested that this confirmed the choice of a 20-pixel radius for the domain segmentation. The choice of optimum radius is clearly dependent on the effective resolution of the image, and in the case of microscopic images this relates to the magnification. It also depends on the nature of the material itself, and for a coarser grained material it may well be relevant to use a larger radius filter. Although the pixel radius of 19 or 20 seems reasonable for the material studied, and predictions may be made for the comparable radius at other magnifications, there needs to be careful consideration of the radius when other types of material or image are studied. Even though subjective evidence points to mask radii around 20 pixels for much microfabric work, it is not possible to generate a test standard image with features in just a restricted number of orientations to define domains, as this would automatically predetermine the result. Ultimately the choice of radius must be related to physical reality. By definition, a domain is a collection of subparallel particles. In many microfabric studies magnifications are chosen so that the feature spacing varies from 3 or 4 pixels up to large distances, but with the majority of spacings in the range of 5-10 pixels. The radius of the mask must be sufficient to cover at least two particles, otherwise aggregation into domains will not be seen. The choice of around 20 pixels thus seems reasonable. One possible approach is to standardize the radius with reference to a standard feature size. This may be done by thresholding a standard set of images to produce a binary image by a method similar to that described in Section VII,B. It then becomes a simple matter to compute the average horizontal or vertical intercept size of features and voids in this image. The procedure is repeated with any new type of image, and the radius chosen in the new image will then be a simple ratio of the average intercept sizes measured (i.e., of the new images to that of the reference set) multiplied by
a standard radius which has been tested on a standard set of images. This approach firmly relates the mask size to the physical reality of feature size.

F. Presentation of Domain-Segmented Images

Domain-segmented images have a restricted range of gray-scale values and may be displayed in the normal way, but this makes interpretation of the output difficult even when the original can be placed alongside. Six methods to assist in display are available; they are shown in Figs. 27 and 28 and may be stated as follows:
(i) Only the outlines of the various domains can be displayed (Fig. 27b).
(ii) Lines between the domains can be overlain on the original image (Fig. 27c).
(iii) The original gray-scale image can be reduced in contrast (say, to 0-49), and a new output image generated with class 1 areas shown with a gray scale of 0-49; class 2 areas can then be shown within the range 50-99, class 3 in the range 100-149, class 4 in the range 150-199, and the random class 5 as 200-249. If a special color lookup table is generated, then each class can be displayed as a gray scale tinted by an appropriate color. This represents by far the best way to display domain mapping. A poor representation of this color display is shown in Fig. 27d.
(iv) The different domains may be displayed in different shades of gray (or color on the screen, Fig. 28a).
(v) Shaded lines can be drawn on the gray-scale domain-segmented image to highlight the direction in each domain (Fig. 28b). Random areas are shown by a series of dots.
(vi) Selected class areas may be highlighted, leaving the other areas as blank images (Figs. 28c and 28d).

FIGURE 27. Methods to display domain segmentation. (a) Original image; (b) Outline of domains; (c) Outlines overlaid on original; (d) Method used with special color lookup table (e.g., vertical domains are colored in various shades of red, while horizontal domains are in various shades of blue, etc.; see Table XV).
FIGURE 28. Additional methods to display domain-segmented images. (a) Regions delineated in various shades of gray (color is useful on the screen); (b) As in (a) but with shading to highlight the directions; (c) Delineation of domain class 1 only (vertical); (d) Delineation of domain class 4 only (top left-bottom right).
G. Some Practical Points about Domain Segmentation

All domain-segmented images show, to a greater or lesser degree, an amount of oversegmentation into small regions which may be only a few pixels in size. The original definition of a domain implied a collection of features, and separate features must be spaced by several pixels, otherwise they would be within the same overall domain. Some form of filtering is thus important to remove the isolated pixels: either a second pass of domain segmentation (using a slightly modified algorithm, as the pixels are already coded into a class rather than original angles) may be used, or, alternatively, a small kernel (e.g., 5 x 5) median filter may be applied to the image. This greatly improves the segmentation without losing the overall structure. The edge of the image cannot be correctly processed in domain segmentation, and in many cases the output image will be reduced in size by an amount equal to the diameter of the mask. Where this is important, a boundary region of an image of size m x n may be reflected outwards to generate a starting image (m + 2r) x (n + 2r) pixels in size, where r is the mask radius. This procedure is to be preferred over a simple replication of pixels at the boundary.
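The two post-processing steps just described can be sketched as follows (Python/NumPy with SciPy); the 5 x 5 kernel and the use of scipy.ndimage.median_filter are illustrative choices rather than the authors' SEMPER implementation.

```python
import numpy as np
from scipy.ndimage import median_filter

def reflect_pad(image, mask_radius):
    """Reflect the image boundary outwards by the mask radius, so that an
    (m x n) starting image becomes (m + 2r) x (n + 2r) and the edge pixels
    can be processed; reflection is preferred over simple replication."""
    return np.pad(image, mask_radius, mode="reflect")

def remove_isolated_pixels(domain_image, kernel=5):
    """Small median filter (e.g., 5 x 5) applied to the class-coded
    domain-segmented image to strip isolated pixels and oversegmented
    fragments without losing the overall structure."""
    return median_filter(domain_image, size=kernel)
```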
H. Relationship between Domain Segmentation and Index of Anisotropy

Since the domain-segmented image is derived from the angles-coded image, there may be a relationship between the number of domains defined and the index of anisotropy. Figure 29 clearly shows that there is a strong linear relationship, with the number of domains declining as the index rises.
I. Extensions to Domain-Segmentation Techniques

There are several possible extensions of domain segmentation which may be applicable in some circumstances. Firstly, the mean resultant vector (consistency ratio) computed at each pixel within the defining mask could be encoded as two additional images. One is the magnitude of the vector, and the second, the direction of the vector. The former image would allow the strength of orientation across the image to be presented. The information so displayed would be independent of orientation and might be useful in a simple textural segmentation. Regions with a high value would indicate a high degree of orientation (and possibly packing) and these could be separated from those with low values. It is also probable that regions within the centre of a domain would have the highest magnitude of the mean resultant vector. Such maxima might be used as the basis of seed
FIGURE 29. Variation in index of anisotropy with number of domains for images segmented with eight orientation directions.
points in some applications to remove oversegmentation in watershed algorithms (see Beucher, 1992, for a discussion of the use of seed points to reduce oversegmentation). From the second image, giving the direction of the mean resultant vector, a revised overall orientation parameter (e.g., an index of anisotropy of mean resultant vector length) for the whole image could be computed. This computed value would tend to emphasize orientation to a greater extent than that computed from the index of anisotropy or the mean resultant vector. Once the domain-segmented image has been obtained it may be combined with the angles-coded image so that the strength of alignment within each domain class can be examined separately. Thus the strength of orientation may be greater in one general direction than another, but this fact may go unnoticed in the earlier analysis described in Section V,J. In addition, it becomes possible to examine how the overall shape of the domain relates to the alignment of the constituent features within the domain. It is a simple matter, from the domain-segmented image, to separate each orientation class into a separate binary image so that the size and shape of each individual domain may be computed using standard feature statistical parameters available with most image processors. Some domains may be very irregular in shape, and some may touch all four sides of an image; hence, some feature statistical parameters, such as shape or Feret diameter, must be treated with caution. On the other hand, the overall area distribution of domains will not be unrealistic.
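As a sketch of this last step, each orientation class can be separated into its own binary image and simple per-domain statistics computed; the connected-component labelling used here (scipy.ndimage) stands in for the feature-analysis packages mentioned above.

```python
import numpy as np
from scipy import ndimage

def domain_class_statistics(domain_image, n_classes):
    """Split a domain-segmented image into one binary image per orientation
    class and compute basic per-domain statistics (here just the area of each
    connected domain); shape measures such as the Feret diameter would be
    computed in the same loop with a feature-analysis package."""
    stats = {}
    for cls in range(1, n_classes + 1):
        binary = (domain_image == cls)
        labelled, n_domains = ndimage.label(binary)      # connected domains
        areas = ndimage.sum(binary, labelled, index=range(1, n_domains + 1))
        stats[cls] = {"n_domains": n_domains, "areas": np.asarray(areas)}
    return stats
```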
VII. APPLICATIONS OF ORIENTATION ANALYSIS

A. Introduction

The techniques for orientation analysis described in Sections IV-VI may be used as part of a wider process of analysis, particularly in the area of microfabric. In other areas, correct delineation of edges is important in identifying linear features (Swift, 1992), while robust edge-detection algorithms are also needed as part of an integrated package which also involves watershed algorithms. The edge-detection routines may be used to assist in removing oversegmentation lines from the watershed image. There is extensive literature on many of these techniques, but the application of the orientation analysis to other areas of microfabric analysis is much less well developed, and this section will concentrate on some of those applications for which orientation analysis is an important stage in an overall sequence of analysis.

FIGURE 30. Flow chart for the combined porosity and orientation analysis: 1) Image Acquisition; 2) Compute Fourier Transform; 3) Search for Interference Peaks and Mask; 4) Determine Signal-to-Noise Ratio; 5) Image Reconstruction using Wiener Filter; 6) Compute Relative Contrast Histogram; 7) Threshold Image; 8) Porosity Computations of Whole Image; 9) Orientation Analysis and Domain Segmentation; 10) Porosity Computation for Each Domain Class.
Two very different applications will be considered here. The first involves analysis of porosity and combines an initial segmentation of a gray-scale image into a binary image to allow estimates of porosity to be made (i.e., the proportion of black and white in this image). By combining this analysis with domain segmentation it is possible to investigate variations in this porosity in different directions. A flow diagram of the processes involved is shown in Fig. 30. A description of the stages is given more fully in Section VII,B. To summarize, stages 1-8 involve the image capture, image segmentation, and the basic porosity analysis, while stages 9 and 10 combine the domain segmentation methods to extract extra information. The second application is the combination of orientation analysis based on intensity gradient analysis with that derived by more traditional feature analysis statistics. For this process, it is often necessary to divide the image into two component parts. One component contains those regions of the image which are associated with the larger features and which are suited to traditional feature analysis, for which intensity gradient analysis is not as well adapted. The other component relates to the fine-grained matrix, for which intensity gradient analysis is the best approach. An example using multispectral analysis as the basis of the initial segmentation is included. This combines information from several different images of the same area to achieve an optimum segmentation as a preliminary processing step to orientation analysis. The full procedure is shown in Fig. 31 and the separate stages are covered in Section VII,C.
B. Orientation Analysis Combined with Porosity Analysis

Two methods are available for using orientation analysis, and in particular domain segmentation, as part of an overall analysis of porosity within samples. Firstly, the images may be segmented by the selection of a suitable threshold and then combined with the domain-segmented image. Secondly, there is a less precise method which nevertheless can separate groups of images into high and low porosity based on the relative distribution of gray levels within each domain class. Both methods will be considered.

1. Binary Image Method
Porosity analysis requires that the image be segmented into two or more discrete components, one representing the voids and the others representing various classes of feature. In most situations this will represent a simple binary image. For a good discrimination between the two components, ideally there should be a single gray level below which the value of all pixels
FIGURE 31. Flow chart for image analysis combining multispectral techniques with orientation analysis. The stages include: digital image and X-ray map acquisition; image selection and stacking; training area selection; edge detection from the BSE image; identification of large voids (relative contrast histogram method); analysis of mineral size/shape for each mineral class; determination of the orientation of the matrix; domain mapping; domain area/size/shape; and computation of orientation statistics.
are voids, and above which the values are all solid features. This situation is rarely the case even when enhanced contrast is used during image acquisition. The best situation that can normally be expected is a bimodal distribution where there is a peak frequency on the histogram of intensities corresponding with the voids and a similar peak corresponding with the
solid parts. Segmentation into a binary image in this case is relatively straightforward, as all that is required is to determine the minimum value between the two peaks. In most images of microfabric, the situation is more complex and there is rarely a unique minimum; indeed, it is rare to have an image with a bimodal distribution of gray levels. Figure 32 shows the actual histogram of gray levels from Fig. 2b. This histogram is clearly unimodal. The situation is complicated by the fact that all images have an inherent degradation imposed upon them by the imaging system. This is particularly true for the scanning electron micrograph images shown here, where the degradation is a function of both the imaging system and the specimen-beam interactions. In the procedures discussed here the images were reconstructed using a Wiener filter to obtain the improved image shown in Fig. 33. The full details of the procedure adopted for these images have been described in Hounslow and Tovey (1992) and Tovey and Hounslow (1994), but the following brief explanation will suffice for this paper: The point spread function (PSF) for the microscope at the operating magnification was measured by observing the intensity distribution in several directions across some very fine bright particles. The change in intensity was found to approximate to a Gaussian distribution having a root-mean-square radius of approximately 1.5 pixels. A signal-to-noise ratio was also estimated for each batch of samples by observing intensity variation over an otherwise blank specimen. Essentially the Wiener filter was
FIGURE 32. Histogram of gray levels in Fig. 2b. In most images of microfabric the histogram is unimodal, making it difficult to select a correct threshold to separate the image into two phases, and more advanced routines must be used.
computed from the formula suggested by Rosenfeld and Kak (1982) as follows:
Y(u, v) = \frac{H^{*}(u, v)}{|H(u, v)|^{2} + 1/\mathrm{SNR}^{2}},    (33)
where Y(u, v) is the computed Wiener filter, |H(u, v)|^{2} is the power spectrum of the point spread function (PSF), H^{*}(u, v) is the complex conjugate of the Fourier transform of the PSF, and SNR is the signal-to-noise ratio of the image. Once this function has been computed it is a simple matter to obtain the Fourier transform of the reconstructed image, F(u, v), from
F(u, v) = G(u, v) Y(u, v),    (34)
where G(u, v) is the Fourier transform of the actually acquired image. While the reconstructed image may be obtained from F(u, v), some serious problems may arise if high-frequency regular noise is present across the image. This is often seen in scanning electron microscope images and arises from electrical interference. It is important that such noise is filtered before the application of the Wiener filter. A convenient way to do this is to search the Fourier transform of the original image for high-intensity peaks at large radii from the zero-order reflection. These peaks may be located automatically and masked with a small circular mask prior to the application of the Wiener filter. Hounslow and Tovey (1992) and Tovey (1994) show examples of the nature of this noise, which often has the appearance of moiré fringes. The improvement in the resolution of detail in the reconstructed image is clearly seen in Fig. 33. Theoretically, the Wiener filter assumes an approximately linear addition of noise in the image. Strictly this is not correct mathematically, and other more involved algorithms involve nonlinear solutions to the problem (e.g., Razaz et al., 1993). Nevertheless the Wiener filter does improve the quality of the images considerably. The reconstructed image appears more amenable to direct binary segmentation by selecting a threshold as described previously. However, the gray-scale histogram is still unimodal, and thus unsuitable for simple treatment. Several algorithms for segmentation have been proposed; some use edge detectors or Laplacian operators or both (e.g., Kohler, 1981; Haralick and Shapiro, 1985; Sahoo et al., 1988; and Haddon and Boyce, 1990). Some of these methods have proved particularly effective in applications such as the recognition of handwriting against a pictorial background (White and Rohrer, 1983), but such techniques are generally unsuitable when there are many features of varying size present in the image.
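As an illustration of the restoration in Eqs. (33) and (34), the following minimal sketch applies a Wiener filter to a single image; the Gaussian PSF (RMS radius of about 1.5 pixels, as measured above) and the scalar SNR are illustrative assumptions, and the masking of interference peaks described above is omitted.

```python
import numpy as np

def wiener_restore(image, psf_rms=1.5, snr=30.0):
    """Restore an image with a Wiener filter, assuming (for illustration) a
    Gaussian point spread function of the given RMS radius and a scalar
    signal-to-noise ratio estimated for the batch of images."""
    h, w = image.shape
    y, x = np.indices((h, w))
    # Centred Gaussian PSF, normalised to unit sum, shifted so its centre
    # lies at the (0, 0) frequency origin.
    g = np.exp(-((x - w // 2) ** 2 + (y - h // 2) ** 2) / (2.0 * psf_rms ** 2))
    psf = np.fft.ifftshift(g / g.sum())
    H = np.fft.fft2(psf)
    # Eq. (33): Y = H* / (|H|^2 + 1/SNR^2)
    wiener = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr ** 2)
    # Eq. (34): F = G * Y, then invert to obtain the restored image.
    F = np.fft.fft2(image) * wiener
    return np.real(np.fft.ifft2(F))
```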
FIGURE 33. Image in Fig. 8a restored using Wiener filter.
Some local consistency information is usually needed to assist in the segmentation and in keeping track of what is void and what is solid. In the case of the apparently complex form of handwriting used by White and Rohrer in their example, it was the width of the characters, which was approximately constant, that could be used in the decision-making process. Several algorithms have been explored, and the most suitable found so far has been the relative contrast histogram method first proposed by Kohler (1981). This is an attractive algorithm as it does not presuppose a particular value at which thresholding should start. In theory, a search could be made over the full range of gray-scale values from 0-255, but usually a somewhat narrower range can be specified, which will speed up the calculation. The details of this method are described in Hounslow and Tovey (1992) and Tovey et al. (1994b), but essentially the method examines the relative gray-level differences between adjacent pairs of pixels (either vertically or horizontally) and a selected threshold level. The maximum difference will occur for a selected threshold level midway between the values at the two pixels. The procedure is
repeated at all adjacent pairs of pixels within the image to generate an aggregated histogram, the maximum of which indicates the optimum gray level at which thresholding should take place. Examples from using this algorithm on both the unrestored image (Fig. 8a) and the restored image (Fig. 33) are shown in Fig. 34. The advantages of using the restored image are clearly apparent.
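One possible reading of this procedure is sketched below: every horizontally or vertically adjacent pixel pair credits each candidate threshold lying between its two values, with the largest credit at the level midway between them, and the threshold with the greatest aggregated credit is selected. The exact credit function used by Kohler (1981) may differ; this form is an assumption for illustration.

```python
import numpy as np

def relative_contrast_threshold(image, t_min=0, t_max=255):
    """Aggregate a relative-contrast histogram over all horizontally and
    vertically adjacent pixel pairs and return the gray level at which the
    aggregated credit is largest (the candidate binary threshold)."""
    img = image.astype(np.int64)
    hist = np.zeros(t_max + 1)
    # Horizontal and vertical neighbour pairs.
    for a, b in ((img[:, :-1], img[:, 1:]), (img[:-1, :], img[1:, :])):
        lo = np.minimum(a, b).ravel()
        hi = np.maximum(a, b).ravel()
        for t in range(t_min, t_max + 1):
            sel = (lo <= t) & (t < hi)          # pairs "cut" by threshold t
            if sel.any():
                # Relative contrast of each cut pair; maximal when t lies
                # midway between the two pixel values.
                hist[t] += np.minimum(t - lo[sel], hi[sel] - t).sum()
    return int(np.argmax(hist))
```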
FIGURE 34. Binary images generated from Fig. 8a. (a) This is the best that can be achieved using the original image; (b) The binary image generated after image restoration. There is much more detail in this than in the unrestored image.
Hounslow and Tovey (1992) analyzed nearly 2000 images to determine the porosity from image analysis using reconstructed images and the relative contrast histogram method for segmentation. There was very good agreement between the values computed by this image analysis method and those derived using bulk moisture content measurements on adjacent parts of the samples. The advantage of image analysis is that variations in porosity over small distances within the sample may be investigated. Once the binary image of Fig. 34b has been obtained, an orientation analysis following the stages outlined in the flow chart in Fig. 3 may be conducted to generate a domain-segmented image, and this latter image may be combined with Fig. 34b so that porosity variations within different domain classes may be examined, an advantage which is not available using other techniques. A particularly efficient algorithm may be constructed assuming that the binary image B(x, y) is coded 1 for the features and 0 for the voids, and that the domain-segmented image D(x, y) is coded with values 1, 2, 3, etc., representing the different domain classes. These two images are multiplied together pixel by pixel (the operator (*) signifies pixel multiplication) as expressed by

S(x, y) = B(x, y) * D(x, y).    (35)
In S(x,y ) , the voids will remain with a zero gray-level, but the solids are coded according to the domain class to which they belong. Thus for a fourdirection segmentation, solid particles which are nearly horizontal will be coded 3 (see Table XIV). The negative (reversed contrast) version of the binary image B(x, y ) is generated (N(x,y ) ) , and this too is multiplied by D(x,y). Thus,
V(x, y) = N(x, y) * D(x, y).    (36)
Histograms of gray level for both the solids image S(x, y) and the voids image V(x, y) are generated. Let these be called H_s and H_v, respectively. For four-direction orientation analysis these histograms will have five classes, one for each direction and one for the random areas. It becomes a simple matter to generate a new histogram H_p which gives the porosity within each class as
H_p = H_v / H_s.    (37)
There should be no problem with a zero divisor in any class because, to generate a domain class in the first place, it is necessary to have solid features present. Use of such extended analysis has enabled the variation in porosity between directions to be examined.
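Equations (35)-(37) translate directly into a few lines of array arithmetic; the sketch below assumes integer-coded input images as described above.

```python
import numpy as np

def porosity_per_domain_class(binary, domain, n_classes):
    """Per-class porosity following Eqs. (35)-(37): `binary` is 1 for solid
    features and 0 for voids; `domain` is coded 1..n_classes (random class
    included) and 0 where no class was assigned.  Returns H_v / H_s for each
    class; a zero divisor should not occur, because a domain class can only
    be generated where solid features are present."""
    b = binary.astype(np.int64)
    d = domain.astype(np.int64)
    solids = b * d                           # Eq. (35): S(x, y)
    voids = (1 - b) * d                      # Eq. (36): V(x, y)
    h_s = np.bincount(solids.ravel(), minlength=n_classes + 1)[1:n_classes + 1]
    h_v = np.bincount(voids.ravel(), minlength=n_classes + 1)[1:n_classes + 1]
    return h_v / h_s                         # Eq. (37): H_p = H_v / H_s
```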
In the majority of the samples investigated by the authors, the porosity in the random areas was always greater
2. Gray-Level Porosity Method An approximation to the porosity in a domain has been suggested by Smart and Leng (1993) and assumes that the intensity at a given pixel is a simple function of the intensities of the pure voids and pure solids such that
where I_s is the gray level associated with the solid features, I_v is the gray level associated with the voids, and I is the gray level associated with a particular pixel. Though the calibration of the gray-level values to ascertain both I_s and I_v is possible in well-conformed images, this is not always possible. In the examples of soil microfabric shown in Fig. 8, the solid features are usually too fine to enable a reliable gray level to be set at the time of image capture. Indeed, in these images it was found that even though two standard gray levels were set for each group of images, the variation in gray level within the wholly solid and void standards could be as much as ±10 on a gray-scale range from 0 to 255. In such instances, it is not possible to use Eq. (38) to obtain reliable estimates of porosity. There are alternative, but less accurate, approaches to setting these values directly. If the specimen consists of only one material and voids, then the minimum intensity within the image will correspond to a void and the maximum to solid. Improvements are possible if a circular local mean filter is passed across the image first (Smart and Leng, 1993, suggest a radius of 6 pixels) before these estimates of maximum and minimum are made. Another approach is to use the actual calibration values (despite the problems mentioned earlier), and use filtering to minimize noise. Finally, it is possible to truncate, say, the bottom 2% and top 2% of intensities in the image, and use these truncation values for I_v and I_s. If the actual porosity is known from other sources (e.g., gravimetric analysis), then it is possible to improve the precision of the estimate of porosity. Despite these difficulties, it is always possible to examine the relative differences between one domain class and another provided that the domain-segmented image has been generated. Clearly this method is not as useful overall as the more rigorous method described earlier, but it has the advantage of greater speed. A particularly efficient algorithm uses the porosity image P(x, y) and multiplies each pixel
by the corresponding pixel in the domain-segmented image D(x, y) to generate a new image E(x, y). That is,

E(x, y) = P(x, y) * D(x, y) * (I_s - I_v),    (39)

where the operator (*) signifies pixel-by-pixel multiplication.
FIGURE 35. Gray-level porosity method of analysis. (a) Aggregated histogram over all orientation sectors; (b) Porosity computed from each sector in (a); (c) Method of displaying porosity results from several images. The orientation sectors are marked along the top while the sample numbers are down the side. The gray-scale convention shows more porous regions as dark (corresponding with the situation in the images), which, displayed in this manner, gives a rapid way to see differences in porosity, etc. (continued)
FIGURE 35-continued.
This computation must be in integer format, as the output gray-scale range (i.e., the number of channels in the histogram) will be from 0 to (I_s - I_v) x n, where n is the number of orientation directions in the domain segmentation (including the random areas). A histogram is now generated of E(x, y). All that is then required is to divide the histogram into n sections and work out the mean value in each of the sections of the histogram. An example of this modified histogram using eight-direction segmentation of the image in Fig. 2b is shown in Fig. 35a. In this example, I_s = 225 and I_v = 0. The nine sections included in Fig. 35a (including sector 9 for the random areas) show the intensity distribution within each sector. The area
under the curve in each sector is proportional to the total area covered by that domain class, while from the mean value within each class the porosity may be obtained. Thus for the nth sector the porosity P_n is given by

P_n = \frac{\sum_{r=s}^{t} (r - s) F_r}{(I_s - I_v) \sum_{r=s}^{t} F_r},    (40)
where r is the channel number, F_r is the frequency of occurrence in the rth channel, s is the channel number corresponding to the first channel of data from sector n, and t is the channel number corresponding to the last channel of data from sector n. Also, s = (n - 1)(I_s - I_v) and t = n(I_s - I_v) - 1. The porosities computed for the nine sectors are shown in Fig. 35b. Noteworthy are the two sectors (2 and 8) which have a significantly lower porosity than the other sectors. In many applications, it will be necessary to analyze several images from the same sample. A good method to display all the porosity information is shown in Fig. 35c. Here a resultant image is constructed which is n pixels wide by m deep (n refers to the number of domain segmentation classes, and m to the number of separate images). The first pixel in row 1 (working from the top) is coded with an intensity representing the mean porosity in domain segment 1 of image 1. Pixel 2 is coded to represent the corresponding information in segment 2, and so on. Successive rows show the related information from further images. In Fig. 35c, data from 18 separate images are shown. A low porosity is shown as a lighter shade in the figure (corresponding to the normal gray-scale image). The variations in porosity both between images and between segments within an image can readily be seen. Thus segments 2 and 8 in the chosen example in Figs. 35a and 35b are clearly anomalous when the whole set of images is considered.
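A sketch of the whole gray-level porosity computation, following Eqs. (39) and (40) as reconstructed above, is given below; the sector limits s and t are taken from the definitions in the text, and the rounding to integer channels is an implementation assumption.

```python
import numpy as np

def sector_porosities(porosity, domain, i_s, i_v, n_sectors):
    """Combine a porosity image with the domain-segmented image into a single
    integer-coded image E (Eq. 39), split its histogram into one section per
    sector, and return the mean porosity of each sector (Eq. 40)."""
    span = i_s - i_v                                       # channels per sector
    e = np.rint(porosity * domain * span).astype(int)      # Eq. (39)
    hist = np.bincount(e.ravel(), minlength=n_sectors * span + 1)
    p = np.zeros(n_sectors)
    for n in range(1, n_sectors + 1):
        s, t = (n - 1) * span, n * span - 1                # sector channel limits
        f = hist[s:t + 1]
        r = np.arange(s, t + 1)
        if f.sum() > 0:
            p[n - 1] = ((r - s) * f).sum() / (span * f.sum())   # Eq. (40)
    return p
```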
C. Orientation Analysis Combined with Preliminary Multispectral Processing

1. Introduction
Orientation analysis techniques developed using intensity gradient methods are ideal for extracting orientation information for the fine particles which are typically 5-10 pixels wide. Large particles, however, can cause a problem as often there will be little or no information that can be derived from the center of such particles. Alternatively, noise may be present within these large particles and may confuse the computation of indices of anisotropy, etc. On the other hand, unlike the fine-grained particles, the larger ones are particularly suited to traditional feature size analysis including feature area,
perimeter, shape, orientation, and so on. What is required is a means to separate the two groups of features from each other. There are possibilities using image reconstruction followed by binary segmentation and the deliberate removal of larger features. However, all such segmentation is prone to problems such as touching particles, and although routines do exist to separate such features, difficulties can still arise, particularly if there are varying shades of gray within individual particles and the number of features present is large. In some applications, multispectral information may be present. Thus in true color images, separate red, green, and blue images may be obtained. In satellite images, there are often seven separate radiation bands which can be used. In microscopy, different illumination conditions, including the use of ultraviolet, can provide an additional spectral band. In the case of electron microscopy, it is possible to acquire a range of X-ray images, each one corresponding to a separate element, at the same time as the normal backscattered image. Each separate X-ray image (and the normal image) may be treated as a different spectral layer, and techniques which are common in remote sensing applications may be used to classify the images into different categories. Several recent papers have described aspects of this, e.g., Tovey and Krinsley (1991) and Tovey et al. (1992c). The extension of the method to incorporate the orientation analysis described in this paper is covered in full in Tovey and Krinsley (1992) and Tovey et al. (1992d, 1994a).

2. Multispectral Processing of Images

The procedure for multispectral classification is well known in remote sensing, but has hitherto found few applications in image analysis. A set of images of different spectral bands of exactly the same area is needed, and these should be stacked to form a multilayer image. In some instances, one or more of the spectral layers may be obtained at different times or in different conditions, and in these cases the first stage in the processing is to ensure that there is correct registration of all features between the layers. This may require coordinate transformation. In the case of much of the work done by the authors, the back-scattered image is acquired at the same time as the X-ray maps, so there is no problem in registration between the layers. An example of a typical X-ray image showing 4 out of the 12 X-ray maps taken at the same time as the image shown in Fig. 2a is included in Fig. 36. Longer image acquisition times are desirable when X-ray images are captured (sometimes lasting for over 12 h when a 1024 x 1024 image is acquired with 12 separate elemental maps). Over such periods, a general change in dc brightness is quite possible, arising from changes in the beam conditions in the electron microscope. The authors overcome this problem by first
FIGURE 36. Selected X-ray maps of Fig. 2b. The ball-like region is clearly iron-rich.
acquiring a back-scattered image in the normal way (taking typically 30 s), and a second one at the time of capture of the X-ray images. Any differences in brightness between the two back-scattered images from one region to another arise from the beam instability, which will also affect the intensity of the X-ray images, and these latter images can be scaled appropriately. To avoid noise problems in this scaling, both back-scattered images are temporarily blurred using a moderate-radius averaging filter, and it is the ratio of the intensity values at each pixel in these blurred images which is used as a scaling for the X-ray images. Having checked that the various images are in register and any preliminary processing is complete, the next stage in the process is to define areas within typical larger features which are characteristic of that type of feature. In the case of X-ray mapping associated with electron microscopy of mineral grains, high concentrations of a particular element may be sufficient to specify unique characteristics for a particular feature. The absence of particular data in a spectral band is also important, as are combinations of information from the various spectral bands. In theory, it
should be possible to initiate the process using an unsupervised form of classification, but in the case of microscopy it is often quite complex, and some guidance by the operator is essential at present. It is a simple matter to delineate an area typical of a particular feature using the computer mouse. This process is repeated until all major feature types have been identified. In the case of soil minerals, it is standard practice also to specify as one feature the background matrix, which consists of particles which are only a few pixels in size. (This is usually related to a physically significant size, e.g., 2 μm for magnifications of about 400x upwards, as this relates to the clay-sized fraction.) An example of such feature delineation is shown in Fig. 37. In this example, two separate sections of the matrix were selected for training areas because the nature of the material within and outside the ball-like feature was so different. Region "A" was identified as quartz from the high concentration of silicon and absence of other materials, while potassium feldspar (region "B") was identified from the high concentration
FIGURE 37. Training areas used in delineating features in Fig. 2a. A-quartz; B-feldspar; C-chalk; D-rutile; E-magnetite (?); F-pyrite; G-matrix within aggregate; H-matrix outside aggregate.
of potassium combined with silicon and aluminum. The grain marked "E" had a high concentration of iron and moderate amounts of manganese. The data of intensities within each of the test areas, in each of the spectral bands, are then analyzed to generate a covariance matrix. Finally, using this information with the original stacked image, it is possible to obtain a probable classification of the various features present in the image (step 5 in Fig. 31). While it is possible to force a classification at all points, this is undesirable as selected minerals may have been missed in the initial identification. By allowing classification only if the region falls within a given range of the central group of that class, regions of uncertainty are left. These usually lie around the edges of the larger features, although sometimes complete features are left unclassified if an inadequate number of different classes were identified in the first place. Once skeletal classification has been achieved, it is possible to postprocess the image using the techniques outlined in Tovey et al. (1992d) to delineate the features correctly. Essentially this process involves the use of an edge-detection algorithm (see Sections IV and V) to highlight the edges of the key features. Usually this results in severe oversegmentation of the image unless a high threshold of the magnitude of the intensity gradient is set. However, using too high a threshold often loses other key information. In several examples used by Tovey et al. (1992d), a magnitude threshold of 10 was used initially, and the defined edges were then improved using mathematical morphological techniques such as dilation, erosion, and skeletonization. Even then, oversegmentation is present, but with the additional information from the raw classified image these oversegmentation lines can be largely removed. Once this has been done, the outlines of the features are known, and the features in the raw classified image are dilated until they fill the whole of the boundaries delineated. Essentially this postprocessing can be described by the steps shown in Fig. 38. The full procedure is described in detail in Tovey et al. (1992d). The final classified image is called the mineral-segmented image in the earth sciences and is shown in Fig. 39. As with the domain-mapped image, the display of such images is better in color, but the limitations of display here have been overcome using shading. Each feature class in the image is normally coded with a single gray level as shown in Table XVI, which also indicates the proportion of each feature present in the image. In most examples, voids are present, and these cause problems as they contain only embedding medium, whose elements of low atomic number cannot be detected by the system in use. Thus little X-ray information is available from such regions. It is normally more appropriate to use the relative contrast histogram method (see Section VII,B,1) to identify the voids from the solids in a separate processing step rather than rely on the
FIGURE 38. Postprocessing steps: edge detection on the BSE image using the 20,14 formula (step 6 of Fig. 31); skeletonization of the magnitude image (with possible additional dilation/erosion); definition of voids (step 7 of Fig. 31); generation of a transform matrix from the edge and void images and the raw classified image; final identification of each feature and dilation to fill the delineated boundaries; filling of holes in features and processing of multi-mineral grains; edge detection on the resulting image; dilation of the new magnitude image; and removal of oversegmentation lines (i.e., lines with a high degree of overlap between the two edge images).
multispectral classification methods. Once the segmented image is available, it becomes a simple matter to extract all particles of a given class for analysis using traditional feature-sizing packages. Of importance are parameters such as area, perimeter, shape, orientation, and lumpiness. Examples of analysis of the different parameters have been displayed in Tovey and Krinsley (1991), Tovey et al. (1992c), and Tovey (1994). The use of mineral segmentation thus allows the extraction of information from the separate mineral species, a result which is not possible by any other method.
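One common way of implementing this kind of supervised classification with a rejection option (step 5 in Fig. 31) is a minimum-Mahalanobis-distance classifier built from the training-area means and covariance matrices. The sketch below is an assumption about the classifier form, not a description of the authors' software, and the rejection threshold is illustrative.

```python
import numpy as np

def classify_stack(stack, training_masks, max_dist=3.0):
    """Classify every pixel of a multispectral stack (bands, h, w) using the
    mean and covariance of each training area; pixels further than `max_dist`
    (Mahalanobis distance) from every class mean are left unclassified (0)."""
    bands, h, w = stack.shape
    pixels = stack.reshape(bands, -1).T                     # (h*w, bands)
    dists = []
    for mask in training_masks:                             # one mask per class
        samples = stack[:, mask].T                          # training pixels
        mean = samples.mean(axis=0)
        cov_inv = np.linalg.inv(np.cov(samples, rowvar=False))
        d = pixels - mean
        # Squared Mahalanobis distance of every pixel to this class.
        dists.append(np.einsum("ij,jk,ik->i", d, cov_inv, d))
    dists = np.sqrt(np.array(dists))                        # (classes, h*w)
    best = dists.argmin(axis=0)
    labels = np.where(dists.min(axis=0) <= max_dist, best + 1, 0)
    return labels.reshape(h, w)
```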
FIGURE 39. Mineral-segmented image of Fig. 2a.

TABLE XVI
PROPORTION OF MINERALS-FEATURES PRESENT IN IMAGE IN FIG. 2b

Mineral-feature            Pixel code    Percentage present
Large voids                              2.2
Matrix                                   13.5
Matrix within aggregate                  14.5
Quartz grains                            5.3
Feldspar                                 2.7
Chalk                                    0.6
Rutile                                   0.2
Magnetite                                0.7
Pyrite                                   0.4
The fine-grained matrix cannot be processed in the same way, as the features are usually only a few pixels in size. However, they are ideal for orientation analysis using the techniques described in Sections IV, V, and VI. A binary mask is now generated by setting all pixels with values 0 (voids) or greater than 2 (in the case of the features in Fig. 39) to zero and all the remaining ones to unity (i.e., where there is matrix). The original image (Fig. 2a) is then multiplied by this binary mask, and the matrix may be analyzed for orientation using an appropriate edge-detection algorithm
(the 20,14 formula is preferred as it performs better with fine spacings). An angles-coded image may be generated for later analysis, while an index of anisotropy may be computed for the clay matrix alone (or, in this case, for the matrix both within and outside the iron-rich aggregate, separately). In this image, the index was computed as 0.229 at an angle of 2.4° for the general matrix and as 0.374 at an angle of 174.6° for the material within the iron-rich area (both angles measured clockwise from the upward vertical). The significantly higher index for the material within the iron-rich region suggests that it has been subjected to more compaction in its past history. Using the angles-coded image, extended orientation analysis involving domain segmentation (Sections VI,2 and VI,3) may be done. In the example shown here the magnification was rather lower than in the example discussed in Fig. 25, and a radius of 9 pixels was used. The domain-segmented image is shown in Fig. 40. Within the iron-rich region one domain direction (i.e., vertical) dominates, but this is not the case outside. In the example shown here, there appears to be no obvious flow of clay matrix particles around the larger
FIGURE 40. Domain-segmented image of Fig. 2a.
FIGURE 41. Domain-segmented image of Fig. 1b. There is a dominant domain with features inclined from top left to bottom right. Around several mineral grains the domains are aligned tangentially to the grain surfaces, indicating that there has been a postdepositional movement of the grains relative to the matrix.
mineral grains. In another example (Fig. 41), which is the domain-segmented image from Fig. 1b, there is not only a dominant domain with features inclined from top left to bottom right, but there are also small domains with features aligned tangentially to the larger grains, indicating that a flow of material has taken place since deposition. Such information may be a key pointer to the diagenetic history of the material.
VIII. IMPLEMENTATION AND AUTOMATION OF ORIENTATION ANALYSIS
A. Implementation of Algorithms for Orientation Analysis

Six of the new algorithms described in Sections IV, V, VI, and VII have been developed and incorporated into extended versions of SEMPER. These may be summarized as follows:

(i) A multi-purpose intensity gradient algorithm which can directly use any of a number of standard kernels shown in Sections IV and V, and indirectly other kernels (e.g., the empirical ones described by Smart and Leng, 1993). Equally, the asymmetric formulae may be selected. The algorithm produces an angles-coded image as well as the option of a magnitude image (for simple edge detection), a simple histogram of frequencies, or several histograms, one for each of a range of magnitude intensities. Separate histograms can be generated for the data arising from vectors with magnitudes less than a user-defined cutoff value. Either square (as default) or rectangular aspect pixels may be processed, and options allow for providing histograms in either unit vector or aggregated form.
(ii) A multipurpose statistical routine which computes the best-fitting ellipse, indices of anisotropy, and preferred direction, as well as the mean resultant vector length, and tests for the von Mises distribution.
(iii) A three-dimensional version of the two-dimensional intensity gradient algorithm. At present this has fewer options than are available in the two-dimensional case.
(iv) A general domain-segmentation algorithm which can use either the approximate modal filter analysis or the angular mean filter method.
(v) An algorithm to overlay color on the original image according to the domain orientation.
(vi) An algorithm which follows the relative contrast histogram method of Kohler (1981). This allows automatic selection of suitable thresholds for binary segmentation of an image without any prior knowledge of the value.
In addition, several programs combining standard SEMPER commands with the extension commands listed previously have been written for use in wider applications such as those described in Section VII. All the new commands have been written with objective image analysis in mind, so that subjective decisions which may vary from one session to another are avoided or at least minimized. This makes the procedures ideal for automation.
B. Automation of Orientation Analysis

For many purposes where individual images are processed separately to enhance or extract features, the question of automation is not relevant. However, when larger numbers of images are to be processed, automation is desirable if not essential. In applications where subjective decisions relating to the selection of a threshold are needed, this in itself can cause bias in the final result if the objective is one of image analysis rather than image processing. It is in this area that the orientation analysis
methods based on intensity gradient algorithms come into their own. The problem of automation is particularly important in microfabric analysis, as it is often subtle differences between samples that are the subject of investigation, and many images from each sample must be analyzed before statistically relevant results are achieved. With the exception of part of the multispectral processing of images, all the other techniques referred to in this paper can be fully automated. For many years, the authors have processed automatically batches of up to 100 images at a time. No intervention by an operator has been necessary. Until recently, the speed of computer processing meant that overnight batch runs were the norm. A typical batch run for orientation analysis would generally follow the flow diagram shown in Fig. 3. After completion of one image, processing would automatically begin on the next. Some modifications were made from time to time. For instance, it has sometimes been found useful to process the images using 4-, 8-, and 12-direction domain segmentation, and this is readily achieved by looping through stages 5 to 8 for as many different sets of analyses as required. For analyses involving porosity measurements as well as domain segmentation, the whole procedure, including the image reconstruction, can be done without operator intervention by repeating stages 2 to 10 in the flow diagram shown in Fig. 30. With all automation of analysis, a key aspect is careful management of the processes, and a form of database management scheme must be maintained automatically so that a constant record of the processing of each image is kept. Some development in automation is needed in the area of classification using the multispectral approach, but even if fully unsupervised classification were to become reliable, there would be a need for the operator to label the different features based on experience. Automation requires the development of robust algorithms but has the advantage that the need for subjective decisions which may vary from image to image can be removed. It might be argued that some of the parameters set in intensity gradient analysis (or domain segmentation), such as the cutoff threshold or the radius of the mask, are not entirely objective. Nevertheless, extensive tests have been done to check that the values chosen are realistic. Furthermore, any departure from optimum in the choice of these parameters will affect all images taken under the same conditions to the same extent and should therefore pose little problem. Indeed, suggestions have been made in Section VI,E as to how the radius of the mask may be chosen in other applications so as to be consistent with the value recommended here.
features. However, the acquisition of the images themselves may then produce a bottleneck to further development. In the case of microfabric analysis, operator fatigue at the electron microscope or the overbooking of such facilities may cause problems. Clearly the next stage in development is to automate the image capture, and, at the time of writing (August 1994), the authors have successfully tested a fully automated system for acquiring directly any number of images from a scanning electron microscope. The whole system is under the control of the image-processing software (SEMPER), which generates commands to automatically move the specimen to new locations, change the magnification if required in a predetermined fashion, automatically focus, and then capture an image at either 512 x 512 or 1024 x 1024 pixel resolution. Tests have been completed with up to 50 images recorded in a single session with the instrument left unattended. Further developments will include an extension to overnight running to acquire several hundred images and to optimize the use of the scanning electron microscope. As each image is acquired, key information about the magnification and stage coordinates is automatically recorded in a database for later use in analysis. During automatic capture, small reference images (128 x 128 pixels) of each recorded image are combined to form a mosaic of 16 (or 24) images per page for later easy reference. With automation of image capture, it is a simple matter to have the same or a separate image-processing package analyze the images directly using the intensity gradient and domain-segmentation algorithms. However, difficulties may arise with full automation of this type, including:

(i) Artifacts may be captured in a fully automated computer-generated array of points.
(ii) Images beyond the area of interest may be captured and analyzed.
(iii) Microscope parameters may change to produce poor quality images.

The first two of these are of little consequence as two methods of operation are available for stage movement. The stage may be moved in a regular manner according to some predetermined algorithm. In this case, the reference library of small images may be consulted and dubious images may be discarded. Alternatively, the specimen may be moved manually to the various areas of interest before acquisition of the image and recording of the coordinates for subsequent use in the automatic sequence. Problems of poor quality images are easily addressed by consulting the reference collection of images.
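The reference mosaic itself is easily produced; the sketch below, with block-averaged 128 x 128 thumbnails tiled into pages, is an illustrative stand-in for the authors' SEMPER implementation.

```python
import numpy as np

def reference_mosaic(images, thumb=128, per_row=4):
    """Reduce each captured image to a small thumbnail by block averaging and
    tile the thumbnails into a single page for quick visual checks of a batch
    acquisition run (assumes image sides are multiples of `thumb`, e.g. 512
    or 1024 pixels)."""
    thumbs = []
    for img in images:
        h, w = img.shape
        fy, fx = h // thumb, w // thumb
        # Block-average downsampling to thumb x thumb.
        small = img[:fy * thumb, :fx * thumb].reshape(thumb, fy, thumb, fx).mean(axis=(1, 3))
        thumbs.append(small)
    rows = []
    for i in range(0, len(thumbs), per_row):
        row = thumbs[i:i + per_row]
        row += [np.zeros((thumb, thumb))] * (per_row - len(row))   # pad last row
        rows.append(np.hstack(row))
    return np.vstack(rows)
```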
IX. CONCLUDING REMARKS
Orientation analysis and the related topic of edge detection are important tools in image processing and analysis. In the past much has been written on edge detection, but the use of algorithms for orientation analysis is much less common. Within the area of orientation analysis there are two main applications: one as a means of edge linking or for use in Hough transforms, the other as a method for the analysis of the fabric or microfabric of materials. Established edge-detection algorithms such as the Roberts, Sobel, and Prewitt operators are well known, but while these may be adequate for edge detection, they are not always as good for determining orientation. These are all 3 x 3 operators, but improvements are possible using 5 x 5 arrays, such as the one proposed by Zuniga and Haralick (1987), which is claimed to improve the accuracy of angular determination. It was the method used by Swift (1992) in his analysis to detect edges; it is based on a polynomial fit to the intensity points around the pixel in question and is readily extendible to larger arrays such as 7 x 7. Smart and Tovey (1988) adopted a different approach, using a two-dimensional expansion of Taylor's theorem to generate a general matrix of coefficients from which different groupings of formulae may be extracted for use. The matrix includes coefficients from all 24 points in the 5 x 5 array surrounding the central pixel, from which any order of solution up to a fifth-order solution is possible. Extension to a 7 x 7 array should allow solutions up to the eighth order, but this is at the expense of losing fine detail which may be much smaller than the kernel size. The choice of the optimum formula is largely irrelevant when only edge detection is required, but some formulae are much better than others in defining orientation, and indeed the nature of the image itself does influence the choice. With images with a short wavelength periodicity of features (e.g., 3-5 pixel spacing) the 20,14 formula of Smart and Tovey (1980) is definitely the best, but its advantage decreases as the wavelength increases, so that at a 10-pixel wavelength the Zuniga and Haralick (1987) formula is somewhat better and comparable with the 20,5 formula of Smart and Tovey (1988). For images with features spaced at significant distances, either of these formulae or the empirical 20U formula of Smart and Leng (1993) would be a sensible choice. On the other hand, for microfabric studies where the majority of features are spaced at wavelengths between 3 and 12 pixels, the 20,14 formula is the best choice. The 20,14 formula outperforms the 24,20 formula at the short wavelengths despite the fact that the latter is a higher order solution. The reason is that the 20,14 formula has a kernel which uses a near circular array (rather than the square array of the 24,20
formula), and the kernel itself does not impose orientation information on the analysis. For very noisy pictures, the low order solutions with a high number of pixels, such as the 20,5, become progressively better. Orientation analysis for microfabric applications produces new images which contain orientation information at a large number of points in each image. Data analysis methods are in common use in the earth and related sciences but are less well used in other disciplines, and a brief review of some of the key points has been included in Sections IV,E and V,J. Essentially, the information may be reduced to two parameters: an index of anisotropy and a direction of preferred orientation. With axial-type data, which only have information in the range 0-180°, the analysis must be treated with care as the results will be dependent on the origin chosen. Techniques such as temporarily doubling the angles, conducting the analysis, and then halving the result are one way around the problem. The index of anisotropy is derived from a least squares fit to the radial histogram distribution of measured angles and thus has the attraction of simplicity. It has a scale which ranges from zero for a random fabric to unity for a completely aligned one. However, while many fabrics approximate to an elliptical distribution, others do not, and more reliable parameters are the mean resultant vector and the associated direction. The term consistency ratio is used by Smart and his co-workers but is exactly the same as the mean resultant vector. The direction measured by this means is always close to that from the index of anisotropy approach, and the difference between the two methods has never exceeded 0.6° in over 10,000 separate measurements. The value of the mean resultant vector also ranges from zero to unity and is uniquely related to the index of anisotropy, but has a much lower value. For this reason, the greater dynamic range of the index of anisotropy for real images has merits for continued use. Asymmetric formulae for use at the boundaries of images have been tested, but these are generally not as good as the symmetric formulae, and to cover the boundary regions of an image it is preferable to use a 3 x 3 symmetric formula for the penultimate layer around the image rather than an asymmetric version of a truncated 5 x 5 kernel. Around the edge, asymmetric formulae may be used, and generally kernels based on a truncated 24,5 formula seem best. For the four corner pixels, however, no formula has been found which has an adequate accuracy in orientation determination. Many workers have claimed high accuracy for their particular kernels, but these claims have been based on a restricted range of tests and did not cover the full spectrum of wavelengths examined here. All formulae produce errors in estimation, and despite the superiority of the 20,14 formula for microfabric work, the 1% error claimed by Smart and Leng (1993) is definitely a significant underestimate when a range of conditions is examined.
For microfabric analysis, problems of inaccuracy may be addressed to some extent by discarding computed orientation vectors whose magnitude falls below a given threshold; further work in this area is probably warranted. In some applications, selecting only those vectors which fall within a given range of magnitudes may be relevant (Tovey and Krinsley, 1990). Finally, computations of the index of anisotropy or the mean resultant vector may be weighted according to some function of the magnitude of the vector. This tends to enhance the importance of the more contrasting features in the image and in doing so will perhaps give a value nearer to that obtained from subjective interpretation; this may or may not be what is actually required.

Domain segmentation may be used to delineate regions of an image with similar orientation. There are two approaches, both involving the passage of a large-radius filter over the image. One method is based on a modal filter and examines previously coded data situated within the filter; the other, more rigorous approach uses the magnitude of the mean resultant vector within the mask as the basis of the segmentation (a sketch of this approach is given below). Smart and his co-workers use the terms top-contouring and consistency ratio mapping to describe these two methods of domain segmentation. The parameters used in the modal filter approach may be optimized with reference to the mean resultant vector method. The choice of radius, however, is not easily resolved. For images showing the microfabric of soils and sediments at a magnification of 2000×, a radius of about 19 or 20 pixels seems realistic, and there is some evidence to support this. With observations on new materials, the choice of radius could be standardized using the approach suggested in Section VI,J.

With the advent of confocal microscopy, there is no reason why orientation analysis should not be extended into three dimensions, using the separate layers of the confocal image as the third dimension. A suitable three-dimensional kernel has been developed and the first results are to be reported shortly (Tovey, 1994). However, more work needs to be done to generalize the formula to allow for three-dimensional pixels which are not cuboid in shape.

The orientation analysis algorithms discussed in this paper may be combined with other image analysis methods to extract further information from the images. Noteworthy in this respect are the porosity analyses discussed in Section VII,B. Intensity gradient analysis is particularly suited to the microfabric analysis of fine-grained materials; coarse features, on the other hand, are more suited to traditional feature analysis. The use of multispectral classification methods allows the separation of images into these two component parts and provides information about the nature of the materials which is not obtainable by other means.
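As a concrete illustration of the consistency ratio mapping approach to domain segmentation described above, the sketch below passes a circular mask over an orientation image and flags pixels whose local mean resultant vector exceeds a threshold. The default radius of 20 pixels echoes the value suggested in the text, but the function name, the thresholds, and the simple loop structure are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def consistency_ratio_map(orientation_deg, magnitude, radius=20,
                          min_magnitude=0.0, threshold=0.7):
    """Flag pixels lying in well-oriented domains.

    A circular mask of the given radius is passed over the image and the mean
    resultant vector of the (doubled) orientations inside it is computed;
    pixels whose local consistency ratio exceeds the threshold are marked as
    belonging to a domain.  Vectors weaker than min_magnitude are ignored.
    All parameter names and default values here are illustrative.
    """
    theta = np.radians(2.0 * np.asarray(orientation_deg, dtype=float))
    valid = np.asarray(magnitude, dtype=float) >= min_magnitude
    cos_t = np.where(valid, np.cos(theta), 0.0)
    sin_t = np.where(valid, np.sin(theta), 0.0)

    # Circular mask of the chosen radius.
    yy, xx = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    mask = (xx * xx + yy * yy) <= radius * radius

    h, w = theta.shape
    domains = np.zeros((h, w), dtype=bool)
    for i in range(radius, h - radius):
        for j in range(radius, w - radius):
            win = np.s_[i - radius:i + radius + 1, j - radius:j + radius + 1]
            n = np.count_nonzero(valid[win] & mask)
            if n == 0:
                continue
            c = cos_t[win][mask].sum() / n
            s = sin_t[win][mask].sum() / n
            domains[i, j] = np.hypot(c, s) >= threshold
    return domains
```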
Some applications of orientation analysis involve the processing of only a few images, but for microfabric analysis automation is important, and a scheme has been developed to allow large numbers of images to be processed automatically in batch form, thereby obviating the subjective involvement which is usual in much other work on microanalysis. Recent developments have included the full automation of image capture and analysis by controlling the operations of a scanning electron microscope directly from within the image-processing facility.
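The batch scheme itself is not listed in this chapter; the sketch below merely illustrates the sort of unattended processing loop it implies, reusing the two functions sketched earlier. The module name, directory layout, and file pattern are hypothetical, and the direct control of the scanning electron microscope mentioned above is not represented.

```python
import numpy as np
from pathlib import Path
from PIL import Image

# Hypothetical module gathering the earlier sketches.
from orientation_sketches import gradient_orientation, axial_statistics

def batch_analyse(image_dir, pattern="*.tif"):
    """Process a directory of micrographs without operator intervention."""
    results = {}
    for path in sorted(Path(image_dir).glob(pattern)):
        image = np.asarray(Image.open(path).convert("L"), dtype=float)
        magnitude, orientation = gradient_orientation(image)
        results[path.name] = axial_statistics(orientation.ravel(),
                                              weights=magnitude.ravel())
    return results

# Example (hypothetical directory name):
# for name, (cr, direction) in batch_analyse("sem_images").items():
#     print(f"{name}: consistency ratio {cr:.2f}, preferred direction {direction:.1f} deg")
```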
ACKNOWLEDGMENTS
The authors wish to acknowledge financial assistance from SERC Grants Nos. GR/D/90574 and GR/H/40808, (US) AFOSR Grant No. 87-0346, NATO Grant 890948, and British Council Grant No. JRS91/02. Acknowledgement is also given to colleagues at the University of Glasgow, including Peter Smart, Xiaoling Leng, and Xiaohong Bai, for lengthy discussions on many of the topics covered in Sections IV, V, and VI. Collaborative work with Wyss Yim (University of Hong Kong), Tony Greenaway (University of Jamaica), David Krinsley (University of Oregon), Mick Paul (Heriot-Watt University), and David Dent and Bill Corbett (University of East Anglia) has proved helpful in the extension of orientation analysis with the multispectral techniques described in Section VII,C. Technical assistance from Stephen Bennett, Jackie Desty, and Clare Reuby is also acknowledged.
REFERENCES
Bennett, R. H., Bryant, W. R., and Keller, G. H. (1977). Clay fabric and geotechnical properties of selected submarine cores from the Mississippi Delta. Professional Paper 9, NOAA Atlantic Oceanographic and Meteorological Laboratories, Miami, Florida.
Beucher, S. (1992). The watershed transformation applied to image segmentation. Scanning Microsc. (Suppl.) 6, 299-314.
Boyde, A. (1967). A single stage carbon replica method and some related techniques for the analysis of the electron microscope image. J. R. Microsc. Soc. 86, 359-370.
Cheeney, R. F. (1983). "Statistical Methods in Geology for Field and Laboratory Decisions." Allen and Unwin, London.
Curray, J. R. (1956). The analysis of two-dimensional orientation data. J. Geol. 64, 117-131.
Duda, R., and Hart, P. (1972). Use of the Hough transformation to detect lines and curves in pictures. Commun. Assoc. Comput. Mach. 15, 11-15.
Foster, R. H., and Evans, J. S. (1971). Image analysis of clay fabric by Quantimet. Microscope 19, 31-47.
Gonzalez, R. C., and Wintz, P. (1987). "Digital Image Processing" (2nd ed.). Addison-Wesley, Reading, MA.
Haddon, J. F., and Boyce, J. F. (1990). Unification of image segmentation and edge detection. Proc. IEE 137, 129-135.
Haralick, R. M. (1984). Digital step edges from zero-crossing of second directional derivatives. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6, 58-68.
Haralick, R. M., and Shapiro, L. G. (1985). Image segmentation techniques. Comput. Vision, Graphics, Image Process. 29, 100-152.
Harvey, P. K., and Ferguson, C. C. (1976). On testing orientation data for goodness of fit to a von Mises distribution. Comput. Geosci. 2, 261-268.
Hough, P. V. C. (1962). Method and means for recognizing complex patterns. U.S. Patent 3,069,654.
Hounslow, M. W., and Tovey, N. K. (1992). Porosity measurement and domain segmentation of back-scattered SEM images of particulate materials. Scanning Microsc. (Suppl.) 6, 245-254.
Hsu, Y-S., Walker, J. J., and Ogren, D. E. (1986). A stepwise method for determining the number of component distributions in a mixture. J. Math. Geol. 18, 153-161.
Jain, A. K. (1989). Image analysis and computer vision. In "Fundamentals of Digital Image Processing" (T. Kailath, Ed.), p. 17. Prentice-Hall, New Jersey.
Kohler, R. (1981). A segmentation system based on thresholding. Comput. Graphics Image Process. 15, 319-338.
Lafeber, D. (1967). The optical determination of spatial (three-dimensional) orientation of platy clay minerals in soil thin sections. Geoderma 1, 359-369.
Lane, G. S. (1969). The application of stereographic techniques to the scanning electron microscope. J. Sci. Instrum. (J. Phys. E) Ser. 2, 565-569.
Mardia, K. V. (1972). "Statistics of Directional Data." Academic Press, New York/London.
McConnochie, I. (1974). Fabric changes in consolidated kaolin. Geotechnique 24, 208-222.
Morgenstern, N. R., and Tchalenko, J. S. (1967a). The optical determination of preferred orientation in clays and its application to the study of microstructure in consolidated kaolin, I. Proc. R. Soc. (London) A300, 218-234.
Morgenstern, N. R., and Tchalenko, J. S. (1967b). The optical determination of preferred orientation in clays and its application to the study of microstructure in consolidated kaolin, II. Proc. R. Soc. (London) A300, 235-250.
Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1986). "Numerical Recipes." Cambridge Univ. Press, London/New York.
Prewitt, J. M. S. (1970). Object enhancement and extraction. In "Picture Processing and Psychopictorics" (B. S. Lipkin and A. Rosenfeld, Eds.), pp. 75-149. Academic Press, New York.
Razaz, M., Lee, R., and Shaw, P. (1993). A nonlinear iterative least-squares algorithm for image restoration. Proc. IEEE Nonlinear Signal Process., 4.1-4.6.
Reiche, P. (1938). An analysis of lamination: the Coconino Sandstone. J. Geol. 46, 905-932.
Roberts, L. G. (1965). Machine perception of three dimensional solids. In "Optical and Electrooptical Information Processing" (J. T. Tippet et al., Eds.), pp. 159-197. MIT Press, Cambridge, MA.
Rock, N. M. S. (1988). Numerical geology. In "Lecture Notes in the Earth Sciences" (S. Bhattacharji, G. M. Friedman, H. J. Neugebauer, and A. Seilacher, Eds.), Vol. 18. Springer-Verlag, Berlin.
Rosenfeld, A., and Kak, A. C. (1982). "Digital Image Processing." Academic Press, New York.
Sahoo, P. K., Soltani, S., and Wong, A. K. C. (1988). A survey of thresholding techniques. Comput. Vision, Graphics, Image Process. 41, 233-260.
Smart, P. (1966). Soil structure, mechanical properties and electron microscopy. Ph.D. thesis, Cambridge Univ., London.
Smart, P. (1987). Personal communication.
Smart, P., and Leng, X. (1993). Present developments in image analysis. Scanning Microsc. 7, 5-16.
Smart, P., and Tovey, N. K. (1982). "Electron Microscopy of Soils and Sediments: Techniques." Oxford Univ. Press, London.
Smart, P., and Tovey, N. K. (1988). Theoretical aspects of intensity gradient analysis. Scanning 10, 115-121.
Smart, P., and Tovey, N. K. (1991). Microfabric of the deformation of soils. Third Annual Report to the (US) Air Force Office of Scientific Research, Grant No. 87-0346.
Smart, P., Tovey, N. K., McConnochie, I., Leng, X., and Hounslow, M. W. (1990). Automatic analysis of electron microstructure of cohesive sediments. In "Microstructure of Fine-Grained Sediments from Mud to Shale" (R. H. Bennett, W. R. Bryant, and M. H. Hulbert, Eds.), pp. 359-366. Springer-Verlag, New York.
Spurr, B. D. (1981). On estimating the parameters in mixtures of circular normal distributions. J. Math. Geol. 13, 163-174.
Swift, J. A. (1992). The detection and quantification of straight-lined irregularities on surfaces. Scanning Microsc. (Suppl.) 6, 283-291.
Tovey, N. K. (1971). Soil structure analysis using optical techniques on scanning electron micrographs. Proc. 4th Int. Symp. Scanning Electron Microsc. (O. Johari, Ed.), pp. 49-56. IIT Research Institute, Chicago.
Tovey, N. K. (1972). The analysis of scanning electron micrographs of soil structure using a convolution square camera. Proc. 25th Electron Microsc. Anal. Group Symp. (W. C. Nixon, Ed.), pp. 244-247. Institute of Physics.
Tovey, N. K. (1973a). A general photogrammetric method for the analysis of scanning electron micrographs. In "Systems and Applications," Proc. Scanning Electron Microsc. Conf. (W. C. Nixon, Ed.), pp. 84-89. Institute of Physics.
Tovey, N. K. (1973b). Quantitative analysis of electron micrographs of soil structure. Proc. Int. Symp. Soil Structure (R. Pusch, Ed.), Vol. 1, pp. 50-57 (Goteborg). Swedish Geotechnical Institute, Stockholm.
Tovey, N. K. (1973c). General reporter's discussion on session 1 of soil structure symposium. Proc. Int. Symp. Soil Structure (R. Pusch, Ed.), Vol. 2, pp. 1-19 (Goteborg). Swedish Geotechnical Institute, Stockholm.
Tovey, N. K. (1980). A digital computer technique for orientation analysis of micrographs of soil fabric. J. Microsc. 120, 303-315.
Tovey, N. K. (1994). Techniques to examine microfabric and particle interactions of collapsible soils. Proc. NATO Workshop Collapsible Soils, Loughborough. (In press.)
Tovey, N. K., and Hounslow, M. W. (1994). Quantitative microporosity and orientation analysis in soils and sediments. J. Geol. Soc. (In press.)
Tovey, N. K., and Krinsley, D. H. (1990). A technique for quantitatively assessing orientation patterns in sand grain microtextures. Bull. Int. Assoc. Eng. Geol. 41, 117-127.
Tovey, N. K., and Krinsley, D. H. (1991). Mineralogical mapping of scanning electron micrographs. Sediment. Geol. 75, 109-123.
Tovey, N. K., and Krinsley, D. H. (1992). Mapping the orientation of fine-grained minerals in soils and sediments. Bull. Int. Assoc. Eng. Geol. 46, 93-101.
Tovey, N. K., and Martinez, M. D. (1991). A comparison of different formulae for orientation analysis of electron micrographs. Scanning 13, 289-298.
Tovey, N. K., and Smart, P. (1986). Intensity gradient techniques for orientation analysis of electron micrographs. Scanning 8, 75-90.
Tovey, N. K., and Sokolov, V. N. (1980). Quantitative methods for measurements of scanning electron micrographs of soil fabric. Proc. Conf. Int. Soc. Photogrammetry (Hamburg), Remote Sensing Commission V, 154-163.
Tovey, N. K., and Sokolov, V. N. (1981). Quantitative methods for soil fabric analysis. Scanning Electron Microsc. Series I, 536-554.
Tovey, N. K., and Wong, K. Y. (1974). Some aspects of quantitative measurements from electron micrographs of soil structure. In "Soil Microscopy," Proc. 4th Int. Working Meet. Soil Micromorphol. (G. K. Rutherford, Ed.), pp. 207-222. Limestone Press.
Tovey, N. K., and Wong, K. Y. (1978). Optical techniques for the analysis of scanning electron micrographs. Scanning Electron Microsc. 1, 381-392.
Tovey, N. K., Smart, P., Hounslow, M. W., and Leng, X. L. (1989). Practical aspects of automatic orientation analysis of micrographs. Scanning Microsc. 3, 771-784.
Tovey, N. K., Smart, P., Hounslow, M. W., and Leng, X. L. (1992a). Automatic mapping of some types of soil fabric. Geoderma 53, 179-200.
Tovey, N. K., Smart, P., Hounslow, M. W., and Desty, J. P. (1992b). Automatic orientation analysis of microfabric. Scanning Microsc. (Suppl.) 6, 315-330.
Tovey, N. K., Krinsley, D. H., Dent, D. L., and Corbett, W. M. (1992c). Techniques to quantitatively study the microfabric of soils. Geoderma 53, 217-235.
Tovey, N. K., Dent, D. L., Krinsley, D. H., and Corbett, W. M. (1992d). Processing multispectral SEM images for quantitative microfabric analysis. Scanning Microsc. (Suppl.) 6, 269-282.
Tovey, N. K., Dent, D. L., Krinsley, D. H., and Corbett, W. M. (1994a). Quantitative micromineralogy and microfabric of soils and sediments. In "Soil Micromorphology" (A. J. Ringrose-Voase and G. S. Humphreys, Eds.), pp. 541-547. Elsevier, Amsterdam.
Tovey, N. K., Smart, P., and Hounslow, M. W. (1994b). Quantitative methods to determine microporosity in soils and sediments. In "Soil Micromorphology" (A. J. Ringrose-Voase and G. S. Humphreys, Eds.), pp. 531-539. Elsevier, Amsterdam.
Unitt, B. M. (1975). A digital computer method for revealing orientation information in images. J. Phys. E Ser. 2, 8, 423-425.
Unitt, B. M. (1976). On-line digital image processing for the scanning electron microscope. Ph.D. thesis, Cambridge Univ., London.
Unitt, B. M., and Smith, K. C. A. (1976). The application of the minicomputer in scanning electron microscopy. In "Electron Microscopy," Proc. 6th Eur. Cong. Electron Microsc. (D. G. Brandon, Ed.), TAL International Pub. 1, 162-167.
Watson, G. S. (1966). The statistics of orientation data. J. Geol. 74, 786-797.
White, J. M., and Rohrer, G. D. (1983). Image thresholding for optical character recognition and other applications requiring character image extraction. IBM J. Res. Dev. 27, 400-411.
Zuniga, O. A., and Haralick, R. M. (1987). Integrated directional derivative gradient operator. IEEE Trans. Systems, Man, Cybern. SMC-17, 508-517.
Index
A
ABCD law, 186
Abelian group, 3-6: duality, 3, 7, 8; extended Cooley-Tukey fast Fourier transforms, 49-50; Fourier transform, 14-15; vector space, 7-8
Aberrations: multislice approach, 207-215; scanning transmission electron microscopy, 86-87
Adatoms, crystal-aperture scanning transmission electron microscopy, 90, 94-100
Affine group, 10-11: fast Fourier transform algorithm, 3; Cooley-Tukey, 46-47; reduced transform algorithm, 30-31, 39-41; point group, 31-39; Xⁿ invariant, 41-42
Aharonov-Bohm effect, 174, 176, 178-181
Algorithms, 2-3: Cooley-Tukey algorithms, 17-19, 27-30, 46-49; domain segmentation, 304-307; Gerchberg-Saxton algorithm, 110; Good-Thomas algorithms, 19-21; GT-RT algorithm, 45-46; hybrid RT/GT algorithm, 25-26; iterative algorithms, 110-111, 145, 148, 167; orientation analysis, 319-320, 323, 325; phase retrieval, 110-112; phase retrieval, two-dimensional, 131-133; reduced transform algorithms, 16-17, 21-27, 30-46; row-column algorithm, 45
Angles-coded image, 239-243, 318, 320
Anisotropy index, see Index of anisotropy
Astrophysics, stellar speckle interferometry, 143-144
Atomic structure, crystal-aperture scanning transmission electron microscopy, 73-79
Auto-magnification, direct imaging of nucleus, 87-89, 104
Automation, orientation analysis, 320-322
Axis conventions, orientation analysis, 279-280

B
Back-scattered electron images, 231
Beam propagation method (BPM), 175: basic equations, 175-178, 186-190; improved equations, 202-207
Beam spotsize, 182
Bend extinction contour, 60
Bent foil zone axis pattern (ZAP), 69, 75, 77
Binary image, porosity analysis, 301-308
Blind deconvolution, 144-152
Boundaries, intensity gradient analysis, 267-272
BPM, see Beam propagation method
C
Canonical isomorphism, finite abelian groups, 8-9
Character basis, 8
Character group, 6-9
Chinese remainder theorem, 4-6
Chromatic aberrations, scanning transmission electron microscopy, 87
Circular mean, orientation analysis, 281
Circular standard deviation, orientation analysis, 282
Circular variance, orientation analysis, 282
Coherent imaging, through turbulence, 152-166
Computer simulation: crystal-aperture scanning transmission electron microscopy, 66-73; object reconstruction, 127-130, 133-134, 162-166
Confocal microscopy, 283, 325
Consistency ratio: domain segmentation, 298; orientation analysis, 282, 287, 292-293
Convolution, 144
Convolution square pattern, 223
Cooley-Tukey (CT) algorithm: fast Fourier transform algorithm, 17-19, 27-30; abelian affine group, 49-50; abelian point group, 47-49; affine groups, 3, 46-47; extended, 47-52; multidimensional, 28-30
Copper foil, crystal-aperture scanning transmission electron microscopy, 59, 90-91
Covering, group theory, 10, 16
CRT, see Chinese remainder theorem
Crystal-aperture scanning transmission electron microscopy (STEM), 57-107: direct imaging of nucleus, 87-89, 104; experimental, 66-87, 90-91; imaging, 58-59, 63-66, 87-90, 94-106; resolution, 91-94; theory, 59-66
Crystal lattice: electrons, transmission through, 59-62, 66-73; zone axis tunnels, 73-79
Crystallography: group-invariant transform algorithms, 1-55; Fmmm group, 13-14; P6₁ group, 11-12; P6/mmm group, 12-13; Pmmm group, 13; X-ray, phase retrieval, 167-168
CT algorithm, see Cooley-Tukey algorithm
D
Decimation: finite abelian group, 8, 15; weighted, 16
Deconvolution, blind, 144-152
Diffraction: crystal-aperture scanning transmission electron microscopy, 63-65; dynamical theory of electron diffraction, 58, 60
Digital image: acquisition, 229; edge, 231; noisy, 257, 277
Domain, 227, 287
Domain segmentation, 287-299, 325: algorithms, 304-307; anisotropy index, 298, 299; image presentation, 296-298; modal filter, 288-292; multispectral analysis, 318-319; radius, choice of, 293-296; Rayleigh statistical test, 292-293; vector magnitude, 293
Double-passage coherent imaging, 152-154
Dual covering, group theory, 10
Duality, abelian group, 3, 7, 8
Dynamical theory, electron diffraction, 58, 60

E
Edge, digital image, 231
Edge detection: operators, 232-239, 259, 320, 323; orientation analysis, 220, 231-239, 300; presentation of results, 239-244
8,2 formula, 252, 253
8,5 formula, 252, 253, 257, 261-262, 264, 266, 269, 277
Electromagnetic lenses: multislice approach, 179, 181-182, 187; cylindrical, 175-176; Glaser-Schiske diffraction integral, 195-202; improved phase-object approximation, 190-192; paraxial properties, 194-207; quadrupole, 181-185; spherical aberration, 207-215; spherical wave propagation, 194-195; thick lens theory, 192-194
Electron lenses: cylindrical, 175-176; quadrupole, 181-185; round symmetric, 186-202
Electron microscopy: crystal-aperture scanning electron microscopy, 57-107 (experimental, 66-87, 90-91; imaging, 58-59, 63-66, 87-90, 94-106; resolution, 91-94; theory, 59-66); optics, 174-176, 215, 216
Electron optics, 64, 174-176, 187-190: Aharonov-Bohm effect, 174, 176, 178-181; Glaser-Schiske diffraction integral, 195-202; improved phase-object approximation, 190-192; multislice approach, spherical aberration, 207-215; paraxial properties, 194-207; spherical wave propagation, 194-195; thick lens theory, 192-194
Electron ray model, 60-61
Electron ray simulation, predictions, zone axis pattern, 66-73
Electrons: diffraction, dynamical theory, 58, 60; transmission through crystal lattice, 59-62, 66-73
Electron wave theory, 60-61
Electrostatic lenses, multislice approach, 177-178, 182-184, 187-188
Entire functions, phase retrieval by, 109-168
Exponential filter, phase retrieval, 111-112, 116-118
Extended Cooley-Tukey fast Fourier transform, 47-52

F
Fast Fourier transform (FFT) algorithm, 16: Cooley-Tukey algorithm, 17-19, 27-30; extended, 47-52; Good-Thomas algorithm, 19-21; reduced transform algorithm, 16-17, 21-27
Femtosecond-pulse measurement, phase retrieval, 167-168
FFT, see Fast Fourier transform algorithm
Field emission tips, 58, 79
56,3 formula, 286
56,5 formula, 286
Finite abelian group, 3-6: Fourier transform, 14-15; vector space, 7-8
Fixed point, 9
Fmmm group, 13-14, 52-53
Forward difference formula, 253
Fourier modulus, phase retrieval, 110-112, 131-132, 144, 167
Fourier series expansion: phase retrieval, 118-124, 131-133; Hartley transform, 140-142
Fourier transform: group invariant algorithms (fast Fourier transform algorithm, 16-30, 47-52; finite abelian group, 3, 14-15; one-dimensional symmetry, 53-55; three-dimensional symmetry, 1-53); phase retrieval (blind-deconvolution problem, 146-148; coherent imaging through turbulence, 155, 160; Hartley transform and, 140; stellar speckle interferometry, 143-144; two-dimensional, 131)
4,2 formula, 234, 237-239, 252, 253, 257, 261-262, 266, 277

G
Gaussian beam, 184-185, 186
Gaussian wavefront, propagation, 204-207
General theory of image formation, 174-176, 215-216
Gerchberg-Saxton algorithm, 110
Glaser-Schiske diffraction integral, 195-202
Gold, crystal-aperture scanning transmission electron microscopy, 59, 90-105
Good-Thomas (GT) algorithm, 2: fast Fourier transform algorithm, 19-21; hybrid RT/GT algorithm, 25-26
Gray level porosity analysis, 308-311
Group-invariant transform algorithms, 1-55
Group theory, 2-3: affine group, 10-11; character group, 6-9; finite abelian group, 3-6; point group, 9-10
GT algorithm, see Good-Thomas algorithm
GT-RT algorithms, 45-46

H
Hartley transform, 139-143
Heisenberg uncertainty, crystal-aperture scanning transmission electron microscopy, 57-58, 64
Hermitian object functions, phase retrieval, 123-124, 126, 133
Heuristic approach, multislice approach to lens analysis, 173-216
Hilbert function, 115
Hilbert phase, 115-116
H-invariance, 10
H-orbit, 9
Hybrid RT/GT algorithm, 25-26

I
Imaging: crystal-aperture scanning transmission electron microscopy, 58-59, 63-66, 87-90 (gold adatoms, 94-100; subatomic detail, 100-105); double-passage coherent imaging, 152-154; general theory of image formation, 174-176, 215-216; multispectral processing, 313-319; orientation analysis, 220-232, 323-326 (algorithms, 319-320; applications, 300-319; automation, 320-322; domain segmentation, 278-299; edge detection operators, 231-239; image acquisition, 228-231; image analysis, 219-228; image processing, 231-239; image resolution, 275-276; intensity gradient operators, 246-287; presentation of results, 239-244; quantitative parameters, 244-246); phase retrieval (blind-deconvolution problem, 144-145; coherent imaging through turbulence, 152-166); resolution (crystal-aperture scanning transmission electron microscopy, 91-94; orientation analysis, 275-276; subatomic, 64)
Improved phase-object approximation, 190-192
Index of anisotropy: domain segmentation, 298, 299; orientation analysis, 223, 245-246, 324, 325
Intensity gradient, 231
Intensity gradient analysis, 228, 230, 246-278, 319-320: boundaries, 267-272, 324; image resolution, 275-276; noisy images, 257, 277; pixels (numbering, 233, 268, 284; rectangular aspect ratio, 272, 274-275); statistical analysis of data, 278-284; three-dimensional, 284-287
Interaction, image formation, 174-176
Isomorphism, finite abelian groups, 6-9
Isotopy subgroup, 9
Isotropic operator, 235, 236, 266, 267, 269, 277
Iterative algorithms, 110-111, 167: blind-deconvolution problem, 145, 148

K
Kirchhoff-Fresnel integral, 197, 207
Kuiper's test, domain segmentation, 283

L
Lanthanum hexaboride, crystal-aperture scanning transmission electron microscopy, 93
Lenses: aberrations (multislice approach, 207-215; scanning transmission electron microscopy, 86-87); electromagnetic, 187-188 (cylindrical, 175-176; Glaser-Schiske diffraction integral, 195-202; improved phase-object approach, 190-192; paraxial properties, 194-207; quadrupole, 181-185; round symmetric, 186-202; spherical aberration, 207-215; spherical wave propagation, 194-195; thick lens theory, 192-194); image formation, 174-176; multislice approach, 173-216; optical, 174-176, 185-186; propagation (basic equations, 175-178; improved equations, 202-207)
Light, quadratic index media, 185-186
Logarithmic Hilbert transform, phase retrieval, 111-116, 120-121
Lorentzian filter, phase retrieval, 124-126
M
Magnetic lenses, see Electromagnetic lenses
Magnitude image, 239, 254, 320
Mean filter method, orientation analysis, 290
Mean resultant length, orientation analysis, 281-282
Mean resultant vector, 287, 293, 298
Microfabric, 220
Microfabric analysis: noisy images, 277; orientation analysis, 220-326; photogrammetric equations, 222; pixel resolution, 275-276; quantitative, 221-224; soil and sediment, 221-227, 231, 259, 275-276, 281, 318-319
Microscopy, see specific techniques
Microstructure, 220
Minerals, microfabric, orientation analysis, 226
Mineral segmented image, 315, 317
Modal filter, domain segmentation, 287-292
Moiré fringes, crystal-aperture scanning transmission electron microscopy, 90, 96-97
Multidimensional Cooley-Tukey algorithm, 28-30
Multislice approach: electromagnetic lenses, 179, 181-182, 187-188 (cylindrical, 175-176; Glaser-Schiske diffraction integral, 195-202; improved phase-object approach, 190-192; quadrupole, 175-176, 181-185; round symmetric, 186-202; spherical aberration, 207-215; spherical wave propagation, 194-195; thick lens theory, 192-194); image formation, 174-176; lens analysis, 173-216; paraxial properties, 194-207; optical, 174-176, 185-186; propagation (basic equations, 175-178, 186-190; improved equations, 202-207)
Multispectral processing, orientation analysis, 311-319, 321

N
Nucleus, imaging using crystal-aperture scanning transmission electron microscopy, 87-89, 104
O
Object reconstruction, 110-112: computer simulation, 127-130, 133-134, 162-166
One-dimensional phase retrieval, 110-112, 131: deconvolution, 145-149
One-dimensional symmetry, fast Fourier transform, 53-55
110 foil, 58
Optical astronomy: phase retrieval, 131; stellar speckle interferometry, 143-144
Optical convolution square techniques, 222-223
Optical lens theory, 174-176, 186-190
Optical transform techniques, 222-223
Optics, see also Electron optics: electron microscopy, 174-176, 215, 216; multislice approach, 173-216
Orientation: domain segmentation, 287-299; edge detection, 220, 231-244, 259, 320, 323; soil and sediment microfabric, 221-227, 231, 259, 275-276, 281, 318-319
Orientation analysis: applications, 300 (with multispectral processing, 311-319; with porosity analysis, 301-311); automation, 320-322; image acquisition, 228-231; image analysis, 220-232, 323-326 (algorithms, 319-320; domain segmentation, 278-299; edge detection, 231-239; index of anisotropy, 223, 245-246; intensity gradient operators, 246-287; quantitative parameters, 244-246); image processing, 231-239; image resolution, 275-276; presentation of results, 239-244 (axis conventions, 279-280); quantitative analysis, 222-224, 244-246; statistical analysis, 278-294

P
P6₁ group, 11
P6/mmm group, 12-13
Paraxial properties, electron lenses, 194-207
Paraxial ray equation, 194, 206
Paraxial Schrödinger equation, multislice method, 200-202
Periodization, finite abelian group, 15
Phase retrieval by entire functions, 109-168: algorithms, 110-112, 131-133; blind-deconvolution problem, 144; coherent imaging through turbulence, 152-166; computer simulation, 127-130, 133-134; exponential filter, 111-112, 116-118; Fourier series expansion, 118-124, 131-133, 140-142; Hartley transform, 140-143; Hermitian object functions, 123-124, 126, 133; logarithmic Hilbert transform, 111-116, 120-121; Lorentzian filter, 124-126; one-dimensional, 110-112, 131, 145-149; theory, 112-131; two-dimensional, 110-112, 131-139, 161-162, 167; zero location method, 112, 118, 167; zero sheets method, 112, 131, 167
Photogrammetric equations, microfabric analysis, 222
Pixels: digital image, 228-229; domain segmentation, 288-289; edge detection, 233; numbering system, 233, 268, 284; rectangular aspect ratio, 229, 272, 274-275; square pixels, 229, 233, 274
Pmmm group, 13, 48-49
Point group, 9-10: extended Cooley-Tukey fast Fourier transform, 47-52; reduced transform algorithm, 31-39
Point spread function, 303, 304
Porosity analysis, orientation analysis, 301-311
Prewitt operator, 235, 236, 253, 264, 266, 267, 269, 277, 323
Prime factor algorithm, 2
Propagation: basic equations, 176-177, 186-190; Gaussian wavefront, 204-207; image formation, 174-176; improved equations, 202-207; light, in quadratic index media, 185-186; quadrupole field, 181-185; spherical wave in lens field, 194-195

Q
Quadratic index media, light propagation, 185-186
Quadrupole electron lenses, multislice approach, 181-185
Quantitative techniques, image analysis, 222-224

R
Rayleigh statistical test: domain segmentation, 292-293; orientation analysis, 283
Reconstruction, 110-112: blind-deconvolution, 144-152; coherent imaging, 152, 160-166; computer simulation, 127-130, 133-134, 162-166; Hartley transform, 140-143; two-dimensional, 131-139, 161-162
Rectangular pixels, 229, 272, 274-275
Reduced transform (RT) algorithm, 2-3, 16, 31: affine group, 30-31, 39-41; fast Fourier transform algorithm, 16-17, 21-27; hybrid RT/GT algorithm, 25-26; point group, 31-39; Xⁿ-invariant algorithm, 41-42
Remolded tip, 80, 82-83, 106
Resolution: crystal-aperture scanning transmission electron microscopy, 91-94; orientation analysis, 275-276; subatomic, 64
Roberts operator, 235, 236, 266, 277, 323
Rocking curves, dynamical theory of diffraction, 60
Roof edge, 231
Rosette diagram, 240, 241-243, 245, 254-255, 279
Row-column algorithm, 45
RT algorithm, see Reduced transform algorithm
S
Scanning transmission electron microscopy (STEM), 57: aberrations, 86-87; crystal-aperture STEM, 57-107 (direct imaging of nucleus, 87-89, 104; experimental, 66-87, 90-91; imaging, 58-59, 63-66, 87-90, 94-106; resolution, 91-94; theory, 59-66)
Schrödinger equation, electron optics, 174, 178, 184, 185, 187, 190, 200-202, 206, 208-209, 211
Sediment, microfabric, orientation analysis, 221-224, 231
SEMPER, 229, 319, 322
Simulation: crystal-aperture scanning transmission electron microscopy, 66-73; object reconstruction, 127-130, 133-134, 162-166
Sobel operator, 235, 236, 253, 261-262, 266, 267, 269, 277, 323
Soil, microfabric, orientation analysis, 221-224, 226-227, 259, 275-276, 281, 318-319
Spherical aberrations: multislice approach to lens analysis, 207-215; scanning transmission electron microscopy, 86-87
Spherical mean, orientation analysis, 284
Spherical wave, propagation in lens field, 194-195
Speckle interferometry, 143-144
Square pixels, 229, 233, 274
Statistical analysis, orientation analysis data, 278-294
Stellar speckle interferometry, 143-144
STEM, see Scanning transmission electron microscopy
Step edge, 231
Superimposed lattices, crystal-aperture scanning transmission electron microscopy, 96
T
TEM, see Transmission electron microscopy
Thick lens theory, 192-194
Three-dimensional crystallographic group, group-invariant transform algorithms, 2-3
Three-dimensional orientation data, 283-287, 325
110 field emissions, STEM, 58-59, 79-87
Thresholding, 223, 231-232
Top contouring, 287
Transform algorithms, phase retrieval, 110
Transmission electron microscopy (TEM), 60: crystal-aperture scanning transmission electron microscopy, 57-107; image formation, 174; microfabric analysis, 222, 223
Tungsten, crystal-aperture scanning transmission electron microscopy, 58-59, 79
Turbulence, coherent imaging, 153-166
12,2 formula, 252, 253
12,9 formula, 235, 237-239, 252, 253, 257, 261-266, 277
20,2 formula, 252, 253
20,5 formula, 252, 253, 257, 261-266, 269, 272, 273, 277, 323
20,9 formula, 252, 269-271, 273
20,14 formula, 252, 253, 257, 261-267, 269, 274-275, 277, 286, 318, 323, 324
20,20 formula, 252, 286
20S formula, 257-258, 266, 277
20T formula, 257-258, 266, 269, 277
20U formula, 257-258, 266, 277, 323
24,2 formula, 252, 253
24,5 formula, 252, 253, 266, 269-271, 273, 277, 324
24,9 formula, 252, 253, 272
24,14 formula, 252, 253, 266, 277
24,20 formula, 252, 253, 266, 277, 323-324
Twiddle factor, 28
Two-dimensional orientation data, 280-283
Two-dimensional phase retrieval, 110-112, 131-139, 161-162, 167: simulation, 133-134
2,2 formula, 233, 252, 253, 261-262, 266, 277
U
Uncertainty, crystal-aperture scanning transmission electron microscopy, 57-58, 64
Uniformity, orientation analysis, 282

V
Vector space, finite abelian group, 7-8
VG scanning transmission electron microscopy, 79, 80, 82, 106
Vibrations, scanning transmission electron microscopy, 86
W
Watson U² test, 283
Wave optics, multislice approach, 173-216
Weighted decimation, 16
Wiener filter, 303-305
X
Xⁿ invariant reduced transform algorithm, 41-42
X-ray crystallography, phase retrieval, 167-168
X-ray mapping, 312-319

Z
ZAP, see Zone axis pattern
Zero location method, phase retrieval, 112, 118, 167
Zero sheets method, phase retrieval, 112, 131, 167
Zone axis pattern (ZAP), crystal-aperture scanning transmission electron microscopy, 58, 66-73
Zone axis tunnels (ZAT), 62-65, 85, 107: through 110 foil, atomic structure, 73-79
Zuniga and Haralick formula, 253-267, 269, 277, 323